The era of cryptic exons: implications for ALS-FTD | Molecular Neurodegeneration

‘TDP-43 proteinopathy’ refers to human diseases pathologically defined by nuclear-to-cytoplasmic mislocalisation and aggregation of TAR DNA-binding protein-43 (TDP-43), an RNA-binding protein with a crucial role in RNA metabolism and splicing. TDP-43 proteinopathy encompasses a wide spectrum of progressive neurodegenerative diseases and phenotypes. It is a key feature in ~97% of people with amyotrophic lateral sclerosis (ALS) and ~45% of people with frontotemporal dementia (FTD), a seminal discovery made in 2006 [1, 2]. Subsequently, TDP-43 proteinopathy has been found as a co-pathology in a large fraction of Alzheimer’s disease cases, as well as in inclusion body myositis and Paget disease of bone [3,4,5]. Furthermore, TDP-43 mislocalisation can be triggered by traumatic stimuli, such as physical stress or brain injury, which are themselves increasingly recognised as risk factors for a number of the above-described neurodegenerative disorders [6,7,8,9].

Common to TDP-43 proteinopathies is a lack of directed disease-modifying therapies, and an inability to confirm and stratify individuals that have TDP-43 pathology ante-mortem. The disease that is most frequently characterised by TDP-43 mislocalisation is ALS, which is a rapidly progressive paralysing illness with eventual death on average 3 years from symptom onset [10]. Up to 15% of people with ALS can experience associated cognitive, language, and behavioural deficits, consistent with a diagnosis of FTD, comprising the ALS-FTD spectrum [11]. Pathogenic mutations in TARDBP, the gene which encodes TDP-43, are rare, occurring in < 1% of ALS cases, but underscore TDP-43’s central role in disease pathogenesis [12,13,14]. Altered subcellular localisation of TDP-43 is relevant – either via cytoplasmic toxic gain of function, nuclear loss of function, or both. However, the precise molecular mechanisms downstream of TDP-43 mislocalisation remain incompletely understood.

In this review, we will focus on a key ‘on-off’ consequence of nuclear TDP-43 loss of function, which has emerged over recent years – the occurrence of splicing defects leading to the erroneous inclusion of intronic sequences into mature mRNA, forming so-called ‘cryptic exons’ [15]. With a spotlight on ALS-FTD, we will review their relevance to disease pathogenesis, and discuss exciting and emerging opportunities on the horizon for biomarker development and therapeutics in the field of TDP-43 proteinopathies (Fig. 1).

Fig. 1. Schematic summarising how cryptic exons (CEs) arise in the context of TDP-43 mislocalisation from the nucleus to the cytoplasm (owing to de-repression of splicing in intronic regions), possible downstream consequences (loss of functional protein, nonsense-mediated decay (NMD) of the cryptic-containing transcript, or translation of a novel cryptic peptide), and opportunities that CEs provide for better understanding disease mechanisms (STMN2, UNC13A, other genes yet to be explored), biomarker development (RNA and protein biomarkers), and therapeutics (via restoration of protein levels, or splicing modification)

TDP-43 depletion leads to de-repression and inclusion of cryptic exons

TDP-43 was identified in 1995 as a suppressor of HIV gene expression. It was later discovered that TDP-43 expression promotes skipping of exon 9 in the CFTR gene, first revealing TDP-43’s role as an RNA-binding protein and regulator of alternative splicing [16]. Since then, an increasing number of genes have been identified for which the splicing of annotated, conserved exons is regulated by TDP-43; as such, altered splicing of POLDIP3, SORT1, and PFKP has been used as a reliable readout of nuclear TDP-43 function in models of TDP-43 knockout and overexpression, and for studying ALS-causing TDP-43 mutations [17,18,19,20,21,22,23,24,25,26,27].

TDP-43’s consensus binding motif comprises “UG”-rich dinucleotide repeats. High-throughput sequencing, combined with cross-linking immunoprecipitation experiments to discover direct TDP-43 mRNA targets, revealed that the majority of TDP-43’s binding sites in pre-mRNAs were in introns [26, 27]. A number of these intronic binding sites were later revealed to be sites where TDP-43 binding acts as a splicing repressor. When, as a result of TDP-43 depletion, non-conserved intronic sequences are erroneously included in mature RNA, they are called cryptic exons [15]. Importantly, these cryptic exons can be found in patient tissues from people affected by ALS and FTD, Alzheimer’s disease, and inclusion body myositis [15, 23, 28,29,30,31,32,33,34]. While TDP-43 regulation of annotated splicing leads to shifts in the ratio of known isoforms, TDP-43 cryptic splicing produces novel RNA isoforms. These changes include novel cassette exons, skipping of canonical splicing products, 3′ or 5′ extensions of annotated exons, novel transcription start sites, and novel poly-adenylation events [15, 35].
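To make the idea of a “UG”-rich motif concrete, the following is a minimal sketch of how one might scan an intronic sequence for runs of UG dinucleotide repeats as candidate TDP-43 binding regions. The function name `find_ug_repeats` and the `min_repeats` threshold are hypothetical choices made for illustration; real binding-site discovery relies on cross-linking immunoprecipitation data rather than sequence matching alone.

```python
import re


def find_ug_repeats(sequence, min_repeats=3):
    """Return (start, end) spans of runs of consecutive UG dinucleotides.

    `min_repeats` is an illustrative threshold, not a published cut-off.
    Coordinates are 0-based with an exclusive end.
    """
    # Accept DNA or RNA input by converting T to U.
    rna = sequence.upper().replace("T", "U")
    # Non-capturing group repeated at least `min_repeats` times.
    pattern = re.compile(r"(?:UG){%d,}" % min_repeats)
    return [(m.start(), m.end()) for m in pattern.finditer(rna)]


if __name__ == "__main__":
    # Toy sequence containing a single run of three UG repeats.
    print(find_ug_repeats("AAUGUGUGCC"))
```

A scan like this only flags candidate regions; whether TDP-43 actually binds and represses splicing at a given site has to be established experimentally.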

What are the possible consequences that result from the insertion of novel non-conserved sequences in mRNA (Fig. 1)? Most cryptic exons are predicted to lead to destabilisation and degradation of mRNA, thereby resulting in a reduction in functional levels of the corresponding protein. This occurs primarily through nonsense-mediated decay (NMD) due to introduction of frameshifts and premature termination codons [32, 36]. Conceptually, such cryptic exons could lead to deleterious downstream physiological consequences if they reside in transcripts encoding critical proteins. Occasionally, cryptic exons lead to an in-frame change to the nucleotide sequence with no premature stop codons, thereby escaping NMD and resulting in a predicted translation of novel cryptic peptides. Alternatively, they can result in premature polyadenylation, potentially leading to truncated proteins. It is conceivable that these proteins resulting from mis-spliced RNA could acquire gain of function changes in their biology, be toxic, or serve as detectable biomarkers for monitoring of disease activity, or for stratification of subtypes in ALS-FTD and other neurodegenerative disorders.
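The three outcomes described above (premature stop codon, frameshift, or in-frame insertion) can be illustrated with a simplified sketch. It assumes a hypothetical upstream coding sequence that starts in frame at the first nucleotide, followed directly by the cryptic exon; the function name and return labels are invented for this example. It does not model the positional rules that determine whether a premature stop codon actually triggers NMD, nor premature polyadenylation.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def classify_cryptic_exon(upstream_cds, cryptic_exon):
    """Classify the predicted coding consequence of a cryptic cassette exon.

    upstream_cds: coding sequence before the cryptic exon, in frame from
                  its first nucleotide (hypothetical input).
    cryptic_exon: the inserted cryptic exon sequence.
    """
    upstream = upstream_cds.upper().replace("T", "U")
    exon = cryptic_exon.upper().replace("T", "U")
    combined = upstream + exon

    # Start at the first codon that overlaps the cryptic exon.
    first_codon_start = (len(upstream) // 3) * 3
    for i in range(first_codon_start, len(combined) - 2, 3):
        if combined[i:i + 3] in STOP_CODONS:
            return "premature_stop"  # candidate for NMD or truncation

    if len(exon) % 3 != 0:
        return "frameshift"  # downstream reading frame is disrupted

    return "in_frame"  # may be translated as a novel cryptic peptide


if __name__ == "__main__":
    print(classify_cryptic_exon("AUGGCC", "GGAUAAGG"))
    print(classify_cryptic_exon("AUGGCC", "GGAGC"))
    print(classify_cryptic_exon("AUGGCC", "GGAGCA"))
```

In this toy classification only the "in_frame" case corresponds to the cryptic peptides discussed above; the other two cases correspond to transcripts predicted to be degraded or to encode truncated proteins.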

Relevance to disease

Impact on neuronal health – STMN2

In 2019, a breakthrough discovery from two independent groups found that TDP-43 loss of function leads to the inclusion of a cryptic exon within the STMN2 gene [23, 30]. This gene encodes stathmin-2, a microtubule-associated protein needed for axonal growth and repair in neurons. Occurring after exon 1 and containing a stop codon and polyadenylation site, the cryptic exon leads to premature transcript termination and predicted formation of a truncated non-functional protein. Using in vitro neuronal models of TDP-43 depletion via knockdown, induction of TDP-43 mislocalisation, and pathological loss of function TDP-43 mutations known to cause ALS, the research groups demonstrated a resultant dramatic reduction in levels of mature STMN2 mRNA and functional protein. Crucially, the STMN2 cryptic exon was confirmed to be detectable also in ALS-FTD brains and spinal cords where TDP-43 pathology is present, and specifically in patient post-mortem neurons with TDP-43 pathology isolated by laser capture or nuclear sorting [30,31,32, 37]. Without TDP-43, motor neuron axons were unable to regenerate after an axotomy. Remarkably, the researchers were able to rescue axonal regrowth by restoring levels of STMN2 protein, thereby demonstrating that STMN2 is critical to the health of motor neurons, and that its TDP-43 dependent loss can be a key pathogenic player in ALS. Following this pivotal discovery, the hunt for other cryptic exons that might contribute to or exacerbate ALS pathology continued.

Impact on ALS disease progression – UNC13A

UNC13A is one of the top genetic risk factors for ALS, first reported in a genome-wide association study in 2009 [38]. UNC13A protein plays an important role in neurotransmitter release and synaptic transmission, and mice that lack this gene (called Munc13–1 in mice) have synaptic impairments and die soon after birth [39]. In humans, rare cases with homozygous nonsense mutations in UNC13A result in fatal microcephaly, cortical hyperexcitability, and myasthenia [40]. Single nucleotide polymorphisms (SNPs) in UNC13A have been shown to increase risk of ALS and also shorten survival in patients. However, the underlying mechanism by which these UNC13A variants increase ALS risk was not known until earlier this year when, along with the Gitler and Petrucelli groups, we independently found that TDP-43 depletion resulted in inclusion of a cryptic exon between exons 20 and 21 of UNC13A, leading to reduced expression. Similarly to the STMN2 cryptic exon, this event can also be detected in post-mortem brain and spinal cord tissue from patients with ALS-FTD, exclusively where TDP-43 pathology is present, but not in healthy controls.

Intriguingly, two of the associated UNC13A SNPs, rs12608932 (A > C) and rs12973192 (C > G), were found to be located within the intron containing the cryptic exon, with the former SNP being located within the cryptic exon region itself. Especially given the large size of the UNC13A gene (~ 87 kb), the proximity of the SNPs to the cryptic exon led us to hypothesise a direct link between the SNPs and UNC13A cryptic splicing. Given that the cryptic exon is not present in RNA-sequencing data from healthy controls, even when homozygous for the risk SNPs (approximately 10% of Caucasians are homozygous for both risk SNPs), they seem not to be sufficient to cause cryptic exon inclusion on their own. Thus, it was hypothesised that rather than increasing ALS or FTD risk directly, the SNPs act by potentiating the expression of a cryptic exon which is dependent on TDP-43 depletion.

Indeed, the risk SNPs enhance cryptic splicing and alter direct TDP-43 binding, thereby reducing its ability to physiologically repress cryptic exon inclusion. In keeping with this, patients homozygous for either SNP had more UNC13A cryptic exon-containing transcripts in a dose-responsive manner compared to heterozygotes and non-carriers. This discovery provides a biological explanation for how presence of the UNC13A risk SNPs in ALS-FTD patients with TDP-43 pathology modifies disease outcomes, negatively affecting survival from symptom onset [32, 33, 41,42,43,44,45,46,47]. Further work needs to be done to understand the biology connecting the above molecular events and disease progression, by specifically investigating the functional consequences of synaptic dysfunction associated with TDP-43 and UNC13A loss.
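The notion of “more cryptic exon-containing transcripts” can be made quantitative with a percent spliced in (PSI) value computed from RNA-sequencing junction reads. The sketch below shows one common convention for a cassette exon; it is an assumption made for illustration and not necessarily the quantification used in the cited studies, and the function and parameter names are hypothetical.

```python
def percent_spliced_in(upstream_junction_reads,
                       downstream_junction_reads,
                       skipping_junction_reads):
    """Estimate PSI for a cassette (cryptic) exon from junction read counts.

    Inclusion is supported by two junctions (into and out of the exon),
    so their counts are averaged before comparison with the single
    junction that skips the exon. Returns None when there is no coverage.
    """
    inclusion = (upstream_junction_reads + downstream_junction_reads) / 2
    total = inclusion + skipping_junction_reads
    if total == 0:
        return None  # PSI is undefined without any junction reads
    return inclusion / total


if __name__ == "__main__":
    # Hypothetical counts: 10 reads on each inclusion junction, 30 skipping.
    print(percent_spliced_in(10, 10, 30))
```

Comparing PSI values across genotype groups (non-carriers, heterozygotes, homozygotes) is one way a dose-responsive relationship of the kind described above could be examined.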

Future directions

Opportunities for novel biomarkers

Biomarkers for early diagnosis, disease stratification of subtypes and clinical stages, prognostication, and monitoring treatment response are vitally important, both for clinical trials and patient care. Fluid biomarkers of general neurodegeneration exist, most notably serum neurofilament light chain, cerebrospinal fluid (CSF) phosphorylated neurofilament heavy chain, and urinary p75ECD [48,49,50]. However, biomarkers specific for ALS-FTD and other TDP-43 proteinopathies are not currently available, nor are there readouts of TDP-43 function in living patients.

Unlike canonical TDP-43 regulated splicing, the ‘on-off’ expression of cryptic exons only under TDP-43 nuclear loss of function makes them potentially potent biomarker candidates for TDP-43 pathology. This proof of concept was first shown with the STMN2 cryptic exon in post-mortem frontal cortex samples, in which its presence was able to discriminate between patients with FTD-TDP, progressive supranuclear palsy, and healthy control participants [31].

A key future direction will be to exploit this characteristic to develop methods that detect TDP-43 pathology ante-mortem in patient biofluids, such as in blood, CSF, urine, or tissue. Assays could involve either direct RNA detection of the cryptic exon itself, or detection of novel cryptic peptides for transcripts that escape NMD. An ideal biomarker would have a reproducibly high and linear correlation with disease activity, high sensitivity and specificity, stability throughout the day, and easy access to obtain a sample [51].

With regards to biomarker detection of RNA, methods have been developed that can sensitively detect circulating RNA in the blood of individuals with Alzheimer’s disease that reflect disease-specific neural transcriptome changes [52]. However, there are challenges. For example, most cryptic exon-containing transcripts, such as for UNC13A, undergo NMD and are therefore not stable. Moreover, if transcripts are at low levels, highly sensitive tools would be required. Detection of a panel of relevant cryptic exons deemed to be pathogenic players could theoretically increase sensitivity.
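The suggestion that a panel could increase sensitivity can be illustrated with a small calculation. The sketch below assumes that each marker behaves independently and that the panel is called positive when any single marker is positive; both are simplifying assumptions, and the example values are hypothetical rather than measured. Under the same rule, specificity can only stay the same or fall, which is the trade-off to weigh against the gain in sensitivity.

```python
from math import prod


def panel_sensitivity(sensitivities):
    """Sensitivity of an 'any marker positive' panel, assuming independence."""
    # The panel misses a case only if every marker misses it.
    return 1 - prod(1 - s for s in sensitivities)


def panel_specificity(specificities):
    """Specificity of an 'any marker positive' panel, assuming independence."""
    # The panel is negative in a control only if every marker is negative.
    return prod(specificities)


if __name__ == "__main__":
    # Hypothetical example: two markers, each detecting half of true cases.
    print(panel_sensitivity([0.5, 0.5]))
    print(panel_specificity([1.0, 0.5]))
```

In practice, cryptic exon markers driven by the same TDP-43 loss are unlikely to be fully independent, so the real gain would be smaller than this idealised calculation suggests.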

For cryptic exons that produce cryptic peptides, specific antibodies could be designed to detect these via immunohistochemistry, or ELISA and SIMOA-based assays. This has its own challenges, whether it be due to NMD of cryptic-containing transcripts, low expression, or instability of the peptides. For STMN2, even though the cryptic-containing transcript is highly expressed and escapes NMD, attempts to detect the putative 16 amino acid cryptic peptide have not yet been successful. However, antibody generation to specifically detect pathological proteins is evolving, including in the field of ALS-FTD, with antibodies now available to detect dipeptide repeats that occur in the C9orf72 subtype of ALS-FTD. These have not only furthered our understanding of disease mechanisms, but are also being utilised to assess therapies [53,54,55,56].

Excitingly, two new pre-print publications this year have independently reported the detection of cryptic peptides in the CSF of patients with ALS [57, 58]. Irwin et al. demonstrated that a newly characterised monoclonal antibody, specific to a TDP-43-dependent cryptic epitope encoded by the cryptic exon found in HDGFL2, detects the cryptic peptide in C9orf72-associated ALS. Strikingly, this includes pre-symptomatic mutation carriers [57]. In Seddighi et al., our groups first demonstrated the presence of de novo cryptic peptides in iPSC-derived neurons with TDP-43 knockdown, and then used a novel targeted proteomics assay to confirm the presence of cryptic peptides in CSF of patients with ALS-FTD [58]. Further work will be needed to assess the specificity and sensitivity of cryptic peptides as biomarkers, but, taken together, these discoveries are encouraging steps towards facilitating earlier diagnosis of ALS, and also providing a way of measuring target engagement in clinical trials for new therapies aimed at restoring TDP-43 function.

Given the cell-type specificity of TDP-43 controlled cryptic exons, it is also possible that biomarkers towards cryptic exons could be designed to assess the flow of TDP-43 proteinopathy across the nervous tissue [25, 35, 59, 60]. While TDP-43 proteinopathy in cortical and motor neurons has received much attention, TDP-43 proteinopathy is also present in glia and Schwann cells [2, 60,61,62,63]. Indeed, TDP-43 depletion in mouse Schwann cells showed that TDP-43 regulates the inclusion of a cell-type specific cryptic exon in Neurofascin in these cells, highlighting how, if certain TDP-43 cryptic exons are expressed only by given cells, they could allow tracing of TDP-43 proteinopathy. Furthermore, recent work showing that pTDP-43 and loss of nuclear TDP-43 can be detected years before ALS disease presentation in central nervous system and non-central nervous system tissue suggests that detection of TDP-43 cryptic exons could possibly pre-date disease onset [64, 65].

Another interesting avenue is the use of model systems to determine the timing of the emergence of cryptic exon inclusion events. For instance, studies assessing the presence of cryptic exons at different levels of TDP-43 knockdown in vitro, or in patients at, or post-mortem tissue from, varying clinical stages of ALS-FTD, could allow curation of a database reflecting which cryptic exons appear early and late in disease. Further analysis of this could provide a greater mechanistic understanding of cryptic exon biology and of what factors make certain genes more susceptible to the inclusion of cryptic exons with TDP-43 loss. Such a classification may also have implications for biomarker studies, early diagnosis, and stratification of people with ALS-FTD.

Further work is necessary to utilise the specificity of cryptic exons to develop appropriate biomarkers, which could be transformative to the field of ALS-FTD and other TDP-43 proteinopathies. Particularly with new therapies on the horizon for neurodegenerative diseases, early diagnosis of individuals who are or will become affected and a readout for treatment response will be critical in halting the disease process at an early enough stage to pre-empt the point of no return in the ALS disease cascade [66].

Opportunities for therapeutics

The discovery that certain cryptic exons, and potentially more that remain to be explored, underpin the disease process in ALS – either by impairing neuronal health, increasing ALS risk, or worsening disease progression – makes them exciting novel therapeutic targets. If early diagnosis can be achieved, the hope would be to restore expression of critical proteins to normal functional levels, or use splicing therapies to modulate and prevent aberrant splicing from occurring altogether in TDP-43 proteinopathies.

The STMN2 cryptic exon leads to a reduction in STMN2 expression. Thus, one strategy would be to restore normal STMN2 levels using a viral vector containing the correct DNA sequence, such as via adeno-associated viral (AAV) or lentiviral delivery. This proof of concept was shown when transduction of TDP-43 depleted iPSC-derived motor neurons with lentivirus carrying STMN2 restored STMN2 levels and also, importantly, restored axonal regeneration after axotomy [30]. It was also recently shown in mice deficient of Stmn2 (Stmn2−/−) that the introduction of the human STMN2 gene was able to rescue motor deficits [67]. However, translating this method to humans poses challenges, including discerning what the target level should be and at what point to treat in order to avoid missing the effective therapeutic window but also avoid overexpression and potential gain of function toxicity, as was observed in spinal muscular atrophy (SMA) studies as a result of SMN overexpression [68]. Moreover, it would be important that such therapies are directed to the correct anatomical site with appropriate tissue and cell-type specificity to prevent off-target effects, and also that any harmful immunogenic responses to the viral vector itself are avoided [69]. Another obstacle would be the limitation in packaging capacity of the viral vector and therefore the size of the gene needing to be restored. As such, UNC13A would be too large to be introduced via this methodology, as opposed to STMN2, which would theoretically be more amenable.

Other important gene-therapy strategies involve the prevention of erroneous RNA splicing. Compared to the above described gene delivery methods, these approaches aim to correct endogenous splicing and therefore avoid problems deriving from overexpression and toxicity. One method is via antisense oligonucleotide therapies (ASOs), which are short synthetic DNA sequences that are designed to selectively bind to pre-mRNA in cells. ASOs have come to the forefront as a new tool for numerous neurodegenerative diseases, including toxic gain of function gene-specific ALS subtypes, where they have been used to degrade mutant transcripts. ASOs designed to reduce protein produced from mutant SOD1 and C9orf72 have been tested in ALS clinical trials (BIIB067 and BIIB078, respectively). The former showed very encouraging results, but failed to reach significance in achieving primary clinical endpoints in a phase 3 trial, and the latter phase 1 trial showed no difference from placebo for clinical endpoints and in fact trended towards greater decline at the higher dose. These results highlight some of the caveats of ASO therapies [70, 71]. ASOs can also be used to directly introduce splicing modulation, and this has been used to improve functional splicing in disorders, such as SMA, or to limit the effect of deleterious mutations by skipping specific exons, such as in Duchenne muscular dystrophy (DMD) [72,73,74,75]. One drawback of ASO therapies is the need for repeated dosing for a maintained response, which is particularly undesirable for patients if the delivery method is invasive, such as via intrathecal administration.

Other effective splice switching strategies include systems that target pre-mRNA, but are stably expressed in the long-term, thereby preventing the need for repeated dosages, whilst also aiming to circumvent the above concerns regarding overexpression. These include CRISPR-Cas ribonuclease programming, such as via RfxCas13d (also named CasRx, from Ruminococcus flavefaciens), and using modified uridine-rich small nuclear RNA (snRNA) gene therapy, such as via U7 snRNA delivery, which has shown promise in DMD, and has entered its first clinical trial [76,77,78]. In addition to the above benefits, both methods involve relatively small-sized effectors, such that they can be packaged into a viral vector, and theoretically may even allow for multiple cryptic exons to be targeted in parallel.

Finally, small molecule delivery has been developed as a further tool to alter splicing. A clear benefit of this therapeutic platform is the ability to deliver the treatment to patients orally. The first small molecule splicing modifier to be approved was Risdiplam for oral delivery in SMA. It was shown to increase SMN protein in the central nervous system and peripheral tissues in mice, and to restore functional SMN protein in patients with SMA [79,80,81,82].

Homing in on correcting cryptic exon-related mis-splicing, a phase I clinical trial is planned to assess an ASO designed to restore STMN2 levels [83]. Commonly in the field of therapeutics, mouse models are used prior to testing in humans; however, human cryptic exons have been found to share very little overlap with mouse cryptic exons, which complicates this approach [15]. The cryptic exons in both STMN2 and UNC13A that have been reported in humans are not present in mice. Interestingly, different UNC13A cryptic exons are present in mice [15]. Therefore, in vivo models are needed to test splicing therapies that target specific human cryptic exons. One possibility is the development of humanised mouse models containing the relevant human cryptic exons [84, 85]. Another issue arising from the use of human iPSC models alone to study therapeutics in the field of cryptic exons is that cryptic exon signatures in TDP-43 depletion have been shown to be highly variable between cell types [25, 35]. Whilst there was some commonality between stem cells, neurons, and myocytes, most non-conserved cryptic exons were cell-type specific. Thus, TDP-43 loss of function may impair cell-type specific pathways, which has mechanistic and treatment-evaluation implications that would need to be considered. With the development of better models for assessing human-relevant cryptic exons, there is significant promise for novel therapies on the horizon.

Whilst blocking TDP-43 cryptic exons to restore gene expression appears to have a relatively clear mechanistic benefit, it remains to be seen how many and which cryptic exons need to be targeted for an effective therapy. Future experiments may seek to determine which cryptic exon inclusion events are pathologically relevant, impacting on a biological pathway that contributes to disease, as this may impact the development of therapies and potential ways to restore the expression of multiple mRNA targets, as opposed to one at a time.
