Evolutionary conservation of the fidelity of transcription

Multiple proteins are required to maintain the fidelity of transcription in yeast

Using genetically engineered yeast strains21,36,37 and massively parallel sequencing technology34,35, we and others previously demonstrated that the non-essential subunits Rpb9 and TFIIS enhance the fidelity of transcription of RNAPII in living cells16,21,26,34,35,36,37,38,39. In addition, we showed that the functional analog of these proteins, Rpa12, controls the in vivo fidelity of RNAPI in yeast35. To further unravel the molecular architecture that controls the fidelity of transcription, we investigated whether the remaining non-essential subunits of RNAPI and II also play a role in the fidelity of transcription. First, we measured the error rate of yeast cells that carry a deletion of Rpa14, Rpa34 or Rpa49, the three remaining non-essential subunits of RNAPI. To do so, we used an optimized version of the circle-sequencing assay35,40, a consensus sequencing approach that was initially designed to measure the mutation rate of RNA viruses41,42 (Fig. 1). These experiments revealed that loss of Rpa34 and Rpa49 increased the base substitution rate of RNAPI 4-fold (1.4 × 10−5 ± 6.3 × 10−7/bp for Rpa34Δ cells and 8.6 × 10−6 ± 7.8 × 10−7/bp for Rpa49Δ cells, Fig. 2A), while the rate of insertions and deletions increased 2-fold (Fig. S1a, b), indicating that both Rpa34 and Rpa49 are important for the fidelity of RNAPI in living yeast. Full tables containing the number of errors detected and bases sequenced for key datapoints are listed in supplementary table 1. Interestingly, recent in vitro experiments directly support this conclusion43. We further note that no increase was detected in the error rate of RNAPII, consistent with the idea that Rpa34 and Rpa49 are only part of the RNAPI holoenzyme. In addition, these observations serve as an internal control to demonstrate that the reduced fidelity of RNAPI was not the result of unrelated, cellular conditions that affected transcriptional fidelity. 
In contrast, loss of Rpa14 had no effect on the error rate of transcription, indicating that it does not play a role in transcriptional fidelity. When we examined the increased error rates of Rpa34Δ and Rpa49Δ cells further, we found that they were primarily fueled by an increase in G→A errors (Fig. 2B). Although other errors increased in abundance as well (full error spectra are depicted in Figs. S2 and S3), a disproportionate increase in G→A errors characterized every other mutant yeast strain known to display error-prone transcription, including Rpa12Δ, Rpb9Δ, Dst1Δ (DST1 encodes TFIIS), and Rpb1E1103G cells35 (RNAPII, Fig. 2C). This comparison suggests that G→A errors pose a substantial threat to the fidelity of transcription and that multiple proteins evolved to prevent them. Interestingly, loss of the fidelity factor GreA increases the rate of G→A errors in bacteria as well, further underlining the universality of this observation44.
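The error rates and fold changes reported above can be illustrated with a short calculation. The sketch below assumes that a per-replicate error rate is the number of errors divided by the number of bases sequenced, and that replicates are summarized as mean ± standard error of the mean. The counts are hypothetical and chosen only for illustration (the real counts are in supplementary table 1), and the text does not state whether counts were pooled or averaged per replicate.

```python
from math import sqrt
from statistics import mean, stdev

def error_rate(errors, bases):
    """Errors per sequenced base for one replicate."""
    return errors / bases

def mean_and_sem(values):
    """Mean and standard error of the mean across replicates."""
    return mean(values), stdev(values) / sqrt(len(values))

# Hypothetical (errors, bases sequenced) pairs for three replicates each.
wt_counts = [(40, 20_000_000), (44, 20_000_000), (36, 20_000_000)]
mutant_counts = [(160, 20_000_000), (170, 20_000_000), (150, 20_000_000)]

wt_rates = [error_rate(e, b) for e, b in wt_counts]
mutant_rates = [error_rate(e, b) for e, b in mutant_counts]

wt_mean, wt_sem = mean_and_sem(wt_rates)
mutant_mean, mutant_sem = mean_and_sem(mutant_rates)
fold_change = mutant_mean / wt_mean

print(f"WT:     {wt_mean:.2e} ± {wt_sem:.2e} /bp")
print(f"Mutant: {mutant_mean:.2e} ± {mutant_sem:.2e} /bp")
print(f"Fold change: {fold_change:.1f}")
```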

Fig. 1: Core concept of the circle-sequencing assay.

Traditional sequencing approaches can identify transcription errors (red circle) present in RNA molecules (blue lines). However, during reverse transcription of RNA templates, additional errors (blue circles) are introduced into the cDNA (yellow lines) by reverse transcriptases that are indistinguishable from true transcription errors. Additional artifacts (yellow circles) that resemble transcription errors are also introduced during sequencing itself. To prevent these artifacts from confounding our measurements, RNA is circularized prior to reverse transcription. These circularized molecules are then reverse transcribed in a rolling circle fashion to generate linear cDNA molecules constructed of tandem repeats of the original RNA fragment. If a transcription error was present in the original template, this error will be present in all copies of this template, while artifacts will occur only once.
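The consensus logic described in this legend can be sketched in a few lines. This is a simplified illustration and not the published pipeline: it assumes that the repeat length is already known and that the copies are in register with the reference, whereas real reads require the repeats to be found and aligned first. The function names and the toy sequences are hypothetical.

```python
def split_repeats(read, unit_len):
    """Cut a rolling-circle read into consecutive copies of the original fragment."""
    return [read[i:i + unit_len] for i in range(0, len(read) - unit_len + 1, unit_len)]

def call_transcription_errors(read, reference, min_copies=3):
    """Report positions where every copy agrees on a base that differs from the reference.

    A mismatch seen in only some copies is treated as a reverse transcription
    or sequencing artifact and is ignored.
    """
    copies = split_repeats(read, len(reference))
    if len(copies) < min_copies:
        return []
    errors = []
    for pos, column in enumerate(zip(*copies)):
        if len(set(column)) == 1 and column[0] != reference[pos]:
            errors.append((pos, reference[pos], column[0]))
    return errors

reference = "ACGTACGTAC"
# Toy read: three tandem copies of a fragment carrying a G->A error at position 2.
# The second copy also carries a one-off artifact at position 8.
read = "ACATACGTAC" + "ACATACGTCC" + "ACATACGTAC"
errors = call_transcription_errors(read, reference)
print(errors)
```

Requiring agreement across all copies mirrors the statement that a true transcription error is present in every repeat, while an artifact occurs only once.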

Fig. 2: A survey of proteins and alleles that control the fidelity of transcription in yeast.

A Rpa12 (n = 3, P = 0.0005), Rpa34 (n = 3, P = 0.0005) and Rpa49 (n = 3, P = 0.0061) are required for high-fidelity transcription by RNAPI (WT n = 7, Rpa14 n = 1). B, C All the error-prone alleles identified so far in RNAPI and II specifically increase G→A errors (n identical to A). D RNAPIII displays a higher error rate than RNAPI and II, primarily due to an increased G→A error rate (n = 7). E The error spectrum of WT and Rpo41 mutants (n = 3). F Two alleles, Rpo41G1023A (P = 0.0092) and Rpo41G1028A (P = 0.01), display increased error rates (n = 3). A third allele, Rpo41H1163A, resulted in too few reads from the mitochondrial genome to draw firm conclusions (#). G Cells that carry the Rpo41H1163A allele display an increased error rate in nuclear RNA (n = 3, P = 0.0344 for RNAPI and P = 0.255). H Transcription errors detected in 18S rRNA. Red bases indicate bases in which errors were detected. Green and blue lines indicate that these bases form secondary structures or make connections with other rRNA subunits. The errors we detected affect every aspect of RNA structure and function. In all cases, n is defined as biologically independent replicates. All experiments were analyzed by unpaired, two-tailed Welch’s t-tests using Prism software. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars depict standard error of the mean. Source data are provided as a Source Data file.
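For readers unfamiliar with the statistical test named in the figure legends, the following sketch computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. It is an illustration only: the analyses in the figures were run in Prism, the input values here are hypothetical, and converting t and the degrees of freedom into a two-tailed P value additionally requires the t distribution, which is omitted.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for an unpaired Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Per-group variance of the mean (sample variance divided by n).
    var_a = variance(sample_a) / n_a
    var_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical values in arbitrary units, three replicates per group.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(f"t = {t_stat:.3f}, df = {dof:.3f}")
```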

Next, we examined the error rate of yeast cells that lack Rpb4, the only non-essential subunit of RNAPII in addition to Rpb9 and TFIIS, and a functional homolog of Rpa14. Like Rpa14, loss of Rpb4 had no effect on the error rate of RNAPI or II (Fig. 2A), suggesting that it does not play a role in transcriptional fidelity. In addition to non-essential subunits, essential components of RNAPII could contribute to the fidelity of transcription as well. To examine this possibility, we analyzed yeast strains from the DamP collection45, which carry insertions in the 5′UTR of essential genes that reduce their expression (Table 1). From this collection, we analyzed strains with reduced expression of Rpb3, Rpb5, Rpb7, and Rpb8, all of which encode core components of the RNAPII holoenzyme. In addition, we tested whether reduced expression of Tfg1 or FcpI affects the fidelity of RNAPII. Tfg1 and FcpI are part of the TFIIF elongation complex and play a similar role in RNAPII as Rpa34 and Rpa49 in RNAPI, which suggests that they could affect transcriptional fidelity as well. However, all these strains displayed similar error rates compared to WT cells (Fig. S4). Although this result suggests that these proteins do not contribute to fidelity, it should be noted that the degree to which some genes were knocked down was insufficient to draw firm conclusions (Table 1). Another caveat of this experiment is that RNAPII holo-enzymes assembled without these proteins may not produce enough transcripts to affect the overall error rate of transcription. Accordingly, we propose that experiments with targeted point mutations that leave the initiation, processivity and elongation rate of RNAPII intact may be needed to resolve these issues. 
Similar experiments could also address whether auxiliary components of the transcription machinery, including those that are only transiently associated with this process, such as transcription coupled DNA repair proteins, alter the fidelity of transcription. Depending on the safety net these proteins affect, these mutations could alter the fidelity of transcription in various ways. Finally, these experiments could reveal whether residues that maintain the fidelity of transcription of RNAPII serve a similar function in related polymerases. For example, the rpb1E1103G mutation causes error-prone transcription by RNAPII in yeast. It would be interesting to test whether the analogous mutation in RNAPI (Rpa190E1124G)46 plays a similar role in transcriptional fidelity.

Table 1 Yeast proteins and alleles monitored for their contribution to transcriptional fidelity

Since Rpa12, Rpb9, and TFIIS are the strongest modulators of transcriptional fidelity identified thus far, we investigated whether the functional analog of these proteins in RNAPIII (Rpc11) controls the in vivo fidelity of RNAPIII in yeast. Previous in vitro studies have shown that Rpc11 contains a strong, intrinsic cleavage activity that could improve the fidelity of RNAPIII as much as 1000-fold18. Unlike Rpa12, Rpb9 and TFIIS, however, Rpc11 is essential for the viability of yeast cells, preventing examination of homozygous Rpc11Δ strains. Neither a DamP strain nor a heterozygous knockout of Rpc11 displayed increased error rates (Fig. S5), suggesting that specific mutations in Rpc11 that knock out its proofreading ability (including the Rpc11E92H mutation47) may be needed to assess the role of Rpc11 in the fidelity of transcription in living cells. Our experiments did demonstrate, though, that RNAPIII commits 5 times more mistakes than RNAPI and II (Fig. 2D), primarily due to an elevated G→A error rate. This phenotype strongly resembles the fidelity of RNAPI and II without their fidelity factors, suggesting that the cleavage activity of Rpc11, despite its potency, is still insufficient to fully suppress transcriptional mutagenesis in living cells.

Finally, we examined the fidelity of the mitochondrial RNA polymerase (Rpo41), a single subunit enzyme that resembles the T7 phage RNA polymerase. Four amino acids were previously described that control the fidelity of the phage polymerase48, suggesting that these residues could modulate the error rate of Rpo41 as well. Therefore, we genetically engineered 4 analogous mutations into Rpo41 by gene substitution (creating the rpo41G1023A, rpo41V1027A, rpo41G1028A and rpo41H1163A alleles; Table 1, Fig. S6a) and measured the error rates of these cell lines. Interestingly, we found that cells that carry the rpo41G1023A and rpo41G1028A mutations indeed displayed an elevated error rate in mitochondrial RNA (1.31 × 10−5 ± 1.05 × 10−6/bp for rpo41G1023A cells and 2.32 × 10−5 ± 5.94 × 10−6/bp for rpo41G1028A cells, Fig. 2E, F), demonstrating that the fidelity function of these amino acids is conserved. In contrast to the error-prone RNAPI and II lines though, these error-prone lines did not display an elevated G→A error rate, consistent with the distinct evolutionary origins of Rpo41 compared to RNAPI and II. The rpo41V1027A allele, on the other hand, did not affect the fidelity of mtRNA, indicating that not all T7 phage allelic effects are conserved in Rpo41. Finally, we were unable to accurately measure the fidelity of the rpo41H1163A allele. Cells that carried this allele had lost the vast majority of their mtDNA and mtRNA (Fig. S6b), preventing confident measurements of RNA integrity. Because the mitochondrial RNA polymerase is required to generate RNA primers for mtDNA replication49,50, one explanation for this observation is that the rpo41H1163A allele inhibits either the initiation, elongation, or processivity of Rpo41, thereby preventing the primers required for mtDNA replication from being generated efficiently.

We were surprised to find, though, that rpo41H1163A cells do display an increased error rate in nuclear RNA (Fig. 2G). Thus, there seems to be an unexpected relationship between mitochondrial function and the fidelity of transcription in the nucleus. Because oxidative damage is a powerful source of transcription errors28,29,30, it is possible that the rpo41H1163A allele elevates the error rate of transcription in the nucleus by inducing reactive oxygen species. Alternatively, loss of mtDNA could affect nuclear transcriptional fidelity by altering the production of mitochondrial iron-sulfur clusters, which are required for efficient DNA repair51 and potentially transcription52. Regardless, these experiments identify multiple alleles and proteins that play a role in transcriptional fidelity and, in doing so, provide mutant strains that can now be used to understand the impact of transcription errors on cellular health, including errors that occur in mitochondrial RNA and rRNA. For example, our experiments show that transcription errors can affect every aspect of rRNA structure and function (Fig. 2H). How these errors affect cellular function can now be determined with the help of these mutant strains.

Evolutionary conservation of fidelity genes in higher organisms

Next, we wondered whether the fidelity factors identified in yeast play a similar role in multi-cellular organisms. To test this hypothesis, we selected the strongest fidelity factor in yeast (TFIIS) as our primary target and asked if TFIIS has a functional homolog in the nematode C. elegans. By searching for a similar protein sequence in the worm proteome with BLASTP, we identified T24H10.1 as its closest relative (E-value 4 × 10−27), a protein with 27.5% identity and 54% strong similarity as determined by the Clustal Omega Alignment tool (Fig. S7). The only other close relative was a region of rpc-11, a subunit of RNAPIII in C. elegans that encodes an embedded TFIIS-like domain (E-value 2 × 10−6). We then examined a strain that contains a deletion in T24H10.1 and found that these worms display a 5-fold increase in base substitutions (1.72 × 10−5 ± 5.87 × 10−7/bp, Fig. 3A) and a 1.5-fold increase in insertions compared to control worms (Fig. 3B). Further analysis showed that the elevated error rate of T24H10.1Δ worms is fueled by G→A errors and that these errors are only increased in RNAs generated by RNAPII (Fig. 3A), suggesting that T24H10.1 is a fidelity factor for RNAPII (Fig. 3C). Thus, T24H10.1Δ worms display a fidelity defect similar to that of TFIIS null cells, indicating that T24H10.1 is indeed a functional homolog of TFIIS in C. elegans. This strain provides a unique opportunity to understand the impact of transcription errors on organismal health.
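To make the identity figure above concrete, the sketch below computes percent identity from two rows of a pairwise alignment. It is an assumption-laden illustration: the denominator used here (columns without a gap in either sequence) is one common convention and may differ from the one Clustal Omega reports, and the two short sequences are invented toy strings, not the TFIIS or T24H10.1 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Toy alignment rows (hypothetical): 6 gap-free columns, 5 of them identical.
identity = percent_identity("MKT-AILV", "MRTQAI-V")
print(f"{identity:.1f}% identity")
```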

Fig. 3: The error rate of transcription in genetically engineered worms and human cells.

A Worms that carry a partial deletion in the T24H10.1 gene (n = 3, P < 0.0001) display an increased error rate in RNAPII but not RNAPI, III or the mitochondrial RNA polymerase compared to WT worms (n = 6). B RNAPII in T24H10.1Δ worms (n = 3) displays an increased insertion rate compared to WT worms (n = 6, P = 0.0004). C The increased error rate of T24H10.1Δ worms (n = 3) is primarily fueled by G→A errors (P = 0.0025). D Worms that carry a heterozygous mutation in the ama-1 gene (ama-1+/E1120G, n = 3) display an increased G→A error rate (P = 0.0458). E RNAPII in human cells that carry a homozygous mutation in the POLR2A gene (POLR2AE1126G, n = 3) displays an increased error rate compared to WT cells (n = 3), while RNAPI does not (P = 0.0003 for clone 1 and P = 0.0392 for clone 2). F The increased error rate of POLR2AE1126G cells is primarily fueled by an increased G→A error rate. In all cases, n is defined as biologically independent replicates. All experiments were analyzed by unpaired, two-tailed Welch’s t-tests using Prism software. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars depict standard error of the mean. Source data are provided as a Source Data file.

Next, we wondered whether the fidelity effects of specific amino acids are conserved in multi-cellular organisms as well. The best studied allele of error-prone transcription in yeast is the rpb1E1103G allele, which raises the error rate of transcription 5-fold. Therefore, we used CRISPR/Cas9 technology to replicate this allele in C. elegans (ama-1E1120G) and monitored the fidelity of RNAPII. Because homozygous ama-1E1120G worms are inviable, these experiments were performed on heterozygous animals, and these worms indeed displayed an increased G→A error rate in mRNA (5.4 × 10−5 ± 4.47 × 10−7), similar to yeast cells that carry the rpb1E1103G allele (Fig. 3D). No increase was observed in rRNA, indicating that this increase was not caused by unrelated cellular conditions that affected transcriptional fidelity (Fig. S8). We then wondered if the fidelity effect of this allele is also conserved in humans and used CRISPR/Cas9 technology combined with single cell cloning to generate multiple cell lines that carry an analogous mutation in HEK293 cells (POLR2AE1126G). Analysis of 2 independent homozygous POLR2AE1126G/E1126G clones revealed that both cell lines display a 3-fold increase in transcription errors (6.85 × 10−6 ± 3.74 × 10−7/bp for clone 1 and 6.88 × 10−6 ± 1.51 × 10−6/bp for clone 2, Fig. 3E). This increase was primarily fueled by G→A errors (Fig. 3F), similar to the rpb1E1103G allele in yeast and the ama-1E1120G allele in worms. Taken together, these experiments demonstrate that the fidelity defects of yeast mutants can indeed be functionally conserved in multi-cellular organisms and highlight the threat of G→A errors to the fidelity of transcription across species. In addition, because the POLR2AE1126G/E1126G cells display error-prone transcription, they provide a unique opportunity to understand how transcription errors affect human biology.

Human patients with mutations in POLR2A that encode error-prone RNA polymerases

Because the experiments described above demonstrate that human cells are capable of error-prone transcription, we wondered whether patients exist that express error-prone RNA polymerases. Interestingly, a primary53 and secondary54 cohort of patients was recently identified that carry mutations in POLR2A, the major catalytic subunit of RNAPII in human cells. These patients suffer from various symptoms, including muscle weakness, enlarged ventricles, white matter abnormalities and cerebellar problems. Several mutations identified in these patients cluster in regions that are essential for transcriptional fidelity, including the trigger loop, prompting us to perform an exploratory screen on a collection of yeast strains that carry analogous, patient-specific mutations in Rpb1, the yeast homolog of POLR2A53 (Fig. 4A). From this screen, 4 promising mutants were selected for further validation, 2 of which indeed displayed an elevated error rate (9.8 × 10−6 ± 2.82 × 10−6/bp for rpb1L1101P cells and 4.67 × 10−6 ± 5.51 × 10−7/bp for rpb1N1232S cells, Fig. 4B). One of these mutations affected amino acid 1101 of Rpb1 (POLR2AL1124P in humans), which is only 2 amino acids upstream from the error-prone Rpb1E1103G mutation in yeast and the error-prone POLR2AE1126G mutation in humans (Fig. 4C). The second error-prone mutation (POLR2AN1251S in humans) is located just downstream from amino acid 1230 in yeast. Interestingly, mutations at that location (rpb1E1230K) disconnect Rpb1 from the fidelity factor TFIIS21,55, which results in a 10-fold increase in transcription errors (4.64 × 10−5 ± 2 × 10−6/bp), mimicking cells in which TFIIS has been deleted (Fig. 4D). It is possible that a similar disconnect is responsible for the increased error rate of rpb1N1232S cells. We further found that both alleles display an elevated G→A error rate (Fig. 4E), strengthening the relationship between the patient-specific alleles and the known, error-prone alleles that they flank.

Fig. 4: Increased error rates in yeast and human cells that carry patient-specific mutations.

A Twenty-three patient-specific mutations were tested for their impact on transcriptional fidelity in yeast (n = 1). Four of these mutations (green bars) were selected for additional sequencing tests (n = 3). B During these additional tests, two of those mutations, Rpb1L1101P (P = 0.0474) and Rpb1N1232S (P = 0.0176), were found to display higher error rates compared to WT cells (n = 3). C The patient-specific mutations that were error-prone are located in close vicinity to alleles known to be error-prone, including the Rpb1E1103G and Rpb1E1230K alleles. D Cells that carry the Rpb1E1230K allele cannot couple TFIIS to Rpb1, mimicking a TFIIS knock out (n = 3, P = 0.0007 for Rpb1E1230K and P = 0.0071 for dst1Δ cells). E The elevated error rate of Rpb1L1101P and Rpb1N1232S cells (n = 3) is primarily caused by an increased G→A error rate. F POLR2AN1251S cells (n = 3, P = 0.0134) display an increased error rate of transcription that is primarily fueled by G→A errors (G). H Although the overall error rate of POLR2AL1124P cells (n = 3) is not significantly different, these cells do display an increased G→A error rate (P = 0.0091), suggesting that POLR2AL1124P proteins are indeed error-prone. In all cases, n is defined as biologically independent replicates. All experiments were analyzed by unpaired, two-tailed Welch’s t-tests using Prism software. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars depict standard error of the mean. Source data are provided as a Source Data file.

Next, we wanted to confirm that these mutations result in error-prone transcription in human cells as well. To do so, we measured the error rate of transcription in genetically engineered HeLa cells that express either the POLR2AL1124P or POLR2AN1251S allele, or a WT POLR2A as a control. Excitingly, we found that cells that express the POLR2AN1251S mutation indeed display an increased error rate fueled by G→A errors (1.32 × 10−5 ± 2.14 × 10−6/bp, Fig. 4E, F), similar to the rpb1N1232S allele in yeast. Surprisingly though, the effect of the POLR2AL1124P allele was modest compared to the analogous mutation in yeast (6.17 × 10−6 ± 1.52 × 10−6/bp), although POLR2AL1124P cells did display an elevated G→A error rate (3.3 × 10−6 ± 1.03 × 10−6, Fig. 4F). It is possible though, that incomplete deactivation of the native POLR2A enzyme, or retention of the mutant POLR2A in the cytoplasm may have contributed to this observation (Fig. S9). Regardless, these experiments provide the first definitive proof that human patients exist that carry an error-prone RNA polymerase.

Environmental causes of transcriptional mutagenesis

Multiple genes and protein structures that affect the fidelity of transcription in yeast are conserved in higher organisms. Accordingly, we wondered whether environmental factors that affect the error rate in yeast affect multi-cellular organisms as well. Currently, there are two environmental factors known to cause error-prone transcription in yeast: mutagen exposure34 (and the DNA damage it induces28,30) and natural aging38. One of the strongest transcriptional mutagens in yeast is the alkylating agent MNNG, which induces C→U errors through O6-methyl guanine adducts35. To determine if MNNG has the same effect on human cells, we exposed human fibroblasts for 1 h to 10 μg/ml MNNG, replaced the medium and let the cells recuperate for 3 or 24 h to generate RNA molecules from damaged DNA templates. We found that human cells display an 8-fold increase in transcription errors (3 × 10−5 ± 9.5 × 10−6/bp, Fig. 5A). Similar to yeast, this increase is primarily fueled by excess C→U errors, presumably due to O6-methyl guanine adducts that form on the DNA (Fig. 5B). These experiments demonstrate that exposure to mutagens can lead to enough DNA damage in human cells to elevate the error rate of transcription, and that this elevation can be maintained for an extended period of time after exposure. For two reasons, it is unlikely that these measurements were confounded by DNA mutations. First, our measurements were performed on cells that were grown into a non-dividing state by contact inhibition. Because the damaged cells were not actively replicating their genome and DNA replication is required to fix DNA damage into mutations, this quiescent state limits the mutagenic potential of MNNG adducts. Similarly, DNA replication is also required to fix ENU damage into mutations56 (ENU is a DNA alkylating agent as well). As a result, this feature can be exploited to bypass the impact of mutations on error measurements in the context of DNA damage28,30,31.
Second, our measurements show that the error rate of transcription is >100-fold higher than the mutation rate in human cells57,58. For example, measurements in cultured human cells demonstrate that mutations occur at a frequency of 1.6 ± 1.2 × 10−8/bp and only increase several-fold in response to DNA damage56,57,58,59. Thus, it is unlikely that these experiments were confounded by DNA mutations introduced by MNNG exposure. Furthermore, because we observed a similar increase in C→U errors in worms, flies and mouse cells exposed to MNNG (Fig. 5A, B), we conclude that transcriptional mutagenesis is a universal consequence of exposure to MNNG.
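The size of this gap can be checked with a short calculation. The sketch below uses two values reported in this article: the RNAPII error rate of untreated primary human fibroblasts (4.7 × 10−6 per base) and the cited mutation frequency of cultured human cells (1.6 ± 1.2 × 10−8 per bp). Treating the mean plus its reported uncertainty as a conservative upper bound is an assumption made here for illustration.

```python
# Values taken from the text; variable names are illustrative.
transcription_error_rate = 4.7e-6   # RNAPII, untreated human fibroblasts, per base
mutation_rate_mean = 1.6e-8         # cultured human cells, per bp
mutation_rate_uncertainty = 1.2e-8  # reported ± value

fold_vs_mean = transcription_error_rate / mutation_rate_mean
# Conservative comparison: assume the mutation rate sits at mean + uncertainty.
fold_vs_upper = transcription_error_rate / (mutation_rate_mean + mutation_rate_uncertainty)

print(f"Versus mean mutation rate: {fold_vs_mean:.0f}-fold")
print(f"Versus upper bound:        {fold_vs_upper:.0f}-fold")
```

Under both comparisons the transcription error rate exceeds the mutation rate by more than two orders of magnitude, in line with the >100-fold figure stated above.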

Fig. 5: The error rate of transcription in multiple organisms of increasing complexity.

A MNNG elevates the error rate of RNAPII in yeast (n = 7 for untreated cells and n = 3 for treated cells, P = 0.0104), worm (n = 4 for untreated worms and n = 3 for treated worms, P = 0.0036), fly (n = 7 for untreated flies and n = 3 for treated flies, P = 0.0193), mouse (P = 0.0002) and human cells (P = 0.0075; n = 3 for untreated cells, n = 3 for cells that were allowed to recover for a short time, and n = 1 for cells that recovered for a long time). B In each organism, the elevated error rate is primarily caused by excessive C→U errors (n as in A). C Old flies display higher error rates than young flies (n = 4, P = 0.0009). D The increased error rate of old flies is primarily fueled by C→U errors (n = 4, P = 0.0162). E Age-related errors in flies can affect any aspect of protein structure and function, including actin molecules. Each circle indicates a base pair where errors were found. The number in the circle indicates the number of errors found at that base. Errors can result in non-synonymous amino acid changes (blue), synonymous changes (yellow) or premature stop codons (red). In all cases, n is defined as biologically independent replicates. All experiments were analyzed by unpaired, two-tailed Welch’s t-tests using Prism software. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars depict standard error of the mean. Source data are provided as a Source Data file.

Next, we sought to determine whether aging also affects the error rate of higher organisms. To do so, we compared the error rate of young flies (10 days) to aged flies (60 days) and found that old flies (9.66 × 10−6 ± 9.37 × 10−7/bp) indeed display higher error rates than young flies (5.69 × 10−6 ± 8.2 × 10−7/bp, Fig. 5C). Interestingly, this increased error rate was nearly entirely fueled by C→U errors (Fig. 5D), suggesting that the mechanism responsible for these errors is distinct from reduced proofreading or loss of fidelity factors. Instead, another molecular mechanism seems to be responsible. It is possible that one of these mechanisms is DNA damage, which is a hallmark of all aging organisms60. For example, lesions that specifically affect guanine bases, including the O6-methyl guanine lesions described above, could contribute to this age-related increase in C→U errors. Regardless of the mechanism, it is known that errors in mRNA transcripts cause protein misfolding38. Because misfolded proteins are part of the etiology of various age-related neurodegenerative diseases, including Alzheimer’s and Parkinson’s disease, an age-related increase in transcription errors could contribute to the progression of these diseases. Notably, a recent study indicated that silencing of MGMT, the human DNA repair protein that repairs O6-methyl guanine lesions, is a risk factor for Alzheimer’s disease in women and is associated with a higher load of protein plaques and tangles61. Transcription errors could also affect aging organisms through other mechanisms. For example, cytoskeletal structure is lost with age62,63, and errors in actin transcripts or other cytoskeletal components could contribute to this process by generating proteins that prevent additional subunits from being added to a growing chain (Fig. 5E).

Human error rates, spectra and the genetic code

Our measurements on untreated, primary human fibroblasts indicate that the error rate of human cells is 5.0 × 10−6 ± 1.43 × 10−6 for RNAPI, 4.7 × 10−6 ± 9.9 × 10−8 for RNAPII, 1.73 × 10−5 ± 4.53 × 10−6 for RNAPIII and 8.0 × 10−6 ± 4.81 × 10−7 for mtRNAP (Fig. 6A). These measurements provide the first reasonable estimate of the error rate of transcription in human cells, a useful benchmark for future experiments that examine the consequences of environmental exposure, genetic perturbations and human diseases on transcriptional mutagenesis. When we examined the error spectrum of RNAPII in human cells and compared it to that of yeast, worms, flies and murine cells, we further noticed that in almost every case the error rate of RNAPII is primarily fueled by C→U and G→A errors (Fig. 6B). The similarities between these error spectra suggest that there might be an explanation for this phenomenon that transcends species.
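An error spectrum of the kind compared here can be tabulated from a list of called errors. The sketch below assumes that each substitution type is normalized by the number of sequenced bases of the corresponding reference nucleotide. That normalization is an assumption for illustration and may not match the one used for Fig. 6B, and all counts are hypothetical.

```python
from collections import Counter

def error_spectrum(errors, bases_sequenced):
    """Rate of each substitution type, per sequenced base of the reference nucleotide.

    errors: iterable of (reference_base, observed_base) pairs.
    bases_sequenced: dict mapping each reference base to the number of times it was sequenced.
    """
    counts = Counter(errors)
    return {(ref, alt): n / bases_sequenced[ref] for (ref, alt), n in counts.items()}

# Hypothetical calls: 6 C->U, 5 G->A and 1 A->G error.
calls = [("C", "U")] * 6 + [("G", "A")] * 5 + [("A", "G")]
coverage = {"A": 1_000_000, "C": 1_000_000, "G": 1_000_000, "U": 1_000_000}

spectrum = error_spectrum(calls, coverage)
for (ref, alt), rate in sorted(spectrum.items(), key=lambda item: -item[1]):
    print(f"{ref}->{alt}: {rate:.1e}")
```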

Fig. 6: The error rate of transcription in various cell types and the genetic code.

A The error rate of transcription in human cells (n = 7). B The error spectrum of transcription in organisms of increasing complexity (n = 7 for yeast, n = 4 for worms and flies, n = 3 for mice and n = 7 for human cells). C The genetic code is constructed in such a way that C→U and G→A errors do not result in mutated proteins if they occur in the “wobble” position. The more likely an error is to result in a non-synonymous change (as expressed by the ratio of synonymous to non-synonymous changes that can potentially result from that error), the lower its error rate is in yeast (D) and humans (E). Conversely, errors that are more likely to result in synonymous changes display higher error rates. In all cases, n is defined as biologically independent replicates. All experiments were analyzed by unpaired, two-tailed Welch’s t-tests using Prism software. *P < 0.05, **P < 0.01, ***P < 0.001. Error bars depict standard error of the mean. Source data are provided as a Source Data file.

One universal feature that is shared between species is the organization of the codon table. Intriguingly, the codon table is organized in such a way that amino acids encoded by two codons invariably have a C and a U, or a G and an A in the wobble position (Fig. 6C). As a result, the two most common transcription errors (C→U and G→A) do not result in a mutated protein if they affect the wobble base, suggesting that there is a beneficial relationship between the error rate, the error spectrum and the genetic code. To examine this relationship, we plotted the error rate of transcription against the likelihood that a transcription error will change the sequence of a protein. This analysis, which considers all three bases of each possible codon, demonstrates that transcription errors that are more likely to change a protein sequence or generate a stop codon (as expressed by a lower synonymous to non-synonymous ratio) occur less frequently than those that have a smaller chance of mutating an amino acid (Fig. 6D, E, Fig. S10). Thus, the error rate and spectrum of transcription are apparently biased to reduce the impact of errors on the proteome.
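The synonymous to non-synonymous comparison can be reproduced in outline from the standard genetic code alone. The sketch below is a simplified version of such an analysis: it weights all 64 codons equally (no codon usage weighting) and counts a stop-to-stop change as synonymous. Both choices are assumptions made here and are not necessarily those behind Fig. 6D, E.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def count_outcomes(src, dst, positions=(0, 1, 2)):
    """Count (synonymous, non-synonymous) outcomes of src->dst errors over all codons."""
    synonymous = non_synonymous = 0
    for codon, amino_acid in CODON_TABLE.items():
        for pos in positions:
            if codon[pos] != src:
                continue
            mutated = codon[:pos] + dst + codon[pos + 1:]
            if CODON_TABLE[mutated] == amino_acid:
                synonymous += 1
            else:
                non_synonymous += 1
    return synonymous, non_synonymous

# All 12 possible substitutions, considering all three codon positions.
for src, dst in product(BASES, repeat=2):
    if src != dst:
        syn, nonsyn = count_outcomes(src, dst)
        print(f"{src}->{dst}: {syn} synonymous, {nonsyn} non-synonymous")

# Restricting the count to the wobble (third) position.
print("C->U at wobble:", count_outcomes("C", "U", positions=(2,)))
print("G->A at wobble:", count_outcomes("G", "A", positions=(2,)))
```

In this unweighted count, every C→U error at the wobble position is synonymous, and the only G→A wobble errors that change the protein are AUG→AUA (Met to Ile) and UGG→UGA (Trp to stop), both in codons whose amino acid is encoded by a single codon.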
