Clonal dynamics underlying the skewed CD4/CD8 ratio of mouse thymocytes revealed by TCR-independent barcoding

Implementation of a high-resolution barcoding system

To determine the clonal relationships of developing thymocytes, we used an endogenous barcoding system (Fig. 1a) that consists of three components19: (i) a construct for the ubiquitous expression of the Hprt gene-specific sgRNA (hU6:sgRNAHprt)19; (ii) a conditional Cas9 expression construct (Rosa26:LSL-Cas9-YFP) inserted into the ubiquitously transcribed Rosa26 locus20; and (iii) a pLck:Cre expression construct21. Once the proximal Lck promoter (pLck) becomes active in DN2/DN3 thymocytes, Cre recombinase is produced and removes a stop cassette in the Rosa26 locus to initiate Cas9 gene expression; the Cas9 protein forms a specific RNP complex with the ubiquitously expressed sgRNA that targets exon 3 of the Hprt gene to generate double-strand breaks (DSBs); error-prone repair of the DSBs leads to the formation of “scar sequences”, which serve as unique barcode identifiers for all subsequent progeny of a particular DN2/DN3 thymocyte. Owing to the inevitable delay in Cas9 activation, it is formally possible that barcode generation continues into the DN4 stage, or perhaps even into the early DP stages; note, however, that positive selection only begins in the DP stage, suggesting that most barcodes are fixed before selection begins. In male hU6:sgRNAHprt; Rosa26:LSL-Cas9-YFP; pLck:Cre triple-transgenic mice, each cell possesses a single barcode, since the Hprt gene is located on the X chromosome; it can be read out at the DNA level after amplification of exon 3 sequences.

Fig. 1: Characterization of the barcoding system.

a Schematic illustrating the successive steps in barcode generation. b Schematic indicating the two stages of barcoding at the LSK bone marrow progenitor, and the DN2/3 immature thymocyte stages; the common analysis time point at the mature thymocyte stage is also indicated. c Size distributions of barcode sequences in mice of the indicated genotypes. The values for mean±standard deviation are, from left to right, 71.037 ± 2.337 nucleotides; 72.038 ± 3.782 nucleotides; 71.762 ± 3.602 nucleotides, respectively. P values for two-way comparisons are indicated (t-test). d Examples of barcode sequences with deletions (indicated by dashes) and insertions (highlighted by blue shading). The first nucleotide in the wild-type sequence corresponds to nucleotide 303, the last nucleotide corresponds to nucleotide 354 in Genbank accession number NM_013556.2. e Number of barcodes recovered from individual thymocyte populations in mice of the indicated genotypes.

Several features of the barcoding system are notable. The overall size distributions of barcodes appear to depend on the target cell (as illustrated here for LSK lymphoid progenitors in the bone marrow, and DN2/3 thymocytes, respectively [Fig. 1b]), possibly reflecting cell type-specific differences in DSB repair processes; and on the outcome of intra-thymic selection processes (Fig. 1c). Barcodes often contain deletions, but also insertions of non-templated sequences (Fig. 1d). The overall sequence complexity of barcode sequences depends on the target cell population. For instance, the Flt3:CreERT2 transgene is active in LSK bone marrow progenitors22, whereas pLck:Cre is active in DN2/DN3 thymocytes21. Therefore, when barcodes are read out in mature thymocyte subsets (CD4 single-positive [SP4] and CD8 single-positive [SP8] cells) in mice carrying the Flt3:CreERT2 transgene, only a few hundred different barcodes are detectable, whereas tens of thousands are found with the pLck:Cre transgene (Fig. 1e); this outcome mirrors the large difference in the numbers of cells in the target populations (~2 × 10² for LSK; ~2 × 10⁵ for DN2/DN3 [Ref. 22]). Moreover, we found that the efficiency of barcode generation varies among individuals of the same genotype (Fig. 1e); however, despite the variations in absolute numbers of barcodes in an individual, their relative distribution among different lymphocyte lineages remains generally constant, attesting to stable trajectories of differentiation from progenitor cells (Fig. 1e). Lastly, as seen with other barcoding schemes23, we found that the generative probabilities of particular barcodes vary. Whereas some barcodes are generated at high frequency, the majority of barcodes are rare; this pattern is seen irrespective of the genotype of mice (Fig. 2a, b; see below for detailed description of mouse strains). This non-uniform generative probability tends to reduce the resolution of clonal analysis, since cells eventually carrying the same barcode may have originated from distinct scarring events. Note that, since DN3 cells proliferate considerably24 after they have become indelibly marked, many mature thymocytes carry the same barcode. This is an important feature underlying the present study, because it makes the analysis of barcodes largely insensitive to the precise number of cells used for the analysis; this is apparent from the flat regression lines demonstrating that the numbers of detected barcodes do not correlate with the numbers of cells used for the analysis (Fig. 2c).
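The loss of resolution caused by frequently generated barcodes can be illustrated with a small calculation. The sketch below is not part of the original analysis; it assumes that every target cell is scarred independently and that a given barcode arises with a fixed probability p per cell. The three values of p are hypothetical and chosen only for illustration; the number of target cells (~2 × 10⁵ DN2/DN3 thymocytes) is taken from the text.

```python
import math


def prob_independent_recurrence(p, n_cells):
    """Probability that a barcode with per-cell generation probability p
    arises independently in at least two of n_cells scarred cells."""
    # (1 - p)^n, computed via log1p for numerical stability at small p
    p_zero = math.exp(n_cells * math.log1p(-p))
    # n * p * (1 - p)^(n - 1): exactly one cell acquires this barcode
    p_one = n_cells * p * math.exp((n_cells - 1) * math.log1p(-p))
    return 1.0 - p_zero - p_one


N_TARGET_CELLS = 200_000  # ~2 x 10^5 DN2/DN3 cells, as stated in the text

# Hypothetical generation probabilities (assumptions, not measured values)
for p in (1e-3, 1e-5, 1e-7):
    prob = prob_independent_recurrence(p, N_TARGET_CELLS)
    print(f"p = {p:g}: P(>= 2 independent scarring events) = {prob:.4f}")
```

Under these assumptions, a frequently generated barcode almost certainly marks several unrelated cells, whereas a rare barcode most likely identifies a single clone, which is why rare barcodes carry most of the clonal information.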

Fig. 2: Frequency of barcodes.

a, b Frequency distributions of barcodes recovered from individual thymocyte populations in mice of the indicated genotypes. c Number of barcodes recovered from individual thymocyte populations in mice of the indicated genotypes as a function of cell input.

Barcode diversity in mature thymocytes

Our first experiment addressed the relationship between cell numbers and barcodes in SP4 and SP8 thymocytes, respectively. SP4 and SP8 cell numbers both decline with age (Supplementary Fig. 1a), commensurate with the age-related reduction of thymopoietic activity of the thymus25. The number of barcodes represented in the SP4 population is greater than that in the SP8 population (Supplementary Fig. 1b). Clone sizes (numbers of cells per barcode) in SP4 and SP8 populations tend to decline over time (Supplementary Fig. 1c), indicative of reduced proliferative capacity of thymocytes and/or impaired efficiency of positive selection.

Given the inter-individual differences in barcode numbers, we calculated the SP4/SP8 ratios per mouse in order to comparatively evaluate these general trends. Over the course of the first 6 months of life, the average ratio of SP4 and SP8 cell numbers increases (Fig. 3a); when averaged over this time window, it is equivalent to 3.78 ± 1.13 (mean ± S.E.M.) (Fig. 3b), reflecting the known higher numbers of SP4 lineage cells. By contrast, the SP4/SP8 barcode ratio is stable over time (Fig. 3a), and amounts to 1.79 ± 0.54 (mean ± S.E.M.) (Fig. 3b). This result suggests that only just over half as many barcodes are found in the SP8 lineage as in the SP4 lineage during the differentiation process from DN2/DN3 to mature SP stages. The discrepancy between cell and barcode numbers translates into a SP4/SP8 clone size ratio of 2.18 ± 0.5 (mean ± S.E.M.) (Fig. 3b). These data identify clone number and clone size as approximately equal contributors to the skewed SP4/SP8 ratio in thymocyte populations.
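The statement that clone number and clone size contribute about equally can be checked with simple arithmetic. Because the number of cells in a population equals the number of barcodes multiplied by the mean clone size, the SP4/SP8 cell ratio factors into a barcode ratio and a clone-size ratio. The sketch below uses only the mean ratios reported in the text; splitting the effect on a logarithmic scale is our illustrative choice, not a method described in the paper.

```python
import math

# Mean SP4/SP8 ratios reported in the text (Fig. 3b)
CELL_RATIO = 3.78
BARCODE_RATIO = 1.79
CLONE_SIZE_RATIO = 2.18


def log_shares(barcode_ratio, clone_size_ratio):
    """Split the log of the implied cell ratio into the shares contributed
    by clone number (barcodes) and by clone size."""
    log_b = math.log(barcode_ratio)
    log_c = math.log(clone_size_ratio)
    total = log_b + log_c
    return log_b / total, log_c / total


# cells = barcodes x clone size, so the ratios multiply
implied_cell_ratio = BARCODE_RATIO * CLONE_SIZE_RATIO
share_barcodes, share_clone_size = log_shares(BARCODE_RATIO, CLONE_SIZE_RATIO)

print(f"implied cell ratio: {implied_cell_ratio:.2f} (reported: {CELL_RATIO})")
print(f"share from clone number: {share_barcodes:.0%}")
print(f"share from clone size:   {share_clone_size:.0%}")
```

The product of the two mean ratios (about 3.9) is close to, but not identical with, the reported mean cell ratio of 3.78; this is expected, because means of per-mouse ratios need not multiply exactly. On the logarithmic scale the two factors contribute roughly 43% and 57%.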

Fig. 3: Structure of thymocyte populations.

a Age-dependent trends of cell and barcode numbers and clone sizes expressed as SP4/SP8 ratio to normalize for inter-individual differences in barcoding efficiency as a function of age. Solid line, linear regression; dashed lines, 95% confidence interval. b Aggregated values of parameters shown in a. Box plots show quantiles 25 and 75 and mean (line) as well as the total range (whiskers); t-test, two-tailed (n = 14; P value is indicated). c Barcodes recovered from thymocyte populations barcoded at the DN2/DN3 stage of intrathymic differentiation (pLck:Cre; n = 12) and at the LSK stage of lymphoid differentiation in the bone marrow (Flt3:CreERT2; n = 4). The difference between the populations is significant (t-test, two-tailed; P value is indicated). Box plots show quantiles 25 and 75 and mean (line) as well as the total range (whiskers). d Rank-rank correlation of shared barcodes in SP4 and SP8 thymocytes. The dotted line indicates a 1:1 correlation.

Lineage bias of Tcrb clonotypes

Next, we explored the basis for the skewed SP4/SP8 barcode ratio. In our scheme, barcodes are induced at the DN2/DN3 stage, and therefore become associated with particular Tcrb clonotypes, which are independently generated at the same stage of differentiation. Hence, a barcode ratio of ~1.8 indicates that, on average, any Tcr β clonotype is about twice as likely to contribute to an MHCII-compatible TCR αβ heterodimer expressed by SP4 cells as to an MHCI-compatible TCR αβ heterodimer expressed by SP8 cells. This result implies the presence of a strong bias of particular Tcrb clonotypes for either CD4 or CD8 lineages, confirming recent findings17,18. As a control, we examined the SP4/SP8 barcode ratio in thymocytes arising from progenitors indelibly marked in the bone marrow at haematopoietic precursor stages. By design, Flt3:CreERT2-induced barcodes are generated in a comparatively small number of precursor cells and are not associated with particular Tcrb clonotypes, which are generated much later during intra-thymic differentiation. Rather, because of the high proliferation rates of precursors before they reach the DN2/DN3 stage (on the order of at least 10 cell divisions26), many different Tcrb clonotypes eventually share the same barcode. The random distribution of Tcrb clonotypes among the comparatively small number of barcodes represented in SP4 and SP8 cells precludes the detection of selection bias, as indicated by a SP4/SP8 barcode ratio of 1.04 ± 0.12 (mean ± S.E.M.) (Fig. 3c).

Returning to the structure of pLck:Cre-induced barcode repertoires in SP4 and SP8 thymocytes, we identified and ranked shared barcodes in SP4 and SP8 populations according to their frequencies. If each barcode-associated Tcr β clonotype has an equal chance of appearing in the resultant SP4 and SP8 populations, the rank-rank correlation plot of barcodes should have an inclination of approximately 1. If, however, particular Tcr β clonotypes are incompatible with selection into either CD4 or CD8 lineages, a certain fraction of barcodes would disappear from the final SP4 and SP8 populations, respectively. In SP4/SP8 comparisons, a greater net loss of barcodes in the SP4 population would result in an inclination >1, a greater loss in SP8 cells would result in an inclination <1; the final inclination would reflect the relative contributions of these two opposing tendencies. Since only a small fraction of barcodes appears with high frequencies, the inclinations of rank-rank correlations are shaped by barcodes that are in the mid- to low-frequency range of generation probabilities; thus, our analysis strategy minimizes the impact of frequent barcodes, which are likely associated with many Tcrb clonotypes and hence poorly record Tcrb-related selection events. In wild-type mice, the inclinations for SP4/SP8 rank correlations of thymocytes in individual mice are significantly smaller than 1 (0.70 ± 0.04; mean ± S.E.M.; P < 0.0001, one-sample t-test; n = 14; hypothetical population mean = 1) (Fig. 3d). This indicates that during differentiation, the CD8 lineage suffers a greater loss of clonotypes from the original pool of DN2/DN3 clonotypes than the CD4 lineage. In sum, analysis of TCR repertoires at the clonal level underscores the differential compatibility of Tcr β clonotypes for selection by pMHCI and pMHCII complexes.
It is instructive to compare our results with data obtained with Tcrb transgenic mice; whenever a significant lineage bias was observed, it correlated with the origin of the Tcrb clonotype. In transgenic mice expressing Tcrb chains of CD4+ cells27,28,29, a preference for the CD4 lineage was observed; conversely, in mice expressing Tcrb clonotypes derived from CD8+ cells, the lineage bias tended to favour the development of CD8+ cells30,31,32. To the best of our knowledge, the effect of expressing lineage-specific Tcrb clonotypes on iNKT development has not been tested. However, since iNKT-specific TCR αβ heterodimers are composed of many different Tcrb clonotypes, we hypothesize that such Tcrb clonotypes could also be found expressed in SP4 and SP8 cells.
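The rank-rank analysis of shared barcodes described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: we assume that barcodes are ranked by frequency within each population's full repertoire, that SP8 ranks are regressed on SP4 ranks by ordinary least squares (the orientation that reproduces an inclination < 1 when the SP8 population loses more barcodes), and that ties are broken alphabetically. The barcode names and counts are hypothetical toy data; the values 0.70, 0.04 and 1 passed to the t statistic are the summary values reported in the text.

```python
def frequency_ranks(counts):
    """Rank barcodes within one population; rank 1 = most frequent.
    Ties are broken by barcode name (an arbitrary choice for this sketch)."""
    ordered = sorted(counts, key=lambda b: (-counts[b], b))
    return {barcode: rank for rank, barcode in enumerate(ordered, start=1)}


def rank_rank_slope(counts_x, counts_y):
    """Least-squares slope of y-ranks on x-ranks for barcodes present in both."""
    ranks_x = frequency_ranks(counts_x)
    ranks_y = frequency_ranks(counts_y)
    shared = sorted(set(ranks_x) & set(ranks_y))
    xs = [ranks_x[b] for b in shared]
    ys = [ranks_y[b] for b in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def one_sample_t(sample_mean, sem, mu0):
    """t statistic for testing a sample mean against a hypothetical mean mu0."""
    return (sample_mean - mu0) / sem


# Hypothetical toy repertoires: every second barcode is missing from SP8
sp4 = {"b1": 60, "b2": 50, "b3": 40, "b4": 30, "b5": 20, "b6": 10}
sp8 = {"b1": 30, "b3": 20, "b5": 10}

slope = rank_rank_slope(sp4, sp8)    # below 1: greater barcode loss in SP8
t_stat = one_sample_t(0.70, 0.04, 1.0)  # reported mean and S.E.M., n = 14

print(f"toy SP4/SP8 rank-rank slope: {slope:.2f}")
print(f"t statistic for reported inclinations: {t_stat:.1f} (13 degrees of freedom)")
```

In the toy data, the barcodes missing from SP8 compress the SP8 ranks relative to the SP4 ranks, so the slope falls below 1; two populations with identical repertoires give a slope of exactly 1.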

Population structure in pLck:Trav11-Traj18-Trac transgenic mice

The high-resolution barcoding system described above afforded us an unprecedented opportunity to examine the changes in population structure in TCR transgenic mice. To illustrate the value of this approach, we chose to explore thymocyte differentiation in mice precociously expressing a rearranged Tcra gene. In the present case, we employed a pLck:Trav11-Traj18-Trac transgene that encodes the canonical TCR α chain of iNKT cells, Vα14 Jα18 Cα33,34,35,36,37. In this constellation, the iNKT-specific TCR α chain is precociously expressed on DN2/3 thymocytes, concurrently with the endogenous TCR β chains, but prior to endogenous TCR α chains. Thus, the initial population of TCR αβ heterodimers is formed with transgenic TCR α chains and a diverse array of endogenous TCR β chains. We presume that this saturates the iNKT differentiation pathway, but does not exclude the possibility of subsequently replacing the transgenic TCR α chain by endogenous TCR α chains in a kind of receptor editing process for those TCR αβ pairs that do not bind to iNKT-type ligands and thus cannot enter the iNKT differentiation pathway. However, it is reasonable to assume that the TCR α chain replacement is not quantitative, and we hypothesize that many of the remaining TCR αβ heterodimers are unable to bind to pMHCII complexes, contributing to the skewed representation of Tcrb clonotypes in the SP4 and SP8 cells. In this regard, we note that the CD1 molecule, which presents the iNKT-related ligands, is a non-classical MHCI molecule38; the invariant α chain may therefore be more prone to contribute to binding to classical pMHCI complexes than to pMHCII complexes.

In pLck:Trav11-Traj18-Trac transgenic mice, the fractions of CD4/CD8-double negative (DN), CD4/CD8-double positive (DP), and SP4 and SP8 thymocytes are significantly altered (Fig. 4a, b), as noted in previous experiments using a similar construct33. Note that the extent of lineage bias depends on the type of promoter used to express the transgene; when the expression of Trav11-Traj18-Trac occurs at a later stage of thymocyte differentiation, the lineage bias is minimal39. In our transgenic mice, whereas the number of SP8 cells remains unchanged, the numbers of DP and SP4 cells are reduced (Fig. 4b). As a result, the SP4/SP8 ratio significantly drops to ~1 (Fig. 4c); by contrast, staining of thymocytes with the αGalCer–CD1d tetramer indicates a ~10-fold increase of iNKT cells (Fig. 4a, b), as expected33.

Fig. 4: Altered thymocyte differentiation in pLck:Trav11-Traj18-Trac transgenic mice.

a Representative flow cytometric profiles of non-transgenic wild-type and transgenic thymocytes of 4-week-old mice. The total numbers of thymocytes are indicated at the top of the upper panels; the proportions of individual thymocyte populations are indicated in the respective quadrants. In transgenic mice, the absolute number of iNKT cells (CD3+αGalCer-CD1 tetramer+; boxed with indicated proportions) is increased; absolute numbers are given next to the quadrant marking iNKT cells. b Numbers of thymocyte populations in non-transgenic wild-type (black; n = 4) and transgenic (red; n = 5) mice (mean ± s.e.m.; t-test, two-tailed; P values are indicated). c Reduced SP4/SP8 ratios in transgenic mice (red). The difference between the populations is significant (t-test, two-tailed; n = 5 for both genotypes; P value is indicated). Box plots show quantiles 25 and 75 and mean (line) as well as the total range (whiskers).

Distribution of barcodes in mature transgenic thymocytes

In order to explore the mechanistic underpinnings of the skewed SP4/SP8 ratio, we combined the pLck:Trav11-Traj18-Trac transgene with the tri-partite barcoding system. For simplicity, we henceforth refer to the barcoding configuration (pLck:Cre; hU6:sgRNAHprt; Rosa26:LSL-Cas9-YFP) as wild-type, and to mice additionally expressing the pLck:Trav11-Traj18-Trac construct (pLck:Cre; hU6:sgRNAHprt; Rosa26:LSL-Cas9-YFP; pLck:Trav11-Traj18-Trac) as transgenic.

The introduction of the three transgenes of the barcoding system did not alter the composition of the thymocyte populations further; SP4 cells are reduced in the quadruple transgenic mice, SP8 cells are unchanged, and the iNKT population increases (Fig. 5a). Although the absolute numbers of detectable barcodes differ from individual to individual, we noted a trend towards more restricted barcode repertoires in SP4 cells in the transgenic situation; in 6/7 wild-type mice, SP8 cells exhibit fewer barcodes than SP4 cells, whereas this ratio is reversed in transgenic mice (2/7 mice) (Fig. 5b), commensurate with the changes in absolute cell numbers. By contrast, although the number of iNKT cells increases sharply in the transgenic situation, the number of barcodes is (with one exception) always lower than the numbers of barcodes in SP4 and SP8 cells (Fig. 5b), indicating that iNKT cells induced in the transgenic situation have a higher clone size (Fig. 5c).

Fig. 5: Altered thymocyte population structure in pLck:Trav11-Traj18-Trac transgenic mice.

a Absolute cell numbers of thymocyte populations in mice of the indicated genotypes; lines connect data for individual mice to indicate stability of differentiation trajectories (left panel). The right panel is a summary presentation of data. b Absolute barcode numbers of thymocyte populations in mice of the indicated genotypes to indicate stability of differentiation trajectories irrespective of barcoding efficiency (left panel). The right panel is a summary presentation of data. c Clone sizes of thymocyte populations in mice of the indicated genotypes (left panel). The right panel is a summary presentation of data. In this experiment, mouse cohorts of approximately the same age (5–12 weeks) were included. pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP (n = 8); pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP;pLck:Trav11-Traj18-Trac (n = 7). In a–c, box plots show quantiles 25 and 75 and mean (line) as well as the total range (whiskers).

In order to account for inter-individual differences, we calculated the fractions of barcodes that are recovered in the three different cell populations of each mouse. In transgenic animals, a relative increase in barcodes is found for the SP8 population (P = 0.0052; t-test, two-tailed), which was accompanied by a trend towards smaller fractions of barcodes in SP4 cells (P = 0.0842; t-test, two-tailed) (Fig. 6a).
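The normalization used here can be sketched as follows. The source does not state the denominator, so this example assumes that each population's barcode count is divided by the sum of the counts across the three populations of the same mouse (a barcode shared between populations is therefore counted once per population). The mice and their counts are hypothetical.

```python
def barcode_fractions(barcode_counts):
    """Convert per-population barcode counts of one mouse into fractions
    of that mouse's summed barcode count (assumed denominator)."""
    total = sum(barcode_counts.values())
    return {population: n / total for population, n in barcode_counts.items()}


# Hypothetical mice that differ tenfold in barcoding efficiency
mouse_a = {"SP4": 9000, "SP8": 5000, "iNKT": 1000}
mouse_b = {"SP4": 900, "SP8": 500, "iNKT": 100}

for name, counts in (("mouse_a", mouse_a), ("mouse_b", mouse_b)):
    fractions = barcode_fractions(counts)
    summary = ", ".join(f"{pop}: {frac:.3f}" for pop, frac in fractions.items())
    print(f"{name}: {summary}")
```

Both toy mice yield the same fractions despite the difference in absolute barcode numbers, which is the property that makes per-mouse fractions comparable between genotypes.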

Fig. 6: Loss of CD4-lineage committed thymocytes in pLck:Trav11-Traj18-Trac transgenic mice.

a Fractions of barcodes in individual thymocyte populations of the indicated genotypes. Each data point represents one mouse (t-test, two-tailed; pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP (n = 8); pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP;pLck:Trav11-Traj18-Trac (n = 7); P values are indicated). b SP4/SP8 ratios for the indicated parameters and genotypes; t-test, two-tailed (pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP (n = 14); pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP;pLck:Trav11-Traj18-Trac (n = 7); P-values are indicated). c Rank-rank correlations of shared barcodes in SP4 and SP8 thymocytes for the indicated genotypes. Data for genotype pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP are taken from Fig. 3d. d Summary of rank-rank correlations for the indicated comparisons of thymocyte populations for the two genotypes; t-test, two-tailed (pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP (n = 14); pLck:Cre;hU6:sgRNAHprt;Rosa26:LSL-Cas9-YFP;pLck:Trav11-Traj18-Trac (n = 7); P values are indicated). In a, b, d, box plots show quantiles 25 and 75 and mean (line) as well as the total range (whiskers).

The substantial re-configuration of mature SP4 and SP8 thymocyte populations in the presence of a precociously expressed rearranged Tcra gene suggests that the canonical iNKT TCR α chain biases the resulting TCR αβ repertoire against selection by peptide/MHCII complexes. This conclusion is reflected in the substantially reduced SP4/SP8 ratios of barcode numbers in the transgenic animals (Fig. 6b).

Compared to the wild-type situation, rank-rank correlations of barcodes in transgenic mice (Fig. 6c) exhibit significantly increased inclinations for SP4-SP8 correlations (0.99 ± 0.07; mean ± S.E.M.; P < 0.001; t-test, two-tailed) (Fig. 6d). Given that the number of barcodes tends to be smaller in transgenic SP4 cells, but remains constant in SP8 cells (Fig. 5b), the most parsimonious explanation for this observation is that the transgenic Vα14Jα18Cα chain is an unfavourable partner for many TCR β clonotypes that are normally associated with SP4 cells. As a result, such clonotypes are lost from the pool of mature SP4 cells, substantially reducing the physiological bias favouring the CD4 lineage.

Collectively, these results suggest that the balanced SP4/SP8 ratio in the transgenic situation is due to a combination of diminished compatibility of SP4-biased clonotypes with the Vα14Jα18Cα chain, and reduced clone size in the SP4 lineage (Fig. 5c), perhaps as a result of poor pMHCII complex binding affinity.

With respect to the iNKT cells, we found that the fraction of barcodes associated with this cell type remains unchanged in the transgenic situation (Fig. 6a). As a result, the increase in the number of iNKT cells induced by the expression of the transgene is associated with a larger clone size (Fig. 5c), likely because the iNKT-specific TCR is formed ectopically in immature T cells with inherently higher proliferative capacity.
