Genomic skimming and nanopore sequencing uncover cryptic hybridization in one of world’s most threatened primates

Feasibility of genomic skimming on the ONT minION sequencer

The term ’genomic skimming’ was coined by Straub et al. (2012)20 to describe the use of shallow sequencing of gDNA to obtain relatively deeper coverage of high-copy portions of the genome, including mitogenomes. By combining genomic skimming with ONT long-read sequencing, we successfully reconstructed a complete marmoset mitogenome without prior PCR enrichment, using standard molecular biology equipment and a compact, portable sequencer that connects to a laptop computer. Preparation of genetic material for ONT sequencing in this study took less than a full day, and sequencing reads were available within 48 hours. Although the coverage of our reconstructed ONT mitogenome was low to medium (9x) and one of the largest sources of error in the ONT reads was missing reads in long homopolymer runs, the ONT data showed a high degree of concordance with gold-standard mtDNA Sanger sequencing reads for the same individual. Hence, this work, along with a number of previous studies (e.g.,5,21), highlights the great potential of ONT-based genomic skimming for enhancing mitogenomic and diversity studies of data-deficient and/or non-model organisms.
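As a rough illustration of what a coverage figure such as 9x means, mean depth can be approximated as the total number of read bases aligned to the mitogenome divided by the mitogenome length. The sketch below is illustrative only: the function name, the read lengths, and the 16,500 bp mitogenome length are assumptions chosen for the example, not values reported in this study.

```python
def mean_depth(aligned_read_lengths, genome_length):
    """Approximate mean depth of coverage: total aligned bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(aligned_read_lengths) / genome_length


# Hypothetical input: 30 mitochondrial reads of 5,000 bp each,
# aligned to an assumed 16,500 bp mitogenome.
reads = [5000] * 30
depth = mean_depth(reads, 16500)
print(f"Approximate mean depth: {depth:.1f}x")
```

In practice, depth is usually computed per position from an alignment file, since long reads rarely cover a circular mitogenome evenly; the average above is only a first-pass summary.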

A major challenge in ONT sequencing is the relatively high sequencing error rate (5%-15%), but the application of computational ’polishing’ significantly reduces errors in raw ONT data (e.g.22). Another challenge with ONT methodologies is the large amount of input DNA needed for sequencing relative to other methods, particularly PCR and Sanger sequencing. Multiplexing samples onto the same flow cell is one way to reduce the amount of DNA required per sample, and ONT chemistry currently allows up to 24 individual gDNA samples to be multiplexed per flow cell. Another option to improve mitogenome coverage from genome skimming shotgun data, especially for sensitive applications, is to use sample preparation approaches that specifically enrich for mtDNA (e.g.,

It is important to point out that our approach represents a starting point from which methodological aspects could be adjusted to further improve and modify our protocol. An important consideration for long-read sequencing is access to high-quality DNA which is not degraded. For marmosets especially, another consideration for input DNA is whether chimerism could bias genomic analysis or not, as levels of chimerism vary between marmoset biological tissues. Marmosets usually give birth to twins that are natural hematopoietic chimeras due to cellular exchange from placental vascular anastomoses during early fetal development23,24,25. This chimerism may result in the presence of up to 4 alleles of a single-copy genomic locus within a single individual. In marmosets, skin shows some of the lowest amounts of chimerism while blood is highly chimeric24,25,26. Depending on project design, high levels of chimerism can bias base calling of nuclear genome derived sequence reads, but this is less of a concern for mitogenomic studies as mtDNA is haploid and transmitted maternally.

In this work, we obtained DNA from an ear skin biopsy, which represents a minimally invasive source of genetic material. Buccal swabs, which also sample epidermal tissue, are a relatively less invasive source of low-chimerism epidermal DNA. Recently, urine has also been shown to be a non-invasive source of high-quality DNA27, but the amount of chimerism is currently not known for marmoset urine. Urine represents a potentially non-invasive source of genetic material that could be combined with genomic skimming of highly endangered non-model organisms, particularly in captive settings.

Callithrix aurita and anthropogenic marmoset hybridization

Our original aim in this work was to reconstruct the mitogenome of the endangered buffy-tufted-ear marmoset with a PCR-free ’genomic skimming’ approach with minimal technical requirements. We successfully reconstructed the full mitogenome from a captive individual possessing a C. aurita phenotype, but the mitogenomic lineage showed unexpected discordance with this phenotype. While we expected the mitogenome of the sampled individual to be that of C. aurita, the sampled individual possessed a C. penicillata mitogenomic lineage. Our results also represent the first known instance of one-way genetic introgression from C. penicillata into C. aurita, and indicate that our sampled marmoset was actually a cryptic C. aurita x C. penicillata hybrid.

Although a number of scenarios could explain the phenotypic-genotypic discordance we uncovered in individual BJT022, this case is likely the result of relatively recent anthropogenic hybridization between a C. penicillata female and a C. aurita male. Callithrix species are naturally allo- and parapatric, and natural hybridization occurs between marmoset species under secondary contact8. Past natural genetic introgression between C. aurita and C. penicillata would most likely have occurred in the natural contact zone between these species that exists in the transitional areas between the Cerrado and Atlantic Forest biomes of southeastern Brazil. Because C. penicillata mitogenomic clades tend to be well defined by their biogeographic origin7, under past, natural introgression of C. penicillata into C. aurita we would expect the BJT022 haplotype to have grouped with the C. penicillata Atlantic Forest/Cerrado clade. However, that is not the case, as the BJT022 haplotype grouped instead within the C. penicillata Caatinga clade. There is a relatively large geographic separation between the Caatinga biome of northeastern Brazil and the portion of the southeastern Brazilian Atlantic Forest that houses the natural range of C. aurita. This wide geographic gap greatly reduces the possibility of past natural interbreeding between Caatinga populations of C. penicillata and any C. aurita population.

We could also consider incomplete lineage sorting as an explanation for the phylogenetic position of the BJT022 mitogenomic haplotype, that is, a C. aurita mitogenome that sorted within a C. penicillata phylogenetic clade instead of a C. aurita clade. Overall, however, we see strong consistency in the grouping of mitogenomic haplotypes within their expected Callithrix phylogenetic clades. Further, C. aurita and the jacchus marmoset subgroup (C. geoffroyi, C. kuhlii, C. jacchus, C. penicillata) diverged about 3.54 million years ago7, leaving relatively more time for mitochondrial lineage sorting between C. aurita and the jacchus group than among jacchus group species. While incomplete lineage sorting has indeed been used to explain C. penicillata and C. kuhlii polyphyly7, we still see clear grouping patterns of C. kuhlii and C. penicillata mitogenomic clades according to their species of origin. Therefore, the strong tendency for Callithrix mitogenomic lineages to group within their expected clades reduces the likelihood of incomplete lineage sorting of mitogenomic lineages between C. aurita and the jacchus group.

The similarity of the case of BJT022 to other likely instances of anthropogenic Callithrix hybridization provides further support for BJT022 representing anthropogenic interbreeding between C. aurita and C. penicillata. Callithrix penicillata and C. jacchus have been introduced into the native range of C. aurita in southeastern Brazil largely as a result of the illegal pet trade and subsequent releases of exotic marmosets into forest fragments7,8. Malukiewicz et al. (2021)7 recently found evidence of genetic introgression from exotic C. jacchus into C. aurita within the metropolitan area of the city of São Paulo. A cryptic C. aurita hybrid sampled by Malukiewicz et al.7 originates from the municipality of Mogi das Cruzes, which lies in the eastern portion of metropolitan São Paulo8,28. According to zoological records, BJT022 originated from the municipality of São José dos Campos, which also lies in the eastern portion of metropolitan São Paulo. These cryptic hybrids also likely represent an advanced stage of anthropogenic hybridization between native C. aurita and exotic jacchus group species. First-generation and early-generation aurita and jacchus group marmoset hybrids are known to possess a distinct “koala bear” appearance10,11,29. As this is not the phenotype seen for BJT022 and the cryptic C. aurita hybrids from Malukiewicz et al.7, this observation suggests that these cases of anthropogenic hybridization arose through backcrossing of an earlier non-cryptic C. aurita x Callithrix sp. hybrid with C. aurita. Eventually these backcrosses led to the genomic capture of introgressed jacchus group mitogenome lineages by the C. aurita populations of the eastern portion of the São Paulo metropolitan area.

The above results are alarming since they suggest that genetic introgression is underway from exotic, invasive marmosets into the endangered, native marmosets of southeastern Brazil. At this time, it is not possible to determine how broad this pattern is at the geographic, genomic and species levels, whether introgression is only unidirectional, or exactly which exotic and native species are involved. Specifically for C. aurita, unidirectional genetic introgression from invasive marmosets, as well as cryptic hybridization, is worrying due to the species’ threatened conservation status. A small number of captive facilities around southeastern Brazil are currently breeding captive C. aurita for eventual reintroductions into the wild8,12. Individuals within these captive populations should be confirmed both genetically and phenotypically as not being of hybrid origin, so as to avoid introducing exogenous genetic material into the captive population and subsequently into the wild. Additionally, further genetic information is needed for wild C. aurita populations, not only to characterize diversity within the species but also to better assess the occurrence of hybridization between exotic and native marmosets in southeastern Brazil. This information is critical for defining the genetic diversity of C. aurita and maintaining the species’ genetic integrity in the wild and in captivity.

Utility of mitogenomics for evolutionary and conservation studies of Callithrix aurita and other marmosets

The buffy-tufted-ear marmoset is not only critically endangered but also highly data-deficient in terms of genetic information. The limited number of genetic studies involving C. aurita have used the mtDNA control region13,15, COI10, and the mitogenome7 for phylogenetic study of Callithrix mtDNA lineages, species identification, and detection of hybridization. The phylogenies obtained by us and by Malukiewicz et al. (2021)7 do show some geographical separation between C. aurita mitogenome haplotypes originating from different portions of the species’ natural range. Our calculation of Callithrix mtDNA diversity indices based on data from Malukiewicz et al. (2021)7 shows that diversity in C. aurita is still comparable to that of other Callithrix species. However, a large sampling effort for C. aurita, in terms of both the number of individuals and coverage of the species’ range, is needed for accurate determination of current levels of standing genetic variation in the species. Additionally, the standing genetic variation of the captive C. aurita population should be surveyed. These data are crucial for understanding anthropogenic impacts on the species as well as for making appropriate decisions for species conservation.
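To make the notion of mtDNA diversity indices concrete, the following minimal sketch computes two commonly used measures, haplotype diversity and nucleotide diversity, from a set of aligned sequences. It is an assumption that these particular indices are representative of the calculations referred to above; the function names and the toy sequences are hypothetical and are not data from this study. The sketch also assumes equal-length, gap-free aligned sequences.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over all sequence pairs.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical, 8 bp): two identical haplotypes and two variants.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
print(f"Haplotype diversity: {haplotype_diversity(toy):.3f}")
print(f"Nucleotide diversity: {nucleotide_diversity(toy):.4f}")
```

With real mitogenome alignments, gaps and ambiguous bases would need explicit handling, and estimates from only a few individuals per species should be interpreted cautiously.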

Genomic skimming based on portable ONT long-read technology can be applied to address several of these knowledge gaps for C. aurita. First, with large-scale sampling of wild and captive C. aurita, genetic diversity estimates, demographic history, and other evolutionary parameters can be obtained relatively easily from mitogenomic data. Given the relatively fast turnaround time of the minION, sequencing data could be obtained quickly for a primate as highly endangered as C. aurita, without wait times of weeks or months. Laboratory setup of the minION also does not require any additional special equipment, which makes genomic work with highly endangered species such as C. aurita accessible to investigators with relatively constrained budgets.

Callithrix aurita’s sister species, Callithrix flaviceps, faces a plight similar to that of C. aurita, but with a population estimated at only about 2000 adult individuals30. There are also plans to breed C. flaviceps in captivity for eventual wild reintroduction, but there are currently, to our knowledge, no genetic data available for this species. Thus, the same sort of sampling and research efforts are needed for C. flaviceps as for C. aurita, perhaps even more urgently for the former species given its smaller population. As such, C. flaviceps is a good candidate for the adaptation of techniques such as genomic skimming and low-cost desktop sequencing to rapidly increase genomic resources for a non-model species for conservation and evolutionary studies.

In the case of marmosets, while mitogenomics shows great potential for use in evolutionary and conservation studies, we strongly urge against the sole use of mtDNA markers for identification of species and hybrids. As the results of this study and those of Malukiewicz et al. (2021)7 clearly show, cryptic hybrids can easily be mistaken for species, and had we depended only on mtDNA results we would have misidentified three cryptic Callithrix hybrids as C. jacchus and C. penicillata. Instances of cryptic hybrids have also been shown among natural C. jacchus x C. penicillata hybrids25. All of these instances underline the need to use several lines of evidence for taxonomic identification of marmoset individuals, particularly due to widespread anthropogenic hybridization among marmosets. We used a combination of phenotypic and mitochondrial data to classify the sampled individual BJT022 as a cryptic hybrid. As mitochondrial DNA is maternally transmitted, it is also not possible to genetically identify the paternal lineage of hybrids without further use of autosomal or Y-chromosome genetic markers. Whenever phenotypic data are available, these data should be used jointly with molecular data for identification or classification of a marmoset individual as belonging to a specific species or hybrid type. Indeed, the integrated use of phenotypic and molecular approaches will lead to a better understanding of the phenomena that involve hybridization processes31.


Brazilian legal instruments that protect C. aurita consider hybridization a major threat to the survival of this species8,12. In this report, we have uncovered the first known case of cryptic hybridization between C. aurita and C. penicillata, which may represent a larger trend of genetic introgression from exotic into native marmosets in southeastern Brazil. Our findings are based on the combination of two recent innovations in the field of genomics: genomic skimming and portable long-read sequencing on the ONT minION. Given that C. aurita is still very deficient in genetic data, our approach provides a substantial advance in making more genomic data available for one of the world’s most endangered primates. Genomic skimming based on ONT sequencing can be integrated easily with phenotypic and other genetic data to quickly make new information accessible on species biodiversity and hybridization. Such data can then be utilized within the Brazilian legal framework to protect endangered species like C. aurita. More specifically, rapid access to emerging biological information on such species leads to more informed decisions on updating or modifying legal actions for protecting endangered fauna. The ONT genomic skimming approach we present here can be further utilized and optimized to generate genomic information more rapidly, without the need for specialized technological infrastructure or a priori genomic information.
