Towards the biogeography of prokaryotic genes
