Roles of adenine methylation in the physiology of Lacticaseibacillus paracasei

Bacterial strains and cultivation

Twenty-eight L. paracasei isolates (including L. paracasei Zhang) and a pglX gene-inactivated strain of L. paracasei Zhang were obtained from the Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, at the Inner Mongolia Agricultural University of China. For strain activation, the bacteria were cultivated in standard de Man, Rogosa and Sharpe (MRS) broth (CM0359; Oxoid, Ltd., Basingstoke, UK). For RNA-seq, proteomics, Hi-C, and metabolomics analyses, the bacteria were cultivated in a chemically defined medium (CDM; Supplementary Table S3), a minimal medium developed for investigating the growth and metabolism of L. paracasei [33]. The growth of L. paracasei Zhang and the pglX mutant in CDM was measured by changes in pH and optical density at 600 nm (OD600).

Genomics and methylomics analyses by Illumina and SMRT sequencing

Genomic DNA was isolated with the Wizard Genomic DNA Purification Kit (Promega, Madison, WI, USA). DNA integrity was examined by 0.6% agarose gel and 1.2% Lonza FlashGel electrophoresis. For SMRT sequencing, libraries with an insert size of 10 kb were constructed using the PacBio SMRTbell™ Template Kit. Library quality was evaluated on a Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA), and the insert fragment size was determined with an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc., Santa Clara, CA, USA). For Illumina sequencing, libraries were prepared using the NEBNext® Ultra™ DNA Library Prep Kit (New England Biolabs, Inc., Ipswich, MA, USA). The DNA samples were first fragmented by sonication to a size of around 350 bp. The DNA fragments were then end-polished, A-tailed, and ligated with the full-length adaptor, followed by PCR amplification. The PCR products were purified with the AMPure XP system, and the quality and size distribution of the libraries were evaluated with an Agilent 2100 Bioanalyzer. The SMRT and Illumina libraries were sequenced on a PacBio Sequel platform (Pacific Biosciences of California, Inc., Menlo Park, CA, USA) and an Illumina NovaSeq 6000 (Illumina, Inc., San Diego, CA, USA), respectively.

De novo assembly followed a standard hierarchical genome assembly process that used only PacBio sequencing data from a single, long-insert library; the consensus was called across reads after assembly polishing. For each sample, the data that passed quality control were assembled with SMRT Link v5.1.0 software, and the preliminary assembly gave a first indication of genome quality. Arrow software (Pacific Biosciences of California, Inc., Menlo Park, CA, USA) was then used to optimize the assembly and to correct regions with assembly errors by comparing the initial assembly against data generated on the Illumina platform [34,35]. Chromosomal and plasmid sequences were identified, and the chromosomal sequences were assembled into a circular genome. To identify base modifications and methyltransferase motifs, the modification and motif analysis protocols in SMRT Link software were used with an identification quality score of ≥20 [36]. The methylation sites reported by the protocol were mapped to the genomes. Methyltransferases were identified by BLASTP searches against REBASE with identity >50%, e value <1e–10, and bit score >50 [37].
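
The three REBASE thresholds can be applied as a simple filter over BLASTP results. The sketch below is illustrative only: it assumes the hits are available as BLAST tabular output (`-outfmt 6`, default 12 columns), and the function name and example rows are hypothetical rather than taken from the study.

```python
import csv
import io

# Thresholds stated in the text: identity >50%, e value <1e-10, bit score >50.
MIN_IDENTITY = 50.0
MAX_EVALUE = 1e-10
MIN_BITSCORE = 50.0


def filter_hits(tabular_text):
    """Return (query, subject) pairs that pass all three thresholds.

    Assumes BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore.
    """
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        pident, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if pident > MIN_IDENTITY and evalue < MAX_EVALUE and bitscore > MIN_BITSCORE:
            kept.append((row[0], row[1]))
    return kept


# Hypothetical hits for illustration only (not data from the study).
example = (
    "gene_001\tM.HypoI\t78.5\t300\t60\t2\t1\t300\t1\t298\t1e-150\t520\n"
    "gene_002\tM.HypoII\t45.0\t280\t150\t4\t1\t280\t5\t284\t1e-30\t120\n"
    "gene_003\tM.HypoIII\t62.0\t90\t30\t1\t1\t90\t1\t90\t1e-08\t48\n"
)
print(filter_hits(example))
```

Only the first hypothetical hit survives: the second fails the identity threshold, and the third fails both the e value and bit score thresholds.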

Gene prediction was performed in Prokka (version 1.13) with the kingdom argument set to Bacteria [38]. Coding sequences (CDSs) were functionally annotated against the Rapid Annotation using Subsystem Technology (RAST) 2.0 [39], KEGG [40], and COG [41] databases. The average nucleotide identity (ANI) was calculated with a standalone Java ANI calculator [42]. The skewness of CDS and COG distribution was evaluated with a Markov model that considered motif composition [36]. Motif-based sequence analysis was performed with the MEME suite (v5.0.5) [43]. First, upstream regions of 50–300 bp of L. paracasei genes were extracted using a Python script, intergenic_regions.py [44]. A Lactobacillaceae-specific catalog of transcription factor binding sites (TFBSs) was built from motif sequences with the sites2meme script of the MEME suite; it included 82 transcription factor regulons of 15 Lactobacillaceae strains. The FIMO tool of the MEME suite was then used to scan the upstream regions of L. paracasei genes for putative TFBSs at a q value (adjusted P value) threshold of 0.05 [45]. The motif sequence logo was constructed with WebLogo 3 [46].
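
The upstream-region extraction step can be sketched as follows. This is not the intergenic_regions.py script cited above. It is a minimal illustration under stated assumptions: a linear sequence, 0-based end-exclusive gene coordinates, regions clipped at the neighbouring gene and capped at 300 bp, and regions shorter than 50 bp discarded. All function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]


def upstream_regions(genome, genes, min_len=50, max_len=300):
    """Return {gene_id: upstream sequence}, read 5'->3' on the gene's strand.

    genes: list of (gene_id, start, end, strand) with 0-based, end-exclusive
    coordinates. Each region is clipped at the adjacent gene (by start order)
    and capped at max_len; regions shorter than min_len are dropped.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    regions = {}
    for i, (gene_id, start, end, strand) in enumerate(ordered):
        if strand == "+":
            # Upstream lies to the left of the gene start.
            left_limit = ordered[i - 1][2] if i > 0 else 0
            lo, hi = max(left_limit, start - max_len), start
        else:
            # Upstream lies to the right of the gene on the forward strand.
            right_limit = ordered[i + 1][1] if i + 1 < len(ordered) else len(genome)
            lo, hi = end, min(right_limit, end + max_len)
        if hi - lo >= min_len:
            seq = genome[lo:hi]
            regions[gene_id] = seq if strand == "+" else reverse_complement(seq)
    return regions


# Toy genome and gene table for illustration only.
genome = "ACGT" * 250
genes = [("g1", 400, 600, "+"), ("g2", 620, 800, "+"), ("g3", 850, 950, "-")]
regions = upstream_regions(genome, genes)
print({gene_id: len(seq) for gene_id, seq in regions.items()})
```

In the toy example, g2 is dropped because only 20 bp separate it from g1, which is below the 50 bp minimum.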

RNA-seq analysis

Triplicate parallel cultures of wild-type L. paracasei Zhang (reference condition) and the pglX mutant (test condition) were grown in CDM to late log phase, and the bacterial cells were harvested. Total RNA was extracted using TRIzol reagent (Invitrogen Corporation, Carlsbad, CA, USA) following the manufacturer’s instructions. The RNA library was constructed from 2 μg of total RNA using the TruSeq™ RNA Sample Preparation Kit (Illumina Inc., San Diego, CA, USA). Briefly, rRNA was removed from the total RNA with a Ribo-Zero Magnetic Kit (Epicenter Biotechnologies, Madison, WI, USA), and the mRNAs were randomly fragmented into lengths of about 200 nucleotides. Double-stranded cDNA was synthesized by reverse transcription using random hexamer primers (Illumina Inc., San Diego, CA, USA) and a SuperScript Double-stranded cDNA Synthesis Kit (Invitrogen Corporation, Carlsbad, CA, USA). Phusion DNA polymerase (New England Biolabs, Inc., Ipswich, MA, USA) was used for PCR amplification for a total of 15 cycles. After the library was quantified with the Turner BioSystems TBS-380 Mini-Fluorometer (in conjunction with Molecular Probes’ PicoGreen® dsDNA Quantitation Reagent), paired-end RNA-seq was performed on an Illumina HiSeq X Ten.

Clean reads were obtained by removing the adapter sequences, filtering low-quality sequences at the ends of the reads, and removing reads with an N ratio of 10%. The high-quality clean reads were mapped to the reference genome using Bowtie2 (bowtie-bio.sourceforge.net/bowtie2/index.shtml). In addition, 10,000 raw reads were randomly selected from each sample and compared against the Rfam database (rfam.xfam.org/) using BLAST, and the rRNA contamination rate of each sample was calculated from the annotation results. DESeq2 software (bioconductor.org/packages/release/bioc/html/DESeq2.html) was used to identify differentially expressed genes (DEGs) between samples, with a cut-off false discovery rate (FDR) of ≤0.05 and a fold-change threshold of 2.0.
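
The DEG cut-offs can be expressed as a small classification rule applied to a results table. The sketch below is hedged: it assumes the 2.0-fold threshold is applied symmetrically to up- and down-regulation (|log2 fold change| ≥ 1) and that boundary values are included. The function name is hypothetical and is not part of DESeq2.

```python
import math

FDR_CUTOFF = 0.05   # adjusted P value (FDR) cut-off from the text
FOLD_CUTOFF = 2.0   # fold-change threshold from the text


def classify_deg(log2_fold_change, padj):
    """Label one gene 'up', 'down', or 'ns' (test vs reference condition).

    Assumption: the fold-change threshold applies in both directions, and
    genes without a usable adjusted P value are treated as not significant.
    """
    if padj is None or math.isnan(padj) or padj > FDR_CUTOFF:
        return "ns"
    threshold = math.log2(FOLD_CUTOFF)
    if log2_fold_change >= threshold:
        return "up"
    if log2_fold_change <= -threshold:
        return "down"
    return "ns"


# Hypothetical values for illustration only.
for lfc, padj in [(1.5, 0.01), (-2.0, 0.04), (0.5, 0.001), (3.0, 0.2)]:
    print(lfc, padj, classify_deg(lfc, padj))
```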

Real-time quantitative PCR was performed to validate the RNA sequencing results. RNA from three biological replicates of the collected samples was extracted using the RNAprep Pure Cell/Bacteria Kit (Tiangen Biotech Co., Ltd., Beijing, China). Then 500 ng of RNA was reverse transcribed into cDNA with a reverse transcription kit (PrimeScript RT Reagent Kit with gDNA Eraser; Takara Biomedical Technology Co., Ltd., Beijing, China) according to the manufacturer’s instructions. Quantitative analysis was conducted on the qTOWER3G Touch Real-Time PCR System (Analytik Jena AG, Jena, Germany). The reaction was performed in a 20 μL system containing 1 µL of cDNA template, 10 µL of SYBR Premix Ex TaqII (Takara Biomedical Technology Co., Ltd., Beijing, China), 0.8 µL of each primer, and 7.4 µL of ddH2O. The PCR conditions were as follows: initial denaturation at 95 °C for 30 s, followed by 40 cycles of denaturation at 95 °C for 5 s and primer annealing and DNA extension at 60 °C for 30 s. The housekeeping gene encoding glyceraldehyde phosphate dehydrogenase was used as the reference gene. The comparative threshold cycle method (2^−ΔΔCT) was used to calculate relative gene expression levels [47]. The primers used are listed in Supplementary Data 5.
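
For readers unfamiliar with the comparative threshold cycle calculation, it can be sketched as below. The function and argument names are hypothetical, the CT values are invented for illustration, and the formula assumes roughly 100% amplification efficiency for both the target and the reference gene.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target gene by the 2^-ddCT method.

    'test' is the test condition (here, the pglX mutant) and 'ctrl' the
    reference condition (wild type); 'ref' is the reference gene.
    """
    delta_ct_test = ct_target_test - ct_ref_test   # normalize to reference gene
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    delta_delta_ct = delta_ct_test - delta_ct_ctrl  # compare the two conditions
    return 2.0 ** (-delta_delta_ct)


# Invented CT values: the target crosses threshold 2 cycles earlier in the
# test condition after normalization, i.e. a 4-fold higher expression.
print(relative_expression(22.0, 18.0, 24.0, 18.0))  # 4.0
```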

Proteomics analysis

Three biological replicates of culture samples of wild-type L. paracasei Zhang (reference condition) and the pglX mutant (test condition), grown to late log phase in CDM, were prepared. For protein extraction, samples were dissolved in extraction buffer (1% sodium deoxycholate, 200 mM dithiothreitol, 50 mM Tris-HCl) containing protease inhibitors. Protein concentrations were assayed with a Pierce bicinchoninic acid protein assay kit (Thermo Fisher Scientific, Waltham, MA, USA). After reduction, cysteine alkylation, and digestion, samples were labeled with tandem mass tag reagent (TMT reagent; Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer’s instructions. Pooled samples were separated on an ACQUITY UPLC BEH C18 column (1.7 µm, 2.1 mm × 150 mm; Waters, Milford, MA, USA). Proteomic analyses were performed on an Easy-nLC system coupled to a Q Exactive HF-X (Thermo Fisher Scientific, Waltham, MA, USA) for 60 min. The peptides were dissolved in mass spectrometric loading buffer and separated on a C18 reversed-phase column (75 μm × 25 cm, Thermo Fisher Scientific, Waltham, MA, USA) for 120 min at a flow rate of 300 nL/min; the mobile phases consisted of aqueous solution A (2% acetonitrile with 0.1% formic acid) and B (80% acetonitrile with 0.1% formic acid). The peptides were eluted using the following gradient: 0–67 min, 6–23% B; 67–81 min, 23–29% B; 81–90 min, 29–38% B; 90–92 min, 38–48% B; 92–93 min, 48–100% B; 93–120 min, 100–0% B. The Q Exactive HF-X was run in data-dependent acquisition mode. Mass spectrometry (MS) spectra (m/z 350–1500) were acquired at a primary MS resolution of 120,000. The automatic gain control (AGC) target was 3e6, and the maximum fill time was 50 ms. The 15 most intense precursor ions were selected for fragmentation in the collision cell by higher-energy collisional dissociation. The MS/MS resolution was set at 45,000; the AGC target was 2e5; the maximum fill time was 120 ms; the fixed first mass was 110 m/z; the minimum AGC target was 1e4; the intensity threshold was 8.3e4; and the dynamic exclusion time was 30 s.
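
The elution gradient given above can be read as a piecewise program. The sketch below computes the mobile phase B percentage at any time point, assuming linear ramps between the stated breakpoints (the text does not say how the ramps are shaped). The names are hypothetical.

```python
# (time in min, % B) breakpoints taken from the gradient in the text.
GRADIENT = [(0, 6), (67, 23), (81, 29), (90, 38), (92, 48), (93, 100), (120, 0)]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-120 min program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 74, 92.5, 120):
    print(t, percent_b(t))
```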

Raw LC-MS/MS spectra were analyzed with Proteome Discoverer™ Software 2.4. The MS/MS search criteria were as follows: precursor mass tolerance of 20 ppm; fragment mass tolerance of 0.02 Da; trypsin as the enzyme with up to 2 missed cleavages allowed; carbamidomethyl (C), TMTpro (K), and TMTpro (N-terminus) as static modifications; and oxidation (M), acetyl (N-terminus), met-loss (N-terminus), and met-loss with acetyl (N-terminus) as dynamic modifications. The cut-off FDR for peptide identification was ≤0.01. For protein identification, each protein was required to match at least one unique peptide. Proteins with a P value of <0.05 by t-test were considered statistically significant, and a 1.2-fold change was defined as the threshold for regulated proteins.

Hi-C analysis

Wild-type L. paracasei Zhang (reference condition) and the pglX mutant (test condition) were grown to late log phase in CDM. Cells were collected by centrifugation, washed at room temperature, and crosslinked with 3% formaldehyde for 30 min. The formaldehyde was quenched with 0.375 M glycine for 20 min at 4 °C. The fixed cells were collected and stored at −80 °C. For library construction, the fixed cells were suspended in 100 µL of Tris-EDTA buffer with 2 µL of lysozyme (Ready-Lyse™ Lysozyme Solution; Epicenter Biotechnologies, Madison, WI, USA). After incubation for 20 min, sodium dodecyl sulfate was added to lyse the cells for 10 min at 65 °C. The lysed cells were digested in a reaction mixture consisting of 300 µL of water, 50 µL of 10× NEB buffer 2.1 (New England Biolabs, Inc., Ipswich, MA, USA), and 100 U of Sau3AI. Restriction fragment ends were labeled with biotin-14-dCTP (TriLINK Biotechnologies, San Diego, CA, USA). After blunt-end ligation, cross-linking was reversed overnight with proteinase K. The DNA was purified using the QIAamp DNA Mini Kit (Qiagen GmbH, Hilden, Germany) and sheared to a length of ~400 bp. Point ligation junctions were pulled down using Dynabeads® MyOne™ Streptavidin C1 (Thermo Fisher Scientific, Waltham, MA, USA). The Hi-C library was prepared with the NEBNext® Ultra™ II DNA Library Prep Kit (New England Biolabs, Inc., Ipswich, MA, USA) and sequenced on an Illumina HiSeq X Ten platform (Illumina Inc., San Diego, CA, USA).

To avoid artificial bias, reads were quality-filtered with Trimmomatic version 0.38, and the clean data were then iteratively aligned to the reference genome [48]. Valid paired reads were binned into nonoverlapping genomic intervals to construct contact maps. After valid contacts were counted at a defined resolution, an observed interaction matrix was obtained and normalized with an iterative normalization method. The contacts at a resolution of 1 kb bins were imported into Fit-Hi-C software to calculate the cumulative probability P value and FDR (q value). Significant interactions were defined by P and q values of less than 0.01 and a contact count of >2 [49]. Chromosomal interaction domains (CIDs) are contiguous regions with a high degree of self-association; they were identified by dividing the chromosome into windows of fixed length using an insulation score algorithm [50]. Differential insulation areas were obtained by using the sliding-window method [49]. Based on the insulation scores of bins, the Pearson correlation coefficient of each window between the two samples was calculated [49]. Windows with a Pearson coefficient of >0.6 were merged, and the remaining bins in the genome were regarded as unique insulation regions [49]. Interactions and CIDs that occurred only under the reference condition (in wild-type L. paracasei Zhang but not the pglX mutant) were considered unique to the reference condition, and vice versa.
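
The binning and significance-filtering steps can be illustrated with a short sketch. It is a simplified stand-in under stated assumptions: read pairs are given as pairs of genomic coordinates on a single chromosome, no normalization is applied, and the P and q values are taken as already computed (for example, by Fit-Hi-C) rather than derived here. The function names are hypothetical.

```python
from collections import Counter

BIN_SIZE = 1000  # 1 kb resolution, as stated in the text


def bin_contacts(read_pairs, bin_size=BIN_SIZE):
    """Count valid read pairs per unordered pair of nonoverlapping bins.

    read_pairs: iterable of (position_a, position_b) in base pairs.
    Returns a Counter keyed by (bin_i, bin_j) with bin_i <= bin_j.
    """
    counts = Counter()
    for pos_a, pos_b in read_pairs:
        i, j = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[(i, j)] += 1
    return counts


def is_significant(p_value, q_value, contact_count):
    """Apply the thresholds from the text: P < 0.01, q < 0.01, count > 2."""
    return p_value < 0.01 and q_value < 0.01 and contact_count > 2


# Invented read-pair coordinates for illustration only.
pairs = [(150, 5200), (5900, 400), (999, 5001), (1000, 2500)]
contacts = bin_contacts(pairs)
print(contacts[(0, 5)], contacts[(1, 2)])
```

Sorting the two bin indices makes the map symmetric by construction, so a pair sequenced as (bin 5, bin 0) is counted in the same cell as (bin 0, bin 5).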

Targeted metabolomics analysis of metabolites involved in energy metabolism

Samples of wild-type L. paracasei Zhang (reference condition) and pglX mutant (test condition) prepared from cells grown to late log phase in CDM were separated by an ACQUITY UPLC BEH Amide column (1.7 µm, 2.1 × 100 mm; Waters, Milford, MA, USA). The solvent system consisted of water with 10 mM ammonium acetate and 0.3% ammonium hydroxide (A), and 90% acetonitrile/water (B). The gradient was as follows: 0–1.2 min, 95% B; 8 min, 70% B; 9–11 min, 50% B; 11.1–15 min, 95% B.

Linear ion trap and triple quadrupole scans were carried out on a QTRAP® 6500+ LC-MS/MS System coupled to an electrospray ionization (ESI) turbo ion-spray interface, operated in both positive and negative ion modes. The ESI source conditions were as follows: ion source, ESI±; source temperature, 550 °C; ion-spray voltage, 5500 V (positive), −4500 V (negative); curtain gas, 35 psi. Metabolites in energy metabolism were analyzed using multiple reaction monitoring (MRM). Data were acquired with Analyst 1.6.3 software (Sciex, Framingham, MA, USA), and metabolites were quantified with MultiQuant 3.0.3 software (Sciex, Framingham, MA, USA). Mass spectrometer parameters, such as the declustering potentials and collision energies for individual MRM transitions, were optimized. A specific set of MRM transitions was monitored for each period according to the metabolites eluted within that period. Metabolite identification was based on the MetWare online platform (www.metware.cn/). Differentially regulated metabolites in energy metabolism between samples were determined by variable importance in projection and fold change.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
