Comparison of Oxford Nanopore Technologies and Illumina MiSeq sequencing with mock communities and agricultural soil

Study sites

Soils were collected from two sites: ARDEC (Colorado State University’s Agricultural Research, Development and Education Center in Fort Collins, CO) and CPCRC (USDA Columbia Plateau Conservation Research Center in Pendleton, OR). At each site, four replicate plots of no-till corn (ARDEC) or no-till annual wheat (CPCRC) were sampled. At ARDEC the soils are clay loam, and at CPCRC the soils are Walla Walla silt loams (fine-loamy, mesic Aridic Haplustalls). For each plot, six 1″ diameter cores (15 cm deep) were sampled near plant crowns, composited in resealable plastic bags, and stored on ice in coolers until transfer to the laboratory (less than 30 min). Once in the laboratory, the soils were homogenized by hand, sieved to 4 mm, and stored in the freezer (−20 °C) until DNA extraction. Prior to freezing, subsamples (~5 g) were removed from each sample to measure gravimetric soil water content.

DNA extraction

DNA was extracted from three replicate 0.25 g soil samples from each plot using the Qiagen DNeasy Powersoil Pro Kit (Qiagen, Germantown, MD). The extraction process was carried out using a fully automated Qiagen QIAcube robot with a 10-min vortex lysis step. DNA quality was assessed using a Nanodrop 1000 (Thermo Scientific, Waltham, MA) and quantified fluorometrically with the Invitrogen dsDNA HS Assay Kit on a Qubit 2.0 (Life Technologies, Carlsbad, CA).

Library preparation

PCR amplifications were performed on each DNA sample using two different 16S rRNA gene primer pairs. The first primer pair, 341F/806R [17], targets the V3-V4 region of the 16S rRNA gene and was used for both platforms. The second primer pair, 27F/1492R, targets the full-length 16S rRNA gene and was only used on the ONT MinION platform (Table 1).

Table 1 Summary of platforms and bioinformatics methods compared in this study.

ONT MinION PCR conditions and library preparation

Extracted DNA samples were amplified in 60 µL PCR reactions containing 30 µL Phusion HSII (Thermo Scientific) master mix, 0.6 µL of each forward and reverse primer (10 µM concentration), 21.6 µL molecular grade H2O, and 6 µL soil DNA diluted 1:20 with nuclease-free water. Reactions were held at 98 °C for 30 s, with amplification proceeding for 25 cycles at 98 °C for 15 s, 50 °C for 15 s, and 72 °C for 60 s with a final extension at 72 °C for 5 min. The PCR products (PCR1) were purified using AMPure XP beads (Beckman Coulter, Indianapolis, IN).

Unique barcodes (EXP-PBC096, ONT, Oxford, UK) were added to both ends of the DNA fragments by PCR. These were 50 µL PCR reactions containing 25 µL Phusion HSII master mix, 19 µL H2O, 1 µL of forward/reverse barcodes, and 5 µL PCR1 product diluted 1:10 with nuclease-free water. Reactions were held at 98 °C for 30 s, with amplification proceeding for 15 cycles at 98 °C for 15 s, 62 °C for 15 s, and 72 °C for 60 s, with a final extension at 72 °C for 5 min. The barcoded products of this PCR reaction were purified a second time using AMPure XP beads.

Barcoded amplicons from all samples were pooled and prepared for sequencing using the SQK-LSK109 Ligation Sequencing Kit (ONT). The library was loaded on a MinION flow cell FLO-MIN106D-R9 (ONT) per the manufacturer’s protocol, and sequencing was started with a runtime of 48 h and voltage of −180 V. All libraries included no-template (H2O-only) negative controls and a mock community (ZymoBIOMICS Microbial Community DNA Standard D6305; Zymo Research, Irvine, CA).

MiSeq PCR conditions and library preparation

Extracted DNA was amplified in triplicate, in 20 µL PCR reactions containing 10 µL Maxima SYBR-green (Thermo Scientific), 2 µL of each forward and reverse primer (10 µM concentration), 4 µL molecular grade H2O, and 2 µL soil DNA diluted 1:20 with nuclease-free water. Reactions were held at 95 °C for 5 min, with amplification proceeding for 28 cycles at 95 °C for 40 s, 55 °C for 120 s, and 72 °C for 60 s, with a final extension at 72 °C for 7 min. Thermocycling was performed with a Roche 96 Lightcycler (Roche, Indianapolis, IN). The products of the triplicate PCR reactions were pooled and purified using AMPure XP beads.
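
The reaction composition above can be checked with simple dilution arithmetic. The following is a minimal sketch, assuming the primer stocks are at 10 µM; the variable and function names are illustrative and not part of the original protocol.

```python
# Component volumes (µL) for one first-round MiSeq PCR reaction, as listed above.
components_ul = {
    "Maxima SYBR-green master mix": 10.0,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "molecular grade H2O": 4.0,
    "soil DNA (1:20 dilution)": 2.0,
}

# Assumption: primer stocks are at 10 µM.
PRIMER_STOCK_UM = 10.0


def total_volume(components):
    """Sum the component volumes to get the reaction volume in µL."""
    return sum(components.values())


def final_concentration(stock_conc, added_volume, reaction_volume):
    """Solve C1 * V1 = C2 * V2 for the final concentration C2."""
    return stock_conc * added_volume / reaction_volume


reaction_ul = total_volume(components_ul)
primer_final_um = final_concentration(
    PRIMER_STOCK_UM, components_ul["forward primer"], reaction_ul
)
print(f"reaction volume: {reaction_ul} µL")
print(f"final primer concentration: {primer_final_um} µM")
```

Under these assumptions the components add up to the stated 20 µL, and each primer ends up at 1 µM in the reaction.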

Nextera XT barcode sequences (Illumina, San Diego, CA) were added to both ends of the DNA fragments by PCR using 50 µL PCR reactions containing 25 µL Maxima SYBR-green, 10 µL H2O, 5 µL of each forward and reverse barcode (5 µM concentration), and 5 µL of sample PCR1 product. Reactions were held at 95 °C for 3 min, with amplification proceeding for 8 cycles at 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, with a final extension at 72 °C for 5 min. The barcoded products of this PCR reaction were purified a second time using AMPure XP beads. Barcoded amplicons from all samples were pooled and sequenced on an Illumina MiSeq instrument at Colorado State University using an Illumina MiSeq v3 600-cycle Kit with 25% PhiX spike-in (Illumina).

Bioinformatics and sequence processing

Emu MinION and Emu MiSeq

Sequences generated on the MinION platform were base-called and demultiplexed using Guppy v6.0.1 (ONT). Except where otherwise noted, default parameters were used. Sequences were filtered based on length (V34: 300–600 bp; Full: 1000–2000 bp) and a minimum q-score of 70 using Filtlong v0.2.1 [18] and Cutadapt v3.2 [19]. Chimeras were filtered using vsearch [20], and taxonomy was assigned with minimap2 v2.22 [21]. Error correction was performed with Emu v3.0.0 [5] using default parameters (--min-abundance = 0.0001, --N = 50, --K = 500 MB, --keep-counts = FALSE). Emu applies an expectation-maximization algorithm to adjust taxonomic assignments using up to 50 sequence alignments per sequence read.
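
The idea behind expectation-maximization reassignment of ambiguous reads can be illustrated with a toy example. This is a minimal sketch of a generic EM abundance estimator, not Emu's actual implementation: the function name `em_abundance`, the input format (one dictionary of per-taxon alignment likelihoods per read), and the example values are all hypothetical.

```python
def em_abundance(read_likelihoods, n_iter=100, tol=1e-9):
    """Estimate taxon relative abundances from per-read alignment likelihoods.

    read_likelihoods: list of dicts mapping taxon -> likelihood of the read
    given that taxon (hypothetical input format). Each read is assumed to
    have at least one taxon with a positive likelihood.
    """
    taxa = sorted({taxon for read in read_likelihoods for taxon in read})
    # Start from a uniform abundance estimate.
    abundance = {taxon: 1.0 / len(taxa) for taxon in taxa}
    for _ in range(n_iter):
        # E-step: split each read among its candidate taxa in proportion to
        # likelihood * current abundance.
        totals = dict.fromkeys(taxa, 0.0)
        for read in read_likelihoods:
            weights = {t: p * abundance[t] for t, p in read.items()}
            norm = sum(weights.values())
            if norm == 0.0:
                continue  # read has no support under the current estimate
            for t, w in weights.items():
                totals[t] += w / norm
        # M-step: new abundance is the share of fractional reads per taxon.
        n_assigned = sum(totals.values())
        updated = {t: totals[t] / n_assigned for t in taxa}
        change = max(abs(updated[t] - abundance[t]) for t in taxa)
        abundance = updated
        if change < tol:
            break
    return abundance


# Toy data: three reads match only taxon A, one matches only taxon B,
# and two align equally well to both.
reads = (
    [{"A": 1.0}] * 3
    + [{"B": 1.0}]
    + [{"A": 0.5, "B": 0.5}] * 2
)
estimate = em_abundance(reads)
print(estimate)
```

In this toy case the two ambiguous reads are split according to the evidence from the uniquely assigned reads, so the estimate converges toward 0.75 for taxon A and 0.25 for taxon B.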

Paired forward and reverse MiSeq reads were joined using PEAR v0.9.8 [22]. Sequences were then filtered based on length (V34: 300–600 bp) and a minimum quality score of 70 using Filtlong v0.2.1 [18] and Cutadapt v3.2 [19]. Chimeras were filtered using vsearch UCHIME v2.13.3 [20], taxonomy was assigned with minimap2, and error correction was performed with Emu v3.0.0 [5].

For DADA2 [23] MiSeq, all primers were removed from demultiplexed raw fastq files using Cutadapt v3.2 [19], and amplicon sequence variants were inferred using the default pipeline in DADA2. Each sequence variant was classified against the default NCBI-linked reference database available from the Emu v3.0.0 website (gitlab.com/treangenlab/emu) using minimap2 v2.22 [21]; the primary alignment for each sequence was chosen with SAMtools v1.9 [24] and used for taxonomic assignments. One phylum of bacteria has not been assigned a name and is reported as “p_of_Bacteria.” All downstream data analyses were performed on taxonomic abundance tables following classification using the rank level(s) defined below.
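
Selecting the primary alignment amounts to discarding SAM records flagged as unmapped, secondary, or supplementary. The sketch below shows that filter applied to SAM text lines in plain Python. It is an illustration under assumptions, not the commands used in the study: the function name, the example records, and the reference names are hypothetical.

```python
# Bit values defined by the SAM format for the FLAG field.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def primary_assignments(sam_lines):
    """Map each query name to the reference of its primary alignment."""
    assignments = {}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        query_name, flag, reference_name = fields[0], int(fields[1]), fields[2]
        # Keep only mapped records that are neither secondary nor supplementary.
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        assignments[query_name] = reference_name
    return assignments


# Hypothetical SAM records: asv1 has a primary and a secondary hit,
# asv2 has one primary hit, and asv3 is unmapped.
example_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "asv1\t0\ttaxon_A\t1\t60\t4M\t*\t0\t0\tACGT\t*",
    "asv1\t256\ttaxon_B\t1\t0\t4M\t*\t0\t0\t*\t*",
    "asv2\t16\ttaxon_C\t1\t60\t4M\t*\t0\t0\tACGT\t*",
    "asv3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*",
]
result = primary_assignments(example_sam)
print(result)
```

Each sequence variant then carries exactly one reference hit, which can be translated into a taxonomic label through the reference database.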

Data analysis

Total library sizes were as follows: MinION Full 1,695,436 total sequence reads with an average of 66,843 reads per sample; MinION V34 2,318,235 total sequence reads with an average of 96,730 reads per sample; and MiSeq V34 2,111,798 total sequence reads with an average of 83,345 reads per sample (Table 2). Because library sizes differed, all samples were rarefied to 50,000 reads prior to calculating alpha diversity (i.e., species richness) estimates. Principal Coordinates Analysis (PCoA) was performed using Bray–Curtis (BC) distances calculated from square root-transformed, genus-level relative abundances, and significant differences between platforms and/or sites were tested using adonis in the vegan package for R [25]. Figure 3 was constrained by both platform and site. Differential abundances were tested using either the DESeq2 package or the Wilcoxon test in the metacodeR package with a false discovery rate < 0.05 [26].
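
The rarefaction and distance steps can be sketched in a few lines. This is a minimal standard-library illustration, not the R/vegan code used for the analysis. The genus counts are made up, the function names are hypothetical, and a depth of 100 reads stands in for the 50,000 used in the study.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample a count table to `depth` reads without replacement."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for taxon in picked:
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def sqrt_relative_abundance(counts):
    """Convert counts to relative abundances, then take square roots."""
    total = sum(counts.values())
    return {taxon: math.sqrt(n / total) for taxon, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a - b| / sum(a + b) over all taxa."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    denominator = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in taxa)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical genus-level read counts for two samples.
sample_a = {"genus_1": 120, "genus_2": 60, "genus_3": 20}
sample_b = {"genus_1": 30, "genus_2": 90, "genus_4": 40}

DEPTH = 100  # illustrative; the study rarefied to 50,000 reads
rare_a = rarefy(sample_a, DEPTH)
richness_a = len(rare_a)  # observed richness after rarefaction

distance = bray_curtis(
    sqrt_relative_abundance(sample_a), sqrt_relative_abundance(sample_b)
)
print(richness_a, round(distance, 3))
```

A matrix of such pairwise distances is the input to an ordination such as PCoA and to a permutation test such as adonis.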

Table 2 Summary data of sequencing results from the four platform and bioinformatics pipelines (MinION Full, MinION V34, MiSeq V34, and MiSeq V34 DADA2). Similarity was calculated with Bray–Curtis against the expected Zymo mock community. F statistics are shown for perMANOVA results for site differences (F Site) with each full dataset, and plot differences (F ARDEC and F Pendleton) for each site subset (ARDEC and Pendleton).
