Identification of missed viruses by metagenomic sequencing of clinical respiratory samples from Kenya

The detection of viruses

As shown in Table 1, viral NGS identified at least one syndrome-associated mammalian virus in 35 of 95 Kilifi County Hospital (KCH) inpatient samples (36.8%) and 23 of 95 household cohort (study investigating Who-Acquires-Infection-From-Whom, WAIFW) samples (24.2%), giving an overall “missed virus detection rate” of 30.5%. Of the 58 samples that yielded a virus, six had mixed virus infections (5 in KCH patients and 1 in the household cohort). Of the five KCH mixed infections, four were human rhinovirus A (HRV-A)/respiratory syncytial virus B (RSVB), HRV-A/Enterovirus D68 (EV-D68), Bocavirus/human rhinovirus B (HRV-B), or EV-D68/Coxsackievirus A16 (CV-A16), and one sample showed two distinct human rhinovirus C (HRV-C) strains. Overall, 64 viruses were identified; these were either missed viruses (viruses included in the standard diagnostic panel for which the assay was negative; n = 38) or unexpected viruses (viruses not part of the standard diagnostic panel; n = 26).
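The rates quoted above follow directly from the sample counts. The short Python sketch below recomputes them as an arithmetic check; the variable names are illustrative, only the counts come from the text, and the total assumes that each of the six mixed-infection samples carried exactly two viruses, which is consistent with the reported total of 64.

```python
# Counts taken from the text; variable names are illustrative only.
positive = {"KCH": 35, "WAIFW": 23}   # samples with at least one virus found by NGS
tested = {"KCH": 95, "WAIFW": 95}     # panel-negative samples that were sequenced


def rate(hits: int, total: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


rates = {cohort: rate(positive[cohort], tested[cohort]) for cohort in tested}
overall = rate(sum(positive.values()), sum(tested.values()))

# Assumption: each of the six mixed-infection samples carried exactly two
# viruses, so the virus count exceeds the positive-sample count by six.
mixed_samples = 6
total_viruses = sum(positive.values()) + mixed_samples

print(rates, overall, total_viruses)
```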

Table 1 Detected respiratory viruses, sample source and our explanation for diagnostic failure.

Missed viruses

The respiratory viruses (with contigs ≥ 1000 nt) that had been missed by the standard diagnostic panel comprised 28 human rhinoviruses (HRV) in 27 samples (14.2%; 27/190), one human parainfluenza virus 1 (HPIV-1) (0.5%; 1/190), three RSVB (1.6%; 3/190), and six human metapneumoviruses (HMPV) (3.2%; 6/190) (Table 1). For these missed viruses, nucleotide mismatches between the diagnostic primers and probes and the viral target sequences were identified, potentially accounting for the detection failures (see below).

Analysis of primer mismatches

Mismatches between the primers/probes and the viral target sites are a likely cause of missed diagnoses. For all viral contigs ≥ 1000 nt, if the virus family was part of the diagnostic panel, target sites were examined for differences from the primer/probe. For HRV-A, HRV-B, HRV-C, HMPV and HPIV-1, a number of nucleotide changes were observed in target sites and most were consistent with failed or suboptimal diagnostic tests (Figs. 1, 2). For the RSVB genomes detected, there were nucleotide changes in the probe targets; an updated panel of primers/probe was recently developed and used successfully1.
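As an illustration of this kind of target-site check, the following Python sketch counts positions where a contig's target site is not covered by the corresponding primer base, allowing for IUPAC degenerate bases in the primer. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `mismatch_positions` and both sequences are hypothetical, and the target site is assumed to be already extracted from the alignment in the same orientation as the primer.

```python
# IUPAC nucleotide codes: each primer base maps to the bases it can pair with.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_positions(primer: str, target: str) -> list:
    """Return 0-based positions where the target base is not covered by the primer base.

    Both sequences must have equal length and the same orientation
    (a reverse primer would need to be reverse-complemented first).
    Gaps ("-") in the target count as mismatches.
    """
    if len(primer) != len(target):
        raise ValueError("primer and target site must have equal length")
    return [
        i
        for i, (p, t) in enumerate(zip(primer.upper(), target.upper()))
        if t not in IUPAC.get(p, "")
    ]


# Hypothetical sequences for illustration; not the study's primers or contigs.
primer = "ACGTRYAGGCT"
site = "ACGTGTAGACT"
print(mismatch_positions(primer, site))
```

Reporting positions rather than a single count keeps the information that matters for assay failure, since mismatches near the 3' end of a primer are generally the most disruptive.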

Figure 1

Human metapneumovirus (HMPV) identified in the study. (A) The diagnostic primers and probe target sites in the Kilifi HMPV genotype A genomes and contigs were examined. All viral contigs from each virus family or type were aligned using MAFFT25, and the alignment was trimmed to a 100–200 nt region surrounding the primer and probe target sites. Nucleotide differences between the expected primer and probe target sites and the actual contig sequences were identified and plotted in shades of blue, and gaps in contig sequences were indicated in grey. (B) As in (A) but for HMPV genotype B. (C) Maximum-likelihood (ML) phylogenetic tree of HMPV genomes. Local strains on the phylogenetic tree were indicated by circles coloured blue for household members and red for KCH patients. The tree was mid-point rooted for clarity and horizontal branch lengths were drawn to the scale of nucleotide substitutions per site. The tree comparing local HMPV genomes to global genomes suggested that the local HMPV belonged to genotypes A2 and B1.

Figure 2

Check of diagnostic primers and probes for human rhinoviruses (HRV) and human parainfluenza virus identified in the study. The diagnostic primer and probe target sites in the Kilifi HRV-A, HRV-B, HRV-C and HPIV-1 genomes were examined (see Fig. 1 legend for detailed methods). For each contig, nucleotide changes from the expected target sites were indicated by vertical blue lines and gaps in the sequence were indicated by grey bars.

Unexpected viruses

An advantage of agnostic viral NGS is the ability to detect viruses present in a specimen without prior knowledge of the virus genome sequence for primer design. In the 190 samples, 26 unexpected viruses from five families of viruses were identified; none were included in the standard diagnostic panel (Table 1). The largest group of unexpected viruses (50%; 13/26) were Picornaviridae, genus Enterovirus: species A (CV-A16, n = 1), species B (Echovirus E1, n = 1), species C (CV-A24, n = 6 and human poliovirus 2 strain Sabin, n = 1), and species D (EV-D68, n = 4). The Parvoviridae human bocavirus (HBoV) and parvovirus B19 (B19) were identified in four samples and one sample, respectively (Table 1). Rubella virus (RVi) was detected in two KCH paediatric patients with very different clinical presentations (further details below).

Human metapneumovirus

Human metapneumovirus (HMPV) infection is frequent in young children12 and was identified in six samples (Table 1). The diagnostic primer targets in the genomes showed mismatches (Fig. 1, panels A, B) that could explain the missed HMPV diagnoses. A maximum-likelihood phylogenetic tree comparing the HMPV complete genomes from the local strains to global circulating strains showed the two HMPV from the household cohort were genotype A2; the HMPV from the KCH patient was genotype B1, closest to strains KC562240 (A2) and KF530179 (B1) from Australia in 2003 (Fig. 1, panel C). The reported HMPV sequences were also compared with local Kenya HMPV short sequences available from GenBank; all fell into similar lineages (Supplementary Fig. S1, panel B) and were likely missed because of primer mismatch rather than because they represented a new lineage.

Enterovirus genus

Among 64 viruses detected, the majority were from the Enterovirus genus, Picornaviridae family (N = 42; 66%). Apart from HRV (N = 28), viruses from the Enterovirus species were not included in the routine screening.

Rhinovirus species, Enterovirus genus

The most abundant Enteroviruses identified were Rhinovirus species A, B and C, with 28 complete or partial genomes identified. All three sets of diagnostic primers used at the time showed multiple mismatches with the genome target sites that could account for the 28 missed HRV cases (Fig. 2). A high diversity of circulating HRV has been noted in this region6,13 as shown in phylogenetic trees comparing local HRV identified from this study with global HRV genomes (Supplementary Fig. S2).

Enterovirus A species (CV-A16), Enterovirus genus

Coxsackievirus A16 (CV-A16), enterovirus 71 (EV-71) and several additional Enterovirus species are associated with hand, foot and mouth disease (HFMD)14. The 22-month-old patient infected with CV-A16 was hospitalised at KCH with pneumonia, but presented no clinical HFMD symptoms, and was discharged home after 3 days. Phylogenetically, the patient’s CV-A16 virus genome was closely related to a CV-A16 strain identified from an Ethiopian child in April 2016 (Supplementary Fig. S1, panel A)15.

Enterovirus C species (CV-A24), Enterovirus genus

Coxsackievirus A24 (CV-A24) was identified in six samples, all from a 2-month period (8 April through 3 June 2010) in the household study (Fig. 5). The six infected individuals were aged 8.5 to 33 months, and came from different households. None of these children presented with conjunctivitis; one had diarrhoea and all had rhinorrhea. The identified CV-A24 genomes showed 12 to 146 nt differences and very few shared SNPs (Fig. 3, panel A), suggesting that the viruses were not directly transmitted between the six individuals. The samples were selected to cover as many households as possible over the entire cohort time period; thus the observed diversity may reflect a much larger outbreak that would account for the number of nucleotide changes. This is also consistent with the monophyletic phylogeny for the six genomes (Fig. 3, panel B). When analysed with all available CV-A24 genomes from GenBank, the local CV-A24 sequences formed a monophyletic group closest to sequences from Uganda (GenBank MF189567) and French Guiana (GenBank MF419263) which were associated with ocular inflammation or acute haemorrhagic conjunctivitis (AHC) in 201716 (Fig. 3, panel B).
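The reasoning above rests on two quantities: the number of nucleotide differences between genomes and the number of SNPs shared by more than one genome. The Python sketch below shows one way such counts could be derived from an alignment. It is an illustration only, not the study's method: the sequences are toy data, the names `aln` and `differing_sites` are hypothetical, and columns containing a gap or N are assumed to be excluded from the comparison.

```python
from itertools import combinations


def differing_sites(a: str, b: str) -> set:
    """Return 0-based alignment columns where two aligned sequences differ.

    Columns where either sequence has a gap ("-") or an N are ignored.
    """
    skip = {"-", "N"}
    return {
        i
        for i, (x, y) in enumerate(zip(a.upper(), b.upper()))
        if x != y and x not in skip and y not in skip
    }


# Toy aligned sequences (hypothetical, not study data);
# "ref" stands in for the earliest genome used as the comparator.
aln = {
    "ref": "ACGTACGTAC",
    "g1": "ACGAACGTAC",
    "g2": "ACGTACCTAC",
    "g3": "ACGAAC-TAT",
}

# Number of differing sites for every pair of genomes.
pairwise = {
    (p, q): len(differing_sites(aln[p], aln[q])) for p, q in combinations(aln, 2)
}

# SNPs relative to the reference, then those carried by more than one genome.
snps = {
    name: differing_sites(aln["ref"], seq)
    for name, seq in aln.items()
    if name != "ref"
}
shared = {
    site
    for site in set().union(*snps.values())
    if sum(site in s for s in snps.values()) > 1
}

print(pairwise, shared)
```

Genomes linked by direct transmission would be expected to differ at few sites and to share most of their SNPs; many private differences with few shared SNPs point instead to separate introductions from a wider circulating pool.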

Figure 3

(A) Genome comparison of human coxsackievirus A24 (CV-A24) identified in the study. Kilifi CV-A24 genomes were examined and compared against the earliest Kilifi CV-A24 genome identified (20693_9, sample collected on 8 April 2010). For each genome, nucleotide changes from the first CV-A24 genome (20693_3) were indicated by vertical blue lines and gaps in the sequence were indicated by grey bars. (B) Maximum-likelihood phylogenetic tree of CV-A24 genomes. The ML tree compared 6 Kilifi CV-A24 genomes (all from household cohort, indicated as blue circles) to global genomes. The tree was mid-point rooted for clarity and horizontal branch lengths were drawn to the scale of nucleotide substitutions per site, and significant bootstrap values were shown for major nodes. (C) Human enterovirus D68 (EV-D68) maximum-likelihood phylogenetic tree. A maximum-likelihood phylogenetic tree was inferred comparing 4 Kilifi EV-D68 genomes (all from KCH pneumonia patients cohort, highlighted in red) to global genomes. The tree was mid-point rooted for clarity and horizontal branch lengths were drawn to the scale of nucleotide substitutions per site, and significant bootstrap values were shown for major nodes. These four local EV-D68 viruses belonged to clade A1, as shown in the zoomed-out tree.

Enterovirus C species (human poliovirus), Enterovirus genus

The detection of human poliovirus type 2 was likely due to viral shedding after oral poliovirus vaccination. The child in whom this isolate was detected was 6 weeks old at the time of sampling, and had received a dose of oral polio vaccine (OPV) 4 days prior. The child was hospitalised with cough and difficulty breathing, and was discharged home after 3 days. The human poliovirus 2 genome obtained was identical to the vaccine strain Sabin 2 (GenBank AY184220). Health authorities were informed about this finding.

Enterovirus D species (EV-D68), Enterovirus genus

Four EV-D68 were identified in the hospitalised cohort over a 5-month period (8 April through 318 July 2015). Two of these patients were co-infected with additional Enterovirus strains, CV-A16 or HRV-A. The four children with EV-D68 were hospitalised with severe pneumonia for 3–6 days, and were discharged with no reported neurological symptoms.

Phylogenetic analysis (Fig. 3, panel C) suggested that the four Kilifi EV-D68 viruses were clade A1 and closely related to a strain identified from a respiratory patient in 2014 in Sweden (GenBank MH674114) and to a Canadian EV-D68 strain in 2014 (GenBank KP455258).

Matonaviridae family, Rubivirus genus (Rubella virus)

Rubella is a contagious, typically mild disease caused by rubella virus (RVi), a single-stranded RNA virus in the Matonaviridae family, Rubivirus genus, that infects people of any age. However, primary RVi infection during the first trimester of pregnancy may result in congenital rubella syndrome (CRS) or miscarriage. Common sequelae of CRS include deafness, glaucoma, retinopathy and heart defects. Rubella infections can be prevented by a highly effective rubella vaccine. In Kenya, national rubella vaccination was not implemented until October 2016 and there was no surveillance of RVi prevalence or CRS incidence17.

RVi was detected in two KCH patients through this NGS study (Table 2). They were 10 days and 27 days old at the time of hospitalisation, and had admission diagnoses of neonatal sepsis. Their hospital admissions occurred 5 weeks apart. Phylogenetic analysis of the complete RVi genomes compared with all RVi genomes available from GenBank indicated that the two Kilifi RVi genomes were similar (60 nt differences, 99.4% identity) and belonged to the same genotype (genotype 2B, Fig. 4). Rubella is not routinely screened for or suspected in respiratory infections or neonatal sepsis. Identification of RVi in two neonatal patients, in the context of absent or low vaccination coverage in LMIC settings, should alert clinicians to consider this virus in their diagnoses.
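The reported identity can be related to the difference count with simple arithmetic. The sketch below assumes an aligned length of about 9,762 nt, roughly the size of a complete rubella virus genome; the study's actual aligned length is not stated in the text, so this figure is an assumption used only to show that 60 differences is consistent with the quoted 99.4%.

```python
# Assumption: aligned length of about 9,762 nt (approximate rubella genome size).
# The study's actual aligned length is not given in the text.
ALIGNED_LENGTH_NT = 9762
differences_nt = 60  # reported in the text

# Percent identity = share of aligned positions that are identical.
identity_pct = round(100 * (1 - differences_nt / ALIGNED_LENGTH_NT), 1)
print(identity_pct)
```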

Table 2 Infection features of two rubella cases.

Figure 4

Human rubella virus (RVi). A maximum-likelihood phylogenetic tree was inferred comparing two RVi genomes (both from KCH pneumonia patients cohort, indicated as red circles) to global genomes. The global RVi strains identified in confirmed CRS cases were indicated as orange circles in the tree, and RVi vaccine strains were indicated as green circles. The tree was mid-point rooted for clarity and horizontal branch lengths were drawn to the scale of nucleotide substitutions per site, and significant bootstrap values were shown for major nodes.

Parvoviridae family

We identified parvovirus B19 (B19) in a 5-month-old hospitalised patient with very severe pneumonia and anaemia. Tests for malaria were negative, and the patient was discharged after 3 days. The identified B19 virus genome belonged to genotype 1A and was similar to other global B19 sequences, as shown in the phylogenetic tree comparing the local B19 genome to global sequences (Supplementary Fig. S1, panel C).

Human bocavirus (HBoV) type 1 was identified in a hospitalised child with malnutrition, severe pneumonia and diarrhoea, and in three children with upper respiratory infections from different households. These 4 HBoV1 genomes clustered in 2 sub-lineages within genotype 1 when compared with all global sequences as shown in Supplementary Fig. S1, panel D.

Other viruses

Viruses detected at low frequency included HPIV-1 (one case), human parechovirus (one case), human herpesvirus 5 (HHV-5; two cases), human herpes simplex virus 1 (HSV-1; one case), dengue virus type 2 (DENV-2; one case) and echovirus E1 (one case).

Detection timeline

The collection dates of specimens that tested negative using the routine viral panel assay, and their NGS viral detection results, are plotted by time (Fig. 5). The various HMPV, HRV-A, HRV-B and HRV-C positive samples are distributed throughout the observation period and occurred in both study groups (KCH and WAIFW). CV-A24 and EV-D68 positive samples were detected over discrete time periods (2 months in 2010 and 5 months in 2015, respectively) as mentioned above. The other observed viruses were too few to draw strong conclusions about their temporal distribution.

Figure 5

The timeline of viruses identified in the household cohort and KCH pneumonia patients. The viruses detected in the study (by row) were plotted against the date from which samples were collected (by column). Each virus was presented as a different colour. All the positive samples from household cohort were from 10 December 2009 to 3 June 2010, while positive samples from KCH patients were from 12 January 2015 to 25 December 2015.
