Phylogenomic analysis uncovers a 9-year variation of Uganda influenza type-A strains from the WHO-recommended vaccines and other Africa strains

Demographic characteristics of sampled patients

The Uganda Virus Research Institute National Influenza Centre (UVRI-NIC) laboratory tested 18,353 patients between 22nd October 2010 and 9th May 2018. Thirteen percent (2404/18,353) were positive for influenza; of these, 69.88% (1680/2404), 29.62% (712/2404), and 0.17% (4/2404) had influenza A, B, and A/B co-infection, respectively (Fig. 1A). IAV positives included 67.08% (1127/1680) A(H3N2), 32.2% (541/1680) A(H1N1)pdm09, and 0.12% (2/1680) AH1/H3 co-infections.
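The proportions reported above are simple ratios of the stated counts. As a minimal sketch (the helper name `percent` and the dictionary layout are illustrative, not part of the study's workflow), they can be re-derived as follows:

```python
def percent(numerator: int, denominator: int, digits: int = 2) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, digits)


# Counts taken from the paragraph above.
TESTED = 18_353
POSITIVE = 2_404
TYPE_COUNTS = {"A": 1_680, "B": 712, "A/B co-infection": 4}
SUBTYPE_COUNTS = {"A(H3N2)": 1_127, "A(H1N1)pdm09": 541, "AH1/H3 co-infection": 2}

if __name__ == "__main__":
    print(f"Influenza positive: {percent(POSITIVE, TESTED)}% of tested patients")
    for label, count in TYPE_COUNTS.items():
        print(f"Influenza {label}: {percent(count, POSITIVE)}% of positives")
    for label, count in SUBTYPE_COUNTS.items():
        print(f"{label}: {percent(count, TYPE_COUNTS['A'])}% of influenza A positives")
```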

Figure 1

Workflow of swab selection and whole genome recovery. (A) Shows how swabs were selected for influenza whole-genome sequencing (WGS). Patients diagnosed with either influenza subtype A(H1N1)pdm09 or A(H3N2) and whose swab had a PCR CT ≤ 35 had their laboratory codes randomised based on the subtype and year of collection using the R software v3.6.3 (www.r-project.org). All available swabs were retrieved for years with fewer than fifteen swabs. The 697 missing swabs include some shipped to the Centers for Disease Control and Prevention (CDC) for routine surveillance and some lost due to an accidental failure of a freezer. The numbers are based on the UVRI-NIC laboratory dataset only, as of 9th May 2018. (B) Shows how viral samples were excluded before and after sequencing and the rate of whole genome recovery. Eight viruses [2 A(H1N1)pdm09 and 6 A(H3N2)] failed quality control (QC) before sequencing.
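The selection step in panel (A) amounts to a stratified random draw. The sketch below illustrates the idea in Python rather than R; the record fields (`code`, `subtype`, `year`, `ct`), the per-stratum target of 15, and the seed are assumptions made for illustration and are not taken from the study's scripts.

```python
import random
from collections import defaultdict


def select_swabs(records, per_stratum=15, ct_cutoff=35.0, seed=0):
    """Pick swab codes at random within each (subtype, year) stratum.

    `records` is an iterable of dicts with the hypothetical keys
    'code', 'subtype', 'year' and 'ct'. Strata holding fewer than
    `per_stratum` eligible swabs are taken in full.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        if rec["ct"] <= ct_cutoff:  # keep only swabs with PCR CT <= cutoff
            strata[(rec["subtype"], rec["year"])].append(rec["code"])

    selected = {}
    for key, codes in sorted(strata.items()):
        if len(codes) < per_stratum:
            selected[key] = sorted(codes)  # retrieve all available swabs
        else:
            selected[key] = sorted(rng.sample(codes, per_stratum))
    return selected
```

Seeding the generator makes the selection repeatable, which matters if the list of laboratory codes has to be regenerated later.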

The mean number of swabs sampled per subtype per year was 13 (1–18), excluding 2012, 2016, and 2018, which had two A(H1N1)pdm09, one A(H1N1)pdm09, and no A(H3N2) swabs available, respectively (Supplementary Table 1). Swabs from four sampled patients [three A(H1N1)pdm09 and one A(H3N2)] lacked demographic data. Of the 230 swabs with data, 65.22% (150/230) and 34.78% (80/230) were from ILI and SARI cases, respectively. The numbers of sequenced and un-sequenced swabs were not significantly different by case type, gender, age group, or geographical region per year, except that Central had more swabs sequenced than other regions in 2014 and 2016 (Table 1). The mean age was not significantly different between our study and the UVRI-NIC patients (Supplementary Fig. 1).

Table 1 Comparison of demographic characteristics of influenza A positive patients sampled by the general UVRI-NIC surveillance programme whose viral swabs were successfully or not sequenced.

Sequencing efficiency

All 234 sampled swabs were analysed, and their mean read counts (ranges) are reported below.

The MiSeq generated 569,435 (806–1,644,430) paired reads per sample (data not shown). Following quality control (QC), 266,020 (70,229–909,340) clean reads per sample were processed using the Iterative Refinement Meta-Assembler (IRMA), 265,868 (70,150–908,946) passed IRMA’s QC, 213,809 (1381–908,234) matched flu references, and 113,164 (777–461,172) paired reads were assembled (Supplementary Fig. 2A).

The number of assembled reads decreased with an increase in gene size. The shortest genes, matrix protein (MP) and non-structural protein (NS), had 25,101 (28–91,585) and 19,341 (41–79,167) assembled reads, respectively. The NA, HA, and nucleoprotein (NP) genes had 18,653 (74–71,377), 14,466 (35–74,550), and 14,151 (69–74,415) reads assembled, respectively. The polymerase subunits PA, PB2, and PB1 had 10,465 (31–56,243), 7885 (15–47,444), and 4497 (8–31,354) reads assembled, respectively (Supplementary Fig. 2B).

We successfully sequenced and assembled viral genes from 96.58% (226/234) of the swabs (Fig. 1B). Eleven viral WGs with a depth of coverage < 100 were excluded, leaving 215 viruses. Of these, 89.77% (193/215) were WGs, spanning 100% and > 96.7% of the nucleotides in the coding sequences (CDS) and complete genome, respectively, of the A/California/7/2009(H1N1) and A/Perth/16/2009(H3N2) vaccine viruses. Our overall WG recovery rate was 85.4% (193/226). The remaining 10.23% (22/215) of viruses had complete CDS for 2–7 genes. Two viruses sampled as A(H1N1)pdm09 matched IRMA’s A(H3N2) references and were included in the A(H3N2) analysis. All 215 newly generated virus sequences were submitted to a publicly accessible database, GISAID EpiFlu™ (www.gisaid.org/), under accessions EPI_ISL_498819–EPI_ISL_498931 [A(H1N1)pdm09] and EPI_ISL_498934–EPI_ISL_499037 [A(H3N2)].

Antigenic drift among Uganda IAVs

Uganda IAV HA1 proteins continuously drifted away from the 2010–2020 vaccines (Supplementary Table 2). For seasons when formulations differed, Uganda A(H3N2) strains had 1–2 more unique amino acid (aa) substitutions relative to the Southern hemisphere (SH) than to the Northern hemisphere (NH) vaccine strains. Since most of Uganda lies north of the equator, the substitutions described below are relative to NH and SNH vaccines (shared by NH and SH) for the sampled 2010–2018 [A(H1N1)pdm09] and 2010–2017 [A(H3N2)] seasons.

We observed 18 unique aa substitutions across the five antigenic sites [15,16] amongst the 107 A(H1N1)pdm09 strains (Supplementary Table 2A). Ranking from the most variable, the main antigenic sites Ca2, Sa, Sb, Ca1, and Cb had 6, 5, 4, 2, and 1 unique aa substitutions, respectively. Substitutions S164T, S185T, S203T, and H138R and S74R were the most frequent at sites Sa, Sb, Ca1, and Ca2, respectively. All 2010–2016 viruses had S203T, and 90% (27/30) of the 2017–2018 viruses had S164T and S74R.

There were 92 unique aa substitutions across the five antigenic sites [16,17] amongst the 99 A(H3N2) strains (Supplementary Table 2B). The antigenic sites B, A, D, C, and E had 24, 22, 17, 16, and 13 unique aa substitutions, respectively. Substitutions K144N, P194L, H311Q, S96N, and K62E were the most frequent at sites A, B, C, D, and E, respectively. Forty-seven percent (47/99) and 41.41% (41/99) of the 2010–2017 strains had V186G and N145S, respectively.
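Substitution labels such as S203T or K144N name the vaccine (reference) residue, its position, and the residue observed in the field strain. The following is a minimal sketch of how such labels can be derived from a pairwise alignment; the function name, the `start` offset, and the toy five-residue peptides are illustrative assumptions, not the study's pipeline or real HA1 sequences.

```python
from __future__ import annotations


def call_substitutions(reference: str, query: str, start: int = 1) -> list[str]:
    """List aa substitutions of `query` relative to an aligned `reference`.

    Both strings must come from the same alignment (equal length, '-' for
    gaps). Positions count reference residues only, beginning at `start`.
    Gaps and ambiguous residues ('X') in the query are not reported.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")

    substitutions = []
    position = start - 1
    for ref_aa, qry_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion in the query: no reference position to label
        position += 1
        if qry_aa in ("-", "X") or qry_aa == ref_aa:
            continue
        substitutions.append(f"{ref_aa}{position}{qry_aa}")
    return substitutions


# Toy peptides (invented for illustration), numbered from position 163.
print(call_substitutions("KSSWP", "KTSWP", start=163))  # ['S164T']
```

Counting the distinct labels returned across all strains of a subtype gives the "unique aa substitutions" tallies, provided every strain is numbered against the same reference.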

Uganda A(H1N1)pdm09 strains had substitutions in the receptor binding site (RBS; H138Q/R, S190V, and D222E) and S164T, which alters the glycosylation motif at sites 162–164 [7]. Uganda A(H3N2) strains had more aa substitutions affecting the RBS [130-loop (T135K, A/S138S/A, I/R140K, R140I), 150-loop (Q/H156H/Q), 190-helix (I192V, P194L, A196T, Q197H/R, A/S/A/P/S198S/A/P/S/P), 220-loop (N/D225D/N, F/Y219S)], and those creating (S45N, A128T, K160T) and removing (N45S, N122S/D, N144K/S, T/N128A, T135K) potential N-linked glycosylation sites (Supplementary Table 2B).
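Potential N-linked glycosylation sites are conventionally recognised by the sequon N-X-S/T, where X is any residue except proline, so a single substitution can create or remove a site. A minimal sketch of that check follows; the function name and the toy peptides are invented for illustration and do not reproduce the tool used in the study.

```python
import re

# Sequon N-X-[S/T] with X != P; the lookahead lets overlapping motifs be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence: str, start: int = 1) -> list:
    """Return positions of the asparagine in every N-X-S/T sequon.

    `sequence` is an ungapped protein string; `start` is the position
    number assigned to its first residue.
    """
    return [match.start() + start for match in SEQUON.finditer(sequence)]


# Toy peptides numbered from position 157: changing K to T at position 160
# turns N-Y-K into N-Y-T and so creates a potential site at N158.
before = "ANYKG"
after = "ANYTG"
print(sequon_positions(before, start=157))  # []
print(sequon_positions(after, start=157))   # [158]
```

Comparing the position lists of a vaccine strain and a field strain shows which substitutions gain or lose a potential site.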

Subgroup analysis showed differences of 1–4 and 1–32 unique aa substitutions at antigenic sites of A(H1N1)pdm09 and A(H3N2) strains, respectively, sampled from different cases, genders, age groups, and geographical regions relative to each subtype's vaccines (Supplementary Tables 3, 4).

Amino acid similarity of complete HA, NA, and MP protein sequences of Uganda IAVs to vaccine strains

The complete HA (H1), NA (N1), and MP protein sequences of Uganda A(H1N1)pdm09 strains had 94, 81, and 21 unique aa substitutions and a mean amino acid similarity of 98.09 (96.99–99.65%), 98.2 (96.8–99.79%), and 99.17 (97.73–100%), respectively, compared to A/California/7/2009(H1N1), A/Michigan/45/2015(H1N1), and A/Brisbane/02/2018(H1N1) vaccines (Supplementary Table 5). All N1 proteins lacked the neuraminidase inhibitor (NAI) resistance substitution H275Y. However, 7.55% (8/106) had T362I (n = 1), I117M (n = 2), Y155H (n = 2), or V234I (n = 3), which are associated with reduced susceptibility to NAIs in vitro [18].

The complete HA (H3), NA (N2), and MP protein sequences of Uganda A(H3N2) strains had 160, 118, and 31 unique aa substitutions, and a mean amino acid similarity of 97.47 (95.23–99.29%), 97.64 (95.31–99.79%), and 99 (95.46–100%), respectively, compared to A/Perth/16/2009(H3N2), A/Victoria/361/2011(H3N2), A/Texas/50/2012(H3N2), A/Switzerland/9715293/2013(H3N2), A/Hong Kong/4801/2014(H3N2), A/Singapore/INFIMH-16-0019/2016(H3N2), A/Switzerland/8060/2017(H3N2), A/South Australia/34/2019(H3N2), and A/Kansas/14/2017(H3N2) vaccines (Supplementary Table 6). N2 proteins lacked the NAI-resistance substitution H274Y (N2 numbering), but 18.8% (19/101) carried Y155F (n = 3) or E/D221D/K/E (n = 16), which reduce susceptibility to NAIs [18].
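The similarity values above are reported as a mean with a range across strains. The text does not state how similarity was computed, so the sketch below assumes plain percent identity over aligned, ungapped positions; the function names and toy sequences are illustrative only.

```python
from statistics import mean


def percent_identity(reference: str, query: str) -> float:
    """Percent of aligned, ungapped positions at which the residues match."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for ref_aa, qry_aa in zip(reference, query):
        if ref_aa == "-" or qry_aa == "-":
            continue  # skip alignment gaps (an assumption of this sketch)
        compared += 1
        matches += ref_aa == qry_aa
    if compared == 0:
        raise ValueError("no comparable positions")
    return round(100 * matches / compared, 2)


def summarise(values):
    """Return (mean, minimum, maximum), each rounded to two decimals."""
    return round(mean(values), 2), round(min(values), 2), round(max(values), 2)


# Toy example: one vaccine peptide compared with three field peptides.
vaccine = "MKTIIALSYI"
field = ["MKTIIALSYI", "MKTIIAFSYI", "MKAIIAFSYI"]
print(summarise([percent_identity(vaccine, seq) for seq in field]))
```

Other definitions of similarity (for example, scoring conservative replacements as partial matches) would give different numbers, so the choice should be stated alongside the results.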

All A(H1N1)pdm09 and A(H3N2) M2 proteins had the primary adamantane-resistance marker (S31N), and 7.7% (8/104) of A(H3N2) had the secondary marker V27A relative to the adamantane-susceptible A/New York/392/2004(H3N2) strain (Supplementary Fig. 3). All Uganda IAV M2 proteins had aa substitutions (L3I, L4F, T5F, E6K, V7I, E8C, and T9R) fixed in their extracellular N-terminus, a region that supports M2-antibody interactions [19].

The influenza surveillance webtool (FluSurver) identified aa substitutions in complete HA and NA proteins reported to alter host specificity and cause mild/strong drug resistance, respectively, and aa substitutions in both proteins that could potentially alter viral virulence, antigenic drift, glycosylation, and sites of interactions (Supplementary Tables 5, 6).

Temporal and spatial divergence of Uganda IAVs

Uganda IAV strains phylogenetically clustered according to their year of sampling, with multiple lineages circulating annually (Fig. 2). Two major H1 lineages co-circulated; lineage 1 (shaded blue) with strains belonging to clade 6A and lineage 2 (shaded grey) with clade 6C, 6B, 6B.1, 6B.1A, 6B.1A.6 strains circulated in 2013–2016 and 2013–2018, respectively (Fig. 2A). The N1 and MP phylogenies showed similar lineages 1 and 2 emerged in 2013 and 2014, respectively, and lineage 2 dominated in 2016–2018.

Figure 2

Phylogenies showing the temporal divergence of the HA, NA and MP genes of Uganda A(H1N1)pdm09 (A) and A(H3N2) (B) influenza viruses sampled from 2010 to 2018. Trees were rooted using the oldest sequence in the dataset. Shaded clusters are the two and three major co-circulating lineages observed since 2013 and 2012 for A(H1N1)pdm09 and A(H3N2) viruses, respectively. The third A(H3N2) lineage (with one 2011 and 2016–2017 viruses) disappeared in the MP phylogeny.

The dominant H3 lineage 1 (pink) contained clade 3C.3b and 3C.3 strains that circulated from 2013 to 2016. Lineage 2 (grey) had clade 3C.3a strains and lineage 3 (blue) had clade 3C.2a, 3C.2a3, and 3C.2a1(a, b) strains that emerged in 2014 and 2015, respectively, and co-circulated through 2017. We observed two long-branched clusters (bootstrap = 100%), cluster 1 (KSW0659 and KSW0643, sampled in 2010) and cluster 2 (TOR0492, TOR1664, and NSY0304, sampled in 2013–2016), in the N2 and MP phylogenies, with 3–5 unique amino acid (9–13 nucleotide) substitutions absent in other Uganda strains (Fig. 2B).

Virus strains sampled from different geographical sites mixed in all phylogenies (Supplementary Fig. 4).

Viral clades circulating in Uganda

Uganda A(H1N1)pdm09 strains belonged to five global clades (A/Hong Kong/2212/2010(H1N1)-HK, 3, 5, 6, and 7) (Fig. 3A and Supplementary Fig. 5). The HK clade circulated in 2010 and had aa substitutions V19I, N97D, and S128P. Two novel clades H1-UG1 with P83S, D222E, and I267T, and H1-UG2 with T134A, P183S, and S185T, circulated in 2010 and 2011, respectively. Clade 3 had 2010–2011 strains with A134T and S183P. Clade 5 had 2011 strains with D97N, R205K, I216V, and V249L. One 2011 strain from Kisenyi clustered with A/St.Petersburgh/100/2011(H1N1) in clade 7 with A197T, S143G, and K163I. Clade 6 viruses with D97N, S185T, and S203T dominated since 2012 and diverged into 6A, 6B, and 6C. A novel subclade 6B.1A.6 with T120A emerged in November 2017 and dominated through 2018.

Figure 3

Genetic clades of influenza A viruses that previously circulated in Uganda during the 2010–2018 seasons. All labelled clades (indicated by black bars) were inferred based on the signature amino acid substitutions in the HA1 protein indicated on the tree trunk in bold. (A) Shows clades for 2010–2018 A(H1N1)pdm09 viruses. Novel clades H1-UG1 and H1-UG2 are indicated. Genetic clade 6 diverged into 6A, 6B, and 6C. All clade 3, 5, and 7 viruses were collected from Entebbe and Kampala (Central Uganda) and circulated in 2010–2011. (B) Shows clades for 2010–2017 A(H3N2) viruses. A novel clade H3-UG1 is indicated. Clade 3 persisted in all 9 years. A similar figure with the full sequence names is provided in Supplementary Fig. 5.

Uganda A(H3N2) strains belonged to two global clades, 3 and 7, which co-circulated in 2010–2012 (Fig. 3B). A novel clade H3-UG1 with L183H, T212A, S214I, and P289S circulated in 2010. The major clade 3 had strains with N145S and V223I and diverged into 3B, 3C.2, and 3C.3. New subclades 3C.2a1a with T135K and 3C.2a1b with K92R and H311Q dominated in May–June and May–November 2017, respectively.

Based on our dataset, similar clades circulated in Uganda and other African countries, except for A(H1N1)pdm09 clade 7, HK, and H1-UG2 (Supplementary Table 7). However, our recent systematic review showed that clade 7-like strains also circulated in Kenya, Tanzania, and South Africa between 2010 and 2012 [10].

Phylogenetic relatedness of Africa IAVs

We define a group as a highly-supported phylogenetic cluster (bootstrap ≥ 90%) with at least three Uganda IAV nucleotide sequences.
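The group definition above can be expressed as a simple filter over clusters taken from a tree. In this sketch each cluster is reduced to its bootstrap support and the country of origin of each tip; that data layout and the function names are assumptions made for illustration, and extracting clusters from the phylogeny itself is outside its scope.

```python
def is_group(bootstrap, tip_countries, min_bootstrap=90.0, min_uganda=3):
    """True if a cluster meets the stated definition of a group:
    bootstrap >= 90% and at least three Uganda sequences."""
    uganda_tips = sum(1 for country in tip_countries if country == "Uganda")
    return bootstrap >= min_bootstrap and uganda_tips >= min_uganda


def is_unique_to_uganda(tip_countries):
    """True if every sequence in the cluster was sampled in Uganda."""
    return len(tip_countries) > 0 and all(c == "Uganda" for c in tip_countries)


# Hypothetical clusters: (bootstrap support, countries of the tips).
clusters = [
    (98.0, ["Uganda", "Uganda", "Uganda"]),
    (95.0, ["Uganda", "Uganda", "Uganda", "Kenya"]),
    (85.0, ["Uganda", "Uganda", "Uganda", "Uganda"]),  # support too low
    (99.0, ["Uganda", "Kenya", "Tanzania"]),           # too few Uganda tips
]
groups = [c for c in clusters if is_group(*c)]
unique = [c for c in groups if is_unique_to_uganda(c[1])]
print(len(groups), len(unique))  # 2 1
```

The share of groups unique to Uganda is then the number of Uganda-only groups divided by the number of groups in that gene's phylogeny.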

Uganda A(H1N1)pdm09 strains collected before 2016 clustered uniquely towards the root, while the 2017–2018 strains mixed with Eastern, Central, Western, and Southern Africa strains (Supplementary Figs. 6–8). The H1, N1, and MP phylogenies contained 6, 5, and 1 group(s), with 50% (3/6), 60% (3/5), and 100% of the groups unique to Uganda, respectively (Supplementary Table 8A).

Uganda A(H3N2) strains collected in 2008–2016 and some before April 2017 clustered uniquely closest to the root (Supplementary Figs. 9–11). Notably, the 2008–2009 Makerere Walter Reed project (MWRP) [11] and our newly generated strain sequences clustered separately. The H3, N2, and MP phylogenies had 8, 10, and 2 groups, respectively, and 50% (4/8) of H3 groups (circulated in different years) and 40% (4/10) of N2 groups had only Uganda strains (Supplementary Table 8B).

If not clustered alone, Uganda strains grouped with strains from neighbouring Kenya, Tanzania, Madagascar, and Congo. Interestingly, four unique H3 lineages (bootstrap ≥ 90%) from Kenya (n = 1), Congo (n = 1), and West Africa (n = 2) co-circulated in 2019 (not shown). Virus group details are provided in Supplementary Table 8.
