Identification of ancestry-dependent and independent non-HLA SLE-associated variants and downstream target genes
Despite the success achieved by GWAS in mapping polygenic disease risk loci in SLE, the biological implications of the majority of identified variants has remained unknown. To gain a broader view of how inherited genetic variation impacts disease risk, we took the global approach of integrating SNPs with a range of association significance to generate a cohort of predicted genes that could ultimately be pruned and mapped to functional pathways for analysis. Immunochip-based association analyses have identified 700 single-nucleotide polymorphisms (SNPs) reported as significantly associated with SLE in patients of East Asian (AsA) ancestry6 and 757 SNPs associated with disease in European (EA) populations (Fig. 1A)8. Twenty SNP associations (< 1.5%) were shared between ancestries. In both ancestries, approximately 70% of SNPs were found in non-coding regions (intergenic and intronic), and 8% of SNPs were in coding regions (3′UTRs, 5′UTRs, synonymous and non-synonymous) (Fig. 1B and Supplemental Table S1). AsA populations had a significantly higher percentage of SLE-associated SNPs in non-coding (nc)RNAs (lncRNA and miRNA), whereas EA populations had more SLE-associated SNPs located within regulatory regions, including enhancers, promoters, open chromatin, and transcription factor binding sites (Fig. 1B).
We used a bioinformatics-based approach to identify the most plausible genes affected by the SLE-SNP association. As previously described11,12, we first determined whether there was evidence that the SNP was an expression quantitative trait loci (eQTL) using the GTEx database (version 8) and the Blood eQTL browser13. 226 EA- and 405 AsA- SLE-associated eQTLs linked to 730 and 1225 expression genes (E-Genes), respectively (Fig. 1C–E and Supplemental Table S2). eQTLs were identified for nearly 60% of the total number of AsA Immunochip SNPs, compared to 29% of SLE-associated SNPs in the EA data (Supplemental Fig. S1). Whereas the number of AsA eQTLs was higher across all genomic categories (coding, non-coding and regulatory regions as well as ncRNAs), the proportion of AsA SLE-associated SNPs linked to ncRNAs (51%; 42/82) was nearly three times higher than that observed in EA populations (18%; 8/44) (Supplemental Fig. S1 and Supplemental Table S1). Next, we sought to identify SNPs within distal and cis regulatory elements (e.g., enhancers and promoters), using the computational tools GeneHancer and HACER (Human ACtive Enhancers to interpret Regulatory variants), both of which connect regulatory SNPs with downstream target genes (T-Genes)14,15. Together, GeneHancer and HACER identified 105 SLE-associated SNPs (59 EA, 36 AsA) overlapping distal regulatory elements or promoters predicted to impact the expression of 964 T-Genes (617 EA, 350 AsA) (Fig. 1C–E and Supplemental Table S2). For variants located in coding regions, 44 SNPs (21 EA, 23 AsA) were associated with either non-synonymous or nonsense changes in 47 genes (C-Genes; 20 EA, 27 AsA) (Supplemental Table S2). The remaining SNPs that were not linked to E-, T- or C-Genes were assigned to the closest proximal gene (P-Gene), identifying an additional 956 P-Genes (487 EA, 469 AsA) (Supplemental Table S2).
Overlapping EA and AsA SNP-linked E-, T-, C- and P-Genes are depicted in Fig. 1D,E, respectively. No genes were shared within all four groups within either ancestry, and we observed limited commonality between T-, P- and E-Genes, with only 20 genes shared among the three groups in EA and 15 genes shared in AsA. It is notable that of the total of 3,432 SNP-linked genes, < 10% (327) overlapped between AsA and EA lupus cohorts (Fig. 2A).
Characterization of gene signatures
We next completed a series of bioinformatic analyses to determine the overall biological function of the 1349 genes associated with SLE in EA and 1429 genes associated with SLE in AsA, as well as the 327 SLE-associated genes common to both ancestries. Analysis of genes linked through EA revealed enrichment in processes related primarily to adaptive immune function, including the functional category for interferon stimulated genes, canonical pathways for TH1 and TH2 activation pathway and Macrophage classical activation signaling pathway, and GO terms for the Regulation of B cell proliferation (GO:0030888) and the Regulation of T cell proliferation (GO:0042129). Genes linked to SNPs associated with SLE in the AsA cohort were enriched in categories related to pathogen-influenced signaling, such as Role of PRRs in the recognition of bacteria and viruses, and the Positive regulation of lymphocyte differentiation (GO:0045621), as well as those representing more diverse biological functions, such as Regulation of oxidative stress-induced neuron death (GO:1903203) and DNA ligation involved in DNA repair (0051103). Shared genes were distributed in a range of adaptive and innate immune gene categories (Fig. 2B,D,E).
In addition, EA- and AsA-derived gene sets were examined using a clustering program that detects immune and inflammatory cell type signatures within large gene lists to identify dominant immune cell populations driving disease pathology within each ancestry16. Consistent with our pathway analysis, EA exhibited strong enrichment in cellular categories for myeloid, T, and B cells, whereas SLE-associated genes in AsA were not enriched in any cellular category (Fig. 2C). Independent analysis of shared genes revealed enrichment in the T, B and myeloid, and the NK or T cell categories. Finally, parallel analyses examining P-Genes separately from E-, T-, and C-Genes were conducted to assess the potential overrepresentation of immune-based processes because of the Immunochip design bias17. As expected, P-Genes (384 EA, 253 AsA) were enriched in immunologically-driven functional categories and pathways; exclusion of P-Genes resulted in only minor alterations to overall categorization in either ancestral background (Supplemental Fig. S2).
Delineation of signaling pathways identified by ancestry-specific SNP-associated genes
To assess ancestry-driven key signaling pathways in greater detail, ancestry-based protein–protein interaction (PPI) networks consisting of EA-associated, AsA-associated, or ancestry-independent genes were constructed using STRING, visualized in Cytoscape and clustered using MCODE (Supplemental Table S3). To provide an additional level of functional annotation, clusters contributing to overall immune function, tissue repair, mechanisms of cellular stress, cell motility, metabolic function or general cell function were grouped together. EA-associated genes were dominated by the functional category for interferon stimulated genes observed in cluster 2 (118 genes) (Fig. 3A), along with multiple canonical pathways related to the activation of pattern recognition receptors and downstream type I interferon signaling (Supplemental Table S4). Cluster 7 revealed additional enrichment in lymphocyte activation and differentiation, such as the TH1 and TH2 activation pathway that was also represented in the shared gene network, and cellular enrichment for cells of myeloid and/or lymphoid origin. Notably, the EA-associated network lacked evidence of cell motility and cell stress/injury, whereas metabolic function was represented by clusters 12 and 13 enriched in retinoid X receptor activation (LXR/RXR activation, PPARα/RXRα activation) involved in the regulation of lipid metabolism, inflammation, and cholesterol bile acid catabolism. Pathways associated with SLE in AsA were indicative of a diverse range of biological processes with protein metabolic functions dominating clusters 2 and 17 (Fig. 3B), whereas clusters 3 and 6 were enriched in multiple canonical pathways related to cytokine production and signaling (Supplemental Table S4). Interestingly, genes linked to SNPs associated in the AsA cohort did not include a unique interferon signature, but instead coalesced into multiple small clusters related to mitochondrial dysfunction (clusters 9 and 19) and metabolism, evident in clusters 16, 22 and 30. Additionally, AsA-associated gene clusters were enriched in chromatin remodeling found in cluster 1, along with evidence of cell motility (clusters 11, 12, 23 and 25). AsA cellular enrichment was dominated by monocytes and myeloid lineage cells.
Pathways exemplified by SLE-linked genes in both EA and AsA appear to be a blend of the pathways enriched within each ancestry. Common pathways included Interferon signaling, TH1 and TH2 activation pathway, Complement system and Leukocyte extravasation (Fig. 3C and Supplemental Table S5). Figure 3D depicts a selection of both the unique and overlapping canonical pathways motivated by the EA-associated and AsA-associated gene sets. We also carried out a parallel series of bioinformatics analyses to determine the biological function of the full array of EA (1676) and AsA (1756) SNP-predicted genes, including those associated with both ancestries (Supplemental Fig. S3 and Supplemental Table S5). As expected, shared genes were evenly distributed throughout each large network and subsequent connectivity mapping revealed the addition of several new clusters to both the EA and AsA networks. For example, the full EA network gained several clusters contributing to cell motility enriched in integrin signaling and granulocyte diapedesis (clusters 34 and 35), whereas the enlarged AsA network gained multiple clusters enriched in immune function (clusters 9, 12 and 31) and interferon signaling (cluster 3), as well as enrichment in a more diverse array of cell types, including T and, B cells, neutrophils and NK/T cells. Both networks acquired a small cluster enriched in the functional category for reactive oxygen species (ROS) protection (EA cluster 22, AsA cluster 24) driven by the glutathione peroxidases GPX4, an antioxidant enzyme involved in ferroptosis18, and GPX3 an ROS scavenger. Overall, the addition of the shared gene cohort to each network highlights many common pathways and biological functions, but still revealed ancestry-driven differences. Taken together, unique pathways identified via EA analyses appear dominated by immune-related signaling as well as T, B and myeloid cells, whereas those in AsA analyses are dominated by biological processes related to altered metabolism and mitochondrial dysfunction.
Validation of AsA-enriched molecular pathways using summary GWAS data
To test our pathway predictions, we combined summary data from previously reported GWA studies7,9 that identified 1350 SNPs associated with SLE in patients of East AsA ancestry (Supplemental Table S6). Of these SNPs, 68% were located in non-coding regions, 6.5% were in coding regions, 2.7% were in regulatory regions and 22% were located within or proximal to non-coding RNAs (Supplemental Fig. S4). Validation AsA-associated GWAS SNPs exhibited limited commonality when compared to Immunochip SNPs, with < 1% of either EA- or AsA-associated Immunochip SNPs overlapping GWAS SNPs, and only 3 SNPs common to all 3 datasets (Supplemental Fig. S4). We next applied our same bioinformatics-driven methodology to generate a validation gene cohort composed of 1321 E-Genes, 307 T-Genes, 17 C-Genes and 974 P-Genes (Supplemental Table S7). Connectivity mapping of all validation genes were used to create PPI networks that were clustered as described above (Fig. 4A). Examination of each cluster revealed functional similarity to those derived from AsA Immunochip-associated genes. For example, clusters 1, 3, 4, 5 and 6 share hallmarks of tissue repair and remodeling exemplified by categories for mRNA processing, pro-cell cycle and protein degradation (proteasome, lysosome, endocytosis). Additionally, we observed smaller clusters (21, 27 and 28) representative of processes involved in metabolic function, and clusters (13, 18 and 24) characteristic of cell stress and injury, including the Inhibition of ARE-mediated degradation pathway and Mitochondrial dysfunction canonical pathways (Supplemental Table S8). Cluster 9 contained a small interferon-stimulated gene signature consisting of IFI27, IFI44 and RSAD2 (Supplemental Table S3). Cellular categories were again dominated by monocytes, T cells, NK cells, B cells and plasmacytoid (p)DCs and are consistent with findings observed with AsA Immunochip-associated genes. In contrast to AsA-associated genes, where we observed large, highly connected clusters, an equivalent cohort of apparently random genes generally formed smaller clusters, exhibited fewer intra- and inter-cluster connections, and were primarily enriched in functional categories mostly related to basic cell function (general cell surface, secreted and ECM) (Fig. 4B,C). As shown in Fig. 4D, which displays the number of genes (and percentage of total genes) assigned to each functional category, random genes are skewed toward general cell function, whereas AsA-associated genes are more prevalent in the overall immune (15.3% of genes), tissue repair (53.4%) and cell stress (7%) categories. The random gene network also lacked evidence of cell movement and the diversity of cellular enrichment identified from AsA SNP-associated genes (Fig. 4B).
Validation of AsA-enriched molecular pathways using gene expression data
SNP-based pathway predictions were also tested in differential gene expression analyses from whole blood samples collected from AsA SLE patients with active disease and renal involvement compared with healthy controls (E-MTAB-11191, Supplementary Table S11). Overall, differential expression (DE) analysis revealed 5886 DE genes (DEGs) enriched in functional categories for interferon stimulated genes, gene expression, RNA processing and metabolism (Fig. 4 and Supplemental Fig. S5). A total of 685 AsA and 300 EA SNP-predicted genes were shared with AsA SLE DEGs, and 144 genes, representative of type I and type II interferon signaling, were shared among all three groups (Supplemental Fig. S6). Genes common to AsA DEGs and AsA SNP-predicted genes were enriched in RNA processing and translation, whereas DEGs shared with EA SNP-predicted genes were specifically enriched in type I interferon/cytokine signaling. Connectivity mapping and pathway analysis of AsA DEGs again revealed striking commonality with AsA SNP-associated genes from both the Immunochip and summary GWAS, exemplified by cluster enrichment in RNA splicing/processing, ubiquitylation, chromatin remodeling and metabolic function (glycolysis, TCA cycle, fatty acid biosynthesis and peroxisomes), and canonical pathways for Spliceosome cycle (cluster 2), Inhibition of ARE-mediated RNA degradation (clusters 12, 20 and 25), and Mitochondrial dysfunction (cluster 16) (Fig. 5A and Supplemental Table S9). The distribution of DE genes in AsA for each overall category were also similar to that observed with AsA-associated genes (Immunochip and GWAS) but differed from both the EA-Immunochip predicted or random genes (Fig. 5B).
To test the pathway predictions, Gene Set Variation Analysis (GSVA)19 was applied to determine the relative enrichment of gene signatures identified in peripheral blood mononuclear cell (PBMC) samples from SLE patients (EA and AsA) and controls (Supplemental Table S10). In FDAPBMC1, a dataset composed of EA patients (Supplemental Table S11), all 7 IFN gene signatures (IGS) and signatures for the RIG-I pathway and DNA/RNA sensors were strongly enriched in SLE PBMCs compared to controls (Fig. 6A). In contrast, only the signatures for IFNA2, IFNB1, IFNW1 and the Type I core were enriched in SLE PBMCs from AsA patients in GSE81622 (Fig. 6B). GSVA using a random group of genes did not separate SLE from controls in either dataset.
GSVA enrichment scores using signatures for complement activation and metabolic pathways, including mitochondrial oxidative phosphorylation (OXPHOS), the TCA cycle and glycolysis were able to separate AsA SLE patients, but not EA patients, from healthy controls (Fig. 6C,D). Consistent with our predictions, AsA SLE patients exhibited significant enrichment in signatures for mitochondrial dysfunction and oxidative stress. As shown in Fig. 6E–H, enrichment of a number of pathways and cell types in both EA and AsA SLE cohorts were noted, including TLR signaling, inflammasome, TNF signaling and monocyte/myeloid lineage cells, whereas AsA patients exhibited additional enrichment in pro-cell cycle, low density granulocytes (LDGs) and B cells. Varying degrees of T cell/NK cell lymphopenia were evident in both ancestral populations.
Differential Inflammatory cell metabolism and activation status by ancestry
Because dysfunctional metabolic reprogramming can directly influence and exacerbate defective immune responses, we carried out linear regression analysis between the GSVA scores for individual cell signatures and the glycolysis and oxidative phosphorylation gene signatures to interrogate the metabolic status of immune cells from EA and AsA PBMC datasets. Enrichment scores for the glycolysis gene signature in AsA SLE patients exhibited positive correlation with both the monocyte/myeloid (R2 = 0.14, p = 0.03) and B cell signatures (R2 = 0.12, p = 0.051) (Fig. 7A). In contrast, glycolysis was not associated with gene signatures representing monocyte/myeloid cells, T cells or NK cells from EA patients, whereas the B cell category exhibited significant negative correlation (R2 = 0.26, p = 0.03). The gene signature for oxidative phosphorylation lacked positive correlation with all cell types in AsA patients, but did exhibit significant correlation (R2 = 0.6, p = 0.0003) with T cells in individuals of EA ancestry (Fig. 7A).
Since altered immune cell metabolism is associated with heightened oxidative cell stress responses in SLE20, we used the same linear regression approach to test the relationship between different immune cell types and GSVA enrichment scores for oxidative stress (Fig. 7A). Monocyte/myeloid cells from AsA patients were the only cell type demonstrating a significant positive relationship with the gene signature for oxidative stress (R2 = 0.3, p = 0.0008). These results were confirmed in purified CD14+ monocytes isolated from EA and AsA SLE patients (California Lupus Epidemiology Study, CLUES; GSE16445721) (Supplemental Table S11) showing significant enrichment of gene signatures for oxidative stress, glycolysis and IFNA2, along with a trend toward elevated OXPHOS and mitochondrial dysfunction, specifically in AsA SLE patients (Fig. 7B). Given that enhanced oxidative stress and concomitant mitochondrial dysfunction have been shown to correlate with disease activity, DNA damage, and autoantibody production22,23,24, we next examined the potential link between enrichment scores for mitochondrial dysfunction and SLE disease activity index (SLEDAI) score in these patients. GSVA scores for mitochondrial dysfunction demonstrated a significant positive relationship with SLEDAI in AsA but not EA SLE monocytes (Fig. 7C). Similarly, enrichment scores for oxidative stress and mitochondrial dysfunction showed a significant positive correlation with anti-dsDNA titers among AsA SLE patients with active disease (SLEDAI ≥ 6) compared to EA patients (Fig. 7D). In addition, overall complement C3 levels were lower in AsA patients and demonstrated a significant negative correlation with both anti-dsDNA and SLEDAI (Supplemental Fig. S7) in accordance with the observation that complement depletion and anti-dsDNA antibodies are often associated with elevated disease activity25.
Finally, given that mitochondrial dysfunction and nucleic acid sensing are potent inducers of interferons26, we tested for a relationship between IFNA2 enrichment scores and enrichment scores for the mitochondrial dysfunction, DNA/RNA sensors and TLR pathways (Fig. 7E). In EA patients, GSVA scores for the DNA/RNA sensors (R2 = 0.62, p < 0.0001) and TLR pathway (R2 = 0.17, p = 0.0013) signatures, but not mitochondrial dysfunction, were associated with IFNA2. In contrast, all three signatures exhibited a positive, significant correlation with IFNA2 in AsA patients.
Read more here: Source link