A hypoxia-related signature in lung squamous cell carcinoma

Introduction

Lung cancer is the major leading cause of tumour-related deaths throughout the world, while lung squamous cell carcinoma (LUSC) as the second most common histological type of lung cancer.1 Each year, almost 1.8 million people are diagnosed with lung cancer worldwide and 400,000 of these die from LUSC.2,3 Due to the lack of early symptoms of LUSC, patients are often diagnosed at an advanced stage.4,5 In the past, there was a lack of effective targeted therapeutic options for patients with LUSC, and therefore the first choice for patients with LUSC remained traditional radiotherapy and chemotherapy.6 In recent years, although immune checkpoint blocker (ICB) therapy has been gradually used for LUSC patients, some LUSC patients still experience excessive progression after ICB therapy. Given the limitations of existing therapies such as immunotherapy and targeted therapy, the prognosis of LUSC remains poor.7,8 Therefore, there is an urgent need to investigate the underlying biological mechanisms and identify new therapeutic targets.

Hypoxia is one of the essential features of the tumor microenvironment (TME). During cancer progression, tumor cells induce hypoxia through various mechanisms, such as increasing metabolic rate and oxygen consumption leading to endothelial dysfunction or disrupting oxygen delivery due to various effects on blood vessels, forming a chronic hypoxic environment, activating hypoxia-inducible factor HIF signaling, accelerating tumor growth, increasing tumor aggressiveness, and promoting tumor metastasis.9–11 Not only that, tumor hypoxia has been reported to contribute to heterogeneous changes, genetic instability, angiogenesis, and resistance to chemoradiotherapy and targeted therapy.12–14 Hypoxia stimulates tumor progression and metastasis through physiological and genomic mechanisms and has become a poor prognostic factor for cancer assessment.15–17 Several studies have shown that genes associated with hypoxia have prognostic value in a variety of cancers. For example, overexpression of LBH in gliomas under hypoxic conditions is associated with poor prognosis.18 In lung adenocarcinoma, NLUCAT1 is a transcript that is strongly upregulated in hypoxia and has been shown to have an important prognostic value.19 However, the prognostic value of hypoxia-related genes have not been reported in LUSC.

In this study, we performed a systematic and comprehensive analysis of the hypoxia scores of LUSC. We found that a high hypoxia score implied a worse clinical outcome. Meanwhile, functional enrichment analysis and CIBERSORT demonstrated that a risk scoring system based on 3 HRGs was associated with the immune microenvironment. Moreover, we predicted potential compounds for prognostic genes by CTD, providing a theoretical basis for clinical translation efforts of the prognostic signature. In summary, we developed and validated a Hypoxia-related model containing 3 genes for LUSC, reflecting the cellular processes and immune microenvironment characteristics and predicting the prognostic outcomes and targeted compounds.

Materials and Methods

Data Collection

Transcriptome data in FPKM format and corresponding clinical information of 49 para-cancerous samples and 496 LUSC samples were collected from the TCGA database (portal.gdc.cancer.gov/). In this study, 20 LUSC samples with missing survival information were excluded, and the remaining 476 LUSC samples were utilized for hypoxia score estimation, WGCNA, and prognostic model construction. One set of independent microarrays, GSE73403 (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE73403), were extracted from the Gene Expression Omnibus (GEO, www.ncbi.nlm.nih.gov/geo/) database. The 69 LUSC patients with complete survival data and clinical information included in this dataset were used in the validation analysis of the prognostic signature. Specific sample information would be described carefully in the corresponding methods as well. The 200 hypoxia-related genes (Supplementary Table 1) were obtained from the HALLMARK_HYPOXIA attributed to the hallmark gene set according to MsigDB (www.gsea-msigdb.org/gsea/msigdb/).

GSVA Calculation of the Hypoxia Score

Hypoxia scores were analyzed with 200 hypoxia-associated genes and gene set variation analysis (GSVA), which is a non-parametric unsupervised algorithm that estimates changes in pathway activity in a sample population.20 In this study, the GSVA scores obtained by GSVA for each sample (n = 476) based on hypoxia-related genes were considered as hypoxia scores, representing the hypoxic status of each sample. LUSC patients were divided into high- (n = 193) and low– (n = 283) hypoxia score groups based on the cut-off value (cut off = 0.04) calculated using the best separation tool in the R package “survminer”. Subsequently, the difference in OS between the two groups was assessed by K-M curves. Moreover, Wilcox test was employed to determine the differences in clinical indicators between the groups. To avoid randomness, we used the “sample ()” function to randomly divide the TCGA-LUSC sample into two groups with the same sample size as the previously divided high-low hypoxia group and compared the difference in OS between the two groups. Also, the procedure was repeated 1000 times, and then a probability of P<0.05 was imputed to determine that the difference in OS between the high and low hypoxia scoring groups we identified was not a random effect. In addition, the TCGA-LUSC samples were also randomly divided into training and testing sets based on a 5:5 ratio. The hypoxia scores of the samples in the training and testing sets were calculated using the same method. The samples were divided into high and low hypoxia score groups according to the best cutoff value of hypoxia score in each data set, and OS differences were calculated.

Weighted Gene Co-Expression Network Analysis (WGCNA)

Genes with mean expression values (FPKM) >1 in all high- and low-hypoxia group samples (476 LUSC samples in total) were subjected to construction of co-expression networks using R package “WGCNA”. The high- and low-hypoxia scores were served as trait data. The “goodSamplesGenes” function was also used to perform sample clustering to identify and remove outliers. For making the co-expression network contented the distribution of scale-free network, a soft-thresholding power was computed with the pickSoftThreshold function. The dynamic tree cutting method was used to identify different modules, with the minimum number of genes in each module of 30. Subsequently, set a merging threshold of 0.2 to merge similar modules. Subsequently, correlations of modules and traits were calculated by Pearson correlation analysis. |correlation (cor)| > 0.3 and P < 0.05 was considered significantly correlated. The module that possessed the highest correlation coefficient with the trait was considered as the interesting module, and the genes in this module were considered as hub genes.

Identification of DEGs

The “limma” package of R 3.4.3 software was performed to identify the DEGs in LUSC (n = 496) vs para-cancerous (n = 49) samples and high-risk group vs low-risk group. The screening criteria were as follows: |log2 (fold change) FC| > 1 and adjust P-value < 0.05. The volcano map developed by the R package “ggplot2” was applied to exhibit the distribution of DEGs, while the heatmap generated by the “pheatmap” package displayed the expression pattern of DEGs. Overlapping genes of DEGs and hub genes were recognized as DE-HRGs.

Functional Enrichment Analysis

To further understand the potential biological functions of DE-HRGs and risk score-related DEGs (high-risk group vs low-risk group), we performed GO enrichment analysis, including cellular component (CC), molecular function (MF), and biological process (BP). Then, we performed KEGG pathway analysis. If the P-value is less than 0.05, the result is statistically significant. The functional enrichment analysis was implemented in the R package “clusterProfiler”.

DE-HRGs-Related Gene Signature Construction and Validation

Here, 476 TCGA-LUSC samples were randomly divided into a training set (n = 333) and an internal testing set (n = 143) in a 7:3 ratio. TCGA-training set was used to construct and evaluate the prognostic predictive validity of the prognostic signature. TCGA-internal testing set was employed for internal validation of the prognostic signature. Besides, the GSE3403 dataset was treated as an external validation set to verify the general applicability of the constructed prognostic signature. Feature genes were screened using K-M analysis and Cox analysis based on the identified DE-HRGs. Briefly, we performed univariate Cox analysis incorporating K-M analysis on DE-HRGs and selected variables with p-values less than 0.05 in both K-M analysis and univariate Cox analysis for inclusion in the multivariate Cox analysis with stepwise regression to determine the best variables for the prognostic signature. This study used a risk scoring system to assess the ability of the prognostic signature. The risk score was calculated as follows: Risk score = β1* Exp1 + β2* Exp2 + …. + βi* Expi, where β represents the coefficient value, and Exp represents the level of gene expression. The corresponding risk scores were calculated for the LUSC samples in each dataset (TCGA-training set, TCGA-internal testing set, and external validation set).

Based on the above formula, the risk score of the LUSC samples in each dataset was calculated by R package “predict” to obtain the median value that could classify the samples in each dataset into high-risk and low-risk groups. Subsequently, K-M curve analysis generated by the R package “survminer” was carried out in the TCGA-training set and the internal testing set to assess OS, Event Free Survival (EFS), and Cumulative Incidence in Relapse differences between high- and low-risk groups. Prognostic receiver operating characteristic (ROC) curves (“survival ROC” package) were then created at 1, 3, and 5 years. To test the accuracy of the prognostic model, we repeated the above analysis in an external validation set (GSE73403 dataset).

To further determine the necessity of our method for screening the 3 DE-HRGs used to construct the prognostic signature, we randomly selected 3 genes from 376 DE-HRGs to construct the control prognostic signature (random prognostic signature) and assessed their prognostic predictive validity in the TCGA-LUSC dataset using the same method. The coefficients of each gene were calculated using multivariate Cox analysis.

Independent Prognostic Analysis

To investigate whether prognostic characteristics could be independent of other clinical parameters (including sex, age, pathologic stage, pathologic T, pathologic N, and pathologic M), univariate and multivariate Cox regression model analyses were performed in TCGA-training set. P < 0.05 was considered statistically significant. Ultimately, variables with P < 0.05 in the multivariate Cox analysis were identified as independent prognostic factors for LUSC.

CIBERSORT

CIBERSORT is a deconvolution algorithm using a gene expression signature consisting of 547 genes.21 This algorithm can determine the genetic composition of each cell by calculating the expression level of each gene in each immune cell, thus performing gene expressionism analysis of 22 immune cells. In the present study, we proposed to use CIBERSORT to analyze the composition and proportion of 22 immune infiltrating cells in the high- and low-risk groups of TCGA set. It is important to note that CIBERSORT derives inverse convolution p-values for each sample. Only samples meeting P < 0.05 were included in the follow-up analysis. A total of 438 samples met the above requirement, 225 in the high-risk group and 213 in the low-risk group. Additionally, Spearman correlation analysis examined the relationship between 7 prognostic genes and 22 immune cells. |correlation coefficient (cor)| ≥ 0.3 and P < 0.05 was considered significant.

Prediction of Chemotherapy

Chemotherapy response was predicted for each LUSC patient in TCGA database according to the Genomics of Drug Sensitivity in Cancer (GDSC) database. The GDSC database collects data on the sensitivity and response of a large number of tumor cells to drugs.22 In the present study, the measurement to assess drug sensitivity was the half maximal inhibitory concentration (IC50). We performed a ridge regression model based on the expression profile of GDSC cancer cell lines using the R package pRRophetic23 to predict the IC50 values of each drug. We then compared the differences between the low-risk and high-risk groups to determine whether the two groups exhibited different drug sensitivities. For drug selection, we mainly considered the current third-generation chemotherapeutic agents for LUSC, including gemcitabine,24,25 paclitaxel,26,27 and pemetrexed.28,29 Unfortunately, the GDSC database does not record information about pemetrexed.

Comparative Toxicogenomics Database (CTD)

CTD (ctdbase.org/) serves as an innovative digital ecosystem capable of linking a given gene to toxicological information about chemicals, genes, phenotypes, diseases and exposures. In this study, we predicted drug candidates for selected prognostic genes in Homo sapiens only.

MetaboAnalyst

MetaboAnalyst (version 5.0; www.metaboanalyst.ca/home.xhtml) is a web-based platform for comprehensive analysis of quantitative metabolomic data. This study was based on this platform for functional analysis of compounds.

Statistical Analysis

The R programming language was used for all analyses. All survival analysis was implemented in the R package “survival”. The results of K-M plots were displayed as p-values of Log rank tests. Overlap analysis and Venn diagram were implemented in the Jvenn (bioinfo.genotoul.fr/jvenn) online tool. The Wilcox assay was used to reveal differences in the abundance of immune infiltrating cells between high- and low-risk groups. The visualization of the prognostic gene-compound-pathway network was performed by Cytoscape. P < 0.05 was considered to be consistent with statistical significance if not otherwise stated.

Results

Evaluation of the Degree of Hypoxia

The distribution of hypoxia scores of TCGA-LUSC (n = 476) samples calculated by GSVA based on 200 hypoxia-related gene expression profiles ranged from −0.487 to 0.560 (Supplementary Table 2). The 476 TCGA-LUSC patients were categorized into 193 in the hypoxia score-high group and 283 in the hypoxia score-low group according to the optimal cut-off value of the hypoxia score (cut-off = 0.04, Figure 1A). We initially focused on the prognostic impact of hypoxia. K-M curves suggested that patients with high hypoxia scores were associated with worse OS (P = 0.04, Figure 1B). As shown in Figures 1C–H, the differences in hypoxia scores between patients stratified according to different clinical characteristics were not statistically significant. Random effects analysis showed that after performing 1000 replications, the TCGA-LUSC sample was randomly divided into two groups containing 238 and 193 cases with a significant probability of OS difference (P < 0.05) at 0.048 (Supplementary Figure 1A and B). Furthermore, we randomly divided the 476 LUSC samples into a training set (n = 238) and a testing set (n = 238) according to a 5:5 ratio. And the respective optimal cutoff values were calculated based on the hypoxia scores of all samples in the training and testing sets (cutoff value training set = 0.28, cutoff value testing set = −0.15; Supplementary Figure 1C and D). Based on the respective cutoff values, the clinical outcomes of populations with high- and low-hypoxia scores could be well distinguished between the training and testing sets (P < 0.05; Supplementary Figure 1E and F). This evidence indicated that hypoxia affected the outcome of patients with LUSC.

Figure 1 Hypoxia affected the outcome of TCGA-LUSC patients. (A) 476 LUSC patients were divided into two groups: hypoxia score-high group (n = 193) and hypoxia score-low group (n = 283), according to their optimal cut-off value (0.04). (B) Analysis of overall survival (OS) based on hypoxia score-high group, n = 193; hypoxia score-low group, n = 283. Wilcox analysis was performed to evaluate the differences of hypoxia score in different clinical characteristics. (C) Distribution of hypoxia score of patients younger than 65 and those older than 65 years of age. (D) Hypoxia score of different gender. (E) Hypoxia score of different stages. (F) Distribution of hypoxia score of patients with different T stage. (G) Distribution of hypoxia score of patients with (N1-3) or without (N0) lymph node metastasis. (H) Distribution of hypoxia score of patients with (M1) or without (M0) distant metastasis.

Identification of the Closely Connected Module Genes for Hypoxia

In the R package WGCNA, we constructed co-expression networks using genes with FPKM > 1 in 476 TCGA-LUSC cases with high- and low-hypoxia scores as clinical traits, aiming to assess the modules of interest and hub genes most associated with hypoxia scores. The obvious outliers samples (n = 33; Supplementary Table 3) above the red line were removed by clustering, and β = 4 (scale-free R2 = 0.9) was selected to construct a scale-free network (Figure 2A and B). Ultimately, 24 modules were identified according to the average hierarchical clustering and dynamic tree clipping (Figure 2C). Subsequently, we assessed each module’s correlation with two clinical traits (hypoxia score-high and hypoxia score-low). The results indicated that the black module had the highest correlation not only with the high hypoxia score (cor = 0.5, P = 2e-29) but also with the low hypoxia score (cor = −0.5, P = 2e-29) (Figure 2D). Thus, we extracted 3510 genes from the black module for subsequent analysis (Figure 2E; Supplementary Table 4).

Figure 2 Recognition of hub genes related to hyper-hypoxia score by WGCNA. (A) Sample dendrogram. The cut height was set as 34,000 with 33 deviated samples. (B) Analysis of network topology for various soft-thresholding powers. The x-axis reflects the soft-thresholding power (left). The y-axis reflects the scale-free topology model fit index. The x-axis reflects the soft-thresholding power. The y-axis reflects the mean connectivity (degree) (right). Numbers in the plots indicate the corresponding soft thresholding powers. The approximate scale-free topology can be attained at the soft-thresholding power of 4. (C) Clustering dendrogram of genes. Branches with different colors correspond to 24 different modules. (D) Module–trait associations. Each row corresponds to a module, and each column corresponds to a trait. Each cell contains the corresponding correlation and P value. The table is color-coded by correlation according to the color legend. (E) Scatter plot analysis of modules in the black.

Identification and Functional Enrichment Analysis of DE-HRGs

In the TCGA-LUSC dataset, 3055 DEGs were identified between LUSC (n = 476) and normal (n= 49) samples, among which 1459 DEGs were upregulated and 1596 DEGs were downregulated (Figure 3A; Supplementary Table 5). The heatmap displayed the expression patterns of the top 50 up- and down-regulated genes. (Figure 3B). By taking the intersection of DEGs and black module genes, we obtained 376 overlapping genes, namely DE-HRGs (Figure 3C; Supplementary Table 6).

Figure 3 Differentially expressed and functional enrichment analysis. (A) Volcano plot of DEGs. Red represented upregulated genes, and blue indicated downregulated genes. (B) Expression differences of the top 50 DEGs. Red represented high expression, green represented low expression. (C) Venn diagrams showing the number of common genes in DEGs and black module. (D) Gene Ontology analysis of the DE-HRGs regarding biological process, cellular component, and molecular function. (E) All significant KEGG pathways. The x-axis showed the number of genes enriched in this term.

To further comprehend the latent functions of 376 DE-HRGs in LUSC, we performed GO and KEGG enrichment analyses. The top 10 components of BP, CC, and MF were illustrated in Figure 3D. Interestingly, we discovered that the DE-HRGs were remarkably enriched in numerous cell cycle-related processes, such as nuclear division (BP), condensed chromosome (CC), and microtubule binding (MF) (Supplementary Table 7). KEGG pathway analysis demonstrated that the DE-HRGs were mainly involved in cell cycle-related pathways (Figure 3E, Supplementary Table 8). Combining the above results, we speculated that the DE-HRGs might participate in the occurrence and development of LUSC by regulating the cell cycle process.

Construction and Verification of the Prognostic Model

In the TCGA-training set, univariate Cox regression with K-M survival analyses were performed on 376 DE-HRGs, and 6 genes (HELIS, GPRIN1, TARBP1, CST3, FAM83A, and LY6D) related to the OS of LUSC were selected (P < 0.05; Figure 4A; Supplementary Table 9). Subsequently, stepwise multivariate Cox regression analysis was used to construct a prognostic model for LUSC, which was composed of HELLS, GPRIN1, and FAM83A (Figure 4B; Supplementary Table 10). The risk score of each LUSC patient in the TCGA-training set was calculated according to the prognosis model: Risk score = −0.10399 × expression of HELLS + 0.07685 × expression of GPRIN1 + 0.00738 × expression of FAM83A. According to the median value of the risk score, the LUSC cases in the training set were assigned into a low-risk group and a high-risk group.

Figure 4 Construction of the risk score signature using 376 DE-HRGs. (A) Univariate Cox regression analysis of six DE-HRGs in TCGA-training set. (B) Identification of three prognostic genes in TCGA-training set and the coefficients constructed using the multivariate Cox method.

We performed a survival analysis of 333 LUSC patients from the TCGA-training set and found that OS (P = 0.013; Figure 5A) and EFS (P = 0.009; Figure 5B) were significantly lower in the high-risk group than in the low-risk group, and conversely, Cumulative Incidence in Relapse was significantly lower in the low-risk group than in the high-risk group (P = 0.018; Figure 5C). The AUC for 1-, 3-, and 5-year survival rate were 0.614, 0.660, and 0.650, respectively (Figure 5D). As the risk score increased, the mortality of patients also increased. Compared with the low-risk group, the expression levels of FAM83A and GPRIN1 in the high-risk group were higher, while the expression level of HELLS was lower (Supplementary Figure 2A). Furthermore, we adopt analogous methods to further evaluate the prognostic performance of the 3 gene features in TCGA-internal testing set and GEO-external validation set. It was consistent with the result of analysis in the TCGA-training set, the patients in the high-risk group were significantly associated with a poorer survival rate compared to the patients in the low-risk group (P < 0.05, Figure 5E and I). Similarly, in the TCGA-internal testing set, patients in the high-risk group had a significantly lower EFS (P = 0.01; Figure 5F) than in the low-risk; the difference in Cumulative Incidence in Relapse between the two groups was not significant, but overall, patients in the high-risk group had a slightly higher Cumulative Incidence in Relapse than in the low-risk group (P = 0.9; Figure 5G). Unfortunately, information on patients’ EFS and Cumulative Incidence in Relapse was not recorded in the external validation set. The AUC for survival rate was 0.686 at 1 year, 0.614 at 3 years, and 0.604 at 5 years in TCGA-internal testing set (Figure 5H); 0.565 at 1 year, 0.632 at 3 years, and 0.647 at 5 years in the GEO-external validation set (Figure 5J). The Supplementary Figure 2B and C were displayed the distribution of the risk score, survival status, and 3 gene expression profiles between the two risk groups in TCGA-internal testing set and GEO-external validation set, respectively.The above evidence suggests that our prognostic characteristics based on the 3 DE-HRGs constructed by Cox regression analysis are effective in predicting patient prognosis.

Figure 5 The prognostic value of the risk score signature both in TCGA and GEO datasets. (A and B) Survival analysis of patients in the high- and low-risk score groups in TCGA-training group. (C) Distribution of patients with different risk scores in TCGA-training group. (D) The area under the curve (AUC) of ROC curves were 0.614, 0.660, and 0.650 in predicting 1-, 3-, and 5-year OS events from TCGA-training group, respectively. (EH) Survival status of patients with different risk scores in TCGA-internal testing group: OS curve, EFS curve, Cumulative Incidence in Relapse curve, and ROC analysis of the three-gene signature in the testing set of TCGA cohort. (I and J) Validation of the risk score signature in the GSE73403 dataset using the same analysis.

Independent Prognostic Analysis of Risk Model

Univariate and multivariate Cox regression analyses were conducted in the training set to ascertain the independence of the 3-gene signature in clinical application. It was exhibited that pathologic stage (HR = 1.266, 95% CI = 1.071–1.495, P = 0.006), pathologic T (HR = 1.307, 95% CI = 1.089–1.568, P = 0.004), pathologic M (HR = 1.277, 95% CI = 1.054–1.547, P = 0.012), and risk score (HR = 2.263, 95% CI = 1.566–3.271, P < 0.001) were obviously connected with OS in the univariable Cox analysis (Figure 6A; Supplementary Table 11). The results of multivariate Cox regression analysis displayed that the risk score (HR = 2.235, 95% CI = 1.635–3.392, P < 0.001) and pathologic m (HR = 1.280, 95% CI = 1.051–1.558, P = 0.014) were the independent prognostic indicators for LUSC (Figure 6B; Supplementary Table 12).

Figure 6 Univariate and multivariate association of the prognostic model and clinicopathological characteristics with overall survival. (A) Univariate prognostic analyses. For a clinical parameter or riskScore, if the P < 0.05, it was related to survival; if HR > 1, it was a high-risk factor. (B) Multivariate prognostic analyses. For the results of univariate and multivariate independent prognostic analysis, if the riskScore P-value of both was < 0.05, it indicated that riskScore was independent of other clinical parameters and could be used as an independent prognostic factor in clinical practice.

Revealing the Potential Functions of Risk Score-Related DEGs

To further understand the potential mechanisms by which risk score predicted the outcome of LUSC patients, we first identified a total of 14 up-regulated genes and 9 down-regulated genes between high- and low-risk groups based on the R package limma (Figure 7A and B; Supplementary Table 13), and these DEGs were considered as risk score-associated DEGs. Subsequently, GO analysis was used to reveal the potential functions of risk score-related DEGs (Supplementary Table 14). Figure 7C was illustrated the top 10 enriched terms in the three categories of the GO system. We focused on the results of the GO analysis and found that immune response (“leukocyte aggregation” and “leukocyte migration involved in inflammatory response”) related terms were significantly enriched. KEGG analysis indicated that the IL-17 signaling pathway was the most enriched pathway (Figure 7D; Supplementary Table 15). This evidence suggested that the risk scoring system might be mediating the immune microenvironment of LUSC.

Figure 7 Functional annotation of DEGs between high- and low-risk groups. (A) Volcano plot of risk score-related DEGs. Red represented upregulated genes, and blue indicated downregulated genes. (B) Expression differences of DEGs. Red represented high expression, green represented low expression. (C) Gene Ontology analysis of the risk score-related DEGs regarding biological process, cellular component, and molecular function. (D) All significant KEGG pathways. The x-axis showed the number of genes enriched in this term.

Furthermore, interestingly, we found that the results of the GO (Supplementary Table 16) and KEGG (Supplementary Table 17) enrichment analyses of the 39 DEGs (Supplementary Table 18) between the high and low hypoxia score groups were strikingly similar to those of the risk score-related DEGs (Supplementary Figure 3AD). Subsequently, Sankey plots showed that most of the samples in the high hypoxia score group were also classified in the high-risk group (Supplementary Figure 4).

Risk Scoring System-Mediated Alterations in the Immune Landscape of LUSC

Inspired by the above results, we explored the effect of a risk scoring system constructed based on 3 HRGs on the immune microenvironment of LUSC. The abundance of 22 immune cell subtypes in each LUSC sample for the high (n = 225) low (n = 213) risk group was shown in Figure 8A. Further, the violin plot revealed that patients in high-risk group had a significant decrease in the fraction of “B cells naive”, “T cells CD8”, “T cells follicular helper”, and “Macrophages M1”, while a remarkably increased in the proportion of “T cells CD4 memory resting”, “Macrophages M0”, “Dendritic cells activated”, and “Neutrophils” (Figure 8B). Subsequently, Spearman correlation analysis pointed out that the prognostic genes GPRIN1 (cor = −0.301, P = 1.28E-10) and FAM83A (cor = 0.356, P = 1.52E-14) were significantly and weakly correlated with Neutrophils (Figure 8C; Supplementary Table 19).

Figure 8 Immune infiltrating cells profile in LUSC and correlation analysis. (A) Barplot showing the proportion of 22 kinds of immune infiltrating cells in the high- and low-risk LUSC samples. Column names of plot were sample ID. (B) Violin plot of 22 immune cells content in the high-risk and low-risk group. Red color represented high-risk group while blue color represented low-risk group. Differential immune cell type expression was observed between the high- and low-risk groups. (C) Heatmap showing the correlation between 22 kinds of immune cells and prognostic genes. The shade of each tiny color box represented corresponding correlation value between two features, and Spearman coefficient was used for significance test. ***P < 0.001.

Prediction of Potential Compounds Targeting Prognostic Genes

Next, to improve the ability of the clinical utility of the risk scoring system, we identified the sensitivity of patients in the high- and low-risk groups to current third-generation chemotherapeutic agents for LUSC using the GDSC database. The results showed significantly lower IC50 values for gemcitabine in the low-risk group, indicating that patients with LUSC exhibiting low risk scores were more sensitive to gemcitabine (Figure 9A). However, the IC50 values of paclitaxel did not differ significantly between the high- and low-risk groups (Figure 9B). Due to limited database information, our risk scoring system seems to be currently associated with Gemcitabine only. To facilitate the development of new LUSC drugs, we used the CTD database to predict compounds capable of targeting prognostic genes. Ultimately, a total of 92 compounds were predicted, of which HELLS had 81 acting relationships with 65 compounds; 21 compounds constituted 23 interacting relationships with GPRIN1; and FAM83A had 39 pairs of relationships with 25 compounds. Benzo(a)pyrene, Estradiol, Tobacco Smoke Pollution, and Valproic Acid were the common compounds for the 3 prognostic genes, but their interaction relationships were not identical. In detail, Benzo(a)pyrene led to increased HELLS and GPRIN1 mRNA expression, increased GPRIN1 5’ UTR methylation, decreased FAM83A exon methylation, and also affected FAM83A promoter methylation; Estradiol resulted in increased HELLS mRNA expression, and GPRIN1 and FAM83A mRNA expression was due to co-treatment of Estradiol with TGFB1 protein; Tobacco Smoke Pollution contributed to decreased HELLS mRNA expression but led to increased GPRIN1 and FAM83A mRNA expression and also affected FAM83A protein expression; Valproic Acid was able to affect HELLS mRNA expression and lead to increased methylation of GPRIN1 and FAM83A genes. Detailed information on the above results was presented in Supplementary Table 20. Further, we performed a functional network analysis of these compounds via the MetaboAnalyst 5.0 online network. From Figure 9C, Aflatoxin B1, which targeted FAM83A, and Bazedoxifene and 7,8-Dihydro-7,8-dihydroxybenzo(a)pyrene 9,10-oxide, which targeted HELLS, were all involved in the Metabolism of xenobiotics by cytochrome P450; Testosterone, Progesterone, and Estradiol, which could target HELLS, were associated with Steroid hormone biosynthesis; Steroid biosynthesis and Glyoxylate and dicarboxylate metabolism were associated with Calcitriol and Oxygen, which can target HELLS, respectively; Tretinoin, which would target FAM83A, was involved in Retinol metabolism.

Figure 9 Drug sensitivity analysis and potential compound prediction. (A) IC50 values of gemcitabine in high- and low-risk groups. (B) IC50 values of paclitaxel in high- and low-risk groups. (C) The prognostic gene-compound-pathway network consisting of 15 nodes and 18 edges. Red rounded rectangles represent functional pathways, yellow diamonds indicate compounds, and light blue ovals indicate prognostic genes.

Discussion

Lung squamous cell carcinoma (LUSC) is one of the leading causes of cancer-related death in the world, and the development of LUSC is a complicated process influenced by various factors.1–3 Although molecularly targeted drugs have made great progress in the treatment of LUAD due to the presence of EGFR mutations and ALK fusions.30 Unfortunately, these phenomena are not present in LUSC,31–33 resulting in the difficulty of many effective target drugs to exert therapeutic effects in LUSC, so LUSC patients have been lacking effective targeted therapeutic options.

Hypoxia is an important factor in the development and progression of malignant tumours, and the hypoxic microenvironment of tumour tissue can exacerbate disease progression and metastasis through physiological and genomic mechanisms.15 Meanwhile, several studies have demonstrated that hypoxia-related genes can be used in the prognosis of other types of cancer, such as melanoma,34 breast cancer35 and hepatocellular carcinoma.36 However, systematic studies on the prognostic role of hypoxia-related genes in LUSC are still scarce.Therefore, there is urgent to explore biomarkers and therapeutic targets potentially associated with the hypoxic tumor environment to develop better individualized treatment plans and improve LUSC patients’ prognosis.

To confirm the correlation between hypoxia and LUSC, we verify hypoxia-related genes’ prognostic value, the GSVA algorithm calculated gene expression profiles and hypoxia scores for all LUSC samples. Then patients were divided into high and low hypoxia groups based on hypoxia scores. Patients with higher hypoxia scores had poorer survival rates, and we identified 376 DE-HRGs as significantly hypoxia-related genes correlated with LUSC by intersecting DEGs and critical modules. Hypoxic activity in the tumour microenvironment is complex and previous studies have demonstrated that the hypoxic microenvironment of tumours can promote tumour cell proliferation.37–39 In this study, we found that these DE-HRGs were significantly enriched by GO enrichment analysis in several cell cycle-related processes, including nuclear division, chromosome condensation and microtubule binding. KEGG pathway analysis indicated that DE-HRGs were mainly involved in cell cycle-related pathways. Combining the above results, we speculated that the hypoxia might participate in the occurrence and development of LUSC by regulating the cell cycle process.

In this study, we identified three genes (HELLS, GPRIN1 and FAM83A) associated with hypoxia that are closely associated with LUSC.The DNA helicase HELLS, which is central to tumour proliferation and progression, has been shown to be expressed in many tumours and plays a relevant role in the transcriptional and genomic stabilisation of cancer by transcriptionally regulating genes involved in cleavage furrow regulation to promote tumour cell division.40 Furthermore, it has been shown that G protein-regulated neuronal growth inducer 1 (Gprin1) may promote lung cancer proliferation and migration by affecting the epithelial-mesenchymal transition of lung cancer cells and may be an effective target for the treatment of lung cancer.40 As a member of sequence similarity 83, FAM83A has also been found to promote the progression of various cancers such as breast, cervical and lung cancers by activating signalling pathways such as epidermal growth factor receptor EGFR/PI3K/AKT.41–43 However, the above three signature genes have rarely been studied in the context of hypoxia and immune binding.To further understand the potential mechanisms underlying the outcome of hypoxic LUSC patients, we performed GO and KEGG enrichment analysis on DEGs. The results both showed that DEGs were involved in the immune response. This analysis reveals an association between hypoxia and the immune microenvironment of LUSC and provides insight into potential therapeutic mechanisms that may underlie individualised treatment of LUSC patients.

To date, many studies have confirmed that most tumors have hypoxic zones and that the secondary formation of hostile metabolic and physical microenvironments leads to an imbalance of positive and negative regulators of processes such as activation and dysregulation of angiogenesis, demyelination and inflammation, Thus the development of abnormal vascular and hypoxic microenvironments promotes abnormal angiogenesis, demyelination and inflammation, all of which contribute to tumor development and resistance to therapy.44,45 It has been reported that inflammation may promote cellular transformation and that approximately 25% of cancer cases are associated with chronic inflammation of different origins.46 Not only that, the hypoxic state of tumor tissue is important in promoting immunosuppression of tumors. Tumor hypoxic areas can recruit immunosuppressive cells, such as tumor-associated macrophages (TAM), and can inhibit the activation of immune cells such as T cells.47,48 These reports coincide with our analysis, where we found significantly higher rates of “T-cell CD4 memory quiescence”, “macrophage M0”, “dendritic cell activation” and “neutrophils” in hypoxic high-risk patients, and a strong correlation between the model genes GPRIN1 and FAM83A for neutrophils. Given the central role of hypoxia in regulating LUSC progression and immunosuppression, it may provide translational value for the clinical management of LUSC patients. Finally, after determining the relationship between hypoxia and the immune microenvironment of LUSC, it may be necessary to provide potential therapeutic agents. We used the GDSC database to screen hypoxic LUSC for sensitive and selective drugs. The screening results showed that gemcitabine showed potential sensitivity and selectivity for LUSC with low risk of hypoxia. To facilitate the development of new LUSC drugs, we used the CTD database to predict compounds that could target prognostic genes. Ultimately, a total of 92 compounds were predicted, with Benzo(a)pyrene, Estradiol, Tobacco Smoke Pollution, and Valproic Acid as common compounds for these 3 prognostic genes, which provides new insights into future gene targeting therapy for hypoxic LUSC patients.

Fortunately, this is the first comprehensive study to develop and validate hypoxia-related gene signatures to predict the prognosis of LUSC. Our research aimed to explore the hypoxia-related genes and construct reliable models for predicting prognosis, cellular processes, immune microenvironment, and targeted compounds in lung squamous cell carcinoma. Our research provides new insights into the clinical management of patients with LUSC.

Data Sharing Statement

The datasets generated during the current study are available in the TCGA database (portal.gdc.cancer.gov/) and GEO database (www.ncbi.nlm.nih.gov/geo/).

Ethics Approval and Consent to Participate

The Ethics Committee of Affiliated Hospital of Nantong University has granted exemptions from approval for research related to the use of such public databases.

Funding

Qun Xue provided all the funding for this study.

Disclosure

The authors report no conflicts of interest for this work and declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Torre LA, Bray F, Siegel RL, et al. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87–108. doi:10.3322/caac.21262

2. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer. 2015;136:E359–386. doi:10.1002/ijc.29210

3. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394–424. doi:10.3322/caac.21492

4. Khozin S, Miksad RA, Adami J, et al. Real-world progression, treatment, and survival outcomes during rapid adoption of immunotherapy for advanced non-small cell lung cancer. Cancer. 2019;125:4019–4032. doi:10.1002/cncr.32383

5. Miller KD, Nogueira L, Mariotto AB, et al. Cancer treatment and survivorship statistics, 2019. CA Cancer J Clin. 2019;69:363–385. doi:10.3322/caac.21565

6. Claiborn K, William G, Kaelin J, Gregg L. Semenza receive the 2012 ASCI/ Stanley J. Korsmeyer Award. J Clin Invest. 2012;122(4):1136–1137. doi:10.1172/jci63264

7. Chun YS, Lee KH, Choi E, et al. Phorbol ester stimulates the nonhypoxic induction of a novel hypoxia-Inducible factor 1α iso-form. Cancer Res. 2003;63(24):8700–8707.

8. Lind H, Zienolddiny S, Ekstrom PO, et al. Association of a functional polymorphism in the promoter of the MDM2 gene with risk of non small cell lung cancer. Int J Cancer. 2006;119(3):718–721.

9. Kiyohara C, Horiuchi T, Takayama K, et al. Genetic polymorphisms involved in the inflammatory response and lung cancer risk: a case – control study in Japan. Cytokine. 2014;65(1):88–94.

10. Manoochehri Khoshinani H, Afshar S. A double-edged sword in cancer therapy. Cancer Invest. 2016;34(10):536–545. doi:10.1080/07357907.2016.1245317

11. Whiteside TL. The tumor microenvironment and its role in promoting tumor growth. Oncogene. 2008;27(45):5904–5912. doi:10.1038/onc.2008.271

12. Al Tameemi W, Dale TP, Al-Jumaily RMK. Hypoxia-modified cancer cell metabolism. Front Cell Dev Biol. 2019;7:4.

13. Zhao C, Luo C, Wu X. Hypoxia promotes 786-O cells invasiveness and resistance to sorafenib via HIF-2α/COX-2. Med Oncol. 2015;32(1):419. doi:10.1007/s12032-014-0419-4

14. Bielecka Z, Malinowska A, Brodaczewska K, et al. Hypoxic 3D in vitro culture models reveal distinct resistance processes to TKIs in renal cancer cells. Cell Biosci. 2017;7:71. doi:10.1186/s13578-017-0197-8

15. Akanji MA, Rotimi D, Adeyemi OS. Hypoxia-inducible factors as an alternative source of treatment strategy for cancer. Oxid Med Cell Longev. 2019;2019:8547846.

16. Semenza GL. HIF-1 mediates metabolic responses to intratumoral hypoxia and oncogenic mutations. J Clin Invest. 2013;123(9):3664–3671.

17. Harris AL. Hypoxia: a key regulatory factor in tumour growth. Nat Rev Cancer. 2002;2(1):38–47.

18. Walsh JC, Lebedev A, Aten E, Madsen K, Marciano L, Kolb HC. The clinical importance of assessing tumor hypoxia: relationship of tumor hypoxia to prognosis and therapeutic opportunities. Antioxid Redox Signal. 2014;21(10):1516–1554.

19. Jing X, Yang F, Shao C, et al. Role of hypoxia in cancer therapy by regulating the tumor microenvironment. Mol Cancer. 2019;18(1):157.

20. Sonja H, Robert C, Justin G. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7.

21. Shen-Orr Shai S, Renaud G. Computational deconvolution: extracting cell type-specific information from heterogeneous samples. Curr Opin Immunol. 2013;25:571–578.

22. Wanjuan Y, Jorge S, Patricia G, et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2013;41:D955–61.

23. Paul G, Nancy C, Stephanie HR. pR Rophetic: an R package for prediction of clinical chemotherapeutic response from tumor gene expression levels. PLoS One. 2014;9:e107468.

24. Nick T, Hirsch FR, Luft alexander V, et al. Necitumumab plus gemcitabine and cisplatin versus gemcitabine and cisplatin alone as first-line therapy in patients with stage IV squamous non-small-cell lung cancer (SQUIRE): an open-label, randomised, controlled Phase 3 trial. Lancet Oncol. 2015;16:763–774.

25. Paz-Ares L, Socinski MA, Shahidi J, et al. Correlation of EGFR-expression with safety and efficacy outcomes in SQUIRE: a randomized, multicenter, open-label, Phase III study of gemcitabine-cisplatin plus necitumumab versus gemcitabine-cisplatin alone in the first-line treatment of patients with stage IV squamous non-small-cell lung cancer. Ann Oncol. 2016;27:1573–1579.

26. Julien M, Dariusz K, Alexander L, et al. Health-related quality of life with carboplatin-paclitaxel or nab-paclitaxel with or without pembrolizumab in patients with metastatic squamous non-small-cell lung cancer. J Clin Oncol. 2020;38:271–280.

27. Alan S, Robert G, Perry Michael C, et al. Paclitaxel-carboplatin alone or with bevacizumab for non-small-cell lung cancer. N Engl J Med. 2006;355:2542–2550.

28. Chun KW, Fong CT, Yan CK, et al. Haematological toxicity of pemetrexed in patients with metastatic non-squamous non-small cell carcinoma of lung with third-space fluid. Lung Cancer. 2021;152:15–20.

29. Grønberg BH, Bremnes RM, Oystein F, et al. Phase III study by the Norwegian lung cancer study group: pemetrexed plus carboplatin compared with gemcitabine plus carboplatin as first-line chemotherapy in advanced non-small-cell lung cancer. J Clin Oncol. 2009;27:3217–3224.

30. Rekhtman N, Paik PK, Arcila ME, et al. Clarifying the spectrum of driver oncogene mutations in biomarker-verified squamous carcinoma of lung: lack of EGFR/KRAS and presence of PIK3CA/AKT1 mutations. Clin Cancer Res. 2012;18:1167–1176. doi:10.1158/1078-0432.CCR-11-2109

31. Soda M, Choi YL, Enomoto M, et al. Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature. 2007;448:561–566. doi:10.1038/nature05945

32. Lynch TJ, DW Bell, R Sordella. et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N Engl J Med. 2004;350:2129–2139. doi:10.1056/NEJMoa040938

33. Paez JG, PA Janne, JC Lee, et al. EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. Science. 2004;304:1497–1500. doi:10.1126/science.1099314

34. Shou Y, Yang L, Yang Y, Zhu X, Li F, Xu J. Identification of signatures of prognosis prediction for melanoma using a hypoxia score. Front Genet. 2020;11:570530. doi:10.3389/fgene.2020.570530

35. Peng-Ju G, You-Cheng S, Si-Rui H, et al. Corrigendum: hypoxia-associated prognostic markers and competing endogenous RNA co-expression networks in breast cancer. Front Oncol. 2020;10:637481.

36. Baohui Z, Bufu T, Jianyao G, et al. A hypoxia-related signature for clinically predicting diagnosis, prognosis and immune microenvironment of hepatocellular carcinoma patients. J Transl Med. 2020;18:342.

37. Devi TA, Kumar D, Vamsi Krishna K, Kantevari S. Microtubule targeting agents as cancer chemotherapeutics: an overview of molecular hybrids as stabilizing and destabilizing agents. Curr Top Med Chem. 2017;17:2523–2537. doi:10.2174/1568026617666170104145640

38. Tagliamento M, Genova C, Rossi G, et al. Microtubule-targeting agents in the treatment of non-small cell lung cancer: insights on new combination strategies and investigational compounds. Expert Opin Invest Drugs. 2019;28:513–523. doi:10.1080/13543784.2019.1627326

39. Bian X, Lin W. Targeting DNA replication stress and DNA double-strand break repair for optimizing SCLC treatment. Cancers. 2019;11(9):1289. PMID: 31480716; PMCID: PMC6770306. doi:10.3390/cancers11091289

40. Annalisa T, Elisabetta S, Valentina M, et al. The DNA-helicase HELLS drives ALK ALCL proliferation by the transcriptional control of a cytokinesis-related program. Cell Death Dis. 2021;12:130.

41. Tianyu L, Jian C, Du Q, et al. Family with sequence similarity 83 member A promotes tumor cell proliferation and metastasis and predicts poor prognosis in cervical cancer. Pathol Res Pract. 2021;222:153450.

42. Lee SY, Meier R, Furuta S, et al. FAM83A confers EGFR-TKI resistance in breast cancer cells and in mice. J Clin Invest. 2012;122:3211–3220. doi:10.1172/JCI60498

43. Grant S. FAM83A and FAM83B: candidate oncogenes and TKI resistance mediators. J Clin Invest. 2012;122:3048–3051. doi:10.1172/JCI64412

44. Jain RK. Antiangiogenesis strategies revisited: from starving tumors to alleviating hypoxia. Cancer Cell. 2014;26:605–622.

45. Whatcott CJ, Han H, Von Hoff DD. Orchestrating the tumor microenvironment to improve survival for patients with pancreatic cancer: normalization, not destruction. Cancer J. 2015;21:299–306.

46. Coussens LM, Werb Z. Inflammation and cancer. Nature. 2002;420:860–867. doi:10.1038/nature01322

47. Damgaci S, Ibrahim-Hashim A, Enriquez-Navas PM, Pilon-Thomas S, Guvenis A, Gillies RJ. Hypoxia and acidosis: immune suppressors and therapeutic targets. Immunology. 2018;154(3):354–362. doi:10.1111/imm.12917

48. Multhoff G, Vaupel P. Hypoxia compromises anti-cancer immune responses. Adv Exp Med Biol. 2020;1232:131–143. doi:10.1007/978-3-030-34461-0_18

Read more here: Source link