Construction and validation of the PCD-related signature
A total of 24 differentially expressed PRGs were meticulously screened in single-cell sequencing data and TCGA, employing strict criteria (|logFC| > 1 and FDR < 0.001), and the resulting intersection was depicted in Fig. 1A. Subsequently, a comprehensive selection process via univariate COX regression analysis yielded 9 prognostic PRGs (Fig. 1B). To construct PRS for prognostication, we harnessed the power of LASSO regression analysis, which enabled us to identify 8 pivotal PRGs (Fig. 1C,D). The PRS was found to exhibit a positive correlation with patient mortality, vividly demonstrated by the PRS distribution, survival status, and KM survival curve (Fig. 1E,F). The heatmap illustrated the distribution of the 8 modeled gene expression profiles alongside clinicopathological features (Fig. 1G). Moreover, the ROC curves for PRS at 1, 2, and 3 years stood at 0.724, 0.663, and 0.651, respectively (Fig. 1H). Furthermore, univariate and multivariate Cox analyses underscored the precision of PRS in prognosticating ccRCC patients (Fig. 1I,J). To fortify the reliability of our model, we applied the E-MATB-1980 dataset for external validation, with the KM curve affirming that the high PRS group portended a less favorable prognosis (Fig. 1K). ROC curves for PRS at 1, 2, and 3 years in the E-MATB-1980 dataset yielded AUCs of 0.851, 0.885, and 0.814, respectively (Fig. 1L).
Internal validation of the PCD-related signature
To bolster our findings, we adopted a 6:4 randomization ratio to divide patients into training and test groups for internal validation. In both groups, higher PRS values were consistently associated with poorer prognoses, as evidenced by PRS distribution, survival status, and KM curves (Fig. S1A–D). The comprehensive analysis of gene expression profiles and clinicopathological features was summarized in Fig. S1E,F. ROC analysis exhibited the remarkable prognostic value of PRS in both the training (AUC = 0.751) and test (AUC = 0.680) sets (Fig. S1G,H). Furthermore, univariate and multivariate Cox regression analyses confirmed the independent predictive role of PRS in both training and test groups (Fig. S1I–L). The meticulous validation process underscored the stability and robustness of PRS as a prognostic predictor.
Identification of immune characteristics of the PCD-related signature
Employing seven different algorithms, we generated a heatmap illustrating the diverse immune cell components (Fig. 2A). The relationship between PRS and immune cells, as delineated by various algorithms, was displayed in Fig. 2B. Furthermore, we observed significantly higher expression of three immunosuppressive cells in the high PRS group (Fig. 2C–E). Utilizing the ESTIMATE algorithm, we discerned that immune score, stromal score, and ESTIMATE score were all elevated in the high PRS group (Fig. 2F–H). To corroborate the validity of PRS for immunotyping, we investigated the association between PRS and immune subtypes, revealing differential expression in subtypes C1, C6, C3, C4, and C5 (Fig. 2I). Moreover, the high PRS group exhibited markedly elevated levels of immune-related molecules in comparison to the low PRS group (Fig. 2J). In our quest to delve deeper into the role of immune cells in ccRCC progression, we evaluated immune activity scores at each step using data from the TIP database. Impressively, we found that the high PRS group displayed significantly higher frequencies of anti-tumour immune cells (Fig. 2K). Our exploration further extended to the spatial distribution of PRS within ccRCC tissues using single-cell signature scoring. Strikingly, we embarked on a detailed exploration of the spatial distribution of PRS within ccRCC tissues. The high PRS group exhibited a substantial increase in the proportion of cancer cells, macrophages, and Treg cells (Fig. 2L), and notably, higher PRS expression within these cell types (Fig. 2M). Additionally, we detected a substantial enrichment of PRS in ccRCC tumors compared to normal tissue samples, as determined by the “AddModuleScore” algorithm (P < 0.0001; Fig. 2N).
Drug susceptibility analysis of the PCD-related signature
To assess the susceptibility of patients with high and low PRS, we employed the “pRRophetic” package to analyze the IC50 values of five common anti-renal cancer drugs, namely Sunitinib, Sorafenib, Pazopanib, Axitinib, and Temsirolimus. Strikingly, our results demonstrated a significant overexpression of IC50 values for all five drugs in the low PRS group (Fig. 3A–E). The target genes of these anticancer drugs, obtained from the DrugBank database, encompassed PDGFRB, FLT3, FLT4, CSF1R, PDGFRA, RAF1, FGFR1, RET, MTOR, FGF1, SH2B3, ITK, FGF2, and FKBP1A. Notably, the high PRS group exhibited higher expression levels for most of these target genes (Fig. 3F). These findings underscored the potential of PRS in aiding the selection of appropriate treatment strategies, ultimately contributing to improved prognosis.
Correlation of the PCD-related signature with clinicopathological characteristics
Within the TCGA cohort, we embarked on an exploration of the signature’s applicability across diverse clinical subgroups. Our analysis revealed a significant elevation of PRS in advanced-stage cases, high-grade cases, advanced T-stage cases, positive lymph node metastasis cases, and positive distant metastasis cases (Fig. S2A–E). Similarly, patients with advanced ccRCC (grade 3 or 4, stage III or IV, T3 or T4, M1, and N1) exhibited a higher propensity to belong to the high PRS group (Fig. S2F–J). The high-PRS group consistently exhibited a poor prognosis, as depicted in Fig. S2K–T. These observations affirmed the robust predictive capability of our model across distinct clinicopathological stages.
Identification of PCD-related clusters and bioinformatics analysis
By leveraging the expression profiles of the modeling genes, we categorized patients into two clusters using the NMF algorithm (Fig. 4A). The KM survival curve clearly delineated a worse prognosis for cluster B in comparison to cluster A (Fig. 4B). The Sankey diagram further illustrated the relationship between classification, PRS, and survival status, indicating that a majority of cluster B patients belonged to the high PRS group, with a notable concentration of deaths therein (Fig. 4C). This observation was consistent with our previous analyses. The heatmap provided a visual representation of the distribution of the 8 modeling gene expression profiles alongside clinicopathological characteristics (Fig. 4D). In an effort to explore the heterogeneity within each PCD-related cluster, we conducted functional enrichment analysis on the DEGs between the clusters. KEGG and GO analysis highlighted significant enrichment in pathways such as the PI3K-Akt signaling pathway, various metabolic pathways, mitochondrial matrix regulation, inflammatory response regulation, and angiogenesis regulation (Fig. 4E,F). Additionally, GSVA revealed significant enrichment of cancer-promoting pathways, including the P53 signaling pathway, JAK-STAT signaling pathway, cell cycle, and ECM receptor interaction, within cluster B (Fig. 4G).
Mutation and immunotherapeutic responses of the PCD-related signature
Given the significant correlation between Tumor Mutation Burden (TMB) and immunotherapy efficacy, we explored the link between PRS and TMB. Interestingly, the mutation rate was 77.17% in the low PRS group and 85.14% in the high PRS group, with TMB being significantly higher in the latter (Fig. 5A–C). A positive correlation between PRS and TMB was evident, with cluster B exhibiting higher PRS and TMB than cluster A (Fig. 5D). The KM curve indicated a poor prognosis in the high TMB group (Fig. 5E). Furthermore, combining PRS and TMB for prognosis prediction revealed that H-TMB + H-PRS had the worst prognosis, while L-TMB + L-PRS had the best prognosis (Fig. 5F). Then, previous studies have shown that TMB has an association with the outcome of immunotherapy. Our analysis of immunosuppressive checkpoint expression confirmed significant overexpression in the high PRS group (Fig. 5G). Additionally, the low PRS group displayed a higher probability of responding to CTLA4, PD-1, and CTLA4 + PD-1 immunotherapy, further emphasizing the clinical relevance of PRS (Fig. 5H–K).
Correlation of the modeling genes with clinicopathological characteristics
In Fig. S3A, we observed differential expression of the 8 modeling genes between cancer and normal tissues, with 5 genes up-regulated in cancer tissue and 3 down-regulated. Notably, all 8 modeling genes demonstrated substantial predictive capacity based on the AUC (Fig. S3B). Differential expression of these genes across various clinicopathological stages was highlighted in Fig. S3C–G. Patients were stratified into high-expression and low-expression groups based on median gene expression, revealing that 5 genes were positively correlated with poor prognosis, while 3 genes were inversely correlated (Fig. S3O).
Single-cell transcriptomic context of the 8 modeling genes
For an in-depth exploration of the potential mechanisms governed by the modeling genes in ccRCC, we leveraged three single-cell sequencing datasets from the GEO database (GSE131685, GSE152938, and GSE171306). After meticulous quality control, we analyzed 17,304 genes across 50,021 cells obtained from four ccRCC and four normal samples (Fig. S4A). Subsequently, 16 distinct clusters and 15 cell types were identified, encompassing immune cells, tubular cells, endothelial cells, and tumor cells (Fig. S4B,C).
In agreement with previous studies35, renal tubular epithelial cells predominated in normal renal cortical samples, while immune cells and tumor cells dominated in ccRCC, signifying the ccRCC tumor immune microenvironment (Fig. S4D). The majority of immune cells were identified in ccRCC patients, delineating the tumour immune microenvironment of ccRCC (Fig. S4E). Importantly, we examined the expression profiles of the eight modeled genes within different cell types (Fig. 6A–H) and discerned distinct distribution patterns in cancerous and paracancerous tissues. Specifically, PEBP1, NAPSA, and FDX1 exhibited higher expression in paracancerous tissues, while SERPINE1, NOL3, CEBPB, P4HB, and YBX1 were more highly expressed in cancerous tissues, reinforcing our previous analyses (Fig. 6I–P).
Validation of protein and mRNA expression of PCD-related genes in ccRCC
To validate the expression of PCD-related genes in ccRCC, we obtained protein expression data for the eight modeling genes from the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Additionally, we performed qRT-PCR to assess differences in mRNA expression of these eight modeling genes in both tissues and cell lines. Figure 7A–E illustrated the protein and mRNA expression levels of five modeling genes (SERPINE1, P4HB, NOL3, CEBPB, and YBX3) were significantly that exhibited significant overexpression in ccRCC tissues and cell lines. In contrast, the remaining three genes (PEBP1, NAPSA, and FDX1) displayed noteworthy underexpression in ccRCC tumors, as depicted in Fig. 7F–H.