Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous disease whose outcomes are influenced by clinical and molecular predictors. We analyzed RNA expression data to identify gene signatures associated with overall survival (OS), with the aim of uncovering prognostic markers and potential therapeutic targets.
Methods: We analyzed publicly available normalized expression data from NCBI-GEO (GSE181063), encompassing 1,311 DLBCL patients, using a custom Python pipeline. The expression data were generated on the HumanHT-12 v4.0 Gene Expression BeadChip; during preprocessing, the median expression was calculated for each gene represented by multiple probes. Clinical data were integrated with expression data for each patient, and the association between gene expression and OS was tested (Pearson correlation, Spearman correlation, and linear regression). P-values were corrected with the Benjamini–Hochberg false discovery rate procedure (FDR-bh). Genes were then matched against the COSMIC (Catalogue Of Somatic Mutations In Cancer) gene database and restricted, under a strict criterion, to expert-curated genes altered in cancer. Summary statistics for OS (median, standard deviation, minimum, and maximum) were calculated and tabulated for comparison. Genes were ranked using a composite score based on |r| ≥ 0.13 and FDR-bh < 1×10⁻⁴.
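The core steps described in the Methods (collapsing probes to a per-gene median, correlating expression with OS, applying the Benjamini–Hochberg correction, and filtering on |r| and FDR) can be illustrated with a minimal standard-library sketch. This is not the pipeline used in the study: the function names, the input layout, and the choice to order selected genes by |r| are assumptions made for illustration, and the raw p-values are assumed to come from a separate statistics routine.

```python
import math
from statistics import median


def collapse_probes(probe_values, probe_to_gene):
    """Return the median expression per gene for one patient.

    probe_values: {probe_id: expression}; probe_to_gene: {probe_id: gene}.
    Both layouts are hypothetical, chosen for this sketch.
    """
    by_gene = {}
    for probe, value in probe_values.items():
        by_gene.setdefault(probe_to_gene[probe], []).append(value)
    return {gene: median(values) for gene, values in by_gene.items()}


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(stats, r_min=0.13, fdr_max=1e-4):
    """Keep genes with |r| >= r_min and FDR < fdr_max.

    stats: {gene: (r, fdr)}. Sorting by |r| is an assumption of this
    sketch; the abstract does not define its composite score.
    """
    kept = [g for g, (r, fdr) in stats.items()
            if abs(r) >= r_min and fdr < fdr_max]
    return sorted(kept, key=lambda g: abs(stats[g][0]), reverse=True)
```

Applied to the approximate values reported for CEBPA (r≈+0.18, FDR≈1×10⁻⁸) and MYC (r≈–0.15, FDR≈6×10⁻⁶), `select_genes` keeps both genes, while a gene with a weak correlation or a non-significant FDR is dropped.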
Results: The dataset included 1,311 patients with a median OS of 5.3 ± 4.0 years (range: <1 to 14.3 years). Based on the criteria described in the Methods, 41 genes were significantly associated with OS; these genes were ranked using the composite score. The most commonly altered pathways were the PI3K-Akt signaling pathway (n=10 genes), the Human T-cell leukemia virus 1 pathway (n=9), MicroRNAs in cancer (n=7), and the Ras signaling pathway (n=7). Favorable prognostic markers included CEBPA (r≈+0.18, FDR≈1×10⁻⁸, ρ≈+0.21) and IL7R (r≈+0.13, FDR≈5×10⁻⁵, ρ≈+0.16); other positive predictors (|r| below 0.13 but FDR-bh < 0.01) included FGFR1/2, FLT4, GATA3, NTRK3, and NTRK2. The strongest unfavorable markers were MYC (r≈–0.15, FDR≈6×10⁻⁶, ρ≈–0.16), RECQL4 (r≈–0.14, FDR≈3×10⁻⁵, ρ≈–0.16), and TERT (r≈–0.130, FDR≈8.5×10⁻⁵), while other negatively correlated genes (|r| below 0.13 but FDR-bh < 0.01) included BRCA1, BRCA2, DICER1, EZH2, FBXW7, RB1, and SPEN.
Discussion: As targeted therapies emerge for DLBCL, identifying patients who do not respond to standard treatments becomes crucial. This exploratory analysis highlights several genes as potential therapeutic targets for patients with DLBCL who fail standard treatments. Our analysis suggests that high expression of four genes is associated with poor OS and that the protein products encoded by these genes may serve as viable therapeutic targets: MYC (BET-bromodomain inhibitors), RECQL4 (ATR or WEE1 inhibitors), TERT (telomerase inhibitors), and RB1 (CDK4/6 inhibitors).
Conclusions: We identified 41 OS-associated genes across key cancer pathways, with CEBPA, IL7R, NTRK3, GATA3, NTRK2, FGFR1/2, and FLT4 as the strongest favorable prognostic markers and MYC, RECQL4, TERT, RB1, FBXW7, SPEN, EZH2, BRCA1, BRCA2, and DICER1 as the strongest negative predictors. Targeting these genes may benefit patients who do not respond to standard treatment. Although these potential targets are promising, further validation studies are essential to confirm and strengthen these findings.