• Cell of origin is stable between diagnosis and relapse.

  • A 30-gene panel of relapse-associated genes was able to stratify ABC patient survival at diagnosis.

Despite the effectiveness of immuno-chemotherapy, 40% of patients with diffuse large B-cell lymphoma (DLBCL) experience relapsed or refractory disease. Longitudinal studies have previously focused on the mutational landscape of relapse but fell short of providing a consistent relapse-specific genetic signature. In our study, we focused on the changes in gene expression profile (GEP) accompanying DLBCL relapse using archival paired diagnostic/relapse specimens from 38 de novo patients with DLBCL. Cell of origin (COO) remained stable from diagnosis to relapse in 80% of patients, with only a single patient showing COO switching from activated B-cell–like (ABC) to germinal center B-cell–like (GCB). Analysis of the transcriptomic changes that occur following relapse suggests that ABC and GCB relapses are mediated via different mechanisms. We developed a 30-gene discriminator for ABC–DLBCLs derived from relapse-associated genes that defined clinically distinct high- and low-risk subgroups in ABC–DLBCLs at diagnosis in datasets comprising both population-based and clinical trial cohorts. This signature also identified a population of <60-year–old patients with superior progression-free survival (PFS) and overall survival (OS) when treated with ibrutinib–R-CHOP as part of the PHOENIX trial. Altogether, this new signature adds to the existing toolkit of putative genetic predictors now available in DLBCL that can be readily assessed as part of prospective clinical trials.

Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous disease encompassing multiple molecular and biological subtypes. Although potentially curable with immuno-chemotherapy, up to 40% of patients will experience relapsed or refractory disease.1,2 The current standard of care for these lymphomas has not changed in the past 2 decades, and efforts are turning to identification of subgroups of DLBCL that may demonstrate preferential response to existing or novel therapies.3,4 Early work focused on DLBCL at diagnosis, using gene expression profiling (GEP) to delineate the cell-of-origin (COO) classification system (germinal center B-cell–like [GCB], activated B-cell–like [ABC], and unclassified [UNC]) with ABC tumors being linked to poorer outcome.5 However, attempts to use molecular analyses to tailor treatment, and specifically to develop alternative rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisolone (R-CHOP) regimens to mitigate the poorer outcome of patients in the ABC–DLBCL subgroup, have not, to date, led to significant improvements.6-9 

More recently, attempts to refine the taxonomy of DLBCL through integrative genomic analysis have demonstrated additional heterogeneity not captured by the previous COO classification.10-14 This has led to the growing realization that DLBCL encompasses a number of biological entities with distinct oncogenic mechanisms, requiring a more sophisticated approach to patient management and trial design. To date, these studies have predominantly focused on analyzing single tumor biopsies at diagnosis, with our understanding of the pre-programmed or acquired mechanisms underpinning relapsed disease hindered by the limited availability of sequential biopsy samples. The majority of longitudinal studies published thus far have focused on genetic changes between the diagnostic and relapse tumor, providing important confirmation of the clonal relationship between diagnosis and relapse, and describing recurrent relapse-associated genetic aberrations, but fell short of providing a consistent relapse-specific genetic signature15-22 (supplemental Table 1). In this study, we sought to use GEP in paired diagnostic and relapse tumors to further understand the mechanisms underpinning treatment failure following immuno-chemotherapy. Using these data, we demonstrate the stability of COO at relapse in the majority of cases and identify a novel relapse-associated gene expression signature that reliably discriminated 2 distinct outcome groups within the ABC type of DLBCL at diagnosis.

Patient cohort

Ethical approval was obtained from the London Research Ethics Committee of the East London and the City Health authority (10/H0704/65 and 06/Q0605/69). Written consent was obtained for the use of specimens for research purposes and samples from collaborating centers had local ethical approval. Paired diagnosis/relapse DLBCL biopsies were collated from 38 patients across 5 centers in the United Kingdom. All patients were treated with standard first-line rituximab-based immuno-chemotherapy (eg, R-CHOP) and achieved either a partial or complete remission (Figure 1A; Table 1). COO was determined using the Lymph2Cx assay on the NanoString platform23 or the DLBCL Automatic Classifier24 and all biopsies had ≥19% total B cell content, as estimated by CIBERSORT.25 Thirty-four biopsies were nodal (15 diagnosis, 19 relapse) and 42 extranodal (23 diagnosis and 19 relapse). The site of the biopsy was concordant at diagnosis and relapse for 20 cases (8 nodal, 12 extranodal).

Figure 1.

Gene expression profiles of paired diagnosis and relapse diffuse large B-cell lymphoma (DLBCL) biopsies. (A) Thirty-eight patients who underwent relapse were included in the study; the clinical features of these patients are shown. (B) COO remained stable in the majority of cases. Gene expression profiling was carried out using an Ion AmpliSeq Transcriptome Human Gene Expression Kit. (C) Principal component analysis carried out on these samples suggested poor separation based on timepoint, with a greater degree of separation observed by COO. Diagnosis = green; relapse = red; ABC = blue; GCB = orange; UNC = gray; NA = black. (D) Differential gene expression analysis was carried out separately for the ABC and GCB cohorts and GSEA was performed; the number of gene sets dysregulated (false discovery rate ≤ 0.1) at relapse is shown. (E) Heatmaps of normalized enrichment score for examples of the dysregulated gene sets are shown. (F) A 30-gene panel capable of stratifying ABC–DLBCL patients from a training cohort14 into 2 risk groups with different overall survival was discovered using PAM. Red = high risk, blue = low risk; ∗∗p ≤ 0.01, ∗p ≤ 0.05, p ≤ 0.1. COO, cell of origin; ABC, activated B-cell–like; UNC, unclassified; GCB, germinal center B-cell–like; NA, not applicable; CR, complete response; PR, partial response; GSEA, gene set enrichment analysis.

Table 1.

Cohort information

                               Min     Max     Median
Age at diagnosis (years)       38      89      64
Time to relapse (years)        0.5     13.9    1.7

Sex                            Male    Female
                               17      21

Cell of origin (available for 35 cases)
                               Relapse
Diagnosis                      ABC     GCB     UNC
ABC                            17      1       2
GCB                            0       11      1
UNC                            1       0       2

Site of biopsy
                               Relapse
Diagnosis                      Nodal   Extranodal
Nodal                          8       7
Extranodal                     11      12

Treatment                      Frequency
R-CHOP                         30
R-CHOP + IT MTX
R-CHOP + LOCALISED RT
R-CHOP X3 + IT MTX AND IFRT
R-CHOP X3 + RADIOTHERAPY
R-CEOP
R-CEOP + RADIOTHERAPY

Gene expression analysis

GEP of formalin-fixed paraffin-embedded (FFPE) samples was carried out using the Ion AmpliSeq™ Human Gene Expression array, consisting of 20 802 genes. Poorly captured genes (0 reads in ≥ 1/3 of the cohort) were removed, leaving 15 457 genes. Raw read counts were normalized to log2 counts per million. Differential expression analysis between matched relapse and diagnostic samples and gene set enrichment analysis (GSEA)26 were subsequently performed. The resulting differentially expressed (DE) genes were then used for gene signature discovery in publicly available datasets.
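The filtering and normalization steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (their analyses were run in R). The function names, the toy counts, the `prior_count` offset, and the choice to compute library sizes after filtering are all assumptions made for illustration.

```python
import math
from fractions import Fraction


def filter_poorly_captured(counts, zero_fraction_cutoff=Fraction(1, 3)):
    """Drop genes with 0 reads in at least `zero_fraction_cutoff` of samples.

    `counts` maps gene name -> list of raw read counts (one per sample).
    """
    kept = {}
    for gene, row in counts.items():
        zeros = sum(1 for c in row if c == 0)
        if Fraction(zeros, len(row)) < zero_fraction_cutoff:
            kept[gene] = row
    return kept


def log2_cpm(counts, prior_count=0.5):
    """Convert raw counts to log2 counts per million within each sample.

    `prior_count` is an assumed offset that keeps log2 defined for zero counts.
    Library sizes are taken from the matrix passed in (an assumption).
    """
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [
            math.log2((row[j] + prior_count) * 1e6 / library_sizes[j])
            for j in range(n_samples)
        ]
        for gene, row in counts.items()
    }


# Hypothetical toy data: three genes measured in three samples.
raw = {"gene_a": [0, 0, 5], "gene_b": [10, 20, 30], "gene_c": [0, 4, 6]}
filtered = filter_poorly_captured(raw)  # gene_a and gene_c have zeros in >= 1/3 of samples
normalized = log2_cpm(filtered)
```

Using an exact `Fraction` for the cutoff avoids floating-point ambiguity when a gene sits exactly on the 1/3 boundary.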

Derivation of a prognostic gene panel

Relapse-associated genes found within our paired cohort (P < .05) were used in conjunction with the Prediction Analysis of Microarrays27 (PAM) algorithm to define a survival signature for DLBCL. The expression of these genes within a cohort of 264 GCB and 249 ABC diagnostic patients with DLBCL14 (called the “Reddy cohort” hereafter) was used to train the PAM model. For the validation of the resulting gene signatures, a linear predictor model was constructed based on the prognostic value of each gene in the training dataset and the expression value in the validation dataset. This predictor score was used to stratify patients in 3 independent GEP cohorts: the Randomised Evaluation of Molecular Guided Therapy for Diffuse Large B-cell Lymphoma with Bortezomib clinical trial (REMoDL-B),7 the Lymphoma/Leukemia Molecular Profiling Project (LLMPP) series,28 and the Haematological Malignancy Research Network (HMRN) population cohort.29 All survival analyses were performed using the Cox Proportional Hazards Model in R.

See supplemental Methods for a full description of the methods.

COO is stable between diagnosis and relapse

The longitudinal series included 38 paired diagnostic-relapsed DLBCLs (Figure 1A; Table 1), all treated at diagnosis with R-CHOP or R-CHOP–like regimens. COO calling was successfully completed in both biopsies for 35 cases. COO was stable across 28 patients (80%) and corresponded to 17 ABC–ABC and 11 GCB–GCB pairs, with 2 further cases being UNC at both timepoints (Figure 1B). Discordant COO was a feature of just 5 cases (1 ABC–GCB, 2 ABC–UNC, 1 GCB–UNC, and 1 UNC–ABC) with a single example of an ABC–GCB transition, suggesting that changes in DLBCL trajectory at relapse, while reported in the literature,30 are uncommon. The median time to relapse was 1.7 years, with 22 patients (58%) relapsing within 2 years. In 2 cases (1 ABC–ABC and 1 GCB–GCB) relapse occurred after more than 10 years (10.1 years and 13.9 years, respectively).
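As a quick arithmetic check of the concordance figures reported above, the pair counts can be tallied in a few lines of Python. The counts are taken directly from the text; the variable names are illustrative only.

```python
from collections import Counter

# (COO at diagnosis, COO at relapse) -> number of pairs, as reported in the text
pair_counts = Counter({
    ("ABC", "ABC"): 17, ("GCB", "GCB"): 11, ("UNC", "UNC"): 2,
    ("ABC", "GCB"): 1, ("ABC", "UNC"): 2, ("GCB", "UNC"): 1, ("UNC", "ABC"): 1,
})

total = sum(pair_counts.values())
stable_abc_gcb = pair_counts[("ABC", "ABC")] + pair_counts[("GCB", "GCB")]
discordant = sum(n for (dx, rel), n in pair_counts.items() if dx != rel)

print(total, stable_abc_gcb, discordant)   # 35 28 5
print(f"{stable_abc_gcb / total:.0%}")     # 80%
```

Note that the 80% figure counts only the ABC–ABC and GCB–GCB pairs as stable; the 2 UNC–UNC pairs are reported separately.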

Deregulated gene expression between diagnosis and relapse

We interrogated whole-transcriptome GEP data from all 76 biopsies with the aim of identifying changes in gene expression associated with DLBCL relapse. Principal component analysis (PCA) based on the full set of profiled genes (n = 15 457) did not reveal distinct clustering of the diagnostic or relapse samples (Figure 1C). There was no consistent pattern observed in the PCA values within the individual pairs or based on the location of biopsies, nodal/extra-nodal disease, or time to relapse (supplemental Figure 1A–C). As expected, the expression profiles of the samples grouped by COO (Figure 1C). DE analyses of the ABC (n = 17) and GCB (n = 11) pairs identified largely distinct sets of relapse-associated genes (<4% overlapping DE genes, limma analysis p < .05; supplemental Figure 1). This was also supported by GSEA: chromosome maintenance, DNA repair, and rRNA processing were among the top upregulated pathways (false discovery rate < 0.1) in the ABC–ABC series, whereas adaptive immunity, cytokine signaling, and antigen processing and presentation signatures were unique to GCB–GCB pairs (Figure 1D,E; supplemental Tables 2 and 3).

A 30-gene outcome predictor in ABC DLBCL

We postulated that the expression of these relapse-associated genes might hold some prognostic significance in a diagnostic cohort. To this end, the PAM algorithm27 was used to interrogate a total of 796 and 387 DE genes (p < .05) from our ABC and GCB diagnostic-relapse signatures, respectively, in the Reddy series of 264 GCB and 249 ABC diagnostic DLBCLs. This analysis identified a 30-gene signature that separated ABC patients into 136 low- and 113 high-risk cases with significantly different overall survival (hazard ratio [HR] = 1.89, 95% confidence interval [CI] = 1.26–2.83; log-rank p = .0017; Figure 1F). The majority of the genes in this panel have not previously been implicated in DLBCL pathogenesis, although notable exceptions included MYC and TNFRSF9, with MYC being one of 5 genes demonstrating significant single-gene clinical association, inversely correlated with overall survival (p < .05; Figure 1F). STRING analysis of these 30 genes identified 7 highly interconnected clusters, with MYC at the center of this protein interaction network (supplemental Figure 2; supplemental Table 4). In contrast to ABC patients, there was no equivalent predictor detected using PAM in the corresponding set of GCB cases. Attempts to define a response signature using the Reddy cohort without the prior enrichment of relapse-associated genes were unsuccessful.

Validation of the 30-gene ABC predictor in 3 independent DLBCL series

The reproducibility of this 30-gene outcome predictor was evaluated in 3 separate DLBCL cohorts (REMoDL-B and HMRN, both with RNA profiling achieved using the cDNA-mediated annealing, selection, extension, and ligation assay; and LLMPP, with RNA profiling from an Affymetrix microarray chip7,28,29), all treated with R-CHOP (R-CHOP + bortezomib in 126 patients from the REMoDL-B cohort) at diagnosis and comprising 504 ABC cases in total. We evaluated each series separately. Within each cohort, a linear predictor score was calculated for each patient, based on the summation of the expression of 29 or 30 genes (as not all genes were represented on each platform), weighted by their β-coefficients from the training dataset (supplemental Table 5). These linear predictors were standardized using a Z-transformation and each cohort was subdivided into high (standardized linear predictor > 0) and low (standardized linear predictor < 0) scoring risk groups (see supplemental Methods). Analysis of the causes of death in the HMRN cohort showed that patients with lymphoma-associated deaths had a significantly shorter follow-up time than patients who died of other causes (Wilcoxon rank sum p < 0.001; supplemental Figure 3D). Moreover, it was notable that non-lymphoma–related deaths increased significantly from 3 years in this series, and so we restricted our analysis of overall survival to 3 years.
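The scoring scheme described above (a β-weighted sum of expression, standardized within each cohort and split at zero) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation. The gene names, coefficients, and expression values are hypothetical, the sample standard deviation is assumed for the Z-transformation, and a score of exactly 0 is assigned to the low-risk group by assumption, because the text does not cover that case.

```python
from statistics import mean, stdev


def linear_predictor(expression, betas):
    """Beta-weighted sum of expression over the signature genes that were measured."""
    return sum(beta * expression[gene] for gene, beta in betas.items() if gene in expression)


def stratify(cohort, betas):
    """Return {patient_id: (z_score, risk_group)} for one cohort.

    `cohort` maps patient id -> {gene: expression value}.
    """
    raw = {pid: linear_predictor(expr, betas) for pid, expr in cohort.items()}
    mu, sd = mean(raw.values()), stdev(raw.values())
    result = {}
    for pid, score in raw.items():
        z = (score - mu) / sd
        # The source defines high as z > 0 and low as z < 0; z == 0 is
        # assigned to "low" here purely by assumption.
        result[pid] = (z, "high" if z > 0 else "low")
    return result


# Hypothetical coefficients and expression values, for illustration only.
# gene_3 is absent from the cohort, mimicking a gene missing from one platform.
betas = {"gene_1": 1.0, "gene_2": -0.5, "gene_3": 0.8}
cohort = {
    "p1": {"gene_1": 2.0, "gene_2": 2.0},
    "p2": {"gene_1": 4.0, "gene_2": 0.0},
    "p3": {"gene_1": 0.0, "gene_2": 2.0},
}
groups = {pid: group for pid, (_, group) in stratify(cohort, betas).items()}
```

Standardizing within each cohort, as the text describes, lets the same cut-off of 0 be applied across platforms with different expression scales.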

The algorithm stratified the 255 ABC REMoDL-B cases into 108 low- and 147 high-risk cases (3-year overall survival [OS]; HR = 2.04, 95% CI = 1.073–3.875; p = .026; Figure 2A); the LLMPP series of 93 ABC cases into 44 low- and 49 high-risk cases (3-year OS; HR = 2.3, 95% CI = 1.154–4.565; p = .015; Figure 2B); and a UK population-based cohort (HMRN) of 156 ABC cases into 72 low- and 84 high-risk cases (3-year OS; HR = 1.93, 95% CI = 1.06–3.522; p = .029; Figure 2C). Across all 3 cohorts, patients with high linear predictor scores (high-risk) showed a significant reduction in survival at 3 years. When later events were included, both the REMoDL-B and LLMPP data showed similar results (OS HR = 2.11, 95% CI = 1.115–3.993; p = .019 and HR = 2.17, 95% CI = 1.109–4.242; p = .02; supplemental Figure 3A,B, respectively), while the HMRN cohort showed only a trend toward reduced survival in the high-risk group (HR = 1.39, 95% CI = 0.917–2.106; p = .12; supplemental Figure 3C); we reasoned that the weaker performance of the discriminator here may reflect the number of non-lymphoma–related deaths in this population-based cohort.
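One common way to restrict a survival analysis to a fixed horizon is to censor all follow-up at that time point. The Python sketch below shows this together with a basic Kaplan–Meier estimate. Whether the authors restricted survival in exactly this way is an assumption, and the follow-up times and function names are hypothetical.

```python
def restrict_follow_up(times, events, horizon=3.0):
    """Censor every patient still under observation after `horizon` years.

    `events` uses 1 for death and 0 for censored.
    """
    restricted = [(min(t, horizon), e if t <= horizon else 0) for t, e in zip(times, events)]
    return [t for t, _ in restricted], [e for _, e in restricted]


def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] for each distinct event time."""
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (years) and event indicators.
times, events = [1.0, 2.0, 4.0, 5.0], [1, 0, 1, 1]
r_times, r_events = restrict_follow_up(times, events)
curve = kaplan_meier(r_times, r_events)
```

In this toy example the deaths at 4 and 5 years become censored observations at 3 years, so only the death at 1 year contributes to the restricted estimate.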

Figure 2.

Validation of 30-gene risk model for ABC–DLBCL (activated B-cell–like diffuse large B-cell lymphoma) in population and clinical trial cohorts. The risk model was tested with survival restricted to 3 years. (A) The 30-gene signature distinguished high- and low-risk groups in the REMoDL-B clinical trial,7 (B) the R-CHOP arm of the Lymphoma/Leukemia Molecular Profiling Project (LLMPP) 2008 cohort,28 and (C) the Haematological Malignancy Research Network (HMRN) population study:29 red = high risk, blue = low risk. (D) Comparison of International Prognostic Index (IPI) scores and the risk groups defined using the linear predictor in the REMoDL-B cohort. (E) Comparison of genetic subcategories described by Lacy et al13 with risk groups defined using the linear predictor in the HMRN cohort. Of the 156 ABC cases in the HMRN data, the genomic subgroups were available for 98 cases.

We restricted our multivariate analysis to the REMoDL-B and LLMPP series, accounting for patient age, gender, International Prognostic Index (IPI), and stage (where available), and the high-scoring group remained associated with poorer OS (HR = 1.95, Wald test p = .042 for REMoDL-B; HR = 2.19, p = .023 for LLMPP; supplemental Table 6), suggesting that our linear score offers an additional independent predictor. While there was an over-representation of low IPI (0–2) cases observed in the low-risk group of the REMoDL-B cohort, this was not significant (Fisher’s exact test p = .08; Figure 2D) and, while case numbers are few, neither did we observe a significant enrichment in the 6 existing genetic subgroups defined by Lacy et al13 in the HMRN cohort (Lacy subtype available for 63% of samples, Fisher’s exact test p = .422; Figure 2E).

Previous studies have identified a large number of verifiable random gene signatures, associated with outcome in other cancer types,31,32 so for completeness, we next compared the prognostic ability of our signature against 300 000 random 30-gene panels in the REMoDL-B and LLMPP datasets, where it outperformed 95.5% of random signatures in the REMoDL-B data, 98.32% in the LLMPP dataset, and 99.92% in both datasets concurrently (supplemental Figure 4).
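The comparison against random panels can be sketched as a simple resampling loop in Python. This is a schematic under stated assumptions, not the authors' procedure. The `score` callable stands in for whatever prognostic statistic is computed for a gene panel in a given dataset, and it, the gene names, and the weights are all hypothetical.

```python
import random


def fraction_outperformed(signature, background_genes, score, n_random=1000, seed=0):
    """Fraction of random panels (same size as `signature`) scoring below the signature.

    `score` is a caller-supplied function mapping a list of genes to a number,
    where larger means more prognostic (a hypothetical placeholder here).
    """
    rng = random.Random(seed)
    observed = score(signature)
    beaten = 0
    for _ in range(n_random):
        panel = rng.sample(background_genes, len(signature))
        if observed > score(panel):
            beaten += 1
    return beaten / n_random


# Toy setup: three "signature" genes carry high weights, background genes carry low ones.
weights = {f"bg_{i}": i / 100 for i in range(50)}
weights.update({"sig_1": 10.0, "sig_2": 10.0, "sig_3": 10.0})
background = [g for g in weights if g.startswith("bg_")]


def toy_score(genes):
    return sum(weights[g] for g in genes)


result = fraction_outperformed(["sig_1", "sig_2", "sig_3"], background, toy_score, n_random=200)
```

Fixing the random seed makes the resampling reproducible, which matters when the comparison is repeated across several datasets.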

Signature predicts superior response to ibrutinib in younger DLBCL patients

We next tested whether our discriminator could identify populations of ABC patients most likely to respond to COO-specific therapies. Wilson et al have recently reported superior outcomes of patients in specific subtypes of DLBCL.33 We reasoned that our signature may hold relevance for agents postulated to specifically target ABC-subtype DLBCL. The phase-III PHOENIX study examined the addition of ibrutinib to R-CHOP in non-GCB DLBCL. Although ibrutinib addition failed to show benefit across the whole intention-to-treat cohort, in younger patients (<60 years), outcomes were indeed superior in the ibrutinib–R-CHOP (I-R-CHOP) arm, with results for older (>60 years) patients seemingly confounded by increased toxicity of the drug. In view of the efficacy in this discrete group of patients, we assessed whether our linear predictor was able to discriminate patients in the PHOENIX cohort with a variable response to ibrutinib, focusing our attention on cases younger than 60 years that were confirmed as ABC subtype using the HTG EdgeSeq COO Assay (n = 133).

Among all patients < 60 years of age, irrespective of treatment, those with high linear predictor scores demonstrated poorer progression-free survival (PFS) than those classified as low risk (Figure 3A; low risk = 57, high risk = 76, HR = 2.52, 95% CI = 1.23–5.16, log-rank p = .009), although OS was only marginally different (supplemental Figure 5A; HR = 1.46, 95% CI = 0.54–3.95, p = .452). We next considered whether the linear predictor behaved differently in I-R-CHOP-treated and R-CHOP-treated patients. For ibrutinib-treated patients (n = 55), both PFS and OS were lower in the high- versus low-risk group (Figure 3B; supplemental Figure 5B; low risk = 26, high risk = 29, HR = 11.6, 95% CI = 1.48–90.9; p = .003 and p = .076, respectively). Indeed, the low-risk group (47%) had strikingly favorable outcomes, with no deaths reported in these 26 patients and only one patient experiencing progression. It is important to note that the control R-CHOP arm demonstrated only a trend toward inferior outcomes in the high-risk group in PFS, in contrast to the significant survival differences observed in the LLMPP, REMoDL-B, and HMRN datasets (Figure 3C; supplemental Figure 5C; low risk = 32, high risk = 46; PFS: HR = 1.6, 95% CI = 0.727–3.52, p = .239 and OS: HR = 1.28, 95% CI = 0.429–3.82, p = .656, respectively).

Figure 3.

Prognostic ability of the linear predictor in the PHOENIX trial cohort. The gene expression profiling (GEP) data from the activated B-cell–like (ABC) patients < 60 years old in the PHOENIX trial were used to generate linear scores for each patient. These scores were then used to stratify the patients into high- and low-risk cohorts. Kaplan–Meier plots of the progression-free survival (PFS) rate of these patient subgroups are shown. (A) Both treatment arms combined; only patients designated as ABC by GEP. The PFS rate of these subgroups was also examined in each arm separately: (B) ibrutinib and (C) placebo. Red = high risk, blue = low risk. Finally, the effect of the drugs on PFS within the subgroups was assessed: (D) low risk and (E) high risk. Green = R-CHOP + placebo; purple = R-CHOP + ibrutinib.

Finally, we assessed the effect of ibrutinib addition in high- and low-risk linear predictor groups separately. Low-risk patients treated with I-R-CHOP had superior PFS and OS compared with those treated with R-CHOP only (Figure 3D; ibrutinib = 24, placebo = 33, p = .007 for PFS; supplemental Figure 5D, p = .028 for OS), whereas the high-risk group showed no difference between the treatment arms for either PFS (Figure 3E; ibrutinib = 31, placebo = 45, HR = 0.927, 95% CI = 0.44–1.95, p = .841) or OS (supplemental Figure 5E; HR = 0.589, 95% CI = 0.156–2.22, p = .428). Similar results were shown when examining the non-GCB group of patients. Together, these retrospective data suggest that our gene signature may identify a group of DLBCL patients < 60 years who derive benefit from ibrutinib in combination with R-CHOP therapy.

DLBCL comprises a molecularly heterogeneous group of lymphomas with different outcomes, linked to a variety of features including COO,5 occurrence of specific translocations34 and, more recently, a combination of gene mutation and copy number aberrations.10-13,35 There are several recently reported discriminators that rely primarily on gene expression, with an emphasis either toward the tumor B cell,3,36,37 or its immune microenvironment.38-41 However, despite an increased understanding of the biology of these aggressive lymphomas, improvements to the existing standard of care have proven problematic. Altogether, there has been a reliance on the study of the diagnostic biopsy samples, with longitudinal studies typically hindered by the limited availability of sequential biopsy material. Studies comparing mutation status at diagnosis and relapse in paired biopsies, or interrogating independent series of pre-treatment and relapse cases15-22 (supplemental Table 1), have identified recurrent relapse-associated genes including TP53 and MYC although alone they lack specificity to predict relapse. In this study, we focused attention on the changes in gene expression profile that accompany DLBCL relapse, to consider whether this approach might offer a novel perspective on the biology of disease resistance. Our new data demonstrate that COO is largely stable between time points, suggest a distinctive pattern of relapse in ABC and GCB lymphomas based on differential gene expression, and resolve a 30-gene discriminator in ABC-DLBCL that defined clinically distinct low- and high-risk subgroups at diagnosis, which was informative both in an independent series of R-CHOP-treated patients and young patients treated with ibrutinib + R-CHOP in the PHOENIX trial.6 

The accrual of paired material of suitable quality for analysis was challenging. From a large initial series of FFPE paired biopsies obtained from multiple UK institutions, suitably paired data were retrieved from 38 de novo DLBCL patients, constituting one of the largest published cohorts of paired diagnosis-relapse samples to date. Regardless, it is important to acknowledge the heterogeneity of the cohort: the site of the biopsy differed between diagnosis and relapse in 18 of the 38 pairs, the time to relapse varied across the series, and samples demonstrated variable tumor content. Irrespective of these potential confounding effects, we have been able to make some robust observations shedding new light onto the evolution of DLBCL. We had initially sought to recover both DNA and RNA from these specimens to facilitate a parallel analysis of mutation and gene expression, but this proved technically unfeasible in the majority of cases, highlighting the challenges in collating paired material of sufficient quantity and quality for multi-omic analyses. Our subsequent studies focused exclusively on generating gene expression data, through global GEP and a COO analysis. Comparison of paired biopsies confirmed what has long been assumed but not formally shown: that COO is stable in most paired diagnostic/relapse cases, ruling out a simple switch in COO as the dominant mechanism underlying disease relapse and R-CHOP failure. Indeed, while changes in COO accompanying DLBCL relapse were observed in 5 cases, this included just a single example of ABC–GCB switching, where biopsies were excised from different locations 1.5 years apart (Table 1). While this example is reminiscent of a recent study demonstrating spatial and temporal heterogeneity in a case of DLBCL manifesting as site-discordant COO and response to immuno-chemotherapy,42 these data confirm that such discordant cases represent the exception rather than the rule.

We noted minimal overlap in DE genes between COO groups, with GSEA suggesting that relapse is likely mediated by different mechanisms depending on the tumor’s COO. Tumor growth and proliferation signatures were enriched in ABC relapses, while adaptive immunity-related signatures were a feature of GCB-type lymphomas. Consequently, we considered ABC (n = 17) and GCB (n = 11) lymphomas separately for subsequent analysis. We next tested whether these relapse-associated genes held prognostic significance in a diagnostic cohort. Using the PAM algorithm, we resolved a 30-gene signature that divided ABC cases into low- and high-risk groups. Critically, this expression signature was validated using a linear score in 3 independent GEP datasets derived using different platforms and comprising both population-based and clinical trial cohorts.

Going forward, it will be important to prospectively validate individual signatures, as well as benchmark them against each other, to determine their relative merits and application in real-world patients. While it is reassuring to note in 3 recent mutation-focused studies10,12,13,35 the significant overlap and consensus across classifications based on gene mutation, it remains to be seen whether the various emerging gene-expression–based signatures similarly resolve identical groups of DLBCLs, or rather each identify distinct high-risk groups. Moreover, combined mutation and gene expression data from the HMRN dataset demonstrated that high- and low-risk patients from our ABC discriminator arose independently of the groups reported by Lacy et al.13 In contrast, too few patients in the PHOENIX trial were classified using both the LymphGen algorithm and the 30-gene signature to allow for a direct comparison. These data suggest that GEP imparts important information independent of mutation- and copy number aberration (CNA)-based classifications.

There is a recognition that genetic signatures, rather than informing clinical decisions based on outcome prediction, may offer instead a tool to identify discrete populations of patients who may benefit from specific precision-based approaches to treatment. It was of interest in our study that our ABC-discriminator resolved patients with particularly favorable outcome following ibrutinib + R-CHOP in the PHOENIX study within ABC-subtype patients diagnosed at < 60 years, albeit in a small retrospective cohort. Importantly, however, in this cohort the discriminator was unable to identify groups with different outcomes in the R-CHOP arm. Ideally, this observation will undergo prospective validation in patients on the upcoming combination study of the BTK inhibitor acalabrutinib with R-CHOP for untreated DLBCL (REMoDL-A: clinicaltrials.gov/ct2/show/NCT04546620) as part of the UK PMAL program.

There are certain limitations in our study. Overall, the cohort sizes are small, particularly in the example of GCB-GCB relapse pairs, which may explain the inability to generate a prognostic discriminator for this group of patients. Furthermore, while we employed a biologically agnostic approach to our discriminator discovery, so as not to overlook the impact of unappreciated gene interactions or biology, the resulting discriminator by its nature lacks an immediately apparent biological rationale. However, an interaction network revealed 7 biologically distinct clusters of protein interactions containing several enriched pathways with potential relevance to disease progression, including RNA transport, protein processing, and immune pathways. The notable presence of MYC at the center of the interaction network highlights the role of MYC in disease aggressiveness and reinforces the need to develop MYC-directed therapies.

The future utility of the many emerging genetic discriminators requires independent validation as part of prospective clinical trials and highlights the need for comprehensive and multi-omic profiling of these cohorts. There are currently limitations in performing direct comparisons between existing GEP studies, eg, the use of different discovery platforms, and it is possible that fluctuations in the proportion of specific subgroups observed may reflect the unpredictable nature of real-world studies (HMRN) compared with clinical trials (REMoDL-B). Indeed, the inclusion of patients for analysis in many biological studies is typically dependent on a confirmed lymphoma diagnosis, their treatment, and the availability of sufficient residual material for molecular analysis. In addition, while various candidates are being investigated to augment the efficacy of R-CHOP, the performance of the proposed predictive signatures will require re-appraisal in the context of any new standard of care.

In summary, we have leveraged one of the largest series of paired diagnosis-relapse samples in DLBCL to demonstrate the stability of COO and to derive a 30-gene signature that robustly distinguished low- and high-risk subgroups of ABC patients. In a small retrospective cohort, this signature also identified younger patients who appeared to derive benefit from BTK inhibition in combination with R-CHOP, adding to the existing toolkit of putative genetic predictors now available in DLBCL that can be readily assessed as part of prospective clinical trials.

The authors thank the patients and their families for donating specimens for research in this study as well as the Haemato-Oncology Tissue Bank at the Barts Cancer Institute. They also thank George Wright for earlier discussions on the LLMPP series of cases.

The Precision Medicine for Aggressive Lymphoma Consortium was funded by Blood Cancer UK; Barts Cancer Centre was funded by the Cancer Research UK Centre of Excellence under award #C16420/A18066; J.O. was funded by grant #C57432/A22742 from Cancer Research UK; and the Haematological Malignancy Research Network was funded by Cancer Research UK under grants #29685 and #15037.

Contribution: F.B.-C., K.K., S.A., J.G., J.O., P.J., J.W., and J.F. conceived and designed the study; F.B.-C., J.W., and J.F. wrote the manuscript; K.K., S.A., E.K., A.C., T.C., M.A.-K., S.B., S.v.H., C.B., M.E., S.R., N.C., G.M., A.N., A.D., A.S., K.N.N., and M.C. collected samples and clinical information; F.B.-C., R.K.H., J.W., and J.F. devised methods for analysis; F.B.-C., D.J.H., and J.W. performed the bioinformatic analysis; K.K., S.A., D.W.S., and L.M.R. performed experiments; H.R., C.S., J.R.D., D.R.W., D.P., B.H., D.J.H., S.B., and A.S. provided access to other data sets; and all authors read, critically reviewed, and approved the manuscript.

Conflict-of-interest disclosure: J.F. has provided consultancy and received funding from Epizyme. K.K. is an employee and shareholder of Roche. D.W.S. and L.M.R. have IP rights to the Lymph2Cx assay. D.W.S. has provided consultancy to AbbVie, AstraZeneca, Celgene, Janssen, and Incyte, and has received research funding from Janssen, NanoString Technology, and Roche/Genentech.

Correspondence: Findlay Bewicke-Copley, Barts Cancer Institute, Queen Mary University, Charterhouse Square Campus, Charterhouse Square, London, EC1M 5PZ, United Kingdom; e-mail: f.copley@qmul.ac.uk.

1. Coiffier B, Lepage E, Brière J, et al. CHOP chemotherapy plus rituximab compared with CHOP alone in elderly patients with diffuse large-B-cell lymphoma. N Engl J Med. 2002;346(4):235-242.
2. Rovira J, Valera A, Colomo L, et al. Prognosis of patients with diffuse large B cell lymphoma not reaching complete response or relapsing after frontline chemotherapy or immunochemotherapy. Ann Hematol. 2015;94(5):803-812.
3. Sha C, Barrans S, Cucco F, et al. Molecular high-grade B-cell lymphoma: defining a poor-risk group that requires different approaches to therapy. J Clin Oncol. 2019;37(3):202-212.
4. Ma Z, Niu J, Cao Y, et al. Clinical significance of ‘double-hit’ and ‘double-expression’ lymphomas. J Clin Pathol. 2019. jclinpath-2019-206199.
5. Liu Y, Barta SK. Diffuse large B-cell lymphoma: 2019 update on diagnosis, risk stratification, and treatment. Am J Hematol. 2019;94(5):604-616.
6. Younes A, Sehn LH, Johnson P, et al; PHOENIX investigators. Randomized phase III trial of ibrutinib and rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone in non-germinal center B-cell diffuse large B-cell lymphoma. J Clin Oncol. 2019;37(15):1285-1295.
7. Davies A, Cummin TE, Barrans S, et al. Gene-expression profiling of bortezomib added to standard chemoimmunotherapy for diffuse large B-cell lymphoma (REMoDL-B): an open-label, randomised, phase 3 trial. Lancet Oncol. 2019;20(5):649-662.
8. Gopal AK, Schuster SJ, Fowler NH, et al. Ibrutinib as treatment for patients with relapsed/refractory follicular lymphoma: Results from the open-label, multicenter, phase II DAWN study. J Clin Oncol. 2018;36(23):2405-2412.
9. Nowakowski GS, Chiappella A, Gascoyne RD, et al. ROBUST: a phase III study of lenalidomide plus R-CHOP versus placebo plus R-CHOP in previously untreated patients with ABC-type diffuse large B-cell lymphoma. J Clin Oncol. 2021;39(12):1317-1328.
10. Schmitz R, Wright GW, Huang DW, et al. Genetics and pathogenesis of diffuse large B-cell lymphoma. N Engl J Med. 2018;378(15):1396-1407.
11. Wright GW, Huang DW, Phelan JD, et al. A probabilistic classification tool for genetic subtypes of diffuse large B cell lymphoma with therapeutic implications. Cancer Cell. 2020;37(4):551-568.e14.
12. Chapuy B, Stewart C, Dunford AJ, et al. Molecular subtypes of diffuse large B cell lymphoma are associated with distinct pathogenic mechanisms and outcomes. Nat Med. 2018;24(5):679-690.
13. Lacy SE, Barrans SL, Beer PA, et al. Targeted sequencing in DLBCL, molecular subtypes, and outcomes: a Haematological Malignancy Research Network report. Blood. 2020;135(20):1759-1771.
14. Reddy A, Zhang J, Davis NS, et al. Genetic and functional drivers of diffuse large B cell lymphoma. Cell. 2017;171(2):481-494.e15.
15. Morin RD, Assouline S, Alcaide M, et al. Genetic landscapes of relapsed and refractory diffuse large B-cell lymphomas. Clin Cancer Res. 2016;22(9):2290-2300.
16. Melchardt T, Hufnagl C, Weinstock DM, et al. Clonal evolution in relapsed and refractory diffuse large B-cell lymphoma is characterized by high dynamics of subclones. Oncotarget. 2016;7(32):51494-51502.
17. Juskevicius D, Lorber T, Gsponer J, et al. Distinct genetic evolution patterns of relapsing diffuse large B-cell lymphoma revealed by genome-wide copy number aberration and targeted sequencing analysis. Leukemia. 2016;30(12):2385-2395.
18. Jiang Y, Redmond D, Nie K, et al. Deep sequencing reveals clonal evolution patterns and mutation events associated with relapse in B-cell lymphomas. Genome Biol. 2014;15(8):432.
19. Mareschal S, Dubois S, Viailly P-J, et al. Whole exome sequencing of relapsed/refractory patients expands the repertoire of somatic mutations in diffuse large B-cell lymphoma. Genes Chromosomes Cancer. 2016;55(3):251-267.
20. Rushton CK, Arthur SE, Alcaide M, et al. Genetic and evolutionary patterns of treatment resistance in relapsed B-cell lymphoma. Blood Adv. 2020;4(13):2886-2898.
21. Nijland M, Seitz A, Terpstra M, et al. Mutational evolution in relapsed diffuse large B-cell lymphoma. Cancers (Basel). 2018;10(11):459.
22. Greenawalt DM, Liang WS, Saif S, et al. Comparative analysis of primary versus relapse/refractory DLBCL identifies shifts in mutation spectrum. Oncotarget. 2017;8(59):99237-99244.
23. Scott DW, Mottok A, Ennishi D, et al. Prognostic significance of diffuse large B-cell lymphoma cell of origin determined by digital gene expression in formalin-fixed paraffin-embedded tissue biopsies. J Clin Oncol. 2015;33(26):2848-2856.
24. Barrans SL, Crouch S, Care MA, et al. Whole genome expression profiling based on paraffin embedded tissue can be used to classify diffuse large B-cell lymphoma and predict clinical outcome. Br J Haematol. 2012;159(4):441-453.
25. Newman AM, Liu CL, Green MR, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453-457.
26. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102(43):15545-15550.
27. Tibshirani R, Hastie T, Narasimhan B, et al. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA. 2002;99(10):6567-6572.
28. Lenz G, Wright G, Dave SS, et al; Lymphoma/Leukemia Molecular Profiling Project. Stromal gene signatures in large-B-cell lymphomas. N Engl J Med. 2008;359(22):2313-2323.
29. Smith A, Howell D, Crouch S, et al. Cohort profile: the Haematological Malignancy Research Network (HMRN): a UK population-based patient cohort. Int J Epidemiol. 2018;47(3):700-700g.
30. Araf S, Korfi K, Bewicke-Copley F, et al. Genetic heterogeneity highlighted by differential FDG-PET response in diffuse large B-cell lymphoma. Haematologica. 2020;105(6):318-321.
31. Venet D, Dumont JE, Detours V. Most random gene expression signatures are significantly associated with breast cancer outcome. PLOS Comput Biol. 2011;7(10):e1002240.
32. Shimoni Y. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification. PLOS Comput Biol. 2018;14(2):e1006026.
33. Wilson WH, Wright GW, Huang DW, et al. Effect of ibrutinib with R-CHOP chemotherapy in genetic subtypes of DLBCL. Cancer Cell. 2021;39(12):1643-1653.e3.
34. Hilton LK, Tang J, Ben-Neriah S, et al. The double-hit signature identifies double-hit diffuse large B-cell lymphoma with genetic events cryptic to FISH. Blood. 2019;134(18):1528-1532.
35. Runge HFP, Lacy S, Barrans S, et al. Application of the LymphGen classification tool to 928 clinically and genetically-characterised cases of diffuse large B cell lymphoma (DLBCL). Br J Haematol. 2021;192(1):216-220.
36. Dubois S, Tesson B, Mareschal S, et al; Lymphoma Study Association (LYSA) investigators. Refining diffuse large B-cell lymphoma subgroups using integrated analysis of molecular profiles. EBioMedicine. 2019;48:58-69.
37. Ennishi D, Jiang A, Boyle M, et al. Double-hit gene expression signature defines a distinct subgroup of germinal center B-cell–like diffuse large B-cell lymphoma. J Clin Oncol. 2019;37(3):190-201.
38. Kotlov N, Bagaev A, Revuelta MV, et al. Clinical and biological subtypes of B-cell lymphoma revealed by microenvironmental signatures. Cancer Discov. 2021;11(6):1468-1489.
39. Autio M, Leivonen S-K, Brück O, et al. Immune cell constitution in the tumor microenvironment predicts the outcome in diffuse large B-cell lymphoma. Haematologica. 2021;106(3):718-729.
40. Merdan S, Subramanian K, Ayer T, et al. Gene expression profiling-based risk prediction and profiles of immune infiltration in diffuse large B-cell lymphoma. Blood Cancer J. 2021;11(1):2.
41. Staiger AM, Altenbuchinger M, Ziepert M, et al. A novel lymphoma-associated macrophage interaction signature (LAMIS) provides robust risk prognostication in diffuse large B-cell lymphoma clinical trial cohorts of the DSHNHL. Leukemia. 2019;34(2):543-553.
42. Araf S, Wang J, Korfi K, et al. Genomic profiling reveals spatial intra-tumor heterogeneity in follicular lymphoma. Leukemia. 2018;32(5):1261-1265.

Author notes

Novel data from this publication are available from the Gene Expression Omnibus database (accession number GSE193566). For other forms of data sharing, contact the corresponding author, Findlay Bewicke-Copley (f.copley@qmul.ac.uk).

The full-text version of this article contains a data supplement.

Jun Wang and Jude Fitzgibbon are co-senior authors.