Abstract
Multiple gene expression–based signatures have been identified in diffuse large B-cell lymphoma that are predictive for survival outcomes. Most studies assess predictive significance based on P values from multivariable Cox regression. Few investigations have evaluated the incremental usefulness of these signatures. Recent developments in statistical methodology extend the use of concordance measures on censored survival data. We applied these methods to evaluate the added value in survival risk prediction from 3 published gene-based signatures on 2 sets of patients with diffuse large B-cell lymphoma treated with CHOP or R-CHOP. Our results indicate these gene-based signatures are inferior to clinical factors and provide little added value in risk assessment. To develop highly discriminating risk prediction models, we need to use appropriate approaches and consider more than gene expression. However, the study of gene expression and clinical outcomes retains considerable potential to enhance understanding of disease mechanisms and uncover new therapeutic targets.
Key Points
Gene-based predictors have good discrimination ability; the IPI remains the most powerful predictor of clinical outcome in DLBCL.
One needs to use appropriate approaches and consider more than gene expression to develop highly effective risk prediction models.
Introduction
Risk prediction procedures are valuable tools for cancer management, and risk scoring systems have been established for assessing individual risks in survival outcome for various cancer types. In diffuse large B-cell lymphoma (DLBCL), 5 independent clinical characteristics (age, Ann Arbor stage, serum lactate dehydrogenase, performance status, and number of extranodal sites) are combined in the International Prognostic Index (IPI) to predict outcomes in current clinical practice. Several gene expression–based molecular signatures have been developed for clinical risk prediction. The Lymphoma/Leukemia Molecular Profiling Project reported a 17-gene predictor,1 and a 3-component signature (∼400 genes).2 Lossos et al built a predictive model based on 6 genes.3 Alizadeh et al further simplified this to a 2-gene model as a risk predictor.4 All models were claimed to be independent of the IPI and to add to its predictive power. However, all of these studies assessed predictive significance based on P values from multivariable Cox regression, which provide little knowledge of the added value in individual risk prediction. Despite the discussion of DLBCL risk predictors in the statistical literature,5,6 a key component in assessing risk prediction is its ability to distinguish subjects who will develop an event (progression or death) from those who will not, by a specified time. This concept, known as discrimination, has been well quantified for binary outcomes by concordance measures such as the area under the receiver operating characteristic curve, also referred to as the “C-statistic.” Various concordance measures have been extended to censored survival data in the statistical literature7-9 and are now being used in clinical settings to assess the predictive usefulness of biomarkers.10,11 In this report, we assess the usefulness of 3 published gene-based risk signatures compared with known clinical prognostic factors, with the goal of investigating the added value in survival risk prediction.
Methods
The first gene-expression risk signature is a 6-gene predictor described as (−0.0273 × LMO2) + (−0.2103 × BCL6) + (−0.1878 × FN1) + (0.0346 × CCND2) + (0.1888 × SCYA3) + (0.5527 × BCL2).3,12 The second gene-expression risk signature is a 3-component signature reported as (−0.419 × germinal center B-cell avg) + (−1.015 × stromal-1 avg) + (0.675 × stromal-2 avg),2 where avg is the mean expression of all genes in the given group. There are 39 genes in the germinal center B-cell group, 283 genes in the stromal-1 group, and 72 genes in the stromal-2 group. The third signature was recently published as (−0.0323 × LMO2) + (−0.29 × TNFRSF9).4 The 2 datasets used were introduced by Lenz et al,2 in which pretreatment tumor-biopsy specimens and clinical data were obtained from 233 and 181 patients with newly diagnosed DLBCL treated with the RCHOP or CHOP regimen, respectively. The event rates were 25.8% and 58.0%, with median follow-ups of 2.81 and 7.62 years (2.93 and 7.20 years among living patients), respectively. Datasets were downloaded from GEO as normalized expression levels (GSE10846) and log2 transformed. Two concordance measures were used. The C-statistic is the estimated concordance between prediction and observation (event vs nonevent), that is, the probability that the predicted risk score is higher for subjects with earlier event times; it provides a global measure of a fitted survival model for the continuous event time rather than at a particular follow-up time. The integrated discrimination improvement (IDI) measures overall improvement in sensitivity and specificity; roughly, it is the sum of the increase in risk score for events and the reduction in risk score for nonevents.13,14 Mathematically, the IDI is the increase in R² for the Cox model, with a 0 to 1 range. Multivariable Cox regression was used with 2 sets of prespecified models: 1 model with only clinical prognostic factors or only the gene-based predictor, and 1 model with both clinical factors and the molecular predictor.
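As an illustration of how the 3 signature scores are formed, the following is a minimal sketch that applies the coefficients quoted above to log2 expression values. The function names, dictionary layout, and component labels (`gcb`, `stromal1`, `stromal2`) are assumptions made for this example and are not taken from the original signature publications.

```python
# Coefficients as quoted in the text; all names below are illustrative only.
SIX_GENE = {
    "LMO2": -0.0273, "BCL6": -0.2103, "FN1": -0.1878,
    "CCND2": 0.0346, "SCYA3": 0.1888, "BCL2": 0.5527,
}
TWO_GENE = {"LMO2": -0.0323, "TNFRSF9": -0.29}
THREE_COMPONENT = {"gcb": -0.419, "stromal1": -1.015, "stromal2": 0.675}


def linear_score(coefs, values):
    """Weighted sum of expression values; `values` maps name -> log2 expression."""
    return sum(weight * values[name] for name, weight in coefs.items())


def three_component_score(groups):
    """`groups` maps each component label to the log2 expression values of its
    genes; each component enters the score through its mean (the 'avg' term)."""
    averages = {name: sum(vals) / len(vals) for name, vals in groups.items()}
    return linear_score(THREE_COMPONENT, averages)
```

For a single patient, `linear_score(SIX_GENE, expression)` would return the 6-gene score, where `expression` is a hypothetical mapping from gene symbol to log2 expression level. A higher score corresponds to a higher predicted risk under the sign convention of the formulas.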
The discriminatory capability of the 2 models was evaluated by the C-statistic; the improvement was assessed by the difference in C-statistic and by the IDI. The survival duration for evaluation was 3 and 5 years for the RCHOP and CHOP datasets, respectively. An unbiased estimator for the C-statistic (R package survC1),8 which is robust with respect to the choice of evaluation time, was used. All variables were treated as continuous. Because the CHOP and RCHOP datasets were used to build the 3-component signature2 and the 2-gene model,4 respectively, each of these predictors was not evaluated on the dataset used to build it.
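To make the 2 measures concrete, the sketch below computes a simple pairwise concordance truncated at an evaluation time and a discrimination-slope version of the IDI. Both are simplifications and should be read as assumptions: the concordance is an unweighted Harrell-type estimate, not the inverse-probability-weighted estimator of reference 8, and the IDI function assumes that event status at the evaluation time is known for every subject (no censoring before that time). The function names are hypothetical.

```python
def truncated_c(times, events, scores, tau):
    """Unweighted concordance restricted to pairs whose earlier time is an
    observed event before `tau`. A pair is concordant when the subject with
    the earlier event has the higher risk score; tied scores count one half."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i] or times[i] >= tau:
            continue  # subject i must have an observed event before tau
        for j in range(n):
            if times[j] > times[i]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs before tau")
    return concordant / usable


def discrimination_slope(risks, events):
    """Mean predicted risk among events minus mean predicted risk among nonevents."""
    with_event = [r for r, e in zip(risks, events) if e]
    without_event = [r for r, e in zip(risks, events) if not e]
    return sum(with_event) / len(with_event) - sum(without_event) / len(without_event)


def idi(risks_old, risks_new, events):
    """Gain in discrimination slope when moving from the old to the new model."""
    return discrimination_slope(risks_new, events) - discrimination_slope(risks_old, events)
```

The added value of a molecular predictor in C would then be `truncated_c(..., combined_scores, tau) - truncated_c(..., clinical_scores, tau)`, with `tau` set to the evaluation time; confidence intervals such as those reported here require a resampling or perturbation step that is not shown.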
Results and discussion
All gene-based predictors were significantly associated with survival outcomes (P < .001) when used alone, and they remained significant (P < .001) after adjusting for all clinical prognostic factors in the multivariable Cox model. In the RCHOP validation dataset, the C-statistic was 0.600 and 0.717 for the 6-gene predictor and the 3-component signature, respectively, suggesting good discrimination ability when used alone. However, the performance was inferior to that of the known clinical factors, which had a C-statistic of 0.739. When the predictors were added to clinical factors, the C-statistic increased to 0.752 and 0.771, an improvement of 0.013 (95% confidence interval [CI], −0.021 to 0.047) and 0.031 (95% CI, −0.026 to 0.089) for the 6-gene predictor and the 3-component signature, respectively (Table 1). Further assessment by IDI revealed an added value of 0.001 (95% CI, −0.008 to 0.049) and 0.076 (95% CI, 0.013 to 0.167) for the 2 predictors. Similar trends were observed in validation with the CHOP dataset (Table 1). The C-statistic was 0.678 and 0.619 for the 6-gene predictor and the 2-gene model, respectively, and was 0.721 for the known clinical factors when used alone. The improvements when the predictors were added to clinical factors were 0.002 (95% CI, −0.021 to 0.026) and 0.018 (95% CI, −0.016 to 0.052) in C-statistic, and 0.022 (95% CI, −0.006 to 0.080) and 0.037 (95% CI, −0.002 to 0.098) as assessed by IDI. The improvement was small and was statistically significant only for the 3-component signature in the RCHOP dataset, and there only by IDI. In contrast, clinical factors improved risk prediction significantly; for example, the improvement in C-statistic was 0.146 (95% CI, 0.064 to 0.227) and 0.120 (95% CI, 0.051 to 0.189) when clinical factors were added to the 6-gene predictor and the 2-gene model, respectively, in the CHOP dataset.
Risk factor | RCHOP: C-statistic | RCHOP: Difference in C (95% CI) | RCHOP: IDI (95% CI) | CHOP: C-statistic | CHOP: Difference in C (95% CI) | CHOP: IDI (95% CI)
---|---|---|---|---|---|---
Clinical factors | 0.739 | | | 0.721 | |
Clinical factors + 6-gene predictor | 0.752 (0.600)* | 0.013 (−0.021, 0.047) | 0.001 (−0.008, 0.049) | 0.724 (0.678)* | 0.002 (−0.021, 0.026) | 0.022 (−0.006, 0.080)
Clinical factors + 3-component signature | 0.771 (0.717)* | 0.031 (−0.026, 0.089) | 0.076 (0.013, 0.167) | | |
Clinical factors + 2-gene model | | | | 0.739 (0.619)* | 0.018 (−0.016, 0.052) | 0.037 (−0.002, 0.098)
*C-statistic when the molecular predictor was used alone.
Survival risk scores derived from the multivariable Cox model were used to rank cases, which were then divided into quartile groups. Figure 1 shows the Kaplan-Meier curves of survival probabilities for the 4 groups of patients, using risk scores derived from the model with clinical factors alone and from the models with clinical factors plus molecular predictors. Further investigation of subject-specific incremental value suggests that gene-based biomarkers improve risk prediction only for patients with intermediate risk, and not for patients with high or low risk.
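The ranking and grouping step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used for Figure 1: the function names are hypothetical, ties in risk score are broken by input order, and the Kaplan-Meier function is a plain product-limit estimate without confidence bands.

```python
def quartile_groups(scores):
    """Rank cases by risk score and assign groups 1 (lowest risk) to 4 (highest)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 4 // n + 1
    return groups


def kaplan_meier(times, events):
    """Product-limit survival estimate as (time, S(time)) at each distinct event time."""
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t (events and censored).
        while i < len(pairs) and pairs[i][0] == t:
            deaths += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Applying `kaplan_meier` separately to the follow-up times and event indicators of each group returned by `quartile_groups` gives one curve per quartile, which is the kind of display summarized in Figure 1.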
Although gene-based predictors have good discrimination ability when used alone, the IPI remains the most powerful predictor of clinical outcome in patients with DLBCL. The improvement with the addition of gene-based predictors is not statistically significant in most cases evaluated by the C-statistic and IDI measures. Although patients with intermediate risk by IPI might benefit from additional testing, the clinical utility of the 3 predictors is questionable. P values from Cox models, although they test whether there is an association with outcome, do not measure the separation (discrimination) in predictor scores between patients with and without events. Improvement and refinement could be achieved with the use of appropriate methods such as concordance measures. Toward the goal of risk assessment, we do not intend to compare all approaches, but rather to highlight the need to move forward with more appropriate methods to derive and evaluate predictors, and to consider more than gene expression in developing substantially more effective predictors. However, the study of gene expression and clinical outcomes retains its importance in understanding disease mechanisms and developing new therapeutic strategies.
Presented in abstract form at the 53rd Annual Meeting of the American Society of Hematology, San Diego, CA, December 10, 2011.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The authors thank Dr Donna Neuberg for constructive suggestions and review of the manuscript and 2 reviewers for insightful criticism.
This study was funded by the Department of Biostatistics and Computational Biology R.S. fund.
Authorship
Contribution: F.H. designed the study, analyzed data, and wrote the manuscript; B.S.K. and R.G. provided critical suggestions and evaluated and edited the manuscript; and all coauthors subsequently collaborated on completing the article.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Fangxin Hong, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02115; e-mail: fxhong@jimmy.harvard.edu