Abstract
Background: Although progression-free survival (PFS) in early-stage (ES) classic Hodgkin lymphoma (cHL) is high, a subset of patients (pts) remains at higher risk of worse outcomes, particularly those with unfavorable-risk disease. Leveraging pt-level data from the HoLISTIC (Hodgkin Lymphoma International STudy for Individual Care) Consortium, the E-HIPI was developed in clinical trial pts and validated in real-world registry pts to predict 2-year (y) PFS (Rodday. NEJM Evidence 2025). Continuous, objective, and readily measurable pre-treatment variables were considered, and sex, maximum tumor diameter, albumin, and hemoglobin were included as significant predictors. Using the E-HIPI, interactive calculators for pt risk, comparison, and stratification were created (https://rtools.mayo.edu/holistic_ehipi/). Although the E-HIPI was developed and validated in both unfavorable and favorable ES cHL pts, its clinical predictors are primarily markers of unfavorable disease. In the development and initial validation cohorts, 2y PFS was 93.7% and 90.3%, respectively; the C-statistic was 0.63 in both cohorts; calibration was strong, with slight underprediction of events in the validation cohort; and the E-HIPI outperformed the historical binary classification of pts as favorable or unfavorable by EORTC criteria. However, because pts in the original E-HIPI development and validation cohorts were primarily treated with ABVD chemotherapy protocols, additional validation is needed, particularly among pts with unfavorable ES cHL who were initially treated with more intensive chemotherapy platforms (e.g., BEACOPP-based). We therefore conducted an external validation of the E-HIPI in pts treated on recent GHSG trials for unfavorable ES cHL.
Methods: The validation cohort included 2367 adults (ages 18 to 60y) with newly diagnosed unfavorable ES cHL from randomized phase 3 trials conducted by the GHSG (HD14, von Tresckow. JCO 2012; HD17, Borchmann. Lancet 2021). We calculated the predicted probability of a 2y PFS event for each pt from the E-HIPI model equation (scale 1-100%). Model performance was assessed using discrimination and calibration. Discrimination, the model's ability to separate high- and low-risk pts, was assessed using Harrell's C-statistic. Calibration, the model's ability to accurately predict absolute risk, was assessed by comparing observed and predicted probabilities of 2y PFS events within quintiles of predicted probability and by estimating calibration-in-the-large (predicted minus observed event rate) and the calibration slope.
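The two headline metrics can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names and the five-patient toy data are hypothetical, the observed event rate is assumed to come from a Kaplan-Meier estimate at the 2y horizon, tied follow-up times are simply treated as non-comparable, and the calibration slope (which would need a regression on the model's linear predictor) is omitted.

```python
import numpy as np

def harrell_c(time, event, risk):
    """Harrell's C: share of comparable pairs in which the patient with the
    earlier observed event also has the higher predicted risk (ties count 0.5)."""
    time, event, risk = (np.asarray(a) for a in (time, event, risk))
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # pairs are anchored on a patient with an observed event
        later = time > time[i]  # patients followed event-free beyond time[i]
        comparable += int(later.sum())
        concordant += (risk[i] > risk[later]).sum() + 0.5 * (risk[i] == risk[later]).sum()
    return concordant / comparable

def km_event_prob(time, event, horizon):
    """Kaplan-Meier estimate of the probability of an event by `horizon`."""
    time, event = np.asarray(time), np.asarray(event)
    surv = 1.0
    for t in np.unique(time[(event == 1) & (time <= horizon)]):
        at_risk = (time >= t).sum()
        events_at_t = ((time == t) & (event == 1)).sum()
        surv *= 1.0 - events_at_t / at_risk
    return 1.0 - surv

# Hypothetical toy data: follow-up in years, event flag (1 = PFS event),
# and model-predicted probability of a 2y PFS event.
time = np.array([0.5, 1.0, 1.5, 3.0, 4.0])
event = np.array([1, 0, 1, 0, 0])
pred = np.array([0.10, 0.02, 0.03, 0.08, 0.05])

c_stat = harrell_c(time, event, pred)
obs_2y = km_event_prob(time, event, horizon=2.0)
citl = pred.mean() - obs_2y  # calibration-in-the-large: predicted minus observed
```

A positive calibration-in-the-large indicates that the model predicts more events than were observed; applying `km_event_prob` within quintiles of `pred` would give the observed side of a quintile-level calibration comparison.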
Results: Among pts treated on GHSG trials for unfavorable ES cHL, mean age was 33y, 95% had stage II disease, the mean baseline E-HIPI score (i.e., predicted 2y PFS event rate) was 6.0% (SD=2.3%), and the observed 2y PFS rate was 96.2% (95% CI: 95.4-97.0%). The C-statistic was 0.65 (95% CI: 0.59-0.70), calibration-in-the-large was 0.02, and the calibration slope was 1.08 (95% CI: 0.52-1.64). The observed 2y PFS event rate was 1.6% in the lowest-risk E-HIPI quintile, compared with 6.1% in the highest-risk quintile.
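As a rough consistency check on these figures, and assuming the observed 2y event rate is simply the complement of the observed 2y PFS rate, the reported calibration-in-the-large can be approximately recovered from the mean predicted and observed rates:

```latex
\[
\text{calibration-in-the-large}
  = \bar{p}_{\text{pred}} - p_{\text{obs}}
  \approx 6.0\% - (100\% - 96.2\%)
  = 6.0\% - 3.8\%
  = 2.2\% \approx 0.02
\]
```

The positive sign indicates that, on average, the model predicted slightly more 2y PFS events than were observed.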
Conclusions: In this external validation of the recently published E-HIPI in pts with unfavorable ES cHL treated on GHSG trials using more intensive BEACOPP-based protocols, discrimination was similar to that in the original development and validation cohorts. Collectively, these findings support the E-HIPI as a treatment-agnostic prediction tool across varied treatment regimens for unfavorable ES cHL pts. Better 2y PFS in the GHSG trials than in the E-HIPI development and initial validation cohorts, likely reflecting differences in treatment regimens, may have contributed to mild miscalibration (overprediction of events). This may be resolved by recalibrating the model or by incorporating granular treatment variables into future models. Additional research is needed to understand E-HIPI performance in pts with favorable disease, who have even lower event rates, and to more fully predict individualized pt outcomes across the varied chemotherapy platforms and radiation received by all ES cHL pts.