Key Points
Involvement of ≥2 extranodal (EN) sites or of skin/soft tissue on PET/computed tomography predicts progression-free survival and overall survival in newly diagnosed FL.
We developed a PET-based prognostic index for untreated FL, without need for bone marrow biopsy; this index did not outperform FLIPI-2.
Abstract
Parameters detected by positron emission tomography (PET) are not included in the usual predictive indices for follicular lymphoma (FL). Data are lacking regarding PET extranodal (EN) factors that predict outcomes in FL. PET scans from 258 patients with untreated grade 1 to 3A FL included in the E2408 randomized phase 2 trial were reviewed. Validation of the prognostic significance of PET factors identified in a previous retrospective study was performed. PET factors with a significant impact on survival outcomes were combined with significant FL International Prognostic Index 2 (FLIPI-2) factors to evaluate the predictive value of a PET-based prognostic index. Presence of ≥2 EN sites and skin/soft tissue involvement on PET were validated as predictors of overall survival. These factors were combined with all FLIPI-2 factors except “positive bone marrow biopsy” to form a PET-based prognostic score. This novel score identified a group of patients at high risk of progression of disease at 24 months. This PET-based score did not perform better than the traditional FLIPI-2 on additional analysis. Presence of ≥2 EN sites and skin/soft tissue involvement on PET predict poor outcomes in untreated FL. Further studies should be performed to determine the validity of a PET-based prognostic index in FL.
Introduction
Follicular lymphoma (FL) is an indolent lymphoma with a generally favorable prognosis, with an estimated 10-year overall survival (OS) of 80%.1 Patients who remain in complete remission 24 months after first-line therapy have a similar life expectancy to that of the general population,2 and progression of disease within 24 months (POD24) of immunochemotherapy (IC) correlates with worse OS.3,4
Numerous clinical and molecular models have attempted to predict outcome and guide therapy in untreated FL. However, these indices do not reliably predict which patients will experience POD24, or “early clinical failure.”5,6 There is a need to better identify patients with FL who are at high risk for POD24.6-8
It is known that lymphoma involvement of extranodal (EN) sites is better detected by [18F]fluorodeoxyglucose (FDG) positron emission tomography (PET)/computed tomography (CT) than CT alone,9,10 and EN involvement in FL has been described as a poor prognostic factor.11 Despite this, the prognostic significance of EN involvement on PET/CT in untreated FL remains uncertain. To address this question, we conducted a retrospective analysis of 613 cases of newly diagnosed FL grade 1 to 3A that had undergone evaluation with a PET scan at diagnosis. Patients were identified using the Mayo Lymphoma database as well as Mayo Clinic patients from the University of Iowa/Mayo Clinic Lymphoma Specialized Program of Research Excellence (SPORE) Molecular Epidemiology Resource (MER) database.12 We found that qualitative PET/CT abnormalities of the spleen, skin/soft tissue, and bone (specifically in a pattern of focal osseous lesions superimposed on diffuse bone marrow involvement, or “focal on diffuse”), as well as involvement of ≥2 EN sites, predicted early clinical failure. In order to validate these findings, we investigated a separate cohort of patients with untreated high-tumor burden FL treated exclusively with bendamustine-based therapy in the Eastern Cooperative Oncology Group - American College of Radiology Imaging Network (ECOG-ACRIN) Cancer Research Group E2408 randomized phase 2 trial.
Our second aim was to evaluate the prognostic significance of a predictive index incorporating the PET factors that were validated in this study. Existing prognostic models in FL do not include parameters detected by FDG PET/CT,5,6 which has become a standard component of the initial workup of FL.6,10,13 Moreover, the traditional Follicular Lymphoma International Prognostic Index (FLIPI) and FLIPI-2 scores include bone marrow biopsy (BMB), which was historically performed as part of the initial FL evaluation; however, this is no longer the case. Novel prognostic indices should be designed to meet the needs of modern clinical practice.
Methods
Patients
FDG PET/CT exams from 258 patients with newly diagnosed FL included in the ECOG-ACRIN E2408 randomized phase 2 trial were reviewed.14 Of these, 9 patients were excluded because of poor image quality. The E2408 study included patients with untreated FL grade 1 to 3A; stage II, III, or IV disease; and high-risk disease, defined as high tumor burden by Groupe D’Etude des Lymphomes Folliculaires criteria and/or a FLIPI score of 3 to 5. Patients were treated with a bendamustine/rituximab backbone for induction, followed by 2 years of rituximab-based maintenance.14
Ethics approval and patient consent
This study was found to be exempt from institutional review board approval by the Northwestern institutional review board (number SP0066660). This study was exempt from patient consent because of its retrospective nature.
FDG PET/CT imaging
Patients enrolled in E2408 were required to have an FDG PET/CT performed within 6 weeks of randomization for baseline assessment, and images were securely stored at IROC RI/QARC (Imaging and Radiation Oncology Core Group Rhode Island/Quality Assurance Review Center) after completion of the study. These PET/CT images were obtained from IROC RI/QARC and reviewed by a trained oncology fellow (F.S.-P.) and nuclear radiologist (S.M.B.), who had also performed the PET/CT imaging review in the initial retrospective analysis by St-Pierre et al.12 MIM software (MIM Software Inc, Cleveland, OH) was used for PET/CT analysis. Location, pattern, and number of EN sites, as well as splenic involvement, were recorded. EN disease was defined as all sites outside of the lymph nodes, spleen, thymus, and oropharyngeal (salivary gland and tonsil) lymphoid tissue. Extension of a nodal site into a surrounding EN site (eg, soft tissue) was considered positive for EN involvement at the site of extension. The criteria for PET-defined osseous involvement were derived from Deauville criteria: diffuse bone FDG uptake equal to or greater than that of the liver and/or the presence of focal osseous lesions with uptake ≥ liver.15 Other organs were considered involved if focal lesions were present with FDG uptake ≥ liver. Spleen involvement was defined as diffuse or focal FDG uptake equal to or greater than that of the liver, and did not qualify as an EN site. Involvement of an EN site in multiple locations (eg, multifocal osseous involvement) was counted as 1 EN site. Although we assume that in most cases, diffuse bone involvement represents marrow involvement and focal involvement represents bone matrix involvement, that distinction could not always be made, and presence of diffuse and/or focal bone involvement was counted as 1 EN site.
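The EN site counting rules described above can be sketched in a few lines of Python. This is only an illustration: the site labels, the `NON_EN_SITES` set, and the `count_en_sites` function are hypothetical names that do not come from the study, and each lesion in the input is assumed to have already been judged positive by the reviewers (FDG uptake ≥ liver).

```python
# Hypothetical site labels; the study does not define a coding scheme.
# Lymph nodes, spleen, thymus, and oropharyngeal lymphoid tissue
# (salivary gland, tonsil) are not counted as extranodal (EN) sites.
NON_EN_SITES = {"lymph_node", "spleen", "thymus", "salivary_gland", "tonsil"}


def count_en_sites(positive_lesions):
    """Count distinct EN sites from (site, location) pairs.

    Multiple lesions in the same site (eg, multifocal osseous
    involvement) collapse to a single EN site.
    """
    en_sites = {site for site, _location in positive_lesions
                if site not in NON_EN_SITES}
    return len(en_sites)


# Toy example, not patient data.
lesions = [
    ("lymph_node", "left axilla"),
    ("spleen", "diffuse uptake"),
    ("bone", "L3 vertebra"),
    ("bone", "right femur"),
    ("soft_tissue", "extension from inguinal node"),
]
n_en = count_en_sites(lesions)
print(n_en)  # 2 (bone counted once, plus soft tissue)
```

Using a set makes the "multiple locations count as 1 EN site" rule fall out of the data structure rather than requiring explicit deduplication logic.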
Statistical analysis
The primary aim of this study was to validate the findings from the St-Pierre et al retrospective analysis of 613 patients from the Mayo Lymphoma database and the Mayo/Iowa SPORE MER database.12 An additional aim was to evaluate the prognostic significance of a predictive index incorporating the PET factors that were validated in this study. Primary end points of the validation study were progression-free survival (PFS) and OS. Time-to-event measures were estimated using the Kaplan-Meier method, and differences between groups were assessed using the log-rank test. Multivariate analyses were performed using Cox proportional hazards regression models. The multivariate analyses included FLIPI-2 factors,16 in addition to factors from the univariate analysis that were statistically significant for PFS and/or OS.
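For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched as follows. This is a minimal, dependency-free illustration on toy data, not the software or code used in the study; the function name `kaplan_meier` and the input layout are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient (eg, months)
    events -- 1 if the event (progression or death) was observed,
              0 if the patient was censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        n_at_risk -= n_leaving
    return curve


# Toy data, not from the trial.
curve = kaplan_meier([6, 12, 12, 18, 24], [1, 0, 1, 0, 1])
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients contribute to the risk set up to their last follow-up but never reduce the survival estimate directly, which is what distinguishes this estimate from a simple proportion of patients alive at a fixed time.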
PET factors that had a statistically significant (P ≤ .05) impact on PFS and/or OS in the validation study were combined with statistically significant FLIPI-2 factors in this cohort to create a PET-based prognostic index for newly diagnosed FL. Outcomes including PFS, OS, and POD24 were assessed in each risk group of the PET-based prognostic index. Outcomes analysis was performed with and without the PET-based factors to confirm their effect on the prognostic index. The analysis of the PET-based prognostic index was subsequently reproduced in the cohort of 613 patients from the Mayo Lymphoma database and the Mayo/Iowa SPORE MER database.12 The discriminatory impact of the PET-based prognostic index was evaluated using the Harrell concordance index (c-index).17,18 A summary of our study schema is outlined in Figure 1.
Validation study summary schema. Hgb, hemoglobin; LN, lymph node; β2M, β2-microglobulin.
Results
Validation study
A total of 249 patients from the ECOG-ACRIN E2408 randomized phase 2 trial were included in the analysis. Patient characteristics are outlined in Table 1. Overall, 191 patients had EN involvement on PET/CT. The most common sites of EN involvement were bone/bone marrow (n = 147), skin/soft tissue (n = 107), lung/pleura (n = 20), genitourinary tract (n = 17), gastrointestinal tract (n = 12), and the liver (n = 5). Presence of 2 or more EN sites and skin/soft tissue involvement on PET/CT were validated as univariate predictors of PFS and OS. In a multivariate analysis, presence of 2 or more EN sites on PET/CT was validated as an independent predictor of PFS (hazard ratio [HR], 1.69; P = .045) and OS (HR, 2.52; P = .009). Multivariate analysis for skin/soft tissue involvement was not significant for PFS (HR, 1.58; P = .085), and trended toward statistical significance for OS (HR, 2.01; P = .059). Multivariate analyses are detailed in supplemental Tables 1 and 2. Two-year OS was 83% in the group with 2 or more EN sites compared with 94% in the group with 0 to 1 EN sites, and 84% in the group with skin/soft tissue involvement compared with 96% in the group with no skin/soft tissue involvement. In subset analyses, these PET factors were predictive of PFS and OS across all E2408 treatment arms. Osseous and spleen involvement, contrary to the previous study, did not predict outcomes. Results are outlined in Table 2.
Patient characteristics from the E2408 validation study
Characteristic | Overall (N = 249), n (%) |
---|---|
Age, years | |
≤60 | 124 (49.8) |
>60 | 125 (50.2) |
Stage | |
II | 8 (3.2) |
IIE | 7 (2.8) |
III | 40 (16.1) |
IIIE | 20 (8.0) |
IV | 174 (69.9) |
Longest diameter of any lymph node | |
≤6 cm | 125 (50.2) |
>6 cm | 124 (49.8) |
Hemoglobin, g/dL | |
≥12 | 185 (74.3) |
<12 | 64 (25.7) |
Bone marrow involvement (biopsy) | |
No | 101 (41.1) |
Yes | 145 (58.9) |
Missing | 3 |
Elevated β2-microglobulin | |
No | 81 (33.2) |
Yes | 163 (66.8) |
Missing | 5 |
No. of EN sites on PET | |
0-1 | 164 (65.9) |
≥2 | 85 (34.1) |
Osseous involvement on PET | |
No | 102 (41.0) |
Yes | 147 (59.0) |
Skin or soft tissue involvement on PET | |
No | 142 (57.0) |
Yes | 107 (43.0) |
Spleen involvement by PET | |
No | 100 (40.5) |
Yes | 147 (59.5) |
Missing | 2 |
Elevated LDH level | |
No | 174 (69.9) |
Yes | 75 (30.1) |
First-line treatment | |
BR arm A∗ | 55 (22.1) |
BVR arm B | 81 (32.5) |
BR arm C | 113 (45.4) |
FLIPI-2: no. of factors | |
0 | 11 (4.4) |
1-2 | 107 (43.0) |
3-5 | 123 (49.4) |
Missing | 8 |
PET-based score: no. of factors | |
0-1 | 64 (25.7) |
2-3 | 101 (40.6) |
4-6 | 79 (31.7) |
Missing | 5 |
PET-based score factors: age, longest diameter of any lymph node, hemoglobin, elevated β2-microglobulin, number of EN sites on PET, and skin or soft tissue involvement on PET.
BR, bendamustine-rituximab; BVR, bendamustine-bortezomib-rituximab; LDH, lactate dehydrogenase.
∗BR arm A, BR induction followed by 2-year rituximab maintenance; BVR arm B, BVR with rituximab maintenance; BR arm C, BR followed by lenalidomide (1 year) with rituximab maintenance.
PET/CT factors as predictors of PFS and OS in the E2408 cohort
Variable | Univariate analysis for PFS: HR (95% CI) | Univariate analysis for PFS: P value | Univariate analysis for OS: HR (95% CI) | Univariate analysis for OS: P value |
---|---|---|---|---|
≥2 EN sites (n = 85) | 1.93 (1.20-3.12) | .007 | 3.32 (1.73-6.38) | <.001 |
Skin/soft tissue involvement (n = 107) | 1.85 (1.14-3.01) | .013 | 2.45 (1.27-4.74) | .008 |
Osseous involvement (n = 147) | 1.37 (0.82-2.28) | .23 | 1.19 (0.62-2.31) | .6 |
Focal on diffuse pattern of osseous involvement (n = 36) | 1.20 (0.56-2.58) | .635 | 1.49 (0.57-3.91) | .42 |
Spleen involvement (n = 147) | 1.02 (0.62-1.66) | .95 | 0.92 (0.48-1.75) | .8 |
CI, confidence interval.
PET-based prognostic score
FLIPI-2 factors that significantly affected PFS and/or OS in the E2408 cohort (P ≤ .05) included age greater than 60 years, longest diameter of any lymph node greater than 6 cm, hemoglobin less than 12 g/dL, and elevated β2-microglobulin. Bone marrow involvement by biopsy did not significantly affect outcomes in this cohort. These results are outlined in Table 3. The 4 significant FLIPI-2 factors were combined with the number of EN sites involved (0-1 vs 2 or more) and presence of skin/soft tissue involvement on PET/CT to form a PET-based prognostic score. Each factor represents 1 point on a 6-point scale. We categorized patients into 3 risk groups based on their scores: scores of 4 to 6 were classified as high risk, scores of 2 to 3 as intermediate risk, and scores of 0 to 1 as low risk. These cutoffs were chosen to maximize the separation among the groups, as assessed by Kaplan-Meier curves and a statistically significant log-rank test.
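As an illustration, the scoring rule described above can be written as a short Python sketch. The `Patient` class, its field names, and the function names are hypothetical and chosen for this example only; the thresholds and risk-group cutoffs are those stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Field names are illustrative, not taken from the study database.
    age_years: float
    longest_node_cm: float
    hemoglobin_g_dl: float
    elevated_b2m: bool        # elevated β2-microglobulin
    n_en_sites: int           # number of EN sites on PET/CT
    skin_soft_tissue: bool    # skin/soft tissue involvement on PET/CT


def pet_score(p: Patient) -> int:
    """One point per adverse factor, for a total of 0 to 6."""
    factors = [
        p.age_years > 60,
        p.longest_node_cm > 6,
        p.hemoglobin_g_dl < 12,
        p.elevated_b2m,
        p.n_en_sites >= 2,
        p.skin_soft_tissue,
    ]
    return sum(factors)


def pet_risk_group(score: int) -> str:
    """Map a score to the 3 risk groups: 0-1, 2-3, 4-6."""
    if score >= 4:
        return "high"
    if score >= 2:
        return "intermediate"
    return "low"


# Toy patient, not from the trial.
example = Patient(age_years=67, longest_node_cm=7.5, hemoglobin_g_dl=13.1,
                  elevated_b2m=True, n_en_sites=2, skin_soft_tissue=False)
score = pet_score(example)
print(score, pet_risk_group(score))  # 4 high
```

Note that no bone marrow biopsy result appears among the inputs; every factor comes from imaging, basic laboratory values, or patient age.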
FLIPI-2 factors as predictors of PFS and OS in the E2408 cohort
Variable | Univariate analysis for PFS: HR (95% CI) | Univariate analysis for PFS: P value | Univariate analysis for OS: HR (95% CI) | Univariate analysis for OS: P value |
---|---|---|---|---|
Age, >60 years (n = 125) | 1.64 (1.00-2.67) | .048 | 1.83 (0.95-3.54) | .073 |
Longest diameter of any lymph node, >6 cm (n = 124) | 2.02 (1.22-3.34) | .006 | 1.74 (0.90-3.37) | .099 |
Hemoglobin of <12 g/dL (n = 64) | 1.76 (1.07-2.89) | .025 | 2.89 (1.53-5.47) | .001 |
Elevated β2-microglobulin (n = 163) | 1.85 (1.05-3.24) | .032 | 3.06 (1.28-7.31) | .012 |
Positive BMB (n = 145) | 1.01 (0.62-1.65) | .967 | 1.37 (0.70-2.69) | .360 |
PFS and OS by PET-based score in the E2408 cohort
A total of 241 patients were included in the analysis for the PET-based prognostic score. Eight patients from the initial cohort of 249 patients were excluded because of missing data for calculation of the FLIPI-2 score. A total of 67 events and 38 deaths were recorded. A high-risk score of 4 to 6 using this index was associated with a 2-year PFS of 70% vs 87% in the intermediate-risk group (score, 2-3), and 91% in the low-risk group (score, 0-1; P < .001). The 2-year OS was 82% in the high-risk group, compared with 93% in the intermediate-risk group, and 97% in the low-risk group (P < .001). Using the traditional FLIPI-2 score16 (1 point for each of the following: age greater than 60 years, longest diameter of any lymph node greater than 6 cm, hemoglobin less than 12 g/dL, elevated β2-microglobulin, and positive BMB), the 2-year PFS was 79% in the high-risk group (score, 3-5), 85% in the intermediate-risk group (score, 1-2), and 100% in the low-risk group (score, 0; P = .081). The 2-year OS was 87% in the high-risk group (score, 3-5), 93% in the intermediate-risk group (score, 1-2), and 100% in the low-risk group (score, 0; P = .023). These results are outlined in Figure 2.
PFS and OS by risk category using the PET-based score compared with FLIPI-2 in the E2408 cohort. (A) PFS using the PET-based score. The KM Ests are shown with 2-year PFS of 70% in the high-risk group (score, 4-6), 87% in the intermediate-risk group (score, 2-3), and 91% in the low-risk group (score, 0-1). (B) PFS using FLIPI-2. The KM Ests are shown with 2-year PFS of 79% in the high-risk group (score, 3-5), 85% in the intermediate-risk group (score, 1-2), and 100% in the low-risk group (score, 0). (C) OS using the PET-based score. The KM Ests are shown with 2-year OS of 82% in the high-risk group, 93% in the intermediate-risk group, and 97% in the low-risk group. (D) OS using FLIPI-2. The KM Ests are shown with 2-year OS of 87% in the high-risk group, 93% in the intermediate-risk group, and 100% in the low-risk group. CI, confidence interval; Est, estimate; KM, Kaplan-Meier; NE, not estimable.
PFS and OS by PET-based score in the Mayo/Iowa SPORE MER cohort
The PET-based score was subsequently evaluated in the cohort of patients from the University of Iowa/Mayo Clinic SPORE MER database.12 Overall, 548 patients were included in the analysis. A total of 214 events and 64 deaths were recorded. Patients without BMB report (n = 65) were excluded. A high-risk score using the PET-based index was associated with a 2-year PFS of 49% vs 69% in the intermediate-risk group and 79% in the low-risk group (P < .001). The 2-year OS was 80% in the high-risk group, compared with 91% in the intermediate-risk group and 97% in the low-risk group (P < .0001). Using the traditional FLIPI-2 score in this cohort, the 2-year PFS was 59% in the high-risk group, 75% in the intermediate-risk group, and 87% in the low-risk group (P < .0001). The 2-year OS was 85% in the high-risk group, 95% in the intermediate-risk group, and 99% in the low-risk group (P < .0001). These results are illustrated in Figure 3.
PFS and OS by risk category using the PET-based score compared to FLIPI-2 in the Mayo/Iowa SPORE MER cohort. (A) PFS using the PET-based score. The KM Ests are shown with 2-year PFS of 49% in the high-risk group (score, 4-6), 69% in the intermediate-risk group (score, 2-3), and 79% in the low-risk group (score, 0-1). (B) PFS using FLIPI-2. The KM Ests are shown with 2-year PFS of 59% in the high-risk group (score, 3-5), 75% in the intermediate-risk group (score, 1-2), and 87% in the low-risk group (score, 0). (C) OS using the PET-based score. The KM Ests are shown with 2-year OS of 80% in the high-risk group, 91% in the intermediate-risk group, and 97% in the low-risk group. (D) OS using FLIPI-2. The KM Ests are shown with 2-year OS of 85% in the high-risk group, 95% in the intermediate-risk group, and 99% in the low-risk group. CI, confidence interval; NE, not estimable.
POD24 by PET-based score
In the E2408 cohort, patients with high-risk disease by the PET-based score had a significantly higher rate of POD24 compared with the intermediate- and low-risk groups (30% vs 13% vs 9%, respectively; P = .002). With FLIPI-2, the difference in POD24 between groups was not statistically significant (21% vs 15% vs 0%, respectively; P = .21). In this cohort, 32 of 76 patients (42%) who experienced POD24 were high-risk by PET-based score.
In the Mayo/Iowa SPORE MER cohort, 237 patients received IC upfront, whereas others received other types of treatment or were placed on observation. Of these 237 patients, 227 had a BMB report available and were included in the analysis. Among patients treated with IC, high-risk disease by the PET-based score was associated with a significantly higher rate of POD24 than that of the intermediate- and low-risk groups (49% vs 23% vs 13%, respectively; P = .003). Results remained significant when stratifying by FL grade (1-2 vs 3A). With FLIPI-2, POD24 was 35% in the high-risk group, 17% in the intermediate-risk group, and 4% in the low-risk group (P = .002). In the Mayo/Iowa SPORE MER cohort, 10 of 72 (14%) patients who experienced POD24 were high risk by the PET-based score. Figure 4 illustrates the rate of POD24 by risk group in both patient cohorts.
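The proportion of POD24 cases captured by the high-risk group in each cohort follows directly from the counts reported above. The short sketch below only reproduces that arithmetic; the function name `share_high_risk` is a hypothetical helper introduced for the example.

```python
def share_high_risk(high_risk_pod24: int, total_pod24: int) -> float:
    """Percentage of POD24 cases that were high risk by the PET-based score."""
    if total_pod24 <= 0:
        raise ValueError("total_pod24 must be positive")
    return 100 * high_risk_pod24 / total_pod24


# Counts as reported in the text for each cohort.
e2408 = share_high_risk(32, 76)
mer = share_high_risk(10, 72)
print(f"E2408: {e2408:.0f}%")                # E2408: 42%
print(f"Mayo/Iowa SPORE MER: {mer:.0f}%")    # Mayo/Iowa SPORE MER: 14%
```

This quantity differs from the POD24 rate within a risk group (eg, 30% in the E2408 high-risk group): it asks what fraction of all POD24 cases the high-risk category would have flagged.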
POD24 using the PET-based score compared with FLIPI-2. (A) POD24 using the PET-based score in the E2408 cohort. The KM Ests are shown with rate of POD24 of 30% in the high-risk group (score, 4-6), 13% in the intermediate-risk group (score, 2-3), and 9% in the low-risk group (score, 0-1). (B) POD24 using FLIPI-2 in the E2408 cohort. The KM Ests are shown with rate of POD24 of 21% in the high-risk group (score, 3-5), 15% in the intermediate-risk group (score, 1-2), and 0% in the low-risk group (score, 0). (C) POD24 using the PET-based score in the Mayo/Iowa SPORE MER cohort. The KM Ests are shown with rate of POD24 of 49% in the high-risk group, 23% in the intermediate-risk group, and 13% in the low-risk group. (D) POD24 using FLIPI-2 in the Mayo/Iowa SPORE MER cohort. The KM Ests are shown with rate of POD24 of 35% in the high-risk group, 17% in the intermediate-risk group, and 4% in the low-risk group. CI, confidence interval; NE, not estimable.
The Harrell concordance statistic
The Harrell c-index was calculated for the E2408 cohort as well as the Mayo/Iowa SPORE MER cohort to compare discrimination for PFS, OS, and POD24 between the 3 risk groups (high, intermediate, and low risk) for both the PET-based score and the traditional FLIPI-2 score. A value of 1 indicates that the model predicts the outcome perfectly, whereas a value of 0.5 indicates discrimination no better than chance. Results of the Harrell c-index for the PET-based prognostic index compared with FLIPI-2 are outlined in Table 4. There was no statistically significant difference in the Harrell c-index between the PET-based score and the traditional FLIPI-2 score (P values between .93 and .99).
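To make the statistic concrete, a simplified version of the Harrell c-index is sketched below on toy data. This is an illustrative implementation, not the code used in the study: the function name is an assumption, pairs with tied event times are ignored for simplicity, and tied risk scores (common when patients are grouped into only 3 risk categories) are credited as half-concordant.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell concordance index.

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j. The pair is
    concordant when the patient who failed earlier also has the higher
    risk score; ties in risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: risk groups coded 3 = high, 2 = intermediate, 1 = low.
c = harrell_c_index(times=[5, 10, 15, 20],
                    events=[1, 1, 0, 1],
                    risk_scores=[3, 2, 2, 1])
print(c)  # 0.9
```

Because many pairs of patients share the same risk group, a 3-level score produces many tied pairs, which pulls the c-index toward 0.5 regardless of how well the groups separate on Kaplan-Meier curves.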
The Harrell c-index for the PET-based score compared with FLIPI-2
Patient cohort | c-index for PFS: PET-based score | c-index for PFS: FLIPI-2 | c-index for OS: PET-based score | c-index for OS: FLIPI-2 | c-index for POD24: PET-based score | c-index for POD24: FLIPI-2 |
---|---|---|---|---|---|---|
E2408 cohort (N = 249) | 0.631 | 0.574 | 0.685 | 0.619 | 0.642 | 0.562 |
Mayo/Iowa SPORE MER cohort (N = 548) | 0.551 | 0.585 | 0.641 | 0.692 | 0.585 | 0.621 |
Discussion
This study reinforces the prognostic value of PET/CT in the diagnostic evaluation of FL. We confirm that the presence of 2 or more EN sites on PET/CT at diagnosis represents an independent predictor of PFS and OS in patients with untreated FL. Skin/soft tissue involvement was validated as a univariate predictor of PFS and OS in these patients. These results were validated in a population of patients with FL treated with contemporary bendamustine-based induction therapy with rituximab maintenance, and these PET factors were predictive of survival outcomes across all E2408 treatment arms.
Spleen and osseous involvement were not validated as independent predictive factors. Similarly, lymphoma involvement on BMB was not predictive of survival in this validation cohort, suggesting that bone marrow involvement in and of itself may not be a strong predictor of outcomes in FL. It is also possible that the sample size of this study may have been insufficient to confirm these as predictive factors.
We evaluated the potential prognostic significance of a PET-based score in untreated FL, using the positive PET factors from this validation study, combined with traditional FLIPI-2 factors. Although skin/soft tissue involvement was not confirmed as an independent predictive factor in our multivariate analysis, it trended toward significance with a P value of .059 for OS, and we considered this to be important enough for inclusion in the prognostic score.
This PET-based score predicted survival outcomes in patients with newly diagnosed FL and identified a group of patients at high risk of experiencing POD24. This score predicted almost half of the patients who eventually experienced POD24 in the E2408 cohort, suggesting that these PET parameters may be particularly effective at predicting outcomes in patients with higher-risk disease who meet criteria for treatment. In the lower-risk Mayo/Iowa SPORE MER cohort, only 4% of patients were categorized as high risk by the PET-based score, and this score identified 14% of patients who went on to experience POD24. In both cohorts, the score appeared less effective at distinguishing between low-risk and intermediate-risk groups. The PET-based score was not proven to be superior to the traditional FLIPI-2 based on the Harrell concordance statistic.
Biopsies were not required to confirm the presence of EN involvement on PET/CT. We believe this is a strength of our study, which was designed to reach conclusions based on imaging alone, without the need for additional invasive testing. Similarly, bone involvement on PET was not confirmed with BMB. BMB and PET findings of bone involvement do not always correlate in FL.19
It is important to note that the PET-based score was developed based on analysis of the E2408 cohort and was subsequently evaluated in the Mayo/Iowa SPORE MER cohort, in which the predictive PET/CT factors were initially identified (Figure 1). These results have not been validated with a third, independent cohort, and calibration or internal validation were not performed. The Harrell c-index requires larger cohorts for optimal accuracy, which represents a limitation to this study.
Further studies will be necessary to confirm the prognostic value of a PET-based score in newly diagnosed FL. In FL, there is a need for a novel prognostic index that uses standard imaging from the initial FL evaluation, that obviates the need for invasive testing (ie, BMB) that is rarely performed in modern clinical practice, and that might predict POD24. Our PET-based score uses straightforward criteria for defining positive EN sites on PET, without the need for a confirmatory biopsy, allowing clinicians to quickly calculate this score based on their interpretation of the imaging in addition to basic laboratory findings and patient characteristics.
We suggest further evaluation of a PET-based score that would include the 2 PET-based factors identified in our validation study and omit BMB results. Additional studies could focus on incorporating other potential adverse PET prognostic factors to the PET-based index, such as baseline metabolic tumor volume,20,21 and standard uptake value of EN sites.19 The use of continuous data values for nonbinary variables should also be explored. The clinical utility of a PET-based score with respect to therapy selection in patients with high-tumor burden FL remains to be determined.
Conclusions
In this validation study, we confirm that involvement of 2 or more EN sites and skin/soft tissue involvement on qualitative PET/CT represent predictors of PFS and OS in newly diagnosed FL, whereas osseous and spleen involvement were not validated. A PET-based score combining these PET factors with traditional FLIPI-2 factors predicted outcomes in both a high-risk and low-risk cohort of patients with untreated FL, and was particularly useful at identifying POD24 in patients who meet criteria for treatment. This PET-based score was not found to be superior to the traditional FLIPI-2, and additional studies will be necessary to evaluate the prognostic significance of a PET-based index in FL.
Acknowledgments
This study was coordinated by the Eastern Cooperative Oncology Group - American College of Radiology Imaging Network (ECOG-ACRIN) Cancer Research Group (Peter J. O’Dwyer and Mitchell D. Schnall, group cochairs) and supported by the National Cancer Institute of the National Institutes of Health (award numbers U10CA180820, U10CA180794, UG1CA233320, and UG1CA233339). This study was funded by the Robert H. Lurie Comprehensive Cancer Center.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Authorship
Contribution: F.S.-P. conceived the study question and data analysis plan, reviewed positron emission tomography (PET) scans, performed statistical analysis, interpreted analysis, and wrote the manuscript; S.M.B. reviewed PET scans and wrote and edited the manuscript; Z.S., M.K., and B.L. performed statistical analysis; M.J.M. performed statistical analysis and interpreted analysis; A.Q., L.B., and L.K. read PET scans in the E2408 study and edited the manuscript; H.S. reviewed PET scans and edited the manuscript; J.N.W. interpreted analysis and edited the manuscript; and T.E.W., B.K., A.M.E., and L.I.G. conceived the study question and data analysis plan, interpreted analysis, and edited the manuscript.
Conflict-of-interest disclosure: B.K. reports consulting roles with Genentech/Roche, ADC Therapeutics, AbbVie, AstraZeneca, BeiGene, Celgene/Bristol Myers Squibb, Kite, Genmab, Lilly, and GlaxoSmithKline; and research funding from Genentech/Roche, AbbVie, AstraZeneca, and BeiGene. The remaining authors declare no competing financial interests.
Correspondence: Leo I. Gordon, Division of Hematology/Oncology, Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, 675 N St Clair St, Suite 850, Chicago, IL 60611; email: l-gordon@northwestern.edu.
Author notes
The data that support the findings of this study are available on request from the corresponding author, Leo I. Gordon (l-gordon@northwestern.edu).
The full-text version of this article contains a data supplement.