The ultimate goal of bringing most new drugs to the clinic in hematologic malignancy is to improve overall survival. However, the use of surrogate end points for overall survival is increasingly considered standard practice, because a well-validated surrogate end point can accelerate the outcome assessment and facilitate better clinical trial design. Established examples include monitoring minimal residual disease in chronic myeloid leukemia and acute leukemia, and metabolic response assessment in lymphoma. However, what happens when a clinical trial end point that is not a good surrogate for disease-modifying potential becomes ingrained as an expected outcome, and new agents are expected or required to meet this end point to demonstrate “efficacy”? Janus kinase (JAK) inhibitors for myelofibrosis (MF) have a specific impact on reducing symptom burden and splenomegaly but limited impact on the natural history of the disease. Since the introduction of ruxolitinib more than a decade ago, there has been modest incremental success in clinical trials for MF but no major leap forward to alter the natural history of the disease. We argue that the clinical development of novel agents for MF will be accelerated by moving away from using end points that are specifically tailored to measure the beneficial effects of JAK inhibitors. We propose that specific measures of relevant disease burden, such as reduction in mutation burden as determined by molecular end points, should replace established end points. Careful reanalysis of existing data and trials in progress is needed to identify the most useful surrogate end points for future MF trials and better serve patient interest.
Introduction
Myelofibrosis (MF) is a chronic, progressive myeloproliferative neoplasm (MPN) associated with impaired quality of life (QoL) and shortened survival. Secondary MF arises from an underlying MPN (essential thrombocythemia or polycythemia vera), whereas primary MF occurs de novo.1 The average age at diagnosis is ∼70 years,2 and the majority of patients are ineligible for allogeneic stem cell transplantation, which is currently the only curative therapy. Median overall survival is 5 to 7 years2 but varies according to the underlying disease features, with prognostic scoring systems identifying low-risk patients with projected survival in excess of 20 years, and very high-risk patients with projected survival of <2 years.3,4
Over the past decade or more, measures of clinical benefit from Janus kinase (JAK) inhibitor therapy have become standard in clinical trials involving patients with MF. A ≥50% reduction in total symptom score (TSS50) and a ≥35% reduction in spleen volume (SVR35) are commonly used end points that have helped demonstrate the improvements that can be achieved with JAK inhibitors.5-8 The next generation of MF clinical trials seeks to improve on JAK inhibitor responses to find a pathway toward long-term disease-modifying potential. We define disease modification as an improvement in overall survival accompanied by a reduction in the allelic burden of disease-associated genetic variants. Two broad approaches are being explored: monotherapy with agents that better target driver mutations (eg, type 2 or mutant-selective JAK2 inhibitors9,10 and mutant-specific monoclonal antibodies against calreticulin [CALR]11); and combination approaches that add to a JAK inhibitor an agent targeting an additional pathway (eg, bromodomain and extra-terminal motif [BET] inhibitors and BCL-2 homology domain 3 [BH3] mimetics).12,13 Our aim is to discuss how to identify an early signal of clinical efficacy of novel agents that goes beyond the established benefits of JAK inhibitor therapy. Because our interest lies in strategies to prevent or delay progressive disease, we focus on the first-line treatment of chronic phase MF (defined by blasts of <10%). Similar end points may have value in subsequent lines of treatment for chronic phase, but alternative end points may be needed for accelerated or blast phase MF.
Surrogate end points and overall survival
The ultimate goal of new treatments for patients with cancer is usually to improve overall survival, but in diseases in which median survival exceeds several years, surrogate end points are often used. This is highly relevant for clinical trials in newly diagnosed MF, in which the robust end points of leukemic transformation or death are rarely observed in the first 2 to 3 years. Surrogate end points may be defined as measurements that are predictive of a true outcome of interest.14,15 Surrogates are typically used because they can be assessed earlier than the true outcome and therefore accelerate clinical development by providing an early measure of efficacy. A surrogate end point should have biological plausibility and must be highly associated with the true outcome of interest.
Perhaps the best example of a surrogate end point is molecular response (MR) assessment using BCR::ABL1 in chronic myeloid leukemia (CML).16 Early MR is predictive of later, deeper levels of MR,17 which are in turn associated with freedom from disease progression and prolonged overall survival (the true outcome). As in CML, it is likely that the greatest impact on the natural history of MF will be achieved by using the most effective agents as first-line treatment. However, unlike CML, no validated surrogates for overall survival are available for use in MF clinical trials. This is now an urgent priority for the field. Types of response that might be linked to disease-modifying potential are summarized in Figure 1.
Figure 1. Responses associated with disease-modifying potential in MF. Response measures commonly assessed in MF are shown. The size of each circle is proportional to its predicted importance as a marker of disease-modifying potential. Our assessment of importance comprises both its biological relevance and the feasibility of assessing the response accurately in routine clinical practice.
Surrogate end points may also be used when the true outcome is difficult to measure objectively. Patient experience and QoL cannot be measured objectively, so patient-reported outcomes using standardized instruments, such as the MF symptom assessment form,8 may be viewed as another form of surrogate end point. Whereas the only biological variable that affects BCR::ABL1 is the level of residual disease, TSS and patient-reported QoL may be influenced by many factors unrelated to the disease being studied (important examples being comorbidities, intercurrent illnesses, and side effects of medicines), so that other measures are likely to correlate better with disease-modifying activity.
JAK inhibitor clinical trials and response assessment
Ruxolitinib, a tyrosine kinase inhibitor of both JAK1 and JAK2, was the first medical therapy to be approved for the treatment of MF.7 It leads to a reduction in spleen size and improvement in symptom burden but does not alter the risk of progression to acute myeloid leukemia (AML). The pivotal phase 3 randomized clinical trials that compared ruxolitinib with placebo (COMFORT-1) or best available therapy (predominantly hydroxyurea or no therapy; COMFORT-2) included a crossover to ruxolitinib.5,6 After 5 years of follow-up in COMFORT-1 there was an improvement in overall survival with ruxolitinib over placebo, despite the crossover.18 A similar trend was seen in COMFORT-2,19 but there was no significant prolongation of overall survival in comparison with best available therapy (Table 1).
Early reports of efficacy with ruxolitinib showed clear improvements in QoL, and measures were developed to quantify this palliative benefit. The MF symptom assessment form, comprising questions that reflect the severity of common symptoms of the disease, was used to calculate a TSS, and a 50% reduction was a key efficacy end point adopted in many JAK inhibitor trials.8 Improvements in TSS are likely a class effect of JAK inhibitors as symptomatic response correlates with changes in cytokines, as well as global QoL and SVR.7 However, despite its demonstrated utility, there are important limitations to TSS50 as a quantitative end point: it assumes that all symptoms have equal value, and that halving is clinically meaningful regardless of the absolute baseline score, which likely oversimplifies the patient experience.20,21
The primary end point in the COMFORT-1 trial was SVR35 at 24 weeks.6 The cutoff of 35% was based on the median reduction in spleen volume in a cohort of 24 patients who achieved a median 50% reduction in palpable spleen length during ruxolitinib treatment.7 Volumetric assessment of spleen size is a more accurate and objective measure of disease burden than palpable spleen length, which can be influenced by body habitus, phase of respiration, and the proportion of the spleen that lies above the costal margin. However, the SVR35 cutoff remains somewhat arbitrary and was not originally selected as a biomarker for overall survival.
Alternative type 1 JAK inhibitors (fedratinib,22 momelotinib,23,24 and pacritinib25,26) have been developed and have been tested as second-line therapy after ruxolitinib or as first-line therapy (Table 1). Although these agents may offer a useful alternative to ruxolitinib, the responses seen are not qualitatively different.
Prognostic factors
Adverse prognosis in MF may be indicated by simple clinicopathological features, such as blood counts (anemia, leukocytosis, thrombocytopenia, and increased blasts), presence of specific constitutional symptoms, and severity of bone marrow (BM) fibrosis.27,28 Perhaps because of the imprecision of measurement, palpable spleen length has not been shown to influence survival in commonly used prognostic scores for MF. Spleen volume is not routinely measured in clinical practice, but a pooled analysis of the COMFORT studies showed that increasing baseline spleen volume was related to shortened overall survival (hazard ratio [HR], 1.14 for each 50 mL increase in volume).29 In the SIMPLIFY-1 (JAK inhibitor–naïve patients) and SIMPLIFY-2 (patients with anemia during ruxolitinib treatment) clinical trials of momelotinib, a larger baseline spleen volume was associated with shortened overall survival in univariate analysis.30 In SIMPLIFY-1, the multivariate analysis did not show an effect of spleen volume, suggesting that it is a covariate with other factors, such as International Prognostic Scoring System risk group, whereas in SIMPLIFY-2 a larger spleen volume was independently associated with shorter survival.23,24
Genetic prognostic factors include karyotypic abnormalities and sequence variants in selected genes.3,31 Driver mutations in JAK2, CALR, and MPL occur in ∼65%, 20%, and 5% of patients, respectively.32 Nondriver mutations in ASXL1, SRSF2, EZH2, IDH1, IDH2, and U2AF1 are recognized as molecular high-risk variants in primary MF, and multiple high-risk variants confer higher risk.3 Variants in other genes have been identified as adverse in MPN more generally.32 The presence of molecular high-risk variants at the time of starting JAK inhibitor therapy is associated with a shorter duration of response and shorter overall survival.33-36 Inflammation is a hallmark of MF and cytokine levels may also have prognostic relevance.37 Elevated levels of interleukin-8 (IL-8) have been associated with shortened survival, although this effect was not independent of clinical risk score.38
Assessment of prognostic factors was developed primarily to enable risk stratification before consideration of allogeneic stem cell transplantation,27,39 but risk scores have been repurposed as an eligibility criterion for MF clinical trials using JAK inhibitors. Although the risk score is not predictive of response, higher risk patients tend to have lesser degrees of clinical benefit and a shorter time to treatment failure.40
Molecular measures of disease burden
The only direct measure of the neoplastic clone routinely available is MR, which can be measured accurately in most patients with MPN using peripheral blood samples.41 Clinical utility of measuring the variant allelic fraction (VAF) of JAK2 V617F during treatment has been demonstrated in patients with MF undergoing allogeneic stem cell transplantation. Detectable JAK2 V617F 3 to 6 months after allograft is associated with a higher incidence of relapse.42 In contrast, in patients with MF treated with JAK inhibitors the average change in JAK2 V617F VAF at week 24 has ranged from −21% to +0.4%.22,43 In the COMFORT-1 study, patients with greater decreases in VAF (first tertile) tended to have a shorter duration of disease at study entry,44 highlighting the greater potential of treatment to induce MR when given earlier in the course of MF. Notably, these JAK inhibitor monotherapy trials have not reported an association between MR and progression-free or overall survival, and also have not reported MR for patients with driver mutations in CALR or MPL. Furthermore, MR in nondriver mutations, such as ASXL1, has not been studied in detail.
Imetelstat (a telomerase inhibitor) showed partial or complete hematologic responses in 21% of patients in a phase 1 trial.45 A randomized study compared 2 different doses of imetelstat in patients with MF relapsed or refractory after prior JAK inhibitor therapy. At week 24, SVR35 was achieved in only 10.2% of patients randomized to the higher dose of 9.4 mg/kg given every 3 weeks, leading to closure of recruitment for futility.46 Despite this, 42.1% of patients achieved ≥25% reduction in VAF of a driver mutation at any time. Patients achieving ≥20% reduction in VAF of a driver mutation tended to have higher rates of SVR and MF grade improvement and longer overall survival (31.6 vs 22.8 months), but these differences were not statistically significant.
Multiple studies are testing the addition of a novel agent to ruxolitinib with the aim of improving response, and 2 of these agents have published MR data. Cohort 1a of the REFINE phase 2 trial tested navitoclax (BH3 mimetic) in combination with ruxolitinib after a suboptimal response or disease progression. There were 26 patients evaluable for MR (JAK2 or CALR), 6 of whom (23%) had a ≥20% reduction in VAF at week 24, which in 5/6 patients was accompanied by a reduction in MF grade. SVR35 at any time was associated with MR (≥20% reduction), reduction in MF grade, reduction in cytokines, and prolonged overall survival.47 Arm 3 of the MANIFEST phase 2 trial tested the combination of pelabresib and ruxolitinib in JAK inhibitor-naïve MF. At week 24 the mean reduction in JAK2 V617F VAF was 13%; a reduction in VAF >25% from baseline was observed in 29.5% of evaluable patients.13
The assessment and interpretation of MR in MF is more complex than in diseases such as BCR::ABL1 CML or PML::RARA AML. The median VAF of BCR::ABL1 at diagnosis of CML was 84%, with a minimum VAF of 45%.48 The median VAF of JAK2 V617F in the COMFORT-2 study was similar, but the minimum VAF was 1%,49 highlighting the more variable clonal structure of MF. In addition, a reduction in VAF is observed only if the MPN clone is suppressed and there is a reciprocal increase in nonmutated cells. This may lead to divergent patterns of MR if a dominant clone (eg, with JAK2 mutation) is suppressed, conferring a competitive advantage on an alternative clone (eg, with TET2 mutation) rather than “wild-type” hematopoiesis. If hematopoiesis lacking the driver mutation does not recover, then there will be no substantial reduction in VAF. For example, in patients with CML treated with imatinib there is minimal reduction in BCR::ABL1 VAF in the first month of treatment,48 although the CML clone is suppressed and hematologic remission is typically achieved within this time.
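The dependence of VAF on recovery of nonmutated hematopoiesis can be illustrated with simple arithmetic. The sketch below is a toy model, not a method from any cited study: it assumes a heterozygous mutation in diploid cells, and the cell counts are hypothetical values in arbitrary units.

```python
def vaf(mutant_cells: float, other_cells: float) -> float:
    """Variant allelic fraction for a heterozygous mutation in diploid cells.

    Assumption (ours, for illustration): each mutant cell carries one mutant
    allele out of two, and every other cell carries none.
    """
    total_alleles = 2 * (mutant_cells + other_cells)
    return mutant_cells / total_alleles


# Hypothetical baseline: 80 units of mutant cells, 20 units of other cells.
baseline = vaf(mutant_cells=80, other_cells=20)

# Clone halved AND hematopoiesis lacking the mutation recovers.
with_recovery = vaf(mutant_cells=40, other_cells=60)

# Clone halved but hematopoiesis lacking the mutation does not recover.
without_recovery = vaf(mutant_cells=40, other_cells=20)

print(f"baseline VAF:          {baseline:.2f}")
print(f"clone halved, recovery:    {with_recovery:.2f}")
print(f"clone halved, no recovery: {without_recovery:.2f}")
```

Under these assumptions, halving the clone halves the VAF only when nonmutated cells expand to fill the gap; without that recovery, the same degree of clonal suppression produces a much smaller change in VAF.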
Leukemia-free survival
Secondary AML arising in a patient with MF carries a poor prognosis,50 and a disease-modifying therapy should reduce the risk of transformation. To date, no medical therapy for MF has shown a reduction in the incidence of AML, which may reflect the limited disease-modifying potential of current therapies. However, the observation that a significant proportion of cases of secondary AML lack the MF driver mutation suggests that suppression of the myeloproliferative clone might not always equate to suppression of the leukemia-initiating clone.51,52 Whole-genome sequencing of hematopoietic colonies has highlighted the degree of clonal diversity of MPN, showing that in some cases the driver mutation was a secondary event acquired after considerable clonal diversity was already established.53,54 In cases in which AML arises from an ancestral or independent myeloid clone, even highly effective MPN therapies may have limited impact on leukemia-free survival.
Other biomarkers of treatment response
SVR
Patients with a greater reduction in palpable spleen length after 6 months of ruxolitinib treatment had a longer time to treatment failure and prolongation of survival in a retrospective analysis.55 In a pooled analysis of the COMFORT 1 and 2 studies, the achievement of SVR10 at week 24 was associated with improved survival, and deeper levels of SVR were associated with a greater benefit (HR, 0.36 for ≥10 to <25% reduction; HR, 0.18 for SVR50).29 Achievement of SVR35 at week 24 was associated with improved survival in both arms of SIMPLIFY-1, but the difference was statistically significant only in patients randomly assigned to treatment with ruxolitinib.30 Despite the established value of SVR as a measure of disease burden, complications such as splenic infarction and portal hypertension may decrease or increase the spleen volume independent of disease burden and treatment response.
BM histomorphometry
The only therapy that commonly induces complete hematologic response is allogeneic stem cell transplantation. Full donor chimerism and suppression of the MF clone is typically achieved in the first month after transplant, yet even with eradication of the malignant clone normalization of BM histology takes a median of 6 months.56 Consequently, early assessment of conventional histological responses with a medical therapy likely has limited sensitivity to detect disease-modifying potential. In patients randomized to ruxolitinib in the COMFORT-2 study, a reduction in MF grade at last assessment (median treatment duration, 2.2 years) was seen in 15.8% of patients, stable grade in 32.2%, and 18.5% had a higher grade of MF.19 The remaining 33.5% of patients were not evaluable because of missing samples, which possibly reflects the reluctance of patients to undergo repeated BM biopsies. In the SIMPLIFY-1 trial the proportion of patients with regression of fibrosis at week 24 was similar with ruxolitinib (22%) and momelotinib (23%) and there was no correlation with anemia improvement, SVR, or overall survival.57 Notably, there was no central review of fibrosis grading, so the robustness of this analysis may be questioned. The reproducibility of fibrosis grading is problematic: even in a study involving 3 expert hematopathologists there was agreement within 1 grade (Bauermeister scale) only 69% of the time.58 In routine practice there is even variability in which grading system is used or whether any structured grade is reported.59 These limitations highlight the need for better standardized assessment of histology if it is to be useful as a biomarker of treatment response.
Regression of fibrosis by ≥1 grade has been reported in published clinical trials with multiple agents with differing mechanisms of action, including JAK inhibitors,43 imetelstat,46 navitoclax,12 parsaclisib,60 pelabresib,13 and zinpentraxin-α.61 With imetelstat, 40.5% of patients on the higher dose (9.4 mg/kg) showed regression of fibrosis, and these patients had a trend to prolonged survival (HR, 0.54; 95% confidence interval, 0.23-1.29).46 Regression of fibrosis with the addition of navitoclax to ruxolitinib was similarly associated with a trend to improved survival.47 Other studies either did not report such an analysis or reported no statistically significant association between fibrosis response and overall survival, acknowledging that these were small studies and inadequately powered to detect an association.
Cytokines
Multiple cytokines are elevated in the plasma of patients with MF and are related to the high burden of constitutional symptoms. JAK inhibitors suppress proinflammatory cytokines, and the degree of improvement in elements of the TSS is proportional to the degree of reduction in certain cytokines.7 Not all cytokines are regulated by the JAK-STAT pathway: tumor necrosis factor α, IL-6, and IL-8 may be maintained by transforming growth factor-β (TGF-β) signaling and remain elevated in patients treated with ruxolitinib.62 AVID200, a TGF-β trap, reduced the plasma level of TGF-β1 in a phase 1 trial involving 21 patients, but there was no demonstrable association between the cytokine reduction and clinical outcomes. Other therapies with different mechanisms of action reduce cytokine levels in MF, including navitoclax,47 parsaclisib,60 pelabresib,13 and zinpentraxin-α.61 Reduced cytokine levels early in treatment have been associated with SVR,47,63,64 and improved cytopenia,63 but there is a lack of data on whether the cytokine responses are independently associated with overall survival. Pragmatically, measuring cytokine levels in clinical practice is difficult and requires prompt processing and analysis at a specialist facility.
Improving cytopenia
An ideal disease-modifying therapy should suppress the neoplastic clone and allow normal hematopoiesis to recover, which would lead to improvement in cytopenia that is associated with improved survival. However, improvement in cytopenia can also occur with therapies that stimulate hematopoiesis or ameliorate ineffective hematopoiesis without substantially reducing the neoplastic clone, such as filgrastim, thrombopoietin receptor agonists, androgens, or erythropoiesis-stimulating agents.65,66 Conversely, worsening cytopenia does not preclude meaningful clinical benefit, as observed with ruxolitinib.6
In the SIMPLIFY trials momelotinib was compared with ruxolitinib or best available therapy (predominantly ruxolitinib) and transfusion independence (TI) at week 24 was observed in 43% to 66.5% of patients randomized to momelotinib and 21% to 49.3% randomized to ruxolitinib/best available therapy.23,24 A statistically significant improvement in overall survival with TI was reported only for those patients on momelotinib.30 In the MOMENTUM trial TI at week 24 was seen in 30% of patients randomized to momelotinib vs 20% with danazol67; the duration of TI was longer with momelotinib than danazol, but longer follow-up may be required to assess effect on survival. Imetelstat resulted in anemia improvement in 25% of patients on the 9.4 mg/kg dose46; an activin receptor trap, luspatercept, resulted in TI in 10% to 26% of patients68; and increased platelet counts were observed in 81% of patients treated with AVID200 but with no data on correlation with survival.
Randomized trials of combination therapy
Multiple phase 2 trials adding a second agent to ruxolitinib are underway or in development and have been reviewed elsewhere.69 Two of these agents have preliminary data from phase 3 randomized placebo-controlled trials: the BH3 mimetic, navitoclax, and the BET inhibitor, pelabresib.
Cohort 1a of the REFINE phase 2 trial tested navitoclax in combination with ruxolitinib after a suboptimal response or disease progression.47 There were 26 patients evaluable for MR (JAK2 or CALR), 6 of whom (23%) had a ≥20% reduction in VAF at week 24, which in 5 of 6 patients was accompanied by a reduction in MF grade. SVR35 at any time was associated with MR (≥20% reduction), reduction in MF grade, reduction in cytokines, and prolonged overall survival.47 The phase 3 TRANSFORM-1 study sought to test ruxolitinib in combination with navitoclax or placebo in JAK inhibitor naïve MF, measuring SVR as the primary end point and TSS as a key secondary end point (ClinicalTrials.gov identifier: NCT04472598)70 but was terminated because of failure to meet its TSS end point.
Arm 3 of the MANIFEST phase 2 trial tested the combination of pelabresib and ruxolitinib in JAK inhibitor-naïve MF. At week 24, SVR35 was achieved in 68% of patients, 28% had an improvement in MF grade, and the mean reduction in JAK2 V617F VAF was 13%. A reduction in VAF of >25% from baseline was observed in 29.5% of evaluable patients.13 The phase 3 MANIFEST-2 study will examine ruxolitinib in combination with pelabresib or placebo in JAK inhibitor–naïve MF (ClinicalTrials.gov identifier: NCT04603495) with similar end points to TRANSFORM-1.71,72
The first results of the TRANSFORM-1 and MANIFEST-2 randomized trials reported approximately a doubling of SVR35 at week 24 in comparison with ruxolitinib plus placebo, with little or no improvement in TSS50.70,72 MRs have not yet been reported. Both studies showed responses that we predict may lead to improved overall survival in longer-term follow-up yet failed to meet prespecified end points for symptom response that could be a barrier to regulatory approval. This experience highlights the need for a reappraisal of study design and key efficacy end points in MF in order to develop treatments that offer a substantial survival advantage.
Recommendations
Existing end points that were developed for JAK inhibitor therapies may no longer capture the spectrum of benefit with novel therapies that have distinct mechanisms of action and target critical pathways in the MPN clone. From a patient and clinician point of view, there is an urgent need for drugs that change the natural history of MF and reduce the incidence of post-MF AML. Presently, the field does not have reliable surrogate end points of longer-term outcome in MF to accelerate clinical development by enabling researchers to prioritize those drugs that show early evidence of disease-modifying potential. It is important that prospective trials measure biomarkers and correlative end points, for which we have summarized some recommendations in Table 2.
An unintended, harmful consequence of the reliance on SVR is that patients without splenomegaly are frequently excluded from clinical trials, even if they have significant complications (eg, cytopenia) that are inadequately addressed by available therapies. The use of composite end points, such as “clinical improvement,”73 that capture different types of benefit should enable more patients to be enrolled in clinical trials and avoid biasing trial results due to skewed patient selection.
Patient-reported outcomes
Healthy individuals have symptoms that impose a ceiling on TSS improvement.74 Hence, there is a narrow window for demonstrable gain from new MF treatments over ruxolitinib using these end points. Medication that is given long-term must be tolerable, and should ideally improve QoL, but changes in TSS may be disconnected from disease response if a medication causes side effects that are captured in the TSS. This point is illustrated by the decline in QoL reported by many patients with CML on imatinib treatment,75 despite the positive impact of tyrosine kinase inhibitor treatment on progression risk and overall survival.76 When investigational therapies are combined with a JAK inhibitor there is the potential for additive toxicity. In this context, any improvement in end points that may have biological importance (eg, SVR and MR) without deterioration in QoL (eg, TSS) should be viewed positively. Indeed, patients themselves see disease-modifying therapy as a high priority,77 and might tolerate a period of reduced QoL in exchange for longer-term benefit.
Spleen volume
Retrospective analysis of clinical trial data should be undertaken to determine the degree of SVR that best correlates with time to treatment failure and overall survival, rather than relying on SVR35. In fact, it is unknown whether the proportional reduction in spleen volume or the absolute reduction in volume or the residual volume is more strongly associated with overall survival. The same applies to loss of response, with a need to validate degrees of increase in spleen volume over time that correlate with clinically relevant adverse outcomes. Radiological measurement of spleen volume should be incorporated into routine clinical practice, wherever possible, to facilitate the subsequent translation of clinical trial findings into practice.
Complete normalization of the spleen size must leave a residual volume, so that it is formally impossible for any patient to achieve SVR100. Ultrasonographic estimation of the normal splenic volume yields a median of 134 mL in females (5th-95th percentiles, 64-231 mL) and 179 mL in males (5th-95th percentiles, 90-334 mL).78 The mean volume of the spleen measured by magnetic resonance imaging in the United Kingdom Biobank study was 167 mL, and splenic volume was higher in males, increased with body weight, and decreased with age.79 Failure to account for the normal spleen volume in calculating SVR introduces a bias to underestimate response. This bias is minor if the baseline splenic volume is substantially increased (as in the COMFORT-1 trial; Table 1), but if the baseline is only modestly increased (many trials requiring a minimum of 450 mL) the percentage reduction that can be achieved is smaller. This measurement bias would become increasingly important if future trials were to enroll patients earlier in the course of the disease with the aim of preventing progression.
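The size of this bias can be sketched numerically. The example below uses the United Kingdom Biobank mean of 167 mL as the "normal" volume and the 450 mL minimum entry volume mentioned above; the 2500 mL baseline, the 300 mL follow-up value, and the `excess_volume_reduction` measure are hypothetical illustrations, not established end points.

```python
NORMAL_SPLEEN_ML = 167.0  # UK Biobank mean quoted in the text


def svr_percent(baseline_ml: float, followup_ml: float) -> float:
    """Conventional spleen volume reduction, as a percentage of baseline."""
    return 100.0 * (baseline_ml - followup_ml) / baseline_ml


def max_achievable_svr(baseline_ml: float, normal_ml: float = NORMAL_SPLEEN_ML) -> float:
    """Largest SVR possible if the spleen returns to a normal volume."""
    return svr_percent(baseline_ml, normal_ml)


def excess_volume_reduction(baseline_ml: float, followup_ml: float,
                            normal_ml: float = NORMAL_SPLEEN_ML) -> float:
    """Hypothetical adjusted measure: reduction as a percentage of the
    volume in excess of normal, rather than of the whole spleen."""
    excess = baseline_ml - normal_ml
    if excess <= 0:
        raise ValueError("baseline volume is not above the assumed normal volume")
    return 100.0 * (baseline_ml - followup_ml) / excess


for baseline in (450.0, 2500.0):  # 2500 mL is an illustrative large spleen
    print(f"baseline {baseline:.0f} mL: ceiling on SVR = "
          f"{max_achievable_svr(baseline):.1f}%")

# A modestly enlarged spleen shrinking from 450 mL to 300 mL (hypothetical).
print(f"conventional SVR: {svr_percent(450.0, 300.0):.1f}%")
print(f"reduction of excess volume: {excess_volume_reduction(450.0, 300.0):.1f}%")
```

In this sketch, a patient entering at 450 mL who loses roughly half of the excess spleen volume still falls short of SVR35, whereas the ceiling for a very large spleen is close to 100%.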
MR
The highly variable VAF of MF-related mutations at diagnosis adds a level of complexity to assessment of MR. MRs can be expressed as a relative reduction from the individual patient’s baseline VAF (eg, an approach commonly used in allele-specific IGH PCR for acute lymphoblastic leukemia) or can be expressed relative to an absolute value (as in the BCR::ABL1 International Scale for CML).80 Ideally, both approaches should be compared so that the method that best correlates with clinical outcomes can be identified. If therapies are developed that frequently induce deeper MR it would become important to move to reporting MR values as for BCR::ABL1 in which the MR value represents either the measured VAF or the calculated detection limit of the individual assay if the target is not detected.81
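The two ways of expressing MR can rank the same patients differently, as the following sketch suggests. The patient values and the assay detection limit are hypothetical, and the function names are ours; the handling of undetected targets follows the BCR::ABL1-style reporting described above, in which the assay's detection limit is reported in place of a measured value.

```python
from typing import Optional, Tuple


def relative_vaf_reduction(baseline_vaf: float, current_vaf: float) -> float:
    """Percentage reduction from the patient's own baseline VAF."""
    if baseline_vaf <= 0:
        raise ValueError("baseline VAF must be positive")
    return 100.0 * (baseline_vaf - current_vaf) / baseline_vaf


def reported_vaf(measured_vaf: Optional[float],
                 detection_limit: float) -> Tuple[float, bool]:
    """Absolute reporting: return (value, detected).

    If the target is not detected (measured_vaf is None), the value reported
    is the detection limit of that individual assay.
    """
    if measured_vaf is None:
        return detection_limit, False
    return measured_vaf, True


# Hypothetical patients: (baseline VAF %, week-24 VAF %).
patients = {"A": (80.0, 40.0), "B": (5.0, 4.0)}

for name, (baseline, week24) in patients.items():
    rel = relative_vaf_reduction(baseline, week24)
    absolute, detected = reported_vaf(week24, detection_limit=0.1)
    print(f"patient {name}: relative reduction {rel:.0f}%, "
          f"absolute VAF {absolute:.1f}% (detected={detected})")
```

Patient A has the larger relative reduction but the higher residual VAF, which is why both approaches would need to be compared against clinical outcomes.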
The use of myeloid sequencing panels often identifies alternative genes (eg, ASXL1) mutated at a substantial VAF.35 Whether nondriver mutations could be used for monitoring response requires investigation, as does the significance of mutations emerging or increasing in VAF during treatment. Comprehensive molecular evaluation of large numbers of patients prospectively assessed at fixed intervals would be helpful to determine the prognostic effect of changing VAF in multiple genes. In fact, serial samples or molecular data already exist within industry-sponsored clinical trials but may not have been analyzed comprehensively when there has been no commercial imperative to do so.
BM histomorphometry
Novel methods to quantify histological response should be explored in clinical trials. Machine learning analysis of BM trephine images has been used to track regression of fibrosis in MPN samples and is likely to be more sensitive than conventional MF grading.82 This type of analysis could be applied retrospectively using stored BM slides from clinical trials with longer follow-up to enable more sensitive analysis of MF regression and its correlation with other end points. Semiquantitative assessment of megakaryocyte number and topography may also be performed using a computational approach.83 Magnetic resonance imaging can provide additional information about the structure of the marrow in the axial skeleton that is captured along with splenic images. Changes in cellularity (measured based on fat content of the marrow) could be detected in response to ruxolitinib treatment,84 and were seen in an exploratory study in 4 patients with MF.85 A correlation between MF grade and uptake of 18F-fluorothymidine was demonstrated using positron emission tomography.86 Imaging approaches that assess a much larger volume of marrow than is acquired in a trephine biopsy have the potential to improve accuracy of marrow histomorphometry.
Clinical trial end points
We propose that clinical trials for MF should be considered in 2 categories: those with palliative intent (which does not preclude a survival benefit), and those that aim to have disease-modifying potential. Established end points, such as improved cytopenia or TSS, remain appropriate for the first category. For drugs that have disease-modifying potential, it is often impractical to wait for demonstration of prolonged survival and therefore it would be useful to identify reliable surrogates for overall and leukemia-free survival. Ideally, such surrogates should be evidence based, derived from analysis of current and past clinical trials (Figure 2). Different surrogate end points may be required for drugs with different mechanisms of action. Consequently, we propose a flexible composite end point for disease-modifying potential that includes MR together with SVR and/or cytopenia improvement.
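The logic of the proposed composite end point (MR together with SVR and/or cytopenia improvement) can be written as a simple rule. This is a sketch only: the class and field names are hypothetical, and the criteria that define each component response are deliberately left as inputs because they would need to be established and validated from trial data.

```python
from dataclasses import dataclass


@dataclass
class PatientResponse:
    # Each flag is assumed to be assessed by criteria defined elsewhere.
    molecular_response: bool
    spleen_volume_response: bool
    cytopenia_improvement: bool


def meets_composite_end_point(response: PatientResponse) -> bool:
    """MR is required, together with SVR and/or cytopenia improvement."""
    clinical_benefit = (
        response.spleen_volume_response or response.cytopenia_improvement
    )
    return response.molecular_response and clinical_benefit


# MR plus cytopenia improvement, without SVR, satisfies the composite.
print(meets_composite_end_point(PatientResponse(True, False, True)))
# SVR alone, without MR, does not.
print(meets_composite_end_point(PatientResponse(False, True, False)))
```

Requiring MR in every case reflects the emphasis on disease burden, while allowing either clinical component keeps the end point flexible for drugs with different mechanisms of action.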
Steps required to develop and validate novel end points as surrogates for disease-modifying potential in MF.
Conclusions
A focus on improvement in splenomegaly and symptoms has been a useful way to capture the important symptomatic benefit and improvement in QoL enjoyed by many patients treated with JAK inhibitors. Results from JAK inhibitor trials consistently show that markers of more advanced disease are associated with inferior outcomes, so major advances will likely come from applying safe and effective therapies in the earlier stages of MF (or in antecedent essential thrombocythemia/polycythemia vera) in which splenomegaly and symptom burden will be less informative than in the past generation of trials. The shift in recent years to find new therapies with greater disease-modifying potential calls for reevaluation of clinical trial end points in MF. Rather than focusing on incremental benefits in established end points, new end points are needed to capture disease-modifying potential that translates to more substantial improvements in overall survival. Substantial data sets have been acquired in academic and pharmaceutical industry trials, including serial histology, cytokine levels, and molecular analyses. These data should be made available for reanalysis to identify biomarkers or end points that better correlate with clinically useful outcomes, such as duration of response, leukemia-free survival, and overall survival. We predict that composite end points that comprise a reduction in MF-associated mutations together with at least 1 established measure of clinical benefit will be predictive of disease-modifying potential and prolonged overall survival. Novel end points will ultimately require validation in future clinical trials.
Acknowledgment
The authors thank Madeleine Kersting Flynn (medical illustrator, QIMR Berghofer Medical Research Institute) for assistance with preparation of the figures.
Authorship
Contribution: D.M.R., S.W.L., and C.N.H. reviewed the literature and wrote the paper.
Conflict-of-interest disclosure: The authors have received no funding in connection with the current work. D.M.R. has received honoraria from Merck and Novartis and has had an advisory role for Keros, Merck, Menarini, Novartis, and Takeda. S.W.L. has had an advisory role or participated in a speakers' bureau for AbbVie and has received funding from Bristol Myers Squibb. C.N.H. has had an advisory role for AbbVie, AOP Orphan Pharmaceuticals, Bristol Myers Squibb, CTI BioPharma, Galacteo, Geron, GlaxoSmithKline, Incyte, Ionis, Janssen, Keros, Merck, Morphosys, Novartis, Silence, and Sobi and has received research funding from GlaxoSmithKline, Morphosys, and Novartis.
Correspondence: David M. Ross, Department of Haematology, Royal Adelaide Hospital, Port Rd, Adelaide, SA 5000, Australia; email: david.ross@sa.gov.au.