Abstract
A fundamental difficulty in testing “targeted therapies” in acute myeloid leukemia (AML) is the limited ability of preclinical models to capture inter- and intrapatient genomic heterogeneity. Clinical trials typically focus on single agents despite the routine emergence of resistant subclones and despite experience in blast-phase chronic myeloid leukemia and acute promyelocytic leukemia that argues against this strategy. Inclusion of only relapsed-refractory, or unfit newly diagnosed, patients risks falsely negative results. There is uncertainty as to whether eligibility should require demonstration of the putative target, and as to the choice of therapeutic end points. Although use of in vivo preclinical models employing primary leukemic cells is the first choice, newer preclinical models including “organoids” and combinations of pharmacologic and genetic approaches may better align models with human AML. We advocate earlier inclusion of combinations ± chemotherapy and of newly diagnosed patients into clinical trials. When a drug plausibly targets a pathway uniquely related to a specific genetic aberration, eligibility should begin with this subset, including patients with other malignancies, with subsequent extension to other patients. In other cases, a more open-minded approach to initial eligibility would facilitate quicker identification of responsive subsets. Complete remission without minimal residual disease seems a particularly useful short-term end point. Genotypic and phenotypic studies should be prespecified and performed routinely to distinguish responders from nonresponders.
Introduction
Recent years have seen explosive growth in trials investigating “targeted” therapies in different hematologic malignancies, including acute myeloid leukemia (AML). Of course, traditional chemotherapy is itself targeted, because it could not produce remissions unless it were more toxic to AML blasts than to normal cells. Here, targeted agents will denote drugs aimed at discrete genetic or “molecular” lesions specific to, or enriched in, AML cells compared with normal cells. This discreteness engenders hope that targeted therapy will be both more effective and less toxic than conventional cytotoxic chemotherapy.
Despite the remarkable advances with molecularly and immunologically targeted therapies in various human malignancies,1-4 their use in AML has met with only modest success.5 Here, we examine possible reasons and propose potentially more rational approaches. First, we discuss the preclinical studies responsible for identification of appropriate targets and the credentialing of therapies presumed effective against these. We examine the current paradigm for targeted agent clinical trials typically characterized by the following: (1) use of only one such agent per trial; (2) considerable delay between first use of the targeted agent and its combination with chemotherapy; and (3) limitation to certain patient groups. Lastly, we consider whether presence of the “target” should be required for eligibility, the related question of optimal biological dose (OBD) vs maximum tolerated dose (MTD), and the choice of therapeutic end points.
Issue 1: preclinical studies: limitations and newer approaches
Discovering effective new therapies in AML is often inefficient, with many drugs eventually proven ineffective after considerable expenditure of time and resources. This inefficiency has prompted recent introduction of “selection” or “pick-a-winner” designs intended to clinically investigate a larger number of agents in a given period of time.6,7
Drug development would likely become more efficient if preclinical models provided more accurate predictions of clinical results. Several impediments stand in the way. In some cases (isocitrate dehydrogenase [IDH], TET2 mutations), there are few human models available for study. In others, the primary leukemia cells from patient-derived specimens that grow out in culture or in experimental animal models do so in a “normal” environment and thus may descend from a selected subpopulation of cells that is insufficiently representative of the entirety of the disease. Here, it is important to note the genomic complexity of AML and the presence of multiple coexisting molecularly defined clones or subclones.8-12 Xenotransplantation experiments confirm that subclones differ functionally in their ability to engraft and produce AML, leading to decreased subclone complexity in the AML seen in xenotransplant recipients,10 with no consistent relation between a subclone’s engraftment potential and its likely contribution to clinical relapse.11 Because preclinical models may select preferentially for the growth of particular clones while ignoring others, results of drug sensitivity testing in these models may not predict clinical results.
Recent studies suggest that investigating drug sensitivity in primary clinical specimens of leukemia grown in vivo in xenograft assays is much more clinically relevant than performing such testing in cell lines that have been passaged for many years in vitro. Assays done in the presence of supporting stroma might in principle provide a more realistic replica of the in vivo situation.13-15 Some limitations remain, however, and again include the problem of an artificial environment leading to spurious clonal selection. Additionally, most preclinical studies are limited in duration (usually 3-6 weeks maximum) and examine only a few dosing approaches and a single efficacy end point. If therapies take weeks to show therapeutic effectiveness clinically, circumscribed preclinical in vitro testing may not permit a realistic prediction of their effectiveness.16
There are signs, however, that research is improving preclinical models. New biological insights have led to improved engraftment of human hematopoietic cells in the NOD/SCID mouse.17 Transgenic strains of the conventional NOD/SCID mouse are better able to engraft primary human AML isolates.18 Within the constraints of translating human doses to mouse doses, treatment of new xenograft and genetically engineered mouse models with doxorubicin and cytarabine reproduces important aspects of AML therapy including kinetics of response/resistance, quantifiable residual disease, and greater responsiveness of pretreatment patient samples than of samples from patients with relapsed disease.19 Efforts are being made in primary patient samples both at diagnosis and relapse to couple ex vivo sensitivity testing to a large panel of targeted agents with detailed genetic profiling.20 Preliminary findings in these systems suggest clonal evolution accompanying relapse, similar to the likely clinical situation, and the intriguing possibility that drugs clinically ineffective at diagnosis might at times be more effective at relapse. The innovative use in solid tumors of primary tumor-derived “organoids” to faithfully mimic the in vivo behavior and drug sensitivity of the whole tumor21-24 has raised great interest, but predictive ability for drug sensitivity testing and translation to the AML context require further investigation.
The ongoing development of preclinical models that may be better replicas of clinical AML supports continued use of such models, including patient-derived xenograft models, despite their acknowledged limitations. We believe it will be important to systematically and formally compare the accuracy of various models in predicting clinical response; this might inform the need for “coclinical trials” investigating agents simultaneously in the human and preclinical context.
Issue 2: each trial involves only 1 “targeted agent”
Why pursue a single-agent clinical approach to early drug development? Substantial evidence of the limited utility of this approach has accumulated.5 A major source of evidence is genetic analyses of AML. The complex genetic landscape of AML often reveals more than a single driver mutation,10 with on average 13 coding mutations reported in de novo AML, and with on average 5 of these occurring in genes that are recurrently mutated consistent with a pathogenic role in AML. Most mutations are stochastically acquired events in normal hematopoietic stem/progenitor cells, with a spectrum of mutations retained after the hematopoietic stem/progenitor cell acquires the relatively small number of “driver” mutations.9 Although apparently auguring well for success following use of single targeted drugs that reverse the effects of these mutations, this small number ignores observations that during AML evolution multiple new critical abnormalities may be acquired that will then act as active drivers in disease progression.10-12 These additional mutations may be downstream of the initial (founding) mutation(s), or they may bypass the founding mutation(s) by using a parallel cellular pathway. In this fashion, and perhaps hastened by therapy, cells from the founding clone frequently give rise to a variety of subclones potentially resistant to therapy. These clones can predominate at relapse making AML a “progressive” disease, molecularly if not clinically.10,11 Certain mutations are mutually exclusive, for example those in genes encoding cohesin and spliceosome proteins, and those in signaling and epigenetic regulatory proteins.8 This mutual exclusivity likely will have important therapeutic implications. Although suggesting that 1 mutation in a specific pathway, and not multiple mutations in the same pathway, is sufficient to contribute to AML, it also suggests the dependence of AML development and growth on multiple distinct pathways. 
Each of these coactive pathways may need to be targeted therapeutically, arguing for a combined treatment strategy.
It is also instructive to examine clinical results with successful targeted drugs. For example, imatinib and its successors target the abnormal tyrosine kinase that results from acquisition of the BCR-ABL fusion central to the pathogenesis of chronic myeloid leukemia (CML). Given the potent kinase inhibition achieved and the key role of activated kinase signaling in CML biology, these agents produce durable remissions and even the potential of cure (eg, cessation of therapy) in patients with chronic-phase CML. However, when used alone in the blastic phase of CML, imatinib and congeners are much less effective,25-29 with cure largely dependent on intensive chemotherapy/kinase inhibitor combination therapy followed by allogeneic hematopoietic cell transplantation (HCT).30 Clinically, the blast phase of CML is more similar to AML than to chronic-phase CML. If such a precisely targeted, successful drug as imatinib does not produce durable remissions as a single agent in blast-phase CML, is it credible that other targeted drugs used alone will be successful in AML? Yet such use is standard, certainly in the initial, often lengthy, stage of investigation of targeted agents.
The need for combined treatment approaches is also clear from the remarkable experience in therapy of acute promyelocytic leukemia (APL). In APL, all-trans retinoic acid (ATRA) used alone is active but does not produce lengthy remissions.31,32 Only the combination of ATRA with chemotherapy33 or with another targeted drug (arsenic trioxide) without classical chemotherapy34 produces durable long-term survival.
Hence, both the frequent emergence of genetically and functionally heterogeneous subclones that lead to relapse and clinical experience in CML and APL suggest that the current focus on trials testing novel targeted agents in isolation is problematic. It seems reasonable and preferable to study, as soon as possible and assuming a plausible biological rationale, several agents in combination, each of which modulates distinct pathways or targets; the agents could be administered simultaneously or sequentially at a stage of early drug development. Thus, ABL001, which binds to a different site on the ABL1 kinase than conventional tyrosine kinase inhibitors such as nilotinib, is designed to prevent the emergence of nilotinib resistance, motivating ABL001-nilotinib combinations, including in BCR-ABL+ acute lymphoid leukemia (ALL).35 Mutations in isocitrate dehydrogenase (IDH) 1 and 2, often thought to be initiating events in AML,8 afford an AML-specific example. Mutant IDH1 and IDH2 enzymes result in accumulation of the oncometabolite R-2-hydroxyglutarate (2-HG) with resultant epigenetic changes and impaired differentiation. AG-221, a first-in-class inhibitor of 2-HG accumulation, has shown activity in IDH2-mutated AML.36 Accumulation of 2-HG may also promote development of AML by disruption of components of the mitochondrial electron transport chain, thus mimicking a state of oxygen deprivation. This in turn leads to dependence on the antiapoptotic protein BCL-2.37 Indeed, IDH1/2 mutant cells have been shown to be more sensitive than IDH1/2 wild-type cells to the highly specific BCL-2 inhibitor ABT-199.37 Although combinations of AG-221 and ABT-199 might reduce addiction to BCL-2 given AG-221’s ability to decrease 2-HG concentrations and thus be counterproductive, pharmacologic inhibitors of the electron transport chain might be expected to increase dependence on BCL-2 and thus increase sensitivity to ABT-199 in IDH1/2 mutant AML. 
Very few combinatorial studies have been done in AML. Although these would require increased collaboration among pharmaceutical companies and perhaps a novel attitude on the part of regulatory agencies, the data discussed previously will hopefully foster such change. Moreover, insightful preclinical studies can help determine the most compelling dose/sequence combinations, thus reducing the time to initial testing of effective specific combinations.
Issue 3: delay in investigating combining targeted agents with chemotherapy
A recurrent theme is the eventual combination of targeted agents with chemotherapy, a successful approach in APL and Ph+ ALL. Although not as chemosensitive as APL, AML is not entirely chemoresistant. Hence, it would be desirable to move to chemotherapy combinations, with for example IDH inhibitors, more quickly than the usual 2 to 4 years. However, single-agent activity of the targeted agent is typically only demonstrated during this time frame; thus, more rapid introduction of combinations would depend on willingness to do so despite uncertainty about the intrinsic single-agent activity of the targeted therapy. Presumably, trials comparing chemotherapy ± the targeted agent could eventually establish the clinical benefit of the latter, if not its single-agent efficacy; the same would apply in the case of combinations of targeted agents. Furthermore, patients and physicians would likely be less concerned about the relative contributions of the targeted agent(s) and chemotherapy to efficacy, and more concerned about pursuing therapeutic approaches that historically have been only modestly successful. Nonetheless, use of combinations before single-agent activity is demonstrated would, understandably enough, often appear unattractive to pharmaceutical companies, given the financial outlays required. This reluctance has been perhaps the principal impediment to more expeditious introduction of combinations.
However, this may be becoming less problematic. For example, the initial phase 1 trial of the MDM2 antagonist RG7388 administered this drug ± cytarabine.38 In an analogous study of RO5429083, a recombinant humanized monoclonal antibody that binds to CD44, patients were treated in parallel with either RO5429083 or RO5429083 plus cytarabine at the relatively high dose of 1 g/m2 daily ×5 days.39 Similarly, a phase 1 dose-escalation study of the anti–CXC chemokine receptor 4 antibody F50067 included arms ± cytarabine.40 Despite the obstacles to more rapid targeted therapy-chemotherapy combinations, the previous studies serve as useful/informative precedents. Although rarely used, statistical designs exist that specifically permit dose finding for multiagent combinations (eg, a targeted agent and chemotherapy)41 and facilitate patient-specific dose finding.42
Issue 4: in which patients should new agents be tested?
Targeted therapy trials are largely limited to patients with relapsed/refractory AML or those newly diagnosed patients considered unfit for conventional chemotherapy. Although truly effective drugs might work even in very advanced disease (as is true with ATRA), it also seems plausible that conclusions about the value of a targeted therapy based solely on testing in relapsed/refractory patients may be falsely negative. Likewise, restricting testing to newly diagnosed, unfit patients impedes introduction of combinations of targeted agents with chemotherapy, despite the possible merit of this approach noted previously.
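The risk of a falsely negative conclusion can be made concrete with a simple binomial calculation. The sketch below is illustrative only: the cohort size, the go/no-go threshold, and both response rates are hypothetical values chosen for the example and are not taken from any trial discussed here. It assumes a single-arm study that declares a drug "inactive" if fewer than a prespecified number of responses is observed, and it compares the chance of that outcome when the same drug is tested in a population with a lower true response rate (as might be the case in relapsed/refractory disease) versus a higher one (as might be the case in newly diagnosed disease).

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def false_negative_rate(n: int, min_responses: int, true_rate: float) -> float:
    """Probability of observing fewer than `min_responses` responses among
    `n` patients, ie, of wrongly declaring a drug with the given true
    response rate to be inactive."""
    return prob_at_most(min_responses - 1, n, true_rate)


# All values below are HYPOTHETICAL and serve only to illustrate the argument.
N_PATIENTS = 20              # patients in a single-arm cohort
MIN_RESPONSES = 4            # responses required to continue development
RATE_RELAPSED = 0.15         # assumed true response rate, relapsed/refractory
RATE_NEWLY_DIAGNOSED = 0.35  # assumed true response rate, newly diagnosed

fn_relapsed = false_negative_rate(N_PATIENTS, MIN_RESPONSES, RATE_RELAPSED)
fn_new = false_negative_rate(N_PATIENTS, MIN_RESPONSES, RATE_NEWLY_DIAGNOSED)

print(f"P(declared inactive | relapsed/refractory cohort): {fn_relapsed:.3f}")
print(f"P(declared inactive | newly diagnosed cohort):     {fn_new:.3f}")
```

Under these assumed numbers the same drug is far more likely to be abandoned when tested only in the less responsive population, which is the concern raised above; different assumptions would of course give different magnitudes.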
Thus, we suggest extension of initial trials of targeted therapies to other patient subsets. At first, these would include (1) newly diagnosed younger adults with high-risk disease (eg, evidenced by a monosomal or complex karyotype,43,44 TP53 mutation,45 or FLT3 internal tandem duplications [ITDs]46) and (2) fit, newly diagnosed older patients. Because both these groups are chemotherapy naïve, they may not have developed resistance-conferring mutations. Even fit older patients are at high risk but, although fit, are much less likely than younger patients to receive HCT; consequently, the effect of the targeted therapy on relapse or survival is less likely to be confounded with that of HCT in older patients. Eventually, even younger patients at better risk could be included, also with the intent of reducing false negatives. Given improvements in supportive care,47 consideration might be given in these cases to a window approach, similar to that used in solid tumors,48 in which targeted therapies are used alone followed, in case of failure, by administration of chemotherapy.
Another possible avenue for testing new therapies is patients in complete remission (CR) but at high risk of relapse, as evidenced for example by cytogenetics,41-44 FLT3 ITD status,46 or, particularly, the presence of minimal residual disease (MRD).49,50 One advantage is the opportunity to gain a better appreciation of toxicity to normal hematopoietic progenitors. A second is the possibility of discovering therapy whose effectiveness might have been overlooked in the setting of higher disease burden. This putative greater effectiveness might simply reflect self-selection such that patients in CR will do better with any active therapy; alternatively, it might result from the lower volume of disease associated with CR, which might be particularly conducive to fostering response with targeted therapy. Thus, outcome with reduced-intensity HCT, in effect a targeted therapy relying on an immunologically mediated graft vs AML effect, is superior in patients in remission to that in patients with active disease.51 The same principle underlies a phase 1 trial of an anti–killer inhibitory receptor monoclonal antibody (IPH2101) intended to restore the anti-AML activity of natural killer cells in elderly patients in first CR52 and a phase 3 trial of interleukin-2 (IL-2) plus histamine dihydrochloride also designed to enhance the function of cytotoxic lymphocytes in patients in first or greater CR.53 Although patients randomized to IL-2 plus histamine rather than no treatment after “consolidation therapy” had longer leukemia-free survival (P = .01), the effect size was relatively modest, and IL-2 plus histamine has not found widespread use. This suggests a potential disadvantage of the CR setting, specifically the predominance in CR of resistant subclones that either were present at diagnosis or emerged under therapy.10-12 A logistical disadvantage is the much longer time needed to observe relapse than to observe CR. 
This time might be shortened using decrease in MRD as an end point, as discussed subsequently (“Issue 6: potential end points to assess efficacy”). Certainly, evaluation of a new drug in CR should be accompanied by MRD assessments in order to obtain a quantitative measure of leukemia response.
Issue 5: must the “target” be present for the patient to be eligible?
An issue related to patient selection is whether eligibility should require demonstration of a drug’s presumed target. A valid criticism of current trials is that they ignore interpatient heterogeneity by allowing entry of a wide range of patients, although response may only occur in specific patient subsets. We will consider 2 different scenarios. In the first, there is strong compelling evidence from preclinical work (recalling its limitations discussed previously) that the drug targets a specific pathway uniquely related to a particular gene aberration. Examples are antibodies that bind to cell surface receptors and drugs that only target mutant forms of a specific protein, for example a drug that targets the downstream effects of an IDH gene mutation36 or a mixed-lineage leukemia rearrangement.54 Here, development of a drug would proceed in a specific molecularly defined AML subset. Indeed, one could even follow a “basket approach” involving different cancers that all share similar gene mutations.55 One example is the BRAF gene mutation apparent in a variety of tumors of different tissue origin such as melanoma1 and hairy cell leukemia.56 In such cases, classical morphologic boundaries can be ignored and drugs tested in patients with various tissue tumor types that share identical pathway abnormalities. Here, dose finding could more rationally focus on identifying a dose that modulated the drug’s target (OBD) than on identifying an MTD, particularly given the extra time required to discover an MTD. Also, in this more homogeneous context, a detailed understanding of the characteristics that distinguish responsive and nonresponsive disease can be obtained.
In the second scenario, the preclinical rationale is less absolute. Here, drug development would entertain a more open-minded approach. For example, a particular targeted drug may affect more than a single mutated target or perturbed intracellular pathway, even if this is currently unknown. Thus, the drug may be effective beyond the narrow frame of the predefined abnormality. Likewise, AMLs with different genetic abnormalities may use common pathways. This open-minded approach would permit broader eligibility and, although focusing more on identification of an MTD than an OBD, would be strongly supported by molecular correlative and exploratory studies to potentially discover additional responsive AML subtypes. For example, although often considered an “FLT3 ITD inhibitor,” sorafenib is a multikinase inhibitor,57 and in a trial randomizing patients aged 18 to 60 with newly diagnosed AML to 3+7 ± sorafenib, the beneficial effect of sorafenib was largely attributable to results in FLT3-ITD-negative patients.58 Moreover, although a more selective FLT3-ITD inhibitor with fewer “off-target” effects, quizartinib appears to be active in FLT3-ITD-negative AML.59 Thus, quizartinib may also affect other clinically relevant targets. Another example is an ongoing North American trial evaluating dasatinib as a KIT inhibitor in core-binding factor AML; the trial allows entry of patients with and without a KIT mutation, hypothesizing that overexpressed wild-type KIT may also be a valid target for dasatinib, perhaps even better than the KIT mutations seen in AML that are not highly sensitive to this agent.60 This unrestricted eligibility study design facilitates discovery of a range of potentially drug-responsive subsets.
Although each of the 2 scenarios discussed previously is realistic in principle, distinguishing which scenario is applicable to a given drug will often be problematic. Generally, we advocate that early trials of targeted therapies be pursued in specific, genotypically defined populations, followed, particularly if clinical responses are seen, by studies in a broader AML population. When the preclinical rationale seems compelling, we strongly endorse molecular-specific basket trials to increase the likelihood of clinical benefit and accelerate drug development. In more opaque situations, patients with and without the target might be enrolled while emphasizing enrollment of the former, with detailed genomic profiling embedded in end-point studies in order to post hoc identify predictors of response not conceived initially.
Issue 6: potential end points to assess efficacy
Several end points have been used to evaluate the efficacy of targeted therapies. There is almost certainly no single ideal end point. Particularly in relapsed/refractory patients, the ability of the new therapy (for example quizartinib) to lead to HCT (“bridge to transplant”) is often reported.61 However, this end point perhaps erroneously presupposes that success following HCT does not depend on the response to the preceding therapy.50 Nonetheless, in relapsed/refractory patients, achievement of CR may not be realistic, and CR + CRi (incomplete platelet recovery) might be a preferable end point, recalling that CRi requires both a marrow blast count <5% and enumeration of at least 200 cells to assure that a therapy does not merely produce general aplasia.62 In less advanced patients, CR appears a reasonable end point but suffers from a lack of a consistent relation with prolongation of survival.63,64 This may reflect observations that in a substantial proportion of CRs defined by morphology and blood counts, MRD, detected using molecular testing or multiparameter flow cytometry, is present.49 A correlation between CR and survival might be more obvious if the criterion for CR is extended to include a marrow without evidence of MRD. Indeed, achievement of “stringent CR,” rather than only CR, after autologous transplant for multiple myeloma is associated with longer survival, independently of other covariates.65 Particularly once results of multiparameter flow cytometry testing become more reproducible, we believe the more robust CR without MRD should replace CR as an end point, as in ALL.66 Similarly, although the end point for approval of new drugs in AML has typically been “overall survival” (OS), we believe that this might well be replaced with “event-free survival” (EFS). 
Preliminary observations suggest that EFS is at least as effective in forecasting OS as the prostate-specific antigen test is in predicting the presence of prostate cancer or the widely used HCT comorbidity index is in predicting nonrelapse mortality after HCT (see Luskin et al67 and M. Othus and E.E., unpublished observations). Furthermore, EFS takes less time to evaluate than OS and is also a less confounded indicator of the value of a new drug or drug combination than OS, whose length depends on therapy given after the new treatment has failed. Precedents for use of criteria other than improved survival are the approvals granted by the Food and Drug Administration to novel targeted therapies in melanoma and lung cancer.68,69
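The argument that EFS is less confounded than OS can be illustrated with a minimal sketch. It assumes a commonly used working definition of EFS (time from study entry to the first of failure to achieve CR, relapse, or death from any cause), which is an assumption of this example rather than a definition given in the text; protocols differ, for instance in the day to which induction failure is assigned. The `Patient` record, its field names, and the two example patients are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Patient:
    """Hypothetical per-patient record; all times are days from study entry."""
    followup_days: int                            # last contact if no event
    induction_failure_day: Optional[int] = None   # failure to achieve CR
    relapse_day: Optional[int] = None
    death_day: Optional[int] = None


def overall_survival(p: Patient) -> Tuple[int, bool]:
    """Return (time, event_observed) for OS: the only event is death."""
    if p.death_day is not None:
        return p.death_day, True
    return p.followup_days, False  # censored at last follow-up


def event_free_survival(p: Patient) -> Tuple[int, bool]:
    """Return (time, event_observed) for EFS under the assumed definition:
    the earliest of induction failure, relapse, or death."""
    events = [d for d in (p.induction_failure_day, p.relapse_day, p.death_day)
              if d is not None]
    if events:
        return min(events), True
    return p.followup_days, False  # censored at last follow-up


# Two hypothetical patients who relapse on the same day after the study drug.
# One is rescued by subsequent therapy; the other is not.
salvaged = Patient(followup_days=900, relapse_day=200)
not_salvaged = Patient(followup_days=900, relapse_day=200, death_day=320)

for label, p in (("salvaged", salvaged), ("not salvaged", not_salvaged)):
    print(label, "EFS:", event_free_survival(p), "OS:", overall_survival(p))
```

In this illustration the two patients have identical EFS, because the study treatment failed at the same time in both, whereas their OS differs solely because of therapy given after that failure. EFS is also never longer than OS for a given patient, consistent with its shorter time to evaluation.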
Summary
We advocate the earlier use of combinations of targeted agents and of targeted agents combined with chemotherapy, and expansion of testing to encompass high-risk newly diagnosed patients including younger patients and fit, elderly patients, with eventual expansion to lower-risk newly diagnosed patients. Although there is no single optimal end point, we believe eradication of MRD and EFS have advantages over OS as a primary end point for new drug development. To ensure a homogeneous population, we support initial limitation of eligibility to patients in whom the target is present in cases in which a drug plausibly targets a specific pathway uniquely related to a particular gene aberration, with inclusion of patients with other cancers sharing the same potentially critical abnormality/pathway (basket approach). However, in cases in which the rationale is less absolute, we believe a more open-minded initial approach to eligibility might facilitate discovery of additional responsive subsets, coupled with intensive genomic and functional studies to correlate response heterogeneity with subset-specific features. Critically, preclinical model systems need to be representative of the disease, and several of the newer approaches discussed previously may be useful in this regard. Few studies include prospective correlative studies based on detailed genomic profiling; consequently, correlative studies to identify responsive subtypes are performed only retrospectively and in patients with adequate samples banked for such studies. Accordingly, there is pressing need for sophisticated genotypic and phenotypic studies, specified in advance and performed routinely both before and, if possible, after therapy to distinguish responders from nonresponders, thus identifying specific subsets that may benefit from a specific targeted therapy.
Acknowledgments
The authors acknowledge the many colleagues and reviewers whose comments have materially improved the manuscript.
Authorship
Contribution: E.E., R.L.L., and B.L. conceived of the paper, analyzed data, and wrote the manuscript.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Elihu Estey, 825 Eastlake Ave E, Seattle, WA 98109; e-mail: eestey@uw.edu.