Detection of minimal residual disease (MRD) in acute lymphoblastic leukemia (ALL) predicts outcome. Previous studies were invariably based on relative quantification and did not investigate sample-inherent parameters that influence test accuracy, which makes comparisons and clinical conclusions cumbersome. Hence, we conducted a prospective, population-based MRD study in 108 sequentially recruited children with ALL uniformly treated with the ALL-Berlin-Frankfurt-Münster (ALL-BFM) 95 protocol in Austria (median follow-up of 40 months). Using sensitive, limited antibody panel flow cytometry applicable to 97% of patients, we investigated 329 bone marrow samples from 4 treatment time points. MRD was quantified by blast percentages among nucleated cells (NCs) and by absolute counts (per microliter). Covariables such as NC count, normal B cells, and an estimate of the test sensitivity were also recorded. Presence and distinct levels of MRD correlated with a high probability of early relapse at each of the time points studied. Sequential monitoring at day 33 and week 12 was most useful for predicting outcome independently from clinical risk groups: patients with persistent disease (≥ 1 blast/μL) had a 100% probability of relapse, compared to 6% in all others. Absolute MRD quantification was more appropriate than relative, due to considerable variations in total NC counts between samples. Regeneration of normal immature B cells after periods of rest from treatment limited the test sensitivity. In conclusion, MRD detection by flow cytometry is a strong and independent outcome indicator in childhood ALL. Standardization regarding absolute quantification on the basis of NCs and assessment during periods of continuous treatment promise to increase the accuracy, simplicity, and cost efficiency of the approach.

Measurement of minimal residual disease (MRD) by flow cytometry (FC) or polymerase chain reaction (PCR) emerges as an attractive new tool for risk assessment in childhood acute lymphoblastic leukemia (ALL).1 Several studies with FC, which bears the methodologic advantage of being relatively simple and quick, have demonstrated that MRD detection based on leukemia-associated phenotypes correlates with outcome and should be applicable to a majority of children with ALL.2-6 Two large FC investigations established the significance of different levels of MRD at sequential time points in therapy, proving that MRD is an independent outcome indicator.7-9 Between these studies and in comparison to the most relevant PCR-derived data,10,11 divergences exist regarding the proportion of MRD+ patients at distinct treatment time points, the prognostic relevance of certain MRD levels, and the general applicability of techniques that compete in the quest for the most favorable expenditure/efficacy profile. Because quantitative data generated by PCR and FC seem largely interchangeable,12 it needs to be determined in the future whether these divergences are mainly caused by differences between therapeutic regimens or by technology. Notably, all studies published to date have relied on only relative measurements of MRD (relative to leukemic genome copy standards or to cells of a sample) mostly among mononuclear cell preparations. It is therefore tempting to speculate whether methodologic standardization on the basis of truly absolute quantification of MRD among total nucleated cells (NCs) in combination with a better understanding of the dynamic changes of cell content and cellular composition of the bone marrow (BM) during treatment provides even more accurate and prognostically reliable information.

Hence, we conducted a prospective FC study that allowed investigation of the values of relative and absolute MRD quantification as well as the modalities of assessment in pediatric ALL patients with risk-directed therapy. Our method was based on a limited antibody panel approach applicable to most patients due to a combination of techniques directed toward investigation of B-cell precursor (BCP) and T-lineage phenotypes, and enabled the detection of one leukemic cell among at least 10⁴ normal cells.4,5,13-18

Patients and samples

Between April 1996 and October 1998, 139 children with ALL were consecutively enrolled in a nationwide Austrian multicenter (n = 12) treatment study based on the ALL Berlin-Frankfurt-Münster (BFM) 95 protocol. From 108 of these patients, diagnostic and follow-up BM samples (treatment time points: day 15, day 33, week 12, weeks 22-24, occasionally further) could be obtained for the prospective investigation of MRD (of note, these time points were essentially the same as those in the study of van Dongen et al11). BM investigations were approved by the institutional ethical committees and were done according to informed consent guidelines. The median age of the 108 study patients at diagnosis was 3.8 years (range, 0.08-17.08 years), and there were 45 girls and 63 boys. Sixty-four patients (59%) had a leukocyte count less than 20 × 10⁹/L at presentation. One hundred patients had a BCP-ALL (5 pro-B, 67 common, and 28 pre-B ALLs), and there were 8 patients with T-ALL (7.4%). Genetic analyses disclosed a t(9;22) or BCR/ABL rearrangement in 6 patients, and a t(4;11) or MLL/AF4 rearrangement in 2. Nine patients (8.3%) responded poorly to the prednisone prephase (> 1000 blasts/μL blood on day 8 of induction), and only 2 children did not acquire remission at day 33. Risk group stratification according to the protocol was based on features at presentation and on therapy efficacy.19 Thirty-five patients (32.4%), characterized by a combination of good prednisone response, leukocyte counts less than 20 × 10⁹/L, a non-T immunophenotype, and age between 1 year and less than 6 years, were assigned to the arm with lowest risk (ie, standard-risk, SR). Sixteen (14.8%) were classified as high-risk (HR) patients by a poor prednisone response, no remission at day 33, a t(9;22) or BCR/ABL rearrangement, or a t(4;11) or MLL/AF4 rearrangement. All other patients (n = 57; 52.8%) were included in the medium-risk group (MR).
The median observation time of patients who remained in remission was 40 months (range, 1.83-4.42 years). Thirteen patients (6 MR, 7 HR) had relapse in a median of 15 months after diagnosis (range, 0.42-2.67 years). There were 10 systemic and 3 localized relapses (all central nervous system [CNS]). Importantly, the 31 patients who could not be investigated were not significantly different from the 108 study patients regarding age, gender, leukemic immunophenotype, white blood cell count at diagnosis, prednisone response, risk group distribution, and relapse-free survival (tests: Wilcoxon 2-sample, Fisher exact, χ², log-rank).

Treatment protocol

Treatment with the ALL-BFM 95 protocol was based on a modification of the ALL-BFM 90 regimen.19,20 In brief, all patients received a 5-week induction treatment consisting of prednisone (during the first week as systemic monotherapy), vincristine, daunorubicin (reduced by 50% in SR patients), and L-asparaginase. SR and MR patients continued directly with phase 2 of induction (4 weeks) comprising oral 6-mercaptopurine, cyclophosphamide, and cytarabine. After 2 weeks of rest this was followed from week 12 onward by consolidation treatment (8 weeks) with high-dose methotrexate and oral 6-mercaptopurine. HR patients were switched after phase 1 of induction to 6 rotational 6-day pulses of intensive multiagent chemotherapy instead of the extended induction and normal consolidation therapy. Of note, the week 12 follow-up BM investigation in HR patients was performed after rest from 2 such pulses. After a further 2 weeks of rest, all patients received similar reinduction treatment for 7 weeks from weeks 22 to 24 onward (comparable to induction therapy). Subsequent maintenance therapy consisting mainly of oral 6-mercaptopurine and methotrexate was conducted until completion of the second year of treatment (prolonged for SR boys). All patients received repeated intrathecal therapy until the end of reinduction, and prophylactic cranial irradiation (12 Gy) was reserved for patients with T-ALL or HR criteria.

Sample preparation and immunofluorescence staining

Heparinized BM samples were usually received by express mail within 1 day after collection and immediately processed. First, the NC count of each aspirate was assessed with a Sysmex F-820 cell counter (TOA Medical Electronics, Hamburg, Germany). Subsequently, NCs were prepared using a commercially available red cell lysing solution (Becton Dickinson, Sunnyvale, CA). The panel of monoclonal antibodies (MoAbs) for immunofluorescence staining has been delineated,4,13,21,22 except for the following unconjugated or fluorescein isothiocyanate (FITC)–, phycoerythrin (PE)–, peridinin chlorophyll protein (PerCP)–, PE-cyanin 5.1 (PC5)–, or allophycocyanin (APC)–labeled MoAbs: CD3 (UCHT1-FITC, -PE, pure), CD20 (B-Ly1-PE), and CD38 (AT13/5-FITC) from Dako (Glostrup, Denmark); CD19 (SJ25C1-APC) and CD45 (2D1-PerCP) from Becton Dickinson; CD34 (581-PC5) from Beckman-Coulter (Vienna, Austria). Except for investigations of normal BCPs, which were done with 4 directly labeled MoAbs, the panel was based on the use of 3-color stainings including one unconjugated MoAb. The labeling cascade, quality control measures, and the cellular permeabilization procedure have been described in detail.4,21,22

FC analyses

Analyses were performed on a dual-laser FACSCalibur (Becton Dickinson). Details regarding the test standardization, the data acquisition with the CELL Quest software (Becton Dickinson), and the data analysis using the PAINT-A-GATE software (Becton Dickinson) have been delineated.4,13,22 We used several standardized antibody combinations to screen ALL samples at diagnosis for leukemia-associated aberrations as well as to investigate follow-up BM.5 First-line strategic MoAb combinations against frequently aberrant antigens were used in all patients: for BCP-ALL, CD34/CD10/CD19 (in pro-B ALL: CD34/CD10 plus CD20/CD19) and CD10/CD45RA/CD19; for T-ALL, TdT/cytoplasmic CD3/surface CD3 as well as cytoplasmic CD3/CD7/surface CD3. In the first 72 sequential patients, first-line combinations were always complemented in case of BCP-ALL by additional MoAb triplets (CD10/CD45RA/CD11a, CD10/CD45RA/CD44, CD10/CD33/CD19, and cytoplasmic IgM/CD34/CD19), which were used in follow-up investigations whenever applicable. In later cases (n = 36), full typing including the complementary markers was only done if aberrations relevant for follow-up could not be determined with the first-line marker panel (n = 8). Aberrant characteristics were judged relevant only if they were sufficiently strong or homogeneous on a majority of blasts of a leukemia sample. Table 1 summarizes the leukemia-associated characteristics used to monitor MRD in the 108 ALL cases. Fifty-two leukemias had one aberrant marker (48.1%), 53 (49.1%) had more than one (39 had 2, 11 had 3, and 3 had 4), and only 3 (2.8%) did not have a useful aberration.

Table 1.

Summary of the relevant leukemia-associated characteristics used to monitor MRD in follow-up BM samples of the 108 ALL patients

| ALL | Aberration* | Frequency† | % |
| --- | --- | --- | --- |
| B lineage (n = 100) | CD10 overexpression | 70/100 | 70 |
| | CD45RA underexpression | 32/100 | 32 |
| | CD11a underexpression | 22/70 | 31 |
| | CD44 underexpression | 6/68 | 9 |
| | CD19 overexpression | 2/100 | 2 |
| | CD34 overexpression | 9/100 | 9 |
| | CD33 and CD19 cross-lineage coexpression | 5/63 | 8 |
| | IgM‡ and CD34 desynchronous coexpression | 9/61 | 15 |
| | CD10 and CD20 desynchronous underexpression§ | 5/100 | 5 |
| T lineage (n = 8) | CD3 (surface) ectopic underexpression | 8/8 | 100 |
| | TdT and CD3 (cytoplasmic) ectopic coexpression | 6/8 | 75 |
| | CD7 overexpression | 1/8 | 13 |

* Leukemia-associated phenotypic characteristics.4,5,12-17

† Frequency among individual ALL lineages (samples with aberration versus tested specimens; percentage of tested cohort); note that not all leukemic cases were investigated for all markers.

‡ Cytoplasmic IgM.

§ In combination with CD19/CD34 positivity (pro-B ALL phenotype).

Stainings with the cell-permeant, live-cell nucleic acid fluorochrome SYTO 16 (emission at 518 nm; Molecular Probes, Leiden, The Netherlands) combined with CD19 or CD3/CD45 were used in follow-up analyses to exclude residual nonnucleated erythroid cells, thrombocytes, or debris (SYTO16−) from calculations. This staining allowed the unbiased proportional quantification of MRD among total NCs (SYTO16+) via the percentage of CD19+ or CD3+ cells.5 The 4-color MoAb combination (CD38/CD45RA/CD34/CD19), which discriminates the 3 major BCP stages in BM as well as plasma cells, was used to investigate normal B cells.21,23,24 At diagnosis, and for SYTO16 as well as normal B-cell analyses, 1.5 × 10⁵ NCs were labeled for acquisition of 30 000 events. For MRD measurements, 7.5 × 10⁵ NCs were first analyzed by plain acquisition of 30 000 events. Subsequently, live gates were set in appropriate parameter correlations, and gated events (usually CD10+/orthogonal light scattering low/intermediate; in a few applications CD19+/CD34+; in T-ALL cytoplasmic CD3+/surface CD3 low/−) were acquired until 30 000 events or until all cells in the tube had passed through the flow cytometer, whichever occurred first.

Data handling

Minimal residual disease was defined as accumulation of events with leukemia-associated phenotypic characteristics in follow-up BM. A relative estimate (percentage of leukemic cells among NCs) and an absolute count (leukemic cells per microliter) were assessed. Absolute counts were calculated by multiplying blast percentage values by the total NC count of the sample. Events that did not form typical cellular clusters within the regions of FC parameter correlations expected to contain only leukemic cells, or that rather overlapped leukemia-associated regions from adjacent areas, were regarded as indefinite but were similarly quantified. Such events occurred as infrequent background related to diminished sample quality or as normal populations, which to some extent resembled leukemic cells by phenotype (eg, immature normal BCP). Samples containing only such indefinite events were judged MRD−, but the proportion and absolute number of indefinite events were recorded as the threshold of sensitivity, meaning that MRD more numerous than this value did not exist in a given sample. Because occurrence of such indefinite events did not preclude that leukemic cells, if present, could accurately be discerned, unfavorably high thresholds were not considered a reason to exclude samples from further analyses. Rather, the respective data were analyzed with respect to the prognostic reliability of the test system as well as their association with sample composition parameters and time points.
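The conversion from a relative estimate to an absolute count can be sketched as follows. This is an illustration only, not the software used in the study: it assumes the NC count is given in units of 10⁹/L (so that 1 × 10⁹/L equals 1000 cells/μL), and the function names and sample values are hypothetical.

```python
import math

def blasts_per_microliter(blast_percent, nc_count_e9_per_l):
    """Absolute MRD: blast fraction among NCs times the total NC count.

    nc_count_e9_per_l is the NC count in units of 10^9/L (assumed unit);
    1 x 10^9/L corresponds to 1000 cells per microliter.
    """
    return blast_percent / 100.0 * nc_count_e9_per_l * 1000.0

def log_level(value):
    """Decade (floor of log10) into which a positive value falls."""
    return math.floor(math.log10(value))

# Hypothetical samples: the same 0.05% blasts at two cellularities.
low = blasts_per_microliter(0.05, 4.0)    # hypocellular marrow
high = blasts_per_microliter(0.05, 40.0)  # cellular marrow
```

With these hypothetical inputs, the same relative value of 0.05% corresponds to roughly 2 versus 20 blasts/μL, that is, two different log-levels of absolute MRD.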

Statistical analysis

The Wilcoxon 2-sample test was used to investigate for differences in sample composition parameters and sensitivity thresholds. Associations between these data were assessed with the Spearman rank correlation. Differences between patient groups regarding the proportion as well as the log-level distributions of MRD+ samples were estimated using the Mantel-Haenszel χ² test. The principal end point used to determine the prognostic value of MRD results was the relapse-free interval (RFI), calculated as the interval from the time point of a given assessment until the date of first relapse or end of observation. Four patients who underwent BM transplantation (including the 2 patients who failed to attain remission at day 33) and did not relapse thereafter, and 3 patients who died from causes other than leukemia itself (infection or thromboembolism), were included but censored with the date of transplantation or death. The probability of the RFI (pRFI) was estimated by the method of Kaplan-Meier.25 SDs of Kaplan-Meier estimates were calculated according to Greenwood. The log-rank test was used to explore the prognostic impact of MRD in univariate analyses. A Cox regression model, adjusted for BFM risk groups, was used for multivariate analyses.26 Relative hazard rates (95% CI) were calculated, a Wald test was used to test the significance of differences between groups, and the predictive value of MRD results additional to BFM risk groups was examined by the log-likelihood ratio test.
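For orientation, the Kaplan-Meier estimate with Greenwood's variance formula can be written in a few lines. The sketch below is a generic textbook implementation, not the statistical package used for this study; the function name and the example follow-up data are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier estimate with Greenwood standard errors.

    times  -- follow-up time per patient, counted from the MRD assessment
    events -- 1 if a relapse was observed at that time, 0 if censored
    Returns a list of (time, probability, standard_error), one per event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    probability = 1.0
    greenwood_sum = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses > 0:
            probability *= 1.0 - relapses / at_risk
            if at_risk > relapses:
                greenwood_sum += relapses / (at_risk * (at_risk - relapses))
            se = probability * math.sqrt(greenwood_sum)
            curve.append((t, probability, se))
        at_risk -= leaving
    return curve

# Hypothetical cohort: months of follow-up, relapse (1) or censored (0).
curve = kaplan_meier([6, 10, 15, 20, 30, 40], [1, 0, 1, 0, 0, 0])
```

Censored patients, such as those transplanted or dying from other causes, leave the risk set without lowering the estimated probability.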

MRD results and influencing factors

We assessed MRD on the basis of relative and absolute blast cell measures in children with ALL. Follow-up BM samples (n = 329) were taken at 4 standardized time points during the first 6 months of treatment. Covariables such as the NC count, the total normal B-cell content, and the BCP subset composition were also investigated. The proportion of samples that were MRD+ was high at day 15 (89.1%), and decreased thereafter to 40.9% at day 33, 13.7% at week 12, and 4.3% at weeks 22 to 24. The MRD burden of positive samples is summarized in Table 2.

Table 2.

Residual disease in children with ALL at different time points during the first 6 months of treatment according to the ALL-BFM 95 protocol

| Time point | No. patients with MRD vs tested samples* (%) | Median % (range) of leukemic cells | Blasts/μL BM, median (range) |
| --- | --- | --- | --- |
| d 15 | 90/101 (89.1) | 0.57 (0.002-96.8) | 17.8 (0.1-18 295) |
| d 33 | 43/105 (40.9) | 0.06 (0.001-43.1) | 4.0 (0.1-3 235) |
| wk 12 | 7/51* (13.7) | 0.057 (0.001-4.9) | 10.4 (0.53-820) |
| wk 22-24 | 2/47* (4.3) | 0.023 (0.004-0.5) | 6.8 (0.2-13.1) |

* Data of consecutively tested patients (ie, first half of the study cohort).

Data of all available MRD+ testings included (week 12, n = 13; weeks 22-24, n = 6; ie, from MRD+ patients of the first half of the study cohort, and from later enrolled patients who were tested only if MRD+ at the preceding time point).

Total NC counts showed a high degree of variance between individual samples of a given time point as well as between time points in general (Table 3; P < .001 for the differences between time points except for week 12 versus weeks 22 to 24, which was not significant). The lowest values were usually found at day 15, and the highest at week 12 and weeks 22 to 24. Owing to this variance in NC counts, absolute blast counts fell into a different log-level than the respective relative estimates in 59 (38.8%) of 152 MRD+ samples. The composition of BM samples regarding normal BCP subsets also changed significantly during follow-up (Table 3). Samples from day 15 and day 33 contained almost exclusively mature B cells (stage 3). By contrast, samples from week 12 and weeks 22 to 24 were dominated by very immature BCP (stage 1; P < .001 for the differences between the earlier and the later time points), which are phenotypically related to leukemic cells. The relative proportion of total normal B cells among NCs regressed from day 15 to week 12, and slightly increased thereafter (weeks 22-24). Related to these peculiarities of the B-cell pool, the proportional sensitivity threshold values of samples judged MRD− (Table 3) were favorably lower at day 15 and day 33 compared to week 12 and weeks 22 to 24 (day 15: P = .017 and .011, respectively; day 33: P < .001 for both late time points). Both at week 12 and weeks 22 to 24, lower sensitivity thresholds correlated with fewer normal BCP (P < .001).

Table 3.

BM characteristics at follow-up time points

| Parameter | d 15 | d 33 | wk 12 | wk 22-24 |
| --- | --- | --- | --- | --- |
| Total NC count* | 4.3 (0.6-45) | 7.3 (0.9-32) | 15.7 (2-120) | 24.2 (2.4-75) |
| Normal B cells† | 14.6 (0.5-40) | 5.3 (0.3-40) | 1.2 (0.01-9) | 2.2 (0.04-19) |
| BCP 1‡ | 0.0 (0.0-0.0) | 0.0 (0.0-0.0) | 51.0 (0.0-99) | 44.5 (0.0-95) |
| BCP 2‡ | 0.0 (0.0-5.0) | 0.0 (0.0-1.0) | 7.3 (0.0-57) | 25.7 (0.0-61) |
| BCP 3‡ | 99.9 (95-100) | 98.2 (97.6-100) | 8.7 (0.0-82) | 6.8 (0.0-48) |
| Plasma cells‡ | 0.1 (0.0-0.9) | 1.2 (0.0-2.4) | 7.6 (0.0-77) | 4.7 (0.0-80) |
| Threshold§ | 0.013 (0.001-0.32) | 0.01 (0.001-0.14) | 0.065 (0.002-2.7) | 0.07 (0.002-2.6) |

* Total nucleated cells (including erythroid precursors) × 10⁹/L; median (range); tested BM samples: day 15, n = 101; day 33, n = 103; week 12, n = 66; weeks 22-24, n = 54.

† Percentage of CD19+ B cells among NCs; median (range); tested BM samples: day 15, n = 92; day 33, n = 93; week 12, n = 62; weeks 22-24, n = 47.

‡ Proportion of BCP stage among total CD19+ B cells; median (range); tested BM samples: day 15, n = 87; day 33, n = 94; week 12, n = 34; weeks 22-24, n = 32.

§ Threshold proportions of NC at/above which samples were definitely MRD−, median (range); the threshold characterizes the test sensitivity in MRD− BM samples: day 15, n = 11; day 33, n = 62; week 12, n = 55; weeks 22-24, n = 48.

Outcome and correlation with MRD results

Thirteen of the 108 study patients had relapse during the observation period. Presence of MRD was associated with a greater likelihood of relapse at day 33 (P = .024) as well as at week 12 and weeks 22 to 24 (P < .001 for both correlations), but not at day 15. More importantly, the incidence of relapses correlated with distinct levels of MRD positivity. As shown in Table 4, the prognostically relevant levels of MRD decreased with time under treatment (day 15: ≥ 1.0% of NC or ≥ 100 blasts/μL BM; day 33: ≥ 0.1% or ≥ 10/μL; week 12: ≥ 0.01% or ≥ 1/μL; weeks 22-24: any MRD positivity). Of note, SR patients, who received a slightly less intensive induction treatment, did not differ from MR patients regarding the proportion of positive samples and the distribution among levels of MRD both at day 15 and day 33. Figure 1 summarizes the predictive value of MRD results at each time point. The accuracy in defining patients with relapse culminated at week 12 by continuously narrowing the population at risk. Low or absent MRD did not allow a further substratification with respect to favorable therapy outcome. Including BFM risk groups in a multivariate Cox model, quantitative MRD results proved to be independent prognostic factors at day 15 (only absolute estimate), and in particular at day 33 and week 12 (both estimates, Table 5). At weeks 22 to 24, too few data were available for these calculations.

Table 4.

Residual disease status during the first 6 months of treatment and relapse incidence

| MRD status, relative (% NC) | MRD status, absolute (cells/μL BM) | d 15, relative Rx/pts | d 15, absolute Rx/pts | d 33, relative Rx/pts | d 33, absolute Rx/pts | wk 12*, relative Rx/pts | wk 12*, absolute Rx/pts | wk 22-24*, relative Rx/pts | wk 22-24*, absolute Rx/pts |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ≥ 10% | > 1000 | 4/8 | 4/8 | | | | | | |
| 1-9.9% | 100-999 | 5/30 | 5/19 | 3/3 | 3/3 | | 1/1 | | |
| 0.1-0.99% | 10-99 | 2/20 | 1/19 | 5/11 | 5/10 | 4/4 | 4/4 | | 2/2 |
| 0.01-0.099% | 1-9.9 | 1/23 | 1/25 | 0/20 | 1/20 | 2/2 | 2/2 | 3/3 | 1/1 |
| < 0.01% | < 1 | 0/2 | 1/12 | 1/5 | 0/6 | 1/5 | 0/4 | 1/2 | 1/2 |
| Negative | Negative | 1/11 | 1/11 | 4/59 | 4/59 | 5/52 | 5/49 | 6/46 | 6/46 |
| Total eligible | | 13/94 | 13/94 | 13/98 | 13/98 | 12/63 | 12/60 | 10/51 | 10/51 |
| (Total) | | (101) | (101) | (105) | (105) | (68) | (64) | (54) | (53) |

Rx indicates relapses; pts, patients (eligible, total patients excluding prematurely censored due to transplantation or early death).

* Data from all available patients included (only the first half of patients was consecutively investigated at this time point, whereas samples from later patients were tested only if MRD+ at the preceding time point).

Fig. 1.

Probability of sustained remission according to FC MRD results, relative and absolute estimate, at 4 time points during the first 6 months of ALL therapy.

Probabilities of the RFI were calculated using the method of Kaplan-Meier and were based on individual intervals between a respective MRD assessment and relapse or last observation. SDs of pRFI, 3-year estimates, and comparisons of groups of patients using the log-rank test are also shown. Relative MRD values relate to percentages of leukemic cells among total NCs of the sample, and absolute counts represent leukemic cells per microliter BM.

Table 5.

Multivariate analyses* of BFM risk groups and MRD information

| Time point | Parameter† | Relative estimate: risk ratio‡ | Relative estimate: P§ | Absolute estimate: risk ratio‡ | Absolute estimate: P§ |
| --- | --- | --- | --- | --- | --- |
| d 15 | MRD | 2.0 (0.5-7.6) | .32 | 4.3 (1.3-14.6) | .018 |
| | BFM | 8.3 (2.8-34.5) | < .001 | 8.9 (2.9-27.8) | < .001 |
| d 33 | MRD | 7.0 (1.9-25.7) | < .001 | 7.8 (2.1-29.3) | < .001 |
| | BFM | 4.7 (1.3-16.4) | .017 | 4.2 (1.2-15.4) | .029 |
| wk 12 | MRD | 14.9 (3.2-66.7) | < .001 | 26.3 (5.2-125) | < .001 |
| | BFM | 2.0 (0.5-8.4) | .36 | 1.2 (0.3-5.5) | .78 |
| Combination (d 33/wk 12) | MRD | 19.8 (3.8-104) | < .001 | 41.2 (7.3-234) | < .001 |
| | BFM | 2.4 (0.5-11.7) | .27 | 1.4 (0.3-6.9) | .66 |

* Cox regression model including BFM risk groups and MRD results; note that too few data points were available for similar calculations at weeks 22-24.

† MRD: high MRD load vs low load according to time-point-specific prognostically relevant thresholds (see “Results”); BFM: HR patients vs SR/MR patients.

‡ Relative risk of relapse (95% CI in parentheses).

§ P according to Wald test.

Strategic combination of day 33 and week 12 MRD results

Ninety-nine percent of MRD tests at day 33 (102 of 103 samples) had a sensitivity that included the levels relevant for outcome prediction at this time point, whereas 70% of assessments (45 of 64) at week 12 had a reduced sensitivity according to threshold proportions. Invariably, patients with predicted relapses had high MRD loads both at day 33 and week 12. To neutralize the surmised reduced reliability of week 12 assessments, we found it useful to combine MRD information from day 33 (for a sensitive predefinition of patients at risk) and week 12 (for the confirmation of an adverse outcome based on the kinetic evolution of MRD between the 2 time points). The predictive value of the strategic combination of the 2 time points is shown in Figure 2. Patients were split up by MRD results into 2 groups with greatly different prognosis (P < .001). Of note, 68 measurements at week 12 were available (n = 51 from consecutive patients; patients recruited later during the study were tested only if MRD+ at day 33, ie, n = 17). Data of a further 34 children, who were MRD− at day 33, were imputed into the MRD low-load group of the combined analysis, because only 2 of 208 paired assessments from 2 consecutive time points displayed an increase in MRD. One sample-pair showed increasing MRD positivity preceding a very early relapse. The other was found 0.001% positive by week 12, although previously and thereafter judged MRD−. All other 78 sample-pairs that were negative at the earlier time point were also negative at the following investigation. Data of a further 1 (absolute count) and 2 patients (relative estimate), respectively, with a high MRD load on day 33 but lacking week 12 results were imputed into the MRD high-load cohort. Patients with adverse outcome (pRFI 0.0 at 3 years) were those with an MRD of at least 0.1% of NC (or ≥ 10 blasts/μL BM) at day 33, who remained MRD+ with at least 0.01% (or ≥ 1 blast/μL) at week 12.
All other patients had a pRFI of 0.93 (± 0.03 SD) according to the relative MRD estimate, and of 0.94 (± 0.03 SD) with absolute counting. In multivariate analysis (Cox model), MRD-based risk assignment was found to be an independent and overriding prognostic factor, whereas conventional BFM risk grouping had no significant additional prognostic impact (log-likelihood ratio test, P < .001; Table 5). BFM HR patients were split up by MRD-based stratification into a group that relapsed (pRFI 0.0 at 3 years; n = 8) and a group (n = 8) that profited from chemotherapy alone (pRFI 0.75 ± 0.15 by relative estimate, P = .018; 0.86 ± 0.13 by absolute count, P = .004). Of note, 1 of 2 BFM HR patients with a relapse despite favorable relative MRD results had a localized disease recurrence in the CNS. The other patient was correctly assigned a high risk by the absolute MRD count. Of the 6 BFM MR patients with a relapse, 2 were predicted by MRD results. The subgroup of BFM HR patients with a poor response to the prednisone prephase (n = 9) was also split up by MRD results. Patients with continuously high MRD load (n = 4) relapsed in 3 (1 censored due to transplantation) and 4 cases (relative versus absolute estimate), respectively, whereas of patients with low MRD (n = 5) only 1 patient suffered a relapse in the relative assessment group (P = .024), and none in the absolute count group (P = .008). Notably, only 1 of these 9 prednisone poor responders had an additional HR criterion, that is, a t(4;11).
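The combined day 33/week 12 rule can be expressed compactly. The sketch below uses the cutoffs stated in the text (≥ 0.1% or ≥ 10 blasts/μL at day 33, ≥ 0.01% or ≥ 1 blast/μL at week 12); the function name and the convention of passing None for a missing week 12 result are illustrative assumptions, with missing values handled in the way the imputation above describes.

```python
def mrd_load_group(day33, week12, absolute=True):
    """Classify combined day 33 / week 12 MRD as 'high' or 'low' load.

    day33, week12 -- MRD values (blasts/uL if absolute, else % of NCs);
                     use 0.0 for MRD-negative samples and None when no
                     week 12 result is available (hypothetical convention).
    """
    d33_cut, wk12_cut = (10.0, 1.0) if absolute else (0.1, 0.01)
    if day33 < d33_cut:
        # Below the day 33 cutoff: low load regardless of week 12.
        return "low"
    if week12 is None:
        # High load at day 33 without week 12 data: imputed as high.
        return "high"
    return "high" if week12 >= wk12_cut else "low"

# Hypothetical patients (absolute counts in blasts/uL).
persistent = mrd_load_group(25.0, 3.0)  # stays above both cutoffs
cleared = mrd_load_group(25.0, 0.5)     # cleared below the week 12 cutoff
```

Patients who are above the day 33 cutoff but fall below the week 12 cutoff are assigned to the low-load group, matching the way such patients were grouped in the combined analysis.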

Fig. 2.

Probability of sustained remission according to combined FC MRD results, relative and absolute estimate, from day 33 and week 12.

Kaplan-Meier calculations of the pRFI were based on intervals from week 12 onward. Three-year estimates of pRFI including SDs and log-rank test results are also shown. By relative assessment, MRD values of at least 0.1% leukemic cells among total BM NCs at day 33 in combination with at least 0.01% at week 12 were classified as high MRD load (versus all others). By absolute estimation, at least 10 leukemic cells/μL BM at day 33 in combination with at least 1 blast/μL at week 12 were classified high MRD. Of note, patients with high MRD load at day 33, but whose leukemic cells were cleared below the relevant thresholds by week 12 (n = 7 and 5, respectively) were subsumed in the MRD low-load groups. All these patients stayed relapse-free except for one patient of the relative-estimate series, who, in contrast, had been assigned to the MRD high-load group at week 12 by absolute counting.

Analyses of relapse patients

In all 10 investigated relapse cases, the leukemic phenotype was stable compared to primary diagnosis in at least one marker used for MRD assessment. BM samples (n = 11) from aspirations preceding these relapses were available in 9 patients (6 MRD+, 5 MRD−). The median interval between these aspirations and relapse diagnosis was 3 months in positive cases (range, 0.5-6 months), and 6 months in negative cases (range, 4-16 months). Three of 5 patients with MRD+ results had been negative in prior follow-up investigations, whereas 2 patients had been continuously positive for 5 and 7 months since primary diagnosis.

Sequential monitoring of MRD during the first months of therapy provides information on the timely response to treatment and proves to be a powerful and independent indicator of treatment outcome in children with ALL.7-11 Our population-based prospective study on 108 unselected pediatric ALL patients, who were uniformly treated according to the widely used ALL-BFM 95 protocol, confirms and extends these observations. We used an FC method based on a limited panel of antibodies that was applicable to 97% (105 of 108) of analyzed patients. At any time point, presence of MRD was associated with a higher risk of disease recurrence. Distinct cutoff levels of MRD were found, which correlated with a particularly high relapse hazard, and which declined with time in therapy. At day 33, patients at risk could sensitively be predefined (cutoff ≥ 0.1% NCs or ≥ 10 leukemic cells/μL BM). At week 12, those who remained MRD+ (≥ 0.01% NCs or ≥ 1 leukemic cell/μL BM) had the highest probability of relapse. In contrast, the risk of patients whose MRD levels had decreased below the cutoff favorably reverted to that of all other patients with primarily absent or low MRD already at day 33 (none of 8 such patients relapsed). This observation confirms that a slow response to initial treatment may be neutralized within a defined period of treatment by subsequent therapy with other drugs.9 Importantly, MRD assessment proved to be an overriding and independent prognostic indicator also in multivariate analyses that included the BFM risk stratification. Based on this, it seems possible to draw clinically relevant conclusions: patients with high conventional risk (HR) can be split up by MRD results into a group with very unfavorable outcome (in which alternative treatment could be of value) and a group with good prognosis (which seems to profit from intensive chemotherapy alone). 
Patients with low conventional risk (SR/MR) but unfavorable MRD results are destined to relapse, but may take advantage of intensified treatment, whereas those with a slow initial response neutralized by subsequent therapy (7 SR/MR) may further profit from reintensification with those drugs that resulted in control of the disease. Notably, 3 of 5 recurrences not predicted by MRD results occurred as late (2.67 years after diagnosis; n = 1) or localized relapses (n = 2). It is conceivable that MRD assessment as a measure of the initial therapy response may not allow us to predict recurrences related to mechanisms other than primary resistant leukemia.

In several aspects our FC data confirm those generated with PCR technology by van Dongen et al11 on the basis of a related regimen (protocol ALL-BFM 90); however, some divergences exist. Those between the prognostically relevant levels of MRD at week 12 (10⁻⁴ versus 10⁻³) seem rather to be due to differences in technology. Van Dongen et al11 used semiquantification in mononuclear cell preparations, whereas we investigated total NCs. This underscores that the impact of MRD may depend on the methodologic approach. Van Dongen and coworkers defined 3 risk groups according to results from day 33 and week 12 (most favorable: MRD− on day 33, worst prognosis: MRD ≥ 10⁻³ at week 12). In contrast, we aimed only at increasing the reliability of dismal outcome predictions when combining MRD assessments from 2 time points. Whether the lack of a third MRD-based risk group in our series (at 3 years, our general low-risk group (n = 94) had a probability of sustained remission of 0.94 compared to 0.98 in the group with lowest risk (n = 55) and 0.77 at intermediate risk (n = 55) according to van Dongen et al11) is due to the sample size, minor differences between regimens, or to a more precise MRD interpretation by FC needs to be assessed. A favorable impact of a rapid response to treatment, as reported recently also by Panzer-Grümayer et al,27 may also strongly depend on the observation time. Notably, recurrences in primary resistant disease (MRD HR group) seem to occur mostly earlier than in leukemias that initially respond well to therapy but then relapse. This might explain differences in the potential to define prognostically favorable patient cohorts between the former studies and the results of Cave et al and our study (48 and 66 versus 38 and 40 months median follow-up, respectively).10,11,27 Comprehensive investigations simultaneously performed with both methodologies are under way, which re-evaluate these issues and which will eventually define the correlation of technologies.

Differences in intensity and schedule of therapeutic regimens may also evoke divergences in results and in predictive value of MRD assessment. This seems corroborated by our data that show a decline of the prognostically relevant MRD cutoff over time correlated with the cumulative exposure to cytotoxic drugs. In line with this, markedly different percentages of patients MRD+ at high levels after induction have been observed with 2 different therapeutic regimens that bear similar cure expectancies.28 It may therefore be necessary that MRD data for outcome prediction are generated separately for each therapeutic regimen.

In our study, we quantified residual leukemia on the basis of total NCs, which bears conceptual advantages regarding test accuracy over the common practice of analyzing mononuclear cell preparations. By our procedure, assessments can be corrected for bias otherwise caused by variations in preparative quality (increased amount of residual nonnucleated events, eg, erythrocytes, and diminished efficacy in separation of mononuclear cells by imbalance of anticoagulants or prolonged storage), although the theoretical possibility of sample dilution by blood remains a factor that still could adversely influence any type of assessment. However, based on NC analysis, truly absolute MRD quantification by concept may further allow us to correct for a bias that occurs in relative investigations due to differences in total cell content when comparing a series of individual specimens from a given time point. Such differences in NC counts of follow-up samples precipitated log-level divergences in 38.8% of relative/absolute measurement pairs, which led to several divergences also in risk estimates. Day 15 results (only the absolute MRD estimate allowed a statistically significant separation of patients), odds ratios of multivariate analyses, as well as data from individual patients (eg, one patient who relapsed was defined as HR by absolute MRD assessment only) suggested that absolute assessment allows an even more precise risk estimation than relative measurement by ranking samples with high relative MRD values but low total NC counts prognostically more correctly. Associated with the individual kinetics both of cell depletion with cytotoxic therapy and, later on, of hematopoietic regeneration after rest from chemotherapy, we found that NC counts differed largely also between samples from different follow-up time points of an individual patient. Therefore, relative measures may also not allow us to reliably assess the quantitative evolution of leukemia over time. 
In this respect, we found generally higher NC counts as well as a completely different BM composition regarding normal BCP at the 2 later time points compared to the former, which extends recent observations.21,29,30 Our data substantiate speculations that variations in the occurrence of immature lymphoid precursors, which resemble leukemic cells by phenotype, may influence MRD detection.3,6,21 Regenerative time points (week 12 and weeks 22-24) were characterized by a limited sensitivity in MRD assessment in a high proportion of B-lineage cases due to the preponderance of resurgent normal very immature BCP, whereas the high reliability of testing at earlier time points was based on the exclusive presence of mature B cells (apart from residual leukemic cells). Hence, options for investigating MRD may not only depend on leukemia itself, but more importantly also on time point–related factors such as total count and developmental stage of normal cells. Strategic placement of assessments into nonregenerative phases of treatment may thus allow an accurate and sensitive determination of MRD although using a single marker combination that is sufficiently broad to reach most patients. Reducing the need for a diversified investigative setup should add to the clinical significance of FC also by augmenting its economic efficiency. We suggest that a methodologic standardization in these respects is a prerequisite for drawing reliable conclusions for clinical decision making on the basis of MRD.
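The relation between the relative and the absolute estimate discussed above can be illustrated by a minimal sketch. It assumes that the absolute count is the leukemic fraction multiplied by the NC count per microliter, and that a "log-level divergence" means the two estimates fall into different decade bins when 0.01% of NCs is paired with 1 cell/μL (the pairing implied by the week 12 cutoffs). The exact definition used for the 38.8% figure is not restated here, so this binning, the function names, and the sample values are all assumptions.

```python
import math


def absolute_mrd(relative_pct: float, nc_per_ul: float) -> float:
    """Leukemic cells per microliter from % of NCs and NC count per microliter."""
    return relative_pct / 100.0 * nc_per_ul


def decade(value: float, unit: float) -> int:
    """Decade bin of a positive value, counted in multiples of `unit`."""
    return math.floor(math.log10(value / unit))


def log_level_divergence(relative_pct: float, nc_per_ul: float) -> bool:
    """True if relative and absolute estimates land in different decade bins.

    Assumed pairing of units: 0.01% of NCs corresponds to 1 cell/uL.
    Both inputs must be positive (MRD-negative samples are not handled).
    """
    rel_bin = decade(relative_pct, 0.01)
    abs_bin = decade(absolute_mrd(relative_pct, nc_per_ul), 1.0)
    return rel_bin != abs_bin


# Hypothetical samples with identical relative MRD (0.05%) but different cellularity.
for nc in (800, 4000, 60000):
    print(nc, absolute_mrd(0.05, nc), log_level_divergence(0.05, nc))
```

Under these assumptions, the two estimates agree only when the NC count is close to 10 000/μL, the cellularity at which the paired cutoffs coincide; samples with markedly lower or higher NC counts are ranked differently by the two measures.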

This study is dedicated to Professor Hansjörg Riehm who pioneered the usage of treatment response assessment for risk estimation in childhood ALL. We thank the participants of the Austrian BFM Study Group (B. Ausserer, F.M. Fink, H. Haas, N. Jones, R. Jones, W. Kaulfersch, G. Müller, I. Mutz, R. Ploier, K. Schmitt, O. Stöllinger, and C. Urban) for their close collaboration in providing BM samples as well as clinical data. In this respect, we are also grateful to the hemato-oncology and the laboratory teams of the St Anna Kinderspital (in particular to S. Juhasz, R. Kornmüller, E. Neidhart, and U. Stalze). We thank J. Regelsberger from the Austrian BFM Study Group documentation center for support, and P. Buchinger, D. Scharner, as well as D. Wimmer for excellent technical assistance.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

1. Pui CH, Campana D. New definition of remission in childhood acute lymphoblastic leukemia. Leukemia. 2000;14:783-785.
2. Farahat N, Morilla A, Owusu-Ankomah K, et al. Detection of minimal residual disease in B-lineage acute lymphoblastic leukaemia by quantitative flow cytometry. Br J Haematol. 1998;101:158-164.
3. Griesinger F, Piro-Noack M, Kaib N, et al. Leukaemia-associated immunophenotypes (LAIP) are observed in 90% of adult and childhood acute lymphoblastic leukaemia: detection in remission marrow predicts outcome. Br J Haematol. 1999;105:241-255.
4. Dworzak MN, Fritsch G, Fleischer C, et al. Comparative phenotype mapping of normal vs. malignant pediatric B-lymphopoiesis unveils leukemia-associated aberrations. Exp Hematol. 1998;26:305-313.
5. Dworzak MN, Fritsch G, Panzer-Grümayer ER, Mann G, Gadner H. Detection of residual disease in pediatric B-cell precursor acute lymphoblastic leukemia by comparative phenotype mapping: method and significance. Leuk Lymphoma. 2000;38:295-308.
6. Weir EG, Cowan K, LeBeau P, Borowitz MJ. A limited antibody panel can distinguish B-precursor acute lymphoblastic leukemia from normal B precursors with four color flow cytometry: implications for residual disease detection. Leukemia. 1999;13:558-567.
7. Ciudad J, San Miguel JF, Lopez-Berges MC, et al. Prognostic value of immunophenotypic detection of minimal residual disease in acute lymphoblastic leukemia. J Clin Oncol. 1998;16:3774-3781.
8. Coustan-Smith E, Behm FG, Sanchez J, et al. Immunological detection of minimal residual disease in children with acute lymphoblastic leukaemia. Lancet. 1998;351:550-554.
9. Coustan-Smith E, Sancho J, Hancock ML, et al. Clinical importance of minimal residual disease in childhood acute lymphoblastic leukemia. Blood. 2000;96:2691-2696.
10. Cave H, van der Werff ten Bosch J, Suciu S, et al (for the EORTC). Clinical significance of minimal residual disease in childhood acute lymphoblastic leukemia. N Engl J Med. 1998;339:591-598.
11. van Dongen JJM, Seriu T, Panzer-Grümayer ER, et al. Prognostic value of minimal residual disease in acute lymphoblastic leukaemia in childhood. Lancet. 1998;352:1731-1738.
12. Neale GAM, Coustan-Smith E, Pan Q, et al. Tandem application of flow cytometry and polymerase chain reaction for comprehensive detection of minimal residual disease in childhood acute lymphoblastic leukemia. Leukemia. 1999;13:1221-1226.
13. Dworzak MN, Stolz F, Fröschl G, et al. Detection of residual disease in pediatric B-cell precursor acute lymphoblastic leukemia by comparative phenotype mapping: a study of five cases controlled by genetic methods. Exp Hematol. 1999;27:673-681.
14. Bradstock KF, Janossy G, Tidman N, et al. Immunological monitoring of residual disease in treated thymic acute lymphoblastic leukemia. Leuk Res. 1981;5:301-309.
15. Van Dongen JJM, Hooijkaas H, Adriaansen HJ, Hählen K, Van Zanen GE. Detection of minimal residual acute lymphoblastic leukemia by immunological marker analysis: possibilities and limitations. In: Hagenbeek A, Löwenberg B, eds. Minimal Residual Disease in Acute Leukemia. Dordrecht, The Netherlands: Nijhoff; 1986:113-133.
16. Campana D, Coustan-Smith E, Janossy G. The immunologic detection of minimal residual disease in acute leukemia. Blood. 1990;76:163-171.
17. Campana D, Pui C-H. Detection of minimal residual disease in acute leukemia: methodologic advances and clinical significance. Blood. 1995;85:1416-1434.
18. Ginaldi L, Matutes E, Farahat N, De-Martinis M, Morilla R, Catovsky D. Differential expression of CD3 and CD7 in T-cell malignancies: a quantitative study by flow cytometry. Br J Haematol. 1996;93:921-927.
19. Schrappe M, Gadner H, Reiter A, Riehm H, für die ALL-BFM Studiengruppe. Resultate der Studie ALL-BFM 90 und erstes Zwischenergebnis der Studie ALL-BFM 95 [abstract]. Monatsschrift Kinderheilkunde. 1999;5:519a.
20. Schrappe M, Reiter A, Ludwig WD, et al. Improved outcome in childhood acute lymphoblastic leukemia despite reduced use of anthracyclines and cranial radiotherapy: results of trial ALL-BFM 90. Blood. 2000;95:3310-3322.
21. Dworzak MN, Fritsch G, Fleischer C, et al. Multiparameter phenotype mapping of normal and post-chemotherapy B lymphopoiesis in pediatric bone marrow. Leukemia. 1997;11:1266-1273.
22. Dworzak MN, Fritsch G, Fröschl G, Printz D, Gadner H. Four-color flow cytometric investigation of terminal deoxynucleotidyl transferase-positive lymphoid precursors in pediatric bone marrow: CD79a expression precedes CD19 in early B-cell ontogeny. Blood. 1998;92:3203-3209.
23. Lucio P, Parreira A, van en Beemd MWM, et al. Flow cytometric analysis of normal B cell differentiation: a frame of reference for the detection of minimal residual disease in precursor-B-ALL. Leukemia. 1999;13:419-427.
24. Terstappen LWMM, Johnsen S, Segers-Nolten IMJ, Loken MR. Identification and characterization of plasma cells in normal human bone marrow by high-resolution flow cytometry. Blood. 1990;76:1739-1747.
25. Kaplan E, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assn. 1958;53:457-481.
26. Cox D. Regression models and life-tables. J R Stat Soc (B). 1972;34:187-220.
27. Panzer-Grümayer ER, Schneider M, Panzer S, et al. Rapid molecular response during early induction chemotherapy predicts a good outcome in childhood acute lymphoblastic leukemia. Blood. 2000;95:790-794.
28. Zur Stadt U, Harms DO, Schlüter S, et al. Outcome prediction by means of MRD at the end of induction therapy in childhood acute lymphoblastic leukemia strongly depends on the therapeutic regimen: a report from the German COALL study [abstract]. Blood. 1999;94(suppl 1):284a.
29. Van Lochem EG, Wiegers YM, van den Beemd R, Hählen K, van Dongen JJM, Hooijkaas H. Regeneration pattern of precursor-B-cells in bone marrow of acute lymphoblastic leukemia patients depends on the type of preceding chemotherapy. Leukemia. 2000;14:688-695.
30. Van Wering ER, Van der Linden-Schrever BEM, Szczepanski T, et al. Regenerating normal B-cell precursors during and after treatment of acute lymphoblastic leukaemia: implications for monitoring of minimal residual disease. Br J Haematol. 2000;110:139-146.

Author notes

Michael N. Dworzak, Children's Cancer Research Institute, St Anna Kinderspital, Kinderspitalgasse 6, A-1090 Vienna, Austria; e-mail: dworzak@ccri.univie.ac.at.
