Standardizing any aspect of a disease as clinically and biologically heterogeneous as the myelodysplastic syndromes (MDSs), whose natural history is not infrequently confounded by wide fluctuations in the peripheral blood (PB) indexes and where therapeutic options range from a simple watch-and-wait policy to stem cell transplantation, is a formidable task. Previous attempts by international experts to define some aspects of this cryptic disease include the French-American-British classification (FAB), the World Health Organization classification (WHO), and the International Prognostic Scoring System (IPSS). The latest in this series is a publication from the International Working Group (IWG) detailing standardized criteria to be used for response evaluation in MDS.1 First of all, the attempt must be lauded. It is a welcome and timely step, especially for those of us whose papers reporting the results of clinical trials have been returned over and over because the reviewer's idea of a response did not match the investigator's. Unfortunately, in keeping with many previous exercises to harness the complexity of MDS, the IWG's current proposal is also deficient, despite attempts to appear comprehensive by including more subtle issues such as emotional, social, and/or spiritual improvements in quality of life. Below are some of the most immediate practical problems encountered in applying these recommendations to a real-life situation.

Nowhere in this detailed document does it specify how to calculate either the baseline or intratherapy values for hemoglobin, absolute neutrophil count (ANC), or platelets. Should a single pretherapy absolute complete blood count (CBC) be used as the gold standard against which to measure response, or should this value be a derivative of many weekly counts? And if the latter, then how many weeks should one consider appropriate as being representative? Similarly, how should one calculate transfusion dependency? I may call a patient transfusion dependent if they received even a single transfusion in the preceding 2 months. Should that make them transfusion independent if they did not receive another for 9 weeks after starting therapy? The confusion is even worse when one tries to handle weekly CBCs during the course of therapy. It is recommended that studies should prospectively assess whether there is a difference in outcome from 0 to 6 months, 6 to 12 months, and longer than 12 months from the diagnosis of MDS. It sounds eminently sensible, except when one tries to actually use these recommendations practically. Should an average value of CBCs obtained over 6-month periods be used for these calculations, or should an absolute value closest to the specified times be the standard? Or perhaps a combination of both may be better? What should these values be compared with? A single pretherapy CBC or an average of many? And what about transfusions?
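
To make the baseline question concrete, the following is a minimal sketch, not anything specified by the IWG. The function names, the 4-week window, and the hemoglobin values are all hypothetical and chosen only for illustration. It shows that the same on-therapy value yields a different apparent rise depending on which baseline definition is used.

```python
from statistics import mean


def baseline_single(pre_counts):
    """Baseline defined as the last pretherapy value only (hypothetical rule)."""
    return pre_counts[-1]


def baseline_mean(pre_counts, weeks):
    """Baseline defined as the mean of the last `weeks` pretherapy values
    (hypothetical rule; the window length is an assumption)."""
    return mean(pre_counts[-weeks:])


# Hypothetical weekly hemoglobin values in g/dL, oldest first. Not patient data.
pre_hb = [8.0, 9.0, 8.5, 7.5]
on_therapy_hb = 9.5

# Apparent rise in hemoglobin under each baseline definition.
change_single = on_therapy_hb - baseline_single(pre_hb)
change_mean = on_therapy_hb - baseline_mean(pre_hb, weeks=4)

print(f"Rise vs single pretherapy value: {change_single:.2f} g/dL")
print(f"Rise vs 4-week pretherapy mean:  {change_mean:.2f} g/dL")
```

With these invented numbers the two definitions give rises of 2.00 and 1.25 g/dL, so any response cutoff lying between those values would classify the same patient differently depending on the evaluator's choice of baseline.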

Recently, the responses of patients treated on one of our protocols were evaluated by ourselves and by 2 independent groups. While all 3 evaluators agreed that a significant proportion of the patients responded to the therapy, the specific percentage of responders according to the 2 independent groups that used the IWG response criteria was different. The major stumbling block was a lack of specific criteria for establishing baseline parameter levels and for identifying the intratherapy values that should be used for response assessment.

Another problem is related to the interpretation of subtle differences between the responses of individual patients. Consider, for example, the variations between just 2 patients whose responses according to the new criteria might read something like “PR, cytogenetics minor, HI-E major transfusion, HI-E minor Hb, HI-P minor, HI-N major,” versus “PR, cytogenetics minor (by FISH only), HI-E minor transfusion, HI-E major Hb, HI-P major, HI-N minor.” Are these 2 responders really different, and if so, among 20 responders, what is the likelihood of having 20 different types of responses with minor variations? Further, are not these greatly expanded numbers of variables likely to affect P values and the interpretation of what constitutes significance? How is this supposed to introduce uniformity in the interpretation of results?

I would like to suggest that, before making such detailed recommendations with the serious intent of universal application, the writers of these types of classifications should make an attempt to apply their recommendations to a practical situation, since at least the obvious deficiencies would then be immediately apparent. The IWG paper1 would have been far more significant if the authors had included an analysis of an actual clinical trial to back the significance and universal utility of their recommendations. Failure to do so has resulted in adding to the confusion in an already complex situation. By neglecting to standardize baseline, as well as intratherapy, values of blood counts that must be used for response assessment, the IWG projects an image of a body far removed from patient care and evaluation of clinical trials. Rather, their recommendations appear to be a patchwork representing the unique interests of a few individual IWG participants. This situation is analogous to the ancient parable in the Upanishads where descriptions of an unseen elephant varied widely depending upon which part the narrator felt through a curtain. Perhaps it is time to address the ultimate Schrödinger-type interrogative: what constitutes MDS? I suggest that it is more prudent to consider refractory anemia, refractory anemia with ring sideroblasts, refractory anemia with excess of blasts, and chronic myelomonocytic leukemia as 4 distinct disorders, thereby simplifying interpretation of both biologic and clinical studies.

1. Cheson BD, Bennett JM, Kantarjian H, et al. Report of an international working group to standardize response criteria for myelodysplastic syndromes. Blood. 2000;96:3671-3674.