Kewan T, Durmaz A, Bahaj W, et al. Molecular patterns identify distinct subclasses of myeloid neoplasia. Nat Comm. 2023;14:3136.

Disease classification has major implications for patients and their treating physicians. Standardized disease names enable patients to better understand their illness and provide a lingua franca for communication among physicians. Furthermore, the general category of disease (e.g., myelodysplastic syndromes [MDS] vs. acute myeloid leukemia [AML]) and specific disease subgroups each can aid in predicting clinical behavior, suggesting therapeutic options, and determining eligibility for clinical trials and approved treatments. However, the process of creating and modifying disease classification systems is imperfect, being driven by groups of experts whose interpretations of supportive literature, emphasis on practical versus scientific considerations, and tendency to respect historic tenets may differ.

The framework we use to diagnose and classify myeloid neoplasms has been developed via an iterative process, beginning with the establishment of the French-American-British (FAB) classification in 1982.1  In the context of MDS and AML, this framework has evolved from a single, morphologically based FAB scheme to a system that includes increasingly refined schemes that now incorporate both genetically and morphologically defined disease entities.

In 2022, two new classification systems were published: the International Consensus Classification (ICC)2 and the 5th edition of the World Health Organization Classification (WHO5).3 While these two systems share many similarities, their criteria for the diagnosis and classification of MDS and AML differ in several respects,4 which is not surprising given that they were developed independently by two different groups of experts. This complex situation raises the question of whether we can leverage artificial intelligence to improve upon the current human-designed systems used to diagnose and classify disease.

Dr. Tariq Kewan and colleagues applied machine-learning methods to identify discrete molecular entities within a dataset derived from 3,588 patients with MDS (2,853 patients) and secondary AML (sAML, 744 patients). Computational algorithms refined using an unsupervised approach identified 14 unique molecular classes (MCs) with shared patterns of gene mutations and cytogenetic aberrations, which were validated internally as well as externally in a separate cohort of 412 patients with MDS/sAML. The defining features of each group were driven by the presence of unique driver mutations and cytogenetic abnormalities and were influenced by co-mutation patterns (Figure 1A). Gratifyingly, some groups corresponded to well-established MDS entities. For example, MC8 was composed entirely of patients with del(5q) cytogenetic aberrations, two-thirds of whom had disease that could be classified as MDS with isolated del(5q) based on the current classification schemes. Two genetically defined entities introduced in the recent ICC and WHO5 classifications were represented in MC13 (bi-allelic TP53 mutation and complex karyotype) and MC4/MC10 (MDS with SF3B1 mutation). However, other diagnoses were distributed amongst several molecular subgroups. For example, sAML cases were mostly observed in four separate subgroups (MC1, MC2, MC13, MC14), while MDS with excess blasts largely populated three different subgroups (MC3, MC6, MC9). Splicing factor mutations occurred in several different subgroups due to the influence of co-mutations in epigenetic modifier genes.
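To make the idea of unsupervised grouping concrete, the following is a minimal sketch of how binary mutation profiles can be partitioned without any diagnostic labels. It is not the authors' pipeline, whose algorithms are not detailed here; k-modes clustering with Hamming distance is used as a simple stand-in, and the four features and six patient profiles are hypothetical toy data chosen only for illustration.

```python
# Toy illustration only: k-modes clustering of binary mutation profiles.
# The feature set and the patient profiles are hypothetical, and k-modes is
# an assumed stand-in for the study's (unspecified here) clustering method.

FEATURES = ("SF3B1", "TP53", "del(5q)", "complex_karyotype")

def hamming(a, b):
    """Number of features on which two binary profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def farthest_first(profiles, k):
    """Deterministic seeding: start from the first profile, then repeatedly
    add the profile farthest from all centers chosen so far."""
    centers = [profiles[0]]
    while len(centers) < k:
        centers.append(
            max(profiles, key=lambda p: min(hamming(p, c) for c in centers))
        )
    return centers

def k_modes(profiles, k, max_iter=20):
    """Assign each profile to its nearest center, then reset each center to
    the per-feature majority value (the mode) of its members; repeat until
    the assignments stop changing."""
    centers = farthest_first(profiles, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: hamming(p, centers[j])) for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                # A feature is set in the mode only if a strict majority carry it.
                centers[j] = tuple(
                    int(2 * sum(col) > len(members)) for col in zip(*members)
                )
    return labels, centers

# Six hypothetical patients; 1 = lesion present, 0 = absent, in FEATURES order.
profiles = [
    (1, 0, 0, 0),  # SF3B1 only
    (1, 0, 0, 0),  # SF3B1 only
    (0, 1, 0, 1),  # TP53 + complex karyotype
    (0, 1, 1, 1),  # TP53 + del(5q) + complex karyotype
    (0, 0, 1, 0),  # del(5q) only
    (0, 0, 1, 0),  # del(5q) only
]

labels, centers = k_modes(profiles, k=3)
print(labels)   # cluster index per patient
print(centers)  # representative (modal) profile per cluster
```

In this toy example the patient with del(5q) plus a TP53 mutation and complex karyotype is grouped with the TP53/complex-karyotype patient rather than with the isolated del(5q) patients, which mirrors in miniature how co-occurring lesions, not a single abnormality, can determine group membership.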

Figure 1A

Color-coded mutation profiles for each of the 14 genomic clusters, determined using the uniform manifold approximation and projection (UMAP) technique, which reduces the 16-dimensional binary mutation profiles to a three-dimensional (3D) diagram. Each cluster occupies a distinct space on the 3D diagram, indicating a unique mutation signature (adapted from Figure 1B in Kewan T, et al. Nat Comm. 2023;14:3136).

The authors noted significant differences in patient age, sex, and blood counts among the molecular subgroups, indicating that this unsupervised classification process had created disease categories with distinct clinical features; this genomic-clinical association has also been demonstrated in other molecular-derived classification systems.5  Not surprisingly, marked differences in survival were observed between the different molecular subgroups, with median survival times ranging from 5.5 to 51 months (Figure 1B). However, significant heterogeneity in morphologic features (e.g., blast count), prognostic risk categories, and patient outcomes was observed within each category. While this can be interpreted as a potential weakness, Dr. Kewan and colleagues emphasized the value of an objectively developed genomic classification system that more accurately captures true disease pathophysiology based on its underlying molecular signature. Further, the authors suggest that features such as the blast percentages that traditionally separate MDS from AML and define discrete categories within MDS may be better regarded as reflecting the stage of disease, independent of the actual disease class. That is, patients with higher blast counts may present later in the course of disease evolution rather than manifesting a disease that should be considered in a distinct category. Moreover, in some molecular subgroups, such as that characterized by bi-allelic TP53 mutation, the blast percentage exerted no significant effect on patient outcomes.

Figure 1B

Kaplan-Meier analysis showing overall survival for patients in each genomic cluster, with marked differences in outcomes among the different genomic clusters. The color coding for each survival curve corresponds to the clusters indicated in Figure 1A (adapted from Supplementary Figure 11 in Kewan T, et al. Nat Comm. 2023;14:3136).
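For readers unfamiliar with how such survival curves and median survival times are derived, the Kaplan-Meier (product-limit) estimator can be sketched in a few lines. The function names and the follow-up data below are hypothetical and are not drawn from the study; the sketch only illustrates the standard calculation, in which the survival probability is multiplied by (1 - deaths / number at risk) at each observed death time and censored patients simply leave the risk set.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time per patient (e.g., months)
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)        # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]       # remove deaths and censored patients alike
    return curve

def median_survival(curve):
    """First time at which estimated survival falls to 50% or below
    (None if the curve never reaches 50%)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical follow-up for one cluster of seven patients (months).
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 1]   # 0 marks a censored patient

curve = kaplan_meier(times, events)
print(curve)
print(median_survival(curve))
```

Applying the same calculation separately to each cluster yields one curve per cluster, which is the form of comparison shown in the figure.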

The recent work by Dr. Kewan and colleagues demonstrates that artificial intelligence can indeed generate molecular signatures that can be used to develop an objective scheme for disease classification. Building on the findings of previous studies that used similar machine-learning schemes to classify MDS, this work provides further evidence that disease categories can and should be defined based on their genetic pathogenesis rather than their historically codified morphology and clinical presentation.5-7 Nonetheless, it remains to be determined whether and how a genomically based classification scheme should incorporate tried-and-true morphologic features such as blast percentage or the degree of morphologic dysplasia. Although we may not yet be at the point of relinquishing disease classification to a computer algorithm, the work by Dr. Kewan’s team and similar bioinformatic efforts are sure to exert a substantial impact on the next iteration of myeloid neoplasm classifications.

Dr. Hasserjian reports receiving consulting income from AstraZeneca, Bluebird Bio, Daiichi Sankyo, and Jazz Pharmaceuticals.

1. Bennett JM, Catovsky D, Daniel MT, et al. Proposals for the classification of the myelodysplastic syndromes. Br J Haematol. 1982;51(2):189-199.
2. Arber DA, Orazi A, Hasserjian RP, et al. International Consensus Classification of Myeloid Neoplasms and Acute Leukemia: Integrating morphological, clinical, and genomic data. Blood. 2022;140(11):1200-1228.
3. Khoury JD, Solary E, Abla O, et al. The 5th edition of the World Health Organization Classification of Haematolymphoid Tumours: Myeloid and Histiocytic/Dendritic Neoplasms. Leukemia. 2022;36(7):1703-1719.
4. Aster JC. What is in a name? Consequences of the classification schism in hematopathology. J Clin Oncol. 2023;41(8):1523-1526.
5. Nagata Y, Zhao R, Awada H, et al. Machine learning demonstrates that somatic mutations imprint invariant morphologic features in myelodysplastic syndromes. Blood. 2020;136(20):2249-2262.
6. Bersanelli M, Travaglino E, Meggendorfer M, et al. Classification and personalized prognostic assessment on the basis of clinical and genomic features in myelodysplastic syndromes. J Clin Oncol. 2021;39(11):1223-1233.
7. Huber S, Haferlach T, Muller H, et al. MDS subclassification-do we still have to count blasts? Leukemia. 2023;37(4):942-945.