Key point(s)
Machine learning models can predict leukemic evolution in acquired aplastic anemia patients using retrospective clinical data.
ABSTRACT
Acquired aplastic anemia (AA) patients treated with immunosuppressive therapy (IST) face up to a 20% long-term risk of developing secondary myeloid neoplasms (sMNs), including acute myeloid leukemia and myelodysplastic syndromes. Although hematopoietic stem cell transplantation (HSCT) is curative and prevents sMNs, older patients and those lacking suitable donors have historically received IST as first-line therapy. Recent improvements in HSCT outcomes have expanded transplant eligibility, highlighting the need for tools to better identify patients at high risk for sMN. Validated predictive models could help guide early HSCT consideration or tailor surveillance strategies.
We developed two machine learning binary classifiers to predict sMN development in AA patients at clinically relevant time points: diagnosis (Model 1) and six months after IST response (Model 2). We analyzed data from 275 adult AA patients treated at UT Southwestern, Cleveland Clinic, and the Hospital of the University of Pennsylvania between 1975 and 2023. Seventy-nine clinical variables were collected, including demographics, somatic mutations, and treatment response. Neural networks were trained with leave-one-out cross-validation.
Both models achieved strong performance (AUC 0.82, sensitivity 0.82, specificity 0.73). Shared key predictors included DNMT3A mutation, CUX1 mutation, total mutation count, and age. TET2 mutation was specific to Model 1; PNH clone presence was unique to Model 2. High-risk classification was significantly associated with worse overall survival (p < 0.0001).
These findings support the feasibility of machine learning–based sMN risk prediction in AA. With training on larger datasets and external validation, these models may support individualized decision-making around HSCT and post-IST surveillance.
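For readers unfamiliar with the evaluation scheme named in the abstract, the following is a minimal sketch of leave-one-out cross-validation with AUC, sensitivity, and specificity computed from the held-out predictions. It is not the study's source code: a single logistic unit trained by gradient descent stands in for the neural networks, the two features and all labels are randomly generated rather than patient data, and every function name, the 0.5 threshold, and the training settings are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=200, lr=0.5):
    # Single logistic unit fitted by batch gradient descent
    # (an illustrative stand-in for a neural network).
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def leave_one_out_scores(X, y):
    # Each sample is scored by a model trained on all other samples.
    scores = []
    for i in range(len(X)):
        model = train_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(predict_proba(model, X[i]))
    return scores


def auc(y, scores):
    # Probability that a random positive outranks a random negative
    # (ties count as one half).
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y, scores, threshold=0.5):
    tp = sum(1 for s, t in zip(scores, y) if t == 1 and s >= threshold)
    fn = sum(1 for s, t in zip(scores, y) if t == 1 and s < threshold)
    tn = sum(1 for s, t in zip(scores, y) if t == 0 and s < threshold)
    fp = sum(1 for s, t in zip(scores, y) if t == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic cohort: two standardized made-up features per "patient".
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(30)]
y = [1 if x[0] + x[1] + random.gauss(0, 0.5) > 0 else 0 for x in X]

scores = leave_one_out_scores(X, y)
sens, spec = sensitivity_specificity(y, scores)
print("AUC", round(auc(y, scores), 2), "sens", round(sens, 2), "spec", round(spec, 2))
```

Because every prediction comes from a model that never saw the held-out sample, the pooled scores give an internal estimate of discrimination. This remains distinct from the external validation the abstract calls for.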
Author notes
A.C.T. and M.M. are co-first authors.
DATA SHARING STATEMENT
For deidentified data, study protocol, and source code, please contact Taha.Bat@utsouthwestern.edu.