Background. In hematology, leveraging real-world multimodal data at large scale is crucial for developing personalized medicine to address unmet clinical needs, particularly for rare diseases. Generative AI in healthcare shows great promise by generating multimodal synthetic data (SD) to improve patients' diagnosis and prognosis while accelerating clinical research (PMID: 34131324). The challenges in generating SD include accessing complete real-world datasets for model training, maintaining intrinsic relationships among different data layers, and ensuring clinical accuracy and privacy protection.

Aims. This project, conducted by the GenoMed4All and Synthema consortia, aimed to: 1) implement an innovative approach for generating high-fidelity multimodal SD from patients with myeloid neoplasms (MN); 2) develop a comprehensive multimodal Synthetic Validation Framework (SVF) to assess the clinical and statistical fidelity and the privacy preservation of SD; 3) verify the capability of SD technology to accelerate research and enhance predictive models through multimodal data integration.

Methods. We developed an SD generation pipeline with conditional GAN, Tabular-VAE and Tabular-GPT architectures to generate tabular data including clinical information, cytogenetics, somatic mutations and transcriptomics (bulk RNA-seq of CD34+ bone marrow (BM) cells). Longitudinal information was generated by a large language model fine-tuned on hematological data. Starting from clinical and genomic features, BM images stained with hematoxylin and eosin or May-Grünwald Giemsa were generated by a Stable Diffusion model with a CLIP module trained on hematological data. Privacy preservation and the statistical and clinical fidelity of SD were assessed with the SVF. The MOSAIC framework (PMID: 38875514) was used to perform disease classification and personalized prognostic assessment, with predictions explained by SHAP. A deep learning-based framework for multimodal analysis in hematology (based on PMID: 35944502) was implemented for survival analysis.

Results. Our pipeline, trained on 605 patients with myelodysplastic syndromes (MDS) and 877 patients with acute myeloid leukemia (AML), generated 1,210 and 2,631 synthetic patients, respectively. Fidelity was assessed by comparing real data and SD using the SVF. Feature distributions and correlations of clinical information and BM morphological features were comparable (91% and 87% fidelity, respectively). The distribution of genomic alterations and pairwise gene associations showed 88% fidelity.
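The abstract does not state how the SVF fidelity percentages are computed. Purely as an illustration, the sketch below scores one categorical feature as one minus the total variation distance between its real and synthetic category frequencies. The function name `tvd_fidelity`, the choice of metric and the toy data are all assumptions and are not taken from the SVF.

```python
from collections import Counter


def tvd_fidelity(real_values, synthetic_values):
    """Return 1 - total variation distance between two categorical samples.

    Assumed metric for illustration only; 1.0 means identical category
    frequencies, 0.0 means the two samples share no category.
    """
    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    n_real, n_synth = len(real_values), len(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    tvd = 0.5 * sum(
        abs(real_freq[c] / n_real - synth_freq[c] / n_synth) for c in categories
    )
    return 1.0 - tvd


# Hypothetical toy data: mutation status of one gene in each cohort.
real = ["mutated"] * 30 + ["wild-type"] * 70
synthetic = ["mutated"] * 56 + ["wild-type"] * 144
score = tvd_fidelity(real, synthetic)
print(f"fidelity for this feature: {score:.2f}")
```

Averaging such a score over all features would give one cohort-level percentage, although the framework described here may aggregate differently.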

We assessed the quality and biological fidelity of real vs. synthetic RNA-seq data. Descriptive statistics, read coverage distribution, gene-wise dispersion estimates and PCA were comparable in both sets (90% fidelity). Differentially expressed genes and enriched biological pathways also overlapped. Transcriptomic signatures were compared and clinically validated using unsupervised clustering and survival analysis.

We then compared longitudinal outcomes in real and synthetic patients, finding overlapping overall survival (OS) and leukemia-free survival (LFS) with 96% and 92% fidelity, and log-rank p-values of 0.77 and 0.52, respectively. Regarding privacy preservation, no real patients were copied into the SD, and the nearest neighbor distance ratio (NNDR) was 0.84, indicating low privacy risk.
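The two privacy checks reported above can be sketched as follows, assuming the common definition of NNDR: for each synthetic record, the distance to its nearest real record divided by the distance to its second-nearest real record. Values near 1 suggest the synthetic record is not unusually close to any single real patient, while values near 0 suggest it sits next to one specific real record. The function name, the Euclidean metric and the toy records are assumptions; the abstract does not describe the SVF implementation.

```python
import math
from statistics import mean


def nndr_and_copies(synthetic, real):
    """Return (mean NNDR, number of exact copies) for numeric records.

    Assumes Euclidean distance on already-scaled features and at least
    two real records.
    """
    ratios = []
    copies = 0
    for s in synthetic:
        dists = sorted(math.dist(s, r) for r in real)
        nearest, second = dists[0], dists[1]
        if nearest == 0.0:
            copies += 1  # synthetic record identical to a real one
        # Ratio is treated as 0 (worst case) when both distances are zero.
        ratios.append(nearest / second if second > 0.0 else 0.0)
    return mean(ratios), copies


# Hypothetical toy records with two scaled features each.
real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = [(0.5, 0.5), (0.0, 0.0)]
mean_nndr, n_copies = nndr_and_copies(synthetic, real)
print(mean_nndr, n_copies)
```

In this toy case the first synthetic record is equidistant from all real records (ratio 1) and the second is an exact copy (ratio 0), so the copy count flags a leak that the mean ratio alone would understate.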

As clinical validation, we showed that SD augmentation improved performance on disease classification based on clinical, genomic, cytogenetic and BM morphological features. Two XGBoost classification models, one trained on real data and one on SD, and both tested on a separate real set, achieved comparable performance (F1-score 76% vs 81%). All features were included in a multimodal deep learning-based framework with OS as the primary endpoint. Results showed similar concordance for the models trained on real data and on SD (0.85 vs 0.84). Preliminary analysis showed that training models on a hybrid dataset (real and SD) improved the performance of classification and prognostic models. We implemented the JUNO platform (https://juno-xkb3corsxq-ew.a.run.app/) to enable clinicians to generate multimodal SD from an existing biobank of real patients.
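For readers unfamiliar with the concordance values quoted above, the sketch below computes Harrell's concordance index, assuming that is the statistic meant (the abstract does not name it). A pair of patients is comparable when the one with the shorter follow-up had an observed event; the pair is concordant when the model assigns that patient the higher risk. The function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk_scores: model output where a higher value means higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort of four patients.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so scores of 0.85 and 0.84 would indicate nearly identical discrimination between the two training sets.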

Conclusion. AI-generated SD accurately replicate the statistical properties and complexity of multimodal features in MN. They provide reliable, privacy-compliant and clinically accurate information that can be customized to test scientific hypotheses, validate models, and potentially accelerate clinical trials, thereby improving personalized medicine in hematology.

Disclosures

Santoro:Beigene: Speakers Bureau; Sandoz: Speakers Bureau; Lilly: Speakers Bureau; Arqule: Speakers Bureau; Astrazeneca: Speakers Bureau; Celgene: Speakers Bureau; Amgen: Speakers Bureau; Abb-vie: Speakers Bureau; Roche: Speakers Bureau; Takeda: Speakers Bureau; MSD: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Bayer: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; EISAI: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Pfizer: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Gilead: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Servier: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; BMS: Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Incyte: Consultancy; Sanofi: Consultancy; Novartis: Speakers Bureau. Sanz:AstraZeneca, GSK: Consultancy, Honoraria; Novartis, ExCellera: Speakers Bureau; BMS: Research Funding; Novartis, BMS, J&J, Takeda, Amgen, Menarini, Bayer, Pfizer: Other. Santini:Ascentage, AbbVie, Bristol Myers Squibb, CTI BioPharma, Geron, Gilead, Novartis, Servier, Syros Pharmaceuticals: Other: Advisory Board. Platzbecker:Amgen: Consultancy, Research Funding; BMS: Consultancy, Membership on an entity's Board of Directors or advisory committees, Other: Travel support, Research Funding; MDS Foundation: Membership on an entity's Board of Directors or advisory committees; Abbvie: Consultancy, Research Funding; Curis: Consultancy, Honoraria, Research Funding; Geron: Consultancy; Janssen: Consultancy, Honoraria, Research Funding; Merck: Research Funding; Novartis: Consultancy, Research Funding. 
Fenaux:Astex: Research Funding; Agios: Research Funding; Servier: Research Funding; AbbVie: Honoraria, Research Funding; BMS: Honoraria, Research Funding; Janssen: Research Funding; Novartis: Research Funding; Jazz Pharmaceuticals: Honoraria, Research Funding. Diez-Campelo:AGIOS: Consultancy, Membership on an entity's Board of Directors or advisory committees; SYROS: Membership on an entity's Board of Directors or advisory committees; HEMAVAN: Membership on an entity's Board of Directors or advisory committees; ASTEX/OTSUKA: Membership on an entity's Board of Directors or advisory committees, Other: TRAVEL TO MEETINGS; BLUEPRINT MEDICINES: Consultancy, Membership on an entity's Board of Directors or advisory committees; KEROS: Honoraria, Membership on an entity's Board of Directors or advisory committees; Novartis: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees; GSK: Consultancy, Membership on an entity's Board of Directors or advisory committees; BMS/Celgene: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Other: Advisory board fees; Gilead: Other: Travel reimbursement; CURIS: Membership on an entity's Board of Directors or advisory committees. Kordasti:Pfizer: Consultancy, Speakers Bureau; MorphoSys: Research Funding; Beckman Coulter: Speakers Bureau; Alexion: Consultancy; API: Consultancy; Boston Biomed: Consultancy; Celgene: Research Funding; Novartis: Consultancy, Honoraria, Research Funding, Speakers Bureau. 
Komrokji:Celgene/BMS: Consultancy, Membership on an entity's Board of Directors or advisory committees, Research Funding; BMS: Honoraria, Membership on an entity's Board of Directors or advisory committees; Sobi: Consultancy, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Sumitomo Pharma: Consultancy, Membership on an entity's Board of Directors or advisory committees; Janssen: Consultancy; BMS: Research Funding; Taiho: Membership on an entity's Board of Directors or advisory committees; DSI: Honoraria, Membership on an entity's Board of Directors or advisory committees; Servio: Membership on an entity's Board of Directors or advisory committees; Servio: Honoraria; Novartis: Membership on an entity's Board of Directors or advisory committees; Rigel: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Servier: Consultancy, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Geron: Consultancy, Membership on an entity's Board of Directors or advisory committees; Genentech: Consultancy; AbbVie: Consultancy, Membership on an entity's Board of Directors or advisory committees; Keros: Membership on an entity's Board of Directors or advisory committees; CTI biopharma: Membership on an entity's Board of Directors or advisory committees; PharmaEssentia: Consultancy, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Jazz Pharmaceuticals: Consultancy, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; DSI: Consultancy, Membership on an entity's Board of Directors or advisory committees. 
Garcia-Manero:Merck: Research Funding; AbbVie: Research Funding; Amphivena: Research Funding; Curis: Research Funding; Janssen: Research Funding; Helsinn: Research Funding; Astex: Other: Personal fees; Helsinn: Other: Personal fees; Novartis: Research Funding; Genentech: Research Funding; Genentech: Other: Personal fees; Astex: Research Funding; Aprea: Research Funding; Onconova: Research Funding; Forty Seven: Research Funding; Bristol Myers Squibb: Other: Personal fees, Research Funding; H3 Biomedicine: Research Funding. Della Porta:Bristol Myers Squibb: Consultancy.
