To date, the comprehensive diagnosis of leukemia requires a combination of cytomorphology, immunophenotyping, and genetic methods. We aimed to develop a new diagnostic tool based solely on gene expression profiling to accurately predict all clinically relevant subtypes of leukemia in adults and to distinguish these from normal bone marrow. Therefore, we analyzed samples from 1337 untreated patients at diagnosis and healthy donors using oligonucleotide microarrays. The first series of 937 cases was hybridized to HG-U133A+B microarrays (Affymetrix). The following 13 subgroups were included: 620 AML (42 t(15;17); 38 t(8;21); 49 inv(16); 47 t(11q23); 75 complex aberrant karyotype; 193 normal karyotype; 176 other cytogenetic abnormalities); 152 ALL (26 Pro-B-ALL/t(11q23); 12 ALL-t(8;14); 32 T-ALL; 82 c-ALL/Pre-B-ALL); 75 CML; 45 CLL; and 45 bone marrows from healthy volunteers or patients without leukemia (nBM). For each disease entity, the top 100 differentially expressed genes were calculated in a one-versus-all (OVA) approach. Class prediction was performed using support vector machines (SVM). Prediction accuracy was estimated by 10-fold cross-validation (CV) and assessed for robustness in a resampling approach. In 10-fold CV, 891 of the 937 samples (95.1%) were correctly classified. A resampling approach with a 2/3 training and 1/3 test cohort (100 runs of SVM) confirmed this high accuracy (median, 93.8%). In particular, a median of 100% sensitivity and specificity was achieved for AML with t(15;17), t(8;21), and inv(16), as well as for Pro-B-ALL/t(11q23) and CLL. The median specificity was at least 99.7% in all subgroups except for AML normal/other (median specificity, 93.7%). In a second step, T-ALL cases were separated into cortical and immature ones (accuracy, 84.4%) and c-ALL/Pre-B-ALL into cases with and without t(9;22) (accuracy, 82.9%). The second, prospective series comprised 400 unselected cases, which were hybridized to the new-generation HG-U133 Plus 2.0 microarrays (Affymetrix).
To validate the diagnostic accuracy of our approach, these cases were processed in a blinded fashion in parallel with the routine diagnostic work-up and classified based on the gene expression signatures discovered in the first series. In the first classification step, the 13 different diagnoses were predicted with an accuracy of 94.5%. Failures were mostly due to misclassification into biologically related subgroups, e.g., AML with del(5q) aberrations classified as AML with complex aberrant karyotype. In the second step (separation of the two T-ALL subtypes, and of c-ALL/Pre-B-ALL with or without t(9;22)), accuracies of 100% and 70.6%, respectively, were achieved. In conclusion, we were able to identify, within a routine diagnostic workflow, distinct expression profiles for all clinically and prognostically relevant adult leukemia subtypes and to discriminate them from nBM based only on gene expression data. Accuracy, sensitivity, and specificity were higher than those achieved with any one of today's gold-standard techniques used alone. Thus, gene expression patterns analyzed by microarrays qualify as a diagnostic tool in a routine setting for leukemia diagnosis and classification and may guide relevant therapeutic decisions.
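
The per-subgroup sensitivity and specificity reported above follow the usual one-versus-all reading of a multi-class result: each subgroup in turn is treated as "positive" and all other subgroups as "negative". The short Python sketch below illustrates that bookkeeping only. It is not the authors' pipeline; the function names and the toy labels are hypothetical and were invented for illustration, and only the 891/937 figure is taken from the text.

```python
def one_vs_all_metrics(true_labels, predicted_labels):
    """Return {subgroup: (sensitivity, specificity)} in a one-versus-all sense.

    Hypothetical helper for illustration; not part of the published method.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    metrics = {}
    for cls in sorted(set(true_labels)):
        tp = fn = fp = tn = 0
        for truth, pred in zip(true_labels, predicted_labels):
            if truth == cls and pred == cls:
                tp += 1  # case of this subgroup, correctly recognized
            elif truth == cls:
                fn += 1  # case of this subgroup, assigned elsewhere
            elif pred == cls:
                fp += 1  # case of another subgroup, wrongly assigned here
            else:
                tn += 1  # case of another subgroup, not assigned here
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[cls] = (sensitivity, specificity)
    return metrics


def accuracy(true_labels, predicted_labels):
    """Fraction of cases whose predicted subgroup equals the true subgroup."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy labels, invented for illustration (not study data). One "AML_other"
# case is misassigned to the related "AML_complex" subgroup.
true_labels = ["CLL", "CLL", "CML", "CML", "nBM", "nBM",
               "AML_complex", "AML_other"]
predicted_labels = ["CLL", "CLL", "CML", "CML", "nBM", "nBM",
                    "AML_complex", "AML_complex"]

metrics = one_vs_all_metrics(true_labels, predicted_labels)
overall = accuracy(true_labels, predicted_labels)

# The cross-validation accuracy quoted in the text, recomputed from its counts.
reported_cv_accuracy = round(100 * 891 / 937, 1)
```

In the toy data, the single misassigned case lowers the sensitivity of its true subgroup and the specificity of the subgroup it was assigned to, while subgroups such as CLL keep 100% for both measures. This mirrors the pattern described above, where errors concentrate in biologically related AML subgroups.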
