Microarray analysis can identify differentially expressed genes associated with distinct clinical and therapeutically relevant classes of both pediatric and adult leukemias. Recently, the MILE (Microarray Innovations in Leukemia) research study program has been launched in 10 centers: 7 from the European Leukemia Network (ELN, WP13) and 3 from the US. In this study, which will include 4,000 patients, the clinical accuracy of gene expression profiles of 16 acute and chronic leukemia subclasses, MDS, and non-leukemia as control will be assessed against the current routine diagnostic workup. Each center is trained on an identical microarray protocol and uses the same laboratory equipment, kits, and reagents for target preparation (Affymetrix HG-U133 Plus 2.0). First, the intra- and inter-laboratory comparability was investigated using 2 different cell line samples, MCF-7 and HEPG2, with different amounts of starting material (1 μg and 5 μg input for cDNA synthesis). In addition, each center prepared total RNA in parallel and processed replicate samples from three leukemia patients (AML with t(8;21), CML, and CLL). We found high reproducibility among the different centers: unsupervised analyses group the two cell lines separately from the three types of leukemia samples. In hierarchical clustering and principal component analysis, the non-leukemia samples are clearly distinct from the leukemia samples, and no clustering by individual center can be seen. Remarkably, for the replicates of the leukemia samples, the squared correlation coefficients of gene expression range between 0.975 and 0.997 for CML, between 0.975 and 0.998 for CLL, and between 0.970 and 0.999 for the AML with t(8;21). Secondly, the samples were analyzed by a classification algorithm. The algorithm was trained on a database that contains gene expression profiles of >1,600 leukemia patients and cell lines and can distinguish 16 different classes of leukemia, MDS, and non-leukemia.
Several methods are used to form linear classifiers for all 18 * (18 – 1)/2 = 153 class pairs. The average cross-validation accuracy is 91% or higher. Miscalls are predominantly seen in the distinction between MDS and AML with normal karyotype. The accuracy of resubstitution (application of the classifier to the data used to build it) is 100%. For the new data, accurate predictions were observed for the non-leukemia cell lines, AML with t(8;21), and CLL. Interestingly, the CML in blast crisis is predicted as AML with other abnormalities. This may be because the classifier was trained on CML in chronic phase only. In conclusion, for the first time, an international multi-center research study demonstrates very high reproducibility of microarray analyses performed at different centers on the same leukemia samples. This lays the foundation for an international clinical research initiative evaluating the application of microarrays in the diagnosis and classification of hematological malignancies.
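
The count of 153 class pairs follows from pairing each of the 18 classes (16 leukemia subclasses, MDS, and non-leukemia) with every other class. The following is a minimal sketch of that enumeration only. The class labels are placeholders, and the majority-vote step used to combine the pairwise calls is an assumption for illustration; the abstract does not state how the pairwise classifiers are combined.

```python
from collections import Counter
from itertools import combinations

# Placeholder labels (hypothetical): 16 leukemia subclasses, MDS, non-leukemia.
CLASSES = [f"leukemia_class_{i}" for i in range(1, 17)] + ["MDS", "non_leukemia"]

# Every unordered pair of classes gets its own binary classifier.
CLASS_PAIRS = list(combinations(CLASSES, 2))


def predict_by_vote(sample, pair_classifiers):
    """Combine pairwise calls by majority vote (an assumed combination rule).

    pair_classifiers maps each (class_a, class_b) pair to a function that
    takes a sample and returns either class_a or class_b.
    """
    votes = Counter(clf(sample) for clf in pair_classifiers.values())
    return votes.most_common(1)[0][0]


def make_dummy_classifier(pair, favored="MDS"):
    """Toy stand-in for a trained linear classifier: picks `favored` whenever
    it is in the pair, otherwise the first class of the pair."""
    return lambda sample: favored if favored in pair else pair[0]


if __name__ == "__main__":
    print(len(CLASS_PAIRS))  # number of pairwise classifiers
    dummy = {pair: make_dummy_classifier(pair) for pair in CLASS_PAIRS}
    print(predict_by_vote(sample=None, pair_classifiers=dummy))
```

In a real pipeline each dummy function would be replaced by a linear classifier trained on the expression profiles of the two classes in its pair.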
