• Classification algorithms based on fluorescence quantiles enhance the performance of flow cytometry analysis in B-CLPDs.

• A data-driven decision tree improves the diagnosis of B-CLPDs.

Abstract

Accurate diagnosis of B-cell chronic lymphoproliferative disorders (B-CLPDs) remains challenging due to overlapping phenotypes across subtypes. Machine learning (ML) offers promising tools to improve marker evaluation and refine flow cytometry analysis. We investigated the use of ML algorithms to evaluate the diagnostic value of incorporating CD148, CD180, and CD200 into the standard B-CLPD phenotyping panel and to develop a diagnostic decision tree. We trained models with flow cytometry data from 480 patients with B-CLPDs using XGBoost and DecisionTree algorithms. The final models integrated 2 categorical markers (CD5 and CD10) and quantiles of fluorescence intensity of 3 quantitative markers (CD20, CD180, and CD200) to classify 6 B-CLPD subtypes. These trained models were applied to an independent cohort of 433 patients with B-CLPD analyzed on a different flow cytometer platform. DecisionTree models achieved the highest classification accuracy (mean accuracy, 0.88) in the validation cohort. The overall specificity ranged from 0.95 (lymphoplasmacytic lymphoma [LPL]) to 1 (hairy cell leukemia [HCL]), whereas sensitivity varied from 0.75 (LPL) to 1 (HCL). The DecisionTree model demonstrated superior identification of chronic lymphocytic leukemia compared to a Matutes score of 4 or 5 (P = .029). In more than half of the cases, a diagnosis was determined with near certainty using only the cytometry data. For the remaining cases, a hierarchical approach incorporating additional tests was proposed. For practical implementation, an interactive interface provides diagnostic predictions, positive predictive values, and Gini index scores. This study establishes an ML-optimized strategy for B-CLPD classification, combining phenotypic, cytogenetic, and molecular data to enhance diagnostic accuracy of leukemic B-CLPD cells. This trial was registered at www.ClinicalTrials.gov as #NCT04952974.

Flow cytometry plays a pivotal role in the initial evaluation of B-cell chronic lymphoproliferative disorders (B-CLPDs), alongside cytological examinations.1,2 It provides crucial diagnostic information within a short time frame, which is later refined with histological, genetic, and molecular analyses for definitive classification in some cases.3 Expert consensus panels have established diagnostic guidelines that typically include 8 to >15 antibodies and are guided by tumor-cell morphology. However, the significant phenotypic overlap among B-CLPD subtypes presents major diagnostic challenges.2,4-6 Scoring systems exist for specific subtypes, such as chronic lymphocytic leukemia (CLL; Matutes score, based on the expression of CD5, CD23, FMC7, CD22/CD79b, and surface immunoglobulin) and hairy cell leukemia (HCL; which uses markers such as CD5, CD23, CD10, CD27, CD11c, CD103, CD123, and CD25). However, these systems are limited in scope and do not fully encompass the entire spectrum of B-CLPDs.7,8 

Recent studies have identified CD148, CD180, and CD200 as valuable markers for diagnosing B-CLPDs.9-12 CD148, also known as protein tyrosine phosphatase receptor type J (PTPRJ) or density-enhanced phosphatase 1 (DEP-1), was initially identified as a phenotypic marker of mantle cell lymphoma (MCL) through mass spectrometry analysis of plasma membrane microvesicles from leukemic cells.13 This tyrosine phosphatase acts as a negative regulator of cell proliferation, differentiation, and transformation, and is highly expressed in MCL.10,13,14 CD180, also known as radioprotective 105 kDa (RP-105), is a Toll-like receptor–related protein that closely resembles Toll-like receptor 4. It was also identified through mass spectrometry analysis of plasma membrane microvesicles and is highly expressed in marginal zone lymphoma (MZL).15 The CD180:CD200 ratio has been suggested as a diagnostic marker for splenic diffuse red pulp small B-cell lymphoma.16,17 

CD200 is a broadly expressed glycoprotein that plays a key role in modulating the immune response, making it an important immune checkpoint molecule.18 CD200 expression tends to be low in MCL3,14 but is significantly higher in CLL and HCL.18,19 However, the heterogeneous expression patterns of these markers across B-CLPDs pose challenges for developing simple diagnostic scoring systems or decision trees.

To address these challenges, we explored the potential of machine learning (ML) algorithms to improve the accuracy and efficiency of B-CLPD subtype classification using flow cytometry.

Recent studies have shown that ML techniques such as Uniform Manifold Approximation and Projection and random forest classifiers can effectively distinguish normal from pathological phenotypes across various hematologic malignancy subgroups.20 Dimension reduction tools such as principal component analysis, Uniform Manifold Approximation and Projection, or t-distributed stochastic neighbor embedding are now available in multiparametric flow cytometry analysis platforms.21 However, these tools do not provide diagnostic predictions. In this study, we integrated novel markers (CD148, CD180, and CD200) with established markers, including those from the Matutes score, to develop a robust ML-based classification model. We propose an optimized diagnostic strategy to improve the accuracy of B-CLPD subtype classification.

Study design and patients

This monocentric study was approved by the ethical committee of the School of Medicine of Strasbourg and the University Hospital of Strasbourg, France (reference CE-2021-42) and is registered at ClinicalTrials.gov (identifier: NCT04952974). This study was conducted in compliance with the Good Clinical Practices protocol and Declaration of Helsinki principles. Only deidentified patient records were used, and no individually identifiable data were collected, used, or transmitted. All patients were informed about the research use of their anonymized clinical data. No additional blood or marrow samples were collected beyond routine diagnostic procedures. Diagnoses were established according to the 2016 World Health Organization criteria, including clinical symptoms, morphological analysis (cytology of lymphoid cells and histology of biopsy when available), immunophenotyping (flow cytometry), conventional cytogenetics and/or fluorescence in situ hybridization (FISH), and molecular genetics including MYD88 L265P mutation.

Six B-CLPD subtypes were investigated: CLL (including lymphocytic lymphoma), follicular lymphoma (FL), HCL, lymphoplasmacytic lymphoma (LPL; including Waldenström macroglobulinemia), MCL, and MZL (including leukemic phases of nodal, extranodal, and splenic forms as well as splenic diffuse red pulp small B-cell lymphoma).

Given recent publications highlighting the diagnostic potential of CD148, CD180, and CD200 in distinguishing B lymphoproliferative syndromes, these 3 markers were incorporated into routine diagnostic procedures.

Eligibility criteria

To be eligible, patients had to exhibit blood and/or bone marrow involvement with a confirmed B-CLPD diagnosis. For patients with CLL, a Matutes score of 4 or 5 was necessary to confirm the diagnosis, whereas a Matutes score of 3 was accepted for atypical CLL. In the training cohort, 11 patients with CLL had a Matutes score of 3. Among these, trisomy 12 was identified in 5 patients. Mutations of NOTCH1 were detected in 5 cases, including 1 who also had trisomy 12. No MYD88 L265P mutation was found in any of the patients. Additionally, neither t(11;14) nor CCND1 rearrangement was observed by FISH.

In the validation cohort (cohort 2), 12 of the 223 patients with CLL had a Matutes score of 3 and were included. These patients underwent karyotyping, FISH analysis, CCND1 expression quantification, and targeted next-generation sequencing analysis for MYD88, NOTCH1, NOTCH2, KLF2, ATM, TP53, BRAF, FBXW7, SF3B1, and KRAS. Eight patients harbored trisomy 12 as the sole chromosomal abnormality. Neither t(11;14) translocations nor CCND1/immunoglobulin H rearrangements were detected by FISH, and no CCND1 overexpression was observed in any patient. Mutations in NOTCH1 were identified in 2 patients, 1 of whom also harbored a mutation in the DNA binding domain of TP53. Another patient had concurrent mutations in FBXW7 and KRAS. No MYD88 L265P mutation was detected in this cohort.

For other B-CLPD subtypes, additional clinical and pathological data were required such as histopathology, additional cytometric markers, karyotype and/or FISH, and molecular biology findings. For patients with repeated immunophenotyping, only the initial analysis performed during the study period, which included Matutes score antigens, CD10, CD20, CD148, CD180, and CD200, was selected. Participants had to be aged ≥18 years.

When both blood and marrow were analyzed, only 1 sample was included in the study.

Flow cytometry

The study analyzed 2 cohorts of patients with B-CLPD between January 2017 and December 2023. All investigations were carried out within the facilities of Strasbourg University Hospital. Cohort 1 (625 samples) was analyzed until May 2021 using a FACSCanto II flow cytometer (BD Biosciences, San Diego, CA). Diagnoses in this cohort were retrospectively reviewed by at least 2 hematologists with expertise in flow cytometry after up to 5 years of disease evolution. Cohort 2 (522 samples) was analyzed using a FACSLyric flow cytometer (BD Biosciences). Of the total samples, 145 cases (23%) in cohort 1 and 89 cases (17%) in cohort 2 were deemed ineligible for the study (Figure 1). The remaining eligible cases included 480 patients in cohort 1 and 433 in cohort 2. Antibodies and flow cytometry analysis strategies are detailed in supplemental Methods. The markers CD5, CD10, FMC7, CD23, and surface immunoglobulin light chains were considered positive if expressed by at least 30% of the cells.

Figure 1.

Flowchart for cohorts 1 and 2. Additional data include additional cytometric markers, histopathology, karyotype, FISH, or molecular biology. Double population means patients harboring 2 different malignant clones, based on flow cytometry results. FCM, flow cytometer.

Model development and validation

Cohort 1 (480 eligible patients) was processed to train and select the best algorithms. The data set consisted of 10 features including those of the Matutes score (CD5, CD23, FMC7, surface immunoglobulin light chains, and CD22/CD79b) combined with CD10 and positive discrete quantitative features, represented by percentile distributions for the variables of CD20, CD148, CD180, and CD200. Fluorescence quantiles (expressed as percentiles) were determined from the distribution of median fluorescence intensity (MFI) for the CD20, CD148, CD180, and CD200 antibodies across the entire cohort.
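The percentile representation described above can be illustrated with a minimal sketch. It assumes a simple rank-based definition (the share of reference values at or below the measured MFI, clamped to 1-100); the function name `percentile_rank` and the reference values are hypothetical, and the exact binning used in the study is not specified here.

```python
from bisect import bisect_right

def percentile_rank(value, reference):
    """Return the percentile (1-100) of `value` within `reference` MFI values."""
    ordered = sorted(reference)
    # Number of reference values less than or equal to the measured value
    count = bisect_right(ordered, value)
    # Convert to a percentile and clamp to 1..100 so every value gets a valid bin
    return max(1, min(100, round(100 * count / len(ordered))))

# Hypothetical reference MFI values for one marker (illustrative, not study data)
reference_cd200 = [120, 340, 560, 800, 1500, 2300, 4100, 5200, 7600, 9800]
rank = percentile_rank(900, reference_cd200)
print(rank)
```

Because only the rank within the reference distribution is used, the same raw MFI can map to different percentiles on different instruments, which is the property the quantile approach relies on.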

Cohort 1 was used for model selection and variable selection. We trained and evaluated K-nearest neighbors, random forest, XGBoost, and DecisionTree algorithms using simple crossvalidation with oversampling of the minority classes. XGBoost demonstrated the highest performance among the evaluated models based on the area under the precision-recall curve (AUPRC; supplemental Table 2) and was therefore selected for further analysis. The decision tree algorithm, exhibiting comparable performance to other models, was also chosen for its inherent interpretability. Hyperparameters were chosen using 10-fold crossvalidation on cohort 1, selecting those that maximized the median F1 score across the folds. To ensure the algorithm’s reproducibility, we used a percentile-based representation for the quantitative variables. To transform the quantitative variables (“CD148,” “CD180,” “CD200,” “CD20”) into percentiles, we first calculated percentiles using 1000 randomly oversampled samples from cohort 1. After rebalancing the classes through oversampling, we trained the XGBoost and DecisionTree models on this cohort, using the feature set (“CD5,” “CD10,” “CD23,” “CD148,” “CD180,” “CD200,” “CD20”). This quantile distribution was created after rebalancing the sample using the Synthetic Minority Oversampling Technique,22 ensuring consistency with the distribution found in the literature.23 To fine-tune the hyperparameters, we repeated the 10-fold crossvalidation process 100 times with different random oversampling draws. The final hyperparameters were those that produced the highest median F1 score across all iterations and folds. To validate our models on cohort 2, we retrained them using the hyperparameters selected from cohort 1.
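The selection rule in this paragraph (keep the hyperparameters with the highest median F1 score across folds and repetitions) can be sketched as follows. The candidate configurations and the F1 values are hypothetical placeholders, and `select_by_median_f1` is an illustrative helper, not code from the study.

```python
from statistics import median

def select_by_median_f1(scores_by_config):
    """Return the configuration whose F1 scores have the highest median."""
    return max(scores_by_config, key=lambda cfg: median(scores_by_config[cfg]))

# Hypothetical F1 scores collected over folds and repetitions (not study data)
scores = {
    ("max_depth", 3): [0.80, 0.82, 0.79, 0.85],
    ("max_depth", 4): [0.84, 0.86, 0.70, 0.88],
    ("max_depth", 5): [0.83, 0.81, 0.84, 0.82],
}
best = select_by_median_f1(scores)
print(best)
```

Using the median rather than the mean keeps a single poor fold (0.70 in the second configuration of this toy example) from dominating the choice.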

Model validation

Cohort 2 (433 eligible patients) was then used to validate the best algorithm on new data. We performed 200 iterations of oversampling (Synthetic Minority Oversampling Technique) on cohort 2. For each iteration, percentile distributions for “CD148,” “CD180,” “CD200,” and “CD20” were calculated and used to apply the trained models to cohort 2. The results of these 200 iterations are presented below.
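As a rough illustration of repeated class rebalancing, the sketch below performs 200 seeded draws of plain random oversampling, duplicating minority-class cases until every class matches the largest one. This is a deliberately simpler stand-in for the Synthetic Minority Oversampling Technique, which creates interpolated synthetic cases instead of duplicates; the function name and the toy data are assumptions.

```python
import random
from collections import Counter, defaultdict

def oversample(samples, labels, seed):
    """Randomly duplicate minority-class samples until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement to fill the gap up to the largest class size
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels

# Toy cohort (hypothetical): 5 CLL, 2 MCL, and 1 HCL case
samples = [{"id": i} for i in range(8)]
labels = ["CLL"] * 5 + ["MCL"] * 2 + ["HCL"]

# One rebalanced data set per iteration, each with its own seed
draws = [oversample(samples, labels, seed) for seed in range(200)]
print(Counter(draws[0][1]))
```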

The ML algorithm prediction was compared to the diagnosis retained after completion of the required explorations (see Eligibility criteria), which was set as the gold standard.

For each patient, the probability of being classified in a category was defined as the average probability of being classified in that category over all of that patient’s samples. The predicted category was the one with the highest probability. For the decision tree, the predicted category was the one predicted most often across the samples for each patient.
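The two aggregation rules described above (highest mean probability, and most frequent prediction for the decision tree) can be sketched as follows. The per-draw values are hypothetical and the helper names are illustrative.

```python
from collections import Counter

def mean_probability_class(prob_draws):
    """prob_draws: list of {subtype: probability} dicts, one per oversampling draw."""
    subtypes = prob_draws[0].keys()
    mean = {s: sum(d[s] for d in prob_draws) / len(prob_draws) for s in subtypes}
    # The predicted subtype is the one with the highest mean probability
    return max(mean, key=mean.get), mean

def majority_vote(label_draws):
    """Return the subtype predicted most often across draws."""
    return Counter(label_draws).most_common(1)[0][0]

# Hypothetical outputs for one patient over 3 draws (not study data)
prob_draws = [
    {"LPL": 0.60, "MZL": 0.40},
    {"LPL": 0.30, "MZL": 0.70},
    {"LPL": 0.55, "MZL": 0.45},
]
predicted, mean_probs = mean_probability_class(prob_draws)
voted = majority_vote(["LPL", "MZL", "LPL"])
print(predicted, voted)
```

In this toy case the two rules disagree (mean probability favors MZL, the vote favors LPL), which is why it matters that each model type uses a stated rule.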

Performance was assessed in terms of sensitivity, specificity, precision, accuracy, and F1 score for each B-CLPD subtype. The F1 score represents the harmonic mean of positive predictive value (PPV) and sensitivity: F1 = 2 × (PPV × sensitivity) / (PPV + sensitivity). Global performance was assessed with global accuracy, macro average (mean of performance for each type), and weighted average (average weighted by the number of data points for each type). All analyses were performed using Python 3.6 with the sklearn and XGBoost libraries and R software.24,25 
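For readers who want to reproduce these definitions, the following sketch derives the per-subtype metrics and the macro and weighted averages from a confusion matrix whose rows are predictions and whose columns are actual diagnoses. The 3-class matrix is a hypothetical example, not study data, and the helper names are illustrative.

```python
def per_class_metrics(matrix, classes):
    """matrix[i][j]: cases predicted as classes[i] whose actual class is classes[j]."""
    total = sum(sum(row) for row in matrix)
    results = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        fp = sum(matrix[i]) - tp                 # predicted as this class, actually another
        fn = sum(row[i] for row in matrix) - tp  # actually this class, predicted as another
        tn = total - tp - fp - fn
        ppv = tp / (tp + fp) if tp + fp else 0.0
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        f1 = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
        results[name] = {"ppv": ppv, "sensitivity": sens,
                         "specificity": spec, "f1": f1, "n": tp + fn}
    return results

def macro_and_weighted(results, key):
    """Macro average (plain mean) and average weighted by the number of actual cases."""
    macro = sum(r[key] for r in results.values()) / len(results)
    n_total = sum(r["n"] for r in results.values())
    weighted = sum(r[key] * r["n"] for r in results.values()) / n_total
    return macro, weighted

# Hypothetical confusion matrix (rows: predicted, columns: actual)
classes = ["CLL", "MCL", "MZL"]
matrix = [[8, 1, 0],
          [2, 9, 1],
          [0, 0, 4]]
metrics = per_class_metrics(matrix, classes)
macro_sens, weighted_sens = macro_and_weighted(metrics, "sensitivity")
print(metrics["CLL"], macro_sens, weighted_sens)
```

Note that the weighted average of sensitivity equals the global accuracy, because each subtype's sensitivity is weighted by its number of actual cases.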

Cohort 1 of the study included 480 cases (306 blood and 174 bone marrow samples).

The distribution of B-CLPD subtypes was as follows: 182 CLL (37.9%), 106 MZL (22.1%, which included nodal, extranodal, and splenic subtypes), 107 LPL (22.3%), 52 MCL (10.8%), 22 HCL (4.6%), and 11 FL (2.3%). Patient demographics, including categorical data for marker expression, age, and sex distribution, are summarized in Table 1. Descriptive metrics of CD20, CD148, CD180, and CD200 are available in supplemental Table 1. No significant differences were observed in marker expression between malignant lymphocytes in blood and bone marrow samples.

Table 1.

Age, sex, and categorical markers values of the patients (cohort 1)

| | CLL | FL | HCL | LPL | MCL | MZL |
| --- | --- | --- | --- | --- | --- | --- |
| Number (%) | 182 (37.9) | 11 (2.3) | 22 (4.6) | 107 (22.3) | 52 (10.8) | 106 (22.1) |
| Male-to-female ratio | 1.6 | 1.3 | 5.3 | 1.4 | 1.6 | 1.3 |
| Mean age, y | 71.9 | 64.6 | 61.4 | 73.2 | 68.9 | 75.7 |
| CD10+/CD10− | 1/181 | 10/1 | 2/20 | 3/104 | 0/52 | 2/104 |
| CD5+/CD5− | 170/12 | 1/10 | 1/21 | 19/88 | 45/7 | 12/94 |
| CD23+/CD23− | 171/11 | 7/4 | 3/19 | 29/78 | 14/38 | 20/86 |
| CD22/CD79b (low/medium to high) | 159/23 | 3/8 | 1/21 | 30/77 | 8/44 | 2/104 |
| Surface immunoglobulin light chains (low/medium to high) | 159/23 | 1/10 | 2/20 | 7/100 | 3/49 | 8/98 |
| FMC7+/FMC7− | 46/136 | 10/1 | 21/1 | 63/44 | 46/6 | 101/5 |

CLL, chronic lymphocytic leukemia.

Categorical variables (CD5, CD10, CD23, FMC7, CD22/CD79b, and surface immunoglobulin light chain status) and quantitative fluorescence intensity data for CD20, CD148, CD180, and CD200 (represented by their respective quantiles) were used as input features for training the different ML models.

The relative importance of each immunophenotypic marker in classifying B-CLPD was evaluated using the XGBoost algorithm. CD10, low expression of surface immunoglobulin light chains, CD5, CD200, CD180, and CD20 emerged as key contributors (data not shown), whereas FMC7 and CD22/CD79b had comparatively weaker contributions. High surface expression of immunoglobulin light chain was not discriminatory across the B-CLPD subtypes.

XGBoost and DecisionTree algorithms, which demonstrated the highest performances, were applied to cohort 1 with the most impactful markers previously identified: CD10, CD20, CD23, CD148, CD180, and CD200. The optimal hyperparameters for the XGBoost algorithm were as follows: learning rate, 0.2; maximum depth, 4; minimum child weight, 2; number of estimators, 50; and subsample, 0.5. The mean classification precision score across all 6 B-CLPD subtypes was 0.85 (median, 0.865; standard deviation [SD], 0.06; minimum, 0.72; maximum, 0.92). The overall specificity, sensitivity, and F1 score were 0.94, 0.83, and 0.87, respectively. The F1 scores ranged from 0.73 for LPL to 0.92 for CLL, whereas specificity ranged from 0.92 (SD, 0.04) for LPL to 0.99 (SD, 0.01) for FL, and sensitivity ranged from 0.74 (SD, 0.12) for MZL to 0.92 (SD, 0.08) for CLL.
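For reference, the reported XGBoost hyperparameters could be written as keyword arguments for the scikit-learn style wrapper of the XGBoost library. The mapping of the reported names to `XGBClassifier` keywords is an assumption on our part, and all other settings are left at library defaults.

```python
# Keyword names follow xgboost's scikit-learn wrapper (XGBClassifier);
# the values are those reported in the text.
XGB_PARAMS = {
    "learning_rate": 0.2,
    "max_depth": 4,
    "min_child_weight": 2,
    "n_estimators": 50,
    "subsample": 0.5,
}

# Possible usage, assuming xgboost is installed:
# from xgboost import XGBClassifier
# model = XGBClassifier(**XGB_PARAMS)
print(XGB_PARAMS)
```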

We extracted the probability of each B-CLPD classification prediction for each patient obtained with the XGBoost algorithms (supplemental Table 3). In 380 of 433 cases (87.8%), the XGBoost model assigned a classification probability >0.50, indicating a significant predictive capacity of the algorithm.

Both XGBoost and decision tree models developed using cohort 1 were further validated in an independent cohort (cohort 2), which was analyzed on a different flow cytometer model. Cohort 2 included 433 cases (284 blood and 149 bone marrow samples) distributed as follows: 223 CLL, 11 FL, 5 HCL, 79 LPL, 44 MCL, and 71 MZL (nodal, extranodal, and splenic MZL subtypes). The performance scores of both algorithms across the different B-CLPDs studied are shown in Table 2.

Table 2.

Performance score on the 200 draws for cohort 2 with XGBoost and DecisionTree algorithms

XGBoost

| Subtype | Precision | F1 score | Specificity | Sensitivity | No. of cases |
| --- | --- | --- | --- | --- | --- |
| CLL | 0.80 | 0.88 | 0.74 | 0.98 | 223 |
| FL | 0.86 | 0.84 | 1.00 | 0.82 | 11 |
| HCL | 0.71 | 0.72 | 1.00 | 0.74 | 5 |
| LPL | 0.87 | 0.66 | 0.98 | 0.53 | 79 |
| MCL | 0.92 | 0.93 | 0.99 | 0.95 | 44 |
| MZL | 0.90 | 0.75 | 0.99 | 0.64 | 71 |

Mean global accuracy, 0.83 (SD, 0.004); mean weighted accuracy, 0.78 (SD, 0.017)

DecisionTree

| Subtype | Precision | F1 score | Specificity | Sensitivity | No. of cases |
| --- | --- | --- | --- | --- | --- |
| CLL | 0.95 | 0.95 | 0.95 | 0.94 | 223 |
| FL | 0.92 | 0.96 | 1.00 | 1.00 | 11 |
| HCL | 1.00 | 0.89 | 1.00 | 0.80 | 5 |
| LPL | 0.75 | 0.73 | 0.95 | 0.71 | 79 |
| MCL | 0.82 | 0.86 | 0.98 | 0.91 | 44 |
| MZL | 0.80 | 0.82 | 0.96 | 0.83 | 71 |

Mean global accuracy, 0.88 (SD, 0.002); mean weighted accuracy, 0.78 (SD, 0.017)

Results are expressed as mean. F1 score represents the harmonic mean of PPV and sensitivity: 2 × (PPV × sensitivity) / (PPV + sensitivity). Mean: macro average (mean of performance for each type). Weighted: weighted average (average weighted by the number of data points for each type).

DecisionTree algorithms achieved the highest overall accuracy (0.88; SD, 0.002), outperforming XGBoost (0.83; SD, 0.005). Specificity obtained with DecisionTree ranged from 0.95 (LPL) to 1 (HCL), whereas sensitivity ranged from 0.71 (LPL) to 1 (HCL). The predictive performance of the classification models is detailed in Table 3, which presents the actual vs predicted values (confusion matrix) of the ML approach. These results were comparable to those obtained in cohort 1, further demonstrating that the ML approach, especially the DecisionTree algorithm, is robust and provides accurate classification across different patient cohorts and flow cytometry platforms. A DecisionTree representation for cohort 2 is provided in supplemental Figure 1.

Table 3.

Confusion matrix of classifications of B-CLPD subtypes obtained using DecisionTree algorithm

Subtype CLL FL HCL LPL MCL MZL
p-CLL 208 
p-FL 11 
p-HCL 
p-LPL 56 
p-MCL 40 
p-MZL 14 59 

Columns represent actual diagnosis and rows the predicted (p) diagnosis.

The DecisionTree algorithm identified 2 key categorical markers (CD5, CD10) and 3 quantitative markers (CD20, CD180, and CD200) for which quantiles of median fluorescence thresholds (1-100) enabled the discrimination of B-CLPD subtypes. A simplified and more practical representation of the DecisionTree algorithm is shown in Figure 2, enhanced with additional diagnostic tests that would be required for confirmation. In this scheme, the confidence of a prediction in each terminal leaf of the tree is indicated by the Gini impurity index26 and PPVs. Overall, 54.3% of cases in cohort 2 were classified with near perfect accuracy (Gini index ≤0.04; PPV >.97; leaves 3, 5, 6, and 9).
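The two leaf-level confidence measures can be computed from the subtype counts in a terminal leaf, as in the sketch below. It uses the standard definition of Gini impurity (1 minus the sum of squared class proportions); the counts are hypothetical and the helper names are illustrative.

```python
def gini_impurity(counts):
    """Gini impurity of a leaf: 1 - sum of squared class proportions."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def leaf_ppv(counts):
    """Dominant subtype of a leaf and the share of cases that belong to it."""
    label = max(counts, key=counts.get)
    return label, counts[label] / sum(counts.values())

# Hypothetical leaves (not study data)
pure_leaf = {"CLL": 50}
mixed_leaf = {"MZL": 6, "LPL": 4}

print(gini_impurity(pure_leaf), leaf_ppv(pure_leaf))
print(gini_impurity(mixed_leaf), leaf_ppv(mixed_leaf))
```

A pure leaf has an impurity of 0 and a PPV of 1, whereas a leaf that mixes subtypes has a higher impurity and a lower PPV for its dominant subtype.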

Figure 2.

Biological diagnosis strategy for B-CLPDs. Dominant diagnoses appear in the colored boxes in the upper part of the diagram. These 10 colored boxes have been numbered to facilitate their identification in “Results.” Nearly half of the cases are identified with certainty by flow cytometry alone. Only 3.9% of the cases are directed to a group with no dominant subtype. The lower part of the diagram indicates additional tests for either confirmation of the diagnosis proposed by cytometry or discrimination of the other subtypes. CD, cluster of differentiation; CLL, chronic lymphocytic leukemia/lymphocytic lymphoma; FL, follicular lymphoma; HCL, hairy cell leukemia; LPL, lymphoplasmacytic lymphoma; MCL, mantle cell lymphoma; MZL, marginal zone lymphoma; NA, not applicable; NDS, no dominant subtype; PPV, positive predictive value; Q, quantile (in this instance, percentile). ∗The MYD88 L265P mutation can also be detected in rare cases of MZL and CLL.

Notably, cases exhibiting a CD5+ proliferation with low CD20 expression (<51st quantile; leaf 9) were highly likely CLL (PPV = 1.000; Gini index, 0.0; 43.9% of the cases of the cohort). Interestingly, a comparison between the Matutes score (4 or 5) and the DecisionTree algorithm for CLL diagnosis demonstrated the superiority of the DecisionTree algorithm (P = .029, χ2 test).

Additionally, a CD5+ proliferation with a very low CD200 expression (below the seventh percentile; leaf 6) was highly indicative of MCL (7.6% of the whole cohort; PPV = 0.97; Gini index = 0.02).

Although 41.8% of the cases were correctly classified, they exhibited a lower confidence (PPV, 0.16-0.62), especially in cases of LPL and MZL, which required additional testing. In CD10-negative and CD5-negative proliferations with low to moderate CD200 expression, a CD180 expression above the 71st percentile supported a diagnosis of MZL (leaf 2). However, given the PPV of 0.797 and Gini index of 0.62, MYD88 L265P mutation testing is necessary to rule out LPL. If MYD88 L265P testing was negative, CD20 expression may help to refine the diagnosis. Indeed, as suggested by the results in leaf 9, low CD20 expression (below the 51st quantile) favored a diagnosis of CLL. Similarly, proliferations that fulfilled the criteria of leaf 1 (CD5−, CD10−, CD200 <87th quantile, and CD180 <72nd quantile) were predominantly classified as LPL. However, MYD88 testing was required for confirmation given the PPV of 0.825 and a Gini index of 0.53.

In 3.9% of cases (leaf 8), the immunophenotypic profile (CD5+, CD200 between the 15th and 33rd quantile) was not indicative of any specific B-CLPD subtype. In these cases, additional diagnostic tests are mandatory.

To facilitate practical use, we developed an interface that generates diagnostic classifications based on the decision tree algorithm proposed in this study (see supplemental Methods). This algorithm was trained using cohort 1 as the training set and validated with cohort 2 as an independent validation set. We propose that each center input data from a substantial cohort (>200 cases) to establish the initial distribution of fluorescence quantiles, entering the categorical status (positive or negative) for CD5 and CD10 and the MFI values for CD20, CD180, and CD200. Once a new entry is manually added for all 5 parameters, the fluorescence quantiles are automatically updated. The user is then prompted to apply our decision tree. A prediction is then generated, along with the Gini index and PPV for the assigned diagnosis. If necessary, the system also suggests a hierarchy of additional tests to support further diagnostic classification. This approach is entirely independent of the antibody or fluorochrome used, provided they remain unchanged.

Flow cytometry remains the cornerstone for diagnosing and monitoring B-CLPDs. However, accurate subtyping often requires a comprehensive panel of antibodies and additional investigations such as karyotyping, cyclin D1 expression, and MYD88 mutation analysis.

This study demonstrates the feasibility of developing a robust classification algorithm for B-CLPD subtypes using a limited panel of CD markers (CD5, CD10, CD20, CD180, and CD200) and fluorescence intensity measurements, provided that clonality is assessed using anti-kappa and anti-lambda antibodies. By leveraging the quantiles of CD20, CD180, and CD200, we achieved a high degree of accuracy, which was validated in an independent cohort analyzed on a different flow cytometer platform. This quantile-based approach offers a valuable strategy for standardizing flow cytometry data across various laboratories and clinical research settings.

ML not only facilitated accurate classification but also provided valuable insights into the relative importance of different markers. CD148, despite its reported significance in MCL,10,14 was not included in the final model by the DecisionTree algorithm, suggesting its limited discriminatory power. Additionally, some classical markers such as CD23, FMC7, and CD22/CD79b as well as surface immunoglobulins appeared to be dispensable or redundant for immunophenotypic classification of B-cell proliferation, resulting in a cost reduction. Nevertheless, surface immunoglobulins remain essential for demonstrating clonality. Therefore, it would be beneficial to extend the study to additional cohorts.

This study focused on B-CLPDs, excluding diffuse large B-cell lymphoma and Burkitt lymphoma. The MZL group included nodal, extranodal, and splenic MZL, introducing some heterogeneity. Despite this, leukemic cells across these MZL subgroups displayed a similar phenotype using the proposed panel.

The MCL cases analyzed in this study were limited to leukemic cells and/or bone marrow localization, a cohort with a significant proportion of CD5-negative cases (15%), potentially indicative of more indolent forms of MCL.27 Despite this relatively frequent lack of CD5 expression, our approach accurately classified the vast majority of MCL cases, highlighting the utility of CD200 in identifying even these indolent MCL forms, making a single karyotype, possibly supplemented by FISH, sufficient in many cases.

Moreover, given the high PPV of our model for CD5+ patients with low CD20 expression, further investigation may be unnecessary when cases fall into CLL groups with consistently very high predictive performance, even in immunophenotypically atypical CLL cases with a Matutes score of 3. Notably, in the validation cohort, 11 of 12 CLL cases with a Matutes score of 3 (among 223 CLL cases) were correctly classified, with only 1 case being misclassified as an MCL. This highlights the strength of our approach in addressing diagnostically challenging cases. However, CD5− CLL cases are not accurately classified by our model. In addition, lymph node or spleen samples were not analyzed in this study. CD marker expression, especially for markers such as CD180, can vary between leukemic and secondary lymphoid organ cells.28,29 

Although our model effectively classified most B-CLPD subtypes, 3.9% of cases exhibited overlapping phenotypic features, underscoring the inherent biological heterogeneity within the different subtypes and the need for new discriminating markers, especially for LPL and MZL.

This study highlighted a novel approach to the use of artificial intelligence (AI) tools in medicine. These AI tools have mainly been implemented for image analysis in clinical pathology and radiology, as well as for the diagnosis and epidemiology of COVID-19.30-32 Recently, ML has been used for predicting primary cancer types and assessing the risk of myeloid neoplasia.33 The lack of well-annotated data often limits the implementation of AI in medicine, and the lack of transparency in prediction processes can further hinder trust in an algorithm’s predictions. The DecisionTree classifier offers a significant advantage by providing a measure of prediction confidence through the PPV and Gini impurity index.

The development of a user-friendly interface for our decision tree algorithm significantly advances its practical application in routine diagnostics. This interface simplifies data entry (CD5/CD10 status and MFI values for CD20, CD180, and CD200) and automates the analysis process, reducing the time and expertise required for result generation. By providing the Gini index and PPV alongside diagnostic predictions, the interface empowers clinicians to make informed decisions based on clear and concise metrics. A low Gini index may reflect overlapping features between different subtypes (such as LPL and MZL), as well as the inherent heterogeneity within these subtypes. However, a low Gini index does not necessarily indicate poor classification performance. Although the classification boundaries may be less distinct, the DecisionTree classifier is still able to identify strongly discriminative features leading to accurate positive classifications and high PPVs. Furthermore, despite class imbalance due to varying subtype size, the model continues to make correct classifications even though the overall impurity remains relatively high. This streamlined workflow has the potential to facilitate wider adoption of the DecisionTree algorithm and improve diagnostic efficiency.

In conclusion, this study demonstrates that ML can provide a robust strategy for classifying B-CLPD subtypes. The use of fluorescence quantiles offers a novel method for harmonizing cytometry data, independent of both analyzers and antibodies, thereby facilitating multicenter studies and aiding the classification of hematologic malignancies.
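The intuition behind quantile-based harmonization can be illustrated with a minimal sketch. It assumes that each fluorescence value is replaced by its empirical quantile within a reference distribution acquired on the same platform. The exact quantile definition used in the study is not restated in this section, so the function `empirical_quantile`, the reference sets, and the simulated rescaling between platforms are all assumptions made for illustration.

```python
from bisect import bisect_right


def empirical_quantile(value, reference_values):
    """Fraction of reference values that are less than or equal to `value`."""
    ref = sorted(reference_values)
    if not ref:
        raise ValueError("reference distribution is empty")
    return bisect_right(ref, value) / len(ref)


# Hypothetical MFI values for one marker on platform A (not study data).
platform_a = [120.0, 340.0, 560.0, 900.0, 1500.0]
# The same samples as they might read on platform B, simulated here by a
# monotone rescaling (different gain and offset).
platform_b = [v * 8.0 + 50.0 for v in platform_a]

quantiles_a = [empirical_quantile(v, platform_a) for v in platform_a]
quantiles_b = [empirical_quantile(v, platform_b) for v in platform_b]

print(quantiles_a)
print(quantiles_b)
```

Because a quantile depends only on the rank of a value within its own reference distribution, any monotone change of scale between instruments or antibody conjugates leaves it unchanged, so both platforms yield the same quantiles in this toy example. Real data would also differ in cohort composition, which this sketch does not address.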

Contribution: L. Mauvieux, R.B., L. Miguet, R.H., and T.F. designed and coordinated the study; C.M.-R., A.E., A.-C.G., D.R., and L. Mauvieux performed cytological and immunological experiments; A.N. performed a proportion of the pathological examinations; L. Mauvieux and L. Miguet performed molecular experiments; S.H.-B. and C.G. performed karyotype and fluorescence in situ hybridization experiments; R.B., T.F., T.G., and F.S. developed the methods to process the data; D.R., C.M.-R., L. Miguet, and L. Mauvieux were actively involved in both data collection and analysis; L.-M.F. and R.H. provided clinical information; and L. Mauvieux, R.H., T.F., and L. Miguet verified the data and wrote the manuscript, with support from all authors.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Laurent Mauvieux, Laboratoire d’hématologie, Hôpitaux Universitaires de Strasbourg, Ave Molière, 67100 Strasbourg, France; email: laurent.mauvieux@chru-strasbourg.fr.

1. Kalina T, Flores-Montero J, van der Velden VHJ, et al. EuroFlow standardization of flow cytometer instrument settings and immunophenotyping protocols. Leukemia. 2012;26(9):1986-2010.
2. Zalcberg I, D’Andrea MG, Monteiro L, Pimenta G, Xisto B. Multidisciplinary diagnostics of chronic lymphocytic leukemia: European research initiative on CLL - ERIC recommendations. Hematol Transfus Cell Ther. 2020;42(3):269-274.
3. Campo E, Jaffe ES, Cook JR, et al. The International Consensus Classification of mature lymphoid neoplasms: a report from the Clinical Advisory Committee. Blood. 2022;140(11):1229-1253.
4. Qiu L, Xu J, Tang G, et al. Mantle cell lymphoma with chronic lymphocytic leukemia-like features: a diagnostic mimic and pitfall. Hum Pathol. 2022;119:59-68.
5. Ho AK, Hill S, Preobrazhensky SN, Miller ME, Chen Z, Bahler DW. Small B-cell neoplasms with typical mantle cell lymphoma immunophenotypes often include chronic lymphocytic leukemias. Am J Clin Pathol. 2009;131(1):27-32.
6. Nelson BP, Variakojis D, Peterson LC. Leukemic phase of B-cell lymphomas mimicking chronic lymphocytic leukemia and variants at presentation. Mod Pathol. 2002;15(11):1111-1120.
7. Matutes E, Owusu-Ankomah K, Morilla R, et al. The immunological profile of B-cell disorders and proposal of a scoring system for the diagnosis of CLL. Leukemia. 1994;8(10):1640-1645.
8. Matutes E, Morilla R, Owusu-Ankomah K, Houliham A, Meeus P, Catovsky D. The immunophenotype of hairy cell leukemia (HCL). Proposal for a scoring system to distinguish HCL from B-cell disorders with hairy or villous lymphocytes. Leuk Lymphoma. 1994;14(suppl 1):57-61.
9. Edwards K, Lydyard PM, Kulikova N, et al. The role of CD180 in hematological malignancies and inflammatory disorders. Mol Med. 2023;29(1):97.
10. Gautam A, Sreedharanunni S, Sachdeva MUS, et al. The relative expression levels of CD148 and CD180 on clonal B cells and CD148/CD180 median fluorescence intensity ratios are useful in the characterization of mature B cell lymphoid neoplasms infiltrating blood and bone marrow - results from a single centre pilot study. Int J Lab Hematol. 2021;43(5):1123-1131.
11. Li Y, Tong X, Huang L, et al. A new score including CD43 and CD180: Increased diagnostic value for atypical chronic lymphocytic leukemia. Cancer Med. 2021;10(13):4387-4396.
12. Gross Z, Veyrat-Masson R, Grange B, et al. Diagnosis of chronic B-cell lymphoproliferative disease in peripheral blood = how machine learning may help to the interpretation of flow cytometry data. Hematol Oncol. 2024;42(1):e3245.
13. Miguet L, Béchade G, Fornecker L, et al. Proteomic analysis of malignant B-cell derived microparticles reveals CD148 as a potentially useful antigenic biomarker for mantle cell lymphoma diagnosis. J Proteome Res. 2009;8(7):3346-3354.
14. Fan L, Miao Y, Wu Y-J, et al. Expression patterns of CD200 and CD148 in leukemic B-cell chronic lymphoproliferative disorders and their potential value in differential diagnosis. Leuk Lymphoma. 2015;56(12):3329-3335.
15. Miguet L, Lennon S, Baseggio L, et al. Cell-surface expression of the TLR homolog CD180 in circulating cells from splenic and nodal marginal zone lymphomas. Leukemia. 2013;27(8):1748-1750.
16. Mayeur-Rousse C, Guy J, Miguet L, et al. CD180 expression in B-cell lymphomas: a multicenter GEIL study. Cytometry B Clin Cytom. 2016;90(5):462-466.
17. Favre R, Manzoni D, Traverse-Glehen A, et al. Usefulness of CD200 in the differential diagnosis of SDRPL, SMZL, and HCL. Int J Lab Hematol. 2018;40(4):e59-e62.
18. Dorfman DM, Shahsafaei A. CD200 (OX-2 membrane glycoprotein) expression in B cell-derived neoplasms. Am J Clin Pathol. 2010;134(5):726-733.
19. Palumbo GA, Parrinello N, Fargione G, et al. CD200 expression may help in differential diagnosis between mantle cell lymphoma and B-cell chronic lymphocytic leukemia. Leuk Res. 2009;33(9):1212-1216.
20. Ng DP, Zuromski LM. Augmented human intelligence and automated diagnosis in flow cytometry for hematologic malignancies. Am J Clin Pathol. 2021;155(4):597-605.
21. Shopsowitz K, Lofroth J, Chan G, et al. MAGIC-DR: an interpretable machine-learning guided approach for acute myeloid leukemia measurable residual disease analysis. Cytometry B Clin Cytom. 2024;106(4):239-251.
22. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321-357.
23. Le Guyader-Peyrou S, Defossez G, Dantony E, et al. Estimations nationales de l’incidence et de la mortalité par cancer en France métropolitaine entre 1990 et 2018 - Hémopathies malignes: Étude à partir des registres des cancers du réseau Francim. INSERM. 2019. Accessed 5 July 2019. https://www.santepubliquefrance.fr/docs/estimations-nationales-de-l-incidence-et-de-la-mortalite-par-cancer-en-france-metropolitaine-entre-1990-et-2018-volume-2-hemopathies-malignes
24. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Machine Learn Res. 2011;12:2825-2830.
25. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Krishnapuram B, Shah M, Smola AJ, Aggarwal C, Shen D, Rastogi R, eds. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery; 2016:785-794.
26. Yuan Y, Wu L, Zhang X. Gini-impurity index analysis. IEEE Trans Inf Forensics Secur. 2021;16:3154-3169.
27. Soleimani A, Navarro A, Liu D, et al. CD5-negative mantle cell lymphoma: clinicopathologic features of an indolent variant that confers a survival advantage. Leuk Lymphoma. 2022;63(4):911-917.
28. Miljkovic D, Ou J, Kirana C, et al. Discordant frequencies of tissue-resident and circulating CD180-negative B cells in chronic rhinosinusitis. Int Forum Allergy Rhinol. 2017;7(6):609-614.
29. Mestrallet F, Sujobert P, Sarkozy C, et al. CD180 overexpression in follicular lymphoma is restricted to the lymph node compartment. Cytometry B Clin Cytom. 2016;90(5):433-439.
30. Veziroglu EM, Farhadi F, Hasani N, et al. Role of artificial intelligence in PET/CT imaging for management of lymphoma. Semin Nucl Med. 2023;53(3):426-448.
31. Yao K, Singh A, Sridhar K, Blau JL, Ohgami RS. Artificial intelligence in pathology: a simple and practical guide. Adv Anat Pathol. 2020;27(6):385-393.
32. Wang L, Wang W, Xu R, Berger NA. SARS-CoV-2 primary and breakthrough infections in patients with cancer: implications for patient care. Best Pract Res Clin Haematol. 2022;35(3):101384.
33. Moon I, LoPiccolo J, Baca SC, et al. Machine learning for genetics-based classification and treatment response prediction in cancer of unknown primary. Nat Med. 2023;29(8):2057-2067.

Author notes

The data set analyzed during this study will be made available upon reasonable request from the corresponding author, Laurent Mauvieux (laurent.mauvieux@chru-strasbourg.fr), and the algorithm will be made available upon reasonable request from the author Thibaut Fabacher (thibaut.fabacher@chru-strasbourg.fr).

The full-text version of this article contains a data supplement.
