• Exact quantitation of RBC dysmorphologies in peripheral blood smears can be accurately performed using a computer vision system.

  • This quantitation allowed for improved diagnostic and prognostic evaluations of multiple hematologic disease states.

Examination of red blood cell (RBC) morphology in peripheral blood smears can help diagnose hematologic diseases, even in resource-limited settings, but this analysis remains subjective and semiquantitative with low throughput. Prior attempts to develop automated tools have been hampered by their poor reproducibility and limited clinical validation. Here, we present a novel, open-source machine-learning approach (denoted as RBC-diff) to quantify abnormal RBCs in peripheral smear images and generate an RBC morphology differential. RBC-diff cell counts showed high accuracy for single-cell classification (mean AUC, 0.93) and quantitation across smears (mean R2, 0.76 compared with experts; interexpert R2, 0.75). RBC-diff counts were concordant with the clinical morphology grading for 300 000+ images and recovered the expected pathophysiologic signals in diverse clinical cohorts. Criteria using RBC-diff counts distinguished thrombotic thrombocytopenic purpura and hemolytic uremic syndrome from other thrombotic microangiopathies, providing greater specificity than clinical morphology grading (72% vs 41%; P < .001) while maintaining high sensitivity (94% to 100%). Elevated RBC-diff schistocyte counts were associated with increased 6-month all-cause mortality in a cohort of 58 950 inpatients (9.5% mortality for schistocytes >1% vs 4.7% for schistocytes <0.5%; P < .001) after controlling for comorbidities, demographics, clinical morphology grading, and blood count indices. RBC-diff also enabled the estimation of single-cell volume-morphology distributions, providing insight into the influence of morphology on routine blood count measures. Our codebase and expert-annotated images are included here to spur further advancement. These results illustrate that computer vision can enable rapid and accurate quantitation of RBC morphology, which may provide value in both clinical and research contexts.

Quantitation and differential profiling of blood cells are the cornerstone of modern clinical diagnosis.1,2 For example, the white blood cell (WBC) differential, a quantitative profile of WBC subtypes, can flag infections or malignancies.3,4 Unlike WBCs, there are no functionally distinct normal RBC subtypes. However, morphologic RBC subtypes are associated with pathology and can be pathognomonic5 – eg, sickle cells in sickle cell disease, spiculated cells in liver disease, or teardrop cells in bone marrow disorders. Although clinical laboratory technologies to analyze WBCs have advanced,6-8 RBC profiling technologies have not and are still primarily limited to evaluating changes in RBC size and hemoglobin content, with only a limited analysis of morphology.7,9 An objective and quantitative differential of RBC subtypes could provide valuable clinical insights as a scalable, standardized, and automated summary of morphology. However, RBC shape cannot be accurately detected by standard automated hematology analyzers that rely on optical scatter or electrical impedance,10 making alternative approaches necessary.

To be the most clinically useful, RBC morphology classification must be fast and accurate. For example, identification of schistocytes is a linchpin in the diagnosis of immune thrombotic thrombocytopenic purpura (iTTP), a life-threatening medical emergency that can be treated with immediate therapeutic plasma exchange.11 Yet, assessment of schistocytes is primarily performed through manual examination of a peripheral smear, a slow and subjective process often involving initial evaluation by a laboratory technologist and subsequent review by a hematologist or hematopathologist.7 These review processes typically generate semiquantitative flags (No flag, 1+, 2+, and 3+) that categorize smears in terms of frequency but are based on criteria that can vary substantially across hospitals.9 The result may be delayed or inaccurate diagnoses12 that do not fully use the information present in the smear. The lack of methods for rapid and objective RBC evaluation is also a key obstacle in clinical research to investigate the novel diagnostic information contained in peripheral blood smears. In particular, the lack of automated tools means that RBC morphology quantitation is not regularly recorded in electronic medical records and is subsequently unavailable for large-scale retrospective studies of hematologic diseases. More broadly, smears contain rich information on the RBC shape, size, and hemoglobin content at the single-cell level, which are rarely captured or used. This is in stark contrast to the increasing role of single-cell data in understanding human physiology in other settings.

The automated capture of peripheral blood smear images and artificial intelligence have the potential to address many of these limitations. Digital peripheral smear images are already automatically captured in many hospitals and are regularly used to conduct remote manual reviews. Current state-of-the-art for automated RBC morphology analysis includes CellaVision analyzers, which provide some preclassification of RBCs to assist with manual grading of smears.7,13,14 These systems assist in creating semiquantitative grading but are not sufficiently calibrated for important RBC subtypes, such as schistocytes,7,14 and the corresponding hardware may not be available in resource-limited settings. Research and development of additional tools have been hampered by poor reproducibility,15 inadequately small image data sets,16 limited clinical testing,17,18 or narrow focus on a few morphologies.19 Although some recent approaches have shown promise,17,20,21 they have not been validated at the cell population level, at which clinical assessments are made, nor have they been shown to add value in clinical diagnosis.

Here, we address these limitations by presenting a novel, open-source machine-learning pipeline (denoted as RBC-diff) for the calculation of an RBC morphology differential from peripheral blood smear images. We validated the RBC-diff performance at single-cell and cell population levels using a clinical grading from a multicenter database of 338 577 smears. We then retrospectively applied the RBC-diff in multiple clinical contexts, demonstrating its value in differential diagnosis and prognosis. Finally, we illustrated the utility of RBC-diff in a research setting by showing how this tool can derive novel single-cell data that can help improve the understanding of how morphology contributes to routine blood count indices.

Peripheral smear collection

Images were collected for all peripheral blood smears at Massachusetts General Hospital (MGH) between 4 November 2015 and 15 November 2021 (n = 281 745 images and 49 056 patients), and at Brigham and Women’s Hospital (BWH) between 1 January 2021 and 20 December 2021 (n = 56 832 images and 9894 patients). Smear slides were created as part of standard clinical care (further details are given in supplemental Methods) and imaged using CellaVision (DM96 or DI60) with an image resolution of ∼0.2 μm per pixel. The CellaVision system automatically identifies and captures an appropriately dispersed area of the smear adequate for clinical evaluation,14 typically between 500 and 600 μm in width and height, containing ∼1000 to 3000 RBCs. Morphology grading flags (generated by the clinical hematology laboratory) were recorded as either present, 1+, 2+, or 3+ per the local clinical laboratory guidelines (supplemental Methods). The characteristics of the MGH and BWH cohorts are given in supplemental Table 1.

RBC-diff algorithm

The RBC-diff was designed to calculate the relative abundance of 9 types of RBC morphology (normal RBCs, elliptocytes, microcytes, macrocytes, schistocytes, sickle cells, spiculated cells, teardrop cells, and other abnormal RBCs). The algorithm takes a smear image, binarizes it, and uses black-white boundary detection to identify all potential cells. Ten geometric features were used to classify each potential RBC using a support vector machine classifier. See supplemental Methods for details on (1) feature calculation, (2) algorithm training, (3) effects of sample preparation delay, (4) intrasample variability, (5) robustness against data set shift, (6) performance with manually collected images, and (7) approximate normal reference ranges.
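The segmentation and feature steps described above can be illustrated with a small sketch. It is not the RBC-diff implementation: it assumes a grayscale image stored as a list of rows, a fixed hypothetical threshold, and a 4-connected flood fill in place of the boundary detection used by RBC-diff. The function names and the 3 features shown are illustrative only; the 10 features and the trained support vector machine are described in the supplemental Methods and are not reproduced here.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def binarize(image, threshold):
    """Return a 0/1 mask; pixels darker than the threshold are treated as cell (assumption)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

def label_components(mask):
    """Group foreground pixels into 4-connected components (candidate cells)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in NEIGHBORS:
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components

def geometric_features(pixels):
    """Illustrative shape features for one candidate cell (not the paper's feature set)."""
    pixel_set = set(pixels)
    # Perimeter: number of pixel edges that face background or the image border.
    perimeter = sum(
        1
        for y, x in pixels
        for dy, dx in NEIGHBORS
        if (y + dy, x + dx) not in pixel_set
    )
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "area": len(pixels),
        "perimeter": perimeter,
        "aspect_ratio": max(height, width) / min(height, width),
    }

# Toy 5 x 6 image with two dark objects on a bright background.
image = [
    [255, 255, 255, 255, 255, 255],
    [255,  50,  50, 255, 255, 255],
    [255,  50,  50, 255,  40, 255],
    [255, 255, 255, 255,  40, 255],
    [255, 255, 255, 255,  40, 255],
]
cells = label_components(binarize(image, threshold=128))
features = [geometric_features(p) for p in cells]
```

In a full pipeline, each feature vector would then be passed to a trained classifier; that step is omitted here because the training data and model parameters are not given in this section.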

Expert estimates

To provide a reference for RBC-diff performance, 5 experts (board-certified hematopathologists or hematologists) were asked to estimate the prevalence (%) of specified cell types in 5 sets of 10 smears (10 smears for elliptocytes, schistocytes, sickle cells, spiculated cells, and teardrop cells), with each set containing 5 smears with a 1+ flag for the given cell type, and 5 with no flag. The experts were blinded to the clinical details and morphology grading flags. To simulate standard high-power microscopic fields, each smear was presented as a series of 16 smaller images, each containing ∼100 to 200 RBCs. To best reflect clinical practice, the experts were not given specific instructions on how to perform the task.

Clinical cohort studies

For further validation, we tested the discriminatory capacity of RBC-diff counts across 5 clinical cohorts with clear pathophysiologic signals: elliptocytosis vs spherocytosis, before and after liver transplantation, before and after RBC exchange in patients with sickle cell disease, before and after iron supplementation in patients with iron deficiency, and before and after splenectomy. The cohort inclusion criteria are given in the supplemental Methods.

Thrombotic microangiopathy (TMA) cohort

Patients with TMA were drawn from the Harvard TMA Research Collaborative data set.22 Two TMA cohorts were collated: a derivation and a validation cohort. The derivation cohort consisted of patients presenting at the MGH between 31 March 2017 and 30 November 2020, and the validation cohort consisted of patients presenting at the MGH and BWH between 1 January 2021 and 19 December 2021. Patient details were gathered through a detailed chart review by members of the study team. Immune thrombotic thrombocytopenic purpura (iTTP) was defined as an ADAMTS13 enzyme activity level ≤10% (normal reference range, activity >66% for assay at Blood Center of Wisconsin) or ADAMTS13 enzyme activity ≤ 25% with an inhibitor of >1.0 inhibitor units (normal reference range, <0.5 inhibitor units). Outpatient cases of Upshaw-Schulman syndrome were not defined as iTTP cases. See the supplemental Methods for additional details.
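The laboratory definition of iTTP given above can be written as a short predicate. This is a sketch under stated assumptions: the function and parameter names are hypothetical, a missing inhibitor result is treated as not meeting the second criterion, and the exclusion of outpatient Upshaw-Schulman syndrome cases (which requires chart review) is not modeled.

```python
def meets_ittp_definition(adamts13_activity_pct, inhibitor_units=None):
    """Apply the ADAMTS13-based iTTP definition stated in the text.

    adamts13_activity_pct: ADAMTS13 enzyme activity, in percent.
    inhibitor_units: inhibitor titer in inhibitor units, or None if not measured
                     (treated here as not satisfying the inhibitor criterion; assumption).
    """
    if adamts13_activity_pct <= 10:
        return True
    return (
        adamts13_activity_pct <= 25
        and inhibitor_units is not None
        and inhibitor_units > 1.0
    )
```

For example, an activity of 20% with an inhibitor of 1.5 units satisfies the definition, whereas the same activity with an inhibitor of 0.4 units does not.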

Matched cohort mortality analysis

Relationships between RBC-diff counts and all-cause mortality were estimated using a matched cohort analysis. Using each patient’s first available smear, for a given abnormal RBC type, each patient with a corresponding count < 0.5% was matched to a patient of the same sex, race, comorbidity profile, age (<5-year gap), hematocrit (<10% absolute gap), and morphology grades, with the given cell type count between 0.5% and 1% or >1%. Mortality differences were analyzed using Kaplan-Meier curves and log-rank test. See the supplemental Methods for additional details.
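The matching rule above can be sketched as an eligibility check plus a pairing step. The dictionary keys and function names are hypothetical, and the greedy 1:1 pairing is an assumption made for illustration; the matching procedure actually used is described in the supplemental Methods.

```python
def is_eligible_match(ref, cand, max_age_gap=5.0, max_hct_gap=10.0):
    """True if two patients agree on the matching variables listed in the text."""
    return (
        ref["sex"] == cand["sex"]
        and ref["race"] == cand["race"]
        and ref["comorbidities"] == cand["comorbidities"]
        and ref["grades"] == cand["grades"]
        and abs(ref["age"] - cand["age"]) < max_age_gap            # <5-year gap
        and abs(ref["hematocrit"] - cand["hematocrit"]) < max_hct_gap  # <10% absolute gap
    )

def greedy_match(low_group, elevated_group):
    """Pair each low-count patient with the first unused eligible elevated-count patient.

    Returns a list of (low_index, elevated_index) pairs. Greedy pairing is an
    illustrative simplification, not necessarily the published procedure.
    """
    used = set()
    pairs = []
    for i, ref in enumerate(low_group):
        for j, cand in enumerate(elevated_group):
            if j not in used and is_eligible_match(ref, cand):
                used.add(j)
                pairs.append((i, j))
                break
    return pairs
```

Survival in the matched groups would then be compared with Kaplan-Meier curves and a log-rank test, as stated above.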

Generation of single-cell volume-morphology distributions

To estimate individual cell volumes, the mean pixel area of each detected RBC was converted to μm² (based on image resolution) and multiplied by 2.5 μm (approximate average vertical height of an RBC23). Smear-derived estimated volumes were then compared with blood count-derived mean corpuscular volume (MCV) and RBC distribution width (RDW), measured as part of standard clinical care on the Sysmex and Advia instruments (supplemental Figure 1). These estimated volumes are based on 2D information and are therefore expected to be less accurate than approaches that include 3D information.24
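The conversion above amounts to simple arithmetic: area in pixels times the squared pixel size gives μm², and multiplying by the assumed cell height gives μm³, which equals femtoliters. The sketch below uses the approximate values quoted in the text (0.2 μm per pixel, 2.5 μm height); the function names are hypothetical.

```python
import statistics

PIXEL_SIZE_UM = 0.2    # approximate image resolution quoted in the text (um per pixel)
CELL_HEIGHT_UM = 2.5   # approximate average vertical height of an RBC quoted in the text

def estimate_volume_fl(area_pixels, pixel_size_um=PIXEL_SIZE_UM, height_um=CELL_HEIGHT_UM):
    """Estimate a single-cell volume in fL from its projected area in pixels."""
    area_um2 = area_pixels * pixel_size_um ** 2
    return area_um2 * height_um  # 1 um^3 equals 1 fL

def estimated_mcv_fl(areas_pixels):
    """Smear-derived mean cell volume: the mean of the per-cell estimates."""
    return statistics.fmean(estimate_volume_fl(a) for a in areas_pixels)

# A cell covering 900 pixels has an area of about 36 um^2 and a volume of about 90 fL.
example_volume = estimate_volume_fl(900)
```

Because a single fixed height is applied to every cell, the estimate cannot capture thickness differences between morphologies, which is one reason the text describes it as less accurate than 3D approaches.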

Statistical analysis

All statistical analyses were performed in MATLAB and R. For continuous variables, unless otherwise noted, we reported means (standard deviations) and used 2-sided t tests (for 2 variables) or analysis of variance (for 3+ variables) for population comparisons. For categorical variables, we reported percentages and used a χ2 test for population comparisons. Differences between model sensitivities and specificities were calculated using χ2 tests based on true positive and false negative rates (for sensitivity) and true negative and false positive rates (for specificity). The threshold for statistical significance in the hypothesis tests was set at P = .05. For event rates, confidence intervals were calculated assuming binomial distributions.
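As an illustration of two of the tests named above, the following sketch computes a Pearson χ2 statistic for a 2 × 2 table and a binomial confidence interval for an event rate. The analyses in this study were run in MATLAB and R; this Python version is only a sketch, and it assumes no continuity correction and a normal-approximation (Wald) interval, neither of which is specified in the text.

```python
import math

def chi2_p_value_df1(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction; assumption) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_p_value_df1(chi2)

def binomial_ci(events, n, z=1.96):
    """Wald 95% confidence interval for an event rate (one of several binomial intervals)."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

With 1 degree of freedom, a statistic of χ2 = 4.9 corresponds to P ≈ .027 under this conversion.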

Ethics

The study protocol was approved by the local institutional review board (IRB) of the MGH.

The RBC-diff provides rapid and accurate morphologic assessments

Across a set of 5000 manually labeled single-cell images (2/3 used for training and 1/3 for testing), RBC-diff accurately classified each major morphologic class (mean test set area under the receiver-operator curve [AUC], 0.93; minimum AUC, 0.85; Figure 1A). Across cell population smear images (typically containing 1000-3000 RBCs), RBC-diff counts were concordant with expert estimates (R2 = 0.61, 0.71, 0.98, 0.75, and 0.75 for elliptocytes, schistocytes, sickle cells, spiculated cells, and teardrop cells, respectively; Figure 1B; supplemental Figure 2). The mean algorithm-expert correlation (R2, 0.76) was comparable with the interexpert concordance (R2, 0.75), suggesting that the algorithm performance is limited by the lack of an objective gold standard definition for each class. Interexpert comparisons were concordant but often only weakly calibrated, with the average estimated cell prevalence varying up to fourfold (supplemental Figure 2), highlighting the potential value of a more objective and consistent approach to quantitation. Across 8459 cases for which 2 smears were generated from 1 blood sample, RBC-diff counts showed low intrablood sample variability (Figure 1C). Across 281 745 smears from MGH, the RBC-diff counts aligned with the morphology grades assigned by the hematology laboratory, with higher grade flags (1+, 2+, 3+) associated with increased cell counts of the given type (Figure 1D). This consistency was also observed across 56 832 smears from BWH (supplemental Figure 3), despite significant interhospital differences in smear grading protocols (see supplemental Methods).
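The distinction drawn above between concordance and calibration can be made concrete. The sketch below assumes that R2 is the squared Pearson correlation between two sets of prevalence estimates (the exact definition is not stated in this section), and the function name and example values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences (assumed definition of R2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Two hypothetical raters whose estimates (%) differ by a constant factor of 4.
rater_a = [1.0, 2.0, 4.0, 8.0]
rater_b = [4.0, 8.0, 16.0, 32.0]
concordance = r_squared(rater_a, rater_b)          # close to 1: perfectly concordant
mean_ratio = sum(rater_b) / sum(rater_a)           # 4.0: poorly calibrated
```

Because a correlation-based R2 is unchanged by rescaling, two raters can agree closely on ranking while disagreeing severalfold on absolute prevalence, which is the pattern described for the interexpert comparisons.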

Figure 1.

The RBC-diff accurately identifies red cell morphologies. (A) RBC-diff classification accuracy for a test set of 1334 manually labeled single-cell images. (B) Concordance of RBC-diff counts with expert quantitation on standard smear images (∼2000 RBCs per image) (mean algorithm-expert R2: 0.76; comparison with the individual experts shown in supplemental Figure 2; mean interexpert R2, 0.75). (C) Mean difference in counts between pairs of smears created with different aliquots from the same blood sample compared with the expected difference when randomly sampled from a cell population twice (further details in Methods). (D) Association between RBC-diff cell density estimates on standard smear images and morphology grading flags at MGH (1+, 2+, 3+ reflect increasing frequency of the given morphology). See supplemental Figure 3 for the validation of this result at BWH. (E–I) Distributions associated with panels C and D are shown in supplemental Figures 10, 11, and 12. The distribution of the flags (panel D) is shown in supplemental Figure 3. All P values were calculated using a 2-sided Student t test. AUC, area under the receiver operating curve; N.f, not flagged.


As a final validation, we tested whether RBC-diff counts would detect expected qualitative morphology perturbations across 5 clinical cohorts (Figure 2). Elliptocyte elevations were observed in patients with hereditary elliptocytosis but not in those with hereditary spherocytosis (Figure 2A). RBC-diff counts also accurately tracked expected changes after clinical intervention: spiculated cells decreased after liver transplantation25 (Figure 2B); sickle cells decreased after RBC exchange26 (Figure 2C); microcytes decreased after IV iron supplementation in iron-deficient anemia27 (Figure 2D), and schistocytes increased after splenectomy28 (Figure 2E). These changes typically occurred with stable profiles for the other morphologies (supplemental Figure 4) and, often, in settings in which grading by the clinical laboratory did not change. For example, 20 of 46 (44%) patients who underwent liver transplantation showed no change in spiculated smear grades, as assessed by the clinical laboratory, from pre to posttransplantation, whereas RBC-diff detected a decrease in spiculated cells in 17 of 20 (85%) patients (mean absolute decrease, 7.3%).

Figure 2.

The RBC-diff accurately detects physiologic and clinical signals. Comparison of RBC-diff counts across 5 cohorts with expected morphologic differences driven by physiologic shifts or clinical interventions: (A) hereditary spherocytosis and elliptocytosis, (B) before and after liver transplantation, (C) before and after RBC exchange in patients with sickle cell disease, (D) before and after intravenous iron supplementation in patients with iron-deficiency anemia, and (E) before and after splenectomy. RBC-diff counts for all morphologies of the patients in panels A-E are shown in supplemental Figure 4. The test statistics for panels A-E are 16.0, 6.16, 2.39, 3.90, and 2.09, respectively. Black lines reflect the interquartile range, white circles represent the median, and violin shapes show the data distribution.


In addition to its accuracy, RBC-diff was also (1) fast (<1 second image processing time), (2) accurate with manually photographed smear images (supplemental Figure 5), and (3) insensitive to changes in image hue, as is often observed between medical centers29 (supplemental Figure 6).

RBC-diff facilitates the speed and specificity of iTTP and HUS diagnosis

To evaluate the diagnostic utility of the RBC-diff, we considered a cohort of patients with TMA22 with concern for iTTP, a medical emergency involving a severe acquired deficiency in the von Willebrand factor-cleaving protease ADAMTS13.11 The definitive diagnostic test for iTTP is an ADAMTS13 activity assay, which is typically performed in a reference laboratory, limiting availability in emergency settings. Patients with thrombocytopenia suspected of having iTTP were evaluated manually and subjectively for the presence of schistocytes in the peripheral smear. Therefore, we sought to test whether RBC-diff counts could facilitate objective and rapid iTTP diagnosis before the ADAMTS13 activity results were known. We constructed 2 independent cohorts of 106 (derivation cohort) and 90 (validation cohort) TMA cases, with etiology determined by physician review of clinical charts (Figure 3A, see Methods for further details). iTTP and hemolytic uremic syndrome (HUS) showed higher schistocyte counts than all other TMA etiologies (Figure 3B), although the relapsed iTTP cases had lower schistocyte counts than the initial episodes (Figure 3B). Considering the full differential, iTTP and HUS cases exhibited a unique fingerprint with schistocyte elevations being predominant (schistocyte levels being higher than other morphologies; Figure 3C,D). Elevated schistocytes (with or without predominance) provided high specificity and sensitivity for the diagnosis of iTTP or HUS compared with other TMAs, outperforming hematology laboratory grading (Figure 3E). From the derivation cohort, the optimal diagnostic criteria were identified as (1) schistocytes >4% or (2) schistocytes >2% and predominant (supplemental Figure 7). In the validation cohort, these joint criteria produced significantly higher specificity (72% vs 42%; P < 1e-5) and positive predictive value (41% vs 25%; P < 1e-5) than the hematology laboratory grades, while providing 100% sensitivity (Figure 3F). 
Schistocyte counts provided a diagnostic signature that was not captured via routine blood count measures (Figure 3G).
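The joint criteria identified in the derivation cohort can be expressed as a simple rule. In this sketch, counts are percentages keyed by hypothetical morphology names, and predominance is interpreted as the schistocyte count exceeding every other non-normal category; the precise operational definition is given in the supplemental material and may differ. A helper for sensitivity, specificity, and positive predictive value is included for completeness.

```python
def is_predominant(counts):
    """True if schistocytes exceed every other abnormal category (interpretation; assumption)."""
    schist = counts["schistocytes"]
    return all(
        schist > value
        for name, value in counts.items()
        if name not in ("schistocytes", "normal")
    )

def meets_joint_criteria(counts):
    """(1) schistocytes >4%, or (2) schistocytes >2% and predominant."""
    schist = counts["schistocytes"]
    return schist > 4.0 or (schist > 2.0 and is_predominant(counts))

def diagnostic_metrics(predicted, actual):
    """Sensitivity, specificity, and PPV from paired boolean predictions and diagnoses."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }
```

For instance, a smear with 3% schistocytes and 1% spiculated cells meets criterion 2, whereas a smear with 3% schistocytes and 7% spiculated cells meets neither criterion.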

Figure 3.

RBC-diff schistocyte counts improve diagnostic evaluation of TMAs. (A) TMA cohort inclusion criteria and study design. (B) RBC-diff schistocyte counts according to TMA etiology. (C,D) RBC-diff counts for iTTP (C), and all other TMA etiologies (D). The interrupted y-axis shows a single outlier (93% spiculated, DIC case). For box plots (B,C,D), center lines show the medians; box limits indicate the 25th to 75th percentiles; whiskers extend 1.5× the interquartile range; and dots represent outliers. (E) The sensitivity and specificity of diagnosis of iTTP and/or HUS against all other TMA etiologies using schistocyte count with (light blue) and without (dark blue) predominance (requirement that schistocyte count be higher than that for other abnormal cell types) and using morphology grading flags (brown diamonds). An equivalent plot of the validation cohort is shown in supplemental Figure 13. (F) Sensitivity, specificity, and positive predictive value (PPV) of the RBC-diff cell count criteria for the diagnosis of iTTP and/or HUS in derivation and validation cohorts (supplemental Figure 7). Specificity and PPV using the joint criteria were significantly higher than those using morphology grading flags (P < 1e-5; exact binomial test). (G) Volcano plot indicating the statistical significance and fold changes of CBC indices, RBC-diff counts, and ADAMTS13 activity, each for iTTP compared with non-iTTP/HUS TMA cases. P values were calculated using a Bonferroni-corrected 2-sided Student t test. DIC, disseminated intravascular coagulation; Hgb, hemoglobin; Plt, platelet count.


RBC-diff counts are associated with prognosis in multiple populations

While reviewing the clinical charts, we noted a high mortality rate in the TMA derivation cohort, particularly among patients who were ultimately not diagnosed with iTTP or HUS. This led us to investigate whether high schistocyte counts were associated with mortality. In the TMA derivation cohort (excluding iTTP and HUS cases), elevated schistocytes at the time of ADAMTS13 testing were associated with a nearly fivefold increase in 7-day mortality (3.7% to 17.7%; P = .027; χ2 = 4.9, df = 1) (Figure 4A). Similar schistocyte-mortality associations were observed in the earliest available blood smears from 49 056 patients at MGH. In this cohort, elevated levels of schistocytes (>1%) were associated with increased 6-month all-cause mortality compared with low levels of schistocytes (<0.5%; 11.3% mortality vs 6.4%; P < .001), after matching cohorts for demographics, comorbidities, hematology laboratory grading, hematocrit, and other RBC-diff counts (Figure 4B; supplemental Methods). This signal was validated in an independent cohort of 9894 patients at BWH and was maintained after excluding patients with a cancer diagnosis before or within 30 days of the blood smear (supplemental Figure 8). A chart review of 100 randomly selected deceased patients with high or low levels of schistocytes found no significant differences in the primary cause of death (supplemental Figure 8; χ2 test; P = .56; χ2 = 4.9, df = 6), suggesting that this schistocyte signal may be a complementary predictor of mortality risk and is not specific to 1 pathologic process. This signal was also maintained after controlling for RDW, which is a well-known nonspecific risk factor for morbidity and mortality30,31 (supplemental Figure 9). A weaker mortality association was observed for elevated spiculated cells (Figure 4B). No mortality association was observed for other RBC morphologies (supplemental Figure 8).

Figure 4.

RBC-diff counts are associated with patient prognosis. (A) Seven-day, all-cause mortality in non-iTTP, non-HUS etiologies in the TMA derivation cohort. (B) Kaplan-Meier survival curves for patients at MGH stratified based on RBC-diff counts from the first available blood smear. Each group in panel B was matched for demographics, comorbidities, smear grading, blood count markers, and other RBC-diff counts (supplemental Methods). The findings in panel B were validated in the BWH cohort (supplemental Figure 8). The significance of the results in panel A was calculated using the χ2 test (P = .027; χ2 = 4.9; df = 1). The significance of survival curves in panel B was determined using a log-rank test; P values (test statistic) for low to midgroup comparisons were .04 (Z = 2.09) and .98 (Z = 0.02) and for high to midgroup comparisons were .003 (Z = 2.98) and .01 (Z = 2.60) for schistocytes and spiculated cells, respectively.


The RBC-diff provides single-cell insights into routine blood count measures

Using the pixel dimensions of each identified cell, the RBC-diff can provide an estimate of individual RBC volumes (see Methods). Although less accurate than 3D approaches,24 these estimates are concordant with the routine complete blood count (CBC) indices MCV and RDW (supplemental Figure 1), and the RBC-diff can therefore be used to investigate how RBC morphology affects CBC indices by analyzing approximate volume-morphology distributions (Figure 5A). Using this method, across the preoperative liver transplantation cohort (Figure 2B), spiculated cells were, on average, 14% smaller than other RBCs but did not significantly decrease MCV (Figure 5B). In the iron-deficiency cohort (Figure 2D), the response to iron therapy involved an increase in the size of all RBCs and not just a reduction in microcytes (Figure 5C-D). In the derivation iTTP cohort (Figure 3), schistocytes were, on average, 30% smaller than other cells but only drove a 2 fL mean decrease in MCV (90.5-88.4 fL) (Figure 5E-F). Conversely, schistocytes drove an average absolute RDW increase of 1.9% (18.4%-20.3%) (Figure 5G). These 2 results suggest that previously reported MCV decreases in iTTP32 may be driven mostly by increased microcytosis rather than by schistocytosis and that a sudden increase in RDW in inpatient settings may be an early signal of emergent schistocytosis. Single-cell analysis of iTTP cases also revealed a significant inverse correlation between average schistocyte size (as a percentage of average cell size) and schistocyte count, suggesting that higher schistocyte counts may involve harsher or repeat shearing of cells (Figure 5H).
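The with-and-without comparisons described above (for example, MCV and RDW in iTTP with schistocytes included or excluded) can be sketched as follows. The input is a list of (morphology label, estimated volume in fL) pairs; the names are hypothetical, and RDW is approximated here as the coefficient of variation of the estimated volumes using the population standard deviation, which is an assumption rather than the analyzer definition.

```python
import statistics

def mcv_and_rdw(volumes_fl):
    """Mean volume (fL) and coefficient of variation (%) of a list of cell volumes."""
    mean = statistics.fmean(volumes_fl)
    return mean, 100 * statistics.pstdev(volumes_fl) / mean

def effect_of_morphology(cells, morphology):
    """Change in estimated MCV and RDW attributable to including one morphology class.

    cells: iterable of (label, volume_fl) pairs for a single smear.
    Returns shifts computed as (all cells) minus (all cells except `morphology`).
    """
    cells = list(cells)
    all_volumes = [v for _, v in cells]
    rest_volumes = [v for label, v in cells if label != morphology]
    mcv_all, rdw_all = mcv_and_rdw(all_volumes)
    mcv_rest, rdw_rest = mcv_and_rdw(rest_volumes)
    return {"mcv_shift_fl": mcv_all - mcv_rest, "rdw_shift_pct": rdw_all - rdw_rest}

# Toy smear: two 90 fL normal cells and one 60 fL schistocyte.
toy_smear = [("normal", 90.0), ("normal", 90.0), ("schistocyte", 60.0)]
toy_effect = effect_of_morphology(toy_smear, "schistocyte")
```

In the toy smear, including the small schistocyte lowers the mean volume and widens the distribution, the same direction of effect reported for the iTTP cohort, although real smears contain far more cells and a much smaller schistocyte fraction.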

Figure 5.

RBC-diff enables the estimation of single-cell volume-morphology distributions. (A) Red cell volume distribution across different RBC morphologies, as estimated from a peripheral smear using RBC-diff. (B) The average estimated cell volume across 46 preoperative patients who underwent liver transplantation for normal cells, spiculated cells, and all cells except spiculated cells. (C) Red cell volume-morphology distribution in a patient with iron-deficiency anemia (IDA) before and after intravenous iron infusion. (D) Mean cell volume of 30 patients with IDA before and after infusion for all cells and only normal cells. (E) Red cell volume-morphology distribution for a patient with iTTP near the point of ADAMTS13 testing. (F) Mean cell volume of 15 patients with iTTP for normal cells, all cells except schistocytes, and schistocytes. (G) Estimated mean RDW for 15 patients with iTTP for all cells and after exclusion of schistocytes. (H) Association between the mean size of schistocytes (relative to the average cell size) and schistocyte count across 15 patients with iTTP (42 smears in total) with schistocyte counts <10%. All significance levels were calculated using a 2-sample t test.

Here, we present a novel machine-learning algorithm for the quantification of RBC morphologies in peripheral blood smear images. We validated this method at the single-cell and cell population levels, including comparison with morphology grading flags. We demonstrate how this method can aid in the differential diagnosis and evaluation of patient prognosis in multiple clinical settings. Finally, we illustrate how this method may help elucidate the effects of RBC morphology on routine CBC indices and help understand the pathophysiology of disease progression and treatment response.

Some previously developed machine-learning methods for the classification of RBC abnormalities have been limited by small or poor-quality data sets,16 choice of nonstandard classification categories,33 and limited clinical correlation.17,18 Other approaches have shown good performance in larger or well-defined data sets,17,20 but have often focused on individual cell classification without validation at the smear level or in the context of clinical care. Because human assessment of an individual cell's morphology will be informed by morphologic heterogeneity across the entire smear, the clinical application of blood smear analysis involves consideration of the overall RBC population. Our approach overcomes these limitations using a robust and multipronged validation approach to demonstrate the accuracy of the method and its potential for diagnostic and prognostic applications (Figures 1-4). RBC-diff classifications were also insensitive to changes in image hue, and the method performed well at a separate medical center and on manually collected images (supplemental Figures 3, 5, and 6). Because it is possible that alternative or complementary approaches to classification, such as using neural networks or automating feature selection,17,20,21 could enhance performance, we provide single-cell and cell population images (and associated expert labels) as a public resource (supplemental Data 2).

One significant challenge in automating the detection of RBC morphology is the lack of clear definitions of the specific morphologies. Unlike WBCs, RBC types do not have distinct mechanistic functions that help inform cellular structure, and morphologic classes tend to arise subjectively. Although researchers such as Bessis et al have elucidated and described RBC morphology in detail in experimental settings,34,35 the definition of morphology in clinical settings remains subjective, as demonstrated by the modest interexpert agreement levels we found (supplemental Figure 2). The type of objective definitions of morphologic class provided by the RBC-diff would improve the reproducibility of smear analysis, interpretation, and clinical utilization.

RBC-diff demonstrates the potential benefits of more precise and objective quantitation of RBC morphology. Compared with morphology grading flags, RBC-diff counts improved the sensitivity and specificity of the differential diagnosis of iTTP (Figure 3). Schistocyte levels are known to be of importance in ADAMTS13 deficiencies,32,36 but manual differentiation between different levels (1+, 2+, etc) of schistocytes is challenging, with expert assessments often differing substantially.13 The RBC-diff provides an objective and reproducible definition of significant schistocyte elevation, including determination of predominance, a recommendation in clinical guidelines.37 Different TMA etiologies had distinct RBC-diff count fingerprints (Figure 3), suggesting that this tool could play a role in the initial evaluation of patients with TMA, complementing scoring systems such as the PLASMIC score32,36 to assess the risk of severe ADAMTS13 deficiency.
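As a minimal illustration of how a count-based criterion can be scored against a reference diagnosis, the sketch below computes sensitivity and specificity for a single threshold rule. The cutoff, the variable names, and the example values are hypothetical and chosen only for illustration; they are not the criteria or data used in this study.

```python
def sensitivity_specificity(predicted_positive, truly_positive):
    """Return (sensitivity, specificity) from paired boolean lists."""
    pairs = list(zip(predicted_positive, truly_positive))
    true_pos = sum(1 for pred, truth in pairs if pred and truth)
    false_neg = sum(1 for pred, truth in pairs if not pred and truth)
    true_neg = sum(1 for pred, truth in pairs if not pred and not truth)
    false_pos = sum(1 for pred, truth in pairs if pred and not truth)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical cutoff on schistocyte percentage, for illustration only.
HYPOTHETICAL_CUTOFF_PERCENT = 1.0

# Illustrative per-patient values and reference diagnoses (not study data).
schistocyte_percent = [2.4, 0.3, 1.8, 0.9, 0.2, 1.2]
has_condition = [True, False, True, True, False, False]

flagged = [pct > HYPOTHETICAL_CUTOFF_PERCENT for pct in schistocyte_percent]
sensitivity, specificity = sensitivity_specificity(flagged, has_condition)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Because the count is continuous, the same calculation can be repeated over a range of cutoffs to trade sensitivity against specificity, which is not possible with coarse morphology grades.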

Figure 4 shows that RBC-diff counts may in some scenarios be predictive of patient outcomes or track with patient prognosis. The surprising associations with mortality in Figure 4B persisted after adjusting for multiple factors, including morphology grading flags, comorbidities, and RDW, suggesting the presence of valuable and underutilized clinical information in blood smears. We note that the population of patients with blood smears at MGH is not representative of the general patient population, and further study of this signal in healthy cohorts is required.

RBC-diff can also help provide single-cell insights into the influence of morphology on CBC indices (Figure 5) via the estimation of single-cell volumes. Estimation of blood count parameters from imaging data has previously been shown to be promising, with prototype approaches showing similar accuracy to flow-cytometry approaches.38,39 By connecting estimated CBC indices to morphology, RBC-diff can generate morphology-corrected CBC indices that may provide improved discrimination of pathologic states or response to treatment. It has been shown that hemoglobin levels can be estimated from blood smear images,39 suggesting that RBC-diff could potentially be extended to other blood count measures such as hemoglobin and mean corpuscular hemoglobin.

This study focuses primarily on the quantitation of schistocytes because they are commonly elevated in important acute care settings40-44 and existing automated systems show limited specificity of detection.14,45 Schistocyte counts can be approximated via the fragmented red cell count (FRC), which can be calculated via flow cytometry.46,47 FRC counts are sensitively but nonspecifically associated with imaging-derived schistocyte counts47 and may have diagnostic value for TMAs.48 However, FRC counts do not provide information on other RBC morphologic classes, a key feature for differential diagnosis in our study (Figure 3) and a part of the current recommendations for TMA diagnosis.41 Although FRC is a valuable correlate of schistocyte levels, it is typically used to highlight the need for manual smear review,46 and thus may provide value in tandem with the RBC-diff. A robust comparison of the RBC-diff schistocyte counts and FRC was not possible in this study because the primary clinical hematology analyzers at the MGH and BWH do not routinely record FRC values.

The RBC-diff is not intended to replace manual smear review but rather to provide technical assistance to improve speed and objectivity. The CBC and WBC differential currently provide an objective and quantitative foundation that informs manual smear review, and the RBC-diff could bolster this foundation. Given its accuracy with manually collected images (supplemental Figure 5), this potential application may be of particular benefit in resource-limited settings in which automated imaging systems are unavailable. However, it should be noted that RBC-diff was designed to quantify only 5 major morphologic classes and does not currently assess hypochromia, pallor, polychromasia, or RBC inclusions, and thus does not yet detect target cells, spherocytes, or RBC parasites. Similarly, the algorithm does not use advanced techniques17 to account for cell adhesion or crowding and may be less accurate in settings of extreme agglutination or poor smear quality. The expansion of cell classes and adjustments for smear quality are exciting avenues for future work.

Our application of RBC-diff primarily focused on the clinical setting of TMAs, where red cell dysfunction is commonplace. However, we speculate that RBC-diff may be valuable in many other clinical settings, such as: (1) schistocyte quantitation in disseminated intravascular coagulation,40 sepsis, or pregnancy-related conditions such as HELLP49; (2) sickle cell quantitation for sickle cell disease50; and (3) spiculated cell quantitation in severe liver disease. These reflect avenues for future research with a significant potential to improve clinical outcomes. More broadly, given its speed, accuracy, and robustness, we hope that RBC-diff may provide a powerful new lens to study red blood cell morphology in disease.

The authors thank the Mass General Brigham Research Patient Data Registry and Electronic Data Warehouse groups for facilitating the use of their databases, Chris Lofgren for the assistance with MGH database access and management, Olga Pozdnyakova for the assistance in accessing blood smears from BWH, Rahul Deo for the valuable conversations about the analysis, and the CellaVision team and Hangs-Inge Bengtsson for their help in archiving blood smear data.

This work is supported by the Vickery-Colvin Pathology Research Grant (J.A.S., J.M.H., and R.S.M.), the One Brave Idea Initiative (J.M.H.), and the Evelyn and Robert Luick Endowed Fund for the Blood Transfusion Service at MGH (R.S.M.). H.A.-S. is the recipient of the American Society of Hematology Scholar Award.

Contribution: J.A.S., B.H.F., J.M.H., and R.S.M. conceived the project and its design; B.H.F. wrote the code for the RBC-diff with input from other authors; and all authors conducted analyses and contributed to writing the manuscript.

Conflict-of-interest disclosure: R.S.M. and P.K.B. have both worked as consultants for Alexion on a project to validate use of the PLASMIC score for diagnosis of atypical HUS. H.A.-S. lists universal disclosures of research funding from Agios, Amgen, and Dova/Sobi, and consultancy for Agios, Dova/Sobi, Novartis, Rigel, argenx, Moderna, and Forma (all unrelated to this article). The remaining authors declare no competing financial interests.

Correspondence: John M. Higgins, 185 Cambridge St, Boston, MA 02114; e-mail: higgins.john@mgh.harvard.edu; and Robert S. Makar, Massachusetts General Hospital, GRJ2-233, 55 Fruit St, Boston, MA 02114; e-mail: rmakar@mgh.harvard.edu.

1. Wintrobe MM, Maxwell M. Clinical hematology. 7th ed. Lea & Febiger; 1974.
2. Ward PCJ. The CBC at the turn of the millennium: an overview. Clin Chem. 2000;46(8 Pt 2):1215-1220.
3. Todd JK. Childhood infections. Diagnostic value of peripheral white blood cell and differential cell counts. Am J Dis Child. 1974;127(6):810-816.
4. Weitzman M. Diagnostic utility of white blood cell and differential cell counts. Am J Dis Child. 1975;129(10):1183-1189.
5. Kaushansky K, Lichtman MA, Prchal JT, Levi M, Burns LJ, Linch DC. Williams Hematology. 10th ed. McGraw Hill; 2021.
6. Ruzicka K, Veitl M, Thalhammer-Scherrer R, Schwarzinger I. The new hematology analyzer Sysmex XE-2100: performance evaluation of a novel white blood cell differential technology. Arch Pathol Lab Med. 2001;125(3):391-396.
7. Kratz A, Lee SH, Zini G, Riedl JA, Hur M, Machin S; International Council for Standardization in Haematology. Digital morphology analyzers in hematology: ICSH review and recommendations. Int J Lab Hematol. 2019;41(4):437-447.
8. Katz BZ, Feldman MD, Tessema M, et al. Evaluation of Scopio Labs X100 full field PBS: the first high-resolution full field viewing of peripheral blood specimens combined with artificial intelligence-based morphological analysis. Int J Lab Hematol. 2021;43(6):1408-1416.
9. Palmer L, Briggs C, Mcfadden S, et al. ICSH recommendations for the standardization of nomenclature and grading of peripheral blood cell morphological features. Int J Lab Hematol. 2015;37(3):287-303.
10. Ford J, Ford J. Red blood cell morphology. Int J Lab Hematol. 2013;35(3):351-357.
11. Zheng XL, Vesely SK, Cataland SR, et al. ISTH guidelines for treatment of thrombotic thrombocytopenic purpura. J Thromb Haemost. 2020;18(10):2496-2502.
12. Nester CM, Thomas CP. Atypical hemolytic uremic syndrome: what is it, how is it diagnosed, and how is it treated? Hematology. 2012;2012(1):617-625.
13. Hervent AS, Godefroid M, Cauwelier B, Billiet J, Emmerechts J. Evaluation of schistocyte analysis by a novel automated digital cell morphology application. Int J Lab Hematol. 2015;37(5):588-596.
14. Park SJ, Yoon J, Kwon JA, Yoon SY. Evaluation of the CellaVision advanced RBC application for detecting red blood cell morphological abnormalities. Ann Lab Med. 2021;41(1):44-50.
15. Beam AL, Manrai AK, Ghassemi M. Challenges to the reproducibility of machine learning models in health care. JAMA. 2020;323(4):305-306.
16. Alzubaidi L, Fadhel MA, Al-shamma O, Zhang J, Duan Y. Deep learning models for classification of red blood cells in microscopy images to aid in sickle cell anemia diagnosis. Electronics. 2020;9(3):427.
17. Wong A, Anantrasirichai N, Chalidabhongse TH, Palasuwan D, Palasuwan A, Bull D. Analysis of vision-based abnormal red blood cell classification. arXiv preprint. arXiv:2106.00389. Preprint posted online 1 June 2021.
18. Song W, Huang P, Wang J, et al. Red blood cell classification based on attention residual feature pyramid network. Front Med. 2021;8:741407.
19. Lotfi M, Nazari B, Sadri S, Sichani NK. The detection of dacrocyte, schistocyte and elliptocyte cells in iron deficiency anemia. IPRIA; 2015. 2nd International Conference on Pattern Recognition and Image Analysis.
20. Demagny J, Roussel C, le Guyader M, et al. Combining imaging flow cytometry and machine learning for high-throughput schistocyte quantification: a SVM classifier development and external validation cohort. EBioMedicine. 2022;83:104209.
21. Sadafi A, Koehler N, Makhro A, et al. Multiclass deep active learning for detecting red blood cell subtypes in brightfield microscopy. Lect Notes Comput Sci. 2019;11764 LNCS:685-693.
22. Bendapudi PK, Li A, Hamdan A, et al. Impact of severe ADAMTS13 deficiency on clinical presentation and outcomes in patients with thrombotic microangiopathies: the experience of the Harvard TMA research collaborative. Br J Haematol. 2015;171(5):836-844.
23. Guo Q, Duffy SP, Matthews K, Santoso AT, Scott MD, Ma H. Microfluidic analysis of red blood cell deformability. J Biomech. 2014;47(8):1767-1776.
24. Simionato G, Hinkelmann K, Chachanidze R, et al. Red blood cell phenotyping from 3D confocal images using artificial neural networks. PLoS Comput Biol. 2021;17(5):e1008934.
25. Marks PW. Hematologic manifestations of liver disease. Semin Hematol. 2013;50(3):216-221.
26. Swerdlow PS. Red cell exchange in sickle cell disease. Hematology. 2006;2006(1):48-53.
27. Campion EW, Deloughery TG. Microcytic anemia. N Engl J Med Overseas Ed. 2014;371(14):1324-1331.
28. Weintraub LR. Splenectomy: who, when, and why? Hosp Pract. 2015;29(6):27-34.
29. Finlayson SG, Subbaswamy A, Singh K, et al. The clinician and dataset shift in artificial intelligence. N Engl J Med. 2021;385(3):283-286.
30. Foy BH, Carlson JC, Reinertsen E, et al. Association of red blood cell distribution width with mortality risk in adults hospitalized with COVID-19 infection. JAMA Netw Open. 2020;3(9):e2022058.
31. Patel KV, Ferrucci L, Ershler WB, Longo DL, Guralnik JM. Red blood cell distribution width and the risk of death in middle-aged and older adults. Arch Intern Med. 2009;169(5):515-523.
32. Bendapudi PK, Hurwitz S, Fry A, et al. Derivation and external validation of the PLASMIC score for rapid assessment of adults with thrombotic microangiopathies: a cohort study. Lancet Haematol. 2017;4(4):e157-e164.
33. Doan M, Sebastian JA, Caicedo JC, et al. Objective assessment of stored blood quality by deep learning. Proc Natl Acad Sci. 2020;117(35):21381-21390.
34. Bessis M, Mohandas N. Red cell structure, shapes and deformability. Br J Haematol. 1975;31(s1):5-10.
35. Weed RI, Leblond PF, Bessis M. Red cell shape: physiology, pathology and ultrastructure. Springer Verlag; 1973.
36. Bendapudi PK, Upadhyay V, Sun L, Marques MB, Makar RS. Clinical scoring systems in thrombotic microangiopathies. Semin Thromb Hemost. 2017;43(5):540-548.
37. Zini G, de Cristofaro R. Diagnostic testing for differential diagnosis in thrombotic microangiopathies. Turk J Haematol. 2019;36(4):222-229.
38. Winkelman JW, Tanasijevic MJ, Zahniser DJ. A novel automated slide-based technology for visualization, counting, and characterization of the formed elements of blood: a proof of concept study. Arch Pathol Lab Med. 2017;141(8):1107-1112.
39. Bruegel M, George TI, Feng B, et al. Multicenter evaluation of the cobas m 511 integrated hematology analyzer. Int J Lab Hematol. 2018;40(6):672-682.
40. Lesesve JF, Martin M, Banasiak C, et al. Schistocytes in disseminated intravascular coagulation. Int J Lab Hematol. 2014;36(4):439-443.
41. Zini G, d’Onofrio G, Briggs C, et al. ICSH recommendations for identification, diagnostic value, and quantitation of schistocytes. Int J Lab Hematol. 2012;34(2):107-116.
42. Noutsos T, Currie BJ, Brown SG, Isbister GK. Schistocyte quantitation, thrombotic microangiopathy and acute kidney injury in Australian snakebite coagulopathy. Int J Lab Hematol. 2021;43(5):959-965.
43. Bahr TM, Judkins AJ, Christensen RD, et al. Neonates with suspected microangiopathic disorders: performance of standard manual schistocyte enumeration vs. the automated fragmented red cell count. J Perinatol. 2019;39(11):1555-1561.
44. O’Brien TE, Bowman L, Hong A, Goparaju K. Quantification of schistocytes from the peripheral blood smear in thrombotic thrombocytopenic purpura (TTP) compared to non-TTP thrombocytopenic hospitalized patients. Blood. 2018;132(Suppl 1):4983.
45. Horn CL, Mansoor A, Wood B, et al. Performance of the CellaVision® DM96 system for detecting red blood cell morphologic abnormalities. J Pathol Inform. 2015;6(1):11.
46. Lesesve JF, Speyer E, Perol JP. Fragmented red cells reference range for the Sysmex XN-series of automated blood cell counters. Int J Lab Hematol. 2015;37(5):583-587.
47. Lesesve JF, Asnafi V, Braun F, Zini G. Fragmented red blood cells automated measurement is a useful parameter to exclude schistocytes on the blood film. Int J Lab Hematol. 2012;34(6):566-576.
48. Saigo K, Jiang M, Tanaka C, et al. Usefulness of automatic detection of fragmented red cells using a hematology analyzer for diagnosis of thrombotic microangiopathy. Clin Lab Haematol. 2002;24(6):347-351.
49. Rath W, Faridi A, Dudenhausen JW. HELLP syndrome. J Perinat Med. 2000;28(4):249-260.
50. Alvarez O, Montague NS, Marin M, O’Brien R, Rodriguez MM. Quantification of sickle cells in the peripheral smear as a marker of disease severity. Fetal Pediatr Pathol. 2015;34(3):149-154.

Author notes

B.H.F. and J.A.S. are joint first authors and contributed equally to this study.

The code to run RBC-diff, including example images and README, is included in supplemental Data 3.

The raw data for the figures and tables in the manuscript are included in supplemental Data 1. Because of IRB restrictions on the sharing of protected health information, raw data for certain figures were not included or have been limited to ensure anonymization.

A labeled set of 5000 single RBC images used to train the algorithm and a set of 50 cell population level smears with expert estimates of cell density are provided (supplemental Data 2).

An additional set of 5000 manually labeled single-cell images, not used in model training, is also provided (see supplemental Methods for further details).

Other details about the code are available on request from the corresponding author, Brody H. Foy (bfoy1@mgh.harvard.edu).

The full-text version of this article contains a data supplement.