Myelodysplastic syndrome (MDS), aplastic anemia (AA), and paroxysmal nocturnal hemoglobinuria (PNH) are a heterogeneous group of disorders with significant clinical overlap. Current nosologic classification schemes, based mostly on morphology, may not reflect the pathophysiologic relationships between individual entities. The recent refinement of surface-enhanced laser desorption/ionization (SELDI) time-of-flight (TOF) technology allows high-throughput protein analysis of various biologic samples. Plasma samples from patients with MDS (N=134), AA (N=41), and PNH (N=34) and from healthy controls (N=75) were subjected to SELDI analysis using H4 chips. We limited the analysis to spectra with more than 80 peaks. Reproducibility of each array run was monitored by repeated measurement of one standard sample on each chip. All spectra with peaks characterized by mass-to-charge ratios between 1.5 and 30 kDa were examined. Peaks with an amplitude at least 3 times the average background noise were considered meaningful. We had previously attempted SELDI analysis in bone marrow failure in a direct, manual, non-parametric fashion; this approach did not result in satisfactory discrimination. Here we have designed a new method utilizing a "binning" algorithm for peaks fulfilling the selection criteria. A non-parametric approach was applied to select peaks within the bin defined by specific m/z amplitude. For each condition, the "binned" signals were screened using Fisher’s exact test to determine significant peaks; p<0.001 was used as the cutoff for selecting signals for further study. Logistic regression models with step-wise and backward variable selection were then used to determine peaks that had independent predictive value and could therefore discriminate between the patient groups.
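The binning and screening steps described above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the fixed 50 Da bin width, the representation of a spectrum as a list of (m/z in Da, signal-to-noise) pairs, and the function names `bin_peaks`, `fisher_exact_two_sided`, and `screen_bins` are all assumptions. The 1.5–30 kDa window, the 3-times-noise threshold, and the p<0.001 cutoff are taken from the text.

```python
from math import comb

MZ_MIN_DA, MZ_MAX_DA = 1500.0, 30000.0  # 1.5-30 kDa window (from the text)
MIN_SNR = 3.0        # peak must be at least 3x background noise (from the text)
BIN_WIDTH_DA = 50.0  # ASSUMPTION: fixed-width bins; the source gives no width


def bin_peaks(spectrum):
    """Return the set of bin indices holding a qualifying peak.

    spectrum: list of (mz_da, signal_to_noise) pairs (hypothetical layout).
    """
    return {
        int((mz - MZ_MIN_DA) // BIN_WIDTH_DA)
        for mz, snr in spectrum
        if MZ_MIN_DA <= mz <= MZ_MAX_DA and snr >= MIN_SNR
    }


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x "present" subjects in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def screen_bins(disease_spectra, control_spectra, alpha=0.001):
    """Return {bin_index: p} for bins whose presence differs between groups."""
    d_bins = [bin_peaks(s) for s in disease_spectra]
    c_bins = [bin_peaks(s) for s in control_spectra]
    hits = {}
    for b in sorted(set().union(*d_bins, *c_bins)):
        a = sum(b in s for s in d_bins)  # disease subjects with the signal
        c = sum(b in s for s in c_bins)  # controls with the signal
        p = fisher_exact_two_sided(a, len(d_bins) - a, c, len(c_bins) - c)
        if p < alpha:
            hits[b] = p
    return hits
```

Reducing each spectrum to presence or absence per bin is what makes a 2x2 exact test applicable. The bins that pass the screen would then be the candidate variables for the logistic regression step, which is not reproduced here.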
The resulting model was validated internally using a bootstrap re-sampling procedure: random samples were drawn from the original population with replacement, such that each resulting "bootstrap population" had the same number of subjects as the original study population. This process was repeated 500 times, and the frequency of each factor’s inclusion in the resulting models was calculated. Factors present in ≥50% of the models were considered significant and were used to build a final "bootstrap-based" model. Discriminating signals were identified, and a scoring system was used to derive a score for each condition relative to controls (+1 when the presence of the signal in a bin was associated with the condition and −1 when the absence of the signal was associated with the condition). A score of 3 was used as the cutoff in MDS and a score of 2 in the other diseases to differentiate patients from normal controls (Table 1). The observed discrimination, obtained using only one chip surface and crude disease definitions, encourages future studies. Non-overlapping sample sets are currently being generated for further analysis to identify proteomic patterns in specific cytogenetic defects and morphological categories, thereby validating the pilot results in more stringently defined disease phenotypes.
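The bootstrap validation and the scoring rule can be illustrated with the minimal sketch below. Several parts are assumptions rather than the authors' method: `toy_select` is a deliberately simple stand-in for the stepwise logistic regression, the subject records (`"disease"`, `"bins"`) are a hypothetical layout, and the scoring function encodes only one possible reading of the +1/−1 rule (one point for each discriminating bin whose state matches the disease-associated direction, with the cutoff treated as "greater than or equal"). The 500 replicates and the ≥50% inclusion threshold come from the text.

```python
import random


def bootstrap_inclusion(subjects, select, n_boot=500, seed=0):
    """Fraction of bootstrap replicates in which `select` keeps each factor.

    subjects: list of subject records.
    select:   callable(list of records) -> set of selected factor names.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        # Draw with replacement; same size as the original population.
        sample = rng.choices(subjects, k=len(subjects))
        for factor in select(sample):
            counts[factor] = counts.get(factor, 0) + 1
    return {f: n / n_boot for f, n in counts.items()}


def stable_factors(freqs, min_freq=0.5):
    """Factors selected in at least `min_freq` of the replicates."""
    return {f for f, v in freqs.items() if v >= min_freq}


def toy_select(sample, min_gap=0.5):
    """ASSUMPTION: stand-in selector, NOT stepwise logistic regression.

    Keeps bins whose presence rate differs by at least `min_gap`
    between cases and controls.
    """
    cases = [s["bins"] for s in sample if s["disease"]]
    ctrls = [s["bins"] for s in sample if not s["disease"]]
    if not cases or not ctrls:
        return set()  # replicate lacks one class; nothing can be selected

    def rate(group, b):
        return sum(b in g for g in group) / len(group)

    bins = set().union(*cases, *ctrls)
    return {b for b in bins if abs(rate(cases, b) - rate(ctrls, b)) >= min_gap}


def score(sample_bins, directions):
    """Count bins whose state matches the disease-associated direction.

    directions: {bin: +1 if presence marks the disease, -1 if absence does}.
    """
    return sum(
        1 for b, d in directions.items() if (b in sample_bins) == (d == 1)
    )


def is_disease(sample_bins, directions, cutoff):
    """Classify as disease when the score reaches the cutoff (assumed >=)."""
    return score(sample_bins, directions) >= cutoff
```

In this reading, a call such as `is_disease(bins, directions, cutoff=3)` would correspond to the MDS rule and `cutoff=2` to the AA and PNH rules, with `directions` built from the factors returned by `stable_factors`.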

Table 1. Disease versus normal controls

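| Group | Number | Score + in % | Score - in % | p | Sensitivity/specificity |
|---|---|---|---|---|---|
| Control | 75 | 83 | 17 | | |
| MDS | 134 | 17 | 83 | <0.001 | 83%/83% |
| Control | 75 | 88 | 12 | | |
| AA | 41 | 24 | 76 | <0.001 | 76%/88% |
| Control | 75 | 92 | | | |
| PNH | 34 | 21 | 79 | <0.001 | 79%/92% |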
