Hundreds of hematopoietic cell types and hundreds of cell surface molecules have been defined, and the cell-specific profiles arising from the presence of specific proteins on the surface of different cells or in different biological states (e.g. developmental stages or disease states) represent data of high combinatorial complexity. The dynamic surface marker profiles of cells have been used extensively for cell sorting and for therapeutics, where specific surface markers direct therapeutic agents to diseased cells using either monoclonal antibodies or cell-based therapies.

Immunophenotyping is commonly used to define and sort cells based on the proteins present on their surface. To separate similar cells efficiently, a large number of surface proteins is often used. Complete knowledge of unique, cell-specific profiles of surface protein expression is likely to reveal much simpler surface profiles than those currently used, and to allow surface profiles to be defined where none currently exist.

In cancer immunotherapy, adoptive transfer of chimeric antigen receptor (CAR) engineered T cells shows promise as a therapy modality. Currently, the main achievements of this technique have come from targeting single malignancy-specific surface molecules, but progress is being made toward requiring the binding of several ligands before lymphocyte activation, which will increase the specificity of the therapy and thereby decrease off-target effects.
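The idea of requiring several ligands before activation can be illustrated with a minimal sketch. It assumes a hypothetical input format (each cell mapped to the set of surface markers it expresses) and uses made-up cell and marker names; it is not the selection algorithm described in this abstract. The function `and_gate_pairs` lists marker pairs that are co-expressed on the target cell but never together on any other cell.

```python
from itertools import combinations


def and_gate_pairs(expression, target):
    """Return marker pairs co-expressed on `target` and on no other cell.

    `expression` maps a cell name to the set of markers it expresses.
    This input format is an assumption made for illustration only.
    """
    target_markers = sorted(expression[target])
    pairs = []
    for a, b in combinations(target_markers, 2):
        # A pair is usable only if no off-target cell carries both markers.
        co_expressed_elsewhere = any(
            a in markers and b in markers
            for cell, markers in expression.items()
            if cell != target
        )
        if not co_expressed_elsewhere:
            pairs.append((a, b))
    return pairs


# Hypothetical toy data; names and values are not taken from the database.
toy = {
    "target": {"M1", "M2", "M3"},
    "off_1": {"M1", "M2"},
    "off_2": {"M3", "M4"},
}
pairs = and_gate_pairs(toy, "target")
print(pairs)
```

In this toy case no single marker is unique to the target cell, yet two pairs are, which is the situation where a multi-ligand requirement would add specificity.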

Defining surface protein expression profiles for cell stratification and CAR therapy in silico requires information about the expression of a large number of surface proteins on a large number of cells. At present, no high-throughput technique for measuring surface protein expression exists, although efforts to increase throughput using mass spectrometry and computational prediction of protein expression from mRNA expression are being explored. However, surface molecule expression on individual cells has been characterized at low throughput using immunohistochemistry or flow cytometry for decades, and vast amounts of cell-specific expression data have been measured and published. This represents a rich but unstructured source of data and information.

To facilitate the definition of unique surface molecule profiles, we collected and organized large amounts of these data from the primary literature, covering human hematopoietic cells and the corresponding quantitative or qualitative presence (depending on availability) of known surface molecules. To do so, we employed text mining techniques for article classification (as either containing information about surface protein expression or not), followed by extensive manual curation, to assemble the data foundation for defining cell surface profiles for stratification and therapy. To analyze these data, we have developed algorithms for selecting cell surface proteins for cell stratification and for selecting targets for CAR-based therapies.
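The abstract does not specify how the selection algorithms work. As one possible illustration only, selecting a small marker panel that separates a target cell from all others can be treated as a set-cover problem and approximated greedily. The sketch below assumes a hypothetical data layout (cell name mapped to a dictionary of marker to True/False, with unmeasured markers simply absent) and made-up names.

```python
def greedy_panel(profiles, target):
    """Greedily pick markers that separate `target` from all other cells.

    `profiles` maps cell -> {marker: bool}; a missing marker means
    "not measured" and is never counted as a difference. This is a
    set-cover heuristic chosen for illustration, not the authors' method.
    Returns (panel, unresolved): the chosen (marker, state) pairs and the
    cells that could not be told apart from the target.
    """
    target_profile = profiles[target]
    remaining = {cell for cell in profiles if cell != target}
    panel = []
    while remaining:
        best_marker, best_separated = None, set()
        for marker, state in sorted(target_profile.items()):
            # Cells with a measured, opposite state are separated by this marker.
            separated = {
                cell for cell in remaining
                if marker in profiles[cell] and profiles[cell][marker] != state
            }
            if len(separated) > len(best_separated):
                best_marker, best_separated = marker, separated
        if best_marker is None:
            break  # no marker separates any of the remaining cells
        panel.append((best_marker, target_profile[best_marker]))
        remaining -= best_separated
    return panel, remaining


# Hypothetical toy profiles; not values from the curated database.
profiles = {
    "cell_A": {"M1": True, "M2": True, "M3": False},
    "cell_B": {"M1": True, "M2": False, "M3": False},
    "cell_C": {"M1": False, "M2": True},
    "cell_D": {"M1": False, "M2": False, "M3": True},
}
panel, unresolved = greedy_panel(profiles, "cell_A")
print(panel, unresolved)
```

A greedy choice does not guarantee the smallest possible panel, but it is simple and reports explicitly which cells remain indistinguishable when the available data are too sparse.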

The resulting database contains expression of 305 surface proteins across 206 hematopoietic cells, totaling 6153 data points. We have applied our algorithm to define unique profiles for each of the 206 cells, thus characterizing the surface profiles of the majority of hematopoietic cells to increase efficiency and specificity of cell stratification and therapy targeting. Future efforts will include expanding the database to contain surface protein expression for cells in all human tissues, as well as experimental validation of discovered surface profiles.

Disclosures

No relevant conflicts of interest to declare.

Author notes

*

Asterisk with author names denotes non-ASH members.
