The tumor surface proteome, or surfaceome, is vital for tumor-microenvironment interactions and therapeutic innovation.1 Targeting cell surface proteins is central to cancer immunotherapy, though identifying these targets is challenging.2 Current therapies include monoclonal antibodies, bispecific T-cell engagers, antibody-drug conjugates, and adoptive cellular treatments like chimeric antigen receptor T-cell (CAR-T) and CAR natural killer therapy. While these approaches have shown success in hematologic cancers, including CD19- and CD20-targeted treatments for B-cell acute lymphoblastic leukemia (B-ALL) and diffuse large B-cell lymphoma, as well as B-cell maturation antigen (BCMA)-targeted therapies for multiple myeloma (MM), broader application is limited by a lack of specific cancer targets and disease heterogeneity.3 New strategies targeting tumor-specific antigens are needed to improve efficacy and overcome issues like antigen escape and tumor resistance.4
Current approaches used for target discovery include bulk transcriptome,5 single-cell RNA sequencing (scRNAseq),6 cell surface proteomics,1 and antibody-based proteomics7 (Figure). These methods have successfully identified new therapeutic targets, aiding in the development of new therapies and advancing disease biology research. Additionally, artificial intelligence and machine learning are increasingly used to analyze large datasets from these techniques, further enhancing target discovery.8
Each method generates data on genomic and molecular heterogeneity from bulk populations to single cells, with unique strengths and weaknesses. Consequently, these data are often complementary and used in combination to validate target discoveries.
Bulk Transcriptome
Bulk transcriptome analysis profiles the entire set of RNA transcripts in a sample by RNA sequencing (RNA-seq). RNA is converted into complementary DNA (cDNA) using reverse transcriptase, creating a library for next-generation sequencing. This method identifies specific genes, quantifies gene expression, and measures differential gene expression, providing insights into cell composition, molecular architecture, and functional details of tissues.9
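As a rough illustration of what measuring differential gene expression means in practice, the sketch below computes a log2 fold change between tumor and normal samples and keeps genes above a cutoff. The gene names, expression values, pseudocount, and cutoff are all hypothetical and chosen only for illustration; real analyses rely on dedicated statistical packages that model count variability rather than a simple ratio of means.

```python
import math

def log2_fold_change(tumor_tpm, normal_tpm, pseudocount=1.0):
    """Log2 ratio of mean tumor to mean normal expression.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is undetected in normal tissue.
    """
    tumor_mean = sum(tumor_tpm) / len(tumor_tpm)
    normal_mean = sum(normal_tpm) / len(normal_tpm)
    return math.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))

# Hypothetical expression values (transcripts per million), invented for illustration.
expression = {
    "GENE_A": {"tumor": [510.0, 490.0, 530.0], "normal": [0.5, 0.3, 0.7]},
    "GENE_B": {"tumor": [20.0, 25.0, 15.0], "normal": [18.0, 22.0, 20.0]},
}

def rank_candidates(expression, min_log2fc=3.0):
    """Return (gene, log2FC) pairs at or above the cutoff, highest first."""
    scored = [
        (gene, log2_fold_change(values["tumor"], values["normal"]))
        for gene, values in expression.items()
    ]
    kept = [(gene, score) for gene, score in scored if score >= min_log2fc]
    return sorted(kept, key=lambda item: item[1], reverse=True)

candidates = rank_candidates(expression)
```

Here only GENE_A passes the cutoff, because GENE_B is expressed at the same average level in tumor and normal samples. A transcript-level hit of this kind would still need confirmation at the protein level.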
Bulk transcriptome analysis is valuable for target discovery due to large-scale public databases (e.g., the Genotype-Tissue Expression project, the Cancer Cell Line Encyclopedia, and the Human Protein Atlas) that help identify upregulated transcripts in tumors. Disease-specific databases, such as that based on the Multiple Myeloma Research Foundation’s Relating Clinical Outcomes in Multiple Myeloma to Personal Assessment of Genetic Profile study, and primary research datasets in the Gene Expression Omnibus further enhance its utility. RNA-seq technology has been widely available for more than 15 years, building on earlier cDNA microarray methods.
As an example of using bulk transcriptome for immunotherapy target discovery, Eric L. Smith, MD, PhD, and colleagues analyzed mRNA from human bone marrow samples and MM cell lines, finding GPRC5D mRNA expression to be 1,000-fold higher in malignant plasma cells compared to peripheral blood B cells, with minimal expression in normal tissues.5 GPRC5D-targeted CAR T cells are in late-stage clinical trials,10 and a bispecific antibody against GPRC5D has received approval from the U.S. Food and Drug Administration.11
Using bulk transcriptome analysis, our group identified CD70 as a target in high-risk myeloma and developed a CD27-based, anti-CD70 CAR-T therapy that showed more than 80-fold greater expansion in vivo than single-chain variable fragment (scFv)-based CARs.12 Juwita Werner, MD, and colleagues discovered CLL-1 as a target in juvenile myelomonocytic leukemia and demonstrated that anti-CLL-1 CAR-T therapy effectively reduced leukemia cells and stem cells in vivo.13 Ji Li, PhD, and colleagues developed a T cell-dependent bispecific antibody targeting FcRH5, linked to 1q21 genetic gain in high-risk MM, by analyzing bulk mRNA expression in plasma cells.3
In pediatric myeloid leukemias, fusion proteins from chromosomal translocations drive the disease. In non-Down syndrome acute megakaryoblastic leukemia, the CBFA2T3-GLIS2 fusion is notably aggressive. Thao Tang and colleagues analyzed bulk RNA-seq in acute myeloid leukemia (AML) and identified FOLR1 as uniquely expressed in this genomic subtype. Preclinical studies showed that the FOLR1-targeting antibody-drug conjugate STRO-002 effectively treats this subtype of AML.14
Bulk transcriptome analysis has notable limitations, including a moderate correlation between transcript and protein levels, especially for surface proteins,15 necessitating protein-level validation. It can be affected by non-malignant cells in primary tumors, leading to false negatives, and fails to assess intratumoral heterogeneity, which may result in identifying targets prone to antigen-negative relapse. Additionally, it may miss off-tumor target expression on rare but crucial cell populations in normal tissues.
Single-Cell RNA Sequencing
To address the limitations of bulk transcriptome analysis, scRNAseq offers a promising alternative by analyzing gene expression at the single-cell level rather than averaging across many cells.6 This technique isolates and sequences RNA from individual cells, revealing variability and providing detailed insights into gene expression via specialized bioinformatic routines. Despite its high cost, ongoing efforts to reduce costs per cell and expand datasets are improving its potential for identifying tumor-specific gene expression patterns.16 Adrian Gottschlich, MD, and colleagues analyzed scRNAseq data from more than 500,000 single cells of 15 AML patients and nine healthy individuals. Through detailed single-cell analysis, CSF-1R and CD86 were identified as potential CAR T-cell therapy targets in AML. Functional validation showed these CAR T cells were highly effective in vitro and in vivo using AML models, with minimal off-target effects on healthy tissues.8
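The difference between averaging across cells and measuring each cell can be made concrete with a small sketch. The per-cell values and the positivity threshold below are hypothetical, chosen so that two tumors share the same bulk average while differing in the fraction of antigen-positive cells.

```python
def bulk_mean(cells):
    """Average expression across all cells, as a bulk measurement would report."""
    return sum(cells) / len(cells)

def positive_fraction(cells, threshold=1.0):
    """Fraction of cells with expression at or above a (hypothetical) threshold."""
    return sum(1 for value in cells if value >= threshold) / len(cells)

# Hypothetical per-cell expression of one antigen in two tumors.
uniform_tumor = [5.0] * 10              # every cell expresses the antigen
mixed_tumor = [10.0] * 5 + [0.0] * 5    # half of the cells are antigen-negative

uniform_summary = (bulk_mean(uniform_tumor), positive_fraction(uniform_tumor))
mixed_summary = (bulk_mean(mixed_tumor), positive_fraction(mixed_tumor))
```

Both tumors have the same bulk mean, yet only half of the cells in the mixed tumor carry the antigen. That antigen-negative subpopulation is the kind of heterogeneity a bulk measurement cannot resolve and that may matter for antigen-negative relapse.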
Limitations of scRNAseq include sparse transcript detection, where a cell surface protein may be present even though its transcript is not detected in that cell. Additionally, as in bulk RNA-seq, detected transcripts may not correspond to proteins present on the cell surface.15 Both approaches rely on comparing hits to computational databases of cell surface proteins, potentially overlooking targets that are not classified as canonical cell surface proteins but are aberrantly expressed on tumor cells, and that therefore evade detection by transcript-level analysis alone.
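The database-filtering step mentioned above can be sketched as a simple set lookup. The gene symbols and the annotation set are hypothetical stand-ins for a real list of upregulated transcripts and a curated surface-protein database; the point is only to show how a candidate without a canonical surface annotation drops out of the shortlist.

```python
# Hypothetical gene symbols and annotation set, invented for illustration only.
surface_annotated = {"GENE_A", "GENE_C"}
upregulated_in_tumor = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]

def split_by_annotation(hits, annotated):
    """Separate hits with a surface annotation from those without one."""
    kept = [gene for gene in hits if gene in annotated]
    set_aside = [gene for gene in hits if gene not in annotated]
    return kept, set_aside

kept, set_aside = split_by_annotation(upregulated_in_tumor, surface_annotated)
```

In this toy case GENE_B and GENE_D are set aside even though they are upregulated, so a protein that reaches the tumor cell surface aberrantly would be missed unless the set-aside list is also reviewed or the surface is profiled directly.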
Cell Surface Proteomics
To overcome transcriptome analysis limitations, direct analysis of surface proteins using mass spectrometry-based proteomics is promising. This method allows for selective enrichment of cell surface proteins and unbiased protein assessment, facilitating target discovery without prior target knowledge.17
In one example, cell surface proteomics identified CD72 as a promising target for poor-prognosis KMT2A-rearranged B-ALL. In this work, Matthew A. Nix, PhD, and colleagues selected synthetic CD72-specific nanobodies, incorporated them into CAR T cells, and demonstrated robust activity against B-cell malignancy models, including those with CD19 loss.18,19
Ian D. Ferguson, PhD, and colleagues identified CCR10 as a key target on MM cells using glycoprotein capture proteomics. They engineered CAR T cells incorporating CCL27, the natural ligand of CCR10, which were effective against MM cell lines. They also identified CD53, CD10, EVI2B, and CD33 as markers of resistance to bortezomib and lenalidomide. Lenalidomide treatment enhanced mucin 1-targeting CAR T-cell activity by upregulating the antigen.1 Using surface proteomics, Georgina S. F. Anderson, PhD, and colleagues discovered semaphorin-4A (SEMA4A) as a novel myeloma-associated surface protein.1,2 An antibody-drug conjugate targeting SEMA4A showed potent and selective efficacy in in vitro and in vivo models.2
Using a combination of proteomics and transcriptomics, Fabiana Perna, MD, PhD, and colleagues found optimal CAR targets for AML, developing a strategy to combine multiple targets like CD82, CD70, TNFRSF1B, and CD96.20 Margaux Lejeune, PhD, and colleagues used an algorithm to identify 23 proteins for combinatorial pairing in MM. Flow cytometry confirmed the expression of Fc receptor-like protein 5, BCMA, interleukin-6 receptor, endothelin receptor type B (ET-B), and SLCO5A1 in primary samples. Additionally, a monoclonal antibody (RB49) was developed to target ET-B, recognizing an epitope that becomes accessible after ET-B activation.4
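One way to picture combinatorial pairing is to score every pair of candidate antigens by how many tumor cells express at least one member of the pair and how many normal cells would also be hit. The sketch below is a minimal illustration under stated assumptions: the antigen names and per-cell positivity calls are hypothetical, the scoring rule is deliberately simple, and it is not the algorithm used in the studies cited above.

```python
from itertools import combinations

# Hypothetical per-cell positivity calls (True = antigen detected), invented for illustration.
tumor_cells = [
    {"AG1": True, "AG2": False, "AG3": False},
    {"AG1": False, "AG2": True, "AG3": False},
    {"AG1": True, "AG2": True, "AG3": False},
    {"AG1": False, "AG2": False, "AG3": True},
]
normal_cells = [
    {"AG1": False, "AG2": False, "AG3": True},
    {"AG1": False, "AG2": False, "AG3": False},
]

def either_positive_fraction(cells, pair):
    """Fraction of cells expressing at least one antigen of the pair."""
    first, second = pair
    return sum(1 for cell in cells if cell[first] or cell[second]) / len(cells)

def rank_pairs(tumor_cells, normal_cells, antigens):
    """Rank antigen pairs: lowest normal-cell hit rate first, then highest tumor coverage."""
    results = []
    for pair in combinations(antigens, 2):
        coverage = either_positive_fraction(tumor_cells, pair)
        off_tumor = either_positive_fraction(normal_cells, pair)
        results.append((pair, coverage, off_tumor))
    return sorted(results, key=lambda row: (row[2], -row[1]))

ranked = rank_pairs(tumor_cells, normal_cells, ["AG1", "AG2", "AG3"])
```

With these toy data all three pairs cover the same fraction of tumor cells, but only the AG1 and AG2 pair spares the normal cells, so it ranks first. Real pairing strategies weigh additional factors, such as expression level and the tissues in which off-tumor expression occurs.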
Kamal Mandal, PhD, and colleagues used “structural surfaceomics,” a method combining cross-linking mass spectrometry with glycoprotein surface capture, to reveal unique, cancer-specific surface protein conformations not detectable by standard analyses. They identified the activated form of integrin beta-2 as an AML-specific target and developed recombinant antibodies against this conformation. These scFv antibody sequences enabled CAR T cells to effectively target AML cells and patient-derived xenografts with minimal toxicity to normal cells.21
Mass spectrometry, while useful for target discovery, has several limitations: it cannot resolve intratumoral heterogeneity, often captures intracellular proteins despite strongly enriching cell surface proteins, and requires specific datasets for each tumor type. Additionally, there is a lack of large-scale, high-quality datasets for normal tissue surface proteins, making comparisons with tumor samples challenging.
Single-Cell, Antibody-Based Proteomics
Single-cell proteomics using large antibody panels complements bulk cell surface proteomics by offering detailed insights at the individual cell level. Key methods include cytometry by time of flight (CyTOF) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq). CyTOF combines flow cytometry with mass spectrometry to measure more than 40 parameters per cell using isotopic barcodes.22 CITE-seq integrates protein and transcriptome data in single cells using oligonucleotide-labeled antibodies, allowing measurement of up to 300 surface proteins.23
CyTOF profiling by Muharrem Muftuoglu, MD, and colleagues revealed variable CD123 expression in relapsed and refractory AML samples, with CD33 being a key marker for differentiated leukemic blasts. CLL-1 was identified as a potential target, though not present in all patients.24 Katherine Knorr, MD, PhD, and colleagues used CITE-seq to find the RNA helicase U5 snRNP200 on the AML cell surface, present in about one-third of samples, and demonstrated its effectiveness in AML models when targeted with anti-U5 snRNP200 antibodies and combined with azacytidine.7
These methods offer detailed insights into tumor cell surface proteins, allowing a single-cell approach to discover unexpected patterns of expression and cell subtypes, but they require prior knowledge and high-quality antibodies, limiting their ability to discover entirely new targets.
Conclusion
Here, we provide a concise overview of various technologies and their role in driving discoveries of new immunotargets in hematologic malignancies. Innovative immune-based approaches have generated excitement in hematologic malignancy treatment by targeting tumor-selective antigens and sparing healthy tissues. Monoclonal antibodies and, more recently, bispecific antibodies, along with CAR T-cell therapies, have significantly improved overall survival and progression-free survival.25 While significant progress has been made in target discovery and translating therapies into clinical practice, new treatment options are still needed for patients with relapsed and refractory diseases. Therefore, it is essential to continue advancing target discovery methods to create effective and safe immune-cellular therapies.
Disclosure Statement
Drs. Patiño-Escobar and Perez-Lugo indicated no relevant conflicts of interest. Dr. Wiita acknowledges being an equity holder in Indapta Therapeutics and receiving an honorarium from Sanofi.