Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome

Shin, Jessica; Monti, Stefano; Aires, Daniel J.; Duvic, Madeleine; Golub, Todd; Jones, David A.; Kupper, Thomas S.

doi:10.1182/blood-2006-12-061507

Abstract

Cutaneous T-cell lymphoma (CTCL) is defined by infiltration of activated and malignant T cells in the skin. The clinical manifestations and prognosis in CTCL are highly variable. In this study, we hypothesized that gene expression analysis in lesional skin biopsies can improve understanding of the disease and its management. Based on 63 skin samples, we performed consensus clustering, revealing 3 patient clusters. Of these, 2 clusters tended to differentiate limited CTCL (stages IA and IB) from more extensive CTCL (stages IB and III). Stage IB patients appeared in both clusters, but those in the limited CTCL cluster were more responsive to treatment than those in the more extensive CTCL cluster. The third cluster was enriched in lymphocyte activation genes and was associated with a high proportion of tumor (stage IIB) lesions. Survival analysis revealed significant differences in event-free survival between clusters, with poorest survival seen in the activated lymphocyte cluster. Using supervised analysis, we further characterized genes significantly associated with lower-stage/treatment-responsive CTCL versus higher-stage/treatment-resistant CTCL. We conclude that transcriptional profiling of CTCL skin lesions reveals clinically relevant signatures, correlating with differences in survival and response to treatment. Additional prospective long-term studies to validate and refine these findings appear warranted.

Introduction

Cutaneous T-cell lymphoma (CTCL) refers to a heterogeneous group of non-Hodgkin lymphomas that share the common feature of infiltration of the skin with malignant T cells.² The most common form of CTCL is historically known as mycosis fungoides (MF), and it has a variable clinical course. While many patients have a relatively indolent course, others experience rapidly progressive and often lethal disease. Like other lymphomas, prognosis is linked, albeit imprecisely, to clinical stage at presentation. Patients with stage IA disease have normal life expectancies, while those in stage IB have median survivals of 12.8 years, and median survival of stage IIB disease is only 3.2 years.³ Other variants of CTCL are less common, and include diseases with more or less aggressive clinical courses.^2,4 Sézary syndrome (SS) is the leukemic variant of CTCL with a 5-year survival of only 11%.⁵

In early disease, activated reactive but benign T cells greatly outnumber malignant cells in skin, and the clinical appearance is that of a patch or a thin plaque of an inflammatory skin disease.⁶ The cytokine milieu of the involved skin is dominated by type 2 T-cell cytokines, and the infiltrating cells are typically positive for cutaneous lymphocyte antigen (CLA) and CCR4.⁷ As disease progresses, the ratio of malignant cells to reactive cells increases, and thick plaques and tumors are characterized by loss of epidermotropism, increase in dermal lymphocytic infiltrate, and size and atypia of cells.⁸ The etiology of CTCL remains obscure, and we have incomplete understanding concerning the initiation of malignant disease and the role of processes such as genomic instability or antigenic stimulation in disease progression. At this time, it is impossible to predict which patients with early-stage CTCL will progress, and which will continue to manifest indolent skin-limited disease.⁹

Transcriptional analysis of cancer has proven to be a powerful and increasingly useful tool in biomedical research.^10-12 These patterns of gene expression have revealed unappreciated complexity in histologically homogeneous diseases like large-cell B-cell lymphoma. A goal of many studies is to link gene expression patterns revealed by transcriptional profiling to both understanding pathophysiology of disease and predicting prognosis and responsiveness to therapy.^13-15 These approaches have been applied to CTCL in a limited way to characterize leukemic cells in SS,¹⁶ to distinguish MF from benign dermatoses,¹⁷ and to predict responsiveness to interferon.¹⁸

In the present study, we use transcriptional analysis to gain insight into the pathophysiologic and prognostic features of CTCL. We evaluated 62 patients with various stages of CTCL for whom detailed clinical history was available. Lesional biopsies were analyzed via transcriptional profiling using oligonucleotide microarrays (Affymetrix HG-U133 chips; Santa Clara, CA). We present a robust unsupervised analysis of these samples with clustering into 3 groups, and explore the molecular pathways implied by the resultant clustering. We purposefully sought a rigorous assessment of optimal cluster number in order to better appreciate categories of disease process. We used supervised analysis to determine signatures of early- and late-stage CTCL and to evaluate responsiveness to therapy and clinical course. Taken together, these studies shed light on functional pathways in CTCL and suggest an approach to further predictive studies.

Patients, materials, and methods

Samples

All patients were enrolled in a study protocol approved by the Dana-Farber Cancer Institute's Institutional Review Board. Informed consent was obtained in accordance with the Declaration of Helsinki. Patients were recruited from the Cutaneous Lymphoma Clinic at the Dana Farber Cancer Institute (DFCI)/Brigham and Women's Hospital (BWH). One sample was obtained through collaboration with MD Anderson. A total of 63 6-mm punch biopsies from lesional skin were collected from 62 patients between January 26, 2003, and June 1, 2005. One patient with patch/plaque disease was biopsied on 2 separate occasions from similar lesions at different body sites in order to provide methodologic verification through consistent clustering patterns. Patients enrolled were at various stages of disease, including 13 patients with limited patches and plaques covering less than 10% of body surface area (stage IA), 29 with generalized patches and plaques covering more than 10% of body surface area (stage IB), 8 with tumors (stage IIB), and 12 with generalized erythroderma (stage III). One-third of patients presented for diagnostic biopsy and were treatment naive, whereas the remainder had already started treatments with various therapies. Patients were diagnosed and staged based on clinical history and physical exam, biopsy findings, flow cytometry, and computed tomography (CT) studies. An extensive chart review was conducted to collect information on clinical parameters and outcomes. Data were stored in a study database. Mean age of patients with CTCL in this study was 60 years (range, 25-90 years). A total of 60% of patients were male.

Target preparation

Skin biopsies of CTCL skin lesions taken with a 6-mm punch were obtained and immediately snap-frozen in liquid nitrogen. Tissue was powdered in liquid nitrogen (Cryo-Press; Microtec Co, Chiba, Japan), and total RNA was extracted using Trizol (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. RNA was then labeled using the standard Affymetrix in vitro transcription labeling protocol, and 10 μg cRNA was hybridized to Human Genome U133A GeneChip oligonucleotide arrays, which consist of 22 283 probe sets representing approximately 13 500 named genes.

Data analysis and filtering

Gene expression data were processed using Robust Multi-Chip Average (RMA)¹⁹ as implemented in the R statistical package provided by Bioconductor (Seattle, WA).^20,21 RMA data processing includes background correction, quantile normalization, and expression summarization.

Genes were filtered in order to eliminate those with cross-sample variation using a variation filter²² estimated on 14 HeLa cell lines and 5 Strategene samples (Stratagene, La Jolla, CA; Figure 1). We used this filtering to define 2 gene sets, 1 with 6997 genes (α = 0.999) and another with 4204 genes (α = 0.9999). Unsupervised analyses used the 4204-gene set, reflecting the more stringent statistical criteria required when samples and genes are allowed to cluster naturally. Supervised analyses used the 6997-gene set.

View large Download PPT

Figure 1

Gene filtering. A total of 22 283 probe sets were filtered using a noise envelope to eliminate genes with little cross-chip variability. The noise envelope was estimated on 14 HeLa cell lines and 5 Strategene samples. Two filters were used, resulting in gene sets of 6997 (α = 0.999) and 4204 (α = 0.9999) probe sets. Unsupervised analyses were done using the 4204-gene filter. Supervised analyses were done using the 6997-gene filter.

Unsupervised analysis using consensus clustering

In this analysis, 2 unsupervised clustering techniques were used: hierarchical clustering (HC)²³ and self-organizing maps (SOMs).²⁴ The 2 clustering techniques were compared using consensus clustering²⁵ in order to assess the stability of the identified clusters and to evaluate sensitivity to sampling variability. Confusion matrices for gene as well as sample assignment were generated to assess agreement between clusters produced by the different clustering algorithms.

All heat maps for visualization were generated using DChip.²⁶ Fold changes were calculated based on average expression for class versus nonclass.

Supervised analysis

Statistical tests to compare predefined classes of samples based on phenotypic differences were performed using GenePattern (Broad Institute, MIT, Cambridge, MA, http://www.broad.mit.edu/cancer/software/genepattern/).^27,28 In particular, we used the Comparative Marker Selection suite, which computes statistical significance for each gene and corrects for multiple hypothesis testing (MHT).²⁹ A total of 10 000 permutations were performed for the estimation of empirical p-values, which were then corrected for MHT by the false discovery rate (FDR) procedure.³⁰ Genes with an FDR value of less than or equal to .05 were considered significant in our study.

Gene pathway analysis

A total of 2 different algorithms were used to evaluate contribution of gene pathways to the transcriptional differentiation of samples.

GO analysis.

Gene Ontology (GO) is a collaborative and comprehensive gene annotation resource compiled by the Gene Ontology Consortium.³¹ GO annotations were obtained from Affymetrix and analyzed using GeneMerge.³² GeneMerge uses defined class gene markers and calculates enrichment for GO pathways.

GSEA.

Gene set enrichment analysis (GSEA) was performed by generating a ranked list of genes differentiating defined classes and comparing this ranked list with specified gene sets.³³ Genes were ranked by signal to noise ratio. An enrichment score was generated for each gene set. Statistical significance was then estimated using phenotype-based permutations, with the attained P values adjusted for MHT. A total of 539 gene sets generated from several independent sources were used in our analysis. Among the resources used to compile these gene sets were (1) Biocarta (Biocarta, San Diego, CA), with 274 pathways involved in adhesion, apoptosis, cell activation, cell-cycle regulation, cell signaling, cytokines and chemokines, developmental biology, hematopoiesis, immunology, metabolism, and neurosciences; (2) GenMAPP (Gene MicroArray Pathway Profiler; www.genmapp.org), a resource for metabolic pathways, cell signaling processes, and physiologic functions, with 67 pathways; (3) Signal Transduction Knowledge Environment (STKE), a branch of the journal Science that serves as an online resource for signal transduction research and includes 29 signal transduction pathways; (4) SigmaAldrich (Sigma-Aldrich, St Louis, MO), with 12 pathways, involved in the cell cycle, mitogen-activated protein (MAP) kinase signaling, and apoptosis; (5) GO, from which were culled 23 pathways; and (6) the GEArray pathway-specific arrays made by SuperArray (SuperArray Bioscience, Frederick, MD), from which were derived 3 pathways involved in calcineurin-NFAT signaling, breast cancer and estrogen signaling, and bcl2 regulation. A total of 63 manually curated pathways involved in mitochondrial function and metabolism were also included in the analysis. An additional 59 pathways were generated from the current literature.

Results

Identification of the consensus number of clusters

We used HC and SOM algorithms to differentiate biologically meaningful subsets of CTCL with similar transcription profiles. Based on consensus analysis, we determined that the most robust clustering structure included 3 distinct clusters (Figure 2). This clustering showed a high level of agreement between clusters produced by HC and SOMs, with approximately 89% of samples and more than 90% of genes assigned to the same clusters (Figure 3).

View large Download PPT

Figure 2

Consensus clustering. Consensus clustering indicates that 3 clusters is the greatest number of stable clusters across clustering techniques. Consensus heat maps produced by HC and SOMs for cluster number k = 3 are shown. The consensus heat map is a visual representation of the consensus matrix, which is a matrix of sample pairs. Each matrix entry is the proportion of times the pair's samples are clustered together across resampling iterations. In the heat map, a value of 1 (corresponding to 2 samples that are always clustered together) is represented by the color red, and a value of 0 (corresponding to 2 samples that are never clustered together) is represented by the color white.

View large Download PPT

Figure 3

Consensus clustering overlap matrices for the 3-cluster model.(A) Sample overlap comparing sample assignment between HC and SOMs. Entries are the number of samples in overlapping clusters. (B) Marker overlap between HC and SOMs. Gene markers for k = 3 clusters using HC and SOMs were evaluated to assess the number of markers held in common between clustering techniques. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) / sqrt (expected).

One patient was biopsied at 2 different time points and 2 different body sites to verify reproducibility of the clustering. Clustering performed with each sample separately resulted in the same cluster assignment (cluster 1) both times.

Characterization of the consensus clusters

We next examined the gene sets involved in the 3-cluster structure produced by consensus clustering. Using standard differential analysis, we identified 593 genes with a P value less than or equal to .025 and fold change greater than or equal to 2 (Figure 4). We then further characterized the clusters identified by unsupervised analysis based on sample phenotype as well as gene identity.

View large Download PPT

Figure 4

Hierarchical clustering heat map. This heat map shows partitioning of 63 samples and the top 593 genes with P values less than or equal to .025 and fold change greater than or equal to 2. The chart shows the top 15 GO terms for each cluster, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have P values less than or equal to .05 and FDR less than or equal to .05.

Phenotype correlation.

Sample clustering appeared to correlate with clinical stage (Figure 5). Cluster 1 contained a mix of patients with stages IA, IB, IIB, and III disease. This cluster included 75% (6 of 8) of all stage IIB (tumor stage) samples. Cluster 2 contained primarily stage IA and IB disease, with only 2 patients with higher-stage disease. Cluster 3 contained only 1 stage IA patient and a relatively large proportion of stage III disease (58%, or 7 of 12 stage III patients).

View large Download PPT

Figure 5

Cluster composition based on clinical stage. Entries are presented as: residual [no. observed | no. expected], where residual = (observed −expected) /sqrt (expected). Yellow indicates relative deficiency of cluster with respect to the clinical stage, and green indicates enrichment of cluster with respect to the clinical stage, comparing observed with expected composition. Sample phenotype correlation showed cluster 1 to be composed of a roughly equivalent proportion of clinical stages. However, this breakdown was statistically enriched with stage IIB, or tumor stage, disease and lacking in stage IB disease. Cluster 2 was enriched with patch/plaque stage disease, with only 2 samples categorized as stage IIB and III. Cluster 3 showed depletion in stage IA and IIB disease and enrichment in stage IB and III disease. These findings were statistically significant, with a P value of .004.

Among patients with CD30⁺ disease, all 5 were in cluster 1. Interestingly, none of the 5 patients with CD8⁺ disease were found in cluster 1. Of the patients with CD8⁺ disease, 2 were distributed to cluster 2 and 3 to cluster 3. Folliculotropic disease was found to be somewhat more predominant in cluster 3, with 2 of 7 patients found in cluster 1, 1 of 7 patients in cluster 2, and 4 of 7 patients in cluster 3.

The 3 clusters were also characterized by differences in survival. Cluster 1 included 4 deaths: 3 stage IIB patients and 1 stage IB patient. Deaths occurred at 6 months after biopsy for the stage IB patient, and at 3 months, 3 months, and 4 months, respectively, for the 3 stage IIB patients. In all cases, death occurred after developing metastatic and/or leukemic involvement and failing multiple treatment regimens, including systemic therapies. Cluster 3 included 2 deaths at 5 and 9 months, respectively, both of whom were patients with SS. In comparison, the single death in cluster 2 was due to an unrelated medical condition, occurring 20 months after biopsy.

We first examined the impact of prebiopsy treatments on the clusters, and did not find a significant correlation between particular treatment regimens and clusters, suggesting that the clustering patterns were not driven by treatment signatures (Table 1). Examination of treatment response showed that clusters 1 and 3 are associated with more aggressive, less treatment-responsive disease. Recalcitrant disease across stages was more commonly found in clusters 1 and 3. In particular, 2 patients assigned to cluster 1 have since undergone bone marrow transplantation as a last resort to control their disease, and 1 patient had progressed from stage IA disease to large-cell transformation after the biopsy was taken, with subsequent persistent disease despite several treatment regimens, including denileukin diftitox (ONTAK) total skin electron beam therapy, and a suberoylanilide hydroxamic acid (SAHA) trial. By contrast, patients in cluster 2 generally exhibited extremely good response to treatment across all stages of disease. All but 3 patients are currently in remission or under good control on topical therapies, including steroids, tacrolimus, bexarotene, and/or phototherapy, administered as PUVA, narrow-band UVB, or natural sunlight. The only stage IIB patient went into remission on PUVA alone. Among stage IB patients, those who were more difficult to treat, characterized by longer time to remission, failure to respond to standard treatment regimens, and use of therapies more commonly reserved for more advanced disease, tended to fall into clusters 1 and 3. Stage IB patients whose disease was more amenable to treatment tended to fall into cluster 2. Among stage IB patients in cluster 2, 9 of 11 were maintained on intermittent topical therapies, including topical steroids, bexarotene gel, and/or phototherapy, without progression of disease. The stage IB patients not in cluster 2 included 1 patient in cluster 1 who developed metastatic disease and died despite the use of ONTAK, oral bexarotene, total skin electron beam radiation, extracorporeal photopheresis, interferon, and single- and multiple-agent chemotherapy. A total of 7 of the 18 stage IB patients in clusters 1 and 3 were placed on systemic therapies, including oral bexarotene, prednisone, extracorporeal photopheresis, interferon, ONTAK, methotrexate, chemotherapy, and SAHA.

Table 1

Cluster composition compared with therapeutic interventions by the time of biopsy

Therapy	Patients, %			P
Therapy	Cluster 1	Cluster 2	Cluster 3	P
No therapy	33	47	23	.400
Topical corticosteroids	62	53	73	.719
Topical tacrolimus	14	16	0	.550
Topical nitrogen mustard	10	11	0	.672
Topical bexarotene	14	5	9	.653
PUVA	29	26	23	.930
UVB (narrow or broadband)	10	16	18	.746
Electron beam radiation (total skin or local)	14	5	9	.653
Oral bexarotene	29	11	23	.449
ONTAK (denileukin diftitox)	24	5	14	.304
Extracorporeal photopheresis	14	11	18	.814
Interferon	19	5	23	.351
Oral corticosteroids	29	5	23	.223
Chemotherapy (single or multiple agent)	19	11	23	.642
Suberoylanilide hydroxamic acid (SAHA)	0	0	5	.556

Therapy	Patients, %			P
Therapy	Cluster 1	Cluster 2	Cluster 3	P
No therapy	33	47	23	.400
Topical corticosteroids	62	53	73	.719
Topical tacrolimus	14	16	0	.550
Topical nitrogen mustard	10	11	0	.672
Topical bexarotene	14	5	9	.653
PUVA	29	26	23	.930
UVB (narrow or broadband)	10	16	18	.746
Electron beam radiation (total skin or local)	14	5	9	.653
Oral bexarotene	29	11	23	.449
ONTAK (denileukin diftitox)	24	5	14	.304
Extracorporeal photopheresis	14	11	18	.814
Interferon	19	5	23	.351
Oral corticosteroids	29	5	23	.223
Chemotherapy (single or multiple agent)	19	11	23	.642
Suberoylanilide hydroxamic acid (SAHA)	0	0	5	.556

Table shows percentage of patients in each cluster who had been treated with the given therapy by the time of biopsy. Clustering did not correlate with particular therapies at the time of biopsy. The chi-squared distribution was used to calculate the P value.

Although differences in survival among clusters did not reach significance, differences in event-free survival, such as disease progression or large-cell transformation, did. Event-free survival curves are shown in Figure 6.

View large Download PPT

Figure 6

Event-free Kaplan-Meier curves. Survival was evaluated using Kaplan-Meier analysis. Events included disease progression, large-cell transformation, and death. There were 10 events, including 7 deaths, 1 large-cell transformation after biopsy, and 3 patients who progressed despite systemic treatments. The log-rank test was used to determine significance of differences between survival curves. Event-free survival is shown (A) by clinical stage (P < .001); (B) by hierarchical cluster, using a 3-cluster partition (P = .028); and (C) by hierarchical cluster, using a 2-cluster partition (P = .02).

Genes and gene pathways implicated in differentiating CTCL subgroups.

We identified dominant genes differentiating the 3 CTCL clusters by applying the t statistic and determining fold change.

Cluster 1, composed of a mix of clinical stages, showed significant up-regulation of many genes involved in immune response and T-cell activation (Table 2). Notable among these are all 3 components of the IL-2 receptor, lymphocyte-specific protein tyrosine kinase (LCK), PIK3CD, T-cell receptor α and β chains, CD3, ζ-chain–associated protein kinase (ZAP70), FYN-binding protein (FYB), T-cell receptor–interacting molecule (TCRIM), CD69, and linker for activation of T cells. SH2D1A, along with SLAMF1 (fold change, 2.99; P = .003), has been implicated in TCR activation; deficiency has been associated with the development of fatal X-linked lymphoproliferative disorder after Epstein-Barr virus (EBV) infection.³⁴ CD8⁺ T-cell markers, including granulysin, were also up-regulated in Cluster 1.

Table 2

Cluster 1 selected up-regulated genes

Gene symbol	Gene name	Fold change	P
IGLC1; IGLC2	Immunoglobulin lambda constant and variable	8.63	.035
IL26	Interleukin-26	8.44	.043
IGHM	Immunoglobulin heavy constant mu	7.16	.025
POU2AF1	POU domain, class 2, associating factor 1	5.93	.016
TNFRSF17	Tumor necrosis factor receptor superfamily, member 17	5.78	.040
T3JAM	TRAF3-interacting Jun N-terminal kinase (JNK)–activating modulator	4.73	< .001
LEF1	Lymphoid enhancer-binding factor 1	4.66	.004
SH2D1A	SH2 domain protein 1A, Duncan disease	4.61	.006
GNLY	Granulysin	4.56	.041
PTPN7	Protein tyrosine phosphatase, nonreceptor type 7	4.45	< .001
FYB	FYN binding protein (FYB-120/130)	4.44	< .001
TNFSF14	Tumor necrosis factor (ligand) superfamily, member 14	4.16	< .001
LCK	Lymphocyte-specific protein tyrosine kinase	4.11	< .001
RAC2	Ras-related C3 botulinum toxin substrate 2	4.10	< .001
IL21R	Interleukin-21 receptor	4.09	< .001
TCRIM	T-cell receptor interacting molecule	4.01	< .001
ZAP70	ζ-chain (TCR)–associated protein kinase 70 kDa	3.89	< .001
TNFSF4	Tumor necrosis factor (ligand) superfamily, member 4	3.86	.010
CCR8	Chemokine (C-C motif) receptor 8	3.65	.031
FUT7	Fucosyltransferase 7 (alpha (1,3) fucosyltransferase)	3.64	.007
SELL	Selectin L (lymphocyte adhesion molecule 1)	3.64	.012
CD3G	CD3G antigen, gamma polypeptide (TiT3 complex)	3.64	< .001
CCL4	Chemokine (C-C motif) ligand 4	3.62	.042
ITK	IL-2–inducible T-cell kinase	3.61	< .001
CXCL13	Chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant)	3.60	.016
IL2RA	Interleukin-2 receptor α	3.49	.015
TNIK	TRAF2 and NCK interacting kinase	3.40	< .001
IL2RB	Interleukin-2 receptor β	3.29	< .001
TNFRSF7	Tumor necrosis factor receptor superfamily, member 7	3.23	< .001
TRBC1	T-cell receptor β constant 1	3.20	< .001
LAT	Linker for activation of T cells	3.16	< .001
ST8SIA1	ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1	3.01	.006
IL2RG	Interleukin-2 receptor γ (severe combined immunodeficiency)	3.01	< .001
LILRB4	Leukocyte immunoglobulin-like receptor, subfamily B, member 4	2.95	.007
CCR4	Chemokine (C-C motif) receptor 4	2.91	.003
ITGAL	Integrin α_L	2.91	< .001
CD3E	CD3E antigen ϵ polypeptide (TiT3 complex)	2.89	< .001
CD3Z	CD3Z antigen ζ polypeptide (TiT3 complex)	2.88	< .001
PRF1	Perforin 1 (pore-forming protein)	2.86	.009
CCR5	Chemokine (C-C motif) receptor 5	2.84	.008
LTB	Lymphotoxin β (TNF superfamily, member 3)	2.82	.001
CCR7	Chemokine (C-C motif) receptor 7	2.72	< .001
BIRC3	Baculoviral IAP repeat–containing 3	2.71	.005
MAP4K1	Mitogen-activated protein kinase 1	2.68	< .001
IL27RA	Interleukin-27 receptor α	2.67	< .001
CD69	CD69 antigen (p60, early T-cell activation antigen)	2.67	.001
IFNG	Interferonγ	2.66	.009
CXCR4	Chemokine (C-X-C motif) receptor 4	2.66	< .001
CD3D	CD3D antigen δ polypeptide (TiT3 complex)	2.58	< .001
ST8SIA4	ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 4	2.57	.001
MYB	v-myb myeloblastosis viral oncogene homolog (avian)	2.55	.006
ITGB7	Integrin β₇	2.50	.002
LPXN	Leupaxin	2.40	< .001
PTPRC	Protein tyrosine phosphatase receptor type C	2.27	< .001
TRA@	T-cell receptor α locus	2.25	< .001
EVI2B	Ecotropic viral integration site 2B	2.19	< .001
TNF	Tumor necrosis factor (TNF superfamily, member 2)	1.90	< .001
TNFAIP8	Tumor necrosis factorα–induced protein 8	1.87	< .001
WIF1	WNT inhibitory factor 1	4.93	.003
IL1F7	Interleukin-1 family, member 7 (ζ)	4.17	< .001
C1orf68	Chromosome 1 open reading frame 68	3.71	< .001
PSORS1C2	Psoriasis susceptibility 1 candidate 2	3.06	< .001

Gene symbol	Gene name	Fold change	P
IGLC1; IGLC2	Immunoglobulin lambda constant and variable	8.63	.035
IL26	Interleukin-26	8.44	.043
IGHM	Immunoglobulin heavy constant mu	7.16	.025
POU2AF1	POU domain, class 2, associating factor 1	5.93	.016
TNFRSF17	Tumor necrosis factor receptor superfamily, member 17	5.78	.040
T3JAM	TRAF3-interacting Jun N-terminal kinase (JNK)–activating modulator	4.73	< .001
LEF1	Lymphoid enhancer-binding factor 1	4.66	.004
SH2D1A	SH2 domain protein 1A, Duncan disease	4.61	.006
GNLY	Granulysin	4.56	.041
PTPN7	Protein tyrosine phosphatase, nonreceptor type 7	4.45	< .001
FYB	FYN binding protein (FYB-120/130)	4.44	< .001
TNFSF14	Tumor necrosis factor (ligand) superfamily, member 14	4.16	< .001
LCK	Lymphocyte-specific protein tyrosine kinase	4.11	< .001
RAC2	Ras-related C3 botulinum toxin substrate 2	4.10	< .001
IL21R	Interleukin-21 receptor	4.09	< .001
TCRIM	T-cell receptor interacting molecule	4.01	< .001
ZAP70	ζ-chain (TCR)–associated protein kinase 70 kDa	3.89	< .001
TNFSF4	Tumor necrosis factor (ligand) superfamily, member 4	3.86	.010
CCR8	Chemokine (C-C motif) receptor 8	3.65	.031
FUT7	Fucosyltransferase 7 (alpha (1,3) fucosyltransferase)	3.64	.007
SELL	Selectin L (lymphocyte adhesion molecule 1)	3.64	.012
CD3G	CD3G antigen, gamma polypeptide (TiT3 complex)	3.64	< .001
CCL4	Chemokine (C-C motif) ligand 4	3.62	.042
ITK	IL-2–inducible T-cell kinase	3.61	< .001
CXCL13	Chemokine (C-X-C motif) ligand 13 (B-cell chemoattractant)	3.60	.016
IL2RA	Interleukin-2 receptor α	3.49	.015
TNIK	TRAF2 and NCK interacting kinase	3.40	< .001
IL2RB	Interleukin-2 receptor β	3.29	< .001
TNFRSF7	Tumor necrosis factor receptor superfamily, member 7	3.23	< .001
TRBC1	T-cell receptor β constant 1	3.20	< .001
LAT	Linker for activation of T cells	3.16	< .001
ST8SIA1	ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1	3.01	.006
IL2RG	Interleukin-2 receptor γ (severe combined immunodeficiency)	3.01	< .001
LILRB4	Leukocyte immunoglobulin-like receptor, subfamily B, member 4	2.95	.007
CCR4	Chemokine (C-C motif) receptor 4	2.91	.003
ITGAL	Integrin α_L	2.91	< .001
CD3E	CD3E antigen ϵ polypeptide (TiT3 complex)	2.89	< .001
CD3Z	CD3Z antigen ζ polypeptide (TiT3 complex)	2.88	< .001
PRF1	Perforin 1 (pore-forming protein)	2.86	.009
CCR5	Chemokine (C-C motif) receptor 5	2.84	.008
LTB	Lymphotoxin β (TNF superfamily, member 3)	2.82	.001
CCR7	Chemokine (C-C motif) receptor 7	2.72	< .001
BIRC3	Baculoviral IAP repeat–containing 3	2.71	.005
MAP4K1	Mitogen-activated protein kinase 1	2.68	< .001
IL27RA	Interleukin-27 receptor α	2.67	< .001
CD69	CD69 antigen (p60, early T-cell activation antigen)	2.67	.001
IFNG	Interferonγ	2.66	.009
CXCR4	Chemokine (C-X-C motif) receptor 4	2.66	< .001
CD3D	CD3D antigen δ polypeptide (TiT3 complex)	2.58	< .001
ST8SIA4	ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 4	2.57	.001
MYB	v-myb myeloblastosis viral oncogene homolog (avian)	2.55	.006
ITGB7	Integrin β₇	2.50	.002
LPXN	Leupaxin	2.40	< .001
PTPRC	Protein tyrosine phosphatase receptor type C	2.27	< .001
TRA@	T-cell receptor α locus	2.25	< .001
EVI2B	Ecotropic viral integration site 2B	2.19	< .001
TNF	Tumor necrosis factor (TNF superfamily, member 2)	1.90	< .001
TNFAIP8	Tumor necrosis factorα–induced protein 8	1.87	< .001
WIF1	WNT inhibitory factor 1	4.93	.003
IL1F7	Interleukin-1 family, member 7 (ζ)	4.17	< .001
C1orf68	Chromosome 1 open reading frame 68	3.71	< .001
PSORS1C2	Psoriasis susceptibility 1 candidate 2	3.06	< .001

LEF1 and TCF7, both of which were up-regulated in cluster 1, are downstream members of the WNT signaling pathway. A recent study has shown that several isoforms of LEF1 and TCF7 are up-regulated in naive CD8⁺ T cells, and down-regulated by TCR activation.³⁵ Both LEF1 and TCF7 have been shown to be expressed in lymphocytic leukemias, including a subset of peripheral T-cell lymphomas and T-cell acute lymphocytic leukemia (T-ALL).^36,37 Based on activation markers, studies suggest that LEF1 and TCF7 are up-regulated in the setting of T-helper 1 (Th1) as compared with Th2 activation.³⁷

Also up-regulated in cluster 1 were several central memory T-cell markers and genes involved in lymphocyte homing: SELL (L-selectin), CCR7, and FUT7 (fucosyltransferase 7), an important regulator of selectin ligands, including CLA.^38,39

Tumor necrosis factor (TNF) pathway genes, such as TNFRSF17, TNFRSF14, TNFSF4, TNFRSF7, LTB(lymphotoxin β), TNF, TRAF1, T3JAM, and TNFAIP8, are also up-regulated in this cluster. TNFRSF14, also known as herpesvirus entry mediator (HVEM), has been shown to interact with TNFSF14(also known as LIGHT) and LTA (lymphotoxin A) resulting in costimulation of T cells.^40,41BIRC1, BIRC3, and BIRC5 were also significantly up-regulated in this cluster. BIRC3 is a member of the inhibitor of apoptosis (IAP) gene family, and inhibits apoptosis by binding to TNF receptor–associated factors TRAF1 and TRAF2.⁴² The previous work of Tracey et al¹⁷ also emphasized TNF antiapoptotic pathway activation in MF, consistent with these results.

Interestingly, several B-cell–related genes were found to be up-regulated in this lymphocyte activation cluster. Immunoglobulin genes are among the most up-regulated by fold change, although the significance of this result (P ≤ .05) is lower than that of many other genes. POU2AF1 has been implicated in CLL.⁴³

Cluster 2, consisting of mostly stage IA and IB patients, showed significant up-regulation of several genes involved in keratinocyte and epidermal differentiation and proliferation. Selected genes are shown in Table 3. Among the genes up-regulated in this cluster were keratinocyte-related genes, including loricrin (LOR), late cornified envelope 2B (LCE2B), and keratin 2A (KRT2A). Members of the WNT signaling pathway were also up-regulated, including catenin β-interacting protein 1 (CTNNBIP1), transcription factor 7-like 2 (TCF7L2), and transcription factor 7-like 1 (TCF7L1).

Table 3

Cluster 2 selected up-regulated genes

Gene symbol	Gene name	Fold change	P
PPFIBP1	PTPRF interacting protein, binding protein 1 (liprin beta 1)	2.71	< .001
CST6	Cystatin E/M	2.70	< .001
NFIC	Nuclear factor I/C (CCAAT-binding transcription factor)	2.69	< .001
FCGBP	Fc fragment of IgG-binding protein	2.67	< .001
LAMC3	Laminin, γ 3	2.53	.003
LOR	Loricrin	2.47	< .001
EPS8L1	Epidermal growth factor receptor pathway substrate 8 (EPS8)–like 1	2.47	< .001
SEMA3B	Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B	2.45	< .001
LCE2B	Late cornified envelope 2B	2.31	< .001
CTNNBIP1	Catenin, β-interacting protein 1	2.26	< .001
PARD3	Par-3 partitioning defective 3 homolog (C elegans)	2.20	< .001
MATN4	Matrilin 4	2.18	.033
HLA-DQB2	Major histocompatibility complex, class II, DQ β 2	2.10	.001
TCF7L1	Transcription factor 7-like 1 (T-cell specific, HMG-box)	2.05	< .001
CD207	CD207 antigen, langerin	2.05	.001
KRT2A	Keratin 2A (epidermal ichthyosis bullosa of Siemens)	2.02	< .001
COL7A1	Collagen, type VII, α 1	2.01	< .001
LTBP4	Latent transforming growth factor β–binding protein 4	2.00	< .001
FLG	Filaggrin	1.91	< .001
ST6GALNAC2	ST6-N-acetylgalactosaminide alpha-2,6-sialyltransferase 2	1.89	< .001

Gene symbol	Gene name	Fold change	P
PPFIBP1	PTPRF interacting protein, binding protein 1 (liprin beta 1)	2.71	< .001
CST6	Cystatin E/M	2.70	< .001
NFIC	Nuclear factor I/C (CCAAT-binding transcription factor)	2.69	< .001
FCGBP	Fc fragment of IgG-binding protein	2.67	< .001
LAMC3	Laminin, γ 3	2.53	.003
LOR	Loricrin	2.47	< .001
EPS8L1	Epidermal growth factor receptor pathway substrate 8 (EPS8)–like 1	2.47	< .001
SEMA3B	Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3B	2.45	< .001
LCE2B	Late cornified envelope 2B	2.31	< .001
CTNNBIP1	Catenin, β-interacting protein 1	2.26	< .001
PARD3	Par-3 partitioning defective 3 homolog (C elegans)	2.20	< .001
MATN4	Matrilin 4	2.18	.033
HLA-DQB2	Major histocompatibility complex, class II, DQ β 2	2.10	.001
TCF7L1	Transcription factor 7-like 1 (T-cell specific, HMG-box)	2.05	< .001
CD207	CD207 antigen, langerin	2.05	.001
KRT2A	Keratin 2A (epidermal ichthyosis bullosa of Siemens)	2.02	< .001
COL7A1	Collagen, type VII, α 1	2.01	< .001
LTBP4	Latent transforming growth factor β–binding protein 4	2.00	< .001
FLG	Filaggrin	1.91	< .001
ST6GALNAC2	ST6-N-acetylgalactosaminide alpha-2,6-sialyltransferase 2	1.89	< .001

Genes up-regulated in cluster 3 included several genes related to keratinocyte function, but also included T-cell–specific genes. Selected genes are shown in Table 4. Up-regulated genes included keratins 6 and 16, indicating keratinocyte hyperproliferation.⁴⁴SERPINB3, CEACAM1, and WNT 5A were also significantly up-regulated. These genes have been previously associated with benign hyperplasia in the setting of psoriasis as compared with squamous cell carcinoma.⁴⁵

Table 4

Cluster 3 selected up-regulated genes

Gene symbol	Gene name	Fold change	P
FOSL1	FOS-like antigen 1	3.80	< .001
SERPINB13	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 13	3.66	< .001
MMP12	Matrix metalloproteinase 12 (macrophage elastase)	3.41	.004
SERPINB4	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 4	3.33	< .001
EVA1	Epithelial V–like antigen 1	3.12	< .001
SELE	Selectin E (endothelial adhesion molecule 1)	3.08	< .001
IL13RA2	Interleukin-13 receptor α2	3.05	.006
TP73L	Tumor protein p73–like	2.94	< .001
FGFBP1	Fibroblast growth factor–binding protein 1	2.86	< .001
KRT17	Keratin 17	2.71	< .001
IL1RN	Interleukin-1 receptor antagonist	2.58	< .001
CD24	CD24 antigen (small-cell lung carcinoma cluster 4 antigen)	2.57	< .001
KRT16	Keratin 16 (focal nonepidermolytic palmoplantar keratoderma)	2.54	< .001
TNFAIP6	Tumor necrosis factor α-induced protein 6	2.51	.006
ID1	Inhibitor of DNA binding 1, dominant-negative helix-loop-helix protein	2.38	< .001
SERPINB3	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3	2.36	< .001
DUSP1	Dual specificity phosphatase 1	2.24	.001
CCL2	Chemokine (C-C motif) ligand 2	2.22	.002
ITGA6	Integrin α₆	2.14	< .001
CEACAM1	Carcinoembryonic antigen-related cell adhesion molecule 1	2.07	.002
SOCS3	Suppressor of cytokine signaling 3	2.00	.001
LTB4R	Leukotriene B4 receptor	1.95	< .001
DUSP3	Dual specificity phosphatase 3 (vaccinia virus phosphatase VH1–related)	1.91	< .001
ITGB4	Integrin β₄	1.89	< .001
KRT6A, B, C, E	Keratin 6A, keratin 6B, keratin 6C, keratin 6E	1.89	< .001
JAG1	Jagged 1 (Alagille syndrome)	1.88	< .001
VEGF	Vascular endothelial growth factor	1.88	< .001
IVL	Involucrin	1.87	.002
MCL1	Myeloid cell leukemia sequence 1 (BCL2-related)	1.82	.008
TGFA	Transforming growth factor α	1.80	.004
SERPINB1	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 1	1.79	.001
S100A2	S100 calcium-binding protein A2	1.79	< .001
YES1	v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1	1.77	< .001
FGFR2	Fibroblast growth factor receptor 2	1.72	< .001
TNFRSF21	Tumor necrosis factor receptor superfamily, member 21	1.69	< .001
RAB6A	RAB6A, member RAS oncogene family	1.69	.003
JUNB	jun B proto-oncogene	1.68	< .001
WNT5A	Wingless-type MMTV integration site family, member 5A	1.65	.006

Gene symbol	Gene name	Fold change	P
FOSL1	FOS-like antigen 1	3.80	< .001
SERPINB13	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 13	3.66	< .001
MMP12	Matrix metalloproteinase 12 (macrophage elastase)	3.41	.004
SERPINB4	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 4	3.33	< .001
EVA1	Epithelial V–like antigen 1	3.12	< .001
SELE	Selectin E (endothelial adhesion molecule 1)	3.08	< .001
IL13RA2	Interleukin-13 receptor α2	3.05	.006
TP73L	Tumor protein p73–like	2.94	< .001
FGFBP1	Fibroblast growth factor–binding protein 1	2.86	< .001
KRT17	Keratin 17	2.71	< .001
IL1RN	Interleukin-1 receptor antagonist	2.58	< .001
CD24	CD24 antigen (small-cell lung carcinoma cluster 4 antigen)	2.57	< .001
KRT16	Keratin 16 (focal nonepidermolytic palmoplantar keratoderma)	2.54	< .001
TNFAIP6	Tumor necrosis factor α-induced protein 6	2.51	.006
ID1	Inhibitor of DNA binding 1, dominant-negative helix-loop-helix protein	2.38	< .001
SERPINB3	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3	2.36	< .001
DUSP1	Dual specificity phosphatase 1	2.24	.001
CCL2	Chemokine (C-C motif) ligand 2	2.22	.002
ITGA6	Integrin α₆	2.14	< .001
CEACAM1	Carcinoembryonic antigen-related cell adhesion molecule 1	2.07	.002
SOCS3	Suppressor of cytokine signaling 3	2.00	.001
LTB4R	Leukotriene B4 receptor	1.95	< .001
DUSP3	Dual specificity phosphatase 3 (vaccinia virus phosphatase VH1–related)	1.91	< .001
ITGB4	Integrin β₄	1.89	< .001
KRT6A, B, C, E	Keratin 6A, keratin 6B, keratin 6C, keratin 6E	1.89	< .001
JAG1	Jagged 1 (Alagille syndrome)	1.88	< .001
VEGF	Vascular endothelial growth factor	1.88	< .001
IVL	Involucrin	1.87	.002
MCL1	Myeloid cell leukemia sequence 1 (BCL2-related)	1.82	.008
TGFA	Transforming growth factor α	1.80	.004
SERPINB1	Serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 1	1.79	.001
S100A2	S100 calcium-binding protein A2	1.79	< .001
YES1	v-yes-1 Yamaguchi sarcoma viral oncogene homolog 1	1.77	< .001
FGFR2	Fibroblast growth factor receptor 2	1.72	< .001
TNFRSF21	Tumor necrosis factor receptor superfamily, member 21	1.69	< .001
RAB6A	RAB6A, member RAS oncogene family	1.69	.003
JUNB	jun B proto-oncogene	1.68	< .001
WNT5A	Wingless-type MMTV integration site family, member 5A	1.65	.006

We also analyzed genes in our clusters according to established functional and molecular pathways. As our results have already shown, several pathways are easily identified ad hoc. However, we used 2 rigorous techniques to identify enriched pathways because with a large set of genes, it is often difficult to pick out patterns relevant to pathophysiology of disease, as many genes, despite relatively low fold changes, may contribute as members of an up-regulated set rather than individually. We compared GO pathways and a set of pathways curated by the Broad Institute at MIT to our dataset using the GeneMerge and GSEA tools, respectively. GeneMerge scores GO pathways according to a predefined list of cluster markers. GSEA, in comparison, uses processed expression data and class phenotype information to discover pathway enrichment.

Analysis of GO pathways gave more strength to the patterns suggested by the cluster marker gene lists. The top 15 pathways for each cluster, ranked by P value and corrected for multiple testing, are shown with the heat map in Figure 4. Cluster 1 showed up-regulation of pathways involving immune response, endogenous and exogenous antigen presentation and processing, major histocompatibility complex (MHC) receptor activity, chemokine receptor activity, TCR binding, and TNF-receptor binding. In addition, proliferative pathways are also highlighted, including mitosis, cell division, nucleus, chromosome, DNA repair, T-cell proliferation, and B-cell proliferation. In contrast, cluster 2 revealed up-regulation of several pathways involved in extracellular matrix, collagen, epidermis development, transcription, keratin filament, WNT signaling, melanin metabolism, keratinocyte differentiation, and cornified envelope. Cluster 3 showed up-regulation of many varied pathways, including angiogenesis, oxygen binding, inflammatory response, chemokine activity, chemotaxis, interleukin-1 receptor binding, and response to virus.

GSEA resulted in the identification of significantly enriched pathways in only 1 cluster—cluster 1. For all pathways containing 5 or more genes, 8 were significantly enriched in cluster 1, with FDR less than or equal to 0.05, and 3 genes with FDR less than 0.01 (Table 5). The enriched pathways involve T-cell function, TCR signaling, and T-cell activation. Also enriched was a pathway up-regulated in GATA1 mutant megakaryoblasts harvested from mice, previously implicated in leukemogenesis.⁴⁶

Table 5

GSEA-enriched pathways for cluster 1

Pathway	Description	Curator	Size, no. genes	Enrichment score	FDR
CSK pathway	Csk inhibits T-cell activation by phosphorylating Lck; Csk is regulated by cAMP-dependent kinases and is opposed by the T-cell activator CD45.	BioCarta	24	0.8398	< .001
T helper pathway	Helper T cells coordinate the actions of B cells, macrophages, and other immune cells via surface molecules such as T-cell receptor/CD3 and their characteristic marker CD4.	BioCarta	23	< .001.8698	< .001
T cytotoxic pathway	Cytotoxic T cells release perforin and granzyme to lyse foreign cell targets and express Fas ligand to promote Fas-induced apoptosis.	BioCarta	23	< .001.8513	.003
GATA1 mutant megakaryoblasts	Genes up-regulated in megakaryoblasts of GATA1-mutant mice. GATA1 (a hematopoietic transcription factor) has been implicated in leukemogenesis in patients with Down syndrome.	Li et al⁴⁶	22	< .001.6708	.017
T-cell receptor pathway	T-cell receptors bind to foreign peptides presented by MHC molecules and induce T-cell activation.	BioCarta	24	< .001.7998	.020
IL-17 pathway	Activated T cells secrete IL-17, which stimulates fibroblasts and other cells to secrete inflammatory and hematopoietic cytokines.	BioCarta	16	< .001.7848	.020
T-cell receptor activation pathway	The kinases Lck and Fyn phosphorylate and activate the T-cell receptor, which recognizes antigen-bound MHC II and leads to T-cell activation.	BioCarta	17	< .001.9049	.020
T-cell signal transduction	On activation of the T-cell receptor, phospholipase C is activated to produce second messengers DAG and PIP3, both required for T-cell activation.	STKE	29	< .001.7062	.034

Pathway	Description	Curator	Size, no. genes	Enrichment score	FDR
CSK pathway	Csk inhibits T-cell activation by phosphorylating Lck; Csk is regulated by cAMP-dependent kinases and is opposed by the T-cell activator CD45.	BioCarta	24	0.8398	< .001
T helper pathway	Helper T cells coordinate the actions of B cells, macrophages, and other immune cells via surface molecules such as T-cell receptor/CD3 and their characteristic marker CD4.	BioCarta	23	< .001.8698	< .001
T cytotoxic pathway	Cytotoxic T cells release perforin and granzyme to lyse foreign cell targets and express Fas ligand to promote Fas-induced apoptosis.	BioCarta	23	< .001.8513	.003
GATA1 mutant megakaryoblasts	Genes up-regulated in megakaryoblasts of GATA1-mutant mice. GATA1 (a hematopoietic transcription factor) has been implicated in leukemogenesis in patients with Down syndrome.	Li et al⁴⁶	22	< .001.6708	.017
T-cell receptor pathway	T-cell receptors bind to foreign peptides presented by MHC molecules and induce T-cell activation.	BioCarta	24	< .001.7998	.020
IL-17 pathway	Activated T cells secrete IL-17, which stimulates fibroblasts and other cells to secrete inflammatory and hematopoietic cytokines.	BioCarta	16	< .001.7848	.020
T-cell receptor activation pathway	The kinases Lck and Fyn phosphorylate and activate the T-cell receptor, which recognizes antigen-bound MHC II and leads to T-cell activation.	BioCarta	17	< .001.9049	.020
T-cell signal transduction	On activation of the T-cell receptor, phospholipase C is activated to produce second messengers DAG and PIP3, both required for T-cell activation.	STKE	29	< .001.7062	.034

In addition to the pathway name and description, curator (source of the pathway), size (number of genes in the pathway), ES (calculated by the GSEA algorithm, giving a statistical measure of the enrichment of the sample class by pathway genes), P value, and FDR, which corrects the P value for multiple-testing correction, are also provided. All P values were 0.

Histologic and immunophenotypic correlation.

In some cases, patients had 2 biopsies available from the same lesion, 1 processed for gene expression profiling and 1 processed for histologic analysis. We examined the histology in these cases for any correlation to clustering. In particular, we were interested to determine if the relative number of T cells present in a biopsy correlated with clustering. Figure 7 shows histology and CD3 immunostaining of representative biopsies taken at the same site as for the clustering. As illustrated in the images, there appears to be variability in the number of lymphocytes with respect to cluster, suggesting that the number of T cells in a biopsy does not dominate the clustering.

View large Download PPT

Figure 7

Histologic and immunophenotypical correlation with cluster. Representative biopsies are shown. Left panels are the hematoxylin-eosin stain, and right panels are immunohistochemical stains using monoclonal antibody to CD3. (A) Cluster 2: stage IA, untreated erythematous plaque on the chest. (B) Cluster 3: top 2 panels depict a stage IB lesion, while the bottom panel shows a stage III lesion. All images were acquired on a Nikon Eclipse E600 microscope (Nikon, Melville, NY) with a 10×/0.30 objective using a SPOT digital camera and image acquisition software (Diagnostic Instruments, Sterling Heights, MI).

Markers of clinical stage and response to treatment

Differential analysis was performed on a number of different phenotypes in order to identify genes associated with particular clinical differences. We first evaluated markers differentiating clinical stage. In a comparison of patch/plaque disease (stages IA and IB) compared with more advanced disease (stages IIB and III), we identified 480 genes with FDR less than or equal to 0.05, and 50 genes with FDR less than or equal to 0.01 (Figure 8). Genes ranked by signal-to-noise ratio contributing to differentiation of clinical stages are shown in Table 6. Genes up-regulated in earlier stage disease include several involved in epidermal development, WNT signaling pathway, and keratinocyte development. Genes up-regulated in later stage disease include several genes involved in immune response, mitosis, cell proliferation, DNA replication, apoptosis, response to virus, chemotaxis, and interferon (IFN) and TNF-alpha signaling. Although GSEA was unable to show any significant enrichment of evaluated pathways, GO analysis did show significant up-regulation in particular pathways. The top 20 up-regulated GO pathways are shown in Figure 8.

View large Download PPT

Figure 8

Heat map showing gene selection for stage IA/IB compared with stage IIB/III. Our analysis yielded gene sets of 480 genes with FDR less than or equal to 0.05, using the gene filter for 6997 genes, ranked by signal-to-noise ratio. The chart shows the top 20 GO terms for each stage, ranked by FDR. Each GO term is accompanied by the number of up-regulated genes/the total number of genes included in the test group. All included GO terms have cutoffs of .05 for both P value and FDR.

Table 6

Genes differentiating clinical stage

Gene symbol	Gene name	Fold change	P	FDR
Up-regulated in stage IA/IB
HLA-DRB4	mMajor histocompatibility complex, class II, DR β 4	4.53	.003	.048
WIF1	WNT inhibitory factor 1	4.41	< .001	< .001
CD207	CD207 antigen, langerin	3.63	< .001	< .001
FCGBP	Fc fragment of IgG binding protein	2.87	< .001	.017
MATN4	Matrilin 4	2.79	< .001	< .001
HLA-DQB2	Major histocompatibility complex, class II, DQ β 2	2.74	< .001	< .001
CD1A	CD1A antigen, a polypeptide	2.60	< .001	< .001
FCER1A	Fc fragment of IgE, high affinity I, receptor for α polypeptide	2.45	< .001	.029
S100B	S100 calcium-binding protein β (neural)	2.17	< .001	.017
KRT18	Keratin 18	2.06	.003	.047
LAMC3	Laminin γ 3	2.05	.002	.043
CTSG	Cathepsin G	2.04	< .001	< .001
CMA1	Chymase 1, mast cell	2.03	< .001	< .001
GSTT1	Glutathione S-transferase θ 1	2.03	< .001	.029
HLF	Hepatic leukemia factor	2.00	.002	.039
CD1C	CD1C antigen, c polypeptide	1.94	.002	.043
FRZB	Frizzled-related protein	1.94	< .001	.029
KRT15	Keratin 15	1.86	.001	.036
MMP27	Matrix metalloproteinase 27	1.85	.003	.044
PADI2	Peptidyl arginine deiminase, type II	1.78	< .001	.017
SCEL	Sciellin	1.78	.002	.043
TCF7L1	Tanscription factor 7-like 1 (T-cell specific, HMG-box)	1.73	< .001	.023
LTBP4	Latent transforming growth factor β–binding protein 4	1.70	< .001	.033
CTNNBIP1	Catenin β interacting protein 1	1.68	.003	.046
COL21A1	Collagen, type XXI α 1	1.67	.003	.047
KIT	v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog	1.66	< .001	< .001
S100A1	S100 calcium-binding protein A1	1.64	< .001	< .001
NFIB	Nuclear factor I/B	1.60	.002	.039
IL11RA	Interleukin-11 receptor α	1.59	< .001	< .001
MATN2	Matrilin 2	1.53	< .001	.017
Up-regulated in stage IIA/III
AIM2	Absent in melanoma 2	4.62	< .001	< .001
LAG3	Lymphocyte-activation gene 3	4.42	< .001	< .001
APOBEC3B	Apolipoprotein B mRNA–editing enzyme, catalytic polypeptide-like 3B	3.37	< .001	.023
SH2D1A	SH2 domain protein 1A, Duncan disease (lymphoproliferative syndrome)	3.04	.002	.039
TNIP3	TNFAIP3 interacting protein 3	2.75	< .001	.023
IGH	Ig heavy locus	2.68	.002	.041
IL21R	Interleukin-21 receptor	2.67	< .001	.017
TNFSF14	Tumor necrosis factor (ligand) superfamily, member 14	2.64	.002	.039
LILRA3	Leukocyte Ig-like receptor, subfamily A (without TM domain), member 3	2.64	.002	.043
CCR4	Chemokine (C-C motif) receptor 4	2.52	< .001	.023
G1P2	Interferon α–inducible protein (clone IFI-15K)	2.49	< .001	.029
ISG20	Interferon-stimulated gene 20 kDa	2.42	< .001	< .001
TNIK	TRAF2 and NCK interacting kinase	2.39	< .001	.017
IGL	Ig lambda locus, constant, variable and joining regions	2.36	.002	.042
PRKCQ	Protein kinase C θ	2.24	.002	.041
OAS2	2′-5′-oligoadenylate synthetase 2, 69/71 kDa	2.18	< .001	< .001
OAS1	2′,5′-oligoadenylate synthetase 1, 40/46 kDa	2.12	.002	.039
IGKC; IGKV1-5	Ig κ constant and variable 1-5	2.09	.003	.050
CCR7	Chemokine (C-C motif) receptor 7	2.08	.001	.036
ZAP70	ζ-chain (TCR)–associated protein kinase 70 kDa	2.05	.002	.041
CCND2	Cyclin D2	2.00	.002	.039
GBP1	Guanylate-binding protein 1, interferon-inducible, 67 kDa	1.95	.003	.046
ZC3HAV1	Zinc finger CCCH type, antiviral 1	1.91	< .001	< .001
IL2RB	Interleukin-2 receptor β	1.90	.002	.041
PTPRCAP	Protein tyrosine phosphatase, receptor type, C-associated protein	1.90	.003	.048
C1GALT1	Core 1 synthase	1.89	< .001	.017
C5R1	Complement component 5 receptor 1 (C5a ligand)	1.88	< .001	.033
PIM2	pim-2 oncogene	1.88	.003	.050
ITGB7	Integrin β₇	1.84	.002	.042
DDX58	DEAD (Asp-Glu-Ala-Asp) box polypeptide 58	1.84	.002	.041
RARRES3	Retinoic acid receptor responder (tazarotene induced) 3	1.83	.003	.047
IFRG28	28 kDa interferon-responsive protein	1.82	.001	.036
MX2	Myxovirus (influenza virus) resistance 2 (mouse)	1.75	.002	.043
SP100 (IFI41)	Nuclear antigen Sp100	1.71	< .001	< .001
VAV1	vav 1 oncogene	1.68	.001	.036
LCP2	Lymphocyte cytosolic protein 2	1.65	.002	.042
NFATC2IP	NFAT, cytoplasmic, calcineurin-dependent 2 interacting protein	1.65	< .001	< .001
DUSP5	Dual specificity phosphatase 5	1.64	< .001	.017
IL10	Interleukin-10	1.64	.002	.042
BAX	BCL2-associated X protein	1.63	.002	.043
FYB	FYN binding protein (FYB-120/130)	1.61	< .001	.023
NUP98	Nucleoporin 98 kDa	1.60	.002	.041
HSPD1	heat shock 60 kDa protein 1 (chaperonin)	1.56	.003	.048
IFNAR2	Interferon (α, β, and ω) receptor 2	1.53	< .001	< .001
RAP140	Retinoblastoma-associated protein 140 (CTCL tumor antigen se89-1)	1.52	< .001	.033
FGFR1OP	FGFR1 oncogene partner	1.51	< .001	< .001

Gene symbol	Gene name	Fold change	P	FDR
Up-regulated in stage IA/IB
HLA-DRB4	mMajor histocompatibility complex, class II, DR β 4	4.53	.003	.048
WIF1	WNT inhibitory factor 1	4.41	< .001	< .001
CD207	CD207 antigen, langerin	3.63	< .001	< .001
FCGBP	Fc fragment of IgG binding protein	2.87	< .001	.017
MATN4	Matrilin 4	2.79	< .001	< .001
HLA-DQB2	Major histocompatibility complex, class II, DQ β 2	2.74	< .001	< .001
CD1A	CD1A antigen, a polypeptide	2.60	< .001	< .001
FCER1A	Fc fragment of IgE, high affinity I, receptor for α polypeptide	2.45	< .001	.029
S100B	S100 calcium-binding protein β (neural)	2.17	< .001	.017
KRT18	Keratin 18	2.06	.003	.047
LAMC3	Laminin γ 3	2.05	.002	.043
CTSG	Cathepsin G	2.04	< .001	< .001
CMA1	Chymase 1, mast cell	2.03	< .001	< .001
GSTT1	Glutathione S-transferase θ 1	2.03	< .001	.029
HLF	Hepatic leukemia factor	2.00	.002	.039
CD1C	CD1C antigen, c polypeptide	1.94	.002	.043
FRZB	Frizzled-related protein	1.94	< .001	.029
KRT15	Keratin 15	1.86	.001	.036
MMP27	Matrix metalloproteinase 27	1.85	.003	.044
PADI2	Peptidyl arginine deiminase, type II	1.78	< .001	.017
SCEL	Sciellin	1.78	.002	.043
TCF7L1	Tanscription factor 7-like 1 (T-cell specific, HMG-box)	1.73	< .001	.023
LTBP4	Latent transforming growth factor β–binding protein 4	1.70	< .001	.033
CTNNBIP1	Catenin β interacting protein 1	1.68	.003	.046
COL21A1	Collagen, type XXI α 1	1.67	.003	.047
KIT	v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog	1.66	< .001	< .001
S100A1	S100 calcium-binding protein A1	1.64	< .001	< .001
NFIB	Nuclear factor I/B	1.60	.002	.039
IL11RA	Interleukin-11 receptor α	1.59	< .001	< .001
MATN2	Matrilin 2	1.53	< .001	.017
Up-regulated in stage IIA/III
AIM2	Absent in melanoma 2	4.62	< .001	< .001
LAG3	Lymphocyte-activation gene 3	4.42	< .001	< .001
APOBEC3B	Apolipoprotein B mRNA–editing enzyme, catalytic polypeptide-like 3B	3.37	< .001	.023
SH2D1A	SH2 domain protein 1A, Duncan disease (lymphoproliferative syndrome)	3.04	.002	.039
TNIP3	TNFAIP3 interacting protein 3	2.75	< .001	.023
IGH	Ig heavy locus	2.68	.002	.041
IL21R	Interleukin-21 receptor	2.67	< .001	.017
TNFSF14	Tumor necrosis factor (ligand) superfamily, member 14	2.64	.002	.039
LILRA3	Leukocyte Ig-like receptor, subfamily A (without TM domain), member 3	2.64	.002	.043
CCR4	Chemokine (C-C motif) receptor 4	2.52	< .001	.023
G1P2	Interferon α–inducible protein (clone IFI-15K)	2.49	< .001	.029
ISG20	Interferon-stimulated gene 20 kDa	2.42	< .001	< .001
TNIK	TRAF2 and NCK interacting kinase	2.39	< .001	.017
IGL	Ig lambda locus, constant, variable and joining regions	2.36	.002	.042
PRKCQ	Protein kinase C θ	2.24	.002	.041
OAS2	2′-5′-oligoadenylate synthetase 2, 69/71 kDa	2.18	< .001	< .001
OAS1	2′,5′-oligoadenylate synthetase 1, 40/46 kDa	2.12	.002	.039
IGKC; IGKV1-5	Ig κ constant and variable 1-5	2.09	.003	.050
CCR7	Chemokine (C-C motif) receptor 7	2.08	.001	.036
ZAP70	ζ-chain (TCR)–associated protein kinase 70 kDa	2.05	.002	.041
CCND2	Cyclin D2	2.00	.002	.039
GBP1	Guanylate-binding protein 1, interferon-inducible, 67 kDa	1.95	.003	.046
ZC3HAV1	Zinc finger CCCH type, antiviral 1	1.91	< .001	< .001
IL2RB	Interleukin-2 receptor β	1.90	.002	.041
PTPRCAP	Protein tyrosine phosphatase, receptor type, C-associated protein	1.90	.003	.048
C1GALT1	Core 1 synthase	1.89	< .001	.017
C5R1	Complement component 5 receptor 1 (C5a ligand)	1.88	< .001	.033
PIM2	pim-2 oncogene	1.88	.003	.050
ITGB7	Integrin β₇	1.84	.002	.042
DDX58	DEAD (Asp-Glu-Ala-Asp) box polypeptide 58	1.84	.002	.041
RARRES3	Retinoic acid receptor responder (tazarotene induced) 3	1.83	.003	.047
IFRG28	28 kDa interferon-responsive protein	1.82	.001	.036
MX2	Myxovirus (influenza virus) resistance 2 (mouse)	1.75	.002	.043
SP100 (IFI41)	Nuclear antigen Sp100	1.71	< .001	< .001
VAV1	vav 1 oncogene	1.68	.001	.036
LCP2	Lymphocyte cytosolic protein 2	1.65	.002	.042
NFATC2IP	NFAT, cytoplasmic, calcineurin-dependent 2 interacting protein	1.65	< .001	< .001
DUSP5	Dual specificity phosphatase 5	1.64	< .001	.017
IL10	Interleukin-10	1.64	.002	.042
BAX	BCL2-associated X protein	1.63	.002	.043
FYB	FYN binding protein (FYB-120/130)	1.61	< .001	.023
NUP98	Nucleoporin 98 kDa	1.60	.002	.041
HSPD1	heat shock 60 kDa protein 1 (chaperonin)	1.56	.003	.048
IFNAR2	Interferon (α, β, and ω) receptor 2	1.53	< .001	< .001
RAP140	Retinoblastoma-associated protein 140 (CTCL tumor antigen se89-1)	1.52	< .001	.033
FGFR1OP	FGFR1 oncogene partner	1.51	< .001	< .001

Selected genes with up-regulated expression in the specified stages are shown, with fold change greater than or equal to 1.5, P value less than or equal to .05, and FDR less than or equal to .05.

We also evaluated patients in terms of response to treatment, identifying those with disease persistence or progression despite stage-appropriate treatment as difficult to treat, and those whose disease was responsive to stage-appropriate therapies as easy to treat. Transcriptional profiling revealed a treatment response signature shown in Table 7. Changes are similar to those differentiating early- and later-stage disease: genes down-regulated in poor treatment response were involved in pathway extracellular matrix, WNT signaling pathway, epidermal development, and frizzled signaling, whereas up-regulated genes included those involved in mitosis, immune response, response to virus, apoptosis, and T-cell activation.

Table 7

Genes up- and down-regulated in patients with poor response to treatment

Gene symbol	Gene name	Fold change	P	FDR
In patients with poor response to treatment
IL21R	Interleukin-21 receptor	2.81	< .001	.048
MYO7A	Myosin VIIA (usher syndrome 1B [autosomal recessive, severe])	2.44	< .001	.048
RGS16	Regulator of G-protein signaling 16	2.29	< .001	< .001
ANP32E	Acidic (leucine-rich) nuclear phosphoprotein 32 family, member E	2.00	< .001	< .001
SLA	Src-like adaptor	1.91	< .001	.048
PPP1R16B	Protein phosphatase 1, regulatory (inhibitor) subunit 16B	1.79	< .001	.048
C1GALT1	Core 1 synthase	1.64	< .001	.048
IL10	Interleukin-10	1.60	< .001	< .001
TRIB2	Tribbles homolog 2 (Drosophila)	1.55	< .001	< .001
MTHFD2	Methylenetetrahydrofolate dehydrogenase (NADP⁺ dependent) 2, methenyltetrahydrofolate cyclohydrolase	1.51	< .001	< .001
EHD1	EH-domain containing 1	1.51	< .001	< .001
In patients with good response to treatment
WIF1	WNT inhibitory factor 1	4.02	< .001	.048
KRT15	Keratin 15	1.92	< .001	< .001
GSTA3	Glutathione S-transferase A3	1.87	< .001	.048
PTN	Pleiotrophin (heparin-binding growth factor 8, neurite growth-promoting factor 1)	1.79	< .001	.048

Gene symbol	Gene name	Fold change	P	FDR
In patients with poor response to treatment
IL21R	Interleukin-21 receptor	2.81	< .001	.048
MYO7A	Myosin VIIA (usher syndrome 1B [autosomal recessive, severe])	2.44	< .001	.048
RGS16	Regulator of G-protein signaling 16	2.29	< .001	< .001
ANP32E	Acidic (leucine-rich) nuclear phosphoprotein 32 family, member E	2.00	< .001	< .001
SLA	Src-like adaptor	1.91	< .001	.048
PPP1R16B	Protein phosphatase 1, regulatory (inhibitor) subunit 16B	1.79	< .001	.048
C1GALT1	Core 1 synthase	1.64	< .001	.048
IL10	Interleukin-10	1.60	< .001	< .001
TRIB2	Tribbles homolog 2 (Drosophila)	1.55	< .001	< .001
MTHFD2	Methylenetetrahydrofolate dehydrogenase (NADP⁺ dependent) 2, methenyltetrahydrofolate cyclohydrolase	1.51	< .001	< .001
EHD1	EH-domain containing 1	1.51	< .001	< .001
In patients with good response to treatment
WIF1	WNT inhibitory factor 1	4.02	< .001	.048
KRT15	Keratin 15	1.92	< .001	< .001
GSTA3	Glutathione S-transferase A3	1.87	< .001	.048
PTN	Pleiotrophin (heparin-binding growth factor 8, neurite growth-promoting factor 1)	1.79	< .001	.048

Shown are genes with fold change greater than or equal to 1.5, P value less than or equal to .05, and FDR less than or equal to 0.05.

Comparison of patients with leukemic disease versus patients with no evidence of peripheral blood involvement identified only 3 genes with significant fold changes based on multiple testing. These genes included 2 genes up-regulated in leukemic disease, lymphocyte-activation gene 3 (LAG3; fold change 4.24) and B-factor (BF; fold change 2.36), and 1 gene down-regulated in leukemic disease, collagen XI A (COL11A1; fold change −9.84). LAG3 has been shown to be involved in lymphocyte activation, and has a close relationship to CD4; BF is a component of the alternative pathway of complement activation; and COL11A1 is an extracellular matrix component, also reported to be expressed in some T-cell lines. To further investigate these potential markers of leukemic disease, we performed reverse transcription–polymerase chain reaction (RT-PCR) to evaluate expression of these genes in peripheral blood mononuclear cell RNA samples of patients with CTCL with and without leukemic disease (as opposed to the skin lesion RNA used in our GeneChip analysis). We found no correlation between expression of these genes in blood samples and the presence of peripheral blood involvement with CTCL (data not shown).

Evaluation of SS in comparison with all other patient samples revealed significant down-regulation of only the patched homolog gene (PTCH; fold change −1.72). PTCH encodes the receptor for sonic hedgehog, and functions as a tumor suppressor.

Correlation with TREC and spectratyping data

We have previously shown that T-cell receptor excision circles (TRECs), which decrease with T-cell division, are significantly decreased in the peripheral blood of patients with CTCL, indicating increased T-cell turnover.⁴⁷ We have also previously shown that the diversity of the T-cell population appears to be decreased in CTCL, by measuring length distributions of T-cell receptor β variable regions (“spectratyping”).^47,48 We compared these parameters of T-cell immunity with clustering results in 39 of the patients of this study for whom TREC and spectratyping data were available. As shown previously, mean TRECs decreased significantly with increasing clinical stage, and mean spectratype contracture increased, consistent with loss of TCR complexity (Table 8). In this study, we compared TRECs and spectratypes by cluster and found the greatest loss of TRECs in cluster 1 and the least in cluster 2 (Table 8). Loss of T-cell diversity on spectratyping was greatest in cluster 3 and least in cluster 2.

Table 8

Evaluation of TRECs

	Mean no. TRECs	Mean contracture	Sample no.
Clinical stage
IA	6432.46	1.86	7
IB	5293.29	2.44	18
IIB	534.62	2.33	6
III	268.12	6.56	8
Cluster
1	1551.66	3.18	11
2	6980.11	2.03	15
3	1837.65	4.46	13

	Mean no. TRECs	Mean contracture	Sample no.
Clinical stage
IA	6432.46	1.86	7
IB	5293.29	2.44	18
IIB	534.62	2.33	6
III	268.12	6.56	8
Cluster
1	1551.66	3.18	11
2	6980.11	2.03	15
3	1837.65	4.46	13

Shown are TRECs with respect to clinical stage (P = .034) and cluster (P = .011). As shown previously,⁴⁷ TRECs decreased significantly across all clinical stages, and contracture increased from early-stage to late-stage CTCL, indicating loss of T-cell receptor diversity. Clusters 1 and 3 had the lowest TREC levels and highest contractures, while cluster 2 had the highest TREC levels and the lowest contractures. These findings significantly correlate with the clinical observation of poorer outcome in clusters 1 and 3. P values calculated based on ANOVA.

Discussion

This study shows that gene expression profiling in lesional skin biopsies provides information which may be useful in the evaluation and management of patients with CTCL. We determined the most natural clustering of 62 patients with CTCL, and examined in detail the genes and gene sets defining each group. We further correlated these clusters with other parameters including clinical stage, event-free survival, treatment resistance, TREC levels, and spectratype diversity. Our results demonstrate that grouping patients based on gene expression reveals many interesting similarities and differences with currently accepted clinical staging. This work can serve as a framework for further studies aimed at understanding the diverse spectrum of CTCL in terms of gene expression rather than clinical manifestations alone.

Using consensus analysis, we found 3 clusters to be the most statistically robust grouping of our samples, with approximately one-third of patients in each group. One cluster (cluster 2) was associated with less aggressive disease by all measures. Another cluster (cluster 3) was associated with more extensive disease, decreased event-free survival, and poor response to treatment. A third cluster (cluster 1) was associated with an even poorer prognosis and included many patients in tumor stage. A previous microarray study of lesional CTCL skin biopsies by Tracey et al¹⁷ divided subjects into 2 clusters using hierarchical clustering. These clusters were generally characterized by clinical parameters to be associated with more and less aggressive disease. They did not perform consensus clustering and did not publish any results of clustering with an explicit 3-cluster model. However, when we examined their data in a 3-cluster division for comparison with the results of our consensus clustering, we found that a 3-cluster structure for their data were highly consistent with our results, based on sample as well as gene partitioning (Figure 9).

View large Download PPT

Figure 9

Comparison with published results. Tracey et al¹⁷ examined genes included on the CNIO Oncochip microarray platform, a subset of the genes included on the Affymetrix HG-U133 microarray platform. They identified 2 main subgroups of patients using hierarchical clustering. We compared this to our 3-cluster division determined through consensus clustering. Sample clustering in the space of genes identified by Tracey et al showed significant consistency with our results, despite only 21% overlap between genes used in clustering. We also examined a less optimized 2-cluster structure of our data for comparison, and found relatively good overlap with the 2-cluster structure of Tracey et al. (A) 3-cluster structure. 79% of samples overall were correctly classified. This result was significant (P < .001). (B) 2-cluster structure. 87% of samples overall were correctly classified using a 2-cluster structure. The result was again significant (P < .001). (C) Gene correlation. Genes were similarly classified for 71% of 130 common genes. Genes associated with more aggressive disease by Tracey et al were found to be associated with more advanced disease in our study as well. This finding was significant (P < .001).

In addition to examining individual gene differences, we examined groups of genes organized according to functional pathways. Pathways were tested using the GO and GSEA tools, which group changes in gene expression based on molecular pathways in order to uncover changes in cell functions that might otherwise be difficult to delineate because of small fold changes of individual genes. Cluster 1 demonstrated significant up-regulation of many other genes involved in lymphocyte activation, as well as up-regulation of TNF pathway genes. Gene pathways that were predominant in cluster 2, consisting of patients with relatively limited disease, tended to be associated with epidermal development and differentiation. Cluster 3, consisting of patients with more extensive disease, emphasized pathways involved in inflammation and benign epidermal hyperproliferation. Although clusters 2 and 3 appear to be clinically divergent, it is possible that they lie on a clinical spectrum, with cluster 3 tending toward more extensive skin involvement (stages IB and III, with very few stage IA patients) and with cluster 2 tending toward more limited surface area stage IA disease. Cluster 1 was found to be enriched with lymphocyte activation genes, and correlated with poorer outcome, including disease-related deaths and progression despite systemic therapies. This cluster also had lower TREC levels across all clinical stages, consistent with lymphocyte clonal expansion, as well as contracted spectratypes, indicating decreased TCR diversity. This significant correlation between our clustering and TREC/spectratype measurements supports the idea that our clustering does correlate with the biology of the disease.

Biopsies showed that the number of CD3⁺ cells in the epidermal and dermal infiltrates is highly variable across clusters, suggesting that up-regulation of T-cell activation genes is unlikely due solely to increased numbers of lymphocytes.

Markers of clinical stage in the supervised analysis were largely in line with those of the natural clusters, with lower stage disease enriched in epidermal differentiation pathways and higher-stage disease enriched in lymphocyte activation pathways, among others. More work with larger numbers of patients will be required to characterize subgroups more clearly and to isolate causative genes. However, this does appear to be a useful framework for additional study of the pathways and genes involved in clinical stage progression and response to treatment.

This study suggests that gene expression profiles of lesional skin provide information which correlates with disease progression and response to therapy. This information may prove useful in the future in combination with traditionally accepted clinical staging criteria to improve classification of CTCL and to help guide therapy, such as more aggressive treatment of those patients with gene expression profiles enriched in lymphocyte activation genes relative to epidermal differentiation and proliferation genes.

Presented in abstract form at the 67th annual meeting of the Society for Investigative Dermatology, Philadelphia, PA, May 4, 2006.¹

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Acknowledgments

We are indebted to Marianne Tawa and Lisa Batchelder for their help with identifying and recruiting eligible patients for our study. We would also like to thank Helen Kuo for her help extracting RNA. We thank the Broad Institute for help with target preparation and hybridization.

This work was supported by a grant from the National Institutes of Health SPORE program (P50 CA093683 to T.S.K.).

National Institutes of Health

Authorship

Contribution: All authors participated in designing the research; D.A. and M.D. collected samples; D.A., M.D., and J.S. collected clinical information; J.S. and D.J. examined pathology slides; J.S. and S.M. performed statistical analyses; J.S. and D.J. wrote the paper; and all authors reviewed the final version of the manuscript.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: David A. Jones or Thomas S. Kupper, Department of Dermatology, Brigham and Women's Hospital, 77 Ave Louis Pasteur, HIM Room 660, Boston, MA 02115; e-mail: dajones@partners.org or tkupper@partners.org.

References

1

Shin

J

,

Jones

DA

,

Hurwitz

D

,

Kupper

TS

. ,

Gene expression profiling in cutaneous T cell lymphoma skin lesions.

,

J Invest Dermatol

,

2006

, vol.

126

Supplement

pg.

24

Google Scholar

2

Willemze

R

,

Jaffe

ES

,

Burg

G

, et al. ,

WHO-EORTC classification for cutaneous lymphomas.

,

Blood

,

2005

, vol.

105

(pg.

3768

-

3785

)

Google Scholar

Crossref

PubMed

3

Zackheim

HS

,

Amin

S

,

Kashani-Sabet

M

,

McMillan

A

. ,

Prognosis in cutaneous T-cell lymphoma by skin stage: long-term survival in 489 patients.

,

J Am Acad Dermatol

,

1999

, vol.

40

(pg.

418

-

425

)

Google Scholar

Crossref

PubMed

4

Diamandidou

E

,

Colome-Grimmer

M

,

Fayad

L

,

Duvic

M

,

Kurzrock

R

. ,

Transformation of mycosis fungoides/Sezary syndrome: clinical characteristics and prognosis.

,

Blood

,

1998

, vol.

92

(pg.

1150

-

1159

)

Google Scholar

Crossref

PubMed

5

Kim

YH

,

Liu

HL

,

Mraz-Gernhard

S

,

Varghese

A

,

Hoppe

RT

. ,

Long-term outcome of 525 patients with mycosis fungoides and Sezary syndrome: clinical prognostic factors and risk for disease progression.

,

Arch Dermatol

,

2003

, vol.

139

(pg.

857

-

866

)

Google Scholar

PubMed

6

Kim

EJ

,

Hess

S

,

Richardson

SK

, et al. ,

Immunopathogenesis and therapy of cutaneous T cell lymphoma.

,

J Clin Invest

,

2005

, vol.

115

(pg.

798

-

812

)

Google Scholar

Crossref

PubMed

7

Kupper

TS

,

Fuhlbrigge

RC

. ,

Immune surveillance in the skin: mechanisms and clinical consequences.

,

Nat Rev Immunol

,

2004

, vol.

4

(pg.

211

-

222

)

Google Scholar

Crossref

PubMed

8

Kazakov

DV

,

Burg

G

,

Kempf

W

. ,

Clinicopathological spectrum of mycosis fungoides.

,

J Eur Acad Dermatol Venereol

,

2004

, vol.

18

(pg.

397

-

415

)

Google Scholar

Crossref

PubMed

9

Morales Suarez-Varela

MM

,

Llopis Gonzalez

A

,

Marquina Vila

A

,

Bell

J

. ,

Mycosis fungoides: review of epidemiological observations.

,

Dermatology

,

2000

, vol.

201

(pg.

21

-

28

)

Google Scholar

Crossref

PubMed

10

Quackenbush

J

. ,

Microarray analysis and tumor classification.

,

N Engl J Med

,

2006

, vol.

354

(pg.

2463

-

2472

)

Google Scholar

Crossref

PubMed

11

Ebert

BL

,

Golub

TR

. ,

Genomic approaches to hematologic malignancies.

,

Blood

,

2004

, vol.

104

(pg.

923

-

932

)

Google Scholar

Crossref

PubMed

12

Tefferi

AM

,

Bolander

MEM

,

Ansell

SMM

,

Wieben

EDP

,

Spelsberg

TCP

. ,

Primer on medical genomics part III: microarray experiments and data analysis.

,

Mayo Clin Proc

,

2002

, vol.

77

(pg.

927

-

940

)

Google Scholar

Crossref

PubMed

13

Sorlie

T

,

Perou

CM

,

Tibshirani

R

, et al. ,

Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.

,

Proc Natl Acad Sci U S A

,

2001

, vol.

98

(pg.

10869

-

10874

)

Google Scholar

Crossref

PubMed

14

van 't Veer

LJ

,

Dai

H

,

van de Vijver

MJ

, et al. ,

Gene expression profiling predicts clinical outcome of breast cancer.

,

Nature

,

2002

, vol.

415

(pg.

530

-

536

)

Google Scholar

Crossref

PubMed

15

Monti

S

,

Savage

KJ

,

Kutok

JL

, et al. ,

Molecular profiling of diffuse large B-cell lymphoma identifies robust subtypes including one characterized by host inflammatory response.

,

Blood

,

2005

, vol.

105

(pg.

1851

-

1861

)

Google Scholar

Crossref

PubMed

16

Kari

L

,

Loboda

A

,

Nebozhyn

M

, et al. ,

Classification and prediction of survival in patients with the leukemic phase of cutaneous T cell lymphoma.

,

J Exp Med

,

2003

, vol.

197

(pg.

1477

-

1488

)

Google Scholar

Crossref

PubMed

17

Tracey

L

,

Villuendas

R

,

Dotor

AM

, et al. ,

Mycosis fungoides shows concurrent deregulation of multiple genes involved in the TNF signaling pathway: an expression profile study.

,

Blood

,

2003

, vol.

102

(pg.

1042

-

1050

)

Google Scholar

Crossref

PubMed

18

Tracey

L

,

Villuendas

R

,

Ortiz

P

, et al. ,

Identification of genes involved in resistance to interferon-alpha in cutaneous T-cell lymphoma.

,

Am J Pathol

,

2002

, vol.

161

(pg.

1825

-

1837

)

Google Scholar

Crossref

PubMed

19

Irizarry

RA

,

Bolstad

BM

,

Collin

F

,

Cope

LM

,

Hobbs

B

,

Speed

TP

. ,

Summaries of Affymetrix GeneChip probe level data.

,

Nucleic Acids Res

,

2003

, vol.

31

pg.

e15

Google Scholar

Crossref

PubMed

20

R Development Core Team

,

R: a language and environment for statistical computing.

,

2005

Vienna, Austria

R Foundation for Statistical Computing

Available: http://www.R-project.org

21

Gentleman

R

,

Carey

V

,

Bates

D

, et al. ,

Bioconductor: open software development for computational biology and bioinformatics.

,

Genome Biology

,

2004

, vol.

5

pg.

R80

Google Scholar

Crossref

PubMed

22

Shipp

MA

,

Ross

KN

,

Tamayo

P

, et al. ,

Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning.

,

Nat Med

,

2002

, vol.

8

(pg.

68

-

74

)

Google Scholar

Crossref

PubMed

23

Eisen

MB

,

Spellman

PT

,

Brown

PO

,

Botstein

D

. ,

Cluster analysis and display of genome-wide expression patterns.

,

Proc Natl Acad Sci U S A

,

1998

, vol.

95

(pg.

14863

-

14868

)

Google Scholar

Crossref

PubMed

24

Tamayo

P

,

Slonim

D

,

Mesirov

J

, et al. ,

Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation.

,

Proc Natl Acad Sci U S A

,

1999

, vol.

96

(pg.

2907

-

2912

)

Google Scholar

Crossref

PubMed

25

Monti

S

,

Tamayo

P

,

Mesirov

J

,

Golub

T

. ,

Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data.

,

Machine Learning

,

2003

, vol.

52

(pg.

91

-

118

)

Google Scholar

Crossref

26

Li

C

,

Wong

WH

. ,

Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection.

,

Proc Natl Acad Sci U S A

,

2001

, vol.

98

(pg.

31

-

36

)

Google Scholar

Crossref

PubMed

27

Reich

M

,

Liefeld

T

,

Gould

J

,

Lerner

J

,

Tamayo

P

,

Mesirov

JP

. ,

GenePattern 2.0.

,

Nat Genet

,

2006

, vol.

38

(pg.

500

-

501

)

Google Scholar

Crossref

PubMed

28

Golub

TR

,

Slonim

DK

,

Tamayo

P

, et al. ,

Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.

,

Science

,

1999

, vol.

286

(pg.

531

-

537

)

Google Scholar

Crossref

PubMed

29

Gould

J

,

Getz

G

,

Monti

S

,

Reich

M

,

Mesirov

JP

. ,

Comparative gene marker selection suite.

,

Bioinformatics

,

2006

, vol.

22

(pg.

1924

-

1925

)

Google Scholar

Crossref

PubMed

30

Benjamini

Y

,

Hochberg

Y

. ,

Controlling the false discovery rate: a practical and powerful approach to multiple testing.

,

J R Stat Soc Ser B

,

1995

, vol.

57

(pg.

289

-

300

)

Google Scholar

31

Ashburner

M

,

Ball

CA

,

Blake

JA

, et al. ,

Gene Ontology: tool for the unification of biology.

,

Nat Genet

,

2000

, vol.

25

(pg.

25

-

29

)

Google Scholar

Crossref

PubMed

32

Castillo-Davis

CI

,

Hartl

DL

. ,

GeneMerge: post-genomic analysis, data mining, and hypothesis testing.

,

Bioinformatics

,

2003

, vol.

19

(pg.

891

-

892

)

Google Scholar

Crossref

PubMed

33

Subramanian

A

,

Tamayo

P

,

Mootha

VK

, et al. ,

From the cover: gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

,

Proc Natl Acad Sci U S A

,

2005

, vol.

102

(pg.

15545

-

15550

)

Google Scholar

Crossref

PubMed

34

Mehrle

S

,

Frank

S

,

Schmidt

J

,

Schmidt-Wolf

IGH

,

Marten

A

. ,

SAP and SLAM expression in anti-CD3 activated lymphocytes correlates with cytotoxic activity.

,

Immunol Cell Biol

,

2005

, vol.

83

(pg.

33

-

39

)

Google Scholar

Crossref

PubMed

35

Willinger

T

,

Freeman

T

,

Herbert

M

,

Hasegawa

H

,

McMichael

AJ

,

Callan

MFC

. ,

Human naive CD8 T cells down-regulate expression of the WNT pathway transcription factors lymphoid enhancer binding factor 1 and transcription factor 7 (T cell factor-1) following antigen encounter in vitro and in vivo.

,

J Immunol

,

2006

, vol.

176

(pg.

1439

-

1446

)

Google Scholar

Crossref

PubMed

36

Castrop

J

,

van Wichen

D

,

Koomans-Bitter

M

, et al. ,

The human TCF-1 gene encodes a nuclear DNA-binding protein uniquely expressed in normal and neoplastic T-lineage lymphocytes.

,

Blood

,

1995

, vol.

86

(pg.

3050

-

3059

)

Google Scholar

Crossref

PubMed

37

Dorfman

DM

,

Greisman

HA

,

Shahsafaei

A

. ,

Loss of expression of the WNT/β-catenin-signaling pathway transcription factors lymphoid enhancer factor-1 (LEF-1) and T cell factor-1 (TCF-1) in a subset of peripheral T cell lymphomas.

,

Am J Pathol

,

2003

, vol.

162

(pg.

1539

-

1544

)

Google Scholar

Crossref

PubMed

38

Martinez

M

,

Joffraud

M

,

Giraud

S

, et al. ,

Regulation of PSGL-1 interactions with L-selectin, P-selectin, and E-selectin: role of human fucosyltransferase-IV and -VII.

,

J Biol Chem

,

2005

, vol.

280

(pg.

5378

-

5390

)

Google Scholar

Crossref

PubMed

39

Nakayama

F

,

Teraki

Y

,

Kudo

T

, et al. ,

Expression of cutaneous lymphocyte-associated antigen regulated by a set of glycosyltransferases in human T cells: involvement of α1,3-fucosyltransferase VII and β1,4-galactosyltransferase I.

,

J Invest Dermatol

,

2000

, vol.

115

(pg.

299

-

306

)

Google Scholar

Crossref

PubMed

40

Sedy

JR

,

Gavrieli

M

,

Potter

KG

, et al. ,

B and T lymphocyte attenuator regulates T cell activation through interaction with herpesvirus entry mediator.

,

Nat Immunol

,

2005

, vol.

6

(pg.

90

-

98

)

Google Scholar

Crossref

PubMed

41

Shi

G

,

Luo

H

,

Wan

X

,

Salcedo

TW

,

Zhang

J

,

Wu

J

. ,

Mouse T cells receive costimulatory signals from LIGHT, a TNF family member.

,

Blood

,

2002

, vol.

100

(pg.

3279

-

3286

)

Google Scholar

Crossref

PubMed

42

Wang

C-Y

,

Mayo

MW

,

Korneluk

RG

,

Goeddel

DV

,

Baldwin

AS

. ,

NF-B antiapoptosis: induction of TRAF1 and TRAF2 and c-IAP1 and c-IAP2 to suppress caspase-8 activation.

,

Science

,

1998

, vol.

281

(pg.

1680

-

1683

)

Google Scholar

Crossref

PubMed

43

Auer

RL

,

Starczynski

J

,

McElwaine

S

, et al. ,

Identification of a potential role for POU2AF1 and BTG4 in the deletion of 11q23 in chronic lymphocytic leukemia.

,

Genes Chromosomes Cancer

,

2005

, vol.

43

(pg.

1

-

10

)

Google Scholar

Crossref

PubMed

44

Weiss

RA

,

Eichner

R

,

Sun

TT

. ,

Monoclonal antibody analysis of keratin expression in epidermal diseases: a 48- and 56-kdalton keratin as molecular markers for hyperproliferative keratinocytes.

,

J Cell Biol

,

1984

, vol.

98

(pg.

1397

-

1406

)

Google Scholar

Crossref

PubMed

45

Haider

AS

,

Peters

SB

,

Kaporis

H

, et al. ,

Genomic analysis defines a cancer-specific gene expression signature for human squamous cell carcinoma and distinguishes malignant hyperproliferation from benign hyperplasia.

,

J Invest Dermatol

,

2006

, vol.

126

(pg.

869

-

881

)

Google Scholar

Crossref

PubMed

46

Li

Z

,

Godinho

FJ

,

Klusmann

J-H

,

Garriga-Canut

M

,

Yu

C

,

Orkin

SH

. ,

Developmental stage-selective effect of somatically mutated leukemogenic transcription factor GATA1.

,

Nat Genet

,

2005

, vol.

37

(pg.

613

-

619

)

Google Scholar

Crossref

PubMed

47

Yamanaka

K-i

,

Yawalkar

N

,

Jones

DA

, et al. ,

Decreased T-cell receptor excision circles in cutaneous T-cell lymphoma.

,

Clin Cancer Res

,

2005

, vol.

11

(pg.

5748

-

5755

)

Google Scholar

Crossref

PubMed

48

Yawalkar

N

,

Ferenczi

K

,

Jones

DA

, et al. ,

Profound loss of T-cell receptor repertoire complexity in cutaneous T-cell lymphoma.

,

Blood

,

2003

, vol.

102

(pg.

4059

-

4066

)

Google Scholar

Crossref

PubMed

2007

Sign in via your Institution

Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome

Abstract

Introduction

Patients, materials, and methods

Samples

Target preparation

Data analysis and filtering

Unsupervised analysis using consensus clustering

Supervised analysis

Gene pathway analysis

GO analysis.

GSEA.

Results

Identification of the consensus number of clusters

Characterization of the consensus clusters

Phenotype correlation.

Genes and gene pathways implicated in differentiating CTCL subgroups.

Histologic and immunophenotypic correlation.

Markers of clinical stage and response to treatment

Correlation with TREC and spectratyping data

Discussion

Acknowledgments

Authorship

References

Cited By

Email alerts

ASH Publications

American Society of Hematology

Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome Free

Abstract

Introduction

Patients, materials, and methods

Samples

Target preparation

Data analysis and filtering

Unsupervised analysis using consensus clustering

Supervised analysis

Gene pathway analysis

GO analysis.

GSEA.

Results

Identification of the consensus number of clusters

Characterization of the consensus clusters

Phenotype correlation.

Genes and gene pathways implicated in differentiating CTCL subgroups.

Histologic and immunophenotypic correlation.

Markers of clinical stage and response to treatment

Correlation with TREC and spectratyping data

Discussion

Acknowledgments

Authorship

References

This feature is available to Subscribers Only

My Account

Cited By

Email alerts

ASH Publications

American Society of Hematology

This Feature Is Available To Subscribers Only

Lesional gene expression profiling in cutaneous T-cell lymphoma reveals natural clusters associated with disease outcome