Human embryonic stem (huES) cells have the ability to differentiate into a variety of cell lineages and potentially provide a source of differentiated cells for many therapeutic uses. However, little is known about the mechanism of differentiation of huES cells and factors regulating cell development. We have used high-quality microarrays containing 16 659 seventy–base pair oligonucleotides to examine gene expression in 6 of the 11 available huES cell lines. Expression was compared against pooled RNA from multiple tissues (universal RNA) and genes enriched in huES cells were identified. All 6 cell lines expressed multiple markers of the undifferentiated state and shared significant homology in gene expression (overall similarity coefficient > 0.85). A common subset of 92 genes was identified that included Nanog, GTCM-1, connexin 43 (GJA1), oct-4, and TDGF1 (cripto). Gene expression was confirmed by a variety of techniques including comparison with databases, reverse transcriptase–polymerase chain reaction, focused cDNA microarrays, and immunocytochemistry. Comparison with published “stemness” genes revealed a limited overlap, suggesting little similarity with other stem cell populations. Several novel ES cell–specific expressed sequence tags were identified and mapped to the human genome. These results represent the first detailed characterization of undifferentiated huES cells and provide a unique set of markers to profile and better understand the biology of huES cells. (Blood. 2004;103:2956-2964)

Embryonic stem cells derived from the inner cell mass, or embryoblast, of the human embryo provide a potential source of differentiated cells for a variety of therapeutic uses.1  These cells are capable of unlimited symmetrical self-renewal, maintain clonality, and may have the ability to treat a host of degenerative diseases including Parkinson disease and diabetes.2,3  Human embryonic stem (huES) cells were first isolated and successfully propagated in 1998.4  Among 71 independent huES cell lines identified worldwide, 11 cell lines are currently available for research purposes with limited published data on their culture and differentiation characteristics.5,6  Much of the information on ES cells has been derived from studies on mouse ES cells. These include reports identifying conditions that promote differentiation of mouse ES cells into heart, blood, muscle, blood vessels, brain, and insulin-producing islet cells.2,7-10  Far fewer studies have been performed with huES cells. While some properties of huES cells may potentially be extrapolated from experiments performed in murine ES cell cultures, observed differences between human and mouse biology may lead to identification of key differences between these 2 cell types. The first step in the characterization of huES cells will involve identification of a set of ES cell–specific genes that may function as markers or identify unique regulatory pathways.

Microarray is a new and powerful technology that can monitor the expression of thousands of genes at once, providing a tool for large-scale analysis of the molecular status of a particular cell line in the steady state.11,12  Further, the availability of a well-annotated genome database suggests that global gene-expression profiling of huES cell cultures will allow one to map and identify genes specific to huES cells and will reveal novel insights into the behavior of ES cells.13-15  Therefore, in this study, we have profiled 6 available huES cell lines by oligonucleotide microarrays. The expression of defined genes was confirmed by reverse transcriptase–polymerase chain reaction (RT-PCR), immunocytochemistry, focused microarrays and comparison to various databases maintained at the National Cancer Institute (NCI), and by comparison with an expressed sequence tag (EST) enumeration database of huES cells (R.B., et al, unpublished data, August 2003). We show that these huES cell lines are, overall, similar to each other and express a unique molecular signature of 92 genes.

Isolation and growth of ES cells

Human ES cell lines were obtained from Bresagen (Athens, GA), WiCell (Madison, WI), Dr Itskovitz-Eldor (Haifa, Israel), and Geron (Menlo Park, CA) and maintained according to the providers' protocols.16-18  ES cell derivation, culture conditions, and their characteristics are listed in Table 1. In brief, 5 huES cell lines (GE01, GE09, BG01, BG02, and TE06) were maintained on inactivated mouse embryonic fibroblast (MEF) feeder cells in Dulbecco modified Eagle medium (DMEM) supplemented with 15% fetal bovine serum (FBS), 5% knockout serum replacement (KSR), 2 mM nonessential amino acids, 2 mM l-glutamine, 50 μg/mL Pen-Strep (all from Invitrogen, Carlsbad, CA), 0.1 mM β-mercaptoethanol (Specialty Media, Philipsburg, NJ), and 4 ng/mL basic fibroblast growth factor (bFGF; Sigma, St Louis, MO). Cells were passaged by incubation in cell dissociation buffer (Invitrogen), dissociated, and then seeded at about 20 000 cells/cm². The 3 cell lines GE01, GE07, and GE09 were also cultured with MEF-conditioned medium as described,19  and RNA was pooled from the 3 cell lines to produce the pooled embryonic stem (PES) cell sample.

Table 1.

ES cell derivation, culture conditions, and characteristics


Cell line   Derivation   Serum   Feeder   Cell harvest/passaging*   Characteristics/comments
PES   Frozen blastocyst, prior to August 9, 2001   Yes   No (CM)   RNA prepared from pooled samples of GE01, GE07, and GE09 cell lines. Every fourth day with trypsin   Expression of undifferentiated markers, Xu et al19
BG01   Frozen blastocyst, prior to August 9, 2001   Yes   MEF   Every third day with collagenase   Expression of undifferentiated markers, Zeng et al20
BG02   Frozen blastocyst, prior to August 9, 2001   Yes   MEF   Every third day with collagenase   Expression of undifferentiated markers, Zeng et al20
GE01   Frozen blastocyst, prior to August 9, 2001   Yes   MEF   Every fifth day with collagenase   Expression of undifferentiated markers, Thomson et al4
GE09   Frozen blastocyst, prior to August 9, 2001   Yes   MEF   Every third day with collagenase   Expression of undifferentiated markers, Amit and Itskovitz-Eldor21
TE06   Frozen blastocyst, prior to August 9, 2001   Yes   MEF   Every third day with collagenase   Expression of undifferentiated markers, Amit and Itskovitz-Eldor21
EB   Prepared from PES   Yes   No (CM)   Embryoid bodies prepared from 3 independent experiments from 3 different cell lines   Carpenter et al22,23


To produce conditioned media, MEFs were grown as described20  for less than 5 passages and then growth arrested with mitomycin C or irradiation. Medium was replaced with ES growth medium and the supernatant was collected after 48 hours. The medium was pooled and used fresh or frozen in pooled aliquots. SR indicates serum; EB, embryoid body; MEF, mouse embryonic fibroblast; and CM, conditioned medium.

*Cells were harvested after digesting with enzymes as indicated for each cell line and maintained as described. Prior to harvesting for microarray analysis, expression of ES cell markers and the absence of markers of differentiation were assessed by RT-PCR and immunocytochemistry (data not shown). RNA of EB for EST enumeration analysis was prepared by pooling samples from 3 different (GE01, GE07, and GE09) cell lines.

Embryoid body outgrowths (EB) were prepared from GE01, GE07, and GE09 cells as described (R.B. et al, unpublished data, August 2003). Briefly, confluent plates of undifferentiated hES cells were used to generate EBs by a brief exposure to collagenase IV; small clusters of cells were obtained by scraping with a pipette. Cell clusters were resuspended in differentiation medium (KO-DMEM supplemented with glutamine, nonessential amino acids (NEAAs), and beta mercaptoethanol (BME) as described for the undifferentiated huES cells,19  with 20% FBS in place of 20% serum replacement (SR) and no preconditioning by MEFs) and transferred to individual wells of low-adhesion 6-well plates (Costar, San Diego, CA). After 4 days in suspension, cells were transferred to typical tissue-culture 6-well plates precoated with gelatin. Cells were harvested for the preparation of cytoplasmic RNA on day 8.

Immunocytochemistry

Immunocytochemistry was performed following the procedure previously described.24  HuES cells were fixed with formalin for 30 minutes and stained for CD24 and GTCM-1 expression using an appropriate concentration of antibodies diluted in phosphate-buffered saline (PBS) containing 1% bovine serum albumin (BSA). Fluorescent conjugated secondary antibodies (Jackson Immunologicals, Raritan, NJ) were used to detect expression.

RT-PCR analysis

Total RNA derived from 6 huES cell lines was subjected to RT-PCR analysis as previously described.25  β-actin and glyceraldehyde 3 phosphate dehydrogenase (G3PDH) mRNA amplified from these samples served as internal controls. The primers used are listed in a supplemental table (Table S1 in the Supplemental Document; see the Supplemental Materials link at the top of the online article on the Blood website). The thermocycler conditions used for amplification were 94°C, 10-minute hot start; 94°C, 45 seconds; 48°C, 30 seconds; and 72°C, 1 minute. Amplification products (10 μL) were resolved in a 2% agarose gel, stained with ethidium bromide (EtBr), visualized in a transilluminator, and photographed.

Microarray analysis

We followed MIAME (minimum information about a microarray experiment) guidelines for the presentation of our data.26 

High-quality oligonucleotide glass arrays were produced containing a total of 16 659 seventy-mer oligonucleotides chosen from 750 bases of the 3′ end of each open reading frame (ORF). The array includes probes for 2121 hypothetical proteins and 18 ESTs, spans approximately 50% of the human genome, and is one of the largest verified sets available (Operon, Valencia, CA). The arrays were fabricated in-house by spotting oligonucleotides on poly-l-lysine–coated glass slides by Gene Machines robotics system (Omnigrid, San Carlos, CA).

Probe preparation. Total huES-derived RNA was isolated by using Trizol reagent (Invitrogen). Total human universal RNA (huURNA) isolated from a collection of adult human tissues to represent a broad range of expressed genes from both male and female donors (BD Biosciences, Palo Alto, CA) served as a universal reference control in the competitive hybridization. Labeled cDNA probes were produced as described.27  Briefly, 20 μg total RNA was incubated at 70°C for 5 minutes along with 1 μL oligo dT and quickly chilled for 3 minutes. Then, 3 μL of 10× first-strand buffer, 2 μL SSII enzyme (Stratagene, La Jolla, CA), 2 μL of 20× aminoallyl deoxyribonucleotide triphosphate (dNTP), and 3 μL of 0.1 M dithiothreitol (DTT) were added and incubated for 90 minutes at 42°C for reverse transcription. After incubation, the volume of the mixture was increased to 50 μL with 20 μL diethyl pyrocarbonate (DEPC) water.

cDNA was purified by MinElute column (Qiagen, Valencia, CA). After washing, the probe was eluted by 15 μL elution buffer, centrifuged for 1 minute, and dried by speed-vac for 14 to 15 minutes (probe should not be overdried). Finally, 5 μL of 2× coupling buffer and 5 μL Cy3 and Cy5 dye were mixed into the control (huURNA) and experimental cDNAs (huES cell–derived), and incubated at room temperature in the dark for 1 hour. After incubation, the volume was raised to 50 μL by water and then cDNA was purified by MinElute column once again, eluted with 12 μL elution buffer, centrifuged to collect the cDNA probes, and then both probes combined.

Prehybridization and hybridization. Arrays were prehybridized with 50 μL prehybridization buffer (25 μL 20× SSC, 20 μL 5% BSA, 54 μL DEPC water, and 1 μL 10% sodium dodecyl sulfate [SDS]) under a coverslip for 1 hour at 42°C, washed with dH2O and isopropanol (2 minutes in each one), spin-dried, and kept in a clean box at room temperature.

For hybridization, 34 μL hybridization mixture (24 μL cDNA mixture, 1 μL [10 μg] COT-1 DNA, 1 μL [8-10 μg] poly(dA), 1 μL [4 μg] yeast tRNA, 6 μL 20× SSC, and 1 μL 10% SDS) was preheated at 100°C for 2 minutes and cooled for 1 minute (by centrifugation at maximum speed). Total volume of probe was added on dried (prehybridized) array and covered with coverslip (22 mm × 40 mm). Slides were placed in hybridization chambers and incubated at 65°C in a water bath overnight (10-16 hours). Then, slides were washed for 2 minutes each in 2× SSC, 1× SSC, and 0.2× SSC, and spin dried.

Data filtration, normalization, and analysis. Microarray slides were scanned in both Cy3 (532 nm) and Cy5 (635 nm) channels using an Axon GenePix 4000B scanner (Axon Instruments, Foster City, CA) with a 10-μm resolution. Scanned microarray images were exported as TIFF files to GenePix Pro 3.0 software for image analysis. The raw images were collected at 16-bit/pixel resolution, which displayed all pixels in a 0 to 65 535 count dynamic range. The area surrounding each spot image was used to calculate a local background, which was subtracted from each spot before Cy5/Cy3 ratio calculation. The average of the resulting total Cy3 and Cy5 signal gave a ratio that was used to normalize the signals. Each microarray experiment was globally normalized to make the median value of the log-2 ratio equal to zero. The normalization process corrects for dye bias, PMT (photomultiplier tube) voltage imbalance, and variations between channels in the amounts of the labeled cDNA probes hybridized. The data files representing the differentially expressed genes were then created.
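The background subtraction and global median normalization described above can be illustrated with a minimal sketch. It assumes each spot is represented by foreground and local-background intensities for both channels; the field names (`cy5_fg`, `cy5_bg`, `cy3_fg`, `cy3_bg`) and the intensity values are hypothetical, and the code is not the GenePix Pro or mAdb implementation.

```python
import math
from statistics import median


def log2_ratios(spots):
    """Return background-subtracted log2(Cy5/Cy3) ratios for usable spots.

    Each spot is a dict holding foreground and local-background intensities
    for both channels (hypothetical field names chosen for this sketch).
    """
    ratios = []
    for spot in spots:
        cy5 = spot["cy5_fg"] - spot["cy5_bg"]
        cy3 = spot["cy3_fg"] - spot["cy3_bg"]
        # A log ratio is undefined for non-positive signal, so skip such spots.
        if cy5 > 0 and cy3 > 0:
            ratios.append(math.log2(cy5 / cy3))
    return ratios


def median_center(ratios):
    """Shift log2 ratios so that their median equals zero."""
    m = median(ratios)
    return [r - m for r in ratios]


# Invented intensities: corrected Cy5/Cy3 ratios of 4, 2, and 1.
spots = [
    {"cy5_fg": 900, "cy5_bg": 100, "cy3_fg": 300, "cy3_bg": 100},
    {"cy5_fg": 500, "cy5_bg": 100, "cy3_fg": 300, "cy3_bg": 100},
    {"cy5_fg": 300, "cy5_bg": 100, "cy3_fg": 300, "cy3_bg": 100},
]
normalized = median_center(log2_ratios(spots))
print(normalized)
```

After centering, a value of 0 corresponds to the typical spot on the array, so positive values indicate relative enrichment in the Cy5-labeled sample.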

For advanced data analysis, data files (in gpr format) and images (in jpeg format) were imported into the microarray database (mAdb) and analyzed with software tools provided by the National Institutes of Health Center for Information Technology. Only spots with a confidence interval of 99% (> 3-fold), a fluorescence intensity of at least 150 in both channels, and a size of 30 μm were considered good spots for analysis. These filters limited the effect of poor-quality spots on the data analysis.
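The spot-quality criteria can be expressed as a simple predicate. The sketch below is illustrative only: the field names are hypothetical, the size criterion is assumed to be a minimum diameter, and the fold-change criterion is assumed to apply in either direction, neither of which is specified in the text.

```python
def is_good_spot(spot, min_intensity=150, min_fold=3.0, min_diameter_um=30):
    """Return True if a spot passes the intensity, size, and fold-change filters.

    `spot` is a dict with background-corrected `cy5` and `cy3` intensities and
    a `diameter_um` field (hypothetical names chosen for this sketch).
    """
    # Require a minimum signal in both channels before computing a ratio.
    if spot["cy5"] < min_intensity or spot["cy3"] < min_intensity:
        return False
    # Assumption: spots smaller than the stated size are rejected.
    if spot["diameter_um"] < min_diameter_um:
        return False
    ratio = spot["cy5"] / spot["cy3"]
    # Assumption: a change of more than 3-fold in either direction qualifies.
    return max(ratio, 1 / ratio) > min_fold


# Invented example spots.
spots = [
    {"gene": "A", "cy5": 1200, "cy3": 300, "diameter_um": 90},  # 4-fold
    {"gene": "B", "cy5": 400, "cy3": 300, "diameter_um": 90},   # under 3-fold
    {"gene": "C", "cy5": 900, "cy3": 120, "diameter_um": 90},   # Cy3 too dim
    {"gene": "D", "cy5": 2000, "cy3": 200, "diameter_um": 20},  # spot too small
]
good = [s["gene"] for s in spots if is_good_spot(s)]
print(good)
```

Checking intensity before dividing also avoids a division by zero for empty spots.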

Focused microarray analysis

Nonradioactive GEArray Q series cDNA expression array filters for human stem cell genes and transforming growth factor β/bone morphogenic protein 1 (TGFβ/BMPl) pathway genes (Hs601 and Hs023; SuperArray Biosciences, Frederick, MD) were used according to the manufacturer's protocol.28  The biotin dUTP-labeled cDNA probes were generated by using gene-specific primers, total RNA (4 μg), and 200 U Moloney murine leukemia virus–derived reverse transcriptase (Promega, Madison, WI). The array filters were hybridized with biotin-labeled probes at 60°C for 17 hours, washed twice with 2× SSC/1% SDS and then twice with 0.1× SSC/1% SDS at 60°C for 15 minutes each. Chemiluminescent detection steps were performed by subsequent incubation of filters with alkaline phosphatase–conjugated streptavidin and CDP-Star substrate. Array membranes were exposed to x-ray film. Quantification of the gene expression on the array was performed with ScionImage software. Mode optical density (OD) of each gene/spot was calculated and normalized to expression.

EST enumeration

EST frequency counts of genes expressed in human ES cells were done as described (R.B. et al, unpublished data, August 2003). Briefly, cDNA libraries of hES cell lines GE01, GE07, and GE09 grown in feeder-free conditions, and of EBs derived from the same 3 cell lines, were constructed and submitted for EST sequencing. The EST sequences were assembled into overlapping sequence assemblies and mapped to the UniGene database of nonredundant human transcripts. Expression levels were assessed by counting the number of ESTs for a particular gene that were derived from the undifferentiated hES cells and comparing them to the number of ESTs derived from the EB sample. Statistical significance was determined using the Fisher exact test,29  with a P value of less than or equal to .05 considered significant.
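As an illustration of how EST counts from two libraries can be compared, the following sketch implements a two-sided Fisher exact test on a 2 × 2 table. The library sizes (10 000 ESTs each) are hypothetical placeholders, because the actual library sizes are not given here, and whether the original analysis used a one-sided or two-sided test is not stated; the resulting P values are therefore illustrative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    a: ESTs for the gene in library 1, b: all other ESTs in library 1,
    c: ESTs for the gene in library 2, d: all other ESTs in library 2.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the gene's ESTs fall in library 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical library sizes; not taken from the study.
ES_LIBRARY_SIZE = 10_000
EB_LIBRARY_SIZE = 10_000


def est_p_value(es_count, eb_count):
    """P value for a gene with the given EST counts in the ES and EB libraries."""
    return fisher_exact_two_sided(
        es_count, ES_LIBRARY_SIZE - es_count,
        eb_count, EB_LIBRARY_SIZE - eb_count,
    )


p_high = est_p_value(20, 1)  # many ESTs in ES, few in EB
p_low = est_p_value(3, 1)    # low counts in both libraries
print(p_high, p_low)
```

With low counts, the test has little power regardless of the true expression difference, which is why low-copy genes can fail to reach significance.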

Expression profiling of human huES cells

Expression of ES cell markers. The undifferentiated state of cultured cells was analyzed and confirmed by both RT-PCR and immunocytochemistry analyses to detect the presence of undifferentiated cell markers (Oct3/4, Sox2, Rex1, UTF1, hTERT, ABCG2, CD24, Cx43, and Cx45; Figure 1A and data not shown) and the absence of early markers of differentiation such as GATA-2, -4, nestin, GFAP, Sox-1, myf5, Pdx-1, and myoD (data not shown).

Figure 1.

Expression profiling of 6 huES cell lines. (A) RT-PCR analysis of 6 ES cell lines showed consistent expression of markers of undifferentiated cells. PES cells are a pooled sample of 3 huES cell lines (GE01, GE07, and GE09). G3PDH served as an internal control. (B) Scatter plot analysis of cy5- and cy3-labeled genes in BG02 huES cells and huURNA sample indicating differential gene expression. Some significantly overexpressed ES cell genes are listed. (C) Hierarchical clustering of genes that were expressed 3-fold or at a higher level compared with huURNA. The color indicates the relative expression levels of each gene, with red indicating higher expression, green indicating negative expression, and black representing absent expression. The 5 genes as indicated by the arrows were not present in all cell lines. The minimum spot intensity for all genes was set at 150 fluorescence units except for the CER1 and DNMT3B genes; the minimum intensity was set at 100 fluorescence units for this analysis only to compare expression with other genes within the same array.


Expression profiling by microarray. Gene expression patterns in huES cell lines were assessed by comparing expression to huURNA using oligonucleotide glass arrays. Arrays with low background and optimized linear dye response were used for all experiments (not shown). Table 2 shows that ES cell lines expressed from 420 to 1014 genes at 3-fold or higher levels compared with huURNA. The scatter plot analysis of a hybridization profile from one cell line compared with huURNA demonstrated many differences (Figure 1B). Raw images and scatter plots of all experiments are available at http://www.grc.nia.nih.gov/branches/lns/scbudata.htm. While most genes are expressed at detectable levels when arrays are probed with huURNA, far fewer genes are found to be expressed when probed with RNA from huES cells (data not shown). Repeated hybridization experiments showed a correlation coefficient of more than 0.92, indicating high reproducibility. Multidimensional scaling (principal component analysis) of all array data from all 6 huES cell lines showed clustering of genes close to each other in one plane (data not shown), confirming the high correlation of gene expression among the huES cell lines despite differences in their growth and culture conditions. A high degree of correlation of gene expression in all 6 cell lines was also confirmed by hierarchical clustering analysis (Figure 1C). This analysis showed that 6 huES cell lines clustered tightly together, indicating a similar expression profile. Cluster analysis also highlighted an additional 5 ES cell–specific genes that were not expressed in all 6 cell lines, but were expressed at high levels in 4 cell lines. Genes that were similar in all cell populations at the 99% confidence interval but different from huURNA were considered to represent candidate huES cell–enriched genes.
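The reproducibility check between repeated hybridizations can be sketched as a correlation of paired log-2 ratios. The text does not state which correlation coefficient was used; the example below assumes the Pearson coefficient, and the replicate values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented log2 ratios for the same six spots on two replicate arrays.
replicate_1 = [2.1, 0.3, -1.2, 3.4, 0.0, 1.8]
replicate_2 = [1.9, 0.5, -1.0, 3.1, 0.2, 1.6]
r = pearson(replicate_1, replicate_2)
print(round(r, 3))
```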

Table 2.

Total number of genes overexpressed (≥ 3-fold) in huES cells compared with huURNA


ES cell lines   No. of genes expressed in huES cells compared with huURNA
BG01   446
BG02   1014
GE01   612
GE09   851
PES   422
TE06   490


Comparison of genes overexpressed at the 3-fold or higher level identified 92 that were enriched and common to all 6 huES cell lines (overall similarity coefficient > 0.85; Table 3, see also Table S2 in the Supplemental Document). These 92 genes constitute a molecular signature (“stemness”) of the huES cells tested and should be examined in all other available huES cell lines. Further analysis and organization of these potential “stemness” genes suggested several overall themes: in all 6 huES cell lines, (1) several genes known to be expressed in mouse ES cells or human ES cells13-15  were expressed, including POU-domain transcription factor (Oct3/4), Nanog, Cripto/TDGF1, GTCM-1, galanin, and connexin 43/GJA1, while markers of differentiation were absent; (2) ribosomal protein transcripts were overexpressed, as were several DNA repair enzymes, while genes active in the p53 and retinoblastoma pathways were absent or expressed at low levels; (3) modulators of wnt signaling, the activin superfamily, and components of retinoid signaling were abundant; (4) several zinc finger transcription repressors that appeared specific to ES cells were present; (5) cell cycle regulatory genes such as cyclin C, cyclin B1, and CDC20 were elevated, while inhibitors of the cell cycle such as p16 and p21 were low or absent; and (6) LIN-28, a heterochronic regulator of differentiation,30  and 3 other genes of unknown function were elevated 3-fold or higher (see below).
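The selection of a common gene set can be sketched as an intersection of per-line lists of overexpressed genes. In the example below, the cell line labels and fold-change values are invented, and the function names are hypothetical; it shows only the set logic, not the mAdb analysis itself.

```python
from collections import Counter


def enriched_genes(fold_by_gene, min_fold=3.0):
    """Genes whose fold change versus the reference meets the cutoff."""
    return {gene for gene, fold in fold_by_gene.items() if fold >= min_fold}


def shared_genes(folds_by_line, min_lines, min_fold=3.0):
    """Genes enriched in at least `min_lines` of the profiled cell lines."""
    counts = Counter()
    for fold_by_gene in folds_by_line.values():
        # Each line contributes at most one count per gene.
        counts.update(enriched_genes(fold_by_gene, min_fold))
    return {gene for gene, n in counts.items() if n >= min_lines}


# Invented fold changes for three hypothetical cell lines.
folds_by_line = {
    "line_1": {"NANOG": 8.0, "TDGF1": 5.5, "CD24": 4.0, "GAPDH": 1.0},
    "line_2": {"NANOG": 6.2, "TDGF1": 3.0, "CD24": 2.1, "GAPDH": 1.1},
    "line_3": {"NANOG": 7.1, "TDGF1": 4.4, "CD24": 3.6, "GAPDH": 0.9},
}
common = shared_genes(folds_by_line, min_lines=len(folds_by_line))
in_most = shared_genes(folds_by_line, min_lines=2)
print(sorted(common), sorted(in_most))
```

Lowering `min_lines` reproduces the second kind of list described in the text: genes that pass the cutoff in most, but not all, cell lines.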

Table 3.

Expression profiling of 6 huES cell lines


Category*   Common genes expressed in all 6 huES cell lines   Genes overexpressed but not in all 6 cell lines
Markers of undifferentiated huES cell lines (6)   GJA1, Podocalyxin-like (GTCM-1), Nanog, galanin§, Oct 4 (POU5F1), GDF3§   SOX2, DNMT3B, ACVR2B
Differentiation markers (5)   Keratin 8, Keratin 18, ACTC, TUBB-5, TNNT1
Cell signaling/cell cycle/cell growth/cellular process (19)   PSIP1§, SFRP2, TDGF1, Sema6a, CRABP1, CRABP2, NME2, IMP-2, CCNB1, LEFTB, Calumenin, CDC2, LIN-28, MAD2L2, PITX2§, STK12, PTTG1, SET, BRIX   CD24, CER1
Metabolism: DNA and RNA related (21)   RPS24, RPL4, RPL6, RPL7, RPLP0, RPL24, NBR2, HDAC2, HMGIY, HMGB2, KPNA2, SNRPF, DDX21, Nucleostemin, SSB, HNRAPB, RAMP, EPRS, EIF4A1, NPM1, JADE-1
Metabolic pathways (23)   SMS, SPS, CYP26A1, HSPA4, ELOVL6, FABP5, IMPDH2, IDH1, LDHB, LAPTM4B, MGST1, CCNC, MTHFD1, MTHFD2, TK1, PSMA2, PSMA3, HSSG1, SERPINH1, SLC16A1, KIF4A, CCT8, TUBB4
Others (3)   NASP, DSG2, LRRN1
Zinc finger protein (3)   Zinc finger protein 43, znf 117, znf 257
Hypothetical protein (12)   KIAA1573, MGC27165, GSH1§, C20orf1, C15orf15, C20orf129, TD60, HNRPA1, ARL8, Ribosomal s40A lamin R, Numatrin, PPAT

*Numbers in parentheses represent the number of genes under each category.

A common set of genes (n = 92) elevated in all 6 huES cell lines (TE06, GE01, GE09, BG01, BG02, and PES) at a 99% confidence interval (3-fold or higher) are shown.

ES cell-specific genes were not shown to be expressed (≥ 3-fold) in all 6 huES cell lines.

§These genes showed no expression by EST enumeration.

These genes are not expected to be present in undifferentiated ES cells because they are known differentiation markers and we used undifferentiated cells.

While many known markers of ES cells were detected, some known markers of undifferentiated ES cells did not meet our 99% cutoff criterion in all huES cell lines examined. For example, CD24, DNMT3B, SOX2, ACVR2B, and CER1 showed more than 3-fold expression in 4 cell lines but not in 2 others (Table 3). Similarly, utf1, Connexin 45, and LIFR showed less than 2-fold expression and, consequently, did not meet our cutoff criterion in all 6 cell lines. On the other hand, while REX-1, Fox-D3, TERT, and Foxh1 were present on the array, they were not detected in any cell line (Table S2 in the Supplemental Document). The expression of many of these genes was also confirmed by RT-PCR, immunocytochemistry, and focused microarray analyses. Dppa5 (ESG1) was readily detected by RT-PCR (Figure 1A), but the Dppa5 coding sequences were not represented on the array.

Confirmation of gene expression by EST enumeration, RT-PCR, and immunocytochemistry

To further confirm the fidelity of our results we compared the expression of 92 genes that were elevated in all 6 lines with an EST enumeration database of huES cells generated using pooled RNA samples of the cell lines GE01, GE07, and GE09 grown in feeder cell–free conditions (R.B. et al, unpublished data, August 2003). Expression of 77 of the 92 genes (see Tables S4, S5, and S6 in the Supplemental Document) was confirmed. The 15 genes that were not detected in huES cells by EST enumeration included PSIP1, galanin, GSH1, GDF3, RPL24, RPL4, SNRPF and others (see Tables S4 and S5 in the Supplemental Document). Failure to detect expression likely represented a lack of sensitivity of the EST analysis as expression of 10 genes could be confirmed by either RT-PCR or focused microarrays (Figure 2A-B, Figure 4A). We compared the expression of 77 genes detected by EST enumeration in huES cells to their expression in EBs derived from the same cell lines (R.B. et al, unpublished data, August 2003). Of the 77 genes that are elevated in all 6 huES cell lines and present in the EST enumeration database, 14 genes are significantly up-regulated in the huES cells compared with the EBs, and 2 are down-regulated (Table 4; Table S5 in the Supplemental Document).

Figure 2.

Verification of microarray results. (A) A subset of the genes, including 5 early differentiation markers overexpressed in all the huES cell lines by microarray, were confirmed by RT-PCR. (B) RT-PCR analysis of PSIP1 and GDF3 genes. β-actin served as internal control and reverse transcriptase (RT) as a no-enzyme negative control. (C-D) Protein expression of 2 undifferentiated markers of huES cells was confirmed by immunocytochemistry. Undifferentiated human ES cells (GE01) were cultured, fixed, and processed for immunocytochemistry using specific antibodies to CD24 and GTCM-1 (original magnification, × 20).

Figure 4.

RT-PCR confirmation of various novel genes enriched in all 6 huES cell lines. (A) Expression of Nanog (FLJ12581) and 3 novel genes (KIAA1573, MGC27165, and GSH1), (B) KIAA1265 and Zf43 genes, and (C) TNNT1, Laminin receptor, ARL8, PPAT, Numatrin, HNRPA1, and TD-60. β-actin served as an internal control and RT without input DNA as a negative control.

Table 4.

Analysis of 28 common genes overexpressed in pooled huES cell lines compared with EB by EST enumeration


Gene name   Accession no.   ES   EB   ES/EB
HSPA4   AB023420   9   0   OE*
ELOVL6   NM_024090   5   0   OE*
LEFTB   AF081507   4   0   OE
PTTG1   AJ223953   4   0   OE
CYP26A1   AF005418   4   0   OE
PSMA3   D00762   3   0   OE
C20orf129   BC001068   3   0   OE
FABP5   M94856   3   0   OE
KIAA1573   AB046793   2   0   OE
ZNF257   AF070651   2   0   OE
FLJ12581/Nanog   NM_024865   2   0   OE
CRABP1   S74445   1   0   OE
TDGF1   X14253   20   1   OE*
SEMA6A   AF225425   16   1   OE*
NASP   M97856   58   7   OE*
CCNB1   M25753   8   1   OE*
SFRP2   AF311912   7   1   OE*
RPL17   X53777   7   1   OE*
JADE-1   NM_024900   6   1   OE
RAMP   NM_016448   6   1   OE
NS   NM_014366   5   1   OE
TNNT1   M19309   5   1   OE
MGC27165   J04164   4   1   OE
HDAC2   U31814   4   1   OE
KPNA2   U28386   19   6   OE*
SLC16A1   AL162079   22   7   OE*
CCNC   M740914   3   1   OE
MGST1   J03746   3   1   OE


EST enumeration was done as described in R.B. et al (unpublished data, August 2003). Columns ES and EB indicate the number of ESTs found for a particular gene. The Fisher exact test (P ≤ .05) was used to determine whether the difference in EST number is likely to reflect a real difference in expression levels. OE indicates a gene overexpressed in the ES cells. For lower copy number (5 ESTs and below in ES and 0 in EB, or fewer than 7 ESTs in ES and 1 in EB) the test cannot determine significance, even though the differences might be real. The signal of all genes was ≥ 3-fold higher compared with huURNA in microarray experiments.

* Genes significantly overexpressed in ES compared with EB.
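The EST comparison in Table 4 can be illustrated with a short sketch. The Python below is not the authors' code. It computes a one-sided Fisher exact P value from the hypergeometric distribution. Both the one-sided form and the library sizes (10 000 ESTs per library) are assumptions made for illustration, because the text reports neither.

```python
from math import comb


def fisher_one_sided(es_count, eb_count, es_total, eb_total):
    """P value for seeing at least es_count of a gene's ESTs in the ES library,
    assuming the gene is equally expressed in ES and EB (null hypothesis)."""
    hits = es_count + eb_count      # ESTs for this gene across both libraries
    total = es_total + eb_total     # all ESTs sequenced in both libraries
    denom = comb(total, es_total)   # ways to pick the ES library from all ESTs
    numer = 0
    for k in range(es_count, min(hits, es_total) + 1):
        # k of the gene's ESTs land in the ES library, the rest in EB
        numer += comb(hits, k) * comb(total - hits, es_total - k)
    return numer / denom


# Hypothetical library sizes; the article does not report them.
ES_TOTAL = 10_000
EB_TOTAL = 10_000

# EST counts taken from Table 4 (ES, EB).
for gene, es, eb in [("HSPA4", 9, 0), ("CCNC", 3, 1)]:
    p = fisher_one_sided(es, eb, ES_TOTAL, EB_TOTAL)
    print(f"{gene}: P = {p:.4f}, significant at .05: {p <= 0.05}")
```

With only a handful of ESTs per gene the test has little power, which is the limitation the footnote describes for genes with low copy number.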

As presented in “Expression profiling of human huES cells,” most markers of differentiation were not expressed at detectable levels in the huES cell lines. However, 5 genes thought to be specific for differentiation appeared to be present at high levels in all the huES cell lines (keratin 8, keratin 18, beta tubulin 5, cardiac actin, and troponin T1). Expression of keratin 8, keratin 18, beta tubulin 5, and cardiac actin in huES cells was confirmed by RT-PCR (Figure 2A). We also compared the expression levels of these genes in huES cells with their expression in EBs by EST enumeration (Table S3 in the Supplemental Document). All 5 genes were detected in huES cells by EST enumeration. Two of the 5 genes were significantly down-regulated in huES cells (keratin 8 and keratin 18), and one was significantly up-regulated in huES cells (troponin T1) compared with EBs. Cardiac actin and beta tubulin 5 had fewer ESTs in the huES sample, but the difference was not statistically significant. The expression of early markers of differentiation in the 6 huES cell lines is not unexpected, as these cell lines have been in culture for a prolonged period of time, and a certain percentage of cells may have differentiated as a result of culture and manipulation. In addition, it is possible that certain genes are transcribed but not translated into protein. Future studies will explore the functional significance of these differentiation genes and their protein expression in huES cells. Nevertheless, these results suggest that under current culture conditions, these genes may represent the earliest and most sensitive markers of differentiation.

The expression of several genes that we found to be present at high levels in all 6 human ES lines but that had not been previously described, such as CD24 and GTCM-1 (podocalyxin-like), was confirmed by immunocytochemistry (Figure 2C-D and data not shown), RT-PCR (data not shown and Cai et al31), and focused microarray analysis (Table 5; data not shown; and Luo et al32). A representative array result analyzing the TGFβ superfamily pathway is shown (Table 5). The arrays confirm that cripto, Lefty A and B, and noggin are enriched in huES cells and may act to repress the activin pathway, which includes downstream activators such as SMADs and TSC22.

Table 5.

Analysis of gene expression for activin/TGFβ-signaling pathway by focused microarray


No. expressed genes/no. total genes   Gene name
39/96   ACVR2, ACVR2B, LeftyA, LeftyB, TDGF1, CER1, Stat1, TSC22, Identify, Identifying, Identified, Smad3, Smad5, Smad6, SOX4, CDC25a, Phosphatase, p21Waf1 (p21Cip1), COLIA2, COL3A1, FST/Follistatin, BMP2, BMP9, ALK-3, BMP10, INH6, jun-B, DPC4, c-myc, NMA, TGIF, TIMP1, ALK-2, ALK_4, AMH, AMHR2, MADH2, TGFB1, BMPR2


A subset of genes known to be part of the activin/TGFβ-signaling pathway was assessed by a focused microarray. RNA isolated from 3 different undifferentiated huES cell lines was pooled (PES), tested for the absence of markers of differentiation by RT-PCR and immunocytochemistry, and used to prepare cDNA that was used to probe an activin receptor superfamily–focused microarray. Expression was normalized to housekeeping genes and background signal was subtracted using blanks included in the array. Experiments were performed in duplicate with independently isolated samples and similar results were obtained with other cell lines (data not shown).
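The two processing steps named in the Table 5 legend, background subtraction using blank spots and normalization to housekeeping genes, can be sketched as follows. This is a minimal illustration and not the authors' pipeline. The intensities are invented, and treating ACTB and GAPDH as the housekeeping genes is an assumption, since the legend does not name them.

```python
from statistics import mean


def normalize_spots(raw, blanks, housekeeping):
    """Subtract the mean blank-spot signal, then divide by the mean
    background-corrected signal of the housekeeping genes."""
    background = mean(blanks)
    # Clamp at zero so spots dimmer than background do not go negative.
    corrected = {gene: max(value - background, 0.0) for gene, value in raw.items()}
    reference = mean(corrected[gene] for gene in housekeeping)
    if reference == 0:
        raise ValueError("housekeeping signal is not above background")
    return {gene: value / reference for gene, value in corrected.items()}


# Invented intensities, for illustration only.
raw_signal = {"ACTB": 1100.0, "GAPDH": 900.0, "TDGF1": 600.0, "AMH": 90.0}
blank_spots = [95.0, 105.0]

normalized = normalize_spots(raw_signal, blank_spots, housekeeping=["ACTB", "GAPDH"])
for gene, value in normalized.items():
    print(f"{gene}: {value:.2f}")
```

Dividing by the housekeeping mean puts independently isolated samples on a common scale, which is what allows duplicate experiments to be compared.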

Comparison with published microarray results in mouse ES cells

The large number of genes expressed at 3-fold or higher levels could constitute a molecular signature of ES cells based on (1) the high degree of correlation among ES cell lines in terms of expression of a large number of known ES markers, (2) independent confirmation of high-level expression of these genes using EST scan, and (3) confirmation that the huES cells were largely undifferentiated. To determine whether this signature included genes identified as ES cell markers or as universal stem cell markers, we compared the results with the “stemness” signatures described previously.13-15  Our comparison using gene annotation of published results shows that between 12 and 33 of the 92 huES genes were overexpressed in murine ES cell lines as reported by Ivanova et al,13 Ramalho-Santos et al,14 and Tanaka et al15 (see Table S6 in the Supplemental Document). Given that these limited overlapping genes included cell-cycle regulators and that a less-rigorous standard (> 1.4-fold) was used in these published reports, the number of human stemness genes shared by mouse ES cells appears to be quite low. Such a limited overlap illustrates the importance of examining multiple independent isolates of ES cells and comparing them to pooled tissue samples, and suggests that comparison across databases must be carefully evaluated. A number of hypotheses could be proposed to explain the limited overlap in gene expression between mouse and human ES cells. First, ES cell populations examined in these 2 species were harvested at various stages of cell culture, resulting in different gene expression. Second, mouse ES cells, but not human ES cells, were treated with leukemia inhibitory factor (LIF) for propagation and the maintenance of pluripotency.4,33  Thus, different biologies of human and mouse ES cells may result in limited overlap of gene expression.
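A comparison of this kind reduces to intersecting two annotated gene lists. The Python sketch below shows the idea under a strong simplifying assumption: that human and mouse genes can be matched by case-insensitive symbol. A real cross-species comparison would need a curated ortholog table. The short lists are hypothetical examples, not the published signatures.

```python
def signature_overlap(human_genes, mouse_genes):
    """Return the shared gene symbols and the fraction of the human
    signature that they represent."""
    human = {symbol.strip().upper() for symbol in human_genes}
    mouse = {symbol.strip().upper() for symbol in mouse_genes}
    shared = human & mouse
    return sorted(shared), len(shared) / len(human)


# Hypothetical example lists; not the 92-gene set or a published mouse signature.
human_signature = ["NANOG", "TDGF1", "POU5F1", "GJA1"]
mouse_signature = ["Nanog", "Tdgf1", "Pou5f1", "Zfp42"]

shared, fraction = signature_overlap(human_signature, mouse_signature)
print(shared, f"{fraction:.0%} of the human list")
```

Applied to the counts reported above, 12 to 33 shared genes out of 92 corresponds to roughly 13% to 36% of the human signature.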

Comparison of microarray results with digital differential display

Recently, a digital differential display strategy was used to identify 20 genes that are highly expressed in mouse ES cells but not in other tissues.34  In addition, Ehox was identified as an early and specific marker of murine ES cells.35  To assess whether any of these genes should be included in the molecular signature of huES cells, we identified the human homologs of these genes, determined their presence on the microarrays, and verified expression in undifferentiated huES cells by microarray, EST enumeration, and RT-PCR (Table 6). Several genes were present on the array and expressed by huES cells (Nanog, oct3/4, cripto/TDGF1, and GDF3). No orthologs of Ehox, a mouse EST, PRB1, or Tcl1 could be identified and, therefore, their expression could not be evaluated. ERAS appeared to be identical to HRASP, a previously described pseudogene of the ras family,36  but the nearest ortholog of HRASP could not be readily detected (Figure 3B). Other genes, including Brachyury, keratin 17, zinc finger proteins, FBX15, and HRAS, were present on the array but not detectably elevated in human cells by array analysis, EST enumeration, or, for some genes, RT-PCR (Figure 3B). Expression of DNMT3L, Dppa5 (ESG1), tudor, DAX-1, and zf342 (Zf296) was confirmed by RT-PCR (Figure 3A), extending the number of genes that are shared between mouse and human cells, but also highlighting fundamental differences in their biology.

Table 6.

EST enumeration and RT-PCR analysis of genes reported to be specific to mouse ES cells


Mouse gene   Human homolog   Array A/P/E/NE   EST E/NF   RT-PCR   Comments
POU5F1*   Z11898   E   E   +   ES cell marker
TDGF1/Cripto*   X13293   E   E   +   ES cell marker
Nanog*   Hs.326290   E   E   +   ES cell marker
GDF-3*   Hs.86232   E   NF   +   Expression confirmed by focused array
Utf1*   AB011076   P/E   NF   +   Undifferentiated embryonic cell transcription factor 1
Brachyury   Hs.143507   P/NE   NF   +   Early mesodermal marker
DNMT3L   Hs.157237   P/NE   E   +   Low but detected
HNRNPG-T   Hs.121605   P/NE   NF   +   Testis-specific ring finger protein
Rex/Zfp42*   gi: 1082557   P/NE   NF   +   ES cell marker
Dax-1/Nrob 1   Hs.268490   P/NE   NF   +   Nuclear receptor of unknown function
Keratin-17   Hs.2785/Z19574   P/NE   E   −   KRT-8/18 present but not KRT-17
Tudor protein   AK010714.1   A   NF   +   Maternal protein high in mouse ES cells
ESG1/DPPA5   (cloned)   A   NF   +   Embryonic stem cell–specific gene
Fbx15   Hs.124087   P/NE   NF   −   F-box containing protein expressed in embryoid bodies regulated by Oct 3/4
Zfp296 (Znf342)   Hs.192237   A   NF   ND   Other zinc finger proteins are present in huES cells
Zfp226   Sp: QGNYT6   A   E   ND   Tex-20 or spalt like; detected in differentiating cells
E-ras Mm.249624   AB093575   A   NF   ND   huES cells express Ha-ras but no Eras detected
TCL1   NP068801.1   NP   NP   ND   Low homology to tcl1 (56%)
PRB1 (Mm.157658)   No human homolog   NP   NP   ND   Low homology to proline-rich protein PRB1
Mm.45676 (EST)   No human homolog   NP   NP   ND   EST with low homology to Drosophila pipsqueak
E-Hox Mm.021300   No human homolog   NP   NP   ND   No ortholog identified in human (2 paralogues on X chromosome)


Expression of 21 human homologs of murine ES cell–specific genes was analyzed by microarray, EST enumeration, and RT-PCR analysis.

A indicates absent; E, expressed; P, present on chip; NE, not expressed; NF, not found; ND, not done; NP, not performed (array and EST analysis could not be performed as no human homolog is available); +, positive; and −, negative.

* Known ES cell markers in undifferentiated human ES cells that could be confirmed.

Known ES cell markers expressed at ≥ 3-fold higher than huURNA.

Figure 3.

RT-PCR analysis of genes reported to be specific to mouse ES cells. (A) RT-PCR analysis of Dppa5, Dax-1, Zf296, and DNMT3L genes. (B) RT-PCR analysis of FBX15 and HRASP (ERAS) genes. FBX15 and HRASP were not detected in any cell line, whereas DNMT3L was present though levels were variable.


Thus, our microarray, RT-PCR, EST enumeration, and comparison with published databases identified 97 genes as being overexpressed in huES cells—all of which can be mapped to the human genome database. Approximately 65 of these represent genes previously not known to be enriched in either mouse or human ES cells.

Bioinformatics analysis of 16 novel genes

We identified 16 novel genes that are likely to be functionally important. These genes were highly overexpressed in most embryonic cell lines (see Tables 3 and 7, and Table S2 in the Supplemental Document). Three of these genes were identified as zinc finger proteins that belong to the Kruppel family of C2H2-type zinc finger proteins and shared overall homology to each other (Table 7). Sequence alignment suggested that they contained multiple zinc finger domains and likely function as transcriptional repressors. Several zinc finger proteins have been identified as being specific to hematopoietic stem cells, and unique zinc finger proteins that share sequence homology have been identified in rodent ES cells, suggesting that rodent and human cells use similar, but not identical, strategies to regulate self-renewal and differentiation.37-39  Thirteen other novel genes were also mapped. Expression of these novel genes and zinc finger protein 43 was confirmed by RT-PCR analysis (Figure 4A-C). Blast analysis confirmed that 1 of the 13 genes is a human homolog of the Nanog (FLJ12581) gene, which is a marker of undifferentiated ES cells and was recently identified as having a critical function in maintaining the stem cell state in rodent ES cells.34,40  One hypothetical gene, MGC27165, showed identity to the fragilis family of interferon-inducible transmembrane protein genes. This family of proteins has been shown to be expressed in primordial germ cells and is critical in maintaining their self-renewal.41  MGC27165 may play a similar role in ES cell cultures. Other genes were characterized in detail using various available databases, which allowed assignment of tentative function to these genes. One hypothetical gene of unknown function, KIAA1573, contains a domain that shares homology to a voltage-dependent calcium channel. Another is a homeodomain protein that shares homology to GSH1. C20orf1 is commonly known as TPX2, which is required for targeting STK6 to the spindle apparatus, and STK6 may regulate the function of TPX2 during spindle assembly.42  HNRNP core protein A1 appears to be a pseudogene that encodes a novel protein with a Kunitz/bovine pancreatic trypsin inhibitor domain.

Table 7.

Analysis of novel genes identified in 6 huES cell lines


Gene name   Chromosome location   Predicted size   Comments
ZNF43   19p13.11   803 AA   Zinc finger proteins that share high sequence homology and likely function as transcriptional repressors; also known to be expressed in early development
ZNF257   19q13   535 AA
ZNF117   7q11.21   418 AA
MGC27165   Chr 14   5155 bases   Related to the fragilis group of proteins known to be germ cell–specific and critical for their maintenance
GSH1   Chr 13   792 bases; 264 AA   GSH1 homeodomain protein containing a HOX domain
FLJ12581 (Nanog)   12p13.31   6660 bases   ES cell–specific transcription factor critical for pluripotency
KIAA1573   1p13.3   3559 bases   Low homology to calcium channel, voltage-dependent, alpha 2/delta subunit 2
Ribosomal 40A Laminin receptor   Chr 6   —   Human DNA sequence from clone RP3-334F4 on chromosome 6 contains ESTs, STSs, and GSS: contains a LAMR1 (laminin receptor 1, ribosomal protein SA) pseudogene and an RPL10 (ribosomal protein L 10) pseudogene
C15orf15   15q21   15 719 bases   Chromosome 15 open reading frame 15
C20orf129   20q11.22-q12   26 744 bases   Chromosome 20 open reading frame 129
C20orf1   20q11.2   62 472 bases   Chromosome 20 open reading frame 1
Numatrin   Chr 13   —   Human DNA sequence from clone RP11-248N6 on chromosome 13 contains ESTs, STSs, and GSSs; contains 2 olfactory receptor pseudogenes, an NPM1 (nucleophosmin, nucleolar phosphoprotein B23, numatrin) pseudogene, and a BCR (breakpoint cluster region) pseudogene
PPAT   —   —   Homo sapiens similar to phosphoribosyl pyrophosphate amidotransferase (LOC151641), mRNA
ARL8   Chr10   —   Sapiens mRNA; cDNA DKFZp43401317 (from clone DKFZp43401317)/ADP-ribosylation factor-like 8
C20orf168 (HNRPA1)   Chr 20   —   Human DNA sequence from clone RP3-447F3 on chromosome 20 and contains TNNC2 (troponin C2, fast) gene


Nanog and 15 other novel genes overexpressed in 6 huES cell lines were analyzed by using bioinformatic tools available at multiple databases, including Swiss-Prot, GeneCards, and LocusLink from the National Center for Biotechnology Information (NCBI), the National Cancer Institute, and the Center for Information Technology, National Institutes of Health. Chromosomal localization, predicted size, and possible function are shown. AA indicates amino acids; and —, not found.

Thus, we demonstrate that huES cell lines express 92 unique genes that are common in all 6 huES cell lines examined. In addition, we identified 16 novel genes; 15 of them were not previously characterized and one gene product, termed Nanog, was recently cloned and found to be important in maintaining the pluripotent state of mouse ES cells.34,40  The present study also confirms the expression of several markers of huES cells that have been identified in murine ES cells.13-15  Thus, the 92 genes identified can be used to define the core identity of undifferentiated huES cells and may be important in defining their stem cell capabilities.

The expression of several genes was confirmed by RT-PCR, focused microarray, and immunocytochemistry analyses. In addition, the fidelity of expression of the 92 genes that were elevated in all 6 ES cell lines was confirmed by comparison with an EST enumeration database using PES RNA (R.B. et al, unpublished data, August 2003). The 15 genes that were not detected in huES cells by EST enumeration included PSIP1, Galanin, GSH1, GDF3, PITX2, and others. Failure to detect expression likely represented a lack of sensitivity of the EST analysis, as expression of at least 10 genes could be confirmed by RT-PCR. Of the 92 genes, 14 were also significantly overexpressed in huES cells compared with EB cells and are good candidate marker genes for the undifferentiated huES cells or may be involved in regulating huES cell differentiation. Many more genes have at least 3-fold more ESTs in the huES sample than in the EB sample, but low overall EST copy number does not allow us to conclude that they are overexpressed in the undifferentiated huES cells. Further verification is needed to determine whether these genes may be involved in regulating ES cell differentiation.

We note that Lin 28, a Caenorhabditis elegans gene that controls the timing of diverse developmental events during the animal's larval stage,30  is highly expressed in all huES cells and is down-regulated as cells differentiate. Lin 28 expression is conserved in higher species and expression could be detected in rodent ES cells (data not shown), suggesting that this pathway may play an important role in regulating appropriate differentiation of ES cells.

Overall, our results identify several novel genes that are likely to play an important role in maintaining the pluripotency of huES cells, highlight the similarities between the various human ES lines available for research purposes, and suggest the importance of careful documentation of culture and phenotypic differentiation. The novel genes identified also provide additional markers for the undifferentiated state and candidate regulators of ES cell differentiation. Our data further validate the utility of a microarray approach to analyze ES cell populations and suggest that genes identified in such a screen can be used to develop focused microarrays for quality control, both to profile the state of stem and progenitor cells and to reveal the extent of differentiation in samples from different laboratories. Comparison of our findings to other stem cell populations reveals that while stem cells may use similar overall strategies to maintain a stem cell state, the specific molecules utilized appear different, suggesting that it may be possible to use similar methods to develop distinct molecular signatures for other stem cell populations as well.

We note that assessment of 16 659 spots of our oligonucleotide arrays for huES cell–specific genes identified approximately 92 genes that are enriched in all ES cell cultures relative to other tissues. Assuming that we have detected only about 50% of such genes given array sensitivity, sampling errors, and a rigorous 99% confidence cutoff (with minimum intensity ≥ 150), we would expect additional experiments to identify approximately 90 additional genes whose expression may be linked to the huES phenotype and differentiation potential. Furthermore, since our analysis was restricted to the currently curated Operon database, it represents a sampling of approximately 50% of the human genome,43  and 180 additional ES cell–enriched genes may be present based on this analysis. A hypothetical total of approximately 360 huES cell–enriched genes is small enough to be readily profiled using current methods, while sufficient to develop a comprehensive and unique molecular signature for huES cell populations, which would include most biologically relevant molecules. The limited overlap between signatures in different cells suggests that truly universal stem cell markers are likely to be an uncommon subset of about 100 genes, and may be better identified by a more direct comparison of purified homogeneous populations of stem cells using a focused microarray approach.
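The extrapolation in this paragraph is two successive divisions, which the following Python sketch makes explicit. The function and parameter names are illustrative. The inputs are the figures stated in the text: 92 genes detected, an assumed 50% detection rate, and an array that samples about 50% of the genome.

```python
def extrapolate_enriched_genes(detected=92, detection_rate=0.5, genome_coverage=0.5):
    """Scale the detected gene count first to the whole array, then to the genome."""
    on_array = detected / detection_rate         # enriched genes among arrayed probes
    genome_wide = on_array / genome_coverage     # enriched genes across the genome
    return on_array, genome_wide


on_array, genome_wide = extrapolate_enriched_genes()
print(f"Expected on the array: {on_array:.0f}")
print(f"Expected genome-wide: {genome_wide:.0f}")
```

The unrounded results are 184 and 368. The text works from a rounded starting value of about 90 genes, which gives its figures of approximately 180 and 360.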

Prepublished online as Blood First Edition Paper, December 30, 2003; DOI 10.1182/blood-2003-09-3314.

Supported by grants to M.S.R. from the Amyotrophic Lateral Sclerosis (ALS) Center at Johns Hopkins, Children's Neurobiological Solutions (CNS) Foundation, and the National Institutes of Health (NIH) Stem Cell Center; and NIH grant PAR-02-023 to J.I.-E.

R.B. and R.S.T. are employed by Geron Corp, whose potential product was studied in the present work.

The online version of the article contains a data supplement.

An Inside Blood analysis of this article appears in the front of this issue.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

We thank Drs Jesse L. Goodman, Steven Bauer, and Brenton McCright for review and helpful comments, Philip D. Noguchi for encouragement, J. Carl Barrett, Kathryn C. Zoon, Neil Goldman, Jeffrey Green, Earnie Kawakasaki, and Mr David Peterson for their support of the CBER/NCI InterAgency Agreement on genomics program, and Dr Jing Han for general support and data analysis.

1. McKay R. Stem cells: hype and hope. Nature. 2000;406:361-364.
2. Lumelsky N, Blondel O, Laeng P, Velasco I, Ravin R, McKay R. Differentiation of embryonic stem cells to insulin-secreting structures similar to pancreatic islets. Science. 2001;292:1389-1394.
3. Kim JH, Auerbach JM, Rodriguez-Gomez JA, et al. Dopamine neurons derived from embryonic stem cells function in an animal model of Parkinson's disease. Nature. 2002;418:50-56.
4. Thomson JA, Itskovitz-Eldor J, Shapiro SS, et al. Embryonic stem cell lines derived from human blastocysts [erratum appears in Science. 1998;282:1827]. Science. 1998;282:1145-1147.
5. Zerhouni E. Embryonic stem cells. Science. 2003;300:911-912.
6. Shamblott MJ, Axelman J, Littlefield JW, et al. Human embryonic germ cell derivatives express a broad range of developmentally distinct markers and proliferate extensively in vitro. Proc Natl Acad Sci U S A. 2001;98:113-118.
7. Keller GM. In vitro differentiation of embryonic stem cells. Curr Opin Cell Biol. 1995;7:862-869.
8. Klug MG, Soonpaa MH, Koh GY, Field LJ. Genetically selected cardiomyocytes from differentiating embryonic stem cells form stable intra-cardiac grafts. J Clin Invest. 1996;98:216-224.
9. Okabe S, Forsberg-Nilsson K, Spiro AC, Segal M, McKay RD. Development of neuronal precursor cells and functional post mitotic neurons from embryonic stem cells in vitro. Mech Dev. 1996;59:89-102.
10. Yamashita J, Itoh H, Hirashima M, et al. Flk1-positive cells derived from embryonic stem cells serve as vascular progenitors. Nature. 2000;408:92-96.
11. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467-470.
12. Bowtell DDL. Options available—from start to finish—for obtaining expression data by microarray. Nat Genet. 1999;21:25-32.
13. Ivanova NB, Dimos JT, Schaniel C, et al. A stem cell molecular signature. Science. 2002;298:601-604.
14. Ramalho-Santos M, Yoon S, Matsuzaki Y, Mulligan RC, Melton DA. “Stemness”: transcriptional profiling of embryonic and adult stem cells. Science. 2002;298:597-600.
15. Tanaka TS, Kunath T, Kimber WL, et al. Gene expression profiling of embryo-derived stem cells reveals candidate genes associated with pluripotency and lineage specificity. Genome Res. 2002;12:1921-1928.
16. Thomson JA, Itskovitz-Eldor J, Shapiro SS, et al. Embryonic stem cell lines derived from human blastocysts. Science. 1998;282:1145-1147.
17. Amit M, Carpenter MK, Inokuma MS, et al. Clonally derived human embryonic stem cell lines maintain pluripotency and proliferative potential for prolonged periods of culture. Dev Biol. 2000;227:271-278.
18. NIH Human Embryonic Stem Cell Registry. NIH Web site. Available at: http://stemcells.nih.gov/registry/index.asp. Accessed January 2003.
19. Xu C, Inokuma MS, Denham J, et al. Feeder-free growth of undifferentiated human embryonic stem cells. Nat Biotechnol. 2001;19:971-974.
20. Zeng X, Miura T, Luo Y, et al. Properties of pluripotent human embryonic stem cells BG01 and BG02. Stem Cells. 2004. In press.
21. Amit M, Itskovitz-Eldor J. Derivation and spontaneous differentiation of human embryonic stem cells. J Anat. 2002;200(pt 3):225-232.
22. Carpenter MK, Rosler E, Rao MS. Characterization and differentiation of human embryonic stem cells. Cloning Stem Cells. 2003;5:79-88.
23. Carpenter MK, Inokuma MS, Denham J, Mujtaba T, Chiu CP, Rao MS. Enrichment of neurons and neural precursors from human embryonic stem cells. Exp Neurol. 2001;172:383-397.
24. Rao MS, Mayer-Proschel M. Glial-restricted precursors are derived from multipotent neuroepithelial stem cells. Dev Biol. 1997;188:48-63.
25. Kawakami K, Kawakami M, Snoy PJ, Husain SR, Puri RK. In vivo overexpression of IL-13 receptor alpha2 chain inhibits tumorigenicity of human breast and pancreatic tumors in immunodeficient mice. J Exp Med. 2001;194:1743-1754.
26. Brazma A, Hingamp P, Quackenbush J, et al. Minimum information about a microarray experiment (MIAME): toward standards for microarray data. Nat Genet. 2001;29:365-371.
27. Risinger JI, Maxwell GL, Chandramouli GV, et al. Microarray analysis reveals distinct gene expression profiles among different histologic types of endometrial cancer. Cancer Res. 2003;63:6-11.
28. Terskikh AV, Easterday MC, Li L, et al. From hematopoiesis to neuropoiesis: evidence of overlapping genetic programs. Proc Natl Acad Sci U S A. 2001;98:7934-7939.
29. Siegel S, Castellan N. Nonparametric Statistics for the Behavioral Sciences. 2nd ed. Columbus, OH: McGraw-Hill; 1988.
30. Seggerson K, Tang L, Moss EG. Two genetic circuits repress the Caenorhabditis elegans heterochronic gene lin-28 after translation initiation. Dev Biol. 2002;243:215-225.
31. Cai J, Limke TL, Ginis I, Rao MS. Identifying and tracking neural stem cells. Blood Cells Mol Dis. 2003;31:18-27.
32. Luo Y, Long JM, Spangler EL, et al. Identification of maze learning-associated genes in rat hippocampus by cDNA microarray. J Mol Neurosci. 2001;17:397-404.
33. Odorico JS, Kaufman DS, Thomson JA. Multilineage differentiation from human embryonic stem cell lines. Stem Cells. 2001;19:193-204.
34. Mitsui K, Tokuzawa Y, Itoh H, et al. The homeoprotein Nanog is required for maintenance of pluripotency in mouse epiblast and ES cells. Cell. 2003;113:631-642.
35. Jackson M, Baird JW, Cambray N, et al. Cloning and characterization of Ehox, a novel homeobox gene essential for embryonic stem cell differentiation. J Biol Chem. 2002;277:38683-38692.
36. Loring JF, Porter JG, Seilhammer J, Kaser MR, Wesselschmidt R. A gene expression profile of embryonic stem cells and embryonic stem cell-derived neurons. Restor Neurol Neurosci. 2001;18:81-88.
37. Kimura Y, Hart A, Hirashima M, et al. Zinc finger protein, Hzf, is required for megakaryocyte development and hemostasis. J Exp Med. 2002;195:941-952.
38. Chang AN, Cantor AB, Fujiwara Y, et al. GATA-factor dependence of the multitype zinc-finger protein FOG-1 for its essential role in megakaryopoiesis. Proc Natl Acad Sci U S A. 2002;99:9237-9242.
39. Inoue A, Ishiji A, Kasagi S, et al. The transcript for a novel protein with a zinc finger motif is expressed at specific stages of mouse spermatogenesis. Biochem Biophys Res Commun. 2000;273:398-403.
40. Chambers I, Colby D, Robertson M, et al. Functional expression cloning of Nanog, a pluripotency sustaining factor in embryonic stem cells. Cell. 2003;113:643-655.
41. Lange U, Saitou M, Western P, Barton S, Surani M. The fragilis interferon-inducible gene family of transmembrane proteins is associated with germ cell specification in mice. BMC Dev Biol. 2003;3:1-11.
42. Kufer TA, Sillje HH, Korner R, et al. Human TPX2 is required for targeting Aurora-A kinase to the spindle. J Cell Biol. 2002;158:617-623.
43. Collins FS, Green ED, Guttmacher AE, et al. A vision for the future of genomics research. Nature. 2003;422:835-847.