Stroke is a major cause of morbidity and mortality in sickle cell (SS) disease. Genetic risk factors have been postulated to contribute to this clinical outcome. The human genome project has substantially increased the catalog of variations in genes, many of which could modify the risk for manifestations of disease outcome in a monogenic disease, namely SS. VCAM1 is a cell adhesion molecule postulated to play a critical role in the pathogenesis of SS disease. We identified a total of 33 single nucleotide polymorphisms (SNPs) by sequencing the entire coding region, 2134 bp upstream of the 5′ end of the published cDNA, 217 bp downstream of the 3′ end of the cDNA, and selected intronic regions of the VCAM1 locus. Allelic frequencies for selected SNPs were determined in a healthy population. We subsequently analyzed 4 nonsynonymous coding, 2 synonymous coding, and 4 common promoter SNPs in a genetic association study of clinically apparent stroke in SS disease conducted in a cohort derived from a single institution in Jamaica (51 symptomatic cases and 51 matched controls). Of the 10 candidate SNPs analyzed in this pilot study, the variant allele of the nonsynonymous SNP, VCAM1 G1238C, may be associated with protection from stroke (odds ratio [OR] 0.35, 95% confidence interval [CI] 0.15-0.83, P = .04). Further study is required to confirm the importance of this variant in VCAM1 as a clinically useful modifier of outcome in SS disease.
Introduction
Stroke is a major complication of sickle cell (SS) disease and is associated with significant morbidity and mortality. It is estimated that the lifetime risk for stroke is between 8% and 10%.1,2 Associated risk factors include peripheral leukocytosis,1,2 the rate of acute chest syndrome (ACS) episodes,1 relative hypertension,3 abnormal cerebral blood flow detected by transcranial doppler ultrasonography,4 and selected genetic variants.5,6 The search for predictive markers has direct clinical implications, specifically the institution of early interventional strategies, including close monitoring and preventative therapies.7
Currently, controversy exists among geneticists as to the best means for identifying the genetic determinants of multigenic or complex traits like stroke in SS disease.8 Traditionally, studies have only proceeded to linkage analysis on multigenerational pedigrees once a familial pattern of trait aggregation has been established. Once localized by linkage to a genetic marker, a disease gene is identified from candidates mapping to a chromosomal region. Recently, the draft sequence of the human genome has generated an extraordinary resource for using single nucleotide polymorphisms (SNPs) as physical markers in either a candidate gene or eventually the entire genome to identify complex traits and modifiers of monogenic diseases.9 While traditional linkage studies are still attractive, some have generated controversy by suggesting that such studies may require unrealistically large pedigrees for localizing complex disease determinants of modest effect, and suggesting that association studies are a reasonable alternative approach.8,10 As a result, availability of human genomic sequence has facilitated successful association studies testing SNPs in candidate genes, selectively chosen on the basis of existing knockout animal models or in vitro data implicating the gene in disease pathogenesis, even in the absence of genetic linkage.11-14
The cell adhesion molecules (CAMs) of the immunoglobulin superfamily adhere with high affinity to integrins expressed on inflammatory and endothelial cells. A critical member, the vascular cell adhesion molecule 1 (VCAM-1), coordinates the inflammatory response by recruiting leukocytes and in turn, activating lymphocytes.15 VCAM-1 is a cell surface sialoglycoprotein highly expressed on endothelial cells following cytokine stimulation with interleukin 1 alpha (IL-1α), tumor necrosis factor alpha (TNF-α), and IL-4,16 while it is constitutively expressed on the bone marrow stroma where it has recently been implicated as a possible determinant of hematopoietic cell mobilization in response to granulocyte colony-stimulating factor (G-CSF).17 A potential role in the regulation of circulating leukocytes is further supported by targeted disruption of the murine VCAM1 gene, in which it has been observed that VCAM-1–deficient mice have elevated peripheral leukocytes.18
Variants of the VCAM1 gene could be informative genetic modifiers of phenotypic differences in SS disease. In vitro, sickle erythrocytes adhere to cytokine-stimulated or -transfected VCAM-1 on endothelial or COS cells via VLA-4 (α4β1 integrin) expressed on sickle red cells.19-22 Perfusion with sickle erythrocytes induces VCAM-1 expression in cultured endothelial cells.23 In addition, elevated levels of soluble VCAM-1 have been detected in the plasma of patients with SS disease at baseline and during episodes of ACS.24,25 This has also been observed in non–SS disease patients with acute stroke, as well as in a small series of 6 adult stroke autopsies, which were notable for high expression of VCAM-1 restricted to areas of brain ischemia.26-28 Furthermore, inhibition of α4 integrin, a major counter-receptor for VCAM-1, protects against ischemic injury in a rat model of transient cerebral ischemia.29 Based on these data, VCAM1 was selected as a candidate gene for study in SS disease. The current study resequenced a portion of the VCAM1 locus in healthy controls in search of common variants. SNPs were identified that have a pattern of nucleotide diversity consistent with the neutral mutation hypothesis in African Americans. Selected nonsynonymous coding and promoter SNPs were then studied in a single institution case control study of SS subjects with clinically symptomatic stroke.
Patients, materials, and methods
Screening population for VCAM1 polymorphisms
Genomic DNA from 40 healthy African Americans was obtained under an institutional review board–approved protocol for anonymous DNA collection in the Department of Transfusion Medicine, Clinical Center, National Institutes of Health (NIH; Bethesda, MD). cDNA was synthesized from mRNA isolated from apheresis lymphocytes from 16 of these donors (FastTrack 2.0 and Ready to Go; Invitrogen, Carlsbad, CA). The cDNA of VCAM1 was screened first by sequencing in both directions. These data were compared to the coding DNA sequence for the VCAM1 gene (locus ID 7412, GenBank accession number XM_035776). We also sequenced from polymerase chain reaction (PCR)–amplified genomic DNA 2258 base pairs upstream of the VCAM1 initiation of translation codon (ATG) and all 9 VCAM1 exons, in both directions for all 40 subjects, corresponding to a publicly available sequence of the genomic contig NT_004308. We amplified 8 of these 40 samples (20%) and sequenced them in duplicate with 100% agreement in genotype at each of the 33 polymorphic sites identified. Primers for 19 PCR amplicons used to sequence 5962 nucleotides of the VCAM1 locus have been submitted to GenBank as sequence tagged sites (STSs; accession numbers G73334-G73349 and G73795-G73797). cDNA primers used for variant screening are available upon request. SNPs were identified only if observed with sequencing in both directions. Exon 5 of VCAM1 was amplified from 10 healthy controls (9 heterozygotes and 1 homozygote) carrying the variant T953 and G1150 alleles using Platinum pfx high fidelity DNA polymerase (Invitrogen). PCR products were purified and isolated on Qiagen purification columns (Qiagen, Valencia, CA), incubated with a DNA polymerase/dATP mixture at 72°C for 15 minutes, and were ligated and cloned using the TOPO TA cloning vector (Invitrogen). Clones were analyzed by PCR using plasmid-specific primers and confirmed by direct DNA sequencing.
Numeric designations for polymorphisms are defined in reference to the ATG following the recommendation of the Human Mutation Nomenclature Working Group.30
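As an illustration of ATG-relative numbering, the following minimal sketch maps a coding nucleotide position to the codon that contains it. It assumes that position 1 is the A of the initiator ATG and that amino acids are counted from the initiator methionine; the function name `codon_number` is hypothetical and is not part of any published tool. Under these assumptions, the coding SNPs reported in this study map to the amino acid positions listed in Table 1.

```python
def codon_number(cds_position: int) -> int:
    """Return the 1-based codon that contains a coding-sequence nucleotide.

    Assumption: position 1 is the A of the initiator ATG, so codon 1
    covers nucleotides 1-3, codon 2 covers 4-6, and so on.
    """
    if cds_position < 1:
        raise ValueError("positions upstream of the ATG are noncoding")
    return (cds_position - 1) // 3 + 1


# Coding SNPs reported in this study (nucleotide positions relative to the ATG).
coding_snps = {
    "C953T": 953,
    "A1150G": 1150,
    "G1238C": 1238,
    "C2079T": 2079,
    "A2146T": 2146,
    "A2208G": 2208,
}

codons = {name: codon_number(pos) for name, pos in coding_snps.items()}

for name, codon in codons.items():
    print(f"{name}: codon {codon}")
```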
Genotyping
Genotyping was performed by investigators blinded to patient identifiers. PCR reactions were performed with MJ Research model PTC-225 thermal cyclers under conditions as follows: 5 ng to 50 ng genomic DNA, 100 ng of each primer, 200 μM of each dNTP, 2 mM MgCl2, 0.5 U AmpliTaq Gold DNA polymerase (ABI-Perkin Elmer, Foster City, CA), and the manufacturer's buffer. Direct sequencing of amplified DNA using the Dye Terminator method (ABI-Perkin Elmer) was analyzed with ABI-Perkin Elmer platforms (models 377, 3100, or 3700), Sequence Analysis 3.3 (ABI-Perkin Elmer) and Sequencher 3.1 (Gene Codes Corporation, Ann Arbor, MI) software. Genotype assays are presented in the . All promoter genotyping assays in the SS disease population were performed by nested PCR, with the initial PCR step being performed independently, at least twice. A 2267 bp 5′ region of VCAM1 was amplified from genomic DNA using the primer pair (TAT TTC AGT GGG GAC AAG GC and GTC GTG ATG AGA AAA TAG TGG TTC) under the following conditions: an initial 10 minutes at 95°C; 35 cycles of 95°C × 30 seconds, 70°C × 30 seconds, 72°C × 150 seconds, and a final extension at 72°C for 10 minutes. One microliter was then used for each of the VCAM1 promoter genotype PCR assays described in the .
Confirmation of candidate SNPs
Allelic distributions of selected VCAM1 SNPs were determined in 130 healthy African American blood donors at the NIH and in 100 Centre d'Etude du Polymorphisme Humain (CEPH) African American donors (Human Variation Panel HD100AA; Coriell Cell Repositories, Camden, NJ).
Patients with sickle cell disease
Patients were recruited at the Sickle Cell Clinic of the University Hospital of the West Indies, Kingston, Jamaica. There were 51 patients with SS disease identified with a history of clinical stroke and 51 patients with SS disease without stroke (controls) who were matched by sex and age at the time of study recruitment to index cases. Radiographic evaluation of the head was not performed on control subjects. Study participants provided informed consent under the auspices of an institutional review board–approved study at the University of the West Indies. A symptomatic stroke was defined as an acute clinical neurologic syndrome with localizing symptoms of more than 24 hours duration. In addition, most stroke cases were documented by computed tomography or angiogram.2 Overall, all evidence (clinical and radiographic) suggested that the great majority of cases were symptomatic infarctive strokes. Patients with transient ischemic attacks or fatal stroke events were not included in this study.
Clinical and laboratory values for this population have been previously reported including a mean age of 17.1 years for stroke cases at recruitment.31 Mean age at stroke was 10.3 years (standard deviation 6.1 years). Steady state white blood cell (WBC) counts were determined after the age of 5 years from an average of 10.9 measurements in the control group (range 1-43) and 9.7 measurements among stroke cases (range 1-37).
Statistical analysis
Statistical analysis used Instat 2.0 (Graph Pad Software, San Diego, CA), EpiInfo 6.04 (Centers for Disease Control, Atlanta, GA), and DnaSP 3.5 (http://www.bio.ub.es/julio/DnaSP.html). The standardized linkage disequilibrium coefficient D′ was calculated using the method of Lewontin.32 θ was estimated as S/a1 (where S is the number of segregating sites and a1 = ∑ 1/i, summed over i = 1 to n − 1, which equals 4.9528 where n is 80 chromosomes) divided by the number of nucleotides sequenced.33 Nucleotide diversity was also estimated as π, the average heterozygosity per nucleotide site (π = k/[1 − (1/n)], divided by the number of nucleotides sequenced, where k = ∑ 2pj(1 − pj), pj is the observed frequency of the jth SNP, and n is 80 chromosomes). The Tajima D statistic was determined from the difference between these 2 estimates of nucleotide diversity (Tajima D = [π − θ]/√Var[π − θ]), where the value of D is expected to be 0 for selectively neutral variants in a steady state population.34 Comparison of categorical data was by Fisher exact test (2 × 2 table with 1 degree of freedom) or analysis of matched pairs (McNemar-corrected χ2 and paired P values) including odds ratios (ORs) and 95% confidence intervals (CIs) (using the approximation of Woolf). The protective potential of the observed association was determined as follows: (percent reduction in the prevalence of stroke = 1 − observed OR) × (observed genotype frequencies for carriers of the informative allele).35 Continuous data were analyzed by paired t test or Mann-Whitney test with P values where indicated. In this pilot study, P values are presented without correction and were considered significant for an α less than .05.
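The diversity estimates described above can be illustrated with a short, self-contained sketch. This is not the DnaSP implementation used in the study; it is an independent calculation that assumes the standard variance constants of the Tajima test and takes the variant allele counts per 80 chromosomes from Table 1. The function and variable names (`harmonic_terms`, `diversity`, `UPSTREAM`, and so on) are hypothetical.

```python
import math

N_CHROMOSOMES = 80  # 40 screened donors, 2 chromosomes each


def harmonic_terms(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1 .. n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    return a1, a2


def diversity(variant_counts, n_sites, n=N_CHROMOSOMES):
    """Return (theta per site, pi per site, Tajima D) for one region.

    variant_counts: number of variant alleles observed at each SNP
    n_sites: number of nucleotides sequenced in the region
    """
    s = len(variant_counts)  # segregating sites
    a1, a2 = harmonic_terms(n)
    theta = s / a1  # Watterson estimate for the whole region
    k = sum(2 * (c / n) * (1 - c / n) for c in variant_counts)
    pi = k / (1 - 1 / n)  # heterozygosity with small-sample correction
    # Variance constants (assumed: standard Tajima test definitions).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (pi - theta) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta / n_sites, pi / n_sites, tajima_d


# Variant allele counts per 80 chromosomes, transcribed from Table 1.
UPSTREAM = [1, 37, 3, 8, 8, 3, 1, 3, 2, 3, 10, 1, 1]
UTR5 = [1, 8, 8, 1]
CODING = [3, 3, 5, 15, 4, 19]
UTR3 = [1, 2, 1, 1, 2]
OTHER = [3, 5, 20, 6, 3]

regions = {
    "5' upstream of ATG": (UPSTREAM + UTR5, 2258),
    "cDNA": (UTR5 + CODING + UTR3, 3103),
    "total": (UPSTREAM + UTR5 + CODING + UTR3 + OTHER, 5962),
}

results = {name: diversity(counts, bp) for name, (counts, bp) in regions.items()}

for name, (theta, pi, d) in results.items():
    print(f"{name}: theta = {theta * 1e4:.1f}e-4, pi = {pi * 1e4:.1f}e-4, D = {d:.2f}")
```

Because D is a ratio, it is unchanged whether π and θ are expressed per region or per nucleotide; the sketch computes it on the per-region values and divides by the number of nucleotides only for reporting.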
Results
Screening for common polymorphisms
Sequencing of indicated regions of VCAM1 in 40 healthy individuals identified 33 biallelic SNPs: 17 upstream of the ATG, 4 nonsynonymous coding, 2 synonymous coding, 5 in the 3′ untranslated region (UTR), and 5 additional noncoding SNPs (Table 1). Allelic distributions of selected variants were determined in a healthy African American control population (Table 2), and estimated gene frequencies were calculated and found to be consistent with Hardy-Weinberg equilibrium (data not shown). Within the open reading frame of VCAM1, 4 nonsynonymous SNPs alter the predicted amino acid sequence and 2 are in linkage: C953T (S318F in domain 4 of VCAM-1) and A1150G (T384K, also in domain 4) (D′ = 1.0 for 2-locus combination in 230 healthy controls by subcloning 10 of 15 variant allele carriers, data not shown). There are 2 other SNPs, G1238C (G413A in domain 5) and A2146T (I716L in the transmembrane domain), that appear to segregate independently. VCAM1 G−14C, G1238C, C2079T, A2146T, and A2208G are expressed at the transcriptional level in lymphocytes based on sequence analysis of first strand cDNA from 16 of 40 screening donors.
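The Hardy-Weinberg check mentioned above can be sketched as follows. The example uses the G1238C genotype counts quoted later in the Results for 229 healthy donors (GC 39 and CC 3, so GG is taken to be 187); the goodness-of-fit χ2 test with 1 degree of freedom and the function name `hardy_weinberg_chi2` are assumptions made for illustration, since the study does not state how the check was performed. With an expected count below 5 for the CC genotype, the χ2 approximation is rough and an exact test would normally be preferred.

```python
def hardy_weinberg_chi2(n_major_hom, n_het, n_minor_hom):
    """Return (major allele frequency, chi-square statistic) for a biallelic SNP.

    Expected genotype counts are p^2, 2pq, and q^2 times the sample size,
    with allele frequencies estimated from the observed genotypes.
    """
    total = n_major_hom + n_het + n_minor_hom
    q = (2 * n_minor_hom + n_het) / (2 * total)  # minor allele frequency
    p = 1 - q
    expected = (total * p * p, total * 2 * p * q, total * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, chi2


# G1238C in healthy donors: GG 187 (derived), GC 39, CC 3.
major_freq, chi2 = hardy_weinberg_chi2(187, 39, 3)

CRITICAL_VALUE_DF1 = 3.84  # chi-square threshold for P = .05, 1 degree of freedom
print(f"G allele frequency = {major_freq:.2f}, chi2 = {chi2:.2f}")
print("consistent with HWE" if chi2 < CRITICAL_VALUE_DF1 else "departs from HWE")
```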
Table 1
VCAM1 region | Nucleotide position* | Amino acid position | SNP typology classification† | Variant alleles observed‡ | dbSNP identifier§
---|---|---|---|---|---
5′ upstream | A−2062C | — | VI | 1 | rs3783595
5′ upstream | C−2021T | — | VI | 37 | rs1409419
5′ upstream | C−1698G | — | VI | 3 | rs3783597
5′ upstream | T−1599G | — | VI | 8 | rs3783598
5′ upstream | T−1592C | — | VI | 8 | rs1041163
5′ upstream | T−1530C | — | VI | 3 | rs3783599
5′ upstream | T−1379C | — | VI | 1 | rs3783600
5′ upstream | A−1356G | — | VI | 3 | rs3783601
5′ upstream | A−1242T | — | VI | 2 | rs3181087
5′ upstream | G−1148A | — | VI | 3 | rs3783603
5′ upstream | T−833C | — | VI | 10 | rs3170794
5′ upstream | A−540G | — | VI | 1 | rs3783605
5′ upstream | T−383G | — | VI | 1 | rs3783606
cDNA | | | | |
5′ UTR | C−118T | — | IV | 1 | rs3783607
5′ UTR | C−109T | — | IV | 8 | rs3783608
5′ UTR | G−54A | — | IV | 8 | rs3783609
5′ UTR | G−14C | — | IV | 1 | rs3783610
Ig domain 4 | C953T | S318F | I | 3 | rs3783611
Ig domain 4 | A1150G | T384K | I | 3 | rs3783612
Ig domain 5 | G1238C | G413A | II | 5 | rs3783613
TMD | C2079T | D693D | III | 15 | rs3176878
TMD | A2146T | I716L | II | 4 | rs3783615
TMD | A2208G | K736K | III | 19 | rs3176879
3′ UTR | G2262C | — | V | 1 | rs3783617
3′ UTR | C2507A | — | V | 2 | rs3783618
3′ UTR | A2548G | — | V | 1 | rs3783619
3′ UTR | G2844A | — | V | 1 | rs3783620
3′ UTR | C2876A | — | V | 2 | rs3783621
Other noncoding | | | | |
Intron 3* | C4757T | — | VI | 3 | rs2392221
Intron 7* | A14563G | — | VI | 5 | rs3176875
3′ downstream* | A19228G | — | VI | 20 | rs3181092
3′ downstream* | C19312T | — | VI | 6 | rs3181093
3′ downstream* | A19382G | — | VI | 3 | rs3783624
We amplified 5962 base pairs of the VCAM1 coding sequence (GenBank accession number XM_035776) or flanking regions as 19 different PCR products in 40 healthy African Americans. PCR products were sequenced in both directions to identify common genetic variants. A total of 33 biallelic single nucleotide polymorphisms (SNPs) were detected.
Ig indicates immunoglobulin; TMD, transmembrane domain; and —, noncoding sequence.
* Nucleotide positions for the cDNA SNPs are relative to the ATG of the coding (translated) region. Positions for the 5′ upstream and “Other noncoding” SNPs are relative to the ATG on genomic contig NT_004308. Alternatively, the nucleotide positions of the “Other noncoding” SNPs refer to position 1277 of intron 3 (Int3 + 1277), position 1739 of intron 7 (Int7 + 1739), and positions +46, +130, and +200 downstream from the 3′ end of cDNA XM_035776.
† SNP typology classification as proposed by Risch,8 where Class I equals nonsynonymous coding with nonconservative amino acid change; Class II equals nonsynonymous coding with conservative amino acid change; Class III equals synonymous coding (no amino acid change); Class IV equals 5′ untranslated region (5′ UTR of cDNA); Class V equals 3′ UTR; and Class VI equals other noncoding regions (including introns, promoters, and intergenic regions).
‡ The number of variant alleles observed in a population of 80 chromosomes is indicated.
§ National Center for Biotechnology Information Database of Single Nucleotide Polymorphisms (http://www.ncbi.nlm.nih.gov/SNP/).
Table 2
VCAM1 region | Nucleotide position | Major allele frequency (95% CI)
---|---|---
5′ upstream | C−2021T | 0.59 (0.54-0.64)
5′ upstream | T−1599G | 0.94 (0.92-0.96)
5′ upstream | T−1592C | 0.84 (0.81-0.87)
5′ upstream | T−1379C | 0.98 (0.97-0.99)
5′ upstream | A−1356G | 0.98 (0.97-0.99)
5′ upstream | A−1242T | 0.99 (0.98-1.00)*
5′ upstream | G−1148A | 0.98 (0.97-0.99)
5′ upstream | T−833C | 0.85 (0.82-0.88)
5′ UTR, exon 1 | C−109T | 0.90 (0.87-0.93)
5′ UTR, exon 1 | G−54A | 0.90 (0.87-0.93)
Coding | |
Ig domain 4, exon 5 | C953T | 0.96 (0.94-0.99)
Ig domain 4, exon 5 | A1150G | 0.96 (0.94-0.99)
Ig domain 5, exon 6 | G1238C | 0.90 (0.87-0.93)
TMD, exon 9 | C2079T | 0.82 (0.78-0.86)
TMD, exon 9 | A2146T | 0.93 (0.91-0.95)
TMD, exon 9 | A2208G | 0.76 (0.74-0.78)
Based on successfully genotyping 96% to 100% of samples in duplicate for a population of 230 healthy African American donors. 95% CI calculated according to the following formula: q − 2s to q + 2s, where the standard deviation is s = √[q(1 − q)/n], q is the major allele frequency, and n is the number of chromosomes successfully genotyped.
* The CI for A−1242T suggests that the variant allele may be rare, present at a frequency of less than 1%.
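The interval reported with each allele frequency can be reproduced with a few lines of code. This sketch assumes the usual binomial standard deviation, s = √[q(1 − q)/n], and assumes n = 460 chromosomes (all 230 donors genotyped); the actual n varied with the 96% to 100% genotyping success rate, and the function name `allele_frequency_ci` is hypothetical.

```python
import math


def allele_frequency_ci(q, n_chromosomes):
    """Return the (lower, upper) bounds q - 2s and q + 2s.

    s is the binomial standard deviation of the allele frequency estimate.
    """
    s = math.sqrt(q * (1 - q) / n_chromosomes)
    return q - 2 * s, q + 2 * s


# Assumption: all 230 donors (460 chromosomes) were successfully genotyped.
low_2021, high_2021 = allele_frequency_ci(0.59, 460)  # C-2021T
low_1238, high_1238 = allele_frequency_ci(0.90, 460)  # G1238C

print(f"C-2021T: {low_2021:.2f}-{high_2021:.2f}")
print(f"G1238C: {low_1238:.2f}-{high_1238:.2f}")
```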
Based on data for 80 chromosomes, estimates of θ were 15.2 × 10⁻⁴, 9.8 × 10⁻⁴, and 11.2 × 10⁻⁴ for promoter, cDNA, and total screening regions, respectively (Table 3). Values for θ were 4.7 × 10⁻⁴ and 7.9 × 10⁻⁴ for the 1705 nonsynonymous and 512 synonymous base pairs of the protein coding region, respectively. π, the estimated average heterozygosity per nucleotide site, was estimated for regions of VCAM1 (Table 3), including subanalysis of the protein coding region (1705 nonsynonymous base pairs with 4 SNPs where π = 2.1 × 10⁻⁴ and 512 synonymous base pairs with 2 SNPs where π = 13.2 × 10⁻⁴). Tajima D statistics for the 5′ upstream region (D = −1.26, P = NS), cDNA (D = −1.32, P = NS), and for the 5962 base pairs screened overall (D = −1.28, P = NS) were consistent with neutrality and equilibrium within this population of 40 individuals (80 chromosomes, Table 3).
Table 3
Region sequenced | Base pairs screened | Variant sites | θ × 10⁻⁴ (SD) | π × 10⁻⁴ (SD) | Tajima D
---|---|---|---|---|---
5′ upstream* | 2258 | 17 | 15.2 (5.2) | 8.6 (0.8) | −1.26
cDNA | 3103 | 15 | 9.8 (3.4) | 5.2 (0.7) | −1.32
Total base pairs screened† | 5962 | 33 | 11.2 (3.4) | 6.6 (0.5) | −1.28
P values for all calculations of the Tajima D statistic are > .10, indicating that the value of D is not significantly different from the expected value of 0 for neutral mutations in a population at equilibrium. SD indicates standard deviation.
* Defined as all bases upstream of the ATG for this analysis including 124 bp of the published cDNA.
† Includes 2258 bp of the 5′ upstream region (including 124 bp of the cDNA), the entire coding region (cDNA XM_035776), 200 bp downstream of the 3′ end of the cDNA, and some intronic regions.
VCAM1 genotypes in stroke and SS disease
Selected coding SNPs and common promoter SNPs (defined as a variant allele frequency of > 0.10) were then studied in the SS stroke population. The promoter SNP, T−1599G, was also selected for further study because it maps to position 5 of a putative AP-1 consensus-binding sequence36 (identified by NSITE software at http://genomic.sanger.ac.uk). Of the 10 SNPs analyzed in the pilot study, one is informative in this well-characterized Jamaican SS population. The wild-type VCAM1 G1238 allele was more common in the stroke group compared with the control group (P = .04; Table 4), suggesting that the C variant could be protective against stroke in SS disease. Analysis by allelic frequency showed C1238 alleles to be significantly reduced among stroke cases versus controls (8/102 C alleles versus 20/102 C alleles, respectively; χ2 = 5.70, OR 0.35; 95% CI 0.15-0.83; P = .02; Table 5). Analysis of matched pairs yielded a comparable, but not statistically significant, association (VCAM1 1238 GC/CC versus GG genotypes, McNemar-corrected χ2 = 3.37, OR 0.36, 95% CI 0.11-1.06, paired P = .07). Based on genotype frequencies for the VCAM1 C1238 allele (derived from Table 2; GC 39/229 individuals = 0.1703, and CC 3/229 individuals = 0.0131) and an observed 65% reduction in the prevalence of stroke (1 − observed 0.35 OR), the estimated proportion of symptomatic stroke cases that might be prevented for carriers of this genetic marker could be as high as 11.9% [(0.65 × 0.1703) + (0.65 × 0.0131) = 11.9%].35 There were 7 additional SNPs in the VCAM1 gene, presented in Table 4, that were not significantly associated with stroke, as was also observed for 2 synonymous coding SNPs in exon 9 (C2079T, P = NS and A2208G, P = NS; data not shown).
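The allelic odds ratio, its Woolf confidence interval, and the protective-potential arithmetic quoted above can be checked with the following sketch. It uses the allele counts reported for stroke cases and controls and the healthy-donor genotype frequencies given in the text; the use of z = 1.96 for a 95% interval and the function name `odds_ratio_woolf` are assumptions for illustration, not a description of the EpiInfo or Instat routines used in the study.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2 x 2 table using Woolf's logit method.

    a, b: exposed and unexposed counts among cases
    c, d: exposed and unexposed counts among controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# C and G alleles at position 1238: stroke cases (8, 94) and controls (20, 82).
odds_ratio, ci_low, ci_high = odds_ratio_woolf(8, 94, 20, 82)

# Protective potential: (1 - OR) x frequency of C1238 carriers (GC + CC)
# among healthy donors, following the formula given in the Methods.
carrier_frequency = 39 / 229 + 3 / 229
preventable_fraction = (1 - round(odds_ratio, 2)) * carrier_frequency

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
print(f"potentially preventable strokes: {preventable_fraction * 100:.1f}%")
```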
Table 4
VCAM1 SNP and genotype | Stroke (%), n = 51 | Controls (%), n = 51 | P for genotype*
---|---|---|---
C−2021T† | | |
CC | 23 (46) | 22 (44) | 1.00
CT | 22 (44) | 20 (40) | .84
TT | 5 (10) | 8 (16) | .56
T−1599G | | |
TT | 43 (84) | 48 (94) | .20
TG | 8 (16) | 3 (6) | .20
GG | 0 (0) | 0 (0) | NA
T−1592C | | |
TT | 40 (78) | 35 (69) | .26
CT | 8 (16) | 13 (25) | .22
CC | 3 (6) | 3 (6) | .67
T−833C† | | |
TT | 34 (68) | 33 (66) | 1.00
TC | 13 (26) | 16 (32) | .66
CC | 3 (6) | 1 (2) | .62
C953T/A1150G‡ | | |
CC/AA | 48 (96) | 47 (96) | 1.00
CT/AG | 2 (4) | 2 (4) | 1.00
TT/GG | 0 (0) | 0 (0) | NA
G1238C | | |
GG | 43 (84) | 34 (67) | .04
GC | 8 (16) | 14 (27) | .15
CC | 0 (0) | 3 (6) | .24
A2146T§ | | |
AA | 45 (88) | 46 (92) | .74
AT | 6 (12) | 3 (6) | .49
TT | 0 (0) | 1 (2) | .50
NA indicates not applicable.
* 2 × 2 tables (Fisher exact test) for indicated genotype versus all others.
† There were 2 PCR reactions (1 case and 1 control) that failed to amplify.
‡ There were 1 case and 2 control PCR reactions that failed to amplify.
§ There was 1 control PCR reaction that failed to amplify.
Table 5
Nucleotide | Stroke (%), n = 102 alleles | Controls (%), n = 102 alleles | Odds ratio (95% CI) | P*
---|---|---|---|---
G1238C | | | |
C alleles | 8 (8) | 20 (20) | 0.35 (0.15-0.83) | .02
G alleles | 94 (92) | 82 (80) | |
* 2 × 2 table.
Steady state WBC count and VCAM1 genotype
Previously, an elevated steady state WBC count has been identified as a risk factor for stroke in Jamaica (in this population stroke mean WBC count 15.3 × 10⁶/μL versus control WBC count 13.6 × 10⁶/μL; P = .006).2,31 Steady state WBC count was significantly higher among stroke cases with the VCAM1 −1599 TT genotype compared with TG strokes (P = .02) and TT controls (P = .002, Table 6). Comparison of steady state WBC count by VCAM1 G1238C genotype demonstrates that leukocyte counts are significantly elevated for stroke cases with the GG genotype versus controls with the GG genotype (P = .01, Table 7).
Table 6
T−1599G | Stroke, TT (n = 43) | Stroke, TG (n = 8) | Control, TT (n = 47) | Control, TG (n = 3)
---|---|---|---|---
Mean WBC (SD)* | 15.7 (3.7)†,‡ | 13.1 (1.1)† | 13.7 (3.3)‡ | 12.0 (3.0)
SD indicates standard deviation.
* Values for white blood cell (WBC) counts are × 10⁶/μL.
† Mean values are significantly different for TT strokes versus TG strokes by Mann-Whitney test (U statistic = 78, P = .02).
‡ Mean values are significantly different for strokes versus controls of the TT genotype by Mann-Whitney test (U statistic = 674, P = .004).
Table 7
G1238C | Stroke, GG (n = 43) | Stroke, GC (n = 8) | Control, GG (n = 34) | Control, GC (n = 14) | Control, CC (n = 3)
---|---|---|---|---|---
Mean WBC (SD)* | 15.3 (3.8)† | 15.1 (1.7) | 13.1 (3.2)† | 15.1 (3.3) | 11.7 (2.3)
SD indicates standard deviation.
* Values for white blood cell (WBC) counts are × 10⁶/μL.
† Mean values are significantly different for strokes versus controls of the GG genotype by Mann-Whitney test (U statistic = 480, P = .01).
Discussion
SS disease is a monogenic disorder characterized by a mutation resulting in a single amino acid change in the β-globin chain (in the HBB gene). The observed spectrum of variability in clinical course, even within affected family members, attests to the heterogeneity of influences, including mutational heterogeneity at the HBB locus, environmental factors, or genetic variability at other unlinked loci.37,38 Consequently, investigation has focused on factors that impact vaso-occlusion of sickle cells, which can result from accelerated polymerization of sickle hemoglobin. In this regard, events controlling inflammation, and specifically cell adhesion, could influence the pathogenesis of stroke.
In the present study, we have characterized sites of genetic variation in selected regions of the candidate gene VCAM1, which has previously been implicated in the pathogenesis of both SS disease19-22,25 and ischemic stroke without underlying SS disease.26-28 To assess whether the SNPs identified might be associated with selective function as opposed to neutral mutation, we have estimated VCAM1 nucleotide diversity and tested the population genetic hypothesis that all SNPs in VCAM1 are selectively neutral (Tajima D). The nucleotide diversity values (ie, π and θ) are within ranges reported by other large gene surveys.39-41 The difference between π, whose value is affected by the frequency of variants, and θ, which is determined by the overall number of variants, forms the basis of the Tajima D statistic.34 Moreover, the frequency and distribution of VCAM1 SNPs are consistent with evolution under a neutral model of mutation, although larger screening populations may be necessary to increase the statistical power of the Tajima test.42 The negative values for Tajima D reflect the many low-frequency VCAM1 SNPs in this population and may indicate selection against specific VCAM1 alleles.43 However, a larger data set suggests recent expansion of human population size may be a more likely explanation.39 Overall, our fine-scale SNP map of VCAM1 can be used as a model for comprehensive annotation of selected candidate modifier genes of the SS disease phenotype,37 for constructing haplotypes for the VCAM1 locus, and as markers for subsequent association studies in other inflammatory or thromboembolic diseases.
Selected variants were then evaluated in a small SS disease population with clinically overt stroke.2,31 Significantly, the C allele of VCAM1 G1238C (domain 5) could be associated with protection from stroke in SS disease. Thus, as many as 11.9% of stroke cases might be prevented35 in association with the C1238 allele, implying a moderate-to-strong genetic effect in modifying the clinical outcome of SS disease. Lastly, subanalysis of steady state WBC count and genotype suggests that some of these variants could influence leukocyte counts, a known risk factor for stroke, adverse complications, mortality, and possibly a determinant of vascular occlusion in SS disease.1,2,44-46 This final observation is preliminary but intriguing in light of experimental data supporting a functional role for VCAM-1 in the release of hematopoietic cells from bone marrow, given recent observations of potentially fatal SS complications associated with intense leukocytosis after administration of G-CSF, and an association between duration of leukocyte infiltration and both infarct size and the severity of neurologic outcome in non–SS disease adults with CNS infarcts.17,18,47-49
Although this study indicates that a VCAM1 SNP could be associated with stroke in a homogeneous SS disease population, we recognize that confounding variables inherent to case-control studies might also influence our preliminary observation. A limitation of our control group is the unknown prevalence of undetected silent cerebral infarcts, because neuroimaging was not performed on control subjects. Patients with SS disease may have silent lesions more frequently than previously appreciated.50 However, our study is defined on the basis of clinically documented symptomatic events and not radiographic findings in asymptomatic patients with SS, as it is not presently known whether symptomatic strokes are different from silent infarcts. A difference is nonetheless suggested by a multicenter evaluation of patients with SS disease who are asymptomatic for cerebrovascular disease, in which there was discordance between MRI and transcranial doppler results suggestive of CNS infarction.51 Finally, VCAM1 could be in linkage disequilibrium (LD) with another nearby locus affecting susceptibility to this SS disease complication, although this seems unlikely given the observation that LD does not appear to extend significantly beyond 5 kb in many regions of the genome, particularly in a population of African descent.52,53 Confirmatory testing of this association in a much larger independent SS disease population with attention to these and other factors is required before clinically meaningful conclusions can be made. Furthermore, future studies of genetic modifiers in SS disease will need to provide heritability estimates for the phenotypes or complications under study, so that the relative contribution of any single genetic determinant can be assessed.
The specific mechanism for a modifying effect in SS disease by a single amino acid substitution in VCAM-1 could shed light on a critical structure/function relationship, especially in the context of previous in vitro mutational studies of immunoglobulin superfamily adhesion molecules.54 Indeed, adhesion receptors direct a variety of functions such as cell-cell interaction, leukocyte trafficking/signaling, and T-cell recognition.15 Immune responses generated by these interactions can be either protective or pathologic, as in the case of transplant rejection. The role of VCAM1 promoter and nonsynonymous SNPs, which could theoretically alter function and expression, awaits further investigation.
This pilot genetic association study provides preliminary evidence that an individual VCAM1 genetic variant may influence risk for stroke in SS disease, especially in light of the reported role of VCAM-1 in the pathogenesis of SS disease.19-22,25 However, the results of this pilot association study must be confirmed in both larger population-based trials and family studies before their implications can be realized in either clinical care or the design of novel anti-inflammatory therapies such as monoclonal antibodies.55 Further examination of variation at the VCAM1 locus by haplotype may help to quantify the cumulative effects of all variants at this locus, elucidate the underlying pathogenesis of cerebrovascular events, both in the absence and presence of SS disease, and identify patients with SS disease who are candidates for therapies designed to prevent strokes.7
We thank Todd Brooks, Ed Miller, and Bernice Packer of the Genetic Annotation Initiative/NCI Cancer Genome Anatomy Project for providing VCAM1 cDNA PCR primers, equipment, advice, and assistance with software; and Drs EunWha Choi, Hans C. Erichsen, and Charles B. Foster for encouragement and useful discussions.
VCAM1 SNP | Primers | Annealing temperature | Detection method |
---|---|---|---|
VCAM1 promoter | F: TATTTCAGTGGGGACAAGGC | 63°C | Sequencing and BsrI digest |
C–2021T | R: GACATCTGAAGTCCTACCTGC | ||
VCAM1 promoter | F: AAGGACCTCTGGGTTACTTGTTT | 55°C | Sequencing |
T–1599G | R: CTCCTCTTGGATACTGATGTGGCT | ||
T–1592C | |||
VCAM1 promoter | F: AGCCACATCAGTATCCAAGAGGAG | 61°C | Sequencing |
T–1379C | R: GACCAGTTCTTGTTCATTGTTCATTGTTGTATC | ||
A–1356G | |||
A–1242T | |||
VCAM1 promoter | F: GATACAACAATGAACAAGAACTGGTC | 52°C | Sequencing |
G–1148A | R: CAATAACCAACTCTATGTTCCTTTTC | ||
VCAM1 promoter | F: CAAGAGATTTGCCACTTCAGATG | 55°C | RsaI digest and sequencing |
T–833C | R: AAAAGGGACACCATAACTTCTTAG | ||
VCAM1 exon 1 | F: AGTGGAACTTGGCTGGGTG | 63°C | Sequencing |
C–109T | R: TCACTACTATCGCAAAACTGACTG | ||
G–54A | |||
VCAM1 exon 5 | F: GTGTCCCAGAGAAACCATTTAC | 63°C | ApaLI digest (A1150G only) and sequencing |
C953T | R: GAAAACCACTTACAGTAGAGCTCC | ||
A1150G | |||
VCAM1 exon 6 | F: CGTTTTTGCTTGCGATTTG | 55°C | Cac8I digest and sequencing |
G1238C | R: CCAGTATCTTCAATGGTAGGGATG | ||
VCAM1 exon 9 | F: TAGACATTAATTGCATCCATTTTG | 63°C | Sequencing |
C2079T | R: ATTCAGGGAAGTCTGCCTCTC | ||
A2146T | |||
A2208G | |||
PCR genotyping reactions were carried out as described under the following cycling conditions: initial denaturation for 10 minutes, followed by 40 cycles of 94°C for 30 seconds, the indicated annealing temperature (see table above) for 30 seconds, and 72°C for 40 seconds, and a final extension for 10 minutes. PCR products were then sequenced or digested with a genotype-specific restriction endonuclease. Restriction digests were incubated at 65°C (BsrI) or 37°C (ApaLI, Cac8I, and RsaI) for 4 hours, and then resolved on an agarose gel.
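The cycling program above can be written out as a small Python sketch that records each step and adds up the programmed hold time. The class, the constant names, and the amplicon labels are hypothetical and chosen only for illustration. The temperatures of the initial denaturation and final extension are not given in the text, so they are left unspecified, and ramp times between steps are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    seconds: int

# Steps as described in the text; temperatures not stated there are omitted.
INITIAL_DENATURATION = Step("initial denaturation", 10 * 60)
CYCLE = (
    Step("denaturation at 94 C", 30),
    Step("annealing at amplicon-specific temperature", 30),
    Step("extension at 72 C", 40),
)
N_CYCLES = 40
FINAL_EXTENSION = Step("final extension", 10 * 60)

# Annealing temperatures (degrees C) taken from the primer table;
# the dictionary keys are illustrative labels, not names used by the authors.
ANNEALING_C_BY_AMPLICON = {
    "promoter (C-2021T)": 63,
    "promoter (T-1599G, T-1592C)": 55,
    "promoter (T-1379C, A-1356G, A-1242T)": 61,
    "promoter (G-1148A)": 52,
    "promoter (T-833C)": 55,
    "exon 1 (C-109T, G-54A)": 63,
    "exon 5 (C953T, A1150G)": 63,
    "exon 6 (G1238C)": 55,
    "exon 9 (C2079T, A2146T, A2208G)": 63,
}

def total_hold_seconds():
    """Sum of programmed hold times, excluding ramping between steps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds

if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
    for amplicon, temp_c in ANNEALING_C_BY_AMPLICON.items():
        print(f"{amplicon}: anneal at {temp_c} C")
```

Because only the annealing temperature differs between amplicons, the hold time is the same for every reaction; actual run time will be longer once instrument ramping is included.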
Prepublished online as Blood First Edition Paper, August 15, 2002; DOI 10.1182/blood-2001-12-0306.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
Author notes
Stephen J. Chanock, Section on Genomic Variation, Pediatric Oncology Branch, Advanced Technology Center, National Cancer Institute, 8717 Grovemont Cir, Gaithersburg, MD 20877; e-mail: sc83a@nih.gov.