TO THE EDITOR:
Down syndrome (DS) is caused by constitutional trisomy of chromosome 21 and is associated with an up to 30-fold increased risk of acute lymphoblastic leukemia (ALL).1,2 While DS is associated with alterations in epigenetic markers, including DNA methylation, and gene expression,3-6 these mechanisms have not been fully explored in relation to DS-ALL etiology.7 Because the epigenome is sensitive to genetic and environmental influences during fetal development and can be leveraged to characterize blood cell proportions,8 we sought to evaluate the role of the neonatal methylome in children with DS on subsequent ALL risk.
Our epigenome-wide association study (EWAS) included 126 DS-ALL cases and 198 DS control subjects from the International Study of Down Syndrome Acute Leukemia7,9 in the Discovery dataset and 24 cases and 24 control subjects from the Michigan-based DS-ALL study7 in the Replication group. DNA was isolated from neonatal dried bloodspots, bisulfite-converted and assayed using Illumina EPIC methylation arrays. Further details on study subjects, quality control and processing of methylation array data, and statistical analyses are included in the supplemental Methods. The Institutional Review Boards of each participating site approved the study, which was conducted according to the Declaration of Helsinki.
Demographic and birth-related data are summarized in Table 1. Unsupervised hierarchical clustering did not differentiate DS-ALL cases from DS control subjects but did demonstrate variation in blood cell proportions, determined by reference-based deconvolution using the Identifying Optimal Libraries algorithm,10 and identified a subset of DS newborns with high nucleated red blood cell proportions, as previously shown6 (supplemental Figure 1).
Characteristic | Discovery Study: DS control subjects (n = 198), n (%) | Discovery Study: DS-ALL (n = 126), n (%) | Discovery Study: P value | Replication Study: DS control subjects (n = 24), n (%) | Replication Study: DS-ALL (n = 24), n (%) | Replication Study: P value |
---|---|---|---|---|---|---|
Sex | ||||||
Male | 91 (46.0) | 84 (66.7) | – | 14 (58.3) | 13 (54.2) | – |
Female | 107 (54.0) | 42 (33.3) | .00037* | 10 (41.7) | 11 (45.8) | .771* |
Race/ethnicity | ||||||
Asian | 10 (5.1) | 2 (1.6) | – | 1 (4.2) | 1 (4.2) | – |
Latino | 96 (48.5) | 86 (68.3) | – | 3 (12.5) | 2 (8.3) | – |
Non-Latino White | 54 (27.3) | 32 (25.4) | – | 15 (62.5) | 20 (83.3) | – |
Non-Latino Black | 10 (5.1) | 2 (1.6) | – | 5 (20.8) | 1 (4.2) | – |
Other | 28 (14.1) | 4 (3.2) | .00037* | 0 | 0 | .287* |
Missing | 0 | 0 | | 0 | 0 | |
Age at DS-ALL diagnosis (y) | ||||||
Median (range) | – | 4.0 (0-14.6) | – | – | <4.0, n = 13; ≥4.0, n = 11† | – |
Blood collection age (d) | ||||||
Mean (SD) | 2.47 (2.03) | 2.03 (2.14) | .068 | N/A | N/A | – |
Median (range) | 1.71 (0.17-15.25) | 1.46 (0-18.96) | – | N/A | N/A | – |
Missing | 3 (1.5) | 9 (7.1) | – | 24 (100.0) | 24 (100.0) | – |
Gestational age (wk) | ||||||
Mean (SD) | 38.10 (2.33) | 38.22 (2.83) | .67‡ | N/A | N/A | – |
Median (range) | 38.29 (26.42-44.71) | 38.43 (25.57-44.43) | – | N/A | N/A | – |
Preterm (<37) | 41 (22.7) | 29 (24.0) | .78* | N/A | N/A | – |
Missing | 17 (8.6) | 5 (4.0) | – | 24 (100.0) | 24 (100.0) | – |
Birthweight (kg) | ||||||
Mean (SD) | 3.00 (0.74) | 3.08 (0.60) | .31‡ | N/A | N/A | – |
Median (range) | 3.02 (0.81-8.65) | 3.12 (0.94-4.58) | – | N/A | N/A | – |
Missing | 4 (2.0) | 1 (0.8) | – | 24 (100.0) | 24 (100.0) | – |
*P values calculated using a 2-tailed Fisher’s exact test.
†Age-at-diagnosis only available in categories for DS-ALL cases in the Replication Study.
‡P values were calculated using a 2-tailed t test.
Deconvolution of blood cell proportions in the Discovery study revealed a significant increase in B-cell proportions at birth in DS-ALL cases (mean, 0.0128) compared with DS control subjects (mean, 0.00826; P = 8.58 × 10⁻⁴), a difference that was also observed in the Replication study (P = .03) and in the meta-analysis (meta-analysis effect size = 0.0056; Pmeta = 1.69 × 10⁻⁴; Phet = .15) (supplemental Figure 2; Table 2). Among cell types, B cells showed the greatest proportional difference between cases and control subjects in both Discovery (+55.57% in DS-ALL) and Replication (+22.23%) studies (supplemental Table 1). An independent deconvolution method, Epigenetic Dissection of Intra-Sample-Heterogeneity (EpiDISH),11 confirmed the increased B-cell proportions in DS-ALL cases in both studies (Pmeta = 1.67 × 10⁻⁴) (supplemental Table 2).
Cell type | Discovery Study (126 cases, 198 control subjects): effect estimate* | Discovery Study: standard error* | Discovery Study: P value* | Replication Study (24 cases, 24 control subjects): effect estimate* | Replication Study: standard error* | Replication Study: P value* | Meta-analysis: effect estimate† | Meta-analysis: standard error* | Meta-analysis: Pmeta† | Meta-analysis: Phet† | Direction |
---|---|---|---|---|---|---|---|---|---|---|---|
CD4 T cell | 0.0036 | 0.0055 | .51 | −0.0147 | 0.0136 | .29 | 0.0011 | 0.0051 | .83 | .21 | −+ |
CD8 T cell | 0.0071 | 0.0030 | .016 | 0.0168 | 0.0102 | .11 | 0.0079 | 0.0028 | .0055 | .36 | ++ |
B cell | 0.0051 | 0.0015 | 8.58 × 10⁻⁴ | 0.0152 | 0.0069 | .03 | 0.0056 | 0.0015 | 1.69 × 10⁻⁴ | .15 | ++ |
NK cells | 0.0028 | 0.0024 | .24 | 0.0048 | 0.0079 | .55 | 0.0030 | 0.0023 | .19 | .81 | ++ |
Granulocyte | 0.0076 | 0.0178 | .67 | −0.0482 | 0.0376 | .21 | −0.0026 | 0.0161 | .87 | .18 | −+ |
Monocyte | 0.0010 | 0.0040 | .81 | −0.0003 | 0.0107 | .98 | 0.0008 | 0.0038 | .83 | .91 | −+ |
nRBC | −0.0301 | 0.0228 | .19 | 0.0163 | 0.0357 | .65 | −0.0166 | 0.0192 | .39 | .27 | +− |
NK, natural killer; nRBC, nucleated red blood cells.
P < .05 highlighted in bold.
*P values, coefficients, and standard errors were calculated using linear regression, testing each blood cell type separately as the dependent variable, with DS-ALL status as the independent variable and with sex, batch, and ancestry-related principal components from EPISTRUCTURE12 (n = 10 for Discovery study; n = 3 for Replication study) as covariates. P values were not adjusted for multiple comparisons.
†Meta-analysis performed using METAL.13
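As an illustration of how the B-cell meta-analysis row can be approximately recovered from the two study-level rows, the following is a minimal sketch assuming an inverse-variance-weighted fixed-effect scheme. This is one of the weighting schemes METAL offers, but the text here does not state which scheme was used, so the choice is an assumption. The function names are hypothetical, and the inputs are the rounded values from the table, so small deviations from the reported P values are expected.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def chi2_1df_p(q):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2))

def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis (assumed scheme)."""
    weights = [1 / se ** 2 for se in std_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, estimates)) / total
    se = math.sqrt(1 / total)
    # Cochran's Q summarizes between-study heterogeneity.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, estimates))
    return beta, se, two_sided_p_from_z(beta / se), q

# B-cell row: Discovery and Replication effect estimates and standard errors.
beta, se, p_meta, q = fixed_effect_meta([0.0051, 0.0152], [0.0015, 0.0069])
# With two studies, Q has 1 degree of freedom.
p_het = chi2_1df_p(q)

print(f"beta = {beta:.4f}, SE = {se:.4f}, P_meta = {p_meta:.2e}, P_het = {p_het:.2f}")
```

Under these assumptions the pooled estimate and standard error round to the tabulated 0.0056 and 0.0015, and the heterogeneity P value rounds to .15. The pooled P value comes out on the order of 10⁻⁴, slightly different from the reported value because the inputs are rounded.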
In analyses stratified by self-reported race and ethnicity in the Discovery study, increased neonatal B-cell proportions showed a stronger effect in Latinos (effect size = 0.0058; P = 6.15 × 10⁻³) than in non-Latino Whites (effect size = 0.0046; P = .098), although this difference was not statistically significant (Phet = .74) (supplemental Table 3).
We performed several sensitivity analyses in the Discovery study to assess potential confounders of the increased B-cell proportions in DS-ALL. First, in subjects with available birth-variable data, we adjusted the regression model for gestational age, birth weight, and bloodspot collection age, and the difference in B-cell proportions between DS-ALL cases (n = 116) and DS control subjects (n = 173) remained significant (effect size = 0.0059; P = 3.38 × 10⁻⁴).
Next, in Latino and non-Latino White subjects with single nucleotide polymorphism (SNP) genotype data (117 cases, 130 control subjects), we assessed whether SNPs associated with DS-ALL risk in ARID5B (rs7089424), IKZF1 (rs11978267), CDKN2A (rs3731249), or GATA3 (rs3824662)7 may confound the association with B-cell proportions, as these loci were previously associated with variation in white blood cell traits.14 We included the genotypes of these 4 SNPs in the regression model one at a time and also all together, and the significantly increased B-cell proportions in DS-ALL cases remained, with similar effect sizes in Latinos and non-Latino Whites (supplemental Table 4).
Finally, we removed GATA1 mutation-positive control subjects (n = 30 of 184 tested; see supplemental Methods), and the difference in B-cell proportions remained significant (effect size = 0.0043; P = 9.02 × 10⁻³).
In the Discovery study EWAS of DS-ALL (126 cases, 198 control subjects), the genomic inflation factor was 1.11 after correction with BACON, a Bayesian method to control bias and inflation in EWAS.15 There were 38 significant differentially methylated probes (DMPs) after false discovery rate (FDR) correction and 10 epigenome-wide–significant DMPs after Bonferroni correction (P < 7.95 × 10⁻⁸) (supplemental Figure 3; supplemental Table 5). Pathway enrichment analysis of FDR-significant DMPs revealed significant enrichment of 21 gene ontology pathways (supplemental Table 6). The top DS-ALL–associated CpG (cg27347265; P = 2.90 × 10⁻¹²) was located in a putative regulatory region of the B-cell transcription factor gene EBF1 (supplemental Figure 4; supplemental Table 5). For all 10 Bonferroni-significant DMPs, the case-control methylation β-value difference was <0.02, and none were significant in the Replication study at P < .05, although 6 out of 10 had consistent directions of effect.
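To make the two multiple-testing corrections concrete, the sketch below shows why an FDR correction typically retains more probes than a Bonferroni correction, as seen here (38 vs 10 DMPs). It assumes a significance level of 0.05 and the Benjamini-Hochberg procedure for FDR; neither is stated in the text above, and the toy P values and function names are purely illustrative. It also back-calculates the number of tests implied by the Bonferroni threshold under that assumed significance level.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices of tests declared significant at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered P value satisfies p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    return sorted(order[:max_k])

# Number of tests implied by the reported threshold, ASSUMING alpha = 0.05.
implied_tests = round(0.05 / 7.95e-8)

# Toy P values (illustrative only, not study data).
toy_p = [2.9e-12, 1e-4, 0.012, 0.03, 0.2, 0.5]
cutoff = bonferroni_threshold(0.05, len(toy_p))
bonf_hits = [i for i, p in enumerate(toy_p) if p < cutoff]
bh_hits = benjamini_hochberg(toy_p)

print(implied_tests, bonf_hits, bh_hits)
```

In the toy example, Bonferroni retains only the two smallest P values, whereas Benjamini-Hochberg retains four, mirroring the pattern of fewer Bonferroni-significant than FDR-significant DMPs.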
We identified 31 significant differentially methylated regions (DMRs) associated with DS-ALL in the Discovery study (supplemental Table 7). Although none of the DMRs were statistically significant in the Replication study, 4 of 31 contained significant (P < .05) differentially methylated CpGs with the same direction of methylation changes as the Discovery study (supplemental Table 7).
In summary, an increase in the neonatal proportion of B cells was associated with DS-ALL risk, a finding that persisted after adjustment for potential confounding factors and was consistent between 2 independent case-control datasets. DS is associated with reduced fetal B-cell production16,17 and reduced numbers of B cells in fetal life16,18 and childhood.19,20 We previously observed lower B-cell proportions in newborns with DS than in newborns without DS using reference-based cell-type deconvolution analysis.6 Results from the current study support that, in the context of DS, children with greater B-cell proportions at birth have an increased risk of developing DS-ALL. A genetic predisposition to overproducing lymphocytes was recently associated with increased ALL risk in the non-DS population.14 Further studies are required to understand the mechanisms underlying the association between increased B cells and ALL development in children with and without DS, but these may involve effects on the proliferation of preleukemic clones and generation of leukemia-forming mutations, as well as potential impacts on immune function and response to infections.14,21
We did not find strong evidence for differences in DNA methylation at birth that might predict subsequent DS-ALL risk, although the Replication dataset was underpowered to reproduce significance for the small differences found between cases and controls in the Discovery study. The significant EBF1 DMP is intriguing given that this gene is frequently deleted in ALL.22 Investigation of DNA methylation differences in sorted cell populations is required to determine cell-specific epigenetic changes associated with DS-ALL risk.
A strength of our study was the use of newborn dried bloodspots (DBS), which were collected before disease onset; any case-control differences should therefore not be confounded by the presence of leukemia cells. Indeed, in the Discovery study, only 1 DS-ALL case was diagnosed at <1 year of age, and the B-cell case-control difference was significant both when restricted to cases with age at diagnosis ≤4 years (n = 64; effect size = 0.0039; P = .034) and when restricted to cases diagnosed at >4 years of age (n = 62; effect size = 0.0061; P = 1.01 × 10⁻³).
One study limitation is the use of a blood cell proportion deconvolution methodology developed in euploid individuals,10 although the same approach confirmed known differences in blood cell proportions associated with DS.6 Nonetheless, the increased B-cell proportion in DS-ALL cases requires confirmation using blood cell count measures in newborns. Another limitation was that sequencing data for somatic GATA1 mutations, which cause transient abnormal myelopoiesis,23 were available only for DS control subjects in the Discovery study; however, removal of GATA1 mutation-positive control subjects had minimal effect on the B-cell association.
Future studies are needed to understand the role of blood cell trait variation in DS-ALL etiology and examine increased neonatal B cells as a potential risk factor for ALL in the non-DS population.
Acknowledgments: The authors thank Robin Cooley and Steve Graham (Genetic Disease Screening Program, CDPH) for their assistance and expertise in the procurement and management of DBS specimens. The authors also thank Hong Quach and Diana Quach at the UC Berkeley QB3 Genetic Epidemiology and Genomics Laboratory for their support in preparing and processing samples for genome-wide DNA methylation arrays. The authors additionally thank the families for their participation in the California Childhood Leukemia Study (formerly known as the Northern California Childhood Leukemia Study). For recruitment of subjects enrolled in the California Childhood Leukemia Study, the authors gratefully acknowledge the clinical investigators at the following collaborating hospitals: University of California, Davis Medical Center (Jonathan Ducore), University of California, San Francisco (Mignon Loh and Katherine Matthay), Children’s Hospital of Central California (Vonda Crouse), Lucile Packard Children’s Hospital (Gary Dahl), Children’s Hospital Oakland (James Feusner and Carla Golden), Kaiser Permanente Roseville (formerly Sacramento) (Kent Jolly and Vincent Kiley), Kaiser Permanente Santa Clara (Carolyn Russo, Alan Wong, and Denah Taggart), Kaiser Permanente San Francisco (Kenneth Leung), Kaiser Permanente Oakland (Daniel Kronish and Stacy Month), California Pacific Medical Center (Louise Lo), Cedars-Sinai Medical Center (Fataneh Majlessipour), Children’s Hospital Los Angeles (Cecilia Fu), Children’s Hospital Orange County (Leonard Sender), Kaiser Permanente Los Angeles (Robert Cooper), Miller Children’s Hospital Long Beach (Amanda Termuhlen), University of California, San Diego Rady Children’s Hospital (William Roberts), and University of California, Los Angeles Mattel Children’s Hospital (Theodore Moore).
This work was supported by an Alex’s Lemonade Stand Foundation “A” Award (A.J.d.S.), a National Institutes of Health (NIH) National Cancer Institute (NCI) Grant (R01CA249867 [K.R.R., P.J.L., and A.J.d.S.]), an NIH NCI Administrative Supplement grant (3R01CA175737-05S1 [J.L.W., X.M., and A.J.d.S.]), and National Institute for Environmental Health Sciences (NIEHS) grants (R01ES009137, P42ES004705, and R24ES028524). The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. A subset of biospecimens and/or data used in this study were obtained from the California Biobank Program at the California Department of Public Health (CDPH), SIS request numbers 572 and 26, in accordance with Section 6555(b), 17 CCR. The CDPH is not responsible for the results or conclusions drawn by the authors of this publication.
The collection of cancer incidence data used in the CCRLP study was supported by the California Department of Public Health pursuant to California Health and Safety Code Section 103885, Centers for Disease Control and Prevention’s (CDC) National Program of Cancer Registries, under cooperative agreement 5NU58DP003862-04/DP003862, the National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) Program under contract HHSN261201000140C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the author(s) and do not necessarily reflect the opinions of the State of California, the Department of Public Health, the National Institutes of Health, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors. This study used birth data obtained from the State of California Center for Health Statistics and Informatics. The California Department of Public Health is not responsible for the analyses, interpretations, or conclusions drawn by the authors regarding the birth data used in this publication.
Contribution: A.J.d.S., J.L.W., P.J.L., and A.L.B. designed and supervised this study; S.L., P.S., K.X., I.S.M., N.E., P.P., and A.J.d.S. analyzed the data; N.E., S.S.M., and H.M.H. performed experiments; L.M.M., A.Y.K., C.M., X.M., B.A.M., A.R., I.R., and K.R.R. provided resources; S.L., P.S., K.R.R., A.L.B., P.J.L., J.L.W., and A.J.d.S. prepared the manuscript; and all authors edited and approved the paper.
Conflict-of-interest disclosure: The authors declare no competing financial interests.
Correspondence: Adam J. de Smith, USC Norris Comprehensive Cancer Center, 1450 Biggy St, NRT-1509H, Los Angeles, CA 90033; e-mail: adam.desmith@med.usc.edu.
References
Author notes
This study used biospecimens from the California Biobank Program. Any uploading of genomic data (including genome-wide DNA methylation data) and/or sharing of these biospecimens or individual data derived from these biospecimens have been determined to violate the statutory scheme of the California Health and Safety Code Sections 124980(j), 124991(b), (g), (h), and 103850 (a) and (d), which protect the confidential nature of biospecimens and individual data derived from biospecimens. The individual-level data derived from these biospecimens that support the findings of this study are available from the corresponding author upon request (adam.desmith@med.usc.edu) and with permission from the California Biobank Program and Michigan Newborn Screening Program. Data for deconvoluted blood cell proportions and available covariates in the Discovery Study subjects are included in the supplemental Dataset.
The full-text version of this article contains a data supplement.