Introduction:

The age-associated accumulation of somatic mutations in blood, termed Age-Related Clonal Hematopoiesis (ARCH), has been documented across several cohorts in recent years and confers increased risk of morbidity and mortality. Despite significant efforts, the genomic, clinical, and environmental associations of ARCH have been reported in only a limited range of ancestries, which restricts our collective ability to generalize findings and undermines efforts towards genomics equity across human populations. Here we present a combined effort to profile mosaic chromosomal alterations (mCAs) driving clonal hematopoiesis (CH) in more than 37,000 samples, including over 12,000 samples from 10 African countries and over 25,000 samples from Quebec and Ontario. We aimed to characterize patterns of mCA accumulation, as well as germline associations affecting the risk of mCA development, in both African and North American populations.

Methods:

A total of 37,857 samples from African cohorts (AWI-Gen, TrypanoGEN+) and North American cohorts (CARTaGENE, Ontario Health Study) were genotyped and passed QC and filtering. All samples from African cohorts were genotyped with the Infinium H3Africa array, CARTaGENE samples were genotyped with either the Infinium Global Screening Array or the OMNI2.5 array, and Ontario Health Study samples were genotyped with the UK Biobank Axiom array. All mCA calling was performed using the publicly available MoChA software. We implemented a genome-wide search for regions significantly enriched for autosomal mCA accumulation using binomial tests scaled by chromosome and array type, classifying a genomic position as a “hotspot” if the -log(p-value) surpassed the FDR-adjusted threshold. We also conducted Fisher's exact tests to identify germline single nucleotide polymorphism (SNP) genotypes that were significantly associated with mCA occurrence on the same chromosome, querying every SNP within a 4 Mbp range of an mCA.
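As an illustration only, the two tests described above can be sketched in a few lines of standard-library Python. This is not the pipeline used in the study: the function names, the single `null_rate` parameter (standing in for the chromosome- and array-specific scaling), the upper-tail form of the binomial test, the Benjamini-Hochberg procedure as the FDR method, and the 2x2 layout of the Fisher table are all assumptions made for the example.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly (suitable for small n)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def call_hotspots(overlap_counts, n_mcas, null_rate, alpha=0.05):
    """Flag positions whose mCA overlap count exceeds the null expectation.

    overlap_counts: number of mCAs overlapping each position (hypothetical input).
    null_rate: assumed per-position overlap probability under the null; in a
    real analysis this would be scaled by chromosome and array type.
    """
    pvals = [binom_upper_tail(k, n_mcas, null_rate) for k in overlap_counts]
    return [q <= alpha for q in benjamini_hochberg(pvals)]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows = carrier / non-carrier of a SNP genotype,
    columns = mCA present / absent on the same chromosome.
    """
    n, row1, col1 = a + b + c + d, a + b, a + c

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(total, 1.0)

# Toy data: 100 mCAs, four positions, assumed null overlap rate of 2%.
hotspots = call_hotspots([2, 3, 30, 1], n_mcas=100, null_rate=0.02)
p_cis = fisher_exact_two_sided(3, 0, 0, 3)
```

With these toy inputs only the third position, overlapped by 30 of the 100 mCAs, would be flagged. For cohort-scale counts, an optimized implementation such as `scipy.stats.binomtest` and `scipy.stats.fisher_exact` would be a more practical choice than the exact sums shown here.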

Results:

We identified a total of 545 SNPs as hotspots of mCA accumulation in the African cohorts and 221 SNPs in the North American cohorts. In North American cohorts, 53.4% of all autosomal mCA calls fell in hotspot regions, compared with 73.9% in African cohorts. Notable hotspots on 13q and 16p were shared by both African and North American cohorts, while the 17q21.31 loss hotspot was seen only in North American cohorts. We further documented genes overlapping hotspot regions, 10 of which were seen in both North American and African populations and included known drivers of CH such as DNMT3A and DLEU1. In African populations, hotspot accumulation did not demonstrate country- or region-specific biases, but rather occurred in samples across Africa. Furthermore, the mean cell fraction of mCAs in hotspots was significantly higher than that of mCAs in non-hotspot regions in both North American and African populations, suggesting that the genomic position of an mCA may influence its proliferative potential. We also discovered a negative relationship between the length of an mCA and its cell fraction, consistent across North American and African populations. Sex-specific analyses demonstrated that 13q loss was enriched in males in both North America and Africa, while 16p loss was enriched in females in Africa. Finally, we identified germline variants on chromosomes 6 and 14 that affected the risk of developing an mCA. We captured statistically significant SNP-mCA cis associations in discovery and replication cohorts of African datasets; these associations have not been previously described in studies involving participants of European ancestry.

Conclusions:

In summary, we interrogated the landscape of mCAs across diverse and previously understudied ancestries. We showed similar and divergent patterns in the hematopoietic dynamics of individuals across populations and identified genes previously implicated in CH that appear to be ancestry-independent targets of hotspot accumulation. We explored how the risk of developing an mCA is affected by sex and by germline variants, and how this risk is modified by ancestry. In the future, finer-resolution molecular profiling of mCA-driven CH using long-read or higher-coverage sequencing may help capture shorter mCAs, and may also be used to replicate these results in independent cohorts.

Disclosures:

No relevant conflicts of interest to declare.
