Background

Chronic lymphocytic leukaemia (CLL) is characterised by clinical and biological heterogeneity. Despite significant advances in therapeutic management, CLL remains largely incurable. Current risk stratification is based on cytogenetic features (del(17p), del(11q), del(13q), +12). So far, sequencing studies in CLL have focussed predominantly on the exome. These have identified a number of genes that are recurrently mutated at low frequency such as TP53, SF3B1, ATM, NOTCH1, MYD88, and BIRC3. Apart from TP53 abnormalities, none of these are currently used to guide clinical decisions and it is unclear how they are implicated in disease pathogenesis.

Methods

In this study, we sought to further refine the molecular landscape of CLL using whole genome sequencing (WGS) of paired tumour and germline DNA samples from a cohort of clinically annotated patients with CLL. We sequenced a heterogeneous cohort of 41 samples (25 males, 16 females; median age 69 years, range 49-94) with a range of clinical features (49% fludarabine refractory, 61% unmutated IgVH). Whole genome sequencing libraries were generated using the Illumina TruSeq PCR-free sample preparation kit, with a median insert size of 400 bp, and subjected to 100 bp paired-end sequencing on an Illumina HiSeq 2500 platform. Both tumour and germline libraries were sequenced to an average depth of 38x. Sequencing reads were aligned using the Isaac algorithm, and the Starling and Strelka algorithms were used for SNV and indel calling in germline and tumour samples, respectively. All variants with a read depth <10x or a quality score <Q30 were excluded using Illumina VariantStudio software. Selected mutations were validated using a combination of a targeted deep sequencing panel on the Illumina MiSeq platform and conventional Sanger sequencing. Copy number alterations were identified from the whole genome sequencing data using Nexus 7.5 (Biodiscovery), with findings validated on Illumina OmniExpress24 arrays.
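The exclusion rule stated above (read depth <10x or quality score <Q30) can be illustrated with a minimal Python sketch. This is not the VariantStudio implementation; the `Variant` record, its field names, and the helper functions are hypothetical and serve only to make the thresholds concrete.

```python
from dataclasses import dataclass

# Thresholds taken from the Methods text: variants below either are excluded.
MIN_DEPTH = 10   # minimum read depth (x)
MIN_QUAL = 30    # minimum quality score (Q)


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record, for illustration only."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int
    qual: float


def passes_filters(v, min_depth=MIN_DEPTH, min_qual=MIN_QUAL):
    """Keep a variant only if it meets both the depth and quality thresholds."""
    return v.depth >= min_depth and v.qual >= min_qual


def filter_variants(variants):
    """Return the variants that survive the depth and quality filters."""
    return [v for v in variants if passes_filters(v)]
```

A variant at exactly 10x depth and Q30 is retained under this reading, since the text excludes only values strictly below the thresholds.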

Results

Whole genome sequencing revealed a total of 95,305 somatic indels and base substitutions, averaging 30.8 per patient (range 7-57) or 0.3 mutations per megabase. Of these mutations, 1266 occur in protein coding regions across 1108 genes, including 556 in 3’ and 5’ untranslated regions. Of these 1108 genes, we identified 93 as recurrently mutated (mutations present in more than one sample), including the previously described SF3B1 (12/41, 29.3%), TP53 (9/41, 22.0%), ATM (6/41, 14.6%), NOTCH1 (6/41, 14.6%), FAT1 (4/41, 9.8%) and BIRC3 (2/41, 4.9%). In addition to FAT1, we also identified two missense mutations in another cadherin superfamily member, FAT4 (2/41, 4.9%), both occurring within the extracellular cadherin domains.
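The recurrence criterion used above (a gene mutated in more than one sample) and the reported cohort frequencies can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline; the function name and the input format, a list of (sample ID, gene) pairs, are assumptions.

```python
from collections import defaultdict


def recurrent_genes(calls, cohort_size):
    """Map each recurrently mutated gene to (n_samples, percent of cohort).

    `calls` is an iterable of (sample_id, gene) pairs, one per somatic
    mutation (a hypothetical input format). A gene counts as recurrent
    when it is mutated in more than one distinct sample, so several
    mutations in the same sample are counted once.
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in calls:
        samples_by_gene[gene].add(sample_id)

    result = {}
    for gene, samples in samples_by_gene.items():
        n = len(samples)
        if n > 1:
            result[gene] = (n, round(100 * n / cohort_size, 1))
    return result
```

With a cohort of 41, a gene mutated in 12 samples comes out at 29.3%, which matches the way the SF3B1 frequency is reported above.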

Missense mutations were the most frequent class (42.7%), followed by mutations in 3’ UTRs (36.1%), 5’ UTRs (7.7%), splice sites (6.1%), small indels (4.3%) and nonsense mutations (3.1%). In addition, 61.5% of missense mutations were identified as either deleterious or damaging by the SIFT or PolyPhen-2 algorithms. We used a modified version of the MutSigCV algorithm to identify genes with significantly higher mutation rates in the coding sequence. A similar statistical approach was used to identify significant mutations in untranslated regions. Importantly, a number of interesting candidate genes carried mutations in non-coding regions, including NFKBIZ (3/41, 7.3%), IGLL5 (3/41, 7.3%) and BCL2 (2/41, 4.9%).
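The "deleterious or damaging by SIFT or PolyPhen-2" criterion is a union of two predictors, which the short sketch below makes explicit. The label sets are assumptions: the abstract does not say whether PolyPhen-2 "possibly damaging" calls were counted, and the function names and input format are hypothetical.

```python
# Assumed label sets; the abstract does not specify which PolyPhen-2
# categories were counted as "damaging".
SIFT_FLAGGED = {"deleterious"}
POLYPHEN_FLAGGED = {"probably damaging", "possibly damaging"}


def is_flagged(sift_label, polyphen_label):
    """True when either predictor flags the missense mutation."""
    return (sift_label.lower() in SIFT_FLAGGED
            or polyphen_label.lower() in POLYPHEN_FLAGGED)


def percent_flagged(predictions):
    """Percentage of (sift_label, polyphen_label) pairs flagged by either tool."""
    if not predictions:
        raise ValueError("no predictions supplied")
    flagged = sum(1 for sift, polyphen in predictions if is_flagged(sift, polyphen))
    return round(100 * flagged / len(predictions), 1)
```

Counting a mutation once when both tools flag it avoids inflating the percentage relative to the number of missense mutations.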

Conclusion

To our knowledge, this is the largest whole genome sequencing study in CLL to date. We present a comprehensive catalogue of genomic alterations in CLL and associate genome-wide patterns, including the presence of subclones, with clinical outcome. In addition to demonstrating the heterogeneous nature of the CLL genome, our data highlight the variety of mutations present in the regulatory regions of genes, as well as structural variations, thus providing new insights for hypothesis-driven biomarker and therapeutic discovery.

Disclosures

Humphray: Illumina Cambridge Ltd: Employment. Becq: Illumina Cambridge Ltd: Employment. Bentley: Illumina Cambridge Ltd: Employment.

Author notes

* Asterisk with author names denotes non-ASH members.
