• Germ line loss-of-function mutations in shelterin genes occur in a subset of families with CLL.

• Telomere dysregulation is further implicated in CLL predisposition.

Chronic lymphocytic leukemia (CLL) can be familial; however, thus far no rare germ line disruptive alleles for CLL have been identified. We performed whole-exome sequencing of 66 CLL families, identifying 4 families where loss-of-function mutations in protection of telomeres 1 (POT1) co-segregated with CLL. The p.Tyr36Cys mutation is predicted to disrupt the interaction between POT1 and the telomeric overhang. The c.1164-1G>A splice-site, p.Gln358SerfsTer13 frameshift, and p.Gln376Arg missense mutations are likely to impact the interaction between POT1 and adrenocortical dysplasia homolog (ACD), which is a part of the telomere-capping shelterin complex. We also identified mutations in ACD (c.752-2A>C) and another shelterin component, telomeric repeat binding factor 2, interacting protein (p.Ala104Pro and p.Arg133Gln), in 3 CLL families. In a complementary analysis of 1083 cases and 5854 controls, the POT1 p.Gln376Arg variant, which has a global minor allele frequency of 0.0005, conferred a 3.61-fold increased risk of CLL (P = .009). This study further highlights telomere dysregulation as a key process in CLL development.

Chronic lymphocytic leukemia (CLL; MIM151400) is clinically defined by the presence of a clonal population of B-cell lymphocytes (>5 × 10⁹ cells/L) with a characteristic immunophenotype. The disease accounts for ∼25% of all leukemia and is the most common form of lymphoid malignancy in Western countries, affecting ∼16 000 individuals in the United States each year.1  Although the last decade has seen a dramatic evolution in the treatment options for CLL,2-4  it still remains an incurable malignancy. It is anticipated that an increased understanding of CLL pathogenesis will generate further therapeutic targets to either delay or prevent progression of the precursor to frank malignancy.

CLL has one of the highest familial risks of any cancer, with risk being increased eightfold in relatives of patients.5  Recent genome-wide association studies (GWAS) have identified common risk single nucleotide polymorphisms (SNPs) at 31 loci associated with sporadic CLL.6-13  The risk of CLL associated with each of these variants is, however, modest at best. Although families segregating CLL provide evidence for Mendelian susceptibility, no rare alleles of large effect have thus far been discovered. The identification of this class of susceptibility is especially important because, in contrast to GWAS associations, such mutations are causal and provide direct insight into cancer biology.

Here we report on the whole exome sequencing (WES) of familial CLL, and establish a key role for rare disruptive mutations in protection of telomeres 1 (POT1) and other shelterin complex genes as determinants of susceptibility to CLL. Our findings thus extend the spectrum of cancer types associated with germ line mutation in these genes.

Patient samples and DNA extraction

The families and CLL cases included in this study were recruited through a United Kingdom national study of CLL genetics, established by The Institute of Cancer Research (ICR) Divisions of Genetics and Epidemiology and Molecular Pathology in 1996. The diagnoses of CLL and other hematologic cancers in family members were established. In all cases, the diagnosis of CLL was based on accepted standard clinico-pathological and immunologic criteria that are in accordance with current World Health Organization classification guidelines. Informed consent was obtained under the Multi-Research Ethics Committee 99/1/082. Genomic DNA was extracted from peripheral blood and saliva using standard methods, and quantified by PicoGreen (Invitrogen).

Pedigrees and clinical presentation

See supplemental Table 1, available on the Blood Web site, for details on all pedigrees, the number of cases of CLL in each family, and the number of cases that were whole-exome sequenced.

Sequence alignment and analysis

Exon capture was performed using the Nextera Rapid Capture Exome Enrichment Kit (Illumina, San Diego, CA). The Illumina HiSeq 2000 analyzer with 101 bp reads was used for sequencing. Paired-end FASTQ files were extracted using CASAVA software (version 1.8.1; Illumina) and aligned to build 37 (hg19) of the human reference genome using Stampy14  and Burrows-Wheeler Aligner15  software. Alignments were processed using the Genome Analysis Tool Kit pipeline (version 3.2-2),16  according to best practices.17,18  Variants were filtered for positions found in >1 sample from an in-house collection of 1609 control exomes, including 961 samples from the ICR1000 data set generated by Nazneen Rahman’s team in the Division of Genetics and Epidemiology at the ICR, London, United Kingdom,19  plus an extra 648 samples from the UK 1958 Birth Cohort (BC),20  sequenced in-house using Illumina TruSeq exome methodology. We also filtered variants based on frequencies in the 1000 Genomes Project, National Heart, Lung, and Blood Institute Exome Sequencing Project (ESP6500), and the Exome Aggregation Consortium (ExAC) catalog. Positions resulting in protein-altering changes were identified using the Ensembl Variant Effect Predictor (version 78) and variants shared between family members were annotated using custom scripts. The predicted functional consequences of missense variants were assessed using SIFT,21  CADD,22  and SuSPect23  algorithms.

Sanger sequencing

Germ line verification of variants found by next-generation sequencing was performed by Sanger sequencing of mouthwash DNA samples. Primers are listed in supplemental Table 2.

MaxEntScan scoring of splice acceptor variants

We used the MaxEntScan algorithm24  to assess the effect of the POT1 g.124481233C>T and adrenocortical dysplasia homolog (ACD) g.67692984T>G mutations. Scores for the wild-type (WT) and mutated splice acceptor site sequences were 5.66 and −3.08 for POT1, and 9.88 and 1.84 for ACD, respectively.

Confirmation of aberrant splicing in an individual carrying the splice acceptor variant, 7:g.124481233C>T

RNA extracted from the whole blood of a splice acceptor variant carrier and control was converted to complementary DNA (cDNA) using Superscript III Reverse Transcriptase (Invitrogen). Polymerase chain reaction (PCR) was then performed to confirm that 7:g.124481233C>T disrupted splicing. The product was visualized on a 2.5% agarose gel. Sanger sequencing was used to confirm the sequence of the product.

Exome array genotyping

A total of 1111 unrelated CLL cases were genotyped for the p.Gln376Arg variant using the Illumina OmniExpress Exome array as previously described.12  After quality control filtering, genotype data were available for 1083 CLL cases. For controls, we used publicly accessible data for 5854 individuals from the 1958 BC,20  genotyped using the Illumina HumanExome-12 version 1 array. These data are available from the European Genome-phenome Archive under accession #EGAD00010000234. The χ² test was used to determine the significance of the difference in case-control allele counts. Confidence intervals (CIs) were calculated by the Woolf method.

Protein alignment and structural modeling

Multiple sequence alignments were generated for homologous POT1 and telomeric repeat binding factor 2, interacting protein (TERF2IP) sequences using T-Coffee25,26  to evaluate conservation. POT1 alignments were generated with the following sequences: NP_056265.2, XP_519345.2, NP_001127526.1, XP_009001386.1, XP_006149256.1, NP_598692.1, XP_002712135.2, XP_010802750.1, XP_005628494.1, XP_001501458.4, XP_006910616.1, XP_010585693.1, XP_004478311.1, XP_007504310.1, XP_001508179.2, NP_996875.1, and NP_001084422.1. TERF2IP alignments were generated with the following sequences: NP_061848.2, NP_001267142.1, XP_003780774.2, XP_008984478.1, XP_006152679.1, NP_065609.2, XP_002711780.1, NP_001068880.1, XP_536776.2, XP_005608497.1, XP_006908867.1, XP_010595146.1, XP_004470975.1, XP_001508762.2, NP_989799.1, and NP_001084428.1. Jalview27  was used to visualize and format the alignments. The crystal structure of the N-terminal region (oligonucleotide/oligosaccharide-binding 1 [OB1] and OB2 domains) of the human POT1 protein (Research Collaboratory for Structural Bioinformatics protein data bank [PDB], 3KJP, and 1XJV) was visualized using Chimera (version 1.10.2)28  and Cn3D (version 4.3.1).29  The impact of the p.Tyr36Cys mutation on stability of the POT1:DNA interaction was assessed using the mutation Cutoff Scanning Matrix (mCSM) approach.30  The effect of missense mutations on protein stability was assessed using the Impact of Non-synonymous mutations on Protein Stability (INPS) server.31 

LOH analyses

Loss-of-heterozygosity (LOH) analysis was conducted using ExomeCNV,32  which detects copy number variation and LOH events using depth-of-coverage and B-allele frequencies. LOH calls were made by first identifying all heterozygous germ line positions. The Genome Analysis Tool Kit was then used to create BAF files and ExomeCNV was used to call LOH at heterozygous positions individually and at combined LOH segments.

Assessment of telomere length

Relative telomere length was determined by 2 methods: analysis of exome sequencing data and real-time PCR (RT-PCR). Analysis of off-target reads from exome sequencing data was performed essentially as described,33  using a telomeric repeat copy number of k = 4. We used data from blood-derived DNA only and excluded samples with an average sequencing depth <20 and samples with missing covariate data (n = 12). Telomere length was adjusted for age at blood draw, sex, and sequencing batch by a linear model determined using data from noncarriers only. For the SYBR green RT-PCR, the ratio of telomere repeat units to a single-copy gene (β-globin) was determined for 109 samples as previously described.34,35  Primers are listed in supplemental Table 2. Reactions were performed in triplicate, using 10 ng DNA per sample. Each 10 μL reaction also contained 5 μL of 2X SYBR Green Master Mix (Applied Biosystems), plus either 300 and 700 nmol/L of the control forward and reverse primers, respectively, or 100 and 900 nmol/L of the telomere unit forward and reverse primers, respectively. The telomere reaction also included 0.3 μL dimethyl sulfoxide. Cycling was performed using an ABI7900HT thermal cycler as previously described.35  Relative telomere length was calculated using 2^(−ΔCt) derived from the RT-PCR data and was adjusted for age at blood draw and sex. A Wilcoxon rank-sum test was used to compare the relative adjusted telomere length for POT1 mutation carriers vs noncarriers of shelterin gene mutations.
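The 2^(−ΔCt) calculation can be sketched as follows. This is a minimal illustration only: it assumes that ΔCt is the mean telomere Ct minus the mean β-globin Ct across the triplicate reactions, which the text does not spell out, and the function name and Ct values are hypothetical. The adjustment for age and sex is not shown.

```python
from statistics import mean


def relative_telomere_length(telomere_cts, reference_cts):
    """Return 2^(-dCt), where dCt = mean(telomere Ct) - mean(single-copy gene Ct).

    Averaging the triplicate Ct values before taking the difference is an
    assumption made for this sketch, not a detail taken from the article.
    """
    delta_ct = mean(telomere_cts) - mean(reference_cts)
    return 2 ** (-delta_ct)


# Hypothetical triplicate Ct values for one sample (not data from the study).
telomere_ct = [18.9, 19.0, 19.1]
beta_globin_ct = [21.0, 21.1, 20.9]
print(relative_telomere_length(telomere_ct, beta_globin_ct))
```

A sample whose telomere reaction crosses threshold one cycle earlier than the single-copy reaction has a relative telomere length of 2 on this scale; equal Ct values give 1.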

Identification of shelterin gene mutations

To maximize the prospects of identifying rare disease-causing variants for CLL, we initially focused our search on the 18 families with the strongest family histories of CLL (supplemental Table 1), which had been ascertained through an ongoing study.36  We performed WES on genomic DNA from blood of 45 affected individuals from the 18 families. We excluded variants that were observed more than once in our in-house database of 1609 healthy individuals from the 1958 BC who had been exome sequenced. We also discounted variants with an allele frequency of >0.1% in large-scale sequencing projects (1000 Genomes Project, ESP6500, or the ExAC catalog). In our first stage analysis, we required the filtered variants to be present in all sequenced affecteds within the family.
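The first-stage filter described above can be sketched in a few lines. This is an illustrative outline rather than the pipeline actually used: the `Variant` record, its field names, and the example values are hypothetical, and only the three thresholds (seen more than once in in-house controls, allele frequency above 0.1% in public catalogs, shared by all sequenced affecteds) come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    """Hypothetical minimal record for one candidate variant in one family."""
    name: str
    in_house_control_count: int   # carriers among the in-house control exomes
    max_public_af: float          # highest frequency in 1000 Genomes, ESP6500, ExAC
    carriers: set = field(default_factory=set)  # sequenced affecteds carrying it


def passes_first_stage(variant, sequenced_affecteds,
                       max_control_count=1, max_af=0.001):
    """Apply the three first-stage filters described in the text."""
    if variant.in_house_control_count > max_control_count:
        return False  # observed more than once in the in-house controls
    if variant.max_public_af > max_af:
        return False  # allele frequency above 0.1% in a public catalog
    # The variant must be present in every sequenced affected family member.
    return set(sequenced_affecteds) <= variant.carriers


# Made-up family with three sequenced affected individuals.
family = ["II-1", "II-3", "III-2"]
candidates = [
    Variant("rare_shared", 0, 0.0, {"II-1", "II-3", "III-2"}),
    Variant("too_common", 0, 0.01, {"II-1", "II-3", "III-2"}),
    Variant("not_shared", 0, 0.0, {"II-1", "II-3"}),
]
print([v.name for v in candidates if passes_first_stage(v, family)])
```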

To further filter the variants identified, we prioritized missense and disruptive variants (nonsense, splice acceptor/donor, and frameshift) occurring in genes with a reported cancer association or documented role in cancer predisposition. Analysis of these genes led us to identify pedigree 5047, in which all 3 affected family members carried a splice acceptor variant in intron 13 (chromosome 7 g.124481233C>T/c.1164-1G>A) of POT1 (MIM 606478) (Figures 1 and 2A).37-40  We confirmed the mutation by Sanger sequencing in blood and saliva-derived DNA in all 3 cases (supplemental Figure 1). The mutation was predicted to disrupt splicing by the MaxEntScan algorithm24  (WT score 5.66 vs mutated score −3.08; 154% reduction). This also identified a potential alternative splice acceptor site 43 bp downstream (MaxEntScan score = 7.66), the use of which would result in a truncated protein product. We confirmed the presence of an aberrant splicing product in a mutation carrier by RT-PCR and validated the use of the predicted alternative splice site using Sanger sequencing (Figure 3 and supplemental Figure 2).

Figure 1

Rare POT1, ACD, and TERF2IP mutations in CLL families. Black-filled symbols indicate CLL cases, other cancers are indicated by a red-filled symbol, and an unfilled symbol indicates an individual with no known cancer. These symbols have a central dot to indicate cases that were exome sequenced. A central blue dot denotes a shelterin gene mutation carrier; a peach dot denotes a WT individual. A line through a symbol indicates that an individual is deceased. Age of diagnosis (in years) is listed for CLL cases. Splice acceptor variants are numbered relative to POT1 transcript NM_015450 and ACD transcript NM_001082486. NHL, non-Hodgkin lymphoma.

Figure 2

Impact of rare familial mutations on POT1 protein. (A) Schematic showing the position of germ line POT1 mutations identified in CLL families relative to OB domains (red) and ACD binding region (blue). Also shown are somatic POT1 mutations identified in previous studies of CLL patients37,38  (unshaded background) and germ line mutations found in familial cutaneous melanoma39,40  (peach background). (B) Cross-species conservation of POT1 amino acids subject to missense mutation in CLL families. (C) Schematic of the crystal structure of human POT1 N-terminal OB domains bound to a telomeric DNA sequence (PDB 3KJP), illustrating the proximity of tyrosine 36 to the DNA strand. OB domains are shown in gray, DNA in blue, and Tyr36 is highlighted in magenta.

Figure 3

Impact of POT1 splice acceptor site mutation on splicing. (A) Splice acceptor site consensus scores predicted by MaxEntScan24  for each base from the natural POT1 intron 13/exon 14 boundary across exon 14. For clarity, only part of the intron (lower case text above black line) and exon (upper case text above black box) are shown. The predicted score for the unmutated natural splice site (red bar) is also labeled. Positive scores are otherwise marked in blue and negative scores in peach. (B) MaxEntScan splice acceptor consensus scores for the same region based upon the sequence of c.1164-1G>A POT1 mutation carriers. The scores of the mutated natural splice acceptor (pink bar) and the predicted alternative splice site with the highest MaxEntScan score (43 bp downstream) are labeled. The part of exon 14 that would be removed by use of this splice site is indicated by a gray box. (C) Abnormal splicing product detected by RT-PCR using cDNA from a CLL case (Ca) carrying the c.1164-1G>A mutation. This product is absent from control (Co) cDNA. bp, base pairs; L, ladder; NT, no template reaction.

To further investigate the potential role of POT1 and other members of the shelterin gene complex in familial CLL, we expanded our exome sequencing data set to include an additional 96 affected relative-pairs from 48 families (supplemental Table 1). We then looked for shared missense and disruptive variants in the 6 components of the shelterin complex (POT1, ACD, telomeric repeat binding factor 1 [TERF1] interacting nuclear factor 2 [TINF2], TERF1, TERF2, and TERF2IP). Through this analysis, we identified 3 additional families that harbored POT1 mutations (Figure 1); 2 missense mutations, p.Tyr36Cys and p.Gln376Arg, occurring at evolutionarily conserved residues (Figure 2B), predicted in silico to be damaging by multiple algorithms, and a frameshift mutation (Table 1). Collectively, we therefore identified mutations in POT1 in 6% of the CLL families, as compared with the documented frequency of such variants of only 0.9% among the 60 706 individuals included in the ExAC catalog (P = .003).
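The comparison with ExAC can be illustrated with a short calculation. The article does not state which test produced P = .003, so the following is only a sketch under an assumed model: a one-sided exact binomial test asking how likely it is to see at least 4 mutation-positive families among 66 if each family carried such a variant with probability 0.9%. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# 4 of 66 families (6%) carried a POT1 mutation; 0.9% is the ExAC frequency.
p_value = binomial_upper_tail(4, 66, 0.009)
print(f"families with POT1 mutation: {4 / 66:.1%}")
print(f"one-sided binomial P = {p_value:.4f}")
```

Under these assumptions the tail probability comes out close to the reported value, but a different test (for example, a Fisher exact test on the underlying counts) may have been used.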

Table 1

Germ line mutations in shelterin complex genes identified in CLL pedigrees

| Gene | Mutation position: genomic (hg19) | Mutation position: cDNA* | Mutation position: protein | Variant type | Pedigree | Carriers† | Effect prediction: CADD‡ | Effect prediction: SIFT | Effect prediction: SuSPect§ | Effect prediction: GERP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| POT1 | 7:g.124481233C>T | c.1164-1G>A | N/A | Splice acceptor | 5047 | 3/3/3 | 17.16 | N/A | N/A | 4.67 |
| POT1 | 7:g.124532337T>C | c.107A>G | p.Tyr36Cys | Missense | 162 | 2/2/2 | 18.9 | Deleterious | 71 | 5.71 |
| POT1 | 7:g.124482952_124482953insA | c.1071_1072insT | p.Gln358SerfsTer13 | Frameshift | 4029 | 2/2/4 | N/A | N/A | N/A | 4.71 |
| POT1 | 7:g.124482897T>C | c.1127A>G | p.Gln376Arg | Missense | 4013 | 2/2/3 | 23 | Deleterious | 50 | 5.55 |
| ACD | 16:g.67692984T>G | c.752-2A>C | N/A | Splice acceptor | 233 | 2/2/2 | 19.35 | N/A | N/A | 5.15 |
| TERF2IP | 16:g.75682090G>C | c.310G>C | p.Ala104Pro | Missense | 4092 | 2/2/3 | 10.15 | Tolerated | 10 | |
| TERF2IP | 16:g.75682178G>A | c.398G>A | p.Arg133Gln | Missense | 4014 | 2/3/3 | 14.88 | Deleterious | 73 | 5.34 |

GERP, Genomic Evolutionary Rate Profiling score; N/A, not applicable.

* POT1 reference transcript is NM_015450 and ACD reference transcript is NM_001082486.

† Carriers given as number of familial cases with mutation/number of cases in family exome sequenced/total number of CLL cases in family.

‡ CADD Phred-like score.

§ Scores of 50 and above considered to indicate deleterious mutations.

Intriguingly, somatic mutations of residue Tyr36 have previously been reported in CLL (Figure 2A).37,41,42  The p.Gln376Arg variant, identified in pedigree 4013, has a global minor allele frequency of 0.0005 in the ESP6500 database and is included on the Illumina Exome array. We therefore initiated a genetic association study of this recurrent variant, making use of Illumina exome array data on 1083 unselected CLL cases and 5854 controls from the 1958 BC. Six of the cases and 9 of the controls were heterozygous for the p.Gln376Arg variant (odds ratio = 3.61; 95% CI, 1.28-10.15; P = .009).
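The reported association statistics can be reproduced from the carrier counts. The sketch below assumes an allelic 2 × 2 table (2 alleles per individual, all carriers heterozygous), a Woolf (log odds ratio) confidence interval, and a χ² test with 1 degree of freedom and no continuity correction; the last point is an assumption, and the function name is hypothetical.

```python
from math import erfc, exp, log, sqrt


def allelic_association(case_alt, case_n, control_alt, control_n, z=1.96):
    """Allelic odds ratio, Woolf CI, and 1-df chi-square P value.

    case_alt / control_alt: variant allele counts (carriers are heterozygous).
    case_n / control_n: genotyped individuals, each contributing 2 alleles.
    """
    a, b = case_alt, 2 * case_n - case_alt
    c, d = control_alt, 2 * control_n - control_alt
    odds_ratio = (a * d) / (b * c)
    # Woolf method: normal approximation on the log odds ratio.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    # Pearson chi-square for a 2 x 2 table, without continuity correction.
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return odds_ratio, (lower, upper), p_value


odds_ratio, ci, p_value = allelic_association(6, 1083, 9, 5854)
print(f"OR = {odds_ratio:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, P = {p_value:.3f}")
```

With 6 heterozygotes among 1083 cases and 9 among 5854 controls, this gives an odds ratio of about 3.61 with an interval of roughly 1.28 to 10.15, in line with the values reported above.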

In addition to POT1 mutations, we identified mutations in other shelterin complex genes in families 233, 4092, and 4014. Specifically, the ACD (MIM 609377) splice site variant c.752-2A>C was carried by both affected siblings in pedigree 233 (Figure 1 and supplemental Figure 3A) and was predicted by the MaxEntScan algorithm to disrupt the exon 7 splice acceptor signal (WT score 9.88 vs mutated score 1.84; 81% reduction) (supplemental Figure 3B). In TERF2IP (MIM 605061), the missense mutation c.398G>A (p.Arg133Gln) was identified in 2 out of 3 CLL cases sequenced in family 4014 (Figure 1 and supplemental Figure 4A). This mutation occurs at an evolutionarily conserved site and was predicted to be damaging by multiple methods (Table 1; supplemental Figure 4B). We also found the c.310G>C (p.Ala104Pro) TERF2IP variant in both siblings in family 4092 (Figure 1; Table 1; supplemental Figure 4A). Although this residue is partially conserved, the p.Ala104Pro mutation is not predicted to be damaging by SIFT or SuSPect (Table 1; supplemental Figure 4B).
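The percentage reductions quoted for the MaxEntScan scores follow from simple arithmetic on the WT and mutated scores. A minimal sketch, with a hypothetical helper name:

```python
def percent_reduction(wt_score, mutant_score):
    """Percentage drop in splice-site score relative to the WT score."""
    return 100 * (wt_score - mutant_score) / wt_score


# Scores taken from the text: POT1 c.1164-1G>A and ACD c.752-2A>C.
print(f"POT1: {percent_reduction(5.66, -3.08):.0f}% reduction")
print(f"ACD:  {percent_reduction(9.88, 1.84):.0f}% reduction")
```

A reduction above 100%, as for POT1, simply reflects a mutated score that falls below zero.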

Structural predictions

The POT1 N-terminus contains 2 OB folds that bind to the single-stranded telomeric overhang (Figure 2), whereas the C-terminus is responsible for binding to ACD and anchoring the shelterin complex. The crystal structure of human POT1 has been resolved for only the N-terminal OB folds (PDB 3KJP and 1XJV). Based upon these structures, Tyr36 is one of 24 residues found at the POT1:telomeric polynucleotide interface37  (Figure 2). The p.Tyr36Cys mutation is predicted by the mutation Cutoff Scanning Matrix approach to reduce the POT1:DNA complex affinity (PDB 1XJV, ΔΔG −0.27 kcal/mol; PDB 3KJP, predicted ΔΔG −0.21 kcal/mol).

Because crystal structures for full-length POT1 and TERF2IP are lacking, we used the machine learning algorithm Impact of Non-synonymous mutations on Protein Stability to predict the thermodynamic change in free energy caused by the p.Gln376Arg (POT1), p.Ala104Pro, and p.Arg133Gln (TERF2IP) mutations, based upon the protein sequence (supplemental Table 3). Using this method, p.Arg133Gln was predicted to have the largest effect upon protein stability.

Analysis of somatic events

We used ExomeCNV to look for evidence of LOH in the proband of pedigree 5047, comparing exome sequencing data from blood-derived DNA to saliva-derived DNA, finding no evidence of a somatic abnormality at the POT1 locus. We also looked for deleterious variants identified only in the blood-derived DNA of this case (ie, absent from the saliva-derived DNA sample and also absent from the other affected individuals in pedigree 5047), and found no somatic inactivating POT1 mutations. We did, however, note the presence of a somatic splice donor site mutation affecting the first base of intron 10 of ATR (ataxia telangiectasia and Rad3-related) in this case.

Effect of POT1 mutations on maintenance of telomere length

Given the role of the shelterin complex in telomere length maintenance, we examined whether CLL cases from shelterin-mutated pedigrees had telomere lengths that differed from noncarrier CLL cases using exome sequencing and RT-PCR data. We observed no consistent significant difference between the telomere lengths of POT1 mutation carriers and CLL cases without a mutation in a shelterin complex gene, by exome sequencing or RT-PCR (P = .03 and P = .57, respectively). The telomere lengths of cases with ACD or TERF2IP variants also displayed no obvious trend, although the small numbers of cases harboring these variants precluded a meaningful evaluation of their impact on telomere length.

Here we have implemented WES to search for rare disruptive risk alleles for CLL, identifying germ line-inactivating shelterin gene mutations in a subset of CLL families. These findings are consistent with the evidence of linkage of familial CLL to chromosomes 7q31.32-q33 and 16q12.2-q23.1 that we previously observed (supplemental Figure 5).43 

Germ line disruptive variants within shelterin genes have recently been implicated in predisposition to familial melanoma,39,40  cardiac angiosarcoma,44  glioma,45  and colorectal cancer,46  whereas somatic mutations of POT1 are detectable in 3.5% of all CLL and 9% of CLL with unmutated immunoglobulin heavy chain variable genes,37  and were also identified in 10% of patients with cutaneous T-cell lymphoma.47 

POT1-mutated CLL cells have numerous telomeric and chromosomal abnormalities, suggesting that POT1 mutation facilitates the acquisition of these malignant features.37  Our observation of germ line mutations in POT1 being associated with familial CLL would concur with this assumption. Our findings also support the proposal that POT1 mutation is an early event in CLL development.41 

In a CLL GWAS, we previously reported an association between the common allele of the POT1 3′ untranslated region variant rs17246404 (risk allele frequency = 0.75) and increased CLL risk, with a small per allele effect size (odds ratio = 1.22).12  The recurrent POT1 coding variant, p.Gln376Arg, identified in the current study is not, however, in linkage disequilibrium with SNP rs17246404 (r² = 0.00). Therefore, although further studies are required to determine exactly how rs17246404 influences CLL risk, it is plausible that the functional basis of the association is through differential gene expression.

Shelterin is a telomere-specific protein complex composed of 6 family members, encoded by POT1, ACD, TERF2IP, TERF1, TERF2, and TINF2, that protects the ends of chromosomes. Together, the components of the shelterin complex are necessary for all telomere functions, including the protection of telomeres from degradation, aberrant recombination, and incorrect processing by DNA-repair machinery, as well as facilitating chromosome capping to mediate telomerase activity.48 

POT1 directly contacts telomeric DNA overhangs49  and also binds to ACD,50  which connects POT1 to the other shelterin components via its bridge with TINF2.51  The POT1:ACD interaction enhances the affinity of POT1 for telomeric DNA.50,52  The ACD splice site mutation c.752-2A>C is expected to disrupt the POT1 binding domain and abolish the TINF2 binding domain, and could therefore be predicted to prevent formation of the shelterin complex.

In silico predictions suggest that the germ line p.Tyr36Cys mutation identified in pedigree 162 is likely to disrupt the interaction between POT1 and the single-stranded telomeric DNA overhang. The POT1 frameshift and splice site mutations are likely to result in truncated protein products, impairing their interaction with ACD. The p.Gln376Arg variant, though not predicted in silico to impact protein stability, alters an evolutionarily constrained residue, which implies functional importance.

Previous experiments have shown that when the ACD/POT1 subunit is inhibited, the telomerase complex increases telomere length.49,51  We observed no significant differences between the telomere lengths of CLL cases with a POT1 mutation and those who did not harbor a shelterin gene mutation. This observation is comparable to that in tumor cells derived directly from CLL cases with a somatic POT1 mutation vs matched cases with no POT1 mutation,37  and may reflect the numerous unmeasured variables that can influence telomere length in human populations. We also acknowledge that our telomere length measurements are based on blood-derived DNA and therefore could be subject to the effects of uncharacterized somatic mutations. In this regard, we note that the proband of pedigree 5047 harbored a somatic splice site mutation in ATR, a gene also known to play a key role in telomere maintenance.53  Furthermore, although GWAS have identified SNPs at loci including other telomere maintenance genes that are associated with telomere length, there has been no such association reported for a POT1 SNP.12,54,55 

TERF2IP associates with the shelterin complex by binding via its C-terminus to a central region of TERF2, forming a stable 1:1 complex. TERF2IP, as part of the shelterin complex, is vital for the repression of homology-directed repair of double-strand chromosomal breaks at the telomere. Although the novel missense variant p.Arg133Gln is predicted to be pathogenic, markedly reducing the stability of the protein, p.Ala104Pro is less well conserved and is thus more likely to be tolerated.

Germ line disruptive mutations in POT1 have previously been associated with susceptibility to melanoma in 9 families39,40  and glioma in 3 families.45  Furthermore, recent studies have identified POT1 p.Arg117Cys in 4 Li-Fraumeni–like syndrome families,44  and ACD and TERF2IP mutations in 8 melanoma families.56  None of the mutation carriers in the melanoma families featured cases of glioma or CLL. Similarly, the glioma families did not feature cases of melanoma or CLL and the only case of melanoma was seen in 1 of the Li-Fraumeni–like syndrome families. Collectively, these data and the fact that none of our families segregated glioma or melanoma, suggest that the penetrance associated with rare shelterin complex mutations is modest. Such an assertion is supported by our observation that the predicted deleterious p.Arg133Gln TERF2IP variant was identified in only 2 out of 3 CLL cases sequenced in family 4014. Additionally, in our case-control analysis, the POT1 p.Gln376Arg mutation was shown to confer a modest 3.6-fold increase in risk of CLL. Furthermore, the absence of significant LOH in the tumors of carriers, when examined,44  suggests that mutations in the shelterin complex genes do not function as high penetrance tumor suppressors but rather as moderate penetrance alleles.

Early age of onset in cancer can be indicative of inherited predisposition and it is noteworthy that in this study, mutation carriers were diagnosed with CLL much younger than the population average (59 years vs 71 years). Because 7 of the 66 CLL families were carriers of shelterin mutations, this translates to 11% of familial CLL being ascribed to mutations in this class of genes (95% CI, 4-21). However, we acknowledge that our analyses were based only on families ascertained in the United Kingdom, and therefore the impact of such mutations on familial CLL could vary depending on ethnicity. Moreover, it remains to be established, through additional studies, whether other CLL families are the consequence of polygenic susceptibility or as yet unidentified higher impact disease-causing mutations.
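The proportion of families attributed to shelterin mutations and its confidence interval can be checked with a short calculation. The article does not state how the interval was derived; the sketch below assumes an exact (Clopper-Pearson) binomial interval, found here by bisection on the binomial distribution, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    def solve(f, target):
        # f decreases as p increases on [0, 1]; bisect until f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, ie P(X <= k - 1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(7, 66)
print(f"{7 / 66:.0%} of families (95% CI, {lower:.0%}-{upper:.0%})")
```

Under this assumption, 7 of 66 families corresponds to about 11%, with bounds that round to the 4% to 21% range quoted above.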

In conclusion, the POT1, ACD, and TERF2IP loss-of-function mutations we report here suggest that multiple components of the shelterin complex play a role in CLL predisposition. Moreover, they extend the spectrum of cancer associated with inherited mutations in these genes. It is, however, likely that shelterin complex gene mutations confer cancer risks analogous to those associated with ATM heterozygosity57  or CHEK258  for breast cancer. Nevertheless, because the dysregulation of telomere protection has been identified as a target for potential therapeutic intervention in CLL, it may be possible that early identification of mutation carriers will facilitate improvements in future disease management.

The online version of this article contains a data supplement.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

The authors are grateful to all investigators and to the patients and individuals for their participation. This study made use of genotyping data on the 1958 Birth Cohort (1958 BC); a full list of the investigators who contributed to the generation of these data is available at http://www.wtccc.org.uk/.

Principal funding for the study was provided by Bloodwise (LRF05001, LRF06002, and LRF13044). The authors acknowledge support from Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund), the Arbib Fund, and the Leicester Experimental Cancer Medicine Centre (C325/A15575 Cancer Research UK/UK Department of Health). B.K. received a doctoral studentship from the ICR, supported by the Sir John Fisher Foundation.

Contribution: H.E.S. and R.S.H. drafted the manuscript; H.E.S. performed project management, sequencing, and bioinformatic analysis; B.K., D. Chubb, P.J.L., and K.L. performed bioinformatic analysis; P.B. performed sample preparation; S.J. performed sample database management; C.D., M.J.S.D., G.A.F., and D. Catovsky performed sample recruitment; and R.S.H. obtained financial support.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Richard S. Houlston, Division of Genetics and Epidemiology, The Institute of Cancer Research, 15 Cotswold Rd, Sutton, Surrey SM2 5NG, United Kingdom; e-mail: richard.houlston@icr.ac.uk.

1. Siegel R, Naishadham D, Jemal A. Cancer statistics, 2012. CA Cancer J Clin. 2012;62(1):10-29.
2. Byrd JC, Brown JR, O’Brien S, et al; RESONATE Investigators. Ibrutinib versus ofatumumab in previously treated chronic lymphoid leukemia. N Engl J Med. 2014;371(3):213-223.
3. Cartron G, de Guibert S, Dilhuydy MS, et al. Obinutuzumab (GA101) in relapsed/refractory chronic lymphocytic leukemia: final data from the phase 1/2 GAUGUIN study. Blood. 2014;124(14):2196-2202.
4. Furman RR, Sharman JP, Coutre SE, et al. Idelalisib and rituximab in relapsed chronic lymphocytic leukemia. N Engl J Med. 2014;370(11):997-1007.
5. Goldin LR, Björkholm M, Kristinsson SY, Turesson I, Landgren O. Elevated risk of chronic lymphocytic leukemia and other indolent non-Hodgkin’s lymphomas among relatives of patients with chronic lymphocytic leukemia. Haematologica. 2009;94(5):647-653.
6. Berndt SI, Skibola CF, Joseph V, et al. Genome-wide association study identifies multiple risk loci for chronic lymphocytic leukemia. Nat Genet. 2013;45(8):868-876.
7. Crowther-Swanepoel D, Broderick P, Di Bernardo MC, et al. Common variants at 2q37.3, 8q24.21, 15q21.3 and 16q24.1 influence chronic lymphocytic leukemia risk. Nat Genet. 2010;42(2):132-136.
8. Di Bernardo MC, Crowther-Swanepoel D, Broderick P, et al. A genome-wide association study identifies six susceptibility loci for chronic lymphocytic leukemia. Nat Genet. 2008;40(10):1204-1210.
9. Sava GP, Speedy HE, Di Bernardo MC, et al. Common variation at 12q24.13 (OAS3) influences chronic lymphocytic leukemia risk. Leukemia. 2015;29(3):748-751.
10. Slager SL, Rabe KG, Achenbach SJ, et al. Genome-wide association study identifies a novel susceptibility locus at 6p21.3 among familial CLL. Blood. 2011;117(6):1911-1916.
11. Slager SL, Skibola CF, Di Bernardo MC, et al. Common variation at 6p21.31 (BAK1) influences the risk of chronic lymphocytic leukemia. Blood. 2012;120(4):843-846.
12. Speedy HE, Di Bernardo MC, Sava GP, et al. A genome-wide association study identifies multiple susceptibility loci for chronic lymphocytic leukemia. Nat Genet. 2014;46(1):56-60.
13. Berndt SI, Camp NJ, Skibola CF, et al. Meta-analysis of genome-wide association studies discovers multiple loci for chronic lymphocytic leukemia. Nat Commun. 2016;7:10933.
14. Lunter G, Goodson M. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Res. 2011;21(6):936-939.
15. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754-1760.
16. McKenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297-1303.
17. DePristo MA, Banks E, Poplin R, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43(5):491-498.
18. Van der Auwera GA, Carneiro MO, Hartl C, et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013;11(1110):11.10.1-11.10.33.
19. Ruark E, Münz M, Renwick A, et al. The ICR1000 UK exome series: a resource of gene variation in an outbred population. F1000Res. 2015;4:883.
20. Power C, Elliott J. Cohort profile: 1958 British birth cohort (National Child Development Study). Int J Epidemiol. 2006;35(1):34-41.
21. Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003;31(13):3812-3814.
22. Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46(3):310-315.
23. Yates CM, Filippis I, Kelley LA, Sternberg MJ. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features. J Mol Biol. 2014;426(14):2692-2701.
24. Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol. 2004;11(2-3):377-394.
25. Di Tommaso P, Moretti S, Xenarios I, et al. T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Res. 2011;39(Web Server issue):W13-W17.
26. Notredame C, Higgins DG, Heringa J. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000;302(1):205-217.
27. Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25(9):1189-1191.
28. Pettersen EF, Goddard TD, Huang CC, et al. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004;25(13):1605-1612.
29. Wang Y, Geer LY, Chappey C, Kans JA, Bryant SH. Cn3D: sequence and structure views for Entrez. Trends Biochem Sci. 2000;25(6):300-302.
30. Pires DE, Ascher DB, Blundell TL. mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics. 2014;30(3):335-342.
31. Fariselli P, Martelli PL, Savojardo C, Casadio R. INPS: predicting the impact of non-synonymous variations on protein stability from sequence. Bioinformatics. 2015;31(17):2816-2821.
32. Sathirapongsasuti JF, Lee H, Horst BA, et al. Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV. Bioinformatics. 2011;27(19):2648-2654.
33. Ding Z, Mangino M, Aviv A, Spector T, Durbin R; UK10K Consortium. Estimating telomere length from whole genome sequence data. Nucleic Acids Res. 2014;42(9):e75.
34. Cawthon RM. Telomere measurement by quantitative PCR. Nucleic Acids Res. 2002;30(10):e47.
35. Pooley KA, Sandhu MS, Tyrer J, et al. Telomere length in prospective and retrospective cancer case-control studies. Cancer Res. 2010;70(8):3170-3176.
36. Sellick GS, Webb EL, Allinson R, et al. A high-density SNP genomewide linkage scan for chronic lymphocytic leukemia-susceptibility loci. Am J Hum Genet. 2005;77(3):420-429.
37. Ramsay AJ, Quesada V, Foronda M, et al. POT1 mutations cause telomere dysfunction in chronic lymphocytic leukemia. Nat Genet. 2013;45(5):526-530.
38. Puente XS, Beà S, Valdés-Mas R, et al. Non-coding recurrent mutations in chronic lymphocytic leukaemia. Nature. 2015;526(7574):519-524.
39. Robles-Espinoza CD, Harland M, Ramsay AJ, et al. POT1 loss-of-function variants predispose to familial melanoma. Nat Genet. 2014;46(5):478-481.
40. Shi J, Yang XR, Ballew B, et al; NCI DCEG Cancer Sequencing Working Group; NCI DCEG Cancer Genomics Research Laboratory; French Familial Melanoma Study Group. Rare missense variants in POT1 predispose to familial cutaneous malignant melanoma. Nat Genet. 2014;46(5):482-486.
41. Landau DA, Carter SL, Stojanov P, et al. Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell. 2013;152(4):714-726.
42. Landau DA, Tausch E, Taylor-Weiner AN, et al. Mutations driving CLL and their evolution in progression and relapse. Nature. 2015;526(7574):525-530.
43. Sellick GS, Goldin LR, Wild RW, et al. A high-density SNP genome-wide linkage search of 206 families identifies susceptibility loci for chronic lymphocytic leukemia. Blood. 2007;110(9):3326-3333.
44. Calvete O, Martinez P, Garcia-Pavia P, et al. A mutation in the POT1 gene is responsible for cardiac angiosarcoma in TP53-negative Li-Fraumeni-like families. Nat Commun. 2015;6:8383.
45. Bainbridge MN, Armstrong GN, Gramatges MM, et al; Gliogene Consortium. Germline mutations in shelterin complex genes are associated with familial glioma. J Natl Cancer Inst. 2014;107(1):384.
46. Chubb D, Broderick P, Dobbins SE, et al. Rare disruptive mutations and their contribution to the heritable risk of colorectal cancer. Nat Commun. 2016;7:11883.
47. Pinzaru AM, Hom RA, Beal A, et al. Telomere replication stress induced by POT1 inactivation accelerates tumorigenesis. Cell Reports. 2016;15(10):2170-2184.
48. de Lange T. Shelterin: the protein complex that shapes and safeguards human telomeres. Genes Dev. 2005;19(18):2100-2110.
49. Loayza D, De Lange T. POT1 as a terminal transducer of TRF1 telomere length control. Nature. 2003;423(6943):1013-1018.
50. Xin H, Liu D, Wan M, et al. TPP1 is a homologue of ciliate TEBP-beta and interacts with POT1 to recruit telomerase. Nature. 2007;445(7127):559-562.
51. Ye JZ, Hockemeyer D, Krutchinsky AN, et al. POT1-interacting protein PIP1: a telomere length regulator that recruits POT1 to the TIN2/TRF1 complex. Genes Dev. 2004;18(14):1649-1654.
52. Wang F, Podell ER, Zaug AJ, et al. The POT1-TPP1 telomere complex is a telomerase processivity factor. Nature. 2007;445(7127):506-510.
53. Tong AS, Stern JL, Sfeir A, et al. ATM and ATR signaling regulate the recruitment of human telomerase to telomeres. Cell Reports. 2015;13(8):1633-1646.
54. Codd V, Nelson CP, Albrecht E, et al; CARDIoGRAM consortium. Identification of seven loci affecting mean telomere length and their association with disease. Nat Genet. 2013;45(4):422-427, e1-e2.
55. Pooley KA, Bojesen SE, Weischer M, et al. A genome-wide association scan (GWAS) for mean telomere length within the COGS project: identified loci show little association with hormone-related cancer risk. Hum Mol Genet. 2013;22(24):5056-5064.
56. Aoude LG, Pritchard AL, Robles-Espinoza CD, et al. Nonsense mutations in the shelterin complex genes ACD and TERF2IP in familial melanoma. J Natl Cancer Inst. 2014;107(2):dju408.
57. Renwick A, Thompson D, Seal S, et al; Breast Cancer Susceptibility Collaboration (UK). ATM mutations that cause ataxia-telangiectasia are breast cancer susceptibility alleles. Nat Genet. 2006;38(8):873-875.
58. Meijers-Heijboer H, van den Ouweland A, Klijn J, et al; CHEK2-Breast Cancer Consortium. Low-penetrance susceptibility to breast cancer due to CHEK2(*)1100delC in noncarriers of BRCA1 or BRCA2 mutations. Nat Genet. 2002;31(1):55-59.