IG sequences from a cohort of 333 CLL cases were studied for somatic mutation patterns and compared to public IG sequences from normal and autoreactive cells contained in the IMGT database. Our series included 326 IG heavy chain (HC) and 291 light chain (LC) genes (189 κ, 102 λ). Parallel assessment of IGH and IGK/IGL mutation status was possible for 267/333 CLL cases. IGHV/IGKV/IGLV sequences had <100% homology to germline (GL) (“mutated”) in 74%/68%/74% of cases, respectively. In 227/267 cases, HC and LC mutation status were concordant. In 19/23 CLL cases with IGHV-mutated and IGKV/IGLV-unmutated rearrangements, IGHV genes had ≥99% homology to GL; 14/17 cases with IGHV-unmutated and IGKV/IGLV-mutated rearrangements carried IGKV/IGLV genes with ≥99% homology to GL. The most frequent genes among IGH/IGK/IGL mutated CLL sequences were, respectively, IGHV4-34/IGHV3-23, IGKV3-20/IGKV4-1 and IGLV3-21/IGLV2-8. Replacement/silent (R/S) mutation ratios were elevated in CDRs (mean values: 3.2/2.7/4.1 for IGHV/IGKV/IGLV, respectively) and decreased in FRs (mean values: 1.5/1.3/1.3 for IGHV/IGKV/IGLV, respectively). Mutated sequences were analyzed for targeting to the tetranucleotide (4-NTP) motifs RGYW/WRCY and DGYW/WRCH. In CLL, a bias for R mutations and clustering to 4-NTP motifs in the HCDR1 was observed among IGHV4-expressing cases; in contrast, IGHV3-expressing CLL cases were more often targeted at the HCDR2 region. Among IGKV sequences, except for a higher targeting of KFR2 motifs in CLL, mutation distributions were generally similar in all datasets. In our CLL dataset, mutation distributions differed among IGKV1-39/1D-39, IGKV2-30 and IGKV3 sequences, perhaps reflecting differences in GL composition. Among IGL sequences, mutation distributions in CLL and autoreactive sequences were similar; normal cells had fewer R mutations in the LCDR2 region. Serine AGC codons were differentially targeted by R mutations in CLL.
Specifically, Ser AGC codons were frequently changed only at positions HCDR1-31, KCDR1-30/32, KFR3-92 and LCDR2-58; other AGC Ser codons carried frequent R mutations only in IGHV4 genes (especially serine at HFR3-92, which was replaced in 65% of mutated IGHV4-34 sequences). IMGT positions C-23/104, W-41, aliphatics-21/89, amide-44, P-46, G-47, basic-75, acidic-98, Y-102 and W-52 were very rarely (<3%) found to carry R mutations in CLL IGHV sequences. Among IGKV+IGLV CLL sequences, G residues at positions 16/47/70/84 were never mutated. Comparison to normal and autoreactive datasets revealed a “CLL-biased” R mutation distribution in the case of the IGHV4-34 and IGKV2-30 genes, with more R mutations in the HCDR1 and fewer R mutations in the HCDR2 region, as well as a Y to H substitution at IGKV2-30 CDR1-31 in all cases belonging to a subset with stereotyped IGHV4-34/IGKV2-30 mutated IGs. In conclusion, hotspot motifs are differentially targeted by mutations in CLL. Selected CLL IG sequences have been molded to direct mutations to specific codons in a disease-biased manner. Finally, somatic hypermutation makes a significant contribution to shaping the antigen-selected IG repertoire in at least a proportion of CLL cases.
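The two sequence-level measures used above, targeting of the 4-NTP motifs and the R/S mutation ratio, can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study. The function names `find_hotspots` and `count_r_s` are hypothetical, the sequences are toy examples, and the sketch assumes ungapped, in-frame, unambiguous nucleotide sequences of equal length. Changes are counted per codon rather than per nucleotide, which is a simplification.

```python
import re

# IUPAC nucleotide codes needed for the four motifs named in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "D": "[AGT]", "H": "[ACT]",
}
MOTIFS = ("RGYW", "WRCY", "DGYW", "WRCH")

# Standard genetic code, bases ordered T, C, A, G at each codon position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def find_hotspots(seq, motifs=MOTIFS):
    """Return 0-based start positions of each motif, overlaps included."""
    seq = seq.upper()
    hits = {}
    for motif in motifs:
        # A zero-width lookahead reports overlapping occurrences as well.
        pattern = re.compile("(?=" + "".join(IUPAC[b] for b in motif) + ")")
        hits[motif] = [m.start() for m in pattern.finditer(seq)]
    return hits


def count_r_s(germline, observed):
    """Count codons with replacement (R) and silent (S) changes.

    Assumption: both sequences are in frame, ungapped and equally long.
    """
    if len(germline) != len(observed) or len(germline) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    r = s = 0
    for i in range(0, len(germline), 3):
        g = germline[i:i + 3].upper()
        o = observed[i:i + 3].upper()
        if g == o:
            continue  # codon unchanged
        if CODON_TABLE[g] == CODON_TABLE[o]:
            s += 1  # same amino acid: silent
        else:
            r += 1  # different amino acid: replacement
    return r, s


if __name__ == "__main__":
    # A serine AGC codon followed by A matches RGYW, DGYW and WRCH.
    print(find_hotspots("AGCA"))
    # AGC->AAC is Ser->Asn (R); AGC->AGT stays Ser (S); TAC is unchanged.
    print(count_r_s("AGCAGCTAC", "AACAGTTAC"))  # (1, 1)
```

In practice the germline and rearranged sequences would first be aligned and delimited into FR and CDR regions according to IMGT numbering, so that the counts can be reported per region; that step is outside the scope of this sketch.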
