Abstract
Single-stranded oligonucleotides (ssODNs) serve as donor templates for gene editing through targeted endonuclease cleavage and homology-directed repair (HDR). While oligonucleotide synthesis is a mature technology, quality assessment of products used in clinical protocols is limited to standard liquid chromatography and mass spectrometry. This study used direct sequencing to investigate the sequence fidelity of commercially synthesized ssODNs and assessed the impact of synthesis errors on gene editing outcomes.
We tested a 168 bp ssODN designed for use as a donor template in a protocol to correct the sickle cell disease mutation in the human β-globin gene using CRISPR/Cas9-directed HDR. Three different manufacturers produced identical ssODNs, which were subjected to deep sequencing using two library preparation kits that allowed direct incorporation of the ssODN into an Illumina library. Each ssODN was sequenced with a minimum of 3 million 150 bp paired-end reads, achieving an average minimum depth of 150,000 aligned reads.
Direct sequencing revealed single nucleotide errors (SNEs) and small deletions in all ssODNs. At individual positions, error rates reached 9% for SNEs and 3% for deletions. Error positions were consistent across manufacturers, but error rates varied substantially: manufacturer A's ssODN typically contained more than twice as many errors as those from manufacturers B and C.
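As an illustration of how per-position SNE and deletion rates of this kind can be tabulated, the following is a minimal sketch and not the pipeline used in this study. It assumes each read has already been gap-aligned to the full reference, with "-" marking a deleted base and "." marking a position the read does not cover; the function name and this encoding are hypothetical.

```python
def per_position_error_rates(reference, aligned_reads):
    """Return (sne_rates, deletion_rates), one value per reference position.

    Assumed input encoding (hypothetical, for illustration only):
      - every read has the same length as the reference
      - '-' marks a deleted base, '.' marks an uncovered position
    """
    n = len(reference)
    depth = [0] * n      # reads covering each position
    sne = [0] * n        # substitutions at each position
    deleted = [0] * n    # deletions at each position

    for read in aligned_reads:
        if len(read) != n:
            raise ValueError("read must be aligned to the full reference length")
        for i, (ref_base, base) in enumerate(zip(reference, read)):
            if base == ".":
                continue  # no coverage, do not count toward depth
            depth[i] += 1
            if base == "-":
                deleted[i] += 1
            elif base != ref_base:
                sne[i] += 1

    sne_rates = [s / d if d else 0.0 for s, d in zip(sne, depth)]
    deletion_rates = [x / d if d else 0.0 for x, d in zip(deleted, depth)]
    return sne_rates, deletion_rates
```

Running this once per manufacturer on the same reference would give rate vectors that can be compared position by position.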
SNEs occurred more frequently at specific positions within the ssODN sequence across all vendors, particularly at cytosine and guanine positions. Small deletions (typically 1-2 bp) were more uniformly distributed across the sequence. The SNEs were not restricted to a single nucleotide conversion: a given cytosine might change primarily to thymine but also to guanine or adenine.
To validate that observed errors were genuine synthesis errors rather than artifacts arising from library construction and sequencing, we analyzed genomic DNA from hematopoietic stem and progenitor cells (HSPCs) that had been edited using these ssODNs. We expected that true synthesis errors would be incorporated into the genome through HDR only in the region where the ssODN serves as a template, whereas sequencing errors would be absent from the genomes of edited HSPCs.
Analysis of edited HSPC genomic DNA confirmed that SNEs were faithfully propagated into the genome through HDR, indicating that they were true synthesis errors. SNEs appeared in genomic DNA at rates correlating with their frequency in the respective ssODNs, but only within the HDR region (approximately 50-75 bp upstream and 10 bp downstream of the Cas9 cleavage site). Outside this region, error rates matched those expected from normal PCR or sequencing artifacts.
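The inside-versus-outside comparison described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code from the study: positions are 0-based, `cut_site` is the index of the first base downstream of the Cas9 cut, upstream positions have lower indices, and the default window uses the upper bound of the reported upstream range (75 bp) with 10 bp downstream. The function names are hypothetical.

```python
from statistics import fmean


def split_by_hdr_window(rates, cut_site, upstream=75, downstream=10):
    """Split per-position error rates into (inside, outside) the HDR window.

    The window covers [cut_site - upstream, cut_site + downstream),
    clipped to the bounds of the sequence.
    """
    start = max(0, cut_site - upstream)
    end = min(len(rates), cut_site + downstream)
    inside = rates[start:end]
    outside = rates[:start] + rates[end:]
    return inside, outside


def mean_rate(rates):
    """Mean error rate, or 0.0 for an empty region."""
    return fmean(rates) if rates else 0.0
```

If the errors are true synthesis errors propagated by HDR, the mean rate inside the window should exceed the mean rate outside it, where only PCR and sequencing background is expected.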
The incorporation pattern of SNEs provided valuable information on the extent and polarity of HDR in our protocol, indicating that HDR in this system initiates at the Cas9 cleavage site, proceeds largely asymmetrically, and attenuates with distance from the cleavage site. In edited alleles, synthesis errors were incorporated at frequencies as high as 12% for SNEs and 3% for deletions at individual positions.
The propagation of synthesis errors has significant implications for therapeutic gene editing. In the β-globin context, while SNEs may create benign variants based on available clinical data, 90% of observed deletions were 1-2 bp frameshift mutations that could create deleterious β-thalassemia-like alleles with more consequential effects on protein production.
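The reasoning behind the frameshift concern is simple arithmetic: a deletion within coding sequence shifts the reading frame whenever its length is not a multiple of three, so every 1 bp or 2 bp deletion qualifies. A minimal sketch, assuming all deletions fall within coding sequence and using hypothetical function names and made-up example lengths:

```python
def is_frameshift(deletion_length):
    """True if a deletion of this length (in bp) shifts the reading frame.

    Assumes the deletion lies entirely within coding sequence.
    """
    if deletion_length <= 0:
        raise ValueError("deletion length must be positive")
    return deletion_length % 3 != 0


def frameshift_fraction(deletion_lengths):
    """Fraction of deletions that are frameshifts (0.0 for an empty list)."""
    if not deletion_lengths:
        return 0.0
    shifted = sum(1 for length in deletion_lengths if is_frameshift(length))
    return shifted / len(deletion_lengths)
```

For example, `frameshift_fraction([1, 2, 1, 3, 6])` counts the three 1-2 bp deletions as frameshifts and the 3 bp and 6 bp deletions as in-frame.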
This study demonstrates that synthesis errors in commercial ssODNs are systematically propagated into target genomes during HDR-mediated gene editing. The substantial variation in error rates among manufacturers represents a previously unrecognized source of editing errors that must be considered in therapeutic gene correction strategies. The findings highlight the critical need for sequence fidelity assessment of ssODNs prior to clinical use and suggest that current quality control measures in oligonucleotide manufacturing may be insufficient for clinical applications.