The correction of mutant β-globin genes has long been a therapeutic goal for patients with β-thalassemia or hemoglobinopathies. The use of homologous recombination (HR) to achieve this goal is an attractive approach because it eliminates the need to include regulatory sequences in the therapeutic construct, and it eliminates mutagenesis induced by random integration. However, HR is a very inefficient process for gene correction, and its efficiency is probably locus dependent. The length of targeting arms is thought to be a determinant of targeting efficiency, so we compared the ability of standard (8-kb) versus very long (16-, 24-, and 110-kb) regions of homology to correct a mutant murine β-globin gene in embryonic stem cells. Increasing the length of the targeting sequences did not increase the efficiency of HR in this locus, suggesting that alternative approaches will be required to improve the efficiency of this approach for globin gene correction.

Conventional gene therapy strategies generally involve the use of retroviral or lentiviral vectors to introduce therapeutic genes and linked regulatory sequences into random sites within the target cell genome. Random integration approaches to gene therapy are associated with a number of problems, including integration site–dependent expression, viral genome silencing, and the risk of insertional mutagenesis.1,2 

An alternative strategy is the use of homologous recombination (HR) to “correct” a mutant gene. The major challenge of this strategy is the low efficiency of HR in mammalian cells. The factors that are most crucial for determining gene targeting efficiency in mammalian cells are not yet well understood. One parameter that has clearly been shown to improve HR efficiency is the use of long regions of isogenic DNA in the targeting vector.3-7 Thomas and Capecchi first described an exponential relationship between length of homology and targeting frequency in the HPRT locus.3 A subsequent study by Hasty et al5 also revealed an increase in targeting rate with an increase in targeting arm size from 1.3 to 6.8 kb. In a systematic study by Deng and Capecchi, the targeting frequency was highly dependent on homology length up to 14 kb of DNA, and then reached a plateau at 18.4 kb.7 To our knowledge, no systematic studies have yet been reported comparing vectors with “standard” targeting arms (ie, total arm length of 6-8 kb) to vectors with arms longer than 18.4 kb.

To explore this issue for the β-globin locus, we tested the ability of large targeting vectors to correct an engineered mutation of the murine β-globin gene. Our results showed that these vectors did not improve targeting efficiency in this locus, which suggests that the relationship between targeting arm size and HR efficiency is more complicated than previously thought.

The generation of mouse embryonic stem (ES) cell lines bearing a mutation that changes GCT (encoding Ala) to ATC (encoding Ile) at position 6 of the β-major globin gene (β6I) has been described previously.8 The loxP-flanked PGK-neo cassette located in the targeted locus was subsequently removed via Cre-mediated recombination to create ES lines heterozygous for the β6IΔPGK-neo allele (Figure 1A).

Figure 1.

Design and construction of the targeting vectors. (A) Diagram shows the 2-step approach for developing a murine model system to study targeted gene correction. The first step involved the generation of mouse ES cells containing the β6I mutant globin gene. The structure of the wild-type β-globin gene locus is shown on the top line (βmaj genomic allele), and the targeting vector that was used to introduce the β6I mutation is shown on the second line (β6I targeting vector). The structure of the targeted allele with a retained PGK-neo is depicted on the third line (β6I + PGK-neo allele), and shown below is the structure of the allele after PGK-neo is excised by Cre recombinase (β6IΔPGK-neo allele). In step 2, the ES cells heterozygous for the β6IΔPGK-neo mutant allele were used to determine whether correcting vectors of different sizes could correct the β6I mutation via homologous recombination. The corrected β-globin locus is depicted on the bottom line (corrected βmaj allele). The location of the internal probe is shown (black bar). After EcoRV and BglII double digestion of G418-resistant ES cell DNA, the β6IΔPGK-neo mutant allele yields a 2.7-kb fragment, and the wild-type/corrected allele produces a 3.4-kb fragment. The disappearance of the diagnostic 2.7-kb band indicates that targeted correction of the mutant allele has occurred. Insertion of the correcting vector DNA into a random site(s) of the genome is expected to give rise either to fragments of random sizes, or occasionally, to produce a more prominent 3.4-kb wild-type band without the loss of the 2.7-kb mutant band. B indicates BglII; E, EcoRV. (B) Vector construction. pβC110: A 1.5-kb AvaI fragment containing phosphoglycerate kinase 1 gene promoter-driven neomycin resistance gene (PGK-neo) was inserted into an AvaI site 4.1 kb downstream from the cap site of β-major gene in P1 1935. PGK-neo was inserted in the same transcriptional orientation as that of the globin genes.
pβC24: A 24-kb PvuII fragment containing the entire β-major gene was isolated from the P1 1935 DNA and subcloned into the SmaI site of pGEM7zf(–) (Promega, Madison, WI). The identical AvaI PGK-neo–containing fragment was then inserted at the same AvaI site as described. pβC16: A 7.9-kb fragment, including the 7.8 kb in the 5′ portion of the 24-kb β-major gene insert and a small region containing the multicloning sites, was removed from pβC24 by HindIII partial digestion (loss of 3 contiguous 0.3-, 6.4-, and 1.2-kb HindIII fragments). Circularization of the remaining portion of the vector gave rise to pβC16. pβC8: A 6-kb AvaI fragment spanning the mouse β-major gene was isolated from P1 1935 and subcloned into an AvaI site that is located upstream from a PGK-neo in pCR II (Invitrogen, Carlsbad, CA). Using P1 1935 as the template, a 2.2-kb fragment, located between 4.1 kb and 6.3 kb downstream from β-major cap site, was generated by polymerase chain reaction (PCR). Flanking BamHI-HindIII sites were introduced to facilitate the insertion of this fragment into BamHI-HindIII sites downstream from PGK-neo. The left arm, PGK-neo cassette, and right arm were cloned in the proper orientation. No β-globin locus DNA sequences were removed from pβC8. All the targeting vectors were linearized before use (pβC110 with SalI; pβC24 and pβC16 with XhoI; pβC8 with ApaI). The map of the mouse β6IΔPGK-neo mutant allele is shown on the top line (β6I allele), and the structure of the targeting vectors containing 110 kb (pβC110), 24 kb (pβC24), 16 kb (pβC16), and 8 kb (pβC8) of targeting DNA are depicted below. (C) Structural analysis of the mouse β-globin gene locus. The P1/PAC clone 1935 (top line; P1 1935) and the PGK-neo-containing β-globin locus within the targeting vector pβC110 (bottom line; pβC110) are shown. The sizes of relevant restriction fragments (in kb) are shown below the diagrams of β-globin loci.
The pβC110 vector was generated by insertion of a 1.5-kb AvaI fragment containing a PGK-neo cassette (hatched box) into an AvaI site (designated as A*) downstream from the β-major globin gene within the P1 1935. Insertion of PGK-neo led to an increase in the size of some restriction fragments, including a BglII fragment from 2.8 kb to 4.3 kb, an EcoRV fragment from 3.7 kb to 5.2 kb, and an HpaI fragment from 3.0 kb to 4.5 kb. Addition of PGK-neo, which contained an internal NcoI and a PstI site, disrupted an 8.1-kb NcoI and a 4.3-kb PstI fragment, and produced 2 diagnostic 4.8-kb NcoI fragments, and 2 PstI fragments of 3.6 kb and 2.3 kb in length. B indicates BglII; E, EcoRV; H, HpaI; N, NcoI; P, PstI; A, AvaI. (D) Restriction analysis of P1 1935 and pβC110. Plasmid DNA was digested with BglII, EcoRV, HpaI, NcoI, and PstI, and the resultant restriction fragments were analyzed by agarose gel electrophoresis. The data revealed that the pβC110 vector contained a single PGK-neo cassette that had been properly inserted as predicted in panel C. The correct orientation of PGK-neo in the locus was revealed by DNA sequence analysis of the insert-locus junctions with sequencing primers located both within the locus and within the PGK-neo. Similar structural and sequence analyses were also performed on other targeting vectors to verify the correct vector configuration (data not shown).


The targeting vectors used in this study (described in Figure 1) all contained targeting DNA sequences derived from P1/PAC clone no. 1935, which contains a 110-kb insert isogenic to the β-globin locus in β6I mutant ES cells (129/SvJ background). The ES cell culture, electroporation of linearized targeting vector DNA, G418 selection, and DNA preparation from G418-resistant clones were performed as previously described.8

We previously created a mouse ES cell line that contains an engineered mutation changing GCT (encoding Ala) to ATC (encoding Ile) at position 6 of the mouse β-major globin gene (β6I).8 The loxP-flanked PGK-neo selectable marker cassette was subsequently removed from the mutant locus via Cre/loxP-mediated recombination. This model was designed to facilitate the testing of very large targeting vectors, so that analysis of homologous recombination could use probes that are contained within the targeting sequences. The β6I mutation creates a novel EcoRV site in the β-major gene (Figure 1A; β6IΔPGK-neo allele; E*, novel EcoRV site), which allows the mutant allele to be distinguished from the “wild-type,” corrected allele by Southern analysis. Because the probe is internal to the targeting sequences, it also detects randomly integrated targeting vectors.
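The band-pattern logic described here and in the Figure 1A legend can be sketched as a small classifier. This is only an illustration: the function names, the 0.1-kb matching tolerance, and the "uninterpretable" fallback are assumptions introduced for the example, not part of the study's protocol; the 2.7-kb and 3.4-kb diagnostic sizes come from the legend.

```python
# Diagnostic fragment sizes (kb) after EcoRV + BglII digestion, internal probe.
MUTANT_KB = 2.7     # β6IΔPGK-neo mutant allele
WILD_TYPE_KB = 3.4  # wild-type or corrected allele


def has_band(bands_kb, size_kb, tol=0.1):
    """True if any observed band lies within tol kb of size_kb (tol is an assumption)."""
    return any(abs(b - size_kb) <= tol for b in bands_kb)


def classify_clone(bands_kb):
    """Classify a G418-resistant clone from its Southern band sizes (kb)."""
    wt = has_band(bands_kb, WILD_TYPE_KB)
    mut = has_band(bands_kb, MUTANT_KB)
    # Bands matching neither diagnostic size are taken as random-integrant fragments.
    extra = [b for b in bands_kb if not has_band([WILD_TYPE_KB, MUTANT_KB], b)]
    if wt and not mut:
        # Loss of the 2.7-kb band indicates correction of the mutant allele.
        return "corrected (WT/C)" + (" + RI" if extra else "")
    if wt and mut:
        # The mutant band is retained, so the vector is presumed to have integrated randomly.
        return "random integrant (WT/β6I + RI)"
    return "uninterpretable"


if __name__ == "__main__":
    for bands in ([3.4], [3.4, 2.7, 5.1], [3.4, 6.0]):
        print(bands, "->", classify_clone(bands))
```

A clone that retains the 2.7-kb band is scored as a random integrant even without extra bands, because G418 resistance implies that the vector integrated somewhere in the genome.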

We created replacement-type vectors that contained a total of 8, 16, 24, or 110 kb of wild-type isogenic DNA from the mouse β-globin gene locus in the targeting arms (Figure 1B). We compared the abilities of these vectors to correct the β6I mutation in RW-4 (129/SvJ) ES cells bearing the β6IΔPGK-neo allele. Positive selection was provided by a PGK-neo selectable marker gene (Figure 1B hatched box, PGK-neo) positioned downstream from the wild-type β-major gene (Figure 1B open box; βWT) at the same position as a residual 42-bp loxP site in the β6IΔPGK-neo mutant allele. The placement of PGK-neo at this site allowed the targeting DNA used in these vectors to be uninterrupted by the heterologous loxP site retained in the target locus. All vectors were linearized by restriction enzyme digestion at a unique site within the vector backbone that was close (< 26 bp) to one edge of the targeting fragment. Targeting vector DNA (40 μg) was transfected for each experiment. Therefore, fewer molar equivalents of the larger vectors were transfected, which was reflected in proportionally reduced numbers of G418-resistant colonies per microgram of transfected DNA (data not shown). The ratio of homologous to random integration events (ie, the “efficiency” of HR) remains comparable across vectors of all sizes, because the total integration efficiency is “normalized” by the same PGK-neo cassette present in each vector.
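The effect of a fixed 40-μg dose on molar input can be illustrated with a rough calculation. This sketch assumes an average mass of about 650 g/mol per base pair of double-stranded DNA (a common approximation) and counts only the targeting DNA plus the 1.5-kb PGK-neo cassette. The backbone sizes are not given in the text, so `backbone_bp` is a hypothetical parameter left at 0, and the absolute picomole values are therefore upper estimates.

```python
AVG_BP_MASS = 650.0  # g/mol per bp of dsDNA; approximation assumed for this sketch
PGK_NEO_BP = 1_500   # 1.5-kb PGK-neo cassette (size from the Figure 1 legend)


def picomoles(mass_ug, homology_kb, backbone_bp=0):
    """Approximate pmol of linear vector in mass_ug micrograms of DNA.

    backbone_bp is hypothetical: the text does not state backbone sizes.
    """
    length_bp = homology_kb * 1_000 + PGK_NEO_BP + backbone_bp
    moles = mass_ug * 1e-6 / (length_bp * AVG_BP_MASS)
    return moles * 1e12


if __name__ == "__main__":
    # Total targeting DNA per vector (kb), as stated in the text.
    vectors = {"pβC8": 8, "pβC16": 16, "pβC24": 24, "pβC110": 110}
    for name, kb in vectors.items():
        print(f"{name}: ~{picomoles(40, kb):.2f} pmol per 40 µg")
```

Under these assumptions the 8-kb vector is delivered at roughly 12 times the molar amount of the 110-kb vector. That is why colony yield per microgram is not comparable across vectors, whereas the ratio of targeted to total G418-resistant clones is.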

To measure HR frequency, each targeting vector was transfected into β6I mutant ES cells, and G418-resistant clones were screened by Southern analysis (Figure 2). A targeting vector with the same arms was previously used to create the β6I mutation in RW-4 cells; 4 of 220 G418-resistant ES clones were shown to be correctly targeted.8 A representative Southern blot of BglII/EcoRV-digested ES cell DNA hybridized with the internal probe is shown in Figure 2. The pβC8 vector yielded 1 clone with a corrected allele among 207 tested (Figure 2, Table 1; pβC8). Three of 146 clones transfected with pβC24 contained a corrected β-globin gene (Table 1). We did not detect any targeted clones with either the 16-kb or 110-kb vectors (Table 1). The overall frequency of HR is underestimated by a factor of 2 because we could detect HR only at the mutant β-globin allele. The 95% confidence intervals for the HR frequencies of these vectors overlap, and none of the frequencies differs significantly from the targeting frequency of the vector used to make the original mutation.
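The text does not say how the 95% confidence intervals in Table 1 were computed. As a sketch, an exact binomial (Clopper-Pearson type) interval gives values close to those reported, provided one assumes that rows with zero events use a one-sided 95% upper bound, 1 - 0.05^(1/n). The helper names below are hypothetical, and the interval method itself is an assumption rather than the authors' stated procedure.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for f(p) = target, where f decreases in p on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Exact binomial interval for k events in n clones, as proportions.

    Assumption: at k == 0 or k == n the single informative bound is one-sided
    (tail = alpha); otherwise each tail gets alpha / 2.
    """
    tail = alpha if k in (0, n) else alpha / 2
    lower = 0.0 if k == 0 else solve_decreasing(lambda p: binom_cdf(k - 1, n, p), 1 - tail)
    upper = 1.0 if k == n else solve_decreasing(lambda p: binom_cdf(k, n, p), tail)
    return lower, upper


if __name__ == "__main__":
    # Corrected clones / G418-resistant clones screened, from Table 1.
    data = {"pβC8": (1, 207), "pβC16": (0, 116), "pβC24": (3, 146), "pβC110": (0, 119)}
    for name, (k, n) in data.items():
        lo, hi = exact_ci(k, n)
        print(f"{name}: {k}/{n} = {100 * k / n:.2f}% (95% CI {100 * lo:.2f}-{100 * hi:.2f}%)")
```

With so few events per vector, every interval spans roughly 0% to 2.5% or wider, so the observed differences between vectors cannot be separated from sampling noise.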

Figure 2.

Correction of the marked β6I globin gene by HR. Southern analysis of DNA derived from G418-resistant ES cell clones. DNA was digested with EcoRV and BglII and hybridized with the internal probe. Shown above each lane is the genotype of the individual ES cell clone. The fragments marked with a star (*) were generated by the randomly integrated vector DNA sequences. The ES cell clones represented in lanes 1-5 and 7-8 contained randomly integrated correcting DNA within the genome (WT/β6I + RI). The ES clone represented in lane 6 did not produce a mutant 2.7-kb band (marked with an open circle [○]), revealing that this clone had undergone HR-mediated correction of the β6I mutation (WT/C). WT indicates wild-type allele; C, corrected allele; and RI, random integrant.

Table 1.

HR efficiencies


Vector    L-arm, kb   R-arm, kb   Number of experiments   Correcting efficiency (%)   95% CI, %   P*
pβC8      5.8         2.2         2                       1/207 (0.48)                0.01-2.66   .60
pβC16     10.9        5.1         2                       0/116 (0)                   0.00-2.55   .30
pβC24     18.9        5.1         2                       3/146 (2.05)                0.43-5.89   .36
pβC110    ∼70         ∼40         1                       0/119 (0)                   0.00-2.49   .30


Shown are the sizes of the left arm (L-arm), the right arm (R-arm), the number of independent experiments, and the number of “correction” events from the total number of G418-resistant clones for each targeting vector. The 95% confidence intervals for correction events are shown. All of the vectors displayed HR frequencies that were statistically indistinguishable from the parental “knock-in” vector (4 correctly targeted events of 220 G418-resistant clones. For this calculation [*], we divided the parental targeting efficiency by 2, because we can only detect HR events that occur on the mutant allele with the “correcting” vectors). We have transfected the same RW-4 ES cell used here with targeting constructs representing 255 independent loci in our ES core. The design of all vectors was standardized to include at least 2-kb isogenic targeting sequences in each arm, with selection provided by the same PGK-neo cassette used in this study. A total of 39 085 G418-resistant clones have been evaluated; 606 of these (1.55%) have been confirmed to contain correctly targeted alleles, an HR efficiency that is statistically the same as the efficiencies measured in this study (E. Ross, J. Mudd, D. George, and T. J. Ley; http://medicine.wustl.edu/~escore/htmldocs/gencon.htm#targrec).
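The percentages quoted in this note can be checked with a few lines of arithmetic. The pooled figure at the end (all four correcting vectors combined) is not reported in the text; it is computed here from the Table 1 counts purely as an illustration, and the variable names are hypothetical.

```python
def percent(events, clones):
    """Events as a percentage of clones screened."""
    return 100.0 * events / clones


# Parental "knock-in" vector: 4 correctly targeted of 220 G418-resistant clones.
parental = percent(4, 220)
# Halved for comparison, because correcting vectors are scored on the mutant allele only.
parental_halved = parental / 2

# ES core experience: 606 confirmed targeted clones of 39 085 evaluated.
core = percent(606, 39_085)

# Illustrative pooled estimate across the four correcting vectors (not in the source).
pooled = percent(1 + 0 + 3 + 0, 207 + 116 + 146 + 119)

if __name__ == "__main__":
    print(f"parental: {parental:.2f}%  (halved: {parental_halved:.2f}%)")
    print(f"ES core:  {core:.2f}%")
    print(f"pooled correcting vectors: {pooled:.2f}%  (doubled: {2 * pooled:.2f}%)")
```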

One correctly targeted clone also harbored additional randomly integrated vector sequences elsewhere in the genome (data not shown).

Our results extend several earlier studies and provide a crude estimate of the critical length of homology that gives a “saturating” gene targeting efficiency in the β-globin locus. The absolute targeting frequency attained in our system was in the range of 10⁻⁵, with targeting frequencies in the range of 1% to 2% of G418-resistant clones (similar to that previously described).3,7,9 The HR frequencies obtained in this study are statistically indistinguishable from our overall experience using traditionally sized targeting vectors (Figure 2, Table 1).

Why did long regions of homology not improve targeting efficiency? One possibility is that large DNA fragments are unstable in host cells during integration. However, studies from our group and others have shown that purified human gDNA fragments of 100 kb are usually unrearranged after electroporation and integration into murine ES cell genomes10,11  (R.M.K., T.J.L., unpublished observation, January 1999). It is also possible that HR is less efficient when great distances separate the ends of the targeting arms.

Our results suggest that the relationship between targeting arm size and HR efficiency is more complicated than previously appreciated,12  because targeting efficiency for the β-globin locus is not improved by very large vectors. The exact relationship between vector size and targeting efficiency is probably influenced by other variables, such as the targeted locus itself, vector design, and the status of cellular HR machinery,13  all of which are under investigation.

Prepublished online as Blood First Edition Paper, May 1, 2003; DOI 10.1182/blood-2003-03-0708.

Supported by grant no. DK38682 (T.J.L.) from the National Institutes of Health and a fellowship grant (Z.H.L.) from the Cooley's Anemia Foundation.

The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.

We thank Rick Goforth and the Siteman Cancer Center Embryonic Stem Cell Core for technical assistance. Kim Trinkaus provided excellent statistical support. Nancy Reidelberger provided expert assistance with manuscript preparation.

1. Persons DA, Nienhuis AW. Gene therapy for the hemoglobin disorders: past, present, and future. Proc Natl Acad Sci U S A. 2000;97:5022-5024.
2. Li Z, Dullmann J, Schiedlmeier B, Schmidt M, et al. Murine leukemia induced by retroviral gene marking. Science. 2002;296:497.
3. Thomas KR, Capecchi MR. Site-directed mutagenesis by gene targeting in mouse embryo-derived stem cells. Cell. 1987;51:503-512.
4. Shulman MJ, Nissen L, Collins C. Homologous recombination in hybridoma cells: dependence on time and fragment length. Mol Cell Biol. 1990;10:4466-4472.
5. Hasty P, Rivera-Perez J, Bradley A. The length of homology required for gene targeting in embryonic stem cells. Mol Cell Biol. 1991;11:5586-5591.
6. te Riele H, Maandag ER, Berns A. Highly efficient gene targeting in embryonic stem cells through homologous recombination with isogenic DNA constructs. Proc Natl Acad Sci U S A. 1992;89:5128-5132.
7. Deng C, Capecchi MR. Reexamination of gene targeting frequency as a function of the extent of homology between the targeting vector and the target locus. Mol Cell Biol. 1992;12:3365-3371.
8. Kaufman RM, Lu ZH, Behl R, Holt JM, Ackers GK, Ley TJ. Lack of neighborhood effects from a transcriptionally active phosphoglycerate kinase-neo cassette located between the murine β-major and β-minor globin genes. Blood. 2001;98:65-73.
9. Hatada S, Nikkuni K, Bentley SA, Kirby S, Smithies O. Gene correction in hematopoietic progenitor cells by homologous recombination. Proc Natl Acad Sci U S A. 2000;97:13807-13811.
10. Kaufman RM, Pham CTN, Ley TJ. Transgenic analysis of a 100-kb human β-globin cluster-containing DNA fragment propagated as a bacterial artificial chromosome. Blood. 1999;94:3178-3184.
11. Antoch MP, Song EJ, Chang AM, et al. Functional identification of the mouse circadian Clock gene by transgenic BAC rescue. Cell. 1997;89:655-667.
12. Fujitani Y, Yamamoto K, Kobayashi I. Dependence of frequency of homologous recombination on the homology length. Genetics. 1995;140:797-809.
13. Vasquez KM, Marburger K, Intody Z, Wilson H. Manipulating the mammalian genome by homologous recombination. Proc Natl Acad Sci U S A. 2001;98:8403-8410.