Key Points
CAR T-cell engineering at GSH6 achieves long-term tumor control.
Validated criteria identify targetable extragenic GSHs.
Abstract
Cell therapies that rely on engineered immune cells can be enhanced by achieving uniform and controlled transgene expression in order to maximize T-cell function and achieve predictable patient responses. Although they are effective, current genetic engineering strategies that use γ-retroviral, lentiviral, and transposon-based vectors to integrate transgenes unavoidably produce variegated transgene expression in addition to posing a risk of insertional mutagenesis. In the setting of chimeric antigen receptor (CAR) therapy, inconsistent and random CAR expression may result in tonic signaling, T-cell exhaustion, and variable T-cell persistence. Here, we report and validate an algorithm for the identification of extragenic genomic safe harbors (GSH) that can be efficiently targeted for DNA integration and can support sustained and predictable CAR expression in human peripheral blood T cells. The algorithm is based on 7 criteria established to minimize genotoxicity by directing transgene integration away from functionally important genomic elements, maximize efficient CRISPR/Cas9–mediated targeting, and avert transgene silencing over time. T cells engineered to express a CD19 CAR at GSH6, which meets all 7 criteria, are curative at low cell dose in a mouse model of acute lymphoblastic leukemia, matching the potency of CAR T cells engineered at the TRAC locus and effectively resisting tumor rechallenge 100 days after their infusion. The identification of functional extragenic GSHs thus expands the human genome available for therapeutic precision engineering.
Introduction
The therapeutic use of genetically engineered human cells is rapidly expanding beyond gene therapy for inherited monogenic disorders to a range of acquired disorders. Alterations of the human genome are not only undertaken to treat inherited mutations1 but also to introduce natural or synthetic genes to reprogram cell function, as exemplified in chimeric antigen receptor (CAR) therapy.2,3 Effective genetic engineering requires predictable and dependable transgene expression, sustaining an optimal level over time without incurring genotoxic adverse events. Albeit effective in achieving stable genetic modifications,1 γ-retroviral, lentiviral, and transposon-based vectors all afford semirandom integration, resulting in obligate insertional mutagenesis4-6 and variegated transgene expression.7,8 The integration of γ-retroviral and lentiviral vectors is biased toward gene loci, increasing the probability of transgene expression as well as the risk of disrupting the expression of endogenous genes,9,10 the most dreaded consequence of which is oncogene activation,11 as was originally observed in some patients treated with retrovirus-mediated gene therapy for X-linked severe combined immunodeficiency (X-SCID).12-14 Clonal expansions stopping short of leukemic transformation have also occurred in hematopoietic stem cell therapies for globin gene disorders15 and CAR T-cell therapy.16,17 The other major detrimental consequences of semirandom integration are variegated and unpredictable transgene expression, owing to chromosomal position effects that include transcriptional silencing because of heterochromatinization.8 In regard to T-cell engineering, suboptimal CAR expression may impair T-cell function via tonic signaling induction and terminal T-cell differentiation and exhaustion.18
In principle, these challenges could be overcome if the transgene were integrated at a defined genomic site that reliably provides safe and stable gene expression, referred to as a genomic safe harbor (GSH). Intragenic or juxtagenic sites have been proposed as potential GSHs in human cells, including the adeno-associated virus site 1, the chemokine receptor 5 locus, and the human orthologue of the mouse ROSA26 locus.19-24 These sites lie either within a gene that is thought to be dispensable or in close proximity to genes that are deemed not to pose an oncogenic threat. Their vicinity, however, is gene-rich, raising the risk of endogenous gene transactivation by ectopic enhancer or promoter elements, a risk that may vary among cell types.
To avert such shortcomings, one may alternatively search for gene-remote, extragenic GSH (eGSH).19 Informed by an extensive literature on retroviral integration sites and insertional mutagenesis, we previously proposed criteria for identifying safe viral vector integrations.25 The advent of site-specific nucleases now makes it possible to prospectively direct transgene integration to such locations, provided that the latter are accessible for efficient cleavage and homologous recombination. Focusing on T-cell engineering, here, we explore the potential of CRISPR/Cas9 and recombinant adeno-associated virus type 6 (rAAV6) donor vectors to target a CAR transcription unit to candidate eGSHs in order to achieve sustained and predictable transgene expression. Using a CAR specific for CD19, we demonstrate that one such site, termed GSH6, directs optimal CAR expression and potent antitumor activity in vivo. The identification of accessible eGSHs in primary T cells will facilitate the generation of T cells that predictably and homogeneously express their therapeutic gene cargo, including CARs or other immunomodulatory molecules, thereby enhancing the efficacy and safety of cellular immunotherapy.
Materials and methods
In vivo assessment of GSH-CARs in CD19-CAR stress-test model of B-cell acute lymphoblastic leukemia
Six to 12-week-old NOD/SCID/IL-2Rγ null (NSG) male mice (The Jackson Laboratory) were used under a Memorial Sloan Kettering Cancer Center Institutional Animal Care and Use Committee-approved protocol. All relevant animal use guidelines and ethical regulations were followed. Mice were infused with 0.5 × 10⁶ CD19-FfLuc–green fluorescent protein (GFP) NALM6 cells via tail vein injection, followed by injection of 5 × 10⁴, 1 × 10⁵, 2 × 10⁵, or 4 × 10⁵ CAR+ T cells 4 days later (4 days after transfection; unpurified; CAR+ cell number was calculated using flow cytometry). Tumor rechallenge experiments were performed through IV administration of 1 × 10⁶ CD19-FfLuc-GFP NALM6 cells at intervals of 10 days, at indicated time points. No mice were excluded before treatment. No randomization or blinding methods were used. Bioluminescence imaging was performed using the IVIS Imaging System (PerkinElmer) and analyzed with the Living Image software (PerkinElmer). Tumor burden was assessed as average total flux signal of dorsal and ventral images per mouse.
Statistical analyses
All statistical analyses were performed using the Prism 7 (GraphPad) software. No statistical methods were used to predetermine sample size.
Results
Identification of efficiently targeted eGSH in human primary T cells
We previously proposed 5 safety criteria to safely position transgenes away from endogenous genes, including cancer-related genes25 (criteria 1-5; Figure 1A), to which we have added a sixth safety criterion to exclude disruption of noncoding RNAs26,27 (Figure 1A). A seventh criterion addresses the need for efficient site-specific transgene integration, which requires effective access and cleavage by a nuclease and subsequent homologous recombination. An eighth criterion addresses local chromatin structure and the need for dependable and sustained transgene function (Figure 1A). Because cleavage efficiency is difficult to predict based on the guide RNA (gRNA) sequence alone,28 cleavage needs to be empirically tested. We hypothesized that accessible chromatin would be more susceptible to efficient cleavage by Cas9, as corroborated via an analysis of cleavage efficiency using assay for transposase-accessible chromatin sequencing (ATAC-seq) peak signal intensity (supplemental Figure 1, available on the Blood website). We thus performed ATAC-seq analyses in activated peripheral blood T cells obtained from healthy human donors, generating an atlas comprising all 21 566 reproducible ATAC-seq peaks (details are presented in supplemental Methods). Another genome atlas was independently established to identify genomic regions satisfying the first 6 eGSH criteria (Figure 1A). Overlapping the 2 atlases yielded 379 candidate eGSHs comprising at least 1 ATAC-seq peak inside or within 5 kb of the eGSH boundaries. These 379 eGSHs were then ranked based on the average signal intensity at the summit of the associated ATAC-seq peak (Figure 1A-B).
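The atlas-overlap and ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Region` and `Peak` classes, the toy coordinates, and the signal values are hypothetical, and the choice to score a region by its strongest nearby peak is an assumption made only for illustration. The 5 kb flank is taken from the text.

```python
from dataclasses import dataclass

FLANK_BP = 5_000  # "inside or within 5 kb of the eGSH boundaries" (from the text)


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate region that already satisfies safety criteria 1-6."""
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Peak:
    """Hypothetical reproducible ATAC-seq peak; signal is assumed pre-averaged."""
    chrom: str
    start: int
    end: int
    mean_summit_signal: float


def peaks_near(region, peaks, flank=FLANK_BP):
    """Peaks on the same chromosome overlapping the region extended by `flank` bp."""
    lo, hi = region.start - flank, region.end + flank
    return [p for p in peaks
            if p.chrom == region.chrom and p.start <= hi and p.end >= lo]


def rank_candidates(regions, peaks):
    """Keep regions with at least 1 nearby peak and rank them, strongest first.

    Assumption: a region with several nearby peaks is scored by its strongest one.
    """
    scored = []
    for region in regions:
        nearby = peaks_near(region, peaks)
        if nearby:
            scored.append((region, max(p.mean_summit_signal for p in nearby)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (invented for illustration only).
regions = [
    Region("chr1", 100_000, 120_000),
    Region("chr2", 500_000, 510_000),
    Region("chr3", 10_000, 20_000),
]
peaks = [
    Peak("chr1", 110_000, 110_500, 40.0),  # inside the chr1 region
    Peak("chr1", 123_000, 123_200, 55.0),  # 3 kb beyond the chr1 region
    Peak("chr2", 513_000, 513_400, 90.0),  # 3 kb beyond the chr2 region
    Peak("chr3", 40_000, 40_300, 70.0),    # 20 kb away: too far to count
]
ranked = rank_candidates(regions, peaks)
for region, signal in ranked:
    print(region.chrom, region.start, region.end, signal)
```

In this toy example the chr3 region is dropped because its only peak lies more than 5 kb from the region boundary. A real analysis would use interval trees or a tool built for genomic intervals rather than the nested scan shown here.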
We first evaluated the cleavage efficiency at the 6 top eGSHs, designing 4 gRNAs per site at the ATAC-seq peak summit, all of which had a Doench score ≥50 and a specificity score >0.2 (references 29 and 30; see supplemental Table 1 for gRNA details). Electroporation of Cas9 messenger RNA and chemically modified single guide RNAs31 resulted in >80% cleavage efficiencies at all 6 sites (Figure 1C; supplemental Figure 2A). To assess whether high cleavage efficiency is limited to the summit or extends to its immediate vicinity, we randomly chose 2 eGSHs (GSH1 and GSH5) from the top 6 and designed gRNAs spanning 2.5 kb on either side of the peak, followed by quantification of their cleavage efficiencies in peripheral blood T cells (supplemental Figure 2B-C). Although cleavage efficiency diminished 1000 bp and 500 bp away from the edge of the GSH1 and GSH5 peaks, respectively, high efficiency was generally maintained within the peak.
To correlate chromatin accessibility and cleavage efficiency, we tested 2 gRNAs per eGSH at the peak summit for 6 eGSHs that had low ATAC-seq peak signal intensities and 3 previously identified eGSHs25 that had no associated ATAC-seq peaks (supplemental Figure 2D). We additionally included a multiple target site-specific gRNA32 that targets 8 different loci which have different associated ATAC-seq peak signal intensities (supplemental Figure 2E). Although a direct correlation between ATAC-seq peak signal intensity and cleavage efficiency could not be established, these controls further corroborated that an extragenic site with an associated high signal intensity ATAC-seq peak had a higher probability of consistent, efficient cleavage than a site associated with low intensity or no ATAC-seq peak.
eGSHs differentially regulate CAR expression and CAR T-cell function in vitro
We initially targeted the 1928ζ-1XX CAR33 driven by the elongation factor 1α (EF1α) promoter34 to GSH 1-3 (Figure 1D). TRAC-integrated 1928ζ-1XX has been shown to support robust in vivo CAR T-cell function18,33,35-38 and was therefore used as a gold standard reference for effective intragenic integration and expression. GSH 1-3 were effectively targeted, albeit less effectively than the optimized TRAC targeting, yielding readily detectable CAR expression (Figure 1E). Commensurate with their respective CAR expression levels (Figure 1E), sorted GSH1 and GSH2-CAR T cells displayed greater cytolytic activity against CD19+ NALM6 leukemia cells than GSH3-CAR T cells (Figure 1F). To further analyze GSH-CAR T-cell function and CAR expression, we measured T-cell proliferation induced by repeated antigen encounter over 2 weeks, examining cell-surface CAR expression at regular time intervals (Figure 2A-C). After the first exposure to antigen, CAR expression was maintained at GSH1 but diminished at GSH2 and GSH3 (Figure 2B). After the second exposure to antigen, CAR expression also diminished at GSH1, in contrast to the stable, sustained level measured in TRAC-CAR T cells (Figure 2B). Therefore, at all 3 eGSHs, CAR expression eventually declined, most rapidly at GSH3, followed by GSH2 and finally GSH1 (Figure 2B). The proliferation capacity of the GSH-CAR T cells over these 2 weeks was also less than that of TRAC-CAR T cells (Figure 2C).
We hypothesized that a chromatin insulator element with barrier function may offset silencing and sustain CAR expression. We thus flanked the CAR transcription unit at GSH1 with the human C1 insulator39 (Figure 2D) and tested 3 additional eGSHs (GSHs 4-6; Figure 1C). Initial CAR expression levels varied among the sites, with GSH6 yielding expression most similar to TRAC, either with or without the insulator (Figure 2E; supplemental Figure 3A-B). All GSHs showed comparable cytolytic potential against NALM6 on day 0, except for GSH3, which showed lower cytolysis (Figure 2F). After 2 weekly exposures to CD19, GSH2 and GSH3-CAR T cells showed the least expansion (Figure 2G). The C1 insulator had a variable, site-specific impact, increasing CAR expression at GSH1 but not at GSH 4, 5, and 6 (Figure 2E). These findings are consistent with previous observations on the site-specific dependence of insulator activity.40,41 Interestingly, uninsulated GSH6-CAR T cells still maintained their CAR expression levels and displayed proliferation capacity closest to TRAC-CAR T cells. Long-term expression and activity were observed in GSH6-CAR T cells after 3 weeks, with and without antigenic stimulation, further emphasizing the ability of GSH6 to support long-term transgene expression (supplemental Figure 3C-F). A nonsignaling transgene (humanized Renilla reniformis GFP) integrated at GSH6 similarly showed sustained transgene expression for 2 weeks in culture (supplemental Figure 4).
In vivo antitumor efficacy depends on eGSH-mediated maintenance of CAR expression
We proceeded to functionally test 3 representative GSH-CAR T cells (GSH 1, 4, and 6, with and without the C1 insulator element) in a pre–B-acute lymphoblastic leukemia NALM6 mouse model. GSH-CAR T cells were assessed under stress-test conditions,42 in which CAR T-cell dosing is lowered to test the functional limits of the CAR T cells (Figure 3A). GSH1 and GSH4±C1-CAR T cells only transiently controlled tumor burden before relapse (Figure 3B-C; supplemental Figure 5). Consistent with the improved CAR expression afforded by the C1 element at GSH1 (Figure 2E), GSH1+C1-CAR T cells displayed improved functional capacity relative to that of GSH1-CAR T cells (Figure 3B-C). GSH6-CAR T cells, with or without the C1 element, displayed equivalent potency to TRAC-CAR T cells, mediating long-term tumor control (Figure 3B-C), consistent with their sustained CAR expression in vitro (Figure 2E).
CAR expression and T-cell differentiation and exhaustion markers were examined in T cells retrieved from bone marrow (BM) 10 days after infusion (supplemental Figures 6, 7, and 8A). GSH6±C1-CAR T cells were the most abundant (Figure 3D; supplemental Figure 7A-B). The C1 insulator significantly improved CAR T-cell numbers at GSH1 but not at GSH4. GSH1-CAR T cells displayed an overall naïve phenotype, consistent with their low CAR expression and limited target engagement, whereas GSH6±C1-CAR T cells exhibited differentiation and exhaustion profiles overall similar to those of TRAC-CAR T cells (supplemental Figure 7C-F). When CAR T cells were freshly retrieved from BM and cocultured with CD19+ NALM6, GSH6-CAR T cells, with or without C1, showed upregulation in CAR expression upon exposure to CD19 and lysed NALM6 cells most effectively (Figure 3E; supplemental Figure 8A). GSH1+C1, GSH4, and GSH4+C1-CAR T cells also showed upregulation in CAR expression but to a lesser extent than that of GSH6-CAR T cells, in line with their lesser lytic activity. Taken together, our observations suggest that the inability to sustain CAR expression at GSH1 and GSH4 accounted for the failure to control NALM6 leukemia. GSH6-CAR T cells, however, sustained CAR expression and prevailed over NALM6.
GSH6 supports long-term tumor control at low T-cell dose and upon multiple rechallenges
CAR expression at GSH6 was maintained upon re-exposure to antigen and over a span of time, whether in the presence or absence of C1, in vitro and in vivo (Figure 2E; supplemental Figure 8A). These remarkable attributes prompted us to closely compare GSH6-CAR T cells to TRAC-CAR T cells.18,33,35-38 Monitoring CAR expression after exposure to CD19 showed an initial dip, as expected from CAR internalization,18 followed by the recovery of cell-surface expression after 36 hours (Figure 4A-B). In contrast to the consistent high levels of CAR expression observed with the TRAC-EF1α CAR, CAR expression at GSH6 returned to a baseline level, similar to that with TRAC-CAR.18
GSH6-CAR T cells were further compared with TRAC-CAR T cells in vivo, lowering the CAR T-cell dose to 2 × 10⁵, 1 × 10⁵, and 5 × 10⁴. GSH6±C1-CAR T cells effectively controlled tumor burden at all T-cell doses, equally to TRAC-CAR T cells but for a slight kinetic difference (Figure 4C; supplemental Figure 8B). To further pressure-test GSH6-CAR T cells, we rechallenged mice treated with 4 × 10⁵ GSH6-CAR T cells with CD19+ NALM6 cells administered at 10-day intervals. GSH6-CAR T cells completely controlled every challenge, matching TRAC-CAR T cells (Figure 4D).
Finally, to assess whether GSH6-CAR T cells could control tumor burden long after initial tumor eradication, we rechallenged long-term surviving mice (from Figure 3C) with CD19+ NALM6, 100 days after CAR T-cell infusion. One tumor-free GSH1+C1 mouse surviving at day 100 was similarly rechallenged. The GSH1+C1–CAR T-cell recipient was unable to control the challenge, but 6 of 8 rechallenged GSH6±C1–CAR T-cell treated mice remained tumor free (Figure 4E). BM analyses performed 40 days after rechallenging showed persistence of the adoptively transferred CAR T cells (supplemental Figure 8C). We measured CAR expression in T cells at various differentiation states at the time of infusion, at day 10 after infusion, and at day 40 after rechallenge (or day 140 after infusion in nonrechallenged mice; supplemental Figure 9). These data establish that CAR expression at GSH6 remained stable between days 10 and 140 in vivo.
eGSH architecture and association with function
Low CAR expression or rapid silencing was observed at most eGSHs apart from GSH6 and was associated with treatment failure. We expanded our functional studies to include 4 additional sites, including 2 (GSH20 and GSH30) that were located within or very close to a pseudogene, because GSH6 lies within a pseudogene (Figure 5A). All 4 additional sites comprised at least 1 high intensity ATAC-seq peak and showed high cleavage efficiency (criterion 7; supplemental Figure 10A). Although GSH20 seemed most similar to GSH6 in terms of proximity to endogenous genes and fell within a pseudogene, CAR expression at GSH20 was consistently lower than at GSH6, and GSH20-CAR T cells failed to express the CAR to the same extent as GSH6-CAR T cells upon multiple antigen stimulations (supplemental Figure 10B-C). Cytotoxicity assays and CAR expression showed that GSH 7, 12, and 20 belonged to the intermediate performing eGSH group, whereas GSH30 was among the lowest, similar to GSH 2, 3, and 4 (supplemental Figure 10C-E). These data argue against encoding a CAR within a pseudogene being sufficient to afford sustained CAR expression and CAR T-cell function.
Comparing ATAC-seq peak distribution within 50 kb on either side at all eGSHs, GSH6 presented with the greatest bilateral peak abundance in both activated and resting T cells (Figure 5A; Figure 6). GSH6 also exhibited a higher number of active genes within 200 kb of the integration site. Integration of the EF1α enhancer/promoter driven CAR at GSH6 without the C1 insulator did not alter their expression apart from slightly reducing the number of ZNF767P pseudogene transcripts (Figure 5B-C). On comparing the structural features of all 10 eGSHs, GSH6 was found to distinguish itself in possessing the following attributes: (1) proximity to multiple peaks on both sides of the targeted peak, (2) greater intensity of proximal peaks, (3) presence of proximal and targeted peaks in both resting and activated states, and (4) presence and expression of surrounding genes in the resting as well as the activated state (albeit at a distance, respecting criteria 1-6; Figure 5A, Figure 6, supplemental Table 2). None of the other 9 eGSHs met all 4 attributes.
To further support this proposition, we sought to prospectively test this criterion in a collection of adoptively transferred, clonally traceable T cells, annotated for their vector integration site. To identify integration sites permitting long-term vector expression, we turned to a genetic correction model, in which sustained vector expression is required for long-term T-cell persistence and vector silencing would predictably result in T-cell loss. To this end, we investigated a data set generated in patients with X-linked severe combined immunodeficiency (SCID-X1) treated with a γ-retroviral vector encoding the interleukin receptor common γ-chain, which is required to restore T-cell responsiveness to γ-chain cytokines and for T-cell survival.43 We thus assessed the distributions of eGSH integrations in patients who underwent SCID-X1 gene therapy44 under the hypothesis that vector integrations in eGSH regions that persisted continued to express the transgene product and would be enriched over time for those bearing proximity to ATAC-seq peaks relative to their initial preinfusion distribution.
We thus assessed the enrichment of integration sites that fall within eGSHs and that have ATAC-seq peaks within specified distances from the integration site, comparing their relative abundance over time (median follow-up = 1 year). Enrichment (fold change) was observed to be highest when the eGSH integration site was located within 50 kb of an activated or resting ATAC-seq peak (supplemental Table 3; Table 1). Location beyond these distances from the ATAC-seq peak was associated with a reduced degree of enrichment (Table 1). Presence of ATAC-seq peaks on both sides of the integration site was associated with the greatest enrichment. Enrichment was also greater (fold change, 2.64; P = 1.69 × 10⁻¹³; Table 1, section “eGSHs with ATAC-seq peaks both upstream and downstream within specified distances from integration site”) for eGSH integration sites that were proximal to resting ATAC-seq peaks compared with the activated-only ATAC-seq peaks that were within 50 kb. These findings are thus consistent with observations in our 10 targeted eGSHs (Figure 6) and further support that proximity to ATAC-seq peaks in the resting cell state is important for sustaining vector activity over time. Based on these findings, we formulated a final, seventh structural criterion for the identification of functional extragenic GSHs (Table 2), ie, “Location within an ATAC-seq peak in the activated state and proximity to ATAC-seq peaks (<50 kb) and active genes in resting T-cell state (<200 kb),” fulfilling the 2 functional goals we set out to address for the identification of functional extragenic GSHs: efficient cleavability and reliable transgene expression and regulation.
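The enrichment comparison described above (fold change between preinfusion and later time points, tested with a 2-tailed Fisher exact test) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' analysis code; the function names are hypothetical, and the counts in the usage example are the "Activated" row of the first section of Table 1 (14 of 165 sites at T0, 423 of 1430 at Tl). The two-sided P value is computed with the common convention of summing all tables that are no more probable than the observed one, which is an assumption about the exact method used.

```python
from math import comb


def fold_change(hits_t0, n_t0, hits_tl, n_tl):
    """Fraction of sites with the feature at Tl divided by the fraction at T0."""
    return (hits_tl / n_tl) / (hits_t0 / n_t0)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are time points (T0, Tl); columns are sites with / without the feature.
    Sums the hypergeometric probabilities of every table with the same margins
    that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 feature-positive sites fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that ties with the observed table are counted.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# "Activated" row, first section of Table 1.
fc = fold_change(14, 165, 423, 1430)
p = fisher_exact_two_sided(14, 165 - 14, 423, 1430 - 423)
print(f"fold change = {fc:.2f}, P = {p:.2e}")
```

The fold change computed from raw counts can differ in the last digit from the tabulated value, which appears to be derived from rounded percentages. For routine use, a maintained statistics library would be preferable to this hand-written test.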
Table 1
eGSHs located within ATAC-seq peaks or within 5 kb of ATAC-seq peak boundary∗
Condition | T0 (165), No. | T0 (165), % | Tl (1430), No. | Tl (1430), % | Fold change | P value |
---|---|---|---|---|---|---|
Activated | 14 | 8.50 | 423 | 29.60 | 3.48 | 5.14E-10 |
Resting | 36 | 21.80 | 733 | 51.30 | 2.35 | 2.54E-13 |
eGSHs with ATAC-seq peaks both upstream and downstream within specified distances from integration site†
Condition_distance | T0 (165), No. | T0 (165), % | Tl (1430), No. | Tl (1430), % | Fold change | P value |
---|---|---|---|---|---|---|
ACT_50 kb | 16 | 9.7 | 244 | 17.1 | 1.76 | 1.41E-02 |
ACT_100 kb | 26 | 15.8 | 452 | 31.6 | 2.01 | 1.39E-05 |
ACT_200 kb | 32 | 19.4 | 522 | 36.5 | 1.88 | 6.30E-06 |
REST_50 kb | 29 | 17.6 | 664 | 46.4 | 2.64 | 1.69E-13 |
REST_100 kb | 41 | 24.8 | 752 | 52.6 | 2.12 | 6.99E-12 |
REST_200 kb | 60 | 36.4 | 843 | 59.0 | 1.62 | 4.81E-08 |
eGSHs with ATAC-seq peaks only upstream within specified distances from integration site‡
Condition_distance | T0 (165), No. | T0 (165), % | Tl (1430), No. | Tl (1430), % | Fold change | P value |
---|---|---|---|---|---|---|
ACT_50 kb | 12 | 7.3 | 94 | 6.57 | 0.90 | 7.41E-01 |
ACT_100 kb | 16 | 9.7 | 116 | 8.11 | 0.84 | 4.57E-01 |
ACT_200 kb | 26 | 15.8 | 176 | 12.31 | 0.78 | 2.16E-01 |
REST_50 kb | 13 | 7.9 | 91 | 6.36 | 0.81 | 4.09E-01 |
REST_100 kb | 18 | 10.9 | 143 | 10.00 | 0.92 | 6.83E-01 |
REST_200 kb | 25 | 15.2 | 188 | 13.15 | 0.87 | 4.69E-01 |
eGSHs with ATAC-seq peaks only downstream within specified distances from integration site§
Condition_distance | T0 (165), No. | T0 (165), % | Tl (1430), No. | Tl (1430), % | Fold change | P value |
---|---|---|---|---|---|---|
ACT_50 kb | 8 | 4.85 | 241 | 16.85 | 3.48 | 1.23E-05 |
ACT_100 kb | 12 | 7.27 | 74 | 5.17 | 0.71 | 2.72E-01 |
ACT_200 kb | 20 | 12.12 | 148 | 10.35 | 0.85 | 5.02E-01 |
REST_50 kb | 23 | 13.94 | 130 | 9.09 | 0.65 | 5.07E-02 |
REST_100 kb | 23 | 13.94 | 73 | 5.10 | 0.37 | 7.33E-05 |
REST_200 kb | 21 | 12.73 | 124 | 8.67 | 0.68 | 8.74E-02 |
Table shows the number of eGSH insertion sites that have ATAC-seq peaks within the distances and direction specified in the table headers. Upstream and downstream are determined in relation to the direction of the integrated transgene. A 2-tailed Fisher exact test is used to compute the P value for the enrichment between time zero (before infusion, T0) and later (Tl) time points.
ACT, ATAC-seq peaks in the activated T-cell data set; REST, ATAC-seq peaks in the resting T-cell data set within the specified distances.
∗Number of eGSH integration sites that are located in an ATAC-seq peak or within 5 kb of an ATAC-seq peak boundary.
†Number of eGSH integration sites with at least 1 ATAC-seq peak upstream and at least 1 ATAC-seq peak downstream within 50 kb, 100 kb, or 200 kb of the integration site.
‡Number of eGSH integration sites with at least 1 ATAC-seq peak upstream and no peaks downstream within 50 kb, 100 kb, or 200 kb of the integration site.
§Number of eGSH integration sites with at least 1 ATAC-seq peak downstream and no peaks upstream within 50 kb, 100 kb, or 200 kb of the integration site.
Table 2
No. | Criterion |
---|---|
1 | Distance of >50 kb from 5′ end of any gene |
2 | Distance of >300 kb from any cancer-related genes |
3 | Distance of >300 kb from any miRNA |
4 | Outside a gene transcription unit |
5 | Outside of ultraconserved regions |
6 | Outside of noncoding RNAs |
7 | Within ATAC-seq peak in activated state and proximity to ATAC-seq peaks (<50 kb) and active genes in resting state (<200 kb) |
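The 7 criteria in Table 2 can be read as a pass/fail checklist applied to each candidate site. The following Python sketch shows one way to encode that checklist; it assumes that all distances and overlaps have already been computed by an upstream annotation step, and the `SiteAnnotation` class, its field names, and the example values are hypothetical. Only the thresholds (50 kb, 300 kb, 200 kb) come from the table.

```python
from dataclasses import dataclass

KB = 1_000


@dataclass(frozen=True)
class SiteAnnotation:
    """Hypothetical precomputed annotations for one candidate site (distances in bp)."""
    dist_to_gene_5prime: int
    dist_to_cancer_gene: int
    dist_to_mirna: int
    in_transcription_unit: bool
    in_ultraconserved_region: bool
    in_noncoding_rna: bool
    in_activated_atac_peak: bool
    dist_to_resting_atac_peak: int
    dist_to_resting_active_gene: int


def failed_criteria(site):
    """Return the numbers (1-7) of the Table 2 criteria that the site fails."""
    checks = {
        1: site.dist_to_gene_5prime > 50 * KB,
        2: site.dist_to_cancer_gene > 300 * KB,
        3: site.dist_to_mirna > 300 * KB,
        4: not site.in_transcription_unit,
        5: not site.in_ultraconserved_region,
        6: not site.in_noncoding_rna,
        7: (site.in_activated_atac_peak
            and site.dist_to_resting_atac_peak < 50 * KB
            and site.dist_to_resting_active_gene < 200 * KB),
    }
    return [number for number, passed in checks.items() if not passed]


# Invented example values, for illustration only.
passing_site = SiteAnnotation(
    dist_to_gene_5prime=80 * KB, dist_to_cancer_gene=400 * KB,
    dist_to_mirna=350 * KB, in_transcription_unit=False,
    in_ultraconserved_region=False, in_noncoding_rna=False,
    in_activated_atac_peak=True, dist_to_resting_atac_peak=10 * KB,
    dist_to_resting_active_gene=120 * KB,
)
failing_site = SiteAnnotation(
    dist_to_gene_5prime=30 * KB, dist_to_cancer_gene=400 * KB,
    dist_to_mirna=350 * KB, in_transcription_unit=False,
    in_ultraconserved_region=False, in_noncoding_rna=False,
    in_activated_atac_peak=True, dist_to_resting_atac_peak=90 * KB,
    dist_to_resting_active_gene=120 * KB,
)
print(failed_criteria(passing_site))
print(failed_criteria(failing_site))
```

Returning the list of failed criteria, rather than a single boolean, makes it easier to see why a given site was rejected.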
Discussion
Therapeutic cell engineering based on transgene integration requires the identification of safe genomic sites that afford dependable transgene expression. To achieve this goal, one may elect to target gene loci that provide desirable transcriptional regulation, eg, the TRAC locus to express CARs18 or loci encoding genes deemed to be dispensable, eg, the adeno-associated virus site 1 (references 23 and 45). As an alternative, we sought to identify extragenic regions that did not disrupt conserved genetic elements but still afforded sustained transgene expression. Reviewing extensive insertional mutagenesis data accumulated in a number of clinical trials that used γ-retroviral and lentiviral vectors,6,11-15 we previously proposed safety criteria for identifying suitable integration sites in clonal stem cells25 (Figure 1A; criteria 1-5), to which we have now added 2 criteria for averting the disruption of noncoding RNAs and favoring nuclease accessibility and transgene expression (Table 2; criteria 6-7). Here, we demonstrate the feasibility of prospectively targeting such extragenic genomic regions to support CAR expression and afford long-term and potent antitumor T-cell efficacy.
Recent studies have established the impact of chromatin accessibility in determining cleavage activity in the context of a living cell.46-52 We, therefore, hypothesized that Cas9 would efficiently bind and cleave candidate eGSH presenting with high ATAC-seq peak signal intensity. Adding this requirement to our 6 safety criteria reduced the number of candidate eGSHs in human primary T cells to 379. All 10 sites that we tested at the ATAC-seq peak summit were efficiently cleaved by CRISPR/Cas9 (Figure 1C; supplemental Figure 10A). In addition, gRNAs located anywhere within ATAC-seq peak boundaries afforded equally high cleavage (supplemental Figure 2B-C), allowing for flexibility in selecting gRNAs with optimal off-target profiles for eventual clinical use. However, upon integration of a CAR transgene, CAR expression varied among sites (Figure 2E; supplemental Figure 10C). Although all targeted sites initially expressed the CAR, only 1 (GSH6) maintained expression after 2 weekly exposures to antigen (Figure 2E). The other sites silenced within days or weeks. The incorporation of a chromatin insulator element with barrier function39 partially rescued CAR expression at some sites but not others (Figure 2E), which is consistent with earlier studies describing the context dependence of chicken β-globin 5' DNase I-hypersensitive site (cHS4) insulator activity.40,41 At GSH6, flanking the CAR transcription unit with the barrier element slightly diminished CAR expression in vitro and CAR T-cell expansion in vivo (Figures 2E and 3D).
Our study highlights the profound impact of integration site and variegated gene expression on CAR T-cell function. Approaches based on the use of retroviral vectors (γ-retroviral, lentiviral, or spumaviral) and DNA transposons are all subject to position effects.7 These approaches also carry the risk of gene disruption, as shown with CAR lentiviral vectors integrated at the TET2 or CBL-B loci.16,17 As anticipated from the eGSH criteria, integration of the CAR transcription unit at GSH6 including a strong enhancer/promoter did not perturb expression of endogenous genes within 150 kb on either side, in resting or in activated T cells (Figure 5B-C). GSH6 is located within a pseudogene, ZNF767P, which is transcribed but is noncoding and lacks any known function. ZNF767P RNA is expressed at very low levels in T cells and its expression is further slightly reduced upon integration of the CAR transgene without the insulator. Incorporation of the insulator resulted in greater alteration of expression of surrounding genes upon activation, possibly as a consequence of insulator-mediated forced chromatin looping.53
When expressed at GSH6, the CD19 CAR proved to be as effective as the TRAC-encoded CAR under stringent stress-test conditions.33 Indeed, after low-dose CAR T-cell infusion, GSH6-CAR-treated mice were protected from 5 consecutive rechallenges with tumor cells and even a late rechallenge 100 days after the initial, single CAR T-cell infusion. CAR expression at GSH6 with and without antigen exposure remained in a relatively narrow range and at a moderate expression level, unlike the less optimal patterns of retroviral long terminal repeats or the EF1α promoter, which predispose to tonic signaling18 (Figure 4A-B). Kinetic analysis of CAR expression, reflecting the function of the EF1α enhancer/promoter and the bovine growth hormone polyadenylation signal within the GSH6 chromosomal context, revealed that although the time to recovery of expression after antigen stimulation is similar to that of TRAC-EF1α-CAR, CAR expression at GSH6 did not remain elevated, as observed at the TRAC locus, but rather promptly returned to baseline levels. Upon re-exposure to antigen, CAR expression was upregulated and allowed CAR-induced killing (supplemental Figure 8A), potentially avoiding the tonic signaling and premature exhaustion brought on by consistently high levels of expression.18 Although both TRAC-CAR and GSH6-CAR T cells performed in a similar manner in the in vivo therapeutic model, they differ fundamentally in that TRAC-CAR T cells lack a T-cell receptor, whereas GSH6-CAR T cells retain it, which may be beneficial to increase in vivo expansion of CAR T cells in some settings.54-56 Other eGSHs that show short-term activity may be useful if extinction of gene expression within a certain time frame is desirable, harnessing chromatin regulation for the temporal control of CARs, T-cell receptors, and other therapeutic molecules.
Our initial hypothesis was that accessible eGSHs would display efficient transgene integration and expression, forming the basis for criteria 7 and 8 (Figure 1A). However, although cleavage efficiency was high as expected from high accessibility at all eGSHs, integration efficiency varied among sites, and further research is required to identify other factors that affect homologous recombination efficiency. In terms of transgene expression, the surrounding ATAC-seq peaks and gene expression profiles in resting and activated T cells provide some insights into what may constitute a more favorable site for sustained expression in T cells. Active peaks in both resting and activated T-cell states were found at GSH6 (Figure 6). A combination of factors affecting three-dimensional chromatin architecture likely plays an important role in regulating transgene expression from eGSHs, including distance and DNA scaffolding in different cell states. Although further analyses are needed, our data for 10 eGSHs and the SCID-X1 clinical trial data indicate that 2 features make dependable transgene expression from an eGSH more likely: ATAC-seq peaks in close proximity (≤50 kb from the integration site), especially in the resting T-cell state and on both sides of the integration site; and active genes in the resting T-cell state within a broader vicinity (∼200 kb), also on both sides of the integration site. These features can serve as criteria for selecting dependable eGSHs for therapeutic cell engineering.
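The two proximity features described above can be expressed as a simple filter over candidate sites. The following Python sketch is illustrative only and is not the analysis code used in this study. The names `Interval`, `has_flanking_features`, and `is_dependable_candidate` are hypothetical, coordinates are assumed to be half-open as in BED files, and the 200 kb window is an assumed hard cutoff standing in for the approximate "∼200 kb" vicinity. Features that span the integration site itself are ignored, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable

# Window sizes taken from the text; treating "~200 kb" as a hard cutoff is an assumption.
PEAK_WINDOW_BP = 50_000
GENE_WINDOW_BP = 200_000


@dataclass(frozen=True)
class Interval:
    """A genomic feature (ATAC-seq peak or active gene), half-open [start, end)."""
    chrom: str
    start: int
    end: int


def has_flanking_features(chrom: str, pos: int,
                          features: Iterable[Interval], window: int) -> bool:
    """Return True if at least one feature lies within `window` bp on EACH side of pos."""
    left = right = False
    for f in features:
        if f.chrom != chrom:
            continue
        if f.end <= pos and pos - f.end <= window:
            left = True   # feature ends upstream of the site, close enough
        elif f.start > pos and f.start - pos <= window:
            right = True  # feature starts downstream of the site, close enough
        # Features spanning the site are ignored (assumption).
    return left and right


def is_dependable_candidate(chrom: str, pos: int,
                            resting_peaks: Iterable[Interval],
                            resting_active_genes: Iterable[Interval]) -> bool:
    """Apply both heuristics using resting T-cell peaks and active genes."""
    return (has_flanking_features(chrom, pos, resting_peaks, PEAK_WINDOW_BP)
            and has_flanking_features(chrom, pos, resting_active_genes, GENE_WINDOW_BP))
```

In practice, the peak and gene lists would come from resting T-cell ATAC-seq and RNA-seq data sets. The filter only encodes the proximity heuristics stated above and says nothing about integration efficiency, which varied among sites for reasons that remain to be identified.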
T-cell engineering at GSH6 is promising, given the sustained in vivo expression we observed over 100 days and the lack of perturbation of surrounding gene expression. In addition, GSH6 is located away from the centromere and telomeres, which, in principle, reduces the risk of aneuploidy induced via CRISPR/Cas9 editing.57 Predictable transgene expression is essential for the regulation of a variety of immunotherapeutic molecules other than CARs, including cytokines, chemokines, single-chain variable fragments, bispecific T-cell engagers, and antibodies. Our approach for identifying dependable extragenic genomic regions for foreign DNA integration is applicable in principle to other clinically relevant cell types. We anticipate that the identification of eGSHs based on the criteria we propose here will expand the functional human genome for the development of safe cell therapies based on precision engineering.
Acknowledgments
The authors thank George Stamatoyannopoulos for providing the sequence of the C1 insulator. The authors thank Maria Lemdal-Sjöstrand and Archana Iyer for the RNA-seq data set and Anton Dobrin, Elisa De Stanchina, Kvin Lertpiriyapong, Alessandra Piersigilli, and Pallavi Vedantam for their support in performing in vivo experiments. The authors thank Lee Zamparo for his support in analyzing the resting T-cell ATAC-seq data set. The authors thank the following Memorial Sloan Kettering Cancer Center (MSKCC) core facilities for their excellent support: the Cell Therapy and Cell Engineering Facility (CTCEF), the SKI Flow Cytometry core facility, the animal facility, the animal imaging core, the laboratory for comparative pathology, the bioinformatics core, and the integrated genomics operation core. A part of the visual abstract was created using BioRender.com.
This work was partially supported by MSKCC core grant P30 CA008748.
Authorship
Contribution: A.O. designed the study, performed experiments, analyzed and interpreted the data, and wrote the manuscript; H.Y. performed the computational analysis for all ATAC-seq data; J.F. helped design and perform experiments; J.M.-S., F.K. and J. Eyquem helped to design and perform in vitro and in vivo experiments; V.A.C. and J. Everett performed analysis of SCID-X1 clinical trial data; F.D.B. assisted in the generation of the list of oncogenes used for identification of eGSH and analysis of SCID-X1 clinical trial data; C.S.L. designed, analyzed, and interpreted the computational analysis; M.S. designed the study, analyzed and interpreted the data, and wrote the manuscript.
Conflict-of-interest disclosure: A.O. and M.S. have submitted a patent application partially based on data reported in this manuscript. J.F., J. Eyquem, J.M.-S., and M.S. are named inventors on unrelated patent applications in the field of T-cell engineering. Patent applications are submitted by MSKCC. The remaining authors declare no competing financial interests.
The current affiliation for A.O. is Strand Therapeutics Inc, Boston, MA.
The current affiliation for H.Y. is Calico Life Sciences, San Francisco, CA.
The current affiliation for J.F. is Cluster of Excellence iFIT, University Children's Hospital Tübingen, Tübingen, Germany.
The current affiliation for J. Eyquem is University of California San Francisco, Department of Medicine, Division of Hemato-Oncology, San Francisco, CA.
Correspondence: Michel Sadelain, Center for Cell Engineering and Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065; e-mail: m-sadelain@ski.mskcc.org.
References
Author notes
Additional data are available on request from the corresponding author, Michel Sadelain (m-sadelain@ski.mskcc.org).
The online version of this article contains a data supplement.
There is a Blood Commentary on this article in this issue.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.