Development of orthogonal base editors based on engineered glycosylases
Encouraged by the development of gGBE in our previous study3, we attempted to develop thymine and cytosine base editors using the deaminase-free glycosylase-based strategy. Since the three pyrimidine bases (i.e., T, C, and U) are structurally similar, we speculated that excision of canonical T or C could be achieved by engineering a uracil DNA glycosylase (UNG). The excision of T or C would generate apurinic/apyrimidinic (AP) sites, which then trigger the base excision repair (BER) pathway and facilitate direct T editing or C editing (Fig. 1a, b). Alternative splicing together with transcription from two distinct start sites produces two human UNG isoforms, the mitochondrial UNG1 (304 amino acids, aa) and the nuclear UNG2 (313 aa), each possessing a unique N-terminus that mediates translocation to the mitochondria or the nucleus, respectively16 (Supplementary Fig. 1). Two human UNG1 variants, UNG1-Y147A and UNG1-N204D, have been engineered to excise T and C in DNA, respectively17. Y156A and N213D of UNG2 are equivalent to Y147A and N204D of UNG1, respectively. To edit nuclear DNA, we generated two prototype gBEs, a deaminase-free glycosylase-based thymine base editor (gTBE) and a deaminase-free glycosylase-based cytosine base editor (gCBE), by fusing UNG2-Y156A and UNG2-N213D to the C-terminus of Cas9 D10A nickase (nCas9), respectively (Fig. 1a, c). We developed a T-to-G reporter and a C-to-G reporter, two intron-split EGFP reporter systems as reported previously9, to evaluate the editing activity of gTBE and gCBE, respectively (Supplementary Fig. 2a). In these reporters, the AG-to-AT or AG-to-AC inactive splicing acceptor (SA) could only be remediated by T-to-G or C-to-G conversion, which restores correct splicing of the EGFP-coding sequence and activates EGFP (Supplementary Fig. 2b). The gBE vectors were co-transfected with the T-to-G or C-to-G reporter vector containing the single-guide RNA (sgRNA) that targets the corresponding mis-splicing mutation.
We found that gTBE with UNG2-Y156A (hereafter referred to as gTBEv0.1) showed slight T-to-G conversion activity, and gCBE with UNG2-N213D (hereafter referred to as gCBEv0.1) showed slight C-to-G conversion activity (Fig. 1c–e).
Given that the disordered N-terminal domain (NTD) of UNG contains protein-binding motifs and sites for post-translational modifications18, which might constrain targeted excision activity of the glycosylase domain in ssDNA19,20, we constructed UNG-NTD-truncated gTBE and gCBE versions with UNG2Δ88 (truncation of amino acids 1-88 of UNG2) variants (Fig. 1c) to eliminate undesired protein-protein interactions20,21,22. The gTBEv0.2 with UNG2Δ88-Y156A fused at the C-terminus exhibited T-to-G conversion activity comparable to that of gTBEv0.1 (1.0% vs. 1.1%, Fig. 1d), while gCBEv0.2 with UNG2Δ88-N213D fused at the C-terminus increased the C-to-G conversion activity compared with gCBEv0.1 (13.3% vs. 1.0%, Fig. 1e). Moreover, gTBEv0.3 with UNG2Δ88-Y156A and gCBEv0.3 with UNG2Δ88-N213D fused at the N-terminus showed much higher editing activity than the corresponding C-terminal fusions (10.2% vs. 1.0%, and 51.4% vs. 13.3%, Fig. 1c–e), a 10- and 3.9-fold enhancement in editing efficiency, respectively. No editing activity was found for any of the above-mentioned versions of gTBE and gCBE when combined with the non-targeting sgRNA (Fig. 1d, e). In addition, gTBEv0.3 exhibited the highest T-to-G editing activity among various UNG-NTD-truncated versions of gTBE (Supplementary Fig. 3).
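The fold enhancements quoted above follow directly from the reported percentages. As a minimal arithmetic sketch (the dictionary layout and the helper name `fold_change` are illustrative assumptions, not part of the study's analysis pipeline):

```python
# Percent editing values quoted in the text for C-terminal (v0.2)
# and N-terminal (v0.3) fusions of the truncated UNG2 variants.
reported = {
    "gTBE": {"v0.2": 1.0, "v0.3": 10.2},
    "gCBE": {"v0.2": 13.3, "v0.3": 51.4},
}


def fold_change(after: float, before: float) -> float:
    """Ratio of two editing efficiencies given in percent."""
    if before <= 0:
        raise ValueError("baseline efficiency must be positive")
    return after / before


# Fold enhancement of the N-terminal fusion over the C-terminal fusion.
folds = {
    editor: fold_change(values["v0.3"], values["v0.2"])
    for editor, values in reported.items()
}

for editor, fold in folds.items():
    print(f"{editor}: {fold:.1f}-fold")
```

With these inputs the ratios come out at roughly 10.2 and 3.9, in line with the 10- and 3.9-fold figures stated in the text.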
Furthermore, we examined the orthogonality of gTBE and gCBE for base editing. Although engineered from the same original glycosylase UNG, no C editing activity was found for gTBEv0.3 and no T editing activity was found for gCBEv0.3 (Fig. 1f). Thus, we developed two orthogonal base editors, gTBE for direct T editing and gCBE for C editing.
Evolution of gTBE with enhanced editing activity
To further increase the T-to-G activity of gTBEv0.3, we performed rational mutagenesis of the UNG moiety, using the T-to-G reporter to evaluate the editing activity in cultured mammalian cells (HEK293T) (Fig. 2a). Based on structural and functional analysis, WT UNG contains five conserved motifs required for efficient glycosylase activity: the catalytic water-activating loop, the proline-rich loop, the uracil-binding motif, the glycine-serine motif and the leucine loop23,24,25 (Supplementary Fig. 1b). Since Y156 in the catalytic water-activating loop and N213 in the uracil-binding motif are critical for switching activity from U excision to T or C excision, we first selected sequential and spatial neighbors of these two residues and examined their roles in the regulation of base excision activity (Fig. 2a, b). We conducted alanine-scanning mutagenesis by replacing all non-alanine residues with alanine (X > A) and alanine with valine (A > V) to cover all the residues in the regions I150-L179 and L210-T217. Interestingly, we obtained a variant, gTBEv1.1 (v0.3 with A214V), that elevated the T-to-G conversion activity by 2.68-fold (Supplementary Fig. 4a). To check whether any amino acid at position 214 performs better than valine, we further performed site-saturation mutagenesis at this position. We obtained gTBEv1.2 (v0.3 with A214T), whose editing efficiency was 1.06-fold that of gTBEv1.1 (Supplementary Fig. 4b).
Then, we examined the spatial neighbors of residue T214, near the Gly-Ser loop that compresses the DNA backbone 3′ to the lesion (Fig. 2b), and obtained variant gTBEv1.3 (v0.3 with Q259A), which increased the editing efficiency by 1.46-fold (Supplementary Fig. 4c). Furthermore, we found a synergistic enhancement of T-to-G editing activity in variant gTBEv2 (v0.3 with the combination of A214T and Q259A), by 2.7-fold in comparison with the T editing activity of gTBEv0.3 (Fig. 2c). We also scanned residues in the region Q274-Y284, in or near the Leu-intercalation loop, by sequential replacement with amino acids of distinct properties, including arginine (positively charged side chain), aspartic acid (negatively charged side chain), or valine (small hydrophobic side chain) (X > R, D, or V). Although most of these mutations reduced the T editing activity, we found that a variant, gTBEv3 (v2 with Y284D), showed an editing efficiency 1.22-fold that of gTBEv2 (Supplementary Fig. 5) and 3.09-fold that of gTBEv0.3 (Fig. 2c).
We validated the improvement of T editing activity by different gTBE variants at one endogenous genomic site in HEK293T cells. After transfection with all-in-one constructs encoding each gTBE variant, together with an sgRNA targeting site 9 in the CLYBL gene and mCherry for fluorescence-activated cell sorting (FACS), mCherry-positive cells were sorted. Through targeted deep sequencing analysis, we observed a gradual increase in overall T editing efficiency at T5 from 26.9% for gTBEv1.1 to 67.4% for gTBEv3, as well as in insertions and deletions (indels, from 3.6% to 13.3%), with T-to-S (i.e., T-to-C or T-to-G; S = C or G base) conversions as the predominant events at this site (Fig. 2d). These results indicate that the rounds of mutagenesis described above effectively optimized gTBE activity for T-to-C and T-to-G base editing. Thus, gTBEv3 (carrying the Y156A, A214T, Q259A and Y284D mutations) had the highest T editing efficiency and was used for the following studies.
Characterization of gTBEv3 at human genomic DNA sites
We further characterized the editing profiles of gTBEv3 by targeting 20 endogenous genomic loci, most of which were used in previous base editing studies11,12,26,27. We found that gTBEv3 achieved efficient T base editing (ranging from 24.3% to 81.5%; Fig. 3a and Supplementary Fig. 6a, b), but essentially no A, C or G editing at any of the examined sites (Supplementary Fig. 6c–e). T-to-C and T-to-G conversions were the predominant events (Supplementary Fig. 6f–h), and only a low percentage of T-to-A conversions was detected (Fig. 3a and Supplementary Fig. 6i), consistent with previous findings of gGBE3, AYBE9 and CGBEs11,12,13,14,15. The ratios of T-to-S to total T conversion ranged from 0.68 to 0.97 (without indels, Fig. 3b) and from 0.41 to 0.92 (with indels, Supplementary Fig. 6j). We found that gTBEv3 also induced indels with frequencies ranging from 5.2% to 45.2% at the 20 edited sites (Fig. 3c). Furthermore, the editable range of gTBEv3 was positions 2 to 11, and the optimal editing window with high efficiency of T conversion covered protospacer positions 3 to 7, with the highest editing efficiency at position 5 (Supplementary Fig. 6b). We found no obvious motif preference for T conversions with gTBEv3 by analyzing the on-target editing and sequences of all tested sites (Supplementary Fig. 6k).
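To make the window description concrete, the following minimal sketch sorts each T in a protospacer into the optimal window (positions 3 to 7), the wider editable range (positions 2 to 11), or neither. It assumes the common convention that position 1 is the PAM-distal end of the protospacer; the function names and the example sequence are hypothetical and not taken from the study.

```python
def thymine_positions(protospacer: str) -> list[int]:
    """1-based positions of T in a 5'->3' protospacer (position 1 = PAM-distal end, assumed)."""
    return [i for i, base in enumerate(protospacer.upper(), start=1) if base == "T"]


def classify_thymines(protospacer: str, editable=(2, 11), optimal=(3, 7)) -> dict[str, list[int]]:
    """Sort each T into the optimal window, the wider editable range, or outside both."""
    result = {"optimal": [], "editable": [], "outside": []}
    for pos in thymine_positions(protospacer):
        if optimal[0] <= pos <= optimal[1]:
            result["optimal"].append(pos)
        elif editable[0] <= pos <= editable[1]:
            result["editable"].append(pos)
        else:
            result["outside"].append(pos)
    return result


# Hypothetical 20-nt protospacer, used only to illustrate the bookkeeping.
example = "GATCTGACCTTAGGACTGCA"
windows = classify_thymines(example)
print(windows)
```

For this example sequence, the Ts at positions 3 and 5 fall in the optimal window, those at positions 10 and 11 fall in the editable range only, and the T at position 17 lies outside both.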
We analyzed the off-target activity of gTBEv3 at several in silico-predicted28 guide-dependent off-target sites, and characterized the ability of gTBEv3 to mediate guide-independent off-target DNA editing using the orthogonal R-loop assay at five previously reported dSaCas9 R-loops9,29. We found a very low percentage of editing at all the guide-dependent off-target loci (Fig. 3d, e and Supplementary Fig. 7) and detected very low editing frequencies (1.1% on average) at all five guide-independent off-target sites (Fig. 3f). Taken together, gTBEv3 represents a highly efficient T-to-S base editor with low off-target effects in mammalian cells.
Enhancement of C editing activity of gCBE
To examine whether the mutations that emerged from the engineering of gTBE would also enhance gCBE activity, we generated gCBEv1.1 by introducing A214V into gCBEv0.3 (Fig. 4a). We found that gCBEv1.1 elevated the C-to-G conversion activity by 1.34-fold when evaluated using the C-to-G reporter (Supplementary Fig. 8a). We conducted alanine-scanning mutagenesis on the fragment D154-D189 to examine its role in the regulation of base excision activity, and obtained a variant, gCBEv1.2 (v0.3 with K184A), that elevated the C-to-G conversion activity by 1.55-fold (Supplementary Fig. 8b). We further investigated the combined effect of A214V and K184A by introducing both mutations in gCBEv2 (carrying the K184A, N213D and A214V mutations), and found synergistic enhancement of C-to-G editing activity by 1.3-fold compared with that of gCBEv0.3 (Fig. 4b). We further validated the improvement of C editing activity for different gCBE variants by targeting an endogenous genomic site, and found a gradual increase in overall C editing efficiency from 18.2% to 37.2% at C2 of site 28 (Supplementary Fig. 9a).
By targeting 16 endogenous genomic loci, we characterized the editing profiles of gCBEv2 and obtained efficient C base editing ranging from 31.8% to 77.7% (Fig. 4c and Supplementary Fig. 9b–d). We found that gCBEv2 induced predominantly C-to-G conversions as well as C-to-T conversions, with the ratio of C-to-G/T to C-to-A/G/T conversions reaching up to 0.97, and very few C-to-A conversions were detected (Fig. 4c, Supplementary Fig. 9e–h). The gCBEv2 induced indels with frequencies ranging from 3.1% to 48.3% at the examined sites (Supplementary Fig. 9i). After analyzing the sequences of all tested sites, we found that the editable range of gCBEv2 was positions 2 to 9 (Supplementary Fig. 9c), and that gCBEv2 edited AC and TC motifs with higher efficiency than other motifs (Supplementary Fig. 9j).
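The product ratio reported above is the share of desired conversions among all base conversions at a position. A minimal sketch of that calculation, using hypothetical frequencies chosen only to illustrate the arithmetic (the function name `product_fraction` and the numbers are assumptions, not data from the study):

```python
def product_fraction(outcomes: dict[str, float], desired: set[str]) -> float:
    """Fraction of all base conversions at a position that yield a desired product base."""
    total = sum(outcomes.values())
    if total <= 0:
        raise ValueError("no conversion events recorded")
    return sum(freq for product, freq in outcomes.items() if product in desired) / total


# Hypothetical percent frequencies of C-to-G, C-to-T and C-to-A at one cytosine.
site = {"G": 40.0, "T": 18.2, "A": 1.8}

purity = product_fraction(site, {"G", "T"})
print(f"C-to-G/T share of all C conversions: {purity:.2f}")
```

Indels are left out of the denominator here; including them, as in the "with indels" ratios reported for gTBEv3, would lower the value.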
When compared to CGBE112, a C-to-G base editor, we found that gCBEv2 showed higher editing activity at certain positions towards the distal end of the target sequence (Fig. 4d and Supplementary Fig. 9c), indicating that the two editors have different optimal editing windows (positions 2 to 6 for gCBEv2 vs. positions 5 to 7 for CGBE112). The gCBEv2 induced fewer indels at site 36, and more indels at site 28 and site 29, than CGBE1 (Supplementary Fig. 9k). Notably, using the orthogonal R-loop assay9,29 mentioned above, we found that gCBEv2 showed editing frequencies comparable to those of CGBE1 at two guide-independent off-target sites, but higher frequencies at the other three sites (Fig. 4e, f and Supplementary Fig. 9l).
Moreover, we found that the gCBEv2 could only facilitate C editing, but there was essentially no T editing at all examined sites (Supplementary Fig. 9c,d). The editing specificity of gCBEv2, together with that of gTBEv3 (Supplementary Fig. 6b–e), consolidated the orthogonality of these two base editors for base editing.
Applications of gTBE and gCBE
We further evaluated the potential applications of gTBE and gCBE. The gTBE could not only remediate inactive splicing signals in the intron-split EGFP reporter systems used above (Figs. 1, 2 and Supplementary Fig. 2), but also be used for exon skipping by disrupting splicing signals at splicing donor (SD) or splicing acceptor (SA) sites (Fig. 5a). After analyzing the splicing sites in 16 well-studied genes for gene and cell therapy research30,31,32, we found that gTBE and gCBE, together with other existing base editors, provide 1904 sgRNA candidates (Supplementary Data 3) with the SD or SA sites located in each optimal editing window (Fig. 5b and Supplementary Fig. 10a). Among the 771 sgRNA candidates for ABE and CBE targeting, 156 and 103 candidates overlapped with those for gGBE and gTBE, respectively (Fig. 5c). Moreover, 232 and 223 sgRNA candidates could only be targeted by gGBE or gTBE, respectively (Fig. 5c). For gCBE, apart from 205 sgRNA candidates that overlapped with those for CBE, there were 148 unique candidates (Supplementary Fig. 10b). The availability of these base editors could largely expand the scope of sgRNA screening for efficient editing at splicing sites (Supplementary Fig. 10). In addition, the developed base editors could be utilized for bypassing premature termination codons (PTCs) and for introducing PTCs (Supplementary Fig. 11). The gTBE and gCBE could provide more versatile codon outcomes from PTC editing (Supplementary Fig. 11b), and introduce PTCs by editing more codons coding for various amino acids (Supplementary Fig. 11d). To potentially disrupt gene function by introducing PTCs, we analyzed and obtained 851 sgRNA candidates (Supplementary Data 4) targeting various codons for PTC introduction in 15 genes with gGBE and CBE, with 191 TACs and 124 TCAs for gGBE targeting (Supplementary Fig. 11e).
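Which codons can be turned into a PTC by a given base conversion can be enumerated directly from the genetic code. The sketch below is a simplified illustration rather than the screening procedure used in the study: it considers only a single substitution written on the coding strand and ignores editing windows, PAM availability and strand choice. The function name is hypothetical.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_edits(source_base: str, product_bases: str) -> dict[str, list[str]]:
    """Map each sense codon to the stop codons reachable by converting one
    occurrence of source_base into any of product_bases (coding strand only)."""
    hits: dict[str, list[str]] = {}
    for codon in ("".join(triplet) for triplet in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for i, base in enumerate(codon):
            if base != source_base:
                continue
            for new_base in product_bases:
                edited = codon[:i] + new_base + codon[i + 1:]
                if edited in STOP_CODONS:
                    hits.setdefault(codon, []).append(edited)
    return hits


c_to_g = stop_creating_edits("C", "G")   # C-to-G on the coding strand
t_to_s = stop_creating_edits("T", "GC")  # T-to-G or T-to-C on the coding strand
print(c_to_g)
print(t_to_s)
```

Under these assumptions, a C-to-G conversion creates stop codons from TAC (to TAG) and TCA (to TGA), the two codon classes counted in the text, while a T-to-S conversion does so from TAT (to TAG) and TTA (to TGA).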
To illustrate these applications, we focused on editing the splicing sites in the human DMD gene (Duchenne muscular dystrophy, coding for dystrophin) that cannot be targeted with ABE or CBE. We designed and screened a series of sgRNAs specifically targeting SD or SA sites with gTBEv3 or gCBEv2 (Fig. 5d and Supplementary Fig. 10c), including three sgRNAs targeting the SD sites of DMD exons 45 (Fig. 5e), 12 and 37 (Supplementary Fig. 10d), which are uniquely targeted by gTBEv3. Disruption of the SD site of exon 45, which leads to exon skipping, would be applicable to restoring dystrophin expression in 9% of DMD patients33. Thus, we co-injected gTBEv3 mRNA and an sgRNA targeting the SD site of DMD exon 45 into zygotes of humanized mice to explore the potential application of gTBE. We found that 100% (20/20) of mouse embryos harbored efficient base conversion (ranging from 28.0% to 87.4%) at the desired position T3 (Fig. 5f, g), indicating the great potential of gTBE for human disease modeling and gene therapy. Overall, gBEs, including gTBE, gCBE and gGBE, provide more options for the sites that dBEs could not target, largely expanding the targeting scope of base editors.
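The sgRNA design step for splice-donor disruption can be sketched as a search for protospacers that place the T of the donor GT dinucleotide inside the editing window. The example below is a simplified illustration under several assumptions that are not stated in the text: an NGG PAM immediately 3′ of a 20-nt protospacer, position 1 at the PAM-distal end, and guides on the coding strand only. The function name and the input sequence are hypothetical; the sequence is not from DMD.

```python
def sd_guides_for_t_editing(seq: str, gt_index: int, window=(3, 7), spacer_len: int = 20):
    """Return (protospacer, PAM, position of the donor T) for coding-strand guides
    that place the T of the splice-donor GT inside the editing window.

    seq      -- coding-strand sequence, 5'->3'
    gt_index -- 0-based index of the G of the donor GT dinucleotide
    """
    seq = seq.upper()
    if seq[gt_index:gt_index + 2] != "GT":
        raise ValueError("gt_index does not point at a GT dinucleotide")
    t_index = gt_index + 1
    guides = []
    for pos in range(window[0], window[1] + 1):
        start = t_index - (pos - 1)      # protospacer start that puts the T at `pos`
        pam_start = start + spacer_len   # PAM sits directly 3' of the protospacer
        if start < 0 or pam_start + 3 > len(seq):
            continue
        pam = seq[pam_start:pam_start + 3]
        if pam[1:] == "GG":              # assumed NGG PAM
            guides.append((seq[start:pam_start], pam, pos))
    return guides


# Hypothetical sequence with a donor GT at index 5.
sequence = "CTAAAGTAAGCTCATTCAGCTATGGACA"
guides = sd_guides_for_t_editing(sequence, gt_index=5)
print(guides)
```

For this toy input a single guide qualifies, with the donor T at protospacer position 5. A fuller design tool would also scan the opposite strand, where the complementary bases of the donor site become targets for other editors.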
Comparison of different editing systems
In this study, we have engineered gTBEs and gCBEs using structure-informed rational mutagenesis (Fig. 6a). During the peer review process of this work, two studies reported several independently developed deaminase-free glycosylase-based base editors34,35. He et al. developed a TSBE3 for T-to-G/C substitutions using protein language model (PLM)-assisted strategy34, while Ye et al. conducted rounds of random mutagenesis by error-prone PCR for directed evolution in Escherichia coli and obtained several deaminase-free base editors (DAF-TBEs and DAF-CBEs)35 (Fig. 6a). The basic architectures of above-mentioned base editors are different, for instance, TSBE3 was constructed using an embedding strategy and DAF-TBE2 using a circularly permuted strategy (Fig. 6b). Since embedding of deaminase or glycosylase into the Cas9 domain could modulate the editing efficiency and/or editing window of certain base editor10,36,37,38, we generated gTBEv4 and gTBEv5 by inserting the engineered UNG2 variant of gTBEv3 into the nCas9 domain at different locations (Fig. 6b).
To better characterize the performance of various deaminase-free base editors, we made a side-by-side comparison of the base editors in our study and those from the other two studies. We first compared the T editing efficiency of various thymine base editors at 17 endogenous sites, including five sites from He’s study34 and five sites from Ye’s study35 (Fig. 6c and Supplementary Fig. 12). For base editors with the UNG variant fused at the N-terminus of nCas9, gTBEv3 showed higher editing efficiency than DAF-TBE at the overwhelming majority of Ts (29 out of 35) of tested sites (Fig. 6c, Supplementary Fig. 12f), indicating that UNG variants generated by rational mutagenesis are superior to those generated by random mutagenesis in this situation. We also compared gTBEv3 with gTBEv4 and gTBEv5, two base editors constructed using the embedding strategy. The gTBEv4 showed an editing window shifted from positions 3–7 to positions 7–13 (Fig. 6d), with no significant difference in average editing efficiency compared with gTBEv3 (23.2% vs. 23.1%, Supplementary Fig. 12f). For gTBEv5, the editing efficiency was largely increased compared to that of gTBEv3 (averaging 39.3% vs. 23.1%, Supplementary Fig. 12f), with the same predominant T-to-S conversions (Supplementary Fig. 12a–d, g), and the optimal editing window covered protospacer positions 5 to 9 (Fig. 6d). TSBE3 (carrying L83Q and G116E mutations, equivalent to L74Q and G107E in UNG1) is an nCas9-embedded base editor with almost the same insertion position as gTBEv5 (Fig. 6c). The gTBEv5 showed higher editing efficiency than TSBE3 (39.3% vs. 22.5%, Supplementary Fig. 12f) at the overwhelming majority of Ts (29 out of 35) of tested sites (Fig. 6c), indicating that UNG variants generated by rational mutagenesis are superior to those generated by PLM-assisted mutagenesis in this situation. The optimal editing window of TSBE3 covered protospacer positions 4 to 9 (Fig. 6d).
The circularly permuted DAF-TBE2 showed an editing window of positions 9–13, different from the editing window (positions 2–6) of DAF-TBE (Fig. 6d). Despite showing the highest average editing efficiency, gTBEv5 induced indel rates comparable to those of DAF-TBE (14.4% vs. 14.4%), DAF-TBE2 (14.4% vs. 10.3%) and TSBE3 (14.4% vs. 13.5%, Supplementary Fig. 12e–g). Notably, gTBEs induced far fewer unintended T edits than TSBE3 and DAF-TBEs in the proximal DNA sequence upstream of two sites (site 38 and site 44) harboring unintended edits (Supplementary Fig. 13), consistent with the finding that the NTD of UNG could promote targeting of the enzyme to ssDNA–dsDNA junctions19.
Similarly, we then compared the C editing efficiency of various base editors (Supplementary Fig. 14a) at 19 endogenous sites, including five sites from He’s study34 and five sites from Ye’s study35 (Supplementary Fig. 14b). We found that gCBEs showed higher overall average editing efficiency than all other base editors (Supplementary Fig. 14b, e). The gCBEv2 outperformed DAF-CBE (30.1% vs. 21.3%) and CGBE-CDG (30.1% vs. 19.3%) in average efficiency of base conversion (Supplementary Fig. 14c, f), indicating that UNG variants generated by rational mutagenesis are superior to those generated by random mutagenesis in this situation. Although CGBE1 induced the fewest indels and gCBEv3 induced more indels, gCBEv2 induced average indel rates comparable to those of other deaminase-free base editors, including DAF-CBE (16.8% vs. 16.9%), DAF-CBE2 (16.8% vs. 12.1%) and CGBE-CDG (16.8% vs. 13.6%, Supplementary Fig. 14d, g). The C-to-G editing frequency and purity of the different base editors show respective advantages for CGBE1 and the various deaminase-free base editors at different cytosine positions across the protospacer (Supplementary Fig. 15a, b). Each base editor can edit its target base within a certain editable window, that is, positions 2 to 9 for gCBEv2, positions 2 to 11 for gCBEv3, positions 4 to 10 for CGBE1, positions 2 to 9 for CGBE-CDG, positions 2 to 9 for DAF-CBE, and positions 9 to 12 for DAF-CBE2 (Supplementary Fig. 15c).
After analyzing the off-target effects at several sgRNA-dependent and sgRNA-independent off-target sites, we found that gTBEs and gCBEs induced low-level off-target edits comparable to those of other base editors at most sites (Supplementary Fig. 16a–c). Moreover, by performing transcriptome-wide RNA analysis, we found that gTBEv5 and gCBEv3 did not exhibit significant off-target RNA editing or impact the cell’s inherent DNA repair processes (Supplementary Fig. 16d, Supplementary Data 5), consistent with the findings for DAF-TBE, DAF-CBE, CGBE-CDG and TSBE334,35.
The prime editing (PE) system could theoretically mediate all types of base substitution, including T-to-G conversion and C-to-G conversion39. We compared gTBEv3 and gTBEv5 with the recently evolved PE6d system40 at six previously reported endogenous sites35 in HEK293T cells. The gTBEv3 and gTBEv5 outperformed PE6d or PE6d max for T-to-G conversion at four tested sites, whereas PEs exhibited higher efficiency and purity than gTBEs at the other two sites (Supplementary Fig. 17a, Supplementary Data 6). The gCBEv2 and gCBEv3 outperformed PE6d or PE6d max for C-to-G conversion at five tested sites, whereas PEs exhibited higher efficiency and purity than gCBEv2 at the remaining site (Supplementary Fig. 17b, Supplementary Data 6). These findings indicate that base editing and prime editing offer complementary strengths, and that base editors generally show more efficient editing if the target base is positioned optimally. In addition, gTBEs and gCBEs also exhibited efficient T and C editing activity across three different human cell lines (HEK293T, U2OS and Huh-7 cells), with slight perturbations of the product purity for gTBEs and comparable substitution frequencies of each base for gCBEs in different cell lines (Supplementary Fig. 18).
Taken together, we found that the gTBEs and gCBEs in our study outperformed other base editors, including DAF-TBEs, DAF-CBE, TSBE3 and CGBE-CDG from the other two studies. In addition, the distinct editing windows of the different base editors provide more choices for achieving the desired base conversion.
- Source: https://www.nature.com/articles/s41467-024-49343-5