High-throughput continuous evolution of compact Cas9 variants targeting single-nucleotide-pyrimidine PAMs
CRISPR–Cas9 has enabled the development of genome-manipulating technologies that have transformed the life sciences and advanced new treatments for genetic disorders into the clinic. Target sites engaged by Cas9 must contain a protospacer-adjacent motif (PAM) that is recognized through a protein–DNA interaction before single-guide RNA (sgRNA) binding. While not prohibitive for some gene editing applications, such as target gene disruption, this PAM requirement limits the applicability of precision gene editing methods, including base editing, prime editing or site-specific DNA integration. For these technologies, the target modification must occur either at a specific distance or within a certain range of the PAM. Thus, the availability of a PAM sequence compatible with a Cas protein that retains robust activity in mammalian cells strongly determines the application scope of precision gene editing. Indeed, recent ex vivo and in vivo therapeutic base editing to rescue sickle-cell disease and progeria in mice used evolved or engineered Cas9 variants to precisely position the base editor at CACC or NGA PAMs (where N is A, C, G or T), respectively.
The limitations imposed by PAM restrictions have motivated efforts to engineer or evolve Cas protein variants with broadened or altered PAM compatibility. These approaches have generated variants of the most widely used Cas9 from Streptococcus pyogenes (SpCas9), which offers robust mammalian cell activity and engages sites with NGG PAMs. The wild-type and evolved or engineered variants of SpCas9 described so far can collectively access essentially all purine-containing PAMs and a subset of pyrimidine-containing PAMs.
Researchers have also parsed the genomes of other bacterial species or bacteriophage to identify Cas variants with different PAM requirements. These Cas variants vary dramatically in size, PAM compatibility and enzymatic activity. Unfortunately, most of these natural homologs are less well characterized, less active in mammalian cells or have highly restrictive PAM requirements compared to SpCas9, limiting their use for precision gene editing applications and the ease with which they can be modified. As such, engineering or evolution of non-SpCas9 orthologs has been uncommon, with only a few reported examples.
New engineering or evolution methods to address the limitations of reprogramming non-SpCas9 orthologs could provide new precision gene editing capabilities that expand on and complement the suite of commonly used SpCas9-derived variants. Nme2Cas9, a Cas9 variant from Neisseria meningitidis, is an attractive Cas ortholog for evolving PAM compatibility. The wild-type enzyme is active on N4CC PAMs, and thus may serve as a promising starting point for accessing the pyrimidine PAMs previously inaccessible to SpCas9 variants. In addition, Nme2Cas9 is smaller than SpCas9 (1,082 versus 1,368 amino acids (aa)), making it attractive for future delivery applications. Nme2Cas9 has also shown robust activity in mammalian cells as both a nuclease and a base editor.
Here we report the directed evolution of Nme2Cas9, expanding its PAM scope from the N4CC requirement of the wild-type protein to include most N4YN sequences, where Y is C or T. To enable the evolution of this non-SpCas9 ortholog, we developed and integrated three technologies. First, we established a new, generalizable selection strategy requiring both PAM recognition and functional editing activity. Second, we carried out selections in parallel across single PAM sequences using phage-assisted non-continuous evolution (PANCE) and a high-throughput eVOLVER-enabled phage-assisted continuous evolution (ePACE) platform. Last, we developed a high-throughput base editing-dependent PAM-profiling assay (BE-PPA) to rapidly and thoroughly characterize evolving Nme2Cas9 variants and to guide evolutionary trajectories. With these developments, we evolved four Nme2Cas9 variants that enable robust precision genome editing at PAMs with a single specified pyrimidine nucleotide: eNme2-C, eNme2-C.NR, eNme2-T.1 and eNme2-T.2. The evolved Nme2 variants exhibit comparable (eNme2-T.1 and eNme2-T.2) or more robust (eNme2-C) base editing and lower off-target editing than SpRY, the only other engineered variant capable of accessing similar PAMs for a subset of target sites. Together, these new variants offer broad PAM accessibility that is complementary to the suite of PAMs previously targetable by SpCas9-derived variants. Moreover, the selection strategy developed in this study is highly scalable and general. Because of the lack of target site requirements, this selection could in principle be applied to evolve functional activities in any Cas ortholog or to optimize editing at a specific PAM or target site.
Results
We hypothesized that our continuous evolution system, PACE, in which the propagation of M13 bacteriophage is coupled to the desired activity of a protein of interest, could be used to evolve Nme2Cas9 variants with expanded pyrimidine-rich PAM scope. Previously, we broadened the PAM scope of SpCas9 variants using a one-hybrid, DNA-binding PACE circuit. In those efforts, SpCas9 variants encoded on selection phage (SP) capable of simply binding the target PAM(s) successfully produce gene III (gIII), a gene essential for phage propagation. The resulting SpCas9 variants could access most NR PAM sequences (where R is A or G), but efforts to apply the DNA-binding selection to evolve pyrimidine-PAM recognition were less successful.
While this binding selection could be adapted to evolve Nme2Cas9, fundamental differences between the activities of SpCas9 and Nme2Cas9 could impede efforts to evolve the PAM scope of the latter. Nme2Cas9, and more broadly Type II-C Cas variants, may have slower nuclease kinetics relative to SpCas9. This weaker nuclease activity is attributed to slower Cas9 helicase activity, as artificially introduced bulges mimicking partially unwound DNA in the PAM-proximal region increase the cleavage rate of Type II-C Cas variants but not of SpCas9. This theory is supported by observations that miniaturized SpCas9 variants with partially deleted domains have reduced DNA-binding affinity that can also be rescued by the introduction of PAM-proximal bulges in target DNA. Because a primary motivation for broadening PAM compatibility is to improve the applicability of precision gene editing technologies that require DNA unwinding, it is critical that a selection preserves or improves R-loop formation, maintenance and nuclease activation. Notably, these Cas properties are dependent on domains outside the PAM-interacting domain (PID), which has been the focus of rational engineering approaches. Together, this analysis indicates that while DNA-binding selections or PID engineering can yield robust SpCas9 variants with altered PAM compatibilities, the same type of binding-only selection applied to the evolution of Nme2Cas9 or similar Cas orthologs may not yield both desired PAM recognition and efficient downstream activity (Fig. 1a). This hypothesis motivated us to envision a new, functional selection in PACE for evolving PAM compatibility.
a, Overview of previous Cas9 PACE (left) requiring only PAM binding, compared to the SAC-PACE developed in this study, which requires both PAM binding and subsequent base editing. b, The selection circuit in SAC-PACE. The SP encodes an adenine base editor in place of gIII. In the host cells, an accessory plasmid (AP) contains a cis intein-split gIII, with a linker (32–121 aa) containing stop codons. MP, mutagenesis plasmid. c, Overnight phage propagation assays to test the selection stringency of SAC-PACE with various AP promoter strengths. Mean ± s.e.m. are shown and are representative of n = 2 independent biological replicates. d, Overview of ePACE, enabling parallel lagoon evolution of a Cas9 variant on single PAMs (Supplementary Figs. 1–4). e, Overnight propagation assays of wild-type Nme2-ABE8e on two sets of 32 N3NYN PAMs. Fold-propagation was measured by qPCR and is reflective of the average of two independent biological replicates. The eight CTTAYNA PAMs are excluded as they introduce an additional stop codon in the accessory plasmid, preventing Cas-dependent propagation.
Development of a general functional selection for evolving PAM compatibility in PACE
To develop a functional selection for Cas9-based genome editing agents with altered PAM compatibilities, we combined elements of a DNA-binding selection with a base editing (BE) selection, such that both new PAM recognition and subsequent BE within the protospacer are required to pass the selection. Although we previously developed BE selections to evolve high-activity adenine and cytidine deaminases, these selections place targeted nucleotides within the coding sequence of T7 RNA polymerase (T7 RNAP). This selection strategy is not broadly applicable to evolve altered PAM compatibility since changing the target PAM and protospacer likely requires changing the coding sequence of T7 RNAP. Furthermore, evolved variants with high activity that edit over large activity windows may inadvertently alter the activity of T7 RNAP through bystander editing.
To address these limitations, we designed a new selection strategy in which the target protospacer and PAM can be fully specified without affecting the coding sequence of the gene responsible for selection survival (Fig. 1b). To achieve this programmability, we used the splicing capabilities of inteins, protein elements that insert and remove themselves from other proteins in cis, leaving only a small (roughly 3- to 10-aa) extein scar. We hypothesized that trans split-inteins could function effectively as cis-splicing elements when the N- and C-inteins are fused together with a linker containing a programmed PAM and protospacer. We used the split-intein pair from Nostoc punctiforme (Npu) since we previously showed that gIII split after Leu 10 with the Npu intein supports robust phage propagation after trans splicing.
To test whether the reconfigured cis-splicing Npu intein supports phage propagation, we constructed an accessory plasmid with the N- and C-terminal halves of the Npu intein fused together with a flexible 32-aa linker and inserted into the coding sequence of gIII after Leu 10 under the control of the phage shock promoter (psp) (Fig. 1b). When infected with ΔgIII-phage, host cells containing this accessory plasmid supported robust phage propagation in a splicing-dependent manner similar to cells containing psp-driven wild-type gIII. Installation of stop codons within the linker sequence reduced phage propagation by >10⁵-fold relative to the unmutated construct (Extended Data Fig. 1a), indicating that this selection, which we term sequence-agnostic Cas-PACE (SAC-PACE), should enable robust selection of variants capable of correcting targeted stop codons.
Next, we tested whether adenine base editing could support phage propagation in SAC-PACE. Indeed, on host cells harboring an accessory plasmid containing gIII with two stop codons flanked by a cognate Nme2Cas9 N4CC PAM, phage encoding dead Nme2Cas9 fused to the adenosine deaminase TadA8e (Nme2-ABE8e) enriched 10²- to 10⁶-fold after overnight propagation, depending on the expression level of the gIII-construct (Fig. 1c). In contrast, phage containing only TadA8e or a nontargeting gene de-enriched in these host cells below the limit of detection at any tested expression level, indicating a large base editing-dependent dynamic range for this selection.
To test the generality of the selection circuit, we generated a series of APs containing linkers between 32 and 121 aa or with stop codons placed at different positions within the protospacer (Extended Data Fig. 1b,c). Although propagation decreased with increasing linker length, the maximum tested linker length of 121 aa still supported strong overnight propagation sufficient to support phage survival during PACE (>10⁴-fold). This linker length can encode up to ten simultaneous protospacer/PAM combinations (23 to 30 nucleotides (nt) in length) with at least 7 nt between targets, a spacing shown to be compatible for multiple Cas protein binding events. Together, these results indicate that the SAC-PACE selection is a highly flexible system that could be used to evolve the PAM scope of Cas variants.
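The capacity estimate for the 121-aa linker can be checked with simple arithmetic: the linker spans 3 × 121 = 363 nt, and n targets of length t separated by gaps of g nt occupy n·t + (n − 1)·g nt. The Python sketch below is only an illustration of that calculation. It assumes targets are packed end to end with nothing else in the linker, and the function name `max_targets` is hypothetical rather than part of any published tool.

```python
def max_targets(linker_aa: int, target_nt: int, spacer_nt: int = 7) -> int:
    """Largest n such that n*target_nt + (n - 1)*spacer_nt fits in the linker.

    Assumes an idealized layout: protospacer/PAM targets placed end to end,
    separated by exactly spacer_nt nucleotides, with no other linker sequence.
    """
    linker_nt = 3 * linker_aa  # three nucleotides per encoded amino acid
    # n*t + (n - 1)*g <= L  rearranges to  n <= (L + g) / (t + g)
    return (linker_nt + spacer_nt) // (target_nt + spacer_nt)


if __name__ == "__main__":
    # Longest targets quoted in the text (30 nt) in the 121-aa linker
    print(max_targets(121, 30))
    # Shortest targets quoted in the text (23 nt) in the same linker
    print(max_targets(121, 23))
```

Under these assumptions, ten 30-nt targets with 7-nt gaps fill the 363-nt linker exactly (10 × 30 + 9 × 7 = 363), which matches the "up to ten" figure for the longest targets.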
A high-throughput platform for ePACE
Previous efforts to evolve SpCas9 on specific PAM sequences (NAG, NAC, NAT, etc.) yielded variants with both higher activity and specificity compared to variants evolved on a broad set of pooled PAMs. Evolving on specific PAM sequences using traditional PACE methodology, however, is limited by throughput, since PACE is inherently challenging to parallelize due to cost, space and design complexity, requiring temperature-controlled rooms and fluid-handling equipment. This constraint limits the number of conditions that can be explored in a PACE campaign, a drawback given the difficulty of predicting the set of conditions that will evolve molecules with desired properties.
To address this throughput challenge and enable large-scale parallel PACE of Nme2Cas9 toward specific PAMs, we developed ePACE (Fig. 1d and Supplementary Figs. 1–3). The ePACE system combines the continuous mutagenesis and selection of PACE with the highly scalable, customizable and automated eVOLVER continuous culture platform, which has already proved effective for directed evolution. Three key design features of eVOLVER make it an ideal choice for facilitating parallel PACE selections. First, eVOLVER enables individual programmatic control of continuous culture conditions, allowing the platform to simultaneously operate PACE chemostat cell reservoirs and lagoons on a standard laboratory benchtop. Second, eVOLVER can scale in a cost-effective manner to arbitrary throughput, enabling large-scale parallelization of miniature PACE reactors. Last, the do-it-yourself and open-source nature of eVOLVER allows it to be rapidly adapted and reconfigured for new actuation elements, making it amenable to the customization necessary to run PACE (Supplementary Figs. 1–3). Integrating PACE and eVOLVER enables the simultaneous execution of PACE experiments across eight different PAMs (or other selection conditions) in parallel. Given that PACE experiments typically require 1–2 weeks each, this eightfold increase in throughput represents a 2–4-month reduction in experimental time compared to traditional single-lagoon PACE at a tenfold reduction in cost.
To facilitate and automate the liquid handling needs of PACE in eVOLVER, we developed customized ‘millifluidic’ integrated peristaltic pumps (IPPs), inspired by integrated microfluidics, that can be inexpensively manufactured using laser cutting to achieve accurate, tunable small volume flow rates (<0.1–40 µl s⁻¹) (Supplementary Figs. 2 and 3 and Supplementary Note 1). Briefly, IPPs enable accurate and tunable metering of liquids through the sequential actuation of consecutively arranged pneumatic valves. We characterized several IPP valve sizes and cycle frequencies to generate calibration curves of achievable flow rates and verified robustness of these pumps over roughly 6 million actuations over 7 days, well over the typical load necessary for PACE (Supplementary Figs. 2 and 3). To test the evolutionary capabilities of ePACE, we evolved a folding-defective (G32D/I33S) maltose-binding protein (MBP) variant validated in traditional PACE. Previously, this folding-defective MBP was evolved using a two-hybrid selection scheme to optimize both soluble expression of the MBP variant and binding to an anti-MBP monobody. We replicated this evolution using ePACE, yielding evolved MBP variants with mutations at residues clustered around the monobody-MBP interaction interface (D32G, A63T, R66L) that we previously observed in PACE (Supplementary Fig. 4). These results demonstrate that eVOLVER equipped with IPP devices can successfully support and automate PACE, validating the ePACE platform for high-throughput continuous directed evolution.
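To give a sense of scale for the pump figures quoted above, the following Python sketch works through two back-of-the-envelope calculations. It assumes an idealized displacement model in which each pump cycle moves one fixed stroke volume, so flow rate is stroke volume times cycle frequency. Real IPP flow rates were determined empirically from calibration curves, and the stroke volume used in the example is hypothetical. The text also does not specify whether the roughly 6 million actuations counts individual valve events or full cycles, so the second function reports only a mean rate.

```python
SECONDS_PER_DAY = 86_400


def ideal_flow_rate_ul_per_s(stroke_volume_ul: float, cycle_hz: float) -> float:
    """Idealized peristaltic flow: one stroke volume displaced per cycle."""
    return stroke_volume_ul * cycle_hz


def mean_actuation_rate_hz(actuations: float, days: float) -> float:
    """Average actuations per second over a run of the given duration."""
    return actuations / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Hypothetical 2 µl stroke at 5 cycles per second
    print(ideal_flow_rate_ul_per_s(2.0, 5.0))
    # Roughly 6 million actuations over 7 days, as quoted in the text
    print(round(mean_actuation_rate_hz(6e6, 7), 2))
```

Under this model, the robustness test corresponds to a sustained average of about ten actuations per second for a week.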
Development of a high-throughput base editing-dependent PAM-profiling method
Next, we developed a method to rapidly profile the PAM scope of Nme2Cas9 variants that emerge during evolution. Assessing PAM compatibility by testing individual sites in mammalian cells is throughput-limited. Although many library-based PAM-profiling methods have been described, these methods rely on nuclease activity (PAM depletion, PAMDA, TXTL PAM profiling, etc.) or Cas protein binding activity (PAM-SCANR, CHAMP, etc.), which may not fully reflect PAM compatibility in precision gene editing applications such as base editing. We previously reported a mammalian cell base editing profiling assay; however, this method is both slower and costlier than cell-free or E. coli-based methods, making it better suited for the characterization of late-stage variants.
To address the need to rapidly assess the PAM specificities of newly evolved Cas9 variants in base editor form, we developed a base editing-dependent PAM-profiling assay (BE-PPA). In BE-PPA, a protospacer or library of protospacers containing target adenines (ABE-PPA) or cytosines (CBE-PPA) is installed upstream of a library of PAM sequences (Extended Data Fig. 2a,b). This library is transformed into E. coli along with a plasmid expressing a base editor of interest. Since base editing at each PAM is measured independently of other PAMs, BE-PPA offers greater sensitivity compared to nuclease-based assays. The PAM profile we observed for BE2 (rAPOBEC1-dSpCas9-UGI) using CBE-PPA closely matched (R² = 0.97) the PAM profile we previously observed for the related CBE, BE4, in human embryonic kidney 293T (HEK293T) cells (Extended Data Fig. 2c and Supplementary Table 1), validating BE-PPA as a rapid base editor PAM-profiling method.
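The per-PAM readout described for BE-PPA can be illustrated with a short Python sketch. It assumes that sequencing reads have already been reduced to (PAM, base observed at the target position) pairs, and it computes R² as the squared Pearson correlation between two profiles. Both the input format and that choice of R² are assumptions made for illustration; the analysis pipeline and statistic used in the study are not described in this section.

```python
from collections import defaultdict


def editing_by_pam(reads, edited_base="G"):
    """Fraction of reads per PAM that carry the edited base.

    reads: iterable of (pam, observed_base) tuples, one per sequencing read.
    edited_base: "G" for A-to-G editing (ABE-PPA); use "T" for C-to-T (CBE-PPA).
    """
    totals = defaultdict(int)
    edited = defaultdict(int)
    for pam, base in reads:
        totals[pam] += 1
        if base == edited_base:
            edited[pam] += 1
    # Each PAM is scored only against its own reads, independently of other PAMs
    return {pam: edited[pam] / totals[pam] for pam in totals}


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Toy reads for two hypothetical PAMs
    toy_reads = [("CC", "G"), ("CC", "A"), ("TC", "A"), ("TC", "A")]
    print(editing_by_pam(toy_reads))
```

To compare two assays, the per-PAM fractions from each can be aligned on a shared list of PAMs and passed to `r_squared`.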
Strategy for evolving the PAM scope of Nme2Cas9
Having validated the SAC-PACE selection, the ePACE system for high-throughput continuous evolution and the BE-PPA method for profiling PAM compatibility of base editors, we next identified desirable target PAMs for evolving Nme2Cas9. In overnight propagation assays, phage containing Nme2-ABE8e exhibited modest to strong propagation (N3NCG < N3NCA < N3NCT < N3NCC) on the set of 16 N3NCN PAMs, and strong propagation on N3NTC PAMs if the base immediately downstream of the canonical six base pair PAM was a C (PAM position 7, NNNNNNN, counting the canonical PAM as positions 1–6), likely due to PAM slippage (Fig. 1e). This initial activity suggested an overall evolution campaign along two trajectories (Fig. 2b): a more difficult trajectory toward activity on N4TN PAMs that could require several selection stringencies, and a simpler trajectory toward N4CN-active variants. If successful, these variants could together enable targeting of PAM sequences largely complementary to the PAM scope of existing, high-activity SpCas9 variants.
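The PAM set sizes quoted above follow from expanding the degenerate positions. A minimal Python sketch using the standard IUPAC nucleotide codes (N = A, C, G or T; Y = C or T; R = A or G) is shown below. The helper name `expand_pam` is hypothetical, and the example varies only PAM positions 4–6, treating the first three positions as fixed.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes used in the text
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def expand_pam(pattern: str) -> list[str]:
    """Enumerate every concrete sequence matching a degenerate PAM pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[ch] for ch in pattern))]


if __name__ == "__main__":
    # Positions 4-6 of the PAM, with positions 1-3 held fixed
    print(len(expand_pam("NCN")))  # the N3NCN set
    print(len(expand_pam("NYN")))  # one N3NYN set
    print(len(expand_pam("AYN")))  # PAMs with A at position 4
```

Expanding `NCN` gives the 16 N3NCN PAMs, `NYN` gives the 32 PAMs of one N3NYN set, and `AYN` gives eight sequences, the same count as the excluded CTTAYNA PAMs.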
a, SAC-PACE modifications increasing selection stringency. Left, original selection scheme; middle, split SAC-PACE selection; right, dual-PAM split SAC-PACE. b, Evolution campaigns toward Nme2Cas9 variants with N4CN or N4TN PAM compatibility. c, Summary heat map showing ABE-PPA activity for representative variants across both evolutionary trajectories. Values plotted are raw observed percentage A•T-to-G•C conversion for one replicate of each base editor. d, Mutation overview of the eNme2-C variant, mapped onto the crystal structure of wild-type Nme2Cas9 (Protein Data Bank (PDB) 6JE3). Mutated positions are shown in blue. The inset shows the wild-type PAM and PAM-interacting residues (D1028, R1033), with evolved mutations listed. e, Summary dot plots showing the progression of mammalian cell adenine base editing activity at eight N4CN PAM-containing sites for representative variants from the N4CN evolution trajectory. f, Mutation overview of the eNme2-T.1 and eNme2-T.2 variants, mapped onto the crystal structure of wild-type Nme2Cas9 (PDB 6JE3). Positions mutated in both variants are shown in yellow. Mutations unique to eNme2-T.1 are shown in light green. Mutations unique to eNme2-T.2 are shown in dark green. The insets show the wild-type PAM and PAM-interacting residues (D1028, R1033), along with new mutations listed. g, Summary dot plots showing the progression of mammalian cell adenine base editing activity at eight N4TN PAM-containing sites for representative variants from the N4TN evolution trajectory. For e and g, each point represents the average editing of n = 3 independent biological replicates measured at the maximally edited position within each genomic site. Mean ± s.e.m. is shown and reflects the average activity and standard error of the pooled genomic site averages. NS, P > 0.05; *P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001; ****P ≤ 0.0001. P values were determined by Sidak’s multiple comparisons test following ordinary one-way analysis of variance.