Genome engineering is the field of research dedicated to developing site-directed DNA sequence modification methodologies and applications. Historically, such methodologies have been lacking in plant species, and in crops in particular, thereby thwarting advancements in functional genomics and trait development efforts. Desired DNA sequence modifications are initiated or stimulated by a double-stranded break (DSB) in the target DNA molecule (Fig. 1). Engineered nucleases are designed to catalyze the DSB at a specific location in the genome, thus stimulating the desired DNA modifications to occur at or near the break site. Here we provide a brief description of the most common forms of DNA sequence modifications that have been achieved using nuclease-based genome engineering methodologies:
(i) Mutagenesis: Mutagenesis is perhaps the easiest type of site-directed modification to achieve. To initiate mutagenesis, the engineered nuclease is designed to generate a DSB at a specific chromosomal site. Mutagenesis is achieved by simply expressing the nuclease within the plant cell. The DSB created by the nuclease is repaired by the host cell’s non-homologous end joining (NHEJ) DNA repair pathway that often results in small DNA insertions or deletions (indels) at the break site. Insertions or deletions introduced into nuclease target sites located in a gene’s open reading frame (ideally within an exon near the 5′ end of the gene) can cause frame shift mutations, in effect creating a nonfunctional gene “knock-out.” Populations of transformed plant cells expressing nucleases are then molecularly screened for the desired mutations.
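The link between indel length and gene knock-out can be made concrete with a short sketch. The open reading frame below is a made-up 21-nucleotide example, and the function names are ours for illustration only; the point is simply that an indel whose length is not a multiple of three shifts every downstream codon, whereas a 3-bp indel removes or adds a codon but leaves the frame intact.

```python
def codons(orf):
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def delete(orf, start, length):
    """Return the sequence with `length` bases removed, beginning at index `start`."""
    return orf[:start] + orf[start + length:]


def shifts_frame(wild_type, mutant):
    """An indel shifts the reading frame unless its net length is a multiple of 3."""
    return (len(mutant) - len(wild_type)) % 3 != 0


# Hypothetical ORF: ATG GCT GAA ACC TTT GGA TAA
wild_type = "ATGGCTGAAACCTTTGGATAA"

one_bp_deletion = delete(wild_type, 5, 1)    # NHEJ-like 1-bp deletion
three_bp_deletion = delete(wild_type, 3, 3)  # loss of exactly one codon

print(codons(wild_type), shifts_frame(wild_type, wild_type))
print(codons(one_bp_deletion), shifts_frame(wild_type, one_bp_deletion))
print(codons(three_bp_deletion), shifts_frame(wild_type, three_bp_deletion))
```

In a real screen the same check would be applied to sequenced alleles recovered from the target site, which is one reason in-frame indels are often set aside when a complete loss of function is the goal.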
(ii) Gene replacement or editing: While targeted gene knockouts have considerable value for basic and applied plant research, a potentially more powerful means of genome engineering involves modifying a gene rather than eliminating its function. It is possible to specifically alter a gene’s sequence in a nonrandom manner using the homologous recombination (HR) pathway to repair a nuclease-induced DSB. In this case, the DSB is repaired using a DNA donor template that contains sequences homologous to those flanking the break site (Fig. 1). This strategy is more complex than mutagenesis in that the donor template has to be delivered to the cell at the same time the nuclease mediates cleavage. The HR pathway uses the homology in the donor template to repair the DSB, thereby incorporating the donor sequence into the chromosome. The donor sequence can be tailored to the desired outcome, including changes to promoter regions or sequence modifications that alter the catalytic activity of an enzyme.
(iii) Gene insertion: The goal of gene insertion is to introduce a transgene or transgenes at a specific chromosomal location through HR. Sites of gene insertion are chosen that are conducive to high levels of gene expression. Furthermore, the insertion of multiple transgenes at the same chromosomal locus (stacking) makes it easy to move the transgenes into other germplasm by crossing, because the linked transgenes behave genetically as a single locus.
(iv) Site-directed structural changes (e.g., deletions and inversions): In this case, a nuclease or a combination of two nucleases can be targeted to cleave neighboring locations along a chromosome. Some of the events will be repaired by merging the respective outer breakpoints, thereby deleting the sequence separating the two cleavage sites. One can imagine that this might be useful for removing gene clusters that may be detrimental to fitness (e.g., yield or some other desirable trait). Alternatively, it is possible that the released fragment will invert before DSB repair, thereby resulting in an inversion event between the two cleavage sites. Similar methods can be used to stimulate chromosomal translocations if the two nuclease target sites are on different chromosomes.
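The two outcomes of paired cleavage can be sketched on a toy sequence. The chromosome string and cut coordinates below are hypothetical, and the sketch assumes precise rejoining with no indels at the junctions, which real NHEJ repair does not guarantee.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def deletion(chrom, cut1, cut2):
    """Join the outer breakpoints, losing the fragment between the two cuts."""
    return chrom[:cut1] + chrom[cut2:]


def inversion(chrom, cut1, cut2):
    """Reinsert the released fragment in the opposite orientation."""
    return chrom[:cut1] + reverse_complement(chrom[cut1:cut2]) + chrom[cut2:]


# Hypothetical chromosome segment: left flank, fragment between the cuts, right flank
chrom = "TTTT" + "ACGGA" + "CCCC"
cut1, cut2 = 4, 9

print(deletion(chrom, cut1, cut2))
print(inversion(chrom, cut1, cut2))
```

Because the flanking sequence is unchanged in both products, a PCR assay with primers outside the two cut sites would distinguish a deletion (shorter amplicon) from the wild type, while an inversion keeps the same length and needs a junction-specific assay.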
The four steps necessary for modifying a plant gene through genome engineering include (i) designing and developing an engineered nuclease construct, (ii) delivering the construct and perhaps donor molecule into the plant (typically by genetic transformation), (iii) inducing nuclease expression, and (iv) screening the plants for the desired DNA sequence change. Figure 2 depicts an example of the events that may occur (starting from plant transformation) to derive a novel mutation. It is important to note that there are two independent loci of interest, the transgene locus and the target locus (in the vast majority of cases these two loci will be unlinked).
In recent years, the major advances in genome engineering have focused on the design and delivery of proteins that reliably generate DSBs at specified loci. In this review we focus on the research advances in three distinct engineered nuclease systems that have been developed (and continue to be refined) for stimulating DSB events. Furthermore, we review recent studies that have implemented these strategies to target genes in crop plants.
Engineered Nuclease Systems for Targeted Genome Modification
The critical step in site-directed genome engineering is generating a DSB at a specific chromosomal location. Three types of engineered nucleases have been used to date for this purpose: zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and LAGLIDADG homing endonucleases (LHEs), often termed meganucleases (Fig. 3). All three nucleases operate by the same general principle, as they are engineered proteins consisting of a DNA binding domain (which accounts for site specificity) and an endonuclease domain (which functions as the DSB-causing enzyme).
Zinc Finger Nucleases
The pioneering work that started this technology began more than a decade ago when researchers at Johns Hopkins University attempted to generate novel restriction enzymes (Kim et al., 1996). The work focused on type IIS restriction enzymes such as FokI, which recognize specific DNA sequences and cleave several base pairs downstream of the recognition site. The researchers reasoned that if they could fuse the catalytic domain of FokI to another DNA binding domain then they could alter the DNA specificity of the restriction enzymes. They ultimately selected the zinc finger as the DNA binding motif of choice, because zinc fingers can be linked together in custom arrays to recognize novel target DNA sequences. The hybrid enzymes proved effective and were called zinc finger nucleases (ZFNs). Two landmark papers demonstrated the utility of ZFNs for making targeted modifications to the Drosophila melanogaster and human (Homo sapiens sapiens) genomes (Bibikova et al., 2003; Porteus and Baltimore, 2003). To date, DSB-mediated gene disruption and correction have been shown to work effectively in other animal models, including rats (Rattus norvegicus), mice (Mus musculus), and zebrafish (Danio rerio) (Cui et al., 2011; Geurts et al., 2009; Mashimo et al., 2010). The initial successes in plant genome engineering with ZFNs were reported in Arabidopsis thaliana and tobacco (Nicotiana tabacum) (Lloyd et al., 2005; Wright et al., 2005).
There are a number of engineering platforms that can be used to construct custom ZFNs. One of the first publicly available platforms was modular assembly (Wright et al., 2006). This method involves the creation of multifinger DNA-binding arrays using individual zinc fingers with predetermined DNA-binding specificities. Modular assembly can be used to easily and inexpensively build ZFNs by standard restriction digestion-based subcloning techniques; however, many of the ZFNs made by this approach are not highly active (Ramirez et al., 2008). The reason is that modular assembly treats each finger as an independent unit, failing to account for context-dependent influences on DNA binding caused by neighboring fingers in the array. To overcome this shortcoming, selection-based engineering platforms were developed that take into account context effects on DNA binding. The oligomerized pool engineering protocol (OPEN) is a publicly available, selection-based platform used to identify highly functional multifinger arrays (Maeder et al., 2008; Zhang et al., 2010). The improved success of OPEN-constructed ZFNs is due to the identification of zinc finger combinations that function well together. Putative ZFN target sites can be identified on average once every 200 bp using OPEN; however, constructing a ZFN takes a highly trained person approximately 2 to 3 mo. To address these limitations, a third platform was developed that retains the easy-to-assemble aspect of modular assembly yet uses genetic selections to take into account context effects. This platform, called context-dependent assembly (CoDA), makes it possible to engineer ZFNs in under a week. Context-dependent assembly allows a researcher to target a DNA locus on average once in every 500 bp. Work in zebrafish, Arabidopsis thaliana, and soybean [Glycine max (L.) Merr.] revealed that CoDA ZFNs function effectively at approximately 50% of sites targeted (Curtin et al., 2011; Sander et al., 2011b).
Transcription Activator-Like Effector Nucleases
Plant bacterial pathogens of the genus Xanthomonas infect a wide range of species including rice (Oryza sativa), citrus (Citrus spp.), tomato (Solanum lycopersicum), and soybean (Boch and Bonas, 2010; Kay and Bonas, 2009). During infection, Xanthomonas delivers to plant cells a battery of proteins known as transcription activator-like effectors (TALEs) (Boch and Bonas, 2010; Bogdanove et al., 2010). TALEs modify the host’s transcriptome by binding to specific DNA sequences in promoter regions, effectively mimicking host transcription factors (Kay and Bonas, 2009). TALEs have a central DNA binding domain, which typically consists of 16 to 20 single repeat monomers. Each monomer is 34 amino acid residues in length and is highly conserved except for hypervariable amino acid residues at positions 12 and 13 called repeat-variable di-residues (RVDs). Recent computational and molecular biological analyses made it possible to decipher the TALE code for DNA recognition (Boch et al., 2009; Moscou and Bogdanove, 2009). Each RVD recognizes a different DNA base; for example, repeats with RVDs NI, HD, NG, and NN bind to adenine (A), cytosine (C), thymine (T), and guanine (G) or adenine (A), respectively.
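The one-repeat-to-one-base logic of this code lends itself to a simple lookup table. The sketch below encodes only the four RVDs named above (NI, HD, NG, NN); other RVDs and any sequence-context preferences are deliberately left out, and the function names are ours rather than part of any published design tool.

```python
# Bases recognized by each RVD, restricted to the four RVDs named in the text.
RVD_TO_BASES = {"NI": "A", "HD": "C", "NG": "T", "NN": "GA"}

# Simplest design rule: pick one RVD per target base (NN is used for G).
BASE_TO_RVD = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvds_for_target(target):
    """Return the ordered RVD list for a repeat array designed against `target`."""
    return [BASE_TO_RVD[base] for base in target.upper()]


def array_binds(rvds, site):
    """True if every repeat in the array recognizes the base at its position."""
    if len(rvds) != len(site):
        return False
    return all(base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, site.upper()))


array = rvds_for_target("GATC")
print(array)
print(array_binds(array, "GATC"))  # intended site
print(array_binds(array, "AATC"))  # NN also tolerates A at the first position
print(array_binds(array, "CATC"))  # mismatch at the first position
```

The second check illustrates a practical consequence of the dual specificity of NN: an array designed against a G-containing site is also predicted to recognize the corresponding A-containing variant, which matters when a related sequence exists elsewhere in the genome.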
The decoding of the TALE DNA recognition mechanism immediately attracted the attention of genome engineers who recognized its potential for biotechnological applications (Bogdanove et al., 2010). One of the initial experiments was to fuse the TALE DNA binding domain to the catalytic domain of the FokI endonuclease thereby creating TALENs. The fusion of both native and custom DNA binding domains to FokI made it possible to create targeted DSBs at specific DNA sequences (Christian et al., 2010). To date TALENs have been used to generate targeted modifications in a variety of organisms such as Arabidopsis thaliana (Cermak et al., 2011), tobacco (Mahfouz et al., 2011), rice (Li et al., 2012), yeast (Saccharomyces cerevisiae) (Li et al., 2011), zebrafish (Huang et al., 2011; Sander et al., 2011a), rats (Tesson et al., 2011) and humans (Miller et al., 2011). In addition to custom TALENs, designer TALEs (dTALEs) have also been developed (Mahfouz et al., 2012; Morbitzer et al., 2010) that activate or repress gene expression in planta.
There are currently several publicly available platforms for constructing TALENs or dTALEs. They include Golden Gate assembly methods that allow seamless construction of TALE repeat arrays (Cermak et al., 2011; Engler et al., 2009; Sanjana et al., 2012; Zhang et al., 2011) as well as ligation-based systems (Reyon et al., 2012; Sander et al., 2011a). These platforms make it possible to rapidly generate TALENs. One of the main advantages of TALENs is that they allow a researcher to target a DNA locus on average once every 10 bp, which is much more frequent than ZFNs. Furthermore, the vast majority of engineered TALENs are functional, making them the reagent of choice for many genome engineering applications.
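To put the targeting densities quoted for the different platforms side by side, the following sketch converts the average site spacing (one site per 200 bp for OPEN, 500 bp for CoDA, and 10 bp for TALENs) into an expected number of candidate sites in a gene. The 3000-bp gene length is a hypothetical value chosen for illustration, and the calculation treats the quoted averages as uniform across the sequence, which a real gene will not strictly obey.

```python
# Average spacing (bp) between targetable sites, as quoted in the text.
SITE_SPACING_BP = {"OPEN ZFN": 200, "CoDA ZFN": 500, "TALEN": 10}


def expected_sites(gene_length_bp, platform):
    """Expected number of candidate target sites in a gene of the given length."""
    return gene_length_bp / SITE_SPACING_BP[platform]


gene_length_bp = 3000  # hypothetical gene length, for illustration only

for platform in SITE_SPACING_BP:
    print(platform, expected_sites(gene_length_bp, platform))

# CoDA ZFNs were reported to be active at roughly half of the sites targeted,
# so the number of usable CoDA sites is expected to be about half the candidates.
print("CoDA ZFN, active:", 0.5 * expected_sites(gene_length_bp, "CoDA ZFN"))
```

Under these assumptions a gene of this size offers only a handful of usable ZFN sites but hundreds of candidate TALEN sites, which is the practical meaning of the difference in targeting range.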
LAGLIDADG Homing Endonucleases
The third class of proteins used to generate targeted DSBs is the LHEs, also termed meganucleases (Stoddard, 2011; Taylor et al., 2012). LHEs differ from ZFNs and TALENs in that they are naturally occurring gene-targeting proteins encoded by mobile introns (Arnould et al., 2011). LHEs form homodimers comprising two identical subunits each 160 to 200 amino acid residues in size. They can also function as a single peptide of two tandem repeat monomers joined together by a linker sequence (Stoddard, 2011). The DNA target site for LHEs is typically 20 to 30 bp, which provides remarkable specificity, and it is for this reason that LHEs have been developed as a genome modification platform. In contrast to ZFNs and TALENs the cleavage and DNA binding domains of LHEs are not clearly demarcated. Attempts to reengineer DNA contact points of the endonuclease can be challenging and often compromise nuclease activity (Taylor et al., 2012). Numerous methods have been developed to assess mutated LHEs for activity and altered target specificities (Gao et al., 2010; Stoddard, 2011). For example, high-throughput assays have been developed in bacteria and yeast to measure LHE-induced cleavage and the ensuing reconstitution of reporter genes. Furthermore, computational approaches have been developed whereby the DNA binding affinities of modified enzymes are assessed using various algorithms. Because of these engineering challenges, only a handful of academic groups and companies routinely engineer LHEs that target novel DNA sites.
Examples of Targeted Mutagenesis in Crop Species
While most work in genome engineering has focused on model species, there are some examples of heritable site-directed mutagenesis and editing of endogenous genes in crop plants. Tzfira et al. (2012) recently compiled a list of published genome modification studies in both model and crop plant species. In the following sections, we will highlight some of the recent advances in nuclease-based site-directed mutagenesis and gene editing of crops.
Site-directed mutagenesis has been reported in a small number of crop plants, including maize (Zea mays L.), soybean, rice, tobacco, and petunia [Petunia ×atkinsiana (Sweet) D. Don ex W. H. Baxter (syn. Petunia ×hybrida hort. ex E. Vilm.)] (Curtin et al., 2011; Gao et al., 2010; Li et al., 2012; Marton et al., 2010). Gao et al. (2010) used a reengineered I-CreI LHE to target a sequence neighboring the maize liguleless locus and effectively generated heritable monoallelic and biallelic mutations at the target site. This result was the first report of the redesigning of an LHE to target a chromosomal locus of choice in a crop species. Targeted mutagenesis of paralogous genes in the soybean genome was performed using the CoDA ZFN platform (Curtin et al., 2011; Sander et al., 2011b). Soybean is a paleopolyploid in which approximately 75% of the genes are duplicated. Site-directed nucleases appear to be an excellent tool for soybean (and other species with duplicated genomes), as a single engineered nuclease can be designed to target multiple gene duplicates. Before whole plant transformation, Curtin et al. (2011) used a “hairy-root” inducing strain of Agrobacterium rhizogenes (K599) to rapidly identify seven genes that could be mutated by specific ZFNs. A ZFN designed to target both soybean DICER-LIKE4 (DCL4) gene copies generated mutations of both genes in the hairy-root assay and a heritable mutation in DCL4b.
Nuclease-mediated site-directed mutagenesis was recently shown to generate disease-resistant rice. This breakthrough was one of the first successful demonstrations of a nuclease-mediated modification of agronomic importance (Li et al., 2012). The phytopathogenic bacterium Xanthomonas oryzae is the causal agent of a devastating blight responsible for large yield losses in both temperate and tropical climates. During infection the bacterium secretes effector proteins that target DNA sequences in the promoter region of the rice sucrose-efflux transporter gene (OsSWEET14). This binding transcriptionally activates OsSWEET14, which thereby contributes to pathogen survival and virulence. Since OsSWEET14 plays an important role in the development of the plant, obtaining a knockout mutant to circumvent the effects of the pathogen was not feasible. Therefore the authors reasoned that if they could disrupt the promoter sequence bound by the effector (without disrupting the TATA box) they could eliminate the pathogen’s ability to induce OsSWEET14 and thereby reduce the pathogen’s virulence. A pair of TALENs that targeted the necessary sequence was designed and transformed into rice. Several independent mutant lines were recovered, some of which showed resistance to Xanthomonas oryzae. Rice plants that were resistant in the bacterial assay were further screened for their mutation status and were shown to be either homozygous monoallelic or heterozygous biallelic mutants at the targeted site. Resistant plants were morphologically identical to wild-type, indicating that the mutations had not caused the adverse developmental phenotypes that would be expected from disruption of OsSWEET14.
One of the potential advantages of genome engineering is that one can create a plant that has novel genetic variation and does not have a transgene. In their experiments with ZFNs and TALENs, respectively, both Curtin et al. (2011) and Li et al. (2012) genotyped T1 (first generation of a transgenic event) segregants to identify plants that had maintained the site-directed mutation while segregating away the engineered nuclease transgene (Fig. 2). (However, whole-genome sequencing will be required to confirm the absence of any transgene fragments in these lines.) Alternatively, viral-based methods for delivering engineered nucleases offer the possibility of circumventing transgenesis altogether. Marton et al. (2010) recently demonstrated that they could transiently deliver ZFNs to tobacco and petunia using a novel tobacco rattle virus (TRV)-based expression system. The TRV was capable of moving to developing buds, enabling the ZFN-mediated mutations to be transmitted into the next generation of the plant. Since the host plant genome is never genetically transformed, this viral-based delivery system may offer regulatory advantages for researchers interested in commercializing the resulting plant materials.
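The expected yield of transgene-free mutant segregants can be estimated with a small Mendelian enumeration. The sketch below is an idealized case and not a description of the cited experiments: it assumes a selfed T0 plant that is hemizygous for a single-copy nuclease transgene, heterozygous for a germline mutation at an unlinked target locus, and in which the nuclease creates no further mutations. The allele symbols are arbitrary labels.

```python
from fractions import Fraction
from itertools import product


def selfed_progeny(parent):
    """Genotype frequencies after selfing a parent with unlinked loci.

    `parent` is a tuple of loci, each given as a pair of alleles.
    """
    gametes = list(product(*parent))  # one allele drawn from each locus
    counts = {}
    for egg, pollen in product(gametes, repeat=2):
        genotype = tuple(tuple(sorted(pair)) for pair in zip(egg, pollen))
        counts[genotype] = counts.get(genotype, 0) + 1
    total = len(gametes) ** 2
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Locus 1: nuclease transgene ("T") or empty site ("-").
# Locus 2: induced mutation ("m") or wild-type allele ("+").
t0_plant = (("T", "-"), ("m", "+"))
t1 = selfed_progeny(t0_plant)

transgene_free_mutant = sum(
    freq for (transgene, target), freq in t1.items()
    if transgene == ("-", "-") and "m" in target
)
transgene_free_homozygous = sum(
    freq for (transgene, target), freq in t1.items()
    if transgene == ("-", "-") and target == ("m", "m")
)

print(transgene_free_mutant)      # T1 plants lacking the transgene but carrying the mutation
print(transgene_free_homozygous)  # subset that is homozygous for the mutation
```

Under these assumptions 3 of every 16 T1 plants carry the mutation without the transgene, and 1 in 16 is both transgene-free and homozygous mutant, so even a modest T1 family is likely to contain the desired segregants. Linkage between the two loci or multiple transgene insertions would lower these fractions.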
Examples of Targeted Gene Editing in Crop Species
The first reports of targeted gene editing in crops were published in tobacco and maize (Shukla et al., 2009; Townsend et al., 2009). Townsend et al. (2009) demonstrated high-frequency modification of tobacco acetolactate synthase genes (SurA and SurB), in which specific mutations confer resistance to certain herbicides. The authors used a yeast assay to identify three ZFN pairs that could promote DSBs at the target sites. Cleavage activity of the three candidate ZFNs was then tested against their endogenous gene targets in tobacco protoplasts. Gene disruption and repair by the NHEJ pathway were observed in the tobacco protoplasts. To test for HR-based gene replacement, the ZFN plasmids were electroporated into tobacco protoplasts along with SurB donor templates modified at positions known to confer herbicide resistance. Successful gene replacement was observed at frequencies ranging from 0.2 to 4%.
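Frequencies in this range translate directly into screening effort. As a rough planning sketch, and assuming (our simplification) that each screened cell or callus is an independent trial with the quoted frequency as its probability of carrying a gene replacement, the number of units that must be screened to recover at least one event with a chosen confidence follows from the binomial distribution.

```python
import math


def units_to_screen(event_frequency, confidence=0.95):
    """Smallest n with P(at least one event among n independent units) >= confidence."""
    if not 0 < event_frequency < 1:
        raise ValueError("event_frequency must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # P(no event in n units) = (1 - f)^n, so solve (1 - f)^n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - event_frequency))


# The two ends of the frequency range reported for gene replacement (0.2% and 4%).
for frequency in (0.002, 0.04):
    print(frequency, units_to_screen(frequency))
```

The twentyfold difference between the low and high ends of the range becomes a roughly twentyfold difference in screening effort, which is why a selectable outcome such as herbicide resistance is so valuable at the lower frequencies.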
Another example of gene editing using ZFNs was published by Shukla et al. (2009). The authors generated ZFNs that targeted the maize IPK1 gene, which encodes inositol-1,3,4,5,6-pentakisphosphate 2-kinase. This enzyme is an important catalyst in the phytate biosynthesis pathway, contributing to increased levels of total P in maize seed. Reduction of phytate by disruption of the IPK1 gene is of agricultural interest, since phytate is considered to be of limited nutritional benefit in feed and has been implicated as a pollutant in animal waste. A total of 66 ZFNs were designed across five target sites of the IPK1 locus using an archive of prevalidated two-finger modules (Shukla et al., 2009). Based on the ability to initiate DSBs and NHEJ repair in cultured maize cells, four ZFNs were selected to target the second exon of IPK1. The IPK1 locus was disrupted by ZFN-mediated DSBs and repaired by HR using donor templates with an herbicide resistance gene and short, locus-specific homology arms. Approximately 600 transformed herbicide-resistant calli were screened for gene targeting events, and several monoallelic insertions and one biallelic insertion of the herbicide cassette at the IPK1 locus were observed. All progeny exhibited the expected segregation frequencies, indicating that the insertions were effectively transmitted to the next generation.
Targeted Structural Changes in Crop Species
The advent of nuclease technology has also raised the potential to create site-directed structural changes, such as large deletions, in the genomes of crop species. These types of structural rearrangements have been demonstrated in human cell lines (Lee et al., 2012). Published examples using nuclease technology for structural alterations in plants include efforts to remove transgenic reporter and marker genes. Removal of a transgene flanked by ZFN cleavage sites was achieved by crossing it with a plant harboring the ZFN (Petolino et al., 2010). The resultant progeny had a complete deletion of the 4.3 kb transgene. The implications of this technology for crop improvement include the ability to remove unwanted marker genes after transgene integration, which may have important regulatory benefits. Other applications of the technology include the deletion of large regions of highly repetitive DNA. Such site-directed deletions may one day become an important tool for generating new traits in crops through the deletion of repetitive DNA or other unwanted loci.
Combining multiple transgenes into a genome is typically performed through standard breeding practices. Depending on the number of transgenic loci to be combined, this may necessitate the screening of large populations of progeny to obtain a plant with the desired assortment of loci. The molecular stacking of linked transgene loci has been a coveted technology in crop sciences. Current technologies include recombinase-mediated gene stacking using site-specific recombinase systems such as Cre-lox or the analogous FLP-FRT system (Ow, 2011). The arrival of engineered nuclease approaches may accelerate the implementation of molecular stacking technology, enabling multiple transgenes to be inserted at the same locus through HR.
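The breeding advantage of a molecular stack can be quantified under simple Mendelian assumptions. The sketch below assumes (for illustration only) that the transgenes to be combined sit at unlinked loci, that the plant being selfed is hemizygous at each of them, and that the breeder wants progeny homozygous for every transgene; a molecular stack at one site counts as a single locus.

```python
from fractions import Fraction


def fraction_fixed_for_all(n_independent_loci):
    """Fraction of selfed progeny homozygous for the transgene at every locus."""
    if n_independent_loci < 1:
        raise ValueError("at least one locus is required")
    # Each unlinked hemizygous locus is fixed in 1/4 of the progeny.
    return Fraction(1, 4) ** n_independent_loci


for n_loci in (1, 2, 3, 4):
    print(n_loci, fraction_fixed_for_all(n_loci))
```

Three unlinked transgenes are fixed together in only 1 of 64 progeny, whereas the same three transgenes stacked at one locus are fixed in 1 of 4, which is the sense in which linked transgenes simplify introgression into other germplasm.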
Additional Considerations for Crop Improvement
As illustrated above, sequence-specific nucleases clearly enable a wide variety of targeted modifications to plant genomes, and because of their many applications, one can anticipate that plant varieties generated through genome engineering will be regulated by governmental agencies in ways different from traditional transgenic plants (Kuzma and Kokotovich, 2011; Waltz, 2012). At present, regulatory authorities worldwide are reviewing the technology and making decisions as to how best to regulate plants modified using sequence-specific nucleases. It is likely that not all implementations of the technology will be regulated in the same way. Targeted gene knockouts, for example, do not necessarily involve the introduction of foreign DNA into the plant chromosome, but rather, DNA sequences are typically deleted. Other applications of sequence-specific nucleases, such as targeted gene replacements, can be used to create a broad spectrum of sequence alterations. But if a single nucleotide change is introduced into a plant gene and this is the only alteration to the genome, should the plant be regulated? Clearly traditional methods of mutagenesis could achieve the same outcome. Changes to regulatory policy will have important economic ramifications: the costs of attaining regulatory approval are considerable, and as such, only large multinational companies have traditionally had the requisite resources to advance genetically modified crop plants through the regulatory process. Furthermore, the crop species that have been genetically modified have been limited to those that are sufficiently profitable to recover the regulatory expense.
Considering the technology itself, it is clear that many of the technological barriers that currently limit the usefulness of genome engineering will be overcome within the coming years. However, this alone does not guarantee that genome engineering will significantly augment or replace current approaches to crop improvement. In fact, there are several species-specific external factors that may limit the future application of genome engineering practices toward crop improvement. For example, genome engineering is most immediately feasible in crop species that have a reference genome sequence that can be used to identify targets for DNA sequence modification. Furthermore, most applications of genome engineering currently require the capacity for efficient genetic transformation. Moreover, crop communities that are replete with genetic resources (e.g., mutant stocks) may be better positioned to make phenotypic predictions based on specific DNA modifications.
Many crop species currently lack some or all of these resources. Efficient genetic transformation pipelines have not been worked out for all crop species. Pre-genome resources, such as large-insert and expressed sequence tag libraries, do not provide sufficient genomewide information to allow for a good prediction of the impact that a given engineered nuclease may have on the genome (e.g., the number of targeted loci). Furthermore, even if a crop species is transformable and has a complete reference genome sequence, limitations in functional genomic resources can make it difficult to predict the phenotypic outcomes of targeted modifications. In species with poor functional genomics tools, there may be few genes that have been characterized. In this case, it will be difficult to assess which genes should be targeted for a specific trait outcome. The pool of good candidate genes will be small, and only the most promising may be worth the investment required for targeted modification. Functional characterization of orthologous genes in related model species may offer the best assessment of candidates for many crop species.
Despite the difficulties associated with implementing the technologies, genome engineering remains one of the hottest areas of genetic research at present. A decade ago the idea of purposefully targeting and modifying a specific nucleotide to create a given trait was a fanciful wish of crop breeders and geneticists. As the field of crop genomics has advanced and the technological barriers of genome engineering have been overcome, scientific fantasy has become reality.