A programmable dual-RNA-guided DNA endonuclease in adaptive bacteria l immunity supplementary
- 格式:pdf
- 大小:2.03 MB
- 文档页数:37
Our understanding of brain function at the cellular and circuit level has been greatly advanced by functional genomics and the availability of various genetic tools to decipher neuronal diversity and function and model human brain disorders in non-mammalian and mamma-lian organisms. Just as the development of chemical DNA mutagens 1 and RNA interference (RNAi)2 led to huge leaps in the fields of genetics and developmental biology — mainly as a result of research in non-mammalian organ-isms such as flies, worms and fish 3–5 — precise genetic modifications introduced by homologous recombination (HR) in embryonic stem cells (ESCs)6 paved the way for studying the mammalian brain and modelling human dis-eases in mice and rats. For example, many neurological disorders, such as Alzheimer disease, are associated with genetic risk factors that can be introduced and studied in animal models 7. In addition, novel approaches based on human ESCs and induced pluripotent stem cells (iPSCs) are changing the way that we model cellular processes under normal and pathological conditions in vitro . For exam-ple, human stem cells can be differentiated into neurons or glia to genetically dissect the molecular mechanisms of complex brain disorders in vitro 8–12. Genome-editing technologies are allowing researchers to take full advan-tage of both animal and cellular models and to work more easily with non-traditional model organisms for neuroscience research.Genome-editing tools based on site-specific DNA nucleases, including zinc-finger nucleases (ZFNs)13–15, tran-scription activator-like effector nucleases (TALENs)16–19 and the CRISPR-associated (Cas) effector proteins of clustered regularly interspaced short palindromic repeat (CRISPR) systems, such as Cas9 (REFS 20–25) and Cpf1 (REFS 26,27), have been developed to facilitate site-specific genomic modifications. In addition, ZFs 28, TALEs 29 and enzymatically inactive versions of Cas9 (known as deadCas9 (dCas9))30 can be coupled to functionally different enzymatic domains 30–35 or fluorescent proteins 36 to achieve targeted transcriptional control, epigenetic modification and DNA labelling (FIG. 1).ZFNs and TALENs recognize specific DNA sequences through protein–DNA interactions, whereas the DNA-specificity of Cas proteins is RNA-guided. To target Cas proteins to specific genomic loci, dual-guide RNAs or single-guide RNAs (sgRNAs)24,25,27,37,38 can be designed and generated quickly. Another key advantage of Cas proteins is that multiple sgRNAs can be used simulta-neously to edit multiple genes, which can be useful for studying genetic interactions and modelling multigenic disorders, something that previously required multiple cloning and complex protein engineering steps to achieve using ZFNs and TALENs.The benefits of using CRISPR–Cas systems to study the nervous system are highlighted by several successful applications in different animal species and cell types to study synaptic and circuit function 39–41, neuronal devel-opment 42–45 and diseases 41,46. Here, we describe how genome -editing tools, and in particular those based on CRISPR–Cas enzymes, are opening new avenues for neuro s cientific and biomedical research through the gen-eration of new model systems, both in vivo and in vitro , and discuss the challenges and possible future applications of this technology for understanding the brain.Overview of genome-editing strategiesSite-specific nucleases, including ZFNs, TALENs and Cas proteins, enable precise genetic modifications by inducing double-strand DNA breaks (DSBs) at target locations in the genome. Two highly conserved DNA-repair machinery pathways typically repair DSBs that would otherwise result in cell death: non-homologous end joining (NHEJ) and homology-directed repair1Broad Institute ofMassachusetts Institute of T echnology and Harvard, Cambridge, Massachusetts 02142, USA.2McGovern Institute for Brain Research, Massachusetts Institute of T echnology.3Department of Brain and Cognitive Sciences, Massachusetts Institute of T echnology.4Department of Biological Engineering, Massachusetts Institute of T echnology, Cambridge, Massachusetts 02139, USA.Correspondence to F .Z. zhang@ doi:10.1038/nrn.2015.2Published online 10 Dec 2015Functional genomicsThe study of gene functions and interactions in relationship to RNA transcripts and protein products using genome-wide data, and often involving high-throughput methods.RNA interference(RNAi). A technique used to knock down the expression of a specific gene by introducing a double-stranded RNA molecule that complements the gene of interest and triggers the degradation of the target mRNA.Applications of CRISPR–Cas systems in neuroscienceMatthias Heidenreich 1–4 and Feng Zhang 1–4Abstract | Genome‑editing tools, and in particular those based on CRISPR–Cas (clusteredregularly interspaced short palindromic repeat (CRISPR)–CRISPR‑associated protein) systems, are accelerating the pace of biological research and enabling targeted genetic interrogation in almost any organism and cell type. These tools have opened the door to the development of new model systems for studying the complexity of the nervous system, including animal models and stem cell‑derived in vitro models. Precise and efficient gene editing using CRISPR–Cas systems has the potential to advance both basic and translational neuroscience research.Knock-in or gene correction| Neuroscience a Precise gene editing5'5'3'3'NHEJ5'5'3'3'Indel mutationPremature stop codonb Chromosomal rearrangementc Large chromosomal deletiond5'3'5'3'Transcriptional5'3'5'5'3'3'dCas9Epigenetic CG Fluorescent Transcriptional control Epigenetic modulationDNA labellingHomologous recombination(HR). The exchange of homologous DNA strands between similar DNAmolecules, an event that occurs naturally during meiosis to generate genetic variation. HR is used to direct error-free repair of DNA double-strand breaks induced by DNAnucleases, such as zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularlyinterspaced short palindromic repeat (CRISPR)-associated (Cas) proteins.Embryonic stem cells(ESCs). T otipotent cells derived from embryos that can begenetically manipulated in vitro to generate transgenic,knock-in and knockout mice. ESCs can also be directed to differentiate into various cell types in vitro , including neurons and glial cells.Induced pluripotent stem cells(iPSCs). Pluripotent cells derived from reprogrammed differentiated adult cells; iPSCs have properties similar to those of embryonic stem cells and therefore can, in principle, be differentiated into all cell types of the body.Figure 1 | Genome-editing applications of CRISPR–Cas9. a | Non-homologous end-joining (NHEJ) and homology-directed repair (HDR) after a DNA double-strand break (DSB) is induced by zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) or clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein 9 (Cas9). ZFNs and TALENs recognize their DNA-binding site via protein domains that can be modularly assembled for each DNA target sequence. Cas9 recognizes its DNA-binding site via RNA–DNA interactions mediated by the short single-guide RNA (sgRNA), which can be easily designed and cloned. Theerror-prone NHEJ repair pathway 53 can result in the introduction of insertion or deletion (indel) mutations that can lead to a frame shift, the introduction of a premature stop codon and, consequently, gene knockout. The alternative repair pathway, HDR 14,47–53, can be used to introduce precise genetic modifications if a homologous DNA template is present. b | Two different sgRNAs guide Cas9 to induce DNA cleavage at two different genes, resulting in chromosomalrearrangements 116,117. c | Two proximate sgRNAs guide Cas9 to induce DNA cleavage at two different loci of the same gene, introducing large deletions 118,119. d | The nuclease-inactivated version of Cas9 (dead Cas9 (dCas9)) can be fused to different functional enzymatic domains to mediate transcriptional control, epigenetic modulation or fluorescent DNA labelling of specific genetic loci 30–36. HR, homologous recombination; M, methyl group.(HDR)14,47–55(FIG. 1a). The highly error-prone NHEJ pathway induces insertions and deletions (indels) of various lengths that can result in frameshift mutations and, consequently, gene knockout. By contrast, the HDR pathway directs a precise recombination event between a homologous DNA donor template and the damaged DNA site, resulting in accurate correction of the DSB. Therefore, HDR can be used to introduce specific muta-tions or transgenes into the genome. Because ZFNs and TALENs achieve specific DNA binding via protein domains, individual nucleases have to be synthesized for each target site. By contrast, Cas9 is guided by a specificity-determining guide-RNA sequence (CRISPR RNA (crRNA)) that is associated with a trans-activating crRNA (tracrRNA) and forms Watson–Crick base pairs with the complementary DNA target sequence, resulting in a site-specific DSB22,23,37,56. A simple two-component system (consisting of Cas9 from the bacterial species Streptococcus pyogenes24,25 or Staphylococcus aureus57 and a fusion of the tracrRNA–crRNA duplex to a sgRNA)37 has been engineered for expression in eukaryotic cells and can achieve DNA cleavage at any genomic locus of interest. More recently, Cpf1, a single-RNA-guided nucle-ase that does not use tracrRNA, has also been adapted for genome editing27. Hence, different Cas proteins can be targeted to specific DNA sequences simply by changing the short specificity-determining part of the guide RNA, which can be easily achieved in one cloning step.Gene editing across speciesNon-human animal models provide an experimental platform to dissect the complexity of the brain and study the cellular and molecular underpinnings of brain disor-ders. Neuroscience in particular benefits from exploit-ing a wide range of species, including worms, flies, fish and mammals, as well as non-traditional model sys-tems, such as birds and amphibians58. Disrupting gene expression is a common approach to study gene func-tion and understand loss-of-function disease muta-tions. For many years, RNAi was the ‘gold standard’ for gene silencing and studying gene function in vitro and in vivo59,60; however, genome editing based on engi-neered designer nucleases offers several advantages over RNAi (TABLE 1). For example, genome-editing tools can be modified to allow for more refined control of gene expression beyond simple gene knockdown, adding to their versatility (FIG. 1d).Multiplying the power of simple model organisms. At the molecular level, non-mammalian model systems can provide important information about fundamen-tal features of the nervous system as a result of their well-characterized genetic and cellular organization and amenability to a range of genetic tools. For exam-ple, many evolutionarily conserved genes involved in human neurological disorders such as Alzheimer dis-ease and Parkinson disease have been extensively studiedTable 1 |Comparison of approaches for gene knockdown or knockoutpolyethylenimine; RNAi, RNA interference; TALEN, transcription activator-like effector nuclease; TSOs, translation-suppressing oligonucleotides; ZF, zinc-finger; ZFN, ZF nuclease.Epigenetic mechanisms Multilayered cellular processes that modulate gene expression and function in response to interoceptive and environmental stimuli during development, adult life and ageing, including DNA methylation, post-translational histone modifications,ATP-dependent nucleosome and higher-order chromatin remodelling, non-codingRNA deployment and nuclear ing flies, worms and fish61–63. For years, studies usingthese simple model organisms relied mainly on geneticscreens using chemical mutagenesis and RNAi3–5 orimprecise methods for transposon excision and retro-viral insertion64–66. More-precise genetic modificationshave been achieved using ZFNs67–69, TALENs70–73 andCas proteins (reviewed in REF. 74). In the case of Casproteins, large numbers of RNA guides can be easilysynthesized to study gene function on a large scale. Bycontrast, generating large libraries based on ZFNs andTALENs is challenging owing to difficulties in design-ing and synthesizing these proteins with varying DNAbinding specificities (TABLE 1). In a proof-of-conceptstudy, approximately 50 genes were screened with Cas9and novel loci involved in electrical synapse formationin zebrafish were identified43. Such in vivo screeningapproaches in small model organisms offer an accessibleplatform to identify the genes involved in various aspectsof nervous system function and dysfunction.Rapid generation of mammalian models. The devel-opment of methods facilitating HR in ESCs6 enabledneuroscientists to study the effects of gene knockouts,mainly in mice. This approach has been significantlyenhanced by genome-editing technologies (FIG. 2a,b).Genome editing in single-cell embryos has been used togenerate mouse75, rat76 and primate models77,78 that canbe used to study the role of specific proteins in nerv-ous system function. Mouse and rat models provide abridge between our understanding of the molecularunderpinnings of the nervous system gleaned from stud-ies in non-mammalian systems and the complex pheno-types observed in human brain disorders. However,in some cases, a comprehensive understanding of thehuman brain will require primate models, which havebrains that are more similar to the human brain interms of neuroanatomical, physiological, perceptual andbehavioural characteristics.Transgenic approaches in primates are generally veryinefficient. However, successful insertion of transgenicalleles in primates, including macaques79,80 and the com-mon marmoset81, has been achieved using retroviral andlentiviral approaches in early embryos. For example, theviral insertion of a disease-related version of the humangene huntingtin (HTT) into the macaque genomerecapitulated clinical features of Huntington disease80,representing an important step forward for genetic-disease modelling in non-human primates. TALENshave also been successfully used in monkeys to modelmutations in methyl-CpG-binding protein 2 (MECP2),an X-linked Rett syndrome gene77, and genome-engineered primates have been generated by precise dis-ruption of single and multiple genes with Cas9 (REF. 78).The simplicity of the use of Cas proteins relative to thatof ZFNs and TALENs, and the ability to modify multiplegenes simultaneously, is a breakthrough that is alreadycatalysing molecular interrogation of neurological andpsychiatric dysfunctions in disease-relevant brain cir-cuits using primate models78,82. The ability to examinebrain function in genetically modified non-humanprimates has the potential to contribute significantly toour understanding of higher cognitive functions and tothe development of new therapeutic strategies for dis-eases that cannot be adequately modelled in rodents.However, such research raises important bioethical ques-tions and requires careful consideration of the costs andb enefits before moving forward.In vivo gene editing in the brainIn vivo gene editing allows the systematic genetic dissec-tion of neuronal circuits and the ability to model patho-logical conditions while bypassing the need to engineergermline-modified mutant strains. This experimentalapproach is fast, independent of genetic background,animal species and availability of ESCs, and can beapplied to existing disease models and transgenic strains,as well as to aged animals to study age-related neurologi-cal changes (FIG. 2c). In vivo methods based on RNAi havebeen commonly used to reduce the expression of genesin the brain83. In addition, alternative methods basedon DNA antisense oligonucleotides (ASOs) can be usedfor gene silencing and have been shown to be promis-ing therapeutic molecules for suppressing pathogenicprotein aggregates in the brain84,85. However, neitherstrategy allows the generation of stable gene knockoutsor site-specific epigenetic modifications (TABLE 1). In themouse brain, histone modifications and transcriptionalcontrol have been achieved using ZFs86 and TALEs32, andCas9 has been used to induce indel mutations in neuronsto achieve stable gene knockouts in living animals39,41.This demonstrates the capacity for spatial and temporalcontrol of gene expression in fully developed circuitsand also opens the door to probing epigenetic dynam-ics30–33,35 in the brain. Epigenetic control is of particularinterest, as there is increasing evidence that epigeneticmechanisms, such as histone modifications and DNAmethylation, play a part in learning, memory formationand the pathology of neuropsychiatric disorders87. UsingCas proteins, functional domains of DNA-methylationor -demethylation enzymes or histone modifiers can beeasily targeted to specific DNA sequences to edit theepigenome with high spatial and temporal specificityin vivo(FIG. 1d).Delivery to the brain. Viral vectors are a promising modefor delivery of Cas proteins to the brain. Viral vectorshave defined, tissue-specific or cell type-specific tro-pism and can be admitted either locally to the brain orthrough the bloodstream to achieve more-systematictissue penetration88. The most-attractive gene-deliveryvectors are adeno-associated viruses (AAVs), whichafford long-term expression without genomic inte-gration, are relatively safe and are non-pathogenic89,90.However, AAV vectors have limited transgene capacity,and the large size of the commonly used S. pyogenes Cas9variant poses a significant challenge for AAV-mediateddelivery41,91. AAV-mediated delivery may become evenmore challenging when Cas9 is enlarged by the fusionof additional functional domains. Smaller Cas9 ortho-logues, such as those derived from S. aureus, are easierto pack57, making them an attractive option for in vivogenome editing in the brain.| Neuroscience Other techniques have been also used to deliver Cas9 and RNA guides to the brain, including in utero electroporation 39 and polyethylenimine (PEI)-mediated transfection 46. In rodents, electroporation and PEI-mediated transfection are easy to use, fast and efficient at delivering large plasmid DNA into a highnumber of neurons. However, two drawbacks of these techniques are their low spatial accuracy of transgene expression and the necessity of prenatal intervention, which often results in low viability and targeting of mitotic neuronal precursors instead of postmitotic, differentiated neurons.Figure 2 | Using Cas9 to generate genetically modified rodents and for in vivo genome editing. a ,b | Comparison ofthe timelines of traditional gene targeting using classic homologous recombination (HR) in embryonic stem cells (ESCs;part a ) and clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein 9 (Cas9) gene targeting in one‑cell embryos (part b ). There are two main time- and cost-intensive phases of the HR approach. First, the design and cloning of the targeting vector, ESC transduction and selection, and generation of chimaeras. Second, the backcrossing of mice to a desired background and/or cross-breeding to generate multiple genetically modified animals. By contrast, cloning of short single-guide RNA (sgRNA) into a targeting vector, verification of sgRNA on-target efficiency (through the surveyor nuclease assay or sequencing), Cas9–sgRNA microinjection and founder identification are relatively easy and fast 120. Because embryos can be obtained from any mouse strain and multiple genes can be targeted simultaneously, genetic backcrossing and cross-breeding are not required. c | Cas9 nucleases also enable precise in vivo genome editing of specific cell types in the mammalian brain on a relatively short timescale. Cas9 is cloned under the control of cell type-specific promoters, and sgRNA efficiency is validated in vitro before being packaged into viral vectors, such as adeno-associated viruses (AAVs). sgRNA can then be stereotactically delivered into the brains of mice that haveendogenous Cas9 expression (Cas9 mice)91, or the sgRNA can be delivered together with Cas9 into wild-type mice 41 or rats, aged animals, disease models or reporter lines. In vivo genome editing in the brain is not limited to rodents and can theoretically be applied to other mammalian systems, including non-human primates. GFAP , glial fibrillary acidic protein; Neo, neomycin anitibiotic selection marker; SYN, human synapsin promoter.LiposomesLipid vesicles artificially formed by sonicating lipids in an aqueous solution. Liposomes can be packed with negatively charged molecules to deliver them into cells and are therefore promising vehicles for therapeutic applications. Cre–loxP recombinationA site-specific recombination system derived from Escherichia coli bacteriophage P1. T wo short DNA sequences (loxP sites) are engineered to flank the target DNA. Activation of the Cre recombinase enzyme catalyses recombination between the loxP sites, leading to excision of the intervening sequence.Alternatively, Cas9 protein itself, rather than the DNAor RNA that encodes it, could be delivered, an approachthat is particularly interesting for protein-based thera-peutics. The anionic nature of sgRNA allows the integra-tion of Cas9–sgRNA complexes into cationic liposomes,a commonly used DNA-, RNA- and protein-deliverytool. Liposome Cas9–sgRNA complexes have alreadybeen successfully used to achieve genome editing in themouse inner ear92. Therefore, lipid-mediated delivery ofCas9 may also serve as powerful tool for genome editingin the brain in the future.Cell type-specific genome editing. In the mammalianbrain, there are probably several hundred neuronal sub-types, each with distinct morphological, biophysical,biochemical and computational functions. Thus, celltype-specific tools are required to dissect this hetero-geneous tissue. Research has shown that the malfunctionof specific cell types in different brain regions contrib-utes to diverse symptoms usually connected with neuro-psychiatric symptoms, such as hallucinations, depressionand repetitive motor behaviour93. This highlights theneed to pinpoint causal relationships between celltypes within the context of relevant neuronal networks,genetics and behavioural dysfunction, which will requireprecise genome editing in specific cellular subtypes. Site-specific Cre–loxP recombination elements that enable thecontrol of the spatiotemporal expression of Cas9 havebeen introduced in fish94 and mouse embryos91, andsimilar approaches could achieve precise gene editing indefined cell types in vivo. The vast number of establishedCre-driver mouse lines95 and inducible Cas9 systems96–98can, when combined with conditional gene-targetingstrategies, provide enormous combinatorial power todecipher the logic of complex neuronal networks andtheir role in neurological disorders in vivo.In vivo efficiency and specificity. In postmitotic neu-rons, Cas9 has been successfully used to introduce sin-gle39–41,46,91 and multiple DSBs41,46 resulting in NHEJ andefficient formation of indel mutations. For example,AAV delivery of Cas9 and sgRNA targeting Mecp2 in theadult mouse brain resulted in the local loss of more than70% of MECP2, which was sufficient to recapitulate phe-notypes observed in classic Mecp2-mutant mouse mod-els and patients with Rett syndrome41. In another study,Cas9-mediated deletion of common tumour-suppressorgenes, such as patched homologue 1 (Ptch1), Trp53 (alsoknown as Tp53), phosphatase and tensin homologue(Pten) and neurofibromin 1 (Nf1), in the cerebellum orforebrain efficiently induced the formation of medullo-blastoma or glioblastoma tumours, respectively46.Despite this success, the validation of Cas-mediatedgene editing in the brain is still challenging, and sensitivemethods are required for analysing Cas efficiency andspecificity in targeted brain regions (BOX 1).Although NHEJ in postmitotic neurons has beendemonstrated to be active, it remains unclear how effi-cient HDR is in postmitotic cells. It is commonly believedthat HDR predominantly occurs in the S and G2 phasesof the cell cycle99,100, and HDR is therefore thought to berare in non-dividing cells, such as neurons. Introductionor correction of precise genetic mutations via HDR in thebrain would validate disease mutations in vivo and openthe door to therapeutic applications of genome editing inbrain disorders. Thus, future work should focus on iden-tifying and activating signalling pathways required fortriggering HDR in differentiated cells. However, it shouldalso be noted that gene insertion has been achievedthrough NHEJ pathways, which may allow us to insertDNA into neurons and glia101.In contrast to precise gene knockout and insertion,genome editing aimed at transcriptional regulation andepigenetic modulation may be less challenging in thebrain, as these approaches are independent of DNA-repair pathways. Achieving epigenetic and transcriptionalcontrol in neurons can aid in the study of the molecu-lar mechanisms of natural gene silencing in the nervoussystem and can help us to better understand neurolog-ical disorders associated with gene imprinting, such asAngelman syndrome102.Gene editing in human iPSCsCombinatorial approaches based on iPSC technologyand genome editing offer another approach to modelhuman neurological disorders in vitro. A key advan-tage of this approach is that genetic modifications canbe studied in different human genetic backgrounds,because iPSCs retain all of the individual donor’s geneticinformation. This is particularly important for complex| NeurosciencebDefective neuron sgRNA candidate gene A + BsgRNA candidate gene ANeuron with mild phenotype sgRNA candidate gene Bneurological disorders, because genetic variants associ-ated with such diseases act in concert with many other alleles. Another advantage is that the genetically modi-fied cells can be differentiated into almost any cell type, including those that are not easily accessible in patients, such as neurons and glia.iPSC-based disease models have been generated for several neurological disorders, including Parkinson 10,11, Alzheimer 9 and Huntington 8 diseases, and they have been proven to closely mimic cellular and molecular fea-tures of human diseases. Genome-editing tools applied to these models can be used to examine the genetic link between risk variants and cellular pathways involved in multigenic neurological disorders in a high-throughput manner (FIG. 3). Furthermore, specific signalling path-ways involved in the pathogenesis of the disease can be precisely dissected to gain insight into the molecular mechanisms of the disease and to identify new drug tar-gets 10. Gene editing may be performed either in iPSCs or induced later in differentiated cells 96,98, allowing forthe investigation of phenotypes that arise during cell differentiation, which may be relevant when studying neurodevelopmental aspects of a disease such as Rett syn-drome 103–105. In addition, inducing or rescuing a pheno-type in differentiated cells will be useful for validating potential therapeutic applications.Future perspectivesGenome-editing technologies allow for the introduction of genetic modifications into almost any cell type and organism. For example, Cas9 has already been used to alter genes in species such as killifish 106 and salaman-der 107, which are commonly used to study ageing and tissue regeneration, respectively. It may also open up the possibility of developing models in other species of interest to neuroscience research, such as social insects or songbirds 58, which have thus far been intractable to genetic modification. In addition to the generation of new model systems, including iPSC-derived in vitro models, genome editing in combination with single-cellFigure 3 | In vitro applications of Cas9 in human iPSCs. a | Evaluation of disease candidate genes from large-populationgenome-wide association studies (GWASs). Human primary cells, such as neurons, are not easily available and are difficult to expand in culture. By contrast, induced pluripotent stem cells (iPSCs ) derived from somatic cells (such as fibroblasts) of healthy individuals or patients with neurological disorders can be differentiated into neurons and cultured in vitro 8–12. Disease candidate genes can be examined in two ways. Site-specific homologous recombination (HR) of the candidate gene using clustered regularly interspaced short palindromic repeat (CRISPR)‑associated protein (Cas) nucleases can be applied in disease-affected cells (left). If this rescues disease phenotypes (as for candidate gene B in the example shown), the validity of the candidate gene is confirmed. Alternatively, candidate genes can be mutated in healthy cells (right). Where this recapitulates disease pathogenesis in vitro (as in the case of candidate gene B), the validity of the candidate gene is confirmed. b | The contribution of specific genetic loci to multigenic disorders, such as Alzheimer or Parkinson diseases, can also be systematically evaluated using Cas‑mediated single and multiplex genome editing. This may enable dissection of possible synergistic effects (as shown for candidate genes A and B) and screening for functional correlations between disease phenotypes and distinct gene mutations. sgRNA, single-guide RNA.。
CRISPR-Cas9植物基因敲除试剂盒说明书Cat. No. GP0138Protocol No. PT140730-1出版日期July.2014南京市汉中门大街301号南京国际服务外包产业园01栋13层A座电话:+86-25-66776700/66776718传真:+86-25-66776701邮编:210036网址:CRISPR-Cas9植物基因敲除试剂盒目录I.组份列表 (2)II.产品概要 (2)III.操作步骤 (3)1. 设计Oligo DNA 序列 (3)2. 线性化pP1C.2载体 (3)3. 重组pP1C.2载体的构建 (3)4. 重组pP1C.1载体的构建 (3)IV.附录pP1C系统质粒图谱..................................................................................4-5 V.参考文献. (6)图Figure 1. CRISPR-Cas9工作原理示意图Figure 2. pP1C.1质粒图谱Figure 3. pP1C.2质粒图谱表Table1. CRISPR-Cas9植物基因敲除试剂盒组份列表Protocol No. PT140730-1 Genloci Biotechnologies Inc.1/6CRISPR-Cas9植物基因敲除试剂盒I.组份列表CRISPR-Cas9基因敲除试剂盒组份如下:Table1. CRISPR-Cas9植物基因敲除试剂盒组份列表组份列表含量pP1C.1 Vector2μgpP1C.2 Vector 2μg贮存条件※注意:在收到货后,请您将pP1C.1 Vector 和pP1C.2 Vector瞬时离心后再保存。
将pP1C.1 Vector、pP1C.2 Vector置于-20℃保存,且避免污染;II.产品概要CRISPR/Cas9植物基因敲除试剂盒是一款高效、精准的基因编辑试剂盒,它是基于最新代的人工核酸内切酶CRISPR-Cas9而研发的,相对于传统敲除技术和TALENs\ZFNs技术而言,其操作更加简便,敲除效率最高,基因的编辑更加的精准,大大降低了脱靶机率。
互作转录组测序(Dual RNA-seq )物种间存在广泛的相互作用,包括寄生、共生、竞争等,而普通的转录组只能研究单一物种的信息,不仅浪费一部分数据,而且在分离中也会对样本本身造成一定的影响。
互作转录组(Dual RNA-seq ),无需分离两物种,避免分离造成的干扰,只构建一个转录组文库,便能同时对2个(或多个)物种进行测序和分析,从而揭示互作物种间基因表达的动态变化。
同时,还可以通过互作模型图获得物种间基因的调控关系,得到两物种间的相互作用机制。
研究互作过程中的调控网络、病原菌感染宿主致病机理和宿主抗病的机制;研究致病性菌在不同物种之间的进化关系、基于同源基因进一步发掘正选择相关基因等。
技术参数产品测序 策 略 推 荐 数 据周 期 互作转录组测序300bp RNA 文库 HiSeq PE150测序真核对照样本:6Gb clean data 原核对照样本:2Gb clean data真核-真核、真核-原核互作样本:10G clean data40~50个 工作日技术流程技术特点(1)自主研发的建库技术,有效降低起始量;(2)可结合UMI标签技术,提高比对率,让表达量分析精确度更上一个台阶;(3)强大的生信分析团队,提供多种个性化分析;(4)从方案设计、建库测序到数据挖掘,贴心全程陪伴,全力帮助您的研究。
部分结果展示差异基因表达GO富集分析共表达网络分析差异可变剪接事件火山图蛋白互作网络分析案例解析沙门氏菌感染HeLa细胞过程中非编码RNA的调控作用为了研究细菌感染细胞后两者的相互作用,作者使用带绿色荧光标记的沙门氏菌感染HeLa细胞,利用绿色荧光分离感染的细胞。
然后分别对未感染细胞和感染后2~24小时的细胞进行Dual RNA-seq,同时研究感染过程中两个物种的mRNA和非编码RNA变化。
他们锁定了small RNA——PinT,在细菌感染宿主细胞的最初4个小时,PinT的表达量上调最为显著(图a、b),在感染过程中增加了近100倍。
The CRISPR–Cas modules are adaptive immune sys-tems that are present in most archaea and many bacte-ria1–5 and provide sequence-specific protection against foreign DNA or, in some cases, RNA6. A CRISPR locus consists of a CRISPR array, comprising short direct repeats separated by short variable DNA sequences (called ‘spacers’), which is flanked by diverse cas genes. CRISPR–Cas immunity involves three distinct mechanis-tic stages: adaptation, expression and interference7–11. The adaptation stage involves the incorporation of fragments of foreign DNA (known as ‘protospacers’) from invad-ing viruses and plasmids into the CRISPR array as new spacers. These spacers provide the sequence memory for a targeted defence against subsequent invasions by the corresponding virus or plasmid. During the expres-sion stage, the CRISPR array is transcribed as a precursor transcript (pre-crRNA), which is processed and matured to produce CRISPR RNAs (crRNAs). During the inter-ference stage, crRNAs, aided by Cas proteins, function as guides to specifically target and cleave the nucleic acids of cognate viruses or plasmids7,9,12,13. Recent stud-ies suggest that CRISPR–Cas systems can also be used for non-defence roles, such as the regulation of collective behaviour and pathogenicity14–16.Numerous, highly diverse Cas proteins are involved in the different stages of CRISPR activity (BOX 1;see Supplementary information S1 (table)). Briefly, Cas1 and Cas2, which are present in most known CRISPR–Cas systems, form a complex that represents the adap-tation module and is required for the insertion of spacers into CRISPR arrays17,18. Protospacer acquisition in many CRISPR–Cas systems requires recognition of a short protospacer adjacent motif (PAM) in the target DNA19–22. During the expression stage, the pre-crRNA molecule is bound to either Cas9 (which is a single, multidomain protein) or to a multisubunit complex, forming the crRNA–effector complex. The pre-crRNA is processed into crRNAs by an endonuclease subunit of the multisubunit effector complex23 or via an alterna-tive mechanism that involves bacterial RNase III and an additional RNA species, the tracrRNA (trans a ctivating CRISPR RNA)24. Finally, at the interference stage, the mature crRNA remains bound to Cas9 or to the multi-subunit crRNA–effector complex, which recognizes and cleaves the cognate DNA10,11,25,26 or RNA26–31.The rapid evolution of most cas genes32–34 and the remarkable variability in the genomic architecture of CRISPR–cas loci poses a major challenge for theAn updated evolutionary classificationof CRISPR–Cas systemsKira S. Makarova1, Yuri I. Wolf1, Omer S. Alkhnbashi2, Fabrizio Costa2, Shiraz A. Shah3,Sita J. Saunders2, Rodolphe Barrangou4, Stan J. J. Brouns5, Emmanuelle Charpentier6,Daniel H. Haft1, Philippe Horvath7, Sylvain Moineau8, Francisco J. M. Mojica9,Rebecca M. T erns10, Michael P. T erns10, Malcolm F. White11, Alexander F. Yakunin12,Roger A. Garrett3, John van der Oost5, Rolf Backofen2,13 and Eugene V. Koonin1Abstract | The evolution of CRISPR–cas loci, which encode adaptive immune systems inarchaea and bacteria, involves rapid changes, in particular numerous rearrangements of thelocus architecture and horizontal transfer of complete loci or individual modules. Thesedynamics complicate straightforward phylogenetic classification, but here we present anapproach combining the analysis of signature protein families and features of thearchitecture of cas loci that unambiguously partitions most CRISPR–cas loci into distinctclasses, types and subtypes. The new classification retains the overall structure of theprevious version but is expanded to now encompass two classes, five types and 16 subtypes.The relative stability of the classification suggests that the most prevalent variants ofCRISPR–Cas systems are already known. However, the existence of rare, currentlyunclassifiable variants implies that additional types and subtypes remain to be characterized.1National Center forBiotechnology Information,National Library ofMedicine, National Institutesof Health, Bethesda,Maryland 20894, USA.Correspondence to E.V.K.e-mail:koonin@doi:10.1038/nrmicro3569Published online28 September 2015ANALYSISconsistent annotation of Cas proteins and for the clas-sification of CRISPR–Cas systems13,35. Nevertheless, a consistent classification scheme is essential for expedi-ent and robust characterization of CRISPR–cas loci in new genomes, and thus important for further progress in CRISPR research. Owing to the complexity of the gene composition and genomic architecture of the CRISPR–Cas systems, any single, all-encompassing classification criterion is rendered impractical, and thus a ‘polythetic’ approach based on combined evidence from phylo-genetic, comparative genomic and structural analy s is was developed13. At the top of the classification hierar-chy are the three main types of CRISPR–Cas systems (type I–type III). These three types are readily distin-guishable by virtue of the presence of unique signature proteins: Cas3 for type I, Cas9 for type II and Cas10 for type III13. Within each type of CRISPR–Cas system, several subtypes have been delineated based on addi-tional signature genes and characteristic gene arrange-ments13,35. Recently, in-depth sequence and structural analysis of the effector complexes from different vari-ants of CRISPR–Cas systems has uncovered common principles of their organization and function4,30,31,36–46. In parallel, the biotechnological development of molecu-lar components of type II CRISPR–Cas systems into a powerful new generation of genome editing and engi-neering tools has triggered intensive research into the functions and mechanisms of these systems, thereby advancing our understanding of the Cas proteins and associated RNAs47,48.In this Analysis article, we refine and extend the clas-sification of CRISPR–cas loci based on a comprehensive analysis of the available genomic data. As a result of this analysis, we introduce two classes of CRISPR–Cas sys-tems as a new, top level of classification and define two putative new types and five new subtypes within these classes, resulting in a total of five types and 16 subtypes. We employ this classification to analyse the evolution-ary relationships between CRISPR–cas loci using several measures. The results of this analysis highlight pro-nounced modularity as an emerging trend in the evolu-tion of CRISPR–Cas systems. Finally, we demonstrate the potential for automated annotation of CRISPR–cas loci by developing a computational approach that uses the new classification to assign CRISPR–Cas system subtype with high precision.Classification of CRISPR–cas lociThe classification of CRISPR–Cas systems should ide-ally represent the evolutionary relationships between CRISPR–cas loci. However, the pervasive exchange and divergence of cas genes and gene modules has resulted in a complex network of evolutionary relationships that cannot be readily (and cleanly) partitioned into a small number of distinct groupings (although such partition-ing might be achievable for individual modules, see below). Therefore, we adopted a two-step classifica-tion approach that first identified all cas genes in each CRISPR–cas locus and then determined the signature genes and distinctive gene architectures that would allow the assignment of these loci to types and subtypes.To robustly identify cas genes, which is a non-trivial task owing to high sequence variability, we developed a library of 394 position-specific scoring matrices (PSSM)49 for all 93 known protein families associ-ated with CRISPR–Cas systems (see Supplementary information S2 (table)). Importantly, this set included 229 PSSMs for recently characterized families that were not part of the previous CRISPR–Cas classification13. The PSSMs were used to search the protein sequences annotated in 2,751 complete archaeal and bacterial genomes that were available at the National Center for Biotechnology Information (NCBI) as of 1 February 2014 (see Supplementary information S3 (box) for a detailed description of the methods). A highly signifi-cant similarity threshold was used to identify bona fide cas genes. Genes that were located in the same genomic neighbourhood as bona fide cas genes (irrespectively of their proximity to a CRISPR array) and that encoded proteins with moderate similarity to Cas PSSMs were then identified as putative cas genes. This two-step pro-cedure was devised to minimize the false-positive rate, while allowing the detection of diverged variants of Cas proteins.Gene neighbourhoods around the identified cas genes were merged into 1,949 distinct cas loci from 1,302 of the 2,751 analysed genomes, including 1,694 complete loci. A cas locus was annotated as ‘complete’ if it encom-passed at least the full complement of genes for the main components of the interference module (the multisub-unit crRNA–effector complex or Cas9). This criterion was adopted because, although the adaptation module genes cas1 and cas2 are the most common cas genes,many otherwise complete (and hence thought to beA N A LY S I Sfunctionally active) CRISPR–Cas systems lack cas1 and cas2 and seem to instead depend on adaptation mod-ules from other loci in the same genome. Within the set of complete loci, 111 composite loci that contained two or more adjacent CRISPR–Cas units (each consisting of at least a full complement of essential effector complex components) were identified and split into distinct units. Each locus or unit was classified by scoring type-specific and subtype-specific PSSMs that were constructed from multiple sequence alignments of the respective signa-ture Cas proteins (see Supplementary information S2,S4 (tables)). For some of the more diverged signature pro-teins, multiple PSSMs were required for a single protein to capture the entire diversity of the cognate CRISPR–Cas subtype.Of the single-unit complete loci, 1,574 (93%) were assigned to a specific subtype or the newly defined putative types IV and V, which are not split into sub-types, eight were identified up to the type only and one remained unclassified by our procedure (a subtype I-D system operon that is adjacent to the remnants of a sub-type III-B system operon disrupted by recombination).Our analysis suggests that the CRISPR–Cas systems can be divided on the basis of the genes encoding the effector modules; that is, whether the systems have sev-eral variants of a multisubunit complex (the CRISPR-associated complex for antiviral defence (Cascade) complex, the Csm complex or the Cmr complex) or Cas9. Thus, we introduce a new, broadest level of classification of CRISPR–Cas systems, which divides them into ‘class 1’ and ‘class 2’. Class 1 systems possess multisubunit crRNA–effector complexes, whereas in class 2 systems all functions of the effector complex are carried out by a single protein, such as Cas9. We also find evidence for two putative new types, type IV and type V, which belong to class 1 and class 2, respectively. These observations result in a new classification system in which CRISPR–Cas systems are clustered into five types, each with a distinctive composition of expres-sion, interference and adaptation modules (FIG. 1). These five types are divided into 16 subtypes, including five new subtypes (II-C, III-C and III-D, together with the single subtypes of type IV and type V systems), as detailed below.Type I Pre-crRNA Effector module (crRNA and target binding)Target cleavageSpacer insertionRegulation Expression C l a s s 1C l a s s 2InterferenceAdaptation AncillaryHelper orType II Type III Type IV Type V | MicrobiologyClass 1 CRISPR–Cas systemsClass 1 CRISPR–Cas systems are defined by the pres-ence of a multisubunit crRNA–effector complex. The class includes type I and type III CRISPR–Cas systems, as well as the putative new type IV .Type I CRISPR–Cas systems. All type I loci contain the signature gene cas3 (or its variant cas3ʹ), which encodes a single-stranded DNA (ssDNA)-stimulated superfamily 2 helicase with a demonstrated capacity to unwind double-stranded DNA (dsDNA) and RNA–DNA duplexes 50–52. Often, the helicase domain is fused to a HD family endo-nuclease domain that is involved in the cleavage of the target DNA 50,53. The HD domain is typically located at the amino terminus of Cas3 proteins (with the exception of subtype I-U and several subtype I-A systems, in which the HD domain is at the carboxyl terminus of Cas3) or is encoded by a separate gene (cas3ʹʹ) that is usually adjacent to cas3ʹ (FIG. 1).Type I systems are currently divided into seven sub-types, I-A to I-F and I-U, all of which have been defined previously 13. In the case of subtype I-U, U stands for uncharacterized because the mechanism of pre-crRNA cleavage and the architecture of the effector complex for this system remain unknown 33. The type I-C, I-D, I-E and I-F CRISPR–Cas systems are typically encoded by a single (predicted) operon that encompasses the cas1, cas2 and cas3 genes together with the genes for thesubunits of the Cascade complex (BOX 2). By contrast, many type I-A and I-B loci seem to have a different organization in which the cas genes are clustered in two or more (predicted) operons 35. In most type I loci, each of the cas gene families is represented by a single gene.Each type I subtype has a defined combination of signature genes and distinct features of operon organi-zation (FIG. 2; see Supplementary information S4 (table)). Notably, cas4 is absent in I-E and I-F systems, and cas3 is fused to cas2 in I-F systems. Subtypes I-E and I-F are monophyletic (that is, all systems of the respective subtype are descended from a single ancestor) in phylo-genetic trees of Cas1 and Cas3, and each has one or more distinct signature genes (see Supplementary information S4,S5,S6 (table, box, box)).Subtypes I-A, I-B and I-C seem to be descendants of the ancestral type I gene arrangement (cas1–cas2–cas3–cas4–cas5–cas6–cas7–cas8)4,54. This arrangement is pre-served in subtype I-B, whereas subtypes I-A and I-C are diverged derivatives of I-B with differential gene loss and rearranged gene orders . A single signature gene for each of these subtypes could not be defined. The only protein that shows no significant sequence similarity between the subtypes is Cas8. However, the Cas8 sequence is highly diverged even within subtypes, so that consist-ent application of the signature gene approach would result in numerous new subtypes. For example, there are at least 10 distinct Cas8b families within subtype I-BFigure 1 | Functional classification of Cas proteins. Protein names follow the current nomenclature and classification 13.An asterisk indicates that the putative small subunit (SS) protein is instead fused to Cas8 (the type I system large subunit (LS)) in several type I subtypes 33. The type III system LS and type IV system LS are Cas10 and Csf1 (a Cas8 family protein), respectively. Dispensable components are indicated by dashed outlines. Cas6 is shown with a solid outline for type I because it is dispensable in some but not most systems and by a dashed line for type III because most systems lack this gene and use the Cas6 provided in trans by other CRISPR–cas loci. The two colours for Cas4 and three colours for Cas9 reflect that these proteins contribute to different stages of the CRISPR–Cas response. The functions shown for type IV and type V system components are proposed based on homology to the cognate components of other systems, and have not yet been experimentally verified. The functional assignments for Cpf1 are tentatively inferred by analogy with Cas9 (only the RuvC (and TnpB)-like domains of the two proteins are homologous). CARF , CRISPR-associated Rossmann fold; pre-crRNA, pre-CRISPR RNA. This research was originally published in Biochem. Soc. Trans. Makarova K. S., Wolf Y . I., & Koonin E. V . The basic building blocks and evolution of CRISPR–Cas systems. Biochem. Soc. Trans. 2013; 41: 1392–1400 © The Biochemical Society.A N A LY S I S| Microbiology dCas6e Cas7 x 6Cas7 (Cmr4) x 4Cas5 (Cmr3)Small subunit x 2Large subunit (Cas10 (Cmr2))Cmr1 andCmr6 (Cas7group)and at least 8 Cas8a families within subtype I-A (see Supplementary information S2 (table)). Thus, notwith-standing its complex evolution, we retain subtype I-B, which is best defined by the ancestral type I gene com-position. The three main subdivisions within subtype I-B roughly correspond to the previously described subtypes Hmari, Tneap and Myxan 32 (see also TIGRFAM direc-tory), and now could be defined through specific Cas8bfamilies, Cas8b1 (for Hmari), Cas8b2 (for Tneap) and Cas8b3 (for Myxan), with a few exceptions.A subset of subtype I-B systems defined by the pres-ence of the cas8b1 gene has been described as subtype I-G in the recent classification of archaeal CRISPR–Cas sys-tems 35. However, inclusion of bacterial CRISPR–Cas leads to increased diversity within subtype I-B so that if subtype I-G is recognized, consistency would requirecarboxy-terminal domain of the large subunit is predicted to functionally replace the small subunit in complexes where the. In type III systems, the large subunit is the putative cyclase-related enzyme encoded by cas10726 | NOVEMBER 2015 | VOLUME 13/reviews/microI-FYersinia pseudotuberculosis YPIII YPK_1644–YPK_1649IVAcidithiobacillus ferrooxidans ATCC 23270AFE_1037–AFE_1040II-A Streptococcus thermophilus CNRZ1066str0657–str0660II-CVNeisseria lactamica 020-06NLA_17660–NLA_17680Francisella cf. novicida Fx1FNFX1_1431–FNFX1_1428II-BLegionella pneumophila str. Paris lpp0160–lpp0163I-EEscherichia coli K12ygcB–ygbFGeobacter sulfurreducens GSU0051–GSU0054GSU0057–GSU0058I-UI-CBacillus halodurans C-125BH0336–BH0342I-AArchaeoglobus fulgidus AF1859, AF1870–AF1879I-BClostridium kluyveri DSM 555CKL_2758–CKL_2751 I-DCyanothece sp. PCC 8802Cyan8802_0527–Cyan8802_0520III-A Staphylococcus epidermidis RP62ASERP2463–SERP2455III-B Pyrococcus furiosus DSM 3638PF1131–PF1124III-C Methanothermobacterthermautotrophicus str. Delta HMTH328–MTH323III-DRoseiflexus sp. RS-1RoseRS_0369–RoseRS_0362C l a s s 2C l a s s 1| Microbiology Figure 2 | Architectures of the genomic loci for the subtypes of CRISPR–Cas systems. Typical operon organization is shown for each CRISPR–Cas system subtype. For each repre s entative genome, the respective gene locus tag names are indicated for each subunit. Homologous genes are colour-coded and identified by a family name. The gene names follow the classification from REF . 13. Where both a systematic name and a legacy name are commonly used, the legacy name is given under the systematic name. The small subunit is encoded by either csm2, cmr5, cse2 or csa5; no all-encompassing name has been proposed to collectively describe this gene family to date. Crosses through genes encoding the large subunit (Cas8 or Cas10 family members) indicate inactivation of the respective catalytic sites. Genes and gene regions encoding components of the interference module (CRISPR RNA (crRNA)–effector complexes or Cas9 proteins) are highlighted with a beige background. The adaptation module (cas1 and cas2) and cas6 are dispensable in subtypes III-A and III-B; in particular, they are rarely present in subtype III-B (dashed lines). Dark green denotes the CARF domain. Gene regions coloured cream represent the HD nuclease domain; the HD domain in Cas10 is distinct from that of Cas3 and Cas3ʹʹ. Also coloured are the regions of cas9 that roughly correspond to the RuvC-like nuclease (lime green), HNH nuclease (yellow), recognition lobe (purple) and protospacer adjacent motif (PAM)-interacting domains (pink). The regions of cpf1 aside from the RuvC-like domain are functionally uncharacterized and are shown in grey, as is the functionally uncharacterized all1473 gene in subtype III-D.A N A LY S I Ssplitting I-B into several subtypes. Therefore, at present, we classify these variants within subtype I-B. Subtype I-C seems to be a derivative of subtype I-B that lacks Cas6, which seems to be functionally replaced by Cas5 (REF. 55). Subtype I-A is another derivative of subtype I-B and is typically characterized by the fission of cas8 into two genes that encode degraded large and small subunits, respectively, as well as fission of cas3 into cas3ʹ and cas3ʹʹ.Subtype I-D also has several unique features, includ-ing Cas10d (instead of a Cas8 family protein) and a dis-tinct variant of Cas3 (REF. 13) (FIG. 2; see Supplementary information S2,S4 (tables)). Subtype I-U is typified by the presence of an uncharacterized signature gene (GSU0054; TIGRFAM reference TIGR02165) and sev-eral other distinctive features that have been analysed in detail previously33 (see Supplementary information S4 (table)). This group is monophyletic in the Cas3 tree and mostly monophyletic in the Cas1 tree (see Supplementary information S5,S6 (boxes)).The phylogenetic tree of the type I signature protein Cas3ʹ (and the homologous region of Cas3) has been reported to accurately reflect the subtype classification43, which is suggestive of a degree of evolutionary coher-ence between the phylogenies of the different genes in the operons of each subtype. However, re-analysis of the Cas3 phylogeny using a larger, more diverse sequence set (see Supplementary information S6 (box)) reveals a complex picture in which subtypes I-A, I-B and I-C are polyphyletic (that is, not descended from a common ancestor). Conceivably, this discrepancy results from a combination of accelerated evolution of many Cas3 variants and horizontal gene transfer.In addition to the complete type I CRISPR–cas loci, analysis of sequenced genomes has revealed a variety of putative type I-related operons that encode effector com-plexes but are not associated with cas1, cas2 or cas3 genes and are only in some cases adjacent to CRISPR arrays (see Supplementary information S4 (table)). These solo effector complexes are often encoded on plasmids and/ or associated with transposon-related genes. Many of these operons are derivatives of subtype I-F, whereas others are derivatives of subtype I-B (see Supplementary information S4,S7 (tables)). Some of the genomes that have these incomplete type I systems encode Cas1–Cas2 as parts of other CRISPR–cas loci but others lack these genes altogether (see Supplementary information S7 (table)). The functionality of solo effector complexes has not been investigated.Type III CRISPR–Cas systems. All type III systems pos-sess the signature gene cas10, which encodes a multi-domain protein containing a Palm domain (a variant of the RNA recognition motif (RRM)) that is homologous to the core domain of numerous nucleic acid polymer-ases and cyclases and that is the largest subunit of type III crRNA–effector complexes (BOX 2). Cas10 proteins show extensive sequence variation among the diverse type III CRISPR–Cas systems, which means that several PSSMs are required to identify these loci. All type III loci also encode the small subunit protein (see below), one Cas5 protein and typically several paralogous Cas7 proteins (FIG. 1). Often, Cas10 is fused to an HD family nuclease domain that is distinct from the HD domains of type ICRISPR–Cas systems and, unlike the latter, contains a circular permutation of the conserved motifs of the domain34,56.Type III systems have been previously classified into two subtypes, III-A (previously known as Mtube subtype or Csm module) and III-B (previously known as Cmr module or RAMP module), that can be distinguished by the presence of distinct genes encoding small subunits, csm2 (in the case of subtype III-A) and cmr5 (in the case of subtype III-B) (FIG. 2; see Supplementary informa-tion S4 (table)). Subtype III-A loci usually contain cas1, cas2 and cas6 genes, whereas most of the III-B loci lack these genes and therefore depend on other CRISPR–Cas systems present in the same genome4, providing strong evidence for the modularity of CRISPR–Cas systems35 (FIG. 2). Both subtype III-A and subtype III-B CRISPR–Cas systems have been shown to co-transcriptionally target RNA26,27,37–39,57 and DNA26,58–61.The composition and organization of type III CRISPR–cas loci are more diverse than those of type I sys-tems — although there are fewer type III subtypes, each of these is more polymorphic than type I subtypes. This diversity is due to gene duplications and deletions, domain insertions and fusions, and the presence of additional, poorly characterized domains that could be involved either in crRNA–effector complex functions or in associated immunity. At least two type III variants (one from subtype III-A and one from subtype III-B) are common and are here upgraded to subtypes III-D and III-C, respectively, as proposed earlier for archaea35 (FIG. 3; see Supplementary information S8 (table)). The distinctive feature of subtype III-C (previously known as MTH326-like33) is the apparent inactivation of the cyclase-like domain of Cas10 accompanied by extreme divergence of the sequence of this protein. Subtype III-D loci typically encode a Cas10 protein that lacks the HD domain. They also contain a distinct cas5-like gene known as csx10 and often an uncharacterized gene that is homologous to all1473 from Nostoc sp. PCC 7120 (REF. 33). Both of these new subtypes lack cas1 and cas2 genes (FIG. 2) and accordingly are predicted to recruit adaptation modules in trans. The phylogeny of Cas10, the signature gene of type III CRISPR–Cas, is consistent with the subtype classification, with each subtype repre-senting a distinct clade (see Supplementary information S9 (box)).Putative type IV CRISPR–Cas systems. Several bacterial genomes contain putative, functionally uncharacterized type IV systems, often on plasmids, as can be typified by the AFE_1037-AFE_1040 operon in Acidithiobacillus ferrooxidans ATCC 23270. Similar to most subtype III-B loci, this system lacks cas1 and cas2 genes and is often not in proximity to a CRISPR array or, in many cases, is encoded in a genome that has no detectable CRISPR arrays (it might be more appropriate to denote the respective loci Cas systems rather than CRISPR–Cas). Type IV systems encode a predicted minimalA N A LY S I Smultisubunit crRNA–effector complex that consists of a partially degraded large subunit, Csf1, Cas5 and — as a single copy — Cas7, and in some cases, a putative small subunit33(FIG. 1); csf1 can serve as a signature gene for this system. The minimalist architecture of type IV loci is distinct from those of all type I and type III subtypes (FIG. 2; see Supplementary information S4 (table)), which together with the unique large subunit (Csf1) justifies their status as a new type.There are two distinct variants of type IV CRISPR–Cas systems, one of which contains a DinG family helicase (REF. 62), and a second one that lacks DinG but typically contains a gene encoding a small α-helical protein, which is a putative small subunit 33. Type IV systems could be mobile modules that, similar to subtype III-B systems, use crRNAs from different CRISPR arrays once these become available. This possibility is consistent with the occasional localization of type IV loci adjacent to CRISPR arrays, cas6 genes and (less often) adaptation genes35.Class 2 CRISPR–Cas systemsClass 2 CRISPR–Cas systems are defined by the pres-ence of a single subunit crRNA–effector module. This class includes type II CRISPR–Cas systems, as well as a putative new classification, type V.Type II CRISPR–Cas systems. Type II CRISPR–Cas sys-tems dramatically differ from types I and III, and are by far the simplest in terms of the number of genes. The signature gene for type II is cas9, which encodes a multidomain protein that combines the functions of the crRNA–effector complex with target DNA cleavage25, and also contributes to adaptation63,64. In addition to cas9, all identified type II CRISPR–cas loci contain cas1 and cas2 (see REF. 65 for a detailed comparative analy-sis of type II systems) (FIG. 1) and most type II loci also encode a tracrRNA, which is partially complementary to the repeats within the respective CRISPR array65–67. The core of Cas9, which includes both nuclease domains and a characteristic Arg-rich cluster, most likely evolved from genes of transposable elements that are not associated with CRISPR65. Thus, owing to the significant sequence similarity between Cas9 and its homologues that are unrelated to CRISPR–Cas, Cas9 cannot be used as the only signature for identification of type II systems. Nevertheless, the presence of cas9 in the vicinity of cas1 and cas2 genes is a hallmark of type II loci.Type II CRISPR–Cas systems are currently classi-fied into three subtypes, which were introduced in the previous classification (II-A and II-B)13 or subsequently proposed on the basis of a distinct locus organization (II-C)65,66,68 (FIG. 2; see Supplementary information S4 (table)). Subtype II-A systems include an additional gene, csn2(FIG. 2), which is considered a signature gene for this subtype. The long and short variants of Csn2 form compact clusters when superimposed over the Cas9 phylogeny and seem to correspond to two distinct variants of subtype II-A65. However, as with subtype I-B, we chose to keep these two variants within subtype II-A. It was recently shown that all four subtype II-A Cas proteins are involved in spacer acquisition63.Subtype II-B lacks csn2 but includes cas4, which is otherwise typical of type I systems (FIG. 2). Moreover, subtype II-B cas1 and cas2 are more closely related to type I homologues than to subtype II-A, which is suggestive of a recombinant origin of subtype II-B65. Subtype II-C loci only have three protein-coding genes (cas1, cas2 and cas9) and are the most common type II CRISPR–Cas system in bacteria3,65,66. A nota-ble example of a subtype II-C system is the crRNA-processing-independent system found in Neisseria meningitidis69(BOX 1).In the Cas9 phylogeny, subtypes II-A and II-B are monophyletic whereas subtype II-C is paraphyletic with respect to II-A (that is, subtype II-A originates from within II-C)65. Nevertheless, II-C was retained as a single subtype given the minimalist architecture of the effector modules shared by all II-C loci.Putative type V CRISPR–Cas systems. A gene denoted cpf1 (TIGRFAM reference TIGR04330) is present in several bacterial genomes and one archaeal genome, adjacent to cas1, cas2 and a CRISPR array (for example, in the FNFX1_1431–FNFX1_1428 locus of Francisella cf. novicida Fx1)70(FIG. 2). These observations led us to putatively define a fifth type of CRISPR–Cas system, type V, which combines Cpf1 (the interference module) with an adaptor module (FIG. 1; see Supplementary information S4 (table)). Cpf1 is a large protein (about 1,300 amino acids) that contains a RuvC-like nuclease domain homologous to the respec-tive domain of Cas9 and the TnpB protein of IS605 family transposons, along with putative counterparts to the characteristic Arg-rich region of Cas9 and the Zn finger of TnpB. However, Cpf1 lacks the HNH nuclease domain that is present in all Cas9 proteins54,65. Given the presence of a predicted single-subunit crRNA–effector complex, the putative type V systems are assigned to class 2 CRISPR–Cas. Some of the putative type V loci also encode Cas4 and accordingly resemble subtype II-B loci, whereas others lack Cas4 and are more simi-lar in architecture to subtype II-C. Unlike Cas9, Cpf1 is encoded outside the CRISPR–Cas context in several genomes, and its high similarity with TnpB suggests that cpf1 is a recent recruitment from transposable elements. If future experiments were to show that these loci encode bona fide CRISPR–Cas systems and that Cpf1 is a functional analogue of Cas9, then these systems would arguably qualify as a novel type of CRISPR–Cas. Despite the overall similarity to type II CRISPR–Cas systems, the putative type V loci clearly differ from the estab-lished type II subtypes more than type II subtypes differ from each other, most notably in the distinct domain architectures of Cpf1 and Cas9. Furthermore, whereas type II systems are specific to bacteria, a putative type V system is present in at least one archaeon, Candidatus Methanomethylophilus alvus35.Rare, unclassifiable CRISPR–Cas systemsThe classification of CRISPR–Cas systems outlined above covers nearly all of the CRISPR–cas loci identified in the currently sequenced archaeal and bacterial genomesA N A LY S I S。
基因编辑与基因突变解读生命变异的奥秘随着科学技术的不断进步,我们对基因的认识与研究也变得越来越深入。
基因编辑和基因突变是两个与基因相关的重要概念,它们揭示了生命变异的奥秘。
在本文中,我们将探讨基因编辑和基因突变的概念、原因以及对生命演化的影响。
一、基因编辑的概念和原理基因编辑是一种人为干预基因组的技术,通过修改或删除目标基因的特定区域,实现对生物遗传特征的改变。
基因编辑的核心工具是CRISPR-Cas9系统,它能够精确定位并修饰基因组中的特定序列。
CRISPR-Cas9系统由CRISPR(Clustered Regularly Interspaced Short Palindromic Repeats)和Cas9(CRISPR-associated protein 9)组成。
CRISPR是一类具有重复和间隔序列的DNA区域,而Cas9则是一种内切酶,它能够通过与CRISPR中的RNA序列结合,指导定位并切割基因组中的特定DNA序列。
基因编辑的过程通常包括以下步骤:1. 设计合适的CRISPR引导RNA(gRNA)序列,使其能够与目标基因的特定区域配对。
2. 合成和引入gRNA和Cas9蛋白到目标细胞中。
3. Cas9蛋白与gRNA结合形成复合物,定位到目标基因组的特定区域。
4. Cas9蛋白通过其内切酶活性,切割目标基因组的DNA序列。
5. 细胞中的自我修复机制可以修复被Cas9切割的DNA,但也可能引入突变或缺失。
6. 经过多代的繁殖,修改后的基因组会在整个生物体中得到遗传。
二、基因突变的概念和分类基因突变是指基因组中的DNA序列发生突然而又稳定的改变,包括单个碱基的改变、插入和删除等。
基因突变可以分为以下几种类型:1. 点突变:即单个碱基的改变,包括错义突变、无义突变和同义突变。
错义突变导致氨基酸序列发生改变,可能影响蛋白质的结构和功能。
无义突变导致早停密码子的出现,使蛋白质合成提前终止。
同义突变发生在纯氨基酸序列中,不改变蛋白质的编码。
982023年8月上 第15期 总第411期工艺设计改造及检测检修China Science & Technology Overview1.2 主要仪器与设备多功能酶标仪:美国MD 公司的SpectraMax iD5多功能酶标仪;恒温水浴锅:常州国华电器有限公司的HH-4;桌面型迷你离心机:上海生工公司的G508010;台式高速微记实验使用冷冻法进行。
具体方法为将50μL 浓度为10μM 含多聚腺嘌呤标记的DNA 探针加入1mL 浓度为5nM 的AuNPs 溶液中。
溶液混匀后放置于-20℃冰箱中至少2h。
待室温下溶液解冻后,在4℃、12000r/min 条件下高速收稿日期:2022-12-29作者简介:元超群(1993—),男,河南林州人,硕士研究生,助理工程师,研究方向:产品检验检测。
基于CRISPR/Cas12a 系统的比色检测方法探究元超群(新乡市食品药品检验所,河南新乡 453000)摘 要:基于Clustered Regularly Interspaced Short Palindromic Repeats/Cas12a(CRISPR/Cas12a)系统的特征基因识别检测技术,已经可以实现DNA 的普适性检测。
同时,基于CRISPR/Cas12a 系统的反式酶切活性,可以降解(有目标DNA)或者不降解(无目标DNA)检测体系中添加的指示探针,设计一对通用的DNA 纳米金探针进行裸眼可视化DNA 比色检测。
整个检测过程与结果读出不依赖于实时定量荧光PCR 仪、恒温培养箱等大型仪器,整个流程可以在1小时之内完成。
工艺设计改造及检测检修China Science & Technology Overview离心30min,将上清液抽吸弃去,沉淀在清洗缓冲液中重悬,此清洗步骤重复3次。
最后,在重悬缓冲液中将沉淀重悬,AuNPs-DNA探针溶液置于棕色瓶中4℃保存。
(3)AsCas12a蛋白蛋白的表达和纯化。
GenCrispr Cas9 Nuclease Cat. No. Z03386 Version 01202016I Description (1)II Kit Contents .......................................................................... .... .... (1)III Key Features ............................... .... .... .... .... .... . (1)IV Storage..................................................................... . (2)V Diluent Compatibility (2)VI Activity test (2)VII References (2)I. DESCRIPTIONGenCrispr Cas9 Nuclease is the recombinant Streptococcus pyogenes Cas9 (wt) protein purified from E. coli that can be used for genome editing by inducing site-specific double stranded breaks in double stranded DNA. Cas9 protein forms a very stable ribonucleoprotein (RNP) complex with the guide RNA (gRNA) component of the CRISPR/Cas9 system. The RNP complex recognizes the target site by matching gRNA with the genomic DNA sequence and produces DNA breaks within 3 bases from the NGG PAM (Protospacer Adjacent Motif). With GenCrispr Cas9 nuclease, customers can screen for highly efficient gRNA in vitro using DNA cleavage assays. The high purity Cas9 protein can also be used for antibody production.Product Source: GenCrispr Cas9 Nuclease is produced by expression in an E. coli strain carrying a plasmid encoding the Cas9 gene from Streptococcus pyogenes without nuclear localization signal (NLS).II. KIT CONTENTSIII. KEY FEATURESHigh Protein Purity: GenCrispr Cas9 Nuclease is > 95% pure as determined by SDS-PAGE with Coomassie Blue detection.Non-specific DNase Activity: A 20 µl reaction in Cas9 reaction buffer containing 100 ng linearized pUC57 plasmid and 0.1 µg of GenCrispr Cas9, incubated for 16 h at 37°C. No DNA degradation is determined by agarose gel electrophoresis.Non-specific RNase Activity: A 10 µl reaction in Cas9 reaction buffer containing 1800 ng total RNA and 0.1 µg of GenCrispr Cas9 incubated for 2h at 37°C. No RNA degradation as determined by agarose gel electrophoresis.High Bioactivity: 20 nM GenCrispr Cas9 incubated for 1 hour at 37°C result in 90% digestion of the substrate DNA as determined by agarose gel electrophoresis.IV. STORAGEGenCrispr Cas9 is supplied with 1x storage buffer (10 mM Tris, 300 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 50% Glycerol pH 7.4 @ 25°C). The recommended storage temperature is -20°C.V. Diluent CompatibilityDiluent Buffer: 300 mM NaCl, 10 mM Tris-HCl, 0.1 mM EDTA, 1 mM DTT, 500 μg/ml BSA and 50% glycerol. (pH 7.4 @ 25°C).VI. Activity testCas9 site-specific digestion:Genscript used in vitro digestion of a linearized plasmid to determine the activity of the Cas9 nuclease. It is a sensitive assay for GenCrispr Cas9 quality control. The linearized plasmid containing the target site:(CATCATTGGAAAACGTTCTT)can be digested with gRNA:(CAUCAUUGGAAAACGUUCUUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGU UAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUUU)and GenCrispr Cas9. Two cleavage DNA fragments (812 bp and 1898 bp) are determined by agarose gel electrophoresis. A 20 µl reaction in 1xCas9 Nuclease Reaction Buffer containing 160 ng linearized plasmid, 40 nM gRNA and 20 nM GenCrispr Cas9 for 2 hours at 37°C results in 90% digestion of linearized plasmid as determined by agarose gel electrophoresis.VII References1. Jinek et al. A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity. (2012)Science 337 (6096) 816-821 (2012).2. Larson, M. H., et al. CRISPR interference (CRISPRi) for sequence-specific control of gene expression.NatureProtocols. 8, (11), 2180-2196 (2013).3. Ran, F. A., et al. Genome engineering using the CRISPR-Cas9 system. Nature Protocols. 8, (11), 2281-2308(2013).Notes:1. This is a basic protocol. The reagent concentrations, conditions, and parameters may need to be optimized.2. 1000 nM is equal to 160 ng/µl.GenScript US860 Centennial Ave., Piscataway, NJ 08854Tel: 732-885-9188, 732-885-9688Fax: 732-210-0262, 732-885-5878Email: *********************Web: For Research Use Only.。
Originally published 28 June 2012; corrected 15 August 2012/cgi/content/full/science.1225829/DC1Supplementary Materials forA Programmable Dual-RNA–Guided DNA Endonuclease in AdaptiveBacterial ImmunityMartin Jinek, Krzysztof Chylinski, Ines Fonfara, Michael Hauer, Jennifer A. Doudna,*Emmanuelle Charpentier**To whom correspondence should be addressed. E-mail: doudna@ (J.A.D.);emmanuelle.charpentier@mims.umu.se (E.C.)Published 28 June 2012 on Science ExpressDOI: 10.1126/science.1225829This PDF file includes:Materials and MethodsFigs. S1 to S15Tables S1 to S3Full Reference ListCorrection: Formatting errors and typos have been corrected. Additionally, the format of the tables has been revised, and a duplicate entry has been removed from table S2.SUPPLEMENTARY MATERIALS AND METHODSBacterial strains and culture conditions. Table S1 lists the bacterial strains used in the study. Streptococcus pyogenes, cultured in THY medium (Todd Hewitt Broth (THB, Bacto, Becton Dickinson) supplemented with 0.2% yeast extract (Oxoid)) or on TSA (trypticase soy agar, BBL, Becton Dickinson) supplemented with 3% sheep blood, was incubated at 37°C in an atmosphere supplemented with 5% CO2 without shaking. Escherichia coli, cultured in Luria-Bertani (LB) medium and agar, was incubated at 37°C with shaking. When required, suitable antibiotics were added to the medium at the following final concentrations: ampicillin, 100 µg/ml for E. coli; chloramphenicol, 33 µg/ml for E. coli; kanamycin, 25 µg/ml for E. coli and 300 µg/ml for S. pyogenes. Bacterial cell growth was monitored periodically by measuring the optical density of culture aliquots at 620 nm using a microplate reader (SLT Spectra Reader). Transformation of bacterial cells. Plasmid DNA transformation into E. coli cells was performed according to a standard heat shock protocol (39). Transformation of S. pyogenes was performed as previously described with some modifications (40). The transformation assay performed to monitor in vivo CRISPR/Cas activity on plasmid maintenance was essentially carried out as described previously (4). Briefly, electro-competent cells of S. pyogenes were equalized to the same cell density and electroporated with 500 ng of plasmid DNA. Every transformation was plated two to three times and the experiment was performed three times independently with different batches of competent cells for statistical analysis. Transformation efficiencies were calculated as CFU (colony forming units) per µg of DNA. Control transformations were performed with sterile water and backbone vector pEC85.DNA manipulations. DNA manipulations including DNA preparation, amplification, digestion, ligation, purification, agarose gel electrophoresis were performed according to standard techniques (39)with minor modifications. Protospacer plasmids for the in vitro cleavage and S. pyogenes transformation assays were constructed as described previously (4). Additional pUC19-based protospacer plasmids for in vitro cleavage assays were generated by ligating annealed oligonucleotides between digested EcoRI and BamHI sites in pUC19. The GFP gene-containing plasmid has been described previously (41). Kits (Qiagen) were used for DNA purification and plasmid preparation. Plasmid mutagenesis was performed using QuikChange® II XL kit (Stratagene) or QuikChange site-directed mutagenesis kit (Agilent). All plasmids used in this study were sequenced at LGC Genomics or the UC Berkeley DNA Sequencing Facility and are listed in Table S2. VBC-Biotech Services, Sigma-Aldrich and Integrated DNA Technologies supplied the synthetic oligonucleotides and RNAs listed in Table S3.In vitro transcription and purification of RNA. RNA was in vitro transcribed using T7 Flash in vitro Transcription Kit (Epicentre, Illumina company) and PCR-generated DNA templates carrying a T7 promoter sequence. RNA was gel-purified and quality-checked prior to use. The primers used for the preparation of RNA templates from S. pyogenes SF370, Listeria innocua Clip 11262 and Neisseria meningitidis A Z2491 are listed in Table S3.Protein purification. The sequence encoding Cas9 (residues 1-1368) was PCR-amplified from the genomic DNA of S. pyogenes SF370 and inserted into a custom pET-based expression vector using ligation-independent cloning (LIC). The resulting fusion construct contained an N-terminal hexahistidine-maltose binding protein (His6-MBP) tag, followed by a peptide sequence containing a tobacco etch virus (TEV) protease cleavage site. The protein was expressed in E. coli strain BL21 Rosetta 2 (DE3) (EMD Biosciences), grown in 2xTY medium at 18°C for 16 h following induction with 0.2 mM IPTG. The protein was purified by a combination of affinity, ion exchange and size exclusion chromatographic steps. Briefly, cells were lysed in 20 mM Tris pH 8.0, 500 mM NaCl, 1 mM TCEP (supplemented with protease inhibitor cocktail (Roche)) in a homogenizer (Avestin). Clarified lysate was bound in batch to Ni-NTA agarose (Qiagen). The resin was washed extensively with 20 mM Tris pH 8.0, 500 mM NaCl and the bound protein was eluted in 20 mM Tris pH 8.0, 250 mM NaCl, 10% glycerol. The His6-MBP affinity tag was removed by cleavage with TEV protease, while the protein was dialyzed overnight against 20 mM HEPES pH 7.5, 150 mM KCl, 1 mM TCEP, 10% glycerol. The cleaved Cas9 protein was separated from the fusion tag by purification on a 5 ml SP Sepharose HiTrap column (GE Life Sciences), eluting with a linear gradient of 100 mM – 1 M KCl. The protein was further purified by size exclusion chromatography on a Superdex 200 16/60 column in 20 mM HEPES pH 7.5, 150 mM KCl and 1 mM TCEP. Eluted protein was concentrated to ~8 mg.ml-1, flash-frozen in liquid nitrogen and stored at -80°C. Cas9 D10A, H840A and D10A/H840A point mutants were generated using the QuikChange site-directed mutagenesis kit (Agilent) and confirmed by DNA sequencing. The proteins were purified following the same procedure as for the wild-type Cas9 protein.Cas9 orthologs from Streptococcus thermophilus (LMD-9,YP_820832.1), L. innocua (Clip11262, NP_472073.1), Campylobacter jejuni (subsp. jejuni NCTC 11168, YP_002344900.1) and N. meningitidis (Z2491, YP_002342100.1) were expressed in BL21 Rosetta (DE3) pLysS cells (Novagen) as His6-MBP (N. meningitidis and C. jejuni), His6-Thioredoxin (L. innocua) and His6-GST (S. thermophilus) fusion proteins, and purified essentially as for S. pyogenes Cas9 with the following modifications. Due to large amounts of co-purifying nucleic acids, all four Cas9 proteins were purified by an additional heparin sepharose step prior to gel filtration, eluting the bound protein with a linear gradient of 100 mM – 2 M KCl. This successfully removed nucleic acid contamination from the C. jejuni, N. meningitidis and L. innocua proteins, but failed to remove co-purifying nucleic acids from the S. thermophilus Cas9 preparation. All proteins were concentrated to 1-8 mg.ml-1 in 20 mM HEPES pH 7.5, 150 mM KCl and 1 mM TCEP, flash-frozen in liquid N2 and stored at -80°C.Plasmid DNA cleavage assay. Synthetic or in vitro-transcribed tracrRNA and crRNA were pre-annealed prior to the reaction by heating to 95°C and slowly cooling down to room temperature.Native or restriction digest-linearized plasmid DNA (300 ng (~8 nM)) was incubated for 60 min at 37°C with purified Cas9 protein (50-500 nM) and tracrRNA:crRNA duplex (50-500 nM, 1:1) in a Cas9 plasmid cleavage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 0.5 mM DTT, 0.1 mM EDTA) with or without 10 mM MgCl2. The reactions were stopped with 5X DNA loading buffer containing 250 mM EDTA, resolved by 0.8 or 1% agarose gel electrophoresis and visualized by ethidium bromidestaining. For the Cas9 mutant cleavage assays, the reactions were stopped with 5X SDS loading buffer (30% glycerol, 1.2% SDS, 250 mM EDTA) prior to loading on the agarose gel.Metal-dependent cleavage assay. Protospacer 2 plasmid DNA (5 nM) was incubated for 1 h at 37°C with Cas9 (50 nM) pre-incubated with 50 nM tracrRNA:crRNA-sp2 in cleavage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 0.5 mM DTT, 0.1 mM EDTA) supplemented with 1, 5 or 10 mM MgCl2, 1 or 10 mM of MnCl2, CaCl2, ZnCl2, CoCl2, NiSO4 or CuSO4. The reaction was stopped by adding 5X SDS loading buffer (30% glycerol, 1.2% SDS, 250 mM EDTA), resolved by 1% agarose gel electrophoresis and visualized by ethidium bromide staining.Single-turnover assay. Cas9 (25 nM) was pre-incubated 15 min at 37°C in cleavage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 10 mM MgCl2, 0.5 mM DTT, 0.1 mM EDTA) with duplexed tracrRNA:crRNA-sp2 (25 nM, 1:1) or both RNAs (25 nM) not pre-annealed and the reaction was started by adding protospacer 2 plasmid DNA (5 nM). The reaction mix was incubated at 37°C. At defined time intervals, samples were withdrawn from the reaction, 5X SDS loading buffer (30% glycerol, 1.2% SDS, 250 mM EDTA) was added to stop the reaction and the cleavage was monitored by 1% agarose gel electrophoresis and ethidium bromide staining. The same was done for the single turnover kinetics without pre-incubation of Cas9 and RNA, where protospacer 2 plasmid DNA (5 nM) was mixed in cleavage buffer with duplex tracrRNA:crRNA-sp2 (25 nM) or both RNAs (25 nM) not pre-annealed, and the reaction was started by addition of Cas9 (25 nM). Percentage of cleavage was analyzed by densitometry and the average of three independent experiments was plotted against time. The data were fit by nonlinear regression analysis and the cleavage rates (k obs [min-1]) were calculated.Multiple-turnover assay. Cas9 (1 nM) was pre-incubated for 15 min at 37°C in cleavage buffer (20 mM HEPES pH 7.5, 150 mM KCl, 10 mM MgCl2, 0.5 mM DTT, 0.1 mM EDTA) with pre-annealed tracrRNA:crRNA-sp2 (1 nM, 1:1). The reaction was started by addition of protospacer 2 plasmid DNA (5 nM). At defined time intervals, samples were withdrawn and the reaction was stopped by adding 5X SDS loading buffer (30% glycerol, 1.2% SDS, 250 mM EDTA). The cleavage reaction was resolved by 1% agarose gel electrophoresis, stained with ethidium bromide and the percentage of cleavage was analyzed by densitometry. The results of four independent experiments were plotted against time (min).Oligonucleotide DNA cleavage assay. DNA oligonucleotides (10 pmol) were radiolabeled by incubating with 5 units T4 polynucleotide kinase (New England Biolabs) and ~3–6 pmol (~20–40 mCi) [γ-32P]-ATP (Promega) in 1X T4 polynucleotide kinase reaction buffer at 37°C for 30 min, in a 50 μL reaction. After heat inactivation (65°C for 20 min), reactions were purified through an Illustra MicroSpin G-25 column (GE Healthcare) to remove unincorporated label. Duplex substrates (100 nM) were generated by annealing labeled oligonucleotides with equimolar amounts of unlabeled complementary oligonucleotide at 95°C for 3 min, followed by slow cooling to room temperature. For cleavage assays, tracrRNA and crRNA were annealed by heating to 95°C for 30 s, followed by slow cooling to room temperature. Cas9 (500 nM final concentration) waspre-incubated with the annealed tracrRNA:crRNA duplex (500 nM) in cleavage assay buffer (20 mM HEPES pH 7.5, 100 mM KCl, 5 mM MgCl2, 1 mM DTT, 5% glycerol) in a total volume of 9 μl. Reactions were initiated by the addition of 1 μl target DNA (10 nM) and incubated for 1 h at 37°C. Reactions were quenched by the addition of 20 μl of loading dye (5 mM EDTA, 0.025% SDS, 5% glycerol in formamide) and heated to 95°C for 5 min. Cleavage products were resolved on 12% denaturing polyacrylamide gels containing 7 M urea and visualized by phosphorimaging (Storm, GE Life Sciences). Cleavage assays testing PAM requirements (Fig. 4B) were carried out using DNA duplex substrates that had been pre-annealed and purified on an 8% native acrylamide gel, and subsequently radiolabeled at both 5’ ends. The reactions were set-up and analyzed as above.Electrophoretic mobility shift assays. Target DNA duplexes were formed by mixing of each strand (10 nmol) in deionized water, heating to 95°C for 3 min and slow cooling to room temperature. All DNAs were purified on 8% native gels containing 1X TBE. DNA bands were visualized by UV shadowing, excised, and eluted by soaking gel pieces in DEPC-treated H2O. Eluted DNA was ethanol precipitated and dissolved in DEPC-treated H2O. DNA samples were 5’ end labeled with [γ-32P]-ATP using T4 polynucleotide kinase (New England Biolabs) for 30 min at 37°C. PNK was heat denatured at 65°C for 20 min, and unincorporated radiolabel was removed using an Illustra MicroSpin G-25 column (GE Healthcare). Binding assays were performed in buffer containing 20 mM HEPES pH 7.5, 100 mM KCl, 5 mM MgCl2, 1 mM DTT and 10% glycerol in a total volume of 10 μl. Cas9 D10A/H840A double mutant was programmed with equimolar amounts of pre-annealed tracrRNA:crRNA duplex and titrated from 100 pM to 1 μM. Radiolabeled DNA was added to a final concentration of 20 pM. Samples were incubated for 1 h at 37°C and resolved at 4°C on an 8% native polyacrylamide gel containing 1X TBE and 5 mM MgCl2. Gels were dried and DNA visualized by phosphorimaging.In silico analysis of DNA and protein sequences. Vector NTI package (Invitrogen) was used for DNA sequence analysis (Vector NTI) and comparative sequence analysis of proteins (AlignX).In silico modeling of RNA structure and co-folding. In silico predictions were performed using the Vienna RNA package algorithms (42, 43). RNA secondary structures and co-folding models were predicted with RNAfold and RNAcofold, respectively and visualized with VARNA (44).SUPPLEMENTARY FIGURESFig. S1. The type II RNA-mediated CRISPR/Cas immune pathway. The expression and interference steps are represented in the drawing. The type II CRISPR/Cas loci are composed of an operon of four genes (blue) encoding the proteins Cas9, Cas1, Cas2 and Csn2, a CRISPR array consisting of a leader sequence followed by identical repeats (black rectangles) interspersed with unique genome-targeting spacers (colored diamonds) and a sequence encoding the trans-activating tracrRNA (red). Represented here is the type II CRISPR/Cas locus of S. pyogenes SF370 (Accession number NC_002737) (4). Experimentally confirmed promoters and transcriptional terminator in this locus are indicated (4). The CRISPR array is transcribed as a precursor CRISPR RNA (pre-crRNA) molecule that undergoes a maturation process specific to the type II systems (4). In S. pyogenes SF370, tracrRNA is transcribed as two primary transcripts of 171 and 89 nt in length that have complementarity to each repeat of the pre-crRNA. The first processing event involves pairing of tracrRNA to pre-crRNA, forming a duplex RNA that is recognized and cleaved by the housekeeping endoribonuclease RNase III (orange) in the presence of the Cas9 protein. RNase III-mediated cleavage of the duplex RNA generates a 75-nt processed tracrRNA and a 66-nt intermediate crRNAs consisting of a central region containing a sequence of one spacer, flanked by portions of the repeat sequence. A second processing event, mediated by unknown ribonuclease(s), leads to the formation of mature crRNAs of 39 to 42 nt in length consisting of 5’-terminal spacer-derived guide sequence and repeat-derived 3’-terminal sequence. Following the first and second processing events, mature tracrRNA remains paired to the mature crRNAs and bound to the Cas9 protein. In this ternary complex, the dual tracrRNA:crRNA structure acts as guide RNA that directs the endonuclease Cas9 to the cognate target DNA. Target recognition by the Cas9-tracrRNA:crRNA complex is initiated by scanning the invading DNA molecule for homology between the protospacer sequence in the target DNA and the spacer-derived sequence in the crRNA. In addition to the DNA protospacer-crRNA spacer complementarity, DNA targeting requires the presence of a short motif (NGG, where N can be any nucleotide) adjacent to the protospacer (protospacer adjacent motif - PAM). Following pairing between the dual-RNA and the protospacer sequence, an R-loop is formed and Cas9 subsequently introduces a double-stranded break (DSB) in the DNA. Cleavage of target DNA by Cas9 requires two catalytic domains in the protein. At a specific site relative to the PAM, the HNH domain cleaves the complementary strand of the DNA while the RuvC-like domain cleaves the noncomplementary strand.Fig. S2. Purification of Cas9 nucleases. (A) S. pyogenes Cas9 was expressed in E. coli as a fusion protein containing an N-terminal His6-MBP tag and purified by a combination of affinity, ion exchange and size exclusion chromatographic steps. The affinity tag was removed by TEV protease cleavage following the affinity purification step. Shown is a chromatogram of the final size exclusion chromatography step on a Superdex 200 (16/60) column. Cas9 elutes as a single monomeric peak devoid of contaminating nucleic acids, as judged by the ratio of absorbances at 280 and 260 nm. Inset; eluted fractions were resolved by SDS-PAGE on a 10% polyacrylamide gel and stained with SimplyBlue Safe Stain (Invitrogen). (B) SDS-PAGE analysis of purified Cas9 orthologs. Cas9 orthologs were purified as described in Supplementary Materials and Methods. 2.5 μg of each purified Cas9 were analyzed on a 4-20% gradient polyacrylamide gel and stained with SimplyBlue Safe Stain.Fig. S3. Cas9 guided by dual-tracrRNA:crRNA cleaves protospacer plasmid and oligonucleotide DNA. See Fig. 1.The protospacer 1 sequence originates from S. pyogenes SF370 (M1)SPy_0700, target of S. pyogenes SF370 crRNA-sp1 (4). Here, the protospacer 1 sequence was manipulated by changing the PAM from a nonfunctional sequence (TTG) to a functional one (TGG). The protospacer 4 sequence originates from S. pyogenes MGAS10750 (M4) MGAS10750_Spy1285,target of S. pyogenes SF370 crRNA-sp4 (4). (A) Protospacer 1 plasmid DNA cleavage guided by cognate tracrRNA:crRNAduplexes. The cleavage products were resolved by agarose gel electrophoresis and visualized by ethidium bromide staining.M, DNA marker; fragment sizes in base pairs are indicated. (B) Protospacer 1 oligonucleotide DNA cleavage guided by cognate tracrRNA:crRNA-sp1 duplex. The cleavage products were resolved by denaturating polyacrylamide gel electrophoresis and visualized by phosphorimaging.Fragment sizes in nucleotides are indicated. (C) Protospacer 4 oligonucleotide DNA cleavage guided by cognate tracrRNA:crRNA-sp4 duplex. The cleavage products were resolved by denaturating polyacrylamide gel electrophoresis and visualized by phosphorimaging.Fragment sizes in nucleotides are indicated. (A, B, C) Experiments in (A) were performed as in Fig. 1A; in (B) and in (C) as in Fig. 1B. (B, C) A schematic of the tracrRNA:crRNA-target DNA interaction is shown below. The regions of crRNA complementarity to tracrRNA and the protospacer DNA are represented in orange and yellow, respectively. The PAM sequence is indicated in grey.Fig. S4. Cas9 is a Mg2+-dependent endonuclease with 3’-5’ exonuclease activity. See Fig. 1. (A) Protospacer 2 plasmid DNA was incubated with Cas9 complexed with tracrRNA:crRNA-sp2 in the presence of different concentrations of Mg2+, Mn2+, Ca2+, Zn2+, Co2+, Ni2+ or Cu2+. The cleavage products were resolved by agarose gel electrophoresis and visualized by ethidium bromide staining.Plasmid forms are indicated.(B) A protospacer 4 oligonucleotide DNA duplex containing a PAM motif was annealed and gel-purified prior to radiolabeling at both 5’ ends. The duplex (10 nM final concentration) was incubated with Cas9 programmed with tracrRNA (nucleotides 23-89) and crRNA-sp4 (500 nM final concentration, 1:1). At indicated time points (min), 10 μl aliquots of the cleavage reaction were quenched with formamide buffer containing 0.025% SDS and 5 mM EDTA, and analyzed by denaturing polyacrylamide gel electrophoresis as in Fig. 1B. Sizes in nucleotides are indicated.Fig. S5. Dual-tracrRNA:crRNA directed Cas9 cleavage of target DNA is site-specific. See Fig. 1 and fig. S3. (A) Mapping of protospacer 1 plasmid DNA cleavage. Cleavage products from fig. S3A were analyzed by sequencing as in Fig. 1C. Note that the 3’ terminal A overhang (asterisk) is an artifact of the sequencing reaction.(B) Mapping of protospacer 4 oligonucleotide DNA cleavage. Cleavage products from fig. S3C were analyzed by denaturing polyacrylamide gel electrophoresis alongside 5’ end-labeled oligonucleotide size markers derived from the complementary and noncomplementary strands of the protospacer 4 duplex DNA. M, marker; P, cleavage product. Fragment sizes in nucleotides are indicated. (C) Schematic representations of tracrRNA, crRNA-sp1 and protospacer 1 DNA sequences (left) and tracrRNA, crRNA-sp4 and protospacer 4 DNA sequences (right). tracrRNA:crRNA forms a dual-RNA structure directed to complementary protospacer DNA through crRNA-protospacer DNA pairing. The regions of crRNA complementary to tracrRNA and the protospacer DNA are represented in orange and yellow, respectively. The cleavage sites in the complementary and noncomplementary DNA strands mapped in (A) (left) and (B) (right) are represented with red arrows (A and B, complementary strand) and a red thick line (B, noncomplementary strand) above the sequences, respectively.Fig. S6. Dual-tracrRNA:crRNA directed Cas9 cleavage of target DNA is fast and efficient. See Fig. 1. (A) Single turnover kinetics of Cas9 under different RNA pre-annealing and protein-RNA pre-incubation conditions. Protospacer 2plasmid DNA was incubated with either Cas9 pre-incubated with pre-annealed tracrRNA:crRNA-sp2 (○), Cas9 not pre-incubated with pre-annealed tracrRNA:crRNA-sp2 (●), Cas9 pre-incubated with not pre-annealed tracrRNA and crRNA-sp2 (□) or Cas9 not pre-incubated with not pre-annealed RNAs (■). The cleavage activity was monitored in a time-dependent manner and analyzed by agarose gel electrophoresis followed by ethidium bromide staining. The average percentage of cleavage from three independent experiments is plotted against the time (min) and fitted with a nonlinear regression. The calculated cleavage rates (k obs) are shown in the table on the right. The results suggest that the binding of Cas9 to the RNAs is not rate-limiting under the conditions tested. Plasmid forms are indicated. The obtained k obs values are comparable to those of restriction endonucleases which are typically of the order of 1-10 min-1 (45-47).(B) Cas9 is a multiple turnover endonuclease. Cas9 loaded with duplexed tracrRNA:crRNA-sp2 (1 nM, 1:1:1 – indicated with red line on the graph) was incubated with a 5-fold excess of native protospacer 2 plasmid DNA. Cleavage was monitored by withdrawing samples from the reaction at defined time intervals (0 to 120 min) followed by agarose gel electrophoresis analysis (top) and determination of cleavage product amount (nM) (bottom). Standard deviations of three independent experiments are indicated. In the time interval investigated, 1 nM Cas9 was able to cleave ~2.5 nM plasmid DNA.Fig. S7. Cas9 contains two nuclease domains: HNH (McrA-like) and RuvC-like (N-terminal RNase H fold) resolvase. See Fig. 2. The amino-acid sequence of Cas9 from S. pyogenes SF370 (Accession number NC_002737) is represented.The three predicted RuvC-like motifs and the predicted HNH motif are shaded light blue and purple, respectively. Residues Asp10 and His840, which were substituted by Ala in this study are shown in red and highlighted by a red asterisk above the sequence. Underlined residues are highly conserved among Cas9 proteins from different species. Note that in a previous study and based on computational predictions, Makarova et al. (23) predicted coupling of the two nuclease-like activities, which is now confirmed experimentally in thepresent study (Fig. 2 and fig. S8).Fig. S8. The HNH and RuvC-like domains of Cas9 direct cleavage of the complementary and noncomplementary DNA strand, respectively. See Fig.2. Protospacer DNA cleavage by cognate tracrRNA:crRNA-directed Cas9 mutants containing mutations in the HNH or RuvC-like domain. (A) Protospacer 1 plasmid DNA cleavage. The experiment was performed as in Fig. 2A. Plasmid DNA conformations and sizes in base pairs are indicated. (B) Protospacer 4 oligonucleotide DNA cleavage. The experiment was performed as in Fig. 2B. Sizes in nucleotides are indicated.Fig. S9. tracrRNA is required for target DNA recognition. Electrophoretic mobility shift assays were performed using protospacer 4 target DNA duplex and Cas9 (containing nuclease domain inactivating mutations D10A and H840) alone or in the presence of crRNA-sp4, tracrRNA (75nt), or both. The target DNA duplex was radiolabeled at both 5’ ends. Cas9 (D10/H840A) and complexes were titrated from 1 nM to 1 μM. Binding was analyzed by 8% native polyacrylamide gel electrophoresis and visualized by phosphorimaging.Note that Cas9 alone binds target DNA with moderate affinity. This binding is unaffected by the addition of crRNA, suggesting that this represents sequence nonspecific interaction with the dsDNA. Furthermore, this interaction can be outcompeted by tracrRNA alone in the absence of crRNA. In the presence of both crRNA and tracrRNA, target DNA binding is substantially enhanced and yields a species with distinct electrophoretic mobility, indicative of specific target DNA recognition.Fig. S10. Minimal region of tracrRNA capable of guiding dual-tracrRNA:crRNA directed cleavage of target DNA. See Fig. 3. A fragment of tracrRNA encompassing a part of the crRNA paired region and a portion of the downstream region is sufficient to direct cleavage of protospacer oligonucleotide DNA by Cas9. (A) Protospacer 1 oligonucleotide DNA cleavage and (B) Protospacer 4 oligonucleotide DNA cleavage by Cas9 guided with a mature cognate crRNA and various tracrRNA fragments. (A, B) Sizes in nucleotides are indicated.Fig. S11. Dual-tracrRNA:crRNA guided target DNA cleavage by Cas9 is species specific. See Fig. 3. Like Cas9 from S. pyogenes, the closely related Cas9 orthologs from the Gram-positive bacteria L. innocua and S. thermophilus cleave protospacerDNA when targeted by tracrRNA:crRNA from S. pyogenes. However, under the same conditions, DNA cleavage by the less closely related Cas9 orthologs from the Gram-negative bacteria C. jejuni and N. meningitidis is not observed. Spy, S. pyogenes SF370 (Accession Number NC_002737); Sth, S. thermophilus LMD-9 (STER_1477 Cas9 ortholog; Accession Number NC_008532); Lin, L. innocua Clip11262 (Accession Number NC_003212); Cje, C. jejuni NCTC 11168 (Accession Number NC_002163); Nme, N. meningitidis A Z2491 (Accession Number NC_003116). (A) Cleavage of protospacer plasmid DNA. Protospacer 2 plasmid DNA (300 ng) was subjected to cleavage by different Cas9 orthologs (500 nM) guided by hybrid tracrRNA:crRNA-sp2 duplexes (500 nM, 1:1) from different species. To design the RNA duplexes, we predicted tracrRNA sequences from L. innocua and N. meningitidis based on previously published Northern blot data(4).The dual-hybrid RNA duplexes consist of species-specific tracrRNA and a heterologous crRNA. The heterologous crRNA sequence was engineered to contain S. pyogenes DNA-targeting sp2 sequence at the 5’ end fused to L. innocua or N. meningitidis tracrRNA-binding repeat sequence at the 3’ end.Cas9 orthologs from S. thermophilus and L. innocua, but not from N. meningitidis or C. jejuni, can be guided by S. pyogenes tracrRNA:crRNA-sp2 to cleave protospacer 2 plasmid DNA, albeit with slightly decreased efficiency. Similarly, the hybrid L. innocua tracrRNA:crRNA-sp2 can guide S. pyogenes Cas9 to cleave the target DNA with high efficiency, whereas the hybrid N. meningitidis tracrRNA:crRNA-sp2 triggers only slight DNA cleavage activity by S. pyogenes Cas9. As controls, N. meningitidis and L. innocua Cas9 orthologs cleave protospacer 2 plasmid DNA when guided by the cognate hybrid tracrRNA:crRNA-sp2. Note that as mentioned above, the tracrRNA sequence of N. meningitidis is predicted only and has not yet been confirmed by RNA sequencing. Therefore, the low efficiency of cleavage could be the result of either low activity of the Cas9 orthologs or the use of a nonoptimally designed tracrRNA sequence. (B) Cleavage of protospacer oligonucleotide DNA. 5’-end radioactively labeled complementary strand oligonucleotide (10 nM) pre-annealed with unlabeled noncomplementary strand oligonucleotide (protospacer 1) (10 nM) (left) or 5’-end radioactively labeled noncomplementary strand oligonucleotide (10 nM) pre-annealed with unlabeled complementary strand oligonucleotide (10 nM) (right) (protospacer 1) was subjected to cleavage by various Cas9 orthologs (500 nM) guided by tracrRNA:crRNA-sp1 duplex from S. pyogenes (500 nM, 1:1). Cas9 orthologs from S. thermophilus and L. innocua, but not from N. meningitidis or C. jejuni can be guided by S. pyogenes cognate dual-RNA to cleave the protospacer oligonucleotide DNA, albeit with decreased efficiency. Note that the cleavage site on the complementary DNA strand is identical for all three orthologs. Cleavage of the noncomplementary strand occurs at distinct positions. (C) Amino acid sequence identity of Cas9 orthologs. S. pyogenes, S. thermophilus and L. innocua Cas9 orthologs share high percentage of amino acid identity. In contrast, the C. jejuni and N. meningitidis Cas9 proteins differ in sequence and length (~300-400 amino acids shorter). (D)Co-foldings of engineered species-specific heterologous crRNA sequences with the corresponding tracrRNA orthologs from S. pyogenes (experimentally confirmed, (4)), L. innocua (predicted) or N. meningitidis (predicted). Red, tracrRNA; green, crRNA spacer 2 fragment; grey, crRNA repeat fragment. L. innocua and S. pyogenes hybrid tracrRNA:crRNA-sp2 duplexes share very similar structural characteristics, albeit distinct from the N. meningitidis hybrid tracrRNA:crRNA. Together with the cleavage data described above in (A) and (B),。