Graphical abstract
Keywords
Introduction
Platforms that enable robust and scalable molecular recording and computation in living cells have broad biotechnological and biomedical applications, ranging from the study of signaling dynamics and cellular lineages in development and cancer, to building living biosensors and adaptive therapeutics, to encoding logic and programming cellular phenotypes (Farzadfard and Lu, 2018). With the advent of in vivo DNA writing technologies, several memory architectures have been described that use genomic DNA as a medium for information processing and storage in living cells (Farzadfard and Lu, 2014; McKenna et al., 2016; Perli et al., 2016; Roquet et al., 2016; Sheth et al., 2017). These technologies capture and record biological information in the form of mutational signatures in DNA. However, unlike their silicon-based counterparts, which have access to large capacities of addressable memory registers, in vivo genetic memory architectures use rudimentary “Read” and “Write” operations and remain limited in their encoding capacity and scalability. As a result, these memory architectures lose their recording capacity after recording one or a few molecular events and cannot be used to continuously monitor signaling dynamics or histories of events over long periods. Furthermore, these technologies lack an inherent Read operation to interrogate and monitor DNA memory states on the fly. Consequently, it is challenging to interconnect and scale up these architectures to achieve complex DNA-based logic and memory operations in living cells. In addition, because of the requirements for host-specific DNA repair and genome editing mechanisms, these systems have been applicable only to a subset of organisms (Farzadfard and Lu, 2018).
6. Farzadfard, F., Lu, T.K. Synthetic biology. Genomically encoded analog memory with precise in vivo DNA writing in living cell populations. Science. 2014; 346:1256272.
20. McKenna, A., Findlay, G.M., Gagnon, J.A. ... Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science. 2016; 353:aaf7907.
24. Perli, S.D., Cui, C.H., Lu, T.K. Continuous genetic recording with self-targeting CRISPR-Cas in human cells. Science. 2016; 353:aag0511.
26. Roquet, N., Soleimany, A.P., Ferris, A.C. ... Synthetic recombinase-based state machines in living cells. Science. 2016; 353:aad8559.
27. Sheth, R.U., Yim, S.S., Wu, F.L. ... Multiplex recording of cellular events over time on CRISPR biological tape. Science. 2017; 358:1457-1461.
To overcome these bottlenecks, we describe a highly efficient and robust molecular recording and DNA memory platform that uses genomic DNA as an addressable, readable, and writable information storage and computation medium in living cells, much like a hard drive. This platform, called the DNA-based Ordered Memory and Iteration Network Operator (DOMINO), leverages precise DNA writing with CRISPR base editors (Komor et al., 2016; Nishida et al., 2016) to manipulate DNA dynamically and efficiently with single-nucleotide resolution in bacterial and mammalian cells. DOMINO enables robust and long-term molecular recording of the intensity and duration of signals of interest (i.e., analog information) into DNA. Multiple DOMINO operators can be layered to build logic operators that control the sequence and timing of events in living cells in a scalable fashion. Specifically, we show that the order and combinations of DNA writing events can be coordinated and tuned by external inputs, allowing order-independent (e.g., IF EVER A AND IF EVER B), sequential (e.g., A AND THEN B), and temporal (e.g., A AND AFTER TIME X THEN B) logic and memory operations to be executed. In addition, DOMINO can be combined with CRISPR-based gene regulation strategies, such as CRISPR interference (CRISPRi) (Qi et al., 2013) and CRISPR activation (CRISPRa) (Farzadfard et al., 2013; Gilbert et al., 2013), to achieve modular and versatile memory and gene regulatory functions. By leveraging this feature, we built a non-destructive DNA-state reporter genetic circuit in which mutational memory states generated in response to an incoming guide RNA (gRNA) are converted into distinct levels of a transcriptional output. Thus, the state of this circuit can be monitored by functional assays, obviating the need for cell destruction and DNA sequencing for readouts.
As such, DOMINO extends the utility of molecular recording beyond DNA Write-only applications (for which the output can be read only by disruptive sequencing methods) and enables long-term recording and monitoring of in vivo molecular events. These advances address many limitations of current in vivo recording and computing technologies and pave the way toward next-generation memory architectures for information processing and storage in living cells.
15. Komor, A.C., Kim, Y.B., Packer, M.S. ... Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016; 533:420-424.
23. Nishida, K., Arazoe, T., Yachie, N. ... Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science. 2016; 353:aaf8729.
8. Farzadfard, F., Perli, S.D., Lu, T.K. Tunable and multifunctional eukaryotic transcription factors based on CRISPR/Cas. ACS Synth. Biol. 2013; 2:604-613.
11. Gilbert, L.A., Larson, M.H., Morsut, L. ... CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 2013; 154:442-451.
Results
Designing the DOMINO Memory Architecture Using Base Editors as an Efficient Read-Write Head for Genomic DNA
We previously developed a moderately efficient, addressable, and precise DNA writer for genomic DNA called Synthetic Cellular Recorders Integrating Biological Events (SCRIBE) and demonstrated that it can be used as a molecular recorder with a dynamic range to record signal intensity and duration (i.e., analog information) into long-lasting DNA records (Farzadfard and Lu, 2014). However, the absence of an inherent Read operation, the relatively low DNA writing efficiency, and the requirement for host-specific factors limited the application of this Write-only system.
To address these limitations, we sought to leverage the base-editing technology (Komor et al., 2016; Nishida et al., 2016; Tang and Liu, 2018) as a single-nucleotide resolution Read-Write head for genomic DNA to build dynamic and scalable memory architectures in living cells. The Read-Write head is composed of Cas9 nickase (nCas9, an addressable DNA “reader” module that is directed by gRNA to specific DNA targets and nicks them) fused to cytidine deaminase (CDA; a DNA “writer” module that edits the DNA) and uracil DNA glycosylase inhibitor (ugi, a peptide that improves the DNA writing efficiency by blocking cellular repair machinery). Once localized to the target based on the 12-bp gRNA seed sequence (READ address), the writer module can deaminate deoxycytidines (dCs) in the vicinity of the 5′ end of the target (WRITE address), resulting in DNA lesions that are preferentially repaired to thymidines (dTs) (Komor et al., 2016). Using CDA as the DNA writer module enables introduction of dC-to-dT mutations (or deoxyguanosine [dG]-to-deoxyadenosine [dA] mutations in the reverse complement strand) in the WRITE address, resulting in permanent records in DNA. In this memory architecture, an individual mutation or a group of mutations in a target site can be designated as a unique memory state for the corresponding memory register, and mutations introduced by DNA writing events can be considered unidirectional transitions between DNA memory states (Figure 1A).
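The Write operation described above can be pictured as a string edit. The following is a minimal sketch, not the authors' code: the target sequence and the window coordinates are hypothetical, and the model assumes every dC inside the WRITE window is converted, which idealizes the position-dependent efficiencies seen in practice.

```python
def write_edit(target: str, window: range) -> str:
    """Return the target with every C inside the WRITE window changed to T.

    Bases outside the window, and non-C bases inside it, are left untouched,
    so applying the edit twice gives the same result (a one-way transition).
    """
    return "".join(
        "T" if i in window and base == "C" else base
        for i, base in enumerate(target)
    )


# Hypothetical 20-nt target; 0-based positions 1-4 stand in for the WRITE window.
target = "ACCGATTGACGTTAGCATGA"
edited = write_edit(target, range(1, 5))
print(edited)  # ATTGATTGACGTTAGCATGA
```

Because the edit only ever turns C into T, the edited sequence can serve as a distinct, stable memory state for that register.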
29. Tang, W., Liu, D.R. Rewritable multi-event analog recording in bacterial and mammalian cells. Science. 2018; 360:eaap8992.
DNA writing events in this system can be controlled by internal or external inputs by placing the expression of both gRNA and CDA-nCas9-ugi under the control of inducible promoters. In this design, which forms the basis of DOMINO operators, the signal controlling the expression of CDA-nCas9-ugi, required for the overall circuit to function, can be considered the operational signal, while the signals controlling the expression of individual gRNAs can be considered independently controllable inputs (Figure 1B). To demonstrate the performance of an individual DOMINO operator, we placed CDA-nCas9-ugi and gRNA on separate cassettes under the control of an anhydrotetracycline (aTc)-inducible promoter and an isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible promoter, respectively. Inducing Escherichia coli (E. coli) cells harboring these two cassettes with both aTc and IPTG resulted in efficient dC-to-dT mutations at two dC residues within the gRNA WRITE window, demonstrating successful DNA writing by DOMINO (Figure 1C). No mutation was detected if the cultures were not induced or if they were induced by only one of the two inducers. These results demonstrate that DOMINO can be used as a precise DNA writer to efficiently and deterministically manipulate genomic DNA in response to signals of interest.
Molecular Recording by DOMINO
We first asked whether DOMINO operators can be used (in an analogous way to SCRIBE) to record the dynamics of transient transcriptional signals of interest into DNA. To assess this, we exposed E. coli cultures harboring the previously mentioned circuit to the operational signal (aTc) and various levels of the input (IPTG) and monitored the accumulation of mutant alleles in the population by Illumina high-throughput sequencing (HTS) over the course of 24 h. The frequency of the mutant allele (memory state S1) increased as the concentration of IPTG increased, demonstrating that the DOMINO operator recorded input levels in the form of the fraction of alleles in the S1 state (Figure 1D).
We then sought to study the dynamics of mutation accumulation. To this end, we initially induced the cultures with aTc for 4 h and then diluted the cultures and added IPTG. A significant increase in the S1 allele frequency was detected as early as 1 h after IPTG addition. Mutations accumulated linearly over the next 7 h, demonstrating that the duration of exposure to the input could be recorded in DNA (Figure 1E). Mutant frequency, however, did not increase substantially after 8 h as the cultures reached saturation, suggesting that mutations accumulated faster in freshly diluted and actively growing cells. Consistent with the previous experiment, we did not observe significant accumulation of the mutant allele in cultures that were not exposed to the input (IPTG).
In these experiments, two mutable residues (CC) within the WRITE window of the gRNA were used, and the memory states were defined such that mutations in both residues were required to be considered a state transition. Consistent with previous reports (Komor et al., 2016), we observed that residues in different positions along the WRITE window were edited with different dynamics (Figures S1A and S1B). These results suggest that the number of intermediate memory states, as well as the response dynamics, can be tuned for each DOMINO operator by adjusting the number or position of mutable residues (dCs or dGs) within the WRITE window.
By replacing the IPTG-inducible promoter in the DOMINO operator with various inducible promoters, we showed that various signals (with either biological or physiological relevance) could be recorded in DNA. These signals included sugars (arabinose [Ara]), heavy metals (Cu2+), and darkness, as well as several biomarkers of gastrointestinal inflammation: blood (heme), hydrogen peroxide, and nitric oxide (Figures S1C–S1H). These results demonstrate that DOMINO operators could be used as modular molecular recording devices to capture the dynamics of transcriptional signals of interest in DNA.
Layered Molecular Recording and Computation by DOMINO
Upon successful demonstration of molecular recording by DOMINO, we next sought to investigate whether multiple DOMINO operators could be used to concurrently record the dynamics of multiple signals and perform logic operations in living cells. Specifically, we theorized that because of the precise and well-defined nature of the mutational outcomes generated by DOMINO operators, information regarding multiple signals (e.g., logical features such as presence or absence and analog features such as intensity or duration) could be recorded into nearby memory registers and the mutational state of these registers could be read by sequencing or on the fly with another DOMINO operator. These would allow DOMINO operators to be arrayed and interconnected in a highly scalable fashion such that the mutational outcome of one operator would serve as input for other operators. Such interconnected operators could be used to execute a series of order-independent and/or sequential unidirectional DNA writing events and build robust and complex forms of logic and memory operations to record and control the combination, order, and timing of molecular events in living cells.
Order-Independent DOMINO Logic
To demonstrate this concept, we set out to build a two-input order-independent AND logic gate, with which the A AND B logic is executed independent of the order of addition of the inputs by layering two DOMINO operators as indicated in Figure 2A. In this design, two distinct gRNAs were used. One gRNA was placed under the control of an IPTG-inducible promoter, and the other gRNA was placed under the control of an Ara-inducible promoter. In the presence of its corresponding inducer, each gRNA, once expressed, would direct the DNA Read-Write module (which is expressed in the presence of the operational signal aTc) to its cognate target site, resulting in precise dC-to-dT mutations within its WRITE window.
To assess the performance of the order-independent DOMINO AND gate, we induced E. coli cells harboring this circuit with different combinations of the inducers for several days with successive rounds of dilutions (to maintain the cells in an actively growing state) and analyzed allele frequencies at the target locus by HTS. In the presence of the operational signal (aTc) and either of the two inputs (IPTG or Ara), mutations accumulated in the target sites of the induced gRNA linearly within the population; the frequency of mutant alleles reached a plateau after 72 h of induction (Figures 2B and S2A), corresponding to transitions from the unmodified state (state S0) to either of the two singly modified states (state S1 or S2). However, when cells were induced with both inputs (IPTG AND Ara), the target sites for both gRNAs were edited, resulting in the accumulation of doubly edited sites (state S3) in the target locus. Low levels of a singly mutated allele (state S2) accumulated in the absence of induction, likely because of the leakiness of the Ara-inducible promoter (pBAD) (Meyer et al., 2019) (Figure S1C). Nevertheless, the doubly edited allele (state S3) accumulated only in the presence of both IPTG and Ara, indicating that robust AND logic can be achieved despite the leakiness of one of the input promoters.
In this experiment, we defined S0, S1, and S2 states as the OFF (0) states and S3 as the ON (1) state, which means that this system implements AND logic. However, these states are defined arbitrarily; the same circuit can be defined, for example, as a NOR gate if the unmodified state (S0) is designated as ON output and the modified states (S1, S2, and S3) are designated as OFF outputs. Alternatively, each of the four mutational states can be defined as a distinct output, in which case the circuit can be considered a 2-input/4-output decoder.
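The point that the gate type depends only on which states are labeled ON can be made concrete with a small truth-table sketch. It is illustrative only: it assumes full conversion and no promoter leakiness, takes S1 as the IPTG-only state and S2 as the Ara-only state, and the names `STATE`, `ON_STATES`, `readout`, and `decode` are invented for this example.

```python
from itertools import product

# Final allele state after full conversion, keyed by (IPTG seen, Ara seen).
STATE = {
    (False, False): "S0",
    (True, False): "S1",
    (False, True): "S2",
    (True, True): "S3",
}

# A readout is just a choice of which memory states count as ON.
ON_STATES = {
    "AND": {"S3"},  # ON only when both target sites are edited
    "NOR": {"S0"},  # ON only when neither target site is edited
}


def readout(gate: str, iptg: bool, ara: bool) -> int:
    """Digital output of the same circuit under a given ON-state labeling."""
    return int(STATE[(iptg, ara)] in ON_STATES[gate])


def decode(iptg: bool, ara: bool) -> tuple:
    """2-input/4-output decoder: one-hot vector over (S0, S1, S2, S3)."""
    state = STATE[(iptg, ara)]
    return tuple(int(state == s) for s in ("S0", "S1", "S2", "S3"))


for iptg, ara in product([False, True], repeat=2):
    print(iptg, ara, STATE[(iptg, ara)],
          readout("AND", iptg, ara), readout("NOR", iptg, ara),
          decode(iptg, ara))
```

Labeling only S0 as ON yields an output of 1 exactly when neither input was ever present, which is the NOR truth table.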
The time required for transitioning between unmodified and fully modified states can be considered the propagation delay of the corresponding DOMINO operator. Each of the two DOMINO operators used in this experiment exhibited a propagation delay of ∼3 days (3 dilution cycles). Accumulation of the singly mutated alleles in the presence of the operational signal and individual inducer inputs followed a linear trend over the course of a few days (Figure 2B). This feature means that DOMINO can be used to implement both analog and digital computing, because the continuous changes that occur within the propagation delay window can be used to implement analog computation, while fully converted states can be considered transitions between digital states and so be used for digital computation.
Although HTS offers a powerful way to quantify the outcome of DOMINO circuits, its relatively high cost motivated us to develop a strategy for using Sanger sequencing chromatograms to quantify position-specific mutant frequencies within a mixture of DNA species. This algorithm, named Sequalizer (for sequence equalizer), normalizes Sanger chromatogram signals and calculates the difference between the normalized signals from a test sample and an unmodified reference to identify position-specific mutations. It then uses this calculated difference to estimate position-specific mutant frequencies at any given target position (Data S1). We validated the accuracy of this method by constructing a standard curve (based on known ratios of mutant and wild-type [WT] sequences) and then comparing the Sequalizer results with next-generation sequencing (Figure S2). The Sequalizer output, which is based on population-averaged Sanger sequencing results, provides an estimate of position-specific mutant frequencies in a population. Though Sequalizer does not always provide accurate absolute values of mutant frequencies, fold changes in estimated mutant frequencies are accurate (Figures S2B–S2D). In addition, unlike HTS, Sequalizer output does not reveal the identities and frequencies of individual alleles in the population. Nevertheless, given the high specificity of the DNA writers and their predefined target sites, this approach can be used as a low-cost alternative to HTS to assess the performance of DOMINO and other precise genome-editing platforms.
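The normalize-then-subtract idea behind Sequalizer can be sketched as follows. This is not the algorithm supplied in Data S1; it is a simplified illustration that assumes each chromatogram has already been reduced to per-position peak heights for the four bases, that every position has a nonzero total signal, and that the mutation of interest is a gain of T signal. The function names and the toy peak values are hypothetical.

```python
def normalize(peaks):
    """Scale the four channel intensities at each position so they sum to 1."""
    normalized = []
    for position in peaks:
        total = sum(position.values())  # assumed nonzero in this sketch
        normalized.append({base: height / total for base, height in position.items()})
    return normalized


def mutant_fraction(test, reference, mutant_base="T"):
    """Per-position excess of the mutant base in the test vs. the reference."""
    test_n, ref_n = normalize(test), normalize(reference)
    return [
        max(0.0, t[mutant_base] - r[mutant_base])  # clip negative noise to 0
        for t, r in zip(test_n, ref_n)
    ]


# Toy two-position example: position 0 is 25% converted, position 1 is unchanged.
reference = [{"A": 0, "C": 100, "G": 0, "T": 0}, {"A": 0, "C": 200, "G": 0, "T": 0}]
test = [{"A": 0, "C": 75, "G": 0, "T": 25}, {"A": 0, "C": 200, "G": 0, "T": 0}]
print(mutant_fraction(test, reference))  # [0.25, 0.0]
```

Per-position normalization makes samples with different overall signal strength comparable, which is why the reference and test traces are scaled before they are subtracted.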
We analyzed the samples obtained from the experiment shown in Figure 2B by Sanger sequencing and Sequalizer, in addition to HTS. As shown in Figure S2E, in samples induced with either of the two inputs, the frequencies of mutations in positions corresponding to the cognate target sites of the induced gRNA increased. However, in samples that were induced with both inputs, the mutation frequencies in the target sites of both gRNAs increased (state S3). These results demonstrate that Sequalizer results are consistent with those obtained by HTS and that Sequalizer could be used to accurately estimate changes in position-specific mutant frequencies obtained by HTS.
The output of DOMINO operators takes the form of DNA mutations that accumulate at a target site. These mutations can be directly read by sequencing or coupled with functional elements to control cellular phenotypes. For example, by flanking the input gRNA target site with a desired promoter and a gRNA handle, the output of a given DOMINO operator can be converted into downstream gRNA expression. The output gRNA can then be interconnected with other DOMINO operators to build more complex circuits or combined with CRISPR-based gene regulation strategies, such as CRISPRi and CRISPRa, to dynamically regulate cellular phenotypes. To demonstrate this concept, we engineered an AND gate by layering two DOMINO operators under the control of IPTG- and Ara-inducible promoters to edit a third gRNA as the output. In the presence of both inducers, the output gRNA was modified by both input gRNAs such that it could then bind to and repress a downstream reporter gene (GFP) (Figure 2C).
In addition to an AND gate, other logic operations can be readily implemented by DOMINO by carefully positioning mutable residues on the targets or by designing the combinations and order of DNA writing events (Figures S3A and S3B). Furthermore, additional input gRNAs can be incorporated to achieve operators with more than two inputs, thus demonstrating the multi-input recording capacity and scalability of this approach (Figure S3C). Moreover, the mutational outcome of DOMINO operators, in addition to gRNAs, can be directed toward other regulatory and functional elements, such as promoters, ribosome-binding sites, start and stop codons, and active sites within proteins, to tune the expression or activity of downstream components (Figure S3D).
Sequential DOMINO Logic
In addition to realizing order-independent logic, the order of DNA writing events executed by DOMINO operators can be carefully controlled to achieve sequential logic, which generates the desired outputs only when the inducers are added in the correct order. For example, to achieve sequential logic, the gRNA output of one operator can be designed to serve as the input for a downstream operator (Figure S3B). This design can be used to functionally connect DOMINO operators that are not physically co-located. Alternatively, sequential logic can be achieved by overlapping mutable residues in the WRITE address of one operator with the READ address of a downstream operator (Figure 3). This design uses DNA mutations rather than cascades of gRNAs as a way to interconnect cis-encoded DOMINO operators, thus offering a highly compact and scalable strategy for encoding sequential logic.
To demonstrate the latter strategy, we constructed an asynchronous sequential DOMINO AND gate in E. coli, such that the sequential addition of the two inputs in the correct order (IPTG AND THEN Ara) would lead to the mutation of a cryptic start codon (ACG) into the canonical, more efficient start codon (ATG) in the GFP open reading frame (ORF), thus increasing the GFP signal (Figures 3A and 3B). We observed slight increases in GFP signal in cells that had been induced with the first inducer (i.e., IPTG) or those that had been co-induced with both inducers (Figure 3B). The former effect was likely caused by the leakiness of the Ara-inducible promoter (Figure S1C), while the latter was likely because of the simultaneous presence of both inducers in the media, which could produce, to some extent, sequential DNA mutations in the correct order. Nevertheless, the GFP signal was significantly higher when cells were exposed to the inducers in the correct order (IPTG AND THEN Ara). We confirmed these results by analyzing Sanger sequencing chromatograms by Sequalizer. Consistent with the flow cytometry data, the highest level of mutation in the cryptic start codon (Figure 3C, blue bars) was achieved when the samples were induced with the correct order, indicating the execution of sequential AND logic. To demonstrate that the order of different transcriptional events can be recorded as distinct mutational signatures in the DNA, we built two additional sequential logic circuits (Figure S4). These examples demonstrate that sequential DOMINO logic circuits can be used to program and commit cells to defined states based on the order of inputs.
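The order dependence of this circuit can be summarized as a two-step state machine. The sketch below is an idealization under stated assumptions: it ignores promoter leakiness and co-induction, treats each induction as driving its edit to completion, and assumes that the Ara-induced operator is the one that performs the ACG-to-ATG edit once the IPTG-induced edit has created its READ address. The function name is hypothetical.

```python
def run_sequential_and(inducers):
    """Return the start codon after applying inducers in the given order.

    inducers: ordered list such as ["IPTG", "Ara"].
    """
    first_edit_done = False   # has the IPTG-induced operator written its edit?
    start_codon = "ACG"       # cryptic start codon in the GFP ORF
    for inducer in inducers:
        if inducer == "IPTG":
            first_edit_done = True
        elif inducer == "Ara" and first_edit_done:
            # The downstream operator can bind only after the upstream edit.
            start_codon = "ATG"
    return start_codon


print(run_sequential_and(["IPTG", "Ara"]))  # ATG
print(run_sequential_and(["Ara", "IPTG"]))  # ACG
print(run_sequential_and(["IPTG"]))         # ACG
```

In this idealized model only the IPTG-then-Ara order produces the canonical start codon; the slight signal seen experimentally with IPTG alone or with co-induction falls outside what the model captures.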
Temporal DOMINO Logic
The preceding examples demonstrate that the sequence of DNA writing events mediated by DOMINO operators can be controlled by external cues. In addition to building sequential logic, such that the execution of events in a specified order leads to a desired output, the inherent propagation delay in DOMINO operators can be exploited to incorporate delays and temporal information into circuits so that a desired output is produced only after a certain period has passed. Multiple DOMINO operators can be placed sequentially in an array to build longer delays. In a simple form, DOMINO delay operators can be built by constructing a series of overlapping repeats to act as target sites for a desired gRNA (Figure 4A). This repeat configuration allows the READ address of each gRNA operator site to overlap with the WRITE address of the previous gRNA. Initially, the gRNA can bind to the first (i.e., 3′ end) repeat, but not to the upstream copies of the repeat that harbor dC residues (instead of dT), in the sequence corresponding to the gRNA READ address. Upon binding to the first repeat, the gRNA can recruit the Read-Write head to the first repeat, which can mutate the dC residues in the repeat immediately upstream of its binding site (i.e., the second repeat), thus converting that repeat to a new binding site for the same gRNA. This process is sequentially repeated to generate new binding sites for the same gRNA. Much like an array of physical domino pieces that fall one by one, each genome-editing event is initiated only after editing in the previous repeat has occurred, thus ensuring a sequential cascade of unidirectional DNA writing events over time. The output of the delay elements can be combined with additional logic operators and internal or external cues to create more complex forms of temporal logic.
To demonstrate this concept, we placed three DOMINO delay elements into an array and linked the output of the array to a second DOMINO operator (Figure 4A). This design achieves temporal and sequential AND logic, because the first (i.e., IPTG-inducible) gRNA has to execute three consecutive DNA writing events before the Ara-inducible gRNA (corresponding to the last operator) can bind to and edit its target. We induced cells harboring this circuit with IPTG at various concentrations for 4 consecutive days, followed by a final day of induction with Ara. Sequalizer and HTS analysis of these samples demonstrated a time- and IPTG-dosage-dependent accumulation of mutations in the target sites within repeats, corresponding to propagation of the signal through the repeat array (Figures 4B and 4C). By the end of the experiment, mutations in the target site of the second gRNA (shown by the blue arrow in Figure 4B) were detected only under conditions in which mutations had accumulated through the entire cascade, corresponding to the samples that had been induced with the high IPTG concentrations (i.e., 0.01 and 0.1 mM). The second gRNA, which is under the control of Ara, was only able to bind to and edit its target when the third copy of the repeat was edited by the first gRNA, thus demonstrating the desired time-dependent sequential logic. These results demonstrate that in addition to enacting delays in gene circuits, an array of DOMINO delay elements can be used as a multi-state memory register that undergoes stepwise transitions between discrete states in a time- and dosage-dependent fashion.
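The delay array can be modeled as a counter that must fill before the downstream operator can fire. The following sketch is a deliberate simplification: it assumes exactly one repeat is edited per round of IPTG induction in a single idealized cell, whereas the measured behavior is a population-level, dose-dependent accumulation. The function names and the per-round schedule are hypothetical.

```python
def edit_next_repeat(repeats):
    """Edit the first unedited repeat, if any, and return the new array."""
    repeats = list(repeats)
    if False in repeats:
        repeats[repeats.index(False)] = True
    return repeats


def run_delay_circuit(schedule, n_repeats=3):
    """Apply one inducer per round; return (repeat states, output written?)."""
    repeats = [False] * n_repeats  # False = unedited, True = edited
    output_written = False
    for inducer in schedule:
        if inducer == "IPTG":
            repeats = edit_next_repeat(repeats)
        elif inducer == "Ara" and all(repeats):
            # The second gRNA can bind only after the last repeat is edited.
            output_written = True
    return repeats, output_written


print(run_delay_circuit(["IPTG"] * 4 + ["Ara"]))  # ([True, True, True], True)
print(run_delay_circuit(["IPTG"] * 2 + ["Ara"]))  # ([True, True, False], False)
```

Changing `n_repeats` lengthens or shortens the delay in this model, mirroring the idea that the number of repeats sets the number of memory states and the total delay.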
The timing and dynamics of transition between memory states (propagation delays) in DOMINO circuits can be further controlled by adjusting the writing efficiency (e.g., changing the position of mutable residues within the WRITE window) (Figure S1A) or tuned dynamically by external cues (e.g., a stronger signal) (Figure 4). In addition, the number of memory states and the total delay can be programmed by changing the number of repeats (Figure S5). We envision that more complex versions of temporal logic, such as genetic counters and timers, can be constructed by integrating delay elements into multiple-input DOMINO operators.
Non-destructive DNA-State Reporters
Existing DNA-based molecular recording technologies rely on DNA sequencing as readout. As such, to retrieve the recorded information, recording has to be stopped. A unique feature of DOMINO compared with other memory platforms is that the DOMINO DNA Read-Write head can be further functionalized with additional effector domains (e.g., transcriptional activators and repressors) to achieve concurrent DNA writing and transcriptional regulation. This feature, combined with the precise and sequential DNA writing achieved by DOMINO, enables one to perform both genetic and epigenetic modulation and correlate DNA memory states with distinct functional outcomes that can be continuously monitored in living cells without disrupting the cells.
To demonstrate this concept, we constructed a non-destructive DNA-state reporter gene circuit in the HEK293T cell line. We first made an array of four overlapping WT operator repeats (4xOp) and a downstream single-copy mutant operator (1xOp∗), which harbored a dC-to-dT mutation. We then placed this array upstream of a minimal adenovirus major late promoter (MLP) controlling the expression of GFP, to build the 4xOp_1xOp∗_GFP reporter construct. As a control, we constructed the 1xOp∗_GFP reporter, which previously had been shown to have negligible activity (Farzadfard et al., 2013). We also functionalized the DNA Read-Write head (nCas9-CDA-ugi) with a transcriptional activator domain (VP64) and cloned the nCas9-CDA-ugi-VP64 fusion construct, along with either of the two reporter constructs, into lentiviral vectors, which were subsequently introduced into HEK293T cells. We then delivered a second lentiviral vector encoding either a mutant operator (Op∗)-specific gRNA (gRNA(Op∗)) or a non-specific gRNA (gRNA(NS), as negative control) to these cells. Upon binding, gRNA(Op∗) could mutate the critical dC residue in the adjacent upstream WT operator (Op) repeat, thus converting the Op repeat to an Op∗ sequence that could serve as a new binding site for the same gRNA; this strategy enables sequential rounds of mutation (i.e., Op to Op∗ conversion), gRNA binding, and transactivator recruitment events (Figure 5A).
We sequentially passaged cells harboring these circuits every three days for fifteen days (Figure 5B) and monitored both GFP expression (by microscopy, Figures 5C and S6A) and allele frequencies (by HTS, Figure 5D). The normalized frequency of GFP-positive cells and the GFP expression level of individual cells in cultures harboring the 4xOp_1xOp∗_GFP reporter and gRNA(Op∗) gradually increased over time, indicating the gradual activation of the reporter in these cells (Figure 5C). These data suggest that the number of bound transactivators, and thus the number of activated repeats (i.e., Op∗), which serve as operator sites for the chimeric Read-Write-transactivator protein, were increased in these cells. However, in cultures that were transduced with gRNA(NS) or those that contained the 1xOp∗_GFP reporter, the GFP signal remained below the detection threshold.
These results were confirmed by HTS analysis of the allele frequencies throughout the experiment. As shown in Figure 5D, the frequency of the WT allele (state S0) in cells containing the repeat array and gRNA(Op∗) linearly decreased over time, indicating that the DNA writing circuit can be used as an analog recorder for the input gRNA. As expected, the final memory state (i.e., S5), which corresponds to the highest GFP expression state, increased steadily over time. Consistent with the GFP data, the first four memory states (S1 through S4) started to accumulate sequentially until they reached a plateau (i.e., the steady state) toward the end of the experiment (Figure 5E). No significant changes in allele frequencies were observed in cells that were transduced with a non-specific gRNA (Figure S6B). Together with the microscopy data, these results show that the analog properties of a signal, such as the duration of exposure to gRNA(Op∗), can be faithfully and permanently recorded within the distribution of memory states of the DNA recorder. In this circuit, higher GFP-expression levels are correlated with the higher memory states. Thus, the GFP signal can be used to continuously monitor DNA memory states without requiring cellular destruction for sequencing. Furthermore, at the single-cell level, this memory architecture constitutes a multi-bit digital recorder that associates a longer (or higher intensity of) incoming signal (i.e., gRNA expression) with transitions to higher memory states.
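The sequential state transitions described above can be illustrated with a toy population model. The following is a minimal sketch, not the authors' analysis: it assumes that, in each passage interval, a cell in any non-final state advances by exactly one state with a fixed probability `p_edit` (a hypothetical value, not fitted to the data) and that the final state is absorbing. The function names are illustrative only.

```python
def simulate_cascade(n_transitions=5, p_edit=0.2, n_intervals=5):
    """Return the population distribution over states S0..S_n after each interval.

    Assumptions (hypothetical, for illustration): every non-final state advances
    by one step with probability p_edit per interval; the last state is absorbing.
    """
    n_states = n_transitions + 1
    dist = [1.0] + [0.0] * n_transitions  # all cells start in S0
    history = [dist[:]]
    for _ in range(n_intervals):
        new = [0.0] * n_states
        for state, frac in enumerate(dist):
            if state < n_transitions:
                new[state] += frac * (1.0 - p_edit)  # no edit in this interval
                new[state + 1] += frac * p_edit      # one Op-to-Op* conversion
            else:
                new[state] += frac                   # final state cannot advance
        dist = new
        history.append(dist[:])
    return history


def mean_state(dist):
    """Population-average memory state, a proxy for the bulk reporter signal."""
    return sum(state * frac for state, frac in enumerate(dist))


# Five passage intervals, matching sampling every three days for fifteen days.
history = simulate_cascade()
for interval, dist in enumerate(history):
    print(interval, [round(f, 3) for f in dist], round(mean_state(dist), 3))
```

In this sketch the S0 fraction falls and the mean state rises with the duration of exposure, which is the qualitative behavior an analog recorder needs. The model makes S0 decay geometrically, so it is not meant to reproduce the measured kinetics in Figure 5D.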
In this experiment, in addition to dC-to-dT mutations, we observed dC-to-dG and dC-to-dA mutations, albeit with lower frequencies than dC-to-dT mutations (Figure S6C). This observation is consistent with previous reports for mammalian cell lines (Komor et al., 2016; Nishida et al., 2016) and reflects the promiscuous repair outcome of deaminated dC (dU) lesions in these cells. In samples containing the 1xOp∗_GFP reporter, the frequency of the WT allele (state S0) decreased and the frequency of the mutant alleles increased linearly over time (Figure S6C). Thus, even without a repeat array, the accumulation of mutations in a specific target site can be used as an analog readout of an incoming signal.
In this experiment, we used VP64 as the transactivator domain. However, the activation level and dynamic range of the reporter output can be tuned by using stronger activator domains, such as VPR (Chavez et al., 2015). Alternatively, other effector domains, such as repressors (Farzadfard et al., 2013), DNA methyltransferases (Liu et al., 2016), acetyltransferases (Hilton et al., 2015), or other types of histone modification domains, could be used to implement more sophisticated forms of combined memory and gene regulation programs.
Discussion
By using a DNA Read-Write head that can manipulate genomic DNA with nucleotide resolution, DOMINO converts the genomic DNA into an addressable, readable, and writable medium for information processing and storage in living cells. Various orthogonal DOMINO operators can be built by simply changing the sequence of gRNAs, making the system highly scalable. Furthermore, because DOMINO enables manipulation of DNA with nucleotide resolution within a defined narrow window, compact multi-input operators can be readily created by targeting multiple gRNAs to nearby registers, which can then be interrogated by a downstream operator. Unlike other precise DNA writing systems that require multiple proteins (e.g., recombinases) to encode memory, DOMINO uses small gRNAs and only one protein moiety, minimizing metabolic burden on the cells. DOMINO enables molecular recording and highly compact and scalable logic and memory operators that, unlike previous strategies, can be used for both digital and analog computation in living cells. Thus, this scalable approach expands our capacity for molecular recording and computation in living cells and, as a result, the ability to monitor and control cellular phenotypes.
Like other synthetic gene circuits, non-optimal performance of gene regulatory elements, such as leaky promoters, could negatively affect the performance of DOMINO. These limitations may be overcome with systematic optimizations, such as engineering reduced basal promoter activities via directed evolution strategies (Meyer et al., 2019), using engineered promoters with tighter activity (Arpino et al., 2013), or lowering the copy numbers of gene circuits (Lee et al., 2016).
In our experiments (Figure 1E), we detected the presence of a signal as early as 1 h (∼2 generations) after the start of induction. The temporal dynamics of recording could be improved by using newer generations of base editors with higher editing efficiency (Koblan et al., 2018) or by making conditional DNA writing modules with faster response times (e.g., by incorporating signal-responsive RNA and protein motifs into the gRNA and the base editor, respectively). Combining the nucleotide-deaminase-based DNA writing modules with alternative DNA binding proteins as Read modules (such as RNA polymerases) could also increase the temporal resolution and capacity of molecular recording.
Deterministic DOMINO operators and cascades rely on precise base-editing events for proper function. Our results show that when the CDA-nCas9-ugi head is used, the outcome of these operators in E. coli is almost exclusively in the form of dC-to-dT mutations (Figure S1). However, in human cells, other nucleotides (dG and, to a lesser extent, dA) are also generated, albeit at a lower rate than dT (Figure S6C). The promiscuous repair outcome of CDAs in mammalian cells results in by-products and abortive memory states that cannot undergo additional rounds of state transition in DOMINO cascades (Figure 5E), which in turn negatively affects the overall performance of the DOMINO circuits. Using DNA writer modules with higher DNA writing efficiency and purer mutational outcomes (such as adenosine base editors) (Gaudelli et al., 2017) could help to reduce the level of these by-products and improve the performance of deterministic and complex DOMINO circuits in mammalian cells. Furthermore, incorporating orthogonal DNA writing modules (e.g., cytidine and adenosine deaminases) into DOMINO should make reversible DNA writing possible, which has been challenging to achieve with previous DNA memory platforms. Reversible DNA writing would enable bidirectional cellular programs and pave the way for sophisticated biological state machines, cellular automata, and Turing machines that use the genomic DNA of living cells as a rewritable memory tape to perform advanced memory and computation operations.
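To make the cost of by-products concrete, the arithmetic can be sketched as follows. This is an illustrative calculation under a simplifying assumption that is not taken from the paper: each editing step independently yields the intended dC-to-dT outcome with probability `purity`, and any other outcome produces an abortive state. The purity values used are hypothetical.

```python
def fraction_completing(purity, n_steps):
    """Fraction of edited lineages that pass through n_steps consecutive
    transitions without hitting an abortive by-product.

    Assumes (hypothetically) independent steps, each giving the intended
    outcome with probability `purity`.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return purity ** n_steps


# Hypothetical purities; longer cascades are penalized more strongly.
for purity in (0.99, 0.9, 0.7):
    print(purity, [round(fraction_completing(purity, k), 3) for k in (1, 2, 4)])
```

Under this assumption the loss compounds with cascade length, which is why purer mutational outcomes matter most for deep, deterministic cascades.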
Several CRISPR-Cas9-based strategies for recording biological events, such as signaling dynamics and cellular lineage histories, into DNA have been described (Kalhor et al., 2018; McKenna et al., 2016; Perli et al., 2016). These approaches rely on stochastic DNA memory states (i.e., insertion or deletion [indel] mutations) that are generated by pseudorandom DNA writing through Cas9-mediated double-strand DNA breaks and subsequent repair of these breaks by non-homologous end joining (NHEJ). However, the recording capacity of these recorders is exhausted after recording a few molecular events because of the loss of gRNA target sites; these recorders are, therefore, not ideal for performing logic or long-term recording of signaling dynamics or event histories (Farzadfard and Lu, 2018). Moreover, because indel mutations (memory states) are stochastically generated by NHEJ, new mutations could destroy the previous mutations and thus overwrite the previous memory states, making it challenging to trace lineage histories. In addition, none of these strategies can be used in organisms lacking an efficient NHEJ repair pathway, such as prokaryotes. In contrast, mutational memory states generated by DOMINO are precise, unidirectional, position specific, and minimally disruptive. These features address the abovementioned limitations and ensure that previous mutations are preserved after each editing step and can be accurately traced.
In addition, the precise and predictable memory state transitions in DOMINO recorders enable memory states to be coupled to functional biological outcomes, such as changes in gene expression, thus obviating the need for sequencing as readouts for certain applications (Figure 5). Furthermore, DOMINO does not require double-strand DNA breaks or NHEJ; thus, it can function in both bacterial and mammalian cells autonomously and continuously over many generations. We envision that the DNA records generated by DOMINO recording systems could be used to study signaling dynamics and event histories in their native contexts over many generations, as well as building gene circuits with artificial learning capacities. The promiscuous repair of dC lesions in mammalian cells could be beneficial for tracing cell lineages, because it can increase the number of potential memory states. Moreover, signal-responsive lineage maps with tunable resolution can be generated because the activity of the DOMINO recorder can be modulated by internal or external signals of interest. Combining these recorders with single-cell sequencing, advanced barcoding schemes, and self-targeting guide RNAs (Perli et al., 2016) should pave the way toward more advanced recorders for long-time monitoring of signaling dynamics and cellular lineages (Farzadfard and Lu, 2018).
We envision that our long-term, compact, scalable, modular, and minimally disruptive genetic memory architectures will enable an unprecedented set of applications for building genetic programs and recording and controlling spatiotemporal molecular events in their native contexts. These applications could affect many fields, including developmental biology, cancer biology, stem cell biology, and brain mapping. For example, DOMINO can be used to program the timing and progression of developmental stages within living animals or to perform long-term lineage tracking experiments in mammals, which has not been feasible to date because of the lack of scalable and long-term methodologies. DOMINO recorders could be adapted to map neural activity by driving the activity of DNA writers with regulators that respond to neural activity. One could study the order and temporal nature of signaling events in their native contexts and robustly control cellular differentiation cascades ex vivo and in vivo. These recorders could be programmed to investigate tumor development and unveil the cellular and environmental cues involved in tumor heterogeneity. They could also be used to record arbitrary information into the DNA of living cells for DNA storage applications. Finally, living sensors could be designed to sense pathogens, toxins, or other signals within the body or in the environment and then later report on this information in detail.
STAR★Methods
Key Resources Table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Bacterial and Virus Strains | ||
E. coli: MG1655 PRO Strain | Lutz and Bujard (1997) | N/A |
E. coli: 5-alpha F’ Iq Strain | NEB | Cat# C2992I |
E. coli: E. cloni® 10G Strain | Lucigen | Cat# 60107-4 |
Chemicals, Peptides, and Recombinant Proteins | ||
Anhydrotetracycline (aTc) | Cayman Chemical Co. | Cat# 10009542 |
Isopropyl β-D-1-thiogalactopyranoside (IPTG) | GoldBio | Cat# I2481C25 |
L-Arabinose (Ara) | GoldBio | Cat# A-300-500 |
Copper Sulfate | Sigma-Aldrich | Cat# C1297-100G |
Defibrinated Horse Blood | Hemostat Laboratories | Cat# DHB030 |
Hydrogen Peroxide | Sigma-Aldrich | Cat# 216763-500ML |
Diethylenetriamine/nitric oxide adduct | Sigma-Aldrich | Cat# D185-10MG |
Carbenicillin | Teknova | Cat# C2105 |
Kanamycin Sulfate | VWR | Cat# 97061-600 |
Chloramphenicol | Sigma-Aldrich | Cat# C0378-25G |
Dulbecco’s Modified Eagle Medium (DMEM) | Thermo Fisher Scientific | Cat# 10569044 |
Dulbecco’s Modified Eagle Medium (DMEM), no phenol red | Thermo Fisher Scientific | Cat# 21063029 |
Dulbecco’s Phosphate-Buffered Saline | Thermo Fisher Scientific | Cat# 14190250 |
0.25% Trypsin EDTA | Thermo Fisher Scientific | Cat# 25200114 |
Fetal Bovine Serum (FBS) | Corning | Cat# MT35010CV |
Penicillin-Streptomycin | Thermo Fisher Scientific | Cat# 15140122 |
FuGENE® HD Transfection Reagent | Promega | Cat# E2312 |
Opti-MEM® Reduced Serum Medium | Thermo Fisher Scientific | Cat# 31985-070 |
0.45 μm Acrodisc® Syringe Filters | Pall Corporation | Cat# 4614 |
Hexadimethrine bromide | Sigma-Aldrich | Cat# H9268-5G |
Puromycin Dihydrochloride | Thermo Fisher Scientific | Cat# A1113803 |
QuickExtract DNA Extraction Solution | EpiBio | Cat# QE09050 |
Phire Plant Direct PCR Master Mix | Thermo Fisher Scientific | Cat# F160L |
KAPA HiFi HotStart ReadyMix | Kapa Biosystems | Cat# KK2602 |
Deposited Data | ||
Raw HTS data | This paper | NCBI: PRJNA509198 |
Raw microscopy images | This paper; Mendeley Data | https://data.mendeley.com/datasets/vjvvz534dm/1 |
Experimental Models: Cell Lines | ||
Human: HEK293T Cell Line | ATCC | Cat# CRL-11268 |
Experimental Models: Organisms/Strains | ||
E. coli: MG1655 PRO strain | Lutz and Bujard (1997) | N/A |
Oligonucleotides | ||
Oligonucleotides are listed in Table S3 | This paper | N/A |
Recombinant DNA | ||
pFUGW | Lois et al. (2002) | Addgene #14883 |
pCMV-VSV-G | Stewart et al. (2003) | Addgene #8454 |
psPAX2 | Trono Lab | Addgene #12260 |
Additional plasmids are listed in Table S1 | This paper | N/A |
Software and Algorithms | ||
Sequalizer | This study | N/A |
MATLAB | MathWorks | N/A |
CellProfiler | Broad Institute | N/A |
FACSDIVA & FlowJo | BD Biosciences | N/A |
SeqDoC | Crowe (2005) | N/A |
Other | ||
DeltaVision microscopy imaging system | GE Healthcare | N/A |
LSR Fortessa II flow cytometer | BD Biosciences | N/A |
Lead Contact and Materials Availability
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Timothy K. Lu (timlu@mit.edu).
Experimental Model and Subject Details
Bacterial experiments were performed in E. coli MG1655 PRO strain (MG1655 strain that harbors PRO cassette (pZS4Int-lacI/tetR, Expressys) and expresses lacI and tetR at high levels) (Lutz and Bujard, 1997). Mammalian cell experiments were performed in HEK293T cell line (ATCC CRL-11268).
Method Details
Plasmid Construction
Standard molecular biology and cloning techniques, including ligation, Gibson assembly (Gibson, 2011) and Golden Gate assembly (Engler et al., 2008) were used to construct the plasmids. Chemically competent E. coli DH5α F’ lacIq (NEB) and E. cloni 10G (Lucigen) were used for cloning. Lists of plasmids, synthetic parts and sequencing primers used in this study are provided in Tables S1, S2, and S3, respectively. Plasmids and their corresponding maps will be available on Addgene.
Antibiotics and Inducers
For bacterial selection, antibiotics were used at the following concentrations: Carbenicillin (Carb, 50 μg/mL), Kanamycin (Kan, 20 μg/mL), and Chloramphenicol (Cam, 25-30 μg/mL). For the experiments shown in Figures 2C, S3B, and S4, different combinations of 200 ng/mL anhydrotetracycline (aTc), 0.1 mM Isopropyl β-D-1-thiogalactopyranoside (IPTG), and 0.2% Arabinose (Ara) were used to induce the corresponding circuits. For the experiments shown in Figures S3D and S5, 250 ng/mL aTc and 0.005% Ara were used. For the experiment shown in Figure 3A, 150 ng/mL aTc, 0.1 mM IPTG, and 0.2% Ara were used. For all the other experiments, unless otherwise noted, 250 ng/mL aTc, 1 mM IPTG, and 0.2% Ara were used. All concentrations are final concentrations.
Other inducers (CuSO4, Heme, H2O2, and NO) were used with final concentrations as indicated in Figures S1C–S1H. Diethylenetriamine/nitric oxide adduct was used as the source of NO. Defibrinated horse blood (Hemostat) was used as the source of Heme (Blood was lysed by first diluting 1:10 in simulated gastric fluid (SGF) (0.2% NaCl, 0.32% pepsin, 84 mM HCl, pH 1.2) before further dilution in culture media) (Mimee et al., 2018).
Bacterial Cell Experiments
Different plasmids expressing gRNAs and targets (listed in Table S1) were transformed into the reporter cells (MG1655 PRO) harboring aTc-inducible CDA-nCas9-ugi (for bacterial experiments, APOBEC1 CDA (Komor et al., 2016) was used as the writing module). Single transformant colonies were grown in appropriate antibiotics for 4-8 hours to obtain seed cultures. Seed cultures were diluted (1:100) in fresh media containing different combinations of the inducers and grown in 96-well plates (or tubes for the Dark/Light conditions shown in Figure S1E) with serial dilutions (if applicable) as indicated in induction patterns in corresponding figures. Samples for various measurements including HTS, Sequalizer, and flow cytometry were taken at indicated time points.
Mammalian Cell Culture
HEK293T cells were grown in DMEM supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin at 37°C with 5% CO2.
Lentivirus Production
Lentiviral constructs were cloned using the FUGW backbone (Addgene #25870) and packaged in HEK293T cells by co-transfection with the psPAX2 and pCMV-VSV-G helper plasmids. In brief, 4.4 × 10⁵ HEK293T cells per well were seeded in a 6-well plate. After a 24-hour incubation, media were replaced with 2 mL of fresh culture media containing FuGENE HD/DNA complexes. To prepare the FuGENE HD/DNA complexes, 9 μL of FuGENE HD (Promega) was added to a mixture of three plasmids consisting of 0.1 μg of pCMV-VSV-G vector, 0.5 μg of the lentiviral packaging vector psPAX2, and 1 μg of lentiviral expression vector in 100 μL of Opti-MEM reduced serum medium (Thermo Fisher Scientific), followed by a 20-minute incubation at room temperature. Media of transfected cells were replaced with 2 mL of fresh culture media 18 hours post-transfection. The supernatant containing newly produced viruses was collected at 48 hours post-transfection and filtered through a 0.45 μm syringe filter (Pall Corporation, Ann Arbor, MI; Catalog #4614). Filtered lentiviruses were used to infect the respective cell lines in the presence of polybrene (Sigma-Aldrich, 8 μg/mL). Successful lentiviral integration was confirmed by using lentiviral plasmid constructs constitutively expressing fluorescent proteins or antibiotic resistance genes to serve as infection markers.
Generation of Monoclonal Cell Lines
A lentiviral plasmid construct was made by placing the coding sequence for the nCas9-CDA-ugi-VP64 fusion protein (with nuclear localization signals), linked to the puromycin resistance gene by a P2A sequence, under the control of the constitutive CMV promoter (for mammalian experiments, PmCDA (Nishida et al., 2016) was used as the writing module). In addition, repeat arrays (4xOp_1xOp∗ or 1xOp∗) were placed upstream of the minimal adenovirus major late promoter (MLP) driving GFP, and the resultant reporter constructs were cloned into the same lentiviral construct and packaged into viral particles. For generation of monoclonal cell lines, 4.4 × 10⁵ HEK293T cells per well were seeded in a 6-well plate. After a 24-hour incubation, media were replaced with 4 mL of the viral supernatant to infect cells with lentiviruses in the presence of 8 μg/mL polybrene. The viral supernatant was replaced with 2 mL of fresh culture media 24 hours post-transduction. Drug selection with puromycin (Thermo Fisher Scientific, 7 μg/mL) was started 72 hours after transduction and continued for 8 days. During drug selection, a confluent well or dish was expanded into a 10 cm dish or T-75 flask. The pooled population was then suspended at a concentration of 5 cells/mL in fresh cell culture media, and 100 μL of cell suspension was transferred into each well of a 96-well plate. After 7 days of undisturbed incubation, each well was examined under a microscope, and monoclonal cells forming a colony were harvested and expanded.
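The limiting-dilution step above seeds an average of 0.5 cells per well (5 cells/mL × 0.1 mL). As a worked example, the sketch below estimates how many wells are expected to start from a single cell. It assumes that the number of cells per well follows a Poisson distribution, which is a standard approximation and not a statement from the protocol; the variable names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


cells_per_ml = 5.0   # suspension density stated in the protocol
volume_ml = 0.1      # 100 uL dispensed per well
mean_cells = cells_per_ml * volume_ml  # expected cells per well

p_empty = poisson_pmf(0, mean_cells)
p_single = poisson_pmf(1, mean_cells)
p_multi = 1.0 - p_empty - p_single
# Among wells that received at least one cell, the share seeded by one cell.
p_mono_given_occupied = p_single / (1.0 - p_empty)

print(f"mean cells per well: {mean_cells:.2f}")
print(f"empty wells: {p_empty:.3f}")
print(f"single-cell wells: {p_single:.3f}")
print(f"multi-cell wells: {p_multi:.3f}")
print(f"single-cell share of occupied wells: {p_mono_given_occupied:.3f}")
```

Under the Poisson assumption, roughly 61% of wells are empty, 30% receive one cell, and 9% receive more than one, so about 77% of occupied wells start from a single cell. This is why each well is still inspected by microscopy before a colony is treated as monoclonal.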
Non-destructive DNA-State Reporter Experiment
On day 0, 440,000 clonal reporter cells per well were infected, in the presence of 8 μg/mL polybrene, with 4 mL of the viral supernatant with high titer lentiviral particles encoding the gRNAs driven by the U6 promoter in a 6-well plate in triplicate. Infection efficiency was more than 90% in every sample. The cells were harvested every 3 days until day 15 after the infection. Half of the harvested cells were seeded in a 6-well plate for further culture. One-fifth of the harvested cells was seeded in a glass bottom 6 well plate (MatTek Corporation, USA) in DMEM without phenol red supplemented with 10% FBS and 1% penicillin-streptomycin for microscopic analysis. The remaining cells were collected for next-generation sequencing.
Microscopy
Fluorescence microscopy images of cells in glass-bottom tissue culture plates were obtained at 12 hours after subculture by using the DeltaVision microscopy imaging system (Applied Precision) with a 20x objective lens.
Flow Cytometry
Bacterial cultures were diluted 1:10, and the GFP signal in the diluted samples was measured using an LSR Fortessa II flow cytometer (Becton Dickinson, NJ) equipped with a 488/FITC laser/filter set.
High-throughput Sequencing
For each bacterial sample, 5 μL of culture was resuspended in 15 μL of QuickExtract DNA Extraction Solution (Epicenter, WI) and lysed by a two-step protocol (15 minutes incubation at 65°C followed by 2 minutes incubation at 98°C). For each mammalian cell sample, the cell pellet was resuspended in 40 μL of QuickExtract DNA Extraction Solution and lysed by a two-step protocol (30 minutes incubation at 65°C followed by 16 minutes incubation at 98°C). Target sites were PCR amplified using 2 μL of lysed bacterial culture (or 2.5 μL of mammalian cell lysate) as template and the appropriate primers listed in Table S3. The resulting amplicons were used as templates in a second round of PCR to add Illumina barcodes and adaptors. The amplicons were then multiplexed and sequenced on an Illumina platform.
Sanger Sequencing
For each sample, target sites were PCR amplified by target-specific primers and Sanger sequenced by Quintara Biosciences. The obtained chromatograms were analyzed by Sequalizer.
Quantification and Statistical Analysis
Flow Cytometry Data Analysis
All samples were uniformly gated (using the strategy indicated in Figure S2F), and the mean fluorescence and percentage of GFP-positive cells were calculated with FACSDIVA and FlowJo (BD Biosciences). Experiments were performed in triplicate.
Microscopy Image Analysis
Microscopy images were analyzed with CellProfiler software, and the number of GFP- and BFP-positive cells as well as the GFP signal intensity in GFP-positive cells were measured using the ‘ColorToGray’, ‘IdentifyPrimaryObjects’ (for BFP), ‘IdentifyPrimaryObjects’ (for GFP), ‘MeasureObjectIntensity’, and ‘ExportToSpreadsheet’ modules. Image data were inspected for complete removal of false-positive debris after the CellProfiler analysis. For each sample, the number of GFP-positive cells was normalized to the total number of gRNA-transduced cells by dividing the total number of GFP-positive cells by the total number of BFP-positive cells in 40 random fields of view. Experiments were performed in triplicate.
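The normalization described above can be written out as a short calculation. This is a minimal sketch assuming that per-field GFP-positive and BFP-positive counts have already been exported (for example, from the CellProfiler spreadsheet output); the function name and the input layout are hypothetical.

```python
def normalized_gfp_fraction(gfp_counts, bfp_counts):
    """Total GFP-positive cells divided by total BFP-positive cells.

    Each argument holds one count per field of view (40 fields per sample in
    the protocol). Counts are pooled across fields before dividing, so fields
    with few cells do not dominate the ratio.
    """
    if len(gfp_counts) != len(bfp_counts):
        raise ValueError("expected one GFP and one BFP count per field")
    total_bfp = sum(bfp_counts)
    if total_bfp == 0:
        raise ValueError("no BFP-positive (gRNA-transduced) cells counted")
    return sum(gfp_counts) / total_bfp


# Made-up counts for three fields, for illustration only.
print(normalized_gfp_fraction([2, 3, 5], [10, 10, 20]))
```

Pooling counts before dividing matches the wording of the protocol (total GFP-positive over total BFP-positive); averaging per-field ratios would be a different statistic.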
HTS Data Analysis
The obtained sequencing reads were demultiplexed and allele frequencies were calculated using a custom MATLAB script.
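The authors' MATLAB script is not reproduced here. As a rough illustration of what an allele-frequency calculation involves, the following Python sketch counts the distinct sequences observed in a target window across reads. It assumes the reads are already demultiplexed and trimmed so that the target window sits at a fixed offset in every read; the function name, arguments, and example reads are hypothetical.

```python
from collections import Counter


def allele_frequencies(reads, start, end):
    """Return {allele_sequence: frequency} for the window reads[start:end].

    Reads too short to cover the window are skipped. Frequencies are relative
    to the number of reads that cover the window.
    """
    alleles = Counter(read[start:end] for read in reads if len(read) >= end)
    total = sum(alleles.values())
    if total == 0:
        return {}
    return {allele: count / total for allele, count in alleles.items()}


# Made-up reads: three carry the unedited C at position 2, one carries a T.
example_reads = ["AACGT", "AACGT", "AATGT", "AACGT"]
print(allele_frequencies(example_reads, 2, 3))
```

A real pipeline would also need quality filtering and alignment to handle reads with indels or sequencing errors, which this sketch omits.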
Sequalizer Analysis
The Sanger chromatograms obtained for each sample were analyzed by Sequalizer (Data S1) as described in the Supplemental Information, using the chromatogram of the seed culture for the corresponding experiment as the reference.
Data and Code Availability
Software
Software from this study is described in the “Quantification and Statistical Analysis” section.
Data Resources
The accession number for the raw sequencing data reported in this paper is NCBI: PRJNA509198.
Acknowledgments
We thank Christina Harrison for helping with some early experiments in this project. This work was supported by the NIH (P50 GM098792), the Office of Naval Research (N00014-13-1-0424), the National Science Foundation (MCB-1350625), the Defense Advanced Research Projects Agency, the MIT Center for Microbiome Informatics and Therapeutics, and an NSF Expeditions in Computing Program Award (1522074). F.F. thanks the Schmidt Science Fellows Program in partnership with the Rhodes Trust for their support.
Author Contributions
F.F. conceived the study, designed and performed experiments, and analyzed data. F.F. and N.G. designed the experiments, wrote the Sequalizer script, and analyzed next-generation sequencing data. F.F. and Y.H. performed the mammalian cell experiments and analyzed the results. G.J. and J.C. assisted with the bacterial experiments. T.K.L. supervised the research and provided scientific guidance and analysis. F.F., N.G., and T.K.L. wrote the manuscript, with input from all authors.
Declaration of Interests
F.F. and T.K.L. have filed a patent application based on this work. T.K.L. is a co-founder of Senti Biosciences, Synlogic, Engine Biosciences, Tango Therapeutics, Corvium, BiomX, and Eligo Biosciences. T.K.L. also holds financial interests in nest.bio, Ampliphi, and IndieBio.
Supplemental Information (3)
Document S1. Figures S1–S6 and Tables S1–S3
Data S1. Sequalizer Script (Written in MATLAB) Used to Analyze Sanger Sequencing Chromatograms, Related to STAR Methods
Document S2. Article plus Supplemental Information
References
Arpino, J.A. ∙ Hancock, E.J. ∙ Anderson, J. ...
Tuning the dials of Synthetic Biology
Microbiology. 2013; 159:1236-1253
Briner, A.E. ∙ Donohoue, P.D. ∙ Gomaa, A.A. ...
Guide RNA functional modules direct Cas9 activity and orthogonality
Mol. Cell. 2014; 56:333-339
Chavez, A. ∙ Scheiman, J. ∙ Vora, S. ...
Highly efficient Cas9-mediated transcriptional programming
Nat. Methods. 2015; 12:326-328
Crowe, M.L.
SeqDoC: rapid SNP and mutation detection by direct comparison of DNA sequence chromatograms
BMC Bioinformatics. 2005; 6:133
Engler, C. ∙ Kandzia, R. ∙ Marillonnet, S.
A one pot, one step, precision cloning method with high throughput capability
PLoS ONE. 2008; 3:e3647
Farzadfard, F. ∙ Lu, T.K.
Synthetic biology. Genomically encoded analog memory with precise in vivo DNA writing in living cell populations
Science. 2014; 346:1256272
Farzadfard, F. ∙ Lu, T.K.
Emerging applications for DNA writers and molecular recorders
Science. 2018; 361:870-875
Farzadfard, F. ∙ Perli, S.D. ∙ Lu, T.K.
Tunable and multifunctional eukaryotic transcription factors based on CRISPR/Cas
ACS Synth. Biol. 2013; 2:604-613
Gaudelli, N.M. ∙ Komor, A.C. ∙ Rees, H.A. ...
Programmable base editing of A⋅T to G⋅C in genomic DNA without DNA cleavage
Nature. 2017; 551:464-471
Gilbert, L.A. ∙ Larson, M.H. ∙ Morsut, L. ...
CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes
Cell. 2013; 154:442-451
Hilton, I.B. ∙ D’Ippolito, A.M. ∙ Vockley, C.M. ...
Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers
Nat. Biotechnol. 2015; 33:510-517
Kalhor, R. ∙ Kalhor, K. ∙ Mejia, L. ...
Developmental barcoding of whole mouse via homing CRISPR
Science. 2018; 361:eaat9804
Koblan, L.W. ∙ Doman, J.L. ∙ Wilson, C. ...
Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction
Nat. Biotechnol. 2018; 36:843-846
Komor, A.C. ∙ Kim, Y.B. ∙ Packer, M.S. ...
Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage
Nature. 2016; 533:420-424
Lee, J.W. ∙ Gyorgy, A. ∙ Cameron, D.E. ...
Creating Single-Copy Genetic Circuits
Mol. Cell. 2016; 63:329-336
Liu, X.S. ∙ Wu, H. ∙ Ji, X. ...
Editing DNA Methylation in the Mammalian Genome
Cell. 2016; 167:233-247
Lois, C. ∙ Hong, E.J. ∙ Pease, S. ...
Germline transmission and tissue-specific expression of transgenes delivered by lentiviral vectors
Science. 2002; 295:868-872
Lutz, R. ∙ Bujard, H.
Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements
Nucleic Acids Res. 1997; 25:1203-1210
McKenna, A. ∙ Findlay, G.M. ∙ Gagnon, J.A. ...
Whole-organism lineage tracing by combinatorial and cumulative genome editing
Science. 2016; 353:aaf7907
Meyer, A.J. ∙ Segall-Shapiro, T.H. ∙ Glassey, E. ...
Escherichia coli “Marionette” strains with 12 highly optimized small-molecule sensors
Nat. Chem. Biol. 2019; 15:196-204
Mimee, M. ∙ Nadeau, P. ∙ Hayward, A. ...
An ingestible bacterial-electronic system to monitor gastrointestinal health
Science. 2018; 360:915-918
Nishida, K. ∙ Arazoe, T. ∙ Yachie, N. ...
Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems
Science. 2016; 353:aaf8729
Perli, S.D. ∙ Cui, C.H. ∙ Lu, T.K.
Continuous genetic recording with self-targeting CRISPR-Cas in human cells
Science. 2016; 353:aag0511
Qi, L.S. ∙ Larson, M.H. ∙ Gilbert, L.A. ...
Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression
Cell. 2013; 152:1173-1183
Roquet, N. ∙ Soleimany, A.P. ∙ Ferris, A.C. ...
Synthetic recombinase-based state machines in living cells
Science. 2016; 353:aad8559
Sheth, R.U. ∙ Yim, S.S. ∙ Wu, F.L. ...
Multiplex recording of cellular events over time on CRISPR biological tape
Science. 2017; 358:1457-1461
Stewart, S.A. ∙ Dykxhoorn, D.M. ∙ Palliser, D. ...
Lentivirus-delivered stable gene silencing by RNAi in primary cells
RNA. 2003; 9:493-501
Tang, W. ∙ Liu, D.R.
Rewritable multi-event analog recording in bacterial and mammalian cells
Science. 2018; 360:eaap8992