Figure 5-15 The effect of single-strand DNA-binding proteins (SSB proteins) on the structure of single-stranded DNA. Because each protein molecule prefers to bind next to a previously bound molecule, long rows of this protein form on a DNA single strand. This cooperative binding straightens out the DNA template and facilitates the DNA polymerization process. The "hairpin helices" shown in the bare, single-stranded DNA result from a chance matching of short regions of complementary nucleotide sequence.
releases itself from the clamp and dissociates from the template. With the help of the clamp loader, which hydrolyzes ATP as it loads a new clamp onto a primer-template junction (Figure 5-17), this lagging-strand polymerase molecule then associates with the new clamp that is assembled on the RNA primer of the next Okazaki fragment.

The Proteins at a Replication Fork Cooperate to Form a Replication Machine

Although we have discussed DNA replication as though it were performed by a set of proteins all acting independently, in reality most of these proteins are held together in a large and orderly multienzyme complex that rapidly synthesizes DNA. This complex can be likened to a tiny sewing machine composed of protein

Figure 5-16 Human single-strand binding protein bound to DNA. (A) Front view of the two DNA-binding domains of the protein (called RPA), which cover a total of eight nucleotides. Note that the DNA bases remain exposed in this protein-DNA complex. (B) Diagram showing the three-dimensional structure, with the DNA strand (orange) viewed end on. (PDB code: .)
Figure 5-17 The sliding clamp that holds DNA polymerase on the DNA. (A) The structure of the clamp protein from E. coli, as determined by x-ray crystallography, with a DNA helix added to indicate how the protein fits around DNA (Movie 5.3). (B) Schematic illustration showing how the clamp is loaded onto DNA. The structure of the clamp loader (green) resembles a screw nut, with its threads matching the grooves of double-stranded DNA. The loader binds to a free clamp molecule, forcing a gap in its ring of subunits, which enables it to slip around DNA. The loader then "screws" the open clamp onto double-stranded DNA until it encounters the 3' end of a primer, at which point the loader hydrolyzes ATP and releases the clamp, allowing it to close around the DNA. In the simplified reaction shown here, the clamp loader dissociates once the clamp has been assembled. At bacterial replication forks, the clamp loader remains bound to the polymerase so that, on the lagging strand, it is ready to assemble a new clamp at the start of each new Okazaki fragment. (A, from X.P. Kong et al., Cell 69:425-437, 1992; PDB code: 3BEP; B, adapted from B.A. Kelch et al., Science 334:1675-1680, 2011.)

parts and powered by nucleoside triphosphate hydrolysis. Like a sewing machine, the replication complex probably remains stationary with respect to its immediate surroundings; the DNA can be thought of as a long strip of cloth being rapidly threaded through it. Although the replication complex has been most intensively studied in E. coli and several of its viruses, a very similar complex also operates in eukaryotes, as we shall see below.
How the different proteins at the replication fork work together in bacteria is shown in Figure 5-18. At the front of the replication fork, DNA helicase opens the DNA helix. Several identical DNA polymerase molecules work at the fork, one on the leading strand and two on the lagging strand. Whereas the DNA polymerase molecule on the leading strand can operate in a continuous fashion, the DNA polymerase molecules on the lagging strand alternate at short intervals, using the short RNA primers made by DNA primase to begin each Okazaki fragment. The close association of all these protein components increases the efficiency of replication, and it is made possible by a folding back of the lagging strand as shown in the figure. This arrangement facilitates the loading of the polymerase clamp each time that an Okazaki fragment is synthesized: the clamp loader and the lagging-strand DNA polymerase molecule are kept in place at the replication fork even when they detach from their DNA template. The replication proteins are thus linked together into a single large unit (total molecular mass daltons), enabling DNA to be synthesized on both sides of the replication fork in a coordinated and efficient manner.
On the lagging strand, the DNA replication machine leaves behind a series of unsealed Okazaki fragments, which still contain the RNA that primed their synthesis at their 5' ends. As discussed earlier, this RNA is removed, and the resulting
Figure 5-18 A bacterial replication fork. (A) In this case, a single DNA polymerase molecule synthesizes the leading strand while two DNA polymerases are used, in alternating fashion, for lagging-strand DNA synthesis. All of these polymerase molecules, which are identical, are held in place at the fork by flexible "arms" that extend from the clamp loader. Additional interactions (for example, between the DNA helicase and primase) ensure that all the individual components function together as a well-coordinated protein machine (Movie 5.4). (B) An electron micrograph showing the replication machine from the bacteriophage T4 as it moves along a template synthesizing DNA behind it. (C) An interpretation of the micrograph is given in the sketch: note especially the DNA loop on the lagging strand. Apparently, during the preparation of this sample for electron microscopy, the replication proteins became partly detached from the very front of the replication fork. (B, from P.D. Chastain et al., J. Biol. Chem. 278:21276-21825, 2003. With permission from American Society for Biochemistry and Molecular Biology.)
gap is filled in by DNA repair enzymes that operate behind the replication fork (see Figure 5-11).

DNA Replication Is Fundamentally Similar in Eukaryotes and Bacteria

Much of what we know about DNA replication was first derived from studies of purified bacterial and bacteriophage multienzyme systems capable of DNA replication in vitro. The development of these systems in the 1970s was greatly facilitated by the prior isolation of mutants in a variety of replication genes; these mutants were exploited to identify and purify the corresponding replication proteins. The first eukaryotic replication system that accurately replicated DNA in vitro was described in the mid-1980s, and mutations in genes encoding nearly all of the replication components have now been isolated and analyzed in the yeast Saccharomyces cerevisiae. As a result, much is known about the detailed enzymology of DNA replication in eukaryotes, and it is clear that the fundamental features of DNA replication (including replication-fork geometry and the use of DNA polymerases, helicases, clamps, clamp loaders, and single-strand binding proteins) are similar.
Figure 5-19 Schematic diagram of a eukaryotic replication fork. Unlike the bacterial replication proteins, those from eukaryotes are thought to function largely independently, perhaps accounting for the slower speed of the eukaryotic replication fork (Movie 5.5). Note that the eukaryotic CMG helicase moves unidirectionally along the leading-strand template, whereas the bacterial helicase discussed earlier moves in one direction along the lagging-strand template (see Figure 5-18). In both cases, the DNA duplex is rapidly pried apart at the front of the moving replication fork by harnessing the energy of ATP hydrolysis.
However, there are some important differences in how bacteria and eukaryotes replicate their DNA. Perhaps most important, eukaryotes use three different kinds of DNA polymerase at each replication fork (Figure 5-19). Polymerase ε (Pol ε) synthesizes the leading strand, whereas Pol α and Pol δ synthesize the lagging-strand Okazaki fragments. Each type of polymerase has special properties that make it well suited for its job. Pol ε binds to both the sliding clamp and the replicative helicase, allowing it to synthesize very long stretches of leading-strand DNA without dissociating. Pol α includes DNA primase as one of its subunits, which begins all new chains by synthesizing a short length of RNA. This RNA is extended by a different subunit of Pol α, which adds only about 20 nucleotides of DNA before dissociating. Finally, Pol δ, which is loaded in conjunction with a sliding clamp, takes over and completes synthesis of each Okazaki fragment to produce a total length of about 200 nucleotides.
The use of three different kinds of DNA polymerase at the replication fork is part of a trend toward higher complexity observed for eukaryotic DNA replication compared to that of bacteria. As another example, the eukaryotic single-strand binding protein is formed from three different subunits, while only a single subunit is found in bacteria. Likewise, the eukaryotic replicative helicase (known as the CMG helicase) is composed of 11 different protein subunits, while the bacterial enzyme is a hexamer of 6 identical subunits. We do not know why the eukaryotic replication machinery is so much more complex than that of bacteria; however, there are several possibilities. In eukaryotes, DNA replication must be coordinated with the elaborate process of mitosis; it must also deal with DNA packaged into nucleosomes, topics we discuss in the next part of the chapter. It is also possible that the difference in complexity between bacteria and eukaryotes largely reflects evolutionary pressure for bacteria to make do with fewer genes.
Another important distinction between eukaryotic and bacterial replication protein complexes lies in the detailed structures of their individual protein

components. With the exception of the sliding clamp, the replication proteins in bacteria have completely different structures and amino acid sequences from those of their eukaryotic counterparts. The simplest interpretation of this surprising fact is that, over hundreds of millions of years, the DNA replication machinery in eukaryotes and bacteria evolved independently, yet converged on the same basic mechanisms. This situation is in contrast to other fundamental processes in the cell, such as transcription and translation, where the fundamental components (RNA polymerase and the ribosome) are very similar between bacteria and eukaryotes, and where the structures are conserved from an ancient, common ancestor.

A Strand-directed Mismatch Repair System Removes Replication Errors That Remain in the Wake of the Replication Machine

Because bacteria such as E. coli are capable of dividing once every 30 minutes, it is relatively easy to screen large populations to find a rare mutant cell that is altered in a specific process. One interesting class of mutants consists of those with alterations in so-called mutator genes, which greatly increase the rate of spontaneous mutation. Not surprisingly, one such mutant makes a defective form of the 3'-to-5' proofreading exonuclease that is a part of the DNA polymerase enzyme (see Figures 5-8 and 5-9). The mutant DNA polymerase no longer proofreads effectively, and many replication errors that would otherwise have been removed accumulate in the DNA.
The study of other E. coli mutants exhibiting abnormally high mutation rates uncovered an additional proofreading system, common to all cells on Earth, that removes those rare replication errors that were made by the polymerase and missed by its proofreading exonuclease. These errors leave mismatched base pairs behind the replication fork, which are subsequently recognized and corrected by a strand-directed mismatch repair system. This system picks out mismatches from normal DNA by monitoring their potential to distort the DNA double helix, which is greatly increased by the misfit between noncomplementary base pairs. However, if the repair system simply recognized a mismatch in newly replicated DNA and randomly corrected one of the two mismatched nucleotides, it would mistakenly "correct" the original template strand to match the error exactly half the time, thereby failing to lower the overall error rate. To be effective, such a proofreading system must be able to remove only the nucleotide on the newly synthesized strand, where the error occurred.
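The argument that random correction would not lower the error rate can be checked with a small expected-value calculation. The sketch below is only an illustration under a simplifying assumption: an unrepaired mismatch yields one mutant and one normal duplex at the next round of replication, whereas "correcting" the template strand makes every descendant duplex mutant. The function name `mutant_fraction` and its parameters are hypothetical.

```python
from fractions import Fraction


def mutant_fraction(p_repair, p_new_strand_chosen):
    """Expected fraction of descendant duplexes that carry the mutation.

    p_repair: probability that the mismatch is repaired at all.
    p_new_strand_chosen: probability that repair replaces the nucleotide
        on the newly synthesized strand (the strand holding the error).
    """
    p_repair = Fraction(p_repair)
    p_new = Fraction(p_new_strand_chosen)
    # An unrepaired mismatch segregates at the next replication:
    # one daughter duplex is mutant, the other is normal.
    unrepaired = (1 - p_repair) * Fraction(1, 2)
    # Repair that alters the template strand fixes the error in both strands,
    # so every descendant duplex is mutant.
    wrong_strand = p_repair * (1 - p_new)
    return unrepaired + wrong_strand


no_repair = mutant_fraction(0, 0)
random_repair = mutant_fraction(1, Fraction(1, 2))
strand_directed = mutant_fraction(1, 1)
print(no_repair, random_repair, strand_directed)
```

Under these assumptions, random repair leaves the same expected mutant fraction as no repair at all, while strand-directed repair removes the error entirely.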
The strand-distinction mechanism used by the mismatch proofreading system in E. coli depends on the methylation of selected A residues in the DNA. Methyl groups are added to all A residues in the sequence GATC, but not until some time after the GATC has been synthesized. As a result, the only unmethylated GATC sequences lie in the newly synthesized strands just behind a replication fork. The recognition of these unmethylated GATCs (which are base-paired to methylated GATCs) allows the new DNA strands to be transiently distinguished from old ones, as required if their mismatches are to be selectively removed. The five-step error-correction process involves recognition of a mismatch, identification of the newly synthesized strand, excision of the portion containing the misincorporated nucleotide, resynthesis of the excised segment using the old strand as a template, and ligation to seal the DNA backbone. This strand-directed mismatch repair system reduces the number of errors made during DNA replication by an additional factor of 100-1000 (see Table 5-1, p. 260).
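The strand-distinction logic described above can be sketched in a few lines. This is a toy model, not a description of how the E. coli proteins work: it assumes the duplex is given as a top-strand sequence plus two sets of GATC positions that are methylated on each strand. Because GATC reads the same on both strands, one position identifies the site on both. The names `find_gatc_sites` and `new_strand` are hypothetical.

```python
def find_gatc_sites(seq):
    """Return the 0-based start position of every GATC in seq."""
    sites = []
    pos = seq.find("GATC")
    while pos != -1:
        sites.append(pos)
        pos = seq.find("GATC", pos + 1)
    return sites


def new_strand(seq, meth_top, meth_bottom):
    """Guess which strand is newly synthesized from GATC methylation.

    meth_top / meth_bottom: sets of GATC positions methylated on each strand.
    Returns "top", "bottom", or None when the sites give no clear answer
    (for example, when every GATC is already methylated on both strands).
    """
    votes = set()
    for site in find_gatc_sites(seq):
        top, bottom = site in meth_top, site in meth_bottom
        if top and not bottom:
            votes.add("bottom")  # the unmethylated strand is the new one
        elif bottom and not top:
            votes.add("top")
    return votes.pop() if len(votes) == 1 else None


seq = "AAGATCTTTGATCCC"
print(find_gatc_sites(seq))
print(new_strand(seq, meth_top={2, 9}, meth_bottom=set()))
print(new_strand(seq, meth_top={2, 9}, meth_bottom={2, 9}))
```

The last call returns `None`, reflecting the point made in the text: once both strands are methylated, the new strand can no longer be distinguished from the old one.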
A similar mismatch proofreading system functions in eukaryotic cells, but it uses a different way to distinguish the newly synthesized DNA strands from the parent strands. On the lagging strand, the newly synthesized DNA will contain transient single-strand gaps before the series of Okazaki fragments are processed and ligated together. Each gap will usually carry a sliding clamp, which remains on the DNA after the DNA polymerase has dissociated from it to begin the next fragment. Together, the clamp and the single-strand break signal to the mismatch
Figure 5-20 Strand-directed mismatch repair in eukaryotes. (A) The MutS protein binds to a mismatched base pair, recruits the MutL protein, and the complex scans the nearby DNA for a gap and a sliding clamp whose orientation determines which strand is to be cut and its nucleotides replaced. When these are encountered, MutL is activated and begins to cleave the DNA. In most organisms, MutL is joined by another nuclease and, together, they remove the newly synthesized DNA starting at the gap and extending past the mismatch. The gap is then filled in by DNA polymerase and sealed by DNA ligase. (B) The structure of the MutS protein bound to a DNA mismatch. This protein is a dimer, which grips the DNA double helix as shown, kinking the DNA at the mismatched base pair. It seems that the MutS protein scans the DNA for mismatches by testing for sites that can be readily kinked, which are those with an abnormal base pair. (PDB code: 1EWQ.)
repair proteins to correct the mismatch using the parent DNA strand as the template (Figure 5-20).
The two faces of the clamp differ, and the clamp loader always loads the clamp in the same orientation with respect to the end of the previously synthesized Okazaki fragment. Because all the clamps on the DNA "face" in the same direction relative to the replication process, the oriented clamps can be used by the mismatch repair machinery to distinguish newly synthesized DNA from parent DNA. It is not known for certain how strand discrimination occurs on the leading strand (where gaps in newly synthesized DNA should be rare), but because oriented sliding clamps are also left behind by the leading-strand polymerase, they can signal old from new DNA in the same way that they do on the lagging strand. The recent discovery of a correction system that removes misincorporated ribonucleotides suggests a further possibility for distinguishing newly synthesized DNA from parent DNA, as we discuss in the next section.
Mismatch correction is crucial for all cells; its importance for humans is seen in individuals who inherit one defective copy of a mismatch repair gene (along

with a functional gene on the other copy of the chromosome). These individuals have a marked predisposition for certain types of cancers. For example, in a type of colon cancer called hereditary nonpolyposis colorectal cancer (HNPCC), a spontaneous deleterious mutation of the one functional gene will produce a clone of somatic cells that, because they are deficient in mismatch proofreading, accumulate mutations unusually rapidly. Because most cancers arise in cells that have accumulated many mutations (as discussed in Chapter 20), cells deficient in mismatch proofreading have a greatly enhanced chance of becoming cancerous. Fortunately, most of us inherit two good copies of each gene that encodes a mismatch proofreading protein; this protects us, because it is highly unlikely that both copies will become mutated in the same cell.

The Accidental Incorporation of Ribonucleotides During DNA Replication Is Corrected

We have seen that cells have several ways to correct mistakes where the wrong deoxynucleotide has been incorporated in newly replicated DNA. Occasionally, however, DNA polymerases make a different kind of mistake, one that is not caused by improper base-pairing: in this case, they accidentally incorporate a ribonucleotide instead of a deoxyribonucleotide. These molecules differ by a single OH group in the sugar portion of the nucleotide. Yet, when incorporated into DNA, they weaken the DNA chain at that point, rendering it highly susceptible to breakage. If left unrepaired, these "weak links" would cause high mutation rates and genome rearrangements. Even if it does not cause a break, an incorporated ribonucleotide distorts the DNA double helix and can stall some polymerases during the next cycle of DNA replication.
Although DNA polymerases much prefer deoxyribonucleotides over ribonucleotides (by a factor of about a million), the concentration of ribonucleotides in the cell is much higher than that of their deoxy counterparts, as much as 500-fold for ATP, which has many different uses in the cell. This concentration imbalance means that a ribonucleotide is accidentally incorporated approximately once per several thousand nucleotides of DNA synthesized. These mistakes are corrected by specific nucleases that cleave the DNA chain when they encounter a ribonucleotide, leading to the excision of the ribonucleotide and its replacement by DNA, much in the same way that RNA primers are replaced by DNA to complete lagging-strand synthesis (see Figure 5-11). Because this repair process produces gaps only in newly synthesized DNA, it has been proposed that these transient lesions help the mismatch repair system "know" which strand to repair; in particular, these cues may be especially important on the leading strand.
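The "once per several thousand nucleotides" estimate follows from the two numbers given above. As a rough sketch, assume the polymerase chooses between the two substrates in proportion to concentration times preference; this simple competition model and the function name `ribo_incorporation_probability` are assumptions made for illustration, not something stated in the text.

```python
from fractions import Fraction


def ribo_incorporation_probability(selectivity, conc_ratio):
    """Chance that a given position receives a ribonucleotide.

    selectivity: how strongly the polymerase prefers the deoxy form
        (about one million, per the text).
    conc_ratio: ribonucleotide concentration divided by the
        deoxyribonucleotide concentration (up to about 500 for ATP).
    """
    # Relative weights: conc_ratio * 1 for the ribo form,
    # 1 * selectivity for the deoxy form.
    return Fraction(conc_ratio, selectivity + conc_ratio)


p = ribo_incorporation_probability(10**6, 500)
print(p)                # probability per nucleotide added
print(float(1 / p))     # nucleotides synthesized per ribonucleotide mistake
```

With the upper-bound ratio of 500, the model gives about one ribonucleotide per 2000 nucleotides; smaller concentration ratios give rarer mistakes, consistent with "several thousand" as an overall figure.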

DNA Topoisomerases Prevent DNA Tangling During Replication

As a replication fork moves along double-stranded DNA, it creates what has been called the "winding problem." The two parent strands that are wound around each other must be unwound and separated for replication to occur. For every 10 nucleotide pairs replicated at the fork, one complete turn of the parent double helix must be unwound. In principle, this unwinding can be achieved by rapidly rotating the entire chromosome ahead of a moving fork; however, this is energetically highly unfavorable (particularly for long chromosomes). Instead, the DNA in front of a replication fork becomes overwound (Figure 5-21). This overwinding is continually relieved by enzymes known as DNA topoisomerases.
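The scale of the winding problem follows directly from the figure of one helical turn per 10 nucleotide pairs. The short calculation below is a minimal sketch using that value and the fork speed of 500 nucleotides per second quoted for bacteria in Figure 5-21; the function names are hypothetical.

```python
def turns_to_unwind(nucleotide_pairs, pairs_per_turn=10):
    """Helical turns that must be unwound to replicate a stretch of DNA."""
    return nucleotide_pairs / pairs_per_turn


def rotation_rate(fork_speed_nt_per_s, pairs_per_turn=10):
    """Revolutions per second the parent helix ahead of the fork must make."""
    return fork_speed_nt_per_s / pairs_per_turn


print(turns_to_unwind(1000))   # turns removed per 1000 nucleotide pairs copied
print(rotation_rate(500))      # revolutions per second at a bacterial fork
```

For a fork moving at 500 nucleotides per second this gives 50 revolutions per second, which is why rotating an entire chromosome is not a workable solution.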
A DNA topoisomerase can be viewed as a reversible nuclease that adds itself covalently to a DNA backbone phosphate, thereby breaking a phosphodiester bond in a DNA strand. This reaction is reversible, and the phosphodiester bond re-forms as the protein leaves.
One type of topoisomerase, called topoisomerase I, produces a transient single-strand break; this break in the phosphodiester backbone allows the two sections of DNA helix on either side of the nick to rotate freely relative to each other, using the phosphodiester bond in the strand opposite the nick as a swivel point (Figure 5-22). Any tension in the DNA helix will drive this rotation in the direction that relieves the tension. As a result, DNA replication can occur with the rotation of only a short length of helix, the part just ahead of the fork. Because the covalent linkage that joins the DNA topoisomerase protein to a DNA phosphate retains the energy of the cleaved phosphodiester bond, resealing is rapid and does not require additional energy input. In this respect, the rejoining mechanism differs from that catalyzed by the enzyme DNA ligase, discussed previously (see Figure 5-12).
A second type of DNA topoisomerase, topoisomerase II, forms a covalent linkage to both strands of the DNA helix at the same time, making a transient double-strand break in the helix. These enzymes are activated by sites on chromosomes where two double helices cross over each other, such as those generated by supercoiling in front of a replication fork (see Figure 5-21B). As illustrated in Figure 5-23, once a topoisomerase II molecule binds to such a crossing site, the protein uses ATP hydrolysis to perform the following set of reactions: (1) it breaks one double helix reversibly to create a DNA "gate"; (2) it causes the second, nearby double helix to pass through this opening; and (3) it then reseals the break and dissociates from the DNA. At crossover points generated by supercoiling, passage of the double helix through the gate occurs in the direction that will reduce supercoiling. In this way, type II topoisomerases, like type I topoisomerases, can relieve the overwinding tension generated in front of a replication fork.
Their reaction mechanism also allows type II DNA topoisomerases to efficiently separate any intertwined DNA molecules. This ability of topoisomerase II is especially important for preventing the severe DNA tangling problems that would otherwise arise from DNA replication. This role is nicely illustrated by mutant yeast cells that produce, in place of the normal topoisomerase II, a version that is inactive above . When the mutant cells are warmed to this temperature, their daughter chromosomes remain intertwined after DNA replication and are unable to separate. The enormous usefulness of topoisomerase II for untangling

Figure 5-21 The "winding problem" that arises during DNA replication. (A) For a bacterial replication fork moving at 500 nucleotides per second, the parent DNA helix ahead of the fork must rotate at about 50 revolutions per second. The brackets represent about 20 turns of DNA. (B) If the ends of the DNA double helix remain fixed (or difficult to rotate), tension builds up in front of the replication fork as it becomes overwound. Some of this tension can be taken up by supercoiling, whereby the DNA double helix twists around itself. However, if the tension continues to build up, the replication fork will eventually stop because further unwinding requires more energy than the DNA helicase at the fork can provide. (C) DNA topoisomerases relieve this stress by generating temporary single-strand breaks in the DNA, which allow rapid rotation around the single strands opposite the break.

Figure 5-22 (panel labels): The original phosphodiester bond energy is stored in the phosphotyrosine linkage, making the reaction reversible. Spontaneous re-formation of the phosphodiester bond regenerates both the DNA helix and the DNA topoisomerase.
Figure 5-22 The reversible DNA nicking reaction catalyzed by a DNA topoisomerase I enzyme. As indicated, these enzymes transiently form a single covalent bond with DNA; this allows free rotation of the DNA around the covalent backbone bonds linked to the blue phosphate. On reversal of the reaction, the enzyme and the DNA are restored, the only difference being the relaxation of tension in the DNA.

Figure 5-23 (panel labels): The topoisomerase recognizes the entanglement and makes a reversible covalent attachment to the two opposite strands of one of the double helices (orange), making a double-strand break and forming a protein gate. The topoisomerase gate opens to let the second DNA helix pass. Reversal of the covalent attachment of the topoisomerase restores an intact orange double helix, leaving two DNA double helices that are separated.

chromosomes before mitosis begins can readily be appreciated by anyone who has struggled to remove a severe tangle from a fishing line, or from a large ball of thread, without the aid of scissors.

Summary

DNA replication takes place at a Y-shaped structure called a replication fork. Self-correcting DNA polymerase enzymes catalyze nucleotide polymerization in a 5'-to-3' direction, copying a DNA template strand with remarkable fidelity. Because the two strands of a DNA double helix are antiparallel, this 5'-to-3' DNA synthesis can take place continuously on only one of the strands at a replication fork (the leading strand). On the lagging strand, short DNA fragments must be made by a "backstitching" process. Because the self-correcting DNA polymerases cannot start a new chain, these lagging-strand DNA fragments are primed by short RNA primer molecules that are subsequently erased and replaced with DNA.
DNA replication requires the cooperation of many proteins. These include (1) DNA polymerases and DNA primases to catalyze nucleoside triphosphate polymerization; (2) DNA helicases and single-strand DNA-binding (SSB) proteins to help in opening up the DNA helix so that it can be copied; (3) clamps and clamp loaders to enable DNA polymerases to copy longer stretches of DNA; (4) DNA ligases and enzymes that degrade RNA primers to seal together the discontinuously synthesized lagging-strand DNA fragments; and (5) DNA topoisomerases to help to relieve helical winding and DNA tangling problems. Many of these proteins associate with each other at a replication fork to form a highly efficient "replication machine," through which the activities and spatial movements of the individual components are coordinated.
The self-correcting DNA polymerases make mistakes only rarely when copying DNA; when they do, a variety of enzymes inspect the DNA shortly after it is made and correct any mishaps. Given the number of proteins dedicated to the task, copying DNA with extreme accuracy is clearly of great importance to all cells on Earth.

THE INITIATION AND COMPLETION OF DNA REPLICATION IN CHROMOSOMES

We have seen how a set of replication proteins rapidly and accurately generates two daughter DNA double helices behind a replication fork. But how is this replication machinery assembled in the first place, and how are replication forks created on an intact, double-strand DNA molecule? In this part of the chapter, we discuss how cells initiate DNA replication and how they carefully regulate this process to ensure that it takes place only at the proper time and chromosomal sites. We also discuss special problems that the replication machinery in eukaryotic cells must overcome, including the need to replicate the enormously long DNA molecules found in eukaryotic chromosomes, as well as the need to copy DNA molecules that are tightly complexed with nucleosomes.

DNA Synthesis Begins at Replication Origins

As discussed previously, the DNA double helix is normally very stable: the two DNA strands are locked together firmly by the hydrogen bonds formed between the bases on each strand. To begin DNA replication, the double helix must first be opened up and the two strands separated to expose unpaired bases. As we shall see, the process of DNA replication is begun by special initiator proteins that bind to double-stranded DNA and pry the two strands apart, breaking the hydrogen bonds between the bases.
Figure 5-24 A replication bubble formed by replication-fork initiation. This diagram outlines the major steps in the initiation of replication forks at replication origins. In the last step, two replication forks move away from each other, separated by an expanding replication bubble.
The positions at which the DNA helix is first opened are called replication origins (Figure 5-24). In simple cells like those of bacteria or budding yeast, origins are specified by DNA sequences several hundred nucleotide pairs in length. This DNA contains both short sequences that attract initiator proteins and stretches of DNA that are especially easy to open. We saw in Figure 4-5A that an A-T base pair is held together by fewer hydrogen bonds than is a G-C base pair. Therefore, DNA rich in A-T base pairs is relatively easy to pull apart, and regions of DNA enriched in A-T base pairs are typically found at replication origins.
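The link between A-T content and ease of strand separation can be illustrated with a small Python sketch. It is not taken from the text: the function names and the toy sequence are hypothetical, and it assumes the standard counts of two hydrogen bonds per A-T pair and three per G-C pair. It scans a sequence for its most A-T-rich window, the kind of stretch that would be comparatively easy to open.

```python
from typing import Tuple


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "AT" for base in seq) / len(seq)


def most_at_rich_window(seq: str, window: int) -> Tuple[int, float]:
    """Return (start index, A-T fraction) of the most A-T-rich window.

    Ties are resolved in favor of the leftmost window.
    """
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    best_start, best_fraction = 0, -1.0
    for start in range(len(seq) - window + 1):
        fraction = at_fraction(seq[start:start + window])
        if fraction > best_fraction:
            best_start, best_fraction = start, fraction
    return best_start, best_fraction


def hydrogen_bonds(seq: str) -> int:
    """Hydrogen bonds holding a duplex together, given one of its strands.

    Assumes 2 bonds per A-T pair and 3 per G-C pair.
    """
    bonds_per_base = {"A": 2, "T": 2, "G": 3, "C": 3}
    return sum(bonds_per_base[base] for base in seq.upper())


# Toy sequence (hypothetical): a G-C-rich stretch flanking an A-T-rich core.
toy = "GCGCGCATATATATGCGCGC"
start, fraction = most_at_rich_window(toy, 8)
print(start, fraction, hydrogen_bonds(toy[start:start + 8]))
```

A real origin search would also need the initiator-binding sequences described above; A-T content alone only marks DNA that is easy to melt.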
Although the basic process of replication-fork initiation depicted in Figure 5-24 is fundamentally the same for bacteria and eukaryotes, the detailed way in which this process is performed and regulated differs considerably between these two groups of organisms. We first consider the case in bacteria and then turn to the more complex situation found in yeasts, mammals, and other eukaryotes.

Bacterial Chromosomes Typically Have a Single Origin of DNA Replication

The genome of E. coli is contained in a single circular DNA molecule of 4.6 × 10^6 nucleotide pairs. DNA replication begins at a single origin of replication, and the two replication forks assembled there proceed (at approximately 1000 nucleotides per second) in opposite directions until they meet up roughly halfway around the chromosome (Figure 5-25). The only point at which E. coli can control DNA replication is initiation: once the forks have been assembled at the origin, they synthesize DNA at a relatively constant speed until replication is finished. Therefore, it is not surprising that the initiation step of DNA replication is tightly regulated. The process begins when specialized initiator proteins (in their ATP-bound state) bind in multiple copies to specific DNA sites located at the replication origin, wrapping the DNA around the proteins to form a large protein-DNA filament that introduces torsional stress on the DNA double helix (Figure 5-26). This stress is partially relieved by melting of the adjacent AT-rich sequences. The protein-DNA complex then attracts two DNA helicases, each bound to a helicase loader, and these are placed, facing in opposite directions, around adjacent DNA single strands whose bases have been exposed by the assembly of the initiator protein-DNA complex. The helicase loader is analogous to the clamp loader we encountered earlier; it has the additional job of keeping the helicase in an inactive form until it is properly loaded. Once the helicases are properly positioned on DNA, the loaders dissociate and the helicases begin to unwind DNA, exposing enough single-stranded DNA for DNA primases to synthesize the first RNA primers. This quickly leads to the assembly of the remaining replication proteins to create two replication forks that move in opposite directions away from the replication origin, each synthesizing new DNA as they travel.
In E. coli, the interaction of the initiator proteins with the replication origin is carefully regulated, with initiation occurring only when sufficient nutrients are available for the bacterium to complete an entire round of replication. Initiation is also controlled to ensure that only one round of DNA replication occurs for each cell division. After replication is initiated, the initiator protein is inactivated by hydrolysis of its bound ATP molecule, and the origin of replication experiences a refractory period. The refractory period is caused by a delay in the methylation of newly incorporated A nucleotides in the origin (Figure 5-27). Initiation cannot occur again until the A's are methylated and the initiator protein is restored to its ATP-bound state, conditions that are met only when the cell is capable of carrying out a new round of DNA replication.
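The two conditions for re-initiation described above can be written as a toy state model. This is only a sketch under simplifying assumptions: the class and attribute names are hypothetical, methylation of the whole origin is collapsed into one flag per strand, and nutrient sensing is not modeled.

```python
from dataclasses import dataclass


@dataclass
class BacterialOrigin:
    """Toy model of the E. coli origin's refractory period (hypothetical names)."""
    parent_strand_methylated: bool = True
    new_strand_methylated: bool = True
    initiator_atp_bound: bool = True

    @property
    def hemimethylated(self) -> bool:
        # Exactly one of the two strands carries methyl groups.
        return self.parent_strand_methylated != self.new_strand_methylated

    def can_initiate(self) -> bool:
        # Both conditions from the text must hold at the same time.
        fully_methylated = (self.parent_strand_methylated
                            and self.new_strand_methylated)
        return fully_methylated and self.initiator_atp_bound

    def initiate(self) -> None:
        if not self.can_initiate():
            raise RuntimeError("origin is in its refractory period")
        # The newly made strand starts out unmethylated, and the
        # initiator hydrolyzes its bound ATP.
        self.new_strand_methylated = False
        self.initiator_atp_bound = False

    def methylate_new_strand(self) -> None:
        self.new_strand_methylated = True

    def restore_initiator_atp(self) -> None:
        self.initiator_atp_bound = True


origin = BacterialOrigin()
origin.initiate()
print(origin.hemimethylated, origin.can_initiate())
```

In this model neither methylation alone nor ATP rebinding alone is enough; `can_initiate()` returns `True` again only after both `methylate_new_strand()` and `restore_initiator_atp()` have been called.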

Eukaryotic Chromosomes Contain Multiple Origins of Replication

Figure 5-25 DNA replication of a bacterial genome. It takes E. coli about 30 minutes to duplicate its genome of 4.6 × 10^6 nucleotide pairs. For simplicity, Okazaki fragments are not shown on the lagging strand.
We have seen how two replication forks begin at a single replication origin in bacteria and proceed in opposite directions, moving away from the origin until all of the DNA in the single circular chromosome is replicated. The bacterial genome is sufficiently small for these two replication forks to duplicate the genome in about 30 minutes. Because of the much greater size of most eukaryotic chromosomes, a different strategy is required to allow their replication in a timely manner.
Figure 5-26 The proteins that initiate DNA replication in bacteria. The mechanism shown was established by studies in vitro with mixtures of highly purified proteins. For E. coli DNA replication, the major initiator protein (purple), the helicase (yellow), and the primase (blue) are the dnaA, dnaB, and dnaG proteins, respectively. In the first step, many molecules of the initiator protein bind to specific DNA sequences at the replication origin and destabilize the double helix by forming a filamentous structure in which the DNA is wrapped around the protein. Next, two helicases are brought in by helicase-loading proteins (the dnaC proteins; brown), which inhibit the helicases until they are properly loaded at the replication origin. (The helicase-loading proteins prevent the replicative DNA helicases from inappropriately entering other single-strand stretches of DNA in the bacterial genome.) Aided by single-strand binding protein (not shown), the loaded helicases further separate the DNA strands, thereby enabling primases to enter and synthesize initial primers. In subsequent steps, two complete replication forks are assembled at the origin and move in opposite directions away from the replication origin. The initiator proteins are displaced as the left-hand fork moves through them.
Figure 5-27 Methylation of the E. coli replication origin creates a refractory period for DNA initiation. DNA methylation occurs at GATC sequences, 11 of which are found in the origin of replication (spanning approximately 250 nucleotide pairs). In its hemimethylated state (that is, one strand of the DNA methylated, the other unmethylated), the origin of replication is bound by an inhibitor protein (SeqA, not shown), which blocks the ability of the initiator proteins to unwind the origin DNA. About 15 minutes after replication is initiated, the hemimethylated origins become fully methylated by a DNA methylase enzyme; SeqA then dissociates, allowing the origin of replication to become active.
A single enzyme, the Dam methylase, is responsible for methylating all E. coli GATC sequences. As discussed earlier in the chapter, a lag in methylation after the replication of GATC sequences is also used by the E. coli mismatch proofreading system to distinguish the newly synthesized DNA strand from the parent DNA strand; in that case, the relevant GATC sequences are scattered throughout the chromosome, and they are not bound by SeqA.

A method for determining the general pattern of eukaryotic chromosome replication was developed in the early 1960s that is similar to the strategy we saw earlier for visualizing bacterial replication (see Figure 5-6). Human cells growing in culture are labeled for a short time with ³H-thymidine so that the DNA synthesized during this period becomes highly radioactive. The cells are then gently lysed, and the DNA is stretched on the surface of a glass slide coated with a photographic emulsion. Development of the emulsion in the dark reveals the pattern of labeled DNA through a technique known as autoradiography. The time allotted for radioactive labeling is chosen to allow each replication fork to move several micrometers along the DNA, so that the replicated DNA can be detected in the light microscope as lines of silver grains (radioactivity exposes photographic emulsion much as light does), even though the DNA molecule itself is too thin to be visible. In this way, both the rate and the direction of replication-fork movement can be determined (Figure 5-28). From the rate at which tracks of replicated DNA increase in length with increasing labeling time, the eukaryotic replication forks are estimated to travel at about 50 nucleotides per second. This is approximately twentyfold slower than the rate at which bacterial replication forks move, possibly reflecting the increased difficulty of replicating DNA that is packaged in chromatin.
An average-size human chromosome contains a single linear DNA molecule of about 150 million nucleotide pairs. It would take 0.02 seconds/nucleotide × 150 × 10^6 nucleotides = 3.0 × 10^6 seconds (about 35 days) to replicate such a DNA molecule from end to end with a single replication fork moving at a rate of 50 nucleotides per second. As expected, therefore, the autoradiographic experiments just described reveal that many forks, belonging to separate replication bubbles, are moving simultaneously on each eukaryotic chromosome.
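The single-fork estimate above is simple enough to check directly. The short Python sketch below redoes the arithmetic using only the figures quoted in the text (150 million nucleotide pairs, 50 nucleotides per second); the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def single_fork_replication_seconds(length_nt: float,
                                    fork_rate_nt_per_s: float) -> float:
    """Time for one fork to copy a DNA molecule from end to end."""
    if fork_rate_nt_per_s <= 0:
        raise ValueError("fork rate must be positive")
    return length_nt / fork_rate_nt_per_s


human_chromosome_nt = 150e6   # average-size human chromosome (from the text)
eukaryotic_fork_rate = 50.0   # nucleotides per second (from the text)

seconds = single_fork_replication_seconds(human_chromosome_nt,
                                          eukaryotic_fork_rate)
days = seconds / SECONDS_PER_DAY
print(f"{seconds:.1e} seconds, or about {days:.0f} days")
```

The result, roughly five weeks for a single chromosome, is what makes a single-origin strategy unworkable for eukaryotes.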
Much more sophisticated methods now exist for monitoring DNA replication initiation and tracking the movement of DNA replication forks across whole genomes. If a population of cells can be synchronized so they all begin DNA replication at the same time, the amount of each segment of DNA in the genome can be determined at specific time points using one of the DNA sequencing methods described in Chapter 8. Because a segment of a genome that has been replicated will contain twice as much DNA as an unreplicated segment, replication-fork initiation and fork movement can be accurately monitored across an entire genome.
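The copy-number logic behind these genome-wide measurements can be sketched in a few lines of Python. This is a simplified illustration, not the published analysis pipeline: the function names, the toy read depths, and the threshold of 1.5 (halfway between one and two copies) are all assumptions, and real data would need normalization and smoothing.

```python
from typing import List


def copy_number_ratios(s_phase_depth: List[float],
                       unreplicated_depth: List[float]) -> List[float]:
    """Per-segment ratio of read depth at a time point to unreplicated depth."""
    if len(s_phase_depth) != len(unreplicated_depth):
        raise ValueError("both profiles must cover the same segments")
    return [s / u for s, u in zip(s_phase_depth, unreplicated_depth)]


def replicated_segments(ratios: List[float],
                        threshold: float = 1.5) -> List[bool]:
    """Call a segment replicated when its ratio is closer to 2 than to 1."""
    return [ratio >= threshold for ratio in ratios]


def fork_boundaries(replicated: List[bool]) -> List[int]:
    """Indices i where segments i and i+1 differ, i.e. where a fork sits."""
    return [i for i in range(len(replicated) - 1)
            if replicated[i] != replicated[i + 1]]


# Toy read depths for six adjacent genome segments (hypothetical numbers).
unreplicated = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
time_point = [10.0, 10.0, 20.0, 20.0, 20.0, 10.0]

ratios = copy_number_ratios(time_point, unreplicated)
calls = replicated_segments(ratios)
print(calls, fork_boundaries(calls))
```

Repeating the calls at successive time points would show the replicated block widening in both directions, which is how fork movement away from an origin appears in such data.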
Figure 5-28 The experiments that first demonstrated the pattern in which replication forks are formed and move on eukaryotic chromosomes. The new DNA made in human cells in culture was labeled briefly with a pulse of highly radioactive thymidine (³H-thymidine). (A) In this experiment, the cells were lysed, and the DNA was stretched out on a glass slide that was subsequently covered with a photographic emulsion. After several months, the emulsion was developed, revealing a line of silver grains over the radioactive DNA. The brown DNA in this figure is shown only to help with the interpretation of the autoradiograph; the unlabeled DNA is invisible in such experiments. (B) This experiment was the same except that a further incubation in unlabeled medium allowed additional DNA, with a lower level of radioactivity, to be replicated. The pairs of dark tracks in B were found to have silver grains tapering off in opposite directions, demonstrating bidirectional fork movement from a central replication origin where a replication bubble forms (see Figure 5-24). A replication fork is thought to stop only when it encounters a replication fork moving in the opposite direction or when it reaches the end of the chromosome; in this way, all the DNA is eventually replicated.
Experiments of this type have shown the following: (1) Approximately 30,000 to 50,000 origins of replication are used each time a human cell divides. (2) The human genome has many more (perhaps tenfold more) potential origins than this, and different cell types use different sets of origins. This excess of origins may allow a cell to coordinate its active origins with other features of its chromosomes such as which genes are being expressed. The excess origins also provide "backups" in case a primary origin fails. (3) Origins of replication do not all "fire" simultaneously; rather, they often are activated in a prescribed order in a given cell type. (4) Regardless of when a given origin fires or where on the chromosome it is located, the replication forks all move at approximately the same speed. (5) As in bacteria, replication forks are formed in pairs and create an expanding replication bubble as they move in opposite directions away from a common point of origin, stopping only when they meet a replication fork moving in the opposite direction or when they reach a chromosome end. In this way, many replication forks operate independently on each chromosome and yet form two complete daughter DNA helices.

In Eukaryotes, DNA Replication Takes Place During Only One Part of the Cell Cycle

When growing rapidly, bacteria replicate their DNA nearly continually. In contrast, DNA replication in most eukaryotic cells occurs only during a specific part of the cell-division cycle, called the DNA synthesis phase, or S phase (Figure 5-29). In a mammalian cell, the S phase typically lasts for about 8 hours; in simpler eukaryotic cells such as yeasts, the S phase can be as short as 40 minutes. By its end, each chromosome has been replicated to produce two complete copies, which remain joined together at their centromeres until the M phase (M for mitosis), which soon follows. Although different origins of replication fire at different times, all DNA replication is begun and completed during S phase. In Chapter 17, we describe the control system that runs the cell cycle, and we explain how entry into each phase of the cycle requires the cell to have successfully completed the previous phase.
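Combining the 8-hour S phase with the fork speed and chromosome size quoted earlier in this chapter gives a rough lower bound on how many forks a chromosome needs. The Python sketch below is a back-of-the-envelope estimate only: it assumes that every fork runs for the entire S phase at a constant 50 nucleotides per second and that each origin contributes exactly two forks, neither of which holds in a real cell, where origins fire at different times. The function names are illustrative.

```python
import math


def minimum_forks(length_nt: int, fork_rate_nt_per_s: int,
                  s_phase_seconds: int) -> int:
    """Fewest forks that could copy length_nt within the S phase.

    Assumes each fork moves at a constant rate for the whole S phase.
    """
    nt_per_fork = fork_rate_nt_per_s * s_phase_seconds
    return math.ceil(length_nt / nt_per_fork)


def minimum_origins(forks: int) -> int:
    """Each origin produces two forks moving in opposite directions."""
    return math.ceil(forks / 2)


chromosome_nt = 150_000_000      # average-size human chromosome
fork_rate = 50                   # nucleotides per second
s_phase_seconds = 8 * 60 * 60    # about 8 hours in a mammalian cell

forks = minimum_forks(chromosome_nt, fork_rate, s_phase_seconds)
print(forks, minimum_origins(forks))
```

Because origins do not all fire at the start of S phase, the number actually used is expected to be well above this bound.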
In the following sections, we explore how DNA replication begins on eukaryotic chromosomes and how this event is coordinated with the cell cycle.

Eukaryotic Origins of Replication Are "Licensed" for Replication by the Assembly of an Origin Recognition Complex

Having seen that a eukaryotic chromosome is replicated using many origins of replication, each of which fires at a characteristic time in S phase of the cell cycle, we turn to the nature of these origins of replication. We saw earlier in this chapter that replication origins have been precisely defined in bacteria as specific DNA sequences that attract initiator proteins, which then assemble the DNA replication machinery. We shall see that this is also the case for the single-cell budding yeast S. cerevisiae, but it appears not to be strictly true for many other eukaryotes.
For budding yeast, the location of every origin of replication on each chromosome has been determined. The particular chromosome shown in Figure 5-30, chromosome III from S. cerevisiae, is one of the smallest chromosomes known, with a length less than 1/100 that of a typical human chromosome. Its major origins are spaced an average of 30,000 nucleotide pairs apart, but only a subset of these origins is used by a given cell. Nonetheless, this chromosome can be replicated in about 15 minutes.
The minimal DNA sequence required for directing DNA replication initiation in S. cerevisiae has been determined by taking a segment of DNA that spans an origin of replication and testing smaller and smaller DNA fragments for their ability to function as origins. These DNA sequences that can serve as an origin of replication are found to contain (1) a binding site for a large, multisubunit initiator protein called ORC, for origin recognition complex; (2) a stretch of DNA that is rich in A's and T's and therefore easy to pull apart; and (3) at least one binding site for proteins that facilitate ORC binding, probably by adjusting the local chromatin structure.
Figure 5-29 The four successive phases of a standard eukaryotic cell cycle. During the G1, S, and G2 phases, the cell grows continually. During M phase growth stops, the nucleus divides, and the cell divides in two. DNA replication is confined to the part of the cell cycle known as S phase. G1 is the gap between M phase and S phase; G2 is the gap between S phase and M phase. Many eukaryotic cells spend only a small fraction of their time in phase.
Figure 5-30 The origins of DNA replication on chromosome III of the yeast S. cerevisiae. This chromosome, one of the smallest eukaryotic chromosomes known, carries a total of 180 genes. As indicated, it contains 18 replication origins, although they are used with different frequencies. Those in red are typically used in less than 10% of cell divisions, while those in green are used about 90% of the time.

Features of the Human Genome That Specify Origins of Replication Remain to Be Fully Understood

Compared with the situation in budding yeast, the determinants of replication origins in humans have been difficult to discover. It has been possible to identify specific human DNA sequences, each several thousand nucleotide pairs in length, that are sufficient to serve as replication origins. These origins continue to function when moved to a different chromosomal region by recombinant DNA methods, as long as they are placed in a region where the chromatin is relatively uncondensed. However, comparisons of such DNA sequences have not revealed DNA sequences in common as in the origins of bacteria and yeasts.
Despite this, a human ORC that is very similar to the yeast ORC binds to origins of replication and initiates DNA replication in humans. Many of the other proteins that function in the initiation process in yeast likewise have central roles in humans. The yeast and human initiation mechanisms are thus similar, although some property of the genome other than a specific DNA sequence has the central role in attracting an ORC to a mammalian origin of replication. Origins of replication are often nucleosome-free, and it has been proposed that DNA that is difficult to fold onto a histone core may help define origins of replication. Nearby transcriptional activity on the genome may also play a role in activating certain origins, by altering the local chromatin structures, as we discuss in Chapter 7. This idea helps to explain why different cell types, which express different sets of genes, often use different origins. Consistent with this idea, origins that fire the earliest in S phase tend to be located near highly transcribed regions of the genome.
Finally, origins located in proximity to each other tend to fire together, and it seems likely that the three-dimensional structure of chromosomes organizes origins of replication into domains, such that all the origins in a given domain fire simultaneously. All of these influences probably work together to determine how mammalian origins of replication are selected by the cell, thereby explaining the difficulty scientists have had in precisely defining their salient features.

Properties of the ORC Ensure That Each Region of the DNA Is Replicated Once and Only Once in Each S Phase

In bacteria, once the initiator protein is properly bound to the single origin of replication, the assembly of the replication forks seems to follow more or less automatically. In eukaryotes, the situation is significantly different because of a profound problem eukaryotes have in replicating chromosomes: with so many places to begin replication, how is the process regulated to ensure that all the DNA is copied once and only once?
The answer lies in how the assembly of the replication-fork proteins at the origins of replication is regulated. We discuss this process in more detail in Chapter 17, where we consider the machinery that underlies the cell-division cycle. In brief, during G1 phase, a symmetrical complex of two incomplete helicases is loaded onto DNA by the bound ORC. Then, upon passage from G1 phase to S phase, specialized protein kinases come into play and direct the final assembly of the two replicative helicases, positioning one on each of the two complementary DNA single strands, where they move in opposite directions to begin opening the DNA double helix. At this point, the additional replication proteins are brought to the DNA, and two complete replication forks move in opposite directions away from the origin of replication (Figure 5-31).
Figure 5-31 DNA replication initiation in eukaryotes. This mechanism ensures that each origin of replication is activated only once per cell cycle. An origin of replication can be used only if two Mcm helicases (which form the enzymatic cores of the replicative helicases) are loaded in G1 phase. At the beginning of S phase, specialized kinases phosphorylate both the Mcm helicases and ORC, activating the former and inactivating the latter. These kinases also guide the assembly of additional proteins that complete the helicases to form the fully active replicative helicases, known as the CMG helicases. New Mcm helicases cannot be loaded at the origin until the cell progresses through mitosis to the next G1 phase, when ORC is dephosphorylated. The name CMG derives from Cdc45, Mcm, and GINS, the components of the active helicase (see Figure 5-19).
The same protein kinases that trigger the final assembly of the helicases prevent the binding of new helicases to that origin until the next M phase resets the entire cycle (for details, see pp. 1043-1045). They do this, in part, by phosphorylating ORC, rendering it unable to accept new helicases. Thus, the kinases specify a single window of opportunity for precursor helicases to be loaded at origins of replication (G1 phase, when kinase activity is low) and a second window for them to be assembled into their active form (S phase, when kinase activity is high). Because these two phases of the cell cycle are mutually exclusive and occur in a prescribed order, each origin of replication can fire only once during each cell cycle.
Because there are many more potential replication origins on a eukaryotic chromosome than are actually used in any one cell cycle (see Figure 5-30), the DNA at many ORC-bound replication origins will be replicated by forks formed at a neighboring region of the chromosome. Thus, preventing any single origin from firing more than once during an S phase is not enough to avoid the re-replication of DNA in eukaryotes. In addition, any ORC-DNA complex that is passed by a replication fork must be inactivated, and it is the combination of the two mechanisms that guarantees that each region of the DNA is replicated once and only once in each S phase.
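The two-window logic described above can be sketched as a toy state model. This is only an illustration under simplifying assumptions: the class `Origin`, the functions `g1_phase`, `s_phase`, and `try_to_relicense`, and the idea that a fork from a neighboring origin reaches every unfired origin are all hypothetical constructs, not part of the text.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    orc_active: bool = True   # ORC dephosphorylated, able to load helicases
    licensed: bool = False    # a pair of Mcm helicases is loaded
    replicated: bool = False  # DNA at this origin has been copied this cycle


def g1_phase(origins):
    """Low kinase activity: ORC is active and helicases are loaded."""
    for origin in origins:
        origin.orc_active = True
        origin.replicated = False
        origin.licensed = True


def s_phase(origins, firing_indices):
    """High kinase activity: fire selected origins, then let forks sweep the rest.

    Returns how many times the DNA at each origin was replicated.
    """
    copies = [0] * len(origins)
    for origin in origins:
        origin.orc_active = False  # phosphorylated ORC cannot load helicases
    for i in firing_indices:
        origin = origins[i]
        if origin.licensed and not origin.replicated:
            origin.licensed = False  # helicases leave with the forks
            origin.replicated = True
            copies[i] += 1
    for i, origin in enumerate(origins):
        if not origin.replicated:
            origin.licensed = False  # a passing fork inactivates the complex
            origin.replicated = True
            copies[i] += 1
    return copies


def try_to_relicense(origin):
    """Helicase loading succeeds only while ORC is active (G1)."""
    if origin.orc_active:
        origin.licensed = True
    return origin.licensed
```

In this sketch, asking an origin to fire twice in the same S phase has no effect the second time, and relicensing fails until `g1_phase` is called again, which mirrors the "once and only once" guarantee.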

New Nucleosomes Are Assembled Behind the Replication Fork

Several additional aspects of DNA replication are specific to eukaryotes compared with bacteria. As discussed in Chapter 4, eukaryotic chromosomes are composed of roughly equal mixtures of DNA and protein. Chromosome duplication therefore requires not only the replication of DNA but also the synthesis of new chromosomal proteins and their assembly onto the DNA behind each replication fork. Although we are far from understanding this process in detail, we are beginning to learn how the fundamental unit of chromatin packaging, the nucleosome, is duplicated. The cell requires a large amount of new histone protein, approximately equal in mass to the newly synthesized DNA, each time it divides. For this reason, most eukaryotic organisms possess multiple copies of the gene for each histone. Vertebrate cells, for example, have about 20 repeated gene sets, most containing the genes that encode all five histones (H1, H2A, H2B, H3, and H4).
Unlike most proteins, which are made continually, histones are synthesized mainly in S phase, when the level of histone mRNA increases about fiftyfold as a result of both increased transcription and decreased mRNA degradation. The major histone mRNAs are degraded within minutes when DNA synthesis stops at the end of S phase. The mechanism depends on special properties of the 3' ends of these mRNAs, as discussed in Chapter 7. In contrast to their mRNAs, the histone proteins themselves are remarkably stable and may survive for many generations. The tight linkage between DNA synthesis and histone synthesis appears to reflect a feedback mechanism that monitors the level of free histone to ensure that the amount of histone made exactly matches the amount of new DNA synthesized.
As a replication fork advances, it must pass through the parent nucleosomes. In the cell, efficient replication requires chromatin remodeling complexes (discussed in Chapter 4) and histone chaperone proteins (discussed below) to destabilize the DNA-histone interfaces. Aided by such specialized proteins, replication forks can transit even highly condensed chromatin. As a replication fork passes through chromatin, the histones are transiently displaced, leaving about 600 nucleotide pairs of "free" DNA in its wake. The reestablishment of nucleosomes behind a moving fork occurs in an intriguing way. When a nucleosome is traversed by a replication fork, the histone octamer is broken into an H3-H4 tetramer and two H2A-H2B dimers (discussed in Chapter 4), all of which are released from DNA. The H3-H4 tetramers remain in the vicinity of the fork by loosely binding to several of the proteins at the replication fork (primarily the CMG helicase) and are distributed at random to one or the other daughter duplexes as the fork moves forward. In contrast, the H2A-H2B dimers are released completely from the fork and may diffuse to entirely different chromosomes. Freshly made H3-H4 tetramers are added to the newly synthesized DNA to fill in the "spaces," and H2A-H2B dimers (half of which are old and half new) are then added at random to complete the nucleosomes behind the fork (Figure 5-32).
The formation of new nucleosomes behind a replication fork has an important consequence for the process of DNA replication itself. As DNA polymerase δ discontinuously synthesizes the lagging strand (see Figure 5-19), the length of each Okazaki fragment is determined by the point at which DNA polymerase δ is blocked by a newly formed nucleosome. This tight coupling between nucleosome duplication and DNA replication probably explains why the length of Okazaki fragments in eukaryotes (~200 nucleotides) is approximately the same as the nucleosome repeat length.
The orderly and rapid addition of new H3-H4 tetramers and H2A-H2B dimers behind a replication fork requires histone chaperones (also called chromatin assembly factors). These multisubunit complexes bind the highly basic histones and release them on DNA only in the appropriate context. For example, some of the histone chaperones, along with their histone cargoes, are directed to newly replicated DNA through a specific interaction with the sliding clamp (see Figure 5-32). As we have seen, these clamps remain on the DNA behind replication forks, and some appear to linger just long enough for the histone chaperones to complete their tasks. Because they bind so well to histones, some histone chaperones also help to disassemble nucleosomes. Of particular importance to DNA replication is the FACT chaperone, which moves at the front of the replication machinery, disassembling nucleosomes as it moves forward (see Figure 5-32).
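The random partitioning of parent H3-H4 tetramers between the two daughter duplexes can be illustrated with a small simulation. This is a rough sketch, not a model of the actual chaperone mechanism: the function names, the independent coin flip per tetramer, and the assumption that each daughter has as many nucleosome sites as the parent are all illustrative choices.

```python
import random


def distribute_parent_tetramers(n_parent, rng):
    """Assign each parent H3-H4 tetramer to daughter 0 or daughter 1 at random."""
    counts = [0, 0]
    for _ in range(n_parent):
        counts[rng.randrange(2)] += 1
    return counts


def new_tetramers_needed(parent_counts, sites_per_daughter):
    """Sites left empty on each daughter are filled with newly made tetramers."""
    return [sites_per_daughter - inherited for inherited in parent_counts]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    inherited = distribute_parent_tetramers(1000, rng)
    print("parent tetramers per daughter:", inherited)
    print("new tetramers per daughter:", new_tetramers_needed(inherited, 1000))
```

Because every parent tetramer ends up on one daughter or the other, the total number of new tetramers required equals the number of parent tetramers, consistent with the cell needing to double its histone content each S phase.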

Termination of DNA Replication Occurs Through the Ordered Disassembly of the Replication Fork

We saw earlier in this chapter that E. coli DNA replication begins at a single origin, and two replication forks proceed bidirectionally around the circular genome, meeting at a spot opposite to the origin of replication. Here, the two forks do not simply collide with each other running at full speed; rather, this spot on the E. coli genome has a special DNA sequence that slows down and stalls the movement of each fork, causing them to disassemble. The remaining gaps in the daughter DNA molecules are filled in and sealed by repair DNA polymerases and DNA ligase (see Figures 5-11 and 5-12), and the two completed bacterial genomes are separated using topoisomerases (see Figure 5-23).
As might be expected, the situation in eukaryotes is more complicated. First, each round of replication requires many termination events, roughly as many as there are initiation events at origins of replication. Thus, in mammalian cells, approximately termination events occur in every S phase. Second, the termination of replication forks in eukaryotes is largely independent of any underlying DNA sequence in the genome. Rather, the principal termination

Figure 5-32 Formation of nucleosomes behind a replication fork. Parent H3-H4 tetramers remain associated with the fork and are distributed at random to the daughter DNA molecules, with roughly equal numbers inherited by each daughter. In contrast, H2A-H2B dimers are released completely from the fork as it passes. This release begins just in front of the replication fork and is facilitated by the histone chaperone FACT, which moves with the fork. FACT has several globular protein domains connected by flexible linkers and can make multiple contacts with a nucleosome to aid in its disassembly. Additional histone chaperones (NAP1 and CAF1) restore the full complement of histones to daughter molecules using both parent and newly synthesized histones. Although not shown in the figure, it has been proposed that FACT directly hands off parent H3-H4 tetramers to components of the replication machinery, which in turn hand them off to CAF1 chaperones, which deposit them evenly on the two daughter molecules. The way in which histones are distributed behind a replication fork means that some daughter nucleosomes contain only parent histones or only newly synthesized histones, but most are hybrids of old and new. For simplicity, the DNA double helix is shown as a single red line.

signal is a head-on encounter with a fork moving in the opposite direction. When two forks meet, the CMG helicase at each fork is covalently modified by addition of ubiquitin (see Figure 3-65), which causes its disassembly and removal from DNA. Without the helicase, the other replication proteins rapidly dissociate from the fork. Repair DNA polymerase and DNA ligase subsequently fill in and seal any remaining gaps. Eukaryotic replication forks must also contend with the ends of chromosomes. Here, it is believed that the CMG helicase simply slides off the end of the DNA molecule, leading to the dissociation of the other fork proteins. However, replicating DNA to the very end of a chromosome presents a special challenge to the eukaryotic cell, as we describe next.

Telomerase Replicates the Ends of Chromosomes

We saw earlier in the chapter that synthesis of the lagging strand at a replication fork must occur discontinuously through a backstitching mechanism that produces short DNA fragments attached to RNA primers. The final RNA primer synthesized on the lagging-strand template cannot be replaced by DNA because there is no primer ahead of it to provide a 3'-OH end for the repair polymerase. Without a mechanism to deal with this problem, DNA would be lost from the ends of all chromosomes each time a cell divides.
Bacteria avoid this "end-replication" problem by having circular DNA molecules as chromosomes, as we have seen. Eukaryotes solve it in a different way: they have specialized nucleotide sequences at the ends of their chromosomes that are incorporated into structures called telomeres (discussed in Chapter 4). Telomeres contain many tandem repeats of a short sequence that is similar in organisms as diverse as protozoa, fungi, plants, and mammals. In humans, the sequence of the repeat unit is GGGTTA, and it is repeated roughly a thousand times at each telomere.
Telomere DNA sequences are recognized by sequence-specific DNA-binding proteins that attract an enzyme, called telomerase, that replenishes these sequences each time a cell divides. Telomerase recognizes the tip of an existing telomere DNA repeat sequence and elongates it in the 5'-to-3' direction, using an RNA template that is a component of the enzyme itself to synthesize new DNA copies of the repeat (Figure 5-33). The enzymatic portion of telomerase resembles other reverse transcriptases, proteins that synthesize DNA using an RNA template, although, in this case, the telomerase RNA also contributes to the active site and is essential for efficient catalysis. After extension of the parent DNA strand by telomerase, replication of the lagging strand at the chromosome end can be completed by the conventional DNA polymerases, using these extensions as a template to synthesize the complementary strand (Figure 5-34).
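The RNA-templated extension step can be sketched in a few lines. This is a deliberately simplified illustration: the template string `CCCAAU` (written 3' to 5') is an assumed minimal template that encodes exactly one GGGTTA repeat, whereas the real telomerase RNA template is longer and includes an alignment region; the function names are hypothetical.

```python
# Base-pairing rules for copying an RNA template into DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(template_3_to_5):
    """Return the DNA strand (5'->3') copied from an RNA template read 3'->5'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in template_3_to_5)


def extend_telomere(dna_5_to_3, template_3_to_5, rounds):
    """Append `rounds` copies of the templated repeat to the 3' end of the DNA."""
    repeat = reverse_transcribe(template_3_to_5)
    return dna_5_to_3 + repeat * rounds


if __name__ == "__main__":
    start = "GGGTTAGGGTTA"  # two existing human-type repeats, 5'->3'
    print(extend_telomere(start, "CCCAAU", rounds=3))
```

Because new nucleotides are only ever appended to the right-hand end of the 5'-to-3' string, the sketch also reflects the direction of synthesis described in the text.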
Figure 5-33 Schematic structure of human telomerase. This large enzyme is composed of 10 protein subunits and an RNA of 451 nucleotides. The RNA forms the scaffold of the complex, provides the template for synthesizing new DNA telomere repeats, and helps form the active site. The synthesis reaction itself is carried out by the reverse transcriptase domain of the protein, shown in light green, in conjunction with the RNA. A reverse transcriptase is a special form of polymerase enzyme that uses an RNA template to make a DNA strand; telomerase is unique in carrying its own RNA template with it. Telomerase also contains several additional protein complexes (some of which are shown in dark green and blue) that are needed to assemble the enzyme and, for many organisms but not humans, to bring it to the ends of chromosomes. (Modified from T.H.D. Nguyen et al., Nature 557: .
Figure 5-34 Telomere replication. Shown here is the reaction that synthesizes the repeating sequences that form the ends of the chromosomes (telomeres) of eukaryotes. The 3' end of the parent lagging-strand template is extended by RNA-templated DNA synthesis; this allows the incomplete daughter DNA strand that is paired with it to be synthesized to the end of the chromosome. The synthesis of the final bit of lagging strand is carried out by DNA polymerase α, which carries a DNA primase as one of its subunits (Movie 5.6). DNA polymerase α is the same enzyme used to begin the synthesis of each Okazaki fragment on the lagging strand; it begins its synthesis with RNA (not shown) and continues with DNA (green). The telomere sequence illustrated is that of the ciliate Tetrahymena, in which these reactions were first discovered.

Telomeres Are Packaged into Specialized Structures That Protect the Ends of Chromosomes

The ends of chromosomes present cells with an additional problem. As we will see in the next part of this chapter, when a chromosome is accidentally broken into two pieces, the break is rapidly repaired. Telomeres must clearly be distinguished from these accidental breaks; otherwise, the cell will attempt to "repair" telomeres, generating chromosome fusions and other genetic abnormalities. Telomeres have several features to prevent this from happening.
A specialized nuclease chews back the 5' end of a telomere, leaving a protruding, single-strand 3' end. This protruding end, in combination with the GGGTTA repeats in telomeres, attracts a group of proteins that form a protective chromosome cap known as shelterin. In particular, shelterin protects telomeres from being treated as damaged DNA. Another feature of telomeres may offer additional protection. When human telomeres are artificially cross-linked and viewed by electron microscopy, structures known as "t-loops" can be observed in which the protruding single-strand end of the telomere loops back and tucks itself into the duplex DNA of the telomere repeat sequence (Figure 5-35). An attractive idea is that t-loops are orchestrated by shelterin to help "hide" the very ends of chromosomes.

Telomere Length Is Regulated by Cells and Organisms

Because the processes that grow and shrink each telomere sequence are only approximately balanced, chromosome ends contain variable numbers of telomeric repeats. Not surprisingly, many cells, including stem cells and germ cells, have homeostatic mechanisms that maintain the number of these repeats within a limited range (Figure 5-36).
In most of the dividing somatic cells of humans, however, telomeres gradually shorten, and it has been proposed that this provides a counting mechanism that helps prevent the unlimited proliferation of wayward cells in adult tissues. In its simplest form, this idea holds that our somatic cells start off in the embryo with a full complement of telomeric repeats. These are then eroded to different extents in different cell types. Some stem cells, notably those in tissues that must be replenished at a high rate throughout life (bone marrow or gut lining, for example), retain full telomerase activity. However, in many other types of cells, the level of telomerase is reduced so that the enzyme cannot quite keep up with chromosome duplication. Such cells lose 100-200 nucleotides from each telomere every time they divide. After many cell generations, the descendant cells will inherit chromosomes that lack functioning telomeres and, as a result of this defect, activate a DNA-damage response that causes them to withdraw permanently from the cell cycle and cease dividing, a process called replicative cell senescence (discussed in Chapters 17 and 20). In theory, such a mechanism could provide a safeguard against the uncontrolled cell proliferation of abnormal cells in somatic tissues, thereby helping to protect us from cancer.

Figure 5-35 A t-loop at the end of a mammalian chromosome. (A) Electron micrograph of the DNA at the end of an interphase human chromosome. The chromosome was fixed, deproteinated, and artificially thickened before viewing. The loop seen here is approximately 15,000 nucleotide pairs in length. (B) Schematic diagram of t-loop formation. (A, from J.D. Griffith et al., Cell 97:503-514, 1999. With permission from Elsevier.)
The idea that telomere length acts as a "measuring stick" to count cell divisions and thereby regulate the lifetime of the cell lineage has been tested in several ways. For certain types of human cells grown in tissue culture, the experimental results support such a theory. Human fibroblasts normally proliferate for about 60 cell divisions in culture before undergoing replicative cell senescence. Like most other somatic cells in humans, fibroblasts produce only low levels of telomerase, and their telomeres gradually shorten each time they divide. When telomerase is provided to the fibroblasts by inserting a fully active telomerase gene, telomere length is maintained and many of the cells now continue to proliferate indefinitely. Also consistent with these ideas is the observation that, in approximately of cancer cells, the telomerase gene has become reactivated, thereby circumventing the normal safety mechanism (see pp. 1073-1074).
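The "measuring stick" idea lends itself to a back-of-the-envelope calculation using figures quoted in this chapter: roughly a thousand GGGTTA repeats per human telomere and a loss of 100-200 nucleotides per division. The sketch below is only an estimate under loud assumptions: it treats the loss per division as constant, and the `critical_nt` threshold at which a telomere stops functioning is a hypothetical parameter (set to zero here, which gives an upper bound).

```python
REPEAT_UNIT = "GGGTTA"
REPEATS_PER_TELOMERE = 1000  # "roughly a thousand" repeats per human telomere


def telomere_length_nt(repeats=REPEATS_PER_TELOMERE):
    """Approximate telomere length in nucleotides."""
    return repeats * len(REPEAT_UNIT)


def divisions_until_critical(start_nt, loss_per_division_nt, critical_nt=0):
    """Whole divisions possible before the telomere falls below critical_nt."""
    return (start_nt - critical_nt) // loss_per_division_nt


if __name__ == "__main__":
    start = telomere_length_nt()
    for loss in (100, 200):
        n = divisions_until_critical(start, loss)
        print(f"loss of {loss} nt per division: about {n} divisions")
```

With these inputs the estimate spans a few dozen divisions, the same order of magnitude as the roughly 60 divisions reported for cultured human fibroblasts, although real cells senesce before their telomeres are fully eroded.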
It has been proposed that this type of control on cell proliferation may contribute to the aging of animals like ourselves. These ideas have been tested by producing transgenic mice that lack telomerase entirely. The telomeres in mouse chromosomes are about five times longer than human telomeres, and the mice must therefore be bred through three or more generations before their telomeres have shrunk to the normal human length. It is therefore perhaps not surprising that the first generations of mice develop normally. However, the mice in later generations develop progressively more defects in some of their highly proliferative tissues. In addition, these mice show signs of premature aging and have a pronounced tendency to develop tumors. In these and other respects, these mice resemble humans with the genetic disease dyskeratosis congenita. Individuals afflicted with this disease carry one functional and one nonfunctional copy of the telomerase RNA gene; they have prematurely shortened telomeres and typically die of progressive bone marrow failure. These individuals also develop lung scarring and liver cirrhosis and show abnormalities in various epidermal structures including skin, hair follicles, and nails.
The above observations demonstrate that controlling cell proliferation by telomere shortening poses a risk to an organism, because not all of the cells that begin losing the ends of their chromosomes will stop dividing. Some apparently become genetically unstable, but continue to divide, giving rise to variant cells that can lead to cancer. As discussed above, many of these variant cells ultimately produce high levels of telomerase, thereby ensuring their continued survival. Clearly, the use of telomere shortening as a regulating mechanism is not foolproof and, like many mechanisms in the cell, it must strike a balance between benefit and risk.

Figure 5-36 A demonstration that yeast cells control the length of their telomeres. In this experiment, the telomere at one end of a particular chromosome is artificially made either longer (left) or shorter (right) than average. After many cell divisions, the chromosome recovers, showing an average telomere length and a length distribution that is typical of the other chromosomes in the yeast cell. A similar feedback mechanism for controlling telomere length has been proposed for the germ-line cells and stem cells of mammals.

Summary

The proteins that initiate DNA replication bind to DNA sequences at a replication origin to catalyze the formation of a replication bubble with two outward-moving replication forks. The process begins when an initiator protein-DNA complex is formed that subsequently loads a DNA helicase onto the DNA template. Other proteins are then added to form the multienzyme "replication machine" that catalyzes DNA synthesis at each replication fork.
In bacteria and some simple eukaryotes, replication origins are defined by specific DNA sequences that are several hundred nucleotide pairs long. In other eukaryotes, such as humans, features that specify an origin of DNA replication are less well defined, and probably depend more on structural features of chromosomes than on specific DNA sequences.
Bacteria typically have a single origin of replication in a circular chromosome. With fork speeds of up to 1000 nucleotides per second, they can replicate their genome in less than an hour. Eukaryotic DNA replication takes place in only one part of the cell cycle, the S phase. The replication fork in eukaryotes moves about 20 times more slowly than the bacterial replication fork, and the much longer eukaryotic chromosomes each require many replication origins to complete their replication in an S phase, which typically lasts for 8 hours in human cells. The different replication origins in these eukaryotic chromosomes are activated in a sequence, determined in part by which genes are currently being transcribed and the structure of chromatin across each chromosome. After the replication fork has passed, chromatin structure is re-formed by the addition of new histones to the old histones that are directly inherited by each daughter DNA molecule.
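The timing claims in this paragraph can be checked with simple arithmetic. The fork speed (up to 1000 nucleotides per second), the roughly 20-fold slower eukaryotic fork, and the 8-hour S phase come from the text; the genome sizes used below (about 4.6 million nucleotide pairs for E. coli and about 150 million for a large human chromosome) are outside assumptions added for illustration, as are the function names.

```python
import math

BACTERIAL_FORK_NT_PER_S = 1000      # upper figure quoted in the text
EUKARYOTIC_SLOWDOWN = 20            # eukaryotic forks are ~20x slower
S_PHASE_SECONDS = 8 * 60 * 60       # typical human S phase of 8 hours

ECOLI_GENOME_NT = 4_600_000         # assumption: approximate E. coli genome size
HUMAN_CHROMOSOME_NT = 150_000_000   # assumption: a large human chromosome


def bidirectional_time_s(genome_nt, fork_nt_per_s):
    """Time for two forks from one origin to copy a circular genome."""
    return genome_nt / (2 * fork_nt_per_s)


def min_origins(chromosome_nt, fork_nt_per_s, window_s):
    """Fewest bidirectional origins able to finish within the time window."""
    covered_per_origin = 2 * fork_nt_per_s * window_s
    return math.ceil(chromosome_nt / covered_per_origin)


if __name__ == "__main__":
    minutes = bidirectional_time_s(ECOLI_GENOME_NT, BACTERIAL_FORK_NT_PER_S) / 60
    print(f"E. coli at top fork speed: about {minutes:.0f} minutes")
    euk_speed = BACTERIAL_FORK_NT_PER_S // EUKARYOTIC_SLOWDOWN
    print("minimum origins:", min_origins(HUMAN_CHROMOSOME_NT, euk_speed, S_PHASE_SECONDS))
```

The bacterial estimate comes out under an hour, in line with the text, and the eukaryotic estimate is a lower bound that assumes every origin fires at the very start of S phase, which the text indicates is not the case.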
Eukaryotes solve the problem of replicating the ends of their linear chromosomes with a specialized end structure, the telomere, maintained by a special nucleotide-polymerizing enzyme called telomerase. Telomerase extends one of the DNA strands at the end of a chromosome by using an RNA template that is an integral part of the enzyme itself, producing a highly repeated DNA sequence that typically extends for thousands of nucleotide pairs at each chromosome end. Telomeres have specialized structures that distinguish them from broken ends of chromosomes, ensuring that they are not treated as damaged DNA.

DNA REPAIR

Maintaining the genetic stability that an organism needs for its survival requires not only an extremely accurate mechanism for replicating DNA but also mechanisms for repairing the many accidental lesions that DNA continually suffers. Most such spontaneous changes in DNA are temporary because they are immediately corrected by a set of processes that are collectively called DNA repair. Of the tens of thousands of random changes created every day in the DNA of a human cell by heat, metabolic accidents, radiation of various sorts, and exposure to substances in the environment, only a few (less than ) accumulate as permanent mutations in the DNA sequence. The rest are eliminated with remarkable efficiency by DNA repair.
The importance of DNA repair is evident from the large investment that cells make in the enzymes that carry it out: several percent of the coding capacity of most genomes is devoted solely to DNA repair functions. The importance of DNA repair is also demonstrated by the increased rate of mutation that follows the inactivation of a DNA repair gene. Many DNA repair proteins and the genes that encode them-which we now know operate in a wide range of organisms,
DNA 修复的重要性显而易见,细胞在进行 DNA 修复时投入了大量资源:大多数基因组的编码能力中有几个百分点专门用于 DNA 修复功能。DNA 修复的重要性还体现在 DNA 修复基因失活后突变率的增加。许多 DNA 修复蛋白质及其编码基因,我们现在知道它们在广泛的生物体中发挥作用。
TABLE 5-2 Some Inherited Human Syndromes with Defects in DNA Repair
Name of syndrome or responsible genes | Phenotype | Enzyme or process affected
Msh2, Msh3, Msh6, Mlh1, Pms2 | Colon cancer | Mismatch repair
Polymerase proofreading-associated polyposis | Colon cancer | Proofreading by DNA polymerase
Aicardi-Goutières syndrome | Encephalopathy, neurological dysfunction, genome instability | Removal of misincorporated ribonucleotides in DNA
Xeroderma pigmentosum (XP) groups A-G | Skin cancer, UV sensitivity, neurological abnormalities | Nucleotide excision repair
Cockayne syndrome | UV sensitivity, developmental abnormalities | Coupling of nucleotide excision repair to transcription
XP variant | UV sensitivity, skin cancer | Translesion synthesis by DNA polymerase η
Ataxia telangiectasia (AT) | Leukemia, lymphoma, γ-ray sensitivity, genome instability | ATM protein, a protein kinase activated by double-strand DNA breaks
Seckel syndrome | Dwarfism, microcephaly | ATR protein, a protein kinase activated by single-strand DNA breaks
Brca1 | Breast and ovarian cancer | Repair by homologous recombination
Brca2 | Breast, ovarian, prostate, and pancreatic cancer | Repair by homologous recombination
Ataxia-telangiectasia-like disorder (ATLD) | Leukemia, lymphoma, γ-ray sensitivity, genome instability | Mre11 protein, required for processing double-strand DNA breaks
Werner syndrome | Premature aging, cancer at several sites, genome instability | Accessory 3'-exonuclease and DNA helicase used in repair
Bloom syndrome | Cancer at several sites, stunted growth, genome instability | DNA helicase needed for recombination
Fanconi anemia groups A-W | Congenital abnormalities, leukemia, genome instability | DNA interstrand cross-link repair
46BR patient | Hypersensitivity to DNA-damaging agents, genome instability | DNA ligase I
including humans-were originally identified in bacteria by the isolation and characterization of mutants that displayed an increased mutation rate or an increased sensitivity to DNA-damaging agents.
Studies of the consequences of a diminished capacity for DNA repair in humans have linked many human diseases with decreased repair (Table 5-2). Thus, we saw previously that defects in a human gene whose product normally functions to repair the mismatched base pairs resulting from DNA replication errors can lead to an inherited predisposition to cancers of the colon and some other organs, caused by an increased mutation rate. In another human disease, xeroderma pigmentosum (XP), the afflicted individuals have an extreme sensitivity to ultraviolet radiation because they are unable to repair the damage to DNA caused by this component of sunlight. This repair defect results in an increased mutation rate that leads to serious skin lesions and a greatly increased susceptibility to skin cancers. Finally, mutations in the Brca1 and Brca2 genes compromise a type of DNA repair known as homologous recombination and are a major cause of hereditary breast and ovarian cancers.

Without DNA Repair, Spontaneous DNA Damage Would Rapidly Change DNA Sequences

Although DNA is a highly stable material, as required for the storage of genetic information, it is a complex organic molecule that is susceptible, even under normal cell conditions, to spontaneous changes that would lead to mutations if left unrepaired (Figure 5-37 and see Table 5-3). For example, the DNA of each human cell loses about 18,000 purine bases (adenine and guanine) every day because their N-glycosyl linkages to deoxyribose break, a spontaneous hydrolysis reaction called depurination. Similarly, a spontaneous deamination of cytosine to uracil in

Figure 5-37 A summary of spontaneous alterations that require DNA repair. The sites on each nucleotide modified by spontaneous oxidative damage (red arrows), hydrolytic attack (blue arrows), and methylation (green arrows) are shown, with the width of each arrow indicating the relative frequency of each event (see Table 5-3). (After T. Lindahl, Nature 362:709-715, 1993.)

TABLE 5-3 Endogenous DNA Lesions Arising and Repaired in a Diploid Mammalian Cell in 24 Hours
DNA lesion | Number repaired in 24 hours
Hydrolysis
Depurination | 18,000
Depyrimidination | 600
Cytosine deamination | 100
5-Methylcytosine deamination | 10
Oxidation
8-oxoguanine | 1500
Ring-saturated pyrimidines (thymine glycol, cytosine hydrates) | 2000
Lipid peroxidation products (M1G, etheno-A, etheno-C) | 1000
Nonenzymatic methylation by S-adenosylmethionine
7-Methylguanine | 6000
3-Methyladenine | 1200
Nonenzymatic methylation by nitrosated polyamines and peptides
O6-Methylguanine

The DNA lesions listed in the table are the result of the normal chemical reactions that take place in cells. Cells that are exposed to external chemicals and radiation suffer greater and more diverse forms of DNA damage. (From T. Lindahl and D.E. Barnes, Cold Spring Harb. Symp. Quant. Biol. 65:127-133, 2000.)
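
To get a feel for the scale of the repair burden in Table 5-3, the listed counts can be added up. The short Python sketch below does only that arithmetic; the dictionary and function names are illustrative, and the O6-methylguanine row is left out because no count is given for it here.

```python
# Lesions per diploid mammalian cell per 24 hours, copied from Table 5-3.
# O6-methylguanine is omitted because its count is not listed above.
LESIONS_PER_DAY = {
    "depurination": 18000,
    "depyrimidination": 600,
    "cytosine deamination": 100,
    "5-methylcytosine deamination": 10,
    "8-oxoguanine": 1500,
    "ring-saturated pyrimidines": 2000,
    "lipid peroxidation products": 1000,
    "7-methylguanine": 6000,
    "3-methyladenine": 1200,
}

SECONDS_PER_DAY = 24 * 60 * 60


def total_lesions(table):
    """Sum the daily lesion counts over all listed lesion types."""
    return sum(table.values())


def share(table, lesion):
    """Fraction of all listed lesions contributed by one lesion type."""
    return table[lesion] / total_lesions(table)


total = total_lesions(LESIONS_PER_DAY)
print(total)                                            # 30410
print(round(share(LESIONS_PER_DAY, "depurination"), 2)) # 0.59
print(round(SECONDS_PER_DAY / total, 1))                # 2.8 (seconds per lesion)
```

On these figures, depurination alone accounts for well over half of the listed lesions, and a single cell must repair a lesion roughly every three seconds.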
DNA occurs at a rate of about 100 bases per cell per day (Figure 5-38). DNA bases are also occasionally damaged by encounters with reactive metabolites produced in the cell (for example, the high-energy methyl donor, S-adenosylmethionine) or by exposure to toxic chemicals in the environment. Likewise, ultraviolet radiation from the Sun can produce a covalent linkage between two adjacent pyrimidine bases in DNA to form, for example, thymine dimers (Figure 5-39). If left uncorrected, most of these changes would lead either to the deletion of one or more base pairs or to a base-pair substitution in the daughter DNA chain when the DNA is replicated (Figure 5-40). These mutations would then be propagated throughout all subsequent cell generations. Such a high rate of unrepaired random changes in the DNA sequence would have disastrous consequences, both in the germ line and in somatic tissues.
Figure 5-39 The ultraviolet radiation in sunlight can cause the formation of thymine dimers. Two adjacent thymine bases have become covalently attached to each other to form a thymine dimer. Skin cells that are exposed to sunlight are especially susceptible to this type of DNA damage. Dimers can also form between an adjacent thymine and cytosine.

Figure 5-38 Depurination and deamination are the most frequent spontaneous chemical reactions known to create serious DNA damage in cells. (A) Depurination can remove guanine (or adenine) from DNA. (B) The major type of deamination reaction converts cytosine to uracil, which, as we have seen, is not normally found in DNA. However, deamination can occur on other bases as well. Both depurination and deamination take place on double-helical DNA, and neither reaction breaks the phosphodiester backbone.
Figure 5-40 Chemical modifications of nucleotides, if left unrepaired, produce mutations. (A) Deamination of cytosine, if uncorrected, results in the substitution of one base for another when the DNA is replicated. As shown in Figure 5-43, deamination of cytosine produces uracil. Uracil differs from cytosine in its base-pairing properties and preferentially base-pairs with adenine. The DNA replication machinery therefore inserts an adenine when it encounters a uracil on the template strand. (B) Depurination, if uncorrected, can lead to the loss of a nucleotide pair. When the replication machinery encounters a missing purine on the template strand, it can skip to the next complete nucleotide, as shown, thus producing a daughter DNA molecule that is missing one nucleotide pair. In other cases, the replication machinery places an incorrect nucleotide across from the missing base, again resulting in a mutation (not shown).
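
The two outcomes described in Figure 5-40 can be mimicked with a toy model. The Python sketch below is only an illustration under simplifying assumptions: strands are plain strings, the new strand is written in the same left-to-right order as its template, the `_` marker for a lost purine is a hypothetical convention, and the replication machinery always skips a missing base (the alternative of inserting an incorrect nucleotide is not modeled).

```python
# Toy model of Figure 5-40: copying a damaged template strand.
# The new strand is written in template order (not reversed) so that
# paired positions line up when the strings are compared.

PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}  # U pairs with A
ABASIC = "_"  # hypothetical marker: sugar phosphate whose purine was lost


def deaminate_cytosine(strand, position):
    """Convert the cytosine at `position` to uracil."""
    if strand[position] != "C":
        raise ValueError("no cytosine at this position")
    return strand[:position] + "U" + strand[position + 1:]


def depurinate(strand, position):
    """Remove the purine base at `position`, leaving an abasic site."""
    if strand[position] not in "AG":
        raise ValueError("no purine at this position")
    return strand[:position] + ABASIC + strand[position + 1:]


def replicate(template):
    """Copy a template, skipping abasic sites as in Figure 5-40B."""
    return "".join(PAIRS_WITH[base] for base in template if base != ABASIC)


original = "GACTA"
with_uracil = deaminate_cytosine(original, 2)   # "GAUTA"
with_abasic = depurinate(original, 0)           # "_ACTA"

print(replicate(original))                 # CTGAT
print(replicate(with_uracil))              # CTAAT (A inserted opposite U)
print(replicate(replicate(with_uracil)))   # GATTA (C replaced by T for good)
print(replicate(with_abasic))              # TGAT  (one nucleotide shorter)
```

The third line shows why the change is heritable: once the new strand itself serves as a template, the original C has been replaced by a T.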
Figure 5-41 panel labels: (A) base excision repair; (B) nucleotide excision repair, showing a DNA helix with a 12-nucleotide gap that DNA polymerase fills using the bottom strand as a template before DNA ligase seals the break.
Figure 5-41 A comparison of two major DNA repair pathways. (A) Base excision repair. This pathway starts with a DNA glycosylase. In the example shown here, the enzyme uracil DNA glycosylase removes an accidentally deaminated cytosine in DNA. After the action of this glycosylase (or another DNA glycosylase that recognizes a different kind of damage), the sugar phosphate with the missing base is cut out by the sequential action of AP endonuclease and a phosphodiesterase. The gap of a single nucleotide is then filled by DNA polymerase and DNA ligase. The net result is that the U that was created by accidental deamination is restored to a C. The loss of a base can occur either from the actions of DNA glycosylases that recognize damaged bases or from spontaneous chemical reactions (see Figure 5-37). AP endonuclease is so named because it recognizes any site in the DNA helix that contains a deoxyribose sugar with a missing base; such sites can arise either by the loss of a purine (apurinic sites) or by the loss of a pyrimidine (apyrimidinic sites). (B) Nucleotide excision repair. In bacteria, after a multienzyme complex has recognized a lesion such as a pyrimidine dimer (see Figure 5-39), one cut is made on each side of the lesion, and an associated DNA helicase then removes the entire portion of the damaged strand. The excision repair machinery in bacteria operates as shown. In humans, once the damaged DNA is recognized, a helicase is recruited to locally unwind the DNA duplex. Next, the excision nuclease enters and cleaves on either side of the damage, leaving a gap of about 30 nucleotides that is subsequently filled in. The nucleotide excision repair machinery in both bacteria and humans can recognize and repair many different types of DNA damage.
Figure 5-41A). Depurination, which is by far the most frequent type of damage suffered by DNA, also leaves a deoxyribose sugar with a missing base. Depurinations are directly repaired beginning with AP endonuclease, following the bottom half of the pathway in Figure 5-41A.
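
The sequence of steps in base excision repair can be written out as a small pipeline. The Python sketch below is a simplified illustration, not a model of the real enzymes: the function names merely echo the enzymes named in Figure 5-41A, the `_` (abasic site) and `-` (gap) markers are hypothetical conventions, and the partner strand is written aligned position by position with the damaged strand.

```python
# Toy pipeline for base excision repair (after Figure 5-41A).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ABASIC = "_"  # hypothetical marker: deoxyribose sugar with a missing base
GAP = "-"     # hypothetical marker: one-nucleotide gap in the strand


def uracil_dna_glycosylase(strand):
    """Step 1: remove each uracil base, leaving an abasic site."""
    return strand.replace("U", ABASIC)


def ap_endonuclease_and_phosphodiesterase(strand):
    """Step 2: cut out each base-less sugar phosphate, leaving a gap."""
    return strand.replace(ABASIC, GAP)


def polymerase_and_ligase(strand, partner):
    """Step 3: fill each gap by pairing with the partner strand, then seal."""
    return "".join(
        COMPLEMENT[p] if base == GAP else base
        for base, p in zip(strand, partner)
    )


def base_excision_repair(damaged, partner):
    """Run the three steps in order on one damaged strand."""
    without_uracil = uracil_dna_glycosylase(damaged)
    gapped = ap_endonuclease_and_phosphodiesterase(without_uracil)
    return polymerase_and_ligase(gapped, partner)


partner = "CTACGGTC"  # undamaged strand, aligned with the damaged one

print(base_excision_repair("GATGUCAG", partner))  # GATGCCAG (U restored to C)
print(base_excision_repair("GAT_CCAG", partner))  # GATGCCAG (depurination repaired)
```

The second call starts from an abasic site, so the glycosylase step has nothing to do; this corresponds to a depurination entering the pathway at AP endonuclease.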
The second major repair pathway is called nucleotide excision repair. This mechanism can repair the damage caused by almost any large change in the structure of the DNA double helix. Such "bulky lesions" include those created by the covalent reaction of DNA bases with large hydrocarbons (such as the carcinogen benzopyrene, found in tobacco smoke, coal tar, and diesel exhaust), as well as the various pyrimidine dimers (T-T, T-C, and C-C) caused by sunlight. In this pathway, a large multienzyme complex scans the DNA for a distortion in the double helix, rather than for a specific base change. Once it finds a lesion, it cleaves the phosphodiester backbone of the abnormal strand on both sides of
the distortion, and a DNA helicase peels away the single-strand oligonucleotide containing the lesion. The large gap produced in the DNA helix is then repaired by DNA polymerase and DNA ligase (see Figure 5-41B).
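
Nucleotide excision repair differs from base excision repair in that a whole stretch of the damaged strand is replaced. The Python sketch below illustrates this under stated assumptions: the lesion is marked with lowercase letters (a hypothetical convention), it is shorter than the excised patch, the patch is roughly centered on the lesion, and the partner strand is written aligned with the damaged strand. The default patch of 12 nucleotides follows the bacterial example in Figure 5-41B.

```python
# Toy model of nucleotide excision repair (after Figure 5-41B).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nucleotide_excision_repair(damaged, partner, patch_length=12):
    """Cut out a patch around the lesion and resynthesize it from the partner."""
    lesion = [i for i, base in enumerate(damaged) if base.islower()]
    if not lesion:
        return damaged  # nothing distorts the helix, so nothing is excised
    # One cut on each side of the lesion, clipped to the ends of the strand.
    start = max(0, lesion[0] - patch_length // 2)
    end = min(len(damaged), start + patch_length)
    # DNA polymerase fills the gap using the partner strand as template.
    patch = "".join(COMPLEMENT[base] for base in partner[start:end])
    return damaged[:start] + patch + damaged[end:]


partner = "CTACGGTCTACTATGG"   # undamaged strand
damaged = "GATGccAGATGATACC"   # lowercase "cc" marks a C-C pyrimidine dimer

print(nucleotide_excision_repair(damaged, partner))                   # GATGCCAGATGATACC
print(nucleotide_excision_repair(damaged, partner, patch_length=30))  # GATGCCAGATGATACC
```

Note that the function never inspects which bases are damaged, only where the distortion lies, which mirrors the way the pathway scans for helix distortions rather than specific base changes.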
An alternative to these base and nucleotide excision repair processes is the direct chemical reversal of DNA damage, and this strategy is selectively employed for the rapid removal of certain highly mutagenic or cytotoxic lesions. For example, the lesion O6-methylguanine has its methyl group removed by direct transfer to a cysteine residue in the repair protein itself. Because the repair protein is destroyed in the process, each molecule of it can only be used once. In another example, methyl groups in the lesions 1-methyladenine and 3-methylcytosine are "burned off" by an iron-dependent demethylase, with release of formaldehyde from the methylated DNA and regeneration of the native base.

Coupling Nucleotide Excision Repair to Transcription Ensures That the Cell's Most Important DNA Is Efficiently Repaired

All of a cell's DNA is under constant surveillance for damage, and the repair mechanisms we have described act on all parts of the genome. However, cells have a way of directing DNA repair to the DNA sequences that are most needed. They do this by linking RNA polymerase, the enzyme that transcribes DNA into RNA as the first step in gene expression, to the nucleotide excision repair pathway. As discussed above, this repair system can correct many different types of DNA damage. RNA polymerase stalls at DNA lesions and, through the use of coupling proteins, directs the excision repair machinery to those sites, thereby selectively repairing genes that are in current use by the cell. In bacteria, where genes are relatively short, the stalled RNA polymerase can be dissociated from the DNA; the DNA is repaired, and the gene is transcribed again from the beginning. In eukaryotes, where genes can be enormously long, a more complex reaction is used to "back up" the RNA polymerase, repair the damage, and then restart the polymerase.
The importance of transcription-coupled excision repair is seen in people with Cockayne syndrome, which is caused by a defect in this coupling. These individuals suffer from growth retardation, skeletal abnormalities, progressive neural retardation, and severe sensitivity to sunlight. Most of these problems are thought to arise from RNA polymerase molecules that become permanently stalled at sites of DNA damage that lie in important genes.

The Chemistry of the DNA Bases Facilitates Damage Detection

The DNA double helix is well suited for repair. As noted earlier, it contains a backup copy of all genetic information. Equally importantly, the nature of the four bases in DNA makes the distinction between undamaged and damaged

Figure 5-42 The recognition of an unusual nucleotide in DNA by base flipping. The DNA glycosylase family of enzymes recognizes inappropriate bases in DNA in the conformation shown. Each of these enzymes cleaves the glycosyl bond that connects a particular recognized base (yellow) to the backbone sugar, removing it from the DNA. (A) Stick model of the DNA; (B) space-filling model.
bases very clear. For example, every possible deamination event in DNA yields an "unnatural" base, which can be directly recognized and removed by a specific DNA glycosylase. Hypoxanthine, for example, is the simplest purine base capable of pairing specifically with C. But hypoxanthine is not used in DNA, presumably because it is the direct deamination product of A. Instead G, with a second amino group, pairs with C: G cannot form from A by spontaneous deamination, and its own deamination product (xanthine) is likewise unique (Figure 5-43).

Figure 5-43 The deamination of DNA nucleotides. In each case, the oxygen atom that is added in this reaction with water is colored red. (A) The spontaneous deamination products of A and G are recognizable as unnatural when they occur in DNA and thus are readily found and repaired, as is the deamination of C to U; T has no amino group to remove. (B) About 3% of the C nucleotides in vertebrate DNAs are methylated to help in controlling gene expression (discussed in Chapter 7). When these 5-methyl C nucleotides are accidentally deaminated, they form the natural nucleotide T. This T will be paired with a G on the opposite strand, forming a mismatched base pair.
As discussed in Chapter 6, RNA is thought, on an evolutionary time scale, to have served as the genetic material before DNA, and it seems likely that the genetic code was initially carried in the four nucleotides A, C, G, and U. This raises the question of why the U in RNA was replaced in DNA by T (which is 5-methyl U). We have seen that the spontaneous deamination of C converts it to U, but that this event is rendered relatively harmless by uracil DNA glycosylase. However, if DNA contained U as a natural base, the repair system would not be able to distinguish a deaminated C from a naturally occurring U.
A special situation occurs in vertebrate DNA, in which selected C nucleotides are methylated at specific CG sequences that are associated with inactive genes (discussed in Chapter 7). The accidental deamination of these methylated C nucleotides produces the natural nucleotide T (see Figure 5-43B) in a mismatched base pair with a G on the opposite DNA strand. To help in repairing deaminated methylated C nucleotides, a special DNA glycosylase recognizes a mismatched base pair involving T in the sequence T-G and removes the T. This DNA repair mechanism must be relatively ineffective, however, because methylated C nucleotides are exceptionally common sites for mutations in vertebrate DNA. It is striking that, even though only about 3% of the C nucleotides in human DNA are methylated, mutations in these methylated nucleotides account for about one-third of the single-base mutations that have been observed in inherited human diseases.
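
The argument of this section, that a deamination is easy to detect only when its product is not itself a normal DNA base, can be checked mechanically. The Python sketch below is an illustration only; the dictionary simply restates the deamination products named in the text and in Figure 5-43, and the function name is hypothetical.

```python
# Which deamination events yield a base that already belongs in the genome
# and therefore cannot be flagged as unnatural on sight?

DEAMINATION_PRODUCT = {
    "A": "hypoxanthine",
    "G": "xanthine",
    "C": "U",
    "5-methyl C": "T",
}  # T has no amino group to remove, so it has no entry


def hidden_deaminations(natural_bases):
    """List the bases whose deamination product is itself a natural base."""
    return sorted(
        base
        for base, product in DEAMINATION_PRODUCT.items()
        if product in natural_bases
    )


dna_bases = {"A", "G", "C", "T"}
hypothetical_uracil_dna = {"A", "G", "C", "U"}  # thought experiment from the text

print(hidden_deaminations(dna_bases))                # ['5-methyl C']
print(hidden_deaminations(hypothetical_uracil_dna))  # ['C']
```

With T as the fourth base, only the deamination of 5-methyl C escapes direct recognition, which matches the special T-G mismatch glycosylase described above. If U were a natural DNA base instead, every deaminated C would be hidden.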

Special Translesion DNA Polymerases Are Used in Emergencies

If a cell's DNA suffers heavy damage, the repair mechanisms that we have discussed are often insufficient to cope with it. In these cases, a different strategy is called into play, one that entails some risk to the cell. The highly accurate replicative DNA polymerases stall when they encounter damaged DNA, and in emergencies cells employ versatile, but less accurate, backup polymerases, known as translesion polymerases, to replicate through the DNA damage.
Human cells contain seven different translesion polymerases, some of which can recognize a specific type of DNA damage and add the nucleotides required to restore the correct sequence. For example, one such polymerase adds two A's opposite a thymine dimer (see Figure 5-39). Others make only "good guesses," especially when the template base has been extensively damaged. These enzymes are not as accurate as the normal replicative polymerases even when they copy an undamaged DNA sequence. For one thing, they lack exonucleolytic proofreading activity; in addition, many are much less discriminating than the replicative polymerase in choosing which nucleotide to incorporate initially. Each such translesion polymerase is therefore given a chance to add only one or a few nucleotides before a high-fidelity replicative polymerase resumes DNA synthesis.
Despite their usefulness in allowing heavily damaged DNA to be replicated, these translesion polymerases do, as noted above, pose risks to the cell. They are probably responsible for most of the base-substitution and single-nucleotide deletion mutations that accumulate in genomes. Not only do they frequently produce mutations when copying damaged DNA, they probably also generate mutations, at a low level, on undamaged DNA. Clearly, it is important for the cell to tightly regulate these polymerases, activating them only at sites of DNA damage. Exactly how this happens for each translesion polymerase remains to be discovered, but a conceptual model is presented in Figure 5-44. The same principle applies to many of the DNA repair processes discussed in this chapter: because the enzymes that carry out these reactions are potentially dangerous to the genome, they must be brought into play only at the appropriate damaged sites.

Double-Strand Breaks Are Efficiently Repaired

An especially dangerous type of DNA damage occurs when both strands of the double helix are broken, leaving no intact template strand to enable accurate repair. Ionizing radiation, replication errors, oxidizing agents, and other
metabolites produced in the cell cause breaks of this type. If these lesions were left unrepaired, they would quickly lead to the breakdown of chromosomes into smaller fragments and to loss of genes when the cell divides. However, two distinct mechanisms have evolved to deal with this type of damage by restoring an intact double helix: nonhomologous end joining and homologous recombination (Figure 5-45).
The simplest to understand is nonhomologous end joining, in which the broken ends are processed to remove any damaged nucleotides and simply brought together and rejoined by DNA ligation, generally with the loss of nucleotides at the site of joining (Figure 5-46). This end-joining mechanism, which can be seen as a "quick and dirty" solution to the repair of double-strand breaks, is the predominant way of repairing these lesions in mammalian somatic cells. Although a change in the DNA sequence (a mutation) usually results at the site of breakage, so little of the mammalian genome is essential for life that this mechanism is apparently an acceptable solution to the problem of rejoining broken chromosomes. By the time a human reaches the age of 70, the typical somatic cell contains more than 2000 such "scars," distributed throughout its genome, representing places where DNA has been inaccurately repaired by nonhomologous end joining.
But nonhomologous end joining presents another danger: nonhomologous end joining can occasionally generate rearrangements in which one broken chromosome becomes covalently attached to another. This can result

Figure 5-44 How translesion DNA polymerases are recruited to damaged templates. According to this model, a replicative polymerase stalled at a site of DNA damage is recognized by the cell as needing rescue. Specialized enzymes covalently modify the sliding clamp (typically, it is ubiquitylated; see Figure 3-65), which releases the replicative DNA polymerase and, together with the damaged DNA, attracts a translesion polymerase specific to that type of damage. Once the damaged DNA is bypassed, the covalent modification of the clamp is removed, the translesion polymerase dissociates, and the high-fidelity replicative polymerase is brought back into play.
Figure 5-45 Cells can repair double-strand breaks in one of two ways. (A) In nonhomologous end joining, the break is first "cleaned" by a nuclease that chews back the broken ends to produce flush ends. The flush ends are then stitched together by a DNA ligase. Some nucleotides are usually lost in the repair process, as indicated by the black lines in the repaired DNA. (B) If a double-strand break occurs in one of two duplicated DNA double helices after DNA replication has occurred, but before the chromosome copies have been separated, the undamaged double helix can be used as a template to repair the damaged double helix through homologous recombination. Although more complicated than nonhomologous end joining, this process accurately restores the original DNA sequence at the site of the break.
in chromosomes with two centromeres and chromosomes lacking centromeres altogether; both types of aberrant chromosomes are missegregated during cell division. As previously discussed, the specialized structure of telomeres prevents the natural ends of chromosomes from being mistaken for broken DNA and "repaired" in this way.
A much more accurate type of double-strand break repair is also possible (see Figure 5-45B). Here, a damaged DNA molecule is repaired using a second DNA double helix as a template, one with an identical (or nearly identical) DNA sequence. Homologous recombination is described in detail in the next part of this chapter. Although nonhomologous end joining and homologous recombination are the two principal ways that cells repair double-strand breaks, additional mechanisms exist.
Figure 5-46 Nonhomologous end joining. (A) A central role is played by the Ku protein, a heterodimer that quickly grasps the broken chromosome ends. The additional proteins (shown in blue) are recruited to hold the broken ends together and remove any damaged nucleotides before the two DNA molecules are joined covalently by a specialized ligase that is dedicated to nonhomologous end joining. During this process, any single-strand gaps that arise are "filled in" by specialized repair polymerases. When DNA suffers double-strand breaks through ionizing radiation or chemical attack, the broken ends are often chemically damaged. Nonhomologous end joining is unusually versatile in being able to "clean up" just about any type of damaged end. (B) Three-dimensional structure of a Ku heterodimer bound to the end of a duplex DNA fragment. This Ku protein is also essential for V(D)J joining, a specific process through which antibody and T-cell receptor diversity is generated in developing B and T cells (discussed in Chapter 24). V(D)J joining and nonhomologous end joining share many mechanistic similarities, but the former relies on specific double-strand breaks that are produced deliberately by the cell. (From J. Walker, R. Corpina, and J. Goldberg, Nature 412:607-614, 2001. With permission from Springer Nature; PDB codes: .)

described later in this chapter. Most organisms employ both nonhomologous end joining and homologous recombination to repair double-strand breaks in DNA. Nonhomologous end joining predominates in humans; homologous recombination is used only in the S and G2 cell-cycle phases, when one newly replicated daughter molecule can act as a template to repair damage to the other daughter that remains nearby.
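
The contrast between the two double-strand break repair routes can be made concrete with a toy comparison. The Python sketch below is a deliberately crude illustration: each double helix is reduced to one strand written as a string, end processing is modeled as trimming a fixed number of nucleotides (`resected`, a hypothetical parameter) from each broken end, and the sister molecule is assumed to be an intact copy of the same length as the unbroken original.

```python
# Toy comparison of nonhomologous end joining and homologous recombination.


def nonhomologous_end_joining(left_end, right_end, resected=1):
    """Trim both broken ends, then ligate them directly."""
    trimmed_left = left_end[:len(left_end) - resected]
    trimmed_right = right_end[resected:]
    return trimmed_left + trimmed_right


def homologous_recombination(left_end, right_end, sister, resected=1):
    """Trim both broken ends, then copy the missing stretch from the sister."""
    kept_left = len(left_end) - resected
    kept_right = len(right_end) - resected
    missing = sister[kept_left:len(sister) - kept_right]
    return left_end[:kept_left] + missing + right_end[resected:]


sister = "GATGCCAGATGATACC"               # intact copy made during replication
left_end, right_end = "GATGCCAG", "ATGATACC"  # the other copy, broken in the middle

print(nonhomologous_end_joining(left_end, right_end))         # GATGCCATGATACC
print(homologous_recombination(left_end, right_end, sister))  # GATGCCAGATGATACC
```

End joining leaves a two-nucleotide "scar" in this example, whereas copying from the sister restores the original sequence. The second function needs an intact sister to read from, which is why this route is confined to the period after DNA replication.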

DNA Damage Delays Progression of the Cell Cycle

We have just seen that cells contain multiple enzyme systems that can recognize and repair many types of DNA damage (Movie 5.7). Because of the importance of maintaining intact, undamaged DNA from generation to generation, eukaryotic cells delay the progression of their cell cycle until DNA repair is complete. As discussed in detail in Chapter 17, the orderly progression of the cell cycle is stopped when damaged DNA is detected, and it restarts only when the damage has been repaired. In mammalian cells, the presence of DNA damage can block entry from G1 phase into S phase, it can slow S phase once it has begun, and it can block the transition from G2 phase to M phase. These delays facilitate DNA repair by providing the time needed for the repair to reach completion.
DNA damage also results in an increased synthesis of many DNA repair enzymes. This response depends on special signaling proteins that sense DNA damage and synthesize more of the DNA repair enzymes appropriate for the damage. The importance of this mechanism is revealed by the phenotype of humans who are born with defects in the gene that encodes the ATM protein. These individuals have the disease ataxia telangiectasia, the symptoms of which include neurodegeneration, a predisposition to cancer, and genome instability. The ATM protein is a large protein kinase that generates the intracellular signals needed to halt the cell cycle in response to many types of spontaneous DNA damage (see Figure 17-60), and individuals with defects in this protein suffer from the effects of unrepaired DNA lesions.
DNA 损伤还会导致许多 DNA 修复酶的合成增加。这种反应依赖于特殊的信号蛋白,它们能感知 DNA 损伤并合成更多适用于该损伤的 DNA 修复酶。这种机制的重要性通过那些在编码 ATM 蛋白的基因中出现缺陷的人类的表型得以揭示。这些个体患有共济失调毛细血管扩张症 ,其症状包括神经退行性变、癌症易感性和基因组不稳定性。ATM 蛋白是一个大型蛋白激酶,它产生细胞内信号以响应多种类型的自发 DNA 损伤而停止细胞周期(见图 17-60),而那些在该蛋白中存在缺陷的个体则遭受未修复 DNA 损伤的影响。

Summary 摘要

Genetic information can be stored stably in DNA sequences only because a large set of DNA repair enzymes continually scans the DNA double helix and replaces any damaged nucleotides. Most types of DNA repair depend on the fact that a DNA molecule carries two copies of its genetic information, one copy on each of its two complementary strands. This allows an accidental lesion on one strand to be removed by a repair enzyme and a corrected strand then resynthesized by reference to the information in the undamaged strand.
遗传信息之所以能够稳定地存储在 DNA 序列中,是因为大量的 DNA 修复酶不断扫描 DNA 双螺旋,并替换任何受损的核苷酸。大多数类型的 DNA 修复依赖于 DNA 分子携带其遗传信息的两份副本的事实-每个副本位于其两个互补链的一侧。这使得一条链上的意外损伤可以被修复酶移除,并且根据未受损链中的信息重新合成一条校正的链。
Most of the damage to DNA bases is excised by one of two major DNA repair pathways. In base excision repair, the altered base is removed by a DNA glycosylase enzyme, followed by excision of the resulting sugar phosphate. In nucleotide excision repair, a small section of the DNA strand surrounding the damage is removed from the DNA double helix. In both cases, the gap left in the DNA helix is filled in by the sequential action of DNA polymerase and DNA ligase, using the undamaged DNA strand as the template. Some types of DNA damage can be repaired by a different strategy, the direct chemical reversal of the damage, which is carried out by specialized repair proteins. Usually, all such corrections are completed prior to DNA replication. But if not, a special class of inaccurate DNA polymerases, called translesion polymerases, is used to bypass the damage, allowing the cell to survive but sometimes creating permanent mutations at the sites of damage.
DNA 碱基大部分损伤是通过两种主要的 DNA 修复途径之一被切除的。在碱基切除修复中,改变的碱基被 DNA 醇基化酶酶去除,随后去除产生的糖磷酸。在核苷酸切除修复中,从 DNA 双螺旋中去除损伤周围的 DNA 链的一小部分。在这两种情况下,DNA 螺旋中留下的间隙由 DNA 聚合酶和 DNA 连接酶的顺序作用填补,使用未受损的 DNA 链作为模板。一些类型的 DNA 损伤可以通过不同的策略修复-直接化学逆转损伤,由专门的修复蛋白质执行。通常,在 DNA 复制之前完成所有这些修正。但如果没有,一种特殊类别的不准确 DNA 聚合酶,称为跨损伤聚合酶,用于绕过损伤,使细胞存活,但有时在损伤部位创建永久突变。
Other critical repair systems, based on either nonhomologous end joining or homologous recombination, are needed to reseal the accidental double-strand breaks that occasionally occur in the DNA helix. In most cells, an elevated level of DNA damage causes a delay in the cell cycle, which helps to ensure that the damage is repaired before the cell divides.
其他基于非同源末端连接或同源重组的关键修复系统需要重新封闭偶尔在 DNA 螺旋中发生的意外双链断裂。在大多数细胞中,DNA 损伤水平升高会导致细胞周期延迟,这有助于确保在细胞分裂之前修复损伤。

HOMOLOGOUS RECOMBINATION
同源重组

In the preceding parts of this chapter, we discussed the mechanisms that allow the DNA sequences in cells to be maintained from generation to generation with very little change. In this part, we further explore a group of repair mechanisms that depend on a process called homologous recombination. The key feature of homologous recombination (also known as general recombination) is an exchange of DNA strands between a pair of homologous duplex DNA sequences. Such a strand exchange between two regions of double helix that are very similar or identical in nucleotide sequence allows one stretch of duplex DNA to restore lost or damaged information on a second stretch of duplex DNA. Because the DNA sequence information that is used to correct the damage can come from a separate DNA molecule, homologous recombination can repair many types of DNA damage. It makes possible, for example, the accurate repair of double-strand breaks, as mentioned previously (see Figure 5-45B). As pointed out earlier, these double-strand breaks can result from reactive chemicals or radiation (for example, that from radon gas that accumulates in some old basements). But more frequently they arise from DNA replication accidents, when forks become stalled or broken independently of any such external cause. Homologous recombination accurately corrects these accidents, and, because they occur during nearly every round of DNA replication, this repair pathway is essential for every proliferating cell. Homologous recombination can also repair other types of DNA damage (for example, covalent cross-links between the two strands of a DNA double helix), being perhaps the most versatile DNA repair mechanism available to the cell; this probably explains why its mechanism and the proteins that carry it out have been conserved in virtually all cells on Earth.
在本章的前面部分,我们讨论了允许细胞中的 DNA 序列在世代之间保持几乎不变的机制。在本部分中,我们进一步探讨了一组依赖于称为同源重组的过程的修复机制。同源重组(也称为一般重组)的关键特征是在一对同源双链 DNA 序列之间交换 DNA 链。这种在核苷酸序列非常相似或相同的双螺旋区域之间进行的链交换允许一段双链 DNA 恢复第二段双链 DNA 上丢失或受损的信息。因为用于纠正损伤的 DNA 序列信息可以来自另一个 DNA 分子,所以同源重组可以修复许多类型的 DNA 损伤。例如,正如之前提到的那样,它可以实现双链断裂的准确修复(参见图 5-45B)。正如前面指出的,这些双链断裂可能是由反应性化学物质或辐射(例如,在一些老地下室中积聚的氡气)引起的。 但更频繁地,它们源于 DNA 复制事故-当叉变得停滞或独立于任何外部原因而断裂时。同源重组准确地纠正这些事故,并且,因为它们发生在几乎每一轮 DNA 复制期间,这种修复途径对于每一个增殖细胞都是必不可少的。同源重组还可以修复其他类型的 DNA 损伤(例如,DNA 双螺旋的两股之间的共价交联),也许是细胞中最多功能的 DNA 修复机制;这可能解释了为什么它的机制和执行它的蛋白质在地球上几乎所有细胞中都被保留下来。
We shall also see that homologous recombination plays an additional role in sexually reproducing organisms. During meiosis, a key step in gamete (sperm and egg) production, it catalyzes the orderly exchange of blocks of genetic information between corresponding (homologous) maternal and paternal chromosomes. This creates new combinations of DNA sequences in the chromosomes that are passed to offspring, giving the next generation unique characteristics upon which natural selection can act.
我们还将看到同源重组在有性繁殖生物中发挥额外作用。在减数分裂过程中,即配子(精子和卵子)的产生中,它催化母体和父体染色体之间对应(同源)基因信息块的有序交换。这在染色体中创造了新的 DNA 序列组合,传递给后代,赋予下一代独特特征,自然选择可以作用于这些特征。

Homologous Recombination Has Common Features in All Cells
同源重组在所有细胞中具有共同特征

The current view of homologous recombination as a critical DNA repair mechanism in all cells developed slowly from its original discovery as a key component in the specialized process of meiosis in plants and animals. The subsequent recognition that homologous recombination also occurs in unicellular organisms made it readily amenable to molecular analyses. Thus, much of what we know about the biochemistry of genetic recombination was derived from studies of bacteria, especially of E. coli and its viruses, as well as from experiments with simple eukaryotes such as yeasts. For these organisms with short generation times and relatively small genomes, it was possible to isolate a large set of mutants with defects in their recombination processes. The protein altered in each mutant was then identified and, ultimately, studied biochemically. Very close relatives of these proteins were subsequently found in more complex eukaryotes including flies, mice, and humans, and it is now possible to directly analyze homologous recombination in these species as well. As a result, we now know that the fundamental processes that catalyze homologous recombination are common to all cells.
同源重组作为所有细胞中关键的 DNA 修复机制的当前观点,是从最初在植物和动物的减数分裂过程中作为关键组成部分的发现缓慢发展而来的。随后认识到同源重组也发生在单细胞生物中,使其容易受到分子分析的影响。因此,我们对遗传重组的生物化学知识的大部分来源于对细菌的研究,特别是大肠杆菌及其病毒,以及对酵母等简单真核生物的实验。对于这些繁殖周期短、基因组相对较小的生物,可以分离出一大批在其重组过程中存在缺陷的突变体。然后鉴定每个突变体中改变的蛋白质,并最终进行生物化学研究。随后在更复杂的真核生物中,包括果蝇、小鼠和人类中发现了这些蛋白质的非常近的亲缘物种,现在可以直接分析这些物种中的同源重组。因此,我们现在知道,催化同源重组的基本过程是所有细胞共有的。

DNA Base-pairing Guides Homologous Recombination
DNA 碱基配对引导同源重组

The hallmark of homologous recombination is that it takes place only between DNA duplexes that have extensive regions of sequence similarity (homology). Not surprisingly, base-pairing underlies this requirement: before undergoing homologous recombination, two DNA helices will "sample" each other's DNA sequence by testing the potential base-pairing between a single strand from one
同源重组的特点是它仅发生在具有广泛序列相似性(同源性)的 DNA 双螺旋之间。毫不奇怪,碱基配对是这一要求的基础:在进行同源重组之前,两条 DNA 螺旋将通过测试来自另一条单链的 DNA 序列之间的潜在碱基配对来“采样”彼此的 DNA 序列。
DNA duplex and a complementary single strand from the other. Recombination is initiated when a match is found; this match need not be perfect, but it must be very close for homologous recombination to succeed. As we shall see, the process is carefully controlled and guided by a group of specialized proteins.
DNA 双链和来自另一条链的互补单链。当找到匹配时,重组就会开始;这种匹配不必完美,但对于同源重组成功来说必须非常接近。正如我们将看到的那样,这个过程是由一组专门的蛋白质精心控制和引导的。

Homologous Recombination Can Flawlessly Repair Double-Strand Breaks in DNA
同源重组可以无缺地修复 DNA 中的双链断裂

Unlike the nonhomologous end joining discussed earlier, homologous recombination repairs double-strand breaks accurately, without any loss or alteration of nucleotides at the site of repair. For homologous recombination to do this repair job, the damaged DNA must first be brought into proximity with a homologous but undamaged DNA double helix, which can then serve as a template for repair. For this reason, homologous recombination often occurs after DNA replication, when the two daughter DNA molecules lie close together and one can serve as a template for repair of the other.
与前面讨论的非同源末端连接不同,同源重组可以准确修复双链断裂,修复过程中不会丢失或改变修复位置的核苷酸。为了让同源重组完成这一修复任务,受损的 DNA 必须首先与同源但未受损的 DNA 双螺旋靠近,后者可以作为修复的模板。因此,同源重组通常发生在 DNA 复制之后,当两个子 DNA 分子靠在一起时,其中一个可以作为另一个的修复模板。
One of the simplest pathways through which homologous recombination can repair double-strand breaks is shown in Figure 5-47. In essence, the broken DNA duplex and the template duplex carry out a "strand dance" so that one of the damaged strands can use the complementary strand of the intact DNA duplex
同源重组修复双链断裂的最简单途径之一如图 5-47 所示。本质上,断裂的 DNA 双链和模板双链进行“链舞”,以便受损链可以利用完整 DNA 双链的互补链。
Figure 5-47 A mechanism that repairs double-strand breaks by homologous recombination. Homologous recombination can be regarded as a flexible series of reactions, with the exact pathway differing from one case to the next. The pathway shown here represents one of the major forms of recombinational double-strand break repair; however, other, closely related pathways also exist. All share the first two steps (resection and strand invasion), but they diverge afterward. For example, recombinational repair of some double-strand breaks proceeds through a double Holliday junction, a structure we discuss later in this chapter.
图 5-47 通过同源重组修复双链断裂的机制。同源重组可以被视为一系列灵活的反应,具体的路径因情况而异。这里展示的路径代表了重组性双链断裂修复的主要形式之一;然而还存在其他密切相关的路径。所有这些路径都共享前两个步骤——剪切和链侵入,但在此之后它们分歧。例如,一些双链断裂的重组性修复通过双 Holliday 结构进行,这是我们在本章后面讨论的结构。

as a template for repair. Once the damaged and template DNA double helices are in proximity (as occurs, for example, after DNA replication), the ends of the broken DNA are chewed back, or "resected," by specialized nucleases to produce overhanging, single-strand 3' ends. The next step is strand exchange (also called strand invasion), during which one of the single-strand 3' ends from the damaged DNA molecule searches the template duplex for homologous sequences through base-pairing. Once stable base-pairing is established (which completes the strand-exchange step), an accurate DNA polymerase extends the invading strand using the information provided by the undamaged template molecule, thus restoring one of the damaged DNA strands. The last steps (strand displacement, further repair synthesis, and ligation) restore the two original DNA double helices and complete the repair process, as illustrated.
作为修复的模板。一旦受损的和模板 DNA 双螺旋靠近(例如,在 DNA 复制后发生),受损 DNA 的末端会被专门的核酸“削减”或“重组”,产生突出的、单链的 3'末端。下一步是链交换(也称为链入侵),在此期间,来自受损 DNA 分子的单链 3'末端通过碱基配对在模板双链中搜索同源序列。一旦建立稳定的碱基配对(完成链交换步骤),准确的 DNA 聚合酶使用未受损模板分子提供的信息延伸入侵链,从而恢复受损 DNA 链的一条。最后的步骤-链位移、进一步修复合成和连接-恢复两个原始的 DNA 双螺旋并完成修复过程,如图所示。
Homologous recombination resembles other DNA repair reactions in that a DNA polymerase utilizes a pristine template to restore damaged DNA. However, instead of using the partner strand as a template, as occurs in most DNA repair pathways, homologous recombination makes use of a complementary strand from a separate DNA duplex. In the following sections, we discuss the steps of homologous recombination in more detail with an emphasis on the proteins that guide this remarkable process.
同源重组类似于其他 DNA 修复反应,其中 DNA 聚合酶利用原始模板恢复受损的 DNA。然而,与大多数 DNA 修复途径中使用伴侣链作为模板不同,同源重组利用来自单独 DNA 双螺旋的互补链。在接下来的部分中,我们将更详细地讨论同源重组的步骤,重点介绍指导这一显著过程的蛋白质。

Specialized Processing of Double-Strand Breaks Commits Repair to Homologous Recombination
双链断裂的专门处理使修复转向同源重组

Once a double-strand break occurs, nonhomologous end joining and homologous recombination compete to repair the damage. But the specialized nuclease that resects DNA ends to begin homologous recombination becomes highly active during S and G2 (through its phosphorylation by cell-cycle-controlled kinases), and homologous recombination usually wins out at these times, allowing use of a newly replicated daughter DNA molecule as a template. The initiating nuclease (called the Mre11 complex in eukaryotes) chews back in the 5'-to-3' direction, leaving protruding 3' ends on either side of the break that can be as long as several thousand nucleotides. Single-strand binding protein (the same one used at replication forks) then coats the exposed single strands, protecting them from other nucleases in the cell and ensuring that they remain free of intramolecular base-pairing. The formation of these protruding ends prevents nonhomologous end joining from occurring, and it commits the repair pathway to homologous recombination.
一旦发生双链断裂,非同源末端连接和同源重组会竞争修复损伤。但在 S 期和 G2 期期间,负责剪切 DNA 末端以开始同源重组的专门核酸酶会变得高度活跃(通过细胞周期控制的激酶对其进行磷酸化),通常在这些时候同源重组会取得胜利,允许使用新复制的子 DNA 分子作为模板。启动核酸酶(在真核生物中称为 Mre11 复合物)沿 5'→3' 方向咬回,使断裂两侧留下突出的 3' 末端,长度可长达数千个核苷酸。单链结合蛋白(与复制叉中使用的相同)然后包裹暴露的单链,保护它们免受细胞中其他核酸酶的影响,并确保它们保持不受分子内碱基配对的影响。这些突出末端的形成阻止了非同源末端连接的发生,并使修复途径承诺进行同源重组。
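The geometry of resection can be made concrete with a small sketch. It is only a toy model, not a description of how the Mre11 complex works: it assumes one broken end whose top strand is written 5' to 3' and ends at the break, and it assumes the nuclease removes nucleotides from the 5' end of the partner strand, leaving a 3' overhang like the overhanging, single-strand 3' ends shown in Figure 5-47. The function name `resect_end` and the example sequence are hypothetical.

```python
# Toy model of end resection at one side of a double-strand break.
# Assumption: the top strand is written 5'->3' and its 3' end sits at the break.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def resect_end(top_5to3, n):
    """Remove n nucleotides from the 5' end of the bottom strand.

    Returns (paired_bottom, overhang):
      paired_bottom - what is left of the bottom strand, written 3'->5'
                      so that it lines up under the top strand
      overhang      - the single-stranded 3' tail of the top strand
    """
    if not 0 <= n <= len(top_5to3):
        raise ValueError("n must be between 0 and the strand length")
    # Bottom strand aligned under the top strand (so it reads 3'->5').
    bottom_3to5 = "".join(COMPLEMENT[base] for base in top_5to3)
    # The bottom strand's 5' end is at the break, i.e. at the right-hand side.
    paired_bottom = bottom_3to5[: len(bottom_3to5) - n]
    overhang = top_5to3[len(top_5to3) - n:]
    return paired_bottom, overhang


if __name__ == "__main__":
    paired, tail = resect_end("ATGCCGTA", 3)
    print("paired bottom (3'->5'):", paired)
    print("3' overhang on top strand:", tail)
```

In this picture the overhang is the substrate that single-strand binding protein coats and that later invades the template duplex; real resection tracts can be thousands of nucleotides long rather than three.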

Strand Exchange Is Directed by the RecA/Rad51 Protein
链交换由 RecA/Rad51 蛋白质指导

Of all the steps of homologous recombination, strand invasion is the most difficult to imagine. How does the invading single strand rapidly sample a DNA duplex for a complementary sequence? Once the homology is found, how is the structure stabilized? And how is the inherent stability of the template double helix overcome to allow tests for base-pairing during this process?
在同源重组的所有步骤中,链侵入是最难想象的。入侵的单链如何快速地在 DNA 双链中寻找互补序列?一旦发现同源性,结构如何稳定?模板双螺旋的固有稳定性如何被克服,以允许在此过程中进行碱基配对的测试?
The answers to these questions came from biochemical and structural studies of the main protein that carries out this feat, called RecA in E. coli and Rad51 in virtually all eukaryotic organisms. A special group of accessory proteins loads a set of RecA/Rad51 monomers onto a protruding DNA single strand (such as that in Figure 5-47), forming a cooperatively bound filament that displaces the single-strand binding protein originally present. This orderly loading process produces a protein-DNA filament in which the DNA is held by RecA/Rad51 in an unusual conformation: groups of three consecutive nucleotides are positioned as though they were in a conventional DNA double helix, but, between adjacent triplets, the DNA backbone is untwisted and stretched out (Figure 5-48). This unusual protein-DNA structure then grasps a nearby duplex DNA molecule in a way that stretches it, destabilizing it and making it easy to pull the strands apart. The invading single strand then can sample the sequence of the duplex by conventional base-pairing to one of its strands. This sampling occurs in triplet
这些问题的答案来自于对执行这一功能的主要蛋白质进行的生化和结构研究,大肠杆菌中称为 RecA,在几乎所有真核生物中称为 Rad51。一组特殊的辅助蛋白质将一组 RecA/Rad51 单体加载到突出的 DNA 单链上(例如图 5-47 中的单链),形成一个协同结合的丝状物,将最初存在的单链结合蛋白置换出去。这种有序的加载过程产生了一个蛋白质-DNA 丝状物,其中 DNA 由 RecA/Rad51 以一种不寻常的构象保持:三个连续核苷酸组被定位得好像它们在一个传统的 DNA 双螺旋中一样,但在相邻的三联体之间,DNA 骨架是解开的并被拉伸出来(图 5-48)。然后,这种不寻常的蛋白质-DNA 结构以一种拉伸的方式抓住附近的双链 DNA 分子,使其变得不稳定并容易拉开链条。然后,入侵的单链可以通过与其中一条链的传统碱基配对来对双链的序列进行取样。这种取样是以三联体形式进行的。
Figure 5-48 Strand invasion catalyzed by the RecA protein. Our understanding of this reaction is based in part on structures determined by x-ray diffraction studies of the bacterial RecA protein bound to single-stranded and double-stranded DNA. These DNA structures (illustrated with the RecA protein removed) are shown on the left side of the diagram. The reaction begins when ATP-bound RecA protein (blue) associates with a DNA single strand (typically a protruding 3' end as shown in Figure 5-47), holding it in an elongated form where groups of three bases are separated from each other by a stretched and twisted backbone. The RecA-bound single strand then binds to duplex DNA, destabilizing it to allow the single strand to sample its sequence through base-pairing, three bases at a time. If an extensive match is found, the structure is disassembled through ATP hydrolysis, resulting in protein dissociation and the exchange of one single strand of DNA for another, thereby forming a new heteroduplex from the complementary strands of two different DNA molecules. In the vast majority of cases, no match will be found in any one binding event, in which case the RecA-bound DNA single strand rapidly dissociates to begin a new search.
图 5-48 由 RecA 蛋白催化的链侵入。我们对这一反应的理解部分基于细菌 RecA 蛋白与单链和双链 DNA 结合的 X 射线衍射研究所确定的结构。这些 DNA 结构(去除了 RecA 蛋白的插图)显示在图表的左侧。当 ATP 结合的 RecA 蛋白(蓝色)与 DNA 单链(通常是如图 5-47 所示的突出的 3' 端)结合时,反应开始,将其保持在一个拉长的形式中,其中三个碱基组被拉伸和扭曲的骨架分开。然后,RecA 结合的单链结合到双链 DNA,使其不稳定,以允许单链通过碱基配对每次三个碱基地进行序列采样。如果找到了广泛的匹配,结构将通过 ATP 水解而被解体,导致蛋白质解离并交换一条 DNA 单链以形成新的异源双链,这是由两种不同 DNA 分子的互补链形成的。 在绝大多数情况下,在任何一个结合事件中都不会找到匹配项,此时 RecA 结合的 DNA 单链会迅速解离,开始新的搜索。
(PDB code: .) (PDB 代码: .)
nucleotide blocks, each of which is already in a "base-pair ready" conformation in the invading strand; when a good triplet match occurs, only then is the adjacent triplet sampled, and so on. In this way, mismatches very quickly cause dissociation, so that millions of possible pairings can be tested. Only an extended stretch of base-pairing (at least 15 nucleotides) can stabilize the invading strand, leading to the next steps in homologous recombination.
核苷酸块,每个块在入侵链中已处于“碱基对就绪”构象;当发生良好的三联体匹配时,才会采样相邻的三联体,依此类推。这样,不匹配很快导致解离,从而可以测试数百万种可能的配对。只有延伸的碱基配对区段(至少 15 个核苷酸)才能稳定入侵链,引导同源重组的下一步。
RecA/Rad51 is an ATPase, and the steps described above require that each monomer along the filament be in the ATP-bound state. However, the searching itself does not require ATP hydrolysis; instead, the process occurs by simple molecular collisions, allowing an enormous number of potential sequences to be rapidly sampled. Once stable base-pairing occurs and a strand-exchange reaction is completed, ATP hydrolysis is necessary to disassemble RecA from the complex of DNA molecules. At this point, repair DNA polymerases and DNA ligase, which we encountered earlier in this chapter, complete the repair process, as shown previously in Figure 5-47.
RecA/Rad51 是一种 ATP 酶,上述描述的步骤要求螺旋丝上的每个单体都处于 ATP 结合状态。然而,搜索本身并不需要 ATP 水解;相反,该过程通过简单的分子碰撞发生,允许快速采样大量潜在序列。一旦稳定的碱基配对发生并完成链交换反应,ATP 水解就是必要的,以将 RecA 从 DNA 分子复合物中解离。在这一点上,修复 DNA 聚合酶和 DNA 连接酶,我们在本章前面遇到过,完成修复过程,如前面在图 5-47 中所示。
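The triplet-by-triplet search described above can be sketched as a simple string search. This is a deliberately simplified illustration, not a model of RecA/Rad51 itself: it assumes perfect matching within each triplet (the text notes that real matches need not be perfect), it represents the duplex only by the strand that has the same sequence as the invading strand, and it uses the 15-nucleotide minimum mentioned above as the stability threshold. The function names and sequences are hypothetical.

```python
# Toy homology search: sample the duplex three nucleotides at a time and
# give up at the first mismatched triplet, as in the description above.

def triplet_match_length(invading, duplex_strand, start):
    """Count nucleotides matched in consecutive triplets, beginning at `start`."""
    matched = 0
    for i in range(0, len(invading), 3):
        probe = invading[i:i + 3]
        target = duplex_strand[start + i:start + i + 3]
        # A short or mismatched triplet ends the attempt immediately.
        if len(probe) < 3 or probe != target:
            break
        matched += 3
    return matched


def find_stable_pairing(invading, duplex_strand, min_len=15):
    """Return the first position giving at least `min_len` matched nucleotides.

    Returns None when no position supports a stable pairing.
    """
    for start in range(len(duplex_strand) - min_len + 1):
        if triplet_match_length(invading, duplex_strand, start) >= min_len:
            return start
    return None


if __name__ == "__main__":
    duplex = "GATTACAGGCTTAACCGGTTAGCATCGATCGGA"
    invader = duplex[7:25]                    # an 18-nucleotide homologous strand
    mutated = "GGCTTAACCGGATAGCAT"            # same strand with one changed base
    print(find_stable_pairing(invader, duplex))
    print(find_stable_pairing(mutated, duplex))
```

Because a mismatch in any triplet ends the attempt at once, most positions are rejected after a single comparison, which is the property that lets a very large number of candidate pairings be tested quickly.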

Homologous Recombination Can Rescue Broken and Stalled DNA Replication Forks
同源重组可以拯救断裂和停滞的 DNA 复制叉

Although accurately repairing double-strand breaks is a crucial function of homologous recombination, it can also repair other types of damage. For example, some chemicals cross-link the two strands of DNA together by covalently joining nucleotides on opposite strands. A special set of enzymes unlinks the strands and cuts out the damaged bits on both strands. At this point, the damaged DNA has been converted to a double-strand break, which can be accurately repaired by homologous recombination, as discussed earlier. Similarly, proteins can become accidentally covalently linked to DNA, and these sites can also be converted by nucleases into double-strand breaks, allowing repair by homologous
尽管准确修复双链断裂是同源重组的一个关键功能,但它也可以修复其他类型的损伤。例如,一些化学物质通过共价连接 DNA 的两条链将核苷酸连接在相对链上。一组特殊的酶解开链条并切除两条链上的损伤部分。此时,受损的 DNA 已经转化为双链断裂,正如前面讨论的那样,可以通过同源重组准确修复。同样,蛋白质可能会意外地与 DNA 共价连接,这些位点也可以被核酸酶转化为双链断裂,从而通过同源重组修复。
Figure 5-49 Repair of a broken replication fork by homologous recombination. When a moving replication fork encounters a single-strand break, it will collapse but can be repaired by homologous recombination. The process uses many of the same reactions shown in Figure 5-47 and proceeds through the same basic steps. Green strands represent the new DNA synthesis that takes place after the replication fork has broken. This pathway allows the fork to move past the break on the damaged template using the undamaged duplex as a template to synthesize DNA. (Adapted from M.M. Cox, Proc. Natl. Acad. Sci. USA 98:8173-8180, 2001. Copyright 2001 National Academy of Sciences, USA. With permission from National Academy of Sciences.)
图 5-49 通过同源重组修复断裂的复制叉。当移动的复制叉遇到单链断裂时,它会崩溃,但可以通过同源重组修复。该过程使用了图 5-47 中显示的许多相同反应,并通过相同的基本步骤进行。绿色链代表复制叉断裂后发生的新 DNA 合成。这条途径允许复制叉通过损坏模板上的断裂,利用未受损的双链作为模板合成 DNA。(改编自 M.M. Cox, Proc. Natl. Acad. Sci. USA 98:8173-8180, 2001. 版权所有 2001 年美国国家科学院。获得美国国家科学院许可。)
recombination. But perhaps the most important role of homologous recombination is in rescuing broken or stalled DNA replication forks. Many types of events can cause a replication fork to stop, and here we consider two examples. The first arises from an accidental single-strand gap in the parent DNA helix that lies just ahead of a replication fork. When the fork reaches this lesion, it falls apart-resulting in one broken and one intact daughter chromosome. Because this is a "one-sided" double-strand break, it cannot be repaired by nonhomologous end joining, and homologous recombination becomes crucial. The broken fork can be accurately repaired using the same basic reactions we discussed earlier for the repair of double-strand breaks (Figure 5-49). With slight modifications, the set of reactions just depicted can accurately repair many different types of DNA damage, providing that an undamaged duplex DNA template is available.
重组。但也许同源重组最重要的作用是拯救断裂或停滞的 DNA 复制叉。许多事件都可能导致复制叉停止,这里我们考虑两个例子。第一个例子源于父本 DNA 螺旋中一个偶然的单链间隙,该间隙恰好位于复制叉前方。当复制叉到达这个损伤时,它会分解,导致一个断裂的和一个完整的子染色体。由于这是一个“单侧”双链断裂,无法通过非同源末端连接修复,同源重组变得至关重要。断裂的复制叉可以使用我们之前讨论过的用于修复双链断裂的相同基本反应进行准确修复(图 5-49)。稍作修改,刚刚描述的一组反应可以准确修复许多不同类型的 DNA 损伤,前提是有一个未受损的双链 DNA 模板可用。
A different type of problem arises when a replication fork attempts to move through certain types of DNA damage that clogs up the replication machinery, stalling the fork. Because such damaged DNA often ends up deeply buried in the core of the replication fork, it cannot be easily repaired. To resolve this problem, the replication machine "backs up" through a series of strand-exchange reactions similar to those we have discussed (Figure 5-50). This maneuver allows one newly synthesized DNA strand to act as a template for synthesis of the other new strand, thereby bypassing the damaged template and allowing replication to proceed.
一种不同类型的问题出现在复制叉试图穿过某些类型的损坏 DNA 时,这些损害会堵塞复制机器,使叉停滞不前。由于这种受损 DNA 通常最终深深埋藏在复制叉的核心,因此无法轻易修复。为了解决这个问题,复制机器通过一系列类似于我们讨论过的链交换反应(图 5-50)“倒退”。这一操作使得新合成的 DNA 链之一可以作为另一新链的合成模板,从而绕过受损模板,使复制得以继续进行。

DNA Repair by Homologous Recombination Entails Risks to the Cell
同源重组介导的 DNA 修复给细胞带来风险

Although homologous recombination neatly solves the problem of accurately repairing double-strand breaks and other types of DNA damage, it sometimes "repairs" damage using the wrong bit of the genome as the template. For example, sometimes a broken human chromosome is repaired using the homolog from the other parent instead of the sister chromatid as the template. Because maternal and paternal chromosomes differ in DNA sequence at many positions along their lengths, this type of repair can convert the sequence of the repaired DNA from the maternal to the paternal sequence or vice versa. The result of this type of errant recombination is a loss of heterozygosity. It can have severe consequences if the homolog used for repair contains a deleterious mutation, because the recombination event destroys the "good" copy. Loss of heterozygosity, although it happens rarely, is nonetheless a critical step in the formation of many cancers (discussed in Chapter 20).
虽然同源重组巧妙地解决了准确修复双链断裂和其他类型的 DNA 损伤的问题,但有时会使用错误的基因组片段作为模板来“修复”损伤。例如,有时一个破损的人类染色体会使用另一个亲本的同源染色体而不是姐妹染色单体作为模板进行修复。由于母源和父源染色体在其长度上的许多位置的 DNA 序列不同,这种修复类型可能会将修复后的 DNA 序列从母源转变为父源,反之亦然。这种错误重组的结果是杂合丧失。如果用于修复的同源含有有害突变,那么这种重组事件会破坏“好”的拷贝,可能会产生严重后果。虽然杂合丧失很少发生,但它仍然是许多癌症形成的关键步骤(见第 20 章讨论)。
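A toy example may help show why the choice of template matters. The sketch below is only an illustration under strong assumptions: chromosomes are short strings, the maternal and paternal homologs differ at a single position, and "repair" simply copies the template across the damaged interval. The function names and sequences are hypothetical.

```python
# Toy illustration of loss of heterozygosity after templated repair.

def repair_from_template(broken, template, gap_start, gap_end):
    """Rebuild the interval [gap_start, gap_end) by copying it from the template."""
    return broken[:gap_start] + template[gap_start:gap_end] + broken[gap_end:]


def heterozygous_sites(chrom_a, chrom_b):
    """List the positions at which two homologous sequences differ."""
    return [i for i, (a, b) in enumerate(zip(chrom_a, chrom_b)) if a != b]


if __name__ == "__main__":
    maternal = "AAGTCCA"
    paternal = "AAGACCA"   # differs from the maternal copy at position 3
    sister = maternal      # a sister chromatid is an identical copy

    # Repair from the sister chromatid restores the original sequence.
    good = repair_from_template(maternal, sister, 2, 5)
    # Repair from the paternal homolog converts the site to the paternal allele.
    converted = repair_from_template(maternal, paternal, 2, 5)

    print(heterozygous_sites(good, paternal))
    print(heterozygous_sites(converted, paternal))
```

With the sister chromatid as template the cell keeps both alleles; with the homolog as template the difference at that position disappears, which is the loss of heterozygosity described above.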
Cells go to great lengths to minimize the risk of mishaps of these types; indeed, as we have seen, nearly every step of homologous recombination is carefully regulated. Recall that the first step (resection of the broken ends) is coordinated with the cell cycle: it occurs primarily in the S and G2 phases of the cell cycle, favoring the use of a daughter duplex (either as a partially replicated chromosome or a fully replicated sister chromatid) as a template for repair (see Figure 5-47). The close proximity of the two daughter chromosomes disfavors the use of other genome sequences in the repair process.
细胞竭尽全力来最小化这些类型的意外风险;事实上,正如我们所看到的,同源重组的几乎每一个步骤都受到精心调节。回想一下,第一步(断裂末端的切除)与细胞周期协调:它主要发生在细胞周期的 S 和 G2 阶段,有利于使用一个子代双链(作为部分复制的染色体或完全复制的姐妹染色单体)作为修复的模板(见图 5-47)。两个子代染色体的密切接近不利于在修复过程中使用其他基因组序列。
The loading of RecA/Rad51 onto the processed DNA ends and the subsequent strand-exchange reaction are also tightly controlled by the cell, and a host of accessory proteins is needed to regulate these steps. There are many such proteins, and exactly how all of them coordinate and control homologous recombination remains a mystery, although we do understand how a few of them work, as described below. We also know that the enzymes that catalyze
RecA/Rad51 的加载到经过处理的 DNA 末端以及随后的链交换反应也受到细胞的严格控制,需要大量辅助蛋白质来调节这些步骤。有许多这样的蛋白质,以及它们如何协调和控制同源重组仍然是一个谜,尽管我们了解其中一些如何工作,如下所述。我们也知道催化这些步骤的酶是如何工作的。

Figure 5-50 Repair of a stalled replication fork by "fork reversal." This mechanism is brought into play when a replication fork stalls when it encounters certain types of damaged nucleotides. A specialized helicase (not shown) peels the newly synthesized DNA strands away from their parent templates, allowing them to form complementary base pairs with each other and backing up the replication fork. At this point two outcomes are possible. In the first, because the damaged DNA has been exposed, it can be repaired by conventional repair mechanisms, and the fork can be restarted. In the second, as shown, DNA synthesis can bypass the damage using newly synthesized daughter DNA (rather than the damaged parent strand) as the template. This scheme allows the replication fork to move through the DNA damage, which can be repaired at a later time. Although the initial steps of replication fork reversal are well understood, exactly how the fork restarts afterward remains a mystery.
图 5-50 通过“叉翻转”修复停滞的复制叉。当复制叉遇到某些类型的损坏核苷酸时停滞时,这种机制就会被启动。一种专门的解旋酶(未显示)将新合成的 DNA 链从其母模板中剥离,使它们能够互相形成互补碱基对,并支持复制叉。在这一点上有两种可能的结果。首先,由于损坏的 DNA 已经暴露,它可以通过常规修复机制修复,复制叉可以重新启动。其次,如所示,DNA 合成可以通过使用新合成的子代 DNA(而不是受损的母链)作为模板来绕过损伤。这种方案允许复制叉穿过 DNA 损伤,这些损伤可以在以后修复。尽管复制叉翻转的初始步骤已经被很好地理解,但关于复制叉之后如何重新启动仍然是一个谜。
recombinational repair are made at relatively high levels in eukaryotes and are dispersed throughout the nucleus in an inactive form. In response to DNA damage, they rapidly converge on the sites of DNA damage, become activated, and form "repair factories" where many lesions are apparently brought together and repaired (Figure 5-51). Formation of these factories probably results from many weak interactions between different repair proteins and between repair proteins and damaged DNA, producing the type of biomolecular condensates discussed in Chapter 3 (see Figure 3-77). The high local concentration of the appropriate proteins and their substrates within these condensates is thought to increase the speed and efficiency of the repair process.
在真核生物中,重组修复在相对较高水平上进行,并以非活性形式分散在细胞核中。作为对 DNA 损伤的响应,它们迅速汇聚到 DNA 损伤部位,被激活,并形成“修复工厂”,在那里许多损伤明显地被聚集在一起并修复(图 5-51)。这些工厂的形成可能是由于不同修复蛋白质之间以及修复蛋白质与受损 DNA 之间的许多弱相互作用,产生了第 3 章讨论的生物分子凝聚体的类型(参见图 3-77)。在这些凝聚体内适当蛋白质及其底物的高局部浓度被认为能够提高修复过程的速度和效率。
In Chapter 20, we shall see that both too much and too little homologous recombination can lead to cancer in humans, the former through repair using the "wrong" template (as described above) and the latter through an increased mutation rate caused by inefficient DNA repair. Clearly, a delicate balance has evolved that keeps this process in check on undamaged DNA, while still allowing it to act efficiently and rapidly on DNA lesions as soon as they arise.
在第 20 章中,我们将看到过多和过少的同源重组都可能导致人类患癌症,前者是通过使用“错误”的模板进行修复(如上所述),后者是通过 DNA 修复效率低导致的突变率增加。显然,已经演化出一种微妙的平衡,使得这一过程在未受损的 DNA 上得以控制,同时仍然允许它在 DNA 损伤出现时迅速高效地发挥作用。
Not surprisingly, mutations in the components that carry out and regulate homologous recombination are responsible for several inherited forms of cancer. Two of these, the Brca1 and Brca2 proteins, were first discovered because mutations in their genes lead to a greatly increased frequency of breast cancer. Because these mutations cause inefficient repair by homologous recombination, accumulation of DNA damage can, in a small proportion of cells, give rise to a cancer. Brca1 regulates an early step in broken-end processing; without it, such ends are not processed correctly for homologous recombination and instead damaged molecules are shunted to the error-prone nonhomologous end-joining pathway (see Figure 5-45). After resection, Brca2 is needed to correctly load the Rad51 protein onto the protruding single-strand DNA ends in preparation for strand exchange.
毫不奇怪,负责执行和调节同源重组的组分的突变负责多种遗传性癌症。其中两种,Brca1 和 Brca2 蛋白质,最初被发现是因为它们基因的突变导致乳腺癌发病率大大增加。由于这些突变导致同源重组修复效率低下,DNA 损伤在少数细胞中积累,可能导致癌症的发生。Brca1 调节断端处理的早期步骤;如果没有它,这些端点将无法正确处理以进行同源重组,而受损分子将被转移到容易出错的非同源末端连接途径(见图 5-45)。在末端切除后,Brca2 需要正确加载 Rad51 蛋白质到突出的单链 DNA 末端,为链交换做准备。

Homologous Recombination Is Crucial for Meiosis
同源重组对减数分裂至关重要

We have seen that homologous recombination can use a set of reactions (including broken-end resection, strand invasion, limited DNA synthesis, and ligation) to exchange DNA sequences between two double helices with the same nucleotide sequence and thereby repair damaged DNA. We now describe how homologous recombination is used to deliberately exchange material between two different chromosomes in order to generate DNA molecules that carry novel combinations of genes. This is a frequent and necessary part of meiosis, which occurs in sexually reproducing organisms such as fungi, plants, and animals.
我们已经看到同源重组可以利用一系列反应,包括断端重切、链侵入、有限 DNA 合成和连接,来在具有相同核苷酸序列的两个双螺旋之间交换 DNA 序列,从而修复受损的 DNA。我们现在描述同源重组如何被用来有意地在两个不同染色体之间交换物质,以生成携带新基因组合的 DNA 分子。这是减数分裂的一个频繁且必要的部分,发生在真菌、植物和动物等有性繁殖生物体中。
Figure 5-51 Experiment demonstrating the rapid localization of repair proteins to DNA double-strand breaks. Human fibroblasts were x-irradiated to produce DNA double-strand breaks. Before the x-rays struck the cells, they were passed through a microscopic grid with x-ray-absorbing "bars" spaced apart. This produced a striped pattern of DNA damage, allowing a comparison of damaged and undamaged DNA in the same nucleus. (A) Total DNA in a fibroblast nucleus stained with the dye DAPI. (B) Sites of new DNA synthesis due to repair of DNA damage, indicated by incorporation of BrdU (a thymidine analog) and subsequent staining with fluorescently labeled antibodies to BrdU (green). (C) Localization of the Mre11 complex to damaged DNA as visualized by antibodies against the Mre11 subunit (red). Mre11 is the nuclease that produces the protruding single-strand DNA ends needed for strand invasion (see Figure 5-47). A, B, and C were processed 30 minutes after x-irradiation. (From B.E. Nelms et al., Science 280:590-592, 1998. With permission from AAAS.)
图 5-51 实验证明修复蛋白迅速定位到 DNA 双链断裂部位。人类成纤维细胞被 X 射线辐射以产生 DNA 双链断裂。在 X 射线照射细胞之前,它们通过一个微观网格,网格上有 X 射线吸收的“条纹”,彼此间隔排列。这产生了 DNA 损伤的条纹图案,允许在同一细胞核中比较受损和未受损的 DNA。(A)用 DAPI 染色的成纤维细胞核中的总 DNA。(B)由于 DNA 损伤修复而导致的新 DNA 合成位点,通过 BrdU(一种胸苷类似物)的掺入和随后用荧光标记的 BrdU 抗体染色(绿色)来指示。(C)Mre11 复合物定位到受损 DNA,通过针对 Mre11 亚基的抗体(红色)可视化。Mre11 是产生用于链侵入所需的突出单链 DNA 末端的核酸酶(参见图 5-47)。A、B 和 C 在 X 辐照后 30 分钟进行处理。(来源:B.E. Nelms 等人,Science 280:590-592,1998。获得 AAAS 许可。)
Figure 5-52 Chromosome crossing-over occurs in meiosis. Meiosis is the process by which a diploid cell gives rise to four haploid germ cells, as described in detail in Chapter 17. Meiosis produces germ cells in which the paternal and maternal genetic information (red and blue) has been reassorted through chromosome crossovers. In addition, many short regions of gene conversion occur, as indicated.
图 5-52 染色体交叉发生在减数分裂中。减数分裂是一种二倍体细胞产生四个单倍体生殖细胞的过程,详细描述在第 17 章中。减数分裂产生的生殖细胞中,通过染色体交叉重新组合了父本和母本的遗传信息(红色和蓝色)。此外,还发生了许多短区域的基因转换,如所示。
In meiosis, homologous recombination is an integral part of the process that allows chromosomes to be parceled out to germ cells (sperm and eggs in animals). We discuss the process of meiosis in detail in Chapter 17; here we discuss how homologous recombination during meiosis produces chromosome crossing-over and gene conversion, resulting in hybrid chromosomes that contain genetic information from both the maternal and paternal homologs (Figure 5-52). These mechanisms, at their core, closely resemble those used to repair double-strand breaks.
在减数分裂中,同源重组是一个不可或缺的过程,它允许染色体被分配到生殖细胞(动物中的精子和卵子)。我们在第 17 章详细讨论了减数分裂的过程;在这里,我们讨论了减数分裂期间如何进行同源重组,产生染色体交叉和基因转换,导致包含来自母本和父本同源体的遗传信息的杂交染色体(图 5-52)。这些机制在本质上与用于修复双链断裂的机制非常相似。

Meiotic Recombination Begins with a Programmed Double-Strand Break
减数分裂重组始于程序化的双链断裂

Homologous recombination in meiosis starts with a bold stroke: a specialized Spo11 protein complex breaks both strands of the DNA double helix in one of the recombining chromosomes (Figure 5-53). After catalyzing this reaction, the protein complex remains covalently bound to the broken DNA, much like the DNA topoisomerase we encountered earlier in this chapter (see Figure 5-22). Many of the subsequent recombination reactions closely resemble those already described for the repair of double-strand breaks; indeed, some of the same proteins are used for both processes. For example, the Mre11 complex, which we encountered earlier, chews back the DNA ends, removing the proteins along with the DNA and leaving the protruding 3′ single-strand ends needed for strand invasion.
减数分裂中的同源重组始于一次大胆的举措:一个专门的 Spo11 蛋白复合物在重新组合染色体中的 DNA 双螺旋中断开了两条链(见图 5-53)。在催化这一反应之后,蛋白复合物仍然以共价键结合到断裂的 DNA 上,就像我们在本章前面遇到的 DNA 拓扑异构酶一样(见图 5-22)。许多随后的重组反应与已经描述的双链断裂修复非常相似;实际上,一些相同的蛋白质被用于两种过程。例如,我们之前遇到的 Mre11 复合物会咬掉 DNA 末端,将蛋白质连同 DNA 一起去除,留下需要进行链入侵的突出的 3′ 单链末端。
However, several meiosis-specific proteins come into play and guide the reactions somewhat differently, resulting in the distinctive outcomes observed for meiosis. A key difference is that, in meiosis, recombination occurs preferentially between maternal and paternal chromosomal homologs (which are held closely together during meiosis), rather than between newly replicated, identical DNA duplexes as in double-strand break repair. In the sections that follow, we describe in more detail those aspects of homologous recombination that are especially important for meiosis.
然而,几种减数分裂特异性蛋白参与其中并以略有不同的方式引导反应,导致观察到的减数分裂独特结果。一个关键区别是,在减数分裂中,重组优先发生在母源和父源染色体同源体之间(在减数分裂期间紧密结合在一起),而不是在双链断裂修复中那样在新复制的相同 DNA 双链之间。在接下来的章节中,我们将更详细地描述那些对减数分裂尤为重要的同源重组方面。

Holliday Junctions Are Recognized by Enzymes That Drive Branch Migration
分歧结构被推动分支迁移的酶所识别

Of special importance in meiosis is an intermediate structure known as a Holliday junction, or cross-strand exchange, in which two homologous DNA helices that have paired are held together by the reciprocal exchange of two of the four strands
在减数分裂中特别重要的是一种称为霍利戴结构或交叉链交换的中间结构,在这种结构中,已成对的两条同源 DNA 螺旋通过四条链中的两条互换而被保持在一起

[Figure 5-53 labels: paired homologous chromosomes; DNA synthesis; alternative pathways; additional DNA synthesis; continued DNA synthesis followed by DNA ligation; chromosomes with crossover]
[图 5-53 标注:成对的同源染色体;DNA 合成;替代途径;额外的 DNA 合成;持续的 DNA 合成后进行 DNA 连接;带有交叉的染色体]

Figure 5-53 Homologous recombination during meiosis can generate chromosome crossovers. Once the Spo11 complex breaks the duplex DNA and the Mre11 complex processes the ends, homologous recombination in meiosis can proceed along alternative pathways (Movie 5.8). One (right side of figure) closely resembles the double-strand break repair reaction shown in Figure 5-47 and results in chromosomes that have been "repaired" without crossing over. The other (left side of figure) proceeds through a double Holliday junction and produces two chromosomes that have crossed over. During meiosis, the maternal and paternal chromosome homologs are held tightly together (see Figure 17-54), and both types of recombination occur between them.
图 5-53 在减数分裂期间的同源重组可以产生染色体交叉。一旦 Spo11 复合物断裂双链 DNA 并且 Mre11 复合物处理末端,减数分裂期间的同源重组可以沿着替代途径进行(影片 5.8)。其中一种(图的右侧)与图 5-47 中显示的双链断裂修复反应非常相似,并且导致染色体已经“修复”而没有发生交叉。另一种(图的左侧)通过双 Holliday 结构进行,产生两条发生交叉的染色体。在减数分裂期间,母源和父源染色体同源被紧密地保持在一起(参见图 17-54),并且两种重组类型都在它们之间发生。
present, one strand originating from each of the helices. This junction can be considered to contain two pairs of strands: one pair of crossing strands and one pair of noncrossing strands (Figure 5-54A). But by undergoing a series of rotational movements, it can isomerize to form an open, symmetrical structure in which both pairs of strands occupy equivalent positions (Figure 5-54B and D). A special set of recombination proteins that binds to this open isomer uses the energy of ATP hydrolysis to catalyze a reaction known as branch migration (Figure 5-55), which greatly expands the region of heteroduplex DNA that was initially created by a strand-exchange reaction (Figure 5-54B and C). In meiosis, heteroduplex regions often "migrate" thousands of nucleotides from the original site of the double-strand break. The step where this migration occurs is indicated in Figure 5-53. As shown in the figure, Holliday junctions are often produced in pairs, known as double Holliday junctions.
目前,一条链起源于每个螺旋。这个连接点可以被认为包含两对链条:一对交叉链条和一对非交叉链条(图 5-54A)。但通过一系列旋转运动,它可以异构成一个开放的、对称的结构,在这个结构中,两对链条占据等价位置(图 5-54B 和 D)。一组特殊的重组蛋白结合到这个开放异构体上,利用 ATP 水解的能量催化一种称为分支迁移的反应(图 5-55),这大大扩展了最初由链交换反应创建的异源双链 DNA 区域(图 5-54B 和 C)。在减数分裂中,异源双链区域经常从双链断裂的原始位点“迁移”数千个核苷酸。这种迁移发生的步骤在图 5-53 中指示。正如图中所示,Holliday 连接点经常成对产生,被称为双 Holliday 连接点。

Homologous Recombination Produces Crossovers Between Maternal and Paternal Chromosomes During Meiosis
同源重组在减数分裂过程中产生母体和父体染色体之间的交叉

There are two basic outcomes of homologous recombination during meiosis, as shown previously in Figure 5-53 (Movie 5.8). In humans, most of the double-strand breaks produced during meiosis are resolved as non-crossovers (right side of Figure 5-53). Here, the two original DNA duplexes separate from each other in a form unaltered except for a region of heteroduplex that formed near the site of the original double-strand break. As already noted, this set of reactions resembles that described earlier for the repair of double-strand breaks.
在减数分裂过程中同源重组有两种基本结果,如之前在图 5-53(电影 5.8)中所示。在人类中,大约 的双链断裂在减数分裂过程中解决为非交叉(图 5-53 右侧)。在这种情况下,两个原始 DNA 双链从彼此分离,形式保持不变,除了在原始双链断裂点附近形成的异源双链区域。正如前面已经指出的,这一系列反应类似于早期描述的双链断裂修复过程。
Figure 5-54 A Holliday junction. The initially formed structure is usually drawn with two strands crossing, as in Figure 5-53. An isomerization of the Holliday junction (B) produces an open, symmetrical structure that is bound by specialized proteins. (C) These proteins "move" the Holliday junctions by a coordinated set of branch-migration reactions that involve the breaking and formation of base pairs (see Figure 5-55 and Movie 5.8). (D) Three-dimensional structure of the Holliday junction in the open form depicted in B. The Holliday junction is named for the scientist who first proposed its formation. (PDB code: 1DCW.)
图 5-54 霍利戴结构。最初形成的结构通常用两股交叉来绘制,如图 5-53 所示。霍利戴结构的异构化(B)产生一个开放的对称结构,由专门的蛋白质结合。这些蛋白质通过一组协调的分支迁移反应“移动”霍利戴结构,涉及碱基对的断裂和形成(见图 5-55 和视频 5.8)。霍利戴结构在 B 中所示的开放形式的三维结构。霍利戴结构以最初提出其形成的科学家命名。(PDB 代码:1DCW。)
Figure 5-55 Enzyme-catalyzed branch movement at a Holliday junction by branch migration. A tetramer of the RuvA protein (green) and two hexamers of the RuvB protein (yellow) bind to the open form of the junction. The RuvB protein, which resembles the hexameric helicases used in DNA replication (see Figure 5-14), uses the energy of ATP hydrolysis to spool DNA rapidly through the Holliday junction, extending the heteroduplex region as shown. The RuvA protein coordinates this movement, threading the DNA strands to avoid tangling. This example is from E. coli, but similar proteins function in meiosis in sexually reproducing organisms. (PDB codes: 1IXR, 1C7Y.)
图 5-55 酶催化的 Holliday 结交叉点的分支迁移。RuvA 蛋白的四聚体(绿色)和两个 RuvB 蛋白的六聚体(黄色)结合到交叉点的开放形式。RuvB 蛋白类似于 DNA 复制中使用的六聚体解旋酶(见图 5-14),利用 ATP 水解的能量快速穿过 Holliday 结交叉点,扩展异源双链区域如所示。RuvA 蛋白协调这种运动,引导 DNA 链以避免缠结。这个例子来自大肠杆菌,但类似的蛋白在有性繁殖生物的减数分裂中起作用。(PDB 代码:1IXR,1C7Y。)
The other outcome is much more profound: a double Holliday junction is formed and is cleaved by specialized enzymes (blue arrows on the left side of Figure 5-53) to create a crossover. The two original portions of each chromosome upstream and downstream from the two Holliday junctions are thereby swapped, creating two chromosomes that are said to have "crossed over," each containing a large number of both maternally inherited and paternally inherited genes.
另一个结果更为深远:形成了一个双霍利戴结构,并由专门的酶(图 5-53 左侧的蓝色箭头)切割,以产生交叉。从两个霍利戴结构上游和下游的每条染色体的两个原始部分因此交换,形成了两条染色体,据说已经“交叉” - 每条包含大量母源和父源基因。
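The two outcomes described above can be illustrated with a deliberately simplified sketch. Each homolog is modeled as a plain string of markers rather than as a double helix, and the function names, marker strings, and the `site` and `tract` values are hypothetical choices made for illustration only. The non-crossover function also assumes that the short heteroduplex patch ends up carrying the donor's sequence, which is one of several possible results.

```python
def resolve_as_crossover(maternal, paternal, site):
    """Swap everything downstream of `site` between the two homologs."""
    return (maternal[:site] + paternal[site:],
            paternal[:site] + maternal[site:])


def resolve_as_non_crossover(recipient, donor, site, tract):
    """Keep the recipient's flanks; copy only a short patch from the donor."""
    return recipient[:site] + donor[site:site + tract] + recipient[site + tract:]


if __name__ == "__main__":
    maternal = "M" * 12   # hypothetical markers along the maternal homolog
    paternal = "P" * 12   # hypothetical markers along the paternal homolog
    print(resolve_as_crossover(maternal, paternal, site=5))
    print(resolve_as_non_crossover(paternal, maternal, site=5, tract=2))
```

In the crossover case both products are hybrids over their whole length beyond the exchange point, whereas the non-crossover product differs from its parent only within the short patch.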
How does the cell decide which double-strand breaks to resolve as crossovers? The answer is not yet known, but we know the decision is not random. The relatively few crossovers that do form are distributed along chromosomes in such a way that a crossover in one position inhibits crossing-over in neighboring regions. Termed crossover control, this fascinating but poorly understood regulatory mechanism ensures the roughly even distribution of crossover points along chromosomes. It also ensures that each chromosome, no matter how small, undergoes at least one crossover event every meiosis. For many organisms, roughly two crossovers per chromosome occur during each meiosis, one on each arm. As discussed in detail in Chapter 17, these crossovers, in addition to producing novel DNA molecules, play an important mechanical role in the proper segregation of chromosomes during meiosis.
细胞如何决定解决哪些双链断裂作为交叉?答案尚不清楚,但我们知道这个决定并非随机的。形成的相对较少的交叉点沿着染色体分布,以至于一个位置的交叉会抑制相邻区域的交叉。这种称为交叉控制的引人入胜但尚不完全理解的调节机制确保了染色体上交叉点的大致均匀分布。它还确保每条染色体-无论多么小-在每次减数分裂中至少发生一次交叉事件。对于许多生物来说,在每次减数分裂中,每条染色体大约发生两次交叉,每条臂上一次。正如第 17 章详细讨论的那样,这些交叉除了产生新的 DNA 分子外,在减数分裂期间染色体的正确分离中发挥重要的机械作用。
Whether a meiotic recombination event is resolved as a crossover or a non-crossover, the recombination machinery leaves behind a heteroduplex region where a strand with the DNA sequence of the paternal homolog is base-paired with a strand from the maternal homolog (Figure 5-56). These heteroduplex regions can tolerate a small percentage of mismatched base pairs, and because of branch migration, they often extend for thousands of nucleotide pairs. The many non-crossover events that occur in meiosis thereby produce scattered sites in the germ cells where short DNA sequences from one homolog have been pasted into the other homolog. Heteroduplex regions mark sites of potential gene conversion, where the four haploid chromosomes produced by meiosis contain three copies of a DNA sequence from one homolog and only one copy of this sequence from the other homolog, as explained next.
无论减数分裂重组事件是作为交叉还是非交叉解决的,重组机制都会留下一个异源双链区域,其中一个具有父本同源的 DNA 序列的链与母本同源的链相互配对(图 5-56)。这些异源双链区域可以容忍少量不匹配的碱基对,并且由于分支迁移,它们经常延伸数千个核苷酸对。因此,在减数分裂中发生的许多非交叉事件会在生殖细胞中产生分散的位点,其中一个同源的短 DNA 序列已经粘贴到另一个同源中。异源双链区域标记了潜在基因转换的位点,即由减数分裂产生的四个单倍体染色体中包含来自一个同源的 DNA 序列的三个拷贝,而来自另一个同源的这个序列只有一个拷贝,如下所述。

Homologous Recombination Often Results in Gene Conversion
同源重组通常导致基因转换

In sexually reproducing organisms, it is a fundamental law of genetics that, aside from mitochondrial DNA (which is inherited only through the mother), each parent makes an equal genetic contribution to an offspring. One complete set of nuclear genes is inherited from the father and one complete set is inherited from the mother. Underlying this law is the accurate parceling out of chromosomes to the germ cells (eggs and sperm) that takes place during meiosis. Thus, when a diploid cell in a parent undergoes meiosis to produce four haploid germ cells, exactly half of the genes distributed among these four cells should be maternal (genes inherited from the mother of this parent) and the other half paternal (genes inherited from the father of this parent). In some organisms (fungi, for example), it is possible to recover and analyze all four of the haploid gametes produced from a single cell by meiosis. Studies in such organisms have revealed rare cases in which the parceling out of genes violates the standard genetic rules. Occasionally, for example, meiosis yields three copies of the maternal version of a gene and only one copy of the paternal version. Alternative versions of the same gene are called alleles, and it is the divergence from their expected distribution during
在有性繁殖的生物中,除了线粒体 DNA 只通过母亲遗传外,每个父母对后代的遗传贡献是相等的,这是遗传学的基本定律。一个完整的核基因组来自父亲,另一个完整的核基因组来自母亲。这一定律的基础是染色体在减数分裂期间准确地分配给生殖细胞(卵子和精子)。因此,当一个父母的二倍体细胞经过减数分裂产生四个单倍体生殖细胞时,这四个细胞中分配的基因应该有一半是母亲的(来自这个父母的母亲的遗传基因),另一半是父亲的(来自这个父母的父亲的遗传基因)。在一些生物(例如真菌)中,通过减数分裂从单个细胞产生的四个单倍体配子可以被恢复和分析。在这些生物中的研究揭示了罕见的情况,即基因的分配违反了标准的遗传规则。例如,偶尔会出现减数分裂产生三个母本基因版本和一个父本基因版本的情况。同一基因的不同版本被称为等位基因,当它们的分布偏离预期时

Figure 5-56 Heteroduplexes formed during meiosis. Heteroduplex DNA is present at sites of recombination that are resolved either as crossovers or noncrossovers. Because the DNA sequences of maternal and paternal chromosomes differ at many positions along their lengths, heteroduplexes often contain a small number of base-pair mismatches, and they are therefore potential sites of gene conversion (see Figure 5-57).
图 5-56 减数分裂期形成的异源二链体。异源二链体 DNA 存在于重组位点,这些位点被解决为交叉或非交叉。由于母体和父体染色体的 DNA 序列在其长度上的许多位置不同,异源二链体通常包含少量碱基错配,因此它们是基因转换的潜在位点(见图 5-57)。

meiosis that is known as gene conversion (Movie 5.8). Genetic studies show that only small sections of DNA typically undergo gene conversion, and in many cases only a part of a gene is changed. How is this possible?
减数分裂被称为基因转换(电影 5.8)。遗传研究表明,通常只有 DNA 的小部分经历基因转换,而且在许多情况下只有基因的一部分被改变。这是如何可能的?
We have seen that both crossovers and non-crossovers produce heteroduplex regions of DNA. If the two strands that make up a heteroduplex region do not have identical nucleotide sequences, mismatched base pairs are formed, and these are often repaired by the cell's mismatch repair system (see Figure 5-20). However, unlike what happens after DNA replication, in meiosis the mismatch repair system randomly selects the strand to be used as a template, causing one allele to be lost and the other duplicated (Figure 5-57). Thus, gene conversion (the "conversion" of one allele to the other), originally regarded as a mysterious deviation from the rules of genetics, can be seen as a straightforward consequence of the mechanisms of homologous recombination during meiosis.
我们已经看到,无论是交叉还是非交叉都会产生 DNA 的异源双链区域。如果构成异源双链区域的两条链不具有相同的核苷酸序列,就会形成错配的碱基对,这些通常会被细胞的错配修复系统修复(见图 5-20)。然而,与 DNA 复制后发生的情况不同,在减数分裂中,错配修复系统会随机选择要用作模板的链,导致一个等位基因丢失,另一个被复制(图 5-57)。因此,基因转换(将一个等位基因“转换”为另一个)-最初被视为遗传规律的神秘偏差,现在可以看作是减数分裂期间同源重组机制的直接结果。
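The link between random strand choice and a 3:1 allele ratio can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not a model of the real machinery: each chromatid is reduced to a pair of strand labels at one locus, exactly one paternal chromatid is assumed to carry a heteroduplex after recombination, and all function names are hypothetical.

```python
import random
from collections import Counter


def repair_heteroduplex(duplex, rng):
    """Mismatch repair with no strand preference: choose one strand as the
    template and copy its allele onto the other strand."""
    template = rng.choice(duplex)
    return (template, template)


def meiosis_at_locus(rng):
    # Four chromatids, each written as (allele on strand 1, allele on strand 2).
    chromatids = [("M", "M"), ("M", "M"), ("P", "P"), ("P", "P")]
    # Assumed simplification: recombination leaves one paternal chromatid
    # with a heteroduplex, so one of its strands carries the maternal allele.
    chromatids[2] = ("M", "P")
    repaired = [repair_heteroduplex(d, rng) if d[0] != d[1] else d
                for d in chromatids]
    # After repair both strands agree, so each haploid product has one allele.
    return Counter(d[0] for d in repaired)


def segregation_ratios(trials, seed=0):
    """Tally (maternal, paternal) copy numbers over many simulated meioses."""
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(trials):
        counts = meiosis_at_locus(rng)
        outcomes[(counts["M"], counts["P"])] += 1
    return outcomes


if __name__ == "__main__":
    for (m, p), n in sorted(segregation_ratios(10_000).items()):
        print(f"{m} maternal : {p} paternal -> {n} meioses")
```

In this toy model, repair that uses the paternal strand as template restores the expected 2:2 segregation, while repair that uses the maternal strand yields the 3:1 pattern detected as gene conversion.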

Summary 摘要

Homologous recombination describes a flexible set of reactions resulting in the exchange of DNA sequences between a pair of identical or nearly identical duplex DNA molecules. Of special importance is a strand-exchange step whereby a single strand from one DNA duplex invades a second duplex and base-pairs with one strand while displacing the other. This reaction, catalyzed by the RecA/Rad51 family of proteins, can only occur if the invading strand can form a short stretch of consecutive nucleotide pairs with one of the strands of the duplex. This requirement ensures that homologous recombination occurs only between identical or very similar DNA sequences.
同源重组描述了一组灵活的反应,导致两个相同或几乎相同的双链 DNA 分子之间的 DNA 序列交换。特别重要的是一种链交换步骤,其中来自一个 DNA 双链的单链侵入第二个双链,并与一条链配对,同时排斥另一条链。这种反应由 RecA/Rad51 蛋白家族催化,只有在侵入链能够与双链的一条链形成一小段连续的核苷酸对时才会发生。这一要求确保同源重组仅发生在相同或非常相似的 DNA 序列之间。
When used as a DNA repair mechanism, homologous recombination usually occurs between a damaged DNA molecule and its recently duplicated sister molecule, with the undamaged duplex acting as a template to repair the damaged copy flawlessly. In meiosis, homologous recombination is initiated by deliberate, carefully regulated double-strand breaks and occurs preferentially between the homologous chromosomes rather than the newly replicated sister chromatids. The outcome can be either two chromosomes that have crossed over (that is, chromosomes in which the DNA on either side of the site of DNA pairing originates from two different homologs) or two non-crossover chromosomes. In the latter case, the two chromosomes that result are identical to the original two homologs, except for relatively minor DNA sequence changes at the site of recombination.
当用作 DNA 修复机制时,同源重组通常发生在受损的 DNA 分子和其最近复制的姐妹分子之间,未受损的双链作为模板修复受损的拷贝。在减数分裂中,同源重组是通过故意的、精心调节的双链断裂来启动的,并且优先发生在同源染色体之间,而不是新复制的姐妹染色单体之间。结果可能是两个发生交叉的染色体(即,DNA 配对位点两侧的 DNA 来源于两个不同的同源体)或两个非交叉染色体。在后一种情况下,结果是两个染色体与原始的两个同源体相同,除了在重组位点处相对较小的 DNA 序列变化。

TRANSPOSITION AND CONSERVATIVE SITE-SPECIFIC RECOMBINATION
转座和保守的位点特异性重组

We have seen that homologous recombination can result in the exchange of DNA sequences between chromosomes. However, the order of genes on the interacting chromosomes typically remains the same after homologous recombination, inasmuch as the recombining sequences must be very similar for the process to occur. In this part of the chapter, we describe two very different types of recombination, transposition and conservative site-specific recombination, that do not require substantial regions of DNA homology. These two types of recombination reactions can alter the gene order along a chromosome and introduce whole blocks of DNA sequence into the genome.
我们已经看到同源重组可以导致染色体之间的 DNA 序列交换。然而,在同源重组之后,相互作用染色体上基因的顺序通常保持不变,因为重组序列必须非常相似才能发生这一过程。在本章的这一部分,我们描述了两种非常不同的重组类型——转座和保守的位点特异性重组,这两种类型的重组不需要大片的 DNA 同源性。这两种重组反应可以改变染色体上的基因顺序,并将整个 DNA 序列块引入基因组中。
Transposition and conservative site-specific recombination are largely dedicated to moving a wide variety of specialized segments of DNA, collectively termed mobile genetic elements, from one position in a genome to another. We will see that mobile genetic elements can range in size from a few hundred to tens of thousands of nucleotide pairs, and each typically carries a unique set of genes. Often, one of these genes encodes a specialized enzyme that catalyzes the movement of only that element and its close relatives, thereby making this type of recombination possible.
转座和保守的位点特异性重组主要用于将 DNA 的各种专门化片段(统称为移动遗传元素)从基因组中的一个位置移动到另一个位置。我们将看到,移动遗传元素的大小范围可以从几百对到数万对核苷酸,每个通常携带一组独特的基因。通常,其中一个基因编码了一种专门的酶,催化仅该元素及其近亲的移动,从而使这种重组成为可能。
Virtually all cells contain mobile genetic elements, known informally as "jumping genes." As explained in Chapter 4, over evolutionary time scales, they
几乎所有细胞都含有被非正式地称为“跳跃基因”的移动基因元素。正如第 4 章所解释的那样,在演化时间尺度上,它们
Figure 5-57 Gene conversion caused by mismatch correction. As shown in the preceding figure, heteroduplex DNA is formed at the sites of homologous recombination between maternal and paternal chromosomes. If the maternal and paternal DNA sequences are slightly different, the heteroduplex region will include some mismatched base pairs, which may then be corrected by the DNA mismatch repair machinery (see Figure 5-20). Because neither strand of DNA is newly synthesized, such repair can "erase" nucleotide sequences on either the paternal or the maternal strand. The consequence of this mismatch repair is gene conversion, detected as a deviation from the segregation of equal copies of maternal and paternal alleles that normally occurs in meiosis.
图 5-57 由错配校正引起的基因转换。如前图所示,在母体和父体染色体之间同源重组的位点形成异源双链 DNA。如果母体和父体 DNA 序列略有不同,异源双链区域将包括一些错配碱基对,这些错配碱基对可能会被 DNA 错配修复机制修正(见图 5-20)。由于 DNA 的任何链都不是新合成的,这种修复可以“擦除”父链或母链上的核苷酸序列。这种错配修复的后果是基因转换,表现为在减数分裂中正常发生的母体和父体等位基因等量分离的偏离。

have had a profound effect on the shaping of modern genomes. For example, nearly half of the DNA in the human genome can be traced to these elements (see Figure 4-63). Over time, random mutation has altered their nucleotide sequences, and, as a result, only a few of the many copies of these elements in our DNA are still active and capable of movement. The remainder are molecular fossils whose existence provides striking clues to our evolutionary history.
对现代基因组的塑造产生了深远影响。例如,人类基因组中近一半的 DNA 可以追溯到这些元素(见图 4-63)。随着时间的推移,随机突变改变了它们的核苷酸序列,因此,我们 DNA 中许多这些元素的副本中只有少数仍然活跃并能够移动。其余的是分子化石,它们的存在为我们的进化历史提供了引人注目的线索。
Mobile genetic elements are often considered to be molecular parasites (they are also termed "selfish DNA") that persist because cells cannot get rid of them; they certainly have come close to overrunning our own genome. However, mobile DNA elements can provide benefits to the cell. For example, the genes they carry are sometimes advantageous to the host, as in the case of antibiotic resistance in bacterial cells, discussed later. The movement of mobile genetic elements also produces many of the genetic variants upon which evolution depends, because, in addition to moving themselves, mobile genetic elements occasionally rearrange neighboring sequences of the host genome. Thus, spontaneous mutations observed in bacteria, Drosophila, humans, and other organisms are often due to the random movement of mobile genetic elements. While many of these mutations will be deleterious to the organism, some will be advantageous and may spread throughout the population. It is almost certain that much of the variety of life we see around us originally arose from the movement of mobile genetic elements.
移动基因元素通常被认为是分子寄生虫(它们也被称为“自私 DNA”),因为细胞无法摆脱它们而持续存在;它们确实已经接近于占据我们自己的基因组。然而,移动 DNA 元素可以为细胞提供好处。例如,它们携带的基因有时对宿主有利,就像细菌细胞中的抗生素抗性一样,在后文中讨论。移动基因元素的移动还产生了许多进化所依赖的基因变异,因为除了移动自身外,移动基因元素偶尔会重新排列宿主基因组的相邻序列。因此,在细菌、果蝇、人类和其他生物中观察到的自发突变往往是由于移动基因元素的随机移动。虽然这些突变中许多对生物体有害,但有些将是有利的,并可能在整个种群中传播。几乎可以肯定,我们周围看到的生命多样性的大部分起源于移动基因元素的移动。
In this part of the chapter, we introduce mobile genetic elements and describe the mechanisms that enable them to move from place to place in a genome. As mentioned above, these elements move through a variety of different mechanisms that can be grouped into two broad categories, transposition and conservative site-specific recombination. We begin with transposition, by far the most predominant of these two processes.
在本章的这一部分,我们介绍移动基因元件,并描述使它们能够在基因组中从一个地方移动到另一个地方的机制。正如上文所提到的,这些元件通过各种不同的机制移动,可以分为两大类,即转座和保守的位点特异性重组。我们首先讨论转座,这是这两种过程中远远最为突出的一种。

Through Transposition, Mobile Genetic Elements Can Insert into Any DNA Sequence
通过转座作用,移动基因元件可以插入到任何 DNA 序列中

Mobile elements that move by way of transposition are called transposons, or transposable elements. In transposition, a specific enzyme, usually encoded by the transposon itself and typically called a transposase, acts on specific DNA sequences at each end of the transposon, causing it to insert into a new DNA site. Most transposons are only modestly selective in choosing their target site, and they can therefore insert themselves into many different locations in a genome; in particular, there is no general requirement for sequence similarity between the ends of the element and the target sequence. Most transposons move only rarely. In bacteria, the rate is typically one transposition event once every cell divisions, and significantly more frequent movement would probably destroy the host cell's genome. In plants and animals, the situation is different: it is common for progeny to carry tens to hundreds of new insertions relative to their parents. These high rates are tolerated, in part, because these genomes typically carry vast amounts of nonessential DNA sequences where most of the insertions are likely to occur.
通过转座方式移动的移动元素称为转座子,或可移动元素。在转座过程中,通常由转座子本身编码的特定酶,通常称为转座酶,作用于转座子两端的特定 DNA 序列,导致其插入到新的 DNA 位点。大多数转座子在选择目标位点时只具有适度的选择性,因此它们可以插入到基因组中的许多不同位置;特别是,元素两端与目标序列之间通常没有序列相似性的一般要求。大多数转座子移动频率很低。在细菌中,速率通常是每 个细胞分裂事件发生一次转座事件,更频繁的移动可能会破坏宿主细胞的基因组。在植物和动物中,情况不同:后代携带相对于父代而言数十到数百个新插入是常见的。这些高速率在一定程度上是可以容忍的,因为这些基因组通常携带大量非必要的 DNA 序列,大多数插入可能发生在这些序列中。
On the basis of their structure and transposition mechanism, transposons can be grouped into three large classes: DNA-only transposons, retroviral-like retrotransposons, and nonretroviral retrotransposons. The differences among them are briefly outlined in Table 5-4, and each class will be discussed in turn.
根据其结构和转座机制,转座子可以分为三大类:仅 DNA 转座子、类逆转录病毒样逆转座子和非逆转录病毒逆转座子。它们之间的区别在表 5-4 中简要概述,并将依次讨论每一类。

DNA-only Transposons Can Move by a Cut-and-Paste Mechanism
DNA-only 转座子可以通过剪切粘贴机制移动

DNA-only transposons, so named because they exist exclusively as DNA during their movement, predominate in bacteria, and they are largely responsible for the spread of antibiotic resistance in bacterial strains. When antibiotics such as penicillin and streptomycin first became widely available in the 1950s, most bacteria that caused human disease were susceptible to them. Now, the situation
DNA-only 转座子之所以被如此命名,是因为它们在移动过程中仅以 DNA 形式存在,在细菌中占主导地位,并且在细菌菌株中传播抗生素抗性方面起着重要作用。当青霉素和链霉素等抗生素在 1950 年代首次广泛应用时,大多数导致人类疾病的细菌对它们敏感。现在,情况已经发生变化。

TABLE 5-4 Three Major Classes of Transposable Elements
表 5-4 转座元件的三个主要类别

Class: DNA-only transposons
类别:DNA-only 转座子
Description and structure: Short inverted repeats at each end
描述和结构:每端都有短的倒置重复
Specialized enzymes required for movement: Transposase
移动所需的专门酶:转座酶
Mode of movement: Moves as DNA, either by cut-and-paste or replicative pathways
移动方式:以 DNA 的形式移动,可以通过剪切粘贴或复制途径
Examples: P element (Drosophila), Ac-Ds (maize), Tn3 (E. coli), Tam3 (snapdragon), Helraiser (bat)
例子:P 元素(果蝇),Ac-Ds(玉米),Tn3(大肠杆菌),Tam3(金鱼草),Helraiser(蝙蝠)

Class: Retroviral-like retrotransposons
类别:逆转录病毒样逆转录转座子
Description and structure: Directly repeated long terminal repeats (LTRs) at each end
描述和结构:每端都有直接重复的长末端重复序列(LTRs)
Specialized enzymes required for movement: Reverse transcriptase and integrase
移动所需的专门酶:反转录酶和整合酶
Mode of movement: Moves via an RNA intermediate whose production is driven by a promoter in the LTR
移动方式:通过一个 RNA 中间体进行移动,其产生受 LTR 中的启动子驱动
Examples: Copia and Gypsy (Drosophila), Ty1 (yeast), HERVK (human), Bs1 (maize), EVADE (Arabidopsis)
例子:Copia 和 Gypsy(果蝇),Ty1(酵母),HERVK(人类),Bs1(玉米),EVADE(拟南芥)

Class: Nonretroviral retrotransposons
类别:非逆转录病毒逆转座子
Description and structure: Poly A at 3′ end of RNA transcript; 5′ end is often truncated
描述和结构:RNA 转录本的 3′ 端为 Poly A;5′ 端经常被截短
Specialized enzymes required for movement: Reverse transcriptase and endonuclease
移动所需的专门酶:反转录酶和内切酶
Mode of movement: Moves via an RNA intermediate that is often synthesized from a neighboring promoter
移动方式:通过通常是从相邻启动子合成的 RNA 中间体进行移动
Examples: I element (Drosophila), L1 (human), Cin4 (maize), Karma (rice)
例子:I 元素(果蝇),L1(人类),Cin4(玉米),Karma(稻米)

These elements range in length from 1000 to about 12,000 nucleotide pairs. Each family contains many members, only a few of which are listed here. Some viruses can also move in and out of host-cell chromosomes by transpositional mechanisms. These viruses are related to the first two classes of transposons.
这些元素的长度范围从 1000 到约 12,000 个核苷酸对。每个家族包含许多成员,这里只列出了其中的一部分。一些病毒也可以通过转座机制在宿主细胞染色体内移动。这些病毒与转座子的前两类相关。
is different: antibiotics such as penicillin (and its modern derivatives) are no longer effective against many modern bacterial strains, including those causing gonorrhea and bacterial pneumonia. The spread of antibiotic resistance is due largely to genes that encode antibiotic-inactivating enzymes that are carried on transposons (Figure 5-58). Although these mobile elements can transpose only within cells that already carry them, they can be moved from one cell to another through other mechanisms known collectively as horizontal gene transfer (see Figure 1-18). Once introduced into a new cell, a transposon can insert itself into the genome and be faithfully passed on to all progeny cells through the normal processes of DNA replication and cell division.
抗生素的不同种类,如青霉素(及其现代衍生物),已不再对许多现代细菌菌株有效,包括引起淋病和细菌性肺炎的细菌。抗生素耐药性的传播在很大程度上归因于编码抗生素失活酶的基因,这些基因携带在可移动的转座子上(见图 5-58)。虽然这些可移动元件只能在已携带它们的细胞内转座,但它们可以通过其他机制被从一个细胞转移到另一个细胞,这些机制被统称为水平基因转移(见图 1-18)。一旦引入新细胞,转座子可以插入基因组,并通过 DNA 复制和细胞分裂的正常过程被忠实地传递给所有后代细胞。
DNA-only transposons can relocate from a donor site to a target site by cut-and-paste transposition (Figure 5-59). Here, the transposon is literally excised from one spot on a genome and inserted into another. This reaction produces a short duplication of the target DNA sequence at the insertion site; these direct repeat sequences that flank the transposon serve as convenient records of a prior transposition event. Such "signatures" often provide valuable clues in identifying transposons in genome sequences.
DNA-only 转座子可以通过剪切粘贴转座(图 5-59)从供体位点迁移到目标位点。在这里,转座子实际上是从基因组的一个位置切除并插入到另一个位置。这种反应在插入位点产生目标 DNA 序列的短重复;环绕转座子的这些直接重复序列作为先前转座事件的便捷记录。这种“标记”通常在识别基因组序列中的转座子方面提供有价值的线索。
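The origin of the flanking direct repeats can be sketched in a few lines. The example below is a simplified illustration that tracks only one strand as a string; the sequences, the `stagger` length, and the function name are hypothetical. It assumes the target is cut in a staggered fashion, so that filling in the resulting single-strand gaps copies the short stretch between the two cuts onto both sides of the inserted element.

```python
# Hypothetical transposon: its two ends (CAGG ... CCTG) are inverted repeats,
# that is, the right end is the reverse complement of the left end.
TRANSPOSON = "CAGGATGAAACCTG"


def cut_and_paste(donor, start, end, target, site, stagger=3):
    """Excise donor[start:end] and insert it into `target` at `site`.

    The target is assumed to be cut with a stagger of `stagger` bases;
    gap repair then duplicates that stretch on each side of the insert.
    Returns (donor after excision, target after insertion, duplicated stretch).
    """
    if site < 0 or site + stagger > len(target):
        raise ValueError("insertion site does not fit inside the target")
    element = donor[start:end]
    donor_after = donor[:start] + donor[end:]   # the "hole" is simply rejoined here
    duplicated = target[site:site + stagger]
    target_after = (target[:site] + duplicated + element
                    + duplicated + target[site + stagger:])
    return donor_after, target_after, duplicated


if __name__ == "__main__":
    donor = "TTTT" + TRANSPOSON + "GGGG"
    target = "ACGTACGTACGT"
    donor_after, target_after, dup = cut_and_paste(
        donor, 4, 4 + len(TRANSPOSON), target, site=4)
    print(donor_after)
    print(target_after)
    print("direct repeat flanking the insert:", dup)
```

Scanning a genome sequence for an element bracketed by identical short repeats of this kind is one way such "signatures" can be used, although real target-site duplications vary in length from one transposon family to another.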
When a cut-and-paste DNA-only transposon is excised from its original location, it leaves behind a "hole" in the chromosome. This lesion can be perfectly healed by recombinational double-strand break repair, provided that the chromosome has recently been replicated so that an identical copy of the damaged host sequence is available. Alternatively, a nonhomologous end-joining reaction can reseal the break; in this case, the DNA sequence that originally flanked the transposon is often altered, producing a mutation at the chromosomal site from which the transposon was excised (see Figure 5-45).
当一个剪切-粘贴 DNA 转座子从其原始位置切除时,在染色体上留下一个“孔”。只要染色体最近已经复制,使得受损宿主序列的一个相同拷贝可用,这种损伤可以通过重组双链断裂修复完全愈合。另外,非同源末端连接反应可以重新封闭断裂;在这种情况下,最初环绕转座子的 DNA 序列通常会发生改变,导致从转座子被切除的染色体位点处发生突变(见图 5-45)。
Figure 5-58 Transposons often code for the components they need for transposition. Shown here are two types of bacterial DNA-only transposons. Each carries a gene that encodes a transposase (dark blue and red)-the enzyme that catalyzes the element's movement-as well as short DNA sequences (light blue and pink) that are recognized by the matching transposase. The short sequences (two in each transposon) are usually arranged so that one is an inverted repeat of the other.
图 5-58 转座子通常编码了它们进行转座所需的组分。这里展示了两种细菌 DNA 转座子。每个携带一个编码转座酶(深蓝色和红色)的基因,这种酶催化元件的移动,以及被匹配的转座酶识别的短 DNA 序列(浅蓝色和粉色)。这些短序列(每个转座子中有两个)通常排列成一个是另一个的倒置重复。
Some transposons carry additional genes (yellow) that encode enzymes that inactivate antibiotics such as ampicillin (AmpR). The spread of these transposons is a serious problem in medicine, as it has allowed many disease-causing bacteria to become resistant to the antibiotics developed in the twentieth century.
一些转座子携带额外的基因(黄色),编码能够使抗生素如氨苄青霉素(AmpR)失效的酶。这些转座子的传播在医学上是一个严重问题,因为它使得许多引起疾病的细菌对二十世纪开发的抗生素产生了抗药性。
Remarkably, the same mechanism used to excise cut-and-paste transposons from DNA has been found to operate in the developing immune system of vertebrates, catalyzing the DNA rearrangements that produce antibody and T-cell receptor diversity. Known as V(D)J recombination, this process will be discussed in Chapter 24. Found only in vertebrates, V(D)J recombination is a relatively recent evolutionary novelty, but its mechanism was probably derived from the much more ancient cut-and-paste transposons.
值得注意的是,用于从 DNA 中切除剪切转座子的相同机制已被发现在脊椎动物的免疫系统发育中起作用,催化产生抗体和 T 细胞受体多样性的 DNA 重排。这一过程被称为 V(D)J 重组,在第 24 章将进行讨论。V(D)J 重组仅在脊椎动物中发现,是一个相对较新的进化新颖性,但其机制可能源自更古老的剪切转座子。

Some DNA-only Transposons Move by Replicating Themselves
一些仅含 DNA 的转座子通过复制自身来移动

Although cut-and-paste transposition is common, especially in bacteria, there are other ways that DNA-only transposons can move. These involve replicating the transposon and moving the copy to a new position on the genome, leaving the original transposon intact and in its original position. There are several different ways this can occur and we discuss only one here, which is characteristic of a large class of DNA-only transposons known as helitrons. Found in all branches of life, these transposons are especially common in plants and animals where they can compose several percent of genomes. They carry a gene for an unusual type of transposase, one that functions as both a sequence-specific nuclease and as a helicase, thereby directing the movement of the transposon (Figure 5-60).
Because of the mechanism behind their movement, helitrons often transfer bits of the genome, along with themselves, to new positions. For this reason, they are thought to be especially important in reshuffling genomic information to produce variant organisms subject to natural selection over evolutionary time scales.

Some Viruses Use a Transposition Mechanism to Move Themselves into Host-Cell Chromosomes

Certain viruses are considered mobile genetic elements because they use transposition mechanisms to integrate their genomes into that of their host cell. However, unlike transposons, the nucleotide sequences that form these viruses encode proteins that package their genetic information into virus particles that can leave the original host cell to infect other cells. As discussed in Chapter 1, most viruses probably evolved from transposable elements through the capture of genes from their host cells. Although originally serving some other purpose in the cell, such captured genes, after a long process of mutation and selection, now code for the structural proteins of viruses, allowing them to escape the cell. Viruses are among the most numerous biological entities on Earth, and we discuss them in more detail in Chapter 23. The viruses that insert themselves into host chromosomes generally do so by employing one of the first two mechanisms listed in Table 5-4; namely, by behaving like DNA-only transposons or like retroviral-like retrotransposons. Indeed, much of our knowledge of these mechanisms has come from studies of particular viruses that employ them.

Figure 5-59 Cut-and-paste transposition. DNA-only transposons can be recognized in chromosomes by the inverted repeat DNA sequences (red) present at their ends. These sequences, which can be as short as 20 nucleotides, are all that is necessary for the DNA between them to be transposed by the particular transposase enzyme associated with the element. The cut-and-paste movement of a DNA-only transposable element from one chromosomal site to another begins when the transposase enzyme brings the two inverted DNA sequences together, forming a DNA loop. Insertion into the target chromosome, also catalyzed by the transposase, occurs at a random site through the creation of staggered breaks in the target chromosome (purple arrowheads). After the transposition reaction, the single-strand gaps created by the staggered breaks are repaired by DNA polymerase and ligase (black). As a result, the insertion site is marked by a short direct repeat of the target DNA sequence, as shown. Although the break in the donor chromosome (green) is repaired, this process often alters the DNA sequence, causing a mutation at the original site of the excised transposable element (not shown).

Figure 5-60 Mechanism of transposition by helitrons, a type of DNA-only transposon. Several models have been proposed for the movement of these recently discovered transposons, and one is shown here. This model is based on studies of a helitron found in bats, called Helraiser. The process begins when the transposase (green) makes a single-strand break at one end of the transposon (blue) and, with the aid of a helicase, "peels back" the single strand. A second transposase-mediated reaction releases the transposon in the form of single-stranded DNA, which can move to new positions in the host genome. The transposase (which travels with the single-stranded DNA) can then catalyze the covalent insertion of the transposon into a new location in the host DNA. Transposition by helitrons often moves adjacent host genome sequences along with them. This occurs when, in the third step, the transposase skips over its own CTAGT sequence and cleaves its host DNA downstream at a similar DNA sequence. According to the model, this skipping produces a single-strand DNA circle that includes both helitron and host DNA, and both are inserted into target DNA.
Transposition has a key role in the life cycle of many viruses. Most notable are the retroviruses, which include the human AIDS virus, HIV. Outside the cell, a retrovirus exists as a single-strand RNA genome packed into a protein shell, or capsid, along with a virus-encoded reverse transcriptase enzyme. During the infection process, the viral RNA enters a cell and is converted to a double-strand DNA molecule by the action of this crucial enzyme, which is able to polymerize DNA on either an RNA or a DNA template (Figure 5-61). The term retrovirus refers to the virus's ability to reverse the usual flow of genetic information, which normally is from DNA to RNA (see Figure 1-4).
Once the reverse transcriptase has produced a double-strand DNA molecule, specific sequences near its two ends are recognized by a virus-encoded transposase called integrase. Integrase then inserts the viral DNA into the chromosome by a mechanism similar to that used by the cut-and-paste DNA-only transposons (see Figure 5-59).
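The insertion step shared by integrase and the cut-and-paste transposases (Figure 5-59) leaves a recognizable footprint: because the target is cut with a stagger and the resulting gaps are filled in, a short stretch of target sequence ends up duplicated on either side of the inserted DNA. The following is a minimal Python sketch of that bookkeeping only. The function name, the example sequences, and the 5-nucleotide stagger are illustrative assumptions, not values taken from the text, and only one strand is tracked.

```python
def insert_with_staggered_cut(target: str, element: str, cut_pos: int, stagger: int) -> str:
    """Insert `element` into `target` through a staggered double-strand break.

    One strand is cut at `cut_pos` and the other `stagger` bases further
    along. Once the single-strand gaps are filled in, the `stagger` bases
    between the two cuts appear on both sides of the element as a direct
    repeat. Only the top strand is represented here.
    """
    if not 0 <= cut_pos <= len(target) - stagger:
        raise ValueError("staggered cut must lie within the target")
    left = target[:cut_pos]
    duplicated = target[cut_pos:cut_pos + stagger]  # becomes the direct repeat
    right = target[cut_pos + stagger:]
    return left + duplicated + element + duplicated + right


# Hypothetical example: a 12-nucleotide target and a placeholder element.
target = "AAACCGTTTGGG"
element = "[TRANSPOSON]"
result = insert_with_staggered_cut(target, element, cut_pos=3, stagger=5)
print(result)  # AAACCGTT[TRANSPOSON]CCGTTTGGG
```

In the output, the five nucleotides CCGTT flank the inserted element on both sides, which corresponds to the short direct repeat of target DNA that marks an insertion site.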

Some RNA Viruses Replicate and Express Their Genomes Without Using DNA as an Intermediate

Retroviruses are not the only viruses that carry their genomes in the form of RNA. Other viruses also have single-strand RNA genomes, but, unlike retroviruses, many replicate and express their genomes without ever using DNA; that is, they are RNA-only viruses. For example, SARS-CoV-2, the coronavirus underlying the COVID-19 pandemic, replicates its single-strand RNA genome using a special, viral-encoded RNA-dependent RNA polymerase. Upon entering a cell, the viral genome is directly translated by ribosomes as though it were an mRNA molecule, producing many different viral-encoded proteins, including the RNA-dependent RNA polymerase. (We discuss mRNA and the process of translation in detail in Chapter 6.) The polymerase assembles with several other viral proteins and a few host proteins to form the complete replicase complex. This specialized replicase, which does not require a primer to begin synthesis, starts at the 3' end of the viral genome and makes a complementary RNA copy of the entire genome (Figure 5-62). Using this complementary copy as a template, the replicase then synthesizes new genomes, which are then packaged with newly synthesized viral proteins into complete virus particles. The whole process of viral replication takes about 10 hours, and a single infected cell can produce as many as 1000 new virus particles, which can spread to other cells within the same host or move in aerosols to new hosts. Because coronaviruses do not use DNA, all steps of viral replication can take place outside the nucleus. In the case of SARS-CoV-2, viral replication occurs in the cytoplasm inside double-membrane compartments that are commandeered by the virus from the endoplasmic reticulum, an organelle described in detail in Chapter 12. These virus-induced compartments are believed to protect the virus from the cell's many antiviral defenses (see pp. 1337-1338) during viral replication and assembly.

Figure 5-61 The life cycle of a retrovirus. The retrovirus genome consists of an RNA molecule (blue) that is typically between 7000 and 12,000 nucleotides in length. It is packaged inside a virus-encoded protein capsid, which is surrounded by a lipid-based envelope that contains virus-encoded envelope proteins (green). Inside an infected cell, the enzyme reverse transcriptase (red circle) first makes a DNA copy of the viral RNA molecule and then a second DNA strand, generating a double-strand DNA copy of the RNA genome. The integration of this DNA double helix into the host chromosome is then catalyzed by a virus-encoded integrase enzyme. This integration is required for the synthesis of new viral RNA molecules by the host-cell RNA polymerase, the enzyme that transcribes DNA into RNA (discussed in Chapter 6). As indicated, this viral RNA is then used by host-cell machinery to produce the capsid, envelope, and reverse transcriptase proteins needed to form new virus particles.
Several features of coronaviruses distinguish them from other RNA-only viruses such as those that cause influenza or polio. Perhaps the most unusual is the ability of coronavirus replicase complexes to proofread as they copy their RNA genomes. This proofreading occurs in much the same way that we saw for DNA polymerases earlier in the chapter: An incorrectly added nucleotide is excised by a 3'-to-5' exonuclease carried in the replicase complex, giving the replicase another chance to add the correct nucleotide. This feature means that coronaviruses do not mutate as rapidly as most other RNA viruses, which lack proofreading ability. As discussed in Chapter 23, the relatively high mutation rate of influenza virus helps explain why new vaccines are needed every year, and this may not always be the case for coronaviruses.

Figure 5-62 Simplified view of the coronavirus life cycle as exemplified by SARS-CoV-2. The viral genome, a single-strand RNA molecule of approximately 30,000 nucleotides, is packaged throughout its length with an RNA-binding protein (red) and enclosed by a lipid bilayer containing the viral spike protein. (The appearance of spike proteins, emanating from the lipid envelope, is responsible for the "corona" moniker.) As described in Chapter 23, the spike proteins bind to a receptor on the surface of susceptible cells and direct the fusion of the viral envelope with the outer cell membrane, releasing the viral genome into the cytoplasm. The genome is then directly translated into protein. Single-strand RNA viruses of this type are called [+] strand viruses, denoting the ability of their genomes to be immediately translated by the host-cell machinery. In contrast, the genomes of [-] strand viruses must first be used as templates to make complementary RNA strands that are then translated (see Table 23-1).
Among the first proteins made by coronaviruses are those that form the RNA-dependent RNA polymerase, which is responsible for producing new viral genomes by synthesizing RNA using RNA as a template. The replicase complex (which includes the polymerase and several other loosely associated proteins) first synthesizes complete noncoding copies of the viral genome. These complementary copies in turn serve as templates for the replicase complex to synthesize new genomes. The replicase also makes a series of shorter coding RNAs, which are needed to produce additional viral proteins including the spike. Once new viral genomes and proteins have been synthesized, new virus particles are assembled and exit the cell. Although only a few viral proteins are shown in the diagram, the virus codes for at least 27 different proteins; some of these organize the double-membrane structures in which the virus replicates, while others inhibit various immune system responses to the infection.
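The two-step copying carried out by the coronavirus replicase (genome, then complementary copy, then new genomes) can be illustrated with a short Python sketch. It relies only on RNA base pairing; the 12-nucleotide sequence is a made-up stand-in for a [+] strand genome, and the function name is an assumption introduced for illustration.

```python
# Watson-Crick pairing for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_copy(template: str) -> str:
    """Return the strand complementary to `template`, written 5'-to-3'.

    The template is read from its 3' end, so the product is the
    reverse complement of the input string.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template))


plus_strand = "AUGGCUUCAGGA"                     # hypothetical [+] strand genome
minus_strand = complementary_copy(plus_strand)   # complementary copy made first
new_genome = complementary_copy(minus_strand)    # copied again to give a new genome

print(minus_strand)               # UCCUGAAGCCAU
print(new_genome == plus_strand)  # True
```

Copying the complementary strand regenerates the original sequence, which is why the complementary copy can serve as a template for many new genomes.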
The proofreading also has other important implications. As discussed earlier in this chapter, the lower the mutation rate, the greater the number of essential proteins that a genome can maintain. Proofreading allows coronaviruses to have a larger genome than is typical for RNA viruses; for example, compare the 30,000 nucleotides of the SARS-CoV-2 genome (coding for at least 27 proteins) to the 13,500 nucleotides of the influenza virus (coding for about 10 proteins). Proofreading also affects the development of antiviral drugs. Viral replicases are attractive targets for such drugs, in part because similar enzymes do not exist in uninfected host cells, reducing the chance of side effects. Drugs of this type (for example, remdesivir) are typically nucleoside triphosphate analogs that "fool" the RNA replicase into adding them to growing RNA chains. Once incorporated, the analogs, which have improper 3' ends, poison further chain elongation (see Figure 8-42). Coronavirus proofreading can excise many of these analogs and thereby reduce their potency. A related strategy (exemplified by molnupiravir) employs nucleoside triphosphate analogs that are incorporated into RNA by the viral replicase, escape proofreading, but base pair incorrectly in the next round of replication, thereby introducing a lethal number of mutations.
Another striking feature of coronaviruses is the way in which they make many different proteins from a genome carried on a single RNA molecule. As described earlier, the viral genome, once it enters a cell, is treated like an mRNA molecule and translated into protein. We shall see in the next chapter, however, that most eukaryotic mRNAs can code for only a single protein. Coronavirus production of many different proteins from a single RNA genome requires a series of unusual steps, some of which appear unique to coronaviruses. We shall discuss the general topics of mRNA translation and its regulation in Chapters 6 and 7. But first, we return to our discussion of transposons, some of which closely resemble viruses in the way they move from place to place in their host genomes.

Retroviral-like Retrotransposons Resemble Retroviruses, but Cannot Move from Cell to Cell

A large family of transposons called retroviral-like retrotransposons (see Table 5-4) move themselves in and out of chromosomes by a mechanism similar to that used by retroviruses. These elements are present in organisms as diverse as yeasts, flies, and mammals; unlike viruses, they have no intrinsic ability to leave their resident cell but are passed along to all descendants of that cell through the normal processes of DNA replication and cell division. The first step in their transposition is the transcription of the entire transposon, producing an RNA copy of the element that is typically several thousand nucleotides long. This transcript, which is translated as a messenger RNA by the host cell, encodes a reverse transcriptase enzyme. This enzyme makes a double-strand DNA copy of the RNA molecule via an RNA-DNA hybrid intermediate, precisely mirroring the early stages of infection by a retrovirus (see Figure 5-61). Like a retrovirus, the linear, double-strand DNA molecule then integrates into a site on the chromosome using an integrase enzyme that is also encoded by the element. The structure and mechanisms of these integrases closely resemble those of the transposases of DNA-only transposons.

A Large Fraction of the Human Genome Is Composed of Nonretroviral Retrotransposons

A significant fraction of many vertebrate chromosomes is made up of repeated DNA sequences. In human chromosomes, these repeats are mostly mutated and truncated versions of nonretroviral retrotransposons, the third major type of transposon (see Table 5-4). Although most of these transposons in the human genome are immobile, a few retain the ability to move. Movements of the L1 element (sometimes referred to as a LINE, or long interspersed nuclear element) have been identified, some of which result in human disease; for example, a particular type of hemophilia results from an L1 insertion into the gene encoding the blood-clotting protein Factor VIII (see Figure 6-25).
Nonretroviral retrotransposons are found in many organisms and move via a distinct mechanism that requires a complex of an endonuclease and a reverse transcriptase. As illustrated in Figure 5-63, the RNA and reverse transcriptase have a much more direct role in the recombination event than they do in the retroviral-like retrotransposons described above.
Inspection of the human genome sequence reveals that the bulk of nonretroviral retrotransposons, for example the many copies of the Alu element, a member of the SINE (short interspersed nuclear element) family, do not carry their own endonuclease or reverse transcriptase genes. Nonetheless, they have successfully amplified themselves to become major constituents of our genome, presumably by pirating enzymes encoded by active elements. Together the LINEs and SINEs make up more than 30% of the human genome (see Figure 4-62); there are 500,000 copies of the former and more than a million of the latter.

Different Transposable Elements Predominate in Different Organisms

We have described several types of transposable elements: (1) DNA-only transposons, the movement of which is based on DNA breaking and joining reactions; (2) retroviral-like retrotransposons, which also move via DNA breakage and joining, but where RNA has a key role as a template to generate the DNA recombination substrate; and (3) nonretroviral retrotransposons, in which an RNA copy of the element is central to the incorporation of the element into the target DNA, acting as a direct template for a DNA target-primed reverse transcription event.
Intriguingly, different types of transposons predominate in different organisms. For example, the vast majority of bacterial transposons are DNA-only types, with a few related to the nonretroviral retrotransposons also present. In yeasts, the main mobile elements are retroviral-like retrotransposons. In Drosophila, DNA-only, retroviral, and nonretroviral transposons are all found. Finally, the human genome contains all three types of transposon, but as discussed below, their evolutionary histories are strikingly different.

Genome Sequences Reveal the Approximate Times at Which Transposable Elements Have Moved

The nucleotide sequence of the human genome provides a rich fossil record of the activity of transposons over evolutionary time spans. By carefully comparing the nucleotide sequences of the approximately 3 million transposable element remnants in the human genome, it has been possible to broadly reconstruct the movements of transposons in our ancestors' genomes over the past several hundred million years. For example, the cut-and-paste DNA-only transposons appear to have been very active well before the divergence of humans and Old World monkeys (25-35 million years ago), but because they gradually accumulated inactivating mutations, they have been dormant in the human lineage since that time. Likewise, although our genome is littered with relics of retroviral-like retrotransposons, none appear to be active today. Only a single family of retroviral-like retrotransposons is believed to have transposed in the human genome since the divergence of human and chimpanzee approximately 6 million years ago. The nonretroviral retrotransposons are also ancient, but in contrast to other types, some are still moving in our genome, as mentioned previously. For example, it is estimated that de novo movement of an Alu element occurs once in every 100-200 human births. This movement of nonretroviral retrotransposons is responsible for a small but significant fraction of new human mutations, perhaps two mutations out of every thousand.
The situation in mice is significantly different. Although the mouse and human genomes contain roughly the same density of the three types of transposons, both types of retrotransposons are still actively transposing in the mouse genome, being responsible for approximately 10% of new mutations.

Figure 5-63 Transposition by a nonretroviral retrotransposon. Transposition of the L1 element (red) begins when an endonuclease that is part of a complex with the L1 reverse transcriptase (green) bound to the end of the L1 RNA (blue) nicks the target DNA at the point at which insertion will occur. This cleavage produces a 3'-OH DNA end in the target DNA, which is then used as a primer for the reverse transcription step shown. This generates a single-strand DNA copy of the element that is directly linked to the target DNA. In subsequent reactions, further processing of the single-strand DNA copy results in the generation of a new double-strand DNA copy of the L1 element that is inserted at the site of the initial nick.
Although we are only beginning to understand how the movements of transposons have shaped the genomes of present-day mammals, it has been proposed that bursts in transposition activity could have been responsible for critical speciation events during the radiation of the mammalian lineages from a common ancestor, a process that began approximately 170 million years ago. At present, we can only wonder how many of our uniquely human qualities have been derived from the past activity of the mobile genetic elements whose remnants are found scattered throughout our chromosomes.

Conservative Site-specific Recombination Can Reversibly Rearrange DNA

A different kind of recombination mechanism, known as conservative site-specific recombination, rearranges other types of mobile DNA elements. In this pathway, breakage and joining occur at two special sites, one on each participating DNA molecule, with the recombination event being carried out by a specialized enzyme that breaks and rejoins the two DNA double helices at these specific sequences. The same enzyme system that joins two DNA molecules can often take them apart again, precisely restoring the sequence of the two original DNA molecules (Figure 5-64A). Alternatively, with a different orientation of these two sequences in a chromosome, conservative site-specific recombination produces a DNA inversion (Figure 5-64B).
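The three outcomes of conservative site-specific recombination (integration, excision, and inversion) can be sketched as operations on a list of labeled DNA segments. This is a minimal illustration under stated assumptions: the `SITE` label, the segment names, and the function names are all hypothetical, the orientation of the recognition sites themselves is not modeled, and the caller simply chooses which reaction to apply.

```python
from typing import List, Tuple

SITE = "site"  # placeholder label for the short recognition sequence


def flip(segment: str) -> str:
    """Mark a segment as reversed by toggling a leading '-'."""
    return segment[1:] if segment.startswith("-") else "-" + segment


def integrate(host: List[str], circle: List[str]) -> List[str]:
    """Insert a circular molecule (listed starting at its site) into `host`."""
    if not circle or circle[0] != SITE:
        raise ValueError("circle must be listed starting at its site")
    i = host.index(SITE)
    return host[:i] + [SITE] + circle[1:] + [SITE] + host[i + 1:]


def excise(integrated: List[str]) -> Tuple[List[str], List[str]]:
    """Reverse of `integrate`: recover the original host and circle."""
    i = integrated.index(SITE)
    j = integrated.index(SITE, i + 1)
    host = integrated[:i] + [SITE] + integrated[j + 1:]
    circle = [SITE] + integrated[i + 1:j]
    return host, circle


def invert(chromosome: List[str]) -> List[str]:
    """Reverse the order and orientation of the segments between two sites."""
    i = chromosome.index(SITE)
    j = chromosome.index(SITE, i + 1)
    middle = [flip(s) for s in reversed(chromosome[i + 1:j])]
    return chromosome[:i + 1] + middle + chromosome[j:]


host = ["gene_a", SITE, "gene_b"]
circle = [SITE, "virus_1", "virus_2"]
integrated = integrate(host, circle)
print(integrated)          # ['gene_a', 'site', 'virus_1', 'virus_2', 'site', 'gene_b']
print(excise(integrated))  # the original host and circle are restored
print(invert(integrated))  # ['gene_a', 'site', '-virus_2', '-virus_1', 'site', 'gene_b']
```

Excision exactly undoes integration, and applying `invert` twice returns the starting arrangement, mirroring the reversibility of the reactions in the text.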
The conservative site-specific recombination pathway illustrated in Figure 5-64A is often used by bacterial DNA viruses to move their genomes in and out of the genomes of their host cells. When integrated into its host genome, the viral DNA is replicated along with the host DNA and is faithfully passed on to all descendant cells. If the host cell suffers damage (for example, by UV irradiation), the virus can reverse the site-specific recombination reaction, excise its genome, and package it into a virus particle. In this way, many viruses can replicate themselves passively as a component of the host genome, but can also "leave the sinking ship" by excising their genomes and packaging them in a protective coat until a new, healthy host cell is encountered.
Several features distinguish conservative site-specific recombination from transposition. First, conservative site-specific recombination requires specialized DNA sequences on both the donor and recipient DNA (hence the term "site-specific"). These sequences contain recognition sites for the particular recombinase that will catalyze the rearrangement. In contrast, transposition requires only that the transposon bears a specialized sequence; for most transposons, the recipient DNA can be of nearly any sequence. Second, the reaction mechanisms are fundamentally different. The recombinases that catalyze conservative site-specific recombination resemble topoisomerases in the sense that they form transient high-energy covalent bonds with the DNA and use this energy to complete all the DNA rearrangements without the need for new DNA synthesis (see Figure 5-22). Thus, all the phosphate bonds that are broken during a recombination event are restored upon its completion (hence the term "conservative"). Transposition, in contrast, typically leaves gaps in the DNA that must be repaired by DNA polymerases.

Figure 5-64 Two types of DNA rearrangement produced by conservative site-specific recombination. The only difference between the reactions in (A) and (B) is the relative orientation of the two short DNA sites (indicated by arrows) at which a site-specific recombination event occurs. (A) Through an integration reaction, a circular DNA molecule can become incorporated into a second DNA molecule; by the reverse reaction (excision), it can exit to re-form the original DNA circle. Many bacterial viruses move in and out of their host chromosomes in this way. (B) Conservative site-specific recombination can also invert a specific segment of DNA in a chromosome. A well-studied example of DNA inversion through site-specific recombination occurs in the bacterium Salmonella enterica serovar Typhimurium, as we discuss in the next section.

Conservative Site-specific Recombination Can Be Used to Turn Genes On or Off

Many bacteria use conservative site-specific recombination to control the expression of particular genes. A well-studied example occurs in Salmonella bacteria, an organism that is a major cause of food poisoning in humans. Known as phase variation, the switch in gene expression results from the occasional inversion of a specific 1000-nucleotide-pair piece of DNA, brought about by a conservative site-specific recombinase encoded in the Salmonella genome. This change alters the expression of the cell-surface protein flagellin, for which the bacterium has two different genes. The DNA inversion changes the orientation of a promoter (a DNA sequence that directs transcription of a gene) that is located within the inverted DNA segment. With the promoter in one orientation, the bacteria synthesize one type of flagellin; with the promoter in the other orientation, they synthesize the other type (Figure 5-65).
许多细菌利用保守的位点特异性重组来控制特定基因的表达。一个广为研究的例子发生在沙门氏菌细菌中,这是人类食物中毒的主要原因之一。被称为相位变异,基因表达的转换是由沙门氏菌基因组中编码的保守位点特异性重组酶偶尔倒转特定的 1000 核苷酸对 DNA 片段引起的。这种改变改变了细胞表面蛋白鞭毛蛋白的表达,该细菌有两种不同的基因。DNA 倒转改变了位于倒转 DNA 片段内的一个启动子(指导基因转录的 DNA 序列)的方向。当启动子处于一种方向时,细菌合成一种类型的鞭毛蛋白;当启动子处于另一种方向时,它们合成另一种类型(图 5-65)。
The recombination reaction is reversible, allowing bacterial populations to switch back and forth between the two types of flagellin. Inversions occur only rarely, and because such changes in the genome will be copied faithfully during all subsequent replication cycles, entire clones of bacteria will have one type of flagellin or the other.
重组反应是可逆的,允许细菌群体在两种类型的鞭毛蛋白之间来回切换。倒位仅偶尔发生,由于基因组中的这种变化将在所有后续复制周期中被忠实复制,整个细菌克隆将具有一种或另一种类型的鞭毛蛋白。
Phase variation helps protect the bacterial population against the immune response of its vertebrate host. If the host makes antibodies against one type of flagellin, a few bacteria whose flagellin has been altered by gene inversion will still be able to survive and multiply.
相位变异有助于保护细菌群体免受其脊椎动物宿主的免疫反应。如果宿主产生抗体针对一种类型的鞭毛蛋白,一些细菌的鞭毛蛋白经基因倒位改变后仍能存活和繁殖。
Figure 5-65 Switching gene expression by DNA inversion in bacteria. Which one of the two flagellin genes in a Salmonella bacterium is used to produce its flagellum is controlled by a conservative site-specific recombination event that inverts a small DNA segment containing a promoter. (A) In one orientation, the promoter activates transcription of the H2 flagellin gene along with the transcription of a repressor protein that blocks the expression of the H1 flagellin gene. Promoters and repressors are described in detail in Chapter 7; here we note simply that a promoter is needed to express a gene and that a repressor blocks this from happening. (B) When the promoter is inverted, it no longer turns on H2 or the repressor, and the H1 gene, which is thereby released from repression, is expressed instead. The inversion reaction requires specific DNA sequences (red) and a recombinase enzyme that is encoded in the invertible DNA segment. Because this conservative site-specific recombination mechanism is activated only rarely (about once in every 10⁵ cell divisions), the production of one or the other type of flagellin tends to be faithfully inherited in each clone of cells.
图 5-65 细菌中 DNA 倒转调控基因表达。在沙门氏菌中,哪一个鞭毛蛋白基因用于产生其鞭毛是由保守的位点特异性重组事件控制的,该事件倒转了一个包含启动子的小 DNA 片段。(A)在一个方向上,启动子激活 H2 鞭毛蛋白基因的转录,同时转录出一个阻止 H1 鞭毛蛋白基因表达的抑制蛋白。启动子和抑制蛋白在第 7 章中有详细描述;这里我们简单指出,启动子是表达基因所需的,而抑制蛋白则阻止这种表达。(B)当启动子倒转时,它不再启动 H2 或抑制蛋白,H1 基因因此解除抑制并转而表达。倒转反应需要特定的 DNA 序列(红色)和在可倒转 DNA 片段中编码的重组酶。由于这种保守的位点特异性重组机制很少被激活(大约每 10⁵ 次细胞分裂中才发生一次),因此在每个细胞克隆中,一种或另一种类型的鞭毛蛋白的产生往往会被忠实地遗传。
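The switching logic in Figure 5-65 can be sketched as a two-state toggle. This is a minimal illustration rather than a model of the real recombinase: the class name, the `promoter_forward` flag, and the gene labels H1 and H2 are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class InvertibleSegment:
    """Promoter-containing DNA segment that a recombinase can flip (hypothetical model)."""

    # True: the promoter points toward the H2 gene and the repressor gene.
    promoter_forward: bool = True

    def invert(self) -> None:
        """Model one conservative site-specific recombination event (reversible)."""
        self.promoter_forward = not self.promoter_forward

    def expressed_flagellin(self) -> str:
        # Forward: H2 and the repressor are transcribed, so H1 is blocked.
        # Inverted: neither is transcribed, so H1 is released from repression.
        return "H2" if self.promoter_forward else "H1"


segment = InvertibleSegment()
print(segment.expressed_flagellin())  # starting orientation
segment.invert()
print(segment.expressed_flagellin())  # after one inversion
segment.invert()
print(segment.expressed_flagellin())  # after a second inversion, back to the start
```

Because each inversion simply flips the flag, two successive events restore the original state, which mirrors the reversibility described in the text.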
[Figure 5-66 panel labels: In a specific tissue (e.g., liver), Cre recombinase is made only in liver cells, and the gene of interest is deleted from the chromosome and lost as liver cells divide. In other tissues, the gene of interest is expressed normally, producing the protein of interest.]


Bacterial Conservative Site-specific Recombinases Have Become Powerful Tools for Cell and Developmental Biologists
细菌保守的特异性重组酶已成为细胞和发育生物学家的强大工具

Like many of the mechanisms used by cells and viruses, site-specific recombination has been put to work by scientists to aid in the study of a wide variety of problems. To decipher the roles of specific genes in complex multicellular organisms, genetic engineering techniques are used to produce worms, flies, and mice carrying both a gene encoding a site-specific recombination enzyme and a carefully designed target DNA that includes a gene of interest flanked by DNA sites recognized by the recombination enzyme. At an appropriate time, the gene encoding the enzyme can be activated to rearrange the target DNA sequence. Such a rearrangement is widely used to delete a specific gene in a particular tissue of a multicellular organism (Figure 5-66). This strategy is particularly useful when the gene of interest plays a key role in the early development of many tissues, and a complete deletion of the gene from the germ line would cause death very early in development. The same strategy can also be used to artificially express any specific gene in a tissue of interest; here, the triggered deletion joins a strong transcriptional promoter to the gene of interest. With this tool one can in principle determine the influence of any protein in any desired tissue of an intact animal.
像许多细胞和病毒使用的机制一样,位点特异性重组已被科学家利用来帮助研究各种问题。为了解复杂多细胞生物中特定基因的作用,遗传工程技术被用来产生携带既编码位点特异性重组酶又包括一个感兴趣基因的精心设计目标 DNA 的蠕虫、果蝇和小鼠。这些 DNA 目标包含了被重组酶识别的 DNA 位点环绕的感兴趣基因。在适当的时机,编码酶的基因可以被激活以重新排列目标 DNA 序列。这种重新排列被广泛用于删除多细胞生物特定组织中的特定基因。当感兴趣基因在许多组织的早期发育中起关键作用,而从生殖细胞系中完全删除该基因会导致早期死亡时,这种策略尤其有用。相同的策略也可用于在感兴趣组织中人为表达任何特定基因;在这里,触发的删除将一个强转录启动子与感兴趣基因连接起来。 使用这个工具,原则上可以确定在完整动物的任何所需组织中的任何蛋白质的影响。

Summary 摘要

The genomes of nearly all organisms contain mobile genetic elements that can move from one position in the genome to another by either transposition or conservative site-specific recombination. In most cases, this movement is random and happens at a very low frequency. There are three classes of transposons: the DNA-only transposons, the retroviral-like retrotransposons, and the nonretroviral retrotransposons. The first two classes have close relatives among the viruses, including the human retrovirus that causes AIDS, HIV. Although mobile genetic elements can be viewed as parasites, many of the new arrangements of DNA sequences that their recombination events produce have been important for creating the genetic variation required for the evolution of cells and organisms.
几乎所有生物的基因组都包含可以通过转座或保守的位点特异性重组从基因组中的一个位置移动到另一个位置的移动基因元件。在大多数情况下,这种移动是随机的,并且发生频率非常低。转座子有三类:仅含 DNA 的转座子、类逆转录病毒的逆转录转座子和非逆转录逆转录转座子。前两类在病毒中有近亲,包括导致艾滋病的人类逆转录病毒 HIV。尽管移动基因元件可以被视为寄生虫,但它们重组事件产生的 DNA 序列新排列对于创造细胞和生物进化所需的遗传变异至关重要。
Figure 5-66 How a conservative site-specific recombination enzyme from bacteria is used to delete a specific gene from a particular mouse tissue. This approach requires the insertion of two specially engineered DNA molecules into the animal's germ line. The first contains the gene for a recombinase (in this case, the Cre recombinase from the bacteriophage P1) under the control of a tissue-specific promoter that ensures the recombinase is expressed only in that tissue. The second DNA molecule contains the gene of interest, flanked by the DNA sequences of the recognition sites for the recombinase (in this case, LoxP sites). The mouse has been engineered to contain only this copy of the gene of interest. Therefore, if the recombinase is expressed only in the liver, the gene of interest will be deleted there, and only there. As described in Chapter 7, many tissue-specific promoters are known; moreover, many of these promoters are active only at specific times in development. Thus, this method makes it possible to study the effects of deleting any gene of interest at specific times during the development of each tissue. For this reason, it is a powerful tool for scientists investigating the role of individual genes in animal and plant development.
图 5-66 从细菌中使用保守的位点特异性重组酶来从特定的小鼠组织中删除特定基因的过程。这种方法需要将两个经过特殊设计的 DNA 分子插入动物的生殖系。第一个包含了一个重组酶基因(在本例中,来自噬菌体 P1 的 Cre 重组酶),该基因受组织特异性启动子的控制,确保重组酶仅在该组织中表达。第二个 DNA 分子包含了感兴趣基因,其周围被重组酶识别位点的 DNA 序列所包围(在本例中为 LoxP 位点)。小鼠被设计成仅包含这个感兴趣基因的拷贝。因此,如果重组酶仅在肝脏中表达,感兴趣基因将在那里被删除,仅在那里。正如第 7 章所述,许多组织特异性启动子是已知的;此外,许多这些启动子仅在发育的特定时期活跃。因此,这种方法使得在每种组织发育的特定时期研究删除任何感兴趣基因的影响成为可能。 因此,这是科学家研究动植物发育中个体基因作用的强大工具。
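The tissue-specific deletion described in Figure 5-66 can be sketched with a chromosome represented as a list of named elements. This is a rough illustration only: the element names, the `"loxP"` marker string, and both function names are hypothetical, and real recombination acts on DNA sequence rather than labels.

```python
def cre_excise(chromosome):
    """Return the chromosome after Cre removes the segment between two loxP sites.

    One loxP site remains on the chromosome; the excised segment is discarded.
    """
    sites = [i for i, element in enumerate(chromosome) if element == "loxP"]
    if len(sites) != 2:
        return list(chromosome)  # nothing to recombine in this simple model
    first, second = sites
    return chromosome[: first + 1] + chromosome[second + 1 :]


def genome_in_tissue(chromosome, tissue, cre_tissue="liver"):
    """Cre is made only where its tissue-specific promoter is active."""
    if tissue == cre_tissue:
        return cre_excise(chromosome)
    return list(chromosome)


# Engineered locus: the gene of interest flanked by two recognition sites.
engineered = ["upstream", "loxP", "gene_of_interest", "loxP", "downstream"]

for tissue in ("liver", "brain"):
    print(tissue, genome_in_tissue(engineered, tissue))
```

In this sketch only the liver copy loses `gene_of_interest`; every other tissue keeps the intact locus, which is the point of placing the recombinase under a tissue-specific promoter.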

How Cells Read the Genome: From DNA to Protein
细胞如何阅读基因组:从 DNA 到蛋白质

Since the structure of DNA was discovered in the early 1950s, progress in cell and molecular biology has been astounding. We now know the complete genome sequences for thousands of different organisms, revealing fascinating details of their biochemistry as well as important clues as to how these organisms evolved. Complete genome sequences have also been obtained for hundreds of thousands of individual humans, as well as for a few of our now-extinct relatives, such as the Neanderthals. Knowing the maximum amount of information that is required to produce a complex organism like ourselves puts constraints on the biochemical and structural features of cells and makes it clear that biology is not infinitely complex.
自从 20 世纪 50 年代初发现 DNA 的结构以来,细胞和分子生物学的进展令人惊叹。我们现在已经知道成千上万种不同生物的完整基因组序列,揭示了它们生物化学的迷人细节,以及这些生物进化的重要线索。数十万个个体人类的完整基因组序列也已获得,以及我们一些现已灭绝的亲属,如尼安德特人。了解生产像我们这样复杂生物所需的最大信息量对细胞的生物化学和结构特征施加了限制,并清楚地表明生物学并非无限复杂。
As discussed in Chapter 1, most of the genetic information carried by DNA specifies the sequence of amino acids in proteins. But this DNA does not direct the synthesis of proteins directly, instead producing RNA as an intermediary. When the cell needs a particular protein, the nucleotide sequence of the appropriate portion of the DNA molecule in a chromosome is first copied into RNA (a process called transcription). These RNA copies of segments of the DNA sequence are then used to direct the synthesis of the protein (a process called translation). The genetic information in cells thereby flows from DNA to RNA to protein. All cells, from bacteria to humans, express their genetic information in this way, a principle so fundamental that it is termed the central dogma of molecular biology (Figure 6-1).
正如第 1 章所讨论的那样,DNA 携带的大部分遗传信息指定了蛋白质中氨基酸的顺序。但是,这种 DNA 并不直接指导蛋白质的合成,而是产生 RNA 作为中间体。当细胞需要特定蛋白质时,染色体中 DNA 分子适当部分的核苷酸序列首先被复制成 RNA(这个过程称为转录)。然后,这些 DNA 序列片段的 RNA 副本被用来指导蛋白质的合成(这个过程称为翻译)。细胞中的遗传信息从 DNA 流向 RNA 再到蛋白质。所有的细胞,从细菌到人类,都以这种方式表达它们的遗传信息-这是一个如此基础的原则,被称为分子生物学的中心法则(图 6-1)。
Despite this universality, there are important variations between organisms in the way information flows from DNA to protein. Most notable, the RNA transcripts in eukaryotic cells are subject to a series of processing steps in the nucleus, including RNA splicing, before they are permitted to exit from the nucleus and be translated into protein. As we discuss in this chapter, these processing steps can critically change the "meaning" of an RNA molecule, and they are therefore crucial for understanding how eukaryotic cells read their genome.
尽管这种普遍性存在,但在信息从 DNA 流向蛋白质的方式上,生物体之间存在重要的变异。值得注意的是,真核细胞中的 RNA 转录本在细胞核中经历一系列处理步骤,包括 RNA 剪接,然后才被允许离开细胞核并被翻译成蛋白质。正如我们在本章中讨论的那样,这些处理步骤可以关键地改变 RNA 分子的“含义”,因此对于理解真核细胞如何读取其基因组至关重要。
Although we shall focus in this chapter on the production of the proteins encoded by the genome, for some genes RNA is the final product. Like proteins, some of these RNAs fold into precise three-dimensional structures that have structural and catalytic roles in the cell. Although the functions of many noncoding RNAs are not yet known, some have been studied in great detail and are discussed in this and the following chapter.
尽管本章节的重点是基因组编码的蛋白质的产生,但对于一些基因来说,RNA 是最终产物。与蛋白质类似,其中一些 RNA 折叠成精确的三维结构,在细胞中具有结构和催化作用。虽然许多非编码 RNA 的功能尚未被了解,但一些已经被详细研究,并在本章节和下一章节中进行讨论。
One might have expected that the information present in genomes would be arranged in an orderly fashion, resembling a dictionary or a telephone directory. But it turns out that the genomes of most multicellular organisms are surprisingly disorderly, reflecting their chaotic evolutionary histories. The genes in these organisms largely consist of a long string of alternating short exons and long introns, as discussed in Chapter 4 (see Figure 4-15D). Moreover, small bits of DNA sequence that code for protein are interspersed with large blocks of seemingly meaningless DNA. Some sections of the genome contain many genes and others lack genes altogether. Proteins that work closely with one another in the cell usually have their genes located on different chromosomes, and adjacent genes typically encode proteins that have little to do with each other in the cell. Decoding genomes is therefore no simple matter. Even with the aid of powerful
人们可能会期望基因组中的信息被有序地排列,类似于字典或电话簿。但事实证明,大多数多细胞生物的基因组是令人惊讶地混乱,反映了它们混乱的进化历史。这些生物体中的基因主要由一长串交替的短外显子和长内含子组成,如第 4 章所讨论的(见图 4-15D)。此外,编码蛋白质的小 DNA 序列片段与大段看似毫无意义的 DNA 交替出现。基因组的某些区段包含许多基因,而其他区段则完全缺乏基因。在细胞中密切协同工作的蛋白质通常其基因位于不同染色体上,相邻基因通常编码在细胞中互不相关的蛋白质。因此,解码基因组并不是一件简单的事情。即使借助强大的。

CHAPTER 6 第 6 章

IN THIS CHAPTER 在本章中

From DNA to RNA
从 DNA 到 RNA
From RNA to Protein
从 RNA 到蛋白质
The RNA World and the Origins of Life
RNA 世界与生命起源
Figure 6-1 Genetic information directs the synthesis of proteins. The flow of genetic information from DNA to RNA (transcription) and from RNA to protein (translation) occurs in all living cells. As we saw in Chapter 5, DNA can also be copied (or replicated) to produce new DNA molecules.
图 6-1 遗传信息指导蛋白质的合成。从 DNA 到 RNA(转录)以及从 RNA 到蛋白质(翻译)的遗传信息流在所有活细胞中发生。正如我们在第 5 章中看到的,DNA 也可以被复制,产生新的 DNA 分子。

[Figure 6-2 key: a label such as "Incontinentia Pigmenti" marks a disease phenotype caused by nucleotide changes in the indicated gene.]
Figure 6-2 Schematic depiction of a small portion of the human X chromosome. As summarized in the key, the known protein-coding genes (starting with Abcd1 and ending with F8) are marked by a dark gray central line, with their coding regions (exons) indicated by dark gray bars that extend above and below this line. Noncoding RNAs with known functions are indicated by purple diamonds. The blue histogram indicates the extent to which portions of the human genome are conserved with other vertebrate species. It is likely that additional genes, currently unrecognized, also lie within this portion of the human genome.
图 6-2 人类 X 染色体的小部分示意图。如图例所述,已知的编码蛋白基因(从 Abcd1 开始,以 F8 结束)由深灰色中央线标记,其编码区域(外显子)由延伸至该线上下的深灰色条表示。具有已知功能的非编码 RNA 由紫色菱形表示。蓝色柱状图表示人类基因组部分与其他脊椎动物物种保守的程度。很可能还有其他目前未被识别的基因位于人类基因组的这一部分内。
Genes whose mutation causes an inherited human condition are indicated by red brackets. The Abcd1 gene codes for a protein that imports fatty acids into the peroxisome; mutations in the gene cause demyelination of nerves, which can result in cognition and movement disorders. Incontinentia pigmenti is a disease of the skin, hair, nails, teeth, and eyes. Hemophilia A is a bleeding disorder caused by mutations in the Factor VIII gene, which codes for a blood-clotting protein (see Figure 6-25B). Because males have only a single copy of the X chromosome, most of the conditions shown here affect only males; females that inherit one of these defective genes are often asymptomatic because a functional protein is made from their other X chromosome. (Courtesy of Alex Williams, data obtained from the University of California, Santa Cruz, Genome Browser, http://genome.ucsc.edu.)
突变后会导致人类遗传性疾病的基因用红色括号标出。Abcd1 基因编码一种将脂肪酸导入过氧化物酶体的蛋白质;该基因的突变导致神经髓鞘脱失,可能导致认知和运动障碍。色素失禁症是一种涉及皮肤、头发、指甲、牙齿和眼睛的疾病。血友病 A 是一种出血性疾病,由于 Factor VIII 基因的突变引起,该基因编码一种血液凝固蛋白(见图 6-25B)。由于男性只有一份 X 染色体,这里显示的大多数疾病只影响男性;继承这些缺陷基因之一的女性通常无症状,因为她们的另一条 X 染色体产生功能性蛋白质。(由 Alex Williams 提供,数据来源于加利福尼亚大学圣克鲁兹分校基因组浏览器,http://genome.ucsc.edu。)
computers, it is difficult for researchers, in the absence of direct experimental evidence, to locate definitively the beginning and end of genes, much less to decipher when and where each gene is expressed in the life of the organism. Yet the cells in our body do this automatically, thousands of times a second.
计算机,缺乏直接实验证据,研究人员很难明确地确定基因的起始和终止位置,更不用说解读每个基因在生物体生命中何时何地表达了。然而,我们体内的细胞却能自动完成这一过程,每秒数千次。
The problems that cells face in decoding genomes can be appreciated by considering a tiny portion of the human genome (Figure 6-2). The region illustrated represents only a very small fraction of our genome and includes at least 48 genes that encode proteins plus 6 genes for noncoding RNAs. When we consider the entire human genome, we can only marvel at the capacity of our cells to rapidly and accurately handle such large amounts of information.
细胞在解码基因组时面临的问题可以通过考虑人类基因组的一个微小部分来理解(图 6-2)。所示区域仅占我们基因组的极小一部分,包括至少 48 个编码蛋白质的基因以及 6 个非编码 RNA 基因。当我们考虑整个人类基因组时,我们只能对我们的细胞如此迅速和准确地处理如此大量信息的能力感到惊叹。
In this chapter, we explain how cells decode and use the information in their genomes. Much has been learned about how the genetic instructions written in an alphabet of just four "letters" (the four different nucleotides in DNA) direct the formation of a bacterium, a fruit fly, or a human. Nevertheless, we still have a great deal to discover about how the information stored in an organism's genome produces even the simplest unicellular bacterium with about 500 genes, let alone how it directs the development of a human with approximately 25,000 genes. An enormous amount of ignorance remains; many fascinating challenges therefore await the next generation of cell biologists.
在这一章中,我们解释了细胞如何解码和利用其基因组中的信息。关于基因组中用仅有四种“字母”——DNA 中的四种不同核苷酸——编写的遗传指令如何指导细菌、果蝇或人类的形成,我们已经学到了很多。然而,我们仍然有很多待发现的内容,关于生物体基因组中存储的信息如何产生仅有大约 500 个基因的最简单的单细胞细菌,更不用说如何指导拥有大约 25,000 个基因的人类的发育。仍然存在大量的无知;因此,许多迷人的挑战等待着下一代细胞生物学家。

FROM DNA TO RNA
从 DNA 到 RNA

Transcription and translation are the means by which cells read out, or express, the genetic instructions in their genes. Because many identical RNA copies can be made from the same gene, and each RNA molecule can direct the synthesis of many identical protein molecules, cells can synthesize a large amount of protein from a single gene when necessary. Importantly, genes can be transcribed and translated with different efficiencies, allowing the cell to make vast quantities of some proteins and tiny amounts of others (Figure 6-3). Moreover, as we see in the next chapter, a cell can change (or regulate) the expression of each of
转录和翻译是细胞读取或表达其基因中的遗传指令的手段。由于可以从同一基因制作许多相同的 RNA 拷贝,并且每个 RNA 分子可以指导合成许多相同的蛋白质分子,因此在必要时细胞可以从单个基因合成大量蛋白质。重要的是,基因可以以不同的效率进行转录和翻译,使细胞能够制造大量某些蛋白质和少量其他蛋白质(图 6-3)。此外,正如我们在下一章中所看到的,细胞可以改变(或调节)每个基因的表达。
Figure 6-3 Genes can be expressed with different efficiencies. In this example, gene A is transcribed much more efficiently than gene B, and each RNA molecule that it produces is also translated more frequently. This causes the amount of protein A in the cell to be much greater than that of protein B. In this and later figures, the portions of DNA that are transcribed are shown in orange.
图 6-3 基因的表达效率可以不同。在这个例子中,基因 A 的转录效率比基因 B 高得多,而且它产生的每个 RNA 分子被翻译的频率也更高。这导致细胞中蛋白质 A 的量远远大于蛋白质 B。在这个和后续的图中,被转录的 DNA 部分显示为橙色。

its genes according to its needs, commonly by controlling the production of its RNA, and many genes will not be expressed at all in some cells.
根据其需求调控其基因,通常是通过控制其 RNA 的产生,许多基因在某些细胞中根本不会表达。
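The two-stage amplification described above (many RNA copies per gene, many protein molecules per RNA) amounts to a simple product. The sketch below is only an illustration: the function name and all of the numbers are invented for the example and are not measurements from the text.

```python
def protein_output(transcripts_made, proteins_per_transcript):
    """Total protein molecules made from one gene in this simplified model."""
    return transcripts_made * proteins_per_transcript


# Hypothetical values: gene A is transcribed and translated efficiently,
# gene B is transcribed rarely and each RNA is translated only a few times.
protein_a = protein_output(transcripts_made=1000, proteins_per_transcript=100)
protein_b = protein_output(transcripts_made=10, proteins_per_transcript=5)

print("protein A molecules:", protein_a)
print("protein B molecules:", protein_b)
print("fold difference:", protein_a // protein_b)
```

Because the two efficiencies multiply, modest differences at each step compound into a large difference in final protein amount.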
One of the central problems in producing proteins from the information carried in genomes is that most steps depend on conventional nucleic acid base-pairing, which on its own has only modest specificity. In many contexts, a correct base pair is only modestly more thermodynamically stable than an incorrect base pair, so that most steps of gene expression rely on mechanisms that both improve the specificity of the base-pairing and correct the many mistakes that arise. A central theme of this chapter, therefore, is the way cells deal with the fundamentally inaccurate base-pairing process that lies at the heart of the mechanisms that they use to read their genome.
从基因组中携带的信息产生蛋白质的一个核心问题是,大多数步骤依赖于传统的核酸碱基配对,这种配对本身只具有适度的特异性。在许多情况下,正确的碱基对在热力学上只比错误的碱基对稍微稳定一些,因此基因表达的大多数步骤依赖于既提高碱基配对特异性又纠正产生的许多错误的机制。因此,本章的一个中心主题是细胞处理基本不准确的碱基配对过程的方式,这个过程是它们用来读取基因组的机制的核心。

RNA Molecules Are Single-Stranded
RNA 分子是单链的

The first step a cell takes in reading out its genetic instructions is to copy a particular portion of its DNA nucleotide sequence (a gene) into an RNA nucleotide sequence (Figure 6-4). The information in RNA, although copied into another chemical form, is still written in essentially the same language as it is in DNA: the language of a nucleotide sequence. Hence the name given to producing RNA molecules on DNA is transcription.
细胞在阅读其遗传指令时的第一步是将其 DNA 核苷酸序列的特定部分——一个基因——复制到 RNA 核苷酸序列中(图 6-4)。 RNA 中的信息虽然以另一种化学形式复制,但实质上仍以与 DNA 相同的语言书写——核苷酸序列的语言。因此,在 DNA 上产生 RNA 分子的过程被称为转录。
Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds (see Figure 6-4). It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides; that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U) instead of the thymine (T) in DNA (Figure 6-5). Because U, like T, can base-pair by hydrogen-bonding with A (Figure 6-6), the complementary base-pairing properties described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with C, and A pairs with U).
与 DNA 一样,RNA 是由四种不同类型的核苷酸亚基通过磷酸二酯键连接在一起形成的线性聚合物(见图 6-4)。在化学上,它与 DNA 有两个不同之处:(1)RNA 中的核苷酸是核糖核苷酸,即它们含有核糖(因此称为核糖核酸),而不是脱氧核糖;(2)尽管 RNA 与 DNA 一样,含有腺嘌呤(A)、鸟嘌呤(G)和胞嘧啶(C)等碱基,但它含有尿嘧啶(U)而不是 DNA 中的胸腺嘧啶(T)(图 6-5)。由于 U 像 T 一样,可以通过氢键与 A 进行碱基配对(图 6-6),因此在第 4 章和第 5 章描述的 DNA 的互补碱基配对特性也适用于 RNA(在 RNA 中,G 与 C 配对,A 与 U 配对)。
Although these chemical differences are slight, DNA and RNA differ quite dramatically in overall structure. Whereas DNA always occurs in cells as a double-strand helix, RNA is single-stranded. An RNA chain can therefore fold up into a particular shape, just as a polypeptide chain folds up to form the final shape of a protein (Figure 6-7). As we see later in this chapter, the ability to fold into complex three-dimensional shapes allows some RNA molecules to have precise structural and catalytic functions.
尽管这些化学差异很小,DNA 和 RNA 在总体结构上有相当大的不同。DNA 总是以双链螺旋的形式存在于细胞中,而 RNA 是单链的。因此,RNA 链可以折叠成特定的形状,就像多肽链折叠成蛋白质的最终形状一样(图 6-7)。正如我们在本章后面看到的那样,能够折叠成复杂的三维形状使一些 RNA 分子具有精确的结构和催化功能。
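One way to see how a single RNA chain can fold back on itself is to search it for short stretches that are complementary to a stretch further along the same molecule. The sketch below is a naive illustration under stated assumptions: it considers only conventional G-C and A-U pairs, and the function names, the `stem_len` and `min_loop` parameters, and the example sequence are all invented here. It is not a structure-prediction method.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence that would pair with `rna` in an antiparallel helix."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_stems(rna, stem_len=4, min_loop=3):
    """Return (i, j) start positions of segments that could pair to form a stem.

    The segment starting at i pairs with the segment starting at j, leaving
    at least `min_loop` unpaired nucleotides between them.
    """
    stems = []
    for i in range(len(rna) - stem_len + 1):
        partner = reverse_complement(rna[i : i + stem_len])
        j = rna.find(partner, i + stem_len + min_loop)
        while j != -1:
            stems.append((i, j))
            j = rna.find(partner, j + 1)
    return stems


# Hypothetical sequence: GGGC ... GCCC can pair around an AAAA loop.
print(find_stems("GGGCAAAAGCCC"))
```

A real RNA structure also depends on nonconventional pairs and on thermodynamics, neither of which this search attempts to capture.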
Figure 6-5 The chemical structure of RNA. (A) RNA contains the sugar ribose, which differs from deoxyribose, the sugar used in DNA, by the presence of an additional -OH group. (B) RNA contains the base uracil, which differs from thymine, the equivalent base in DNA, by the absence of a -CH₃ group.
图 6-5 RNA 的化学结构。(A) RNA 含有核糖,与 DNA 中使用的脱氧核糖不同,因为多了一个 -OH 基团。(B) RNA 含有尿嘧啶碱基,与 DNA 中的等价碱基胸腺嘧啶不同,因为缺少一个 -CH₃ 基团。
Figure 6-4 A short length of RNA. The phosphodiester chemical linkage between nucleotides in RNA is the same as that in DNA.
图 6-4 一小段 RNA。RNA 中核苷酸之间的磷酸二酯化学连接与 DNA 中的相同。

Transcription Produces RNA Complementary to One Strand of DNA
转录产生与 DNA 的一条链互补的 RNA

The RNA in a cell is made by DNA transcription, a process that has certain similarities to the process of DNA replication discussed in Chapter 5. Transcription begins with the opening and unwinding of a small portion of the DNA double helix to expose the bases on each DNA strand. One of the two strands of the DNA double helix then acts as a template for the synthesis of an RNA molecule. As in DNA replication, the nucleotide sequence of the RNA chain is determined by the complementary base-pairing between incoming nucleotides and the DNA template. When a good match is made (A with T, U with A, G with C, and C with G), the incoming ribonucleotide is covalently linked to the growing RNA chain in an enzymatically catalyzed reaction. The RNA chain produced by transcription (the transcript) is therefore elongated one nucleotide at a time, and it has a nucleotide sequence that is exactly complementary to the strand of DNA used as the template (Figure 6-8).
细胞中的 RNA 是由 DNA 转录制造的,这个过程与第 5 章讨论的 DNA 复制过程有一定的相似之处。转录始于打开和解旋 DNA 双螺旋的一小部分,以暴露每条 DNA 链上的碱基。DNA 双螺旋的两条链中的一条然后作为合成 RNA 分子的模板。与 DNA 复制一样,RNA 链的核苷酸序列是由传入核苷酸与 DNA 模板之间的互补碱基配对确定的。当形成良好匹配时(A 与 T、U 与 A、G 与 C、C 与 G),传入的核糖核苷酸通过酶催化的反应与不断增长的 RNA 链共价连接。由转录产生的 RNA 链(转录本)因此每次延长一个核苷酸,并且其核苷酸序列与用作模板的 DNA 链完全互补(图 6-8)。
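The base-pairing rule for transcription can be written as a short lookup. The sketch below is a minimal illustration, assuming that sequences are given as plain strings written 5′ to 3′; the function names and the six-nucleotide example are hypothetical.

```python
# Template DNA base -> RNA base that pairs with it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# DNA base -> complementary DNA base.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3):
    """Return the RNA (written 5'->3') copied from a template strand (written 5'->3').

    The template is read 3'->5' while the RNA grows 5'->3',
    so the string is walked backwards.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3))


def complementary_strand(strand_5to3):
    """Return the other DNA strand of the double helix, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(strand_5to3))


template = "ACGCAT"                       # hypothetical template strand
rna = transcribe(template)
non_template = complementary_strand(template)

print("RNA:          ", rna)
print("non-template: ", non_template)
# The transcript matches the non-template strand with U in place of T.
print(rna == non_template.replace("T", "U"))
```

The final comparison restates the complementarity in another form: because both the RNA and the non-template strand pair with the template, they carry the same sequence apart from the U-for-T substitution.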
Transcription, however, differs from DNA replication in several crucial ways. Unlike a newly formed DNA strand, the RNA strand does not remain hydrogen-bonded to the DNA template strand. Instead, just behind the region where the ribonucleotides are being added, the RNA chain is displaced and the DNA helix re-forms. Thus, the RNA molecules produced by transcription are released from the DNA template as single strands. In addition, because they are copied from only a limited region of the DNA, RNA molecules are much shorter than DNA molecules. A DNA molecule in a human chromosome can be up to 250 million nucleotide-pairs long; in contrast, most RNAs are no more than a few thousand nucleotides long, and many are considerably shorter.
转录与 DNA 复制在几个关键方面有所不同。与新形成的 DNA 链不同,RNA 链不会保持与 DNA 模板链的氢键结合。相反,在核糖核苷酸被添加的区域后面,RNA 链被位移,DNA 螺旋重新形成。因此,由转录产生的 RNA 分子作为单链从 DNA 模板释放出来。此外,由于它们仅从 DNA 的有限区域复制,RNA 分子比 DNA 分子短得多。人类染色体中的 DNA 分子可以长达 2.5 亿个核苷酸对;相比之下,大多数 RNA 不超过几千个核苷酸长,许多甚至更短。
Figure 6-6 Uracil base pairs with adenine. The absence of a methyl group in U has no effect on base-pairing; thus, U-A base pairs closely resemble T-A base pairs (see Figure 4-5).
图 6-6 尿嘧啶与腺嘌呤配对。U 中甲基的缺失对碱基配对没有影响;因此,U-A 碱基对与 T-A 碱基对非常相似(见图 4-5)。

RNA Polymerases Carry Out DNA Transcription
RNA 聚合酶进行 DNA 转录

The enzymes that perform transcription are called RNA polymerases. Like the DNA polymerase that catalyzes DNA replication (discussed in Chapter 5), RNA polymerases catalyze the formation of the phosphodiester bonds that link the nucleotides together to form a linear chain. The RNA polymerase moves stepwise
执行转录的酶被称为 RNA 聚合酶。类似于催化 DNA 复制的 DNA 聚合酶(在第 5 章讨论),RNA 聚合酶催化形成连接核苷酸以形成线性链的磷酸二酯键。RNA 聚合酶逐步移动。
Figure 6-7 RNA can fold into specific structures. RNA is largely single-stranded, but it often contains short stretches of nucleotides that can form conventional base pairs with complementary sequences found elsewhere on the same molecule. These interactions, along with additional "nonconventional" base-pair interactions (for example, A-G), allow an RNA molecule to fold into a three-dimensional structure that is determined by its sequence of nucleotides (Movie 6.1). (A) Diagram of a folded RNA structure showing only conventional (G-C and A-U) base-pair interactions (red). (B) Formation of nonconventional (green) base-pair interactions folds the hypothetical structure shown in A even further. (C) Structure of an actual RNA molecule, in this case one that catalyzes its own splicing (see pp. 347-348). Each conventional base-pair interaction is indicated by a "rung" in the double helix. Bases in other configurations are indicated by broken rungs.
图 6-7 RNA 可以折叠成特定的结构。RNA 主要是单链的,但通常包含可以与同一分子的其他地方发现的互补序列形成传统碱基对的短链核苷酸。这些相互作用,以及额外的“非传统”碱基对相互作用(例如,A-G),使 RNA 分子能够折叠成由其核苷酸序列决定的三维结构(电影 6.1)。 (A)折叠的 RNA 结构图示仅显示传统(G-C 和 A-U)碱基对相互作用(红色)。 (B)形成非传统(绿色)碱基对相互作用进一步折叠 A 中显示的假设结构。 (C)实际 RNA 分子的结构,在这种情况下,它催化自身的剪接(见 347-348 页)。每个传统碱基对相互作用在双螺旋中用“梯子”表示。其他配置中的碱基由断裂的梯子表示。

along the DNA, unwinding the DNA helix just ahead of its active site for polymerization to expose a new region of the template strand for complementary base-pairing. In this way, the growing RNA chain is extended by one nucleotide at a time in the 5′-to-3′ direction (Figure 6-9). The substrates are ribonucleoside triphosphates (ATP, CTP, UTP, and GTP); as in DNA replication, the hydrolysis of high-energy bonds provides the energy needed to drive the reaction forward (see Figure 5-4 and Movie 6.2).
沿着 DNA,解开 DNA 螺旋,就在其聚合活性位点前方,以暴露模板链的新区域,以进行互补碱基配对。通过这种方式,RNA 链以 5'-到-3'方向逐个核苷酸地延伸(图 6-9)。底物是核糖核苷三磷酸(ATP,CTP,UTP 和 GTP);与 DNA 复制一样,高能键的水解提供了推动反应向前进行所需的能量(参见图 5-4 和视频 6.2)。
The almost immediate separation of the RNA strand from the DNA as it is synthesized means that many RNA copies can be made from the same gene in a relatively short time, with the synthesis of additional RNA molecules being started before the previous RNA molecules are completed (Figure 6-10). When RNA polymerase molecules follow hard on each other's heels in this way, each moving at speeds up to 50 nucleotides per second, more than a thousand transcripts can be synthesized in an hour from a single gene.
RNA 链与 DNA 的几乎立即分离意味着可以在相对较短的时间内从同一基因中制造许多 RNA 拷贝,而且在之前的 RNA 分子完成之前就已经开始合成额外的 RNA 分子(图 6-10)。当 RNA 聚合酶分子以这种方式紧随其后,每秒移动速度高达 50 个核苷酸,从单个基因中可以在一个小时内合成一千多个转录本。
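The figures in this paragraph can be checked with a little arithmetic. The sketch below uses the 50 nucleotides per second quoted above; the 1500-nucleotide gene length and the 180-nucleotide spacing between polymerases are assumptions chosen for illustration, and the function names are hypothetical.

```python
ELONGATION_RATE = 50      # nucleotides per second (upper figure quoted in the text)
GENE_LENGTH = 1500        # nucleotides; assumed for this example


def seconds_per_transcript(length_nt, rate=ELONGATION_RATE):
    """Time for one polymerase to traverse the gene."""
    return length_nt / rate


def transcripts_per_hour(spacing_nt, rate=ELONGATION_RATE):
    """Completed transcripts per hour when polymerases follow each other
    at a fixed spacing (in nucleotides) along the gene."""
    return 3600 * rate / spacing_nt


one_at_a_time = 3600 / seconds_per_transcript(GENE_LENGTH)
print("one polymerase at a time:", one_at_a_time, "transcripts per hour")
print("polymerases 180 nt apart:", transcripts_per_hour(180), "transcripts per hour")
```

Under these assumptions a single polymerase working alone would finish far fewer transcripts per hour than a train of closely spaced polymerases, which is why starting new RNA molecules before earlier ones are complete matters so much for output.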
Although RNA polymerase catalyzes essentially the same chemical reaction as DNA polymerase, there are some important differences between the activities of the two enzymes. First, and most obviously, RNA polymerase catalyzes the linkage of ribonucleotides, not deoxyribonucleotides. Second, unlike the DNA polymerases involved in DNA replication (see pp. 259-260), RNA polymerases can start an RNA chain without a primer. This difference is thought possible because transcription need not be as accurate as DNA replication (see Table 5-1, p. 260). RNA polymerases make about one mistake for every 10⁴ nucleotides copied into RNA (compared with an error rate for direct copying and proofreading by DNA polymerase of about one in 10⁷ nucleotides), and the consequences of an error in RNA transcription are much less significant as RNA does not permanently store genetic information in cells. Finally, unlike DNA polymerases, which make their products in segments that are later stitched together, RNA polymerases are processive; that is, the same RNA polymerase that begins an RNA molecule must finish it without dissociating from the DNA template.
尽管 RNA 聚合酶基本上催化与 DNA 聚合酶相同的化学反应,但这两种酶的活动之间存在一些重要的区别。首先,最明显的是,RNA 聚合酶催化核糖核苷酸的连接,而不是脱氧核糖核苷酸。其次,与参与 DNA 复制的 DNA 聚合酶不同(见第 259-260 页),RNA 聚合酶可以在没有引物的情况下启动 RNA 链。这种差异被认为是可能的,因为转录不需要像 DNA 复制那样准确(见表 5-1,第 260 页)。RNA 聚合酶在将核苷酸复制到 RNA 中时,大约每 10⁴ 个核苷酸就会出现一个错误(相比之下,DNA 聚合酶直接复制和校对的错误率约为每 10⁷ 个核苷酸中出现一个错误),而 RNA 转录中的错误后果要小得多,因为 RNA 在细胞中不会永久存储遗传信息。最后,与 DNA 聚合酶不同,后者会将其产物分段制成,然后再将这些片段缝合在一起,RNA 聚合酶是连续作用的;也就是说,开始合成 RNA 分子的 RNA 聚合酶必须在不与 DNA 模板解离的情况下完成合成。
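To get a feel for what these accuracy figures mean for a single transcript, the sketch below assumes error rates of roughly one in 10⁴ for RNA polymerase and one in 10⁷ for DNA polymerase with proofreading, and it treats errors at different positions as independent. The 2000-nucleotide length and the function names are assumptions made for the example.

```python
RNA_POL_ERROR_RATE = 1e-4   # assumed: about one mistake per 10^4 nucleotides
DNA_POL_ERROR_RATE = 1e-7   # assumed: about one mistake per 10^7 nucleotides


def expected_errors(length_nt, error_rate):
    """Average number of misincorporated nucleotides in a copy of this length."""
    return length_nt * error_rate


def prob_error_free(length_nt, error_rate):
    """Chance that a copy contains no errors, assuming independent positions."""
    return (1 - error_rate) ** length_nt


length = 2000  # hypothetical transcript length in nucleotides
for name, rate in (("RNA polymerase", RNA_POL_ERROR_RATE),
                   ("DNA polymerase", DNA_POL_ERROR_RATE)):
    print(name,
          "expected errors:", expected_errors(length, rate),
          "error-free fraction:", round(prob_error_free(length, rate), 4))
```

Under these assumptions a noticeable minority of transcripts of this length carry at least one error, whereas a DNA copy of the same length is almost always exact. That gap is tolerable because each faulty RNA is a short-lived copy rather than the permanent record.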
Although not nearly as accurate as the DNA polymerases that replicate DNA, RNA polymerases nonetheless have a modest proofreading mechanism. If an incorrect ribonucleotide is added to the growing RNA chain, the polymerase can back up, and the active site of the enzyme can perform an excision reaction that resembles the reverse of the polymerization reaction, except that a water molecule replaces the pyrophosphate and a nucleoside monophosphate is released.
尽管 RNA 聚合酶的准确性远不及复制 DNA 的 DNA 聚合酶,但它们仍具有适度的校对机制。如果在不正确的核糖核苷酸被添加到正在生长的 RNA 链中时,聚合酶可以倒退,酶的活性位点可以执行类似于反聚合反应的切除反应,只是水分子取代了焦磷酸酯,一个核苷酸单磷酸被释放。
Given that DNA and RNA polymerases both carry out template-dependent nucleotide polymerization, it might be expected that the two types of enzymes would be structurally related. However, x-ray crystallographic studies reveal that, other than containing a critical Mg²⁺ ion at the catalytic site, the two enzymes are quite different. Template-dependent nucleotide-polymerizing enzymes seem to have arisen at least twice during the early evolution of cells. One lineage led to the
鉴于 DNA 和 RNA 聚合酶都执行基于模板的核苷酸聚合,人们可能会期望这两种类型的酶在结构上有关联。然而,X 射线晶体学研究显示,除了在催化位点含有一个关键的 Mg²⁺ 离子外,这两种酶是非常不同的。基于模板的核苷酸聚合酶似乎在细胞早期演化过程中至少出现了两次。其中一支系导致了
Figure 6-8 DNA transcription produces a single-strand RNA molecule that is complementary to one strand of the DNA double helix. Note that the sequence of bases in the RNA molecule produced is the same as the sequence of bases in the non-template DNA strand, except that a U replaces every T base in the DNA.
图 6-8 DNA 转录产生一条与 DNA 双螺旋的一条链互补的单链 RNA 分子。请注意,所产生的 RNA 分子中碱基的序列与非模板 DNA 链中碱基的序列相同,只是 DNA 中的每个 T 碱基都被 U 替换。
Figure 6-9 DNA is transcribed by the enzyme RNA polymerase. The RNA polymerase (pale blue) moves stepwise along the DNA, unwinding the DNA helix at its active site, which is marked by the Mg²⁺ ion (red) required for catalysis. As it progresses, the polymerase adds nucleotides one by one to the RNA chain at the polymerization site, using an exposed DNA strand as a template. The RNA transcript is thus a complementary copy of one of the two DNA strands. A short region of DNA-RNA helix (approximately nine nucleotide pairs in length) is formed only transiently, and a "window" of DNA-RNA helix therefore moves along the DNA with the polymerase as the DNA double helix re-forms behind it. The incoming nucleotides are in the form of ribonucleoside triphosphates (ATP, UTP, CTP, and GTP), and the energy stored in their phosphate-phosphate bonds provides the driving force for the polymerization reaction. The figure, based on an x-ray crystallographic structure, shows a cutaway view of the polymerase: the part facing the viewer has been sliced away to reveal the interior (Movie 6.3). (Adapted from P. Cramer et al., Science 288:640–649, 2000. PDB code: 1HQM.)
图 6-9 DNA 由 RNA 聚合酶转录。RNA 聚合酶(淡蓝色)沿着 DNA 逐步移动,在其活性位点处解开 DNA 螺旋,该位点由催化所必需的 Mg²⁺ 离子(红色)标示。随着进展,聚合酶逐个向 RNA 链的聚合位点添加核苷酸,利用裸露的 DNA 链作为模板。因此,RNA 转录本是两条 DNA 链中的一条的互补复制。DNA-RNA 螺旋的短区域(长度约为九个核苷酸对)仅短暂形成,因此 DNA-RNA 螺旋的“窗口”随着聚合酶沿着 DNA 移动,DNA 双螺旋在其后重新形成。传入的核苷酸以核糖核苷三磷酸形式存在(ATP、UTP、CTP 和 GTP),它们磷酸-磷酸键中储存的能量为聚合反应提供推动力。该图基于 X 射线晶体学结构,显示了聚合酶的剖面视图:面向观察者的部分已被切开以显示内部(影片 6.3)。(改编自 P. Cramer 等人,Science 288:640–649,2000 年。PDB 代码:1HQM。)
modern DNA polymerases and reverse transcriptases discussed in Chapter 5, as well as to a few RNA polymerases from viruses. The other lineage formed all of the RNA polymerases that we discuss in this chapter.
在第 5 章讨论的现代 DNA 聚合酶和逆转录酶,以及一些来自病毒的 RNA 聚合酶。另一支系形成了本章讨论的所有 RNA 聚合酶。
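The base-pairing rule summarized in Figure 6-8 can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; the sketch ignores promoters, terminators, and any RNA processing.

```python
# Watson-Crick pairing used when an RNA chain is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def non_template_strand(template_3_to_5: str) -> str:
    """Return the non-template DNA strand (written 5'->3') that pairs with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGATTC"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = non_template_strand(template)

# The RNA matches the non-template strand, except that U replaces every T.
assert rna == coding.replace("T", "U")
print(rna, coding)
```

Writing the template strand 3'-to-5' keeps the RNA in the conventional 5'-to-3' orientation without an explicit reversal step.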

Cells Produce Different Categories of RNA Molecules
细胞产生不同类别的 RNA 分子

The majority of genes carried in a cell's DNA specify the amino acid sequence of proteins, and the RNAs that are copied from these genes (which ultimately direct the synthesis of proteins) are called messenger RNA (mRNA) molecules. The final product of other genes is the RNA molecule itself. These RNAs are known as noncoding RNAs, because they do not code for protein. In a well-studied, single-celled eukaryote, the yeast Saccharomyces cerevisiae, more than 1200 genes (about of the total) produce RNA as their final product. Humans produce about 5000 different noncoding RNAs. These RNAs, like proteins, serve as enzymatic, structural, and regulatory components for a wide variety of processes in the cell. In Chapter 5, we encountered one of them as the template RNA carried by the enzyme telomerase. We shall see in this chapter that ribosomal RNA (rRNA) molecules form the core of ribosomes, that transfer RNA (tRNA) molecules serve as the adaptors that select amino acids and hold them in place on a ribosome for incorporation into protein, and that small nuclear RNA (snRNA) molecules direct the splicing of pre-mRNA to form mRNA. In Chapter 7, we shall see that microRNA (miRNA) molecules and small interfering RNA (siRNA) molecules serve as key regulators of eukaryotic gene expression, and that piwi-interacting RNAs (piRNAs) protect animal germ lines from transposons; we also discuss the long noncoding RNAs (lncRNAs), a diverse set of RNAs, many of whose functions are just being discovered (Table 6-1).
细胞 DNA 携带的大多数基因指定蛋白质的氨基酸序列,从这些基因复制的 RNA(最终指导蛋白质合成)被称为信使 RNA(mRNA)分子。其他基因的最终产物是 RNA 分子本身。这些 RNA 被称为非编码 RNA,因为它们不编码蛋白质。在一个研究充分的单细胞真核生物——酵母 Saccharomyces cerevisiae 中,超过 1200 个基因(约总数的 )产生 RNA 作为它们的最终产物。人类产生大约 5000 种不同的非编码 RNA。这些 RNA 像蛋白质一样,在细胞中的各种过程中充当酶、结构和调节组分。在第 5 章,我们遇到了其中一个作为酶端粒酶携带的模板 RNA。我们将在本章看到,核糖体 RNA(rRNA)分子构成核糖体的核心,转移 RNA(tRNA)分子作为选择氨基酸并将其固定在核糖体上以合并到蛋白质中的适配器,小核 RNA(snRNA)分子指导前 mRNA 的剪接形成 mRNA。 在第 7 章中,我们将看到微小 RNA(miRNA)分子和小干扰 RNA(siRNA)分子作为真核基因表达的关键调节因子,并且 piwi 相互作用 RNA(piRNAs)保护动物生殖细胞系免受转座子的影响;我们还讨论了长非编码 RNA(lncRNAs),这是一组多样的 RNA,其中许多功能刚刚被发现(表 6-1)。

Figure 6-10 Transcription of two genes as observed under the electron microscope. The micrograph shows many molecules of RNA polymerase simultaneously transcribing each of two adjacent genes. Molecules of RNA polymerase are visible as a series of dots along the DNA with the newly synthesized transcripts (fine threads) attached to them. The RNA molecules (ribosomal RNAs; rRNAs) shown in this example are not translated into protein but are instead used directly as components of ribosomes, the machines on which translation takes place. The particles at the 5 ' end (the free end) of each rRNA transcript are believed to reflect the beginnings of ribosome assembly. From the relative lengths of the newly synthesized transcripts, it can be deduced that the RNA polymerase molecules are transcribing from right to left. (Courtesy of Ulrich Scheer.)
图 6-10 两个基因的转录在电子显微镜下观察到。显微照片显示许多 RNA 聚合酶分子同时转录两个相邻基因中的每一个。RNA 聚合酶分子可见于沿着 DNA 的一系列点,新合成的转录本(细丝)附着在它们上面。在这个例子中显示的 RNA 分子(核糖体 RNA;rRNA)不被翻译成蛋白质,而是直接用作核糖体的组成部分,即翻译发生的机器。每个 rRNA 转录本的 5'端(自由端)的颗粒被认为反映了核糖体组装的开始。通过新合成转录本的相对长度,可以推断 RNA 聚合酶分子是从右向左转录的。(由 Ulrich Scheer 提供。)
TABLE 6-1 Principal Types of RNAs Produced in Cells
表 6-1 细胞中产生的主要 RNA 类型
| Type of RNA RNA 的类型 | Function 功能 |
| --- | --- |
| mRNAs | Messenger RNAs, code for proteins 信使 RNA,编码蛋白质 |
| rRNAs | Ribosomal RNAs, form the basic structure of the ribosome and catalyze protein synthesis 核糖体 RNA,构成核糖体的基本结构,并催化蛋白质合成 |
| tRNAs | Transfer RNAs, central to protein synthesis as the adaptors between mRNA and amino acids 转运 RNA,对蛋白质合成至关重要,作为 mRNA 和氨基酸之间的适配器 |
| Telomerase RNA 端粒酶 RNA | Serves as the template for the telomerase enzyme that extends the ends of chromosomes 作为延长染色体末端的端粒酶的模板 |
| snRNAs | Small nuclear RNAs, function in a variety of nuclear processes, including the splicing of pre-mRNA 小核 RNA,在各种核过程中发挥作用,包括前 mRNA 的剪接 |
| snoRNAs | Small nucleolar RNAs, help to process and chemically modify rRNAs 小核仁 RNA,帮助加工和化学修饰 rRNA |
| lncRNAs | Long noncoding RNAs, not all of which appear to have a function; some serve as scaffolds and regulate diverse cell processes, including X-chromosome inactivation 长非编码 RNA,并非所有都具有功能;一些作为支架,调控多样的细胞过程,包括 X 染色体失活 |
| miRNAs | MicroRNAs, regulate gene expression by blocking translation of specific mRNAs and causing their degradation 微小 RNA,通过阻止特定 mRNA 的翻译并导致其降解来调控基因表达 |
| siRNAs | Small interfering RNAs, turn off gene expression by directing the degradation of selective mRNAs and helping to establish repressive chromatin structures 小干扰 RNA,通过指导选择性 mRNA 的降解和帮助建立抑制性染色质结构来关闭基因表达 |
| piRNAs | Piwi-interacting RNAs, bind to piwi proteins and protect the germ line from transposable elements Piwi 相互作用 RNA,结合到 piwi 蛋白,并保护生殖细胞系免受可移动元件的影响 |
Each transcribed segment of DNA is called a transcription unit. In eukaryotes, a transcription unit typically carries the information of just one gene, and therefore codes for either a single RNA molecule or a single protein (or group of related proteins if the initial RNA transcript is spliced in more than one way to produce different mRNAs). In bacteria, a set of adjacent genes is often transcribed as a unit; the resulting mRNA molecule therefore carries the information for producing several distinct proteins.
DNA 的每个转录片段称为转录单元。在真核生物中,一个转录单元通常携带一个基因的信息,因此编码一个 RNA 分子或一个蛋白质(或者如果初始 RNA 转录本以多种方式剪接以产生不同的 mRNA,则编码一组相关蛋白质)。在细菌中,一组相邻的基因通常作为一个单元转录;因此产生的 mRNA 分子携带了产生几种不同蛋白质的信息。
Overall, RNA makes up a few percent of a cell's dry weight, whereas proteins compose about 50%. Most of the RNA in cells is rRNA; mRNA composes only 3–5% of the total RNA in a typical mammalian cell. The mRNA population is made up of tens of thousands of different species, and there are on average only 10-15 molecules of each species of mRNA present in each cell.
总的来说,RNA 占细胞干重的几个百分点,而蛋白质约占 50%。细胞中大部分 RNA 是 rRNA;mRNA 仅占典型哺乳动物细胞中总 RNA 的 3–5%。mRNA 群体由数万种不同的分子种类组成,每种 mRNA 在每个细胞中平均只有 10-15 个分子。

Signals Encoded in DNA Tell RNA Polymerase Where to Start and Stop
DNA 中编码的信号告诉 RNA 聚合酶在何处开始和停止

To transcribe a gene accurately, RNA polymerase must recognize where on the genome to start and where to finish. The way in which RNA polymerases perform these tasks differs somewhat between bacteria and eukaryotes. Because the processes in bacteria are simpler, we discuss them first.
为了准确转录基因,RNA 聚合酶必须识别基因组的起始和终止位置。RNA 聚合酶执行这些任务的方式在细菌和真核生物之间略有不同。由于细菌中的过程较为简单,我们首先讨论它们。
The initiation of transcription is an especially important step in gene expression because it is the main point at which the cell regulates which proteins are to be produced and at what rate. The bacterial RNA polymerase core enzyme is a multisubunit complex that synthesizes RNA using the DNA template as a guide. An additional subunit called sigma (σ) factor associates with the core enzyme and assists it in reading the signals in the DNA that tell it where to begin transcribing (Figure 6-11). Together, σ factor and the core enzyme are known as the
转录的启动是基因表达中特别重要的一步,因为这是细胞调控应该产生哪些蛋白质以及产生速率的主要点。细菌 RNA 聚合酶核心酶是一个多亚基复合物,使用 DNA 模板作为指导合成 RNA。一个名为 sigma (σ) 因子的额外亚基与核心酶结合,并协助其阅读 DNA 中的信号,告诉它在哪里开始转录(图 6-11)。σ 因子和核心酶一起被称为

Figure 6-11 The transcription cycle of bacterial RNA polymerase. (A) In step 1, the RNA polymerase holoenzyme (polymerase core enzyme plus σ factor) assembles and then, by sliding, locates a promoter DNA sequence (see Figure 6-12). The polymerase opens (unwinds) the DNA at the position at which transcription is to begin (step 2) and begins transcribing (step 3). This initial RNA synthesis (abortive initiation) is relatively inefficient as short, unproductive transcripts are often released. However, once RNA polymerase has managed to synthesize about 10 nucleotides of RNA, it breaks its interactions with the promoter DNA (step 4) and eventually releases σ factor as the polymerase tightens around the DNA and shifts to the elongation mode of RNA synthesis, moving along the DNA (step 5). During the elongation mode, transcription is highly processive, with the polymerase leaving the DNA template and releasing the newly transcribed RNA only when it encounters a termination signal (steps 6 and 7). Termination signals are typically encoded in DNA, and many function by forming an RNA hairpin-like structure that destabilizes the polymerase's hold on the RNA.
图 6-11 细菌 RNA 聚合酶的转录循环。(A)在第 1 步中,RNA 聚合酶全酶(聚合酶核心酶加上 σ 因子)组装,然后通过滑动,定位到启动子 DNA 序列(见图 6-12)。聚合酶在转录即将开始的位置打开(解旋)DNA(第 2 步),并开始转录(第 3 步)。这种初始 RNA 合成(中止起始)相对低效,因为通常会释放出短而无效的转录本。然而,一旦 RNA 聚合酶成功合成大约 10 个核苷酸的 RNA,它会中断与启动子 DNA 的相互作用(第 4 步),并最终释放 σ 因子,同时聚合酶收紧包裹在 DNA 周围,并转变为 RNA 合成的伸长模式,沿着 DNA 移动(第 5 步)。在伸长模式期间,转录是高度连续的,聚合酶只有在遇到终止信号时才会离开 DNA 模板并释放新合成的 RNA(第 6 和第 7 步)。终止信号通常编码在 DNA 中,许多终止信号通过形成类似 RNA 发夹的结构来破坏聚合酶对 RNA 的保持。
In bacteria, all RNA molecules are synthesized by a single type of RNA polymerase, and the cycle depicted in the figure therefore applies to the production of mRNAs as well as structural and catalytic RNAs. (B) Two-dimensional image of an elongating bacterial RNA polymerase, as determined by atomic force microscopy. (C) Interpretation of the image in B. (Adapted from K.M. Herbert et al., Annu. Rev. Biochem. 77:149-176, 2008. With permission from Annual Reviews.)
在细菌中,所有 RNA 分子都由一种类型的 RNA 聚合酶合成,因此图中描述的循环也适用于 mRNA 以及结构和催化 RNA 的产生。(B) 细菌 RNA 聚合酶的伸长二维图像,由原子力显微镜确定。(C) 对 B 中图像的解释。(摘自 K.M. Herbert 等人,Annu. Rev. Biochem. 77:149-176, 2008。获得 Annual Reviews 许可。)
RNA polymerase holoenzyme; this complex adheres only weakly to DNA when the two collide, and a holoenzyme typically slides rapidly along the long bacterial DNA molecule and then dissociates. However, when the polymerase holoenzyme slides into a promoter, a special sequence of nucleotides indicating the starting point for RNA synthesis, the polymerase binds tightly because its σ factor makes specific contacts with the edges of bases exposed on the outside of the DNA double helix (step 1 in Figure 6-11A).
RNA 聚合酶全酶;当这两者碰撞时,这个复合物只会弱弱地与 DNA 结合,全酶通常会快速沿着长的细菌 DNA 分子滑动,然后解离。然而,当聚合酶全酶滑入启动子(一种表明 RNA 合成起始点的特殊核苷酸序列)时,聚合酶会紧密结合,因为它的 σ 因子与 DNA 双螺旋外侧暴露的碱基边缘有特定接触(图 6-11A 中的步骤 1)。
The tightly bound RNA polymerase holoenzyme at a promoter opens up the double helix to expose a short stretch of nucleotides on each strand (step 2 in Figure 6-11A). The region of unpaired DNA (about 10 nucleotides) is called the transcription bubble, and it is stabilized by the binding of σ factor to the unpaired bases on one of the exposed strands. The other exposed DNA strand then acts as a template for complementary base-pairing with incoming ribonucleotides, two of which are joined together by the polymerase to begin an RNA chain (step 3 in Figure 6-11A).
RNA 聚合酶全酶在启动子处紧密结合,打开双螺旋以暴露每条链上的一小段核苷酸(图 6-11A 中的第 2 步)。未配对的 DNA 区域(约 10 个核苷酸)称为转录泡,它通过 σ 因子与其中一条暴露链上的未配对碱基结合而稳定。另一条暴露的 DNA 链然后作为互补碱基与输入的核糖核苷酸配对的模板,其中两个核苷酸被聚合酶连接在一起以开始 RNA 链(图 6-11A 中的第 3 步)。
The first 10 or so nucleotides of RNA are synthesized using a "scrunching" mechanism, in which RNA polymerase remains bound to the promoter and pulls the upstream DNA into its active site, thereby expanding the transcription bubble. This process creates considerable stress, and the short RNAs are often released, thereby relieving the stress and forcing the polymerase, which remains in place, to begin synthesis over again. Eventually this process of abortive initiation is overcome, and the stress generated by scrunching helps the core enzyme to break free of its interactions with the promoter DNA (step 4 in Figure 6-11A) and discard the σ factor (step 5 in Figure 6-11A).
RNA 的前 10 个核苷酸左右是使用“蜷缩”机制合成的,其中 RNA 聚合酶保持与启动子结合,并将上游 DNA 拉入其活性位点,从而扩展转录泡。这个过程会产生相当大的压力,短的 RNA 经常会被释放出来,从而减轻压力并迫使保持在原位的聚合酶重新开始合成。最终,这种中止起始的过程被克服,蜷缩产生的压力帮助核心酶摆脱与启动子 DNA 的相互作用(图 6-11A 中的第 4 步),并丢弃 σ 因子(图 6-11A 中的第 5 步)。
At this point, the polymerase begins to move down the DNA, synthesizing RNA in a stepwise fashion: the polymerase moves forward one base pair for every nucleotide added. During this process, the transcription bubble continually expands at the front of the polymerase and contracts at its rear. Chain elongation continues (at a speed of approximately 50 nucleotides per second for bacterial RNA polymerases) until the enzyme encounters a second signal, the terminator (step 6 in Figure 6-11A), where the polymerase halts and releases both the newly made RNA molecule and the DNA template (step 7 in Figure 6-11A). The free polymerase core enzyme then reassociates with a free σ factor to form a holoenzyme that can begin the process of transcription again (step 8 in Figure 6-11A).
在这一点上,聚合酶开始沿着 DNA 向下移动,以逐步方式合成 RNA:每添加一个核苷酸,聚合酶向前移动一个碱基对。在这个过程中,转录泡在聚合酶前端不断扩张,在后端收缩。链的延伸继续进行(对于细菌 RNA 聚合酶,速度约为每秒 50 个核苷酸),直到酶遇到第二个信号,终止子(图 6-11A 中的第 6 步),聚合酶停止并释放新合成的 RNA 分子和 DNA 模板(图 6-11A 中的第 7 步)。然后,自由的聚合酶核心酶再与一个自由的 σ 因子重新结合,形成一个可以再次开始转录过程的全酶(图 6-11A 中的第 8 步)。
The process of transcription initiation is complicated and requires that the RNA polymerase holoenzyme and the DNA undergo a series of conformational changes, first opening the DNA double helix at promoters and subsequently tightening the enzyme around the DNA and RNA so that it does not dissociate before it has finished transcribing a gene. If an RNA polymerase does dissociate prematurely, it must start over again at the promoter.
转录起始过程复杂,需要 RNA 聚合酶全酶和 DNA 经历一系列构象变化,首先在启动子处打开 DNA 双螺旋,随后紧密将酶包裹在 DNA 和 RNA 周围,以确保在转录完基因之前不会解离。如果 RNA 聚合酶过早解离,必须重新从启动子开始。
How do the termination signals in the DNA stop the elongating polymerase? For most bacterial genes, a termination signal consists of a string of A-T nucleotide pairs preceded by a twofold symmetric DNA sequence, which, when transcribed into RNA, folds into a "hairpin" structure through Watson-Crick base-pairing (see Figure 6-92). As the polymerase transcribes across a terminator, the formation of the hairpin helps release the RNA transcript, which is held in place by relatively weak A-T and U-A base pairs (step 7 in Figure 6-11A). As we shall see, the folding of RNA into specific structures affects many steps in decoding the genome.
DNA 中的终止信号如何停止延伸的聚合酶?对于大多数细菌基因,终止信号由一串 A-T 核苷酸对组成,前面是一个对称的 DNA 序列,当转录成 RNA 时,通过 Watson-Crick 碱基配对形成一个“发夹”结构(见图 6-92)。当聚合酶跨越终止子时,发夹的形成有助于释放 RNA 转录本,这些转录本由相对较弱的 A-T 和 U-A 碱基对保持在原位(图 6-11A 中的第 7 步)。正如我们将看到的,RNA 折叠成特定结构会影响基因组解码中的许多步骤。
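The terminator architecture described above (a twofold symmetric sequence that folds into a hairpin, followed by a run of U residues in the transcript) can be searched for with a short Python sketch. This is only an illustration under simplifying assumptions: the stem must pair perfectly, the stem length, loop sizes, and U-run length are arbitrary illustrative defaults, and the function names and test sequence are hypothetical. Real terminators tolerate mismatches and are usually scored by folding energy instead.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair with `rna` in an antiparallel stem."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_terminators(rna: str, stem: int = 5, loop_sizes=range(3, 9), min_u: int = 4):
    """Return (start, end) index pairs of perfect stem-loops directly followed by a U run.

    `start` is the first base of the left arm of the stem; `end` is one past
    the last base of the right arm, where the U run begins.
    """
    hits = []
    n = len(rna)
    for start in range(n - 2 * stem):
        left_arm = rna[start:start + stem]
        wanted_right_arm = reverse_complement(left_arm)
        for loop in loop_sizes:
            right_start = start + stem + loop
            right_end = right_start + stem
            if right_end + min_u > n:
                break  # larger loops would run past the end of the transcript
            has_stem = rna[right_start:right_end] == wanted_right_arm
            has_u_run = rna[right_end:right_end + min_u] == "U" * min_u
            if has_stem and has_u_run:
                hits.append((start, right_end))
    return hits


# Hypothetical transcript: GCCCG / AAUA loop / CGGGC stem, then six U residues.
transcript = "CCGCCCGAAUACGGGCUUUUUUGA"
print(find_terminators(transcript))
```

Because so many different sequences can form such a stem, a scan of this kind finds many candidates in a real genome, which is one reason terminators are hard to pin down from sequence alone.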

Bacterial Transcription Start and Stop Signals Are Heterogeneous in Nucleotide Sequence
细菌转录起始和终止信号在核苷酸序列上是异质的

As we have just seen, the processes of transcription initiation and termination involve a complicated series of structural transitions in protein, DNA, and RNA molecules. The signals encoded in DNA that specify these transitions are often difficult for researchers to recognize. Indeed, a comparison of many different bacterial promoters reveals a surprising degree of variation. Nevertheless, they all contain related sequences, reflecting aspects of the DNA that are recognized
正如我们刚刚看到的,转录起始和终止过程涉及蛋白质、DNA 和 RNA 分子中一系列复杂的结构转变。编码在 DNA 中的信号指定这些转变通常很难让研究人员识别。事实上,对许多不同细菌启动子的比较揭示了惊人的变化程度。然而,它们都包含相关序列,反映了被识别的 DNA 方面。
directly by the σ factor. These common features are often summarized in the form of a consensus sequence (Figure 6-12). A consensus nucleotide sequence is derived by comparing many sequences with the same basic function and tallying up the most common nucleotides found at each position. It therefore serves as a summary or "average" of a large number of individual nucleotide sequences. A more accurate way of displaying the range of DNA sequences recognized by a protein is through the use of a sequence logo, which reveals the relative frequencies of each nucleotide at each position (Figure 6-12C).
直接由 σ 因子识别。这些共同特征通常以共识序列的形式总结(图 6-12)。共识核苷酸序列是通过比较具有相同基本功能的许多序列并统计在每个位置找到的最常见核苷酸而导出的。因此,它作为大量个体核苷酸序列的摘要或“平均”。显示蛋白质识别的 DNA 序列范围的更准确方法是通过使用序列标志,它显示了每个位置上每种核苷酸的相对频率(图 6-12C)。
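As a rough illustration of how a consensus sequence and the letter-stack heights of a sequence logo are obtained, the following Python sketch tallies nucleotides column by column in a set of pre-aligned sequences. The six hexamers are made-up examples, not data from the 300 promoters of Figure 6-12, and the information content is computed as 2 bits minus the Shannon entropy of each column, without the small-sample correction that logo software usually applies.

```python
import math
from collections import Counter


def column_counts(aligned):
    """Count the nucleotides found at each position of equal-length sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    return [Counter(seq[i] for seq in aligned) for i in range(length)]


def consensus(aligned):
    """Most common nucleotide at each position (ties broken alphabetically)."""
    return "".join(
        min(counts, key=lambda base: (-counts[base], base))
        for counts in column_counts(aligned)
    )


def information_bits(aligned):
    """Information content per position: 2 bits minus the column's entropy."""
    total = len(aligned)
    bits = []
    for counts in column_counts(aligned):
        entropy = -sum((k / total) * math.log2(k / total) for k in counts.values())
        bits.append(2.0 - entropy)
    return bits


# Hypothetical aligned hexamers, for illustration only.
hexamers = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATACT", "TATAAT"]
print(consensus(hexamers))
print([round(b, 2) for b in information_bits(hexamers)])
```

A position where every sequence agrees carries the full 2 bits, whereas a position that tolerates several bases carries less; this is the quantity a sequence logo plots as the total height of the letter stack.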
The DNA sequences of individual bacterial promoters differ in ways that determine their strength, that is, the number of initiation events per unit time. Evolutionary processes have fine-tuned each to initiate as often as necessary and have thereby created a wide spectrum of promoter strengths. Promoters for genes that code for abundant proteins are much stronger than those associated with genes that encode rare proteins, and the nucleotide sequences of their promoters are responsible for these differences.
个别细菌启动子的 DNA 序列在某些方面不同,决定了它们的强度,即单位时间内的启动事件数量。进化过程已经对每个启动子进行了微调,使其能够按需启动,并因此创造了各种各样的启动子强度。编码丰富蛋白质的基因的启动子比编码稀有蛋白质的基因相关的启动子要强得多,它们的启动子核苷酸序列负责这些差异。
Like bacterial promoters, transcription terminators also have a wide range of sequences, with the potential to form a simple hairpin RNA structure being the most important common feature. Because an almost unlimited number of nucleotide sequences have this potential, terminator sequences are even more heterogeneous than promoter sequences.
与细菌启动子一样,转录终止子也具有广泛的序列范围,形成简单的发夹 RNA 结构的潜力是最重要的共同特征。由于几乎无限数量的核苷酸序列具有这种潜力,终止子序列甚至比启动子序列更加异质。
We have discussed bacterial promoters and terminators in some detail to illustrate an important point regarding the analysis of genome sequences. Although we know a great deal about bacterial promoters and terminators and can construct "average" sequences that summarize their most salient features, their variation in nucleotide sequence makes it difficult to definitively locate them simply by
我们已经详细讨论了细菌启动子和终止子,以阐明关于基因组序列分析的一个重要观点。尽管我们对细菌启动子和终止子了解很多,并且可以构建总结它们最显著特征的“平均”序列,但它们在核苷酸序列上的变异使得仅仅通过核苷酸序列来明确定位它们变得困难。

Figure 6-12 Consensus nucleotide sequence and sequence logo for the major class of Escherichia coli promoters. (A) On the basis of a comparison of 300 promoters, the frequencies of each of the four nucleotides at each position in the promoter are given. The consensus sequence, shown directly below the histogram, reflects the most common nucleotide found at each position in the collection of promoters. These promoters are characterized by two hexameric DNA sequences, the -35 sequence and the -10 sequence, named for their approximate location relative to the start point of transcription (designated +1). The sequence of nucleotides between the -35 and -10 hexamers shows little similarity among promoters, but the spacing matters. For convenience, the nucleotide sequence of a single strand of DNA is shown; in reality, promoters are double-stranded DNA. The nucleotides shown in the figure are recognized by σ factor, a subunit of the RNA polymerase holoenzyme. (B) The distribution of spacing between the -35 and -10 hexamers found in E. coli promoters. (C) A sequence logo displaying the same information as in panel A. Here, the height of each letter is proportional to the frequency at which that base occurs at that position across a wide variety of promoter sequences. The total height of all the letters at each position is proportional to the information content (expressed in bits) at that position. For example, the total information content of a position that can tolerate several different bases is small (see the last three bases of the -35 sequence) but statistically greater than random.
图 6-12 大肠杆菌主要类别启动子的共识核苷酸序列和序列标志。 (A) 通过对 300 个启动子的比较,给出了启动子中每个位置上四种核苷酸的频率。直方图下方直接显示的共识序列反映了在启动子集合中每个位置上找到的最常见核苷酸。这些启动子以两个六核苷酸 DNA 序列为特征,即 -35 序列和 -10 序列,根据它们相对于转录起始点(标记为 +1)的大致位置而命名。-35 和 -10 六核苷酸之间的核苷酸序列在启动子之间显示出很少的相似性,但间距很重要。为方便起见,显示了单链 DNA 的核苷酸序列;实际上,启动子是双链 DNA。图中显示的核苷酸由 σ 因子(RNA 聚合酶全酶的一个亚基)识别。 (B) 大肠杆菌启动子中 -35 和 -10 六核苷酸之间间距的分布。 (C) 显示与面板 A 相同信息的序列标志。在这里,每个字母的高度与该碱基在各种启动子序列中出现在该位置的频率成比例。每个位置上所有字母的总高度与该位置的信息含量(以比特表示)成比例。例如,可以容忍几种不同碱基的位置,其总信息含量较小(见 -35 序列的最后三个碱基),但在统计上仍大于随机水平。
analysis of the nucleotide sequence of a genome. It is even more difficult to locate analogous sequences in eukaryotic genomes, due in part to the excess DNA carried in these genomes. Often we need additional information, some of it from direct experimentation, to locate and accurately interpret the short DNA signals in genomes.
对基因组的核苷酸序列进行分析。在真核基因组中定位类似序列更加困难,部分原因是这些基因组携带的 DNA 过多。通常我们需要额外信息,其中一些来自直接实验,以定位并准确解释基因组中的短 DNA 信号。
As shown in Figure 6-12, promoter sequences are asymmetric, ensuring that RNA polymerase can bind in only one orientation. Because the polymerase can synthesize RNA only in the 5′-to-3′ direction, the promoter orientation specifies the strand to be used as a template. Genome sequences reveal that the DNA strand that is used as the template for RNA synthesis varies from gene to gene, depending on the orientation of the promoter (Figure 6-13).
如图 6-12 所示,启动子序列是不对称的,确保 RNA 聚合酶只能以一种方向结合。由于聚合酶只能在 5′ 到 3′ 方向合成 RNA,启动子方向指定了要用作模板的链。基因组序列显示,用作 RNA 合成模板的 DNA 链因基因而异,取决于启动子的方向(图 6-13)。
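The link between promoter orientation and template strand can be made concrete with a small Python sketch that scans both strands of a DNA sequence for a -35 and a -10 hexamer. Several things here are assumptions not stated in the text: the hexamers TTGACA and TATAAT are the commonly cited consensus for the major E. coli promoter class, the allowed spacing of 15 to 19 nucleotides and the one-mismatch tolerance are illustrative choices, and all function names and the test sequence are hypothetical. As the text notes, a scan of this kind cannot definitively locate real promoters.

```python
MINUS_35 = "TTGACA"  # assumed -35 consensus hexamer
MINUS_10 = "TATAAT"  # assumed -10 consensus hexamer
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(dna: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def mismatches(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


def scan_strand(seq: str, max_mismatch: int = 1, spacers=range(15, 20)):
    """Return (start of -35 hexamer, spacer length) for each candidate on one strand."""
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap  # start of the putative -10 hexamer
            if j + 6 > len(seq):
                break
            if mismatches(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits.append((i, gap))
    return hits


def find_promoters(top_strand: str):
    """Scan both strands; positions are counted from the 5' end of the strand scanned."""
    results = []
    for i, gap in scan_strand(top_strand):
        # Promoter reads left to right on the top strand: the bottom strand is the template.
        results.append({"strand": "top", "minus35_start": i, "spacer": gap,
                        "direction": "left-to-right", "template": "bottom"})
    for i, gap in scan_strand(revcomp(top_strand)):
        # Promoter reads right to left: the top strand is the template.
        results.append({"strand": "bottom", "minus35_start": i, "spacer": gap,
                        "direction": "right-to-left", "template": "top"})
    return results


# Hypothetical sequence with one promoter-like site on the top strand.
top = "CC" + "TTGACA" + "GCGCGCGCGCGCGCGCG" + "TATAAT" + "GCCGCACG"
print(find_promoters(top))
```

Scanning the reverse complement is what makes the search orientation-aware: the same pair of hexamers found on the other strand implies transcription in the opposite direction and the opposite template strand, as in Figure 6-13.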
Having considered transcription in bacteria, we now turn to the situation in eukaryotes, where the synthesis of RNA molecules is a much more elaborate affair.
考虑到细菌中的转录,我们现在转向真核生物中的情况,在那里 RNA 分子的合成是一个更加复杂的事务。

Transcription Initiation in Eukaryotes Requires Many Proteins
真核生物的转录起始需要许多蛋白质

In contrast to bacteria, which contain a single type of RNA polymerase, eukaryotic nuclei have three: RNA polymerase I, RNA polymerase II, and RNA polymerase III. The three polymerases are structurally similar to one another and share some common subunits, but they transcribe different categories of genes (Table 6-2). RNA polymerases I and III transcribe the genes encoding transfer RNA, ribosomal RNA, and various small RNAs. RNA polymerase II transcribes most genes, including all those that encode proteins, and our subsequent discussion therefore focuses on this enzyme.
与只含有一种 RNA 聚合酶的细菌相比,真核细胞核有三种:RNA 聚合酶 I、RNA 聚合酶 II 和 RNA 聚合酶 III。这三种聚合酶在结构上相似,并共享一些共同的亚基,但它们转录不同类别的基因(表 6-2)。RNA 聚合酶 I 和 III 转录编码转移 RNA、核糖体 RNA 和各种小 RNA 的基因。RNA 聚合酶 II 转录大多数基因,包括编码蛋白质的所有基因,因此我们后续的讨论重点放在这个酶上。
Eukaryotic RNA polymerase II has many structural similarities to bacterial RNA polymerase (Figure 6-14). But there are several important differences in the way in which the bacterial and eukaryotic enzymes function, two of which concern us immediately.
真核 RNA 聚合酶 II 在结构上与细菌 RNA 聚合酶有许多相似之处(图 6-14)。但在细菌和真核酶功能方式上存在几个重要差异,其中有两个立即引起我们关注。
  1. While bacterial RNA polymerase requires only a single transcription-initiation factor (σ) to begin transcription, eukaryotic RNA polymerases require many such factors, collectively called the general transcription factors.
    细菌 RNA 聚合酶只需要一个转录起始因子(σ)来开始转录,而真核细胞 RNA 聚合酶需要许多这样的因子,统称为一般转录因子。
  2. Eukaryotic transcription initiation must take place on DNA that is packaged into nucleosomes and higher-order forms of chromatin structure (described in Chapter 4), features that are absent from bacterial chromosomes.
    真核转录起始必须发生在包装成核小体和更高阶的染色质结构形式(在第 4 章中描述)上的 DNA 上,这些特征在细菌染色体中是不存在的。
TABLE 6-2 The Three RNA Polymerases in Eukaryotic Cells
表 6-2 真核细胞中的三种 RNA 聚合酶
| Type of polymerase 聚合酶的类型 | Genes transcribed 转录的基因 |
| --- | --- |
| RNA polymerase I RNA 聚合酶 I | 5.8S, 18S, and 28S rRNA genes 5.8S、18S 和 28S rRNA 基因 |
| RNA polymerase II RNA 聚合酶 II | All protein-coding genes, plus snoRNA genes, miRNA genes, siRNA genes, lncRNA genes, and most snRNA genes 所有编码蛋白质的基因,加上 snoRNA 基因、miRNA 基因、siRNA 基因、lncRNA 基因,以及大多数 snRNA 基因 |
| RNA polymerase III RNA 聚合酶 III | tRNA genes, 5S rRNA genes, some snRNA genes, and genes for other small RNAs tRNA 基因、5S rRNA 基因、一些 snRNA 基因以及其他小 RNA 的基因 |
The rRNAs were named according to their "S" values, which refer to their rate of sedimentation in an ultracentrifuge. The larger the value, the larger the rRNA.
rRNA 根据其在超速离心机中沉降速率的“S”值进行命名。数值越大,rRNA 越大。

Figure 6-13 Directions of transcription along a short portion of a bacterial chromosome. Some genes are transcribed using one DNA strand as a template, while others are transcribed using the other DNA strand. The direction of transcription is determined by the orientation of the promoter at the beginning of each gene (green arrowheads). This diagram shows approximately 0.2% (9000 base pairs) of the E. coli chromosome. The genes transcribed from left to right use the bottom DNA strand as the template; those transcribed from right to left use the top strand as the template. DNA that is not transcribed is indicated in gray.
图 6-13 细菌染色体一小部分的转录方向。一些基因使用一条 DNA 链作为模板进行转录,而另一些基因使用另一条 DNA 链进行转录。转录方向由每个基因开头的启动子的方向(绿色箭头)确定。该图显示大约 0.2%(9000 个碱基对)的大肠杆菌染色体。从左到右转录的基因使用底部 DNA 链作为模板;从右到左转录的基因使用顶部链作为模板。未转录的 DNA 以灰色表示。
Figure 6-14 Structural similarity between a bacterial RNA polymerase and a eukaryotic RNA polymerase II. Regions of the two RNA polymerases that have similar structures are indicated in green. The eukaryotic polymerase is larger than the bacterial enzyme (12 subunits instead of 5), and some of the additional regions are shown in gray. The red sphere represents the Mg atom present at the active site, where polymerization takes place, while the blue spheres denote Zn atoms that serve as structural components. The RNA polymerases in all modern-day cells (bacteria, archaea, and eukaryotes) are closely related, indicating that the basic features of the enzyme were in place before the divergence of the three major branches of life. (Courtesy of P. Cramer and R. Kornberg.)
图 6-14 细菌 RNA 聚合酶和真核 RNA 聚合酶 II 之间的结构相似性。两种 RNA 聚合酶具有相似结构的区域用绿色标示。真核聚合酶比细菌酶大(12 个亚基而不是 5 个),一些额外区域显示为灰色。红色球代表存在于活性位点的 Mg 原子,聚合作用发生在此处,而蓝色球表示作为结构组分的 Zn 原子。所有现代细胞(细菌、古菌和真核生物)中的 RNA 聚合酶密切相关,表明酶的基本特征在这三大生命分支分化之前已经形成。 (由 P. Cramer 和 R. Kornberg 提供。)

To Initiate Transcription, RNA Polymerase II Requires a Set of General Transcription Factors
启动转录,RNA 聚合酶 II 需要一组通用转录因子

The general transcription factors help to position eukaryotic RNA polymerase correctly at the promoter, aid in pulling apart the two strands of DNA to allow transcription to begin, and release RNA polymerase from the promoter to start its elongation mode. The proteins are "general" because they are needed at nearly all promoters used by RNA polymerase II. They consist of a set of interacting proteins denoted arbitrarily as TFIIA, TFIIB, and so on (TFII standing for transcription factor for polymerase II). In a broad sense, the eukaryotic general transcription factors carry out functions that are equivalent to those of the σ factor in bacteria.
一般转录因子有助于正确定位真核 RNA 聚合酶在启动子处,帮助拉开 DNA 的两条链以允许转录开始,并释放 RNA 聚合酶从启动子以开始其延伸模式。这些蛋白质之所以被称为“一般”,是因为它们几乎在 RNA 聚合酶 II 使用的所有启动子上都是必需的。它们由一组相互作用的蛋白质组成,任意标记为 TFIIA、TFIIB 等(TFII 代表聚合酶 II 的转录因子)。从广义上讲,真核一般转录因子执行的功能相当于细菌中的 σ 因子。
Figure 6-15 illustrates how the general transcription factors assemble at promoters used by RNA polymerase II, and Table 6-3 summarizes their activities. The assembly process begins when TFIID binds to a short double-helical DNA sequence primarily composed of T and A nucleotides. For this reason, this sequence is known as the TATA sequence, or TATA box, and the subunit of TFIID that recognizes it is called TBP (for TATA-binding protein). The TATA box is typically located about 30 nucleotides upstream from the transcription start site. It is not the only DNA sequence that signals the start of transcription (Figure 6-16), but for many polymerase II promoters it is the most important. The binding of
图 6-15 说明了一般转录因子如何在 RNA 聚合酶 II 使用的启动子上组装,表 6-3 总结了它们的活动。组装过程始于 TFIID 结合到一个主要由 T 和 A 核苷酸组成的短双螺旋 DNA 序列。因此,这个序列被称为 TATA 序列,或 TATA 盒,识别它的 TFIID 亚基称为 TBP(TATA 结合蛋白)。TATA 盒通常位于距离转录起始位点约 30 个核苷酸的上游。它并不是唯一标志转录开始的 DNA 序列(图 6-16),但对于许多聚合酶 II 启动子来说,它是最重要的。
Figure 6-15 Initiation of transcription of a eukaryotic gene by RNA polymerase II. To begin transcription, RNA polymerase requires several general transcription factors. (A) Many promoters contain a DNA sequence called the TATA box, which, in humans, is located about 30 nucleotides away from the site at which transcription is initiated. (B) Through its subunit TBP, TFIID recognizes and binds the TATA box, which then enables the adjacent binding of TFIIB and TFIIA. (C) The RNA polymerase and the rest of the general transcription factors assemble at the promoter. (D) TFIIH then uses energy from ATP hydrolysis to pry apart the DNA double helix at the transcription start point, locally exposing the template strand. TFIIH also phosphorylates the long C-terminal polypeptide tail of RNA polymerase II, also called the C-terminal domain (CTD). This causes the polymerase to be released from the general factors and begin the elongation phase of transcription. For most genes, TFIID remains bound at the promoter whereas most of the other general transcription factors are released when the polymerase begins transcribing.
图 6-15 通过 RNA 聚合酶 II 启动真核基因的转录。要开始转录,RNA 聚合酶需要几个一般的转录因子。(A) 许多启动子包含一个称为 TATA 盒的 DNA 序列,在人类中,它位于距离转录起始点约 30 个核苷酸的位置。(B) 通过其亚基 TBP,TFIID 识别并结合 TATA 盒,然后使 TFIIB 和 TFIIA 相邻结合。(C) RNA 聚合酶和其余的一般转录因子在启动子处组装。(D) TFIIH 然后利用 ATP 水解的能量在转录起始点处分开 DNA 双螺旋,局部暴露模板链。TFIIH 还磷酸化 RNA 聚合酶 II 的长 C 端多肽尾巴,也称为 C 端结构域(CTD)。这导致聚合酶从一般因子中释放并开始转录的延伸阶段。对于大多数基因,TFIID 仍然绑定在启动子上,而当聚合酶开始转录时,大多数其他一般转录因子被释放。

TABLE 6-3 The General Transcription Factors Needed for Transcription Initiation by Eukaryotic RNA Polymerase II
表 6-3 真核 RNA 聚合酶 II 转录起始所需的一般转录因子
| Name 名称 | Number of subunits 亚基数量 | Roles in transcription initiation 在转录起始中的作用 |
| --- | --- | --- |
| TFIID | 12 | Recognizes TATA box and other DNA sequences near the transcription start point 识别 TATA 盒和靠近转录起始点的其他 DNA 序列 |
| TFIIB | 1 | Recognizes BRE element in promoters; accurately positions RNA polymerase at the start site of transcription 识别启动子中的 BRE 元件;准确地将 RNA 聚合酶定位在转录起始位点 |
| TFIIA | 2 | Not required in all promoters; stabilizes binding of TFIID 不是所有启动子都需要;稳定 TFIID 的结合 |
| TFIIF | 3 | Stabilizes RNA polymerase interaction with TFIIB; helps attract TFIIE and TFIIH 稳定 RNA 聚合酶与 TFIIB 的相互作用;帮助吸引 TFIIE 和 TFIIH |
| TFIIE | 2 | Attracts and regulates TFIIH 吸引并调节 TFIIH |
| TFIIH | 10 | Unwinds DNA at the transcription start point, phosphorylates Ser5 of the RNA polymerase C-terminal domain (CTD); releases RNA polymerase from the promoter 在转录起始点解开 DNA,磷酸化 RNA 聚合酶 C 端结构域(CTD)的 Ser5;从启动子释放 RNA 聚合酶 |
TFIID causes a large distortion in the DNA of the TATA box (Figure 6-17). This distortion is thought to serve as a physical landmark for the location of an active promoter in the midst of a very large genome, and it brings DNA sequences on both sides of the distortion closer together to allow for subsequent protein assembly steps. The additional general transcription factors then assemble, along with RNA polymerase II, to form a complete transcription initiation complex (see Figure 6-15). The most complicated of the general transcription factors is TFIIH (shown in pink). Consisting of 10 subunits, it is nearly as large as RNA polymerase II itself and, as we shall see shortly, performs several enzymatic steps needed for the initiation of transcription.
TFIID 导致 TATA 盒的 DNA 发生严重扭曲(图 6-17)。认为这种扭曲作为一个物理标志,用于在非常庞大的基因组中标记活跃启动子的位置,并使扭曲两侧的 DNA 序列靠近,以便进行后续的蛋白质组装步骤。然后,额外的一般转录因子与 RNA 聚合酶 II 一起组装,形成完整的转录起始复合物(见图 6-15)。最复杂的一般转录因子是 TFIIH(显示为粉色)。由 10 个亚基组成,几乎和 RNA 聚合酶 II 本身一样大,并且,我们很快将看到,它执行了转录起始所需的几个酶步骤。
After forming a transcription initiation complex on the promoter DNA, RNA polymerase II must gain access to the template strand at the transcription start point. TFIIH makes this step possible by hydrolyzing ATP and pulling apart the DNA strands at the start site, thereby exposing the template strand. Next, RNA polymerase II, like the bacterial polymerase, remains at the promoter synthesizing short lengths of RNA until it undergoes a series of conformational changes that allow it to move away from the promoter and enter the elongation phase of transcription. A key step in this transition is the addition of phosphate groups to the "tail" of the RNA polymerase (known as the CTD, or C-terminal domain). In
在启动子 DNA 上形成转录起始复合物后,RNA 聚合酶 II 必须获得访问转录起始点处模板链的权限。TFIIH 通过水解 ATP 并在起始位点处拉开 DNA 链,从而暴露模板链,使这一步骤成为可能。接下来,RNA 聚合酶 II 像细菌聚合酶一样,留在启动子处合成短长度的 RNA,直到经历一系列构象变化,使其能够远离启动子并进入转录的延伸阶段。这一转变中的关键步骤是向 RNA 聚合酶的“尾巴”(称为 CTD 或 C-末端结构域)添加磷酸基团。
| Element 元件 | Consensus sequence 共识序列 | General transcription factor 一般转录因子 |
| --- | --- | --- |
| BRE | G/C G/C G/A C G C C | TFIIB |
| TATA | T A T A A/T A A/T | TBP subunit of TFIID TFIID 的 TBP 亚基 |
| INR | C/T C/T A N T/A C/T C/T | TFIID |
| DPE | A/G G A/T C G T G | TFIID |
Figure 6-16 Consensus sequences found in the vicinity of eukaryotic RNA polymerase II start points. The name given to each consensus sequence (first column) and the general transcription factor that recognizes it (last column) are indicated. N indicates any nucleotide, and two nucleotides separated by a slash indicate an equal probability of either nucleotide at the indicated position. In reality, each consensus sequence is a shorthand representation of a histogram similar to that of Figure 6-12.
图 6-16 在真核 RNA 聚合酶 II 起始点附近发现的共识序列。 每个共识序列的名称(第一列)和识别它的一般转录因子(最后一列)均已指示。 N 表示任何核苷酸,斜杠分隔的两个核苷酸表示在指定位置上两种核苷酸的等概率。 实际上,每个共识序列都是类似于图 6-12 的直方图的简写表示。
For most RNA polymerase II transcription start points, only two or three of the four sequences are present. For example, many polymerase II promoters have a TATA box sequence, but those that do not typically have a "strong" INR sequence. Although most of the DNA sequences that influence transcription initiation are located upstream of the transcription start point, a few, such as the DPE shown in the figure, are located within the transcribed region.
对于大多数 RNA 聚合酶 II 转录起始点,只有四个序列中的两个或三个存在。例如,许多聚合酶 II 启动子具有 TATA 盒序列,但那些没有的通常具有“强”INR 序列。尽管大多数影响转录起始的 DNA 序列位于转录起始点的上游,但有一些,如图中所示的 DPE,位于转录区域内部。
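The consensus notation of Figure 6-16 can be read as a simple search pattern. The following is a minimal sketch, assuming that each position is written as a single letter, as `N` (any nucleotide), or as two letters separated by a slash; the function names `consensus_to_regex` and `find_sites` are hypothetical and chosen only for illustration. Exact matching of this kind is a simplification, because each consensus sequence really summarizes a histogram of nucleotide frequencies.

```python
import re
from typing import List

# Consensus strings copied from Figure 6-16.
TATA_BOX = "T A T A A/T A A/T"
DPE = "A/G G A/T C G T G"


def consensus_to_regex(consensus: str) -> "re.Pattern":
    """Convert a space-separated consensus into a compiled regular expression."""
    parts = []
    for position in consensus.split():
        if position == "N":
            parts.append("[ACGT]")  # any nucleotide
        elif "/" in position:
            parts.append("[" + position.replace("/", "") + "]")  # either nucleotide
        else:
            parts.append(position)  # one fixed nucleotide
    return re.compile("".join(parts))


def find_sites(consensus: str, dna: str) -> List[int]:
    """Return the 0-based start of every match, including overlapping ones."""
    pattern = consensus_to_regex(consensus).pattern
    # A lookahead lets the search report matches that overlap each other.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", dna)]


if __name__ == "__main__":
    print(find_sites(TATA_BOX, "GGCTATAAAAGGCC"))  # [3]
    print(find_sites(DPE, "TTAGACGTGAA"))          # [2]
```

A position-weight approach would be closer to the histogram view of Figure 6-12; the exact-match version is used here only because it mirrors the printed shorthand directly.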

humans, the CTD consists of 52 tandem repeats of a seven-amino-acid sequence, which extend from the RNA polymerase core structure. During transcription initiation, the serines located at the fifth position in each repeat sequence (Ser5) are phosphorylated by TFIIH, which contains a protein kinase in one of its subunits. Triggered by these phosphorylations, the polymerase disengages from the cluster of general transcription factors (see Figure 6-15D). During this process, it undergoes a series of conformational changes that tighten its interaction with DNA, and it acquires new proteins that allow it to transcribe for long distances, in some cases for many hours, without dissociating from DNA.
人类的 CTD 由 52 个七氨基酸序列的串联重复组成,这些序列从 RNA 聚合酶的核心结构延伸出来。在转录起始过程中,每个重复序列中第五个位置的丝氨酸(Ser5)会被 TFIIH 磷酸化,TFIIH 中的一个亚基含有蛋白激酶。在这些磷酸化的作用下,聚合酶会脱离一簇一般转录因子(见图 6-15D)。在这个过程中,它经历一系列构象变化,加强与 DNA 的相互作用,并获得新的蛋白质,使其能够在某些情况下长时间转录,有时甚至长达数小时,而不与 DNA 解离。
Once the polymerase II has begun elongating the RNA transcript, most of the general transcription factors are released from the DNA so that they are available to initiate another round of transcription with a new RNA polymerase molecule. As we see shortly, the phosphorylation of the tail of RNA polymerase II has an additional function: it causes components of the RNA-processing machinery to load onto the polymerase and thereby be positioned to modify the newly transcribed RNA as it emerges from the polymerase.
一旦 RNA 聚合酶 II 开始延伸 RNA 转录本,大多数一般转录因子就会从 DNA 上释放出来,这样它们就可以用新的 RNA 聚合酶分子启动另一轮转录。很快我们会看到,RNA 聚合酶 II 尾部的磷酸化还有一个额外的功能:它会导致 RNA 加工机制的组分加载到聚合酶上,并因此被定位在修改新转录的 RNA 时从聚合酶中出来。

In Eukaryotes, Transcription Initiation Also Requires Activator, Mediator, and Chromatin-modifying Proteins
在真核生物中,转录起始还需要激活因子、介导因子和染色质修饰蛋白

The model for transcription initiation just described is based on experiments performed in vitro using purified proteins and DNA. However, as discussed in Chapter 4, DNA in eukaryotic cells is packaged into nucleosomes, which are further arranged in higher-order chromatin structures. As a result, transcription initiation in a eukaryotic cell is more complex and requires more proteins than it does on purified DNA. First, regulatory proteins known as transcriptional activators must bind to specific sequences in DNA (called enhancers) to help attract the general transcription factors and RNA polymerase II to the start point of transcription (Figure 6-18). We discuss the role of these activators in Chapter 7, because they are one of the main ways in which cells regulate expression of their genes. Here we simply note that their presence on DNA is required for transcription initiation in a eukaryotic cell. Second, eukaryotic transcription initiation in vivo requires the presence of a large protein complex known as Mediator, which
刚才描述的转录起始模型是基于在体外使用纯化的蛋白质和 DNA 进行的实验。然而,正如第 4 章所讨论的,真核细胞中的 DNA 被包装成核小体,这些核小体进一步排列成更高级别的染色质结构。因此,在真核细胞中的转录起始更为复杂,需要比在纯化 DNA 上更多的蛋白质。首先,被称为转录激活因子的调控蛋白必须结合到 DNA 中的特定序列(称为增强子)上,以帮助吸引一般转录因子和 RNA 聚合酶 II 到转录起始点(图 6-18)。我们在第 7 章中讨论这些激活因子的作用,因为它们是细胞调控基因表达的主要途径之一。在这里我们只是指出,它们在 DNA 上的存在对真核细胞中的转录起始是必需的。其次,真核细胞中的转录起始在体内需要一个被称为中介因子的大蛋白质复合物的存在,

Figure 6-17 Three-dimensional structure of TBP (TATA-binding protein) bound to DNA. TBP is the subunit of the general transcription factor TFIID that is responsible for recognizing and binding the TATA box sequence in the DNA. The unique DNA bending caused by TBP (kinks in the double helix separated by partly unwound DNA) is thought to serve as a landmark that helps to attract the other general transcription factors (Movie 6.4). TBP is a single polypeptide chain that is folded into two very similar domains (blue and green). (Adapted from J.L. Kim et al., Nature .
图 6-17 TBP(TATA 结合蛋白)与 DNA 结合的三维结构。TBP 是一般转录因子 TFIID 的亚基,负责识别和结合 DNA 中的 TATA 盒序列。TBP 引起的独特 DNA 弯曲,由 TBP 在部分解开的 DNA 中间形成的双螺旋弯曲引起,被认为是一种地标,有助于吸引其他一般转录因子(电影 6.4)。TBP 是一个单肽链,折叠成两个非常相似的结构域(蓝色和绿色)。(改编自 J.L. Kim 等人,自然
Figure 6-18 Transcription initiation by RNA polymerase II in a eukaryotic cell. Transcription initiation in vivo requires the presence of transcription activator proteins. As described in Chapter 7, these proteins bind to short, specific sequences in DNA that are located in regulatory regions called enhancers. Although only one activator is shown here (in blue), a typical eukaryotic gene utilizes many DNA-bound transcription activator proteins, which act together to determine that gene's rate and pattern of transcription across different cell types. Often acting from a distance of many thousands of nucleotide pairs along DNA (indicated by the dashes), these proteins help RNA polymerase, the general transcription factors, and Mediator all to assemble at the promoter. In addition, ATP-dependent chromatin remodeling complexes and histone-modifying enzymes are needed at most genes. One of the main roles of Mediator is to coordinate the assembly of all these proteins at the promoter so that transcription can begin. As discussed in Chapter 4, the "default" state of eukaryotic DNA is to be packaged into nucleosomes and higher-order chromatin structures; for simplicity, these are not shown in this figure.
图 6-18 在真核细胞中 RNA 聚合酶 II 的转录起始。体内的转录起始需要转录激活蛋白的存在。正如第 7 章所述,这些蛋白结合到 DNA 中位于调控区域称为增强子的短特定序列上。尽管这里只显示了一个激活蛋白(蓝色),典型的真核基因利用许多 DNA 结合的转录激活蛋白,它们共同作用以确定该基因在不同细胞类型中的转录速率和模式。这些蛋白通常从 DNA 沿着许多千对核苷酸的距离(由短横线表示)起作用,它们帮助 RNA 聚合酶、一般转录因子和中介因子都在启动子处组装。此外,大多数基因需要依赖 ATP 的染色质重塑复合物和修饰组蛋白的酶。 中介者的主要作用之一是协调所有这些蛋白质在启动子处的组装,以便转录可以开始。正如第 4 章所讨论的,真核 DNA 的“默认”状态是被包装成核小体和更高级别的染色质结构;为简单起见,这些在图中未显示。

allows the activator proteins to communicate properly with the polymerase II and with the general transcription factors. Mediator also correctly positions TFIIH near the tail of RNA polymerase, facilitating phosphorylation of the tail and the consequent release of the polymerase from the promoter to begin synthesizing RNA. Finally, transcription initiation in a eukaryotic cell typically requires the recruitment of chromatin-modifying enzymes, including chromatin remodeling complexes and histone-modifying enzymes. As discussed in Chapter 4, both types of enzymes can increase access to the DNA in chromatin, and by doing so they facilitate the assembly of the transcription initiation machinery onto DNA.
允许激活蛋白与聚合酶 II 和一般转录因子进行正确沟通。介导体还能正确地将 TFIIH 定位在 RNA 聚合酶的尾部附近,促进尾部的磷酸化以及随后的聚合酶从启动子上释放以开始合成 RNA。最后,在真核细胞中,转录起始通常需要招募染色质修饰酶,包括染色质重塑复合物和组蛋白修饰酶。正如第 4 章所讨论的,这两类酶都能增加对染色质中 DNA 的访问,并通过这样做促进转录起始机器在 DNA 上的组装。
To summarize, as illustrated schematically in Figure 6-18, many proteins (well over 100 individual subunits) must assemble at the start point of transcription to initiate transcription in a eukaryotic cell. We shall return to some of these proteins (especially transcription activator proteins, chromatin remodeling complexes, and histone-modifying enzymes) in the following chapter, where we discuss how eukaryotic cells regulate the process of transcription initiation.
总结一下,如图 6-18 示意的那样,在真核细胞中,许多蛋白质(超过 100 个个体亚基)必须在转录起始点组装在一起,以启动转录。在接下来的章节中,我们将回顾其中一些蛋白质,特别是转录激活蛋白、染色质重塑复合物和组蛋白修饰酶,讨论真核细胞如何调控转录起始过程。

Transcription Elongation in Eukaryotes Requires Accessory Proteins
真核生物的转录延伸需要辅助蛋白

Once RNA polymerase has initiated transcription, it moves jerkily, pausing at some DNA sequences and rapidly transcribing through others. Elongating RNA polymerases, both bacterial and eukaryotic, are associated with a series of elongation factors, proteins that decrease the likelihood that RNA polymerase will dissociate before it reaches the end of a gene. These factors typically associate with RNA polymerase shortly after initiation, and they help the polymerase move both through nucleosomes (Figure 6-19) and through the wide variety of different DNA sequences that are found in genes.
一旦 RNA 聚合酶启动转录,它会不稳定地移动,在某些 DNA 序列处暂停,而在其他序列中快速转录。细菌和真核生物的 RNA 聚合酶在延伸过程中与一系列延伸因子相关联,这些蛋白质降低了 RNA 聚合酶在到达基因末端之前解离的可能性。这些因子通常在启动后不久与 RNA 聚合酶结合,并帮助聚合酶穿过核小体(图 6-19)以及基因中存在的各种不同 DNA 序列。
We will see in the next chapter that, like the process of transcription initiation, transcription elongation can be regulated by the cell; more specifically, we will see that at many human genes, RNA polymerase pauses shortly after it initiates transcription. This pause can last from several seconds to many hours, and the cell controls the duration of this pause as part of gene regulatory processes.
我们将在下一章中看到,就像转录起始过程一样,转录延伸也可以被细胞调节;更具体地说,我们将看到,在许多人类基因中,RNA 聚合酶在启动转录后不久会暂停。这种暂停可以持续几秒到几个小时,细胞控制这种暂停的持续时间作为基因调控过程的一部分。
As RNA polymerase II moves along a gene, some of the enzymes bound to it modify the histones, leaving behind a record of where the polymerase has been. Although it is not clear exactly how the cell uses this information, it may aid in transcribing a gene over and over again once it has become active for the first time.
随着 RNA 聚合酶 II 沿基因移动,一些与其结合的酶会修饰组蛋白,留下聚合酶所到之处的记录。尽管目前尚不清楚细胞如何利用这些信息,但一旦基因首次活跃起来,这可能有助于反复转录基因。

Transcription Creates Superhelical Tension
转录产生超螺旋张力

Nucleosomes are not the only impediment to elongating RNA polymerases, and in this section, we describe an entirely different type of barrier, one that applies to both bacterial and eukaryotic polymerases. To introduce this issue, we need first to consider a subtle property inherent in the DNA double helix called DNA supercoiling. DNA supercoiling is the name given to a conformation that DNA
核小体并不是延伸 RNA 聚合酶的唯一障碍,本节中我们描述了一种完全不同类型的障碍,适用于细菌和真核聚合酶。为了介绍这个问题,我们首先需要考虑 DNA 双螺旋中固有的一个微妙特性,称为 DNA 超螺旋。DNA 超螺旋是指 DNA 的一种构象。
Figure 6-19 Structure of an RNA polymerase II transcribing through a nucleosome. In the structure diagrammed here, which was determined by cryoelectron microscopy, the polymerase has moved about halfway through the DNA of the nucleosome, leaving only one of the two loops of duplex DNA still bound to the histone core. The polymerase is shown in blue, associated with three elongation factors (Spt4, Spt5, and Elf1) that help the polymerase transcribe through nucleosomes. These factors act in several ways: they form a wedge to pry the DNA away from the histone core as the polymerase moves forward; they directly destabilize histone-DNA interactions by pushing a positively charged surface ahead of the RNA polymerase; and they reduce the intrinsic "stickiness" of RNA polymerase for nucleosomes. In addition to these factors, eukaryotic transcription is typically aided by ATP-dependent chromatin remodeling complexes that seek out and rescue the occasional stalled polymerase, as well as by histone chaperones that can partially disassemble nucleosomes in front of a moving RNA polymerase and reassemble them behind. (Based on PDB code 6IR9.)
图 6-19 RNA 聚合酶 II 穿过核小体的结构。在这里绘制的结构图是通过冷冻电镜确定的,聚合酶已经穿过核小体 DNA 的大约一半,只剩下双螺旋 DNA 的一个环仍与组蛋白核心结合。聚合酶显示为蓝色,与三个延伸因子(Spt4、Spt5 和 Elf1)相关联,这些因子帮助聚合酶穿过核小体。这些因子以几种方式起作用:它们形成楔形将 DNA 从组蛋白核心中推开,随着聚合酶的前进;它们通过在 RNA 聚合酶前面推动一个带正电表面来直接破坏组蛋白-DNA 相互作用;它们减少 RNA 聚合酶与核小体的固有“粘性”。除了这些因子外,真核转录通常还受到依赖 ATP 的染色质重塑复合物的帮助,这些复合物寻找并拯救偶尔停滞的聚合酶,以及组蛋白伴侣,可以在移动的 RNA 聚合酶前部部分解组核小体,并在后面重新组装它们。(基于 PDB 代码 6IR9。)

can adopt in response to superhelical tension. Alternatively, the creation of loops or coils in a double-helical DNA molecule will produce such tension.
可以采取的措施来应对超螺旋张力。另外,在双螺旋 DNA 分子中创建环或卷会产生这种张力。
Figure 6-20 illustrates why. There are approximately 10 nucleotide pairs for every helical turn in a DNA double helix. Imagine a helix whose two ends are fixed with respect to each other (as they are in a DNA circle, such as a bacterial chromosome, or in a tightly clamped loop, as can exist in eukaryotic chromosomes). In this case, one large DNA supercoil will form to compensate for each 10 nucleotide pairs that are opened (unwound). The formation of this supercoil is energetically favorable because it restores a normal helical twist to the base-paired regions that remain, which would otherwise become overwound because of the fixed ends.
图 6-20 说明了原因。在 DNA 双螺旋中,每个螺旋转数大约有 10 个核苷酸对。想象一根螺旋,其两端相对固定(就像细菌染色体中的 DNA 环或真核染色体中可能存在的紧密夹紧的环一样)。在这种情况下,每打开(解旋)10 个核苷酸对,就会形成一个大的 DNA 超螺旋来补偿。这种超螺旋的形成在能量上是有利的,因为它恢复了剩余的碱基配对区域的正常螺旋扭转,否则由于固定端而变得过度扭转。
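The arithmetic in this paragraph can be written out as a short sketch. It assumes the round figure of 10 nucleotide pairs per helical turn used in the text; the names `OpeningResult` and `open_helix` are hypothetical. With a free end the helix simply rotates, whereas with fixed ends the same number of turns appears as supercoils.

```python
from dataclasses import dataclass

BP_PER_HELICAL_TURN = 10  # approximate value quoted in the text


@dataclass
class OpeningResult:
    helix_rotations: float  # turns the helix swivels when one end is free
    supercoils: float       # supercoils formed when both ends are fixed


def open_helix(bp_opened: int, ends_fixed: bool) -> OpeningResult:
    """Estimate the consequence of unwinding `bp_opened` nucleotide pairs."""
    if bp_opened < 0:
        raise ValueError("bp_opened must be non-negative")
    turns = bp_opened / BP_PER_HELICAL_TURN
    if ends_fixed:
        # The helix cannot rotate, so each lost turn is compensated by a supercoil.
        return OpeningResult(helix_rotations=0.0, supercoils=turns)
    # A free end (or a nick acting as a swivel) lets the helix rotate instead.
    return OpeningResult(helix_rotations=turns, supercoils=0.0)


if __name__ == "__main__":
    print(open_helix(10, ends_fixed=True))   # one supercoil
    print(open_helix(30, ends_fixed=False))  # three rotations, no supercoils
```

This treats the relationship as exactly linear, which is a simplification of how real DNA partitions tension between twisting and coiling.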
RNA polymerase creates superhelical tension as it moves along a stretch of DNA that is anchored at its ends. As illustrated in Figure 6-20C, if the polymerase is not free to rotate rapidly (and such rotation is unlikely given the size of RNA polymerases and their attached transcripts), a moving polymerase will generate positive superhelical tension in the DNA in front of it and negative helical tension
RNA 聚合酶在沿着以其两端为锚的 DNA 区段移动时会产生超螺旋张力。如图 6-20C 所示,如果聚合酶无法快速旋转(考虑到 RNA 聚合酶及其附加转录本的大小,这种旋转不太可能发生),移动的聚合酶将在其前方的 DNA 中产生正超螺旋张力和负螺旋张力。
[Figure 6-20 panel labels: (A) DNA with free end: the DNA helix must rotate one turn. (B) DNA with fixed ends. (C) Negative supercoiling (helix opening facilitated) and positive supercoiling (helix opening hindered).]
Figure 6-20 Superhelical tension in DNA causes DNA supercoiling. (A) For a DNA molecule with one free end (or a break in one strand that serves as a swivel), the DNA double helix rotates by one turn for every 10 nucleotide pairs opened. (B) If rotation is prevented, superhelical tension is introduced into the DNA by helix opening. In the example shown, the DNA helix contains 10 helical turns, one of which is opened. One way of accommodating the tension created would be to increase the helical twist from 10 to 11 nucleotide pairs per turn in the double helix that remains. The DNA helix, however, resists such a deformation in a springlike fashion, preferring to relieve the superhelical tension by bending into supercoiled loops. As a result, one DNA supercoil forms in the DNA double helix for every 10 nucleotide pairs opened. The supercoil formed in this case is a positive supercoil. (C) Supercoiling of DNA is induced by a protein tracking through the DNA double helix. The two ends of the DNA shown here are unable to rotate freely relative to each other, and the protein molecule is assumed also to be prevented from rotating freely as it moves. Under these conditions, the movement of the protein causes an excess of helical turns to accumulate in the DNA helix ahead of the protein (inducing positive supercoils) and a deficit of helical turns to arise in the DNA behind the protein (inducing negative supercoils), as shown. Because locally pulling apart the two strands of the double helix relieves the tension from negative supercoils, it is easier to do behind the moving protein than ahead of it.
图 6-20 DNA 中的超螺旋张力引起 DNA 超螺旋。(A) 对于一个具有一个自由端的 DNA 分子(或者一条链中有一个断裂作为转轴),DNA 双螺旋每打开 10 个核苷酸对就旋转一圈。(B) 如果旋转被阻止,螺旋张力通过螺旋打开引入到 DNA 中。在所示的例子中,DNA 螺旋包含 10 个螺旋圈,其中一个被打开。一种适应所产生张力的方式是将剩余的双螺旋中的螺旋扭转从每圈 10 个核苷酸对增加到 11 个。然而,DNA 螺旋以弹簧般的方式抵抗这种变形,更倾向于通过弯曲成超螺旋环来缓解超螺旋张力。因此,每打开 10 个核苷酸对,DNA 双螺旋中形成一个 DNA 超螺旋。在这种情况下形成的超螺旋是正超螺旋。(C) DNA 的超螺旋是由跟踪 DNA 双螺旋的蛋白质诱导的。这里所示的 DNA 的两端无法相对自由旋转,同时假定蛋白质分子也无法自由旋转而是在移动时被阻止。 在这些条件下,蛋白质的运动导致蛋白质前面的 DNA 螺旋中积累过多的螺旋转向(诱导正超螺旋),而在蛋白质后面的 DNA 中出现螺旋转向的不足(诱导负超螺旋),如图所示。由于局部拉开双螺旋的两股可以减轻负超螺旋的张力,所以在移动蛋白质后面做这个动作比在前面更容易。


behind it. If this tension is not relieved, the polymerase will grind to a halt because further unwinding requires more energy than the transcription process can provide. For eukaryotes, the mild buildup of tension is thought to provide a bonus: the positive superhelical tension ahead of the polymerase facilitates the partial unwrapping of the DNA in nucleosomes, inasmuch as the release of DNA from the histone core helps to relax this tension. The tension can also be relieved by DNA topoisomerase enzymes, as we saw in the previous chapter for the similar kind of tension generated by DNA polymerases during DNA replication (see Figure 5-21).
如果这种张力得不到缓解,聚合酶将停滞不前,因为进一步的解旋需要比转录过程提供的能量更多。对于真核生物来说,轻微的张力积累被认为提供了一个额外的好处:聚合酶前面的正超螺旋张力有助于部分解开核小体中的 DNA,因为 DNA 从组蛋白核心释放有助于缓解这种张力。这种张力也可以通过 DNA 拓扑异构酶来缓解,正如我们在前一章中看到的,DNA 聚合酶在 DNA 复制过程中产生类似张力时也是如此。
In bacteria (but not eukaryotes), a specialized topoisomerase called DNA gyrase uses the energy of ATP hydrolysis to pump supercoils continually into the DNA, thereby maintaining the DNA under constant tension. These are negative supercoils, having the opposite handedness from the positive supercoils that form when a region of DNA helix opens (see Figure 6-20B). Whenever a region of helix opens, it removes these negative supercoils from bacterial DNA, reducing the superhelical tension. DNA gyrase therefore makes the opening of the DNA helix in bacteria energetically favorable compared with helix opening in DNA that is not supercoiled. For this reason, it facilitates those genetic processes in bacteria, such as the initiation of transcription by bacterial RNA polymerase, that require helix opening (see Figure 6-11).
在细菌中(但不是真核生物),一种名为 DNA 旋转酶的专门拓扑异构酶利用 ATP 水解的能量不断将超螺旋泵入 DNA,从而保持 DNA 处于恒定张力下。这些是负超螺旋,与 DNA 螺旋区域打开时形成的正超螺旋相反(见图 6-20B)。每当螺旋区域打开时,它会从细菌 DNA 中去除这些负超螺旋,减少超螺旋张力。因此,DNA 旋转酶使细菌中 DNA 螺旋的打开在能量上比非超螺旋 DNA 中的螺旋打开更有利。因此,它促进了细菌中那些需要螺旋打开的遗传过程,比如细菌 RNA 聚合酶的转录起始(见图 6-11)。

Transcription Elongation in Eukaryotes Is Tightly Coupled to RNA Processing
真核生物的转录延伸与 RNA 处理密切耦合

We saw earlier that bacterial mRNAs are synthesized by the RNA polymerase starting and stopping at specific spots on the genome. The situation in eukaryotes is substantially different. In particular, transcription is only the first of several steps needed to produce a mature mRNA molecule. Other critical steps are the covalent modification of the ends of the RNA and the removal of intron sequences that are discarded from the middle of the RNA transcript by the process of splicing (Figure 6-21). As we shall see, RNA splicing not only joins together different portions of an RNA transcript to eliminate the intron sequences; it also provides eukaryotes with the ability to synthesize several related but different proteins from the same gene.
我们之前看到,细菌的 mRNA 是由 RNA 聚合酶在基因组的特定位置开始和停止合成的。真核生物的情况大不相同。特别是,转录只是产生成熟 mRNA 分子所需的几个步骤中的第一步。其他关键步骤包括 RNA 两端的共价修饰以及通过 剪接过程从 RNA 中间丢弃的内含子序列的去除(图 6-21)。正如我们将看到的那样,RNA 剪接不仅将 RNA 转录本的不同部分连接在一起以消除内含子序列;它还使真核生物能够从同一基因合成几种相关但不同的蛋白质。

Figure 6-21 Comparison of the steps leading from gene to protein in eukaryotes and bacteria. The final amount of a protein in the cell depends on the efficiency of each step and on the rates of degradation of the RNA and protein molecules. (A) In eukaryotic cells, the mRNA molecule resulting from transcription contains both coding (exon) and noncoding (intron) sequences. Before it can be translated into protein, the two ends of the RNA are modified, the introns are removed by an enzymatically catalyzed RNA splicing reaction, and the resulting mRNA is transported from the nucleus to the cytoplasm. For convenience, the steps in this figure are depicted as occurring one at a time; in reality, many occur concurrently. For example, the RNA cap is added and splicing begins before transcription has been completed. Because of the coupling between transcription and RNA processing, intact primary transcripts (the full-length RNAs that would, in theory, be produced if no processing had occurred) are found only rarely. (B) In bacteria, the production of mRNA is much simpler. The 5' end of an mRNA molecule is produced by the initiation of transcription, and the 3' end is produced by the termination of transcription. Because bacteria lack a nucleus, transcription and translation take place in a common compartment, and the translation of a bacterial mRNA often begins before its synthesis has been completed. As indicated, a single bacterial mRNA typically produces several different proteins, another feature that distinguishes eukaryotes from bacteria.
图 6-21 比较真核生物和细菌中从基因到蛋白质的步骤。细胞中蛋白质的最终量取决于每个步骤的效率以及 RNA 和蛋白质分子的降解速率。(A) 在真核细胞中,转录产生的 mRNA 分子包含编码(外显子)和非编码(内含子)序列。在可以被翻译成蛋白质之前,RNA 的两端被修改,内含子通过酶催化的 RNA 剪接反应被去除,然后产生的 mRNA 被从细胞核运输到细胞质。为方便起见,本图中的步骤被描绘为依次发生;实际上,许多步骤同时发生。例如,在转录完成之前,RNA 帽子被添加并开始剪接。由于转录和 RNA 加工之间的耦合,完整的原始转录本(理论上如果没有加工将产生的全长 RNA)很少被发现。(B) 在细菌中,mRNA 的产生要简单得多。 mRNA 分子的 端由转录的起始产生, 端由转录的终止产生。由于细菌缺乏细胞核,转录和翻译发生在一个共同的区域,细菌 mRNA 的翻译通常在合成完成之前就开始。正如所指出的,单个细菌 mRNA 通常产生几种不同的蛋白质,这是区分真核生物和细菌的另一个特征。
Figure 6-22 A comparison of the structures of bacterial and eukaryotic mRNA molecules. (A) The 5' and 3' ends of a bacterial mRNA are the unmodified ends of the chain synthesized by the RNA polymerase, which initiates and terminates transcription at those points, respectively. The corresponding ends of a eukaryotic mRNA are formed by adding a 5' cap and by cleavage of the pre-mRNA transcript near the 3' end and the addition of a poly-A tail, respectively. The figure also illustrates another difference between the prokaryotic and eukaryotic mRNAs: bacterial mRNAs can contain the instructions for several different proteins, whereas eukaryotic mRNAs nearly always contain the information for only a single protein. (B) The structure of the cap at the 5' end of eukaryotic mRNA molecules. Note the unusual 5'-to-5' linkage of the 7-methylguanosine to the remainder of the RNA. Most eukaryotic mRNAs carry an additional modification: methylation of the 2'-hydroxyl group of the ribose sugar at the 5' end of the primary transcript. Although the precise role of this modification remained mysterious for many years, recent work indicates that it aids cells in distinguishing their own mRNAs from those of invading viruses, which typically lack this modification. On the basis of this difference, cells can block translation of viral RNAs, thereby defending themselves from viral attack.
图 6-22 细菌和真核 mRNA 分子结构的比较。 (A) 细菌 mRNA 的 末端是 RNA 聚合酶合成的链的未修饰末端,该酶在这些点上启动和终止转录。真核 mRNA 的对应末端通过添加 5'帽子和在 末端附近切割 pre-mRNA 转录本并添加 poly-A 尾巴而形成。图还说明了原核和真核 mRNA 之间的另一个差异:细菌 mRNA 可以包含几种不同蛋白质的指令,而真核 mRNA 几乎总是只包含单个蛋白质的信息。 (B) 真核 mRNA 分子 末端的帽子结构。请注意 7-甲基鸟苷与 RNA 其余部分之间的不寻常 -to-5'连接。大多数真核 mRNA 携带另一种修饰:在初级转录本的 5'末端的核糖糖的 -羟基基团的甲基化。 尽管这种修饰的确切作用多年来一直是个谜,但最近的研究表明,它有助于细胞区分自身的 mRNA 与入侵病毒的 mRNA,后者通常缺乏这种修饰。基于这种差异,细胞可以阻止病毒 RNA 的翻译,从而保护自己免受病毒攻击。
Both ends of eukaryotic mRNAs are modified: by capping on the 5' end and by polyadenylation of the 3' end (Figure 6-22). These special ends allow the cell to assess whether both ends of an mRNA molecule are present (and if the message is therefore intact) before it exports the RNA from the nucleus and translates it into protein.
真核 mRNA 的两端都经过修饰:通过在 端进行帽子修饰,以及在 端进行多聚腺苷酸化(图 6-22)。这些特殊的末端使细胞能够评估 mRNA 分子的两端是否完整(因此消息是否完整),然后再将 RNA 从细胞核输出并翻译成蛋白质。
A simple mechanism has evolved to couple all of the above RNA-processing steps to transcription elongation. As discussed previously, a key step in transcription initiation by RNA polymerase II is the phosphorylation of the RNA polymerase II tail, also called the CTD (C-terminal domain). This phosphorylation, which proceeds gradually as the RNA polymerase initiates transcription and moves along the DNA, not only helps to dissociate the RNA polymerase II from other proteins present at the start point of transcription but also allows a new set of proteins to associate with the RNA polymerase tail, which function in transcription elongation and RNA processing, as we discuss in the following sections. Some of the processing proteins are thought to "hop" from the polymerase tail onto the nascent RNA molecule to begin their processing reactions as soon as this RNA emerges from the RNA polymerase. Thus, we can view RNA polymerase II in its elongation mode as an RNA factory that not only moves along the DNA synthesizing an RNA molecule but also processes the RNA that it produces (Figure 6-23). Fully extended, the CTD is nearly 10 times longer than the remainder of the RNA polymerase. As a flexible protein domain, it serves as a scaffold or tether, holding a variety of proteins close by so that they can rapidly act when needed. This scaffolding strategy, which greatly speeds up the overall rate of a series of consecutive reactions, is one that is commonly utilized in the cell (see Figure 3-76).
一个简单的机制已经演变出将所有上述 RNA 处理步骤与转录延伸耦合在一起。正如之前讨论的,RNA 聚合酶 II 通过磷酸化 RNA 聚合酶 II 尾部(也称为 CTD,C-末端结构域)来启动转录的关键步骤。这种磷酸化随着 RNA 聚合酶启动转录并沿着 DNA 移动逐渐进行,不仅有助于将 RNA 聚合酶 II 与存在于转录起始点的其他蛋白质分离开来,还允许一组新的蛋白质与 RNA 聚合酶尾部结合,这些蛋白质在转录延伸和 RNA 处理中发挥作用,我们将在接下来的部分中讨论。一些处理蛋白被认为会从聚合酶尾部“跳”到新生 RNA 分子上,以便在这个 RNA 从 RNA 聚合酶中出现时立即开始它们的处理反应。因此,我们可以将 RNA 聚合酶 II 在其延伸模式中视为一个 RNA 工厂,它不仅沿着 DNA 移动合成 RNA 分子,还处理它所产生的 RNA(图 6-23)。完全展开时,CTD 几乎比 RNA 聚合酶的其余部分长 10 倍。 作为一种灵活的蛋白质结构域,它充当支架或系链,将各种蛋白质紧密相连,以便在需要时能够迅速发挥作用。这种支架策略极大地加快了一系列连续反应的整体速率,这是细胞中常用的一种策略(见图 3-76)。
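The idea that the phosphorylation state of the CTD determines which proteins ride on the polymerase can be sketched as a small lookup. This is a deliberately simplified model based on the description in the text and in the caption of Figure 6-23: it treats all heptad repeats as a single unit, and the names `CTDState`, `recruited_factors`, and `can_initiate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CTDState:
    """Phosphorylation state of the RNA polymerase II tail (all repeats lumped together)."""
    ser5_phosphorylated: bool
    ser2_phosphorylated: bool


def recruited_factors(state: CTDState) -> List[str]:
    """List the RNA-processing proteins the tail is expected to attract."""
    factors: List[str] = []
    if state.ser5_phosphorylated:
        # Ser5 phosphorylation by TFIIH late in initiation attracts capping proteins.
        factors.append("capping proteins")
    if state.ser2_phosphorylated:
        # Ser2 phosphorylation during elongation attracts later-acting proteins.
        factors.append("splicing proteins")
        factors.append("3'-end processing proteins")
    return factors


def can_initiate(state: CTDState) -> bool:
    """Only the fully dephosphorylated polymerase can begin RNA synthesis at a promoter."""
    return not (state.ser5_phosphorylated or state.ser2_phosphorylated)


if __name__ == "__main__":
    at_promoter = CTDState(ser5_phosphorylated=False, ser2_phosphorylated=False)
    initiating = CTDState(ser5_phosphorylated=True, ser2_phosphorylated=False)
    elongating = CTDState(ser5_phosphorylated=False, ser2_phosphorylated=True)
    for state in (at_promoter, initiating, elongating):
        print(state, recruited_factors(state), can_initiate(state))
```

In the cell the transition is gradual and differs between individual repeats, so the three discrete states shown here should be read only as landmarks along a continuum.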

RNA Capping Is the First Modification of Eukaryotic Pre-mRNAs
RNA 盖帽是真核前 mRNA 的第一个修饰

As soon as RNA polymerase II has produced about 20 nucleotides of RNA, the 5' end of the new RNA molecule is modified by addition of a cap that consists of a modified guanine nucleotide (see Figure 6-22B). Three enzymes, acting in
一旦 RNA 聚合酶 II 合成了大约 20 个核苷酸的 RNA,新 RNA 分子的 端就会通过添加一个由修饰的鸟嘌呤核苷酸组成的帽子而进行修饰(见图 6-22B)。三种酶参与其中。
Figure 6-23 Eukaryotic RNA polymerase II as an RNA synthesis and processing machine. As the polymerase transcribes DNA into RNA, it carries RNA-processing proteins on its tail that are transferred to the nascent RNA at the appropriate time. The tail contains 52 tandem repeats of a seven-amino-acid sequence, and there are two serines in each repeat. The capping proteins first bind to the RNA polymerase tail when it is phosphorylated on Ser5 of the heptad repeats late in the process of transcription initiation (see Figure 6-15). This strategy ensures that the RNA molecule is efficiently capped as soon as its 5' end emerges from the RNA polymerase. As the polymerase continues transcribing, its tail is extensively phosphorylated on the Ser2 positions by a kinase associated with the elongating polymerase and is eventually dephosphorylated at Ser5 positions. These further modifications attract splicing and 3'-end processing proteins to the moving polymerase, positioning them to act on the newly synthesized RNA as it emerges from the RNA polymerase. There are many RNA-processing enzymes, and not all travel with the polymerase. In RNA splicing, for example, the tail carries only a few critical components; once bound to an emerging RNA molecule, they serve as a nucleation site for the remaining components.
图 6-23 真核细胞 RNA 聚合酶 II 作为 RNA 合成和加工机器。当聚合酶将 DNA 转录成 RNA 时,它在尾部携带 RNA 加工蛋白,这些蛋白在适当的时机转移到新生 RNA 上。尾部包含 52 个七氨基酸序列的串联重复,每个重复中有两个丝氨酸。当 RNA 聚合酶尾部上的七肽重复的 Ser5 被磷酸化时,首先结合到尾部的是帽蛋白,这发生在转录起始过程的后期(见图 6-15)。这种策略确保 RNA 分子在其 端从 RNA 聚合酶中出现时能够高效地加帽。随着聚合酶继续转录,其尾部在 Ser2 位置被与伸长聚合酶相关的激酶广泛磷酸化,并最终在 Ser5 位置被去磷酸化。这些进一步的修饰吸引剪接和 3'-端加工蛋白到移动的聚合酶附近,使它们能够作用于新合成的 RNA,当其从 RNA 聚合酶中出现时。有许多 RNA 加工酶,并非所有酶都随聚合酶一起运动。 在 RNA 剪接中,例如,尾部只携带了一些关键组分;一旦与新出现的 RNA 分子结合,它们就成为其余组分的核聚点。
When RNA polymerase II finishes transcribing a gene, it is released from DNA, and protein phosphatases remove the phosphates on its tail so that it can reinitiate transcription. Only the fully dephosphorylated form of RNA polymerase II is competent to begin RNA synthesis at a promoter.
当 RNA 聚合酶 II 完成转录基因时,它从 DNA 中释放出来,蛋白磷酸酶去除其尾部上的磷酸,以便它可以重新开始转录。只有完全去磷酸化的 RNA 聚合酶 II 形式才能在启动子处开始 RNA 合成。
succession, perform the capping reaction: a phosphatase removes a phosphate from the triphosphate left at the 5' end of the nascent RNA molecule, a guanyl transferase adds a GMP in a reverse linkage (5'-to-5' instead of 5'-to-3') to the diphosphate just produced, and a methyl transferase adds a methyl group to the guanosine (Figure 6-24). Because all three enzymes bind to the RNA polymerase tail phosphorylated at the Ser5 position (the modification added by TFIIH during transcription initiation), they are poised to modify the 5' end of the nascent transcript as soon as it emerges from the polymerase.
继续,执行封顶反应:磷酸酶从新生 RNA 分子末端的三磷酸中去除一个磷酸,鸟苷转移酶在刚产生的二磷酸上以反向连接(5'-to-5'而不是 -to-3')添加一个 GMP,甲基转移酶向鸟苷添加一个甲基基团(图 6-24)。由于这三种酶都结合到在 Ser5 位置磷酸化的 RNA 聚合酶尾部上-这是 TFIIH 在转录起始过程中添加的修饰,它们准备修改新生转录本的 末端,一旦它从聚合酶中出现。
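The three enzymatic steps of capping can be followed as successive edits to the 5' end of the transcript. The sketch below represents the RNA as a plain string in the shorthand of Figure 6-24, where `p` is a phosphate and `N` any nucleotide; the function names are hypothetical, and the string manipulation is only a bookkeeping device, not a chemical model.

```python
def remove_phosphate(rna: str) -> str:
    """Phosphatase step: pppN... becomes ppN..."""
    if not rna.startswith("ppp"):
        raise ValueError("expected a 5' triphosphate end")
    return rna[1:]


def add_gmp(rna: str) -> str:
    """Guanyl transferase step: GMP joins the diphosphate in a 5'-to-5' linkage."""
    if not rna.startswith("pp") or rna.startswith("ppp"):
        raise ValueError("expected a 5' diphosphate end")
    # G plus its one phosphate, joined to the two already present, gives Gppp.
    return "Gp" + rna


def methylate_guanosine(rna: str) -> str:
    """Methyl transferase step: the added guanosine becomes 7-methylguanosine."""
    if not rna.startswith("Gppp"):
        raise ValueError("expected an unmethylated Gppp cap")
    return "m7" + rna


def cap(rna: str) -> str:
    """Apply the three steps in the order described in the text."""
    return methylate_guanosine(add_gmp(remove_phosphate(rna)))


if __name__ == "__main__":
    print(cap("pppNpNp"))  # m7GpppNpNp
```

Each function checks the state left by the previous one, which reflects the fact that the enzymes act in succession on the same 5' end.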
The 7-methylguanosine cap signifies the 5' end of eukaryotic mRNAs, and this landmark helps the cell to distinguish mRNAs from the other types of RNA molecules present in the cell. For example, RNA polymerases I and III produce uncapped RNAs during transcription, in part because these polymerases lack a CTD. In the nucleus, the cap binds a protein complex called CBC (cap-binding complex), which, as we discuss in subsequent sections, helps a future mRNA to be further processed and exported. The cap also has an important role in the translation of mRNAs in the cytosol, as we discuss later in the chapter.
7-甲基鸟苷帽标志着真核 mRNA 的 端,这一标志有助于细胞区分 mRNA 与细胞中存在的其他类型的 RNA 分子。例如,RNA 聚合酶 I 和 III 在转录过程中产生无帽 RNA,部分原因是这些聚合酶缺乏 CTD。在细胞核中,帽子结合一个被称为 CBC(帽子结合复合物)的蛋白质复合物,正如我们在后续章节中讨论的那样,这有助于未来的 mRNA 进一步被加工和输出。 帽子在细胞质中 mRNA 的翻译中也起着重要作用,我们将在本章后面讨论。
Although the vast majority of eukaryotic mRNAs possess a 7-methylguanosine cap at their 5' ends, an alternative type of cap is found on some mRNAs, specifically a nicotinamide adenine dinucleotide phosphate. We saw earlier in Chapter 2 that this molecule is an important cofactor for many biochemical reactions, and it is added to certain mRNAs by RNA polymerase itself as the first nucleotide when a new mRNA chain is begun. The role of this particular type of mRNA cap is not known with certainty, but one hypothesis holds that it provides a way for the cell to link the expression of some of its genes to its overall metabolic "health."
尽管绝大多数真核 mRNA 在其 端具有 7-甲基鸟苷帽,但一些 mRNA 上发现了一种替代性帽,具体是烟酰胺腺嘌呤二核苷酸磷酸( )。我们在第 2 章早些时候看到 对许多生化反应是一个重要的辅因子,并且当一个新的 mRNA 分子链开始时,它由 RNA 聚合酶本身作为第一个核苷酸添加到某些 mRNA 上。这种特定类型的 mRNA 帽的作用尚不清楚,但有一种假设认为它提供了一种方式,使细胞能够将其一些基因的表达与其整体代谢“健康”联系起来。

RNA Splicing Removes Intron Sequences from Newly Transcribed Pre-mRNAs
RNA 剪接从新转录的前 mRNA 中去除内含子序列

As discussed in Chapter 4, the protein-coding sequences of eukaryotic genes are typically interrupted by noncoding intervening sequences (introns). Discovered in 1977, this feature of eukaryotic genes came as a surprise to scientists, who had been, until that time, familiar only with bacterial genes, which consist of a continuous stretch of coding DNA that is directly transcribed into mRNA. In marked contrast, eukaryotic genes were found to be broken up into small pieces of coding sequence (expressed sequences, or exons) interspersed with much longer
正如第 4 章所讨论的那样,真核基因的蛋白质编码序列通常被非编码的干扰序列(内含子)所打断。这一特征于 1977 年被发现,对科学家来说是一个惊喜,因为他们此前只熟悉细菌基因,细菌基因由一段连续的编码 DNA 组成,直接转录成 mRNA。相比之下,真核基因被发现分解成小片编码序列(表达序列,或外显子),这些序列间夹杂着更长的序列。
Figure 6-24 The reactions that cap the 5' end of each RNA molecule synthesized by RNA polymerase II. The final cap contains a novel 5'-to-5' linkage between the positively charged 7-methylguanosine residue and the 5' end of the RNA transcript (see Figure 6-22B). The letter N represents any one of the four ribonucleotides, although the nucleotide that starts an RNA chain is usually a purine (an A or a G). (After A.J. Shatkin, Bioessays 7:275-277, 1987. With permission from John Wiley & Sons.)
图 6-24 RNA 聚合酶 II 合成的每个 RNA 分子的 5'端上的反应。最终帽子包含了一个新颖的 -to-5'连接,在这个连接中,带正电荷的 7-甲基鸟苷残基与 RNA 转录本的 端之间形成了连接(见图 6-22B)。字母 N 代表四种核苷酸中的任意一种,尽管开始 RNA 链的核苷酸通常是一种嘌呤(A 或 G)。(摘自 A.J. Shatkin, Bioessays 7:275-277, 1987. 获 John Wiley & Sons 许可。)
[Figure 6-24 labels: 5' end of nascent RNA transcript; GpppNpNp; add methyl group to base; add methyl group to ribose.]
[Figure 6-25 labels: (A) human β-globin gene, exons 1 to 3; (B) scale in nucleotide pairs.]
intervening sequences, or introns; thus, the coding portion of a eukaryotic gene is often only a small fraction of the length of the gene (Figure 6-25).
干预序列,或内含子;因此,真核基因的编码部分通常只是基因长度的一小部分(图 6-25)。
Both intron and exon sequences are transcribed into RNA. The intron sequences are removed from the newly synthesized RNA through the process of RNA splicing. The vast majority of RNA splicing that takes place in cells functions in the production of mRNA, and our discussion of splicing focuses on this so-called precursor-mRNA (or pre-mRNA) splicing. Only after 5'- and 3'-end processing and splicing have taken place is an RNA transcript called an mRNA molecule.
内含子和外显子序列都被转录成 RNA。内含子序列通过 RNA 剪接的过程从新合成的 RNA 中移除。绝大多数在细胞中发生的 RNA 剪接功能在 mRNA 的产生中发挥作用,我们讨论的剪接重点放在所谓的前体 mRNA(或 pre-mRNA)的剪接上。只有在 5'端和 0 端处理以及剪接完成后,才会形成一个称为 mRNA 分子的 RNA 转录本。
Each splicing event removes one intron, proceeding through two sequential phosphoryl-transfer reactions known as transesterifications; these join two exons together while removing the intron between them as a "lariat" (Figure 6-26). The machinery that catalyzes pre-mRNA splicing is complex, consisting of five additional RNA molecules and several hundred proteins, and it hydrolyzes many ATP molecules per splicing event. This complexity ensures that splicing is accurate, while at the same time being flexible enough to deal with the enormous variety of introns found in a typical eukaryotic cell. On average, there are 11 introns in each of the approximately 20,000 human protein-coding genes, so the cell devotes considerable resources to this step in gene expression.
每次剪接事件都会去除一个内含子,通过两个连续的磷酸酯转移反应进行,这些反应被称为转酯化;这些反应将两个外显子连接在一起,同时去除它们之间的内含子形成“套圈”(图 6-26)。催化前 mRNA 剪接的机器复杂,由五个额外的 RNA 分子和数百个蛋白质组成,并且每次剪接事件会水解许多 ATP 分子。这种复杂性确保了剪接的准确性,同时又足够灵活以处理典型真核细胞中发现的大量内含子的巨大变化。平均而言,大约有 11 个内含子存在于每个约 20,000 个人类蛋白编码基因中,因此细胞将大量资源投入到基因表达中的这一步骤。
Although it seems wasteful to first produce and then remove large numbers of introns from an RNA transcript, this process provides an advantage to the cell. In many organisms, the transcript of a given gene can be spliced in more than one way, and this allows the same gene to produce a set of different but related proteins (Figure 6-27). It has been proposed that of human gene transcripts are spliced in more than one way, but this is almost certainly an overestimate: many
尽管从 RNA 转录本中首先产生然后移除大量内含子看起来是浪费的,但这个过程为细胞提供了优势。在许多生物体中,给定基因的转录本可以以多种方式剪接,这使得同一基因能够产生一组不同但相关的蛋白质(图 6-27)。已经提出 人类基因转录本以多种方式剪接,但这几乎可以肯定是一个高估:许多

Figure 6-25 Structure of two human genes showing the arrangement of exons and introns. (A) The relatively small β-globin gene, which encodes a subunit of the oxygen-carrying protein hemoglobin, contains 3 exons (see also Figure 4-9). (B) The much larger Factor VIII gene contains 26 exons; it codes for a protein (Factor VIII) that functions in the blood-clotting pathway. The most prevalent form of hemophilia results from mutations in this gene.
图 6-25 两个人类基因的结构,显示外显子和内含子的排列。(A)相对较小的 β-珠蛋白基因,编码氧载体蛋白质血红蛋白的一个亚基,包含 3 个外显子(另见图 4-9)。(B)大得多的因子 VIII 基因包含 26 个外显子;它编码一个在血液凝固途径中起作用的蛋白质(因子 VIII)。血友病最常见的形式是由该基因突变引起的。

Figure 6-26 The pre-mRNA splicing reaction. (A) In the first step, a specific adenine nucleotide in the intron sequence (indicated in red) attacks the 5' splice site and cuts the sugar-phosphate backbone of the RNA at this point. The cut 5' end of the intron becomes covalently linked to the adenine nucleotide, as shown in detail in (B), thereby creating a loop in the RNA molecule. The released free 3' end of the exon sequence then reacts with the start of the next exon sequence, joining the two exons together and releasing the intron sequence in the shape of a lariat. The two exon sequences thereby become joined into a continuous coding sequence. The released intron sequence is discarded, eventually being broken down into single nucleotides, which are recycled.
图 6-26 前 mRNA 剪接反应。(A)在第一步中,内含子序列中的一个特定腺嘌呤核苷酸(红色标示)攻击 5'剪接位点并在此处切断 RNA 的糖-磷酸骨架。内含子被切开的 5'端与该腺嘌呤核苷酸共价连接,如(B)中详细显示的那样,从而在 RNA 分子中形成一个环。释放出的外显子序列的游离 3'端随后与下一个外显子序列的起始处发生反应,将两个外显子连接在一起,并以套圈的形状释放内含子序列。这两个外显子序列因此连接成一个连续的编码序列。释放的内含子序列被丢弃,最终被分解成单个核苷酸,这些核苷酸被回收利用。
of the splicing products that can be detected are the results of splicing errors and do not produce functional proteins. Nonetheless, alternative splicing is a key feature of gene expression in many organisms, and we shall return to this subject later, after we describe the cellular machinery that performs the basic reaction.
可以检测到的剪接产物中,有些是剪接错误的结果,并且不产生功能蛋白质。然而,另类剪接是许多生物基因表达的关键特征,我们将在稍后描述执行基本反应的细胞机制之后再回到这个主题。

Nucleotide Sequences Signal Where Splicing Occurs
核苷酸序列信号指示剪接发生的位置

The mechanism of pre-mRNA splicing shown in Figure 6-26 requires that the splicing machinery recognize three portions of the precursor RNA molecule: the 5' splice site, the 3' splice site, and the branch point in the intron sequence that forms the base of the excised lariat. Not surprisingly, each site has a consensus nucleotide sequence that is similar from intron to intron and provides the cell with cues for where splicing is to take place (Figure 6-28). These consensus sequences are relatively short and can accommodate extensive sequence variability, and as we shall see shortly, the cell uses additional types of information to ultimately choose exactly where, on each RNA molecule, splicing is to take place.
图 6-26 中显示的前 mRNA 剪接机制要求剪接机构识别前体 RNA 分子的三个部分:5'剪接位点、3'剪接位点以及内含子序列中构成被切除套圈基部的分支点。毫不奇怪,每个位点都有一个在不同内含子之间相似的共识核苷酸序列,为细胞提供了剪接应在何处发生的线索(图 6-28)。这些共识序列相对较短,可以容纳广泛的序列变异,正如我们很快将看到的,细胞使用额外类型的信息来最终选择在每个 RNA 分子的哪个位置进行剪接。
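The three signals described above can be turned into a small search routine. The following is a minimal sketch, not a real splice-site predictor: it uses only the invariant features (GU at the start of the intron, AG at its end) plus an A upstream of the 3' end as a stand-in for the branch point. The function name, the toy sequence, and all three numeric limits are assumptions made for illustration; real introns are far longer.

```python
def candidate_introns(rna, min_len=10, branch_min=3, branch_max=8):
    """List (start, end) spans that satisfy the invariant splicing signals.

    A span qualifies if it starts with GU, ends with AG, and contains an A
    lying branch_min..branch_max nucleotides upstream of that AG.
    All three numeric limits are illustrative assumptions.
    """
    rna = rna.upper()
    hits = []
    for start in range(len(rna) - 1):
        if rna[start:start + 2] != "GU":
            continue
        for end in range(start + min_len, len(rna) + 1):
            if rna[end - 2:end] != "AG":
                continue
            ag = end - 2  # index of the A in the terminal AG
            lo = max(start + 2, ag - branch_max)
            hi = ag - branch_min
            if hi >= lo and "A" in rna[lo:hi + 1]:
                hits.append((start, end))
    return hits


toy = "CCGUCCCCACCCCAGCC"  # made-up sequence, not a real gene
for start, end in candidate_introns(toy):
    # Print the span and the RNA that would remain after removing it.
    print(start, end, toy[:start] + toy[end:])
```

On any long sequence a search this permissive returns many chance matches, which is one way to see why such short consensus sequences cannot by themselves specify where splicing takes place.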

RNA Splicing Is Performed by the Spliceosome
RNA 剪接由剪接体执行

Unlike the other steps in mRNA production we have discussed, key steps in RNA splicing are performed by RNA molecules rather than proteins. Specialized RNA molecules recognize the nucleotide sequences that specify where splicing is to occur and also form the active site that catalyzes the chemistry of splicing. These RNA molecules are relatively short (less than 200 nucleotides each), and there are five of them: U1, U2, U4, U5, and U6. Known as snRNAs (small nuclear RNAs), each is complexed with at least seven protein subunits to form an snRNP (small nuclear ribonucleoprotein). These snRNPs form the core of the spliceosome, the large assembly of RNA and protein molecules that performs pre-mRNA splicing in the cell. Recognition of the 5' splice junction, the branch-point site, and the 3' splice junction is performed through base-pairing between the snRNAs and RNA sequences in the pre-mRNA substrate.
与我们讨论过的 mRNA 生产中的其他步骤不同,RNA 剪接中的关键步骤是由 RNA 分子而不是蛋白质执行的。专门的 RNA 分子识别指定剪接发生位置的核苷酸序列,并形成催化剪接化学反应的活性位点。这些 RNA 分子相对较短(每个不到 200 个核苷酸),共有五种:U1、U2、U4、U5 和 U6。它们被称为 snRNAs(小核 RNA),每种都与至少七个蛋白亚基结合形成 snRNP(小核核糖核蛋白)。这些 snRNP 形成剪接体的核心,剪接体是在细胞中执行前 mRNA 剪接的大型 RNA 和蛋白质分子组装体。5'剪接点、分支点位点和 3'剪接点的识别是通过 snRNAs 与前 mRNA 底物中的 RNA 序列之间的碱基配对来完成的。
Figure 6-27 Alternative splicing of the α-tropomyosin gene from rat. α-Tropomyosin is a coiled-coil protein (see Figure 3-8) involved in the regulation of contraction in muscle cells. Its initial RNA transcript can be spliced in different ways, as indicated in the figure, to produce distinct mRNAs, which then give rise to variant proteins. Some of the splicing patterns are specific for certain types of cells. For example, the α-tropomyosin made in striated muscle is different from that made from the same gene in smooth muscle. The arrowheads in the top part of the figure mark the sites where cleavage and poly-A addition form the 3' ends of the mature mRNAs.
图 6-27 大鼠 α-原肌球蛋白基因的可变剪接。α-原肌球蛋白是一种卷曲螺旋蛋白(见图 3-8),参与肌细胞收缩的调节。其初始 RNA 转录本可以以不同方式剪接,如图所示,产生不同的 mRNA,进而产生不同的蛋白质变体。一些剪接模式特定于某些类型的细胞。例如,横纹肌中产生的 α-原肌球蛋白与平滑肌中由同一基因产生的不同。图顶部的箭头标记了通过切割和 poly-A 添加形成成熟 mRNA 3'端的位点。
Figure 6-28 The consensus nucleotide sequences in an RNA molecule that signal the beginning and the end of most introns in humans. The three blocks of nucleotide sequences shown are required to remove an intron sequence. Here A, G, U, and C are the standard RNA nucleotides; R stands for a purine (A or G); and Y stands for a pyrimidine (C or U). The A highlighted in red forms the branch point of the lariat produced by splicing (see Figure 6-26). Only the GU at the start of the intron and the AG at its end are invariant nucleotides in the splicing consensus sequences. Several different nucleotides can occupy the remaining positions, although the indicated nucleotides are preferred. Although the distances along the RNA between the three splicing consensus sequences are highly variable, the distance between the branch point and the 3' splice junction is typically much shorter than the distance between the 5' splice junction and the branch point.
图 6-28 RNA 分子中标志人类大多数内含子起始和终止的共识核苷酸序列。所示的三个核苷酸序列块是去除内含子序列所必需的。这里 A、G、U 和 C 是标准 RNA 核苷酸;R 代表嘌呤(A 或 G);Y 代表嘧啶(C 或 U)。红色突出显示的 A 形成剪接产生的套圈的分支点(见图 6-26)。在剪接共识序列中,只有内含子起始处的 GU 和终止处的 AG 是不变的核苷酸。其余位置可以有几种不同的核苷酸,尽管所指示的核苷酸更受青睐。尽管沿着 RNA 的三个剪接共识序列之间的距离高度可变,但分支点与 3'剪接点之间的距离通常比 5'剪接点与分支点之间的距离要短得多。
portion of a pre-mRNA transcript
一部分的前 mRNA 转录本

The U1 snRNP forms base pairs with the 5' splice junction. BBP (branch-point binding protein) recognizes the branch-point site and binds cooperatively with U2AF, which recognizes the polypyrimidine tract and the 3' splice junction (see Figure 6-28).
U1 snRNP 与 5'剪接点形成碱基对。BBP(分支点结合蛋白)识别分支点位点,并与识别多嘧啶序列和 3'剪接点的 U2AF 协同结合(见图 6-28)。
The U2 snRNP displaces BBP and U2AF and forms base pairs with the branch-point site consensus sequence.
U2 snRNP 将 BBP 和 U2AF 排挤出去,并与分支点位点共识序列形成碱基对。
The U4/U6+U5 "triple" snRNP enters the reaction. In this triple snRNP, the U4 and U6 snRNAs are held firmly together by base-pair interactions. Subsequent rearrangements break apart the U4/U6 base pairs, allowing U6 to displace U1 at the 5' splice junction and ejecting the U4 snRNP and some of the proteins of the U6 snRNP.
U4/U6+U5“三联体”snRNP 进入反应。在这个三联体 snRNP 中,U4 和 U6 snRNA 通过碱基对相互牢固地结合在一起。随后的重新排列打破了 U4/U6 碱基对,使 U6 能够在 5'剪接点处取代 U1,并排出 U4 snRNP 和部分 U6 snRNP 的蛋白质。
Addition of the NTC/NTR protein complex positions the snRNPs to form the active site of the spliceosome and brings the branch point in proximity to the 5' splice site.
NTC/NTR 蛋白复合物的加入使 snRNPs 定位并形成剪接体的活性位点,同时使分支点靠近 5'剪接位点。
Lariat formed. Additional rearrangements bring the two exon segments together and place them in the active site.
套圈形成。额外的重新排列将两个外显子片段带到一起并将它们放置在活性位点。

Spliced RNA is released.
剪接的 RNA 被释放。
Spliceosome components recycled using ATP hydrolysis to "reset" them.
剪接体组分通过 ATP 水解回收,以“重置”它们。
Figure 6-29 The pre-mRNA splicing mechanism. RNA splicing is catalyzed by an assembly of snRNPs (shown as colored shapes) plus other proteins (most of which are not shown), which together constitute the spliceosome. The spliceosome recognizes the splicing signals on a pre-mRNA molecule, brings the two ends of the intron together, and forms the active site that catalyzes the two covalent bond-making and -breaking steps required (see Figure 6-26A and Movie 6.5). As shown, nearly every step is accompanied by hydrolysis of a molecule of ATP, which prevents the reaction from stalling or moving backwards and, as discussed below, increases the accuracy of splicing. Although only six molecules of ATP are shown in this simplified diagram, some steps require more than one, and a total of eight ATPs are consumed in each splicing reaction. As indicated in the last step, a set of proteins called the exon junction complex (EJC) is retained on the spliced mRNA molecule; its subsequent role will be discussed shortly.
图 6-29 前 mRNA 剪接机制。RNA 剪接由一组 snRNPs(显示为彩色形状)加上其他蛋白质(其中大部分未显示)催化,共同构成剪接体。剪接体识别前 mRNA 分子上的剪接信号,将内含子的两端相互靠近,并形成催化所需的两个共价键形成和断裂步骤的活性位点(参见图 6-26A 和视频 6.5)。如图所示,几乎每个步骤都伴随着一分子 ATP 的水解,这可以防止反应停滞或倒退,并且,如下所讨论的,增加了剪接的准确性。尽管这个简化图中只显示了六个 ATP 分子,但有些步骤需要多于一个,每个剪接反应消耗八个 ATP。如最后一步所示,一组蛋白质称为外显子连接复合物(EJC)保留在剪接后的 mRNA 分子上;其后续作用将很快讨论。
The spliceosome is a complex and dynamic machine. When studied in vitro, a few components of the spliceosome assemble on pre-mRNA and, as the splicing reaction proceeds, new components enter and those that have already performed their tasks are jettisoned (Figure 6-29). However, many scientists believe that, inside the cell, the spliceosome is a preexisting, loose assembly of all the components-capturing, splicing, and releasing RNA as a coordinated unit, and undergoing extensive rearrangements each time a splice is made.
剪接体是一个复杂而动态的机器。在体外研究时,剪接体的一些组分会在预 mRNA 上组装,随着剪接反应的进行,新的组分进入,已完成任务的组分被抛弃(图 6-29)。然而,许多科学家认为,在细胞内,剪接体是一个预先存在的、松散的所有组分的组装体,作为一个协调的单位捕获、剪接和释放 RNA,并在每次进行剪接时经历广泛的重组。

The Spliceosome Uses ATP Hydrolysis to Produce a Complex Series of RNA-RNA Rearrangements
剪接体利用 ATP 水解产生一系列复杂的 RNA-RNA 重排

There are many unusual features of the splicing reaction compared to the typical catalytic processes in the cell introduced in Chapter 2. First, splicing seems grossly inefficient, requiring more than a hundred proteins, five RNA molecules, and the hydrolysis of eight molecules of ATP to produce a single splice. Many genes require multiple splicing events to produce a single functional mRNA molecule (more than 20 in the example in Figure 6-25B), and the process seems inordinately complex. Second, each splicing reaction requires that the catalytic site for the reaction be assembled de novo on each pre-mRNA molecule through a complex, multistep process. Third, as mentioned earlier, the catalytic site of the spliceosome (which catalyzes both steps of the splicing reaction) is formed by RNA molecules (the snRNAs) rather than by proteins (Figure 6-30), with the proteins required to correctly position these RNAs. In the final section of this chapter, we describe the structure and chemical properties of RNA molecules that allow them to act as catalysts, as well as the proposal that the RNA-based reaction mechanisms that exist today are "leftovers" from ancient, RNA-only biological systems.
剪接反应与第 2 章介绍的细胞中典型催化过程相比有许多不同之处。首先,剪接似乎极其低效,需要超过一百种蛋白质、五种 RNA 分子和水解八个 ATP 分子才能产生一个剪接体。许多基因需要多次剪接事件才能产生一个功能性 mRNA 分子(如图 6-25B 中的示例中超过 20 次),这个过程似乎异常复杂。其次,每个剪接反应都需要通过一个复杂的、多步骤的过程在每个 pre-mRNA 分子上新组装反应的催化位点。第三,正如前面提到的,剪接体的催化位点(催化剪接反应的两个步骤)是由 RNA 分子(snRNAs)而不是蛋白质(如图 6-30 所示)形成的,需要蛋白质正确定位这些 RNA。在本章的最后一节中,我们描述了 RNA 分子的结构和化学性质,使它们能够作为催化剂,并提出今天存在的基于 RNA 的反应机制是来自古代仅有 RNA 的生物系统的“残留物”的建议。
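The cost argument in this paragraph can be made concrete with a short calculation. This is a rough sketch under two stated assumptions: every intron in a transcript is removed by exactly one splicing event, and each event consumes the eight ATP molecules quoted in the text. The function names are invented for this example.

```python
ATP_PER_SPLICE = 8  # figure quoted in the text for one splicing reaction


def introns_in_gene(exon_count):
    """A gene with n exons has n - 1 introns between them."""
    return max(exon_count - 1, 0)


def atp_for_transcript(exon_count, atp_per_splice=ATP_PER_SPLICE):
    """ATP hydrolyzed to splice one transcript, assuming one event per intron."""
    return introns_in_gene(exon_count) * atp_per_splice


# Factor VIII (Figure 6-25B) has 26 exons, hence 25 introns.
print(introns_in_gene(26), atp_for_transcript(26))
# The beta-globin gene (Figure 6-25A) has 3 exons, hence 2 introns.
print(introns_in_gene(3), atp_for_transcript(3))
```

Under these assumptions a single Factor VIII transcript needs 25 splicing events, consistent with the "more than 20" mentioned above.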
How might we rationalize the unusual biochemical complexity of splicing compared to other steps in gene expression? As we discuss shortly, pre-mRNA splicing has evolved from a much simpler, purely RNA-based process, and its complexity may in part be an example of what some evolutionary biologists term "runaway bureaucracies," accruing more and more parts over time that are now required for the process without necessarily making it any better. We do know, however, that some of the complexity of pre-mRNA splicing provides advantages. Even though ATP hydrolysis is not required for the splicing reaction per se, the numerous ATP hydrolysis steps keep the reaction from stalling or running backwards and, as we will see shortly, they also increase the accuracy of pre-mRNA splicing.
我们如何解释剪接在基因表达中与其他步骤相比异常的生化复杂性?正如我们很快将讨论的那样,前 mRNA 剪接已经从一个更简单的、纯粹基于 RNA 的过程演变而来,其复杂性可能在一定程度上是一种一些进化生物学家所称的“失控官僚体系”的例子,随着时间的推移积累了越来越多的部分,现在这些部分对于过程是必需的,但并不一定使其变得更好。然而,我们确实知道,前 mRNA 剪接的一些复杂性提供了优势。尽管 ATP 水解对于剪接反应本身并非必需,但众多的 ATP 水解步骤可以防止反应停滞或倒退,并且正如我们很快将看到的,它们还可以提高前 mRNA 剪接的准确性。
Most of the spliceosome proteins that hydrolyze ATP use the released energy to break existing RNA-RNA interactions to allow the formation of new ones. These RNA-RNA rearrangements allow the splicing signals on the pre-mRNA to be examined several times during the course of splicing. For example, the U1 snRNP initially recognizes the 5' splice site through conventional base-pairing; as splicing proceeds, these base pairs are broken (using the energy of ATP hydrolysis) and U1 is replaced by U6 (see Figure 6-30A and B). Likewise, the branch point is "examined" twice, the first time by the branch-point binding protein and the second time by the U2 snRNP (see Figure 6-29). In this way, the spliceosome checks and rechecks the splicing signals before the active site for the two transesterification reactions forms (see Figure 6-30C), thereby increasing overall accuracy.
大多数水解 ATP 的剪接体蛋白利用释放的能量来打破现有的 RNA-RNA 相互作用,以便形成新的相互作用。这些 RNA-RNA 重排允许在剪接过程中多次检查前体 mRNA 上的剪接信号。例如,U1 snRNP 最初通过传统的碱基配对识别 5'剪接位点;随着剪接的进行,这些碱基对被打破(利用 ATP 水解的能量),U1 被 U6 取代(见图 6-30A 和 B)。同样,分支点被“检查”两次,第一次由分支点结合蛋白,第二次由 U2 snRNP(见图 6-29)。通过这种方式,在两个转酯化反应的活性位点形成之前,剪接体会反复检查剪接信号(见图 6-30C),从而提高整体准确性。
Note that both catalytic steps of the splicing reaction occur in the same active site (the second reaction is almost the reverse of the first). ATP-mediated
请注意,剪接反应的两个催化步骤发生在同一个活性位点中(第二个反应几乎是第一个反应的逆反应)。通过 ATP 介导。
Figure 6-30 An example of an ATP hydrolysis-driven RNA-RNA rearrangement that occurs during splicing. Schematic diagram of the arrangement of RNA molecules before (A) and after (B) the active site has been formed for the two phosphoryl-transfer reactions. This rearrangement also brings the branch point and the 5' splice site together in preparation for the first reaction. The active site (which grasps the two magnesium ions needed for the chemistry of the reactions) is formed by U2 and U6. Several sequential ATP-dependent rearrangement steps are needed to convert the configuration in A to that in B. (C) The actual structure, determined by cryo-electron microscopy, of the RNA-based catalytic core of the spliceosome that was schematically illustrated in B. For simplicity, the proteins that surround the RNAs are not shown, even though their conformations are known. High-resolution structures have also been determined for most of the other spliceosome intermediates. (B, adapted from M.E. Wilkinson, C. Charenton, and K. Nagai, Annu. Rev. Biochem. 89:359-398, 2020. With permission from Annual Reviews.)
图 6-30 剪接过程中发生的由 ATP 水解驱动的 RNA-RNA 重排示例。在两个磷酰基转移反应的活性位点形成之前(A)和之后(B),RNA 分子排列的示意图。这种重排还将分支点和 5'剪接位点聚集在一起,为第一次反应做准备。活性位点(抓住反应化学所需的两个镁离子)由 U2 和 U6 形成。需要几个顺序的 ATP 依赖性重排步骤,将 A 中的构型转换为 B 中的构型。(C)由冷冻电镜确定的剪接体 RNA 催化核心的实际结构,该结构在 B 中以示意图形式呈现。为简单起见,未显示包围 RNA 的蛋白质,尽管它们的构象已知。大多数其他剪接体中间体的高分辨率结构也已确定。(B,改编自 M.E. Wilkinson,C. Charenton 和 K. Nagai,Annu. Rev. Biochem. 89:359-398,2020。获得 Annual Reviews 的许可。)
rearrangements are required between the two reactions to properly reposition the pre-mRNA for the second reaction, helping to ensure that splicing accidents occur only rarely.
需要在这两个反应之间进行重新排列,以正确重新定位前 mRNA 进行第二个反应,有助于确保剪接意外仅发生很少。
A general process associated with ATP hydrolysis, called kinetic proofreading, further increases spliceosome accuracy. Because an incorrect, "off-target" base-pairing interaction is weaker than the correct one, it dissociates more rapidly. Each ATP-mediated rearrangement of the spliceosome takes a finite amount of time, and this time delay favors the correct choice, because off-target interactions will often dissociate during this time window, giving the correct interaction multiple chances to form. Such kinetic proofreading is used throughout biology. For example, we saw in Chapter 5 that the initial selection of the correct nucleotides by DNA polymerase during DNA replication takes advantage of this principle. We will discuss it in more detail later in the chapter when we describe translation by the ribosome.
与 ATP 水解相关的一般过程,称为动力学校对,进一步提高了剪接体的准确性。因为一个不正确的“非靶向”碱基配对相互作用会比正确的配对弱,不正确的相互作用会比正确的更快地解离。剪接体的每次 ATP 介导的重排都需要一定的时间,这种时间延迟会有利于正确选择,因为在这段时间窗口内,非靶向相互作用通常会解离,使得正确的相互作用有多次形成的机会。这种动力学校对在整个生物学中都被使用。例如,我们在第 5 章中看到,在 DNA 复制过程中,DNA 聚合酶通过这一原理来选择正确的核苷酸。在本章后面描述核糖体的翻译时,我们将更详细地讨论这一点。
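The time-delay argument can be illustrated with a toy calculation. The sketch below assumes simple first-order dissociation, so that the chance an interaction is still in place after a delay t is exp(-k_off * t). The rate constants and delay are hypothetical values chosen only to show the trend, and the model leaves out the ATP-driven step itself.

```python
import math


def survival(k_off, delay):
    """Probability that an interaction with off-rate k_off outlasts the delay."""
    return math.exp(-k_off * delay)


def discrimination(k_off_correct, k_off_wrong, delay, checkpoints=1):
    """Ratio of correct to off-target complexes surviving all checkpoints.

    Assumes each checkpoint imposes the same delay and acts independently.
    """
    per_step = survival(k_off_correct, delay) / survival(k_off_wrong, delay)
    return per_step ** checkpoints


# Hypothetical rates: the off-target pairing falls apart ten times faster.
K_CORRECT, K_WRONG, DELAY = 0.1, 1.0, 1.0
for n in (1, 2, 4):
    print(n, round(discrimination(K_CORRECT, K_WRONG, DELAY, n), 2))
```

In this simplified picture, each added checkpoint multiplies the preference for the correct interaction, which is the sense in which repeated, delayed inspection of the splicing signals raises accuracy.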
Once the splicing chemistry is completed, the snRNPs remain bound to the excised lariat. The disassembly of these snRNPs from the lariat (and from
一旦剪接化学完成,snRNPs 仍然与被切除的套圈结合在一起。这些 snRNPs 从套圈(和从)中解体。
Figure 6-31 Variation in intron and exon lengths in the human, worm, and fly genomes. (A) Size distribution of exons. (B) Size distribution of introns. Note that exon length is much more uniform than intron length. (Adapted from International Human Genome Sequencing Consortium, Nature 409:860-921, published 2001 by Macmillan Magazines Ltd. Reproduced with permission of SNCSC.)
图 6-31 人类、蠕虫和果蝇基因组内含子和外显子长度的变异。(A) 外显子大小分布。(B) 内含子大小分布。请注意,外显子长度比内含子长度更加均匀。(改编自国际人类基因组测序联盟, 自然杂志 409:860-921, 2001 年由麦克米兰杂志有限公司出版。经 SNCSC 许可复制。)
each other) requires another series of ATP-driven RNA-RNA rearrangements, thereby returning the snRNAs to their original configuration to be used again in a new reaction.
彼此之间)需要另一系列由 ATP 驱动的 RNA-RNA 重排,从而将 snRNA 恢复到其原始配置,以便再次在新的反应中使用。
As splicing is completed, the spliceosome directs a set of proteins to bind to the mRNA near the position formerly occupied by the intron. Called the exon junction complex (EJC), these proteins mark the site of a successful splicing event, and, as we shall see later in this chapter, they influence the subsequent fate of the mRNA.
随着剪接的完成,剪接体引导一组蛋白质结合到 mRNA 上曾被内含子占据的位置附近。这些蛋白质被称为外显子连接复合物(EJC),它们标记了成功剪接事件的位置,并且正如我们将在本章后面看到的那样,它们影响 mRNA 随后的命运。

Other Properties of Pre-mRNA and Its Synthesis Help to Explain the Choice of Proper Splice Sites
Pre-mRNA 的其他特性及其合成有助于解释适当剪接位点的选择

As shown in Figure 6-31, intron sequences vary enormously in size; some are more than 100,000 nucleotides long. One might therefore expect frequent splicing mistakes, including exon skipping and the mistaken use of "cryptic" splice sites (Figure 6-32). To minimize these problems, the fidelity mechanisms built into the spliceosome are supplemented by two additional strategies that further increase the accuracy of splicing. The first is a simple consequence of splicing being coupled to transcription (Figure 6-33). As transcription proceeds, the phosphorylated tail of RNA polymerase carries several components that stimulate formation of the spliceosome (see Figure 6-23), and these components are transferred directly from the polymerase to the RNA as the RNA emerges from the polymerase. This strategy helps the cell to keep track of introns and exons; for example, the snRNPs that assemble at a 5' splice site are initially presented only with the single 3' splice site that emerges next from the polymerase, inasmuch as the potential sites further downstream have not yet been synthesized. The coordination of transcription with splicing is thus important for preventing inappropriate exon skipping.
如图 6-31 所示,内含子序列在大小上变化巨大;有些长达超过 100,000 个核苷酸。因此,人们可能会预期频繁的剪接错误,包括外显子跳跃和错误使用“隐蔽”剪接位点(图 6-32)。为了最小化这些问题,剪接体内建的保真机制还辅以另外两种策略来进一步提高剪接的准确性。第一种策略是剪接与转录耦合的简单结果(图 6-33)。随着转录的进行,RNA 聚合酶的磷酸化尾部携带几种促进剪接体形成的组分(见图 6-23),当 RNA 从聚合酶中出现时,这些组分会直接从聚合酶转移到 RNA 上。这种策略有助于细胞跟踪内含子和外显子;例如,在 5'剪接位点组装的 snRNPs 最初只会遇到接下来从聚合酶中出现的单个 3'剪接位点,因为更下游的潜在位点尚未合成。因此,转录与剪接的协调对于防止不恰当的外显子跳跃非常重要。
Another mechanism, called exon definition, also helps cells choose the appropriate splice sites. Exon size tends to be much more uniform than intron size,
另一个机制,称为外显子定义,也有助于细胞选择适当的剪接位点。外显子的大小往往比内含子的大小要均匀得多。

Figure 6-32 Two types of potential splicing errors. (A) Exon skipping. (B) Cryptic splice-site selection. Cryptic splicing signals are nucleotide sequences of RNA that closely resemble true splicing signals and are sometimes mistakenly used by the spliceosome.
图 6-32 两种潜在的剪接错误。 (A) 外显子跳跃。 (B) 隐匿剪接位点选择。 隐匿剪接信号是 RNA 的核苷酸序列,与真实的剪接信号非常相似,有时会被剪接体错误地使用。
averaging about 150 nucleotide pairs across a wide variety of eukaryotic organisms (see Figure 6-31). Through exon definition, the splicing machinery seeks out the relatively homogeneously sized exon sequences rather than the intron sequences. It does this in the following way: as RNA synthesis proceeds, a group of additional components (most notably SR proteins, so named because each contains a domain rich in serines and arginines) assemble on exon sequences and help to mark off each 3' and 5' splice site, starting at the 5' end of the RNA (Figure 6-34). These proteins, in turn, recruit the U1 snRNP, which marks one exon boundary, and U2AF, which, along with BBP, specifies the other. By marking out the exons in this way and thereby taking advantage of the relatively uniform size of exons, the cell increases the accuracy with which it deposits the initial splicing components on the nascent RNA and thereby avoids "near miss" splice sites. To further aid the cell in marking off exons and distinguishing them from introns, some SR proteins bind tightly to specific RNA sequences, often termed splicing enhancers, that are preferentially found in exons. Because several different codons specify most amino acids, it is possible for a splicing enhancer to evolve without a change in the amino acid sequence coded by the exon.
在各种真核生物中,平均约为 150 个核苷酸对(见图 6-31)。通过外显子定义,剪接机制寻找大小相对均匀的外显子序列,而不是内含子序列。它是通过以下方式实现的:随着 RNA 合成的进行,一组额外的组分(最主要的是 SR 蛋白,因每个都含有一个富含丝氨酸和精氨酸的结构域而得名)会在外显子序列上组装,并帮助标记每个 3'和 5'剪接位点,从 RNA 的 5'端开始(图 6-34)。这些蛋白质反过来招募 U1 snRNP,标记一个外显子边界,以及 U2AF,它与 BBP 一起指定另一个边界。通过这种方式标记出外显子,并利用外显子相对统一的大小,细胞提高了将初始剪接组分沉积在新生 RNA 上的准确性,从而避免“差之毫厘”的剪接位点。为了进一步帮助细胞标记外显子并将其与内含子区分开,一些 SR 蛋白紧密结合到特定的 RNA 序列上,这些序列通常称为剪接增强子,优先出现在外显子中。由于大多数氨基酸由几个不同的密码子指定,因此,剪接增强子有可能在不改变外显子编码的氨基酸序列的情况下进化。
Both the marking of exon and intron boundaries and the assembly of the spliceosome begin on an RNA molecule while it is still being elongated by RNA polymerase (see Figure 6-33). However, because the actual chemistry of splicing can be delayed, intron sequences are not necessarily removed from a pre-mRNA molecule in the order in which they occur along the RNA chain.
外显子和内含子边界的标记以及剪接体的组装都始于 RNA 分子,而该分子在被 RNA 聚合酶拉长时(见图 6-33)。然而,由于剪接的实际化学过程可能会延迟,内含子序列不一定按照它们在 RNA 链上出现的顺序从前体 mRNA 分子中移除。

RNA Splicing Has Remarkable Plasticity
RNA 剪接具有非凡的可塑性

We have seen that the choice of splice sites depends on such features of the pre-mRNA transcript as the strength of the three signals on the RNA recognized by
我们已经看到,剪接位点的选择取决于 pre-mRNA 转录本的特征,如 RNA 上被识别的三个信号的强度
Figure 6-33 Electron micrograph of a heavily transcribed gene containing multiple introns in an early Drosophila embryo, illustrating the co-transcriptional splicing of its RNA transcripts. (Courtesy of Victoria Foe. Friday Harbor Laboratories, Friday Harbor WA.)
图 6-33 早期果蝇胚胎中含有多个内含子的大量转录基因的电子显微镜图,展示其 RNA 转录本的共转录剪接。(由 Victoria Foe 提供。Friday Harbor Laboratories,Friday Harbor WA。)

Figure 6-34 The exon definition hypothesis. SR proteins bind to each exon sequence in the pre-mRNA and thereby help to guide the snRNPs to the proper intron-exon boundaries. This demarcation of exons by the SR proteins occurs co-transcriptionally, beginning at the CBC (cap-binding complex) at the 5' end. It has also been proposed that a group of proteins known as the heterogeneous nuclear ribonucleoproteins (hnRNPs) preferentially associates with intron sequences, further helping the spliceosome distinguish introns from exons. (Adapted from R. Reed, Curr. Opin. Cell Biol. 12:340-345, 2000. With permission from Elsevier.)
图 6-34 外显子定义假说。SR 蛋白结合到前体 mRNA 中的每个外显子序列,从而帮助引导 snRNPs 到正确的内含子-外显子边界。SR 蛋白对外显子的划分是在共转录过程中发生的,从 5'端的 CBC(帽结合复合物)开始。还有人提出,一组被称为异质核核糖核蛋白(hnRNPs)的蛋白偏好与内含子序列结合,进一步帮助剪接体区分内含子和外显子。(改编自 R. Reed, Curr. Opin. Cell Biol. 12:340-345, 2000. 获 Elsevier 许可。)

the splicing machinery (the 5' and 3' splice junctions and the branch point), the co-transcriptional assembly of the spliceosome, and the "bookkeeping" that underlies exon definition. We do not know exactly how accurate splicing normally is because, as we see later, there are several quality-control systems that rapidly destroy mRNAs whose splicing goes awry. However, we do know that, compared with other steps in gene expression, splicing is unusually flexible.
剪接机制(5'和 3'剪接点以及分支点)、剪接体的共转录组装以及支持外显子定义的“簿记”等方面。我们并不确切知道剪接通常有多准确,因为正如后面所看到的,存在几个质量控制系统会迅速销毁剪接出现问题的 mRNA。然而,我们知道,与基因表达的其他步骤相比,剪接具有异常的灵活性。
Thus, for example, a mutation in a nucleotide sequence critical for splicing of a particular intron does not necessarily prevent splicing of that intron altogether. Instead, the mutation typically creates a new pattern of splicing (Figure 6-35). Most commonly, an exon is simply skipped (Figure 6-35B). In other cases, the mutation causes a cryptic splice junction to become the default choice (Figure 6-35C). Apparently, the splicing machinery has evolved to pick out the best possible pattern of splice junctions, and if the optimal one is damaged by mutation, it will seek out the next best pattern, and so on. This inherent plasticity in the process of RNA splicing suggests that changes in splicing patterns caused by random mutations have been important in the evolution of genes and organisms. It also means that mutations that affect splicing can be severely detrimental to the organism: in addition to the β-thalassemia example presented in Figure 6-35, aberrant splicing plays important roles in the development of cystic fibrosis, frontotemporal dementia, Parkinson's disease, retinitis pigmentosa, spinal muscular atrophy, myotonic dystrophy, premature aging, and cancer. It has been estimated that of the many point mutations that cause inherited human diseases, produce aberrant splicing of the gene containing the mutation.
因此,例如,对某个内含子的剪接至关重要的核苷酸序列发生突变,并不一定会完全阻止该内含子的剪接。相反,突变通常会产生新的剪接模式(图 6-35)。最常见的是,一个外显子被简单地跳过(图 6-35B)。在其他情况下,突变会导致隐匿剪接点成为默认选择(图 6-35C)。显然,剪接机制已经进化到能够挑选出最佳的剪接点模式,如果最佳模式受到突变损害,它将寻找下一个最佳模式,依此类推。RNA 剪接过程中的这种固有可塑性表明,由随机突变引起的剪接模式变化在基因和生物体的进化中起着重要作用。这也意味着影响剪接的突变对生物体可能会造成严重危害:除了图 6-35 中提到的 β-地中海贫血的例子外,剪接异常在囊性纤维化、额颞叶痴呆症、帕金森病、视网膜色素变性、脊髓性肌萎缩症、强直性肌营养不良、早衰和癌症的发展中起着重要作用。 据估计,导致遗传性人类疾病的许多点突变中, 会产生包含该突变的基因的异常剪接。
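The alternative outcomes described in this paragraph can be mimicked with a toy transcript. In the sketch below, exons are written in uppercase and introns in lowercase; the sequences, coordinates, and the position of the cryptic site are all invented for illustration and do not correspond to the β-globin gene.

```python
EXON1, INTRON1 = "AUGGCC", "guaaguccuacag"
EXON2, INTRON2 = "GAUUCG", "guaagcccucag"
EXON3 = "UUGUAA"
PRE_MRNA = EXON1 + INTRON1 + EXON2 + INTRON2 + EXON3


def splice(pre_mrna, kept):
    """Join the kept (start, end) intervals in order; end is exclusive."""
    return "".join(pre_mrna[s:e] for s, e in kept)


# Normal pattern: all three exons are joined.
NORMAL = [(0, 6), (19, 25), (37, 43)]
# Exon skipping: exon 1 is joined directly to exon 3.
SKIPPED = [(0, 6), (37, 43)]
# Cryptic 3' site: an "ag" inside intron 2 is used instead of the true one,
# so part of the intron is retained and exon 3 is extended.
CRYPTIC = [(0, 6), (19, 25), (30, 43)]

for name, pattern in (("normal", NORMAL), ("skipped", SKIPPED), ("cryptic", CRYPTIC)):
    print(name, splice(PRE_MRNA, pattern))
```

Lowercase letters that survive in the output mark intron sequence carried into the mRNA, the situation labeled "extended exon 3" in Figure 6-35C.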
The plasticity of RNA splicing also means that the cell can easily regulate the pattern of RNA splicing. Earlier in this section we saw that alternative splicing can give rise to different proteins from the same gene and that this is a common strategy to enhance the coding potential of genomes. Some examples of alternative splicing are constitutive; that is, the alternatively spliced mRNAs are produced continually by cells of an organism. However, in many cases, the cell regulates the splicing patterns so that different forms of the protein are produced at different times and in different tissues (see Figure 6-27). In Chapter 7, we return to this issue to discuss some specific examples of regulated RNA splicing.
RNA 剪接的可塑性也意味着细胞可以轻松调节 RNA 剪接的模式。在本节的前面,我们看到选择性剪接可以使同一基因产生不同的蛋白质,这是增强基因组编码潜力的常见策略。一些选择性剪接的例子是固定的;也就是说,有机体的细胞持续产生选择性剪接的 mRNA。然而,在许多情况下,细胞会调节剪接模式,以便在不同的时间和不同的组织中产生不同形式的蛋白质(见图 6-27)。在第 7 章中,我们将回到这个问题,讨论一些受调控的 RNA 剪接的具体例子。

Spliceosome-catalyzed RNA Splicing Evolved from RNA Self-splicing Mechanisms
剪接体催化的 RNA 剪接起源于 RNA 自剪接机制

When the spliceosome was first discovered, it puzzled molecular biologists. Why do RNA molecules instead of proteins perform important roles in splice-site recognition and in the chemistry of splicing? Why is a lariat intermediate used rather than the apparently simpler alternative of bringing the 5' and 3' splice sites together in a single step, followed by their direct cleavage and rejoining? The answers to these questions reflect the way in which the spliceosome has evolved.
当剪接体首次被发现时,它让分子生物学家感到困惑。为什么是 RNA 分子而不是蛋白质在剪接位点识别和剪接化学过程中扮演重要角色?为什么要使用套圈中间体,而不是采用看似更简单的方式,即一步将 5'和 3'剪接位点拉到一起,然后直接切割并重新连接?这些问题的答案反映了剪接体演化的方式。
As discussed in the final section of this chapter, it is likely that early cells used RNA molecules rather than proteins as their major catalysts and that they stored their genetic information in RNA rather than in DNA sequences. RNA-catalyzed splicing reactions presumably had critical roles in these early cells. As evidence, some self-splicing RNA introns (that is, intron sequences in RNA whose splicing out can occur in the absence of proteins or any other RNA molecules) remain today, for example, in the nuclear rRNA genes of the ciliate Tetrahymena, in a few bacteriophage T4 genes, and in some mitochondrial and chloroplast genes. In these cases, the RNA molecule folds into a specific three-dimensional structure that brings the intron-exon junctions together and directly catalyzes the two transesterification reactions. A self-splicing intron sequence can be identified in a test tube by incubating a pure RNA molecule that contains the intron sequence and observing the splicing reaction. Because the basic chemistry as well as the structure of the active site of some self-splicing RNAs is so similar to those of the pre-mRNA spliceosome, the much more involved
正如本章的最后一节所讨论的那样,早期细胞很可能使用 RNA 分子而不是蛋白质作为它们的主要催化剂,并且将它们的遗传信息存储在 RNA 而不是 DNA 序列中。RNA 催化的剪接反应可能在这些早期细胞中起着关键作用。作为证据,一些自剪接 RNA 内含子(即,在 RNA 中的内含子序列可以在没有蛋白质或任何其他 RNA 分子的情况下剪接出来的内含子序列)今天仍然存在,例如在草履虫 Tetrahymena 的核 rRNA 基因中,在一些噬菌体 T4 基因中以及一些线粒体和叶绿体基因中。在这些情况下,RNA 分子会折叠成特定的三维结构,将内含子-外显子结合位点聚集在一起,并直接催化两个酯交换反应。通过孵化含有内含子序列的纯 RNA 分子并观察剪接反应,可以在试管中识别自剪接内含子序列。由于一些自剪接 RNA 的基本化学性质以及活性位点的结构与前 mRNA 剪接体的非常相似,因此更复杂。

(A) NORMAL ADULT β-GLOBIN RNA TRANSCRIPT
(A)正常成人 β-珠蛋白 RNA 转录物
normal mRNA is formed from three exons
正常的 mRNA 由三个外显子形成
(B) A SINGLE-NUCLEOTIDE CHANGE THAT DESTROYS A NORMAL SPLICE SITE, THEREBY CAUSING EXON SKIPPING
(B)一个单核苷酸改变破坏了正常的剪接位点,从而导致外显子跳跃
(C) A SINGLE-NUCLEOTIDE CHANGE THAT DESTROYS A NORMAL SPLICE SITE, THEREBY ACTIVATING A CRYPTIC SPLICE SITE
(C)一个单核苷酸改变破坏了正常的剪接位点,从而激活了一个隐匿的剪接位点
mRNA with extended exon 3
具有扩展外显子 3 的 mRNA
(D) A SINGLE-NUCLEOTIDE CHANGE THAT CREATES A NEW SPLICE SITE, THEREBY CAUSING A NEW EXON TO BE INCORPORATED
(D)一个单核苷酸改变,产生一个新的剪接位点,从而导致一个新外显子被合并
mRNA with extra exon inserted between exon 2 and exon 3
在外显子 2 和外显子 3 之间插入额外外显子的 mRNA
Figure 6-35 Abnormal processing of the β-globin primary RNA transcript in humans with the disease β-thalassemia. In the examples shown, the disease (a severe anemia due to aberrant hemoglobin synthesis) is caused by splice-site mutations found in the genomes of affected individuals. The dark blue boxes represent the three normal exon sequences; the red lines connect the 5' and 3' splice sites that are used. The light blue boxes depict new nucleotide sequences included in the final mRNA molecule as a result of the mutation denoted by the black arrowhead. Note that when a mutation leaves a normal splice site without a partner, an exon is skipped as in panel B or one or more abnormal cryptic splice sites nearby is used as the partner site as in panel C.
图 6-35 人类患有 β-地中海贫血症的 β-珠蛋白原始 RNA 转录异常处理。在所示的示例中,这种疾病(由于异常血红蛋白合成而导致的严重贫血)是由发现在受影响个体基因组中的剪接位点突变引起的。深蓝色方框代表三个正常外显子序列;红线连接使用的 5' 和 3' 剪接位点。浅蓝色方框描述由于黑色箭头标记的突变而包含在最终 mRNA 分子中的新核苷酸序列。请注意,当突变使正常剪接位点失去配对时,会跳过一个外显子,如 B 面所示,或者使用附近一个或多个异常隐匿剪接位点作为配对位点,如 C 面所示。
process of pre-mRNA splicing described earlier very likely evolved from a simpler, ancestral form of RNA self-splicing.
前 mRNA 剪接的过程很可能是从更简单的、祖先形式的 RNA 自剪接进化而来的。

RNA-processing Enzymes Generate the 3' End of Eukaryotic mRNAs
RNA 加工酶生成真核 mRNA 的 3'端

We have seen that the 5' end of the pre-mRNA produced by RNA polymerase II is capped almost as soon as it emerges from the RNA polymerase. Then, as the polymerase continues its movement along a gene, spliceosomes assemble on the RNA and delineate the intron and exon boundaries. The long C-terminal tail of the RNA polymerase coordinates these processes by transferring capping and splicing components directly to the RNA as it emerges from the enzyme. In this section, we describe how a similar mechanism ensures that the 3' end of the pre-mRNA is properly processed as RNA polymerase II reaches the end of a gene.
我们已经看到,由 RNA 聚合酶 II 产生的 pre-mRNA 的 5' 端几乎在其从 RNA 聚合酶中出现时就被加帽。然后,随着聚合酶沿基因的移动继续进行,剪接体在 RNA 上组装,并勾画内含子和外显子的边界。RNA 聚合酶的长 C 端尾部通过将加帽和剪接组分直接传递给从酶中出现的 RNA 来协调这些过程。在本节中,我们描述了类似的机制如何确保当 RNA 聚合酶 II 到达基因末端时,pre-mRNA 的 3' 端得到适当处理。
The position of the 3' end of each mRNA molecule is specified by signals encoded in the DNA nucleotide sequence (Figure 6-36). These signals are transcribed into RNA as the RNA polymerase II moves through them, and they are then recognized (as RNA) by a series of RNA-binding proteins and RNA-processing enzymes (Figure 6-37). Two multisubunit proteins (called CstF, cleavage stimulation factor; and CPSF, cleavage and polyadenylation specificity factor) are of special importance. Both of these proteins travel with the RNA polymerase tail and are transferred to the 3'-end processing sequence on an RNA molecule as it emerges from the RNA polymerase.
每个 mRNA 分子的 3' 端位置由编码在 DNA 核苷酸序列中的信号指定(图 6-36)。这些信号在 RNA 聚合酶 II 通过它们时被转录成 RNA,然后被一系列 RNA 结合蛋白和 RNA 加工酶(作为 RNA)识别(图 6-37)。两种多亚基蛋白(称为 CstF,剪切刺激因子;和 CPSF,剪切和多聚腺苷酸特异性因子)非常重要。这两种蛋白都随 RNA 聚合酶尾部移动,并在 RNA 聚合酶释放时转移到 RNA 分子上的 3' 端加工序列。
Once the two proteins bind to their recognition sequences on the emerging RNA molecule, additional proteins assemble with them to cleave the RNA (releasing it from the RNA polymerase) and complete the 3' end of the mRNA. Once the RNA is cleaved, an enzyme called poly-A polymerase adds approximately 200 A nucleotides, one at a time, to the 3' end produced by the cleavage (see Figure 6-37). The precursor for these additions is ATP, and the same type of 5'-to-3' bonds are formed as in conventional RNA synthesis. But unlike other RNA polymerases, poly-A polymerase does not require a template; hence, the poly-A tail of eukaryotic mRNAs is not directly encoded in the genome. As the poly-A tail is synthesized, proteins called poly-A-binding proteins assemble onto it and, by a poorly understood mechanism, they help determine the final length of the tail.
一旦这两种蛋白质与新出现的 RNA 分子上的识别序列结合,其他蛋白质会与它们一起组装,以切割 RNA(将其从 RNA 聚合酶中释放),并完成 mRNA 的 3' 端。一旦 RNA 被切割,一种名为聚 A 聚合酶的酶会逐个添加大约 200 个 A 核苷酸到切割产生的 3' 端(见图 6-37)。这些添加的前体是 ATP,并且形成的是与常规 RNA 合成中相同类型的 5'-3' 键。但与其他 RNA 聚合酶不同,聚 A 聚合酶不需要模板;因此,真核 mRNA 的聚 A 尾部并不直接编码在基因组中。随着聚 A 尾部的合成,称为聚 A 结合蛋白的蛋白质会组装到其上,并通过一种不太理解的机制帮助确定尾部的最终长度。
The RNA polymerase II continues to transcribe after the 3' end of a eukaryotic pre-mRNA molecule has been cleaved, in some cases for hundreds of nucleotides. However, two factors increase the likelihood that an RNA polymerase will terminate transcription shortly after it has synthesized the RNA signal for 3'-end
RNA 聚合酶 II 在真核前 mRNA 分子的 3' 端被切除后继续转录,在某些情况下可延续数百核苷酸。然而,有两个因素增加了 RNA 聚合酶在合成 3' 端 RNA 信号后很快终止转录的可能性。
Figure 6-37 Some of the major steps in generating the 3' end of a eukaryotic mRNA. This process is much more complicated than the analogous process in bacteria, where the RNA polymerase simply stops at a termination signal and releases both the 3' end of its transcript and the DNA template (see Figure 6-11).
图 6-37 在生成真核 mRNA 的 3' 端的过程中的一些主要步骤。这个过程比细菌中类似的过程要复杂得多,在细菌中,RNA 聚合酶只是在终止信号处停止,并释放其转录物的 3' 端和 DNA 模板(参见图 6-11)。

Figure 6-36 The consensus nucleotide sequences in RNA that direct cleavage and polyadenylation to form the 3' end of a eukaryotic mRNA. These sequences are encoded in the genome, and specific proteins recognize them (as RNA) after they are transcribed. As shown in Figure 6-37, the hexamer AAUAAA is bound by CPSF, and the GU-rich element beyond the cleavage site is bound by CstF; the CA sequence is bound by a third protein factor required for the cleavage step. Like other consensus nucleotide sequences discussed in this chapter (see Figure 6-12), the sequences shown in the figure represent optimal sequences; in reality, a variety of related cleavage and polyadenylation signals occur in nature.
图 6-36 显示了 RNA 中指导剪切和多聚腺苷酸化形成真核 mRNA 3'端的共识核苷酸序列。这些序列被编码在基因组中,特定蛋白质在转录后识别它们作为 RNA。如图 6-37 所示,六聚体 AAUAAA 被 CPSF 结合,剪切位点后的 GU 富集元素被 CstF 结合;CA 序列被第三个蛋白因子结合,该因子在剪切步骤中是必需的。与本章讨论的其他共识核苷酸序列一样(参见图 6-12),图中显示的序列代表最佳序列;实际上,自然界中存在各种相关的剪切和多聚腺苷化信号。
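The three elements named in the Figure 6-36 caption (the AAUAAA hexamer, the CA dinucleotide, and a downstream GU-rich element) can be treated as a simple pattern to scan for. The following Python sketch is only an illustration of that idea. The spacing limits, window size, and GU fraction are hypothetical parameters chosen for the example, and, as the caption stresses, real signals deviate from the optimal consensus.

```python
import re

# Consensus elements named in the Figure 6-36 caption. Real signals vary.
HEXAMER = "AAUAAA"      # bound by CPSF
CLEAVAGE_DINUC = "CA"   # bound by a third factor required for cleavage

# Hypothetical spacing and scoring parameters, chosen only for illustration.
MIN_GAP, MAX_GAP = 10, 30    # nucleotides between the hexamer end and CA
DOWNSTREAM_SPAN = 30         # how far past CA to look for a GU-rich element
GU_WINDOW, GU_FRACTION = 8, 0.75


def is_gu_rich(seq: str) -> bool:
    """True if any GU_WINDOW-long stretch of seq is at least GU_FRACTION G or U."""
    for i in range(len(seq) - GU_WINDOW + 1):
        chunk = seq[i:i + GU_WINDOW]
        if sum(base in "GU" for base in chunk) / GU_WINDOW >= GU_FRACTION:
            return True
    return False


def find_candidate_sites(rna: str) -> list[tuple[int, int]]:
    """Return (hexamer_index, ca_index) pairs where all three elements line up."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input too
    hits = []
    # The lookahead lets overlapping hexamer matches be reported as well.
    for hexamer in re.finditer(f"(?={HEXAMER})", rna):
        first = hexamer.start() + len(HEXAMER) + MIN_GAP
        last = hexamer.start() + len(HEXAMER) + MAX_GAP
        for ca_index in range(first, min(last, len(rna) - 2) + 1):
            if rna[ca_index:ca_index + 2] != CLEAVAGE_DINUC:
                continue
            downstream = rna[ca_index + 2:ca_index + 2 + DOWNSTREAM_SPAN]
            if is_gu_rich(downstream):
                hits.append((hexamer.start(), ca_index))
    return hits


# A made-up sequence: hexamer, 15-nucleotide spacer, CA, then a GU-rich run.
example = "GGGCC" + "AAUAAA" + "GCGCGCGCGCGCGCG" + "CA" + "GUGUGUGUGU" + "CCCC"
print(find_candidate_sites(example))
```

A scan like this only nominates candidate sites. In the cell, recognition depends on CPSF, CstF, and the other proteins that assemble on the RNA, not on exact string matches.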

mature 3' end of an mRNA molecule
mRNA 分子的成熟 3'端

cleavage and polyadenylation. First, the recruitment of the many proteins needed for 3'-end processing (which occurs while some of them are still bound to the RNA polymerase tail) causes a conformational change in the polymerase, slowing it down and decreasing its processivity. Second, once 3'-end cleavage has occurred, the newly synthesized RNA emerging from the polymerase lacks a 5' cap; this unprotected RNA is rapidly degraded by a 5'-to-3' exonuclease, and, when it catches up to the polymerase, it causes the RNA polymerase to release its grip on the template and terminate transcription.
裂解和多聚腺苷酸化。首先,为了 3'-端加工所需的许多蛋白质的招募(这发生在其中一些蛋白质仍然与 RNA 聚合酶尾部结合时),导致聚合酶的构象变化,使其减慢并降低其过程性。其次,一旦 3'-端裂解发生,从聚合酶中合成的新 RNA 缺乏 5' 帽;这种未受保护的 RNA 会被 5'-3' 外切核酸酶迅速降解,并且当它追上聚合酶时,会导致 RNA 聚合酶释放对模板的抓取并终止转录。
In the simplest case, a gene carries a single site for RNA cleavage and polyadenylation. However, many genes have several such sites and can therefore produce a variety of mRNAs that differ in their 3' ends. As will be discussed in the next chapter, cells can regulate 3'-end processing to produce different proteins from the same gene in a manner analogous to alternative splicing.
在最简单的情况下,一个基因携带一个用于 RNA 切割和多聚腺苷酸化的位点。然而,许多基因有几个这样的位点,因此可以产生各种在其 3' 端不同的 mRNA。正如将在下一章讨论的那样,细胞可以调节 3'端加工,以类似于选择性剪接的方式从同一基因产生不同的蛋白质。

Mature Eukaryotic mRNAs Are Selectively Exported from the Nucleus
成熟的真核 mRNA 被选择性地从细胞核中输出

Eukaryotic pre-mRNA synthesis and processing take place in an orderly fashion within the cell nucleus. But of the pre-mRNA that is synthesized, only a small fraction (the mature mRNA) is of further use to the cell. Most of the rest (excised introns, broken RNAs, aberrantly processed pre-mRNAs, and accidentally transcribed portions of the genome) is not only useless but potentially dangerous. How does the cell distinguish between the relatively rare mature mRNA molecules it wishes to keep and the overwhelming amount of useless debris?
真核前 mRNA 的合成和加工在细胞核内按顺序进行。但在合成的前 mRNA 中,只有一小部分成熟 mRNA 对细胞有进一步用途。大部分剩余的部分——被切除的内含子、断裂的 RNA、异常加工的前 mRNA 和意外转录的基因组部分——不仅无用,而且可能危险。细胞如何区分相对稀少的希望保留的成熟 mRNA 分子和大量无用的碎片?
The answer is that the RNAs are distinguished by the proteins bound to them. For example, we have seen that acquisition of cap-binding complexes, exon junction complexes, and poly-A-binding proteins marks the completion of capping, splicing, and poly-A addition, respectively. A properly completed mRNA molecule is also distinguished by the proteins it lacks. For example, the long-term presence of an snRNP protein would signify incomplete or aberrant splicing. Only when the proteins present on an mRNA molecule collectively signify that processing was successfully completed is the mRNA exported from the nucleus into the cytosol, where it can be translated into protein. Improperly processed mRNAs and other RNA debris are retained in the nucleus, where they are eventually degraded by the nuclear RNA exosome, a large protein complex whose interior is rich in 3'-to-5' RNA exonucleases (Figure 6-38). Indeed, the default fate of RNAs in the nucleus is degradation; only those bearing the proper constellation of proteins are spared.
答案是 RNA 的区别在于与其结合的蛋白质。例如,我们已经看到,获得帽结合复合物、外显子连接复合物和多 A 结合蛋白标志着分别完成了帽子、剪接和多 A 添加。一个正确完成的 mRNA 分子也可以通过其缺乏的蛋白质来区分。例如,长期存在 snRNP 蛋白会表明剪接不完整或异常。只有当 mRNA 分子上存在的蛋白质共同表明处理成功完成时,mRNA 才会从细胞核转运到细胞质,在那里可以被翻译成蛋白质。处理不当的 mRNA 和其他 RNA 残留物会保留在细胞核中,最终被核 RNA 外体降解,这是一个内部富含 3'-to-5' RNA 外切酶的大型蛋白质复合物(图 6-38)。事实上,细胞核中 RNA 的默认命运是降解;只有那些携带适当蛋白质组合的 RNA 才会被保留。
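The rule described above, in which export requires every "completion" mark to be present and every "incomplete" mark to be absent, with degradation as the default, can be written as a small decision function. The Python sketch below is a deliberately simplified caricature under stated assumptions: the protein names are plain labels taken from the paragraph, the function name is hypothetical, and the all-or-nothing logic ignores the timing and kinetics that matter in a real nucleus.

```python
# Marks whose presence signals a completed processing step (from the text).
REQUIRED_MARKS = {
    "cap-binding complex",      # capping finished
    "exon junction complex",    # splicing finished
    "poly-A-binding protein",   # poly-A addition finished
}

# Marks whose lingering presence signals incomplete or aberrant processing.
FORBIDDEN_MARKS = {"snRNP protein"}


def nuclear_fate(bound_proteins: set[str]) -> str:
    """Return the fate of an RNA given the set of proteins bound to it.

    Hypothetical helper: degradation is the default, and export happens only
    when all required marks are present and no forbidden mark remains.
    """
    has_all_required = REQUIRED_MARKS <= bound_proteins
    has_forbidden = bool(FORBIDDEN_MARKS & bound_proteins)
    if has_all_required and not has_forbidden:
        return "export"
    return "retain and degrade"


mature = {"cap-binding complex", "exon junction complex", "poly-A-binding protein"}
print(nuclear_fate(mature))                        # fully processed mRNA
print(nuclear_fate(mature | {"snRNP protein"}))    # splicing not finished
print(nuclear_fate({"cap-binding complex"}))       # excised or broken RNA
```

Writing the default branch as degradation mirrors the point made in the text: an RNA is spared only if it carries the proper constellation of proteins.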
Of all the proteins that assemble on pre-mRNA molecules as they emerge from transcribing RNA polymerases, the most abundant are the hnRNPs (heterogeneous nuclear ribonucleoproteins). Some of these proteins (there are approximately 30 different ones in humans) unwind the hairpin helices in the RNA so that splicing and other signals on the RNA can be read more easily. Others
当预 mRNA 分子从转录 RNA 聚合酶中出现时,组装在其上的蛋白质中最丰富的是 hnRNPs(异质核糖核蛋白)。其中一些蛋白质(人类大约有 30 种不同的蛋白质)会展开 RNA 中的发夹螺旋,以便更容易地读取 RNA 上的剪接和其他信号。其他
Figure 6-38 Structure of the nuclear RNA exosome. (A) RNA is fed into one end, passes through the central channel, and is degraded by RNases at the other end. (B) Structure of the central channel of the human RNA exosome viewed end-on. Nine different protein subunits (each represented by a different color) make up this large ring structure. Eukaryotic cells have both a nuclear exosome and a cytoplasmic exosome; both forms include the central channel but differ in their additional subunits. The nuclear RNA exosome degrades aberrant RNAs (including excised intron sequences and incorrectly spliced RNAs) before they are exported to the cytosol. It also processes certain types of RNA (for example, the ribosomal RNAs) to produce their final form. The cytoplasmic form of the RNA exosome is responsible for degrading mRNAs in the cytosol and is thus crucial in determining the lifetime of each mRNA molecule. (A, adapted from C. Kilchert et al., Nat. Rev. Mol. Cell Biol. 17:227, 2016. With permission from Springer Nature; B, PDB code: 2NN6.)
图 6-38 核 RNA 外体的结构。 (A) RNA 被送入一端,通过中央通道,然后被另一端的 RNases 降解。 (B) 人类 RNA 外体中央通道的结构,端面视图。九种不同的蛋白亚基(每种用不同颜色表示)组成这个大环结构。真核细胞既有核外体又有胞质外体;两种形式都包括中央通道,但在附加亚基方面有所不同。核 RNA 外体在将异常 RNA(包括剪切的内含子序列和错误剪接的 RNA)导出到细胞质之前降解它们。它还处理某些类型的 RNA(例如核糖体 RNA)以产生它们的最终形式。RNA 外体的胞质形式负责在细胞质中降解 mRNA,因此在确定每个 mRNA 分子的寿命方面至关重要。 (A,改编自 C. Kilchert 等人,Nat. Rev. Mol. Cell Biol. 17:227,2016。获得 Springer Nature 许可;B,PDB 代码:2NN6。)
(A)
(B)
Figure 6-39 Transport of a large mRNA molecule through the nuclear pore complex. (A) The maturation of an mRNA molecule as it is synthesized by RNA polymerase and packaged by a variety of nuclear proteins. This drawing of an unusually large and abundant insect RNA, called the Balbiani Ring mRNA, is based on electron microscope micrographs such as that shown in (B). (A, adapted from B. Daneholt, Cell 88:585-588, 1997; B, © 1966 B.J. Stevens and H. Swift. Originally published in J. Cell Biol. https://doi.org/10.1083/jcb.31.1.55. With permission from Rockefeller University Press.)
图 6-39 通过核孔复合体运输大型 mRNA 分子。(A) mRNA 分子的成熟过程,它由 RNA 聚合酶合成并由各种核蛋白包装。这幅描绘了一种异常大且丰富的昆虫 RNA,称为 Balbiani 环 mRNA,基于电子显微镜照片,如(B)所示。(A,改编自 B. Daneholt,Cell 88:585-588,1997; B,© 1966 B.J. Stevens 和 H. Swift。最初发表在 J. Cell Biol. https://doi.org/10.1083/jcb.31.1.55。获得 Rockefeller University Press 许可。)
preferentially package the RNA contained in the very long intron sequences found in complex organisms (see Figure 6-31); these may also help to distinguish the debris left over from RNA processing from fully mature mRNAs.
优先包装复杂生物体中发现的非常长的内含子序列中包含的 RNA(见图 6-31);这些也可能有助于区分 RNA 加工残留物与完全成熟的 mRNA。
Successfully processed mRNAs are guided through the nuclear pore complexes (NPCs), aqueous channels in the nuclear membrane that directly connect the nucleoplasm and cytosol (Figure 6-39). Small molecules (less than 40,000 daltons or about in diameter) can diffuse freely through these channels. However, most of the macromolecules in cells, including mRNAs complexed with proteins, are far too large to pass through the channels without a special process. The cell uses energy to actively transport such macromolecules in both directions through the nuclear pore complexes.
成功处理的 mRNA 通过核孔复合体(NPCs)-核膜中的水道引导,直接连接核浆和细胞质(图 6-39)。小分子(分子量小于 40,000 道尔顿或约 直径)可以自由扩散通过这些通道。然而,细胞中的大多数大分子,包括与蛋白质结合的 mRNA,都太大而无法通过这些通道而无需特殊处理。细胞利用能量通过核孔复合体在两个方向上主动运输这些大分子。
As explained in detail in Chapter 12, macromolecules are moved through nuclear pore complexes by nuclear transport receptors, which, depending on the identity of the macromolecule, escort it from the nucleus to the cytoplasm or vice versa. For mRNA export to occur, a specific nuclear transport receptor must be loaded onto the mRNA, a step that, in many organisms, takes place in concert with cleavage and polyadenylation.
正如第 12 章中详细解释的那样,大分子通过核孔复合物由核转运受体移动,这些受体根据大分子的身份,将其从细胞核护送到细胞质或反之亦然。为了进行 mRNA 的输出,必须将特定的核转运受体加载到 mRNA 上,这一步在许多生物体中与 剪切和多聚腺苷酸化同时进行。
The export of mRNA-protein complexes from the nucleus can be readily observed with the electron microscope for the unusually abundant mRNA of the insect Balbiani Ring genes. As these genes are transcribed, the newly formed pre-mRNA is seen to be packaged by proteins, including hnRNPs, SR proteins, and components of the spliceosome. This protein-pre-mRNA complex undergoes a series of structural transitions, probably reflecting RNA-processing events, culminating in a curved fiber. This curved fiber diffuses through the nucleoplasm, enters the nuclear pore complex (with its 5' cap proceeding first), and then undergoes additional structural transitions as it moves through the pore (see Figure 6-39). Such observations reveal that the pre-mRNA-protein and mRNA-protein complexes are dynamic structures that gain and lose specific proteins during RNA synthesis, processing, and export (Figure 6-40).
mRNA-蛋白复合物从细胞核中的输出可以通过电子显微镜轻松观察到,特别是对昆虫巴尔比亚环基因的异常丰富 mRNA。随着这些基因的转录,新形成的 pre-mRNA 被包装在蛋白质中,包括 hnRNPs、SR 蛋白和剪接体的组分。这种蛋白质-pre-mRNA 复合物经历一系列结构转变,可能反映了 RNA 处理事件,最终形成一个弯曲的纤维。这种弯曲的纤维扩散穿过核浆,进入核孔复合物(其 5' 帽首先进入),然后在通过孔道时经历额外的结构转变(见图 6-39)。这些观察揭示了 pre-mRNA-蛋白质和 mRNA-蛋白质复合物是动态结构,在 RNA 合成、处理和输出过程中获得和失去特定蛋白质(图 6-40)。
The journey of an individual mRNA molecule from the nucleus to the cytosol can also be tracked by fluorescently labeling it and observing it over time. A typical mRNA molecule that is released from its site of transcription spends several minutes randomly diffusing in the nucleus until it encounters a nuclear pore complex. During this time, RNA-processing events presumably continue, with the mRNA shedding previously bound proteins and acquiring new ones. Once it arrives at the entrance to the pore, the "export-ready" mRNA molecule hovers for several seconds, during which time the completion of RNA processing likely occurs, and it then is transported through the pore very rapidly. Some mRNA-protein complexes are
一个个体 mRNA 分子从细胞核到细胞质的旅程也可以通过荧光标记它并随时间观察来跟踪。从转录位点释放的典型 mRNA 分子在细胞核中随机扩散几分钟,直到遇到核孔复合体。在此期间,RNA 处理事件可能继续进行,mRNA 脱落先前结合的蛋白质并获得新的蛋白质。一旦到达孔的入口,"出口就绪"的 mRNA 分子会在几秒钟内悬停,此时可能完成 RNA 处理,然后会非常迅速地通过孔被运输。一些 mRNA-蛋白质复合物
Figure 6-40 Schematic illustration of an export-ready mRNA molecule and its transport through the nuclear pore. As indicated, some proteins travel with the mRNA as it moves through the pore, whereas others remain in the nucleus. Some of the nuclear proteins that are lost are eventually replaced by cytosolic versions, such as those that bind the cap. In some species (humans, for example), there are different poly-A-binding proteins in the nucleus and the cytosol; other species have only a single poly-A-binding protein. The nuclear export receptor for mRNAs is a complex of proteins that binds to an mRNA molecule once it has been correctly spliced and polyadenylated. After the mRNA has been exported to the cytosol, this export receptor dissociates from the mRNA and is re-imported into the nucleus, where it can be used again.
图 6-40 展示了一种准备出口的 mRNA 分子及其通过核孔的运输的示意图。如图所示,一些蛋白质随着 mRNA 通过核孔移动,而另一些留在细胞核中。一些丢失的核蛋白质最终会被细胞质版本替代,比如结合 帽的蛋白质。在某些物种(例如人类),核内和细胞质中有不同的多 A 结合蛋白;其他物种只有一个多 A 结合蛋白。mRNA 的核外运输受体是一组蛋白质的复合物,它们在 mRNA 被正确剪接和多聚腺苷化后结合到 mRNA 分子上。当 mRNA 被输出到细胞质后,这个输出受体会与 mRNA 解离,并重新被导入细胞核,以便再次使用。
very large, and how they are moved through the nuclear pore complexes so rapidly (in about 10 milliseconds) remains a mystery.
非常庞大,它们是如何如此迅速地通过核孔复合体移动的(大约在 10 毫秒内)仍然是一个谜。
Some of the proteins deposited on the mRNA while it is still in the nucleus can affect the fate of the mRNA after it is transported to the cytosol. Thus, the stability of an mRNA in the cytosol, the efficiency with which it is translated into protein, and its ultimate destination in the cell can all be determined by proteins acquired in the nucleus that remain bound to the mRNA in the cytosol.
一些蛋白质在 mRNA 仍在细胞核内时沉积,可能会影响 mRNA 在被运送到细胞质后的命运。因此,细胞质中 mRNA 的稳定性、被翻译成蛋白质的效率以及其在细胞中的最终去向,都可能由在细胞核中获得并仍与细胞质中的 mRNA 结合的蛋白质所决定。
Before discussing what next happens to the exported mRNAs, we briefly consider how the synthesis and processing of noncoding RNA molecules occur. There are many types of noncoding RNAs produced by cells (see Table 6-1, p. 327), but here we focus on the rRNAs, which are critically important for the translation of mRNAs into protein.
在讨论出口的 mRNA 接下来会发生什么之前,我们简要考虑一下非编码 RNA 分子的合成和加工是如何发生的。细胞产生许多类型的非编码 RNA(见表 6-1,第 327 页),但在这里我们重点关注 rRNA,它对将 mRNA 翻译成蛋白质至关重要。

Noncoding RNAs Are Also Synthesized and Processed in the Nucleus
非编码 RNA 也在细胞核中合成和加工

Of all the RNAs in a typical cell, only a few percent are mRNA. The bulk of RNA performs structural and catalytic functions (see Table 6-1). The most abundant RNAs in cells are the ribosomal RNAs (rRNAs), constituting approximately 80% of the RNA in rapidly dividing cells. As discussed later in this chapter, these RNAs form the core of the ribosome. Unlike bacteria, in which a single RNA polymerase synthesizes all RNAs in the cell, eukaryotes have a separate, specialized polymerase, RNA polymerase I, that is dedicated to producing rRNAs. RNA polymerase I is similar structurally to the RNA polymerase II discussed previously; however, the absence of a C-terminal tail in polymerase I helps to explain why its transcripts are neither capped nor polyadenylated.
在典型细胞中的所有 RNA 中,只有少数百分比是 mRNA。RNA 的大部分执行结构和催化功能(见表 6-1)。细胞中最丰富的 RNA 是核糖体 RNA(rRNA),在快速分裂的细胞中约占 RNA 的 80%。正如本章后面讨论的那样,这些 RNA 形成核糖体的核心。与细菌不同(在细菌中,单个 RNA 聚合酶合成细胞中的所有 RNA),真核生物有一个单独的、专门的聚合酶,RNA 聚合酶 I,专门用于产生 rRNA。RNA 聚合酶 I 在结构上与之前讨论的 RNA 聚合酶 II 类似;然而,聚合酶 I 中缺少 C-末端尾巴有助于解释为什么它的转录本既不加帽也不多聚腺苷酸化。
Because multiple rounds of translation of each mRNA molecule can provide an enormous amplification in the production of protein molecules, many of the proteins that are very abundant in a cell can be synthesized from genes that are present in a single copy per haploid genome (see Figure 6-3). In contrast, the RNA components of the ribosome are final gene products, and a growing mammalian cell must synthesize approximately 10 million copies of each type of ribosomal RNA in each cell generation to construct its 10 million ribosomes. The cell can produce adequate quantities of ribosomal RNAs only because it contains multiple copies of the rRNA genes that code for ribosomal RNAs (rRNAs). Even E. coli needs seven copies of its rRNA genes to meet the cell's need for ribosomes. Human cells contain about 200 rRNA gene copies per haploid genome, spread
由于每个 mRNA 分子的多轮翻译可以在蛋白质分子的生产中提供巨大的放大,因此许多在细胞中非常丰富的蛋白质可以从每个单倍体基因组中存在的基因合成(见图 6-3)。相比之下,核糖体的 RNA 组分是最终的基因产物,一个不断增长的哺乳动物细胞必须在每个细胞世代中合成大约 1000 万份每种类型的核糖体 RNA,以构建其 1000 万个核糖体。细胞之所以能够产生足够数量的核糖体 RNA,是因为它包含编码核糖体 RNA(rRNA)的 rRNA 基因的多个拷贝。即使是大肠杆菌也需要七个拷贝其 rRNA 基因来满足细胞对核糖体的需求。人类细胞每个单倍体基因组含有约 200 个 rRNA 基因拷贝,分布在整个基因组中。
out in small clusters on five different chromosomes (see Figure 4-12), while cells of the frog Xenopus contain about 600 rRNA gene copies per haploid genome in a single cluster on one chromosome (Figure 6-41).
分散在五条不同的染色体上(见图 4-12),而青蛙 Xenopus 的细胞中每个单倍体基因组含有约 600 个 rRNA 基因拷贝,集中在一条染色体上(图 6-41)。
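The argument for multiple rRNA gene copies is at heart a piece of arithmetic: the rRNA transcript is the final product, so every copy must be transcribed directly. The Python sketch below works through the numbers from the text (about 10 million ribosomes per cell generation and about 200 rRNA gene copies per haploid human genome). The 24-hour generation time, the assumption that every gene copy is active, and the function name are illustrative assumptions, not values from the text.

```python
RIBOSOMES_PER_GENERATION = 10_000_000   # from the text (growing mammalian cell)
GENE_COPIES_PER_HAPLOID = 200           # from the text (human)
PLOIDY = 2                              # diploid cell

# Assumption for illustration only: a 24-hour cell generation.
GENERATION_MINUTES = 24 * 60


def transcripts_per_gene_per_minute(gene_copies: int) -> float:
    """Average initiation rate each gene copy must sustain (hypothetical helper).

    Assumes every copy is active and each transcript yields one ribosome's worth
    of the large precursor rRNA.
    """
    return RIBOSOMES_PER_GENERATION / gene_copies / GENERATION_MINUTES


with_many_copies = transcripts_per_gene_per_minute(GENE_COPIES_PER_HAPLOID * PLOIDY)
with_single_copy = transcripts_per_gene_per_minute(1 * PLOIDY)

print(f"400 copies: {with_many_copies:.1f} transcripts per gene per minute")
print(f"  2 copies: {with_single_copy:.0f} transcripts per gene per minute")
```

Under these assumptions each of the 400 copies needs to complete roughly 17 transcripts per minute, whereas a gene present in a single copy per haploid genome would need several thousand per minute. A protein-coding gene escapes this constraint because each mRNA is translated many times.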
There are four types of eukaryotic rRNAs, each present in one copy per ribosome. Three of the four rRNAs (18S, 5.8S, and 28S) are made by chemically modifying and cleaving a single large precursor rRNA (Figure 6-42); the fourth (5S RNA) is synthesized from a separate cluster of genes by a different polymerase, RNA polymerase III, and it does not require chemical modification.
有四种类型的真核 rRNA,每种在每个核糖体中各有一份拷贝。这四种 rRNA 中的三种(18S、5.8S 和 28S)是通过化学修饰和切割单个大前体 rRNA(图 6-42)制成的;第四种(5S RNA)则是由不同的聚合酶 RNA 聚合酶 III 从一个独立的基因簇合成,不需要化学修饰。
Extensive chemical modifications occur in the 13,000-nucleotide-long precursor rRNA before the mature 18S, 5.8S, and 28S rRNAs are cleaved out of it. These include about 100 methylations of the 2'-OH positions on nucleotide sugars and 100 isomerizations of uridine nucleotides to pseudouridine (Figure 6-43A). The functions of these modifications are not understood in detail, but they probably aid in ribosome assembly, and they may also subtly affect the operation of completed ribosomes. Each modification is made at a specific position in the precursor rRNA, specified by "guide RNAs," which position themselves on the precursor rRNA through base-pairing and thereby bring an RNA-modifying enzyme to the appropriate position (Figure 6-43B). Other guide RNAs promote cleavage of the precursor rRNAs into the mature rRNAs, probably by causing conformational changes in the precursor rRNA that expose these sites to nucleases. All of these guide RNAs are members of a large class of RNAs called small nucleolar RNAs (snoRNAs), so named because these RNAs perform their functions in a subcompartment of the nucleus called the nucleolus. Many snoRNAs are encoded in the introns of other genes, especially those encoding ribosomal proteins. They are synthesized by RNA polymerase II and processed from excised intron sequences.
在成熟的 18S、5.8S 和 28S rRNA 从中切割出来之前,长达 13,000 核苷酸的前体 rRNA 会发生广泛的化学修饰。这些修饰包括约 100 个核糖核苷酸的 2'-OH 位置的甲基化和 100 个尿苷核苷酸异构化为伪尿苷(图 6-43A)。这些修饰的功能尚不详细了解,但它们可能有助于核糖体的组装,并且可能也会微妙地影响成熟核糖体的操作。每种修饰都是在前体 rRNA 的特定位置进行的,由“引导 RNA”指定,这些 RNA 通过碱基配对定位在前体 rRNA 上,从而将 RNA 修饰酶带到适当的位置(图 6-43B)。其他引导 RNA 促进前体 rRNA 切割为成熟 rRNA,可能是通过导致前体 rRNA 中构象变化,使这些位点暴露给核酸酶。所有这些引导 RNA 都是一类称为小核仁 RNA(snoRNAs)的 RNA 成员,因为这些 RNA 在称为核仁的细胞核亚区中执行其功能。许多 snoRNA 编码在其他基因的内含子中,特别是编码核糖体蛋白的基因。它们由 RNA 聚合酶 II 合成,并从切除的内含子序列中加工处理。
Figure 6-41 Transcription from tandemly arranged rRNA genes, as seen in the electron microscope. The pattern of alternating transcribed gene and nontranscribed spacer is readily seen. A higher-magnification view of rRNA genes is shown in Figure 6-10. (From V.E. Foe, Cold Spring Harb. Symp. Quant. Biol. 42:723-740, 1978. With permission from Cold Spring Harbor Laboratory Press.)
图 6-41 串联排列的 rRNA 基因的转录,如在电子显微镜中所见。轮替转录基因和非转录间隔的模式很容易看到。rRNA 基因的高倍率视图显示在图 6-10 中。(摘自 V.E. Foe, Cold Spring Harb. Symp. Quant. Biol. 42:723-740, 1978. 获得 Cold Spring Harbor Laboratory Press 许可。)
Figure 6-42 The chemical modification and nucleolytic processing of a eukaryotic precursor rRNA molecule into three separate ribosomal RNAs. Two types of chemical modifications (see Figure 6-43) are made to the precursor rRNA before it is cleaved. Nearly half of the nucleotide sequences in this precursor rRNA are discarded and degraded in the nucleus by the RNA exosome. The processing of the ribosomal RNAs begins while they are still being transcribed; the nascent transcripts also begin to be assembled with ribosomal proteins (see Figure 6-10). The rRNAs are named according to their "S" values, which refer to their rate of sedimentation in an ultracentrifuge. The larger the S value, the larger the rRNA.
图 6-42 一个真核前体 rRNA 分子的化学修饰和核酸酶加工成三个独立的核糖体 RNA。在切割之前,前体 rRNA 会进行两种类型的化学修饰(见图 6-43)。这个前体 rRNA 中近一半的核苷酸序列会被 RNA 外体在细胞核中丢弃和降解。核糖体 RNA 的加工在它们仍在转录时就开始了;新生转录本也开始与核糖体蛋白组装(见图 6-10)。rRNA 根据它们在超速离心机中的沉降速率的“S”值来命名。S 值越大,rRNA 越大。

The Nucleolus Is a Ribosome-producing Factory
核仁是一个产生核糖体的工厂

The nucleolus is the most obvious structure seen in the nucleus of a eukaryotic cell when viewed in the light microscope. It was so closely scrutinized by early cytologists that an 1898 review could list some 700 references. We now know that the nucleolus is the site for the synthesis and processing of rRNAs and the assembly of ribosomes. Unlike many of the major organelles in the cell, the nucleolus is not bound by a membrane (Figure 6-44); instead, it is a huge biomolecular condensate of macromolecules, including the rRNA genes themselves, precursor
核仁是在光学显微镜下观察到的真核细胞核中最明显的结构。早期细胞学家对其进行了如此密切的审查,以至于 1898 年的一篇评论中列出了大约 700 个参考文献。我们现在知道,核仁是 rRNA 的合成和加工以及核糖体的组装的地点。与细胞中许多主要细胞器不同,核仁没有被膜包围(图 6-44);相反,它是由大量生物大分子组成的巨大生物分子凝聚体,包括 rRNA 基因本身、前体。

Figure 6-43 Modifications of the precursor rRNA by guide RNAs. (A) Two prominent covalent modifications made to rRNA; the differences from the initially incorporated nucleotide are indicated by red atoms. Pseudouridine is an isomer of uridine; the base has been "rotated" and is attached to the red rather than to the red of the sugar (compare to Figure 6-5B). (B) As indicated, snoRNAs determine the sites of modification by base-pairing to complementary sequences on the precursor rRNA. The snoRNAs are bound to proteins, and the complexes are called snoRNPs (small nucleolar ribonucleoproteins). The snoRNPs contain both the guide sequences and the enzymes that modify the rRNA.
图 6-43 通过指导 RNA 对前体 rRNA 的修饰。(A)对 rRNA 进行的两种显著共价修饰;与最初结合的核苷酸的差异由红色原子表示。伪尿嘧啶是尿嘧啶的异构体;碱基已经“旋转”,并连接到糖的红色 而不是红色 (与图 6-5B 进行比较)。 (B)如所示,snoRNA 通过与前体 rRNA 上的互补序列配对来确定修饰位点。snoRNA 结合到蛋白质,并形成 snoRNPs(小核仁核糖核蛋白)。snoRNPs 同时包含指导序列和修饰 rRNA 的酶。
Figure 6-44 Electron micrograph of a thin section of a nucleolus in a human fibroblast, showing its three distinct zones. (A) View of entire nucleus. (B) Higher-power view of the nucleolus. It is believed that processing of the rRNAs and their assembly into the two subunits of the ribosome proceeds outward from the dense fibrillar component to the surrounding granular components (see Figure 6-46). (Courtesy of E.G. Jordan and J. McGovern.)
图 6-44 人类成纤维细胞核仁薄切片的电子显微镜图,显示其三个明显区域。(A) 整个细胞核的视图。(B) 核仁的高倍视图。据信,rRNA 的加工及其组装成核糖体的两个亚基是从致密纤维组分向周围颗粒组分进行的(见图 6-46)。(由 E.G. Jordan 和 J. McGovern 提供。)
Figure 6-45 Nucleoli exhibit fluidlike behavior when observed in vitro. A time course showing the fate of three nucleoli from frog oocytes that have begun to fuse with one another. The lower fusion joint eventually breaks while the other enlarges, completing the fusion event. This experiment was carried out in vitro under mineral oil, and the nucleoli were observed using differential-interference-contrast microscopy (see Chapter 9). (Courtesy of C.P. Brangwynne et al., Proc. Natl. Acad. Sci. USA 108:4334-4339, 2011).
图 6-45 在体外观察时,核仁表现出类似流体的行为。时间过程显示了三个从蛙卵母细胞开始相互融合的核仁的命运。较低的融合连接最终会断裂,而另一个会变大,完成融合事件。这个实验是在矿物油下体外进行的,并且使用差分干涉对比显微镜观察了核仁(见第 9 章)。(由 C.P. Brangwynne 等人提供,Proc. Natl. Acad. Sci. USA 108:4334-4339,2011 年)。
rRNAs, mature rRNAs, rRNA-processing enzymes, snoRNPs, a large set of assembly factors (including ATPases, GTPases, protein kinases, and RNA helicases), ribosomal proteins, and partly assembled ribosomes. We discuss the formation of membraneless organelles in Chapter 12; here, we note that their assembly is likely driven by the type of phase transitions discussed in Chapter 3 (see Figure 3-77). The close, but loose, association of all these components, which allows the assembly of ribosomes to occur rapidly and smoothly, endows the nucleolus with liquid-like properties (Figure 6-45).
rRNAs,成熟的 rRNAs,rRNA 加工酶,snoRNPs,大量的组装因子(包括 ATP 酶,GTP 酶,蛋白激酶和 RNA 解旋酶),核糖体蛋白质和部分组装的核糖体。我们在第 12 章讨论了无膜细胞器的形成;在这里,我们指出它们的组装可能是由第 3 章讨论的相变类型驱动的(见图 3-77)。所有这些组分之间的密切但松散的关联,使得核糖体的组装能够迅速而顺利地进行,赋予核仁液态特性(图 6-45)。
The rRNA genes themselves have an important role in forming the nucleolus (Figure 6-46). In a diploid human cell, the rRNA genes are distributed into 10 clusters, located near the tips of five different chromosome pairs (see Figure 4-12). During interphase, these 10 chromosomes contribute DNA loops (containing the rRNA genes) to the nucleolus; in M phase, when the chromosomes condense, the nucleolus fragments and then disappears. Then, in the telophase part of mitosis, as chromosomes return to their semi-dispersed state, the tips of the 10 chromosomes re-form small nucleoli, which progressively coalesce into a single nucleolus (Figure 6-47 and Figure 6-48). As might be expected, the size of the nucleolus reflects the number of ribosomes that the cell is producing. Its size therefore varies greatly in different cells and can change in a single cell, occupying nearly 25% of the total nuclear volume in cells that are making unusually large amounts of protein.
rRNA 基因本身在形成核仁(图 6-46)中起着重要作用。在一个二倍体人类细胞中,rRNA 基因分布在 10 个簇中,位于五对不同染色体末端附近(见图 4-12)。在有丝分裂间期,这 10 条染色体向核仁贡献 DNA 环(含有 rRNA 基因);在 M 期,染色体凝缩时,核仁会破碎然后消失。接着,在有丝分裂的末期,随着染色体回到半分散状态,这 10 条染色体的末端重新形成小核仁,逐渐融合成一个单一的核仁(图 6-47 和图 6-48)。正如预期的那样,核仁的大小反映了细胞正在产生的核糖体数量。因此,核仁的大小在不同细胞中差异很大,并且在单个细胞中可以发生变化,占据了制造异常大量蛋白质的细胞总核体积的近 25%。
Ribosome assembly is a complex process, requiring, in addition to the proteins and RNA molecules that compose the finished ribosome, more than 200 proteins that aid the assembly of the finished ribosome. These include chaperones (discussed later in this chapter), ATP-dependent RNA helicases, nucleases, and a wide variety of RNA-binding proteins. In addition, assembly requires a number of small RNA molecules, such as the two snoRNAs of Figure 6-43. In many respects, building a ribosome resembles the process by which a spliceosome is formed, as we discussed earlier in the chapter. In particular, many ATP-driven RNA structural rearrangements occur as assembly proceeds. However, a key difference is that a new spliceosome must be constructed and disassembled for each splicing event,
核糖体的组装是一个复杂的过程,除了构成成品核糖体的蛋白质和 RNA 分子外,还需要 200 多种蛋白质来辅助完成核糖体的组装。这些蛋白质包括分子伴侣(本章后面将讨论)、依赖 ATP 的 RNA 解旋酶、核酸酶和各种各样的 RNA 结合蛋白质。此外,组装还需要一些小的 RNA 分子,例如图 6-43 中的两个 snoRNA。在许多方面,组装核糖体类似于前面在本章中讨论过的剪接体形成的过程。特别是,随着组装的进行,许多依赖 ATP 的 RNA 结构重排事件发生。然而,一个关键的区别是,每次剪接事件都必须构建和拆卸一个新的剪接体。

Figure 6-46 Schematic diagram of nucleolus formation after mitosis. According to this model, the nucleolus is formed from three distinct condensates, each with a different set of components. This arrangement is proposed to promote the orderly assembly of RNA-protein complexes, much like that observed on an assembly line. (Adapted from A.R. Strom and C.P. Brangwynne, J. Cell Sci. 132:jcs235093, 2019.)
图 6-46 有丝分裂后核仁形成的示意图。根据这个模型,核仁由三个不同的凝聚体形成,每个凝聚体具有不同的组分。据推测,这种排列有助于促进 RNA-蛋白复合物的有序组装,就像在装配线上观察到的那样。(改编自 A.R. Strom 和 C.P. Brangwynne,J. Cell Sci. 132:jcs235093,2019。)

while a ribosome, once formed, is stable and is used repeatedly to translate many mRNAs into protein. In human cells, it is estimated that each ribosome makes, on average, 3000 individual proteins in its lifetime. Ribosome assembly is understood in great detail, and only a few of the key features are summarized in Figure 6-49.
一旦形成,核糖体就是稳定的,并且被反复使用来将许多 mRNA 翻译成蛋白质。在人类细胞中,据估计每个核糖体在其寿命中平均制造 3000 种个体蛋白质。核糖体的组装被详细理解,图 6-49 总结了其中的一些关键特征。
In addition to its central role in ribosome biogenesis, the nucleolus is the site where other noncoding RNAs are produced and other RNA-protein complexes are assembled. For example, the U6 snRNP, a key component in pre-mRNA splicing (see Figure 6-29), is composed of one RNA molecule and seven proteins. The U6 snRNA is chemically modified by snoRNAs and assembled with its proteins in the nucleolus. Other important RNA-protein complexes, including telomerase (encountered in Chapter 5) and the signal-recognition particle (which we discuss in Chapter 12), are also assembled in the nucleolus. Finally, the tRNAs (transfer RNAs) that carry the amino acids for protein synthesis are processed there as well; like the rRNA genes, the genes encoding tRNAs are clustered in the nucleolus. Thus, the nucleolus can be thought of as a large factory at which different noncoding RNAs are transcribed, processed, and assembled with proteins to form a large variety of ribonucleoprotein complexes.
除了在核糖体生物合成中起着中心作用外,核仁还是产生其他非编码 RNA 和组装其他 RNA-蛋白复合物的地点。例如,U6 snRNP,在前 mRNA 剪接中起关键作用的一个组分(见图 6-29),由一个 RNA 分子和七个蛋白质组成。U6 snRNA 被 snoRNAs 进行化学修饰,并与其蛋白质在核仁中组装。其他重要的 RNA-蛋白复合物,包括端粒酶(在第 5 章中遇到)和信号识别粒子(我们在第 12 章中讨论),也在核仁中组装。最后,携带蛋白质合成所需氨基酸的 tRNA(转运 RNA)也在那里被加工;像 rRNA 基因一样,编码 tRNA 的基因也聚集在核仁中。因此,核仁可以被看作是一个大型工厂,用于转录、加工不同的非编码 RNA,并与蛋白质组装成各种大型核糖核蛋白复合物。

The Nucleus Contains a Variety of Subnuclear Biomolecular Condensates
细胞核包含多种亚核生物分子凝聚体

Although the nucleolus is the most prominent structure in the nucleus, several other membraneless compartments have been observed and studied (Figure 6-50). These include Cajal bodies (named for the scientist who first described them in 1906) and interchromatin granule clusters (also called "speckles"). Like the nucleolus, these other compartments are highly dynamic depending on the needs of the cell, and their assembly is likely the result of the association of protein and RNA components involved in the synthesis, assembly, and storage of macromolecules involved in gene expression. Cajal bodies are sites where the snRNPs and snoRNPs undergo their final maturation steps, and where the snRNPs are recycled and their RNAs are "reset" after the rearrangements that occur during splicing (see pp. 344-345). In contrast, the interchromatin granule clusters are stockpiles of fully mature snRNPs and other RNA-processing components that are ready to be used in the production of mRNA.
尽管核仁是细胞核中最显著的结构,但已观察和研究了几个其他无膜区室(图 6-50)。这些包括卡亚尔小体(以 1906 年首次描述它们的科学家命名)和染色质间颗粒团(也称为“斑点”)。与核仁一样,这些其他区室在细胞需要时非常动态,它们的组装可能是蛋白质和 RNA 组分相互关联的结果,这些组分参与了基因表达中的合成、组装和储存大分子。卡亚尔小体是 snRNPs 和 snoRNPs 进行最终成熟步骤的地点,也是 snRNPs 在剪接过程中发生的重排后进行回收并将其 RNA“重置”的地方(见 344-345 页)。相比之下,染色质间颗粒团是完全成熟的 snRNPs 和其他 RNA 处理组分的储备库,准备用于 mRNA 的生产。
Scientists have had difficulties in working out the exact function of these small compartments, in part because their appearances can change dramatically as cells traverse the cell cycle or respond to changes in their environment. Moreover, disrupting a particular type of nuclear body often has little effect on cell viability.
科学家们在确定这些小区室的确切功能方面遇到了困难,部分原因是因为随着细胞穿越细胞周期或对环境变化做出反应,它们的外观可能会发生巨大变化。此外,破坏特定类型的细胞核小体通常对细胞的存活率影响不大。

Figure 6-48 Nucleolar fusion in vivo. These light micrographs of human fibroblasts grown in culture show various stages of nucleolar fusion. After mitosis, each of the 10 human chromosomes that carry a cluster of rRNA genes begins to form a tiny nucleolus, but these rapidly coalesce as they grow to form the single large nucleolus typical of many interphase cells. (Courtesy of E.G. Jordan and J. McGovern.)
图 6-48 体内核仁融合。这些在培养中生长的人类成纤维细胞的光镜照片展示了核仁融合的各个阶段。有丝分裂后,携带一簇 rRNA 基因的 10 条人类染色体中的每一条开始形成一个微小的核仁,但随着它们的增长,这些核仁迅速合并形成许多间期细胞典型的单个大核仁。(由 E.G. Jordan 和 J. McGovern 提供。)
Figure 6-47 Changes in the appearance of the nucleolus in a human cell during the cell cycle. Only the cell nucleus is represented in this diagram. In most eukaryotic cells, the nuclear envelope breaks down during mitosis, as indicated by the dashed circles.
图 6-47 人类细胞中核仁在细胞周期中外观的变化。此图仅代表细胞核。在大多数真核细胞中,细胞核膜在有丝分裂期间会破裂,如虚线圆所示。
It seems that the main function of these aggregates is to bring components together at high concentration in order to speed up their assembly. For example, it is estimated that assembly of the U4/U6 snRNP (see Figure 6-29) occurs 10 times more rapidly in Cajal bodies than would be the case if the same number of components were dispersed throughout the nucleus. Consequently, Cajal bodies appear dispensable in many types of cells but are absolutely required in situations where cells must proliferate rapidly, such as in early vertebrate development. Here, protein synthesis (which depends on RNA splicing) must occur especially rapidly, and delays can be lethal.
这些聚集体的主要功能似乎是将组分聚集在高浓度下,以加快它们的组装速度。例如,据估计,U4/U6 snRNP 的组装(见图 6-29)在 Cajal 小体中比如果相同数量的组分分散在细胞核中要快 10 倍。因此,在许多类型的细胞中,Cajal 小体似乎是可有可无的,但在细胞必须快速增殖的情况下,如在早期脊椎动物的发育中,它们是绝对必需的。在这种情况下,蛋白质合成(依赖于 RNA 剪接)必须特别迅速进行,延迟可能是致命的。
Given the prominence of nuclear compartments in RNA processing, it might be expected that pre-mRNA splicing would occur in a particular location in the
鉴于核小体在 RNA 加工中的重要性,人们可能会预期前 mRNA 的剪接会发生在特定的位置
Figure 6-50 Visualization of some prominent membraneless compartments in the nucleus. The protein fibrillarin (red), a component of several snoRNPs, is present in both nucleoli and Cajal bodies; the latter are indicated by the arrows. The Cajal bodies (but not the nucleoli) are also highlighted by staining one of their main components, the protein coilin; the superposition of the snoRNP and coilin stains appears pink. Interchromatin granule clusters (green) have been revealed by using antibodies against a protein involved in pre-mRNA splicing. DNA is stained blue by the dye DAPI. (From J.R. Swedlow and A.I. Lamond, Genome Biol. 2:1-7, 2001. Micrograph courtesy of Judith Sleeman.)
图 6-50 显示核内一些突出的无膜细胞器的可视化。蛋白质纤维蛋白(红色)是几种 snoRNP 的组成部分,在核仁和 Cajal 小体中都存在;后者由箭头指示。Cajal 小体(而不是核仁)还通过染色其主要组分之一蛋白质 coilin 来突出显示;snoRNP 和 coilin 染色的叠加呈粉红色。通过使用针对参与 pre-mRNA 剪接的蛋白质的抗体,揭示了亚染色质颗粒团(绿色)。DNA 通过 DAPI 染料染成蓝色。(来源:J.R. Swedlow 和 A.I. Lamond,Genome Biol. 2:1-7,2001。感谢 Judith Sleeman 提供的显微图。)

Figure 6-49 The function of the nucleolus in ribosome and other ribonucleoprotein synthesis. The precursor rRNA is packaged in a large ribonucleoprotein particle containing many ribosomal proteins imported from the cytoplasm. While this particle remains at the nucleolus, selected components are added and others discarded as it is processed into immature large and small ribosomal subunits. The two ribosomal subunits attain their final functional form only after each is individually transported through the nuclear pores into the cytoplasm. Other ribonucleoprotein complexes, including telomerase shown here, are also assembled in the nucleolus. (Adapted from A.R. Strom and C.P. Brangwynne, J. Cell Sci. 132:jcs235093, 2019. With permission from the Company of Biologists.)
图 6-49 核仁在核糖体和其他核糖核蛋白合成中的功能。 前体 rRNA 被包装在一个大的核糖核蛋白粒子中,其中包含许多从细胞质进口的核糖体蛋白。当这个粒子仍然停留在核仁时,被选择的组分被添加,其他被丢弃,因为它被加工成不成熟的大和小核糖体亚基。这两个核糖体亚基只有在分别通过核孔进入细胞质后才达到最终的功能形式。其他核糖核蛋白复合物,包括在这里显示的端粒酶,也在核仁中组装。(改编自 A.R. Strom 和 C.P. Brangwynne,J. Cell Sci. 132:jcs235093,2019。获得生物学家公司的许可。)
Figure 6-51 A model for an mRNA production factory. mRNA production is made more efficient in the nucleus by an aggregation of the many components needed for transcription and pre-mRNA processing, thereby producing a specialized biochemical factory. In (A), various components in the proximity of a transcribing RNA polymerase are carried on the tail (see Figure 6-23). In (B), a large number of RNA polymerase tails have been brought together to form a condensate that is highly enriched in the many components needed for the synthesis and processing of pre-mRNAs. Such a model can account for the several thousand sites of active RNA transcription and processing typically observed in the nucleus of a mammalian cell, each of which has a diameter of roughly and is estimated to contain, on average, about 10 RNA polymerase II molecules in addition to many other proteins. (C) Here, mRNA production factories and DNA replication factories have been visualized in a mammalian cell by briefly incorporating differently modified nucleotides into each nucleic acid and detecting the RNA and DNA produced using antibodies, one (green) detecting the newly synthesized DNA and the other (red) detecting the newly synthesized RNA. (C, from D.G. Wansink et al., J. Cell Sci. 107:1449-1456, 1994. With permission from the Company of Biologists.)
图 6-51 mRNA 生产工厂模型。在细胞核中,通过聚集转录和前 mRNA 处理所需的许多组分,使 mRNA 生产更加高效,从而产生一个专门的生化工厂。在(A)中,靠近转录 RNA 聚合酶的各种组分被携带在尾部(参见图 6-23)。在(B)中,大量 RNA 聚合酶尾部被聚集在一起形成一个富含许多用于合成和处理前 mRNA 的组分的凝聚体。这样的模型可以解释哺乳动物细胞核中通常观察到的数千个活跃 RNA 转录和处理位点,每个位点直径大约为 ,据估计平均含有大约 10 个 RNA 聚合酶 II 分子以及许多其他蛋白质。 (C)在这里,通过将不同修饰的核苷酸简要地并入每种核酸中,并使用抗体检测产生的 RNA 和 DNA,在哺乳动物细胞中可视化了 mRNA 生产工厂和 DNA 复制工厂,其中一种(绿色)检测新合成的 DNA,另一种(红色)检测新合成的 RNA。(引自 D.G. Wansink 等人,J. Cell Sci. 107:1449-1456,1994 年。获得生物学家公司许可。)
nucleus, as it requires numerous RNA and protein components. However, as we have seen, the assembly of splicing components on pre-mRNA is co-transcriptional; thus, splicing must occur at many locations along chromosomes. Although a typical mammalian cell may be expressing on the order of 15,000 genes, transcription and RNA splicing take place at only several thousand sites in the nucleus. These sites are highly dynamic and probably result from the association of transcription and splicing components to create small factories, the name given to specific condensates containing a high local concentration of selected components that create biochemical assembly lines (Figure 6-51). Indeed, it is thought that initial rounds of transcription and RNA processing are very slow and perhaps error-prone due to limiting concentrations of key components; only when a factory becomes fully assembled does mRNA production become rapid and accurate. Interchromatin granule clusters, which contain stockpiles of RNA-processing components, are often observed next to these sites of transcription, as though poised to replenish supplies. We can thus view the nucleus as organized into dynamic condensates of different sizes, with snRNPs, snoRNPs, and other nuclear components diffusing rapidly among them, so as to maintain high concentrations of the many components needed for each step of RNA production.
细胞核需要大量的 RNA 和蛋白质组分。然而,正如我们所看到的,剪接组分在前 mRNA 上的组装是共转录的;因此,剪接必须发生在染色体的许多位置。尽管典型的哺乳动物细胞可能表达大约 15,000 个基因,但转录和 RNA 剪接只发生在细胞核中的几千个位置。这些位置非常动态,可能是由于转录和剪接组分的结合而产生的,从而形成了包含高局部浓度的选定组分的生化装配线的小工厂,这个名字给予了特定的凝聚体(图 6-51)。事实上,人们认为初始的转录和 RNA 处理轮次非常缓慢,可能由于关键组分的浓度有限而容易出错;只有当一个工厂完全组装好时,mRNA 的产生才会变得迅速和准确。在这些转录位置旁边经常观察到包含 RNA 处理组分储备的亚染色质颗粒团,就像准备补充供应一样。 因此,我们可以将细胞核视为由不同大小的动态凝聚体组织而成,其中 snRNPs、snoRNPs 和其他核组分在它们之间迅速扩散,以保持每个 RNA 生产步骤所需的许多组分的高浓度。

Summary 摘要

Before the synthesis of a particular protein can begin, the corresponding mRNA molecule must be produced by transcription. Bacteria contain a single type of RNA polymerase (the enzyme that carries out the transcription of DNA into RNA). An mRNA molecule is produced after this enzyme initiates transcription at a promoter, synthesizes the RNA by chain elongation, stops transcription at a terminator, and releases both the DNA template and the completed mRNA molecule. In eukaryotic cells, the process of transcription is much more complex, and there are three RNA polymerases (polymerase I, II, and III) that are related evolutionarily to one another and to the bacterial polymerase.
在合成特定蛋白质之前,必须通过转录产生相应的 mRNA 分子。细菌含有一种类型的 RNA 聚合酶(负责将 DNA 转录为 RNA 的酶)。在启动子处,此酶开始转录,通过链延伸合成 RNA,终止子处停止转录,并释放 DNA 模板和完成的 mRNA 分子。在真核细胞中,转录过程更为复杂,有三种 RNA 聚合酶——聚合酶 I、II 和 III——在进化上与细菌聚合酶及彼此相关。
RNA polymerase II synthesizes eukaryotic mRNA. This enzyme requires a set of additional proteins, the general transcription factors, to initiate transcription on a DNA template. It requires still more proteins (including transcription activator
RNA 聚合酶 II 合成真核 mRNA。这种酶需要一组额外的蛋白质,即一般转录因子,在 DNA 模板上启动转录。它还需要更多的蛋白质(包括转录激活因子)。

proteins, chromatin remodeling complexes, and histone-modifying enzymes) to initiate transcription on its chromatin templates inside the cell.
蛋白质、染色质重塑复合物和修饰组蛋白酶)在细胞内的染色质模板上启动转录。
During the elongation phase of transcription, the nascent RNA undergoes three types of processing events: a special nucleotide is added to its 5' end (capping), intron sequences are removed from the middle of the RNA molecule (splicing), and the 3' end of the RNA is generated (cleavage and polyadenylation). Each of these processes is initiated by proteins that travel along with RNA polymerase II by binding to sites on its long, extended C-terminal tail. Splicing is unusual in that many assembly steps are required for each splicing event, and the catalytic site for the reaction is formed by RNA molecules rather than proteins. Only properly processed mRNAs are passed through nuclear pore complexes into the cytosol, where they are translated into protein.
在转录的延伸阶段,新生 RNA 经历三种类型的加工事件:一个特殊核苷酸被添加到其 5'端(帽子化),内含子序列从 RNA 分子中间被移除(剪接),并生成 RNA 的 3'端(剪切和多聚腺苷化)。这些过程中的每一个都由蛋白质启动,这些蛋白质通过结合到其长而延伸的 C-末端上的位点与 RNA 聚合酶 II 一起移动。剪接是不寻常的,因为每个剪接事件都需要许多组装步骤,并且反应的催化位点是由 RNA 分子而不是蛋白质形成的。只有经过适当加工的 RNA 才能通过核孔复合体进入细胞质,在那里它们被翻译成蛋白质。
For many genes, RNA, rather than protein, is the final product. In eukaryotes, the most abundant of these non-coding RNAs are transcribed by either RNA polymerase I or RNA polymerase III. RNA polymerase I makes the ribosomal RNAs, which are by far the most abundant RNAs in a cell. The rRNAs are chemically modified, cleaved, and assembled into the two ribosomal subunits in the nucleolus, a distinct membraneless organelle that also helps to process some smaller RNA-protein complexes in the cell. Additional biomolecular condensates in the nucleus (including Cajal bodies and interchromatin granule clusters) are sites where components involved in RNA processing are assembled, stored, and recycled. The high concentration of components in these and other biomolecular condensates ensures that the processes being catalyzed are rapid and efficient.
对于许多基因来说,RNA 而非蛋白质是最终产物。在真核生物中,这些非编码 RNA 中最丰富的是由 RNA 聚合酶 I 或 RNA 聚合酶 III 转录的。RNA 聚合酶 I 合成核糖体 RNA,这是细胞中数量最多的 RNA。核糖体 RNA 经过化学修饰、剪切,并在核仁中组装成两个核糖体亚基-核仁是一个独特的无膜细胞器,还帮助处理细胞中的一些较小的 RNA 蛋白质复合物。细胞核中的其他生物分子凝聚体(包括 Cajal 小体和染色质间颗粒团)是 RNA 加工所涉及的组分被组装、储存和回收的地方。这些和其他生物分子凝聚体中组分的高浓度确保了催化过程的快速和高效进行。

FROM RNA TO PROTEIN
从 RNA 到蛋白质

In the preceding section, we saw that the final product of some genes is an RNA molecule itself, such as the RNAs present in the snRNPs and in ribosomes. However, most genes in a cell produce mRNA molecules that serve as intermediaries on the pathway to proteins. In this section, we examine how the cell converts the information carried in an mRNA molecule into a protein molecule. This feat of translation was a strong focus of attention for biologists in the late 1950s, when it was posed as the "coding problem": How is the information in a linear sequence of nucleotides in RNA translated into the linear sequence of a chemically quite different set of units, the amino acids in proteins? This fascinating question stimulated great excitement. Here was a cryptogram set up by nature that, after more than 3 billion years of evolution, could finally be solved by one of the products of evolution: human beings. And indeed, not only was the code cracked step by step, but in the year 2000 the structure of the elaborate machinery by which cells read this code, the ribosome, was finally revealed in atomic detail.
在前面的部分中,我们看到一些基因的最终产物是 RNA 分子本身,比如存在于 snRNPs 和核糖体中的 RNA。然而,细胞中的大多数基因产生的是作为蛋白质合成途径中介的 mRNA 分子。在本节中,我们将探讨细胞如何将 mRNA 分子中携带的信息转化为蛋白质分子。这种翻译的壮举是 1950 年代末生物学家关注的焦点,当时它被提出为“编码问题”:如何将 RNA 中核苷酸的线性序列中的信息翻译成蛋白质中氨基酸的线性序列,这是一组化学性质完全不同的单位?这个迷人的问题激发了极大的兴奋。这里是大自然设置的一种密码,经过 30 多亿年的演化,最终可以被演化的产物之一——人类解开。事实上,这个密码不仅被逐步破解,而且在 2000 年,细胞读取这个密码的复杂机制——核糖体的结构最终以原子细节揭示了出来。

An mRNA Sequence Is Decoded in Sets of Three Nucleotides
mRNA 序列以三个核苷酸一组进行解码

Once an mRNA has been produced by transcription and processing, the information present in its nucleotide sequence is used to synthesize a protein. Transcription is simple to understand as a means of information transfer: because DNA and RNA are chemically and structurally similar, the DNA can act as a direct template for the synthesis of RNA by complementary base-pairing. As the term "transcription" signifies, it is as if a message written out by hand is being converted, say, into a typewritten text. The language itself and the form of the message do not change, and the symbols used are closely related.
一旦 mRNA 通过转录和加工产生,其核苷酸序列中包含的信息被用来合成蛋白质。转录很容易理解为信息传递的一种方式:因为 DNA 和 RNA 在化学和结构上相似,DNA 可以作为 RNA 合成的直接模板,通过互补的碱基配对。正如“转录”一词所示,就好像手写的消息被转换成打字的文本一样。语言本身和消息的形式不会改变,使用的符号是密切相关的。
In contrast, the conversion of the information in RNA into protein represents a translation of the information into another language that uses quite different symbols. Moreover, because there are only 4 different nucleotides in mRNA and 20 different types of amino acids in a protein, this translation cannot be accounted for by a direct one-to-one correspondence between a nucleotide in RNA and an amino acid in protein. The nucleotide sequence of a gene, through the intermediary of mRNA, is instead translated into the amino acid sequence of a protein by rules that are known as the genetic code. This code was deciphered in the early 1960s.
相比之下,将 RNA 中的信息转化为蛋白质代表着将信息翻译成另一种使用完全不同符号的语言。此外,由于 mRNA 中只有 4 种不同的核苷酸,而蛋白质中有 20 种不同类型的氨基酸,这种翻译不能通过 RNA 中的核苷酸与蛋白质中的氨基酸之间的直接一对一对应来解释。基因的核苷酸序列,通过 mRNA 的中间体,而是通过被称为遗传密码的规则转译为蛋白质的氨基酸序列。这个密码在 1960 年代初被解读。
Amino acid   One-letter   Codons
Ala          A            GCA GCC GCG GCU
Arg          R            AGA AGG CGA CGC CGG CGU
Asp          D            GAC GAU
Asn          N            AAC AAU
Cys          C            UGC UGU
Glu          E            GAA GAG
Gln          Q            CAA CAG
Gly          G            GGA GGC GGG GGU
His          H            CAC CAU
Ile          I            AUA AUC AUU
Leu          L            UUA UUG CUA CUC CUG CUU
Lys          K            AAA AAG
Met          M            AUG
Phe          F            UUC UUU
Pro          P            CCA CCC CCG CCU
Ser          S            AGC AGU UCA UCC UCG UCU
Thr          T            ACA ACC ACG ACU
Trp          W            UGG
Tyr          Y            UAC UAU
Val          V            GUA GUC GUG GUU
stop                      UAA UAG UGA
Figure 6-52 The genetic code. The standard one-letter abbreviation for each amino acid is presented below its three-letter abbreviation (see Panel 3-1, pp. 118-119, for the full name of each amino acid and its structure). By convention, codons are always written with the 5'-terminal nucleotide to the left. Note that most amino acids are represented by more than one codon, and that there are some regularities in the set of codons that specifies each amino acid: codons for the same amino acid tend to contain the same nucleotides at the first and second positions, and vary at the third position. Three codons do not specify any amino acid but act as termination sites (stop codons), signaling the end of the protein-coding sequence. One codon, AUG, acts both as an initiation codon, signaling the start of a protein-coding message, and also as the codon that specifies methionine.
图 6-52 遗传密码。每种氨基酸的标准一字母缩写在其三字母缩写下方呈现(详见第 3-1 面,第 118-119 页,每种氨基酸的全名及其结构)。按照惯例,密码子始终从左侧写入第 5 个核苷酸。请注意,大多数氨基酸由多个密码子表示,并且在指定每种氨基酸的密码子集中存在一些规律性:相同氨基酸的密码子往往在第一和第二位置包含相同的核苷酸,并在第三位置有所变化。三个密码子不指定任何氨基酸,而是作为终止位点(终止密码子),标志着蛋白质编码序列的结束。一个密码子-AUG-既作为启动密码子,标志着蛋白质编码信息的开始,也作为指定甲硫氨酸的密码子。
The sequence of nucleotides in the mRNA molecule is read in consecutive groups of three. RNA is a linear polymer of four different nucleotides, so there are 4 × 4 × 4 = 64 possible combinations of three nucleotides: the triplets AAA, AUA, AUG, and so on. However, only 20 different amino acids are commonly found in proteins. Either some nucleotide triplets are never used or the code is redundant and some amino acids are specified by more than one triplet. The second possibility is, in fact, the correct one, as shown by the completely deciphered genetic code in Figure 6-52. Each group of three consecutive nucleotides in mRNA is called a codon, and each codon specifies either one amino acid or a stop to the translation process.
mRNA 分子中的核苷酸序列以连续的三个为一组进行阅读。RNA 是由四种不同核苷酸构成的线性聚合物,因此有 种可能的三核苷酸组合:三联体 AAA,AUA,AUG 等。然而,在蛋白质中通常只有 20 种不同的氨基酸。有些核苷酸三联体从未被使用,或者密码子是冗余的,某些氨基酸由多个三联体指定。事实上,第二种可能性才是正确的,正如图 6-52 中完全解密的遗传密码所示。mRNA 中每组连续的三个核苷酸被称为密码子,每个密码子要么指定一个氨基酸,要么指定翻译过程的终止。
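The counting argument above can be checked with a short script. This is a minimal sketch, not part of the original text: the names `GENETIC_CODE` and `translate` are illustrative, the one-letter string encodes the standard genetic code of Figure 6-52 in U, C, A, G order, and the example mRNA is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Enumerate every triplet: 4 x 4 x 4 = 64 codons.
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna: str) -> str:
    """Read an mRNA in consecutive triplets from its first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# How many codons specify each amino acid (or stop)?
codons_per_amino_acid = Counter(GENETIC_CODE.values())

print(len(CODONS))                         # 64 possible triplets
print(codons_per_amino_acid["*"])          # 3 stop codons
print(codons_per_amino_acid["L"])          # 6 codons for leucine
print(translate("AUGGCAUGGUAA"))           # hypothetical mRNA -> "MAW"
```

Subtracting the 3 stop codons from 64 leaves 61 codons for 20 amino acids, which is why most amino acids have more than one codon.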
In principle, an RNA sequence can be translated in any one of three different reading frames, depending on where the decoding process begins (Figure 6-53). However, only one of the three possible reading frames in an mRNA encodes the required protein. We see later how a special punctuation signal at the beginning of each RNA message sets the correct reading frame at the start of protein synthesis.
原则上,RNA 序列可以根据解码过程的开始位置在三种不同的阅读框架中的任何一个进行翻译(图 6-53)。然而,在 mRNA 中只有三种可能的阅读框架中的一种编码所需的蛋白质。我们稍后会看到,每个 RNA 信息的开头处的特殊标点信号设置了蛋白质合成开始时的正确阅读框架。
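The idea of three reading frames can be sketched in a few lines. The function name `reading_frames` and the example sequence are illustrative assumptions, not taken from the original text; the sketch only splits a sequence into codons and does not decide which frame is the correct one.

```python
def reading_frames(mrna: str) -> list[list[str]]:
    """Split an mRNA (written 5' to 3') into complete codons for each of the three frames."""
    frames = []
    for offset in range(3):
        # Start at the chosen offset and step three bases at a time,
        # keeping only complete triplets.
        codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
        frames.append(codons)
    return frames


# Hypothetical 14-nucleotide sequence.
frames = reading_frames("CUCAGCGUUACCAU")
for number, codons in enumerate(frames, start=1):
    print(number, " ".join(codons))
```

Shifting the start by one or two nucleotides changes every codon downstream, so each frame would specify a completely different amino acid sequence.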

tRNA Molecules Match Amino Acids to Codons in mRNA
tRNA 分子将氨基酸与 mRNA 中的密码子配对

The codons in an mRNA molecule do not directly recognize the amino acids they specify: the group of three nucleotides does not, for example, bind directly to the amino acid. Rather, the translation of mRNA into protein depends on adaptor molecules that can recognize and bind both to the codon and, at another site on their surface, to the amino acid. These adaptors consist of a set of small RNA molecules known as transfer RNAs (tRNAs), each about 80 nucleotides in length.
mRNA 分子中的密码子并不直接识别它们指定的氨基酸:例如,三个核苷酸的组合并不直接与氨基酸结合。相反,mRNA 转译成蛋白质取决于能够识别并结合密码子以及在它们表面的另一个位置结合氨基酸的适配分子。这些适配分子由一组称为转运 RNA(tRNA)的小 RNA 分子组成,每个大约 80 个核苷酸长。
We saw earlier in this chapter that RNA molecules can fold into precise three-dimensional structures, and the tRNA molecules provide striking examples. Four short segments of each folded tRNA are double-helical, producing a molecule that looks like a cloverleaf when drawn schematically (Figure 6-54). For example, a 5'-GCUC-3' sequence in one part of a polynucleotide chain can form a relatively strong base-pairing association with a 5'-GAGC-3' sequence in another region of the same molecule. The cloverleaf undergoes further folding to form a compact L-shaped structure that is held together by additional hydrogen bonds between different regions of the molecule (see Figure 6-54B and C).
我们在本章前面看到,RNA 分子可以折叠成精确的三维结构,tRNA 分子提供了引人注目的例子。每个折叠的 tRNA 的四个短片段是双螺旋的,产生了一个在示意图中看起来像三叶草的分子(图 6-54)。例如,多核苷酸链的一个部分中的 5'-GCUC-3'序列可以与同一分子的另一个区域中的-3'序列形成相对强的碱基配对。三叶草进一步折叠形成一个紧凑的 L 形结构,由分子不同区域之间的额外氢键维持在一起(见图 6-54B 和 C)。
Two regions of unpaired nucleotides situated at either end of the L-shaped molecule are crucial to the function of tRNA in protein synthesis. One of these regions forms the anticodon, a set of three consecutive nucleotides that pairs with the complementary codon in an mRNA molecule. The other is a short single-stranded region at the 3' end of the molecule; this is the site where the amino acid that matches the codon is attached to the tRNA.
L 形分子两端的未配对核苷酸区域对 tRNA 在蛋白质合成中的功能至关重要。其中一个区域形成反密码子,由三个连续核苷酸组成,与 mRNA 分子中的互补密码子配对。另一个是分子末端的短单链区域;这是与密码子匹配的氨基酸附着到 tRNA 的位置。
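Because paired RNA strands run antiparallel, the partner of a sequence is its reverse complement when both are written 5' to 3'. The sketch below illustrates this for the stem example given earlier and for a codon and its anticodon. It assumes conventional base pairs only (A with U, G with C) and ignores modified bases; the function name is illustrative.

```python
# Conventional RNA base pairs only; wobble and modified bases are ignored here.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the strand that pairs with `rna`, with both written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


# A double-helical stem segment: 5'-GCUC-3' pairs with 5'-GAGC-3'.
print(reverse_complement("GCUC"))  # GAGC

# The anticodon that pairs conventionally with the phenylalanine codon 5'-UUC-3'.
print(reverse_complement("UUC"))   # GAA
```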
We saw above that the genetic code is redundant; that is, several different codons can specify a single amino acid. This redundancy implies either that there is more than one tRNA for many of the amino acids or that some tRNA molecules can base-pair with more than one codon. In fact, both situations occur. Some
我们已经看到遗传密码是冗余的;也就是说,几个不同的密码子可以指定一个氨基酸。这种冗余意味着许多氨基酸有不止一种 tRNA,或者一些 tRNA 分子可以与不止一个密码子配对。事实上,这两种情况都存在。一些
Figure 6-53 The three possible reading frames in protein synthesis. In the process of translating a nucleotide sequence (blue) into an amino acid sequence (red), the sequence of nucleotides in an mRNA molecule is read from the 5' end to the 3' end in consecutive sets of three nucleotides. In principle, therefore, the same RNA sequence can specify three completely different amino acid sequences, depending on the reading frame. In reality, however, only one of these reading frames contains the actual message.
图 6-53 蛋白质合成中的三种可能阅读框架。在将核苷酸序列(蓝色)翻译成氨基酸序列(红色)的过程中,mRNA 分子中的核苷酸序列是以连续的三个核苷酸为一组从 端到 端进行阅读。因此,同一 RNA 序列原则上可以指定三种完全不同的氨基酸序列,这取决于阅读框架。然而,在现实中,只有其中一个阅读框架包含实际信息。
Figure 6-54 A tRNA molecule. A tRNA specific for the amino acid phenylalanine (Phe) is depicted in various ways. (A) The cloverleaf structure showing the complementary base-pairing (red lines) that creates the double-helical regions of the molecule. The anticodon is the sequence of three nucleotides that base-pairs with a codon in mRNA. The amino acid matching the codon-anticodon pair is attached at the 3' end of the tRNA. tRNAs contain some unusual bases, which are produced by chemical modification after the tRNA has been synthesized. For example, the bases denoted Ψ (pseudouridine; see Figure 6-43) and D (dihydrouridine; see Figure 6-57) are derived from uracil. (B and C) Views of the L-shaped molecule that are based on x-ray diffraction analysis. Although this diagram shows the tRNA for the amino acid phenylalanine, all other tRNAs have similar structures. (D) The tRNA icon we use in this book. (E) The linear nucleotide sequence of the molecule, color-coded to match the illustrations in A, B, and C.
图 6-54 tRNA 分子。以各种方式描绘了特异于氨基酸苯丙氨酸(Phe)的 tRNA(A)。三叶草状结构显示了创建分子双螺旋区域的互补碱基配对(红线)。反密码子是与 mRNA 中密码子配对的三个核苷酸序列。与密码子-反密码子配对相匹配的氨基酸附加在 tRNA 的 端。tRNA 含有一些不寻常的碱基,这些碱基在 tRNA 合成后经化学修饰产生。例如,标记为 (伪尿嘧啶-见图 6-43)和 D(二氢尿嘧啶-见图 6-57)的碱基源自尿嘧啶(B 和 C)。基于 X 射线衍射分析的 L 形分子的视图。尽管此图显示了氨基酸苯丙氨酸的 tRNA,但所有其他 tRNA 具有类似的结构(D)。我们在本书中使用的 tRNA 图标(E)。分子的线性核苷酸序列,颜色编码以匹配 A、B 和 C 中的插图。
amino acids have more than one tRNA, and some tRNAs are constructed so that they require accurate base-pairing only at the first two positions of the codon and can tolerate a mismatch (or wobble) at the third position (Figure 6-55). This wobble base-pairing explains why so many of the alternative codons for an amino acid differ only in their third nucleotide (see Figure 6-52). In bacteria, wobble base-pairings make it possible to fit the 20 amino acids to their 61 codons with as
氨基酸具有多个 tRNA,一些 tRNA 的构造使其只需要在密码子的前两个位置准确配对,并且可以容忍第三个位置的不匹配(或 wobble)(图 6-55)。这种 wobble 碱基配对解释了为什么许多氨基酸的替代密码子仅在它们的第三个核苷酸上有所不同(参见图 6-52)。在细菌中,wobble 碱基配对使得 20 种氨基酸适应它们的 61 个密码子成为可能。
Bacteria
Wobble codon base    Possible anticodon bases
U                    A, G, or I
C                    G or I
A                    U or I
G                    C or U

Eukaryotes
Wobble codon base    Possible anticodon bases
U                    A, G, or I
C                    G or I
A                    U
G                    C
Figure 6-55 Wobble base-pairing between codons and anticodons. If the nucleotide listed in the first column is present at the third, or wobble, position of the codon, it can base-pair with any of the nucleotides listed in the second column. Thus, for example, when inosine (I) is present in the wobble position of the tRNA anticodon, the tRNA can recognize any one of three different codons in bacteria and either of two codons in eukaryotes. The inosine in tRNAs is formed from the deamination of adenosine (see Figure 6-57), a chemical modification that takes place after the tRNA has been synthesized. The nonstandard base pairs, including those made with inosine, are generally weaker than conventional base pairs. Codon-anticodon base-pairing is more stringent at positions 1 and 2 of the codon, where only conventional base pairs are permitted. The differences in wobble base-pairing interactions between bacteria and eukaryotes presumably result from subtle structural differences between bacterial and eukaryotic ribosomes, the molecular machines that perform protein synthesis. (Adapted from C. Guthrie and J. Abelson, in The Molecular Biology of the Yeast Saccharomyces: Metabolism and Gene Expression, pp. 487-528. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1982.)
图 6-55 密码子和反密码子之间的摇摆碱基配对。如果第一列中列出的核苷酸出现在密码子的第三个位置,即摇摆位置,它可以与第二列中列出的任何核苷酸进行碱基配对。因此,例如,当肌苷( )出现在 tRNA 反密码子的摇摆位置时,tRNA 可以识别细菌中的三种不同密码子或真核生物中的两种密码子中的任何一种。tRNA 中的肌苷是由腺苷脱氨化形成的(参见图 6-57),这是在 tRNA 合成完成后发生的化学修饰。非标准碱基对,包括与肌苷形成的碱基对,通常比常规碱基对弱。密码子-反密码子碱基配对在密码子的第 1 和第 2 位置更为严格,只允许常规碱基对。细菌和真核生物之间摇摆碱基配对相互作用的差异可能源于细菌和真核生物核糖体之间微小结构差异,核糖体是执行蛋白质合成的分子机器。(改编自 C. Guthrie 和 J.) Abelson,《The Molecular Biology of the Yeast Saccharomyces: Metabolism and Gene Expression》,第 487-528 页。纽约州科尔德斯普林哈伯:科尔德斯普林哈伯实验室出版社,1982 年。)
Figure 6-56 Structure of a tRNA-splicing endonuclease docked to a precursor tRNA. The endonuclease (a four-subunit enzyme) removes the tRNA intron (dark blue, bottom). A second enzyme, a multifunctional tRNA ligase (not shown), then joins the two tRNA halves together. (Courtesy of H. Li, C. Trotta, and J. Abelson. PDB code: 2A9L.)
图 6-56 tRNA 剪接核酶与前体 tRNA 结构对接。核酶(四亚基酶)去除 tRNA 内含子(深蓝色,底部)。第二种酶,多功能 tRNA 连接酶(未显示),然后将两个 tRNA 半分子连接在一起。(由 H. Li,C. Trotta 和 J. Abelson 提供。PDB 代码:2A9L。)
few as 31 kinds of tRNA molecules. The exact number of different kinds of tRNAs, however, differs from one species to the next. For example, humans have nearly 500 tRNA genes that encode tRNAs with 48 different anticodons.
31 种 tRNA 分子。然而,不同种类的 tRNA 的确切数量因物种而异。例如,人类拥有近 500 个编码具有 48 个不同抗密码子的 tRNA 基因。
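The wobble rules of Figure 6-55 can be turned into a small lookup that answers the reverse question: given the base at the wobble position of an anticodon, which codon third-position bases can it read? This is a minimal sketch that simply encodes the figure's table; the names `WOBBLE` and `codon_bases_read` are illustrative.

```python
# For each domain: codon wobble base -> anticodon bases that can pair with it
# (transcribed from the table in Figure 6-55; "I" is inosine).
WOBBLE = {
    "bacteria": {
        "U": {"A", "G", "I"},
        "C": {"G", "I"},
        "A": {"U", "I"},
        "G": {"C", "U"},
    },
    "eukaryotes": {
        "U": {"A", "G", "I"},
        "C": {"G", "I"},
        "A": {"U"},
        "G": {"C"},
    },
}


def codon_bases_read(anticodon_wobble_base: str, domain: str) -> list[str]:
    """List the codon third-position bases that a given anticodon wobble base can pair with."""
    return sorted(
        codon_base
        for codon_base, anticodon_bases in WOBBLE[domain].items()
        if anticodon_wobble_base in anticodon_bases
    )


print(codon_bases_read("I", "bacteria"))    # ['A', 'C', 'U']: three codons
print(codon_bases_read("I", "eukaryotes"))  # ['C', 'U']: two codons
print(codon_bases_read("U", "bacteria"))    # ['A', 'G']
```

A single tRNA whose anticodon reads several third-position bases covers several synonymous codons, which is how far fewer than 61 tRNAs can serve all 61 sense codons.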

tRNAs Are Covalently Modified Before They Exit from the Nucleus
tRNA 在离开细胞核之前会被共价修饰

Like most other eukaryotic RNAs, tRNAs are covalently modified before they are allowed to exit from the nucleus. Eukaryotic tRNAs are synthesized by RNA polymerase III. Both bacterial and eukaryotic tRNAs are typically synthesized as larger precursor tRNAs, which are then trimmed to produce the mature tRNA. In addition, some tRNA precursors (from both bacteria and eukaryotes) contain introns that must be spliced out. This splicing reaction differs chemically from that of pre-mRNA splicing discussed earlier in the chapter; rather than generating a lariat intermediate, tRNA splicing uses a cut-and-paste mechanism that is catalyzed by proteins (Figure 6-56). Trimming and splicing both require the precursor tRNA to be correctly folded in its cloverleaf configuration. Because misfolded tRNA precursors will not be processed properly, the trimming and splicing reactions serve as quality-control steps in the generation of tRNAs. Those that do not pass the tests are degraded by the nuclear exosome (see Figure 6-38).
与大多数其他真核 RNA 一样,tRNA 在允许离开细胞核之前会被共价修饰。真核 tRNA 由 RNA 聚合酶 III 合成。细菌和真核的 tRNA 通常作为较大的前体 tRNA 合成,然后被修剪成成熟的 tRNA。此外,一些 tRNA 前体(来自细菌和真核生物)含有必须被剪接掉的内含子。这种剪接反应在化学上不同于本章前面讨论的 premRNA 剪接;tRNA 剪接不是生成套圈中间体,而是使用由蛋白质催化的剪切-粘贴机制(图 6-56)。修剪和剪接都要求前体 tRNA 以其三叶草状结构正确折叠。由于折叠不正确的 tRNA 前体将无法正确处理,修剪和剪接反应在 tRNA 生成过程中起到质量控制的作用。未通过测试的 tRNA 将被细胞核外体降解(见图 6-38)。
All tRNAs are modified chemically: nearly 1 in 10 nucleotides in each mature tRNA molecule is an altered version of a standard G, U, C, or A ribonucleotide. More than 50 different types of tRNA modifications are known; a few are shown in Figure 6-57. Some of the modified nucleotides lie within the anticodon (most notably inosine, produced by the deamination of adenosine) and affect the base-pairing of the anticodon, thereby facilitating the recognition of the appropriate mRNA codon by the tRNA molecule (see Figure 6-55). Other modifications affect the accuracy with which the tRNA is attached to the correct amino acid.
所有 tRNA 在化学上都经过修改-每个成熟 tRNA 分子中约有 10 分之 1 的核苷酸是标准 或 A 核糖核苷酸的改变版本。已知有 50 多种不同类型的 tRNA 修饰;其中一些在图 6-57 中显示。一些修改后的核苷酸位于抗密码子内-尤其是由腺苷脱氨生成的次黄嘌呤-并影响抗密码子的碱基配对,从而促进 tRNA 分子通过图 6-55 所示的适当 mRNA 密码子的识别。其他修饰影响 tRNA 连接到正确氨基酸的准确性。

Specific Enzymes Couple Each Amino Acid to Its Appropriate tRNA Molecule
特定酶将每种氨基酸与其相应的 tRNA 分子结合在一起

We have seen that, to read the genetic code in DNA, cells make a series of different tRNAs. We now consider how each tRNA molecule becomes linked to the one amino acid in 20 that is its appropriate partner. Recognition and attachment of
我们已经看到,为了阅读 DNA 中的遗传密码,细胞制造了一系列不同的 tRNA。现在我们考虑每个 tRNA 分子如何与其适当的伙伴中的 20 个氨基酸之一连接。识别和连接
[Figure 6-57 panel labels: two methyl groups added to G (N,N-dimethyl G); two hydrogens added to U (dihydro U)]
Figure 6-57 A few of the unusual nucleotides found in tRNA molecules. These nucleotides are produced by covalent modification of a normal nucleotide after it has been incorporated into an RNA chain. Two other types of modified nucleotides are shown in Figure 6-43. In most tRNA molecules, about 10% of the nucleotides are modified (see Figure 6-54). As shown in Figure 6-55, inosine is sometimes present at the wobble position in the tRNA anticodon.
图 6-57 tRNA 分子中发现的一些不寻常的核苷酸。这些核苷酸是在将正常核苷酸并入 RNA 链后通过共价修饰产生的。图 6-43 中还显示了另外两种修改过的核苷酸。在大多数 tRNA 分子中,大约 的核苷酸被修改(参见图 6-54)。如图 6-55 所示,肌苷有时会出现在 tRNA 抗密码子的摇摆位置。
Figure 6-58 Amino acid activation by synthetase enzymes. An amino acid is attached to its corresponding tRNA in two steps by an aminoacyl-tRNA synthetase enzyme. As indicated, the energy of ATP hydrolysis is used in the reaction to produce a high-energy linkage. The amino acid is first activated through attachment of its carboxyl group directly to AMP, forming an adenylated amino acid; the linkage of the AMP, normally an unfavorable reaction, is driven by the hydrolysis of the ATP molecule that donates the AMP. Without leaving the synthetase enzyme, the AMP-linked carboxyl group on the amino acid is then transferred to a hydroxyl group on the sugar at the end of the tRNA molecule. This transfer joins the amino acid by an activated ester linkage to the tRNA and forms the final aminoacyl-tRNA molecule. The synthetase enzyme is not shown in this diagram.
图 6-58 氨基酸通过合成酶酶激活。氨基酸通过氨酰-tRNA 合成酶酶的两个步骤附加到其对应的 tRNA 上。如所示,ATP 水解的能量用于产生高能链接的反应。氨基酸首先通过将其羧基直接附加到 AMP 上而被激活,形成腺苷酸化氨基酸;AMP 的连接,通常是一个不利的反应,是由捐赠 AMP 的 ATP 分子的水解驱动的。在不离开合成酶酶的情况下,氨基酸上的 AMP 连接羧基然后转移到 tRNA 分子末端的糖的羟基上。这种转移通过活化酯键将氨基酸与 tRNA 连接起来,并形成最终的氨酰-tRNA 分子。此图中未显示合成酶酶。
the correct amino acid depends on enzymes called aminoacyl-tRNA synthetases, which covalently couple each amino acid to its appropriate set of tRNA molecules (Figure 6-58 and Figure 6-59). Most cells have a different synthetase enzyme for each amino acid (that is, 20 synthetases in all); one attaches glycine to all tRNAs that recognize codons for glycine, another attaches alanine to all tRNAs that recognize codons for alanine, and so on. Many bacteria, however, have fewer than 20 synthetases, and the same synthetase enzyme is responsible for coupling more
正确的氨基酸取决于称为氨酰-tRNA 合成酶的酶,它们将每种氨基酸共价地耦合到其适当的 tRNA 分子组上(图 6-58 和图 6-59)。大多数细胞对每种氨基酸都有不同的合成酶酶(即总共有 20 种合成酶);一种将甘氨酸连接到所有识别甘氨酸密码子的 tRNA 上,另一种将丙氨酸连接到所有识别丙氨酸密码子的 tRNA 上,依此类推。然而,许多细菌的合成酶少于 20 种,同一种合成酶酶负责耦合更多。
(A)
aminoacylTRNA 氨基酰 tRNA
(B)
Figure 6-59 The structure of the aminoacyl-tRNA linkage. The carboxyl end of the amino acid forms an ester bond to ribose. Because the hydrolysis of this ester bond is associated with a large favorable change in free energy, an amino acid held in this way is said to be activated. (A) Schematic drawing of the structure. The amino acid is linked to the nucleotide at the end of the tRNA (see Figure 6-54). (B) Actual structure corresponding to the boxed region in A. There are two major classes of synthetase enzymes: one links the amino acid directly to the group of the ribose, and the other links it initially to the group. In the latter case, a subsequent transesterification reaction shifts the amino acid to the 3 ' position. The " " is a standard symbol used to represent the side chain of an amino acid.
图 6-59 氨酰-tRNA 连接的结构。氨基酸的羧基端形成酯键与核糖结合。由于这个酯键的水解伴随着大量有利的自由能变化,以这种方式保持的氨基酸被称为被激活的。 (A) 结构的示意图。氨基酸与 tRNA 的 端的核苷酸连接在一起(参见图 6-54)。 (B) 实际结构对应于 A 中的方框区域。有两类主要的合成酶酶:一种将氨基酸直接连接到核糖的 基团,另一种最初将其连接到 基团。在后一种情况下,随后的酯交换反应将氨基酸转移到 3'位置。 “ ”是用于表示氨基酸侧链的标准符号。
Figure 6-60 The genetic code is translated by means of two adaptors that act one after another. The first adaptor is the aminoacyl-tRNA synthetase, which couples a particular amino acid to its corresponding tRNA; the second adaptor is the tRNA molecule itself, whose anticodon forms base pairs with the appropriate codon on the mRNA. An error in either step would cause the wrong amino acid to be incorporated into a protein chain (Movie 6.6). In the sequence of events shown, the amino acid tryptophan (Trp) is selected by the codon UGG on the mRNA.
图 6-60 遗传密码通过两个适配器依次起作用进行翻译。第一个适配器是氨酰-tRNA 合成酶,它将特定的氨基酸与相应的 tRNA 结合;第二个适配器是 tRNA 分子本身,其反密码子与 mRNA 上的适当密码子形成碱基对。在任一步骤中出现错误都会导致错误的氨基酸被合并到蛋白质链中(电影 6.6)。在所示事件序列中,氨基酸色氨酸(Trp)由 mRNA 上的密码子 UGG 选择。
than one amino acid to the appropriate tRNAs. In these cases, a single synthetase places the identical amino acid on two different types of tRNAs, only one of which has an anticodon that matches the amino acid. A second enzyme then chemically modifies each "incorrectly" attached amino acid so that it now corresponds to the anticodon displayed by its covalently linked tRNA.
将一个氨基酸转移到适当的 tRNA 上。在这些情况下,一个合成酶将相同的氨基酸放置在两种不同类型的 tRNA 上,其中只有一种具有与氨基酸匹配的反密码子。然后,第二个酶对每个“错误”连接的氨基酸进行化学修饰,使其现在对应于其共价连接的 tRNA 显示的反密码子。
The synthetase-catalyzed reaction that attaches the amino acid to the 3′ end of the tRNA is one of many reactions coupled to the energy-releasing hydrolysis of ATP (see pp. 70-72), and it produces a high-energy bond between the tRNA and the amino acid. The energy of this bond is used at a later stage in protein synthesis to link the amino acid covalently to the growing polypeptide chain.
The aminoacyl-tRNA synthetase enzymes and the tRNAs are equally important in the decoding process (Figure 6-60). This was established by an experiment in which one amino acid (cysteine) was chemically converted into a different amino acid (alanine) after it already had been attached to its specific tRNA. When such "hybrid" aminoacyl-tRNA molecules were used for protein synthesis in a cell-free system, the wrong amino acid was inserted at every point in the protein chain where that tRNA was used. Although, as we shall see, cells have several quality-control mechanisms to avoid this type of mishap, the experiment did establish that the genetic code is translated by two sets of adaptors that act sequentially. Each matches one molecular surface to another with great specificity, and it is their combined action that associates each sequence of three nucleotides in the mRNA molecule (that is, each codon) with its particular amino acid.
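The two-adaptor logic and the cysteine-to-alanine experiment can be sketched as a toy model. This is only an illustration: the tRNA names, the anticodons chosen, the strict Watson-Crick pairing (wobble is ignored), and all function names are assumptions made for the example, not details taken from the text.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for(anticodon):
    """Codon (5'->3') pairing with an anticodon (5'->3'); strands are antiparallel."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon))


def charge(trnas, synthetase_table):
    """Adaptor 1: synthetases attach an amino acid to each tRNA.

    Returns a dict mapping each tRNA's anticodon to the amino acid it carries.
    """
    return {t["anticodon"]: synthetase_table[t["name"]] for t in trnas}


def translate(mrna, charged):
    """Adaptor 2: each codon selects a tRNA by anticodon pairing only.

    The ribosome inserts whatever amino acid that tRNA happens to carry.
    """
    by_codon = {codon_for(anticodon): aa for anticodon, aa in charged.items()}
    usable = len(mrna) - len(mrna) % 3
    return [by_codon[mrna[i:i + 3]] for i in range(0, usable, 3)]


# Hypothetical tRNAs and synthetase specificities.
trnas = [{"name": "tRNA-Trp", "anticodon": "CCA"},
         {"name": "tRNA-Cys", "anticodon": "GCA"}]
synthetases = {"tRNA-Trp": "Trp", "tRNA-Cys": "Cys"}

charged = charge(trnas, synthetases)
normal = translate("UGGUGCUGG", charged)

# "Hybrid" aminoacyl-tRNA: the cysteine already attached to tRNA-Cys is
# chemically converted to alanine; the anticodon is unchanged.
hybrid_charged = dict(charged)
hybrid_charged["GCA"] = "Ala"
hybrid = translate("UGGUGCUGG", hybrid_charged)

print(normal)
print(hybrid)
```

In this sketch the second adaptor never inspects the amino acid, so the converted amino acid appears at every codon read by that tRNA, which mirrors the outcome of the experiment described above.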

Editing by tRNA Synthetases Ensures Accuracy

Several mechanisms working together ensure that an aminoacyl-tRNA synthetase links the correct amino acid to each tRNA. Most synthetase enzymes select the correct amino acid by a two-step mechanism. The correct amino acid has the highest affinity for the active-site pocket of its synthetase and is therefore favored over the other 19; in particular, amino acids larger than the correct one are excluded from the active site. However, accurate discrimination between two similar amino acids, such as isoleucine and valine (which differ by only a methyl group), is very difficult to achieve in a single step. A second discrimination step occurs after the amino acid has been covalently linked to AMP (see Figure 6-58): when tRNA binds, the synthetase tries to force the adenylated amino acid into a second editing pocket in the enzyme. The precise dimensions of this pocket exclude the correct amino acid, while allowing access by closely related amino acids. In the editing pocket, an amino acid is removed from the AMP (or from the tRNA itself if the aminoacyl-tRNA bond has already formed) by hydrolysis. This hydrolytic editing, which is analogous to the exonucleolytic proofreading by DNA polymerases, increases the overall accuracy of tRNA charging so that only about one mistake is made in 40,000 couplings (Figure 6-61).
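The two-step selection can be sketched as two size filters applied in sequence, followed by a rough calculation of how two imperfect steps combine. The side-chain sizes, the strict size cutoffs, and the per-step error rates below are all hypothetical values chosen for illustration; only the overall figure of about 1 mistake in 40,000 couplings comes from the text.

```python
from fractions import Fraction

# Hypothetical side-chain sizes (number of side-chain carbon atoms).
SIDE_CHAIN_SIZE = {"Val": 3, "Ile": 4, "Phe": 7}


def synthetase_accepts(amino_acid, correct="Ile"):
    """Idealized two-step selection for a synthetase whose correct substrate is `correct`."""
    size = SIDE_CHAIN_SIZE[amino_acid]
    limit = SIDE_CHAIN_SIZE[correct]
    if size > limit:
        return False  # step 1: too large to fit the active-site pocket
    if size < limit:
        return False  # step 2: small enough to enter the editing pocket; hydrolyzed
    return True       # fits the active site and is excluded from the editing pocket


def overall_error(first_step_error, editing_escape):
    """Error rate when a mistake must pass both steps independently (an assumption)."""
    return first_step_error * editing_escape


for aa in ("Val", "Ile", "Phe"):
    print(aa, synthetase_accepts(aa))

# Illustrative split: if each step alone let through 1 error in 200,
# the two steps together would let through 1 in 40,000.
print(overall_error(Fraction(1, 200), Fraction(1, 200)))
```

Real synthetases discriminate probabilistically rather than with hard cutoffs, which is why the second function treats each step as an error rate and multiplies the two.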
The tRNA synthetase must also recognize the correct set of tRNAs, and extensive structural and chemical complementarity between the synthetase and the tRNA allows the synthetase to probe various features of the tRNA (Figure 6-62). Most tRNA synthetases directly recognize the matching tRNA anticodon; these synthetases contain three adjacent nucleotide-binding pockets, each of which is complementary in shape and charge to a nucleotide in the anticodon. For other synthetases, the nucleotide sequence of the amino acid-accepting arm (acceptor stem) is the key recognition determinant. In most cases, however, the synthetase "reads" the nucleotides at several different positions on the tRNA, thereby increasing the accurate linking of amino acids to their appropriate tRNAs.

Amino Acids Are Added to the C-terminal End of a Growing Polypeptide Chain

Having seen that amino acids are first coupled to specific tRNA molecules that serve as adaptors, we now turn to the mechanism that joins these amino acids together to form proteins. The fundamental reaction of protein synthesis is the formation of a peptide bond between the carboxyl group at the end of a growing polypeptide chain and a free amino group on an incoming amino acid.

Figure 6-61 Hydrolytic editing in biology. (A) Aminoacyl-tRNA synthetases correct their own coupling errors through the hydrolytic editing of incorrectly attached amino acids. As described in the text, errors are selectively removed because the correct amino acid is rejected by the editing site on the synthetase. (B) The error-correction process performed by DNA polymerase, as described in the previous chapter. Here error removal depends on the wrong nucleotide mispairing with the DNA template (see Figure 5-9). (P, polymerization site; E, editing site.)
Figure 6-62 The recognition of a tRNA molecule by its aminoacyl-tRNA synthetase. For this tRNA (tRNA-Gln), specific nucleotides in both the anticodon (dark blue) and the amino acid-accepting arm (green) allow the correct tRNA to be recognized by the synthetase enzyme (yellow-green). The ATP molecule used in the coupling reaction is yellow. (Courtesy of Tom Steitz. PDB code: 1QRS.)
Figure 6-63 The incorporation of an amino acid into a protein. A polypeptide chain grows by the stepwise addition of amino acids to its C-terminal end. The formation of each peptide bond is energetically favorable because the growing C-terminus has been activated by the covalent attachment of a tRNA molecule. Note that the peptidyl-tRNA linkage that activates the growing end is regenerated during each addition. The amino acid side chains have been abbreviated as R groups; as a reference point, all of the atoms in the second amino acid in the polypeptide chain are shaded gray. The figure shows the addition of the fourth amino acid (red) to the growing chain.
Consequently, a protein is synthesized from its N-terminal end to its C-terminal end, one amino acid at a time. Throughout the entire process, the growing carboxyl end of the polypeptide chain remains activated by its covalent attachment to a tRNA molecule (forming a peptidyl-tRNA). Each addition disrupts this high-energy covalent linkage but immediately replaces it with an identical linkage on the most recently added amino acid (Figure 6-63). In this way, each amino acid added carries with it the activation energy for the addition of the next amino acid rather than the energy for its own addition. This is an example of the polymer-end activation mechanism for polymer synthesis described in Figure 2-44.
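The bookkeeping of this polymer-end activation can be sketched as a simple state update. The data layout (a chain paired with the tRNA that holds its C-terminus) and the names used are hypothetical, intended only to show that the activated linkage moves to the newest amino acid at each step.

```python
def elongate(peptidyl_trna, aminoacyl_trna):
    """Add one amino acid to the C-terminal end of a growing chain.

    peptidyl_trna:  (chain, trna) where trna holds the chain's activated C-terminus
    aminoacyl_trna: (amino_acid, trna) for the incoming charged tRNA
    Returns the new peptidyl-tRNA and the released (uncharged) tRNA.
    """
    chain, old_trna = peptidyl_trna
    amino_acid, new_trna = aminoacyl_trna
    # The old high-energy linkage is broken as the peptide bond forms; the chain
    # is now held by the tRNA of the newly added amino acid, so the C-terminus
    # stays activated for the next addition.
    return (chain + [amino_acid], new_trna), old_trna


# The chain is listed from N-terminus (first) to C-terminus (last).
state = (["Met"], "tRNA-Met")
for incoming in [("Trp", "tRNA-Trp"), ("Cys", "tRNA-Cys"), ("Ala", "tRNA-Ala")]:
    state, released = elongate(state, incoming)
    print(state, "released:", released)
```

After each call, the tRNA attached to the chain is the one that arrived with the most recently added amino acid, which corresponds to the regeneration of the peptidyl-tRNA linkage described above.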

The RNA Message Is Decoded in Ribosomes

The synthesis of proteins is guided by information carried by mRNA molecules. To maintain the correct reading frame and to ensure accuracy (about 1 mistake every 10,000 amino acids), protein synthesis is performed by the ribosome, a complex catalytic machine made from more than 50 different proteins (the ribosomal proteins) and several RNA molecules, the ribosomal RNAs (rRNAs). A typical eukaryotic cell contains millions of ribosomes in its cytoplasm (Figure 6-64), and it takes approximately 1 minute to synthesize an average-sized
Figure 6-64 Ribosomes in the cytoplasm of a eukaryotic cell. This electron micrograph shows a thin section of a small region of cytoplasm. The ribosomes appear as black dots (red arrows). Some are free in the cytosol; others are attached to membranes of the endoplasmic reticulum. (Courtesy of Daniel S. Friend, by permission of E.L. Bearer.)