Introduction

The regulation of gene expression occurs both during transcription and post-transcriptionally by controlling the lifetime, subcellular localization and translation of the mRNA. In fact, for many genes, the levels of mRNA in a cell correlate poorly with protein abundance1,2, consistent with the fact that the efficiency of translation is highly variable. Most translational gene control occurs during translation initiation, the process by which the ribosome positions itself with a methionyl tRNA in its P site over the start codon of mRNA.

Given the generally far more sophisticated gene regulation in eukaryotes compared with prokaryotes, it is not surprising that initiation in eukaryotes is much more complex than it is in bacteria. For instance, whereas bacterial initiation involves only three initiation factors, there are about ten eukaryotic translation initiation factors (eIFs), many of which are large multi-subunit complexes. Another fundamental difference is that in bacteria, transcription and translation are spatially and temporally coupled3,4,5,6, so gene expression is regulated mainly at the transcriptional level. Moreover, in bacteria, the ribosome is positioned at or close to the start codon, usually via a direct interaction between the Shine–Dalgarno sequence upstream of the start codon on mRNA and a complementary anti-Shine–Dalgarno sequence in the 16S ribosomal RNA.

In eukaryotes, there is no Shine–Dalgarno sequence on the mRNA to directly guide the small ribosome subunit to the vicinity of the start codon. Instead, the mRNA typically has a capped 5′ end consisting of a 7-methylguanosine (m7G) attached via a 5′–5′ triphosphate linkage to the next nucleotide, followed by a 5′ untranslated region (UTR), a coding sequence and a 3′ UTR that ends in a poly(A) sequence. Initiation starts with the formation of the 43S pre-initiation complex (43S) consisting of eIF1, eIF1A, eIF3 and eIF5, and of a ternary complex comprising eIF2, guanosine 5′-triphosphate (GTP) and methionine initiator transfer RNA (Met-tRNAiMet) bound to the 40S small ribosomal subunit (reviewed in refs. 7,8,9,10) (Fig. 1). In parallel, the complex eIF4F, which consists of the cap-binding protein eIF4E, the RNA helicase eIF4A and the large scaffolding protein eIF4G, binds to the 5′ end of mRNA and recruits 43S to form a 48S initiation complex (48S)11,12,13,14,15,16,17,18,19,20. The 48S then scans along the 5′ UTR until it encounters a start codon12,15,16,19,21,22,23,24,25,26,27.

Fig. 1: The main steps of translation initiation in eukaryotes.

Translation initiation starts with the binding of eukaryotic translation initiation factor 1 (eIF1), eIF1A and eIF3 to the 40S small ribosomal subunit (top left and clockwise). eIF5 and a ternary complex of eIF2, guanosine 5′-triphosphate (GTP) and the methionine initiator tRNA (Met-tRNAiMet) bind to this complex to form a 43S translation pre-initiation complex (43S). Once assembled, the 43S is recruited to the mRNA to form the scanning-competent 48S complex (48S open). The activation of mRNA by the cap-binding complex eIF4F is crucial for recruitment. eIF4F and polyadenylate-binding protein (PABP) bind to the mRNA’s 5′ end and poly(A) tail, respectively, selecting the appropriate mRNA for the recruitment. During the scanning of the 5′ untranslated region (UTR) of mRNA, eIF5 interacts with eIF2 and accelerates the hydrolysis of eIF2-bound GTP (not shown). Start-codon selection triggers the release of eIF1 and inorganic phosphate (Pi) from the complex. The amino-terminal domain (NTD) of eIF5 occupies the position vacated by eIF1 near the P site of the ribosome. eIF2–GDP has a lower affinity for Met-tRNAiMet; therefore, the release of Pi triggers the release of eIF2–GDP and eIF5, as well as eIF3 and eIF4 factors. The release of eIF2–GDP allows the binding of eIF5B, which promotes joining of the 60S large ribosome subunit and formation of the 80S initiation complex (80S IC). Formation of the 80S IC triggers the hydrolysis of eIF5B-bound GTP and the release of eIF1A. Following the release of eIF1A, eIF5B undergoes a conformational change that places the aminoacylated end of the Met-tRNAiMet in the peptidyl transfer centre of the ribosome (not shown). The release of eIF5B marks the end of translation initiation and the beginning of elongation (80S EC). During elongation (not shown), eukaryotic elongation factor 1A (eEF1A)–GTP delivers the aminoacylated tRNA into the A site of the ribosome.
Following the release of eEF1A–GDP and formation of the peptide bond, eEF2–GTP promotes the translocation of the tRNA from the A site to the P and E sites of the ribosome. The release of eEF2–GDP and the deacylated tRNA from the E site allows a new cycle of elongation. Translation termination by eukaryotic release factors (eRFs) occurs when a stop codon (for example, UAA) is reached and recognized by eRFs. ATP-binding cassette sub-family E member 1 (ABCE1) binds to the 80S termination complex and stimulates peptidyl-tRNA hydrolysis by eRF1. In addition, ABCE1 is crucial for recycling by splitting the 80S into the 40S and 60S ribosomal subunits. Following the 80S splitting, the mRNA and tRNA are removed from the 40S by recycling factors (not shown), which allows the 40S to become available for a new round of translation. CTD, carboxy-terminal domain; m7G, 7-methylguanosine.

During scanning, another molecule of eIF4A, together with an auxiliary factor eIF4B or its homologue eIF4H, is involved in unwinding mRNA secondary structures downstream of the 48S (refs. 19,28,29,30). For the scanning of mRNAs with long and highly structured 5′ UTRs, other helicases, such as mammalian DHX29 and DDX3 or yeast Ded1, are often needed21,31,32,33. The scanning process culminates in the recognition of the start codon by the 48S complex, which triggers the release of most eIFs. The subsequent binding of eIF5B promotes the joining of the 60S large subunit resulting in the formation of the 80S initiation complex (80S IC)34,35,36,37,38, the last major step of initiation (Fig. 1). Although the broad outline of the process as well as the role of the various eIFs have been characterized over the past several decades, mechanistic details of even the basic steps in the process, such as how the mRNA is inserted into the small ribosomal subunit, how scanning ensues and the changes that occur upon start-codon recognition, are much less understood. In the past decade, genetic, biochemical and structural studies in yeast and mammals have started to shed light on these steps11,12,15,16,19,25,26,27,32,39,40,41,42,43,44,45,46,47,48,49.

In addition to the so-called canonical initiation described above, there are various non-canonical pathways of initiation that employ distinct initiation complex components, and some may even bypass the requirement of a 5′ cap. Such non-canonical initiation is used for translation of specific mRNAs under particular circumstances, including many viral mRNAs. There is increasing evidence for the role of non-canonical initiation in both health and disease, such as the repeat-associated non-AUG (RAN) initiation50, N6-methyladenosine (m6A)-dependent initiation40,51,52 and initiation on circular (m)RNAs (circRNAs)53,54,55. Non-structural protein 1 (nsp1) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)56,57,58,59 or programmed cell death 4 (PDCD4)60,61,62,63 in tumour cells are examples of how proteins can regulate translation in human health and disease.

In this Review, we discuss the molecular mechanism of translation initiation in eukaryotes, from the assembly of the eIF4F complex to the joining of the 60S subunit at the start codon. We also discuss some non-canonical mechanisms of translation initiation and the implication of translation regulation in human disease.

Assembly and regulation of the 43S pre-initiation complex

As mentioned above, initiation of translation starts with the assembly of the 43S. The main role of the 43S is to prepare the 40S complex to be recruited to the mRNA and scan along the 5′ UTR. The 40S needs to be in an ‘open’ conformation to allow the accommodation of the ternary complex in the P site, and to allow the binding of the mRNA. The head of 40S alone adopts a closed conformation that is stabilized by interactions of ribosomal RNA with its head and body. During the formation of the 43S complex, the position of the body of the 40S subunit does not change, but the head undergoes upwards and swivel movements that stabilize an open conformation12,16. The presence of eIF1 and eIF1A induces and stabilizes the open conformation, which allows the correct binding of the rest of the factors16,64,65. eIF1 and eIF1A are highly conserved proteins that bind to the 40S small ribosomal subunit near the decoding centre and participate in the accurate recognition of the start codon12,15,66,67,68,69,70,71,72,73,74,75,76,77,78.

eIF3 is one of the most complex initiation factors and is implicated in many steps of translation initiation. In mammals, eIF3 consists of 13 different subunits (eIF3a–eIF3m), whereas in Saccharomyces cerevisiae there are six, of which two are core (eIF3a and eIF3c) and four are non-core (eIF3b, eIF3g, eIF3i and the loosely associated eIF3j) subunits. The large size of eIF3 allows it to encircle the entire 40S, reaching both the mRNA channel entry and exit sites. The subunits eIF3a, eIF3c, eIF3e, eIF3f, eIF3h, eIF3k, eIF3l and eIF3m form an octameric structural core that binds to the 40S complex near the mRNA channel exit site and is crucial for the recruitment of the ribosome to the mRNA. The remaining subunits (eIF3b, eIF3g and eIF3i) form a subcomplex that binds to 40S near the mRNA entry site12,16,19,25,66,79,80,81,82,83. The heart of the octameric structural core consists of eIF3a and eIF3c, which have a role in 43S recruitment to the mRNA and in start-site selection. The location of eIF3a allows its RNA-binding motif to bind mRNA near its exit site in the 40S subunit and to facilitate the recruitment of 43S to mRNA12,16,19,25,82,83,84,85. Furthermore, the carboxy-terminal domain (CTD) of eIF3a extends towards the mRNA channel entry site, where it interacts with the eIF3b–g–i module, which is also implicated in mRNA recruitment and scanning12,19,79,80,84,86. Meanwhile, the eIF3c amino-terminal domain (NTD) extends from the mRNA channel exit site towards the decoding centre of the ribosome, where it interacts with eIF1 and eIF5, and facilitates recognition of the start codon12,16,66,81,87.

Other subunits of eIF3 are also directly involved in the assembly of the 43S complex. For instance, eIF3d binds the head of the ribosome, near the mRNA channel exit site12,25,66,79,81. However, its N-terminal tail (NTT) is involved in an extensive interaction network, including direct interaction with factors involved in the loading of mRNA into the ribosome12,66. Furthermore, eIF3d has cap-binding activity and can recruit 43S to specific mRNAs independently of the cap-binding protein eIF4E (refs. 45,88).

A third role of eIF3 is to stabilize the binding of the ternary complex eIF2–GTP–Met-tRNAiMet. eIF2 — itself a complex of three subunits (eIF2α, eIF2β and eIF2γ) — is a GTPase that delivers Met-tRNAiMet to the P site of the ribosome9,89.

Finally, eIF3 also bridges ribosome recycling to a new round of initiation. Following translation termination, the 80S is dissociated into 40S and 60S subunits. The translation initiation factor ATP-binding cassette sub-family E member 1 (ABCE1; Rli1 in S. cerevisiae) catalyses subunit splitting and ribosome recycling. Although ATP hydrolysis by ABCE1 is not required for 80S splitting, it is required for ABCE1 release from the 40S ribosomal subunit90. Recent studies have identified a role for eIF3j in ribosome recycling91. eIF3j binds near the A site of the ribosome12,81,92,93 and its NTT extends towards the GTPase binding site of the ribosome, where ABCE1 binds12,81. eIF3j and ABCE1 interactions promote ribosome recycling81,91, which frees the 40S small ribosomal subunit to start a new round of initiation (Fig. 1), and could be involved in re-initiation beyond an upstream open reading frame (uORF).

The assembly of the cap-binding complex eIF4F

As stated above, eukaryotic mRNAs typically have a coding sequence flanked by extensive 5′ and 3′ UTRs. The main function of eIF4F is to bind the 5′ end of mRNA and recruit the 43S complex to it. eIF4F is a hetero-trimeric complex of eIF4E, eIF4G and eIF4A. Of these, eIF4E binds directly to the cap at the 5′ end of mRNA, and forms a complex with eIF4A and eIF4G (Fig. 2a). The large scaffold protein eIF4G ranges in size from 107 kDa to 184 kDa (952–1,666 amino acids) in different organisms and has interaction sites for eIF4E, eIF4A, eIF3 and mRNA94,95 (Fig. 2b). Whereas most eukaryotic eIF4G proteins have two eIF4A binding sites, S. cerevisiae eIF4G has only one. S. cerevisiae eIF4G also lacks the eIF3 interaction site, but binding sites for eIF1 and eIF5 have been proposed (Fig. 2b). Interaction of eIF1 with the main mammalian paralogue of eIF4G has also been proposed96.

Fig. 2: The eIF4F complex.

a, Recruitment of eukaryotic translation initiation factor 4F (eIF4F) to mRNA through eIF4E–7-methylguanosine (m7G) cap and eIF4G–polyadenylate-binding protein (PABP)–poly(A) interactions. b, Members of the eIF4G protein family have regions implicated in binding to other translation initiation factors. Proposed RNA-binding domains (RNA) in eIF4G1 are also marked. c, Current structural information about interactions between eIF4F components and of eIF4G with PABP in complex with the poly(A): eIF4G1–PABP interaction [PDB: 4F02]245, eIF4G1–eIF4E interaction [PDB: 5T46]246, eIF4A–eIF4G1 interactions (which correspond to those in the full 48S complex) [PDB: 6ZMW]12 and the yeast eIF4G–eIF4A complex [PDB: 2VSO]247. MNK1, MAPK signal-interacting kinase 1; ORF, open reading frame; UTR, untranslated region.

eIF4G also binds polyadenylate-binding protein (PABP), which interacts with the poly(A) tail at the 3′ end of mRNA (Fig. 2). PABP is important for efficient translation and enhances the affinity of the eIF4F complex for the cap97, so it can be considered a translation initiation factor. The interaction of eIF4G with PABP bound to the poly(A) tail would bring the 5′ and 3′ ends of mRNA close together, resulting in a topological circularization of mRNA with a closed loop formed by eIF4F–mRNA–PABP98,99,100 (Fig. 2a). In addition to eIF4G–PABP interactions, other interactions stabilize the closed-loop conformation94,99,100. A closed loop has been proposed to be important for efficient initiation and for mRNA stability101,102. One possibility is that bringing the 5′ and 3′ ends of mRNA close together facilitates translation re-initiation on the mRNA by recycled 40S subunits.

Metazoan cells encode several eIF4E, eIF4A and eIF4G paralogues. In mammals, at least three eIF4G paralogues have been identified: eIF4G1, eIF4G2 (also known as p97, death-associated protein 5 (DAP5) or NAT1 (ref. 103)) and eIF4G3 (also referred to in the literature as eIF4GII) (Fig. 2b). All of them exist as multiple isoforms due to alternative splicing. In mammals, eIF4G1 is the main paralogue involved in canonical initiation (referred to as eIF4G for simplicity). By contrast, although eIF4G3 also has sites of interaction with other eIFs, suggesting it is likely involved in initiation, its function is not well established. eIF4G2 is the least conserved paralogue: it lacks the PABP and eIF4E interaction sites, but retains those for eIF3 and eIF4A. Because eIF4G2 lacks the eIF4E interaction site, it is thought to be involved in non-canonical, eIF4E-independent initiation46,104. This factor is further discussed in ‘Non-canonical mechanisms of translation initiation’ below.

The DEAD-box RNA helicase eIF4A is present in considerable excess over the other initiation factors in the cell105 and only a small fraction of it is part of the eIF4F complex105,106. Because eIF4G has two eIF4A interaction sites, one possibility is that initiation may require more than one molecule of eIF4A. Alternatively, eIF4A may have eIF4F-independent activity through a direct interaction with mRNA. Consistent with this idea, a recent study identified the interaction of multiple molecules of eIF4A with mRNA26. However, on its own, eIF4A is a non-processive helicase with a very slow rate of ATP hydrolysis28. Thus, although eIF4A has been implicated in other cellular processes106, its best characterized role is in initiation as part of the 48S complex, possibly aided by other factors such as eIF4B and eIF4H (refs. 14,15,17,19,26,28,107,108,109). In mammals and plants, eIF4A has three paralogues: eIF4A1, eIF4A2 and eIF4A3. Of these, eIF4A1 is the main factor involved in translation initiation (referred to as eIF4A for simplicity). eIF4A1 and eIF4A2 are both cytoplasmic and have 90% sequence identity, whereas eIF4A3 is mainly nuclear and shares only 60% sequence identity with the other two paralogues110.

Given the importance of the eIF4F complex in the recruitment of the 43S complex to mRNA, it is not surprising that its assembly is a highly regulated process. Several mechanisms regulate the assembly of eIF4F, including the mammalian target of rapamycin (mTOR) pathway111,112 (Box 1) and MAPK signal-interacting kinase 1 (MNK1; also known as MKNK1), which binds to eIF4G and phosphorylates eIF4E. Based on these findings, eIF4F has become a therapeutic target for blocking translation in neoplasia113,114.

Despite the importance of eIF4F in the initiation process, little is known about its structure. About 70% of eIF4G is predicted to be intrinsically disordered, and there is little structural information about its interactions with the rest of eIF4F. To date, only a small portion of eIF4F has been structurally determined in the context of the 48S complex12,19 (Fig. 2c).

Box 1 Regulation of translation initiation by controlling the formation of eIF4F

The binding of eukaryotic translation initiation factor 4E (eIF4E) to eIF4G can be disrupted by eIF4E-binding proteins (4E-BPs). These proteins inhibit canonical translation by binding to eIF4E and preventing its association with eIF4G and the formation of the eIF4F complex. In favourable growth conditions, which require efficient (canonical) translation, 4E-BPs are phosphorylated (P) by the kinase mammalian target of rapamycin (mTOR) and are unable to bind eIF4E, thereby allowing it to bind both the mRNA 5′ cap (7-methylguanosine (m7G)) and eIF4G and to promote canonical translation initiation248 (see the figure, left panel). In conditions of starvation or other stresses, mTOR activity is suppressed and 4E-BPs can bind to eIF4E, which prevents the formation of the eIF4F complex and inhibits canonical translation, thereby enabling the non-canonical translation of specific mRNAs249 (see the figure, middle panel). One example of mRNAs proposed to be regulated by mTOR activity is the class of terminal oligopyrimidine motif (TOP) mRNAs, some of which encode ribosomal proteins and eukaryotic translation factors250,251.

Cap binding by eIF4E is also regulated by 4E homologous protein (4EHP) (see the figure, right panel). This protein, also known as eIF4E2 or 4E-LP (4E-like protein), is homologous to eIF4E but its interaction with the cap is about 100-fold weaker. 4EHP is able to bind the cap but not eIF4G1, so it is considered a translation repressor that sequesters capped mRNAs252. 4EHP is regulated by many proteins and has been identified in microRNA-mediated gene silencing, ribosomal quality control and translation initiation-regulating machineries. In initiation, it is involved in eIF4G-independent translation of specific mRNAs through its interaction with the cap and different proteins252. PABP, polyadenylate-binding protein.
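The mTOR and 4E-BP switch described in this box can be summarized as a small decision rule. The following Python sketch is only an illustrative toy, not a quantitative model: the function name `eif4f_switch` and its boolean input are hypothetical, 4EHP is ignored, and real cells show graded rather than all-or-none responses.

```python
def eif4f_switch(mtor_active: bool) -> dict:
    """Toy logic for the mTOR/4E-BP switch (hypothetical helper, assumptions as stated)."""
    # Active mTOR phosphorylates 4E-BPs.
    four_ebp_phosphorylated = mtor_active
    # Only unphosphorylated 4E-BPs can bind eIF4E.
    four_ebp_binds_eif4e = not four_ebp_phosphorylated
    # eIF4E held by a 4E-BP cannot associate with eIF4G, so eIF4F does not form.
    eif4f_forms = not four_ebp_binds_eif4e
    return {
        "4E-BP phosphorylated": four_ebp_phosphorylated,
        "4E-BP bound to eIF4E": four_ebp_binds_eif4e,
        "eIF4F forms": eif4f_forms,
        "canonical initiation favoured": eif4f_forms,
    }


# Favourable growth conditions versus starvation or stress.
for condition, mtor_active in [("growth", True), ("starvation", False)]:
    print(condition, eif4f_switch(mtor_active))
```

The intermediate variables are kept separate so that each line maps onto one statement in the box text.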

Recruitment of the 43S complex to mRNA

The precise molecular mechanism by which eIF4F promotes the recruitment of the 43S complex to the mRNA remains unclear. In metazoans, the attachment of 43S to the mRNA–eIF4F complex is mediated by a direct interaction between eIF3 and eIF4G (refs. 12,15,17,19,20,115,116). Moreover, a recent study revealed an interaction between eIF4F and the ribosomal protein eS7 (ref. 19), which is located near the mRNA channel exit site. Nevertheless, a key question remains: how the mRNA is inserted into its channel in the 40S subunit (of 43S) prior to scanning.

The two alternatives proposed for recruitment are the slotting model and the threading model (Fig. 3). In the slotting model, the 5′ UTR of mRNA is laterally inserted into the cleft in the 40S subunit that forms the mRNA channel. In this situation, the eIF4F complex at the 5′ end of the mRNA is expected to be positioned upstream of the 43S complex. In the threading model, the eIF4F complex binds to 43S at the mRNA entry site, downstream of the 40S ribosome subunit; once 43S is attached, the cap-binding protein eIF4E would have to dissociate from the 5′ cap to allow the mRNA to thread into 40S, presumably driven by the helicase activity of eIF4A (Fig. 3). A prediction of the threading model is that the 40S subunit can scan the 5′ UTR from the very first nucleotide downstream of the cap, so that there should be no restriction on how close the start codon can be to the 5′ end.

Fig. 3: Two alternative models of eIF4F-dependent 43S recruitment to mRNA.

Two main mechanisms of eukaryotic translation initiation factor 4F (eIF4F)-dependent recruitment have been proposed: the slotting model (left) and the threading model (right). Both mechanisms would require ATP hydrolysis for recruitment. In the slotting model, eIF4F binds upstream of the ribosome, near the mRNA channel exit site in the 40S complex, and the mRNA is laterally slotted into the channel. A second molecule of eIF4A (in addition to the one in eIF4F) and its cofactor eIF4B bind on the opposite side of the 43S pre-initiation complex (43S) at the mRNA entry site to increase recruitment efficiency. Thus, eIF4A–eIF4B has an eIF4F-independent role during recruitment and scanning. The slotting model requires a minimum distance between the 5′ end of mRNA and the start codon, which needs to be positioned at the P site of the ribosome, and is compatible with the length of typical 5′ untranslated regions (UTRs). However, mRNAs with short 5′ UTRs would have a start codon too close to the 5′ end to reach the P site. Thus, these mRNAs would require backtrack scanning or an alternative mechanism for recruitment. In the threading model, eIF4F binds near the mRNA entry site, followed by the release of eIF4E from the 5′ cap (not shown), thereby allowing threading of the mRNA through the mRNA channel and its scanning from the first nucleotide downstream of the cap. This model would permit translation of any mRNA regardless of the length of the 5′ UTR. However, it is not compatible with recent structures of the 48S complex. m7G, 7-methylguanosine; PABP, polyadenylate-binding protein; Met-tRNAiMet, methionine initiator transfer RNA.

The threading model is supported by studies reporting efficient initiation on short-leader (5′ UTR) mRNAs15,117. Analysis of the translation start site by toeprinting indicated that during eIF4F-dependent initiation, the 48S is able to initiate on an AUG start codon located as close as two nucleotides from the 5′ cap15. The data strongly suggest there is no minimum distance of the start codon from the 5′ end of mRNA, which is more compatible with the threading model of recruitment. However, efficient initiation on a cap-proximal AUG start codon was observed in the absence of eIF1 (ref. 15), which is a key factor for ensuring the fidelity of the start-codon selection18,69,72,73,75,118,119,120, suggesting that the observations may not be representative of canonical initiation. Furthermore, the threading model would be incompatible with various cases of non-canonical initiation, such as on circRNA53,54,55,121,122,123,124, on internal ribosome entry site (IRES)-containing mRNA124,125 or in eIF3d-dependent initiation45, given that eIF3d binds to the 43S near the mRNA channel exit site. Indeed, even during canonical initiation, eukaryotic mRNAs are thought to form a closed loop through the eIF4G–PABP interaction99,100. The closed loop is important for enhancing efficient assembly of the initiation complex40,97,98,99,126. It is, of course, formally possible that a closed loop is dynamic and does not preclude the threading model. Use of mammalian cell extracts lacking PABP, or with a PABP mutant that is unable to bind to eIF4G, dramatically reduces the assembly of the 48S complex97, which again would be difficult to reconcile with the threading model.

A crucial requirement of the threading model is the release of the cap-binding protein eIF4E from the 5′ cap to allow the mRNA to thread into and through the mRNA channel of 40S. There is some evidence suggesting that eIF4E remains attached to the 48S, at least during part of the initiation process11,12,42. One study has shown that in most human cells, eIF4E remains bound to the ribosome with a decay half-length of ~12 codons11. However, it was unclear whether eIF4E remained attached to the 5′ cap or its association with the 48S occurred through its interaction with eIF4G. A recent study found that tethering the 5′ cap to eIF4E by ultraviolet radiation or chemical cross-linking does not appear to abrogate translation, unless the start codon is located very close to the 5′ end42, arguing against the threading model.

Recently, the structure of a scanning human 48S complex on an mRNA lacking a start codon directly revealed the location of eIF4F at the exit (5′) side of the mRNA channel in the 40S complex12 and strongly suggested that during recruitment the mRNA is slotted into the mRNA channel. The structure was consistent with ribosome profiling studies indicating that a scanning 48S complex has extended footprints upstream but not downstream of the 40S small ribosomal subunit11,39. To reconcile these data with the threading model, following recruitment the eIF4F complex would have to relocate from the mRNA entry site to the opposite side of 40S, a possibility that has been previously suggested89. However, there is no evidence that such an extensive relocation occurs.

The position of the eIF4F complex in the structure of a human 48S supports the slotting model and suggests that, when positioned at the P site, the start codon would have to be at least about 40 nucleotides from the 5′ end of the mRNA12. This minimum distance was confirmed by biochemical data accompanying the structure12 and is compatible with the typical 5′ UTR of mRNA in eukaryotes, whose median lengths range from 53 nucleotides in budding yeast to 218 nucleotides in humans127.

The threading model would be difficult to reconcile with the location of eIF4F on the 5′ (exit) side of the 43S complex, and would also be hard to reconcile with translation on circRNAs. However, it is also clear that a slotting model that has eIF4F bound on the 5′ side of 43S and its requirement of a minimum distance of 40 nucleotides between the 5′ end and the start codon would not be compatible with translation of mRNAs with very short 5′ UTRs. Thus, it is likely that initiation on these mRNAs uses alternative recruitment pathways. These could include translation initiator of short 5′ UTR (TISU) elements, which are present in about 4% of protein-encoding genes128,129, or eIF4F-independent mechanisms proposed for leaderless mRNAs130,131. Initiation on mRNAs with short 5′ UTRs clearly needs further study.
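The length argument above can be made concrete with a short sketch. The Python example below is an illustration under stated assumptions: the threshold is the approximate 40-nucleotide minimum quoted in the text, the 5′ UTR length is used as a stand-in for the distance between the 5′ end and the start codon, and the function name and category labels are hypothetical.

```python
# Assumption: approximate minimum cap-to-start-codon distance implied by the
# position of eIF4F in the human 48S structure, as quoted in the text.
MIN_DISTANCE_NT = 40


def recruitment_class(utr5_length: int, min_distance: int = MIN_DISTANCE_NT) -> str:
    """Label a transcript by whether its 5' UTR meets the slotting-model minimum.

    The labels are illustrative only and do not assign an actual pathway.
    """
    if utr5_length < 0:
        raise ValueError("5' UTR length cannot be negative")
    if utr5_length == 0:
        return "leaderless"
    if utr5_length < min_distance:
        return "short 5' UTR"
    return "compatible with slotting"


# Median 5' UTR lengths quoted in the text, plus two hypothetical transcripts.
examples = {
    "budding yeast median": 53,
    "human median": 218,
    "hypothetical short leader": 12,
    "hypothetical leaderless": 0,
}
for name, length in examples.items():
    print(f"{name}: {length} nt -> {recruitment_class(length)}")
```

Transcripts in the first two categories are the ones for which the text proposes alternative routes such as TISU elements or eIF4F-independent mechanisms; the sketch only flags them and does not predict which route is used.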

A recent structure of a human 48S complex revealed an interaction between eIF4F and the 43S complex through eIF3 subunits eIF3e, eIF3k and eIF3l (ref. 12). This finding was surprising because these subunits are not present in S. cerevisiae eIF3 and because eIF3k and eIF3l are dispensable in Neurospora crassa and Caenorhabditis elegans132,133, raising the question of how eIF4F interacts with the 43S complex in those species. This question was solved by a recent higher-resolution structure of a human 48S positioned at the start codon that revealed new interactions between eIF4F and 43S, including of eIF4F with eIF3c and ribosomal protein eS7 (ref. 19), both of which are universally conserved in eukaryotes. Together, these structures agree with early biochemical data showing that eIF4G is in close proximity with eIF3e, eIF3c, eIF3d and eIF3k (refs. 20,115,116), all located upstream of the ribosome. These data also agree with proximity-labelling (BioID) data indicating that eIF4E and eIF3l are in close proximity in human cells134.

Evolutionary divergence in recruitment also poses a puzzle. In metazoans, several mechanisms of recruitment can coexist in the same cell, which depend variously on eIF4F, eIF3d, IRES, m6A modification of mRNA or elements such as TISU. The mRNA 5′ UTR135 and the availability of eIFs are the determinant factors that control recruitment. Thus, despite the fact that there is a conserved core of initiation factors, some organisms such as S. cerevisiae must differ in the details of the recruitment mechanism. Unlike many other eukaryotes, S. cerevisiae lacks the eIF3-binding domain of eIF4G (Fig. 2b) and all subunits of eIF3 (eIF3d, eIF3e, eIF3h, eIF3k and eIF3l) involved in the interaction with eIF4F (ref. 10) except eIF3c (refs. 19,20). It has been proposed that S. cerevisiae compensates for lack of the eIF4G–eIF3 interaction through interactions between the eIF4G HEAT domain (the core of eIF4F) and the eIF5 CTD136,137, which binds to the ternary complex located near the decoding centre66,138. The precise molecular mechanism underlying 43S recruitment in S. cerevisiae remains unclear.

Scanning of the 5′ UTR

Once recruited to the mRNA, the 48S complex moves along the 5′ UTR until it locates a start codon. This scanning process requires ATP hydrolysis22,48, which presumably facilitates unwinding of any mRNA secondary structure that the complex encounters19. However, the precise molecular mechanism underlying the regulation of scanning by eIF4F remains unclear.

The location of eIF4F upstream (5′) of the 43S complex raised a question of how its helicase component, eIF4A, would facilitate scanning along the mRNA. Moreover, yeast 43S stimulates the ATPase activity of eIF4A without requiring eIF4G or eIF4E, but instead requiring eIF3i and eIF3g (ref. 48). This finding raised the question of how these eIF3 subunits, which are located on the opposite side of the 43S from eIF4F, could affect the activity of eIF4A.

This puzzle was clarified by a more recent structure of the 48S complex, which unexpectedly showed that in addition to the eIF4A that is part of eIF4F on the 5′ side of the 43S complex, a second eIF4A molecule is on the opposite side of 43S, at the mRNA entry site19 (Fig. 4). The second location suggested that this eIF4A molecule functions independently of eIF4F.

Fig. 4: Scanning and selection of a start codon.

Following its recruitment, the 48S complex scans along the 5′ untranslated region (UTR) until it encounters a start codon. Eukaryotic translation initiation factor 4F (eIF4F) binds upstream of the 40S subunit, where it prevents reverse movement of the subunit and ensures the 5′ to 3′ directionality of scanning. In addition to the eIF4A molecule that is part of the eIF4F complex, a second molecule of eIF4A binds at the mRNA entry site, where it interacts with its cofactor eIF4B. eIF4A has low intrinsic helicase activity, but association with eIF4B and eIF3 increases it and allows unwinding of mRNA secondary structures before they enter the ribosome. It is unclear whether the 5′ cap is released during scanning or remains tethered to the 48S complex. If the cap remains tethered during scanning, only one 48S complex can scan the 5′ UTR at a time, whereas if the cap is released during scanning, multiple 48S complexes can simultaneously scan the same mRNA. eIF5 accelerates guanosine 5′-triphosphate (GTP) hydrolysis by eIF2. However, inorganic phosphate (Pi) release is prevented by eIF1 until the start codon is encountered. Start-codon recognition induces conformational rearrangement of the 48S complex and triggers the release of eIF1, followed by the release of Pi. The amino-terminal domain (NTD) of eIF5 occupies the place vacated by eIF1 near the decoding centre of the ribosome. Upon the selection of a start codon, the methionine initiator transfer RNA (Met-tRNAiMet) is fully inserted into the P site of the ribosome and interactions with eIF1A further stabilize the codon–anticodon interaction (not shown). m7G, 7-methylguanosine; PABP, polyadenylate-binding protein.

The new structure places eIF4B adjacent to eIF4A at the mRNA entry site, in close proximity to the eIF3b–g–i module and ribosomal proteins uS3 and uS10 (ref. 19). The structure is consistent with an early yeast two-hybrid analysis that identified the ribosomal protein uS10, which is located at the mRNA entry site, as the main interaction partner of eIF4B (ref. 139). Cross-linking mass spectrometry analysis also identified ribosomal proteins uS3 and eS28 — located near uS10 — in close proximity to eIF4B (ref. 140).

The presence of a second molecule of eIF4A and the adjacent location of eIF4B are not only consistent with previously puzzling biochemical data48,141 but also show how the helicase activity of eIF4A could unwind secondary structures downstream in the mRNA, aided by eIF4B. It is possible that eIF4F could function as a Brownian ratchet142,143, using the energy from ATP hydrolysis by eIF4A to move from 5′ to 3′ and act as a pawl to prevent any backward movement (3′ to 5′) of the 43S complex12,144. Thus, both eIF4F and the separate, second eIF4A may work in concert to ensure the directionality and efficiency of scanning (Fig. 4).

The scanning of mRNAs with long and highly structured 5′ UTRs requires additional RNA helicases such as mammalian DHX29 and DDX3 and yeast Ded1 (refs. 21,31,32,33). Cryogenic electron microscopy (cryo-EM) structures of DHX29 show that it binds near the mRNA entry site in the 40S complex79,145,146, where it would overlap with the binding site of the mRNA entry-site eIF4A (ref. 19). Thus, it is possible that the additional helicases are used instead of an entry-site eIF4A for the translation of mRNAs with highly structured 5′ UTRs, thereby offering another level of translation control. However, there is evidence suggesting that Ded1 is an integral part of the yeast eIF4F complex147,148.

During scanning, the 48S complex moves along the mRNA from 5′ to 3′ until it encounters a start codon. However, this movement may not always be unidirectional and the 48S can also undergo occasional backward oscillations (3′ to 5′)26,42,142,149, which have been suggested to facilitate initiation on a near-cognate CUG codon, or on an AUG codon in a suboptimal context, by increasing the number of times 43S samples the codon26. The mechanism underlying such putative movements remains unclear, but free eIF4A and eIF4F are known to have bidirectional helicase activity28,150, although it is unclear whether they retain the bidirectional activity as part of 48S. A recent study suggested that ATP hydrolysis by eIF4A is required for a 3′ to 5′ oscillation of the 48S (ref. 42). However, it is unclear how this suggestion would be consistent with a putative role of eIF4F in preventing the backward (3′ to 5′) movement of 43S (ref. 144). An RNA secondary structure in the 5′ UTR has been suggested to trigger the backward movement26, consistent with the finding that stem–loop structures enhance initiation from near-cognate codons26,151,152. Furthermore, RNA secondary structures can also increase the dwell time of Met-tRNAiMet on a near-cognate codon and stimulate initiation on it. Consistent with this idea, DDX3 (Ded1) is required for scanning and translation of mRNAs with highly structured 5′ UTRs, and in its absence the stalled 48S initiates translation instead on a near-cognate codon upstream of a stem–loop, resulting in the translation of a uORF instead of the main open reading frame (ORF)153.
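To make the term concrete, the sketch below enumerates near-cognate codons, assuming the usual definition of a codon that differs from AUG at exactly one position (CUG being one such codon). The function names are hypothetical and the code is only an illustration of the definition, not a model of scanning.

```python
START = "AUG"
BASES = "ACGU"


def near_cognate_codons(start: str = START) -> list[str]:
    """Return every codon that differs from `start` at exactly one position."""
    codons = []
    for pos in range(3):
        for base in BASES:
            if base != start[pos]:
                codons.append(start[:pos] + base + start[pos + 1:])
    return codons


def classify_codon(codon: str) -> str:
    """Classify a codon as cognate, near-cognate or non-cognate relative to AUG."""
    codon = codon.upper().replace("T", "U")  # accept DNA-style input
    if len(codon) != 3:
        raise ValueError("a codon must have exactly three nucleotides")
    mismatches = sum(a != b for a, b in zip(codon, START))
    if mismatches == 0:
        return "cognate"
    return "near-cognate" if mismatches == 1 else "non-cognate"


if __name__ == "__main__":
    print(near_cognate_codons())
    print(classify_codon("CUG"))
```

Under this definition there are nine near-cognate codons; which of them actually support initiation, and how efficiently, is an experimental question that the sketch does not address.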

The movement of the 48S along the 5′ UTR is facilitated by the conformation of the 40S subunit and the initiator Met-tRNAiMet12,16. In the scanning 48S structure12, the tRNA has a conformation intermediate between the closed state, in which it is fully inserted into the P site of the ribosome with its anticodon making an interaction with the mRNA codon, and the open state, in which its anticodon is well separated from the mRNA. In this intermediate conformation, the Met-tRNAiMet is able to move along the mRNA while also able to transiently sample the codon. This transient sampling would allow it to recognize the start codon, as such recognition is dependent on base pairing between Met-tRNAiMet and mRNA.

A key question during the scanning of the 5′ UTR of mRNA is whether the 5′ cap is released from the 48S complex during scanning or is tethered throughout the process (Fig. 4). Selective 40S ribosome profiling data indicated that the cap is tethered during scanning in most human cells11. These data agree with an in vitro translation assay showing that tethering the cap to eIF4E does not prevent scanning and AUG selection unless the start codon is positioned near the 5′ cap42. Conversely, ribosome profiling data have identified extended footprints upstream of the 48S, which were interpreted as multiple 43S complexes simultaneously scanning the same 5′ UTR of mRNA39. Recent single-molecule fluorescence data also indicated that a second 43S could be recruited to an mRNA already loaded with a 48S (ref. 26). Together, these data suggest that eIF4E is likely to be released before or upon start-codon selection15,154,155,156. This possibility was also suggested by a recent cryo-EM structure of human 48S, indicating that eIF4G and eIF4A, but not eIF4E, are present in the eIF4F complex upon recognition of the start codon19. Thus, more research is needed to understand when the 5′ cap is released from the 48S complex.

Selection of the start codon

A crucial aspect of translation initiation is the selection of the start codon, normally AUG. The fidelity of this process is ensured by eIF1, eIF1A, eIF2 and eIF5 (refs. 68,71,72,73,74,75,76,78,119,157,158,159,160). On near-cognate and non-cognate codons encountered during scanning, a fully closed conformation of the Met-tRNAiMet is prevented by eIF1 (refs. 18,69,72,74,75,78,120). This mechanism facilitates the proper sampling of the codon in the P site, which allows the identification of the start site. Recognition of the start codon triggers a series of changes that stabilize a tRNA conformation that is fully inserted into the P site and base pairs with the codon16,19,25,27,69,70.

Scanning of the 5′ UTR triggers GTP hydrolysis by eIF2, in a process regulated by eIF5, which acts as a GTPase activator161,162. It is unclear exactly when GTP hydrolysis occurs — it is the release of Pi, not GTP hydrolysis itself, that occurs upon start-codon selection163. The additional interactions made by codon–anticodon base pairing stabilize the Met-tRNAiMet in a conformation fully inserted into the P site of the ribosome. In this conformation, the tRNA clashes with eIF1 and triggers its release from the 48S complex12,16,27,82,119,164. The release of eIF1 is accompanied by the release of Pi from eIF2 (ref. 163). Upon release of eIF1, the eIF5 NTD occupies the site vacated by eIF1 (refs. 19,72,83), signalling the end of the scanning process. Moreover, upon start-codon recognition, the head of the 40S complex tilts towards its body, thereby adopting a closed conformation compared with a scanning ribosome19,25,27,69,70,82,83,119, which further stabilizes the 48S complex with the start codon in the P site.

The codon–anticodon interaction is stabilized by eIF1A through its amino-terminal tail (NTT)69,165. eIF1A has a globular domain that binds near the A site of the ribosome69,70,77,165. However, a cryo-EM structure of a yeast 48S following AUG codon recognition revealed that the eIF1A NTT extends towards the P site, where it interacts with the mRNA and Met-tRNAiMet and enhances the fidelity of start-codon selection69. This interaction has also been described in mammals19,25,27, and is consistent with the observed rearrangement of eIF1A upon start-codon recognition27.

Recognition of the start codon is also context-dependent and is enhanced by a sequence flanking the AUG start codon, referred to as the Kozak sequence in mammals (5′-A/GNNAUGG-3′)25,166. Structures of 48S complexes in the act of recognizing a start codon suggest how a Kozak sequence enhances initiation at the start codon by making additional interactions with nearby elements of the initiation factors eIF1A and eIF2α, of 18S ribosomal RNA and of ribosomal protein uS7 (refs. 25,164).
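The sequence context of a candidate AUG can be checked with a few lines of code. The following is a minimal sketch, assuming the common reading of the Kozak consensus as a purine (A or G) at position −3 and a G at position +4, with the A of the AUG numbered +1. The function name and the three-level labels ("strong", "intermediate", "weak") are illustrative assumptions, not a scoring scheme taken from the cited work.

```python
def kozak_context(mrna: str, aug_index: int) -> str:
    """Label the context of the AUG whose A sits at 0-based index `aug_index`.

    Only positions -3 and +4 are inspected; the labels are illustrative.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")

    # Position -3 is three nucleotides upstream of the A (+1);
    # position +4 is the nucleotide immediately after the AUG.
    minus3 = seq[aug_index - 3] if aug_index >= 3 else ""
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else ""

    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"

    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "intermediate"
    return "weak"


if __name__ == "__main__":
    print(kozak_context("GCCACCAUGGCG", 6))
```

A classifier of this kind captures only the two flanking positions named in the consensus; it says nothing about secondary structure or the other determinants of start-codon choice discussed in this section.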

Joining of the large ribosomal subunit and formation of the 80S initiation complex

The formation of the 80S IC marks the last major step of translation initiation. Joining of the 60S subunit to the initiation complex requires the release of some eIFs and reorientation of the Met-tRNAiMet to prevent steric clashes. This joining is mediated by the GTPase eIF5B (ref. 36) and by eIF1A (refs. 35,167,168,169). Recent structures of mammalian and archaeal initiation complexes with eIF5B elucidated this process35,168,170. Following the release of eIF2–GDP, eIF5B binds to the 48S complex and interacts with eIF1A to undergo remodelling and reorient the Met-tRNAiMet into a position compatible with 60S joining35,168,170. 60S joining triggers the hydrolysis of GTP by eIF5B (ref. 171), which accelerates the release of eIF1A from the complex35,169,172. Upon the release of eIF1A, eIF5B adopts a different conformation that places the tRNA into its final position to initiate the elongation process34,35,37,38,173. GTP hydrolysis by eIF5B reduces its affinity for the ribosome and leads to its release35,37,171,174. The departure of eIF5B marks the end of translation initiation, leaving the 80S ribosome with a Met-tRNAiMet in its P site. The neighbouring empty A site can bind a ternary complex comprising eukaryotic elongation factor 1A (eEF1A), GTP and aminoacyl-tRNA, thus starting elongation.

As for eIF3, the molecular mechanism underlying its release from the ribosome remains unclear. eIF3 may dissociate from the ribosome together with eIF4 factors upon selection of the start codon175,176 and the binding of eIF5B. However, with some transcripts, eIF3 remains associated with the ribosome even during the initial part of elongation to assist with re-initiation following passage of a short uORF11,155,176,177,178,179. More research is needed to understand these details.

Non-canonical mechanisms of translation initiation in eukaryotes

Initiation is the most regulated and also the most varied stage of mRNA translation. Although mRNAs generally use the same set of translation factors during elongation and termination, they can use quite different subsets of factors during initiation. As initiation is a major point of regulation in response to internal and external stimuli, which results mainly in the slowing down of canonical initiation (Boxes 1 and 2), the importance of non-canonical initiation has become increasingly evident. Non-canonical initiation mechanisms allow translation of specific mRNAs in conditions where canonical initiation is inhibited (Fig. 5). In this section, we discuss the current knowledge of the main non-canonical translation initiation mechanisms.

Fig. 5: Non-canonical translation initiation mechanisms.

Each panel describes how a particular mechanism differs from canonical initiation in mRNA recruitment to the 43S complex and in the factors involved. a, Canonical translation initiation. b, Internal ribosome entry site (IRES)-dependent initiation sometimes involves special IRES trans-acting factors (ITAFs). c, N6-methyladenosine (m6A)-dependent initiation may involve additional factors such as the m6A reader YTH domain-containing family protein 3 (YTHDF3) and writer N6-adenosine-methyltransferase catalytic subunit (METTL3), which interact with eukaryotic translation initiation factor 3 (eIF3) and m6A modifications at the untranslated regions (UTRs) of the mRNA40,203. d, Initiation of mRNAs with short 5′ UTRs. Recent structures question the precise role of the eIF4F complex (dashed circle with question mark) at such mRNAs, and how the mRNAs are recruited to the 43S complex is not clear. e, eIF3d-dependent initiation requires the cooperation of death-associated protein 5 (DAP5; also known as eIF4G2) and eIF3d in the recruitment of the mRNA207. f, eIF2-independent initiation occurs when eIF2 is inactivated. Specifically, mRNAs with an upstream open reading frame (uORF) are translated when a significant fraction of eIF2 is inactivated. The heterodimer density-regulated protein (DENR)–malignant T cell-amplified sequence 1 (MCTS1) or its homologue eIF2D may be required for eIF2-independent initiation213. GTP, guanosine 5′-triphosphate; m7G, 7-methylguanosine; PABP, polyadenylate-binding protein; 48S, 48S pre-initiation complex; Met-tRNAiMet, methionine initiator transfer RNA.

Box 2 Assembly and regulation of eIF2

Eukaryotic translation initiation factor 2 (eIF2) is the initiation factor that delivers the methionine initiator transfer RNA (Met-tRNAiMet) to the P site, and is one of the main targets of translation regulation. Following the hydrolysis of guanosine 5′-triphosphate (GTP) and the release of inorganic phosphate (Pi) after AUG recognition, the affinity of eIF2 for Met-tRNAiMet is dramatically reduced, which triggers its release from the 48S complex (see the figure). A new round of initiation requires the exchange of GDP for GTP by the nucleotide exchange factor eIF2B (refs. 253,254,255,256). This step is an important target of translation initiation regulation. Phosphorylated (P) eIF2 is unable to dissociate from eIF2B, thereby sequestering the exchange factor257, which in turn leads to inhibition of nucleotide exchange and depletion of active eIF2, resulting in inhibition of initiation and protein synthesis. Several eIF2 kinases are activated in response to different stresses in what is termed the integrated stress response. These kinases vary between species. Of the four kinases identified in mammals, general control non-derepressible 2 (GCN2) is activated by starvation or stalled ribosomes; haem-regulated inhibitor (HRI) is activated in erythroid cells by a lack of haem; protein kinase R (PKR) is induced by double-stranded RNA present in the cytoplasm during viral infection; and PRKR-like endoplasmic reticulum kinase (PERK) is activated by unfolded proteins in the endoplasmic reticulum210. GCN2 is the only eIF2 kinase present in Saccharomyces cerevisiae210. Recently, a compound called integrated stress response inhibitor (ISRIB) has been identified, which binds to eIF2B and allows eIF2 to be recycled even in phosphorylated form, thus overcoming the integrated stress response258,259. PABP, polyadenylate-binding protein; 80S IC, 80S initiation complex.

Internal ribosome entry sites

One of the most studied pathways of non-canonical translation initiation involves the use of various types of IRESs (Fig. 5b), which are cis-acting RNA elements with complex RNA secondary structures, such as pseudoknots and kissing loops. Some of these structures can mimic tRNA in the ribosome.

IRESs were first described in viral mRNAs180,181. Viruses use IRESs to bypass (viral) stress-induced shutdown of cap-dependent canonical translation initiation by eliminating the need for a 5′ cap and some or all initiation factors. Therefore, viral mRNAs containing IRESs can evade innate immunity responses that target canonical translation, such as eIF2 phosphorylation (Box 2). Simultaneously, many viruses have evolved mechanisms to shut down translation of host genes by inactivating one or more initiation factors, thereby hijacking the translation machinery for the synthesis of their own proteins125.

Four types (I–IV) of viral IRES have been described, each with a different requirement of initiation factors125. Types I and II require almost all of the factors except for eIF4E. Type III requires only eIF1A, eIF2, eIF5 and eIF5B, and type IV does not require canonical initiation factors at all125. However, some IRESs require auxiliary initiation factors called IRES trans-acting factors (ITAFs)182 (Fig. 5b). Several structures of translation complexes involving IRESs show highly structured IRES elements interacting with the 40S complex near the decoding centre on the mRNA exit site, and shed light on how they facilitate initiation173,183,184,185,186,187,188,189.

The presence of IRESs in cellular mRNAs was proposed a long time ago190,191. Several cellular mRNAs have 5′ UTRs with structured elements that have some similarity to known viral IRESs, and these have been proposed to use similar non-canonical mechanisms of initiation127,192. However, recent studies have questioned whether these sequences are in fact IRESs193. More research is needed to elucidate the presence of IRESs in cellular mRNAs.

Recent studies indicate that some circRNAs can be translated53,54,55,121,122,123. As circRNAs lack 5′ and 3′ ends, initiation must involve an IRES-like element. circRNAs are widely present in eukaryotic cells and comprise mainly non-coding RNAs produced through non-canonical back-splicing194,195,196. circRNAs are generally more stable than linear mRNAs as they cannot be degraded by exonucleases, the primary route of degradation of linear RNAs in the cell. Thus, circRNAs have been the subject of considerable interest for their possible therapeutic potential122.

Translation of circRNAs is still not well understood, but there is increasing evidence supporting the role of eIF4G2 in the process55,197. Furthermore, it may also involve m6A modification of the mRNA and the activity of the m6A reader YTH domain-containing family protein 3 (YTHDF3) (refs. 55,197). More research is needed to understand the molecular mechanisms underlying initiation on circRNAs and the physiological significance of their translation.

The role of N6-methyladenosine in translation

Methylation of adenosine at the N6 position to form m6A is the most common internal mRNA modification, and has been implicated mainly in controlling mRNA stability. It is present throughout the mRNA, including in the 5′ UTR, the coding sequence, the 3′ UTR and the poly(A) tail198,199,200. The methyltransferase complex N6-adenosine-methyltransferase catalytic subunit (METTL3)–METTL14 is a major m6A writer200,201,202. In addition to conferring stability, m6A has been proposed to affect translation initiation: m6A in the 5′ UTR enhances eIF4F-independent initiation without the requirement of an additional auxiliary initiation factor51. A possible mechanism is through interaction between modified mRNA and eIF3. However, it may also require the involvement of other factors in initiation, such as the m6A reader YTHDF3 and writer METTL3 (refs. 40,55,197,203) (Fig. 5c). Consistent with this idea, there is evidence that METTL3 interacts with m6A in the 3′ UTR while interacting with eIF3h, which would be close to the 5′ end40. These interactions would create mRNA in a closed-loop conformation that would enhance translation, perhaps similar to the effect of the eIF4G–PABP interaction in canonical initiation.

Translation of mRNAs with a short 5′ UTR

Although the median length of mammalian 5′ UTRs is more than 200 nucleotides, a subset of cellular mRNAs have very short 5′ UTRs. Based on the latest Ensembl human annotation (GRCh38.p13, extracting 5′ UTR sequences using the BioMart tool), 6.8%, 10.1% and 13.1% of human mRNAs have 5′ UTRs shorter than 30, 40 and 50 nucleotides, respectively. It is worth mentioning that there are some discrepancies between different genome annotations, which affect these findings. A study of transcript start sites using cap analysis of gene expression204 revealed considerable heterogeneity of transcript start sites that cannot be accurately captured in genome-wide annotation of mRNA transcripts. Recent studies have suggested that the 5′ UTR length of some transcripts can be shorter than annotated, and may also depend on alternative splicing in different cell types or conditions193.
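Percentages of this kind reduce to a simple count over a list of annotated 5′ UTR lengths. The following is a minimal sketch, assuming the lengths have already been exported from an annotation (for example via BioMart). The function name and the toy lengths are hypothetical and are not the data behind the percentages quoted above.

```python
def fraction_shorter_than(utr_lengths, thresholds=(30, 40, 50)):
    """Return {threshold: fraction of 5' UTRs strictly shorter than threshold}."""
    if not utr_lengths:
        raise ValueError("utr_lengths must not be empty")
    total = len(utr_lengths)
    return {
        t: sum(1 for n in utr_lengths if n < t) / total
        for t in thresholds
    }


if __name__ == "__main__":
    # Hypothetical 5' UTR lengths in nucleotides, for illustration only.
    toy_lengths = [12, 25, 35, 45, 80, 150, 210, 320, 500, 1000]
    for threshold, fraction in fraction_shorter_than(toy_lengths).items():
        print(f"< {threshold} nt: {fraction:.1%}")
```

Because the result depends entirely on the input annotation, running the same count on different annotation releases is one way to see the discrepancies mentioned above.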

It is unclear how transcripts with a short 5′ UTR are translated (Fig. 5d). A possible mechanism is through translation initiator of short 5′ UTR (TISU) elements128,129. An AUG in the context of a TISU element is selected over a downstream AUG in the context of a Kozak sequence128. This initiation is eIF4A-independent128, which in combination with the very short 5′ UTR suggests that TISU mRNAs are translated using a different mechanism of initiation.

eIF3d-dependent initiation

eIF3d has cap-binding activity and has been proposed to allow translation when eIF4E, the canonical cap-binding protein, is inactivated, so it may have a role under specific conditions such as stress45,88,205. eIF3d is structurally similar to the decapping and exoribonuclease protein, which has a cap-binding pocket45, and it has been proposed that eIF3d phosphorylation can regulate its activity as a cap-binding protein88,205.

eIF3d binds to the ribosome near the mRNA exit channel. It has a globular domain, where the interaction with the cap has been proposed, that directly contacts the head of the 40S subunit. It also has flexible domains that can interact with eIF3a and eIF3c in the core of the eIF3 complex12,66.

Recent data suggest that DAP5 (eIF4G2), an eIF4G1 homologue that lacks the eIF4E interaction site (Fig. 2b), is involved in eIF3d-dependent initiation, for example of the c-Jun mRNA206,207. Thus, eIF3d-dependent initiation may require a different set of initiation factors from those used for eIF4E-dependent initiation. One possibility is that DAP5 can bind directly to the 5′ UTR of mRNA, thereby bypassing the requirement for the cap-binding protein eIF4E (ref. 208), with eIF3d instead being the protein that binds to the cap (Fig. 5e).

eIF2 phosphorylation and translation initiation

Inhibition of general translation by eIF2 phosphorylation is a common mechanism underlying the integrated stress response209,210 (Box 2). Phosphorylation of eIF2 inhibits its release from the GDP–GTP exchange factor eIF2B, which results in eIF2B sequestration and an inability to recharge most of the eIF2 molecules in the cell with GTP. However, the translation of several mRNAs is actually induced by this process, indicating that this translation is dependent on the low availability of eIF2 (as proposed for ATF4 and general control non-derepressible 4 (GCN4) mRNAs)211 or relies on different factors playing the role of eIF2 (Fig. 5f).

ATF4 in mammals and GCN4 in yeast are among the genes translated when eIF2 is phosphorylated in response to stress. Their translation is regulated through the presence of small uORFs in the 5′ UTR. The translation of the main ORF in this case occurs only when the pool of unphosphorylated (active) eIF2 is low212. uORFs have been identified in almost 50% of eukaryotic genes, suggesting this regulatory mechanism is spread across the transcriptome. In addition to initiation that depends on low levels of active eIF2, the involvement of non-canonical initiation factors that substitute for eIF2 in delivering the Met-tRNAiMet has been proposed, such as the heterodimer density-regulated protein (DENR)–malignant T cell-amplified sequence 1 (MCTS1) (and its homologue eIF2D) and eIF5B (refs. 168,213) (Fig. 5f). The factors eIF5B, DENR–MCTS1 and eIF2D all specifically bind Met-tRNAiMet (ref. 214).
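Identifying candidate uORFs in a transcript is a straightforward sequence search. The following is a minimal sketch, assuming a uORF is defined as an AUG that lies entirely within the 5′ UTR and is followed by an in-frame stop codon anywhere downstream (so it may overlap the main ORF). The function name is hypothetical, near-cognate start codons are ignored, and no claim is made that a predicted uORF is actually translated.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_uorfs(mrna: str, main_start: int) -> list[tuple[int, int]]:
    """Return (start, end) pairs for AUG-initiated uORFs.

    `main_start` is the 0-based index of the A of the main start codon.
    `start` is the index of the uORF AUG; `end` is exclusive and includes
    the stop codon.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input
    uorfs = []
    # Only consider AUGs that lie entirely upstream of the main start codon.
    for start in range(0, main_start - 2):
        if seq[start:start + 3] != "AUG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Toy transcript: a short uORF (AUG CCC UAA) upstream of the main ORF.
    toy = "GGAUGCCCUAAGGGCCAUGGCAUAA"
    print(find_uorfs(toy, main_start=16))
```

A search like this only flags sequence candidates; whether a given uORF regulates the main ORF depends on the factors discussed in this section, such as the pool of active eIF2.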

eIF5B has a role in the last stage of canonical translation initiation35, but in the absence of eIF2 it can promote Met-tRNAiMet binding to type III IRES–40S complexes168. Also, a recent study has proposed that its homologue in archaea, aIF5B, stabilizes Met-tRNAiMet binding170.

DENR and MCTS1 form a heterodimer able to bind to 40S and are involved in initiation and re-initiation. The CTD of DENR binds in the place of eIF1, the factor that is involved in the fidelity of start-codon recognition215,216. DENR–MCTS1 has been proposed to be involved in ATF4 translation initiation and in RAN translation213,217. Another non-canonical initiation factor proposed to be involved in eIF2-independent translation is eIF2A (ref. 218), but its mechanism of function remains unclear and needs further investigation.

Regulation of initiation in health and disease

Dysregulation of translation, in particular of translation initiation, has been implicated in several human diseases219. It has long been known that the activity of eIF4F is upregulated in cancer cells, which have increased translation113,114,220, whereas it is downregulated during viral infection as a strategy to shut down host gene translation.

The best-known examples of eIF4F downregulation during viral infections are those of poliovirus and human immunodeficiency virus (HIV), which encode proteases that cleave eIF4G and prevent 43S recruitment221,222,223. Other viruses encode proteins that prevent 43S recruitment by blocking the mRNA channel in the 40S complex. A notable recent example is nsp1 of SARS-CoV224,225 and SARS-CoV-2, which prevents recruitment of mRNA by binding directly to the 40S subunit and inserting its CTD in the mRNA channel56,57,58,226 (Fig. 6a). Although the CTD of the protein is essential for translational control of host mRNAs56,57,58,59,226,227,228,229,230,231, the NTD stabilizes its binding to the ribosome and allows the attachment of the 43S complex to viral mRNA59,228,229,230,231. How nsp1 combines global translation inhibition with continued translation of viral genes remains unclear. It has been proposed that the RNA sequence and secondary structure of viral 5′ UTRs allow viral mRNAs to escape inhibition by nsp1 (refs. 56,59,228,229,230,231,232).

Fig. 6: Examples of regulation of translation initiation in disease.

a, Initiation control by non-structural protein 1 (nsp1) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Cryogenic electron microscopy (cryo-EM) structure of the carboxy-terminal domain (CTD) of nsp1 bound to the human 40S ribosome subunit [PDB: 6ZLW]58. nsp1 binds in the mRNA channel of 40S, near the entry site, thus preventing initiation. Some elements in the 5′ untranslated region of the viral mRNA enable it to escape translation inhibition by nsp1 by an as yet unclear mechanism. b, Repeat-associated non-AUG (RAN) translation. Expansion of short tandem repeats is associated with more than 50 neurological disorders, such as frontotemporal dementia, amyotrophic lateral sclerosis and Huntington disease. Translation on mRNAs with expanded repeats, such as the GGGGCC hexamer in the C9orf72 mRNA, starts upstream of or within the repeats themselves through RAN translation initiation. The highly structured elements of the expanded repeats allow the ribosome to initiate upstream of the AUG start codon, at an upstream open reading frame (uORF). Moreover, RAN translation is upregulated upon phosphorylation of eukaryotic translation initiation factor 2 (eIF2) and inhibition of eIF4E-dependent initiation (Box 2), suggesting it may be a type of internal ribosome entry site (IRES)-dependent mechanism50. 43S, 43S pre-initiation complex; m7G, 7-methylguanosine; PABP, polyadenylate-binding protein; Met-tRNAiMet, methionine initiator transfer RNA.

Escaping translational control is not an exclusive property of viral mRNAs. Cellular mRNAs can also bypass important regulatory processes, with serious implications for human health. In RAN translation, some cellular mRNAs use RNA secondary structures to start translation on a non-AUG codon located upstream of the main ORF50,233,234 (Fig. 6b). This non-canonical initiation has been associated with dozens of neurological diseases, including Huntington disease, several spinocerebellar ataxias, amyotrophic lateral sclerosis and frontotemporal dementia50,233,234. For example, expansions of the hexanucleotide GGGGCC repeat within the first intron of the C9orf72 gene are the main monogenic cause of familial and sporadic amyotrophic lateral sclerosis and of frontotemporal dementia50. Despite the intronic location, the GGGGCC repeats in the pre-mRNA can be retained in the 5′ UTR of mature mRNA, and then can be translated through RAN translation to produce toxic dipeptide repeat proteins217,235,236,237,238,239. However, despite its medical and biological importance, the precise molecular mechanism of RAN translation is yet to be determined.
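Repeat expansions of this kind can be located in a sequence by counting tandem copies of the repeat unit. The following is a minimal sketch using a regular expression; the function name and the toy sequence are hypothetical, and no disease-associated repeat-number threshold is implied.

```python
import re


def longest_repeat_run(seq: str, unit: str = "GGGGCC") -> int:
    """Return the largest number of uninterrupted tandem copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


if __name__ == "__main__":
    # Toy sequence: five tandem copies, a spacer, then two more copies.
    toy = "AUC" + "GGGGCC" * 5 + "AUG" + "GGGGCC" * 2
    print(longest_repeat_run(toy))
```

The count only describes the sequence; how expanded repeats promote RAN translation remains, as noted above, mechanistically unresolved.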

Another important mechanism of translation regulation in human diseases is that of PDCD4, which is a tumour suppressor protein known to prevent cell growth, tumour invasion and metastasis60,61,62,63. Downregulation of PDCD4 has been implicated in the development of different human cancers, such as colorectal cancer, lung cancer, liver cancer and breast cancer240,241,242. It has been proposed that PDCD4 binds to eIF4A (refs. 60,61,62,63), and likely controls translation initiation by blocking the helicase activity of eIF4A. However, it is unknown whether PDCD4 blocks the helicase activity of the eIF4A that is part of eIF4F, or of the recently described second eIF4A molecule at the mRNA entry site19. Moreover, the identification of the proto-oncogene mRNAs a-MYB and c-MYB as two major physiological targets of PDCD4 has also implicated translation elongation in cancer development243,244. The regulatory function of PDCD4 on the translation of MYB mRNAs is achieved through its interaction with specific RNA secondary structures located in the ORF region of the mRNA, leading to the inhibition of translation elongation. This mechanism does not involve the established interactions between PDCD4 and eIF4A. Thus, the precise molecular mechanism underlying translational gene control by PDCD4 is yet to be determined.

Conclusions and future perspective

Several methodological developments have led to a major advance in our understanding of translation initiation and its regulation. These include single-particle cryo-EM, real-time single-molecule biophysics, biochemical techniques such as ribosome profiling and increasingly sophisticated genetic and biochemical tools. We are now in a position to understand not only the structural and biochemical basis of eIFs in translation but also how their activities can be inhibited or modulated in various conditions.

The recruitment of the 43S complex to mRNA is a key step of translation. Recent advances have shed light on the process by which mRNA is inserted into its channel in the 40S ribosome subunit. However, it is still unclear how initiation occurs on mRNAs with short 5′ UTRs, including those containing TISU elements.

Another important gap in our understanding of initiation regulation is how the activity of eIF4A–eIF4B, DDX3, Ded1 and DHX29 is coordinated with the activity of eIF4F in the 48S complex. Cryo-EM structures place eIF4F and eIF4A–eIF4B on the mRNA exit and entry sites on the 40S subunit, respectively. Thus, it is unclear how eIF4F, which itself contains a copy of the helicase eIF4A and has been located upstream of the 40S subunit, can work together with a second, downstream eIF4A during scanning. Determination of the structure of 48S complexes with other helicases, such as in complex with DDX3 or Ded1 possibly located downstream of 40S, will be crucial to resolving these discrepancies.

PABP binds to the poly(A) tail at the 3′ end while interacting with eIF4F. How PABP interacts with translating ribosomes remains a major gap in our knowledge. Similarly, how the cap-binding protein eIF4E interacts with the rest of the 48S complex during mRNA recruitment and scanning remains unknown. Thus, it will be important to determine structures of eIF4E and PABP in the 48S complex.

Finally, studying translation of circular mRNAs is important both for understanding non-canonical translation and for its potential therapeutic application. It is still unclear which proteins and RNA elements and modifications are required for the recruitment of a 43S complex to a circular mRNA. Furthermore, it is still unclear how RNA modifications such as m6A regulate translation initiation.