Eukaryotic Replication Proteins - Biology

Eukaryotic replication proteins have functions analogous to those found in bacteria.

DNA replication has been studied from a wide variety of species. This section will examine eukaryotic DNA polymerases and accessory proteins, emphasizing properties that are common to those seen in bacterial enzymes.

Five DNA polymerases, called α, δ, β, ε, and γ, have been isolated from eukaryotic cells. Following the paradigm established for studying replication in bacteria, researchers have sought to determine which proteins are involved in a particular function using both genetic analysis and biochemical characterization. Although no genetic screens for DNA replication functions can be done in mammals, a substantial amount has been learned by studying replication in vitro of DNA containing viral origins of replication, such as those found in simian virus 40 (SV40) or bovine papilloma virus. These mammalian viruses have small chromosomes (about 5 to 7 kb), and they can be replicated completely in cell-free systems.

Use of cell-free systems that are competent for replication has allowed a detailed analysis of the proteins required for this process. The ability to interfere with the activity of designated proteins in a cell-free system, e.g. by adding antibodies that inactivate them or inhibitors that block their activity, provides the means to test whether a given protein is required for DNA replication. In effect, this interference with a protein in vitro mimics the information gleaned from the phenotypes of loss-of-function mutations in the gene that encodes the protein of interest. Also, purified proteins can be combined to reconstitute the activities needed for complete synthesis of the viral DNA template. Success in such a reconstitution indicates that the major components have been identified. Furthermore, proteins homologous to those identified in mammalian cells have been found in yeast, and mutation of those genes provides additional information about the biological function of the enzymes. Results of these types of studies are presented in this section.

The chief polymerase for replication of nuclear DNA is DNA polymerase δ (Figure 5.27). It is required for both leading strand and lagging strand synthesis, at least in reconstituted in vitro replication systems. It has two subunits, a polymerase (125 kDa) and another subunit (48 kDa). It catalyzes DNA synthesis with high fidelity, and it contains the expected 3' to 5' exonuclease activity for proofreading. It has high processivity when associated with an analog of the bacterial sliding clamp (β2), called PCNA.

Figure 5.27. Eukaryotic DNA polymerases and replication proteins at the replication fork. The major replicative DNA polymerase in nuclei is DNA polymerase δ. RFA is the functional equivalent of bacterial SSB; this single-strand binding protein coats the single-stranded DNA. A helicase catalyzes the separation of the two parental strands. DNA polymerase α (shown as a circle around the new primer) contains a primase that makes short stretches of RNA and DNA that serve as primers for DNA synthesis.

PCNA, or proliferating cell nuclear antigen, was initially identified as an antigen that appears only in replicating cells, and only at certain times of the cell cycle (such as S phase). This trimeric protein has a ring structure similar to that of the β2 sliding clamp of E. coli DNA polymerase III (Figure 5.28), despite the absence of significant sequence similarity in the proteins. Binding of PCNA confers high processivity onto polymerase δ. Thus PCNA is both structurally and functionally analogous to the E. coli β subunit. Each subunit of the trimeric PCNA folds into two domains, for a total of six domains in the ring. Each subunit of the dimeric E. coli β clamp folds into three domains, again making six domains in the ring. Thus the sliding clamp has a very similar structure in both bacteria and mammals.
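The domain bookkeeping in this comparison can be checked with a few lines of Python. This is only an illustrative sketch: the `SlidingClamp` class and its field names are invented for this example and do not come from any library; the subunit and domain counts are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlidingClamp:
    """Hypothetical record describing a ring-shaped processivity factor."""
    name: str
    subunits: int             # number of identical polypeptides in the ring
    domains_per_subunit: int  # similar folded domains within each polypeptide

    @property
    def total_domains(self) -> int:
        # Domains around the whole ring = subunits x domains per subunit.
        return self.subunits * self.domains_per_subunit


pcna = SlidingClamp("PCNA (trimer)", subunits=3, domains_per_subunit=2)
beta = SlidingClamp("E. coli beta clamp (dimer)", subunits=2, domains_per_subunit=3)

for clamp in (pcna, beta):
    print(f"{clamp.name}: {clamp.total_domains} domains in the ring")
```

Both rings come out at six domains, which is the structural point of the comparison: a different number of subunits, but the same overall architecture.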

Figure 5.28. Similar structures of processivity factors for DNA replication. The mammalian protein, PCNA (top), is a trimer, each monomer of which has two similar domains. The trimer forms a circle that surrounds DNA, hence serving as a sliding clamp. The β subunit of DNA polymerase III from E. coli is a dimer (bottom), each monomer of which has three similar domains. These domains have a very similar structure to those of PCNA, despite having only limited sequence similarity. Thus functionally analogous sliding clamps in eukaryotes and prokaryotes have similar structures.

The template-primer junctions are recognized by the multisubunit replication factor C, or RFC. Like the γ complex in E. coli, this enzyme is an ATPase, and it helps to load the processivity factor PCNA onto the DNA. Thus RFC carries out a function similar to that of the bacterial γ complex.

One of the first eukaryotic polymerases to be isolated was DNA polymerase α, which is now recognized as a catalyst of primer synthesis. This enzyme contains four polypeptide subunits: one with a polymerase activity (170 kDa), two that comprise a primase activity (50 and 60 kDa), and another subunit of (currently) undetermined function (70 kDa). DNA polymerase α has low processivity but high fidelity. This high fidelity is surprising because no 3' to 5' exonuclease is associated with the enzyme. Polymerase α, possibly with additional primases, catalyzes the synthesis of short segments of RNA and DNA that serve as primers for the replicative polymerases.

DNA polymerase ε is related to polymerase δ, and it may play a role in lagging strand synthesis. It is also dependent on PCNA in vivo. However, no requirement has been identified for it in viral replication systems in vitro.

The compound aphidicolin blocks the growth of mammalian cells. It does this by preventing DNA replication, and the targets of this drug are DNA polymerases α and δ (as well as ε). The fact that inhibition of these DNA polymerases with aphidicolin also stops DNA replication in mammalian cells argues that α and δ are indeed responsible for replication of nuclear DNA in eukaryotic cells. This conclusion is strongly supported by the phenotype of conditional loss-of-function mutations in the genes encoding the homologs of these polymerases in yeast. Such mutants do not grow at the restrictive temperature, indicating that δ and α are the replicative polymerases. The biochemical evidence implicates polymerase α in primer formation, and δ appears to be the major polymerase used to synthesize the new strands of DNA.

Table 5.4: Analogous components of the replication machinery in E. coli and eukaryotic cells.

| Function | Bacterial (E. coli) | Number of subunits | Eukaryotic replication (SV40) | Number of subunits |
|---|---|---|---|---|
| Leading and lagging strand synthesis | asymmetric dimer, E. coli polymerase III | 10 (3 in core) | polymerase δ | |
| Sliding clamp | β subunit | | | |
| Clamp loader | | | | |
| | | | Polymerase α | |
| | | | T-antigen (SV40) | |
| Bind single-stranded DNA | | | | |
| | | 4, A2B2 | Topo I or Topo II | 2 (homodimer) |

The parallels between bacterial and eukaryotic DNA replication are striking. The overall strategy of synthesis is similar, and analogous proteins carry out similar functions, as listed in Table 5.4. It is difficult to determine whether the proteins carrying out similar functions are actually homologous proteins, i.e. encoded by genes descended from the same gene in the last common ancestor. The protein sequence identities are marginal, and frequently the analogous proteins have different numbers of subunits. These differences complicate the analysis considerably, because different subunits in bacteria or mammals may have similar functions. However, the functional similarities are convincing.

Several other DNA polymerases have been isolated from eukaryotic cells. DNA polymerases β and ε are involved in repair of nuclear DNA. DNA polymerase β is a single polypeptide of 36 kDa and has no 3' to 5' exonuclease. DNA polymerase γ replicates mitochondrial DNA.

Reverse transcriptase is frequently referred to as an RNA-dependent DNA polymerase because it can use RNA as a template, but in fact it can use either RNA or DNA as a template. It is encoded by retroviruses, and hence it is present in cells infected with a retrovirus. This enzyme has widespread use in the laboratory for making complementary DNA copies of RNA, called cDNA. Active copies of LINE1 repetitive elements (in mammals) or Ty1 repeats (in yeast) also encode reverse transcriptase. Thus in cells where these retrotransposable elements are being transcribed, active reverse transcriptase is also present. Reverse transcriptase also has an RNase H activity, which digests away the RNA from an RNA-DNA duplex.
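To make the idea of a "complementary copy" concrete, the following Python sketch derives the first-strand cDNA sequence from an RNA sequence using standard Watson-Crick pairing. The function name `first_strand_cdna` and the example sequence are hypothetical, chosen only for illustration; the sketch models the base-pairing outcome, not the enzymology.

```python
# Standard base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the cDNA (written 5'->3') complementary to an RNA written 5'->3'.

    Hypothetical helper for illustration only.
    """
    complement = [RNA_TO_DNA_COMPLEMENT[base] for base in rna.upper()]
    # The new strand is antiparallel to the template, so reverse it
    # to report the sequence in the conventional 5'->3' direction.
    return "".join(reversed(complement))


print(first_strand_cdna("AUGGCC"))  # GGCCAT
```

An RNA base outside A, U, G, C raises a `KeyError`, which is a deliberate choice here so that malformed input is not copied silently.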

In contrast to the other DNA polymerases discussed in this chapter, terminal deoxynucleotidyl transferase does not require a template. It adds dNTPs (as dNMP) to the 3' end of DNA, using that 3' hydroxyl as a primer. It is found in differentiating lymphocytes, and appears to be used physiologically to introduce somatic mutations into immunoglobulin genes. In the laboratory, it is used to add "homopolymer tails" to the ends of DNA molecules by incubating a linear DNA with one particular dNTP and terminal deoxynucleotidyl transferase.

As will be discussed in more detail in the next chapter, the ends of linear chromosomes (telomeres) must be extended at each replication or they will progressively shorten. The enzyme telomerase catalyzes the addition of many tandem copies of a simple sequence to the ends of the chromosomes. The template for this reaction is an RNA that is a component of the enzyme. Thus telomerase is a reverse transcriptase that only makes copies of the template that it carries, using the 3' end of a chromosomal DNA strand as the primer.
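The logic of telomerase, repeatedly copying its own RNA template onto a DNA 3' end, can be sketched in Python as follows. Several things here are assumptions made for illustration: the function names are hypothetical, the six-nucleotide template `CCCUAA` is a simplification chosen so that the product is the vertebrate telomeric repeat TTAGGG, and the alignment and translocation steps of the real enzyme are ignored.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def copy_template(rna_template: str) -> str:
    """DNA (5'->3') made by reverse-transcribing an RNA template written 5'->3'."""
    complement = [RNA_TO_DNA_COMPLEMENT[base] for base in rna_template.upper()]
    return "".join(reversed(complement))


def telomerase_extend(dna_3prime_end: str, rna_template: str, cycles: int) -> str:
    """Append `cycles` tandem copies of the templated repeat to a DNA 3' end.

    Hypothetical, simplified model: the same internal template is copied
    once per cycle, and the chromosomal 3' end acts as the primer.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    repeat = copy_template(rna_template)
    return dna_3prime_end + repeat * cycles


# Simplified template (assumption): 5'-CCCUAA-3' specifies the repeat TTAGGG.
print(telomerase_extend("GATTACA", "CCCUAA", cycles=3))
```

Because the template travels with the enzyme, the added sequence does not depend on the chromosome being extended; only the number of cycles changes the outcome.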

Eukaryotic replication and chromatin

A single-molecule approach to reveal the dynamics of Eukaryotic DNA replication in chromatin.

Research overview

The eukaryotic replisome copies DNA that is wrapped around nucleosomes. This packaged state of DNA is known as chromatin.

The copying, or replication, of DNA is one of the central processes that take place in all living organisms. Our understanding of DNA replication has made gigantic leaps forward since the discovery of the double helical form of DNA by Watson and Crick in 1953. We know many of the structures and functions of the proteins and enzymes involved. Many of these discoveries have been made by studying DNA replication in simple systems, such as viruses or bacteria. These continue to yield valuable insights, but recently, advances in the reconstitution of the yeast replisome have made it possible to gain insights into eukaryotic replication.

DNA replication is carried out at very high accuracy by nanometer-scale, multi-protein complexes known as replisomes. In eukaryotic organisms such as ourselves, the replisome consists of some twenty different proteins. Eukaryotic replication occurs in the context of chromatin: the meters of DNA in eukaryotic organisms are tightly packed into a higher-order structure called chromatin in order to fit in the tiny nucleus. The basic compaction unit of this condensed structure is a DNA-protein complex termed the nucleosome, which consists of a short stretch of DNA wrapped around a core of histone proteins. This compaction adds an extra layer of complexity to the replication process.

Schematic of the eukaryotic replisome. The central motor of the replisome is the CMG helicase, which unwinds the parental DNA. The polymerases are then able to add new nucleotides on the leading and lagging strands, producing two daughter DNA molecules.

Our research focuses on understanding the molecular processes that underlie eukaryotic DNA replication in the context of chromatin, with the particular aim of gaining spatiotemporal insight into their dynamics by using our single-molecule biophysical expertise in replication and chromatin while integrating it with state-of-the-art molecular biology and biochemistry.


Despite tremendous advances in understanding chromatin replication achieved by experiments in genetics, cell biology, structural biology, and biochemistry, a detailed mechanistic understanding of how the replisome interacts with nucleosomes, histone chaperones, and chromatin remodelers is still lacking. The possibility of reconstituting an active yeast replisome in vitro in our lab (originally described by the Diffley laboratory in 2015) has opened up a new perspective, because when integrated with reconstituted forms of chromatin it provides the means to study chromatin replication in a precise, carefully controlled manner.

However, to really understand how the different processes that maintain robust chromatin replication occur in space and time requires probing the stoichiometry and dynamics of the individual proteins involved. Doing so requires a complementary approach to bulk biochemistry that can be found in single-molecule techniques. These high-resolution techniques, which include single-molecule fluorescence and single-molecule force spectroscopy, monitor individual biochemical processes under physiological conditions in real time and have demonstrated their value in revealing the dynamics of large protein complexes.

Read about our recent work

We have recently published our findings about the dynamics of the loading proteins involved in the first step of replisome assembly. The origin recognition complex (ORC) turns out to be a protein that is quite mobile on the DNA, except at the origin. Recruitment of the MCM helicase in an ORC-dependent manner can occur at different locations on the DNA, but immobile ORC-MCM complexes are also preferentially observed at the origin. When loaded onto DNA in the presence of ATP in bulk experiments and then visualized at the single-molecule level, both single and double Mcm2-7 hexamers are found on the DNA, and both exhibit similar low-level mobility.

Scanning confocal images of two ORC molecules, labelled in green, bound to dsDNA (not fluorescent). The dotted line connects fitted ORC positions over time (blue represents early times; red, late times). In the top panel, the ORC remains bound to the origin of DNA replication and is static throughout the entire observation time. In the bottom panel, another ORC diffuses randomly from its initial position until it locates the origin.
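The behavior described for the second ORC molecule, random diffusion along the DNA that stops once the origin is found, can be caricatured as a one-dimensional random walk with an absorbing site. The Python sketch below is a toy model under stated assumptions, not the analysis used in the study: the DNA is a lattice of discrete positions, each step moves one site left or right with equal probability, the ends are reflecting, and binding at the origin is permanent. The function name and all parameter values are hypothetical.

```python
import random


def diffuse_until_origin(start, origin, dna_length, rng, max_steps=100_000):
    """Random-walk a particle on sites 0..dna_length-1 until it reaches `origin`.

    Returns the list of visited positions (ending at the origin), or None if
    the origin was not reached within `max_steps` steps. Toy model only.
    """
    if not (0 <= start < dna_length and 0 <= origin < dna_length):
        raise ValueError("start and origin must lie on the DNA lattice")
    position = start
    trajectory = [position]
    for _ in range(max_steps):
        if position == origin:
            return trajectory  # absorbed: the particle stays at the origin
        step = rng.choice((-1, 1))
        # Reflecting ends: a step off the lattice leaves the particle in place.
        position = min(max(position + step, 0), dna_length - 1)
        trajectory.append(position)
    return trajectory if position == origin else None


trajectory = diffuse_until_origin(start=2, origin=7, dna_length=10,
                                  rng=random.Random(42))
if trajectory is None:
    print("origin not reached")
else:
    print(f"origin located after {len(trajectory) - 1} steps")
```

A particle that starts at the origin returns a one-element trajectory, which corresponds to the static molecule in the top panel. Passing a seeded `random.Random` makes a run reproducible.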

Eukaryotic MCM proteins: beyond replication initiation

The minichromosome maintenance (or MCM) protein family is composed of six related proteins that are conserved in all eukaryotes. They were first identified by genetic screens in yeast and subsequently analyzed in other experimental systems using molecular and biochemical methods. Early data led to the identification of MCMs as central players in the initiation of DNA replication. More recent studies have shown that MCM proteins also function in replication elongation, probably as a DNA helicase. This is consistent with structural analysis showing that the proteins interact together in a heterohexameric ring. However, MCMs are strikingly abundant and far exceed the stoichiometry of replication origins; they are widely distributed on unreplicated chromatin. Analysis of mcm mutant phenotypes and interactions with other factors has now implicated the MCM proteins in other chromosome transactions including damage response, transcription, and chromatin structure. These experiments indicate that the MCMs are central players in many aspects of genome stability.


Phylogenetic tree of eukaryotic MCMs, assembled using ClustalX ( ) and Phylip…

Structural features of the Mcm protein family, derived from the alignment used in…

Pairwise interactions in vitro suggest this organization of the MCM heterohexamer. The P-loop…

Model for replication initiation driven by MCM proteins. Panels A to F show…

Associations between MCMs and other factors are grouped by function and the MCM…


With these questions in mind, we examine the phylogeny of the PCNA, RFCS, and MCM subunits. The phylogenetic data are then compared in detail with the known biochemistry of each subunit, in particular, a subunit's interaction partners within each complex.

Proliferating Cell Nuclear Antigen

PCNA was so named after it was found to be highly abundant in proliferating cells [20]. PCNA consists of three subunits (Figure 1a) of 1, 2, or 3 sequence types, depending on the phylogenetic group (Table 1). In the interest of clarity and consistency, we introduce our own designations of the PCNA subunits (C1, C2, C3). Table 2 translates our notation to that of previous literature [21]–[23].

The maximum likelihood phylogeny of the PCNA subunits is shown in Figure 2. The resulting phylogeny generally agrees with the NCBI taxonomy of the corresponding organisms. For clarity, more closely related sequences are shown as a collapsed group. The archaeal and eukaryotic sequences are grouped into separate clades. The Crenarchaeota and the Euryarchaea also form distinct groups. The placement of Nitrosopumilus and Cenarchaeum in Figure 2 is consistent with recent proposals that these organisms belong to a phylum distinct from the Crenarchaeota and Euryarchaea, which has been named Thaumarchaeota [24]. The Korarchaeum and Nanoarchaeum sequences are grouped together within those of the Crenarchaeota. Given the general agreement between the PCNA phylogeny and the organismal taxonomy, HGT does not appear to have occurred.

Tree produced using RAxML [63]. Note the proliferation of distinct subunit types in the Crenarchaeota.

The eukaryotes and the Euryarchaeota contain only one PCNA gene, with the exception of a few near-identical copies of unknown functionality in Drosophila, Arabidopsis, and Thermococcus (see Figure S1) that are generally not present in closely related taxa (data not shown). By contrast, the Crenarchaeota show deep branchings between PCNA subunits. Cenarchaeum symbiosum contains one PCNA gene, while the Thermoproteales have either one, as in Thermofilum pendens, or two distinct PCNA-encoding genes, as in the Thermoproteaceae. The Desulfurococcales and the Sulfolobales both encode three distinct PCNA subunits.

The phylogenetic relationships between the distinct sequence types yield an interesting picture—one that is consistent with their known biochemical properties. Note that the three distinct types of PCNA roughly group into three clades labeled C1, C2, and C3. Sulfolobales PCNA C1 appears slightly more related to PCNA C3, but not significantly so. We tested this further by constructing a phylogeny of sequences from organisms with more than one distinct sequence type. As shown in Figure 3, in this more focused phylogeny, the PCNA subunits C1, C2, and C3 all group separately.

The branching indicated here lends further support to the three PCNA C1, C2, and C3 groupings.

Furthermore, within each of these three groups, the subunits share similar interaction properties. PCNA C1 appears to have preserved the most ancestral function, sharing the most properties in common with the homotrimeric PCNA subunit. C1 has the most stable dimeric interactions with the other subunits [21]–[23] and in Aeropyrum pernix, C1 is capable of forming a homotrimer [22]. In addition, C1 is present in all heterotrimeric configurations of PCNA (C1-C2-C3, C1-C1-C2, and C1-C2-C2) [21]–[23]. Phylogenetically, C1 is also the most closely related to the homotrimeric PCNA of Thermofilum pendens (Figure 2).

In contrast, C3 takes part only in C1-C2-C3 heterotrimer arrangements [21]–[23]. Data suggest that in Sulfolobus solfataricus, C3 is the last to be recruited into the PCNA trimer [21]. Overall, C3 has the fewest interactions with the other subunits [21]–[23] and appears to be the most functionally divergent of the three subunits from homotrimeric PCNA.
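The membership claims about C1 and C3 can be checked directly against the three heterotrimer configurations listed in the text. The following Python sketch does only that tabulation; the variable and function names are hypothetical, and the configurations are the ones quoted above, not an independent data set.

```python
# Heterotrimer configurations as listed in the text (C1-C2-C3, C1-C1-C2, C1-C2-C2).
HETEROTRIMERS = [
    ("C1", "C2", "C3"),
    ("C1", "C1", "C2"),
    ("C1", "C2", "C2"),
]


def configurations_containing(subunit):
    """Return the configurations that include at least one copy of `subunit`."""
    return [trimer for trimer in HETEROTRIMERS if subunit in trimer]


def present_in_all(subunit):
    """True if `subunit` occurs in every listed heterotrimer configuration."""
    return len(configurations_containing(subunit)) == len(HETEROTRIMERS)


for subunit in ("C1", "C2", "C3"):
    found = configurations_containing(subunit)
    print(f"{subunit}: in {len(found)} of {len(HETEROTRIMERS)} configurations")
```

Run against this list, C1 appears in every configuration and C3 in exactly one, as the text states. C2 also appears in all three of the listed configurations, so membership alone does not separate C1 from C2; the argument for C1 being closest to the ancestral form rests on its interaction stability and its ability to form a homotrimer.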

The results for PCNA are consistent with a simpler ancestral homotrimeric PCNA subunit and subsequent duplication and divergence of the distinct subunit types. The archaeal and eukaryotic PCNA both appear to have diverged from a homotrimeric form. Then, in the crenarchaeotes, more specialized PCNA sequence types appear to have originated from gene duplications, while the eukaryotes and Euryarchaea retained the ancestral configuration.

The Clamp Loader: Replication Factor C

The RFC complex consists of five subunits, one large (RFCL) and four small (RFCS). The RFC complex opens between the -position RFCS and the RFCL (Figure 1b) in order to open and close PCNA about the DNA polymerase at the replication fork [25], [26]. The RFC complex is made up of either 1, 2, or 4 distinct RFCS sequence types, depending on phylogenetic group (Table 1).

The maximum likelihood phylogeny of the RFCS subunits is shown in Figure 4. Again, the phylogeny shows general agreement with the NCBI taxonomy of the corresponding organisms. As such, HGT does not appear in the phylogeny of the RFCS subunits. The eukaryotes, crenarchaeotes, and Euryarchaea form separate groups. As with PCNA, the RFCS tree places Cenarchaeum deep in the branching of archaeal sequences, again consistent with proposals that it is a member of a distinct phylum. The Korarchaea and Nanoarchaea sequences cluster with those of the Euryarchaea. The rooting between the eukaryotes and Archaea follows the canonical pattern, dividing the crenarchaeotes and the Euryarchaea at the base of the archaeal clade.

The red stars indicate splits between RFCS and RFCS1 subunit types in the Methanomicrobia, possibly from loss of RFCS2.

The phylogeny of the RFCS subunits shows that a RFC with four distinct RFCS sequence types seems to have been present in a common eukaryotic ancestor. This can be seen from the four eukaryotic RFCS clades—one for each RFCS position. On the other hand, the archaeal RFC consists of one or two distinct RFCS subunits [27], [28]. Archaea containing only one distinct RFCS form the RFC complex with the same RFCS in all four positions [25]. Euryarchaeal RFC complexes with two distinct RFCS subunits are composed of three RFCS1 at positions , , and , and a single RFCS2 at position [29]. The configuration of RFC in crenarchaeotes with two distinct subunits has not yet been elucidated.

In Euryarchaeota, the specialization of RFCS into RFCS1 and RFCS2 appears to have occurred before the split between Methanomicrobia and Halobacteria. Following the RFCS1-RFCS2 divergence, there appear to be two independent losses of RFCS2 in the Methanomicrobia, indicated by stars in Figure 4. On the other hand, RFCS1 and RFCS2 could have evolved independently in the Halobacteria and Methanomicrobia—a hypothesis that we do not have enough phylogenetic resolution to affirm or reject. However, data from gene context of RFCS1, shown in Figure S4, is consistent with the phylogeny. (For a more general study of gene context of archaeal DNA replication proteins, we refer the interested reader to Ref. [30]). Also, RFCS1-RFCL complexes have been shown to have some functional activity, further lending plausibility to the notion of independent gene losses [29].

Note that the long branch of RFCS2 corresponds to a change of function. Unlike RFCS and RFCS1, RFCS2 is unable to further extend the small subunit chain since it contains only one RFCS-RFCS binding site [29]. Thus, very conserved amino acid positions in RFCS and RFCS1 corresponding to the second RFCS-RFCS binding site have been allowed to drift in RFCS2 [29], resulting in the long RFCS2 branch seen in Figure 4. Also note that the RFCL rooting of the RFCS tree places the root within the eukaryotes, but is not in significant disagreement with the more sensible rooting between Archaea and eukaryotes (Figure S2).

The results for RFCS are consistent with a simpler ancestral RFC complex containing RFCL and four identical RFCS subunits. In the Archaea, we see subsequent multiple independent duplications and divergences of the distinct subunit types in both crenarchaeotes and Euryarchaea. In eukaryotes, we do not see any intermediate forms with fewer than four distinct RFCS types.

Minichromosome Maintenance Complex

The MCM complex plays a role in replication licensing [31] and DNA duplex unwinding [32]. The MCM complex consists of six homologous subunits arranged in a hexameric ring (Figure 1c). The six MCM subunits are drawn from 1, 2, 3, 4, 6, or 8 distinct sequence types, depending on phylogenetic lineage (Table 1).

The phylogeny of the MCM subunits is shown in Figure 5 (shown uncondensed in Figure S3). As in the case of PCNA and RFCS, this phylogeny also shows general agreement with the NCBI taxonomy of the corresponding organisms. The eukaryotes, crenarchaeotes, and Euryarchaea form separate groups. Once again the basal position of Nitrosopumilus and Cenarchaeum is consistent with a distinct phylum-level group, the proposed Thaumarchaeota [24]. Also as in Figures 2 and 4, the Korarchaea and Nanoarchaea sequences group with those of the Euryarchaea. Once again, given the general agreement between gene and organismal relationships, HGT between distantly related organisms does not appear in the phylogeny of the MCM subunits.

The Methanococci MCM sequences show abundant gene duplication and divergence. They have been labeled I, II, III, IV, and V according to the phylogeny.

The phylogeny of the MCM subunits shows that MCM with six distinct sequence types seems to have been present in a common eukaryotic ancestor, a result previously noted by Liu et al. [33]. By contrast, the archaeal genomes vary in the number of distinct MCM sequence types they contain. The crenarchaeotes appear to contain only a single distinct MCM subunit. On the other hand, the euryarchaeotal genomes contain up to eight distinct MCM subunit genes.

The largest number of MCM genes can be found in the Methanococci. The Methanococci subunits in Figure 5 are labeled based on their phylogeny. The branch lengths between the labeled groups appear indicative of distinct roles among the subunits. The organismal members of each group vary—an indication of gene gains and losses in the Methanococci. For instance, Methanococcus aeolicus appears to have lost MCM III while Methanococcus maripaludis C6 has five MCM V sequences.

There are multiple eukaryotic MCM complexes. At least two different complexes are known to play a role in unwinding dsDNA [34]: MCM2-7 [35] and MCM467 [32], [36]. MCM2467 and MCM35 complexes have also been observed [37]. In Archaea, MCM has mostly been characterized in organisms containing a single MCM, and several of these MCM proteins have been shown to function as homohexamers [38]–[44]. It is worth noting, however, that MCM in Pyrococcus furiosus requires the presence of the accessory protein GINS for DNA unwinding activity [43]. Recently it has been demonstrated that coexpression of the four MCM homologs in Methanococcus maripaludis S2 results in the formation of a heterohexameric complex [45]. Since M. maripaludis has a very robust genetic system, we anticipate that subsequent studies will reveal the need for multiple MCM homologs in this archaeon, instead of the usual single homolog in most archaea.

These results are consistent with an ancestral homohexameric MCM complex. In the Archaea, we see subsequent multiple independent duplications and divergences of the distinct subunit types in the Euryarchaea. The crenarchaeotes, on the other hand, retain the simpler ancestral configuration. In eukaryotes, we do not see any intermediate forms with fewer than six distinct sequence types, implying a common eukaryotic ancestor containing six distinct MCM subunits.

Most prokaryotic factors utilized during replication have equivalents that play similar roles in eukaryotic DNA duplication.

This process initiates at an origin of replication, to which a recognition complex binds. Helicase is then attracted to the site and separates the strands of DNA, generating a bubble with two forks.

Primase also arrives and generates RNA primers, which, as helicase moves, DNA polymerase elongates with new DNA. As in prokaryotes, the newly-formed leading strand grows continuously, following the replication fork.

Conversely, the lagging strand is manufactured in small Okazaki fragments, traveling opposite the fork.

Because the two forks of a replication bubble move in opposite directions, the DNA template used to generate the leading strand in one half of this structure serves as the template for the lagging strand in the other half.

Interestingly, many origins of replication exist on a linear eukaryotic chromosome, and replication terminates when the replication bubbles that expand from them merge. Primers are then removed by enzymes such as RNase and replaced with DNA. Afterwards, DNA ligase joins the adjacent segments.
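The effect of having many origins can be illustrated with a small calculation of how long it takes for all bubbles on a linear chromosome to merge. The Python sketch below rests on simplifying assumptions that are not in the source: every origin fires at time zero, every fork moves at the same constant speed, and forks never stall. The function name, the positions, and the speed are hypothetical and the units are arbitrary.

```python
def replication_time(chromosome_length, origins, fork_speed):
    """Time until a linear chromosome is fully copied (toy model).

    Assumes all origins fire simultaneously and each fork moves at
    `fork_speed` (length units per time unit) in its own direction.
    """
    if not origins:
        raise ValueError("at least one origin is required")
    if fork_speed <= 0:
        raise ValueError("fork_speed must be positive")
    sites = sorted(origins)
    if sites[0] < 0 or sites[-1] > chromosome_length:
        raise ValueError("origins must lie on the chromosome")

    # The outermost forks must travel alone to each chromosome end.
    times = [sites[0] / fork_speed,
             (chromosome_length - sites[-1]) / fork_speed]
    # Between neighboring origins, two converging forks share the gap.
    for left, right in zip(sites, sites[1:]):
        times.append((right - left) / (2 * fork_speed))
    # Replication is complete when the slowest segment is finished.
    return max(times)


print(replication_time(1000, [500], fork_speed=10))            # 50.0
print(replication_time(1000, [200, 600], fork_speed=10))       # 40.0
print(replication_time(1000, [200, 600, 800], fork_speed=10))  # 20.0
```

In this model the completion time is set by the longest stretch a single fork must cover, so adding origins shortens replication only where they break up the largest gap.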

However, when the end primer disappears from the lagging strand, the space remains empty, and there is an uncopied stretch of DNA template abutting it. To combat this, an enzyme called telomerase affixes to the overhanging region and elongates it with a non-coding DNA sequence.

Primase and DNA polymerase act upon this extended region, creating a telomere cap that protects against loss of coding DNA from the lagging strand during multiple replications.

Thus eukaryotic DNA replication ends with two DNA molecules, each with a parental and newly-synthesized strand, numerous origins of replication, and telomeres.

13.6: Replication in Eukaryotes


In eukaryotic cells, DNA replication is highly conserved and tightly regulated. Multiple linear chromosomes must be duplicated with high fidelity before cell division, so there are many proteins that fill specialized roles in the replication process. Replication occurs in three phases: initiation, elongation, and termination, and ends with two complete sets of chromosomes in the nucleus.

Many Proteins Orchestrate Replication at the Origin

Eukaryotic replication follows many of the same principles as prokaryotic DNA replication, but because the genome is much larger and the chromosomes are linear rather than circular, the process requires more proteins and has a few key differences. Replication occurs simultaneously at multiple origins of replication along each chromosome. Initiator proteins recognize and bind to the origin, recruiting helicase to unwind the DNA double helix. At each origin, two replication forks form. Primase then adds short RNA primers to the single strands of DNA, which serve as a starting point for DNA polymerase to bind and begin copying the sequence. DNA can only be synthesized in the 5′ to 3′ direction, so replication of both strands from a single replication fork proceeds in two different directions. The leading strand is synthesized continuously, while the lagging strand is synthesized in short stretches 100-200 base pairs in length, called Okazaki fragments. Once the bulk of replication is complete, RNase enzymes remove the RNA primers and DNA ligase seals the remaining gaps in the new strand.
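The discontinuous synthesis of the lagging strand can be sketched in a few lines of Python. This is a minimal illustration, not a model of the real enzymology: the function names are hypothetical, the default fragment length of 150 is simply a value inside the 100-200 base pair range quoted above, and every fragment is assumed to be the same length except the last.

```python
def okazaki_fragments(template_length, fragment_length=150):
    """Split a lagging-strand template into Okazaki-fragment intervals.

    Positions are counted in the direction of fork movement. Each tuple is
    (start, end) with end exclusive, listed in the order the fork exposes
    the template. The fixed fragment length is an illustrative assumption.
    """
    if template_length < 0 or fragment_length <= 0:
        raise ValueError("lengths must be non-negative and fragment_length positive")
    fragments = []
    start = 0
    while start < template_length:
        end = min(start + fragment_length, template_length)
        fragments.append((start, end))
        start = end
    return fragments


def ligase_joins(fragments):
    """Number of junctions DNA ligase must seal to make one continuous strand."""
    return max(len(fragments) - 1, 0)


if __name__ == "__main__":
    frags = okazaki_fragments(400)
    print(frags)                 # three intervals covering positions 0..400
    print(ligase_joins(frags))   # junctions between neighbouring fragments
```

Each fragment needs its own RNA primer, so the number of primers equals the number of fragments, while the leading strand needs only one.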

Dividing the Work of Replication among Polymerases

The workload of copying DNA in eukaryotes is divided among multiple different types of DNA polymerase enzymes. Major families of DNA polymerases across all organisms are categorized by the similarity of their protein structures and amino acid sequences. The first families to be discovered were termed A, B, C, and X, with families Y and D identified later. Family B polymerases in eukaryotes include Pol α, which works together with primase at the replication fork, and Pol ε and Pol δ, the enzymes that do most of the work of DNA replication on the leading and lagging strands of the template, respectively. Other DNA polymerases are responsible for such tasks as repairing DNA damage, copying mitochondrial and plastid DNA, and filling in gaps in the DNA sequence on the lagging strand after the RNA primers are removed.

Telomeres Protect the Ends of the Chromosomes from Degradation

Because eukaryotic chromosomes are linear, they are susceptible to degradation at the ends. To protect important genetic information from damage, the ends of chromosomes contain many non-coding repeats of highly conserved G-rich DNA: the telomeres. A short single-stranded 3′ overhang at each end of the chromosome interacts with specialized proteins, which stabilizes the chromosome within the nucleus. Because of the manner in which the lagging strand is synthesized, a small amount of the telomeric DNA cannot be replicated with each cell division. As a result, the telomeres gradually get shorter over the course of many cell cycles, and their length can be measured as a marker of cellular aging. Certain populations of cells, such as germ cells and stem cells, express telomerase, an enzyme that lengthens the telomeres, allowing the cell to undergo more cell cycles before the telomeres become critically short.

Garcia-Diaz, Miguel, and Katarzyna Bebenek. “Multiple functions of DNA polymerases.” Critical Reviews in Plant Sciences 26 (2007): 105-122.


MCMs are known to be modified by a variety of covalent attachments including phosphorylation, acetylation, and ubiquitylation. It is likely that additional modifications and the responsible enzymes will be identified in the future, providing additional levels of regulation. These modifications may provide the mechanisms for cells to distinguish between different pools of MCMs in the nucleus and to activate distinct functions of these proteins in vivo.


As described above, the function of at least two kinases is required for S-phase initiation: DDK, which phosphorylates primarily Mcm2 but also other MCM subunits, and CDK, which phosphorylates at least Mcm4 and perhaps other subunits. Other kinases may also be involved. There is evidence for phosphorylation of Mcm4 and other MCMs that is not CDK or DDK dependent (250). Additionally, a recent study suggests that Mcm4 may be a target of the ATR-Chk2 checkpoint kinase pathway in response to replication arrest caused by HU (133). Thus, the MCM proteins are targets of both positive and negative phosphorylation events.

The identity of the phosphatase(s) that dephosphorylates MCMs remains a mystery. We can infer that there is one, because Mcm4 phosphorylation is associated with its inactivation (see, e.g., references 46 and 108), and there is no evidence that the abundant MCMs turn over significantly during the cell cycle. The only evidence for a phosphatase associated with MCM function is the observation that protein phosphatase 2A is required for binding of Cdc45 to the pre-RC (38). Since this is a positive effect, it suggests that there is an inhibitory kinase. However, the identity and substrate(s) of this kinase are unknown.


Ubiquitin is a small peptide that is covalently linked to lysine residues in the target proteins (reviewed in references 277 and 326). Chains of ubiquitin target proteins for degradation by the proteasome. More recent studies have indicated that ubiquitin and the related peptides SUMO and NEDD8 can also modify proteins to regulate them and can affect localization or protein association in addition to protein stability (reviewed in references 222, 277, and 352). Although there is not yet evidence for sumoylation or neddylation, it is likely that the MCMs will be substrates for a broad range of related modifications.

Genetic experiments with budding yeast isolated an allele of UBA1, which encodes a ubiquitin-activating enzyme, as a suppressor of an mcm3 mutant (34). This suggested that Mcm3 may be negatively regulated by ubiquitylation. Subsequent studies revealed that a fraction of wild-type S. cerevisiae Mcm3 is polyubiquitylated in vivo, although the consequences of this are not yet clear. Human Mcm7 is polyubiquitylated by the ubiquitin ligase E6-AP, which acts with the papillomavirus HPV-18 E6 protein to form a virus-specific ubiquitin ligase (170). Data suggest that this targets Mcm7 for turnover by the proteasome. A homotypic binding site for the ligase was identified between residues 633 and 654 of Mcm7, defining an “L2G box,” (S/T)xxxLLG. In vivo studies show that Mcm7 is also polyubiquitylated in the absence of the E6-AP protein, suggesting that it is a substrate for ubiquitylation even in noninfected cells.
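The “L2G box” consensus, (S/T)xxxLLG, can be read as a simple sequence pattern: a serine or threonine, any three residues, then Leu-Leu-Gly. A minimal Python sketch of scanning a protein sequence for it follows. The function name and the toy peptide are hypothetical and chosen only for illustration; the peptide is not the real Mcm7 sequence, and a motif match by itself says nothing about whether a site is functional.

```python
import re

# (S/T)xxxLLG written as a regular expression over one-letter amino acid codes.
# The lookahead lets overlapping occurrences be reported as well.
L2G_BOX = re.compile(r"(?=([ST][A-Z]{3}LLG))")


def find_l2g_boxes(protein_sequence):
    """Return (1-based start position, matched residues) for each L2G box."""
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in L2G_BOX.finditer(sequence)]


if __name__ == "__main__":
    toy_peptide = "MKSAAALLGQ"  # made-up sequence, not taken from any real protein
    print(find_l2g_boxes(toy_peptide))
```

Positions are reported 1-based to match the residue numbering convention used in the text.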

While the steady-state level of bulk MCMs remains fairly constant, as described above, it is possible that a fraction of MCMs are subject to regulated turnover. This could provide one mechanism to define functional pools within this abundant protein family.


Acetylation is now appreciated as a significant modification for many cellular proteins and is not limited to histones (reviewed in reference 165). As described above, the histone acetyltransferase PCAF/Gcn5 is reported to stimulate replication from a viral origin by contributing to the acetylation of polyomavirus T antigen, a viral helicase similar to MCMs (342). This suggests that the interaction of MCMs with other HATs such as Hbo1 may result in MCM acetylation. Therefore, the target of the HATs to which MCMs bind may not be histones but may be MCMs themselves (see “Chromatin remodeling” above).

Mcm3 protein is acetylated in mammalian cells by a protein called MCM3AP (308), which was originally isolated in a two-hybrid screen for Mcm3 interactors (309). The acetylated Mcm3 is associated with the chromatin and is not observed in cells arrested in G2/M, which suggests that acetylation is involved in regulating Mcm3 specifically during G1/S, when it is chromatin bound. Curiously, MCM3AP inhibits DNA replication in a cell-free system, suggesting that it is a negative regulator (307, 308); however, it clearly functions positively by promoting Mcm3 nuclear localization and chromatin binding (307, 309). This paradox has yet to be resolved. MCM3AP is distantly related to the PCAF/Gcn5 family of HATs (308), although it does not have a close homologue in the yeasts. It is a splice variant of a much larger protein called GANP that is found in B cells (1). GANP is associated with DNA primase activity, leading to the suggestion that it is specifically involved in the regulation of B-cell proliferation (174, 175).

Telomerase and Aging

Cells that undergo cell division continue to have their telomeres shortened because most somatic cells do not make telomerase. This essentially means that telomere shortening is associated with aging. With the advent of modern medicine, preventative health care, and healthier lifestyles, the human life span has increased, and there is an increasing demand for people to look younger and have a better quality of life as they grow older.

In 2010, scientists found that telomerase can reverse some age-related conditions in mice. This may have potential in regenerative medicine. Telomerase-deficient mice were used in these studies; these mice have tissue atrophy, stem cell depletion, organ system failure, and impaired tissue injury responses. Telomerase reactivation in these mice caused extension of telomeres, reduced DNA damage, reversed neurodegeneration, and improved the function of the testes, spleen, and intestines. Thus, telomerase reactivation may have potential for treating age-related diseases in humans.

Cancer is characterized by uncontrolled cell division of abnormal cells. The cells accumulate mutations, proliferate uncontrollably, and can migrate to different parts of the body through a process called metastasis. Scientists have observed that cancerous cells have considerably shortened telomeres and that telomerase is active in these cells. Interestingly, only after the telomeres were shortened in the cancer cells did the telomerase become active. If the action of telomerase in these cells can be inhibited by drugs during cancer therapy, then the cancerous cells could potentially be stopped from further division.

Differences between Prokaryotic and Eukaryotic Replication

Property | Prokaryotes | Eukaryotes
Origin of replication | Single | Multiple
Rate of replication | 1000 nucleotides/s | 50 to 100 nucleotides/s
DNA polymerase types | 5 | 14
Telomerase | Not present | Present
RNA primer removal | DNA pol I | RNase H
Strand elongation | DNA pol III | Pol δ, pol ε
Sliding clamp | Sliding clamp | PCNA

Replication Protein A (RPA): The Eukaryotic SSB

Replication protein A (RPA) is a heterotrimeric single-stranded DNA-binding protein that is highly conserved in eukaryotes. RPA plays essential roles in many aspects of nucleic acid metabolism, including DNA replication, nucleotide excision repair, and homologous recombination. In this review, we provide a comprehensive overview of RPA structure and function and highlight the more recent developments in these areas. The last few years have seen major advances in our understanding of the mechanism of RPA binding to DNA, including the structural characterization of the primary DNA-binding domains (DBD) and the identification of two secondary DBDs. Moreover, evidence indicates that RPA utilizes a multistep pathway to bind single-stranded DNA involving a particular molecular polarity of RPA, a mechanism that is apparently used to facilitate origin denaturation. In addition to its mechanistic roles, RPA interacts with many key factors in nucleic acid metabolism, and we discuss the critical nature of many of these interactions to DNA metabolism. RPA is a phosphorylation target for DNA-dependent protein kinase (DNA-PK) and likely the ataxia telangiectasia-mutated gene (ATM) protein kinase, and recent observations are described that suggest that RPA phosphorylation plays a significant modulatory role in the cellular response to DNA damage.

70 DNA Replication in Eukaryotes

By the end of this section, you will be able to do the following:

  • Discuss the similarities and differences between DNA replication in eukaryotes and prokaryotes
  • State the role of telomerase in DNA replication

Eukaryotic genomes are much more complex and larger in size than prokaryotic genomes. Eukaryotes also have a number of different linear chromosomes. The human genome has 3 billion base pairs per haploid set of chromosomes, and 6 billion base pairs are replicated during the S phase of the cell cycle. There are multiple origins of replication on each eukaryotic chromosome; humans can have up to 100,000 origins of replication across the genome. The rate of replication is approximately 100 nucleotides per second, much slower than prokaryotic replication. In yeast, which is a eukaryote, special sequences known as autonomously replicating sequences (ARS) are found on the chromosomes. These are equivalent to the origin of replication in E. coli.
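The figures above explain why multiple origins are necessary. The short Python sketch below does the arithmetic under loudly simplified assumptions: every origin fires at the same moment, each origin produces two forks, forks are evenly spaced, and every fork moves at a constant 100 nucleotides per second. The function name and the `forks_per_origin` parameter are illustrative choices, and the result is an idealized lower bound rather than a measured duration of S phase.

```python
def replication_time_seconds(base_pairs, origins, rate_nt_per_s=100, forks_per_origin=2):
    """Idealized time to copy `base_pairs` when all forks run in parallel."""
    if origins <= 0 or rate_nt_per_s <= 0 or forks_per_origin <= 0:
        raise ValueError("origins, rate and forks_per_origin must be positive")
    forks = origins * forks_per_origin
    return base_pairs / (forks * rate_nt_per_s)


if __name__ == "__main__":
    diploid_bp = 6_000_000_000  # 6 billion base pairs replicated in S phase

    # A single fork working alone on the whole genome.
    one_fork = replication_time_seconds(diploid_bp, origins=1, forks_per_origin=1)
    print(f"one fork: {one_fork / 86_400:.0f} days")

    # 100,000 origins, two forks each, all active at once.
    many = replication_time_seconds(diploid_bp, origins=100_000)
    print(f"100,000 origins: {many / 60:.0f} minutes")
```

A single fork would need on the order of two years, whereas 200,000 forks working in parallel would need only minutes in this idealized picture.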

Eukaryotes have many more DNA polymerases than prokaryotes: 14 are known, of which five have major roles during replication and have been well studied. They are known as pol α, pol β, pol γ, pol δ, and pol ε.

The essential steps of replication are the same as in prokaryotes. Before replication can start, the DNA has to be made available as a template. Eukaryotic DNA is bound to basic proteins known as histones to form structures called nucleosomes. Histones must be removed and then replaced during the replication process, which helps to account for the lower replication rate in eukaryotes. The chromatin (the complex between DNA and proteins) may undergo some chemical modifications, so that the DNA may be able to slide off the proteins or be accessible to the enzymes of the DNA replication machinery. At the origin of replication, a pre-replication complex is made with other initiator proteins. Helicase and other proteins are then recruited to start the replication process.

A helicase using the energy from ATP hydrolysis opens up the DNA helix. Replication forks are formed at each replication origin as the DNA unwinds. The opening of the double helix causes over-winding, or supercoiling, in the DNA ahead of the replication fork. This supercoiling is resolved by the action of topoisomerases. Primers are formed by the enzyme primase, and using the primer, DNA pol can start synthesis. Three major DNA polymerases are then involved: α, δ and ε. DNA pol α adds a short (20 to 30 nucleotides) DNA fragment to the RNA primer on both strands, and then hands off to a second polymerase. While the leading strand is continuously synthesized by the enzyme pol ε, the lagging strand is synthesized by pol δ. A sliding clamp protein known as PCNA (proliferating cell nuclear antigen) holds the DNA pol in place so that it does not slide off the DNA. As pol δ runs into the primer RNA on the lagging strand, it displaces it from the DNA template. The displaced primer RNA is then removed by nucleases such as RNase H and flap endonuclease 1 (FEN1) and replaced with DNA nucleotides. The Okazaki fragments in the lagging strand are joined after the replacement of the RNA primers with DNA. The nicks that remain are sealed by DNA ligase, which forms the phosphodiester bond.

Telomere replication

Unlike prokaryotic chromosomes, eukaryotic chromosomes are linear. As you’ve learned, the enzyme DNA pol can add nucleotides only in the 5′ to 3′ direction. In the leading strand, synthesis continues until the end of the chromosome is reached. On the lagging strand, DNA is synthesized in short stretches, each of which is initiated by a separate primer. When the replication fork reaches the end of the linear chromosome, there is no way to replace the primer on the 5′ end of the lagging strand. The DNA at the ends of the chromosome thus remains unpaired, and over time these ends, called telomeres, may get progressively shorter as cells continue to divide.
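The progressive shortening described above can be illustrated with a small Python sketch. The text does not give a per-division loss or a critical length, so the numbers in the example are placeholder assumptions rather than measured values, and the function name is hypothetical. The model also assumes a constant loss per division and no telomerase activity.

```python
def divisions_until_critical(initial_bp, loss_per_division_bp, critical_bp):
    """Count divisions a telomere can undergo before dropping below critical_bp.

    Assumes a fixed loss per division and no telomerase; both the loss and
    the critical length are caller-supplied, illustrative parameters.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    length = initial_bp
    divisions = 0
    # Keep dividing only while the next division would not cross the threshold.
    while length - loss_per_division_bp >= critical_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


if __name__ == "__main__":
    # Placeholder values chosen only to make the arithmetic easy to follow.
    print(divisions_until_critical(initial_bp=6000, loss_per_division_bp=100, critical_bp=3000))
```

With these placeholder values the loop runs 30 times, since (6000 - 3000) / 100 = 30.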

Telomeres comprise repetitive sequences that code for no particular gene. In humans, a six-base-pair sequence, TTAGGG, is repeated 100 to 1000 times in the telomere regions. In a way, these telomeres protect the genes from getting deleted as cells continue to divide. The telomeres are added to the ends of chromosomes by a separate enzyme, telomerase, whose discovery helped in the understanding of how these repetitive chromosome ends are maintained. The telomerase enzyme contains a catalytic part and a built-in RNA template. It attaches to the end of the chromosome, and DNA nucleotides complementary to the RNA template are added on the 3′ end of the DNA strand. Once the 3′ end of the lagging strand template is sufficiently elongated, DNA polymerase can add the nucleotides complementary to the ends of the chromosomes. Thus, the ends of the chromosomes are replicated.
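The template-directed extension described above can be sketched in Python. This is a deliberately simplified picture: the six-nucleotide RNA string used here is an illustrative stand-in chosen so that its DNA copy is the human repeat TTAGGG, not the full template region of the real enzyme, and the function names are hypothetical. Alignment and translocation of the enzyme between rounds are ignored.

```python
# Base-pairing rules for copying an RNA template into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_copy_of_rna(rna_5to3):
    """Return the DNA strand (5'->3') complementary to an RNA template (5'->3').

    The strands are antiparallel, so the template is read in reverse.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3.upper()))


def extend_overhang(overhang_5to3, rna_template_5to3, rounds):
    """Append `rounds` copies of the templated repeat to the 3' end of the overhang."""
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return overhang_5to3 + dna_copy_of_rna(rna_template_5to3) * rounds


if __name__ == "__main__":
    template = "CCCUAA"  # simplified, illustrative template (assumption)
    print(dna_copy_of_rna(template))
    print(extend_overhang("TTAGGG", template, rounds=2))
```

Each round adds one six-nucleotide repeat to the 3′ end, which gives DNA polymerase enough template to fill in the complementary strand.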

Telomerase is typically active in germ cells and adult stem cells. It is not active in adult somatic cells. For their discovery of telomerase and its action, Elizabeth Blackburn, Carol W. Greider, and Jack W. Szostak received the Nobel Prize in Physiology or Medicine in 2009.

Section Summary

Replication in eukaryotes starts at multiple origins of replication. The mechanism is quite similar to that in prokaryotes. A primer is required to initiate synthesis, which is then extended by DNA polymerase as it adds nucleotides one by one to the growing chain. The leading strand is synthesized continuously, whereas the lagging strand is synthesized in short stretches called Okazaki fragments. The RNA primers are replaced with DNA nucleotides, and the DNA Okazaki fragments are linked into one continuous strand by DNA ligase. The ends of the chromosomes pose a problem as the primer RNA at the 5′ ends of the DNA cannot be replaced with DNA, and the chromosome is progressively shortened. Telomerase, an enzyme with an inbuilt RNA template, extends the ends by copying the RNA template and extending one strand of the chromosome. DNA polymerase can then fill in the complementary DNA strand using the regular replication enzymes. In this way, the ends of the chromosomes are protected.
