Protein folds: relation to splicing and post-translational modification?

Is the secondary structure pattern of protein folds related in any way to alternative splicing and post-translational modification?

Protein Processing and Folding

All newly-synthesized polypeptides have to be folded into their three-dimensional structures to be functional. Many proteins have to reach destinations other than the cytosol, the site where protein synthesis occurs. In addition, a majority of proteins undergo post-translational modification in response to a wide variety of cellular signals. Therefore, understanding the mechanism and regulation of protein folding, protein translocation, and protein processing is an integral part of modern molecular and cell biology. In addition, errors in these processes cause diseases ranging from Alzheimer's to diabetes. Protein folding and processing is one of the major research focuses in our department. Faculty in this area engage in a number of research topics including the unfolded protein response, the human blood clotting system, the structure and function of molecular chaperones, the heat shock response, protein misfolding in aging and disease, yeast pheromone processing, protein transport in the secretory pathway, protein targeting, organelle biogenesis, and protein design and engineering.

The Baldridge lab studies ERAD (Endoplasmic Reticulum Associated Degradation) of misfolded proteins.

The Fuller lab is interested in how Vps13p and other proteins help localize protein processing enzymes to the cellular compartments where they function.

The role of the ESCRT (Endosomal Sorting Complexes Required for Transport) machinery in membrane repair is under investigation in the Hanson lab.

Co- and Post-Translational Protein Folding in the ER

The biophysical rules that govern folding of small, single-domain proteins in dilute solutions are now quite well understood. The mechanisms underlying co-translational folding of multidomain and membrane-spanning proteins in complex cellular environments are often less clear. The endoplasmic reticulum (ER) produces a plethora of membrane and secretory proteins, which must fold and assemble correctly before ER exit - if these processes fail, misfolded species accumulate in the ER or are degraded. The ER differs from other cellular organelles in terms of the physicochemical environment and the variety of ER-specific protein modifications. Here, we review chaperone-assisted co- and post-translational folding and assembly in the ER and underline the influence of protein modifications on these processes. We emphasize how method development has helped advance the field by allowing researchers to monitor the progression of folding as it occurs inside living cells, while at the same time probing the intricate relationship between protein modifications during folding.

Keywords: N-glycosylation; chaperones; disulfide-bond formation; endoplasmic reticulum; folding enzymes; protein folding.

Post-Translational Modifications to Regulate Protein Function

Hening Lin, Jintang Du, and Hong Jiang, Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York

Protein post-translational modifications (PTM) are very important in regulating protein function and in controlling numerous biological processes. Here a brief review of commonly found enzyme-catalyzed PTM is given. These PTM include modifications that occur on protein side chains and those that involve protein backbones. The introduction of different PTM is followed by a summary of the molecular basis for the regulation of protein function by PTM. The focus is then given to a few major PTM that play important roles in eukaryotes, such as phosphorylation, methylation, acetylation, glycosylation, ubiquitylation, and proteolysis. For each modification, a description is given of the residues modified, the enzymatic reaction mechanisms, the major known biological functions, and its relevance to human diseases. At the end, we discuss challenges in identifying new pathways regulated by known PTM and in discovering new PTM.

The central dogma of molecular biology, that DNA is transcribed to mRNA, which is then translated to proteins, implies the importance of proteins. After all, it is the proteins that carry out most of the biological functions of a cell. Thus controlling transcription and translation is very important, as these processes ultimately control what proteins are synthesized in cells and thus control the properties of cells. However, one should not overlook what happens to proteins after they are synthesized. Many chemical modifications can occur to proteins after translation. Collectively, these modifications are called post-translational modifications (PTM). PTM are very important in regulating protein function, which is reflected by the large number of genes devoted to catalyzing PTM. For example, in the human genome (with fewer than 30,000 genes total), more than 500 kinases catalyze protein phosphorylation (1), and more than 500 proteases catalyze the hydrolytic cleavage of proteins (2). Deregulation in PTM is the cause of various human diseases, as will be explained later in specific PTM sections. Here, a brief review is given on different types of PTM and on how PTM regulate protein function. Some basic principles will be highlighted so that readers who are unfamiliar with PTM can have a quick but comprehensive understanding of PTM. The recent book on PTM by Professor Walsh from Harvard Medical School provides a more complete description of PTM (3). Where appropriate, references on specific PTM will also be given in different sections for additional information. The abbreviations used are cataloged in Table 1 to help readers who are not familiar with the biological language.

Types of post-translational modifications

PTM can be enzyme-catalyzed and thus controlled carefully, or they can be nonenzymatic with less control. For example, protein glycation during hyperglycemia is a nonenzymatic PTM that accounts for some symptoms of diabetes (4). Protein nitrosylation on Cys residues is another nonenzymatic PTM that can affect protein function (5). Coordination by metal ions can also be considered a PTM. For many proteins, metal binding is crucial for maintaining the correct structure or the enzymatic activity (6). Here, the focus will be given to enzyme-catalyzed PTM. Figures 1 and 2 show many commonly found enzyme-catalyzed PTM (3).

As can be observed from Fig. 1, most PTM happen to protein side chains. Typically, the side chains involved are nucleophilic, such as Cys (palmitoylation, isoprenylation, disulfide bond formation, ADP-ribosylation), Lys (acetylation, methylation, ubiquitinylation), Arg (methylation, ADP-ribosylation), Asp/Glu [methylation, poly(ADP-ribosyl)ation], Ser/Thr (phosphorylation, O-glycosylation), and Tyr (phosphorylation). Weaker nucleophiles are also used, such as the side chain amide nitrogen in Asn (in N-glycosylation), the C-2 position of Trp (in C-glycosylation), and the C-2 position of His (in diphthamide). In amidation reactions catalyzed by transglutaminases and polyglutamylation/polyglycylation reactions that happen to Glu residues, the ε-NH2 from Lys or α-NH2 from Glu/Gly acts as the nucleophile, whereas the side chain of Gln or Glu serves as the electrophile. In addition, several amino acid side chains can be oxidized, such as Pro, Lys, Asn, Tyr, Trp, and Cys, to give oxidized amino acids.
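The residue-to-modification pairings listed above can be held in a small lookup table. The following Python sketch only transcribes the pairings named in this paragraph; the names `SIDE_CHAIN_PTMS` and `residues_modified_by` are illustrative and do not come from the source, and the table is not meant to be exhaustive.

```python
# Residue -> side-chain PTMs, transcribed from the paragraph above.
# The dictionary and helper names are illustrative assumptions.
SIDE_CHAIN_PTMS = {
    "Cys": ["palmitoylation", "isoprenylation",
            "disulfide bond formation", "ADP-ribosylation"],
    "Lys": ["acetylation", "methylation", "ubiquitinylation"],
    "Arg": ["methylation", "ADP-ribosylation"],
    "Asp": ["methylation", "poly(ADP-ribosyl)ation"],
    "Glu": ["methylation", "poly(ADP-ribosyl)ation"],
    "Ser": ["phosphorylation", "O-glycosylation"],
    "Thr": ["phosphorylation", "O-glycosylation"],
    "Tyr": ["phosphorylation"],
    "Asn": ["N-glycosylation"],
    "Trp": ["C-glycosylation"],
    "His": ["diphthamide formation"],
}


def residues_modified_by(ptm):
    """Return the residues whose listed PTMs include `ptm`, sorted by name."""
    return sorted(res for res, ptms in SIDE_CHAIN_PTMS.items() if ptm in ptms)


print(residues_modified_by("phosphorylation"))  # ['Ser', 'Thr', 'Tyr']
print(residues_modified_by("methylation"))      # ['Arg', 'Asp', 'Glu', 'Lys']
```

Inverting the table this way makes it easy to see which modifications are shared by several residues (methylation) and which are restricted to one (N-glycosylation on Asn).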

A few PTM reactions also involve changes in the protein backbone. These reactions include the hydrolytic cleavage of the peptide backbone by proteases, the anchoring of proteins to glycosylphosphatidylinositol (GPI) or cholesterol, and the C-terminal amide formation by oxidative cleavage of glycine residues. Some PTM involve changes in both the side chain and the main chain, such as the formation of the 4-methylidene-imidazole-5-one (MIO) prosthetic group in deaminases and aminomutases, the formation of the fluorophore in GFP (green fluorescent protein), and the formation of pyruvamide in decarboxylases (Fig. 2).

Figure 1. Major enzyme-catalyzed PTM that modify protein side chains.

Figure 2. A few PTM that involve protein backbone.

Table 1. List of abbreviations

ABL: A tyrosine kinase encoded by the abl (Abelson) gene; the fusion protein ABL-BCR is involved in inhibition of apoptosis in chronic myelogenous leukemia cells

AC: Adenylate cyclase, converts ATP to cyclic AMP

ACP: Acyl carrier protein, found in fatty acid synthases and polyketide synthases, functions to carry the elongating fatty acyl chain

ADAM: A disintegrin and metalloprotease, a family of proteases that hydrolyze off extracellular portions of transmembrane proteins

Apaf-1: Apoptotic protease activating factor-1, a cytosolic protein involved in cell death or apoptosis, interacts with cytochrome c to activate caspase 9

AT: Acyltransferase, found in fatty acid synthases and polyketide synthases, adds a malonyl group to the holo form of the ACP domain

Bcl-2: Named from B-cell lymphoma 2, an antiapoptotic protein

BCR: A protein encoded by the breakpoint cluster region gene, has serine/threonine kinase activity. Fusion with ABL protein causes leukemia

cAMP: 3’-5’-cyclic adenosine monophosphate

CARD: Caspase recruitment domain, mediates the formation of larger protein complexes via direct interactions between individual CARDs, involved in the regulation of caspase activation and apoptosis

CARM1: Coactivator-associated arginine(R) methyltransferase 1, methylates Arg17 and Arg26 residues on histone H3

Cbl-b: Ubiquitously expressed homolog of Cbl, a mammalian protein involved in cell signaling and protein ubiquitination, named after Casitas B-lineage Lymphoma

CBP: CREB binding protein, a transcriptional co-activating protein

CD2: Cellular differentiation marker 2, a cell adhesion protein found on the surface of T cells and natural killer cells

CDK: Cell-division kinases, serine/threonine kinases, activated by association with cyclins and involved in regulation of the cell cycle, transcription, and mRNA processing

CHD1: Chromodomain helicase DNA-binding protein 1, interacts with methylated Lys4 on histone H3

CLOCK: A protein named from the circadian locomotor output cycles kaput gene, regulating circadian rhythm

CML: Chronic myelogenous leukemia, a form of leukemia characterized by the increased and unregulated growth of predominantly myeloid cells in the bone marrow and the accumulation of these cells in the blood

CREB: cAMP response element binding proteins; as transcription factors, they bind to certain sequences called cAMP response elements (CRE) in DNA and thereby increase or decrease the transcription of certain genes

Cyt c: Cytochrome c, a small heme protein associated with the inner membrane of the mitochondria and released in response to pro-apoptotic stimuli

DED: Death effector domain, a protein interaction domain found in inactive procaspases and proteins that regulate caspase activation in the apoptosis cascade

DH: Dehydratase, found in fatty acid synthases and polyketide synthases, dehydrates the β-OH of acyl thioester

DNase: Deoxyribonuclease, catalyzes the hydrolytic cleavage of phosphodiester linkages in the DNA backbone

ER: Enoylreductase, found in fatty acid synthases and polyketide synthases, reduces the enoyl of enoyl thioester to the saturated thioester

ERK: Extracellular signal-regulated kinase, activates many transcription factors and some downstream protein kinases, involved in functions including the regulation of meiosis, mitosis, and postmitotic functions in differentiated cells

One of the serine proteases of the coagulation system

FAD: Flavin adenine dinucleotide

FADD: Fas-associated protein with death domain, connects the Fas-receptor and other death receptors to caspase-8 through its death domain to form the death inducing signaling complex during apoptosis

FHA: Forkhead-associated domain, a phosphospecific protein-protein interaction motif involved in checkpoint control of the cell cycle

Gcn5: A yeast transcriptional adaptor that has histone acetyltransferase activity

GFP: Green fluorescent protein

GPCR: G protein-coupled receptor, a transmembrane receptor that senses molecules outside the cell and activates inside signal transduction pathways and cellular responses

GPI: Glycosylphosphatidylinositol, a glycolipid that can be attached to the C-terminus of a protein during post-translational modification

GPK: Glycogen phosphorylase kinase, a serine/threonine-specific protein kinase which activates glycogen phosphorylase by phosphorylation

Grb2: Growth factor receptor-bound protein 2, an adaptor protein involved in signal transduction/cell communication

HDAC: Histone deacetylases, remove acetyl groups from ε-N-acetyl lysine residues on histones

hDOT1L: Human DOT1-like protein, methylates histone H3 at Lys79. (DOT1: Yeast disruptor of telomeric silencing-1)

HECT: Homologous to E6-AP C terminus, mediates E2 binding and ubiquitination

HIF: Hypoxia inducible factor, a transcription factor that responds to changes in available oxygen in the cellular environment, specifically to decreases in oxygen or hypoxia

hnRNP: Heterogeneous nuclear ribonucleoproteins, which form complexes with pre-mRNA and mRNA and shuttle between the nucleus and the cytoplasm

HP1: Heterochromatin protein 1, binds to heterochromatin and interacts with numerous partner proteins to organize the higher-order structure of heterochromatin

IgG: Immunoglobulin G, one antibody isotype

IKK: Inhibitor of NF-κB kinase, which phosphorylates the inhibitor of NF-κB for proteasomal degradation to release NF-κB dimers to translocate to the nucleus and activate transcription of target genes

IP7: Inositol pyrophosphate, a proposed physiologic phosphate donor

JHDM: JmjC domain-containing histone demethylase

Jumonji domain-containing, a novel demethylase signature motif

KR: Ketoreductase, found in fatty acid synthases and polyketide synthases, reduces the β-ketoacyl thioester

KS: Ketosynthase, found in fatty acid synthases and polyketide synthases, carries out the C-C bond-forming chain elongation step

LSD1: Lysine-specific demethylase 1, demethylates histone H3 at lysine 9

MAPK: Mitogen-activated protein kinase, serine/threonine-specific protein kinases that respond to extracellular stimuli (mitogens) and regulate various cellular activities, such as gene expression, mitosis, differentiation, and cell survival/apoptosis

MEK: MAPK/ERK kinase, activates a MAP kinase or ERK through phosphorylation

MOZ: Monocytic leukemia zinc finger protein, a histone acetyltransferase implicated in leukemogenic and other tumorigenic processes, regulates expression of genes required for proliferation and repopulation potential of stem cells in the hematopoietic compartment

NAD: Nicotinamide adenine dinucleotide

Nerve growth factor P1, a secreted protein which induces the differentiation and survival of particular target neurons, belonging to neurotrophins protein family

PAD: Protein Arg deiminases, hydrolyze the guanidine side chain of Arg residues to citrulline residues in proteins

PARP-1: Poly(ADP-ribose) polymerase-1, catalyzes the transfer of poly ADP-ribose to substrate proteins by using NAD as substrate, involved in cellular response to DNA damage and DNA metabolism

PKA: Protein kinase A, a family of kinases whose activity is dependent on the level of cyclic AMP, involved in the regulation of glycogen, sugar, and lipid metabolism

PRMT: Protein Arg(R) methyltransferase, catalyzes the transfer of a methyl group from S-adenosylmethionine to the guanidino nitrogen atoms of arginine residues

RAIDD: RIP-associated ICH-1/CED-3 homologous protein with a death domain, functions as an adaptor in recruiting the death protease ICH-1 to the TNFR-1 signaling complex (ICH: Ice and ced-3 homolog; TNFR: tumor necrosis factor receptor)

RING: Really interesting new gene. RING proteins are components of ubiquitin E3 enzyme complexes.

SAHA: Vorinostat, suberoylanilide hydroxamic acid, brand name Zolinza, a member of the class of agents known as histone deacetylase inhibitors, used as a drug for the treatment of cutaneous T cell lymphoma (a type of skin cancer)

SCF: Skp1-Cullin-F Box, a multi-protein complex catalyzing the ubiquitylation of proteins destined for proteasomal degradation

SET: Suppressor of variegation-Enhancer of zeste-Trithorax. SET domains have methyltransferase activity.

A novel human SET domain-containing protein, which specifically methylates H4 at Lys20

A novel human SET domain-containing protein, which specifically methylates H3 at Lys4

SH2: Src homology 2, a phosphotyrosine-recognition protein domain of about 100 amino acid residues first identified as a conserved sequence region among the oncoproteins Src and Fps

SMAD: Protein homologs of both the Drosophila protein mothers against decapentaplegic (MAD) and the C. elegans protein SMA; signal-activated transcription factors regulated by the TGF-β superfamily

SMYD: Proteins containing SET and MYND domain. MYND encoded mynd (myosin) gene, which have histone methyltransferase activity

snRNP: Small nuclear ribonucleoproteins, which combine with pre-mRNA and various proteins to form spliceosomes that remove introns from pre-mRNA

Sos: Son of sevenless, a guanine nucleotide exchange factor that activates Ras

STAT: Signal transducers and activators of transcription, proteins which are involved in the development and function of the immune system

SUMO: Small ubiquitin-like modifier, a family of small proteins that are covalently attached to and detached from other proteins in cells to modify their functions

TAF10: TATA box-binding protein-associated factor 10, a component of the general transcription factor complex TFIID and the TATA box-binding protein (TBP)-free TAF-containing complex

TIF2: Transcription intermediary factor 2, a transcriptional coregulatory protein which contains several nuclear receptor interacting domains and an intrinsic histone acetyltransferase activity

TF: Transcription factor, a protein that binds to specific regions of DNA by DNA binding domains and mediates the transcription from DNA to RNA

TGF-β1: Transforming growth factor β1, a secreted protein that performs many cellular functions, including the control of cell growth, cell proliferation, cell differentiation and apoptosis, belonging to the transforming growth factor beta superfamily of cytokines

TGN: Trans-Golgi network, a part of the Golgi apparatus in cells

TRADD: Tumor necrosis factor receptor-associated protein with death domain, an adapter protein that recruits other proteins to the cytoplasmic TNF (tumor necrosis factor) receptor complex, involved in apoptosis

Ubiquitin activating protein

UBA: Ubiquitin binding associated domain, one class of ubiquitin binding domains

UBD: Ubiquitin binding domain, which binds mono- or polyubiquitin

USP: Ubiquitin-specific protease, hydrolyzes both linear and branched Ub modifications

As with all other chemical species, protein structure determines protein function. PTM can regulate protein function because they can change protein structure. The structure change introduced by PTM can be local and small. For example, methylation of Lys residues makes the side chain more hydrophobic without changing protein backbone conformation significantly [at least based on crystal structures in which methylated and unmethylated histone peptides are bound by another protein (7)], whereas phosphorylation can change the backbone conformation within a limited region of a protein by charge-pairing with nearby Arg residues or by interacting with main chain NH and helical dipole (8). In contrast, some PTM can alter protein overall structure more dramatically, such as the proteolytic cleavage of proteins into smaller fragments, or the addition of protein tags like ubiquitin. These structure changes, small or big, are the basis for the biological functions of different PTM and typically lead to one or more of the consequences described below.

Changing protein structure to turn on/off catalytic activity of enzymes

The best-known PTM that is widely used to regulate enzymatic activity is phosphorylation. Phosphorylation regulates the activity of many enzymes by different mechanisms. For example, glycogen phosphorylase is activated allosterically by phosphorylation at Ser14, whereas Escherichia coli isocitrate dehydrogenase is inhibited by phosphorylation because substrate access to the active site is blocked (9). A particularly important catalytic activity regulated by phosphorylation is protein kinase activity. Most protein kinases are activated by phosphorylation of Thr/Tyr residue(s) in the activation segment. The structural changes induced by phosphorylation, which are illustrated in Fig. 3 with ERK (extracellular signal-regulated kinase), convert the inactive kinases to active kinases (8). The regulation of protein kinase activity by phosphorylation bears enormous biological significance because protein phosphorylation is important in signal transduction, and the control of downstream kinase activity via phosphorylation by an upstream kinase is one major method to propagate signals to downstream partners, as will be elaborated later.

Proteolysis is another way to control enzymatic activity, although unlike phosphorylation, the change in activity is irreversible. Many proteases are synthesized as inactive precursors (zymogens) that have to be cleaved by proteolysis to become active. These precursors include proteases that are secreted into digestive tracts or lysosomes, the catalytically active β subunits in the eukaryotic 20S proteasome that are activated by self-cleavage (10), and the effector caspases involved in apoptosis that are activated by initiator caspase-mediated cleavage (11).

Figure 3. Structure of ERK2 in both unphosphorylated (inactive) and phosphorylated (active) states. (a) ERK2 in the unphosphorylated state (figure made using PDB 1ERK); residues Thr183 and Tyr185 in the activation segment are labeled. (b) ERK2 in the phosphorylated state (ERK2-P2, figure made using PDB 2ERK); the two phosphorylated residues, pThr183 and pTyr185, are labeled. (c) Superposition of ERK2 and ERK2-P2.

Changing protein structure to create or to mask recognition motifs

Many PTM exert their biological functions by creating recognition motifs to recruit binding partners (12) or by masking recognition motifs to disrupt existing interactions. Phosphorylated Ser/Thr residues can be recognized by proteins that contain 14-3-3 domains, FHA (forkhead-associated) domains, SMAD [protein homologs of both the Drosophila protein mothers against decapentaplegic (MAD) and the Caenorhabditis elegans protein SMA] domains, and several other domains (13). Phosphorylated Tyr residues can recruit proteins that contain SH2 (Src homology 2) domains and PTB (phosphotyrosine binding) domains (14). Acetyl Lys residues can be recognized by proteins with bromodomains (15, 16), and methylated Lys residues can be recognized by proteins with chromodomains and Tudor domains (17). The ubiquitin and ubiquitin-like protein tags can also be recognized by various protein domains that mediate the biological function of modification with these protein tags (18, 19). The structures of a few domains dedicated to recognition of post-translationally modified residues are shown in Fig. 4. Typically, domains that recognize post-translationally modified residues are specific in that they recognize not only the modified residue but also the local structure in which the residue resides. The specific recognition of PTM in different contexts is key to understanding many biological consequences of PTM, as will be explained in more detail in particular PTM sections later.

In addition to creating recognition motifs to recruit proteins, a few PTM can also increase interaction with other species, such as the lipid bilayer of different cellular membranes. These modifications include the formation of GPI-anchored proteins (20), protein myristoylation on the α-amino group of the N-terminal Gly (21), protein C-terminal prenylation on Cys residues (22), and protein palmitoylation on Cys residues that are close to the membrane surface (23). These lipid modifications occur on many signaling proteins, which include G protein-coupled receptors and small G proteins, and they play important roles in signal transduction and membrane trafficking (24).

Figure 4. Structures of a few dedicated domains that recognize post-translationally modified residues. (a) SH2 domain of v-Src in complex with pTyr peptide (pTyr-Val-Pro-Met-Leu). Residues Arg12, Arg32, Ser34, Thr36, and Lys60 from the SH2 domain interact with pTyr (figure made using PDB 1SHA). (b) Bromodomain of yeast histone acetyltransferase Gcn5 in complex with AcLys peptide (histone H4 residues 15-29, AK(Ac)RHRKILRNSIQGI). Bromodomain residues Pro351, Gln354, Tyr364, Met372, Val399, and Asn407 interact with AcLys (some of the interactions are mediated by water molecules; figure made using PDB 1E6I). (c) Chromodomain of HP1 in complex with histone H3 Me3Lys9 (figure made using PDB 1KNE). Chromodomain residues Tyr24, Trp45, Tyr48, and Glu52 bind Me3Lys. (d) UBA domain of Cbl-b in complex with ubiquitin (figure made using PDB 2OOB). UBA domain residues Asp933, Ala937, Met940, Phe946, and Lys950 interact with ubiquitin residues Leu8, Ile44, Ala46, Gly47, Gln49, His68, and Val70. UBA: ubiquitin binding associated.

Adding functional groups to allow catalysis

Typically, proteins are formed with the most common 20 amino acids, which only offer a limited number of choices of functional groups for catalyzing different reactions. The limit in the number of functional groups is complemented by the use of various coenzymes or cofactors, many of which are attached covalently to the corresponding enzymes. One class of PTM with this function is the addition of “swinging arm” prosthetic groups (biotin, phosphopantetheine, and lipoic acid) to proteins (25). Biotin is used as a carrier of CO2 in carboxylation reactions, and the disulfide bond in the lipoyl group is used as an electron carrier and acyl carrier in 2-keto acid dehydrogenases. The phosphopantetheine group provides a thiolate as the carrier of acyl chains and is used in fatty acid synthases, polyketide synthases, and nonribosomal peptide synthases (26). Although a thiolate side chain can also be provided by Cys, the longer phosphopantetheine can shuttle the acyl chains to different catalytic domains, which allows multiple reactions to occur in sequence on the acyl chains (Fig. 5). This “swinging arm” catalysis, which is also enabled by biotinylation and lipoylation, cannot be achieved by natural proteinogenic amino acids with shorter side chains.

Another type of PTM provides new functional groups for enzyme catalysis by oxidation of side chains. These include TOPA (2,4,5-trihydroxyphenylalanine) quinone in amine oxidases (Fig. 6), tryptophan tryptophanyl quinone in methylamine dehydrogenase (27), and formylglycine in sulfatases (28). Main chain modifications can also generate prosthetic groups for enzyme catalysis, such as the MIO group in His/Phe ammonia-lyase (29, 30) and Tyr aminomutases (31), and the pyruvoyl group in decarboxylases (Fig. 6) (32). The formation of these cofactors by PTM extends the catalytic power of enzymes greatly, which enables them to catalyze chemistry that is difficult with just the side chains of the 20 amino acids commonly found in proteins.

Figure 5. Fatty acid biosynthesis catalyzed by fatty acid synthases. The growing acyl chain is tethered to the phosphopantetheinylated ACP domain, which enables it to undergo cycles of condensation, ketone reduction, dehydration, and enoyl reduction catalyzed by different domains. AT, acyltransferase; ACP, acyl-carrier protein; KS, ketosynthase; KR, ketoreductase; DH, dehydratase; ER, enoylreductase.
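The repeating cycle named in the Fig. 5 caption can be sketched as a loop over the four catalytic steps. This is only a rough illustration: the text does not state chain lengths, so the two-carbon gain per condensation and the C2 starter and C16 target used as defaults are assumptions, and `count_cycles` is a hypothetical helper name.

```python
# One elongation cycle on the ACP-tethered chain, in the order given in Fig. 5.
CYCLE_STEPS = [
    ("KS", "condensation"),
    ("KR", "ketone reduction"),
    ("DH", "dehydration"),
    ("ER", "enoyl reduction"),
]


def count_cycles(start_carbons=2, target_carbons=16):
    """Count elongation cycles needed to grow the chain to target_carbons.

    Assumption (not stated in the text): each condensation adds two carbons.
    """
    if target_carbons < start_carbons or (target_carbons - start_carbons) % 2:
        raise ValueError("target must be reachable in two-carbon steps")
    chain, cycles = start_carbons, 0
    while chain < target_carbons:
        for domain, step in CYCLE_STEPS:
            if step == "condensation":
                chain += 2  # only the KS step changes the chain length
        cycles += 1
    return cycles


print(count_cycles())      # 7 cycles from a C2 starter to a C16 chain
print(count_cycles(2, 4))  # 1
```

The other three steps leave the carbon count unchanged; they only adjust the oxidation state of the newly added unit before the next condensation.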

Figure 6. Post-translationally generated cofactors provide functional groups to allow catalysis. The mechanisms of TOPA quinone in amine oxidases, MIO in deaminases, and pyruvamide in decarboxylases are shown.

Locking proteins into the correct structures or increasing protein stability

The major type of PTM that has this function is protein disulfide bond formation (33). Disulfide bonds are more stable thermodynamically than the reduced thiols in an oxidizing environment. In eukaryotes, proteins that go through the secretory pathway start to form disulfide bonds once they are translocated into the endoplasmic reticulum (ER) lumen, which is an oxidizing environment. These disulfide bonds help to stabilize the desired protein structure by locking the protein in a certain conformation, and perhaps assist protein folding as well. Many secreted proteins later undergo proteolysis in the Golgi to give smaller fragments (see the proteolysis section below). In this case, disulfide bonds also serve to link the fragments covalently to maintain a certain structure. One textbook example is insulin, which is produced as a single peptide chain that later undergoes several proteolysis steps, and the mature insulin consists of two chains connected via two disulfide bonds (Fig. 7) (34). The light and heavy chains of antibodies are also connected by disulfide bonds. Another PTM that can increase protein stability is glycosylation. For example, N-glycosylation of erythropoietin has been found to increase its in vivo lifetime (35), probably because the carbohydrate modifications block the action of tissue proteases.

Figure 7. Maturation of insulin. Insulin is synthesized as preproinsulin that contains an N-terminal signal sequence. After translocating into the ER, the signal sequence is cleaved off by the signal peptidase and the resulting proinsulin folds into a stable conformation. Three disulfide bonds are formed between cysteine side chains. The connecting sequence (Chain C) is cleaved off in the Golgi by proprotein convertases to form the mature and active insulin molecule, which is then secreted.
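The two cleavage steps in the Fig. 7 caption can be written as a toy pipeline over labeled segments. The sketch below uses segment labels rather than real sequence, and the N- to C-terminal order signal-B-C-A is an assumption taken from general knowledge of preproinsulin, not from the text above; the function names are illustrative.

```python
# Segment labels only; no real sequence data is used here.
# Assumed N- to C-terminal order: signal sequence, B chain, C peptide, A chain.
PREPROINSULIN = ["signal", "B", "C", "A"]


def signal_peptidase(segments):
    """ER step: remove the N-terminal signal sequence."""
    return [s for s in segments if s != "signal"]


def proprotein_convertase(segments):
    """Golgi step: excise the connecting C segment, leaving two chains."""
    return [s for s in segments if s != "C"]


proinsulin = signal_peptidase(PREPROINSULIN)
mature = proprotein_convertase(proinsulin)
print(proinsulin)  # ['B', 'C', 'A']
print(mature)      # ['B', 'A']
```

In the real protein the disulfide bonds form on proinsulin before the C segment is removed, which is why the two remaining chains stay linked after cleavage; the sketch does not model the bonds themselves.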

Exploration of major PTM

In this section, a few major PTM will be explored in more detail. For each PTM discussed, a brief introduction to the PTM reaction and the enzymes catalyzing the reaction will be given. A few biological processes that involve the PTM will be explained to demonstrate the important function of the PTM in biology.

Protein phosphorylation typically occurs on Ser, Thr, and Tyr residues (Fig. 1), although His and Asp residues can also be phosphorylated, as in bacterial two-component signal transduction systems. The universal phosphate donor is adenosine triphosphate (ATP, Fig. 8), and the reaction is catalyzed by more than 500 kinases in humans. Many kinases are Ser/Thr specific, some are Tyr specific, whereas some have dual specificity. It was reported that inositol pyrophosphate (IP7) can also serve as a phosphate donor in protein phosphorylation (36). However, the reaction is not enzyme catalyzed and the physiologic relevance is not proven yet.
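As a rough illustration of the residue preference described above, the following sketch lists every Ser, Thr, or Tyr position in a sequence as a candidate site. The sequence is made up, and `candidate_phosphosites` is a hypothetical helper; the presence of one of these residues does not mean it is phosphorylated in a cell, since each kinase also depends on the surrounding context.

```python
def candidate_phosphosites(sequence):
    """Return (position, residue) for every Ser/Thr/Tyr, using 1-based positions."""
    return [
        (i, aa)
        for i, aa in enumerate(sequence.upper(), start=1)
        if aa in "STY"
    ]


toy = "MASTKYLG"  # made-up one-letter sequence, for illustration only
print(candidate_phosphosites(toy))  # [(3, 'S'), (4, 'T'), (6, 'Y')]
```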

The large number of protein kinases in the human genome reflects that this PTM occurs widely and regulates numerous biological processes. The best understood function is signal transduction, because phosphorylation of proteins can turn catalytic activity on or off or create recognition motifs to recruit other protein partners, thus allowing signals to propagate. In accord with its role in signal transduction, protein phosphorylation is reversible so that the signaling process can be terminated as needed. The removal of the phosphate group is catalyzed by phosphatases (Fig. 8).

Figure 8. Kinase-catalyzed phosphorylation and phosphatase-catalyzed dephosphorylation reactions. (a) Catalytic mechanism of protein kinases. (b) Catalytic mechanism of bimetallic pSer/pThr or dual-specificity protein phosphatases. (c) Catalytic mechanism of pTyr phosphatases.

Two signaling processes will be discussed here to illustrate how protein phosphorylation can play a critical role in cell signaling. A more detailed description of these two signaling processes can be found in the Molecular Cell Biology textbook by Lodish et al. (34). The first one, which is shown in Fig. 9, involves protein kinase A (PKA), which can be activated by cyclic AMP (cAMP) (37). PKA at resting state exists as an inactive tetramer that consists of two copies of a regulatory subunit and two copies of the catalytic subunit. Hormones that signal through G protein-coupled receptors can activate the trimeric G protein, which in turn can activate an effector enzyme, adenylate cyclase (38). Adenylate cyclase catalyzes the formation of cAMP from ATP (39), which results in an increase in cAMP concentration. Binding of cAMP to the regulatory subunits of PKA dissociates the inactive tetramer, which releases the catalytic subunit of PKA. The catalytic subunit can then be activated by phosphorylation at the activation loop. Activated PKA can phosphorylate many different substrates and produce both short-term and long-term effects. Short-term effects come from the change in the catalytic activities of substrate proteins on phosphorylation by PKA. The substrates of PKA include proteins involved in glycogen synthesis and degradation, such as glycogen phosphorylase kinase and glycogen synthase (40). Phosphorylation of these proteins by PKA leads to activation of glycogen degradation and inhibition of glycogen synthesis. Long-term effects come from changes in gene transcription. PKA can affect transcription by phosphorylating CREB (cAMP response element binding proteins) and other transcription factors (41). On phosphorylation, CREB can bind to specific regions of the chromosomal DNA, and it can recruit the basal transcription machinery via CBP (CREB binding protein)/P300 to activate the transcription of certain genes.

Figure 9. The signaling process that involves G protein-coupled receptors (GPCR) and PKA. (1) Binding of hormone produces a conformational change in the GPCR; (2) the GPCR binds to the Gs protein; (3) GDP bound to Gs is replaced by GTP, and the β and γ subunits of Gs dissociate from the α subunit; (4) the Gsα subunit binds to adenylate cyclase (AC), which activates the synthesis of cAMP (4a); the hormone tends to dissociate, and hydrolysis of GTP to GDP causes Gsα to dissociate from adenylate cyclase and bind to Gβγ, which regenerates a conformation of Gs that can be activated by a GPCR-hormone complex (4b); (5) dissociation of the regulatory subunits (R) from PKA as the cAMP concentration increases; (6) subsequent activation of the catalytic subunits (C) by phosphorylation in the activation loop generates the fully active kinase; (7) activated PKA can phosphorylate glycogen phosphorylase kinase (GPK) and other enzymes, which leads to activation of glycogen degradation and inhibition of glycogen synthesis; and (8) PKA can affect transcription by phosphorylating the transcription factor CREB.

The second example of a cell-signaling process that involves protein phosphorylation is receptor tyrosine kinase signaling (Fig. 10) (42). Receptor tyrosine kinases are transmembrane proteins with an extracellular ligand-binding domain and an intracellular tyrosine kinase domain. Ligand binding to the extracellular domain triggers receptor dimerization and/or activation, so that the intracellular catalytic domains from two receptor protein molecules can phosphorylate each other at the activation segment. This transphosphorylation activates the catalytic domain so that it can phosphorylate other Tyr residues in the receptor and in other substrate proteins. These phosphorylated Tyr residues then recruit protein-binding partners that contain SH2 or PTB domains, which recognize specific phosphorylated Tyr residues. One of the proteins recruited is Grb2 (growth factor receptor-bound protein 2), which contains an SH2 domain. Grb2 in turn recruits Sos (son of sevenless), which is a guanine nucleotide exchange factor for the G protein Ras. Sos catalyzes the exchange of Ras-bound GDP (guanosine-5’-diphosphate) for GTP (guanosine-5’-triphosphate), which converts Ras to the activated form. Activated Ras can bind to and activate Raf, which is the most upstream kinase in the MAP kinase (mitogen-activated protein kinase) cascade (43). By phosphorylating MEK (MAPK/ERK kinase, a dual-specificity MAP kinase kinase) on the activation segment, Raf activates MEK, which in turn phosphorylates and activates ERK. Activated ERK can phosphorylate many transcription factors, which leads to changes in gene transcription and ultimately cell division/differentiation.
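The relay described above can be pictured as an ordered chain in which each component activates the next. The following is a minimal sketch under that simplifying assumption; the list `RTK_MAPK_CASCADE` and the function `propagate` are hypothetical names introduced only for illustration, and the model ignores kinetics, feedback, and branching.

```python
# Ordered relay as described in the text: each entry activates the one after it.
# This linear list is an illustrative simplification, not a quantitative model.
RTK_MAPK_CASCADE = [
    "ligand",
    "receptor tyrosine kinase",
    "Grb2",
    "Sos",
    "Ras",
    "Raf",
    "MEK",
    "ERK",
    "transcription factors",
]


def propagate(cascade, blocked=frozenset()):
    """Return the components that become active, in order.

    Propagation stops at the first component listed in `blocked`
    (for example, a kinase that has been inhibited).
    """
    active = []
    for component in cascade:
        if component in blocked:
            break
        active.append(component)
    return active


if __name__ == "__main__":
    # Intact pathway: the signal reaches the transcription factors.
    print(propagate(RTK_MAPK_CASCADE))
    # Blocking Raf leaves everything downstream of Ras inactive.
    print(propagate(RTK_MAPK_CASCADE, blocked={"Raf"}))
```

Blocking any single component silences everything downstream of it, which is the property that makes kinases in this pathway attractive points of intervention.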

The two examples mentioned above illustrate the basic principles of how protein phosphorylation serves specific biological purposes. Although different kinases might be involved in diverse pathways, the molecular mechanism for the regulation of protein function by phosphorylation is similar: by changing protein structure, phosphorylation can turn the catalytic activity of a protein ON or OFF, or create or mask a recognition motif for binding by other molecules.

The 500 or so protein kinases in the human genome regulate numerous biological processes. Consequently, deregulation of protein phosphorylation can lead to various diseases, among which cancer is the most prominent one. Accordingly, kinase inhibitors are being sought for treating various cancers. One of the best-understood examples is chronic myeloid leukemia (CML), which is caused by a chromosomal abnormality that fuses the kinase ABL (encoded by the Abelson gene) with another protein, BCR (encoded by the breakpoint cluster region gene) (44). The BCR-ABL fusion protein was shown to be sufficient to cause chronic myeloid leukemia in mice. Imatinib mesylate (Gleevec; Novartis Pharmaceuticals, East Hanover, NJ) is a BCR-ABL inhibitor used clinically to treat CML. The receptor tyrosine kinase and MAP kinase-signaling pathways mentioned above are key pathways that regulate cell proliferation and differentiation; frequently, tumor cells have mutations in proteins involved in these pathways (45). These pathways have thus been studied intensively in the search for cancer drugs. Other kinases, such as cyclin-dependent kinases (CDKs), have also been targeted for therapeutics (46). In addition, because phosphatases reverse the effects of kinases, mutations in phosphatases have been implicated in human diseases such as cancer, diabetes, and neurologic disorders (47).

Figure 10. Receptor tyrosine kinase signaling process and the activation of MAP kinase. (1) Binding of hormone to the receptor activates the kinase activity of the receptor, which leads to phosphorylation of Tyr residues; (2) pTyr residues recruit Grb2, which in turn recruits Sos; (3) Sos promotes the exchange of GDP for GTP in Ras, which leads to the active Ras-GTP complex. Then, Sos dissociates from the active Ras; (4) active Ras binds to and activates the kinase Raf (4a), and the hormone can dissociate from the receptor (4b); (5) activated Raf phosphorylates and activates MEK; (6) activated MEK phosphorylates and activates MAP kinase; (7) activated MAP kinase can phosphorylate transcription factors (TF); and (8) phosphorylated transcription factors then bind to DNA and lead to changes in gene transcription and ultimately cell division/differentiation.

Acetylation of Lys residues is a very well-known PTM because of histone acetylation, which is involved in the transcriptional regulation of genes. The acetyl group comes from acetyl-CoA, and typically, the acetyl acceptor is a Lys residue (Fig. 11). Histone acetylation correlates with transcription activation, and accordingly, histone acetyltransferases (HATs) are normally multidomain proteins associated with transcription activator/coactivator complexes (48). The correlation of histone acetylation with transcription activation can be explained by the relaxation of the chromatin structure on histone acetylation and the recruitment of other proteins via acetyl-Lys. In eukaryotic cells, chromosomal DNA wraps around core histone octamers that consist of two copies each of histones H2A, H2B, H3, and H4 (49). The complex formed between the histone octamer and the DNA associated with it is called a nucleosome. Nucleosomes can pack into a more condensed structure. Evidence suggests that the tight packing suppresses transcription, whereas transcription activation correlates with a relaxed chromatin structure. The N-terminal tails of the histones have many Lys and Arg residues, among other residues, that can be modified post-translationally. No detailed structural information is available to explain how histone tail modification affects nucleosome packing. However, intuitively, masking the positive charges on histones by Lys acetylation can decrease the interaction with negatively charged DNA, which loosens the chromatin structure (50). In addition, acetylated Lys residues can be recognized by proteins that contain bromodomains (Fig. 4) (16, 51), which serve to recruit other proteins (including chromatin remodeling complexes) that help to activate the transcription of the gene.

Histone acetylation affects not only transcription but also other processes that involve DNA, such as nucleosome assembly, heterochromatin formation, and DNA repair (52). The acetylation/deacetylation of different Lys residues can have different biological effects. For example, acetylation of histone H4 Lys5, Lys8, and Lys12 is involved in nucleosome assembly; H4 Lys16 acetylation does not affect nucleosome assembly but is involved in transcription activation (52); and H3 Lys56 acetylation has been shown recently to promote genome stability and DNA repair in yeast (53, 54).

Proteins other than histones can also be modified by Lys acetylation. Many transcription factors, cytoskeleton proteins, metabolic enzymes, and signaling proteins are acetylated (55). Transcription factors are known to be substrates of HATs, whereas the enzymes responsible for the acetylation of nonnuclear proteins are in many cases not well known (55). The number of proteins known to be regulated by acetylation will continue to increase as methods to detect protein acetylation improve. Acetylation of nonhistone proteins can change protein-protein interactions, regulate enzymatic activity, and increase protein stability by suppressing ubiquitylation (55).

Lys acetylation can be reversed by the action of deacetylases. Many deacetylases are Zn-dependent enzymes that use Zn2+ in the active site to activate a water molecule to hydrolyze the amide bond (Fig. 11) (56). Recently, another type of deacetylase that is nicotinamide adenine dinucleotide (NAD)-dependent, also known as sirtuins, has been identified (57, 58). Their unique ability to couple NAD degradation to Lys deacetylation (Fig. 11) suggests that this type of enzyme can sense the metabolic state of the cell (for example, the NAD concentration) and use that information to regulate the acetylation state and thus the function of the substrate proteins.

In addition to Lys side chain acetylation, protein N-termini can also be acetylated (59). In eukaryotic cells, the first residue, Met, is cleaved from most proteins by methionine aminopeptidase. The newly released N-terminal amino group is then acetylated. This modification can happen co-translationally, before the mature peptide chain is released from the ribosome. The function of this modification is in most cases still not understood, although deletion of the genes involved in this modification produces clear phenotypes (59).

Because of the involvement of protein Lys acetylation in the regulation of transcription, protein-protein interaction, enzymatic activity, and protein stability, the deregulation of protein acetylation has been associated with many diseases, such as cancer and neurodegeneration (60, 61). Frequently, mutations in histone acetyltransferases are found in cancer (60). Chromosomal abnormalities that generate fusions of acetyltransferases are known to lead to acute myeloid leukemia. These abnormalities include the fusions of MOZ (monocytic leukemia zinc finger protein) with CBP (CREB binding protein) or p300, and the fusion of MOZ with the transcription factor TIF2 (transcription intermediary factor 2) (60). MOZ, CBP, p300, and TIF2 all contain histone acetyltransferase domains. Presumably, the generation of these aberrant fusion proteins disrupts the normal gene transcription profile, which leads to leukemia. Deregulation of histone deacetylases is also suggested to be associated with cancer (61). A histone deacetylase inhibitor, SAHA (Vorinostat; Merck & Co., Inc., Whitehouse Station, NJ), was approved recently by the Food and Drug Administration for the treatment of cutaneous T-cell lymphoma (62).

Figure 11. (a) Lys acetylation catalyzed by acetyltransferases; (b) mechanism of Zn-dependent HDAC-catalyzed deacetylation; (c) mechanism of sirtuin-catalyzed deacetylation.

Although methylation can happen to several different residues (3, 63), most attention has been given to protein Lys/Arg methylation because the methylation of Lys/Arg in histones controls gene transcription. For Lys and Arg methylation, multiple methyl groups can be added to the same Lys or Arg residue (Fig. 12). The methyl group comes from S-adenosyl methionine (SAM), which is a versatile small molecule that is used in many enzymatic transformations (64). Almost all Lys methyltransferases belong to the SET (suppressor of variegation, Enhancer of zeste, Trithorax) family of methyltransferases, whereas the protein Arg methyltransferases belong to a different class (65-67). Both histone Lys/Arg methylation and acetylation are associated with transcription regulation. In contrast to histone acetylation, which usually correlates with transcription activation, histone methylation can lead either to transcription activation or to suppression (17, 68). Based on current understanding, the effect of histone methylation is mediated by proteins that are recruited by methylated Lys or Arg residues. Tudor domains and chromodomains are known to recognize methylated Lys/Arg residues via both charge interaction and cation-π interaction (69-73). The methylated Lys/Arg residue is more hydrophobic and sterically bulkier than free Lys/Arg, and it can be differentiated by the domains that recognize methylated Lys/Arg residues (69, 74). Sequences that surround the methylated Lys residues are also read by the chromodomains and Tudor domains (69-71). This finding explains why different Lys residues could recruit different proteins on methylation and thus have different biological effects. For example, H3K4 methylation activates transcription by recruiting chromodomain helicase DNA-binding protein 1 (CHD1) specifically in yeast, whereas H3K9 methylation represses transcription by recruiting heterochromatin protein 1 (HP1) (75-77).

Nonhistone proteins are also known to be methylated on Lys residues; these proteins include transcription factors, such as p53 (78-80) and TAF10 (TATA box-binding protein-associated factor 10) (81), and translation factors (63). The p53 protein can be methylated by different methyltransferases [Set9 (78), Smyd2 (79), and Set8 (80)] on different Lys residues (Lys372, Lys370, and Lys382, respectively). These different methylation events either activate or repress p53 activity. Arg methylation has been found frequently in nonhistone proteins. For example, PRMT1 has been reported to methylate the transcription factor STAT1 (signal transducer and activator of transcription 1) (82), PRMT4/CARM1 [coactivator-associated arginine(R) methyltransferase 1] can methylate CBP/p300 (83), and heterogeneous nuclear ribonucleoproteins (hnRNPs) and small nuclear ribonucleoproteins (snRNPs) that are involved in pre-mRNA splicing are also Arg methylated (67). The biological functions of these Lys/Arg methylations can in most cases also be explained by the ability of methylation to block or create interactions with other proteins or nucleic acids.

Compared with acetylation, methylation is more stable. For this reason, it was thought that methylation could be a permanent epigenetic mark. The recent discovery of two types of Lys demethylases suggests that methylation is also a reversible PTM. The first Lys demethylase discovered is LSD1 (lysine-specific demethylase 1), which is a FAD (flavin adenine dinucleotide)-dependent enzyme similar to amine oxidases (Fig. 12) (84). It is believed that LSD1 uses a two-electron oxidation mechanism and thus cannot demethylate trimethylated Lys residues (85). The second type of Lys demethylase, which contains the JmjC (Jumonji domain-containing) domain, is a nonheme Fe(II)-dependent enzyme that is capable of one-electron oxidation, and thus it can demethylate trimethylated Lys residues (86). The effect of Arg methylation was proposed to be reversed by protein Arg deiminase 4 (PAD4), which generates citrulline via demethylimination (87, 88). However, later studies indicate that PAD4 as well as other PAD enzymes do not catalyze demethylimination at appreciable rates in vitro (88-91). A recent report showed that Arg methylation can be truly reversed by JmjC domain-containing demethylases, which suggests that PADs are probably not required for Arg demethylation (92). Thus, both Lys and Arg methylation are reversible modifications.
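The substrate rule for the two Lys demethylase types can be written down as a small lookup. This is only a sketch of the rule as stated above; the function name `can_demethylate_lys` and the family labels are hypothetical, and the JmjC branch assumes that the family as a whole covers all three methylation states, whereas individual enzymes may be more selective.

```python
def can_demethylate_lys(enzyme_family: str, methyl_groups: int) -> bool:
    """Report whether a demethylase family can act on a methylated Lys.

    enzyme_family: "LSD1" (FAD-dependent) or "JmjC" (Fe(II)-dependent).
    methyl_groups: 1, 2, or 3 methyl groups on the Lys side chain.
    """
    if methyl_groups not in (1, 2, 3):
        raise ValueError("a Lys residue carries 1, 2, or 3 methyl groups")
    if enzyme_family == "LSD1":
        # Per the text: LSD1 cannot demethylate trimethylated Lys.
        return methyl_groups < 3
    if enzyme_family == "JmjC":
        # Assumption: the family collectively handles every methylation state;
        # the text only states explicitly that trimethyl-Lys is a substrate.
        return True
    raise ValueError(f"unknown demethylase family: {enzyme_family!r}")


if __name__ == "__main__":
    for family in ("LSD1", "JmjC"):
        for n in (1, 2, 3):
            print(family, n, can_demethylate_lys(family, n))
```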

Similar to Lys acetylation, abnormality in Lys methylation has been considered a contributing factor to cancer (93, 94). A decrease in H3 Lys9 and H4 Lys20 trimethylation is found in cancer cells. Both H3 Lys9 and H4 Lys20 methylation are associated with heterochromatin formation. Presumably, the decrease in methylation leads to defects in heterochromatin formation, which in turn lead to chromosomal instability and tumor formation (93). Histone methyltransferase fusion proteins generated by chromosomal translocation are found frequently in leukemia and are thought to contribute to its development. For example, the H3 Lys79 methyltransferase hDOT1L (human DOT1-like protein) fusion found in mixed lineage leukemia is sufficient to cause leukemic transformation (95). The close association of methylation and cancer suggests that protein methyltransferases and demethylases can be potential therapeutic targets.

Figure 12. (a) Lys/Arg N-methylation; (b) mechanism of FAD-dependent LSD1-catalyzed Lys demethylation; (c) mechanism of Fe-dependent JHDM (JmjC domain-containing histone demethylase)-catalyzed demethylation.

In eukaryotic cells, glycosylation happens to many membrane and secreted proteins (i.e., proteins that transit through the ER and Golgi secretory pathway). Glycosylation can occur on Asn residues (N-glycosylation, Fig. 13), on Ser/Thr and post-translationally hydroxylated Lys and Pro residues (O-glycosylation, Fig. 14), or on Trp residues (C-glycosylation, Fig. 14). N-glycosylation is a complicated process that involves three stages: 1) the formation of the donor substrate with 14 sugar units (Glc3Man9GlcNAc2-PP-dolichol), which occurs on both the cytosolic and the luminal faces of the ER (96); 2) the transfer of the tetradecasaccharyl group to Asn residues found in the consensus sequence Asn-X-Ser/Thr, which occurs in the ER (97); and 3) the hydrolytic removal of the terminal sugar residues on the tetradecasaccharide, the addition of more sugar units (Fig. 13) (98), and the sulfation and phosphorylation of the carbohydrate moieties in the ER and Golgi (99). The later trimming steps can generate different sets of N-linked carbohydrates, such as the high-mannose type glycans, the complex type glycans, and the hybrid type glycans (Fig. 13) (99). Each stage is achieved by the function of multiple proteins. For example, up to nine proteins are required for the transfer of the tetradecasaccharyl group in yeast (100).
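Because the acceptor site is defined by a short consensus sequence, candidate N-glycosylation sites can be located with a simple pattern search. The sketch below encodes Asn-X-Ser/Thr exactly as written above, with X taken as any residue; the function name `find_sequons` and the example sequence are hypothetical. A match marks only a candidate site, since the consensus sequence alone does not show that a site is actually glycosylated.

```python
import re

# Asn-X-Ser/Thr in one-letter code. The lookahead makes the match zero-width,
# so overlapping candidates (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based positions of Asn residues within Asn-X-Ser/Thr."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence with candidate sites at Asn2 and Asn6.
    print(find_sequons("MNGSANATQ"))  # [2, 6]
```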

Figure 13. Protein N-glycosylation. (1) The formation of the donor substrate with 14 sugar units (Glc3Man9GlcNAc2-PP-dolichol); (2) the reaction scheme that shows the transfer of the tetradecasaccharyl group to Asn residues found in the consensus sequence Asn-X-Ser/Thr in proteins; (3) hydrolytic removal of the terminal sugar residues on the tetradecasaccharide and addition of more sugar units in the ER and Golgi. OSTase, oligosaccharyltransferase.

Figure 14. O- and C-glycosylation reactions. UDP, uridine diphosphate.

Unlike N-glycosylation, O-glycosylation starts with the addition of a single sugar residue, which can be followed by the addition of more sugars (101). Like N-glycosylation, most O-glycosylation occurs on proteins that transit through the ER and Golgi. However, the addition of a single GlcNAc residue to Ser/Thr is a type of O-glycosylation that occurs on cytosolic proteins (102). This cytosolic O-glycosylation has drawn much attention recently because it can regulate the activity of the substrate proteins, especially because it can compete with protein phosphorylation for the same Ser/Thr residues on substrate proteins (103).

C-glycosylation is the addition of a single mannosyl group to the indole C-2 position of Trp residues of membrane and secreted proteins (104). The Trp residue that is C-mannosylated resides in a consensus Trp-X-X-Trp sequence, in which the first Trp is C-mannosylated. About a dozen proteins in humans are C-mannosylated. The enzyme that catalyzes the modification has not been cloned yet, and currently, the function of this modification is not clear.
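The Trp-X-X-Trp consensus lends itself to the same kind of pattern search used for other sequence motifs. The sketch below reports the position of the first Trp of each match, which is the residue that the text identifies as modified; the function name `find_c_mannosylation_candidates` and the example sequence are hypothetical, and a match indicates only a candidate site.

```python
import re

# Trp-X-X-Trp in one-letter code; the zero-width lookahead allows
# overlapping motifs such as "WSPWSEW" to be reported separately.
WXXW = re.compile(r"(?=W[A-Z]{2}W)")


def find_c_mannosylation_candidates(protein: str) -> list[int]:
    """Return 1-based positions of the first Trp in each Trp-X-X-Trp motif."""
    return [match.start() + 1 for match in WXXW.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical sequence with two overlapping motifs (Trp2 and Trp5).
    print(find_c_mannosylation_candidates("AWSPWSEWS"))  # [2, 5]
```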

The large number of enzymes involved in protein glycosylation and the fact that the complicated N-glycosylation pathway is conserved throughout eukaryotic species suggest that glycosylation has important functions. Deficiency in protein glycosylation causes several diseases in humans, such as lysosomal storage diseases (105), congenital disorders of glycosylation, and leukocyte adhesion deficiency II (106). In addition, changes in glycosylation patterns are associated with cancer and inflammation (107). Protein glycosylation can serve several different biological purposes. One purpose is to help proteins that transit through the secretory pathway to fold correctly. In particular, the removal of the glucose residue by glucosidase II and the reglucosylation in the ER are well known to help secreted proteins fold and to ensure that only correctly folded proteins are secreted (Fig. 15) (108). Protein O-fucosyltransferase I, which modifies the Notch protein, was reported to have chaperone activity that helps Notch folding and secretion, and this chaperone activity is independent of its catalytic activity (109). Glycosylation is also important for sorting secreted proteins. For example, the phosphorylation of Man on N-glycans (Fig. 16) creates a recognition signal for sorting lysosomal proteins to the lysosome. Glycosylation is also believed to increase protein stability, as has been shown for erythropoietin, mentioned earlier. Glycosylation is also proposed to affect ligand-receptor interactions and thus to regulate cell-cell signaling. However, a detailed molecular understanding of the effect of glycosylation on ligand-receptor interactions is hard to obtain in most cases. In two well-studied cases, human CD2 (cellular differentiation marker 2) and IgG (immunoglobulin G), N-glycosylation is found to affect the interaction with their ligands or receptors. Structural data show that the carbohydrate portion does not contact the binding partner directly. Instead, glycosylation affects the binding by changing the conformation of the glycosylated proteins (110-112).

Figure 15. N-glycosylation helps secreted proteins to fold correctly in the ER.

Figure 16. Phosphorylation of Man on N-glycan. UMP, uridine monophosphate.

Ubiquitin is an abundant small protein (76 amino acids) found in all eukaryotes. It can be conjugated to many proteins covalently and regulates important biological processes. The addition of ubiquitin to substrate proteins goes through an E1-E2-E3 enzymatic cascade (Fig. 17) (113). E1, which is also called ubiquitin-activating protein (UAP), uses ATP to adenylate the C-terminal Gly of ubiquitin and then captures the activated ubiquitin with a Cys residue in the active site. Most eukaryotic species only have one E1 enzyme responsible for activating all the ubiquitin molecules needed. The ubiquitin-E1 conjugate then is recognized by several dozens of E2 enzymes, which capture ubiquitin from E1 via a transthiolation reaction. The ubiquitin-conjugated E2 enzymes are then recognized by many different E3 enzymes, which recruit the substrate proteins and transfer ubiquitin from E2 to Lys residues of the substrate proteins, either directly or indirectly (Fig. 17). Two major families of E3 enzymes exist: the RING (really interesting new gene) E3s and HECT (homologous to E6AP C terminus) E3s. The Pfam database lists more than 400 RING proteins and 70 HECT proteins. Many E3s form complexes with other proteins. One well-understood E3 complex is the SCF (Skp1-Cullin-F Box) RING E3, for which a crystal structure was reported (114). In humans, multiple Cullins and multiple F Box proteins exist (115). Considering the different combinations, the number of possible E3 complexes can be much more than the number of E3 enzymes (3). E3s decide which substrate proteins get ubiquitylated, thus the large number of E3s and E3 complexes reflects the diverse substrate proteins that must be recognized.

Ubiquitin itself has 7 Lys residues (Lys6, 11, 27, 29, 33, 48, and 63) that can be used for ubiquitin attachment, which leads to polyubiquitylation of substrate proteins. Polyubiquitin chains assembled via different Lys residues have different biological functions (116), as will be explained later. Which Lys residue is used in the polyubiquitin chain is controlled by the specific E3 involved. The E3 presumably also controls the length of the polyubiquitin chain, although the detailed chain assembly mechanism is still not clear (117). Ubiquitylation can be reversed by the action of ubiquitin-specific proteases (UBPs). About 60 UBPs exist in the human genome, which presumably recognize different types of ubiquitin modifications at various cellular locations (118).

The biological function of ubiquitylation was recognized originally as targeting proteins to the proteasome for degradation. The importance of this function can be illustrated by many examples. In cell division, progression through the cell cycle is driven by cyclin-dependent kinases, the activities of which are controlled by a group of proteins called cyclins. Different cyclins function only at certain stages of the cell cycle. Then, they must be degraded, which requires polyubiquitylation by specific E3 enzymes (119). Aberration in the ubiquitylation and degradation of cyclins is associated with cancer. Misfolded proteins must also be degraded by the ubiquitin-proteasome system. Aggregation of misfolded proteins is known to cause neurodegeneration, as in Parkinson’s disease (116). Ubiquitylation and proteasomal degradation of proteins are also important for other biological processes, such as the hypoxia response and the circadian clock. Ubiquitylation is required for the degradation of hypoxia inducible factor (HIF) on hydroxylation at high oxygen levels (120). Maintaining the circadian clock requires the ubiquitylation and degradation of proteins that inhibit the CLOCK (a protein named after the circadian locomotor output cycles kaput gene) transcription factor (121).

It is becoming clear that the biological function of ubiquitylation is not limited to proteasomal degradation. Other functions have been discovered, such as promoting membrane protein endocytosis, targeting membrane proteins to the lysosome for degradation, and regulating cytoplasmic/nuclear shuttling (116, 122). It is now generally believed that polyubiquitylation via Lys48 of ubiquitin is a signal for proteasomal degradation, and this action requires a minimum of 4 ubiquitin units in the chain (123). In contrast, monoubiquitylation, multiple monoubiquitylation on different Lys residues of substrate proteins, and polyubiquitylation via Lys63 of ubiquitin typically signal proteasome-independent pathways (116). How can so many different functions be achieved? The diverse sets of ubiquitin binding domains (UBDs) provide the molecular explanation (19). Presumably, different UBDs recognize different types of ubiquitin modifications (monoubiquitylation vs. polyubiquitylation, and Lys48-linked vs. Lys63-linked polyubiquitylation, for example), and thus they mediate different functional consequences of ubiquitylation. The UBDs on the yeast proteins Rad23, Rpn10, and Dsk2 recognize the Lys48-linked polyubiquitin chain and deliver the modified substrate proteins to the 26 S proteasome (124). The UBDs on vacuolar proteins recognize monoubiquitylation or the Lys63-linked polyubiquitin chain on membrane proteins, which mediates their sorting into the lysosome or vacuole. Binding of the Lys63-linked polyubiquitin chain on inhibitor of NF-κB kinase (IKK) by other proteins has been proposed to activate IKK and thus turn on NF-κB signaling (116). The recognition of ubiquitin by UBDs can also explain some “unusual” functions of protein ubiquitylation. For example, Lys48-linked polyubiquitylation of the yeast transcription factor Met4p does not signal for proteasomal degradation but instead inactivates the transcription factor, because Met4p has an in-cis UBD that binds the ubiquitin chain and thus inactivates itself and blocks the proteasomal pathway (125).
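The general rules stated above for reading the ubiquitin tag can be summarized as a small lookup. This is a deliberately coarse sketch: the function name `ubiquitin_signal`, the linkage labels, and the returned strings are hypothetical, and the lookup ignores context-dependent exceptions such as the Met4p case, in which a Lys48-linked chain does not lead to degradation.

```python
MIN_K48_CHAIN = 4  # minimal Lys48-linked chain length quoted in the text


def ubiquitin_signal(linkage: str, chain_length: int = 1) -> str:
    """Classify a ubiquitin modification by the general rules in the text.

    linkage: "mono" (single or multiple monoubiquitylation), "K48", or "K63".
    chain_length: number of ubiquitin units in the chain (ignored for "mono").
    """
    if chain_length < 1:
        raise ValueError("chain_length must be at least 1")
    if linkage in ("mono", "K63"):
        return "proteasome-independent"
    if linkage == "K48":
        if chain_length >= MIN_K48_CHAIN:
            return "proteasomal degradation"
        return "below proteasome threshold"
    # Chains through the other Lys residues are not assigned a function here.
    return "unclassified"


if __name__ == "__main__":
    print(ubiquitin_signal("K48", 4))
    print(ubiquitin_signal("K63", 6))
    print(ubiquitin_signal("mono"))
```

In the cell, the classification is carried out by the UBD-containing proteins that bind each kind of tag, which is why the same chain can have a different outcome on a substrate such as Met4p.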

To generalize the function of ubiquitylation, we can say that ubiquitin is an “information-rich protein tag” that can be read by different proteins that contain UBDs (3), and the exact consequence of ubiquitylation is determined by how the tag is recognized. Besides ubiquitin, eukaryotic cells also have about a dozen known ubiquitin-like protein tags, of which SUMO is the best studied. In addition, many proteins have built-in ubiquitin-like domains. The logic that underlies the biological functions of these ubiquitin-like proteins/domains will likely be the same as what has been learned from ubiquitin (3).

Figure 17. Ubiquitylation catalyzed by the E1, E2, E3 cascade.

Hydrolytic cleavage of proteins by proteases is an irreversible PTM. The large number (more than 500) of proteases in the human genome indicates that proteolysis occurs often. Proteases can be classified into four types based on catalytic mechanisms (Fig. 18): Ser/Thr proteases, Cys proteases, Asp proteases, and metalloproteases.

Figure 18. Catalytic mechanism of different proteases.

At first glance, proteolysis may seem to be an uncontrolled destruction process, like the digestion of food proteins in the gut. In fact, proteolysis in cells is under tight regulation. Even proteases secreted into the digestive tract must be controlled to avoid self-destruction. Typically, proteases are made in inactive forms (zymogens) that can be activated by proteolysis. Inside eukaryotic cells, two major locations exist for proteolytic degradation of unwanted proteins: the 26 S proteasome and the lysosome (126, 127). Access to the two degradation compartments is controlled tightly. The lysosome is an acidic membrane-bound organelle that contains many proteases and is responsible for the degradation of endocytosed membrane proteins, such as activated receptor tyrosine kinases and G protein-coupled receptors that are ubiquitylated and sorted to the lysosome (described in the ubiquitylation section). The lysosome can also degrade endocytosed or phagocytosed bacterial and viral proteins (128). In autophagy, the lysosome is responsible for degrading cellular organelles and some cytosolic protein complexes (126). The 26 S proteasome (Fig. 19) has a 20 S degradation chamber that consists of four rings, α β β α (129). In eukaryotes, each α ring has seven different α subunits, and each β ring has seven different β subunits. Three β subunits are catalytically active Thr proteases that are responsible for the degradation of substrate proteins. Because the active sites of the proteases are buried inside this chamber, proteins that should not be digested are protected from proteolysis. Access to the degradation chamber is controlled by the 19 S regulatory complex that caps both ends of the degradation chamber. The regulatory complex contains subunits that recognize polyubiquitylated substrates, subunits that recycle the ubiquitin tag, and subunits that use ATP hydrolysis to unfold and translocate the protein into the degradation chamber. Degradation of unwanted proteins by the 26 S proteasome or lysosome in a timely fashion is very important. For example, cyclins that activate cyclin-dependent kinases have to be polyubiquitylated and degraded by the proteasome at specific times to drive cell cycle progression (119). Degradation of activated membrane receptors in the lysosome is important to avoid overstimulation (130, 131). Misfolded proteins must be degraded by the proteasome or by the lysosome (in autophagy). Failure to do so is thought to contribute to neurodegenerative disorders such as Parkinson’s disease and Alzheimer’s disease (126).

Figure 19. The eukaryotic 26 S proteasome. The subunit composition of the 19 S regulatory particle of Saccharomyces cerevisiae is shown on the left. The α and β rings of the 20 S proteasome, each of which consists of seven different subunits, are included to indicate how the base of the 19 S complex is linked to the core 20 S protease complex. The crystal structure of the 20 S degradation chamber is shown in both side and top views (figure made using PDB 1RYP).

In addition to the “destructive” proteolysis processes in the proteasome and lysosome, many “constructive” proteolysis processes occur in cells. In both prokaryotes and eukaryotes, secreted proteins contain a signal peptide at the N-terminus that directs them to the secretory pathway. This signal peptide must be cleaved later by signal peptidases (typically serine proteases) so that the protein can transit further in the secretory pathway (132). Many secreted proteins, which include insulin, TGFβ1 (transforming growth factor β1), nerve growth factor β1, albumin, Factor IX, insulin receptor, and Notch, also contain a propeptide that is cleaved by proprotein convertases in the Golgi (133). Selective proteolysis also occurs at the cell membrane in signal transduction processes. Notch protein, on binding to its ligand Delta/Jagged (membrane proteins on neighboring cells), is cleaved by one of the ADAM (a disintegrin and metalloprotease) proteins at a site close to the transmembrane region. This cleavage activates Notch for regulated intramembrane proteolysis, which cuts within the membrane-spanning region of Notch and releases the intracellular domain of Notch from the plasma membrane. Then, the intracellular domain translocates into the nucleus, where it acts as a transcription factor to turn on genes required for development (Fig. 20) (134). Regulated intramembrane proteolysis is catalyzed by the membrane protein complex called presenilin, which contains Asp protease subunits. Presenilin is also responsible for cleavage of the amyloid-β precursor protein in Alzheimer’s disease. This proteolysis-triggered proteolysis signaling occurs often. Similar signaling pathways are also present in bacteria. For example, the release of the transcription factor σE is achieved via the sequential cleavage of the membrane protein RseA by DegS (a Ser protease) and YaeL (a Zn protease) (135).

Figure 20. Four proteolysis events for Notch that lead to the release of an active transcription factor. TGN, trans Golgi network.

Figure 21. (a) Domain structures of mammalian caspases; (b) the caspase cascades and the initiation of apoptosis. Apaf-1, apoptotic protease activating factor-1; Cyto c, cytochrome c; FADD, Fas-associated protein with death domain; LS, large subunit; RAIDD, RIP-associated ICH-1/CED-3 homologous protein with a death domain; RIP, receptor-interacting protein; TRADD, tumor necrosis factor receptor-associated protein with death domain; SS, small subunit.

Similar to the MAP kinase cascades for protein phosphorylation, protease cascades exist, in which downstream proteases are activated by the action of upstream proteases (3). One of the best-known cascades is the caspase cascade that leads to apoptosis (Fig. 21) (11, 136). Caspases are Cys proteases that cleave the amide bond specifically after an Asp residue. Two types of caspases exist: initiator caspases (Caspase 2, 8, 9, 10) and effector caspases (Caspase 3, 6, 7). Both initiator and effector caspases are produced in zymogen forms. Initiator caspases use their N-terminal DED (death effector domain) or CARD (caspase recruitment domain) domains to interact with other proteins to receive apoptosis signals. The signals cause the dimerization of the initiator caspases and activate them so that they can cleave themselves and the effector caspases after specific Asp residues. Cleavage by the initiator caspases activates the effector caspases, which then cleave their substrate proteins to carry out apoptosis. The substrate proteins of effector caspases include the inhibitor of caspase-activated DNase (deoxyribonuclease), Bcl2 (named from B-cell lymphoma 2, an antiapoptotic protein), and PARP-1 (poly(ADP-ribose) polymerase-1, an enzyme catalyzing protein poly(ADP-ribosyl)ation and required for DNA repair). Cleavage of the inhibitor by effector caspases releases the DNase and activates its catalytic activity, resulting in the fragmentation of chromosomal DNA, which is a hallmark of apoptosis. The caspase cascade and apoptosis are very important for the development and homeostasis of metazoans. A decreased ability of cells to undergo apoptosis can lead to cancer, whereas too much apoptosis can lead to autoimmune diseases (137).
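The rule that caspases cut the amide bond immediately after an Asp residue can be illustrated with a short sketch. This is a toy model only: the function name, the example sequence, and the use of the tetrapeptide DEVD as the recognition motif are assumptions made for illustration and are not taken from the text above. Real substrate recognition depends on the folded structure as well as the sequence.

```python
def cleave_after_motif(sequence, motif="DEVD"):
    """Split a one-letter protein sequence after each occurrence of a motif.

    The motif must end in Asp (D), reflecting the rule that caspases
    cleave the amide bond immediately after an Asp residue. The default
    motif is an illustrative assumption, not a complete specificity model.
    """
    if not motif.endswith("D"):
        raise ValueError("a caspase recognition motif must end in Asp (D)")

    fragments = []
    start = 0
    pos = sequence.find(motif)
    while pos != -1:
        cut = pos + len(motif)  # index of the residue just after the Asp
        if cut < len(sequence):  # no bond to cut if the Asp is the last residue
            fragments.append(sequence[start:cut])
            start = cut
        pos = sequence.find(motif, pos + 1)
    fragments.append(sequence[start:])
    return fragments


# Hypothetical substrate: the cut falls between the Asp of DEVD and the Gly.
print(cleave_after_motif("AAADEVDGKK"))
```

A sequence without the motif is returned as a single uncut fragment.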

Figure 22. Shokat’s (144) “bump and hole” method to identify substrates for kinases.

Identifying new pathways regulated by known PTM and discovering new PTM

The brief description above of a few major PTM clearly demonstrates that PTM can regulate many important biological processes. So far, a fairly good understanding of many aspects of PTM has been obtained. What remaining challenges must be addressed?

One direction is to work out the molecular details of the many biological processes that are regulated by PTM. Structural biology and biochemistry are needed to answer questions such as what structural changes are induced by a particular PTM and how those structural changes lead to changes in activity or in recognition by binding partners. Much progress has been made in this direction, but more remains to be done. For example, in protein ubiquitylation, no structural details about E1 exist, it is not clear how the polyubiquitin chain is made (117), and it is not clear how the specificities of different ubiquitin binding domains are achieved (19).

Another direction is to identify the proteome that is modified by a specific PTM. Advances in protein identification by mass spectrometry (MS) have greatly facilitated studies in this direction, and much effort has been invested. Generally, an affinity purification method is used to enrich proteins that are modified by a specific PTM, and then these proteins are identified by MS. For example, phosphotyrosine-specific antibodies have been used to enrich proteins that are modified on Tyr residues, and metal affinity columns have been used to isolate all phosphopeptides (138). These isolated phosphoproteins/peptides can then be identified by MS. A His6 tag has been fused to the N-terminus of ubiquitin and used to isolate ubiquitylated proteins that are then identified by MS (139). GlcNAc with an azide group attached has been used to label proteins that are O-GlcNAc modified, and then a biotin tag is conjugated to the modified protein via Staudinger ligation. O-GlcNAc modified proteins can be pulled out using streptavidin beads and identified using MS. Using this method, close to 200 O-GlcNAc modified proteins were identified (140). A clever method to detect protein S-acylation has been reported recently (141).

These proteomic studies have provided much information. However, to understand the function of a PTM in cell physiology completely, it is desirable to know which enzyme is responsible for the modification of a particular substrate protein. With the availability of bioinformatics tools and completed genome sequences, it is now relatively straightforward to identify all the enzymes in a genome that share similar biochemical function. For example, we now know that the human genome contains more than 500 protein kinases, more than 500 proteases, and 400 ubiquitin E3s. But without knowing what substrate proteins they modify, it will be very difficult (if not impossible) to understand their biological functions on a molecular level. Currently, no efficient and reliable method exists to identify the substrate proteins for an enzyme. A straightforward method is to make a library of short peptides and try to identify consensus sequences that are recognized by an enzyme (142, 143). The disadvantage is that the structure of a short peptide may be different from the structure of the same sequence present in a folded protein. Thus, the reliability of this method must be validated by other methods. Shokat and coworkers (144) have used a clever approach to identify kinase substrates (Fig. 22). This approach uses a bulky ATP analog that can be used as a cosubstrate only by a kinase mutant. By incubating the 32P-labeled ATP analog and the kinase mutant with cell extract, the substrate proteins of the specific kinase can be labeled. Identification of the substrate proteins may be difficult, though, because the radiolabeled substrate proteins cannot be enriched or purified easily for identification by MS. It is not clear whether this method can be applied easily to other PTM enzymes.
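The peptide-library idea mentioned above (finding a consensus sequence among the peptides an enzyme accepts) can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the peptides are already aligned and of equal length, the consensus is simply the most frequent residue at each position, and the example peptides are invented rather than taken from any cited study. Published approaches such as oriented peptide libraries are considerably more quantitative.

```python
from collections import Counter


def consensus_sequence(peptides):
    """Return the most frequent residue at each position of aligned peptides.

    Assumes every peptide has the same length and is aligned on the
    modified residue. This is a simplification for illustration.
    """
    if not peptides:
        raise ValueError("at least one peptide is required")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")

    consensus = []
    for position in range(length):
        counts = Counter(p[position] for p in peptides)
        residue, _ = counts.most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


# Hypothetical peptides accepted by an imaginary kinase (Ser at position 4).
hits = ["RRASL", "RKASV", "RRGSL"]
print(consensus_sequence(hits))
```

As the text notes, a motif found this way would still need validation against folded protein substrates.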

Parallel to the efforts to identify substrate proteins for a particular enzyme, the activity-based small molecule probes pioneered by Cravatt and coworkers can facilitate the identification of the biological functions of an enzyme that catalyzes protein post-translational modifications (145). The major advantage of this type of probe is that it can potentially detect enzymes that are in the active state, and thus can provide snapshots of the active enzymes at different developmental stages or in different types of cells. Among enzymes that catalyze PTM, probes have so far been developed for studying proteases (145, 146), kinases (147), pTyr phosphatases (148), and protein Arg deiminases (149).

Perhaps a more challenging question is how we can discover new PTM reactions. In principle, there are analytic tools that can be used to research this topic. One such tool is top-down FT-MS, which determines the molecular weight of the whole protein with high accuracy. By comparing the obtained tandem MS (MS/MS) result with the expected MS/MS result, post-translational modifications can be identified (150). Crystallography can also discover new PTM, if a protein expressed in the proper host can be crystallized. Some rare modifications or protein side chains were discovered this way (151). However, the success of using these methods would require that a significant portion of the protein population is modified and the modification is stable. This condition cannot be met by all PTM. Thus, discovering new PTM poses a great challenge to chemical biologists. Undoubtedly, new PTM reactions are waiting to be discovered and the identification of these new PTM, together with the identification of new pathways that are regulated by known PTM, will advance our understanding about the molecular logic of living systems.
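The mass-comparison step behind MS-based PTM detection can be sketched as follows: subtract the mass expected from the unmodified sequence from the observed mass, then look the difference up in a table of known modification masses. The function name, the tolerance, and the short shift table are assumptions chosen for illustration. A real analysis would use a curated modification database, handle multiple simultaneous modifications, and localize the site with MS/MS fragment ions.

```python
# Monoisotopic mass shifts in Da for a few common modifications.
# Deliberately incomplete; for illustration only.
KNOWN_SHIFTS = {
    "methylation": 14.01565,
    "acetylation": 42.01057,
    "trimethylation": 42.04695,
    "phosphorylation": 79.96633,
}


def candidate_modifications(expected_mass, observed_mass, tolerance=0.01):
    """List known modifications whose mass shift matches the observed delta.

    `tolerance` is in Da and is an arbitrary illustrative value; the
    achievable accuracy depends on the instrument and the protein size.
    """
    delta = observed_mass - expected_mass
    return sorted(
        name
        for name, shift in KNOWN_SHIFTS.items()
        if abs(delta - shift) <= tolerance
    )


# Hypothetical masses: a +42.011 Da shift matches acetylation but not
# trimethylation, which differs by about 0.036 Da.
print(candidate_modifications(10000.0, 10042.011))
```

An empty result would flag a mass shift not in the table, which is the situation of interest when searching for new modifications.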

1. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 2001 409:860-921.

2. Puente XS, Sanchez LM, Overall CM, Lopez-Otin C. Human and mouse proteases: a comparative genomic approach. Nat. Rev. Genet. 2003 4:544-558.

3. Walsh CT. Posttranslational Modification of Proteins: Expanding Nature’s Inventory. 2005. Roberts and Company Publishers, Englewood, CO.

4. Ahmed N, Thornalley PJ. Advanced glycation endproducts: what is their relevance to diabetic complications? Diabetes Obes. Metab. 2007 9:233-245.

5. Hess DT, Matsumoto A, Kim SO, Marshall HE, Stamler JS. Protein S-nitrosylation: purview and parameters. Nat. Rev. Mol. Cell Biol. 2005 6:150-166.

6. Lippard SJ, Berg JM. Principles of Bioinorganic Chemistry. 1994. University Science Books. Mill Valley, CA.

7. Ruthenburg AJ, et al. Histone H3 recognition and presentation by the WDR5 module of the MLL1 complex. Nat. Struct. Mol. Biol. 2006 13:704-712.

8. Johnson LN, Lewis RJ. Structural basis for control by phosphorylation. Chem. Rev. 2001 101:2209-2242.

9. Johnson LN, Barford D. The effects of phosphorylation on the structure and function of proteins. Annu. Rev. Biophys. Biomol. Struct. 1993 22:199-232.

10. Chen P, Hochstrasser M. Autocatalytic subunit processing couples active site formation in the 20S proteasome to completion of assembly. Cell 1996 86:961-972.

11. Riedl SJ, Shi Y. Molecular mechanisms of caspase regulation during apoptosis. Nat. Rev. Mol. Cell Biol. 2004 5:897-907.

12. Pawson T, Nash P. Assembly of cell regulatory systems through protein interaction domains. Science 2003 300:445-452.

13. Yaffe MB, Elia AEH. Phosphoserine/threonine-binding domains. Curr. Opin. Cell Biol. 2001 13:131-138.

14. Yaffe MB. Phosphotyrosine-binding domains in signal transduction. Nat. Rev. Mol. Cell Biol. 2002 3:177-186.

15. Yang XJ. Lysine acetylation and the bromodomain: a new partnership for signaling. Bioessays 2004 26:1076-1087.

16. Mujtaba S, Zeng L, Zhou MM. Structure and acetyl-lysine recognition of the bromodomain. Oncogene 2007 26:5521-5527.

17. Daniel JA, Pray-Grant MG, Grant PA. Effector proteins for methylated histones. Cell Cycle 2005 4:919-926.

18. Hurley JH, Lee S, Prag G. Ubiquitin-binding domains. Biochem. J. 2006 399:361-372.

19. Harper JW, Schulman BA. Structural complexity in ubiquitin recognition. Cell 2006 124:1133-1136.

20. Pittet M, Conzelmann A. Biosynthesis and function of GPI proteins in the yeast Saccharomyces cerevisiae. Biochim. Biophys. Acta 2007 1771:405-420.

21. Farazi TA, Waksman G, Gordon JI. The biology and enzymology of protein N-myristoylation. J. Biol. Chem. 2001 276:39501-39504.

22. McTaggart S. Isoprenylated proteins. Cell. Mol. Life Sci. 2006 63:255-267.

23. Linder ME, Deschenes RJ. Palmitoylation: policing protein stability and traffic. Nat. Rev. Mol. Cell Biol. 2007 8:74-84.

24. Resh MD. Trafficking and signaling by fatty-acylated and prenylated proteins. Nat. Chem. Biol. 2006 2:584-590.

25. Perham RN. Swinging arms and swinging domains in multifunctional enzymes: catalytic machines for multistep reactions. Annu. Rev. Biochem. 2000 69:961-1004.

26. Fischbach MA, Walsh CT. Assembly-line enzymology for polyketide and nonribosomal peptide antibiotics: logic, machinery, and mechanisms. Chem. Rev. 2006 106:3468-3496.

27. Schwartz B, Klinman JP. Mechanisms of biosynthesis of protein-derived redox cofactors. Vitam. Horm. 2001 61:219-239.

28. Ghosh D. Human sulfatases: a structural perspective to catalysis. Cell. Mol. Life Sci. 2007 64:2013-2022.

29. Schwede TF, Retey J, Schulz GE. Crystal structure of histidine ammonia-lyase revealing a novel polypeptide modification as the catalytic electrophile. Biochemistry 1999 38:5355-5361.

30. Calabrese JC, Jordan DB, Boodhoo A, Sariaslani S, Vannelli T. Crystal structure of phenylalanine ammonia lyase: multiple helix dipoles implicated in catalysis. Biochemistry 2004 43:11403-11416.

31. Christenson SD, Liu W, Toney MD, Shen B. A novel 4-methylideneimidazole-5-one-containing tyrosine aminomutase in enediyne antitumor antibiotic C-1027 biosynthesis. J. Am. Chem. Soc. 2003 125:6062-6063.

32. van Poelje PD, Snell EE. Pyruvoyl-dependent enzymes. Annu. Rev. Biochem. 1990 59:29-59.

33. Kadokura H, Katzen F, Beckwith J. Protein disulfide bond formation in prokaryotes. Annu. Rev. Biochem. 2003 72:111-135.

34. Lodish H, et al. Molecular Cell Biology. 2007. W.H. Freeman & Co Ltd, New York.

35. Takeuchi M, Kobata A. Structures and functional roles of the sugar chains of human erythropoietins. Glycobiology 1991 1:337-346.

36. Saiardi A, Bhandari R, Resnick AC, Snowman AM, Snyder SH. Phosphorylation of proteins by inositol pyrophosphates. Science 2004 306:2101-2105.

37. Skalhegg BS, Tasken K. Specificity in the cAMP/PKA signaling pathway. Differential expression, regulation, and subcellular localization of subunits of PKA. Front. Biosci. 2000 5:678-693.

38. Pierce KL, Premont RT, Lefkowitz RJ. Seven-transmembrane receptors. Nat. Rev. Mol. Cell Biol. 2002 3:639-650.

39. Hurley JH. Structure, mechanism, and regulation of mammalian adenylyl cyclase. J. Biol. Chem. 1999 274:7599-7602.

40. Krebs EG, Beavo JA. Phosphorylation-dephosphorylation of enzymes. Annu. Rev. Biochem. 1979 48:923-959.

41. Daniel PB, Walker WH, Habener JF. Cyclic AMP signaling and gene regulation. Annu. Rev. Nutr. 1998 18:353-383.

42. Schlessinger J. Cell signaling by receptor tyrosine kinases. Cell 2000 103:211-225.

43. Avruch J. MAP kinase pathways: the first twenty years. Biochim. Biophys. Acta 2007 1773:1150-1160.

44. Melo JV, Barnes DJ. Chronic myeloid leukaemia as a model of disease evolution in human cancer. Nat. Rev. Cancer 2007 7:441-453.

45. Roberts PJ, Der CJ. Targeting the Raf-MEK-ERK mitogen-activated protein kinase cascade for the treatment of cancer. Oncogene 2007 26:3291-3310.

46. Shchemelinin I, Sefc L, Necas E. Protein kinase inhibitors. Folia Biol. (Praha) 2006 52:137-148.

47. Bialy L, Waldmann H. Inhibitors of protein tyrosine phosphatases: next-generation drugs? Angew. Chem. Int. Ed. Engl. 2005 44:3814-3839.

48. Roth SY, Denu JM, Allis CD. Histone acetyl transferases. Annu. Rev. Biochem. 2001 70:81-120.

49. Luger K, Mader AW, Richmond RK, Sargent DF, Richmond TJ. Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 1997 389:251-260.

50. Verdone L, Caserta M, Mauro ED. Role of histone acetylation in the control of gene expression. Biochem. Cell Biol. 2005 83:344-353.

51. Yang XJ. Lysine acetylation and the bromodomain: a new partnership for signaling. Bioessays 2004 26:1076-1087.

52. Shahbazian MD, Grunstein M. Functions of site-specific histone acetylation and deacetylation. Annu. Rev. Biochem. 2007 76:75-100.

53. Han J, et al. Rtt109 acetylates histone H3 lysine 56 and functions in DNA replication. Science 2007 315:653-655.

54. Driscoll R, Hudson A, Jackson SP. Yeast Rtt109 promotes genome stability by acetylating histone H3 on lysine 56. Science 2007 315:649-652.

55. Yang XJ, Gregoire S. Metabolism, cytoskeleton and cellular signalling in the grip of protein Nε- and O-acetylation. EMBO Rep. 2007 8:556-562.

56. Grozinger CM, Schreiber SL. Deacetylase enzymes: biological functions and the use of small-molecule inhibitors. Chem. Biol. 2002 9:3-16.

57. Imai SI, Armstrong CM, Kaeberlein M, Guarente L. Transcriptional silencing and longevity protein Sir2 is an NAD-dependent histone deacetylase. Nature 2000 403:795-800.

58. Sauve AA, Wolberger C, Schramm VL, Boeke JD. The biochemistry of sirtuins. Annu. Rev. Biochem. 2006 75:435-465.

59. Polevoda B, Sherman F. N-terminal acetyltransferases and sequence requirements for N-terminal acetylation of eukaryotic proteins. J. Mol. Biol. 2003 325:595-622.

60. Timmermann S, Lehrmann H, Polesskaya A, Harel-Bellan A. Histone acetylation and disease. Cell. Mol. Life Sci. 2001 58:728-736.

61. Varier RA, Swaminathan V, Balasubramanyam K, Kundu TK. Implications of small molecule activators and inhibitors of histone acetyltransferases in chromatin therapy. Biochem. Pharmacol. 2004 68:1215-1220.

62. Marks PA, Breslow R. Dimethyl sulfoxide to vorinostat: development of this histone deacetylase inhibitor as an anticancer drug. Nat. Biotechnol. 2007 25:84-90.

63. Polevoda B, Sherman F. Methylation of proteins involved in translation. Mol. Microbiol. 2007 65:590-606.

64. Fontecave M, Atta M, Mulliez E. S-adenosylmethionine: nothing goes to waste. Trends Biochem. Sci. 2004 29:243-249.

65. Kouzarides T. Histone methylation in transcriptional control. Curr. Opin. Genet. Dev. 2002 12:198-209.

66. Schubert HL, Blumenthal RM, Cheng X. Many paths to methyltransfer: a chronicle of convergence. Trends Biochem. Sci. 2003 28:329-335.

67. Bedford MT, Richard S. Arginine methylation: an emerging regulator of protein function. Mol. Cell 2005 18:263-272.

68. Bannister AJ, Kouzarides T. Reversing histone methylation. Nature 2005 436:1103-1106.

69. Jacobs SA, Khorasanizadeh S. Structure of HP1 chromodomain bound to a lysine 9-methylated histone H3 tail. Science 2002 295:2080-2083.

70. Huyen Y, et al. Methylated lysine 79 of histone H3 targets 53BP1 to DNA double-strand breaks. Nature 2004 432:406-411.

71. Flanagan JF, et al. Double chromodomains cooperate to recognize the methylated histone H3 tail. Nature 2005 438:1181-1185.

72. Cote J, Richard S. Tudor domains bind symmetrical dimethylated arginines. J. Biol. Chem. 2005 280:28476-28483.

73. Sprangers R, Groves MR, Sinning I, Sattler M. High-resolution X-ray and NMR structures of the SMN tudor domain: conformational variation in the binding site for symmetrically dimethylated arginine residues. J. Mol. Biol. 2003 327:507-520.

74. Friesen WJ, Massenet S, Paushkin S, Wyce A, Dreyfuss G. SMN, the product of the spinal muscular atrophy gene, binds preferentially to dimethylarginine-containing protein targets. Mol. Cell 2001 7:1111-1117.

75. Bannister AJ, et al. Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature 2001 410:120-124.

76. Nakayama J-I, Rice JC, Strahl BD, Allis CD, Grewal SIS. Role of histone H3 lysine 9 methylation in epigenetic control of heterochromatin assembly. Science 2001 292:110-113.

77. Lachner M, O’Carroll D, Rea S, Mechtler K, Jenuwein T. Methylation of histone H3 lysine 9 creates a binding site for HP1 proteins. Nature 2001 410:116-120.

78. Chuikov S, et al. Regulation of p53 activity through lysine methylation. Nature 2004 432:353-360.

79. Huang J, et al. Repression of p53 activity by Smyd2-mediated methylation. Nature 2006 444:629-632.

80. Shi X, et al. Modulation of p53 function by SET8-mediated methylation at lysine 382. Mol. Cell 2007 27:636-646.

81. Kouskouti A, Scheer E, Staub A, Tora L, Talianidis I. Gene-specific modulation of TAF10 function by SET9-mediated methylation. Mol. Cell 2004 14:175-182.

82. Mowen KA, et al. Arginine methylation of STAT1 modulates IFNα/β-induced transcription. Cell 2001 104:731-741.

83. Xu W, et al. A transcriptional switch mediated by cofactor methylation. Science 2001 294:2507-2511.

84. Shi Y, et al. Histone demethylation mediated by the nuclear amine oxidase homolog LSD1. Cell 2004 119:941-953.

85. Stavropoulos P, Blobel G, Hoelz A. Crystal structure and mechanism of human lysine-specific demethylase-1. Nat. Struct. Mol. Biol. 2006 13:626-632.

86. Whetstine JR, et al. Reversal of histone lysine trimethylation by the JMJD2 family of histone demethylases. Cell 2006 125:467-481.

87. Wang Y, et al. Human PAD4 regulates histone arginine methylation levels via demethylimination. Science 2004 306:279-283.

88. Thompson PR, Fast W. Histone citrullination by protein arginine deiminase: is arginine methylation a green light or a roadblock? ACS Chem. Biol. 2006 1:433-441.

89. Kearney PL, et al. Kinetic characterization of protein arginine deiminase 4: a transcriptional corepressor implicated in the onset and progression of rheumatoid arthritis. Biochemistry 2005 44:10570-10582.

90. Raijmakers R, et al. Methylation of arginine residues interferes with citrullination by peptidylarginine deiminases in vitro. J. Mol. Biol. 2007 367:1118-1129.

91. Hidaka Y, Hagiwara T, Yamada M. Methylation of the guanidino group of arginine residues prevents citrullination by peptidylarginine deiminase IV. FEBS Lett. 2005 579:4088-4092.

92. Chang B, Chen Y, Zhao Y, Bruick RK. JMJD6 is a histone arginine demethylase. Science 2007 318:444-447.

93. Fraga MF, Esteller M. Towards the human cancer epigenome: a first draft of histone modifications. Cell Cycle 2005 4:1377-1381.

94. Shi Y, Whetstine JR. Dynamic regulation of histone lysine methylation by demethylases. Mol. Cell 2007 25:1-14.

95. Okada Y, et al. hDOT1L links histone methylation to leukemogenesis. Cell 2005 121:167-178.

96. Burda P, Aebi M. The dolichol pathway of N-linked glycosylation. Biochim. Biophys. Acta 1999 1426:239-257.

97. Yan A, Lennarz WJ. Unraveling the mechanism of protein N-glycosylation. J. Biol. Chem. 2005 280:3121-3124.

98. Roth J. Protein N-glycosylation along the secretory pathway: relationship to organelle topography and function, protein quality control, and cell interactions. Chem. Rev. 2002 102:285-304.

99. Kornfeld R, Kornfeld S. Assembly of asparagine-linked oligosaccharides. Annu. Rev. Biochem. 1985 54:631-664.

100. Knauer R, Lehle L. The oligosaccharyltransferase complex from yeast. Biochim. Biophys. Acta 1999 1426:259-273.

101. Peter-Katalinic J. Methods in enzymology: O-Glycosylation of proteins. Methods Enzymol. 2005 405:139-171.

102. Zachara NE, Hart GW. The emerging significance of O-GlcNAc in cellular regulation. Chem. Rev. 2002 102:431-438.

103. Love DC, Hanover JA. The hexosamine signaling pathway: deciphering the “O-GlcNAc Code”. Sci. STKE 2005:re13.

104. Furmanek A, Hofsteenge J. Protein C-mannosylation: facts and questions. Acta Biochim Pol. 2000 47:781-789.

105. Neufeld EF. Lysosomal storage diseases. Annu. Rev. Biochem. 1991 60:257-280.

106. Schachter H. Congenital disorders involving defective N-glycosylation of proteins. Cell. Mol. Life Sci. 2001 58:1085-1104.

107. Dube DH, Bertozzi CR. Glycans in cancer and inflammation - potential for therapeutics and diagnostics. Nat. Rev. Drug Discov. 2005 4:477-488.

108. Trombetta ES, Parodi AJ. Quality control and protein folding in the secretory pathway. Annu. Rev. Cell. Dev. Biol. 2003 19:649-676.

109. Okajima T, Xu A, Lei L, Irvine KD. Chaperone activity of protein O-fucosyltransferase 1 promotes notch receptor folding. Science 2005 307:1599-1603.

110. Wyss DF, et al. Conformation and function of the N-linked glycan in the adhesion domain of human CD2. Science 1995 269:1273-1278.

111. Krapp S, Mimura Y, Jefferis R, Huber R, Sondermann P. Structural analysis of human IgG-Fc glycoforms reveals a correlation between glycosylation and structural integrity. J. Mol. Biol. 2003 325:979-989.

112. Sondermann P, Huber R, Oosthuizen V, Jacob U. The 3.2-Å crystal structure of the human IgG1 Fc fragment-FcγRIII complex. Nature 2000 406:267-273.

113. Pickart CM. Mechanisms underlying ubiquitination. Annu. Rev. Biochem. 2001 70:503-533.

114. Zheng N, et al. Structure of the Cul1-Rbx1-Skp1-F box(Skp2) SCF ubiquitin ligase complex. Nature 2002 416:703-709.

115. Cardozo T, Pagano M. The SCF ubiquitin ligase: insights into a molecular machine. Nat. Rev. Mol. Cell Biol. 2004 5:739-751.

116. Mukhopadhyay D, Riezman H. Proteasome-independent functions of ubiquitin in endocytosis and signaling. Science 2007 315:201-205.

117. Hochstrasser M. Lingering mysteries of ubiquitin-chain assembly. Cell 2006 124:27-34.

118. Wing SS. Deubiquitinating enzymes-the importance of driving in reverse along the ubiquitin-proteasome pathway. Int. J. Biochem. Cell Biol. 2003 35:590-605.

119. Reed SI. Ratchets and clocks: the cell cycle, ubiquitylation and protein turnover. Nat. Rev. Mol. Cell Biol. 2003 4:855-864.

120. Schofield CJ, Ratcliffe PJ. Oxygen sensing by HIF hydroxylases. Nat. Rev. Mol. Cell Biol. 2004 5:343-354.

121. Gallego M, Virshup DM. Post-translational modifications regulate the ticking of the circadian clock. Nat. Rev. Mol. Cell Biol. 2007 8:139-148.

122. Salmena L, Pandolfi PP. Changing venues for tumour suppression: balancing destruction and localization by monoubiquitylation. Nat. Rev. Cancer 2007 7:409-413.

123. Thrower JS, Hoffman L, Rechsteiner M, Pickart CM. Recognition of the polyubiquitin proteolytic signal. EMBO J. 2000 19:94-102.

124. Madura K. Rad23 and Rpn10: perennial wallflowers join the melee. Trends Biochem. Sci. 2004 29:637-640.

125. Flick K, Raasi S, Zhang H, Yen JL, Kaiser P. A ubiquitin-interacting motif protects polyubiquitinated Met4 from degradation by the 26S proteasome. Nat. Cell Biol. 2006 8:509-515.

126. Rubinsztein DC. The roles of intracellular protein-degradation pathways in neurodegeneration. Nature 2006 443:780-786.

127. Ciechanover A. Intracellular protein degradation: from a vague idea, through the lysosome and the ubiquitin-proteasome system, and onto human diseases and drug targeting (Nobel lecture). Angew. Chem. Int. Ed. 2005 44:5944-5967.

128. Luzio JP, Pryor PR, Bright NA. Lysosomes: fusion and function. Nat. Rev. Mol. Cell Biol. 2007 8:622-632.

129. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annu. Rev. Biochem. 1999 68:1015-1068.

130. Marmor MD, Yarden Y. Role of protein ubiquitylation in regulating endocytosis of receptor tyrosine kinases. Oncogene 2004 23:2057-2070.

131. Shenoy SK. Seven-transmembrane receptors and ubiquitination. Circ. Res. 2007 100:1142-1154.

132. Paetzel M, Karla A, Strynadka NCJ, Dalbey RE. Signal peptidases. Chem. Rev. 2002 102:4549-4580.

133. Rockwell NC, Krysan DJ, Komiyama T, Fuller RS. Precursor processing by Kex2/Furin proteases. Chem. Rev. 2002 102:4525-4548.

134. Fortini ME. γ-Secretase-mediated proteolysis in cell-surface-receptor signalling. Nat. Rev. Mol. Cell Biol. 2002 3:673-684.

135. Young JC, Hartl FU. A stress sensor for the bacterial periplasm. Cell 2003 113:1-2.

136. Yan N, Shi Y. Mechanisms of apoptosis through structural biology. Annu. Rev. Cell. Dev. Biol. 2005 21:35-56.

137. Thompson CB. Apoptosis in the pathogenesis and treatment of disease. Science 1995 267:1456-1462.

138. Kalume DE, Molina H, Pandey A. Tackling the phosphoproteome: tools and strategies. Curr. Opin. Chem. Biol. 2003 7:64-69.

139. Peng J, et al. A proteomics approach to understanding protein ubiquitination. Nat. Biotechnol. 2003 21:921-926.

140. Nandi A, et al. Global identification of O-GlcNAc-modified proteins. Anal. Chem. 2006 78:452-458.

141. Roth AF, et al. Global analysis of protein palmitoylation in yeast. Cell 2006 125:1003-1013.

142. Songyang Z, et al. Use of an oriented peptide library to determine the optimal substrates of protein kinases. Curr. Biol. 1994 4:973-982.

143. Obata T, et al. Peptide and protein library screening defines optimal substrate motifs for AKT/PKB. J. Biol. Chem. 2000 275:36108-36115.

144. Ubersax JA, et al. Targets of the cyclin-dependent kinase Cdk1. Nature 2003 425:859-864.

145. Evans MJ, Cravatt BF. Mechanism-based profiling of enzyme families. Chem. Rev. 2006 106:3279-3301.

146. Love KR, Catic A, Schlieker C, Ploegh HL. Mechanisms, biology and inhibitors of deubiquitinating enzymes. Nat. Chem. Biol. 2007 3:697-705.

147. Patricelli MP, et al. Functional interrogation of the kinome using nucleotide acyl phosphates. Biochemistry 2007 46:350-358.

148. Kumar S, et al. Activity-based probes for protein tyrosine phosphatases. Proc. Natl. Acad. Sci. U.S.A. 2004 101:7943-7948.

149. Luo Y, Knuckley B, Bhatia M, Pellechia PJ, Thompson PR. Activity-based protein profiling reagents for protein arginine deiminase 4 (PAD4): synthesis and in vitro evaluation of a fluorescently labeled probe. J. Am. Chem. Soc. 2006 128:14468-14469.

150. Sze SK, Ge Y, Oh H, McLafferty FW. From the cover: Top-down mass spectrometry of a 29-kDa protein for characterization of any posttranslational modification to within one residue. Proc. Natl. Acad. Sci. U.S.A. 2002 99:1774-1779.

151. Ermler U, Grabarse W, Shima S, Goubeaud M, Thauer RK. Crystal structure of methyl-coenzyme M reductase: the key enzyme of biologic methane formation. Science 1997 278:1457-1462.


To determine the contribution of translational regulation to gene expression level differences between humans and closely related species, we generated new data using ribosome profiling to estimate translation levels. This dataset, in conjunction with the data described in Battle et al. [17] and Cenik et al. [27], provided a unique opportunity to explore recent evolution of translational regulation in humans. Through joint analysis with RNA-seq measurement of transcript levels and quantitative mass spectrometry measurement of protein levels, we provided an integrated view of divergence in gene regulation across primates. We found that divergence in translation efficiency is rare, which means that divergence between primate species at the transcript level often propagates to the level of protein translation (ribosome occupancy). This observation is in contrast to previous reports of pervasive translational buffering observed in F1 hybrids between S. cerevisiae and S. paradoxus [24, 25]. Interestingly, a report focusing on the same process in budding yeast hybrids between laboratory and wild isolate strains [23] and a follow-up reanalysis of the Artieri dataset [37] contradict the notion of pervasive translational buffering. Instead, their results were more in line with our observations in primates.

Translational regulation is often controlled by regulatory elements that reside in the UTR regions. Variants found in the UTR regions are therefore more likely to impact translation efficiency. Given the level of sequence divergence in the UTR regions [38], the amount of divergence in translation efficiency found between primates appears to be unexpectedly low. That being said, whether these substitutions in the UTR regions impact translational rate remains an open question. It is possible that these genetic variants, while impactful, are cryptic in the environment we tested. Further studies applying appropriate environmental perturbations could reveal species divergence in translational regulation [39]. On the other hand, we identified some inter-species divergence in translation efficiency. Interestingly, however, among the limited number of genes that show significant inter-species divergence in translation efficiency, transcriptional divergence often predicts protein level as well as (or better than) translational divergence for these genes. In other words, inter-primate divergence in translational regulation appears to have minor impact on gene expression differences at the protein level. Unfortunately, measurement noise prevented us from obtaining a precise estimate for the percentage of translational regulation that has a persistent impact on steady state protein levels. However, we were able to show that in contrast to transcriptional regulation, divergence in translation efficiency has only a minor impact on protein levels.
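The comparison described above can be made concrete with a small worked calculation. The sketch below assumes the common convention that translation efficiency is the ratio of ribosome occupancy to mRNA abundance, and that inter-species divergence is the difference of the log2 efficiencies. The function names and the input numbers are hypothetical, and the actual analysis would involve normalization and a statistical model that this sketch omits.

```python
import math


def log2_translation_efficiency(ribo_occupancy, mrna_level):
    """log2 of ribosome occupancy per unit of mRNA (assumed definition)."""
    return math.log2(ribo_occupancy / mrna_level)


def te_divergence(ribo_a, mrna_a, ribo_b, mrna_b):
    """Difference in log2 translation efficiency between species A and B.

    A value near zero means that a transcript-level difference is carried
    through to ribosome occupancy without translational amplification or
    buffering.
    """
    return (log2_translation_efficiency(ribo_a, mrna_a)
            - log2_translation_efficiency(ribo_b, mrna_b))


# Hypothetical gene 1: mRNA and ribosome occupancy both differ twofold
# between species, so translation efficiency has not diverged.
print(te_divergence(ribo_a=200.0, mrna_a=100.0, ribo_b=100.0, mrna_b=50.0))

# Hypothetical gene 2: equal mRNA levels but fourfold higher occupancy
# in species A, so translation efficiency differs by 2 log2 units.
print(te_divergence(ribo_a=400.0, mrna_a=100.0, ribo_b=100.0, mrna_b=100.0))
```

Under this convention, the first gene represents the common case reported here, where transcript-level divergence propagates to ribosome occupancy.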

In contrast to gene regulation at translation, we found post-translational gene regulation to have a much broader impact on protein levels. Regulation at this layer often attenuates variation created upstream. A direct comparison between p values from testing effects of buffering from translational vs post-translational mechanisms clearly showed that more genes are regulated by the post-translational mechanisms. Buffering of divergence in gene expression levels has broad implications, especially in the context of evolution. For most genes, it is the protein that executes cellular functions. Variation in gene expression that has not reached the protein level is therefore less likely to impact organismal phenotypes. Consistent with this notion, we found evidence for relaxation of selective constraint on the mRNA levels in the HapMap YRI population for buffered genes identified between primate species. Further investigation of gene expression buffering in the context of population genetics would likely provide valuable insights into how selection might act on the regulatory variants associated with buffered genes. We found parallels between the effects of post-translational buffering on gene expression divergence and the effects of HSP90 chaperone action in rectifying mis-folding caused by missense mutations [40, 41]. HSP90 confers phenotypic robustness by buffering the fitness impact imposed by non-synonymous mutations, likely through either correcting the protein structure or facilitating the degradation process [42]. We speculate that, parallel to HSP90 buffering at the structural level, post-translational buffering could confer phenotypic robustness at the gene expression level by stabilizing protein expression levels against mutations impacting transcriptional regulation.

We identified post-translationally buffered genes across all three pairwise species comparisons. This observation suggests that post-translational buffering is a conserved mechanism likely evolved under stabilizing selection for protein levels in primates. It remains unclear how post-translational buffering is achieved. We found enrichment of post-translational modifications among this group of genes without significant enrichment of coding substitutions. It could be that divergence in post-translational modifications instead of divergence in coding sequence led to differential turnover rates of proteins and therefore drives buffering. This interpretation provides an explanation for how post-translational buffering could be achieved between human and chimpanzee given the apparent low level of protein sequence divergence.

Post-translational buffering could be a consequence of a conserved cellular quality control system, such as endoplasmic-reticulum-associated protein degradation (ERAD) [43]. Protein quality control mechanisms are in place to ensure that proteins are properly folded and present in adequate amounts to execute biological functions [44]. Adequate post-translational modifications are required for proper folding to take place. Moreover, ubiquitination is a key step in targeting mis-folded proteins to the proteasome for degradation [44, 45]: misfolded proteins arising from mutation or from a shortage of chaperones are labelled for degradation by ubiquitination. Consistent with the role of protein quality control mechanisms, we observed significant enrichment of reported ubiquitination sites in post-translationally buffered genes (Fig. 3e). In addition, many proteins are assembled into multi-subunit complexes with defined stoichiometry. Excess components of these complexes are targeted to the proteasome for degradation [46,47,48]. Active degradation of excess product of translation could explain the apparent buffering of divergence at the protein level. Consistent with this notion, Chick et al. recently reported evidence supporting a stoichiometric buffering effect [18]. Moreover, Ishikawa et al. demonstrated that effects of artificial perturbation of protein stoichiometry through genetic manipulation are often buffered post-translationally [49]. Multiple protein quality control pathways could be involved in post-translational buffering. By overexpressing ribosomal proteins, Sung et al. described a nuclear protein-degradation mechanism mediated by ubiquitination in maintaining ribosomal protein stoichiometry [50]. Perhaps not coincidentally, our gene ontology analysis also found enrichment of genes that are involved in the process of protein translation for post-translationally buffered genes (Additional file 1: Table S5).
Further investigation to identify factors involved in maintaining post-translational buffering would provide insights to advance our understanding of both how natural selection acts on gene regulation and how to better predict phenotypes given genetic variants that impact gene expression.

Taken together, our study provided the first integrative view on gene expression divergence across primates that allows a comparison between translational and post-translational events. We found extensive post-translational gene expression buffering that led to a stable protein level across primate species. We propose a scenario where buffering evolved under stabilizing selection of protein levels that prevents negative impacts on organismal fitness from protein level variation while allowing the transcript level to diverge for quick adaptation to environmental changes. Given the energy cost of protein translation [51], it remains puzzling to us that stabilizing selection appears to act on the post-translational level instead of the translational level. We reason that evolution of post-translational buffering is probably the more parsimonious path and speculate a trans-acting mechanism, involving post-translational modification enzymes, achieved gene expression buffering in a relatively short period of evolutionary time.


The modulation of protein function via different types of post-translational modifications (PTMs) and their combinatorial interplay has attracted considerable attention in recent years [15�]. In this study, we added the interaction layer to the study of PTMs by performing a systematic investigation of the network properties of the different PTM-types in the context of the physical interactions of PTM-carrying proteins. For twelve different PTM-types and across nine diverse species, we determined characteristic and informative network parameters with the goal to investigate whether particular PTM-types are associated with specific and possibly “strategic” placements in the context of all protein interactions such that their individual role in the orchestration of the combined action of all proteins becomes apparent.

Generalized across all PTM-types and species investigated here, PTM-carrying proteins appear to engage in more physical contacts, with a reduced clustering coefficient among the proteins they interact with, and to have elevated closeness centrality compared with the respective protein sets devoid of the particular PTM-type (Fig. 5) or that, as far as we currently know, do not harbor any PTM of any type (S2 Fig.). Differences between the twelve studied PTM-types proved less pronounced, with essentially all, except for glycosylation (see below), following the same trend of high degree, low clustering coefficient, and high closeness centrality with only subtle differences in magnitude between them. However, given the present data coverage, it is not yet possible to conclusively decide whether these differences are statistically significant and biologically relevant. When further subsetted into special types of PTMs (e.g. S/T/Y phosphorylation), no significant sub-type differences were evident (S3 Fig.). As motivated above, the three network properties were selected specifically to allow conclusions about the “strategic” roles of PTMs in the context of interactions. According to this logic, proteins with PTMs engage in more and different processes than non-PTM proteins and play central information relay functions.
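The three network properties discussed above can be illustrated on a toy graph. The following is a minimal sketch, not the pipeline used in the study: the edge list and node names are hypothetical placeholders, and the standard textbook definitions are assumed (degree as the number of neighbours, local clustering coefficient as the fraction of neighbour pairs that are themselves connected, closeness centrality as the number of reachable nodes divided by the sum of shortest-path distances to them).

```python
from collections import deque

# Hypothetical toy interaction network (undirected); node names are placeholders.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj, node):
    """Number of direct interaction partners."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness_centrality(adj, node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest paths in unweighted graphs
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


ADJ = build_adjacency(EDGES)
for name in sorted(ADJ):
    print(name, degree(ADJ, name),
          round(clustering_coefficient(ADJ, name), 3),
          round(closeness_centrality(ADJ, name), 3))
```

In this toy graph, node A has many partners that are mostly not connected to each other and sits close to every other node, which is the high degree, low clustering, high closeness profile described for PTM-carrying proteins.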

Focusing on human PIN and PTM data, sumoylation and proteolytic cleavage stand out as being associated with the largest increase of degree and closeness centrality relative to reference sets. Proteolytic cleavage has been associated with activation processes and protein targeting events (cleavage of the targeting N-terminal peptide) and constitutes a “dramatic” modification, as the relative change in the molecular composition of a protein can be significant. Furthermore, transporting proteins to different compartments will inevitably influence the possible interaction scope. The significance of sumoylation in a range of regulatory processes has been increasingly recognized [44]. Our results underscore the importance of this PTM-type.

Phosphorylation, the PTM-type with the largest data support, was identified as the PTM-type that is most consistently central and has the largest potential influence scope (Fig. 5). Phosphorylated proteins reside in central network positions (high closeness centrality) and interact with many other proteins (high degree), including specifically pairwise interactions with proteins carrying any of the other four PTM-types as well as other phosphorylated proteins (Fig. 6, pairwise interaction figure). Examples from human of phospho-proteins interacting with proteins carrying other PTM-types include the kinases mitogen-activated protein kinase 1 (MAPK1), interleukin-1 receptor-associated kinase 2 (IRAK2), and spleen tyrosine kinase (SYK). Those proteins each interact with other proteins representing four different PTM-types. These findings underscore once again the central importance of phosphorylation as perhaps the most important and central PTM-type identified so far. Similar characteristics were found for acetylation, although the detected magnitude and statistical support are lower.

By contrast, glycosylation was found associated with proteins of low degree, low clustering coefficient, and low closeness centrality ( Fig. 5 ). In particular the low degree and low closeness centrality of glycosylated proteins may be interpreted as consistent with their preferred location in cytosolic membranes and to act as receptors and cell-cell communication mediators ( Fig. 1 , GO-term clustering) [45]. Unlike the other four PTM-types, the transferred glycosyl-groups can be large leading to impeded protein-protein interactions of glycosylated proteins. In addition, because of their frequent embedding in membranes, they operate in two dimensions, not three as for soluble cytosolic proteins, effectively cutting down the interaction potential.

As shown in Fig. 6, all PTM-types are found on proteins that exhibit a tendency to interact with other proteins carrying the same PTM-type. In the case of phosphorylation, such interactions are interpretable as the well-known phosphorylation/kinase cascades [46, 47]. It is also possible that the detected tendency of PTM-types to self-interact originates from protein complexes in which all partners undergo PTMs of the same type. For example, in histone complexes, lysine residues on different proteins are acetylated, modifying the binding affinity of histones to DNA [48, 49]. Similar considerations apply to methylation events in histone [50] and other protein complexes [51].

By including nine species from different kingdoms and lineages, we aimed to extract both general and species/lineage-specific trends. However, currently available datasets proved comprehensive enough for a few species only (human, mouse, rat). In the case of phosphorylation, sufficient data were available across all nine species and provided a consistent result of increased degree and closeness centrality and a decreased clustering coefficient ( Fig. 5 ).

The increased likelihood of a functional association of proteins with high interaction degree and their involvement in human disease has been reported before [52, 53]. In selected cases, proteins carrying PTMs have also been reported to be more likely related to disease processes than non-PTM proteins [54�]. Our dataset allowed us to expand this analysis to testing specific PTM-types combined with their PIN-characteristics. Our results suggest that not only does a PTM render proteins more likely to be disease associated, but that this association may depend on the PIN context in which the protein is embedded. High degree, low clustering coefficient, and high closeness centrality proteins are more likely to be disease associated (Table 3) than their counterpart sets at the other end of the PIN-property spectrum, especially for the PTM-types phosphorylation and glycosylation, although for the latter no significant clustering coefficient trend was detected. Examples of disease-associated phosphorylated or glycosylated proteins detected with high degree and closeness centrality or low clustering coefficient are provided in Table 4. It may be speculated that proteins with the PIN properties identified as more likely disease associated may constitute promising candidates for intensified research. Evidently, the relevance of the protein p53 in human cancer development has long been recognized [57]. In our study, it was identified as one with network properties typical of disease-associated proteins in general.

Table 4

PTM-type | PIN-characteristics | Protein | Ensembl ID | Disease
Glycosylation | high degree, high closeness centrality | Pro-epidermal growth factor | ENSP00000265171 | Hypomagnesemia 4 (HOMG4) [MIM:611718] [67]
Glycosylation | high degree, high closeness centrality | Transforming growth factor beta-1 | ENSP00000221930 | Camurati-Engelmann disease (CE) [MIM:131300] [68]
Glycosylation | high degree, high closeness centrality | Interleukin-6 | ENSP00000258743 | Rheumatoid arthritis systemic juvenile (RASJ) [MIM:604302] [69]
Phosphorylation | high degree, high closeness centrality | Cellular tumor antigen p53 | ENSP00000269305 | Esophageal cancer (ESCR) [MIM:133239] [70]; Li-Fraumeni syndrome (LFS) [MIM:151623] [71]
Phosphorylation | high degree, high closeness centrality | RAC-alpha serine/threonine-protein kinase | ENSP00000270202 | Breast cancer (BC) [MIM:114480]; Colorectal cancer (CRC) [MIM:114500] [72]; Proteus syndrome (PROTEUSS) [MIM:176920] [73]
Phosphorylation | high degree, high closeness centrality | Histone acetyltransferase p300 | ENSP00000263253 | Rubinstein-Taybi syndrome 2 (RSTS2) [MIM:613684] [74]
Phosphorylation | low clustering coefficient | Autoimmune regulator | ENSP00000291582 | Autoimmune polyendocrine syndrome 1, with or without reversible metaphyseal dysplasia (APS1) [MIM:240300] [75]
Phosphorylation | low clustering coefficient | ALK tyrosine kinase receptor | ENSP00000373700 | Neuroblastoma 3 (NBLST3) [MIM:613014] [76]
Phosphorylation | low clustering coefficient | Ataxin-2 | ENSP00000366843 | Spinocerebellar ataxia 2 (SCA2) [MIM:183090] [77]; Amyotrophic lateral sclerosis 13 (ALS13) [MIM:183090] [78]

Evidently, this study hinges on the completeness and accuracy of the available PTM and PIN data. Any bias towards the detection of particular protein classes and their associated PTMs may further skew our results. By imposing a high significance cutoff for the PIN-data (confidence score > 0.9), and furthermore exploiting two data sources (STRING and IntAct), we believe we have taken proper precautionary steps, even though some discrepancies were detected (Fig. 5). However, at this point it cannot be decided whether the size of the dataset (relatively small IntAct data set) or the type of PINs that are recorded causes these differences. With regard to PTMs, we used experimentally verified PTMs only. Future investigations of the PIN characteristics of PTMs will benefit from the expected significant increase of experimentally verified sites. In addition, a larger set of different PTMs with sufficient numbers will likely become available, which will also allow the PTM-types used in this study to be specified further.

A possible selection bias may also come from preferentially profiling those proteins for PTMs that possess “interesting” properties such as high degree. However, as PTMs are increasingly identified in massive, “shotgun” style omics studies, such selection bias may not be that critical. Rather, abundance may then be a concern. However, for phosphorylation it was reported that protein abundance is not correlated with network properties [34, 35]. Furthermore, we also found that network properties are largely independent of the number of PTMs on a given protein (S6 Fig.). Although statistically significant due to the large number of observations, no relevant correlation was found for either degree or clustering coefficient with the number of phosphorylation sites, taken as the PTM-type with the largest available dataset. However, for closeness centrality, a more sizable positive correlation (r = 0.164) was detected, suggesting that more heavily phosphorylated proteins occupy more central positions in the network of protein-protein interactions.
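To make the kind of correlation reported above concrete, here is a minimal sketch of how a correlation coefficient between the number of phosphorylation sites and a network property could be computed. The text does not state which coefficient was used; Pearson's r is assumed here, and the data values below are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-protein values: phosphosite counts and closeness centrality.
site_counts = [1, 2, 2, 5, 8, 13]
closeness = [0.21, 0.25, 0.19, 0.24, 0.27, 0.26]
print(round(pearson_r(site_counts, closeness), 3))
```

A small r can still be statistically significant when the number of proteins is large, which is why the text separates significance from relevance.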

In conclusion, proteins carrying different types of PTMs differ from average non-PTM-proteins and differ between each other with regard to their protein interaction characteristics. Thus, their location within the web of physical protein-protein interactions is not only non-random, but very likely indicates their specific functional roles in the orchestration of molecular processes mediated by the physical interactions between proteins.

1. Introduction

Neurodegenerative diseases represent a diverse spectrum of disorders that differ in molecular etiology and progression, including Alzheimer’s disease (AD), Parkinson’s disease (PD), dementia with Lewy bodies (DLB), Huntington’s disease (HD), frontotemporal lobar dementia (FTLD), amyotrophic lateral sclerosis (ALS), and others. Such diseases present in patients as progressively worsening symptoms, sometimes with overlap between diagnoses such that they are ultimately and unambiguously identified only by signature molecular neuropathology found at autopsy. For example, cognitive impairment, behavioral deficits, motor sensory dysfunction and other deficits can overlap due to degeneration of specific neural circuits and underlying death and/or dysfunction of neurons, glia, and/or vasculature. The pathogenic mechanism resulting in the onset and progression of each disease is often associated with genetic variants and mutations and their interactions with environmental impacts, lifestyle risk factors, and slowly evolving molecular changes due to aging [1, 2]. Despite the broad range of neurodegenerative clinical and pathological phenotypes, there are several common pathogenic mechanisms. An emerging theme for most diseases is the accumulation of misfolded peptides or proteins and the deposition of their aggregates within areas of the cerebral cortex, basal ganglia, and/or spinal cord, although the relationship between clinical phenotype and protein dysfunction has not been completely elucidated thus far [3, 4]. Protein homeostasis and folding capacity, sometimes referred to as proteostasis, has been suggested as a possible common pathway that is progressively dysregulated with aging and neurodegenerative pathogenesis [5, 6].
Protein forms which emerge post-translationally through modification of protein residues can have starkly different properties due to a single post-translational modification (PTM), for example phosphorylation of Tau in neurofibrillary tangles, or cleavage and PTM of amyloid precursor protein (APP) to yield modified amyloid beta peptides.

Mass spectrometry (MS) is an emerging platform developed to identify and quantify proteins and exact mass/charge (m/z) shifts due to PTMs, of either intact proteins or peptides that derive from those proteins. MS-based proteomics has provided a powerful means to profile complex protein mixtures as tens to hundreds of thousands of peptides in bottom-up proteomics, and will open the door to the identification of an estimated billion proteoforms arising from specific combinations of PTMs on intact proteins across different cell types [7], which can also be accessed with top-down MS approaches [8]. A number of research avenues to investigate neurodegeneration at different levels [e.g. in tissue, subcellular and biochemical fractions, and biomarkers in the extracellular milieu, including plasma or cerebrospinal fluid (CSF)] with a variety of specific proteomic methods have been involved in successful biomarker discovery, drug target development and elucidation of pathogenic mechanisms in the past 20 years [9–17]. The purpose of this review is first to present an overview of general methodologies applicable to the study of post-translational modifications (PTMs) of proteins, and second to provide references which highlight the importance, perhaps centrality, of PTMs in characterizing pathogenic processes of neurodegeneration and imbalances in proteostasis.

84 Eukaryotic Translational and Post-translational Gene Regulation

By the end of this section, you will be able to do the following:

  • Understand the process of translation and discuss its key factors
  • Describe how the initiation complex controls translation
  • Explain the different ways in which the post-translational control of gene expression takes place

After RNA has been transported to the cytoplasm, it is translated into protein. Control of this process is largely dependent on the RNA molecule. As previously discussed, the stability of the RNA will have a large impact on its translation into a protein. As the stability changes, the amount of time that it is available for translation also changes.

The Initiation Complex and Translation Rate

Like transcription, translation is controlled by proteins that bind and initiate the process. In translation, the complex that assembles to start the process is referred to as the translation initiation complex. In eukaryotes, translation is initiated by binding the initiating met-tRNAi to the 40S ribosome. This tRNA is brought to the 40S ribosome by a protein initiation factor, eukaryotic initiation factor-2 (eIF-2). The eIF-2 protein binds to the high-energy molecule guanosine triphosphate (GTP). The tRNA-eIF2-GTP complex then binds to the 40S ribosome. A second complex forms on the mRNA. Several different initiation factors recognize the 5′ cap of the mRNA and proteins bound to the poly-A tail of the same mRNA, forming the mRNA into a loop. The cap-binding protein eIF4F brings the mRNA complex together with the 40S ribosome complex. The ribosome then scans along the mRNA until it finds a start codon AUG. When the anticodon of the initiator tRNA and the start codon are aligned, the GTP is hydrolyzed, the initiation factors are released, and the large 60S ribosomal subunit binds to form the translation complex. The binding of eIF-2 to the RNA is controlled by phosphorylation. If eIF-2 is phosphorylated, it undergoes a conformational change and cannot bind to GTP. Therefore, the initiation complex cannot form properly and translation is impeded (Figure). When eIF-2 remains unphosphorylated, the initiation complex can form normally and translation can proceed.

An increase in phosphorylation levels of eIF-2 has been observed in patients with neurodegenerative diseases such as Alzheimer’s, Parkinson’s, and Huntington’s. What impact do you think this might have on protein synthesis?

Chemical Modifications, Protein Activity, and Longevity

Proteins can be chemically modified with the addition of groups including methyl, phosphate, acetyl, and ubiquitin groups. The addition or removal of these groups from proteins regulates their activity or the length of time they exist in the cell. Sometimes these modifications can regulate where a protein is found in the cell—for example, in the nucleus, in the cytoplasm, or attached to the plasma membrane.

Chemical modifications occur in response to external stimuli such as stress, the lack of nutrients, heat, or ultraviolet light exposure. These changes can alter epigenetic accessibility, transcription, mRNA stability, or translation, all resulting in changes in expression of various genes. This is an efficient way for the cell to rapidly change the levels of specific proteins in response to the environment. Because proteins are involved in every stage of gene regulation, the phosphorylation of a protein (depending on the protein that is modified) can alter accessibility to the chromosome, can alter transcription (by altering transcription factor binding or function), can change nuclear shuttling (by influencing modifications to the nuclear pore complex), can alter RNA stability (by binding or not binding to the RNA to regulate its stability), can modify translation (increase or decrease), or can change post-translational modifications (add or remove phosphates or other chemical modifications).

The addition of a ubiquitin group to a protein marks that protein for degradation. Ubiquitin acts like a flag indicating that the protein lifespan is complete. These proteins are moved to the proteasome, a large protein complex that breaks down proteins, to be degraded (Figure). One way to control gene expression, therefore, is to alter the longevity of the protein.

Section Summary

Changing the status of the RNA or the protein itself can affect the amount of protein, the function of the protein, or how long it is found in the cell. To translate the protein, a protein initiator complex must assemble on the RNA. Modifications (such as phosphorylation) of proteins in this complex can prevent proper translation from occurring. Once a protein has been synthesized, it can be modified (phosphorylated, acetylated, methylated, or ubiquitinated). These post-translational modifications can greatly impact the stability, degradation, or function of the protein.

Visual Connection Questions

(Figure) An increase in phosphorylation levels of eIF-2 has been observed in patients with neurodegenerative diseases such as Alzheimer’s, Parkinson’s, and Huntington’s. What impact do you think this might have on protein synthesis?

(Figure) Protein synthesis would be inhibited.

Review Questions

Post-translational modifications of proteins can affect which of the following?

  1. protein function
  2. transcriptional regulation
  3. chromatin modification
  4. all of the above

A scientist mutates eIF-2 to eliminate its GTP hydrolysis capability. How would this mutated form of eIF-2 alter translation?

  1. Initiation factors would not be able to bind to mRNA.
  2. The large ribosomal subunit would not be able to interact with mRNA transcripts.
  3. tRNAi-Met would not scan mRNA transcripts for the start codon.
  4. eIF-2 would not be able to interact with the small ribosomal subunit.

Critical Thinking Questions

Protein modification can alter gene expression in many ways. Describe how phosphorylation of proteins can alter gene expression.

Because proteins are involved in every stage of gene regulation, phosphorylation of a protein (depending on the protein that is modified) can alter accessibility to the chromosome, can alter transcription (by altering transcription factor binding or function), can change nuclear shuttling (by influencing modifications to the nuclear pore complex), can alter RNA stability (by binding or not binding to the RNA to regulate its stability), can modify translation (increase or decrease), or can change post-translational modifications (add or remove phosphates or other chemical modifications).

Alternative forms of a protein can be beneficial or harmful to a cell. What do you think would happen if too much of an alternative protein bound to the 3′ UTR of an RNA and caused it to degrade?

If the RNA degraded, then less of the protein that the RNA encodes would be translated. This could have dramatic implications for the cell.

Changes in epigenetic modifications alter the accessibility and transcription of DNA. Describe how environmental stimuli, such as ultraviolet light exposure, could modify gene expression.

Environmental stimuli, like ultraviolet light exposure, can alter the modifications to the histone proteins or DNA. Such stimuli may change an actively transcribed gene into a silenced gene by removing acetyl groups from histone proteins or by adding methyl groups to DNA.

A scientist discovers a virus encoding a Protein X that degrades a subunit of the eIF4F complex. Knowing that this virus transcribes its own mRNAs in the cytoplasm of human cells, why would Protein X be an effective virulence factor?

Degrading the eIF4F complex prevents the pre-initiation complex (eIF-2-GTP, tRNAi-Met, and 40S ribosomal subunit) from being recruited to the 5′ cap of mature mRNAs in the cell. This allows the virus to hijack the translation machinery of the human cell to translate its own (uncapped) mRNA transcripts instead.



Post Transcriptional Modifications

Post transcriptional modifications of mRNA are largely absent in prokaryotes. In eukaryotes, the RNA transcripts undergo modifications like splicing, capping and tailing. The messenger ribonucleic acid (mRNA) formed after transcription is identical in sequence to one of the DNA strands (the coding strand), except that the base uracil is substituted for thymine.

5' end capping

The 5' cap is added in the following steps:

  1. RNA terminal phosphatase (RTPase) removes the terminal phosphate of the 5' triphosphate group of the nascent RNA.
  2. RNA guanylyl transferase (RGTase) adds GMP to the 5' end using GTP as the donor.
  3. RNA (guanine-7) methyl transferase adds a methyl group to the N7 position of the cap guanosine, using S-adenosyl methionine as a donor, to produce Cap 0.
  4. Methyl groups can be added sequentially to the 2'-OH groups of the first and second riboses (giving Cap 1 and Cap 2, respectively). If the first templated base is A, it can be methylated on the N6 position, but only if it has first been 2'-O-methylated.

The specificity of capping is primarily determined by the RNA polymerase. In addition, only RNAs with a di- or triphosphate at the 5' end are substrates for capping. Therefore products of RNA polymerase I and III, or of endonuclease cleavage, are not capped.

3' end processing

Termination of transcription by RNA polymerase II does not usually occur at fixed positions, and the polymerase often transcribes beyond the 3' end of the mature mRNA. The correct 3' end of mRNAs is generated by a specific endonuclease cleavage followed by non-templated addition of ~250 adenosines to form the poly(A) tail. The poly(A) tail confers resistance to 3' exonucleases and is also essential for efficient translation initiation. The poly(A) tail is added in two steps.

First the RNA is cleaved by an endonuclease, 15–30 nt downstream of a highly conserved hexanucleotide sequence AAUAAA. Poly(A) polymerase then synthesizes the poly(A) tail using ATP as a substrate. The 3' end processing reaction requires at least two sequences in the RNA. The AAUAAA element provides a binding site for cleavage polyadenylation specificity factor (CPSF), while a region downstream of the cleavage site enriched in U or G and U residues is necessary for binding of cleavage stimulation factor (CstF). Poly(A) polymerase alone synthesizes very long poly(A) tails, but the presence of poly(A)-binding protein II (PAB II) limits the length of the tail to the physiological length of ~250 nt. Histone mRNAs are not polyadenylated; their 3' ends are instead generated by cleavage through a mechanism that is unclear.
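The positional rule described above (cleavage 15–30 nt downstream of AAUAAA) can be sketched as a simple sequence scan. This is an illustration only: the function name and return format are hypothetical, the downstream U/GU-rich element and all protein factors are ignored, and the 15–30 nt offset is measured from the end of the hexamer, which is an assumption of this sketch.

```python
import re


def candidate_cleavage_windows(rna, min_offset=15, max_offset=30):
    """Return (signal_start, window_start, window_end) for each AAUAAA hexamer.

    Indices are 0-based; the window is half-open and clipped to the sequence
    length. Hexamers too close to the 3' end to leave a window are skipped.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    windows = []
    for match in re.finditer(r"(?=AAUAAA)", rna):  # lookahead finds overlaps too
        signal_start = match.start()
        signal_end = signal_start + 6  # index just past the hexamer
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start < window_end:
            windows.append((signal_start, window_start, window_end))
    return windows


# Hypothetical 3' region: two filler bases, the signal, then 40 filler bases.
example = "GG" + "AAUAAA" + "C" * 40
print(candidate_cleavage_windows(example))
```

A real predictor would also score the downstream element recognized by CstF; the scan above only narrows down where cleavage could occur.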
Introns are intervening sequences that interrupt the sequences that will appear in the mature mRNA (exons) and are usually non-coding. However, some introns, like l-19, do code. These are highly reactive and get spliced when the RNA assumes its secondary structure after transcription. Because introns are absent from the mature transcript, they are not translated. Splicing is essentially a process of joining the exons in a pre-mRNA sequence, as a consequence of which the introns are eliminated. Introns are present mostly in eukaryotes and are rare in prokaryotes. Most eukaryotic genes contain introns, but some, such as the genes encoding histones and heat shock proteins, do not. Introns exhibit a wide variation in length and number, with an average size of 3.3 kilobases in humans and an absolute minimum size of 60 bases. They also vary in number, from a few introns up to 177 in the human titin gene. Apart from generating functional mRNAs, splicing assists in the subsequent export of mRNAs to the cytoplasm.

Mechanism of splicing
Intron splicing of pre-mRNA takes place in the nucleus according to a precise and complex arrangement of proteins and ribonucleoprotein particles. Mature mRNA is exported to the cytoplasm for translation. Splicing is a complex process which involves approximately 70 proteins in higher eukaryotes and five small nuclear ribonucleoprotein particles (snRNPs). Ultimately, pre-mRNAs containing coding sequences (exons) and intervening sequences (introns) are processed to form a continuous mRNA containing only exons. This process is evolutionarily conserved.
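At the level of sequence, the net outcome described above (introns removed, exons joined in order) can be sketched as follows. This models only the result, not the spliceosome or its chemistry; the function name and the coordinate convention (0-based, half-open intron intervals) are assumptions made for the example.

```python
def splice(pre_mrna, introns):
    """Remove the given introns and join the remaining exons in order.

    introns: iterable of (start, end) pairs, 0-based and half-open,
    which must not overlap and must lie within the sequence.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or start >= end or end > len(pre_mrna):
            raise ValueError(f"invalid or overlapping intron: ({start}, {end})")
        exons.append(pre_mrna[position:start])  # exon upstream of this intron
        position = end
    exons.append(pre_mrna[position:])  # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon (AAAA), intron (GU...AG), exon (UUUU).
pre = "AAAA" + "GUCCCAG" + "UUUU"
print(splice(pre, [(4, 11)]))
```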

Splicing of introns near the 5' end occurs concomitantly with transcription, whereas introns near the 3' end are largely spliced post-transcriptionally. The RNA elements required for splice-site recognition include sequences at the 5' and 3' splice sites, in particular the invariant GU and AG dinucleotides at the 5' and 3' ends of the intron respectively, and the A at the ‘branchpoint’.

The 5' splice site (sometimes called the ‘donor’ site) has a 9 nucleotide (nt) consensus (A/C)AG|GURAGU, where R denotes a purine and the vertical line represents the exon–intron boundary. The consensus sequence is complementary to a conserved single-stranded region in the abundant U1 snRNA, and this base-pairing interaction is critical in 5' splice site recognition. The 3' splice site (sometimes called the ‘acceptor’ site) has the consensus CAG|G and is preceded by a region enriched in pyrimidines (the polypyrimidine tract). Upstream of the polypyrimidine tract is the ‘branchpoint’ sequence with consensus YNYURAY, where R is a purine, Y is a pyrimidine, N is any nucleotide and A is the branchpoint residue. In yeast, there is a much tighter version of this consensus: UACUAAC. The yeast branchpoint sequence is exactly complementary to a conserved single-stranded region of U2 snRNA, apart from a single mismatch in which the branchpoint A is bulged out. This U2 snRNA-branchpoint interaction is also important in splicing.
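The consensus sequences above translate directly into pattern matches. The sketch below encodes the 5' splice site consensus (A/C)AGGURAGU and the branchpoint consensus YNYURAY as regular expressions; the function names are hypothetical. Real splice sites often deviate from the consensus, so exact matching like this is an illustration rather than a practical predictor.

```python
import re

# IUPAC-style codes from the text: R = purine (A/G), Y = pyrimidine (C/U), N = any.
DONOR_CONSENSUS = re.compile(r"(?=[AC]AGGU[AG]AGU)")
BRANCHPOINT_CONSENSUS = re.compile(r"(?=[CU][ACGU][CU]U[AG]A[CU])")


def find_donor_boundaries(rna):
    """Return 0-based indices of the first intron base (the G of GU)."""
    rna = rna.upper().replace("T", "U")
    # The consensus has three exonic bases before the exon-intron boundary.
    return [m.start() + 3 for m in DONOR_CONSENSUS.finditer(rna)]


def find_branchpoint_adenosines(rna):
    """Return 0-based indices of the branchpoint A within YNYURAY matches."""
    rna = rna.upper().replace("T", "U")
    # The branchpoint A is the sixth position of the 7-nt consensus.
    return [m.start() + 5 for m in BRANCHPOINT_CONSENSUS.finditer(rna)]


# Hypothetical sequences for illustration.
print(find_donor_boundaries("CCCAAGGUAAGUCCCC"))
print(find_branchpoint_adenosines("GGUACUAACGG"))  # contains the yeast UACUAAC
```

The yeast sequence UACUAAC is one specific instance of YNYURAY, so the same branchpoint pattern matches it.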

There are two types of splicing: self-splicing and spliceosome-mediated splicing.
Group I and group II introns undergo self-splicing.

Group I introns are self-splicing introns, also termed ribozymes. They are spliced by two sequential transesterification reactions. The first is triggered by the attack of the 3'-OH of an exogenous guanosine or guanosine nucleotide (exoG) on the phosphodiester bond at the 5' splice site, leaving a free 3'-OH group on the upstream exon and the exoG attached to the 5' end of the intron. The terminal G of the intron then replaces the exoG in the G-binding site to set up the second transesterification: the 3'-OH group of the upstream exon attacks the 3' splice site, ligating the upstream and downstream exons and releasing the catalytic intron.

Group II introns are another class of self-splicing introns that act both as ribozymes and as mobile genetic elements. Their splicing likewise involves two transesterification reactions. Unlike in group I introns, the first nucleophilic attack is made by the 2'-OH of a bulged A within the intron on the 5' splice site, producing an intron lariat/3'-exon intermediate. In the second step, the 3'-OH of the cleaved 5' exon is the nucleophile and attacks the 3' splice site, resulting in exon ligation and excision of an intron lariat RNA. Some group II introns self-splice in vitro, but the reaction is generally slow and requires nonphysiological conditions (e.g., high concentrations of monovalent salt and/or Mg2+), which indicates that proteins are needed to help fold group II intron RNAs into the catalytically active structure for efficient splicing.

Spliceosome-mediated splicing:

The splicing reaction occurs only after the consensus splice site elements have been recognized by various splicing factors, leading to assembly of a ‘spliceosome’. The spliceosome is a large ribonucleoprotein complex containing 50–100 proteins and five snRNA components (U1, U2, U4, U5 and U6). The snRNAs are each contained within preformed small nuclear ribonucleoprotein (snRNP) complexes (U1, U2, U4/U6 and U5) containing several proteins, some of which are common to all the spliceosomal snRNPs. U1 and U2 snRNAs are responsible for recognition of the 5' splice site and branchpoint respectively by base pairing. In the fully assembled spliceosome, U2 and U6 snRNAs help to form a network of RNA–RNA interactions that bring together the reactive groups in the pre-mRNA, which are distantly separated in the primary RNA sequence. They also form the catalytic core of the spliceosome, in part by providing specific sites for binding the Mg2+ ions important for catalysis of splicing.

Alternative mRNA Splicing

Alternative splicing is the process by which single genes express multiple messenger RNAs (mRNAs), and therefore multiple proteins, by differential processing of the primary transcript (pre-mRNA). The temporal and spatial regulation of protein expression by alternative splicing is decisive for such diverse biological processes as sex determination in Drosophila, commitment to apoptosis, and sound frequency recognition by individual hair cells in the avian cochlea. The first example of alternative splicing was described in adenovirus in 1977 and demonstrated that one pre-mRNA molecule could be spliced at different junctions to yield a variety of mature mRNA molecules, each containing a different combination of exons. Shortly afterward, alternative splicing was found to occur in cellular genes as well, with the first example identified in the IgM gene, a member of the immunoglobulin superfamily. Another gene with an impressive number of alternative splicing patterns is the Dscam gene from Drosophila, which is involved in guiding embryonic nerves to their targets during formation of the fly's nervous system. Examination of the Dscam sequence reveals such a large number of alternative exons that differential splicing could, in theory, create a staggering 38,000 different mRNAs. This ability to create so many mRNAs may provide the diversity necessary for forming a complex structure such as the nervous system. In fact, the existence of multiple mRNA transcripts from single genes may account for the complexity of some organisms, such as humans, that have relatively few genes (approximately 20,000).
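
The Dscam figure of roughly 38,000 follows from simple combinatorics: if one exon is chosen from each of several clusters of mutually exclusive alternative exons, the number of possible mRNAs is the product of the cluster sizes. The Python sketch below works this out. The cluster sizes (12, 48, 33 and 2) are not given in the text above; they are the commonly cited values for the four variable exon clusters of Drosophila Dscam and are labeled as an assumption in the code. The toy exon names are hypothetical.

```python
from itertools import product
from math import prod

# ASSUMPTION (not stated in the text): commonly cited numbers of mutually
# exclusive alternative exons in the four variable Dscam exon clusters.
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}


def isoform_count(cluster_sizes):
    """Number of mRNAs when one exon is picked from each cluster."""
    return prod(cluster_sizes)


def enumerate_isoforms(clusters):
    """List every combination for a small, explicitly named set of exons."""
    return ["-".join(choice) for choice in product(*clusters)]


print(isoform_count(DSCAM_CLUSTERS.values()))  # 38016

# Hypothetical toy gene with two clusters of alternative exons.
toy_clusters = [["4a", "4b"], ["6a", "6b", "6c"]]
print(enumerate_isoforms(toy_clusters))
```

The count grows multiplicatively, so a modest number of alternative exons per cluster is enough to produce tens of thousands of potential transcripts from a single gene.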

Trans-splicing involves the joining of exons that originate on separate transcripts. Initially, split group II introns in the organelles of plants and algae, and spliced-leader (SL)-dependent trans-splicing in many lower eukaryotes, were the only documented examples of trans-splicing in nature. Since the early 1990s, evidence has been obtained to suggest that trans-splicing may also occur in higher eukaryotes. Given the surprisingly low number of identified genes in humans, there has been interest in determining whether trans-splicing occurs in mammals, because this process could potentially increase the coding capacity of a genome. In SL-dependent trans-splicing, the 5' exon is donated by a small RNA polymerase II transcript, the SL RNA. Trans-splicing occurs at a 3' splice site located on an RNA molecule that is transcribed separately. Addition of the SL RNA to a 3' trans acceptor molecule does not depend on the formation of base pairs between the two transcripts. The reaction itself is analogous to cis-splicing and involves two sequential transesterification reactions.

RNA editing:
RNA editing is a mechanism that alters the nucleotide sequence of an RNA after transcription, so that the RNA differs from the sequence encoded in the DNA template. RNA editing includes the deamination of cytidines to produce uridines, the deamination of adenosines to produce inosines, and the insertion or deletion of uridine nucleotides. The best characterized types of mRNA editing events in mammalian systems are C to U and A to I editing, which occur by a chemically similar deamination mechanism.

A to I editing:

Adenosine can be converted to inosine by hydrolytic deamination at the C6 position, which removes the N6 amino group. Since I is recognized by the translation machinery as G, this can change the amino acid encoded by the edited codon, although it cannot create new stop codons. For example, in GluRB, an ion channel mRNA, a CAG glutamine codon is edited to CIG, which is recognized as a CGG arginine codon. The template for editing in this case is double-stranded RNA, which forms as a result of complementarity between the exon sequence around the adenosine to be edited and downstream intron sequence. The best-characterized RNAs that undergo A to I editing are expressed in the central nervous system (e.g. glutamate-gated ion channel receptors (GluR) and serotonin (5-HT2C) receptors). A to I editing is carried out by adenosine deaminases that act on RNA (ADARs).
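
The two claims above (CAG read as arginine after editing, and no new stop codons) can be checked directly against the standard genetic code. The Python sketch below builds the codon table, reads inosine as guanosine, and then tests every sense codon with every possible combination of A to I changes. The function names are hypothetical and the standard nuclear genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def read_codon(codon):
    """Translate a codon, reading inosine (I) as guanosine (G)."""
    return CODON_TABLE[codon.replace("I", "G")]


def edited_variants(codon):
    """All codons obtained by converting any subset of A residues to I."""
    options = [("A", "I") if base == "A" else (base,) for base in codon]
    return {"".join(variant) for variant in product(*options)}


def sense_codons_that_can_become_stop():
    """Sense codons for which some A to I edit would be read as a stop."""
    return sorted(
        codon for codon, aa in CODON_TABLE.items()
        if aa != "*" and any(read_codon(v) == "*" for v in edited_variants(codon))
    )


print(read_codon("CAG"), read_codon("CIG"))   # Q R (glutamine -> arginine)
print(sense_codons_that_can_become_stop())    # [] : no new stop codons
```

The empty list reflects the fact that every stop codon (UAA, UAG, UGA) can only be reached by A to G changes from another stop codon.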
C to U editing:

C to U editing occurs in apoB mRNA. ApoB-100, encoded by the unedited mRNA in liver, is a major component of low-density lipoprotein (LDL) and very low-density lipoprotein (VLDL) particles. In the intestine, editing of apoB mRNA at a single cytidine residue (C6666) to U changes a glutamine codon (CAA) to a stop codon (UAA). The resultant truncated protein (apoB-48) lacks the C-terminal LDL receptor binding domain of apoB-100 and is a component of chylomicrons.
C to U editing is catalyzed by APOBEC-1 (apoB mRNA editing enzyme catalytic polypeptide 1), a cytidine deaminase, which in humans is expressed only in the intestine. In contrast to A to I editing, the primary sequence of apoB mRNA is important for editing. The minimal RNA required for specific editing comprises a 5' regulator element (immediately upstream of C6666), a 4 nt spacer element following the edited site, and an 11 nt ‘mooring sequence’ immediately downstream. The mooring site is necessary for high-affinity binding by ACF, which docks APOBEC-1 to the mRNA for specific editing at C6666. ApoB is the only identified.
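
The effect of a single C to U change on the encoded protein can be illustrated with a toy translation. In the Python sketch below, a CAA glutamine codon is edited to UAA and translation stops early, giving a truncated product in the same way that apoB-48 is a truncated form of apoB-100. The mRNA used here is a short hypothetical sequence, not the apoB sequence, and the function names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def edit_c_to_u(mrna, position):
    """Return the mRNA with the cytidine at `position` deaminated to uridine."""
    if mrna[position] != "C":
        raise ValueError("the edited position must be a cytidine")
    return mrna[:position] + "U" + mrna[position + 1:]


# Hypothetical mRNA: AUG GCA CAA UAC GGA UAA (the CAA codon starts at index 6).
toy_mrna = "AUGGCACAAUACGGAUAA"
edited_mrna = edit_c_to_u(toy_mrna, 6)

print(translate(toy_mrna))     # MAQYG (full-length product)
print(translate(edited_mrna))  # MA    (truncated at the new UAA stop)
```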


Neuroligins 1−4 are postsynaptic transmembrane proteins capable of initiating presynaptic maturation via interactions with β-neurexin. Both neuroligins and β-neurexins have alternatively spliced inserts in their extracellular domains. Using analytical ultracentrifugation, we determined that the extracellular domains of the neuroligins sediment as dimers, whereas the extracellular domains of the β-neurexins appear monomeric. Sedimentation velocity experiments of titrated stoichiometry ratios of β-neurexin and neuroligin suggested a 2:2 complex formation. The recognition properties of individual neuroligins toward β-neurexin-1 (NX1β), along with the influence of their splice inserts, were explored by surface plasmon resonance and affinity chromatography. Different neuroligins display a range of NX1β affinities spanning more than 2 orders of magnitude. Whereas splice insert 4 in β-neurexin appears to act only as a modulator of the neuroligin/β-neurexin association, splice insert B in neuroligin-1 (NL1) is the key element regulating the NL1/NX1β binding. Our data indicate that gene selection, mRNA splicing, and post-translational modifications combine to give rise to a controlled neuroligin recognition code with a rank ordering of affinities for particular neurexins that is conserved for the neuroligins across mammalian species.


