Protein structure and function


Marie-Véronique CLEMENT Associate Professor Yong Loo Lin School of Medicine NUS Graduate School for Integrative Science and Engineering Department of Biochemistry National University of Singapore 8 Medical Drive, MD 7 #03-15 Singapore 117597 Tel: (65) 68747985 Fax: (65) 67791453 E-mail:  

From amino acids to protein:
N-terminus: terminates with an amino group

Peptide bond: links one amino acid to the next

C-terminus: terminates with a carboxyl group

A peptide: Phe-Ser-Glu-Lys (F-S-E-K)
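The three-letter and one-letter notations for the example peptide can be related with a small lookup. This is a minimal sketch; the function name `to_one_letter` is a hypothetical helper introduced only for illustration, and the table is the standard set of codes for the 20 common amino acids.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated peptide, written N-terminus first,
    into its one-letter sequence."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


print(to_one_letter("Phe-Ser-Glu-Lys"))  # prints FSEK
```

By convention the sequence is read from the N-terminus (left) to the C-terminus (right).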



The Shape of proteins:

Folding occurs spontaneously
The native conformation is determined by different levels of structure



Four Levels of Structure Determine the Shape of Proteins
Primary structure
The linear arrangement (sequence) of amino acids and the location of covalent (mostly disulfide) bonds within a polypeptide chain. Determined by the genetic code.

Secondary structure

local folding of a polypeptide chain into regular structures including the α helix, β sheet, and U-shaped turns and loops.

Tertiary structure
overall three-dimensional form of a polypeptide chain, which is stabilized by multiple non-covalent interactions between side chains.

Quaternary structure:
The number and relative positions of the polypeptide chains in multisubunit proteins. Not all proteins have a quaternary structure.

Primary Structure of a protein: 
determined by the nucleotide sequence of its gene
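How a gene sequence fixes the primary structure can be sketched as a codon-by-codon lookup. This is a hedged illustration only: the DNA string is a made-up example chosen to encode the peptide F-S-E-K, and the table holds just the few codons of the standard genetic code that the example needs.

```python
# Partial codon table (DNA coding strand, standard genetic code).
# Only the codons used in the hypothetical example are listed.
CODON_TABLE = {
    "TTT": "F",  # Phe
    "AGC": "S",  # Ser
    "GAA": "E",  # Glu
    "AAA": "K",  # Lys
    "TAA": "*",  # stop
}


def translate(dna: str) -> str:
    """Read the coding strand three bases at a time and return the
    one-letter amino acid sequence, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("TTTAGCGAAAAATAA"))  # prints FSEK
```

Changing a single base can change one codon and therefore one residue of the primary structure.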

Bovine Insulin: the first sequenced protein

• In 1953, Frederick Sanger determined the amino acid sequence of insulin, a protein hormone.
• This work is a landmark in biochemistry because it showed for the first time that a protein has a precisely defined amino acid sequence.
• It demonstrated that insulin consists only of amino acids linked by peptide bonds between α-amino and α-carboxyl groups.
• The complete amino acid sequences of more than 100,000 proteins are now known.
• Each protein has a unique, precisely defined amino acid sequence.



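Primary Structure
Proinsulin protein

Proinsulin is produced in the pancreatic islet cells.

[Figure: proinsulin chain with the junctions at residues 30/31 and 65/66 marked; processing yields insulin + C peptide]

Residues that differ between species (figure labels):
Human: Thr-Ser-Ile
Cow: Ala-Ser-Val
Pig: Thr-Ser-Ile
Chicken: His-Asn-Thr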



Amino acid substitution in proteins from different species

Conservative
Substitution of an amino acid by another amino acid of similar polarity
(Val for Ile in position 10 of insulin)

Non-conservative
Substitution involving replacement of an amino acid by another of different polarity
(sickle cell anemia: at the 6th position of the β chain of hemoglobin, glutamic acid is replaced by valine, which induces precipitation of hemoglobin in red blood cells)

Invariant residues
Amino acid found at the same position in different species
(critical for the structure or function of the protein)
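The distinction between conservative and non-conservative substitutions can be sketched as a comparison of polarity classes. The four-class grouping below is a common textbook simplification and an assumption of this sketch; other groupings exist and would classify some pairs differently.

```python
# Simplified polarity classes for the 20 common amino acids
# (one-letter codes). The grouping is an assumption of this sketch.
POLARITY_CLASSES = {
    "nonpolar": "AVLIMFWPG",
    "polar uncharged": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
POLARITY = {aa: cls for cls, members in POLARITY_CLASSES.items() for aa in members}


def classify_substitution(original: str, replacement: str) -> str:
    """Label a substitution by whether both residues share a polarity class."""
    if original == replacement:
        return "identical"
    if POLARITY[original] == POLARITY[replacement]:
        return "conservative"
    return "non-conservative"


print(classify_substitution("I", "V"))  # Ile -> Val, insulin position 10
print(classify_substitution("E", "V"))  # Glu -> Val, sickle cell hemoglobin
```

With this grouping the Ile to Val change is labelled conservative and the Glu to Val change non-conservative, matching the two examples in the text.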

Protein conformation: most proteins fold into only one stable conformation, the native conformation


A polypeptide chain of more than about 50 amino acids is called a protein

• Secondary structure is stabilized by hydrogen bonds
• H-bonds are between the –CO and –NH groups of the peptide backbone
• H-bonds are either intra- or intermolecular
• 3 types: α-helix, β-sheet and triple helix

What forces determine the structure?
• Primary structure - determined by covalent bonds
• Secondary, tertiary, quaternary structures - all determined by weak forces
– Weak forces - H-bonds, ionic interactions, van der Waals interactions, hydrophobic interactions



Non-covalent interactions involved in the shape of proteins



Secondary structures:
α Helix: the α helix conformation was discovered about 50 years ago in α-keratin, which is abundant in hair, nails, and horns


β Sheet: discovered within a year of the discovery of the α helix. Found in the protein fibroin, the major constituent of silk



The α helix: 

Results from hydrogen bonding between backbone groups; does not involve the side chains of the amino acids




The β sheet also results from hydrogen bonding between backbone groups and does not involve the side chains of the amino acids



Two types of β sheet structures
An antiparallel β sheet

A parallel β sheet



Triple helix of Collagen
• Limited to the tropocollagen molecule
• Sequence motif of –(Gly-X-Pro/Hypro)n–
• 3 left-handed helices wound together to give a right-handed superhelix
• Stable superhelix: glycines (small R group) located on the central axis of the triple helix
• One interchain H-bond for each triplet of amino acids, between the NH of Gly and the CO of X (or proline) in the adjacent chain
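The repeating collagen motif can be checked on a one-letter sequence. This is a rough sketch under stated assumptions: the test sequences are invented, the function names are hypothetical, and the letter "O" for hydroxyproline is a common convention rather than something given in the text.

```python
def follows_gly_x_y(seq: str) -> bool:
    """True if the sequence is a whole number of triplets and every
    triplet starts with glycine (G), as in the collagen repeat."""
    if not seq or len(seq) % 3 != 0:
        return False
    return all(seq[i] == "G" for i in range(0, len(seq), 3))


def pro_fraction_at_third_position(seq: str) -> float:
    """Fraction of triplets whose third residue is proline (P) or
    hydroxyproline (written 'O' here by assumption)."""
    third_positions = seq[2::3]
    return sum(residue in "PO" for residue in third_positions) / len(third_positions)


example = "GAPGEPGKO"  # hypothetical sequence: Gly-Ala-Pro, Gly-Glu-Pro, Gly-Lys-Hyp
print(follows_gly_x_y(example))
print(pro_fraction_at_third_position(example))
```

A glycine at every third position is what allows the three chains to pack closely around the central axis.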



• Helices/β-sheets: ~50% of the regular 2° structures of globular proteins
• Remaining: coil or loop conformation
• Also quite regular, but difficult to describe
• Examples: reverse turns, β-bends (connect successive strands of antiparallel β-sheets)

The Beta Turn
(aka beta bend, tight turn)
• allows the peptide chain to reverse direction
• carbonyl O of one residue is H-bonded to the amide proton of a residue three residues away
• proline and glycine are prevalent in beta turns
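The hydrogen-bond pattern of a turn can be written as pairs of residue numbers. In this sketch the offset of 3 comes from the beta turn description above; the offset of 4 used for the α helix is the usual textbook value and is not stated in these slides. The function name is hypothetical.

```python
def backbone_hbond_pairs(n_residues: int, offset: int) -> list[tuple[int, int]]:
    """Return (i, i + offset) pairs, numbered from 1, in which the
    carbonyl O of residue i accepts an H-bond from the amide H of
    residue i + offset."""
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


# Beta turn: four residues, one H-bond closing the turn.
print(backbone_hbond_pairs(4, 3))
# Alpha helix of eight residues (offset 4 is an assumption of this sketch).
print(backbone_hbond_pairs(8, 4))
```

The beta turn has a single closing bond between residues 1 and 4, while the helix repeats the same spacing along its whole length.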



• A strand of polypeptide in a β-sheet may contain an "extra" residue
• This extra residue is not hydrogen bonded to a neighbouring strand
• This is known as a β-bulge.



Tertiary structure: the overall shape of a protein 
or a telephone cord!!!
The secondary structure of a telephone cord
A telephone cord, specifically the coil of a telephone cord, can be used as an analogy to the alpha helix secondary structure of a protein.

The tertiary structure of a telephone cord
The tertiary structure of a protein refers to the way the secondary structure folds back upon itself or twists around to form a three-dimensional structure. The secondary coil structure is still there, but the tertiary tangle has been superimposed on it.



Tertiary structure: the overall shape of a protein
Full three dimensional organization of a protein

The three-dimensional structure of a protein kinase



• R-group interactions result in the 3D structures of globular proteins
• Types of interactions: hydrogen bonds, ionic bonds (salt linkages), hydrophobic interactions and disulphide bonds
• Hydrophilic R groups are on the surface while hydrophobic R groups are buried inside the molecule
• Wide variety of 3° structures, since there is large variation in protein sizes and amino acid sequences
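The tendency of hydrophobic side chains to be buried can be illustrated with a hydropathy profile. This sketch uses the Kyte-Doolittle hydropathy scale, which is not part of these slides and is brought in only as an example; the window size and the test sequence are arbitrary choices. It is a crude heuristic, not a structure prediction.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def mean_hydropathy(seq: str) -> float:
    """Average hydropathy of a stretch of residues."""
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def hydropathy_profile(seq: str, window: int = 5) -> list[float]:
    """Sliding-window average along the sequence."""
    return [mean_hydropathy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


# Hypothetical sequence: a hydrophobic stretch followed by a charged one.
for value in hydropathy_profile("IVLAFKDEKR", window=5):
    print(round(value, 2))
```

Windows with strongly positive averages are more likely to lie in the protein core, and strongly negative ones on the water-exposed surface.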

The role of side chains in the shape of proteins
Where is water?







The tertiary structure of myoglobin is fairly well understood. Myoglobin is built largely of α helix, which can be drawn as if it were enclosed in a blue sheath; the sheath does not exist, but drawing it this way makes the path of the chain easier to follow. The helical chain folds back upon itself into what is referred to as the tertiary structure of myoglobin. Interactions between the side groups of the amino acid residues are responsible for holding together the tertiary structure of this protein.













A coiled-coil:

This structure occurs when two α helices have most of their nonpolar (hydrophobic) side chains on one side, so that they can twist around each other with these side chains facing inwards



Quaternary structure:
If a protein is formed as a complex of more than one protein chain, the complete structure is designated as its quaternary structure:

• Generally formed by non-covalent interactions between subunits
• Either as homo- or hetero-multimers


• Oligomers (multimers) are more stable than dissociated  subunits 
– They prolong life of protein in vivo

• Active sites can be formed by residues from adjacent  subunits/chains
– A subunit may not constitute a complete active site

• Error of synthesis is greater for longer polypeptide chains
• Subunit interactions: cooperativity/allosteric effects
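Why synthesis errors favour building large proteins from subunits can be sketched with a simple probability. The model assumes each residue is incorporated independently with the same error rate; the rate of 1 in 10,000 per residue is an illustrative assumption, not a figure from these slides.

```python
def error_free_probability(n_residues: int, error_rate: float = 1e-4) -> float:
    """Probability that a chain of n residues contains no
    misincorporated amino acid, assuming independent errors."""
    return (1.0 - error_rate) ** n_residues


# One chain of 1000 residues versus one subunit of 250 residues.
print(round(error_free_probability(1000), 4))
print(round(error_free_probability(250), 4))
```

A faulty 250-residue subunit can be discarded on its own, whereas one error anywhere in a single 1000-residue chain wastes the whole molecule.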



Primary structure

Secondary structure

Tertiary structure

Quaternary structure

Structure of the hemagglutinin protein
(a long multimeric molecule whose three identical subunits are each composed of two chains, HA1 and HA2).


Primary structure
(1 letter code used)

β strands

random coils

α helices


Secondary structure  


Protein domains

Tertiary structure

Quaternary structure

Protein domains:
Any part of a protein that can fold independently into a compact, stable structure. A domain usually contains between 40 and 350 amino acids.

• A domain is the modular unit from which many larger proteins are constructed.
• The different domains of a protein are often associated with different functions.

Protein domains 
Cytochrome b562: a single-domain protein involved in electron transport in mitochondria
The NAD-binding domain of the enzyme lactic dehydrogenase
The variable domain of an immunoglobulin



The Src protein



The modular nature of proteins:
EGF domain Immunoglobulin domain

Membrane spanning domain

(tissue plasminogen activator) Fibronectin domain Chymotryptic domain

Epidermal growth factor (EGF) is generated by proteolytic cleavage of a precursor protein containing multiple EGF domains (orange). The EGF domain also occurs in Neu protein and in tissue plasminogen activator (TPA). Other domains, or modules, in these proteins include a chymotryptic domain (purple), an immunoglobulin domain (green), a fibronectin domain (yellow), a membrane-spanning domain (pink), and a kringle domain (blue).


[Adapted from I. D. Campbell and P. Bork, 1993, Curr. Opin. Struc. Biol. 3:385.]


How are protein structures determined?
Most protein structures known to date have been solved experimentally by one of two techniques, with X-ray crystallography the more common:
• X-ray crystallography, which typically provides data of high resolution but provides no time-dependent information on the protein's conformational flexibility.
• NMR (nuclear magnetic resonance spectroscopy), which provides somewhat lower-resolution data in general and is limited to relatively small proteins, but can provide time-dependent information about the motion of a protein in solution.
More is known about the tertiary structural features of soluble globular proteins than about membrane proteins because the latter class is extremely difficult to study using these methods.

An X-ray diffraction image for the protein myoglobin.

The first protein crystal structure was that of sperm whale myoglobin, determined by John Kendrew and colleagues in 1958; Kendrew shared the 1962 Nobel Prize in Chemistry with Max Perutz, who solved the structure of hemoglobin. The X-ray diffraction analysis of myoglobin was originally motivated by the observation of myoglobin crystals in dried pools of blood on the decks of whaling ships.

Protein NMR is a field of structural biology that applies nuclear magnetic resonance spectroscopy to investigating proteins.
The field was pioneered by, among others, Kurt Wüthrich, who won the Nobel Prize in Chemistry in 2002.

Pacific Northwest National Laboratory's high magnetic field (800 MHz) NMR spectrometer being loaded with a sample.

The NMR sample is prepared in a thin walled glass tube.

Protein NMR is performed on aqueous samples of highly purified protein. A sample consists of between 300 and 600 microlitres with a protein concentration in the range 0.1 to 3 millimolar.
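The amount of protein such a sample requires follows from volume, concentration and molecular weight. This is a back-of-the-envelope sketch that treats the stated concentration as millimolar; the 15 kDa molecular weight is a hypothetical value chosen for illustration.

```python
def protein_mass_mg(volume_ul: float, conc_mM: float, mw_da: float) -> float:
    """Mass of protein (mg) in a sample of the given volume (microlitres)
    and concentration (millimolar) for a protein of molecular weight
    mw_da (g/mol)."""
    moles = (volume_ul * 1e-6) * (conc_mM * 1e-3)  # litres x mol/litre
    return moles * mw_da * 1e3                     # grams -> milligrams


# 500 microlitres at 1 mM of a hypothetical 15 kDa protein.
print(round(protein_mass_mg(500, 1.0, 15_000), 2))  # prints 7.5
```

Several milligrams of pure protein per sample is one reason recombinant expression is so often used for NMR work.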

The source of the protein can be either natural or produced in an expression system using recombinant DNA techniques through genetic engineering.

Function of  peptides and proteins







Oxytocin and vasopressin are two peptide hormones with very similar structures but very different biological activities. Their sequences differ at only two of nine residues: Ile 3 and the hydrophobic Leu 8 of oxytocin are replaced by Phe and a hydrophilic Arg in vasopressin. Oxytocin is a potent stimulator of uterine smooth muscle and also stimulates lactation. Vasopressin, also known as antidiuretic hormone (ADH), has no effect on uterine smooth muscle but causes reabsorption of water by the kidney, thus increasing blood pressure.
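The similarity of the two hormones can be checked by aligning their sequences position by position. The one-letter sequences below are the commonly cited ones for human oxytocin and arginine vasopressin; they are not given in these slides and are taken as an assumption of this sketch.

```python
# Commonly cited nine-residue sequences (both are C-terminally amidated
# and carry a Cys1-Cys6 disulfide bond, not represented here).
OXYTOCIN = "CYIQNCPLG"
VASOPRESSIN = "CYFQNCPRG"


def differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, residue_a, residue_b) wherever two sequences of
    equal length differ; positions are numbered from 1."""
    return [(i, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
            if a != b]


for position, oxy, vaso in differences(OXYTOCIN, VASOPRESSIN):
    print(f"position {position}: {oxy} in oxytocin, {vaso} in vasopressin")
```

The comparison reports each differing position, including the Leu to Arg change at position 8 that swaps a hydrophobic residue for a charged one.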

Read Table 3.2 in Devlin  for other examples of  biologically active peptides  

Function of proteins
• Enzymatic catalysis
• Transport and storage (hemoglobin, albumins)
• Coordinated motion (actin and myosin)
• Mechanical support (collagen)
• Immune protection (antibodies)
• Generation and transmission of nerve impulses: some amino acids act as neurotransmitters; receptors for neurotransmitters, drugs, etc. are protein in nature (the acetylcholine receptor)
• Control of growth and differentiation: transcription factors, hormones, growth factors (insulin or thyroid-stimulating hormone)

Proteins are the most important buffers in the body.

• They are mainly intracellular and include haemoglobin.
• The plasma proteins are buffers but the absolute amount is small compared to intracellular protein.
• Protein molecules possess basic and acidic groups which act as H+ acceptors or donors respectively if H+ is added or removed.

• Many proteins (thousands!) are present in blood plasma
• Proteins contain weakly acidic (glutamate, aspartate) and basic (lysine, arginine, histidine) side chains (or R groups)
• At neutral pH, among the side chains only histidine residues (containing an imidazole R group with pKa ~ 6.0) can act as a buffer component
• Haemoglobin, with 38 histidines per tetramer, is a good buffer
• N-terminal groups of proteins (pKa ~ 8.0) can also act as a buffer component
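How the pKa values above relate to buffering near physiological pH can be sketched with the Henderson-Hasselbalch relation. The pKa values are the approximate ones quoted in the bullets; the plasma pH of 7.4 is the usual textbook figure and is an assumption of this example.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of an ionisable group that is
    in its protonated (acid) form at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


PLASMA_PH = 7.4  # assumed physiological pH
for name, pka in [("histidine imidazole", 6.0), ("N-terminal amino group", 8.0)]:
    share = fraction_protonated(PLASMA_PH, pka)
    print(f"{name} (pKa {pka}): {share:.2f} protonated at pH {PLASMA_PH}")
```

A group buffers best when the pH is close to its pKa, where about half of it is protonated and it can either take up or release H+.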

Proteins play crucial roles in almost every biological process. They are responsible in one form or another for a variety of physiological functions including

• Enzymatic catalysis
• Transport and storage
• Coordinated motion
• Mechanical support
• Immune protection
• Generation and transmission of nerve impulses
• Control of growth and differentiation



Enzymatic catalysis - almost all biological reactions are enzyme catalyzed. Enzymes are known to increase the rate of a biological reaction by a factor of at least a million (10^6). Several thousand enzymes have been identified to date.



Transport and storage - small molecules are often carried by proteins in the physiological setting (for example, the protein hemoglobin is responsible for the transport of oxygen to tissues). Many drug molecules are partially bound to serum albumins in the plasma.

The binding of oxygen is affected by molecules such as carbon monoxide (CO), for example from tobacco smoking, cars and furnaces. CO competes with oxygen at the heme binding site. Hemoglobin's binding affinity for CO is about 200 times greater than its affinity for oxygen, meaning that small amounts of CO dramatically reduce hemoglobin's ability to transport oxygen. When hemoglobin combines with CO, it forms a very bright red compound called carboxyhemoglobin. When inspired air contains CO levels as low as 0.02%, headache and nausea occur; if the CO concentration is increased to 0.1%, unconsciousness will follow. In heavy smokers, up to 20% of the oxygen-active sites can be blocked by CO.
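The effect of a 200-fold affinity difference can be estimated with a simple competition model, in which the ratio of CO-bound to oxygen-bound hemoglobin equals the affinity ratio times the ratio of the two gas concentrations (often called the Haldane relation). This is an equilibrium sketch only: the 21% oxygen figure for air is an assumption, and real blood levels also depend on exposure time and ventilation.

```python
def carboxyhemoglobin_fraction(co_percent: float,
                               o2_percent: float = 21.0,
                               affinity_ratio: float = 200.0) -> float:
    """Equilibrium fraction of occupied heme sites that carry CO rather
    than O2, given the gas percentages in inspired air."""
    ratio = affinity_ratio * co_percent / o2_percent  # [HbCO] / [HbO2]
    return ratio / (1.0 + ratio)


for co in (0.02, 0.1):
    print(f"{co}% CO: {carboxyhemoglobin_fraction(co):.0%} of sites bound to CO")
```

Even at 0.1% CO, roughly half of the binding sites would be occupied by CO at equilibrium in this model, which is consistent with the severe symptoms described above.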

3-dimensional structure of hemoglobin. The four subunits are shown in red and yellow, and the heme groups in green.



Coordinated motion - muscle is mostly protein, and muscle contraction is mediated by the sliding motion of two protein filaments, actin and myosin.

Platelet activation is a controlled sequence of actin filament severing, uncapping, elongating and cross-linking that creates a dramatic shape change in the platelet

Platelet before activation

Activated platelet

Activated platelet  at a later stage than C)

Mechanical support - skin and bone are strengthened by the protein collagen.

Abnormal collagen synthesis or structure causes dysfunction of
• cardiovascular organs
• bone
• skin
• joints
• eyes
Refer to Devlin, Clinical Correlation 3.4, p. 121



Immune protection - antibodies are protein structures that are responsible for reacting with specific foreign substances in the body.



Generation and transmission of nerve impulses - some amino acids act as neurotransmitters, which carry signals from one nerve cell to another. In addition, receptors for neurotransmitters, drugs, etc. are protein in nature. An example of this is the acetylcholine receptor, which is a protein structure embedded in the membrane of postsynaptic cells.

GABA: gamma-aminobutyric acid, a neurotransmitter synthesised from glutamate
GABA acts at inhibitory synapses in the brain. GABA acts by binding to specific receptors in the plasma membrane of both pre- and postsynaptic neurons.



Control of growth and differentiation - proteins can be critical to the control of growth, cell differentiation and expression of DNA. For example, repressor proteins may bind to specific segments of DNA, preventing expression and thus the formation of the product of that DNA segment. Also, many hormones and growth factors that regulate cell function, such as insulin or thyroid-stimulating hormone, are proteins.




Membrane transport proteins




In general, all globular proteins have distinctive 3D structures that are specialized for their particular functions.

Shape and function



The relationship between shape and function of proteins:






Protein degradation:



Disease and protein folding:

Example: neurodegenerative diseases

HOT Areas of Medical Research
Human genome sequencing is completed; application in biology and medicine is just beginning
e.g., cloning of a disease gene is the first step in understanding the basic defects and rational treatment
Structural and functional characterization of all novel PROTEINS will unravel new disease genes.





Shape and function
In globular proteins, tertiary interactions are frequently stabilized by the sequestration of hydrophobic amino acid residues in the protein core, from which water is excluded, and by the consequent enrichment of charged or hydrophilic residues on the protein's water-exposed surface. In secreted proteins that do not spend time in the cytoplasm, disulfide bonds between cysteine residues help to maintain the protein's tertiary structure. A variety of common and stable tertiary structures appear in a large number of proteins that are unrelated in both function and evolution; for example, many proteins are shaped like a TIM barrel, named for the enzyme triosephosphate isomerase. Another common structure is the highly stable coiled coil, in which two or more alpha helices wind around each other. Proteins are classified by the folds they represent in databases like SCOP and CATH.