Biochem PDF

Lecture 6 (Proteins tertiary structure)
October 1, 2020 5:14 PM
Tertiary structure of the proteins

- tertiary structure results from the folding of a polypeptide into a closely packed 3D structure
- an important feature is that AAs that are far apart in the primary sequence are brought together, permitting
interactions of their side chains
- about 40% of the AAs in a typical protein have hydrophobic groups that interact unfavourably with water – tertiary
structure allows these R-groups to be buried in the center of the protein
- most hydrophilic R-groups are found on the surface, where they make favourable interactions with water
- stabilized by noncovalent interactions – primarily hydrophobic effect
Supersecondary structures
 Supersecondary structures - some simple combinations of secondary structure elements have
been found to occur frequently in proteins
◊ Also known as motifs
 can be associated with a specific function
◊ Example: helix-loop-helix is associated with DNA binding
 can form part of a larger functional and/or structural assembly (domain)
 usually not stable by themselves
Helix-loop-helix
- 2 types:
1. EF hand
 Occurs in a number of calcium-binding proteins (ex//calmodulin)
 Glu and Asp residues in the loop of these proteins form part of the calcium binding
site
2. Helix-turn-helix
 Occurs in certain DNA-binding proteins, where it binds to DNA
 Plays an important role in gene silencing and gene activation
The β-α-β motif

- two parallel β strands linked to an intervening α-helix by two loops
- the helix often runs parallel to the strands
β-hairpin motif
◊ simple motif involving two β-strands that are:

 adjacent in primary structure
 oriented in an anti-parallel direction
 linked by a short loop of 2-5 amino acids - relatively tight reverse turn
The Greek key motif
◊ name comes from design found on classical Greek pottery

◊ 4 adjacent anti-parallel strands and their linking loops
◊ strands 3 and 4 form the outer edges and strands 1 and 2 are in the center
◊ The most common linkage of 4 anti-parallel strands
Domains of secondary structures

- Supersecondary structures are organized into domains, which are the fundamental unit of tertiary structure
- Each domain can consist of combinations of motifs
- Size of the domain:
○ Can be 25-30 aa or more than 300
- The domains are connected by:
○ Loops and
○ Weak interactions formed by aa side chains - for these to form, the domains must be located in close proximity to each
other
- Most proteins with known structure have 3D folds that also occur in unrelated proteins
○ Estimated amount of naturally occurring folds is around 8000
○ 1200 have been observed
○ However, about half of the proteins with known structure belong to only 20 fold groups
◊ Example: pyruvate kinase

 Domain 1: nucleotide binding domain - connected to domain 2 by loop
 Domain 2: substrate binding domain
 Domain 3: regulatory domain - connected by weak interactions
Four categories of domains

1. All α – entirely α-helices and loops
2. All β – entirely β sheets and nonrepetitive structures linking β strands
3. α/β – have supersecondary structures such as the βαβ motif and/or regions of alternating α and β
- Dispersed along primary sequence
4. α + β – local clusters of α helices and β sheets where each arises from separate continuous regions of the polypeptide
chain
Protein domains can possess distinct functions
◊ Example: S. pyogenes Cas9 consists of 3 domains:

 Domain 1: Nuclease lobe mediates DNA cleavage
 Domain 2: Recognition lobe is responsible for binding guide RNA molecule
 Domain 3: PAM-interacting domain identifies target DNA sequence
Folding of globular proteins depends on a variety of interactions

1. The hydrophobic effect is the principal driving force
2. Hydrogen bonds and van der Waals forces stabilize protein folds
3. Covalent cross-links (disulfide bonds)
4. Ionic interactions
The hydrophobic effect

- The hydrophobic effect is the principal driving force in protein folding
- The tendency of hydrophobic groups to associate
- proteins are more stable in water when their hydrophobic side chains are sequestered from water
- Assembled into the core of a protein
- Polar side chains remain in contact with water on the surface of the protein - they are solvated
 Hydrophobic effect best described using thermodynamics

◊ H2O molecules surrounding non-polar groups are relatively well ordered and form a cage
around hydrophobic parts
 b/c the forces between H2O molecules are stronger than the forces between H2O
and hydrophobic aa
◊ When non-polar groups are removed from contact with H2O, disruption of cage structure
occurs
◊ The disruption is accompanied by increase in entropy of H2O molecules
◊ This increase in solvent entropy provides significant driving force
- If polar side chains are forced into interior of the protein they neutralize their polarity by forming H-bonds and forming
secondary structure
Hydrogen bonds and van der Waals

□ Besides the regular patterns in secondary structures, H-bonds can also occur between complementary
R-groups
□ H bonds can form between:
 Atoms of 2 peptide bonds - carbonyl in amide groups of protein backbones
 Atoms of a peptide bond and an amino acid side chain
 Two amino acid side chains
Covalent cross-links and ionic interactions

- Disulfide bonds
 Help stabilize protein structure
 Generally found in extracellular (secreted) proteins rather than intracellular ones, because the cytosol is a reducing environment
 Covalent interactions which are found between sulfur atoms on side chains of Cys residue
◊ Requires oxidation of thiol groups in Cys residues
 Makes proteins less susceptible to unfolding - protect from proteolysis
 Form once the protein is fully folded
- Salt bridges
 side chains with complementary charges form ionic bonds
 salt bridges buried within the hydrophobic interior of the protein cannot be
disrupted by solvent, but few ion pairs are actually buried in the core
Binding of metal ion or prosthetic group

- a metal ion (e.g. Mg2+, Ca2+, Zn2+) or a prosthetic group (e.g. heme) can stabilize a fold
□ Example: myoglobin protein
 Contains heme group
 Functions: facilitates diffusion of Oxygen and supplies Oxygen to muscle tissue in mammals
 Heme occupies a hydrophobic cleft formed by 3 helices and 2 loops
 It is held in place by hydrophobic interactions, van der Waals forces and H-bonding
 Increases stability of the protein
Summary of interactions that stabilize tertiary protein structure
Summary of Tertiary Structure

- tertiary structures form so as to minimize the unfavourable interactions and maximize the favourable ones
- as a result of these interactions, globular proteins tend to be very compact with very few internal cavities or water
molecules
- amino acids great distances apart can be brought into close proximity
- tertiary structure stabilized by covalent and noncovalent interactions
- recognizable combinations of α helices, β strands and loops that appear in different proteins are called super secondary
structures or MOTIFS
Quaternary Structure
- folded proteins interact to form dimers, trimers, or higher order structures
- each subunit is a separate polypeptide chain
- The subunits always assume exactly the same arrangement in every molecule of a given protein
- Can be identical chains or can be different chains
- multi-subunit protein is referred to as an oligomer
Why are multi-subunit proteins so common?

1. In large assemblies of proteins, using subunits for construction means that defects can be repaired by simply replacing
the flawed subunit rather than the entire protein
2. The site of subunit synthesis can be different than the site of assembly into the final product.
3. The only genetic information needed to specify the entire assembly is that specifying its few different self-assembling
subunits.
◊ Example: bacteria
 Bacteria have many different structures which help them to move
(fimbriae/flagellum/pilus)
 These structures are made up of repeating subunits polymerized together to form
long structures
A multi-subunit protein
- may consist of identical or non-identical polypeptide chains
- The interaction regions superficially resemble the interiors of single subunit proteins:
- they contain closely packed nonpolar side chains, hydrogen bonds, and in some cases, inter-chain disulfide bonds.
- They differ in several ways:
1. Hydrophobicity is midway between those of protein core and surface. In particular, the interfaces of proteins that
dissociate in vivo are less hydrophobic than permanent interfaces.
2. An average of ~77% of inter-subunit H-bonds are between side chains. Average within subunit is ~32%.
3. About 56% of protein interfaces contain salt bridges
- Few ion pairs buried in protein core.
Lecture 7 (Protein purification)
October 6, 2020 7:39 PM
Why purify proteins?

- Study Protein Structure: X-ray crystallography, NMR spectroscopy, electron microscopy
- Study Protein Function: Biophysical characterization, binding kinetics, enzyme activities
- Use the purified product in downstream reactions/processing
- Produce a commercial product
Where do we isolate proteins from?

- Natural sources: Cells, animal tissue, secreted media, plant materials, biological fluids, algae
- Different proteins are found in different abundances across the different types of cells
○ Can range from a few copies to many thousands of copies per cell
○ The lower the protein concentration - the more effort is required to isolate it
- Proteins with identical functions occur in variety of organisms
○ Therefore, isolation can be greatly simplified by choosing a good source
- Recombinant Sources - molecular copying methods (more commonly used)
○ Bacterial expression systems - E.coli
○ Eukaryotic expression systems - yeast, mammalian cells
○ Cell free expression systems - cytoplasmic extracts
○ Example: a human protein which is naturally produced in low levels can be cloned in bacterial or eukaryotic expression systems
Purification step 1
- The first step is cell lysis
 3 ways to mechanically disrupt a cell:

– Homogenizers - crush tissue and work like small blenders
– High pressure French press - forces the cell to burst open by applying a force
– Ultrasonic vibration Sonicator - ultrasonic waves break the cell
General scheme for protein isolation
– After lysis the cell solution is centrifuged

 Proteins in the insoluble fraction pellet with the cell debris
 Proteins of interest are meant to stay in supernatant (liquid part)
– The protein of interest in the supernatant is then subjected to purification using affinity
or fractionation methods
Monitoring proteins during purification with SDS-PAGE

- SDS-PAGE - Sodium Dodecyl-Sulfate PolyAcrylamide Gel Electrophoresis
- Used to monitor the purity of the protein of interest throughout the purification process
- The gel is usually made out of cross linked polyacrylamide

○ Made by free radical induced polymerization of acrylamide
- The gel forms a matrix with pores of certain molecular dimensions which can be specified according to the protein you are
monitoring
○ gel slows the movement of the larger molecules
○ Therefore: proteins can run through the pores as a function of their size
– The process also involves SDS (Sodium Dodecyl-Sulfate) - a very strong detergent which
denatures proteins
 Binds tightly to proteins, unfolds them and causes them to assume a long rod shape
 The binding ratio is about 1 SDS molecule per 2 amino acids
 Highly negative charge masks intrinsic charge of the protein
 Therefore: proteins which are treated with SDS have identical charge to mass ratio,
as well as, similar shape
 SDS treatment also disrupts non-covalent interactions between multi polypeptide
subunits
• Therefore: separation of protein is based on gel filtration and electrophoretic property of the proteins
Gel electrophoresis and how it works

- Gel electrophoresis is an analytical technique used to estimate the mass and purity of a protein
- Because proteins are coated with SDS:
○ They are highly negative
○ Therefore: migrate towards positive side of the gel electrophoresis (bottom)
○ Migrate according to their size only, since they are similar in all other aspects thanks to SDS
 Steps:
1. SDS-coated proteins move through the gel when an electric potential is applied
2. After electrophoresis the proteins are visualized by staining with Coomassie Blue or Silver
3. The migration distance decreases approximately linearly with log10(Mr).
 The mass of an unknown protein can be determined by interpolation using the migration
distances of proteins of known molecular weight.
 The result:

◊ Proteins get separated in order of their molecular masses - 5-10% accuracy
◊ The molecular masses of proteins are compared to exact known molecular weights of proteins (protein
markers)
 There is a logarithmic relationship between protein molecular mass and its relative electrophoretic mobility
◊ Relationship between MM and relative mobility for 37 proteins ranging in size from 11 to 70 kDa
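The interpolation step described above can be sketched in Python. This is only an illustration under stated assumptions: the marker masses and migration distances are invented values chosen to be exactly linear in log10(Mr), and the function names `fit_line` and `estimate_mass_kda` are hypothetical, not part of any gel analysis software.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mass_kda(markers, distance_cm):
    """markers: (mass_kDa, migration_cm) pairs for proteins of known mass.

    Fits log10(mass) against migration distance, then reads off the
    mass that corresponds to the unknown band's migration distance.
    """
    distances = [d for _, d in markers]
    log_masses = [math.log10(m) for m, _ in markers]
    slope, intercept = fit_line(distances, log_masses)
    return 10 ** (slope * distance_cm + intercept)

# Hypothetical marker ladder: log10(mass) drops by 0.2 per cm of migration
markers = [(100.0, 1.0), (10 ** 1.8, 2.0), (10 ** 1.6, 3.0), (10 ** 1.4, 4.0)]
unknown = estimate_mass_kda(markers, 2.5)
print(f"Estimated mass: {unknown:.1f} kDa")
```

On a real gel the markers scatter around the fitted line, which is one reason the estimate is only good to about 5-10%.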
Immunoblotting: Western blot

- Protein bands can be detected by immunoblotting as an alternative to staining
– Steps of western blot:
1. Perform gel electrophoresis on a sample containing the protein of interest
Blot the proteins from the gel onto nitrocellulose
This creates a nitrocellulose replica of the gel
2. Block the unoccupied binding sites on the nitrocellulose with casein
3. Incubate with rabbit antibody to the protein of interest - binding of primary antibody
4. Wash and incubate with an enzyme-linked goat anti-rabbit antibody - binding of
enzyme-linked secondary antibody
- Secondary antibody detects primary antibody - it is labelled with something like
horseradish peroxidase (HRP), which allows the product to be visualized
5. Assay the linked enzyme with a colorimetric reaction - produces the immunoblot
Often horseradish peroxidase is used to visualize the blot - it catalyzes the oxidation
of luminol by peroxide, generating light
The light is detected with special equipment
Other techniques for monitoring proteins
 Absorbance at 280nm
– Tryptophan and tyrosine absorb light in UV region
– Use the Beer-Lambert law to calculate the concentration of protein in the sample
 Assay activity - ex// nuclease activity
– Monitors protein activity
– The greater the activity - the higher the proportion of the protein of interest in the sample
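The absorbance calculation can be sketched as follows. The Beer-Lambert law is A = ε·c·l, so c = A / (ε·l). The per-residue coefficients used to estimate ε at 280 nm (5500 for Trp, 1490 for Tyr, in M^-1 cm^-1) are commonly quoted approximations and should be treated as assumptions here. The example protein, with 2 Trp and 3 Tyr, is hypothetical.

```python
def extinction_coefficient_280(n_trp, n_tyr):
    """Approximate molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumes ~5500 per Trp and ~1490 per Tyr (approximate literature values).
    """
    return 5500 * n_trp + 1490 * n_tyr

def concentration_molar(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law A = epsilon * c * l, solved for c (mol/L)."""
    return absorbance / (epsilon * path_cm)

# Hypothetical protein with 2 Trp and 3 Tyr, measured in a 1 cm cuvette
epsilon = extinction_coefficient_280(n_trp=2, n_tyr=3)
conc = concentration_molar(0.7735, epsilon)
print(f"epsilon = {epsilon} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```

This gives a concentration of total protein, so in a crude extract it says nothing about how much of the absorbance comes from the protein of interest.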
Five characteristics of proteins that are exploited for purification

- Solubility: procedures include
○ Salting in
○ Salting out
- Ionic charge: procedures include
○ Ion exchange chromatography
○ Electrophoresis
○ Isoelectric focusing
- Polarity: procedures include
○ Adsorption chromatography
○ Paper chromatography
○ Reverse-phase chromatography
○ Hydrophobic interaction chromatography
- Molecular size: procedures include
○ Dialysis and ultrafiltration
○ Gel electrophoresis
○ Gel filtration chromatography
○ Ultracentrifugation
- Binding specificity: procedures include
○ Affinity chromatography
Solubility
- A protein’s multiple acid-base groups make its solubility properties dependent on the concentrations of dissolved salts
- As a result, proteins greatly vary in their solubility in different conditions
○ Some precipitate in certain conditions and others don't, which forms the basis of protein purification
- Solubility purification technique depends on salting in and salting out procedures:
1. Salting in:
◊ Most proteins have low solubility in the absence of salt
◊ At low concentrations, added salt usually increases the solubility of charged macromolecules because
counterions shield the multiple ionic charges of the protein
◊ Neutralizing charges on protein surface reduces ordered H2O around proteins, increasing entropy of
the system, and thus the solubility
◊ so low salt concentrations prevent aggregation of the proteins, which can lead to precipitation or
“crashing”
2. Salting out:
◊ Most proteins have low solubility in very high concentrations of salt
◊ at high concentrations, added salt lowers the solubility of macromolecules because it competes for the
water molecules needed to solvate them
◊ Because the salt ions compete with the protein for solvent molecules, the solvation around proteins is not
enough to solubilize them
◊ so high salt concentration removes the solvation sphere from the protein molecules and they come out
of solution as precipitation
- Protein solubility at a given ionic strength varies with the type of ions in solution
○ Protein solubility is related to ion size and hydration
○ Different ions have different effects
◊ The graph: shows the solubility of carboxy-hemoglobin at its pI as a function of ionic strength
 S is solubility in salt solution
 S' is solubility in pure water
 Ionic strength increases with salt concentration (the two are equal only for 1:1 salts such as NaCl)
 2 trends:
– Larger ions have more drastic effects on protein solubility
– solubility of proteins decreases with increasing ionic strength
◊ The solubility of proteins in ammonium sulfate varies

 As can be seen from the graph each protein has a different solubility at different concentrations of
ammonium sulfate
 Therefore: we can selectively precipitate proteins at different ionic strengths
◊ Salting out was traditionally one of the most commonly used purification steps
 remove unwanted proteins from a solution by salting out
 after removing the precipitate by centrifugation, the desired protein can be precipitated by
altering the salt concentration to the level at which the desired protein becomes insoluble – “salt
cuts”
◊ Centrifugation separates out precipitated proteins

 Insoluble material becomes pellet and settles at the bottom of the tube
 Soluble material stays in the supernatant - protein of interest should be located here
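The solubility curves above are plotted against ionic strength rather than raw salt concentration. The sketch below uses the standard definition I = 0.5 · Σ(c_i · z_i²), which is not spelled out in these notes, to show why a multivalent salt such as ammonium sulfate reaches a high ionic strength at a modest concentration. The function name is illustrative.

```python
def ionic_strength(ions):
    """ions: list of (molar_concentration, charge) pairs.

    Returns I = 0.5 * sum(c * z^2) in mol/L.
    """
    return 0.5 * sum(c * z ** 2 for c, z in ions)

# 1 M NaCl dissociates into 1 M Na+ and 1 M Cl-
nacl = ionic_strength([(1.0, +1), (1.0, -1)])

# 1 M (NH4)2SO4 dissociates into 2 M NH4+ and 1 M SO4(2-)
ammonium_sulfate = ionic_strength([(2.0, +1), (1.0, -2)])

print(f"1 M NaCl: I = {nacl} M")
print(f"1 M ammonium sulfate: I = {ammonium_sulfate} M")
```

At the same molarity, ammonium sulfate gives three times the ionic strength of NaCl.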
Ionic Charge: ion exchange chromatography
- Chromatography - the separation of a mixture by passing it through a medium in which the components move at different rates
- Ion exchange chromatography - a purification method aimed at separating proteins based on charge
◊ Schematics:
 The procedure requires a column filled with solid matrix
 The matrix contains pores of specific size
 The sample is applied to the very top of the column - mixture of proteins and cell parts
- General mechanism: example of positively charged column

○ Slowly add the solvent to the top of the column on top of the sample
○ As we add the solvent, the proteins separate
▪ Some proteins will stay in the column longer
○ Column is positively charged
▪ Proteins with negative charge will migrate more slowly than proteins with positive charge
○ Therefore: the separation occurs due to difference in charges of the proteins
◊ More detailed mechanism:
 We can use a salt concentration gradient to elute the protein of interest
– In this case we use just 2 step gradient
 First apply low-salt solution buffer
– The first protein is eluted from the chromatography
 Then apply high-salt solution
– Proteins that bound more tightly in the low-salt solution start to elute from the column
 Then we are left with the protein of interest which is eluted from the column by adjusting the
salt concentration
 To determine at which point the protein is being eluted use:
– protein can be detected by UV absorbance, SDS-PAGE, enzymatic activity
- Chromatography column contains Ion exchange resins

○ The 2 types of ion exchanger are cation and anion exchangers
- Both are composed of polystyrene or silica-based beads with ionic functional groups attached
◊ Cation exchanger has negatively charged group

 therefore binds and exchanges positively charged proteins
◊ Anion exchanger has positively charged group
 therefore binds and exchanges negatively charged proteins
○ Therefore the proteins can bind to both anion and cation exchangers depending on their net charge
- This binding allows the separation of the proteins
- Different proteins can be bound at different points of chromatography by adjusting the pH of the solution
- Once bound, the protein is eluted by increasing the ion concentration
○ the affinity with which a particular protein binds to a given ion exchanger depends on the identities and concentrations of the
other ions in solution b/c of competition among these ions for the binding sites
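The dependence of binding on pH can be made concrete with a small sketch. A protein carries a net positive charge below its isoelectric point (pI) and a net negative charge above it, which decides which type of exchanger it will bind. The function name and the example pI values are hypothetical, and real method development also has to account for protein stability at the chosen pH.

```python
def choose_exchanger(protein_pi, buffer_ph):
    """Pick the exchanger type a protein should bind at a given buffer pH."""
    if buffer_ph < protein_pi:
        # Below its pI the protein is net positive, so it binds negative groups
        return "cation exchanger"
    if buffer_ph > protein_pi:
        # Above its pI the protein is net negative, so it binds positive groups
        return "anion exchanger"
    return "neither (net charge is about zero at the pI)"

# Two hypothetical proteins in the same pH 7.0 buffer
print(choose_exchanger(protein_pi=9.0, buffer_ph=7.0))  # basic protein
print(choose_exchanger(protein_pi=5.0, buffer_ph=7.0))  # acidic protein
```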
Chromatographic separation of proteins

◊ Mobile phase - the solvent carrying the mixture of proteins that we want to separate
◊ Stationary phase - solid matrix of the column
 Acts to slow the progress of the proteins through the column in a manner which varies for each
protein
◊ During procedure, the solvent is continuously applied to the top of the column from a large reservoir of
solvent
◊ Different forces hold each protein and cause the proteins to migrate at different rates
 Eventually separating them into bands of pure substances
 Separation depends on the nature of the column and the nature of the proteins
◊ In total there are 3 different chromatography techniques which are classified based on the mobile and
stationary phase
Polarity: Hydrophobic interaction chromatography (HIC)

- The technique is based upon the hydrophobicity properties of the proteins
- 4 main steps:
1. Equilibration - performed by adding salt to mobile phase
2. Sample application and wash - bind the target molecule and wash off all of the unbound material
- Promoted by moderately high salt concentration
3. Elution - biomolecules are released from hydrophobic surface by a change in buffer composition
- Caused by decreasing the salt concentration
- Often gradient solution is used
4. Regeneration - removes all of the molecules that are still bound
- Ensures all of the capacity of the stationary phase is available for the next stage
Hydrophobic interaction chromatography (HIC)

◊ Separates proteins based on their varying strength of interaction with hydrophobic groups attached to
an uncharged gel matrix
◊ it exploits differences in hydrophobicity
 number of hydrophobic amino acids
 distribution of these amino acids
◊ HIC ligands that can be attached to gel matrix are shown on the right
Molecular size: purification technique

◊ Dialysis: remove small molecules from the protein solution through a semipermeable dialysis bag
 The bag contains pores which are smaller than the molecule of interest
 Allows smaller molecules to diffuse across the membrane but traps larger molecules
◊ molecules move from the more concentrated solution (inside) to the less concentrated solution (bulk
solvent)
◊ The procedure can be repeated multiple times, to ensure that most of the unwanted material left the
dialysis bag
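Why repeating the procedure helps can be shown with a short calculation. The sketch assumes that in each round the small molecules equilibrate completely between the bag and fresh bulk solvent, and that the protein of interest is fully retained. The volumes and the function name are hypothetical.

```python
def remaining_fraction(bag_volume_ml, buffer_volume_ml, rounds):
    """Fraction of a small, freely diffusing solute left in the bag.

    Assumes complete equilibration in every round and fresh buffer each time.
    """
    per_round = bag_volume_ml / (bag_volume_ml + buffer_volume_ml)
    return per_round ** rounds

# Hypothetical 10 mL sample dialysed against 990 mL of buffer
for n in (1, 2, 3):
    print(n, remaining_fraction(10.0, 990.0, n))

frac = remaining_fraction(10.0, 990.0, 3)
```

Three changes of buffer at a 1:100 volume ratio are far more effective than a single round against three times as much buffer.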
Molecular size: Size exclusion chromatography

◊ Known as “size exclusion” or “gel filtration” chromatography
◊ Small beads of polymerized glucose, agarose, or acrylamide are manufactured with different sizes of
pores depending on the degree of cross-linking of the polymer
◊ The beads are packed into a cylinder (column) and a mixture of proteins is applied
 Big proteins don't enter the porous beads and run quickly through the column
 Small proteins enter and exit the beads and elute more slowly
◊ Gel filtration chromatography can be used to estimate molecular masses

 The plot:
– Vo = void volume
– Ve = elution volume
– Ve/Vo = relative elution volume
– The plot shows a linear relationship between relative elution volume and the logarithm of molecular mass
– The solid line is graphed using proteins of known molecular masses and elution volumes
– Unknown proteins are graphed on the plot, and compared with known proteins to
estimate the unknown molecular mass
 This technique is limited by the assumption that known and unknown proteins have identical
shapes
– Globular proteins fall close to the line
– Elongated proteins not so much - therefore, this is a bad way to estimate their molecular
mass
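Reading an unknown mass off the calibration plot can be sketched as linear interpolation in log10(mass) between the two standards that bracket the unknown's relative elution volume (Ve/Vo). The standards below are invented numbers, and the function name is hypothetical. As noted above, the result is only meaningful if the unknown has a shape similar to the standards.

```python
import math

def interpolate_mass_kda(standards, rel_elution):
    """standards: (mass_kDa, Ve/Vo) pairs for proteins of known mass.

    Interpolates log10(mass) linearly between the two bracketing standards.
    """
    points = sorted((r, math.log10(m)) for m, r in standards)
    for (r0, log0), (r1, log1) in zip(points, points[1:]):
        if r0 <= rel_elution <= r1:
            t = (rel_elution - r0) / (r1 - r0)
            return 10 ** (log0 + t * (log1 - log0))
    raise ValueError("relative elution volume lies outside the calibrated range")

# Hypothetical calibration: larger proteins elute earlier (smaller Ve/Vo)
standards = [(200.0, 1.2), (20.0, 2.0), (2.0, 2.8)]
mass = interpolate_mass_kda(standards, 1.6)
print(f"Estimated mass: {mass:.1f} kDa")
```

The function raises an error outside the calibrated range instead of extrapolating, since proteins larger than the exclusion limit all elute in the void volume.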
Binding specificity: Affinity chromatography

- 3 main steps
1. Equilibration - equilibration to desired condition
2. Sample application and wash - bind target molecules and wash off all the unwanted material
3. Elution - biomolecules are released from bio specific ligand and eluted
- Achieved by increase in pH of the buffer
◊ Separates biochemical mixtures based on a highly specific interaction such as antibody/antigen or
enzyme/substrate
◊ Matrix has ligand bound to it - some antigens or small peptides
 Only specific proteins from the protein solution bind to ligand
 Proteins that failed to attach are washed off
◊ Example: GST - Glutathione

 GST (glutathione S-transferase)-tagged proteins bind to glutathione beads
 non-specifically or weakly bound proteins washed off
 GST-tagged proteins eluted with either
– glutathione (competitor) - outcompetes the protein for binding to the ligand
– thrombin (protease) - cleaves GST from the protein, leaving GST bound to ligand but
protein is eluted
◊ Commonly used affinity tags

 His tag for purification by IMAC affinity column
 Epitope tag for purification by immunoaffinity column - more expensive
 Ligand-binding domain for purification by conventional affinity column - such as GST
– For unstable proteins, fusing with another domain might increase their expression, thus
increasing the yield
◊ Before purification:
 Clone the gene into an expression vector - a plasmid from which we can express the recombinant
protein of interest
– Contains protein-coding sequence, linker sequence and affinity tag
Summary of chromatographic methods of protein purification
Review of protein purification
1. Thousands of different proteins in starting mixture (cell extract)
2. Exploit differences in:
- Solubilities
- charge, polarity
- Size
- binding specificities
3. Need the protein to retain its biological activity and high yield to analyze structure and function
Lecture 8
October 13, 2020 2:30 PM
Proteins fall into two main classes:

1. Fibrous proteins
- highly elongated molecules whose secondary structures are their dominant structural motifs (e.g. collagen)
- many fibrous proteins function as structural materials that have a protective, connective, or supportive role in living organisms
▪ These fibrous proteins are often found in skin/bone/tendon
▪ Others like the ones found in muscles also have motive function which helps us move
- usually span a long distance in the cell
- mainly insoluble proteins
2. Globular proteins
- compact spherical shape with irregular surfaces (e.g. myoglobin and hemoglobin)
- this type of structure increases the solubility of proteins in water because the polar residues are on the surface and hydrophobic groups in
the interior
Fibrous proteins provide mechanical support to cells or organisms

- Intermediate filaments of the cytoskeleton
- Fibrous proteins provide structural scaffold inside the cell - ex// keratin in hair, horns and nails
- Extracellular matrix
- Fibrous proteins bind cells together to make tissues
- secreted from cells and assembled into long fibers
▪ collagen in bone, teeth, cartilage, tendon, ligaments
▪ elastin – unstructured fibers that give tissue an elastic characteristic
- Collagen
- Provides strength to the skin tissue
- Composes a lower layer of skin
- Elastin
- Provides elasticity to the skin
- Forms 3D network which connects collagen fibers
- Composes middle layer of the skin
COLLAGEN
Collagen
□ most abundant protein in vertebrates
□ 30% of total protein in mammals
□ extracellular protein organized into insoluble fibers
□ found in connective tissues
 bone, teeth, cartilage, tendon, fibrous matrices of blood vessels and skin
□ remarkably diverse forms and functions
 in tendons it forms stiff, rope like fibers of tremendous tensile strength
 in skin it forms loosely woven fibers that permit expansion in all directions
□ 28 different types which occur in different tissues
Collagen forms a triple helix
□ collagen consists of three left-handed helical chains coiled around each other to form a right-handed supercoil
□ each left-handed helix has 3 AA/turn
□ much more extended than α-helix, with a rise per residue of 2.9 Å (more elongated)
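How much more extended the collagen helix is can be checked with simple arithmetic. The collagen values (2.9 Å rise per residue, 3 residues per turn) come from the notes. The α-helix values (1.5 Å rise per residue, 3.6 residues per turn) are standard textbook figures not stated in this section. The 300-residue segment is a hypothetical example.

```python
def helix_pitch(rise_per_residue_angstrom, residues_per_turn):
    """Pitch (rise per full turn) of a helix, in angstroms."""
    return rise_per_residue_angstrom * residues_per_turn

def helix_length_nm(n_residues, rise_per_residue_angstrom):
    """End-to-end length of a helical segment in nanometres (10 angstroms = 1 nm)."""
    return n_residues * rise_per_residue_angstrom / 10.0

collagen_pitch = helix_pitch(2.9, 3.0)    # values from the notes
alpha_pitch = helix_pitch(1.5, 3.6)       # textbook alpha-helix values (assumption)

collagen_len = helix_length_nm(300, 2.9)  # hypothetical 300-residue segment
alpha_len = helix_length_nm(300, 1.5)

print(f"pitch: collagen {collagen_pitch:.1f} A, alpha-helix {alpha_pitch:.1f} A")
print(f"300 residues: collagen {collagen_len:.0f} nm, alpha-helix {alpha_len:.0f} nm")
```

For the same number of residues, a collagen chain spans nearly twice the length of an α-helix.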
Collagen has an unusual amino acid sequence

– sequence in helical region consists of multiple repeats of Gly-X-Y
 often X = Pro, Y = Hyp
 Repeating tripeptide
– Hyp - hydroxyproline
 Non-standard amino acid which results from enzymatic modification of proline
 Modification occurs after translation
– Structure usually contains about 35% glycine, 21% proline and 11% alanine
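The Gly-X-Y repeat is easy to check in a sequence. The sketch below reads a one-letter sequence in frame and reports the fraction of triplets that begin with Gly. The letter "O" for hydroxyproline is a convention assumed here, since Hyp has no standard one-letter code. The sequences are short invented examples, not real collagen sequences.

```python
def gly_xy_fraction(sequence):
    """Fraction of complete in-frame triplets that start with Gly ('G').

    Reads triplets from position 0 and ignores any incomplete trailing residues.
    """
    triplets = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    if not triplets:
        return 0.0
    return sum(t[0] == "G" for t in triplets) / len(triplets)

ideal = "GPOGPOGAOGPO"    # hypothetical helical stretch, Gly at every third position
mutant = "GPOAPOGAOGPO"   # same stretch with one Gly replaced by Ala

print(gly_xy_fraction(ideal))
print(gly_xy_fraction(mutant))
```

A value of 1.0 means Gly occupies every third position in the chosen frame, and a single substitution shows up as a drop below 1.0.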
Collagen triple helix is stabilized by interchain hydrogen bonds

- Gly residues are located along central axis of the triple helix, where tight packing of the polypeptides cannot accommodate any other residue
- for each G-X-Y triplet, one H-bond forms between amide hydrogen of Gly in one chain and the carbonyl O of a residue in next chain
- unlike α-helix, there are no intrachain H-bonds
Structural basis of the collagen triple helix

 The unusual amino acid composition of collagen, with long stretches of Gly-Pro-hydroxyPro is NOT suited
for α-helices or β-sheets
 the limited conformational flexibility of Pro and hydroxyPro prevents formation of α-helices and makes
collagen somewhat rigid
 presence of Gly at every third position allows collagen chains to form a tightly wound left-handed helix that
accommodates the Pro
 collagen also contains an additional modified AA, 5-hydroxylysine
– Non-standard amino acid which also results from enzymatic modifications
Collagen is organized into fibrils
◊ Driving force which results in triple helix structure:

 Burial of hydrophobic residues within interface of the protein
 Occurs in less specific manner than in globular protein
The arrangement of collagen fibrils in various tissues:

- Collagen is organized into different fibril organizations for different tissues:
- Tendon - parallel bundles - connect muscles to bones
- Skin - sheets of fibrils layered at many angles - skin becomes more elastic
- Cartilage - no distinct arrangement
- Cornea - planar sheets stacked crossways so as to minimize light scattering
Mutations in collagen
 Mutations in collagen can cause disease
 Example: Osteogenesis imperfecta (brittle bone disease)
◊ single point mutation
◊ central Gly changes to Ala and distorts the triple helix and fiber formation
◊ additional methyl groups cause the fibril melting temperature to drop from 63°C to 29°C
◊ Therefore: protein is not functional in human biological conditions
ELASTIN
Elastin
◊ highly elastic protein in connective tissue
◊ allows tissues in the body to stretch and “snap back” to their original shape
◊ primarily composed of small nonpolar amino acids (e.g. Gly, Ala, Val)
◊ also rich in Pro and Lys, but contains little hydroxyproline and hydroxylysine
◊ Usually around 800 amino acids long - single molecule or fiber?
◊ Molecules aggregate together and form tissue by becoming cross-linked
 15-17 cross links per molecule
 Wide spacing of cross-links allows extension of tissues and provides strength
Protein Structure and Function Hemoglobin and Myoglobin
Globular proteins: tertiary structure allows them to bind other molecules
□ the shape of globular proteins, with their indentations and interdomain interfaces allows them to function by binding
selectively and transiently to other molecules
□ best exemplified by interactions between enzymes and substrates
□ binding sites are typically positioned toward the interior of a protein, and are relatively free of water
□ when substrates bind, they fit so well that water molecules in the binding cleft are excluded
MYOGLOBIN
Myoglobin
 Myoglobin was the first protein to have its structure determined (1958)

 elucidation of this structure laid the foundation for our present knowledge of tertiary structure of proteins
 myoglobin binds oxygen and facilitates its diffusion in muscles
 accounts for 8% of total protein in muscles of diving mammals
 it binds a single heme group that coordinates oxygen
2 parts of Myoglobin
- Myoglobin has two parts: the protein itself and the heme prosthetic group
◊ Protein itself:
 myoglobin is composed of 153 amino acids
 8 helices ranging in length from 7 to 26 residues (~75% helix)
– Named from A to H starting from N-terminus
 heme is tightly wedged in a hydrophobic pocket formed mainly by helices E and F
 the accessibility of the heme group to molecular oxygen depends on slight movement of nearby amino
acids
◊ Heme prosthetic group

 Heme is composed of porphyrin and Fe2+
 Fe atom is the site of oxygen binding

 Fe interacts with 6 ligands in myoglobin and hemoglobin
– 4 N atoms of porphyrin
– imidazole s/c of His residue (93) - on one side
– the O2 molecule - on the other side
 oxygen-free myoglobin called deoxymyoglobin
 oxygen-bearing molecule is oxymyoglobin
 reversible binding is known as oxygenation
 Molecules such as CO, nitric oxide (NO) and cyanide bind at the oxygen-binding (ligand) position with a higher affinity than O2
– As a result, they are highly toxic
- Example: Oxygen binding site in sperm whale myoglobin
Myoglobin function:
- Myoglobin was first thought to function mainly in carrying (storing) O2
○ However, this feature is more important for aquatic animals rather than humans
○ Aquatic animals have 10 to 30 times higher myoglobin concentration in muscles
- Likely major function is to facilitate oxygen transport in rapidly respiring muscles
○ Respiration is limited by low solubility of oxygen in aqueous solution
○ Myoglobin increases solubility of oxygen in muscles as they are the most rapidly respiring tissue under conditions of high exertion
Myoglobin summary
- despite being one of the best studied proteins in biology, its physiological function is not yet conclusively established
- myoglobin increases the solubility of oxygen and facilitates its diffusion
- oxygen storage is also a function – concentrations are 10-fold higher in whales and seals than in land mammals
- hypothesized that myoglobin function relates to increased oxygen transport to muscle, oxygen storage and as a scavenger of reactive oxygen
species
Hemoglobin
Hemoglobin
 Hemoglobin is another oxygen transport and storage molecule
 composed of four polypeptide chains
◊ two α - 141 amino acids
◊ two β - 146 amino acids
 contains two dimers of αβ subunits held together by noncovalent interactions
 each chain is a subunit with a heme group in the center that carries oxygen
 one hemoglobin molecule contains four heme groups and carries four O2 molecules
The α and β subunits of hemoglobin and myoglobin

- The α and β subunits of hemoglobin and myoglobin share a common fold
□ The structure:
 α-globin (blue)
 β-globin (purple)
 Myoglobin (green)
□ These protein subunits share only 18% primary sequence identity
◊ Sequence alignment
 Light blue - shared amino acids between Hb alpha and beta
 Pink and purple - shared amino acids between all three subunits
 Overall, not that many conserved sequences + location of loops vary
Myoglobin and hemoglobin have different affinities for O2

- myoglobin has hyperbolic O2 binding curve
○ Simple curve with 1 equilibrium constant
○ myoglobin binds O2 tightly, releases at very low pO2
- hemoglobin has lower affinity than myoglobin

○ Results in sigmoidal curve b/c the 4 heme sites (binding 4 O2) interact cooperatively
○ O2 bound by hemoglobin in lungs when pO2 is high; released in tissues
- Cooperative binding of hemoglobin

○ Binding of 1 oxygen to 1 heme group is not favoured
○ But once 1 oxygen binds, it facilitates binding of oxygen to the other 3 heme groups
○ If 1 Oxygen is lost from fully oxygenated molecule, it decreases the affinity for oxygen in the other groups
○ This leads to rapid release of oxygen molecules
 Graph:
◊ y-axis - fraction saturation of each protein
◊ x-axis - concentration of Oxygen as partial pressure
◊ When y=0.5 - protein is half saturated with oxygen
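The two curve shapes described above can be sketched numerically. This is a minimal sketch, not something from the lecture: it assumes the Hill equation as a simple stand-in for hemoglobin's cooperative binding, and the P50 values and Hill coefficient below are typical textbook figures chosen for illustration, not numbers given in these notes.

```python
def fractional_saturation(p_o2, p50, n=1.0):
    """Hill equation: Y = pO2^n / (P50^n + pO2^n).

    n = 1 gives the simple hyperbolic curve (one equilibrium constant);
    n > 1 gives a sigmoidal curve (positive cooperativity).
    P50 is the pO2 at which the protein is half saturated (Y = 0.5).
    """
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


# Assumed, textbook-style parameters (not taken from the notes)
MB_P50 = 2.8      # torr, myoglobin
HB_P50 = 26.0     # torr, hemoglobin
HB_HILL_N = 2.8   # apparent Hill coefficient for hemoglobin

for p in (1, 5, 10, 20, 40, 100):
    y_mb = fractional_saturation(p, MB_P50)
    y_hb = fractional_saturation(p, HB_P50, HB_HILL_N)
    print(f"pO2 = {p:>3} torr   Mb Y = {y_mb:.2f}   Hb Y = {y_hb:.2f}")
```

At low pO2 the myoglobin column rises steeply while the hemoglobin column stays near zero and then climbs sharply, which is the hyperbolic versus sigmoidal behaviour shown in the graph.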
Hemoglobin function
- myoglobin, an oxygen storage protein, has a greater affinity for oxygen at all oxygen pressures
- hemoglobin is different – it must bind oxygen in lungs and release it in capillaries
- hemoglobin becomes saturated with O2 in the lungs, where the partial pressure of O2 is about 100 torr
- In capillaries, pO2 is about 40 torr, and oxygen is released from hemoglobin
○ Once Oxygen is released, myoglobin binds to it
○ Makes up a system to deliver Oxygen from the lungs to the other parts of the body
- The binding of O2 to hemoglobin is cooperative – binding of oxygen to the first subunit makes binding to the other subunits more favorable
Quaternary structure of oxy- and deoxyhemoglobin
□ T-state – deoxy form “tensed”

 Greater space between 2 dimers of alpha-beta subunits
 Weak ionic and hydrogen bonds occur between alpha-beta dimer pairs
 Strong interactions (primarily hydrophobic) within the dimer
□ R-state – oxy form “relaxed”

 Smaller space between 2 dimers of alpha-beta subunits
 Some ionic and hydrogen bonds between dimers are broken
 The interactions within dimers stay the same
◊ Mechanism of conversion from one form to another:

 Binding of oxygen triggers conformational change into R-state
 Subunits become more tightly packed
 R-state has a greater affinity for Oxygen b/c oxygen binding stabilizes the structure
What causes the differences in the conformational states?
◊ The Fe atom is about 0.6 Å out of the heme plane in the deoxy state
◊ When oxygen binds it pulls the iron back into the heme plane
◊ Since the proximal His F8 is attached to the Fe this pulls the complete F helix
◊ The dimers become closer together
Mechanism of oxygen-binding cooperativity

- Binding of the oxygen on one heme is difficult, but its binding causes a shift in the α1-β2 contacts and moves the distal His E7 and Val E11 out
of the path for O2 binding to the Fe on the other subunit
- This process increases the affinity of that heme toward oxygen
 Diagram:
◊ Unbound state - blue
◊ Bound state - red
O2 binding to hemoglobin shows positive cooperativity
 As more oxygen is bound to hemoglobin tetramer:

◊ The strain derived from the Fe-O bond accumulates in the liganded subunits until it is of sufficient strength to snap
hemoglobin into R-conformation
 All subunits change to R-conformation regardless of whether they are bound to oxygen
◊ Un-ligated subunits have higher affinity for oxygen in R-state
The Bohr effect

- The Bohr effect - decreasing pH decreases the affinity of hemoglobin for oxygen
- CO2 indirectly affects the affinity by lowering pH of the blood cells
○ Lower pH leads to protonation of several hemoglobin groups which can then form ion pairs and stabilize the T-state
 Overall:
◊ higher pH promotes tighter binding of oxygen to Hb
◊ lower pH permits easier release of oxygen from Hb
◊ this is a result of pK changes of several groups (N-terminal amino groups of α subunits and C-terminal His of β
subunits)
- Bohr effect increases the efficiency of oxygen-delivery system

○ In lungs: CO2 concentration is low = high pH = oxygen is picked up by hemoglobin
○ In metabolic tissues: CO2 concentration is high = low pH = Oxygen is unloaded from hemoglobin
The physiological significance of the hemoglobin:O2 interaction

- hemoglobin must be able to bind oxygen in the lungs
- hemoglobin must be able to release oxygen in capillaries
- if hemoglobin behaved like myoglobin, very little oxygen would be released in capillaries
- the sigmoidal, cooperative oxygen binding curve of hemoglobin makes its physiological actions possible!
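The claim that a myoglobin-like carrier would release very little oxygen can be checked with a rough calculation. The sketch below uses the lung and capillary pressures from these notes (100 and 40 torr); the P50 values and Hill coefficient are assumed textbook-style figures, and the Hill equation is only an approximation of the real hemoglobin curve.

```python
def saturation(p_o2, p50, n=1.0):
    """Fraction of O2 sites occupied at a given pO2 (Hill equation)."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


def fraction_delivered(p_lungs, p_tissue, p50, n=1.0):
    """Fraction of sites that give up O2 on going from lungs to tissue."""
    return saturation(p_lungs, p50, n) - saturation(p_tissue, p50, n)


P_LUNGS, P_CAPILLARY = 100.0, 40.0   # torr, values from the notes

# Assumed parameters: sigmoidal hemoglobin vs. a hyperbolic, myoglobin-like carrier
hb_delivery = fraction_delivered(P_LUNGS, P_CAPILLARY, p50=26.0, n=2.8)
mb_like_delivery = fraction_delivered(P_LUNGS, P_CAPILLARY, p50=2.8, n=1.0)

print(f"hemoglobin-like carrier releases {hb_delivery:.0%} of its O2")
print(f"myoglobin-like carrier releases  {mb_like_delivery:.0%} of its O2")
```

Under these assumptions the cooperative carrier unloads several times more oxygen over the same pressure drop, because the myoglobin-like curve is still almost saturated at 40 torr.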
Mutations in β-chain of hemoglobin cause Sickle Cell Anemia

- patients with Sickle-cell anemia have abnormally-shaped red blood cells
- sickle cells pass less freely through the capillaries, impairing circulation and causing tissue damage
- a single amino acid substitution in β-chains of hemoglobin (Glu6Val) causes hemoglobin molecules to aggregate into long, chainlike polymeric
structures
 The mutant Val can insert into hydrophobic pocket in neighbouring Hb molecules
◊ Both the oxy and deoxy states of sickle cell Hb carry the mutant Val as a hydrophobic patch on the surface
◊ Only the deoxy state has an exposed hydrophobic pocket that fits this Val
◊ As a result, deoxy-Hb of sickle cell polymerizes into filaments, aggregating into fibers of 14-16 strands
◊ These fibers deform the cell
- Heterozygotes don't show the disease b/c formation of fibers is 1000 times slower than in homozygotes
Lecture 9 (Protein Folding and Stability)
October 15, 2020 6:32 PM
The Protein Folding Problem

- proteins spontaneously fold into their native conformations under physiological conditions
- a protein’s primary structure dictates its 3D structure
- in general, under the proper conditions, biological structures are self-assembling (i.e. they don’t need external templates to guide their formation)
Levinthal Paradox (1969)

- Simplest explanation to how proteins fold would be: random exploration of space
- Levinthal thought that because of the high degree of freedom in unfolded polypeptides, a molecule has an astronomical number of possible conformations
○ Example: 100 residues = 99 peptide bonds
▪ 198 different phi and psi bond angles
▪ If there are 3 stable conformations for each bond angle = 3^198 possible conformations
▪ Even sampling conformations at a nano/picosecond rate, proteins would never find the correct structure
- Paradox: Proteins fold on a millisecond-to-microsecond timescale
○ Therefore: it is impossible that proteins fold through random exploration
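The arithmetic behind the paradox can be worked through directly. This is a minimal sketch using the numbers in the example above (100 residues, 198 angles, 3 states each); the sampling rate of one conformation per picosecond is an assumption picked from the fast end of the "nano/pico second" range.

```python
import math

RESIDUES = 100
PEPTIDE_BONDS = RESIDUES - 1          # 99
BOND_ANGLES = 2 * PEPTIDE_BONDS       # 198 phi and psi angles
STATES_PER_ANGLE = 3

# Python integers are arbitrary precision, so 3^198 is exact here
conformations = STATES_PER_ANGLE ** BOND_ANGLES

SAMPLING_RATE = 1e12        # assumed: conformations tried per second (1 per ps)
SECONDS_PER_YEAR = 3.156e7  # approximate

log10_conformations = BOND_ANGLES * math.log10(STATES_PER_ANGLE)
log10_years = (log10_conformations
               - math.log10(SAMPLING_RATE)
               - math.log10(SECONDS_PER_YEAR))

print(f"3^{BOND_ANGLES} has {len(str(conformations))} digits "
      f"(about 10^{log10_conformations:.1f} conformations)")
print(f"exhaustive search would take about 10^{log10_years:.0f} years")
```

The search time comes out vastly longer than the age of the universe (on the order of 10^10 years), which is why random exploration cannot explain folding in milliseconds.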
Is there a single mechanism for protein folding?

- How a protein achieves its stable, folded state is a complex question
- Levinthal paradox demonstrates that proteins cannot fold by sampling all possible conformations
- This implies that proteins actually fold via specific “folding pathways”
Anfinsen’s protein renaturation experiment with bovine pancreatic ribonuclease

- Anfinsen exposed bovine pancreatic ribonuclease to a concentrated urea solution in presence of reducing agent
○ Reducing agent cleaves disulfide bonds which leaves 8 unpaired Cys residues
○ Urea disrupts hydrophobic core, leading to complete unfolding
◊ Anfinsen then conducted 2 experiments:

One. Removed urea and reducing agent together at once
– Result: protein spontaneously folds into its precise structure and gains back the activity
Two. Removed reducing agent first, then urea
– Result: Leads to random formation of disulfide bonds, and in the end only 1% of the protein folds into its correct
conformation
Determinants of Protein Folding

- Secondary structures – helices, sheets, and turns – start to form
- Nonpolar residues aggregate or coalesce in a process termed a hydrophobic collapse
- Subsequent steps probably involve formation of long-range interactions between secondary structures or involving other hydrophobic interactions
Forces that drive globular protein folding:

- Hydrophobic Effect, H-bonding, van der Waals Interactions and Charge-Charge interactions
- Cooperativity of folding - first few interactions drive the rest of the interactions to occur
- 2 folding criteria:
○ Peptide chain must satisfy the constraints inherent in its own sequence
○ Peptide chain must fold so as to "bury" the hydrophobic side chains, minimizing their contact with water
Hypothetical protein folding pathways
a) polypeptide collapses due to the hydrophobic effect and elements of secondary structure begin to form
◊ Formed intermediate is called "molten globule"
b) subsequent steps involve rearrangement of backbone chain to form characteristic structural motifs
c) finally reaches stable native conformation
- The process happens in the fraction of a second
Protein Unfolding
- Denaturation - the process by which a folded or native protein is converted to an unfolded form
- The free energy difference between the native and unfolded states of a protein is usually small (∆Gfolding ~5-10 kcal/mol), so mild treatments like heat
or a change in solvent will denature most proteins
- Once denatured, a few proteins can refold spontaneously, but this is rare (chaperones are usually required)
- Usually occurs with small proteins only
- Most denatured proteins tend to aggregate and precipitate
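To see why a free energy difference of 5-10 kcal/mol counts as "small", it helps to convert it into populations. The sketch below assumes a simple two-state model (native and unfolded only) at 25 °C, and treats the quoted value as the positive free energy of unfolding; both are illustrative assumptions rather than statements from the lecture.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)


def fraction_unfolded(dg_unfold_kcal, temp_k=298.15):
    """Two-state model: K_unfold = [U]/[N] = exp(-dG_unfold / RT).

    Returns the equilibrium fraction of molecules in the unfolded state.
    """
    k_unfold = math.exp(-dg_unfold_kcal / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)


for dg in (10.0, 5.0, 2.0, 0.0):
    print(f"dG_unfold = {dg:4.1f} kcal/mol -> "
          f"fraction unfolded = {fraction_unfolded(dg):.1e}")
```

Because the population depends exponentially on the free energy, losing only a few kcal/mol of stability (for example through heating or a change in solvent) shifts a large fraction of molecules to the unfolded state.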
Denaturation leads to the loss of protein structure and function
 Example: ovalbumin monomer

◊ More than half of the protein in egg white is ovalbumin
◊ These proteins are denatured during cooking
 Mechanism:
◊ when egg white is heated, the proteins denature
 hydrophobic R-groups are exposed to the H2O and the proteins aggregate and become insoluble
How to denature proteins

- heat is the most familiar denaturant (T > 55°C for most proteins)
- pH (disrupts H-bonds & ionic bonds)
- organic solvents denature proteins
- high concentrations of certain solutes which disrupt the H-bonding system of water (8M urea, 6M guanidine hydrochloride) are excellent protein
denaturants, and keep the denatured form in solution
Folding Accessory Proteins

- most unfolded proteins renature in vitro over periods ranging from minutes to days
○ quite often this is a low efficiency reaction with a large fraction of the polypeptide chain assuming non-native conformations and/or forming
non-specific aggregates
- in vivo, polypeptides efficiently fold to their native conformations as they are being synthesized in a few minutes or less
- this is because all cells contain three types of accessory proteins that help polypeptides fold to their native conformations:
○ Protein disulfide isomerases
○ Peptide prolyl cis-trans isomerases
○ Molecular chaperones
Protein disulfide isomerase
◊ enzyme found in ER in eukaryotes and periplasm in bacteria

◊ catalyze formation and breakage of disulfide bonds as proteins fold
◊ allows proteins to quickly find the correct arrangement of disulfide bonds in their fully folded state
Peptide Prolyl Cis-Trans Isomerase (prolyl isomerase)

◊ PPIs catalyze the otherwise slow interconversion of X-Pro peptide bonds between their cis and trans conformations
◊ most AA have strong preference for trans peptide bond due to steric hindrance
◊ unusual structure of Pro stabilizes the cis form so that both isomers are populated
◊ X-Pro peptide will not adopt intended conformation spontaneously, so cis-trans conversion can be rate-limiting step in
protein folding
◊ thus PPIs accelerate folding of Pro-containing polypeptides by accelerating cis-trans conversion
Molecular Chaperones
- larger proteins are more likely to become temporarily trapped in a local energy well
- the presence of such metastable intermediates slows rate of protein folding and may contribute to aggregation
- to overcome these problems in the cell, the rate of correct protein folding is enhanced by molecular chaperones
◊ Functions of chaperones:
 enhance rate of correct folding of proteins
– sometimes by binding newly synthesized polypeptides before they are completely folded
 prevent formation of improperly folded intermediates that may trap polypeptide in aberrant form
 bind unassembled protein subunits to prevent aggregation and precipitation before they are assembled into
complete multi-subunit complexes
Why have molecular chaperones?

- Proteins fold in the presence of high concentrations of other macromolecules (~300g/L, 25% of the available room in a cell)
○ Which might prevent them from proper folding
- Chaperones:
○ Prevent premature intermolecular contacts
○ Prevent aggregation by inhibiting inappropriate interactions between potentially complementary surfaces
- Example: bind to hydrophobic regions, preventing their aggregation
- Folding rate is enhanced by chaperones
- Many different types of chaperones:

○ heat shock proteins, HSP70 in eukaryotes
○ chaperonins - GroEL/ES in bacteria
The Hsp70 chaperone cycle

- Hsp70 is the major heat shock protein – present in almost all species
○ in bacteria it is called DnaK
- one role of Hsp70 is to bind to proteins while being synthesized to prevent aggregation or entrapment in local low-energy well
- high conservation of Hsp70/DnaK indicates that chaperone-assisted folding is an ancient and essential requirement for efficient synthesis of correctly
folded proteins
◊ Structure:
 2 domains which are connected by a short interdomain linker
 N-terminal domain: ATP binding domain
 C-terminal domain: substrate binding pocket with affinity for peptides with hydrophobic amino acids
– Also contains alpha-helical structure which acts as a lid for substrate binding pocket
◊ Mechanism:
 Binding of the peptide triggers ATP hydrolysis - causes lid to close and trap the substrate
 When ADP is converted back to ATP - lid opens and substrate is released
- HSP70 can interact with other proteins such as HSP40 which:

○ helps to regulate ATP hydrolysis
○ Mediates binding to unfolded protein
Folding in the GroEL-ES chaperonin cage

- core structure consists of two rings containing seven identical GroEL subunits
- Interacts with HSP70 protein
◊ Mechanism:
 HSP70 brings the unfolded protein
 unfolded proteins bind the hydrophobic central cavity enclosed by the rings
 When protein binds, ATP hydrolysis occurs which attracts GroES cap which closes the protein within the chamber
– At the same time, the GroES cap on the other side of the protein is released
 when folding is complete the protein is released and ADP is converted back to ATP
– The GroES top cap is released and the bottom cap binds back
- central cavity large enough to accommodate 630AA protein

- In E. coli 10 to 15% of proteins require GroEL-ES to fold
- Analogous in eukaryotes: HSP60
Protein folding summary

- Folding is an ordered process that seems to follow a series of steps that are unique to each protein – the process is dependent on AA sequence
- Predicting the steps in folding anything but small peptides remains very difficult
- In some cases, molecular chaperones (proteins that bind partly folded polypeptides) assist in folding – they seem to stabilize key intermediates
thereby preventing non-specific aggregation and incorrect folding
Biochem lab manual 1
October 22, 2020 10:35 AM
Spectrophotometer
- A spectrophotometer is an instrument that is used to measure the transmission of light
through a sample
- The higher the concentration of the substance, the more light will be absorbed as it passes
through the solution
- Light that is not absorbed, strikes a photocell causing a flow of current, which is directly
proportional to the light intensity
○ The current is then measured by an ammeter
- The instruments are adjusted so that a maximum current flows when the light passes through
the blank solution (100% transmission or 0% absorbance)
○ A decrease in current therefore indicates a higher concentration of the substance (it is the absorbance, A = -log10(T), that is directly proportional to concentration).
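The relationship between the measured current and concentration can be sketched as follows. This is an illustrative sketch only: the Beer-Lambert law (A = εcl) is not named in the manual excerpt above and is brought in here as the standard relation, and the function names and example currents are hypothetical.

```python
import math


def absorbance_from_current(sample_current, blank_current):
    """A = -log10(T), where T = I_sample / I_blank.

    The blank is set to 100% transmission, so its current defines T = 1.
    """
    transmittance = sample_current / blank_current
    return -math.log10(transmittance)


def concentration(absorbance, epsilon, path_cm=1.0):
    """Beer-Lambert law, A = epsilon * c * l, solved for c."""
    return absorbance / (epsilon * path_cm)


# Hypothetical ammeter readings (arbitrary units), blank set to 100
for current in (100.0, 50.0, 25.0, 10.0):
    a = absorbance_from_current(current, blank_current=100.0)
    print(f"current = {current:5.1f}  ->  %T = {current:5.1f}  A = {a:.3f}")
```

Each halving of the current adds the same increment to the absorbance, so it is absorbance rather than the raw current that scales linearly with concentration.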
Protein Concentration Assays

- The most reliable means of measuring the concentration of a purified protein in solution is the
complete hydrolysis of the protein in hydrochloric acid followed by a quantitative analysis of
the amino acids released
Preparation of a Standard Curve:

- In order to determine the concentration of an unknown protein solution, we assay in parallel a
series of protein standards of known concentration and plot a standard curve of Absorbance
versus Amount of Protein (in µg or mg)
○ Bovine serum albumin (BSA) is often used for such standard curves
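Reading an unknown off a standard curve amounts to fitting a straight line through the standards and inverting it. The sketch below uses a plain least-squares fit; the BSA amounts and absorbance readings are made-up illustrative data, not measurements from the lab.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical BSA standards: amount of protein (ug) and measured absorbance
bsa_ug = [0, 5, 10, 15, 20]
absorbance = [0.00, 0.11, 0.20, 0.31, 0.40]

slope, intercept = fit_line(bsa_ug, absorbance)


def protein_amount(a_unknown):
    """Amount of protein (ug) for an unknown, read off the fitted line.

    Only meaningful when a_unknown lies within the range of the standards.
    """
    return (a_unknown - intercept) / slope


print(f"slope = {slope:.4f} A/ug, intercept = {intercept:.4f}")
print(f"unknown with A = 0.25 -> {protein_amount(0.25):.1f} ug protein")
```

An unknown whose absorbance falls outside the range of the standards should be diluted and re-assayed rather than extrapolated.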
The Bradford Assay:

- The basis of the assay is a shift in the absorbance maximum for an acidic solution of Coomassie
Brilliant Blue G-250 dye (Figure 5) from 465 nm to 595 nm when binding to protein occurs.
○ This change in absorbance can be measured spectrophotometrically.
- Fast and easy
The UV Method:
- Proteins absorb in the UV range (~280 nm) due mainly to the presence of aromatic side chains
in tyrosine and tryptophan residues
○ The amount of absorbance at 280 nm for a given protein will thus depend on its amino
acid composition
- a BSA standard curve is used to determine the concentration of an unknown
- can assume that absorbance is completely linear with respect to protein concentration and
values may be extrapolated from a linear standard curve
- may be performed quickly and easily on a small quantity of material (sensitivity is 0.05 – 2.0
mg/mL protein)
- One major disadvantage to the UV method is that interference may occur from other
compounds found in biological materials
Soret band
- heme-containing proteins such as hemoglobin (Hb) display absorbance at ~ 409 nm
- It is due to delocalization of electrons in the heme porphyrin ring
- In addition to the Soret band, hemoglobin displays characteristic absorption peaks in the
visible range
○ These absorption bands distinguish oxyhemoglobin from deoxyhemoglobin
Lecture 1&2 Section 2
October 27, 2020 2:26 PM
Control of transcription in bacteria

- in bacteria genes encoding proteins that function together can be grouped adjacent to one another in a cluster
within the genome
○ such a cluster is known as an operon
- this arrangement allows for the coordinated expression of the genes through the production of a single mRNA that
encodes several proteins
- the transcription of the mRNA in an operon is controlled by 2 sequences:

○ the promoter - recruits RNA polymerase which will in turn generate the mRNA
○ the operator - which is bound by regulatory factors that regulate recruitment of RNA polymerase to the
promoter
- note that operators may lie upstream of the promoter (as diagramed here) or downstream
Why do we care about gene expression in bacteria?

- bacteria account for a large fraction of the earth’s biomass and as such they play important roles in many aspects
of our environment
- from a medical stand point bacteria are responsible for many human diseases, while others are thought to be
beneficial to humans, so a complete understanding of their biology is important for human health
- they are also used extensively in the biotechnology industry
- bacteria can serve as excellent model systems for biologic research (they can be easy to grow, you can perform
both genetic and biochemical experiments on bacterial systems and many of the things we learn from bacteria are
applicable to higher organisms, including humans)
The lac operon

- this operon contains 3 genes coding for proteins that permit the bacteria to use the sugar lactose
◊ Consider experiment:
 E. coli cells are grown in media containing both glucose and lactose
 Glucose is preferred as a carbon source over lactose so in the initial phase of
the experiment the cells use the glucose and ultimately deplete it from the
media at which point cell division levels off
 After a lag cells begin to divide again and this coincides with the cells utilizing
the lactose
 The lag represents the time the bacteria require to activate the lac operon
– This switch involves mechanisms that both positively and negatively
regulate the genes encoded within the lac operon
The lac operon

- The lac operon encodes 3 proteins
1. Galactoside permease - which transports lactose into E. coli
2. Beta-galactosidase - which cleaves lactose to liberate galactose and glucose (and the products of another
operon convert galactose into glucose)
3. Galactoside transacetylase - whose function is still unclear (even though this work began some 70 years ago)
Negative regulation of the lac operon
 the lacI gene - encodes the lac repressor
– expressed all the time
 in the absence of lactose - tetramers of lac repressor bind to the lac operator
– this blocks expression from the Plac promoter
 in the presence of lactose - an inducer is generated that binds to the lac
repressor preventing it from binding to DNA
– this permits expression from the Plac promoter resulting in expression of
the proteins required to utilize lactose
Lac repressor
- lac repressor is an allosteric protein
○ An allosteric protein is one where binding of a specific molecule to the protein changes the shape of a
remote site on that protein thereby altering its interaction with a second molecule
 The inducer causes a conformational change in the lac repressor that

decreases its affinity for DNA
– The inducer is allolactose
– Allolactose is generated by beta-galactosidase because at some
frequency when beta-gal acts on lactose it converts it into allolactose
- Note:
○ much of the work on the lac operon used an allolactose analog called IPTG that induces the lac operon but is
not cleaved by beta-galactosidase
○ this is helpful as you don’t have to worry about the inducer being depleted - during the course of an
experiment which in turn would lead to shutting down of the operon, as would happen if you treated cells
with lactose because beta-gal also cleaves allolactose
What is the basis for this model?

- This model was the result of a series of experiments that employed a combination of genetic and biochemical
approaches
- Initial biochemical experiments uncovered the role of the beta-galactosidase protein in metabolizing lactose and
showed that its expression was induced by treatment of cells with inducer
- Genetic experiments identified several mutants that were defective in their response to lactose
Discovery of lac operon

- One mutant induced beta-galactosidase protein expression in response to IPTG but was unable to grow on lactose
- Explanation:
▪ while wild-type cells could take up a radiolabeled lactose analog these mutant cells could not
▪ This suggests that this mutant is defective in the expression of a factor that transports lactose into cells
(i.e. the galactoside permease)
- wild-type cells only expressed the permease after treatment with inducer
- Additional experiments identified another enzyme, the galactoside transacetylase, that was induced along with
beta-galactosidase and the permease
- Finally, the 3 genes encoding these enzymes appeared to be right next to one another in the E. coli genome
- To summarize:
- 3 enzymes that function in lactose metabolism that are all induced by lactose and map to the same region of
the E. coli genome
- Together these data suggest that the regulation of these 3 enzymes might occur through a similar
mechanism that might be related to their position in the genome
Constitutive mutants of lac operon (case 1)
– Another class of mutants identified were called constitutive and they
produce all 3 enzymes in the absence of inducer
– To investigate this mutant further partial diploid bacteria were
generated that carry both wild-type and constitutive alleles
(heterozygotes)
– Result:
 In heterozygote individuals - both sets of lac genes were turned off
in the absence of lactose
 The lac repressor produced by the wild type gene was enough to
suppress the transcription of both alleles
– In this model the repressor binds to the operator to turn off expression
of lac products in the absence of inducer
Constitutive mutants of lac operon (case 2)

– the case where the operator sequence was mutated such that the
repressor would no longer bind to it
 generates another constitutive mutant
 these mutants were called operator constitutive mutants or Oc
– In heterozygotes (partial diploid cells)

 The wild type allele expression was off in absence of inducer
 The constitutive mutant expression was on in the absence of
inducer because the lac repressor cannot bind to the mutated operator
- These results indicate that the operator must be linked on the same molecule of DNA to regulate expression of the
lac operon
- thus, the operator is said to act in cis (Latin for "on this side")
- Based on these data it was suggested that the Oc mutant indicates the operator is in a noncoding regulatory region
as opposed to a gene that encodes a protein
- This is because a gene that encodes a protein should be complementable in trans (Latin for "across") as was the
case for the mutation in the lac repressor gene
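The logic of the two partial diploid experiments can be captured in a small truth-table model. This is a sketch of the reasoning only: the dictionary keys and function names are hypothetical, and the model ignores everything except the repressor, the operator and the inducer.

```python
def operon_expressed(operator_ok, repressor_in_cell, inducer_present):
    """One DNA copy is ON unless active repressor can bind its own operator."""
    active_repressor = repressor_in_cell and not inducer_present
    return not (active_repressor and operator_ok)


def partial_diploid(copies, inducer_present):
    """Return the ON/OFF state of the lac genes on each DNA copy.

    The repressor is a diffusible protein (acts in trans): any functional
    lacI gene supplies repressor to the whole cell.
    The operator is a DNA site (acts in cis): it only controls the genes
    on its own DNA molecule.
    """
    repressor_in_cell = any(c["lacI_ok"] for c in copies)
    return [operon_expressed(c["operator_ok"], repressor_in_cell, inducer_present)
            for c in copies]


wild_type = {"lacI_ok": True, "operator_ok": True}
lacI_mutant = {"lacI_ok": False, "operator_ok": True}   # case 1: no repressor made
oc_mutant = {"lacI_ok": True, "operator_ok": False}     # case 2: operator constitutive

# Case 1: wild-type repressor complements the mutant copy in trans
print(partial_diploid([wild_type, lacI_mutant], inducer_present=False))  # [False, False]
# Case 2: the Oc copy stays ON because its operator cannot be bound
print(partial_diploid([wild_type, oc_mutant], inducer_present=False))    # [False, True]
```

The model reproduces both observations: a repressor defect is rescued by a wild-type copy, while an operator defect is not.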
Summary
- these results pointed to the existence of a repressor molecule that acts in trans to keep the operon off in the
absence of inducer and a cis-acting operator that is a non-coding region that is bound by the repressor
- Remember that this model is based largely on the results of genetic experiments
- The fact that this model was proposed based largely on genetic data meant that a key part of confirming it was
biochemical experiments that isolated the lac repressor and showed that it bound to the operator sequence
Isolating of lac repressor

- The model predicts that the repressor should bind to the inducer so experiments were done to search for a
protein that would bind to IPTG
- Challenge: repressor is expressed at very low levels and the purification of low abundance proteins is always a
challenge
- Solution:
- To hedge their bets researchers isolated a mutant that was inducible at lower concentrations of IPTG
- This reflected a mutant lac repressor that bound IPTG more tightly
- subsequent mapping experiments showed that this mutation was due to a change in the lacI gene - this allele
was called lacIt - t for tight
- Procedure:
- They fractionated lacIt cell extracts using standard biochemical approaches including ammonium sulphate
precipitation and gel filtration, and used an assay to look for an IPTG binding activity
- when you fractionate a cell extract you end up dividing your starting material into several (many) different
fractions and as such you need an assay to identify the fractions that contain the protein you are interested
in
- researchers developed an assay to detect a protein that would bind IPTG
Assay for detection of protein that binds to IPTG

- involved placing protein fractions into dialysis bags and placing these bags into a solution of radioactive IPTG
- IPTG is free to diffuse across the membrane b/c it is a small molecule
- proteins can’t diffuse across the membrane b/c they are large molecules
◊ How would you know which fractions contain the lac repressor?
 In a fraction that doesn’t contain lac repressor - the IPTG will diffuse freely
across the dialysis bag membrane and at equilibrium the concentration of
IPTG will be the same inside and outside the bag
 In a fraction that contains lac repressor - the IPTG will diffuse freely across the
dialysis bag membrane and at equilibrium the concentration of IPTG will be higher inside the bag
due to some of the IPTG binding to the protein inside the bag
- How do you know that you have identified the right activity or put another way how do you know that the IPTG
binding activity you have identified is likely involved in regulation of the lac operon?
- Run some controls
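The readout of the dialysis assay described above can be sketched with a simple binding calculation. This is a minimal sketch under stated assumptions: IPTG binding is treated as independent single-site binding with a dissociation constant Kd, and all concentrations and the Kd value are hypothetical numbers in arbitrary units.

```python
def iptg_inside_bag(free_iptg, repressor_sites, kd):
    """Total IPTG concentration inside the dialysis bag at equilibrium.

    Free IPTG equilibrates across the membrane, so free inside == free outside.
    Bound IPTG follows single-site binding: bound = sites * L / (Kd + L).
    """
    bound = repressor_sites * free_iptg / (kd + free_iptg)
    return free_iptg + bound


FREE_OUTSIDE = 1.0   # hypothetical free IPTG concentration outside the bag
KD = 1.0             # hypothetical dissociation constant

for label, sites in (("fraction without repressor", 0.0),
                     ("fraction with repressor", 2.0)):
    inside = iptg_inside_bag(FREE_OUTSIDE, sites, KD)
    print(f"{label}: inside/outside radioactivity ratio = {inside / FREE_OUTSIDE:.2f}")
```

A ratio above 1 marks a fraction that contains an IPTG-binding protein; a ratio of 1 means the IPTG has simply equilibrated across the membrane.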
Controls in Biochemistry
- There are typically two types of controls one uses in an experiment
- Negative control - a negative control should be designed to not give the desired outcome of the experiment
- Positive control - a positive control should give the desired outcome of the experiment, and is typically done
to make sure that all of the reagents in the experiment are working properly
▪ Positive controls are not always possible
- Typically, in each case you change only one thing to ensure that any difference you see is only due to that one
change
- Consider this simple experiment

- I give you a tube of an unknown bacterial plasmid and I ask you to tell me if it contains one or more sites for
a restriction enzyme called BamHI
◊ Experimental procedure:
 add BamHI, salts and a buffer that the enzyme will work in
 incubate at 37°C to give the enzyme a chance to cut
 run out the digest on a gel to separate any DNA fragments based on their
molecular weight and stain the gel to visualize the DNA
 The DNA is negatively charged, thus it will run from negative to positive
electrode
◊ Controls for this procedure:

 negative control: run the same digest procedure without adding any enzyme
 2 positive controls
– confirm that your enzyme actually works by making sure it cuts a
plasmid that carries a BamHI site
– confirm that your DNA can actually be cut, using an enzyme that you know should cut
it – this is a control to ensure that your DNA is free of
contaminants that might inhibit the ability of a restriction enzyme to cut
it
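Sketch of how the controls gate the interpretation of the BamHI digest (not from the lecture). The function name, argument names and wording of the verdicts are hypothetical; each argument is simply "did you see cut fragments on the gel in that lane?".

```python
def interpret_bamhi_digest(sample_cut, no_enzyme_cut,
                           bamhi_cuts_known_plasmid,
                           sample_cut_by_other_enzyme):
    """Turn four gel observations into a conclusion about the unknown plasmid."""
    if no_enzyme_cut:
        # Negative control failed: fragments appear without any enzyme.
        return "uninterpretable: DNA is fragmented even without enzyme"
    if not bamhi_cuts_known_plasmid:
        # Positive control 1 failed: enzyme or buffer is not working.
        return "uninterpretable: BamHI or the reaction conditions are not working"
    if sample_cut:
        return "plasmid contains at least one BamHI site"
    if not sample_cut_by_other_enzyme:
        # Positive control 2 failed: the DNA prep may carry an inhibitor.
        return "uninterpretable: the DNA prep may inhibit restriction enzymes"
    return "plasmid contains no BamHI site"


print(interpret_bamhi_digest(sample_cut=False, no_enzyme_cut=False,
                             bamhi_cuts_known_plasmid=True,
                             sample_cut_by_other_enzyme=True))
```

Note that a "no BamHI site" conclusion is only reached when all three controls behave as expected.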
Isolation of the lac repressor (controls)

- Negative control: use lacI- cells, which don’t express the repressor, perform the same purification and then
assay the same fractions for IPTG binding activity
- Possible results:
1. The fractions that contain IPTG binding activity when the purification is performed using lacI+ cells DO NOT
HAVE any IPTG binding activity when the purification is performed using lacI- cells
▪ Such a result would suggest that you have purified the lac repressor
2. The fractions that contain the IPTG binding activity when the purification is performed using lacI+ cells STILL HAVE
IPTG binding activity when the purification is performed using lacI- cells
▪ This would suggest you have not purified the lac repressor and that you have purified another protein
that binds IPTG and is not likely to be involved in regulation of the lac operon
DNA binding properties of the lac repressor

- Remember that the model predicts the repressor should bind DNA and that binding should be disrupted by an
inducer like IPTG
- So with the purified repressor they assayed its ability to bind DNA
 Researchers assayed DNA binding using a filter binding assay

◊ This assay relies on the fact that double-stranded DNA will pass through a
nitrocellulose filter while proteins stick to the filter
◊ If you mix a DNA with a protein that will bind the DNA - the DNA will now be retained
on the filter
◊ If the DNA is radioactively labeled you can measure how much DNA is retained on
the filter
- Amount of DNA was kept constant and the amount of protein was varied
○ The amount of DNA bound to the filter as a percentage of input DNA was plotted on the y-axis versus the
amount of repressor protein added on the x-axis
- Results:
◊ Left panel
 The wild-type lac operator (positive control), no operator (negative control) or
a mutant Oc operator was mixed with the repressor
 The model predicts that the lac repressor should bind to the wild-type
operator with a higher affinity than the other 2 DNAs and this is what they see
 The fact that the amount of DNA captured never reaches 100% results from
the fact that they were using a crude prep of DNA, a lot of which did not
actually contain operator sequences
◊ Right panel
 wild-type lac operator was mixed with the repressor +/- IPTG
 The model predicts that the repressor should bind the operator -IPTG but not
+IPTG and that is what they see
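Sketch of why the filter binding curve plateaus below 100% (not from the lecture). It assumes simple 1:1 binding with repressor in excess over DNA, and that only a fraction of the labeled DNA carries an operator; the Kd, the operator-containing fraction and the repressor amounts are hypothetical, in arbitrary units.

```python
def percent_dna_retained(repressor, kd, operator_fraction):
    """Percent of input DNA retained on the filter.

    Only DNA molecules that carry an operator can be captured, so the curve
    saturates at operator_fraction * 100 rather than at 100.
    """
    fraction_bound = repressor / (kd + repressor)
    return 100.0 * operator_fraction * fraction_bound


KD = 1.0                 # arbitrary units, assumed
OPERATOR_FRACTION = 0.4  # assumed: 40% of the crude DNA prep contains operator

for amount in [0.0, 0.5, 1.0, 5.0, 50.0]:
    retained = percent_dna_retained(amount, KD, OPERATOR_FRACTION)
    print(f"repressor = {amount:5.1f}  ->  {retained:5.1f}% of input DNA on filter")
```

A lower-affinity DNA (larger Kd, as expected for the Oc operator) shifts the curve to the right; DNA with no operator stays near zero.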
How does the lac repressor block transcription?

- Model 1: lac repressor blocks binding of RNA polymerase to the lac operon
- Model 2: RNA polymerase binds to the promoter of the lac operon but it is unable to productively transcribe the
lac operon region
Transcription initiation in prokaryotes
 closed promoter complex - the DNA strands bound by the polymerase are coiled together
 open promoter complex - the DNA strands bound by the polymerase are unwound
 open complex is resistant to heparin while the closed complex is not
◊ Closed complex + heparin = RNAP will fall off the DNA
◊ Open complex + heparin = RNAP will remain on the DNA
In vitro versus in vivo experiments

- In vitro is Latin for ‘within the glass’ and in vivo is Latin for ‘within the living’
- In vitro experiment
○ experiment that is conducted in a test tube
○ 2 extreme cases:
1. in vitro experiment that consists of two highly purified components that you mix together and then
observe how they interact
2. in vitro experiment that involves taking a population of cells, disrupting their membranes (i.e. lysing
them), then taking the total soluble content of the cells and doing an experiment in that complex
mixture
- In vivo experiment
○ conducted within a living organism
○ In vivo experiments can be conducted using living bacterial cells, yeast cells, whole worms (C. elegans),
whole fruit flies (D. melanogaster), or whole mice
- A grey area is experimentation related to cells in culture – some would consider these in vitro experiments while
others would call them in vivo
- Work that uses both in vivo and in vitro approaches, which complement and reinforce one another, tends to give the
most rigorous and complete picture of what’s going on
○ A good example of this is the genetic (in vivo) and biochemical (in vitro) experiments to understand the
function of the lac operon
How does the lac repressor block transcription?
- One experiment employed an approach known as run-off transcription

○ in vitro experiment where a linear double stranded DNA molecule is incubated with all of the factors
required for transcription
○ If the dsDNA contains an RNA polymerase binding site (ie a promoter) the RNA polymerase in the reaction
will initiate transcription and move down the template until it reaches the end of the DNA and falls off
○ template contained the lac operon promoter and operator sequences
 Procedure:
◊ add the lac repressor and template DNA to the reaction and allow a complex
between them to form
◊ Next RNA polymerase is added and given a chance to bind the promoter
◊ Then heparin is added to the reaction which would displace any polymerase that is
not tightly associated with the template DNA
◊ Then radioactive nucleotides are added along with IPTG and the reaction is
incubated to allow transcription to proceed
◊ The reaction is then run out on a polyacrylamide gel so they can visualize the
radioactively labeled RNA product that might be produced
○ Result of polyacrylamide gel

▪ Lane 3 - the full experiment: you can see that they get a run-off product, indicating that the
polymerase was able to stably bind the promoter even when the lac repressor is present
▪ Lanes 1 and 2 are controls
□ Lane 1 - shows what happens when you don’t add lac repressor to the experiment – the fact that
you get a run-off product indicates that your in vitro transcription system is actually working
□ Lane 2 - shows what happens if you omit IPTG – since you don’t see a signal here it suggests that
the lac repressor is actually functional
- Taken together these results suggest that the lac repressor doesn’t block recruitment of RNA polymerase,
suggesting instead that it blocks productive transcription of the lac operon
- Other experiments contradicted these results and suggested that lac repressor blocks RNA polymerase binding to
the promoter
○ To resolve this conflict additional experiments were done
Chromatin immunoprecipitation (ChIP)
- Procedure:
□ Treat cells with formaldehyde to covalently crosslink proteins to nucleic acid
□ Break open cells and shear DNA into small fragments
□ Use an antibody against specific protein to immunoprecipitate that protein and any linked
fragments of DNA
 Involves beads which carry protein A or G attached to them
 Protein A and protein G bind the Fc region of IgG molecules, so the antibody
and the protein that it recognizes become associated with the beads
○ Reverse the crosslink, purify DNA and assay the amounts of specific pieces of DNA using PCR
- No biochemical purification is perfect, so in this case not only will you get fragments of DNA that are bound to the
yellow protein but you will also get the other fragments
- But the idea is that you will recover greater quantities of the bound fragments than other fragments
- Compare the amount of the red fragment that comes down with an antibody against the yellow protein to another
antibody that doesn’t interact with the yellow protein
○ For example if you get 3 fold more red fragment if you run the experiment with an antibody against the
yellow protein compared to an antibody that doesn’t recognize the yellow protein this would suggest the
yellow protein does indeed interact with the red fragment of DNA
○ you would say the red fragment is 3 times enriched in the immunoprecipitates (IPs) with the antibody
against the yellow protein compared to a negative control IP
- On the other hand, if you see similar amounts (i.e. no enrichment, or 1 fold enrichment) of a fragment of DNA in
the IP with the antibody against the yellow protein and in a control IP where the yellow protein isn’t IPed, this would suggest the yellow
protein doesn’t interact with that region
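Sketch of the fold-enrichment arithmetic for a ChIP experiment (not from the lecture). The signal values are hypothetical PCR readouts in arbitrary units; the function name is illustrative only.

```python
def fold_enrichment(specific_ip_signal, control_ip_signal):
    """Ratio of a DNA fragment recovered in the specific IP versus a control IP."""
    if control_ip_signal <= 0:
        raise ValueError("control IP signal must be positive")
    return specific_ip_signal / control_ip_signal


# Hypothetical PCR signals (arbitrary units).
print(fold_enrichment(30.0, 10.0))  # 3.0 -> fragment is 3 fold enriched
print(fold_enrichment(10.0, 10.0))  # 1.0 -> no enrichment, no evidence of binding
```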
ChIP and lac repressor binding experiment

- To use this method to test how the lac repressor blocks transcription researchers performed ChIP +/- IPTG using
antibodies that IP RNA polymerase
- Then use PCR primers that will amplify the lac operon promoter region to assess how well the lac operon
promoter interacts with RNA polymerase +/- IPTG
- Researchers found that RNA polymerase much more efficiently interacted with the lac operon promoter in the
presence of IPTG than in its absence
- These results suggest that the lac repressor blocks binding of RNA polymerase to the lac promoter and this is now
the accepted model for how the repressor works
Lecture 3 Section 2
November 4, 2020 8:04 PM
Lac operons
– There are 3 lac repressor binding sites in the lac control region O1 O2 and O3
– O1 represents the original operator site where the Oc mutant mapped
- To assess the role of each of these sites:

○ researchers mutated each one alone or in all combinations
○ then the effect that each mutation had on the ability of the lac repressor to downregulate transcription was assessed
- The red x’s indicate which operator is mutated in each case and the numbers to the right show the level of repression associated with the
different mutants
○ Fold repression = (expression in the absence of repressor) / (expression in the presence of repressor)
- Results:
◊ results show that full repression requires all three sites
◊ While the O1 site in combination with either one of the other two works very well
◊ also note that O2 and O3 only work well when combined with O1
◊ Note that these effects are cooperative (synergistic) not additive

 the effect of having two binding sites is greater than you would expect given the repression by the sites
on their own
 Best seen by comparing the operators on their own and in combination:
O1 on its own produces 18 fold repression
O2 and O3 on their own produce no repression (1.0 fold)
O1 + O2 together produce 700 fold repression
○ These results raise 2 questions:

1. Why are O2 and O3 not very functional compared to O1?
2. What mechanism underlies this cooperativity?
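Sketch of the cooperativity argument using the numbers quoted above (18 fold for O1, 1.0 fold for O2, 700 fold for O1 + O2). The two "no cooperativity" baselines below (summing the fold values, or multiplying them as if the sites acted independently) are assumptions used only for comparison; the notes do not specify which baseline was intended.

```python
FOLD_REPRESSION = {"O1": 18.0, "O2": 1.0, "O1+O2": 700.0}  # values from the notes


def additive_expectation(fold_a, fold_b):
    """Baseline 1 (assumed): the fold values simply add."""
    return fold_a + fold_b


def independent_expectation(fold_a, fold_b):
    """Baseline 2 (assumed): independent sites multiply their fold values."""
    return fold_a * fold_b


observed = FOLD_REPRESSION["O1+O2"]
additive = additive_expectation(FOLD_REPRESSION["O1"], FOLD_REPRESSION["O2"])
independent = independent_expectation(FOLD_REPRESSION["O1"], FOLD_REPRESSION["O2"])

print(f"observed O1+O2: {observed:.0f} fold")
print(f"additive baseline: {additive:.0f} fold")
print(f"independent baseline: {independent:.0f} fold")
print(f"observed / independent baseline: {observed / independent:.1f}")
```

Under either baseline the observed repression is far larger than expected, which is what "cooperative (synergistic)" means here.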
Why are O2 and O3 not very functional compared to O1?

– O2 and O3 have reduced affinity for the lac repressor
– This reduced affinity relates to the fact that the sequence of O2 and O3 differ from the sequence
of O1
– On diagram: lower case letters represent differences from O1 sequence
What mechanism underlies this cooperativity?

- The lac repressor tetramer binds two operators
– The X-ray crystal structure of the lac repressor tetramer shows that:
 the two dimers within the tetramer can each bind to an operator sequence
 And that each monomer of the dimer binds to one half of the operator
- this explains why the operator is a partial inverted repeat

 the aattgt sequence at the 5' end of one strand is repeated, in inverted orientation, at the 5' end of the other
strand
 The two half-sites are separated by 9 base pairs - significant b/c each turn of the double helix is about 10 base pairs
 This separation ensures that the two end sequences of the operator are on the same face of the
DNA, allowing 2 monomers to interact with the 2 halves of the operator
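Sketch that checks the inverted-repeat logic on a sequence (not from the lecture). The 21 bp O1 sequence below is quoted from memory and should be treated as an assumption; check it against the lecture diagram. Only the aattgt half-site and the 9 bp spacing come from the notes.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Assumed O1 sequence (top strand, 5' to 3'); verify before relying on it.
O1 = "AATTGTGAGCGGATAACAATT"

left_half = O1[:6]      # half-site at the 5' end of the top strand
spacer = O1[6:-6]       # bases between the two half-sites
right_half = O1[-6:]    # on the bottom strand this reads as the other half-site

print(left_half, len(spacer), right_half)
print("5' end of bottom strand:", reverse_complement(O1)[:6])
print("half-sites are inverted repeats:", reverse_complement(right_half) == left_half)
```

Reading each strand from its own 5' end gives the same six bases, which is why each monomer of the dimer can contact an equivalent half-site.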
- This structure also highlights the fact that different regions of the protein are responsible for different properties
○ the DNA binding domain at the one end of the protein is separate from the regions that mediate tetramer formation
- Different regions and their functions:

○ DNA binding activity
○ Dimerization activity
○ Tetramer formation
- Conclusion:
○ The ability of the tetramer to bind to two operators at the same time suggests that when a tetramer is bound to the lac control region
it would loop out the sequences between the two bound operators
○ The cooperativity observed between two operators is explained by the fact that the tetramer bound to two operators would be a
more stable complex than bound to only one
▪ this increased stability would make repression more efficient as the repressor would be less likely to fall off the DNA
○ Consistent with this idea is the fact that a mutant lac repressor that cannot form tetramers shows a substantial defect
in repression
The Affinity of lac Repressor for DNA

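- Binding can be written as: repressor + DNA <-> repressor:DNA complex
- where KA (the association constant) = [repressor:DNA complex] / ([free repressor] x [free DNA])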
- So given this sort of equation a high affinity complex would be one where the concentration of [repressor:DNA complex] would be high,
while the concentration of the free components would be low
- As such KA is going to be a really big number
○ in other words the bigger KA is the higher the affinity of the interaction
- The chart shows:
◊ KA for lac repressor in association with the lac operator +/- inducer
◊ KA for lac repressor in association with non-specific DNA +/- inducer
- Observations:
○ Thus these numbers indicate that the affinity of lac repressor for the lac operator is reduced upon inducer binding, while its affinity for
other DNA sequences does not change
○ In addition, they indicate that lac repressor has a higher affinity for operator DNA compared to other DNA even in the presence of
inducer
- Consider that the high affinity operator sequence is competing with a large number of low affinity sites represented by the rest of the
genome
○ In the absence of inducer - the large difference between lac repressor’s affinity for operator and non-operator DNA ensures a high
level of binding to the operator
○ In presence of inducer - the reduction in affinity for the operator is sufficient that the large excess of non-specific binding sites
can efficiently compete for binding so the operator will no longer be bound
- Now consider the interaction of lac repressor with non-specific DNA

○ To figure out the fraction of repressor in a cell that is not bound to DNA you can re-arrange the above equation as follows:
[free repressor] / [repressor:DNA complex] = 1 / (KA x [free DNA])
○ KA = 2x10^6 M^-1 and [free DNA] = 7x10^-3 M

▪ the DNA concentration is calculated by considering the size of the E. coli genome (~4x10^6 bp, which you can think of as ~4x10^6 non-specific binding sites)
and the volume of an E. coli cell (~10^-15 L)
○ you then convert this to the number of binding sites per L and use Avogadro's number to convert this to a concentration in molarity
○ Subbing these values into the equation gives a value of 1/14,000
▪ In other words the repressor is going to spend a lot of time associated with DNA compared to the amount of time it is free in
solution
▪ Especially when you consider that a cell only expresses ~10 molecules of lac repressor tetramer
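Worked version of the calculation above. The genome size, cell volume and KA are the values quoted in the notes; Avogadro's number is the standard constant. Function and variable names are illustrative only.

```python
AVOGADRO = 6.022e23        # molecules per mole
GENOME_SITES = 4e6         # ~4x10^6 bp, treated as non-specific binding sites
CELL_VOLUME_L = 1e-15      # ~10^-15 L per E. coli cell
KA_NONSPECIFIC = 2e6       # M^-1, association constant for non-specific DNA


def site_concentration(sites_per_cell, cell_volume_l):
    """Molar concentration of binding sites inside one cell."""
    moles = sites_per_cell / AVOGADRO
    return moles / cell_volume_l


def free_to_bound_ratio(ka, dna_conc):
    """[free repressor] / [repressor:DNA complex] = 1 / (KA x [free DNA])."""
    return 1.0 / (ka * dna_conc)


dna = site_concentration(GENOME_SITES, CELL_VOLUME_L)
print(f"[non-specific sites] = {dna:.1e} M  (rounded to 7x10^-3 M in the notes)")

ratio_exact = free_to_bound_ratio(KA_NONSPECIFIC, dna)
ratio_rounded = free_to_bound_ratio(KA_NONSPECIFIC, 7e-3)
print(f"free/bound using the unrounded concentration: 1/{1 / ratio_exact:,.0f}")
print(f"free/bound using 7x10^-3 M: 1/{1 / ratio_rounded:,.0f}")
```

With only about 10 tetramers per cell, a free-to-bound ratio this small means that on average essentially no repressor is free in solution at any instant.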
- Concluding points:
 The model is that in the presence of inducer the repressor is almost always bound to non-specific DNA and is quickly
sliding along the E. coli genome
 This arrangement explains why the repressor is able to find the operator sequences very quickly when
inducer is removed
– much more quickly than if lac repressor were floating around in the bulk of the E. coli cell
The model so far:

- In absence of inducer
○ Lac repressor is bound to the operator
○ Therefore: blocks ability of RNAP to bind to the promoter
- In presence of inducer
○ Inducer binds to the lac repressor and displaces it from the operator
○ Therefore: allows RNAP to bind to the promoter
Last problem
- Recall this experiment where cells were grown simultaneously in both glucose and lactose
- Problems with what I’ve told you so far:

1. it doesn’t explain why the lac operon isn’t turned on until after the glucose is used up
2. induction of the operon would seem to require both permease to get lactose into the cells and beta-gal to convert the lactose into
the inducer allolactose
- Addressing problem 2
○ It turns out that the lac operon is expressed at a very low level in the absence of lactose such that a cell expresses about one molecule
each of permease and beta-gal
○ This results because the lac repressor/operator complex is not infinitely stable and as such every once in a while the repressor will fall
off the operator, giving RNA polymerase the chance to transcribe the lac coding region
Lecture 4 Section 2
Why isn’t the lac operon on in the presence of glucose and lactose?
 When glucose is present a small molecule, cyclic AMP (or cAMP for short), is not generated
by E. coli
 As glucose is depleted, cAMP levels increase and this results in the activation of the lac
operon
 This mechanism represents a positive form of regulation that controls the lac operon
◊ while the action of the lac repressor represents a negative form of regulation
An in vitro system to study cAMP regulation of the lac operon

- In initial experiments cells were treated with IPTG and glucose
○ it was shown that cAMP could overcome glucose mediated repression of beta-galactosidase expression
- Based on these results a model was proposed whereby cAMP interacted with a protein that somehow stimulated
beta-galactosidase expression
- To explore this further a cell free (or in vitro) extract was developed that recapitulated this regulation
○ in vitro extracts are very useful because you can manipulate the factors present in them (i.e. remove
components and/or add things in) and see what effect this has on the process you are studying
Experiment 1:
- 2 conditions tested:
 Wild type* without cAMP and without added CAP protein - beta-gal is expressed at low levels
 Wild type* with cAMP but without added CAP protein - beta-gal is expressed at high levels
- *Note that this strain is not completely wild-type as it is deleted for the entire lac operon, including the repressor
gene, so these gene products won’t interfere with your ability to assay the expression of the DNA you introduce
into the extract
- Conclusion: these extracts accurately replicate the effect of cAMP on beta-galactosidase expression
Experiment 2:
- Controls:
○ The - cAMP experiment is a negative control
○ the + cAMP experiment is the positive control
○ validates their system for the upcoming experiments
- Using this assay researchers asked if they could purify a protein that could stimulate beta-gal production
○ They used standard biochemical purification techniques such as ion exchange chromatography and assayed
the resulting fractions for their ability to mediate cAMP stimulation of beta-gal expression
- Conditions tested:
- Through this approach they were able to purify a protein whose addition to the extract resulted in a further
stimulation of beta-gal production when cAMP is present
○ This protein is known as CAP (catabolite activator protein)
- Note that these wild-type extracts already carry CAP protein in them, so the addition of more CAP results in just a
small increase in beta-gal expression
Experiment 3:
- Researchers also had a mutant that they thought might be in the CAP gene
- This mutant was identified in a screen for bacteria that were defective in their response to lactose and was shown
to not induce beta-gal expression
○ this mutation didn't map to the lac operon and still produced cAMP
- Conditions tested:
- Result:
○ Beta-gal activity is not stimulated by cAMP but if you add back purified CAP protein to these extracts you
see that they become responsive to cAMP
○ Taken together these results provide strong evidence that the mutant carries a defective CAP gene
○ They also provide a correlation between the ability of a cell to use lactose and normal CAP function
CAP protein binds cAMP experiment
 Next using a dialysis type assay researchers showed that the purified CAP protein interacts
with cAMP
How does CAP protein stimulate transcription?

- One hint came from mutants that mapped to the lac promoter sequence which were defective in cAMP + CAP
dependent activation of lac transcription
○ suggested that this process involved a cis-acting sequence within the lac promoter that might represent a
binding site for the cAMP + CAP complex
- Sequencing of these mutants mapped a putative binding site for CAP and subsequent footprinting experiments
confirmed that the above mutations map to the CAP binding site
DNase I footprint experiment
- Mechanism:
◊ To do DNase I footprinting you radioactively label one strand of your DNA of
interest at either the 5’ or 3’ end
◊ This labeled DNA is mixed with the DNA binding protein and the mixture is
subsequently treated with low levels of DNase I (an endonuclease) so as to induce
less than one cleavage per DNA molecule
◊ The reaction is then run on a polyacrylamide gel under denaturing conditions
 the positions of the bands in the gel are detected using x-ray film or
phosphorimaging
◊ The net result is that wherever a protein is bound the DNA will be protected from
DNase digestion and this will show up as a gap in the ladder of bands on the gel
 Diagram: as you add more protein to the reaction, you can see that the gap
becomes more obvious
◊ The fact that the DNA is labeled only on one strand and only at one end allows you
to map where the protein binding site is based on the size of the cleavage products
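Sketch of how a footprint is read off the gel (not from the lecture). Because the DNA is labeled at one end only, each band length equals the distance from the label to a cut site, so band lengths missing in the +protein lane mark the protected region. The fragment lengths are hypothetical, and the function assumes a single contiguous footprint.

```python
def protected_region(control_bands, protein_bands):
    """Return (first, last) fragment length present without protein but missing with it.

    Lengths are in nucleotides from the labeled end. Returns None if no
    bands are lost, i.e. no footprint is detected.
    """
    missing = sorted(set(control_bands) - set(protein_bands))
    if not missing:
        return None
    return missing[0], missing[-1]


# Hypothetical lanes: cuts at every position from 20 to 80 nt in the control,
# with positions 40 to 60 protected when protein is bound.
control_lane = list(range(20, 81))
protein_lane = [n for n in control_lane if not 40 <= n <= 60]

print(protected_region(control_lane, protein_lane))   # (40, 60)
print(protected_region(control_lane, control_lane))   # None
```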
- Results of DNase I footprint experiment:

 the protected (or footprinted) region is only seen in the presence of both CAP and cAMP
◊ consistent with the fact that CAP only binds DNA when it is bound to cAMP
 Note that the 1st lane represents a negative control
 While some bands disappear in the CAP+cAMP lane other bands become more intense
◊ These represent so-called DNase I hypersensitive sites that result from a protein
binding to the DNA and changing its structure such that a region of the DNA
becomes more sensitive to cleavage
 some of these sites actually occur within the region that is protected
◊ This is because these hypersensitive sites are likely to be on the opposite face of the
DNA to where the protein is bound
Experiment 2
- Experimental procedure:
◊ Allow RNA polymerase to bind to the lac promoter +/- CAP and cAMP
◊ Then nucleotides and rifampicin are added
 rifampicin - a drug that blocks transcription initiation but not elongation
- Results:
○ In the presence of CAP and cAMP you see that transcription occurs while in their absence transcription does
not occur
- These results suggest that in the presence of CAP and cAMP an open promoter complex forms and when
nucleotides are added elongation proceeds before the rifampicin takes effect and blocks initiation
○ Thus, CAP+cAMP stimulates the ability of RNA polymerase to form an open promoter complex
How does binding of CAP to the lac promoter stimulate transcription?
- Additional work showed that CAP+cAMP stimulates the formation of the closed promoter complex if you consider
the formation of the open promoter complex as follows: R + P ---> RPclosed ---> RPopen
- In this case increasing the rate of formation of the closed promoter complex drives the formation of the open
complex
- Mechanism:
 The CAP-cAMP dimer binds directly to a subunit of RNA polymerase

 This binding increases the efficiency of RNA polymerase recruitment and
stabilizes the RNA polymerase/DNA complex once formed
Binding of CAP-cAMP to DNA bends the DNA

- The X-ray crystal structure of the CAP-cAMP dimer bound to DNA shows that DNA in this complex is bent at an
angle of about 100 degrees
- Before X-ray crystallography - this fact had already been recognized using a gel mobility shift assay
- Gel mobility shift assay:

○ In a normal gel shift assay you use a radiolabeled DNA probe and mix it with a DNA binding protein
○ You then run the mixture on a polyacrylamide gel and you will see that the binding of the protein to the DNA
will reduce the mobility of the DNA in the gel giving you a shifted band
○ When you run DNA through such a gel it is thought to snake through the pores of the gel
○ Consider then that if you bend the DNA this will generate a hook in the DNA which will result in the DNA
getting hung up in the pores of the gel
○ Also, if the bend is in the middle of the DNA this is going to create a bigger hook and it will be hung up more
than a bend towards the end of the molecule
- Gel mobility shift assay experiment:

○ In these experiments they had various probe DNAs where the CAP binding site was found at various places
in the probes
- Gel mobility shift assay results:

○ the probe with the site in the middle had the lowest mobility in the gel when protein was added, while DNAs
with the site towards the end ran faster
○ The control here was that each probe DNA ran at the same position in the absence of CAP+cAMP
○ This bending is presumably necessary for optimal interaction among the proteins and DNA in the complex
Lecture 5 Section 2 (Trp operon)
November 18, 2020 9:14 AM
The trp operon

- contains 5 genes (E,D,C,B and A) that encode proteins that are required in the synthesis of the amino acid tryptophan (trp)
- Mechanism 1:
○ When trp levels are high - cells don’t need to make any more trp so the operon is shut off
○ When trp levels are low - the trp repressor has no trp bound; this non-functional form is called the aporepressor, and as such the 5 genes of the operon are expressed so the
cells can make trp
- The aporepressor
○ a dimer
○ is not functional because it can’t bind DNA
○ When the aporepressor dimer binds trp an allosteric transition occurs that changes the conformation of the repressor so it can now bind DNA and repress
transcription
- Mechanism 2: The trp operon is also regulated by a mechanism that is known as attenuation
- Note:
○ Trp repressor is capable of repressing transcription of trp operon by about 70 fold
○ Attenuation represses transcription of trp operon by around 10 fold
Control of the trp operon by attenuation (overview)
- Mechanism:
◊ When trp levels are high and transcription does initiate at the trp promoter (i.e. the repressor has failed to block RNA polymerase
recruitment)
 The polymerase proceeds through the 5’ end of the operon (through the leader sequence) but transcription terminates (i.e. RNA
polymerase falls off the DNA) in a sequence known as the attenuator
 Since the attenuator is upstream of all of the protein coding genes - the attenuator blocks production of the proteins involved in trp
synthesis
◊ when trp levels are low

 the attenuator is not functional so transcription does not terminate and you get a full length transcript that will produce the
proteins required for trp synthesis
Transcript termination in bacteria
▪ In general transcript termination in prokaryotes results from the formation of a stem/loop sequence in the RNA transcript soon after it emerges from the
RNA polymerase
▪ This stem/loop is thought to destabilize the RNA:DNA hybrid which results in release of the transcript and RNA polymerase from the template DNA
Control of the trp operon by attenuation (detailed)
- Attenuation is regulated by the secondary structure of the trp leader sequence
 The 5’ untranslated region (UTR), also known as the leader, can exist in two different secondary structures
1. Functional terminator (left) - sequences 1 and 2 form one stem/loop and sequences 3 and 4 make another stem/loop
followed by a stretch of Us
2. Non-functional terminator (right) - sequences 2 and 3 base pair with one another
 The formation of these structures is regulated by the level of trp in the cell
- Attenuation requires translation of a small open reading frame in the trp leader
○ Trp’s ability to regulate the formation of these structures relates to the presence of a small open reading frame (protein coding sequence) within the 5‘UTR of
the trp mRNA
○ Open reading frame contains two trp codons
○ Translation of these two trp codons controls whether or not the terminator forms
- Attenuation relies on the fact that transcription and translation are coupled in bacteria
 In bacteria soon after RNA polymerase initiates transcription a ribosome can start translating the mRNA
 This simply requires an open reading frame with an upstream ribosome binding site (Shine-Dalgarno sequence) + start codon (Met)
 Nascent RNA - refers to transcript that is still associated with the RNA polymerase
- How coupled transcription and translation affect trp operon expression

○ Soon after polymerase transcribes the start codon for the small orf a ribosome will initiate translation
○ Mechanism:
 If trp levels are low - the cells will contain low levels of tRNAtrp charged with the trp amino acid
– This causes the ribosome to pause at the trp codons which will, in turn, allow regions 2 and 3 to base pair with one another
thereby blocking formation of the terminator
– Therefore: in low trp levels termination is blocked and the genes that encode proteins which synthesize trp will be expressed
 If trp levels are high - the cells will contain high levels of tRNAtrp charged with the trp amino acid
– Thus the ribosome will translate through the trp codons - no pause
– This will disrupt base pairing between regions 2 and 3, thus allowing region 3 to base pair with 4 creating the terminator
– Therefore: in high trp levels termination occurs and the genes that encode proteins which synthesize trp will not be expressed
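Sketch of the attenuation decision described above, written as a lookup (not from the lecture). The function name and dictionary keys are hypothetical; the logic follows the two cases in the notes.

```python
def trp_leader_outcome(charged_trna_trp_is_high):
    """Which leader hairpin forms, and whether the trp genes are transcribed."""
    if charged_trna_trp_is_high:
        # Ribosome reads straight through the trp codons, disrupting 2:3
        # pairing, so region 3 is free to pair with region 4.
        return {"ribosome": "translates through the trp codons",
                "hairpin": "3:4 (terminator)",
                "trp_genes_transcribed": False}
    # Ribosome pauses at the trp codons, leaving region 2 free to pair with
    # region 3, so the 3:4 terminator cannot form.
    return {"ribosome": "pauses at the trp codons",
            "hairpin": "2:3 (anti-terminator)",
            "trp_genes_transcribed": True}


for trp_high in (True, False):
    outcome = trp_leader_outcome(trp_high)
    level = "high" if trp_high else "low"
    print(f"trp {level}: {outcome['ribosome']}; hairpin {outcome['hairpin']}; "
          f"genes transcribed = {outcome['trp_genes_transcribed']}")
```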
Evidence for this model
- Evidence 1:
○ Trp repressor mutants could still respond to trp starvation by increasing the rate of synthesis of trp mRNA
▪ this already suggests that there is more going on than simple regulation by the trp repressor
- Evidence 2:
○ Researchers detected more RNA coming from the 5’ end (140 bases) of the trp mRNA than from more 3’ sequences; they then mapped the termination site and
showed that termination at this site was reduced in the absence of trp
- Examining the sequence where termination occurs showed the potential stem/loop and the run of Us consistent with a terminator signal
○ but how was it regulated by trp levels?
- Note that many of the experiments to understand the mechanisms of attenuation were done in bacteria that were mutant for the trp repressor, simplifying the
interpretation of their results
- Evidence 3:
○ Using the purified leader sequence RNA it was shown that ribosomes bound to the leader and an AUG start codon was identified within the bound region
○ and other experiments showed that the trp leader functioned as a site for translation initiation
○ Translation initiated at this site would result in a 14 aa peptide that would contain 2 trp amino acids in a row towards the end of the peptide
– The presence of two tandem trp codons was surprising

– Reason: trp codons are usually rare, occurring about 1% of the time in E. coli genes
– Therefore: translation of this small orf would be sensitive to trp levels in cells and this started researchers thinking that when
trp levels are low ribosomes might stall at these trp codons
○ The fact that the trp leader is predicted to encode for a peptide would make one wonder if the peptide itself might play a role in regulating the operon (i.e. in the
same way a protein might)
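Rough arithmetic for why two tandem trp codons in the 14 aa leader peptide were surprising (not from the lecture). It uses the ~1% trp codon frequency quoted above and assumes, as a simplification, that codons occur independently of their neighbours.

```python
TRP_CODON_FREQUENCY = 0.01   # ~1% of codons in E. coli genes (from the notes)
LEADER_CODONS = 14           # length of the leader peptide (from the notes)

# Chance that any given pair of neighbouring codons are both trp,
# assuming independence between positions.
pair_probability = TRP_CODON_FREQUENCY ** 2

# A 14-codon reading frame contains 13 neighbouring codon pairs.
adjacent_pairs = LEADER_CODONS - 1
expected_tandem_pairs = adjacent_pairs * pair_probability

print(f"P(two given neighbouring codons are both trp) = {pair_probability:.0e}")
print(f"expected tandem trp pairs in a 14-codon orf = {expected_tandem_pairs:.1e}")
```

On these assumptions a tandem trp pair would turn up by chance in only about one in a thousand orfs of this length.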
Testing the potential role of the peptide in regulation of the operon
- To test the potential role of the peptide in regulation of the operon researchers started with a mutant strain where the AUG start codon of the leader peptide was
mutated such that it would no longer support translation of the leader orf
○ Result: In low levels of trp amino acid this mutant failed to show an increase in the transcription of the regions downstream of the leader
○ This contrasted with the behaviour of a strain with a wild-type start codon, where transcription of sequences downstream of the leader went up when trp levels were
low
- To test if the peptide encoded by the trp leader was important researchers tried to rescue the start codon mutant using a partial diploid approach
○ Result: They found no evidence for rescue in this case
○ Therefore: the mutant could not be rescued in trans, suggesting that the peptide itself did not play a role
○ Instead these results suggest that the leader functions as a cis-acting element
○ Note1: this is a negative result and researchers are always concerned about giving too much weight to them
▪ However, all of the other experiments done on the trp operon support the idea that the leader sequence functions in cis and not in trans
○ Note2: Also, note that if you were to do this sort of experiment today, you would likely just try to rescue the mutant using a plasmid that would express the
leader sequence peptide
Testing potential importance of trp codons
- The potential importance of the trp codons was emphasized by the fact that other species of bacteria have trp codons within the leader sequence of their trp operons
○ this type of conservation suggests an evolutionary pressure to maintain them indicating they have an important functional role
- other amino acid synthetic operons that are regulated by attenuation contain small open reading frames within their leader sequences and these orfs are rich in codons
for the amino acids they are involved in synthesizing
○ suggests this mechanism might be a common form of regulation for these types of operons
- starvation of cells for Trp or Arg (note that a codon for Arg comes just after the Trp codons) reduces termination in the leader
○ In contrast, starving cells for other amino acids (encoded by the leader or not) doesn’t
○ Suggests that ribosome stalling must occur at the right place to regulate attenuation
Starving cells for amino acids encoded in the leader sequence
- What happens if you starve cells for Met?

○ The ribosome will stall so early that stem 1+2 will base pair as the mRNA is transcribed and as such stem 3 will have no choice but to pair with 4 generating a
terminator
▪ Therefore: starving for Met will not decrease termination
- What happens if you starve cells for Ser?

○ The ribosome won’t stall until the end of the leader orf, so you would expect a result similar to when trp levels are high
▪ The ribosome covers segment 2, so segment 2 is not free to base pair with segment 3 and segment 3 pairs with segment 4 to form the terminator
▪ Therefore: starving cells for Ser will not reduce termination
Mutation experiments
 What would happen to a mutant that destabilizes stem 3/4?

– Since stem/loop 3/4 is required for a functional terminator, these mutants would disrupt termination such that termination
would not occur when trp levels are high
 What about if you mutated this *G to an A

– In this case when trp levels are low you still see termination occurring
– This is because the G to A mutation would destabilize stem 2/3 (by disrupting a GC base pair)
– This destabilization of the 2/3 structure would favour pairing of 3/4 resulting in termination
 What if you mutated the start codon for the leader sequence peptide?
– The ribosome will not translate the mRNA
– Therefore: stem 1+2 will base pair as the mRNA is transcribed and as such stem 3 will have no choice but to pair with 4
– Generates a terminator even when trp levels are low
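The starvation and mutation predictions above all follow from one rule: segment 3 pairs with segment 4 (terminator) unless segment 2 is free to pair with segment 3 first. A minimal sketch of that rule is given below. The function name, the three ribosome states and the way mutations are passed in are all inventions for this example, and the model is purely qualitative.

```python
RIBOSOME_STATES = ("absent", "stalled_in_1", "at_stop")


def leader_outcome(ribosome, broken_stems=frozenset()):
    """Predict 'terminate' or 'read-through' for transcription of the trp leader.

    ribosome:
        'absent'       - no translation (start codon mutant) or a very early stall (Met starvation)
        'stalled_in_1' - stalled at the Trp/Arg codons in segment 1
        'at_stop'      - no stall; the ribosome reaches the end of the orf and masks segment 2
    broken_stems: stems destabilized by mutation, e.g. {'2:3'} or {'3:4'}
    """
    if ribosome not in RIBOSOME_STATES:
        raise ValueError(f"unknown ribosome state: {ribosome}")
    if "3:4" in broken_stems:
        return "read-through"   # a functional terminator cannot form
    if ribosome == "stalled_in_1" and "2:3" not in broken_stems:
        return "read-through"   # 2 pairs with 3, so 3 is unavailable for 4
    return "terminate"          # otherwise 3 pairs with 4


cases = {
    "low trp (wild type)": leader_outcome("stalled_in_1"),
    "high trp (wild type)": leader_outcome("at_stop"),
    "Met starvation": leader_outcome("absent"),
    "Ser starvation": leader_outcome("at_stop"),
    "start codon mutant, low trp": leader_outcome("absent"),
    "G to A in stem 2/3, low trp": leader_outcome("stalled_in_1", {"2:3"}),
    "stem 3/4 mutant, high trp": leader_outcome("at_stop", {"3:4"}),
}
for name, outcome in cases.items():
    print(f"{name}: {outcome}")
```

Each case reproduces the prediction given in the notes; the sketch says nothing about how much termination changes, only the direction.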
Conclusion:
- The model would seem to suggest that translation of the leader sequence would have to begin at just the right time for the mechanism to work properly, because if
the ribosome joined too late the mechanism would fail
○ But it turns out that there is another level of complexity built into the system that links transcription with translation
2nd level of complexity of the model
- RNA polymerases pausing in the trp leader sequence
- Mechanism:
 RNA polymerase initiates transcription and then pauses at the stem/loop 1/2 structure
 The ribosome initiates translation of the leader sequence orf and as a result RNA polymerase resumes transcription
- This mechanism ensures that the polymerase doesn’t get ahead of the ribosome which would disrupt the attenuation mechanism
- Evidence: (in vitro experiment)

○ Procedure:
▪ used crude E. coli extracts that support both in vitro transcription and translation
▪ A DNA template derived from the trp operon along with radioactive nucleotides was added to extract
▪ after incubating for a few minutes the resulting products were run out on a gel
○ Results:
◊ Lane 1 - the top intense band represents termination of the template at the leader sequence terminator, while there is a faint band that
results from RNA pol pausing at the stem/loop 1/2 sequence
◊ Lanes 2 and 3 - increasing amounts of a translational inhibitor are added and you can see that the amount of pausing that is detected
goes up
◊ This suggests that normal translation relieves the RNA pol pause
○ The same experiment was done with a mutant DNA template

▪ In the mutant - the ribosome binding site and AUG start codon were deleted
○ Results:
◊ Lane 4 - shows the results without a translational inhibitor; consistent with the fact that this template cannot be translated, you see
that the pause site band is stronger
◊ In addition, you see that increasing amounts of translational inhibitor have no effect on the level of pausing
◊ This again provides evidence that relief of RNA pausing is related to translation of the leader peptide orf
Lecture 6 Section 2 (Riboswitches)
Control of gene expression by riboswitches
- Riboswitches - structured RNA domains that usually reside in the noncoding regions of mRNAs where they bind
metabolites and control gene expression through a variety of mechanisms
- These RNA regulatory elements form highly specific binding pockets for their target metabolite
○ When metabolites bind to these cis-acting elements they influence the expression of the mRNA
- One of the first characterized riboswitches controls the expression of genes involved in making thiamine
(vitamin B1)
○ Thiamine is an essential cofactor for many enzymes
○ Bacterial cells can utilize thiamine in their environment or when there is none they can synthesize it
themselves
- Bacteria thiamine mechanism:

○ When thiamine levels in their environment are high - the biologically active form of thiamine, thiamine
pyrophosphate (TPP), represses the expression of genes that are required to synthesize thiamine
Control of thiamine synthesis by a riboswitch

- Many of the mRNAs that encode for thiamine synthesis proteins contain an element - thi box - within their 5’
ends
- When this sequence was first identified it was recognized that it had the potential to form a secondary
structure
- Predicting RNA secondary structure involves using computer programs that, in general, use thermodynamic
parameters to predict the most stable structure for an RNA sequence
○ Problem: even random RNA sequences are likely to adopt stable structures
○ Solution: to assess whether the structure is biologically important - compare related sequences that
might contain the same structure
Comparison of thi box sequences of two E. coli genes

- both the thiM and thiC mRNAs in E. coli contain thi boxes in their 5’ UTRs
- the sequences of these two thi boxes:
 nucleotide differences in lower case
- potential secondary structures of these sequences:
 Difference is in the rectangle
- The differences highlighted in red above and in the rectangles to the left conserve the secondary structure
○ These types of changes are known as compensatory changes
○ Many such compensatory changes in several related sequences suggest the element can tolerate some
changes in sequence as long as the structure is preserved
- Conclusion: this suggests that the structure is functional in some way thereby increasing your confidence the
structure actually exists in vivo
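The comparison described above can be sketched in a few lines: for each predicted base pair, check whether a sequence difference between two related RNAs still leaves the two positions able to pair. The sequences and pair list below are hypothetical toy data (not the thiM or thiC thi boxes), and the function name is made up for the example. G-U wobble pairs are counted as pairs, and a one-sided change that keeps the pair (e.g. G-C to G-U) is lumped in with compensatory changes for simplicity.

```python
# Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def classify_pairs(seq_a, seq_b, paired_positions):
    """Compare two aligned sequences at each predicted base pair (0-based indices).

    Returns a dict mapping (i, j) to 'conserved', 'compensatory' or 'disruptive'.
    """
    report = {}
    for i, j in paired_positions:
        changed = seq_a[i] != seq_b[i] or seq_a[j] != seq_b[j]
        still_pairs = (seq_b[i], seq_b[j]) in PAIRS
        if not changed:
            report[(i, j)] = "conserved"
        elif still_pairs:
            report[(i, j)] = "compensatory"   # sequence differs, pairing is kept
        else:
            report[(i, j)] = "disruptive"     # sequence differs, pairing is lost
    return report


# Hypothetical 11-nt hairpin: a 3 bp stem closing a 5-nt loop.
seq_a = "GGCAUUUUGCC"
seq_b = "GAAAUUUUGUC"   # G-C -> A-U at one pair, C-G -> A-G at another
stem = [(0, 10), (1, 9), (2, 8)]

print(classify_pairs(seq_a, seq_b, stem))
# {(0, 10): 'conserved', (1, 9): 'compensatory', (2, 8): 'disruptive'}
```

Many compensatory pairs and few disruptive ones across several related sequences is the pattern that raises confidence in a predicted structure.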
Thi boxes function
- When just discovered:

○ One possibility was that the thi box represented a binding site for an RNA binding protein that was
somehow involved in regulating the expression of mRNAs that encode proteins involved in synthesizing
thiamine
○ but no evidence could be found for such proteins
○ So one lab tested the possibility that these sequences could actually bind to TPP
○ To test this they used a technique known as in line probing
In line probing of RNA structure

- To do this you need to make the RNA you are interested in via in vitro transcription
- RNAs are typically generated using one of 3 RNA polymerases isolated from different bacteriophages
○ T7 RNA polymerase
○ SP6 RNA polymerase
○ T3 RNA polymerase
- These polymerases recognize specific DNA promoter sequences and initiate transcription just downstream
- To transcribe RNA of interest
i) Start with plasmid that contains sequence of interest + unique restriction site
downstream of sequence of interest
ii) Digest the plasmid with restriction enzyme to linearize it
iii) Linearized plasmid DNA is mixed with 1 of the polymerases
iv) RNA polymerase will recognize the promoter & initiate transcription of RNA of
interest
- In line probing of RNA structure

1. generate RNA via in vitro transcription
2. end label the RNA at either the 3' or 5' end
3. Allow the RNA to sit for a prolonged time at room temperature
- under these conditions RNA will undergo limited spontaneous cleavage but residues that are part
of structured regions are resistant to this cleavage
- again you want to get on average less than one cleavage per RNA molecule
4. resolve RNA on a polyacrylamide gel
In line probing of RNA structure

- Consider this RNA that could adopt 2 possible conformations
○ remember that single stranded regions are much more likely to undergo cleavage than double stranded
regions
 If structure 1 forms on this RNA you will see the cleavage pattern represented in lane
1 of the gel
 if structure 2 forms you will see the pattern in lane 2
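To make the link between structure and banding pattern concrete, here is a small sketch that lists the labeled fragment sizes expected from an RNA in a given conformation. It assumes a deliberately simplified picture: cleavage happens only at the linkage on the 3' side of an unpaired nucleotide, paired nucleotides are fully protected, and each molecule is cut at most once. The dot-bracket structures are hypothetical and the function name is made up for the example.

```python
def expected_bands(structure, label_end="5'"):
    """Sizes (nt) of end-labeled fragments expected from in line probing.

    structure: dot-bracket string, '.' = unpaired, '(' or ')' = paired.
    Cutting after position i (1-based) leaves a 5'-labeled fragment of i nt
    or a 3'-labeled fragment of n - i nt.
    """
    n = len(structure)
    bands = []
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "." and pos < n:   # the last nucleotide has no 3' linkage to cut
            bands.append(pos if label_end == "5'" else n - pos)
    return sorted(bands)


# Two hypothetical conformations of the same 12-nt RNA.
structure_1 = "((((....))))"
structure_2 = "..((....)).."

print(expected_bands(structure_1))   # [5, 6, 7, 8]
print(expected_bands(structure_2))   # [1, 2, 5, 6, 7, 8, 11]
```

Bands present in one list but not the other are the ones that distinguish the two conformations on the gel.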
In line probing of the thiM 5‘UTR (experiment 1)

◊ The gel shows the results of in line probing of the 165 nt thiM 5‘UTR +/- TPP
 the NR lane shows the RNA not subjected to cleavage
 the OH lane shows the RNA treated under basic conditions which will
cleave at every base which acts as a marker
 the T1 lane is another marker where RNA has been treated with RNase T1
which cleaves the RNA at G residues
◊ The right hand panel shows the predicted secondary structure of the 165 nt thiM
5’ UTR with the thi box highlighted in blue, the ribosome binding site (labeled SD
for Shine-Dalgarno) and the start site for translation (labeled start codon)
◊ Conclusion:
 Upon addition of TPP the intensity of some bands goes down
while the intensity of others goes up
 Since this is a purely in vitro system containing only the RNA this indicates
that the TPP binds to the RNA and this in turn changes the conformation
of the RNA
In line probing of the thiM 5‘UTR (experiment 2)

- In another in line probing experiment it was shown that:
○ the boxed 91 nt fragment of the thiM 5‘UTR (91 thiM) also bound to TPP suggesting that the binding site
for TPP is found within this region
○ But the 165 nt thiM 5’UTR fragment showed evidence for changes in the structure of sequences
outside the 91 nt region
- These results suggest that binding of TPP to its binding site within the thiM 5’UTR changes the structure of
sequences that are remote to the binding site, indicating that the 5’UTR is behaving as an allosteric molecule
- One key change they noticed in the structure was that in the absence of TPP the ribosome binding site (rbs)
was sensitive to cleavage while in the presence of TPP it was not
○ This suggests that in the absence of TPP the rbs is single stranded, which would allow it to recruit
the ribosome to the mRNA
○ in the presence of TPP the rbs is double stranded, which would block mRNA translation
- In addition, they found that the sequence that was believed to base pair with the rbs was resistant to cleavage
+/- TPP in contrast to the behaviour of the rbs
○ This suggests that the sequence that can pair with the rbs is paired with another sequence in the
absence of TPP
A reporter system for TPP-dependent regulation of thiM expression
- To develop a reporter system to study TPP-mediated regulation of gene expression two constructs were
initially made
1. Translational fusion where the thiM promoter drives the transcription of an mRNA carrying the thiM
5‘UTR and the lacZ gene
▪ in this case the start codon for translation comes from the thiM sequence and it is fused in frame
with the lacZ open reading frame
▪ in addition the rbs is provided by the thiM 5‘UTR
2. Transcriptional fusion where the thiM promoter drives the expression of an mRNA carrying the thiM
5‘UTR and the lacZ open reading frame
▪ in this case both the ribosome binding site and the start codon for translation are provided by lacZ
sequences
- Note that in these constructs the lacZ gene serves simply as a reporter so that you can readily assess the effect of
TPP on expression of these constructs
○ remember that lacZ encodes the beta-gal protein and that one can readily assay beta-gal protein levels
using a simple assay that measures beta-gal enzyme activity
- Consistent with the model:

○ the expression of the translational fusion was induced in the absence of TPP while
○ the expression of the transcriptional fusion did not change +/- TPP
Mutations and translational fusion reporter

- Using the translational fusion reporter a series of mutations were made in the thiM 5‘UTR sequences to test
aspects of the model
◊ Mutants M1 and M3 are predicted to disrupt base pairing within the core 91 nt
region that carries the TPP binding site
 These mutations block TPP binding and consistent with the model they
also block regulation of these mRNAs by TPP
 they also block the structural rearrangement that results in base pairing of
the rbs upon TPP binding
◊ Compensatory mutations (M2 and M4) - restore base pairing in these regions
but change the sequence; these mutations were found to restore TPP binding,
TPP-mediated regulation and the base pairing of the rbs upon TPP binding
- Conclusion: these data provide a STRONG correlation between TPP binding to the thiM 5‘UTR, TPP-dependent
control of gene expression and base pairing at the rbs
- In addition, they provide evidence for the existence of the base pairing in these regions and the importance of
these structures to TPP binding
A reporter system for TPP-dependent regulation of thiM expression

- This model shows the two conformations that the thiM 5‘UTR is thought to adopt +/- TPP
 In the presence of TPP the rbs (again labeled SD) is base paired with another region of
the thiM 5‘UTR which would block ribosome recruitment
 In the absence of TPP this region is base paired with another segment of the 5‘UTR
- As another test of this model the indicated base is changed from a U to a C (see the red arrow)
○ This is predicted to stabilize the right hand conformation while it should destabilize the left hand
conformation
○ Consistent with the model this mutation reduces expression even in the absence of TPP and the
presence of TPP has little effect on this low level of expression
○ This occurs despite the ability of TPP to still bind to the 5‘UTR
- In addition, in line probing shows that this mutation reduces cleavage at the rbs in the absence of TPP
consistent with it leading to increased base pairing at the rbs
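The reporter and mutant results can be summarized with a very small qualitative model: the translational fusion is expressed only when the rbs is not base paired, and the rbs becomes paired either when TPP binds a working aptamer or when a mutation locks it in the paired conformation. The sketch below encodes that logic. The function and parameter names are invented for the example, and the 'high'/'low' labels are qualitative stand-ins for beta-gal activity.

```python
def translational_fusion_expression(tpp_present, aptamer_binds_tpp=True,
                                    rbs_locked_paired=False):
    """Qualitative expression ('high' or 'low') of the thiM-lacZ translational fusion.

    aptamer_binds_tpp: False for mutants that disrupt the TPP binding region (like M1, M3).
    rbs_locked_paired: True for a mutation that stabilizes the rbs-paired conformation
                       (like the U to C change).
    """
    rbs_paired = rbs_locked_paired or (tpp_present and aptamer_binds_tpp)
    return "low" if rbs_paired else "high"


constructs = {
    "wild type": {},
    "M1/M3-like (no TPP binding)": {"aptamer_binds_tpp": False},
    "U to C-like (rbs locked)": {"rbs_locked_paired": True},
}
for name, kwargs in constructs.items():
    minus = translational_fusion_expression(False, **kwargs)
    plus = translational_fusion_expression(True, **kwargs)
    print(f"{name}: -TPP {minus}, +TPP {plus}")
```

Only the wild type shows a difference between -TPP and +TPP, which matches the pattern described for these mutants; compensatory mutants that restore TPP binding would behave like the wild type in this model.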
TPP riboswitches
- Additional work in bacteria has shown that TPP riboswitches can also control gene expression through
regulation of transcription termination
- The ability of TPP riboswitches to regulate gene expression through different mechanisms relates to their
structure
- They consist of an aptamer or ligand binding region upstream of a sequence that is referred to as the
expression platform
○ The aptamer is solely responsible for ligand (small molecule) binding
○ The expression platform is the sequence that is acted upon by ligand binding that results in some change
in gene expression
- So when a TPP riboswitch controls termination in the absence of TPP no terminator is formed
- but when TPP is present you see a rearrangement of sequences such that a terminator does form
- TPP riboswitches are widespread in prokaryotes and have even been found in some eukaryotes - namely fungi
and plants
- In fungi and plants these riboswitches regulate the splicing of transcripts involved in thiamine synthesis
- In addition, bacterial riboswitches have been identified that bind to other small molecules to regulate gene
expression in much the same way that the TPP riboswitches function
- In general these other riboswitches also consist of an aptamer sequence involved in ligand binding and an
expression platform that is the site that mediates regulation
- Some of these riboswitches were identified by simply looking for conserved structures/sequences upstream of
genes that encode proteins involved in the same metabolic process and knowing what that process is allows
one to make guesses as to what the ligand might be for that potential riboswitch
- In some cases ~2% of the genes in a bacterial genome could be regulated by a riboswitch
Architecture of the riboswitch

- Different types include:
i) Simple (discussed above)

ii) Tandem Riboswitches - 2 aptamers bind the same ligand, while each aptamer
acts on its own expression platform
iii) Tandem, cooperative aptamers - 2 aptamers bind the same ligand, while each
acts on the same expression platform
iv) Tandem riboswitches, mixed ligand specificity - 2 aptamers bind different
ligands and each aptamer acts on its own expression platform
Riboswitch - temperature control

- Similar RNA structures have been shown to function as thermosensors
- Example: in pathogenic human bacteria the expression of a transcription factor is regulated by temperature
 Mechanism:
◊ At low temperature the ribosome binding site (SD) is found in a stem structure
and therefore it is non-functional
◊ At higher temperature this stem is unstable and thus translation can occur
- This switch occurs at 37 °C (i.e. when the bacterium enters its human host) and the transcription factor whose
expression is activated in turn activates the expression of a number of genes required for growth of the
bacteria in the human host
Lecture 7 Section 2 (Alternative splicing)
Control of gene expression in eukaryotes
- expression of genes in eukaryotes begins with transcription of a gene to give an RNA

- many of the principles of transcriptional regulation are similar between prokaryotes and eukaryotes
□ Ex// control of transcription involves cis-acting elements that are recognized by trans-acting factors that influence the action of RNA
polymerase
- our discussion of eukaryotic gene expression will instead focus on mechanisms that are mostly specific to eukaryotes
- our first topic will be alternative splicing
Splicing basics
- Steps of removing an intron:
1. 2’-OH group of adenosine nucleotide in middle of intron attacks the phosphodiester bond between the 1st exon and G
residue at the beginning of intron
 Results in the formation of the lariat and separating the first exon from intron
2. The 3’-OH left at the end of the 1st exon attacks the phosphodiester bond linking the intron to the 2nd exon
 Results in the formation of the exon-exon phosphodiester bond and the release of the intron in the lariat form
- Precision is key to ensure the mRNA will generate the correct protein
Alternative Splicing
- Many genes exhibit alternative splicing - the process whereby the same pre-mRNA can give rise to more than one mRNA through changes
in the pattern of splicing
- As can be seen in this diagram there are several ways that splicing can differ from one transcript to another
Impact of alternative splicing

- alternative splicing can affect gene expression by changes that act at the level of the mRNA or the encoded protein
○ At the mRNA level: inserting or deleting cis-acting sequences that control processes such as mRNA stability, translation and the localization of
the transcript
- these signals often reside in the transcript’s 5’ and 3’ UTRs
- these changes do not have to be accompanied by changes in the sequence of the protein
○ Changes in the mRNA that lead to insertion or deletion of portions of the protein
- can have subtle or profound effects on protein function including complete loss of protein function
○ Changes in the protein primary structure can:

- alter the binding properties of protein (ex// to small molecules, other proteins, lipids, DNA, RNA, etc)
- influence their intracellular localization
- modify their enzymatic activity
- change the stability of the protein
- alternative splicing is a much more versatile control mechanism than transcriptional control where in general the only regulation that can be achieved
is to change the amount of the gene product
- ~95% of human genes show evidence that they are alternatively spliced
○ This level of alternative splicing is thought to help explain the apparent discrepancy between the number of genes in some complex
multi-celled organisms (i.e. humans) and some simpler single-celled organisms such as yeast
- The human genome encodes ~25,000 genes while the genome of budding yeast encodes ~6,000 genes
○ since people assume humans are more than 4 times more complex than a yeast cell some have suggested that the high level of alternative
splicing in humans plays a big role in generating this higher level of complexity
An extreme case of alternative splicing

- Example: the Drosophila Down syndrome cell adhesion molecule (Dscam) gene
- The pre-mRNA contains 20 constitutive exons as well as 4 sets of mutually exclusive exons

○ Constitutive exons are always included in every mRNA
○ One exon from each set of mutually exclusive exons is included in mature transcripts
- There are:
○ 12 different versions of exon 4 (red)
○ 48 versions of exon 6 (blue)
○ 33 versions of exon 9 (green)
○ 2 versions of exon 17 (yellow)
- Therefore: there are 38,016 (12x48x33x2) versions of the mature mRNA possible
○ This represents more mRNA versions than there are genes encoded within the Drosophila genome (~14,500)
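The count is a straightforward product, since one version is chosen independently from each of the four mutually exclusive sets. A short check of the arithmetic, using only the numbers given in the notes:

```python
from math import prod

# Number of alternative versions of each mutually exclusive exon (from the notes).
versions = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

isoforms = prod(versions.values())
print(isoforms)                       # 38016

# Compare with the approximate gene count quoted for Drosophila.
drosophila_genes = 14_500
print(isoforms > drosophila_genes)    # True
```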
- Dscam is a cell surface protein expressed in neurons

○ The protein has exquisite isoform-specific binding and as such any differences in splicing of the 3 extracellular domains results in proteins that
will not interact with a differently spliced version
○ This property of the Dscam protein is important for correct wiring of the Drosophila brain
Splicing basics
- Three signals in the pre-mRNA are involved in splicing
○ the 5’ splice site, the 3’ splice site, and the branchpoint
○ note that these signals are somewhat degenerate, especially the branchpoint and 3’ splice site
- These signals are recognized by small nuclear ribonucleoproteins (snRNPs, U1, U2, U5 and U6)
○ snRNPs - complexes of RNA and proteins
○ recognition of the splicing signals occurs through base pairing interactions between the small nuclear RNAs (snRNAs) within the snRNPs and
the pre-mRNA
 Exception: recognition of part of the 3’ splice site by U2AF, which consists of two proteins
– a 65kDa subunit which binds the polypyrimidine tract (Yn) and
– a 35kDa subunit which binds the AG sequence in the 3’ splice site
Sex determination in flies (regulation of sxl)
- One of the best characterized examples of alternative splicing comes from Drosophila where a cascade of differential splicing plays an important role
in sex determination
- This cascade begins with alternative splicing of a sex-lethal (sxl) pre-mRNA
- sxl has 2 promoters

○ PE - only active in early female embryos
- Drives transcription of the mRNA that encodes the early Sxl protein
○ PL - active later in embryogenesis in all embryos
- Drives transcription of the mRNA that encodes the late Sxl protein
- The two different promoters generate different pre-mRNAs

○ Early in embryogenesis the sex-lethal pre-mRNA is generated from PE in females, giving the early Sxl protein
○ Later in embryogenesis PE is turned off and a different pre-mRNA is generated in both sexes from the PL promoter
- Splicing of sxl pre-mRNA in 2 sexes:
◊ Note: exon 2 in the early transcript is the same as exon 4 in the late transcript
 Both encode the protein’s RNA binding domain
◊ In males (late sxl protein)
 translation starts in exon 2 (red arrow) and stops in exon 3 (red line)
 Therefore this protein lacks its RNA binding domain and is therefore not functional
◊ In females
 early Sxl protein blocks splicing of exon 2 to exon 3, which excludes exon 3 and its premature stop codon
 resulting in an mRNA that encodes functional late Sxl protein which also blocks splicing of exon 2 to exon 3
 Therefore: late Sxl protein autoregulates its own expression ensuring continuous expression of functional Sxl
protein in females after PE is turned off
Sex determination in flies (regulation of the rest of the mechanism)

- Functional Sxl protein also binds to tra pre-mRNA
○ in so doing, it represses splicing between exons 1 and 2 in tra pre-mRNA
- Tra mRNA
○ In females - tra mRNA makes tra protein (includes only exon 1 and 3)
○ In males - a stop codon in exon 2 results in a non functional protein (includes exons 1, 2, 3)
◊ In females: (Dsx: exon 3 and 4)

 Tra protein in combination with Tra-2 binds to dsx pre-mRNA and activates splicing of exon 3 to exon 4
 Result: Dsx protein represses the transcription of genes required for sexual differentiation in males
◊ In males: (Dsx: exon 3 and 5)

 Exons 3 and 4 of the dsx pre-mRNA are not joined together because exon 3 is spliced to exon 5
 Result: male Dsx protein represses the transcription of genes required for female sexual differentiation
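The cascade can be followed step by step as a chain of yes/no decisions, each depending on the one before. The sketch below is a simplified, qualitative walk through sxl, tra and dsx as described in these notes; the function name and the returned field names are made up for the example.

```python
def sex_determination(sex):
    """Follow the sxl -> tra -> dsx splicing cascade for 'female' or 'male'."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")

    # PE is only active in early female embryos, so only females make early Sxl.
    early_sxl = sex == "female"
    # Early Sxl makes the late sxl pre-mRNA skip exon 3 (premature stop),
    # and the resulting late Sxl keeps doing the same to its own pre-mRNA.
    functional_sxl = early_sxl
    # Functional Sxl represses exon 1-2 splicing in tra, avoiding the stop codon in exon 2.
    functional_tra = functional_sxl
    # Tra (with Tra-2) activates splicing of dsx exon 3 to exon 4; otherwise exon 3 joins exon 5.
    dsx_exons = (3, 4) if functional_tra else (3, 5)

    return {
        "functional Sxl": functional_sxl,
        "tra exons": (1, 3) if functional_tra else (1, 2, 3),
        "functional Tra": functional_tra,
        "dsx exons joined": dsx_exons,
        "Dsx represses genes for": "male" if functional_tra else "female",
    }


for sex in ("female", "male"):
    print(sex, sex_determination(sex))
```

Because every later step reads from the previous one, losing functional Sxl in the model is enough to switch the whole output to the male pattern.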
Experiments: How does Sxl regulate tra splicing?
- To investigate this, researchers developed an in vitro splicing assay that recapitulated Sxl-mediated regulation of tra splicing
- Creating the in vitro assay:

○ made nuclear extracts from cells
○ added in pre-mRNA, generated via in vitro transcription, that corresponded to the regions of tra RNA whose splicing is regulated by Sxl protein
○ added in purified Sxl protein which was generated in E. coli
- this protein was tagged with another protein called GST that allows for simple purification of protein from E. coli
- Purification using GST:

○ GST is capable of binding to glutathione
○ Chromatography that uses beads coupled with glutathione can be used for purification
○ GST-fusion protein will bind to the beads, while the rest of the components of the mixture will go through the column
○ The GST-fusion protein is then eluted by passing free glutathione through the column
- Free glutathione will outcompete the bead-bound glutathione for binding to GST
- After incubating the reaction they assayed the level of splicing using a primer extension assay
- Mechanism of primer extension assay:

◊ Mix RNA of interest with DNA oligonucleotide
 The oligonucleotide is complementary to the RNA sequence and is labelled at the 5' end with radioactive
phosphorus
 Radioactive labelling allows the product to be visualized at the end
◊ Add reverse transcriptase
 RT recognizes the 3' OH group of the DNA oligonucleotide and uses it as a primer to synthesize a complementary DNA
strand from the RNA template
 Therefore: the primer is extended, producing a radiolabelled DNA strand
◊ Denature complex of DNA-RNA and run the mixture on the polyacrylamide gel
◊ Expose the gel to X-ray film which will help to visualize radiolabelled DNA
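Since reverse transcriptase extends the primer all the way to the 5' end of the RNA, the size of each labelled cDNA reports how much sequence lies upstream of the primer in that RNA, which is why different splice products give different bands. A minimal sketch of that calculation follows. All lengths are hypothetical numbers chosen for illustration (they are not taken from the tra experiment), and the function name is invented.

```python
def extension_product_length(upstream_segments, primer_offset):
    """Length (nt) of the cDNA made by extending a primer to the 5' end of the RNA.

    upstream_segments: lengths of the RNA segments that lie 5' of the exon the
                       primer anneals to (exons plus any retained sequence).
    primer_offset: position within the primer's exon (1-based) that pairs with
                   the labelled 5' end of the primer.
    """
    return sum(upstream_segments) + primer_offset


PRIMER_OFFSET = 40   # hypothetical

# Hypothetical RNAs sharing the same 100-nt first exon and the same primer site.
rnas = {
    "unspliced pre-mRNA": [100, 250],     # exon 1 + everything up to the primer's exon
    "male splice product": [100, 175],    # exon 1 + sequence kept when the upstream 3' splice site is used
    "female splice product": [100],       # exon 1 joined directly to the primer's exon
}

for name, segments in rnas.items():
    print(name, extension_product_length(segments, PRIMER_OFFSET))
# unspliced pre-mRNA 390
# male splice product 315
# female splice product 140
```

On the gel the longest product runs slowest, so under these assumptions the unspliced band sits at the top and the female splice product at the bottom.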
- The results of primer extension assay:
◊ 3 mixtures with increasing GST-sex-lethal protein concentrations

◊ GST lane - negative control
- These results indicate that:

○ The extracts on their own recapitulate the pattern of male splicing
- See the GST control lane
- makes sense since these cells were derived from a male animal
○ As you add increasing amounts of Sxl protein you see that you start to generate the female splice product
○ Also, note that the splicing reaction in general gets less efficient when you add Sxl as you see significant amounts of unspliced pre-mRNA
- Other work suggested that Sxl acts through binding to the poly-pyrimidine tract (Yn) located close to the male 3’ splice site
◊ Since U2AF also binds here this suggested that Sxl might be regulating U2AF function
◊ To explore this further researchers assessed the affinity of Sxl and U2AF for the 3’ splice site region of both the
male and the female splice events using an RNA gel shift assay, which works the same way as a DNA gel shift
Gel shift assays are used to assay the ability of proteins to bind DNA or RNA
- Mechanism of gel shift assay (recap):
 In a gel shift assay you use a radiolabeled DNA or RNA probe and mix it with a protein of interest
 You then run the mixture on a polyacrylamide gel and you will see that the binding of the protein to the DNA will reduce the
mobility of the DNA in the gel giving you a shifted band
- Result of RNA gel shift assay:
 Sxl protein, purified from E. coli, bound to the male 3’ splice site
- These experiments also allow you to assess the stability of this interaction by measuring an apparent Kd
○ Kd is defined as the concentration of protein where 50% of the probe has shifted
- Result:
○ Sxl had an apparent Kd of 10⁻⁹ M for the male 3’ splice site
○ Sxl did not bind at all to the female 3’ splice site
○ U2AF had an apparent Kd of 10⁻⁸ M for the male 3’ splice site
○ U2AF had an apparent Kd of 10⁻⁶ M for the female 3’ splice site
○ Based on these results you would rank the affinities of these sites from highest to lowest as: Sxl (M3’SS) > U2AF (M3’SS) > U2AF (F3’SS)
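To see what these apparent Kd values mean in practice, the sketch below computes the fraction of probe expected to be shifted at a given protein concentration. It assumes the simplest possible case: one protein binding one site, with protein in large excess over probe, so that fraction bound = [protein] / ([protein] + Kd). That binding model and the function name are assumptions made for illustration; only the three Kd values come from the notes.

```python
def fraction_bound(protein_conc, kd):
    """Fraction of probe shifted for simple one-site binding (concentrations in M)."""
    return protein_conc / (protein_conc + kd)


APPARENT_KD = {
    "Sxl / male 3' SS": 1e-9,
    "U2AF / male 3' SS": 1e-8,
    "U2AF / female 3' SS": 1e-6,
}

# By definition, half of the probe is shifted when [protein] equals the Kd.
print(fraction_bound(1e-9, 1e-9))   # 0.5

# Compare all three interactions at the same protein concentration (10 nM).
for name, kd in APPARENT_KD.items():
    print(f"{name}: {fraction_bound(1e-8, kd):.3f}")

# A lower Kd means higher affinity, so sorting by Kd gives the ranking.
ranking = sorted(APPARENT_KD, key=APPARENT_KD.get)
print(" > ".join(ranking))
```

At 10 nM protein, this model puts Sxl on most of the male site, U2AF on half of it, and U2AF on only about 1% of the female site, which is the same ordering as the ranking in the notes.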
- So given that both Sxl and U2AF bind the male 3‘ SS the next question is:
○ do they compete for binding to the same site or put another way would one protein block the ability of the other to bind to the male 3‘ SS?
How does Sxl regulate tra splicing: competition between Sxl and U2AF
- To test this researchers made use of a UV-crosslinking assay
- Mechanism of UV-crosslinking assay:
 Make uniformly radiolabeled RNA by transcription in vitro, using radioactive nucleotides

 Mix the RNAs with proteins of interest
 After allowing proteins to bind to RNA, irradiate the sample with UV light to crosslink proteins bound to RNA
 Digest the sample with RNase which will destroy most of the labeled RNA with the exception of a small piece that is protected
from digestion by the protein
 Resolution of the reaction via SDS-PAGE will allow you to detect the now labeled RNA binding proteins and they should run
close to their normal molecular weight
- In this particular experiment

○ Sxl and U2AF proteins at various concentrations were mixed with labeled male 3‘ SS RNA
- Result:
 Inclusion of any amount of Sxl reduces U2AF binding

 Suggests that they do indeed compete for binding to the male 3‘ splice site
 consistent with the idea that Sxl has a higher affinity for the male 3‘ splice site since lower amounts of Sxl protein give similar
levels of binding
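The competition result can be rationalized with a simple equilibrium sketch in which Sxl and U2AF bind the male 3' splice site in a mutually exclusive way. Under that assumption the share of sites held by U2AF is ([U2AF]/Kd_U2AF) / (1 + [U2AF]/Kd_U2AF + [Sxl]/Kd_Sxl). This is only a model: a UV-crosslinking signal is not a direct equilibrium measurement, the free protein concentrations are taken to equal the totals, and the function name is invented. The default Kd values are the apparent values quoted in the notes for the male site.

```python
def site_occupancy(u2af_conc, sxl_conc, kd_u2af=1e-8, kd_sxl=1e-9):
    """Fractions of a single site held by U2AF, by Sxl, or left free (concentrations in M)."""
    u = u2af_conc / kd_u2af
    s = sxl_conc / kd_sxl
    total = 1 + u + s
    return {"U2AF": u / total, "Sxl": s / total, "free": 1 / total}


# Hold U2AF at 10 nM and add increasing amounts of Sxl.
for sxl in (0.0, 1e-9, 1e-8, 1e-7):
    occ = site_occupancy(1e-8, sxl)
    print(f"Sxl = {sxl:.0e} M -> U2AF on {occ['U2AF']:.3f} of sites, Sxl on {occ['Sxl']:.3f}")
```

In this model any amount of Sxl lowers U2AF occupancy, and because Sxl has the lower Kd it takes less Sxl than U2AF to reach the same level of binding.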
Overall model
- Based on these data one can propose a model for the regulation of tra splicing
◊ In males where functional Sxl is not expressed U2AF binds preferentially to the male 3’ splice site as it has a higher
affinity for this region than it does for the female 3’ splice site and thus splicing occurs at that site
◊ In females functional Sxl protein binds to the male 3’ splice site and blocks binding of U2AF as Sxl has a higher affinity
for this site than does U2AF
◊ Thus, U2AF will bind to the female 3’ splice site so splicing occurs at this site
- This model also explains why addition of GST-Sxl to the in vitro splicing reaction reduced the overall efficiency of splicing
○ Recall that Sxl forces U2AF to bind to the female splice site and that U2AF has a lower affinity for the female splice site compared to the male
splice site
○ This reduced affinity means that splicing at this site is in general going to be less efficient, thereby explaining the presence of unspliced
pre-mRNA when Sxl is added
- How is double-sex splicing regulated?

◊ The 3’ splice site of exon 4 is suboptimal
 Exon 4 contains 6 copies of an element that functions as an exonic splicing enhancer (ESE)
◊ In males it is bypassed, joining exon 3 to exon 5
◊ In females a complex of 3 proteins tra2, tra and RBP1 binds to this element thereby recruiting the splicing machinery to
the weak exon 4 3‘ splice site
 this only happens in females as functional tra is required for the ESEs to work
◊ Other RNAs carry exonic splicing enhancers, exonic splicing silencers and intronic splicing enhancers and silencers that
bind to a variety of trans-acting factors that control splicing
Lecture 8 Section 2 ( Subcellular localization of mRNAs)
Subcellular localization overview
□ Virtually all mRNAs are transported from the nucleus to the cytoplasm
□ Once in the cytoplasm some mRNAs are localized to various locations within the cytoplasm
□ Purpose: Localization of an mRNA will serve to localize the encoded protein
□ While most of you are used to thinking about protein localization as mediated by sequences within proteins, a genome-wide study of mRNA localization has
suggested that mRNA localization likely plays a major role in the localization of many proteins
Example of localized mRNAs
 Example 1: Xenopus oocyte - Vg1 mRNA (red) found in the vegetal pole of the cell
 Example 2: Chicken fibroblast - beta-actin mRNA (red) found in leading edge

◊ Allows the cell to crawl along a substrate
◊ Therefore: leading edge indicates the direction of movement of the cell
 Example 3: Drosophila embryo

◊ Bicoid mRNA and nanos mRNA encode proteins that are known as spatial determinants which direct body patterning decisions within the
localization site
◊ Bicoid mRNA - localized at the anterior side
 Directs anterior development
◊ Nanos mRNA - localized at the posterior side
 Directs posterior development
- In general localization of mRNAs is assayed using a technique known as in situ hybridization
- Mechanism of in situ hybridization

○ cells are treated with crosslinkers that covalently attach macromolecules to one another thereby preserving subcellular structures
▪ this is commonly referred to as “fixing cells”
○ Fixed cells are then incubated with nucleic acid probes that are complementary to the mRNA of interest
▪ These nucleic acid probes contain modified bases that allow their detection with antibodies that recognize these modifications
○ The antibodies can then be detected using a number of approaches, including fluorescent labels
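The requirement that the probe be complementary to the mRNA can be sketched as a reverse-complement calculation. This is a minimal illustration only: the target sequence is hypothetical, and real probes are much longer and carry modified bases for antibody detection.

```python
# Watson-Crick pairing for RNA
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna_probe(mrna_5_to_3: str) -> str:
    """Return the RNA probe (written 5'->3') that base-pairs with an mRNA segment."""
    # Reverse the target so the probe is written 5'->3', then complement each base
    return "".join(COMPLEMENT[base] for base in reversed(mrna_5_to_3.upper()))


target = "AUGGCUAAC"  # hypothetical mRNA segment, 5'->3'
print(antisense_rna_probe(target))  # GUUAGCCAU
```

A sense-strand probe (identical to the mRNA) would not hybridize to it, which is why sense probes are commonly used as negative controls.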
Mechanisms of mRNA localization

- mRNA is labelled red for all diagrams
i) Directional transport on cytoskeletal element

 mRNA transported from nucleus to cytoplasm
 mRNA becomes associated with cytoskeleton (microtubules or microfilaments)
 Motor proteins transport the mRNA to its final destination, where it is anchored by anchor proteins
ii) Random diffusion and anchoring
 mRNA randomly diffuses through the cell
 If it reaches the anchor protein it gets associated with it
iii) Generalized mRNA degradation with localized protection from degradation
 Where mRNA is not supposed to be it will get degraded
 At the sites of proper localization, mRNA is protected and anchored
- Critical components of the mRNA localization machinery include cis-elements within target mRNAs that function as binding sites for RNA-binding proteins
Localization of ASH1 mRNA in S. cerevisiae (budding yeast)

- Budding yeast cells undergo asymmetric cell division where the new daughter cell buds off from the mother cell
- During these divisions the mother cell undergoes some specific DNA rearrangements that don’t happen in the daughter cell
○ These rearrangements allow the mother cell to undergo a process known as mating type switching
- Suppressing these rearrangements in the daughter cell involves the localization of ASH1 mRNA to the daughter cell
○ specifically to the distal tip
- This mechanism of Ash1p protein localization was uncovered in work aimed at understanding the mating type switching process
- A genetic screen for factors involved in mating type switching was conducted
○ The ASH1 gene came out of this screen and it was shown that accumulation of this protein in the daughter cell nucleus was required to prevent switching in the
daughter cell
○ Other factors, including Myo4p, She2p, and She3p, were identified in the screen and were found to be required for the accumulation of the Ash1p protein to the
daughter cell nucleus
- Subsequent experiments showed that ASH1 mRNA also localized to the daughter cell and this localization also required Myo4p, She2p, and She3p
- So based on these data it was proposed that Ash1p protein localizes to the daughter cell through localization of its mRNA
- The fact that Myo4p was required for ASH1 mRNA localization began to suggest a mechanism
- Myo4p is a member of the myosin family of motor proteins

○ these proteins bind to the actin cytoskeleton and they also function as ATPases
○ They use the energy from ATP hydrolysis to move themselves and cargos along actin cables
- So this suggested that ASH1 mRNA localization would require the actin cytoskeleton
○ Researchers confirmed this by showing that latrunculin A, a drug that causes the depolymerization of the actin cytoskeleton, disrupted ASH1 mRNA localization
○ They also showed that ASH1 mRNA was not localized in an actin mutant
- So taken together these data suggest that ASH1 mRNA is transported to the daughter cell along actin cables by the motor protein Myo4p
○ So how does Myo4p interact with ASH1 mRNA?
Mapping the cis-acting elements in the ASH1 mRNA that function in mRNA localization
- Researchers assumed there must be a cis-acting element within the ASH1 mRNA that was responsible for localizing the transcript
- To search for this element they fused various pieces of the ASH1 mRNA to an mRNA that doesn’t localize to the daughter cell and then asked which of these hybrid mRNAs
were localized
○ The goal was to identify the smallest fragments that would function in this assay
- They identified 4 elements: E1, E2A, E2B and E3 that could direct the localization of an mRNA to the daughter cell
 So presumably these localization signals must be responsible for recruiting Myo4p to ASH1 mRNA
 They are redundant to each other b/c each element was capable of localizing the mRNA on its own
She2p binds the localization signals

- Subsequent experiments showed that She2p is required for ASH1 mRNA localization
○ It binds directly to each of these localization elements
- For example, in one series of experiments She2p was purified from E. coli as a GST fusion protein and using a UV-crosslinking assay it was shown to interact with the E3
localization element
○ In lane 1 - two proteins crosslinking to the RNA:

▪ the lower one representing She2p
▪ the upper band representing a contaminating protein in the preps of purified She2p
- To assess the specificity of these interactions, competition experiments were performed, which involve the inclusion of large amounts of various unlabeled RNAs
○ In lanes 2-4 they added increasing amounts of an unrelated RNA
○ In lanes 6-8 they added increasing amounts of unlabeled E3 RNA
○ Each lane contains:

▪ Labelled RNA probe to which She2p binds
▪ GST-She2p protein
▪ Unlabelled RNA competitor
□ E3 RNA in lanes 6-8
□ Unrelated pGEM RNA in lanes 2-4
○ Control:
▪ Contaminating protein, which is represented by the upper band in the gel
○ Results:
 As you add more E3, She2p signal is getting weaker
– Indicates that She2p is binding to the unlabelled E3 RNA
 Adding more pGEM RNA has no effect on She2p binding
– Indicating it doesn’t bind to She2p
 Conclusion: She2p appears to bind RNA in a sequence-specific manner
▪ The contaminating protein is competed by both unlabeled RNAs suggesting that it binds RNA non-specifically
□ The amount of protein bound reduces in similar pattern for both unlabelled RNAs
□ Therefore: protein is unable to distinguish between 2 RNAs
- Using competition experiments researchers tested if She2p could interact with the other localization signals
◊ Excess unlabelled RNA corresponding to the E1, E2A, E2B or E3 elements was included in a crosslinking assay, again using labeled E3 RNA as the
probe
◊ Therefore: unlabelled E1, E2A, E2B and E3 each compete with the labeled E3 probe in lanes 3-6
◊ Note that all of the elements compete but to different extents, suggesting that each element has a different affinity for She2p
She3p interacts with She2p and Myo4p

- So the next question is how does She2p recruit Myo4p to ASH1 mRNA?
- In another series of experiments it was found that She3p is required for ASH1 mRNA localization
○ It binds to both She2p and Myo4p
- To do these experiments both Myo4p and She2p were purified from E. coli as GST fusions
○ This purification exploits the fact that the GST fusion protein interacts with a small molecule called glutathione
○ Thus GST fusion proteins are captured on beads that carry covalently coupled glutathione and eluted from the resin by the addition of excess soluble glutathione
○ This also allows you to create resins that let you ask if two proteins interact with one another
- Assessing interactions of 2 proteins using GST

○ Bead is covalently linked to glutathione
○ GST-fused protein will bind to the bead
○ Generates resin to assess whether the fused protein is capable of capturing another protein
○ If yes - the two proteins interact
- In this particular experiment GST-tagged Myo4p and GST-tagged She2p were captured on glutathione beads
- Radioactively labeled She3p was generated using in vitro translation

○ in vitro translation involves mixing an mRNA, generated via in vitro transcription, that encodes the protein you are interested in with a cell extract (usually made from
rabbit reticulocytes) that contains all of the factors required to translate an mRNA
- After mixing the in vitro translated She3p with

○ the GST-Myo4p beads
○ GST-She2p beads or
○ just plain GST beads as a control
unbound proteins are washed away and bound proteins are eluted and resolved via SDS-PAGE
- Results:
○ both Myo4p beads and She2p beads captured She3p protein while the GST alone beads did not
○ additional experiments showed that different parts of She3p interact with She2p and Myo4p
- This suggests that She3p can bind to both proteins and mediate the interaction between them
A model for ASH1 mRNA localization

- Based on the results I’ve shown you plus additional work the model is that ASH1 mRNA is bound by She2p (in green), which in turn is bound to She3p, which is bound to
Myo4p, which transports the entire complex along actin filaments
- This model makes the prediction that if She3p could recognize an mRNA on its own it would localize that mRNA in the absence of She2p
- To test this possibility a construct was made that would express:
◊ She3p protein fused to the MS2 coat protein

 MS2 coat protein is a sequence specific RNA binding protein
◊ a reporter mRNA with MS2 binding sites
- Results
○ When the She3p:MS2 fusion was expressed in cells it directed the localization of the mRNA to the daughter cell and this localization was independent of She2p
○ These results highlight that localization of mRNAs using the cytoskeleton involves 3 basic components
▪ a protein to bind the RNA
▪ a motor protein to transport the mRNA on the cytoskeleton and
▪ an adaptor protein that links the RNA binding protein to the motor protein
Studying mRNA localization in living cells

- To study ASH1 mRNA localization in living cells two constructs were generated
○ The first expresses the green fluorescent protein (GFP) fused to the MS2 coat protein
○ The second construct expresses a reporter mRNA with several MS2 binding sites along with the ASH1 3’ UTR, which carries the E3 localization signal
- The idea is that the GFP/MS2 fusion protein will bind to the RNA carrying the MS2 binding sites allowing you to visualize the RNA as it moves in a living cell
- Imaging of cells expressing these constructs showed in general one spot and several controls were done to show that this spot represented the reporter mRNA
- This system allows you to assess several aspects of RNA transport

○ for example how long does it take, how fast does the mRNA move, how directional it is etc.
- This film shows that in wild-type cells this mRNA is transported into the daughter cell but the path it takes is not very direct
○ This is likely because the actin fibers are not single long chains that go from the mother to the daughter cell
○ Instead the fibers are shorter and instead of all pointing in exactly the same direction, they are just generally oriented in the same direction
○ Thus the mRNA eventually makes it into the daughter cell but the route taken is not a straight line
- Note that:
○ in the she1 mutant (she1 is another name for myo4) the particle still forms but it doesn’t get transported to the daughter cell
○ the RNA in the wild-type cells is not anchored to the distal tip of the daughter cell
▪ This contrasts with the behaviour of another mRNA carrying the ASH1 open reading frame, which did become anchored
- These results thus suggest that transport and anchoring appear to function through somewhat different mechanisms
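The questions raised above about live imaging (how fast the mRNA moves and how directional its path is) can be quantified from a tracked particle. The sketch below is one common way to do this, not the analysis used in the lecture: it computes the mean speed and a straightness index (net displacement divided by total path length). The coordinates and frame interval are hypothetical.

```python
import math


def track_metrics(points, seconds_per_frame):
    """Return (mean speed, straightness) for a list of (x, y) positions.

    straightness = net displacement / total path length, so 1.0 means a
    perfectly straight path and values near 0 mean a meandering path.
    """
    path_length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    net_displacement = math.dist(points[0], points[-1])
    duration = seconds_per_frame * (len(points) - 1)
    speed = path_length / duration
    straightness = net_displacement / path_length if path_length > 0 else 0.0
    return speed, straightness


# Hypothetical particle positions in micrometres, one frame per second
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
speed, straightness = track_metrics(track, seconds_per_frame=1.0)
print(f"speed = {speed:.2f} um/s, straightness = {straightness:.2f}")
```

A particle moving along short actin cables that are only roughly aligned would be expected to give a straightness well below 1 even though it ends up in the daughter cell.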
Lecture 9 Section 2 (mRNA stability)
Regulation of mRNA stability

- mRNA half-lives can vary widely, from less than a minute to many hours
Mechanisms of mRNA degradation in eukaryotes

- A generic mRNA contains a 5' cap, a poly A tail, and an open reading frame
- Deadenylation - removal of the transcript's poly A tail
i) Mechanism 1:
 Generic mRNA undergoes deadenylation
 mRNA is then subjected to Decapping
– Decapping - removal of 5' cap
 Exonucleolytic decay of the RNA in 5' to 3' direction
ii) Mechanism 2:
 Generic mRNA undergoes deadenylation
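 No decapping
 Exonucleolytic decay of the RNA in 3' to 5' direction
iii) Mechanism 3: Deadenylation-independent decapping
 Generic mRNA undergoes decapping
 No deadenylation
 Exonucleolytic decay of the RNA in 5' to 3' direction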
iv) Mechanism 4: endonuclease cleavage
 Cleavage somewhere in the body of the generic mRNA
 Generates 2 pieces
– 1 with free 3' end - Exonucleolytic decay of the RNA in 3' to 5' direction
– 1 with free 5' end - Exonucleolytic decay of the RNA in 5' to 3' direction
- The key consideration here is that the 5’ cap and the 3’ poly(A) tail protect the mRNA from degradation and as such each decay pathway
must somehow overcome the protection that these two features of the mRNA provide
Regulation of transferrin receptor mRNA stability

- Iron plays an important role in many cellular processes
- However, free iron can be quite toxic

○ as such there are mechanisms in place to tightly regulate iron levels and to keep free iron levels low
- In mammals iron is transported through the blood bound to a glycoprotein known as transferrin
○ when cells need iron they transport transferrin bound to iron into cells via the transferrin receptor (TfR)
- Overall mechanism:
○ When intracellular iron levels are high - cells down regulate the expression of TfR on the cell surface
○ When intracellular iron levels are low - cells up regulate TfR on the cell surface
- To begin to figure out how this regulation is accomplished initial experiments asked if TfR transcription was regulated by iron levels
using a nuclear run-on assay
Nuclear Run-on Assay (mechanism)
◊ Isolate nuclei from cells of interest

◊ Incubate nuclei in the presence of radioactive nucleotides such that transcripts that were being made at
the time of isolation continue synthesis while blocking transcription initiation
 Creates a radioactive label
◊ Isolate RNA and hybridize it to a filter that carries single stranded probes to the gene of interest
◊ The amount of radioactivity that hybridizes to the probes is a measure of the RNA polymerase density on
that gene at the time the nuclei were harvested and thereby a measure of the transcription rate of that
gene
- Using this approach it was found that TfR transcription was not regulated by iron
- Next researchers disrupted cells and separated them into a cytoplasmic fraction and a nuclear fraction and then assayed levels of TfR
mRNA via Northern blot
Northern blot
◊ Mechanism (initial)
 Denature RNA and separate on gel
– For big fragments use agarose gel
– For small fragments use acrylamide
 Denaturation of the RNA disrupts any secondary structure in the RNA which would otherwise cause
the RNA to run aberrantly
 Denaturation therefore ensures the RNA’s mobility in the gel will be a true reflection of its size
- Mechanism for agarose gel

○ Schematics: from top to bottom
▪ Absorbent paper at the top
▪ Filter
▪ Gel with RNA bands
▪ Filter paper wick
▪ The whole 'sandwich complex' stands on Buffer reservoir
○ Put a weight on top of the complex
○ Buffer in the bottom chamber will be drawn up through the gel into the paper towels due to capillary action
○ Flow of buffer will transfer the mRNA from gel onto the northern blot membrane
○ Hybridize filter with labelled DNA or RNA probe to detect specific RNA
• Mechanism for acrylamide gel

○ Put the gel into electric field
○ Use electric field to transfer RNA from gel onto the northern blot membrane
○ Hybridize filter with labelled DNA or RNA probe to detect specific RNA
Iron regulates TfR mRNA stability in the cytoplasm

- To measure mRNA stability cytoplasmic mRNA levels were assayed via Northern blotting after the addition of the transcriptional
inhibitor Actinomycin D
○ Actinomycin D - inhibits transcription by intercalating between base pairs in DNA
○ Extracted RNA from the cell at various time points after addition of the inhibitor
- Mechanism:
□ transcription is inhibited
□ You then check the RNA levels as a function of time
 If RNA is stable - RNA levels will persist after addition of the transcriptional inhibitor
 If RNA is unstable - RNA levels will decrease with time
- This experiment indicates that TfR mRNA has a half-life of:

○ ~45 minutes when iron is present
○ greater than 12 hours when iron is absent
- This suggests that changes in the level of transferrin receptor at the cell surface +/- iron might be controlled by the stability of the transferrin
receptor mRNA
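The half-lives above can be related to the Northern blot time course with a first-order decay model, in which the fraction of mRNA remaining after transcription is blocked is 0.5^(t / half-life). This is a sketch assuming simple exponential decay. The 720-minute value stands in for "greater than 12 hours" and is only a lower bound.

```python
import math


def fraction_remaining(minutes_elapsed: float, half_life_minutes: float) -> float:
    """Fraction of mRNA left after transcription is blocked (first-order decay)."""
    return 0.5 ** (minutes_elapsed / half_life_minutes)


def half_life_from_two_points(t1, level1, t2, level2):
    """Estimate half-life (same units as t) from two mRNA levels on the decay curve."""
    decay_constant = math.log(level1 / level2) / (t2 - t1)
    return math.log(2) / decay_constant


# Compare TfR mRNA decay with and without iron over a 90-minute chase
for label, half_life in (("+ iron", 45.0), ("- iron (lower bound)", 720.0)):
    print(f"{label}: {fraction_remaining(90.0, half_life):.2f} remaining at 90 min")
```

With a 45-minute half-life only a quarter of the mRNA is left after 90 minutes, whereas with a half-life of 12 hours or more most of it is still present, which is the difference seen on the blot.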
What sequences mediate this response?

- Other work mapped the signals required for iron regulation
- Initial efforts:
○ deleted and/or replaced different parts of the TfR gene and found that the 3’ UTR was required for iron regulation
- In addition:
○ swapping the TfR promoter for another promoter had no effect on iron regulation
○ consistent with regulation not occurring at the level of transcription
 Additional mapping experiments identified a ~700 nucleotide region of the 3‘UTR that was sufficient for iron
regulation
◊ this region carries 5 stem/loops
◊ These stem/loops were similar to a stem/loop in another mRNA (ferritin heavy chain mRNA) that was also
regulated by iron
 This stem/loop within ferritin mRNA was shown to be responsible for iron dependent regulation of ferritin
translation and it was designated an iron response element or IRE
Iron response element binding protein binds to IREs

- Other work has identified an RNA binding protein
○ the iron response element binding protein or IRE-BP for short
○ This IRE-BP could bind to the ferritin IRE
- Using a gel shift and a competition assay it was shown that the same protein that binds to the ferritin IRE also binds the stem/loops in
the TfR 3‘UTR
- IRE-BP’s RNA binding activity is stimulated by low iron levels
- Conclusion:
○ Taken together these data would suggest that in low iron
▪ IRE-BP will be bound to the TfR 3‘UTR
▪ this binding would stabilize the mRNA
○ Or put another way the TfR mRNA is normally unstable and this instability is reversed by IRE-BP binding to the 3‘UTR
Two types of cis-acting elements control TfR mRNA stability

- This model suggests that there are two sequence elements within the TfR mRNA:
○ one would be required to make the mRNA unstable
○ another type of element required to protect the mRNA from degradation which would be the IREs
- These 2 cis-elements could be identified using a mutagenesis-based approach

○ mutations are introduced that block IRE-BP binding
○ Levels of TfR mRNA:
○ Conclusion: The ability to make such mutants is consistent with the model
The mechanism of TfR mRNA degradation
- A clue to the mechanism of TfR degradation came from Northern blot analysis of TfR mRNA levels after treating cells in high iron
conditions
○ In this case the signal was detected by exposing the blot to Xray film
- Results:
 The short exposure showed the full length TfR mRNA going away with time consistent with its degradation
under these conditions
 A long exposure showed a less abundant smaller mRNA that isn’t initially present but becomes detectable
sometime after treatment with high iron levels
- The timing of its appearance suggests it’s a decay intermediate

○ subsequent experiments showed that this intermediate resulted from an endonucleolytic cleavage within the TfR mRNA’s 3’ UTR
Model for the control of TfR mRNA stability
◊ When iron levels are low:

 IRE-BP bound to the IREs in the TfR 3’ UTR blocks the action of an endonuclease that can cleave the
TfR mRNA
◊ When iron levels are high:

 IRE-BP is bound by iron atoms, thereby disrupting IRE-BP’s RNA binding ability
 the TfR mRNA is now susceptible to cleavage
Assaying poly(A) tail length using an RNase H assay
- Mechanism of RNase H assay:

◊ Take a sample of total RNA with RNA of interest
◊ Hybridize gene specific DNA oligonucleotide which will bind close to the 3' end of the RNA of interest
◊ Add RNase H
 H stands for hybrid b/c it cleaves RNA only when it is hybridized with DNA
 As a result, a small fragment at 3' end of mRNA will be released from RNA molecule
 The length of the fragment is determined by the location of oligonucleotide binding and poly A tail
length
◊ Run RNA on polyacrylamide gel and perform northern analysis using 3'end probe
 Allows you to measure the fragment’s length and from it the length of the poly A tail
- This assay can be done on total RNA harvested from cells and since you use a probe specific to your mRNA of interest you will only
detect that RNA on your northern blot
- The size of the resulting fragment on the Northern blot will be an indication of the length of the poly(A) tail
- You can generate a marker for a completely deadenylated mRNA by including a DNA oligo consisting of a run of Ts
○ Oligonucleotide hybridizes to transcript's Poly A tail
○ Therefore: when treated with RNase H - will degrade Poly A tail and generate a marker for deadenylated mRNA
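The arithmetic behind the assay can be sketched as follows: the RNase H fragment runs from the gene-specific oligo cleavage site to the end of the poly(A) tail, and the oligo(dT)-treated marker is the same fragment without the tail, so the tail length is the difference between the two. The fragment sizes used here are hypothetical, and the function name is an illustration only.

```python
def poly_a_tail_length(fragment_nt: int, deadenylated_marker_nt: int) -> int:
    """Estimate poly(A) tail length from two RNase H fragment sizes.

    fragment_nt: size of the 3' fragment from the gene-specific oligo alone
    deadenylated_marker_nt: size of the same fragment after oligo(dT) + RNase H
    """
    if fragment_nt < deadenylated_marker_nt:
        raise ValueError("fragment cannot be shorter than the deadenylated marker")
    return fragment_nt - deadenylated_marker_nt


# Hypothetical band sizes read off a Northern blot (nucleotides)
marker = 200
for band in (270, 235, 200):
    print(f"{band} nt band -> tail of about {poly_a_tail_length(band, marker)} nt")
```

Because the bands are sized on a gel, the result is an estimate whose precision depends on the resolution of the polyacrylamide gel and the size markers.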
Assaying poly A tail length

- Procedure:
○ Block transcription at t=0
○ Collect RNA from cells at various time points
○ Subject the RNA samples to the RNase H assay
- In the first lane in each gel:

○ a sample has been treated with mRNA specific oligo plus oligo(dT) to provide a marker for fully deadenylated mRNA
- Problem with interpreting results:

○ The mRNA is being both deadenylated and degraded constantly
○ even at t=0 in wildtype cells the mRNA exists in two general states
▪ one with a long poly(A) tail (A0)
▪ one with a very short tail (A35)
○ at the second time point you see a decrease in the signal intensity of the long tail version and an increase in the signal of the
short tail version
▪ This suggests conversion from one form to the other
▪ Also note in the deadenylase mutant cell that the mRNA has a uniformly long tail and is also stable providing a correlation
between the deadenylation of this transcript and its degradation
Mechanism:
- Sequence-specific RNA binding proteins recruit deadenylase to target mRNAs to induce their degradation