
UNIT II: STRUCTURE FUNCTION CORRELATION IN PROTEINS

The Structure-Function correlation in Transcription factors: TATA box binding proteins, p53 and GCN4 (Leucine zipper)
The Structure-Function correlation in fibrous proteins: muscle fibers myosin, actin and the role of ATP in muscle contraction.
The Structure-Function correlation in Signal transducers: GPCR and tyrosine kinase
What is transcription?
Transcribing genetic information from DNA to RNA

DNA → DNA (Replication)
DNA → RNA (Transcription)
RNA → Protein (Translation)
RNA Polymerase
Synthesizes RNA from DNA

RNA Polymerase I (Pol I)- Synthesizes rRNAs
RNA Polymerase II (Pol II)- Synthesizes mRNAs
RNA Polymerase III (Pol III)- Synthesizes tRNAs
What is transcription factor?

Distal to the RNA Pol II initiation site, there are different combinations of specific DNA binding sequences, each of which is recognized by a corresponding site-specific DNA binding protein.

These proteins are known as transcription factor(s).
Example: TFIID, TFIIA, TFIIB, TBP etc.
Core promoter element

Architecture of a structural gene and the promoter


TATA Box:
• A-T rich 8 base pair DNA sequence
• Located about 25 base pairs upstream of the TSS (transcription start site)
• Recognized by TATA box binding proteins (TBPs)
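
The TATA box description above (an A-T rich 8 bp element roughly 25 bp upstream of the TSS) can be illustrated with a small sequence scan. This is only a sketch: the consensus pattern TATA[AT]A[AT][AG] is an assumption that is not stated in the slides, and the 40 bp upstream sequence is a made-up example, not a real promoter.

```python
import re

# Assumed 8 bp consensus (not given in the slides): T-A-T-A-(A/T)-A-(A/T)-(A/G)
TATA_CONSENSUS = re.compile(r"TATA[AT]A[AT][AG]")


def find_tata_boxes(upstream: str) -> list[tuple[int, str]]:
    """Return (distance_upstream, matched_sequence) for each TATA-like hit.

    `upstream` is the coding-strand sequence that ends immediately before
    the TSS, so its last base is position -1. The distance reported is that
    of the first base of the match, counted upstream from the TSS.
    """
    hits = []
    for match in TATA_CONSENSUS.finditer(upstream.upper()):
        distance = len(upstream) - match.start()
        hits.append((distance, match.group()))
    return hits


# Hypothetical upstream region: 10 bp filler, an 8 bp TATA-like box, 22 bp filler
upstream = "GCGCGCGCGC" + "TATAAAAG" + "GC" * 11
print(find_tata_boxes(upstream))  # [(30, 'TATAAAAG')]
```

In this toy sequence the box spans positions -30 to -23, so its centre lies close to the "about 25 bp upstream" position quoted above.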
Promoter proximal Element:

● 100-200 bp long
● Several transcription factors interact directly or indirectly with the
pre initiation complex
Enhancer Element:

● Resides further upstream or down stream of the TSS


● Few thousand to 20000 bp distant from the TSS
Transcription factors bind to the DNA

Transcriptional Activation

Schematic model of transcriptional activation


[Figure: a transcription factor (TF) with a DNA binding domain and an activation domain]

DNA Binding Domain:

• 100 amino acids long
• Bound to a short DNA stretch of 20 bp
• Built up of a very limited number of motifs, like:
Helix turn Helix
Leucine zipper
Helix loop Helix
Zinc finger motif
1. TATA Box Binding Protein (TBP)

• First isolated and purified from Yeast in 1988


• Single polypeptide chain of 27 kDa
• Conserved C Terminal domain of 180 aa
• N Terminal domain of varied length and diverse sequences
• C terminal domain having DNA binding and transcription
activation function
Structure of TBP
Crystal structures:
Paul Sigler @ Yale University, with yeast C terminal TBP and yeast TATA box DNA
Stephen Burley @ Rockefeller University, with C terminal TBP of A. thaliana and TATA box DNA from adenovirus
Two homologous repeats of 88 aa form similar motifs

Each motif comprises an antiparallel beta sheet of five strands and two α-helices

The two motifs are joined together by a short loop to make a 10 stranded beta sheet

Together they look like a saddle (Fig a)

The loops that connect beta strands 2 & 3 of each motif form the stirrups of the saddle

The underside of the saddle forms the concave surface built by the central eight strands of the beta sheet

Side chains on this side of the beta sheet, as well as residues of the stirrups, form the DNA binding site.

The side of the beta sheet that faces away from the DNA is covered by two alpha helices

Residues from these two helices and from the short loop that joins the two
How does TBP bind to the DNA?

Answer: TBP binds to the minor groove of the DNA and induces large structural changes
• Normal B-DNA structure returns outside the TATA box

• The helical axes of the DNA at each end of the TATA box form an angle of about 100 degrees to each other, instead of the expected 180 degrees if the DNA were not bent.

• Between the first two and between the last two bp of the TATA box there are sharp kinks; between the kinks the DNA is smoothly curved and partially unwound.
• Two phenylalanine residues are partially inserted between the first two and the last two bases, preventing stacking of the adjacent bases and allowing an increase in the rise of the DNA

• The kinks at each end of the DNA and the partial unwinding of the DNA produce a wide and shallow minor groove.

• This exposed wide and shallow minor groove binds intimately to the concave undersurface of the TBP saddle.

DNA modifications (distortions):

a. Bending of DNA
b. Widening of the minor groove
c. Unwinding of the DNA
•All eight nucleotides of the TATA box interact with TBP, and their structure deviates from normal B-DNA.

•The saddle would straddle normal B-DNA with the helical axis of the DNA perpendicular to a line connecting the two stirrups.

•The DNA is sharply bent at the TATA box region so that the local helical axis is almost parallel to the line from stirrup to stirrup.

[Figure: protein saddle structure bound to the minor groove of DNA]


What is the nature of the interaction?

•Strong hydrophobic interaction between the underside of the TBP saddle and the minor groove of the DNA

•Side chains of the eight central beta strands interact with both the sugar-phosphate backbone and the minor groove of the eight nucleotides of the TATA box.

•Fifteen side chains projecting from the beta strands make hydrophobic contacts with the sugars and bases of the DNA.

• The phosphate groups are hydrogen bonded to arginine and lysine side chains at the edges of the interaction area.
Why specific to TA/AT sequence at positions 4 and 5 of the TATA box?
The only sequence specific H bonds are at the center of the box
Asn 69 – O2 of T4’ and N3 of A5’
Asn 159 – O2 of T5 and N3 of A4
Thr 124 & 215 – N3 of A on both sides

Role of conserved Val residues.

Val 71 and 122 on one side
Val 161 and 213 on the other side
Side chains of the Val residues cause steric interference with the NH2 substituent from a G-C or C-G base pair.
Flanking Val residues in combination with 6 H-bonds specify A-T or T-A at positions 4 and 5 of the TATA box
Why Minor groove???
Quasi – palindromicity
Functional implication of DNA bending
TBP-associated factors (TAF)
Why strong affinity between TBP and the TATA box? Around 100000 fold more affinity than for random DNA.

• Large interacting hydrophobic surface area

• Major distortion in the DNA

• Six hydrogen bonds between 4 side chains of TBP and 4 hydrogen bond acceptors from bases in the minor groove.
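
To put the 100000 fold affinity difference on an energy scale, one can use the relation ΔΔG = RT ln(ratio). The sketch below assumes that the quoted factor is a ratio of equilibrium binding constants and that the temperature is 298.15 K; neither assumption comes from the slides.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_energy_gain(affinity_ratio: float, temperature_k: float = 298.15) -> float:
    """Free energy difference (kcal/mol) corresponding to an affinity ratio.

    Assumes the ratio is K_specific / K_random for equilibrium binding
    constants, so that ddG = R * T * ln(ratio).
    """
    return R_KCAL * temperature_k * math.log(affinity_ratio)


ddg = binding_energy_gain(1e5)
print(f"{ddg:.2f} kcal/mol")  # 6.82 kcal/mol
```

Under these assumptions the specific complex is favoured by roughly 6.8 kcal/mol, which is the combined contribution of the interactions listed above.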
p53
Most ambiguous and cited biological molecule
Encoded by genes known to be Tumor Suppressor Genes (TSG)????
Protein of 53 kDa MW – induces expression of p21 – which inhibits CDKs (checkpoint)
Gives sufficient time for repair, or for destruction of damaged cells (apoptosis)
Single point mutation – altered function – observed in more than half of cancer patients
wild type – sequence specific DNA binding
mutated p53 – no binding and hence no regulation
P53 – Oligomerization Domain
Oligomerization domain – tetramer formation
Mutations in C – terminal affects tetramer formation.
The monomer still retains DNA binding function
Available structures????
21 base pairs sequence bound to p53 (102 – 292)
Oligomerizing domain (325 – 356)
Tumorigenic mutations
Leu 330 to His
Glycine in turn
P53 – DNA binding Domain

DNA binding domain (anti-parallel beta barrel)

Protruding loops from anti-parallel beta barrel, immunoglobulin fold (7/9 strands)
This kind of fold also present at I– MHC binding coreceptor in
CD4 NF-kB – REL homology region
• One end of the barrel is closed

•The other end is more open, and its loops are more extended and protrude outside the barrel; this is the end where DNA binds

•The conformation of two of these loops is maintained by a Zn atom, which is bound to two cysteine side chains from one loop and to one cysteine and one histidine side chain of another loop
Important interactions:
Major groove
Minor groove

Non-specific interactions between sugars & phosphates in DNA and side-chain & main-chain atoms of the protein

Two loops and an α helix are involved in the interaction

α helix at major groove (both specific and non-specific interactions)

Most important interaction: Arg 280 with G10; the C-G base pair at this position is invariant
Minor groove interactions at A-T region
Arg 248 from L3 with T12’ and T14: sugar and phosphate groups are involved

Tumorigenic Mutations
R248 – both for DNA specific interaction
Mutations that alter the interaction between L2 and L3 are also noted
Role of Arg273 with T11’
hydrogen bond interactions with Arg280
Zn finger transcription factors
A zinc finger is a small protein structural motif, characterized by the coordination of one or more zinc ions in order to stabilize the fold.
Zinc fingers have become extremely useful in various therapeutic and research
capacities.
Engineering zinc fingers to have an affinity for a specific sequence is an area of
active research.
Zinc finger nucleases and zinc finger transcription factors are two of the most
important applications of this to be realized to date
❑ More than a thousand different transcription factors contain Zn as an essential element of their DNA binding domains.

❑ Polypeptides are short, about 50 aa

❑ Regular patterns of cysteine and/or histidine residues along the chain

❑ These residues bind to the Zn atoms, thereby providing a scaffold for the folding of the motif into a small compact domain

❑ First described in 1985 by Aaron Klug at MRC LMB Cambridge

❑ TFIIIA from Xenopus laevis

❑ 344 aa, nine repeated sequences of about 30 residues each

❑ Repeats are not identical in sequence, but each contains two cysteines and two histidines, at the N terminal and C terminal ends respectively.
•Zn is intrinsically present. Cysteine and histidine are the ligands of the Zn atom, and the loop between these residues forms the DNA binding region. Each of these nine repeats is therefore called a zinc finger
• The two cysteine residues are separated by two or four amino acids

• The two histidine residues are separated by three to five amino acids

• The linker region between the last cysteine and the first histidine is 12 residues long
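
The spacing rules above (Cys, two or four residues, Cys, 12 residues, His, three to five residues, His) can be written as a regular expression. The following is a minimal sketch; the test sequence is a synthetic placeholder built only to satisfy the spacing rules, not a real zinc finger sequence.

```python
import re

# Cys-X(2 or 4)-Cys-X(12)-His-X(3..5)-His, following the spacings listed above
FINGER = re.compile(r"(C)(?:.{2}|.{4})(C).{12}(H).{3,5}(H)")


def zinc_ligand_positions(seq: str) -> list[list[int]]:
    """Return the 1-based positions of the Cys, Cys, His, His ligands per match."""
    return [
        [match.start(group) + 1 for group in range(1, 5)]
        for match in FINGER.finditer(seq)
    ]


# Synthetic placeholder: alanine filler with ligands placed by hand
toy_finger = "AA" + "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H" + "AA"
print(zinc_ligand_positions(toy_finger))  # [[3, 6, 19, 23]]
```

A pattern like this only checks ligand spacing; it says nothing about whether a matching sequence actually folds around a Zn atom.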
• Structure of Xfin synthetic peptide
•Residues 1-10 form an antiparallel hairpin motif, with the first Zn ligand Cys 3 in a beta strand and the second ligand Cys 6 in the tight turn between the beta strands
• The hairpin is followed by a helix, residues 12-24, of about three and a half turns

The remaining two Zn ligands, His 19 and His 23, are in the C terminal half of the helix

The helix is distorted and forms a 3₁₀ helix, with H bonds to every third residue instead of every 4th residue

The two ends are held together by the binding of side chains to the Zn atom
Finger region of Zn finger motif interacts with DNA
❑ The 12 residues between the second cysteine Zn ligand and the first His form the “Finger region”
❑ Comprises the second beta strand, the N terminal half of the helix and the two residues that form the turn between the beta strand and the helix
❑ This is the main interaction area of the polypeptide chain with the DNA
❑ Interactions are sequence specific between the side chains of the protein and the bases of the DNA, and also non specific between phosphate oxygen atoms of the DNA and side chains of the protein

Example: Zif 268

Specific interactions: Arg 46 - G7, His 49 - G6

Non-specific interactions: phosphate O2 of base pairs 4, 5 and 6 with side chains of His 53, Arg 42 and Ser 45
Leucine Zipper
Leucine zippers are the dimerization domain of the bZIP (Basic-region leucine zipper) class of eukaryotic transcription factors.
The bZIP domain is 60 to 80 amino acids in length, with a highly conserved DNA binding basic region and a more diversified leucine zipper dimerization region.
Leucine Zipper
❑ First recognized in yeast transcription factor GCN4, mammalian transcription factor C/EBP, and the oncogene products Fos, Jun and Myc

❑ When the linear amino acid sequence is plotted on a helical wheel, a remarkable pattern of leucine residues forms

❑ Around 30 residues form a modular arrangement of 7 aa repeats (positions a to g), in which the 4th residue (d) is always leucine
First residue (a) usually hydrophobic
The peptide dimerizes and forms two parallel coiled coil alpha helices with a helical repeat of 3.5 residues per turn

a & d positions = form a hydrophobic core region

Side chains outside the core (e & g) are frequently charged and can either promote or prevent dimer formation
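
The heptad pattern described above can be made concrete by labelling each residue with its position a to g and pulling out the core positions a and d. This is a sketch only: the register is assumed to start at position a, and the 28 residue sequence is an invented repeat, not the GCN4 sequence.

```python
HEPTAD = "abcdefg"


def heptad_register(seq: str, first_position: str = "a") -> list[tuple[int, str, str]]:
    """Label each residue as (1-based index, heptad position, amino acid)."""
    offset = HEPTAD.index(first_position)
    return [
        (i + 1, HEPTAD[(i + offset) % 7], residue)
        for i, residue in enumerate(seq)
    ]


def core_residues(seq: str, first_position: str = "a") -> list[tuple[int, str, str]]:
    """Keep only the residues at the a and d positions (the hydrophobic core)."""
    return [entry for entry in heptad_register(seq, first_position) if entry[1] in "ad"]


# Invented heptad: V at a, L at d, charged residues at e and g
toy_zipper = "VKELEDK" * 4
d_positions = "".join(aa for _, pos, aa in core_residues(toy_zipper) if pos == "d")
a_positions = "".join(aa for _, pos, aa in core_residues(toy_zipper) if pos == "a")
print(d_positions, a_positions)  # LLLL VVVV
```

With a real sequence the starting register is not known in advance, so `first_position` would have to be chosen by trying the seven possibilities and looking for the one that puts leucines at d.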
Dimer: 1. Homodimer: two identical transcription factors. 2. Heterodimer: two different transcription factors.
Example: Fos/Jun heterodimer found in AP1 (Activator protein 1)
Jun- Can form both homo and hetero dimers
Fos- Can not form a homodimer.
As Fos can not form a homodimer, it is not able to bind to DNA all by itself
WHY? Answer: Strong charge repulsion of 5 glutamic acid residues in the e & g positions with no compensating positive charge.
❖ Fos can form a heterodimer with Jun due to the complementary positive charges in the e & g positions of Jun
❖ Heterodimer formation expands the repertoire of DNA binding specificities
Two types of monomer- 3 distinct DNA binding specificities
Three types of monomer- 6 distinct DNA binding specificities
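
The counts of 3 and 6 follow from choosing unordered pairs with repetition, n(n+1)/2 for n monomer types. A short sketch of that counting is given below; the function name and the `cannot_homodimerize` option are illustrative choices, added to show how a Fos-like monomer reduces the count.

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers, cannot_homodimerize=()):
    """List unordered dimers, dropping homodimers of the excluded monomers."""
    pairs = combinations_with_replacement(sorted(monomers), 2)
    return [
        pair for pair in pairs
        if not (pair[0] == pair[1] and pair[0] in cannot_homodimerize)
    ]


print(len(possible_dimers(["A", "B"])))       # 3
print(len(possible_dimers(["A", "B", "C"])))  # 6
print(possible_dimers(["Fos", "Jun"], cannot_homodimerize={"Fos"}))
# [('Fos', 'Jun'), ('Jun', 'Jun')]
```

The figures of 3 and 6 are therefore upper limits: each monomer that cannot homodimerize, as described for Fos, removes one dimer from the list.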
GCN4
⮚ Yeast
⮚ b/Zip family of transcription factor
⮚ Monomer 281 aa
⮚Binds to promoter regions of more than 30 genes involved in amino acid
biosynthesis
⮚ Dimerization and DNA binding domains are in two different regions: the basic region and the C terminal leucine zipper region

Basic region: eight charged residues, mainly Arg, which are involved in DNA binding

DNA recognition region of GCN4 is similar to that of the Fos/Jun heterodimer of AP1
⮚ Simple structure: each monomer of the GCN4 fragment forms a smoothly curved continuous alpha helix
⮚ Leucine zipper region forms a coiled coil structure
⮚ The two helices diverge from the dimer axis in a segment comprising the junction between the leucine zipper and the basic region
⮚ This fork creates a smooth bend in each helix which displaces the basic regions away from the dimer interface, so that they can pass through the major groove of the DNA
⮚ Each basic region binds to one half site with numerous contacts to the DNA, like a forceps gripping the major groove.
GCN4 binds to DNA with sequence specific and nonspecific contacts.
4 aa side chains form sequence-specific contacts with bases. Asn 235, strictly conserved, is at the centre of the interaction area.
The side chain of Asn (N) forms 2 H-bonds.
Its oxygen atom accepts an H-bond from an N atom of base C2, and the N atom of Asn 235 donates an H-bond to an oxygen atom of T3.
For this, the α helix of the GCN4 basic region lies deeply in the major groove.
This specifies two of the 4 bases in each half site.
N235 lies in a hydrophobic pocket formed by the methyl side chains of Ala 238 & 239.
A238 & 239 form hydrophobic pockets with the methyl groups of T3 and T1
Methylene group of Ser 242 contacts the methyl group of T3.
Arg 243 in one monomer donates H-bonds to the guanine of the G-C bp in a bidentate manner.
Arg 243 in the second monomer forms nonspecific H-bonds to phosphate oxygen atoms in the central region.
• Assignment 2
How does GCN4 dimer bind to half-sites comprising T1, C2,
and T3 when these half sites are separated by two
nucleotides instead of one?
Fibrous proteins
Passive structural elements of long fibers.
Specific repetitive aa sequences for specific 3-D structure.
Long chain molecules serve as structural materials.
3 different groups dependent on secondary structure.
*Coiled-coil α helices present in keratin and myosin.
*Triple helix in collagen.
*β-sheets in amyloid fibers and silks.
Often form protofilaments or protofibrils.
Assemble structurally specific highly ordered filaments and
fibrils.
Exs: collagen, amyloids, intermediate filaments, tubulin,
myosin and fibrinogen.
Actin and Myosin
Myosin: muscle protein consisting of head, neck and tail domains.
Head domain binds to filamentous actin and uses ATP hydrolysis to generate force.
Neck domain acts as linker and lever arm for transducing the force generated by the catalytic motor domain.
Neck domain: binding site for myosin light chains, which form part of a macromolecular complex and have regulatory functions.
Tail domain: interaction with cargo molecules and other myosin subunits.
Actin: usually associated with myosin. Monomeric subunit of 2 types of filaments: microfilaments and thin filaments.
Interaction of actin and myosin is important for muscle contraction and for the motor activity of myosin molecules.
Present as a free monomer called G-actin (globular) or as part of a linear filament called F-actin (filamentous).
Actin participates in muscle contraction, cell motility, cell division, cytokinesis, vesicle and organelle movement, cell signalling and maintaining cell shape.
Three distinct types of muscle cells in vertebrates-skeletal muscle, cardiac muscle and smooth muscle.
Actin and myosin
Muscle contraction takes place by mutual sliding of 2 sets of interdigitating filaments made of fibrous proteins
Thick filament-myosin
Thin filament-actin.
Thick and thin filaments organized in basic contractile units called sarcomere, each 2-3μm long.
Another fibrous component of sarcomere is titin.
Titin-largest known polypeptide chain with mol.wt of approx.3000kDa.
Measures the length of sarcomere.
Returns the stretched muscle to its correct length.
Myosin forms cross-bridges between actin and myosin filaments.
Within each sarcomere relative sliding of thick and thin filaments brought by cross-bridges.
Cross-bridges-parts of myosin molecules that stick out from myosin filament and interact with actin filament.
The hydrolysis of ATP to ADP and phosphate couples the conformational change in myosin to actin binding and
release.
Myosin in thick filament is a fibrous protein with individual chains arranged in helical coiled coils.
Actin is a fibrous protein formed by linking together globular monomeric subunits.
First molecular theories of muscle contraction appeared in 1930s.
Rubber-like shortening of myosin filaments brought about by
altering the state of ionization of myosin.
Sarcomeres contain two sets of filaments that glide over each
other without altering their length.
What makes them glide?
Myosin cross-bridges to the actin filament, two conformations of
cross-bridges were observed.
Seminal finding led to swinging cross-bridge model-sliding of
actin filaments into myosin filament.
Myosin cross-bridge was thought to bind to actin in an initial
(90°) conformation.
Go over to an angled (45°) conformation followed by release of
actin.
For each complete cycle 1 molecule of ATP would be hydrolysed.
Actual movement per cycle of ATP hydrolysis was measured to
about 80-100 Å.
Cross-bridge was an elongated structure accommodated by
swinging the cross-bridge.
Structure of actin and myosin.
Fibrous protein, F-actin is a helical polymer of globular polypeptide chain.
G-actin comprising 375 aa.
Crystal structure of monomeric G-actin molecule was determined. Structure comprises 4 domains.
Two of which are similar α/β domains that contain an ATPase catalytic site.
F-actin helix has 13 molecules of G-actin in 6 turns of the helix.
Myosin: 2 heavy chains and 4 light chains.
Forms a 1400-Å-long tail and 2 heads, each of mol. wt. 120 kDa (120,000 Da).
C-terminal regions of the heavy chains are folded into long α-helices that form the tail by dimerizing through a parallel coiled coil.
Fragment of myosin called subfragment 1 or S1.
S1: 2 light chains and the N-terminal region of 1 heavy chain; globular head and helical tail.
Head-7 stranded β sheet and associated α-helices. Actin-binding site and nucleotide-binding site.
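
The F-actin helix parameters quoted above (13 G-actin molecules in 6 turns) fix the rotation between neighbouring subunits. The short calculation below is a sketch of that arithmetic only; it uses no structural data beyond those two numbers.

```python
def helix_geometry(subunits: int = 13, turns: int = 6) -> tuple[float, float]:
    """Return (rotation per subunit in degrees, subunits per turn)."""
    rotation_per_subunit = 360.0 * turns / subunits
    subunits_per_turn = subunits / turns
    return rotation_per_subunit, subunits_per_turn


rotation, per_turn = helix_geometry()
print(f"{rotation:.1f} degrees per subunit, {per_turn:.2f} subunits per turn")
# 166.2 degrees per subunit, 2.17 subunits per turn
```

A rotation of about 166 degrees means each subunit sits almost opposite the previous one, which is why the filament can also be viewed as two strands winding slowly around each other.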
Structure of myosin supports swinging cross-bridge hypothesis
Myosin cross-bridge having two discrete conformations
Attaches to actin with ADP still bound and with the lever at the beginning of the working stroke.
At the end of the working stroke ADP is released.
Switch from state 1 to state 2: power stroke.
End state: rigor (the state muscles enter on ATP depletion, when they become locked in rigor mortis)
Initial state is weak binding state-low affinity for actin.
End state is strong-binds to actin quite tightly.
These two states of myosin exist independently from actin .
Role of ATP in Muscular contraction
Myosin head binds to the actin filament in one position relative to its anchor point.
Myosin filament changes relative position along the fiber axis.
The two filaments slide relative to each other by about the same distance.
Myosin head then detaches from the actin filament to repeat the process.
In the absence of nucleotide the myosin nucleotide binding cleft is open and the lever arm is down.
The actin binding site is intact and this form binds strongly to actin.
Rigor state: absence of nucleotides, muscle is locked as in rigor mortis.
If ATP is added, a myosin head bound to actin will bind ATP and then dissociate from actin.
Binding of ATP to the nucleotide binding cleft causes the P loop (which corresponds to the switch II region in G proteins) to change conformation.
Changes in loop conformation are coupled to a major conformational change of the head.
The cleft closes and the region that binds actin releases the actin filament.
Bound ATP is hydrolysed to ADP and phosphate.
When the bound phosphate is released, the cleft starts to open and the myosin head binds to actin.
Release of ADP coincides with a conformational change that fully opens the myosin cleft,
causing actin to be tightly bound and moving the lever arm to the up position.
The myosin head binds to actin at one end and is covalently linked to the myosin fibril at the other end.
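
The sequence of events above can be summarised as a four step cycle that starts and ends in the rigor state. The sketch below is a simplified bookkeeping model written for illustration: the state names and the `MyosinHead` class are assumptions, and it tracks only the bound nucleotide, actin attachment and ATP consumption (one ATP per cycle), not lever arm geometry or kinetics.

```python
from dataclasses import dataclass

# Each step: (event, nucleotide bound afterwards, attached to actin afterwards)
CYCLE = [
    ("ATP binds", "ATP", False),          # head dissociates from actin
    ("ATP hydrolysed", "ADP+Pi", False),  # hydrolysis on the detached head
    ("Pi released", "ADP", True),         # cleft starts to open, head rebinds actin
    ("ADP released", "none", True),       # tight binding, back to the rigor state
]


@dataclass
class MyosinHead:
    nucleotide: str = "none"  # rigor state: no nucleotide bound
    on_actin: bool = True
    atp_used: int = 0
    step: int = 0

    def advance(self) -> str:
        """Apply the next event of the cycle and return its name."""
        event, self.nucleotide, self.on_actin = CYCLE[self.step]
        if event == "ATP binds":
            self.atp_used += 1
        self.step = (self.step + 1) % len(CYCLE)
        return event


def run_cycles(n: int) -> tuple[MyosinHead, list[str]]:
    """Run n complete cycles and return the head plus the event log."""
    head = MyosinHead()
    log = [head.advance() for _ in range(n * len(CYCLE))]
    return head, log


head, log = run_cycles(3)
print(head.atp_used, head.nucleotide, head.on_actin)  # 3 none True
```

Stopping the model after "ADP released" with no further ATP available leaves the head attached with no nucleotide, which is the rigor condition described above.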
Signal Transduction
Signal transducing receptors: plasma membrane proteins
Bind to extracellular molecules: growth factors, hormones, neurotransmitters
Transmit the signal
Elicit a specific response: cascade of enzymatic reactions giving rise to many different effects within the cell, like gene expression

They are structurally and evolutionarily related

Three classes:
1. Ion channel linked receptors
2. G protein linked receptors
3. Enzyme linked receptors
Each contains an extracellular domain that recognizes specific molecular signals
Signal Transduction receptor
Extracellular domain: recognizes specific molecular signal

Transmembrane domain: through which the signal is transmitted

Intracellular domain: produces a response

Limited number of domains: protein molecules with different functions have evolved, either by accumulation of point mutations or by gene manipulation
No three dimensional structure is available
a. Large size
b. Membrane bound
c. Too large to solve by NMR

Growth hormone Receptor extracellular


domain
Amplification of signal by G protein and protein tyrosine kinase linked receptors
Intracellular response
G protein linked receptors:
Transmembrane domain with seven helices

Signals transmitted to the intracellular domains are amplified by amplifiers called G proteins

G proteins bind guanine nucleotides and hence are named G proteins

Act as a molecular switch:
1. G protein + GTP: active state
2. G protein + GDP: inactive state

⮚ Slow GTPase activity

⮚ G proteins + RGS (Regulators of G protein Signalling, which accelerate GTP hydrolysis): switch off the activation
❑ Heterotrimers: a. Alpha b. Beta c. Gamma

⮚ When bound to GTP, dissociates into: 1. Alpha 2. BetaGamma

❑ 1000 different genes

❑ Several Alpha, Beta and Gamma subunits, forming different G proteins
Allowing cells to respond to a wide variety of external signals
Inactive state::: Gα-GDP-GβGγ
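
The GDP/GTP switch and the subunit dissociation described above can be sketched as a two state object. The class and method names below are illustrative assumptions, and the model ignores kinetics: it only records which nucleotide is bound and which complexes exist in each state.

```python
from dataclasses import dataclass


@dataclass
class GProtein:
    nucleotide: str = "GDP"  # inactive state: Galpha-GDP bound to Gbeta-gamma

    @property
    def active(self) -> bool:
        return self.nucleotide == "GTP"

    @property
    def complexes(self) -> list[str]:
        """Species present: the heterotrimer when inactive, two parts when active."""
        if self.active:
            return ["Galpha-GTP", "Gbeta-gamma"]
        return ["Galpha-GDP-Gbeta-gamma"]

    def receptor_activation(self) -> None:
        """Activated receptor promotes exchange of GDP for GTP."""
        self.nucleotide = "GTP"

    def hydrolyse_gtp(self) -> None:
        """Intrinsic (slow) or RGS-assisted GTPase activity resets the switch."""
        self.nucleotide = "GDP"


g = GProtein()
g.receptor_activation()
print(g.active, g.complexes)  # True ['Galpha-GTP', 'Gbeta-gamma']
g.hydrolyse_gtp()
print(g.active, g.complexes)  # False ['Galpha-GDP-Gbeta-gamma']
```

In this picture the signal stays on for as long as `hydrolyse_gtp` has not been called, which is the point made about GTPase activity setting the duration of the signal.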
Signal → receptor external domain → signal passes through the membrane

Cytosolic domain becomes activated by conformational change → G protein activated

Release and dissociation of G proteins


⮚ Second messenger molecule: a substance whose release within a cell is promoted by a
hormone and which brings about a response by the cell.

⮚ GTPase activity determines the length of time that the signal remains on

⮚ Failure of GTPase activity: Gα–GTP remains active

Consequence: Cholera toxin prevents Gα–GTP breakdown, causing continued excretion of Na and water into the gut.

RAS: small GTPases: Example: KRAS, NRAS, and HRAS


⮚ Are molecular switches activated in response to protein tyrosine kinase receptors

⮚ GTPase activity regulated by GAP (GTPase activating protein)

⮚ 25% of tumor cells produce mutant Ras protein that is not regulated by GAP
⮚ Ras and Gα have similar functions
Structural details
⮚ α/β type
⮚ six beta strands (five parallel) and five alpha helices
⮚ 5 out of 6 loops in Ras are involved in the GTP binding site
⮚ 3 of these loops, G1 (10 – 17), G3 (57 – 60) and G4 (116 – 119), are conserved in all GTP binding proteins
⮚ G1 – for proper positioning of the phosphate groups – binds to the α and β phosphates of the guanine nucleotide
⮚ G3 – links the subsites for Mg2+ binding and the γ phosphate
⮚ G4 – recognition and binding of the guanine nucleotide

G1 🡺 G-X-X-X-X-G-K-S/T
G3 🡺 D-X-X-G
G4 🡺 N-K-X-D
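
Consensus motifs of this kind translate directly into regular expressions. The sketch below scans a sequence for the three loops; it takes G3 as D-X-X-G (the commonly cited form of this motif, stated here as an assumption) and uses a synthetic alanine-padded sequence rather than a real Ras sequence.

```python
import re

# X = any residue; S/T = serine or threonine
MOTIFS = {
    "G1": re.compile(r"G.{4}GK[ST]"),  # G-X-X-X-X-G-K-S/T
    "G3": re.compile(r"D.{2}G"),       # D-X-X-G (assumed consensus)
    "G4": re.compile(r"NK.D"),         # N-K-X-D
}


def find_motifs(seq: str) -> dict[str, list[tuple[int, int]]]:
    """Return 1-based (first, last) residue numbers of every motif match."""
    return {
        name: [(m.start() + 1, m.end()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Synthetic placeholder: motifs embedded in alanine filler
toy_gtpase = "A" * 9 + "GAGGVGKS" + "AAAA" + "DTAG" + "AAAA" + "NKCD" + "AAA"
print(find_motifs(toy_gtpase))
# {'G1': [(10, 17)], 'G3': [(22, 25)], 'G4': [(30, 33)]}
```

Short patterns such as D-X-X-G match by chance quite often, so in practice a hit is only meaningful when all three motifs occur in the expected order and spacing.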

❑ Two switches – conformational changes on activation


Switch I – G2 Thr binds Mg2+ and involved in structural switching and GTP hydrolysis
Switch II – G3 and alpha 2
Role of Magnesium ions – all nucleotide triphosphate hydrolyzing enzyme

Nucleotide binding site is similar to that of Tu (elongation factor)


Transducin
G protein associated with rhodopsin
Effector enzyme – cGMP phosphodiesterase
Much larger than Ras and has two domains – a GTPase domain similar to Ras and an alpha helical domain with unique topology

The linker region ensures that exchange of guanine nucleotide is regulated.
The linker region has one large alpha helix (28 residues) with 5 supporting small helices acting as a gate.
Five known classes of enzyme-linked receptors:
(1) Receptor tyrosine kinases (phosphorylate specific tyrosine residues on intracellular signaling proteins)
(2) Tyrosine kinase-associated receptors (associate with proteins that have tyrosine kinase activity)
(3) Receptor tyrosine phosphatases (remove phosphate groups from tyrosine residues of specific intracellular signaling proteins)
(4) Transmembrane receptor ser/thr kinases (add phosphate groups to serine and threonine side chains on target proteins)
(5) Transmembrane guanylyl cyclases (catalyse the production of cyclic GMP in the cytosol)
Signaling through tyrosine kinase domains is involved in a variety of biological processes, including cell growth, cell shape, cell cycle control, transcription and apoptosis.
Receptors regulating cell growth and differentiation show a similar overall structural organization. The cytosolic region has 250-300 amino acid residues.
The extracellular domain is different for different subclasses of these receptors.
The second group, tyrosine kinase associated receptors, have a cytosolic domain that lacks a defined catalytic function.
Includes receptors for cytokines and some hormones such as growth hormone and prolactin. Eg: Src family
Small protein modules form adaptors for signaling network
A set of protein modules, covalently attached to their associated protein kinases or to their target molecules, that regulate signal pathways.
3 important modules
SH2 (Src-homology-2)
SH3 (Src-homology-3)
PH (Pleckstrin-homology)
