
How To Sequence

A Protein

W. Robert Midden
Department of Chemistry
Bowling Green State University

Copyright © 1998 W.R. Midden


All Rights Reserved Bowling Green State University Protein Sequencing
Protein Sequencing
Preliminary Steps
 For multisubunit proteins, the individual
protein chains must first be separated
 Break interchain disulfide bonds, if necessary
 Two reagents are commonly used:
 performic acid
 mercaptoethanol

2-Mercaptoethanol

[Figure: reaction scheme in which 2-mercaptoethanol (HS-CH2-CH2-OH) reduces the disulfide bond between two cysteine residues to two sulfhydryl groups]
 2-mercaptoethanol reduces disulfides to sulfhydryls
 But the sulfhydryls are easily oxidized back to the disulfide
Preventing Reversal

[Figure: alkylation of cysteine sulfhydryl groups with iodoacetic acid or acrylonitrile]

 To prevent reoxidation, the sulfhydryls are alkylated with
iodoacetic acid or acrylonitrile
Performic Acid

[Figure: performic acid (HCOOOH) oxidizing a disulfide bond to two cysteic acid (SO3-) side chains]

 Performic acid oxidizes cysteine to negatively charged
cysteic acid
Reversal Prevented

[Figure: two peptide chains bearing negatively charged cysteic acid (SO3-) side chains]

 The repulsion of the negatively charged SO3- groups
prevents reformation of the disulfide bond
 Therefore alkylation is not necessary with performic acid
Protein Sequencing
Preliminary Steps
 After breaking disulfide bonds, the chains are
separated by disrupting noncovalent interchain
interactions with pH extremes, 8 M urea, 6 M
guanidinium hydrochloride, or high salt
 Then the individual protein chains are
separated by electrophoresis or chromatography
on the basis of size or charge

Determining Amino
Acid Sequence
 Once each protein is purified the amino acid
sequence is determined by:
 1) determining the amino acid composition
(how many of each amino acid are in the
protein)
 2) identifying the amino and carboxyl terminal
amino acids
 3) cleaving the protein into two or more sets of
peptides using specific enzymatic or chemical
reagents such as trypsin or cyanogen bromide
Determining
Protein Sequence
 4) determining the amino acid sequence of each
of the peptide fragments
 5) determining the entire protein sequence from
the sequences of overlapping peptide
fragments
 6) locating the position of disulfide bridges
between cysteines

Determining Amino
Acid Composition
 The amino acid composition is determined by:
 Hydrolysis with 6N HCl for one to three days
 Separating and quantifying individual amino
acids by ion exchange HPLC using an amino
acid analyzer
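
As a rough illustration of what "composition" means here, the sketch below counts how many of each residue a one-letter sequence contains. The peptide and the function name are hypothetical. A real analysis works the other way around, measuring the counts from the hydrolysate without knowing the sequence.

```python
from collections import Counter


def amino_acid_composition(sequence):
    """Return a dict mapping each one-letter residue code to its count."""
    return dict(Counter(sequence.upper()))


# Hypothetical peptide, used only for illustration.
composition = amino_acid_composition("GAMKGFRA")
print(composition)
```

Note that the composition says nothing about residue order, which is why the later steps are needed.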

Determining the
N-Terminal Amino Acid
 The N-terminal amino acid is determined using
either chemical reagents or enzymes
 Chemical reagents include:
 Sanger’s reagent
 dansyl chloride
 Edman Degradation

Determining the
N-Terminal Amino Acid
 Sanger’s reagent
 Treat with dinitrofluorobenzene to form a
dinitrophenyl (DNP) derivative of the
amino-terminal amino acid
 Acid hydrolysis
 Extract the DNP-derivative from the acid
hydrolysate with organic solvent
 Identify the DNP-derivative by chromatography
and comparison with standards
[Figure: structure of dinitrofluorobenzene]

Determining the
N-Terminal Amino Acid

 Dansyl chloride
(dimethylaminonaphthalenesulfonyl chloride)
 Forms a highly fluorescent derivative
of the amino-terminal amino acid
 Identified by chromatography and
fluorescence detection after acid
hydrolysis
 Highly sensitive
 Best choice when the amount of
protein is limited
[Figure: structure of dansyl chloride]

Determining the
N-Terminal Amino Acid
 Edman degradation
 phenylisothiocyanate (phenyl-N=C=S) adds to N-terminus
then acid treatment cleaves the N-terminal amino acid as a
PTH derivative
 the remaining protein chain is intact and the cycle can be
repeated
 under ideal conditions the sequence of 30-60 amino acids
can be determined
 Leucine aminopeptidase
 enzyme from hog kidney hydrolyzes the N-terminal
peptide bond
 best with nonpolar amino acids

Determining the
C-Terminal Amino Acid
 Hydrazinolysis
 hydrazine at 100°C cleaves all peptide bonds forming
hydrazides except for the carboxyl terminal
 C-terminus reduced with LiAlH4
 forms amino alcohol at C-terminus
 Carboxypeptidases
 enzymatic removal of C-terminus
 Carboxypeptidase A: all except proline, arginine and lysine
 Carboxypeptidase B: only arginine and lysine
 Carboxypeptidase C: any amino acid
 care required since rate of removal varies with the type of
amino acid
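
The carboxypeptidase specificities listed above can be written as a small lookup. This is only a sketch of the rules as stated on this slide; the function name is hypothetical, and it ignores the differences in removal rate that the slide warns about.

```python
def carboxypeptidases_for(c_terminal_residue):
    """Return the carboxypeptidases (A, B, C) that remove this C-terminal
    residue, following the specificity rules stated above."""
    residue = c_terminal_residue.upper()
    enzymes = []
    if residue not in {"P", "R", "K"}:
        enzymes.append("A")  # A: all except proline, arginine, lysine
    if residue in {"R", "K"}:
        enzymes.append("B")  # B: only arginine and lysine
    enzymes.append("C")      # C: any amino acid
    return enzymes


print(carboxypeptidases_for("K"))
print(carboxypeptidases_for("L"))
```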
Peptide Fragments

 After determining the amino acid composition
and the N & C-terminal amino acids, at least
two different sets of protein fragments are
needed for sequencing

Why Use Fragments?

 Why is the protein broken into fragments? Why
isn’t the protein sequenced directly?
 The sequencing methods currently available are
only accurate for peptides up to about 20-30
amino acids, 60 under ideal conditions

Why 2 Sets of
Fragments?
 Why can't the entire protein amino acid
sequence be determined from a single set of
peptide fragments obtained by cleavage with a
single reagent?
 There’s no way to determine how the fragments
are connected with just one set
 A second or third set of fragments is used to
deduce how the fragments are connected by
identification and comparison of overlapping
sequences
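
The overlap argument can be sketched with a toy example. The peptide and both fragment sets below are hypothetical, and the brute-force search over all orderings is an illustrative simplification that is practical only for a handful of fragments.

```python
from itertools import permutations


def order_fragments(fragments, overlapping_fragments):
    """Return every ordering of `fragments` (joined into one string) that
    contains all fragments from the second, overlapping digest."""
    solutions = []
    for order in permutations(fragments):
        candidate = "".join(order)
        if all(piece in candidate for piece in overlapping_fragments):
            solutions.append(candidate)
    return solutions


# Hypothetical fragments of one peptide from two different cleavages.
set_one = ["SW", "AGFK", "MLYR"]   # cleaved after K and R
set_two = ["KMLY", "AGF", "RSW"]   # cleaved after F and Y
print(order_fragments(set_one, set_two))
```

With the first set alone, all six orderings are equally plausible. The second set spans the junctions and leaves a single consistent order.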

Protein Cleavage
Reagents
 What types of reagents are best suited for
preparing these sets of fragments?
 Reagents that cleave the protein chain only at a
few specific sites forming fragments that are
less than 20-30 amino acids in length

Protein Cleavage
Reagents
 Chemical or enzymatic reagents can be used to
prepare protein fragments
 The most commonly used reagents are:
 cyanogen bromide
 various enzymes including
 trypsin
 chymotrypsin
 clostripain
 Staphylococcal protease
 various endopeptidases

Cyanogen Bromide

[Figure: mechanism of cyanogen bromide reaction with the methionine sulfur, leading to cleavage of the peptide chain]

 At which amino acid in the protein sequence does the
reagent, cyanogen bromide, cleave protein chains?
 At internal methionines by reaction with the methionine
sulfur as illustrated above
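
A minimal sketch of where cyanogen bromide cuts, assuming cleavage on the carboxyl side of every methionine in a one-letter sequence. The peptide and function name are hypothetical.

```python
def cleave_after(sequence, residues):
    """Split a one-letter sequence after each occurrence of `residues`."""
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue in residues:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


# Hypothetical peptide. In the real reaction the methionine at each new
# C-terminus is converted to homoserine lactone; the letter M is kept
# here only for simplicity.
cnbr_fragments = cleave_after("GAMKLFMRSW", "M")
print(cnbr_fragments)
```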
Trypsin & Chymotrypsin

 Where in the protein sequence do the enzymes,
trypsin and chymotrypsin cleave protein
chains?
 trypsin cleaves at the carboxyl side of amino
acids with positively charged side chains such
as lysine and arginine
 chymotrypsin cleaves at the carboxyl side of
amino acids with aromatic side chains such as
phenylalanine and tyrosine
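
The two specificities can be compared on the same peptide with a short sketch. The peptide, the `SPECIFICITY` table and the `digest` function are hypothetical. Tryptophan is included with the aromatic residues as an assumption, and real enzymes have further preferences (for example, cleavage before proline is usually poor) that are not modeled.

```python
# Residues after which each enzyme cuts (carboxyl side), as a simplification.
SPECIFICITY = {
    "trypsin": "KR",         # positively charged side chains
    "chymotrypsin": "FYW",   # aromatic side chains (W is an assumption)
}


def digest(sequence, enzyme):
    """Return the fragments produced by cutting after each target residue."""
    sites = SPECIFICITY[enzyme]
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        # A target residue at the very end leaves nothing to cut off.
        if residue in sites and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


peptide = "AGFKMLYRSW"  # hypothetical
print(digest(peptide, "trypsin"))
print(digest(peptide, "chymotrypsin"))
```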

Clostripain

 Where in the protein sequence does the
enzyme, clostripain, cleave?
 prefers positively charged amino acids, arginine
even more than lysine
 narrower specificity than trypsin
 which enzyme is likely to produce larger
fragments?

Staphylococcal Protease

 Where in the protein sequence does the
enzyme, Staphylococcal protease cleave?
 carboxyl side of acidic amino acids in
phosphate buffer
 in acetate or bicarbonate buffer it is more
specific and cleaves only at glutamic acid
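
The buffer dependence described above can be sketched as a choice of cleavage residues. The peptide and function names are hypothetical, and the model assumes complete cleavage at every matching site.

```python
def staph_protease_sites(buffer):
    """Return the residues cut after, depending on the buffer (per the slide)."""
    if buffer == "phosphate":
        return "DE"   # both acidic amino acids
    if buffer in ("acetate", "bicarbonate"):
        return "E"    # glutamic acid only
    raise ValueError(f"no rule stated for buffer: {buffer}")


def staph_digest(sequence, buffer):
    """Split a one-letter sequence after each residue allowed by the buffer."""
    sites = staph_protease_sites(buffer)
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue in sites:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


peptide = "AEGDKLEFR"  # hypothetical
print(staph_digest(peptide, "phosphate"))
print(staph_digest(peptide, "acetate"))
```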

Endopeptidases

 The following endopeptidases are less specific
than the enzymes mentioned above
 Pepsin, papain, subtilisin, thermolysin, elastase
 (papain is the active ingredient in meat tenderizer, soft
contact lens cleaning solutions, some laundry detergents)
 These enzymes are most often used to further
reduce the size of large tryptic or chymotryptic
fragments

How are Peptide
Fragments Separated?
 Usually by column chromatography, often
HPLC
 Separations are most often based on differences
in polarity (reverse phase) or electric charge
(ion exchange)

Edman Degradation

 Edman degradation is most often used to
sequence the peptides
 It removes one amino acid from the N-terminal
end of the peptide during each cycle of the
procedure
 The removal of the N-terminal amino acid is
accomplished using the reagent,
phenylisothiocyanate

Edman Degradation

 Phenylisothiocyanate attaches to the N-terminal
amino acid
 The peptide amino nitrogen atom bonds to the
PITC carbon
 Sulfur then bonds to the peptide carboxyl carbon
breaking the peptide bond
 This cyclization forms a phenylthiohydantoin
derivative which is removed from the peptide
chain by treatment with anhydrous acid
 Identified by extraction, treatment with aqueous
acid and analysis by chromatography
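
The cyclic nature of the procedure can be sketched as a loop that releases one N-terminal residue per cycle and leaves the rest of the chain intact. The chemistry is not modeled. The peptide, the function name and the default cycle limit (taken from the low end of the 30-60 residue range quoted earlier) are assumptions for illustration.

```python
def edman_degradation(peptide, max_cycles=30):
    """Yield (cycle, released residue, remaining peptide) for each cycle."""
    remaining = peptide
    for cycle in range(1, max_cycles + 1):
        if not remaining:
            break
        # One cycle removes only the N-terminal residue.
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


# Hypothetical peptide.
for cycle, released, remaining in edman_degradation("AGFKM"):
    print(cycle, released, remaining)
```

Reading the released residues in cycle order gives the sequence from the N-terminus.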
Edman Degradation

[Figure: Edman degradation reaction scheme]

Disulfide Bridges

 The location of disulfide bridges can be
determined by diagonal electrophoresis
 Fragments with intact disulfide bonds are
electrophoresed in one dimension
 Treated with fumes of performic acid to cleave
disulfide bonds
 Then electrophoresed in the second dimension
 Fragments that had no disulfide bonds will be on
the diagonal
 Fragments that had disulfide bonds will migrate
off diagonal due to altered mobility
Mass Spectrometry

 Used for sequencing peptides


 Peptides are fragmented in the mass
spectrometer
 The fragments are identified by their
mass/charge ratio
 Peptide mixtures can be analyzed using a
temperature gradient
 The temperature gradient causes variation in
signals corresponding to different peptides
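
To make "mass/charge ratio" concrete, the sketch below computes the expected m/z of a small protonated peptide. The residue masses are standard monoisotopic values for a few amino acids only, and the peptide and function name are hypothetical. It does not model fragmentation inside the instrument.

```python
# Monoisotopic residue masses in daltons (small illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
    "K": 128.09496,
}
WATER = 18.01056   # added once per intact peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each added proton


def peptide_mz(sequence, charge=1):
    """Return m/z for a peptide carrying `charge` extra protons."""
    neutral_mass = sum(RESIDUE_MASS[r] for r in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


print(round(peptide_mz("GAS"), 4))
print(round(peptide_mz("GAS", charge=2), 4))
```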

Protein Sequencing
by DNA Sequencing
 In fact, while you have just learned how to
sequence a protein by chemical and enzymatic
degradation, protein sequences are now most
often determined by translating the
corresponding cloned genes
 This latter process is usually easier and quicker
once the gene corresponding to a given protein
has been identified
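
Translating a gene sequence into a protein sequence can be sketched in a few lines using the standard genetic code. The DNA string and function name are hypothetical. The sketch assumes the coding strand is given, starts reading at the first base, and stops at the first stop codon.

```python
# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence up to the first stop codon."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[start:start + 3].upper()]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATGGCCAAATAA"))
```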

Sequence Databases

 International databases of protein sequences
are maintained
 Many of these databases are accessible via the
internet
 Examples:
 GenBank
 Protein Identification Resource (PIR)
 European Molecular Biology Laboratory (EMBL)
Data Library
