Chemistry 365 Biochemistry
Individual Lab Unit #7
Sequence Determination Techniques for Proteins and Nucleic Acids

References:
Lehninger, A.L.; Nelson, D.L.; Cox, M.M. Principles of Biochemistry, 2nd ed.; Worth Publishers: New York, 1993.
Berg, J.; Tymoczko, J.; Stryer, L. Biochemistry, 5th ed.; W. H. Freeman: New York, 2002.

Introduction:

I. Determining Protein Amino Acid Sequence

Once the protein of interest has been extracted and purified, and its molar mass determined, the next step is to completely hydrolyze the protein (6 N HCl at 110 °C for 24 hours) and determine its amino acid composition. Amino acids in the hydrolysate are separated and identified by ion exchange chromatography on a sulfonated polystyrene resin. In most cases, the next step is to identify the first amino acid in the sequence, in other words the protein's amino terminal residue. The amino terminal is typically reacted with a labelling reagent, such as dabsyl chloride, that forms a bond stable to hydrolysis. The peptide is then hydrolyzed and the labelled amino acid identified.
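[Figure: Reaction of dabsyl chloride with the amino terminal of a peptide to give a dabsyl-amino acid (fluorescent).]

[Figure: Elution profile of amino acids from ion exchange chromatography. Horizontal axis: elution volume. Eluting buffers, in order: pH 3.25, 0.2 M citrate; pH 4.25, 0.2 M citrate; pH 5.28, 0.35 M citrate.]

Example peptide: E K S A M F L E (glu lys ser ala met phe leu glu)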

Peptides up to 50 residues long can be sequenced by a cyclic procedure in which the amino terminal is labelled, cleaved and identified, and the process is then repeated on the shortened chain. This procedure, the Edman degradation method, has been automated. The machine, called a sequenator, performs the reactions, separations and identifications, and records all results.
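The cyclic logic of the Edman degradation can be illustrated with a minimal sketch. It models only the bookkeeping (label, release and identify, repeat), not the chemistry; the function name `edman_degradation` and the `max_cycles` parameter are hypothetical names chosen for this illustration, and the 50-cycle default simply mirrors the practical limit quoted above.

```python
def edman_degradation(peptide, max_cycles=50):
    """Simulate Edman cycles on a peptide given as a list of residue names.

    Returns (identified, remaining): the residues read from the amino
    terminal, in order, and whatever chain is left after max_cycles.
    """
    remaining = list(peptide)
    identified = []
    for _ in range(max_cycles):
        if not remaining:
            break
        # One cycle: the amino terminal residue is labelled, released
        # from the chain, and identified.
        n_terminal = remaining.pop(0)
        identified.append(n_terminal)
    return identified, remaining


peptide = ["ser", "ala", "met", "phe", "leu", "glu"]
sequence, leftover = edman_degradation(peptide)
print(" ".join(sequence))
```

Each pass through the loop shortens the chain by one residue, which is why the read order matches the sequence from the amino terminal onward.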

[Figure: EDMAN DEGRADATION of ser ala met phe leu glu. Phenylisothiocyanate labels the amino terminal residue (residue 1 of 1 2 3 4 5). The labelled residue is released and identified, leaving 2 3 4 5. The label, release and identify cycle is then repeated on the shortened chain.]

Endopeptidases and Sequencing by Fragment Overlap

Trypsin hydrolyzes peptide bonds on the C side of lys, arg.
Chymotrypsin hydrolyzes peptide bonds on the C side of phe, tyr, trp.

Peptide: E K S A M F L E (glu lys ser ala met phe leu glu)
Resulting peptides, trypsin: glu lys | ser ala met phe leu glu (EK, SAMFLE)
Resulting peptides, chymotrypsin: glu lys ser ala met phe | leu glu (EKSAMF, LE)
Deduce sequence by comparing fragments: must be glu lys ser ala met phe leu glu.
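The fragment-overlap reasoning for the example peptide E K S A M F L E can be sketched in a few lines. This is a simplified illustration, not a description of laboratory software: the helper names `cleave` and `order_by_overlap` are hypothetical, the cleavage rules are reduced to "cut on the C side of the listed residues" (real enzymes have further exceptions), and the brute-force ordering is only practical for a handful of fragments.

```python
from itertools import permutations

# One-letter codes for the residues each enzyme cuts after (C side).
TRYPSIN_SITES = {"K", "R"}            # lys, arg
CHYMOTRYPSIN_SITES = {"F", "Y", "W"}  # phe, tyr, trp


def cleave(sequence, sites):
    """Cut the sequence after every residue found in `sites`."""
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue in sites:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def order_by_overlap(fragments, overlapping):
    """Return every ordering of `fragments` whose joined sequence
    contains all fragments of the second digest (`overlapping`)."""
    candidates = set()
    for order in permutations(fragments):
        joined = "".join(order)
        if all(piece in joined for piece in overlapping):
            candidates.add(joined)
    return sorted(candidates)


peptide = "EKSAMFLE"
tryptic = cleave(peptide, TRYPSIN_SITES)            # EK, SAMFLE
chymotryptic = cleave(peptide, CHYMOTRYPSIN_SITES)  # EKSAMF, LE
print(order_by_overlap(tryptic, chymotryptic))
```

In practice the fragments come out of the separation in no particular order; the second digest is what rules out the alternative arrangement SAMFLE followed by EK.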

Proteins longer than 50 or so residues must be sequenced by an additional approach. The protein chain is broken into smaller fragments by site specific chemical reagents or endopeptidases such as trypsin and chymotrypsin. The resulting segments are separated and sequenced. The correct ordering of these sequences is made possible by repeating the procedure with chemical reagents or enzymes that cleave specifically at different sites than previously. This second analysis should yield a set of different fragments which, when compared to the first set and "overlapped", will allow the deduction of the complete protein amino acid sequence.

Recently, the development of recombinant gene technology and DNA sequencing methods has made it possible to sequence proteins via an alternative strategy. If the gene for a given protein can be isolated, and the nucleic acid sequence determined, the protein amino acid sequence can be deduced through the genetic code.

II. Determining Nucleic Acid Nucleotide Sequence

The development of a technique by Frederick Sanger has made it relatively easy to sequence large DNA molecule fragments. The method depends on the abilities to:
1) find two appropriate DNA primers that border the target DNA fragment
2) synthesize the complementary strand to the target utilizing the primers, DNA polymerase and deoxynucleotides
3) generate four sets of labelled complementary DNA fragments, that differ in length by only one nucleotide, with each set generated by a specific reaction including a dideoxynucleotide
4) separate labelled complementary DNA fragments by electrophoresis.

[Figure: DNA SEQUENCING by the SANGER (dideoxy) METHOD. DNA polymerase extends the primer (5' to 3') along the template (target) strand using deoxynucleotides (dATP and dGTP shown). A dideoxynucleotide (ddATP shown) lacks the 3' OH group, so the chain cannot be extended past it.]

[Figure: Cycle Sequencing. The primer is annealed to the template (3' GAACTCGACT 5') and extended in four reactions, each containing dATP, dCTP, dGTP and dTTP plus one of ddATP, ddCTP, ddGTP or ddTTP.]

Fragments produced by each reaction:
ddATP: PRIMER-CTTGAGCTGA, PRIMER-CTTGA
ddCTP: PRIMER-CTTGAGC, PRIMER-C
ddGTP: PRIMER-CTTGAGCTG, PRIMER-CTTGAG, PRIMER-CTTG
ddTTP: PRIMER-CTTGAGCT, PRIMER-CTT, PRIMER-CT

ELECTROPHORESIS (lanes A, C, G, T; fragments migrate from - to +):
Length  Fragment            Lane
10      PRIMER-CTTGAGCTGA   A
9       PRIMER-CTTGAGCTG    G
8       PRIMER-CTTGAGCT     T
7       PRIMER-CTTGAGC      C
6       PRIMER-CTTGAG       G
5       PRIMER-CTTGA        A
4       PRIMER-CTTG         G
3       PRIMER-CTT          T
2       PRIMER-CT           T
1       PRIMER-C            C

In the example above, the labelled complementary DNA fragment, pCTTGAGCTGA, can be produced from the target pTCAGCTCAAG by four different dideoxynucleotide reactions, each yielding a dideoxynucleotide at the 3' end of fragments that all begin at the same starting point. The final result is that there will be one labelled complementary DNA fragment for each length up to the total number of bases, and each of these fragments ends with the base at that number position in the sequence of the labelled complementary DNA fragment. The four sets of labelled fragments are electrophoretically separated side-by-side, and the pattern of bands produced directly yields the nucleotide sequence. The "dideoxynucleotide A" electrophoresis lane will show fragments 5 and 10 nucleotides long, the "dideoxynucleotide C" lane will have fragments 1 and 7 nucleotides long, the "dideoxynucleotide G" lane will have fragments 4, 6 and 9 nucleotides long, and the "dideoxynucleotide T" lane will have fragments 2, 3 and 8 nucleotides long. The sequence of the original target DNA fragment may now be deduced from the complementary fragment using base pairing rules.
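The lane pattern for the target pTCAGCTCAAG and the read-back of the sequence can be checked with a short sketch. It models only the logic of the dideoxy method (which fragment lengths end in which base), not the enzymology; the function names `reverse_complement`, `dideoxy_lanes` and `read_gel` are hypothetical names introduced for this illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the base-paired strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def dideoxy_lanes(target):
    """For each dideoxynucleotide reaction, list the fragment lengths
    (in nucleotides added to the primer) expected in that lane."""
    new_strand = reverse_complement(target)  # strand made by polymerase
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length ends in `base`, so it appears in
        # the lane of the matching dideoxynucleotide.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the bands from shortest to longest fragment."""
    by_length = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


target = "TCAGCTCAAG"
lanes = dideoxy_lanes(target)
complementary = read_gel(lanes)
recovered_target = reverse_complement(complementary)

print(lanes)
print(complementary, recovered_target)
```

Reading the gel from the shortest fragment upward gives the complementary strand 5' to 3'; applying the base pairing rules and reversing the direction then recovers the original target.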