
Structural Analysis of Protein Structure

Circular Dichroism, Fluorescence, X-ray, NMR

Methods for Secondary Structural Analysis


A number of experimental techniques can selectively examine certain general aspects of macromolecular structure with relatively little investment of time and sample. Reasonable estimates of protein secondary structure content can be determined empirically through the use of:
Circular dichroism (CD) spectroscopy
Nuclear Magnetic Resonance (NMR) spectroscopy
FT-infrared spectroscopy

Circular Dichroism
Circular dichroism (CD) spectroscopy is a form of light absorption spectroscopy that measures the difference in absorbance of right- and left-circularly polarized light (rather than the commonly used absorbance of isotropic light) by a substance. It is measured with a CD spectropolarimeter. The instrument needs to be able to measure accurately in the far UV (170 - 260 nm), that is, at wavelengths down to 190 - 170 nm. The difference between left- and right-handed absorbance, A(l) - A(r), is very small (usually in the range of 0.0001), corresponding to an ellipticity of a few thousandths to a few hundredths of a degree.

Physics of CD
Linearly polarized light can be viewed as a superposition of two opposite circularly polarized components of equal amplitude and phase. A projection of the combined amplitudes perpendicular to the propagation direction thus yields a line. When this light passes through an optically active sample with a different absorbance A for the two components, the amplitude of the more strongly absorbed component will be smaller than that of the less absorbed component. The consequence is that a projection of the resulting amplitude yields an ellipse instead of the usual line, while the polarization direction has not changed. The occurrence of ellipticity is called circular dichroism.

Rotation of Plane-polarized Light by an Optically Active Sample

A Pockels cell produces a beam that is alternately switched between left (L) and right (R) circular polarization. The beam then passes through the sample to a photomultiplier. The detected signal can then be processed as ΔA vs. λ.

Physical Principles of CD
Inherently asymmetric chromophores (uncommon) or symmetric chromophores in asymmetric environments interact differently with right- and left-circularly polarized light. Because the extinction coefficients for the two polarized rays differ, the rays are absorbed to different extents at some wavelengths; this difference is called circular dichroism (CD). Circular dichroism can only occur within a normal absorption band.

Instrumentation
The most common instruments are the currently produced JASCO, JobinYvon, OLIS, and AVIV models. We have the Jasco 710 and 810 models with temperature controllers. The air-cooled 150 W xenon lamp does not need water cooling, but you still need to purge with ample nitrogen to reach lower wavelengths (below 190 nm).

Typical Initial Concentrations


Protein Concentration: 0.5 mg/ml (the protein concentration needs to be adjusted to produce the best data).
Cell Path Length: 0.5 - 1.0 mm. If absorption poses a problem, cells with a shorter path (0.1 mm) and a correspondingly increased protein concentration and longer scan time can be employed.
Stabilizers (metal ions, etc.): minimum.
Buffer Concentration: 5 mM or as low as possible, while maintaining protein stability. A typical buffer used in CD experiments is 10 mM phosphate, although low concentrations of Tris, perchlorate, or borate are also acceptable.
As a general rule of thumb, the total absorbance of the cell, buffer, and protein should be between 0.4 and 1.0 (theoretically, 0.87 is optimal).
A spectrum for secondary structure determination (260 - 178 nm) will require 30 - 60 minutes to record (plus an equivalent amount of time for a baseline, as with every CD spectrometer).

Sample Preparation and Measurement


Additives, buffers and stabilizing compounds: Any compound that absorbs in the region of interest (250 - 190 nm) should be avoided. A buffer, detergent, imidazole, or other chemical should not be used unless it can be shown that the compound in question will not mask the protein signal.
Protein solution: The protein solution should contain only those chemicals necessary to maintain protein stability/solubility, and at the lowest concentrations possible. The protein itself should be as pure as possible; any additional protein will contribute to the CD signal.
Contaminants: Particulate matter (scattering particles) and anything else that adds significant noise (or artificial signal contributions) to the CD spectrum must be avoided. Filtering of the solutions (0.02 µm syringe filters) may improve the signal-to-noise ratio.
Data collection: Initial experiments are useful to establish the best conditions for the "real" experiment. Cells of 0.5 - 1.0 mm path length offer a good starting point.

CD Data Analysis
The difference in absorption to be measured is very small. The differential absorption is usually a few hundredths to a few tenths of a percent, but it can be determined quite accurately. The raw data plotted on the chart recorder represent the ellipticity of the sample in radians, which can be easily converted into degrees (multiply by 180/π).

CD Data Analysis
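To be able to compare these ellipticity values we need to convert them into a normalized value. The unit most commonly used in protein and peptide work is the mean molar ellipticity per residue. We need to consider the path length l, the concentration c, the molecular weight M, and the number of residues n_r:
\[ [\theta] = \frac{\theta\,M}{c\,l\,n_r} \]
In proper units (CD spectroscopists use decimoles), with θ in degrees, c in mg/ml, and l in cm, this finally reduces to
\[ [\theta] = \frac{100\,\theta\,M}{c\,l\,n_r} \quad \text{(degrees cm}^2\ \text{dmol}^{-1}\ \text{res}^{-1}\text{)} \]
The values for mean molar ellipticity per residue are usually in the 10,000s.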
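The normalization can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular instrument software: the function name and the sample numbers are made up, and it assumes the raw ellipticity is in millidegrees, the concentration in mg/ml, and the path length in cm, using the common form [θ] = θ(mdeg) × (M / n_r) / (10 × c × l).

```python
def mean_residue_ellipticity(theta_mdeg, mol_weight, n_residues, conc_mg_ml, path_cm):
    """Return the mean residue ellipticity in deg cm^2 dmol^-1 res^-1.

    theta_mdeg  raw ellipticity in millidegrees
    mol_weight  molecular weight in g/mol
    n_residues  number of residues in the chain
    conc_mg_ml  protein concentration in mg/ml
    path_cm     cell path length in cm
    """
    mean_residue_weight = mol_weight / n_residues
    return theta_mdeg * mean_residue_weight / (10.0 * conc_mg_ml * path_cm)


# Hypothetical sample: 11 kDa protein, 100 residues, 0.5 mg/ml, 1 mm cell, -20 mdeg signal.
example = mean_residue_ellipticity(
    theta_mdeg=-20.0, mol_weight=11000.0, n_residues=100, conc_mg_ml=0.5, path_cm=0.1
)
print(round(example))
```

Keeping the units in the argument names makes the most common mistake (mixing mm and cm path lengths) easier to spot.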

CD Data Analysis
The molar ellipticity [θ] is related to the difference in extinction coefficients Δε by [θ] = 3298 Δε. Here [θ] has the standard units of degrees cm² dmol⁻¹, which is the same as degrees deciliters mol⁻¹ decimeter⁻¹.
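The factor 3298 can be checked numerically. The sketch below assumes the standard small-signal relation between ellipticity and differential absorbance, θ (radians) = ΔA × ln(10) / 4, which is not written out on the slide; the constant and function names are illustrative only.

```python
import math

# Ellipticity in degrees produced by a differential absorbance delta_A:
# theta = delta_A * ln(10) / 4 radians, converted to degrees.
DEG_PER_DELTA_A = math.log(10) / 4 * 180 / math.pi   # about 32.98

# Molar ellipticity is expressed per decimole, [theta] = 100 * theta / (C * l),
# so the factor relating [theta] to delta-epsilon is 100 times larger.
MOLAR_FACTOR = 100 * DEG_PER_DELTA_A                  # about 3298


def ellipticity_deg(delta_a):
    """Ellipticity in degrees for a measured differential absorbance."""
    return DEG_PER_DELTA_A * delta_a


def molar_ellipticity(delta_epsilon):
    """Molar ellipticity (deg cm^2 dmol^-1) from delta-epsilon (M^-1 cm^-1)."""
    return MOLAR_FACTOR * delta_epsilon


print(round(MOLAR_FACTOR, 1))
```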

Circular Dichroism of Proteins


It has been shown that CD spectra between 260 and approximately 180 nm can be analyzed for the different secondary structural types: alpha helix, parallel and antiparallel beta sheets, turns, and other. A number of excellent review articles are available describing the technique and its application (Woody, 1985 and Johnson, 1990). Modern secondary structure determinations by CD are reported to achieve accuracies of 0.97 for helices, 0.75 for beta sheet, 0.50 for turns, and 0.89 for other structure types (Manavalan & Johnson, 1987).

CD Signal of Proteins
For proteins we will be mainly concerned with absorption in the ultraviolet region of the spectrum from the peptide bonds (symmetric chromophores) and amino acid sidechains. Protein chromophores can be divided into three classes: the peptide bond, the amino acid sidechains, and any prosthetic groups. The lowest energy transition in the peptide chromophore is an n → π* transition observed at 210 - 220 nm with very weak intensity (ε_max ~ 100).
[Diagram: energy levels π, n, and π* of the peptide chromophore. π → π* transition: ~190 nm, ε_max ~ 7000. n → π* transition: ε_max ~ 100. Band positions marked: 208 - 210 nm and 191 - 193 nm.]

Comparison of the UV absorbance (left) and the circular dichroism (right) of poly-L-lysine in different secondary structure conformations as a function of pH. The n → π* transition appears in the α-helical form of the polymer as a small shoulder near 220 nm on the tail of a much stronger absorption band centered at 190 nm. This intense band, responsible for the majority of the peptide bond absorbance, is a π → π* transition (ε_max ~ 7000). Using CD, these different transitions are more clearly evident. Exciton splitting of the π → π* transition results in the negative band at 208 nm and the positive band at 192 nm.

CD Spectra of Proteins
Different secondary structures of peptide bonds have different relative intensities of n → π* transitions, resulting in different CD spectra in the far-UV region (180 - 260 nm). CD is very sensitive to changes in the secondary structure of proteins and is commonly used to monitor their conformational changes. The CD spectrum is additive. The amplitude of the CD curve is a measure of the degree of asymmetry. The helical content of peptides and proteins can be estimated from the CD signal at 222 nm, taking [θ]222 = -33,000 degrees cm² dmol⁻¹ res⁻¹ for a fully helical chain. Several curve-fitting algorithms can be used to deconvolute the relative secondary structure content of a protein, using the CD spectra of proteins with known structures as references.
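A rough helix estimate from the 222 nm signal can be sketched as follows. This is a simplified illustration only: it assumes the reference value of 33,000 degrees cm² dmol⁻¹ res⁻¹ applies to a fully helical chain with a negative sign, and that non-helical structure contributes nothing at 222 nm. The function name is made up.

```python
# Reference mean residue ellipticity at 222 nm for a fully helical chain.
# Magnitude taken from the slide; the negative sign is an assumption.
FULL_HELIX_THETA_222 = -33000.0  # deg cm^2 dmol^-1 res^-1


def helix_fraction(theta_222):
    """Estimate fractional helicity from the mean residue ellipticity at 222 nm.

    The result is clamped to the physically meaningful range 0..1.
    """
    fraction = theta_222 / FULL_HELIX_THETA_222
    return min(1.0, max(0.0, fraction))


print(helix_fraction(-16500.0))
```

A single-wavelength estimate like this is only a first check; the curve-fitting methods use the whole spectrum.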

Protein CD Signal
The three aromatic side chains that occur in proteins (phenyl group of Phe, phenolic group of Tyr, and indole group of Trp) also have absorption bands in the ultraviolet spectrum. However, in proteins, their contributions to the CD spectra in the far UV (where secondary structural information is located) are usually negligible. Aromatic residues, if unusually abundant, can have significant effects on the CD spectra in the region < 230 nm, complicating analysis. The disulfide group is an inherently asymmetric chromophore, as it prefers a gauche conformation, with a broad CD absorption around 250 nm.

Far UV CD Spectra of Proteins


[Figure: far-UV CD spectra; y-axis [θ] × 10⁻³ (degrees cm² dmol⁻¹).]

Each of the three basic secondary structures of a polypeptide chain (helix, sheet, coil) shows a characteristic CD spectrum. A protein consisting of these elements should therefore display a spectrum that can be deconvoluted into the three individual contributions.

CD Spectra of Protein

CD Spectra Fit
In a first approximation, a CD spectrum of a protein or polypeptide can be treated as a sum of three components: α-helical, β-sheet, and random coil contributions to the spectrum. At each wavelength, the ellipticity (θ) of the spectrum will contain a linear combination of these components:
\[ \theta_T = \chi_h\,\theta_h + \chi_s\,\theta_s + \chi_c\,\theta_c \qquad (1) \]
θ_T is the total measured ellipticity, θ_h the contribution from helix, θ_s from sheet, and θ_c from coil, and χ is the corresponding fraction of each contribution.

CD Spectra Fit
As we have three unknowns in this equation, a measurement at 3 points (different wavelengths) would suffice to solve the problem for χ, the fraction of each contribution to the total measured signal. We usually have many more data points available from our measurement (e.g., a whole CD spectrum, sampled at 1 nm intervals from 190 to 250 nm). In this case, we can try to minimize the total deviation between all data points and calculated model values. This is done by a minimization of the sum of residuals squared (s.r.s.), which looks as follows in our case:
\[ \text{s.r.s.} = \sum_{\lambda} \left[ \theta_T(\lambda) - \chi_h\,\theta_h(\lambda) - \chi_s\,\theta_s(\lambda) - \chi_c\,\theta_c(\lambda) \right]^2 \]
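The minimization can be sketched in plain Python by solving the normal equations of the linear least-squares problem. The three basis spectra below are made-up Gaussian shapes, not real reference data, and all names are illustrative; the point is only to show that the fit recovers the fractions used to build a synthetic spectrum.

```python
import math


def gaussian(x, center, width, height):
    return height * math.exp(-((x - center) / width) ** 2)


wavelengths = list(range(190, 251))  # 1 nm steps from 190 to 250 nm

# Made-up basis spectra (NOT measured reference spectra).
helix = [gaussian(w, 192, 6, 70000) + gaussian(w, 208, 6, -33000)
         + gaussian(w, 222, 8, -35000) for w in wavelengths]
sheet = [gaussian(w, 196, 6, 30000) + gaussian(w, 217, 9, -18000) for w in wavelengths]
coil = [gaussian(w, 198, 6, -40000) + gaussian(w, 218, 10, 4000) for w in wavelengths]
basis = [helix, sheet, coil]


def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def fit_fractions(theta_total, basis_spectra):
    """Minimize the sum of residuals squared via the normal equations."""
    normal = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis_spectra]
              for bi in basis_spectra]
    rhs = [sum(p * t for p, t in zip(bi, theta_total)) for bi in basis_spectra]
    return solve_linear(normal, rhs)


# Synthetic "measured" spectrum built from known fractions.
true_fractions = [0.5, 0.3, 0.2]
theta_total = [sum(f * b[k] for f, b in zip(true_fractions, basis))
               for k in range(len(wavelengths))]

fitted = fit_fractions(theta_total, basis)
print([round(f, 3) for f in fitted])
```

This unconstrained fit does not force the fractions to be non-negative or to sum to one; practical fitting programs usually add such constraints.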

Using CD to Monitor Tertiary (3°) Structure of Proteins


CD bands in the near-UV region (260 - 350 nm) are observed in a folded protein where aromatic sidechains are immobilized in an asymmetric environment. The CD of aromatic residues is very small in the absence of ordered structure (e.g. short peptides). The signs, magnitudes, and wavelengths of aromatic CD bands cannot be calculated; they depend on the immediate structural and electronic environment of the immobilized chromophores. The near-UV CD spectrum has very high sensitivity for the native state of a protein. It can be used as a fingerprint of the correctly folded conformation.

Domain 1 of CD2

CD2 is a cell adhesion molecule. Domain 1 of CD2 has an IgG fold. Nine β-strands form a beta-sandwich structure. Two Trp residues, W-7 and W-32 (green), are located in the exposed and buried regions of the protein, respectively. Our lab has used CD2 as a model system to understand the conformational flexibility of proteins.

CD2 is Stable from pH 1 to 10


[Figure: far-UV CD spectra of CD2; [θ] (deg cm² dmol⁻¹ res⁻¹) vs. wavelength (200 - 260 nm), y-axis range -3000 to 1000.]

Conformational Change of CD2


[Figure: far-UV CD spectra of CD2 at 25 °C, at 85 °C, and in 6 M GuHCl; [θ] (deg cm² dmol⁻¹) vs. wavelength (200 - 260 nm).]

CD2 Becomes Significantly Helical in TFE


[Figure: far-UV CD spectra of CD2 in 0%, 10%, 17%, 19%, 30%, and 80% TFE; [θ] (deg cm² dmol⁻¹ res⁻¹) vs. wavelength (200 - 260 nm), y-axis range -2 × 10⁴ to 5000.]

Near UV CD Spectra of CD2


[Figure: near-UV CD spectra of CD2 at 25 °C, at 85 °C, and in 6 M GuHCl; [θ] (deg cm² dmol⁻¹) vs. wavelength (260 - 360 nm).]

CD2 loses its native, well-packed tertiary structure at high temperature and in 6 M GuHCl.

CD2 Loses Its Tertiary Structure in TFE

[Figure: near-UV CD spectra of CD2 in 0%, 10%, 17%, and 30% TFE; [θ] (deg cm² dmol⁻¹ res⁻¹) vs. wavelength (260 - 320 nm).]

Trp Fluorescence Emission Spectra of CD2 under Different Conditions


[Figure: Trp fluorescence emission spectra of CD2 at 25 °C, at 85 °C, and in 6 M GuHCl, together with a curve labeled Trp; fluorescence intensity vs. wavelength (300 - 400 nm).]

In a hydrophobic environment (inside of a folded protein), Trp emission occurs at shorter wavelength. When it is exposed to solvent, its emission is very similar to that of the free Trp amino acid (red shift occurs).

Secondary Structure Prediction of CD2


Strand labels from the X-ray structure (x-structure), in sequence order: A, B, C, C', C", D, E, F, G.

Residues 1 - 51
Rat CD2  RDSGTVWGALGHGINLNIPNFQMTDDIDEVRWERGSTLVAEFKRKMKPFLK
PHD      CCCCSSSSCCCCCSSSCCCCCCCCCCHHHHHHHHCCHHHHHHHHHCCCCSS
GOR      CCCCSSSSSSSCCCSCCCCCCCCCCCHCHSSHHHCCHHHHHHHHHHHHHHH
SOPMA    CCCCSSHCCCCCCSSSCCCCCCCCCCCCHSSHHCCCSHHHHHHHHHHHHHC

Residues 52 - 99
Rat CD2  SGAFEILANGDLKIKNLTRDDSGTYNVTVYSTNGTRILNKALDLRILE
PHD      CCCSSSSSCCCSSSCCCCCCCCCCSSSSSSCCCHHHHHHHHCCCCCCC
GOR      HHHHHHHHHHHHHHHSSSSCCCCSSSSSSSSCCCCSSHHHHHHHHHHH
SOPMA    CCCSSSSCCCCSSSSSSCCCCCCCSSSSSSSCCCCSSSSHHHHHSSHC

S = β-sheet, C = coil, H = α-helix.
[Figure row: x-structure, drawn with symbols for β-sheet and 3-10 helix.]

CD2 vs. Helical Propensity


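Residues on strands C, C', C" and G have strong helical propensity.
[Figure: structure of CD2 domain 1 with the N-terminus and strands B, C', C", D, and E labeled, and residues L16, W32, V39, F42, Y76, and V78 marked.]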

Summary of CD

Circular dichroism spectroscopy is used to gain information about the secondary structure and folded state of proteins and polypeptides in solution.

Benefits:
Uses very little sample (200 µl of 0.5 mg/ml solution in standard cells)
Non-destructive
Relative changes due to the influence of the environment on the sample (pH, denaturants, temperature, etc.) can be monitored accurately

Drawbacks:
Interference with solvent absorption in the UV region
Only very dilute, non-absorbing buffers allow measurements below 200 nm
Absolute measurements are subject to a number of experimental errors
Average accuracy of fits is about +/- 10%
CD spectropolarimeter is relatively expensive

X-ray Crystallography
X-rays are electromagnetic radiation at short wavelengths, emitted when electrons jump from a higher to a lower energy state.
Workflow: growth of crystals → X-ray diffraction → heavy-metal complex → build model → refinement.

[Figure: the X-ray crystallography cycle, linking crystallization, data collection, data processing, structure analysis, and model refinement to drug design information.]

http://www-structure.llnl.gov/xray/101index.html; http://www.aps.anl.gov/aps/frame_home.html

Crystal

A crystal is built up from many billions of small identical units, or unit cells. These unit cells are packed against each other in three dimensions, much as identical boxes are packed and stored in a warehouse. The unit cell may contain one or more than one molecule. Although the number of molecules per unit cell is always the same for all the unit cells of a single crystal, it may vary between different crystal forms of the same protein. The diagram shows in two dimensions several identical unit cells, each containing two objects packed against each other. The two objects within each unit cell are related by twofold symmetry to illustrate that each unit cell in a protein crystal can contain several molecules that are related by symmetry to each other.

Many small identical blocks or unit cells are packed against each other in 3D.

In order to obtain a crystal, molecules must assemble into a periodic lattice.

Each unit cell can contain several molecules that are related by symmetry.
The diagram shows identical blocks, each containing two objects packed against each other.

www.via.ecp.fr/~im/musee/escher.html

Crystals & X-ray Diffraction


enzyme RuBisCo

Well-ordered protein crystals (a) diffract x-rays and produce diffraction patterns that can be recorded on film (b) (Laue photograph). The diffraction pattern was obtained using polychromatic radiation from a synchrotron source in the wavelength region 0.5 to 2.0 Å.

Protein Crystal Packing


Protein crystals contain large channels and holes filled with solvent molecules. The subunits (colored disks) of glycolate oxidase form octamers of molecular weight around 300 kDa, with a hole in the middle of each of about 15 Å in diameter. Between the molecules there are channels (white) ~70 Å in diameter through the crystal.

The Hanging-drop Method of Protein Crystallization


About 10 µl of a 10 mg/ml protein solution in a buffer with added precipitant (such as ammonium sulfate, at a concentration below that at which it causes the protein to precipitate) is put on a thin glass plate that is sealed upside down on the top of a small container. In the container there is about 1 ml of concentrated precipitant solution. Equilibrium between the drop and the container is slowly reached through vapor diffusion: the precipitant concentration in the drop is increased by loss of water to the reservoir, and once the saturation point is reached the protein slowly comes out of solution. If other conditions such as pH and temperature are conducive, protein crystals will form in the drop.

A Diffraction Experiment

When the X-ray beam goes through the crystal, it is diffracted and the diffraction pattern is recorded on a detector. The crystal is rotated by a certain angle while each pattern is recorded, and a series of frames is collected. The size of the unit cell is determined from Bragg's law: 2d sin θ = λ, so d = λ/(2 sin θ).

http://www-structure.llnl.gov/Xray/101index.html

A Diffraction Experiment
A narrow beam of x-rays (red) is taken out from the x-ray source through a collimating device. When the primary beam hits the crystal, most of it passes straight through, but some is diffracted by the crystal. These diffracted beams, which leave the crystal in many different directions, are recorded on a detector, either a piece of x-ray film or an area detector. The crystal was rotated one degree while this pattern was recorded. The pattern of RuBisCo was collected using polychromatic radiation.

Diffraction of X-rays by a Crystal


(a) When a beam of x-rays (red) shines on a crystal, all atoms in the crystal scatter x-rays in all directions. Most of these scattered x-rays cancel out, but in certain directions (blue arrow) they reinforce each other and add up to a diffracted beam. Different sets of parallel planes (b) can be arranged through the crystal so that each corner of all unit cells is on one of the planes of the set. X-ray diffraction can be regarded as reflection of the primary beam from sets of parallel planes in the crystal, separated by a distance d. The primary beam strikes the planes at an angle θ and the reflected beam leaves at the same angle, the reflection angle.

Diffraction of X-rays by a Crystal


X-rays (red) that are reflected from the lower plane have traveled farther than those from the upper plane by a distance BC + CD, which is equal to 2d sin θ. Reflection can only occur when this distance is equal to the wavelength λ of the x-ray beam (Bragg's law, 2d sin θ = λ). To determine the size of the unit cell, the crystal is oriented in the beam so that reflection is obtained from the specific set of planes in which any two adjacent planes are separated by the length of one of the unit cell axes. This distance, d, is then equal to λ/(2 sin θ). The wavelength, λ, of the beam is known since we use monochromatic radiation. The reflection angle, θ, can be calculated from the position of the diffracted spot on the film, where the crystal-to-film distance can be easily measured. The crystal is then reoriented, and the procedure is repeated for the other two axes of the unit cell.
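As a small worked example of Bragg's law, the sketch below computes the plane spacing d from a wavelength and a reflection angle. The function name is illustrative, and the numbers (copper K-alpha radiation at 1.5418 Å and a 10 degree reflection angle) are chosen only for demonstration.

```python
import math


def bragg_spacing(wavelength_angstrom, theta_deg):
    """Plane spacing d (in Angstrom) from Bragg's law, 2 d sin(theta) = lambda."""
    return wavelength_angstrom / (2.0 * math.sin(math.radians(theta_deg)))


# Illustrative values: Cu K-alpha wavelength, reflection angle of 10 degrees.
d = bragg_spacing(1.5418, 10.0)
print(round(d, 2))
```

Smaller reflection angles correspond to larger spacings, which is why the long axes of a protein unit cell give spots close to the primary beam.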

Diffraction of X-ray Beams

The reflection angle, θ, for a diffracted beam can be calculated from the distance (r) between the diffracted spot on a film and the position where the primary beam hits the film. From the geometry shown in the diagram, the tangent of the angle 2θ is r/A. A is the distance between crystal and film, which can be measured on the experimental equipment, while r can be measured on the film. Hence, θ can be calculated. The angle between the primary beam and the diffracted beam is 2θ, as can be seen on the enlarged insert to the right. It shows that this angle is equal to the angle between the primary beam and the reflecting plane plus the reflection angle, both of which are equal to θ.
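The film geometry translates directly into code. In this sketch the function name and the distances are made up; r and A only need to be in the same length unit.

```python
import math


def reflection_angle_deg(spot_distance, crystal_to_film):
    """Reflection angle theta in degrees from tan(2 theta) = r / A.

    spot_distance    r, distance of the spot from the primary-beam position
    crystal_to_film  A, distance between crystal and film (same unit as r)
    """
    two_theta = math.atan2(spot_distance, crystal_to_film)
    return math.degrees(two_theta) / 2.0


# Illustrative values: spot 50 mm from the beam center, film 100 mm from the crystal.
theta = reflection_angle_deg(50.0, 100.0)
print(round(theta, 2))
```

The resulting angle can then be inserted into Bragg's law to obtain the plane spacing.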

Properties of Diffracted Waves


Two diffracted beams are shown, each of which is defined by three properties: the amplitude, which is a measure of the strength of the beam and which is proportional to the intensity of the recorded spot; the phase, which is related to its interference, positive or negative, with other beams; and the wavelength, which is set by the x-ray source for monochromatic radiation. We need to know all three properties to determine the positions of the atoms giving rise to the diffracted beams.

Multiple Isomorphous Replacement (MIR)


Heavy atoms (strong diffraction) are introduced into the unit cell of the crystal to obtain phase information by soaking crystals in the metal solution. Intensity differences are used to deduce the positions of the heavy atoms in the crystal unit cell. Fourier summations of these intensity differences give Patterson maps of the vectors between the heavy atoms. From the positions of the heavy atoms in the unit cell, we can get amplitudes and phases. More than two different heavy-metal complexes are needed to give a reasonably good phase determination for all reflections.

Building a Model
The amplitudes and phases of the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. This map is then interpreted as a polypeptide chain with a particular amino acid sequence. The resolution (in Å) of the map is limited by errors in the map and by the resolution of the diffraction data. At low resolution (5 Å or higher), the shape of the molecule can be obtained. At medium resolution (~3 Å), the trace of the polypeptide chain, i.e. active site, can be obtained. At high resolution (2 Å), the amino acid sidechains can be resolved.

Electron-density maps at different resolution show more detail at higher resolution.


(d) 1.1 Å

Interpreting Electron-density Maps

The electron-density map is interpreted by fitting into it pieces of a polypeptide chain with known stereochemistry such as peptide groups and phenyl rings. The electron density is displayed on a graphics screen in combination with a part of the polypeptide chain (red) in an arbitrary orientation (a). The units of the polypeptide chain can then be rotated and translated relative to the electron density until a good fit is obtained (b).

High Resolution Crystal Structures


F. Liu

Reducing Errors by Refinement


In the refinement process, the model is changed to minimize the difference between the experimentally observed diffraction amplitudes and those calculated for a hypothetical crystal containing the model instead of the real molecules. The difference is called the R factor, with 0.0 being exact agreement and 0.59 total disagreement.
0.15 < R < 0.20: well-determined structure
R ~ 0.30: medium structure
R > 0.30: bad structure
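The R factor can be sketched with the conventional definition R = Σ| |F_obs| - |F_calc| | / Σ|F_obs|, which the slide does not write out. The amplitudes below are invented, both lists are assumed to be on the same scale, and the cut-offs in the helper are one possible reading of the ranges listed above.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor for matched amplitude lists."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def describe(r):
    """Rough quality label; the exact cut-offs are an interpretation."""
    if r <= 0.20:
        return "well determined"
    if r <= 0.30:
        return "medium"
    return "bad"


# Invented amplitudes for four reflections.
observed = [100.0, 80.0, 60.0, 40.0]
calculated = [90.0, 85.0, 66.0, 37.0]
r = r_factor(observed, calculated)
print(round(r, 3), describe(r))
```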

B-factor
ATOM   1   N    PRO  A  190   -0.567   24.363   16.753   49.28
ATOM   2   CA   PRO  A  190   -0.399   23.026   17.339   49.21
ATOM   3   C    PRO  A  190   -1.288   21.990   16.644   49.61
ATOM   4   O    PRO  A  190   -2.520   22.007   16.772   49.44

In the PDB file of an X-ray structure, each atom is described by four numbers: three coordinates and one quantity B, which is called the B-factor or temperature factor.
B < 20: well-defined regions
B > 40: atoms have high flexibility
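To give the B-factor a physical scale, the sketch below reads the abbreviated ATOM rows shown above and converts B into a root-mean-square displacement using the standard isotropic relation B = 8π²⟨u²⟩, which is not stated on the slide. The parser splits on whitespace and only suits these abbreviated rows; real PDB files use fixed columns and include an occupancy field. All function names are illustrative.

```python
import math

ROWS = """\
ATOM 1 N  PRO A 190 -0.567 24.363 16.753 49.28
ATOM 2 CA PRO A 190 -0.399 23.026 17.339 49.21
ATOM 3 C  PRO A 190 -1.288 21.990 16.644 49.61
ATOM 4 O  PRO A 190 -2.520 22.007 16.772 49.44
"""


def parse_row(line):
    """Split one abbreviated ATOM row into atom name, coordinates, and B."""
    fields = line.split()
    x, y, z, b = (float(value) for value in fields[-4:])
    return {"name": fields[2], "xyz": (x, y, z), "b": b}


def rms_displacement(b_factor):
    """Root-mean-square displacement in Angstrom, assuming B = 8 pi^2 <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


def flexibility(b_factor):
    """Label following the thresholds quoted above."""
    if b_factor < 20:
        return "well defined"
    if b_factor > 40:
        return "high flexibility"
    return "intermediate"


atoms = [parse_row(line) for line in ROWS.splitlines()]
for atom in atoms:
    print(atom["name"], round(rms_displacement(atom["b"]), 2), flexibility(atom["b"]))
```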

NMR Spectroscopy
It is possible to determine the secondary structure of a protein using NMR techniques without determining the three-dimensional structure. NMR is potentially the most powerful of all the methods available for determination of secondary structure. Unlike secondary structure determinations by CD, which provide overall secondary structure content (% helix, % sheet, etc.), NMR parameters localize secondary structures to specific segments of the polypeptide chain. However, obtaining secondary structure from NMR data requires considerably more material (milligrams) and effort (sequence-specific resonance assignments) than the other spectroscopic techniques and is limited to proteins of molecular weight amenable to NMR investigation (< 35 - 40 kDa).

NMR Spectroscopy
In the past 10 years, nuclear magnetic resonance (NMR) spectroscopy has proved itself as a potentially powerful alternative to X-ray crystallography for the determination of macromolecular three-dimensional structure. NMR has the advantage over crystallographic techniques that experiments are performed in aqueous solution as opposed to a crystal lattice. However, the physical principles that make NMR possible limit the application of this technique to macromolecules of less than 35 - 40 kDa. Fortunately, a large number of globular proteins and most protein domains fall into this molecular weight regime.

Physical Principles of NMR


Sub-atomic particles (e.g., proton, neutron, electron, etc.) possess a characteristic called spin angular momentum. From quantum mechanics, each of these particles has a spin value of 1/2. The combination of multiple particles in the nucleus results in an overall spin property for each atomic isotope. Those isotopes with an even number of protons and an even number of neutrons have zero nuclear spin (e.g., He-4, C-12 and O-16). An odd number of protons and an even number of neutrons (e.g., H-1, N-15, or F-19) or an odd number of neutrons and an even number of protons (e.g., He-3, O-17 or Ca-41) results in an overall half-integer spin (an odd multiple of 1/2). Those isotopes with odd numbers of both protons and neutrons (e.g., H-2 or N-14) have more complex spin states and are less suitable for direct NMR observation in macromolecules.
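The parity rules above can be written as a short function. This sketch only reproduces the even/odd classification; it does not predict the actual spin value of an isotope, and the function name is made up.

```python
def spin_class(protons, neutrons):
    """Classify nuclear spin by the parity of proton and neutron counts."""
    even_protons = protons % 2 == 0
    even_neutrons = neutrons % 2 == 0
    if even_protons and even_neutrons:
        return "zero"          # e.g. He-4, C-12, O-16
    if not even_protons and not even_neutrons:
        return "integer"       # e.g. H-2, N-14
    return "half-integer"      # e.g. H-1, N-15, O-17


# (protons, neutrons) for a few isotopes mentioned in the text.
isotopes = {"H-1": (1, 0), "H-2": (1, 1), "C-12": (6, 6), "N-14": (7, 7),
            "N-15": (7, 8), "O-17": (8, 9)}
for name, (z, n) in isotopes.items():
    print(name, spin_class(z, n))
```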

Physical Principles of NMR


Fortunately, each of the four most abundant elements in biological material (H, C, N, and O) has at least one naturally occurring isotope with non-zero nuclear spin and, in principle, can be observed using NMR. The naturally occurring isotope of hydrogen, H-1, is present at > 99 % abundance and forms the basis of the experiments described here. Other important NMR-active isotopes include C-13 and N-15, present at 1.1 and 0.4 % natural abundance, respectively. The low natural abundance of these two isotopes makes their observation difficult on commonly isolated natural products. These two nuclei are, however, very extensively used for larger (> 10 kDa) proteins, which can be isotopically enriched (to > 95 % if necessary) when cloned into systems with high expression yields.

Chemical Shifts
In the presence of an external magnetic field, the spin angular momentum of nuclei with overall nonzero spin will undergo a cone-shaped rotation motion called precession. The rate (frequency) of precession is dependent on the strength of the external field and is unique for each isotope. For example, in a magnetic field of a given strength (e.g. 14.1 Tesla) all protons in a molecule will have characteristic resonance frequencies (chemical shifts) within a dozen or so parts per million (ppm) of a constant value (e.g., 600.13 MHz) characteristic of the particular nuclear type. These slight differences are due to the type of atom the proton is bound to (e.g., C, N, O, or S) and the local chemical environment. Thus each proton should, in principle, be characterized by a unique chemical shift.
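The relation between field strength, resonance frequency, and ppm can be sketched numerically. The proton value of 42.577 MHz per Tesla (gyromagnetic ratio divided by 2π) is a standard constant not given in the text; the function names are illustrative.

```python
PROTON_MHZ_PER_TESLA = 42.577  # 1H gyromagnetic ratio divided by 2*pi


def proton_frequency_mhz(field_tesla):
    """Approximate 1H resonance frequency in MHz at a given field."""
    return PROTON_MHZ_PER_TESLA * field_tesla


def ppm_to_hz(shift_ppm, spectrometer_mhz):
    """Frequency offset in Hz for a chemical shift given in ppm.

    One ppm of a frequency in MHz is the same number in Hz.
    """
    return shift_ppm * spectrometer_mhz


def hz_to_ppm(offset_hz, spectrometer_mhz):
    """Chemical shift in ppm for a frequency offset from the reference."""
    return offset_hz / spectrometer_mhz


print(round(proton_frequency_mhz(14.1), 1))   # close to 600 MHz
print(round(ppm_to_hz(8.0, 600.13), 2))       # offset of an 8 ppm amide proton
```

Because shifts are quoted in ppm rather than Hz, they are independent of the spectrometer field.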

One-dimensional NMR Spectra


The NMR signals (chemical shifts) for all the hydrogen atoms in this small molecule are clearly separated from each other. In this spectrum, the signal from the CH3 protons is split into three peaks and that from the CH2 protons into four peaks close to each other, due to the experimental conditions.

1H-NMR spectrum of a small protein, the C-terminal domain of a cellulase, comprising 36 amino acid residues. The NMR signals from many individual hydrogen atoms overlap, and peaks are obtained that comprise signals from many hydrogen atoms.

Chemical Shifts
Some protons such as the three protons of each sidechain methyl group of Thr, Val, Leu, Ile, and Met and most pairs of equivalent (2,6 and 3,5) aromatic ring protons are found to have degenerate chemical shifts. Other protons (e.g., some OH, SH, and NH3) are in rapid chemical exchange with the solvent, and thus have chemical shifts indistinguishable from the solvent resonance. Nevertheless, nearly complete chemical shift assignments are often possible and are a prerequisite for structural studies by NMR.

Chemical shifts
Since the chemical shift of a nucleus is sensitive to the environment, it should also contain structural information. Correlations between chemical shift tendencies and secondary structures have been identified. The alpha proton chemical shift of all 20 naturally occurring amino acids has been shown to have a strong correlation with secondary structure. Wishart et al. (1992) have produced a simple method for secondary structure determination by analyzing the difference between the alpha proton chemical shift for each residue and that reported for the same residue type in a "random coil" conformation. Helical segments have groupings of alpha protons whose chemical shifts are consistently less than the random coil values, whereas beta strands have values consistently greater. In this way, helix and strand segments can be located (quite reliably), although the boundaries of the secondary structural elements are not as well defined.
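The comparison with random coil values can be sketched as a simple index per residue. This is a minimal illustration in the spirit of the method described above: the ±0.1 ppm threshold is an assumption, the shift values are hypothetical, and real analyses use residue-specific random coil tables and require runs of consecutive residues before calling a helix or strand.

```python
def shift_index(observed, random_coil, threshold=0.1):
    """Return -1 (helix-like), +1 (strand-like), or 0 for each residue.

    observed     alpha proton shifts in ppm, one per residue
    random_coil  random coil shifts in ppm for the same residue types
    threshold    minimum deviation in ppm counted as significant (assumed)
    """
    indices = []
    for obs, coil in zip(observed, random_coil):
        difference = obs - coil
        if difference < -threshold:
            indices.append(-1)
        elif difference > threshold:
            indices.append(1)
        else:
            indices.append(0)
    return indices


# Hypothetical shifts; a single placeholder random coil value is used for all residues.
observed_shifts = [4.10, 4.05, 4.20, 4.80, 4.90, 4.36]
random_coil_shifts = [4.35] * 6
indices = shift_index(observed_shifts, random_coil_shifts)
print(indices)
```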

Secondary Shifts
Plot of the differences between the observed alpha proton chemical shifts and the corresponding random coil values, δ(Hα, native) - δ(Hα, random), versus the amino acid sequence of Glutaredoxin 3.

J coupling
Structural information from NMR experiments comes primarily from through-bond (scalar or J coupling) or through-space (the nuclear Overhauser effect, NOE) magnetization transfer between pairs of protons. J couplings between pairs of protons separated by three covalent bonds can be measured. The value of a three-bond J coupling constant contains information about the intervening torsion angle θ. This is called the Karplus relationship and has the form
\[ {}^3J = A\cos\theta + B\cos^2\theta + C \]
where A, B, and C are empirically derived constants for each type of coupling constant (e.g., 3J(HA-HN) or 3J(HA-HB)).

J coupling

Shown above is the empirically derived Karplus relationship between the vicinal three-bond coupling constant 3J(HN-Hα) and the intervening torsion angle phi.

Coupling Constants
The three-bond coupling constant between the intra-residue alpha and amide protons, 3J(HA-HN), is the most useful for secondary structure determination as it can be directly related to the backbone dihedral angle phi.
Right-handed alpha helix: phi = -57°, 3J(HA-HN) = 3.9 Hz
Right-handed 3-10 helix: phi = -60°, 3J(HA-HN) = 4.2 Hz
Antiparallel beta sheet: phi = -139°, 3J(HA-HN) = 8.9 Hz
Parallel beta sheet: phi = -119°, 3J(HA-HN) = 9.7 Hz
Left-handed alpha helix: phi = 57°, 3J(HA-HN) = 6.9 Hz
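The listed coupling constants can be reproduced with a Karplus curve. The coefficients below are an assumption (a widely used parameterization for the amide to alpha proton coupling, with θ = phi - 60°); the source does not say which constants were used, but these values give the numbers in the list above after rounding.

```python
import math

# Karplus coefficients in the form 3J = A cos(theta) + B cos^2(theta) + C.
# Assumed values; the source does not state which parameterization it used.
A, B, C = -1.4, 6.4, 1.9


def j_ha_hn(phi_deg):
    """Three-bond HA-HN coupling constant in Hz for a backbone angle phi."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) + B * math.cos(theta) ** 2 + C


for label, phi in [("right-handed alpha helix", -57), ("3-10 helix", -60),
                   ("antiparallel beta sheet", -139), ("parallel beta sheet", -119),
                   ("left-handed alpha helix", 57)]:
    print(label, round(j_ha_hn(phi), 1))
```

Small couplings (around 4 Hz) therefore point to helical phi angles, while large couplings (around 9 Hz) point to extended strands.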

Two-dimensional NMR Spectrum


The peaks along the diagonal correspond to the 1D spectrum. The off-diagonal peaks in this NOE spectrum represent interactions between hydrogen atoms that are closer than 5 Å to each other in space. From such a spectrum, one can obtain information on both the secondary and tertiary structures of the protein.

COSY NMR Experiments


COSY NMR experiments give signals that correspond to hydrogen atoms that are covalently connected through one or two other atoms.

Since hydrogen atoms in two adjacent residues are covalently connected through at least three other atoms (for instance, H-Cα-C'-N-H), all COSY signals reveal interactions within the same amino acid residue. These interactions are different for different types of side chains. The NMR signals therefore give a "fingerprint" of each amino acid. The diagram illustrates fingerprints (red) of residues Ala and Ser.

NOE

NOE NMR experiments give signals that correspond to hydrogen atoms that are close together in space (less than 5 Å), even though they may be far apart in the amino acid sequence. Both secondary and tertiary structures of small protein molecules can be derived from a collection of such signals, which define distance constraints between a number of hydrogen atoms along the polypeptide chain.

NOE
The other major source of structural information comes from through-space dipole-dipole coupling between two protons, called the NOE. The intensity of an NOE is proportional to the inverse of the sixth power of the distance separating the two protons and is usually observed if two protons are separated by < 5 Å. Thus, the NOE is a sensitive probe of short intramolecular distances. NOEs are categorized according to the location of the two protons involved in the interaction. Intraresidual NOEs are between protons within the same residue, whereas sequential, medium range, and long range NOEs are between protons on residues that are, respectively, sequentially adjacent, separated by 1, 2, or 3 intervening residues, and separated by four or more intervening residues in the polypeptide sequence. A network of these short inter-proton distances forms the backbone of three-dimensional structure determination by NMR.
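The inverse sixth-power dependence means that an unknown distance can be estimated by comparing a cross-peak intensity with that of a proton pair whose separation is known. The sketch below assumes this simple two-spin picture and ignores spin diffusion and internal motion; the function names and the numbers in the example are hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a distance (in Å) from an NOE intensity, assuming I is proportional to r**-6.

    ref_intensity and ref_distance describe a calibration pair of known separation.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def relative_intensity(distance, ref_distance):
    """Intensity relative to the reference pair for a given distance."""
    return (ref_distance / distance) ** 6


# Hypothetical example: a cross-peak 64 times weaker than a reference pair at 2.0 Å.
print(noe_distance(1.0, 64.0, 2.0))
# Doubling the distance reduces the intensity by a factor of 2**6 = 64.
print(relative_intensity(4.0, 2.0))
```

Because of the sixth root, even a large error in the measured intensity translates into a small error in distance, which is one reason NOEs are usually converted into loose distance ranges rather than exact values.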

Sequential Assignment

Adjacent residues in the amino acid sequence of a protein can be identified from NOE spectra. The amide H atom of residue i + 1 (orange) is close to and interacts with (purple arrows) the H atoms attached to N, Cα, and Cβ of residue i (light green). These interactions give cross-peaks in the NOE spectrum that identify adjacent residues and are used for sequence-specific assignment of the amino acid fingerprints derived from a COSY spectrum.

NOE

Regions of secondary structure in a protein have specific interactions between hydrogen atoms in sequentially nonadjacent residues that give a characteristic pattern of cross-peaks in an NOE spectrum. In antiparallel β-sheet regions there are interactions between Cα-H atoms of adjacent strands (pink arrows), between N-H and Cα-H atoms (dark purple arrows), and between N-H atoms of adjacent strands (light purple arrows). The corresponding pattern of cross-peaks in an NOE spectrum identifies the residues that form the antiparallel β sheet. Parallel β sheets and α helices are identified in a similar way.

NOEs
A number of short (< 5 Å) distances are fairly unique to secondary structural elements. Alpha helices are characterized by short distances between certain protons on sequentially neighboring residues (e.g., between backbone amide protons, dNN, as well as between beta protons of residue i and the amide protons of residue i+1, dβN). Helical conformations result in short distances between the alpha proton of residue i and the amide proton of residue i+3 and, to a lesser extent, i+4 and i+2. These i+2, i+3, and i+4 NOEs are collectively referred to as medium range NOEs. NOEs connecting residues five or more positions apart in the sequence are referred to as long range. Extended conformations (e.g., beta strands), on the other hand, are characterized by short sequential dαN distances. The formation of sheets also results in short distances between protons on adjacent strands (e.g., dαα and dαN).
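The NOE categories used in this section depend only on the sequence separation of the two residues, so they are easy to tabulate. The sketch below assumes the reading in which i+2 to i+4 contacts are medium range and contacts five or more positions apart are long range; the function name and the example peak list are hypothetical.

```python
from collections import Counter


def classify_noe(i, j):
    """Classify an NOE between residues i and j by their sequence separation."""
    separation = abs(i - j)
    if separation == 0:
        return "intraresidual"
    if separation == 1:
        return "sequential"
    if separation <= 4:
        return "medium range"
    return "long range"


# Hypothetical list of residue pairs giving NOE cross-peaks.
peaks = [(5, 5), (5, 6), (5, 8), (5, 9), (12, 40), (13, 14)]
counts = Counter(classify_noe(i, j) for i, j in peaks)
for category, n in counts.items():
    print(category, n)
```

A stretch of residues rich in medium range (i, i+3) contacts suggests a helix, whereas long range contacts between two stretches point to strands paired in a sheet or to tertiary packing.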

Amide Proton Exchange Rates


The regular hydrogen-bonded secondary structures "protect" amide protons involved in them, as evidenced by their significantly reduced amide proton exchange rates with the solvent (H2O). Although nearly all polypeptide amide protons are involved in hydrogen bonds in a globular protein, those in regular secondary structures appear to be longer-lived. For example, after placing a lyophilized sample of BPTI into 2H2O, many amide protons are completely replaced with deuterium within 1 hr. Over the next several hours, the amide protons in the N-terminal and then the C-terminal helix also completely exchange. However, some amide protons participating in the central antiparallel sheet are still present after some months.
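The time scales quoted above can be related to exchange rate constants if one assumes that each amide proton disappears by simple first-order kinetics once the sample is in 2H2O. That assumption, the function names, and the rate constants in the sketch below are illustrative and are not taken from the BPTI data.

```python
import math


def fraction_remaining(k_per_min, t_min):
    """Fraction of an amide proton signal left after t_min minutes (first-order decay)."""
    return math.exp(-k_per_min * t_min)


def half_life_min(k_per_min):
    """Half-life in minutes for a first-order exchange rate constant in min^-1."""
    return math.log(2) / k_per_min


# Hypothetical rate constants for a fast and a slowly exchanging amide proton.
for label, k in [("exposed amide", 1.0), ("protected amide", 0.02)]:
    print(f"{label}: half-life = {half_life_min(k):.1f} min, "
          f"remaining after 60 min = {fraction_remaining(k, 60.0):.3f}")
```

Under this model an amide proton with a rate constant of 0.02 min^-1 has a half-life of about 35 minutes, so it is still clearly visible in a spectrum recorded an hour after dissolution, while a fast-exchanging proton is gone.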

Selection of Secondary Structural Segment

Sequential stretches of residues with consistent secondary structure characteristics (NOEs, coupling constants, slowly exchanging amide protons, and chemical shifts) provide a reliable indication of the location of these structural segments. However, the boundaries of these segments are difficult to define precisely.

Survey of NMR-derived Structural Parameters Characterizing Reduced Grx3

Shown above are amide proton exchange rates with solvent water, kNH < 0.02 min^-1 (filled diamonds); coupling constants 3JHNα < 6.0 Hz (filled circles) and > 7.0 Hz (open circles); and sequential backbone dNN and dαN NOE connectivities, which are classified as strong, weak, or absent and are represented by the thickness (or absence) of a bar connecting the residues in question. Medium range NOE connectivities dαN(i, i+3) and dαN(i, i+4) are drawn as line segments connecting the residues contributing to the observed cross peak, if present.
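Locating sequential stretches with consistent parameters can be sketched as a simple run-finding problem. The example below uses only the coupling constant criterion, with the 6.0 Hz and 7.0 Hz cutoffs from the survey above; the minimum run length of three residues, the function name, and the coupling constant values are assumptions made for illustration. A real analysis would combine several parameters (NOEs, exchange rates, chemical shifts).

```python
def find_segments(j_values, predicate, min_length=3):
    """Return inclusive (start, end) index pairs of runs where predicate(J) holds.

    j_values is a per-residue list of 3J coupling constants in Hz;
    None marks a residue with no measured value and breaks a run.
    """
    segments = []
    start = None
    for idx, j in enumerate(j_values):
        ok = j is not None and predicate(j)
        if ok and start is None:
            start = idx
        elif not ok and start is not None:
            if idx - start >= min_length:
                segments.append((start, idx - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(j_values) - start >= min_length:
        segments.append((start, len(j_values) - 1))
    return segments


# Hypothetical per-residue coupling constants (0-based residue indices).
j_hz = [7.5, 8.9, 9.1, 8.2, 6.5, 4.1, 3.9, 4.5, 5.0, None, 8.0, 8.5]

helix_like = find_segments(j_hz, lambda j: j < 6.0)
strand_like = find_segments(j_hz, lambda j: j > 7.0)
print("helix-like runs:", helix_like)
print("strand-like runs:", strand_like)
```

Short runs such as the last two residues in the example are discarded, which reflects the point made above that segment boundaries are hard to define precisely from these parameters alone.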

NMR-determined Protein Structures


Multidimensional NMR spectra were used to derive a number of distance constraints for different hydrogen atoms along the polypeptide chain of the C-terminal domain of a cellulase. The diagram shows 10 superimposed structures that all satisfy the distance constraints equally well. These structures are all quite similar since a large number of constraints were experimentally obtained.
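Whether a candidate structure satisfies a set of NOE-derived distance constraints can be checked by measuring the corresponding inter-proton distances in the model. The following is a minimal sketch, assuming each constraint is an upper bound in Å between two labeled protons; the function name, atom labels, coordinates, and bounds are all hypothetical.

```python
import math


def find_violations(coords, constraints, tolerance=0.0):
    """Return constraints whose model distance exceeds the upper bound.

    coords maps an atom label to an (x, y, z) tuple in Å;
    constraints is a list of (atom_a, atom_b, upper_bound) tuples.
    """
    violations = []
    for atom_a, atom_b, upper in constraints:
        d = math.dist(coords[atom_a], coords[atom_b])
        if d > upper + tolerance:
            violations.append((atom_a, atom_b, d, upper))
    return violations


# Hypothetical coordinates and upper-bound constraints.
model = {
    "3:HN": (0.0, 0.0, 0.0),
    "4:HN": (2.5, 1.0, 0.5),
    "27:HA": (0.0, 0.0, 6.0),
}
constraints = [("3:HN", "4:HN", 3.5), ("3:HN", "27:HA", 5.0)]

for atom_a, atom_b, d, upper in find_violations(model, constraints):
    print(f"{atom_a} - {atom_b}: {d:.2f} Å exceeds upper bound {upper:.1f} Å")
```

Running the same check over each member of an ensemble gives a quick way to confirm that all structures satisfy the constraints comparably well.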
