Written by students enrolled in BIOL2226 Protein Technologies

School of Applied Sciences RMIT University

November 2006


1.1 X-Ray Diffraction
1.2 Nuclear Magnetic Resonance
1.3 Circular Dichroism and Optical Rotatory Dispersion
1.4 Fluorescence spectroscopy of proteins and applications
1.5 Atomic Force Microscopy of Proteins
1.6 Electron Microscopy of Proteins

2.1 Strategies for Protein Purification
2.2 Size Exclusion Chromatography
2.3 Ion Exchange Chromatography
2.4 Affinity Chromatography
2.5 Hydrophobic Interaction Chromatography
2.6 Reversed phase HPLC of peptides and proteins
2.7 Capillary Electrophoresis of Peptides and Proteins
2.8 Purification of membrane proteins
2.9 Industrial Scale Purification of Proteins
2.10 Determination of Protein Concentration and Purity

3.1 Amino Acid Analysis and sequencing of proteins
3.2 Chemical Modification of Proteins
3.3 Protein Cross-linking methods

4.1 MALDI-TOF Mass Spectrometry
4.2 ESI Mass Spectrometry
4.3 Hybrid Mass Spectrometry Techniques

5. PROTEOMICS
5.1 2D gel electrophoresis
5.2 Chromatographic methods for separation of proteomes
5.3 Mass Spectrometry in Proteomics
5.4 Bioinformatics in Proteome Analysis
5.5 Automation and High-Throughput Proteomics

6.1 Chemical methods for study of protein-protein interactions
6.2 Molecular biology approaches for study of protein-protein interactions
6.3 Biosensor methods for study of protein-protein interactions
6.4 Imaging methods for study of protein-protein interactions

7.1 t-Boc synthesis and cleavage of peptides
7.2 F-moc synthesis and cleavage of peptides
7.3 Synthesis of proteins using chemoselective ligation methods
7.4 Purification and characterisation of synthetic peptides
7.5 Peptide Libraries and their application
7.6 Applications of synthetic peptides
7.7 Peptide Nucleic Acids and applications

8.1 Approaches to Protein Engineering
8.2 Antibody Engineering
8.3 Protein Expression systems
8.4 Protein Folding strategies
8.5 Protein Arrays and their application

9. BIOMARKER DISCOVERY
9.1 Biomarker discovery and applications
9.2 SELDI-MS



* Those topics that are in small italics have not been covered in this learning resource or the reviews have not been submitted to date.




X-rays were discovered by Wilhelm Roentgen in November 1895 while he was studying different types of cathode rays. The unknown rays, which he named “X-rays”, were capable of producing shadows of his bones on a fluorescent screen. The discovery soon found uses in medicine, where doctors used X-rays to observe the structure of bones and other internal body parts. In 1912, von Laue proved that these rays are electromagnetic waves of very short wavelength by diffracting them with a crystal. When a wave hits an object, part of it is blocked and the part that passes the object changes its direction of travel; this is called diffraction. William Henry Bragg and his son Lawrence Bragg pioneered the study of the properties of X-rays. William Bragg developed the X-ray spectrometer, in which the rays fell on a crystal and the reflections from the crystal were measured in an ionization chamber to find the strength and direction of diffraction. Lawrence Bragg found that the reflections of the X-rays depend on the spacing of the atoms in the crystal (Bragg's law), and that by analysing the pattern of reflections the arrangement of individual atoms in the crystal can be determined. Thus the Braggs created the science of X-ray crystallography. [1]

The discovery of X-ray crystallography changed biological research and our understanding of biological structures and processes. The structures of many biological molecules were determined in the last century. Bernal and Crowfoot observed the first X-ray diffraction pattern from a protein in 1934. The same technique was used to determine the double helical structure of DNA in 1953. By 1959, the structure of myoglobin (Mb) had been established to a resolution of 0.6 nm (6 Å). Since then the detailed structures of many biomolecules have been determined.

The technology is now widely used in many fields of biological research, especially medical and molecular research. The basic technique of X-ray crystallography has been modified many times since it was discovered. The development of powerful and accurate X-ray tubes made it possible to work on small biological systems such as viruses and helped in determining the three-dimensional structures of many macromolecules. X-ray imaging is used today to produce 3D images of proteins and enzymes that are important in molecular biological research, for example to see the structure of a protein produced by a pathogen inside the body and to study its properties in order to develop a drug against it. The technique is also used to observe the changes inside a biological system after a drug or therapy has been administered. Proteins do almost all the work in living systems, so exploring them in more and more detail enables us to find solutions for diseases and other biological problems that were previously out of reach. [2]

Underlying principle. To perform their biological function, proteins have to assume three-dimensional structures in the living system. To determine these structures, the molecules are crystallized and analysed by X-ray diffraction. The data from such structural analyses are the foundation of modern developments in biochemistry, biophysics, pharmaceutical development and biotechnology. X-ray diffraction is one of the most direct tools available in biological research for obtaining a three-dimensional image of a protein. The electrons around the nucleus of an atom scatter X-rays in proportion to the electron density. The positions of individual atoms or molecules are determined by analysing the diffraction pattern of the scattered rays, and a three-dimensional image of the biomolecule is built from this.
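The dependence of the reflections on the spacing of the atoms, mentioned above as Bragg's law, can be written n·λ = 2·d·sin θ, where d is the spacing between lattice planes, θ the angle between the beam and the planes, λ the wavelength and n the order of the reflection. The short Python sketch below only illustrates that relation. The function names are hypothetical, and the wavelength used (copper Kα, about 1.5418 Å) is an assumption chosen for the example, not a value given in this chapter.

```python
import math

# Assumed wavelength for the example: copper K-alpha radiation, in angstroms.
CU_K_ALPHA_ANGSTROM = 1.5418


def bragg_angle_deg(spacing_angstrom, wavelength_angstrom, order=1):
    """Angle theta (degrees) satisfying n * lambda = 2 * d * sin(theta)."""
    sin_theta = order * wavelength_angstrom / (2.0 * spacing_angstrom)
    if sin_theta > 1.0:
        # No reflection is possible when n * lambda exceeds 2 * d.
        raise ValueError("n * lambda is larger than 2 * d")
    return math.degrees(math.asin(sin_theta))


def spacing_from_angle(theta_deg, wavelength_angstrom, order=1):
    """Plane spacing d (angstroms) recovered from a measured angle theta."""
    return order * wavelength_angstrom / (2.0 * math.sin(math.radians(theta_deg)))


if __name__ == "__main__":
    # Smaller spacings diffract to larger angles, which is why
    # high-resolution detail appears far from the centre of the pattern.
    for d in (6.0, 3.0, 2.0):
        theta = bragg_angle_deg(d, CU_K_ALPHA_ANGSTROM)
        print(f"d = {d:.1f} A  ->  theta = {theta:.2f} degrees")
```

Read in reverse, `spacing_from_angle` shows how a measured reflection angle is turned back into a spacing within the crystal.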

Fig. 1.1 X-ray diffraction of a protein crystal. Source: [8]

To obtain a good image of a biomolecule, it must be properly crystallized before it is analysed by X-ray diffraction. The quality of the crystal has a great deal to do with the quality of the image obtained. All biomolecules contain water as a major component, and water has a major role in determining their structure. A good crystal for X-ray diffraction analysis of a protein is produced by removing the water content of the biomolecule in a controlled manner. An ideal crystal should be large and single, with minimal defects, and should have the ability to rotate plane-polarized light.

There are seven types or classes of crystals (the seven crystal systems). The pictures are given below.
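As a reminder of what distinguishes the seven classes, the sketch below sorts a unit cell into one of the seven crystal systems from its edge lengths (a, b, c) and angles (alpha, beta, gamma). It is a simplification under stated assumptions: a crystal system is properly assigned from the symmetry of the crystal, not from cell dimensions alone, and the function name, tolerance and example cells are all hypothetical.

```python
def crystal_system(a, b, c, alpha, beta, gamma, tol=1e-3):
    """Guess the crystal system from cell lengths (angstroms) and angles (degrees).

    Metric-only classification: a real assignment also needs the symmetry
    of the diffraction pattern, which this sketch ignores.
    """
    def same(x, y):
        return abs(x - y) <= tol

    def right(angle):
        return same(angle, 90.0)

    if same(a, b) and same(b, c):
        if right(alpha) and right(beta) and right(gamma):
            return "cubic"
        if same(alpha, beta) and same(beta, gamma):
            return "trigonal (rhombohedral axes)"
    if same(a, b) and right(alpha) and right(beta):
        if right(gamma):
            return "tetragonal"
        if same(gamma, 120.0):
            return "hexagonal"
    if right(alpha) and right(beta) and right(gamma):
        return "orthorhombic"
    if right(alpha) and right(gamma):
        # Conventional setting with b as the unique axis (beta not 90).
        return "monoclinic"
    return "triclinic"


if __name__ == "__main__":
    # Hypothetical unit cells, one per system.
    examples = [
        (50, 50, 50, 90, 90, 90),
        (79, 79, 38, 90, 90, 90),
        (50, 60, 70, 90, 90, 90),
        (60, 60, 90, 90, 90, 120),
        (50, 50, 50, 75, 75, 75),
        (50, 60, 70, 90, 105, 90),
        (50, 60, 70, 80, 95, 100),
    ]
    for cell in examples:
        print(cell, "->", crystal_system(*cell))
```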

Fig. 1.2 Types of crystals. Source: Aug06.pdf

The crystal of a protein that gives the best diffraction pattern is prepared by simple controlled vapour diffusion, using different types of crystallization matrices. X-ray diffraction is then carried out and the diffraction pattern is captured on film. From this pattern, a three-dimensional image of the protein is built using modern computer programs. The overall process is as follows.

Fig. 1.3 Structural analysis of protein crystals. Source: Aug06.pdf

1.1.2 Recent advances.

Synchrotrons. There have been significant advances in macromolecular crystallography in recent years. Synchrotron radiation, CCD detectors and cryocrystallography make data collection easier and more efficient. With synchrotron radiation, complete data sets for more than one protein can be collected in a single day, and crystals as small as 60 µm can be used to determine protein structures. Since most proteins form small crystals, this allows many more proteins to be analysed. Using a dedicated beamline on a third-generation synchrotron, it is possible to analyse hundreds of protein crystals per year. The most time-consuming step is therefore likely to be the interpretation of electron density maps. Methods to automate model building and refinement from electron density maps are in the early stages of development, and additional support for research in this area could significantly enhance high-throughput structure determination.

Synchrotron light is the electromagnetic radiation emitted when charged particles, usually electrons or positrons, moving at velocities close to the speed of light are forced to change direction by a magnetic field. A synchrotron is a large machine (about the size of a football field) that accelerates electrons to almost the speed of light. As the electrons are deflected through magnetic fields they emit extremely bright light, which is channelled down beamlines to experimental workstations where it is used for research. The development of synchrotrons has been likened to the invention of the microscope in terms of its revolutionary effect across a range of sciences and the capacity to delve deeper into the structure of matter than ever before. [11]

Basic layout of a synchrotron.

Automation of crystal production. Structural analysis of biological compounds is entering a new phase with the industrialization and automation of its processes. More sophisticated and accurate radiation techniques, together with advances in protein expression, purification and crystallization, have done much to simplify what was a tedious and time-consuming process. Crystallization starts with the preparation of a highly purified, soluble sample. Factors such as pH, ionic strength, temperature and the concentrations of the organic additives, salts and detergents used in purification all have to be considered. Because sampling every combination of these factors (a full multidimensional approach) is impractical, simple, reduced sampling techniques have been developed. Several companies have developed robots that automate crystal production. These systems can perform more than 100,000 cycles a day, which makes crystal production fast and also allows the data to be stored for future analysis. Two further advances in the field are cryo-freezing of crystals and the multi-wavelength anomalous diffraction (MAD) method. [9]

Cryo-freezing. In cryo-freezing, the protein crystals are cooled and stored in liquid nitrogen. This helps to prevent the damage caused to the crystal by the powerful radiation. [14]

MAD. In MAD the synchrotron is tuned to different wavelengths. This exploits the absorption of X-rays by metal ions in the macromolecule and helps in solving the phase problem.

MAD X-ray diffraction apparatus.

Apart from this, advances in the technologies for evaluating and manipulating the data, such as modern bioinformatics programs and other developments in computational chemistry, have made the technique less laborious and time-consuming.
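The reason a reduced sampling technique is preferred over a full multidimensional screen of crystallization conditions can be seen by counting combinations. The sketch below builds every combination of a few factors and then draws a small random subset. All factor names and levels are hypothetical, and plain random sampling is used only as a stand-in for the reduced screening designs actually used in practice.

```python
import itertools
import random

# Hypothetical screening factors and levels, for illustration only.
FACTORS = {
    "pH": [4.5, 5.5, 6.5, 7.5, 8.5],
    "salt_mM": [0, 50, 100, 200, 500],
    "temperature_C": [4, 20],
    "precipitant_percent": [5, 10, 15, 20, 25, 30],
    "additive": ["none", "detergent", "organic solvent"],
}


def full_factorial(factors):
    """Every combination of every factor level (the multidimensional approach)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in itertools.product(*factors.values())]


def reduced_screen(factors, n_conditions, seed=0):
    """A random subset of the full grid, standing in for a reduced screen."""
    rng = random.Random(seed)
    return rng.sample(full_factorial(factors), n_conditions)


if __name__ == "__main__":
    grid = full_factorial(FACTORS)
    screen = reduced_screen(FACTORS, 96)  # one 96-well plate
    print(f"full grid: {len(grid)} conditions, reduced screen: {len(screen)} conditions")
    print("first condition:", screen[0])
```

Even with only five factors the full grid has 900 conditions, and each extra factor multiplies that number, which is why the number of trials is cut down before the robot sets them up.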


1.1.3 Evaluation of the technology.

NMR, or Nuclear Magnetic Resonance, is a more recent technique than X-ray crystallography for the structural analysis of proteins and is now very widely used. The basic principle behind it is that some atomic nuclei are magnetic and can occupy different energy states in a magnetic field. Transitions between these energy states can be measured to give a spectrum. The magnetic properties of the nuclei are affected by chemical bonds and by the short distances between atoms. These properties are exploited in nuclear magnetic resonance spectroscopy of proteins.

Comparison of X-ray crystallography and NMR spectroscopy. NMR can measure distances between atoms without considering their spatial orientation. This avoids the need to crystallize the protein, so NMR is a more straightforward approach than crystallographic techniques. Normally, NMR is used to study small molecules, but recent advances in the resolution of the procedure have raised the upper size limit of molecules studied by this technique from about 20 kilodaltons to about 100 kilodaltons.

The advantages and disadvantages of X-ray crystallography and of NMR are summarized below.

Advantages of X-ray crystallography.
1. The effect of different solvents can be examined; the same protein may crystallize in different crystal forms.
2. The protein can be forced into another crystal form by changing its solvent.
3. The whole 3D structure can be obtained by systematic analysis of a well-crystallized material.

Disadvantages of X-ray crystallography.
1. A crystal is necessary, so only proteins that can be crystallized can be examined.
2. Solutions, and the behaviour of molecules in solution, cannot be examined.
3. The same limitation applies when we try to examine powders and gases.
4. Molecular motions cannot be studied.
5. Only one parameter set is obtained, so only one conformation can be observed.
6. There is no possibility of examining small parts of the molecule.
7. There is no chance of direct determination of secondary structures and especially of domain movements (a big disadvantage compared with NMR).
8. The hydrogen atoms in the molecule cannot be examined, since each has only one electron.

Advantages of NMR.
1. Several types of information are available from many types of experiment.
2. Angles, distances, coupling constants, chemical shifts, rate constants etc. can be obtained. These are true molecular parameters that can be examined further with computers and molecular modelling procedures.
3. If the magnetic field is strong enough (the resolution is a function of field strength), all of the atoms can be handled “personally”.
4. With suitable computing equipment the whole 3D structure can be calculated.
5. Different data sets can be collected from different types of experiment, which makes it possible to resolve the uncertainties of any one type of measurement.
6. The motion of segments (domains) can be examined.
7. Chemical kinetics can be observed.
8. The influence of the dielectric constant, the polarity and any other properties of the solvent or of added material can be investigated.

Disadvantages of NMR.
1. A system contains a large number of atoms and yields a large amount of data.
2. This is good for accurate determination of the structure, but it limits the molecular mass that can be studied.
3. The resolving power of NMR is lower than that of some other types of experiment (e.g. X-ray crystallography), since the information obtained from the same material is much more complex.
4. The highest molecular mass examined successfully so far is a 64 kDa protein complex.
5. In many cases a given data set, from a given type of experiment, is consistent with two or more possible conformations.
6. Only the probability that a protein segment is in a given conformation can be determined.
7. The cost of the experiment increases with the field strength and with the complexity of the determination.


1.1.4 Application of the technology.

Biotechnological research is now focussed on genetics and proteomics. Proteins are the functional parts of any living system and are responsible for the biological functions in all mechanisms of life. Knowledge of the structure and function of proteins is therefore necessary, and is the next logical step, for understanding any biological system. X-ray crystallography is the most powerful tool used to study and compare the proteins and enzymes of a living system, their mode of action and the changes they may cause in the system. Recombinant DNA technology and highly developed X-ray diffraction instruments help scientists to move from the genes to solid information in the form of three-dimensional images of the proteins that the genes code for. [13] Most recently, the technology has been applied to understanding processes at the subcellular level, such as nucleocytoplasmic transport. [20]

The highly automated analysis methods now available make it easy to analyse the bulk of data generated by genome and proteome analysis. Computational tools have a major role in that.

1.1.5 Relevant websites.
1. Biological Macromolecular Crystallization Database (BMCD)

1.1.6 Key industry suppliers.

1.1.7 References
1. www cambridgephysics.com ion/xraydiffraction1_1.htm
4. Makin, O.S., Sikorski, P., Serpell, L.C. (2006) Diffraction to study protein and peptide assemblies, Current Opinion in Chemical Biology, 10, 417-422.
18. Lim, R.Y.H., Fahrenkrog, B. (2006) The nuclear pore complex up close, Current Opinion in Cell Biology, 18, 342-347.
19. Scapin, G. (2006) Structural biology and drug discovery, Current Pharmaceutical Design, 12, 2087-2097.
20. Ng, J.D., Gavira, J.A., García-Ruiz, J.M. (2003) Protein crystallization by capillary counter diffusion for applied crystallographic structure determination, Journal of Structural Biology, 142, 218-231.

Chapter 1.2 Nuclear Magnetic Resonance
Yang Xi

1.2.1 Introduction
The impact of nuclear magnetic resonance (NMR) spectroscopy on the natural sciences has been substantial. The first application of NMR spectroscopy to a biological sample was reported by Jacobson, Anderson and Arnold in 1954 [1]. Nowadays NMR spectroscopy, which can be used in both the solution and the solid state, is very useful for analysing mixtures and for understanding dynamic effects such as changes in temperature and reaction mechanisms. Its most important application is as a well-established way to determine the three-dimensional structure of proteins [2]. The fundamental character of magnetic resonance is that the separation between energy levels is quantized, which means that the resonance frequency contains a great deal of crucial information about the chemical structure and the local magnetic environment [3]. It is essential to remember, however, that with NMR we are performing experiments on the nuclei of atoms, not on the electrons; the chemical environment of specific nuclei is deduced from the information obtained about those nuclei [4]. In a basic NMR experiment, a large magnetic field is applied and the nuclear energy levels split. Figure 1.2.1 shows the energy-level diagram for a spin system with I = 1/2.
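For a spin-1/2 nucleus the splitting between the two levels is ΔE = h·ν, where the resonance (Larmor) frequency ν is proportional to the applied field B0. The sketch below works this out for a few nuclei. It is an illustration only: the function names are hypothetical, and the gyromagnetic ratios are rounded textbook values that do not come from this chapter.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J*s

# Rounded magnitudes of gamma / (2*pi) in MHz per tesla (assumed values).
# The sign for 15N is negative in reality; only the magnitude is used here.
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708, "15N": 4.316}


def larmor_frequency_mhz(nucleus, b0_tesla):
    """Resonance frequency (MHz) of a nucleus in a field of b0_tesla."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * b0_tesla


def level_splitting_joule(nucleus, b0_tesla):
    """Energy gap between the two spin-1/2 levels, delta E = h * nu."""
    return PLANCK_H * larmor_frequency_mhz(nucleus, b0_tesla) * 1e6


if __name__ == "__main__":
    b0 = 11.74  # tesla, the field of a nominal "500 MHz" spectrometer
    for nucleus in ("1H", "13C", "15N"):
        nu = larmor_frequency_mhz(nucleus, b0)
        gap = level_splitting_joule(nucleus, b0)
        print(f"{nucleus}: {nu:.1f} MHz, splitting {gap:.3e} J")
```

The gap is tiny and falls in the radiofrequency range, which is why the levels are driven with radiofrequency signals and why stronger magnets, which widen the gap, are preferred.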

Figure 1.2.1 Energy levels for a nucleus with spin quantum number 1/2.

In such a case, nuclei in the lower level can be excited into the higher level by a radiofrequency signal whose frequency is determined by the difference in energy between the levels. When the radiofrequency signal is switched off, the nuclei in the higher energy state return to the lower state, emitting radiation that can be picked up by the receiver. This is called the relaxation process. The chemical shift, which is expressed relative to the applied magnetic field, is the most basic measurement in NMR. It arises because the electrons in the molecule produce a local magnetic field [5], and it is measured relative to a reference compound. For the nuclei 1H, 13C and 29Si, TMS (tetramethylsilane) is commonly used as the reference. Proteins are much larger than small organic molecules, but the same NMR theory applies. Because of the increased number of nuclei of each element present in a protein, basic 1D spectra become crowded with overlapping signals to an extent where analysis is impossible, so multidimensional (2D, 3D or 4D) experiments have been devised to deal with this problem. To facilitate these experiments, it is desirable to label the protein isotopically with 13C and 15N, because the predominant naturally occurring isotope 12C is not NMR-active, whereas the nuclear quadrupole moment of the predominant naturally occurring 14N isotope prevents high-resolution information from being obtained from this nitrogen isotope. The most important method used for structure determination of proteins is the NOE experiment, which measures distances between pairs of atoms within the molecule. The distances obtained are then used to generate a 3D structure of the molecule using a computer program [6]. NMR structure determination proceeds in several phases, each using a separate set of highly specialized techniques: the sample is prepared, resonances are assigned, restraints are generated, and a structure is calculated and validated [7]. NMR spectroscopy consists of experiments that try to identify the physical relationships between atoms in the molecule, such as distances, angles and orientations, and use them to search for three-dimensional structures that satisfy most of the physical constraints.

1.2.2 Recent Advances
One of the distinctive abilities of NMR spectroscopy is to obtain information about interactions of proteins with other macromolecules or small molecules, and it is increasingly used in this way in determining the 3-D structure of proteins. Furthermore, NMR methods have been applied to drug design through the identification and characterization of small chemicals that inhibit protein function [8]. High-resolution liquid-state NMR spectroscopy can determine protein structure; however, large classes of proteins cannot be examined by this method because they cannot be prepared as sufficiently concentrated solutions. A new method, high-resolution solid-state NMR, has therefore been studied. So far, most studies have relied on selective isotope labeling of the proteins under investigation.
For practical purposes, however, it would be desirable to have tools with which structural information can be extracted directly from uniformly isotope-enriched samples. In the last few years, a whole arsenal of such tools has been developed by several groups around the world [9]. Recent advances in using multi-dimensional magic-angle-spinning (MAS) [10] solid-state NMR (ssNMR) to study molecular interactions, and the methodological aspects of these experiments, have been reviewed [9]. Interactions involving polypeptides, such as surface-bound peptides [11], or nucleotides [12, 13] provide additional exciting possibilities for ssNMR studies.

Because multi-dimensional NMR experiments are time-consuming, time-saving NMR methods have become an important target. The total time needed for a particular NMR experiment is determined by the need to achieve a sufficient signal-to-noise ratio (SNR), by the number of repetition cycles needed to suppress unwanted signals, and by the resolution required in the indirectly sampled evolution dimensions. In multidimensional experiments the last two requirements increase the duration tremendously beyond what is needed to achieve a sufficient signal-to-noise ratio [14]. Owing to improvements in the sensitivity of NMR spectrometers, a number of approaches that allow multi-dimensional NMR spectra to be acquired more efficiently have been published recently. Single-scan multi-dimensional spectroscopy perhaps yields the most radical improvement [15]. Approaches involving reduced dimensionality [16] and projection–reconstruction [17] can be readily implemented for well-known multi-dimensional experiments using current hardware. All of these techniques decrease the number of data points required in the evolution dimensions and can be combined with any coherence selection approach. Cogwheel phase cycling, introduced by Levitt and co-workers, has achieved very considerable reductions in the number of phase cycle steps in solid-state NMR experiments [18].

1.2.3 Evaluation of the Technology
Besides NMR spectroscopy, X-ray crystallography is the other experimental technique for determining the three-dimensional structure of proteins. It uses instruments to detect the electron density of the molecule crystallized in its native fold, and thus locates the atoms of the molecule in three-dimensional space. Although the two methods are based on different principles, both remain expensive and time-consuming despite numerous technological advances. NMR spectroscopy can characterize protein complexes under physiological conditions in atomic detail, even if the interactions are weak and transient, which gives it a unique role in the investigation of protein interactions with small molecules. It also has the unique ability to measure the dynamic properties of proteins accurately and to probe the process of protein folding [19, 20]. In addition, it does not need crystallization, which may be the greatest limitation of X-ray diffraction; this means that the structure obtained is much closer to that in the physiological state. However, a major problem of macromolecular NMR is its size limitation, caused by two technical barriers. First, larger molecules tumble more slowly and have shorter NMR signal relaxation times, which reduces the sensitivity of complicated pulse sequences that often use long delays for the necessary coherence transfer steps. Second, increased molecular weight brings more NMR-active nuclei, and therefore more interactions among them, which adds complexity to the spectrum. The current size limit of protein NMR is about 40 kDa, which is much less than that of X-ray diffraction [21].
1.2.4 Application of the Technology
Application of NMR to unfolded proteins. Some proteins are unstructured by themselves and fold only on forming specific complexes with other polypeptides or even small-molecule cofactors, so the structures of such proteins are not determined by their amino acid sequence alone but require other molecules in order to adopt a well-defined tertiary structure. NMR can deal with this problem [8]. An example of a protein that folds upon binding is the eukaryotic initiation factor 4G (eIF4G), which is the core of a multicomponent complex that controls translation initiation. Among other components, such as the RNA helicase eIF4A and the 40S-associated eIF3, eIF4G binds the cap-binding protein eIF4E and thus recruits the 5′ end of mRNA to the small ribosomal subunit. It has been reported that the eIF4E-binding domain of eIF4G is natively unstructured but folds upon binding to eIF4E [22]. A recent structural study of the yeast eIF4E–cap–eIF4G complex (Figure 1.2.2) revealed that an 80-residue segment of eIF4G folds upon wrapping around an otherwise unfolded N-terminal segment of eIF4E [23]. Thus, complex formation involves a mutually induced folding event.

Figure 1.2.2 Ribbon representation of eIF4E (yellow) in complex with a fragment of eIF4G (residues 393–490) (blue). The bound cap analog, m7GDP, is drawn in rod representation (purple). Upon binding eIF4E, eIF4G folds into a ring-shaped structure around the N-terminal tail, distal to the cap-binding site.

Multi-dimensional NMR to determine the 3-D structure of a protein. The basics of the NMR experiments used to determine a structure are outlined below without specific details. Two-dimensional NMR uses a complex pulse sequence to perturb the nuclear spins. Typically four periods are involved: preparation, evolution, mixing and detection. The two most common basic techniques are COSY (homonuclear J-correlated spectroscopy) and NOESY (Nuclear Overhauser Effect spectroscopy). The first gives correlations through covalent bonds, the second through space. Different types of pulse sequences are used for the different types of 2D (and 3D, 4D) spectra. The diagonal corresponds to a 1D spectrum. COSY spectra are the simplest: the cross-peaks arise from hydrogens on adjacent carbons, or on adjacent C and N, and they are used to determine amino acid residue identity. NOE interactions are short-range effects and are seen only between atoms closer than 4.5-5 Å. They are usually divided into three classes: strong NOEs (atoms closer than 2.5 Å), medium (2.5-3.5 Å) and longer (3.5-5 Å). The medium and long range NOEs are of most value, since the shorter range ones correspond to neighbouring covalently bonded atoms [25]. Particular types of secondary structure have characteristic coupling constants and NOE patterns, which allow secondary structure to be assigned as well. For the tertiary structure, the NOEs are applied as distance constraints in an algorithm such as distance geometry. The 3D picture is determined by collating the hundreds of distances obtained from the NOEs. Various refinement methods are then run on the structure to eliminate errors.
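The three NOE classes described above can be written as a small lookup, shown here as a sketch rather than as the procedure used in any particular structure-calculation program. The function names are hypothetical, the label "weak" is used for the 3.5-5 Å class, and the r⁻⁶ scaling of NOE intensity with distance is the standard approximation, added as an assumption because the text does not state it.

```python
def noe_class(distance_angstrom):
    """Classify a proton-proton distance using the ranges given in the text.

    Returns "strong" (< 2.5 A), "medium" (2.5-3.5 A), "weak" (3.5-5 A),
    or None when the atoms are too far apart to give an NOE.
    """
    if distance_angstrom <= 0:
        raise ValueError("distance must be positive")
    if distance_angstrom < 2.5:
        return "strong"
    if distance_angstrom <= 3.5:
        return "medium"
    if distance_angstrom <= 5.0:
        return "weak"
    return None


def relative_noe_intensity(distance_angstrom, reference_angstrom=2.5):
    """Intensity relative to a reference distance, assuming I ~ r**-6."""
    return (reference_angstrom / distance_angstrom) ** 6


if __name__ == "__main__":
    for r in (2.0, 3.0, 4.5, 6.0):
        print(f"{r:.1f} A: class={noe_class(r)}, "
              f"relative intensity={relative_noe_intensity(r):.4f}")
```

The steep fall-off with distance is the reason NOEs report only on atoms a few ångströms apart, and each class is normally turned into an upper distance bound that the calculated structure must satisfy.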
Typically one uses protein concentrations in the 1-3 mM range (i.e. very concentrated for protein solutions). 2D spectra may take from half a day to 3 or 4 days to collect. They are only useful at 500 MHz and higher fields. Today there are a number of 750 MHz (and even two 800 MHz) instruments in operation. [25]
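To see why a 1-3 mM sample counts as very concentrated, it helps to convert the concentration into the mass of protein that has to be produced. The sketch below does that arithmetic. The function name is hypothetical, and the sample volume of 0.5 mL and the molecular masses are assumptions chosen for illustration, not values from the text.

```python
def protein_mass_mg(concentration_mM, volume_mL, molecular_mass_kDa):
    """Mass of protein (mg) needed for an NMR sample.

    mM * mL gives micromoles, and 1 kDa corresponds to 1 mg per micromole,
    so the product of the three numbers is already in milligrams.
    """
    micromoles = concentration_mM * volume_mL
    return micromoles * molecular_mass_kDa


if __name__ == "__main__":
    volume_mL = 0.5  # assumed sample volume
    for mass_kDa in (10.0, 20.0, 40.0):
        for conc_mM in (1.0, 3.0):
            mg = protein_mass_mg(conc_mM, volume_mL, mass_kDa)
            print(f"{mass_kDa:.0f} kDa at {conc_mM:.0f} mM: {mg:.0f} mg")
```

Under these assumptions a single sample needs several milligrams to tens of milligrams of pure, soluble protein.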

The example used here is the human ASC PYRIN domain (ASC: apoptosis-associated speck-like protein containing a caspase recruitment domain), whose structure can be determined by the basic process described above, although with its own specific details. All the figures are from the work of Fabiola Espejo and Manuel E. Patarroyo of Colombia [26]. The first indication of secondary structure was obtained from CD spectra. The far-UV CD spectrum of ASC2 showed two characteristic α-helix minima at 208 and 224 nm (Fig. 1.2.3).

Fig. 1.2.3 Far-UV CD spectra of the ASC2 domain. The shape of the curve in the samples shows the typical behaviour of an α-helix.

The presence of alpha helix secondary structure elements can be observed. Negative values (−1 in Fig.1.2.4) indicated the presence of an α-helix.

Fig. 1.2.4. Sequential and medium-range NOEs used to verify the secondary structure elements derived from NOE analysis. NOEs between amide protons of consecutive residues and between Hα and the amide proton of subsequent residues are represented by bars connecting the residues.

Fig. 1.2.4 illustrates NMR-derived data summarizing the secondary structural elements of ASC2. Medium-range NOEs, dNN (i, i + 2), dαN (i, i + 3), dαN (i, i + 4) and dαβ (i, i + 3), together with Cα and Cβ secondary chemical shifts, identify the protein’s five α-helical regions. Proton D2O exchange experiments confirmed the presence of α-helix secondary structure elements (• in Fig. 1.2.4). The α-H of amino acids L9, K10, V11, L12, E13 and L15 were resistant to substitution, confirming the presence of H bonds in the first helix (Fig. 1.2.5).

Fig. 1.2.5. (A) 500 MHz 1H 15N-HSQC spectrum of recombinant ASC2 obtained at 300 K. Resonance assignments are indicated with the one-letter amino acid code and residue number. Side-chain amide protons of Asn and Gln are indicated by horizontal lines. (B) Proton D2O exchange experiments, recorded every 20 min, confirmed the NH groups involved in hydrogen bonds.

Structure calculation. Although strong NOEs defining helix 3 of this protein were not observed, the structure calculation shows that a helix is formed between amino acids 38 and 43 (Fig. 1.2.6).

Fig. 1.2.6. The solution structure of the ASC2 domain. (A) Superposition of the backbone atoms in the 25 conformers representing the NMR structure of the ASC2 pyrin domain. (B) A ribbon diagram of the Cα trace of the averaged minimized structure.

1.2.5 Relevant web site

1.2.6 Key Industry Suppliers

1. Instruments+Molecular+Biotools/
2.

1.2.7 References
1 Evans, Jeremy N. S. (1995) Biomolecular NMR spectroscopy. Oxford, N.Y.
2 Wüthrich, K. (1998) Nat. Struct. Biol. 5, Suppl., 492-495.
3 Mirau, P. A. (2002) A practical guide to understanding the NMR of polymers. Wiley, NY.
4 Sheffield Hallam University (2006)
5 Breitmaier, E. (2002) Structure elucidation by NMR in organic chemistry: a practical guide, 3rd rev. ed. Wiley, England.
6
7 Liu G, Shen Y, Atreya HS, Parish D, Shao Y, Sukumaran DK, Xiao R, Yee A, Lemak A, Bhattacharya A, Acton TA, Arrowsmith CH, Montelione GT, Szyperski T. (2005) NMR data collection and analysis protocol for high-throughput protein structure determination. Proc Natl Acad Sci U S A. 30, 10487-92.
8 Koh, T. and Gerhard, W. (2006) NMR studies of protein interactions. Current Opinion in Structural Biology 16 (1), pp. 109-117.
9 Marc, B. (2006) Molecular interactions investigated by multi-dimensional solid-state NMR. Current Opinion in Structural Biology 16 (5), pp. 613-628.
10 E.R. Andrew, A. Bradbury and R.G. Eades (1958) Nuclear magnetic resonance spectra from a crystal rotated at high speed, Nature 182, p. 1659.
11 V. Raghunathan, J.M. Gibson, G. Goobes, J.M. Popham, E.A. Louie, P.S. Stayton and G.P. Drobny (2006) Homonuclear and heteronuclear NMR studies of a statherin fragment bound to hydroxyapatite crystals, J Phys Chem B Condens Matter Mater Surf Interfaces Biophys 110, pp. 9324–9332.
12 J. Leppert, C.R. Urbinati, S. Hafner, O. Ohlenschlager, M.S. Swanson, M. Gorlach and R. Ramachandran (2004) Identification of NH.N hydrogen bonds by magic angle spinning solid state NMR in a double-stranded RNA associated with myotonic dystrophy, Nucleic Acids Res 32, pp. 1177–1183.
13 G.L. Olsen, T.E. Edwards, P. Deka, G. Varani, S.T. Sigurdsson and G.P. Drobny (2005) Monitoring tat peptide binding to TAR RNA by solid-state 31P-19F REDOR NMR, Nucleic Acids Res 33, pp. 3447–3454.
14 Gerhard, Z. and Norbert, M. (2006) Cogwheel phase cycling in common triple resonance NMR experiments for the liquid phase, J Magnetic Resonance 181 (2), pp. 244-253.
15 L. Frydman, T. Scherf and A. Lupulescu (2002) The acquisition of multidimensional NMR spectra within a single scan, Proc. Natl. Acad. Sci. USA 99 (25), pp. 15858–15862.
16 T. Szyperski, G. Wider, J.H. Bushweller and K. Wüthrich (1993) Reduced dimensionality in triple-resonance NMR experiments, J. Am. Chem. Soc. 115, pp. 9307–9308.
17 E. Kupče and R. Freeman (2003) Reconstruction of the three-dimensional NMR spectrum of a protein from a set of plane projections, J. Biomol. NMR 27, pp. 383–387.
18 M.H. Levitt, P.K. Madhu and C.E. Hughes (2002) Cogwheel phase cycling, J. Magn. Reson. 155, pp. 300–306.
19 Kay, L. E. (1998) Nat. Struct. Biol. 5, Suppl., 513-517.
20 Dobson, C. M. & Hore, P. J. (1998) Nat. Struct. Biol. 5, Suppl., 504-507.
21 Hong, T.Y. (1999) Proc. Natl. Acad. Sci. USA 96, 332-334.
22 P.E.C. Hershey, S.M. McWhirter, J.D. Gross, G. Wagner, T. Alber and A.B. Sachs (1999) The cap-binding protein eIF4E promotes folding of a functional domain of yeast translation initiation factor eIF4G1, J Biol Chem 274, pp. 21297–21304.
23 J.D. Gross, N.J. Moerke, T. von der Haar, A.A. Lugovskoy, A.B. Sachs, J.E.G. McCarthy and G. Wagner (2003) Ribosome loading onto the mRNA cap is driven by conformational coupling between eIF4G and eIF4E, Cell 115, pp. 739–750.
24
25 Fabiola, E. and Manuel, E. P. (2006) Determining the 3D structure of human ASC2 protein involved in apoptosis and inflammation, Biochemical and Biophysical Research Communications 340 (3), pp. 860-864.

Chapter 1.4 Fluorescence Spectroscopy of Proteins and Applications
M Sabir Patel



Fluorescence spectroscopy is a type of electromagnetic spectroscopy used for the analysis of fluorescence spectra. Molecules of certain compounds are excited with a beam of light, usually UV, causing them to emit light of lower energy and frequency, generally as visible luminescence. The molecule absorbs a photon, which excites it from its ground electronic state to one of the vibrational states of an excited electronic state. The molecule loses its vibrational energy through collisions with other molecules and reaches the lowest vibrational state of the excited electronic state. It then drops to one of the vibrational states of the ground electronic state, emitting a photon in the process. Because the molecule can end up in different vibrational levels, the emitted photons vary in energy and therefore in frequency. The structure of the vibrational levels can thus be determined by analyzing the frequencies of the light emitted. The energy carried by one photon is proportional to its frequency and is given by

E = hν = hc/λ (in ergs)

where ν is the frequency, λ the corresponding wavelength, c the speed of light and h Planck's constant (6.626 x 10^-27 erg·s).
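The relation E = hν = hc/λ can be checked numerically. The short Python sketch below is only an illustration: the function names and the two example wavelengths (280 nm for excitation, 350 nm for emission) are assumptions chosen for the example and are not taken from the text. It shows that the emitted photon, having the longer wavelength, carries less energy than the absorbed one.

```python
# Physical constants in SI units (CODATA values).
PLANCK_J_S = 6.62607015e-34      # Planck's constant h, in J*s
LIGHT_SPEED_M_S = 2.99792458e8   # speed of light c, in m/s
ERG_PER_JOULE = 1.0e7            # 1 J = 10^7 erg


def photon_energy_joules(wavelength_nm):
    """Energy of one photon, E = h*c/lambda, for a wavelength given in nm."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


def photon_energy_ergs(wavelength_nm):
    """Same energy expressed in ergs, the unit used in the text."""
    return photon_energy_joules(wavelength_nm) * ERG_PER_JOULE


if __name__ == "__main__":
    # Hypothetical example wavelengths, chosen only for illustration.
    for label, wavelength in [("excitation", 280.0), ("emission", 350.0)]:
        energy = photon_energy_ergs(wavelength)
        print(f"{label}: {wavelength:.0f} nm -> {energy:.3e} erg per photon")
```

The difference between the two energies corresponds to the vibrational energy lost by the molecule before and after emission.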

Figure 1 Transitions giving rise to absorption and fluorescence emission Spectra


Recent Advances (Center for Fluorescence Spectroscopy, CFS)
Fluorescence Sensing

Fluorescence promises many potential applications in biotechnology and clinical chemistry through newer methods of fluorescence sensing combined with newer probes.

Lifetime-based sensing: Lifetime-based sensing, an active area of research, has the major advantage that the fluorescence lifetime is independent of changes in signal intensity caused by external factors such as scattering and absorption. Lifetime-based sensing has recently been applied in high-throughput screening (HTS) mode using two-photon excitation of Calcium Green.

Novel sensing methods: More recent developments such as modulation sensing and polarization sensing have wide application in tissue and medical sensing, environmental sensing, bioimaging assays and HTS. Modulation sensing transforms analyte-dependent intensity changes into a change in a low-frequency modulation signal. Polarization sensing transforms analyte-dependent intensity changes into a change in polarization and/or rotation angle, increasing sensitivity and enabling visual detection. The methods are calibrated using a reference film placed adjacent to the sample. These sensing methods are generic and can be used with any fluorophore that displays an analyte-dependent change in intensity.
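The claim that the lifetime does not depend on signal intensity can be illustrated with a small Python sketch. It assumes a single-exponential decay, I(t) = I0 exp(-t/τ), which is a simplification introduced here; the function names and the numbers are hypothetical. Two noise-free decays with the same lifetime but a tenfold difference in amplitude (as might result from scattering or absorption losses) give the same fitted lifetime.

```python
import math


def decay_curve(amplitude, lifetime_ns, times_ns):
    """Single-exponential intensity decay I(t) = I0 * exp(-t / tau)."""
    return [amplitude * math.exp(-t / lifetime_ns) for t in times_ns]


def fit_lifetime_ns(times_ns, intensities):
    """Estimate tau from the least-squares slope of ln(I) versus t.

    For a single exponential, ln(I) = ln(I0) - t/tau, so tau = -1/slope.
    The amplitude I0 only shifts the intercept and does not affect tau.
    """
    logs = [math.log(value) for value in intensities]
    n = len(times_ns)
    mean_t = sum(times_ns) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ns, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_ns)
    slope = sxy / sxx
    return -1.0 / slope


if __name__ == "__main__":
    times = [0.5 * k for k in range(40)]        # 0 to 19.5 ns
    bright = decay_curve(1000.0, 4.0, times)    # hypothetical 4 ns lifetime
    dim = decay_curve(100.0, 4.0, times)        # same probe, 10x less signal
    print(fit_lifetime_ns(times, bright), fit_lifetime_ns(times, dim))
```

Real data contain noise and often several decay components, so practical analysis uses more elaborate fitting; the sketch only shows why the amplitude drops out.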

Complex Glucose-Glucokinase

Fluorescence Lifetime Imaging: The Fluorescence Lifetime Imaging Method (FLIM) was developed by CFS. In a fluorescence microscope, FLIM measures the fluorescence lifetime at each point in the image and uses it to derive the image contrast. Analytes such as Mg2+, Ca2+, Cl-, K+ or pH alter the fluorescence lifetimes of many fluorophores. These lifetimes, however, do not depend on photobleaching or on the local probe concentration. FLIM also does not require wavelength-ratiometric probes. Using visible-wavelength illumination, FLIM allows quantitative Ca2+ imaging. CFS is currently developing a new FLIM instrument, which will also be available to external users of CFS.

Ionophore and weak base treatments perturbed the cytosolic pH of CHO cells.

Probe Chemistry: The CFS is focused on the design and synthesis of new fluorescent probes, novel conjugatable emissive transition metal-ligand complexes and lanthanide compounds, as well as donor-acceptor assemblies, to meet the needs of expanding applications of fluorescence not only in biochemical and biophysical research but also in biotechnology, drug discovery and cell biology. CFS has successfully introduced a variety of ruthenium(II), rhenium(I) and osmium(II) diimine complexes which display lifetimes ranging from 100 nanoseconds to 1 microsecond. These are of particular interest to bio-researchers for the study of the dynamics of biomacromolecules such as proteins and lipids. Below are some of the projects involving probe chemistry that CFS is currently working on:
• Studying the enhancement of molecular luminescence near the surface of silver nanoparticles
• Exploring luminescent iridium(III) polypyridyl complexes
• Developing red emissive energy transfer assemblies for sensing

The new series of rhenium(I) complexes have:
• λmax at 610 nm in a buffer with µs-scale lifetime
• MLCT absorption doubled at 400 nm
• Much higher thermal and photochemical stability
• Very high anisotropies

Light Quenching: To measure time-resolved fluorescence, the sample is excited with a light pulse and the time-dependent emission is then recorded. In light quenching, additional pulses follow the excitation pulse in order to modify the excited-state population. Illumination with longer-wavelength light, which the sample does not absorb, depletes part of the excited population by stimulated emission. Strictly, the fluorophores are not quenched; they only appear to be, because the residual population is observed at right angles to the quenching beam, while the stimulated ("quenched") emission travels parallel to the quenching beam and is not observed. In time-resolved light quenching experiments the emission is therefore observed only before and after the quenching pulse. The instantaneous change in the intensity and/or anisotropy decays appears in the frequency domain as characteristic oscillations. Light quenching provides an opportunity to control the excited-state population and the orientation of fluorophores. In the presence of light quenching, fluorescence anisotropies above 0.4 and below -0.2 can be observed. Based on these demonstrations, light quenching can be used to:
• Study the spectral relaxation of solvent-sensitive fluorophores.
• Eliminate unwanted fluorescent species from a mixture of fluorophores.
• Increase the spatial resolution in far-field fluorescence microscopy.
• Quench the fluorescence near the glass surface in total internal reflection excitation.
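The anisotropy values of 0.4 and -0.2 mentioned above are significant because they are the usual limits for ordinary one-photon excitation of a randomly oriented sample. The standard expression for the fundamental anisotropy is r0 = (2/5)(3cos²β - 1)/2, where β is the angle between the absorption and emission transition moments. The Python sketch below evaluates it; the expression is textbook photoselection theory rather than something stated in this chapter, and the function name is an assumption.

```python
import math


def fundamental_anisotropy(beta_degrees):
    """r0 = (2/5) * (3*cos^2(beta) - 1) / 2 for one-photon excitation
    of an isotropic (randomly oriented) sample, with beta the angle
    between the absorption and emission transition moments."""
    cos_squared = math.cos(math.radians(beta_degrees)) ** 2
    return 0.4 * (3.0 * cos_squared - 1.0) / 2.0


if __name__ == "__main__":
    for beta in (0.0, 30.0, 54.7, 90.0):
        print(f"beta = {beta:5.1f} deg -> r0 = {fundamental_anisotropy(beta):+.3f}")
```

Collinear moments (β = 0°) give the upper limit of 0.4 and perpendicular moments (β = 90°) give the lower limit of -0.2, so values outside this range indicate that the excited-state orientation has been modified, as light quenching does.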

Light quenching with parallel excitation and quenching pulses. Excited state population without light quenching (top), with time-coincident or one-pulse light quenching (middle) and with time-delayed light quenching (bottom).

Multi-Pulse Fluorescence
Multi-Pulse Fluorescence (MPF) is a new CFS core project, currently under development. The basic idea of MPF is to perturb the sample with one light pulse and to start the time-resolved measurement with a second excitation pulse after a delay time t. The time-resolved data will be correlated with time-dependent structural changes in the protein following the perturbation pulse. This approach will be applied to proteins which undergo conformational changes in response to light, including hemoglobin, rhodopsin and phytochromes. Both steady-state and time-resolved measurements will be performed.

Experimental arrangement for two-pulse fluorescence measurements. L are lenses, P are polarizers, D are diaphragms, R are polarization rotators, and F are optical filters. A long delay line DL1 is placed in the UV beam and a fine delay line DL2 in the visible beam; DF is a coated optical plate (dichroic filter) and PD is a reference photodiode used in time-resolved measurements.

1.4.3 Evaluation of the technology

In analytical chemistry, the field of atomic spectroscopy employs three common techniques, namely:
• Atomic absorption
• Atomic emission
• Atomic fluorescence

Since the atomic fluorescence technique incorporates properties of the other two, we explain this technique and its prime advantage over them. The atoms in the flame are excited by a beam of light focused into the atomic vapor. The intensity of the resulting "fluorescence" increases with increasing atom concentration, providing the basis for quantitative determination. The lamp is mounted at an angle to the optical system, so that the light detector sees only the fluorescence in the flame and not the light of the lamp itself. With this arrangement the lamp intensity can be increased dramatically, which in turn raises the number of excited atoms, since this number is a function of the intensity of the exciting radiation. In conclusion, even though atomic absorption is the most widely used of the three, the fluorescence technique offers particular benefits.

Fluorescence correlation spectroscopy (FCS) is an analytical technique that measures fluctuations of fluorescence intensity, which are mainly due to the Brownian motion of the particles. The analysis determines the average number of fluorescent particles in the observed space and the average diffusion time a particle takes to pass through it. From these, both the concentration and the size of the particle (molecule) can be evaluated. The technique is well suited to biochemistry, biophysics and analytical chemistry. Unlike HPLC analysis, it involves no physical separation process and has good spatial resolution, determined by the optics, which is a great advantage over previous methods. Using fluorescently tagged molecules, it allows the nature and function of biochemical pathways in a living cell to be studied.

Radiative Decay Engineering (RDE)
Radiative Decay Engineering (RDE) is one of the more recent core projects at CFS and is currently under development.
Fluorophores placed close to metal surfaces or colloids display a varied set of spectroscopic effects. These effects arise from several mechanisms, such as quenching with a d^-3 distance dependence, enhancement of the local field and an increase in the radiative decay rate, which together determine the quantum yield. For ellipsoidal particles the maximum enhancement of the local field is about 140. As the radiative rate increases, the total decay rate increases and so does the quantum yield. There are two limiting cases. If the dye has a high quantum yield (Q0 → 1), the additional radiative decay rate cannot substantially increase the quantum yield. For low quantum yield chromophores the enhancement can be as large as 1/Q0. For this reason it is of interest to study fluorophore-metal interactions with low quantum yield fluorophores. While the actual mechanism is complex, one can imagine that the particles serve as an antenna that lets the fluorophore radiate faster than the non-radiative decay rate knr. This suggests that the emission from weakly fluorescent substances can be increased if they are positioned at an appropriate distance from a metal surface or colloid. Chromophores normally regarded as non-fluorescent can also become fluorescent when the radiative decay rate is increased. DNA, for example, can become brightly fluorescent without the use of extrinsic probes. In proteins, tryptophan residues are quenched by nearby groups such as histidine, phenylalanine or disulphide groups. By increasing the radiative rate, and thereby the quantum yield of the quenched tryptophan residues, the desired fluorescence can be obtained without resorting to site-directed mutagenesis to produce mutant proteins. Similarly, high quantum yields are achieved if metal surfaces are placed close to weakly fluorescent species such as bilirubin, fullerenes, metal-ligand complexes or porphyrins.
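The two limiting cases described above can be made concrete with a small Python sketch. It assumes the commonly used rate expression Q = (Γ + Γm) / (Γ + Γm + knr), where Γ is the intrinsic radiative rate, Γm the additional radiative rate induced by the metal, and knr the non-radiative rate. The symbols Γ and Γm, the function names and the numerical rates are assumptions introduced for this illustration; only Q0 and knr appear in the text.

```python
def quantum_yield(gamma, k_nr, gamma_metal=0.0):
    """Quantum yield Q = (gamma + gamma_metal) / (gamma + gamma_metal + k_nr).

    gamma       : intrinsic radiative decay rate (1/s)
    k_nr        : non-radiative decay rate (1/s)
    gamma_metal : extra radiative rate due to the nearby metal (1/s), assumed
    """
    radiative = gamma + gamma_metal
    return radiative / (radiative + k_nr)


def enhancement(gamma, k_nr, gamma_metal):
    """Ratio of the quantum yield near the metal to the free-space value Q0."""
    return quantum_yield(gamma, k_nr, gamma_metal) / quantum_yield(gamma, k_nr)


if __name__ == "__main__":
    # Hypothetical rates: a weak emitter (Q0 = 0.01) and a strong one (Q0 = 0.9).
    for name, gamma, k_nr in [("low Q0", 1e6, 9.9e7), ("high Q0", 9e7, 1e7)]:
        q0 = quantum_yield(gamma, k_nr)
        for gamma_metal in (1e8, 1e10, 1e12):
            factor = enhancement(gamma, k_nr, gamma_metal)
            print(f"{name}: Q0 = {q0:.2f}, gamma_metal = {gamma_metal:.0e} "
                  f"-> enhancement = {factor:.2f} (limit 1/Q0 = {1 / q0:.2f})")
```

As Γm grows, the enhancement approaches but never exceeds 1/Q0, which is why the effect is large for weak emitters and negligible for dyes whose quantum yield is already close to 1.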


Applications of the Technology

Fluorescence spectroscopy has made inroads into the food industry, where it offers newer, more rapid and non-destructive methods. The emission spectra can be quite complex, arising from a number of known as well as unknown molecules, and the total fluorescence is sensitive to the local environment, which can reduce the robustness of the method. But with constant improvements in sensor technology and optics, fluorescence spectroscopy shows many more promising applications in varied fields of the life sciences. Fluorescence has proved a highly effective method for measuring fat and connective tissue in meat. New techniques are under development for non-destructive measurement of lipid oxidation based on solid-sample fluorescence, and the outcome of various experiments on poultry meat, complex meat products, dairy products and fish has been rather fruitful.

Diagnostic tools for malignant diseases are being developed using fluorescence spectroscopy techniques. Autofluorescence emitted by the tissue along with fluorescence from specially designed probes attached to the target are detected using spectroscopy. One of the major advantages of using fluorescence spectroscopy is that it is a non-invasive tool and it can be performed in real-time. The prime objective is to develop guiding tools for biopsies based on fluorescence measurements.

Fluorescence emitted from meat and the fluorophore Rhodamine 6G

Engineer Research and Development Center (ERDC)
The US Army Engineer Research and Development Center (ERDC) operates the Fluorescence Spectroscopy Laboratory (FSL), which carries out basic and applied research focused on the development and testing of fluorophores for recovery by remote sensing. The FSL is also committed to the testing and detection of living organic and inorganic materials which may be harmful agents or environmental hazards relevant to the war fighter. The FSL has state-of-the-art spectrometers capable of measuring steady-state and lifetime-decay fluorescence spectra of fluorophores. The lab uses a femtosecond Ti:Sapphire laser that can characterize even near-infrared fluorophores. FSL supports imagery-based fluorescence measurements using laser-induced fluorescence, and passive fluorescence measurements by Fraunhofer Line Discrimination. These measurements are of great importance to defense, intelligence and mapping agencies, with baseline research in polymer detection, backgrounds, and characterization of fluorophores for environmental analysis.

Multi-Photon Excitation
The absorption of two or more long-wavelength photons followed by fluorescence emission from the lowest excited state is termed Multi-Photon Excitation (MPE).

The primary objective is to study the basic principles and biochemical applications of MPE. MPE allows localized excitation in fluorescence microscopy and enhanced photoselection in spectroscopy. The study of MPE has been applied to, but is not limited to:
• Spectroscopic properties and images of stained DNA
• Tyrosine, tryptophan and proteins
• DPH-labeled membranes
• Detection of anti-cancer drugs in blood
• Ru and Re metal-ligand complexes

MPE has also been used to observe a UV fluorescence from saturated hydrocarbons and cholesterol analogues. CFS also applied MPE in High Throughput Screening (HTS).

Emission intensity of p-terphenyl with NIR (750 nm), UV (375 nm) and combined (375 and 750 nm) excitation

Emission spectra of p-terphenyl with 250 nm excitation and with 2C2P excitation at 375 and 750 nm.
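The wavelengths quoted in the p-terphenyl captions follow from the fact that photon energies add in multi-photon excitation. Since the energy of a photon is proportional to 1/λ, the reciprocal wavelengths add. The Python sketch below, with a function name chosen here purely for illustration, computes the single-photon wavelength that carries the same total energy as a set of simultaneously absorbed photons.

```python
def equivalent_one_photon_wavelength_nm(wavelengths_nm):
    """Wavelength of a single photon whose energy equals the summed
    energy of the given photons. Because E is proportional to 1/lambda,
    1/lambda_eq = sum(1/lambda_i)."""
    return 1.0 / sum(1.0 / wavelength for wavelength in wavelengths_nm)


if __name__ == "__main__":
    # Two NIR photons at 750 nm match one UV photon at 375 nm.
    print(equivalent_one_photon_wavelength_nm([750.0, 750.0]))
    # One 375 nm photon plus one 750 nm photon (two-color two-photon
    # excitation) match one photon at 250 nm.
    print(equivalent_one_photon_wavelength_nm([375.0, 750.0]))
    # Three photons at 750 nm also match one photon at 250 nm.
    print(equivalent_one_photon_wavelength_nm([750.0, 750.0, 750.0]))
```

This is only an energy balance; the selection rules and cross-sections for multi-photon absorption differ from those of one-photon absorption and are not captured here.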

Microsecond Dynamics of Macromolecules
In general, biological macromolecules are known to have motions on timescales ranging from picoseconds to days or even weeks, and these motions are vital for their proper functioning. Fluorescence techniques give access to the picosecond-to-nanosecond range, while slower, millisecond motions are accessible by NMR and ESR. With the development of stopped-flow instrumentation, NMR, temperature jump and faster correlation spectroscopy, the microsecond window has become experimentally accessible. CFS is currently using µs metal-ligand complex (MLC) and µs lanthanide probes as donors in FRET experiments to characterize conformational dynamics in the microsecond time window. The method provides both the kinetics and the conformational distributions, as well as intra- and intermolecular diffusion on the µs to ms timescale. It is an equilibrium method which can be applied to very diverse problems.

MAIN AIMS
• Folding/unfolding of protein secondary structure elements: alpha-helix and beta-hairpin
• Inter-domain motions in histidine and glutamine binding proteins
• Lipid composition and phase state dependence of lateral diffusion in membranes
• Lateral diffusion in biological membranes and domains such as rafts
• Domain motions/arm flexing in nucleic acid structures such as tRNA, ribozymes, junctions and hairpins.
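FRET experiments such as those mentioned above report on conformation because the transfer efficiency depends steeply on the donor-acceptor distance. The Python sketch below uses the standard Förster relation E = 1 / (1 + (r/R0)^6). The relation is textbook material rather than something given in this chapter, and the function names and the Förster radius of 50 Å are hypothetical values used only for illustration.

```python
def fret_efficiency(distance_angstrom, forster_radius_angstrom):
    """Forster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    ratio_sixth = (distance_angstrom / forster_radius_angstrom) ** 6
    return 1.0 / (1.0 + ratio_sixth)


def distance_from_efficiency(efficiency, forster_radius_angstrom):
    """Invert the Forster relation: r = R0 * (1/E - 1)**(1/6)."""
    return forster_radius_angstrom * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Forster radius in angstroms
    for r in (30.0, 40.0, 50.0, 60.0, 70.0):
        print(f"r = {r:.0f} A -> E = {fret_efficiency(r, r0):.3f}")
```

Near r = R0 a small change in distance produces a large change in efficiency, which is what makes the method sensitive to domain motions and folding transitions.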


Relevant web sites: The following web site links may be useful for further reading and as learning resources.
http://


Key Industry Suppliers Instrumentation:

CFS Product range can be seen at (

State-of-the-art time-domain (TD) and frequency-domain (FD) fluorescence instrumentation for time-resolved studies of biological macromolecules is currently available at the Center for Fluorescence Spectroscopy (CFS). The excitation sources are cavity-dumped and frequency-doubled ps dye lasers, or a Ti:Sapphire laser. A microchannel plate (MCP)-PMT is presently used for time-correlated single photon counting (TCSPC), providing an instrument response function near 60 ps. Frequency-domain measurements are possible up to 10 GHz using the Center's FD instrument and a high-speed MCP-PMT. Available excitation wavelengths range from UV to NIR. For less demanding applications, modulated cw lasers (for FD) are available. A unique capability of the CFS will be the ability to collect and analyze both TD and FD data for the same samples and, in the future, to perform simultaneous dual-domain (DD) analysis of the data. A Ti:Sapphire laser is now available for two- and three-photon excitation. Extensive research and development is under way to develop newer and better instruments for fluorescence lifetime imaging microscopy (FLIM) and for cell-by-cell lifetime measurements in flow cytometry. A FLIM instrument with a red-sensitive image intensifier will soon be available for photon migration imaging of tissues and turbid objects.

INSTRUMENT SPECIFICATIONS
EXCITATION SOURCES (PRIMARY)
• Argon Ion (Coherent) mode-locked laser, 1 W at 514 nm, fwhm = 120 ps.
• Ti:Sapphire femtosecond laser system (Spectra Physics), 750-920 nm, can be frequency doubled (375-460 nm) or frequency tripled (250-310 nm); fwhm = 90 fs
• Ti:Sapphire, regenerative amplifier, optical parametric amplifier system (Coherent), 120 fs fwhm; tunable from UV-IR.
• Argon Ion air cooled, CW, 488, 514.5 nm.
• HeCd air cooled, CW, 442 nm (Liconix)
• HeCd air cooled, CW, 325 nm (Liconix)
• Modulated CW laser diodes and light emitting diodes

• Rhodamine 6G cavity-dumped dye laser, 560-620 nm, 280-310 nm after doubling, fwhm = 5 ps (with saturable absorber if needed)
• Pyridine 1 and pyridine 2 cavity-dumped dye lasers, 680-760 nm, 340-380 nm after doubling, fwhm = 7 ps
• DCM cavity-dumped dye laser, 620-680 nm, 310-340 nm after doubling, fwhm = 7 ps

• MCP-PMT, Hamamatsu R2809, with a red-sensitive photocathode, 60 ps fwhm
• PMT, Philips XP2020, 500 ps fwhm

• MCP-PMT, Hamamatsu R2566, 6 µ, for FD measurements to 10 GHz
• PMT, Hamamatsu R928, for FD measurements to 300 MHz
• A red-sensitive 6 µ R2566, for FD measurements to 10 GHz

• Zeiss Axiovert 135 TV inverted fluorescence microscope
• Gain-modulated image intensifier operated up to 100 MHz modulation frequency
• Photometrics PXL-35 CCD frame-shift camera and Innovision software

• Nikon inverted microscope 300
• FD instrumentation 0.4 - 10,000 MHz

• TCSPC with a MCP-PMT and a ps laser or fs excitation source
• FD up to 10 GHz using the harmonic content of a ps or fs pulse train
• TCSPC with a laser and fast PMT
• FD from 3 kHz to 400 MHz with a modulated cw source

• Three Silicon Graphics Indy workstations running under the IRIX 5.3 operating system
• PCs running under Windows 98 and Windows 2000 operating systems (see CFS Network). Internet access to all programs.
• Reflection 4+ for Windows graphics terminal emulator recommended
• Data transfer between computers possible via Internet (FTP). Data and/or programs are also available on diskettes (MS-DOS, LS120), or CD-RW
• Terminals adjacent to instrument room

• Steady state emission spectra (SLM 4800 Spectrofluorometer and SLM AB-2)
• Absorption diode array (190 nm - 1100 nm) spectrophotometer (Hewlett Packard)
• Linear dichroism and transition moment determination
• Lab space adjacent to instrument room
• Temperature control, -60 to +90 °C
• Ultracentrifuge (Beckman L5-65B), 65,000 rpm
• Gas pressure cell to 100 atm, and a 2 kbar hydrostatic pressure cell

Iridian Spectral Technologies
Iridian Spectral Technologies is one of the market leaders specializing in fluorescence spectroscopy products. Detailed information on the product line can be seen at their website (

RISO National Laboratory, Denmark
Website: (
Description of Fluorescence Spectroscopy (FLS920)
Apparatus: An FLS920 from Edinburgh Instruments capable of steady-state and time-resolved measurements in the 200 nm-900 nm range. The system is equipped with a 450 W Xe lamp for steady-state measurements. For time-resolved measurements two different light sources can be used. Short lifetimes (<125 ns) can be measured using a fast 40 MHz LED light source (excitation around 378 nm, 456 nm, 501 nm or 598 nm). For longer lifetimes (up to 50 ms) a 40 kHz nanosecond flash lamp can be used.
Accessories:
• Polarisers to measure orientation.
• A cryostat to measure at low temperature (77 K).
Use: Fluorescence spectroscopy can be used to obtain steady-state emission and excitation spectra (200 nm-900 nm) and to measure fluorescence lifetimes (0.1 ns-50 ms) of fluorescent compounds.


Simulation of Fluorescence Spectroscopy

A scanning spectrofluorometer can be simulated in real time using software. The software can be downloaded, and the instructions and system requirements are available, from the following website: (
A wide range of spectroscopy instruments is available from Andor Technologies, and orders for these instruments can be placed at their website (



1. Molecular Photoluminescence Spectroscopy(2000). David Harvey. In: Modern Analytical Chemistry, McGraw Hill, pp.368-380, 423-440

2. Molecular Fluorescence Spectroscopy(2004). Skoog, West, Holler and Crouch. In: Fundamentals of Analytical Chemistry, Thomson Brooks/Cole.

3. Recent Developments in Fluorescence Spectroscopy (1996). Lakowicz, J.R., Terpetschnig, E., Szmacinski, H., Malak, H., Kusba, J. and Gryczynski, I.,. In: Analytical Use of Fluorescent Probes in Oncology. (Kohen, E., and Hirschberg, J.G., Eds.), Plenum Press, New York. pp. 65-79.

4. Imaging Applications of Time-Resolved Fluorescence Spectroscopy (1996). Lakowicz, J.R. and Szmacinski, H. In: Fluorescence Imaging Spectroscopy and Microscopy. Vol. 137. (X.F. Wang and B. Herman, Eds.), John Wiley & Sons, Inc. Publishers. pp. 273-311.

5. Emerging Applications of Fluorescence Spectroscopy to Cellular Imaging: Lifetime Imaging, Metal-Ligand Probes, Multi-Photon Excitation and Light Quenching (1996). Lakowicz, J. R. In: Scanning Microscopy Supplement Vol. 10. Scanning Microscopy International. pp. 213-224.

6. Fluorescence Spectroscopy of Biomolecules. (1995). In: Molecular Biology and Biotechnology (R.A. Meyers, Ed.), VCH Publishers, Inc

Chapter 1.5 Atomic Force Microscopy of Proteins
Trusha Jhala

1.5.1 Introduction
The Atomic Force Microscope, widely known as the "Eye of Nanotechnology", has proven to be a powerful tool for biological studies [1]. It is a high-resolution imaging technique for surface morphology in various solution and gas environments that has allowed researchers to observe biological processes in real time. [10,11] Atomic Force Microscopy (AFM), a form of Scanning Probe Microscopy, has revolutionized the field of interfacial surface science by allowing observation at molecular and atomic levels, in the native environment and at the single-molecule level. [2,10,11] It was invented by Gerd Binnig and Christoph Gerber in 1985.[13] The AFM is used in a wide range of technologies in the electronics, telecommunications, biological, chemical, automotive, aerospace and energy industries. It is used for studies of abrasion, adhesion, cleaning, corrosion, etching, friction and lubrication, and to analyze and investigate a wide variety of materials.[8,13] It is now used in many fields of nanoscience and nanotechnology, providing a better understanding of events occurring at the molecular level. [11] AFM is widely used because it can image any conducting or non-conducting surface, unlike STM, which is limited to conducting surfaces.

1.5.2. Principle
The AFM works through contact between the cantilever tip and the sample surface to be imaged. The cantilever tips are made of Si3N4 or Si, with radii of 4-60 nm. The cantilever carries an ultra-sharp probe, which is of significant importance: the shape of the probe defines the resolution of the AFM, and the cantilever has a specific spring constant. The surface topography is measured by keeping the force constant while the tip scans the surface and moves vertically in response to attractive or repulsive interaction forces. The piezo-electric scanners keep the tip at a constant force when height information is to be obtained, and at a constant height when force information is to be obtained. Most AFM scanners have a range of about 90 x 90 µm in the x-y plane and 5 µm in the z-direction.[13] The cantilever tip bends upwards due to the ionic repulsive force from the surface. This bending is measured by a laser beam reflected onto a position-sensitive photodetector, which measures minute deflections of the sensor. This is used to calculate the force and allows visualization of the surface topography. [1,13] In a NanoScope AFM, the optical detection system comprises the tip attached to the underside of the reflective cantilever and a diode laser focused on the back of the cantilever. When the tip scans the sample surface, the laser beam is deflected into a dual-element photodiode; the photodetector measures the deflection and converts it to a voltage. Using this input from the photodetector, the computer software controls and maintains a constant force or a constant height above the sample surface. [13] The constant force mode measures the height deviation in real time through the piezo-electric transducer. The constant height mode measures the deflection force on the tip; this requires calibration parameters for the tip and the sensitivity of the AFM head to be entered during force calibration of the microscope. A few AFMs can accommodate 200 mm wafers and measure surface roughness with 5 nm lateral and 0.01 nm vertical resolution. Scanners measure the local height of the sample by moving the sample under the cantilever or the cantilever over the sample. Three-dimensional topographical maps can be constructed from the sample height and the probe tip position. [13,15] The cantilever obeys Hooke's Law, so the interaction force can be found from its deflection. The piezo-electric ceramic scanners carry out the movement of the tip or the sample with fine resolution in the x-, y- and z-directions.[14]

AFM can be operated in two modes: [14]
• With feedback control
• Without feedback control

With feedback control: When the electronic feedback is switched on, the piezo scanner, which is responsible for the movements of the tip or the sample, reacts to changes and readjusts the tip-sample separation to the constant pre-determined value in order to maintain a constant force. This is known as the height mode.[14]
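The statement that the cantilever obeys Hooke's Law can be turned into a short calculation, F = k·x, where k is the spring constant of the cantilever and x its deflection. The Python sketch below is a minimal illustration: the function names, the deflection sensitivity (nm of deflection per volt of photodiode signal) and the numerical values are assumptions chosen for the example, not figures from the text.

```python
def deflection_from_photodiode_nm(signal_volts, sensitivity_nm_per_volt):
    """Convert the photodetector voltage into a cantilever deflection,
    using a deflection sensitivity obtained during calibration."""
    return signal_volts * sensitivity_nm_per_volt


def cantilever_force_nN(spring_constant_N_per_m, deflection_nm):
    """Hooke's Law, F = k * x. With k in N/m and x in nm, F is in nN."""
    return spring_constant_N_per_m * deflection_nm


if __name__ == "__main__":
    # Hypothetical values: a soft cantilever (k = 0.1 N/m), a sensitivity
    # of 50 nm/V and a measured photodiode signal of 0.02 V.
    deflection = deflection_from_photodiode_nm(0.02, 50.0)
    force = cantilever_force_nN(0.1, deflection)
    print(f"deflection = {deflection:.2f} nm, force = {force * 1000:.0f} pN")
```

With these assumed numbers a 1 nm deflection corresponds to a force of 100 pN, which shows why soft cantilevers and careful calibration are needed when imaging delicate biological samples.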

Figure 1.5.1 - Schematic representation of an atomic force microscope (AFM). The cantilever-probe system is deflected by the surface topography of the sample. Cantilever deflections are detected with a laser beam and mirror set-up. The position-sensitive detector captures normal forces and frictional forces affecting the probe. Source:

Without feedback control: Here the electronic feedback is switched off. This mode is used for imaging very flat samples and is known as the constant height or deflection mode. In the absence of electronic feedback the cantilever tip may be damaged by the roughness of the sample, or thermal drift may occur. This can be overcome by keeping some electronic feedback, which detects such errors and removes slow variations in topography while highlighting the edges of the features of the sample.[14]

Figure 1.5.2 AFM without feedback controls Source:

The Cantilever Tip: As mentioned above, the tip of the cantilever plays a very significant role in defining the resolution of the AFM. For best results, the tip should have a radius of curvature of around 5 nm. A tip that is not sharp can seriously affect the performance of the AFM through tip broadening, compression, interaction forces or aspect-ratio effects. Tip broadening occurs when the radius of the tip is greater than the size of the feature being imaged, so that the side of the tip touches the feature, and the microscope responds, before the apex of the tip has reached it. This is known as tip convolution. Compression arises when the material is particularly soft (e.g. DNA) and the tip presses into the surface. The tip material also has to be chosen carefully, as forces arising from the chemical nature of the tip may affect the outcome.[14]

Figure.1.5.3 Tip convolution Source:
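The size of the tip broadening effect can be estimated with a simple geometric model. The model is an idealization introduced here and is not given in the text: a spherical tip apex of radius R is assumed to scan a spherical feature of radius r resting on a flat surface. The tip first touches the feature when the horizontal distance between the two centers is 2√(Rr), so the apparent width at the base is W = 4√(Rr), while the measured height remains 2r. The Python function below and the example radii are hypothetical.

```python
import math


def apparent_width_nm(tip_radius_nm, feature_radius_nm):
    """Apparent base width W = 4 * sqrt(R * r) of a spherical feature of
    radius r imaged with a spherical tip of radius R (geometric model).

    The tip touches the surface and the feature at the same time when
    d**2 + (R - r)**2 = (R + r)**2, i.e. d = 2 * sqrt(R * r); the full
    apparent width is 2 * d."""
    return 4.0 * math.sqrt(tip_radius_nm * feature_radius_nm)


if __name__ == "__main__":
    feature_radius = 1.0  # hypothetical feature, 2 nm true width
    for tip_radius in (5.0, 20.0, 60.0):
        width = apparent_width_nm(tip_radius, feature_radius)
        print(f"tip radius {tip_radius:4.0f} nm -> apparent width {width:5.1f} nm "
              f"(true width {2 * feature_radius:.0f} nm)")
```

Even a sharp 5 nm tip makes a 2 nm wide feature appear several times wider in this model, and the broadening grows with the square root of the tip radius, while the measured height is unaffected.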

There are various modes of attraction between the tip and the sample:

Table 1.5.1 Tip-Sample interaction modes Source:

The common AFM modes of operation are:

Figure 1.5.4 – Three common AFM operation modes. Source:

(a) Contact mode: This is the most common operational mode of AFM, in which the probe remains in permanent contact with the sample surface. For biological specimens in fluid, the probe force is kept below 100 pN. This mode requires minimal sample preparation and can operate in both air and fluid environments. It provides information on the elasticity, adhesion, hardness and friction of the sample surface. [1, 14]

(b) Tapping mode: This mode uses a probe that oscillates at a constant frequency, and the measured signal is dominated by changes in the resonance behaviour of the cantilever. To avoid damaging the sample, the probe gently taps the surface rather than scraping across it. An advantage of this mode is that it can detect surface contaminants that are not seen in height images. [1, 14]

(c) Non-contact mode: This mode is used for samples that are delicate in both air and fluids. The material is not damaged because the probe is held in the attractive force region. Van der Waals forces and electrostatic potentials are used to measure the force gradients. [1, 14]
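The 100 pN limit quoted for contact mode can be related to what the instrument actually measures, which is cantilever deflection. The sketch below assumes the cantilever behaves as an ideal Hookean spring (F = k·d); the spring constant of 0.05 N/m and the deflections are hypothetical values chosen for illustration, not figures from the cited sources.

```python
def cantilever_force_pN(spring_constant_N_per_m, deflection_nm):
    """Force from Hooke's law, F = k * d, returned in piconewtons."""
    force_N = spring_constant_N_per_m * (deflection_nm * 1e-9)
    return force_N * 1e12

def within_force_limit(spring_constant_N_per_m, deflection_nm, limit_pN=100.0):
    """True if the applied force stays at or below the chosen limit."""
    return cantilever_force_pN(spring_constant_N_per_m, deflection_nm) <= limit_pN

# Hypothetical soft cantilever (k = 0.05 N/m) at two deflection setpoints
for deflection in (1.0, 3.0):
    force = cantilever_force_pN(0.05, deflection)
    verdict = "ok" if within_force_limit(0.05, deflection) else "too high"
    print(f"deflection {deflection:.1f} nm -> {force:.0f} pN ({verdict})")
```

Under these assumptions, a soft cantilever has to be held to deflections of only a nanometre or two to stay below the limit, which illustrates why soft cantilevers and careful setpoint control are used for biological specimens.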

1.5.2 Recent Advances

AFM has revolutionized interfacial surface science by allowing direct, high-resolution visualization of surface topography in a variety of environments. Advances in instrumentation and simpler sample preparation have made it possible to image single molecules in the native environment of the protein, producing topographs of native proteins at nanometer-scale resolution. Force spectroscopy, an operational mode of AFM, is now used extensively to study ligand-receptor interactions by measuring forces at the single-molecule level. [1,2]

The recent development of VideoAFM™ allows researchers to view molecular processes in real time, which has opened the door to deeper insight into biological processes at the nanoscale. Used together with the MultiMode Scanning Probe Microscope, VideoAFM™ is significant for bio-nanotechnology research because it gives access to biological processes as they happen. VideoAFM™ imaging rates are reported to be 1000 times faster than those of existing AFMs. It also provides an opportunity to visualize molecular interactions and the forces involved in them. Another positive aspect of this development is its ability to work smoothly with conventional AFMs. [15,16]

The use of MAC Mode AFM has opened prospects in drug development, especially in the structural imaging of drug carriers such as liposomes and lactose crystals. Structural imaging that reveals the shape and size of these drug carriers is of great importance, as they are widely used in pharmaceutical research. [2] The shape and size of dimyristoylphosphatidylcholine (DMPC) liposomes in phosphate buffer can be observed directly using MAC Mode AFM.

Figure 1.5.5 Structural imaging of liposomes Source:

The above image shows round liposomes with diameters ranging from 50 to 200 nm, acquired at a scan size of 1.15 µm x 1.15 µm.

Likewise, surface images of a lactose crystal, used as an inhaled drug carrier, are shown at humidity levels increasing from 13 to 96%. MAC Mode AFM reveals that the crystal structures melt significantly at 80% humidity. [2]

Figure 1.5.6 Surface imaging of lactose crystals at varied humidity levels achieved at a scan size of 5 µm x 5 µm. Source:

1.5.3 Evaluation of the Technology

ADVANTAGES: Interest in AFM for the study of proteins arose because it can resolve the surface features of heterogeneous samples under different conditions and provide direct observation of protein complexes in real time. Because it can image non-conducting surfaces, it can be used to analyze a wide variety of biological samples. [1,13]

The main advantage of AFM is that it readily provides high-resolution, 3-dimensional images of the surface topography of biological specimens in various environments and at various temperatures. [1,10,11,12,13] Recent advances have enabled surface imaging at the single-molecule level, at resolutions down to the nanometer scale. [1] With better sample preparation techniques and superior control of probe-sample interactions, it is now possible to analyze protein folding. [1] AFM methods require little sample preparation. [13]

AFM has also been used widely for imaging individual proteins and other molecules such as collagen. IgG1 antibodies have been immobilized for AFM imaging: the low affinity of IgG molecules for mica was overcome by cloning a metal-chelating peptide into the carboxy-terminal sequence of the IgG, and the purified IgG then bound in a regiospecific manner to nickel-treated mica. [13] AFM combined with bright-field, fluorescence and other optical techniques can be used to identify structures while simultaneously providing nanometer-resolved images of the sample surface. [13] AFM can measure intermolecular forces in the nanonewton range in protein synthesis, DNA replication or drug interactions, which enables it to analyze ligand-receptor interactions. [13]

The electrical properties of surfaces are an important part of the study of biological systems, and here too AFM is useful. AFM can image electrical surface charge, binding forces and electrostatic forces, as well as the micromechanical properties, elasticity and viscosity of live cells and membranes. [13]

Compared with other microscopy techniques, AFM shows many advantages, which make it a preferred method among researchers. The table below shows how AFM compares with SEM, TEM and optical microscopy in terms of cost, flexibility in various environments and ease of sample preparation. [11]

                           AFM                               TEM              SEM          Optical
Max resolution             Atomic                            Atomic           1's nm       100's nm
Typical cost (x $1,000)    100 – 200                         500 or higher    200 – 400    10 – 50
Imaging environment        air, fluid, vacuum, special gas   vacuum           vacuum       air, fluid
In-situ                    Yes                               No               No           Yes
In fluid                   Yes                               No               No           Yes
Sample preparation         Easy                              Difficult        Easy         Easy

Table 1.5.2 Comparison of AFM and other Microscopy Techniques Source:

AFM versus STM: The main advantage of AFM over STM is that STM is limited to imaging conducting samples, while AFM can image both conductors and insulators. In AFM, the writing voltage and the tip-to-substrate spacing can be controlled independently, whereas in STM the two are linked and must be adjusted together. STM has better resolution, because in AFM the dependence of force on distance, tip shape and contact force is much more complex than the corresponding relationship in STM. [13]

AFM versus SEM: Compared with SEM, AFM offers better topographic contrast, direct height measurements and very clear views of surface features, and it does not require the sample to be coated. [13]

AFM versus TEM: AFM is less expensive than TEM. AFM is also more flexible with respect to imaging environment, and sample preparation is easy. [13]

AFM versus Optical Microscope: Optical microscopy is comparatively cheaper and requires less sample preparation. AFM provides explicit measurements of step heights and shows differences between materials that are independent of reflectivity. [13]


LIMITATIONS: AFM has enormous applications in molecular biology and microbiology, but several limitations and difficulties also exist. The most critical aspect of the technique is sample preparation: the sample must withstand the force applied by the scanning probe, which requires an appropriate solid substrate and a well-attached sample. When the sample consists of living cells, immobilization by adsorption is not appropriate, because the contact area between cell and substrate is very small and the cells can be detached by the scanning probe. A better alternative is to immobilize the cells in porous membranes. This approach minimizes denaturation, but it works well mainly for spherical cells and not for rod-shaped cells.

Another problem concerns resolution and image interpretation, both of which depend on the imaging force and the probe geometry. Large forces acting between the sample and the probe during imaging may significantly reduce the resolution of the images and can also cause molecular damage. When samples are imaged in air, a layer of condensed water or other contamination covers both the sample and the probe; the resulting strong attractive force often damages the sample or prevents high-resolution imaging. When the imaging environment is optimized, pH and ionic strength must also be considered, as both affect image quality. A further difficulty is the shadowing or multiplication of small structures, which arises from multiple-probe effects caused by contamination on the probe. [3]

1.5.4 Applications of the Technology

Recent modes of AFM have found wide application in the 3-dimensional structural identification of cells, biomolecules and subcellular entities in various environments. AFM is also a powerful tool for measuring forces and interactions at the molecular level in real time. AFM has enabled researchers to study processes such as the polymerization of fibrinogen and crystal growth, and to image the physicochemical properties of living cells. [6]

AFM used to detect electrostatic potential generated by OmpF porin: AFM was used to reveal structural details of the membrane protein surface and also the electrostatic potential generated by the protein. When charged probes were used at low electrolyte concentrations, AFM provided structural images of the surface of the membrane protein and also allowed the electrostatic potential of OmpF porin to be mapped. The results agreed with electrostatic calculations based on the atomic structure of OmpF porin in a lipid bilayer at the same concentrations. This method opens the door to mapping the electrostatic potential of native protein surfaces at better resolution. [5]

AFM used for analyzing the reaction of endothelial cells to histamine treatment: Atomic force microscopy was used to investigate the cellular response to histamine, one of the key inflammatory mediators that cause endothelial hyperpermeability and vascular leakage. The probes were labeled with fibronectin and used to measure the binding strength between α5β1 integrin and fibronectin, in order to quantify the force needed to break single fibronectin-integrin bonds. Cytoskeletal changes, adhesion force and binding probability on live endothelial cells were monitored before and after histamine treatment. Cell topography measurements revealed that histamine provokes cell shrinkage and increases stiffness and binding probability. To assess the stiffness of the cell surface, the AFM was used in force mode to measure the adhesion force between the AFM tip and the cell surface. [4]

Figure 1.5.7 Contact mode images of endothelial cells before and after histamine treatment. Source: The upper row shows deflection images and the lower row shows height images recorded in contact mode, with the arrows indicating cell shrinkage after histamine treatment. Panels A and B in the graph represent two topographical profiles. [4]

AFM used to create Nanoscopic Collagen Matrices: AFM can be used to assemble collagen molecules into well-defined 2-dimensional templates, which may serve as platforms on non-biological surfaces for directing molecular and cellular processes. The collagen matrices maintained their mechanical stability for several months and provide information on the physical mechanisms through which cells organize biological structures. Nanostructuring the monolayers may offer a way to orient them towards specific cellular features. The monolayers could also be useful for storing information, for microarray printing or in molecular electronic circuits. [7]

Figure 1.5.8 AFM imaging in contact mode or tapping mode at 100 pN. The applied force was increased to 200 pN to adjust collagen fibers as needed, and the sample was then re-imaged at a force of 100 pN. Source:

Localization of Lipopolysaccharide (LPS)-binding protein in phospholipid membranes using AFM: To study the function of the LPS-binding protein (LBP) in activating immunocompetent cells, lipid liposomes were adsorbed on mica, and AFM was used to identify the lateral organization of LBP in these membranes and its interaction with LPS aggregates. Cantilever tips were loaded with anti-LBP antibodies. At low concentration, single LBP molecules were localized in the membranes. At high concentrations, clusters of many LBP molecules formed and cross-linked the lipid bilayers. Adding LPS to the LBP liposomes gave rise to LPS domains, whose formation could be inhibited by anti-LBP antibodies. LBP thus appears to facilitate the fusion of lipid membranes and LPS aggregates. [9]

Figure 1.5.9 AFM images of LBP, PS bilayers, and LBP-containing PS bilayers. (A) Single LBP molecules on mica. The schematic diagram shows the dimensions of a single LBP molecule. (B) A pure PS bilayer. In a square of 2 × 2 µm, most of the lipid molecules were scratched off the mica by the cantilever. (C) Image of a PS bilayer on mica after LBP was added and adsorbed to the membrane. The schematic diagram shows single LBP molecules bound to the surface of the membrane. (D) Image of preincubated (PS+LBP) liposomes adsorbed on mica. The schematic diagram indicates that small domains of LBP molecules were formed. [9] Source:

1.5.5 Relevant Websites

• Agilent Technologies
• IBM Research Press Resources: Atomic force microscopy
• Laboratory of Biophysics and Surface Analysis
• Worcester Polytechnic Institute. Nanotechnology Resources
• AFM and SPM Laboratories

1.5.6 Key Industry Suppliers of AFM

The key industry suppliers of atomic force microscopes are listed below: [10]

• Asylum Research
• BioForce Nanosciences
• EXFO Burleigh
• JEOL USA
• JPK Instruments
• MikroMasch
• Molecular Imaging
• Nanofactory Instruments
• Nanoink
• Nanonics Imaging
• Nanosurf
• Nanoworld
• Novascan Technologies
• NT-MDT
• Olympus
• Omicron NanoTechnology
• Pacific Nanotechnology
• Photometrics
• Quesant Instrument
• RHK Technology
• Shimadzu
• Surface Imaging Systems
• Triple-O Microscopy
• Veeco Instruments

1.5.7 References

1. Silva, L.P. (2002) Atomic Force Microscopy and Proteins. Protein and Peptide Letters, Vol. 9, No. 2, pp. 117-125. Bentham Science Publishers Ltd.
2. Zhu, J.Y. (1998) Applications of MAC Mode AFM in Biology, Pharmaceutical and Other Bio-Related Industries. Application Notes, Molecular Imaging Corporation.
3. Dufrene, Y.F. (October 2002) Atomic Force Microscopy, a Powerful Tool in Microbiology. Journal of Bacteriology, Vol. 184, No. 19, pp. 5205-5213. American Society for Microbiology.
4. Trache, A., Trzeciakowski, J.P., Gardiner, L., Sun, Z., Muthuchamy, M., Guo, M., Yuan, S.Y., Meininger, G.A. (2005) Histamine Effects on Endothelial Cell Fibronectin Interaction Studied by Atomic Force Microscopy. Biophysical Journal 89: 2888-2898. The Biophysical Society.
5. Philippsen, A., Im, W., Engel, A., Schirmer, T., Roux, B., Muller, D.J. (March 2002) Imaging the electrostatic potential of transmembrane channels: atomic probe microscopy of OmpF porin. Biophysical Journal 82(3): 1667-1676.
6. Lal, R., John, S.A. (1994) Biological Applications of atomic force microscopy. AJP - Cell Physiology, Vol. 266, Issue 1, C1-21. American Physiological Society.
7. Jiang, F., Khairy, K., Poole, K., Howard, J., Muller, D.J. (2004) Creating Nanoscopic Collagen Matrices Using Atomic Force Microscopy. Microscopy Research and Technique 64: 435-440.
8. Karrasch, S., Hegerl, R., Hoh, J.H., Baumeister, W., Engel, A. (1994) Atomic force microscopy produces faithful high-resolution images of protein surfaces in an aqueous environment. Proc Natl Acad Sci USA 91(3): 836-838.
9. Roes, S., Mumm, F., Seydel, U., Gutsmann, T. (2006) Localization of the Lipopolysaccharide-binding Protein in Phospholipid Membranes by Atomic Force Microscopy. The Journal of Biological Chemistry, Vol. 281, No. 5, pp. 2757-2763. The American Society for Biochemistry and Molecular Biology, USA.
10. Wright-Smith, C., Smith, C.M. (2001) Atomic Force Microscopy. The Scientist 15.2: p. 23.
11. Agilent Technologies. (1998-2006) What is AFM?
12. Smith, A. (May 1999) Atomic Force Microscopy. Microbiology Today.
13. Li, H-Q. (1997) General Ideas About AFM.
14. Atomic Force Microscopy.
15. Muller, D.J., Aebi, U., Engel, A. Imaging, measuring and manipulating native biomolecular systems with the atomic force microscope.
16. Infinitesima announces compatibility with the MultiMode™ Scanning Probe Microscope from Veeco Corp, March 10th 2006 (Press Release), Oxford, UK.

Chapter 1.6 Electron Microscopy of Proteins Priyadharshini Sivakumaran



With the relative ease of operation of present-day instruments, electron microscopy has become very popular in the investigation of biological structures. In many cases, images that are fairly faithful records of the detail in the specimen can be obtained within only a few minutes. The two main drawbacks of the method are (1) the relative transparency of proteins to electrons and (2) the disruption of protein structure as the specimens dry out in the evacuated microscope and are bombarded by the electron beam. The usual method of increasing contrast is to add heavy metals in one way or another. The simplest way of doing this, negative staining, is fortunately also the most successful in preserving specimen order. With the effective resolution limited to 20 Å in electron micrographs of proteins, little can be deduced, even about the shape of the molecule, from images of average-sized, isolated monomers. Electron microscopy has therefore been most successful in determining the quaternary structure of assemblies of protein molecules.

The Transmission Electron Microscope (TEM) is the only instrument that allows the analysis of biological samples on different scales, ranging from µm down to Angstrom resolution. Transmission electron microscopy of sections of cells, or the recently developed method of electron tomography, allows whole cells or organelles within cells to be studied. At the other end of the scale, electron crystallography employs the TEM to resolve the atomic details of proteins that are arranged in two-dimensional crystals. Most modern transmission electron microscopes can routinely reach a resolution better than 2 Å, at which atoms can be visualized directly. This, however, requires the sample to be sufficiently resistant to the electron beam and adequately prepared. Biological samples are usually highly sensitive to the electron beam. High-resolution data of proteins, for example, can only be recorded at electron doses below 5 electrons/Å² with the sample kept at liquid nitrogen temperature, or at doses below 20 electrons/Å² with the sample kept at liquid helium temperature. To record images with a controlled number of electrons admitted onto the sample, the TEM has to be operated in a so-called "Low-Dose" mode.
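The dose limits above translate directly into a maximum exposure time once the beam current density at the specimen is known. The following sketch converts a current density into a dose rate in electrons/Å² per second and then into the longest permissible exposure. The current density used in the example is a hypothetical value, and the function names are illustrative; only the 5 and 20 electrons/Å² limits come from the text.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one electron in coulombs
ANGSTROM2_PER_CM2 = 1e16               # 1 cm^2 = 1e16 square angstroms

def dose_rate_e_per_A2_s(current_density_A_per_cm2):
    """Convert beam current density (A/cm^2) to electrons per square angstrom per second."""
    electrons_per_cm2_s = current_density_A_per_cm2 / ELEMENTARY_CHARGE_C
    return electrons_per_cm2_s / ANGSTROM2_PER_CM2

def max_exposure_s(current_density_A_per_cm2, dose_limit_e_per_A2):
    """Longest exposure that keeps the accumulated dose at or below the limit."""
    return dose_limit_e_per_A2 / dose_rate_e_per_A2_s(current_density_A_per_cm2)

# Hypothetical beam setting: 1 mA/cm^2 at the specimen
beam = 1e-3
for label, limit in (("liquid nitrogen", 5.0), ("liquid helium", 20.0)):
    print(f"{label:15s}: limit {limit:4.1f} e/A^2 -> at most {max_exposure_s(beam, limit):.1f} s")
```

Because the exposure budget is only a few seconds at such settings, searching and focusing are done on neighbouring areas or at much lower dose, which is the purpose of the low-dose mode.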

Figure 1.6.1 The electron microscope (EM) focuses on the structure determination of complex macromolecular assemblies. Small proteins are studied by means of a technique known as electron crystallography. Such proteins are purified and crystallized in two dimensions before EM and data processing steps are performed. Larger complexes are mainly studied by non-crystallographic methods. These proteins are (partially) purified and their single-particle EM images are processed. For these investigations cryo-electron microscopy is used: objects are embedded in a thin amorphous layer of ice, which preserves their ultrastructure.

Scanning electron microscopy can be used for mapping the orientation and organization of protein films adsorbed onto various surfaces at the nanoscale. One study presented scanning force and electron microscopic visualization of single molecules of fibronectin, either frozen hydrated or adsorbed onto metallic and polymeric surfaces with different solid surface tensions. [5] The surfaces were characterized by dynamic contact angle measurements, X-ray photoemission spectroscopy and scanning force microscopy. The proteins were prepared by fast protein liquid chromatography (FPLC) and characterized by gel electrophoresis. Protein films on surfaces were investigated by surface plasmon resonance spectroscopy and imaged directly by scanning force microscopy. The spreading of the adsorbed fibronectin depended on the chemical composition and the solid surface tension of the surface. Imaged with electron and scanning probe microscopes, fibronectin appeared as an extended straight strand both in solution and at the solid interface. Frictional forces during the scan contributed significantly to the imaging mechanism.

1.6.2 Recent Advances:

In recent years, digital image processing has evolved to the point where it is now possible to exploit more fully the high-resolution potential of the transmission electron microscope (TEM). A system has been developed for semi-automatic specimen selection and data acquisition for protein electron crystallography, based on a slow-scan CCD camera connected to a transmission electron microscope and controlled from an external computer. The slow-scan CCD camera has been shown to be a valuable accessory to an electron microscope for direct data acquisition, as used in on-line electron optical adjustments and electron tomographic applications. Using a slow-scan CCD camera to acquire diffraction data in electron crystallography allows fast evaluation and immediate numerical analysis of the data, in contrast to the imaging plate, which has also been used to acquire electron diffraction intensities. Furthermore, the quality of the acquired data is higher, since this camera performs better than photographic film in terms of linearity, background noise and dynamic range. Its applicability to protein electron crystallography at 400 kV resulted from a number of engineering changes made to the slow-scan CCD camera, such as minimizing the spurious X-ray signals picked up by the camera.

Areas of interest on the specimen are located at low magnification and subsequently imaged on the CCD camera, using a dose that is small compared with the dose used in the exposure mode. The crystalline quality of the area is evaluated from the appearance of diffraction peaks in the calculated Fourier transform of the image. If the quality is considered good, images can then be recorded in different modes, both on film and using the CCD camera. With this system a significant gain, both quantitative and qualitative, can be obtained in acquiring data for electron crystallography of beam-sensitive materials.
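The quality check described above, looking for diffraction peaks in the Fourier transform of a low-dose preview image, can be illustrated with a deliberately small toy example. The sketch below is not the algorithm used by any particular acquisition system: it works on a one-dimensional intensity profile rather than a two-dimensional image, uses a plain discrete Fourier transform, and scores "crystallinity" as the ratio of the strongest spectral peak to the median spectral power. The profiles, the lattice period and the scoring rule are all assumptions made for illustration.

```python
import cmath
import math
import random

def power_spectrum(profile):
    """Power at each non-zero frequency k = 1 .. n/2 - 1 of a 1-D intensity profile."""
    n = len(profile)
    mean = sum(profile) / n
    spectrum = []
    for k in range(1, n // 2):
        coeff = sum((v - mean) * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(profile))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def peak_report(profile):
    """Return (frequency index of the strongest peak, peak power / median power)."""
    spectrum = power_spectrum(profile)
    peak = max(spectrum)
    median = sorted(spectrum)[len(spectrum) // 2]
    return spectrum.index(peak) + 1, peak / median

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 128
noise_only = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical crystalline area: a lattice repeating 8 times across the line, plus the same noise
crystalline = [2.0 * math.cos(2 * math.pi * 8 * i / n) + noise_only[i] for i in range(n)]

for label, profile in (("noise only", noise_only), ("crystalline", crystalline)):
    k, ratio = peak_report(profile)
    print(f"{label:12s} strongest frequency k={k:3d}, peak/median = {ratio:.1f}")
```

A periodic specimen concentrates its signal in a few sharp peaks that stand far above the noise floor, whereas a non-crystalline area gives a flat, noisy spectrum; an operator or program can use that contrast to decide whether an area is worth a full exposure.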

Figure 1.6.2 CCD Camera

Among the many scanning probe microscopes, atomic force microscopy (AFM) is a useful technique for analysing the structure of biological materials, because it can be applied to non-conductors under physiological conditions with high resolution. However, the resolution has been limited by an inherent property of the technique, the finite radius of the scanning probe tip. To overcome this problem, a carbon nanotube probe was developed by attaching a carbon nanotube to a conventional scanning probe under a well-controlled process. [6] Because of the constant and small radius of the tip (2.5–10 nm) and the high aspect ratio (1:100) of the carbon nanotube, the lateral resolution was much improved. The carbon nanotube probes also proved more durable than conventional probes. With their high vertical resolution, these probes made it possible to visualize clearly the subunit organization of multi-subunit proteins and to propose structural models for proliferating cellular proteins. This success gives current AFM technology additional power for analysing the detailed structure of biological materials and the relationship between the structure and function of proteins.

1.6.3 Evaluation of the technology:

Because of the differences in the way TEM and SEM work, each has its own distinct advantages. TEM, for instance, can view a sample at a magnification approximately 10 times that of SEM, resolving objects as small as 3 to 10 angstroms. Also, because the beam is transmitted through the sample, TEM can not only characterize particle surfaces but also reveal the sample's internal structure. One advantage of SEM is that it provides a better overall visual image of the sample: as it scans over the sample line by line, it gives the image a depth of field, making the object appear almost three-dimensional. A TEM image shows no such depth of field. Another advantage of SEM is that it is more flexible in the type of sample it can view, because the sample does not need to be nearly as thin as for TEM. SEM can therefore analyse samples such as larger wear debris particles and distressed machine surfaces.

The advantage of colloidal gold labelling is that intracellular complexes may be located more precisely, because of the significant improvement in resolution provided by backscatter electron (BSE) imaging in the SEM. BSE imaging confirmed the presence and subsarcolemmal localization of myoglobin in cardiomyocytes isolated directly from fresh biopsies.

Electron microscopes are expensive to buy and maintain. As they are sensitive to vibration and external magnetic fields, microscopes aimed at achieving high resolution must be housed in suitable facilities. The samples have to be viewed in vacuum, as the molecules that make up air would scatter the electrons. Recent advances have allowed hydrated samples to be imaged using an environmental scanning electron microscope. Scanning electron microscopes usually image conductive or semi-conductive materials best. The samples have to be prepared in various ways to give proper detail, and this may produce artefacts that are purely the result of the treatment. This raises the problem of distinguishing artefacts from material, particularly in biological samples. Scientists maintain that the results from various preparation techniques have been compared and that, as there is no reason why they should all produce similar artefacts, it is reasonable to believe that features seen by electron microscopy correspond to those of living cells.


1.6.4 Application of the technology:

Electron microscopic techniques have been used to study the structure of many proteins. Some examples are discussed below.

RESPIRATORY CHAIN SUPER COMPLEXES IN MITOCHONDRIAL MEMBRANES

The structures of the individual membrane-bound protein components have been well characterized, especially by X-ray diffraction studies. In recent years there has been increasing evidence, particularly from Blue-native polyacrylamide gel electrophoresis, that the respiratory chain components are organized into supercomplexes. Single particle electron microscopy has been used for the structural characterization of some of these supercomplexes. These studies revealed the structure of the supercomplex composed of monomeric Complex I and dimeric Complex III from Arabidopsis and, more recently, that of the dimeric ATP synthase complex. The dimeric ATP synthase is substantially kinked in its membrane-embedded domains, which makes it possible for the first time to define a functional role for these supercomplexes: the dimeric ATP synthase complex appears to be responsible for the folding of the inner mitochondrial membrane.

Figure 1.6.4 ATP synthase

To determine how actin filaments are regulated, structural elucidation of the complexes formed between actin filament ends and end-binding proteins is crucially important. A new procedure based on single-particle analysis has been developed to determine the structure of the end of actin filaments from electron micrographs. [7] In this procedure, the polarity of the actin filament image, as well as the azimuthal orientation and the axial position of each actin protomer within a short stretch near the filament end, is determined accurately. This dramatically improved both the stability and the accuracy of the structural determination.

Agrin is a large, multidomain heparan sulfate proteoglycan that is associated with the basement membranes of several tissues. Particular splice variants of agrin are essential for the formation of synaptic structures at the neuromuscular junction.

Electron microscopy was used to determine the structure of agrin and to localize its binding site in laminin-1. [8] Agrin appears as a particle approximately 95 nm long that consists of a globular, N-terminal laminin-binding domain, a central rod formed predominantly by the follistatin-like domains, and three globular, C-terminal laminin G-like domains.

A novel way to characterise all the larger (membrane) protein complexes from a specific type of membrane, by transmission electron microscopy (EM) in combination with proteomics, has been developed. In general, structural studies on proteins rely heavily on isolated, highly purified protein samples. For X-ray diffraction studies this is a necessity, and the same is true for EM based on crystals. Single particle EM is a well-developed technique for averaging the individual projections of large protein complexes. It makes use of sophisticated classification programs to sort out different projections. This has been found to be possible in complex mixtures of proteins as well. In this way, the projection structures of all larger proteins from disrupted membranes can be analysed. Proteomics methods can then be used to assign the EM projection structures to specific proteins, by comparing the frequencies of EM structures with the intensities of electrophoresis spots arising from the proteins of complete membranes. The photosynthetic membrane from cyanobacteria and the peroxisome membrane from yeast were selected as study objects. Because the approach is entirely novel, combining EM and proteomics is expected to lead to major new discoveries.

1.6.5 Key industry suppliers:

Gatan, Inc. is a leading manufacturer of instrumentation and software used to enhance and extend the operation and performance of electron microscopes, and its name is widely recognized in the scientific community. Gatan digital cameras provide a digital capture and viewing system for the electron microscopist. There are three families in the Gatan range: the ES500W Erlangshen™, a versatile CCD camera for recording TEM images; MultiScan™, the industry standard TEM digital camera; and the UltraScan™ family, the highest-performance imaging system in the range. The ES500W is capable of high-speed, high-quality imaging with a field of view larger than that of traditional TEM film. These features allow the user to image intense electron diffraction patterns without the "blooming artifact" and to observe dynamic events inside a TEM.

'GRACE' is a system for semi-automated specimen selection and data acquisition using a Gatan slow-scan camera in combination with a Philips CMxx electron microscope. It was designed for use with specimens of 2-dimensional (2D) protein crystals, but can also be used for 'single particle' specimens. With 2D protein crystals, it allows the user to mark, at low magnification, areas of the specimen that may contain crystalline protein domains or other interesting features, and to evaluate the (crystalline) quality of the domains by recording a very-low-dose image at intermediate magnification in the so-called 'Preview Mode' and calculating and displaying its FFT. If the operator judges an area to be 'good', a high-resolution (low-dose) image can be recorded on the CCD camera or on film (or both) in the Exposure Mode. For non-periodic specimens, the very-low-dose preview image itself can be used for selection instead of the FFT. The system can also be used simply to collect images from the positions marked at low magnification, without any preview.

1.6.6 References:

1. Baldwin, J. & Henderson, R. (1984). Measurement and evaluation of electron diffraction patterns from two-dimensional crystals. Ultramicroscopy 14, 319-336.
2. Blundell, T. L. & Johnson, L. N. (1976). Protein Crystallography. New York, Academic Press.
3. Booy, F. P. & Pawley, J. B. (1993). Cryo-crinkling: what happens to carbon films on copper grids at low temperature. Ultramicroscopy 48, 273-280.
4. Chiu, W. & Jeng, T.-W. (1980). Electron diffraction study of crotoxin complex at 1.6 Å. In: Electron Microscopy at Molecular Dimensions. State of the Art and Strategies for the Future. Berlin, Springer-Verlag. 137-142.
5. Frederic Zenhausern, Marc Adrian and P. Descouts. Solution Structure and Direct Imaging of Fibronectin Adsorption to Solid Surfaces by Scanning Force Microscopy and Cryo-electron Microscopy. J. Electron Microscopy 42(6): 378-388 (1993).
6. Ken I. Hohmura, Yutakatti Itokazu, Shige H. Yoshimura, Gaku Mizuguchi, Yusuke Masamura, Kunio Takeyasu, Yasushi Shiomi, Toshiki Tsurimoto, Hidehiro Nishijima, Seiji Akita and Yoshikazu Nakayama. Atomic force microscopy with carbon nanotube probe resolves the subunit organization of protein complexes. J. Electron Microscopy 49(3): 415-421 (2000).
7. Narita A, Maeda Y. Molecular determination by electron microscopy of the actin filament end structure. J Mol Biol. (2006).
8. Denzer AJ, Schulthess T, Fauser C, Schumacher B, Kammerer RA, Engel J, Ruegg MA. Electron microscopic structure of agrin and mapping of its binding site in laminin-1. EMBO J. 15; 17(2): 335-43. (1998).
9. Electron microscopy of proteins by James R. Harris. London; New York, Academic Press, 1981-1987.

2.2 Size exclusion chromatography
Michelle Hefford

2.2.1 Introduction

Size exclusion chromatography (SEC) is one of a range of chromatographic techniques. Chromatography is widely used, and sales of chromatographic instruments represent more than half of the total sales of analytical equipment around the world [1]. Size exclusion chromatography may also be referred to as gel filtration chromatography (when an aqueous mobile phase is used) or gel permeation chromatography (when an organic mobile phase is used). The term size exclusion chromatography covers both forms and is the more widely used and accepted [2]. Most other methods of chromatography rely on the sample binding to the support, whereas size exclusion chromatography separates molecules according to their size, which for similar molecules correlates with molecular weight. Larger molecules are excluded from the pores of the packing and elute from the column faster than smaller molecules, giving rise to the name ‘size exclusion’.

Underlying Principles

The process of size exclusion chromatography is similar to that of other column chromatography methods, such as affinity chromatography. The sample is carried into the column by the mobile phase and travels through the column packing or gel matrix. The column packing is a porous material with pores of various sizes. Molecules small enough to enter the pores permeate them, which slows their passage. Larger molecules cannot permeate the pores and pass through the column faster than the smaller molecules. As the sample is eluted, one or more detectors measure specific properties of the mobile phase and sample. This information is displayed as a chromatogram. [1], [3]

Figure 1.1 Principle of size exclusion chromatography Source:

The stationary phase of size exclusion chromatography consists of a column packed with a matrix (gel) which has pores of different sizes. Columns can be packed with a gel that has pores of a set size range, chosen according to the sizes of the different molecules in the sample solution. The column packing should be chosen so that the largest molecule cannot permeate any of the pores. These molecules will elute first from the column. The smallest pores in the matrix will only allow the smallest molecules to permeate. The smallest molecules therefore spend the most time permeating the matrix and elute last. [4] [5]

It is important to select the right type of column, as the matrix should be inert towards the sample. If the sample interacts with the matrix, the molecules will no longer elute according to their size alone, and this will affect the results. If the molecules bind to the matrix (for example via an ion-exchange interaction), they will take longer to elute from the column, which would suggest that the molecule is smaller than it actually is. Conversely, if the molecule is repelled by the matrix, it may be prevented from permeating the pores even though it is small enough to do so. The molecule would then elute faster than expected and would be detected as a larger molecule. [5] [6].

The length of the column and the sample size also affect the resolution of the results. A longer column allows more time for separation of the ‘medium sized’ molecules, which gives better resolution on the chromatogram. Sometimes a series of columns is needed to get adequate results, but this increases the time required to run the process. [6] Prepacked columns are commercially available with different volumes and media. The table below shows some typical examples of media used in columns for size exclusion chromatography.

Table 1.1 Properties of typical commercial column packings for size exclusion chromatography. Source:

The sample is dissolved in the mobile phase. As the mobile phase passes through the column, the molecules within the sample either permeate the pores (if they are able to fit) or travel down the column with the mobile phase. Careful selection of the mobile phase is also very important. A good mobile phase should have the following properties:
• Dissolve the sample
• Low viscosity
• Not cause the stationary phase to interact with the sample. If anything, the mobile phase should prevent this.
• Must be compatible with the detector.

Usually more than one type of detector is used with size exclusion chromatography. The most common detectors used in size exclusion chromatography are UV, fluorescence, refractive index and evaporative light-scattering detectors (ELSDs).

Size exclusion chromatography systems can also be linked to a mass spectrometer for higher selectivity, but this is quite expensive. [5] [1]

UV detectors can be fixed-wavelength or dual-wavelength. The measurement of absorbance by the fixed-wavelength detector is based on the difference in intensity between the sample and a reference standard. The dual-wavelength detector measures the difference in absorbance of two wavelengths of light that are passed through the sample.

Fluorescence detectors measure the signal emitted by molecules after they absorb light at a specific wavelength. If the sample does not fluoresce naturally, tags or dyes can be used to enable detection. The fluorescence signal is linearly proportional to the sample concentration.

Refractive index detectors measure changes in the refractive index of the sample relative to the mobile phase. This method of detection is sensitive to temperature changes, and the solvent must remain the same throughout the process.

In an ELSD, the liquid sample is converted into a fine spray and then evaporated to remove the solvent. The remaining sample molecules are then exposed to a light source and the light scattered by the molecules is detected by a photodiode. The amount of light scattered is related to the mass of the sample. ELSD is commonly used for samples that do not respond well to UV detection.

The table below shows the properties of the chromatography detectors outlined.

Detector   Response    Sensitivity    Linear Range   Flow Sensitive   Temp. Sensitive
RI         Universal   4 microgram    10             Yes              Yes
UV/VIS     Selective   5 nanogram     10             No               No
Fluor      Selective   3 picogram     10             No               No
MS         Selective   1 picogram     10             Yes              No

Table 1.2 Properties of SEC detectors. Source:

A chromatogram is a visual presentation of the detection of molecules within the sample. A high resolution chromatogram shows well defined peaks with a good baseline. Longer or multiple columns give higher resolution chromatograms, as the molecules have longer to separate from molecules of a different size. The figure below illustrates high and low resolution chromatograms.
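The idea of "well defined peaks" can be put into numbers with the standard chromatographic resolution, Rs = 2(V2 - V1)/(W1 + W2), where V1 and V2 are the elution volumes of two neighbouring peaks and W1 and W2 are their widths at the baseline. The sketch below is illustrative only: the function name and the peak values are assumptions, not data from this section. For roughly Gaussian peaks, Rs of about 1.5 or more is commonly taken as baseline separation.

```python
def resolution(v1, v2, w1, w2):
    """Resolution Rs between two peaks.

    v1, v2: elution volumes of the two peaks (same units, e.g. mL)
    w1, w2: peak widths at the baseline (same units)
    """
    if w1 <= 0 or w2 <= 0:
        raise ValueError("peak widths must be positive")
    return 2.0 * abs(v2 - v1) / (w1 + w2)


# Hypothetical peak pairs (mL), chosen only to contrast the two cases.
well_separated = resolution(10.0, 13.0, 1.0, 1.0)
overlapping = resolution(10.0, 10.5, 1.0, 1.0)

print(f"well separated pair: Rs = {well_separated:.2f}")
print(f"overlapping pair:    Rs = {overlapping:.2f}")
```

A longer column increases the distance between peak centres more than it broadens the peaks, which is why Rs rises with column length.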

Figure 1.2 Chromatograms with high resolution and low resolution peaks Source: [6] Amersham Biosciences, Gel Filtration Principles and Methods

The largest molecules elute in the void volume (Vo). These molecules are too large to permeate the stationary phase. The smallest molecules, which permeate the most pores in the stationary phase, are eluted at the total column volume (Vt). Molecules of intermediate molecular weight are eluted at various times depending on their size. Ve is the elution volume of each molecule. As shown in the illustration below, the void volume is the volume of the mobile phase between the matrix particles, the total column volume is the volume of the mobile phase plus the stationary phase, and the volume of the stationary phase is therefore Vt - Vo. [6] [7]

Figure 1.3 Illustration of the different volumes in size exclusion chromatography. Source: [6] Amersham Biosciences, Gel Filtration Principles and Methods

The partition coefficient Kav can be determined by the equation:

Kav = (Ve - Vo) / (Vt - Vo)

There is a relationship between the Kav value and the logarithm of the molecular weight of similar molecules. Selectivity curves are created for stationary phase matrices by plotting the Kav value of a set of standard proteins against the log of their molecular weight.
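As a minimal sketch of how a selectivity curve is used, the code below computes Kav from elution volumes, fits a straight line of Kav against log10(molecular weight) for a set of standards, and reads an unknown off that line. The column volumes, the standards and the unknown's elution volume are all hypothetical values invented for illustration, and the straight-line fit assumes the standards fall in the linear part of the curve.

```python
import math


def kav(ve, vo, vt):
    """Partition coefficient Kav = (Ve - Vo) / (Vt - Vo)."""
    if vt <= vo:
        raise ValueError("Vt must be larger than Vo")
    return (ve - vo) / (vt - vo)


def fit_selectivity_curve(standards, vo, vt):
    """Least-squares line Kav = slope * log10(MW) + intercept.

    standards: list of (molecular_weight_Da, elution_volume_mL) pairs.
    """
    xs = [math.log10(mw) for mw, _ in standards]
    ys = [kav(ve, vo, vt) for _, ve in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw(ve, vo, vt, slope, intercept):
    """Invert the fitted line to estimate molecular weight from Ve."""
    return 10 ** ((kav(ve, vo, vt) - intercept) / slope)


# Hypothetical column and calibration standards (illustrative values only).
VO_ML, VT_ML = 8.0, 24.0
STANDARDS = [(1e4, 17.6), (1e5, 13.6), (1e6, 9.6)]

slope, intercept = fit_selectivity_curve(STANDARDS, VO_ML, VT_ML)
unknown_mw = estimate_mw(11.6, VO_ML, VT_ML, slope, intercept)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"estimated MW of unknown = {unknown_mw:.0f} Da")
```

The slope is negative because larger molecules have smaller Kav values. An estimate is only meaningful for an unknown whose shape resembles that of the standards.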

Figure 1.4 Selectivity curves of some commercially available gel filtration media. Source: [6] Amersham Biosciences, Gel Filtration Principles and Methods

Prepacked columns can be purchased, and the matrix should be selected carefully so that the predicted molecular weight of the sample falls within the linear part of the calibration curve. If the target sample does not have the same molecular shape as the calibration standards, the result may deviate from the calibration curve. [6] [7]

2.2.2 Recent Advances

One of the biggest drawbacks of SEC is that it can take hours to get results. Recently there has been growing interest in fast size exclusion chromatography. Initial studies of Fast-SEC techniques suggested that reducing the process time also reduces the resolution to an unacceptable level. Popovici and Schoenmakers investigated different ways of increasing the speed of size exclusion chromatography. [8] The following approaches were considered to reduce the time required:
• Decrease the particle diameter of the stationary phase
• Reduce the column length
• Increase the flow rate

An experiment was performed using columns of different lengths: 50 mm x 4.6 mm i.d. (internal diameter), 100 mm x 4.6 mm i.d. and 150 mm x 4.6 mm i.d. A sample with standards of known molecular weight (1700, 10,900, 117,000 and 1,260,000 Da) was run through the columns at flow rates of 0.3, 0.6 and 0.9 ml/min respectively. The results are shown in the chromatograms below.
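The pairing of column lengths with flow rates in this experiment keeps the time per column volume constant, which a short calculation makes visible. The sketch below assumes the geometric (empty tube) volume of each column as a stand-in for the total column volume; the function names are illustrative, and only the lengths, the 4.6 mm internal diameter and the flow rates come from the text.

```python
import math


def column_volume_ml(length_mm, inner_diameter_mm):
    """Geometric volume of an empty cylindrical column in mL (1 cm^3 = 1 mL)."""
    radius_cm = inner_diameter_mm / 20.0  # mm diameter -> cm radius
    length_cm = length_mm / 10.0
    return math.pi * radius_cm ** 2 * length_cm


def minutes_per_column_volume(length_mm, inner_diameter_mm, flow_ml_per_min):
    """Time for one geometric column volume of mobile phase to pass through."""
    if flow_ml_per_min <= 0:
        raise ValueError("flow rate must be positive")
    return column_volume_ml(length_mm, inner_diameter_mm) / flow_ml_per_min


# Column lengths (mm) and flow rates (mL/min) as paired in the text.
runs = [(50, 0.3), (100, 0.6), (150, 0.9)]
times_min = [minutes_per_column_volume(length, 4.6, flow) for length, flow in runs]

for (length, flow), t in zip(runs, times_min):
    print(f"{length} mm column at {flow} mL/min: {t:.2f} min per column volume")
```

Because length and flow rate are scaled by the same factor, all three runs take the same time per column volume, so any difference in resolution between them is not simply the result of a longer run.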

Figure 1.5 Results from various Fast-SEC columns. Source: [8] Fast size exclusion chromatography – Theoretical and practical considerations.

It was concluded that better resolution can be obtained by increasing the flow rate and the column length. [8]. Improvements are continually being made by experimenting with different types of columns and media in order to decrease the process time without reducing the quality of the chromatograms produced. Large suppliers of SEC column media have developed specific stationary phases that maximise the efficiency of the procedure.

2.2.3 Evaluation of the technology

There are many forms of chromatography, and a combination of methods is usually used in protein preparation and analysis. Amersham Biosciences describes a Three Phase Strategy for protein purification: Capture, Intermediate Purification, Final Polishing (CIPP). The right combination of methods can make the entire process less costly and time consuming while still providing pure product. [7]

The aim of the Capture step is to isolate, concentrate and stabilise the protein. Size exclusion chromatography would not be recommended for this step, as it usually involves a high volume of sample and would be very time consuming. Ion Exchange Chromatography (IEX) and Affinity Chromatography are commonly used for this step as they are faster, higher capacity techniques than size exclusion chromatography. Column conditions are selected to maximise the binding of the target protein from the sample and to avoid the binding of unwanted contaminants. Speed is an important issue in this first stage, as the sample often includes impurities that may cause proteolysis of the target protein or other denaturing effects.

During Intermediate Purification, the chosen technique needs a high selectivity for the target protein, as the unwanted components remaining in the sample after the initial capture stage will be more similar to the target. One or more methods may be used for this step. For each method used, a balance must be found that provides adequate capacity at an acceptable resolution.

The final polishing stage is used to achieve a high purity of the target protein. Size exclusion chromatography is usually used at this stage as it provides high resolution results. The volume of the sample has been reduced by the first two stages, so the process time of SEC is decreased. [7]

One of the disadvantages of size exclusion chromatography is that it can be a slow process. It can take anywhere from 15 minutes to several hours to perform, with a typical run time of approximately 30 minutes. [5]. The resolution of the results can be increased by adding more columns, and a series of columns with different pore sizes is often used. However, this increases the run time.

Each type of chromatography has its advantages and disadvantages. They can be used in combination, and the materials or procedures altered to gain the best result in terms of resolution, speed, recovery and capacity. The table below lists the different protein properties and the techniques that use those properties during purification.

Figure 1.6 All chromatographic techniques can be optimised to achieve a balance between resolution, speed, capacity and recovery. Source: [7] Amersham Biosciences, Protein Purification Handbook

2.2.4 Applications of the technology

Size exclusion chromatography was originally used mainly to determine the molecular weight or molecular weight distribution of polymers. It is now used for a variety of purposes, including group separation (for example desalting and buffer exchange), separating monomers from dimers and polymers, determining molecular weight, final ‘polishing’ purification, and facilitating the refolding of proteins. [6] [5]

Gel filtration is ideal for the cleanup of proteins before purification. Commercially prepared desalting columns separate the protein from salts and other contaminants of low molecular weight and can transfer the protein to a new buffer, all in a single process. Sample volumes can be up to 40% of the column volume.

Table 1.3 Commercially available prepacked columns for sample cleanup Source: [7] Amersham Biosciences, Protein Purification Handbook

The capacity and speed of this procedure make it efficient to prepare the sample via desalting and buffer exchange, in readiness for further purification. The figure below shows a chromatogram of the desalting of a His6 fusion protein. Using a UV and conductivity detector facilitates optimisation of the separation. [6].

Figure 1.7 Desalting a His6 protein via size exclusion chromatography Source: [6] Amersham Biosciences, Gel Filtration Principles and Methods

Size exclusion chromatography is used for profiling protein samples and can be used with relatively small sample volumes. Parini et al used an automated size exclusion chromatography system to analyse the triglyceride and cholesterol content of lipoproteins. [9] Traditionally, lipoproteins were separated by ultracentrifugation, which requires large sample volumes. Samples of 1-10 µL were injected into the column and run at a flow rate of 40 µL/min. The absorbance was measured by a UV-VIS detector at 500 nm. The run time for cholesterol analysis was 75 minutes, while the run time for triglyceride analysis was 90 minutes. The chromatogram below shows the cholesterol profiles generated by running SEC on 5 different sample volumes. The inset shows the lipid content calculated as a percentage of the peak area. The proportions of VLDL, LDL and HDL cholesterol remained the same as the sample volume varied.
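Expressing each lipoprotein fraction as a percentage of the total peak area can be sketched as follows. This is an illustrative calculation, not the software used by Parini et al: the trace, the integration windows and the function names are all hypothetical, and peaks are integrated with a simple trapezoid rule over fixed time windows.

```python
def trapezoid_area(times, signal, start, end):
    """Trapezoid-rule area of the signal over intervals lying within [start, end]."""
    area = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        if t0 >= start and t1 <= end:
            area += 0.5 * (signal[i - 1] + signal[i]) * (t1 - t0)
    return area


def peak_percentages(times, signal, windows):
    """Area of each named window as a percentage of the summed window areas."""
    areas = {name: trapezoid_area(times, signal, a, b)
             for name, (a, b) in windows.items()}
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("no signal inside the integration windows")
    return {name: 100.0 * area / total for name, area in areas.items()}


# Hypothetical detector trace (time in minutes, arbitrary absorbance units).
times = list(range(13))
trace = [0, 1, 0, 0, 0, 2, 4, 2, 0, 0, 3, 0, 0]
windows = {"VLDL": (0, 2), "LDL": (4, 8), "HDL": (8, 12)}

small_injection = peak_percentages(times, trace, windows)
# A larger injection of the same sample scales every point of the trace.
large_injection = peak_percentages(times, [3 * y for y in trace], windows)

for name in windows:
    print(f"{name}: {small_injection[name]:.1f}% vs {large_injection[name]:.1f}%")
```

Scaling the whole trace leaves the percentages unchanged, which is the behaviour expected when the proportions stay constant across injection volumes.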

Figure 1.8 Cholesterol profiles at varied sample volumes. Source: [9] Parini et al, Lipoprotein profiles in plasma and interstitial fluid analysed with an automated gel filtration system.

Because it is difficult to analyse low levels of lipoproteins in small volumes, lipoproteins in interstitial fluid (IF) have not been widely studied. Using SEC, Parini et al evaluated the levels of lipoproteins in IF and plasma. The resulting chromatograms shown below illustrate the levels of cholesterol in plasma versus IF in 5 healthy volunteers.

Figure 1.9 Cholesterol levels in IF and plasma of 5 healthy individuals. Source: [9] Parini et al, Lipoprotein profiles in plasma and interstitial fluid analysed with an automated gel filtration system.

The researchers also applied their method to determining the profile of triglyceride lipoproteins. Again, injections of increasing volume from 1 to 10 µL were run through the SEC system and the following chromatogram was produced. Once again, the proportions of VLDL, LDL and HDL triglyceride remained constant.

Figure 1.10 Triglyceride profiles at varied sample volumes. Source: [9] Parini et al, Lipoprotein profiles in plasma and interstitial fluid analysed with an automated gel filtration system.

The fourth peak shown was thought to be free glycerol in plasma. To test this hypothesis, a reference sample of free glycerol was injected into the SEC system, along with a plasma sample from a healthy volunteer and one from a patient with hyperglycerolaemia. As shown in the chromatogram below, the hyperglycerolaemic patient had a high fourth peak, which eluted at the same time as the free glycerol.

Figure 1.11 Triglyceride profiles of a hyperglycerolaemic patient compared to those of a healthy subject. Source: [9] Parini et al, Lipoprotein profiles in plasma and interstitial fluid analysed with an automated gel filtration system.

The SEC method developed by Parini et al offers the following significant improvements on existing methods of lipoprotein analysis:
• It is able to generate lipoprotein triglyceride profiles
• It is able to determine lipoprotein profiles in interstitial fluid
• Components for the system, including reagents, are already available.

Analysis of lipoproteins would be of value for the study of some metabolic diseases such as hyperglycerolaemia and diabetes. [9]

Recently, size exclusion chromatography has been used to refold denatured proteins. Neely et al successfully used SEC for the refolding of the active calcium channel β1b subunit. [10] Studies of this subunit have been difficult due to its tendency to form aggregates when expressed in bacteria. Dialysis and fast dilution were initially used to try to refold the protein by removing the denaturant, but these techniques failed. An SEC method was then developed which exchanges the buffer, removes aggregates and promotes refolding of the protein, all in one process. The sample was loaded onto a column that had been equilibrated with the refolding buffer and was eluted at a rate of 2.5 ml/min.

A UV detector was used, and the eluted fractions were further analysed by reducing SDS-PAGE. The figure below shows the peak at the void volume, which was thought to be aggregates of the subunit. Peaks I and II are thought to be the β1b subunit refolded into two different states.

Figure 1.12 Protein refolding of β1b subunit. Source: [10] Neely et al, Folding of active calcium channel β1b subunit by size exclusion chromatography and its role on channel function.

The molecules from Peak I were recovered and loaded onto the column again. The resulting chromatogram again showed a high peak at the void volume, indicating that the protein had aggregated over time. The molecules from Peak II were recovered and loaded onto the column again. The resulting chromatogram (shown below) indicates that the protein had refolded correctly and remained in a stable condition. The integrity of the Peak II conformation was confirmed by amino acid analysis.

Figure 1.13 Analysis of Peak II subunit. Source: [10] Neely et al, Folding of active calcium channel β1b subunit by size exclusion chromatography and its role on channel function.

The recovery of the β1b subunit using this method averaged 50%, and the protein is stable at pH 8-10. [10]

2.2.5 Web Sites
• Amersham Biosciences. Their website has information relating to protein purification and analysis, as well as products available.
• Wiley Interscience database – relevant journal articles.
• Science Direct database – relevant journal articles.
• Access Excellence is a site run by the USA National Health Museum.
• Journal of Chromatographic Science.

2.2.6 Key Industry Suppliers
• Amersham Biosciences
• Sigma-Aldrich
• Bio-Rad
• Waters

2.2.7 References
1. Rouessac, F. & Rouessac, A. (2000) Chemical Analysis: Modern Instrumentation Methods and Techniques, John Wiley & Sons, West Sussex, England.
2. National Health Museum (1999) An Introduction to Chromatography.
3. Hunt, B.J. & Holding, S.R. (1989) Size Exclusion Chromatography, Blackie & Son Ltd, London.
4. Wu, Chi-San (2004) Handbook of Size Exclusion Chromatography and Related Techniques, 2nd Ed, Marcel Dekker, New York, USA.
5. Miller, James M. (2005) Chromatography Concepts and Contrasts, 2nd Ed, John Wiley & Sons, New Jersey, USA.
6. Amersham Biosciences (2002) Gel Filtration Principles and Methods, _TechDoc2~handb_pplr
7. Amersham Pharmacia Biotech AB (2001) Protein Purification Handbook, _TechDoc2~handb_pplr
8. Popovici, Simona & Schoenmakers, Peter (2005) Fast size exclusion chromatography – Theoretical and practical considerations, Journal of Chromatography A, 1099, 92-102.
9. Parini, P. et al (2006) Lipoprotein profiles in plasma and interstitial fluid analysed with an automated gel-filtration system, European Journal of Clinical Investigation, 36, 98-104.
10. Neely, Alan et al (2004) Folding of Active Calcium Channel β1b-Subunit by size exclusion chromatography and its role on channel function, Journal of Biological Chemistry, Vol 279, No 21, pp 21689-21694.

Chapter 2.3

Ion Exchange Chromatography
Ramakrishnan Ramamoorthy

2.3.1 Introduction:

Proteins are considered to be the building blocks of all living beings. The word “protein” originates from the Greek word protos, which means primary. The meaning indicates that proteins are an essential component of all living beings, a part of the living system without which survival is difficult. Proteins are large molecules built from amino acids: they are polymers formed by combining mainly 20 different amino acids. [1][2][3] [7]

Proteins have a variety of uses, and many applications require a purified, active form of the protein in the shortest time possible. The proteins are isolated from a mixture of proteins or another complex mixture. Protein purification can be broadly divided into two major categories according to its purpose:

Protein Purification
• Analytical Purpose
• Preparative Purpose

The two purposes of protein purification each have their own range of applications. Analytical purification is used mainly in research, for example to identify a protein, while preparative purification targets production, yielding large quantities of protein for various applications. Protein biotechnology is a rapidly growing field with diverse uses in the drug and pharmaceutical industries. [4].

The chromatographic technique discussed below is ion exchange chromatography. This technique was first adopted for the separation of proteins by H. A. Sober and E. A. Peterson in the mid 1950s. [5] The principle of ion exchange chromatography is to use the charge properties of molecules to separate them. [6]. “The basis for the ion exchange process is the competitive binding of ions of one kind (proteins) for ions of another kind, for example other proteins or salt ions of the same charge, to an oppositely charged chromatographic medium, the ion exchanger.” (Page No 109 Section 4.1) [5]

The primary requisite for separation is the matrix within the column. This is a resin which facilitates the separation. Examples of resins are agarose, dextran, acrylamide and cellulose. [8] Separations using ion exchange chromatography involve four steps:
a) Equilibration
b) Sample Application
c) Elution
d) Regeneration

The component of the complex mixture that is to be separated is known as the analyte. The analyte interacts with the oppositely charged groups of the column material. The primary principle applied in the separation of macromolecules by ion exchange chromatography is isolation based on ionic interaction. [8][6][9a]

Fig 2.3.1 Ion Exchange Chromatography Diagram Source : s/genchem/Labs/IEcolumn/images/diagram.gif&imgrefurl=http://www.chemistry.wustl. edu/~courses/genchem/Labs/IEcolumn/diagram.htm&h=410&w=390&sz=43&hl=en& start=3&tbnid=fx0Pkmqjp46oM:&tbnh=125&tbnw=119&prev=/images%3Fq%3Dion%2Bexchange%2Bchromatog raphy%26svnum%3D10%26hl%3Den%26lr%3D

This results in a further subdivision:

Ion Exchange Chromatography
• Cation Exchange Chromatography
• Anion Exchange Chromatography

Cation exchange chromatography binds positively charged ions, because the column material carries negatively charged groups, whereas an anion exchanger traps negatively charged ions using positively charged groups on the column material. [9]. The binding of a protein to the medium is therefore directly proportional to its charge. The proteins are then eluted either by increasing the salt concentration or by changing the pH [8].

The most important factor in any separation using this technique is the selection of the resin. The major considerations in resin selection are the overall charge and the charge distribution of the protein. The overall charge depends on the pH and on the amino acid composition of the protein, while the charge distribution depends on how much charge each amino acid carries. The isoelectric point also contributes strongly to the separation. Generally, proteins are separated in the pH range of 4-8, so a pH within this range is selected according to the characteristics of the resin and the analyte [6]
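The role of the isoelectric point (pI) in choosing an exchanger can be sketched with a simple rule: above its pI a protein carries a net negative charge and binds an anion exchanger, and below its pI it carries a net positive charge and binds a cation exchanger. The function name and the example values below are hypothetical, and the rule ignores charge distribution and protein stability, which also matter in practice.

```python
def choose_exchanger(protein_pi, buffer_ph):
    """Suggest an exchanger type from the protein pI and the buffer pH.

    Simplified rule of thumb based on net charge only.
    """
    if buffer_ph > protein_pi:
        # Net negative protein binds positively charged column groups.
        return "anion exchanger"
    if buffer_ph < protein_pi:
        # Net positive protein binds negatively charged column groups.
        return "cation exchanger"
    return "no net charge: weak binding to either exchanger"


# Hypothetical proteins, for illustration only.
print(choose_exchanger(protein_pi=5.0, buffer_ph=7.5))
print(choose_exchanger(protein_pi=9.0, buffer_ph=6.0))
```

In practice the buffer pH is usually set some distance from the pI so that the protein carries enough charge to bind firmly.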

Fig 2.3.2 Ion Exchange Chromatography operation procedure. Source e/ion_exchange_chromatography.png&imgrefurl= olecular_biology_methods.html&h=283&w=915&sz=15&hl=en&start=32&tbnid=fdEE amsdanN71M:&tbnh=45&tbnw=147&prev=/images%3Fq%3Dcation%2Bchromatogra phy%26start%3D18%26ndsp%3D18%26svnum%3D10%26hl%3Den%26lr%3D%26 sa%3

2.3.2: Recent Advancements:

New techniques continue to be developed to improve analysis, and they are updated quickly as further research is published. This section discusses recent advances in ion exchange chromatography.

(1) Increasing polymer efficiency for high capacity ion exchange:
Monolithic polymers have been used as separation media for some time. These polymers were initially not very effective, but recent studies indicate that sulphonated monolithic polymers give better efficiency and high capacity. Latex coated monoliths were used initially but gave unfavourable results, as no great separation efficiency was seen. The polymers were studied by Joseph Hutchinson and colleagues from the University of Tasmania. Their studies focused on BuMa-co-EDMA-co-AMPS (butyl methacrylate-co-ethylene glycol dimethacrylate-co-2-acrylamido-2-methyl-1-propanesulfonic acid), PS-co-DVB (poly(styrene-co-divinylbenzene)) and GMA-co-EDMA (poly(glycidyl methacrylate-co-ethylene glycol dimethacrylate)). The initial analysis used PS-co-DVB as the polymer for making monoliths; it did not respond as expected and had very low separation efficiency and capacity. With GMA-co-EDMA, three ways of increasing the ion exchange capacity were investigated. “The first two methods (involving sulfonation with 4-hydroxybenzenesulfonic acid and thiobenzoic acid respectively), while increasing capacity, turned out not to be suitable to separating anion mixtures. The third method, which made use of sodium sulphite, increased capacity of the monolith by 30-fold and also increased its chromatographic efficiency.” [10] This research paved the way for effective ion exchange separation.
[10],[11]

(2) Use of an advanced matrix
The matrices used for ion exchange chromatography have evolved over time. S-Zephyr has been found to perform better than Mono-S in cation exchange separations. This conclusion was drawn from parameters such as retention time. S-Zephyr showed better results for all types of separations. [12]

(3) Specificity of adenovirus purification using anion exchange chromatography
In gene transfer, recombinant adenoviruses are the vector of choice, with increasing emphasis on the less popular serotypes and on chimeras, which are human-engineered. With this growth in use, purified forms of alternative serotypes are required. Anion exchange chromatography is generally the best choice for purifying adenovirus. Retention times may vary because of differences in how the capsid proteins are exposed. With a thorough knowledge of the behaviour of the virus, the retention time can be influenced and an efficient basis for a chromatography method can be worked out. The study found that the hexon protein was responsible for the differences in retention in anion exchange chromatography. An analysis was carried out with different groups of serotypes. The serotypes bound well to the anion exchange column and were eluted with sodium chloride solution. The retention times were related with good accuracy to properties of the hexon proteins. Such an analysis helps in establishing good chromatography gradients for different serotypes [13]

(4) High speed separation at elevated temperatures:
One approach sought for faster separation is operation at high temperature, at about 90 °C. A Dionex CS12A column with suppressed conductivity detection was used to separate ions. The high operating temperature lowers the viscosity, which allows a higher flow rate than normal and also improved the separations. The drawback of the method was that the retention time of cations was reduced. The column was therefore operated at 60 °C, which still gave better results than regular operation. [14]

(5) Application to determine sugars
As nutritional value plays an important role in the modern world, the detection of sugars is a valuable application of ion exchange chromatography. The results obtained are reported to be reliable. For this analysis the mobile phase is predominantly basic (NaOH) and the column is an amine resin column. Combined with pulsed amperometric detection, this analysis is a useful tool for measuring glucose, fructose and sucrose. Detection was found to be better and to take less time than with the other method, HPLC. [15]

(6) Rapid anion separation:
Research has been carried out on faster separation of anions by ion exchange chromatography, to give better results in food and environmental analysis. The method was seen to have a better throughput. Issues such as hydrophobicity and interactions between the analytes tend to slow the process down, so a new method was developed that separates anions in about three minutes. A short column was coated with cetylpyridinium chloride (CPC), a cationic surfactant, which was found to perform well. The research also showed that the surfactant could be removed with acetonitrile. The method was first tested on nitrite and nitrate at reasonable flow rates, and the separation was completed in less than three minutes. Further samples were then tested; these gave a similar result and also showed reasonable recovery. Coating the column with a surfactant therefore improves both the recovery and the time needed. [16] [17] [18]

(7) Separation of ions by switching columns:
A new system has been devised for the separation of cations and anions which uses only one pump, one eluent and one detector. The two columns are connected side by side and are selected by switching a valve, which allows both separations in a single run. The two columns do not operate simultaneously: when one column is in operation, the other is on standby for analysis. The system was tested using 2.4 mM 5-sulfosalicylic acid and showed an acceptable level of separation. When an injection valve was added, the targeted ions were separated with better detection and capacity. [19]

(8) New column by grafting:
A recent study by a group of Turkish scientists shows that a new column can be developed from monodisperse polymer particles, which can be used to design an ion exchanger. Because the particles are monodisperse, they give better separations. The chromatography column was developed by atom transfer radical polymerisation (ATRP), using poly(glycidyl methacrylate-co-ethylene dimethacrylate), or poly(GMA-co-EDM), with ion exchange ligands. An initiator is used with the poly(GMA-co-EDM). The combination proved very successful for protein separations. Analysis of the polymers and test runs of the system showed effective separation of proteins such as lysozyme and myoglobin. The column used the optimum column height and also showed very low run-to-run variability. [20] [21]

(9) Use of short columns to separate low-abundance membrane proteins: Convective Interaction Media are monolithic stationary phases designed to separate macromolecules at high flow rates with a minimal pressure drop. Although they have mainly been used for antibody separations, they have great scope for separating biological materials in the shortest possible time. In this work, monoclonal antibodies against rat liver plasma proteins were linked to protein A or protein G. The column achieved a rapid separation, and the separated antigens were analysed further by mass spectrometry. [22]

2.3.3. Evaluation of Technology
Ion exchange chromatography is the most widely preferred technique for protein separation: about 57 % of separated proteins are purified by it. Large volumes of protein can be processed. The column has good efficiency and a long working life, and it continues to give good output over that life. The technique uses few chemicals, and only small amounts of sample need to be prepared. Because it relies on charge, it has good binding capacity compared with other techniques. A distinctive feature is that several ions can be analysed at once. Further advantages are its reliability, its robustness and the reproducibility of its results, which make it stand out from other separation methods. It is also suitable for scaling up. [5] [23] [24]

Example: The measurement of ammonium nitrite is not simple. It is particularly difficult by other methods, whereas ion exchange chromatography is distinctly sensitive to it. [24]

Ion exchange chromatography also has some basic limitations. Setting up the column for a separation takes a lot of time. A crude extract normally cannot be fed directly into the column. Separating materials that carry different charges is generally difficult. The column must be monitored continuously to check that it is operating properly, and the method is difficult to optimise. [25]

2.3.4 Applications of the Technology:
Ion exchange chromatography has attracted significant interest in many fields, including environmental studies, commercial separations and industrial applications. It is used most in medical, biomolecular, biomedical and pharmaceutical applications and for lower molecular weight substances.

(1) IEC technique to purify Monoclonal Antibodies: Monoclonal antibodies have important applications in medicine, for example in the treatment of cancer. Before use, the antibody must be purified to obtain the desired protein. This is done by ion exchange chromatography (either cation or anion exchange). The isoelectric points of monoclonal antibodies range from 5 to 8, so they bind to cation exchangers. Steps in which the antibody does not bind are also useful, because impurities such as DNA, viruses and unwanted proteins are removed, giving a purified antibody. [26] [27]

(2) In the Separation of RanGDP, RanGTP and RanGMPPNP: Ran has a major role in the transport of essential substances and in cell division. The binding of Ran to an ion exchange column is very responsive to the concentration of magnesium chloride. At moderate concentrations of magnesium chloride, Ran eluted with an acceptable level of separation. When the magnesium chloride concentration was reduced further, the purity of the Ran improved further, as confirmed by high performance liquid chromatography. [28]

(3) To analyse Spermadhesin proteins: Spermadhesins are important proteins in animals such as the boar and the stallion, where they play a major role in capacitation and sperm-egg interactions. Ion exchange chromatography was used to detect and confirm a spermadhesin-like protein in the buck (male goat). A semen sample was taken from bucks and, after treatment including dialysis and freeze drying, was injected onto the chromatography column. Seminal plasma of the buck gave 6.47 ± 0.3 mg from ion exchange chromatography. The proteins were thus separated and analysed, and the buck seminal fluid was shown to contain proteins similar in structure to the spermadhesin family. [29]

(4) Monolith Application for purification of plasmid DNA: Plasmid DNA (pDNA) is gaining importance for genetic vaccination and gene therapy, and both applications require it in purified form. The key step of pDNA production is chromatographic. Monoliths are a leading choice for separating pDNA because they bind it very strongly. Anion exchange chromatography is the best choice for the purification, since polynucleotides are negatively charged regardless of buffer conditions. The nature of various ligands, their densities and their optimisation were investigated. Scale-up was carried out using Convective Interaction Media monoliths operating at low pressure, which proved successful in the intermediate step of pDNA production. The technique has gained considerable prominence in the pharmaceutical industry. [30]

(5) Applications in Food industry: In the baking industry, potassium bromate is widely used because it improves the keeping and conditioning properties of dough. However, potassium bromate raises significant health concerns and was therefore banned as a food additive by the World Health Organisation. Recent studies have shown that bromate in food can be detected using ion exchange chromatography. Other techniques such as HPLC and GC had been tried first, but they were not very successful in separating bromate. The column used for separating the bromate was the IonPac AS19, with potassium hydroxide as the eluent. To calculate the bromate levels, parameters such as the separation time and the separation temperature were considered. After sonication, peaks appeared after 30 min, and temperature made no significant contribution to the separation. The detection limit was 0.5 µg/L. When the method was tested on flour-based products, it was successful and gave very high bromate recoveries. [31] [32]
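The choice between cation and anion exchange in applications (1) and (4) above follows from the sign of the analyte's net charge at the working pH. The short Python sketch below illustrates that rule of thumb only. The function names and the example pH values are assumptions made for illustration; real method development also depends on salt concentration, charge distribution and resin chemistry, which the sketch ignores.

```python
def net_charge_sign(isoelectric_point: float, buffer_ph: float) -> int:
    """Return +1, 0 or -1 for the expected sign of a protein's net charge.

    Rule of thumb: below its isoelectric point (pI) a protein is net positive,
    above its pI it is net negative.
    """
    if buffer_ph < isoelectric_point:
        return 1
    if buffer_ph > isoelectric_point:
        return -1
    return 0


def suggested_exchanger(isoelectric_point: float, buffer_ph: float) -> str:
    """Suggest the exchanger type that should retain the protein (illustrative only)."""
    sign = net_charge_sign(isoelectric_point, buffer_ph)
    if sign > 0:
        return "cation exchanger"   # positively charged protein binds a negative resin
    if sign < 0:
        return "anion exchanger"    # negatively charged analyte binds a positive resin
    return "no binding expected"    # at pH == pI the net charge is zero


# Hypothetical example: an antibody with pI 7.0 loaded at pH 5.5
print(suggested_exchanger(7.0, 5.5))
# Hypothetical example: an acidic protein with pI 5.0 loaded at pH 8.0
print(suggested_exchanger(5.0, 8.0))
```

Under this simplification, an antibody with a pI between 5 and 8 binds a cation exchanger whenever the buffer pH is kept below its pI, which is consistent with the behaviour described in application (1).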

2.3.5: Relevant Web Sites

One of the best sources of relevant information on the web has material on ion exchange chromatography and on all the other techniques as well. PubMed and ScienceDirect (RMIT online library resources) also provide excellent material in the form of many journal articles.

2.3.6: Key Suppliers:
1) Millipore, Billerica, MA 01821, United States of America
2) Palico Instrument Lab, Circle Pines, MN 55014-0125, United States of America
3) Eichrom Technologies, Inc.
4) Thermo Electron and Fisher Scientific
5) GE Healthcare

2.3.7 Reference:
1) Farlex, Inc. (2006) The Free Dictionary.
2) Everything Bio (2005-2006) Protein Definition.
3) Lehninger, A.L., Cox, M.M., Dobos, M. and Roche, P. Lehninger Principles of Biochemistry, 4th Edition, Chapter 3 (3.2), pp 85-88.

4) Virginia Polytechnic Institute and State University (2005) Learning Resource.
5) Janson, J.-C. and Rydén, L. (1989) Ion Exchange Chromatography. In Protein Purification: Principles, High Resolution Methods and Applications, Section 4, pp 104-116.
6) Protein (2003) Tutorial Resource.
7) Cornell University, Learning Resource.
8) Wageningen University (2005) Laboratory of Biochemistry, Department of Agrotechnology and Food Sciences, Learning Resource.
9) Madison Area Technical College, Learning Resource.
9a) Everything Bio (2005) Definition of Analyte.
10) Secko, D. (2006) Monolith Polymers Reveal High Capacity for Ion Exchange.
11) Hutchinson, J.P., Hilder, E.F., Shellie, R.A., Smith, J.A. and Haddad, P.R. (2006) Towards high capacity latex-coated porous polymer monoliths as ion-exchange stationary phases. Analyst 131, pp 215-221.
12) Ronger, R.M., Alder, C.M. and Scouten, W.H. (1991) S-Zephyr, a new high performance ion exchange chromatography column matrix. Journal of Bioseparations 2(5), pp 297-308.
13) Konz, J.O., Livingood, R.C., Bett, A.J., Goerke, A.R., Laska, M.E. and Sagar, S.L. (2005) Serotype specificity of adenovirus purification using anion exchange chromatography. Human Gene Therapy 16(11), pp 1346-1353.
14) Chong, J., Hatsis, P. and Lucy, C.A. (2003) High-speed ion chromatographic separation of cations at elevated temperature. Journal of Chromatography A 997(1-2), pp 161-169.
15) White, D.R. and Widmer, W.W. (1990) Applications of high performance chromatography for sugar analysis in citrus juice. Journal of Agricultural and Food Chemistry 38, pp 1918-1921.
16) Secko, D. (2006) Anion separation in less than three minutes.
17) Li, J., Zhu, Y. and Guo, Y. (2006) Fast determination of anions on a short coated column. Journal of Chromatography A 1118, pp 46-50.
18) Fritz, J.S., Yan, Z. and Haddad, P.R. (2003) Modification of ion chromatographic separations by ionic and nonionic surfactants. Journal of Chromatography A 997, pp 21-31.
19) Amin, M., Lim, L.W. and Takeuchi, T. (2006) Tunable separation of anions and cations by column switching in ion chromatography. Journal article.
20) Secko, D. (2006) Grafting a new column.
21) Unsal, E., Elmas, B., Caglayan, B., Tuncel, M., Patir, M. and Tuncel, A. (2006) Preparation of an Ion-Exchange Chromatographic Support by A "Grafting From" Strategy Based on Atom Transfer Radical Polymerization. Analytical Chemistry 78(16), pp 5868-5875.
22) Rucevic, M., Clifton, J.G., Huang, F., Li, X., Callanan, H., Hixson, D.C. and Josic, D. (2006) Use of short monolithic columns for isolation of low abundance membrane proteins. Journal of Chromatography A 1123(2), pp 199-204.
23) Portal to Science, Engineering and Technology. Tutorial Source, Chromatography.
24) Rey, M.A. (2001) High-capacity cation-exchange column for enhanced resolution of adjacent peaks of cations in ion chromatography. Journal of Chromatography A 920(1-2), pp 61-68.
25) University of British Columbia, Learning Resource.
25a) Laboratory Talk (2002), Jonathan Bruce.
26) Jacob, L. and Schluter, H. (2006) Bio processing and Biopartnering (article).
27) Bai, L., Burman, S. and Gledhill, L. (2000) Development of ion exchange chromatography methods for monoclonal antibodies. J Pharm Biomed Anal 22, pp 605-611.
28) Bibak, N., Paul, R.M., Freymann, D.M. and Yaseen, N.R. (2004) Purification of RanGDP, RanGTP, and RanGMPPNP by ion exchange chromatography. Analytical Biochemistry 333(1), pp 57-64.
29) Ítalo, D., Teixeira, A., Melo, L.M., Gadelha, C.A.A., Silva da Cunha, R.M., Bloch Jr, C., Rádis-Baptista, G., Cavada, B.S. and José de Figueirêdo Freitas, V. (2006) Ion-exchange chromatography used to isolate a spermadhesin-related protein from domestic goat (Capra hircus) seminal plasma. Genetics and Molecular Research (online journal). Access date: 25th September 2006.
30) Urthaler, J., Schlegl, R., Podgornik, A., Strancar, A., Jungbauer, A. and Necina, R. (2005) Application of monoliths for plasmid DNA purification: development and transfer to production. Journal of Chromatography A 1065(1), pp 93-106.
31) Secko, D. (2006) IC Traces Bromate in Flour.
32) Shi, Y., Liang, L., Cai, Y. and Mou, S. (2006) Determination of Trace Levels of Bromate in Flour and Related Foods by Ion Chromatography. Journal of Agricultural and Food Chemistry 54(15), pp 5217-5219.

Chapter 2.4 Affinity Chromatography
Aekta Chhabra

2.4.1 Introduction

Affinity chromatography is a method in which biospecific, reversible interactions are used to separate, purify and specifically select biologically active material from crude samples. The approach was first introduced in 1968 to purify proteins, and it remains one of the most powerful techniques available for purifying biologically active compounds. It is an extremely useful tool for studying many biological processes, such as mechanisms of enzyme action, hormone action, protein–protein or cell–cell interactions and other significant biological interactions. Its specificity is what makes affinity chromatography one of the most powerful forms of chromatography. Purifications of several orders of magnitude are obtainable in a single step, and affinity separations are widely used to remove contaminants that are very difficult to eliminate by conventional chromatographic techniques. [1]

Affinity chromatography is unique in purification technology and has versatile applications in biochemical analysis, since it is the only procedure that purifies a biomolecule on the basis of its biological function or individual chemical structure. Immunoaffinity chromatography, one form of affinity chromatography, continues to be a definitive tool for isolating proteins produced by genetic engineering. Purification and analytical steps that would otherwise be time consuming or impossible are readily achieved. Affinity chromatography can thus be regarded as the grandfather of most modern affinity-based techniques, including biosensors and DNA and protein microarrays, with their varied applications in diagnostics, particularly the study of protein–protein interactions and drug screening. [1]
Underlying Principle
The principle of affinity chromatography is as follows:
1) The sample is injected onto an equilibrated affinity chromatography column.
2) Substances with an affinity for the ligands are retained in the column.
3) Substances with no affinity for the ligands pass through and are washed from the column.
4) The retained substances are later eluted (recovered) from the column by altering conditions such as the pH, salt concentration or organic solvent concentration of the eluent.
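The four steps above can be sketched as a toy model in Python. This is a minimal illustration under stated assumptions, not a simulation of real column behaviour: the `Molecule` class, the `run_affinity_column` function and the sample contents are all hypothetical, and binding is treated as a simple yes/no property with no capacity limit or kinetics.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    name: str
    binds_ligand: bool  # True if the molecule has affinity for the immobilised ligand


def run_affinity_column(sample, elute=True):
    """Toy bind/wash/elute cycle.

    Returns three lists of names: (flow_through, eluate, still_bound).
    """
    # Step 2: molecules with affinity for the ligand are retained.
    retained = [m for m in sample if m.binds_ligand]
    # Step 3: everything else passes through and is washed away.
    flow_through = [m.name for m in sample if not m.binds_ligand]
    # Step 4: changing pH, salt or solvent releases the retained molecules.
    if elute:
        eluate = [m.name for m in retained]
        still_bound = []
    else:
        eluate = []
        still_bound = [m.name for m in retained]
    return flow_through, eluate, still_bound


# Step 1: a hypothetical crude sample is applied to the column.
sample = [
    Molecule("target antibody", True),
    Molecule("albumin", False),
    Molecule("DNA fragment", False),
]
flow_through, eluate, still_bound = run_affinity_column(sample)
print("flow-through:", flow_through)
print("eluate:", eluate)
```

Calling the function with `elute=False` leaves the target in `still_bound`, which mirrors a column that has been loaded and washed but not yet eluted.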

Fig 1: Principle of affinity chromatography

The success of the entire process rests on specificity, which depends on three aspects: the matrix, the ligands, and the attachment of the ligands to the matrix. Choosing the correct ligand is the first important aspect. The ligand must bind strongly to the molecule to be recovered. If the chosen ligand can bind more than one molecule in the sample, a technique called negative affinity can be used, in which ligands remove everything except the target molecule from the sample solution. The next important step is selecting the matrix. Matrix materials hold the active ligands and provide a pore structure that increases the surface area and hence the binding capacity. To bind the ligands, the matrix must be activated and then reacted with the ligands so that they attach tightly. Throughout the process the ligand must remain active towards the target molecule. Amino, hydroxyl, carbonyl and thio groups are easily activated and can serve as the sites on the matrix to which the ligands attach. In addition to requiring activation, the matrix should be resistant to contamination when pharmaceutical compounds are purified; decontamination is performed by rinsing the column with sodium hydroxide or urea. Different matrix materials are stable in different pH ranges, and this stability forms the third aspect of selection for affinity chromatography. [2]

Fig 2: The ligand attached to the matrix binds to the protein of interest, which is thereby separated from other proteins.

2.4.2 Recent Advances
When the technique was introduced in the late 1960s, affinity chromatography was used to purify classes of proteins according to their properties or function: antibody binding, hormone binding, enzyme inhibition and so on. In the last few years it has enabled monoclonal antibody purification through the unique affinity of Protein A (a protein originally from Staphylococcus aureus) for the constant region of antibodies. More recently, computational chemistry, molecular modelling and combinatorial chemistry have made it possible to develop chromatographic adsorbents for purification by design. In this mode, an adsorbent specific to the target biopharmaceutical is constructed in a customised programme between the biopharmaceutical company and the adsorbent manufacturer. [3]

To increase the efficacy of purification, the concept of spacers between the biologically active ligand and the polymer was introduced. 95% of all affinity purification methods apply the same general principles using spacers. Immunoaffinity chromatography continues to be instrumental for the isolation of proteins produced by genetic engineering. Recent years have shown that affinity columns can be used to remove toxic compounds from blood. Chemical reactions can now be combined with the affinity concept to demonstrate and study biological recognition. A good example is affinity labelling, which identifies the residues in the binding or active site of a protein; the involvement of amino acid residues in the active sites of enzymes can be examined in this way. Affinity therapy is a biorecognition-based approach that selectively delivers a cytotoxic drug or toxin to the targeted or infected cell. The cell-associated target molecule can be a cell-surface antigen, a surface receptor or another type of biomolecule with specificity for a given antibody, hormone or nutrient. The drug counterpart can comprise the corresponding antibody or hormone, to which a cytotoxic compound (for example a chemotherapeutic drug, a radionuclide or a toxin of some origin) has been chemically attached in a synthetic process. This approach was originally known as affinity therapy; others later renamed it with the misnomer immunotoxins. [1]

Detection of seven specific bands by immunoblot (IB) using glycoproteins (GPs) purified by lentil lectin affinity chromatography has been a standard for neurocysticercosis (NCC) serodiagnosis since 1989. This has aided the study of certain pathological differences between the Asian and African-American types and their respective antigenic components. [4]

Affinity chromatography is regarded as the most selective method for protein purification because its specific purification power can eliminate steps, increase yields and thereby improve process economics. Some of the most recent advances in this area have explored rational and combinatorial approaches to designing highly selective and stable synthetic affinity ligands, which are the pivotal point of the process. Rational molecular design techniques, based on biochemical analysis of protein structures and advanced computational tools, have made ligand design feasible, faster and much more accurate. Combinatorial approaches based on peptide and nucleic acid libraries have enabled the rapid synthesis of new synthetic affinity ligands of potential use in affinity chromatography. The specificity and versatility of these approaches have made them the dominant methods for designing and selecting novel affinity ligands with scale-up potential. [5]

The most recent approach involves high-throughput screening of libraries constructed in 96-well plates containing “microcolumns” of defined affinity adsorbents. These column libraries are now constructed using parallel synthesis robotics. Using sensitive protein assays, adsorbents that selectively recognise the target protein can be identified for further optimisation and development. ProMetic BioSciences Ltd. (Cambridge, U.K.), a subsidiary of ProMetic BioSciences Inc. (Montreal, QC), uses robust chemical scaffolds based primarily on triazine derivatised with non-toxic amines. These “ligands” are then immobilised onto a neutral, beaded agarose support used in a standard packed bed chromatographic array. The technique has been used to develop high-performance affinity adsorbents for many biopharmaceutical targets, including antibodies, fusion proteins and proteins present in blood plasma. Chromatography is likely to remain the technology of choice for downstream processing, despite the many efforts to find new solutions. Affinity chromatography using custom-designed synthetic ligands is likely to play an ever-increasing role in bioseparations, which form an integral part of biochemical analysis. New formats for chromatography will emerge in the next decade, and manufacturers will be challenged to balance the introduction of cost-saving, higher-yielding technology with high levels of specificity. [3]

2.4.3 Evaluation Of Affinity Chromatography process
In protein engineering, generating and testing a large number of variants of a molecule, and optimising expression conditions for an individual molecule, create a need for purification methods that can handle a large number of samples simultaneously.
A simple affinity chromatography system can be used for the parallel purification of 24 protein samples, yielding sufficient purified material for biochemical and functional analysis. This system has certain advantages over existing systems: its costs are minimal compared with other chromatography systems, and it allows gentler processing of the samples, which benefits proteins that are easily damaged. [6]

Affinity chromatography shows its true power in intricate biochemical analysis. The challenge of bioseparations lies in identifying and extracting the target protein from the sample mixture. For example, about 300 proteins in blood plasma have now been identified and separated, out of the thousands present at low concentration. Affinity chromatography thus combines appropriate selectivity with high yield. [3] It is regarded as the most selective method for protein purification because it can eliminate steps, increase yields and efficiency and thereby improve process economics. [5]

The affinity chromatography method for the determination of glycosylated hemoglobin is rapid, specific and precise, and offers several advantages over other methods. It is relatively insensitive to changes in temperature and pH and is likely to resist interference by other groups. The results are reproducible over a wide range of hemolysate concentrations. In addition, specimens, buffers and columns are stable. The technique allows 10-fold reuse and relatively simple sample preparation, with no washing or incubation of erythrocytes. All these aspects make the method technically less complicated and suitable for routine use in a clinical laboratory. [7]

Advantages of affinity chromatography:
• High affinity, thereby ensuring high selectivity
• Binding all or none of the sample molecules, so excellent recovery is possible
• Many variations and designs are possible, e.g. the use of tags according to the desired protocol

Disadvantages of affinity chromatography:
• Steric hindrance between the molecules tends to lower capacity
• Elution conditions are often harsh
• A very specific eluent is required to obtain accurate results
• Limited options are available for the selection of affinity chromatography beads [8]

The isolation of proteins and peptides in the biotechnology industry is usually performed using a variety of chromatographic, electrophoretic, dialysis, precipitation and other procedures, with affinity chromatography being one of the most widely used. Affinity ligand techniques are among the most powerful tools available for downstream processing and offer high selectivity and recovery. The strength of column affinity chromatography is evident from thousands of successful applications, especially at laboratory and pilot plant scale. However, the disadvantage of all standard column liquid chromatography procedures remains that the commercially available systems cannot cope with samples containing particulate material. This makes the columns unsuitable for work in the early stages of the isolation or purification process, where suspended solids and fouling components are present in the sample. [9]

2.4.4 Applications of affinity chromatography
Affinity chromatography finds numerous applications in biochemistry. Early applications included the separation of biomacromolecules from other components of the cell's machinery, and this remains the most important field for affinity chromatography. In addition to using small ligands to separate large molecules from a sample mixture, large molecules can now be immobilised on the matrix and used to separate small molecules that adhere to them. Affinity chromatography can also be used to determine dissociation constants for ligands and molecules. The longer the molecule stays on the column, the broader the chromatographic peak, indicating tight binding of the molecule to the ligand. This quantitative information can be used in various biochemical studies. [2]

Immunoaffinity chromatography is used to perform flow-based immunoassays, and affinity methods find further applications such as affinity-based chiral separations. Affinity chromatography also serves for the study of drug or hormone interactions with binding proteins. Possible future developments now being investigated in detail include tandem affinity methods and the use of synthetic dyes, immobilised metal ions, molecular imprints or aptamers as affinity ligands for clinical analytes. [10]

Affinity chromatography is also a powerful protein separation method based on specific interaction between immobilised ligands and target proteins, and peptides can be separated effectively in the same way. Affinity chromatography of peptide mixtures, used in conjunction with mass spectrometry, has laid the basis for the study of protein posttranslational modification (PTM) sites and for quantitative proteomics.
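Returning to the use of column retention to gauge binding strength, noted earlier in this section, the calculation can be sketched in Python. The sketch assumes the simplest textbook case, which is not described in the source: a single type of binding site, linear (low sample load) conditions, and the zonal-elution relation k = K_a · m_L / V_M, where k is the retention factor, m_L the moles of active immobilised ligand and V_M the column void volume. The function names and all numerical values are hypothetical.

```python
def retention_factor(t_retained: float, t_void: float) -> float:
    """Retention factor k = (t_R - t_M) / t_M from retention and void times."""
    if t_void <= 0:
        raise ValueError("void time must be positive")
    return (t_retained - t_void) / t_void


def dissociation_constant(k: float, ligand_mol: float, void_volume_l: float) -> float:
    """Estimate K_d (mol/L) assuming k = K_a * m_L / V_M and K_d = 1 / K_a.

    ligand_mol    -- moles of active immobilised ligand (m_L), assumed known
    void_volume_l -- column void volume in litres (V_M)
    """
    if k <= 0:
        raise ValueError("retention factor must be positive for a retained analyte")
    association_constant = k * void_volume_l / ligand_mol  # K_a in L/mol
    return 1.0 / association_constant


# Hypothetical run: analyte elutes at 12 min, unretained marker at 2 min.
k = retention_factor(12.0, 2.0)  # k = 5.0
# Hypothetical column: 1e-8 mol active ligand, 1 mL void volume.
kd = dissociation_constant(k, ligand_mol=1e-8, void_volume_l=1e-3)
print(f"k = {k:.1f}, estimated K_d = {kd:.2e} mol/L")
```

A larger retention factor gives a smaller estimated K_d, which matches the qualitative statement that longer retention indicates tighter binding. Frontal analysis or competition experiments would normally be needed for reliable values.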
Isotope-coded affinity tags have made the quantitation of proteomes easy. Affinity chromatography is becoming a key tool for exploring PTMs and protein–protein interactions, especially with a view to developing strategies for the chemical derivatisation of peptides. With recent advances in technology, more applications of affinity-based purification are expected in the next few years, including increasing the resolution in 2-DE (two-dimensional electrophoresis) and improving the sensitivity of MS (mass spectrometry) quantification. [11]

The most common use of affinity chromatography is to purify recombinant proteins, which are tagged with a tag of known affinity to facilitate their purification. [12] The Two-Step Affinity Purification System launched by Qiagen, which combines sequential affinity purification steps, is claimed to yield ultra-pure (>98%) protein. BioChrom has also focused on improving column matrices for affinity chromatography. According to BioChrom's president, Michael Lu, "the improvement of recoveries of proteins and the speed of protein purification are still the major issue for most separation media companies." To address this, BioChrom has developed a polymeric material, Hydro cell. The high level of cross-linking in the polymer beads is designed to give better, faster and more efficient protein separation than conventional silica-based HPLC columns. BioChrom also claims that its Hydro cell columns are very durable and can tolerate a broader pH range and harsher cleaning regimens. [13]

Affinity chromatography is now also used as an analytical method to determine the concentration of the free fraction of an analyte in a sample. A sample containing both free and bound analyte is applied to an affinity column that can selectively extract the free portion within a fraction of a second. The signal generated by the free portion is then quantified by standard analytical detection techniques, and its concentration is determined by comparing the signal with a calibration curve obtained from known concentrations of the same analyte. [14]

Burkholderia pseudomallei is the causative agent of melioidosis, a disease prevalent in South East Asia and northern Australia. The bacterium secretes various extracellular products that correlate directly with its pathogenesis, e.g. lethal exotoxin, protease and hemolysin. Antibody-mediated affinity chromatography was used to purify the exotoxin, as studies have shown that it exhibits necrotic and cytotoxic activities in addition to inhibiting cellular protein synthesis. The purified protein can aid in understanding the role of this virulence factor and may facilitate vaccine development to combat melioidosis. [15]

With new drugs rapidly advancing into clinical trials, identifying their intracellular targets becomes fundamental to understanding the molecular basis of their efficacy and toxicity. This is especially relevant when the targets belong to a large family and the inhibitors recognise a site conserved among different members of the same class. A typical example is the kinase family, where efforts are aimed at developing inhibitors of distinct kinases for therapeutic applications in oncology, inflammation and other disease areas. Inhibitor affinity chromatography can be used as a tool to identify and characterise the intracellular targets of various kinase inhibitors. [16]

2.4.5 Relevant Websites
• Resources, Basic concepts for molecular techniques, David B. Collinge's Home Page, Molecular Plant Pathology Group
• Using The Informant for Searching for Analytical Chemistry Information on the World Wide Web, Stephen R. Heller
• Analytical Sciences Digital Library, supported by the NSDL program of the National Science Foundation
• Pierce - The protein people
• Nature's protocol: recipe for researchers
• Affinity Chromatography: principles and methods
• Prometic Biosciences

2.4.6 Key Industry Suppliers
• Thermo Electron Corporation
• Biocentrum Ltd.
• Prometic Biosciences

2.4.7 References
1. Wilchek, M. (2004) My life with affinity. Protein Science, 13: 3066-3070
2. Felton, M. & Lesney, M. (2006) To Affinity and Beyond: Analytical Alphabet Soup. In Chromatography: creating a central science, Chapter 5
3. Curling, J. (2006) Bioseparation: The Power of Affinity Purification. In Bioscience World
4. Ito, A. et al. (2002) Recent advances in basic and applied science for the control of taeniasis/cysticercosis in Asia. In The Southeast Asian Journal of Tropical Medicine and Public Health, vol. 33, SUP3 (dissem.), pp. 79-82
5. Labrou, N. (2003) Design and selection of ligands for affinity chromatography. In Journal of Chromatography B, Volume 790, Issues 1-2, pp. 67-78, Preparative Chromatography of Proteins
6. Gottstein, C. & Forde, R. (2002) Affinity chromatography system for parallel purification of recombinant protein samples. In Protein Engineering, Vol. 15, No. 10, 775-777
7. Schmid, G. & Vormbrock, R. (1984) Chemistry and Materials Science. In Fresenius' Journal of Analytical Chemistry, Vol. 317, Number 6, pp. 703-704
8. Johannsen, M. (2005/06) Fundamentals and applications of chromatography
9. Safarik, I. & Safarikova, M. (2004) Magnetic techniques for the isolation and purification of proteins and peptides
10. Hage, D. (1999) Affinity Chromatography: A Review of Clinical Applications. In Clinical Chemistry 45: 593-615
11. Lee, W. & Lee, K. (2004) Applications of affinity chromatography in proteomics. In Analytical Biochemistry 324: 1-10
12. Wikipedia, The Free Encyclopedia (2006)
13. Smith, C. (2005) Striving for purity: advances in protein purification. In Nature Methods 2: 71-77
14. Hage, D. & Clarke, W. (2004) Analysis of free analyte fractions by rapid affinity chromatography
15. Lim, K., Mohamed, R., Embi, N. & Nathan, S. (2005) Mediated Affinity Chromatography
16. Valsasina, B., Kalisz, H. & Isacchi, A. (2004) Kinase selectivity profiling by inhibitor affinity chromatography. In Expert Rev Proteomics. Oct; 1: 303-15

Chapter 2.6

Reversed phase HPLC of peptides and proteins
El-Wazani Montaser

Introduction
Reversed-phase HPLC has found a central role in protein studies because of its versatility, its sensitive detection and its ability to work together with techniques such as mass spectrometry. Above all, reversed-phase HPLC is widely used because it can separate proteins of nearly identical structure [1],[2] (Fig. 2.6.1.a, Fig. 2.6.1.b).

Mechanism of Protein/Peptide Retention
In reversed-phase HPLC the particle surface is very hydrophobic because hydrocarbon groups are chemically attached to it. Proteins are retained by the adsorption of one face of the protein (termed the "hydrophobic foot") to the hydrophobic surface [3].

Column Characteristics

a) Hydrophobic surface
A process called "end-capping", whereby a small organosilane is subsequently reacted with the silica surface, further reduces the number of polar silanol groups [4].

b) Column selection and characteristics of sample molecules
The hydrocarbon group forming the hydrophobic phase is usually a linear aliphatic hydrocarbon of eighteen (C18), eight (C8) or four (C4) carbons. There are guidelines as to which phase is likely to be most effective in separating polypeptides of a given size and hydrophobicity [5]; these are summarized in the accompanying figure.

Column length
Column length has little effect on protein separations: short columns separate proteins as well as long columns.

Column diameter
See reference [6].

Mobile Phase

a) Organic modifier
The purpose of the organic solvent is to desorb polypeptide and protein molecules from the hydrophobic surface of the adsorbent. This is typically done by slowly raising the concentration of organic solvent (a gradient) until the proteins and polypeptides of interest desorb and elute.

b) Gradient elution
Proteins and polypeptides are almost always eluted using a solvent gradient in which the relative concentration of organic solvent is slowly increased during the separation [7].
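As a rough illustration of gradient elution, the sketch below computes the organic solvent concentration (%B) at a given time for a single linear gradient, together with the gradient slope in %B per minute. The function names and the 5% to 65% B over 60 minutes program are hypothetical values chosen only for the example; they are not taken from the references cited here.

```python
def gradient_slope(start_b, end_b, gradient_min):
    """Slope of a linear gradient in % organic solvent per minute."""
    if gradient_min <= 0:
        raise ValueError("gradient time must be positive")
    return (end_b - start_b) / gradient_min


def percent_b(t_min, start_b=5.0, end_b=65.0, gradient_min=60.0):
    """% organic solvent at time t_min for one linear gradient segment.

    The default program (5-65 %B in 60 min) is an assumed example,
    not a recommended method. Times outside the gradient are clamped
    to the start or end composition.
    """
    slope = gradient_slope(start_b, end_b, gradient_min)
    t = min(max(t_min, 0.0), gradient_min)
    return start_b + slope * t


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(f"t = {t:2d} min: {percent_b(t):.1f} %B")
    # Doubling the gradient time halves the slope (a shallower gradient).
    print(gradient_slope(5.0, 65.0, 60.0), gradient_slope(5.0, 65.0, 120.0))
```

A shallower slope spreads the same change in solvent composition over a longer run, which is the trade-off between resolution and analysis time.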

Reducing the gradient slope improves resolution but lengthens the analysis, so the two must be balanced. Adjusting the gradient slope is nevertheless an important tool for optimizing the resolution of proteins and peptides [7].

c) Ion-pairing reagents

1. Trifluoroacetic acid
TFA added to the mobile phase at a concentration of about 0.1% gives good peak shape on most columns.

2. Alternative ion-pairing reagents
Phosphoric acid and heptafluorobutyric acid are sometimes used in protein and peptide separations [8].

3. Ion suppression
The major benefit of ion suppression is the elimination of mixed-mode retention effects. At low pH, carboxylic acid groups are protonated and only slightly polar. Increasing the mobile phase pH to 6-7 causes the carboxylic acid groups to ionize, making proteins and peptides less hydrophobic. This reduces the retention of all peptides.
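The effect of pH on carboxylic acid ionization can be estimated with the Henderson-Hasselbalch relationship. The sketch below is a minimal illustration that assumes a single carboxyl group with a pKa of 4.0; that value is an assumption made for the example, since real side-chain and C-terminal pKa values vary with the peptide and the solvent.

```python
def fraction_ionized(ph, pka=4.0):
    """Fraction of a carboxylic acid present as the ionized (COO-) form.

    Derived from Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    pka=4.0 is an assumed, illustrative value.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # pH 2 stands in for an acidic mobile phase; pH 7 for a near-neutral one.
    for ph in (2.0, 4.0, 7.0):
        print(f"pH {ph:.0f}: {100 * fraction_ionized(ph):.1f} % ionized")
```

Under these assumptions the group is almost entirely protonated at pH 2 and almost entirely ionized at pH 7, consistent with the lower retention described at pH 6-7.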

Flow Rate (Fig.)

Temperature
Temperature can have a profound effect on reversed-phase chromatography [9],[10].

Reversed-phase HPLC-MS
See references [11],[12].


Recent Advances

Columns
Column developments in HPLC have benefited protein and peptide separations in a number of ways.

a) Atlantis™ dC18 columns
Atlantis™ dC18 columns are a fully LC/MS-compatible line of general-purpose C18 columns designed to balance the retention of polar and non-polar compounds. They are compatible with aqueous mobile phases, provide enhanced low-pH stability and are available in a wide variety of column configurations ranging from nano-scale to preparative (Fig. 2.6.2.1a1, Fig. 2.6.2.1a2).

2.6.3 Evaluation of Technology

a) Comparison between conventional RP-HPLC and Atlantis™ dC18 columns

Problem (conventional RP-HPLC): Little or no retention of polar compounds
Impact:
- Re-run using separate methods for polar compounds
- Increased method development time and labor
Solution and benefit (Atlantis™ dC18 columns):
- Polar compounds are retained longer with Atlantis™ dC18 columns
- One Atlantis™ dC18 column and method can be used for polar and non-polar compounds
- Decreased labor cost
(Fig. 2.6.3.a1)

Problem (conventional RP-HPLC): Method requires 100% aqueous mobile phase for the desired separation
Impact:
- Loss of retention is observed
Solution and benefit (Atlantis™ dC18 columns):
- Atlantis™ dC18 packing material is tested with highly polar analytes in 100% aqueous conditions, thereby ensuring its utility in aqueous conditions

Problem (conventional RP-HPLC): Sudden loss of analyte retention observed when using a highly aqueous mobile phase
Impact:
- Run organic modifier through the column to rewet and regenerate it
- Increased labor and solvent costs
- Decreased throughput
- Reproducibility issues
Solution and benefit (Atlantis™ dC18 columns):
- Atlantis™ dC18 columns do not lose retention in 100% aqueous mobile phases
- Less time spent rewetting columns, resulting in lower labor costs
- Increased throughput
(Fig. 2.6.3.a2)

Problem (conventional RP-HPLC): Short column lifetime in acidic mobile phases
Impact:
- High cost due to frequent column replacement
- Increased instrument downtime
- Retention time reproducibility issues
Solution and benefit (Atlantis™ dC18 columns):
- The proprietary difunctional bonding chemistry of Atlantis™ dC18 columns results in low-pH stability and longer column lifetime
- Decreased costs associated with column replacement and instrument maintenance
(Fig. 2.6.3.a3)

Problem (conventional RP-HPLC): Retaining polar compounds on a conventional C18 column results in increased or infinite retention of non-polar compounds
Impact:
- Multiple columns are required to separate analytes with a wide range of polarities
- Increased method development time, labor and column costs
- Decreased throughput
Solution and benefit (Atlantis™ dC18 columns):
- One Atlantis™ dC18 column and method can be used for polar and non-polar compounds
- Easier and faster method development
- Increased throughput

Problem (conventional RP-HPLC): Severe peak tailing for polar bases is observed
Impact:
- Method fails system suitability guidelines for peak tailing
- Increased method development time
Solution and benefit (Atlantis™ dC18 columns):
- Atlantis™ dC18 columns are optimally endcapped and provide excellent peak shapes using MS-compatible mobile phases
- Easier and faster method development

Problem (conventional RP-HPLC): Column bleed is observed on MS
Impact:
- Frequent cleaning of MS source
- Incorrect or inconsistent results
Solution and benefit (Atlantis™ dC18 columns):
- Atlantis™ dC18 columns do not exhibit MS-detectable column bleed
- Decreased instrument downtime and maintenance costs

Problem (conventional RP-HPLC): Column-to-column reproducibility is inconsistent (e.g. selectivity, retention, etc.)
Impact:
- Increased labor costs due to individual column QC testing
- Revalidate or redevelop the method with each new batch of columns
Solution and benefit (Atlantis™ dC18 columns):
- The stringent Atlantis™ dC18 packing material QC batch test separates highly polar analytes in 100% aqueous mobile phase conditions
- Decreased method revalidation and development time

(Fig. 2.6.3.a7)

Columns

b) Monolithic columns (polymer-based reversed phase chromatography columns)
These columns are not packed with conventional small particles but are formed as a single rod of very porous material encased in a column housing. The new adsorbent is based on porous (100-300 Å pore diameter), highly cross-linked polystyrene-divinylbenzene spheres [13]. Because of the high degree of cross-linking, it has high mechanical stability, with minimal shrinking in aqueous solvents and minimal swelling in organic solvents. Its separation performance for proteins and peptides is demonstrated by comparing it with silica-based reversed phase columns.

2.6.3 Evaluation of Technology

b) Comparison between conventional RP-HPLC and monolithic columns (polymer-based reversed phase chromatography columns)

Test: Chemical stability
Silica-based reversed phase column: Acidic silanol groups and other ionic species, the typical silica problems, can interfere with the separation performance of the matrix and are difficult to remove entirely and reproducibly by endcapping.
Polymer-based reversed phase column (PLRP-S):
- No noticeable change in separation after a 400 column-volume wash with strong base
- Strong acid and base washes did not alter chromatographic performance
- pH stable from 1 to 14
- High TFA concentration does not affect the life of the column

Test: Retentivity (peptide and protein separation)
Silica-based reversed phase column: Less surface area for retention.
Polymer-based reversed phase column (PLRP-S):
- Bonded ligands are not needed
- PLRP-S media possesses a much greater surface area, so even polar molecules such as parabens may be retained much longer, resulting in greater resolution

Silica-based reversed phase column: Analysis of metal chelate compounds can cause poor peak shape.
Polymer-based reversed phase column (PLRP-S): Residual monomer and surfactant are removed using proprietary cleansing procedures. The result is an entirely pure reversed phase surface without the possibility of leachables or changing retention characteristics. The polymerization procedure prevents any possible contamination with heavy metals.

Test: Scale-up and sterilization
Silica-based reversed phase column: Cannot withstand extremes of pH, which restricts sanitization and sterilization (and so does not assist in validation).
Polymer-based reversed phase column (PLRP-S): Separations developed on an analytical-scale column can be transferred to a preparative-scale column with minimal method redevelopment. The media offers exceptional loading capacity due to its high surface area, and the clean, purely hydrophobic surface functionality ensures a very high recovery. The media withstands extremes of pH, allowing sanitization both on-column (clean-in-place, or CIP) and on a batch basis. Ensuring that the batch is free from bacterial or similar contamination assists in validation, particularly under GMP or regulated procedures.

Silica-based reversed phase column: Reactive sites often cause problems in silica-based products. Voids are likely to form as a result of the typical silica problems of acidic silanol groups or other ionic species.
Polymer-based reversed phase column (PLRP-S): The media is made from polystyrene and divinylbenzene, which eliminates all reactive sites. Being free from silanols and heavy metal ions, the polymeric PLRP-S resists dissolution of the stationary phase. Columns therefore last significantly longer, as voids are unlikely to form. This even allows the use of high-temperature (>200°C) superheated water as an eluent without fear of damage to the stationary phase.

Columns

c) Column packing processes (the future of preparative chromatography)
Introduced by Phenomenex, AXIA™ is an advanced column packing and hardware design intended to eliminate bed collapse as a source of failure in short prep columns. Using Hydraulic Piston Compression (HPC™) technology (patent pending), it addresses several fundamental problems faced daily by preparative chromatographers.


Evaluation of Technology

c) Comparison between conventional slurry packing and Phenomenex AXIA™

Conventional slurry packing
Problem:
- As the packing pressure is released and before a preparative column is capped, silica extrudes from the inlet end of the column
- The only options are to scrape and cap
Impact:
- Void formation as the bed settles (premature column failure)
- Variability of media bed density from column to column (degrading overall reproducibility)
- Variability of both efficiency and peak shape from column to column (scale-up from analytical is more difficult)
- Peak distortion or asymmetry, reducing the return on each purification cycle

Phenomenex AXIA™
Solution and benefit:
- Integration of the packing piston, so that pressure is never released, results in extended column lifetime (column voids are no longer a cause of column failure)
- Computer control of the entire process ensures proper sorbent density and uniformity (density tuned and optimised for different media)
- Overall column-to-column consistency is also markedly improved, with efficiencies and peak symmetries on par with analytical separations
- Easy scale-up

UPLC Systems

a) ACQUITY UPLC Systems
According to the manufacturer, ACQUITY UPLC systems offer more than faster HPLC: they allow the user to choose the separation that is best suited to the analytical task.


Features

Columns for high efficiency
- With BEH Technology™ (bridged ethylsiloxane/silica hybrid), ACQUITY UPLC BEH columns exhibit improved efficiency, strength, and temperature and pH range, the hallmarks of UPLC separation
- A wide range of complementary chemistries, including C18, C8, Shield RP18, Phenyl and HILIC
- Sample organizer
- ACQUITY UPLC guard columns
- Handles multiple sample formats: plates, vials, tubes
- Quick-connect fitting

Direct scalability and flexibility
- Preparative HPLC separations with straightforward scale-up and optimization
- Column manager with heating/cooling for low dispersion control and precise temperature management
- Optional automated column switching among multiple columns and a bypass channel
- Sample manager for high-capacity processing
- Adjustable FLEXcart for convenient installation and easy movement between workstations
- Third-party mass spectrometer adaptability

Performance, stability and robustness
- Guaranteed performance for column-to-column and batch-to-batch reproducibility over the life of the column
- Sensitivity to meet multiple detection requirements (binary solvent manager, photodiode array detector, evaporative light scattering detector, UV detector and SQ detector)

Traceability and intelligence
- ACQUITY UPLC console with a calculator that eases method development and transfer from HPLC to UPLC
- Connections INSIGHT services to monitor system operational characteristics
- Empower™ or MassLynx™ software for control and data management
- eCord™ technology electronically stores all of the information needed to track an experiment

2.6.3 d)

Evaluation of Technology How do chromatographic technologies compare?





Waters ACQUITY UPLC, Conventional HPLC, "Ultra-fast high-flow" HPLC, "High-temperature high-flow" HPLC, Monolith HPLC

Excellent Good Fair Very good

Excellent Good Fair Good

Excellent Fair Very good Excellent

Scalability to prepScale HPLC Excellent Excellent Good Good

Compatibility with MS Excellent Good Poor Poor





Poor

NanoACQUITY UPLC System
The Waters nanoACQUITY Ultra Performance LC™ system has been designed to achieve separation at nanoflow rates without flow-splitting, offering significant improvements in robustness, reproducibility and simplicity of operation over conventional nanoflow separation technologies. The nanoACQUITY UPLC system benefits directly from the holistic design of the ACQUITY UPLC system. Its optimized fluidic design, together with the proprietary nanoACQUITY 1.7 µm BEH chemistry, is intended to improve the analysis of the lowest-abundance peptides.

2.6.3 Evaluation of Technology

e) Comparison between conventional nanoscale HPLC and nanoACQUITY UPLC

Conventional nanoscale HPLC
Problem:
- Limited reliability and reproducibility
- Difficult to precisely control and monitor the flow rate and pressure
Impact:
- Changes in solvent composition, column backpressure and flow rate may result in poor day-to-day and column-to-column retention time reproducibility
- Limits performance in the HPLC domain and places a greater burden on the MS system
- Comparison across samples is problematic

NanoACQUITY UPLC
Solution and benefit:
- Improved accuracy in qualitative and quantitative applications (the nanoACQUITY UPLC System is a direct nano-flow system that does not require a flow splitter). Excellent run-to-run reproducibility is essential for components to be confidently identified or tracked across sample sets, and for performing accurate quantitative comparison across sample sets
- Increased information content from every sample. The optimized, integrated fluid path results in enhanced peak shape and peak capacity, increasing the number of components that can be detected per unit time
(Fig. 2.6.3.e1)

(Fig. 2.6.3.e2)

Agilent's new HPLC-Chip technology
Agilent's HPLC-Chip is the first microfluidic chip-based device that can carry out nanoflow high performance liquid chromatography (HPLC). The centerpiece of the technology is a reusable microfluidic polymer chip. Smaller than a credit card, the HPLC-Chip integrates the sample enrichment and separation columns of a nanoflow LC system with the intricate connections and spray tip used in electrospray mass spectrometry directly on the polymer chip. The technology eliminates 50% of the traditional fittings and connections typically required in a nanoflow LC/MS system, reducing the possibility of leaks and dead volumes and improving ease of use, sensitivity, productivity and reliability during analysis.

The second component of the HPLC-Chip technology is the HPLC-Chip/MS interface. A chip is inserted into the interface, which mounts on an Agilent mass spectrometer. The design is intended to place the electrospray tip in the optimal position for mass analysis when the chip is inserted. Replacing the chip is simple and can be completed in a few seconds, as opposed to the much longer times required to change nanoLC columns. The HPLC-Chip interface will be available as a standard module within the Agilent 1100 Series LC system.

2.6.3 Evaluation of Technology

f) What advantages does the HPLC-Chip offer over conventional technology?

Agilent's HPLC-Chip carries out nanoflow HPLC to obtain maximum sensitivity with minimal sample sizes. The HPLC-Chip integrates sample preparation, separation and the electrospray tip on a single chip. It significantly reduces the number of fittings, connections, valves and tubing required for nanoflow HPLC. It also includes a sprayer that interfaces efficiently with a mass spectrometer, allowing separated compounds such as peptides to be identified and quantified by mass spectrometry. This highly integrated, automated system promises to improve the analysis of complex samples of unknown composition, increasing productivity and throughput. Compared with conventional column-based nanoflow HPLC, the HPLC-Chip is reported to offer greater ease of use, greater reliability and robustness, and higher sensitivity (Fig. 2.6.3.f1).

(Fig. 2.6.3.f2)

2.6.4. Application

1. Expression and purification of a recombinant LL-37 from Escherichia coli
This study reports for the first time a method to express LL-37 in E. coli and purify it. LL-37 is a 37-residue cationic, amphipathic α-helical peptide. Factor Xa was used to cleave the 4.5 kDa LL-37 from the GST-LL-37 fusion protein, and the peptide was purified by reversed-phase HPLC on a Vydac C18 column with a final yield of 0.3 mg/ml. The HPLC-purified peptide was confirmed to be LL-37 by Western blot and MALDI-TOF mass spectrometry. The concentration of LL-37 was determined by comparing its peak area with that of chemically synthesized LL-37 [14].
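Quantifying a peptide by comparing its peak area with that of a standard can be sketched as a single-point external standard calculation. The example below assumes that the detector response is linear and passes through the origin, and that sample and standard are injected under identical conditions; the function name and the numbers are hypothetical and are not taken from the study.

```python
def concentration_from_peak_area(sample_area, standard_area, standard_conc):
    """Estimate sample concentration from one external standard.

    Assumes peak area is directly proportional to concentration
    (linear response through the origin) and equal injection volumes.
    """
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return standard_conc * sample_area / standard_area


if __name__ == "__main__":
    # Hypothetical values: a 0.6 mg/ml standard gives an area of 3000 units,
    # and the purified sample gives an area of 1500 units.
    estimate = concentration_from_peak_area(1500.0, 3000.0, 0.6)
    print(f"Estimated concentration: {estimate:.2f} mg/ml")
```

In practice a multi-point calibration curve is more robust than a single standard, because it checks the assumption of linearity over the working range.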


2. Enabling significant improvements for peptide mapping with UPLC™
The capabilities of Ultra Performance LC™ technology make higher-resolution peptide mapping possible [15]. This study demonstrates the advantages of UPLC for peptide mapping.

In the separation of a tryptic digest of enolase, more peaks are observed in the UPLC separation, and the overall resolution and sensitivity are higher. The UPLC map contains several small peaks that are difficult to discern in the HPLC run. This result demonstrates that UPLC offers higher resolution and sensitivity than HPLC under the same gradient conditions, and suggests that the selectivity of the UPLC column is suitable for peptide mapping.

To show that UPLC can resolve the same number of peaks in a peptide map as HPLC in less time, an enolase digest was separated by both techniques at a flow rate of 100 µl/min. The UPLC separation shows the same number of peaks and a similar overall elution pattern as the HPLC separation, but in half the time. UPLC therefore offers the potential to reduce cycle times for peptide maps.

UPLC is particularly useful when the peptide map is used to detect modified peptides. Higher resolution helps to ensure that modified peptides are resolved from the unmodified form, as well as from other peptides in the digest. This makes UPLC well suited to detecting all the peptides in a sample.

The separation of several peptides with formic acid was compared with TFA on a UPLC column with MS detection. With formic acid, the peak heights are about 3-fold higher. This result indicates that the UPLC columns perform extremely well under the conditions that are best for ESI-MS.

A tryptic digest of α-1 acid glycoprotein was separated by UPLC-MS. MS detection was performed with a Q-Tof mass spectrometer, which is well suited to glycopeptides because of its extended mass range. The glycopeptides are detected as sharp, symmetrical peaks with UPLC. These characteristics are important for minimizing spectral overlap of different glycoforms of the same peptide. UPLC with ESI-TOF mass spectrometry is expected to be a powerful tool for studying the glycosylation state of proteins.

3. Nano-HPLC for determination of proteins in infant formula
To determine the protein content of formula, gel electrophoresis was performed and the entire protein patterns were analyzed by nano-HPLC-electrospray tandem mass spectrometry (nano-HPLC/ESI/MS/MS). The analysis was performed on an LCQ DECA XP ion-trap mass spectrometer equipped with a nano-electrospray ionization source. The electrospray source was coupled online with an Agilent 1100 series capillary HPLC system. Two microliters of peptide solution in mobile phase was manually loaded onto a capillary column (70 mm length x 75 µm ID) [16].



4. HPLC-Chip/MS for analysis of yeast (Saccharomyces cerevisiae)
The following components were integrated onto the HPLC-Chip [17]:
1- A 40-nl enrichment column packed with ZORBAX 300SB-C18, 5 µm particle size
2- All connections between the two columns and between the analytical column and the nanoelectrospray tip
3- The nanoelectrospray emitter (10 µm ID)
The HPLC-Chip inserts into the HPLC-Chip/MS interface, which mounts on the electrospray source [18].

More proteins were identified with the HPLC-Chip/MS (Fig. 2.6.4.4.b). Better sequence coverage with the HPLC-Chip/MS provided a higher level of confidence in the protein identification. Reproducibility was excellent.



2.6.5 Relevant web sites
14. Expression and purification of a recombinant LL-37 from Escherichia coli (online, 2006)
15. Enabling Significant Improvements for Peptide Mapping with UPLC. Jeffrey R, Thomas E. Wheat, Beth L. Gillence-Castro and Ziling Lu (www.waters.com, Waters Corporation, 2005)
16. Determination of proteins in infant formula (online)
17. Comparison of HPLC-Chip/MS with conventional nanoflow LC/MS for proteomic analyses. Martin Vollmer, Christine Miller and Georges L. Gauthier (online, 2005)

2.6.6 Key Industry Suppliers
Most of the industry suppliers have been mentioned previously in the Relevant web sites section (2.5.6). In addition, several polymeric columns are now commercially available from Dionex/LC Packings, BIA Separations, Bio-Rad and Sepragen.

2.6.7 References
1. J. Rivier and R. McClintock (1983). Reversed-Phase High Performance Liquid Chromatography of Insulin from Different Species. J. Chrom. 286, 112-119
2. T. Christianson and C. Paech (1994). Peptide Mapping of Subtilisins as a Practical Tool for Locating Protein Sequence Errors during Extensive Protein Engineering Projects. Anal. Biochem. 223, 119-129
3. X. Geng and F.E. Regnier (1984). Retention Model for Proteins in Reversed-Phase Liquid Chromatography. J. Chrom. 296, 15-30
4. J.D. Pearson, N.T. Lin and F.E. Regnier (1982). The Importance of Silica Type for Reverse-Phase Protein Separation. Anal. Biochem. 124, 217-230
5. P. Tempst, D. Woo, D.B. Teplow, R. Aebersold, L.E. Hood and S.B.H. Kent (1986). Microscale Structure Analysis of a High Molecular Weight Hydrophobic Membrane Glycoprotein Fraction with Platelet-Derived Growth Factor-Dependent Kinase Activity. J. Chrom. 359, 403-412
6. M.T. Davis and T.D. Lee (1992). Analysis of peptide mixtures by capillary high performance liquid chromatography: A practical guide to small-scale separations. Protein Science 1, 935-944
7. N.C. Robinson, M.D. Dale and L.H. Talbert (1990). Subunit Analysis of Bovine Cytochrome c Oxidase by Reverse-Phase High Performance Liquid Chromatography. Arch. Biochem. Biophys. 281(2), 239-244
8. M.C. McCroskey, V.E. Groppi and J.D. Pearson (1987). Separation and Purification of S49 Mouse Lymphoma Histones by Reversed-Phase High Performance Liquid Chromatography. Anal. Biochem. 169, 427-432
9. Y. Chen, C.T. Mant and R.S. Hodges (2003). Temperature selectivity effects in reversed-phase liquid chromatography due to conformation differences between helical and non-helical peptides. J. Chrom. 1010, 45-61
10. W.S. Hancock, R.C. Chloupek, J.J. Kirkland and L.R. Snyder (1994). Temperature as a variable in reversed-phase high performance liquid chromatographic separations of peptide and protein samples. I. Optimizing the separation of a growth hormone tryptic digest. J. Chrom. 686, 31-43
11. A. Apffel, J. Chakel, S. Udiavar, W. Hancock, C. Souders and E. Pungor, Jr. (1995). Application of capillary electrophoresis, high-performance liquid chromatography, on-line electrospray mass spectrometry and matrix-assisted laser desorption ionization-time of flight mass spectrometry to the characterization of single-chain plasminogen activator. J. Chrom. A 717, 41-60
12. John Wiley & Sons. Overview of Protein and Peptide Analysis by Mass Spectrometry, Section 16.1.14 in Current Protocols in Protein Science
13. F. Svec and L. Geister (2006). LCGC 24(S4), 22-26
18. M.-H. Fortier, E. Bonneil, P. Goodley and P. Thibault (2005). Integrated Microfluidic Device for Mass Spectrometry-Based Proteomics and Its Application to Biomarker Discovery Programs. Anal. Chem. 77(6), 1631-1640

Chapter 2.8 Purification of membrane proteins
Wang Tianshi

2.8.1 Introduction
Membrane proteins are protein molecules that are attached to, or associated with, the membranes of cells and of organelles such as mitochondria and chloroplasts. They have a crucial role in many cellular and physiological processes. Usually, they are essential mediators of the transfer of material and information between cells and their environment and between compartments within cells. [10]

Based on their attachment to the membrane, membrane proteins can be classified into two groups: integral membrane proteins and peripheral membrane proteins. An integral membrane protein (IMP), also called an intrinsic protein, is a protein molecule (or assembly of proteins) that in most cases spans the biological membrane with which it is associated (especially the plasma membrane) or which, in any case, is sufficiently embedded in the membrane to remain with it during the initial steps of biochemical purification. In general, IMPs can be divided into three groups: transmembrane, membrane-associated and lipid-linked. Peripheral membrane proteins, or extrinsic proteins, on the other hand, do not interact with the hydrophobic core of the phospholipid bilayer. Instead they are usually bound to the membrane indirectly, by interactions with integral membrane proteins, or directly, by interactions with lipid polar head groups.

Approximately 40% of sequenced genes encode membrane-associated proteins [2]. Many membrane proteins are related to diseases. Developing cost-effective, simple and rapid methods to purify membrane proteins has become a major challenge, and solving this problem is also the basis for designing better drugs.

The methods used in membrane protein purification can roughly be divided into analytical and preparative methods. The distinction is not exact, but the deciding factor is the amount of protein that can practically be purified with a given method. Analytical methods, which are usually performed on a small scale, aim to detect and identify a protein in a mixture. Preparative methods are operated on a relatively large scale and aim to purify the membrane protein complex from membrane fractions while retaining its native form, mainly in order to characterize it, for example for structural biology or industrial use. In general, preparative methods can also be used in analytical applications. The overall steps are shown in Fig. 2.8.1. [5]

Fig.2.8.1 General purification process for membrane protein complexes. [5]

This review describes some recent advances, compares and evaluates these techniques, and discusses three specific applications.

2.8.2 Recent Advances
In recent years, the technologies for purifying membrane proteins have improved. This section describes progress in the expression of recombinant proteins, the choice of detergent, and improvements in column chromatography and analytical methods.

The development of genomic and proteomic technologies is opening a new field of vision for the purification of membrane proteins. In an Indian study, the gene encoding an outer membrane protein designated ompTS was amplified by PCR, excluding the region coding for the signal peptide, cloned into the pQE 30-UA vector and expressed by induction with isopropyl thiogalactoside (IPTG) [6]. In a study of the conformation of the CadF protein, which is very hard to purify from the membranes of Campylobacter (the most common bacterial agent of gastroenteritis), Mamelli and colleagues developed a novel strategy to produce significant quantities of a recombinant N-terminal domain of the CadF protein (46.5 µg/mg of bacterial dry weight) and of the native CadF protein (3.5 µg/mg of bacterial dry weight). The nucleotide sequence encoding the N-terminal domain of the CadF protein was cloned into a pET-based expression vector. The recombinant protein was then produced in Escherichia coli, purified from inclusion bodies, and refolded. [10] Moreover, recent studies in this area have also provided new insight into the response of host cells to membrane protein expression and into the mechanism of membrane insertion [4, 16].

Experimental procedures for handling and isolating integral membrane proteins are generally more challenging than those for their soluble counterparts, since the former require purification in detergent. A simple and cost-efficient detergent screening strategy is therefore an important prerequisite for large-scale protein production and crystallization. A recent Swedish study extracted and purified the recombinant ammonium/ammonia channel AmtB from Escherichia coli. Twenty-six detergents, four types of chromatography column and various buffer conditions were screened in a 96-well plate format. Large-scale protein purification and subsequent crystallization screening yielded AmtB crystals diffracting to low resolution with three detergents: UDM, DDM and Cy6. The researchers suggested that excluding detergents that are not useful for high-yield extraction of a specific protein (e.g. because they destabilize it) may help to minimize the number of detergents used in crystallization screening. The fact that no AmtB crystals grew in the presence of OG and LDAO may be such an indication [14].

Column chromatography is the main way to purify membrane proteins, so improving chromatographic techniques is a key goal. The techniques include size exclusion, ion exchange, affinity, hydrophobic interaction and reversed phase chromatography. In a recent report, monoclonal antibodies against several rat liver plasma membrane proteins were bound and cross-linked to protein A or protein G convective interaction media (CIM) affinity columns with a bed volume of only 60 µL. Affinity columns can process large volumes of complex biological mixtures in a short time, and CIM provides a stationary phase with a high binding capacity for large molecules that tolerates high flow rates at a very low pressure drop. As a result, antigens recognized by the bound antibodies, together with co-eluting (interacting) proteins, were rapidly isolated in a single step from either total plasma membrane extracts or subfractions isolated on anion-exchange CIM disk-shaped columns [17].

Recently, more novel analytical technologies have been applied to membrane proteins.
In an American study, wild-type bacteriorhodopsin (bR) and a labile bR mutant were reconstituted into a phospholipid gel, and spectroscopy showed that the protein was both more stable and more conformationally homogeneous than in gels formed using monoolein. The authors also developed a generally applicable spectroscopic technique based on the intrinsic fluorescence of tryptophan residues; this fluorescence assay made possible the rapid evaluation of lipid gels as media for the crystallization of membrane proteins [9].

2.8.3 Evaluation of the Technology

Although some progress has been made, many problems in the purification of membrane proteins remain to be solved. This part discusses the main difficulties and how they can be addressed.

Several difficulties in the investigation and separation of membrane protein complexes originate from their nature as membrane proteins: (1) they are very hydrophobic and have one or several transmembrane segments, or are closely associated with the membrane; (2) in the functional form, many of them are (homologous or heterologous) multi-subunit complexes; (3) such complexes contain many cofactors and, inevitably, lipids; (4) some complexes have several peripheral proteins that are functionally important but easily detached during isolation [5].

The first difficulty is choosing the right detergent for efficient purification of the membrane protein of interest, because of the hydrophobicity of membrane proteins. Dozens of detergents are in common use, others are less well characterized but probably still useful, and many novel detergents are under development. It has also been reported that some compartments of the cell membrane show resistance towards certain detergents. Moreover, mixtures of detergents are sometimes used during purification and crystallization. Altogether, the detergent parameter space becomes very large. On the other hand, screening fewer detergents may result in poor yield, unstable protein and/or no protein crystals, and is not recommended. The screening strategy described above allows tens of detergents to be tested easily and simultaneously for their efficiency in producing and crystallizing pure protein, giving reliable and reproducible results at very low cost [14]. For instance, a detergent such as sodium dodecyl sulfate (SDS) can be used to dissolve cell membranes and keep membrane proteins in solution during purification; however, because SDS causes denaturation, milder detergents such as Triton X-100 or CHAPS can be used to retain the protein's native conformation during purification [20].

A protein purification protocol usually contains one or more chromatographic steps. The problem here is the complex behaviour of membrane proteins on the column, which makes it difficult to obtain stable, non-denatured protein. Many membrane proteins are glycoproteins and can be purified by lectin affinity chromatography: detergent-solubilized proteins are allowed to bind to a chromatography resin carrying a covalently attached lectin. Some lectins bind the oligosaccharides of glycoproteins with such high affinity that the interaction is hard to compete with sugars, and the bound glycoproteins must be released by denaturing the lectin. Using RP-HPLC as the first dimension of protein separation offers the opportunity for rapid (ca. 1 min) solubilization of the native membrane complex with trifluoroacetic acid (TFA), without time-consuming extraction or solubilization by detergents [19].
Some additional techniques make the purification of proteins by chromatography easier, but they also introduce uncertain and complex factors [15]. For example, recombinant integral membrane proteins are purified by immobilized metal affinity chromatography using a polyhistidine tag, and the effects of tag length and position in different situations must be considered [13]. Moreover, mild, non-denaturing detergents should be used to allow a partial separation by column chromatography. However, the dual nature of the proteins often leads to multiple elution points during chromatography, depending on which part of the protein surface has bound to the separation phase.

Many analytical strategies can now be used to separate membrane proteins. The proteomic approach, which aims to detect all expressed proteins in order to analyze their function and the functional linkage between them, is one of the important approaches in clinical analysis. For this purpose, SDS–PAGE and/or 2-dimensional electrophoresis in conjunction with isoelectric focusing (IEF) or blue native (BN) electrophoresis are frequently employed. 2D electrophoresis combined with IEF is widely performed for membrane protein samples. Mass spectrometry (MS) is the ultimate technique for the accurate measurement of protein molecular weight (<0.1% error) and can be used to confirm protein identity [18]. However, most proteome analyses have had to ignore membrane proteins, since most of them do not run on 2D gels. Under denaturing conditions the hydrophobic regions become very prone to self-aggregation, and this is the major limiting factor in the analysis of membrane proteins by 2D PAGE. Currently the only viable two-dimensional separation method is the BAC/SDS diagonal gel method, which has a rather low resolution.
Although it has been demonstrated that nearly all membrane proteins can be separated by one-dimensional SDS-PAGE and identified by mass spectrometry [7], the disadvantage of this method is that the quantitative aspect is lost, and comparative analyses of protein expression by gel densitometry cannot be carried out. Therefore, a better analysis of the subunit components of an isolated membrane complex requires a thorough understanding of the function of the complex and the use of methods and equipment suited to the particular characteristics of each membrane protein.

2.8.4 Applications of the Technology

This section discusses three recent applications: two concern the purification of outer membrane proteins from Gram-negative bacteria, and one concerns the prostate-specific membrane antigen (PSMA).

Beis and co-workers described a two-step purification protocol for two Escherichia coli integral outer membrane proteins, Wza and osmoporin C (OmpC). The two very different proteins were purified to homogeneity by anion exchange and size exclusion chromatography, and the purity of the samples was judged by electrophoretic analysis, mass spectrometry, single particle analysis, three-dimensional (3D) crystallisation and X-ray diffraction. First, bacterial cells were disrupted and a crude extract, prepared in the same way for both Wza and OmpC, was fractionated by ultracentrifugation. The membranes were solubilised overnight at room temperature with rolling, and insoluble material was then removed by centrifugation. Second, Wza was initially purified by anion exchange chromatography with a linear gradient of 0–100% 1 M NaCl elution buffer, followed by size exclusion chromatography [2]. Finally, OmpC was purified using the same protocol as for Wza [1].

A problem arose during this process, however: size exclusion chromatography failed to separate Wza from OmpA. The co-elution was therefore resolved at the anion exchange step. The elution protocol was changed from a linear gradient to a step gradient, with which Wza eluted at 12% and OmpA at 22% of the 1 M NaCl elution buffer. Electron microscopy then showed that pure Wza could be obtained (Fig.). In addition, X-ray diffraction studies of the crystals revealed that the two protein batches behaved very differently, although in some cases both gave the same orthorhombic crystal form. As for the choice of detergent, the protein was extracted with the inexpensive Zwittergent SB3-14 and exchanged into the more expensive but more suitable detergent β-OG only just before crystallisation. For OmpC, a linear gradient again resulted in elution of the protein together with contaminants, whereas step elution produced highly pure OmpC (Fig.) [2]. This application shows that protein impurities alter the nucleation and/or growth of Wza and OmpC crystals, and that two or more chromatography steps can be combined so that each compensates for the limitations of the other.
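The step-gradient percentages quoted above translate directly into salt concentrations. The Python sketch below is only an illustration of that arithmetic: it assumes the low-salt buffer contains no NaCl and that the buffers mix ideally, and the function names are hypothetical rather than taken from [2].

```python
def nacl_molar(percent_b, stock_molar=1.0):
    """NaCl concentration (M) when percent_b % of the flow is elution buffer.

    Assumes buffer A contains no NaCl and ideal mixing (illustrative only).
    """
    if not 0 <= percent_b <= 100:
        raise ValueError("percent_b must be between 0 and 100")
    return stock_molar * percent_b / 100.0


def step_gradient(steps, stock_molar=1.0):
    """Map {protein: %B} to {protein: NaCl concentration in M}."""
    return {name: nacl_molar(pct, stock_molar) for name, pct in steps.items()}


# Step percentages reported in the text for the Wza/OmpA separation.
elution = step_gradient({"Wza": 12, "OmpA": 22})
for protein, conc in elution.items():
    print(f"{protein}: {conc:.2f} M NaCl")
```

Under these assumptions the two steps differ by only about 0.1 M NaCl, which may explain why a shallow linear gradient did not resolve the two proteins as cleanly as discrete steps.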

Fig. Characterisation of protein impurities and their effects on Wza crystallisation.
(a) Electrophoretic analysis of Wza. Lane 1 shows the extracted protein, lane 2 Wza after anion exchange, lane 3 the pure Wza after gel filtration, and lane 4 Wza without heating, to show the multimeric complex. Molecular weight markers are shown on the left side of the SDS-PAGE in kDa. (b and d) Electron microscopy analysis of Wza (b) and Wza–OmpA (d) samples. (c and e) Effect of purification on the growth of well-ordered crystals. Scale bar 100 µm. [2]

Fig. (a) SDS-PAGE of OmpC protein. Lane 1 shows the extraction of OmpC from outer membranes, lane 2 the OmpC after the anion exchange chromatography step and lane 3 pure OmpC after size exclusion chromatography. Molecular weight markers are shown on the left side of the SDS-PAGE in kDa. (b) 3D crystals of OmpC. Scale bar 100 µm. [2]

Neisserial porins represent more than 60% of outer membrane proteins [12], so the second application concerns the outer membrane protein PorB of Neisseria meningitidis. This protein has been shown to up-regulate the surface expression of the co-stimulatory molecule CD86 and of MHC class II, to be involved in the prevention of apoptosis by modulating the mitochondrial membrane potential, and to form pores in eukaryotic cells [11]. Because PorB is an outer membrane protein, isolating its native trimeric form is complicated by its insolubility and requires detergent throughout the whole procedure.

To obtain a pure protein, it is necessary to analyse the contaminants and the conditions under which they can be removed. In this case, DNA, lipooligosaccharide (LOS) and other debris were the major impurities in PorB preparations. PorB was purified to homogeneity from a mutant meningococcal strain, and the crude outer membrane protein preparation was obtained by extraction with the Zwittergent-Ca2+ method [3]. The pellet obtained from this step, described as starting material (SM), contains all the outer membrane proteins as well as residual LOS and lipoproteins. First, two ion exchange columns in tandem were used at pH 8.0: a DEAE Sepharose CL-6B column and a CM-Sepharose column. PorB was recovered almost exclusively in the column flow through (FT), which also contained some contaminating protein species, LOS and lipoprotein (Fig.). Second, size-exclusion chromatography efficiently separated PorB from most of the residual protein contaminants, but not from LOS or from lipoproteins such as H.8. A third step used a resin that does not bind endotoxins, Matrex Cellufine Sulfate, which bound PorB at pH 7.5 but did not retain LOS or lipoproteins, allowing their removal with the column flow through. With a 0.2–0.5 M linear gradient of NaCl, PorB eluted between 0.24 and 0.4 M NaCl (Fig.) [11, 22]. The analysis demonstrated that LOS and H.8 were absent from the pooled fractions. These specific contaminants were thus removed very efficiently from the final product. At the end of the procedure, purified PorB was recovered by removing the detergent from the protein solution through extensive dialysis, with formation of proteosomes (Fig.). This case illustrates several ways of separating the cofactors and contaminants that may exist in membrane protein complexes, such as changing the pH or applying a linear NaCl gradient.
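To see where in the salt gradient PorB comes off the Cellufine Sulfate column, the reported concentrations can be expressed as fractions of the 0.2–0.5 M gradient. This is a minimal Python sketch that assumes a perfectly linear gradient; the function name is hypothetical and only the four concentrations come from the text.

```python
def gradient_fraction(conc, start, end):
    """Fraction (0 to 1) of a linear gradient at which `conc` is reached."""
    if start == end:
        raise ValueError("gradient start and end must differ")
    frac = (conc - start) / (end - start)
    if not 0.0 <= frac <= 1.0:
        raise ValueError("concentration lies outside the gradient")
    return frac


# Values from the text: 0.2-0.5 M NaCl gradient, PorB eluting at 0.24-0.4 M.
GRADIENT_START, GRADIENT_END = 0.2, 0.5
elution_start = gradient_fraction(0.24, GRADIENT_START, GRADIENT_END)
elution_end = gradient_fraction(0.40, GRADIENT_START, GRADIENT_END)
print(f"PorB elutes from {elution_start:.0%} to {elution_end:.0%} of the gradient")
```

Expressing elution as a fraction of the gradient makes it easier to compare runs that use different start and end concentrations.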

Fig. Chromatographic purification of N. meningitidis PorB.
SM, column starting material; FT, column flow through; and pool, PorB-containing fractions. (A) Ion-exchange chromatography on DEAE/CM columns. The position of PorB and LOS is indicated by the arrows on Coomassie and silver staining of SDS–PAGE, respectively. (B) Size-exclusion chromatography on Sephacryl S300 column. The position of PorB, LOS, and H.8 is indicated by the arrows on Coomassie, silver staining, and Western blot with specific anti-H.8 antibody, respectively. (C) Affinity chromatography on Matrex Cellufine Sulfate column. PorB, LOS, and H.8 are identified as above. (D) PorB formed into proteosomes free of detergent. The position of PorB and LOS is indicated by the arrows on Coomassie and silver stain as above. Furthermore, PorB can be detected as a band on the silver-stained gel, as indicated by the asterisk. [11]

The third case concerns PSMA, a membrane protein that has attracted significant attention as a target for immunoscintigraphic and radioimmunotherapeutic applications in prostate cancer. Here the purification of native PSMA from LNCaP cells was optimized using conformational epitope-specific antibody-affinity chromatography. In contrast to general affinity chromatography, this method employs resin-bound anti-PSMA monoclonal antibody 3C6, which reacts with a protein conformational epitope present in the extracellular portion of human PSMA [21]. Because this antibody binds PSMA only when it is in a native conformation, only native PSMA is retained by the affinity resin.

Three variants of the method were explored, and they are compared in Table 2.8.4 [8]. Western blot analysis and an HPLC-based enzymatic activity assay were used to compare the yields, and the results demonstrated that all three methods gave similar yields of PSMA. Method A gave the smallest amount of PSMA in a non-native conformation (0.9%), suggesting that PSMA initially purified by this method was predominantly in an active conformation. This was consistent with the enzymatic activity data, in which PSMA purified by Method A showed the greatest enzymatic activity of the three methods. The proportion of purified PSMA in a native and active conformation was determined by quantifying the amount of non-native PSMA not retained in a second antibody-affinity isolation. The low amount of denatured PSMA obtained with Method A compared with Method B demonstrates that high pH conditions promote denaturation of PSMA. Comparison of Method C with Method B confirmed the importance of the essential Zn2+ cofactor during the desalting step [8]. This result also suggests that Zn2+ may have a stabilizing role in addition to its functional role in the enzymatic activity of PSMA. In other words, adding a neutralization step and including Zn2+ in the equilibration buffer of the desalting step considerably enhances the yield of active PSMA from LNCaP cells.
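The native fraction described above is obtained by difference: whatever is not retained in the second antibody-affinity pass is counted as non-native. The Python sketch below shows that calculation under stated assumptions: the function name and the loaded amount of 100 units are hypothetical, and only the 0.9% non-native figure for Method A comes from the text.

```python
def native_percent(loaded, not_retained):
    """Percentage of protein in a native conformation.

    `loaded` is the amount applied to the second antibody-affinity step and
    `not_retained` the amount found in the flow through (same units).
    """
    if loaded <= 0:
        raise ValueError("loaded amount must be positive")
    if not 0 <= not_retained <= loaded:
        raise ValueError("not_retained must lie between 0 and loaded")
    return 100.0 * (loaded - not_retained) / loaded


# Method A: 0.9% of the purified PSMA was not retained (non-native).
# The loaded amount of 100 units is an arbitrary, hypothetical basis.
method_a = native_percent(loaded=100.0, not_retained=0.9)
print(f"Method A native PSMA: {method_a:.1f}%")
```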

Table 2.8.4 Comparison of methods for the purification of PSMA. [8]

Overall, purifying membrane proteins requires attention not only to the methods used to separate ordinary proteins but also to the particular features of membrane proteins and the environments in which they exist. Every coin has two sides: the same factors that make these proteins difficult to handle, such as hydrophobicity, close association with the membrane, multi-subunit complexes, many cofactors and easily detached peripheral proteins, also hint at ways to produce and purify them.

2.8.5 Relevant web sites
1. National Library of Medicine - Medical Subject Headings (2006)
2. SWEGENE Proteomics Platform (2002)

2.8.6 References
1. Baalaji, N.S., Mathew, M.K. & Krishnaswamy, S. (2006) Functional assay of Salmonella typhi OmpC using reconstituted large unilamellar vesicles: a general method for characterization of outer membrane proteins. Biochimie. 1–6.
2. Beis, K., Whitfield, C., Booth, I. & Naismith, J.H. (2006) Two-step purification of outer membrane proteins. International Journal of Biological Macromolecules. 39, 10–14.
3. Blake, M.S. & Gotschlich, E.C. (1984) Purification and partial characterization of the opacity-associated proteins of Neisseria gonorrhoeae. J. Exp. Med. 159, 452–462.
4. Jidenko, M., Lenoir, G., Fuentes, J.M., Maire, M. & Jaxel, C. (2006) Expression in yeast and purification of a membrane protein, SERCA1a, using a biotinylated acceptor domain. Protein Expression and Purification. 48, 32–42.
5. Kashino, Y. (2003) Separation methods in the analysis of protein membrane complexes. Journal of Chromatography B. 797, 191–216.
6. Khushiramani, R., Girisha, S.K., Karunasagar, I. & Karunasagar, I. (2006) Cloning and expression of an outer membrane protein ompTS of Aeromonas hydrophila and study of immunogenicity in fish. Protein Expression and Purification. 4, 1–5.
7. Lin, C., Cotton, F., Boutique, C., Dhermy, D., Vertongen, F. & Gulbis, B. (2000) Capillary gel electrophoresis: separation of major erythrocyte membrane proteins. Journal of Chromatography B. 742, 411–419.
8. Liu, T., Toriyabe, Y. & Berkman, C.E. (2006) Purification of prostate-specific membrane antigen using conformational epitope-specific antibody-affinity chromatography. Protein Expression and Purification. 1–5.
9. Lunde, C.S., Rouhani, S., Facciotti, M.T. & Glaeser, R.M. (2006) Membrane-protein stability in a phospholipid-based crystallization medium. Journal of Structural Biology. 154, 223–231.
10. Mamelli, L., Pagès, J., Konkel, M.E. & Bolla, J. (2006) Expression and purification of native and truncated forms of CadF, an outer membrane protein of Campylobacter. International Journal of Biological Macromolecules. 39, 135–140.
11. Massari, P., King, C.A., MacLeod, H. & Wetzler, L.M. (2005) Improved purification of native meningococcal porin PorB and studies on its structure/function. Protein Expression and Purification. 44, 136–146.
12. Minetti, C.A., Blake, M.S. & Remeta, D.P. (1998) Characterization of the structure, function, and conformational stability of PorB class 3 protein from Neisseria meningitidis. A porin with unusual physicochemical properties. J. Biol. Chem. 273, 25329–25338.
13. Mohanty, A.K. & Wiener, M.C. (2004) Membrane protein expression and production: effects of polyhistidine tag length and position. Protein Expression and Purification. 33, 311–325.
14. Niegowski, D., Hedrén, M., Nordlund, P. & Eshaghi, S. (2006) A simple strategy towards membrane protein purification and crystallization. International Journal of Biological Macromolecules. 39, 83–87.
15. Raymond, F., Rolland, D., Gauthier, M. & Jolivet, M. (1998) Purification of a recombinant protein expressed in yeast: optimization of analytical and preparative chromatography. Journal of Chromatography B. 706, 113–121.
16. Reinhard, G. (2006) Understanding recombinant expression of membrane proteins. Current Opinion in Biotechnology. 17, 337–340.
17. Rucevic, M., Clifton, J.G., Huang, F., Li, X., Callanan, H., Hixson, D.C. & Josic, D. (2006) Use of short monolithic columns for isolation of low abundance membrane proteins. Journal of Chromatography A. 1123, 199–204.
18. Schwabe, T.M.E., Gloddek, K., Schluesener, D. & Kruip, J. (2003) Purification of recombinant BtpA and Ycf3, proteins involved in membrane protein biogenesis in Synechocystis PCC 6803. Journal of Chromatography B. 786, 45–59.
19. Sharov, V.S., Galeva, N.A., Knyushko, T.V., Bigelow, D.J., Williams, T.D. & Schoneich, C. (2002) Two-dimensional separation of the membrane protein sarcoplasmic reticulum Ca–ATPase for high-performance liquid chromatography–tandem mass spectrometry analysis of posttranslational protein modifications. Analytical Biochemistry. 308, 328–335.
20. Tamm, L.K. & Liang, B. (2006) NMR of membrane proteins in solution. Progress in Nuclear Magnetic Resonance Spectroscopy. 48, 201–210.
21. Tino, W.T., et al. (2000) Isolation and characterization of monoclonal antibodies specific for protein conformational epitopes present in prostate-specific membrane antigen (PSMA). Hybridoma. 19, 249–257.
22. Wetzler, L.M., Blake, M.S. & Gotschlich, E.C. (1988) Characterization and specificity of antibodies to protein I of Neisseria gonorrhoeae produced by injection with various protein I-adjuvant preparations. J. Exp. Med. 168, 1883–1897.

Chapter 2.9 Industrial Scale Purification of Proteins
Guo lu

2.9.1 Introduction

The development of techniques and methods for protein purification has been an indispensable prerequisite for many of the advances made in biotechnology, including the large-scale purification of proteins. Most products of biotechnology are proteins, and these proteins must be produced in large volumes in purified form. Generally, if contaminants can be detected, they must be removed or proven to be harmless. The protein must be purified from other proteins, but nucleic acids, carbohydrates, lipids and any other materials in the sample must not be ignored either.

To purify proteins, both their inherent similarities and their differences are exploited. The properties that proteins share are used to separate them from non-protein contaminants, while the differences between proteins are used to purify one protein from another. Proteins differ from each other in size, shape, charge, hydrophobicity, solubility and biological activity. In addition, the protein product must retain its biological activity. A method that works well in a research laboratory may fail badly in industrial production, which must operate on a large scale and be reproduced exactly. The Three Phase Purification Strategy is used to support the development of purification processes for therapeutic proteins in the pharmaceutical industry. The overall process is shown in Figure 2.9.1 [1].

Figure 2.9.1 Preparation and the Three Phase Purification Strategy

Table 2.9.1. Protein properties and their effect on development of purification strategy

2.9.2 Recent Advances

Ion Exchange (IEX) Chromatography

Ion exchange chromatography (IEX or IEC) has become one of the best-known methods for protein and peptide purification because it offers scalability, high specificity and a wide choice of column materials. It relies on reversible charge interactions between a charged biomolecule (such as a protein or nucleic acid) and an oppositely charged resin-based matrix.

Hydrophobic Interaction Chromatography (HIC)

With the development of the modern biotechnology industry and its demand for highly purified pharmaceutical proteins, increasing emphasis has been placed on whole processes with respect to economy, capacity and the quality of the resulting product. The separating power required is usually dictated by the need to resolve the product not only from the background impurities arising from fermentation but also from degradation products and analogues of the drug itself. In many cases, hydrophobic interaction chromatography (HIC) is an ideal separation method. Alois, Waltraud and Robert argued that the correct folding of solubilized recombinant proteins is of key importance for their industrial production. Recent improvements include the use of immobilized metal affinity chromatography and the mimicking of the natural folding process with artificial chaperones [2].

Affinity Chromatography (AC)

Affinity chromatography (AC) purifies a biomolecule on the basis of its biological function or individual chemical structure. The material to be purified is specifically and reversibly adsorbed to a ligand (binding substance) that is immobilized by a covalent bond to a chromatographic bed material (matrix). The bound molecules are recovered by changing the experimental conditions to favour desorption. AC media are commonly used, for example, for the purification of fusion proteins, mono- and polyclonal antibodies, and glycoproteins. Gejing and Gautam pointed out that in the affinity-based approach, compounds are screened on the basis of their binding affinities to target molecules. The interaction between targets and compounds can be evaluated directly, by monitoring the formation of non-covalent target–ligand complexes (direct detection), or indirectly, by detecting the compounds after separating bound from unbound compounds (indirect detection). Various techniques can be applied, including high performance liquid chromatography (HPLC)–MS, size exclusion chromatography (SEC)–MS, frontal affinity chromatography (FAC)–MS and desorption/ionization on silicon (DIOS)–MS [3].

Gel Filtration (GF)

Although it is possibly the simplest column chromatography technique, gel filtration (GF) chromatography is one of the most flexible, since it can be performed under a variety of physical and chemical conditions and normally does not involve a complicated protocol. GF (also called size-exclusion chromatography) separates globular proteins according to their molecular weights. The liquid volume "seen" by the column consists of a mobile phase and a stationary phase. In recent research, Damien et al. set up a heterologous expression system in Sf9 insect cells that allowed the expression and production of large amounts of a pure, active human protein, and used gel filtration to purify the target protein [4].

Reversed Phase Chromatography (RPC)

Reversed phase chromatography includes any chromatographic method that uses a non-polar stationary phase. The mathematical and experimental considerations used in other chromatographic methods also apply. For example, resolution increases with column length (roughly with its square root, since resolution scales with the square root of the plate number), whereas column diameter mainly determines loading capacity. RPC has been widely used in the pharmaceutical, chemical and biochemical industries for separating molecules of small molecular weight, and in recent years it has also been applied to larger molecules. According to Pattana, Penporn and Aurasorn, chromatographic separation was achieved with a reversed-phase Apollo C18 column and a mobile phase of methanol–acetonitrile–mixed phosphate buffer (pH 2.6; 10 mM) (40:12:48, v/v/v) at a flow rate of 1.2 ml/min. This illustrates that RPC is an effective method, particularly for purification in the pharmaceutical industry [5].
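The mobile phase quoted above is specified as a volume ratio and a flow rate, so the solvent needed for a run follows by simple proportion. The Python sketch below illustrates this; the 10-minute run time and the function name are hypothetical, and the calculation assumes the component volumes are additive, which is only approximately true when organic solvents are mixed with aqueous buffer.

```python
def component_volumes(ratio, flow_ml_min, minutes):
    """Volume (ml) of each mobile-phase component consumed in one run.

    `ratio` maps component name to its parts by volume (v/v/v).
    Volumes are treated as additive, which is an approximation.
    """
    total_parts = sum(ratio.values())
    if total_parts <= 0:
        raise ValueError("ratio must contain positive parts")
    total_ml = flow_ml_min * minutes
    return {name: total_ml * parts / total_parts for name, parts in ratio.items()}


# Ratio and flow rate from the text; the 10 min run time is hypothetical.
mobile_phase = {"methanol": 40, "acetonitrile": 12, "phosphate buffer": 48}
volumes = component_volumes(mobile_phase, flow_ml_min=1.2, minutes=10)
for name, ml in volumes.items():
    print(f"{name}: {ml:.2f} ml")
```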

Expanded Bed Adsorption (EBA)

EBA uses equipment that is familiar to most users of standard liquid chromatography. The column has a flow adapter whose position is adjusted to suit the specific step of resin preparation or protein purification. A series of pumps and valves, connected through the adapter and the bottom of the column, controls the flow rate and direction of the buffer and the sample loading. It is therefore possible to perform initial EBA trials with a little ingenuity and standard chromatographic equipment.

Cabanne et al. evaluated three anion exchanger EBA matrices, Streamline DEAE, Streamline Q XL and Q Hyper Z, for the purification of EGFP from an ultrasonic homogenate of Escherichia coli. The two Streamline matrices gave a good purification of EGFP (7–15-fold), but Q Hyper Z appeared to give the best results. It is composed of small, dense beads, which provide a larger exchange surface and therefore better mass transfer [6].

2.9.3 Evaluation of the Technology

Ion Exchange (IEX) Chromatography

IEX is considered one of the best-known methods because it offers high specificity, works on a large scale and has a wide choice of column materials.

Hydrophobic Interaction Chromatography (HIC)

Although HIC is commonly used for industrial-scale purification, several aspects of scale-up can influence the product. Depending on the rigidity of the media, the loss of wall support in a large-scale column will have a smaller or greater impact on bed compression, with associated deterioration of the flow/pressure properties of the packed bed. The effect of bed compression can be checked by running a pressure/flow rate curve, as outlined under "Packing large scale columns". Zone spreading can also be caused by non-column factors such as increased internal volumes of pumps, valves and monitoring cells and different lengths and diameters of pipes or tubing.
If all the above aspects of scaling up are taken into consideration, chromatographic variability is normally not a big issue when scaling up. Alois, Christine and Rainer noted that HIC exploits the hydrophobic properties of protein surfaces for separation and purification through interactions with chromatographic sorbents of a hydrophobic nature. In contrast to reversed-phase chromatography, this methodology is less detrimental to the protein and is therefore more commonly used at industrial scale, as well as at bench scale when the conformational integrity of the protein is important [7].

Gel Filtration (GF)

One factor to consider when choosing a GF medium is the exclusion limit, or molecular weight limit, of the pores: proteins above this limit are completely excluded from the pores and are not separated from one another. It is also important to choose the type of resin. A variety of resins exists, ranging from silica-based to polymeric. Some companies recommend cross-linked resins, which are advantageous for high-pressure purifications because they do not compress and lose porosity under high pressure. Nevertheless, non-cross-linked resins are appropriate for routine purifications.

GF has a number of advantages over other types of liquid chromatography. First, there is a high upper limit on the size of the proteins that can be purified by this technique. In addition, the pore shape is well suited to separating globular molecules such as proteins, and the technique does not require protein-denaturing organic solvents. However, the method can be difficult to fine-tune because protein resolution depends on the sample volume applied to the column, so dilute samples are difficult to purify by this technique. For large-scale purification there are two further disadvantages: GF media are expensive, and GF chromatography is not directly scalable from an analytical to a bulk purification level.

Reversed Phase Chromatography (RPC)

A variety of column configurations is available, and the choice depends on the purpose of the purification and the amount of material available. Columns with a smaller inner diameter are used for high-resolution separations of very small amounts of protein, while large columns may be required for industrial protein purifications.
Analytical columns with diameters of up to 5 mm are daily use RPC columns for routine analytical and purification work. Preparative columns are larger in diameter and can be used for purification of large quantities of proteins from the range of mg to gram. The results of the research of Janine, Colin and Robert showed the excellent potential of one-step RP-HPLC for purification of recombinant proteins from cell lysates, where high yields of purified product and greater purity are achieved compared to affinity chromatography. And they suggested that this approach was also successful in purifying just trace levels (<0.1% of total contents of crude sample) of TM 1–99 from a cell lysate. [8]

Expanded Bed Adsorption (EBA)
In traditional packed-bed methods, clogging occurs when particulate matter and cell debris cannot flow around the closely packed resin beads, because the resin is confined between the bottom of the column and the flow adapter. In contrast, EBA columns are fed from below, and the adapter is held away from the packed resin level to give the resin room to expand. The expansion creates spaces between the beads through which particulate matter can pass.

2.9.4 Applications of the Technology

Ion Exchange (IEX) Chromatography
IEX separates proteins on the basis of differences in charge, giving very high resolution with high sample loading capacity. It relies on the reversible interaction between a charged protein and an oppositely charged chromatographic medium. After binding, conditions are altered so that bound substances are eluted differentially, either by increasing the salt concentration or by changing the pH. Target proteins are concentrated during binding and collected in a purified, concentrated form. IEX is normally used to bind the target molecule, but it can also be used to bind impurities if necessary. IEX can be repeated at different pH values to separate several proteins with distinctly different charge properties, which can be used to advantage in a multistep purification. [1] Khandeparkar and Bhosle used this method in the isolation, purification and characterization of a xylanase and obtained the expected results, which suggests that the method is effective for industrial applications. [9]

Hydrophobic Interaction Chromatography (HIC)
HIC separates proteins on the basis of differences in hydrophobicity. The technique is ideal for the capture or intermediate steps of a large-scale purification. It is based on the reversible interaction between a protein and the hydrophobic surface of a chromatographic medium. This interaction is enhanced by high ionic strength buffer, which makes HIC an ideal 'next step' after precipitation with ammonium sulphate or elution in high salt during IEX. Samples in high ionic strength solution bind as they are loaded onto the column. Conditions are then altered so that the bound substances are eluted differentially, generally with a decreasing gradient of ammonium sulphate. Target proteins are concentrated during binding and collected in a purified, concentrated form. [1]

Velkov et al. developed expedient and reliable methods, based on hydrophobic interaction chromatography (HIC), to isolate cyclosporin synthetase for the in vitro biosynthesis of cyclosporins. They suggest that industrial implementation of an in vitro biosynthetic approach could prove useful for producing important therapeutic cyclosporins that occur only as minor fermentation by-products. [10]

Figure typical HIC gradient elution

Affinity Chromatography (AC)
Affinity chromatography can be used to purify and concentrate a molecule from a mixture into a buffer solution, to reduce the amount of a molecule in a mixture, and to identify which biological compounds bind to a particular molecule, such as a drug. The method separates proteins on the basis of a reversible interaction between a protein and a specific ligand attached to a chromatographic matrix. It is ideal for a capture or intermediate step and can be used whenever a suitable ligand is available for the protein of interest. It offers high selectivity, hence high resolution, and usually high capacity for the target protein. The target protein is specifically and reversibly bound by a complementary binding substance (ligand). Desorption is performed either specifically, using a competitive ligand, or non-specifically, by changing the pH, ionic strength or polarity. Samples are concentrated during binding, and the protein is collected in a purified, concentrated form. The key stages in a separation are shown in Figure 40. Affinity chromatography is also used to remove specific contaminants; for instance, Benzamidine Sepharose 6B removes serine proteases. [1] Yang et al. used this method to purify a calcium-independent lectin (PjLec) from the haemolymph of the shrimp Penaeus japonicus, which suggests that the method may be applicable to industrial-scale purification of calcium-independent lectins. [11]

Figure typical affinity separation

Gel Filtration (GF)
GF separates proteins on the basis of differences in molecular size. The method is ideal for the final polishing steps of a purification, once sample volumes have been reduced. Samples are eluted isocratically, and buffer conditions can be varied to suit the sample type or the requirements of subsequent purification, analysis or storage steps, since buffer composition does not directly affect resolution. The products are collected in purified form in the chosen buffer. [1] Bernal et al. showed that a novel keratinase activity could be purified from the growth medium to apparent homogeneity in a single step (24-fold purification with a high yield of 54%) using DEAE column chromatography. They believed that its biochemical characteristics raise the potential for using this enzyme in numerous industrial applications, with GF as a purification method. [12] Similarly, a beta-mannanase from Trichoderma harzianum strain T4 was purified by Malheiros Ferreira and Ximenes Ferreira Filho, who indicated that the thermal stability of the purified Man I makes this enzyme attractive for industrial applications, with GF used for purification on a large scale. [13]

Figure typical GF elution

Reversed Phase Chromatography (RPC)
RPC separates proteins and peptides of differing hydrophobicity on the basis of their reversible interaction with the hydrophobic surface of a chromatographic medium. Because of the nature of reversed phase matrices, binding is usually very strong, and elution requires organic solvents and other additives (ion-pairing agents). The key stages in a separation are shown in Figure 43. RPC is frequently used in the final polishing of oligonucleotides and peptides and is ideal for analytical separations such as peptide mapping. However, RPC is not recommended for protein purification when recovery of activity and return to a correct tertiary structure are required, because many proteins are denatured in the presence of organic solvents. [1] Mills et al. developed a facile, flexible and readily scalable one-step purification method, based on reversed-phase high-performance liquid chromatography (RP-HPLC), for a recombinant protein, TM 1–99 (113 amino acid residues; 12,837 Da), from an E. coli cell lysate. [8]

Figure typical RPC gradient elution

Expanded Bed Adsorption (EBA)
EBA is a single-pass operation in which target proteins are purified from a crude sample without the need for separate clarification, concentration and initial purification steps to remove particulate matter. Figure 44a shows the steps involved in an EBA purification and Figure 44b shows a typical EBA elution pattern. Niu et al. applied several methods to the large-scale isolation and purification of R-phycoerythrin from the red alga Polysiphonia urceolata Grev. Their results indicate that R-phycoerythrin can be purified from frozen P. urceolata on a large scale using expanded bed adsorption combined with ion-exchange chromatography or hydroxyapatite chromatography. [14]

Figure steps in an EBA purification process

Figure typical EBA elution

2.9.5 Relevant web sites
ual.pdf
~LC_tech~AC

2.9.6 References
[1] Protein Purification Handbook, Edition AB. Amersham Pharmacia Biotech. Agent&docid=9C7BA3DA6539F07AC1256EB40044A8B2&file=18113229AC.pdf
[2] Jungbauer, A., W. Kaar, et al. (2004). "Folding and refolding of proteins in chromatographic beds." Current Opinion in Biotechnology 15(5): 487-494.
[3] Deng, G. and G. Sanyal (2006). "Applications of mass spectrometry in early stages of target based drug discovery." Journal of Pharmaceutical and Biomedical Analysis 40(3): 528-538.
[4] Fleury, D., P. Domaingue, et al. "Expression, purification, characterization and crystallization of a recombinant human cytosolic [beta]-glucosidase produced in insect cells." Protein Expression and Purification In Press, Corrected Proof.
[5] Sripalakit, P., P. Neamhom, et al. (2006). "High-performance liquid chromatographic method for the determination of pioglitazone in human plasma using ultraviolet detection and its application to a pharmacokinetic study." Journal of Chromatography B 843(2): 164-169.
[6] Cabanne, C., A. M. Noubhani, et al. (2004). "Evaluation of three expanded bed adsorption anion exchange matrices with the aid of recombinant enhanced green fluorescent protein overexpressed in Escherichia coli." Journal of Chromatography B 808(1): 91-97.
[7] Jungbauer, A., C. Machold, et al. (2005). "Hydrophobic interaction chromatography of proteins: III. Unfolding of proteins upon adsorption." Journal of Chromatography A 1079(1-2): 221-228.
[8] Mills, J. B., C. T. Mant, et al. (2006). "One-step purification of a recombinant protein from a whole cell extract by reversed-phase high-performance liquid chromatography." Journal of Chromatography A 1133(1-2): 248-253.
[9] Khandeparkar, R. D. S. and N. B. Bhosle (2006). "Isolation, purification and characterization of the xylanase produced by Arthrobacter sp. MTCC 5214 when grown in solid-state fermentation." Enzyme and Microbial Technology 39(4): 732-742.
[10] Velkov, T., L. G. Singaretnam, et al. (2006). "An improved purification procedure for cyclosporin synthetase." Protein Expression and Purification 45(2): 275-287.
[11] Yang, H., T. Luo, et al. (2007). "Purification and characterisation of a calcium-independent lectin (PjLec) from the haemolymph of the shrimp Penaeus japonicus." Fish & Shellfish Immunology 22(1-2): 88-97.
[12] Bernal, C., J. Cairo, et al. (2006). "Purification and characterization of a novel exocellular keratinase from Kocuria rosea." Enzyme and Microbial Technology 38(12): 49-54.
[13] Malheiros Ferreira, H. and E. Ximenes Ferreira Filho (2004). "Purification and characterization of a [beta]-mannanase from Trichoderma harzianum strain T4." Carbohydrate Polymers 57(1): 23-29.
[14] Niu, J.-F., G.-C. Wang, et al. (2006). "Method for large-scale isolation and purification of R-phycoerythrin from red alga Polysiphonia urceolata Grev." Protein Expression and Purification 49(1): 23-31.

Chapter 2.10 Determination of Protein Concentration and Purity
Xiao zheng Mu

2.10.1 Introduction
Biochemical research often requires quantitative measurement of protein concentration and purity in solution. Many techniques have been developed; however, most have limitations, either because they are not sensitive enough or because they are based on reactions with specific amino acids in the protein. Since amino acid content varies from protein to protein, no single assay is suitable for all proteins. This chapter discusses 14 methods used to determine protein concentration and purity.

In the biuret, Lowry, Bradford and BCA methods, chemical reagents are added to the protein solution to develop a color whose intensity is measured in a spectrophotometer. A “standard protein” of known concentration is treated with the same reagents and a calibration curve is constructed.

Electrophoresis is an analytical tool with which biochemists can examine the movement of charged molecules in an electric field. Several electrophoretic techniques are very helpful for analysing protein concentration and purity: Polyacrylamide Gel Electrophoresis (PAGE), Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis (SDS-PAGE), Isoelectric Focusing (IEF), Two-Dimensional Electrophoresis (2-DE), Capillary Electrophoresis (CE) and Immunoelectrophoresis (IE).

The spectrophotometric methods rely on direct spectrophotometric measurement. Two kinds of spectrophotometry can be used to determine protein concentration and purity: Ultraviolet-Visible (UV) Absorption Spectrophotometry and Fluorescence Spectrophotometry.

High-performance liquid chromatography (HPLC) is ideally suited to the separation and identification of amino acids, proteins, nucleic acids and many other biologically active molecules. The use of nonpolar, chemically bonded stationary phases with a polar mobile phase is referred to as reverse-phase HPLC (RP-HPLC). Amino acid analysis, by automated Edman degradation, can also be used to determine the amount of protein present.

None of the methods is perfect, because each depends on the amino acid content of the protein. However, each will provide a satisfactory result if the proper experimental conditions are used and/or a suitable standard protein is chosen. Other important factors in method selection include the sensitivity and accuracy desired, the presence of interfering substances, and the time available for the assay.
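The calibration-curve step used by the colorimetric assays in this introduction can be illustrated with a short sketch. It assumes that absorbance is linear in protein concentration over the range of the standards, which holds only approximately and only over a limited range for real assays. The standard concentrations, absorbance readings and function names below are hypothetical and are not taken from any cited study.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the calibration line to estimate protein concentration."""
    return (absorbance - intercept) / slope


# Hypothetical standard series (mg/ml) and blank-corrected absorbance readings.
standard_conc = [0.0, 0.25, 0.5, 1.0, 2.0]
standard_abs = [0.00, 0.05, 0.10, 0.20, 0.40]

slope, intercept = fit_line(standard_conc, standard_abs)

# Estimate an unknown sample from its absorbance (assumed to lie in range).
unknown = concentration_from_absorbance(0.30, slope, intercept)
print(f"Estimated concentration: {unknown:.2f} mg/ml")
```

An unknown whose absorbance falls outside the range spanned by the standards should be diluted and re-measured rather than extrapolated, and the result is only as good as the match between the standard protein and the sample protein.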

2.10.2 Recent Advances
In the area of 2-DE, new methods such as fluorescence 2-DE have emerged in recent years [1, 2]. Samples are labeled with one of three spectrally distinct fluorescent dyes (Cyanine-2, Cyanine-3 or Cyanine-5), run together in one gel, and detected individually by scanning the gel at different wavelengths. Quantitative analysis with the Phoretix/ImageMaster software then identifies the differentially expressed proteins. Because of the labeling principle, however, proteins without lysine residues cannot be labeled and are lost. In addition, the high cost of the whole system has prevented its wider adoption [3]. In 2003, Yuan et al. [3] reported a new IPG strip application called the multi-strips on one gel method. This method not only improves the reproducibility and resolving power of the 2-DE pattern, but also provides a high-throughput and economical format that is helpful for automated proteomic research.

A technology introduced during the past few years has greatly increased the speed of spectrophotometric measurements. Modern spectrometers use detectors called photodiode arrays. Photodiodes are composed of silicon crystals that are sensitive to light in the wavelength range 170-1100 nm. When the diode absorbs photons, a current proportional to the number of photons is generated. Linear arrays of photodiodes are self-scanning and have response times on the order of 100 milliseconds; hence, an entire UV-VIS spectrum can be obtained with an extremely brief exposure of the sample to polychromatic light. New spectrometers designed by Hewlett-Packard and Perkin-Elmer use this technology and can produce a full spectrum from 190 to 820 nm in one-tenth of a second [4].

2.10.3 Evaluation of the Technologies
The biuret assay has several advantages, including speed, similar color development with different proteins, and few interfering substances. Its primary disadvantage is its lack of sensitivity [4].

The obvious advantage of the Lowry assay is its sensitivity, which is up to 100 times greater than that of the biuret assay; however, the Lowry assay takes more time. Since proteins have varying contents of tyrosine and tryptophan, the amount of color development differs from protein to protein, including the bovine serum albumin standard. Because of this, the Lowry assay should be used only for measuring changes in protein concentration, not absolute values [4]. The specialist literature contains a multitude of modifications of the Lowry assay, whose principal aim is to reduce its high susceptibility to interference. The Lowry method is adversely affected by a wide range of non-protein substances; additives such as EDTA, ammonium sulfate or Triton X-100 in particular are incompatible with the test.

The Bradford method is twice as sensitive as the Lowry or BCA test and is thus the most sensitive quantitative dye assay. It is the easiest and most rapid method to perform and has the additional advantage that a series of reducing substances (e.g. DTT and mercaptoethanol), which interfere with the Lowry or BCA test, have no adverse effect on the results. However, it is sensitive to detergents. Its main disadvantage is that identical amounts of different standard proteins can give considerably different absorption coefficients. With a microassay procedure, the Bradford assay can be used to determine proteins in the range of 1 to 20 µg. The Bradford assay shows significant variation with different proteins, but this also occurs with the Lowry assay. The Bradford method is not only rapid but also subject to very little interference from non-protein components; the only known interfering substances are the detergents Triton X-100 and sodium dodecyl sulfate. The many advantages of the Bradford assay have led to its wide adoption in biochemical research laboratories [4]. Giraudi et al. [5] noted that a disadvantage of the Bradford assay is the variability of colour development with different proteins: the absorbance change per unit mass of protein varies with the nature of the protein assayed. They believe that the Bradford method should give a protein concentration lower than the real value, owing to the lower probability of interaction between dye molecules and free lysine residues. Lucarini and Kilikian [6] compared the Lowry and Bradford methods with regard to interference from some substances used for glucoamylase precipitation by ethanol. The Bradford method suffered no interference, while the Lowry method gave protein concentration values 20% higher in the presence of ethanol and Tris. They also noted that, despite these interferences, the Lowry method can evaluate the increase in purity during fractionation more accurately, because of its sensitivity to low molecular weight (below 6 kDa) proteins and peptides.

The BCA protein assay is based on chemical principles similar to those of the biuret and Lowry assays. This assay has the same sensitivity level as the Lowry and Bradford assays. Its main advantages are its simplicity and its usefulness in the presence of 1% detergents such as Triton or sodium dodecyl sulfate (SDS) [4]. The test is easier to carry out, and its sensitivity can be varied by using different temperatures. Furthermore, the dye complex is very stable.
However, this test is highly susceptible to interference, although on the positive side its insensitivity to detergents is similar to that of the Lowry method.

Perhaps the most difficult and inconvenient aspect of PAGE is the preparation of gels. The monomer, acrylamide, is a neurotoxin and a suspected carcinogen; hence, special handling is required. Other necessary reagents, including catalysts and initiators, also require special handling and are unstable. In addition, it is difficult to make gels of reproducible thickness and composition. Many researchers are now turning to precast polyacrylamide gels, which several manufacturers offer in glass or plastic cassettes. Gels are available for all experimental operations, including single-percentage (between 3 and 27%) or gradient gel concentrations and a variety of sample well configurations and buffer chemistries. Several modifications of PAGE have greatly increased its versatility and usefulness as an analytical tool [4].

SDS-PAGE is valuable for estimating the molecular weight of protein subunits. This modification of gel electrophoresis finds its greatest use in characterizing the sizes and different types of subunits in oligomeric proteins. SDS-PAGE is limited to a molecular weight range of 10,000 to 200,000. Gels of less than 2.5% acrylamide must be used for determining molecular weights above 200,000, but these gels do not set well and are very fragile because of minimal cross-linking. A modification using gels of agarose-acrylamide mixtures allows the measurement of molecular weights above 200,000 [4]. SDS-PAGE is a fundamental method for 2-DE analysis, since it represents the second-dimension run and the run that finally remains on record: it is at the end of this step that proteins are stained, or blotted, extracted and further analysed with the powerful tools available today in proteomics [7].

Modern IEF techniques, in both soluble and immobilised buffers, have much to offer users. Adequate solutions exist to the two most serious impediments to a well-functioning technique, namely lack of flexibility in modulating the slope of the pH gradient and protein precipitation at the pI value. The solutions are the use of spacers and of novel mixtures of solubilisers, comprising sugars and high molarities of zwitterions. In addition, an important spin-off of IEF know-how seems to be gaining importance in zone electrophoretic separations: the use of isoelectric buffers. Such buffers allow the delivery of extremely high voltage gradients, permitting separations on the order of a few minutes and thus favouring very high resolution through minimal, diffusion-driven peak spreading. As an extra bonus, by properly modulating the molarity of the isoelectric buffer in solution, it is possible to move along the pH scale by as much as 0.3 to 0.4 pH units, thus optimising the pH window for separation [7].

2-DE is a more sensitive analytical method than either electrophoretic method alone [8]. It is a standard method for judging protein purity. In addition, the technique is becoming increasingly valuable in developmental biochemistry, where the increase or decrease in intensity of a spot representing a specific protein can be monitored as a function of cell growth [4]. However, classical 2-DE, with a pH gradient generated by carrier ampholytes, was limited in its resolution, reproducibility and protein-loading capacity [9] because of pH-gradient instability with prolonged focusing time: the pH gradient moves towards the cathode (cathode drift). Detailed comparisons of carrier ampholyte-based patterns for the same cell material in separate laboratories were very difficult, which in turn limited the establishment of collective databases of 2-D gel information [3].
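The SDS-PAGE molecular weight estimate discussed above is commonly obtained by comparing the migration of an unknown band with that of marker proteins of known size. The sketch below assumes that log10 of the molecular weight is approximately linear in relative mobility (Rf) within the resolving range of the gel; this is a widely used approximation rather than something established in this chapter. The marker sizes, Rf values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(rf, marker_rf, marker_kda):
    """Estimate molecular weight (kDa) of a band from its relative mobility."""
    # Fit log10(MW) against Rf for the markers, then read off the unknown.
    log_mw = [math.log10(mw) for mw in marker_kda]
    slope, intercept = fit_line(marker_rf, log_mw)
    return 10 ** (slope * rf + intercept)


# Hypothetical marker ladder: relative mobility and known size in kDa.
marker_rf = [0.10, 0.30, 0.50, 0.70]
marker_kda = [200.0, 100.0, 50.0, 25.0]

unknown_kda = estimate_mw_kda(0.40, marker_rf, marker_kda)
print(f"Estimated size: {unknown_kda:.1f} kDa")
```

The estimate is only meaningful for bands that migrate between the largest and smallest markers, which is consistent with the 10,000 to 200,000 working range quoted above.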
The IE technique is most useful for the analysis of protein purity, composition, and antigenic properties. The basic IE technique allows only qualitative examination of antigenic proteins. The advanced modifications, rocket IE and two-dimensional IE, should be used to obtain quantitative results in the form of protein antigen concentration [4].

Method sensitivity is a potential disadvantage of the CE technique. However, this is also a challenge for HPLC chiral separation, since the peak efficiencies of the commonly used chiral columns are low. An important factor in achieving suitable sensitivity, whether by HPLC or CE, is setting the UV wavelength as low as possible. Therefore, the UV cut-off of the mobile phase in HPLC and of the background electrolyte in CE should be as low as possible. Poor precision is another disadvantage associated with CE. Since the enantiomeric impurity is determined on the basis of area percent, this problem is not a major concern. However, some CE-specific parameters should be carefully controlled. Migration time variation is a major concern for CE separation, but the data of Song et al. [10] indicate that this problem can be well controlled as long as the operational and compositional parameters are optimized.

Although the spectrophotometric assay of protein is fast, relatively sensitive, and requires only a small sample, it still gives only an estimate of protein concentration. It has certain advantages over the colorimetric assays in that most buffers and ammonium sulfate do not interfere and the procedure is nondestructive to protein samples. The spectrophotometric assay is particularly suited to the rapid measurement of protein elution from a chromatography column, where only changes in protein concentration are required [4]. The UV Absorption Spectrophotometry method may be used with concentrations of up to approximately 4 mg/ml (3.0 A). This method is simple and rapid, but may be disturbed by the parallel absorption of non-proteins (e.g. DNA). Compared with the colorimetric assays, this method is less sensitive and requires higher protein concentrations, and should thus be used with pure protein solutions. In addition to the direct absorbance display, evaluation is possible with the BioPhotometer via the Warburg formula or via a standard.

Fluorescence measurements have much greater sensitivity than absorption measurements. The experimenter must therefore take special precautions in making fluorescence measurements, because any contaminant or impurity in the system can lead to inaccurate results. Preparation of reagents and solutions and control of temperature must be considered when preparing for a fluorescence experiment [4].

Compared with the classical forms of liquid chromatography (paper, column, etc.), HPLC has several advantages. Firstly, resolution and speed of analysis far exceed those of the classical methods. Secondly, HPLC columns can be reused without repacking or regeneration. Thirdly, reproducibility is greatly improved because the parameters affecting the efficiency of the separation can be closely controlled. Furthermore, instrument operation and data analysis are easily automated. Last but not least, HPLC is adaptable to large-scale, preparative procedures [4].
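The Warburg formula mentioned above corrects an A280 protein reading for nucleic acid contamination by using the absorbance at 260 nm as well. The sketch below uses the commonly quoted Warburg-Christian approximation, protein (mg/ml) = 1.55 x A280 - 0.76 x A260, and assumes a 1 cm path length. The coefficients are the textbook values and may differ from those implemented in a particular instrument; the readings and the function name are hypothetical.

```python
def warburg_christian_mg_per_ml(a280, a260):
    """Approximate protein concentration (mg/ml) from A280 and A260.

    Assumes a 1 cm path length and the commonly quoted coefficients
    1.55 and 0.76. The result is an estimate, because the true
    absorptivity depends on the amino acid composition of the protein.
    """
    return 1.55 * a280 - 0.76 * a260


# Hypothetical readings for a sample with some nucleic acid contamination.
sample = warburg_christian_mg_per_ml(a280=0.80, a260=0.50)
print(f"Estimated protein: {sample:.2f} mg/ml")
```

Because the correction is empirical, it is best treated as a quick estimate for column fractions or crude samples, in keeping with the point above that the spectrophotometric assay gives only an estimate of protein concentration.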

2.10.4 Applications of the Technologies
Biologists often require protein of a certain concentration and purity in their research, and the techniques for determining protein concentration and purity can be used in many areas. The development of such techniques has been essential for many of the recent advances in biotechnology research [11]. The purity of a protein is a prerequisite for studies of its structure and function and for its potential application [12]. Technologies such as SDS-PAGE are widely used in the analysis of protein purity and concentration.

Interferons (IFNs) were originally discovered through their ability to protect cells against viral infections [13]. However, IFNs also have potent immunomodulatory effects and antiproliferative activity against malignant cells [14]. According to the World Health Organization, potency, purity, identity and stability are the most important properties for the quality control of these cytokines [15]. Ruiz et al. [16] evaluated the influence of protein concentration and a formulation vehicle on the stability of recombinant human interferon alpha 2b in solution. RP-HPLC was performed on a Vydac wide-pore octyl column, and purity was calculated as the area of the main peak expressed as a percentage of the total area. The samples were also analyzed by SDS-PAGE as described by Laemmli [17].

Increasing therapeutic applications of recombinant human interferon-γ (rhIFN-γ) have broadened interest in optimizing methods for its production and purification [18]. In the reversed phase chromatography step of the study by Reddy et al. [18], the major peak was collected in fractions, which were then analyzed by SDS-PAGE. Fractions that proved to be uncontaminated were pooled for renaturation of the protein.
In the renaturation and gel filtration step, the eluted dimer peak was collected, rechromatographed on a Superdex-75 column and assessed by SDS-PAGE to analyze purity. In the cross-linking analysis, dimer formation was further confirmed by interchain cross-linking of the monomeric rhIFN-γ using DSS as a cross-linker [19]. The cross-linked dimer was then analyzed by SDS-PAGE, which confirmed a single band corresponding to 33 kDa.

In their study of micro-scale open-tube capillary separations of functional proteins, Hanna et al. [20] demonstrated the enhanced purity of functional protein obtained with open-tube capillary columns, compared with conventional packed-column approaches, by SDS-PAGE analysis of whole-cell lysate. An SDS-PAGE Schagger gel of the eluted membrane protein complex indicated that the two known subunits making up the octadecameric membrane complex were of exceptional purity. They also used SDS-PAGE to analyse the total elution volume from each of several different capillaries, demonstrating that with increasingly smaller elution volumes the tube enrichment factor can be manipulated to achieve increasingly higher final concentrations of prepared protein. His-tagged magnesium-protoporphyrin IX chelatase subunit D and untagged subunit I were expressed and purified, and the fractions eluted from the capillaries were analyzed by SDS-PAGE following a series of experiments. The complex of HeLa nuclear extract and sample was eluted from the columns and analyzed by SDS-PAGE, and individual slices from the SDS-PAGE gel were then digested with trypsin.

The digestibility of novel proteins in simulated gastric fluid is considered to be an indicator of reduced risk of allergenic potential in food, and estimates of digestibility for transgenic proteins expressed in crops are required by regulatory authorities for human-health risk assessment [21]. In the study of Herman et al. [21], estimates of digestion efficiency from densitometric measurements of relative protein concentration on SDS-PAGE corroborated digestion estimates based on measurements of dye or fluorescence release from the labeled substrates.

The high resolving power of capillary electrophoresis, combined with the specificity of binding interactions, may be used to advantage to characterize the structure-function relationships of biomolecules, to quantitate specific analytes in complex sample matrices, and to determine the purity of pharmaceutical and other molecules [22].
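The purity calculation described for the RP-HPLC analysis in this section, the main peak area as a percentage of the total area, can be sketched as follows. The peak areas and the function name are hypothetical, and the sketch assumes that all species respond equally at the detection wavelength, which is an approximation for real chromatograms and densitometric gel scans.

```python
def percent_purity(peak_areas, main_peak_index=None):
    """Return the main peak area as a percentage of the total peak area.

    If main_peak_index is not given, the largest peak is taken as the
    main peak.
    """
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    if main_peak_index is None:
        main = max(peak_areas)
    else:
        main = peak_areas[main_peak_index]
    return 100.0 * main / total


# Hypothetical integrated peak areas from one chromatogram (arbitrary units).
areas = [12.0, 950.0, 38.0]
purity = percent_purity(areas)
print(f"Purity by area percent: {purity:.1f}%")
```

The same arithmetic applies to band intensities from a densitometric scan of an SDS-PAGE lane, provided the bands fall within the linear range of the stain.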

2.10.5 Relevant web sites

References:
1. Tonge, R., Shaw, J., Middleton, B., Rowlinson, R., Rayner, S., Young, J., Pognan, F. et al. (2001). Validation and development of fluorescence two-dimensional differential gel electrophoresis proteomics technology. Proteomics 1, 377-396.
2. Zhou, G., Li, H., DeCamp, D., Chen, S., Shu, H., Gong, Y., Flaig, M. et al. (2002). 2D differential in-gel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers. Mol Cell Proteomics 1, 117-124.
3. Yuan, Q., An, J., Liu, D. G., & Zhao, F. K. (2003). Multi-strips on One Gel Method to Improve the Reproducibility, Resolution Power and High-throughput of Two-dimensional Electrophoresis. Acta Biochemica 35, 611-618.
4. Boyer, R. (2000). Modern Experimental Biochemistry. 3rd edn. Addison Wesley Longman. 41-43, 116, 121, 130-131, 157.
5. Giraudi, G., Baggiani, C., & Giovannoli, C. (1997). Inaccuracy of the Bradford method for the determination of protein concentration in steroid-horseradish peroxidase conjugates. Analytica Chimica Acta 337, 93-97.
6. Lucarini, A. C., & Kilikian, B. V. (1999). Comparative study of Lowry and Bradford methods: interfering substances. Biotech. Tech. 13, 149-154.
7. Righetti, P. G., Stoyanov, A. V., & Zhukov, M. Y. (2001). The proteome revisited: theory and practice of all relevant electrophoretic steps. Elsevier Science. Amsterdam. 207, 268, 368.
8. Nelson, D. L., & Cox, M. M. (2003). Lehninger: Principles of Biochemistry. 3rd edn. New York. Worth Publishers. 115.
9. Klose, J., & Kobalz, U. (1995). Two-dimensional electrophoresis of proteins: An updated protocol and implications for a functional analysis of the genome. Electrophoresis 16, 1034-1059.
10. Song, S., Zhou, L., Thompson, R., Yang, M., Ellison, D., & Wyvratt, J. M. (2002). J. Chromatogr. A. 959, 299-308.
11. Wilchek, M., & Miron, T. (1999). React. Funct. Polym. 41, 263.
12. Altintas, E. B., & Denizli, A. (2006). Monosize poly (glycidyl methacrylate) beads for dye-affinity purification of lysozyme. Inter. J. Bio. Macromol. 38, 99-106.
13. Isaacs, A., & Lindenman, J. (1957). Virus interference-1. The interferons. Proc R Soc Lond B Biol Sci, 147, 258-267.
14. Bordens, R., Grossberg, S. E., Trotta, P. P., & Nagabhushan, T. L. (1997). Molecular and biologic characterization of recombinant interferon-alpha2b. Semin Oncol, 24, S9-41-51.
15. World Health Organization (1988). Requirements for human interferons made by recombinant DNA techniques. Technical Series No. 771.
16. Ruiz, L., Reyes, N., Aroche, K., Baez, R., Aldana, R., & Hardy, E. (2006). Some factors affecting the stability of interferon alpha 2b in solution. Biologicals 34, 15-19.
17. Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-685.
18. Reddy, P. K. et al. (2006). Increased yield of high purity recombinant human interferon-γ utilizing reversed phase column chromatography. Protein Expression and Purification, doi:10.1016/j.pep.2006.08.013.
19. Wang, F., Liu, Y., Li, J., Ma, G., & Su, Z. (2006). On-column refolding of consensus interferon at high concentration with guanidine–hydrochloride and polyethylene glycol gradients. J. Chromatogr. A. 1115, 72–80.
20. Hanna, C., Gjerde, D., Nguyen, L., Dickman, M., Brown, P., & Hornby, D. (2006). Micro-scale open-tube capillary separations of functional proteins. Anal. Biochem. 350, 128-137.
21. Herman, R. A., Korjagin, V. A., & Schafer, B. W. (2005). Quantitative measurement of protein digestion in simulated gastric fluid. Regul. Toxicol. Pharmacol. 41, 175-184.
22. Heegaard, N. H., & Kennedy, R. T. (1999). Identification, quantitation, and characterization of biomolecules by capillary electrophoretic analysis of binding interactions. Electrophoresis 20, 3122-3133.

Chapter 3.1 Amino Acid Analysis and Sequencing of Proteins
Alexander Leahy

Understanding proteins and their function is fundamental to understanding the biochemistry of living organisms. With this understanding, one can begin to influence chemical pathways and so contribute to the prevention or cure of disease. Analysis of a protein usually begins with establishing its primary sequence, or amino acid sequence. There are a number of techniques to achieve this. One method involves deduction from the corresponding DNA sequence. This result often needs to be confirmed chemically, and so this document will concern itself only with chemical analysis of the protein.

Amino acid analysis involves the complete hydrolysis of the protein into its constituent amino acids, followed by separation of these residues for quantitative analysis. There are a number of hydrolysis techniques, including acid hydrolysis, base hydrolysis and enzyme hydrolysis. Some form of derivatisation, such as with ninhydrin or 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate, is then performed on the amino acids in order to make them detectable. They are then separated by some form of chromatography, such as ion-exchange or reversed-phase[1]. Capillary electrophoresis is also becoming more common, due to its speed, resolution and sensitivity, and also its ability to separate enantiomers.[2]

Figure 3.1.1 Perkin Elmer Applied Biosystems Model 420A PTC derivatizer with an on-line Perkin Elmer Applied Biosystems Model 130A PTC Amino Acid Analyzer. Source:

N-terminal and C-terminal amino acid analysis is a useful procedure, as this information can often assist sequencing of the protein. N-terminal analysis is theoretically simple: the protein is reacted with a reagent which selectively labels the terminal amino acid. The protein is then hydrolysed and the amino acids separated. The terminal amino acid can then be identified by its label. The C-terminal amino acid, however, needs to be cleaved by a carboxypeptidase. This enzyme removes residues one after another from the C-terminal end of the protein over time. By analysing the amino acids released over time, one can deduce the C-terminal amino acid.

Sequence analysis is a far more complicated procedure. There are two main methods commonly used: Edman degradation and mass spectrometry. In both cases, however, when sequencing an entire protein it is usually necessary to digest the protein with a protease such as trypsin to break it into smaller peptides which can then be sequenced. These peptide sequences must then be recombined in the correct order. To achieve this, the process is repeated with a different protease, such as pepsin, which breaks peptide links at different amino acids. The resulting sequences are then compared and aligned where they overlap. Since the two digests cut the protein at different positions, the overlapping alignments can be combined to give the overall sequence.

Edman degradation can be viewed as a repeated N-terminal sequence analysis. A protein is adsorbed to a solid phase and is reacted with phenylisothiocyanate. The labelled N-terminal amino acid is cleaved and then washed from the solid phase with a solvent. The removed amino acid can then be identified through chromatography. By analysing one amino acid at a time, the sequence can be determined. Unfortunately this process can only be used for sequences of about 50 amino acids. In each cycle, a fraction of the protein molecules fail to lose their terminal amino acid, which is then removed during the next cycle instead. This introduces noise into the results which gets worse with each successive step. Consequently the protein needs to be digested with endoproteases and the sequences of the resulting peptides determined. Overlap of different peptides allows the complete sequence to be determined.[3]
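The length limit of Edman degradation described above can be illustrated with a small calculation. The sketch below assumes a simplified model that is not taken from the source: in every cycle the same fraction of chains (the "repetitive yield") reacts correctly, so the fraction still in step after n cycles is that yield raised to the power n. The yield values are hypothetical and chosen only for illustration.

```python
def in_phase_fraction(repetitive_yield: float, cycles: int) -> float:
    """Fraction of chains still in step after `cycles` Edman cycles.

    Simplified model (an assumption): each cycle succeeds on the same
    fraction of chains, independently of the residue being removed.
    """
    return repetitive_yield ** cycles


if __name__ == "__main__":
    # Hypothetical repetitive yields, for illustration only.
    for repetitive_yield in (0.90, 0.95, 0.98):
        for cycles in (10, 30, 50):
            fraction = in_phase_fraction(repetitive_yield, cycles)
            print(f"yield {repetitive_yield:.2f}, cycle {cycles:2d}: "
                  f"{fraction:.1%} of chains still in step")
```

Under this model even a high per-cycle yield leaves only a minority of chains in step after a few dozen cycles, which is consistent with the practical limit of about 50 residues quoted above.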

Figure 3.1.2 Mechanism of Edman Degradation Source:

A more recently developed method of sequencing proteins involves the use of tandem mass spectrometry (MS/MS) or mass spectrometry with post source decay (PSD). Essentially, both of these techniques rely on fragmentation of the peptide ion during its flight path. A common method of fragmentation for MS/MS is Collision Activated Dissociation (CAD). This method involves subjecting the peptide ion selected in the first MS stage to collisions with an inert gas, often argon, which provide enough vibrational energy to break covalent bonds within the peptide.[4] Different types of ions are formed, depending on the location of the fragmentation.

Figure 3.1.3 Sequence ions from fragmentation in mass spectrometry Source:

As seen in Figure 3.1.3, yx and bx ions are formed from fragmentation between the carbon and nitrogen of the peptide bond. Ideally, a peptide will fragment along the peptide backbone, forming these yx and bx ions. The difference in mass between consecutive ions of the same series corresponds to the mass of one amino acid residue, and so reading along the series gives the sequence of the fragment being analysed.
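The way mass differences between consecutive b or y ions give sequence information can be sketched in a few lines. The example below is a minimal illustration, not the procedure of any particular instrument software: it uses standard monoisotopic residue masses, assumes singly charged ions only, and the test peptide GASPVT and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # mass of H2O, Da


def b_ions(peptide):
    """m/z of singly charged b1..b(n-1) ions (N-terminal fragments)."""
    masses, total = [], 0.0
    for residue in peptide[:-1]:
        total += MONO[residue]
        masses.append(total + PROTON)
    return masses


def y_ions(peptide):
    """m/z of singly charged y1..y(n-1) ions (C-terminal fragments)."""
    masses, total = [], WATER
    for residue in reversed(peptide[1:]):
        total += MONO[residue]
        masses.append(total + PROTON)
    return masses


def read_ladder(mz_values, tolerance=0.01):
    """Turn differences between consecutive ions into residue letters."""
    residues = []
    for lower, higher in zip(mz_values, mz_values[1:]):
        difference = higher - lower
        matches = [aa for aa, mass in MONO.items()
                   if abs(mass - difference) <= tolerance]
        residues.append("/".join(matches) if matches else "?")
    return residues


if __name__ == "__main__":
    peptide = "GASPVT"  # hypothetical peptide
    print("b series reads:", read_ladder(b_ions(peptide)))
    print("y series reads:", read_ladder(y_ions(peptide)))
```

The b series is read from the N-terminal side and the y series from the C-terminal side, so the two ladders give the same residues in opposite order. A step of 113.084 Da is reported as "L/I" because leucine and isoleucine cannot be told apart by mass alone.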

Recent Advances
There have been a number of developments in amino acid analysis in recent years. With the increase in use of capillary electrophoresis as a method of separating the hydrolysed residues[2], there has been increased activity in developing better detection techniques. A number of detection methods exist for this process, such as UV detection and Laser Induced Fluorescence (LIF).

Mass spectrometry has become the method of choice for sequencing proteins, and so there has been a lot of focus on its development within the last 5 years. In particular, much attention has been paid to the manner in which ions are fragmented to generate the mass spectrum from which the sequence is derived. CAD is one of the more commonly used methods for fragmentation in MS/MS. However, CAD can sometimes fail to give a complete distribution of fragments from cleavage along the backbone of the peptide, making sequence determination a complex task. This is often due to Arg residues preventing random protonation along the backbone, or to post-translational modifications that provide a lower energy cleavage than the backbone. An example of this is phosphorylated residues becoming the preferred site for cleavage, as shown in Figure 3.1.4. Consequently the mass spectrum is dominated by a single peak corresponding to the peptide without the phosphoric acid moiety (Figure 3.1.5 A).

Figure 3.1.4 Fragmentation scheme for loss of phosphoric acid from a multiply protonated phosphopeptide by CAD. Source: Syka et. al. (2004), Proc. Natl. Acad. Sci. USA 101:9528-9533[5]

Electron Transfer Dissociation, however, is a technique developed by Syka, et. al. [5] which transfers an electron from a radical anion to the protonated peptides in the mass spectrometer, as shown in Equation 1 [6].

\[
[\mathrm{Peptide} + 3\mathrm{H}]^{3+} + \mathrm{Anion}^{-\bullet} \longrightarrow [\mathrm{Peptide} + 3\mathrm{H}]^{2+\bullet} + \mathrm{Anion} \qquad \text{[Eq. 1]}
\]

The electron-carrying peptide fragments in a way that does not cleave the chemical modifications from the peptide, but instead promotes cleavage along the peptide backbone. Figure 3.1.5 shows mass spectra of a phosphopeptide. The first spectrum (Figure 3.1.5 A) was obtained using the more conventional collision-activated technique. The spectrum is largely dominated by a single peak resulting from the loss of the phosphoric acid moiety. Figure 3.1.5 B shows the mass spectrum of the same peptide after electron transfer dissociation fragmentation. It can be clearly seen that the peaks which result from cleavage along the backbone make the sequence much more easily determined.

Figure 3.1.5 “Tandem mass spectrometry (MS/MS) spectra obtained from a phosphopeptide eluted during a nanoflow high-performance liquid chromatography MS/MS (nHPLC-MS/MS) experiment. (A) The MS/MS spectrum produced following conventional collisional-activation. Note this spectrum is dominated by a single mass/charge (m/z) corresponding to loss of a phosphoric acid moiety. No peptide backbone cleavage is observed. Sequence identification is, therefore, impossible. (B) The MS/MS spectrum that is produced following electron transfer dissociation (ETD) fragmentation. Here, every single backbone cleavage product is observed. The sequence is easily assigned as RKpSILHTIR. Both panels display single-scan mass spectra.” Source: Coon, J, et. al (2005), BioTechniques, 38(4), 519 - 523

Chemically Assisted Fragmentation (CAF) has significantly improved the quality of data obtained from MALDI-TOF instruments. Sulfonation of the N-terminus of the peptide places a negative charge at that end of the sequence. When the peptide is ionised and fragmented in the spectrometer, both the yx and bx ions pick up protons from the matrix. In the bx ions, which carry the sulfonated N-terminus, this proton is balanced by the negative charge, so the bx ions become neutral and are not detected. Consequently only the y ions are detected and the sequence is much more easily determined.

Figure 3.1.6 Peptide sequencing of a peptide containing two phosphorylated tyrosine residues using Ettan CAF MALDI Sequencing Kit in conjunction with a MALDI-ToF mass spectrometer in PSD mode. The sequence is ALGADSpYpYTAR (two fragment peaks are missing from the spectrum, each indicated by an X). Source: s?OpenDocument&parentid=366147&moduleid=165399&zone=Proteomics

In order to obtain the full sequence of a protein, sequenced peptides from different digestions are usually overlapped, resulting in a complete sequence. A study by Bandeira, et al [8] has shown that this process can be performed directly on the MS/MS spectra rather than on the sequences. It is then the overlapping spectra of peptide fragment ions which lead to the de novo sequence of a protein. The spectra are compared and aligned pairwise, focussing specifically on the bx and yx ions found in the spectra. With an algorithm that favours multiple alignments over pairwise alignments, a Prefix Residue Mass (PRM) spectrum is obtained.
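The conventional overlap step, in which already sequenced peptides from two digests are merged, can be sketched as a greedy assembly. This is a toy illustration working on sequences, not the spectrum-level method of Bandeira et al.; the protein, the peptide fragments, the function names and the minimum overlap of two residues are all hypothetical and chosen only to keep the example small.

```python
def overlap(left, right, min_len=2):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def assemble(peptides, min_len=2):
    """Greedily merge peptides, always joining the pair with the longest overlap."""
    contigs = list(dict.fromkeys(peptides))  # drop exact duplicates
    # Drop peptides that lie wholly inside a longer peptide.
    contigs = [p for p in contigs
               if not any(p != q and p in q for q in contigs)]
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


if __name__ == "__main__":
    # Hypothetical fragments of the made-up protein ALGKDSYTRFWEKLMNPR.
    digest_one = ["ALGK", "DSYTR", "FWEK", "LMNPR"]
    digest_two = ["ALGKDSY", "TRFW", "EKLMNPR"]
    print(assemble(digest_one + digest_two))
```

With real data, overlaps as short as two residues would often be ambiguous, so a longer minimum overlap and some handling of sequencing errors would be needed.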

Figure 3.1.7 “Clustering phase. (a) and (b) illustrate our linear representation of spectra where a dot indicates a peak and the dot size is proportional to the peak height (used to save space when showing multiple alignments of several spectra). (c) shows the corresponding PRM spectrum (our preprocessed and scored version of an MS/MS spectrum). For the convenience of the reader, prefix masses are shown in green, and suffix masses are shown in red, although this distinction is not known in advance. Other masses (which do not correspond to prefix or suffix masses) are shown as black dots. (d) Clustering is then used to take advantage of redundant information in multiple spectra from the same peptide and (e) obtain a single, more reliable, consensus PRM spectrum (some of the red dots are hidden by green dots). All black dots still present in (e) correspond either to neutral losses or to doubly charged fragments. The increased number and significance of red/green dots in the consensus PRM spectrum as compared to individual spectra would already yield a reliable de novo peptide sequence (as illustrated in (f)), although we refrain from interpreting the spectra until the end of the assembly phase” Source: Bandeira, et. al (2004), Anal. Chem., 76(24), 7221-7233

The PRM spectrum resulting from the alignment of spectra is much simpler to interpret for de novo sequencing, as some of the noise present in individual spectra is removed. It has the added advantage that the complete protein can be sequenced from one resulting spectrum. [8]

Evaluation of the Technology
Hydrolysis of Proteins into Amino Acids for Amino Acid Analysis
Hydrolysis of proteins usually proceeds by one of three techniques. Acid hydrolysis is a very harsh method, often using 6N HCl at high temperatures. The hydrolysis time can be reduced to as little as 10 minutes by heating with microwaves instead of in an oven. Typically, tryptophan, serine and threonine are destroyed by this method. Serine and threonine can still be quantified using a time course analysis, as they are destroyed more slowly. The time-dependent results allow one to extrapolate back to the original content of these amino acids. Tryptophan can be analysed by hydrolysis with 4N methane sulfonic acid or 4N sodium hydroxide.[9] Enzyme hydrolysis has the advantage of not damaging any of the amino acids, but the enzymes can also hydrolyse themselves, which affects quantitative measurements. The enzymes need to be immobilised in a gel in order to avoid this.[8]

Edman Degradation vs Mass Spectrometry Techniques
Edman degradation was the method of choice for sequencing proteins for many years; however, mass spectrometry is becoming more prominent in this area. The main advantages of mass spectrometry are its high sensitivity and high throughput. Modern mass spectrometers are able to obtain relevant data about a protein from peptide quantities in the femtomole range, and often less. Edman degradation sequencers usually require quantities in the picomole range. The speed of analysis possible by mass spectrometry makes it a much more appropriate form of analysis in the proteomic era. With the increasing number of proteins and peptides requiring identification or characterisation, it is necessary to be able to perform these analyses quickly, which is easily achievable using mass spectrometry. Software exists which can interpret mass spectra for sequence analysis.

Fragmentation patterns are often very complex in mass spectrometry. The location of fragmentation of a peptide varies depending on the method used. Ideally, fragmentation would produce mostly bx and yx ions, but this is often not the case. Other types of ions, such as those formed from fragmentation of the side chains and from further fragmentation of fragments, also add to the peaks observed in the spectra. As not all fragmentation techniques are compatible with all types of mass spectrometer, it is difficult for a laboratory to select the method which would best analyse its protein.

However, with so many peptides and proteins already sequenced, it is often not necessary to sequence an entire peptide to determine its identity. Sequence tags are short sections of the sequence of a peptide which can be used to identify the peptide in a database. In fact, frequently the spectrum itself can be used to search databases for protein identification.

One of the problems with sequencing by mass spectrometry is distinguishing between leucine and isoleucine. These amino acids are isomers of each other and so have identical mass. The only way to distinguish between them using mass spectrometry is by high energy fragmentation. The ions that are formed often cleave the side chain of the amino acid, forming d, v and w ions. These ions can be used to distinguish between the two amino acids.
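The leucine/isoleucine problem can be checked directly from the elemental composition of the residues. The sketch below is only an illustration: it uses standard monoisotopic atomic masses, and lysine and glutamine are included as a contrasting pair that share a nominal mass of 128 Da but differ slightly in exact mass. The dictionary and function names are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values).
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of each residue (amino acid minus one water).
RESIDUE_FORMULA = {
    "Leu": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Ile": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Lys": {"C": 6, "H": 12, "N": 2, "O": 1},
    "Gln": {"C": 5, "H": 8, "N": 2, "O": 2},
}


def residue_mass(name):
    """Monoisotopic residue mass calculated from the elemental formula."""
    return sum(ATOM[element] * count
               for element, count in RESIDUE_FORMULA[name].items())


if __name__ == "__main__":
    for name in RESIDUE_FORMULA:
        print(f"{name}: {residue_mass(name):.5f} Da")
    print("Leu - Ile:", residue_mass("Leu") - residue_mass("Ile"))
    print(f"Lys - Gln: {residue_mass('Lys') - residue_mass('Gln'):.5f} Da")
```

Because Leu and Ile have the same formula, no improvement in mass accuracy can separate them, which is why side-chain fragments such as d, v and w ions are needed. Lys and Gln, by contrast, differ by a small mass that a sufficiently accurate instrument can resolve.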

Applications of the Technology
In 1999 a study was carried out to sequence neuropeptides present in a tissue sample of pituitary neurointermediate lobes taken from adult X. laevis toads. The technique chosen was MALDI-PSD. Interestingly, the cells were mixed directly with the matrix, 2,5-dihydroxybenzoic acid in trifluoroacetic acid, in which cell lysis occurred. Once the targets were prepared, a spot would typically contain femtomoles of a peptide. The ions generated by the MALDI source were detected using MALDI-TOF MS, and a profile of the peptides present was obtained, as shown in Figure 3.1.8.

Figure 3.1.8 Mass profile of peptides in pars intermedia tissue from the amphibian X. laevis, obtained by direct MALDI TOF MS analysis under delayed extraction conditions. Source: Jesperson, S. et al, Anal. Chem., 71(3), 660-666 [7]

The two most prominent ions, at masses of 1050.4 u and 1392.7 u, were chosen for MALDI-PSD analysis. Analysis of the 1050.4 u peak proved difficult, as the peptide had a disulfide link which severely affected the fragmentation patterns observed. However, identification of the masses of individual amino acids assisted in identifying the peptide, and the peaks matching fragmentation around the disulfide link supported the conclusion. Figure 3.1.9 shows the identified peptide and the matching ions from the spectrum.[7] This example highlights the need to break disulfide links in peptides before sequence analysis via mass spectrometry.

Figure 3.1.9 Known sequence of the vasotocin peptide (N-terminal end on top), including an intrinsic disulfide (S-S) bridge between the two cysteine residues. Indicated are the a-, b-, and y-type ions observed in the MALDI-PSD fragment spectrum. Source: Jesperson, S. et al, Anal. Chem., 71(3), 660-666 [7]

Sequencing of a novel peptide in the venom of Crotalus durissus collilineatus was performed on a Micromass Q-TOF Micro mass spectrometer (Waters, USA) in positive electrospray ionisation mode. This instrument is capable of MS/MS, which was used to complete the de novo sequencing of the peptide. The peptide was fragmented with a collision energy fixed at 30 V. The MassLynx (Waters, USA) software automated the sequencing process from the mass spectrum obtained. The spectrum yielded the sequence TPPAGPDGGRP, which was supported by agreement between the b and y ion series.[11]

Figure 3.1.10 MS/MS profile of the fragmentation of the selected ion (511.2 Da [M + 2H]2+) and the deduced sequence derived from de novo sequencing based on MassLynx data processing system (Waters, USA). Consistency between the b and y ion series established the sequence TPPAGPDGGPR.

Source: Higuchi S, et al, Comp. Biochem. Physiol. C. Toxicol. Pharmacol, 144(2):107-21 [11]
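The precursor value quoted in the caption of Figure 3.1.10 can be checked with a short calculation. The sketch below assumes standard monoisotopic residue masses and a doubly protonated ion, and uses the sequence as printed in the caption (TPPAGPDGGPR); the function name is hypothetical.

```python
# Monoisotopic residue masses in Da for the residues in this peptide.
MONO = {
    "T": 101.04768, "P": 97.05276, "A": 71.03711,
    "G": 57.02146, "D": 115.02694, "R": 156.10111,
}
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # mass of H2O, Da


def precursor_mz(peptide, charge):
    """m/z of the [M + zH]z+ ion of an unmodified peptide."""
    neutral_mass = sum(MONO[residue] for residue in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    print(f"[M + 2H]2+ = {precursor_mz('TPPAGPDGGPR', 2):.2f}")
```

The calculated value of about 511.26 agrees with the 511.2 quoted for the selected ion. The precursor mass depends only on the amino acid composition, so it cannot by itself establish the order of the residues; that information comes from the b and y ion series.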

Relevant Web Sites
Ion Source – Mass Spectrometry and Biotechnology Resource
This site is particularly useful as it provides tutorials on a number of different aspects of peptide analysis by mass spectrometry. Of particular interest is the tutorial on de novo peptide sequencing. Along with explanations and exercises on the techniques, it also contains many references examining the different techniques and software used in this type of analysis.

Waters is a company that makes analytical instruments for protein analysis. They have a number of references on their site relevant to the development of amino acid analysis techniques.

Key Industry Suppliers
Amersham Biosciences, a subsidiary of GE Healthcare Life Sciences, is a major supplier of the Ettan CAF MALDI Sequencing Kit[13] as well as the Ettan MALDI-ToF Pro.

Waters is a company from the USA which supplies many of the analytical instruments required for protein analysis. They offer a variety of mass spectrometers, such as LC/MS, LC/MS/MS, MALDI and GC-MS. They also supply chromatography instruments, such as the ACQUITY UPLC, which can complete an analysis in less than 30 minutes, and AccQtag, a derivatisation reagent which allows amino acid residues to be detected by fluorescence.

Applied Biosystems is a major supplier of a number of different instruments, such as mass spectrometers and chromatography instruments, as well as protein sequencers such as the Procise cLC Protein Sequencer, which uses capillary HPLC to perform Edman degradation analysis down to the femtomole level.[14]

1. Iowa State University, (2004) Amino Acid Analysis,
2. Poinsot V, Lacroix M, Maury D, Chataigne G, Feurer B, Couderc F (2006), Recent Advances in Amino Acid Analysis by Capillary Electrophoresis, Electrophoresis, 27, 176-194
3. H. Jakubowski (2006), Biochemistry Online, chapter 2 B seqconform.html
4. Hunt, D.F., Yates, J.R., Shabanowitz, J., Winston, S., Hauer, C.R. (1986), Protein Sequencing by Tandem Mass-Spectrometry. Proc. Natl. Acad. Sci. USA 83:6233-6237
5. Syka et. al. (2004), Proc. Natl. Acad. Sci. USA 101:9528-9533
6. Coon, J, et. al (2005), BioTechniques, 38(4), 519-523
7. Jesperson, S., Chaurand P., van Strien F, Spengler B., van der Greef J, (1999) Direct Sequencing of Neuropeptides in Biological Tissue by MALDI-PSD Mass Spectrometry, Anal. Chem., 71(3), 660-666
8. Bandeira N, Tang H, Bafna V, Pevzner P (2004), Shotgun Protein Sequencing by Tandem Mass Spectra Assembly, Anal. Chem. 76(24), 7221-7233
9. Karen A West, Jeffrey D Hulmes and John W Crabb (1996), Amino Acid Analysis Tutorial .html
10. David E. Metzler, (2001), Biochemistry – The Chemical Reactions of Living Cells, Harcourt/Academic Press, pp 116
11. Higuchi S, Murayama N, Saguchi K, Ohi H, Fujita Y, da Silva N, Bezerra de Siqueira R, Kahlou S, Aird S (2006), A novel peptide from the ACEI/BPP-CNP precursor in the venom of Crotalus durissus collilineatus, Comp. Biochem. Physiol. C. Toxicol. Pharmacol, 144(2):107-21.
12. Savitski MM, Nielsen ML, Kjeldsen F, Zubarev RA (2005), Proteomics-grade de novo sequencing approach. J Proteome Res. 4(6):2348-54.
13. GE Healthcare Life Sciences (2002), s?OpenDocument&parentid=366147&moduleid=165399&zone=Proteomics
14. Applied Biosystems (2006), ng.cfm

Chapter 3.2 Chemical Modification of Proteins
Hellan M Luo

3.2.1 Introduction
Protein chemists have long been interested in altering the chemical, physical and biological properties of proteins by chemically changing their structure. [1] One of the first things scientists discovered about proteins is that they are easily changed by treatment with chemical reagents. [1] Their lability towards chemical reagents and reaction conditions has been a serious problem for many purposes. The application of modern knowledge of proteins, new chemical reagents and more sophisticated analytical techniques has made chemical modification of protein molecules one of the most useful approaches to the study of their properties. [1]

The term "chemical modification" refers to the formation or cleavage of covalent bonds, generally involving the side chains, though cleavage of peptide bonds and modification of the α-amino terminus can formally be included. [3] Most of these modifications are not reversible by dilution, gel filtration or dialysis. There are a few exceptions: Schiff base formation with aldehydes and some reactions of arginine are reversible in this way. [3] Some modifications can be reversed by changing the conditions, for instance by lowering the pH, which can be very useful, but most cannot. Some modifications are hydrolysed in 6N HCl, as used before amino acid analysis. Noncovalent interactions also have their place: competitive inhibitors bind at active sites, low dielectric solvents change the UV absorbance of groups exposed to solvent, and fluorescence-quenching molecules similarly quench the fluorescence of exposed tryptophan residues.[3][4]

There are 7 main purposes for modifying proteins:

Purpose | Explanations / main strategies
1. Analysis | Number of amino acids present in a protein, determined by amino acid analysis after acid hydrolysis: automated ion exchange chromatography, detecting the peaks of amino acids coming off a column by reaction with ninhydrin or fluorescamine
2. As an aid in sequence analysis | Can be done on proteins bound to membranes after electrophoresis, or even in the gel
3. To understand mechanism of action | What groups are at the site responsible for biological activity? How many? Which ones in the sequence? For enzymes we aim to characterize the chemical mechanism, which requires identifying the participating groups, eventually to characterize the conformation of the active site, and beyond that to understand changes in conformation during catalysis (best done by X-ray crystallography)
4. Physical modification | Change in the physical characteristics of the protein, destabilization of complexes, including solubilisation of hydrophobic proteins
5. Cross-linking and attachment to supports | Insolubilization of protein by intermolecular cross-linking, so that for instance insoluble but active trypsin can be removed from a digestion by centrifugation
6. Hapten attachment | Attachment to carrier proteins, to elicit antibodies to the hapten (antibodies are not obtained if it is a free small molecule)
7. Attachment of reporter groups | Groups sensitive to their environment, e.g. chromophores whose absorbance is pH-sensitive

Table 3.2.1 Seven (7) main purposes of chemical modification of proteins and their main strategies, summarized from [2] [3]

Modifications occur (1-3):

1. By addition of other groups:

Group Added | Target Residue(s) | Reversibility
Phosphoryl | tyr, ser, thr, his | yes
Methyl | lys | yes
Acetyl | lys, N-terminus | yes (lys)
Hydroxyl | pro, lys | no
Carboxyl | glu | no
Sugars | ser, thr, hydroxypro, asn | no
Myristyl | cys, his | no
Glycophospholipid | C-terminus | no
Prenyl | cys | no
ADP-ribose | ? | yes
Diphthamide | his | no
Ubiquitin | lys | no

Table 3.2.2 Addition of other groups, target residues and reversibility for protein chemical modification. Source: Molecular Genetics, Protein Modification [2]

2. By isomerization of residues:
o L-ala is converted to D-ala in dermorphin (a peptide of frog skin).
o Cis and trans proline are interconverted in a reaction important for protein folding.
o Cysteines are exchanged by a disulfide exchange protein, also important for folding. [2]

3. During translation (co-translational) or after the polypeptide chain has been completed (post-translational). [2]

3.2.2 Recent Advances

Chemical modification of proteins has recently become more and more popular. One question frequently asked about a biologically active protein is "what is unique about the structure that accounts for the particular activity?" [1] Interest is mainly focused on the amino acid side chain groups of the protein molecule that participate in the activity, or on the "active centre" if the protein happens to be an enzyme [1]. (Note: an enzyme is a special form of protein and plays a very important role in biological function.) Many recent advances have been made possible by modern protein technologies, and a number of them are widely used in the biopharmaceutical field. Proteins are controlled by a vast and dynamic array of post-translational modifications, many of which create binding sites for specific protein-interaction domains. In this way post-translational modifications link the proteome to cellular organization, and modification-dependent interactions act together to regulate cell behaviour. Some recently developed technologies include:

1) Site-Specific Chemical Modification of Proteins
Site-specific chemical modification is used to study structure-function relationships in proteins.[5] The technique modifies specified sites in a protein, such as arginyl residues, cysteine, lysine residues and other α-amino groups, methionine and tryptophan, using the relevant reagents and reactions, such as diethylpyrocarbonate, tetranitromethane and the selective reduction of disulfide bonds. The aims are to obtain proteins with better membrane transport, better enzymatic activity or better biological function, and to help study the relationship between protein structure and function. [13] [14] [15] [16] [5]

2) Chemical Modifications of Proteins in Vivo
There are chemical modifications of proteins which control biological activity. The majority of these modifications are co-translational and post-translational reactions. Some of them are considered to occur in a "random" manner: they are not catalysed by an enzyme and depend on the environment of the protein. For example:
o Oxidation of proteins occurs in vivo, with generally unfavourable consequences.
o Nitric oxide is a potent physiologic agent with diverse systemic effects. Peroxynitrite, which is formed from nitric oxide, is a mediator of some of these physiologic effects.
o Glycation is the term used for the reaction of reducing sugars with proteins. This involves the initial formation of a Schiff base followed by rearrangement in the Maillard reaction, eventually resulting in advanced glycation end (AGE) products. The reaction can occur at lysine and arginine residues, with resulting crosslink formation. [5] [21][22]

3) Chemical Modification and Protein Biopharmaceuticals
The modification of proteins and peptides with poly(ethylene glycol) (PEG) is the most frequent chemical modification used in the manufacture of biopharmaceuticals. [5][21][14]

Other recent advances include:
4) A Wiring of the Human Nucleolus.
5) In-depth analysis of the membrane and cytosolic proteome of red blood cells.
6) A mammalian organelle map by protein correlation profiling.
7) Modular stop and go extraction tips with stacked disks for parallel and multidimensional peptide fractionation in proteomics.
8) Insulin-dependent interactions of proteins with GLUT4 revealed through Stable Isotope Labeling by Amino Acids in Cell Culture (SILAC).
9) Quantitative proteomic comparison of rat mitochondria from muscle, heart and liver.
10) Actin homolog MreB and RNA polymerase interact and are both required for chromosome segregation in Escherichia coli.

Proteins may be modified in vitro so that they can be used for detection, purification and assay development. In very general terms, protein modification reagents can be separated into reagents that add, cleave or reduce. Reagents that add labels are used for immunoassays, flow cytometry, fluorescence-activated cell sorter (FACS) analysis and molecular structure and function studies. Addition reagents also include those used to block particular functional groups. Reagents that enzymatically cleave can be used for removing amino acids, producing antibody fragments and releasing peptides from fusion proteins. Reagents that chemically reduce can be used for protein solubilisation and to facilitate cross-linking. [24]

1) Label Addition Reagents
Reagent | Reactivity
Bolton-Hunter Reagent (SHPP) | Primary Amines
Water-Soluble Bolton-Hunter Reagent (Sulfo-SHPP) | Primary Amines
Succinimidyl-3-(tri-N-butylstannyl) benzoate | Primary Amines
Sulfo-SHB | Primary Amines
HPPH | Carbonyl/Aldehyde
ß-(4-Hydroxyphenyl)ethylmaleimide | Sulfhydryl
ß-(4-Hydroxyphenyl)ethyl iodoacetamide | Sulfhydryl

Table 3.2.3 Protein modification reagents that can be iodinated (source: Pierce net Protein Chemistry) [24]

2) Addition Reagents that Alter or Block Functional Groups
In many applications, especially those involving cross-linking or detection of specific functional groups, it is sometimes necessary to selectively block one functional group (e.g., amines) or else add more of one particular functional group (e.g., add more sulfhydryl groups) to one or more proteins used in an experiment. Blocking functional groups is also useful for reducing background detection in certain assays.[24][8][9]

3) Enzymatic Cleavage Reagents

Enzymes may be used as catalysts to effect a wide variety of biochemical transformations. Whether dissolved in solution or immobilized on an insoluble support, enzymes can specifically cleave proteins at discrete sites to isolate fragments of known activity or structure. Such cleavage can also be used to demonstrate protein purity by analysing which peptide fragments result from proteolysis. Commercially available proteolysis products (e.g. from Pierce) include highly purified enzymes that can be used in buffered solutions, immobilized enzymes that allow simple removal of protease activity after use, assays to monitor protease activity, and protease inhibitors to avoid proteolysis in biological samples. Examples are Factor Xa, Submaxillaris Protease and TPCK Trypsin. [24][8][9]

4) Reducing Disulfide Bonds

Reduction of disulfide bonds in proteins may be required or beneficial for a number of reasons:
• Exposure of sulfhydryls for cross-linking
• Protein activation
• Oligomer separation
• Protein denaturation
• Protein solubilization
• Disulfide bond characterization
• Protection of protein thiols from oxidation

The key reducing reagents in common use are those bearing free sulfhydryl (thiol) groups, such as dithiothreitol (DTT), 2-mercaptoethanol and 2-mercaptoethylamine, together with the phosphine tris(2-carboxyethyl)phosphine (TCEP). [24][8][9] Some of these are discussed in section 3.2.4.

3.2.3 Evaluation of the Technology

Like any technology, chemical modification of proteins has both advantages and disadvantages.

The advantages of chemical modification of proteins are:
1) Pharmaceutical proteins are often unstable when injected into the blood circulation. Their half-life is short, from several minutes to hours, and the consequence is multiple injections and uncomfortable side effects. Chemical modification is an effective way to increase the longevity and efficacy of these proteins.
2) Covalent modification is an important strategy for introducing new functions into proteins.
3) As engineered proteins become more sophisticated, it is often desirable to introduce multiple modifications involving several different functionalities in a site-specific manner.
4) Proteins are the final link of the information chain.
5) It helps in understanding the relationship between protein structure and function.
6) It allows proteins to perform specific biological functions.
7) Mildness and a high degree of specificity.
8) Improved biocompatibility and bioactivity.
9) Site specificity.

The disadvantages of chemical modification of proteins are:
1) It is usually not residue specific; other amino acids can be modified too.
2) Large amounts of reagent are required.
3) It is expensive.
4) It is time consuming.
5) A large range of reagents is available to choose from.
6) Potential pitfall: does the modification cause distal conformational changes rather than specific blocking of the active site?
7) Full-time qualified staff are needed to perform the modification.
8) Close monitoring is needed during the modification process.

3.2.4 Applications of the Technology

Proteins can be modified by various methods to alter their structure or properties. Common modifications include cross-linking, fragmenting, denaturing, reducing disulfides, and attaching various prosthetic groups (e.g. PEGylation) to a protein. Protein labelling is another common modification: proteins can be labelled with biotin, fluorophores, enzymes or radioiodine. [24]

1) Amino Acid Side Chain Modification Agents [24]

These are reagents used to block amino acid side chains on proteins, change their charge, or convert them to functional groups favourable for cross-linking and labelling. [24]

One example is the antitumor protein neocarzinostatin (NCS), isolated from Streptomyces carzinostaticus, a single chain polypeptide with 109 amino acid residues. Complete acylation of the amino groups (alanine-1 and lysine-20) was observed when NCS was allowed to react with 3-(4-hydroxyphenyl)-propionic acid N-hydroxysuccinimide ester at pH 8.5. Since bis[(alanine-1, lysine-20)-3-(4-hydroxyphenyl)]-propionamide NCS was fully active in antibacterial potency and in the inhibition of growth of leukemic (CCRF-CEM) cells in vitro, it appears that the two amino groups in the protein are not essential for biological activity. Radiolabeled NCS was prepared by using a tritiated or 125I-labeled acylating agent. Since the CD spectra of native and bis(alanine-1, lysine-20)-amino modified NCS were indistinguishable, there is presumably no change in the native conformation of the protein due to acylation. Reaction of NCS with ammonium chloride in the presence of 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide at pH 4.75 converted all 10 carboxyl groups into carboxamides and produced a protein derivative of basic character. This modification caused a change in the native conformation of the protein accompanied by a loss in biological inhibitory activities. [26]

2) Chaotropes, Protein Denaturants [24]

Chaotropes such as urea or guanidine disrupt water interactions and help solubilise hydrophobic proteins and peptides. They also act as general protein denaturants, unfolding proteins and altering their three-dimensional structure. [24]

3) Cross-linking Reagents [24]

Chemical cross-linking agents are used to determine near-neighbour relationships, analyse the three-dimensional structures of proteins and complexes, prepare antibody-enzyme conjugates, immobilize molecules and conjugate haptens to carrier proteins. [24]

4) PEGylation Reagents [24]

These reagents PEGylate peptides and proteins via available amine or sulfhydryl groups. PEGylation increases the solubility and stability of peptides and proteins and reduces their immunogenicity. [24]

5) Post Translational Modification (PTM) [24]

Post-translationally modified proteins such as glycoproteins, ubiquitin-modified proteins and phosphopeptides can be isolated using simple and efficient commercial kits. [24]

Fig 3.2.1 Post Translational Modification (Source: Wikipedia) [28]

PTMs involving addition include:
• acetylation, the addition of an acetyl group, usually at the N-terminus of the protein [28]
• alkylation, the addition of an alkyl group (e.g. methyl, ethyl) [28]
  o methylation, the addition of a methyl group, usually at lysine or arginine residues
• biotinylation, acylation of conserved lysine residues with a biotin appendage
• glutamylation, covalent linkage of glutamic acid residues to tubulin and some other proteins [28]
• glycylation, covalent linkage of one to more than 40 glycine residues to the tubulin C-terminal tail [28]
• glycosylation, the addition of a glycosyl group to either asparagine, hydroxylysine, serine, or threonine, resulting in a glycoprotein
• isoprenylation, the addition of an isoprenoid group (e.g. farnesol and geranylgeraniol) [28]
• lipoylation, attachment of a lipoate functionality [28]
• phosphopantetheinylation, the addition of a 4'-phosphopantetheinyl moiety from coenzyme A, as in fatty acid, polyketide, non-ribosomal peptide and leucine biosynthesis
• phosphorylation, the addition of a phosphate group, usually to serine, tyrosine, threonine or histidine [28]
• sulfation, the addition of a sulfate group to a tyrosine [28]
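Each of the additions listed above changes the mass of the protein or peptide by a fixed amount, which is one way such modifications can be recognised in a mass spectrum. The following is a minimal sketch in Python, assuming standard monoisotopic mass shifts for four of the listed modifications; the function name `modified_mass` and the example peptide mass are hypothetical and chosen only for illustration.

```python
# Monoisotopic mass shifts (Da) for a few common additions.
# These are standard values; the selection of entries is illustrative only.
PTM_MASS_SHIFT = {
    "acetylation": 42.010565,      # + C2H2O
    "methylation": 14.015650,      # + CH2
    "phosphorylation": 79.966331,  # + HPO3
    "sulfation": 79.956815,        # + SO3
}


def modified_mass(unmodified_mass, modifications):
    """Return the mass of a peptide after adding the named modifications."""
    return unmodified_mass + sum(PTM_MASS_SHIFT[name] for name in modifications)


if __name__ == "__main__":
    # Hypothetical peptide of 1000.0 Da carrying one phosphate group.
    print(round(modified_mass(1000.0, ["phosphorylation"]), 6))
```

Note that phosphorylation and sulfation differ by only about 0.01 Da, so distinguishing them by mass alone requires a high-accuracy instrument.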

As shown in Fig 3.2.1 for insulin, after translation the polypeptide undergoes folding, oxidation and signal peptide cleavage, followed by ER export, Golgi transport and vesicle packaging; protease cleavage then liberates the C-peptide to produce mature insulin.

6) Proteases [24]

Proteases are used for enzymatic cleavage of proteins to facilitate sequencing, amino acid analysis, structural analysis and various other applications. Proteases immobilized on agarose are available for easy removal of the protease following digestion. [24] Certain proteases have been used in food processing for centuries, and any record of the discovery of their activity has been lost. [29] Rennet (mainly chymosin), obtained from the fourth stomach (abomasum) of unweaned calves, has traditionally been used in the production of cheese. [29] Proteases may be used at various pH values, and they may be highly specific in their choice of cleavable peptide links or quite non-specific. Proteolysis generally increases the solubility of proteins at their isoelectric points. [29]

Fig 3.2.2 From Milk to Cheese with Protease Applications (source: [29])

7) Reducing Agents [24]

Solution and solid-phase reducing agents are available for disulfide-containing peptides and proteins. [24] Much stronger reducing agents also exist, for example Raney nickel, which reduces cysteine to alanine, and diborane and alkylboranes, which can reduce carboxyl groups to CH2OH. These reductions decrease the size of the side chain. Unfortunately the reaction tends to be rather incomplete, and it is not commonly used. [3]

3.2.5 Relevant Web Sites

These websites are useful for "Chemical Modification of Proteins":
1) Some Recent Developments in the Chemical Modification of Proteins with a Note on Applications to Biopharmaceuticals.
2) Recent paper
3) ExPASy Proteomics tools
4) Lab velocity
5) Biozon
6) Better Health: health and medical information for consumers, quality assured by the Victorian government (Australia).
7) Wikipedia
8) Protein Science
9) Protein Data Bank (PDB)
10) Protein Society
11) Organelles proteomics: turning inventories into insights.

3.2.6 Key Industry Suppliers (if any)

The key industry suppliers are listed below:
(1) Toronto Research Chemicals
(2) Pierce Chemical Company
(3) Sigma-Aldrich
(4) Pierce net Protein Chemistry
(5) Protein Tech group
(6) Invitrogen (Australia)
(7) Biotechnique

3.2.7 References

1. Means, Gary E., Feeney, Robert E. (1971), Chemical Modification of Proteins. Holden-Day Press, San Francisco, CA, USA.
2. Molecular Genetics, Protein Modification (2006).
3. Protein chemical Modification.
4. Protein Chemistry.
5. Ralph A. Bradshaw, Some Recent Developments in the Chemical Modification of Proteins with a Note on Applications to Biopharmaceuticals.
6. Irwin, W.A., Gaspers, L.D., and Thomas, J.A. (2002), Inhibition of the Mitochondrial Permeability Transition by Aldehydes. Biochem. Biophys. Res. Commun. 291, 215-219.
7. Involvement of conserved histidine, lysine and tyrosine residues in the mechanism of DNA cleavage by the caspase-3 activated DNase CAD. Nucleic Acids Research 30, 1325-1332.
8. Methods in Molecular Biology: Protein Stability and Folding, Theory and Practice. Vol. 40, Bret A. Shirley, ed. 1995.
9. Scopes, R.K. (1982), Protein Purification: Principles and Practice, pp. 185-193. Springer-Verlag, New York.
10. Toronto Research Chemicals.
11. Pierce Chemical Company.
12. Sigma-Aldrich.
13. Wu, X., Chen, S.G., Petrash, J.M., and Monnier, V.M. (2002), Alteration of Substrate Selectivity through Mutation of Two Arginine Residues in the Binding Site of Amadoriase II from Aspergillus sp. Biochemistry 41, 4453-4458.
14. Gärtner, E.M., Liebold, K., Legrum, B., Fasold, H., and Passow, H. (1997), Three different actions of phenylglyoxal on band 3 protein-mediated anion transport across the red cell membrane. Biochimica et Biophysica Acta 1323, 208-222.
15. Irwin, W.A., Gaspers, L.D., and Thomas, J.A. (2002), Inhibition of the Mitochondrial Permeability Transition by Aldehydes. Biochem. Biophys. Res. Commun. 291, 215-219.
16. Kučera, I. (2003), Passive penetration of nitrate through the plasma membrane of Paracoccus denitrificans and its potentiation by the lipophilic tetraphenylphosphonium cation. Biochim. Biophys. Acta 1557, 119-124.
17. Lundblad, R.L. (1994), Techniques in Protein Modification, Chapter 6, The Modification of Cystine, pp. 91-96. CRC Press, Boca Raton, Florida, USA.
18. Yano, H., Kuroda, S., and Buchanan, B.B. (2002), Disulfide proteome in the analysis of protein function and structure. Proteomics 2, 1090-1096.
19. Loo, T.W., Bartlett, M.C., and Clarke, D.M. (2003), Substrate-induced conformational changes in the transmembrane segments of human P-glycoprotein. Direct evidence for the substrate-induced fit mechanism for drug binding. J. Biol. Chem. 278, 13603-13606.
20. van der Sluis, E.O., Nouwen, N., and Driessen, A.J.M. (2002), SecY-SecY and SecY-SecG contacts revealed by site-specific crosslinking. FEBS Letters 527, 159-165.
21. Dage, J.L., Sun, H., and Halsall, H.B. (1998), Determination of diethylpyrocarbonate-modified amino residues in α1-acid glycoprotein by high-performance liquid chromatography electrospray ionization-mass spectrometry and matrix-assisted laser desorption/ionization time-of-flight-mass spectrometry. Analytical Biochemistry 257, 176-185.
22. Alderton, A.L., Faustman, C., Liebler, D.C., and Hill, D.W. (2003), Induction of redox instability of bovine myoglobin by adduction with 4-hydroxy-2-nonenal. Biochemistry 42, 4398-4405.
23.
24. Pierce net Protein Chemistry.
25. Smith, J.J., Conrad, D.W., Cuneo, M.J., and Hellinga, H.W. (2005), Orthogonal site-specific protein modification by engineering reversible thiol protection mechanisms. Protein Science 14, 64-73.
26. Samy, T.S. (1977), Neocarzinostatin: effect of modification of side chain amino and carboxyl groups on chemical and biological properties. Biochemistry 16(25), 5573-5578.
27. Frisch, B., Boeckler, C., and Schuber, F. (1996), Synthesis of short polyoxyethylene-based heterobifunctional cross-linking reagents. Application to the coupling of peptides to liposomes. Bioconjug. Chem. 7(2), 180-186.
28. Wikipedia.
29. Application of Proteases in food industry (2006).

Chapter 4.1 MALDI-TOF Mass Spectrometry
Michael Christopher

4.1.1 Introduction

Mass spectrometry is a leading technique for the accurate determination of the molecular weight of a molecule by measuring its mass-to-charge (m/z) ratio. Ionized molecules are generated by inducing either the loss or gain of a charge from a neutral species [1]. Once formed, ionized molecules are electrostatically directed into a mass analyzer where they are separated according to their m/z ratio, and finally detected.

Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) was first introduced in 1988 by Tanaka et al. [2], and Karas and Hillenkamp [3]. It has since become a widespread analytical tool for the identification of peptides, proteins, glycoproteins, and other biomolecules (carbohydrates, lipids, oligonucleotides, and natural products) [1]. The efficient and directed energy transfer during a matrix-assisted laser-induced desorption event provides high ion yields of the intact analyte with subpicomole sensitivity, and enables the mass analysis of complex biological samples such as proteolytic digests (Table 4.1.1). MALDI allows for easy preparation and rapid analysis of multiple samples at the same time (using a direct insertion multisample plate), with sample amounts in the femtomole to picomole range. MALDI predominantly generates singly charged ions, making it easier to identify intact molecules [1].
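Because the analyzer measures m/z rather than mass, it can help to see how the observed value relates to the neutral mass. The following is a minimal sketch in Python, assuming that ionization occurs by addition of protons (the usual case for peptides in positive-ion mode); the function name and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_of_protonated_ion(neutral_mass, charge=1):
    """m/z of an [M + zH]z+ ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    # Hypothetical analyte with a neutral monoisotopic mass of 1000.0 Da.
    for z in (1, 2):
        print(z, round(mz_of_protonated_ion(1000.0, z), 6))
```

For a singly charged ion the m/z value is simply the neutral mass plus one proton, which is why predominantly singly charged MALDI spectra are comparatively easy to interpret.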



Advantages
• Practical mass range of up to 300 kDa. Species of much greater mass have been observed using a high current detector
• Typical sensitivity in the order of low femtomole to low picomole. Attomole sensitivity is possible
• Soft ionization with little to no fragmentation observed
• Tolerance of salts in millimolar concentrations
• Suitable for the analysis of complex mixtures

Disadvantages
• Matrix background, which can be a problem for compounds below a mass of 700 Da. This background interference is highly dependent on the matrix material
• Possibility of photodegradation by laser desorption/ionization
• Acidic matrix used in MALDI may cause degradation on some compounds

Table 4.1.1 Advantages and disadvantages of MALDI Source: Siuzdak, G. (2003) The Expanding Role of Mass Spectrometry in Biotechnology, p. 26. MCC Press, San Diego, CA

It is generally believed that MALDI causes the ionization and transfer of a sample from the condensed phase to the gas phase via laser excitation and vaporization of the solid sample matrix (Figure 4.1.1). In MALDI analysis, the analyte is first co-crystallized with a large molar excess of a matrix compound, usually a UV-absorbing, nonvolatile, weak solid organic acid such as α-cyano-4-hydroxycinnamic acid (CHCA) [1]. The biomolecules are isolated from each other within the matrix, which reduces the strong intermolecular forces that exist between them. Irradiation of this analyte-matrix mixture with UV laser light (usually 337 nm) results in the vaporization of the matrix: the matrix molecules absorb laser light (photon) energy and convert it into excitation (vibrational) energy. This vibrational energy is then transferred to the co-crystallized biomolecules, causing them to vaporize. However, since the biomolecules are physically shielded by the matrix molecules, they do not directly absorb energy from the laser, which minimises sample damage due to irradiation. The photoexcitation/photoionization of the matrix molecules results in proton transfer to the analyte molecules. Once in the gas phase, the desorbed charged molecules are then directed electrostatically (by a high voltage of ~20-25 kV) from the MALDI ionization source into the mass analyzer [1].

Figure 4.1.1 MALDI-TOF Mass Spectrometer Sample Ionization Source:

Linear time-of-flight (TOF) mass analyzers measure the precise time it takes for an accelerated ion to traverse a high vacuum (~10^-6 torr) field-free drift zone to a time-of-flight detector (Figure 4.1.2). The pulsed nature of MALDI (ions are generated in short, nanosecond pulses) is well suited to TOF analyzers, since the timing of the ion's flight can be started with each pulse of the laser and completed when the ion reaches the detector [1]. With MALDI-TOF mass analyzers, all the ions are given the same amount of energy through an accelerating potential [1]. Because the ions have the same energy but different masses, the lighter ions reach the detector first because of their greater velocity, while the heavier ions take longer due to their lower velocity. In essence, the time an ionized molecule takes to arrive at the detector depends on the mass, charge, and kinetic energy of the ion [1]. Linear MALDI-TOF mass analyzers can routinely analyze intact proteins and large peptides over a mass range of 0.7 to 200 kDa, with an accuracy range of 0.01-0.1%.
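The dependence of flight time on mass and charge described above follows from equating the energy gained in the accelerating potential, zeV, with the kinetic energy, (1/2)mv^2, which gives t = L * sqrt(m / (2zeV)) for a field-free drift length L. The following is a minimal sketch in Python of that idealised relationship; the 1.0 m drift length is an assumed value for illustration, the 20 kV potential is taken from the range quoted earlier, and the time spent in the source region is neglected.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON = 1.66053906660e-27           # kg


def flight_time(mass_da, charge, accel_voltage, drift_length_m):
    """Idealised drift time (s) of an ion in a linear TOF analyzer.

    Assumes every ion gains kinetic energy z*e*V before entering a
    field-free drift tube; source-region time and initial velocity
    spread are ignored.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * accel_voltage / mass_kg)
    return drift_length_m / velocity


if __name__ == "__main__":
    # Assumed instrument geometry: 1.0 m drift tube, 20 kV acceleration.
    for mass in (1000.0, 4000.0, 16000.0):
        t = flight_time(mass, 1, 20000.0, 1.0)
        print(f"{mass:8.0f} Da -> {t * 1e6:.1f} microseconds")
```

Under these assumptions a fourfold increase in mass doubles the flight time, so the square of the measured time scales linearly with m/z once the instrument has been calibrated.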

Figure 4.1.2 Schematic of a linear MALDI-TOF Mass spectrometer Source:

The MALDI-TOF reflectron mass analyzer combines linear TOF technology with an electrostatic mirror (reflectron) [1]. The ions enter the source region and are accelerated toward the reflectron. The ions separate in time based on their relative m/z ratio, reverse their path in the reflectron, and impact the time-of-flight reflectron detector. The reflectron offers higher resolution over a linear TOF mass analyzer by increasing the amount of time that ions take to reach the detector while reducing (focussing) their kinetic energy distribution (i.e. equalising the different kinetic energies of ions that arise during ionisation and acceleration). MALDI-TOF reflectron mass analyzers can analyze small proteins, peptides, hormones and small molecules up to 10 kDa, can distinguish monoisotopic masses, and have an accuracy range of 1-10 ppm [1].
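Mass accuracy figures such as the 1-10 ppm quoted above are easier to interpret when converted into absolute mass windows. The following is a minimal sketch in Python, assuming the usual definitions of ppm error (relative to the theoretical mass) and of resolving power as m/z divided by the peak width at half maximum; the function names and the example masses are hypothetical.

```python
def ppm_error(measured_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def mass_window(theoretical_mass, ppm):
    """Absolute tolerance (Da) corresponding to a ppm tolerance."""
    return theoretical_mass * ppm / 1e6


def resolving_power(mz, peak_width_fwhm):
    """Resolving power as m/z divided by peak width at half maximum."""
    return mz / peak_width_fwhm


if __name__ == "__main__":
    # Hypothetical peptide with a theoretical mass of 1500.0 Da.
    print(mass_window(1500.0, 10))                  # tolerance in Da at 10 ppm
    print(round(ppm_error(1500.015, 1500.0), 3))    # error of a measured mass
    print(round(resolving_power(1500.0, 0.15), 1))  # from an assumed peak width
```

At 1500 Da, a 10 ppm tolerance corresponds to a window of only 0.015 Da, which illustrates why this level of accuracy is sufficient to distinguish monoisotopic peaks.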

MALDI-TOF (for proteins) and MALDI-TOF reflectron (for peptides) mass analyzers have resolving powers in the order of 400 and 10,000 respectively. The resolution and accuracy also depend on the presence of an internal standard, the size and type of peptide or protein, sample purity and preparation, and the selection of matrix material [1]. Even though MALDI is known to be more tolerant of salts, buffers, and impurities, sample cleanup procedures (e.g. ZipTip™ or cold water washing) are still useful [1]. The features of a peptide that affect its detection by MALDI-TOF MS include its hydrophobicity (it tends to adhere to a solid matrix), ionization efficiency, mass, and basicity (arginine-containing peptides generally produce signals that are 2 to 20-fold stronger than lysine-containing peptides).

A type of tandem analysis is also possible with MALDI-TOF reflectron mass analyzers. Tandem analysis (MS/MS) is the ability of the mass analyzer to separate an ion, generate fragment ions from the original (selected) ion, and then analyze the fragment ions [1]. MS/MS is accomplished by taking advantage of the MALDI fragmentation that occurs following ionization, known as post-source decay (PSD). Post-ionization fragment ions from the same precursor ion (peptide) have different kinetic energies, and the reflectron separates them based on this property, thereby producing a fragment ion spectrum [1]. The fragmentation patterns obtained from MALDI-TOF reflectron tandem MS experiments on proteolytically digested or chemically cleaved target protein(s) yield specific partial sequence information (i.e. two peptides with identical amino acid contents but different sequences will exhibit different fragmentation patterns) and structural information (by performing successive MS experiments on a number of generations of fragment ions) [1].

4.1.2 Recent Advances

The design of modern mass analyzers has changed significantly in the last five years to interface with MALDI and electrospray ionization (ESI), now offering much higher accuracy and resolution, increased sensitivity, broader mass range and the ability to give structural information. This has revolutionized biomolecular analyses, allowing the measurement of a wide range of biomolecular ions with ppm mass accuracy and subfemtomole sensitivity [1].

An innovation that has had a dramatic effect on increasing the resolving power of MALDI-TOF instruments has been delayed extraction (DE), the process of cooling and focussing the ions immediately after the MALDI ionization event [1]. In traditional MALDI instruments, the ions were accelerated out of the ionization source immediately as they were formed (continuous extraction). With DE, however, the ions are allowed to "cool" for ~150 nanoseconds before being accelerated to the analyzer. This cooling period generates a set of ions with a much smaller kinetic energy distribution, thereby resulting in a dramatic improvement in resolution and accuracy for biomolecules of less than 30 kDa [1].

In addition, the automation of multisample probe preparation and sample analyses with MALDI mass analyzers is becoming increasingly important in proteomics and combinatorial chemistry [1]. Automation has made it possible to rapidly identify species of interest by profiling changes in the expression levels of thousands of proteins. Automated liquid handling robots have been developed that perform all the sample preparation steps for peptide mapping experiments, including gel destaining, alkylation/reduction, in-gel digestion, peptide extraction, and MALDI target plating [1]. Commercial MALDI-TOF systems are currently available that can perform over 1,000 mapping experiments in just 12 hours. These systems are able to perform automated calibrations, vary laser position energies, and adjust laser firing location to maximize signal intensity [1]. Similarly, automated data processing systems can recognize suitable signals, identify monoisotopic peaks, and submit summary peak lists directly to a search engine for rapid identification.

The state-of-the-art ultraflex III MALDI-TOF and MALDI-TOF/TOF MS (produced by Bruker Daltonics in 2006) incorporates next-generation technology, including smartbeam™ laser technology for MALDI in vitro molecular imaging of peptide and protein biomarker distributions in tissue sections; panoramic PAN™ technology for high mass resolution across a broad mass range; unique T3-sequencing capabilities, allowing top-down sequence analysis on many intact proteins; tunable laser speed from 1-200 Hz and adjustable focus size from 10 to 80 µm for high-throughput analysis and sensitive detection of labile protein modifications; and a fully integrated workflow for liquid chromatography (LC)-MALDI experiments. The MALDI-TOF/TOF also provides two MS/MS methods for de novo sequencing: LID-LIFT TOF/TOF MS for highest specificity, and high-energy collision-induced decomposition (CID) for in-depth analysis of selected proteins [4].

4.1.3 Evaluation of the Technology

MALDI and ESI have clearly evolved to be the ionization sources of choice when it comes to biomolecular analysis [1]. MALDI has the ability to analyze complex mixtures with less signal suppression than ESI, making it extremely useful for analyzing biological samples such as protein digests (see Table 4.1.2) [1].
Ionization source: Electrospray ionization (ESI)
  Typical mass range (Da): 70000
  Matrix interference: none
  Degradation: none
  Complex mixtures: somewhat limited
  LC/MS amenable: excellent
  Sensitivity: high femtomole to low picomole
  Comments: Excellent LC/MS tool; low salt tolerance (low millimolar); multiple charging useful, but significant suppression with mixtures occurs; low tolerance of mixtures; soft ionization (little fragmentation observed)

Ionization source: NanoESI
  Typical mass range (Da): 70000
  Matrix interference: none
  Degradation: none
  Complex mixtures: somewhat limited but better than ESI
  LC/MS amenable: OK but low flow rates can present a problem
  Sensitivity: high zeptomole to low femtomole
  Comments: Very sensitive and very low flow rates; applicable to LC/MS, but low flow rates require specialized systems; has reasonable salt tolerance (low millimolar); multiple charging useful but significant suppression can occur with mixtures; reasonable tolerance of mixtures; soft ionization (little fragmentation observed)

Ionization source: MALDI
  Typical mass range (Da): 300000
  Matrix interference: yes
  Degradation: photodegradation and matrix reactions
  Complex mixtures: good for complex mixtures
  LC/MS amenable: possible
  Sensitivity: low to high femtomole
  Comments: Somewhat tolerant of salts; excellent sensitivity; matrix background can be a problem for low mass ions; soft ionization (little fragmentation observed); photodegradation possible; suitable for complex mixtures and small molecules

Ionization source: Desorption/Ionization on silicon (DIOS)
  Typical mass range (Da): 3000
  Matrix interference: none
  Degradation: photodegradation
  Complex mixtures: good for complex mixtures
  LC/MS amenable: possible
  Sensitivity: low to high femtomole
  Comments: Somewhat tolerant of salts; excellent sensitivity; soft ionization (little fragmentation observed); photodegradation possible; suitable for complex mixtures and small molecules

Table 4.1.2 General comparison of Ionization sources Source: Siuzdak, G. (2003) The Expanding Role of Mass Spectrometry in Biotechnology, pp. 35-36. MCC Press, San Diego, CA

However, complete and routine sequence determination through mass analysis of intact proteins and complex biological samples has yet to be realized due to significant signal suppression [1]. Even the tryptic digestion of a mixture containing 3-5 proteins will result in a peptide mixture complex enough to cause considerable ionization signal suppression [1]. Thus, biological samples of proteins (or peptides in a proteolytic digest) must first be separated by gel electrophoresis (SDS-PAGE separation on a 1D or 2D gel), liquid chromatography or capillary electrophoresis prior to mass analysis. At present, the method of choice for preparing a biological sample for mass analysis on a MALDI-TOF reflectron is 1D or 2D gel electrophoresis (which separates intact proteins), followed by proteolytic digestion [1]. By contrast, liquid chromatography (LC)-based methodologies (e.g. HPLC) fractionate the peptide mixtures before analysis, thus decreasing signal suppression and improving the analysis of any given peptide [1]. One of the most popular means of performing peptide LC-MS/MS involves the direct coupling of the LC to an ion trap MS through an ESI interface. The advantages and disadvantages of protein identification with MALDI-TOF reflectron and LC-MS/MS are shown in Table 4.1.3.

MALDI-TOF reflectron
Advantages
• Very fast
• Widely available
• Easy to perform analysis
• High accuracy (10-50 ppm) adds reliability to data
• Useful for wide range of proteins
Disadvantages
• Problematic for mixtures of proteins
• Typically less coverage than LC-MS/MS approach

LC-MS/MS with an ion trap analyzer
Advantages
• In addition to molecular mass data, tandem MS measurements are performed in real time
• MS/MS information adds additional levels of confirmation
• Multiple proteins can be analyzed simultaneously with simple reversed-phase LC run
• Useful for PTM identification
• High coverage of proteins (30% to 90%) depending on the protein
Disadvantages
• Computationally intensive; large database searches can take hours to days; relatively slow

Table 4.1.3 Protein Identification with MALDI and LC-MS/MS Source: Siuzdak, G. (2003) The Expanding Role of Mass Spectrometry in Biotechnology, p. 131. MCC Press, San Diego, CA

4.1.4 Applications of the Technology

At the Australian Proteome Analysis Facility (APAF), the identification of protein(s) involves digesting the protein with trypsin, and analysing the digested protein with a MALDI-TOF MS to produce a peptide mass fingerprint [5]. The experimentally measured monoisotopic masses of the peptides seen in the mass spectrum are selected using software, and then compared to all the theoretically predicted tryptic peptide digests from a database containing hundreds of thousands of proteins [6,7] using computer search programs such as Mascot, MS-Fit, Aldente and ProFound (see section 4.1.5) [1, 5]. This application is recommended for the identification of known purified proteins, and has the advantages of rapid analysis, high sensitivity, and suitability for large numbers of samples. However, the protein of interest must already be in the protein database, the method is generally not suitable for proteins < 15 kDa or for the identification of post-translational modifications, and the match is based on peptide masses, not sequence information [5].

Once a peptide mass fingerprint has been generated via MALDI-TOF MS as described above, the most abundant peptide ions can then be subjected to MALDI-TOF/TOF analysis, providing added information that can be used to determine the protein sequence [5]. The results of both types of analyses are combined, and search engines such as Mascot are used against protein, DNA or EST databases to identify the protein of interest [1, 5]. This application is recommended for the identification of known purified proteins requiring a higher level of confidence than with MALDI-TOF alone. It is able to identify 2-3 proteins in the same spot and allows the identification of small proteins < 15 kDa based on sequence information [5].

Other major applications for MALDI-TOF technology, with specific examples, include:

1) the identification of distinct protein profiles that contribute to subtypes and facilitate classification of certain cancers (e.g. acute leukemia) for future treatment and prognosis [8]. In this study, the proteins of leukemic cells from 61 cases of acute leukemia characterized by French-American-British (FAB) classification were separated by 2-D gel electrophoresis, and the differentially expressed protein spots were identified by MALDI-TOF MS. Distinct protein profiles of acute leukemia FAB types or subtypes, including acute myeloid leukemia (AML) and its subtypes M2, M3 and M5, and acute lymphoid leukemia (ALL), were identified. Myeloid-related proteins 8 and 14 (involved in AML differentiation) were highly expressed in the M2 and M3 subtypes, and heat shock 27 kDa protein 1 and other proteins were highly expressed in ALL, making it possible to clinically distinguish AML from ALL [8].
2) the identification of a complete proteomic profile of snake venom (a complex mixture of proteins and peptides) for potential medical treatments [9]. In this study, the venom proteomic profiles from two snakes common to southern China, the cobra and the viper, were assessed using four different approaches, including gel filtration and 2D gel electrophoresis plus MALDI-TOF MS. The novel identification of 124 and 74 proteins and peptides in the cobra and viper venom respectively was reported. Functional analyses revealed that cobra venom has a high abundance of cardio- and neurotoxins, whereas viper venom contains a significant amount of haemotoxins and metalloproteinases. Only 50% of gel spots were confirmed to be venom proteins, probably due to incomplete protein databases; the results also suggest that post-translational modifications may be a significant characteristic of venom proteins [9].

3) the accurate, sensitive and rapid identification of multiple bacterial strains relevant to public health and food safety [10]. In this study, multiple strains of six species of Campylobacter isolated from animal, clinical, or food samples were analyzed by MALDI-TOF MS. Whole bacterial cells were harvested from colonies or confluent growth, transferred directly into solvent and then onto a spot of dried 3-methoxy-4-hydroxycinnamic acid (matrix). "Species-identifying" biomarker ions (SIBI) were evident from analyses of multiple reference strains for each of the six species. MALDI-TOF MS analysis of 75 additional Campylobacter strains isolated from humans, poultry, swine, dogs and cats revealed (i) associations of SIBI with source, (ii) strains previously speciated incorrectly, and (iii) "strains" composed of more than one species [10].
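The peptide mass fingerprinting workflow described at the start of this section (tryptic digestion, measurement of peptide masses, comparison with predicted digests) can be sketched in a few lines. The following Python example is a simplified illustration only: it assumes trypsin cleaves after lysine or arginine except before proline, allows no missed cleavages or modifications, and ranks candidates by a simple count of matching masses. It is not the scoring algorithm used by Mascot, MS-Fit, Aldente or ProFound, and the two database sequences are invented for the example.

```python
WATER = 18.010565   # Da, added once per peptide
PROTON = 1.007276   # Da, for singly protonated [M+H]+ ions

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed_masses, sequence, ppm=50.0):
    """Number of observed masses within `ppm` of any predicted tryptic peptide."""
    predicted = [mh_plus(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= ppm for theo in predicted)
        for obs in observed_masses
    )


if __name__ == "__main__":
    # Invented two-entry "database" and a simulated peak list.
    database = {"protein_A": "AGKPLRMYK", "protein_B": "GGGKAAAR"}
    observed = [mh_plus("MYK"), mh_plus("AGKPLR") + 0.01]
    ranking = sorted(database, key=lambda name: count_matches(observed, database[name]),
                     reverse=True)
    print(ranking)
```

Because the match rests only on peptide masses, a protein that is absent from the database cannot be identified this way, which is the limitation noted above for peptide mass fingerprinting.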


Relevant web sites

The following is a list of Australia’s leading research/medical Mass Spectrometry Facilities that use MALDI-TOF and MALDI-TOF/TOF mass analyzers in areas including protein identification, the investigation of protein expression, protein or peptide biomarker discovery in various illnesses, and potential therapeutic discovery from both plants and animals:
1) The Australian Proteome Analysis Facility (APAF Ltd);
2) The Australian National University (ANU) Research School of Biological Sciences;
3) The University of Queensland Cellular Proteomics Mass Spectrometry Facility (MCPMSF) within the Institute for Molecular Bioscience (IMB);
4) The RMIT School of Applied Sciences Mass Spectrometer Facility; and
5) The Ludwig Institute for Cancer Research (LICR).

Some useful learning web resources linked to these Mass Spectrometer Facilities and to MALDI-TOF and MALDI-TOF/TOF mass analyzers include Wikipedia, the free encyclopedia; resources for general articles, tutorials, MS tools, and links to MS literature; and Bioscienceworld, insights for the life sciences industry.

In addition, some computer search engines used to search MALDI-TOF peptide mass fingerprint data against protein databases for the identification of proteins include MASCOT, MS-Fit, ALDENTE, and ProFound.

Key Industry Suppliers


The following is a list of the major suppliers of MS instruments, particularly MALDI-TOF and MALDI-TOF/TOF mass analyzers:
Bruker Daltonics
Applied Biosystems
Shimadzu Biotech
Agilent Technologies
Waters
Thermo Electron Corporation

4.1.7 References

1. Siuzdak, G. (2003) The Expanding role of Mass Spectrometry in Biotechnology, MCC Press, San Diego, CA.
2. Tanaka, K., Waki, H., Ido, Y., Akita, S., Yoshida, Y., Yoshida, T. (1988) Protein and polymer analysis up to m/z 100.000 by laser desorption time-of-flight mass spectrometry. Rapid Commun Mass Spectrom. 2, 151-153.
3. Karas, M. & Hillenkamp, F. (1988) Laser desorption ionization of proteins with molecular masses exceeding 10.000 daltons. Anal Chem. 60, 2299-2301.
4. Bruker Daltonics. (2006) Ultraflex III & ultraflex III TOF/TOF. Ultimate Performance MALDI-TOF & TOF/TOF Systems.
5. The Australian Proteome Analysis Facility (2006) MALDI MS Analysis for Protein Identifications. Protein Identification by MALDI-TOF (PMF).
6. The NCBInr protein database. (2006) A non-redundant database compiled by the National Center for Biotechnology (NCBI) at the National Institute of Health (United States).
7. The SWISS-PROT protein database. (2006) The Australian mirror of the curated protein sequence database compiled by the Swiss Institute of Bioinformatics and the European Bioinformatics Institute.
8. Cui, J., Wang, J., He, K., Jin, B., Wang, H., Li, W., Kang, L., Hu, M., Li, H., Yu, M., Shen, B., Wang, G., Zhang, X. (2004) Proteomic analysis of human acute leukemia cells: insight into their classification. Clin Cancer Res. 10, 6887-6896.
9. Li, S., Wang, J., Zhang, X., Ren, Y., Wang, N., Zhao, K., Chen, X., Zhao, C., Li, X., Shao, J., Yin, J., West, M., Xu, N., Liu, S. (2004) Proteomic characterization of two snake venoms: Naja naja atra and Agkistrodon halys. Biochem J. 384, 119-127.
10. Mandrell, R., Harden, L., Bates, A., Miller, W., Haddon, W., Fagerquist, C. (2005) Speciation of Campylobacter coli, C. jejuni, C. helveticus, C. lari, C. sputorum, and C. upsaliensis by matrix-assisted laser desorption ionization-time of flight mass spectrometry. Appl Environ Microbiol. 71, 6292-6307.

Chapter 5.1

2D Gel Electrophoresis

Paolo Dominic Navidad

5.1.1 Introduction

Two-dimensional (2D) gel electrophoresis is a commonly used method for analyzing complex protein mixtures. It separates a large number of proteins based on two independent physicochemical properties: the isoelectric point and the molecular weight [1]. The sample is separated in two dimensions perpendicular to each other, with each dimension corresponding to one of these properties. The basic steps of 2D gel electrophoresis are [2]:
• sample preparation
• first dimension isoelectric focusing
• second dimension gel electrophoresis
• staining
• image analysis

Sample preparation involves the solubilization and denaturation of the sample proteins. A successful preparation should not cause protein aggregation or chemical modification during the run [3]. Solubilization separates the proteins into their individual components and involves the use of a solubilization/denaturation (SD) buffer. The main components of the buffer are a chaotrope, such as urea or thiourea, to disrupt hydrogen bonds and hydrophobic interactions; a reductant, such as beta-mercaptoethanol, to break disulphide bonds; a detergent, such as Triton X-100, to disrupt membranes and to solubilize lipids; ampholytes, to aid protein solubilization and nucleic acid precipitation; and protease inhibitors [4].

The first dimension of the 2D procedure is isoelectric focusing (IEF), where proteins are separated based on their isoelectric points (pI) by the establishment of a pH gradient in a polyacrylamide gel [1]. The pH gradient is formed by the addition of carrier ampholytes, through the use of an immobilized pH gradient (IPG) strip, or by a combination of the two [4]. Carrier ampholytes are added to the gel and must be prefocused to produce a gradient, while IPG strips are prepared separately from the polyacrylamide gel. An advantage of using IPGs is that reproducibility is improved owing to better mechanical stability and a pH gradient that does not drift during a run [3]. Because of this, IPG-2D gel electrophoresis (IPG-Dalt) is more commonly performed as a flexible method for proteomic analysis [5]. In IPG-Dalt, the samples are first run on IPG strips (Figure 5.1.1), after which the strip is equilibrated with SDS to ensure the proteins all have a negative charge, DTT to reduce disulphide bridges that may have reformed, and iodoacetamide to alkylate the proteins and remove excess DTT [3]. At this point the proteins are separated along the strip according to their respective pI.
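
To make the idea of an isoelectric point concrete, the sketch below estimates the pI of a peptide as the pH at which its net charge is zero, using the Henderson-Hasselbalch equation for each ionizable group and a bisection search. It is an illustration only: the pKa values are one commonly quoted set chosen as an assumption here (published sets differ and give slightly different pI values), the function names are hypothetical, and effects of neighbouring residues or post-translational modifications are ignored.

```python
# Assumed pKa values for ionizable groups (illustrative; published sets vary).
PKA_POSITIVE = {"Nterm": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"Cterm": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Start with the free amino terminus (+) and carboxyl terminus (-).
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE["Nterm"]))
    negative = 1.0 / (1.0 + 10 ** (PKA_NEGATIVE["Cterm"] - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, low=0.0, high=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged: pI lies at lower pH
    return (low + high) / 2.0
```

Under these assumptions, a lysine-rich peptide such as `"KKKK"` focuses at the basic end of the gradient and an aspartate-rich peptide such as `"DDDD"` at the acidic end, which is the behaviour IEF exploits.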

Figure 5.1.1 Individual IPG strips in an electrophoresis unit

The second dimension, SDS-PAGE, can be run on either a vertical or a horizontal setup [6]. In a horizontal system, the equilibrated IPG strips are placed gel side down on the surface of the stacking gel without any embedding procedure, while a vertical system involves embedding IPG strips in agarose on top of the vertical second dimension gel (Figure 5.1.2) [7]. A voltage is then applied to separate the proteins according to their size. As in an ordinary SDS-PAGE procedure, the molecules migrate through the gel at different speeds, with larger molecules moving more slowly than smaller molecules. Horizontal gels produce sharper spots because they are half as thick as vertical gels, but vertical systems allow multiple gels to be run simultaneously in a single electrophoresis tank [5].
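
Because migration in the second dimension depends on size, the molecular weight of a spot is commonly estimated from a calibration curve built with marker proteins: log10(MW) is approximately linear in relative mobility (Rf) over the useful range of a gel. The sketch below fits that line by ordinary least squares. The marker values in the usage note are hypothetical, the function names are illustrative, and the linear relationship is an approximation that breaks down for very large or very small proteins.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(rf, markers):
    """Estimate molecular weight from relative mobility.

    markers: list of (Rf, MW) pairs for proteins of known size; the result
    is in the same units as the marker MW values (for example kDa).
    """
    rfs = [r for r, _ in markers]
    log_mws = [math.log10(mw) for _, mw in markers]
    slope, intercept = fit_line(rfs, log_mws)
    return 10 ** (slope * rf + intercept)
```

For instance, with a hypothetical marker ladder `[(0.10, 97.0), (0.25, 66.0), (0.45, 45.0), (0.65, 31.0), (0.80, 21.5), (0.95, 14.4)]` (Rf, kDa), a spot at Rf 0.5 would be assigned a size between the 45 kDa and 31 kDa markers.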

Figure 5.1.2 Loading an equilibrated IPG strip on top of a polyacrylamide gel

After the separation of the samples, individual spots may be visualized by staining the gel. A variety of staining procedures (some colorimetric, some fluorescent) may be used, each with its own advantages and disadvantages. Silver staining is a highly sensitive procedure and the end results can be compatible with mass spectrometry, but it is time-consuming, labor-intensive, and stains some proteins poorly [5]. Also, spot intensity does not necessarily correlate with protein concentration. Zinc or copper staining produces a negative stain (gel is stained white, proteins are unstained) because zinc/copper does not stain SDS, which coats the proteins [4]. Zinc/copper staining is rapid and inexpensive but cannot be used on thin gels, which provide poor contrast with this procedure. Coomassie Blue is the most commonly used stain for acrylamide gels, and is inexpensive and easy to use. However, a destaining step is required and nonspecific staining, particularly with polysaccharides, sometimes occurs [3]. Fluorescent dyes bind to the SDS that coats the proteins. Fluorescent staining may be done before IEF, after IEF but before SDS-PAGE, or after SDS-PAGE [5]. There are many different kinds of fluorescent stains. Generally speaking, they are highly sensitive, quick and easy to use, and can be used for quantitative purposes. However, some fluorescent dyes are quite expensive and may be incompatible with some plastic-backed gels [3].

After staining, the final step is identification and characterization of the protein spots. Gels are scanned and image analysis can be performed to determine statistically and scientifically significant spots [2]. Image analysis is conducted using specialized software that can perform spot detection, spot matching between gels, and spot quantification and comparison [8]. These spots can then be excised, characterized, and identified by MS. An example of a 2D gel is shown below (Figure 5.1.3).
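
The spot detection and quantification steps mentioned above can be illustrated with a deliberately naive sketch: threshold a scanned gel image, group neighbouring above-threshold pixels into spots with a flood fill, and report each spot's area and integrated intensity ("volume"). This is an assumption-laden toy, not the algorithm of any commercial package; real software also handles background subtraction, overlapping spots, and gel-to-gel matching. The image is represented as a plain list of rows of intensity values, and all names are hypothetical.

```python
def detect_spots(image, threshold):
    """Find connected regions of pixels brighter than threshold.

    image: list of equal-length rows of numeric intensities.
    Returns a list of dicts with the pixel count ("area") and the summed
    intensity ("volume") of each spot, in row-major order of discovery.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    spots = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill over 4-connected neighbours starting at (r, c).
            stack = [(r, c)]
            seen[r][c] = True
            area, volume = 0, 0
            while stack:
                y, x = stack.pop()
                area += 1
                volume += image[y][x]
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            spots.append({"area": area, "volume": volume})
    return spots
```

Comparing the volume of the same spot across gels is the basis of the quantitative comparisons described in this section, which is why consistent thresholds and background handling matter for reproducibility.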

Figure 5.1.3 Silver-stained sample of human embryonic kidney cells (HEK293 cell line)


5.1.2 Recent Advances

IPG strips have improved the visualization of low-abundance proteins and the resolution of protein species in the gel. In a protocol proposed by Herbert and Righetti that used a multicompartment electrolyzer (MCE), loading capacity, detection, and sensitivity were enhanced through sample prefractionation [9]. An MCE is an apparatus with chambers of different pH ranges that are separated by membranes (Figure 5.1.4).

Figure 5.1.4 Schematic diagram of an MCE. Source: Herbert, Righetti. 2000. Electrophoresis

Proteins with similar pI migrate to the same chamber. After prefractionation, 2D PAGE was performed on the samples (E. coli, human plasma extracts). The authors observed that protein precipitation and smearing were reduced. Also, because a larger amount of prefractionated protein can be loaded onto the gel, a larger number of spots can be observed.

Membrane proteins are difficult to recover with 2D gel electrophoresis because they are poorly soluble in the buffers used in IEF [10]. As a result, some species are partially or completely absent from the final gel. Santoni et al. prefractionated Arabidopsis proteins by treatment with Triton X-114, carbonate, and chloroform/methanol, followed by solubilization with a set of zwitterionic detergents [11]. The authors observed improved isolation of membrane proteins for 2D gel electrophoresis.

Fluorescent stains have several advantages over more traditional silver stains. They are more sensitive, easy to use, and have a better dynamic range [10]. SYPRO post-electrophoretic fluorescent stains, in particular SYPRO Ruby (Molecular Probes, Eugene, OR), are some of the more recent stains to be developed. SYPRO Ruby, in addition to being highly sensitive, is also compatible with subsequent protein analysis such as Edman sequencing and MS [12]. Since SYPRO Ruby is expensive, Rabilloud et al. have developed a protocol to produce an alternative stain that performs similarly to SYPRO Ruby [13].

Image analysis is often laborious and slow, and difficulties are encountered especially when performing spot boundary assignment, normalization of the gel background and intensity variation, and spot matching between gels. There are various commercial software packages that perform these tasks, but these still require manual intervention, leading to problems with reproducibility [10]. Smilansky developed a system called Z3 that performed image analysis with improved speed and precision. The algorithm relied on “computation of the registration directly from the raw images, region-based matching, and complementary pseudocolor display” [14].

Westbrook et al. performed 2D gel electrophoresis using combinations of very narrow-range IPG strips, and compared the results to broad-range and narrow-range IPGs [15]. They observed that very narrow-range IPGs are able to separate different protein species and isoforms. With very narrow-range IPGs, they also observed that the number of co-migrating protein species was reduced, leading to more reliable database searches compared with broad- or narrow-range IPGs.

Reproducibility of 2D gel electrophoresis can be improved by using differential in-gel electrophoresis (DIGE) [16]. This technique involves two pools of protein extracts labeled with the fluorescent dyes Cy3 and Cy5. These are then mixed and separated on a single 2D gel. The patterns are visualized by excitation of the dyes, and comparison of the resulting images allows quantitation of the spots. Proteins that exist in both pools migrate to the same spot. DIGE can be a useful tool in comparative proteomics, such as analyzing protein level differences caused by a disease state, drug treatment, or life cycle stage [17].

Laser capture microdissection (LCM) can be used in combination with 2D gel electrophoresis to produce more specific results [18]. LCM is a technique that can be used to separate cells in heterogeneous mixtures. A transfer film is placed on a tissue sample or heterogeneous mixture. A laser beam activates an area in the film where the cells of interest are located, and these cells attach to the film [19]. Because the resulting sample is more or less homogeneous, protein analysis can be improved.

5.1.3 Evaluation of the Technology

2D-GE in its basic form has been in use for quite some time and is usually categorized as part of classical proteomics. Classical proteomics involves a separation step (2D-GE) followed by an identification step, usually MS. 2D-GE is a relatively simple method for mapping differences in protein expression, and is currently the most rapid method for direct targeting of protein expression differences [20]. With some recent advances, 2D-GE has progressed since it was first described by O’Farrell in 1975 [21]. IPGs have improved the resolution of gels, while fluorescent dyes have provided sensitivity in visualization [22]. Computer software is continuously being developed and improved. The use of DIGE has overcome some problems with reproducibility. Because of its simplicity, versatility, and relatively easy visualization, 2D-GE is still in use in spite of being a relatively old technology. In fact, “no other technology allows the ready separation and quantitation and identification of complex protein mixtures” [21].

However, there still remain several disadvantages to this technique. A major limitation is reproducibility. Because no two gels run in exactly the same manner, comparisons between images will produce some discrepancies. Even the use of DIGE cannot completely bypass the problem. The issues of gel-to-gel variation and intrinsic biological variation must be addressed [20]. Gels may be run in triplicate or more, but this entails laborious image analysis.

2D-GE has limited sensitivity. One hundred fifty µg of protein from a total cell extract generally generates around 2000-3500 identifiable spots depending on the dye used, and around 10,000 proteins using narrow pH gradients [21,23]. The total number of proteins in a cell is estimated to be around 30,000. Low-abundance, very high or low molecular weight, very acidic or basic, and/or hydrophobic species are often underrepresented in a 2D gel. Besides the use of sensitive stains, other methods to bypass the problem are sample fractionation, selective extraction of high abundance proteins, and loading large sample sizes in a large gel [21]. However, these gains in sensitivity come at the expense of other factors, such as resolution and the ease of image analysis. Other limitations of this technique include the limited loading capacity of IPG strips, relatively low throughput, and the low linear range of visualization procedures [20].

In addition, image analysis is still a labor-intensive step in 2D-GE. Despite advanced computer programs that aid in this task, full automation has not yet been achieved and manual involvement is still required. This, in addition to creating extra work, may also lead to a loss in reproducibility [10].

There are alternative methods that may be used, but these are generally not as efficient as 2D-GE. Liquid chromatography-mass spectrometry (LC-MS) is a tandem procedure wherein a protein mixture is separated by LC, and “pure” compounds are introduced directly to the mass spectrometer [24]. Therefore, proteins with similar retention characteristics can be differentiated via their mass spectra. However, LC-MS is much less versatile than 2D-GE, and does not have the ability to provide quantitative data [10]. Using LC-MS also presents difficulties in performing differential display analysis [20].

The use of stable isotope-coded affinity tags (ICAT) for protein extraction and tagging is common in quantitative proteomics. Samples are labeled and identified using LC/MS/MS [21]. This technique can be used in differential expression studies of whole proteomes, and its main strength is its ability to provide quantitative data. A major limitation of this technique is that it is unable to label proteins that do not contain cysteine [25]. Since protein digestion is involved, it is difficult to establish the association of post-translationally modified forms of peptides and their assembly into modified proteins [23]. Also, due to non-denaturing conditions, the source of fluorescence must be checked to determine that it is due to the target protein and not to an unwanted protein complex [26].

Still, 2D-GE is considered the gold standard for protein separation prior to MS. It is a proven research tool, valued for its simplicity and for its ability to quantify and visualize a wider range of samples than other techniques can currently achieve.

5.1.4 Applications of the Technology

Proteomics, in general, has a wide range of applications. For 2D-GE specifically, the main applications are protein identification and differential expression [4]. Coupled with MS, it is used for large-scale identification of proteins in a sample. In one study, Xu et al. used 2D-GE and MS to identify soybean leaf proteins [27]. They analyzed 260 spots and compared these against various databases. They discovered that the majority of the identified leaf proteins were involved in energy metabolism. There are clinical applications for this technique as well. Human myocardial proteins were identified using 2D-GE [28]. Using 2D-GE/MS, proteomic profiles of various cells can be generated to provide information not only about protein content, but also about protein activity, interactions, and localization. Hoffrogge et al. performed proteomic analysis on neuronal stem cells, Lominadze et al. analyzed neutrophils, and Martinez-Heredia et al. did work on sperm cells [29,30,31]. Proteomic analysis using 2D-GE can also be used on pathogens. Pereira et al. looked at proteomic profiles of the bacterium H. pylori to investigate possible pathogenic factors [32].

The other major application of 2D gel electrophoresis is in differential proteomics, which compares distinct proteomes such as normal versus diseased cells or diseased versus treated cells [20]. With the advent of DIGE, reproducibility has improved since multiple protein mixtures can be run on a single gel. 2D-GE may be used for identification of biomarkers in drug treatment. Discovery of novel protein biomarkers would aid in drug development and enable detection of disease at an early stage, measurement of disease progress, and monitoring of treatment efficacy [33]. Several studies that used 2D-GE have shown promising results. Torres-Cabala et al. used the technique as one of the steps in identifying possible biomarkers that may distinguish between malignant and benign thyroid lesions [34]. Roessler et al. also used 2D-GE in identifying PSME3 as a possible tumor marker for colorectal cancer, while Li et al. looked for possible breast cancer metastatic markers [35,36].

Differential proteomics using 2D-GE can also be used in studying pathogens. For example, a proteome profile of a resistant organism may be compared with that of a susceptible organism. Pieper et al. utilized this method to study vancomycin-resistant S. aureus strains and discovered an enzyme that was involved in an altered cell wall turnover rate and altered peptidoglycan structure [37]. Foucher et al. performed proteomic analysis on arsenic-resistant T. brucei, which causes sleeping sickness, and discovered a protein that was present in both resistant and susceptible organisms but at different pI [38].

While non-gel-based methods are also being used for quantitative differential proteomics, 2D-GE methods, with complementary technologies that improve sensitivity and expand analytical range, are still considered highly useful and informative tools in proteomic analysis.

5.1.5 Relevant Websites

These are some websites relevant to 2D gel electrophoresis. There are various databases that provide data on proteins and various 2D-PAGE reference maps. Some of these are:
• ExPASy - SWISS-2DPAGE (human, mouse, E. coli, yeast)
• Max Planck Institute for Infection Biology (mostly microbial organisms)
• Argonne National Lab, Protein Mapping Group
• Joint ProteomicS Laboratory
• Danish Centre for Translational Breast Cancer Research
• Siena-2DPAGE
• World-2DPAGE Portal (a portal for 2D PAGE databases)

A tutorial on 2D-GE by Dr. James R. Jefferies can be accessed through the University of Wales at Aberystwyth. Biocompare has also produced an easy-to-understand tutorial that covers the basic concepts of the technique.
• Jefferies tutorial
• Biocompare 2D-GE tutorial

A manual on 2D-GE using IPGs by Angelika Görg is also available online. It covers the basic steps of the procedure and possible variations in the protocol. Technical information on 2D-PAGE can be obtained at the ExPASy site, which contains protocols and a list of chemicals used in the procedure.

A good journal that covers the advances, mainly technical, in the field of 2D-GE is Electrophoresis, while many applications of the technology are published in Proteomics. Issues and articles from both journals are available online.
• Electrophoresis home page
• Proteomics home page

5.1.6 Key Industry Suppliers

Companies that are involved in the 2D-GE industry have two main groups of products: the reagents that are used in sample preparation, gel running, etc., and the various image analysis software packages. Some companies offer both types of products, but there are several bioinformatics companies that develop specialized software.

Invitrogen offers 2D-GE systems and components such as IPG strips, ampholytes, and gels. They also produce SYPRO Ruby fluorescent gel stains and SYPRO photo filters.
• Invitrogen 2D PAGE
• Protein stains for 2D gels

Companies that provide 2D-GE reagents, gels, and equipment include GE Healthcare (formerly Amersham Biosciences), NextGen Sciences, and Bio-Rad Laboratories. Sigma-Aldrich has various stains and IPG strips for use in 2D-GE, while Genetix has the GelPix instrument that can perform spot excision, data tracking, and onboard imaging.
• GE Healthcare 2D Electrophoresis
• NextGen Sciences 2D Electrophoresis
• Bio-Rad Laboratories home page
• Sigma-Aldrich Protein Electrophoresis
• Genetix GelPix

Various 2D-GE programs and software are offered by companies such as GE Healthcare and Bio-Rad Laboratories. There are also companies that specialize exclusively in image analysis software. Shimadzu Biotech offers software such as the Phoretix 2D. Syngene is a well-known company that produces gel documentation and analysis systems. Genomic Solutions has gel imaging and picking systems in addition to image analysis software. The GELLAB II+ is a comprehensive program for gel analysis and is available from Scanalytics Inc. (now part of BD Biosciences).
• Shimadzu Biotech home page
• Syngene home page
• Genomic Solutions Proteomics
• Scanalytics GELLAB II+

5.1.7 References

1. The Maiman Institute for Proteome Research, Tel Aviv University (2006) Two-Dimensional Gel Electrophoresis.
2. Molecular Structure Facility, University of California at Davis (2006) 2-D Gel Electrophoresis.
3. Biocompare (2006) 2D Gel Electrophoresis Tutorial.
4. Jefferies, J.R. (2005) 2D Gel Electrophoresis for Proteomics Tutorial.
5. Görg, A., Obermaier, C., Boguth, G., Harder, A., Scheibe, B., Wildgruber, R. & Weiss, W. (2000) The current state of two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis. 21, 1037-1053.
6. Carrette, O., Burkhard, P.R., Sanchez, J. & Hochstrasser, D.F. (2006) State-of-the-art two-dimensional gel electrophoresis: a key tool of proteomics research. Nat. Protocols. 1, 812-823.
7. Görg, A., Boguth, G., Harder, A., Obermaier, C., Scheibe, B., Wildgruber, R. & Weiss, W. (1998) Two-Dimensional Electrophoresis of Proteins Using Immobilized pH Gradients.
8. Mathematical and Information Sciences, Commonwealth Scientific and Industrial Research Organisation (2005) 2D Gel Image Analysis.
9. Herbert, B. & Righetti, P.G. (2000) A turning point in proteome analysis: sample prefractionation via multicompartment electrolyzers with isoelectric membranes. Electrophoresis. 21, 3639-3648.
10. Lilley, K.S., Razzaq, A. & Dupree, P. (2001) Two-dimensional gel electrophoresis: recent advances in sample preparation, detection and quantitation. Curr. Opin. Chem. Biol. 6, 46-50.
11. Santoni, V., Kieffer, S., Desclaux, D., Masson, F. & Rabilloud, T. (2000) Membrane proteomics: use of additive main effects with multiplicative interaction model to classify plasma membrane proteins according to their solubility and electrophoretic properties. Electrophoresis. 21, 3329-3344.
12. Molecular Probes (2005) SYPRO® Ruby Protein Gel Stain.
13. Rabilloud, T., Strub, J., Luche, S., van Dorsselaer, A. & Lunardi, J. (2001) A comparison between SYPRO Ruby and ruthenium II tris (bathophenanthroline disulfonate) as fluorescent stains for protein detection in gels. Proteomics. 1, 699-704.
14. Smilansky, Z. (2001) Automatic registration for images of two-dimensional protein gels. Electrophoresis. 22, 1616-1626.
15. Westbrook, J.A., Yan, J.X., Wait, R., Welson, S.Y. & Dunn, M.J. (2001) Zooming-in on the proteome: very narrow-range immobilised pH gradients reveal more protein species and isoforms. Electrophoresis. 22, 2865-2871.
16. Zhou, G., Li, H., DeCamp, D., Chen, S., Shu, H., Gong, Y., Flaig, M., Gillespie, J.W., Hu, N., Taylor, P.R., Emmert-Buck, M.R., Liotta, L.A., Petricoin III, E.F. & Zhao, Y. (2002) 2D differential in-gel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers. Mol. Cell Proteomics. 1, 117-124.
17. GE Healthcare (2005) Ettan DIGE System User Manual.
18. Craven, R.A., Totty, N., Harnden, P., Selby, P.J. & Banks, R.E. (2002) Laser capture microdissection and two-dimensional polyacrylamide gel electrophoresis: evaluation of tissue preparation and sample limitations. Am. J. Pathol. 160, 815-822.
19. National Institute of Child Health & Human Development, National Institutes of Health (2006) Introduction to Laser Capture Microdissection.
20. Monteoliva, L. & Albar, J.P. (2004) Differential proteomics: an overview of gel and non-gel based approaches. Brief. Funct. Genomic Proteomic. 3, 220-239.
21. Stein, R.C. & Zvelebil, M.J. (2002) The application of 2D gel-based proteomics methods to the study of breast cancer. J. Mammary Gland Biol. 7, 385-393.
22. Görg, A., Postel, W. & Günther, S. (1988) The current state of two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis. 9, 531-546.
23. Hitt, E. (2006) Separation of complex protein samples prior to mass spectrometry remains a bottleneck in the rapid resolution of proteomes.
24. Ardrey, R.E. (2003) Introduction. In Liquid Chromatography-Mass Spectrometry, pp. 1-5. J. Wiley, Chichester, UK.
25. Smolka, M., Zhou, H. & Aebersold, R. (2002) Quantitative protein profiling using two-dimensional gel electrophoresis, isotope-coded affinity tag labeling, and mass spectrometry. Mol. Cell Proteomics. 1, 19-29.
26. SWEGENE Proteomics Lund (2003) Peptide-based Approaches to Proteomics.
27. Xu, C., Garrett, W.M., Sullivan, J., Caperna, T.J. & Natarajan, S. (2006) Separation and identification of soybean leaf proteins by two-dimensional gel electrophoresis and mass spectrometry. Phytochemistry. 67, 2431-2440.
28. Wittmann-Liebold, B., Graack, H. & Pohl, T. (2006) Two-dimensional gel electrophoresis as tool for proteomics studies in combination with protein identification by mass spectrometry. Proteomics. 6, 4688-4703.
29. Hoffrogge, R., Beyer, S., Volker, U., Uhrmacher, A.M. & Rolfs, A. (2006) 2-DE proteomic profiling of neuronal stem cells. Neurodegener. Dis. 3, 112-121.
30. Lominadze, G., Ward, R.A., Klein, J.B. & McLeish, K.R. (2006) Proteomic analysis of human neutrophils. Methods Mol. Biol. 332, 343-356.
31. Martinez-Heredia, J., Estanyol, J.M., Ballesca, J.L. & Oliva, R. (2006) Proteomic identification of human sperm proteins. Proteomics. 6, 4356-4369.
32. Pereira, D.R., Martins, D., Winck, F.V., Smolka, M.B., Nishimura, N.F., Rabelo-Goncalves, E.M., Hara, N.H., Marangoni, S., Zeitune, J.M. & Novello, J.C. (2006) Comparative analysis of two-dimensional electrophoresis maps (2-DE) of Helicobacter pylori from Brazilian patients with chronic gastritis and duodenal ulcer: a preliminary report. Rev. Inst. Med. Trop. Sao Paulo. 48, 175-177.
33. Duncan, M.W. (2006) Protein-based biomarker & drug discovery.
34. Torres-Cabala, C., Bibbo, M., Panizo-Santos, A., Barazi, H., Krutzsch, H., Roberts, D.D. & Merino, M.J. (2006) Proteomic identification of new biomarkers and application in thyroid cytology. Acta. Cytol. 50, 518-528.
35. Roessler, M., Rollinger, W., Mantovani-Endl, L., Hagmann, M.L., Palme, S., Berndt, P., Engel, A.M., Pfeffer, M., Karl, J., Bodenmueller, H., Ruschoff, J., Henkel, T., Rohr, G., Rossol, S., Rosch, W., Langen, H., Zolg, W. & Tacke, M. (2006) Identification of PSME3 as a novel serum tumor marker for colorectal cancer by combining two-dimensional polyacrylamide gel electrophoresis with a strictly mass spectrometry-based approach for data analysis. Mol. Cell. Proteomics. 0, M600118-MCP200.
36. Li, D.Q., Wang, L., Fei, F., Hou, Y.F., Luo, J.M., Wei-Chen, Zeng, R., Wu, J., Lu, J.S., Di, G.H., Ou, Z.L., Xia, Q.C., Shen, Z.Z. & Shao, Z.M. (2006) Identification of breast cancer metastasis-associated proteins in an isogenic tumor metastasis model using two-dimensional gel electrophoresis and liquid chromatography-ion trap-mass spectrometry. Proteomics. 6, 3352-3368.
37. Pieper, R., Gatlin-Bunai, C.L., Mongodin, E.F., Parmar, P.P., Huang, S.T., Clark, D.J., Fleischmann, R.D., Gill, S.R. & Peterson, S.N. (2006) Comparative proteomic analysis of Staphylococcus aureus strains with differences in resistance to the cell wall-targeting antibiotic vancomycin. Proteomics. 6, 4246-4258.
38. Foucher, A.L., McIntosh, A., Douce, G., Wastling, J., Tait, A. & Turner, C.M. (2006) A proteomic analysis of arsenical drug resistance in Trypanosoma brucei. Proteomics. 6, 2726-2732.

Chapter 5.2

Chromatographic Methods for the Separation of Proteomes



The term proteome refers to the complete complement of proteins in a biological system. Proteomics, the study of proteomes, has been broadly defined as the link between proteins and genomes. The main difference between the two is that a genome is defined by its nucleotide sequence, whereas describing a proteome also requires knowledge of protein structure. In a biological system the proteome is larger than the genome because of alternative splicing of genes and post-translational modifications such as phosphorylation. Proteomics provides a detailed description of the proteins in different mixtures, including their properties, interactions and modifications. It also provides information on changes in protein expression levels associated with drug treatment, genetic manipulation or changes in metabolism.

A biological sample contains many different proteins mixed together in solution, and a range of separation techniques is available to extract and evaluate selected proteins from it. SDS-PAGE (sodium dodecyl sulphate polyacrylamide gel electrophoresis) and mass spectrometry are not chromatographic methods in the strict sense, but they are commonly used alongside chromatography to separate and identify the proteins of a proteome.

Electrophoresis is the migration of charged molecules in response to an electric field. In general the rate of migration depends on the size, shape and charge of the molecules. Sodium dodecyl sulphate is an anionic detergent that binds strongly to proteins, so that all proteins acquire a negative charge irrespective of their native charge. Heating in a buffer containing SDS and a reducing agent such as 2-mercaptoethanol denatures the proteins and disrupts intra- and intermolecular protein interactions. Because the charge differences between proteins are masked in this way, the denatured proteins are resolved on the basis of size in a buffered gel. SDS-PAGE gel systems are therefore very useful for separating and analysing proteome mixtures.
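Since SDS-PAGE resolves denatured proteins by size, the molecular weight of an unknown band is commonly estimated from a standard curve of log10(molecular weight) against relative mobility (Rf) for marker proteins. The following is a minimal sketch of that calculation, not a method taken from this chapter. The marker values are hypothetical and were chosen to lie exactly on a straight line; real gels are only approximately linear over a limited size range.

```python
import math

def fit_log_mw_vs_rf(markers):
    """Least-squares line log10(MW) = slope * Rf + intercept.

    markers: list of (Rf, MW) pairs for marker proteins of known size.
    """
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(rf, slope, intercept):
    """Molecular weight predicted for a band with relative mobility rf."""
    return 10 ** (slope * rf + intercept)

# Hypothetical marker ladder (Rf, MW in kDa), for illustration only.
MARKERS = [(0.2, 100.0), (0.5, 10 ** 1.7), (0.8, 10 ** 1.4)]

if __name__ == "__main__":
    slope, intercept = fit_log_mw_vs_rf(MARKERS)
    print(f"Band at Rf 0.65 is roughly {estimate_mw(0.65, slope, intercept):.1f} kDa")
```

Fitting log10(MW) rather than MW itself reflects the empirical observation that mobility in a sieving gel falls roughly linearly with the logarithm of size.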

The photograph above is an example of proteins separated on the basis of size by SDS-PAGE. The spots on the photograph indicate the protein molecules, and proteins are distinguished by the distance each spot has migrated through the gel.

The other technique used in the separation of proteomes is mass spectrometry, in which molecules are separated on the basis of mass. The protein of interest is digested with the enzyme trypsin and the sample is introduced into a mass spectrometer, where it is ionized, in the classical design by bombardment with an electron beam. The beam has sufficient energy to ionize the molecules. The positive ions produced are accelerated in a vacuum, pass through a magnetic field and are separated on the basis of their mass-to-charge ratio (m/z). For a singly charged ion the m/z value corresponds to the molecular mass of the ion. A simple diagrammatic representation of a mass spectrometer is shown below. The sample is passed through the electron beam and the ion-accelerating array and then into the magnetic field, which bends the paths of the charged particles. The ions that pass through the magnetic field are collected at the detector in the final stage.

5.2.2 Recent Advances

Separation techniques for proteomics have developed considerably in the last five years, particularly in the medical field, where they have been applied to drug formulation. Examples of areas in which these techniques have been used in the last five years include:

Use of recombinant antibodies in proteomics
The use of recombinant antibodies in proteomics is increasing. Large phage antibody libraries have been developed that contain antibodies which bind to target proteins, and these libraries are used in the analysis of proteomes.

Arrays for protein expression profiling
Protein expression patterns in a cell or tissue of an organism are measured by two-dimensional gel electrophoresis. With this technique proteins of interest are separated from a sample containing many different proteins. It is an important tool for the analysis of protein expression, which is a key component of proteomics. These separation techniques are useful in various medical fields and act as a platform for the diagnostic and prognostic monitoring of disease.

Proteomics in early detection of cancer
Developments in mass spectrometry and two-dimensional electrophoresis have contributed to biomarker research. Chip-based techniques such as surface-enhanced laser desorption/ionization and emerging methods such as antibody arrays are recent developments. Biomarkers can help to detect cancer at an early stage: they reflect the status of cancer cells in the body and the distinct changes that occur as cells transform from non-diseased to neoplastic.

Determining the structure of proteins
More than 10,000 protein folds are reported to exist in human beings. Protein sequences are distributed unevenly among these folds, and most folds are rare. The distribution follows asymptotic power laws of the kind identified in many biological and physical systems and associated with scale-free networks. Chromatographic techniques can help to identify the proteins that are distributed among these folds.

Techniques useful in diagnosis and treatment of cancer
Biomarkers for the early detection and diagnosis of serious diseases such as cancer were in wide use by early 2003. Two-dimensional gel electrophoresis (2D-PAGE) is the foundation of many discovery-based proteomics studies. Laser capture microdissection (LCM) and highly sensitive MS methods are widely used to recognize proteins that are differentially expressed between distinct cell populations. Technologies such as reverse-phase protein arrays will facilitate the recognition of target pathways in small biopsy specimens. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry is used to classify lysates and body fluids, which suits the diagnosis and treatment of disease. Gel-based proteomic techniques applied to tissue biopsies also support the discovery of drugs and preventive measures for lung and bladder cancer.

The activity of most eukaryotic proteins is altered by post-translational modifications. To record the modification sites in detail, novel mass spectrometric peptide sequencing and analysis technologies are widely used to study the chemistry of the modifications. Mass spectrometry-based proteomics has become an indispensable tool in molecular and cell biology. It makes possible the study of protein-protein interactions on a small scale and of proteome-wide interactions on a large scale. One notable application is the characterization of the genome and proteome of malaria parasites. The ability of mass spectrometry to identify specific proteins in complex samples is important in both medicine and biology.
Proteins separated by gel electrophoresis can be blotted onto a PVDF membrane for N-terminal sequencing, or digested to produce peptides that can be separated by HPLC. These techniques are used to identify proteins that are expressed in the cell and are available only in low microgram amounts or less. Structural analysis of proteins is now done by mass spectrometry (MS), a newer technique than Edman degradation, the older method in which amino acids are removed one at a time from the N-terminus. Because Edman degradation is time-consuming, it has largely been replaced by mass spectrometry, which is fast, is the method of choice for sensitive analysis and is used in large-scale industrial applications. Ionization methods such as MALDI (matrix-assisted laser desorption/ionization) and ESI (electrospray ionization) are commonly used to produce protein and peptide ions for MS analysis. With these techniques proteins are identified by peptide mass fingerprinting or by using sequence tags.

Newly developed techniques such as MALDI and bioinformatics are now widely used in medicine, chemistry and the analysis of proteome data. MALDI (matrix-assisted laser desorption/ionization) is used in mass spectrometry to analyse biomolecules such as oligonucleotides and proteins with the sensitivity and accuracy inherent to MS. MALDI has also had an impact on the analysis of nucleotide polymorphisms and on the challenges of the post-genome era. Bioinformatics is a newer component of proteomic technology and is introducing new algorithms to handle the various data sets. New algorithms for image analysis of two-dimensional gels have been developed within the last five years.

Recent advances in mass spectrometry have brought protein analysis into the limelight of cancer research. These techniques are applied to the profiling of tumour cells and tumour fluids, protein microarrays, pharmacoproteomics, the mapping of cancer signalling pathways, and the use of biomarkers in diagnosing cancer and monitoring the disease and the therapeutic and immune response to it. These benefits follow from the development of techniques such as functional protein arrays and improved data handling. Cancer is a dysregulation of the network of intracellular and extracellular signalling systems of the body, and molecular therapy aims to target the affected signalling system. Proteomic technology used together with genomic analysis addresses this need and provides a scientific basis for molecular stratification. Proteomic technology reports the state of kinase pathways and provides post-translational phosphorylation data that are not accessible with gene arrays. Such techniques open new routes for developing clinical therapies.

5.2.3 Evaluation of the Technique
Advantages of the Techniques
These techniques play an important role in the extraction and purification of proteins in proteomics. SDS-PAGE is the most popular protein separation technique, and gel electrophoresis remains unrivalled in its ability to separate complex mixtures of proteins. No other technique compares with it in resolution and sensitivity for large mixtures of proteins. In its two-dimensional form (2D-PAGE) the separation involves two different electrophoretic steps: in the first the proteins are separated according to their isoelectric point, and in the second according to their molecular weight. The proteins are then detected as spots whose positions reflect their isoelectric point and molecular weight.

Mass spectrometry plays an important role in the analysis of proteomes from complex samples. In medicine and molecular biology in particular, mass spectrometry is used in the analysis of the human plasma proteome. Because the mobility of ions is greater in gases than in condensed phases, mass spectrometry can handle complex samples. A further advantage of the technique is its use in the early detection of cancer.

Multi-lectin affinity chromatography is used to isolate the glycoprotein fraction of serum samples collected from patients with breast cancer. The peptides present in the serum samples are then analysed by mass spectrometry.

Disadvantages of the Techniques
SDS-PAGE also has some disadvantages. Although it produces sharp bands, it does so at the cost of the native shape and charge of the protein: SDS is an anionic detergent that disrupts the quaternary, tertiary and secondary structure of proteins and gives all proteins a high negative charge. Molecular weights estimated by SDS-PAGE can also be unreliable. Joule heating is another disadvantage: separations run at high power generate heat, which can have unintended consequences for the separation.

In mass spectrometry, mass spectral information about the sample can be lost when the limit of detection is lowered, and the resulting detection limit is not much better than that of gas chromatography with a flame ionization detector. Chemical noise arises when any other component yields an ion with the same mass as the component being scanned. Some compounds, such as hydrocarbons, are not easily ionized by some methods. Tandem mass spectrometry on cationized molecules yields no fragment information, whereas electron ejection or electron capture generates extensive fragmentation. In MALDI mass spectrometry the matrix produces background signals that interfere with the analysis of molecules with a mass below 700 Da. The use of an acidic matrix in MALDI can also degrade some compounds.

5.2.4 Applications of the Technology
The major applications of SDS-PAGE include on-chip preconcentration of proteins. In this technique two membranes of different sizes are prepared by photopolymerization. Proteins are trapped at the membrane and then eluted by reversing the electric field. In this way proteins at low concentration become detectable within about 30 min of preconcentration. Preconcentration is also used in DNA analysis and clinical diagnostics.

The application of proteomics in aquaculture is leading to the development of new farming techniques. SDS-PAGE is used to analyse the protein content of fish muscle and so to gather information on fish muscle physiology. It is combined with in situ protein hydrolysis and de novo sequencing of peptides by MALDI, and with capillary electrophoresis for the determination of the fatty acid and metal ion content of fish muscle.

Monoclonal antibody therapies are now used to treat various human diseases, which has led to the development of analytical methods to detect and quantitate the different size variants of these antibodies. For this purpose slab-gel SDS-PAGE is increasingly being replaced by SDS capillary electrophoresis (SDS-CE) because of its various advantages. SDS-CE uses non-gel polymer solutions for the separation; dextrans, for example, are used as sieving matrices in bare fused-silica capillaries.

Mass spectrometry is reported to be a useful tool in work on differentiating cancerous (HOC 313 and HSC 3) from noncancerous (HaCaT) cells. The two cell types can be distinguished because EGFR is overexpressed on the surface of the cancer cells. In photothermal cancer therapy, gold nanospheres and nanorods conjugated with anti-epidermal growth factor receptor (anti-EGFR) antibodies, which specifically target EGFR on the cell surface, are used in cancer diagnostics and therapy. The use of nanorods allows in vivo therapy. The catalytic effect of gold nanoparticles on the oxidation of NADH to NAD+ has also been investigated, and the conversion was confirmed by nuclear magnetic resonance and mass spectrometry.

In the United States about 2.2 million tons of phenolic resins and phenol-formaldehyde polymers are produced annually. These polymers, which are made for many industrial and commercial purposes, were previously thought to be non-biodegradable. Experiments using gas chromatography and mass spectrometry detected a degradation product, 13C-labelled phenol. The fungus responsible for this biodegradation was the white-rot fungus Phanerochaete chrysosporium. This finding provides a basis for the bioremediation and biorecycling of phenolic resins.

Mass spectrometry is also used in military and security applications because of its ability to detect chemical and biological weapons.
Various government agencies are therefore funding the purchase of these instruments in order to protect the general public from attack with such weapons. Mass spectrometry is also beginning to enter the airport security market, where explosives detection is the primary need. Recent advances in the technology are reducing the size of the instruments so that they fit better in military vehicles and ships. Benchtop mass spectrometers are also available for use in laboratories.

References:

Anson V Hatch, Amy E Herr, Daniel J Throckmorton, James S Brennan, Anup K Singh. (Jul 15, 2006) Analytical Chemistry. Washington: Vol. 78, Iss. 14; pg. 4976. (access date 28oct2006)

Caesar, Kwadwo O. (2006) Analysis of a recombinant monoclonal antibody by molecular sieving capillary electrophoresis. M.S., Hood College, 64 pages; AAT 1434028 (access date 24oct2006)

Cole R (Editor) (1997) Electrospray Ionization Mass Spectrometry: Fundamentals, Instrumentation, and Applications. New York: Wiley and Sons. (access date 28oct2006)

Gianluca Monti, Lorenzo De Napoli, Pietro Mainolfi, Roberto Barone, et al. (Apr 15, 2005) Analytical Chemistry. Washington: Vol. 77, Iss. 8; pg. 2587

Holt LJ, Enever C, de Wildt RM, Tomlinson IM. (Oct 2000) Curr Opin Biotechnol. 11(5): 445-9. (access date 24oct2006)

Righetti PG, Castagna A, Antonioli P, et al. (Jan 2005) Prefractionation techniques in proteome analysis: the mining tools of the third millennium. Electrophoresis 26(2): 297-319 (access date 15sep2006)

Adam C Gusse, Paul D Miller, Thomas J Volk. (Jul 1, 2006) White-Rot Fungi Demonstrate First Biodegradation of Phenolic Resin. Environmental Science & Technology. Easton: Vol. 40, Iss. 13; p. 4196 (access date 21oct2006)

Chapter 5.3 Mass Spectrometry in Proteomics
Amit Joglekar

5.3.1 Introduction
Proteomics is the study of the function of all expressed proteins. Tremendous progress has been made in the past few years in generating large-scale data sets for protein–protein interactions, organelle composition, protein activity patterns and protein profiles in cancer patients [1]. In general it deals with the large-scale determination of gene and cellular function directly at the protein level [2]. The term proteome was first coined to describe the set of proteins encoded by the genome [3], and the term proteomics was coined by analogy with genomics [4]. Proteomics now encompasses all the proteins in any given cell. It also covers the set of all protein isoforms and modifications, the interactions between them and the structural description of proteins; in short, it analyses everything that is 'post-genomic' [1].

Mass spectrometry (MS) has increasingly become the method of choice for the analysis of complex protein samples. MS-based proteomics was made possible by the availability of gene and genome sequence databases and by technical and conceptual advances in many areas, the most important being the discovery and development of protein ionization methods, as recognized by the 2002 Nobel Prize in Chemistry [2]. Early proteomics studies relied on protein separation by two-dimensional gel electrophoresis, with subsequent mass spectrometric identification of protein spots. The ability of mass spectrometry to identify very small amounts of protein from complex mixtures is a primary driving force in proteomics [1]. Proteomics has been possible only because of the previous achievements of genomic studies, which provided the 'blueprint' of possible gene products that are the focal point of proteomics studies [1]. So far, protein analysis by MS, such as the identification of primary sequence, post-translational modifications (PTMs) or protein–protein interactions, has been most successful when applied to small sets of proteins.
The systematic analysis of the much larger number of proteins expressed in a cell is now also advancing rapidly, mainly due to the development of new experimental approaches [2]. Analysing such a large number of proteins is an explicit goal of proteomics. Cataloguing all human proteins and ascertaining their functions and interactions presents a daunting challenge for scientists. An international collaboration to achieve these goals is being co-ordinated by the Human Proteome Organization [4].

5.3.2 Principles and instrumentation
The following block diagram shows the components of a mass spectrometer.

Fig 5.3.1: Components of a Mass Spectrometer Source: (

Mass spectrometric measurements are carried out in the gas phase on ionized analytes. By definition, a mass spectrometer consists of an ion source, a mass analyser that measures the mass-to-charge ratio (m/z) of the ionized analytes, and a detector that registers the number of ions at each m/z value. Electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) are the two techniques most commonly used to volatilize and ionize proteins or peptides for mass spectrometric analysis [5] [6]. ESI ionizes the analytes out of a solution and is therefore readily coupled to liquid-based separation tools such as chromatographic and electrophoretic techniques, as shown in Fig 5.3.2 [2]. MALDI sublimates and ionizes the samples out of a dry, crystalline matrix via laser pulses. MALDI-MS is normally used to analyse relatively simple peptide mixtures, whereas integrated liquid-chromatography ESI-MS systems (LC-MS) are preferred for the analysis of complex samples [7].

The mass analyser is the central component of this technology. In the context of proteomics, its key parameters are sensitivity, resolution, mass accuracy and the ability to generate information-rich ion mass spectra from peptide fragments (tandem mass or MS/MS spectra) (Fig 5.3.3) [8-10]. Four basic types of mass analyser are currently used in proteomics research: the ion trap, time-of-flight (TOF), quadrupole and Fourier transform ion cyclotron resonance (FT-MS) analysers. They differ greatly in design and performance, each with its own strengths and weaknesses. These analysers can be used separately or, in some cases, put together in tandem to take advantage of the strengths of each [2].

Fig 5.3.2: Use of ESI, SDS-PAGE and HPLC techniques in a proteomic experiment.
The typical proteomics experiment consists of five stages. In stage 1, the proteins to be analysed are isolated from cell lysate or tissues by biochemical fractionation or affinity selection. This often includes a final step of one-dimensional gel electrophoresis. MS of whole proteins is less sensitive than peptide MS and the mass of the intact protein by itself is insufficient for identification. Therefore, proteins are degraded enzymatically to peptides in stage 2, usually by trypsin, leading to peptides with C-terminally protonated amino acids, providing an advantage in subsequent peptide sequencing. In stage 3, the peptides are separated by one or more steps of high-pressure liquid chromatography in very fine capillaries and eluted into an electrospray ion source where they are nebulized in small, highly charged droplets. After evaporation, multiply protonated peptides enter the mass spectrometer and, in stage 4, a mass spectrum of the peptides eluting at this time point is taken (MS1 spectrum, or 'normal mass spectrum'). The computer generates a prioritized list of these peptides for fragmentation and a series of tandem mass spectrometric or 'MS/MS' experiments ensues (stage 5). These consist of isolation of a given peptide ion, fragmentation by energetic collision with gas, and recording of the tandem or MS/MS spectrum. The MS and MS/MS spectra are typically acquired for about one second each and stored for matching against protein sequence databases. The outcome of the experiment is the identity of the peptides and therefore the proteins making up the purified protein population. Source: (
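Because electrospray produces multiply protonated peptides, the same peptide can appear at several m/z values in the MS1 spectrum. The sketch below, which is an illustration and not part of the cited workflow, shows the standard relationship m/z = (M + z x 1.007276) / z for a peptide of neutral monoisotopic mass M carrying z protons. The function names are made up for this example.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def mz_of_protonated(neutral_mass, charge):
    """m/z observed for a molecule of the given neutral mass carrying
    `charge` extra protons, i.e. the [M + zH]z+ ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

def neutral_mass_from_mz(mz_value, charge):
    """Invert the relationship: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz_value - PROTON_MASS)

if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass in Da
    for z in (1, 2, 3):
        print(z, round(mz_of_protonated(peptide_mass, z), 4))
```

Search software applies the inverse calculation to convert each observed precursor m/z and charge state back to a neutral mass before comparing it with database values.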

Fig 5.3.3: Stages in MS/MS experiment
The left and right upper panels depict the ionization and sample introduction process in electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). The different instrumental configurations (a−f) are shown with their typical ion source. a, In reflector time-of-flight (TOF) instruments, the ions are accelerated to high kinetic energy and are separated along a flight tube as a result of their different velocities. The ions are turned around in a reflector, which compensates for slight differences in kinetic energy, and then impinge on a detector that amplifies and counts arriving ions. b, The TOF-TOF instrument incorporates a collision cell between two TOF sections. Ions of one mass-to-charge (m/z) ratio are selected in the first TOF section, fragmented in the collision cell, and the masses of the fragments are separated in the second TOF section. c, Quadrupole mass spectrometers select by time-varying electric fields between four rods, which permit a stable trajectory only for ions of a particular desired m/z. Again, ions of a particular m/z are selected in a first section (Q1), fragmented in a collision cell (q2), and the fragments separated in Q3. In the linear ion trap, ions are captured in a quadrupole section, depicted by the red dot in Q3. They are then excited via resonant electric field and the fragments are scanned out, creating the tandem mass spectrum. d, The quadrupole TOF instrument combines the front part of a triple quadrupole instrument with a reflector TOF section for measuring the mass of the ions. e, The (three-dimensional) ion trap captures the ions as in the case of the linear ion trap, fragments ions of a particular m/z, and then scans out the fragments to generate the tandem mass spectrum. f, The FT-MS instrument also traps the ions, but does so with the help of strong magnetic fields.
The figure shows the combination of FT-MS with the linear ion trap for efficient isolation, fragmentation and fragment detection in the FT-MS section. Source: (

Ion source
The first important component of a mass spectrometer is the ion source, in which ions are created before they are passed into the mass analyser. Ions can be generated by various types of ion source, and the choice depends on the experiment and the compounds to be analysed. The various ion sources are listed below [11]:

• Electron impact ionization
• Chemical ionization
• Field ionization
• Desorptive ionization
  o Fast atom bombardment ionisation
  o Field desorption ionisation
  o Plasma desorption ionisation
  o Laser desorption ionisation
    • Matrix-assisted laser desorption ionisation (MALDI)
    • Surface-enhanced laser desorption ionisation (SELDI)
• Electrospray ionisation (ESI)

In proteomics, laser desorption is generally the preferred method of ionisation. MALDI is used in most circumstances, although SELDI is also used. In laser desorption a laser delivers a high density of energy into a small space. The beam is focused onto a matrix that has an absorption band closely matching the energy of the laser radiation. The matrix is mixed with the substance to be examined; when the beam hits the matrix, the material absorbs a large amount of energy, desorbs and ionizes. The details of the method are explained in chapter 4.1. ESI is also used in many cases. Here the sample is present in an electrospray capillary of small internal diameter; the analyte is forced through the capillary at a set flow rate and the resulting electrospray plume nebulizes the sample. The details are given in chapter 4.2.

Mass Analyzers
The basic types of mass analyzers used in mass spectrometry are summarized below:

Type | Acronym | Principle
Time-of-flight | TOF | Time dispersion of a pulsed beam; separation by time-of-flight
Magnetic sector | B | Deflection of a continuous beam; separation by momentum in magnetic field
Linear quadrupole | Q | Continuous ion beam in linear radio frequency quadrupole field; separation due to stability of trajectories
Linear quadrupole ion trap | LIT | Continuous ion beam and trapped ions; storage and eventually separation in linear radio frequency quadrupole field due to stability of trajectories
Quadrupole ion trap | QIT | Trapped ions; separation in three dimensional radio frequency quadrupole field due to stability of trajectories
Ion cyclotron resonance | ICR | Trapped ions; separation by cyclotron frequency (Lorentz force) in magnetic field

Table 5.3.1 Basic types of mass analysers. Source: Mass Spectrometry: A Textbook, 2004

Time-of-Flight analyzer
The principle of TOF is simple: ions of different mass-to-charge ratio (m/z) are dispersed in time during their flight along a field-free drift path of known length. Provided all the ions start their journey at the same time, or within a sufficiently short time interval, the lighter ions will arrive at the detector earlier than the heavier ones [12].
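The time dispersion described above can be made concrete with a short calculation. An ion of mass m and charge ze accelerated through a potential U gains kinetic energy zeU = mv²/2, so its flight time over a drift length L is t = L·sqrt(m / (2zeU)). The following is a minimal sketch under idealized assumptions: all ions start at rest at the same instant, and the acceleration region is negligibly short. The 20 kV voltage and 1 m drift length are illustrative values, not taken from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms

def flight_time(mass_da, charge, accel_voltage, drift_length):
    """Flight time in seconds of an ion through a field-free drift tube.

    mass_da: ion mass in Da; charge: number of elementary charges;
    accel_voltage: accelerating potential in volts; drift_length: metres.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * E_CHARGE * accel_voltage  # joules
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return drift_length / velocity

if __name__ == "__main__":
    # Illustrative instrument settings (assumed values).
    for mass in (1000.0, 2000.0, 4000.0):
        t = flight_time(mass, 1, 20_000.0, 1.0)
        print(f"{mass:7.0f} Da -> {t * 1e6:.2f} microseconds")
```

Since the flight time scales with the square root of m/z, quadrupling the mass at fixed charge doubles the flight time, which is why the lighter ions reach the detector first.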

Fig 5.3.3 Time-of-Flight Ion Charge State Analyzer Source: (

Magnetic sector analyzer
Magnetic sector instruments are comparatively large devices capable of high resolution and accurate mass determination, and are suited to a wide variety of ionization methods. A magnetic sector is basically a momentum analyzer rather than a direct mass analyzer, as is commonly assumed [12].
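The statement that a magnetic sector separates by momentum can be seen from a short textbook derivation, given here as a sketch for an idealized ion of mass m and charge ze. The ion is accelerated from rest through a potential U and then enters a uniform magnetic field B at right angles; the symbols are introduced here for illustration and do not come from the source.

```latex
\begin{align}
  zeU  &= \tfrac{1}{2} m v^{2}
       && \text{(energy gained during acceleration)} \\
  zevB &= \frac{m v^{2}}{r}
       \;\Longrightarrow\; r = \frac{mv}{zeB}
       && \text{(Lorentz force supplies the centripetal force)} \\
  \frac{m}{z} &= \frac{e B^{2} r^{2}}{2U}
       && \text{(eliminating } v \text{ between the two equations)}
\end{align}
```

The second line shows that the radius of the ion path is set by the momentum mv per unit charge. Only when all ions have received the same kinetic energy per charge, as in the first line, does the radius translate into m/z as in the third line.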

Fig 5.3.4 Path of magnetic sector analyzer Source: (

Linear Quadrupole analyzer
A linear quadrupole mass analyzer consists of four hyperbolically or cylindrically shaped rod electrodes extending in the z-direction and mounted in a square configuration. The pairs of opposite rods are each held at the same potential, which is composed of a DC and an AC component [12].

Detectors
The final element of the mass spectrometer is the detector. The detector records the charge induced or the current produced when an ion passes by or hits a surface. In a scanning instrument, the signal produced in the detector during the scan, plotted against the m/z at which the instrument is scanning, gives a mass spectrum: a record of ions as a function of m/z [13]. The simplest detector is a Faraday cup, an electrode on which the ions deposit their charge. Faraday cups are still used to measure abundance ratios with high accuracy in isotope ratio mass spectrometry [12]. Another type of detector is the electron multiplier, in which ions striking a metal or PbO-coated surface induce the emission of electrons [13]. Electron multipliers became predominant with the advent of scanning mass spectrometers. Progress has also been made in employing cryogenic detectors, a rather special type of ion-counting detector, for high-mass ions in TOF-MS [12].

5.3.3 Recent Advances
The sequencing of the human genome and of numerous pathogens [14] was a great breakthrough for the biological sciences and opened new avenues for research in molecular biology and, as a result, in proteomics [14]. Interest has developed in applying proteomics to understand the process of disease development and to develop new biomarkers for diagnosis and hence early detection of disease [14]. Once biomarkers are available, further work can be done on developing drug delivery systems.
Studies have identified disease-related changes in protein expression using 2D gels and mass spectrometry. Studies on diseases of the heart [15] have covered a set of pathological conditions, some with acute onset, some with slow progression and some with chronic progression. Changes in myocardial proteins associated with heart failure have been found in relevant animal models, for example in rat myocytes [15]. Altered overall levels of specific proteins, or altered post-translational modifications of proteins such as myosin light chain 2, have been reported in heart failure [16].

Mass spectrometry has been applied to the in situ proteomic analysis of tissues. This approach allows imaging of protein expression in normal and diseased tissues [17]. In this method the frozen tissue is sliced and sections are applied to the MALDI plate and analysed at regular spatial intervals. The mass spectra obtained at each position are then compared, yielding a spatial distribution of individual masses across the tissue section [14]. Tumour analyses using this approach have shown differences in protein expression between normal and tumour tissues that may be specific to different tumour types [17].

Disease biomarkers play a significant role in the diagnosis of disease, and there is substantial interest in applying proteomics to the identification of disease markers. Approaches include the comparative analysis of protein expression in diseased and normal tissue and the analysis of secreted proteins in cell lines [14]. Serum analysis is done by surface-enhanced laser desorption/ionisation (SELDI) [18]. The mass spectral patterns obtained for different samples reflect the protein and peptide contents of those samples. Patterns that distinguish between cancer patients and normal subjects with remarkable accuracy have been reported for several types of cancer [18].
Very often the masses observed match precisely the predicted masses of the proteins. This was observed in a study in which proteins secreted by CD8 T cells were identified as α-defensin 1, 2 and 3, which contribute to the anti-HIV-1 activity of CD8 antiviral factor [19]. Proteomics has also contributed to studies of pathogens such as Plasmodium falciparum, the malaria parasite. After the genome of this parasite was sequenced, comparative proteomic studies led to the identification of a potential drug and vaccine target [20, 21]. Aside from the comprehensive identification of microbial proteins, proteomics is relevant to numerous aspects of microbial disease pathogenesis and treatment [14].

Finally, there have also been developments in instrumentation. In the case of LC–MS, the last two decades have seen significant improvements in instrument design, especially the introduction of robust, user-friendly interfaces such as those based on atmospheric pressure ionisation techniques, e.g. electrospray (ESI) and atmospheric pressure chemical ionisation (APCI) [22].

5.3.4 Evaluation of the technology
Mass spectrometry has come of age in this decade and has proved to be a powerful technique for characterizing unknown proteins. Many proteins have been identified to date using this instrument. As mentioned earlier, the proteins are digested with trypsin and the resulting peptides are analysed by the mass spectrometer. The measured masses are then submitted to protein databases, where each candidate protein is theoretically digested and the predicted peptide masses are compared with the experimentally measured ones. The match is scored on a number of factors, depending on the search program used.
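The database comparison described above (peptide mass fingerprinting) can be sketched in a few lines. The example below is a simplified illustration, not the algorithm of any particular search program. It assumes that trypsin cleaves after K or R except before P, uses monoisotopic residue masses, allows no missed cleavages or modifications, and scores a match simply by counting observed masses that fall within a tolerance of a predicted peptide mass. The sequences and function names are hypothetical.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini

def tryptic_digest(sequence):
    """Theoretical digest: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def count_matches(observed_masses, sequence, tolerance=0.05):
    """Number of observed masses within `tolerance` Da of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - pred) <= tolerance for pred in predicted)
        for obs in observed_masses
    )

if __name__ == "__main__":
    candidate = "GGKAARPLK"  # hypothetical database sequence
    print(tryptic_digest(candidate))
    print([round(peptide_mass(p), 4) for p in tryptic_digest(candidate)])
```

Real search programs refine this idea by allowing missed cleavages and modifications and by converting the match count into a probability-based score, which is why the text notes that scoring depends on the program used.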

This technique has both advantages and disadvantages. It is rapid, with a short turnaround time, and it is highly sensitive even with small amounts of sample. It measures molecular masses accurately and can therefore be used to determine the number of subunits in different oligomers, which cannot be determined accurately by other methods such as size exclusion chromatography, analytical ultracentrifugation or cryo-electron microscopy [23]. Though accurate, the technique cannot yet be applied efficiently to non-covalent complexes [23]. Before entering the mass analyser the sample must be ionised, and not all molecules ionise well; complexes may also dissociate during ionisation. Protein identification further requires that the protein is already present in the database being searched (database websites are given at the end). Proteins smaller than 15 kDa are reported to be unsuitable for MALDI-TOF mass spectrometry [24].

5.3.5 Applications:

Recent years have seen the development of powerful technologies that have driven rapid growth in the biological sciences. Mass spectrometry is one of them, and it has been used extensively in the new field of proteomics, where protein analysis by mass spectrometry is now routine. It finds applications in medicine for the diagnosis of disease: new biomarkers are being identified by mass spectrometric analysis which could, for example, improve tuberculosis diagnosis [25]. It also finds application in the forensic sciences. The ability of mass spectrometry to extract chemical fingerprints from microscopic amounts of analyte is invaluable, enabling the legally defensible identification and quantification of a wide range of compounds [22]. Determining the use of chemical warfare agents (CWAs) in times of war or in acts of terrorism requires rapid and reliable methods.
Nerve agents are extremely potent organophosphorus compounds that exert their biological effects by irreversibly inhibiting the enzyme acetylcholinesterase (AChE). To confirm exposure, biological samples such as urine can be analysed for the agents [22]. Depending on the compounds to be detected, GC-MS or LC-MS is routinely used.

Screening for illicit drugs is one of the common applications of mass spectrometry. Drug abuse during pregnancy is a major problem and has been associated with prenatal complications and high morbidity and mortality rates in newborns. Meconium is the first faecal matter produced by the neonate, typically within the first 5 days after birth. Use of this specimen can extend the window of drug detection considerably, to approximately the last 20 weeks of pregnancy [22].

Recently, azithromycin was determined quantitatively in human plasma by HPLC-MS [26]. Azithromycin is a semisynthetic macrolide antibiotic of the erythromycin group. It has been used to treat infections caused by respiratory pathogens, including Legionella pneumophila, Haemophilus influenzae and Branhamella catarrhalis [26]. Antibiotics have traditionally been determined by microbiological assays, but these tend to lack specificity, and it is difficult to confirm which drugs remain in a biological sample [26]. Liquid chromatography–mass spectrometry (LC–MS) and liquid chromatography–tandem mass spectrometry (LC–MS/MS) therefore appear to be the most promising techniques for the separation and quantitative analysis of drugs, and have recently been used in the determination of azithromycin [26].

These and other applications, such as the identification of new biomarkers in thyroid cytology [27], proteomic strategies to identify urinary biomarkers for prostate cancer [28] and the study of meat quality through a proteomic approach [29], have proven fruitful. Mass spectrometry has many more applications in many fields, but listing them all is beyond the scope of this chapter.

5.3.5 Relevant WebPages

The following pages host programs used to search MALDI-TOF peptide mass fingerprint data:
• • • •


5.3.6 References:

1. Tyers, M. & Mann, M. (2003) From genomics to proteomics, Nature.
2. Aebersold, R. & Mann, M. (2003) Mass spectrometry-based proteomics, Nature.
3. Wilkins, M. R., Pasquali, C., Appel, R. D., Ou, K., Golaz, O., Sanchez, J. C., Yan, J. X., Gooley, A. A., Hughes, G., Humphery-Smith, I., Williams, K. L. & Hochstrasser, D. F. (1996) From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis, Biotechnology, 61-65.
4.
5. Fenn, J. B., Mann, M., Meng, C. K., Wong, S. F. & Whitehouse, C. M. (1989) Electrospray ionization for the mass spectrometry of large biomolecules, Science. 246, 64-71.
6. Karas, M. & Hillenkamp, F. (1988) Laser desorption ionization of proteins with molecular mass exceeding 10000 daltons, Analytical Chemistry. 60, 2299-2301.
7. Aebersold, R. & Goodlett, D. R. (2001) Mass spectrometry in proteomics, Chemical Reviews. 101, 512-526.
8. Pandey, A. & Mann, M. (2000) Proteomics to study genes and genomes, Nature, 837-846.
9. Aebersold, R. & Goodlett, D. R. (2001) Mass spectrometry in proteomics, Chemical Reviews. 101, 269-295.
10. Mann, M., Hendrickson, R. C. & Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry, Annual Review of Biochemistry. 70, 437-473.
11. Herbert, C. G. & Johnstone, R. A. W. (2003) Mass Spectrometry Basics, CRC Press, United States of America.
12. Gross, J. H. (2004) Mass Spectrometry: A Textbook, Springer-Verlag Berlin Heidelberg, Germany.
13.
14. Hanash, S. (2003) Disease proteomics, Nature. 422, 226-232.
15. Van Eyk, J. E. (2001) Proteomics: unraveling the complexity of heart disease and striving to change cardiology, Current Opinion in Molecular Therapeutics. 3, 546-553.
16. van der Velden, J. et al. (2001) Effects of calcium, inorganic phosphate, and pH on isometric force in single skinned cardiomyocytes from donor and failing human hearts, Circulation. 104, 1140-1146.
17. Stoeckli, M., Chaurand, P., Hallahan, D. E. & Caprioli, R. M. (2001) Imaging mass spectrometry: a new technology for the analysis of protein expression in mammalian tissues, Nature Medicine. 7, 493-496.
18. Petricoin, E. F., Zoon, K. C., Kohn, E. C., Barrett, J. C. & Liotta, L. A. (2002) Clinical proteomics: translating benchside promise into bedside reality, Nature Reviews Drug Discovery. 1, 683-695.
19. Zhang, L. et al. (2002) Contribution of human α-defensin 1, 2 and 3 to the anti-HIV-1 activity of CD8 antiviral factor, Science. 298, 995-1000.
20. Lasonder, E., Ishihama, Y., Andersen, J. S., Adriaan, M. W., Vermunt, A. P., Sauerwein, R. W., Wijnand, M. C., Eling, N. H., Waters, A. P., Stunnenberg, H. G. & Mann, M. (2002) Analysis of the Plasmodium falciparum proteome by high-accuracy mass spectrometry, Nature. 419, 537-542.
21. Florens, L., Washburn, M. P., Raine, J. D., Anthony, R. M., Grainger, M., Haynes, J. D., Moch, J. K., Muster, N., Sacci, J. B., Tabb, D. L., Witney, A. A., Wolters, D., Wu, Y., Gardner, M. J., Holder, A. A., Sinden, R. E., Yates, J. R. & Carucci, D. J. (2002) A proteomic view of the Plasmodium falciparum life cycle, Nature. 419, 520-526.
22. Wood, M., Laloup, M., Samyn, N., del Mar Ramirez Fernandez, M., de Bruijn, E. A., Maes, R. A. A. & De Boeck, G. (2006) Recent applications of liquid chromatography-mass spectrometry in forensic science, Journal of Chromatography A. 1130, 3-15.
23. Poliakov, A. (2006) Mass spectrometry on non-covalent macromolecular complexes in
24. Australian Proteome Analysis Facility: MALDI MS Analysis for Protein Identifications.
25. Agranoff, D., Fernandez-Reyes, D., Papadopoulos, M. C., Rojas, S. A., Herbster, M., Loosemore, A., Tarelli, E., Sheldon, J., Schwenk, A., Pollok et al. (2006) Identification of diagnostic markers for tuberculosis by proteomic fingerprinting of serum, Lancet. 368, 1012-1021.
26. Chen, B.-M., Liang, Y.-Z., Chen, X., Liu, S.-G., Deng, F.-L. & Zhou, P. (2006) Quantitative determination of azithromycin in human plasma by liquid chromatography-mass spectrometry and its application in a bioequivalence study, Journal of Pharmaceutical and Biomedical Analysis. 42, 480-487.
27. Torres-Cabala, C., Bibbo, M., Panizo-Santos, A., Barazi, H., Krutzsch, H., Roberts, D. D. & Merino, M. J. Proteomic identification of new biomarkers and application in thyroid cytology, Acta Cytologica. 50, 518-528.
28. Downes, M. R., Byrne, J. C., Dunn, M. J., Fitzpatrick, J. M., Watson, R. W. G. & Pennington, S. R. Application of proteomic strategies to the identification of urinary biomarkers for prostate cancer: A review, Biomarkers: Biochemical Indicators Of Exposure, Response, And Susceptibility To Chemicals. 11, 406-416.
29. Mullen, A. M., Stapleton, P. C., Corcoran, D., Hamill, R. M. & White, A. (2006) Understanding meat quality through the application of genomic and proteomic approaches, Meat Science. 74, 3-16.

Chapter 5.4 Bioinformatics in proteome analysis Supriya Narayanan (s3119801)



The complement of proteins found in a single cell in a particular environment is called the proteome (derived from PROTEin complement to a genOME) [1, 3]. The term was coined by Mark Wilkins in 1995 [4]. Proteome can also refer to all the proteins present in a simple organism such as yeast, as shown in the diagram. The study of the proteome is called proteomics. It covers not only all the proteins in the cell but also the way they interact, the changes they undergo and the effects they have within the organism [2]. Proteomics can be defined as the qualitative and quantitative comparison of proteomes under different conditions to further unravel biological processes [1]. Various tools are available for analysing the proteome, such as electrophoresis, chromatography, X-ray crystallography, NMR and mass spectrometry. These methods are being improved continuously, and new technologies are emerging that generate high-throughput results [6]. It is difficult to handle this large influx of information using traditional methods, and bioinformatics is emerging as a good means to handle and interpret the data.

Yeast proteome

Bioinformatics was defined on July 17, 2000 by the BISTIC Definition Committee, chaired by Dr. Michael Huerta of the National Institute of Mental Health, as follows: “the research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.” [5] The use of computational technology has also enabled a high degree of reproducibility and sensitivity in the technologies used to study the proteome [10]. Bioinformatics software concentrates on three main areas [13]:
♦ Interpreting mass spectrometry information
♦ Computing the structure of a protein
♦ Sequence comparison

A brief description of the informatics requirements of various processes of biological analysis is shown in the diagram.

Figure 1: A brief description of the informatics requirements.
Source: Scott D. Patterson & Ruedi H. Aebersold (2003) Proteomics: the first decade and beyond, Nature Genetics 33, 311-323

5.4.2 Recent Advances

Proteomics is a relatively new field that has developed mainly over the last 10 years. Ideas for building protein databases from 2D gel electrophoresis protein profiles, such as the human protein index, were proposed in the 1970s and 1980s. In the late 1980s and early 1990s only small sequence databases existed. Through the 1990s, however, protein data were gathered along with genomic data and the databases began to grow. With complete libraries at hand, rapid identification of proteins was limited only by the ability to extract their sequence information. This gap was rapidly filled by mass spectrometry techniques and database search algorithms. In 1993, five independent reports described database search algorithms built on the insight that the set of peptide masses obtained from a proteolytic digest can identify a protein. These algorithms, together with MALDI-TOF mass spectrometry peptide analysis, constituted a 'protein identification' method that is now known as peptide mass mapping (or peptide mass fingerprinting). In this type of analysis, the collected MS spectra are used to generate a list of proteolytic (peptide) fragment masses, which are then matched to the masses calculated from the same proteolytic digestion of each entry in a sequence database, resulting in identification of the target protein. Algorithms that match MS/MS spectra to sequence databases have since greatly facilitated mass spectrometric protein identification. MS/MS spectra are also ideally suited to searching translated EST and other sequence databases containing incomplete sequences.

The next development was the gel-independent approach to proteomics using LC-MS/MS systems. The combination of LC-MS/MS and sequence database searching has been widely adopted for the analysis of complex peptide mixtures generated from the proteolysis of samples containing several proteins.
This approach is often referred to as 'shotgun' proteomics and has the ability to catalog hundreds, or even thousands, of components contained in samples isolated from very different sources.
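To make the idea of matching MS/MS spectra to database sequences more concrete, the following is a minimal sketch, not the algorithm of any particular search engine. It assumes singly charged b- and y-type fragment ions only, unmodified peptides, a fixed tolerance of 0.5 m/z units and a simple shared-peak count as the score; the function names and candidate peptides are hypothetical.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values for each series."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # b1 .. b(n-1), read from the N-terminus
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # y1 .. y(n-1), read from the C-terminus
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def shared_peak_count(spectrum, peptide, tol=0.5):
    """Count measured peaks lying within tol of any theoretical b or y ion."""
    b_ions, y_ions = fragment_ions(peptide)
    theoretical = b_ions + y_ions
    return sum(any(abs(p - t) <= tol for t in theoretical) for p in spectrum)


def best_match(spectrum, candidates, tol=0.5):
    """Return the candidate peptide explaining the most peaks."""
    return max(candidates, key=lambda pep: shared_peak_count(spectrum, pep, tol))


if __name__ == "__main__":
    # Simulate a noise-free spectrum of PEPTIDE and search two candidates.
    b, y = fragment_ions("PEPTIDE")
    print(best_match(b + y, ["GASVLKR", "PEPTIDE"]))
```

In practice the candidate list would come from digesting every database entry and keeping the peptides whose mass agrees with the measured precursor mass, which is what makes the search tractable for complex mixtures.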

Specific examples include the identification of proteins in the periplasmic space of bacteria, yeast ribosomal complexes, murine nuclear interchromatin granule clusters (nuclear speckles), murine mitochondrial soluble intermembrane proteins, human urinary proteins, yeast TFIID-associated proteins, proteasomal proteins, human microsomal proteins, human membrane proteins and yeast nuclear pore proteins pre-fractionated by SDS polyacrylamide gel electrophoresis, and proteins from yeast lysates [26].

Some of the aspects of bioinformatics in proteomics which have shown development in the last year include:
♦ An increasing number of bioinformatics databases provide graph views of the interconnected components of a database. A collection of such possibly interconnected maps is called an atlas, e.g. BacMap. So far, though, no tools have been developed to interpret these results [27, 28].
♦ Data is increasingly being organised using an emerging field called systems biology. This involves breaking biological systems down into their component parts, which are linked to each other. It helps in understanding the system as a whole while preventing redundancy of data and allowing better storage.
♦ A recent development is the integration of time into the simulation models being generated. Simulations are being made which reproduce, and thereby quantify, the behaviour of a system over a period of time. Anything from the yield of a reaction or the steps of a molecular pathway up to the full network of interacting entities that characterise a cellular activity can be modelled. The behaviour of the resulting systems is tested in response to defined perturbations [27, 28].
♦ Data can be organised based on the use of ‘synthetic biology’, which involves the artificial assembly of natural parts [27, 28]. These can then be organised as networks which act as both a representation and a simulation of interacting proteins.
♦ Last but not least is the development of a standard for all proteomics formats. Several working groups, all launched through the Proteomics Standards Initiative (PSI) of HUPO (Human Proteome Organisation), are currently in charge of defining standardised general proteomics formats. Such standards are expected to facilitate data comparison, exchange and verification. Besides its involvement in setting a standardised general proteomics format, PSI supports other working groups in key areas of proteomics, i.e. gel electrophoresis, mass spectrometry and protein-protein interaction data.

5.4.3 Evaluation of the Technology

From its inception to the present day, proteomics has evolved substantially. Conceptually, proteomics has become a biological assay for the quantitative and qualitative analysis of complex protein samples. Technologically, proteomics has become a combination of relatively mature tools that support protein cataloguing and quantitative proteome measurements reliably, sensitively and at high throughput [26]. This makes it an invaluable tool for several reasons. Firstly, global data sets are rich in information, although they are difficult to analyze using traditional knowledge-based interpretation. Secondly, more data is better: it is much more informative to collect several global data sets on the same system and to use mathematical tools such as cluster analysis to extract biological insights or to formulate hypotheses. Thirdly, it is expected that additional systematic proteomic data, including activity profiles, interaction maps and profiles of (regulatory) modifications, will provide further insights into the structure, function and control of biological systems.

The use of bioinformatics in proteome analysis has made the handling of data much easier. Before computational tools were available, the study of any protein required extensive experimental data, particularly from crystallography and NMR. With the development of bioinformatics tools and databases, the structure of a newly discovered protein and its relationship to function can be understood in a very short time.

The use of high-throughput platforms and parallel processing in proteomics has resulted in a huge surge in the amount of data being generated throughout the world. However, the large-scale analysis of proteins results in data analysis problems [14]. These arise from the nature of the two main technologies used to study proteins:
♦ Systems based on gel electrophoresis
♦ Non-gel based methods

Software written for the analysis of gel data has improved over the years but still requires manual intervention. Data from non-gel based methods such as mass spectrometry, on the other hand, contain not only signals from the peptides being analyzed but also the noise in the system. As the vast majority of the data is noise, a great deal of computer resources and time is wasted in trying to analyze it. In addition, algorithms written to analyze MS data indicate only the significance of a match, not whether the match is actually correct, so a large number of false positive results arise. The analysis of these results by untrained or inexperienced personnel can lead to the acceptance of false matches and the faulty identification of the protein.
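One simple way to reduce the amount of noise passed on to a search program is to discard peaks that do not stand out from the background. The sketch below is an assumption-laden illustration rather than the method of any specific software package: it estimates the noise level as the median peak intensity (reasonable only because, as noted above, most peaks are noise) and keeps peaks at least three times above it. The function name and the threshold are hypothetical.

```python
from statistics import median


def filter_peaks(peaks, snr=3.0):
    """Keep (m/z, intensity) peaks whose intensity is >= snr * noise level.

    The noise level is estimated as the median intensity, which assumes
    that most peaks in the spectrum are noise rather than peptide signal.
    """
    if not peaks:
        return []
    noise = median(intensity for _, intensity in peaks)
    return [(mz, intensity) for mz, intensity in peaks
            if intensity >= snr * noise]


if __name__ == "__main__":
    # Four low-level peaks and one clear signal peak (made-up values).
    spectrum = [(100.0, 1.0), (200.0, 1.0), (300.0, 1.0),
                (400.0, 1.0), (500.0, 10.0)]
    print(filter_peaks(spectrum))
```

A threshold of this kind trades sensitivity for speed: raising it removes more noise but may also remove weak genuine peptide signals.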
The wide adoption of proteomics approaches in biological research will require several developments to combat data overload and ensure data quality. First, tools must be readily available to de-select from search routines those MS/MS spectra that are unlikely to yield a match because of poor quality. Second, search algorithms require further refinement to reduce false positives and false negatives (merely setting score thresholds high to reduce false positives is counter to the aim of the experiment); this problem is beginning to be addressed through the development of true probability-based scores that are akin to the quality scores assigned to each base in DNA sequencing. Third, spectral matching algorithms for peptide MS/MS spectra need to be made commercially available. Fourth, a database of truly non-redundant transcripts of the organism under study is required, together with an extensive relational database that can acquire data from the diverse range of instruments involved in each stage of a proteomics experiment.

The ability to collect large proteomic data sets already outstrips the ability to validate, interpret and integrate such data for the purpose of creating biological knowledge. The amount of information being entered into the databases is increasing in geometric progression, whereas the growth of computing power is limited to the rate described by Moore's law. Software tools therefore need to be developed to help manage, interpret, integrate and understand proteomic data. The lack of suitable software tools currently limits essentially all areas of proteomic data analysis, from database searching using MS/MS spectra to the assembly of large data sets containing different types of data in relational databases. To derive value from the data that goes beyond an initial scan for 'interesting observations', and to make data portable and comparable, it will be necessary to develop algorithms that assign to each data point a score estimating the probability that the observation is correct. Just as the assignment of quality scores to each base in DNA sequencing using the algorithm Phred was essential for the success of genome sequencing programs, it can be expected that probability-based scores calculated for proteomic data will have a similar impact on proteomics.

More recent initiatives have shown that quantification of information is the next hurdle. Every item of data involved in a bioinformatics analysis has to be weighted with an appropriate number in order to give it the proper significance. These weights express the relative importance of each component of an entity or each event of a process.

A recent development, or branch, of bioinformatics is systems biology. This field has the advantage of portraying the actual relationships between proteins. As the data are stored as components, redundancy is avoided and the time taken for downloading is minimised.

Some challenges involved in this technology are:
♦ The first challenge is the enormous complexity of the proteome. The detection, and particularly the molecular analysis, of this complexity remains an unmatched task.
♦ The second challenge is the need for a general technology for the targeted manipulation of gene expression in eukaryotic cells. An approach that has proved successful for the systematic analysis of biological systems relies on iterative cycles of targeted perturbations of the system under study and the systematic analysis of the consequences of each perturbation. Although recent advances in using RNA interference in higher eukaryotic cells open up exciting possibilities, the general targeted manipulation of biological systems in these species remains unsolved.
♦ The third challenge is the limited throughput of today's proteomic platforms: iterative, systematic measurements on differentially perturbed systems demand a sample throughput that is not matched by current proteomic platforms.
♦ The fourth challenge is the lack of a general technique for the absolute quantification of proteins. The ability to quantify proteins absolutely, thereby eliminating the need for a reference sample, would have far-reaching implications for proteomics, from the determination of the stoichiometry of protein complexes to the design of clinical studies aimed at discovering diagnostic markers.

Studies have also highlighted the limitations of shotgun proteomics, including the difficulty of detecting and analyzing by collision-induced dissociation (CID) mass spectrometry all of the peptides in a sample, the qualitative nature of data-dependent experiments, and the challenge of processing the tens of thousands of CID spectra generated in a typical experiment, one of the many informatics challenges that still face scientists in this field. On average, a protein digested with trypsin will generate 30−50 different peptides. A tryptic digest of the proteome of a typical human cell will therefore generate a peptide mixture containing at least hundreds of thousands of peptides. Even the most advanced LC-MS/MS systems cannot resolve and analyze such complexity in a reasonable amount of time [26, 27, 28].

5.4.4 Applications of the Technology

Some of the major applications of bioinformatics in proteomics are:

Structural analysis
♦ High throughput data from gel electrophoresis: New algorithms for image analysis of two-dimensional gels have been developed within the last five years [15].
♦ Mass spectrometry data analysis: Algorithms for peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) have been developed [6, 9, 16, 17].
♦ Prediction of protein 3D structures: Various tools have been developed to predict the 3D structure of a protein, either by comparison with existing structures or ab initio [18].
♦ Structural analysis of receptors and molecules involved in cell signalling: The various structures and how they function together can be visualised as simulations [18].

Functional analysis
♦ Understanding protein-protein interactions: The interaction between various proteins can be studied using systems biology [23, 25].
♦ Sequence comparison: Sequence comparison helps in the identification of individual proteins and in understanding the difference between normal and abnormal cell proteomes [7].
♦ Sequence to structure information: The relationship between structure and sequence in proteins can be studied using bioinformatics tools [11, 26].

Evolutionary analysis
♦ Tracing ancestral connections: Construction of phylogenetic trees and multiple sequence alignment help trace the evolutionary relationships between proteins [24, 25].

Biomedical applications
♦ Drug discovery: Proteomics has a major role to play in drug discovery, as potential drug targets, such as those for cancer therapy, can be simulated and tested [20, 21].
♦ Biomarker development: Proteomics is very useful as it can compare a whole proteome or subproteome at a time and can thus help in designing biomarkers [14].
♦ Proteomic profiling: Proteomic profiling helps determine the differences in protein expression patterns between cells, or within a cell at different times [22].

Others
♦ Analysis of microarrays: Chips containing arrays can be automated and connected to software for analysis [19].

Apart from analysis, bioinformatics is also used to store information in databases, which can also function as knowledge resources for scientists throughout the world [8].

5.4.5 Relevant web sites

Databases
Protein sequence databases
♦ Swiss-Prot
Protein structure databases
♦ PDB
♦ CATH
♦ SCOP

Bioinformatics tools
Various bioinformatics tools and what they are used for are given below.

A. Predict the protein sequence from a given nucleotide sequence by finding the most probable open reading frame
1. By direct translation of sequences without introns
♦ ExPASy Translation Tool (at the Swiss Institute of Bioinformatics)
♦ NCBI ORF Finder
2. By predicting promoters, splice sites, termination sites, etc.
♦ GENSCAN
♦ BCM SearchLauncher
♦ Grail
♦ FGENEH
♦ Genmark

B. Identification of protein based on sequence
♦ AACompIdent
♦ AACompSim
♦ TagIdent

C. Identification of physical properties based on sequence
♦ Compute pI/MW
♦ PeptideMass
♦ SAPS

D. Identification of similar proteins from databases
♦ BLAST
♦ FASTA
♦ GenQuest Q Server

E. Align multiple sequences to identify patterns between proteins of various species or proteins having evolutionary significance
♦ ClustalW
♦ BCM Search Launcher

F. Identification of specific domains or motifs
♦ Pfam
♦ SCOP - Structural Classification of Proteins
♦ PROSITE - a database of regular expression-like patterns (motifs)
♦ PRINTS - a diagnostic collection of protein fingerprints
♦ fingerPRINTScan
♦ BLOCKS - ungapped multiple sequence alignments representing conserved protein regions

G. Structure prediction tools
Fold and secondary structure prediction
♦ PredictProtein Server
♦ Meta PP
Transmembrane region prediction
♦ TMHMM
♦ TOPPRED
Coiled coil region prediction
♦ MultiCoil - predicts the location of coiled-coil regions in amino acid sequences and classifies the predictions as dimeric or trimeric. The method is based on the PairCoil algorithm.
Tertiary structure prediction by homology modeling
♦ Swiss-Model - Automated Protein Modeling Server
♦ ModBase - Database of comparative protein structure models
Align your protein structure vs other protein structures
♦ DALI Server - Automated Protein Structure Alignment [12]
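PROSITE, listed above among the motif resources, describes motifs with regular expression-like patterns. As a rough illustration of how such a pattern can be scanned against a sequence, the sketch below converts a pattern into a Python regular expression. It handles only the basic syntax elements (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats, `<` and `>` as sequence-end anchors used outside brackets); the function names and the test sequence are hypothetical, and the motif shown is the PROSITE N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern into Python regex syntax."""
    regex = pattern.rstrip(".")                            # drop trailing period
    regex = regex.replace("-", "")                         # element separators
    regex = regex.replace("x", ".")                        # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")     # forbidden residues
    regex = regex.replace("(", "{").replace(")", "}")      # repeat counts
    regex = regex.replace("<", "^").replace(">", "$")      # terminal anchors
    return regex


def find_motifs(sequence, pattern):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    # The lookahead wrapper lets overlapping matches be reported.
    scanner = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence)]


if __name__ == "__main__":
    n_glyc = "N-{P}-[ST]-{P}."   # N-glycosylation site motif
    print(find_motifs("MNGSAANPTL", n_glyc))
```

In the made-up sequence the second asparagine is followed by proline, which the pattern forbids, so only the first site is a candidate.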



1. Swiss-Prot database, definitions
2. J. S. Petersen, Cinjecture corporation
3. Rediscovering biology, online text book
4. Bioinformatics definition committee, July 17, 2000
5.
6. Blueggel M, Chamrad D, Meyer HE. (2004) Bioinformatics in proteomics. Curr Pharm Biotechnol. 5(1):79-88. Haoudi A, Bensmail H. (2006 Jun) Bioinformatics and data mining in proteomics. Expert Rev Proteomics. 3(3):333-43.
7. Kremer A, Schneider R, Terstappen GC. (2005 Feb-Apr) A bioinformatics perspective on proteomics: data storage, analysis, and integration. Biosci Rep. 25(1-2):95-106.
8. Kearney P, Thibault P. (2003 Apr) Bioinformatics meets proteomics--bridging the gap between mass spectrometry data analysis and cell biology. J Bioinform Comput Biol. 1(1):183-200.
9. Stephens AN, Quach P, Harry EJ. (2005 Apr) A streamlined approach to high-throughput proteomics. Expert Rev Proteomics. 2(2):173-85.
10. Yee A, Pardee K, Christendat D, Savchenko A, Edwards AM, Arrowsmith CH. Structural proteomics: toward high-throughput structural biology as a tool in functional genomics.
11. Online Bioinformatics courses and lectures
12. Englbrecht CC, Facius A. (2005 Dec) Bioinformatics challenges in proteomics. Comb Chem High Throughput Screen. 8(8):705-15.
13. Patterson SD, Aebersold RH. (2003) Proteomics: the first decade and beyond. Nature Genetics. 33:311-323.
14. He QY, Chiu JF. (2003 Aug) Proteomics in biomarker discovery and drug development. J Cell Biochem. 89(5):868-86.
15. Dowsey AW, Dunn MJ, Yang GZ. (2004 Dec) ProteomeGRID: towards a high-throughput proteomics pipeline through opportunistic cluster image computing for two-dimensional gel electrophoresis. Proteomics. 4(12):3800-12.
16. Chamrad DC, Korting G, Stuhler K, Meyer HE, Klose J, Bluggel M. (2004 Mar) Evaluation of algorithms for protein identification from sequence databases using mass spectrometry data. Proteomics. 4(3):619-28.
17. Cristoni S, Bernardi LR. (2004 Dec) Bioinformatics in mass spectrometry data analysis for proteomics studies. Expert Rev Proteomics. 1(4):469-83.
18. Lundgren DH, Eng J, Wright ME, Han DK. (2003 Nov) PROTEOME-3D: an interactive bioinformatics tool for large-scale data exploration and knowledge discovery. Mol Cell Proteomics. 2(11):1164-76.
19. Boguski MS, McIntosh MW. (2003 Mar) Biomedical informatics for proteomics. Nature. 422(6928):233-7.
20. Rai AJ, Chan DW. (2003 Mar) Cancer proteomics: Serum diagnostics for tumor marker discovery. Nature. 422(6928):233-7.
21. Burbaum J, Tobal GM. (2002 Aug) Proteomics in drug discovery. Curr Opin Chem Biol. 6(4):427-33.
22. White CN, Chan DW, Zhang Z. (2004 Jul) Bioinformatics strategies for proteomic profiling. Clin Biochem. 37(7):636-41.
23. MacCoss MJ. (2005 Feb) Computational analysis of shotgun proteomics data. Curr Opin Chem Biol. 9(1):88-94.
24. Cannataro M, Cuda G, Veltri P. (2005) Modeling and designing a proteomics application on PROTEUS. Methods Inf Med. 44(2):221-6.
25. Lester PJ, Hubbard SJ. (2002 Oct) Comparative bioinformatic analysis of complete proteomes and protein parameters for cross-species identification in proteomics. Proteomics. 2(10):1392-405.
26. Martens L, Hermjakob H, Jones P, Adamski M, Taylor C, States D, Gevaert K, Vandekerckhove J, Apweiler R. (2005 Aug) PRIDE: the proteomics identifications database. Proteomics. 5(13):3537-45.
27. Palagi PM, Hernandez P, Walther D, Appel RD. (2006) Proteome informatics I: Bioinformatics tools for processing experimental data. Proteomics. 6:5435–5444.
28. Lisacek F, Cohen-Boulakia S, Appel RD. (2006) Proteome informatics II: Bioinformatics for comparative proteomics. Proteomics. 6:5445–5466.

Chapter 5.5 Automation and High-Throughput Proteomics
Mohammad Tariq Sadat Hai



Proteomics is the study of the structure, function and interactions of the total encoded proteins (the proteome), as well as their isoforms and modifications, in a particular cell line, tissue or whole organism of interest. There are three main approaches in proteomics: expression proteomics, cell-map proteomics and structural proteomics. Expression proteomics (also called “differential expression proteomics”) is the study of the set of all protein isoforms and their cell-to-cell differences and modifications. Cell-map proteomics deals with protein–protein interactions within cells, tissues and so on. Structural proteomics includes the determination of three-dimensional protein structures, as well as their higher order complexes, on a genome-wide scale [1, 2, 3].

Underlying Principles

DNA and mRNA, being physico-chemically homogeneous and amplifiable by PCR methods, are amenable to automation. Proteome analysis, in contrast to the analysis of DNA or mRNA, has been limited by a number of factors: (i) the level of protein expression cannot be predicted from the level of mRNA; (ii) proteins undergo many post-translational modifications, resulting in different conformations/components with distinct functionalities; and (iii) protein maturation and degradation are dynamic processes that alter the final amount of active protein independently of the mRNA level. Moreover, as proteome datasets grow larger and more complex every day, it is becoming more difficult to archive and analyze them manually. Proteome analysis thus requires a higher degree of throughput and automation to allow: (i) the extraction and high-resolution separation of all protein components, including membrane proteins and proteins with a low copy number; (ii) the identification and quantification of each protein component; and (iii) the comparison, analysis and visualization of complex changes in expression patterns [4, 5].

Figure 5.5.1 Automated robot used to mount and align protein crystals at the Berkeley Lab Advanced Light Source [3].

Today, high-throughput proteomics has become attainable thanks to modern technology: highly sensitive instrumentation for proteome analysis, sophisticated protein separation technologies, and high-precision computational data analysis methods [6]. A number of automated technologies (robotics and intelligent systems) have made it possible to capture a higher-quality snapshot of the large and complex proteome and to analyse its activities [3]. Automated and high-throughput technologies are used in three areas of proteomics: (i) 2-D Gel Electrophoresis (2-DE) and Mass Spectrometry (MS) are used to separate, identify and characterize the set of proteins within the proteome; (ii) protein microarray technology is used to monitor differential expression and interactions of the proteins; and (iii) structure and imaging techniques are used for the determination of protein 3-D structure, protein localization, and the quantitative analysis of protein-protein interactions [3].


Recent Advances

Automation of DNA sequencing technologies helped to accelerate the Human Genome Project, which was initially a laborious, expensive and personnel-intensive task. Similarly, automation is changing the field of proteomics today, saving both the cost and the time of experiments [3]. Traditionally, 2-DE has been used to obtain a global picture of the expression levels of a proteome under various conditions. In recent years, MS technologies have evolved into a versatile tool for examining the simultaneous expression of more than 1000 proteins and for identifying and mapping post-translational modifications. High-throughput methods performed in an array format have also emerged, enabling large-scale projects for the characterization of protein localization, protein-protein interactions, and the biochemical analysis of protein function [7].

Automation of 2-D Gel Electrophoresis (2-DE) and Mass Spectrometry (MS)

While 2-DE is extremely useful, it has some technical limitations. Gels must be handled manually and are cumbersome to run; they have limited sample capacity, a poor dynamic range and low sensitivity for very acidic or basic proteins; and they are biased toward abundant and soluble proteins (low-abundance proteins cannot be detected without additional sample enrichment techniques). The resolution of 2-DE is not sufficient for the enormous diversity of cellular proteins, and several proteins may co-migrate in the same spot. In addition, 2-DE is time-consuming: it may take days to run and analyze a single gel [3, 7, 8, 9, 10].

To overcome these limitations, several approaches have emerged. In one approach, a number of 2-DE products have been developed that support automated gel processing. The a2DEoptimizer by NextGen Sciences features automated, computer-controlled gel casting that can cast multiple gels simultaneously. It can also create user-defined gradient gels, which can be difficult to produce manually. Large Scale Biology, through its subsidiary Predictive Diagnostics, has released BAMF (Biomarker Amplification Filter), a computer platform combining 2-DE, NMR, MS, and biomarkers to identify individual proteins [3].

Several features that are commonly offered by many of the newer automated gel processing systems include ‘the ability to: (i) import and export gels into standard bitmapped graphics formats; (ii) manipulate, preprocess, filter, and organize gel bitmaps; (iii) visualize and compare gels; (iv) create, queue, and monitor computational analysis tasks; and (v) present results (e.g., peptide matches in an excised, digested protein spot)’ [3].

Image analysis of 2-DE gels is calculation-intensive. To address this, ProteomeGRID, a high-throughput 2-DE image analysis computing platform, has been developed; it uses a gel matching algorithm to overcome the bottleneck of spot matching. It builds on the proTurbo cluster image computing engine with Grid-enabled versions of automatic 2-DE and MS analysis algorithms. Specific emphasis is placed on the integrated development of a HUPO PSI GPS (Human Proteome Organization Proteomics Standards Initiative General Proteomics Standards) object model and ontology for statistical image mining of 2-DE gels. The PSI will also drive the automated spot cutting, protein digestion and MS steps between the 2-DE and MS bioinformatics pipelines [9].

Given the limitations of 2-DE, a number of alternatives have also emerged in recent years. Non-2-DE-based protein and peptide separation methods, combined with new protein chemistries, enrichment methods and highly automated MS instrumentation, provide new tools for analyzing the properties of proteomes with increased sensitivity and throughput [10]. Non-2-DE-based methods include Time-Of-Flight (TOF) MS and the ‘soft ionization’ methods, namely Matrix-Assisted Laser Desorption Ionization (MALDI), ElectroSpray Ionization (ESI) and Surface-Enhanced Laser Desorption Ionization (SELDI) [7, 11]. These ionization sources are often coupled to TOF, ion trap or quadrupole analyzers, or to combinations of these analyzers.
Using MS, peptides and proteins are identified by their mass-to-charge (m/z) ratio, which correlates with the molecular mass. MS can also be applied to sequence peptides and proteins. Post-translational modifications also change the m/z ratio [11]. Proteins can be sorted according to their biochemical characteristics, such as hydrophobicity, anion-, cation- or metal ion-binding capabilities, or specific protein–protein interactions (e.g. receptor–ligand or antibody–antigen interactions), either by liquid chromatography with various columns and pressure conditions or by bait molecules mounted on a fixed surface provided on small chips. The small-chip procedure has pushed proteomics studies into large-scale, high-throughput application [11]. These ProteinChips® (developed by Ciphergen Biosystems Inc., USA) can also be coated with specific antibodies, enabling the study of a variety of specific gene products. The sensitivity of this technique can be in the picomole-to-femtomole range, and it is able to detect low-abundance peptides and proteins in biological material [11]. SELDI-TOF MS is most effective at profiling low molecular weight proteins (<20 kDa). The use of small sample volumes (µl range) and the detection of between 15,500 (low-resolution SELDI-TOF) and >400,000 m/z ratios (high-resolution SELDI-TOF) make proteomic profiling of diverse biological samples possible [11]. Figure 5.5.2 below illustrates a typical SELDI-TOF workflow.
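The relationship between molecular mass, charge and m/z described above can be made concrete with a short calculation. The following Python sketch is illustrative only: it assumes positive ions formed by adding protons, an idealized linear TOF analyzer with no delayed extraction or reflectron, and a hypothetical peptide mass, tube length and accelerating voltage that are not taken from the text. It shows how the same peptide appears at different m/z values for different charge states, how a phosphorylation shifts the m/z, and why flight time scales with the square root of m/z.

```python
import math

PROTON_MASS_DA = 1.007276             # mass of a proton, in daltons
DALTON_KG = 1.66053907e-27            # one dalton, in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19
PHOSPHORYLATION_SHIFT_DA = 79.96633   # HPO3 added by one phosphorylation


def mz(neutral_mass_da, charge):
    """m/z of a positive ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def flight_time_s(mz_value, tube_length_m, accel_voltage_v):
    """Ideal linear TOF: z*e*U = m*v^2/2, hence t = L * sqrt(m / (2*z*e*U))."""
    mass_per_charge = mz_value * DALTON_KG / ELEMENTARY_CHARGE_C  # kg per coulomb
    return tube_length_m * math.sqrt(mass_per_charge / (2.0 * accel_voltage_v))


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass in Da
    for z in (1, 2, 3):
        plain = mz(peptide_mass, z)
        phospho = mz(peptide_mass + PHOSPHORYLATION_SHIFT_DA, z)
        # Assumed instrument: 1 m flight tube, 20 kV accelerating voltage.
        t_us = flight_time_s(plain, 1.0, 20000.0) * 1e6
        print(f"z={z}: m/z={plain:.4f}, phospho m/z={phospho:.4f}, t={t_us:.2f} us")
```

Because the flight time depends only on m/z for a given instrument, a real analyzer is calibrated against ions of known m/z rather than computed from first principles as in this sketch.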

Figure 5.5.2 Surface-Enhanced Laser Desorption and Ionization (SELDI) technology. Using a robotic sample dispenser, 1 µL of serum is applied to the surface of a protein-binding chip. A subset of the proteins in the sample binds to the surface of the chip. The bound proteins are treated with a matrix-assisted laser desorption ionization matrix and are washed and dried. The chip, which contains multiple patient samples, is inserted into a vacuum chamber where it is irradiated with a laser. The laser desorbs the adherent proteins and causes them to be launched as ions. The time of flight (TOF) of the ion before detection by an electrode is a measure of the mass-to-charge (m/z) value of the ion. The ion spectra can be analysed by computer-assisted tools that classify a subset of the spectra by characteristic patterns of relative intensity [12].

MALDI-TOF instruments are usually used for peptide mass fingerprinting (PMF) [13]. Compared with MALDI, ESI has a significant advantage in that it can easily be coupled to separation techniques such as liquid chromatography (LC) and HPLC, allowing high-throughput, on-line analysis of peptide or protein mixtures. Typically, proteins in a complex mixture are separated by ion-exchange or reversed-phase column chromatography and then subjected to tandem MS (MS/MS) analysis via on-line ESI [7, 10]. More recently, however, off-line spotting of peptides onto sample targets and the use of MALDI instruments capable of high-throughput MS/MS analysis have also become an option [10].

The development of multidimensional liquid chromatography (MDLC)-based detection and quantification of multiplexed, isotopically iTRAQ-labeled proteins has enabled the global analysis of complex biological systems and has permitted the simultaneous comparison of samples from different cell states and disease specimens. Fully automated liquid chromatography-tandem mass spectrometry (LC-MS/MS) usually generates tens of thousands of MS/MS spectra. Many thousands of peptides can be analyzed collectively by multiple LC-MS/MS runs in each proteomic experiment, so that thousands of proteins can be identified. Automated data processing software packages, such as Multi-Q, have been developed for multiplexed protein quantitation of iTRAQ-labeled samples [8].
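The mass fingerprinting (PMF) approach mentioned above for MALDI-TOF instruments can be sketched in a few lines. The Python example below is a simplified illustration, not the algorithm of any particular search engine: it assumes complete tryptic cleavage (after K or R, except before P), monoisotopic masses of unmodified neutral peptides, a fixed mass tolerance, and a simple count of matching masses as the score. The two database sequences are hypothetical toy sequences.

```python
import re

# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def tryptic_digest(sequence):
    """Cleave C-terminal to K or R unless the next residue is P.

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER_MASS


def fingerprint_score(observed_masses, sequence, tolerance_da=0.2):
    """Count observed masses that match a theoretical tryptic peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance_da for theo in theoretical)
        for obs in observed_masses
    )


def best_match(observed_masses, database):
    """Return the database entry with the highest fingerprint score."""
    return max(database, key=lambda name: fingerprint_score(observed_masses, database[name]))


if __name__ == "__main__":
    # Hypothetical two-entry "database" of toy sequences.
    database = {"protA": "MKAARPGKLL", "protB": "GGSKWWYR"}
    observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
    print(tryptic_digest(database["protA"]))
    print(best_match(observed, database))
```

Real PMF software additionally handles missed cleavages, modifications and probabilistic scoring; the counting score here is only the simplest possible stand-in.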

Figure 5.5.3 Multi-Q quantitation system. The Multi-Q package accepts two data inputs: raw data from mass spectra and protein identification results from search engines such as MASCOT, SEQUEST, and X!Tandem. The data then undergo data conversion, peptide ratio determination, and protein ratio determination, and final protein relative abundance ratios are produced [8].

As shown in Figure 5.5.3 above, Multi-Q provides a data converter for handling spectral raw data and can also accept search results from various search engines, including SEQUEST, X!Tandem and MASCOT. After automatic filtering of non-iTRAQ-labeled peptides, the ratio of all peptides with unambiguous identification is determined [8].

The Computational Proteomics Analysis System (CPAS) is an open-source, web-based analysis platform that contains an entire data analysis and management pipeline for Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) proteomics, including experiment annotation, protein database searching and sequence management, and mining of LC-MS/MS peptide and protein identifications [14].
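The sequence of steps described above for Multi-Q (filtering out unlabeled peptides, determining peptide ratios, then determining protein ratios) can be illustrated with a minimal Python sketch. This is not Multi-Q's actual implementation: the input record layout, the choice of reporter channel 114 as the reference, and the use of the median to summarize peptide ratios into a protein ratio are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def peptide_ratios(reporter, reference="114"):
    """Ratio of each reporter-ion intensity to the reference channel."""
    ref_intensity = reporter[reference]
    return {
        channel: intensity / ref_intensity
        for channel, intensity in reporter.items()
        if channel != reference
    }


def protein_ratios(identifications, reference="114"):
    """Summarize peptide ratios per protein (median is an assumed choice)."""
    per_protein = defaultdict(lambda: defaultdict(list))
    for ident in identifications:
        reporter = ident.get("reporter")
        # Skip peptides with no usable iTRAQ reporter signal.
        if not reporter or reporter.get(reference, 0) <= 0:
            continue
        for channel, ratio in peptide_ratios(reporter, reference).items():
            per_protein[ident["protein"]][channel].append(ratio)
    return {
        protein: {channel: median(values) for channel, values in channels.items()}
        for protein, channels in per_protein.items()
    }


if __name__ == "__main__":
    # Hypothetical identifications; intensities are invented for illustration.
    identifications = [
        {"protein": "P1", "peptide": "AAK", "reporter": {"114": 100.0, "115": 200.0}},
        {"protein": "P1", "peptide": "LLR", "reporter": {"114": 50.0, "115": 150.0}},
        {"protein": "P1", "peptide": "GGK", "reporter": {"114": 10.0, "115": 10.0}},
        {"protein": "P2", "peptide": "WWK", "reporter": None},  # unlabeled peptide
    ]
    print(protein_ratios(identifications))
```

A median is used here because it is robust to a single outlying peptide; a production tool would also apply normalization and identification-quality filters that this sketch omits.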

Automation of Protein Microarray

Automation has been achieved in different categories of protein microarray technology. The ProteinChip technology described above works in the microarray format and methodology. It is a parallelized approach that provides information on protein structure, character, and PTMs. Other commercially available protein microarray kits have their own features. For example, the Whatman FAST® Quant system performs parallel processing and can perform over 500 measurements from 56 samples; the Panorama™ Ab Microarray Cell Signaling Kit by Sigma-Aldrich can detect minute quantities of protein, as low as a few nanograms per milliliter; and BD Lyoplate™ Technology by BD Biosciences allows flexibility in microarray plate design. Using microfluidics technology, chips can be etched with microscopic channels in which miniature assays are performed. Caliper Technologies has developed one such device, the LabChip® 3000 Drug Discovery System [3].

Automation of Protein Structure and Imaging

The rate at which protein 3-D structures, including quaternary structures, are determined has increased greatly in the past few years. The most common techniques for the structural analysis of proteins include NMR, X-ray crystallography, structure prediction methodology, and, to a certain degree, MS. Imaging techniques can detect protein–protein interactions and protein localization. Such techniques include transfected cell arrays, Green Fluorescent Protein (GFP)-based labeling, and Fluorescence Resonance Energy Transfer (FRET). Multiplexed Surface Plasmon Resonance (SPR) [34, 35] is an automated approach to the quantitative analysis of protein interactions. Advantages of SPR include its low target consumption and freedom from radioactive labeling. VEGA ZZ, a molecular modeling software package, can analyze protein structures. It has an extensive list of features, including multiple file format support, atomic potential attribution, a 3-D molecular editor, and a protein–protein docking system [3].


Evaluation of High-Throughput Proteomics

The goal of high-throughput proteome analysis is to catalog and quantify the proteins that a whole organism, a specific tissue or a cellular compartment expresses under certain conditions. To achieve such a daunting goal, two critical factors need to be addressed: first, the methods and technology of high-throughput proteomics must be shown to generate valid, reproducible, and reliable results; and second, the gap between high-throughput data and biological discovery must be bridged. The intrinsic complexity of the biology, multiplied by the enormous volumes of data generated, makes the analysis and interpretation of high-throughput proteomics experiments extremely difficult [15].

Protein microarrays represent the first new technology that can profile the state of a signaling pathway target even after the cell is lysed. The reverse-phase protein microarray (see Figure 5.5.4 below) has a unique ability to analyze signaling pathways using small numbers of human tissue cells microdissected from biopsy specimens procured during clinical trials. These arrays can be manufactured in a sectored array format in which dozens of analytes can be queried simultaneously on one slide, which increases throughput and facilitates data analysis [12].

Figure 5.5.4 Reverse-phase protein arrays. A reverse-phase array immobilizes the cellular lysate sample to be analyzed. Lysates are prepared from cultured cells or microdissected tissues and are arrayed in miniature dilution curves. The analyte molecule contained in the sample is then detected by a separate labeled probe (e.g., an antibody) that is applied to the surface of the array [12].

Reverse-phase protein microarrays do not require direct tagging of the protein as a readout for the assay, which yields a dramatic improvement in the reproducibility, sensitivity and robustness of the assay over other techniques [12].

Though ProteinChip® technology has proved promising in disease biomarker detection, it has some key limitations. For example, very little is known about the true nature of the m/z ratios. The vast majority of the peaks obtained in the MS analysis are unknown, and it requires a tremendous amount of work and scrutiny to characterize all of them. Many of these peaks will be in the pico- to femtomole range. Presently, there is no technique available that can amplify peptides and proteins as quickly and reliably as PCR does with DNA [11].

Figure 5.5.5 Four functional areas of the ProteusLIMS system (GenoLogics) illustrating the typical proteomics workflow [3].

Many current automated procedures in proteomics generate large quantities of unwieldy data. Researchers are therefore relying on Laboratory Information Management Systems (LIMS) to assist them with the extraction of useful information. The ProteusLIMS system from GenoLogics comprises four functional areas: lab management, instrument and data integration, bioinformatics and data management, and analytics and reporting (see Figure 5.5.5 above). Modas, a LIMS project currently under development by Nonlinear Dynamics, supports 2-DE and MS, integrating analysis with LIMS in one package. LIMS software will be the key to taking full advantage of advances in automation [3].

5.5.4 Applications of High-Throughput Proteomics

Modern technological capabilities have allowed the identification and quantification of most of the large number of proteins in complex biological samples such as whole cell lysates, tissues and blood. This has allowed proteomics to be used in understanding complex disease processes, in the early detection of disease using proteomic patterns of body fluid samples [12], in developing new diagnostics, and in improving the drug development process [6].

Protein microarrays hold great promise for the identification of drugs and drug targets, for basic proteomic research such as determining protein-protein, protein-lipid and enzyme-substrate interactions, and for clinical diagnosis. The reduced sample consumption of the microarray format is crucial in proteome analysis, since often only minimal amounts of sample are available. Other promising uses of protein microarrays include real-time patient monitoring during disease treatment and therapy [7]. In future, blood tests may be performed by applying just a few drops of blood to a chip carrying specific protein markers, providing valuable diagnostic and real-time prognostic information [3].

In the last few years, ProteinChip® technology has been applied to the detection of cancers of the head and neck, lung, breast, pancreas, kidney, bladder, prostate and ovary through the detection of biomarkers in serum, urine, pancreatic juice, nipple aspirate fluid or tissue homogenates. Besides the early detection of cancer, systematic analysis of the serum, tissue, cellular, and subcellular proteome may help to find novel biomarkers that reveal transplant rejection, drug toxicity, and chronic inflammatory or cardiovascular diseases [11].

Virulent and attenuated strains of Mycobacterium tuberculosis and Helicobacter pylori were studied by high-resolution 2-DE and MALDI mass fingerprinting in the search for potential DNA vaccines and candidate antigens for serological detection. Marker proteins were selected for the development of potential vaccines, which illustrates the success of proteomics studies [16].

MS/MS combined with ESI or MALDI has been developed as a potential tool for the identification of targeted microorganisms through the analysis of peptides generated from cellular proteins. Acid extraction of bacterial spore proteins followed by peptide sequencing using MALDI MS/MS has been used to discriminate between species of the genus Bacillus in spore mixtures. LC-ESI MS/MS has been used in bacterial classification, based on the peptide sequence information generated from the analysis of a bacterial protein digest. This method can be a strong complement to the alternative approaches of comparing microbial genomes by DNA sequencing or microarray hybridization [17].

In recent years, proteomics studies have been performed to identify the proteomic signature for personalized medicine that best targets the patient's entire disease-specific protein network [12, 16]. Because cancer development is diverse, with many unexpected subtypes, many different responses to treatment are observed and individualized therapy is required. Proteins have diverse functions and occur as variously modified species; these determine the cancer phenotype and call for functional investigation at the protein level, which is not possible at the level of DNA or RNA [16].


Relevant web sites

Details about CPAS-assisted interactive data analysis, CPAS architecture and implementation, and sample data loaded into CPAS are available online.

HUPO (Human Proteome Organisation) maintains an international website. HUPO is an international consortium of national proteomics research associations, government researchers, academic institutions, and industry partners. HUPO promotes the development and awareness of proteomics research, advocates on behalf of proteomics researchers throughout the world, and facilitates scientific collaborations between HUPO members and Initiatives. Presently, there are seven HUPO-sponsored Scientific Initiatives:

1. Human Liver Proteome Project
2. Human Brain Proteome Project
3. Proteomics Standards Initiative
4. Human Antibody Initiative
5. Plasma Proteome Project
6. Mouse Models of Human Disease
7. Human Disease Glycomics/Proteome Initiative

The Human Protein Atlas (created to show the expression and localization of proteins in a large variety of normal human tissues and cancer cells) is available online.

The ‘Journal of Proteome Research’ publishes scientific articles on recent advances in proteomics. The journal ‘Drug Discovery Today’ publishes many articles focusing on the application of proteomics technology (e.g. protein microarrays) in the field of drug development. The journal ‘Proteomics’ is a key resource for information on proteomics. All three journals can be accessed online.

A comprehensive list of currently available LIMS packages may be found at the LIMSource. Many other publications related to proteomics can be found online. The following table lists some Protein Interaction Resources and some other resources of interest [18]:
Database of interacting proteins (DIP) BIND MIPS Protein–protein interaction database (IntAct) Human Protein Reference Database (HPRD) Human Protein Interaction Database (HPID) HUPO PSI Index site: The Proteome Analysis DB Swiss 2DPAGE 2DWG Image Meta-database PEDRo Open Proteomics Database Systems Biology Institute SBEAMS The Pathway Resource List (PRL)


Key Industry Suppliers

An extensive list of commercially available automated and high-throughput proteomic products, with their industrial sources and prices, is available at the website of Biocompare®. The following table lists some proteomics software systems and their focus [6]:

Software: Focus
Mascot: MS–MS search engine
Mascot Integra: Data management for proteomics
Sequest: MS–MS search engine
EPICenter: Data management and validation for peptide ID data
Spectrum Mill: MS–MS search engine environment
Proteus LIMS: Proteomics data management LIMS
Peptide Prophet: High-throughput validation of peptide identifications
Protein Prophet: Protein identification (statistical)
X!Tandem: Open source search engine for MS–MS
GPM: Public database of identified peptides
XPRESS: Quantitative differential analysis for ICAT
SBEAMS: Systems biology experiments analysis and management
PRISM: High-throughput proteomics information management system
Scaffold: Protein identification automation software
Phenyx: Protein identification and validation platform
DBParser: Protein identification and validation platform
MZmine: Differential LC-MS analysis of metabolomics data
ProDB: Storage and analysis of identification proteomics experiments
PROTEIOS: Storage, analysis and annotation of proteomics experiments
ProteomIQ: Integrated proteomics data management platform
Proteome Browser: Protein sequence annotation
PepLine: Software pipeline for protein identification
Protein Expression System: Quantitative and qualitative proteomics analysis
Xome: Quantitative and qualitative proteomics analysis
CellCarta: Integrated suite for quantitative proteomics analysis
mzXML: File format (standard) for mass spectra data
Trans-Proteomic Pipeline: XML-based analysis pipeline for proteomics data
ProICAT: Protein quantitation and identification for ICAT
ProQuant: Protein quantitation and identification for iTRAQ
DeCyder MS: Identification and quantitation analysis platform
MS peaks: De novo protein identification
Expasy proteomics server: Protein sequence analysis tools and databases
Ion Source: On line resource of mass spectrometry methods




1 Lee, W.C. and Lee, K.H. (2004) Applications of affinity chromatography in proteomics, Analytical Biochemistry 324(1), 1-10
2 Tyers, M. and Mann, M. (2003) From genomics to proteomics, Nature 422, 193-197
3 Alterovitz, G., Liu, J., Chow, J. and Ramoni, M.F. (2006) Automation, parallelism, and robotics for proteomics, Proteomics 6, 4016-4022
4 James, P. and Quadroni, M. (1999) Proteomics and automation, Electrophoresis 20, 664-677
5 Wilke, A., Rückert, C., Bartels, D., Dondrup, M., Goesmann, A., Hüser, A.T., Kespohl, S., Linke, B., Mahne, M., McHardy, A., Pühler, A. and Meyer, F. (2003) Bioinformatics support for high-throughput proteomics, Journal of Biotechnology 106(2-3), 147-156
6 Topaloglou, T. (2006) Informatics solutions for high-throughput proteomics, Drug Discovery Today 11(11-12), 509-516
7 Zhu, H., Bilgin, M. and Snyder, M. (2003) Proteomics, Annual Review of Biochemistry 72, 783-812
8 Lin, W.T., Hung, W.N., Yian, Y.H., Wu, K.P., Han, C.L., Chen, Y.R., Chen, Y.J., Sung, T.Y. and Hsu, W.L. (2006) Multi-Q: A Fully Automated Tool for Multiplexed Protein Quantitation, Journal of Proteome Research 5(9), 2328-2338
9 Dowsey, A.W., Dunn, M.J. and Yang, G.Z. (2004) ProteomeGRID: towards a high-throughput proteomics pipeline through opportunistic cluster image computing for two-dimensional gel electrophoresis, Proteomics 4, 3800-3812
10 Roe, M.R. and Griffin, T.J. (2006) Gel-free mass spectrometry-based high throughput proteomics: Tools for studying biological response of proteins and proteomes, Proteomics 6, 4678-4687
11 Rocken, C., Ebert, M.P.A. and Roessner, A. (2004) Proteomics in pathology, research and practice, Pathology – Research and Practice 200, 69-82
12 Petricoin, E.F. and Liotta, L.A. (2003) Clinical Applications of Proteomics, The Journal of Nutrition 133, 2476S-2484S (Supplement: Nutritional Genomics and Proteomics in Cancer Prevention)
13 Arrigoni, G., Fernandez, C., Holm, C., Scigelova, M. and James, P. (2006) Comparison of MS/MS Methods for Protein Identification from 2D-PAGE, Journal of Proteome Research 5(9), 2294-2300
14 Rauch, A., Bellew, M., Eng, J., Fitzgibbon, M., Holzman, T., Hussey, P., Igra, M., Maclean, B., Lin, C.W., Detter, A., Fang, R., Faca, V., Gafken, P., Zhang, H., Whitaker, J., States, D., Hanash, S., Paulovich, A. and McIntosh, M.W. (2006) Computational Proteomics Analysis System (CPAS): An Extensible, Open-Source Analytic System for Evaluating and Publishing Proteomic Data and High Throughput Biological Experiments, Journal of Proteome Research 5(1), 112-121
15 Hogan, J.M., Higdon, R. and Kolker, E. (2006) Experimental Standards for High-Throughput Proteomics, OMICS: A Journal of Integrative Biology 10(2), 152-157
16 Liebold, B.W., Graack, H.R. and Pohl, T. (2006) Two-dimensional gel electrophoresis as tool for proteomics studies in combination with protein identification by mass spectrometry, Proteomics 6, 4688-4703
17 Dworzanski, J.P., Deshpande, S.V., Chen, R., Jabbour, R.E., Snyder, A.P., Wick, C.H. and Li, L. (2006) Mass Spectrometry-Based Proteomics Combined with Bioinformatic Tools for Bacterial Classification, Journal of Proteome Research 5(1), 76-87
18 Kremer, A., Schneider, R. and Terstappen, G.C. (2005) A Bioinformatics Perspective on Proteomics: Data Storage, Analysis, and Integration, Bioscience Reports 25(1/2), 95-106

Chapter 6.2 Molecular Biology Approaches for Study of Protein-Protein Interactions
Natalia Samosir

6.2.1 Introduction

Protein-protein interactions are essential in all biological processes, from the replication and expression of genes to the morphogenesis of organisms. The standard molecular biology approach to studying protein-protein interactions is the yeast two-hybrid technique. The yeast two-hybrid system was first described in 1989 by S. Fields and O-K. Song. The system is based on the reconstitution of a transcriptional activator. When two fusion proteins interact, a functional activator is obtained, resulting in the activation of a reporter gene[1].

Figure 1. Yeast two-hybrid transcription

Firstly, two fusion proteins are created: the protein of interest (X), which is constructed to have a DNA binding domain (BD) attached to its N-terminus, and its potential binding partner (Y), which is fused to an activation domain (AD). If protein X and protein Y interact, their DNA-BD and AD are brought together to form a functional transcription activator (TA). This newly formed TA then activates transcription of a reporter gene, which is simply a gene whose protein product can be easily detected and measured. In this way, the amount of reporter produced can be used as a measure of the interaction between the protein of interest and its potential partner[1, 2].

The Underlying Principle

First, it is necessary to construct the ‘bait’ and ‘hunter’ fusion proteins. The ‘bait’ fusion protein is the protein of interest linked to the GAL4 binding domain (GAL4 BD). This is done by inserting the segment of DNA encoding the bait into a plasmid. This plasmid also carries a segment of GAL4 BD DNA next to the site of bait DNA insertion[3, 4]. Therefore, when the DNA from the plasmid is transcribed and translated into protein, the bait has a binding domain attached to its end. The same procedure is used to construct the ‘hunter’ protein, in which the potential binding partner is fused to the GAL4 AD[2, 4].

Figure 2. Plasmid construction of ‘bait’ and ‘hunter’ proteins

In addition to encoding the fusion proteins, these plasmids also contain selection genes, that is, genes encoding proteins that contribute to a cell’s survival in a particular environment. An example of a selection gene is one encoding antibiotic resistance: when antibiotics are introduced, only cells with the antibiotic resistance gene survive. Yeast two-hybrid assays typically use selection genes encoding proteins involved in the synthesis of amino acids such as histidine, leucine and tryptophan[2].

Once the plasmids have been constructed, they must be introduced into a host yeast cell by a process referred to here as transfection (in yeast, this process is more commonly termed transformation). In this process, the outer membrane of a yeast cell is disturbed by a physical method, such as sonication, or by chemical disruption. This disruption produces holes that are large enough for the plasmid to pass through, and in this way the plasmids can cross the membrane and enter the cell[3, 4].

Figure 3. Transfection process

Once the cells have been transfected, it is necessary to isolate colonies that carry both the ‘bait’ and ‘hunter’ plasmids. This is because not every cell will have taken up both plasmids across its plasma membrane; some will have only one plasmid, while others will have none. Transfected cells are isolated by identifying cells that contain plasmids by virtue of their expressing the selection genes mentioned previously[4]. After the cells have been transfected and allowed to recover for several days, they are plated on minimal media, that is, media lacking one essential nutrient, such as tryptophan. The cells used for transfection are auxotrophic mutants: they are unable to produce a nutrient required for their growth. Because the ‘bait’ or ‘hunter’ plasmid supplies the gene needed to make the missing nutrient, cells containing the plasmid are able to survive on the minimal media, whereas untransfected cells cannot. Selection in this way occurs in two rounds: first on one minimal media plate, to select for the ‘bait’ plasmid, and then on another minimal media plate, to select for the ‘hunter’[3, 4].

Once inside the cell, if binding occurs between the hunter and the bait, transcriptional activity is restored and normal Gal4 activity is produced. The reporter gene most commonly used in the Gal4 system is LacZ, an E. coli gene whose product, β-galactosidase, turns cells blue in the presence of the chromogenic substrate X-gal. In this yeast system, the LacZ gene is inserted in the yeast DNA immediately downstream of the Gal4 promoter, so that LacZ is expressed if binding occurs. Detecting interactions between bait and hunter therefore simply requires distinguishing blue from non-blue colonies[2, 4].

Variation to the Two Hybrid System

Reverse two-hybrid and split hybrid system

The “reverse” two-hybrid system was invented to select for disrupted two-hybrid interactions, e.g. those disrupted by mutations, drugs or competing proteins. In this system, the interaction of the X and Y proteins induces the transcription of a reporter gene that confers toxicity on the yeast[5].

Three-hybrid System

In this variation of the yeast two-hybrid system, a third protein (Z) is expressed along with the DNA-BD and AD fusions. Expression of the reporter gene is used to select for interactions that occur only in the presence of this protein. A three-hybrid system was also developed to detect and analyse RNA-protein interactions, in which the binding of a bifunctional RNA molecule links the DNA-BD and AD hybrid proteins and activates transcription of the reporter gene. This system is known as the RNA three-hybrid system[5].

SOS Recruitment system (SRS)

This membrane-associated two-hybrid system makes use of the Ras pathway in yeast. When localized at the plasma membrane, the yeast Ras guanyl nucleotide exchange factor (RGEF) cdc25 stimulates GDP/GTP exchange on Ras and promotes downstream signalling events that ultimately lead to cell growth. A mutant yeast strain harbouring the temperature-sensitive cdc25-2 allele is still able to grow at 25°C but fails to grow at 36°C. However, the human RGEF (hSOS), when targeted to the plasma membrane, efficiently complements the mutation, leading to cell growth at 36°C. In the SRS, the translocation of hSOS depends on a protein-protein interaction: the bait X is fused to C-terminally truncated hSOS, which is active but unable to target the plasma membrane on its own. The bait is co-expressed with a prey Y, which can be either an integral membrane protein or a soluble protein anchored to the membrane by means of a myristoylation signal[5].
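The selection and reporter logic of the classic assay, and its inversion in the reverse two-hybrid system, can be summarized as a small truth-table model. The Python sketch below is a deliberately simplified illustration: the class and function names are hypothetical, each cell is reduced to three yes/no properties, and real complications such as false positives, leaky reporters or partial interactions are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YeastCell:
    has_bait_plasmid: bool     # carries the BD-X ('bait') plasmid
    has_hunter_plasmid: bool   # carries the AD-Y ('hunter') plasmid
    bait_binds_hunter: bool    # whether X and Y interact


def survives_selection(cell):
    """Both rounds of minimal-media selection require both plasmids."""
    return cell.has_bait_plasmid and cell.has_hunter_plasmid


def reporter_active(cell):
    """The activator is reconstituted only if both fusions are present and bind."""
    return survives_selection(cell) and cell.bait_binds_hunter


def classic_readout(cell):
    """Classic assay with a LacZ reporter: blue colonies indicate interaction."""
    if not survives_selection(cell):
        return "no colony"
    return "blue" if reporter_active(cell) else "white"


def reverse_readout(cell):
    """Reverse assay with a toxic reporter: only disrupted interactions grow."""
    if not survives_selection(cell) or reporter_active(cell):
        return "no colony"
    return "colony"


if __name__ == "__main__":
    for cell in (
        YeastCell(True, True, True),
        YeastCell(True, True, False),
        YeastCell(True, False, True),
    ):
        print(cell, classic_readout(cell), reverse_readout(cell))
```

The model makes the key contrast explicit: the same interacting pair that gives a blue colony in the classic assay gives no colony in the reverse assay, which is why the reverse system can be used to screen for mutations, drugs or competing proteins that break an interaction.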

6.2.2 Recent Advances

Prokaryotic Two-Hybrid System
In recent years there has been growing interest in the development of prokaryotic two-hybrid systems. A large number of such systems have been described but, despite their many advantages, they have not been implemented for large-scale or proteome-scale protein interaction mapping as widely as the yeast two-hybrid system[6, 7]. To detect protein-protein interactions using the prokaryotic two-hybrid systems described to date, both the prey and bait vectors have to be introduced into an appropriate E. coli reporter strain by bacterial transformation[6]. The resulting transformants are then assayed for reporter gene activation or repression, or for a reconstituted enzyme activity. It has been proposed that bacterial conjugation could be exploited as a technically simpler and more efficient means of introducing plasmids in combination to test for protein-protein interactions[6, 7].

Prokaryotic two-hybrid systems offer many advantages over yeast-based technologies. These derive largely from the ease with which E. coli can be genetically manipulated, its lack of cellular compartmentalisation, its faster growth rate and the higher transformation efficiencies attainable, which permit rapid and more efficient screening of complex libraries[6]. The systems permit the investigation of prokaryotic protein-protein interactions in a prokaryotic genetic background, but they can also be used for the analysis of eukaryotic proteins. This may be particularly desirable where homologous yeast proteins interfere with an interaction, either by binding and sequestering one of the interacting partners (leading to false negatives) or by acting as a bridge between two proteins (leading to false positives). The absence of such homologous eukaryotic proteins in E. coli may result in fewer false positives and false negatives[6, 7].

Mammalian Two-Hybrid System
Like the yeast two-hybrid system, this is a genetic, in vivo assay based on reconstituting the function of a transcriptional activator. One protein of interest is expressed as a fusion to the Gal4 DNA-binding domain and another is expressed as a fusion to the activation domain of the VP16 protein of herpes simplex virus[8, 9]. The vectors that express these fusion proteins are cotransfected with a chloramphenicol acetyltransferase (CAT) reporter vector into a mammalian cell line. The reporter plasmid contains a cat gene under the control of five consensus Gal4 binding sites. If the two fusion proteins interact, there is a significant increase in expression of the cat reporter gene[8]. The mouse p53 tumour suppressor protein and the simian virus 40 large T antigen had previously been reported to interact in a yeast two-hybrid system, and this interaction was independently confirmed using a mammalian two-hybrid system[8]. The mammalian two-hybrid system can therefore be used as a complementary approach to verify protein-protein interactions detected in a yeast two-hybrid screen. In addition, it has two main advantages: assay results can be obtained within 48 hours of transfection, and protein interactions in mammalian cells may better mimic actual in vivo interactions[8, 9].

Far-Western Analysis
Far-Western blotting was originally developed to screen protein expression libraries with 32P-labelled glutathione S-transferase (GST)-fusion proteins[10, 11]. Far-Western blotting is now used to identify protein-protein interactions. In recent years it has been used to determine receptor-ligand interactions and to screen libraries for interacting proteins[10]. With this method it is possible to study the effect of post-translational modifications on protein-protein interactions, to examine interaction sequences using synthetic peptides as probes, and to identify protein-protein interactions without using antigen-specific antibodies[10, 12].

The far-Western blotting technique is quite similar to Western blotting[10]. In a Western blot, an antibody is used to detect the corresponding antigen on a membrane. In a classical far-Western analysis, a labelled or antibody-detectable “bait” protein is used to probe and detect the target “prey” protein on the membrane[10, 12]. The sample (usually a lysate) containing the unknown prey protein is separated by SDS-PAGE or native PAGE and then transferred to the surface of a membrane, where the prey protein becomes accessible to probing[10]. After transfer, the membrane is blocked and then probed with a known bait protein, which is usually applied in pure form. Following binding of the bait protein to the prey protein, a detection system specific for the bait protein is used to identify the corresponding band[10-12].
Depending on the presence of a label or tag on the bait protein, one of four detection methods is used to detect far-Western blot protein-protein interactions[10]:
• Direct detection of prey protein with a radioactive bait protein
• Indirect detection with an antibody to the bait protein
• Indirect detection with an antibody to the tag of a fusion-tagged bait protein
• Detection with biotinylated bait protein and avidin or streptavidin labelled with an enzyme (HRP/AP)

6.2.3 Evaluation of the Technology
The yeast two-hybrid system became one of the most popular technologies for the detection of protein-protein interactions because it is fairly simple, rapid and inexpensive: it avoids the costly protein purification and antibody development needed in traditional biochemical methods. No previous knowledge about the interacting proteins is necessary for a screen to be performed[5].

Limitations
Some classes of proteins are not suitable for analysis by the yeast two-hybrid system. For example, transcriptional activators may activate transcription without any interaction. Another class of troublesome proteins are those containing hydrophobic transmembrane domains, which may prevent the proteins from reaching the nucleus[13]. To overcome this limitation one of the alternative membrane-associated two-hybrid systems may be used. Other proteins may require modification by cytoplasmic or membrane-associated enzymes in order to interact with their binding partners[5, 13].

False Positives and False Negatives
The two-hybrid system has a tendency to produce false positives, that is, reporter gene activity where no protein-protein interaction is involved. Frequently, such false positives are caused by bait proteins that act as transcriptional activators[13]. Other false positives may be caused by proteins that lead to non-specific interactions for largely unknown reasons. Some bait or prey proteins may affect general colony viability and hence allow a cell to grow under selective conditions and activate the reporter gene. Mutations or other random events of unknown nature may also be invoked as potential explanations. Overall, extremely few cases of false positives can be explained mechanistically[5, 13].

Comparison to other methods

Method              Advantages                                                   Disadvantages
Two-Hybrid          Simple and inexpensive; coverage of low abundant proteins    Significant risk of false positives
Mass Spectrometry   Identification of protein complexes                          Expensive and time consuming; purification required
In vitro binding    Defined conditions                                           Potentially non-physiological conditions
Protein Chips       Defined conditions; potentially highly parallel              Requires purification of proteins

Table 1. Brief comparison of the Two-Hybrid System with other technologies used to analyse protein-protein interactions

6.2.4 Applications of the Technology

Drug discovery is a lengthy and costly process which involves target identification and validation, drug screening and safety assays, development of animal models and, ultimately, testing of potential drug candidates in clinical trials[14]. The development of a novel drug may take 10-15 years, with cost estimates of approximately US$800 million. It is a complex process that includes the identification of biological targets as well as the identification of leads that aim to alter or inhibit the function of a particular target[14].

The yeast Saccharomyces cerevisiae has long been recognised as a valuable model organism for studies of eukaryotic cells. Auerbach et al highlighted emerging yeast-based functional genomic and proteomic technologies in their paper on the use of yeast as a model organism in drug discovery. These approaches include variations of the yeast two-hybrid system. With regard to screening for novel drugs, the yeast two-hybrid system can be broadly applied to two areas: identification and validation of targets, and screening for compounds that inhibit the interaction between two therapeutic target proteins[14, 15].

As a genetic assay, the yeast two-hybrid system is well suited to high-throughput studies. It has been used successfully to create so-called “whole genome interaction networks”, in which all proteins of a given organism are systematically tested against each other using high-throughput yeast two-hybrid strategies[14, 16]. Because many proteins that play important roles in human disease have orthologues in lower eukaryotes such as yeast, D. melanogaster or C. elegans, a potential route towards the identification of novel targets is to create interaction networks in these model organisms and then transfer the insights gained to the human situation. Comprehensive protein interaction maps have been created for yeast and, more recently, for D. melanogaster and C. elegans. As an example, the D. melanogaster interaction map identified a total of 4,780 high-confidence interactions involving 4,679 proteins[14, 17]. The authors then used the Homophila database, which lists all D. melanogaster proteins that have orthologues implicated in human disease pathways, to integrate their protein interaction data with already known disease pathways in humans[15, 17]. As an example of this approach, they demonstrated that a previously known transcription factor involved in human B cell non-Hodgkin’s lymphoma is connected to two calcium-dependent phosphatases, a finding which suggests that calcineurin phosphatases may be valid drug targets for treating lymphomas and other cancer types[14].
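The idea of overlaying disease orthologues on an interaction map can be sketched with a small undirected graph. This is a minimal illustration only: the protein names and the interaction list below are hypothetical placeholders, not data from the D. melanogaster map or the Homophila database.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction network from (protein, protein) pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def partners_of_disease_orthologues(network, disease_orthologues):
    """Map each disease-linked protein present in the network to its partners."""
    return {
        protein: sorted(network[protein])
        for protein in sorted(disease_orthologues)
        if protein in network
    }


if __name__ == "__main__":
    # Hypothetical two-hybrid hits: a transcription factor and two phosphatases.
    interactions = [
        ("TF_1", "Phosphatase_A"),
        ("TF_1", "Phosphatase_B"),
        ("Phosphatase_A", "Kinase_X"),
    ]
    network = build_network(interactions)
    # Hypothetical set of fly proteins with human disease orthologues.
    candidates = partners_of_disease_orthologues(network, {"TF_1", "Unmapped_9"})
    print(candidates)
```

Interaction partners of a disease-linked protein found this way are only candidate targets; each would still need independent validation.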

The two-hybrid system has also been adapted further to study drug-protein interactions[18]. This technique, termed the yeast "three-hybrid system", uses a synthetic heterodimer consisting of two different organic ligands to bring into proximity the DNA-binding domain fused to the receptor for one ligand and the activation domain fused to the receptor for the second ligand[5]. The feasibility of this system was demonstrated using a heterodimer of covalently linked dexamethasone and FK506 as the hybrid ligand. The receptor for dexamethasone was fused to the LexA DNA-binding domain and a Jurkat cDNA library fused to a transcriptional activation domain was screened; three overlapping clones of FKBP12, the human FK506 binding protein, were isolated[5, 18].

Reverse two-hybrid systems that can be used to select small molecules that inhibit protein-protein interactions have also been demonstrated[18]. In this design, expression of the two interacting proteins is controlled by the GAL promoter. Following galactose induction, the two interacting proteins are synthesized and their association induces the expression of a toxic gene. Only cells in which a small molecule inhibits the protein-protein interaction survive. Using this system, nanomolar concentrations of FK506 were shown to disrupt the association of FKBP12 with R1 of the transforming growth factor β receptor family[18].

6.2.5 Relevant Web Sites
Structure Factory in Germany
Harvard Institute for Proteomics
Several collaborative Structural Genomics Centers funded by the NIH

6.2.6 Key Industry Suppliers

6.2.7 References
1. Fields, S. & Song, O. K. (1989) A Novel Genetic System to Detect Protein-Protein Interactions, Nature. 340, 245-246.
2. Sobhanifar, S. (2003) The Yeast Two-Hybrid Assay: An Exercise In Experimental Eloquence in Science Creative Quarterly.
3. Phizicky, E. M. & Fields, S. (1995) Protein-Protein Interactions: Methods for Detection and Analysis, Microbiological Reviews. 59, 94-123.
4. Criekinge, W. V. & Beyaert, R. (1999) Yeast Two-Hybrid: State of the Art, Biological Procedures Online. 2, 1.
5. Vollert, C. & Uetz, P. (2003) The Two Hybrid System in Encyclopedic Reference of Genomics and Proteomics in Molecular Medicine.
6. Clarke, P., Cuív, P. Ó. & O'Connell, M. (2005) Novel mobilizable prokaryotic two-hybrid system vectors for high-throughput protein interaction mapping in Escherichia coli by bacterial conjugation, Nucleic Acids Research. 33, e18.
7. Serebriiskii, I. G., Fang, R., Latypova, E., Hopkins, R., Vinson, C., Young, J. K. & Golemis, E. A. (2005) A Combined Yeast/Bacteria Two Hybrid System, Molecular and Cellular Proteomics. 4.6, 819-826.
8. Luo, Y., Batalao, A., Zhou, H. & Zhu, L. (1997) Mammalian two-hybrid system: a complementary approach to the yeast two-hybrid system, Biotechniques. 22, 350-352.
9. Schenborn, E., deBerg, L. & Brondyk, W. (1998) The CheckMate(TM) Mammalian Two-Hybrid System, Promega Notes. 66, 2-8.
10. Hall, R. A. (2004) Studying protein-protein interactions via blot overlay or Far Western blot, Methods in Molecular Biology. 261, 167-174.
11. Einarson, M. B. & Orlinick, J. R. (2002) Identification of Protein-Protein Interactions with Glutathione-S-Transferase Fusion Proteins in Protein-Protein Interactions pp. 37-57, Cold Spring Harbor Laboratory Press.
12. Mahlknecht, U., Ottmann, O. & Hoelzer, D. (2001) Far-Western based protein-protein interaction screening of high-density protein filter arrays, Journal of Biotechnology. 88, 89-94.
13. McAlister-Henn, L., Gibson, N. & Panisko, E. (1999) Applications of the Yeast Two-Hybrid System, Methods. 19, 330-337.
14. Auerbach, D., Arnoldo, A., Bogdan, B., Fetchko, M. & Stagljar, I. (2005) Drug Discovery Using Yeast as a Model System: A Functional Genomic and Proteomic View, Current Proteomics. 2, 1-13.
15. Edwards, A. M., Arrowsmith, C. H. & Pallieres, B. d. (2000) Proteomics: New tools for a new era in Modern Drug Discovery pp. 34.
16. Gwynne, P. & Heebner, G. (2003) Drug Discovery and Biotechnology Trends – Proteomics I: In Pursuit of Proteins in Science.
17. Stanyon, C. A., Liu, G., Mangiola, B. A., Patel, N., Giot, L., Kuang, B., Zhang, H., Zhong, J. & Finley, R. L. Jr (2004) A Drosophila protein-interaction map centered on cell-cycle regulators, Genome Biology. 5, R96.
18. Parsons, A. B., Geyer, R., Hughes, T. R. & Boone, C. (2003) Yeast genomics and proteomics in drug discovery and target validation, Progress in Cell Cycle Research. 5, 159-166.

Chapter 6.3 Biosensor methods for study of protein-protein interactions
Yung Chih Chen

6.3.1 Introduction
A biosensor is a device that uses biological molecules to detect chemicals or other biological molecules. A biosensor usually consists of three components: a biological detection system, a transducer and an output system. Typically, the biological detection element is an enzyme, antibody, micro-organism or cell immobilized on the surface of a signal transducer. Two types of transducer, optical and piezoelectric, are commonly used in biosensors: Surface Plasmon Resonance (SPR) uses an optical transducer, whereas the Quartz Crystal Microbalance (QCM) uses a piezoelectric transducer.

The most important biosensor method for the study of protein-protein interactions is Surface Plasmon Resonance. Over the past decade SPR has been shown to be a highly sensitive and extremely powerful probe of the interaction of a variety of biopolymers with various ligands. Applications have been developed for protein-protein, protein-DNA, protein-ligand and protein-membrane interactions. SPR provides a means not only of identifying these interactions in real time but also of building a variety of assays.

In a typical SPR biosensing experiment, one member of the interacting pair is immobilized on an SPR-active, gold-coated glass slide. The other interactant is prepared in an aqueous buffer solution. The glass slide with its thin gold coating is mounted on a prism. Light passes through the prism and slide, reflects off the gold and passes back through the prism to a detector. The reflectivity depends on the refractive index of the medium next to the gold film, which changes with the mass of material bound at the surface. In other words, changes in reflectivity versus angle give a resonance signal that is proportional to the volume of biopolymer bound near the surface (see the figure of the basic components of an SPR instrument below).

A typical readout is shown in the figure of an SPR readout below. At time 0 there is no interaction. At 100 s, a solution containing the other interactant is introduced and association begins. At 300 s, association reaches its maximum and pure buffer starts to flush through the flow channel, so that dissociation begins. At 420-520 s, the starting surface is regenerated with a sequence of reagents.

Related techniques include plasmon waveguide resonance, QCM, Dual Polarisation Interferometry and Surface Enhanced Laser Desorption Ionization (SELDI). This chapter discusses the pros and cons of SPR, briefly introduces SELDI, and then presents current applications of SPR.
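The shape of the readout described above (baseline, association from 100 s, dissociation from 300 s, regeneration from 420 s) can be reproduced with a simple 1:1 binding model. The sketch below is an assumption-laden illustration: the 1:1 Langmuir model, the rate constants, the analyte concentration and the maximum response are all chosen for demonstration and do not come from the chapter.

```python
import math


def sensorgram(t, ka=1e5, kd=1e-2, conc=100e-9, rmax=100.0,
               t_inject=100.0, t_wash=300.0, t_regen=420.0):
    """Response (arbitrary units) at time t (s) for an assumed 1:1 binding model.

    ka   : association rate constant, 1/(M*s)  (assumed value)
    kd   : dissociation rate constant, 1/s     (assumed value)
    conc : analyte concentration, M            (assumed value)
    rmax : response at surface saturation      (assumed value)
    """
    # Baseline before injection; regeneration is idealised as an instant reset.
    if t < t_inject or t >= t_regen:
        return 0.0
    k_obs = ka * conc + kd                # observed association rate
    r_eq = rmax * ka * conc / k_obs       # plateau response at this concentration
    if t < t_wash:
        # Association phase: exponential approach to the plateau.
        return r_eq * (1.0 - math.exp(-k_obs * (t - t_inject)))
    # Dissociation phase: exponential decay from the response reached at wash-out.
    r_at_wash = r_eq * (1.0 - math.exp(-k_obs * (t_wash - t_inject)))
    return r_at_wash * math.exp(-kd * (t - t_wash))


if __name__ == "__main__":
    for t in (0, 100, 150, 200, 300, 350, 419, 450):
        print(f"{t:4d} s  {sensorgram(t):6.2f}")
```

Real instruments show bulk refractive index jumps, drift and a gradual regeneration step that this idealised curve leaves out.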

Fig. Basic components of an instrument for SPR biosensor

Fig. A typical Surface Plasmon Resonance readout

6.3.2 Recent Advances

From MS to MALDI and then SELDI
A mass spectrometer is a device that measures the mass-to-charge ratio of ions and typically consists of three parts: an ion source, a mass analyser and a detector system. Aebersold and Mann (2003) noted that the ionization process in MS may degrade the characteristics of proteins and peptides[1]. Classic 2D gel electrophoresis combined with mass spectrometry cannot accurately measure small differences in concentration[2]. Despite technical advances in 2D PAGE, it is still regarded as time consuming and unsuitable for examining large numbers of samples. Matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) offers a more sensitive means of identifying small amounts of protein, and this technique rapidly took over from earlier MS methods following its discovery. MALDI can quickly determine the masses of peptides generated from a pure fragmented protein and provides highly reliable data. SELDI offers a similar means of analysing biological mixtures. The main difference between MALDI and SELDI is that in MALDI the surface or beads are passive probes and do not contribute to the reaction, whereas in SELDI proteins are immobilized on one of a variety of chip surfaces, all with different binding specificities (refer to chapter 4.1 and chapter 5.3 for more details).

Quality Control and Quality Assessment for SELDI
SELDI coupled with a protein chip is an effective tool for the simultaneous detection of the expression of a variety of relevant proteins under different conditions. Differences in protein expression level can then be used to identify a disease, to differentiate stages of a disease, or to distinguish time points following toxicant treatment. The analysis of SELDI data therefore plays a pivotal role in developing SELDI technology. Much research has been conducted under rigorous procedures to detect and discard low quality spectra prior to data analysis in SELDI. Hong et al (2005) presented a novel approach in which 144 spectra are represented as a correlation matrix to measure and detect low quality data[3] (refer to chapter 9.2 for more details).

SPR & MS
Larsericsdotter et al (2006) optimized the SPR/MS interface by combining the functionalities of the two techniques. They called this system a ligand fishing process, used to characterize biomolecular interactions and identify interaction partners within the cell proteome. Variation in chip-based affinity separation systems may be caused by the surface-to-volume ratio of the fluidic system. They investigated low molecular weight molecules with an optimized protocol to avoid non-specific signals in the final MS analysis[4].

High-throughput SPR biosensing using an SPR microscope
Campbell et al (2004) developed a novel SPR biosensing array that can measure 120 interactions simultaneously, using a computer-interfaced video camera to probe the interactions (Figure 6.3.2.2). Implementing the array method in SPR offers at least two advantages: measurement of smaller proteins, and calculation of kinetics and concentration in parallel[5].

Fig. Design of SPR microscope

SPR & AFM
Another advance in SPR biosensing is that the method may be combined with AFM to obtain additional information, such as the surface topology of bound proteins and the thickness of protein layers immobilized on the surfaces of protein arrays.

6.3.3 Evaluation of the Technology

Advantages of SPR

Evaluation of macromolecules
Recombinant protein expression is an essential technique for most laboratories studying biological problems. It is important to be able to show that a recombinant protein has the same characteristics as its native counterpart. Researchers have demonstrated the usefulness and reliability of SPR for this purpose by confirming that the recombinant protein binds its natural ligands. Because post-translational modification has become increasingly important in proteome studies, it is worth noting that a binding interaction requires a correctly folded protein and not merely a matching amino acid sequence. SPR is particularly well suited to demonstrating the binding of macromolecules. Setting up an assay for any interaction is very fast and the data generated are reliable.

Kinetic & equilibrium measurements
The fact that SPR generates real-time binding data makes it well suited to both equilibrium and kinetic measurements. Equilibrium analysis can be very time-consuming and requires several injections at different concentrations. In general, the time taken to reach equilibrium is determined primarily by the dissociation rate constant, koff. Slow koff values usually result from high affinity interactions (KD < 10 nM). In contrast, weak interactions (KD > 100 nM) have fast koff values and tend to be easily studied. SPR can generate highly reproducible data. For kinetic measurements SPR is easy to use and analysis software is available, so generating kinetic data is quite common practice.

Lack of labelling requirement
Because labelling proteins or other detection probes may result in loss of biological activity, label-free methods are preferred. Current label-free methods include SELDI, AFM and SPR[6].
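The link between the dissociation rate constant and the time needed to reach equilibrium, noted under kinetic and equilibrium measurements, can be made concrete with a short calculation. The sketch assumes a simple 1:1 interaction in which the observed rate is kon x concentration + koff; the function names and the rate constants are illustrative assumptions, not values from the chapter.

```python
import math


def equilibrium_constant(kon, koff):
    """Dissociation constant KD (M) from kon (1/(M*s)) and koff (1/s)."""
    return koff / kon


def time_to_fraction(kon, koff, conc, fraction=0.99):
    """Seconds needed to reach `fraction` of the equilibrium response.

    Assumes a 1:1 interaction, where the response approaches equilibrium
    exponentially with observed rate k_obs = kon * conc + koff.
    """
    k_obs = kon * conc + koff
    return -math.log(1.0 - fraction) / k_obs


if __name__ == "__main__":
    kon = 1e5  # assumed association rate constant, identical for both cases
    for label, koff in (("high affinity", 1e-4), ("weak", 1e-2)):
        kd_const = equilibrium_constant(kon, koff)
        # Inject analyte at a concentration equal to KD in each case.
        t99 = time_to_fraction(kon, koff, conc=kd_const)
        print(f"{label}: KD = {kd_const:.1e} M, t(99%) = {t99:.0f} s")
```

Under these assumptions the high affinity pair (slower koff by a factor of 100) needs about 100 times longer to approach equilibrium, which illustrates why equilibrium analysis of tight binders is slow.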
Operated in situ
In addition to using unlabelled protein, SPR offers the advantage that it operates in situ. In other words, the substrate does not need to be rinsed or dried before analysis. This feature is extremely useful for quantifying low affinity protein-protein interactions[7].

Disadvantages of SPR

High throughput assays
SPR is not well suited to high throughput assays because an average analysis takes approximately 10 minutes. Common problems include deterioration of the surface over time and air bubbles blocking the microfluidic system. Much research has been conducted to overcome this disadvantage.

Concentration assays
SPR can observe only a few binding interactions simultaneously and is therefore not suitable for concentration measurements.

Protein denaturation
During the immobilization procedure, proteins may undergo denaturation and show non-specific binding in the analysis of protein-protein interactions.

6.3.4 Application of SPR
Two types of SPR sensor, angular and spectral, are currently widely used in proteome research. Angular SPR biosensors are based on angular interrogation and analyse protein interactions by scanning incidence angles at a constant wavelength. Spectral SPR biosensors, on the other hand, are based on wavelength interrogation and scan wavelengths at a constant incidence angle to analyse biomolecular interactions.

SPR imaging
SPR imaging measures changes in reflected light intensity caused by changes in refractive index at the gold surface. Jung’s group (2004) applied this technology to monitoring protein expression in E. coli. In short, they immobilized three proteins (maltose-binding protein-tagged human interleukin-6, hexahistidine-tagged human growth factor and glutathione S-transferase-tagged human interleukin-6) onto gold-coated slides pre-treated with cyclodextrin, Ni-iminodiacetate and glutathione, respectively. They found that a problem with SPR imaging is that the SPR spectrum may not shift horizontally, which results in fluctuation of the SPR image[8].

SPR spectroscopic imaging
The main difference between SPR spectroscopic imaging and SPR imaging is that the former detects the shift in SPR wavelength whereas the latter detects the SPR intensity. SPR spectroscopic imaging combines automated positioning and SPR spectroscopy in one technology. It has two modes: line-scanning mode and whole-scanning mode. In a typical antibody-antigen interaction experiment using SPR spectroscopic imaging, three G-proteins (rhoA, rhoA N19 and rac1) and C-reactive protein were immobilized onto a protein array modified with a mixed thiol layer of 11-mercaptoundecanoic acid and mercaptohexanol. The array was then incubated with two monoclonal antibodies against rhoA and rac1. Afterwards, the array was analysed by SPR spectroscopic imaging in whole-scanning mode (see the figure below)[9].

Fig. SPR spectroscopic imaging by whole scanning mode

The whole-scanning method can provide detailed information, including the cross-reactivity of antigens and antibodies. The figure shows that anti-rac1 binds strongly to rac1, whereas anti-rhoA cross-reacts with rhoA N19. The drawback of the whole-scanning method is that it takes a long time, because the whole protein array must be scanned by gradually moving to every spot. Yuk et al (2005) demonstrated the analysis of antibody-antigen interactions by line-scanning mode. They scanned protein arrays along their central lines and presented the data as a colour spectrum. In their experiment, 7 different proteins (tissue transglutaminase (TGase), hemoglobin, haptoglobin, rhoA, rac1, GST and rhoA N19) were immobilized, and their interactions with three antibodies (anti-rhoA, anti-rac1 and anti-haptoglobin) were analysed by the line-scanning mode of the SPR biosensor. In summary, the sensitivity of an in situ spectral SPR biosensor was not as good as that of an ex situ spectral SPR biosensor[9].

Investigating mutated proteins by SPR
SPR is a useful tool for investigating the effects of amino acid substitutions on binding properties and interaction characteristics. Bozzi et al (2005) analysed dystroglycan by SPR. Dystroglycan comprises two subunits, α-dystroglycan and β-dystroglycan, and plays a pivotal role in muscle stability and neuromuscular disorders. α-dystroglycan interacts with β-dystroglycan through non-covalent bonds. Many studies have shown that α-dystroglycan interacts with extracellular matrix proteins. β-dystroglycan is a transmembrane protein that binds dystrophin and other cytosolic proteins such as Grb2. It is believed that the functions of the α and β subunits are interdependent and affect the stability of the entire network. By immobilizing α- and β-dystroglycan in different orientations, Bozzi and his group identified specific aromatic residues between the C-terminal region of α-dystroglycan and the N-terminal ectodomain of β-dystroglycan that play a crucial role in the inter-subunit interaction[10].

Kinetic analysis between β-dystroglycan and Grb2
Grb2 is an adaptor molecule that regulates signal transduction and cytoskeleton organization. Grb2 consists of one SH2 (Src homology 2) domain flanked by two SH3 domains.
Grb2 binds to the cytoplasmic domain of β-dystroglycan, which contains several proline-rich sequences. Torreri et al (2005) therefore performed SPR analysis: “1) to determine which of the two Grb2-SH3 domains, the N-terminal or the C-terminal, binds to β-dystroglycan and 2) to identify the binding site on β-dystroglycan by comparing the kinetics parameters of β-dystroglycan fragments containing one (amino acids 876-895) or two (amino acids 821-895) proline rich consensus sequences for the Grb2-SH3 interaction[11]”. In conclusion, they found that both Grb2-SH3 domains bind β-dystroglycan, but that the C-terminal SH3 domain has lower affinity than the N-terminal SH3 domain. They also found that amino acids 876-895 of β-dystroglycan play a crucial role in determining binding affinity.

6.3.5 Relevant websites
Corn’s research group - University of California, Irvine
Campbell research group - University of Washington, Department of Chemistry
Dr. Zhou’s research
Wikipedia
Introduction of SPR
More detail about SPR

6.3.6 Key Industry Suppliers

Company                                         System
Biacore AB (Sweden)                             Biacore
Affinity Sensors                                IAsys
IBIS Technologies                               IBIS
Jandratek GmbH                                  Plasmonic
GWC instrument                                  FT-SPR 100
Texas Instrument                                TI-SPR
BioTuL Bio Instruments (Germany)                Kinetics Instrument
Nippon Laser Electronics (Japan)                SPR-670
Artificial Sensing Instruments (Switzerland)    OWLS
Aviv (NJ)                                       PWR-400
Quantech Ltd (MN)                               FasTraQ

6.3.7 References
1. Aebersold, R. & Mann, M. (2003) Mass Spectrometry-based proteomics, Nature. 422, 198-207.
2. Rodland, K. D. (2004) Proteomics and cancer diagnosis: the potential of mass spectrometry, Clin Biochem. 37, 579-583.
3. Henderson, N. A. & Steele, R. J. C. (2005) SELDI-TOF proteomic analysis and cancer detection, Journal of the Royal Colleges of Surgeons of Edinburgh & Ireland. 3, 383-390.
4. Larsericsdotter, H., Jansson, O., Zhukov, A., Areskoug, D., Oscarsson, S. & Buijs, J. (2006) Optimizing the surface plasmon resonance/mass spectrometry interface for functional proteomics applications: How to avoid and utilize nonspecific adsorption, Proteomics. 6, 2355-2364.
5. Shumaker-Parry, J. & Campbell, C. T. Quantitative Methods for Spatially-Resolved Adsorption/Desorption Measurement in Real Time by SPR Microscope, Anal. Chem. subm.
6. Zhu, H. & Snyder, M. (2003) Protein Chip Technology, Curr Opin Chem Biol. 7, 55-63.
7. Lee, Y.-S. & Mrksich, M. (2002) Protein chips: from concept to practice, Trends in Biotechnology. 20.
8. Jung, J. M., Shin, Y. B., Kim, M. G., Ro, H. S., Jung, H. T. & Chung, B. H. (2004) A fusion protein expression analysis using surface plasmon resonance imaging, Anal Biochem. 330, 251-6.
9. Yuk, J. S. & Ha, K. S. (2005) Proteomic applications of surface plasmon resonance biosensors: analysis of protein arrays, Exp Mol Med. 37, 1-10.
10. Bozzi, M., Sciandra, F., Pavoni, E., Ferri, L., Torreri, P. & Petrucci, T. (2005). Analysis at the molecular level of the interaction between alpha-dystroglycan and beta-dystroglycan. Paper presented at the GGPO30332, Telethon meeting.
11. Torreri, P., Ceccarini, M., Macioce, P. & Petrucci, T. C. (2005) Biomolecular interactions by Surface Plasmon Resonance technology, Ann Ist Super Sanita. 41, 437-41.

Chapter 7.1

t-Boc synthesis and cleavage of peptides
Rajesh Ramanathan

7.1.1 Introduction Proteins play a crucial role in many of the physical and chemical processes in the living cell [1]. These ubiquitous molecules have a highly organized three dimensional structure in solutions. Peptides are smaller version of proteins which generally are less organized in solution state [2]. There is no clear dividing line between proteins and peptides but acceptable distinctions being proteins are larger peptides. With the completion f the human genome project the study of proteins have increased considerably, also the use of proteins as target molecules for synthesis of drugs has increased the need for artificial synthesis of peptides [3]. The possibility of synthesizing organic compounds was considered seriously only after Friedrich Wohler synthesized urea, also called carbamide in the lab in 1828. Prior to this, the theory of ‘vitalism’ held sway. According to the tenets of vitalism, the compounds required for life were far more complicated than the compounds which constituted inanimate matter and had the ability to self-propagate due to their possession of a ‘vital force’ or spark. Thus these ‘organic’ compounds were above the laws of physics and chemistry and were not worth studying as they were by definition, inexplicable. Certainly, (it was believed) they could not be synthesized. Wohler’s synthesis of urea, an organic compound, invalidated vitalism and inspired chemists to try and synthesize other organic compounds. Proteins came to the limelight when James Sumner demonstrated that urease was a protein. And finally, between the years 1899 and 1908, Emil Fischer, while working with proteins, discovered the nature of the amide or peptide bond between amino acids. With this important discovery, Fischer in collaboration with Fourneau was able to synthesize the first peptides, starting with dipeptides [glycyl-glycine] in 1901 and working his way up to an 18 amino acid peptide in 1907, consisting mostly of glycine and leucine residues. 
This monumental achievement laid the foundations for the science of peptide synthesis. The series of important discoveries that followed, and that led to peptide synthesis as we know it today, is given below [4]:
1] Bergmann and Zervas, 1932 - The discovery of the carbobenzoxy group, which could be used as an easily removable protecting group for polyfunctional amino acids.
2] Theodor Curtius - The discovery of the Curtius rearrangement, which could be carried out with benzyl alcohol to give carbobenzoxy-protected amine groups.
3] Sifferd and du Vigneaud, 1935 - Successfully synthesized carnosine, one of the first times a small naturally occurring peptide was synthesized.
4] Harrington and Mead, 1935 - Successfully synthesized glutathione.

5] du Vigneaud et al., 1953 - Successfully synthesized the active peptide hormone oxytocin.
6] Schwyzer and Sieber, 1963 - Successful synthesis of porcine adrenocorticotrophic hormone.
7] Bodanszky et al., 1967 - Synthesis of the peptide secretin by solution phase synthesis.
8] Bruce Merrifield, 1963 - Invention of the solid phase synthesis method.
As we shall see further on, this last discovery was perhaps the most important one; it allowed for the increased use of synthetic peptides in experiments and has mostly replaced solution phase peptide synthesis. Bacteria and viruses were, and still are, used for the production of proteins by means of expression systems, but the advent of solid phase and solution phase peptide synthesis made it possible to mass-produce peptide sequences chemically.
7.1.2 Principle
The main principle behind solid phase peptide synthesis can be summed up as follows: if a peptide is bound to a solid, insoluble matrix, the reagents left unreacted after any synthetic step can simply be washed away, which reduces both the time required for chemical synthesis and the chance of side reactions. The t-Boc synthesis developed by Merrifield uses an acid-labile temporary protecting group, while the Fmoc synthesis developed by Sheppard uses a base-labile temporary protecting group. Both methods of solid phase peptide synthesis proceed from the C terminal to the N terminal, with the N-terminal end protected by a removable (temporary) group that is taken off before the next amino acid in line is attached. The direction is reversed in nature, where the cell’s ribosomes synthesize proteins from the N terminal to the C terminal.
Solution Phase Peptide Synthesis [4] - Although this method has mostly been replaced in labs by the solid phase synthesis method, it is still used for large scale production of synthetic peptides. It is often referred to as the classical method of peptide synthesis.
The N-α-protected amino acids are added in a stepwise manner to a growing chain that starts from the C-terminal amino acid. Each coupling step is brought about by one of the following methods:
1] Carbodiimide Method [4] - Dicyclohexylcarbodiimide [DCC] is used as the coupling agent to couple two amino acid residues.
2] Mixed Carbonic Anhydride Method [4] - This method is popular because of its effectiveness at low temperatures as well as the high yield and purity of the final product. In the first step, the carboxyl group is activated with an alkyl chlorocarbonate to form a mixed carbonic anhydride. In the second step, the carbonic anhydride is added in a known excess to react with the free amino group of the growing peptide chain.
Solid Phase Peptide Synthesis [SPPS] [5] - This is the preferred method of peptide synthesis, especially for small labs that require high quality synthetic peptides in small amounts. In this technique, a solid matrix (the resin) is placed in a reaction vessel. This resin has a reactive group which forms the base for the growing peptide chain and

which can bind to the carboxyl end of the N-protected amino acids. A single cycle consists of removing the temporary protecting group, draining the de-protecting agent, and adding another amino acid to the growing chain. At the end of each cycle, the resin-bound peptide is tested to confirm that the coupling is complete. When the desired sequence has been achieved, the completed peptide is removed by a separate cleavage agent, which targets the bond between the C-terminal amino acid and the support. The peptide is then freeze dried and characterized by high performance liquid chromatography [HPLC] and mass spectrometry.
Two kinds of protecting groups are present on the amino acids. The ‘temporary’ protecting groups protect the amino end of the amino acids and are removed at the beginning of each cycle. These are the N-α-tert-butyloxycarbonyl [Boc] group and the N-α-fluorenylmethyloxycarbonyl [Fmoc] group. The two methods of SPPS differ in the chemistry associated with these two groups, the former being acid labile and the latter base labile. We therefore have the Boc synthesis technique and the Fmoc synthesis technique. The ‘permanent’ groups protect the side chains from reacting with each other and are usually similar for both methods. These are usually ether, ester or urethane derivatives of benzyl alcohol. In Boc chemistry, the permanent groups are often modified by the addition of electron-donating methyl or methoxy groups or electron-withdrawing halogen groups. The permanent groups are removed in the final cleavage step.
Boc Chemistry [6] - Boc chemistry is the classical method of solid phase peptide synthesis, developed by Merrifield. The temporary Boc group is introduced onto amino acids using either di-tert-butyl dicarbonate or 2-(tert-butoxycarbonyloxyimino)-2-phenylacetonitrile [Boc-ON], in aqueous 1,4-dioxane containing NaOH [7].
The Boc group cannot be removed by alkali or nucleophiles, but it is removed rapidly by inorganic and organic acids. This acid-mediated removal is one of the main steps in Boc chemistry. The initial Boc-protected amino acid is attached to the resin by an HF-cleavable bond. Other Boc-protected amino acids are added in successive cycles. A single cycle can be outlined as follows:
Deprotection - The acid-labile protecting group is removed using neat trifluoroacetic acid [TFA].
Activation - Meanwhile, the incoming amino acid is activated using O-(1H-benzotriazol-1-yl)-N,N,N',N'-tetramethyluronium hexafluorophosphate [HBTU] in DMF, or, for the residues arginine, glutamine and asparagine, a combination of HBTU and 1-hydroxybenzotriazole [HOBT] in DMF.
Addition - Dimethylformamide [DMF] is used to neutralize and wash away the residual TFA, and the activated amino acid is added to the reaction vessel. Binding is confirmed by carrying out a ninhydrin test on a small sample of the resin and measuring the absorbance of the ninhydrin solution against the amount of resin taken in the sample. If the percentage of binding is relatively low, the entire cycle is repeated with the same amino acid. This process is referred to as double coupling. The result after double coupling is usually accepted as final, whatever it is. The cycle is then repeated with the next amino acid until the desired sequence has been achieved.
Removal - After the desired peptide sequence is complete, the peptide is removed from the resin using anhydrous hydrogen fluoride [HF]. Thiol compounds are often added to protect the peptide from the carbocations that are generated in the process.
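The cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not a laboratory protocol: the function names, the one-letter sequence notation and the 0.99 acceptance threshold for the ninhydrin test are assumptions made for the example, and the chemistry of each step is represented only by a comment.

```python
from typing import Callable, List, Tuple

# One log entry per cycle:
# (residue added, coupling attempts, measured coupling fraction, chain so far)
CycleLog = Tuple[str, int, float, str]


def boc_spps(
    sequence: str,
    measure_coupling: Callable[[str, int], float],
    threshold: float = 0.99,  # assumed acceptance level for the ninhydrin test
) -> Tuple[str, List[CycleLog]]:
    """Walk through the Boc cycle for a peptide written N terminus to C terminus."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")

    # The C-terminal residue is already attached to the resin before cycling starts.
    chain = sequence[-1]
    log: List[CycleLog] = []

    # Synthesis runs from the C terminus towards the N terminus.
    for residue in reversed(sequence[:-1]):
        # Deprotection (TFA), activation (HBTU) and the DMF wash are implied here;
        # only the outcome of the coupling step is modelled.
        attempts = 1
        fraction = measure_coupling(residue, attempts)
        if fraction < threshold:
            # Double coupling: repeat once and accept the result whatever it is.
            attempts = 2
            fraction = measure_coupling(residue, attempts)
        chain = residue + chain
        log.append((residue, attempts, fraction, chain))

    return chain, log


def example_ninhydrin(residue: str, attempt: int) -> float:
    """Made-up readings: leucine couples poorly on the first attempt."""
    return 0.95 if (residue == "L" and attempt == 1) else 0.995


if __name__ == "__main__":
    peptide, cycles = boc_spps("GLA", example_ninhydrin)
    for entry in cycles:
        print(entry)
```

In this toy run the tripeptide Gly-Leu-Ala is built in the order Ala (on the resin), Leu, Gly, and only the leucine step triggers a double coupling.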

Figure 7.1.4 - Summary of the Boc chemistry.
7.1.3 Recent Advances

Fluorous tags have recently been used in conjunction with solid phase peptide synthesis. Exploiting the unique properties of fluorous compounds, the growing peptide is tagged with a fluorous group during SPPS and de-tagged at the end of the synthesis, as described in [8].

Fig 7.1.2 – Fluorous Peptide Synthesis
The level of sophistication is set to increase with improvements in linker design, rapid on-bead and off-bead technology, and automated synthesis equipment. Examples of automated systems from Advanced Automation Peptide Protein Technologies are the Apex 396 for multiple peptides, the Model 90 for small and mid scale production and the Model 400 for large scale production. Matrix 384 automated synthesizers can produce up to 384 peptides simultaneously. Although high-throughput techniques such as automated synthesizers are used extensively, most organizations prefer to retain knowledge of manual peptide synthesis. The transition from Boc to Fmoc chemistry was one of the most notable advances in solid phase peptide synthesis. The major change in Fmoc chemistry was the use of mild basic conditions (piperidine) to remove the temporary protecting group, which means that the final cleavage no longer requires HF as in the classical method.
7.1.4 Evaluation of technology
Boc chemistry uses an acid-labile temporary protecting group, while Fmoc chemistry uses a mild base-labile temporary protecting group. Although the chemistry and the reagents required are different, the Fmoc cycle is essentially similar to the Boc cycle. Deprotection of the amino acids is achieved with 20% piperidine in DMF. This deprotection is much slower than deprotection by TFA in Boc chemistry because the reaction kinetics are influenced by the formation of an anionic nitrogen group. The main advantage of Fmoc chemistry is that it does not involve the use of HF to remove the permanent protecting groups or to cleave the peptide from the solid support; TFA is used instead. HF is dangerous, difficult to handle and harsh on the peptide.

In addition, when peptides are evaluated by HPLC, mass spectrometry and sequence analysis, synthesis by Fmoc chemistry has been found to be more reliable and accurate, because Fmoc chemistry uses less harsh conditions for cleavage of the peptide from the resin and for removal of the permanent protecting groups. G.B. Fields et al. compared the two techniques and indicated that Fmoc chemistry is more reliable and tends to give the desired peptides in better purity [9].
7.1.5 Application of technology

Advances in the development of methods for the design and synthesis of peptides, pseudo-peptides and related compounds help in the understanding of protein structure, domain structure and fold, topology, and dynamics. Peptide therapy has enormous potential in areas as diverse as growth control, blood pressure, neurotransmission, hormone action, pain, digestion and reproduction. Some of the medical and biological applications of synthetic peptides are:
Structure and function studies - Synthetic peptides are designed to give a better understanding of ligand-receptor/acceptor interactions through structure and function studies. It is essential to know the complementary ligand for binding, transduction and release at a receptor/acceptor [4]. We still lack methods to design ligands with specific properties, so synthetic peptides with specific activities are used to study structure-function relationships.
Enzyme inhibitors - Inhibition of enzymatic systems is one of the prime targets for the regulation and control of disease. Current targets include cancer cells, bacteria, humans and many other systems. For example, 5-fluorouracil is used as a chemotherapeutic agent, penicillins are considered first-line antimicrobials, and lovastatin is used as a cholesterol-reducing agent. Recently, peptides have been used as therapeutic agents targeting specific enzymes or systems. For example, hirudin peptides have helped in understanding the mechanism of the thrombin-hirudin coagulation pathway.
Therapeutic agents - The development of peptide therapeutic agents has become a very popular area of research [10]. Examples include cardiovascular agents such as angiotensin converting enzyme inhibitors and immunological agents such as cyclosporine, which has made transplants possible.
Receptor studies - Peptides are used to study receptor types and subtypes and to provide analogues for in vivo studies [4].
Antigenic and immunogenic uses - Synthetic peptides are used in the study of immune responses because of their ability to mimic antigenic sequences in proteins. The ability to design novel analogues that mimic certain properties of the original compound present in the body has been used to study peptide hormones, neurotransmitters, enzyme inhibitors and so on [11].


7.1.6 Relevant websites

Peptide synthesizer by Anaspec:

Wikipedia: Large scale production:


7.1.7 Key suppliers

Auspep - The name on the world’s finest peptides
GenScript Corporation
Sigma Aldrich
Biocompare provides a comparison between many major companies that are involved in peptide synthesis.
7.1.8 References:
1. Ashely, M.J., V.J. Hruby, and J.-P. Meyer, Bioorganic Chemistry: Introduction to Peptides and Proteins, ed. S.M. Hecht. 1998: Oxford University Press.
2. Lloyd-Williams, P., F. Albericio, and E. Giralt, Chemical Approaches to the Synthesis of Peptides and Proteins. 1997: University of Barcelona, Barcelona, Spain.
3. AnaSpec. 2006 [cited 2006 24 October]; Available from:
4. Grant, G.A., Synthetic Peptides: Beginning the Twenty-first Century. 2nd ed. 2002: Oxford University Press.
5. Ermolat’ev, D.S. and E.V. Babae, Solid-phase synthesis of N-(pyrimidin-2-yl)amino acid amides. Chemistry Department, Moscow State University, Moscow, Russia, 2005: p. 172-178.
6. Merrifield, B., Solid Phase Peptide Synthesis - The Synthesis of a Tetrapeptide. Journal of the American Chemical Society, 1963. 85: p. 2149.
7. Bodanszky, M. and A. Bodanszky, The Practice of Peptide Synthesis. 1984, Berlin: Springer-Verlag.
8. Zhang, W., Fluorous tagging strategy for solution-phase synthesis of small molecules, peptides and oligosaccharides. Curr. Opin. Drug Discovery Development, 2004. 7(6): p. 784-797.
9. Fields, G.B., et al., Evaluation of Peptide Synthesis As Practiced in 53 Different Laboratories.
10. Beeley, N. and A. Berger, A Revolution in Drug Discovery. BMJ, 2000. 321: p. 581-582.
11. Hruby, V.J., F. al-Obeidi, and W. Kazmierski, Emerging approaches in the molecular design of receptor selective peptide ligands - Conformational, topographical and dynamic considerations. Journal of Biochemistry, 1990. 268: p. 249-262.

Chapter 7.4 Purification and characterization of synthetic peptides
Sandip D Kamath
7.4.1 Introduction
Peptides can be defined as chains of amino acids in which the amino acids are linked together by peptide bonds. Peptides and proteins are complex biomolecular structures which form the basic building blocks of life. They are the final product, or expression, of the vast genetic information of the cell. In the biological process, peptides are produced by the addition of amino acids one after another through the formation of peptide bonds, in a sequence that is dictated by the RNA sequence, which in turn is determined by the DNA of the cell. This sequence is called the primary structure of the peptide. Folding of the chain then takes place to give the final functional protein. The size, shape, structure and function of the protein are determined by its primary amino acid sequence. In nature, the assembly of amino acids starts from the amino terminal and proceeds towards the carboxy terminal of the last amino acid. Peptides can also be synthesized artificially in the lab; these are called synthetic peptides.[1]
Synthetic peptides can be made by a process called “Solid Phase” chemical synthesis. In this process, the synthesis of the peptide starts from the carboxy terminal and makes its way to the amino terminus, in contrast to the biosynthetic pathway. The amino acid that constitutes the carboxy terminal of the proposed peptide is attached to a solid surface, i.e. a resin. Each subsequent amino acid is linked through its carboxyl group to the free amino group of the previous residue, forming a peptide bond. This is not a straightforward reaction; there are other reactive groups on the amino acid that can react with the growing chain and form unwanted side products. To avoid this, blocking agents are used to protect the other reactive groups and expose only the required group.
t-Boc and Fmoc are two such agents that are widely used in the preparation of synthetic peptides.[2] After the linking of an amino acid is complete, the blocking agent is removed and the amino group is exposed for the next amino acid to react with it. This cycle is continued until the required peptide is formed. Peptides containing up to 20 amino acids can be formed easily with this technique. It is more difficult to synthesize peptides containing up to 100 amino acids, though this has been done successfully.[2]
Making synthetic peptides is not as easy as it seems. Problems arise at every stage, from synthesis until the final pure peptide is obtained. The synthesis technique described above is not perfect and results in contaminants that are very similar to the required peptide. Such contaminants include peptides with amino acid deletions, truncated sequences, and peptides with altered amino acid sequences. Small organic molecules such as phenols and thiols are also present as a result of the removal of the synthesized peptide from the solid support. These products are non-functional impurities in the final product; they interfere with the function of the intended peptide and give unwanted results. Hence it is of utmost importance that synthetic peptides are purified to the desired level of purity before they are used in different applications.
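The difference between a 20-residue and a 100-residue synthesis can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that every coupling step succeeds with the same efficiency and that a chain which misses any step ends up as a deletion or truncation product; real efficiencies vary from residue to residue, and the function name is hypothetical.

```python
def full_length_fraction(n_residues: int, step_efficiency: float) -> float:
    """Fraction of chains carrying the complete sequence.

    Assumes a uniform per-step efficiency (a simplification) and counts one
    coupling per residue after the first, which is pre-loaded on the resin.
    """
    if n_residues < 1:
        raise ValueError("a peptide needs at least one residue")
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    couplings = n_residues - 1
    return step_efficiency ** couplings


if __name__ == "__main__":
    for length in (20, 50, 100):
        fraction = full_length_fraction(length, 0.99)
        print(f"{length:>3} residues: {fraction:.1%} full-length product")
```

With an assumed efficiency of 99% per step, roughly 83% of the chains are full length for a 20-residue peptide but only about 37% for a 100-residue peptide, which is why purification of the crude product becomes more demanding as the peptide gets longer.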

Before the actual purification of the synthetic peptide is performed, it is necessary to know as much as possible about the required peptide. This knowledge helps in deciding which kind of process is to be used for the purification and in fine tuning the settings of that process. This is where characterization of the peptide comes into the picture. Characterization is the process in which details of the peptide such as molecular weight, amino acid sequence, isoelectric pH and shape are elucidated. Various techniques are employed for characterization: sequencing may be done by Edman’s method, and molecular weight can be determined by gel filtration or mass spectrometry.[2]
After the characterization of the required peptide, the next step is its purification. Different chromatographic techniques are employed in the purification step, such as preparative reversed phase high performance liquid chromatography (RP-HPLC), ion exchange chromatography (IEC), size exclusion chromatography (SEC) and affinity chromatography.[3] The first step in the purification of any peptide is to define the optimal conditions of the process. Specific information about the peptide and related topics should be gathered, such as the availability of the starting material and its handling, the intended use of the final product, the contaminants that have to be removed completely, the level of purity required in the final product, and the equipment and resources available.[4] The time frame for completing the work and the economic constraints should be defined before finalizing the purification layout.[5] Information regarding the purity and contaminants will help in designing the purification protocol. Purification of peptides is a complex procedure, and most of the time it is not a one step process.
Depending on the level of purity required for the final peptide, the purification process is designed in a multi-step manner. The initial steps remove impurities that are easily detectable and separable, and the final step removes contaminants that are very similar to the target peptide in terms of size and composition.[4] Each added step in the process leads to some loss of the final product. Better purity might be achieved with a larger number of steps, but this results in lower yields of the final product because of the loss associated with each step. Hence it is necessary to use the smallest number of steps that gives a good yield of the final product at an acceptable level of purity. The different chromatographic techniques must be applied in a logical sequence so as to avoid the need for any additional processing steps and to keep the number of steps to a minimum.
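The trade-off between purity and yield follows from the fact that the recoveries of successive steps multiply. The following sketch shows the arithmetic; the step recoveries used are invented numbers chosen only to illustrate the effect, and the function name is hypothetical.

```python
from math import prod
from typing import Sequence


def overall_yield(step_recoveries: Sequence[float]) -> float:
    """Overall recovery of a multi-step purification.

    Each entry is the fraction of target peptide recovered in one step;
    the overall recovery is the product of the individual recoveries.
    """
    for recovery in step_recoveries:
        if not 0.0 <= recovery <= 1.0:
            raise ValueError("each recovery must lie between 0 and 1")
    return prod(step_recoveries)


if __name__ == "__main__":
    three_steps = [0.90, 0.80, 0.85]   # illustrative values only
    five_steps = three_steps + [0.85, 0.85]
    print(f"3 steps: {overall_yield(three_steps):.1%} recovered")
    print(f"5 steps: {overall_yield(five_steps):.1%} recovered")
```

Under these assumed recoveries, adding two further steps lowers the overall recovery from about 61% to about 44%, so any extra step has to be justified by the gain in purity it provides.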

Figure 7.4.1 The Three phase purification Strategy

The purification sequence can be broadly divided into three stages. The first stage is called capture, in which the main aim is to isolate, concentrate and stabilize the target peptide.[6] At this stage one cannot expect to obtain a highly pure target peptide. The main goal is to separate the target peptide from contaminants that are very dissimilar to it, and from impurities that might destabilize it.[6] RP-HPLC is a good technique for isolating the peptide, but it is not recommended at this stage because the large impurities present in the crude sample might clog the column. The techniques widely used for this stage are ion exchange chromatography and hydrophobic interaction chromatography.[6]
The second stage is called the intermediate stage. In this stage the major bulk impurities are removed. The resolution of the separation needs to be somewhat higher here because the contaminants more or less resemble the target peptide in terms of structural properties. RP-HPLC is a suitable technique for this stage of purification.
The final stage is the polishing stage.[6] The aim of this stage is to remove trace impurities from the sample. These impurities are very similar to the target peptide in terms of structure and are difficult to separate using a low resolution technique. The aim of the polishing stage is to obtain a target peptide that is as close to 100% pure as possible. Size exclusion chromatography may be used for this stage, but for separating structural variants that differ only very slightly from the target peptide, RP-HPLC might be used owing to its excellent resolving power.[6]
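The three-stage strategy can be summarised as a small lookup table. The sketch below simply restates the stages, goals and techniques named in the text above; the variable and function names are hypothetical and the structure is one possible way of recording a purification plan, not a prescribed format.

```python
from typing import Dict, List

# Stages in the order they are carried out, with the techniques the text suggests.
PURIFICATION_STAGES: Dict[str, Dict[str, object]] = {
    "capture": {
        "goal": "isolate, concentrate and stabilize the target peptide",
        "techniques": [
            "ion exchange chromatography",
            "hydrophobic interaction chromatography",
        ],
    },
    "intermediate": {
        "goal": "remove the major bulk impurities",
        "techniques": ["RP-HPLC"],
    },
    "polishing": {
        "goal": "remove trace impurities that closely resemble the target",
        "techniques": ["size exclusion chromatography", "RP-HPLC"],
    },
}


def techniques_for(stage: str) -> List[str]:
    """Return the techniques suggested for a stage, or raise for an unknown stage."""
    try:
        return list(PURIFICATION_STAGES[stage]["techniques"])
    except KeyError:
        raise ValueError(f"unknown purification stage: {stage!r}") from None


if __name__ == "__main__":
    for stage, details in PURIFICATION_STAGES.items():
        print(f"{stage}: {details['goal']} -> {', '.join(techniques_for(stage))}")
```

Note that RP-HPLC is deliberately absent from the capture stage, reflecting the clogging risk with crude samples mentioned above.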

Figure 7.4.2 Interaction of a solute with a typical reversed phase medium. Water adjacent to hydrophobic regions is postulated to be more highly ordered than the bulk water. Part of this ‘structured’ water is displaced when the hydrophobic regions interact, leading to an increase in the overall entropy of the system.
Large scale preparative purification of synthetic peptides requires both high resolution separation and a technique that can be scaled up. When RP-HPLC is used, the purification of the synthetic peptide is first optimized on a small-particle reversed phase medium and then scaled up by increasing the particle size while keeping the selectivity the same.

7.4.2 Recent Advances

In the past several years there have been quite a few technological advances in the field of chromatography. Apart from the automated purification systems that are increasingly being used, there have been innovative changes in the basic principles on which the purification of synthetic peptides is based. Purification techniques are becoming more and more customized to the target peptide, enabling a faster purification process and a better purity level of the final product.
A recently proposed chromatographic technique uses a novel mixed-mode reversed-phase/weak-anion-exchange (RP/WAX) type stationary phase. The separation material contains two distinct binding domains in a single chromatographic ligand: first, a lipophilic alkyl chain for hydrophobic interactions, similar to what is seen in reversed-phase chromatography, and second, a cationic site for anion exchange, which interacts with oppositely charged solutes.[7] The purified final product was analyzed by HPLC-UV and HPLC-ESI-MS, and it was found that all the major impurities had been removed in a single run on the RP/WAX stationary phase. In comparison to RP-HPLC this technique has improved productivity: the yield of pure peptide per run using the RP/WAX method was found to be 15 times higher than with standard gradient elution RP purification.[7]
Polymeric particles have long been used in the purification of biomolecules. These are generally semi-rigid porous particles with limited mechanical stability and are used under low pressure, low flow conditions. This limits their use to the capture phase of purification and makes them unsuitable for the high pressure polishing stage of the purification protocol. For the large-scale preparative and process scale purification of synthetic peptides, new polymers for the stationary phase have been developed.
Rigid copolymers of styrene and divinylbenzene have been developed for this purpose.[8] These polymeric particles have high mechanical stability and are able to operate at high linear velocities. The pore size and morphology are optimized to allow unhindered solute diffusion and to provide the maximum surface area, which enhances the loading capacity. A pore size of 100 Angstrom has been developed for the purification of synthetic peptides. The stability of the polymers enables cleaning with sodium hydroxide without any particle deterioration or change in selectivity.[8]
One of the main problems encountered in the purification of synthetic peptides is the separation of contaminants that are similar in shape and structural properties to the target peptide. These impurities elute close to the target peptide, which makes them difficult to separate. This problem can be addressed with fluorous-based separation.[9] Two different methods can be followed: either the final target peptide is tagged with an appropriate fluorous protecting group, or the tag is attached to the impurities that resemble the target peptide, such as unreacted intermediate products. The fluorous-tagged entity can then be easily separated from the rest using affinity chromatography. It has been demonstrated that fluorous-tagged impurities can be easily separated from the target peptide using fluorous HPLC.[9]
7.4.3 Evaluation of the Technology
Reversed phase chromatography plays a central role among the cluster of technologies used for the characterization and purification of peptides. It has found both analytical and preparative applications in the area of biochemical separation and purification. It plays a lead role in high sensitivity protein characterization, which involves the micropurification of peptides and the subsequent identification of the isolated peptide fragments. The general procedure is that the peptide sample is

subjected to micro-scale reversed phase chromatography and subsequently passed on to analysis by techniques such as Edman sequencing or mass spectrometry. On-line mass spectrometry is performed using a flow splitter, which diverts a fraction of the column eluent to the ESI-MS (electrospray ionization mass spectrometer). Another method is offline analysis using MALDI-TOF MS (matrix-assisted laser desorption time-of-flight mass spectrometry). The data collected through these steps ultimately lead to the characterization of the peptide.
RP-HPLC is mainly used for the purification of synthetic peptides. The main advantage of this technique is that it can be used both for the separation of peptides that are available only in microquantities and for large scale purification. It is also generally coupled with other chromatographic techniques in a logical sequence to obtain a good yield of the final product within a suitable time frame. Biomolecules with hydrophobic character can be easily separated using reversed phase chromatography (RPC) with excellent recovery and resolution.
There are, however, some problems associated with the use of RPC. One of the main problems is the build-up of contaminants in reversed phase columns. Contaminants that are weakly retained are eluted in the void volume.[10] These undesired impurities might be registered by the detector as chromatographic peaks, baseline upsets or negative peaks. If the contaminants are retained on the column and the mobile phase is not strong enough to elute them, they start accumulating near the column head after many injections. If the contaminant build-up in the RP column is too high, the contaminants start to act like a stationary phase, and analytes interact with this modified phase to give an altered separation pattern. Retention times can shift and tailing occurs.[10]
If there is too much contamination in the column, the pressure on the pump increases to a high level, which can cause the column packing to settle and create a void. The best way to prevent this is to apply specific washing techniques after a number of runs on the column.[10]
7.4.4 Applications of the Technology
Perhaps the area that has been affected most by this technology is the biopharmaceutical sector. In this sector it is of utmost importance that the peptides used as therapeutic agents are free of impurities, as these might lead to deterioration of the therapeutic agent. Apart from its applications in the medical field, the technology has wide ranging uses in other fields as well.
Affinity chromatography has gained popularity in recent years because of its high specificity in separating the target peptide. The selection of the peptide ligand is the most important step, and the right ligand can enable separation of the target molecule in a single step. Any alteration in the ligand can lead to unwanted results. It is therefore necessary to obtain a pure sample of the peptide ligand, which can be done using chromatographic purification techniques. For example, PY574 alpha is an 18 amino acid peptide constituting part of the intracellular domain of the PDGF α-receptor. This peptide was synthesized and used as a ligand for the subsequent affinity purification of signal transduction proteins from cell lysates. The purification of the peptide was achieved in a single step using reversed phase chromatography.[12]
Another application in which a synthetic peptide was used as a ligand was the one-step affinity purification of the recombinant urokinase-type plasminogen activator receptor. In this experiment the high affinity synthetic peptide antagonist (AE152) was synthesized and purified for use as the ligand in an affinity purification.[11]

7.4.5 Relevant web sites
Protein purification laboratory research - GE Healthcare
Genetic engineering news
Reversed Phase HPLC basics for LC/MS
Handbook of Analysis and Purification of peptides and proteins by RP-HPLC

7.4.6 Key Industry Suppliers
Applied Biosystems
ROHM HAAS Advanced Biosciences
Viscotek
7.4.7 References
1. Nelson, D. L. & Cox, M. M. (2000) Lehninger Principles of Biochemistry, 3rd edn, Worth Publishers, New York.
2. Sheppard, R. C. (1975) Amino-acids, Peptides and Proteins, Vol. 9, pp. 29-33, The Chemical Society, London.
3. Pasch, H. & Trathnigg, B. (1999) HPLC of Polymers, Springer Verlag, New York.
4. Reversed Phase Chromatography: Principles and Methods, Amersham Pharmacia Biotech.
5. Bailey, P. D. (1990) An Introduction to Peptide Chemistry, John Wiley and Sons, New York.
6. Protein Purification Handbook, Amersham Pharmacia Biotech.
7. Nogueira, R., Lammerhofer, M. & Lindner, W. (2005) Alternative high-performance liquid chromatographic peptide separation and purification concept using a new mixed-mode reversed-phase/weak anion-exchange type stationary phase, Journal of Chromatography A. 1089, 158-169.
8. Lloyd, L. L., Millichip, M. I. & Watkins, J. M. (2002) Reversed-phase poly(styrene-divinylbenzene) materials optimised for large scale preparative and process purification of synthetic peptides and recombinant proteins, Journal of Chromatography A. 944, 169-177.
9. de Visser, P. C., van Helden, M., Filippov, D. V., van der Marel, G. A., Drijfhout, J. W., van Boom, J. H., Noort, D. & Overkleeft, H. S. (2003) A novel, base-labile fluorous amine protecting group: synthesis and use as a tag in the purification of synthetic peptides, Tetrahedron Letters. 44, 9013-9016.
10. Majors, R. E. The Cleaning and Regeneration of Reversed Phase HPLC Columns, Agilent Technologies, Delaware USA.

11. Jacobsen, B., Gardsvoll, H., Juhl Funch, G., Ostergaard, S., Barkholt, V. & Ploug, M. One-step affinity purification of recombinant urokinase-type plasminogen activator receptor using a synthetic peptide developed by combinatorial chemistry, Protein Expression and Purification. In Press, Corrected Proof.
12. Purification of a phosphorylated PDGF α-receptor derived peptide at high pH using a polymer stationary phase. Amersham Biosciences, Application Note 18-1132-63.

Chapter 7.5 Peptide library and its application
Patel Saravatichandra
7.5.1 Introduction:
In molecular biology, the discovery of the structure of DNA by Watson and Crick revealed many of the secrets of biological systems and eventually led to the sequencing of genes. Gene sequences are very helpful for understanding how RNA and protein are made, because the final product of gene expression is the protein. The RNA produced by transcription does not correspond directly to the amino acid sequence of the protein: during post-transcriptional processing some nucleotide stretches (introns) are spliced out and only the useful regions are joined together to give the message that is translated into protein. A large amount of sequencing data is now available because of the Human Genome Project, but because decoding techniques are still inadequate it is difficult to identify exact gene sequences. In particular, it is very difficult to find the intron-free sequence that directly encodes a protein. One idea for solving this is to identify all the factors required for post-transcriptional modification in vitro, but decoding such a large number of candidate sequences is too hard. Using DNA and RNA sequences to cure disease is therefore a tedious and time-consuming approach. Working from the amino acid sequence of the protein, and using a whole group of small peptides, is more convenient. Using specific, known protein fragments gives more reliable results.

“Peptide library is a systematic combination of different peptides in large number. It is a powerful tool for drug discovery, structural studies and other applications. Solid phase peptide synthesis, along with other methods, has been successfully used to prepare peptide libraries.” [1] The basic principle of the peptide library is expression of libraries of peptides in mammalian cells to select for trans-dominant effects on intracellular signalling systems. [3] In this sense a peptide library uses peptides to probe the signalling pathways of the cell. The peptides are designed so that they interfere with intracellular signalling, and the interference is detected with suitable reagents. Revealing these signalling pathways helps in finding a drug to treat the associated disease. When a peptide library is delivered into cells, the small size of the peptides makes it possible to scan for the rare peptides that bind to intracellular or surface proteins. Because the peptides act as affinity reagents, their target protein complexes can be isolated and the mechanism of action determined. This can then serve as a therapeutic model for drug discovery. [2]

Fig-1 general mechanism of peptide library Source:

7.5.2 Recent advancement in the peptide library:
In recent years, advances in bioinformatics have led to a great deal of research in this field, and peptide library technology has progressed considerably. Many groups, laboratories and scientists now work in this area, and new peptides and their associated systems are reported continually. Some of the advancements are stated below.

(1) Treatment of cocaine addiction: Cocaine abuse is a serious social problem. Governments and social groups are trying to reduce it, and it may now become treatable. Recently, researchers have used intranasally administered filamentous bacteriophage that display cocaine-sequestering antibodies on their surface. These antibodies bind the cocaine molecule and block its psychoactive effects in a rodent model.

Fig 2 Reaction of catalytic antibody with cocaine. Source: 10.1208/aapsj070359

These antibodies sequester the cocaine molecule, so that it can no longer bind to its nerve receptors and there is no psychoactive effect on the central nervous system. Three approaches to developing such treatments are described: “(1) Those compounds that can be used in a substitution-based treatment as a cross-tolerant stimulant; (2) medications that serve as antagonists by blocking the binding of cocaine to its cognate receptors; and (3) compounds that function by acting at other sites distinct from the cocaine site of action but functionally antagonize the effects of cocaine.” Several biopsychological models have been proposed and evaluated to address addiction and relapse prevention. [5] “Unquestionably, an improved pharmacotherapy would increase the effectiveness of such programs and alternative strategies for treating cocaine addiction are needed if progress is to be made.” [4] Among these approaches, the second is well suited to blocking the effect of cocaine on the central nervous system.

(2) Epitope mapping using mRNA display: Mapping an epitope, that is, identifying the specific part of an antigen recognised by an antibody, relies on a protein-protein interaction in which the antibody is used as a molecular reagent. The traditional method depends on expressing a series of overlapping polypeptides and probing their reactivity with the antibody, which narrows down the epitope to a precise sequence. This method is tedious, time consuming and costly. In display techniques, each expression vector carries multiple copies of a single polypeptide sequence on its surface. Active peptides are recovered on the basis of their affinity, by fluorescence-activated cell sorting or by biopanning. In mRNA display, the peptide is fused to the 3' end of its own mRNA by means of a tethered puromycin compound. The selected fusions are then amplified by RT-PCR for sequencing. [6]
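The traditional overlapping-peptide approach mentioned above can be illustrated with a short sketch. It assumes a fixed peptide length and a fixed offset between neighbouring peptides; the function name, the default values and the test sequence are illustrative choices, not taken from the cited work.

```python
def overlapping_peptides(sequence, length=15, offset=5):
    """Return (start_position, peptide) pairs that tile `sequence`.

    Positions are 1-based, as is usual for protein sequences.
    `length` and `offset` are illustrative defaults, not values
    from the source text.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    last = len(sequence) - length
    starts = list(range(0, last + 1, offset))
    # Add a final peptide so the C-terminal residues are always covered.
    if starts[-1] != last:
        starts.append(last)
    return [(s + 1, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    # Hypothetical 20-residue antigen fragment, tiled as 8-mers.
    for start, peptide in overlapping_peptides("ACDEFGHIKLMNPQRSTVWY", 8, 4):
        print(start, peptide)
```

Each peptide would then be probed with the antibody; the epitope lies in the region shared by the reactive peptides. A smaller offset gives finer mapping at the cost of more peptides to synthesize or express.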

Fig-2 Outline of the whole procedure of epitope mapping using mRNA display. Source:

(3) Molecular cloning and expression of the insect FMRFamide receptor: FMRFamide and related peptides are widely distributed in the nervous systems of invertebrates, where they have important functions. In this work, Chinese hamster ovary (CHO) cells were used to express a cloned Drosophila orphan receptor. Screening of a peptide library showed that the receptor responded to FMRFamide with very high affinity. The peptide library tested contained Drosophila and other insect peptides and many peptide hormones. “Addition of 10⁻⁵ or 10⁻⁶ M of these peptides to the pre-treated CHO cells gave negative results for many of these peptides, but peptides resembling FMRF amide at their C termini, and FMRF amide itself, gave clear bioluminescence responses, Because FMRF amide was the most potent peptide in inducing the bioluminescence response.” [7] For further testing, eight peptides related to the Drosophila FMRFamide peptide were prepared. These peptides are assumed to be contained in the preprohormone. After testing all of them, FMRFamide-6 was found to be the most potent intrinsic peptide. [7]

7.5.3 Advantages of the peptide library
There are many advantages of the peptide library technique; the main ones are as follows.

(1) Genetic engineering can be used to identify the gene that codes for a surface protein, but this gene is common to all copies of that surface protein, so the displayed peptide is random. The main advantage of the peptide library is that peptides are selected for their affinity towards the target protein. [8]

(2) Cell-free display avoids the problems of cytotoxicity, soluble protein expression and secretion bias in cell-based systems; it could be an ideal means by which to display functional (single chain) proteins for applications such as target discovery and functional identification. [9]

(3) Paratope-specific purification of antibodies has distinct advantages over conventional methods of antibody purification with respect to its capacity to isolate product of high purity and immunoreactivity. [10]

(4) A phage library offers the advantage of multimeric binding (several hundred copies of the peptide are displayed on each phage), and the peptide sequences identified from the binding phages often reveal sequence themes that make it easier to narrow in on the consensus sequences that bind the target. [11]

(5) “By delivering libraries of peptides to cells we can scan for rare peptides that bind to and induce function of intracellular signalling proteins and surface molecules. Using these peptides as affinity reagents we can isolate their target protein complexes and discern mechanism. Finally, we are using them to create therapeutic modalities from either their targets, the peptides themselves, or by some better understanding of the biological process that is uncovered.” [2]

(6) Until now there has been no dependable, predictive and generally applicable method for determining interference with cell signalling in vivo, or for studying tissue differentiation, organ development and abnormal or disease development. Peptide libraries make it possible to isolate such signalling pathways for future use. [12]

Limitation of the peptide library technique:
The main limitation of the peptide library technique lies in epitope discovery. When searching for antigens for diagnostic purposes, the microbial sequence databases are not complete, so similarity searches will fail to detect any antigen that is not represented in the available databases. [13] “The use of longer peptides cannot overcome a primary limitation of peptide libraries, the inability to represent conformational epitopes comprising amino acids from distant positions along the polypeptide sequence.” [14]

In random peptide display, identifying the actual epitope is time consuming, because three or more rounds of panning are needed to enrich specific phage clones, obtain a consensus amino acid sequence and find the epitope. This takes 3-4 weeks. The technique is also costly and labour intensive. “Additionally, a random peptide library, despite a theoretical size of >10¹² peptides, may not include all possible amino acid combinations and therefore may not contain specific binding-peptide motifs.” [15]

Another limitation of the peptide library technique is the folding of engineered proteins into their natural form. [16] Owing to poor sequence diversity, some shapes may be incompatible with the target because of electrostatic charge repulsion, and such peptides are therefore not selected against their target protein complex. [17] “One limitation of synthetic peptide libraries is the lack of specific criteria for deciding whether a detected binding activity is genuine or is due to non-specific interactions.” [18] Moreover, coupling mixtures of amino acids to prepare a peptide library is problematic: the coupled di-, tri- and tetrapeptides are not obtained in equimolar amounts. [19]
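The point that a library of more than 10¹² peptides may still miss some sequences can be made concrete with a small calculation. The sketch below assumes the 20 standard amino acids at every position and, as a simplifying assumption not stated in the source, that each clone is drawn uniformly at random, so the expected coverage follows a Poisson approximation. The function names are illustrative.

```python
import math

NUM_AMINO_ACIDS = 20  # the 20 standard amino acids


def theoretical_diversity(peptide_length):
    """Number of distinct sequences for a fully random peptide."""
    return NUM_AMINO_ACIDS ** peptide_length


def expected_coverage(peptide_length, num_clones):
    """Expected fraction of all possible sequences present at least once.

    Assumes uniform random sampling of sequences (Poisson
    approximation): coverage = 1 - exp(-clones / diversity).
    """
    ratio = num_clones / theoretical_diversity(peptide_length)
    return -math.expm1(-ratio)


if __name__ == "__main__":
    for length in (6, 7, 10, 12):
        diversity = theoretical_diversity(length)
        coverage = expected_coverage(length, 10 ** 12)
        print(f"{length}-mer: {diversity:.3e} sequences, "
              f"coverage with 1e12 clones = {coverage:.3f}")
```

Under these assumptions, 10¹² clones cover short random peptides essentially completely, but only a small fraction of all possible 10-mers, which is consistent with the limitation quoted from [15].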

7.5.4 Application of the peptide library technology:
(1) Combinatorial peptide library methods for immunobiology research: This technology is widely used in research on immunological pathways. Peptide libraries are very useful for determining the epitopes on the surface of an antigen, which can then be used in diagnostic immunoassay kits. These epitopes are also useful for vaccination. “In recent immuno biological applications, peptide libraries have proven monumental in the definition of MHC anchor residues, in lymphocyte epitope mapping, and in the development of peptide vaccines. Peptides identified from such libraries, when presented in a chemical micro array format, may prove useful in immunodiagnostics. Combinatorial peptide libraries offer a high-throughput approach to study limitless biological targets. Peptides discovered from such studies may be therapeutically and diagnostically useful agents.” [20]

(2) Phage display using peptide library: Phage display of a peptide library is a powerful tool for identifying compounds that recognise and bind to a target. Phage systems are used to detect protein-ligand interactions and to improve binding affinity. The variety of display methodologies that exist offers a range of capabilities in panning for affinity interactions. These display systems offer an opportunity to address the issue of expression bias and to explore the possibility of creating low molecular weight compounds with drug-like properties. Peptides found by phage display can be used as biomarkers for diseases such as cancer and cardiovascular disease, and for processes such as angiogenesis. “The peptides are capable of homing to specific pathways and targets within those pathways. By generating the proteins expressed from cDNA libraries, display methodologies can provide a direct link between phenotype and genotype.” [21]

(3) Peptide library for protein-protein interaction: In cellular signal transduction, interaction between proteins is the major event. We want to determine how these proteins interact with one another within signalling complexes to produce a signal. Several peptide library approaches have been developed for the study of protein-protein interactions. All of these approaches depend on a peptide, or on the DNA sequence encoding a peptide, to identify the binding motif. The oriented peptide array library (OPAL) approach facilitates high-throughput proteomic investigation. “OPAL integrates the principles of both the oriented peptide libraries and array technologies. Hundreds of pools of oriented peptide libraries are synthesized as amino acid scan arrays. We demonstrate that these arrays can be used to map the specificities of a variety of interactions, including antibodies, protein domains such as Src homology 2 domains, and protein kinases.” [22]
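The idea of an amino acid scan array built from oriented peptide pools can be sketched as follows. This is only a conceptual illustration, not the layout used in [22]: it assumes a fixed anchor residue (a phosphotyrosine, written `pY`, is chosen here as a hypothetical example), degenerate positions written `X`, and one pool for each combination of flanking position and amino acid.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, 20 standard residues


def scan_array_pools(n_before, n_after, anchor="pY"):
    """Build pool descriptions for an oriented amino acid scan array.

    Each pool keeps the anchor residue fixed, fixes one amino acid at
    one flanking position, and leaves every other position degenerate
    ("X"). Keys are (position relative to anchor, amino acid).
    The anchor "pY" is a hypothetical example.
    """
    length = n_before + 1 + n_after
    anchor_index = n_before
    pools = {}
    for position in range(length):
        if position == anchor_index:
            continue
        for amino_acid in AMINO_ACIDS:
            residues = ["X"] * length
            residues[anchor_index] = anchor
            residues[position] = amino_acid
            pools[(position - anchor_index, amino_acid)] = "-".join(residues)
    return pools


if __name__ == "__main__":
    pools = scan_array_pools(n_before=2, n_after=3)
    print(len(pools), "pools")
    print(pools[(1, "E")])  # glutamate fixed one residue after the anchor
```

With two positions before and three after the anchor this gives 100 pools, in line with the "hundreds of pools" described in the quotation. Comparing the binding signal across pools at one position indicates which residue the binding partner prefers there.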

(4) Exploring Biochemistry and Cellular Biology with Protein Libraries: Peptide libraries can map the specificity of essential enzymes and binding proteins on a broad scale. In addition to providing rules for molecular recognition, the binding preferences and tolerances obtained from such libraries can reveal the mechanisms of biochemical and cellular processes. Peptides obtained from protein libraries are also useful as pharmaceutical compounds and as reagents for further discoveries in cell biology. [23]

7.5.5 Relevant websites:
(1) (2) (3) (4) (5)

7.5.6 Key industrial suppliers
(1) AUSPEP Private Limited 2002 Website:
(2) Chemical Synthesis Services, Elvingston Science Centre. T: +44(0)2838332200 F: +44(0)1875408151 Website:
(3) Genzyme Pharmaceuticals, 675 West Kendall Street, Cambridge, MA 02142, USA. Tel: 617-374-7248, Fax: 617-768-6433 Website:

7.5.7 References:
[1] Peptide library, Princeton Bimolecule Corporation, Access date (23/09/06)
[2] Access date (23/09/06)
[3] Dominant effector genetics in mammalian cells, Rigel, Inc., San Francisco, California, USA. PubMed 11137994

[4] Dickerson, T.J., Janda, K.D., Recent Advances for the Treatment of Cocaine Abuse: Central Nervous System Immunopharmacotherapy. AAPS Journal. 2005; 07(03): E579-E586. DOI: 10.1208/aapsj070359
[5] Hoffman, J.A., Caudill, B.D., Koman, J.J., III, Luckey, J.W., Flynn, P.M., Hubbard, R.L. Comparative cocaine abuse treatment strategies: enhancing client retention and treatment exposure. J Addict Dis. 1994; 13:115-128. DOI: 10.1300/J069v13n04_01
[6] William, W.J., Olsen, B.N., Roberts, R.W., Epitope mapping using mRNA display and a unidirectional nested deletion library, Access date (23/09/06)
[7] Cazzamali, G., Grimmelikhuijzen, C. J. P., Molecular cloning and functional expression of the first insect FMRF amide receptor, Access date (24/09/06)
[8] Lunder, M., Bratkovic, T., Doljak, B., Kreft, S., Urleb, U., Strukelj, B., Plazar, N. Comparison of bacterial and phage display peptide libraries in search of target-binding motif. Access date (24/09/06)
[9] He, M., Taussig, M. J., Ribosome display: Cell-free protein display technology, Access date (24/09/06)
[10] Murray, A., Smith, R.G., Brady, K., Williams, S., Badley, R.A., Price, M.R. Generation and refinement of peptide mimetic ligands for paratope-specific purification of monoclonal antibodies. Cancer Research Laboratories, University of Nottingham, Nottingham, NG7 2RD, United Kingdom. PubMed 11520027, Access date (24/09/06)
[11] Phage display screening, Access date (24/09/06)
[12] Wesley, C.S., Methods for determining notch signalling, Access date (24/09/06)


[13] Hamby, C. V., Llibre, M., Utpat, S., Wormser, G. P., Use of Peptide Library Screening To Detect a Previously Unknown Linear Diagnostic Epitope: Proof of Principle by Use of Lyme Disease Sera, Department of Microbiology and Immunology, Division of Infectious Diseases, Department of Medicine of New York Medical College, Valhalla, New York, Access date (25/09/06)
[14] Sims, K. L., Schryvers, A. B., Peptide-Peptide Interactions between Human Transferrin and Transferrin-Binding Protein B from Moraxella catarrhalis, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, Alberta, Canada, Access date (25/09/06)
[15] Holzem, A., Nähring, J. M., Fischer, R., Rapid identification of a tobacco mosaic virus epitope by using a coat protein gene-fragment-pVIII fusion library, Institute for Biology I (Botanik/Molekularbiologie), RWTH Aachen, Worringerweg 1, D-52074 Aachen, Germany; Fraunhofer Department for Molecular Biotechnology, IUCT, Grafschaft, Auf dem Aberg 1, D-57392 Schmallenberg, Access date (25/09/06)


[16] Park, S., Xu, Y., Stowell, X. F., Gai, F., Saven, J. G., Boder, E. T., Limitations of yeast surface display in engineering proteins of high thermostability, Department of Chemistry, University of Pennsylvania, Philadelphia, PA 19104, Department of Chemical and Biomolecular Engineering, University of Pennsylvania, Philadelphia, PA 19104 and Laboratory for Research on the Structure of Matter, University of Pennsylvania, Philadelphia, PA 19104, USA. Present address: Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA, Access date (25/09/06)
[17] Palmer, S. J., Redfern, M. R., Smith, G.C., Cox, J. P. L., Sticky Egyptians: a technique for assembling genes encoding constrained peptides of variable length, Department of Chemistry and Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, UK, Access date (25/09/06)
[18] Sims, K. L., Schryvers, A. B., Peptide-Peptide Interactions between Human Transferrin and Transferrin-Binding Protein B from Moraxella catarrhalis, Department of Microbiology and Infectious Diseases, University of Calgary, Calgary, Alberta, Canada, Access date (25/09/06)
[19] Boutin, J. A., Gesson, Henlin, J. M., Bertin, S., Lambert, P.H., Volland, J.P., Fauchère, J.L., Limitations of the coupling of amino acid mixtures for the preparation of equimolar peptide libraries, Department of Peptide and Combinatorial Chemistry and Department of Analytical and Physical Chemistry, Institut de Recherches SERVIER, 11 Rue des Moulineaux, F92150 Suresnes, France. Access date (25/09/06)
[20] Liu, R., Enstrom, A.M., Lam, K.S., Combinatorial peptide library methods for immunobiology research, UC Davis Cancer Centre, Division of Haematology/Oncology, and Department of Internal Medicine, University of California Davis, Sacramento, CA, USA. PubMed 12543103, Access date (25/09/06)
[21] Molecular display, Access date (25/09/06)
[22] Rodriguez, M., Li, S. S.-C., Harper, J. W., Songyang, Z., An Oriented Peptide Array Library (OPAL) Strategy to Study Protein-Protein Interactions, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030 and the Department of Biochemistry, Faculty of Medicine and Dentistry, University of Western Ontario, London, Ontario N6A 5C1, Canada, 279/10/8802, Access date (25/09/06)
[23] Diaz, J. E., Howard, B. E., Neubauer, M.S., Olszewski, A., Weiss, G. A., Exploring Biochemistry and Cellular Biology with Protein Libraries, Department of Chemistry and Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697-2025, USA. Access date (25/09/06)

Chapter 7.6 Applications of Synthetic Peptides
Shruti Saptarshi

7.6.1 Introduction
Proteins are complex organic molecules made up of carbon, nitrogen, oxygen, hydrogen and sulphur. They are found abundantly in nature in the form of enzymes, hormones and antibodies, and thus play an important role in the overall metabolism of living organisms. Proteins are made up of twenty different types of basic structural units called amino acids. The amino acids found in all living beings are the same. Every amino acid consists of a central α carbon atom (chiral in all amino acids except glycine), an amino group, a carboxyl group and a variable, reactive R group. A condensation reaction between the carboxyl group of one amino acid and the amino group of another leads to the formation of a peptide bond. Two amino acids linked by a peptide bond form a dipeptide. Peptides are essentially small chains of up to 20 such α amino acids linked by peptide bonds. They are basically small, functional fragments of the protein molecule. Under natural circumstances peptides are synthesized by the process of mRNA translation.

As mentioned above, proteins play a pivotal role in the function and maintenance of cell structure. Hence proteins as chemical moieties have a vast array of industrial, therapeutic and diagnostic applications. However, it is not always feasible to obtain the required protein, mainly because of the minimal amounts available, the complexity of its extraction or, most importantly, the expenditure involved in the process. Peptides, on the other hand, are specific functional subunits of proteins. Lately, production of peptides on a large scale has gained prominence because of their pharmacological importance. Peptides can be obtained directly from tissue by purification, or by the use of recombinant DNA technology [1]. However, the best means of obtaining peptides on a large scale is chemical synthesis. The basic aim of chemical peptide synthesis is to obtain the maximum number of amino acid residues in the peptide.

Three approaches are used in synthesizing peptides: solution phase synthesis, reverse proteolysis and solid phase peptide synthesis (SPPS). The method most commonly used is SPPS. It was developed through the pioneering efforts of R. Bruce Merrifield and first reported in 1963. As the name suggests, SPPS involves synthesizing the peptide while it is attached to a solid phase. The solid phase is an insoluble polymer, usually a resin, contained in a column. Chemical synthesis of the peptide proceeds from the carboxy terminal, unlike natural synthesis, which proceeds from the amino terminal.
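The direction of synthesis can be illustrated with a minimal sketch. Peptide sequences are conventionally written from the amino terminal to the carboxy terminal, so the order in which residues are attached in SPPS is the reverse of the written sequence. The second function estimates the fraction of chains that reach full length if every coupling succeeds with the same efficiency; that model and the 99% figure used in the example are illustrative assumptions, not values from the source.

```python
def coupling_order(peptide):
    """Residues in the order they are attached during SPPS.

    `peptide` is written N-terminal to C-terminal (the usual
    convention); SPPS starts at the C-terminal residue on the resin.
    """
    return list(reversed(peptide))


def full_length_fraction(num_residues, coupling_efficiency):
    """Fraction of chains that are full length after synthesis.

    Simplified model (an assumption): each of the num_residues - 1
    couplings succeeds independently with the same efficiency.
    """
    if num_residues < 1:
        raise ValueError("a peptide needs at least one residue")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (num_residues - 1)


if __name__ == "__main__":
    print(coupling_order("GAVL"))  # leucine is attached to the resin first
    for n in (10, 40, 80):
        print(n, round(full_length_fraction(n, 0.99), 3))
```

Under this model even a small loss per coupling accumulates over a long chain, which is one reason purification after cleavage matters for longer peptides.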

Fig 7.6.1 Synthesis of peptides schematic representation. Source:

The first step involves coupling the first amino acid to the ligands present on the resin. This is made possible by means of chloromethyl linkers. The main criterion of this step is to achieve a stable bond between the amino acid and the polymer support. An alternative resin used nowadays is PAM (phenylacetamidomethyl) resin, which forms a much more stable bond between the amino acid and the resin than the resins traditionally used. Once the amino acid is firmly attached to the polymer support, its reactive functional group and its R group have to be protected to prevent unwanted side reactions. Typical labile groups used for protecting the alpha amino group include t-Boc (tert-butyloxycarbonyl) and F-moc (9-fluorenylmethyloxycarbonyl). Another important characteristic of these groups is that they are easily removed so that a new amino acid can be added. t-Boc is an acid-labile group and can be easily removed using TFA (trifluoroacetic acid) in dichloromethane, whereas F-moc can be removed using concentrated amine solutions such as piperidine in N-methylpyrrolidone. The groups used to protect the R groups remain attached during the entire synthesis. For peptide bonds to form during elongation of the peptide, the carboxyl groups must be activated. This is achieved using symmetric anhydrides, carbodiimides, or phosphonium and uronium salts. This process is repeated until the desired peptide length is achieved. Once ready, the peptide is cleaved from the polymer support to produce a free peptide, which is then subjected to characterization and purification using techniques such as reverse phase HPLC, mass spectrometry and ion exchange chromatography. The newly synthesized peptide is then lyophilized.

7.6.2 Recent Advances
Nowadays peptide synthesis can also be achieved by automated means. A vast number of automated instruments are available which carry out the coupling, activation and deprotection steps with higher efficiency in terms of purity. Common examples of such automated systems include those from MilliGen, Applied Biosystems and Advanced ChemTech. As a result it is now possible to obtain synthetic peptides of 2 to 80 amino acids in length with a purity of about 99%.

7.6.3 Evaluation of Technology
The latest technology has made it possible to manufacture complex peptides, such as those consisting of long chains of hydrophobic amino acids or containing modified amino acids such as phosphotyrosine, phosphoserine and phosphothreonine. It is also possible to synthesize peptides containing other chemical features such as dyes and disulphide bonds. Modifications such as biotinylation (tagging a molecule with biotin), acylation (introducing an acyl group into a molecule) and nitration can also be introduced. Peptides are commonly used as antigens for immunization purposes. This requires conjugation of the peptide to a carrier protein. Such a conjugation can be brought about at the C or N terminal of the peptide by making use of a pre-activated carrier protein. Carbodiimide hydrochloride is an example of a cross-linker that binds to the Asp and Glu residues of an amino acid sequence. Peptide sequences are specific, and this makes each peptide unique with regard to its chemical and physical properties. Peptides are manufactured for myriad reasons. Being specific in nature, they are widely used in therapeutics, protein engineering, immunodiagnostics and the study of structure-function relationships. Novel molecules such as oligonucleotide-peptides, PNAs (peptide nucleic acids), peptoids, mimetics, MAPs and antibodies are all essentially derived from peptides.


Fig 7.6.2 Structure of a MAP system. Source:

Antibodies are nowadays increasingly used for therapeutic purposes. Antibodies can be raised against a large number of proteins (antigenic moieties). These can be proteins separated by electrophoresis, proteins in their native state, inactivated pathogens such as viruses, or artificially synthesized peptides. Of all these forms of antigen, synthetic peptides are preferred because they help in targeting specific epitopes, are pure, and are inexpensive to manufacture. Moreover, an antigenic peptide can be introduced into any species to obtain antibodies, and different antibodies can be made against different epitopes on the same antigenic peptide. Antibodies can also be raised against specially modified (e.g. phosphorylated or biotinylated) peptide sequences. Such state-specific antibodies have been used in studying cell signaling pathways. However, the specificity of synthetic peptides can be a problem at times, and their small size can be a limitation in eliciting an immune response. Hence there is a need to conjugate the peptide to a specific antigen carrier system. The MAP, or multiple antigen peptide, system provides the answer. It was first described by Dr. James Tam. It makes use of a peptidyl core of three or seven branched lysine residues upon which the antigenic sequences of interest can be built using the SPPS method described above. The basic advantage of such a system is the high molar ratio of peptide antigen to core molecule, so there is no further need for conjugation to a carrier protein. The immunogenic properties of multiple antigenic peptide systems containing defined T and B epitopes have been aptly described [2].
The authors immunized with chemically defined synthetic polymers, MAP systems consisting of T and B epitopes of the circumsporozoite protein of P. berghei, and found induction of a significant immune response with high levels of circulating antibodies. A secondary immune response against the MAP system showed increased levels of circulating IgG. This study showed that MAP systems are potent immunogens and are also capable of inducing immunologic memory. In another study, two antigenic peptides, 26-43 (P26) and 116-131 (P116), derived from the 28 kDa glutathione S-transferase of S. mansoni (Sm28 GST), and two multiple antigenic peptides with an oligomeric lysine core, (P26)4-MAP and (P116)4-MAP, were synthesized by the solid phase peptide synthesis method. The experiments were carried out on laboratory animals, and positive dot ELISA results were obtained. As described in [3], “A new method of synthesis of Multiple Antigen Peptide System, wherein the carrier is a core matrix with branched lysine and β-alanine residues, is now in use. The antigenic peptide is separately synthesized in a protected form and coupled to the core with diisopropylcarbodiimide (DIPCDI) in presence of 1-hydroxybenzotriazole (HOBt). This procedure has two major advantages: firstly, it allows an independent characterization (and purification) of the core and of the peptides; secondly, it allows a possible further coupling to a protein carrier, after an intermediate addition of Cys to the N-terminal amino acid.”

PEPTOIDS

Fig 7.6.3 Structure of a typical Peptoid. Source:

Peptoids, or peptidomimetics, are molecules that resemble peptides but differ from them in their properties. Peptoids are oligomers of N-substituted glycines rather than α-carbon-substituted glycine units. They are designed to mimic the complex biological properties of the parent peptide. Peptoids are protease resistant and hence are not broken down in the body as easily as peptides. This particular characteristic of peptoids is now being further explored in the field of biomedicine. Peptoids are synthesized using the solid phase sub-monomer method on a peptide synthesizer. As mentioned in one recent paper, scientists are designing peptoids with selectively positioned reactive groups along the molecular backbone. This was done by building the peptoid backbone on a solid support and then attaching the bioreactive groups to it using a copper-catalyzed reaction between an azide and an alkyne. In this method the azide or alkyne unit is first attached to the backbone, and then the bioreactive molecule, which already contains a coupling moiety, is added, resulting in instant ligation. This type of peptoid may also find applications in energy transduction and in information storage. [4]

Peptoids have also found potential application as fibrin mimics for surgical glue, as stated in [5]. New substrates for transglutaminases are being developed with peptidomimetic backbones. These substrates are then attached to polymers such as PEG to form polymeric substrates. One example is a hydrogel formulation that contains calcium-loaded liposomes, hrFactorXIII, thrombin and an enzymatic substrate, all based upon a PEG-peptide conjugate with four arms. Each of the four arms ends with a peptidomimetic mimicking the gamma chain of the protein fibrin. This material can be used in wound healing, tissue scaffolding and drug delivery.
Yet another application being explored for its potential clinical benefit is the use of peptide mimetics as vaccines for cancer. Cancer vaccines work by inducing an immune response against a tumor antigen by means of whole tumor cell vaccines, dendritic cell vaccines, adjuvant vaccines, DNA vaccines, viral vector vaccines and so on. Most of the antigens exhibited by tumor cells are also expressed by normal cells, so it is difficult to induce a strong humoral immune response against these self antigens. Peptide mimetics are artificially designed molecules. If antigenic determinants that mimic the tumor antigens are introduced into the body, it may therefore be possible to induce a stronger immune response. The main concern with this form of immunotherapy is the risk of initiating an autoimmune response. A recent report stated that immunizing mice with a peptide that mimics MUC1 generated a strong cellular and protective response against MUC1 and lysed human breast cancer cell lines [6]. This idea is still in its infancy, and many clinical trials will be needed to devise a strategy that is useful in humans.

Peptoids are also being characterized for their antimicrobial properties. One research paper reported the antibiotic activity of the peptoid CHIR29498 and some of its analogues against a range of Gram-positive and Gram-negative bacteria. Destruction of the bacterial cell wall was found to be the mode of action of the peptoid [7].

PEPTIDE NUCLEIC ACIDS:

Fig 7.6.4 Diagrammatic representation of Peptide Nucleic Acid Source:

Peptide nucleic acids (PNAs) are achiral DNA/RNA analogs. In place of the sugar-phosphate backbone of nucleic acids they contain a pseudopeptide backbone of N-(2-aminoethyl)glycine units. This backbone is also uncharged. PNAs bind to both DNA and RNA, forming hybrids that are much more stable than the corresponding natural duplexes. They show the same complementarity as their natural counterparts, obeying the Watson-Crick hydrogen bonding scheme. They are synthesized by solid-phase peptide synthesis. Moreover, peptide nucleic acids are relatively resistant to attack by proteases, and their hybrids are thermally stable and show unusual ionic strength effects. Hence they find wide application in chemistry, medicine and genetic diagnostics. PNAs are used extensively in antigene and antisense studies because they can inhibit transcription and translation of the gene they target. They can be used as markers in DNA mapping projects. PNAs labeled with biotin, fluorescent dyes or reporter enzymes are commonly used in hybridization experiments such as DNA arrays, Northern and Southern blots, and the detection of point mutations. PNAs are now also used in biosensors that detect specific DNA sequences in test samples. A single-stranded nucleic acid probe is immobilized on an optical transducer over which the sample is passed, and any mismatch is conveyed as an electrical signal, which is then detected. PNAs can thus be used in place of synthetic DNA or RNA, with the added benefits mentioned above.

PEPTIDES AS VACCINES:

One of the most critical goals of vaccine development is to trim structurally complex molecules down to smaller ones that are well defined, can be modified rapidly and retain antigenic specificity. Synthetic peptides are therefore an attractive option. They are commonly used as immunizing agents and are the simplest route to vaccine development. The synthetic methods described above have made it possible to synthesize peptides corresponding to specific antigenic epitopes present on an infectious agent. Considerable effort is currently going into peptide-based vaccines against cancer and AIDS. One report described the synthesis of conformationally constrained peptides that bind tightly to the 2F5 monoclonal antibody (specific for a recognition epitope on the HIV envelope glycoprotein gp41). The peptide is to be conjugated to a carrier protein and used as a vaccine to elicit an HIV-neutralizing immune response. This idea has potential therapeutic benefits [8].

The antigenic and immunogenic properties of totally synthetic peptide-based antifertility vaccines have also been reported. A vaccine containing a synthetic peptide made up of a defined 15-residue T cell epitope and a 10-residue LHRH (luteinizing hormone releasing hormone) epitope induced specific antibody formation, followed by sterility in mice. A vaccine of this type may also be of therapeutic importance [9].

Yet another report described the synthesis of a peptide vaccine against influenza A/Aichi (H3N2) virus. It was prepared by introducing residues of the influenza virus haemagglutinin into a framework component that included the agretopes (the sites on an antigen that contact the MHC molecule).
It was observed that this vaccine induced a characteristic T cell response which had a neutralizing effect on the virus [10].

OTHER EXAMPLES OF APPLICATIONS OF SYNTHETIC PEPTIDES:

Apart from the examples described above, peptide synthesis has been used extensively to produce a large number of hormones, for example LHRH, insulin, calcitonin, gonadotropins and other pituitary hormones. Synthetic peptides are also now used as reagents for immunodiagnosis, as inhibitors and for protein engineering. Their application in structure-function studies is of great importance.

7.6.5 Relevant Websites

The relevant websites pertaining to this topic are as follows:
• Activotech – Peptide synthesis
• Research and development
• Peptide Synthesis
• Biomimetic oligomers

7.6.6 Key Industry Suppliers

Key industrial suppliers of synthetic peptides in their various forms include the following:
• United Biomedical Inc.
• GL Biochem (Shanghai) Ltd.
• Invitrogen Corporation
• CSBio Inc.
• Biopeptide Co., Inc.
• EZBiolab Inc.

7.6.7 References

1. Nelson, D.L. and M.M. Cox, Principles of Biochemistry. 3 ed. 2000, Hampshire: MacMillan Press.
2. Chai, S.K., et al., Immunogenic properties of multiple antigen peptide systems containing defined T and B epitopes. J Immunol, 1992. 149(7): p. 2385-2390.
3. Huang, H.-Q., et al., Synthesis and bioactivities of two multiple antigen peptides as potential vaccine against schistosoma. Bioorganic & Medicinal Chemistry Letters, 2005. 15(9): p. 2415-2419.
4. Holub, J.M., H. Jang, and K. Kirshenbaum, Clickity-click: highly functionalized peptoid oligomers generated by sequential conjugation reactions on solid-phase support. Org. Biomol. Chem, 2006. 4: p. 1497-1502.
5. Nathan Brown, N.C., James Patch, Shannon Seurynck, Yu Zhang, Biomaterials, 2001.
6. V. Apostolopoulos, J.M., M. Plebanski and T. Mavromoustakos, "Applications of peptide mimetics in Cancer." Current Medicinal Chemistry, 2002. 9: p. 411-420.
7. Goodson, B., et al., Characterization of Novel Antimicrobial Peptoids. Antimicrob. Agents Chemother., 1999. 43(6): p. 1429-1434.
8. Taylor, J. 2004 [cited 28-10-2006]; Available from:
9. Ghosh, S. and D.C. Jackson, Antigenic and immunogenic properties of totally synthetic peptide-based anti-fertility vaccines. Int. Immunol., 1999. 11(7): p. 1103-1110.
10. K Ogasawara, H.N., Y Itoh, T Gotohda, J Arikawa, H Kida, R A Good, and K Onoé, A strategy for making synthetic peptide vaccines. Pub med central, 1992. 19: p. 8995-8999.

7.7 Peptide Nucleic Acids Desai Urvi
7.7.1 Introduction

Fig: PNA having peptide backbone. Source:

DNA and RNA are the two forms of genetic material that have long existed in nature. Scientists have tried to synthesise DNA and RNA in the laboratory, with some success, and have since developed a new form of nucleic acid called peptide nucleic acid (PNA). PNA was first synthesised by Nielsen, Egholm, Berg and Buchardt in 1991. DNA and RNA consist of a sugar-phosphate backbone with nitrogenous bases attached to it. PNA is an artificially synthesised analogue of DNA in which a pseudopeptide replaces the sugar-phosphate backbone. The PNA backbone consists of repeating N-(2-aminoethyl)glycine units linked by peptide bonds, to which the four bases found in DNA (adenine, guanine, thymine and cytosine) are attached. Like a peptide, it has amino and carboxyl termini. By convention the N terminus is written first and is treated as the equivalent of the 5’ end of an oligonucleotide; in the preferred antiparallel orientation, the N terminus of the PNA faces the 3’ end of the complementary single-stranded DNA or RNA. A unique feature of PNA is that it is resistant to both nucleases and proteases. [4]


Fig: comparison of DNA and PNA Source:

PNA has attracted major attention in chemistry and biology because of its attractive chemical, physical and biological properties and its potential as an active component in diagnostic and pharmaceutical applications. PNA behaves like DNA in that it binds to complementary nucleic acid strands. Because the PNA backbone is not a sugar-phosphate chain it is neutral, so there is no electrostatic repulsion between the strands; binding between PNA and DNA is therefore stronger than between DNA and DNA, and specificity is better. PNAs are also resistant to hydrolytic cleavage and so are not easily degraded inside the living cell. PNA recognises DNA and RNA in a sequence-specific manner, obeying the Watson-Crick hydrogen bonding scheme. The hybrids formed by this hydrogen bonding have very high thermal stability and show unique ionic strength effects. PNA also recognises homopurine sequences in duplex DNA and binds to them by strand invasion, forming a stable PNA-DNA-PNA triplex and leaving the displaced DNA strand looped out of the duplex. [2] The melting temperature (Tm) of a normal dA10-dT10 DNA hybrid has been measured as 23°C, while the Tm of a similar dA10-dT10 DNA/PNA hybrid is 86°C [1].

PNA was originally designed to recognise double-stranded DNA. The idea was to make an oligonucleotide analogue that could bind to double-stranded DNA via Hoogsteen base pairing in the major groove. The nitrogenous bases of DNA were therefore retained in PNA, but the sugar backbone was replaced by a pseudopeptide. It was believed that a neutral backbone of this kind would improve the triplex-binding capability of the ligand. A PNA designed in this way mimics a single-stranded nucleic acid. Many structures were considered during the early stages, but once criteria such as water solubility, rigidity and chemical accessibility were taken into account, the structure finally adopted was that of PNA. [3]


7.7.2 Recent Advances

The discovery of PNA is expected to be useful in many ways. PNA can be used as a probe molecule because of its unusual stability and other biochemical properties. It has greater enzymatic and chemical stability than nucleic acids and hybridises very well. Its unique molecular structure also opens up new ways of detection. [5]

PNA is now being used as a gene-targeting drug: PNAs directed towards double-stranded DNA exhibit antigene properties, whereas targeting of single-stranded RNA leads to antisense effects. As a result PNAs have been used as bactericidal antibiotics, for regulating splice-site selection and as telomerase inhibitors. Because PNAs are neutral and carry no charge, they cannot easily be delivered into cells. Several methods have been tried to overcome this problem. One is to incorporate positive charge into the PNA, using lysine and arginine, so that it is transported more easily. Another is to attach ligands such as short peptide sequences, antibodies or steroids for delivery of the PNA. [14]

Antisense PNA targeting was used to identify a critical HIV gene segment within the gag-pol encoding gene. Translation of the HIV gag-pol mRNA was disrupted by antisense targeting, stopping the production of virions from cells chronically infected with HIV. [16]

Light-up probes have been developed using PNA. The asymmetric cyanine dye thiazole orange (TO) is attached to the PNA; when the PNA binds to its target DNA, the dye also binds to the DNA and fluoresces. [17]

7.7.3 Evaluation of Technology

As mentioned earlier, the basic properties of PNA are superior in several respects: specific binding, a neutral backbone, resistance to hydrolytic cleavage and more. The PNA-DNA duplex is considerably more stable than the corresponding DNA-DNA or DNA-RNA duplexes. [3] Strand invasion by PNA generates a D-loop, which is used to study the mechanism of transcription and to induce gene expression. A major advantage is that PNAs are easily synthesised in the laboratory by tBoc or Fmoc chemistry. PNA-DNA/mRNA hybrids are recognised with high efficiency by the enzyme RNase H, which shows that, like DNA, they can act as substrates for cellular enzymes. Improvements are also being made to PNA itself to increase its efficiency. For example, researchers at an Oxford research laboratory have developed nucleo-amino acids derived from proline and a spacer amino acid, which impose conformational constraint on the PNA. The spacer amino acid is at the N terminus and provides selective binding to either RNA or DNA. The chiral PNA so formed can be used in quality control, as it is a single-enantiomer product. The spacer also helps to modify the properties of the PNA, for example by adding charge to the backbone or increasing hydrophobicity. [15]

PNA has low cell membrane permeability, which results in negligible intracellular concentrations. Its poor water solubility also limits its bioavailability. Because PNAs are neutral, they cannot be delivered by conventional cationic formulations such as liposomes and microspheres. To increase uptake despite the low solubility, PNA is used in the form of covalently bonded DNA-PNA hybrids. PNAs are also expensive to produce compared with other artificially synthesised peptides. [14]


7.7.4 Application of Technology

PNA was originally designed as a ligand for the recognition of double-stranded DNA. It was later found that peptide nucleic acids are very stable, highly specific and resistant to hydrolytic cleavage, and as a result they now have a broad spectrum of uses. The ability of PNA to bind specifically to a chosen target is of major interest and is therefore exploited in medical and biotechnological contexts. PNAs open up new scope for development as gene therapeutic agents, diagnostic devices for genetic analysis and molecular tools for nucleic acid manipulation. They have also been used for antisense and antigene therapy in eukaryotic nerve cells and even in rat brain, and the same activity has been shown in E. coli. In the antigene strategy, PNA binds to a complementary sequence on DNA and thereby inhibits transcription of that gene. In the antisense strategy, an analogue is designed that recognises and binds to a complementary sequence in the mRNA, and in this way inhibits translation of that gene. [2]
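The antisense strategy described above depends on writing a PNA whose bases pair with the chosen stretch of mRNA. The following is a minimal sketch of that design step only, assuming simple Watson-Crick pairing and the antiparallel orientation (PNA N terminus facing the 3’ end of the target). The function name antisense_pna and the six-base target are hypothetical and chosen for illustration; real probe design would also consider length, solubility and target accessibility.

```python
# Watson-Crick partners. PNA carries the DNA bases, so A in the target pairs with T.
PAIR = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_pna(target_5_to_3: str) -> str:
    """Return a PNA sequence, written N to C terminus, complementary to the target.

    The target is an mRNA (or DNA) stretch written 5' to 3'. The PNA is given
    for the antiparallel orientation, in which its N terminus faces the 3' end
    of the target.
    """
    target = target_5_to_3.upper()
    unknown = set(target) - set(PAIR)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Reading the target from its 3' end gives the PNA from its N terminus.
    return "".join(PAIR[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical six-base stretch around a start codon.
    print(antisense_pna("AUGGCC"))
```

The same function serves for the antigene case if the DNA strand to be bound is passed in 5’ to 3’.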

Fig: Antigene and Antisense strategy. Source:

Detection of SNPs: PNAs are now used to detect single nucleotide polymorphisms (SNPs), which in turn helps to identify neurodegenerative diseases. In this detection method PNA is used as a probe together with S1 nuclease and an amplifying conjugated polymer (CP). The PNA is fluorescently labelled and hybridises to the DNA at the sequence of interest, and thus helps in the recognition of SNPs. [6]

Detection of Transcription Factors: A photofunctionalised PNA oligomer was designed and used to detect transcription factors. [7]
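The probe-based SNP detection described above rests on the fact that a PNA probe pairs perfectly with one allele and carries a single mismatch against the other. A minimal sketch of that comparison is given below. It only counts mismatched positions and does not model the S1 nuclease or conjugated polymer steps; the function name, the probe and both allele sequences are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def count_mismatches(probe_n_to_c: str, site_5_to_3: str) -> int:
    """Count positions where a PNA probe fails to pair with a DNA site.

    The probe is written N to C terminus, the DNA site 5' to 3'. Binding is
    assumed antiparallel, so the probe is compared with the site read from
    its 3' end.
    """
    if len(probe_n_to_c) != len(site_5_to_3):
        raise ValueError("probe and site must have the same length")
    return sum(
        PAIR[p] != s
        for p, s in zip(probe_n_to_c.upper(), reversed(site_5_to_3.upper()))
    )


if __name__ == "__main__":
    probe = "GGCCAT"         # hypothetical probe, N to C
    wild_type = "ATGGCC"     # hypothetical allele matching the probe
    variant = "ATGACC"       # same site with one base changed
    print(count_mismatches(probe, wild_type), count_mismatches(probe, variant))
```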

Chromosomal Identification: Centromeric PNA probes specific for chromosomes 1, 4, 9, 16, X and Y have been used for chromosomal analysis of oocytes, blastomeres and polar bodies. The identification method used was FISH. [8]

PNA Blocker Probes Enhance Specificity in Probe Assays: Mismatch hybridisation of labelled probes to non-target sequences can be prevented by an unlabelled PNA known as a ‘blocker’. This prevents unwanted hybridisation without reducing the sensitivity of detection. Blockers also improve PCR amplification by providing high signal-to-noise ratios. It was possible to identify a single-base mutation in the K-ras gene at levels of only 1.5 copies per 100 copies of wild-type DNA. [9]

PNA-directed genome rare cutting: An ordinary restriction enzyme can be converted into an infrequent genome cutter by PARC (PNA-assisted rare cleavage), which is based on the ‘Achilles’ heel’ cleavage strategy. A sequence-specific complex of double-stranded genomic DNA and bis-PNA is treated with a DNA methyltransferase (methylase). The methylase modifies every one of its sites except the few that are shielded by bound bis-PNA. The bis-PNA is then removed and the sample is treated with a restriction enzyme that recognises the same sites as the methylase. The enzyme cannot cleave the methylated sites, so it cuts only at the few sites that the bis-PNA protected from methylation. Different bis-PNAs combined with different methylase/restriction enzyme pairs generate a new class of rare genome cutters. [10]

Purification of Nucleic Acids: PNA can be hybridised to nucleic acids for purification in two different ways: a sequence-specific method and a generic method. The sequence-specific method is selective; it requires sequence information on the target and the synthesis of a dedicated PNA that binds to the target DNA. The generic method does not require the sequence of the target DNA and uses a triplex-forming PNA. It is capable of bulk purification. [11]
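The logic of PNA-assisted rare cleavage described above can be sketched in a few lines: every restriction site is methylated and therefore blocked, except those overlapped by a bound bis-PNA. The sketch below is an illustration under simplifying assumptions. Sites are found by exact string matching on one strand, any overlap counts as protection, and the toy sequence and the PNA target AAGGAAGA are invented for the example; GAATTC is the EcoRI recognition sequence.

```python
def find_sites(genome: str, motif: str) -> list[int]:
    """Start positions of every occurrence of motif (overlaps allowed)."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        start = genome.find(motif, start + 1)
    return positions


def parc_cut_sites(genome: str, enzyme_site: str, pna_site: str) -> list[int]:
    """Restriction sites that remain cleavable after the PARC procedure."""
    shielded = [(p, p + len(pna_site)) for p in find_sites(genome, pna_site)]
    cuttable = []
    for pos in find_sites(genome, enzyme_site):
        end = pos + len(enzyme_site)
        # A site escapes methylation only if a bound bis-PNA overlaps it.
        if any(pos < s_end and s_start < end for s_start, s_end in shielded):
            cuttable.append(pos)
    return cuttable


if __name__ == "__main__":
    # Hypothetical 26-base fragment with two EcoRI sites; only the second
    # overlaps the homopurine tract chosen as the bis-PNA target.
    genome = "CCGAATTCGGTTAAGGAAGAATTCCA"
    print(find_sites(genome, "GAATTC"))
    print(parc_cut_sites(genome, "GAATTC", "AAGGAAGA"))
```

Without the PNA, the enzyme would cut at both sites; with it, only the shielded site is cut, which is what makes the cutter rare.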


7.7.5 Relevant Websites

- - - -


7.7.6 Key Industry Suppliers

1) Monomer Sciences Inc. United States. Phone: 256-379-5279. Fax: 256-379-5282 E-mail:

2) Eurogentec North America, San Diego, CA. Phone: 858-793-2661 Fax: 858-793-2666 Contact email: Web site:

3) Bio-Synthesis, Inc., Texas. Phone: 972-420-8505 Fax: 972-420-0442 Email:



7.7.7 References

[1] Peptide Nucleic Acids: Protocols and Applications. Horizon Printing Press. (1999) Access Dt.: 26/10/2006

[2] Ray, A., Nordén, B. Peptide nucleic acid (PNA): its medical and biotechnical applications and promise for the future. Department of Physical Chemistry, Chalmers University of Technology, S 412 96, Gothenburg, Sweden. Access Dt.: 27/10/2006

[3] Peptide Nucleic Acids. Horizon Scientific Press. Access Dt.: 26/10/2006

[4] Matsudaira, P., Coull, J., Peptide Nucleic Acids: A New Nucleic Acid Analog. Whitehead Institute for Biochemical Research, Dept. of Biology, Massachusetts Institute of Technology, Cambridge and Millipore Corporation, Core R&D, Specialty Chemistry Group, Bedford, MA 01730

[5] Brandt, O., Hoheisel, J., Peptide nucleic acids on micro arrays and other 9244 Access Dt.: 27/10/2006

[6] Gaylord, B. S., Massie, M. R., Feinstein, S. C., Bazan, G. C., SNP detection using peptide nucleic acid probes and conjugated polymers: Applications in neurodegenerative disease identification. (2004). Materials Department and Institute for Polymers and Organic Solids and Neuroscience Research Institute, University of California, Santa Barbara, CA 93106. Access Dt.: 27/10/2006

[7] Fujimori, F., Kitagawa, F., Abe, Y., Ohori, Y., Kiyota, R., Nakamura, Y., Ikeda, H. and Murakami, Y., Application of peptide nucleic acid (PNA) for detection of transcription factors binding probes. Department of Biological Science & Technology, Faculty of Industrial Science & Technology, Tokyo University of Science, Yamazaki, Noda-Shi, Chiba, 278-8510, Japan. Access Dt.: 27/10/2006

[8] Paulasova, P., Andréo, B., Diblik, J., Macek, M. and Pellestor, F. The peptide nucleic acids as probes for chromosomal analysis: application to human oocytes, polar bodies and preimplantation embryos. (2004) Laboratory of Assisted Reproduction, Motol Hospital, V Úvalu 84, 150 06 Praha 5, Czech Republic and CNRS UPR 1142, Institut de Génétique Humaine, 141 rue de la Cardonille, F-34396 Montpellier Cedex 5, France. Access Dt.: 27/10/2006

[9] Fiandaca, M. J., Hyldig-Nielsen, J. J. and Coull, J. M. PNA Blocker Probes Enhance Specificity in Probe Assays. (2004) Access Dt.: 28/10/2006

[10] Demidov, V. V., Frank-Kamenetskii, M. D., PNA directed genome rare cutting. Access Dt.: 28/10/2006

[11] Orum, H., Purification of nucleic acids by hybridisation to affinity tagged PNA probes. Access Dt.: 28/10/2006

[12] Fujimori, F., Kitagawa, F., Abe, Y., Nakamura, Y., Ikeda, H. and Murakami, Y., Design of the photosensitized peptide nucleic acids for the analysis of geno-typing. Department of Biological Science & Technology, Faculty of Industrial Science & Technology, Tokyo University of Science, Yamazaki, Noda-Shi, Chiba, 278-8510, Japan. Access Dt.: 28/10/2006

[13] Marin, V. L., Roy, S., Armitage, B. A., Recent advances in the development of peptide nucleic acid as a gene-targeted drug. Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213-3890, USA.

[14] Wang, G., Xu, X. S., Peptide nucleic acid (PNA) binding-mediated gene regulation. Institute of Environmental Health Sciences, Wayne State University, 2727 Second Avenue, Detroit. Access Dt.: 29/10/2006

[15] New Peptide Nucleic Acids, Technology transfer from University of Oxford. Access Dt.: 30/10/2006

[16] Peptide nucleic acids as epigenetic inhibitors of HIV-1. International Journal of Peptide Research and Therapeutics, September 21, 2006. Pages 269-286. Access Dt.: 30/10/2006

[17] Svanvik, N., Westman, G., Wang, D. and Kubista, M. Light-up probes: Thiazole Orange conjugated Peptide Nucleic Acid for detection of Target Nucleic Acid in Homogeneous Solution. Department of Molecular Biology, Lundberg Institute, S-413 90, Göteborg, Sweden. Department of Organic Chemistry, Chalmers University of Technology, S 412 96, Gothenburg, Sweden. Access Dt.: 30/10/2006

Chapter 8.2 Antibody Engineering
Rushabh Gohil
8.2.1 Introduction

Antibodies represent the most diverse and important class of recognition molecules known. The ability of antibodies to recognize target molecules with exquisite affinity and specificity is widely exploited for diagnostic and therapeutic purposes, with over 10 antibody-based drugs currently available to treat conditions ranging from cancer to autoimmunity and organ rejection. In antibody engineering, molecular biology approaches are used to improve the function of antibody molecules by altering their amino acid sequence. The affinity and specificity with which antibodies recognize antigens, their stability in various environmental conditions, their therapeutic efficacy, and their detection in diagnostic applications may thus be enhanced[1].


Fig.8.2.1 Antibody Specificity for Antigens (Source: )

Antibodies are used widely in medicine and science as indicator molecules. Their specific binding properties are used in countless clinical diagnostic tests, and for the identification and quantification of antigens under study in the laboratory, where techniques such as immunoblotting, immunoprecipitation, enzyme-linked immunosorbent assay (ELISA) and radioimmunoassay (RIA) are indispensable. Antibodies are also increasingly used for other applications such as purification of biomolecules (immunoaffinity chromatography), for both diagnostic (imaging) and therapeutic applications in vivo, and even for catalysis of chemical reactions. Different antibody-derived constructs are rapidly advancing as putative tools for treatment of malignant diseases. Antibody engineering has added significant new technologies to modify the size, affinity, solubility, stability and biodistribution properties of immunoconjugates[2].

Underlying Principles for the Production of Monoclonal Antibodies

Antibody engineering is the modification of antibodies to increase their affinity and specificity, so that they are better suited to the protein technologies mentioned above. It is essentially a tool for constructing hybrid forms of the prevalent antibodies (immunoglobulins), known as monoclonal antibodies. Structurally, these monoclonal antibodies carry a variation in their variable or Fab region (i.e. the antigen binding site) that creates specificity for a particular antigen[3].

Fig.8.2.2 A Monoclonal Antibody showing the desired modification in the Fab region to create specificity in its binding site to a particular antigen. (Source: )

Monoclonal antibodies can be produced in specialized cells through a technique now popularly known as hybridoma technology. This technology was developed in 1975 by two scientists, Georges Köhler of West Germany and César Milstein of Argentina, who jointly with Niels Jerne of Denmark were awarded the 1984 Nobel Prize in Physiology or Medicine. The term hybridoma is applied to the cells that result from the fusion of two types of cell: (i) an antibody-producing lymphocyte (e.g. a spleen cell of a mouse immunized with red blood cells from sheep), and (ii) a single myeloma cell (e.g. a bone marrow tumor cell), which is capable of multiplying indefinitely. These fused hybrid cells, or hybridomas, inherit the antibody-producing capability of the lymphocyte and the ability to grow continuously, and are hence referred to as immortal[4].

Antibodies are mass-produced in the laboratory by fusing a mouse myeloma cell with a mouse B cell that makes a specific antibody; the fused cell is called a ‘hybridoma’. The hybridoma cell is an antibody-producing factory, as it combines a B cell that recognizes a particular antigen with a myeloma cell that can divide indefinitely. The antibodies secreted by the clones of a single hybridoma cell are called monoclonal antibodies (MAbs).

The production of monoclonal antibodies using hybridoma technology in the laboratory involves a series of carefully designed steps, each of which is essential to obtain MAbs with the specificity required for their intended purpose. The antigen of interest is repeatedly injected into mice to stimulate production of the specific antibody through proliferation of the desired B cells. The mice then produce the desired sets of lymphocytes in reaction to the antigen. The spleen cells (rich in B cells and T cells) are cultured separately. These spleen cells produce antibodies specific to the antigen that was injected. The myeloma cells to be fused with them are also cultured separately. The myeloma cell line used to make a hybridoma should have two important characteristics. One, it should have stopped synthesizing antibodies; and two, it should be a mutant that cannot synthesize the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT).

Fig.8.2.3 Proposed structure of HGPRT with a proposed transition state inhibitor ImmGP (Source: )

Fusion of spleen cells to myeloma cells is induced, using polyethylene glycol (PEG), to produce the hybridoma. The hybrid cells are grown in selective hypoxanthine aminopterin thymidine (HAT) medium. HAT medium contains the drug aminopterin, which blocks one pathway for nucleotide synthesis, making the cells dependent on another pathway that needs the HGPRT enzyme, which is absent in the myeloma cells. Therefore, myeloma cells that do not fuse with B cells will die. Unfused B cells, although they carry HGPRT, will also die because they lack the property of immortal growth.

Therefore the HAT medium allows the selection of hybridoma cells, which inherit the HGPRT gene from B cells and the property of immortal growth from myeloma cells. The desired hybridoma is selected for cloning and antibody production. This is facilitated by preparing single cell colonies that will grow and can be used for screening of antibody producing hybridomas. Only one in several hundred cell hybrids will produce antibodies of the desired specificity. The selected hybridoma cells are cultured for the production of monoclonal antibodies in large quantities. These hybridoma cells may be frozen for future use and may also be injected in the body of an animal so as to produce antibodies in the body, which can be recovered later from the body fluid.
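The HAT selection described above amounts to a simple two-condition rule: a cell survives only if it has HGPRT (so it can make nucleotides while aminopterin blocks the other pathway) and is immortal. The sketch below writes that rule out for the three cell types present after fusion. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_hgprt: bool   # can use the HGPRT-dependent pathway
    immortal: bool    # can divide indefinitely in culture


def survives_hat(cell: Cell) -> bool:
    # Aminopterin blocks one nucleotide pathway, so growth needs HGPRT;
    # long-term growth in culture also needs the myeloma's immortality.
    return cell.has_hgprt and cell.immortal


POPULATION = [
    Cell("unfused myeloma", has_hgprt=False, immortal=True),
    Cell("unfused B cell", has_hgprt=True, immortal=False),
    Cell("hybridoma", has_hgprt=True, immortal=True),
]


def select(cells: list[Cell]) -> list[str]:
    """Names of the cell types that remain after culture in HAT medium."""
    return [cell.name for cell in cells if survives_hat(cell)]


if __name__ == "__main__":
    print(select(POPULATION))
```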

Fig.8.2.4 Production of Monoclonal Antibodies using Hybridoma Technology (Source: )

8.2.2 Recent Advances

Considerable efforts have been made during the last 10-15 years to improve the yield of monoclonal antibodies using hybridoma technology. Consistent efforts are also being made to develop newer methods for the production of monoclonal antibodies using antibody engineering as a key tool. One such commendable effort in recent times has been the use of phage display to produce monoclonal antibodies using recombinant phages.

Both conventional hybridoma and phage display antibody production exploit the vast diversity of the mammalian antibody repertoire. The fundamental difference is that with hybridoma antibody production this diversity is harnessed by the immortalization of antibody-producing B cells, while with phage display it is the genes that encode antibody variable regions (V-genes) that are immortalized. Thus, the use of phage display avoids the sacrifice of animals such as mice that is needed to create hybridomas. A further major advantage of antibody production by phage display is that in many cases the whole process can be performed in vitro, thereby negating the requirement for target antigens to be immunogenic[5].

There have also been several advances in the process of using hybridomas to create monoclonal antibodies. One is that cell fusion is facilitated through the use of polyethylene glycol (PEG). This reagent assists the fusion of the myeloma and spleen cells and also helps in preserving the hybridomas. A second advance is the use of continuous cell lines as fusion partners for the antibody-producing B cells. Feeder layers, consisting of extra cells that feed newly formed hybridomas, are used for optimal growth and hybridoma production. The most common feeder layers consist of murine peritoneal cells, macrophages derived from mouse, rat or guinea pig, and extra non-immunized spleen cells. Human fibroblasts, human peripheral blood monocytes and thymus cells are also used as feeder cells, but have limitations such as depletion of the nutrients meant for the hybridomas and contamination. For this reason, other sources of hybridoma growth factors (HGF), such as interleukins derived from human cells, are being used in their place[5].

Another major advance in the synthesis of monoclonal antibodies using hybridoma technology is the generation of humanized monoclonal antibodies.
This technique overcomes the main shortcoming of rodent antibodies, which are perceived as foreign by the human immune system and are hence rendered ineffective. The humanized monoclonal antibodies produced recently consist of only the antigen-binding complementarity determining regions (CDRs) from the rodent, combined with human framework regions. Human monoclonal antibodies have also been obtained either by fusing mouse myeloma cells with human lymphocytes (blast cells in peripheral blood lymphocytes) or by immortalizing human cells with Epstein-Barr virus. In another approach, transgenic mice carrying human genes for the V, D, J and C regions have been produced. These can be used to produce human antibodies directly by hyperimmunization[6].
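The humanized antibodies described above keep only the rodent CDRs and take everything else from a human framework. At the sequence level this can be pictured as a splice, sketched below under strong simplifying assumptions: the two variable-region sequences are already aligned and of equal length, and the CDR positions are known. The function name, the placeholder sequences (H for human residues, m for mouse residues) and the CDR ranges are hypothetical; real humanization also requires structural checks of framework residues.

```python
def graft_cdrs(human_v: str, mouse_v: str, cdr_ranges: list[tuple[int, int]]) -> str:
    """Return the human variable region with its CDR stretches replaced by mouse ones.

    Each range is (start, end) with end exclusive, counted on the aligned
    sequences, which must be of equal length.
    """
    if len(human_v) != len(mouse_v):
        raise ValueError("sequences must be aligned to the same length")
    residues = list(human_v)
    for start, end in cdr_ranges:
        # Copy the mouse CDR residues over the human ones at the same positions.
        residues[start:end] = mouse_v[start:end]
    return "".join(residues)


if __name__ == "__main__":
    human = "H" * 12          # placeholder human framework
    mouse = "m" * 12          # placeholder mouse variable region
    print(graft_cdrs(human, mouse, [(2, 4), (7, 10)]))
```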

Fig.8.2.5 Example of a Humanized Antibody made up of Mouse Complementarity Determining Regions embedded in the Human Antibody Framework. (Source: )

In recent years, techniques have been developed in which antibody genes are isolated from the lymphocytes of immunized animals and then cloned and expressed in bacteria. The antibodies produced in bacteria under the control of the cloned genes can be screened for binding to specific antigens. Thus, while hybridoma technology immortalizes antibody-producing cells, gene technology immortalizes antibody-producing genes[7]. Computer graphic techniques are also being used to build specific antigen binding sites in antibodies. Using this approach some designer antibodies of practical value have already been produced. In this strategy, genes are not actually cloned from lymphocytes, but are instead designed from a repertoire of antibody genes available in a collection.

Recent patent applications related to antibodies

Table 8.2.1 Recent patent applications related to antibodies
Patent #: WO 200563817
Subject: Enriching for nucleic acids encoding multimeric antibodies having a biological function; comprises transfecting mammalian cells with polynucleotides containing a library of nucleic acids encoding multimeric antibodies and a vector.
Assignee: Amgen (Thousand Oaks, CA, USA)
Inventor(s): Carter P, Cosman DJ, Martin FH, Shen W, Yan W, Zhou C, Zhou H

Patent #: WO 200563335
Subject: An antibody directed specifically against desLys58-beta2-microglobulin or its fragment, and capable of forming a complex with desLys58-beta2-microglobulin present in a nonimmobilized form, or present in solution.
Assignee: Statens Serum Institute (Copenhagen)
Inventor(s): Corlin DB, Heegaard NHH

Patent #: WO 200563292
Subject: Treating or preventing a bone-related or cartilage-related disease, condition or disorder, or modulating bone mineral density or bone strength in an individual, comprising administering an amount of an antibody or its fragment that specifically binds a polypeptide comprising fully defined 442-amino acid sequences.
Assignee: Amgen (Thousand Oaks, CA, USA)
Inventor(s): Pisegna M, Simonet WS

Patent #: WO 200562955
Subject: An agonist anti-trkC antibody comprising a heavy-chain complementarity determining region (CDR), and/or a light-chain CDR; useful for treating and/or preventing neuropathies including Taxol-induced, cisplatin-induced and pyridoxine-induced sensory neuropathy.
Assignee: Rinat Neuroscience (S. San Francisco, CA, USA)
Inventor(s): Pons J

Patent #: WO 200562893
Subject: Preventing and/or treating Type 1 diabetes mellitus in a prediabetic human subject or in a human subject suffering from the disease, comprising administering an amount of an anti-CD52 antibody, that is, Campath-1H.
Assignee: ILEX Products (San Antonio, TX, USA)
Inventor(s): Arthaud L

Patent #: US 20050152902
Subject: Treating diabetic retinopathy in a patient by administering an amount of an antibody to gamma interferon and/or an amount of an antibody to CD20; antibodies are also useful for treating hyperimmune reactions, including transplant rejection, autoimmune diseases of the eye and ocular disorders incidental or connected with autoimmune diseases.
Assignee: Advanced Biotherapy (Woodland Hills, CA, USA)
Inventor(s): Skurkovich B, Skurkovich S

Patent #: WO 200560999
Subject: Treating autoimmune disease such as rheumatoid arthritis, psoriasis, inflammatory bowel disease, Crohn's disease, ulcerative colitis, eczema, asthma, lupus, atherosclerosis and diabetes, comprising detecting CD20 or CD20-positive B cells in a sample from the patient, and where CD20 or CD20-positive B cells are detected in the sample, administering a CD20 antagonist or antibody to treat the autoimmune disease.
Assignee: Genentech (S. San Francisco, CA, USA)
Inventor(s): Brunetta PG

Patent #: WO 200560641
Subject: Amplifying nucleic acid encoding a portion of an antibody; useful for producing an antibody library and for identifying an antibody having a desired binding specificity. The method enables improved nucleic acid amplification with decreased mispriming and amplification of sequences other than the target sequence.
Assignee: Alexion Pharmaceuticals (Cheshire, CT, USA)
Inventor(s): Bowdish KS, Frederickson S, Lin Y, Maruyama T, Renshaw M

Patent #: US 20050142620
Subject: Detecting the presence or absence of, or determining whether a patient is at risk for developing, a cervical cancer, comprising contacting a test cervical tissue sample with an antibody that binds to a polypeptide and detecting the amount of the antibody that binds to the polypeptide.
Assignee: Corixa (Seattle, WA, USA)
Inventor(s): Bangur CS, Zehentner-Wilkinson BK

Patent #: JP 2005172546
Subject: An immunoassay method useful for detecting antigens such as hepatitis B virus surface antigen. The method enables easy, convenient and simultaneous detection of the antigen and the antibody of the same pathogen.
Assignee: Toa Iyo Denshi (Kobe, Japan)
Inventor(s): Kawade Y

Patent #: US 20050118176
Subject: A novel antibody that is immunoreactive with a mammalian interleukin-4 receptor (IL-4R), chosen from murine IL-4R and human IL-4R; useful for inhibiting binding of IL-4 to IL-4R.
Assignee: Immunex (Seattle, WA, USA)
Inventor(s): Beckmann MP, Cosman DJ, Idzerda R, March CJ, Mosley BA, Park LS





Source: Thomson Scientific Search Service (formerly Derwent). The status of each application is slightly different from country to country. For further details, contact Thomson Scientific, 1725 Duke Street, Suite 250, Alexandria, Virginia 22314, USA. Tel: 1 (800) DERWENT

8.2.3 Evaluation of Antibody Engineering
Although it is touted as one of the most efficient and highly regarded technologies, the use of monoclonal antibodies obtained by antibody engineering has both advantages and disadvantages. Continuous efforts are being made by the protein technology industry to overcome the major shortcomings of this technology[8].
The major advantages of monoclonal antibodies produced through antibody engineering are:
1. Monoclonal antibodies represent single antibody molecules that bind to antigens with the same affinity and promote the same effector functions. This confers homogeneity on the monoclonal antibodies so produced.
2. The product of a single hybridoma reacts with the same epitope on antigens, thus providing specificity to the monoclonal antibody.
3. The immunizing antigen used as the raw material in this technique need not be pure or characterized, and is not needed in large amounts to produce large quantities of the monoclonal antibody.
4. It is possible to select for specific epitope specificities and generate antibodies against a wider range of antigenic determinants.
5. Unlimited quantities of a single, well-defined, mono-specific reagent can be produced.

In spite of these advantages, antibody engineering does have a few disadvantages, which are in the process of being eliminated:
1. Because the antibody is monoclonal, it may not produce the desired biological response, giving an insufficient effector response.
2. Monoclonal antibodies against conformational epitopes on native proteins may lose reactivity with antigens that have been minimally perturbed. This could threaten their specificity for the antigens.
3. Antibodies sometimes display unexpected cross-reactions with unrelated antigens.
4. The production of monoclonal antibodies through antibody engineering requires a very large commitment of time, effort and capital.

8.2.4 Applications of Antibody Engineering
The antibody engineering technology used to produce monoclonal antibodies has a variety of academic, medical and commercial uses. The applications of engineered antibodies can be broadly divided into three classes: diagnosis, immunopurification and immunotherapy.
Diagnosis
Antibodies are used in several diagnostic tests to detect small amounts of drugs, toxins or hormones. One of the major diagnostic applications of antibody engineering is the detection of pregnancy by assaying hormones such as human chorionic gonadotropin (HCG) with monoclonal antibodies. Once monoclonal antibodies for a given substance have been produced, they can be used to detect the presence and quantity of this substance, for instance in a Western blot test (to detect a protein on a membrane) or an immunofluorescence test (to detect a substance in a cell). Another diagnostic use of antibodies is the diagnosis of AIDS by the ELISA test[1]. They are also very useful in immunohistochemistry, which detects antigen in fixed tissue sections. Radioimmunodetection is also widely used. F(ab')2 and Fab fragments are preferred for imaging, because both targeting and blood clearance are rapid. Tumours as small as 0.5 cm, which are missed by other radiological methods, can be detected by radiolabelled antibodies or Fab fragments. ELISA (enzyme-linked immunosorbent assay) using monoclonal antibodies has also been used for cytogenetic analysis in wheat[5].
Immunopurification
Immunopurification involves the separation of one substance from a mixture of very similar molecules. Monoclonal antibodies can be used to purify a substance with techniques called immunoprecipitation, radioimmunoassay and affinity chromatography.
For instance, individual interferons can be purified using monoclonal antibodies, and monoclonal antibodies can be used to inactivate the T-lymphocytes responsible for rejection of organ transplants. Researchers use monoclonal antibodies to identify and trace specific cells or molecules in an organism; for example, developmental biologists at the University of Oregon use monoclonal antibodies to find out which proteins are responsible for cell differentiation in the respiratory system. Antibodies have been used for the purification and quantification of certain molecules present in trace amounts, such as hormones, cyclic nucleotides, polypeptides, enzymes and antigens. Such assays are extremely sensitive: nanomolar or picomolar concentrations can be detected in small volumes (1 ml) of body fluids such as plasma, urine or cerebrospinal fluid[5].
In immunoprecipitation, when the correct conditions are employed, antigen and antibody react to form a precipitate. This precipitation phenomenon has been used to separate and purify enzymes and other antigenic substances using monoclonal antibodies in a variety of ways. One application of immunoprecipitation is immunoelectrophoresis, a highly sensitive method for the detection of enzymes and antigens in a mixture. It involves a combination of electrophoresis and gel diffusion[1].
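To put the nanomolar and picomolar sensitivities mentioned above into perspective, the short calculation below converts a concentration and a sample volume into an amount of analyte. It is only a unit-conversion sketch; the function names are chosen for illustration, and the 1 ml volume is the example volume quoted in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def moles_in_sample(molar_concentration: float, volume_ml: float) -> float:
    """Moles of analyte in a sample, given concentration (mol/l) and volume (ml)."""
    return molar_concentration * volume_ml / 1000.0


def molecules_in_sample(molar_concentration: float, volume_ml: float) -> float:
    """Number of analyte molecules in the same sample."""
    return moles_in_sample(molar_concentration, volume_ml) * AVOGADRO


for label, concentration in (("1 nM", 1e-9), ("1 pM", 1e-12)):
    moles = moles_in_sample(concentration, 1.0)
    molecules = molecules_in_sample(concentration, 1.0)
    print(f"{label} in 1 ml: {moles:.1e} mol, {molecules:.1e} molecules")
```

A 1 nM analyte in 1 ml therefore amounts to one picomole (about 6 x 10^11 molecules), and a 1 pM analyte to one femtomole, which is why these assays suit hormones and other trace components of body fluids.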

In affinity chromatography, the monoclonal antibody is generally immobilized on an insoluble matrix and packed in a column. A protein solution containing the specific enzyme is applied over these immobilized antibodies. The specific enzyme binds to the antibody while other components do not, and thus the specific enzyme is separated. Antigens can also be purified using affinity chromatography with monoclonal antibodies[1, 5].
Immunotherapy
Immunotherapy (also known as biologic therapy) is a treatment that uses certain parts of the immune system to fight diseases. For therapeutic uses, monoclonal antibodies are designed to neutralize the response to one defined antigen while preserving the responses to all other antigens. Several antigens of the T cell receptor complex, including CD3, CD4 and CD8, have been the targets of specific antibodies for therapy. The most widely used monoclonal antibody is OKT3, which has been licensed for clinical use, particularly for the treatment of acute renal allograft rejection. Monoclonal antibodies have also been used to treat patients with malignant leukemic cells, B cell lymphomas, and a variety of allograft rejections after transplantation[1]. Antibodies are used in the radioimmunodetection and radioimmunotherapy of cancer, and some new methods can even target only the cell membranes of cancerous cells. A new cancer drug based on antibody engineering technology is Rituxan, approved by the FDA in November 1997[5]. Various immune-related conditions, such as infectious diseases and autoimmune diseases, are also currently being treated with monoclonal antibodies designed by antibody engineering. Monoclonal antibodies can be used to treat viral diseases, traditionally considered untreatable. In fact, there is some evidence to suggest that antibodies may lead to a cure for AIDS[5].

8.2.5 Relevant Web Sites
1. Thomson Scientific Search Service (2006)
2. Marasco, W.A. (2006) NCFR Centre for Therapeutic Antibody Engineering
3. Ludwig Institute For Cancer Research (2006) Antibody Engineering Home Page

8.2.6 Key Industry Suppliers
1. Abbott Laboratories
The diverse family of pharmaceutical, medical and nutritional products from Abbott Laboratories includes a broad range of specialized medicines; medical diagnostic instruments and tests; minimally invasive surgical devices; a spectrum of nutritional supplements for infants, children and adults; and products for veterinary care.
2. Alexion

Alexion Pharmaceuticals is a leading American biotechnology company that is preparing to launch its first commercial product, eculizumab, in 2007; another product, pexelizumab, is being studied in a large Phase III trial.
3. Biogen IDEC
Biogen IDEC has a strong track record of success in the development, manufacture and commercialization of novel, first-in-class products that significantly improve human healthcare, with products like Rituxan®, Zevalin® and Avonex® already reaching phase IV and many more products in clinical trials.
4. Cambridge Antibody Technology
CAT is committed to the development of human monoclonal antibodies as new treatments for important human diseases to improve patients' lives. HUMIRA® is already approved and marketed in 57 countries worldwide, and nine further human monoclonal antibodies originating from CAT are in clinical trials.
5. Genentech
Genentech manufactures and commercializes multiple protein-based biotherapeutics for serious or life-threatening medical conditions. Herceptin®, Avastin® and Rituxan® are a few of the leading products of this company.
6. Genmab
Genmab A/S is a biotechnology company that creates and develops human antibodies for the treatment of life-threatening and debilitating diseases. Genmab has numerous products in development to treat cancer, infectious disease, rheumatoid arthritis and other inflammatory conditions. Genmab is an international company with operations in Europe and the United States.
7. ImClone Systems
The only patented drug of this company is Erbitux® (Cetuximab). ImClone Systems has launched LORHAN (Longitudinal Oncology Registry of Head and Neck carcinoma), an independent national registry of patients with head and neck cancer. LORHAN is non-drug specific and provides detailed longitudinal treatment data on patients managed in all practice settings.
8. Medarex
Medarex is a biopharmaceutical company focused on the discovery, development and potential commercialization of fully human antibody-based therapeutics to treat life-threatening and debilitating diseases, including cancer, inflammation, autoimmune and infectious diseases. Medarex applies its UltiMAb® technology and its product development and clinical manufacturing experience to generate, support and potentially commercialize a broad range of fully human antibody products for itself and its partners.
9. MedImmune
The company's marketed products include Synagis® (palivizumab), Ethyol (amifostine), FluMist® (Influenza Virus Vaccine Live, Intranasal) and CytoGam® (cytomegalovirus immune globulin intravenous (human)), with additional products in clinical testing.
10. UCB
UCB is a leading global biopharmaceutical company dedicated to the research, development and commercialization of innovative pharmaceutical and biotechnology products in the fields of central nervous system disorders, allergy/respiratory diseases, immune and inflammatory disorders and oncology. UCB focuses on securing a leading position in severe disease categories.

8.2.7 References
1. Goding, J. W. (1983) Monoclonal antibodies: principles and practice: production and application of monoclonal antibodies in cell biology, biochemistry, and immunology, Academic Press, New York.
2. Roitt, I. M. (2001) Roitt's Essential Immunology, 9th edn, Blackwell Science, Oxford.
3. Barrett, J. T. (1976) Basic immunology and its medical application, Mosby, Saint Louis.
4. Kuby, J., Kindt, T. & Osbourne, B. (2000) Kuby Immunology, 4th edn, W H Freeman, New York.
5. Maynard, J. A. & Georgiou, G. (2000) Antibody Engineering, Annual Rev. Biomedical Engineering. 2, 339-376.
6. Hayhurst, A. & Georgiou, G. (2001) High Throughput Antibody Isolation, Curr Opin Chemical Biology. 5, 683-689.
7. Baker, M. (2005) Upping the Ante on Antibodies, Nature Biotechnology. 23, 1065-1072.
8. Stites, D. P., Terr, A. I. & Parslow, T. G. (1997) Medical Immunology, 9th edn, Appleton & Lange, Stamford, Connecticut.

Chapter 8.3 Protein expression systems
Krutika Wikhe

8.3.1 Introduction
Proteins, which are composed of amino acids, form the building blocks of the human body. Their immense importance and role in biological pathways has led to an explosion in protein studies. As researchers started to delve into proteins, their structures and their functions, the need to produce them arose. This in turn led to the search for a biological system that could produce proteins with the integrity, quality and quantity needed to support such studies. This did not turn out to be an easy task: finding such a system, and then optimizing it for a given protein, proved daunting. Those of us who now routinely produce proteins to study their effects, toxicity, functions and structures owe a debt to the protein scientists who did the major part of that work.
Once the first biological expression system, E.coli, had been established, it was thought that the problems associated with protein production were history. But other problems cropped up, such as inclusion body formation, intracellular accumulation of the expressed protein, and the purification and separation of the desired protein from the rest of the cellular mass. This led to a search for better, higher-throughput expression systems, and that research has continued ever since. We have come a long way from using only E.coli, although it is still the preferred system. Over the course of time yeast, mammalian cells, Baculovirus and, most recently, cell-free protein expression systems have been exploited.
8.3.2 Evaluation of the technology and its applications
The expression systems mentioned above are discussed below, along with their pros and cons.
E.coli: The genetics and biochemistry of E.coli are probably the best understood of any known organism. The knowledge gained in studying E.coli biology has been applied to the development of many molecular cloning techniques. Most cloning vectors and methods use E.coli as the preferred host, primarily because of the ease with which it can be grown and manipulated. It is also suitable for expressing proteins because of its rapid doubling time and its ability to grow on a wide range of nutrient media. It also provides numerous transcriptional and translational control elements that can be applied to the expression of foreign genes.
The steps involved in foreign gene expression are:
1) Insertion of the gene into an expression vector, usually a plasmid.
2) Transformation of a suitable E.coli host strain with the plasmid, for example by electroporation.
3) Evaluation of protein stability and expression.
4) Once small-scale experiments have verified expression and stability, large-scale production of the protein in a fermentation system.
5) Purification and characterization of the protein.
The following flow chart shows these steps in diagrammatic form:

Fig 8.3.1 A flow diagram for a typical E.coli based expression system.

The important elements in a typical E.coli expression vector are:
Selectable marker: Expression plasmids should carry a sequence encoding a selectable marker, to ensure maintenance of the vector and to allow identification of transformed cells.
Origin of replication (ori): Replication of the plasmid as an independent extrachromosomal element is controlled by its origin of replication.
Promoters: A controllable transcriptional promoter is helpful in an expression system, so that induction can be controlled and the production of mRNA from the cloned gene directed. The most commonly used is the lac promoter, which normally controls the β-galactosidase gene. An example of a commercially available lac promoter vector is pBluescript[1,2].
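The three vector elements listed above can be treated as a simple checklist. The sketch below is purely illustrative: the dictionary keys, the example vector description and the `missing_elements` helper are assumptions made for this example and do not correspond to any vendor's file format or software.

```python
# Elements named in the text as important for an E.coli expression vector.
REQUIRED_ELEMENTS = ("selectable_marker", "origin_of_replication", "promoter")


def missing_elements(vector: dict) -> list:
    """Return the required elements that are absent or empty in a vector description."""
    return [name for name in REQUIRED_ELEMENTS if not vector.get(name)]


# Hypothetical vector description, for illustration only.
example_vector = {
    "name": "hypothetical-expression-vector",
    "selectable_marker": "ampicillin resistance",
    "origin_of_replication": "ori",
    "promoter": "lac",
}

print(missing_elements(example_vector))
print(missing_elements({"name": "incomplete-vector", "promoter": "lac"}))
```

A construct that fails such a check would be expected either to be lost from the culture (no selectable marker), to fail to replicate (no origin), or to give no controllable expression (no promoter).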

The most common expression strategies in E.coli are direct intracellular expression, secretion and fusion proteins. In the case of intracellular expression, recombinant proteins often form dense insoluble aggregates called inclusion bodies. Expression of cloned gene products as secreted proteins has been developed as an alternative to intracellular expression. This can be achieved by adding an N-terminal signal sequence that can be cleaved afterwards. But even with secretion techniques, the yield of the protein is often low and inconsistent. Fusion proteins were introduced to tackle these problems of inconsistent yield and protein insolubility. Fusion proteins are created by expressing the target protein in frame with a highly expressed protein partner or carrier protein[3]. The most successful fusion systems use maltose binding protein (MBP) and glutathione-S-transferase (GST)[3].
One of the manufacturers of commercial expression systems is Invitrogen, which has developed a new range of systems using Gateway technology[4,5]. The name of the system is Champion™ pET Expression System with Lumio™ Technology. Stratagene has developed the VariFlex™ Bacterial Expression Systems[6].
Although E.coli is the first choice of expression system for most protein researchers, it cannot satisfy every need. For researchers interested in eukaryotic proteins that need to be post-translationally modified, a eukaryotic system must be used. Furthermore, if the protein is expressed in an insoluble state in E.coli, one way to circumvent the problem is to express it in a eukaryotic system.
Yeast: For more than a decade the yeast Saccharomyces cerevisiae has been used extensively for the production of foreign proteins. One reason for choosing Saccharomyces as an expression system is the vast knowledge base about the organism and the presence of eukaryotic post-translational modification pathways. Another positive aspect of yeast as a production system is that it has been approved as a safe organism by the FDA, and hence it can be used for the production of biologically important proteins on a commercial scale[7]. The goals in expressing foreign proteins in S.cerevisiae are the desired yield, the desired post-translational modifications, and secretion into the extracellular medium. Through continuous research in this field, there are now many vectors and host strains available to direct gene expression in S.cerevisiae. A variety of choices is available for the specific elements used to direct expression and secretion, such as the promoter, the signal sequence for secretion, the selectable marker and even the mechanism of replication[7].
In the 1970s, the methylotrophic yeast Pichia pastoris was developed to convert methanol into high-quality protein. By the early 1980s the focus had shifted from Saccharomyces to Pichia as a eukaryotic microbial system for producing large quantities of heterologous proteins of interest to protein researchers[7]. The transformation methods developed for Saccharomyces also work well with Pichia. Either intracellular or secreted protein can be produced with Pichia. Secretion requires the presence of a signal peptide on the expressed protein to target it to the secretory pathway[8]. Currently Invitrogen has the exclusive rights for the distribution of Pichia expression technology[9].

The main reason for using Pichia is that it gives a much higher yield of the desired protein than Saccharomyces. Another reason is that Pichia secretes very low levels of its native proteins, so the secreted recombinant protein can be easily purified from the medium. The major producers of commercial yeast systems are Invitrogen and BD Biosciences. BD Biosciences Clontech offers the YEASTMAKER™ Yeast Transformation System and the YEASTMAKER™ Yeast Plasmid Isolation Kit, as well as many types of yeast media and the MATCHMAKER Yeast Two-Hybrid System[10].
As with all other systems, there are some problematic issues concerning protein production in yeast. Briefly, the following problems arise:
1. Genetic instability of the transformed yeast, particularly during scale-up.
2. Inability to produce toxic proteins.
3. Inefficient secretion of larger (>30 kDa) proteins.
4. Proteolysis of secreted proteins.
Baculovirus: Baculovirus has emerged as a popular system for overproducing recombinant proteins in eukaryotic cells. The main difference between the Baculovirus expression system and that of yeast is that with Baculovirus a helper-independent virus can be used, which can be propagated to high titers in insect cells. This in turn results in high protein production, which is one of the main aims of protein expression. Another positive aspect of Baculovirus is that it has a large genome and hence can accept large inserts of foreign DNA. Finally, because Baculoviruses are non-infectious to humans, they offer a possible advantage when expressing oncogenes or toxic proteins[11]. Currently the most widely used system employs the lytic virus Autographa californica nuclear polyhedrosis virus (AcNPV). The basic methodology of expressing foreign proteins in this system involves the following steps:
1. The gene is first cloned into a transfer vector.
2. The recombinant vector is transfected into insect cells.
3. In a homologous recombination event, the foreign gene is inserted into the viral genome.
4. Recombinant viruses can be identified by DNA hybridization or PCR.
The general advantages and disadvantages of the system are as follows:

Table 8.3.1 Advantages and disadvantages of Baculovirus system. Ref: [8]

Some commercially available Baculovirus systems are BaculoDirect™ from Invitrogen and BD BaculoGold™ from BD Biosciences[12,13]. The following is a diagrammatic representation of the Baculovirus expression system in BaculoGold™:

Fig 8.3.2 Baculovirus expression in BaculoGold™

Mammalian cells: In recent years, mammalian cells have been widely used for the production of recombinant proteins, mainly those requiring post-translational modifications. They also serve as a means of examining aspects of gene replication, transcription and translation. Although a wide range of mammalian cells is available for expression, only a few have emerged as systems of choice for protein production. The desired features of mammalian cell lines are that they should be capable of continuous growth, should grow in suspension, should have a low risk of infection by viruses, and should be easily characterized with respect to morphology and gene copy number[14].

The general procedure for mammalian cell expression is as follows [8]:

Fig 8.3.3 Protocol for mammalian cell expression.
Mammalian cells are generally preferred when expressing proteins for human applications. The following are the typical uses of mammalian expression systems:
1. Verification of cloned gene product.
2. Production and isolation of genes from cDNA libraries.
3. Production of correctly folded and glycosylated proteins.
4. Production of clinically active viral surface antigens and monoclonal antibodies.
Proteins produced in mammalian cells are quality controlled through a process whereby the entry of incompletely folded and unassembled proteins into the secretory pathway is selectively inhibited. Even the use of a human cell line is not perfect, since the transformation required to produce a stable cell line might in turn result in altered glycosylation. Moreover, mammalian expression techniques are time consuming, costly and difficult on a larger scale. Optimal techniques are mostly a compromise between transfection efficiency and the post-treatment viability of the cells, and balancing the two is generally troublesome. Complex nutrient requirements and low product concentrations also mean that the value of the end product must warrant this approach for it to be commercially viable. Major sources of commercially available mammalian expression vectors are Stratagene[15] and BioLabs[16].
Cell-free protein synthesis: Over four decades ago two scientists, Nirenberg and Matthaei, carried out revolutionary studies in cell-free protein synthesis[17]. Since then the field has covered a great deal of ground and has become a valuable tool for understanding how mRNAs are translated into functional polypeptides. Cell-free expression is based on the early demonstration that cell viability is not required for protein synthesis to occur. Translation can occur in a crude lysate from a given organism, which provides the required translational machinery, enzymes and tRNA, together with an exogenously supplied RNA template, amino acids and an energy source. Although any organism can be used to prepare a cell-free protein extract, the most popular ones are E.coli, wheat germ and rabbit reticulocytes[17]. A key objective for cell-free translation systems is to synthesize biologically active proteins. Cell-free systems offer many advantages over traditional cell-based expression methods. Reaction conditions can easily be modified to favor protein folding and to reduce sensitivity to product toxicity. They are also suitable for high-throughput reaction systems because of reduced reaction volumes and process time[17,18].
We have seen that each system has some unique features and qualities which allow it to be used for certain kinds of proteins. Most protein researchers base their choice of expression system on the kind of protein they have to express and on the extent of purification, yield, and structural and functional conformation they desire. The following diagram gives an overall comparison of the expression systems discussed above:

Fig 8.3.4 Comparison of protein expression systems.
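The rules of thumb given in this section can also be written down as a small decision helper. The sketch below is a deliberate oversimplification built only from statements made in the text (E.coli as the first choice, eukaryotic hosts when post-translational modification is needed, yeast being unsuitable for toxic proteins, mammalian cells for human applications); the function name and its parameters are hypothetical, and a real choice would also weigh yield, cost and scale.

```python
def suggest_systems(needs_ptm: bool,
                    toxic_to_host: bool = False,
                    for_human_use: bool = False) -> list:
    """Suggest candidate expression systems from the chapter's rules of thumb."""
    if for_human_use:
        # Mammalian cells are generally preferred for human applications.
        return ["mammalian cells"]
    if needs_ptm:
        # Post-translational modification calls for a eukaryotic host.
        systems = ["yeast", "baculovirus", "mammalian cells"]
        if toxic_to_host:
            # Yeast is listed as unable to produce toxic proteins.
            systems.remove("yeast")
        return systems
    if toxic_to_host:
        # Cell-free and Baculovirus systems are described as tolerating toxic products.
        return ["cell-free", "baculovirus"]
    # Otherwise E.coli remains the first choice.
    return ["E.coli"]


print(suggest_systems(needs_ptm=False))
print(suggest_systems(needs_ptm=True, toxic_to_host=True))
print(suggest_systems(needs_ptm=True, for_human_use=True))
```

The order of the checks encodes a priority: intended human use overrides everything else, and toxicity is considered only after the requirement for post-translational modification has been settled.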

8.3.3 Recent Advances
Increasing interest in protein studies has increased the demand for recombinant protein production. Lawrence Livermore National Laboratory and Onyx Pharmaceuticals, Inc. have successfully produced an automated system for the expression of large numbers of proteins encoded by cDNA clones from IMAGE (Integrated Molecular Analysis of Genomes and their Expression)[19]. Baculovirus is the system of choice in this technique for converting cDNAs to proteins. The system was developed for the analysis of the human proteome. Functional and structural analysis of novel gene products identified by Expressed Sequence Tags (ESTs) is also possible with this system.
A technique called SPEX (Surface Protein Expression), which uses the gram-positive bacterium Streptococcus gordonii, was developed by Myscofski and Hruby[20]. They demonstrated that exogenous DNA sequences can be fused to sequences specifying surface expression, resulting in the production and secretion of heterologous proteins. The advantage of this system is that it can produce proteins on a larger scale by exploiting a natural pathway designed to export proteins of varying size and structure to the outside of the bacterial cell[20]. The expressed protein can either be anchored on the surface or secreted into the medium, thus avoiding the formation of inclusion bodies.
For recombinant proteins to play a role in large industrial applications, a more robust and efficient production technology is required. With this aim in mind, Srinivasan et al.[21] developed a novel prokaryotic system based on the nonpathogenic organism Ralstonia eutropha. This system permits high cell density culture in a defined minimal medium and does not require antibiotics for plasmid stability or IPTG for protein induction. They used strain NCIMB 40124 of R.eutropha in their experiments, and expressed a protein (organophosphohydrolase [OPH] from Pseudomonas diminuta MG) that is difficult to obtain in soluble form from E.coli. They achieved cell densities of 150 g/l, which in turn should give higher protein yields, without any inclusion body formation.
Another innovative method developed to overcome the problems of traditional protein expression systems was the use of an actinomycete[22]. N. Nakashima and T. Tamura[22] tested the nocardioform actinomycete Rhodococcus erythropolis, which grows at temperatures ranging from 4 °C to 35 °C, as an expression host. Expression was controlled using the antibiotic thiostrepton. Many proteins are expressed and produced better by the host cell at lower temperatures. In R.erythropolis, a few proteins were found to be expressed at much higher levels at low temperatures. DNase I, which is expressed as an insoluble protein in E.coli, was expressed as a soluble protein at 4 °C in R.erythropolis. Moreover, proteins derived from cold-adapted organisms require expression at lower temperatures, as they have low stability at higher temperatures[22].

8.3.4 Key industry suppliers
NextGen Sciences has developed the Expressionfactory™, an instrument that automates cloning, expression and purification within a single stage. When combined with its corresponding software it enables the parallel exploration of different constructs, host cells and growth conditions[23]. Another cost-effective approach to high-throughput protein production, mainly for structural analysis purposes, has been provided by PSF (Protein Structure Fabrik in Berlin). It involves the parallel expression and purification of recombinant proteins with a His tag (histidine) or GST tag (glutathione S-transferase) from bacterial expression systems. Invitrogen has presented the Gateway system for E.coli extracts, whereas Roche Applied Science has produced the Rapid Translation System for cell-free expression using E.coli and wheat-germ extracts[23].

8.3.5 References:
[1] Stratagene (2006) Stratagene.
[2] Stratagene (2006).
[3] Esposito, D. and Chatterjee, D.K. (2006) Current Opinion in Biotechnology 17, 353-358.
[4] Invitrogen (2006).
[5] Invitrogen (2006).
[6] Stratagene (2006).
[7] Romanos, M.A., Scorer, C.A. and Clare, J.J. (1992) Yeast 8, 423-488.
[8] Wiley, J. (1997) Current Protocols in Protein Science 5.
[9] Invitrogen (2006).
[10] BD Biosciences (2006) BD Biosciences.
[11] Sridhar, P. et al. (1994) Journal of Biosciences 19, 603-614.
[12] Invitrogen (2006).
[13] BD Biosciences (2006).
[14] Rai, M. and Padh, H. (2001) Current Science 80, 1121-1128.
[15] Stratagene (2006).
[16] BioLabs (2006).
[17] Katzen, F., Chang, G. and Kudlicki, W. (2005) Trends in Biotechnology 23, 150-156.
[18] Stevens, R.C. (2000) Structure 8, R177-R185.
[19] Albala, J.S., K.F., McConnell, I.R., Pak, K.L., Folta, P.A., Karlak, B., Rubinfeld, B., Davies, A.H., Lennon, G.G. and Clark, R. (2001) Journal of Cellular Biochemistry 80, 187-191.
[20] Myscofski, D.M. and Hruby, D.E. (1998) Protein Expression and Purification 14, 409-417.
[21] Srinivasan, S., Barnard, G.C. and Gerngross, T.U. (2002) Appl. Envir. Microbiol. 68, 5925-5932.
[22] Nakashima, N. and Tamura, T. (2004) Biotechnology and Bioengineering 86, 136-148.
[23] Dale, G.E. (2004) Drug Discovery Today 9, 783-784.

Introduction: Proteins are built from 20 standard amino acids, each consisting of a central carbon atom (the alpha-carbon) bound to an amino group (NH2), a carboxyl group (COOH), a hydrogen atom, and one of 20 different R groups. The alpha-carboxyl group of one amino acid is joined to the alpha-amino group of the next by a peptide bond to form chains of amino acid residues (polypeptide chains). Proteins are functional polypeptide chains. The unbranched chain of amino acid residues has direction, beginning at the amino end. The chain of regularly repeating peptide bonds is called the backbone, while the R groups projecting from the backbone are known as side chains. The peptide bonds of the backbone are rigid and planar because of their partial double-bond character.

There is an enormous number of conformations that a protein could adopt, but most proteins fold spontaneously into one particular stable shape. This shape arises because backbone groups and side chains interact with each other, and with water, so that particular conformations have more stabilising interactions than others. It has been shown that a randomly coiled protein is devoid of activity, and that isolated proteins in solution can revert to their original active conformation after denaturing conditions are removed. It was concluded, therefore, that the information needed to refold a protein into its native form must be inherent in the amino acid sequence, and that a protein's sequence specifies its conformation.

Globular proteins fold into a compact globular shape. In contrast, fibrous proteins do not fold into a compact shape, as their function lies in their fibrous nature. The hydrophobic nature of certain amino acids is the main driving force in the adoption of compact structures. The side chains of amino acids can be polar or non-polar. Non-polar residues are hydrophobic and pack together in the interior of the globular protein structure to avoid contact with water, whereas polar residues, and the polar groups in the backbone, are hydrophilic and form hydrogen bonds with each other and with water. These hydrogen bonds contribute a major part of the stability of protein structure, and are formed when a hydrogen atom is shared between a hydrogen donor and a hydrogen acceptor. When hydrogen bonds form between backbone groups in proteins, the alpha-amide group is the donor and the alpha-carbonyl group is the acceptor.

The folding of a protein cannot be a random search through all possible structures, as this would take far longer than the observed time, approximately one second, for typical proteins to fold from a random coil to their folded state. Among the millions of possible folding patterns, proteins take up one working, native structure. Proteins are thought to fold rapidly at first into a structure in which most of the final secondary structure elements have formed and are aligned in roughly the correct way. This open and flexible conformation, called a molten globule, is the starting point for a relatively slow process in which the side chains are repeatedly adjusted to form the correct tertiary structure. This second stage is thought to have a variety of correct pathways to the final conformation. The process can be summarised as local folding, formation of long-range interactions, then local rearrangements to give the final, most stable folded state. This model assumes that hydrophobic residues direct not only the initial folding but also the slower tertiary folding, and it allows rearrangements to be made.

In aqueous environments protein folding is driven by the tendency of hydrophobic residues to be excluded from water. This is only possible because polar residues and backbone groups can interact with water at the protein exterior and with themselves in the non-polar environment of the protein interior. The folded conformation is further stabilised by electrostatic and van der Waals interactions. The folding of proteins destined for non-aqueous environments, such as those spanning the cell membrane, differs because the non-polar residues no longer need to pack into a hydrophobic core. The forces and interactions described above fold a protein into a particular 3D conformation, and this apparently complex structure is in fact governed by a set of relatively simple principles. These principles are explained by splitting the conformation into levels that build on each other to produce the entire protein shape.

To ensure proper folding, cells have evolved a sophisticated and essential machinery of proteins called molecular chaperones that assist the folding of newly made polypeptides. The importance of proper protein folding is underscored by the fact that a number of diseases, including Alzheimer's disease and those involving infectious proteins (prions), result from protein-misfolding events.
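The random-search argument made earlier in this introduction can be made concrete with a rough estimate. The sketch below is illustrative only: the chain length, the number of conformations per residue and the sampling rate are assumed values chosen for the example, not figures taken from this chapter.

```python
import math

def random_search_time_years(n_residues, conformations_per_residue, samples_per_second):
    """Years needed to visit every backbone conformation once by blind search."""
    # Work in log10 so very large counts stay easy to handle.
    log10_conformations = n_residues * math.log10(conformations_per_residue)
    log10_seconds = log10_conformations - math.log10(samples_per_second)
    seconds_per_year = 365.25 * 24 * 3600
    return 10 ** (log10_seconds - math.log10(seconds_per_year))

# Assumed inputs: 100 residues, 3 conformations per residue,
# 1e13 conformations sampled per second.
years = random_search_time_years(100, 3, 1e13)
print(f"Blind search would take about {years:.1e} years")
```

Even with these generous assumptions the search time exceeds the age of the universe by many orders of magnitude, which is why folding on a timescale of about a second implies a directed pathway rather than a random search.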

Molecular chaperones have an essential role in regulating protein conformational states, the process in which transient or stable interactions with client proteins affect their conformation and activity. Chaperones capture unfolded polypeptides, stabilize intermediates, and prevent misfolded species from accumulating in stressed cells. The capacity of the Hsp70 and Hsp90 chaperones to regulate these processes involves a constellation of positive and negative co-chaperones that function in various combinations to interact with chaperones, to release folded proteins, to facilitate the assembly or disassembly of chaperone-containing heteromeric complexes, to confer substrate specificities, and to affect subcellular trafficking.

All the information needed to specify the three-dimensional conformation of a protein is encoded in its amino acid sequence. In vitro, protein folding is studied primarily using small model proteins consisting of relatively few amino acids. These small model proteins can be unfolded with a denaturant, and they spontaneously fold back into their native structure when the denaturant is removed.

Several features distinguish the folding of proteins in vivo from folding in vitro. Firstly, the cytosol is an extremely crowded environment with high macromolecular concentrations. Such macromolecular crowding leads to excluded-volume effects, which strongly affect biochemical rates by increasing protein association constants and thereby increasing intermolecular interactions, including aggregation. Secondly, about one third of newly synthesized proteins must be targeted to an organelle or secreted to an extracellular compartment where their functions are fulfilled. These proteins are targeted as nascent or loosely folded polypeptides to their translocation machinery and fold to their native states after translocation. Thirdly, all proteins are synthesized by the ribosome from the N- to the C-terminus, implying that as long as polypeptide synthesis proceeds, the folding information is incomplete.

Three different models explain how and when a polypeptide chain folds to its native structure in living cells. The first model suggests that folding of a growing polypeptide chain is postponed by chaperone binding until its synthesis is completed, so that folding is initiated only upon release of the protein from the ribosome. The second model suggests that formation of secondary and tertiary structure begins as soon as the polypeptide chain emerges from the ribosomal exit. The third model proposes stepwise folding, in which folding is initially delayed and is allowed to proceed only when sufficient sequence information is available to generate a folded domain.

Recent Advances: Recent advances in genetic engineering, high-powered computing and global optimization have heightened interest in molecular modelling and in predicting native protein folding and docking conformations. The ability to predict these structures would greatly increase our understanding of hereditary and infectious diseases and aid in the interpretation of genomic data. The ability to understand peptide docking would also revolutionize the process of de novo drug design. The goal of these efforts is to correctly predict native protein conformations and the binding interactions of macromolecules. These two problems currently dominate the field of computational chemistry and, through the use of detailed molecular models, they have also greatly influenced research.

Cellular activities are carried out by interactions among many proteins and genes. A set of coordinated interactions involved in a particular cell process is referred to as a pathway, for example the Wnt pathway. Understanding and finding these organized interactions is important. Owing to recent advances in biotechnology, the amount of partial data on interactions, proteins, genes and so on is increasing tremendously. Using this fragmented data to infer knowledge about cellular networks is an important and active research area. For example, microarray expression profile data show which genes are expressed together and at what level. Expression profile data can be used by themselves to understand pathways up to a point. However, if expression profile data are combined with protein interaction data (there are several databases of known interactions), new knowledge can be inferred more effectively. The reason is that proteins acting in a pathway are most probably expressed together (similar expression profiles) and most probably interact with each other. Various methods, from graph-based modelling to association rule mining, can be used to bring together interaction data and expression profiles and infer new knowledge.

Feeling molecular forces through a haptic device (i.e. a force-reflecting robotic device) while visualizing and manipulating molecules in 3D virtual environments will be an invaluable tool for engineers and scientists in almost every field. The haptic feedback will allow finer control of atoms during manipulation and provide a gateway between our world and the "nano" world. The role of force feedback in molecular simulations, with applications to protein-ligand docking, is being investigated. Protein-ligand interactions determine phenomena ranging from sensory perception to enzyme catalysis. Computationally fast models are developed for simulating molecular interactions, and the haptic device is then used during the simulations to guide a ligand into a receptor site while reflecting the forces acting on the ligand to the user in real time. The presence of a haptic interface is expected to accelerate the binding process and reduce the development time involved in scientific analysis.

Proteins need to be flexible in order to perform their cellular functions. For example, most, if not all, biological processes are regulated through the association and dissociation of protein molecules. Thus, elucidating the mechanism governing flexibility is critically important in biology and the health sciences for the ultimate goal of controlling the functions of proteins and designing new proteins (protein engineering) and new drugs. In many proteins, large conformational transitions involve relative movements of almost rigid structural units. The vibrational motions are indicative of these large-amplitude motions. The largest-amplitude motions obtained by normal mode analysis are compared with the conformational changes observed by crystallographers when a substrate (such as a drug) binds. In one such study, a set of proteins with both open and closed conformations (without and with a ligand) is extracted from the Protein Data Bank. These conformations are analyzed and a computational tool is developed to model the proteins' motions. An optimization algorithm is used to find combinations of the modes that give a trajectory between the open and closed conformations.
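The last step above, finding a combination of normal modes that carries the open conformation towards the closed one, can be sketched in a few lines. This is a minimal illustration and not the method used in the study described: it assumes the modes are orthonormal vectors (as eigenvectors of a symmetric Hessian are), in which case the least-squares optimum reduces to a simple projection. The function names and the toy coordinates are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_onto_modes(open_coords, closed_coords, modes):
    """Express the open-to-closed displacement as a combination of modes.

    Coordinates are flat lists (x1, y1, z1, x2, ...); `modes` are assumed
    orthonormal vectors of the same length.
    Returns (coefficients, fraction of squared displacement captured).
    """
    displacement = [c - o for o, c in zip(open_coords, closed_coords)]
    norm_sq = dot(displacement, displacement)
    coefficients = [dot(mode, displacement) for mode in modes]
    captured = sum(c * c for c in coefficients) / norm_sq if norm_sq else 0.0
    return coefficients, captured

def interpolate(open_coords, modes, coefficients, t):
    """Structure at fraction t (0 to 1) along the mode-based path."""
    return [
        o + t * sum(c * mode[i] for c, mode in zip(coefficients, modes))
        for i, o in enumerate(open_coords)
    ]

# Toy example: four coordinates and two unit-vector "modes" (made-up numbers).
open_state = [0.0, 0.0, 1.0, 0.0]
closed_state = [1.0, 0.0, 1.0, 2.0]
toy_modes = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

coeffs, captured = project_onto_modes(open_state, closed_state, toy_modes)
midpoint = interpolate(open_state, toy_modes, coeffs, 0.5)
```

A captured fraction close to 1 would indicate that a few low-frequency modes account for most of the observed conformational change; real applications would use modes computed from the protein structure and a more general optimizer.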
Advantages And Disadvantages: Understanding how proteins fold is not only one of the most interesting theoretical problems in molecular biophysics but also has far-reaching medical and biotechnological consequences. The great advantage of a really simple model is that you can solve it exactly, at least for short chains of amino acids. You can examine every possible folding of every possible sequence, picking out the ones of interest. You can know with certainty which configurations have the most favourable properties. Another advantage of a simple model is that you do not have to be an expert in protein chemistry or molecular dynamics to work with it.

Determining the process by which proteins fold into particular shapes, characteristic of their amino acid sequence, is commonly called the protein folding problem. The protein folding problem also refers to the combinatorial difficulty of enumerating the conformations of a given protein molecule. If each amino acid residue in a 100-residue protein has 6 possible conformations, the protein has 6^100 possible conformations. This calculation does not include side chain conformations, which increase the number of degrees of freedom further. The question is how the protein folds given this large number of possible conformations. These simple calculations motivate the development of new, efficient and accurate search methods.

Solving the folding problem would have enormous implications: drugs could be designed on a computer without a great deal of experimentation, genetic engineering experiments to improve the function of particular proteins would become possible, and simulating protein folding would allow progress in modelling the cell.

Protein folding can go wrong for many reasons. When an egg is boiled, the proteins in the white unfold and misfold into a solid mass of protein that will not refold or redissolve. In a similar way, irreversibly misfolded proteins form the insoluble protein aggregates found in certain tissues that are characteristic of some diseases, such as Alzheimer's disease.

Applications: The main application of protein design is to understand why proteins fold and why they misfold and aggregate.
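Returning to the simple-model argument made under Advantages and Disadvantages, exhaustive enumeration is easy to demonstrate. The text does not say which simple model is meant, so the sketch below assumes one common choice, the two-dimensional HP lattice model, in which each residue is hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square lattice, and each non-bonded H-H contact lowers the energy by one unit. The function name is hypothetical.

```python
def fold_exhaustively(sequence):
    """Enumerate every self-avoiding walk of an HP sequence on a 2D lattice.

    Returns (lowest energy, list of lowest-energy paths, total walks).
    Rotated and mirrored copies of a fold are counted separately.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    result = {"energy": None, "paths": [], "count": 0}

    def energy_of(path):
        index_at = {site: i for i, site in enumerate(path)}
        contacts = 0
        for i, (x, y) in enumerate(path):
            if sequence[i] != "H":
                continue
            for dx, dy in moves:
                j = index_at.get((x + dx, y + dy))
                # j > i + 1 skips chain neighbours and counts each pair once.
                if j is not None and j > i + 1 and sequence[j] == "H":
                    contacts += 1
        return -contacts

    def extend(path, occupied):
        if len(path) == len(sequence):
            result["count"] += 1
            e = energy_of(path)
            if result["energy"] is None or e < result["energy"]:
                result["energy"], result["paths"] = e, [list(path)]
            elif e == result["energy"]:
                result["paths"].append(list(path))
            return
        x, y = path[-1]
        for dx, dy in moves:
            site = (x + dx, y + dy)
            if site not in occupied:
                path.append(site)
                occupied.add(site)
                extend(path, occupied)
                path.pop()
                occupied.remove(site)

    extend([(0, 0)], {(0, 0)})
    return result["energy"], result["paths"], result["count"]

best_energy, best_paths, total = fold_exhaustively("HPPH")
```

The number of walks grows roughly exponentially with chain length, so this approach is practical only for short chains. That is the same combinatorial explosion as the 6^100 (roughly 6.5 × 10^77) conformations estimated for a 100-residue protein.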
Understanding protein folding is important because of its applications in biomedicine (drug design, the cause of mad cow disease, etc.) and nanotechnology (self-assembly of nanomachines). The shape of a protein is the principal determinant of its function. Arbitrary strings of amino acids do not, in general, fold into a well-defined three-dimensional structure, but evolution has selected the proteins used in biological processes for their ability to fold reproducibly into a particular three-dimensional structure within a relatively short time. Some diseases are caused by slight misfoldings of a particular protein. Understanding the mechanisms that cause a string of amino acids to fold into a specific three-dimensional structure is an outstanding scientific challenge. Appropriate use of large-scale biomolecular simulation is expected to shed significant light on this process. The level of performance provided by Blue Gene (sufficient to simulate the folding of a small protein in a year of running time) is expected to enable a tremendous increase in the scale of simulations that can be carried out compared with existing supercomputers.

Immunological therapeutic approaches are currently being developed for a number of conformational neurodegenerative conditions, including Alzheimer's disease (AD) and prion disease. Passive immunization is now in clinical trial for the treatment of AD. In wild-type prion model mice, active and passive immunization can produce resistance to prion infection, with prolongation of the incubation period. Mucosal prion vaccines can prevent infection in a proportion of animals later exposed to the prion agent orally. These studies can be used to develop active vaccines for livestock and eventually for humans who are at risk of developing prion infection.

Mutants of the gonadotropin-releasing hormone receptor are frequently misrouted within the cell but are otherwise competent proteins. "Pharmacoperones" are small molecules that penetrate cells, correct folding and protect such defective molecules, allowing the misfolded mutants to escape retention by the cell's quality control apparatus and route correctly to the plasma membrane. This observation, together with the indication that disease is a frequent corollary of protein misfolding or misrouting, suggests that protein rescue may be an under-appreciated therapeutic approach. The observation that a percentage of wild-type proteins may also be misrouted suggests that this is a novel form of post-translational regulation, associated with normal function, that can also be exploited therapeutically.

Key industrial suppliers (key to engineering designs)
Los Alamos National Laboratory: Protein production can be regulated by transcription factors that bind to specific DNA sites and so regulate the transcription rate of proximal genes. Finding these sites, known as transcription factor binding sites, is fundamental to understanding the regulation of gene expression. Professor Uri Keich in computer science works on this motif-finding problem. The aim is to find the most pronounced motifs in input sequences and to analyse their significance, to determine whether they are artefacts of the size of the data. This work led to innovative significance evaluation of other statistical tests, which is especially important for analysing large datasets. Many neurodegenerative diseases, such as Alzheimer's disease and prion diseases, cannot be definitively diagnosed in a living person. Only autopsy can confirm the disease.
In chemical and biomolecular engineering, Professor Kelvin Lee achieved a proof of concept for using spinal fluid in an objective, molecular-based, ante-mortem test for Alzheimer's disease. Dozens of the 2000 or more proteins in the cerebrospinal fluid of patients are expressed differentially, resulting in identifiable protein "bar codes". These bar codes are useful in diagnosis and may lead to improved treatments.

Proteins carry out the cell's work, acting as catalysts and controllers for numerous chemical reactions and helping to give organs and tissues their shape. Studies of a protein molecule's intricately folded 3D structure are important for medical research, drug development, chemical industrial processes and basic research, because a protein's structure determines its function. Crystals grown from dissolved protein are used to determine protein structure and are therefore valuable in research. A newer technology, the rapid assay of protein folding, uses the folding of green fluorescent protein to monitor the folding and solubility of a test protein. This is done by linking the two proteins in a hybrid molecule. When the hybrid is synthesized in the host cell, the green fluorescent protein becomes fluorescent only if proper folding has taken place. The folding assay may support research into currently incurable protein-misfolding diseases such as Alzheimer's and Huntington's.

References:
Online source, access date 15/09/06.
Online source, access date 24/09/06.
Levinthal, C. (1968) Are there pathways in protein folding? J. Chim. Phys. 65, 44-45.
Online source, access date 10/10/06.
Online source, access date 27/09/06.
Clementi, C., Nymeyer, H. and Onuchic, J.N. (2000) Topological and energetic factors: what determines the structural details of the transition state ensemble and 'on-route' intermediates for protein folding? An investigation for small globular proteins. Journal of Molecular Biology 298, 937-953.
Online source, access date 19/10/06.
Protein folding disorders, online source, access date 22/10/06.
Moody, P.C.E. and Wilkinson, A.J. (1990) Protein Engineering, IRL Press, Oxford.

Chapter 8.5 Protein Arrays and their Applications
Huong Nguyen

8.5.1 Introduction
The transition from genomics to proteomics for studying protein interactions has brought the field of protein arrays into the limelight [1,2]. By allowing the fast, miniaturised, parallel analysis of massive numbers of diagnostic markers in complex samples within a single experiment, high-throughput (HT) protein arrays are promising tools for the discovery and validation of biomarkers, drug screening, diagnostics and clinical assays [3-8]. The technological feasibility of protein arrays depends on the factors that enable the arrayed proteins to recognise molecular partners and on the specificity of the interactions involved [1].

There are two general types of protein arrays. Firstly, analytical arrays utilise antibodies, antibody mimics or other proteins to measure the presence and concentrations of proteins in complex mixtures. Analytical arrays can be subdivided into forward- and reverse-phase protein arrays [9]. In forward-phase arrays the capture molecules (usually antibodies) are immobilised on the substratum, and each array is incubated with a single test sample containing several different target analytes. Reverse-phase arrays consist of multiple, different samples that are immobilised on a chip, so that each spot represents an individual test sample [10]. This format allows multiple samples to be analysed under the same experimental conditions for any given analyte [10,11]. Secondly, functional protein arrays assess a collection of target proteins, or even an entire proteome, for a wide range of interactions and biochemical activities [8,12,13].

Underlying Principles
HT protein array techniques include protein microarrays, antibody microarrays, aptamer microarrays, protein nanoarrays and microfluidic arrays [5]. All arrays are essentially bait-and-capture assays. Bait molecules are immobilised on a substratum as homogeneous or heterogeneous spots. These molecules can be aptamers, antibodies, cell lysates, phage, recombinant proteins or peptides, nucleic acids or tissue. The capture molecule can be a complex biological mixture such as serum or a cell lysate, or an antibody or ligand [10]. After washing and blocking of unreacted surface sites, the immobilised molecules (probe) interrogate the sample applied to the array (analyte). In this way the specific presence or absence (and sometimes the quantity) of targets is uncovered. By scanning the entire array, a large number of binding events are detected in parallel [11] (Fig 1).

Fig. 1 Typical protein array experiment [11]

Proteins can be arrayed either on flat solid phases or in capillary systems. Preferred solid phases are modified glass or filter membranes because of their low fluorescence background. Binding can be covalent or non-covalent. Parameters such as charge, viscosity, membrane pore size, pH, binding capacity and non-specific binding play important roles in the generation of protein arrays [5]. Protein arrays are usually printed (gridded/spotted) and imaged using the same arrayers and scanners as for DNA arrays. Arraying devices are largely pin-based systems that transfer nanolitre amounts of liquid either on the outside of solid pins or inside a split- or ring-shaped reservoir.

Current detection strategies are classified as label-free methods and labelled probe methods. Label-free methods include mass spectrometry (which utilises a protein-selective surface for immobilisation of a complex protein solution) [5] and surface plasmon resonance (SPR) (optical biosensors for monitoring biomolecular interactions) [4]. Imaging has also been based on direct fluorescent labelling of antigens, indirect labelling of antibodies, or sandwich assays using secondary antibodies or specific antibody-binding reagents [5,11] (Fig 2). Different fluorophores allow multiplexing and differential protein expression profiling [5].

Fig. 2 Representation of labelled probe methods used in protein array detection [11]

Ordered protein spots are arranged in either planar macroarray or microarray formats. The format used reflects the relative size and number of spots per square centimetre to be studied. Macroarrays are ideal for the study of dozens of proteins, whereas microarrays are ideal for the large-scale analysis of proteomes [1]. As a result of miniaturisation, microarrays can analyse many parameters in parallel and require only minimal amounts of reagents and sample [14].

8.5.2 Recent Advances
As the potential of protein arrays garners increasing attention, researchers are investing more effort in developing the technology. As a result, many advances have arisen in the last five years in the areas of surface chemistry, capture molecule attachment, protein labelling and detection methods, and HT protein production.

Surface Chemistry
Considerable effort has been made in recent years to improve the immobilisation of proteins on modified glass surfaces. Examples of recent advances include:

Introducing functional groups on the surface by modifying glass with organosilanes such as 3-glycidoxypropyltrimethoxysilane (GOPS) or 3-aminopropyltriethoxysilane (APTES). Organosilanes can directly provide the functional groups for ligand attachment (GOPS) or react with a bifunctional ligand bearing the desired reactive group [11].

A microarray surface has been developed with ProLinker™, a calixcrown derivative with a bifunctional coupling property that permits efficient immobilisation of capture proteins on solid matrices such as gold films or aminated glass slides [11].

Gels such as polyacrylamide or agarose are being used to physically entrap proteins. The 3D structure of these substrates increases the loading capacity and does not disturb the potential functional sites or regulatory domains of the protein. The aqueous environment of the gel also reduces protein denaturation [11].

Some researchers are implementing microfluidic printing of arrayed chemistries on individual protein spots blotted onto membranes. Other researchers are using ink-jet printing technology to create protein microarrays on chips [15].

Capture Molecule Attachment
The selection and production of the capture agents are the most critical points in protein-detecting arrays. Capture agents must be highly specific for the protein of interest and have an affinity sufficient to capture proteins even at very low concentrations [11]. Antibodies are ideal capture molecules. Advances in antibody microarrays have established methods for the selection, production and purification of antibodies with high affinity and reduced cross-reactivity. In addition, antibody microarrays containing more than 200 polyclonal and monoclonal antibodies specific for cell cycle, apoptosis and nuclear signalling proteins are now commercially available. This allows changes in protein abundance in biological samples to be analysed over a very large dynamic range using a two-colour approach [11].

Two array technologies quickly being adopted by researchers are 2D arrays and bead arrays. In 2D arrays the capture agents are arrayed onto planar substrates such as polystyrene, glass or silicon. These arrays utilise fluorescently labelled reporter molecules and are analysed using microarray scanners. In contrast, bead arrays immobilise capture agents onto beads containing an integrated reporter dye. The dye encodes the identity of the capture agent on the bead. These arrays are read using a fluorescent particle counter [12].
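The two-colour approach mentioned above is usually summarised as a ratio of the two fluorescence channels for each capture antibody. The sketch below is an assumed, minimal analysis rather than a vendor's procedure: the antibody names, intensities, background value and the floor used to avoid taking the logarithm of zero are all hypothetical.

```python
import math

def log2_ratios(sample_signal, reference_signal, background=0.0, floor=1.0):
    """Per-spot log2(sample / reference) after background subtraction.

    Both inputs map a capture-antibody name to a raw fluorescence intensity.
    Values at or below background are clamped to `floor`.
    """
    ratios = {}
    for name, sample_value in sample_signal.items():
        sample = max(sample_value - background, floor)
        reference = max(reference_signal[name] - background, floor)
        ratios[name] = math.log2(sample / reference)
    return ratios

# Hypothetical intensities for two spots, one fluorophore per channel.
treated = {"anti-cyclin B1": 4100.0, "anti-p53": 600.0}
control = {"anti-cyclin B1": 1100.0, "anti-p53": 2100.0}

changes = log2_ratios(treated, control, background=100.0)
```

With these made-up numbers a value of +2 corresponds to a four-fold increase in the treated sample and -2 to a four-fold decrease. Working on a log scale treats increases and decreases symmetrically, which is convenient when abundances span a large dynamic range.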
Protein Labelling and Detection Methods
Advances in protein labelling and detection methods include:

Atomic force microscopy (AFM) has been developed for detection at the single-molecule level in protein nanoarrays, which exhibit almost no detectable non-specific protein binding [10]. It is also widely used for surface characterisation in protein microarrays. AFM reveals the change in height of an immobilised protein upon binding to its cognate molecule [11].

Because of their miniaturised format, detection methods developed for microarrays must provide high sensitivity (a high signal-to-noise ratio) and high throughput. The use of signal amplification techniques with chromogenic or fluorescent probes has made it possible to meet these criteria [11].

Surface-enhanced laser desorption ionisation (SELDI) is a reverse screening technology suited to the detection of small proteins and peptides. SELDI technology is easy to handle and suitable for the rapid detection of differences in the total protein content of different samples. This approach is a rapid screening platform for any unknown protein biomarker [9].

Semiconductor quantum dots have been applied as labels. They are brighter and more stable than organic dyes. Researchers have successfully applied such quantum dots to the labelling of proteins on cells and within the cytoplasm and nucleus [16].

HT Protein Production
Manufacturing HT protein arrays requires the efficient production of large numbers of proteins. This calls for highly parallel and automated recombinant expression systems. HT subcloning of human open reading frames has been demonstrated. Recombination-based cloning allows a rapid exchange of vectors and expression systems. A recent development, the pTriEx vector (Novagen, USA), can be used for expression in E. coli, baculovirus-infected insect cells and vertebrate cells [5].

Multidimensional chromatography of cell and tissue lysates or cellular subfractions could become an important alternative to recombinant protein expression. By consecutive chromatography and isoelectric focusing, cellular protein extracts have been directly fractionated, arrayed and detected with antibodies [5].

Cell-free expression has become an alternative to cell-based systems for HT applications. Proteins are expressed in cell-free systems from cDNA templates, which can easily be generated by PCR and stored [16].

The production of proteins from cDNA libraries in E. coli with subsequent purification remains the gold standard. The procedures have been adapted to HT expression in fully automated systems. Purification is mainly based on short affinity tags fused to either the N or C terminus of the recombinant protein and involves immobilised metal affinity chromatography on affinity media. Alternative hosts such as Pichia pastoris and Saccharomyces cerevisiae have been tested for HT protein expression [16].

8.5.3 Evaluation of the Technology
Protein arrays are an evolution of previous technologies such as DNA arrays. Consequently, protein array technologies share some of the advantages and disadvantages of these approaches. Protein arrays also present a number of advantages and disadvantages of their own, owing to the unique physical and chemical characteristics of proteins.
Advantages

The key advantage of protein arrays over other techniques is their capacity to characterise a huge number of ordered protein spots simultaneously, thus replacing numerous individual binding measurements with parallel assays using different probes [1]. Another advantage is that protein arrays, derived from DNA arrays, can be manufactured using technologies adapted from DNA microarray production [17,18]. DNA arrays have been successful in gene expression profiling and mutation mapping. As the focus shifts from genomics to proteomics, protocols are needed to study the activity of the encoded proteins that directly manifest gene function [13]. Protein arrays can accomplish this provided that the proteins’ natural shape and functionality are maintained [5]. Protein arrays also compare favourably with current techniques, which require larger amounts of capture reagents and are less sensitive. For example, ELISA experiments require nanogram or microlitre amounts of capture reagents and display picomolar sensitivity. In contrast, protein microarrays require picogram or picolitre amounts of

capture agents and display femtomolar sensitivity. Protein microarrays also allow 20,000 spots per substrate.

Disadvantages

Protein array technology is not as straightforward as DNA array technology because of the complex structure of proteins [4]. The complexity of proteins has proven to be a bottleneck in the progression of protein arrays, but the disadvantages will lessen as scientists gain a better understanding of proteins. Examples of disadvantages include:
• Proteins demonstrate a staggering variety of chemistries, affinities and specificities. Proteins may require multimerisation, partnership with other proteins or post-translational modification to demonstrate activity or binding [4,5,9,12,13].
• Manufacturing HT protein arrays requires the efficient production of large numbers of proteins. There is no amplification process equivalent to PCR that can generate large quantities of protein [4,5,12,14].
• Expression and purification of proteins is a tedious task and does not guarantee the functional integrity of the protein [19,20].
• Preserving the native characteristics of proteins is essential for downstream analysis. Many proteins are notoriously unstable, which raises concerns about microarray shelf life [12,14].

8.5.4 Applications of the Technology

The application of protein arrays appears to be feasible in areas such as target identification and characterisation, target validation, diagnostic marker identification and validation, pre-clinical study monitoring and patient typing. This variety of applications highlights the potential of the technology. Below are examples of recent major applications of protein arrays.

Autoimmune Profiling

The diversity of the autoimmune response is a great challenge for the development of antigen-specific tolerising therapies. In the area of autoimmune profiling, researchers fabricated arrays containing 196 distinct biomolecules, comprising proteins, peptides, enzyme complexes, ribonucleoprotein complexes, DNA and post-translationally modified antigens. The arrays were tested with sera from eight human autoimmune diseases, including systemic lupus erythematosus and rheumatoid arthritis [16]. To identify new autoantigens that may act as tolerising vaccines, an antigen microarray consisting of 232 proteins identified as potential targets of the autoimmune response was used in chronic experimental autoimmune encephalomyelitis, a mouse model for multiple sclerosis. Such analysis of the immune response can also be applied to other diseases [6].

Target Identification and Validation

Drug companies today must discover new disease targets rapidly and develop drugs that are highly specific for their targets to minimise harmful side-effects [21]. Recombinant protein arrays can potentially assist in both areas. Researchers are investigating whether specific disease markers could be screened against high-density (proteome-wide) and/or low-density recombinant protein arrays to identify proteins that interact with the markers. These interacting proteins could then be investigated further to single out those that show promise as new drug targets for combating the disease of interest [17,22].

Additionally, existing drug candidates could be screened against recombinant protein arrays to measure the specificity of the drug molecules. For example, obesity can result in metabolic syndrome X, type 2 diabetes, cardiovascular disease and other disorders. Scientific studies have recently indicated that peroxisome proliferator-activated nuclear receptor δ (PPARδ) can activate fat metabolism, potentially providing a means of treating or even preventing obesity. PPARδ is usually activated through the binding of a specific ligand or hormone. Screening drug candidates that mimic this specific ligand against a recombinant protein array composed of other closely related nuclear receptors would rapidly identify those mimics that are highly specific for PPARδ and that do not interact significantly with the other nuclear receptors. This application could potentially provide a method for eliminating drug candidates that can cause harmful side-effects through interaction with non-target molecules [17].

Protein arrays have also been applied in identifying new kinases and their substrates. Phosphorylation of proteins by protein kinases plays a central role in regulating cellular processes and may contribute to many diseases, including diabetes, inflammation and cancer. Therefore, kinases are an important class of drug targets.
Different enzyme activities, including those of phosphatases, peroxidases, galactosidases, restriction enzymes and protein kinases, have been analysed on protein, peptide and nanowell microarrays. In one study, a total of 119 known or predicted protein kinases were expressed, purified as GST fusion proteins, arrayed and cross-linked on a protein chip and assayed with 17 different substrates for auto-phosphorylation by treatment with radio-labelled ATP. New activities were found. For example, 27 kinases phosphorylated poly-Glu-Tyr. This indicates that many kinases are capable of phosphorylating tyrosine even if sequence comparison places them in the serine/threonine family [6].

Clinical Applications in Cancer Research

Cancer research is currently one of the largest areas of application for protein arrays. Cancer proteomics encompasses the identification and quantitative analysis of differentially expressed proteins in normal, premalignant and malignant tissues [10,23]. Serum screening was performed in several studies to characterise the serum and plasma of patients suffering from diverse cancers, such as colon, lung or nasopharyngeal cancer. All studies demonstrated the applicability of arrays to this field and led to the identification of known or new potential biomarkers [16]. In another application, mutations and polymorphisms of p53 were functionally characterised with regard to their DNA-binding capacity on protein microarrays. Organ- and disease-specific microarrays have been created using reverse-phase protein arrays, which are produced by immobilising the whole repertoire of patient proteins. Such arrays were then applied to quantify the phosphorylation status of signal proteins and to monitor cancer progression from histologically normal

prostate epithelium to prostate intra-epithelial neoplasia and invasive prostate cancer [16]. Researchers have also immobilised anti-vascular endothelial growth factor (VEGF) antibodies on ProteinChip arrays to analyse the expression of VEGF protein isoforms in lung tumors and normal lung tissue. VEGF plays an important role in the development and metastasis of tumors and is therefore an important target in novel anti-cancer treatments. The lung tumors that were analysed expressed a wide variety of VEGF isoforms, while normal lung tissues expressed only low amounts of the smallest VEGF isoform [10].

Proteomics

Proteomics is promising for the early detection of disease using proteomic patterns of body fluid samples. Proteome analysis may also be important for the individualised selection of therapeutic combinations that best target the entire disease-specific protein network. Investigation of the proteome may also give a real-time assessment of therapeutic efficacy and toxicity. Identifying changes in the diseased protein network associated with drug resistance will make it possible to adjust the therapy [10].

For example, reverse-phase protein arrays have been used to study the fluctuating state of the proteome in minute cell quantities. The activation status of cell signalling pathways controls cellular fate, and deregulation of these pathways can lead to carcinogenesis. Reverse-phase protein arrays have been used to analyse the status of key points in cell signalling involved in prosurvival, mitogenic, apoptotic and growth regulation pathways in the progression from normal prostate epithelium to invasive prostate cancer. Focused analysis of phospho-specific target proteins revealed changes in cellular signalling events through disease progression and between patients. Gene expression alone cannot determine the activation (i.e. phosphorylation) state of in vivo signal pathway checkpoints [9].

Food Safety

Food safety is a top priority for many countries. As a result, great effort has been put into developing technologies to ensure that food products meet safety standards. A particularly interesting application is the xMAP system. This assay uses colour-coded microspheres (beads) to which capture molecules are attached. The beads are analysed by flow cytometry: each type of bead is identified by its fluorescent label, and the quantity of captured target on each bead is measured. This approach has been shown to be capable of performing 100 different assay types simultaneously and is used for the detection of bacterial pathogens in food [17].

8.5.5 Relevant web sites

The following websites are useful resources for learning more about protein arrays:
• News resource for microarray and microfluidic applications
• Extensive resource for both genomic and proteomics-based applications
• Weekly news resource of developments in the microarrays sector
• Commercial supplier (Telechem) with useful web resources

8.5.6 Key Industry Suppliers

Examples of commercially available planar protein arrays:

Company | Product | Application/Kits
Schleicher & Schuell Bioscience | ProVision™ HCA | Cytokine profiling
Zyomyx Inc. | Zyomyx Protein Profiling Biochip System | Cytokine profiling
Pierce Biotechnology Inc. | SearchLight™ Arrays | Cytokine profiling
RayBiotech Inc. | RayBio™ Cytokine Arrays | Cytokine and protein profiling
BD Biosciences | BD Clontech™ Ab Microarray | Comparative protein analysis
Sigma-Aldrich Co. | Panorama™ Ab Microarray Cell Signalling Kit | Comparative protein analysis
Protometrix Inc. | The Yeast ProtoArray™ | Services, protein interaction studies
Molecular Staging Inc. | Rolling Circle amplification technology (RCAT™) | Multiplexed protein profiling
Zeptosens AG | ZeptoMARK™ CeLyA Cell Lysate Arrays | Reverse screening
Ciphergen Biosystems Inc. | SELDI Protein Chip | Reverse screening

Examples of commercially available bead-based systems:

Company/Technology | Applications/Kits
Luminex Corporation | xMAP, Luminex 100™
BD Biosciences | BD™ cytometric bead array (CBA)

8.5.7 References

1. Sakanyan, V. (2005) High-throughput and multiplexed protein array technology: protein-DNA and protein-protein interactions, J Chromatogr B, 815, 77-95.
2. Witte, K. & Nock, S. (2004) Recent applications of protein arrays in target identification and disease monitoring, Drug Discov Today: Technol, vol. 1, no. 1, pp. 35-40.
3. Arenkov, P., Kukhtin, A., Gemmell, A., Voloshchuk, S., Chupeeva, V. & Mirzabekov, A. (2000) Protein microchips: use for immunoassay and enzymatic reactions, Anal Biochem, 278, 123-131.
4. Espina, V., Woodhouse, E., Wulfkuhle, J., Asmussen, H., Petricoin, E. & Liotta, L. (2004) Protein microarray detection strategies: focus on direct detection technologies, J Immunol Methods, 290, 121-133.
5. Walter, G., Bussow, K., Lueking, A. & Glokler, J. (2002) High-throughput protein arrays: prospects for molecular diagnostics, Trends Mol Med, vol. 8, no. 6, pp. 250-253.
6. Lueking, A., Cahill, D. & Mullner, S. (2005) Protein biochips: a new and versatile platform technology for molecular medicine, Drug Discov Today: Targets, vol. 10, no. 11, pp. 789-794.
7. Barton, S. (2005) The promise of biomarkers: research and applications, Drug Discov Today, vol. 10, no. 9, pp. 615-616.
8. Unwin, R., Evans, C. & Whetton, A. (2006) Relative quantification in proteomics: new approaches for biochemistry, Trends Biochem Sci, vol. 31, no. 8, pp. 473-484.
9. Poetz, O., Schwenk, J., Kramer, S., Stoll, D., Templin, M. & Joos, T. (2004) Protein microarrays: catching the proteome, Mech Ageing and Dev, 126, 161-170.
10. Hoeben, A., Landuyt, B., Botrus, G., De Boeck, G., Guetens, G., Highly, M., van Oosterom, A. & de Bruijn, E. (2006) Proteomics in cancer research: methods and application of array-based protein profiling technologies, Anal Chimica Acta, 564, 19-33.
11. Cretich, M., Damin, F., Pirri, G. & Chiari, M. (2006) Protein and peptide arrays: recent trends and new directions, Biomol Eng, 23, 77-88.
12. LaBaer, J. & Ramachandran, N. (2005) Protein microarrays as tools for functional genomics, Curr Opin Chem Biol, 9, 14-19.
13. Zhu, H. & Snyder, M. (2003) Protein chip technology, Curr Opin Chem Biol, 7, 55-63.
14. Stoll, D., Bachmann, J., Templin, M. & Joos, T. (2004) Microarray technology: an increasing variety of screening tools for proteomic research, Drug Discov Today: Targets, vol. 3, no. 1, pp. 24-31.
15. Lopez, M.F. & Pluskal, M.G. (2003) Protein micro- and macroarrays: digitising the proteome, J Chromatogr B, 787, 19-27.
16. Angenendt, P. (2005) Progress in protein and antibody microarray technology, Drug Discov Today: Targets, vol. 10, no. 7, pp. 503-511.
17. Schofield, M., Sharma, N. & Ge, H. (2004) The recombinant protein array: use in target identification and validation, Drug Discov Today: Targets, vol. 3, no. 6, pp. 246-252.
18. Lal, S.P., Christopherson, R.I. & dos Remedios, C.G. (2002) Antibody arrays: an embryonic but rapidly growing technology, Drug Discov Today, vol. 7, no. 18, pp. 143-149.
19. Binder, S., Hixson, C. & Glossenger, J. (2006) Protein arrays and pattern recognition: new tools to assist in the identification and management of autoimmune disease, Autoimm Reviews, 5, 234-241.
20. Venkatasubbarao, S. (2004) Microarrays – status and prospects, Trends Biotechnol, 22, 630-637.
21. Cahill, D.J. (2003) Protein arrays and their role in proteomics, Adv Biochem Eng Biotechnol, 83, 177-187.
22. Jona, G. & Snyder, M. (2003) Recent developments in analytical and functional protein microarrays, Curr Opin Mol Ther, 5, 271-277.
23. Angenendt, P. (2005) Progress in protein and antibody microarray technology, Drug Discov Today, 10, 503-511.

Chapter 9.1 Biomarkers: Discovery and Applications
Aditya Paranjape

9.1.1 Introduction

A biomarker is a characteristic that can be objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes or pharmacological responses to a therapeutic intervention [4]. Biomarkers can be measured in samples such as serum, plasma, urine, CSF, saliva and tissues. Biomarkers provide a dynamic and powerful approach to understanding the wide spectrum of neurological disease, with applications in observational and analytical epidemiology, randomized clinical trials, screening, diagnosis and prognosis. These markers offer the means for homogeneous classification of diseases and risk factors, and thus they can extend our base of information about the underlying pathogenesis of diseases. Markers are also able to reflect the entire course of a disease from the earliest manifestation to the terminal stages. Careful assessment of the validity of markers is required with respect to the stage of the disease.

In practice, biomarkers include tools and technologies that can aid in understanding the prediction, cause, diagnosis, progression, regression or outcome of treatment of disease. Biomarkers are becoming widely used in many aspects of drug discovery and drug development, such as dosing, efficacy and as surrogates for clinical outcome [6]. They are now also being used in drug development to curtail expensive trials and to make critical decisions about whether a drug should be continued beyond the clinical stage of development. Additionally, biomarkers are increasingly being used for patient selection with respect to responses to targeted therapies, in cancer research and in research on various disease states.

Biomarkers have been classified by Perera and Weinstein based on the sequence of events from exposure to disease. Though biomarkers readily lend themselves to epidemiological investigations, they are also useful in the investigation of the natural history and prognosis of a disease.
In addition to delineating the events between exposure and disease, biomarkers have the potential to identify the earliest events in the natural history, reducing the degree of misclassification of both disease and exposure, opening a window to potential mechanisms related to the disease pathogenesis, accounting for some of the variability and effect modification of risk prediction. The recent interest in biomarker discovery is because of new molecular biologic techniques that promise to find relevant markers rapidly, without detailed insight into mechanisms of disease [4]. By screening many possible biomolecules at

a time, a parallel approach can be attempted. Genomics and proteomics are among the technologies used in this process. There is considerable interest in biomarker discovery from the pharmaceutical industry. Blood tests or other biomarkers could serve as intermediate markers of disease in clinical trials, and could also be possible drug targets.

Fig. 9.1.1 Disease pathway and potential impact of biomarkers (Journal for American Society for Experimental Neuro Therapeutics, Inc.)

Biomarkers can also be substances used as indicators of a biological state. A biomarker can be any kind of molecule indicating the existence (past or present) of living organisms. In particular, in the fields of geology and astrobiology, biomarkers are also known as biosignatures. The term is also used to describe biological involvement in the generation of petroleum.

The study of biomarkers, such as chromosomal abnormalities that may predict how a person’s disease will progress or respond to treatment, falls under the category of “chemoprevention”. Scientists hope that these studies will ultimately aid early detection and screening, which could reduce the number of bladder cancer-specific deaths [2]. The biomarker must be expressed differently in normal and high-risk tissue, with clear evidence of progression from normal tissue to biomarker to cancer and, ideally, should appear early in carcinogenesis. If biomarkers prove to be a tool for achieving successful preventive intervention, this would support the possibility that a preventive agent that can reverse the molecular events (or suppress their consequences) for one tumor site may be effective in preventing a variety of tumors [7].

The development and validation of biomarkers are important to the success of testing chemopreventive agents. Examples of biomarkers include genetic markers (eg, nuclear aberrations [such as micronuclei], gene amplification, and mutation), cellular markers (eg, differentiation markers and measures of proliferation, such as thymidine labeling index), histologic markers (eg, premalignant lesions, such as

leukoplakia and colonic polyps), and biochemical and pharmacologic markers (eg, ornithine decarboxylase activity).

At the Clinical Biomarkers Summit in March 2006, it was noted that, as the field matures, biomarkers are making their way into clinical trials. Faced with a relative lack of experience in implementing biomarkers in clinical trials, many researchers and clinicians were facing similar challenges in modifying trial design and defining the right control population, validating biomarker assays from the biological and analytical perspectives, and using biomarker data as a guideline for decision making. The Clinical Biomarkers Summit also addressed biomarker translation from pre-clinical to clinical studies and a variety of biomarker applications in clinical trials, including patient selection, monitoring of clinical efficacy and safety, and clinical pharmacology. The Summit also took note of the bridging of the gap between the pharmaceutical and diagnostics industries and the potential of companion diagnostics. Specific case studies of using biomarkers to accelerate and streamline clinical trials were intended to offer a realistic status report. The Clinical Biomarkers Summit, built on the three-year track record of Cambridge Healthtech Institute’s Biomarker Series, was the first meeting in the Series to focus exclusively on clinical applications of biomarkers.

At the Biomarker World Congress 2005 in Pennsylvania, over 500 thought leaders from more than 260 organizations, representing 20 countries, gathered to discuss biomarker implementation in drug and diagnostic development. A year later, the Biomarker World Congress 2006, the largest meeting of its kind, was dedicated to all areas of biomarker research spanning the pharmaceutical and diagnostic pipeline.
The meeting brought together an international mix of large and medium pharmaceutical, biotech and diagnostics companies, leading universities and clinical research institutions, government and national labs, CROs, emerging companies and tool providers, making the Congress a meeting place to share experience, foster collaborations across industry and academia, and evaluate emerging technologies. The Congress also offered a balance of scientific sessions covering the latest research, strategic presentations and brainstorming sessions for decision makers.

9.1.2 Recent Advances

Since the start of the 21st century there has been a great deal of biomarker research worldwide; three prominent examples are noted below. Many recent investigations have been conducted to determine whether new biological markers will help predict disease progression, and potential clinical applications of these tumor markers are under active investigation. Recent attention has focused on which tumor markers may predict the responsiveness of a particular bladder cancer to systemic chemotherapy. Some of these new predictive and prognostic markers include DNA ploidy, S-phase, p53, p21, the retinoblastoma (Rb) gene, MDR-1, bcl-2, cell adhesion molecules, blood group antigens, tumor-associated antigens, proliferating antigens, oncogenes, peptide growth factors and their receptors, tumor angiogenesis and angiogenesis inhibitors, and cell cycle regulatory proteins. Beta human chorionic gonadotropin (ß-hCG), carcinoembryonic antigen, CA-125, CA 19-9 and others have been evaluated and shown to correlate with clinical response to chemotherapy. G-actin and Ki67 have indicated response to BCG and radiation.

The most discussed chromosomal biomarker of all is p53, a tumor suppressor protein encoded by a gene on chromosome 17p. Between 2003 and 2005, over 4,00 articles on this subject were published. A wealth of studies seems to confirm that the p53 protein, if mutated and overexpressed in cancerous cells, is an indication of a potentially aggressive condition. It has been shown that both the grade and the stage of (invasive) bladder cancer are related to p53 alterations. Mutations in the p53 gene can be detected in tissue sections by immunohistochemistry. Since the wild-type p53 protein has a short half-life, immunostaining of normal urothelium with p53 monoclonal antibodies is negative. When mutations in the p53 gene occur, the mutated proteins aggregate in tetrameric and pentameric macromolecules with a longer life. The result is an accumulation of p53 protein that gives a positive immunostaining reaction. The reaction is observed in the nuclei of tumor cells affected by these events.

The function of p53 is critical to the way that many cancer treatments kill cells, since radiotherapy and chemotherapy act in part by triggering cell suicide in response to DNA damage [11]. This response to therapy is greatly reduced in tumors where p53 is mutant, so these tumors are often particularly difficult to treat. It is hoped that a better understanding of p53 can guide the development of new treatments for cancer. Scientists are beginning to unravel some of the mysteries and, in the test tube at least, are beginning to find ways to make these damaged p53 proteins work again. Such discoveries could potentially offer a powerful and selective new way of treating cancer.

Another biomarker is p21: studies are showing that people with p21-positive tumors have a decreased probability of tumor recurrence.
An article in the Journal of the National Cancer Institute (July 15, 1998) discusses a multi-centre, randomized clinical trial, one of the first of its kind, that uses the p53 status of tumor cells and other molecular markers such as p21 to guide treatment decisions in bladder cancer patients [6]. The research team from the Norris Comprehensive Cancer Center conducted a study of 242 patients with locally confined bladder tumors who were followed for an average of 8.5 years. The p21 protein and its interaction with the p53 protein were analysed. The results indicated that patients with p21-positive tumors survived disease-free significantly longer than patients with p21-negative tumors. Furthermore, it was shown that the way the p21 and p53 proteins interact with each other can give a very good indication of which patients must be considered at high risk of recurrence. The article also stated that p53 is known to be a primary regulator of p21, since genetic changes in p53 may lead to loss of p21 expression and function. This in turn leads to unregulated cell growth and is thought to contribute to the aggressive behavior of some tumors. According to the researchers, the study appeared to confirm their hypothesis. Patients with p53-altered/p21-negative tumors demonstrated a higher rate of recurrence and worse survival than those with p53-altered/p21-positive tumors. Patients with p53-altered/p21-positive tumors demonstrated a rate of recurrence and survival similar to those with p53-wild-type tumors.
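The pattern reported above can be written down as a small lookup. The following Python sketch is purely illustrative and is not a clinical tool: the names `TumorMarkers` and `recurrence_risk_group` and the group labels are assumptions introduced here, and the function encodes only the qualitative comparisons stated in the text (p53-altered/p21-negative worse; p53-altered/p21-positive similar to p53-wild-type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TumorMarkers:
    """Hypothetical record of the two marker results for one tumor."""
    p53_altered: bool
    p21_positive: bool


def recurrence_risk_group(markers: TumorMarkers) -> str:
    """Return a qualitative label following the comparisons in the text.

    The labels are illustrative assumptions, not values from the study.
    """
    if not markers.p53_altered:
        # p53-wild-type tumors are the comparison group in the text.
        return "reference (p53 wild type)"
    if markers.p21_positive:
        # p53-altered/p21-positive: recurrence and survival similar to wild type.
        return "similar to reference"
    # p53-altered/p21-negative: higher recurrence and worse survival.
    return "higher risk of recurrence"


if __name__ == "__main__":
    for p53_altered in (False, True):
        for p21_positive in (False, True):
            m = TumorMarkers(p53_altered, p21_positive)
            print(m, "->", recurrence_risk_group(m))
```

Note that the sketch ignores p21 status for p53-wild-type tumors, because the passage gives no separate comparison for that combination.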

At the Centre for Translational Cancer Research (CTCR), efforts focus on three types of biomarker discovery: protein biomarkers in tissues and cells; protein biomarkers in bodily fluids (blood, urine and other fluids); and transcriptome (RNA) biomarkers. Together, these approaches facilitate translational cancer research by providing the clinician with new tools for the enhanced diagnosis, follow-up and screening of cancer patients.

Tissue Proteins: Cell and tissue analysis of cancer specimens allows researchers to assess the exact site and expression level of protein cancer biomarkers in architecturally preserved or homogenized tissues. Analysis can include histomorphological examination in collaboration with a pathologist, immunohistochemical staining with specific antibodies on cell and whole-mount tissue specimens, and Western blot analysis of cell or tissue homogenates.

(Picture showing tissue specimen)

Fluid Proteins: The emerging field of proteomics provides new tools for the early detection of cancer from human serum, cerebrospinal fluid, urine and other complex samples. Proteomic research provides information about the proteome’s dynamic and rapid changes, which result from exogenous exposure or endogenous factors. The CTCR Core offers proteomic analysis based on the patented Surface Enhanced Laser Desorption/Ionization (SELDI) technology. Assays using SELDI time-of-flight mass spectrometry (TOF-MS) provide a means of identifying new candidate biomarker proteins because of their ability to detect and quantify multiple post-translationally modified and processed protein forms in a single assay.

RNA Molecules: New methods to analyze the gene expression, or “transcriptome”, of cancer cells for comparison with the patterns of normal cells provide a powerful new means of identifying RNA biomarkers for specific cancers. Several research projects are underway using the core facilities available to CTCR researchers. These projects seek to analyze total expressed RNA using microarrays, to study microRNAs that function as regulators of protein expression, to quantify single genes using Q-PCR, and to develop new methods to analyze RNA expression in archived cancer tissue specimens.

The mapping of the human genome is leading to the discovery of new biomarkers. Currently, scientists are striving to find proteins (biomarkers) that are specific to various disorders.
More and more biomarkers are being identified with the help of sophisticated enabling instruments and technologies such as mass spectrometers and protein microarrays. These novel biomarkers are likely to aid researchers in developing precise clinical diagnostics and drugs that are capable of detecting and curing fatal diseases. The early detection and diagnosis of fatal diseases – particularly cancer, Alzheimer’s, and Parkinson’s – enhances the prospects of curing the affected patients. This would encourage physicians to recommend patients to undertake biomarker-based screening tests that can predict these deadly diseases in the early stages. These tests are also likely to help physicians in prescribing personalized medication to patients after pinpointing the particular pathway that is playing an active role in the disease progression. This would not only lead to reduction in side-effects but also ensure that patients are administered the right medication.


9.1.3 Evaluation of Technology

There are various approaches to producing biomarkers. It is essential to select the most rewarding and least time-consuming procedure, the one that yields the best-suited biomarker with the most capabilities.

Capabilities of biomarkers:
• Delineation of events between exposure and disease
• Establishment of dose-response
• Identification of early events in the natural history
• Identification of mechanisms by which exposure and disease are related
• Reduction in misclassification of exposures or risk factors and disease
• Establishment of variability and effect modification
• Enhanced individual and group risk assessments

TABLE 9.1.3 Summary of the advantages and disadvantages of biomarkers

Advantages | Disadvantages
Objective assessment | Timing is critical
Precision of measurement | Expensive (costs for analyses)
Reliable; validity can be established | Storage (longevity of samples)
Less biased than questionnaires | Laboratory errors
Disease mechanisms often studied | Normal range difficult to establish
Homogeneity of risk or disease | Ethical responsibility

[The American Society for Experimental NeuroTherapeutics, Inc. NeuroRx. 2004 April; 1(2): 182–188]

9.1.4 Applications of Biomarkers

Advances in biomarker discovery have led to the use of markers in most therapeutic treatments and clinical trials. Their applications are therefore many; a few are listed below according to their success rate.

As Cancer Markers

The increase in the use of tumor markers as screening and diagnostic tools has generated much hope for the identification of broadly reacting or pan-cancer antibodies. Upstate/Chemicon provides a large variety of well-characterized cancer antibodies for the detection of breast cancer, chronic lymphatic leukemia (CLL) and other cancers.

As Neurodegenerative Markers

Chemicon’s line of neurological antibodies has well-characterized neurodegenerative antibodies and assays for researching Alzheimer’s, Fragile X Syndrome, Parkinson’s & Huntington’s and other related biomarkers. As Metabolic Markers Biomarkers of metabolic processes are emerging as key tools in diagnosing dysfunction and disease states. This is becoming increasing important due to the near epidemic surge in metabolic syndrome and resultant cardiovascular disease and Type 2 diabetes. As Drug Response Markers Inflammation, oxidative stress and reactive species are continuously being monitored as drug response markers. Chemicon now has available a monoclonal antibody to RAGE, a member of the immunoglobulin super family of cell surface molecules that bind molecules that have been irreversibly modified by non enzymatic glycation and oxidation. Biomarkers are also verified in multiple applications like, • Immuno histochemistry • Western Blotting • Immuno precipitation • Flow cytometric Analysis • ELISA 9.1.5 Relevant Web sites 1. m (University of Virginia Health System) 2. (Centre for Translational Cancer Research) 3. (Frost and Sullivan Research Services) 4. (MDS Pharma Services) 9.1.6 5. Key Industry Suppliers

The shift in focus from discovering genomic biomarkers to protein biomarkers is driving demand for robust research instruments that enable multiplexing and reduce manual steps such as sample preparation. Technological developments in automation and throughput are enabling researchers to solve the riddle of the human proteome, which is much more complex than the human genome. Companies that manufacture protein chips containing high-density arrays of functional proteins, and microfluidics-based platforms, are likely to benefit from these novel research efforts aimed at discovering new biomarkers.

The companies most active in biomarker discovery are described below.

Chemicon/Upstate is responding to this growing market need by offering high-quality, well-characterized antibodies, many of which have been validated in a variety of applications. The Chemicon/Upstate partnership has brought together a large line of biomarkers for various disease states such as specific cancers and neurodegenerative, cardiovascular and metabolic diseases. Together, Chemicon/Upstate gives researchers and drug development companies access to assays and platform technologies for biomarker discovery and drug development. According to the company, its antibodies, kits and reagents are developed and manufactured under GMP conditions and provide consistent quality for research, drug discovery and drug development, with many well-characterized antibodies available against numerous disease and disease-related targets.

Millipore Serological Corporation, Life Sciences is the parent company of most of the biomarker-producing companies. It has acquired stakes in most of the companies listed below, which mostly work in collaboration with Millipore:
1. LINCO
2. Celliance
3. MDS Pharma Services
4. Frost & Sullivan Research and Partnership Services (GPS)
5. Biomarker Pharmaceuticals

Many studies using biomarkers are under way in the industries mentioned above, but they often fail to achieve their full potential because they do not adhere to the same rules that would apply to variables that are not biological. The development of any biomarker should precede, or proceed in parallel with, the standard design of any epidemiological project or clinical trial. In forming the laboratory component, pilot studies must be completed to determine accuracy, reliability, interpretability, and feasibility.
The investigator must establish “normal” distributions by important variables such as age and gender. The investigator will also want to establish the extent of intra-individual variation, tissue localization, and persistence of the biomarker. Moreover, he or she will need to determine the extent of inter-individual variation attributable to acquired or genetic susceptibility. Most, if not all, of these issues can be resolved in pilot studies preceding the formal investigation.
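The pilot-study quantities named above ("normal" distributions by age and gender, intra-individual and inter-individual variation) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the age bands, the mean ± 1.96 SD reference range (which presumes roughly normally distributed values) and the pilot readings are all hypothetical and do not come from any cited study.

```python
from collections import defaultdict
from statistics import mean, stdev


def reference_ranges(records, k=1.96):
    """Mean +/- k*SD per (gender, age band); records are (gender, age_band, value).

    Assumes values within each stratum are roughly normally distributed.
    """
    groups = defaultdict(list)
    for gender, age_band, value in records:
        groups[(gender, age_band)].append(value)
    ranges = {}
    for key, values in groups.items():
        if len(values) >= 2:  # stdev needs at least two readings
            m, s = mean(values), stdev(values)
            ranges[key] = (m - k * s, m + k * s)
    return ranges


def variation(repeats):
    """repeats maps subject id -> repeated readings of the same biomarker.

    Returns (mean intra-individual CV in %, inter-individual CV in %).
    """
    # Intra-individual: spread of repeated readings within each subject.
    intra = [stdev(v) / mean(v) * 100 for v in repeats.values()]
    # Inter-individual: spread of the per-subject means across the group.
    subject_means = [mean(v) for v in repeats.values()]
    inter = stdev(subject_means) / mean(subject_means) * 100
    return mean(intra), inter


# Hypothetical pilot readings (arbitrary units), for illustration only.
pilot = {"subject_1": [9.0, 10.0, 11.0], "subject_2": [19.0, 20.0, 21.0]}
intra_cv, inter_cv = variation(pilot)
print(f"intra-individual CV: {intra_cv:.1f}%  inter-individual CV: {inter_cv:.1f}%")
```

A biomarker whose intra-individual variation approaches its inter-individual variation would be hard to interpret from a single sample, which is one reason such checks belong in the pilot stage.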



1. Galasko D. New approaches to diagnose and treat Alzheimer’s disease: a glimpse of the future. Clinical Geriatr Med 17: 393–410, 2001.
2. Gordis L. Epidemiology and public policy. In: Epidemiology (Gordis L, ed), pp 247–256. Philadelphia: W.B. Saunders, 1996.
3. Hulka BS. Overview of biological markers. In: Biological markers in epidemiology (Hulka BS, Griffith JD, Wilcosky TC, eds), pp 3–15. New York: Oxford University Press, 1990.
4. Ilyin et al. Trends in Biotechnology 22: 411–416, 2004.
5. Merikangas K. Genetic epidemiology: bringing genetics to the population-the NAPE Lecture 2001. Acta Psychiatr Scand 105: 3–13, 2002.
6. Naylor S. Biomarkers: current perspectives and future prospects. Expert Rev Mol Diagn 3: 525–529, 2003.
7. Perera FP, Weinstein IB. Molecular epidemiology: recent advances and future directions. Carcinogenesis 21: 517–524, 2000.
8. Reiber H, Peter JB. Cerebrospinal fluid analysis: disease-related data patterns and evaluation programs. Journal Neurol Science 184: 101–122, 2001.
9. Rohlff C. Proteomics in neuropsychiatric disorders. Int J Neuropsychopharmacol 4: 93–102, 2001.
10. Schulte PA. A conceptual and historical framework for molecular epidemiology. In: Molecular epidemiology: principles and practices (Schulte PA, Perera FP, eds), pp 3–44. San Diego: Academic Press, 1993.
11. Verbeek MM, De Jong D, Kremer HP. Brain-specific proteins in cerebrospinal fluid for the diagnosis of neurodegenerative diseases. Ann Clin Biochem 40: 25–40, 2003.


9.2.1 Introduction
SELDI-MS (Surface Enhanced Laser Desorption Ionization - Mass Spectrometry) is a powerful technique that combines chromatography and mass spectrometry. It was invented by Tai-Tung Yip and T. William Hutchens in the early 1990s at the Baylor College of Medicine, and subsequently gained popularity as a powerful tool for protein analysis. [1] SELDI encompasses two major subsets of MS (Mass Spectrometry) technology:
1. SEND: Surface Enhanced Neat Desorption
2. SEAC: Surface Enhanced Affinity Capture
The underlying principle in SELDI is surface-enhanced affinity capture through the use of specific probe surfaces. Once captured on the SELDI protein chip array, proteins are detected by TOF MS.

Figure: 1 Schematic diagram comparing the configuration of PBS-II TOF MS.

Source: [2]

“SELDI employs aluminum chips 1-2 mm in diameter, spotted with protein capture bait such as a chemical affinity resin, small molecule, antibody, DNA or enzyme. Users apply a crude sample to the chip, allow proteins with affinities to capture molecules to bind to the surfaces and then wash away any impurities or loosely bound proteins. Analytes are laser desorbed directly from the chip, ionized and analysed by mass spectroscopy. A SELDI experiment produces a mass spectral fingerprint that can distinguish differences in protein expression levels between diseased and normal samples.” Giannoula Klement, Pediatric Oncologist, Dana Farber Cancer Institute and Children’s Hospital, Boston, says, “The main reason why I became really interested in this technology is that for a biotechnologist, it is often quite important to analyze groups of treated and untreated animals or people. Traditional mass spectrometry analysis only permits the analysis of sample pairs (i.e. a comparison of a single treated and untreated sample), which not only takes really long hours, but also prevents any statistical analysis.” [1, 3]

Figure: 2 Diagrammatic representation of overall Principle behind SELDI operation.

Source: Judith Y.M.N. Engwegen, Marie-Christine W. Gast, Jan H.M. Schellens and Jos H. Beijnen, (2006) Clinical proteomics: searching for better tumour markers with SELDI-TOF mass spectrometry, TRENDS in Pharmacological Sciences 27: 251-259

The above diagram shows the principles of SELDI-TOF MS using the ProteinChip System. (a) Protein profiling. (i) Microliters of sample, for example from diseased and healthy persons, are applied to an eight-spot array with a hydrophilic, hydrophobic, cationic, anionic or immobilized-metal affinity capture chromatography surface. [4] (Figures 3, 4 and 5)

Figure: 3 Two ProteinChip Arrays with 8 spots.


Figure: 4 Variety of ProteinChip arrays available for sample preparation.

Source: Haleem J. Issaq, Timothy D. Veenstra, Thomas P. Conrads and Donna Felschow, The SELDI-TOF MS Approach to Proteomics: Protein Profiling and Biomarker Identification, Biochemical and Biophysical Research Communications 292, 587–592 (2002)

Figure: 5 Bioprocessor for liquid handling procedures. The arrays are put into the processor.

Source:

(ii) Shows the addition of an appropriate binding buffer. (iii) Shows on-chip sample purification using one or more wash buffers. (iv) “Shows the application of energy-absorbing matrix (e.g. sinapinic acid) for the absorption of laser energy. Thus the laser irradiation desorbs bound proteins and positively ionizes them. Owing to the electric field, they migrate in the mass analyser (TOF MS): small (diamond) and multiply charged proteins (oval) faster than large and single-charged ones (triangle). Thus, the proteins are separated. Time of flight (t) is proportional to protein mass per charge: m/z = constant × t².” [4]

(b) (The figure compares the SELDI spectrum with the corresponding gel view.) “SELDI-TOF mass spectrum and the spectrum depicted in gel view. The protein m/z is displayed on the x-axis and the protein abundance is depicted on the y-axis. Spectra are searched for differentially expressed protein m/z values using the ProteinChip bioinformatics software or other suitable statistics or bioinformatics. Computational algorithms are used to build models for the classification of diseased and healthy samples with the discriminating m/z values (e.g. a classification tree). Here, expression differences are visible between ovarian cancer and control samples. (i) A peak at 9.2 kDa (haptoglobin fragment) is overexpressed in ovarian cancer. Other upregulated peaks are visible at (ii) 4.1 kDa and (iii) 4.5 kDa, and (iv) a downregulated peak is seen at 2.7 kDa.” [4, 6]

The most significant feature of SELDI is its ability to provide a rapid protein expression profile from different types of clinical and biological samples. Researchers have shown its effectiveness in biomarker discovery and identification. It also plays a vital role in the study of protein-protein and protein-DNA interactions. SELDI has proved highly versatile, with successes ranging from the identification of potential diagnostic markers for breast, bladder, prostate and ovarian cancers and Alzheimer’s disease to the study of biomolecular interactions and the characterization of post-translational modifications. In fact, this technique was first used in early-stage ovarian cancer detection. It can be applied to easily accessible body fluids such as serum. [4, 5] This minireview discusses SELDI’s advantages and disadvantages, key industry suppliers and various applications, along with a few learning-resource sites.

9.2.2 Recent Advances
The technology was introduced towards the end of 1993. Its high potential and versatility attracted further research, which led to the commercial development of SELDI at Ciphergen Biosystems, Palo Alto, CA, USA. The company in turn encouraged research that addressed a number of medical and basic research problems. [6-10] Merchant and Weinberger’s review, published in 2000, appears to be the first to give a brief and precise description of SELDI technology and its advantages in biomedical research. Within the following four years, the diversity of ProteinChip arrays with improved array surface characteristics increased, and SEND technology became available.
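The time-of-flight relation quoted in 9.2.1, m/z = constant × t², follows from equating the electrical energy an ion gains in the source (z·e·U) with its kinetic energy (½·m·v²) and then letting it drift over a field-free tube of length L, which gives t = L·sqrt(m / (2·z·e·U)). Flight time therefore scales with the square root of m/z. The Python sketch below illustrates this idealised linear-TOF relation only; the 20 kV accelerating voltage and 1 m flight tube are assumed values chosen for illustration, not the specifications of any ProteinChip reader, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms


def flight_time(mass_da, charge, voltage=20_000.0, length=1.0):
    """Flight time in seconds of an ion in an idealised linear TOF analyser.

    voltage (V) and length (m) are illustrative assumptions.
    """
    mass_kg = mass_da * DALTON
    # z*e*U = 1/2*m*v^2  ->  v = sqrt(2*z*e*U / m)
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return length / velocity


def mz_from_time(t, voltage=20_000.0, length=1.0):
    """Invert the relation: m/z (Da per charge) = constant * t**2."""
    constant = 2 * E_CHARGE * voltage / (length ** 2 * DALTON)
    return constant * t ** 2


# Peaks mentioned in the text (2.7, 4.1, 4.5 and 9.2 kDa), singly charged.
for mass in (2700, 4100, 4500, 9200):
    print(f"{mass} Da -> {flight_time(mass, 1) * 1e6:.2f} microseconds")
```

In this idealised model, quadrupling the mass doubles the flight time and a doubly charged ion arrives sooner than a singly charged ion of the same mass, which matches the qualitative description in the figure legend.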
A continuous stream of publications has reported enhancements in SELDI instrument performance, added automation tools, improved experimental protocols and fine-tuned software for biomarker pattern data analysis. Today it is simple to investigate biomarker candidates that are characterized by identifying PTM (Post Translational Modification). [10]

Recent Inventions by CIPHERGEN
1. “The new ProteinChip System Series 4000 for the rapid translation of SELDI biomarker discoveries into assays”. [1]
“Invention of ProteinChip system Series 4000 by Ciphergen was specifically designed to enable and ease the Pattern Track process of biomarker discovery to assay. The process begins with novel biomarker discovery, which aims at differential protein expression between sets of biological samples using Ciphergen ProteinChip technology. Validation studies with larger sample sets then select the biomarkers with the highest predictive value. Only the validated marker proteins are enriched and identified using peptide mass fingerprinting. Chromatographic or antibody-based biomarker assays are then designed, which is the final step of the Pattern Track process”. [1]

Significance of the Series 4000

The sensitivity, reproducibility and dynamic range of the Series 4000 strongly favour the discovery of biomarkers and the development of assays directly on a single platform. Validating markers up-front saves researchers valuable time and effort compared with methods that only reveal whether the markers stand up in larger sample populations after assay development. [1]

2. Automated SELDI for biomarker identification
The Center for Orthopaedic Research, in collaboration with the Arkansas Breast Cancer Research Program, has developed a SELDI core facility for protein biomarker identification, using instrumentation manufactured by Ciphergen. The focus of the UAMS SELDI facility is biomarker identification in cancer. [22] The SELDI resource is capable of:
• Clinical and research proteomics identification.
• Biomarker identification and validation.
• Protein discovery.
• Protein characterization.

Figure 6 Protein Identification by Automated SELDI


3. “Ciphergen’s BioSepra chromatography products and services have been combined with SELDI ProteinChip technology for a new approach to protein purification called Process Proteomics. Ciphergen’s on-spot methodology dramatically accelerates and simplifies purification development and analysis”. [14]

Advanced SPME/SELDI Fiber Introduction to IMS and MS [11]
“The application of polypyrrole (PPY) solid phase micro extraction (SPME) coatings as both an extraction phase and a surface to enhance laser desorption and ionization (SELDI) analyte is introduced in here. This SPME/SELDI fiber integrates sample preparation and sample introduction on the tip of a coated optical fiber, as well as the transmission media for the UV laser light. Using ion mobility spectrometry (IMS) detection, the signal intensity was examined as function of extraction surface area and concentration of analyte. The linear relationship between concentration and signal intensity shows potential applicability of this detection method for quantitative analysis. Extraction time profiles for the fiber, using tetraoctylammonium bromide (TOAB), illustrated that equilibrium can be reached in less than one minute. To investigate the performance of the PPY coating, the laser desorption profile was studied. The fiber was also tested using a Q-TOF MS instrument with leucine enkephalin as a test analyte. Since no matrix was used, mass spectra free from matrix background were obtained. This novel SPME/SELDI fiber is easy to manufacture, and is suitable for studying low mass analytes because of the inherent low background. These findings suggest that other types of conductive polymers could also be used as an extraction phase and surface to enhance laser desorption/ionization in mass spectrometry”. [11]

9.2.3 Evaluation of technology

Advantages:
• The technology is easy to use and technically simple.
• Easy-to-use hardware/software integration with advanced display and analysis tools.
• Enables biomarker discovery, validation and identification, and the creation of an assay, on one platform.
• Validating markers up-front saves researchers valuable time and effort.
• Easily automated.
• Simple sample preparation.
• High-throughput sample preparation using "Deep proteome" tools, in order to control pre-analytical parameters.
• Reduction of sample complexity.
• Reduced sample needs: typically 1-2 µl of crude sample per analysis.
• Direct application of whole sample (fast on-chip sample clean-up).
• High throughput that allows large numbers of samples to be analysed statistically. [1]
• Rapid throughput: hundreds of samples can be analyzed by a single user in a matter of days.
• Suitability for low-abundance proteins (e.g. transcription factors and a majority of cellular proteins).
• Rapid protein profiling.
• Improved discovery of protein binding partners through protein-affinity interactions.
• High throughput (up to 96 samples per bioprocessor).
• Enables access to PTM (Post Translational Modification). [13, 14, 15]
• Both qualitative and quantitative detection of peptides and proteins (including isoforms) can be performed.
• On-chip characterization using proteolytic digestion of the target protein and subsequent identification through peptide mapping.

John Semmes, Director, Center for Biomedical Proteomics at Eastern Virginia Medical School, says, “SELDI seems to attract proteomics researchers because of its ability to reduce sample complexity such as affinity purification and concentration and it also applies them directly to the chip surface, this action saves time which in turn increases the reproducibility since there is not much variability in the process”. [1]

Disadvantages:
• The technology is currently applicable only to proteins with a maximum molecular weight <20 kDa and provides relatively lower mass accuracy than the 2D-PAGE MS method. [13]
• Unsuitable for high molecular weight proteins (>100 kDa).
• Limited to detection of bound proteins.
• Lower resolution and mass accuracy than MALDI-TOF. [14]
• Uses a mild ionisation procedure.
• Difficult to match protein ID to pattern feature.
• Quantitative consistency questionable.
• Reproducibility questionable. [15]

9.2.4 Application
“Based on patented Surface Enhanced Laser Desorption/Ionization (SELDI) technology, Ciphergen’s ProteinChip Systems offer a single, unified platform for a multitude of Proteomics research applications.” [24]

SELDI’s various applications, from discovery to development:
• Biomarker discovery
• Interaction Difference Mapping
• Expression Difference Mapping
• DNA-protein interaction
• Protein-protein interaction
• Receptor-ligand interaction
• Antibody-antigen interaction
• Protein purification
• Clinical trials stratification
• Phosphorylation/signal transduction
• Glycosylation analysis
• Epitope mapping
• Protein ID/peptide mapping
• Toxicity markers

Biomarker discovery
The alterations in the cancer proteome, in biomarkers and in patterns of biomarkers detected by SELDI lead to improvements in early detection, diagnosis and treatment monitoring in cancer. [14, 16] This technique was first used for early-stage ovarian cancer detection. [5]

Figure 7 Shows biomarker discovery process.

Source:

PATTERN RECOGNITION
“Single protein biomarkers with clinically relevant predictive power are quite rare. Multiple marker panels generated by SELDI are beginning to deliver on the promise of high-confidence clinical stratification. Ciphergen’s Biomarker Patterns Software enables rapid sample stratification with an easy-to-interpret, tree-based classification output as shown in Figure 8.” [17]

Figure 8 Pattern recognition
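The idea of a tree-based classification built on discriminating m/z values can be illustrated with a small Python sketch. It is not Ciphergen's Biomarker Patterns Software, whose internals are not described here; it only finds a single best node of such a tree (one m/z peak plus one intensity threshold) by counting misclassified samples. The peak intensities, the function names and the two-class restriction are assumptions made for illustration.

```python
def best_split(spectra, labels):
    """Find the single-peak rule with the fewest misclassified samples.

    spectra: list of dicts mapping m/z -> peak intensity (same keys in each).
    labels:  list of class names, exactly two distinct classes.
    Returns a rule: intensity at rule['mz'] > rule['threshold'] => rule['high'].
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    best = None
    for mz in spectra[0]:
        values = sorted({s[mz] for s in spectra})
        # Candidate thresholds: midpoints between consecutive distinct intensities.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for high, low in (classes, classes[::-1]):
                errors = sum(
                    (s[mz] > threshold) != (label == high)
                    for s, label in zip(spectra, labels)
                )
                if best is None or errors < best["errors"]:
                    best = {"mz": mz, "threshold": threshold,
                            "high": high, "low": low, "errors": errors}
    return best


def classify(spectrum, rule):
    """Apply a rule returned by best_split to one spectrum."""
    return rule["high"] if spectrum[rule["mz"]] > rule["threshold"] else rule["low"]


# Hypothetical peak intensities at two m/z values (Da), for illustration only.
spectra = [
    {9200: 8.0, 2700: 1.0},
    {9200: 7.5, 2700: 3.0},
    {9200: 2.0, 2700: 2.5},
    {9200: 3.0, 2700: 4.0},
]
labels = ["cancer", "cancer", "control", "control"]
rule = best_split(spectra, labels)
print(rule)
print(classify({9200: 6.0, 2700: 2.0}, rule))
```

A full classification tree would repeat this search on each resulting subgroup and, in practice, would be validated on samples not used to build it.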

Source:

VALIDATION
To verify the reliability of potential biomarkers, large numbers of samples need to be comparatively analyzed to further substantiate their usefulness. A single user can manually prepare and analyze up to 200 samples per day. [17]

Figure 9 Validation profile maintenance.

Source:

PROTEOMICS APPLICATIONS
“SELDI enables clinical researchers to rapidly discover, characterize, and validate predictive protein biomarkers and biomarker patterns, right in their lab. Technology also enables a wide variety of Interaction Difference Mapping applications in basic biochemistry and drug discovery. Quadrupole tandem MS extends the application range to “on-spot” epitope mapping and peptide sequencing capabilities”. [14]

Source: Emanuel F Petricoin and Lance A Liotta, (2004), SELDI-TOF-based serum proteomic pattern diagnostics for early detection of cancer, Current Opinion in Biotechnology. 15:24–30

“Biomarker amplification and harvesting by carrier molecules. Low molecular weight peptide fragments, produced within the unique tissue microenvironment and generated as a consequence of the disease process, permeate through the endothelial cell wall barrier and trickle into the circulation. Here, these fragments are immediately bound with circulating high-abundance carrier proteins, such as albumin, and protected from kidney clearance. The resultant amplification of the biomarker fragments enables these low abundance entities to be seen by MS-based detection and profiling. In the future, harvesting nanoparticles, engineered with high affinity for binding, can be instilled into the collected body fluids or injected directly into the circulation to bind with the disease- and toxicity-related information archive. These nanoparticles and their bound diagnostic cargo can then be directly collected, filtered over engineered filters and queried by high-resolution MS. A ‘look up table’, where the exact identities of each of the peaks will be compared against the accurate mass tag of each of the peaks within the spectra, will enable the simultaneous identification of each entity within the pattern as well as allowing the discovery of the diagnostic pattern itself”. [18]

POST-TRANSLATIONAL MODIFICATIONS
“Because the SELDI ProteinChip platform delivers the exact molecular weights of the molecules present in a given sample, a wide variety of covalent modifications (phosphorylation, acetylations, oxidations, glycosylation and others) can be analyzed. For the detection of phosphorylation, an IMAC array loaded with Gallium enables specific enrichment and detection of phospho-peptides. This also provides a powerful approach to the quantitation of kinase activity”. [17, 14]

9.2.5 Relevant Web-sites

1. Biocompare: a guide for Life Scientists.
2. Wiley InterScience.
3. BioMed Central: the open access publisher.
4. Dissect Medicine: collaborative medical news.
5. Medscape Today: medical news website.
6. ExPASy (Expert Protein Analysis System): proteomics server of the Swiss Institute of Bioinformatics (SIB), dedicated to the analysis of protein sequences and structures as well as 2-D PAGE.
7. Mascot: a powerful search engine that uses mass spectrometry data to identify proteins from primary sequence databases. The site also offers a substantial knowledge base concerning protein identification by MS.
8. NCBI (National Centre for Biotechnology Information): creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease.
9. PerkinElmer Life Science.
10. Proteomics tools for mining sequence databases in conjunction with mass spectrometry experiments.
11. BIOBASE: a leading content provider of biological databases, knowledge tools and software for the life science industry.
12. A site containing various tools for protein studies.
13. FEBS (Federation of European Biochemical Societies) Journal search engine.

9.2.6 Key industry suppliers [19]
• Major instrument vendors
• Major companies involved in protein chips
• Major companies exploiting proteomics
• Vendors supplying miscellaneous proteomics products
• Major companies dealing with protein products

Source:

Conclusion
SELDI enables researchers to gain a better understanding of biological functions at the protein level, as the tool provides a direct approach to understanding the role of proteins in the biology of disease, the monitoring of disease progression and the therapeutic effects of drugs. [15, 16] This review has discussed the overall advancement of SELDI technology and its various applications, along with its principles, advantages and disadvantages. Reviews of this kind help to drive the creation and advancement of such technologies. Over the last decade there has been exponential growth in the collective understanding and utility of this technique, as can be seen from the evolution of SELDI from an inherently challenging research project into a useful tool for biomedical research. [16] “Finally, as SELDI ProteinChip technology is applied more frequently on a large scale, automation of sample preparation, chip reading and data analysis will become imperative”. [15]

9.2.7 References
1. Biocompare - a guide for Life Scientist web site (15/10/06) (update Feb 16 ‘05)
2. Thomas P Conrads, Ming Zhou, Emmanuel F Petricoin, Lance Liotta and Timothy D Veenstra, (2003) Cancer diagnosis using proteomic patterns, Expert Rev. Mol. Diagn. 4: 411-420.
3. Semmes et al., (2005) Evaluation of Serum Protein Profiling by Surface-Enhanced Laser Desorption/Ionization Time of Flight Mass Spectrometry for the Detection of Prostate Cancer: Assessment of Platform Reproducibility, Clinical Chemistry. 51: 102-112.
4. Judith Y.M.N. Engwegen, Marie-Christine W. Gast, Jan H.M. Schellens and Jos H. Beijnen, (2006) Clinical proteomics: searching for better tumour markers with SELDI-TOF mass spectrometry, TRENDS in Pharmacological Sciences. 27: 251-259.
5. Petricoin, E.F. et al. (2002) Use of proteomic patterns in serum to identify ovarian cancer. Lancet. 359: 572–577.
6. Frears E.R., Stephens D.J., Walters C.E., Davies H., Austen B.M., (1999) The role of cholesterol in the biosynthesis of β-amyloid, NeuroReport. 10: 1699-1705.
7. Shane Beck, (1998) Ciphergen's ProteinChip Arrays, The Scientist. 12: 1517.
8. Glaser V., (1998) Gen. Engineer News, 18: 1
9. Glaser V., Rosetta, (1998) Polymeric arrays and methods for their use in binding assays, Nature Biotechnology. 15: 937-938.
10. Strauss, E., (1998) After the genome IV meeting: News Ways to probes the molecule of life, Science. 282: 1406-1407.
11. Engwegen, J.Y.M.N. et al. (2006) Identification of serum proteins discriminating colorectal cancer patients and healthy controls using surface enhanced laser desorption ionisation-time of flight mass spectrometry (SELDI-TOF MS). World J. Gastroenterol. 12: 1536–1544.
12. Dr. Benjamin Reed, An Introduction to SELDI ProteinChip® Proteomics and related Applications to Pharmacoproteomics and Clinical Biology, University of Waterloo, Ontario, Canada, Faculty of Science website.
13. David W Speicher, The Wistar Institute Proteomics short course lecture 2, Protein profiling, Biological applications and Human disease. (2004) .pdf
14. Wiley InterScience.
15. Weinberger S.R., Merchant M., (2000) Recent advancement in surface-enhanced laser desorption/ionization time of flight mass spectrometry, Electrophoresis. 21: 1164-1177.
16. Judith Y.M.N. Engwegen, Marie-Christine W. Gast, Jan H.M. Schellens and Jos H. Beijnen, (2006) Clinical proteomics: searching for better tumour markers with SELDI-TOF mass spectrometry, TRENDS in Pharmacological Sciences. 251-259.
17. Ciphergen website (2006)
18. Harri Siitari and Heini Koivistoinen (2004) Proteomics – challenges and possibilities in Finland
19. Emanuel F Petricoin and Lance A Liotta, (2004) SELDI-TOF-based serum proteomic pattern diagnostics for early detection of cancer, Current Opinion in Biotechnology. 15: 24–30.
20. Ning Tang, Pete Tornatore and Scot R. Weinberger, (2004) Current Developments in SELDI Affinity Technology, Mass Spectrometry Reviews. 23: 34–44.
21. Sonja V, Steve C, Scot R W & Andreas W (2005) Protein quantification by the SELDI-TOF-MS based ProteinChip system, Nature Methods. 2: 393-395.
22. Judith Y.M.N. Engwegen, Marie-Christine W. Gast, Jan H.M. Schellens and Jos H. Beijnen, (2006) Clinical proteomics: searching for better tumour marker with SELDI-TOF mass spectrometry, Trends in Pharmacological Sciences. 27: 251-259.
23. Yue Hu, Suzhan Zhang, Jiekai Yu, Jian Liu, Shu Zheng, (2005) SELDI-TOF-MS: the proteomics and bioinformatics approaches in the diagnosis of breast cancer, The Breast. 14: 250–255.
24. Haleem J. Issaq, Timothy D. Veenstra, Thomas P. Conrads and Donna Felschow (2002) The SELDI-TOF MS Approach to Proteomics: Protein Profiling and Biomarker Identification, Biochemical and Biophysical Research Communications. 292: 587–592.
25. S.R. Weinberger, E. Boschetti, P. Santambien, V. Brenac, (2002) Surface-enhanced laser desorption–ionization retentate chromatography mass spectrometry (SELDI–RC–MS): a new method for rapid development of process chromatography conditions, Journal of Chromatography B. 782: 307–316.
26. Dan Agranoff, August Stich, Paulo Abel and Sanjeev Krishna, (2005) Proteomic fingerprinting for the diagnosis of human African trypanosomiasis, Trends in Parasitology. 21: 154-157.
27. University of Arkansas for Medical Sciences, Center for Orthopaedic Research, Orthopaedic Surgery Department.
28. Emanuel F Petricoin and Lance A Liotta, (2004) SELDI-TOF-based serum proteomic pattern diagnostics for early detection of cancer, Current Opinion in Biotechnology. 15: 24–30.
29. Junjun Wang, Defa Li, Lawrence J. Dangott and Guoyao Wu (2006) Proteomics and Its Role in Nutrition Research, Journal of Nutrition, 1759-1762.
30. Eastern Virginia Medical School website

Protein Nanotechnology By Payal Patel S3120173

10.1 Introduction
Nanotechnology is the branch of engineering that deals with manipulating or constructing things smaller than 100 nanometres. In healthcare, diagnosis plays an important part in treating a disease, and the identification and quantification of proteins and an understanding of their folding mechanisms are very important in the diagnosis of diseases. Small quantities of proteins that are responsible for disease, and that generally escape detection, can now be quantified by protein nanotechniques. Proteins are long stringy molecules that fold up into complex and useful shapes due to very subtle interactions between their component parts. Proteins do most of the molecular manipulation work in our bodies, joining and splitting molecules, moving things as small as atoms and as large as cellular organelles from one place to another, and making cellular metabolism work. The combination of nanotechnology and molecular biology has led to a new generation of nanoscale-based devices and methods for probing the cell machinery and elucidating intimate life processes occurring at the molecular level that were previously invisible to human inquiry. With these advanced technologies, scientists can now take control of a specific protein and, in effect, switch it on and off. Protein nanotechniques have given a better understanding of biotechnology processes.

10.2 Recent Advances
The worldwide emergence of nanoscale science and engineering was marked by the announcement of the National Nanotechnology Initiative (NNI) in January 2000. Recent research on biosystems at the nanoscale has created one of the most dynamic science and technology domains at the confluence of physical sciences, molecular engineering, biology, biotechnology and medicine. Nanotechnology is beginning to allow scientists, engineers, and physicians to work at the cellular and molecular levels to produce major benefits to life sciences and healthcare. In the next century, the emerging field of nanotechnology will lead to new biotechnology-based industries and novel approaches in medicine. Recent developments include nanotechnologies such as protein microarrays and biosensors and their application to the diagnosis of diseases at the proteomics level. Major advances in the last several years in scanning probe and scanning optical analytical methods permit viewing the vital chemical processes and microscopic structures in biological systems with unprecedented resolution. These new analytical probes reveal a detailed picture of the microscopic structure of living cells and a view of chemical processes at the molecular scale. The atomic force microscope, for example, can locate and measure the extraordinarily small forces associated with receptor-ligand binding on cell surfaces. Microscopic electrical probes can detect a living cell’s exchange of ions with its environment or the propagation of electrical signals in nerves. Optical instruments, combined with chemically selective light-emitting fluorescent probes, can follow chemical processes on the surfaces of and inside a living cell. This capability allows observation of the biochemical processes and interactions of cells in living systems. Cells in the human body contain naturally occurring molecular motors, for example F1-ATPase, which is part of the large, membrane-embedded complex that synthesizes ATP within mitochondria (Figure). This structure, only about 10 nm in size, is a robust, fully functional rotating motor that is powered by natural biochemical processes.

Figure: The molecular motor protein F1-ATPase. An actin filament is attached to a motor protein to provide load to and allow visualization of the motor rotation.

During the last few years, scientists have developed the technology for rapidly mapping the genetic information in DNA and RNA molecules, including detection of mutations and measurement of expression levels. This technology uses DNA microchip arrays that adapt some of the lithographic patterning technologies of the integrated circuit industry. Miniaturization of allied analytical processes such as electrophoresis will lead to increases in throughput and reduced cost for other important methods of analysis such as DNA sequencing and fingerprinting. For example, new research is aimed at replacing the tedious, slow, and expensive process of DNA sequencing in slab gels with miniaturized, integrated, microfabricated analytical systems (Figure).

Figure: Photo mosaic of a DNA separation chip. The image is pieced together from twelve optical micrographs. The inset shows a small region 0.8 mm long containing dense pillars that act as a molecular sieve to separate DNA molecules according to size. Conventional gel electrophoresis works essentially the same way, and for this reason these nanofabricated structures are called "artificial gels." This technology has the potential to revolutionize DNA separation techniques by providing an inexpensive, durable, and reproducible medium for DNA electrophoresis.

Nanoparticles considerably smaller than one micron in diameter have been used to deliver drugs and genes into cells. The particles can be combined with chemical compounds that are ordinarily insoluble and difficult for cells to internalize. These particles can then be introduced into the bloodstream with little possibility of clogging the capillaries and other small blood vessels. The efficacy and speed of drug action in the human body can thereby be dramatically enhanced. Nanoparticles can be used in a similar way for gene delivery: by carrying DNA fragments, they can incorporate specific genes into target cells (Figure).

Figure: The "Gene Gun," a system that uses nanoparticles to deliver genetic material to transfect plant and animal cells. In this system, submicron gold particles coated with DNA are accelerated with a supersonic expansion of helium gas. The particles leave the front of the device at high velocity and penetrate the cell membrane and nuclear membrane, thus delivering the genetic material to the nucleus.

A biosensor is an analytical device which converts a biological response into an electrical signal (Figure). The term 'biosensor' is often used to cover sensor devices that determine the concentration of substances and other parameters of biological interest, even where they do not utilise a biological system directly.

Figure: Schematic diagram showing the main components of a biosensor. The biocatalyst (a) converts the substrate to product. This reaction is detected by the transducer (b), which converts it to an electrical signal. The output from the transducer is amplified (c), processed (d) and displayed (e).

Over the last five years, NASA Ames has been a leader in nanotechnology, carrying out many projects that apply it to bioscience and medicine. It has therefore been an important contributor to advances in protein nanotechnology and its applications.

10.3 Evaluation of Technology

Like any other technology, protein nanotechnology has some disadvantages alongside its many advantages. As the recent advances and applications in medicine and biology described here show, its major advantage is in the diagnosis of disease, where it makes the process simpler and shorter. Using microarray techniques, the cause of a disease can be found sooner and the patient treated sooner. Biosensors, the gene gun and fluorescent biological labels are a few of the examples covered here.

There are also a few disadvantages that give scientists pause. Nanobiology requires infrastructure similar to that of other fields: multi-user facilities that provide access to specialized technologies; funding mechanisms and organizational structures that encourage and support multidisciplinary teams and are responsive to rapid technological change; and training of a new generation of scientists and engineers who are prepared to exploit this new knowledge fully. The technology is also very costly, which is a major disadvantage.

10.4 Application of Technology

Protein nanotechnology has applications in many fields, including life science and medicine, engineering and physics.
DNA detector arrays that today operate in the micron size range provide the potential to do thousands of experiments simultaneously with very small amounts of material. The Figure shows an image of a chip with 6,400 microdots, each containing a small amount of a different gene in the yeast genome and capable of determining how active that gene is in yeast. Yeast cells were grown under various conditions; the amount of red or yellow light represents the level of RNA produced from the DNA in that gene under those conditions. Similar experiments using this or related technologies can now be performed with tens or hundreds of thousands of human genes. By comparing the pattern of gene expression in normal tissue with that in cancerous tissue, scientists can discover which few genes are being activated or inhibited during a specific disease. This information is critical to both the scientific and clinical communities in helping to discover new drugs that inhibit cancer-causing genes. The important point is that these technologies allow physiological changes in yeast or humans to be characterized, molecule by molecule, in just a few hours. Five years ago, an experiment like this would have taken dozens of scientists months to complete.
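
The comparison of expression patterns described above can be sketched in a few lines of Python. This is only a minimal illustration: the gene names, the intensity values and the two-fold cut-off are hypothetical and are not taken from the yeast experiment, and real microarray analysis also involves normalisation and statistical testing.

```python
import math

# Hypothetical spot intensities (arbitrary units) for the same genes
# measured in normal and in cancerous tissue. Illustrative values only.
normal = {"geneA": 120.0, "geneB": 800.0, "geneC": 300.0, "geneD": 50.0}
tumour = {"geneA": 960.0, "geneB": 790.0, "geneC": 75.0, "geneD": 55.0}


def log2_ratios(reference, sample):
    """Return log2(sample / reference) for every gene present in both sets."""
    return {
        gene: math.log2(sample[gene] / reference[gene])
        for gene in reference
        if gene in sample
    }


def classify(ratios, threshold=1.0):
    """Label each gene as activated, inhibited or unchanged.

    threshold=1.0 corresponds to a two-fold change (an assumed cut-off).
    """
    labels = {}
    for gene, ratio in ratios.items():
        if ratio >= threshold:
            labels[gene] = "activated"
        elif ratio <= -threshold:
            labels[gene] = "inhibited"
        else:
            labels[gene] = "unchanged"
    return labels


ratios = log2_ratios(normal, tumour)
labels = classify(ratios)
for gene in sorted(labels):
    print(f"{gene}: log2 ratio {ratios[gene]:+.2f} -> {labels[gene]}")
```

Working with log ratios rather than raw differences treats an eight-fold increase and an eight-fold decrease symmetrically, which is why the same threshold can be used in both directions.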

Figure: The full yeast genome in a microarray chip.

For more than a decade there has been an intensive effort to prepare high-quality nanometer-size colloidal crystals of many common semiconductors. At the onset, this effort had a strong focus on fundamental studies of scaling laws, in this case, quantum confinement of electrons and holes. Over this decade, tremendous advances occurred in both the spectroscopy and the fabrication methods. This yielded a new class of very robust macromolecules with readily tunable emission energy. To the extent that applications of this technology were envisioned at the onset, they were focused in the domain of optoelectronics. Yet quite unexpectedly, it turns out that these colloidal nanocrystals can be used as fluorescent labels for biological tagging experiments. Biological tagging is one of the most widely employed techniques for diagnostics and visualization. As shown in the Figure, colloidal nanocrystals appear to be advantageous as labels for many applications. This has led to rapid commercialization of the new nanotechnology. The nanocrystals have significant advantages over conventional dyes:

- Reduced photobleaching
- Multi-color labelling, parallel screening
- Infrared labels, blood diagnostics
- Molecular size nanocrystals are bio-compatible, with many other possible applications.

Figure: Semiconductor nanocrystals as fluorescent biological labels.

Nanotechnology also promises revolutionary advances in military capability. For instance, the confluence of biology, chemistry, and physics at the nanometre scale is enabling significant advances in military sensors for biological and chemical warfare agents. Civilian disaster response teams and commercial medicine will benefit as well. We cannot afford to face a nerve gas attack without effective sensors. Defence research and development programs are pursuing many sensor options; two related technologies are nearing fruition and will have medical applications as well. One is a colorimetric sensor that can selectively detect biological agent DNA; it is in commercial development with successful tests against anthrax and tuberculosis (Mirkin 1999). Compared to present technology, the sensor is simpler, less expensive (by about a factor of 10), and more selective: it can differentiate one nucleotide mismatch in a sequence of 24, where 17 constitutes a statistically unique identification.
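
One way to see why a sequence of about 17 nucleotides can serve as a statistically unique identifier is to count the possible sequences of that length. The sketch below is a back-of-envelope argument only. It assumes a random sequence with four equally likely bases and a genome of roughly 3 billion bases (about the size of the human genome); neither assumption comes from the text above, which does not say how the figure of 17 was derived.

```python
# Assumed genome length in bases (roughly human-sized). Illustrative only.
ASSUMED_GENOME_LENGTH = 3_000_000_000


def possible_sequences(probe_length):
    """Number of distinct DNA sequences of the given length (4 bases)."""
    return 4 ** probe_length


def expected_chance_matches(probe_length, genome_length):
    """Expected number of times one specific probe sequence occurs by
    chance in a random sequence of genome_length bases, assuming the
    four bases are equally likely and independent."""
    positions = max(genome_length - probe_length + 1, 0)
    return positions / possible_sequences(probe_length)


for length in (12, 17, 24):
    count = possible_sequences(length)
    chance = expected_chance_matches(length, ASSUMED_GENOME_LENGTH)
    print(f"length {length}: {count:.2e} possible sequences, "
          f"{chance:.2e} expected chance matches")
```

Under these assumptions a 17-nucleotide sequence is expected to occur by chance less than once in a genome of this size, while a 12-nucleotide sequence would occur many times, so a 24-nucleotide probe that also rejects single mismatches is highly specific.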

Figure: Anthrax detection. When the anthrax target is present, pairs of nanoparticles assemble together via the DNA filaments and change the color of the suspension.

10.5 References

1. Baselt, D.R., G.U. Lee, and R.J. Colton. 1996. Biosensor based on force microscope technology. Journal of Vacuum Science and Technology B 14(2):789-793.
2. Brown, P. 1999.
3. Bruchez, M. Jr., M. Moronne, P. Gin, S. Weiss, and A.P. Alivisatos. 1998. Semiconductor nanocrystals as fluorescent biological labels. Science 281:2013-2016.
4. Chan, W.C.W., and S.M. Nie. 1998. Quantum dot bioconjugates for ultrasensitive nonisotopic detection. Science 281:2016-2018.
5. Colton, R. 1999. (Chemistry Division, Naval Research Laboratory, private communication).
6. Jelinski, L. 1999. Biologically related aspects of nanoparticles, nanostructured materials, and nanodevices. In Nanostructure science and technology. NSTC Report, ed. R.W. Siegel, E. Hu, and M.C. Roco. Baltimore: World Technology Evaluation Center (WTEC). Web site. Also published by Kluwer Academic Publishers.
7. Hameroff, S., et al. 1990. Scanning tunneling microscopy of cytoskeletal proteins: Microtubules and intermediate filaments. J. Vac. Sci. and Tech. A 8:687-691.
8. Lysaght, M.J. 1998. An economic survey of the emerging tissue engineering industry. Tissue Eng. 4:231-238.
9. Mirkin, C. 1999. (Department of Chemistry, Northwestern University, private communication).
10. Noji, H. 1998. The rotary enzyme of the cell: The rotation of F1-ATPase. Science 282:1844-1845.
11. Odde, D.J. and M.J. Renn. 1998. Laser-based direct-write lithography of cells. Ann. Biomed. Eng. 26:S-141.
12. Renn, M.J., et al. 1999. Laser guidance and trapping of mesoscale particles in hollow-core optical fibers. Phys. Rev. Lett. 82:1574-1577.
13. Santini, J.T. Jr., M.J. Cima and R. Langer. 1999. A controlled-release chip. Nature 397:335-338.
14. Shi, H., et al. 1999. Template-imprinted nanostructured surfaces for protein recognition. Nature 398:593-597.
15. Indian Journal of Clinical Biochemistry, 2005, 20 (2) 48-53
16. Barry, R. and Ivanov, D. (2004) Microfluidic in biotechnology. J. Nanotechnol. 2, 1-5.
17. U.R. Muller, D.V. Nicolau (Eds), Microarray Technology and Its Applications
18. Biosensors,
19.
20. J Nanosci Nanotechnol. 2006 Sep-Oct;6(9-10):2736-53.
21.
