
UNIT I

Gene analysis techniques: Isolation of DNA and RNA. Handling and quantification of
nucleic acids. Radiolabelling of nucleic acids. Gel electrophoresis. Probing
for a specific gene: Southern blotting, Northern blotting, dot blotting,
Western blotting. Chromosome walking. Heteroduplex analysis.
Section A
1. The lysis of animal cells is usually performed using anionic detergents such as SDS (sodium dodecyl
sulfate) or Sarkosyl (sodium lauroyl sarcosinate).
2.

Isolation of Nucleic Acid


There are many methods available for purifying nucleic acids. The choice of method depends on
the type, source, size and amount, and quality required for the procedure in which the nucleic
acid is to be used. The advantages and disadvantages of each purification method are presented
in Table 1 (See notes/conditions & reference protocols).
Table 1. Nucleic acid purification methods

Method: Phenol/alcohol (guanidine isothiocyanate)
Pierce Prod. #: 17906, 17908, 17909, 17912, 17914
Benefit: Method for nucleic acids of all sizes and strands; removes proteins; large or small scale
Disadvantage: Phenol must be buffered and free of oxidative products; poisonous and caustic; dilute purified product may require ethanol precipitation to concentrate

Method: Ethanol
Pierce Prod. #: 51102; TE 17890
Benefit: Fast, easy and efficient; works with many salt combinations
Disadvantage: Pellets are difficult to see and may detach; ethanol may carry over and damage enzyme activity; over-drying the pellet makes re-dissolving difficult

Method: DEAE, DE52 column or paper
Benefit: Purification of modified nucleic acids
Disadvantage: Eluate often contains high salt; dilute purified product may require ethanol precipitation to concentrate

Method: Size exclusion chromatography
Benefit: Fast and easy method to remove small molecules
Disadvantage: Sample loss from large surface area; separation is not always 100%; nucleic acid dilution

Method: Cesium chloride
Benefit: Cleanest method for DNA; method of choice for >15 kb and closed circular plasmids for biophysical measurement
Disadvantage: Time consuming; expensive; EtBr is a mutagen; DNA must be purified away from CsCl for subsequent use

Method: Extraction from agarose
Benefit: Simple to perform
Disadvantage: Recovery may be low (50%); increased size decreases recovery; poor recovery for < 50 ng; may require extra purification; agarose carryover may inhibit enzymes

Method: Silica resin
Benefit: Fast and easy to perform; safe alternative to phenol/chloroform; clean nucleic acids; >80% recovery
Disadvantage: Different reagents are required for different sized nucleic acids; dilution of nucleic acids; ethanol may carry over and damage enzyme activity


DNA extraction methods
Having briefly discussed the structure of DNA, we now give the DNA extraction procedures.
Every genetic analysis requires good-quality DNA. Ready-made kits for DNA extraction are now available on
the market, but their cost sometimes makes them difficult for laboratories to use.
Various manual protocols for DNA extraction are also available.
Each protocol involves several steps in the preparation of DNA: (i) cell breakage,
(ii) removal of protein, (iii) removal of RNA, (iv) concentration of DNA and (v) determination of the purity
and quantity of DNA.
1. Cell breakage
Cell breakage is one of the most important steps in the purification of DNA. The usual means of cell
disruption, such as sonication, grinding, blending or high pressure, cannot be used in DNA isolation.
These procedures disrupt cells by applying strong shearing forces, and as a result the DNA is
fragmented. The best way to open cells and obtain intact DNA is to apply
chemical (detergent) and/or enzymatic procedures. Detergents solubilize the proteins and lipids of
cell membranes, resulting in gentle cell lysis. In addition, detergents have an inhibitory effect on all
cellular DNases and can denature proteins, thereby aiding the removal of proteins from the solution.
The lysis of animal cells is usually performed using anionic detergents such as SDS (sodium dodecyl
sulfate) or Sarkosyl (sodium lauroyl sarcosinate).

2. Removal of protein
The second step in purification is the removal of the major contaminant, namely protein, from the cell
lysate. This procedure is called deproteinization. Removal of proteins from the DNA solution depends on
differences in the physical properties of nucleic acids and proteins. These differences are in
solubility, in partial specific volume and in sensitivity to digestive enzymes.
(i) Deproteinization using organic solvents
The most frequently used methods for removing proteins exploit the solubility differences between
proteins and nucleic acids in organic solvents. Nucleic acids are hydrophilic molecules and are easily
soluble in water. Proteins, on the other hand, contain many hydrophobic residues, making them partially
soluble in organic solvents. There are several methods of deproteinization based on this difference, and
they vary in the choice of the organic solvent introduced:
1. The use of ionic detergents: By unfolding the protein, these detergents help to expose hydrophobic regions
of the polypeptide chains to phenol micelles, thereby aiding partitioning of proteins into the phenol phase.
2. Enzymatic removal of proteins before phenol extraction: This reduces the number of extractions needed,
thus limiting the loss and shearing of DNA.
3. Addition of 8HQ (8-hydroxyquinoline) to the phenol: This increases the solubility of phenol in water. In
the presence of this compound, phenol remains liquefied at room temperature with only 5 percent water. In
addition, 8HQ is easily oxidized and therefore plays the role of an antioxidant, protecting phenol against
oxidation. Since the reduced form of 8HQ is yellow and the oxidized form is colorless, the presence or absence
of yellow color is an excellent visual indicator of the oxidation state of phenol.
4. Removal of oxidation products from phenol and prevention of oxidation upon storage or during phenol
extraction: Because water-saturated phenol undergoes oxidation rather easily, particularly in the presence of
buffers such as Tris, phenol used for DNA purification is twice distilled, equilibrated with water and stored in
the presence of 0.1 percent 8HQ.
5. Adjusting the pH of the water-saturated phenol solution to above pH 8 by equilibration of the liquefied phenol
with a strong buffer or sodium borate. DNA obtained by this method is usually of high molecular weight, but
contains approximately 0.5 percent protein impurities that can be removed by another method.
A chloroform:isoamyl alcohol (CIA) mixture can also be used for
deproteinization. This method is based on a property of this organic solvent that differs from
phenol. Chloroform is not miscible with water and, therefore, even numerous extractions do not result in
DNA loss into the organic phase. The deproteinization action of chloroform is based on the ability of denatured
polypeptide chains to partially enter, or be immobilized at, the water-chloroform interphase. The resulting
high concentration of protein at the interphase causes the protein to precipitate. Since the deproteinization action
of chloroform occurs at the chloroform-water interphase, efficient deproteinization depends on the formation
of a large interphase area. To achieve this, one has to form an emulsion of water and chloroform. Because chloroform
does not mix with water, this can only be done by vigorous shaking. An emulsifier, isoamyl alcohol, is added
to the chloroform to help form the emulsion and to increase the water-chloroform surface area.
The characteristics of enzymatic removal of proteins make enzymatic deproteinization an ideal and
indispensable first step in nucleic acid purification. This treatment is used when a large amount of protein is
present, i.e. right after cell lysis. The remaining proteins can be removed with a single extraction using organic
solvent.
(ii) Removal of RNA
The removal of RNA from DNA preparations is usually carried out using an enzymatic procedure.
However, this procedure does not remove all RNA and therefore yields DNA preparations with a
very small amount of RNA contamination. Two ribonucleases that can be easily and cheaply prepared free
of DNase contamination are ribonuclease A and ribonuclease T1.
(a) Ribonuclease A (RNase A) is an endoribonuclease that cleaves RNA after C and U residues. The
reaction generates a 2',3'-cyclic phosphate, which is hydrolyzed to a 3'-nucleoside phosphate, producing
oligonucleotides ending with a 3'-phosphorylated pyrimidine nucleotide.
(b) Ribonuclease T1 (RNase T1) is an endoribonuclease that is very similar to RNase A in its reaction
conditions and stability. The enzyme cleaves single-stranded RNA after G residues,
generating oligonucleotides ending in a 3'-phosphorylated guanosine nucleotide.


(iii) Concentrating the DNA
Precipitating with alcohol is usually performed for concentration of DNA from the aqueous phase of the
deproteinization step. Two types of alcohols can be used for DNA precipitation: ethanol and isopropanol.
Alcohol precipitation is based on the phenomenon of decreasing the solubility of nucleic acids in water.
Polar water molecules surround the DNA molecules in aqueous solutions. The positively charged
dipoles of water interact strongly with the negative charges on the phosphodiester groups of DNA. This
interaction promotes the solubility of DNA in water. Ethanol is completely miscible with water, yet it is
far less polar than water. Ethanol molecules cannot interact with the polar groups of nucleic acids as
strongly as water, making ethanol a very poor solvent for nucleic acids.
(iv) Determination of purity and quality of DNA
The last step in DNA isolation is to assess the quantity and quality of the isolated DNA. UV spectrometry is used for
determining the DNA concentration. DNA has maximum absorbance at 260 nm and minimum
absorbance at 234 nm. These values can be affected by the pH of the medium in which the DNA is dissolved.
DNA concentration (N) = A260/E260, where E260 is the DNA extinction coefficient. This is 0.02 ml µg-1 cm-1 when
measured at neutral or slightly basic pH for double-helical DNA. Thus, an absorbance of 1.0 at 260 nm gives a
DNA concentration of 50 µg ml-1 (1/0.02 = 50 µg ml-1). However, this value may differ with GC content. If
the DNA is present at a low concentration, light scattering by dust particles may affect the measurements. This can
be checked by taking one reading at 320 nm. DNA does not absorb at 320 nm; hence, if any
reading is recorded, it is due to dust contamination. If there is no contamination, the reading at 320 nm
should be less than 5% of the absorbance reading at 260 nm. Protein contamination is assessed at 280 nm,
where proteins absorb maximally. This is due to tyrosine, phenylalanine and tryptophan. For
DNA purity, the ratio A260:A280 is taken. A ratio of 1.8 to 2.0 indicates the best purity.
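The calculation described above can be sketched in Python. This is a minimal illustration, not a validated laboratory tool: the function names, the `dilution` parameter and the subtraction of the 320 nm reading as a scattering background are assumptions made for this sketch. The extinction coefficient of 0.02 and the 5% limit for the 320 nm reading are the values given in the text.

```python
def dna_concentration_ug_per_ml(a260, a320=0.0, extinction=0.02, dilution=1.0):
    """Estimate double-stranded DNA concentration (ug/ml) from UV absorbance.

    extinction is E260 for a 1 cm path, 0.02 ml/(ug cm) as given in the text.
    Subtracting a320 as a scattering background is an assumption of this sketch.
    """
    return (a260 - a320) / extinction * dilution


def purity_ratio(a260, a280, a320=0.0):
    """Return the A260:A280 ratio, optionally background-corrected with A320."""
    return (a260 - a320) / (a280 - a320)


def scattering_ok(a260, a320, limit=0.05):
    """True when the 320 nm reading is below 5% of the 260 nm reading."""
    return a320 < limit * a260


if __name__ == "__main__":
    # Hypothetical readings from a 1:10 dilution of a DNA preparation.
    a260, a280, a320 = 0.52, 0.29, 0.01
    print("concentration (ug/ml):", dna_concentration_ug_per_ml(a260, a320, dilution=10))
    print("A260:A280:", round(purity_ratio(a260, a280, a320), 2))
    print("scattering acceptable:", scattering_ok(a260, a320))
```

With no background and no dilution, an A260 of 1.0 returns 50 µg/ml, matching the worked figure in the text.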
Precautions
All chemicals should be handled with care, especially phenol. If skin comes into contact with phenol, flush
it off with a large amount of water and then apply polyethylene glycol; never apply ethanol. If blood or
any human tissue is used, ensure that it has been pre-tested and is free from HIV. Always wear
gloves.
Nucleic acid blotting
These techniques may be applied in the isolation and quantification of specific nucleic acid
sequences and in the study of their organization, intracellular localization, expression and
regulation. A variety of specific applications includes the diagnosis of infectious and inherited
disease. Each of these topics is covered in depth in subsequent chapters. An overview of the
steps involved in nucleic acid blotting and membrane hybridization procedures is shown in Fig.
2.4. Blotting describes the immobilization of sample nucleic acids on to a solid support, generally
nylon or nitrocellulose membranes. The blotted nucleic acids are then used as targets in
subsequent hybridization experiments. The main blotting procedures are:
Southern blotting
The original method of blotting was developed by Southern (1975, 1979b) for detecting
fragments in an agarose gel that are complementary to a given RNA or DNA sequence. In
this procedure, referred to as Southern blotting, the agarose gel is mounted on a filter-paper wick which dips into a reservoir containing transfer buffer (Fig. 2.5).
The hybridization membrane is sandwiched between the gel and a stack of paper towels
(or other absorbent material), which serves to draw the transfer buffer through the gel by
capillary action. The DNA molecules are carried out of the gel by the buffer flow and
immobilized on the membrane. Initially, the membrane material used was nitrocellulose.
The main drawback with this membrane is its fragile nature.
Supported nylon membranes have since been developed which have greater binding
capacity for nucleic acids in addition to high tensile strength. For efficient Southern
blotting, gel pretreatment is important. Large DNA fragments (>10 kb) require a longer
transfer time than short fragments.

To allow uniform transfer of a wide range of DNA fragment sizes, the electrophoresed DNA
is exposed to a short depurination treatment (0.25 mol/l HCl) followed by alkali. This
shortens the DNA fragments by alkaline hydrolysis at depurinated sites.
It also denatures the fragments prior to transfer, ensuring that they are in the single-stranded state and accessible for probing. Finally, the gel is equilibrated in neutralizing
solution prior to blotting. An alternative method uses positively charged nylon
membranes, which remove the need for extended gel pretreatment.
With them the DNA is transferred in native (non-denatured) form and then alkali-denatured in situ on the membrane. After transfer, the nucleic acid needs to be fixed to
the membrane and a number of methods are available.
Oven baking at 80°C is the recommended method for nitrocellulose membranes and this
can also be used with nylon membranes. Due to the flammable nature of nitrocellulose, it
is important that it is baked in a vacuum oven.
An alternative fixation method utilizes ultraviolet cross-linking. It is based on the
formation of cross-links between a small fraction of the thymine residues in the DNA and
positively charged amino groups on the surface of nylon membranes. A calibration
experiment must be performed to determine the optimal fixation period.
Following the fixation step, the membrane is placed in a solution of labelled (radioactive
or non-radioactive) RNA, single-stranded DNA or oligodeoxynucleotide which is
complementary in sequence to the blot-transferred DNA band or bands to be detected.
Conditions are chosen so that the labelled nucleic acid hybridizes with the DNA on the
membrane.
Since this labelled nucleic acid is used to detect and locate the complementary sequence,
it is called the probe. Conditions are chosen which maximize the rate of hybridization,
compatible with a low background of non-specific binding on the membrane.
After the hybridization reaction has been carried out, the membrane is washed to remove
unbound radioactivity and regions of hybridization are detected autoradiographically by
placing the membrane in contact with X-ray film.

Northern blotting
Southern's technique has been of enormous value, but it was thought that it could not be
applied directly to the blot-transfer of RNAs separated by gel electrophoresis, since RNA
was found not to bind to nitrocellulose.
Alwine et al. (1979) therefore devised a procedure in which RNA bands are blot-transferred from the gel on to chemically reactive paper, where they are bound
covalently.
The reactive paper is prepared by diazotization of aminobenzyloxymethyl paper (creating
diazobenzyloxymethyl (DBM) paper), which itself can be prepared from Whatman 540
paper by a series of uncomplicated reactions.
Once covalently bound, the RNA is available for hybridization with radiolabelled DNA
probes. As before, hybridizing bands are located by autoradiography.
The method of Alwine et al. thus extends that of Southern and for this reason it has acquired
the jargon term northern blotting. Subsequently it was found that RNA bands can indeed
be blotted on to nitrocellulose membranes under appropriate conditions (Thomas 1980)
and suitable nylon membranes have been developed. Because of the convenience of
these more recent methods, which do not require freshly activated paper, the use of DBM
paper has been superseded.
Western blotting
The term western blotting (Burnette 1981) refers to a procedure which does not directly
involve nucleic acids, but which is of importance in gene manipulation. It involves the
transfer of electrophoresed protein bands from a polyacrylamide gel on to a membrane of
nitrocellulose or nylon, to which they bind strongly (Gershoni & Palade 1982, Renart &
Sandoval 1984).
The bound proteins are then available for analysis by a variety of specific protein-ligand
interactions. Most commonly, antibodies are used to detect specific antigens. Lectins
have been used to identify glycoproteins. In these cases the probe may itself be labelled
with radioactivity, or some other tag may be employed. Often, however, the probe is
unlabelled and is itself detected in a sandwich reaction, using a second molecule which
is labelled, for instance a species-specific second antibody, or protein A of Staphylococcus
aureus (which binds to certain subclasses of IgG antibodies), or streptavidin (which binds
to antibody probes that have been biotinylated).
These second molecules may be labelled in a variety of ways with radioactive, enzyme or
fluorescent tags. An advantage of the sandwich approach is that a single preparation of
labelled second molecule can be employed as a general detector for different probes. For
example, an antiserum may be raised in rabbits which reacts with a range of mouse
immunoglobulins.
Such a rabbit anti-mouse (RAM) antiserum may be radiolabelled and used in a number of
different applications to identify polypeptide bands probed with different, specific,
monoclonal antibodies, each monoclonal antibody being of mouse origin. The sandwich method may also give a
substantial increase in sensitivity, owing to the multivalent binding of antibody molecules.
Chromosome walking
Earlier in this chapter, we discussed the advantages of making genomic libraries from random
DNA fragments. One of these advantages is that the resulting fragments overlap, which allows
genes to be cloned by chromosome walking. The principle of chromosome walking is that
overlapping clones will hybridize to each other, allowing them to be assembled into a contiguous
sequence.
This can be used to isolate genes whose function is unknown but whose genetic location is
known, a technique known as positional cloning.
To begin a chromosome walk, it is necessary to have in hand a genomic clone that is known to lie
very close to the suspected location of the target gene. In humans, for example, this could be a
restriction fragment length polymorphism that has been genetically mapped to the same region.
This clone is then used to screen a genomic library by hybridization, which should reveal any
overlapping clones.
These overlapping clones are then isolated, labeled and used in a second round of screening to
identify further overlapping clones, and the process is repeated to build up a contiguous map.
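The repeated rounds of screening can be sketched in Python as follows. This is a deliberately simplified illustration under stated assumptions: each clone is modelled as a hypothetical (name, start, end) interval on a chromosome coordinate, and "hybridization" is modelled as interval overlap. In a real walk the coordinates are unknown and overlap is detected experimentally; the clone names, coordinates and function names below are invented for the example.

```python
def overlaps(clone_a, clone_b):
    """Model hybridization: two clones 'hybridize' if their intervals overlap."""
    return clone_a[1] < clone_b[2] and clone_b[1] < clone_a[2]


def chromosome_walk(library, start_name):
    """Return clone names in the order they are found, starting from start_name.

    Each round uses the clones found in the previous round as probes and
    screens the whole library for clones that have not been identified yet.
    """
    by_name = {clone[0]: clone for clone in library}
    found = [start_name]
    probes = [by_name[start_name]]
    while probes:
        new_clones = []
        for clone in library:
            if clone[0] not in found and any(overlaps(clone, p) for p in probes):
                found.append(clone[0])
                new_clones.append(clone)
        probes = new_clones  # the next round of screening uses the new clones
    return found


if __name__ == "__main__":
    # Hypothetical library: (name, start, end) in arbitrary units.
    library = [
        ("A", 0, 40),
        ("B", 30, 70),
        ("C", 60, 100),
        ("D", 95, 130),
        ("E", 200, 240),  # does not overlap the contig, so it is never reached
    ]
    print(chromosome_walk(library, "B"))
```

Starting from clone B, the first round recovers A and C, the second round recovers D, and the isolated clone E is never picked up because nothing overlaps it.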
If the same library is used for each round of screening, previously identified clones can be
distinguished from new ones, so that walking back and forth along the same section of DNA is
prevented. Furthermore, modern vectors, such as λDASH and λFIX, allow probes to be generated
from the end-points of a given genomic clone by in vitro transcription (see Fig. 6.4), which makes
it possible to walk specifically in one direction.
In Drosophila, the progress of a walk can also be monitored by using such probes for in situ
hybridization against polytene chromosomes. Monitoring is necessary due to the dangers posed
by repetitive DNA. Certain DNA sequences are highly repetitive and are dispersed throughout the
genome.
Hybridization with such a sequence could disrupt the orderly progress of a walk, in the worst
cases causing a warp to another chromosome. The probe used for stepping from one genomic
clone to the next must be a unique sequence clone, or a subclone that has been shown to contain
only a unique sequence.
Chromosome walking is simple in principle, but technically demanding. For large distances, it is
advisable to use libraries based on high-capacity vectors, such as BACs and YACs, to reduce the
number of steps involved. Before such libraries were available, some ingenious strategies were
used to reduce the number of steps needed in a walk. In one of the first applications of this
technology, Hogness and his co-workers (Bender et al. 1983) cloned DNA from the Ace and rosy
loci and the homeotic Bithorax gene complex in Drosophila.
The number of steps was minimized by exploiting the numerous strains carrying well-characterized inversions and
translocations of specific chromosome regions. A different strategy, called chromosome jumping,
has been used for human DNA (Collins & Weissman 1984, Poustka & Lehrach 1986). This involves
the circularization of very large genomic fragments generated by digestion with endonucleases,
such as NotI, which cut at very rare target sites.

This is followed by subcloning of the region covering the closure of the fragment, thus bringing
together sequences that were located a considerable distance apart. In this way a jumping library
is constructed, which can be used for long-distance chromosome walks (Collins et al. 1987,
Richards et al. 1988). Chromosome walking and jumping were both applied to the cloning of the
human cystic fibrosis gene.

UNIT II
Enzymes. Nucleases: Restriction endonucleases. DNA cloning. Hybrid
vectors.
Restriction cloning. Selection for hybrid vectors. Methods of cloning.
Synthesis of cDNA. Cloning from genomic DNA. Genomic libraries.
Selection and screening methods.
Endonucleases are enzymes that cleave the phosphodiester bond within a
polynucleotide chain, in contrast to exonucleases, which cleave phosphodiester
bonds at the end of a polynucleotide chain. Restriction endonucleases (Restriction
enzymes) cleave DNA at specific sites, and are divided into three categories, Type I,
Type II, and Type III, according to their mechanism of action. These enzymes are
often used in genetic engineering to make recombinant DNA for introduction into
bacterial, plant, or animal cells.

Restriction endonucleases are enzymes that cleave the sugar-phosphate backbone of
DNA. In most practical settings, a given enzyme cuts both strands of duplex DNA
within a stretch of just a few bases. Several thousand different restriction
endonucleases have been isolated, which collectively exhibit a few hundred different
sequence (substrate) specificities.
Nomenclature and Classification
Restriction enzymes are named based on the organism in which they were discovered.
For example, the enzyme HindIII was isolated from Haemophilus influenzae, strain Rd.
The first three letters of the name are italicized because they abbreviate the genus and
species names of the organism. The fourth letter typically comes from the bacterial strain
designation. The Roman numerals are used to identify specific enzymes from bacteria

that contain multiple restriction enzymes. Typically, the Roman numeral indicates the
order in which restriction enzymes were discovered in a particular strain.
There are three classes of restriction enzymes, labeled types I, II, and III. Type I
restriction systems consist of a single enzyme that performs both modification
(methylation) and restriction activities. These enzymes recognize specific DNA
sequences, but cleave the DNA strand randomly, at least 1,000 base pairs (bp) away from
the recognition site. Type III restriction systems have separate enzymes for restriction
and methylation, but these enzymes share a common subunit. These enzymes recognize
specific DNA sequences, but cleave DNA at random sequences approximately twenty-five bp from the recognition sequence. Neither type I nor type III restriction systems
have found much application in recombinant DNA techniques.
Type II restriction enzymes, in contrast, are heavily used in recombinant DNA
techniques. Type II enzymes consist of single, separate proteins for restriction and
modification. One enzyme recognizes and cuts DNA, the other enzyme recognizes and
methylates the DNA. Type II restriction enzymes cleave the DNA within the same
site that they recognize. The only exceptions are the type IIs (shifted) restriction
enzymes, which cleave DNA on one side of the recognition sequence, within twenty
nucleotides of the recognition site. Type II restriction enzymes discovered to date
collectively recognize over 200 different DNA sequences.
Type II restriction enzymes can cleave DNA in one of three possible ways. In one case,
these enzymes cleave both DNA strands in the middle of a recognition sequence,
generating blunt ends. (The notations 5' and 3' are used to indicate the
orientation of a DNA molecule. The numbers 5' and 3' refer to specific carbon atoms in
the deoxyribose sugar in DNA.)
These blunt-ended fragments can be joined to any other DNA fragment with blunt ends,
making these enzymes useful for certain types of DNA cloning experiments.
Type II restriction enzymes can also cleave DNA to leave a 3' ("three prime") overhang.
(An overhang means that the restriction enzyme leaves a short single-stranded "tail" of
DNA at the site where the DNA was cut.) These 3' overhanging ends can only join to
another compatible 3' overhanging end (that is, an end with the same sequence in the
overhang). Finally, some type II enzymes can generate 5' overhanging DNA ends, which
can only be joined to a compatible 5' end.
In the type II restriction enzymes discovered to date, the recognition sequences range
from 4 bp to 9 bp long. Cleavage will not occur unless the full length of the recognition
sequence is encountered. Enzymes with a short recognition sequence cut DNA
frequently; restriction enzymes with 8 or 9 bp sequences typically cut DNA very
infrequently, because these longer sequences are less common in the target DNA.

1. Explain the Restriction-Modification Systems and Recognition Sequences.
A large majority of restriction enzymes have been isolated from bacteria, where
they appear to serve a host-defense role.
The idea is that foreign DNA, for example from an infecting virus, will be chopped up
and inactivated ("restricted") within the bacterium by the restriction enzyme.
The presence of restriction enzymes immediately raises the question of why they do not
chew up the genomic DNA of their host. In almost all cases, a bacterium that makes a
particular restriction endonuclease also synthesizes a companion DNA methyltransferase,
which methylates the DNA target sequence for that restriction enzyme, thereby
protecting it from cleavage.
This combination of restriction endonuclease and methylase is referred to as a
restriction-modification system.
By convention, restriction enzymes are named after their host of origin.
For example, EcoRI was isolated from Escherichia coli (strain RY13), HindII and HindIII
from Haemophilus influenzae, and XhoI from Xanthomonas holcicola.
Restriction Enzyme Recognition Sequences
The substrates for restriction enzymes are more-or-less specific sequences of
double-stranded DNA called recognition sequences. Examining the following table
will illustrate some important points (recognition sites are shown as double stranded
DNA).

The length of restriction recognition sites varies: The
enzymes EcoRI, SacI and SstI each recognize a 6 base-pair (bp) sequence of DNA, whereas NotI recognizes a
sequence 8 bp in length, and the recognition site for
Sau3AI is only 4 bp in length. Length of the recognition
sequence dictates how frequently the enzyme will cut
in a random sequence of DNA. Enzymes with a 6 bp
recognition site will cut, on average, every 4^6 or 4096 bp;
a 4 bp recognition site will occur roughly every 4^4 or 256 bp.
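The arithmetic behind these averages can be sketched in Python. This is a minimal illustration that assumes a random sequence in which all four bases are equally frequent, which real genomes only approximate. The function name `expected_spacing` and the single-letter ambiguity codes (R for purine, Y for pyrimidine, N for any base) are choices made for this sketch.

```python
from fractions import Fraction

# Number of bases allowed at a position for each code used in this sketch.
ALLOWED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any nucleotide
}


def expected_spacing(site):
    """Average distance (bp) between occurrences of site in random DNA.

    Each position matches with probability (allowed bases / 4); the spacing
    is the reciprocal of the product over all positions.
    """
    probability = Fraction(1)
    for code in site.upper():
        probability *= Fraction(len(ALLOWED[code]), 4)
    return 1 / probability


if __name__ == "__main__":
    for name, site in [("Sau3AI", "GATC"), ("EcoRI", "GAATTC"), ("NotI", "GCGGCCGC")]:
        print(name, site, expected_spacing(site))
```

For a fully specified site of length n the result reduces to 4^n, so a 4 bp site gives 256 and a 6 bp site gives 4096, as stated above; an ambiguous position lowers the spacing because more sequences match.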
Different restriction enzymes can have the same
recognition site - such enzymes are called
isoschizomers: Look at the recognition sites for SacI and
SstI - they are identical. In some cases isoschizomers cut
identically within their recognition site, but sometimes
they do not. Isoschizomers often have different optimum
reaction conditions, stabilities and costs, which may
influence the decision of which to purchase.
Restriction recognition sites can be unambiguous or
ambiguous: The enzyme BamHI recognizes the sequence
GGATCC and no others - this is what is meant by
unambiguous. In contrast, HinfI recognizes a 5 bp
sequence starting with GA, ending in TC, and having any
base between (in the table, "N" stands for any nucleotide)
- HinfI has an ambiguous recognition site. XhoII also has
an ambiguous recognition site: Py stands for pyrimidine
(T or C) and Pu for purine (A or G), so XhoII will
recognize and cut sequences of AGATCT, AGATCC,
GGATCT and GGATCC.
The recognition site for one enzyme may contain the
restriction site for another: For example, note that a
BamHI recognition site contains the recognition site for
Sau3AI. Consequently, all BamHI sites will be cut by
Sau3AI. Similarly, one of the four possible XhoII sites
will also be a recognition site for BamHI and all four will
be cut by Sau3AI.
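The relationships among these sites can be checked with a short Python sketch. It is an illustration only: the function name `expand_site` and the use of R, Y and N as single-letter codes for purine, pyrimidine and any nucleotide are assumptions of the sketch, standing in for the Pu, Py and N notation used in the text.

```python
from itertools import product

# Bases allowed for each code used in this sketch.
ALLOWED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # Pu, purine
    "Y": "CT",    # Py, pyrimidine
    "N": "ACGT",  # any nucleotide
}


def expand_site(site):
    """List every concrete sequence matched by a possibly ambiguous site."""
    choices = [ALLOWED[code] for code in site.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    xhoii_sites = expand_site("RGATCY")  # XhoII: Pu-GATC-Py
    print("XhoII sites:", xhoii_sites)
    # Sau3AI (GATC) sits inside every XhoII site.
    print("all contain GATC:", all("GATC" in s for s in xhoii_sites))
    # Only one XhoII site is also a BamHI site (GGATCC).
    print("also BamHI sites:", [s for s in xhoii_sites if s == "GGATCC"])
    print("HinfI sites:", expand_site("GANTC"))
```

Expanding the XhoII site yields the four sequences listed in the text, all of which contain GATC and exactly one of which is GGATCC.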

One other point to notice from the table above is
that most recognition sequences are
palindromes - they read the same forward (5' to
3' on the top strand) and backward (5' to 3' on the
bottom strand). Most, but certainly not all,
recognition sites for commonly used restriction
enzymes are palindromes. Most restriction
enzymes bind to their recognition site as
dimers (pairs), as depicted for the enzyme PvuII
in the figure to the right.
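In this sense a site is a palindrome when it equals its own reverse complement. A minimal Python sketch of that check follows; the function names are chosen for this illustration and the last test sequence is an arbitrary non-palindromic example, not one taken from the table.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the bottom strand read 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True when the top and bottom strands read the same 5' to 3'."""
    return site.upper() == reverse_complement(site)


if __name__ == "__main__":
    for site in ["GAATTC", "GGATCC", "GCGGCCGC", "GGATG"]:
        print(site, reverse_complement(site), is_palindromic(site))
```

The EcoRI, BamHI and NotI sequences used here pass the check, while the arbitrary sequence GGATG does not.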
2. What are the Patterns of DNA Cutting by Restriction Enzymes?
Restriction enzymes hydrolyze the backbone of DNA between deoxyribose and
phosphate groups. This leaves a phosphate group on the 5' ends and a hydroxyl on the
3' ends of both strands. A few restriction enzymes will cleave single stranded DNA,
although usually at low efficiency.
The restriction enzymes most used in molecular biology labs cut within their
recognition sites and generate one of three different types of ends. In the diagrams
below, the recognition site is boxed in yellow and the cut sites indicated by red triangles.

5' overhangs: The enzyme cuts asymmetrically within the recognition site such
that a short single-stranded segment extends from the 5' ends. BamHI cuts in this
manner.

3' overhangs: Again, we see asymmetrical cutting within the recognition site,
but the result is a single-stranded overhang from the two 3' ends. KpnI cuts in
this manner.

Blunts: Enzymes that cut at precisely opposite sites in the two strands of DNA
generate blunt ends without overhangs. SmaI is an example of an enzyme that
generates blunt ends.

The 5' or 3' overhangs generated by enzymes that cut asymmetrically are called sticky
ends or cohesive ends, because they will readily stick or anneal with their partner by
base pairing.
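The three cutting patterns can be told apart from the position of the cut alone, as the following Python sketch shows. It assumes a palindromic recognition site, so the cut on the bottom strand mirrors the cut on the top strand. The function name `classify_ends` and the representation of a cut as "number of bases to the left of the cut on the top strand" are conventions invented for this sketch; the cut positions used are G^GATCC for BamHI, GGTAC^C for KpnI and CCC^GGG for SmaI.

```python
def classify_ends(site, top_cut):
    """Classify the ends left by cutting a palindromic site.

    top_cut is the number of bases 5' of the cut on the top strand.
    Returns (end type, single-stranded overhang sequence read on the top strand).
    """
    # By the symmetry of a palindromic site, the bottom-strand cut falls at
    # the mirror-image position when expressed in top-strand coordinates.
    bottom_cut = len(site) - top_cut
    if top_cut < bottom_cut:
        return "5' overhang", site[top_cut:bottom_cut]
    if top_cut > bottom_cut:
        return "3' overhang", site[bottom_cut:top_cut]
    return "blunt", ""


if __name__ == "__main__":
    enzymes = {
        "BamHI": ("GGATCC", 1),  # G^GATCC
        "KpnI": ("GGTACC", 5),   # GGTAC^C
        "SmaI": ("CCCGGG", 3),   # CCC^GGG
    }
    for name, (site, cut) in enzymes.items():
        print(name, classify_ends(site, cut))
```

Two ends are compatible for annealing when they are of the same type and carry the same overhang sequence, which is why the overhang is returned along with the end type.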

3. What is an exonuclease, and what is its mode of action?
Exonucleases are enzymes that work by cleaving nucleotides one at a time from the end of a
polynucleotide chain. A hydrolysis reaction breaks phosphodiester bonds at either the 3' or
the 5' end. Their close relatives are the endonucleases, which cleave phosphodiester bonds in
the middle of a polynucleotide chain. Eukaryotes and prokaryotes have three types of
exonucleases involved in the normal turnover of mRNA: a 5' to 3' exonuclease, which is a
decapping-dependent protein; a 3' to 5' exonuclease, which is a decapping-independent protein; and a poly(A)-specific
3' to 5' exonuclease.
In both archaebacteria and eukaryotes, one of the main routes of RNA degradation is performed
by the multi-protein exosome complex, which consists largely of 3' to 5' exoribonucleases.
Significance to polymerase
During transcriptional termination by RNA polymerase II (Pol II), a 5' exonuclease
(human gene Xrn2) degrades the newly formed transcript downstream of the polyadenylation
site. When the exonuclease catches up with Pol II, the polymerase is dislodged and
transcription terminates.
In DNA replication, DNA polymerase I removes the RNA primer and then synthesizes DNA in
its place. DNA polymerase I has a 5' to 3' exonuclease activity, which removes the primer,
and a 3' to 5' exonuclease activity, which is used in editing and proofreading DNA for
errors.

E. coli types

WRN Exonuclease with active sites in yellow


In 1971, Lehman IR discovered exonuclease I in E. coli. Since that time, numerous others
have been discovered, including exonucleases II, III, IV, V, VI, VII, and VIII. Each type
of exonuclease has a specific function or requirement.
Exonuclease I breaks apart single-stranded DNA in the 3' to 5' direction, releasing
deoxyribonucleoside 5'-monophosphates one after another. It does not cleave DNA strands
whose terminal 3'-OH groups are blocked by phosphoryl or acetyl groups.
Exonuclease II is associated with DNA polymerase I, which contains a 5' exonuclease that
clips off the RNA primer immediately upstream of the site of DNA synthesis in the 5' to 3'
direction.
Exonuclease III has four catalytic activities:

3' to 5' exodeoxyribonuclease activity, which is specific for double-stranded DNA

RNase activity

3' phosphatase activity

AP endonuclease activity (later found to be called endonuclease II).

Exonuclease IV hydrolyzes oligonucleotides to nucleoside 5'-monophosphates. This
exonuclease requires Mg2+ in order to function and works at higher temperatures than
exonuclease I.

Exonuclease V is a 3' to 5' hydrolyzing enzyme that acts on linear double-stranded DNA and
single-stranded DNA, and requires Ca2+.
Exonuclease VIII is a 5' to 3' dimeric protein that does not require ATP or any gaps or
nicks in the strand, but requires a free 5' OH group to carry out its function.
Discoveries in humans
The 3' to 5' human type endonuclease is known to be essential for the proper processing of
histone pre-mRNA, in which U7 snRNP directs the single cleavage process. Following the
removal of the downstream cleavage product (DCP), a 5' to 3' exonuclease continues to
break down the product until it is completely degraded. This allows the nucleotides to be
recycled. The 5' to 3' exonuclease is linked to a co-transcriptional cleavage (CoTC)
activity that acts as a precursor to develop a free, unprotected 5' end, so the exonuclease
can remove and degrade the downstream cleavage product (DCP). This initiates
transcriptional termination and prevents unneeded DNA or RNA strands from building up in
the cell.
Discoveries in yeast
CCR4-NOT is a general transcription regulatory complex in yeast that is found to be associated
with mRNA metabolism, transcription initiation, and mRNA degradation. CCR4 has been found
to contain RNA and single-stranded DNA 3' to 5' exonuclease activities. Another component
associated with the CCR4 complex is CAF1 protein, which has been found to contain 3' to 5'
or 5' to 3' exonuclease domains in Mus musculus and Caenorhabditis elegans. This protein
has not been found in yeast, which suggests that it is likely to have an abnormal
exonuclease domain like the one seen in a metazoan. Yeast contains the Rat1 and Xrn1
exonucleases. Rat1 works just like the human type (Xrn2), and Xrn1 functions in the
cytoplasm in the 5' to 3' direction to degrade RNAs (pre-5.8S and 25S rRNAs) in the
absence of Rat1.
Hybrid vector
1. Explain the phagemid.

A phagemid or phasmid is a type of cloning vector developed as a hybrid of the
filamentous phage M13 and plasmids to produce a vector that can grow as a plasmid,
and also be packaged as single stranded DNA in viral particles.
Phagemids contain an origin of replication (ori) for double stranded replication, as well
as an f1 ori to enable single stranded replication and packaging into phage particles.
Many commonly used plasmids contain an f1 ori and are thus phagemids. Similarly to a
plasmid, a phagemid can be used to clone DNA fragments and be introduced into a
bacterial host by a range of techniques (transformation, electroporation).

However, infection of a bacterial host containing a phagemid with a 'helper' phage, for
example VCSM13 or M13K07, provides the necessary viral components to enable
single stranded DNA replication and packaging of the phagemid DNA into phage
particles.

These are secreted through the cell wall and released into the medium. Filamentous
phage retard bacterial growth but, in contrast to lambda and T7 phage, are not generally
lytic.

Helper phage are usually engineered to package less efficiently than the phagemid so
that the resultant phage particles contain predominantly phagemid DNA.

F1 Filamentous phage infection requires the presence of a pilus so only bacterial hosts
containing the F-plasmid or its derivatives can be used to generate phage particles.

Prior to the development of cycle sequencing, phagemids were used to generate single
stranded DNA template for sequencing purposes.

Today phagemids are still useful for generating templates for site-directed mutagenesis.
Detailed characterisation of the filamentous phage life cycle and structural features led
to the development of phage display technology, in which a range of peptides and
proteins can be expressed as fusions to phage coat proteins and displayed on the viral
surface.

The displayed peptides and polypeptides are associated with the corresponding coding
DNA within the phage particle, and so this technique lends itself to the study of
protein-protein interactions and other ligand/receptor combinations.

2. What are cosmid vectors, and what are their features?

Cosmids were developed in the late 1970s and have been improved significantly since.

They are predominantly plasmids with a bacterial oriV, an antibiotic selection marker
and a cloning site, but they carry one, or more recently two cos sites derived from
bacteriophage lambda.

Depending on the particular aim of the experiment broad host range cosmids, shuttle
cosmids or 'mammalian' cosmids (linked to SV40 oriV and mammalian selection
markers) are available.

The loading capacity of cosmids varies depending on the size of the vector itself but
usually lies around 40-45 kb.

The cloning procedure involves the generation of two vector arms which are then
joined to the foreign DNA. Selection against wild-type cosmid DNA is done simply by
size exclusion. Cosmids, however, always form colonies and not plaques. Clone density
is also much lower, at around 10^5 to 10^6 cfu per µg of ligated DNA.
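
Size exclusion and the dependence of loading capacity on vector size can be sketched with simple arithmetic. The packaging window used here (roughly 38 to 52 kb of DNA between cos sites) is an assumed approximate textbook figure, not a value taken from these notes, and the function names are hypothetical.

```python
# Assumption: lambda heads accept roughly 38-52 kb of DNA between cos sites.
PACKAGING_MIN_KB = 38.0
PACKAGING_MAX_KB = 52.0


def insert_window(vector_kb: float) -> tuple:
    """Smallest and largest insert (kb) that still gives a packageable cosmid."""
    smallest = max(PACKAGING_MIN_KB - vector_kb, 0.0)
    largest = max(PACKAGING_MAX_KB - vector_kb, 0.0)
    return smallest, largest


def is_packageable(vector_kb: float, insert_kb: float) -> bool:
    """True if vector plus insert falls inside the assumed packaging window."""
    total = vector_kb + insert_kb
    return PACKAGING_MIN_KB <= total <= PACKAGING_MAX_KB


print(insert_window(7.0))         # window for a hypothetical 7 kb cosmid
print(is_packageable(7.0, 0.0))   # empty (wild-type) cosmid: too small
print(is_packageable(7.0, 40.0))  # cosmid with a 40 kb insert
```

Under this assumption an empty cosmid is never packaged, which is the size exclusion described above, and a smaller vector leaves room for a larger insert.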

After the construction of recombinant lambda or cosmid libraries the total DNA is
transferred into an appropriate E. coli host via a technique called in vitro packaging.

The necessary packaging extracts are derived from E. coli cI857 lysogens (red- gam- Sam,
with Dam (head assembly) and Eam (tail assembly) mutations respectively).

These extracts will recognize and package the recombinant molecules in vitro,
generating either mature phage particles (lambda-based vectors) or recombinant
plasmids contained in phage shells (cosmids).

These differences are reflected in the different infection frequencies seen in favour of
lambda-replacement vectors. This compensates for their slightly lower loading capacity.
Phage libraries are also easier to store and screen than cosmid (colony) libraries.

Target DNA: the genomic DNA to be cloned has to be cut into the appropriate size
range of restriction fragments.

This is usually done by partial restriction followed by either size fractionation or
dephosphorylation (using calf-intestine phosphatase) in order to avoid chromosome
scrambling, i.e. the ligation of physically unlinked fragments.

The cDNA synthesis reaction

The synthesis of double-stranded cDNA suitable for insertion into a cloning vector involves three
major steps: (i) first-strand DNA synthesis on the mRNA template, carried out with a reverse
transcriptase; (ii) removal of the RNA template; and (iii) second-strand DNA synthesis using the
first DNA strand as a template, carried out with a DNA-dependent DNA polymerase, such as
E. coli DNA polymerase I. All DNA polymerases, whether they use RNA or DNA as the template,
require a primer to initiate strand synthesis.
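
In sequence terms, steps (i) and (iii) are two rounds of complementary copying. The sketch below is a toy illustration only: the function names and the short mRNA are made up for the example, and priming and the enzymes themselves are not modelled.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand(mrna: str) -> str:
    """First-strand cDNA (written 5'->3') complementary to an mRNA (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(cdna: str) -> str:
    """Second strand (written 5'->3') complementary to the first cDNA strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


mrna = "AUGGCCUAAAAAA"           # hypothetical mRNA ending in a short poly(A) tail
cdna1 = first_strand(mrna)       # begins with oligo-dT, as if primed at the tail
cdna2 = second_strand(cdna1)     # same sequence as the mRNA, with T in place of U

print(cdna1)
print(cdna2)
```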

Development of cDNA cloning strategies

The first reports of cDNA cloning were published in the mid-1970s and were all based on the
homopolymer tailing technique. Of several alternative methods, the one that became the most
popular was that of Maniatis et al. (1976). This involved the use of an oligo-dT primer annealing
at the polyadenylate tail of the mRNA to prime first-strand cDNA synthesis, and took advantage
of the fact that the first cDNA strand has the tendency to transiently fold back on itself, forming a
hairpin loop, resulting in self-priming of the second strand (Efstratiadis et al. 1976).
After the synthesis of the second DNA strand, this loop must be cleaved with a
single-strand-specific nuclease, e.g. S1 nuclease, to allow insertion into the cloning vector
(Fig. 6.5). A serious disadvantage of the hairpin method is that cleavage with S1 nuclease
results in the loss of a certain amount of sequence at the 5' end of the clone. This strategy has
therefore been superseded by other methods in which the second strand is primed in a separate
reaction. One of the simplest strategies is that of Land et al. (1981). After first-strand
synthesis, which is primed with an oligo-dT primer as usual, the cDNA is tailed with a string of
cytidine residues using the enzyme terminal transferase. This artificial oligo-dC tail is then
used as an annealing site for a synthetic oligo-dG primer, allowing synthesis of the second
strand.
Using this method, Land et al. (1981) were able to isolate a full-length cDNA corresponding to the
chicken lysozyme gene. However, the efficiency can be lower for other cDNAs (e.g. Cooke et al.
1980). For cDNA expression libraries, it is advantageous if the cDNA can be inserted into the
vector in the correct orientation. With the self-priming method, this can be achieved by adding
a synthetic linker to the double-stranded cDNA molecule before the hairpin loop is cleaved
(e.g. Kurtz & Nicodemus 1981; Fig. 6.7a). Where the second strand is primed separately,
directional cloning can be achieved using an oligo-dT primer containing a linker sequence
(e.g. Coleclough & Erlitz 1985; Fig. 6.7b). An alternative is to use primers for cDNA synthesis
that are already linked to a plasmid (Fig. 6.7c). This strategy was devised by Okayama and Berg
(1982) and has two further notable characteristics. First, full-length cDNAs are preferentially
obtained because an RNA-DNA hybrid molecule, the result of first-strand synthesis, is the
substrate for a terminal transferase reaction. A cDNA that does not extend to the end of the
mRNA will present a shielded 3'-hydroxyl group, which is a poor substrate for tailing. Secondly,
the second-strand synthesis step is primed by nicking the RNA at multiple sites with RNase H.
Second-strand synthesis therefore occurs by a nick-translation type of reaction, which is highly
efficient. Simpler cDNA cloning strategies incorporating replacement synthesis of the second
strand are widely used (e.g. Gubler & Hoffman 1983, Lapeyre & Amalric 1985). The
Gubler-Hoffman reaction.

Genomic Library
A genomic library is a population of host bacteria, each of which carries a DNA
molecule that was inserted into a cloning vector, such that the collection of cloned DNA
molecules represents the entire genome of the source organism. This term also represents
the collection of all of the vector molecules, each carrying a piece of the chromosomal
DNA of the organism, prior to the insertion of these molecules into the host cells.

The genomic library is normally made with λ phage vectors, instead of plasmid vectors, for
the following reasons:

The entire human genome is about 3 x 10^9 bp long, while a plasmid or λ phage vector may
carry a fragment of up to 20 kb. This would require 1.5 x 10^5 recombinant plasmids or λ
phages. When plating E. coli colonies on a 3" petri dish, the maximum number that allows
isolation of individual colonies is about 200 colonies per dish. Thus, at least 700 petri
dishes are required to construct a human genomic library. By contrast, as many as 5 x 10^4
λ phage plaques can be screened on a typical petri dish, so the same 1.5 x 10^5 clones fit
on about 3 petri dishes. Another advantage of the λ phage vector is that its
transformation efficiency is about 1000 times higher than that of the plasmid vector.
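
The dish counts follow from simple division of the figures quoted above (genome size, insert size, colonies or plaques per dish). The sketch below only redoes that arithmetic; the variable names are made up for this example, and the results are minimum counts for covering the genome once, with no allowance for the extra clones a real library needs.

```python
import math

GENOME_BP = 3_000_000_000    # human genome, about 3 x 10^9 bp
INSERT_BP = 20_000           # up to 20 kb per recombinant
COLONIES_PER_DISH = 200      # well-separated E. coli colonies per dish
PLAQUES_PER_DISH = 50_000    # lambda plaques that can be screened per dish

clones_needed = GENOME_BP // INSERT_BP
colony_dishes = math.ceil(clones_needed / COLONIES_PER_DISH)
plaque_dishes = math.ceil(clones_needed / PLAQUES_PER_DISH)

print("recombinants needed:", clones_needed)
print("dishes with plasmid colonies:", colony_dishes)
print("dishes with lambda plaques:", plaque_dishes)
```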

Preparation of a DNA Library

DNA library is a collection of cloned DNA fragments. There are two types of DNA
library:

The genomic library contains DNA fragments representing the entire genome of an
organism.

The cDNA library contains only complementary DNA molecules synthesized from
mRNA molecules in a cell.

Figure 9-B-1. Preparation of the genomic library using λ phage vectors. It is basically the
cloning of all DNA fragments representing the entire genome.

cDNA Library

The advantage of a cDNA library is that it contains only the coding region of a genome. To
prepare a cDNA library, the first step is to isolate the total mRNA from the cell type of
interest. Because eukaryotic mRNAs carry a poly-A tail, they can easily be
separated. Then the enzyme reverse transcriptase is used to synthesize a DNA strand
complementary to each mRNA molecule. After the single-stranded DNA molecules are
converted into double-stranded DNA molecules by DNA polymerase, they are inserted
into vectors and cloned.

Probes

A probe is a piece of DNA or RNA used to detect specific nucleic acid sequences by
hybridization (binding of two nucleic acid chains by base pairing). Probes are
radioactively labeled so that the hybridized nucleic acid can be identified by
autoradiography.

The size of probes ranges from a few nucleotides to hundreds of kilobases. Long probes
are usually made by cloning. Originally they may be double-stranded, but the working
probes must be single-stranded. Short probes (oligonucleotide probes) can be made by
chemical synthesis. They are single-stranded.

Suppose we have cloned a specific gene in yeast and want to find its homologous gene in
humans. We may then use the specific yeast gene as a probe to detect its homologous gene
in the human genomic library. On the other hand, if we know the conserved sequence
in the specific gene between yeast and human, we may use oligonucleotide probes
containing only the conserved sequence. Typically, an oligonucleotide about 20
nucleotides long is sufficient to screen a library.
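
A rough way to see why about 20 nucleotides is enough is to estimate how often a given sequence would turn up by chance. The sketch below assumes a random genome with equal base frequencies and counts one strand only; both are simplifying assumptions made for this example, and the function names are hypothetical.

```python
def expected_random_hits(probe_length: int, genome_bp: float) -> float:
    """Expected chance occurrences of one specific sequence of the given length.

    Assumes a random sequence with equal base frequencies, one strand only.
    """
    return genome_bp / 4 ** probe_length


def min_unique_length(genome_bp: float) -> int:
    """Shortest probe length expected to occur less than once by chance."""
    n = 1
    while expected_random_hits(n, genome_bp) >= 1:
        n += 1
    return n


HUMAN_GENOME_BP = 3e9
print(expected_random_hits(20, HUMAN_GENOME_BP))  # far below one chance match
print(min_unique_length(HUMAN_GENOME_BP))         # shortest length under this model
```

A 20-mer therefore leaves a comfortable margin over the shortest length this simple model would allow.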

In some cases, we know the partial sequence of a protein and want to detect its
gene in the library. Then we may synthesize oligonucleotide probes based on the known
peptide sequence. Since an amino acid may be encoded by several DNA triplets, many
different oligonucleotide probes are often needed.

Figure 9-B-2. The relationship between a peptide and all possible DNA sequences. In
this example, the peptide sequence Leu-Phe-Tyr-Met-His-Asp corresponds to 96 (= 6 x 2
x 2 x 1 x 2 x 2) possible DNA sequences.
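
The count in Figure 9-B-2 is the product of the number of codons for each residue. A minimal sketch using the standard genetic code is given below; the dictionary and function names are made up for this example.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AMINO_ACID = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def probe_degeneracy(peptide: str) -> int:
    """Number of DNA sequences that can encode a hyphen-separated peptide."""
    return prod(CODONS_PER_AMINO_ACID[residue] for residue in peptide.split("-"))


print(probe_degeneracy("Leu-Phe-Tyr-Met-His-Asp"))  # 6 x 2 x 2 x 1 x 2 x 2 = 96
```

Choosing a stretch rich in Met and Trp, which have a single codon each, keeps the number of probes that must be synthesized small.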

Screening

Once a particular DNA fragment is identified, it can be isolated and amplified to
determine its sequence. If we know the partial sequence of a gene and want to determine
its entire sequence, the probe should contain the known sequence so that the detected
DNA fragment may contain the gene of interest.

Figure 9-B-3. Screening of a specific DNA fragment. After recombinant λ virions form
plaques on the lawn of E. coli, the nitrocellulose filter (membrane) is placed on the
surface of the petri dish to pick up λ phages from each plaque. Then, the filter is
incubated in an alkaline solution to disrupt the virions and release the encapsulated DNA,
which is subsequently denatured. Next, the probe is added to hybridize with the target
DNA fragment, whose position may be displayed by autoradiography.

UNIT III
Biology of genetic engineering Plasmids used for E. coli vectors, based on
bacteriophage and M-13 phage vectors. Eukaryotic vectors Yeast vectors,
animal vectors, plant vectors. Prokaryotic and Eukaryotic hosts.

1. Give a general account of prokaryotic vectors.


A plasmid is a DNA molecule that is separate from, and can replicate independently of,
the chromosomal DNA. Plasmids are double-stranded and, in many cases, circular.
They usually occur naturally in bacteria, but are sometimes found in eukaryotic
organisms (e.g., the 2-micrometre-ring in Saccharomyces cerevisiae).
Plasmid sizes vary from 1 to over 1,000 kilobase pairs (kbp). The number of identical
plasmids in a single cell can range anywhere from one to even thousands under some
circumstances. Plasmids can be considered part of the mobilome because they are often
associated with conjugation, a mechanism of horizontal gene transfer.
The term plasmid was first introduced by the American molecular biologist Joshua
Lederberg in 1952.
Plasmids are considered transferable genetic elements, or "replicons", capable of
autonomous replication within a suitable host.
Plasmids can be found in all three major domains: Archaea, Bacteria and Eukarya. Similar
to viruses, plasmids are not considered a form of "life" as it is currently defined. Unlike
viruses, plasmids are "naked" DNA and do not encode genes necessary to encase the
genetic material for transfer to a new host, though some classes of plasmids encode the
sex pilus necessary for their own transfer.
Plasmid host-to-host transfer requires direct, mechanical transfer by conjugation or
changes in host gene expression allowing the intentional uptake of the genetic element by
transformation.
Microbial transformation with plasmid DNA is neither parasitic nor symbiotic in nature,
because each implies the presence of an independent species living in a commensal or
detrimental state with the host organism. Rather, plasmids provide a mechanism for
horizontal gene transfer within a population of microbes and typically provide a selective
advantage under a given environmental state.

Plasmids may carry genes that provide resistance to naturally occurring antibiotics in a
competitive environmental niche, or alternatively the proteins produced may act as toxins
under similar circumstances.
Plasmids also can provide bacteria with an ability to fix elemental nitrogen or to degrade
recalcitrant organic compounds which provide an advantage when nutrients are scarce.

Plasmids relate to the host bacterium in one of two ways: non-integrating plasmids
replicate independently of the host chromosome, whereas episomes can integrate into
the host chromosome.
Plasmids used in genetic engineering are called vectors. Plasmids serve as important tools
in genetics and biotechnology labs, where they are commonly used to multiply (make
many copies of) or express particular genes.
Many plasmids are commercially available for such uses. The gene to be replicated is
inserted into copies of a plasmid containing genes that make cells resistant to particular
antibiotics and a multiple cloning site (MCS, or polylinker), which is a short region
containing several commonly used restriction sites allowing the easy insertion of DNA
fragments at this location.
Next, the plasmids are inserted into bacteria by a process called transformation. Then, the
bacteria are exposed to the particular antibiotics. Only bacteria which take up copies of
the plasmid survive, since the plasmid makes them resistant.

In particular, the protecting genes are expressed (used to make a protein) and the
expressed protein breaks down the antibiotics.
In this way the antibiotics act as a filter to select only the modified bacteria. Now these
bacteria can be grown in large amounts, harvested and lysed (often using the alkaline
lysis method) to isolate the plasmid of interest.
Another major use of plasmids is to make large amounts of proteins. In this case,
researchers grow bacteria containing a plasmid harboring the gene of interest.
Just as the bacterium produces proteins to confer its antibiotic resistance, it can also be
induced to produce large amounts of proteins from the inserted gene.
This is a cheap and easy way of mass-producing a gene or the protein it codes for,
for example, insulin or even antibiotics.
However, a plasmid can only contain inserts of about 1-10 kbp. To clone longer lengths
of DNA, lambda phage with lysogeny genes deleted, cosmids, bacterial artificial
chromosomes or yeast artificial chromosomes could be used.
2. What are the applications of vectors?
Disease Models
Plasmids were historically used to genetically engineer the embryonic stem cells of rats
in order to create rat genetic disease models.
The limited efficiency of plasmid based techniques precluded their use in the creation of
more accurate human cell models.
Fortunately, developments in Adeno-associated virus recombination techniques, and Zinc
finger nucleases, have enabled the creation of a new generation of isogenic human
disease models.
Gene therapy
The success of some strategies of gene therapy depends on the efficient insertion of
therapeutic genes at the appropriate chromosomal target sites within the human genome,
without causing cell injury, oncogenic mutations (cancer) or an immune response.
Plasmid vectors are one of many approaches that could be used for this purpose. Zinc
finger nucleases (ZFNs) offer a way to cause a site-specific double strand break to the
DNA genome and cause homologous recombination.

This makes targeted gene correction a possibility in human cells. Plasmids encoding ZFN
could be used to deliver a therapeutic gene to a pre-selected chromosomal site with a
frequency higher than that of random integration.
Although the practicality of this approach to gene therapy has yet to be proven, some
aspects of it could be less problematic than the alternative viral-based delivery of
therapeutic genes.
3. Define episomes and their role in genetics.
An episome is a portion of genetic material that can exist independent of the main body
of genetic material (called the chromosome) at some times, while at other times is able to
integrate into the chromosome.
Examples of episomes include insertion sequences and transposons. Viruses are another
example of an episome. Viruses that integrate their genetic material into the host
chromosome enable the viral nucleic acid to be produced along with the host genetic
material in a nondestructive manner. As an autonomous unit (i.e., existing outside of the
chromosome) however, the viral episome destroys the host cell as it commandeers the
host's replication apparatuses to make new copies of itself.
Another example of an episome is called the F factor. The F factor determines whether
genetic material in the chromosome of one organism is transferred into another organism.
The F factor can exist in three states that are designated as F+, Hfr, and F prime.
F+ refers to the F factor that exists independently of the chromosome. Hfr stands for
high frequency of recombination, and refers to a factor that has integrated into the host
chromosome. The F prime factor exists outside the chromosome, but has a portion of
chromosomal DNA attached to it.
An episome is distinguished from other pieces of DNA that are independent of the
chromosome (i.e., plasmids) by its large size.
Plasmids are different from episomes, as plasmid DNA cannot link up with chromosomal
DNA. The plasmid carries all the information necessary for its independent replication.
While not necessary for bacterial survival, plasmids can be advantageous to a bacterium.
For example, plasmids can carry genes that confer resistance to antibiotics or toxic
metals, genes that allow the bacterium to degrade compounds that it otherwise could not
use as food, and even genes that allow the bacterium to infect an animal or plant cell.
Such traits can be passed on to another bacterium.

Transposons and insertion sequences are episomes. These are also known as mobile
genetic elements. They are capable of existing outside of the chromosome. They are also
designed to integrate into the chromosome following their movement from one cell to
another. Like plasmids, transposons can carry other genetic material with them, and so
pass on resistance to the cells they enter. Class 1 transposons, for example, contain drug
resistance genes. Insertion sequences do not carry extra genetic material. They code for
only the functions involved in their insertion into chromosomal DNA.
Transposons and insertion sequences are useful tools to generate changes in the DNA
sequence of host cells. These genetic changes that result from the integration and the exit
of the mobile elements from DNA, are generically referred to as mutations. Analysis of
the mobile element can determine what host DNA is present, and the analysis of the
mutated host cell can determine whether the extra or missing DNA is important for the
functioning of the cell.
4. What are yeast plasmids, and what are their types?
Other types of plasmids are often related to yeast cloning vectors that include:

Yeast integrative plasmid (YIp), yeast vectors that rely on integration into the host
chromosome for survival and replication, and are usually used when studying the
functionality of a solo gene or when the gene is toxic. Also connected with the gene
URA3, that codes an enzyme related to the biosynthesis of pyrimidine nucleotides (T, C);
Yeast Replicative Plasmid (YRp), which transport a sequence of chromosomal DNA that
includes an origin of replication. These plasmids are less stable, as they can "get lost"
during the budding.
5. Describe the pUC vector.
o Plasmids are circular, double-stranded DNA molecules that exist in bacteria and
in the nuclei of some eukaryotic cells.
o They can replicate independently of the host cell. The size of plasmids ranges
from a few kb to near 100 kb.

Figure 9-A-3. A typical plasmid vector. It contains a polylinker with recognition sites for
several different restriction enzymes, an ampicillin-resistance gene (ampr) for selective
amplification, and a replication origin (ORI) for proliferation in the host cell.
A plasmid vector is made from natural plasmids by removing unnecessary segments and
adding essential sequences.
To clone a DNA sample, the same restriction enzyme must be used to cut both the vector
and the DNA sample.
Therefore, a vector usually contains a sequence (polylinker) with recognition sites for several
restriction enzymes, so that the vector can be used for cloning a variety of DNA samples.
A plasmid vector must also contain a drug-resistance gene for selective amplification.
After the vector enters into a host cell, it may proliferate with the host cell.
However, since the transformation efficiency of plasmids in E. coli is very low, most E.
coli cells that proliferate in the medium would not contain the plasmids.
Therefore, we must find a way to allow only the transformed E. coli to proliferate.
Typically, antibiotics are used to kill E. coli cells which do not contain the vectors.
The transformed E. coli cells are protected by the ampicillin-resistance gene (ampr), which
expresses the enzyme β-lactamase to inactivate the antibiotic ampicillin.
1. Write about the Lambda phage vector.
λ phages are viruses that can infect bacteria.

The major advantage of the λ phage vector is its high transformation efficiency, about
1000 times more efficient than the plasmid vector.

Figure 9-A-4. Schematic drawing of DNA cloning using λ phages as vectors. The DNA to be
cloned is first inserted into the λ DNA, replacing a nonessential region.
Then, by an in vitro assembly system (described below), the λ virion carrying the
recombinant DNA can be formed.
The λ genome is 49 kb in length and can carry up to 25 kb of foreign DNA.
Lambda phage vector types
Plasmid vectors described in the previous section are often used for cloning DNA
segments of small size (up to 10 kilobases). However, while preparing a genomic library
in a eukaryote, the cloned fragments should be large enough to contain a whole gene.
This also allows the whole genome to be cloned in a number of clones that is not
unreasonably large and can therefore be screened without serious difficulty.
The above properties and other requirements of cloning whole genomes in eukaryotes
are fulfilled by the phage lambda and cosmid vectors, the former permitting cloning of
segments up to 20-25 kb long (kb = kilobases) and the latter accommodating segments up to
45 kb long. Phage lambda (λ), however, is easier and more efficient for making genomic
and cDNA libraries.
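
The size limits quoted above can be collected into a small lookup. This is a sketch only: the upper limits are the approximate figures given in these notes (10 kb for plasmids, 25 kb for phage lambda, 45 kb for cosmids), and the function name is hypothetical.

```python
# Approximate maximum insert sizes (kb) as quoted in these notes, smallest first.
VECTOR_CAPACITY_KB = [
    ("plasmid", 10),
    ("lambda phage", 25),
    ("cosmid", 45),
]


def smallest_suitable_vector(insert_kb: float):
    """Return the first vector type whose quoted capacity fits the insert.

    Returns None when the insert exceeds all three limits; larger inserts
    need other vector types not covered by this table.
    """
    for name, max_kb in VECTOR_CAPACITY_KB:
        if insert_kb <= max_kb:
            return name
    return None


for size in (5, 20, 40, 100):
    print(size, "kb ->", smallest_suitable_vector(size))
```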
(a) λgt10 and λgt11. λgt10 and λgt11 are modified lambda phages designed to clone
cDNA fragments. The major difference between these two vectors is that λgt11 is an
expression vector, where inserted DNA is expressed as a β-galactosidase fusion protein.
λgt10 is a 43 kb double-stranded DNA for cloning fragments that are only 7 kb in
length. The insertion of DNA inactivates the cI+ (repressor) gene, generating a cI-
bacteriophage. Non-recombinant λgt10 is cI+ and forms cloudy plaques on an appropriate
E. coli host, while recombinant cI- λgt10 forms clear plaques, permitting screening of
recombinant plaques.
Further, in an E. coli strain carrying the hflA150 mutation (high frequency lysogeny
mutation), only cI- phage will form plaques, because cI+ phage will form lysogens
(integrate into the bacterial genome) and will not undergo lysis to form any plaques.
Recombinant λgt10 plaques can thus be easily selected.
λgt11 is a 43.7 kb double-stranded λ phage for cloning DNA segments that are less
than 6 kb in length (usually cDNA). Foreign DNA can be expressed as
β-galactosidase fusion proteins. Recombinant λgt11 can be screened using either).
The recombinant λgt11 becomes gal-, while non-recombinant λgt11 remains gal+, so that
an appropriate E. coli host with recombinant phage (gal-) will form white or clear
plaques and that with non-recombinant phage (gal+) will form blue plaques, permitting
screening in the presence of IPTG (inducer) and Xgal (substrate).

Section C
3. What are the types of plasmid vectors?

Overview of bacterial conjugation


One way of grouping plasmids is by their ability to transfer to other bacteria. Conjugative
plasmids contain so-called tra genes, which carry out the complex process of conjugation,
the transfer of plasmids to another bacterium.
Non-conjugative plasmids are incapable of initiating conjugation, hence they can only be
transferred with the assistance of conjugative plasmids, by 'accident'.
An intermediate class of plasmids are mobilizable, and carry only a subset of the genes
required for transfer. They can 'parasitize' a conjugative plasmid, transferring at high
frequency only in its presence. Plasmids are now being used to manipulate DNA and may
possibly be a tool for curing many diseases.
It is possible for plasmids of different types to coexist in a single cell. Several different
plasmids have been found in E. coli. But related plasmids are often incompatible, in the
sense that only one of them survives in the cell line, due to the regulation of vital plasmid
functions. Therefore, plasmids can be assigned into compatibility groups.
Another way to classify plasmids is by function. There are five main classes:

Fertility (F) plasmids, which contain tra genes. They are capable of conjugation (transfer
of genetic material between bacteria which are touching).

Resistance (R) plasmids, which contain genes that can build a resistance against
antibiotics or poisons and help bacteria produce pili. Historically known as R-factors,
before the nature of plasmids was understood.

Col plasmids, which contain genes that code for (determine the production of)
bacteriocins, proteins that can kill other bacteria.

Degradative plasmids, which enable the digestion of unusual substances, e.g., toluene or
salicylic acid.

Virulence plasmids, which turn the bacterium into a pathogen (one that causes disease).

Plasmids can belong to more than one of these functional groups.

Plasmids that exist only as one or a few copies in each bacterium are, upon cell division,
in danger of being lost in one of the segregating bacteria. Such single-copy plasmids have
systems which attempt to actively distribute a copy to both daughter cells.
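
The risk of loss described above can be illustrated with a small calculation. This is only a sketch under an assumed model that is not stated in these notes: each plasmid copy is taken to go to either daughter cell independently with probability 1/2, and the function name is hypothetical.

```python
def loss_probability(copies: int) -> float:
    """Chance that one of the two daughter cells receives no plasmid when
    `copies` plasmid copies are partitioned at random (assumed model)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Either daughter may end up empty, but both cannot be empty at once,
    # so the two cases are mutually exclusive and their probabilities add.
    return 2 * 0.5 ** copies


for n in (1, 2, 3, 20):
    print(n, loss_probability(n))
```

Under this assumption a single-copy plasmid is always lost from one daughter, whereas a plasmid present in about twenty copies is lost only rarely, which is consistent with low-copy plasmids needing an active partitioning system.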

Some plasmids or microbial hosts include an addiction system or "postsegregational
killing system (PSK)", such as the hok/sok (host killing/suppressor of killing) system of
plasmid R1 in Escherichia coli.

This variant produces both a long-lived poison and a short-lived antidote. Several types
of plasmid addiction systems (toxin/ antitoxin, metabolism-based, ORT systems) were
described in the literature and used in biotechnical (fermentation) or biomedical (vaccine
therapy) applications.

Daughter cells that retain a copy of the plasmid survive, while a daughter cell that fails to
inherit the plasmid dies or suffers a reduced growth-rate because of the lingering poison
from the parent cell. Finally, the overall productivity could be enhanced.

PLASMID VECTORS
Plasmids are:
Circular, autonomous molecules of DNA.
Found naturally in most bacterial (and some other) species.
Size: 1.5 - 300 kilobases.
Function: carry non-essential (dispensable) genes, e.g. antibiotic resistance, toxin
production.
But "cryptic" plasmids have no known function!
Plasmids can be conjugative or non-conjugative (conjugation is generally not required
in GM).
Plasmids can be mobilizable or non-mobilizable (non-mobilizable plasmids are
preferred as they are less likely to "escape" from host cells).

Plasmids can be relaxed (multiple copies per host cell) or stringent (1-3 copies per host
cell).
For GM work we want: small, relaxed, non-conjugative, non-mobilizable plasmids with
good markers and unique restriction sites.
THREE EXAMPLES OF NATURAL PLASMIDS
PLASMID | SIZE (kb) | RELAXED (AMPLIFIED) | SINGLE SITES FOR RESTRICTION ENZYMES | MARKER GENES FOR SELECTING TRANSFORMANTS | ADDITIONAL MARKER GENES SHOWING INSERTIONAL INACTIVATION
pSC101 | 6.5 | NO | XhoI, EcoRI, PvuII, HincII | Tetracycline resistance | -
 | | | HindIII, BamHI, SalI | - | Tetracycline resistance
ColE1 | 8.0 | YES | EcoRI | Immunity to colicin E1 | Colicin E1 production
RSF2124 | 11.0 | YES | EcoRI, BamHI | Ampicillin resistance | -

A "vector" is an agent that can carry a DNA fragment into a host cell. If it is used for
reproducing the DNA fragment, it is called a "cloning vector". If it is used for
expressing a certain gene in the DNA fragment, it is called an "expression vector".
Commonly used vectors include plasmid, Lambda phage, cosmid and yeast artificial
chromosome (YAC).

4. Explain pBR322. Draw the diagram.

pBR322 is a plasmid and for a time was one of the most commonly used E. coli cloning
vectors. Created in 1977, it was named after its creators: p stands for plasmid, and BR
for Bolivar and Rodriguez.
pBR322 is 4361 base pairs in length and contains a replicon region (source plasmid
pMB1), the ampR gene, encoding the ampicillin resistance protein (source plasmid
RSF2124) and the tetR gene, encoding the tetracycline resistance protein (source plasmid
pSC101).
The plasmid has unique restriction sites for more than forty restriction enzymes. 11 of
these 40 sites lie within the tetR gene.
There are 2 sites for restriction enzymes HindIII and ClaI within the promoter of the tetR
gene. There are 6 key restriction sites inside the ampR gene.
The origin of replication or ori site in this plasmid is pMB1 (a close relative of ColE1).
The ori encodes two RNAs (RNAI and RNAII) and one protein (called Rom or Rop).
The circular sequence is numbered such that 0 is the middle of the unique EcoRI site and
the count increases through the tet genes.
The ampicillin resistance gene is a penicillin beta-lactamase. Promoters P1 and P3 are for
the beta-lactamase gene. P3 is the natural promoter, and P1 is artificially created by the
ligation of two different DNA fragments to create pBR322.
P2 is in the same region as P1, but it is on the opposite strand and initiates transcription in
the direction of the tetracycline resistance gene.
Bits of the pBR322 sequence were used to create the "dinosaur" DNA in the novel
Jurassic Park.

5. Give a detailed account of the pUC19 vector.

pUC19 is a plasmid cloning vector created by Messing and co-workers at the University
of California. The p in the name stands for plasmid and UC represents the university in
which it was created. It is a circular double-stranded DNA of 2686 base pairs.

pUC19 is one of the most widely used vector molecules because the recombinants, the cells
into which foreign DNA has been introduced, can be easily distinguished from the
non-recombinants by the colour of their colonies on growth media.

pUC18 is similar to pUC19, but the MCS region is reversed.

Components
It has one ampR gene (ampicillin resistance gene) and an N-terminal fragment of the
β-galactosidase (lacZ) gene of E. coli.
The multiple cloning site (MCS) region is split into the lacZ gene (codons 6-7 of lacZ
are replaced by the MCS), where restriction sites for many restriction endonucleases
are present.
The ori site, or replicon (rep), is derived from the plasmid pMB1. The pUC vector is small
but has a high copy number.
The high copy number of pUC plasmids is a result of the lack of the rop gene and a single
point mutation in rep of pMB1. The lacZ gene codes for β-galactosidase.

Function
This plasmid is introduced into a bacterial cell by a process called "transformation", where it can
multiply and express itself. Because of the MCS and its several restriction sites, a
foreign piece of DNA of choice can be introduced into the plasmid by inserting it into the MCS
region.
The cells which have taken up the plasmid can be differentiated from cells which have not
by growing them on media with ampicillin. Only the cells with the plasmid
containing the ampicillin resistance (ampR) gene will survive.
Furthermore, the transformed cells containing the plasmid with the gene of interest can be
distinguished from cells with the plasmid but without the gene of interest just by looking at the
colour of the colony they make on agar media. Recombinants are white, whereas
non-recombinants are blue. This is the most notable feature of pUC19.
Mechanism

A schematic representation of the molecular mechanism involved in screening recombinant cells
The lacZ fragment, whose synthesis can be induced by IPTG, is capable of intra-allelic
complementation with a defective form of the β-galactosidase enzyme encoded by the host
chromosome (mutation lacZΔM15).
In the presence of IPTG in the growth medium, bacteria synthesise both fragments of the
enzyme. Together the two fragments can hydrolyse X-gal
(5-bromo-4-chloro-3-indolyl-beta-D-galactopyranoside) and form blue colonies on media with X-gal.

Insertion of foreign DNA into the MCS located within the lacZ gene causes insertional
inactivation of this gene at the N-terminal fragment of β-galactosidase and abolishes
intra-allelic complementation.
Thus bacteria carrying plasmids with an insert in the MCS cannot hydrolyse X-gal, giving
rise to white colonies, which can be distinguished on culture media from the blue
non-recombinant cells. The media used should therefore contain ampicillin,
IPTG, and X-gal.
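
The screening logic described above can be summarised as a small decision function. This is a simplified sketch, not a laboratory protocol: the function and parameter names are hypothetical, and it assumes a lacZΔM15 host so that blue colour depends only on an intact plasmid-borne lacZ fragment, inducer and substrate.

```python
def colony_phenotype(has_plasmid: bool, has_insert: bool,
                     ampicillin: bool = True, iptg: bool = True,
                     xgal: bool = True) -> str:
    """Predict the outcome for one transformed cell on a selection plate."""
    # Ampicillin selects for cells that carry the ampR gene on the plasmid.
    if ampicillin and not has_plasmid:
        return "no growth"
    # Complementation needs an uninterrupted lacZ fragment and the inducer.
    active_enzyme = has_plasmid and not has_insert and iptg
    # X-gal is hydrolysed to a blue product only by the active enzyme.
    if active_enzyme and xgal:
        return "blue"
    return "white"


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, colony_phenotype(plasmid, insert))
```

The three cases correspond to untransformed cells, non-recombinants and recombinants on a plate containing ampicillin, IPTG and X-gal.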
Sequence
The recognition sites for HindIII, SphI, PstI, SalI, XbaI, BamHI, SmaI, KpnI, SacI and EcoRI
restriction enzymes have been derived from the vector M13mp19 and are on the strand
complementary to that shown.
6. Write about M13 vectors.
M13 is a filamentous bacteriophage composed of circular single-stranded DNA (ssDNA),
6407 nucleotides long, encapsulated in approximately 2700 copies of the major coat protein
P8 and capped at the ends with the minor coat proteins (P9 and P7 at one end, P6 and P3 at
the other). The minor coat protein P3 attaches to the receptor at the tip of the F pilus of the
host Escherichia coli. M13 is a non-lytic virus: infection is not lethal, but the infected cells
grow more slowly, which gives rise to turbid plaques on a lawn of E. coli. M13-based vectors
are used for many recombinant DNA processes, and the virus has
also been studied for its uses in nanostructures and nanotechnology.

Phage particles
The phage coat is primarily assembled from a 50 amino acid protein called pVIII (or p8), which
is encoded by gene VIII (or g8) in the phage genome. For a wild type M13 particle, it takes
approximately 2700 copies of p8 to make the coat about 900 nm long. The coat's dimensions are
flexible though and the number of p8 copies adjusts to accommodate the size of the single
stranded genome it packages. For example, when the phage genome was mutated to reduce its
number of DNA bases (from 6.4 kb to 221 bp), the number of p8 copies decreased to
fewer than 100, causing the p8 coat to shrink in order to fit the reduced genome. The phage
appear to be limited at approximately twice the natural DNA content. However, deletion of a
phage protein (p3) prevents full escape from the host E. coli, and phage that are 10-20X the
normal length with several copies of the phage genome can be seen shedding from the E. coli
host.
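
As a rough check on the figures above, the sketch below scales the wild-type values in proportion to genome length. The proportionality is an assumption made for illustration (the text only says that the number of p8 copies adjusts to the genome size), and the function name is hypothetical.

```python
WILD_TYPE_BASES = 6407      # wild-type M13 genome length, from the text
WILD_TYPE_P8 = 2700         # approximate p8 copies in a wild-type coat
WILD_TYPE_LENGTH_NM = 900   # approximate wild-type particle length


def estimate_coat(genome_bases: int) -> tuple[float, float]:
    """Estimate p8 copy number and particle length for a given genome size,
    assuming both scale linearly with the number of packaged bases."""
    scale = genome_bases / WILD_TYPE_BASES
    return WILD_TYPE_P8 * scale, WILD_TYPE_LENGTH_NM * scale


copies, length_nm = estimate_coat(221)
print(f"{copies:.0f} p8 copies, about {length_nm:.0f} nm long")
```

For the 221-base microphage this linear estimate gives a little over 90 copies of p8, in line with the "fewer than 100" quoted above.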
There are four other proteins on the phage surface, two of which have been extensively studied.
At one end of the filament are five copies of the surface exposed pIX (p9) and a more buried
companion protein, pVII (p7). If p8 forms the shaft of the phage, p9 and p7 form the "blunt"
end that is seen in the micrographs. These proteins are very small, containing only 33 and 32
amino acids respectively, though some additional residues can be added to the N-terminal
portion of each which are then presented on the outside of the coat. At the other end of the
phage particle are five copies of the surface exposed pIII (p3) and its less exposed accessory
protein, pVI (p6). These form the rounded tip of the phage and are the first proteins to interact
with the E. coli host during infection. p3 is also the last point of contact with the host as new
phage bud from the bacterial surface.
Phage life-cycle
The general stages to a viral life cycle are: infection, replication of the viral genome, assembly
of new viral particles and then release of the progeny particles from the host. Filamentous phage
use a bacterial structure known as the F pilus to infect E. coli, with the M13 p3 tip contacting
the TolA protein on the bacterial pilus. The phage genome is then transferred to the cytoplasm
of the bacterial cell where resident proteins convert the single stranded DNA genome to a
double stranded replicative form ("RF"). This DNA then serves as a template for expression of
the phage genes.
Two phage gene products play critical roles in the next stage of the phage life cycle, namely
amplification of the genome. pII (aka p2) nicks the double stranded form of the genome to
initiate replication of the + strand. Without p2, no replication of the phage genome can occur.
Host enzymes copy the replicated + strand, resulting in more copies of double stranded phage
DNA. pV (aka p5) competes with double stranded DNA formation by sequestering copies of the
+ stranded DNA into a protein/DNA complex destined for packaging into new phage particles.
Interestingly there is one additional phage-encoded protein, pX (p10), that is important for
regulating the number of double stranded genomes in the bacterial host. Without p10 no +
strands can accumulate. What's particularly interesting about p10 is that it's identical to the
C-terminal portion of p2 since the gene for p10 is within the gene for p2 and the protein arises
from transcription initiation within gene 2. This makes the manipulation of p10 inextricably
linked to manipulation of p2 (an engineering headache) but it also makes for a compact and
efficient phage in nature.
Phage maturation requires the phage-encoded proteins pIV (p4), pI (p1) and its translational
restart product pXI (p11). Multiple copies (on the order of 12 or 14) of p4 assemble in the outer
membrane into a stable, i.e. detergent resistant, barrel-shaped structure. Similarly a handful of
the p1 and p11 proteins (5 or 6 copies of each) assemble in the bacterial inner membrane, and
genetic evidence suggests C-terminal portions of p1 and p11 interact with the N-terminal
portion of p4 in the periplasm. Together the p1, p11, p4 complex forms channels through which
mature phage are secreted from the bacterial host.
To initiate phage secretion, two of the minor phage coat proteins, p9 and p7, are thought to
interact with the p5-single stranded DNA complex at a region of the DNA called the packaging
sequence (aka PS). The p5 proteins covering the single stranded DNA are then replaced by p8
proteins that are embedded in the bacterial membrane and the growing phage filament is
threaded through the p1, p11, p4 channel. This replacement of p5 by p8 explains the microphage
data presented earlier, which indicate that the size of the phage particle is determined by the
number of bases the phage packages. Once the phage DNA has been fully coated with p8, the
secretion terminates by adding the p3/p6 cap, and the new phage detaches from the bacterial surface.
How long does all this take? Amazingly, new M13 phage particles are secreted within 10
minutes from a newly infected host and can arise at a rate of 1000/cell within the first hour of
infection. The bacterial host can continue to grow and divide, allowing this process to continue
indefinitely.
Replication in E. coli
Below are steps involved with replication of M13 in E. coli.

Viral (+) strand DNA enters cytoplasm


Complementary (-) strand is synthesized by bacterial enzymes

DNA gyrase, a type II topoisomerase, acts on double-stranded DNA and catalyzes
formation of negative supercoils in double-stranded DNA

Final product is parental replicative form (RF) DNA

A phage protein, pII, nicks the (+) strand in the RF

3'-hydroxyl acts as a primer in the creation of new viral strand

pII circularizes displaced viral (+) strand DNA

Pool of progeny double-stranded RF molecules produced

Negative strand of RF is template of transcription

mRNAs are translated into the phage proteins

Phage proteins in the cytoplasm are pII, pX, and pV, and they are part of the replication process
of DNA. The other phage proteins are synthesized and inserted into the cytoplasmic or outer
membranes.

pV dimers bind newly synthesized single-stranded DNA and prevent conversion to RF DNA

RF DNA synthesis continues and amount of pV reaches critical concentration

DNA replication switches to synthesis of single-stranded (+) viral DNA

pV-DNA structures form, about 800 nm long and 8 nm in diameter

pV-DNA complex is substrate in phage assembly reaction

Research
George Smith showed that fragments of EcoRI endonuclease could be fused to the
amino-terminal portion of pIII.
In 2006, MIT researchers modified the DNA of M13 phages to produce a protein that would
complex with cobalt ions in solution, leading to cobalt oxide, a material with energy storage
capacity higher than current carbon-based lithium-ion batteries.

7. Define Ti plasmid. Give a detailed account of it.


The Ti plasmid is a circular plasmid that often, but not always, is a part of the genetic
equipment that Agrobacterium tumefaciens and Agrobacterium rhizogenes use to
transfer their genetic material to plants. The Ti plasmid is lost when Agrobacterium is
grown above 28 °C.
Such cured bacteria do not induce crown galls, i.e. they become avirulent. pTi and pRi
share little sequence homology but are functionally rather similar.
The Ti plasmids are classified into different types based on the type of opine produced by
their genes. The different opines specified by pTi are octopine, nopaline, succinamopine
and leucinopine.
The plasmid has 196 genes that code for 195 proteins. There is one structural RNA.
The plasmid is 206,479 nucleotides long, the GC content is 56% and 81% of the material
is coding genes. There are no pseudogenes.
The modification of this plasmid is very important in the creation of transgenic plants,
but only in dicotyledon plants.

Virulence Region
Genes in the virulence region are grouped into the operons virABCDEFG, which code for the
enzymes responsible for mediating transduction of T-DNA to plant cells.

virA codes for a receptor which reacts to the presence of phenolic compounds such as
acetosyringone, syringealdehyde or acetovanillone which leak out of damaged plant
tissues.
virB encodes proteins which produce a pore/pilus-like structure.

virC binds the overdrive sequence.

virD1 and virD2 produce endonucleases which target the direct repeat borders of the
T-DNA segment, beginning with the right border.

virG activates vir-gene expression after binding to a consensus sequence, once it has been
phosphorylated by virA.

UNIT IV
Restriction mapping : Restriction map construction Double digest. RFLP
PCR. Site directed mutagenesis, Protein engineering.

Site-directed mutagenesis
Three different methods of site-directed mutagenesis have been devised: cassette mutagenesis,
primer extension and procedures based on the PCR. All three are described below but the reader
wishing more detail should consult the review of Ling and Robinson (1998). In some cases, the
goal of protein engineering is to generate a molecule with an improvement in some operating
parameter, but it is not known what amino acid changes to make. In this situation,
a random mutagenesis strategy provides a route to the desired protein. However, methods based
on gene manipulation differ from traditional mutagenesis in that the mutations are restricted to
the gene of interest or a small portion of it. Genetic engineering also provides a number of simple
methods of generating chimeric proteins where each domain is derived from a different protein. It
should not be forgotten that constructing the mutant DNA is only part of the task. The vector for
expression, the expression system and strategies for purification and assay must also be
considered before embarking on protein mutagenesis.
Cassette mutagenesis
In cassette mutagenesis, a synthetic DNA fragment containing the desired mutant sequence is
used to replace the corresponding sequence in the wild-type gene. This method was originally
used to generate improved variants of the enzyme subtilisin (Wells et al. 1985). It is a simple
method for which the efficiency of mutagenesis is close to 100%. The disadvantages are the
requirement for unique restriction sites flanking the region of interest and the limitation on the
realistic number of different oligonucleotide replacements that can be synthesized. The latter
problem can be minimized by the use of doped oligonucleotides (Reidhaar-Olson and Sauer,
1988).
Primer extension: the single-primer method
The simplest method of site-directed mutagenesis is the single-primer method (Gillam et al.
1980, Zoller & Smith 1983). The method involves priming in vitro DNA synthesis with a
chemically synthesized oligonucleotide (7-20 nucleotides long) that carries a base mismatch with
the complementary sequence. The method requires that the DNA to be mutated is available in
single-stranded form, and cloning the gene in M13-based vectors makes this easy. However, DNA
cloned in a plasmid and obtained in duplex form can also be converted to a partially
single-stranded molecule that is suitable (Dalbadie-McFarland et al. 1982).
The synthetic oligonucleotide primes DNA synthesis and is itself incorporated into the resulting
heteroduplex molecule. After transformation of the host E. coli, this heteroduplex gives rise to
homoduplexes whose sequences are either that of the original wild-type DNA or that containing
the mutated base. The frequency with which mutated clones arise, compared with wild-type
clones, may be low. In order to pick out mutants, the clones can be
screened by nucleic acid hybridization (see Chapter 6) with 32P-labelled oligonucleotide as probe.
Under suitable conditions of stringency, i.e. temperature and cation concentration, a positive
signal will be obtained only with mutant clones. This allows ready detection of the desired mutant
(Wallace et al. 1981, Traboni et al. 1983). In order to check that the procedure has not introduced
other adventitious changes, it is prudent to check the sequence of the mutant directly by DNA
sequencing. This was a particular necessity with early versions of the technique which made use
of E. coli DNA polymerase. The more recent use of the high-fidelity DNA polymerases from
phages T4 and T7 has minimized the problem of extraneous mutations, as well as shortening
the time for copying the second strand. Also, these polymerases
do not strand-displace the oligomer, a process which would eliminate the original mutant
oligonucleotide. A variation of the procedure (Fig. 7.10) outlined above involves oligonucleotides
containing inserted or deleted sequences. As long as stable hybrids are
formed with single-stranded wild-type DNA, priming of in vitro DNA synthesis can occur,
ultimately giving rise to clones corresponding to the inserted or deleted sequence (Wallace et al.
1980, Norrander et al. 1983).
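
The design of a mismatched primer for the single-primer method can be sketched as follows. This is an illustration only: the template sequence, the chosen position and the function names are hypothetical, and real primer design would also consider melting temperature and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_primer(template: str, position: int, new_base: str,
                     flank: int = 8) -> str:
    """Design a primer that anneals to the single-stranded template around
    `position` (0-based) but encodes `new_base` there instead of the
    wild-type base, giving a single mismatch in the hybrid."""
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    mutated = template[:position] + new_base + template[position + 1:]
    # The primer is complementary to the desired mutant sequence.
    return reverse_complement(mutated[start:end])


def count_mismatches(primer: str, template_window: str) -> int:
    """Count positions where the primer fails to pair with the template."""
    perfect = reverse_complement(template_window)
    return sum(a != b for a, b in zip(primer, perfect))


template = "ATGGCTAGCAAGGAGCTGTTCACCGGT"  # hypothetical single-stranded insert
primer = mutagenic_primer(template, position=13, new_base="T")
print(primer, len(primer), count_mismatches(primer, template[5:22]))
```

With a flank of 8 bases on each side the primer is 17 nucleotides long, within the 7-20 nucleotide range mentioned above, and it pairs with the wild-type template everywhere except at the targeted base.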
PCR methods of site-directed mutagenesis
Early work on the development of the PCR method of DNA amplification showed its potential for
mutagenesis (Scharf et al. 1986). Single bases mismatched between the amplification primer and
the template become incorporated into the template sequence as a result of amplification (Fig.
7.11). Higuchi et al. (1988) have described a variation of the basic method which enables a
mutation in a PCR-produced DNA fragment to be introduced anywhere along its length. Two
primary PCR reactions produce two overlapping DNA fragments, both bearing the same mutation
in the overlap region. The overlap in sequence allows the fragments to hybridize (Fig. 7.11). One
of the two possible hybrids is extended by DNA polymerase to produce a duplex fragment. The
other hybrid has recessed 5' ends and, since it is not a substrate for the polymerase, is
effectively lost from the reaction mixture. As with conventional primer-extension mutagenesis,
deletions and insertions can also be created.
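
The joining step of the overlap method can be modelled at the level of sequence strings. The sketch below is a simplification under stated assumptions: it tracks only the top strand of each primary PCR product, the sequences and the function name are hypothetical, and polymerase extension is represented simply by merging the two fragments across their shared overlap.

```python
def overlap_extend(left: str, right: str, min_overlap: int = 10) -> str:
    """Join two fragments (top strands, 5'->3') whose ends overlap, i.e. the
    end of `left` has the same sequence as the start of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            # Extension fills in the sequence beyond the shared region.
            return left + right[size:]
    raise ValueError("fragments do not share a sufficient overlap")


wild_type = "ATGGCTAGCAAGGAGCTGTTCACCGGTGTGGTGCCCATCCTG"  # hypothetical target
site = 20                                                  # base to be changed
mutant = wild_type[:site] + "G" + wild_type[site + 1:]

# The two primary PCR products both carry the mutation in their overlap.
left_fragment = mutant[:28]
right_fragment = mutant[14:]

product = overlap_extend(left_fragment, right_fragment)
print(product == mutant, product[site])
```

Because both primary products carry the change inside the overlap, the full-length product carries it as well, wherever along the fragment the overlap was placed.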

The polymerase chain reaction (PCR)


The impact of the PCR upon molecular biology has been profound. The reaction is easily
performed, and leads to the amplification of specific DNA sequences by an enormous factor. From
a simple basic principle, many variations have been developed with applications throughout gene
technology (Erlich 1989, Innis et al. 1990). Very importantly, the PCR
has revolutionized prenatal diagnosis by allowing tests to be performed using small samples of
fetal tissue. In forensic science, the enormous sensitivity of PCR-based procedures is exploited in
DNA profiling; following the publicity surrounding Jurassic Park, virtually everyone is aware of
potential applications in palaeontology and archaeology. Many other processes have been
described which should produce equivalent results to a PCR (for review, see Landegren 1996) but
as yet none has found widespread use. In many applications of the PCR to gene manipulation, the
enormous amplification is secondary to the aim of altering the amplified sequence. This often
involves incorporating extra sequences at the ends of the amplified DNA. In this section we shall
consider only the amplification process. The applications
of the PCR will be described in appropriate places.
Basic reaction
First we need to consider the basic PCR. The principle is illustrated in Fig. 2.7. The PCR involves
two oligonucleotide primers, 17-30 nucleotides in length, which flank the DNA sequence that is to
be amplified. The primers hybridize to opposite strands of the DNA after it has been denatured,
and are orientated so that DNA synthesis by the polymerase proceeds through the region
between the two primers. The extension reactions create two double-stranded target regions,
each of which can again be denatured ready for a second cycle of hybridization and extension.
The third cycle produces two double-stranded molecules that comprise precisely the target region
in double-stranded form. By repeated cycles of heat denaturation, primer hybridization and
extension, there follows a rapid exponential accumulation of the specific target fragment of DNA.
After 22 cycles, an amplification of about 10^6-fold is expected (Fig. 2.8), and amplifications of this
order are actually attained in practice. In the original description of the PCR method (Mullis &
Faloona 1987, Saiki et al. 1988, Mullis 1990), Klenow DNA polymerase was used and, because of
the heat-denaturation step, fresh enzyme
had to be added during each cycle. A breakthrough came with the introduction of Taq DNA
polymerase (Lawyer et al. 1989) from the thermophilic bacterium Thermus aquaticus. The Taq
DNA polymerase is resistant to high temperatures and so does not need to be replenished during
the PCR (Erlich et al. 1988, Saiki et al. 1988). Furthermore, by enabling the
extension reaction to be performed at higher temperatures, the specificity of the primer
annealing is not compromised. As a consequence of employing the heat-resistant enzyme, the
PCR could be automated very simply by placing the assembled reaction in a heating block with a
suitable thermal cycling programme.
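
The accumulation of product over the cycles can be followed with a simple bookkeeping model. It is a sketch under idealised assumptions that the text does not state: one double-stranded template at the start, and every strand copied exactly once per cycle. The function and key names are hypothetical.

```python
def pcr_strands(cycles: int) -> dict:
    """Count strand classes after `cycles` ideal PCR cycles, starting from a
    single double-stranded template molecule."""
    original = 2        # the two strands of the starting template
    long_product = 0    # copies of original strands: only one end defined
    target = 0          # strands with both ends defined by the primers
    target_duplexes = 0
    for _ in range(cycles):
        # Each existing target strand pairs with its new target-length copy.
        target_duplexes = target
        new_long = original                  # originals give long products
        new_target = long_product + target   # all other copies are target-length
        long_product += new_long
        target += new_target
    return {"original": original, "long_product": long_product,
            "target": target, "target_duplexes": target_duplexes}


for n in (1, 2, 3, 22):
    print(n, pcr_strands(n))
```

In this model the first molecules consisting of precisely the target region in double-stranded form appear after the third cycle, as described above. With perfect doubling, 22 cycles give a few million such molecules, the same order of magnitude as the amplification quoted; real reactions are less than perfectly efficient.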

UNIT V
DNA sequencing: Dideoxy method, Maxam-Gilbert method. Mapping and sequencing
the Human genome.

Sanger Method for DNA Sequencing


DNA sequencing, first devised in 1975, has become a powerful technique in
molecular biology, allowing analysis of genes at the nucleotide level. For this reason,
this tool has been applied to many areas of research. For example, the polymerase
chain reaction (PCR), a method which rapidly produces numerous copies of a desired
piece of DNA, requires first knowing the flanking sequences of this piece. Another
important use of DNA sequencing is identifying restriction sites in plasmids. Knowing
these restriction sites is useful in cloning a foreign gene into the plasmid. Before the
advent of DNA sequencing, molecular biologists had to sequence proteins directly;
now amino acid sequences can be determined more easily by sequencing a piece of
cDNA and finding an open reading frame. In eukaryotic gene expression, sequencing
has allowed researchers to identify conserved sequence motifs and determine their
importance in the promoter region. Furthermore, a molecular biologist can utilize
sequencing to identify the site of a point mutation. These are only a few examples
illustrating the way in which DNA sequencing has revolutionized molecular biology.
Dideoxynucleotide sequencing represents only one method of sequencing DNA. It is
commonly called Sanger sequencing since Sanger devised the method. This technique
utilizes 2',3'-dideoxynucleotide triphosphates (ddNTPs), molecules that differ from
deoxynucleotides by having a hydrogen atom attached to the 3' carbon rather than
an OH group (Figure 1). These molecules terminate DNA chain elongation because
they cannot form a phosphodiester bond with the next deoxynucleotide.
In order to perform the sequencing, one must first convert double stranded DNA into
single stranded DNA. This can be done by denaturing the double stranded DNA with
NaOH. A Sanger reaction consists of the following: a strand to be sequenced (one of
the single strands which was denatured using NaOH), DNA primers (short pieces of
DNA that are both complementary to the strand which is to be sequenced and
radioactively labelled at the 5' end), a mixture of a particular ddNTP (such as ddATP)
with its normal dNTP (dATP in this case), and the other three dNTPs (dCTP, dGTP,
and dTTP). The concentration of ddATP should be 1% of the concentration of dATP.
The logic behind this ratio is that after DNA polymerase is added, the polymerization
will take place and will terminate whenever a ddATP is incorporated into the growing
strand. If the ddATP is only 1% of the total concentration of dATP, a whole series of
labeled strands will result (Figure 1). Note that the lengths of these strands are
dependent on the location of the base relative to the 5' end.
This reaction is performed four times using a different ddNTP for each reaction. When
these reactions are completed, a polyacrylamide gel electrophoresis (PAGE) is
performed. One reaction is loaded into one lane for a total of four lanes (Figure 2).
The gel is transferred to a nitrocellulose filter and autoradiography is performed so
that only the bands with the radioactive label on the 5' end will appear. In PAGE, the
shortest fragments will migrate the farthest. Therefore, the bottom-most band
indicates that its particular dideoxynucleotide was added first to the labeled primer. In
Figure 2, for example, the band that migrated the farthest was in the ddATP reaction
mixture. Therefore, ddATP must have been added first to the primer, and its
complementary base, thymine, must have been the base present on the 3' end of the
sequenced strand. One can continue reading in this fashion. Note in Figure 2 that if
one reads the bases from the bottom up, one is reading the 5' to 3' sequence of the
strand complementary to the sequenced strand. The sequenced strand can be read 5' to
3' by reading top to bottom the bases complementary to those on the gel.

Figure 1. This figure shows the structure of a dideoxynucleotide (notice the H atom
attached to the 3' carbon). Also depicted in this figure are the ingredients for a Sanger
reaction. Notice the different lengths of labeled strands produced in this reaction.

Figure 2. This figure is a representation of an acrylamide sequencing gel. Notice that
the sequence of the strand of DNA complementary to the sequenced strand is 5' to 3'
ACGCCCGAGTAGCCCAGATT while the sequence of the sequenced strand, 5' to 3',
is AATCTGGGCTACTCGGGCGT.
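
The gel-reading logic can be sketched in Python using the two sequences quoted in the Figure 2 caption. This is a minimal illustration rather than a simulation of the chemistry: it assumes every possible termination product is present and detectable, and the function names are invented for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_lanes(new_strand):
    """Assign each terminated product to its ddNTP lane.

    new_strand is the newly synthesized strand (5' to 3') beyond the
    primer. A product of length n ends in the n-th base, so it appears
    in the lane of that base's dideoxynucleotide.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from the bottom (shortest) to the top (longest)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


lanes = sanger_lanes("ACGCCCGAGTAGCCCAGATT")
gel_read = read_gel(lanes)               # bottom-up read of the gel
template = reverse_complement(gel_read)  # the sequenced strand, 5' to 3'
print("Read from gel (5'->3'):", gel_read)
print("Sequenced strand (5'->3'):", template)
```

The bottom-most band falls in the A lane, matching the description of Figure 2, and the reverse complement of the bottom-up read reproduces the sequenced strand given in the caption.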

Maxam & Gilbert Sequencing

There are four chemical cleavage reactions at the core of the Maxam and Gilbert sequencing
system. The figure below left shows an example from these reactions, the reaction cleaving
specifically at guanine. The other three reactions cleave at G+A, C+T, or C. Guanine and
cytosine, therefore, give bands in 2 lanes, adenine and thymine in only one. An example of
the gel pattern produced is presented below right.

Maxam and Gilbert DNA sequencing reaction specific for guanine
residues. The guanine base is first modified with dimethyl sulfate (DMS),
which makes the chain susceptible to cleavage by piperidine, destroying
the guanine residue and releasing a labeled fragment for
electrophoresis.

In a Maxam and Gilbert gel, the identity of guanine or
cytosine in the sequence can be assigned most easily
because two of the four reaction sets cleave at those bases
alone. Adenine or thymine are slightly more difficult, being
represented by those bands in the G+A or C+T lanes which
do not appear, respectively, in the G or C lanes.
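
The lane logic described above can be written as a small lookup in Python. This is a sketch only: the band pattern in the example is hypothetical, and real gels require judgment about faint or ambiguous bands.

```python
# Which lanes show a band at a given fragment length -> base at that position.
LANE_PATTERN_TO_BASE = {
    frozenset({"G", "G+A"}): "G",  # band in both the G and G+A lanes
    frozenset({"G+A"}): "A",       # band in G+A but not in G
    frozenset({"C", "C+T"}): "C",  # band in both the C and C+T lanes
    frozenset({"C+T"}): "T",       # band in C+T but not in C
}


def call_base(lanes_with_band):
    """Assign a base from the set of lanes that show a band."""
    pattern = frozenset(lanes_with_band)
    if pattern not in LANE_PATTERN_TO_BASE:
        raise ValueError(f"Ambiguous band pattern: {sorted(pattern)}")
    return LANE_PATTERN_TO_BASE[pattern]


def read_maxam_gilbert(bands_shortest_first):
    """Call one base per band position, ordered from the shortest fragment up."""
    return "".join(call_base(lanes) for lanes in bands_shortest_first)


# Hypothetical gel: four band positions, shortest fragment first.
example_gel = [{"G", "G+A"}, {"G+A"}, {"C+T"}, {"C", "C+T"}]
print(read_maxam_gilbert(example_gel))
```

Raising an error on any other lane combination is a deliberate choice here, since a pattern such as bands in both the G and C lanes cannot come from a single base.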

The DNA to be sequenced must first be end labeled, at one end only. This is accomplished
by kinase treatment with [γ-32P]ATP, which labels both ends, followed by restriction digestion
and isolation of the two labeled fragments. Alternatively, digestion of a plasmid containing a
clone of the DNA of interest with an appropriate enzyme can yield a unique labeling site.
Plasmid vectors containing the rare site for Tth111I, which leaves a single 5' base overlap,
have been generated for this purpose. Cleavage with Tth111I leaves a G at one end and a C at
the other in these vectors. By filling in the gap with Klenow polymerase fragment in the
presence of dGTP or dCTP, one end or the other can be labeled specifically. Labeled DNA is
first precipitated to remove any salts which might interfere in the cleavage reactions. It is
then modified, cleaved and run on a denaturing gel for analysis. NB: THE HYDRAZINE
AND DMS USED IN THESE PROTOCOLS ARE TOXIC AND VOLATILE. KEEP TUBES
SEALED AND WORK IN A HOOD.
Maxam and Gilbert Sequencing Reactions
1. Precipitate the substrate: To the 32P-labeled DNA, add 0.1 vol. 3 M sodium acetate
and 1 vol. isopropanol. Precipitate at -70 °C for 10 minutes, and centrifuge at max
RPM in a microcentrifuge for 5 minutes to collect the DNA. Wash the pellet twice
with 1 ml cold 70% ethanol to remove all salt. Redissolve the DNA in 45 µl of sterile
water. Count one microliter of the solution in scintillation cocktail to confirm >5 x 10^3
cpm total counts.
2. Aliquot 10 µl of the DNA solution into each of 4 tubes. Label the tubes C, G, C+T,
G+A.
3. Reactions:

C: Add 10 µl 2.5 M NaCl and mix well. Add 30 µl of hydrazine (toxic!) and
incubate at 25 °C for 7-9 minutes.

G: Add 200 µl of: 50 mM sodium cacodylate, pH 8, 1 mM EDTA. Mix well and add
1 µl dimethyl sulfate (DMS) (toxic!) and incubate at 25 °C for 4-5 minutes.

C+T: Add 10 µl H2O and mix well. Add 30 µl hydrazine and incubate at 25 °C for
7-9 minutes.

G+A: Add 25 µl of formic acid, mix well and incubate at 25 °C for 4-5 minutes.

4. Stop the reactions:

Stop buffers:

G reaction: Add 50 µl of: 1.5 M sodium acetate pH 7, 1 M mercaptoethanol,
100 µg/ml tRNA.

All other reactions: Add 200 µl of 0.3 M sodium acetate, pH 7, 0.1 mM EDTA,
25 µg/ml tRNA.

Ethanol precipitation:
Add 750 µl of ethanol, and transfer reactions to a -70 °C bath for 5 minutes.
Collect DNA by microcentrifugation for 5 minutes. Discard the supernatants
as appropriate for DMS or hydrazine waste. Rinse the pellets twice with
70% ethanol. Redissolve the pellets in 300 µl of water, add 30 µl of 3 M
sodium acetate and 1 ml of ethanol. Pellet DNA and wash twice with 70%
ethanol. Allow the pellets to air dry.
5. Piperidine cleavage reactions:
Resuspend pellets in 75 µl of 10% piperidine, and transfer to screw top tubes. It is
essential that the tubes used for the piperidine reaction seal well in order to ensure
that the reaction goes to completion. Incubate the tubes at 90 °C for 30 minutes.
Cool the tubes, centrifuge briefly to collect the condensate, and evaporate to
dryness in a speedvac. Redissolve the pellet in 40 µl of water and dry again. Repeat
the rehydration and drying once more to ensure that all of the piperidine has been
removed. The samples are now ready for denaturing PAGE.
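
The arithmetic in step 1 of the protocol can be checked with a small Python sketch. It assumes that "vol." means a multiple of the starting sample volume and that the count from the single microliter is scaled to the full 45 µl. The function names and the example numbers are hypothetical.

```python
def precipitation_volumes(sample_ul):
    """Volumes to add for step 1: 0.1 vol. 3 M sodium acetate, 1 vol. isopropanol."""
    return {
        "sodium_acetate_ul": 0.1 * sample_ul,
        "isopropanol_ul": 1.0 * sample_ul,
    }


def total_counts(cpm_in_one_ul, total_volume_ul=45):
    """Scale the count from a 1 µl aliquot to the whole redissolved sample."""
    return cpm_in_one_ul * total_volume_ul


def enough_label(cpm_in_one_ul, minimum_total_cpm=5e3):
    """True if the sample exceeds the >5 x 10^3 cpm total given in the protocol."""
    return total_counts(cpm_in_one_ul) > minimum_total_cpm


# Example with made-up numbers: a 50 µl labeling reaction and 200 cpm per µl.
print(precipitation_volumes(50))
print(total_counts(200), enough_label(200))
```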

Mapping and Sequencing the Human Genome


A primary goal of the Human Genome Project is to make a series of descriptive diagrams (maps)
of each human chromosome at increasingly finer resolutions. Mapping involves (1) dividing the
chromosomes into smaller fragments that can be propagated and characterized and (2) ordering
(mapping) them to correspond to their respective locations on the chromosomes. After mapping
is completed, the next step is to determine the base sequence of each of the ordered DNA
fragments. The ultimate goal of genome research is to find all the genes in the DNA sequence
and to develop tools for using this information in the study of human biology and medicine.
Improving the instrumentation and techniques required for mapping and sequencing (a major
focus of the genome project) will increase efficiency and cost-effectiveness. Goals include
automating methods and optimizing techniques to extract the maximum useful information from
maps and sequences.
A genome map describes the order of genes or other markers and the spacing between them on
each chromosome. Human genome maps are constructed on several different scales or levels of
resolution. At the coarsest resolution are genetic linkage maps, which depict the relative
chromosomal locations of DNA markers (genes and other identifiable DNA sequences) by their
patterns of inheritance. Physical maps describe the chemical characteristics of the DNA molecule
itself.
Geneticists have already charted the approximate positions of over 2300 genes, and a start has
been made in establishing high-resolution maps of the genome (Fig. 7: Assignment of Genes to
Specific Chromosomes). More precise maps are needed to organize systematic sequencing
efforts and plan new research directions.
HUMAN GENOME PROJECT GOALS

Goal                                      Resolution
Complete a detailed human genetic map     2 Mb
Complete a physical map                   0.1 Mb
Acquire the genome as clones              5 kb
Determine the complete sequence           1 bp
Find all the genes

With the data generated by the project, investigators will determine the functions of
the genes and develop tools for biological and medical applications.

Mapping Strategies
Genetic Linkage Maps
A genetic linkage map shows the relative locations of specific DNA markers along the
chromosome. Any inherited physical or molecular characteristic that differs among individuals
and is easily detectable in the laboratory is a potential genetic marker. Markers can be expressed
DNA regions (genes) or DNA segments that have no known coding function but whose
inheritance pattern can be followed. DNA sequence differences are especially useful markers
because they are plentiful and easy to characterize precisely.
Markers must be polymorphic to be useful in mapping; that is, alternative forms must exist
among individuals so that they are detectable among different members in family studies.
Polymorphisms are variations in DNA sequence that occur on average once every 300 to 500 bp.

Variations within exon sequences can lead to observable changes, such as differences in eye
color, blood type, and disease susceptibility. Most variations occur within introns and have little
or no effect on an organism's appearance or function, yet they are detectable at the DNA level and
can be used as markers. Examples of these types of markers include (1) restriction fragment
length polymorphisms (RFLPs), which reflect sequence variations in DNA sites that can be
cleaved by DNA restriction enzymes, and (2) variable number of tandem repeat sequences,
which are short repeated sequences that vary in the number of repeated units and, therefore, in
length (a characteristic easily measured). The human genetic linkage map is constructed by
observing how frequently two markers are inherited together.
Two markers located near each other on the same chromosome will tend to be passed together
from parent to child. During the normal production of sperm and egg cells, DNA strands
occasionally break and rejoin in different places on the same chromosome or on the other copy
of the same chromosome (i.e., the homologous chromosome). This process (called meiotic
recombination) can result in the separation of two markers originally on the same chromosome
(Fig. 8: Constructing a Genetic Linkage Map). The closer the markers are to each other (the more
tightly linked), the less likely a recombination event will fall between and separate them.
Recombination frequency thus provides an estimate of the distance between two markers.
On the genetic map, distances between markers are measured in terms of centimorgans (cM),
named after the American geneticist Thomas Hunt Morgan. Two markers are said to be 1 cM
apart if they are separated by recombination 1% of the time. A genetic distance of 1 cM is
roughly equal to a physical distance of 1 million bp (1 Mb). The current resolution of most
human genetic map regions is about 10 Mb.
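
The centimorgan definition above translates directly into a calculation. The Python sketch below uses hypothetical offspring counts, and the 1 cM to 1 Mb conversion is only the rough average quoted in the text. The direct proportionality between recombination frequency and distance is itself an approximation that holds best for closely spaced markers.

```python
def map_distance_cm(recombinant_offspring, total_offspring):
    """Genetic distance in cM: percentage of offspring in which the two
    markers were separated by recombination."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def approx_physical_distance_mb(distance_cm, mb_per_cm=1.0):
    """Rough physical distance, using the average of about 1 Mb per cM."""
    return distance_cm * mb_per_cm


# Hypothetical family data: 3 recombinants among 100 informative offspring.
distance = map_distance_cm(3, 100)
print(f"{distance} cM, roughly {approx_physical_distance_mb(distance)} Mb")
```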
The value of the genetic map is that an inherited disease can be located on the map by following
the inheritance of a DNA marker present in affected individuals (but absent in unaffected
individuals), even though the molecular basis of the disease may not yet be understood nor the
responsible gene identified. Genetic maps have been used to find the exact chromosomal location
of several important disease genes, including those for cystic fibrosis, sickle cell disease, Tay-Sachs
disease, fragile X syndrome, and myotonic dystrophy.
One short-term goal of the genome project is to develop a high-resolution genetic map (2 to 5
cM); recent consensus maps of some chromosomes have averaged 7 to 10 cM between genetic
markers. Genetic mapping resolution has been increased through the application of recombinant
DNA technology, including in vitro radiation-induced chromosome fragmentation and cell
fusions (joining human cells with those of other species to form hybrid cells) to create panels of
cells with specific and varied human chromosomal components. Assessing the frequency of
marker sites remaining together after radiation-induced DNA fragmentation can establish the
order and distance between the markers. Because only a single copy of a chromosome is required
for analysis, even nonpolymorphic markers are useful in radiation hybrid mapping. [In meiotic
mapping (described above), two copies of a chromosome must be distinguished from each other
by polymorphic markers.]
Restriction Enzymes: Microscopic Scalpels

Isolated from various bacteria, restriction enzymes recognize short DNA sequences and cut the
DNA molecules at those specific sites. (A natural biological function of these enzymes is to
protect bacteria by attacking viral and other foreign DNA.) Some restriction enzymes
(rare-cutters) cut the DNA very infrequently, generating a small number of very large fragments
(several thousand to a million bp). Most enzymes cut DNA more frequently, thus generating a
large number of small fragments (less than a hundred to more than a thousand bp).
On average, restriction enzymes with

4-base recognition sites will yield pieces 256 bases long,
6-base recognition sites will yield pieces 4000 bases long, and
8-base recognition sites will yield pieces 64,000 bases long.

Since hundreds of different restriction enzymes have been characterized, DNA can be cut into
many different small fragments.
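
The average fragment sizes listed above follow from a simple probability argument, sketched here in Python. The calculation assumes a random DNA sequence in which all four bases are equally frequent and independent, which real genomes only approximate. The function name is invented for this example.

```python
def expected_fragment_length(site_length, n_bases=4):
    """Average spacing between occurrences of one specific recognition site.

    With four equally likely bases, a given site of length n matches at
    any position with probability (1/4)**n, so sites occur on average
    once every 4**n bases.
    """
    return n_bases ** site_length


for n in (4, 6, 8):
    print(f"{n}-base site: about {expected_fragment_length(n):,} bases")
```

The exact values for 6-base and 8-base sites are 4096 and 65,536, which the text rounds to 4000 and 64,000.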
Physical Maps
Different types of physical maps vary in their degree of resolution. The lowest-resolution
physical map is the chromosomal (sometimes called cytogenetic) map, which is based on the
distinctive banding patterns observed by light microscopy of stained chromosomes. A cDNA
map shows the locations of expressed DNA regions (exons) on the chromosomal map. The more
detailed cosmid contig map depicts the order of overlapping DNA fragments spanning the
genome. A macrorestriction map describes the order and distance between enzyme cutting
(cleavage) sites. The highest-resolution physical map is the complete elucidation of the DNA
base-pair sequence of each chromosome in the human genome. Physical maps are described in
greater detail below.
Low-Resolution Physical Mapping
Chromosomal map. In a chromosomal map, genes or other identifiable DNA fragments are
assigned to their respective chromosomes, with distances measured in base pairs. These markers
can be physically associated with particular bands (identified by cytogenetic staining) primarily
by in situ hybridization, a technique that involves tagging the DNA marker with an observable
label (e.g., one that fluoresces or is radioactive). The location of the labeled probe can be
detected after it binds to its complementary DNA strand in an intact chromosome.
As with genetic linkage mapping, chromosomal mapping can be used to locate genetic markers
defined by traits observable only in whole organisms. Because chromosomal maps are based on
estimates of physical distance, they are considered to be physical maps. The number of base pairs
within a band can only be estimated.
Until recently, even the best chromosomal maps could be used to locate a DNA fragment only to
a region of about 10 Mb, the size of a typical band seen on a chromosome. Improvements in
fluorescence in situ hybridization (FISH) methods allow orientation of DNA sequences that lie as
close as 2 to 5 Mb. Modifications to in situ hybridization methods, using chromosomes at a stage
in cell division (interphase) when they are less compact, increase map resolution to around
100,000 bp. Further banding refinement might allow chromosomal bands to be associated with
specific amplified DNA fragments, an improvement that could be useful in analyzing observable
physical traits associated with chromosomal abnormalities.
cDNA map. A cDNA map shows the positions of expressed DNA regions (exons) relative to
particular chromosomal regions or bands. (Expressed DNA regions are those transcribed into
mRNA.) cDNA is synthesized in the laboratory using the mRNA molecule as a template;
base-pairing rules are followed (i.e., an A on the mRNA molecule will pair with a T on the new DNA
strand). This cDNA can then be mapped to genomic regions.
Because they represent expressed genomic regions, cDNAs are thought to identify the parts of
the genome with the most biological and medical significance. A cDNA map can provide the
chromosomal location for genes whose functions are currently unknown. For disease-gene
hunters, the map can also suggest a set of candidate genes to test when the approximate location
of a disease gene has been mapped by genetic linkage techniques.
High-Resolution Physical Mapping
The two current approaches to high-resolution physical mapping are termed top-down
(producing a macrorestriction map) and bottom-up (resulting in a contig map). With either
strategy (described below) the maps represent ordered sets of DNA fragments that are generated
by cutting genomic DNA with restriction enzymes (see previously discussed Restriction
Enzymes). The fragments are then amplified by cloning or by polymerase chain reaction (PCR)
methods (see DNA Amplification below). Electrophoretic techniques are used to separate the
fragments according to size into different bands, which can be visualized by direct DNA staining
or by hybridization with DNA probes of interest. The use of purified chromosomes separated
either by flow sorting from human cell lines or in hybrid cell lines allows a single chromosome
to be mapped (see Separating Chromosomes below).
A number of strategies can be used to reconstruct the original order of the DNA fragments in the
genome. Many approaches make use of the ability of single strands of DNA and/or RNA to
hybridize to form double-stranded segments by hydrogen bonding between complementary
bases. The extent of sequence homology between the two strands can be inferred from the length
of the double-stranded segment. Fingerprinting uses restriction map data to determine which
fragments have a specific sequence (fingerprint) in common and therefore overlap. Another
approach uses linking clones as probes for hybridization to chromosomal DNA cut with the same
restriction enzyme.
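
The fingerprinting idea can be sketched in Python. This simplified illustration represents each clone by the set of restriction fragment sizes it produces and treats two clones as overlapping when they share at least a chosen number of fragments. The clone names, fragment sizes, and threshold are all hypothetical, and real fingerprinting must also allow for measurement error in fragment sizes.

```python
from itertools import combinations


def overlapping_pairs(fingerprints, min_shared=2):
    """Return (clone_a, clone_b, shared_fragments) for every pair of clones
    whose fingerprints have at least min_shared fragment sizes in common."""
    pairs = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        shared = fingerprints[name_a] & fingerprints[name_b]
        if len(shared) >= min_shared:
            pairs.append((name_a, name_b, sorted(shared)))
    return pairs


# Hypothetical fingerprints: restriction fragment sizes in bp for each clone.
fingerprints = {
    "cloneA": {1200, 3400, 560, 7800},
    "cloneB": {3400, 560, 2100, 900},
    "cloneC": {2100, 900, 4300},
}

for clone_a, clone_b, shared in overlapping_pairs(fingerprints):
    print(f"{clone_a} overlaps {clone_b} (shared fragments: {shared})")
```

In this example cloneA overlaps cloneB and cloneB overlaps cloneC, which suggests the order A, B, C along the chromosome.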
Macrorestriction maps: Top-down mapping. In top-down mapping, a single chromosome is
cut (with rare-cutter restriction enzymes) into large pieces, which are ordered and subdivided;
the smaller pieces are then mapped further. The resulting macrorestriction maps depict the order
of and distance between sites at which rare-cutter enzymes cleave (Fig. 9a: Physical Mapping
Strategies: Macrorestriction Map). This approach yields maps with more continuity and fewer
gaps between fragments than contig maps, but map resolution is lower and may not be useful in
finding particular genes; in addition, this strategy generally does not produce long stretches of
mapped sites. Currently, this approach allows DNA pieces to be located in regions measuring
about 100,000 bp to 1 Mb.
The development of pulsed-field gel (PFG) electrophoretic methods has improved the mapping
and cloning of large DNA molecules. While conventional gel electrophoretic methods separate
pieces less than 40 kb (1 kb = 1000 bases) in size, PFG separates molecules up to 10 Mb,
allowing the application of both conventional and new mapping methods to larger genomic
regions.
Contig maps: Bottom-up mapping. The bottom-up approach involves cutting the
chromosome into small pieces, each of which is cloned and ordered. The ordered fragments form
contiguous DNA blocks (contigs). Currently, the resulting library of clones varies in size from
10,000 bp to 1 Mb (Fig. 9b: Physical Mapping Strategies: Contig Maps). An advantage of this
approach is the accessibility of these stable clones to other researchers. Contig construction can
be verified by FISH, which localizes cosmids to specific regions within chromosomal bands.
Contig maps thus consist of a linked library of small overlapping clones representing a complete
chromosomal segment. While useful for finding genes localized to a small area (under 2 Mb),
contig maps are difficult to extend over large stretches of a chromosome because not all regions
are clonable. DNA probe techniques can be used to fill in the gaps, but they are time consuming.
Figure 10 is a diagram relating the different types of maps.
Technological improvements now make possible the cloning of large DNA pieces, using
artificially constructed chromosome vectors that carry human DNA fragments as large as 1 Mb.
These vectors are maintained in yeast cells as artificial chromosomes (YACs). (For more
explanation, see DNA Amplification below) Before YACs were developed, the largest cloning
vectors (cosmids) carried inserts of only 20 to 40 kb. YAC methodology drastically reduces the
number of clones to be ordered; many YACs span entire human genes. A more detailed map of a
large YAC insert can be produced by subcloning, a process in which fragments of the original
insert are cloned into smaller-insert vectors. Because some YAC regions are unstable,
large-capacity bacterial vectors (i.e., those that can accommodate large inserts) are also being
developed.
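
The reduction in clone number that YACs provide can be estimated with a short Python calculation. The sketch assumes a haploid human genome of about 3 x 10^9 bp (a figure not given in this section) and counts only the minimum number of non-overlapping clones. A real library needs several times more clones to provide the overlaps used in contig building.

```python
def minimum_clones(genome_bp, insert_bp):
    """Smallest number of non-overlapping inserts needed to span the genome."""
    return -(-genome_bp // insert_bp)  # integer ceiling division


HUMAN_GENOME_BP = 3_000_000_000  # assumed approximate haploid genome size

cosmid_clones = minimum_clones(HUMAN_GENOME_BP, 40_000)  # 40 kb cosmid insert
yac_clones = minimum_clones(HUMAN_GENOME_BP, 1_000_000)  # 1 Mb YAC insert

print(f"Cosmids (40 kb inserts): at least {cosmid_clones:,} clones")
print(f"YACs (1 Mb inserts): at least {yac_clones:,} clones")
```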
Separating Chromosomes
Flow sorting
Flow sorting employs flow cytometry to separate, according to size, chromosomes isolated from
cells during cell division when they are condensed and stable. As the chromosomes flow singly
past a laser beam, they are differentiated by analyzing the amount of DNA present, and
individual chromosomes are directed to specific collection tubes.
Somatic cell hybridization
In somatic cell hybridization, human cells and rodent tumor cells are fused (hybridized); over
time, after the chromosomes mix, human chromosomes are preferentially lost from the hybrid
cell until only one or a few remain. Those individual hybrid cells are then propagated and
maintained as cell lines containing specific human chromosomes. Improvements to this
technique have generated a number of hybrid cell lines, each with a specific single human
chromosome.
