
Genome Sequence Databases: Genomic, Construction of Libraries

J M Struble, P Handke, and R T Gill, University of Colorado, Boulder, CO, USA


© 2009 Elsevier Inc. All rights reserved.

Defining Statement
Vectors Used in Genomic Library Construction
Genomic DNA Preparation
Agarose Gel Electrophoresis Used in Genomic Library Construction
Quantifying DNA and Determining Quality
Ligation Reactions
Transformation of Library DNA into Bacterial Host Strains
Increasing the Number of Transformants
Determining the Number of Transformants Needed for Coverage of an Entire Genome
Current Strategies to Enhance Genomic Library Production
Sample Protocol to Construct a 4 kb Pseudomonas aeruginosa PAO1 Library with Vector PBTB-1 Within an E. coli Host Strain
Further Reading

Glossary

agarose gel electrophoresis Movement of charged molecules, such as proteins and nucleic acids, induced by an electrical field.
blunt ends DNA fragment ends with no overhanging nucleotides resulting from enzyme digestions.
cohesive ends DNA fragment ends with single-stranded overhangs resulting from enzyme digestions; also known as sticky ends.
competent cell Cells treated to accept extracellular DNA.
copy number Number of particular plasmids present in a cell.
electroporation The process of introducing vectors into a host organism by temporarily creating electropores to allow DNA passage.
end repair Process employing Klenow fragments or DNA polymerases to fill in or remove overhangs obtained by restriction enzyme digestions.
genomic library Collection of overlapping segments of DNA including all regions of an organism’s genome.
ligase Enzyme catalyzing the reformation of phosphodiester bonds to join two compatible DNA fragments.
ligation Joining of two compatible fragments of DNA through the reformation of phosphodiester bonds catalyzed by ligase.
partial digestion Enzyme digestion performed for a limited amount of time that does not go to completion for purposes of generating random DNA fragments for library construction.
plasmid Extrachromosomal, circular DNA molecule employed to introduce DNA fragments into a host organism.
recombinant clone Clone containing a recombinant DNA molecule.
restriction enzyme Enzyme recognizing a specific DNA sequence around which it will cleave both strands of a DNA segment.
transformation Introduction of foreign DNA into a host organism.
vector DNA molecule, such as a plasmid, into which exogenous DNA fragments can be ligated for transformation into a host organism and propagation within that host.

Abbreviations

CFU colony forming units
LB broth Luria–Bertani broth

Defining Statement

Genomic libraries are becoming more important as the uses for biotechnology multiply and expand at an increasing pace. Identifying and isolating genes of interest, as well as gene mapping and sequencing, must be preceded by construction of a genomic library. The quality and specific features of such a library are therefore of utmost importance.

186 Techniques | Genome Sequence Databases: Genomic, Construction of Libraries

Vectors Used in Genomic Library Construction

Vector Selection

The choice of backbone vector used for constructing a genomic library is highly dependent upon what studies will be performed. When choosing a backbone vector, decisions must be made about the desired copy number, selective marker, size of genomic DNA insert, host range, and, if the cloned DNA is to be expressed, the type of promoter and ribosomal binding site that should be upstream of the multiple cloning site. Also, the vector used for library construction should remain stable after genomic DNA fragments are inserted.

For constructing a genomic library, it is important that the vector has the ability to replicate within the desired host organism or organisms. The origin of replication, or the ori region, on a vector controls the host range and, to a large extent, the copy number of the vector. Vectors with narrow host ranges such as those containing the ColE1 origin have only been found to replicate in Escherichia coli and closely related bacteria. Broad host range vectors have origins of replication that are recognized in a wide range of bacterial species. The origins of replication from the broad host range plasmids RK2 and pBBR1 are functional across multiple Gram-negative species, while plasmid RSF1010 has been found to replicate within a number of both Gram-positive and Gram-negative species.

The ori region (replicon) of a vector also affects the copy number of vector within the host. High copy number vectors offer the advantage of increased yield of purified vector from a volume of culture to be used for sequencing or molecular cloning purposes. Low copy number vectors are typically desired for studies involving expression of genomic DNA segments, especially those involving toxic DNA segments. When a higher yield of a low copy number vector is needed, low copy number vectors containing the ColE1 or pMB1 origin of replication, which allows for plasmid number amplification prior to purification in the presence of 170 mg l⁻¹ of chloramphenicol, may be used. Chloramphenicol inhibits protein synthesis and thus prevents chromosomal replication. Replication of plasmids with the ColE1 or pMB1 origin depends on enzymes that are long lived, so these plasmids continue to replicate in the presence of chloramphenicol, reaching several thousand copies per cell.

The ori region and the mechanism of vector replication have also been implicated in the structural stability of cloned vectors. Vectors with replicons using rolling circle mechanisms have frequently been found to be unstable for cloning purposes. This instability may be due to the secondary structure formed by the lagging strand of DNA during replication, which may be cleaved by nucleases or experience mutations or deletions during replication.

Features of and around the insert site also have an impact on stability and representation of the library. For the creation of a representative genomic library, all DNA sections of the genome must be cloned, regardless of DNA secondary structure or the encoding of toxic gene products or strong promoters. Vectors in which transcription terminators flank the cloning site prevent transcription into and out of the cloned DNA and have been shown to mitigate cloning bias against difficult DNA and to increase vector stability. For studies involving expression of the insert DNA, a variety of vectors exist that offer options of using constitutive, inducible, titratable, or native promoters, as well as options including ribosomal binding sites and start codons upstream of the cloned insert.

Vector Preparation

Once an appropriate vector has been selected, it must be prepared for cloning. Purified vector should be free of endonuclease contamination and chemicals such as phenol or EDTA that may interfere with downstream enzymatic reactions. There exist a number of standard protocols and commercially available kits that are used for purifying plasmid DNA. Plasmid purification protocols take advantage of the smaller size of the plasmid DNA compared to the significantly larger chromosomal DNA. The method chosen is influenced in part by the size of the plasmid and the host strain. Many commonly used cloning strains, such as E. coli DH5α, have mutations within their endA regions, which eliminate the nonspecific endonuclease activity of endonuclease I, thus allowing higher quality plasmid preparations from these strains. Other strains such as E. coli HB101 produce large amounts of carbohydrates that can interfere with DNA extractions. Large plasmids (>15 kb) are more fragile and thus have to be treated with more gentle extraction methods than small plasmids, which are less susceptible to damage.

Once purified, a vector needs to be linearized prior to ligation. Using the sequence of the vector and the vector map, an enzyme or a set of enzymes must be found that cut only within the cloning site of the vector. Enzymes may leave blunt or sticky ends after digestion. When selecting enzymes to digest the vector, it must be kept in mind that the ends of the vector and the ends of the fragmented genomic DNA that will be cloned must be compatible. The sticky ends of a digested piece of DNA may be converted to blunt ends by using particular DNA modifying enzymes that may fill in overhangs of sticky ends or exhibit exonuclease activity against single-stranded DNA and thus degrade overhangs. T4 DNA polymerase or the Klenow fragment of E. coli DNA polymerase I will remove 3′ overhangs and fill in 5′ overhangs when provided with deoxynucleoside triphosphates. Additionally, DNA end repair kits are available from

several commercial vendors that use proprietary enzymes to generate blunt-ended DNA segments.

In an effort to prevent self-ligation and minimize the number of clones without genomic DNA insert, the 5′-terminal phosphate groups required by ligase for ligation reactions are removed from the linearized vector using an alkaline phosphatase. The most commonly used alkaline phosphatases within molecular biology are shrimp alkaline phosphatase from the Arctic shrimp Pandalus borealis, calf intestinal alkaline phosphatase from calf intestines, bacterial alkaline phosphatase from E. coli C4, and Antarctic Phosphatase from the psychrophilic bacterium strain TAB5. All of these phosphatases are effective at removing 5′ phosphates from DNA, but vary in their activity, buffer compatibility, and ability to be inactivated. Alkaline phosphatases bind tightly to DNA and thus may require aggressive methods to denature. To inactivate bacterial alkaline phosphatase and calf intestinal alkaline phosphatase reactions, proteinase K is used to digest the phosphatase, followed by a phenol–chloroform extraction and an ethanol precipitation. Alternatively, a commercial enzymatic reaction clean-up kit can be used to purify the dephosphorylated DNA. Shrimp alkaline phosphatase and Antarctic Phosphatase from TAB5 are thermolabile and can be completely heat inactivated. It is desirable to use an alkaline phosphatase that is compatible with the restriction enzyme buffer or buffers used in upstream preparation of the vector and that is heat-labile, to minimize the number of purification steps, which may decrease the yield of vector DNA.

The dephosphorylation step, which removes 5′ phosphate groups from the linearized vector, may not go to completion, and thus the background from vector that can self-ligate may be significant. An optional step to help mitigate this problem is to perform a ligation reaction following the dephosphorylation step. Vector DNA that retained its 5′ phosphate groups will self-ligate, while linearized vector DNA that has had its 5′ phosphates successfully removed will be incapable of self-ligating and thus remain linear. The circular self-ligated vector DNA and the linear vector DNA can then be separated by agarose gel electrophoresis. The linearized vector can be excised with a scalpel and purified using traditional DNA extraction methods or commercially available kits for DNA extraction from agarose gels.

If the ligation step to remove vectors that had not been successfully dephosphorylated is omitted, the linearized vector should still be purified from the restriction enzyme digestion and the alkaline phosphatase reaction buffers and enzymes. Phenol–chloroform purification followed by an ethanol precipitation or a commercially available spin column for cleanup of enzymatic reactions may be used for this purpose. Purifying the vector DNA will remove proteins and buffer components that may lessen the efficiency of the ligation of the vector to genomic DNA insert.

PCR amplification may also be used to generate large amounts of linear dephosphorylated vector. Culture-purified and linearized vector is PCR amplified with unphosphorylated primers designed to extend outward from the cloning site into the vector backbone. Proofreading polymerases such as Pfu or Pfx DNA polymerase will generate high-fidelity blunt-ended PCR product. The choice of polymerase influences the types of overhangs that will be generated as well as the fidelity of the PCR product generated. The PCR product should be purified away from the components in the PCR buffer prior to ligation so as not to have extraneous nucleotides, primers, or salts that may interfere with enzymatic reactions. To do so, the sample may be separated via agarose gel electrophoresis followed by excision of the appropriate band containing the PCR product of linearized vector. DNA can then be extracted via a commercial gel extraction kit.

Genomic DNA Preparation

Genomic DNA Purification

Genomic DNA must be isolated from proteins and other cellular debris prior to any enzymatic or mechanical manipulation. Bacterial cells are lysed, typically through exposure to surfactants, such as sodium dodecyl sulfate or Tween-20, or treatment with lysozyme to digest the polysaccharide component of cellular membranes and proteinase K for protease digestion. DNase-free RNase may be added to the lysis step to minimize RNA contamination. Genomic DNA can be purified from cell lysate using a phenol–chloroform extraction followed by an ethanol precipitation or commercially available silica columns. Commercially available kits for genomic DNA isolation are often desirable in that they avoid the use of phenol and chloroform, which are toxic and may interfere with downstream enzymatic reactions. Many commercially available kits avoid phenol by using buffers containing the chaotropic agent guanidine hydrochloride to aid in cell lysis and to effectively denature proteins. Purified genomic DNA should be maintained in a nuclease-free Tris buffer or in nuclease-free water.

Nuclease contamination is a frequent concern associated with genomic DNA isolation. Nuclease activity will degrade DNA and can be easily mistaken for a restriction enzyme digestion or the result of mechanical shearing. Nuclease contamination may be detected by incubating an aliquot of purified DNA at 37 °C for 18 h and then visualizing the DNA on an agarose gel. A control aliquot of DNA that had been stored frozen should be used for comparison. Following electrophoresis, if the incubated aliquot appears to have degraded, nucleases may be contaminating the genomic DNA sample. Additionally, the DNaseAlert kit available from Ambion (Austin, TX) can be used to detect DNase contamination in a sample. This kit detects DNase in a sample through the use of modified oligonucleotides that fluoresce upon cleavage by nucleases that may be within the sample. If nuclease activity is detected, the DNA sample can be exposed to high concentrations of guanidine hydrochloride (6–8 M) for 30 min or be subjected to an additional phenol–chloroform extraction for more thorough deproteinization of the sample. An ethanol precipitation should follow the additional deproteinization step to remove the phenol, chloroform, or chaotropic salts.

Fragmentation

Once purified and established as free of contaminating nucleases, genomic DNA must be fragmented to a desired size and made compatible for ligation into a prepared vector. Genomic DNA may be fragmented using enzymatic or mechanical means. Enzymatic fragmentation is accomplished using either restriction endonucleases or DNase I in the presence of manganese ions. Digestion with DNase I offers the ability to generate a more random pool of DNA segments compared to digestion with restriction endonucleases, which are biased based on the sequence of the genomic DNA and the recognition sites of the enzymes. Despite this, DNA digestion with restriction endonucleases is often preferred due to simplicity in reaction set-up and controllability. Appropriate restriction endonucleases for genomic DNA digestions are chosen based on five factors:

1. Frequency of cutting.
2. Buffer compatibility.
3. Ability to be denatured.
4. Methylation sensitivity.
5. The type of overhang that is produced.

Restriction endonucleases have known recognition sites ranging from 4 to more than 30 bp. Their frequency of cutting within a genome may be predicted if information is known about the genome sequence and the recognition sequence of the enzyme. Enzymes that cut with a high frequency in the genome, typically those with shorter recognition sequences, can be used to generate suitably random fragments by using partial digestions (Figure 1), that is, digestions that have not gone to completion. More than one restriction enzyme may be used for the partial digestion of DNA to ameliorate bias based on recognition sequence. Ideally, all of the restriction enzymes used in a digestion should have similar activity within a common reaction buffer and a similar ability to be denatured, to minimize DNA purification steps. Additionally, it may be desirable to select restriction enzymes that are insensitive to dam methylation (methylation of the N6 position of the adenine in the sequence GATC) or dcm methylation (methylation of the cytosine at its C5 position in the sequences CCTGG and CCAGG), which may render the DNA resistant to cleavage. Lastly, restriction enzymes may be chosen based on the type of overhang that is left after cleavage. They can be selected to leave blunt ends or overhangs (called cohesive ends) that are compatible with the prepared vector. This will eliminate a step to modify the ends of the DNA fragments prior to ligation.

Figure 1 Restriction enzyme digestion of genomic DNA for 5 min (middle) and 16 h (right) next to DNA standard ladder (left).

Mechanical methods for DNA fragmenting offer the advantage of being unbiased toward DNA sequence and thus are useful for creating a more random pool of fragments. The main disadvantages associated with mechanically fragmenting DNA are the limitation on the size of fragments that can be generated and the extensive treatment that is required to repair the ends of the DNA before they can be cloned into backbone vector. A French press, sonicator, clinical nebulizer, small gauge syringe, and HydroShear (Genomic Solutions Inc., Ann Arbor, MI) are common tools used to fragment DNA. Of these tools, the HydroShear was deliberately designed to shear DNA using hydrodynamic force and can be used to reproducibly create random fragments of DNA within a limited size range, independent of DNA concentration, starting length of DNA, and sample volume. Mechanical fragmentation of DNA will result in DNA that must be end-repaired to fill in or remove overhangs and restore 5′ phosphate groups. T4 DNA polymerase, Klenow fragment, or kits available to repair DNA ends may be used for this purpose. T4 polynucleotide kinase, when supplied with ATP, can be used to restore 5′ phosphate groups necessary for ligase activity.
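The point made in this section, that cutting frequency can be predicted from the recognition sequence, can be illustrated with a minimal sketch. It assumes a random genome in which each base occurs independently, with G and C sharing the stated GC fraction equally and A and T sharing the remainder; real genomes deviate from this model, so the numbers are rough guides only. The function names, the 6 Mb genome size, and the 66% GC value are illustrative assumptions, not values taken from the text.

```python
def site_probability(site, gc_content=0.5):
    """Probability that a random position starts an exact match to `site`.

    Assumes independent bases: P(G) = P(C) = gc_content / 2 and
    P(A) = P(T) = (1 - gc_content) / 2 (a simplifying assumption).
    """
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    prob = 1.0
    for base in site.upper():
        if base in "GC":
            prob *= p_gc
        elif base in "AT":
            prob *= p_at
        else:
            raise ValueError("unsupported base: %r" % base)
    return prob


def expected_fragment_length(site, gc_content=0.5):
    """Average distance in bp between sites after a complete digestion."""
    return 1.0 / site_probability(site, gc_content)


def expected_site_count(site, genome_size_bp, gc_content=0.5):
    """Expected number of recognition sites in a genome of the given size."""
    return genome_size_bp * site_probability(site, gc_content)


if __name__ == "__main__":
    # Hypothetical genome parameters, chosen only for illustration.
    genome_size_bp = 6000000
    gc_content = 0.66
    for site in ("GATC", "GAATTC"):
        mean_len = expected_fragment_length(site, gc_content)
        count = expected_site_count(site, genome_size_bp, gc_content)
        print("%s: about one site per %.0f bp, about %.0f sites in total"
              % (site, mean_len, count))
```

Under these assumptions a 4 bp site such as GATC is expected once every 256 bp at 50% GC, which is why frequent cutters are combined with partial digestion when fragments of several kb are wanted.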

Agarose Gel Electrophoresis Used in Genomic Library Construction

Agarose gel electrophoresis is used to separate DNA by size or topology using an electric field that induces negatively charged DNA molecules to migrate to the positive pole through a porous matrix of agarose. In library construction, it is used in both vector preparation and isolation of genomic DNA pieces of a particular size. Circular vector migrates more quickly through an agarose gel than linearized vector with the same molecular weight and thus can be separated from vector that was linearized after digestion with a restriction enzyme. Vector that has ligated can also be separated by gel electrophoresis from vector that has not been ligated and has remained linear.

After genomic DNA has been fragmented, pools of generated segments are separated according to their molecular weight. DNA segments of less than 20 kbp can be sufficiently separated with a standard 1% agarose gel. The resolution of smaller pieces of DNA (<1 kbp) can be enhanced by increasing the percentage of agarose in the gel to 2%, while lowering the agarose content to 0.8% can enhance the resolution of larger pieces of DNA. If larger pieces of DNA need to be resolved, pulsed-field gel electrophoresis (which uses an alternating electric field) may be employed to achieve higher resolution.

Larger segments or fragile pieces of DNA that are sensitive to shearing should be resolved using gels made from low melting point agarose, which does not require vortexing or centrifugation steps to extract embedded DNA, but rather relies upon melting the gel at low temperatures followed by an incubation with β-agarase I to digest the agarose away from embedded DNA.

Figure 2 Relative cloning efficiency of pUC19 after exposure to short or long wavelength UV light. Intact pUC19 DNA was transformed after no UV exposure (‘No UV’) or exposure to 302 nm UV light for 30, 60, or 120 s (‘30 s 302 nm, 60 s 302 nm, 120 s 302 nm’) or to 360 nm UV light for 120 s (‘120 s 360 nm’). Cloning efficiencies were calculated relative to nonirradiated pUC19 DNA. Reprinted with permission from Lucigen Corporation (www.lucigen.com).

Figure 3 Digested DNA to be used in subsequent steps should not be exposed to UV light. After loading ladder in lane 1, a fraction of the digested DNA should be placed in lane 2 with the rest in lane 3. After running the gel (a), lanes 1 and 2 should be separated from lane 3 with a scalpel (b) and examined under UV light. The desired DNA fragment should be excised from lane 2, and both gel pieces should be brought together (c) to obtain the desired DNA fragment piece.

DNA within an agarose gel must be bound to a dye in order to be visualized. The most commonly used dye to stain agarose gels is ethidium bromide (EtBr). EtBr is a fluorescent dye that intercalates between stacked bases of DNA. EtBr is excited by UV radiation and emits energy at 590 nm, fluorescing with a red-orange color. Dye bound to DNA displays an almost 20-fold increase in fluorescent yield, thus allowing for the detection of as little as 10 ng of DNA. If EtBr is used to visualize DNA loaded onto an agarose gel, it is very important to minimize the exposure of the DNA to short wavelength UV light (Figures 2 and 3). Exposure to

short wavelength UV light has been shown to damage DNA and decrease its cloning efficiency. In addition, EtBr is a potent mutagen and moderately toxic, so safer alternatives or dyes that can be seen without the aid of UV light are often desired. Alternative dyes include crystal violet, methylene blue, SYBR Safe (Invitrogen, Carlsbad, CA), or Nile blue.

Quantifying DNA and Determining Quality

It is important to be able to determine the amount and purity of DNA within a sample. DNA should be quantified following purification steps to be sure of a sufficient yield prior to any further manipulations. The quality of DNA should also be monitored before initiating cloning steps to ensure that there are minimal contaminants that would interfere with the efficiency of cloning reactions.

The purity of a DNA sample can be assessed by calculating the ratio of absorbance at 260 nm to the absorbance at 280 nm measured by a spectrophotometer. Nucleic acids have a higher absorbance at 260 nm than at 280 nm. The reverse is true for proteins, which display higher absorbance at 280 nm than at 260 nm. The absorbance at each individual wavelength is thus influenced by the presence of both proteins and nucleic acids. Based on the extinction coefficients for both of these macromolecules, pure samples of DNA would have an A260/A280 ratio of close to 2.0 and pure protein samples would have an A260/A280 ratio of close to 0.6. Typically, a DNA sample with an A260/A280 ratio of greater than 1.7 is acceptable for molecular cloning reactions.

The quantity of DNA can be measured via UV spectroscopy, fluorometry, or by comparison to a standard mass ladder on an agarose gel. Because of its simplicity, the concentration of DNA within a sample is frequently approximated based on the absorbance reading at 260 nm. The concentration is found through application of the Beer–Lambert law, which relates absorbance to concentration through the relationship

$$A = \varepsilon b C$$

where A = absorbance, b = pathlength of the sample cuvette, in units of length, ε = absorption coefficient in units of volume/mass/length, and C = concentration in units of mass per volume.

Using a standard spectrophotometer with a path length of 1 cm, an absorbance reading at 260 nm (A260) of 1 equates to a concentration of approximately 50 µg ml⁻¹ of double-stranded DNA. As mentioned above, other molecules besides DNA that absorb at this wavelength, including proteins, RNA, and salts within the sample, can influence absorbance at 260 nm. For this reason, the amount of DNA within a sample is usually confirmed or measured with a different method. Fluorometric measurements of DNA are more accurate than those obtained from UV spectroscopy and can detect smaller quantities of DNA. To quantify DNA, DNA-specific fluorescent stains, such as PicoGreen or SYBR Green I, are added to a DNA sample and the fluorescence of the sample is compared to the fluorescence of standards of known concentrations. Accurate DNA quantification can also be achieved by running an aliquot of sample along with a standard DNA mass ladder on an agarose gel and comparing DNA band intensities. This method is effective when quantifying distinctly sized pieces of DNA.

Ligation Reactions

A ligation reaction is required to join fragmented genomic DNA to linearized vector. The most commonly used ligase, T4 ligase, is derived from the T4 bacteriophage and requires ATP as a cofactor and an available DNA 5′ phosphate group on at least one of the two ligating DNA fragments. When setting up a ligation reaction, the ratio of moles of insert to moles of vector may be varied to find optimal conditions. Lower ratios may result in inefficient ligation reactions while higher ratios increase the risk of ligating more than one insert per vector. Typically, insert to vector ratios are varied from 1:1 to 5:1. Blunt-ended ligations may perform best with higher ratios. A control ligation containing vector without insert should also be conducted to give an estimate of background clones that contain self-ligated vector. Ligases may or may not need to be inactivated or purified from a reaction prior to transformation. Following the recommendations of the supplier of the ligase generally will give the best ligation and transformation results.

Transformation of Library DNA into Bacterial Host Strains

Naked DNA in solution can be transferred into a bacterial host strain via transformation of competent cells. Bacterial transformation with plasmid DNA is accomplished through heat shock of chemically competent cells or electroporation of electrocompetent cells. Transformation of chemically competent cells usually achieves 10⁵–10⁹ colony forming units (CFU) per µg of supercoiled DNA, while electroporation of electrocompetent cells can yield up to 10¹⁰ CFU µg⁻¹ of DNA.

Generally, the preparation of chemically competent cultures of E. coli involves treating exponentially growing cells with a salt solution, such as 0.1 M CaCl₂. Plasmid DNA is mixed with the cells, and the mixture of plasmid DNA and cell suspension is heat shocked at 42 °C for a brief period, during which the cells can take up the DNA.

While the exact mechanism of DNA uptake by this method is not fully known, it is believed that the swelling of the cells following treatment with the salt solution and the activation of heat shock genes are important in cells taking up DNA from their environment. Factors that influence the frequency of transformation include the purity of the reagents and water used, the viable cell density of the culture, and the trace contaminants that are found on glassware. Additionally, the number of times a culture has been passaged influences transformation efficiencies. Best results are obtained from cultures started directly from cryogenic freezer stock as opposed to cells that have been continuously passaged.

Electrocompetent cells are prepared by repeated washing of cells in low conductivity solutions such as 10% glycerol or 300 mmol l⁻¹ sucrose to reduce the ionic strength of the cell suspension. Electroporation works by using a transmembrane electric field pulse to create small holes, referred to as electropores, within the bacterial membrane through which DNA can pass. Electroporation conditions, such as pulse amplitude and duration, must be sufficient to generate electropores but not increased to the point at which the number and size of electropores detrimentally affect transformation efficiency by causing cell damage or death. The number of pulses, along with the pulse duration and amplitude, can be varied to empirically optimize conditions for various cell lines.

While many bacterial strains can be made competent, the protocols for preparing and manipulating competent E. coli are the most thoroughly worked out. Furthermore, competent E. coli can be obtained from commercial sources. Commercially available competent cells tend to yield transformation efficiencies several orders of magnitude greater than those typically achieved by standard laboratory preparations. Additionally, ligated DNA tends to have lower transformation efficiencies than supercoiled DNA, most likely due to DNA topology. For a combination of these reasons, the initial transformation step of transferring cloned library DNA to a host strain is usually conducted with E. coli, provided that the cloning vector used has an origin of replication that can be recognized by E. coli DNA polymerases. If desired, after this initial transformation step, supercoiled plasmid DNA can be prepared from transformed E. coli and then transformed into a different desired host cell line (Figure 4).

Figure 4 Summary of steps necessary for constructing representative genomic libraries.
Vector: select vector; purify vector; linearize vector; end repair vector if necessary; dephosphorylate vector; remove residual enzymes and buffers.
Genomic DNA: purify genomic DNA; fragment genomic DNA; end repair genomic DNA if necessary; run DNA on gel and obtain desired fragment size; determine quantity and purity of DNA.
Combined: ligate vector and genomic DNA fragments; transform ligation product into bacterial strain.

Increasing the Number of Transformants

When a large number of recombinant clones is required, the ligation reaction can be precipitated in the presence of yeast tRNA. Precipitation of ligation reactions prior to electroporation has been shown to give up to a 400-fold increase in the number of transformants. It is believed that the yeast tRNA alters or stabilizes the topology of the ligated DNA, increasing its efficiency of transformation. In this method, a 5 µl ligation reaction is mixed with 1 µg of yeast tRNA from a 1 mg ml⁻¹ solution, brought up to a total volume of 20 µl with ultrapure water, and then precipitated with twice the volume of cold absolute ethanol. The DNA is pelleted by centrifugation, washed twice with cold 70% ethanol, and allowed to air dry prior to resuspension in 1 µl of ultrapure water. This sample can then be transformed into competent cells.

Determining the Number of Transformants Needed for Coverage of an Entire Genome

The extent to which a library represents all sections of the genome can be statistically determined. The number of necessary transformants to have sufficient coverage or

high probability of containing any given section of the genome is dependent upon the genome size and the size of genomic DNA inserts contained within a library. The Clarke–Carbon equation, based on the assumption that recombinant clones are distributed according to a Poisson distribution across the genome, can be used to determine the number of transformants needed to have a high probability that any given unique DNA sequence is present in a genomic library. The Clarke–Carbon equation can be written as

$$N = \frac{\ln(1 - P)}{\ln(1 - f)}$$

where N = number of recombinant clones required, P = probability of finding a given unique DNA section, and f = fraction of the total genome size that is contained within a single insert of the genomic library, equal to the size of the insert in bp divided by the size of the genome in bp. For the E. coli K12 genome, with a 4 639 221 bp sequence, a genomic library containing 12 000 bp inserts would need 2667 transformants to have a probability of 99.9% of the library containing any given DNA sequence.

While the Clarke–Carbon equation is the most commonly used formula for determining the number of transformants, other equations, such as the Poisson distribution-based Lander–Waterman equation, may also be used. The number of recombinant clones required may also be influenced by the application of the genomic library. Some applications requiring high amounts of overlap between DNA segments require higher numbers of recombinant clones, while applications that require only sections of genes to be present, such as some hybridization studies, may require fewer transformants.

Current Strategies to Enhance Genomic Library Production

Large-scale sequencing projects, including the sequencing of entire genomes, require the construction of high quality and highly representational genomic libraries. These libraries should have minimal sequence bias, and often it is desired to have clones with larger sized inserts to decrease the number of clones required to be sequenced. In order for genomic sequencing projects to be complete, all DNA of the organism must be included within the genomic library, including difficult-to-clone DNA that contains secondary structures, AT- or GC-rich regions, or DNA encoding strong promoters or toxic gene products. Additionally, it is desired for all insert DNA to be stable within cloning vectors so that the vector can be amplified or be available for other molecular biology experiments. To this end, a number of advances have been developed, particularly in the area of cloning vectors, to improve cloning of genomic DNA fragments.

Linear vectors based on the coliphage N15, such as the pJAZZ vectors available commercially from Lucigen (Middleton, WI), have been shown to be stable for large DNA segments (up to 30 kb) or DNA with difficult-to-clone secondary structure. The stability of these vectors is believed to be attributable to their lack of supercoiling and differences in replication compared to standard cloning vectors. Low copy number vectors, such as the pSMART or broad host range pRANGER-BTB series of vectors (available from Lucigen), have features that block transcription into and out of the multiple cloning site by the presence of transcriptional terminators and the lack of constitutive promoters. These vectors have been shown to be several times more stable for cloning random DNA fragments than pUC vectors, thus minimizing cloning gaps caused by difficult-to-clone DNA. Another recently developed vector intended to facilitate sequencing, pLEXX-AK (also available from Lucigen), is designed to clone two inserts per vector, thus reducing the downstream labor involved in processing clones prior to sequencing.

Additional advances in constructing genomic libraries come from a reduction in the amount of work required. Many molecular biology suppliers now offer kits to aid in genomic library construction. These kits typically contain pre-processed vector that has already been linearized and dephosphorylated, along with prepared competent cells, reducing the amount of user time required to create a genomic library. Commercially prepared vectors typically promise much lower background empty vector than is usually obtained when cloning vectors are prepared locally.

Sample Protocol to Construct a 4 kb Pseudomonas aeruginosa PAO1 Library with Vector pBTB-1 Within an E. coli Host Strain

Supplies and Reagents Needed

All kits and products should be used according to the recommendations of the suppliers unless otherwise noted.

• Incubator set at 37 °C.
• Heat block.
• Water bath set at 37 °C.
• Milliliter conical tubes.
• Yeast tRNA (1 mg ml⁻¹ in ultrapure water) (Sigma-Aldrich, St. Louis, MO).
• Ultrapure water (Invitrogen).
• 100% ethanol.

• 70% ethanol.
• E. cloni 10G (F– mcrA Δ(mrr-hsdRMS-mcrBC) Φ80dlacZΔM15 ΔlacX74 endA1 recA1 araD139 Δ(ara, leu)7697 galU galK rpsL nupG λ– tonA) (Lucigen) containing plasmid pBTB-1 (β-lactamase resistance, low copy number).
• P. aeruginosa PAO1.
• Luria–Bertani (LB) broth.
• YT agar plates supplemented with 100 µg ml⁻¹ carbenicillin.
• Sterile-filtered stock solution of carbenicillin (100 mg ml⁻¹).
• HiSpeed Plasmid Midi kit (Qiagen, Valencia, CA).
• 500/G Genomic-tips (Qiagen).
• Genomic DNA Buffer Set (Qiagen).
• MinElute Gel Extraction Kit (Qiagen).
• QIAprep Spin Miniprep Kit (Qiagen).
• Nuclease-free water (Ambion).
• Agarose gel electrophoresis apparatus.
• 1× TAE buffer (50×: 242 g Tris base, 57.1 ml acetic acid, 100 ml 0.5 M EDTA. Add deionized water to 1 l and adjust pH to 8.5) to be used for all electrophoresis steps and in preparing agarose gels.
• Molecular biology grade agarose (Sigma-Aldrich).
• Low-melting point agarose (Promega, Madison, WI).
• 10 mg ml⁻¹ stock solution of EtBr.
• 10% sodium dodecyl sulfate solution (Ambion).
• GELase Agarose Gel-Digesting Preparation (Epicentre Biotechnologies, Madison, WI).
• HincII (New England BioLabs, Ipswich, MA).
• RsaI (New England BioLabs).
• HaeIII (New England BioLabs).
• Antarctic Phosphatase (New England BioLabs).
• T4 DNA ligase (Lucigen).
• High DNA Mass Ladder (Invitrogen).
• 1 kb Plus DNA Ladder (Invitrogen).
• UltraClone DNA Ligation & Transformation Kit containing ELITE DUOS (Lucigen) (contains ligase and E. cloni ELITE 10G electrocompetent cells).

Vector Preparation

1. Inoculate 100 ml of LB broth supplemented with 100 mg l⁻¹ carbenicillin and allow to incubate overnight at 37 °C, 225 rpm.
2. Prepare plasmid from overnight culture with a Qiagen HiSpeed Plasmid Midi kit, eluting with nuclease-free water.
3. Quantify vector by visual comparison to a High DNA Mass Ladder run on a 1% agarose gel stained with 0.5 µg ml⁻¹ ethidium bromide. If vector is too dilute (>100 µg ml⁻¹), concentrate the sample with a speedvac.
4. Blunt cut 1 µg of vector with 10 units of HincII. Heat inactivate the reaction.
5. Allow the sample to equilibrate to 37 °C.
6. Add Antarctic Phosphatase reaction buffer to a final concentration of 1× and 5 units of Antarctic Phosphatase. Following dephosphorylation, heat inactivate the sample.
7. Allow the sample to cool to room temperature.
8. Add 10× T4 DNA ligase Buffer containing ATP to a final concentration of 1× and 2 units of ligase. Allow the reaction to proceed for 2 h at room temperature prior to heat inactivation.
9. Separate circular and linearized vector on a 1% agarose gel stained with 0.5 µg ml⁻¹ EtBr, running the sample adjacent to a 1 kb Plus DNA ladder.
10. Extract the linearized band from the gel with a clean razor blade, using care to minimize UV exposure of the DNA.
11. Purify vector DNA from the agarose gel using a MinElute Gel Extraction Kit, eluting with nuclease-free water.
12. Quantify the vector DNA by visual comparison to a High DNA Mass Ladder run on a 1% agarose gel stained with 0.5 µg ml⁻¹ EtBr.
13. Vector may be stored frozen at −80 °C.

P. aeruginosa PAO1 Genomic DNA 4 kb Insert Preparation

1. Inoculate 100 ml of LB media with P. aeruginosa PAO1 and allow to incubate overnight at 37 °C, with shaking at 225 rpm.
2. Extract genomic DNA using 500/G Genomic-tips and Genomic DNA Buffer Set.
3. Quantify the genomic DNA by visual comparison to a High DNA Mass Ladder run on a 1% agarose gel stained with 0.5 µg ml⁻¹ EtBr.
4. Partially digest 5 µg of DNA with 10 units each of blunt-cutting RsaI and HaeIII for 1 h.
5. Add 10% SDS to the reaction to a final concentration of 1.
6. Resolve DNA fragments on a 1% low melting point agarose gel stained with 0.5 µg ml⁻¹ EtBr, running the sample adjacent to a 1 kb Plus DNA ladder.
7. Excise with a clean razor blade the section of the gel containing the 4 kb fragments of DNA, using caution to minimize UV exposure of the DNA.
8. Recover DNA using a GELase Agarose Gel-Digesting Preparation.
9. Quantify and verify length of the genomic DNA fragments by visual comparison to a High DNA Mass Ladder run on a 1% agarose gel stained with 0.5 µg ml⁻¹ EtBr.
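With the insert size fixed at about 4 kb, the Clarke–Carbon equation given earlier can be evaluated to see how many transformants the library should contain. The following is a minimal Python sketch and not part of the original protocol; the function name `clarke_carbon` and its parameter names are illustrative assumptions. It evaluates N = ln(1 − P)/ln(1 − f) for the two cases quoted in this article (E. coli K12 with 12 000 bp inserts, and P. aeruginosa PAO1 with 4000 bp inserts).

```python
import math


def clarke_carbon(probability: float, insert_bp: int, genome_bp: int) -> float:
    """Return N = ln(1 - P) / ln(1 - f), where f = insert_bp / genome_bp.

    Hypothetical helper for illustration; the result is returned unrounded.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert must be shorter than the genome")
    f = insert_bp / genome_bp
    # log1p(-x) evaluates ln(1 - x) and stays accurate when x is small.
    return math.log1p(-probability) / math.log1p(-f)


# E. coli K12 example from the text: 12 000 bp inserts, 4 639 221 bp genome.
ecoli_n = clarke_carbon(0.999, 12_000, 4_639_221)
# This protocol: 4000 bp inserts, 6 264 404 bp P. aeruginosa PAO1 genome.
pao1_n = clarke_carbon(0.999, 4_000, 6_264_404)

print(round(ecoli_n), round(pao1_n))  # prints: 2667 10815
```

The values are rounded to the nearest whole clone to match the figures in the text; rounding up with `math.ceil` instead would give a slightly more conservative target.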

Ligation and Transformation

1. Following the directions included in the UltraClone DNA Ligation & Transformation Kit, ligate insert DNA to vector with a 3:1 insert to vector ratio.
2. Heat inactivate the reaction.
3. Precipitate the ligation by adding to the reaction 2 µl yeast tRNA (1 mg ml⁻¹), 28 µl ultrapure water, and 100 µl 100% ethanol.
4. Vortex the mixture and allow it to cool at −20 °C for 15 min.
5. Pellet by centrifuging for 15 min (13 000 rpm and 4 °C).
6. Carefully remove the supernatant.
7. Wash the pellet with 70% ethanol.
8. Centrifuge the sample again (13 000 rpm and 4 °C).
9. Remove the supernatant and allow the pellet to air dry.
10. Resuspend the pellet in 2 µl of ultrapure water.
11. Transform the precipitated ligation reaction into electrocompetent E. cloni cells included in the kit following the suggestions of the supplier.
12. Allow cells to recover for an hour at 37 °C with shaking at 225 rpm.
13. Make a 1/100 dilution of the transformed cells prior to plating.
14. Plate transformed cells and dilution onto YT agar plates containing 100 µg ml⁻¹ carbenicillin.
15. Incubate the plates overnight at 37 °C.
16. Count the number of colonies on the plates onto which the dilution was plated. Calculate the number of transformants based on this count.

Calculating the Number of Recombinant Clones Needed for a 4000 bp Library

1. For a representational library (99.9% probability of containing any given DNA sequence), calculate the number of transformants needed using the Clarke–Carbon equation (see ‘Determining the number of transformants needed for coverage of an entire genome’). The size of the P. aeruginosa PAO1 genome is 6 264 404 bp.

$$N = \frac{\ln(1 - P)}{\ln(1 - f)}$$

With P = 0.999 and f = 4000/6 264 404, the number of transformants needed is N = 10 815.

Confirmations

1. Fill ten sterile 15 ml conical tubes with 5 ml of LB supplemented with 100 µg ml⁻¹ carbenicillin.
2. Pick ten colonies chosen at random from the plated library transformation to inoculate the ten conical tubes.
3. Allow the cultures to incubate overnight at 37 °C and 225 rpm.
4. Prepare the vectors from these cultures using a QIAprep Spin Miniprep Kit.
5. Vector DNA may be run on a 1% agarose gel to check for appropriately sized insert or be sequenced with primers designed to extend into the multiple cloning site.

See also: Chromosome, Bacterial; DNA Restriction and Modification; DNA Sequencing and Genomics; Plasmids, Bacterial; Recombinant DNA, Basic Procedures

Further Reading

Adkins S and Burmeister M (1996) Visualization of DNA in agarose gels as migrating colored bands: Applications for preparative gels and educational demonstrations. Analytical Biochemistry 240(1): 17–23.
Choi KH, Kumar A, and Schweizer HP (2006) A 10-min method for preparation of highly electrocompetent Pseudomonas aeruginosa cells: Application for DNA fragment transfer between chromosomes and plasmid transformation. Journal of Microbiological Methods 64(3): 391–397.
Clewell DB (1972) Nature of Col E 1 plasmid replication in Escherichia coli in the presence of the chloramphenicol. Journal of Bacteriology 110(2): 667–676.
del Solar G, Giraldo R, Ruiz-Echevarría MJ, Espinosa M, and Díaz-Orejas R (1998) Replication and control of circular bacterial plasmids. Microbiology and Molecular Biology Reviews 62(2): 434–464.
Godiska R, Patterson M, Schoenfeld T, and Mead DA (2005) Beyond pUC: Vectors for cloning unstable DNA. In: Kieleczawa J (ed.) DNA Sequencing: Optimizing the Process and Analysis, pp. 55–75. Sudbury, MA: Jones and Bartlett.
Grundemann D and Schomig E (1996) Protection of DNA during preparative agarose gel electrophoresis against damage induced by ultraviolet light. Biotechniques 21(5): 898–903.
Kues U and Stahl U (1989) Replication of plasmids in Gram-negative bacteria. Microbiology Reviews 53(4): 491–516.
Lander ES and Waterman MS (1988) Genomic mapping by fingerprinting random clones: A mathematical analysis. Genomics 2(3): 231–239.
Lynch MD and Gill RT (2006) Broad host range vectors for stable genomic library construction. Biotechnology and Bioengineering 94(1): 151–158.
McClelland M (1981) The effect of sequence specific DNA methylation on restriction endonuclease cleavage. Nucleic Acids Research 9(22): 5859–5866.
Mead DA, Patterson MK, Schoenfeld T, et al. (2002) High stability vectors for cloning unstable DNA. Genome Sequencing and Analysis Conference XIV. Boston, MA.
Meyer R, Figurski D, and Helinski DR (1975) Molecular vehicle properties of the broad host range plasmid RK2. Science 190(4220): 1226–1228.
Pieterse B, Quirijns EJ, Schuren FH, and van der Werf MJ (2005) Mathematical design of prokaryotic clone-based microarrays. BMC Bioinformatics 6: 238.
Rand KN (1996) Crystal violet can be used to visualize DNA bands during gel electrophoresis and to improve cloning efficiency. Elsevier Trends Journals Technical Tips Online T40022.
Ravin NV and Ravin VK (1999) Use of a linear multicopy vector based on the mini-replicon of temperate coliphage N15 for cloning DNA with abnormal secondary structures. Nucleic Acids Research 27(17): 13.
Rodriguez RL and Denhardt DT (1988) Vectors: A survey of molecular cloning vectors and their uses. In: Rodriguez RL and Denhardt DT (eds.) Biotechnology. Stoneham, MA: Butterworth Publishers.
Rosche WA, Trinh TQ, and Sinden RR (1995) Differential DNA secondary structure-mediated deletion mutation in the leading and lagging strands. Journal of Bacteriology 177(15): 4385–4391.

Sambrook J, Fritsch EF, and Maniatis T (1989) Molecular Cloning: A Laboratory Manual. 2nd edn. Cold Spring Harbor: Cold Spring Harbor Laboratory Press.
Trinh TQ and Sinden RR (1991) Preferential DNA secondary structure mutagenesis in the lagging strand of replication in E. coli. Nature 352(6335): 544–547.
Zhu H and Dean RA (1999) A novel method for increasing the transformation efficiency of Escherichia coli-application for bacterial artificial chromosome library construction. Nucleic Acids Research 27(3): 910–911.

Relevant Websites

http://www.lucigen.com – Lucigen Corporation
http://www.qiagen.com – Qiagen
