Nucleic Acids: Amplification and Sequencing
Genomic DNA

• Genomic DNA constitutes the total genetic information of an organism. The genomes of
almost all organisms are DNA, the only exceptions being some viruses that have RNA
genomes. Genomic DNA molecules are generally large, and in most organisms are
organized into DNA–protein complexes called chromosomes. The size, number of
chromosomes, and nature of genomic DNA vary between different organisms (Fig A).
• Genomic DNA contains genes, discrete regions that encode a protein or RNA. A gene
comprises the coding DNA sequence, as well as the associated regulatory elements that
control gene expression. Nuclear eukaryotic genes also contain noncoding regions called
introns. The number of genes varies widely between different organisms. Coding DNA
represents only a small fraction of eukaryotic genomic DNA: the bulk of the DNA is
noncoding, much of which is made up of repetitive sequences.

(A)
Organism                        Base pairs per     Molecular weight of    Number of
                                haploid genome     genome (daltons)       chromosomes
SV40                            5243               3.4 x 10^6             –
ΦX174                           5386               3.5 x 10^6             –
Adenovirus 2                    35,937             2.3 x 10^7             –
Lambda                          48,502             3.2 x 10^7             –
Escherichia coli                4.7 x 10^6         3.1 x 10^9             x = 1
Saccharomyces cerevisiae        1.5 x 10^7         9.8 x 10^9             2x = 32
Dictyostelium discoideum        5.4 x 10^7         3.5 x 10^10            x = 6
Arabidopsis thaliana            7.0 x 10^7         4.6 x 10^10            2x = 10
Caenorhabditis elegans          8.0 x 10^7         5.2 x 10^10            2x = 12
Drosophila melanogaster         1.4 x 10^8         9.1 x 10^10            2x = 8
Gallus domesticus (chicken)     1.2 x 10^9         7.8 x 10^11            2x = 78
Mus musculus (mouse)            2.7 x 10^9         1.8 x 10^12            2x = 40
Rattus norvegicus (rat)         3.0 x 10^9         2.0 x 10^12            2x = 42
Xenopus laevis                  3.1 x 10^9         2.0 x 10^12            2x = 36
Homo sapiens                    3.3 x 10^9         2.1 x 10^12            2x = 46
Zea mays                        3.9 x 10^9         2.5 x 10^12            2x = 20
Nicotiana tabacum               4.8 x 10^9         3.1 x 10^12            2x = 48
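
The molecular weights in table (A) track the genome sizes closely. A minimal sketch of that relationship follows, assuming an average mass of about 650 daltons per base pair; this figure is a common rule of thumb and is not stated in the table, and the function name is chosen here for illustration only.

```python
# Assumption: ~650 daltons per base pair of double-stranded DNA.
# This is a rule-of-thumb average, not a value given in the source table.
AVG_DALTONS_PER_BP = 650.0


def genome_molecular_weight(base_pairs: float) -> float:
    """Approximate molecular weight (daltons) of a double-stranded genome."""
    if base_pairs < 0:
        raise ValueError("base_pairs must be non-negative")
    return base_pairs * AVG_DALTONS_PER_BP


if __name__ == "__main__":
    # Genome sizes taken from table (A).
    for organism, bp in [("SV40", 5243), ("Lambda", 48502), ("Homo sapiens", 3.3e9)]:
        print(f"{organism}: {genome_molecular_weight(bp):.1e} daltons")
```

With this assumption the computed values land close to the tabulated molecular weights, which suggests the table was derived in a similar way.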
Plasmid DNA

• A plasmid is a small DNA molecule that is physically separate from, and can
replicate independently of, chromosomal DNA within a cell. Most commonly found
as small circular, double-stranded DNA molecules in bacteria, plasmids are
sometimes present in archaea and eukaryotic organisms. In nature, plasmids carry
genes that may benefit survival of the organism (e.g. antibiotic resistance), and can
frequently be transmitted from one bacterium to another (even of another
species) via horizontal gene transfer. Artificial plasmids are widely used as vectors
in molecular cloning, serving to drive the replication of recombinant DNA
sequences within host organisms.
• Plasmids relate to the host bacterium in one of two ways: non-integrating plasmids
replicate separately from the host chromosome, as in the top panel (Fig A), whereas
episomes integrate into the host chromosome (Fig B).
(A) (B)
Basic procedure of DNA extraction
• There are a few steps in DNA extraction:
• Breaking the cells, commonly referred to as cell disruption or cell lysis.
• Removing membrane lipids by adding a detergent which also serves in cell lysis.
• Removing proteins by adding a protease (optional but often done).
• Removing RNA by adding an RNase (almost always done).
• Purifying the DNA from the detergents, proteins, salts and reagents used during the cell
lysis step. The most commonly used procedures are:
– Phenol–chloroform extraction and ethanol precipitation, in which phenol
denatures the proteins in the sample. After centrifugation of the sample, the
denatured proteins stay in the organic phase, while the aqueous phase containing
the nucleic acid is mixed with chloroform, which removes phenol residues from the
solution. (Note: DNA isolation uses phenol buffered to pH 8, whereas RNA must
be isolated using acidic phenol.)
– Minicolumn purification, which relies on the fact that nucleic acid adsorbs to
the solid phase (silica) depending on the pH (low pH) and the salt content
(high salt).
• After isolation, the DNA is dissolved in a slightly alkaline buffer, usually TE
buffer, or in ultra-pure water.
Phenol/Chloroform Extraction and Ethanol Precipitation

• Phenol–chloroform extraction is a liquid–liquid extraction technique. When the
sample is vortexed with phenol–chloroform (1:1 ratio) and centrifuged, the
precipitated proteins remain as a white coagulated mass at the interface
between the aqueous and organic layers and can be drawn off carefully. The
aqueous solution of nucleic acids (DNA and RNA) can then be removed with a
pipette. The RNA can then be degraded with the enzyme ribonuclease.
• Under acidic conditions (pH 4–6), DNA partitions into the organic phase while RNA
remains in the aqueous phase. Under neutral conditions (pH 7–8), both DNA and
RNA partition into the aqueous phase. In a last step, the nucleic acids are
recovered from the aqueous phase by precipitation with 2-propanol or ethanol.

Process of phenol extraction


DNA Extraction Technologies

• DNA can be purified using many different methods and the downstream
application determines how pure the DNA should be. In addition to isolation using
home-made methods (e.g., CsCl gradients), DNA extraction kits are available from
many suppliers. The characteristics of the 3 most common types of DNA extraction
kit are shown in the table.
Anion-exchange technology
– What it is: Solid-phase, anion-exchange chromatography
– Procedure: Binding: variable salt and pH. Elution: variable salt and pH. Alcohol
precipitation
– Advantages: Delivers ultrapure, transfection-grade DNA for optimal results in
sensitive applications
Silica-membrane technology
– What it is: Selective adsorption to silica membranes
– Procedure: Binding: high salt. Elution: low salt. Ready-to-use eluate
– Advantages: Delivers high-purity nucleic acids for use in most downstream
applications. Fast, inexpensive. No silica-slurry carryover, no alcohol precipitation
Magnetic-particle technology
– What it is: Binding to magnetic silica particles under controlled ionic conditions
– Procedure: Binding: high salt. Elution: low salt. Ready-to-use eluate
– Advantages: Delivers high-purity nucleic acids for use in most downstream
applications. Fast, inexpensive. Easy to automate; no alcohol precipitation
Spin column-based nucleic acid purification

• Column-based nucleic acid purification relies on the fact that nucleic acid binds
(adsorbs) to the solid phase (silica or other) depending on the pH and the
salt content of the buffer, usually Tris-EDTA (TE buffer).
1. The sample is added to the column with DNA binding buffer containing guanidine
hydrochloride, Triton X-100, isopropanol and a pH indicator.
2. The column is then washed (80% EtOH).
3. The DNA can be eluted from the column with TE buffer or simply water (pH 8.4).
• In the presence of chaotropic agents, such as sodium iodide or sodium
perchlorate, DNA binds to silica, glass particles or diatoms, unicellular algae
that shield their cell walls with silica. This property was used to purify
nucleic acid using glass powder or silica beads under alkaline conditions. The method
was later improved by using guanidinium thiocyanate or guanidinium hydrochloride as
the chaotropic agent.
(A) (B)
Silica on a column with water and with DNA sample in chaotropic buffer.
pH dependence of DNA adsorption to silica membranes.
Spectrophotometric measurement of DNA concentration

• The concentration of DNA and RNA should be determined by measuring the
absorbance at 260 nm (A260) in a spectrophotometer. For accuracy, absorbance
readings at 260 nm should fall between 0.15 and 1.0.
• Pure DNA has an A260/A280 ratio of 1.8–2.0 in 10 mM Tris·Cl, pH 8.5.
• Strong absorbance at 280 nm, resulting in a low A260/A280 ratio, indicates the
presence of contaminants, such as proteins.
• Strong absorbance at 270 nm and 275 nm may indicate the presence of
contaminating phenol.
• Absorbance at 325 nm suggests contamination by particulates in the solution or
dirty cuvettes.

Spectrophotometric conversions from absorbance at 260 nm


1 A260 unit of        Concentration (µg/ml)*
dsDNA                 50
ssDNA                 33
Oligonucleotides      20–30
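
The conversion factors and the purity check above can be combined in a short calculation. This is a minimal sketch: the function names and the `dilution_factor` parameter are illustrative assumptions, and oligonucleotides are left out because the table gives only a range (20–30 µg/ml) for them.

```python
# Conversion factors (µg/ml per A260 unit) taken from the table above.
A260_FACTORS = {"dsDNA": 50.0, "ssDNA": 33.0}


def concentration_ug_per_ml(a260, nucleic_acid="dsDNA", dilution_factor=1.0):
    """Concentration of the undiluted sample from an A260 reading.

    `dilution_factor` is an illustrative parameter: 10.0 means the sample
    was diluted 1:10 before it was measured.
    """
    if not 0.15 <= a260 <= 1.0:
        # The text recommends readings between 0.15 and 1.0 for accuracy.
        raise ValueError("A260 reading outside the 0.15-1.0 range; re-dilute the sample")
    return a260 * A260_FACTORS[nucleic_acid] * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; pure DNA gives roughly 1.8-2.0."""
    return a260 / a280


if __name__ == "__main__":
    print(concentration_ug_per_ml(0.5, "dsDNA", dilution_factor=10.0), "µg/ml")
    print(round(purity_ratio(0.9, 0.5), 2))
```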
Plasmid vs Genomic DNA Extraction: The Difference
• Plasmid DNA preparation follows the same steps as total cell DNA preparation, with one
important difference: the plasmid DNA must always be separated from the large
amount of bacterial chromosomal DNA that is also present in the cells.
• Size-based separation: Bacterial cell disruption is carried out very gently to prevent
wholesale breakage of the bacterial DNA. Treatment with EDTA and lysozyme is carried
out in the presence of sucrose, which prevents the cells from bursting. Sphaeroplasts
(partially wall-less cells) are formed that retain an intact cytoplasmic membrane. Cell
lysis is then induced by adding a non-ionic detergent such as Triton X-100, which causes
minimal breakage of the bacterial DNA, so centrifugation leaves a cleared lysate
consisting almost entirely of plasmid DNA. A cleared lysate will, however, invariably
retain some chromosomal DNA. Size fractionation alone is therefore not sufficient to
remove this contaminant, and alternative methods must be considered.
Plasmid DNA Extraction: Alkaline Denaturation

• Non-supercoiled DNA is denatured within a narrow pH range in which supercoiled
plasmids are not. If the pH of a cell extract or cleared lysate is increased to
12.0–12.5 by addition of NaOH, the hydrogen bonding in non-supercoiled DNA molecules
is broken, causing the double helix to unwind and the two polynucleotide chains to
separate. When acid is added, these denatured DNA strands re-aggregate into a tangled
mass. The insoluble network can be pelleted by centrifugation, leaving pure plasmid
DNA in the supernatant. Under some circumstances (specifically cell lysis by SDS and
neutralization with sodium acetate), most of the protein and RNA also becomes
insoluble and can be removed by centrifugation.

Process of alkaline denaturation


CsCl Density Gradient Centrifugation

• Under high centrifugal force, a solution of cesium chloride (CsCl) will
dissociate, and the heavy Cs+ ions will be forced towards the outer end of the
tube, thus forming a shallow density gradient. The gradient is sufficient to separate
types of DNA with slight differences in density due to differing (G+C) content or
physical form (e.g., linear versus circular molecules). Density gradient
centrifugation in the presence of ethidium bromide (EtBr) can be used to separate
supercoiled DNA from non-supercoiled molecules. EtBr binds to DNA molecules by
intercalating between adjacent base pairs, causing partial unwinding of the double
helix. More EtBr is bound by linear chromosomal DNA than by supercoiled
plasmid DNA. The density of chromosomal DNA decreases by 0.125 g/mL, whereas
the density of plasmid DNA decreases by only 0.085 g/mL, so the two forms band at
different positions in the gradient.
• The EtBr bound to the plasmid DNA is then extracted with n-butanol and the CsCl is
removed by dialysis.

EtBr-CsCl density
gradient centrifugation
RNA Isolation: The Proteinase K Method

• The isolation of RNA is more challenging than that of DNA. Samples are easily
contaminated by ubiquitous ribonucleases, which fragment the RNA.
RNA also forms tight complexes with proteins, and harsh treatment is required to
release it from these complexes. Furthermore, RNA is not stable under basic
conditions.

• A common method for RNA isolation is the proteinase K method. In this method,
the cells are lysed by incubation in a hypotonic solution, followed by centrifugation
to remove DNA and cell debris. Treatment with the proteolytic enzyme proteinase
K dissociates the RNA–protein complexes and digests the proteins. The digestion
products are then removed by phenol/chloroform extraction, and the RNA in the
remaining solution is precipitated using ethanol.
The Principle of PCR
• The polymerase chain reaction (PCR) is a biochemical technology that allows the
enzyme-catalysed amplification of a single or a few copies of DNA to generate
thousands to millions of copies of a particular DNA sequence.
• PCR is now an indispensable technique for a variety of applications. These include
DNA cloning; functional analysis of genes; the diagnosis of hereditary diseases; the
identification of genetic fingerprints (used in forensic sciences and paternity
testing); and the detection and diagnosis of infectious diseases.
• Most PCR methods typically amplify DNA fragments of between 0.1 and 10 kilobase
pairs (kb), although some techniques allow for amplification of fragments up
to 40 kb in size.
• A basic PCR setup requires several components and reagents. These components
include:
– DNA template that contains the DNA region (target) to be amplified.
– Two primers that are complementary to the DNA target.
– DNA polymerase with a temperature optimum at around 70 °C.
– Deoxynucleoside triphosphates (dATP, dCTP, dGTP, and dTTP, or in general
dNTPs).
– Buffer solution to maintain the optimum pH and to supply magnesium ions.
Three Steps of PCR

• The PCR reaction is carried out with all the reagents in a single sample tube which
is placed in a thermocycler. The reaction takes place in three temperature
controlled steps:
– Step 1: DNA denaturation
– Step 2: Primer annealing
– Step 3: Primer extension
(A) The steps of PCR (B) The temperature profile of the PCR
DNA Amplification in PCR

• A typical PCR consists of 20 to 35 cycles. The denaturation, annealing and
extension steps are carried out for about 30–120 s each.
• In theory, the number of DNA copies is doubled during each cycle, resulting in an
exponential amplification.

• As the reaction proceeds, continued thermal cycling no longer produces a
significant amount of additional product. This plateau is caused by depletion of
reagents (primers, nucleotides, magnesium), deterioration of the polymerase and
phosphate inhibition of the enzyme.

The amplification profile showing the


slow start, exponential amplification
and plateau phase
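
The theoretical doubling per cycle can be written as N = N0 x (1 + E)^n, where n is the number of cycles. The sketch below illustrates only the exponential phase; the `efficiency` parameter E (1.0 for perfect doubling) is an assumption introduced here to show how sub-ideal cycles change the yield, and the plateau phase is deliberately not modelled.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds of PCR: N = N0 * (1 + E) ** n.

    efficiency=1.0 is the theoretical doubling per cycle; lower values
    are an illustrative way to represent imperfect cycles. The plateau
    caused by reagent depletion is not modelled.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", f"{pcr_copies(1, n):.3g}", "copies from one template")
```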
DNA Amplification in PCR

• Amplification during the first four cycles. The first target-length molecules appear
after the third cycle.
DNA Polymerase

• PCR was facilitated by the introduction of heat-stable polymerases such as Taq
polymerase. In addition to being heat stable, some enzymes, e.g. Pfu and KOD,
exhibit a proof-reading activity, the 3' to 5' exonuclease activity.
• Commercially available Pfu typically results in an error rate of 1 in 1.3 million
base pairs and can yield 2.6% mutated products when amplifying 1 kb fragments
using PCR. However, Pfu is slower and typically requires 1–2 minutes per cycle to
amplify 1 kb of DNA at 72 °C. Using Pfu DNA polymerase in PCR reactions
also results in blunt-ended PCR products.
Primers

• Good primer design is essential for successful reactions. The important design
considerations listed below are key to specific amplification with high yield.
– Primer Length
– Primer Melting Temperature
– Primer Annealing Temperature
– GC Content
– GC Clamp
– Primer Secondary Structures
– Repeats and Runs
– Avoid Template Secondary Structure
– Avoid Cross Homology
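
Several of the considerations listed above can be estimated directly from the primer sequence. The sketch below computes GC content, a rough melting temperature and the longest single-base run. The melting temperature uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is an assumption made here (the slides do not name a formula) and is only a rough guide for short oligonucleotides; the function names are illustrative.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature in °C by the Wallace rule.

    Assumption: Tm = 2 * (A + T) + 4 * (G + C), a rule of thumb for
    short oligonucleotides, not a formula taken from the slides.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def longest_run(primer):
    """Length of the longest run of one repeated base (e.g. 4 for AAAA)."""
    longest = current = 0
    previous = ""
    for base in primer.upper():
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


if __name__ == "__main__":
    example = "ATGCATGCATGCATGCATGC"  # made-up 20-mer for illustration
    print(gc_content(example), wallace_tm(example), longest_run(example))
```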
PCR Buffer

• The ionic strength has a crucial influence on the specificity of the PCR. A typical
buffer system has an ionic strength of about 50 mM and consists of Tris-HCl, pH
8.3, with KCl or NaCl.

• MgCl2 at concentrations between 0.5 and 5 mM is always added. The Mg2+ ions
form a soluble complex with DNA and polymerase, which brings the polymerase and
DNA into close proximity and balances the negative charges on the DNA.
Additionally, they stimulate the polymerase activity. At low Mg2+ concentrations,
the enzymatic activity of the polymerase is decreased. Excess Mg2+ results in poor
denaturation because dsDNA molecules are stabilised by the Mg2+ ions. High
magnesium concentrations also lead to increased annealing of the primers to incorrect
sites and loss of specificity.

• Additives include glycerine, bovine serum albumin, and polyethylene glycol (PEG)
to stabilise the polymerase and to optimise primer annealing. Denaturation can be
improved by adding dimethyl sulfoxide (DMSO) or a surfactant such as Tween-20.
Real-Time PCR (qPCR)

• In conventional PCR, the amplified DNA product, or amplicon, is detected in an
end-point analysis. In contrast, in real-time PCR the accumulation of amplification
product is measured as the reaction progresses, that is, in real time, with
product quantification after each cycle.
• The main advantage of real-time PCR over conventional PCR is that it allows you to
determine the initial number of copies of template DNA (the amplification target
sequence) with accuracy and high sensitivity over a wide dynamic range.
• A qPCR analysis is illustrated by a typical amplification plot and a series of
template dilutions.

(A) (B)
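
A template dilution series is commonly turned into a standard curve from which the starting copy number of an unknown sample is read. The sketch below assumes the usual approach of fitting the threshold cycle (Ct) against log10 of the known copy number; the term Ct, the function names and the example numbers are assumptions for illustration and do not come from the slides.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency E from the slope; E = 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def estimate_copies(ct, slope, intercept):
    """Starting copy number of an unknown sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


if __name__ == "__main__":
    # Invented ten-fold dilution series, for illustration only.
    standards = [1e6, 1e5, 1e4, 1e3, 1e2]
    cts = [15.0, 18.32, 21.64, 24.96, 28.28]
    slope, intercept = fit_standard_curve(standards, cts)
    print("slope:", round(slope, 2))
    print("efficiency:", round(amplification_efficiency(slope), 2))
    print("copies at Ct 20:", round(estimate_copies(20.0, slope, intercept)))
```

A slope near -3.32 corresponds to a doubling of product in every cycle, which ties the standard curve back to the theoretical amplification described earlier.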
Real-Time PCR: dsDNA Binding Dye Assays

• A DNA-binding dye binds to all double-stranded (ds) DNA in PCR, causing
fluorescence of the dye. An increase in DNA product during PCR therefore leads to
an increase in fluorescence intensity and is measured at each cycle, thus allowing
DNA concentrations to be quantified.

(A) Dye-based assay with an intercalator (B) Table of intercalators


Real-Time PCR: Probe Based Assays

• The method relies on a DNA-based probe with a fluorescent reporter at one end
and a quencher of fluorescence at the opposite end of the probe. The close
proximity of the reporter to the quencher prevents detection of its fluorescence;
breakdown of the probe by the 5' to 3' exonuclease activity of the Taq polymerase
breaks the reporter-quencher proximity and thus allows unquenched emission of
fluorescence, which can be detected after excitation with a laser.

(1) In intact probes, reporter fluorescence is quenched.
(2) Probes and the complementary DNA strand are hybridized and reporter
fluorescence is still quenched.
(3) During PCR, the probe is degraded by the polymerase and the fluorescent
reporter released.
Applications of qPCR

• There are numerous applications for quantitative polymerase chain reaction in the
laboratory. It is commonly used for both diagnostic and basic research. Uses of the
technique in industry include the quantification of microbial load in foods or on
vegetable matter, the detection of GMOs (Genetically modified organisms) and the
quantification and genotyping of human viral pathogens.
– Diagnostic uses
– Microbiological uses
– Uses in research
– Detection of phytopathogens
– Detection of genetically modified organisms
– Clinical quantification and genotyping
Reverse Transcription PCR (RT-PCR)

• RT-PCR is commonly used in molecular biology to detect RNA expression levels. In
RT-PCR, the RNA template is first converted into a complementary DNA (cDNA) using a
reverse transcriptase. The cDNA is then used as a template for exponential
amplification using PCR. RT-PCR is currently the most sensitive method of RNA
detection available. The use of RT-PCR for the detection of RNA transcripts has
revolutionized the study of gene expression in the following important ways:

– Made it theoretically possible to detect the transcripts of practically any gene
– Enabled sample amplification and eliminated the need for abundant starting
material that one faces when using northern blot analysis
– Provided tolerance for RNA degradation as long as the RNA spanning the primer is
intact

Principle of RT-PCR
Applications of RT-PCR
• The exponential amplification via reverse transcription polymerase chain reaction
provides a highly sensitive technique in which a very low copy number of RNA
molecules can be detected. RT-PCR is widely used in the diagnosis of genetic
diseases and, semiquantitatively, in the determination of the abundance of
specific RNA molecules within a cell or tissue as a measure of gene
expression.
– Research methods
– Gene Insertion
– Genetic Disease Diagnosis
– Cancer Detection

• Challenge of RT-PCR: The exponential growth of the reverse-transcribed
complementary DNA (cDNA) during the multiple cycles of PCR produces inaccurate
end-point quantification because linearity is difficult to maintain. To
provide accurate detection and quantification of the RNA content of a sample,
qRT-PCR was developed, which uses a fluorescence-based modification to monitor the
amplification products during each cycle of PCR. Furthermore, the extreme
sensitivity of the technique can be a double-edged sword, since even the slightest
DNA contamination can lead to undesirable results.
DNA Sequencing
• DNA sequencing is the process of determining the precise order of nucleotides
within a DNA molecule. It includes any method or technology that is used to
determine the order of the four bases—adenine, guanine, cytosine, and thymine—
in a strand of DNA.
• DNA sequencing may be used to determine the sequence of individual genes,
clusters of genes, full chromosomes or entire genomes. Depending on the
methods used, sequencing may provide the order of nucleotides in DNA or RNA
isolated from virtually any source of genetic information. The resulting sequences
may be used for further scientific progress or may be used by medical personnel to
make treatment decisions or in genetic counseling.
• DNA sequencing techniques can be classified as follows, based on their time of
development and efficiency:
– Basic methods: Maxam-Gilbert sequencing and chain termination methods.
– Advanced methods and de novo sequencing: shotgun sequencing and bridge
PCR.
– Next-generation methods: massively parallel signature sequencing (MPSS),
polony sequencing, pyrosequencing, Illumina (Solexa) sequencing, etc.

http://en.wikipedia.org/wiki/DNA_sequencing
The Chain Terminator Method or Sanger Sequencing

• In Sanger sequencing, the PCR is performed in the presence of fluorescently
labelled ddNTPs that, when incorporated into the growing DNA strand, cause
synthesis to stop. The fluorescently labelled DNA fragments can then be separated by
capillary electrophoresis to read the sequence. Fig. B shows the chemical structure
of the four ddNTPs.

(A) (B)
Challenges in Sanger Sequencing

• Common challenges of DNA sequencing with the Sanger method include poor
quality in the first 15-40 bases of the sequence due to primer binding and
deteriorating quality of sequencing traces after 700-900 bases.
• Current methods can directly sequence only relatively short (300-1000 nucleotides
long) DNA fragments in a single reaction. The main obstacle to sequencing DNA
fragments above this size limit is insufficient power of separation for resolving
large DNA fragments that differ in length by only one nucleotide.

View of the start of an example dye-terminator read


The Principle of Pyrosequencing

• Pyrosequencing differs from Sanger sequencing in that it relies on the detection
of pyrophosphate release on nucleotide incorporation, rather than chain
termination with dideoxynucleotides. Pyrosequencing consists of the following
steps:
– Step 1: A single-stranded DNA template, usually PCR amplified, is immobilised
onto a surface. A suitable primer is then hybridised to this single strand
together with four enzymes (DNA polymerase, ATP sulfurylase, luciferase and
apyrase) and two substrates (adenosine 5' phosphosulfate (APS) and luciferin).

– Step 2: The addition of one of the four dNTPs initiates the second step
(dATPαS, which is not a substrate for luciferase, is added instead of dATP).
DNA polymerase incorporates the correct, complementary dNTPs onto the
template. This incorporation releases pyrophosphate (PPi) stoichiometrically.

Step 1 Step 2

https://www.youtube.com/watch?v=nFfgWGFe0aA
The Principle of Pyrosequencing
• Step 3: ATP sulfurylase quantitatively converts PPi to ATP in the presence of
adenosine 5´ phosphosulfate. This ATP acts as fuel to the luciferase-mediated
conversion of luciferin to oxyluciferin that generates visible light in amounts that
are proportional to the amount of ATP. The light produced in the luciferase-
catalyzed reaction is detected by a camera and analyzed in a program.
• Step 4: Unincorporated nucleotides and ATP are degraded by the apyrase, and the
reaction can restart with another nucleotide.
Step 3 Step 4
The Principle of Pyrosequencing

• During the process of iterative nucleotide addition, a complementary DNA strand
is synthesized one base at a time, and the sequence can be read from the resulting
pyrogram.

• Currently, a limitation of the method is that the lengths of individual reads of DNA
sequence are in the neighborhood of 300–500 nucleotides, shorter than the
800–1000 obtainable with chain termination methods.

• Another limitation is that runs of the same base, known as homopolymeric
regions, cannot be detected easily. Homopolymeric regions longer than 10 bases
cannot be resolved by pyrosequencing.
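
The way a pyrogram encodes sequence can be shown with an idealised simulation. The sketch below assumes a fixed cyclic dispensation order and a light signal exactly proportional to the number of nucleotides incorporated at each dispensation; both are simplifications, and the function names are illustrative. In a real instrument the proportionality degrades for long runs, which is the homopolymer limitation described above.

```python
def simulate_pyrogram(new_strand, dispensation_order="ACGT"):
    """Idealised pyrogram for the strand being synthesised.

    Returns a list of (nucleotide, signal) pairs, one per dispensation,
    where signal is the number of bases incorporated in that step.
    """
    new_strand = new_strand.upper()
    if set(new_strand) - set(dispensation_order):
        raise ValueError("strand contains a base that is never dispensed")
    pyrogram = []
    position = 0
    step = 0
    while position < len(new_strand):
        nucleotide = dispensation_order[step % len(dispensation_order)]
        incorporated = 0
        # A homopolymer run is incorporated within a single dispensation.
        while position < len(new_strand) and new_strand[position] == nucleotide:
            incorporated += 1
            position += 1
        pyrogram.append((nucleotide, incorporated))
        step += 1
    return pyrogram


def read_sequence(pyrogram):
    """Reconstruct the synthesised sequence from an idealised pyrogram."""
    return "".join(nucleotide * signal for nucleotide, signal in pyrogram)


if __name__ == "__main__":
    for nucleotide, signal in simulate_pyrogram("GGATC"):
        print(nucleotide, "#" * signal)
```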
DNA Sequencing of Chromosomes

• For longer targets such as chromosomes, common approaches consist of cutting
large DNA fragments into shorter DNA fragments with restriction enzymes. The
fragmented DNA may then be cloned into a DNA vector and amplified in a bacterial
host such as Escherichia coli. Short DNA fragments purified from individual bacterial
colonies are individually sequenced and assembled electronically into one long,
contiguous sequence.
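
The electronic assembly step can be illustrated with a toy greedy overlap merge: the two reads with the longest suffix-to-prefix overlap are joined repeatedly until no overlaps remain. This is only a sketch under strong assumptions (error-free reads, a single strand, no repeats); real assemblers are far more elaborate, and the function names and example reads are hypothetical.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for length in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Merge reads greedily by their longest overlap; returns the contigs."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                length = overlap_length(left, right, min_overlap)
                if length > best_len:
                    best_len, best_i, best_j = length, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Made-up overlapping reads for illustration.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```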
Restriction Enzymes

• A restriction enzyme (or restriction endonuclease) is an enzyme that cuts DNA at
or near specific recognition nucleotide sequences known as restriction sites. Over
3000 restriction enzymes have been studied in detail, and more than 600 of these
are available commercially.
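
Cutting at a recognition site can be sketched as a string operation. The example uses EcoRI, which recognises GAATTC and cuts after the first G; EcoRI is chosen here for illustration and is not named in the slides. The sketch handles only the top strand of a linear molecule, and the function name and `cut_offset` parameter are assumptions.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Fragments of a linear top strand after cutting at every `site`.

    `cut_offset` is the number of bases into the site at which the cut is
    made; the defaults describe EcoRI (G^AATTC).
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    # Made-up sequence containing two EcoRI sites.
    print(digest("AAGAATTCTTGAATTCGG"))
```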
Illumina Dye Sequencing

• In Illumina dye sequencing, DNA molecules are first attached to primers on a
slide and amplified so that local colonies are formed. The four types of
reversible terminator bases are then added, each fluorescently labeled and
carrying a blocking group. The four bases compete for binding sites on the
template DNA to be sequenced, and non-incorporated molecules are washed away.
Following imaging, a cleavage step removes the fluorescent dye and regenerates
the 3′-OH group using the reducing agent tris(2-carboxyethyl)phosphine (TCEP),
and the next cycle can begin.
Comparison of Illumina Sequencing with Other Sequencing Methods

• This technique offers a number of advantages over traditional sequencing
methods such as Sanger sequencing. Because Illumina dye sequencing is automated,
it is possible to sequence multiple strands at once and obtain sequencing data
quickly. Additionally, this method uses only DNA polymerase, as opposed to the
multiple, expensive enzymes required by other sequencing techniques
(e.g. pyrosequencing).

New technologies (454–Roche pyrosequencing and Illumina sequencing) generate far
more sequence data per run, at a much lower cost than conventional dye-terminator
sequencing, but the reads are shorter.
Gene Expression and Microarray

• Proteins do most of the work
• They are dynamically created and destroyed, and so are their mRNA blueprints
• Different mRNAs are expressed at different times and places
• Knowing mRNA “expression levels” tells a lot about the state of the cell
• Microarrays measure the level of transcript from a very large number of genes in one go

The Principle of DNA Arrays

• The principle of molecular recognition in DNA array reactions is shown in Fig. A.
• A DNA array is an orderly arrangement of immobilised oligonucleotides on a glass
slide; each grey spot represents a different oligonucleotide. When the array is
reacted with labelled DNA samples, they hybridise with only certain spots on the
array, i.e. those containing a matching oligonucleotide sequence. This results in a
characteristic pattern, a fingerprint, of coloured and uncoloured spots (Fig. B).

(A) (B)
Fabrication of DNA arrays

• In microarrays, spot sizes of 20–50 µm are common. The arrays can be spotted
either with a small pipette or with an ink-jet printer (Fig A). Affymetrix Corp has
developed a method to generate millions of DNA copies (each containing 25
nucleotides) on the arrays by using photolithography. The whole GeneChip™
consists of hundreds of thousands of different features.

(A) Deposition of oligonucleotides (B) Affymetrix GeneChip™


Synthesis of Oligonucleotide Arrays by Photolithography
The steps required in a microarray experiment

• The process of carrying out a DNA array can be divided into (1) sample
preparation, (2) hybridization, (3) scanning and (4) data analysis. Fluorescently
labeled target sequences that bind to a probe sequence generate a signal that
depends on the hybridization conditions (such as temperature), and washing after
hybridization. Total strength of the signal, from a spot (feature), depends upon the
amount of target sample binding to the probes present on that spot.

High Intensity Microarray


Applications of DNA Microarrays

• DNA microarrays can be used to detect DNA, or to detect RNA (after reverse
transcription) that may or may not be translated into proteins. Applications of DNA
microarrays include:

• Gene expression profiling: In an mRNA or gene expression profiling experiment,
the expression levels of thousands of genes are simultaneously monitored to study
the effects of certain treatments, diseases, and developmental stages on gene
expression. For example, microarray-based gene expression profiling can be used
to identify genes whose expression is changed in response to pathogens or other
organisms by comparing gene expression in infected cells or tissues with that in
uninfected ones.

• SNP detection: Identifying single nucleotide polymorphisms among alleles within or
between populations. Several applications of microarrays make use of SNP
detection, including genotyping, forensic analysis, measuring predisposition to
disease, identifying drug candidates, evaluating germline mutations in individuals
or somatic mutations in cancers, assessing loss of heterozygosity, and genetic
linkage analysis.
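
The infected versus uninfected comparison in expression profiling usually comes down to a per-gene ratio of signal intensities. The sketch below computes log2 ratios, centres them on the median as a very simple normalization, and flags genes beyond a two-fold change. The median centring, the two-fold threshold, the function names and the gene labels are all assumptions for illustration; real analyses use replicate arrays and statistical testing.

```python
import math
from statistics import median


def log2_ratios(treated, control):
    """Per-gene log2(treated / control) from spot intensities."""
    return [math.log2(t / c) for t, c in zip(treated, control)]


def median_normalize(ratios):
    """Centre the log ratios on their median (a simple global normalization)."""
    centre = median(ratios)
    return [r - centre for r in ratios]


def changed_genes(names, ratios, threshold=1.0):
    """Genes whose absolute log2 ratio meets the threshold (1.0 = two-fold)."""
    return [name for name, r in zip(names, ratios) if abs(r) >= threshold]


if __name__ == "__main__":
    # Invented intensities for four hypothetical genes.
    genes = ["geneA", "geneB", "geneC", "geneD"]
    infected = [200, 400, 1600, 100]
    uninfected = [100, 200, 200, 100]
    normalized = median_normalize(log2_ratios(infected, uninfected))
    print(changed_genes(genes, normalized))
```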
Microarray Noise Sources

• Lot-to-lot variation (chips, reagents,…)

• Experiment-to-experiment variation
– cell state, culture purity
– sample preparation, hybridization conditions

• Spot-to-spot variation
– unequal dye incorporation
– dye nonlinearity/saturation
– uneven spot sizes
– self- & cross-hybridization
– Image capture & processing (spot finding, quantization, sensors)
Challenges in analyzing Microarray Data

• Amount of DNA in spot is not consistent


• Spot contamination
• cDNA may not be proportional to that in the tissue
• Low hybridization quality
• Measurement errors
• Spliced variants
• Outliers
• Data are high-dimensional (“multivariate”)
• Biological signal may be subtle, complex, nonlinear, and buried in a cloud
of noise
• Normalization
• Comparison across multiple arrays, time points, tissues, treatments
• How do you reveal biological relationships among genes?
• How do you distinguish real effect from artifact?
Factors to consider in designing microarray experiments

• Do plenty of control experiments to validate the method
• Do replicate spotting, replicate chips, and reverse labeling for custom
spotted chips
• Do pilot studies before doing “mega chip” experiments
• Don’t design an experiment without replication; nothing will be learned from
a single failed experiment
• Design simple (one- or two-factor) experiments, e.g. treatment vs.
no treatment
• Understand measurement errors
• When designing databases, remember that they are useful ONLY if the quality of the
data is assured
• Involve statistical colleagues in the design stages of your studies
