REVIEW
reprints@futuremedicine.com
Keywords: allelic discrimination, analytical validation, genotyping assays, MALDI-TOF-MS, pharmacogenetics, pharmacogenomics, Pyrosequencing

At the outset, it is important to define the distinction between the general process of biomarker validation and the specific validation of an analytical method used to measure the biomarker in question. The former process – the accumulation of sufficient data to change the status of an exploratory biomarker to that of a probable valid or known valid biomarker – implies many years of effort and multiple confirmatory studies [1,2,101]. The latter process, which focuses solely on characterizing the analytical performance of a method or test used to measure a biomarker, can be accomplished in a relatively short time and involves the assessment of various parameters of the assay [3,4,102]. If the pharmacogenomic test is intended to become a marketed commercial assay (i.e., a device requiring the submission to the US FDA of a premarket approval application [PMA] or a 510(k) premarket notification demonstrating substantial equivalence), a series of rigorous, well-defined criteria must be met [5,102]. If a pharmacogenomic test is intended for research use only, the elements of analytical validation are typically determined by the test user(s). The analytical validation of exploratory genotyping assays conducted during clinical drug development to measure polymorphisms at predefined locations in genomic DNA is the subject of the present review article. The definitions of selected characteristics that define the analytical performance of targeted genotyping assays are presented in Table 1.

Targeted genotyping methodologies
In the subsequent sections, brief overviews of three examples of genotyping technologies are provided. For more comprehensive overviews focusing specifically on the molecular principles by which these and other types of genotyping assays work, the reader is referred to review articles on general genotyping methodology [6–10] or on specific technologies: matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) [11–14], Pyrosequencing® [15,16], Taqman®-based allelic discrimination (also known as Taqman SNP Genotyping Assays) [17] or other genotyping technologies [18,19]. The purpose of providing brief summaries of the methods here is to lay
10.2217/14622416.8.4.353 © 2007 Future Medicine Ltd ISSN 1462-2416 Pharmacogenomics (2007) 8(4), 353–368 353
REVIEW – Isler, Vesterqvist & Burczynski
the foundation for the similarities and differences involved in various aspects of analytical validation of assays that employ distinct molecular principles of DNA sequence polymorphism detection.
A critical concept highlighted for each technology in the following sections is how 'confidence' is calculated for genotyping calls made by each method. Each platform uses different parameters of the data output to determine the confidence score of a genotyping call. Although manufacturers may give default recommendations, it is ultimately up to the end user to establish the acceptance criteria for assigning genotypes for each assay designed. The predefined cut-offs for confidence scores established by end users impact every analytical validation parameter described below and should thus be carefully considered during genotyping assay design and validation. A list of example acceptance criteria for each platform (contrasted with the assignment of genotyping calls by a technique such as polymerase chain reaction with restriction fragment length polymorphisms [PCR-RFLP]) is presented in Table 2.

Primer extension & MALDI-TOF MS
Analysis of polymorphisms using primer extension assays coupled with MALDI-TOF MS entails three major steps [11–14]:
• PCR amplification of the sequence containing the polymorphic site;
• Primer extension through the polymorphic site in the presence of a combination of deoxynucleotides (dNTPs) and dideoxynucleotides (ddNTPs);
• MALDI-TOF MS analysis of the primer-extended products.
One version of this general procedure as developed by Sequenom® (San Diego, CA, USA), collectively termed the homogeneous
Table 2. Data outputs, quality assessments and example acceptance criteria* for different methods employed for targeted genotyping assays.

Genotyping platform: MassARRAY® MALDI-TOF-MS
Data format: Extended primer mass spectra
Quality assessment: Software-assigned confidence score for each genotype call
Acceptance criteria: At least one of two calls must be assigned a conservative score or both calls must be assigned a moderate score; genotype calls cannot disagree

Genotyping platform: Allelic discrimination
Data format: Cluster plot of fluorescence values
Quality assessment: Software-assigned quality value for each genotype call
Acceptance criteria: Both calls must be assigned a quality value greater than 95; genotype calls cannot disagree

Genotyping platform: Pyrosequencing®
Data format: Pyrogram of luciferase intensities
Quality assessment: Software-assigned confidence score for each genotype call
Acceptance criteria: Both calls must be assigned a 'pass' or 'check' confidence score; 'check' scores must be visually confirmed; genotype calls cannot disagree

Genotyping platform: PCR-RFLP
Data format: Agarose gel
Quality assessment: DNA banding pattern
Acceptance criteria: DNA bands must match an expected banding pattern

*The genotype call acceptance criteria listed above are nonstringent recommendations based on the performance of most assays. These criteria can be modified if deemed necessary to increase the call rate of otherwise accurate and reproducible genotyping assays.
PCR-RFLP: Polymerase chain reaction with restriction fragment length polymorphisms; MALDI-TOF-MS: Matrix-assisted laser desorption ionization time-of-flight mass spectrometry.
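The example acceptance criteria in Table 2 for duplicate measurements can be expressed as simple decision rules. The following is a minimal sketch only: the function names, the `Call` record and the score labels other than those quoted in Table 2 are illustrative assumptions and do not correspond to any vendor software interface.

```python
from typing import NamedTuple, Optional, Union


class Call(NamedTuple):
    """One replicate genotype call (hypothetical record, not a vendor format)."""
    genotype: str
    score: Union[str, float]  # label such as "moderate", or a numeric quality value


def accept_maldi(a: Call, b: Call) -> Optional[str]:
    """At least one 'conservative' score, or both 'moderate'; calls must agree."""
    if a.genotype != b.genotype:
        return None
    scores = {a.score, b.score}
    if "conservative" in scores or scores == {"moderate"}:
        return a.genotype
    return None


def accept_allelic_discrimination(a: Call, b: Call, min_quality: float = 95.0) -> Optional[str]:
    """Both quality values must exceed the cut-off; calls must agree."""
    if a.genotype != b.genotype:
        return None
    if a.score > min_quality and b.score > min_quality:
        return a.genotype
    return None


def accept_pyrosequencing(a: Call, b: Call, visually_confirmed: bool = False) -> Optional[str]:
    """Both calls 'pass' or 'check'; any 'check' needs visual confirmation."""
    if a.genotype != b.genotype:
        return None
    if any(c.score not in ("pass", "check") for c in (a, b)):
        return None
    if "check" in (a.score, b.score) and not visually_confirmed:
        return None
    return a.genotype
```

Each function returns the accepted genotype, or `None` when the duplicate pair should be reported as a no-call. The default cut-off of 95 is the example value from Table 2 and would be replaced by whatever criterion the end user establishes during validation.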
MassEXTEND® (hME) assay, is presented in Figure 1. After amplifying a region of genomic DNA containing the polymorphic site of interest, shrimp alkaline phosphatase is added to dephosphorylate residual nucleotides from the PCR reaction prior to initiating the primer extension reaction. In the next step an extension reaction employs a primer that anneals to the PCR amplicon and is located with its 3´-end juxtaposed to the polymorphic site. Addition of a 'termination mix' (containing a specified combination of nonterminating dNTPs and chain-terminating ddNTPs) to initiate the primer extension reaction causes dNTPs to be incorporated until a sequence-dependent ddNTP incorporation event terminates the reaction. The assay is designed to allow immediate termination at the juxtaposing polymorphic site for one of the alleles possessing a nucleotide complementary to the chain-terminating ddNTP, and termination further downstream for the other allele. A typical reaction generates allele-specific primer extension products that are generally 1–4 bases longer than the original primer. Since the termination point and number of nucleotides incorporated are sequence-specific, the mass of the extension products can be used to identify the nucleotide present at the polymorphic site. Sequenom has more recently developed an iPLEX® assay, a modified hME assay that uses a common termination mix with mass-modified terminators. One significant difference is that all primer extension reactions terminate after a single base extension, allowing for increased plexing efficiency [103].
Following a critical resin-based purification step to eliminate salts present in previous reaction buffers, a minute quantity of the reaction (∼15 nl) is transferred to a solid matrix for mass determination by MALDI-TOF MS. Primer mass spectra data can be directly converted into genotype calls using software such as SpectroTYPER™ (Sequenom), in which prespecified single peak masses or peak mass patterns represent various allelic combinations. In our laboratory, samples are analyzed in duplicate, and
[Figure 1: schematic of the hME assay. For templates TCT (allele 1) and ACT (allele 2), the EXTEND primer is extended in the presence of enzyme, ddATP and dCTP/dGTP/dTTP to give a 24-mer or a 26-mer extended primer; the extended primers for the two alleles are resolved on a mass axis spanning 5000–10,000.]
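The termination logic of the hME reaction described above can be illustrated with a short simulation. This is a sketch under stated assumptions: the template triplets (TCT and ACT) and the ddATP terminator are taken from the labels in Figure 1, while the 23-base primer length is inferred from the 24-mer and 26-mer products and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_primer(template: str, terminators: set) -> str:
    """Return the bases added to the primer.

    `template` lists template bases in the order the polymerase meets them,
    starting at the polymorphic site. Extension stops as soon as a
    chain-terminating nucleotide (one listed in `terminators`) is incorporated.
    """
    added = []
    for base in template:
        nucleotide = COMPLEMENT[base]
        added.append(nucleotide)
        if nucleotide in terminators:
            break  # ddNTP incorporated: no further extension is possible
    return "".join(added)


PRIMER_LENGTH = 23  # assumption inferred from the 24-mer/26-mer labels in Figure 1

for allele, template in {"allele 1": "TCT", "allele 2": "ACT"}.items():
    product = extend_primer(template, terminators={"A"})
    print(allele, product, PRIMER_LENGTH + len(product))
```

With ddATP as the only terminator, the allele with T at the polymorphic site terminates after one base, whereas the other allele is extended by three bases, which is why the two products differ in mass and can be distinguished by MALDI-TOF MS.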
genotype calls are issued only if genotype calls for the duplicates are of high confidence and in agreement. The degree of confidence assigned to genotyping calls using this system depends upon peak characteristics and is mainly determined by the closeness of the observed atomic masses with the expected masses of primer extension products in the sample, and the signal-to-noise ratios observed for each of the peaks. As mentioned previously, during analytical validation of any genotyping assay using primer extension/MALDI-TOF analysis, it is critical to determine confidence score acceptability criteria for genotype assignments prior to running clinical samples. An example of the raw spectra generated from a multiplexed MALDI-TOF assay in which three polymorphic positions were interrogated in a single assay is shown in Figure 2.

Pyrosequencing
Pyrosequencing, as developed by Biotage (Uppsala, Sweden), differs from other methods in that it provides genotyping results in the context of the neighboring DNA sequence [15,16]. In this assay (depicted in Figure 3) a sequencing primer is hybridized to a single stranded template isolated from a PCR reaction using streptavidin beads following PCR amplification of the target DNA sequence. In the first of four enzymatic reactions, dNTPs are added in a predetermined order and DNA polymerase catalyzes the incorporation of each dNTP complementary to the base present in the DNA template. Each incorporation event is accompanied by release of pyrophosphate in an amount proportional to the amount of dNTP incorporated. ATP sulfurylase then converts pyrophosphate into ATP, which drives luciferase-mediated conversion of luciferin to oxyluciferin and generates visible light in amounts proportional to the amount of ATP. The light emitted is detected by a camera and visualized as a peak in a pyrogram. Apyrase is added to degrade unincorporated dNTPs and excess ATP prior to addition of the next dNTP, allowing the complementary DNA strand to be synthesized and the nucleotide sequence to be determined from the signal peaks in the resulting pyrogram.
Computer software such as PyroMark MD 1.0™ (Biotage) can be used to determine the optimal dNTP dispensation order based on the sequence surrounding the single nucleotide polymorphism (SNP) of interest, and the same software can generate all theoretical pyrograms as bar graphs. The software then calculates peak heights from the raw data, and genotypes are determined by comparison with the theoretical pyrograms. Genotype calls are assigned quality scores of pass, check or fail based on a number of
Figure 2. Mass-spectral image from a multiplexed primer extension/MALDI-TOF MS assay. [Plot of intensity (signal) against mass (amu) over approximately 4800–6800 amu, with labeled unextended primer, pausing peak and allele positions.] Dotted lines represent either unextended primers (UEP), potential pause/early termination peaks, or successfully extended primers indicative of the presence of different alleles. The assays shown are for CYP2D6*2 (pink), CYP2D6*3 (blue) and CYP2D6*4 (green). This sample was determined to be wild-type with respect to all three alleles, as shown by only a single peak (indicated by arrows) at the location of each wild-type extended primer.
CYP: Cytochrome P450; MALDI-TOF-MS: Matrix-assisted laser desorption ionization time-of-flight mass spectrometry.
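The two determinants of call confidence named in the text for primer extension/MALDI-TOF assays, closeness of observed to expected mass and signal-to-noise ratio, can be sketched as a simple peak-matching rule. The masses, tolerance and signal-to-noise threshold below are hypothetical placeholders chosen for illustration; they are not values from the review or from the SpectroTYPER software.

```python
from typing import Dict, List, Optional, Tuple


def detect_alleles(
    peaks: List[Tuple[float, float]],      # (observed mass, signal-to-noise ratio)
    expected_masses: Dict[str, float],     # allele label -> expected product mass
    mass_tolerance: float = 2.0,           # hypothetical tolerance, in mass units
    min_snr: float = 5.0,                  # hypothetical signal-to-noise cut-off
) -> List[str]:
    """Return the alleles whose expected product mass is matched by a clean peak."""
    detected = []
    for allele, expected in expected_masses.items():
        if any(abs(mass - expected) <= mass_tolerance and snr >= min_snr
               for mass, snr in peaks):
            detected.append(allele)
    return sorted(detected)


def genotype_from_alleles(detected: List[str]) -> Optional[str]:
    """One allele -> homozygous call, two -> heterozygous, otherwise no call."""
    if len(detected) == 1:
        return f"{detected[0]}/{detected[0]}"
    if len(detected) == 2:
        return "/".join(detected)
    return None


expected = {"C": 5400.0, "T": 5480.0}        # hypothetical extension product masses
peaks = [(5400.6, 12.0), (5479.1, 3.0)]      # second peak is too weak to count
print(genotype_from_alleles(detect_alleles(peaks, expected)))
```

In practice the acceptance thresholds would be fixed during analytical validation, before clinical samples are run, as the text recommends.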
factors, including the agreement between the theoretical and actual pyrograms, observed signal:noise ratios and calculated peak widths. Genotype calls are typically only accepted if assigned a pass score, or if assigned a check score and confirmed visually by the user. An example of a theoretical pyrogram and the actual pyrogram for a sample bearing the predicted genotype are presented in Figure 4.

Figure 3. Molecular basis for Pyrosequencing® assays. Nucleotide-incorporation-dependent detection of light by the Pyrosequencing assay. In this multi-enzymatic cascade, light is emitted by the dual actions of sulfurylase and luciferase only if the polymerase catalyzes nucleotide addition to the growing strand upon dispensation of one of four nucleotides into the reaction. A predetermined dispensation order is used to dispense nucleotides into the reaction, and the pattern of light emission corresponds to the sequence of the target gene interrogated. Apyrase is added at the conclusion of every dispensation to degrade the previously added nucleotide before the next nucleotide is dispensed.
PPi: Pyrophosphate.
Reproduced from [16] with permission of Future Drugs Ltd.

Taqman 5´-nuclease-based allelic discrimination
Taqman-based allelic discrimination (referred to hereafter as simply allelic discrimination), using Taqman SNP Genotyping Assays (Applied Biosystems, Foster City, CA, USA), provides another method for polymorphism detection (Figure 5). This assay utilizes a standard pair of PCR primers to amplify a region containing the polymorphism of interest, and two detection probes [17]. The two detection probes bind differentially to amplified products containing either the nonpolymorphic or the polymorphic nucleotide, each of which is labeled with a different fluorescent dye on its 5´-end and a quencher on its 3´-end. During PCR, each detection probe anneals specifically to its complementary sequence within the PCR-amplified region. As the product is amplified, the 5´-nuclease activity of DNA polymerase releases the reporter dye from its proximity to the quencher allowing fluorescence. When the probe is intact, the proximity of the reporter dye to the quencher dye results in suppression of the reporter fluorescence. However, when hybridized to its target, the probe is cleaved by DNA polymerase and the reporter and quencher dyes are separated, resulting in an emission of a fluorescent signal.
At the conclusion of the assay, the accumulated fluorescence levels of both probes in each sample are measured. ABI software plots the fluorescence values for each detection probe in each sample using a 2D cluster plot in which the 2D location of each sample is determined by the strength of fluorescence for each of the detection probes, and relates, therefore, to the identity of the alleles originally present in the sample. An algorithm assigns a confidence score (referred to as a 'quality value') for each genotype call based upon the 'closeness' of each sample with other samples exhibiting similar fluorescence properties. Homozygotes for one allele have increased fluorescence in one channel but baseline fluorescence in the other, while heterozygotes possessing both alleles exhibit intermediate fluorescence in both channels. The quality value ranges from 0–100, and the default minimum quality value required for an acceptable genotype call in an allelic discrimination assay is set to 95. However, once the accuracy of a genotyping assay employing allelic discrimination is established, the stringency for these quality values can be either increased or decreased as appropriate. Figure 6 shows an example of a cluster plot displaying samples homozygous for allele X (red), allele Y (blue) or heterozygous for both (green).

Comparison of genotyping methodologies
All of the technologies presented above have perceived advantages and disadvantages, and these are listed in Table 3. PCR-RFLP is also included for comparison. The following sections highlight a few of the most relevant pros and cons of the technologies covered in this review. Optimal technologies to use are almost always assay-dependent, but often any of these or other technologies can be used to generate genotyping data of sufficient quality after careful assay validation if one is limited to one type of platform or
Figure 4. A theoretical pyrogram (A) and the actual pyrogram (B) are presented for an assay in which two successive polymorphic regions were queried (highlighted in purple) and indicated in the following sequence: [T/C]A[C/G]CGCTA. [Panel A: relative amount (0.0–2.0) against nucleotide dispensation order G T C T A G C T G C T A for genotype C/C G/G. Panel B: intensity (luminescence) against nucleotide dispensation order, beginning with E and S.] In a typical Pyrosequencing assay, the light emission prior to reagent addition is captured (E), the light emission after addition of reagents is captured (S) and, finally, the light emission after addition of an initial nucleotide that should not be present as the next nucleotide is captured (in this case, G). After these control steps, the possible nucleotides at the first polymorphic position are added sequentially (in this case, T followed by C), then a nucleotide that should not be present as the next nucleotide (in this case, T) followed by the next nonpolymorphic nucleotide in the sequence (A) and, finally, the next two possible nucleotides in the second polymorphic region (in this case, G followed by C).
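The relationship between a dispensation order and a theoretical pyrogram can be sketched in a few lines. This is an illustrative model only, not the PyroMark algorithm: it assumes that the sequence quoted in the Figure 4 legend is written in the direction in which nucleotides are incorporated, that peak height is proportional to the number of identical bases incorporated per dispensation, and that the two alleles of a diploid sample contribute equally. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def simulate_allele(read_sequence: str, dispensation_order: str) -> List[int]:
    """Count the bases incorporated from one allele at each dispensation."""
    heights = []
    position = 0
    for nucleotide in dispensation_order:
        run = 0
        # A dispensed nucleotide is incorporated for as long as it matches
        # the next base to be synthesized (homopolymer runs give taller peaks).
        while position < len(read_sequence) and read_sequence[position] == nucleotide:
            run += 1
            position += 1
        heights.append(run)
    return heights


def theoretical_pyrogram(allele_a: str, allele_b: str, dispensation_order: str) -> List[int]:
    """Sum the contributions of both alleles of a diploid sample."""
    a = simulate_allele(allele_a, dispensation_order)
    b = simulate_allele(allele_b, dispensation_order)
    return [x + y for x, y in zip(a, b)]


def best_matching_genotype(
    observed: Sequence[float],
    candidates: Dict[str, Tuple[str, str]],
    dispensation_order: str,
) -> str:
    """Pick the candidate whose theoretical pyrogram is closest (least squares)."""
    def distance(label: str) -> float:
        expected = theoretical_pyrogram(*candidates[label], dispensation_order)
        return sum((o - e) ** 2 for o, e in zip(observed, expected))
    return min(candidates, key=distance)


ORDER = "GTCTAGCTGCTA"  # dispensation order shown in Figure 4, panel A
CANDIDATES = {            # first polymorphic site varied; second held at G/G
    "T/T": ("TAGCGCTA", "TAGCGCTA"),
    "T/C": ("TAGCGCTA", "CAGCGCTA"),
    "C/C": ("CAGCGCTA", "CAGCGCTA"),
}
observed = [0.1, 0.0, 1.9, 0.1, 2.0, 2.1, 1.9, 0.0, 2.0, 2.0, 1.9, 2.0]  # made-up peak heights
print(best_matching_genotype(observed, CANDIDATES, ORDER))
```

The dispensations at which no incorporation is expected (the initial G and the T after the first polymorphic site) give zero height in every theoretical pyrogram, which is what allows them to serve as built-in negative controls.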
another. The comparisons below are made only to highlight the performance differences between these platforms and to draw attention to a core set of considerations before developing genotyping assays on any platform.

Multiplexing capabilities
Primer extension assays employing MALDI-TOF MS offer a high-throughput method capable of high-order multiplexing of SNP assays, since a large number of extension primers and extended primer products of varying length and composition can be resolved and detected within the range of a typical TOF mass spectrometer. While the possibilities for multiplexing with MALDI-TOF MS are not infinite, since all primers and extension products must be adequately resolved, multiplexing of up to 24 alleles has been reported [103]; however, both Pyrosequencing and allelic discrimination have lower multiplexing capabilities. Pyrosequencing can be multiplexed into duplex and triplex assays, but these require careful assay design and software-assisted interpretation. Multiplexing of Taqman-based allelic discrimination assays is currently limited by the wavelengths of the fluorophores that can be simultaneously measured during real-time PCR.

Sensitivity & susceptibility to contamination
Sensitivity is a double-edged sword with respect to PCR-based genotyping assay performance. High sensitivity is critical when very small quantities of DNA are available (e.g., forensic samples), but is not necessarily as critical when ample amounts of DNA are available in the samples of interest (5 ml whole blood draws from subjects participating in clinical studies, for example). In our experience, both MALDI-TOF and
[Figure 5: schematic of Taqman-based allelic discrimination. Labeled elements: forward and reverse PCR primers; allele-specific probes for allele 1 and allele 2, each carrying a reporter dye (VIC or FAM), a quencher (Q) and a minor groove binder (MGB); matched and mismatched probe binding; AmpliTaq Gold™ DNA polymerase.]
Fluorescence emission by a group of samples analyzed in an allelic discrimination assay. Samples exhibiting
fluorescence in only one channel (over baseline) indicate the presence of only one allele (blue and red
samples) and are homozygous for one allele or the other. Samples exhibiting fluorescence in both channels
indicate the presence of both alleles and are heterozygotes at that position.
CYP: Cytochrome P450; MALDI-TOF MS: Matrix-assisted laser desorption ionization time-of-flight mass
spectrometry; NTC: Non-template control.
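The interpretation of end point fluorescence described in this legend can be sketched as a threshold rule on the two detection channels. This is a deliberately simplified illustration: the ABI software assigns calls by clustering samples and computing a quality value, which is not reproduced here, and the baseline values, the fold-over-baseline cut-off and the function name are hypothetical.

```python
def call_from_fluorescence(
    signal_x: float,
    signal_y: float,
    baseline_x: float,
    baseline_y: float,
    min_fold: float = 2.0,  # hypothetical fold-over-baseline cut-off
) -> str:
    """Classify one sample from its two end point fluorescence values."""
    x_positive = signal_x >= min_fold * baseline_x
    y_positive = signal_y >= min_fold * baseline_y
    if x_positive and y_positive:
        return "heterozygous X/Y"
    if x_positive:
        return "homozygous X/X"
    if y_positive:
        return "homozygous Y/Y"
    return "no amplification"  # expected result for a non-template control


# Baselines would normally be estimated from non-template controls on the plate.
print(call_from_fluorescence(4.1, 0.4, baseline_x=0.5, baseline_y=0.5))
```

A rule of this kind has no notion of cluster membership, so it cannot flag samples lying between clusters; that is the role of the software-assigned quality value discussed in the text.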
chip spotting. For 96 samples, the average assay time from initial reaction set-up to data output for allelic discrimination assays is approximately 2 h; for Pyrosequencing it is approximately 3 h; and for MALDI-TOF it is approximately 5 h. The same times hold true for 384 samples for allelic discrimination and MALDI-TOF, but increase significantly for Pyrosequencing in the current 96-well format. The use of automated liquid handling and accessories that enable assay plates to be stacked for continuous analysis can increase throughput even further.

Sequence context
Only Pyrosequencing affords the user an opportunity to generate true sequence context around the SNP of interest, since Pyrosequencing reconstructs the sequence downstream of the sequencing primer based upon light emission after single nucleotide addition in every sample analyzed. This is an attractive quality since, when used properly, it can provide an 'internal standard'-like assessment of the assay specificity for every sample (see next section). While it is unlikely that MALDI-TOF-based primer extension assays could yield primer extension masses falsely identified as genotype calls after nonspecific hybridization to a target gene, it is nonetheless a possible scenario, especially for pseudogenes. Allelic discrimination assays only generate a fluorescence signal at the end point of allele detection, and the chance
Table 3. Select advantages/disadvantages of polymerase chain reaction with restriction fragment length polymorphisms versus targeted genotyping technologies in present review.

Platform: PCR-RFLP
Advantages: Low cost; simple instrumentation
Disadvantages: Data quality; subjective visual interpretation; low throughput

Platform: MALDI-TOF MS
Advantages: High throughput (384-well); high signal to noise; mass accuracy; multiplexing capabilities; excellent sensitivity
Disadvantages: Instrumentation cost/complexity; susceptible to contamination due to high sensitivity

Platform: Pyrosequencing®
Advantages: Moderate throughput (96-well); sequence context for every sample; good for insertions and deletions
Disadvantages: Moderate throughput (96-well); not amenable to higher order multiplexing

Platform: Allelic discrimination
Advantages: High throughput (384-well); common instrumentation in many laboratories
Disadvantages: Less certainty in genotype assignments; fluorescence as output; low multiplexing capabilities

PCR-RFLP: Polymerase chain reaction with restriction fragment length polymorphisms; MALDI-TOF MS: Matrix-assisted laser desorption ionization time-of-flight mass spectrometry.
that such a signal could be due to nonspecific binding is always theoretically present. However, these assays have been carefully optimized, and the use of recommended PCR cycling conditions should ensure sufficient specificity of Taqman-based allelic discrimination assays. Careful analytical validation of a genotyping method employing any of these end point detection technologies is the best way to demonstrate to an end-user's satisfaction the specificity of a given assay during validation, but only Pyrosequencing can generate a specificity measure (sequence context) for each sample analyzed during an analytical run. Even Pyrosequencing can miss nonspecific hybridization, again in the case of pseudogenes, if there is 100% identity in the regions flanking the SNP of interest.

Detection of genetic alterations other than SNPs
All three technologies can interrogate SNPs. Short insertions and/or deletions can be effectively assessed by all three technologies as well, but longer insertions and deletions, or nucleotide expansion repeats, can be increasingly difficult to assay. Long nucleotide expansions differing by only a few repeats can be difficult to detect by allelic discrimination since the detection probe can hybridize nonspecifically to various sections of the expanded repeat. Similarly, extension primers used in MALDI-TOF assays need to lie near or juxtapose the polymorphic site of interest, and if the polymorphism of interest is bracketed on both sides by extensive repeating sequences, it is impossible to direct the extension primer to the polymorphic site. In these cases, Pyrosequencing may be of use since the sequencing primer can be designed to hybridize further from the polymorphic site in a nonrepeat portion of the sequence and then read through the expansion to determine the exact number of triplet repeats or the identity of a polymorphism within the expanded repeat region. Currently this approach has an upper limit of 100 or so nucleotides under optimized conditions. While long expanded repeats are less common than typical single-point mutations or single nucleotide polymorphisms of interest, these are important considerations for scientists interested in characterizing these types of genetic modifications.

Instrumentation, analytical interpretation & range of applications
Taqman-based allelic discrimination and Pyrosequencing both can be performed on instrument packages that cost less than US$100,000, while MALDI-TOF instrumentation can cost substantially more. Allelic discrimination assays use the same familiar real-time detection system from Applied Biosystems employed for RNA quantification by real-time PCR, and thus can be run on a common instrumentation system in biomedical laboratories that may not necessarily routinely focus on genotyping assessments. Software packages provided with all three systems are
user-friendly and all three platforms enable automated final output of genotype calls based on the generated signal data. Each of the methods may require visual inspection for certain outlier samples where chemical impurities or other problems hamper the generation of easily interpretable spectra or plots. While this rarely occurs, peak-based outputs from Pyrosequencing and MALDI-TOF are more objectively interpretable than the cluster plots generated by fluorescence intensity accumulated during allelic discrimination assays. All of the platforms have multiple uses – for instance, the real-time detection system used for allelic discrimination and MALDI-TOF can also be used for gene-expression analysis, and Pyrosequencing and MALDI-TOF can be used for methylation assays.

Analytical validation of genotyping assays
Pharmacogenomic information is now provided in a growing number of new drug applications [1,2,10,21,22,101]. Owing to the increasing importance of pharmacogenomic tests during the development of new drug candidates, the US FDA has issued multiple publications regarding both general pharmacogenomic biomarker validation and principles of analytical validation of pharmacogenomic tests. Examples of the former include the Guidance for Industry regarding pharmacogenomic data submissions issued in March 2005 [101], a process map proposal for the validation of genomic biomarkers in both preclinical and clinical drug development [1] and a review of the clinical utility of pharmacogenomic tests for drug-metabolizing enzymes [2]. The US FDA guidance document essentially draws a distinction between more or less well-established pharmacogenomic biomarkers, where genotypes that are strongly linked to certain phenotypic effects are considered either probable or known valid biomarkers and the other, less well-established genotypes that may be linked to certain phenotypic effects are considered observational or exploratory biomarkers. While both the US FDA guidance and the proposal map laid out by Goodsaid and Frueh provide excellent overviews of the general validation process for pharmacogenomic biomarkers (i.e., the process of converting the status of exploratory biomarkers to that of probable or known valid biomarker), they do not address the more specific topic concerning the analytical validation of pharmacogenomic tests used to measure the genomic biomarkers in question. This more specific aspect is addressed in detail in another Guidance for Industry, entitled Class II Special Controls Guidance Document: Drug Metabolizing Enzyme Genotyping Systems [102]. While this document mainly focuses on the more extensive steps that diagnostic assay manufacturers should follow when submitting a PMA or a 510(k) premarket notification for a genotyping device to the US FDA, many of the principles outlined for specifically characterizing the test methodology are pertinent to the topic of this review.
In the next section we describe procedures that can be implemented to analytically validate assays used to determine SNP genotypes in clinical trials using three different technology platforms. Preparation of, and adherence to, standard operating procedures outlining the analytical validation of genotyping assays in a biomarker laboratory is highly desirable and recommended. The main objective in the analytical validation of any biomarker assay (pharmacogenomic or other) is to characterize and control sources of variability in a method. For a genotyping assay, this process can, in turn, lead to an understanding of the expected error rate [23]. While SNP genotyping methods are typically qualitative in nature, newer technologies can allow oncology-relevant mutation or copy number assays to be more quantitative if needed. Thus, key elements in the analytical validation of most SNP genotyping methods are the evaluation of the specificity, sensitivity, accuracy and reproducibility of the methods as summarized in Table 1. Of these, the determination of the absolute accuracy of a genotyping method can be very challenging, since well-characterized reference materials/standards or reference methods are usually not available, although there are impressive current efforts in this area to eventually make standard QC samples available for genotyping assays [24]. In most cases, accuracy for an exploratory SNP is best determined with minimized risk by assuming that the convergence of genotype calls between multiple methods employing different molecular principles of sequence detection indicates a 'true' genotyping result. Often this is achieved by comparing the results of the assay undergoing validation with sequence generated by an independent dideoxysequencing reaction (see section on Accuracy below), but any previously validated allele-resolving assay could also theoretically be used to determine the convergence of the genotype calls between methods.
In addition to the concepts of analytical method validation, we also summarize recommendations regarding the use of quality controls
during the genotyping analysis of clinical samples. To control for unforeseen changes in method performance, biomarker laboratories should use QC samples to assess the quality of data delivered in any given analytical run. The QC samples should, whenever possible, be based on the true sample matrix, and standard acceptance criteria should be prospectively defined prior to the analysis of clinical samples. For genotyping assays, QC samples are typically composed of positive controls bearing each of the potential genotypes of interest (when possible/practical) to ensure that the method is capable of detecting all allele combinations accurately and reproducibly. Genotyping assays must also contain negative controls demonstrating the lack of spurious genotype calls in DNA-free samples, which indicate the absence of contaminating genomic DNA that could interfere with the assay.

Specificity

The specificity of a genotyping assay relates to the ability of the assay to recognize and accurately detect the specified polymorphism without reacting with or detecting related DNA sequences that can interfere with the assay and confuse final genotype assignments. As such, the specificity of any genotyping method is highly dependent on the specificity of the primers used in the various PCR-based steps of the reaction. This aspect of analytical validation is particularly important for genotyping assays designed to detect polymorphisms within genes that have high homology to other regions in the genome (e.g., cytochrome P450 [CYP]2D6 and its pseudogenes CYP2D7 and CYP2D8). In such cases, nested PCR strategies may be required, in which the gene of interest is first amplified and a second PCR-based strategy is then used to detect the SNP of interest in the enriched target gene PCR product. Even then, great care must be exercised to demonstrate that pseudogenes originally present in the initial genomic DNA sample will not contribute a signal in a genotyping assay employing a nested PCR approach, despite being present at far lower levels than the amplified target gene.

With the sequencing of the human genome, the specificity of any nucleic acid-based assay can be evaluated in silico during the process of PCR primer design. For commercially available TaqMan assays where primer and probe sequences are proprietary, the demonstration of the specificity of PCR primers and detection probes is conducted in silico by Applied Biosystems as described in their white paper on the design of TaqMan probe-based assays [104].

For the other types of non-TaqMan genotyping assays designed within the biomarker laboratory, similar in silico analyses can be performed. In these cases an assay can preliminarily be considered specific if bioinformatic sequence analysis (using the current Ensembl build of the human genome) indicates that only the target gene of interest will be amplified by the PCR strategy. During the in silico analysis of genotyping assay specificity, suitable binding sites for PCR primers designed to amplify the target SNP of interest are evaluated with respect to whether other polymorphisms exist in the proposed PCR primer binding sites, and whether the PCR primer binding sites are unique to the gene of interest or also exhibit high homology elsewhere in the genome. Assays predicted to amplify DNA sequences other than the target of interest must be redesigned.

Wet laboratory or experimental demonstration of assay specificity can also be performed, for example, by sequencing the amplicon generated by the designed PCR assay to demonstrate that the amplified sequence matches only the target sequence intended for amplification. Although the PCR product obtained from a newly designed genotyping assay can be submitted for direct sequencing, a more rigorous method is to clone the PCR product and sequence a large number of clones to determine the percentage abundance of the target amplicon amongst all amplified sequences. Alternatively, the specificity of a PCR-based genotyping assay can be assessed by melting-curve analysis of the amplified product to demonstrate the presence of a single melting-point temperature (Tm) in the melting curve of the product. However, such wet-laboratory-based approaches still represent inexact determinations (for instance, amplified products from a pseudogene can yield a Tm indistinguishable from that of the target gene if the amplicons share high homology), and these experiments can also be costly and/or time-consuming to execute. In our experience, careful in silico analysis of PCR primer specificity often provides a sufficiently accurate assessment of the overall specificity of a genotyping assay for the purposes of analytical validation.

As a final note of caution regarding assay specificity, since PCR can detect a few molecules of the target sequence, and the end point
technologies in current use display impressive sensitivity (see next section), a small amount of contamination in a sample can lead to a misleading result. It is therefore of utmost importance that the laboratory uses procedures (e.g., automation and regular fluidics maintenance) that will minimize the risk of contamination by amplicons or genomic DNA from neighboring samples, and employ QCs that will indicate the presence of contamination during an analytical run.

Sensitivity

Sensitivity of targeted genotyping assays may sometimes be considered a less critical issue in the analytical validation of pharmacogenomic tests employed during clinical development since:

• Milliliter volumes of whole blood are typically drawn from subjects providing pharmacogenomic samples in clinical trials;
• Microgram amounts of DNA are available from such volumes, even though genotyping assays typically require amounts of input DNA at the nanogram level.

Regardless of these considerations, however, establishing the sensitivity of genotyping assays provides valuable information regarding the minimal recommended input amount of DNA for a given method. To establish the sensitivity of a qualitative genotyping assay, genotyping calls are simply attempted in increasingly diluted genomic DNA samples until the genotyping assay employed is no longer able to make a genotyping call with acceptable confidence. Establishing assay sensitivity is an especially critical aspect of analytical validation for semi-quantitative genotyping assays designed to detect mutations in oncology tumor tissues that may be genetically heterogeneous. For establishing the sensitivity of these types of assays, genomic DNA titration experiments can be performed, in which genomic DNA from a cell line bearing the mutation of interest is titrated in a decreasing relative ratio to genomic DNA from a wild-type cell line. Similar to a standard genotyping sensitivity experiment, the point at which the genotyping assay is no longer able to detect the presence of the mutation with acceptable confidence is defined as the limit of detection of the assay.

For a typical dideoxy-sequencing reaction, the limit of detection of a mutation in a background of wild-type genomic DNA is approximately 15–20%. We have recently determined that a primer extension/MALDI-TOF reaction can detect as little as 1–5% mutation in a background of wild-type genomic DNA. These results indicate that newer technologies will afford greater sensitivity for mutation detection, which should eventually result in commensurately earlier diagnosis/prognosis than has been available with older sequencing methodologies.

Efficiency

The efficiency of a genotyping assay (also expressed as the overall call rate) simply indicates the ability of a qualitative assay to provide an analytically acceptable genotyping result (which may either be correct or incorrect – this aspect of analytical performance is captured by accuracy as described below). Analytical acceptability, in this case, is established by prospectively defining the cut-offs for confidence scores that will allow a genotyping call to be made when using the recommended starting input amount of genomic DNA for analysis. To evaluate the efficiency of a given genotyping assay, the assay should be performed on a number of different days using a validation sample panel. Whenever possible, the validation sample panel should include each allele combination and have an equal distribution of the genotypes or a distribution similar to the genotype frequency in the relevant population. To determine the efficiency of the method, we typically perform at least three analytical runs in an analyst-blinded fashion using at least 40 genomic DNA samples per run. The efficiency or call rate of the assay is calculated by dividing the number of samples yielding acceptable genotype calls by the total number of samples analyzed. Assay efficiencies of robust assays are typically 99–100%.

Reproducibility

Reproducibility refers to the ability of a qualitative genotyping assay to yield identical results for a given sample in different analytical runs. To evaluate the reproducibility of a given genotyping assay, the assay should be performed on a number of different days using the previously described validation sample panel. We typically establish the overall call rate and the reproducibility of the genotyping method at the same time over multiple analytical runs; thus, in order to determine the reproducibility of a method, we typically analyze the same 40 genomic DNA samples over three or more analytical runs. The reproducibility of the
method is calculated by dividing the number of samples yielding identical genotype calls in all analytical runs by the total number of samples analyzed. Again, assay reproducibility of robust assays typically ranges from 99 to 100%. While genotyping methodologies of optimized assays are typically very reproducible and rarely yield disagreement between replicates, measurement of the assay reproducibility during analytical validation also evaluates the preanalytical variables associated with the assay, including the sample handling steps. Although disagreements between replicate samples are very rare, in our experience most are due to either variable PCR failures or pipetting errors. To minimize the latter source of error, the use of automated liquid handling systems in as many individual steps as possible is highly desirable.

Accuracy

Establishing the accuracy of a genotyping assay is quite challenging, and this parameter actually refers to establishing the ability of the genotyping assay to yield the ‘true’ genotype call. Accuracy can be determined by comparing the genotype call determined by a given genotyping assay with the actual genotype of each validation sample. The actual genotype of each validation sample is most often determined using DNA dideoxy-sequencing analysis, which is considered the gold standard comparator method in the field of genotyping [2,101]. However, other alternative methods of demonstrated validity may also be used to establish actual genotypes for the purpose of accuracy determination. An important pitfall that must be avoided when determining accuracy is using the same PCR strategy to amplify the target gene for both end point detection methods. To establish the accuracy of a genotyping assay undergoing validation, a unique and independent target gene PCR amplification strategy should be used to amplify the region of genomic DNA for sequence analysis by a second method. Convergence between independently designed PCR-based methods is the only way to establish accuracy of a genotyping assay, since convergence of two technologies for genotype assignments on the same PCR template only demonstrates that the two methods provide similar results for that amplicon (the design of which may be flawed).

We typically determine the accuracy of each genotyping assay validated in the biomarker laboratory by submitting the 40 genomic samples analyzed for efficiency and reproducibility to analysis by DNA sequencing. Alternatively, when there is a well-characterized/published method available (e.g., the US FDA-approved Amplichip® CYP450 test for CYP2D6 or CYP2C19 [Roche Diagnostics, Indianapolis, IN, USA]), a method comparison study can be performed using the well-characterized/published method as the reference method. The overall accuracy of the method is determined as the total number of samples yielding genotype calls identical to those determined by direct DNA sequencing (or by using a well-characterized method) and is expressed as percent accuracy.

Sample stability

While the reported stability of DNA is well known, and the literature boasts a large number of studies that have determined the effect of storage time and temperature on both the quantity (yield) and quality (A260:A280 ratio) of genomic DNA in whole blood and isolated DNA samples [25–27], very few validation studies have been published regarding the stability of certain sample types and/or genomic DNA with respect to detecting accurate genotypes in the stored samples. This is of greater interest for sample matrices where degradation is known to occur, but in order to formally document stability it is more informative to establish the temperature- and time-dependent stability of the tissue of interest by using genotyping accuracy of the isolated genomic DNA as the end point assessment, rather than relying on the yield and apparent quality of the isolated nucleic acid as the indicator of DNA stability.

Validation samples & quality assessment

To demonstrate the ability of a designed assay to detect wild-type, heterozygous polymorphic and homozygous polymorphic alleles, samples with preliminary data or annotation that suggests they bear each allele combination should be included in the initial panel of samples to be used for assay validation if possible. We typically design an initial assay that exhibits acceptable in silico specificity, and then screen several hundred samples to identify genomic DNA (gDNA) samples from at least 40 different individuals that will be informative for analytical validation purposes. For moderate- or high-frequency alleles, screening a relevant population of gDNA samples typically results in the identification of one or more samples for each allele combination. For example, given that the CYP2D6*17
Executive summary
• Whole-genome association (WGA) studies are identifying novel genetic markers, and the demand for reliable targeted
genotyping assays amenable to high-throughput analysis is expected to increase as more and more confirmatory studies are
required to validate initial findings in WGA studies.
• Recent guidance documents from the US FDA summarize aspects of both general pharmacogenomic biomarker validation and
analytical validation of pharmacogenomic assays, and provide excellent resources for laboratories developing pharmacogenomic
tests to support biomarker studies.
• Matrix-assisted laser desorption ionization time-of-flight mass spectrometry assays employ primer extension reactions in the
presence of mixtures of deoxynucleotides and dideoxynucleotides in order to determine the identity of the original nucleotide
juxtaposing the extension primer.
• Pyrosequencing® assays employ direct sequencing reactions using ordered nucleotide addition in the presence of luciferase-
related enzymes to determine the identity of nucleotides in the target gene as demonstrated by the generation of a luminescence
signal upon nucleotide addition.
• Allelic discrimination assays employ probe-based hybridization to determine the identity of the nucleotide present at a
polymorphic locus by measuring fluorescence of different fluorophores conjugated to oligonucleotide probes designed to
hybridize perfectly to either wild-type or mutant (polymorphic) sequences in the target gene.
• The possible presence of pseudogenes and/or regions of high homology elsewhere in the human genome sequence dictates that
careful in silico assessments of assay specificity should be performed before genotyping assays are tested in the
pharmacogenomic laboratory.
• While a number of different technologies have been designed to query nucleotide identity at exact locations in genomic DNA, all
of the technologies can be subjected to the principles of analytical validation to determine their suitability for assessing single
nucleotide polymorphisms in target genes.
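
As an illustration only, the efficiency (call rate), reproducibility and accuracy calculations described under the corresponding headings could be sketched in Python as follows. The data layout (one dictionary per analytical run mapping a sample identifier to its genotype call, with None denoting a sample for which no acceptable call was made), the function names and the rule that a failed call in any run counts against reproducibility are assumptions made for this example; they are not taken from the review or from any particular laboratory system.

```python
from typing import Dict, List, Optional

# A genotype call such as "C/C", "C/T" or "T/T"; None marks a sample for which
# no analytically acceptable call was made (hypothetical convention).
Call = Optional[str]
Run = Dict[str, Call]  # sample identifier -> call within one analytical run


def call_rate(runs: List[Run]) -> float:
    """Efficiency: acceptable calls divided by all sample analyses, in percent."""
    total = sum(len(run) for run in runs)
    called = sum(1 for run in runs for call in run.values() if call is not None)
    return 100.0 * called / total


def reproducibility(runs: List[Run]) -> float:
    """Samples with identical calls in all runs divided by total samples, in percent.

    Assumption: a sample with a failed call in any run is counted as not reproducible.
    """
    samples = list(runs[0])
    identical = sum(
        1
        for sample in samples
        if runs[0][sample] is not None
        and all(run.get(sample) == runs[0][sample] for run in runs)
    )
    return 100.0 * identical / len(samples)


def accuracy(calls: Run, reference: Dict[str, str]) -> float:
    """Calls identical to the reference-method genotype divided by total samples, in percent."""
    concordant = sum(
        1 for sample, truth in reference.items() if calls.get(sample) == truth
    )
    return 100.0 * concordant / len(reference)


if __name__ == "__main__":
    # Made-up data: four samples analyzed in three runs.
    runs = [
        {"S1": "C/C", "S2": "C/T", "S3": "T/T", "S4": "C/T"},
        {"S1": "C/C", "S2": "C/T", "S3": "T/T", "S4": None},
        {"S1": "C/C", "S2": "C/T", "S3": "T/T", "S4": "C/T"},
    ]
    # Genotypes from an independent reference method (e.g., sequencing).
    reference = {"S1": "C/C", "S2": "C/T", "S3": "T/T", "S4": "T/T"}
    print(f"call rate: {call_rate(runs):.1f}%")
    print(f"reproducibility: {reproducibility(runs):.1f}%")
    print(f"accuracy (run 1): {accuracy(runs[0], reference):.1f}%")
```

In practice the panel would contain at least 40 samples per run, and the reference genotypes would need to come from an independently designed PCR amplification strategy, as cautioned in the Accuracy section.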
Bibliography
Papers of special note have been highlighted as either of interest (•) or of considerable interest (••) to readers.
1. Goodsaid F, Frueh F: Process map proposal for the validation of genomic biomarkers. Pharmacogenomics 7, 773–782 (2006).
2. Andersson T, Flockhart DA, Goldstein DB et al.: Drug metabolizing enzymes: evidence for clinical utility of pharmacogenomic tests. Clin. Pharm. Ther. 78, 559–581 (2005).
3. Swanson BN: Delivery of high quality biomarker assays. Disease Markers 18, 47–56 (2002).
4. Lee JW, Weiner RS, Sailstad JM et al.: Method validation and measurement of biomarkers in nonclinical and clinical samples in drug development: a conference report. Pharmaceutical Res. 22, 499–511 (2005).
5. Amos J, Grody W: Development and integration of molecular genetic tests into clinical practice: the US experience. Expert Rev. Mol. Diagn. 4, 465–477 (2004).
6. Kwok PY: Methods for genotyping single nucleotide polymorphisms. Annu. Rev. Genomics Hum. Genet. 2, 235–258 (2001).
7. Chen X, Sullivan PF: Single nucleotide polymorphism genotyping: biochemistry, protocol, cost and throughput. Pharmacogenomics J. 3, 77–96 (2003).
•• Extensive and fundamental overview of the underlying molecular principles of many current genotyping methodologies.
8. Xu B, Tubbs RR, Kottke-Marchant K: Molecular genetic testing of polymorphisms associated with venous thrombosis: a review of molecular technologies. Diagn. Mol. Pathol. 14, 193–202 (2005).
9. Gibson NJ: The use of real-time PCR methods in DNA sequence variation analysis. Clin. Chim. Acta 363, 32–47 (2006).
10. Gibson N, Jawaid A, March R: Novel technology and the development of pharmacogenetics within the pharmaceutical industry. Pharmacogenomics 6, 339–356 (2005).
•• Forward-looking review article focusing on the applications of pharmacogenetics in drug development.
11. Leushner J: MALDI-TOF mass spectrometry: an emerging platform for genomics and diagnostics. Expert Rev. Mol. Diagn. 1, 11–18 (2001).
12. Pusch W, Wormbach JH, Thiele H, Kostrzewa M: MALDI-TOF mass spectrometry-based SNP genotyping. Pharmacogenomics 3, 537–548 (2002).
13. Pusch W, Kostrzewa M: Application of MALDI-TOF mass spectrometry in screening and diagnostic research. Curr. Pharm. Des. 11, 2577–2591 (2005).
14. Tost J, Gut IG: Genotyping single nucleotide polymorphisms by MALDI mass spectrometry in clinical applications. Clin. Biochem. 38, 335–350 (2005).
15. Ahmadian A, Ehn M, Hober S: Pyrosequencing: history, biochemistry and future. Clin. Chim. Acta 363, 83–94 (2006).
16. Clarke SC: Pyrosequencing: nucleotide sequencing technology with bacterial genotyping applications. Expert Rev. Mol. Diagn. 5, 947–953 (2005).
17. de la Vega FM, Lazaruk KD, Rhodes MD, Wenz MH: Assessment of two flexible and compatible SNP genotyping platforms: TaqMan SNP Genotyping Assays and the SNPlex Genotyping System. Mutat. Res. 573, 111–135 (2005).
18. de Arruda M, Lyamichev VI, Eis PS et al.: Invader technology for DNA and RNA analysis: principles and applications. Expert Rev. Mol. Diagn. 2, 487–496 (2002).
19. Olivier M: The Invader assay for SNP genotyping. Mutat. Res. 573, 103–110 (2005).
20. Udaykumar, Epstein JS, Hewlett IK: A novel method employing UNG to avoid carry-over contamination in RNA-PCR. Nucleic Acids Res. 21, 3917–3918 (1993).
21. Tsongalis GJ, Coleman WB: Clinical genotyping: the need for interrogation of single nucleotide polymorphisms and mutations in the clinical laboratory. Clin. Chim. Acta 363, 127–137 (2006).
22. Ingelman-Sundberg M, Rodriguez-Antona C: Pharmacogenetics of drug-metabolizing enzymes: implications for a safer and more effective drug therapy. Philos. Trans. R. Soc. Lond. B Biol. Sci. 360, 1563–1570 (2005).
23. Pompanon F, Bonin A, Bellemain E, Taberlet P: Genotyping errors: causes, consequences and solutions. Nat. Rev. Genet. 6, 847–859 (2005).
•• Excellent review article describing not only the sources but also the solutions of different types of genotyping errors.
24. Chen B, O’Connell CD, Boone DJ et al.: Developing a sustainable process to provide quality control materials for genetic testing. Genet. Med. 7, 534–549 (2005).
25. Cushwa WT, Medrano JF: Effects of blood storage time and temperature on DNA yield and quality. Biotechniques 14, 204–207 (1993).
26. Lahiri DK, Schabel B: DNA isolation by a rapid method from human blood samples: effects of MgCl2, EDTA, storage time, and temperature on DNA yield and quality. Biochem. Genet. 31, 321–328 (1993).
• Stability study looking at the effect of various variables on DNA isolation.
27. Madisen L, Hoar DI, Holroyd CD, Crisp M, Hodes ME: DNA banking: the effects of storage of blood and isolated DNA on the integrity of DNA. Am. J. Med. Genet. 27, 379–390 (1987).
28. Gaedigk A, Bradford LD, Marcucci KA, Leeder JS: Unique CYP2D6 activity distribution and genotype–phenotype discordance in black Americans. Clin. Pharm. Ther. 72, 76–89 (2002).

Websites
101. US FDA: Guidance for Industry. Pharmacogenomic Data Submissions. FDA, MD, USA (2005)
www.fda.gov/cder/guidance/6400fnl.pdf
• Landmark guidance document drafted to facilitate the incorporation of personalized medicine and pharmacogenomic principles into drug development.
102. US FDA: Guidance for Industry and FDA Staff. Class II Special Controls Guidance Document: Drug Metabolizing Enzyme Genotyping System. FDA, MD, USA (2005)
www.fda.gov/cdrh/oivd/guidance/1551.pdf
• Helpful guidance document specifically addressing the validation of genotyping devices, with principles that are broadly applicable to genotyping assays in general.
103. Oeth P, Beaulieu M, Park C et al.: iPLEX Assay: Increased Plexing Efficiency and Flexibility for MassARRAY® System Through Single Base Primer Extension with Mass Modified Terminators. Sequenom® Inc., CA, USA (2005)
www.sequenom.com/Assets/pdfs/brochures/IPLEX%20App%20Note%20External.pdf
104. The design process for a new generation of quantitative gene expression analysis tools: TaqMan probe-based assays for human, mouse and rat genes
www.appliedbiosystems.com/pebiodocs/00113967.pdf
105. Coriell Institute for Medical Research: National Institute of General Medical Sciences Human Genetic Cell Repository
http://ccr.coriell.org/nigms/nigms_cgi