High Throughput Multiplex SNP Genotyping With MALDI-TOF
METHODS
Single nucleotide polymorphisms (SNPs) are currently being identified and mapped at a remark-
able pace, providing a rich genetic resource with vast potential for disease gene discovery, phar-
macogenetics, and understanding the origins of modern humans. High-throughput, cost effective
genotyping methods are essential in order to make the most advantageous and immediate use of
these SNP data. We have incorporated the use of matrix-assisted laser desorption/ionization time-
of-flight mass spectrometry (MALDI-TOF) in our laboratory as a tool for differentiating geno-
types based on the mass of the variant DNA sequence, and have utilized this method for production
scale SNP genotyping. We have combined a 4 µl PCR amplification reaction using 3 ng of
genomic DNA with a secondary enzymatic reaction (mini-sequencing) containing oligonucleotide
primers that anneal immediately upstream of the polymorphic site, dideoxynucleotides, and a
thermostable polymerase used to extend the PCR product by a single base pair. Mass spectrom-
etry (MS) analysis of mini-sequencing reactions was performed using a MALDI-TOF instrument
(Voyager-DE, Perseptive Biosystems, Framingham, MA). We performed both single and multi-
plex PCR and mini-sequencing reactions, and genotyped seven different variant sites in a ran-
dom sample of 989 individuals. Genotypes generated with MS methods were compared with
genotypes produced using a 5′ exonuclease fluorescence-based assay (Taqman, Applied Biosystems,
Foster City, CA) and a gel-based genotyping protocol. Because multiple polymorphisms can be
detected in a single reaction, the MS technique provides a cost-effective and efficient method for
high-throughput genotyping. Hum Mutat 17:296–304, 2001. © 2001 Wiley-Liss, Inc.
KEY WORDS: SNP; genotyping; MALDI-TOF; mass spectrometry; polymorphism; multiplex; ADD1;
ADRB2; AGT; AGTR1; GNB3; LPL
DATABASES:
ADD1 – OMIM: 102680; GDB: 134672; Genbank: NM_001119; HGMD: ADD1
ADRB2 – OMIM: 109690; GDB: 120541; Genbank: NM_000024; HGMD: ADRB2
AGT – OMIM: 106150; GDB: 118750; Genbank: X15323, NM_000029; HGMD: AGT
AGTR1 – OMIM: 106165; GDB: 132359; Genbank: Z11162, NM_000685; HGMD: AGTR1
GNB3 – OMIM: 139130; GDB: 120005; Genbank: NM_002075; HGMD: GNB3
LPL – OMIM: 238600; GDB: 120700; Genbank: M76722, NM_000237; HGMD: LPL
Received 8 November 2000; accepted revised manuscript 12 January 2001.
*Correspondence to: Molly S. Bray, Ph.D., Human Genetics Center, University of Texas Health Science Center at Houston, P.O. Box 20334, Houston, TX 77225. E-mail: molly.s.bray@uth.tmc.edu
Contract grant sponsor: CDC; Contract grant number: UR6/CCU617218-01; Contract grant sponsor: NIH; Contract grant numbers: HL54481; HL54526; HL54457; HL54464; DDK45538.
TABLE 1. PCR Primers for Genes Selected for Comparative Analysis of MS Genotyping
Gene Variant Forward primer sequence Reverse primer sequence PCR length (bp)
ADD1 G460W CGGGGCGACGAAG TGGGACTGCTTCCATTC 46
ADRB2 R16G CAGCGCCTTCTTGCTG CGGCGCATGGCTTC 40
ADRB2 Q27E CGGACCACGACGTCAC CCACCACCCACACCTC 45
ADRB2 Both Sites CAGCGCCTTCTTGCTG CCACCACCCACACCTC 83
AGT -6G>A AAATAGGGCCTCGTGAC AACGGCAGCTTCTTCC 42
AGTR1 1166A>C CTGCAGCACTTCACTACCAAA TCCTTCAATTCTGAAAAGTAGC 52
GNB3 825C>T TCATCTGCGGCATCAC GTAGGCGGCCACTGAG 48
LPL S447Ter TGCCATGACAAGTCTCTGA GCCCAGAATGCTCACC 50
298 BRAY ET AL.
TABLE 2. Mini-sequencing Primer Sequences and Masses for Unextended and Extended Products
Gene    Variant    PCR product length    Extension oligo sequence    Unextended oligo mass    Extended oligo mass by variant allele
AGT     -6G>A      42    CGTGACCCGGCC(A/G)          3607.4    A: 3904.6; G: 3920.6
AGTR1   1166A>C    52    CTACCAAATGAGC(A/C)         3927.6    A: 4224.8; C: 4200.8
GNB3    825C>T     48    AGGGAGAAGGCCAC(A/G)        4346.9    A: 4644.1; G: 4660.1
ADRB2   Q27E       45    CACGACGTCACGCAG(C/G)       4547.0    C: 4820.2; G: 4860.2
LPL     S447X      50    GAATGCTCACCAGCCT(G/C)      4826.2    C: 5099.4; G: 5139.4
ADD1    G460W      46    ACGAAGCTTCCGAGGAA(G/T)     5228.5    G: 5541.7; T: 5516.7
ADRB2   R16G       40    GAGGCGGCGCATGGCTTC(C/T)    5556.6    C: 5829.8; T: 5844.8
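The extended oligo masses in Table 2 follow from adding the mass of one incorporated dideoxynucleotide to the unextended primer mass. The short Python sketch below illustrates that arithmetic. The per-base increments in `DDNTP_MASS` are an assumption: they are inferred from the extended-minus-unextended differences in Table 2, not taken from the text. The function and variable names are hypothetical.

```python
# Mass added by single-base extension with each dideoxynucleotide (Da).
# ASSUMPTION: values inferred from the differences between the extended and
# unextended masses reported in Table 2.
DDNTP_MASS = {"A": 297.2, "C": 273.2, "G": 313.2, "T": 288.2}

# (locus, unextended primer mass, {allele: extended mass reported in Table 2})
TABLE_2 = [
    ("AGT -6G>A", 3607.4, {"A": 3904.6, "G": 3920.6}),
    ("AGTR1 1166A>C", 3927.6, {"A": 4224.8, "C": 4200.8}),
    ("GNB3 825C>T", 4346.9, {"A": 4644.1, "G": 4660.1}),
    ("ADRB2 Q27E", 4547.0, {"C": 4820.2, "G": 4860.2}),
    ("LPL S447X", 4826.2, {"C": 5099.4, "G": 5139.4}),
    ("ADD1 G460W", 5228.5, {"G": 5541.7, "T": 5516.7}),
    ("ADRB2 R16G", 5556.6, {"C": 5829.8, "T": 5844.8}),
]


def extended_mass(primer_mass, base):
    """Expected mass of a primer after extension by one dideoxynucleotide."""
    return round(primer_mass + DDNTP_MASS[base], 1)


def max_discrepancy(table=TABLE_2):
    """Largest absolute difference between computed and reported masses."""
    return max(
        abs(extended_mass(mass, base) - reported)
        for _, mass, alleles in table
        for base, reported in alleles.items()
    )
```

Calling `max_discrepancy()` compares every computed mass with the tabulated value. The same `extended_mass` helper could be used when designing a new multiplex to check that no two expected peaks fall too close together.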
SNP GENOTYPING WITH MALDI-TOF 299
eluted with matrix can be spotted directly onto the MALDI plates but must be analyzed within approximately 24 hr. Samples eluted in acetonitrile can be stored for extended periods if frozen but must first be mixed with an equal amount of matrix prior to plate spotting. Approximately 400 nl of matrix-sample mixture was spotted onto a 384-well MALDI plate using a Symbiot I robotic workstation (Perseptive Biosystems). Barcode labels were used to verify and record the sample transfer from reaction plates to MALDI plates.
Mass Spectrometry and Genotype Determination
Masses of the mini-sequencing primers and extension products were assessed using linear delayed extraction mass spectrometry and a Voyager-DE MALDI-TOF instrument (Perseptive Biosystems). Genotypes for the samples were resolved with the aid of the Genespectrometer software package. Such allele-calling software is critical in order to achieve high throughput genotyping. The software first targets the ionization laser to accumulate mass data from regions of a sample spot that yield a high signal-to-noise ratio. When sufficient signal has been accumulated, the laser is directed to the next spot, allowing a 384-well MALDI plate containing multiplex genotyping reaction products to be analyzed automatically. The software then uses the mini-sequencing primers in each extension reaction as internal mass calibrants and automatically identifies the primers, calibrates each mass spectrum, and determines genotypes based on the mass differences between primer and extension peaks. Genotypes are written to an electronic database that can be queried to produce data files in a spreadsheet format suitable for further analysis.
Genotype Determination Using Alternative Methods
Genotypes generated using mass spectrometry were compared to two alternative methods, either gel-based restriction fragment length polymorphism (RFLP) enzyme digest assays (ADD1, AGT, AGTR1, GNB3, and LPL) or the Taqman hybridization assay (ADRB2-R16G and ADRB2-Q27E). PCR reactions for RFLPs included 30 ng DNA, 1.5 mM MgCl2, standard concentrations of PCR reagents, and 0.5 U Taq polymerase (Life Technologies, Rockville, MD) in a 20 µl reaction volume. The amplified products were then digested overnight with 6 U of the appropriate restriction enzyme, and digested products were separated on 3% agarose gels, stained with ethidium bromide, and visualized using UV light. A 100 bp size standard was included on each of the typing gels for genotype identification, and two individuals scored each gel independently. For the Taqman assay, PCR products were generated using 30 ng DNA, 5.0 mM MgCl2, and 1X Taqman Universal PCR Master Mix containing AmpliTaq Gold DNA Polymerase in a 27 µl reaction volume. A total of 0.2 µM of each of the allele-specific probes was used in the allele discrimination assays, and allele detection and genotype calling were performed using an ABI 7700 and Sequence Detection System software (Applied Biosystems).
RESULTS AND DISCUSSION
Single and Multiplex PCR
Single locus amplification reactions were generated for each of the variant sites (data not shown), as well as multiplex PCRs that contained sets of six (Fig. 1A), three (Fig. 1B), or four (Fig. 1C) target genes. A subset of samples visualized on 9% polyacrylamide gels demonstrated that the 4 µl reactions were reliable, with only slight variation in amplification efficiency due to variability of DNA sample quality and/or concentration. PCR primers were designed so that the 3′ end of each primer was at least two base pairs from the variant site (Fig. 1D) both to ensure that mini-sequencing primers would bind and extend only on specific targets and to negate the need to remove any non-specific PCR products. Limiting the PCR product size to less than 100 bp was found to produce the most robust multiplex PCR and mini-sequencing reactions. We have also successfully performed extension reactions using larger PCR products (up to 500 bp) that contain multiple SNPs (data not shown). The reaction volume for PCR was reduced to 4 µl, both to minimize reaction costs and in order to perform all steps of the sample preparation in a single vessel. Small reaction volumes also reduced the amount of DNA needed, thereby maximizing sample resources.
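The allele-calling logic described under Mass Spectrometry and Genotype Determination (the primer peak serves as an internal calibrant, and the genotype is read from primer-to-extension mass differences) could be sketched as follows. This is only an illustration under stated assumptions, not the Genespectrometer algorithm, which the text does not specify. The function name, the tolerance values, and the per-base mass increments are all hypothetical or inferred from Table 2.

```python
# Mass added by single-base extension with each dideoxynucleotide (Da).
# ASSUMPTION: inferred from the mass differences reported in Table 2.
DDNTP_MASS = {"A": 297.2, "C": 273.2, "G": 313.2, "T": 288.2}


def call_genotype(peaks, primer_mass, alleles, tol=2.0, primer_window=20.0):
    """Call a genotype from observed peak masses (Da) for one locus.

    peaks         -- observed peak masses in the spectrum
    primer_mass   -- expected unextended primer mass
    alleles       -- the two possible extension bases, e.g. "AG"
    tol           -- allowed error on a mass difference (hypothetical value)
    primer_window -- allowed error when locating the primer peak (hypothetical)
    Returns e.g. "AA" or "AG", or None when no confident call is possible.
    """
    if not peaks:
        return None
    # The observed peak closest to the expected primer mass is the calibrant.
    primer_peak = min(peaks, key=lambda m: abs(m - primer_mass))
    if abs(primer_peak - primer_mass) > primer_window:
        return None  # no residual primer peak to calibrate against
    found = [
        base
        for base in sorted(alleles)
        if any(abs((m - primer_peak) - DDNTP_MASS[base]) <= tol for m in peaks)
    ]
    if len(found) == 1:
        return found[0] * 2  # homozygote
    if len(found) == 2:
        return "".join(found)  # heterozygote
    return None
```

For example, `call_genotype([3608.1, 3905.2, 3921.4], 3607.4, "AG")` would report a heterozygote for the AGT primer. Working with differences from the observed primer peak, rather than absolute masses, is what makes a small uniform calibration offset in the spectrum harmless in this sketch.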
FIGURE 1. Multiplex PCR products. The six-plex reaction (A) contains PCR products for AGT (42 bp), ADD1 (46 bp), GNB3
(48 bp), LPL (50 bp), AGTR1 (52 bp), and ADRB2 (83 bp). The three-plex reaction (B) contains PCR products for ADD1 (46
bp), AGTR1 (52 bp), and ADRB2-Q27E (45 bp), and the four-plex reaction (C) contains products for AGT (42 bp), GNB3
(48 bp), LPL (50 bp), and ADRB2-R16G (40 bp). PCR primers were designed so that the 3′ end of each primer was 2–3 bp
from the variant site to ensure that mini-sequencing primers would extend only when bound to specific PCR targets (D).
Mini-sequencing Reactions and MALDI Chemistry
Mini-sequencing extension reactions resulted in easily separable peaks in single locus reactions and only moderate loss of signal in multiplex reactions. With smaller sets of three (Fig. 2A) and four (Fig. 2B) multiplexed PCR products, each set of extension products could be separated by two or more nucleotides, providing easily differentiated signals for each genotype. The seven-plex extension reaction also produced differentiable extension products, and higher-level multiplexing can be accomplished by extending the mass range beyond 6000 Daltons. Nevertheless, due to the close spacing of extension and primer peaks in the seven-plex reactions, there was some interference with depurination peaks when THAP was used as the matrix chemical (Fig. 3A). Alternatively, when 3-HPA matrix was used, depurination peaks were no longer present, providing clear signals for more closely spaced extension products (Fig. 3B).
In all cases, the use of Thermosequenase in the mini-sequencing reaction resulted in more reliable primer extension than did Tth polymerase. Nevertheless, samples that amplified and extended well did so with either enzyme. The use of Thermosequenase is preferred when genomic DNA samples are sub-optimal, but preliminary experiments have shown that the addition of ammonium sulfate to the Tth reaction buffer greatly enhances extension efficiency (data not shown). Because the largest cost associated with MALDI-TOF SNP genotyping is the polymerase enzyme for the extension reactions, and because Tth costs less than Thermosequenase, further investigation of its use in production genotyping is warranted.
Genotype Determination and Comparison to Alternative Methods
Genotypes generated using mass spectrometry were compared to two alternative methods, either RFLP assays (ADD1, AGT, AGTR1, GNB3, and LPL) or the Taqman hybridization assay (ADRB2-R16G and ADRB2-Q27E). A concordance score was calculated by comparing the genotypes generated using MS to genotypes generated with the alternative method. Allele frequencies for each comparison were not
of interest, has been demonstrated to provide accurate genotypes for numerous polymorphisms. Nevertheless, certain variant DNA sequences are difficult to differentiate using this method due to similar affinity of both perfect-match and mismatch probes for the target sequence. Since there is little flexibility in the design of the probe sequences, some variant sites are not well suited for hybridization genotyping methods. The use of mini-sequencing extension reactions and mass spectrometry circumvents the problems involved in allele-specific hybridization, since alleles are differentiated based on the mass of the extended products rather than hybridization signals.
Disagreement due to weak signals or excess primer concentration. As extension primers and products are expanded into the higher mass ranges, some decrease in signal is to be expected. This decrease in signal intensity may be due to less efficient desorption and/or decreasing sensitivity of the detector for higher mass molecules. Differences in PCR efficiency as well as competitive ionization in multiplex reactions may also contribute to uneven extension signals. Therefore, it is critical to minimize background noise and optimize the concentrations of primers so that primer signal does not mask extension signals. This is especially true in multiplex reactions, where high signal intensities for low mass primers may result in comparatively weak signals for the higher mass products.
Interference of depurination peaks with extension peaks in multiplex reactions. The use of THAP matrix can produce substantial depurination peaks that can mask or interfere with detection of extension peak signals in multiplex reactions. THAP is often used when assessing oligonucleotides with MALDI-TOF methods due to its even crystallization in the sample well, which is especially useful when sampling in the instrument's automatic mode. Interference from depurination peaks can make THAP undesirable for high level multiplexing, however. Because depurination peaks have a predictable mass that is approximately 150 Daltons less than the primer peaks, careful design of primer and extension masses can help to avoid depurination peak interference when using THAP matrix. An alternative to THAP is the use of 3-HPA, which produces an even baseline and clearly resolved peaks. However, 3-HPA crystallizes in a circular pattern, leaving the center of the well void, which can make autosampling difficult.
Cost Comparison of Alternative Methods
A comparison of genotyping supply costs for MS, RFLP, and Taqman is presented in Table 4. Estimates include the cost of all plasticware and consumables. Instrument costs amortized over a large number of locus tests were assumed to be negligible and personnel costs were assumed to be equivalent among methods. Taqman assay costs
TABLE 4. Cost (in U.S. Dollars) Comparison of Genotyping Using Mass Spectrometry Versus Alternative Methods
                  Mass spec typing                                           RFLP typing                     Taqman assay
Step              Total rxn vol.  Single locus  4-plex       7-plex       Total rxn vol.  Single locus   Total rxn vol.  Single locus
PCR               4 µl            $0.16         $0.04        $0.02        20 µl           $0.48          20 µl           $0.93
Enzyme digest     6 µl            $0.08         $0.02        $0.01        25 µl           $0.60          –               –
Post PCR (TS)     10 µl           $0.92         $0.23        $0.13        2% gel          $0.13          –               –
Post PCR (Tth)    10 µl           $0.73         $0.18        $0.10        –               –              –               –
Purification      –               $0.14         $0.04        $0.02        –               –              –               –
Total             –               $1.11–$1.30   $0.28–$0.33  $0.15–$0.18  –               $1.21          –               $0.93
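The totals in Table 4 can be reproduced by summing the per-step supply costs, with the range in each mass spec column coming from the choice of extension enzyme (Tth at the low end, Thermosequenase, "TS", at the high end). The Python sketch below does this in integer cents to avoid floating-point rounding. The values are transcribed from Table 4; the names are hypothetical.

```python
# Per-genotype supply costs in US cents, transcribed from Table 4.
MS_COMMON_STEPS = {
    "PCR": {"single": 16, "4-plex": 4, "7-plex": 2},
    "Enzyme digest": {"single": 8, "4-plex": 2, "7-plex": 1},
    "Purification": {"single": 14, "4-plex": 4, "7-plex": 2},
}
MS_EXTENSION = {
    "TS": {"single": 92, "4-plex": 23, "7-plex": 13},
    "Tth": {"single": 73, "4-plex": 18, "7-plex": 10},
}
RFLP_STEPS = {"PCR": 48, "Enzyme digest": 60, "Gel": 13}


def ms_total_cents(level, enzyme):
    """Total MS cost per genotype for a multiplex level and extension enzyme."""
    common = sum(step[level] for step in MS_COMMON_STEPS.values())
    return common + MS_EXTENSION[enzyme][level]


def dollars(cents):
    """Format an integer number of cents as a dollar string."""
    return f"${cents / 100:.2f}"


def ms_cost_range(level):
    """Low (Tth) and high (TS) total cost per genotype, as dollar strings."""
    return dollars(ms_total_cents(level, "Tth")), dollars(ms_total_cents(level, "TS"))
```

With these inputs, `ms_cost_range("7-plex")` gives the $0.15 to $0.18 range shown in the table, and `dollars(sum(RFLP_STEPS.values()))` gives the RFLP total.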
were calculated using Taqman master mix (Applied Biosystems), which contains AmpliTaq Gold DNA polymerase, dNTPs, PCR additives, and normalizing dye. For each genotyping method, the cost of enzymes is the primary expense, and single locus MS genotyping is comparable to other methods. Nevertheless, multiplexing up to seven primers in a single MS reaction can reduce the cost per genotype to approximately $0.15 US. The capacity to multiplex makes MS genotyping an extremely cost-effective method. In addition, even modest levels of multiplexing and automation made possible with MS will reduce personnel costs below that of the other two methods.
Genotyping Throughput With MS Methods
Critical to any evaluation of production genotyping methods is the consideration of sample throughput. Low volume PCR and mini-sequencing reactions using MS methods allow for reduced cycling times, with PCR completed in approximately 1 hr and 15 min and mini-sequencing reactions lasting around 1 hr. ExoI-SAP digestion takes 30 min and can be performed in a water bath or incubator. Preparing a sufficient quantity of sample cocktail at the start of each day minimizes reagent transfer time. Sample purification using robotic transfer takes approximately 30 min, and when samples are eluted in matrix, sample spotting from four 96-well reaction plates to one 384-well MALDI plate using the Symbiot I robotic workstation takes approximately 40 min. Autosampling of the MALDI plate using the Voyager-DE takes approximately 90 min. With eight thermal blocks, we can prepare approximately 2,300 samples using 96-well plates and more than 9,200 samples using 384-well plates per eight-hour day. Approximately 12 384-well MALDI plates (~4,600 samples) can be spotted in an eight-hour day, and currently about six MALDI plates (~2,300 samples) can be analyzed per day. By incorporating as few as five variants into a multiplex reaction, more than 11,500 genotypes can be produced in a regular eight-hour day. Extending robotic spotting and analysis of MALDI plates into the off hours can greatly improve throughput.
Issues in High Throughput Genotyping With Mass Spectrometry
In order to bring the use of MS genotyping into the laboratory at a production level, there are a number of factors that must be considered. First, performing the multi-step sample preparation in a single vessel is critical for minimizing sample contamination and labor steps. Second, reaction volumes should be kept low in order to minimize cost, concentrate template, and maximize DNA sample resources. Third, as with any high throughput genotyping method, the use of robotics greatly improves productivity. Robotic spotting of MALDI plates is essential due to the difficulty in manual transfer of sub-microliter sample volumes. Fourth, barcoding of both sample and MALDI plates greatly assists in tracking samples from preparation to data analysis. Fifth, mass detection of samples must take place automatically, requiring robust reactions and consistent sample preparation, spotting, and laser targeting. And sixth, interpretation of mass spectra and genotype calling must be automated, reliable, and fully integrated with a general data management utility.
CONCLUSIONS
The identification of a large number of SNPs throughout the genome provides a valuable tool for identifying and characterizing genes underlying human disease, for understanding inter-individual variation in drug response, and for tracking the origins and migrations of human populations. Additionally, it is becoming increasingly clear that both coding and non-coding regions of genes can be highly variable, and both encode information necessary for proper gene function and regulation. To date, more than 60 polymorphisms have been identified in the gene encoding lipoprotein lipase alone [Templeton et al., 2000]. Characterizing all of the variation in a gene or genes may be necessary before the effects of sequence variation in the gene can be determined. These facts emphasize the need for efficient and cost-effective methods for high throughput SNP genotyping.
Genotyping technologies are advancing rapidly, and the ultimate technology for high throughput, cost effective SNP genotyping has possibly not yet been envisioned. Nevertheless, of the techniques currently available, MALDI-TOF genotyping methods show great promise for high-throughput, economical genotyping. Reducing reaction volumes saves cost in these high-throughput applications; however, it also increases the need for sensitivity. Mass spectrometry provides sufficient sensitivity for sequence detection in low-volume reactions, and this sensitivity is particularly useful for multiplex genotyping reactions that typically produce large numbers of low concentration oligonucleotides. The capacity for high level multiplexing, combined with low cost and high sensitivity, makes MALDI-TOF genotyping methods a desirable alternative for production genotyping.
ACKNOWLEDGMENTS
This work was supported by grants from CDC (UR6/CCU617218-01 to MSB), and NIH (HL54481, HL54526, HL54457, and HL54464 to EB, and DDK45538 to PAD). We thank Marjorie Minkoff, Larry Haff, Philip Ross, Jon Speak, and Perseptive Biosystems for preliminary use of the Genespectrometer software.
REFERENCES
Altshuler D, Pollara V, Cowles C, Van Etten W, Baldwin J, Linton L, Lander E. 2000. An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature 407:513–516.
Cusi D, Barlassina C, Azzani T, Casari G, Citterio L, Devoto M, Glorioso N, Lanzani C, Manunta P, Righetti M, Rivera R, Stella P, Troffa C, Zagato L, Bianchi G. 1997. Polymorphisms of alpha-adducin and salt sensitivity in patients with essential hypertension. Lancet 349:1353–1357.
Gray I, Campbell D, Spurr N. 2000. Single nucleotide polymorphisms as tools in human genetics. Hum Mol Genet 9:2403–2408.
Griffin T, Smith L. 2000. Genetic identification by mass spectrometric analysis of single-nucleotide polymorphisms: ternary encoding of genotypes. Anal Chem 72:3298–3302.
Haff L, Smirnov I. 1997. Single nucleotide polymorphism identification assays using a thermostable DNA polymerase and MALDI-TOF MS. Genome Res 7:378–388.
Jeunemaitre X, Soubrier F, Kotelevtsev Y, Lifton R, Williams C, Charru A, Hunt S, Hopkins P, Williams R, Lalouel J-M, Corvol P. 1992. Molecular basis of human hypertension: role of angiotensinogen. Cell 71:169–180.
Lichtenwalter K, Apffel A, Bai J, Chakel J, Dai Y, Hahnenberger K, Li L, Hancock W. 2000. Approaches to functional genomics: potential of matrix-assisted laser desorption ionization-time of flight mass spectrometry combined with separation methods for the analysis of DNA in biological samples. J Chromatogr B Biomed Sci Appl 745:231–241.
Porter C, Talbot Jr C, Cuticchia A. 2000. Central mutation databases – a review. Hum Mutat 15:36–44.
Ross P, Hall L, Smirnov I, Haff L. 1998. High level multiplex genotyping by MALDI-TOF mass spectrometry. Nat Biotechnol 16:1347–1351.
Sass C, Herbeth B, Siest G, Visvikis S. 2000. Lipoprotein lipase (C/G)447 polymorphism and blood pressure in the Stanislas Cohort. J Hypertens 18:1775–1781.
Siffert W, Rosskopf D, Siffert G, Busch S, Moritz A, Erbel R, Sharma A, Ritz E, Wichmann H, Jakobs K, Horsthemke B. 1998. Association of a human G-protein beta3 subunit variant with hypertension. Nat Genet 18:45–48.
Sun X, Ding H, Guo B. 2000. A new MALDI-TOF based mini-sequencing assay for genotyping SNPs. Nucleic Acids Res 28:E68.
Templeton A, Clark A, Weiss K, Nickerson D, Boerwinkle E, Sing C. 2000. Recombinational and mutational hotspots within the human lipoprotein lipase gene. Am J Hum Genet 66:69–83.
Timmermann B, Mo R, Luft F, Gerdts E, Busjahn A, Omvik P, Li G, Schuster H, Wienker T, Hoehe M, Lund-Johansen P. 1998. Beta-2 adrenoceptor genetic variation is associated with genetic predisposition to essential hypertension: The Bergen blood pressure study. Kidney Int 53:1455–1460.
Wang W, Zee R, Morris B. 1997. Association of angiotensin II type 1 receptor gene polymorphism with essential hypertension. Clin Genet 51:31–34.