http://genomebiology.com/2002/3/7/research/0034.1
Research
Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes
Jo Vandesompele, Katleen De Preter, Filip Pattyn, Bruce Poppe, Nadine Van Roy, Anne De Paepe and Frank Speleman
Address: Center for Medical Genetics, Ghent University Hospital 1K5, De Pintelaan 185, B-9000 Ghent, Belgium.
Correspondence: Frank Speleman. E-mail: franki.speleman@rug.ac.be
Abstract
Background:
Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem.
Results:
We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data.
Conclusions:
The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
Published: 18 June 2002
Genome Biology 2002, 3(7):research0034.1–0034.11
The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2002/3/7/research/0034
© 2002 Vandesompele et al., licensee BioMed Central Ltd (Print ISSN 1465-6906; Online ISSN 1465-6914)
Received: 20 December 2001
Revised: 10 April 2002
Accepted: 7 May 2002
Background
Gene-expression analysis is increasingly important in many fields of biological research. Understanding patterns of expressed genes is expected to provide insight into complex regulatory networks and will most probably lead to the identification of genes relevant to new biological processes, or implicated in disease. Two recently developed methods to measure transcript abundance have gained much popularity and are frequently applied. Microarrays allow the parallel analysis of thousands of genes in two differentially labeled RNA populations [1], while real-time RT-PCR provides the simultaneous measurement of gene expression in many different samples for a limited number of genes, and is especially suitable when only a small number of cells are available [2-4]. Both techniques have the advantage of speed, throughput and a high degree of potential automation compared to conventional quantification methods, such as northern-blot analysis, ribonuclease protection assay, or competitive RT-PCR. Nevertheless, these new approaches require the same kind of normalization as the traditional methods of mRNA quantification.
Several variables need to be controlled for in gene-expression analysis, such as the amount of starting material, enzymatic efficiencies, and differences between tissues or cells in overall transcriptional activity. Various strategies have been applied to normalize these variations. Under controlled conditions of reproducible extraction of good-quality RNA, the gene transcript number is ideally standardized to the number of cells, but accurate enumeration of cells is often precluded, for example when starting with solid tissue. Another frequently applied normalization scalar is the RNA mass quantity, especially in northern blot analysis. There are several arguments against the use of mass quantity. The quality of RNA and related efficiency of the enzymatic reactions are not taken into account. Moreover, in some instances it is impossible to quantify this parameter, for example, when only minimal amounts of RNA are available from microdissected tissues. Probably the strongest argument against the use of total RNA mass for normalization is the fact that it consists predominantly of rRNA molecules, and is not always representative of the mRNA fraction. This was recently evidenced by a significant imbalance between rRNA and mRNA content in approximately 75% of mammary adenocarcinomas [5]. Also, it has been reported that rRNA transcription is affected by biological factors and drugs [6-8]. Further drawbacks to the use of 18S or 28S rRNA molecules as standards are their absence in purified mRNA samples, and their high abundance compared to target mRNA transcripts. The latter makes it difficult to accurately subtract the baseline value in real-time RT-PCR data analysis.
To date, internal control genes are most frequently used to normalize the mRNA fraction. This internal control - often referred to as a housekeeping gene - should not vary in the tissues or cells under investigation, or in response to experimental treatment. However, many studies make use of these constitutively expressed control genes without proper validation of their presumed stability of expression. But the literature shows that housekeeping gene expression - although occasionally constant in a given cell type or experimental condition - can vary considerably (reviewed in [9-12]). With the increased sensitivity, reproducibility and large dynamic range of real-time RT-PCR methods, the requirements for a proper internal control gene have become increasingly stringent. In this study, we carried out an extensive evaluation of 10 commonly used housekeeping genes in 13 different human tissues, and outlined a procedure for calculating a normalization factor based on multiple control genes for more accurate and reliable normalization of gene-expression data. Furthermore, this normalization factor was validated in a comparative study with frequently applied microarray scaling factors using publicly available microarray data.
Results
Expression profiling of housekeeping genes
Primers were designed for ten commonly used housekeeping genes (ACTB, B2M, GAPD, HMBS, HPRT1, RPL13A, SDHA, TBP, UBC and YWHAZ) (see Table 1 for full gene name, accession number, function, chromosomal localization, alias, existence of processed pseudogenes, and indication that primers span an intron; see Table 2 for primer sequences). Special attention was paid to selecting genes that belong to different functional classes, which significantly reduces the chance that genes might be co-regulated. The expression level of these 10 internal control genes was determined in 34 neuroblastoma cell lines (independently prepared in different labs from different patients), 20 short-term cultured normal fibroblast samples from different individuals, 13 normal leukocyte samples, 9 normal bone-marrow samples, and 9 additional normal human tissues from pooled organs (heart, brain, fetal brain, lung, trachea, kidney, mammary gland, small intestine and uterus). The raw expression values are available as a tab-delimited file (see Additional data files).
Single control normalization error
To determine the possible errors related to the common practice of using only one housekeeping gene for normalization, we calculated the ratio of the ratios of two control genes in two different samples (from the same tissue panel) and termed it the single control normalization error, E (see Materials and methods). For two ideal internal control genes (constant ratio between the genes in all samples), E equals 1. In practice, observed E values are larger than 1 and constitute the erroneous E-fold expression difference between two samples, depending on the particular housekeeping gene used for normalization. E values were calculated for all 45 two-by-two combinations of control genes and 865 two-by-two sample combinations within the available tissue panels (neuroblastoma, fibroblast, leukocyte, bone marrow and a series of normal tissues from Clontech; that is, a total of 38,925 data points) (Figure 1). In addition, the systematic error distribution was calculated by analysis of repeated runs of the same control gene. The average 75th and 90th percentile E values are 3.0 (range 2.1-3.9), and 6.4 (range 3.0-10.9), respectively.
Gene-stability measure and ranking of selected housekeeping genes
It is generally accepted that gene-expression levels should be normalized by a carefully selected stable internal control gene. However, to validate the presumed stable expression of a given control gene, prior knowledge of a reliable measure to normalize this gene in order to remove any nonspecific variation is required. To address this circular problem, we developed a gene-stability measure to determine the expression stability of control genes on the basis of non-normalized expression levels. This measure relies on the principle that the expression ratio of two ideal internal control genes is identical in all samples, regardless of the experimental condition or cell type. In this way, variation of the expression ratios of two real-life housekeeping genes reflects the fact that one (or both) of the genes is (are) not constantly expressed, with increasing variation in ratio corresponding to decreasing expression stability. For every control gene we determined the pairwise variation with all other control genes as the standard deviation of the logarithmically transformed expression ratios, and defined the internal control gene-stability measure M as the average pairwise
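The pairwise-variation principle described above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function names, the data layout and the example values are hypothetical, the logarithm base (2) and the use of the sample standard deviation are assumptions, and M is taken here to be the mean of a gene's pairwise variations with all other control genes.

```python
import math
from statistics import mean, stdev


def pairwise_variation(values_j, values_k):
    """Standard deviation of the log-transformed expression ratios of two
    genes measured across the same samples (base 2 is an assumption)."""
    log_ratios = [math.log2(a / b) for a, b in zip(values_j, values_k)]
    return stdev(log_ratios)


def stability_measure(expr):
    """Map each gene to M, assumed here to be the mean pairwise variation
    of that gene with every other control gene.

    expr: dict mapping gene symbol -> list of non-normalized expression
    values, one per sample, in the same sample order for every gene.
    """
    return {
        gene: mean(
            [pairwise_variation(expr[gene], expr[other])
             for other in expr if other != gene]
        )
        for gene in expr
    }


# Hypothetical values: gene_A and gene_B keep a constant ratio across the
# three samples, gene_C does not.
expr = {
    "gene_A": [1.0, 2.0, 4.0],
    "gene_B": [2.0, 4.0, 8.0],
    "gene_C": [1.0, 4.0, 2.0],
}
for gene, m in sorted(stability_measure(expr).items(), key=lambda kv: kv[1]):
    print(gene, round(m, 3))
```

Because only ratios between genes enter the calculation, the sketch works directly on non-normalized expression levels; under these assumptions a lower M indicates a more stably expressed gene.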
Table 1
Internal control genes evaluated in this study
Symbol | Accession number | Name | Function | Localization | Pseudogene* | Primers† | Alias | IMAGE‡
ACTB | NM_001101 | Beta actin | Cytoskeletal structural protein | 7p15-p12 | + | S | | 510455
B2M | NM_004048 | Beta-2-microglobulin | Beta-chain of major histocompatibility complex class I molecules | 15q21-q22 | - | S | | 51940
GAPD | NM_002046 | Glyceraldehyde-3-phosphate dehydrogenase | Oxidoreductase in glycolysis and gluconeogenesis | 12p13 | + | D | | 510510
HMBS | NM_000190 | Hydroxymethylbilane synthase | Heme synthesis, porphyrin metabolism | 11q23 | - | D | Porphobilinogen deaminase | 245564
HPRT1 | NM_000194 | Hypoxanthine phosphoribosyltransferase 1 | Purine synthesis in salvage pathway | Xq26 | + | D | | 345845
RPL13A | NM_012423 | Ribosomal protein L13a | Structural component of the large 60S ribosomal subunit | 19q13 | + | D | 23 kDa highly basic protein | -
SDHA | NM_004168 | Succinate dehydrogenase complex, subunit A | Electron transporter in the TCA cycle and respiratory chain | 5p15 | + | D | | 375812
TBP | NM_003194 | TATA box binding protein | General RNA polymerase II transcription factor | 6q27 | - | D | | 280735
UBC | M26880 | Ubiquitin C | Protein degradation | 12q24 | - | D | | 510582
YWHAZ | NM_003406 | Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide | Signal transduction by binding to phosphorylated serine residues on a variety of signaling molecules | 2p25 | + | S§ | Phospholipase A2 | 416026
*Presence (+) or absence (-) of a retropseudogene in the genome determined by BLAST analysis of the mRNA sequence using the high-throughput genomic sequences database (htgs) or human genome as database.
†Localization of forward and reverse primer in different exons (D) or the same exon (S).
‡IMAGE cDNA clone number according to [14].
§A single-exon gene.
Table 2
Primer sequences for internal control genes
Symbol* | Forward primer | Reverse primer
ACTB | CTGGAACGGTGAAGGTGACA | AAGGGACTTCCTGTAACAATGCA
B2M | TGCTGTCTCCATGTTTGATGTATCT | TCTCTGCTCCCCACCTCTAAGT
GAPD | TGCACCACCAACTGCTTAGC | GGCATGGACTGTGGTCATGAG
HMBS† | GGCAATGCGGCTGCAA | GGGTACCCACGCGAATCAC
HPRT1 | TGACACTGGCAAAACAATGCA | GGTCCTTTTCACCAGCAAGCT
RPL13A | CCTGGAGGAGAAGAGGAAAGAGA | TTGAGGACCTCTGTGTATTTGTCAA
SDHA | TGGGAACAAGAGGGCATCTG | CCACCACTGCATCAAATTCATG
UBC | ATTTGGGTCGCGGTTCTTG | TGCCTTGACATTCTCGATGGT
YWHAZ | ACTTTTGGTACATTGTGGCTTCAA | CCGCCAGGACAAACCAGTAT
*TBP primer sequences are described in [24].
†HMBS primer sequences kindly provided by E. Mensink and L. van de Locht (Nijmegen, The Netherlands).