
Stats and clinical trials

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Wed, 23 Jun 2010 02:43:06 UTC

Contents

Articles

Epidemiological methods
Study design
Blind experiment
Randomized controlled trial
Statistical inference
Statistical hypothesis testing
Meta-analysis
Clinical trial

References

Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses

License

Epidemiological methods


The science of epidemiology has matured significantly since the times of Hippocrates and John Snow. The techniques for gathering and analyzing epidemiological data vary depending on the type of disease being monitored, but each study will have overarching similarities.

Outline of the Process of an Epidemiological Study[1]

1. Establish that a Problem Exists
• Full epidemiological studies are expensive and laborious undertakings. Before any study is started, a case must be made for the importance of the research.
2. Confirm the Homogeneity of the Events
• Any conclusions drawn from inhomogeneous cases will be suspect. All events or occurrences of the disease must be true cases of the disease.
3. Collect all the Events
• It is important to collect as much information as possible about each event in order to inspect a large number of possible risk factors. The events may be collected from varied methods of epidemiological study or from censuses or hospital records.
• The events can be characterized by incidence rates and prevalence rates.
4. Characterize the Events as to Epidemiological Factors
1. Predisposing factors
• Non-environmental factors that increase the likelihood of getting a disease. Genetic history, age, and gender are examples.
2. Enabling/disabling factors
• Factors relating to the environment that either increase or decrease the likelihood of disease. Exercise and a good diet are examples of disabling factors; a weakened immune system and poor nutrition are examples of enabling factors.
3. Precipitating factors
• This factor is the most important in that it identifies the source of exposure. It may be a germ, toxin, or gene.
4. Reinforcing factors
• These are factors that compound the likelihood of getting a disease. They may include repeated exposure or excessive environmental stresses.
5. Look for Patterns and Trends
• Here one looks for similarities in the cases which may identify major risk factors for contracting the disease. Epidemic curves may be used to identify such risk factors.
6. Formulate a Hypothesis
• If a trend has been observed in the cases, the researcher may postulate as to the nature of the relationship between the potential disease-causing agent and the disease.
7. Test the Hypothesis
• Because epidemiological studies can rarely be conducted in a laboratory, the results are often polluted by uncontrollable variations in the cases. This often makes the results difficult to interpret. Two methods have evolved to assess the strength of the relationship between the disease-causing agent and the disease.
• Koch's postulates were the first criteria developed for epidemiological relationships. Because they only work well for highly contagious bacteria and toxins, this method is largely out of favor.

• Bradford-Hill Criteria are the current standards for epidemiological relationships. A relationship may fill all, some, or none of the criteria and still be true.
8. Publish the Results

Measures

Epidemiologists are famous for their use of rates. Each measure serves to characterize the disease, giving valuable information about contagiousness, incubation period, duration, and mortality of the disease.

Measures of occurrence
1. Incidence measures (where cases included are defined using a case definition)
1. Cumulative incidence
2. Incidence rate
2. Prevalence measures
1. Point prevalence
2. Period prevalence

Measures of association
1. Relative measures
1. Risk ratio
2. Rate ratio
3. Odds ratio
4. Hazard ratio
2. Absolute measures
1. Attributable risk
1. Attributable risk in exposed
2. Percent attributable risk
3. Levin's attributable risk
2. Absolute risk reduction
3. Hazard rate

Other measures
1. Virulence and Infectivity
2. Mortality rate and Morbidity rate
3. Case fatality
4. Sensitivity (tests) and Specificity (tests)

See also
• Study design
• Epidemiology
• OpenEpi
• Epi_Info
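The relative and absolute measures of association mentioned above can be computed directly from a 2×2 table of exposure versus disease. The sketch below uses invented counts purely for illustration:

```python
# Hypothetical 2x2 cohort table (all counts invented for illustration):
#                 disease   no disease
# exposed            a=40        b=60
# unexposed          c=10        d=90
a, b, c, d = 40, 60, 10, 90

risk_exposed = a / (a + b)      # incidence proportion among the exposed
risk_unexposed = c / (c + d)    # incidence proportion among the unexposed

risk_ratio = risk_exposed / risk_unexposed          # relative measure
odds_ratio = (a * d) / (b * c)                      # relative measure
attributable_risk = risk_exposed - risk_unexposed   # absolute measure (risk difference)
percent_attributable = 100 * attributable_risk / risk_exposed

print(round(risk_ratio, 3), round(odds_ratio, 3),
      round(attributable_risk, 3), round(percent_attributable, 1))
```

With these counts the exposed group has four times the risk of the unexposed group, and 75% of the risk in the exposed is attributable to the exposure.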

External links
• Epidemiologic.org [2] Epidemiologic Inquiry online weblog for epidemiology researchers
• Epidemiology Forum [3] A discussion and forum community for epi analysis support and fostering questions, debates, and collaborations in epidemiology
• The Centre for Evidence Based Medicine [4] at Oxford maintains an on-line "Toolbox" of evidence based medicine methods, with easy explanations
• Epimonitor [5] has a comprehensive list of links to associations, agencies, bulletins, etc.
• Epidemiology for the Uninitiated [6] On-line text
• North Carolina Center for Public Health Preparedness Training [7] On-line training classes for epidemiology and related topics

References
[1] Austin, Donald F., and S. B. Werner. Epidemiology for the Health Sciences: A Primer on Epidemiologic Concepts and Their Uses. Springfield, Ill: C. C. Thomas, 1974. Print.
[2] http://www.epidemiologic.org/
[3] http://www.epidemiologic.org/forum/
[4] http://www.cebm.net/toolbox.asp
[5] http://www.epimonitor.net/index.htm
[6] http://bmj.bmjjournals.com/epidem/epid.html
[7] http://www.sph.unc.edu/nccphp/training/

Study design

A number of different study designs are indicated below.

Treatment studies
• Randomized controlled trial
• Double-blind randomized trial
• Single-blind randomized trial
• Non-blind trial
• Nonrandomized trial (quasi-experiment)
• Interrupted time series design (measures on a sample or a series of samples from the same population are obtained several times before and after a manipulated event or a naturally occurring event) – considered a type of quasi-experiment

Observational studies
• Cohort study
• Prospective cohort
• Retrospective cohort
• Time series study
• Case-control study
• Nested case-control study
• Cross-sectional study
• Community survey (a type of cross-sectional study)
• Ecological study

Important considerations

When choosing a study design, many factors must be taken into account. Different types of studies are subject to different types of bias. For example, recall bias is likely to occur in cross-sectional or case-control studies where subjects are asked to recall exposure to risk factors. Subjects with the relevant condition (e.g. breast cancer) may be more likely to recall the relevant exposures that they had undergone (e.g. hormone replacement therapy) than subjects who don't have the condition. The ecological fallacy may occur when conclusions about individuals are drawn from analyses conducted on grouped data. The nature of this type of analysis tends to overestimate the degree of association between variables.

Other terms
• The term retrospective study is sometimes used as another term for a case-control study. This use of the term "retrospective study" is misleading, however, and should be avoided because other research designs besides case-control studies are also retrospective in orientation.
• A longitudinal study researches subjects over two or more points in time; by contrast, a cross-sectional study assesses research subjects at one point in time.
• Superiority trials are designed to demonstrate that one treatment is more effective than another.
• Non-inferiority trials are designed to demonstrate that a treatment is at least not appreciably worse than another.
• Equivalence trials are designed to demonstrate that one treatment is as effective as another.
• When using "parallel groups", each patient receives one treatment; in a "crossover study", each patient receives several treatments.

Seasonal studies

Conducting studies in seasonal indications (such as allergies, Seasonal Affective Disorder, influenza, and others) can complicate a trial, as patients must be enrolled quickly. Additionally, seasonal variations and weather patterns can affect a seasonal study.[1] [2]


See also

• Clinical trial
• Conceptual framework
• Design of experiments
• Epidemiological methods
• Experimental control
• Hypothesis
• Meta-analysis
• Operationalization

External links

• Epidemiologic.org [2] Epidemiologic Inquiry online weblog for epidemiology researchers
• Epidemiology Forum [3] An epidemiology discussion and forum community to foster debates and collaborations in epidemiology
• Some aspects of study design [3] Tufts University web site
• Comparison of strength [4] Description of study designs from the National Cancer Institute
• Political Science Research Design Handbook [5] Truman State University website
• Study Design Tutorial [6] Cornell University College of Veterinary Medicine

References

[1] Yamin Khan and Sarah Tilly. "Flu, Season, Diseases Affect Trials" (http://appliedclinicaltrialsonline.findpharma.com/appliedclinicaltrials/Drug+Development/Flu-Season-Diseases-Affect-Trials/ArticleStandard/Article/detail/652128). Applied Clinical Trials Online. Retrieved 26 February 2010.
[2] Yamin Khan and Sarah Tilly. "Seasonality: The Clinical Trial Manager's Logistical Challenge" (http://www.pharm-olam.com/pdfs/POI-Seasonality.pdf). Pharm-Olam International (http://www.pharm-olam.com/). Retrieved 26 April 2010.
[3] http://www.jerrydallal.com/LHSP/STUDY.HTM
[4] http://imsdd.meb.uni-bonn.de/cancernet/902570.html
[5] http://www2.truman.edu/polisci/design.htm
[6] http://www.vet.cornell.edu/imaging/tutorial/index.html

Blind experiment


A blind or blinded experiment is a scientific experiment in which some of the persons involved are prevented from knowing certain information that might lead to conscious or unconscious bias on their part, invalidating the results. For example, when asking consumers to compare the tastes of different brands of a product, the identities of the brands should be concealed — otherwise consumers will generally tend to prefer the brand they are familiar with. Similarly, when evaluating the effectiveness of a medical drug, both the patients and the doctors who administer the drug may be kept in the dark about the dosage being applied in each case — to forestall any chance of a placebo effect, observer bias, or conscious deception. Blinding can be imposed on researchers, technicians, subjects, funders, or any combination of them. The opposite of a blind trial is an open trial.

Blind experiments are an important tool of the scientific method in many fields of research — from medicine, forensics, psychology and the social sciences, to basic sciences such as physics and biology, and to market research. In some disciplines, such as drug testing, blind experiments are considered essential.

The terms blind (adjective) or to blind (transitive verb) when used in this sense are figurative extensions of the literal idea of blindfolding someone. The terms masked or to mask may be used for the same concept. (This is commonly the case in ophthalmology, where the word 'blind' is often used in the literal sense.)

History

One of the earliest suggestions that a blinded approach to experiments would be valuable came from Claude Bernard, who recommended that any scientific experiment be split between the theorist who conceives the experiment and a naive (and preferably uneducated) observer who registers the results without foreknowledge of the theory or hypothesis being tested. This suggestion contrasted starkly with the prevalent Enlightenment-era attitude that scientific observation can only be objectively valid when undertaken by a well-educated, informed scientist.[1]

Single-blind trials

Single-blind describes experiments where information that could introduce bias or otherwise skew the result is withheld from the participants, but the experimenter is in full possession of the facts. In a single-blind experiment, the individual subjects do not know whether they are so-called "test" subjects or members of an "experimental control" group. Single-blind experimental design is used where the experimenters either must know the full facts (for example, when comparing sham to real surgery), and so cannot themselves be blind, or where the experimenters will not introduce further bias, and so need not be blind. However, there is a risk that subjects are influenced by interaction with the researchers — known as the experimenter's bias. Single-blind trials are especially risky in psychology and social science research, where the experimenter has an expectation of what the outcome should be, and may consciously or subconsciously influence the behavior of the subject.

A classic example of a single-blind test is the "Pepsi challenge." A marketing person prepares several cups of cola labeled "A" and "B". One set of cups has Pepsi, the others have Coca-Cola. The marketing person knows which soda is in which cup but is not supposed to reveal that information to the subjects. Volunteer subjects are encouraged to try the two cups of soda and are polled for which ones they prefer. The problem with a single-blind test like this is that the marketing person can give subconscious cues (unintentional or not) which bias the volunteer. In addition, it is possible the marketing person could prepare the separate sodas differently (more ice in one cup, pushing one cup in front of the volunteer, etc.), which can cause a bias. If the marketing person is employed by the company which is producing the challenge, there is always the possibility of a conflict of interest, where the marketing person is aware that future income will be based on the results of the test.


Double-blind trials

Double-blind describes an especially stringent way of conducting an experiment, usually on human subjects, in an attempt to eliminate subjective bias on the part of both experimental subjects and the experimenters. In most cases, double-blind experiments are held to achieve a higher standard of scientific rigor. In a double-blind experiment, neither the individuals nor the researchers know who belongs to the control group and who belongs to the experimental group. Only after all the data have been recorded (and in some cases, analyzed) do the researchers learn which individuals are which. Performing an experiment in double-blind fashion is a way to lessen the influence of prejudices and unintentional physical cues on the results (the placebo effect, observer bias, and experimenter's bias). Random assignment of the subject to the experimental or control group is a critical part of double-blind research design. The key that identifies the subjects and which group they belonged to is kept by a third party and not given to the researchers until the study is over.

Double-blind methods can be applied to any experimental situation where there is the possibility that the results will be affected by conscious or unconscious bias on the part of the experimenter. Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine-based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.

Usage

In medicine

Double-blinding is relatively easy to achieve in drug studies by formulating the investigational drug and the control (either a placebo or an established drug) to have identical appearance (color, taste, etc.). Patients are randomly assigned to the control or experimental group and given random numbers by a study coordinator, who also encodes the drugs with matching random numbers. Neither the patients nor the researchers monitoring the outcome know which patient is receiving which treatment until the study is over and the random code is broken.

Effective blinding can be difficult to achieve where the treatment is notably effective (indeed, studies have been suspended in cases where the tested drug combinations were so effective that it was deemed unethical to continue withholding the findings from the control group, and the general population),[2] [3] or where the treatment is very distinctive in taste or has unusual side-effects that allow the researcher and/or the subject to guess which group they were assigned to. It is also difficult to use the double-blind method to compare surgical and non-surgical interventions (although sham surgery, involving a simple incision, might be ethically permitted). A good clinical protocol will foresee these potential problems to ensure blinding is as effective as possible.

It has also been argued[4] that even in a double-blind experiment, general attitudes of the experimenter, such as skepticism or enthusiasm towards the tested procedure, can be subconsciously transferred to the test subjects.

Evidence-based medicine practitioners prefer blinded randomised controlled trials (RCTs) where that is a possible experimental design. These are high on the hierarchy of evidence; only a meta-analysis of several well-designed RCTs is considered more reliable.
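The coordinator's coding procedure described above can be sketched in a few lines. Everything here is invented for illustration (patient IDs, kit codes, group sizes); a real trial would use validated randomization software and a documented code-break procedure:

```python
import random

# Hypothetical double-blind allocation: patients receive coded drug kits,
# and only the third-party key links a kit code to "drug" or "placebo".
random.seed(42)  # fixed seed so the sketch is reproducible

patients = ["P01", "P02", "P03", "P04", "P05", "P06"]
treatments = ["drug", "placebo"] * (len(patients) // 2)  # balanced 1:1 allocation
random.shuffle(treatments)

# The key (kit code -> treatment) is held by a third party, not the researchers.
key = {f"KIT-{i:03d}": t for i, t in enumerate(treatments)}

# Researchers and patients see only patient -> kit code, never the key.
assignment = dict(zip(patients, key))

# Breaking the code after the study: join the assignment with the key.
unblinded = {p: key[code] for p, code in assignment.items()}
print(unblinded)
```

The essential point is the separation of the two tables: the researchers' records contain only opaque kit codes until the study ends.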

In physics

Modern nuclear physics and particle physics experiments often involve large numbers of data analysts working together to extract quantitative data from complex datasets. In particular, the analysts want to report accurate systematic error estimates for all of their measurements; this is difficult or impossible if one of the errors is observer bias. To remove this bias, the experimenters devise blind analysis techniques, where the experimental result is hidden from the analysts until they've agreed—based on properties of the data set other than the final value—that the analysis techniques are fixed.

One example of a blind analysis occurs in neutrino experiments, like the Sudbury Neutrino Observatory, where the experimenters wish to report the total number N of neutrinos seen. The experimenters have preexisting expectations about what this number should be, and these expectations must not be allowed to bias the analysis. Therefore, the experimenters are allowed to see an unknown fraction f of the dataset. They use these data to understand the backgrounds, signal-detection efficiencies, detector resolutions, etc. However, since no one knows the "blinding fraction" f, no one has preexisting expectations about the meaningless neutrino count N' = N x f in the visible data; therefore, the analysis does not introduce any bias into the final number N which is reported. At the end of the analysis period, one is allowed to "unblind the data" and "open the box": the analysis software is run once (with no subjective human intervention), and the resulting numbers are published.

Another blinding scheme is used in B meson analyses in experiments like BaBar and CDF. Here, the crucial experimental parameter is a correlation between certain particle energies and decay times—which require an extremely complex and painstaking analysis—and particle charge signs, which are fairly trivial to measure. Analysts are allowed to work with all of the energy and decay data, but are forbidden from seeing the sign of the charge, and thus are unable to see the correlation (if any). At the end of the experiment, the correct charge signs are revealed.

Searches for rare events, like electron neutrinos in MiniBooNE or proton decay in Super-Kamiokande, require a different class of blinding schemes. The "hidden" part of the experiment—the fraction f for SNO, the charge-sign database for CDF—is usually called the "blindness box".

In forensics

In a police photo lineup, an officer shows a group of photos to a witness or crime victim and asks him or her to pick out the suspect. This is basically a single-blind test of the witness' memory, and may be subject to subtle or overt influence by the officer. There is a growing movement in law enforcement to move to a double-blind procedure in which the officer who shows the photos to the witness does not know which photo is of the suspect.[5] [6]

External links
• "control group study" — The Skeptic's Dictionary [7] — More on why double blind is important.
• PharmaSchool JargonBuster Clinical Trial Terminology Dictionary [8]
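The SNO-style "blinding fraction" scheme can be illustrated with a toy simulation. All numbers below are invented, and a real analysis involves far more than counting events; the point is only that the visible count N' carries no interpretable expectation while f is hidden:

```python
import random

# Toy sketch of a blinding-fraction scheme: analysts tune their analysis on
# a hidden random fraction f of events, so the visible count N' = N x f is
# meaningless to them until the box is opened.
random.seed(7)

full_dataset = list(range(10_000))   # stand-in for all recorded events
f = random.uniform(0.1, 0.3)         # blinding fraction, known to no analyst

# Each event enters the visible subsample independently with probability f.
visible = [e for e in full_dataset if random.random() < f]
n_visible = len(visible)             # N' -- uninterpretable without f

# After the analysis is frozen ("open the box"), the software runs once
# on the full dataset and the final N is published.
n_reported = len(full_dataset)
print(n_visible, n_reported)
```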

References
[1] Daston, Lorraine. "Scientific Error and the Ethos of Belief". Social Research 72 (Spring 2005): 18.
[2] "Male circumcision 'cuts' HIV risk" (http://news.bbc.co.uk/2/hi/health/6176209.stm). BBC News. 2006-12-13. Retrieved 2009-05-18.
[3] McNeil Jr., Donald G. (2006-12-13). "Circumcision Reduces Risk of AIDS, Study Finds" (http://www.nytimes.com/2006/12/13/health/13cnd-hiv.html?pagewanted=print). The New York Times. Retrieved 2009-05-18.
[4] "Skeptical Comment About Double-Blind Trials". The Journal of Alternative and Complementary Medicine. http://www.liebertonline.com/doi/abs/10.1089/acm.2009.0515
[5] Psychological sleuths – Accuracy and the accused (http://www.apa.org/monitor/julaug04/accuracy.html) on apa.org
[6] Under the Microscope – For more than 90 years, forensic science has been a cornerstone of criminal law. Critics and judges now ask whether it can be trusted. (http://www.legalaffairs.org/issues/July-August-2002/review_koerner_julaug2002.msp)
[7] http://skepdic.com/control.html
[8] http://www.pharmaschool.co.uk/

Randomized controlled trial

A randomized controlled trial (RCT) is a type of scientific experiment most commonly used in testing the efficacy or effectiveness of healthcare services (such as medicine or nursing) or health technologies (such as pharmaceuticals, medical devices or surgery). RCTs are also employed in other research areas such as criminology, education, and international development.

[Figure: Flowchart of the four phases (enrollment, intervention allocation, follow-up, and data analysis) of a parallel randomized trial of two groups, modified from the CONSORT (Consolidated Standards of Reporting Trials) 2010 Statement[1]]

RCTs involve the random allocation of different interventions (treatments or conditions) to subjects. The most important advantage of proper randomization is that "it eliminates selection bias, balancing both known and unknown prognostic factors, in the assignment of treatments."[2]

The terms "RCT" and randomized trial are often used synonymously, but some authors distinguish between "RCTs", which compare treatment groups with control groups not receiving treatment (as in a placebo-controlled study), and "randomized trials", which can compare multiple treatment groups with each other.[3] RCTs are sometimes known as randomized control trials.[4] RCTs are also called randomized clinical trials or randomized controlled clinical trials when they concern clinical research.[5] [6] [7]

History

It is claimed that the first published RCT was a 1948 paper entitled "Streptomycin treatment of pulmonary tuberculosis. A Medical Research Council investigation."[8] [9] [10] One of the authors of that paper was Austin Bradford Hill, who is credited as having conceived the modern RCT.[11] By the late 20th century, RCTs had become the "gold standard" for "rational therapeutics" in medicine.[12] As of 2004, more than 150,000 RCTs were in the Cochrane Library.[11] To improve the reporting of RCTs in the medical literature, an international group of scientists and editors published Consolidated Standards of Reporting Trials (CONSORT) Statements in 1996, 2001, and 2010, which have become widely accepted.[1] [2]

Ethics

Although the principle of clinical equipoise ("genuine uncertainty within the expert medical community ... about the preferred treatment") common to clinical trials[13] has been applied to RCTs, the ethics of RCTs have special considerations. For one, it has been argued that equipoise itself is insufficient to justify RCTs.[14] For another, "collective equipoise" can conflict with a lack of personal equipoise (e.g., a personal belief that an intervention is effective).[15] Finally, Zelen's design, which has been used for some RCTs, randomizes subjects before they provide informed consent, which may be ethical for RCTs of screening and selected therapies but is likely unethical "for most therapeutic trials."[16] [17]

Classifications of RCTs

By study design

One way to classify RCTs is by study design. From most to least common in the medical literature, the major categories of RCT study designs are[18]:
• Parallel-group – each participant is randomly assigned to a group, and all the participants in the group receive (or do not receive) an intervention
• Crossover – over time, each participant receives (or does not receive) an intervention in a random sequence
• Split-body – separate parts of the body of each participant (e.g., the left and right sides of the face) are randomized to receive (or not receive) an intervention
• Cluster – pre-existing groups of participants (e.g., villages, schools) are randomly selected to receive (or not receive) an intervention
• Factorial – each participant is randomly assigned to a group that receives a particular combination of interventions or non-interventions (e.g., group 1 receives vitamin X and vitamin Y, group 2 receives vitamin X and placebo Y, group 3 receives placebo X and vitamin Y, and group 4 receives placebo X and placebo Y)

An analysis of the 616 RCTs indexed in PubMed during December 2006 found that 78% were parallel-group trials, 16% were crossover, 2% were cluster, 2% were factorial, and 2% were split-body.[18]
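The factorial design with vitamin X and vitamin Y described above can be sketched as follows; participant IDs and group sizes are hypothetical:

```python
import itertools
import random

# Hypothetical 2x2 factorial assignment: every participant is randomized
# to one of the four X-by-Y combinations.
random.seed(0)

arms = list(itertools.product(["vitamin X", "placebo X"],
                              ["vitamin Y", "placebo Y"]))  # 4 combinations
participants = [f"P{i:02d}" for i in range(8)]
assignment = {p: random.choice(arms) for p in participants}
print(len(arms), assignment["P00"])
```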

By outcome of interest (efficacy vs. effectiveness)

RCTs can be classified as "explanatory" or "pragmatic."[19] Explanatory RCTs test efficacy in a research setting with highly selected participants and under highly controlled conditions.[19] In contrast, pragmatic RCTs test effectiveness in everyday practice with relatively unselected participants and under flexible conditions; in this way, pragmatic RCTs can "inform decisions about practice."[19]

By hypothesis (superiority vs. noninferiority vs. equivalence)

Another classification of RCTs categorizes them as "superiority trials," "noninferiority trials," and "equivalence trials," which differ in methodology and reporting.[20] Most RCTs are superiority trials, in which one intervention is hypothesized to be superior to another in a statistically significant way.[20] Some RCTs are noninferiority trials "to determine whether a new treatment is no worse than a reference treatment."[20] Other RCTs are equivalence trials, in which the hypothesis is that two interventions are indistinguishable from each other.[20]

Randomization

The advantages of proper randomization in RCTs include[21]:
• "It eliminates bias in treatment assignment," specifically selection bias and confounding.
• "It facilitates blinding (masking) of the identity of treatments from investigators, participants, and assessors."
• "It permits the use of probability theory to express the likelihood that any difference in outcome between treatment groups merely indicates chance."

There are two processes involved in randomizing patients to different interventions. First is choosing a randomization procedure to generate an unpredictable sequence of allocations; this may be a simple random assignment of patients to any of the groups at equal probabilities, may be "restricted," or may be "adaptive." A second and more practical issue is allocation concealment, which refers to the stringent precautions taken to ensure that the group assignment of patients is not revealed prior to definitively allocating them to their respective groups. Non-random "systematic" methods of group assignment, such as alternating subjects between one group and the other, can cause "limitless contamination possibilities" and can cause a breach of allocation concealment.

Randomization procedures

An ideal randomization procedure would achieve the following goals[22]:
• Equal group sizes for adequate statistical power, especially in subgroup analyses.
• Low selection bias. That is, the procedure should not allow an investigator to predict the next subject's group assignment by examining which group has been assigned the fewest subjects up to that point.
• Low probability of confounding (i.e., a low probability of "accidental bias"[21] [23]), which implies a balance in covariates across groups. If the randomization procedure causes an imbalance in covariates related to the outcome across groups, estimates of effect may be biased if not adjusted for the covariates (which may be unmeasured and therefore impossible to adjust for).

However, no single randomization procedure meets those goals in every circumstance, so researchers must select a procedure for a given study based on its advantages and disadvantages.

Simple randomization

This is a commonly used and intuitive procedure, similar to "repeated fair coin-tossing."[21] Also known as "complete" or "unrestricted" randomization, it is robust against both selection and accidental biases. However, its main drawback is the possibility of imbalanced group sizes in small RCTs; it is therefore recommended only for RCTs with over 200 subjects.[24]

Restricted randomization

To balance group sizes in smaller RCTs, some form of "restricted" randomization is recommended.[24] The major types of restricted randomization used in RCTs are:
• Permuted-block randomization or blocked randomization: a "block size" and "allocation ratio" (number of subjects in one group versus the other group) are specified, and subjects are allocated randomly within each block.[21] For example, a block size of 6 and an allocation ratio of 2:1 would lead to random assignment of 4 subjects to one group and 2 to the other.[21] This type of randomization can be combined with "stratified randomization," for example by center in a multicenter trial, to "ensure good balance of participant characteristics in each group."[2] A special case of permuted-block randomization is random allocation, in which the entire sample is treated as one block.[21] The major disadvantage of permuted-block randomization is that even if the block sizes are large and randomly varied, the procedure can lead to selection bias.[22] Another disadvantage is that "proper" analysis of data from permuted-block-randomized RCTs requires stratification by blocks.[24]
• Adaptive biased-coin randomization methods (of which urn randomization is the most widely known type): In these relatively uncommon methods, the probability of being assigned to a group decreases if the group is over-represented and increases if the group is under-represented.[24] The methods are thought to be less affected by selection bias than permuted-block randomization.[21]

Adaptive

At least two types of "adaptive" randomization procedures have been used in RCTs, but much less frequently than simple or restricted randomization:
• Covariate-adaptive randomization, of which one type is minimization: the probability of being assigned to a group varies in order to minimize "covariate imbalance."[24] Minimization is reported to have "supporters and detractors";[21] because only the first subject's group assignment is truly chosen at random, the method does not necessarily eliminate bias on unknown factors.[2]
• Response-adaptive randomization, also known as outcome-adaptive randomization: the probability of being assigned to a group increases if the responses of the prior patients in the group were favorable.[24] Although arguments have been made that this approach is more ethical than other types of randomization when the probability that a treatment is effective or ineffective increases during the course of an RCT, ethicists have not yet studied the approach in detail.[25]

Allocation concealment

"Allocation concealment" (defined as "the procedure for protecting the randomisation process so that the treatment to be allocated is not known before the patient is entered into the study") is considered desirable in RCTs.[26] In practice, clinical investigators in RCTs often find it difficult to maintain impartiality in taking care of individual patients. Stories abound of investigators holding up sealed envelopes to lights or ransacking offices to determine group assignments in order to dictate the assignment of their next patient.[21] Such practices introduce selection bias and confounders (both of which should have been minimized by randomization), thereby possibly distorting the results of the study.[21] Some standard methods of ensuring allocation concealment include sequentially numbered, opaque, sealed envelopes (SNOSE); sequentially numbered containers; pharmacy-controlled randomization; and central randomization.[21]
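Permuted-block randomization with the block size of 6 and 2:1 allocation ratio used in the example above can be sketched in a few lines; the group labels "A" and "B" are hypothetical:

```python
import random

def permuted_block_sequence(n_subjects,
                            block=("A", "A", "A", "A", "B", "B"),
                            rng=None):
    """Allocation sequence from shuffled copies of one block.

    The tuple `block` encodes the block size (6) and the 2:1 allocation
    ratio: 4 subjects to group A and 2 to group B within every block.
    """
    rng = rng or random.Random()
    sequence = []
    while len(sequence) < n_subjects:
        b = list(block)
        rng.shuffle(b)        # random order within each block
        sequence.extend(b)
    return sequence[:n_subjects]

seq = permuted_block_sequence(24, rng=random.Random(1))
print(seq.count("A"), seq.count("B"))  # 2:1 ratio holds in every block
```

Because each block contains exactly four A's and two B's, the allocation ratio is guaranteed after every sixth subject, which is precisely what makes the tail of a block partially predictable to an unblinded investigator.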

It is recommended that allocation concealment methods be included in an RCT's protocol, and that the allocation concealment methods should be reported in detail in a publication of an RCT's results; however, a 2005 study determined that most RCTs have unclear allocation concealment in their protocols, in their publications, or both.[27] On the other hand, a 2008 study of 146 meta-analyses concluded that the results of RCTs with inadequate or unclear allocation concealment tended to be biased toward beneficial effects only if the RCTs' outcomes were subjective as opposed to objective.[28]

Blinding
An RCT may be blinded (also called "masked") by "procedures that prevent study participants, caregivers, or outcome assessors from knowing which intervention was received."[2] Unlike allocation concealment, blinding is sometimes inappropriate or impossible to perform in an RCT; for example, if an RCT involves a treatment in which active participation of the patient is necessary (e.g., physical therapy), participants cannot be blinded to the intervention.
Traditionally, blinded RCTs have been classified as "single-blind," "double-blind," or "triple-blind"; however, in 2001 and 2006 two studies showed that these terms have different meanings for different people.[29] [30] The 2010 CONSORT Statement specifies that authors and editors should not use the terms "single-blind," "double-blind," and "triple-blind"; instead, reports of blinded RCTs should discuss "If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how."
RCTs without blinding are referred to as "unblinded",[31] "open",[32] or (if the intervention is a medication) "open-label".[33] In 2008 a study concluded that the results of unblinded RCTs tended to be biased toward beneficial effects only if the RCTs' outcomes were subjective as opposed to objective;[28] for example, in an RCT of treatments for multiple sclerosis, unblinded neurologists (but not blinded neurologists) felt that the treatments were beneficial.[34] In pragmatic RCTs, although the participants and providers are often unblinded, it is "still desirable and often possible to blind the assessor or obtain an objective source of data for evaluation of outcomes."[19]

Analysis of data from RCTs
The types of statistical methods used in RCTs depend on the characteristics of the data and include:
• For dichotomous (binary) outcome data, logistic regression (e.g., to predict sustained virological response after receipt of peginterferon alfa-2a for hepatitis C[35] ) and other methods can be used.
• For continuous outcome data, analysis of covariance (e.g., for changes in blood lipid levels after receipt of atorvastatin after acute coronary syndrome[36] ) tests the effects of predictor variables.
• For time-to-event outcome data that may be censored, survival analysis (e.g., Kaplan–Meier estimators and Cox proportional hazards models for time to coronary heart disease after receipt of hormone replacement therapy in menopause[37] ) is appropriate.
Regardless of the statistical methods used, important considerations in the analysis of RCT data include:
• Whether a RCT should be stopped early due to interim results; for example, RCTs may be stopped early if an intervention produces "larger than expected benefit or harm," or if "investigators find evidence of no important difference between experimental and control interventions."[2]
• The extent to which the groups can be analyzed exactly as they existed upon randomization (i.e., whether a so-called "intention-to-treat analysis" is used). A "pure" intention-to-treat analysis is "possible only when complete outcome data are available" for all randomized subjects;[38] when some outcome data are missing, options include analyzing only cases with known outcomes and using imputed data.[2] Nevertheless, the more that analyses can include all participants in the groups to which they were randomized, the less bias that an RCT will be subject to.[2]
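The Kaplan–Meier estimator mentioned above for censored time-to-event data can be sketched in plain Python. This is a toy illustration of the estimator itself (the data are invented); a real trial analysis would use a validated statistics package:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  follow-up time for each subject
    events: 1 if the event occurred at that time, 0 if the subject was
            censored (e.g. lost to follow-up) at that time
    Returns (time, estimated survival probability) pairs at event times.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        at_risk = sum(1 for ti in times if ti >= t)  # still under observation
        if deaths:
            survival *= 1 - deaths / at_risk  # conditional survival past t
            curve.append((t, survival))
    return curve

# Five invented subjects: events at t=1, 2, 3; censoring at t=2 and t=4.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
# curve -> [(1, 0.8), (2, 0.6), (3, 0.3)]
```

The key point the estimator captures is that censored subjects still count in the risk set up to their censoring time, so censoring reduces the denominator at later event times without itself dropping the survival curve.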

"CONSORT extensions" have been published. a preliminary report of a RCT concluded that the two drugs increased mortality.Randomized controlled trial • Whether subgroup analysis should be performed."[39] The CONSORT 2010 checklist contains 25 items (many with sub-items) focusing on "individually randomised.[50] . based on observational studies. the antiarrhythmic agents flecainide and encainide came to market in 1986 and 1987 respectively.[41] • The GRADE Working Group concluded in 2008 that "randomised trials without important limitations constitute high quality evidence. and in the populations studied.[41] It has recognized "evidence obtained from at least one properly randomized controlled trial" with good internal validity (i.[44] The non-randomized studies concerning the drugs were characterized as "glowing"[45] .[46] Sales of the drugs then decreased.000 prescriptions per month in early 1989[44] .. Some examples of scientific organizations' considering RCTs or systematic reviews of RCTs to be the highest-quality evidence available are: • As of 1998."[42] • For issues involving "Therapy/Prevention. published RCTs from the Women's Health Initiative claimed that women taking hormone replacement therapy with estrogen plus progestin had a higher rate of myocardial infarctions than women on a placebo.[2] 14 Reporting of RCT results The CONSORT 2010 Statement is "an evidence-based.[44] • Prior to 2002. the National Health and Medical Research Council of Australia designated "Level I" evidence as that "obtained from a systematic review of all relevant randomised controlled trials" and "Level II" evidence as that "obtained from at least one properly designed randomised controlled trial. two group. in the hormone regimens used. 
in making clinical practice guideline recommendations the United States Preventive Services Task Force has considered both a study's design and its internal validity as indicators of its quality.[45] In 2002 and 2004.[37] [47] Possible explanations for the discrepancy between the observational studies and the RCTs involved differences in methodology." the Oxford Centre for Evidence-based Medicine]] as of 2009 defined "Level 1a" evidence as a systematic review of RCTs that are consistent with each other. and "Level 1b" evidence as an "individual RCT (with narrow Confidence Interval). and their sales increased to a combined total of approximately 165.[48] [49] The use of hormone replacement therapy decreased after publication of the RCTs. In that year."[40] • Since at least 2001.[1] Advantages RCTs are considered by most to be the most reliable form of scientific evidence in the hierarchy of evidence that influences healthcare policy and practice because RCTs reduce spurious causality and bias."[43] Notable RCTs with unexpected results that contributed to changes in clinical practice include: • After Food and Drug Administration approval. a rating of "I-good") as the highest quality evidence available to it. parallel trials" which are the most common type of RCT. however. however.e. Results of RCTs may be combined in systematic reviews which are increasingly being used in the conduct of evidence-based medicine. Aetiology/Harm. These are "often discouraged" because multiple comparisons may produce false positive findings that cannot be confirmed by other studies. minimum set of recommendations for reporting RCTs. and that estrogen-only hormone replacement therapy caused no reduction in the incidence of coronary heart disease. it was routine for physicians to prescribe hormone replacement therapy for post-menopausal women to prevent myocardial infarction.[1] For other RCT study designs.

Disadvantages
Many papers discuss the disadvantages of RCTs.[51] [52] Among the most frequently-cited drawbacks are:

Limitations of external validity
The extent to which RCTs' results are applicable outside the RCTs varies; that is, RCTs' external validity may be limited.[51] [53] Factors that can affect RCTs' external validity include:[53]
• Where the RCT was performed (e.g., what works in one country may not work in another)
• Characteristics of the patients (e.g., an RCT may include patients whose prognosis is better than average, or may exclude "women, children, the elderly, and those with common medical conditions"[54] )
• Study procedures (e.g., in an RCT patients may receive intensive diagnostic procedures and follow-up care difficult to achieve in the "real world")
• Outcome measures (e.g., RCTs may use composite measures infrequently used in clinical practice)
• Incomplete reporting of adverse effects of interventions

Costs
RCTs can be expensive:[52] one study found 28 Phase III RCTs funded by the National Institute of Neurological Disorders and Stroke prior to 2000 with a total cost of US$335 million,[55] for a mean cost of US$12 million per RCT. Nevertheless, the return on investment of RCTs may be high, in that the same study projected that the 28 RCTs produced a "net benefit to society at 10-years" of 46 times the cost of the trials program, based on evaluating a quality-adjusted life year as equal to the prevailing mean per capita gross domestic product.[55]

Relative importance of RCTs and observational studies
Two studies published in The New England Journal of Medicine in 2000 found that observational studies and RCTs overall produced similar results.[56] [57] The authors of the 2000 findings cast doubt on the ideas that "observational studies should not be used for defining evidence-based medical care" and that RCTs' results are "evidence of the highest grade."[56] [57] However, a 2001 study published in Journal of the American Medical Association concluded that "discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common" between observational studies and RCTs.[58]
Two other lines of reasoning question RCTs' contribution to scientific knowledge beyond other types of studies:
• If study designs are ranked by their potential for new discoveries, then anecdotal evidence would be at the top of the list, followed by observational studies, followed by RCTs.[59]
• RCTs may be unnecessary for treatments that have dramatic and rapid effects relative to the expected stable or progressively worse natural course of the condition treated.[51] [60] One example is combination chemotherapy including cisplatin for metastatic testicular cancer, which increased the cure rate from 5% to 60% in a 1977 non-randomized study.[60] [61]
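The discrepancies between observational studies and RCTs discussed above often trace back to confounding. A toy simulation (every quantity here, including the hidden "health" covariate, is invented for illustration) shows how a confounder can bias an observational contrast while randomization recovers the true effect:

```python
import random
import statistics

def simulate(n=20000, seed=2):
    """Compare a confounded observational contrast with a randomized one.

    A hidden 'health' covariate raises both the chance of choosing the
    treatment and the outcome, so the naive observational difference in
    means overstates the true treatment effect of +1.0. Random assignment
    makes treatment independent of health, removing that bias.
    """
    rng = random.Random(seed)
    true_effect = 1.0

    def outcome(treated, health):
        return true_effect * treated + 2.0 * health + rng.gauss(0, 1)

    obs_t, obs_c, rct_t, rct_c = [], [], [], []
    for _ in range(n):
        # Observational cohort: healthier people self-select into treatment.
        health = rng.random()
        treated = rng.random() < health
        (obs_t if treated else obs_c).append(outcome(treated, health))

        # Randomized cohort: a fair coin flip, independent of health.
        health = rng.random()
        treated = rng.random() < 0.5
        (rct_t if treated else rct_c).append(outcome(treated, health))

    obs_diff = statistics.mean(obs_t) - statistics.mean(obs_c)
    rct_diff = statistics.mean(rct_t) - statistics.mean(rct_c)
    return obs_diff, rct_diff

obs_diff, rct_diff = simulate()
# obs_diff is biased upward by the confounder; rct_diff sits near 1.0.
```

In this construction the observational contrast converges to about 1 + 2·(2/3 − 1/3) ≈ 1.67 rather than 1.0, which is the kind of "discrepancy beyond chance" the 2001 JAMA comparison describes.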

"prevention". "corrections".g. education. that is. even when treatments are unlikely to be successful. A systematic review published in 2003 found four 1986-2002 articles comparing industry-sponsored and nonindustry-sponsored RCTs.[63] Therapeutic misconception Although subjects almost always provide informed consent for their participation in an RCT.Randomized controlled trial 16 Difficulty in studying rare events Interventions to prevent events that occur only infrequently (e. despite the publication of a 1978 paper noting that the sample sizes of many "negative" RCTs were too small to make definitive conclusions about the negative results[67] .[51] Difficulty in studying outcomes in distant future It is costly to maintain RCTs for the years or decades that would be ideal for evaluating some interventions. sudden infant death syndrome) and uncommon adverse outcomes (e.g. or other sources.[51] [52] Pro-industry findings in industry-funded RCTs Some RCTs are fully or partly funded by the health care industry (e... they do not understand the difference between research and treatment. "court". 1 in 20) as the probability that the RCT will falsely find two equally effective treatments significantly different. a typical RCT will use 0. and international development Criminology A 2005 review found 83 randomized experiments in criminology published in 1982-2004.[71] .05 (i. compared with only 35 published in 1957-1981. [69] For example. and "community". Hollin (2008) argued that RCTs may be difficult to implement (e. a rare side effect of a drug) would require RCTs with extremely large sample sizes and may therefore best be assessed by observational studies.e.[62] A 2004 study of 1999-2001 RCTs published in leading medical and surgical journals determined that industry-funded RCTs "are more likely to be associated with statistically significant pro-industry findings. Regarding Type I errors. RCTs in criminology. 
studies since 1982 have documented that many RCT subjects believe that they are certain to receive treatment that is best for them personally.[70] Focusing only on offending behavior programs. nonprofit.. if an RCT required "passing sentences that would randomly assign offenders to programmes") and therefore that experiments with quasi-experimental design are still necessary. by 2005-2006 a sizeable proportion of RCTs still had inaccurate or incompletely-reported sample size calculations[68] .[70] The authors classified the studies they found into five categories: "policing". the pharmaceutical industry) as opposed to government."[63] One possible reason for the pro-industry results in industry-funded published RCTs is publication bias. patients with terminal illness may attempt to join trials as a last ditch attempt at treatment.g.[66] Regarding Type II errors.g.. Cultural effects The RCT method creates cultural effects that have not been well understood."[65] Statistical error RCTs are subject to both type I ("false positive") and type II ("false negative") statistical errors.. and in all the articles there was a correlation of industry sponsorship and positive study outcome.[64] [65] Further research is necessary to determine the prevalence of and ways to address this "therapeutic misconception.
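As a check on the Type I error rate quoted under "Statistical error" above, one can simulate many "null" trials of two equally effective treatments and count how often a conventional two-sided test declares a difference. The sketch below uses a Welch t statistic against the normal 1.96 cutoff, an approximation that is adequate at this sample size; the trial counts and seed are arbitrary choices for the demonstration:

```python
import random
import statistics

def welch_t(x, y):
    """Welch two-sample t statistic (unequal variances allowed)."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / (vx / len(x) + vy / len(y)) ** 0.5

def false_positive_rate(n_trials=2000, n_per_arm=50, cutoff=1.96, seed=1):
    """Fraction of simulated null RCTs (no true treatment difference)
    that a two-sided test at the 5% level wrongly calls significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Both arms drawn from the same distribution: the null is true.
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        hits += abs(welch_t(a, b)) > cutoff
    return hits / n_trials

rate = false_positive_rate()
# By construction, rate comes out near the nominal 0.05 (1 in 20).
```

The same machinery illustrates the multiple-comparisons caution about subgroup analyses: testing k independent subgroups at the 5% level pushes the chance of at least one false positive toward 1 − 0.95^k.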

Education
RCTs have been used in evaluating a number of educational interventions. For example, a 2009 study randomized 260 elementary school teachers' classrooms to receive or not receive a program of behavioral screening, classroom intervention, and parent training, and then measured the behavioral and academic performance of their students.[72] Another 2009 study randomized classrooms for 678 first-grade children to receive a classroom-centered intervention, a parent-centered intervention, or no intervention, and then followed their academic outcomes through age 19.[73]

International development
RCTs are currently being used by a number of international development experts to measure the impact of development interventions worldwide. Development economists at research organizations including Abdul Latif Jameel Poverty Action Lab[74] [75] and Innovations for Poverty Action[76] have used RCTs to measure the effectiveness of poverty, health, and education programs in the developing world. For development economists, the main benefit to using RCTs compared to other research methods is that randomization guards against selection bias, a problem present in many current studies of development policy. RCTs can be highly effective in policy evaluation since they allow researchers to isolate the impacts of a specific program from other factors such as other programs offered in the region, short-term events (such as a favorable harvest), general macroeconomic growth, and differences in personal qualities that might make one individual more successful than another. In one notable example of a cluster RCT in the field of development economics, Olken (2007) randomized 608 villages in Indonesia in which roads were about to be built into six groups (no audit vs. audit, and no invitations to accountability meetings vs. invitations to accountability meetings vs. invitations to accountability meetings along with anonymous comment forms).[77] After estimating "missing expenditures" (a measure of corruption), Olken concluded that government audits were more effective than "increasing grassroots participation in monitoring" in reducing corruption.[77]

See also
• Drug development
• Hypothesis testing
• Impact evaluation
• Jadad scale
• Statistical inference

Further reading
• Domanski MJ, McKinlay S. Successful randomized trials: a handbook for the 21st century. Philadelphia: Lippincott Williams & Wilkins; 2009. ISBN 9780781779456.
• Jadad AR, Enkin M. Randomized controlled trials: questions, answers, and musings. 2nd ed. Malden, Mass.: Blackwell; 2007. ISBN 9781405132664.
• Matthews JNS. Introduction to randomized controlled clinical trials. 2nd ed. Boca Raton, Fla.: CRC Press; 2006. ISBN 1584886242.
• Nezu AM, Nezu CM. Evidence-based outcome research: a practical guide to conducting randomized controlled trials for psychosocial interventions. New York: Oxford University Press; 2008. ISBN 9780195304633.
• Solomon PL, Cavanaugh MM, Draine J. Randomized controlled trials: design and implementation for community-based psychosocial interventions. New York: Oxford University Press; 2009. ISBN 9780195333190.
• Torgerson DJ, Torgerson C. Designing randomised trials in health, education and the social sciences: an introduction. Basingstoke, England, and New York: Palgrave Macmillan; 2008. ISBN 9780230537354.

gov/ articlerender.1136/bmj. [16] Zelen M (1979). Yu LM. Mantel N. Drexler H (2004). PMID 17060757. Pike MC. Smith PG (1976). doi:10.c723. McPherson K. Howard SV. [3] Ranjith G (2005). N Engl J Med 317 (3): 141–5. Meyer GP. "CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials" (http:/ / www. From oranges and lemons to the gold standard". Lancet 364 (9429): 141–8.bc. Br Med J 316 (7131): 606. bmj.0000225356. "Landmark study made research resistant to bias". "The quality of reports of randomised trials in 2000 and 2006: comparative study of articles indexed in PubMed" (http:/ / www. Br J Cancer 35 (1): 1–39. [9] Brown D (1998-11-02). A Medical Research Council investigation" (http:/ / www. doi:10. Yamagishi H (2006). [82] M. N Engl J Med 300 (22): 1242–5. Messinger D. PMC 2025229/. pubmedcentral. Washington Post. Taji Y. [4] Chalmers TC.x. . PMID 9518917. [5] Peto R. fcgi?tool=pmcentrez& artid=1856609). Chalmers I. ajronline. pubmedcentral.1159/000087787. ISBN 9781905177356. Blackburn B. bmj.c332. Lilford RJ. "What is Zelen's design?" (http:/ / www. Peto J. Roland M (1998). Altman DG (2010). Ann Surg 244 (5): 668–76.sla.tb00306. DC: U. "The ethics of randomised controlled trials from the perspectives of patients. Cook JD.4582. fcgi?tool=pmcentrez& artid=2844943).1136/bmj. Ambroz A (1981). com/ cgi/ content/ full/ 316/ 7131/ 606). II. [14] Gifford F (1995). fcgi?tool=pmcentrez& artid=2844940). fcgi?tool=pmcentrez& artid=2025229/ ). Mantel N. [79] London: Pinter & Martin. doi:10. gov/ articlerender. • Gelband H. PMID 18890300. Hertenstein B. "A new design for randomized clinical trials" (http:/ / content. . PMID 795448. nih. "CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials" (http:/ / www.1111/j. [2] Moher D. "Randomized controlled trials" (http:/ / www. Introduction and design" (http:/ / www. [7] Wollert KC. PMID 10949771. PMID 3600702. Arseniev L.1136/bmj. Lotz J. 
gov/ articlerender. doi:10. [17] Torgerson DJ. the public. (Report OTA-BP-H-22.769. . Am J Roentgenol 183 (6): 1539–44. Howard SV. doi:10. Fichtner S. Noguchi Y. Egger M. Analysis and examples" (http:/ / www. [11] Stolberg HO. doi:10. Gøtzsche PC. "A method for assessing the quality of a randomized control trial". doi:10. Hematol Oncol Clin North Am 14 (4): 745–60. Cox DR. com/ cgi/ content/ full/ 340/ mar23_1/ c723). nih. . Power and bias in adaptively randomized clinical trials. [18] Hopewell S. pubmedcentral. Dutton S. "Community-equipoise and the ethics of randomized clinical trials". nih. [80] Washington. "Interferon-α-induced depression: when a randomized trial is not a randomized controlled trial". [78] University of York. PMC 2091872. PMC 2844940.1097/01. [10] Shikata S. Pike MC. bmj. pubmedcentral. Directory of randomisation software and services. 2006 July 12. PMID 20332511. Hopewell S. PMC 2844941. pubmedcentral.1016/0197-2456(81)90056-8. [13] Freedman B (1987). Psychother Psychosom 74 (6): 387. Breslow NE. BMJ 340: c723. The impact of randomized clinical trials on health policy and medical practice: background paper. gov/ articlerender. 2008 March 19. Ganser A. PMID 20332510. "Design and analysis of randomized clinical trials requiring prolonged observation of each patient. PMID 831755. Testing treatments: better research for better health care. . org/ cgi/ content/ abstract/ 300/ 22/ 1242). Br Med J 2 (4582): 769–82. doi:10.1467-8519. nejm. I. Schulz KF.2. Trop I (2004). Hewison J (1998). Br J Cancer 34 (6): 585–612. Nakayama T. Armitage P. [6] Peto R. Cox DR. Anderson Cancer Center. . PMID 15547188. org/ cgi/ content/ full/ 183/ 6/ 1539). PMID 11653056.) • REFLECT (Reporting guidElines For randomized controLled trials for livEstoCk and food safeTy) Statement [81] • Wathen JK. fcgi?tool=pmcentrez& artid=2025310). "Streptomycin treatment of pulmonary tuberculosis. Br Med J 340: c332. nih. Breslow NE. Br Med J 340: c869.1136/bmj. Lippolt P. Silverman B. 
PMID 20332509. Smith H Jr. Congress. Devereaux PJ. pubmedcentral. com/ cgi/ content/ full/ 317/ 7167/ 1209). D.Randomized controlled trial 18 External links • Bland M. doi:10. Breidenbach C. Armitage P. PMID 16244516. PMID 9794861. Altman DG. Br Med J 317 (7167): 1209–12. gov/ articlerender. Norman G. for the CONSORT Group (2010). Bioethics 9 (2): 127–48.04304. McPherson K. fcgi?tool=pmcentrez& artid=2091872). Montori V. PMC 2844943. [15] Edwards SJL. gov/ articlerender. • Evans I. Ringes-Lichtenberg S. 1983. Office of Technology Assessment. University of Texas. 2010. Moher D. Control Clin Trials 2 (1): 31–49. [12] Meldrum ML (2000). [8] Streptomycin in Tuberculosis Trials Committee (1948). PMC 1856609. References [1] Schulz KF. "Design and analysis of randomized clinical trials requiring prolonged observation of each patient.c869. Elbourne D. Peto J. doi:10. "Equipoise and the ethics of clinical research". PMC 2025310.1016/S0140-6736(04)16626-9. PMID 15246726. "Intracoronary autologous bone-marrow cell transfer after myocardial infarction: the BOOST randomised controlled clinical trial". vii. and healthcare professionals" (http:/ / www.S. "Comparison of effects in randomized controlled trials with observational studies in digestive surgery" (http:/ / www. Korte T. Altman DG (2010). PMC 1112637. nih. Thornton H. PMID 7261638. Reitman D. nih. PMC 1114158.1995. PMID 431682. Chan AW.1016/S0889-8588(05)70309-9. Smith PG (1977). "A brief history of the randomized controlled trial. Schroeder B. Hornig B.

"Welcome to the CONSORT statement Website" (http:/ / www. Gøtzsche PC (2005).295. [27] Pildal J. Chadwick DW.321. Howard BV. PMID 15651970. . doi:10. Lancet 359 (9305): 515–9. . O'Regan M. placebo-controlled multiple sclerosis clinical trial" (http:/ / www. Ockene J. Albrecht JK (2001). . pdf). Haynes B. BMJ 310 (6991): 1360–2. Beresford SA. int/ entity/ rhl/ LANCET_515-519. . PMID 17060210.1016/0197-2456(93)90028-C. Med J Aust 182 (2): 87–9. Clancy L. PMID 11583749. Gordon SC. PMID 8119063. "Who is blinded in randomized clinical trials? A study of 200 trials and a survey of authors" (http:/ / ctj. J Am Med Assoc 285 (15): 2000–3. "The SANAD study of effectiveness of valproate. [28] Wood L. [29] Devereaux PJ. bmj. Baker GA. Bhandari M. "Physician interpretations and textbook definitions of blinding terminology in randomized controlled trials" (http:/ / jama. Ling M. doi:10.38414. doi:10. com/ cgi/ content/ abstract/ 3/ 4/ 360). Control Clin Trials 14 (6): 471–84. .1136/bmj. Sterne JA (2008). com/ cgi/ content/ full/ 336/ 7644/ 601). nhmrc. Clin Trials 3 (4): 360–5. ama-assn. Lachin JM (1993).1001/jama.a2390. . Grimes DA (2002). "Generation of allocation sequences in randomised trials: chance.1177/1740774506069153. PMID 16522836. PMID 11853818. PMID 3203526. bmj. Chan AW. [39] CONSORT Group. Lancet 358 (9286): 958–65. CONSORT Group (2006). "Comparison of descriptions of allocation concealment in trial protocols and the published reports: cohort study" (http:/ / www. Egger M. "What is meant by intention to treat analysis? Survey of published randomised controlled trials" (http:/ / www. Gluud LL. Control Clin Trials 9 (4): 365–74. Treweek S. Lancet 369 (9566): 1016–26. "The use of response-adaptive designs in clinical trials". Kooperberg C. org/ cgi/ content/ full/ 295/ 10/ 1152). Pragmatic Trials in Healthcare (Practihc) group (2008). who.8F. Altman DG. Pocock SJ. Kotchen JM.285. 
"Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement" (http:/ / jama. Koury K.1136/bmj. PMID 3060315. Alwaidh M. . doi:10. Terao S. BMJ 330 (7499): 1049. or topiramate for generalised and unclassifiable epilepsy: an unblinded randomised controlled trial" (http:/ / www. A guide to the development.1152. Schulz KF. J Am Med Assoc 288 (3): 321–33.1016/S0140-6736(02)07683-3. p. Lancet 372 (9636): 392–7. Japan Gast Study Group (2008). PMC 2549744.1001/jama. Canberra: Commonwealth of Australia. pdf). PMID 12117397. PMID 19001484.1016/0197-2456(88)90049-9. et al (2007). ama-assn.1016/0197-2456(88)90045-1. "Improving the reporting of pragmatic trials: an extension of the CONSORT statement" (http:/ / www. doi:10.13. "The impact of blinding on the results of a randomized. JAMA 295 (10): 1152–60. Roberts R (1994). org/ cgi/ content/ abstract/ 44/ 1/ 16). Amagai K. J Am Med Assoc 285 (13): 1711–8. Appleton R. PMID 17382828. [33] Fukase K. BMJ 336 (7644): 601–5. mja. au/ public/ issues/ 182_02_170105/ for10877_fm.288. Asaka M. [35] Manns MP. ama-assn. Shiffman M. CONSORT group. doi:10. ama-assn. .1016/S0140-6736(01)06102-5. Hróbjartsson A (2006). Oliver MF. . "Analysis of clinical trial outcomes: some comments on subgroup analyses". [31] Marson AG. [22] Lachin JM (1988).422650. org/ cgi/ content/ full/ 285/ 15/ 2000). Rustgi VK. Waters D. PMID 7787537. Hayashi S. gov. Leslie S. randomised controlled trial". PMID 10480822. "Peginterferon alfa-2b plus ribavirin compared with interferon alfa-2b plus ribavirin for initial treatment of chronic hepatitis C: a randomised trial". neurology.Randomized controlled trial [19] Zwarenstein M. PMID 15817527. Gagnier JJ. [24] Lachin JM. not choice" (http:/ / www. Reindollar R. [30] Haahr MT. Forfang E. Altman DG. Ezekowitz MD. Retrieved 2010-03-29. PMID 18316340. Olsson AG. nih. com/ cgi/ content/ full/ 319/ 7211/ 670). 
"Effects of atorvastatin on early recurrent ischemic events in acute coronary syndromes: the MIRACL study: a randomized controlled trial" (http:/ / jama. Yetisir E. Moher D. . Okamoto S. [25] Rosenberger WF. PMC 2267990. "Effect of eradication of Helicobacter pylori on incidence of metachronous gastric carcinoma after endoscopic resection of early gastric cancer: an open-label. "Oral versus intravenous antibiotics for community acquired lower respiratory tract infection in a general hospital: open. Montori VM. Vandervoort MK. [20] Piaggio G. Hemeryck L. Altman DG. McHutchison JG.1711.1016/0197-2456(89)90057-3. 56. randomised controlled trial" (http:/ / www. Control Clin Trials 10 (4 Suppl): 187S–194S. Control Clin Trials 9 (4): 289–311. com/ cgi/ content/ full/ 310/ 6991/ 1360). "Randomization in clinical trials: conclusions and recommendations". consort-statement. pubmedcentral. Retrieved 2010-03-28. doi:10. Anderson GL. Kikuchi S. Chaitman BR. [40] National Health and Medical Research Council (1998-11-16). Farquhar RE. Jackson RD. au/ _files_nhmrc/ file/ publications/ synopses/ cp30.10. Ghali WA. Uemura N. [32] Chan R. [21] Schulz KF. Inoue K. Wei LJ (1988). Oxman AD. com/ cgi/ content/ full/ 337/ nov11_2/ a2390).1136/bmj. PMC 2039891. bmj. Manns BJ. Evans SJ. Neurology 44 (1): 16–20. bmj. com/ cgi/ content/ full/ 330/ 7499/ 1049).1001/jama. doi:10. 19 . Johnson KC. PMID 8290055. Stefanick ML. Goodman ZD. Keech AC (2005). Elbourne DR. "Allocation concealment and blinding: when ignorance is bliss" (http:/ / www. doi:10. doi:10.1001/jama. PMID 11277825. [34] Noseworthy JH. Jüni P. . implementation and evaluation of clinical practice guidelines (http:/ / www. org). . Guyatt GH (2001). com. html). doi:10. . Writing Group for the Women's Health Initiative Investigators (2002). [23] Buyse ME (1989). Lacchetti C. Feely J (1995). LaCroix AZ. Matts JP. [38] Hollis S.285. Gluud C. doi:10. Quan H. Tunis S. doi:10. . Zeiher A.1016/S0140-6736(07)60461-9. Al-Kharusi AM. 
doi:10. Ebers GC. Gebski VJ. Br Med J 319 (7211): 670–4. org/ cgi/ content/ full/ 285/ 13/ 1711). "Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial" (http:/ / jama. doi:10. Kato M.451748. [37] Rossouw JE. Wood AJ.15. PMID 18675689. fcgi?tool=pmcentrez& artid=2039891). PMC 557221. org/ cgi/ content/ full/ 288/ 3/ 321). [26] Forder PM.3. BMJ 337: a2390. bmj. PMID 2605967. Hróbjartsson A.1016/S0140-6736(08)61159-9. . ISBN 1864960485.2000. [36] Schwartz GG. lamotrigine. "Statistical properties of randomization in clinical trials". sagepub.39465. Altman DG. Martin RM. PMID 11308438. Campbell F (1999).AD. "Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study" (http:/ / www. Prentice RL. Ganz P. PMC 28218. gov/ articlerender. Myocardial Ischemia Reduction with Aggressive Cholesterol Lowering (MIRACL) Study Investigators (2001). doi:10. Stern T.


Statistical inference

Statistical inference is the process of making conclusions using data that is subject to random variation, for example, observational errors or sampling variation.[1] More substantially, the terms statistical inference, statistical induction and inferential statistics are used to describe systems of procedures that can be used to draw conclusions from datasets arising from systems affected by random variation.[2] Initial requirements of such a system of procedures for inference and induction are that the system should produce reasonable answers when applied to well-defined situations and that it should be general enough to be applied across a range of situations.

The outcome of statistical inference may be an answer to the question "what should be done next?", where this might be a decision about making further experiments or surveys, or about drawing a final conclusion before implementing some organizational or governmental policy.

Introduction

Scope

For the most part, statistical inference makes propositions about populations, using data drawn from the population of interest via some form of random sampling. More generally, data about a random process is obtained from its observed behavior during a finite period of time. Given a parameter or hypothesis about which one wishes to make inference, statistical inference most often uses:
• a statistical model of the random process that is supposed to generate the data, and
• a particular realization of the random process, i.e., a set of data.

The conclusion of a statistical inference is a statistical proposition. Some common forms of statistical proposition are:
• an estimate, i.e., a particular value that best approximates some parameter of interest
• a confidence interval (or set estimate), i.e., an interval constructed from the data in such a way that, under repeated sampling of datasets, such intervals would contain the true parameter value with the probability at the stated confidence level
• a credible interval, i.e., a set of values containing, for example, 95% of posterior belief
• rejection of a hypothesis[3]
• clustering or classification of data points into groups

Comparison to descriptive statistics

Statistical inference is generally distinguished from descriptive statistics. In simple terms, descriptive statistics can be thought of as being just a straightforward presentation of facts, in which modeling decisions made by a data analyst have had minimal influence. A complete statistical analysis will nearly always include both descriptive statistics and statistical inference, and will often progress in a series of steps where the emphasis moves gradually from description to inference.
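The repeated-sampling reading of a confidence interval above lends itself to a short simulation (a sketch with made-up numbers, not part of the original article): draw many samples from a known Normal population and count how often the nominal 95% interval for the mean actually covers the true value.

```python
import random
import statistics

random.seed(0)
TRUE_MEAN = 5.0          # known population mean (hypothetical, for the demo)
n, trials = 30, 2000
covered = 0

for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, 2.0) for _ in range(n)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    # Normal-approximation 95% interval for the mean.
    lo, hi = m - 1.96 * se, m + 1.96 * se
    if lo <= TRUE_MEAN <= hi:
        covered += 1

coverage = covered / trials
print(round(coverage, 2))  # close to the nominal 0.95
```

The individual intervals vary from sample to sample; it is the long-run coverage frequency, not any single interval, that the 95% refers to.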

Models/Assumptions

Any statistical inference requires some assumptions. A statistical model is a set of assumptions concerning the generation of the observed data and similar data. Descriptions of statistical models usually emphasize the role of population quantities of interest, about which we wish to draw inference.[4]

Degree of models/assumptions

Statisticians distinguish between three levels of modeling assumptions:
• Fully parametric: The probability distributions describing the data-generation process are assumed to be fully described by a family of probability distributions involving only a finite number of unknown parameters.[4] For example, one may assume that the distribution of population values is truly Normal, with unknown mean and variance, and that datasets are generated by 'simple' random sampling. The family of generalized linear models is a widely-used and flexible class of parametric models.
• Non-parametric: The assumptions made about the process generating the data are much less than in parametric statistics and may be minimal.[5] For example, every continuous probability distribution has a median, which may be estimated using the sample median or the Hodges-Lehmann-Sen estimator, which has good properties when the data arise from simple random sampling.
• Semi-parametric: This term typically implies assumptions 'between' fully and non-parametric approaches. For example, one may assume that a population distribution has a finite mean. Furthermore, one may assume that the mean response level in the population depends in a truly linear manner on some covariate (a parametric assumption) but not make any parametric assumption describing the variance around that mean (i.e., about the presence or possible form of any heteroscedasticity). More generally, semi-parametric models can often be separated into 'structural' and 'random variation' components: one component is treated parametrically and the other non-parametrically. The well-known Cox model is a set of semi-parametric assumptions.

Importance of valid models/assumptions

Whatever level of assumption is made, correctly-calibrated inference in general requires these assumptions to be correct, i.e., that the data-generating mechanism really has been correctly specified.

Incorrect assumptions of 'simple' random sampling can invalidate statistical inference.[6] More complex semi- and fully-parametric assumptions are also cause for concern. For example, incorrectly assuming the Cox model can in some cases lead to faulty conclusions.[7] Incorrect assumptions of Normality in the population also invalidate some forms of regression-based inference.[8] The use of any parametric model is viewed skeptically by most experts in sampling human populations: "most sampling statisticians, when they deal with confidence intervals at all, limit themselves to statements about [estimators] based on very large samples, where the central limit theorem ensures that these [estimators] will have distributions that are nearly normal."[9] Here, the central limit theorem states that the distribution of the sample mean "for very large samples" is approximately normally distributed, if the distribution is not heavy tailed.

Approximate distributions

Given the difficulty in specifying exact distributions of sample statistics, many methods have been developed for approximating these.

With finite samples, approximation results measure how close a limiting distribution approaches the statistic's sample distribution: for example, with 10,000 independent samples the normal distribution approximates (to two digits of accuracy) the distribution of the sample mean for many population distributions, by the Berry–Esseen theorem.[10] Yet for many practical purposes, the normal approximation provides a good approximation to the sample-mean's distribution when there are 10 (or more) independent samples, according to simulation studies and statisticians' experience.[11] Following Kolmogorov's work in the 1950s, advanced statistics uses approximation theory and functional analysis to quantify the error of approximation.
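The claim that roughly 10 independent observations often suffice for the normal approximation can itself be checked by simulation. The sketch below (hypothetical numbers, not from the article) draws repeated samples of size 10 from a skewed exponential population and checks how often the sample mean lands within 1.96 standard errors of the population mean, which would be about 95% under exact normality.

```python
import random
import statistics

random.seed(3)

# Skewed population: exponential with mean 1 (its SD is also 1).
n = 10          # sample size the text describes as often adequate in practice
reps = 20000
means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

# If the normal approximation is good, about 95% of sample means should lie
# within 1.96 standard errors (SE = 1/sqrt(n)) of the population mean 1.
se = 1 / n ** 0.5
inside = sum(abs(m - 1.0) <= 1.96 * se for m in means) / reps
print(round(inside, 2))
```

The observed fraction is near, though not exactly, 0.95; the residual discrepancy is the kind of approximation error the Berry–Esseen theorem bounds.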

In this approach, the metric geometry of probability distributions is studied; this approach quantifies approximation error with, for example, the Kullback–Leibler distance, Bregman divergence, and the Hellinger distance.[12] [13] [14]

With infinite samples, limiting results like the central limit theorem describe the sample statistic's limiting distribution, if one exists. Limiting results are not statements about finite samples, and indeed are logically irrelevant to finite samples.[15] However, the asymptotic theory of limiting distributions is often invoked for work in estimation and testing. For example, limiting results are often invoked to justify the generalized method of moments and the use of generalized estimating equations, which are popular in e.g. econometrics and biostatistics. The magnitude of the difference between the limiting distribution and the true distribution (formally, the 'error' of the approximation) can be assessed using simulation.[16] The use of limiting results in this way works well in many applications, especially with low-dimensional models with log-concave likelihoods (such as with one-parameter exponential families).

Randomization-based models

For a given dataset that was produced by a randomization design, the randomization distribution of a statistic (under the null-hypothesis) is defined by evaluating the test statistic for all of the plans that could have been generated by the randomization design. In frequentist inference, randomization allows inferences to be based on the randomization distribution rather than a subjective model, and this is important especially in survey sampling and design of experiments.[17] [18] Statistical inference from randomized studies is also more straightforward than many other situations.[19] [20] [21] In Bayesian inference, randomization is also of importance: in survey sampling, sampling without replacement ensures the exchangeability of the sample with the population; in randomized experiments, randomization warrants a missing at random assumption for covariate information.[22]

Objective randomization allows properly inductive procedures.[23] [24] [25] [26] Many statisticians prefer randomization-based analysis of data that was generated by well-defined randomization procedures.[27] (However, it is true that in fields of science with developed theoretical knowledge and experimental control, randomized experiments may increase the costs of experimentation without improving the quality of inferences.[28] [29]) Similarly, results from randomized experiments are recommended by leading statistical authorities as allowing inferences with greater reliability than do observational studies of the same phenomena.[30] However, a good observational study may be better than a bad randomized experiment.

The statistical analysis of a randomized experiment may be based on the randomization scheme stated in the experimental protocol and does not need a subjective model.[31] [32]

However, not all hypotheses can be tested by randomized experiments or random samples, which often require a large budget, a lot of expertise and time, and may have ethical problems.

Model-based analysis of randomized experiments

It is standard practice to refer to a statistical model, often a normal linear model, when analyzing data from randomized experiments. However, the randomization scheme guides the choice of a statistical model. It is not possible to choose an appropriate model without knowing the randomization scheme.[33] Seriously misleading results can be obtained analyzing data from randomized experiments while ignoring the experimental protocol; common mistakes include forgetting the blocking used in an experiment and confusing repeated measurements on the same experimental unit with independent replicates of the treatment applied to different experimental units.[34]
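The definition of the randomization distribution above can be made concrete with a tiny worked example (hypothetical outcome values; a sketch, not part of the article): with 8 experimental units, 4 assigned to treatment, every one of the C(8,4) = 70 possible assignment plans is evaluated under the null hypothesis of no treatment effect.

```python
from itertools import combinations

# Hypothetical outcomes from a tiny randomized experiment:
# 4 units were assigned to treatment and 4 to control.
treated = [12.1, 11.4, 13.0, 12.6]
control = [10.2, 10.9, 11.1, 10.5]
outcomes = treated + control
observed = sum(treated) / 4 - sum(control) / 4  # difference in means

# Under the null hypothesis of no treatment effect, every way the
# randomization could have assigned 4 of the 8 units to treatment is
# equally likely; evaluate the test statistic for each such plan.
diffs = []
for idx in combinations(range(8), 4):
    t = [outcomes[i] for i in idx]
    c = [outcomes[i] for i in range(8) if i not in idx]
    diffs.append(sum(t) / 4 - sum(c) / 4)

# Two-sided p-value: fraction of plans at least as extreme as observed.
p_value = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
print(len(diffs), round(p_value, 3))
```

Note that the only probability used here is the one the experimenter introduced through the randomization itself; no distributional model for the outcomes is assumed.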

Modes of inference

Different schools of statistical inference have become established. These schools (or 'paradigms') are not mutually-exclusive, and methods which work well under one paradigm often have attractive interpretations under other paradigms. The two main paradigms in use are frequentist and Bayesian inference, which are both summarized below.

Frequentist inference

This paradigm calibrates the production of propositions by considering (notional) repeated sampling of datasets similar to the one at hand. By considering its characteristics under repeated sample, the frequentist properties of any statistical inference procedure can be described, although in practice this quantification may be challenging.

Examples of frequentist inference
• P-value
• Confidence interval

Frequentist inference, objectivity, and decision theory

Frequentist inference calibrates procedures, such as tests of hypothesis and constructions of confidence intervals, in terms of frequency probability, that is, in terms of repeated sampling from a population. (In contrast, Bayesian inference calibrates procedures with regard to epistemological uncertainty, described as a probability measure.)

The frequentist calibration of procedures can be done without regard to utility functions. However, some elements of frequentist statistics, such as statistical decision theory, do incorporate utility functions. Loss functions must be explicitly stated for statistical theorists to prove that a statistical procedure has an optimality property. In particular, frequentist developments of optimal inference (such as minimum-variance unbiased estimators, or uniformly most powerful testing) make use of loss functions, which play the role of (negative) utility functions. For example, median-unbiased estimators are optimal under absolute value loss functions, and least squares estimators are optimal under squared error loss functions. While statisticians using frequentist inference must choose for themselves the parameters of interest, and the estimators/test statistic to be used, the absence of obviously-explicit utilities and prior distributions has helped frequentist procedures to become widely-viewed as 'objective'.

Bayesian inference

The Bayesian calculus describes degrees of belief using the 'language' of probability; beliefs are positive, integrate to one, and obey probability axioms. Bayesian inference uses the available posterior beliefs as the basis for making statistical propositions. There are several different justifications for using the Bayesian approach.

Examples of Bayesian inference
• Credible intervals for interval estimation
• Bayes factors for model comparison

Bayesian inference, subjectivity and decision theory

Many informal Bayesian inferences are based on "intuitively reasonable" summaries of the posterior. For example, the posterior mean, median and mode, highest posterior density intervals, and Bayes Factors can all be motivated in this way. While a user's utility function need not be stated for this sort of inference, these summaries do all depend (to some extent) on stated prior beliefs, and are generally viewed as subjective conclusions. (Methods of prior construction which do not require external input have been proposed but not yet fully developed.)
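The contrast between the two paradigms can be seen on a small binomial example (hypothetical counts; a sketch, not part of the article): a Bayesian central 95% credible interval from a conjugate Beta posterior next to a frequentist 95% normal-approximation (Wald) confidence interval. The numbers come out similar here, but the two intervals answer differently calibrated questions.

```python
import random

random.seed(1)
# Observed data (hypothetical): 18 successes in 25 Bernoulli trials.
successes, n = 18, 25

# Bayesian: with a uniform Beta(1, 1) prior, the posterior for the success
# probability is Beta(1 + successes, 1 + failures) by conjugacy. Approximate
# its central 95% interval by Monte Carlo sampling from the posterior.
a, b = 1 + successes, 1 + (n - successes)
draws = sorted(random.betavariate(a, b) for _ in range(100_000))
credible = (draws[2_500], draws[97_500])

# Frequentist: normal-approximation (Wald) 95% confidence interval.
p_hat = successes / n
se = (p_hat * (1 - p_hat) / n) ** 0.5
confidence = (p_hat - 1.96 * se, p_hat + 1.96 * se)

print("credible:   (%.2f, %.2f)" % credible)
print("confidence: (%.2f, %.2f)" % confidence)
```

The credible interval is a direct probability statement about the parameter given the data and the stated prior; the confidence interval is a statement about the long-run coverage of the interval-constructing procedure.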

Formally, Bayesian inference is calibrated with reference to an explicitly stated utility, or loss function; the 'Bayes rule' is the one which maximizes expected utility, averaged over the posterior uncertainty. Formal Bayesian inference therefore automatically provides optimal decisions in a decision theoretic sense. Given assumptions, data and utility, Bayesian inference can be made for essentially any problem, although not every statistical inference need have a Bayesian interpretation. Analyses which are not formally Bayesian can be (logically) incoherent; a feature of Bayesian procedures which use proper priors (i.e., those integrable to one) is that they are guaranteed to be coherent. Some advocates of Bayesian inference assert that inference must take place in this decision-theoretic framework, and that Bayesian inference should not conclude with the evaluation and summarization of posterior beliefs.

Other modes of inference (besides frequentist and Bayesian)

Information and computational complexity

Other forms of statistical inference have been developed from ideas in information theory[35] and the theory of Kolmogorov complexity.[36] For example, the minimum description length (MDL) principle selects statistical models that maximally compress the data; inference proceeds without assuming counterfactual or non-falsifiable 'data-generating mechanisms' or probability models for the data, as might be done in frequentist or Bayesian approaches. However, if a 'data generating mechanism' does exist in reality, then according to Shannon's source coding theorem it provides the MDL description of the data, on average and asymptotically.[37] In minimizing description length (or descriptive complexity), MDL estimation is similar to maximum likelihood estimation and maximum a posteriori estimation (using maximum-entropy Bayesian priors). However, MDL avoids assuming that the underlying probability model is known; the MDL principle can also be applied without assumptions that e.g. the data arose from independent sampling.[37] [38]

The MDL principle has been applied in communication-coding theory in information theory, in linear regression, and in time-series analysis (particularly for choosing the degrees of the polynomials in Autoregressive moving average (ARMA) models).[38]

Information-theoretic statistical inference has been popular in data mining, which has become a common approach for very large observational and heterogeneous datasets made possible by the computer revolution and internet.[36]

The evaluation of statistical inferential procedures often uses techniques or criteria from computational complexity theory or numerical analysis.[39]

Fiducial inference

Fiducial inference was an approach to statistical inference based on fiducial probability, also known as a "fiducial distribution". In subsequent work, this approach has been recognized as being ill-defined, extremely limited in applicability, and even fallacious.[40]

Structural inference

Developing ideas of Fisher and of Pitman from 1938-1939,[41] George A. Barnard developed "structural inference" or "pivotal inference",[42] an approach using invariant probabilities on group families. Barnard reformulated the arguments behind fiducial inference on a restricted class of models on which "fiducial" procedures would be well-defined and useful.
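A toy version of MDL model selection can illustrate the "models that compress the data" idea (a sketch, not part of the article: the two-part code length used here is the standard BIC-style approximation, (k/2)·log2(n) bits for k fitted parameters plus (n/2)·log2(RSS/n) bits for the residuals, and the data are made up).

```python
import math
import random

random.seed(2)

# Hypothetical data: two regimes with genuinely different means.
data = [random.gauss(0.0, 1.0) for _ in range(200)] + \
       [random.gauss(3.0, 1.0) for _ in range(200)]

def description_length(segments, k):
    """Two-part code length (in bits): (k/2) log2 n for the k fitted
    parameters plus (n/2) log2(RSS/n) for encoding the residuals."""
    n = sum(len(s) for s in segments)
    rss = 0.0
    for s in segments:
        m = sum(s) / len(s)           # least-squares fit: the segment mean
        rss += sum((x - m) ** 2 for x in s)
    return 0.5 * k * math.log2(n) + 0.5 * n * math.log2(rss / n)

one_mean = description_length([data], k=1)
two_means = description_length([data[:200], data[200:]], k=2)
print(one_mean > two_means)  # the two-mean model compresses the data better
```

The extra parameter of the two-mean model costs a few bits, but it shrinks the residuals enough to shorten the total description, so MDL selects it; on data from a single regime the penalty term would dominate instead.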

Inference topics

The topics below are usually included in the area of statistical inference.
• Statistical assumptions
• Statistical decision theory
• Estimation theory
• Statistical hypothesis testing
• Revising opinions in statistics
• Design of experiments, the analysis of variance, and regression
• Survey sampling
• Summarizing statistical data

See also
• Predictive inference
• Induction (philosophy)
• Philosophy of statistics
• Algorithmic inference

References
• Bickel, Peter J.; Doksum, Kjell A. (2001). Mathematical statistics: Basic and selected topics, v. 1 (Second (updated printing 2007) ed.). Pearson Prentice-Hall. xiv+442 pp. MR1939352.
• Cox, D. R. (2006). Principles of Statistical Inference. Cambridge University Press. ISBN 0-521-68567-2.
• Fisher, R. A. (1955). "Statistical methods and scientific induction". J. Roy. Statist. Soc. Ser. B 17, 69–78. (criticism of statistical theories of Jerzy Neyman and Abraham Wald)
• Freedman, David A. (2009). Statistical models: Theory and practice [43] (revised ed.). Cambridge University Press. ISBN 978-0-521-74385-3. MR2489600.
• Hansen, Mark H.; Yu, Bin (June 2001). "Model Selection and the Principle of Minimum Description Length: Review paper" [44]. Journal of the American Statistical Association 96 (454): pp. 746–774. JSTOR 2670311.
• Kolmogorov, Andrei N. (1963). "On Tables of Random Numbers". Sankhyā Ser. A 25: pp. 369–375. MR178484.
• Kolmogorov, Andrei N. (1998). "On Tables of Random Numbers". Theoretical Computer Science 207 (2): pp. 387–395. doi:10.1016/S0304-3975(98)00075-9. MR1643414.
• Neyman, Jerzy (1956). "Note on an Article by Sir Ronald Fisher" [45]. Journal of the Royal Statistical Society, Series B (Methodological) 18 (2): pp. 288–294. JSTOR 2983716. (reply to Fisher 1955)
• Peirce, C. S. (1877–1878). "Illustrations of the Logic of Science" (series). Popular Science Monthly, vols. 12–13. Relevant individual papers:
• (1878 March). "The Doctrine of Chances". Popular Science Monthly, v. 12, March issue, pp. 604–615. Internet Archive Eprint [47]. (CP 2.645-668), (W 3:276-290), (EP 1:142-154). Reprinted (CLL 61-81). Selections plus CP 2.661-668 and CP 2.758 published as "The Doctrine of Chances With Later Reflections" (PWP 157-173).
• (1878 April). "The Probability of Induction". Popular Science Monthly, v. 12, pp. 705–718. Internet Archive Eprint [49]. (CP 2.669-693), (EP 1:155-169). Reprinted (CLL 82-105), (PWP 174-189).
• (1878 June). "The Order of Nature". Popular Science Monthly, v. 13, pp. 203–217. Internet Archive Eprint [51]. (CP 6.395-427), (EP 1:170-185). Reprinted (CLL 106-130).
• (1878 August). "Deduction, Induction, and Hypothesis". Popular Science Monthly, v. 13, pp. 470–482. Internet Archive Eprint [53]. (CP 2.619-644), (EP 1:186-199). Reprinted (CLL 131-156).

Edited by David Collier. • Zabell. Duxbury Press. Berlin: Walter de Gruyter. (1883). “Statistical Models and Shoe Leather” (1991). Smith. (December 2000). William (December 1988). Methodology and Philosophy of Science IX: Proceedings of the Ninth International Congress of Logic. Vol.. Sekhon. "Miracles and Statistics: The Casual Assumption of Independence (ASA Presidential address)" (http:/ / www. "Models and Statistical Inference: The Controversy between Fisher and Neyman—Pearson. 1992). Stochastic Complexity in Statistical Inquiry. Singapore: World Scientific. ISBN 3-11-01-3863-8. CUP. D. "Miracles and Statistics: The Casual Assumption of Independence (ASA Presidential address)" [59]. OUP. org/ stable/ 2290117). (Reprinted as Chapter 11 (pages 169–192) of: Freedman. Studies in Logic. "Fiducial distribution and Bayes’ theorem". S. Oscar (2008). John Benjamins Publishing Company. (2008) "Survival analysis: An Epidemiological hazard?". 369–387. G. (1994). R. Volume I: Introduction to Experimental Design [57] (Second ed." in Dag Prawitz.) Cambridge University Press. • David A. and Westerstahl (eds. ISBN 978-0-471-72756-9. 102–7 • Sudderth. D. Sekhon. Jennifer (1979). C. Sweden. "Principal Information-Theoretic Approaches (Vignettes for the Year 2000: Theory and Methods. Klaus and Kempthorne. Design and Analysis of Experiments. by George Casella)" [55]. Hamböker (1994). Jasjeet S. "Coherent Inference and Prediction in Statistics. JSTOR 2290117 • Lenhard. . ISBN 9027232717 (W 4:408-453) • Pfanzagl. and Company. (1958). Fisher and Fiducial Argument" [56]. • Soofi. pp. (2003) The Oxford Dictionary of Statistical Terms. Series B. MR1291393. Y. "A Theory of Probable Inference". 291–313. 929–940. Parametric Statistical Theory. MR1082556. A. Ltd. • Trusted. vol. (2008) Oxford Dictionary of Statistics. Brown. JSTOR 2290117 [7] Freedman. I. Journal of the Royal Statistical Society. 15. 2010. (Reprinted 1983. • Lindley. (1998) Asymptotic Statistics Cambridge University Press. 

Statistical hypothesis testing

A statistical hypothesis test is a method of making decisions using experimental data. In statistics, a result is called statistically significant if it is unlikely to have occurred by chance. The phrase "test of significance" was coined by Ronald Fisher: "Critical tests of this kind may be called tests of significance, and when such tests are available we may discover whether a second sample is or is not significantly different from the first."[1]

Introduction

Hypothesis testing is sometimes called confirmatory data analysis, in contrast to exploratory data analysis. In frequency probability, these decisions are almost always made using null-hypothesis tests, i.e. tests that answer the question: assuming that the null hypothesis is true, what is the probability of observing a value for the test statistic that is at least as extreme as the value that was actually observed?[2] One use of hypothesis testing is deciding whether experimental results contain enough information to cast doubt on conventional wisdom. Statistical hypothesis testing is a key technique of frequentist statistical inference, and is widely used, but also much criticized. While controversial,[3] the Bayesian approach to hypothesis testing is to base rejection of the hypothesis on the posterior probability.[4] Other approaches to reaching a decision based on data are available via decision theory and optimal decisions.

The critical region of a hypothesis test is the set of all outcomes which, if they occur, will lead us to decide that there is a difference, that is, to reject the null hypothesis in favor of the alternative hypothesis. The critical region is usually denoted by C. The following examples should solidify these ideas.

Example 1 - Court Room Trial

A statistical test procedure is comparable to a trial: a defendant is considered innocent as long as his guilt is not proven. The prosecutor tries to prove the guilt of the defendant, and only when there is enough incriminating evidence is the defendant condemned. At the start of the procedure there are two hypotheses, H0: "the defendant is innocent" and H1: "the defendant is guilty". The first one is called the null hypothesis and is accepted for the time being; the second one is called the alternative hypothesis, and it is the hypothesis one tries to prove. The hypothesis of innocence is rejected only when an error is very unlikely, because one doesn't want to condemn an innocent defendant. Such an error is called an error of the first kind (i.e. the condemnation of an innocent person), and its occurrence is controlled to be rare.

As a consequence of this asymmetric behaviour, the error of the second kind (setting a guilty person free) is often rather large.

  Decision \ Truth                        Null Hypothesis (H0) is true:                  Alternative Hypothesis (H1) is true:
                                          He truly is innocent                           He truly is guilty
  Accept H0 ("Let Him Go")                GOOD                                           BAD - Incorrectly accept the null: Type II Error (False Negative)
  Reject H0 ("GUILTY - HANG 'EM HIGH")    BAD - Incorrectly reject the null: Type I      GOOD
                                          Error (False Positive)

Example 2 - Clairvoyant Card Game

A person (the subject) is tested for clairvoyance. He is shown the reverse of a randomly chosen playing card 25 times and asked which of the four suits it belongs to. The number of hits, or correct answers, is called X.

As we try to find evidence of his clairvoyance, for the time being the null hypothesis is that the person is not clairvoyant. The alternative is, of course: the person is (more or less) clairvoyant.

If the null hypothesis is valid, the only thing the test person can do is guess. For every card, the probability (relative frequency) of guessing correctly is 1/4. If the alternative is valid, the test subject will predict the suit correctly with probability greater than 1/4. We will call the probability of guessing correctly p. The hypotheses, then, are:

  H0: p = 1/4   (Null Hypothesis = Just Guessing)
  H1: p > 1/4   (Alternative Hypothesis = True Clairvoyant)

When the test subject correctly predicts all 25 cards, we will consider him clairvoyant and reject the null hypothesis; thus also with 24 or 23 hits. With only 5 or 6 hits, on the other hand, there is no cause to consider him so. But what about 12 hits, or 17 hits? What is the critical number, c, of hits at which point we consider the subject to be clairvoyant rather than coincidentally lucky? How do we determine the critical value c? It is obvious that with the choice c = 25 (i.e. we only accept clairvoyance when all cards are predicted correctly) we're more critical than with, say, c = 10. In the first case almost no test subjects will be recognised to be clairvoyant; in the second case, some number more will pass the test. In practice, one decides how critical one will be: that is, one decides how often one accepts an error of the first kind, a false positive.

With c = 25, the probability of such an error is the probability of randomly guessing correctly all 25 times:

  P(X ≥ 25 | p = 1/4) = (1/4)^25 ≈ 10^-15,

hence very small. A less critical choice, c = 10, yields a much greater probability of false positive. Typically, the desired probability of a Type I error is determined before the test is actually performed; in practice, values in the range of 1% to 5% are selected. Depending on this desired Type I error rate, the critical value c is calculated. For example, if we select an error rate of 1%, c is calculated thus:

  P(X ≥ c | p = 1/4) ≤ 0.01.

From all the numbers c with this property, we choose the smallest, in order to minimize the probability of a Type II error. For the above example, we select c = 13.

  Decision \ Truth                             Null Hypothesis (H0) is true:             Alternative Hypothesis (H1) is true:
                                               Subject Truly is Just Guessing            Subject is Truly a Gifted Clairvoyant
  Accept H0: You Believe Subject is Just       GOOD (A)                                  BAD - Incorrectly accept the null: Type II
  Guessing ("Send him home to Mama!")                                                    Error (False Negative) (B)
  Reject H0: You Believe Subject is            BAD - Incorrectly reject the null:        GOOD (D)
  Clairvoyant ("Take him to the Horse          Type I Error (False Positive) (C)
  Races!")

Note that the sum of all four possible outcomes must be 1: A + B + C + D = 1. When designing a statistical test, one wants to maximize the good probabilities (in this case A and D) and minimize the bad probabilities (B and C).

Example 3 - Radioactive Suitcase

As an example, consider determining whether a suitcase contains some radioactive material. Placed under a Geiger counter, it produces 10 counts per minute. The null hypothesis is that no radioactive material is in the suitcase and that all measured counts are due to ambient radioactivity typical of the surrounding air and harmless objects. We can then calculate how likely it is that we would observe 10 counts per minute if the null hypothesis were true. If the null hypothesis predicts (say) on average 9 counts per minute and a standard deviation of 1 count per minute, then we say that the suitcase is compatible with the null hypothesis; this does not guarantee that there is no radioactive material, just that we don't have enough evidence to suggest there is. On the other hand, if the null hypothesis predicts 3 counts per minute and a standard deviation of 1 count per minute, then the suitcase is not compatible with the null hypothesis, and there are likely other factors responsible for the measurements.

  Decision \ Truth                             Null Hypothesis (H0) is true:             Alternative Hypothesis (H1) is true:
                                               Suitcase contains only Approved Items     Suitcase contains Radioactive Material
  Accept H0: TSA Believes Suitcase contains    GOOD                                      BAD - Incorrectly accept the null: Type II
  only Approved Items (Allow Person to                                                   Error (False Negative). Terrorist Gets On
  Board Aircraft)                                                                        Board!
  Reject H0: TSA Believes Suitcase contains    BAD - Incorrectly reject the null:        GOOD
  Radioactive Materials (Detain at Airport     Type I Error (False Positive). Cavity
  Security)                                    Searches, Clothes, Shoes, Toothpaste...
                                               Scandal Ensues, Lawsuits to Follow.

Again, the designer of a statistical test wants to maximize the good probabilities and minimize the bad probabilities.

The test described here is more fully the null-hypothesis statistical significance test. The null hypothesis represents what we would believe by default, before seeing any evidence. Statistical significance is a possible finding of the test, declared when the observed sample is unlikely to have occurred by chance if the null hypothesis were true. The name of the test describes its formulation and its possible outcome. One characteristic of the test is its crisp decision: to reject or not reject the null hypothesis. A calculated value is compared to a threshold, which is determined from the tolerable risk of error.
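The rule just described for Example 2 (take the smallest c whose binomial tail probability under H0 does not exceed the chosen error rate) can be sketched directly; the code below is a minimal illustration using only the standard library, with the 1% level from the text.

```python
from math import comb

def tail(c, n=25, p=0.25):
    """P(X >= c) under H0: the subject is just guessing with success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

# Smallest critical value c whose Type I error rate does not exceed 1%.
alpha = 0.01
c = next(c for c in range(26) if tail(c) <= alpha)
print(c)         # 13
print(tail(13))  # about 0.0034, while P(X >= 12) is about 0.0107, just over 1%
```

With c = 13 the test tolerates a false-positive rate of about 0.34%, and any smaller c would exceed the 1% budget, which is why 13 is selected.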

The testing process

Hypothesis testing is defined by the following general procedure:
1. The first step in any hypothesis testing is to state the relevant null and alternative hypotheses to be tested. This is important, as mis-stating the hypotheses will muddy the rest of the process.
2. The second step is to consider the statistical assumptions being made about the sample in doing the test; for example, assumptions about the statistical independence or about the form of the distributions of the observations. This is equally important, as invalid assumptions will mean that the results of the test are invalid.
3. Decide which test is appropriate, and state the relevant test statistic T.
4. Derive the distribution of the test statistic under the null hypothesis from the assumptions. In standard cases this will be a well-known result; for example, the test statistic may follow a Student's t distribution or a normal distribution.
5. The distribution of the test statistic partitions the possible values of T into those for which the null hypothesis is rejected, the so-called critical region, and those for which it is not.
6. Compute from the observations the observed value tobs of the test statistic T.
7. Decide to either fail to reject the null hypothesis or reject it in favor of the alternative. The decision rule is to reject the null hypothesis H0 if the observed value tobs is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.

It is important to note the philosophical difference between accepting the null hypothesis and simply failing to reject it. The "fail to reject" terminology highlights the fact that the null hypothesis is assumed to be true from the start of the test; if there is a lack of evidence against it, it simply continues to be assumed true. The phrase "accept the null hypothesis" may suggest it has been proved simply because it has not been disproved, a logical fallacy known as the argument from ignorance. Unless a test with particularly high power is used, the idea of "accepting" the null hypothesis may be dangerous. Nonetheless the terminology is prevalent throughout statistics, where its meaning is well understood.

Definition of terms

The following definitions are mainly based on the exposition in the book by Lehmann and Romano:[5]

Simple hypothesis
  Any hypothesis which specifies the population distribution completely.
Composite hypothesis
  Any hypothesis which does not specify the population distribution completely.
Statistical test
  A decision function that takes its values in the set of hypotheses.
Region of rejection / Critical region
  The set of values of the test statistic for which the null hypothesis is rejected.
Region of acceptance
  The set of values for which we fail to reject the null hypothesis.
Size / Significance level of a test (α)
  For simple hypotheses, this is the test's probability of incorrectly rejecting the null hypothesis: the false positive rate. For composite hypotheses this is the upper bound of the probability of rejecting the null hypothesis over all cases covered by the null hypothesis.
Power of a test (1 − β)
  The test's probability of correctly rejecting the null hypothesis: the complement of the false negative rate, β.
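The seven steps above can be sketched end to end with a one-sample z-test; the data, the null mean, and the 5% level below are hypothetical choices for illustration only.

```python
from math import erf, sqrt

data = [2.9, 3.2, 3.1, 2.8, 3.4, 3.3, 3.0, 3.2, 3.1, 2.9]  # made-up sample

# Steps 1-2: H0: mu = 3.0 against H1: mu != 3.0, assuming independent draws
# from a normal population with known sigma (an assumption of this sketch).
mu0, sigma, alpha = 3.0, 0.2, 0.05

# Steps 3-4: test statistic T = (xbar - mu0) / (sigma / sqrt(n));
# under H0 it follows a standard normal distribution.
n = len(data)
xbar = sum(data) / n
t_obs = (xbar - mu0) / (sigma / sqrt(n))

# Steps 5-7: two-sided p-value from the standard normal CDF; reject H0
# exactly when t_obs falls in the critical region, i.e. p < alpha.
phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
p_value = 2 * (1 - phi(abs(t_obs)))
print("t_obs = %.3f, p = %.3f, reject H0: %s" % (t_obs, p_value, p_value < alpha))
```

For this sample the statistic is about 1.42, well inside the acceptance region, so the sketch ends at step 7 with "fail to reject".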

Unbiased test[6]
  For a specific alternative hypothesis, a test is said to be unbiased when the probability of rejecting the null hypothesis is not less than the significance level when the alternative is true, and is less than or equal to the significance level when the null hypothesis is true.
Most powerful test
  For a given size or significance level, the test with the greatest power.
Uniformly most powerful test (UMP)
  A test with the greatest power for all values of the parameter being tested.
Uniformly most powerful unbiased (UMPU)
  A test which is UMP in the set of all unbiased tests.
Consistent test
  When considering the properties of a test as the sample size grows, a test is said to be consistent if, for a fixed size of test, the power against any fixed alternative approaches 1 in the limit.
p-value
  The probability, assuming the null hypothesis is true, of observing a result at least as extreme as the test statistic.

Interpretation

The direct interpretation is that if the p-value is less than the required significance level, then we say the null hypothesis is rejected at the given level of significance. Criticism of this interpretation can be found in the corresponding section below.

Common test statistics

In the table below, the symbols used are defined at the bottom of the table. The subscript 0 indicates a value taken from the null hypothesis, which should be used as much as possible in constructing the test statistic. Many other tests can be found in other articles.

One-sample z-test
  z = (x̄ − μ0) / (σ/√n)
  Assumptions: (normal population or n > 30) and σ known. (z is the distance from the mean in relation to the standard deviation of the mean.) For non-normal distributions it is possible to calculate a minimum proportion of a population that falls within k standard deviations for any k (see: Chebyshev's inequality).
Two-sample z-test
  z = (x̄1 − x̄2 − d0) / √(σ1²/n1 + σ2²/n2)
  Assumptions: normal populations, independent observations, and σ1 and σ2 are known.
Two-sample pooled t-test, equal variances*[7]
  t = (x̄1 − x̄2 − d0) / (sp √(1/n1 + 1/n2)), where sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) and df = n1 + n2 − 2.
  Assumptions: (normal populations or n1 + n2 > 40), independent observations, and σ1 = σ2 with σ1 and σ2 unknown.
Two-sample unpooled t-test, unequal variances*[8]
  t = (x̄1 − x̄2 − d0) / √(s1²/n1 + s2²/n2)
  Assumptions: (normal populations or n1 + n2 > 40), independent observations, and σ1 ≠ σ2 with σ1 and σ2 unknown.
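The pooled t-test row above can be computed directly from its formula; the two samples below are invented for illustration, and d0 = 0 encodes the null of equal means.

```python
from math import sqrt

x1 = [5.1, 4.9, 5.4, 5.0, 5.2, 4.8]  # hypothetical sample 1
x2 = [4.7, 4.6, 5.0, 4.8, 4.5, 4.9]  # hypothetical sample 2

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    """Unbiased sample variance s^2 (divisor n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(x1), len(x2)
# Pooled variance, exactly as in the table row for equal variances.
sp2 = ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
t = (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(t, df)  # compare t against a Student's t table with df degrees of freedom
```

The resulting statistic (about 2.7 on 10 degrees of freedom here) would then be referred to the Student's t distribution named in step 4 of the testing process.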

One-proportion z-test
  z = (p̂ − p0) / √(p0(1 − p0)/n)
  Assumptions: n·p0 > 10 and n(1 − p0) > 10, and it is an SRS (Simple Random Sample).
Two-proportion z-test, pooled for H0: p1 = p2[11]
  z = (p̂1 − p̂2) / √(p̂(1 − p̂)(1/n1 + 1/n2)), where p̂ pools the successes of both samples.
  Assumptions: n1·p1 > 5, n1(1 − p1) > 5, n2·p2 > 5, n2(1 − p2) > 5, and independent observations.
Two-proportion z-test, unpooled for |d0| > 0
  z = (p̂1 − p̂2 − d0) / √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)
  Assumptions: n1·p1 > 5, n1(1 − p1) > 5, n2·p2 > 5, n2(1 − p2) > 5, and independent observations.
One-sample chi-square test
  χ² = Σ (observed − expected)² / expected
  Assumptions, one of the following: all expected counts are at least 5; or all expected counts are > 1 and no more than 20% of expected counts are less than 5.
*Two-sample F test for equality of variances
  F = s1²/s2²
  Arrange so that s1² ≥ s2² and reject H0 for large F.[9]

Definitions of the symbols:
• α = the probability of Type I error (rejecting a null hypothesis when it is in fact true)
• n = sample size; n1 = sample 1 size; n2 = sample 2 size
• x̄ = sample mean; μ0 = hypothesized population mean; μ1 = population 1 mean; μ2 = population 2 mean
• σ = population standard deviation; σ² = population variance
• s = sample standard deviation; s² = sample variance; s1, s2 = sample 1 and sample 2 standard deviations
• t = t statistic; df = degrees of freedom
• d̄ = sample mean of differences; d0 = hypothesized population mean difference; sd = standard deviation of differences
• χ² = Chi-squared statistic; F = F statistic
• p̂ = x/n = sample proportion, unless specified otherwise; p0 = hypothesized population proportion; p1 = proportion 1; p2 = proportion 2; dp = hypothesized difference in proportion; min(n1, n2) = minimum of n1 and n2

Origins

Hypothesis testing is largely the product of Ronald Fisher, Jerzy Neyman, Karl Pearson and (son) Egon Pearson. Fisher was an agricultural statistician who emphasized rigorous experimental design and methods to extract a result from few samples assuming Gaussian distributions. Neyman (who teamed with the younger Pearson) emphasized mathematical rigor and methods to obtain more results from many samples and a wider range of distributions. Modern hypothesis testing is an (extended) hybrid of the Fisher vs Neyman/Pearson formulation, methods and terminology developed in the early 20th century.

Example 4 - Lady Tasting Tea

The following example is summarized from Fisher,[10] and is known as the Lady tasting tea example. Fisher thoroughly explained his method in a proposed experiment to test a Lady's claimed ability to determine the means of tea preparation by taste. The article is less than 10 pages in length and is notable for its simplicity and completeness regarding terminology, calculations and design of the experiment. The example is loosely based on an event in Fisher's life; the Lady proved him wrong.
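The one-proportion z-test row can likewise be evaluated in a few lines; the counts below (62 successes in 100 trials against a null proportion of 0.5) are hypothetical.

```python
from math import erf, sqrt

x, n, p0 = 62, 100, 0.5                     # hypothetical data; H0: p = 0.5
assert n * p0 > 10 and n * (1 - p0) > 10    # validity condition from the table

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # one-proportion z statistic
phi = lambda v: 0.5 * (1 + erf(v / sqrt(2)))
p_value = 2 * (1 - phi(abs(z)))             # two-sided p-value
print(z, p_value)
```

Here z = 2.4, giving a two-sided p-value of roughly 0.016, so at the conventional 5% level this sample would lead to rejection of H0.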

In the example:
1. The null hypothesis was that the Lady had no such ability.
2. The test statistic was a simple count of the number of successes in 8 trials.
3. The distribution associated with the null hypothesis was the binomial distribution familiar from coin flipping experiments.
4. The critical region was the single case of 8 successes in 8 trials, based on a conventional probability criterion (< 5%).
5. If and only if the 8 trials produced 8 successes was Fisher willing to reject the null hypothesis, effectively acknowledging the Lady's ability with > 98% confidence (but without quantifying her ability).

Fisher asserted that no alternative hypothesis was (ever) required. Fisher later discussed the benefits of more trials and repeated tests.

Importance

Statistical hypothesis testing plays an important role in the whole of statistics and in statistical inference. Lehmann (1992), in a review of the fundamental paper by Neyman and Pearson (1933), says: "Nevertheless, despite their shortcomings, the new paradigm formulated in the 1933 paper, and the many developments carried out within its framework continue to play a central role in both the theory and practice of statistics and can be expected to do so in the foreseeable future".

Criticism

Some statisticians have commented that pure "significance testing" has what is actually a rather strange goal of detecting the existence of a "real" difference between two populations. In practice a difference can almost always be found given a large enough sample. The typically more relevant goal of science is a determination of causal effect size; the amount and nature of the difference, in other words, is what should be studied. Rejection of the null hypothesis at some effect size has no bearing on the practical significance of the observed effect size: a statistically significant finding may not be relevant in practice due to other, larger effects of more concern, whilst a true effect of practical significance may not appear statistically significant if the test lacks the power to detect it. Appropriate specification of both the hypothesis and the test of said hypothesis is therefore important to provide inference of practical utility.

Meta-criticism

Criticism is of the application, or of the interpretation, rather than of the method; some of it is Bayesian in nature. There is little criticism of the formulation in its original context: the original purposes of Fisher's formulation, as a tool for the experimenter, were to plan the experiment and to easily assess the information content of the small sample. Complaints focus on flawed interpretations of the results and over-dependence/emphasis on one test; in practice a single statistical test in a single study never "proves" anything. Criticism of null-hypothesis significance testing is available in other articles (for example "Statistical significance") and their references. The most persistent attacks originated from the field of Psychology.[12] Many researchers also feel that hypothesis testing is something of a misnomer.[13] Numerous attacks on the formulation have failed to supplant it as a criterion for publication in scholarly journals. After review, the American Psychological Association did not explicitly deprecate the use of null-hypothesis significance testing, but adopted enhanced publication guidelines which implicitly reduced the relative importance of such testing.[14] Attacks and defenses of the null-hypothesis significance test are collected in Harlow et al.
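The probability criterion in items 3-5 above can be checked numerically. Assuming the coin-flip null stated in item 3 (each of the 8 trials succeeds by chance with probability 1/2, an interpretation of this sketch), the critical region of 8 successes in 8 trials has probability well under the 5% criterion:

```python
# Probability of the critical region (8 successes in 8 trials) under a
# binomial null with success probability 1/2 per trial (assumed model).
p_all_eight = 0.5 ** 8
print(p_all_eight)            # 0.00390625
print(p_all_eight < 0.05)     # True: 8/8 lies in the critical region
```

Observing 8 of 8 would therefore justify rejecting the null with confidence above the > 98% figure quoted in the text.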

. "significance" often has two distinct meanings in the same sentence. The test asks indirectly whether the sample can illuminate the research hypothesis. It is repulsive because it allows authority to avoid taking personal responsibility for decisions. If it is false.. A mathematical decision-making process is attractive because it is objective and transparent. "Despite the stranglehold that hypothesis testing has on experimental psychology.05"[15] The statistical significance required for publication has no mathematical basis. There is widespread and fundamental disagreement on the interpretation of test results. it must be the case that a large enough sample will produce a significant result and lead to its rejection. examples often support an argument. their terminologies have been blended. in the sense that they are prepared to ignore all results which fail to reach this standard.. So if the null hypothesis is always false. The Sage Dictionary of Statistics would not agree with the title of this article. even it mentions Neyman and Pearson terminology (Type II error and the alternative hypothesis). to eliminate from further discussion the greater part of the fluctuations which chance causes have introduced into their experimental results.[2] ". not an empirical one".there is no alternate hypothesis in Fisher's scheme: Indeed. A single counterexample results in the rejection of a conjecture."[17] In discussing test results."[16] Students find it difficult to understand the formulation of statistical null-hypothesis testing. the other is a subject-matter measurement (such as currency). Karl Popper defined science by its vulnerability to disproof by data.) The premature death of a laboratory rat during testing can impact doctoral theses and academic tenure decisions. The significance (meaning) of (statistical) significance is significant (important). and. he violently opposed its inclusion by Neyman and Pearson. by this means. 
The International Committee of Medical Journal Editors recognizes an obligation to publish negative (not statistically significant) studies under some circumstances. The applicability of null-hypothesis testing to the publication of observational (as contrasted to experimental) studies is doubtful.

Philosophical criticism

Philosophical criticism of hypothesis testing includes consideration of borderline cases. Any process that produces a crisp decision from uncertainty is subject to claims of unfairness near the decision threshold. (Consider close election results.) The premature death of a laboratory rat during testing can impact doctoral theses and academic tenure decisions.

"It is usual and convenient for experimenters to take 5% as a standard level of significance, in the sense that they are prepared to ignore all results which fail to reach this standard, and, by this means, to eliminate from further discussion the greater part of the fluctuations which chance causes have introduced into their experimental results."[2] The statistical significance required for publication has no mathematical basis, but is based on long tradition. "Surely, God loves the .06 nearly as much as the .05."[15]

Ambivalence attacks all forms of decision making. A mathematical decision-making process is attractive because it is objective and transparent. It is repulsive because it allows authority to avoid taking personal responsibility for decisions.

Karl Popper defined science by its vulnerability to disproof by data. Null-hypothesis testing shares the mathematical and scientific perspective rather than the more familiar rhetorical one. In rhetoric, examples often support an argument, but a mathematical proof "is a logical argument, not an empirical one"; a single counterexample results in the rejection of a conjecture.

"A little thought reveals a fact widely understood among statisticians: The null hypothesis, taken literally (and that's the only way you can take it in formal hypothesis testing), is almost always false in the real world.... If it is false, even to a tiny degree, it must be the case that a large enough sample will produce a significant result and lead to its rejection. So if the null hypothesis is always false, what's the big deal about rejecting it?"[17] (The above criticism only applies to point hypothesis tests. If one were testing, for example, whether a parameter is greater than zero, it would not apply.) "Despite the stranglehold that hypothesis testing has on experimental psychology, I find it difficult to imagine a less insightful means of transiting from data to conclusions."[16]

Pedagogic criticism

Pedagogic criticism of null-hypothesis testing includes the counter-intuitive formulation, the terminology, and the confusion about the interpretation of results. There is widespread and fundamental disagreement on the interpretation of test results. Students expect hypothesis testing to be a statistical tool for illumination of the research hypothesis by the sample; it is not. The test asks indirectly whether the sample can illuminate the research hypothesis. Students find it difficult to understand the formulation of statistical null-hypothesis testing, and also find the terminology confusing. In discussing test results, "significance" often has two distinct meanings in the same sentence: one is a probability, the other is a subject-matter measurement (such as currency). The significance (meaning) of (statistical) significance is significant (important).

The terminological blend is a further source of confusion. The Sage Dictionary of Statistics would not agree with the title of this article, which it would call null-hypothesis testing. While this article teaches a pure Fisher formulation, even it mentions Neyman and Pearson terminology (the Type II error and the alternative hypothesis). The typical introductory statistics text is less consistent: the two terminologies have been blended, and the blend is not seamless or standardized. While Fisher disagreed with Neyman and Pearson about the theory of testing, there is no alternative hypothesis in Fisher's scheme; indeed, he violently opposed its inclusion by Neyman and Pearson.

Practical criticism

Practical criticism of hypothesis testing includes the sobering observation that published test results are often contradicted.[19] [20] Null-hypothesis testing has not achieved the goal of a low error probability in medical journals; mathematical models support the conjecture that most published medical research test results are flawed. Many authors have expressed a strong skepticism, sometimes labeled as postmodernism, about the general unreliability of statistical hypothesis testing to explain many social and medical phenomena. Gerd Gigerenzer has called null hypothesis testing "mindless statistics",[21] while Jacob Cohen described it as a ritual conducted to convince ourselves that we have the evidence needed to confirm our theories.[22] "How has the virtually barren technique of hypothesis testing come to assume such importance in the process by which we arrive at our conclusions from our data?"[16]

Null-hypothesis testing just answers the question of "how well the findings fit the possibility that chance factors alone might be responsible."[2] Null-hypothesis significance testing does not determine the truth or falsity of claims; it determines whether confidence in a claim based solely on a sample-based estimate exceeds a threshold. It is a research quality assurance test, widely used as one requirement for publication of experimental research with statistical results. Rejecting the null hypothesis is not a sufficient condition for publication. It is uniformly agreed that statistical significance is not the only consideration in assessing the importance of research results: "Statistical significance does not necessarily imply practical significance!"[18]

There is also not a strong convention in statistical hypothesis testing to consider alternate units of scale. With spatial data, the units chosen for analysis (the modifiable areal unit problem) can alter or reverse relationships between variables. With temporal data, the units chosen for temporal aggregation (hour, day, week, year, decade) can completely alter the trends and cycles. If the issue of analysis scale is ignored in a hypothesis test, then skepticism about the results is justified. For this and other reasons, modern statistics do not reliably link exposures of carcinogens to spatial-temporal patterns of cancer incidence.

Straw man

Hypothesis testing is controversial when the alternative hypothesis is suspected to be true at the outset of the experiment, making the null hypothesis the reverse of what the experimenter actually believes; it is put forward as a straw man only to allow the data to contradict it. Under traditional null hypothesis testing, the null is rejected when the conditional probability P(Data as or more extreme than observed | Null) is very small, say 0.05. Many statisticians have pointed out that rejecting the null hypothesis says nothing or very little about the likelihood that the null is true. Some say researchers are really interested in the probability P(Null | Data as actually observed), which cannot be inferred from a p-value: some like to present these as inverses of each other, but the events "Data as or more extreme than observed" and "Data as actually observed" are very different. In some cases, P(Null | Data) approaches 1 while P(Data as or more extreme than observed | Null) approaches 0; in other words, we can reject the null when it is virtually certain to be true.
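The complaint that "a difference can almost always be found given a large enough sample" can be made concrete: for a fixed, tiny true effect, the expected one-sample z statistic grows like the square root of the sample size. The effect size and σ below are hypothetical.

```python
from math import sqrt

effect, sigma = 0.01, 1.0  # hypothetical: the true mean differs from mu0 by 0.01
# Expected z statistic, effect / (sigma / sqrt(n)), for increasing sample sizes.
zs = {n: effect / (sigma / sqrt(n)) for n in (100, 10_000, 1_000_000)}
for n, z in zs.items():
    print(n, z)  # grows like sqrt(n): 0.1, 1.0, 10.0
```

At n = 100 the effect is invisible, at n = 1,000,000 it is overwhelmingly "significant", even though the practical importance of a 0.01 shift is unchanged.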

Bayesian criticism

Bayesian statisticians reject classical null hypothesis testing, since it violates the Likelihood principle and is thus incoherent and leads to sub-optimal decision-making. Given a prior probability distribution for one or more parameters, sample evidence can be used to generate an updated posterior distribution. In this framework, but not in the null-hypothesis testing framework, it is meaningful to make statements of the general form "the probability that the true value of the parameter is greater than 0 is p". Along with many frequentist statisticians, Bayesians prefer to provide an estimate, along with a confidence interval (although Bayesian confidence intervals are different from classical ones). Some Bayesians (James Berger in particular) have developed Bayesian hypothesis testing methods, though these are not accepted by all Bayesians (notably, Andrew Gelman).

According to Bayes' theorem, we have

  P(Null | Data) = P(Data | Null) · P(Null) / P(Data)

thus P(Null | Data) may approach 1 while P(Data | Null) approaches 0 only when P(Null)/P(Data) approaches infinity, for instance when the a priori probability of the null hypothesis, P(Null), is also approaching 1 while P(Data) approaches 0: then P(Data | Null) is low because we have extremely unlikely data, but the null hypothesis is extremely likely to be true. The Jeffreys–Lindley paradox illustrates this.

Publication bias

The "file drawer problem" exists due to the fact that academics tend not to publish results that indicate the null hypothesis could not be rejected. This does not mean that the relationship they were looking for did not exist, but it means they couldn't prove it. Even though these papers can often be interesting, they tend to end up unpublished, in "file drawers." In 2002, a group of psychologists launched a new journal dedicated to experimental studies in psychology which support the null hypothesis: the Journal of Articles in Support of the Null Hypothesis [23] (JASNH) was founded to address this scientific publishing bias. According to the editors, "other journals and reviewers have exhibited a bias against articles that did not reject the null hypothesis", and they plan to change that by offering an outlet for experiments that do not reach the traditional significance levels (p < 0.05), thus reducing the file drawer problem and reducing the bias in psychological literature; they collect these articles and provide them to the scientific community free of cost. Without such a resource, researchers could be wasting their time examining empirical questions that have already been examined. Ioannidis has inventoried factors that should alert readers to risks of publication bias.[20]

Improvements

Jones and Tukey suggested a modest improvement in the original null-hypothesis formulation to formalize handling of one-tail tests. They note that Fisher ignored the 8-failure case (equally improbable as the 8-success case) in the example test involving tea, which altered the claimed significance by a factor of 2.[24]

See also

• Comparing means test decision tree
• Counternull
• Multiple comparisons
• Omnibus test
• Behrens–Fisher problem
• Bootstrapping (statistics)
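The Jeffreys–Lindley effect can be sketched numerically. The setup below (a 50/50 spike-and-slab prior with a N(0, 1) slab, and a sample mean sitting exactly at the two-sided 5% boundary) is an illustrative assumption, not a construction given in the text:

```python
import math

def normal_pdf(x, sd):
    """Density of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Assumed setup: sample mean of n observations from N(theta, 1), placed
# exactly at the 5% two-sided rejection boundary of the classical test.
n = 100_000
se = 1 / math.sqrt(n)
xbar = 1.96 * se
p_value = math.erfc(1.96 / math.sqrt(2))   # two-sided p, about 0.05

# Spike-and-slab prior: P(Null) = 0.5; under the alternative theta ~ N(0, 1).
like_null = normal_pdf(xbar, se)                     # P(data | Null)
like_alt = normal_pdf(xbar, math.sqrt(1 + se ** 2))  # marginal P(data | Alt)
posterior_null = like_null / (like_null + like_alt)  # prior odds = 1

print(round(p_value, 3), round(posterior_null, 2))
```

Despite a p-value right at the 5% rejection threshold, the posterior probability of the null here is about 0.98: the data are unlikely under the null, yet the null remains almost certainly true.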

• Checking if a coin is fair
• Falsifiability
• Fisher's method for combining independent tests of significance
• Null hypothesis
• P-value
• Statistical theory
• Statistical significance
• Type I error, Type II error
• Exact test

References

[1] [2] [3] [4] [5] [6] [7]:
• Cramer, Duncan; Howitt, Dennis (2004). The Sage Dictionary of Statistics. ISBN 076194138X.
• Schervish, M: Theory of Statistics, page 218.
• Lehmann, E.L.; Romano, Joseph P. (2005). Testing Statistical Hypotheses (3E ed.). New York: Springer. ISBN 0387988645.
• Spiegelhalter, D. and Rice, K. (2009). Bayesian statistics. Scholarpedia, 4(8):5230.
• Fisher, R. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
• NIST handbook: Two-Sample t-Test for Equal Means (http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm)
• Florida: CRC Press. ISBN 978-0849327186.
[8] Ibid.
[9] NIST handbook: F-Test for Equality of Two Standard Deviations (http://www.itl.nist.gov/div898/handbook/eda/section3/eda359.htm) (should say "Variances")
[10] Fisher, Sir Ronald A. (1956) [1935]. "Mathematics of a Lady Tasting Tea" (http://books.google.com/?id=oKZwtLQTmNAC&pg=PA1512&dq="mathematics+of+a+lady+tasting+tea"). In James Roy Newman. The World of Mathematics, volume 3 [Design of Experiments]. Courier Dover Publications. ISBN 9780486411514.
[11] Box, Joan Fisher (1978). R.A. Fisher, The Life of a Scientist. New York: Wiley. p. 134.
[12] McCloskey, Deirdre (2008). The Cult of Statistical Significance. Ann Arbor: University of Michigan Press. ISBN 0472050079.
[13] Wallace, D.
[14] Harlow, Lisa Lavoie; Mulaik, Stanley A.; Steiger, James H. (1997). What If There Were No Significance Tests?. Mahwah, N.J.: Lawrence Erlbaum Associates Publishers. ISBN 978-0-8058-2634-0.
[15] Rosnow, R.L.; Rosenthal, R. (October 1989). "Statistical procedures and the justification of knowledge in psychological science" (http://ist-socrates.berkeley.edu/~maccoun/PP279_Rosnow.pdf). American Psychologist 44 (10): 1276–1284. doi:10.1037/0003-066X.44.10.1276. ISSN 0003-066X.
[16] Loftus, G.R. (1991). "On the tyranny of hypothesis testing in the social sciences". Contemporary Psychology 36: 102–105.
[17] Cohen, Jacob (December 1990). "Things I have learned (so far)". American Psychologist 45 (12): 1304–1312. doi:10.1037/0003-066X.45.12.1304. ISSN 0003-066X.
[18] Weiss, Neil A. (1999). Introductory Statistics (5th ed.). Reading, Mass.: Addison Wesley. p. 521. ISBN 0-201-59877-9.
[19] Ioannidis, John P. A. (July 2005). "Contradicted and initially stronger effects in highly cited clinical research" (http://jama.ama-assn.org/cgi/content/full/294/2/218). JAMA 294 (2): 218–28. doi:10.1001/jama.294.2.218. PMID 16014596.
[20] Ioannidis, John P. A. (August 2005). "Why most published research findings are false" (http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1182327). PLoS Med. 2 (8): e124. doi:10.1371/journal.pmed.0020124. PMC 1182327. PMID 16060722.
[21] Gigerenzer, Gerd (2004). "Mindless statistics". The Journal of Socio-Economics 33 (5): 587–606. doi:10.1016/j.socec.2004.09.033.
[22] Cohen, Jacob (December 1994). "The earth is round (p < .05)". American Psychologist 49 (12): 997–1003. ISSN 0003-066X.
[23] http://www.jasnh.com/
[24] Jones LV, Tukey JW (December 2000). "A sensible formulation of the significance test" (http://content.apa.org/journals/met/5/4/411). Psychol Methods 5 (4): 411–4. doi:10.1037/1082-989X.5.4.411. PMID 11194204.

• Lehmann, E.L. (1992). "Introduction to Neyman and Pearson (1933) On the Problem of the Most Efficient Tests of Statistical Hypotheses". In: Breakthroughs in Statistics, Volume 1 (Eds Kotz, S., Johnson, N.L.). New York: Springer. ISBN 0-387-94037-5 (followed by reprinting of the paper)
• Neyman, J.; Pearson, E.S. (1933). "On the Problem of the Most Efficient Tests of Statistical Hypotheses". Phil. Trans. Roy. Soc., Series A, 231: 289–337.

External links

• A Guide to Understanding Hypothesis Testing by Laerd Statistics (http://statistics.laerd.com/spss-articles/hypothesis-testing)
• Wilson González, Georgina; Karpagam Sankaran (September 10, 1997). "Hypothesis Testing" (http://www.cee.vt.edu/ewr/environmental/teach/smprimer/hypotest/ht.html). Environmental Sampling & Monitoring Primer. Virginia Tech.
• Bayesian critique of classical hypothesis testing (http://www.cs.ucsd.edu/users/goguen/courses/275f00/stat.html)
• Critique of classical hypothesis testing highlighting long-standing qualms of statisticians (http://www.npwrc.usgs.gov/resource/methods/statsig/stathyp.htm)
• Dallal GE (2007) The Little Handbook of Statistical Practice (http://www.tufts.edu/~gdallal/LHSP.HTM) (A good tutorial)
• References for arguments for and against hypothesis testing (http://core.ecu.edu/psyc/wuenschk/StatHelp/NHST-SHIT.htm)
• Statistical Tests Overview (http://www.wiwi.uni-muenster.de/ioeb/en/organisation/pfaff/stat_overview_table.html) How to choose the correct statistical test
• An Interactive Online Tool to Encourage Understanding Hypothesis Testing (http://wasser.heliohost.org/?l=en)

Meta-analysis

In statistics, a meta-analysis combines the results of several studies that address a set of related research hypotheses. This is normally done by identification of a common measure of effect size, which is modelled using a form of meta-regression. Resulting overall averages when controlling for study characteristics can be considered meta-effect sizes, which are more powerful estimates of the true effect size than those derived in a single study under a given single set of assumptions and conditions.

From the perspective of the systemic TOGA (Top-down Object-based Goal-oriented Approach, 1993) meta-theory, a meta-analysis is an analysis performed on the level of tools applied to a particular or more general domain-oriented analysis. In such declarative formalization, any analysis A relates to a problem P in an arbitrarily selected ontological domain D, and is performed using some analysis tools T. Therefore we may write: A(P, D, T). Roughly speaking, a meta-analysis domain is Tx (such as a method, algorithm, or methodology) in the Px, Dx context; for a given Ax, it means MA(Pm, Dm, Tm), where Dm = (Px, Dx, Tx) and MA denotes a meta-analysis operator. The above systemic definition is congruent with definitions of meta-theory, meta-knowledge and meta-system.

History

The first meta-analysis was performed by Karl Pearson in 1904, in an attempt to overcome the problem of reduced statistical power in studies with small sample sizes; analyzing the results from a group of studies can allow more accurate data analysis.[1] [2] However, the first meta-analysis of all conceptually identical experiments concerning a particular research issue, and conducted by independent researchers, has been identified as the 1940 book-length publication Extra-sensory perception after sixty years, authored by Duke University psychologists J. B. Rhine and associates.[3] This encompassed a review of 145 reports on ESP experiments published from 1882 to 1939, and included an estimate of the influence of unpublished papers on the overall effect (the file-drawer problem). Although meta-analysis is widely used in epidemiology and evidence-based medicine today, a meta-analysis of a medical treatment was not published until 1955. In the 1970s, more sophisticated analytical techniques were introduced in educational research, starting with the work of Gene V. Glass, Frank L. Schmidt and John E. Hunter. The online Oxford English Dictionary lists the first usage of the term in the statistical sense as 1976 by Glass.[4] The statistical theory surrounding meta-analysis was greatly advanced by the work of Nambury S. Raju, Larry V. Hedges, Harris Cooper, Ingram Olkin, John E. Hunter, Jacob Cohen, Thomas C. Chalmers, and Frank L. Schmidt.

Advantages of meta-analysis

Advantages of meta-analysis (e.g. over classical literature reviews, simple overall means of effect sizes, etc.) include:
• Shows if the results are more varied than what is expected from the sample diversity
• Derivation and statistical testing of overall factors / effect size parameters in related studies
• Generalization to the population of studies
• Ability to control for between-study variation
• Including moderators to explain variation
• Higher statistical power to detect an effect than in an 'n=1 sized study sample'

Steps in a meta-analysis

1. Search of literature
2. Selection of studies ('incorporation criteria')
   • Based on quality criteria, e.g. the requirement of randomization and blinding in a clinical trial
   • Selection of specific studies on a well-specified subject, e.g. the treatment of breast cancer
   • Decide whether unpublished studies are included to avoid publication bias (file drawer problem: see below)
3. Decide which dependent variables or summary measures are allowed. For instance:
   • Differences (discrete data)
   • Means (continuous data)
   • Hedges' g is a popular summary measure for continuous data that is standardized in order to eliminate scale differences, but it incorporates an index of variation between groups: g = (μt − μc) / σ, in which μt is the treatment mean, μc is the control mean, and σ² the pooled variance.
4. Model selection (see next paragraph)

For reporting guidelines, see the QUOROM statement [5] [6].

Meta-regression models

Generally, three types of models can be distinguished in the literature on meta-analysis: simple regression, fixed effects meta-regression and random effects meta-regression.

Simple regression

The model can be specified as

  yj = β0 + β1·x1j + β2·x2j + … + εj

where yj is the effect size in study j, β0 (intercept) is the estimated overall effect size, the xij are parameters specifying different study characteristics, and εj specifies the between study variation. Note that this model does not allow specification of within study variation.

Fixed-effects meta-regression

Fixed-effects meta-regression assumes that the true effect size θ is normally distributed, N(θ, σθ²), where σθ² is the within study variance of the effect size. A fixed effects meta-regression model thus allows for within study variability, but no between study variability, because all studies have the same expected fixed effect size θ:

  yj = β0 + β1·x1j + … + ηj, with ηj ~ N(0, σθj²)

where σθj² is the variance of the effect size in study j. Fixed effects meta-regression ignores between study variation; as a result, parameter estimates are biased if between study variation can not be ignored. Furthermore, generalizations to the population are not possible.

Random effects meta-regression

Random effects meta-regression rests on the assumption that θ is a random variable following a (hyper-)distribution:

  yj = β0 + β1·x1j + … + ηj + εj

where again σj² is the variance of the effect size in study j. Between study variance is estimated using common estimation procedures for random effects models (restricted maximum likelihood (REML) estimators).

Applications in modern science

Modern statistical meta-analysis does more than just combine the effect sizes of a set of studies. It can test if the outcomes of studies show more variation than the variation that is expected because of sampling different research participants. If that is the case, study characteristics such as measurement instrument used, population sampled, or aspects of the studies' design are coded. These characteristics are then used as predictor variables to analyze the excess variation in the effect sizes. Some methodological weaknesses in studies can be corrected statistically; for example, it is possible to correct effect sizes or correlations for the downward bias due to measurement error or restriction on score ranges. Meta-analysis can be done with single-subject designs as well as group research designs. This is important because much of the research on low-incidence populations has been done with single-subject research designs. Considerable dispute exists for the most appropriate meta-analytic technique for single subject research.[7] Meta-analysis leads to a shift of emphasis from single studies to multiple studies. It emphasizes the practical importance of the effect size instead of the statistical significance of individual studies. This shift in thinking has been termed "meta-analytic thinking". The results of a meta-analysis are often shown in a forest plot.

Results from studies are combined using different approaches. One approach frequently used in meta-analysis in health care research is termed the 'inverse variance method'. The average effect size across all studies is computed as a weighted mean, whereby the weights are equal to the inverse variance of each study's effect estimator. Larger studies and studies with less random variation are given greater weight than smaller studies. Other common approaches include the Mantel–Haenszel method[8] and the Peto method. A recent approach to studying the influence that weighting schemes can have on results has been proposed through the construct of gravity, which is a special case of combinatorial meta-analysis. Signed differential mapping is a statistical technique for meta-analyzing studies on differences in brain activity or structure which used neuroimaging techniques such as fMRI, VBM or PET.
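The effect-size and pooling machinery above can be sketched in a few lines. The small-sample correction in hedges_g and the DerSimonian–Laird estimate of the between-study variance τ² used for the random-effects step are standard textbook choices, not anything this article prescribes; the study figures are hypothetical:

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardized mean difference with Hedges' small-sample correction."""
    df = n_t + n_c - 2
    s_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    return (mean_t - mean_c) / s_pooled * (1 - 3 / (4 * df - 1))

def pool_fixed(effects, variances):
    """Fixed-effect pooling: weighted mean with weights = 1/variance,
    so larger, less noisy studies count for more."""
    w = [1.0 / v for v in variances]
    estimate = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    std_error = (1.0 / sum(w)) ** 0.5
    return estimate, std_error

def pool_random(effects, variances):
    """Random-effects pooling, DerSimonian-Laird tau^2 (one common choice)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)             # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]   # widen each study's variance
    estimate = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    std_error = (1.0 / sum(w_re)) ** 0.5
    return estimate, std_error, tau2

# Three hypothetical studies: effect sizes and their within-study variances
effects, variances = [0.30, 0.45, 0.60], [0.04, 0.02, 0.08]
print(pool_fixed(effects, variances))
```

When the studies vary no more than their sampling error explains (Q below its expectation k−1), τ² is truncated to zero and the random-effects answer collapses to the fixed-effect one.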

Weaknesses

Meta-analysis can never follow the rules of hard science; it can only be a statistical examination of scientific studies, not an actual scientific study, itself. A weakness of the method is that sources of bias are not controlled by the method: a good meta-analysis of badly designed studies will still result in bad statistics. Robert Slavin has argued that only methodologically sound studies should be included in a meta-analysis, a practice he calls 'best evidence meta-analysis'. Other meta-analysts would include weaker studies, and add a study-level predictor variable that reflects the methodological quality of the studies (for example, being double-blind, randomized, controlled) to examine the effect of study quality on the effect size. In addition, it has not been determined if the statistically most accurate method for combining results is the fixed effects model or the random effects model; furthermore, for medicine, the underlying risk in each studied group is of significant importance, and there is no universally agreed-upon way to weight the risk. Other weaknesses are Simpson's Paradox (two smaller studies may point in one direction, and the combination study in the opposite direction); the coding of an effect is subjective; the decision to include or reject a particular study is subjective; there are two different ways to measure effect (correlation or standardized mean difference); and the interpretation of effect size is purely arbitrary.

File drawer problem

Another weakness of the method is the heavy reliance on published studies, which may create exaggerated outcomes, as it is very hard to publish studies that show no significant results.[9] [10] For any given research area, one cannot know how many studies have been conducted but never reported and the results filed away.[9] This file drawer problem results in the distribution of effect sizes that are biased, skewed or completely cut off, creating a serious base rate fallacy, in which the significance of the published studies is overestimated. For example, if there were fifty tests, and only ten got results, then the real outcome is only 20% as significant as it appears, except that the other 80% were not submitted for publishing, or thrown out by publishers as uninteresting. This should be seriously considered when interpreting the outcomes of a meta-analysis. This can be visualized with a funnel plot, which is a scatter plot of sample size and effect sizes. There are several procedures available that attempt to correct for the file drawer problem, once identified, such as guessing at the cut off part of the distribution of study effects, or proposing a way to falsify the theory in question.

Dangers of agenda-driven bias

The most severe weakness and abuse of meta-analysis often occurs when the person or persons doing the meta-analysis have an economic, social, or political agenda, such as the passage or defeat of legislation. Those persons with these types of agenda have a high likelihood to abuse meta-analysis due to personal bias. For example, researchers favorable to the author's agenda are likely to have their studies "cherry picked" while those not favorable will be ignored or labeled as "not credible". In addition, the favored authors may themselves be biased or paid to produce results that support their overall political, social, or economic goals, in ways such as selecting small favorable data sets and not incorporating larger unfavorable data sets. If a meta-analysis is conducted by an individual or organization with a bias or predetermined desired outcome, it should be treated as highly suspect or having a high likelihood of being "junk science". From an integrity perspective, researchers with a bias should avoid meta-analysis and use a less abuse-prone (or independent) form of research. The example provided by the Rind et al. controversy illustrates an application of meta-analysis which has been the subject of subsequent criticisms of many of the components of the meta-analysis.
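One classical correction procedure of the kind mentioned above is Rosenthal's fail-safe N, which asks how many unpublished null-result studies would have to sit in file drawers before a combined (Stouffer) z-score dropped below significance. The sketch below assumes each published study is summarized by a z-score; the figures are illustrative:

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N: number of additional z = 0 (null) studies
    needed to pull the Stouffer combined z, sum(z)/sqrt(k), below z_alpha."""
    k = len(z_scores)
    total = sum(z_scores)
    return max(0.0, (total / z_alpha) ** 2 - k)

# Three hypothetical published studies, each individually significant
print(fail_safe_n([2.0, 2.1, 2.3]))
```

A small fail-safe N (as here, roughly a dozen studies) means the pooled result is fragile: a modest number of unpublished null results would overturn it.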

A funnel plot expected without the file drawer problem
A funnel plot expected with the file drawer problem

See also

• Epidemiologic methods
• Meta-analytic thinking
• Newcastle–Ottawa scale
• Review journal
• Study heterogeneity
• Systematic review

Further reading

• Thompson, Simon G.; Pocock, Stuart J (2 November 1991). "Can meta-analysis be trusted?" [11]. The Lancet 338: 1127–1130. Retrieved 2009-09-10. Explores two contrasting views: does meta-analysis provide "objective, quantitative methods for combining evidence from separate but similar studies", or merely "statistical tricks which make unjustified assumptions in producing oversimplified generalisations out of a complex of disparate studies"?
• Cooper, H. & Hedges, L.V. (1994). The Handbook of Research Synthesis. New York: Russell Sage.
• Cornell, J. E. & Mulrow, C. D. (1999). Meta-analysis. In: H. J. Adèr & G. J. Mellenbergh (Eds), Research Methodology in the social, behavioral and life sciences (pp. 285–323). London: Sage.
• Norman, S.-L. T. (1999). Tutorial in Biostatistics. Meta-Analysis: Formulating, Evaluating, Combining, and Reporting. Statistics in Medicine, 18, 321–359.
• Sutton, A.J., Abrams, K.R., Jones, D.R., Sheldon, T.A., & Song, F. (2000). Methods for Meta-analysis in Medical Research. London: John Wiley. ISBN 0-471-49066-0.
• Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.1 [updated September 2008]. The Cochrane Collaboration, 2008. Available from www.cochrane-handbook.org
• Owen, A. B. (2009). Karl Pearson's meta-analysis revisited. Annals of Statistics, 37 (6B), 3867–3892.

References

[1] O'Rourke, Keith (2007-12-01). "An historical perspective on meta-analysis: dealing quantitatively with varying study results" (http://jrsm.rsmjournals.com). J R Soc Med 100 (12): 579–582. doi:10.1258/jrsm.100.12.579. PMID 18065712. Retrieved 2009-09-10.
[2] Egger, M; G D Smith (1997-11-22). "Meta-Analysis. Potentials and promise" (http://www.bmj.com/archive/7119/7119ed.htm). BMJ (Clinical Research Ed.) 315 (7119): 1371–1374. ISSN 0959-8138. Retrieved 2009-09-10.
[3] Bösch, H. (2004). Reanalyzing a meta-analysis on extra-sensory perception dating from 1940, the first comprehensive meta-analysis in the history of science. Proceedings of the 47th Annual Convention of the Parapsychological Association, University of Vienna. (pp. 1–13)
[4] meta-analysis. (http://dictionary.oed.com/cgi/entry/00307098?single=1&query_type=word&queryword=meta-analysis) Oxford English Dictionary, Draft Entry June 2008. Accessed 28 March 2009. "1976 G. V. Glass in Educ. Res. Nov. 3/2 My major interest currently is in what we have come to call ... the meta-analysis of research. The term is a bit grand, but it is precise and apt ... Meta-analysis refers to the analysis of analyses."
[5] http://www.consort-statement.org/index.aspx?o=1346
[6] http://www.consort-statement.org/resources/related-guidelines-and-initiatives/
[7] Van den Noortgate, W. & Onghena, P. (2007). Aggregating Single-Case Results. The Behavior Analyst Today, 8(2), 196–209. BAO (http://www.baojournal.com)
[8] Mantel, N.; Haenszel, W. (1959). "Statistical aspects of the analysis of data from the retrospective analysis of disease". Journal of the National Cancer Institute 22 (4): 719–748. PMID 13655060.
[9] Rosenthal, Robert (1979). "The 'File Drawer Problem' and the Tolerance for Null Results". Psychological Bulletin 86 (3): 638–641.
[10] Hunter, John E.; Schmidt, Frank L (1990). Methods of Meta-Analysis: Correcting Error and Bias in Research Findings. Newbury Park, California; London; New Delhi: SAGE Publications.
[11] http://tobaccodocuments.org/pm/2047231315-1318.html

External links

• Cochrane Handbook for Systematic Reviews of Interventions (http://www.cochrane-handbook.org)
• Effect Size and Meta-Analysis (http://www.ericdigests.org/2003-4/meta-analysis.htm) (ERIC Digest)
• Meta-Analysis in Educational Research (http://www.ericdigests.org/1992-5/meta.htm) (ERIC Digest)
• Meta-Analysis: Methods of Accumulating Results Across Research Domains (http://www.lyonsmorris.com/MetaA/) (article by Larry Lyons)
• Meta-analysis (http://www.psychwiki.com/wiki/Meta-analysis) (Psychwiki.com article)
• Meta-Analysis at 25 (Gene V Glass) (http://glass.ed.asu.edu/gene/papers/meta25.html)

Software

• ClinTools (http://www.clintools.com) (commercial)
• Comprehensive Meta-Analysis (http://www.metaanalysis.com) (commercial)
• MIX 2.0 (http://www.meta-analysis-made-easy.com) Software for professional meta-analysis in Excel (free and commercial versions available)
• What meta-analysis features are available in Stata (http://www.stata.com/support/faqs/stat/meta.html)? (free add-ons to commercial package)
• The Meta-Analysis Calculator (http://www.lyonsmorris.com/lyons/metaAnalysis/index.cfm) free on-line tool for conducting a meta-analysis
• Metastat (http://edres.org/meta/metastat.htm) (Free)
• Meta-Analyst (http://tuftscaes.org/meta_analyst/) Free Windows-based tool for meta-analysis of binary, continuous and diagnostic data
• Revman (http://www.cc-ims.net/revman) Free software for meta-analysis and preparation of Cochrane protocols and reviews, available from the Cochrane Collaboration

Women. often a clinical trial is managed by an outsourced partner such as a contract research organization or a clinical trials unit in the academic sector. If the sponsor cannot obtain enough patients with this specific disease or condition at one location. Therapy A vs. The researchers send the data to the trial sponsor who then analyzes the pooled data using statistical tests. and what kind of patients might benefit from the medication or device. one or more pilot experiments are conducted to gain insights for design of the clinical trial to follow. the burden of paying for all the necessary people and services is usually borne by the sponsor who may be a governmental organization.g. standard medication or device ("the gold standard" or "standard therapy") • Compare the effectiveness in patients with a specific disease of two or more already approved or common interventions for that disease (e.Clinical trial 47 Clinical trial Clinical trials are conducted to allow safety and efficacy data to be collected for health interventions (e. the investigators: recruit patients with the predetermined characteristics. As positive safety and efficacy data are gathered. investigators enroll healthy volunteers and/or patients into small pilot studies initially. and whether the patient's health improves or not.g. In medical jargon. the sponsor decides what to compare the new agent with (one or more existing treatments or a placebo). Device A vs. and Health Authority/Ethics Committee approval is granted in the country where the trial is taking place. 10 mg dose instead of 5 mg dose) • Assess the safety and effectiveness of an already marketed medication or device for a new indication. the number of patients is typically increased. and people with unrelated medical conditions are also frequently excluded. patients who have been diagnosed with Alzheimer's disease) • assess the safety and effectiveness of a different dose of a medication than is commonly used (e. 
they are often excluded from trials because their more frequent health issues and drug use produces unreliable data. therapy protocols). a pharmaceutical. Device B.e. These trials can take place only after satisfactory information has been gathered on the quality of the non-clinical safety. Since the diversity of roles may exceed resources of the sponsor. In the U... the elderly comprise only 14% of the population but they consume over one-third of drugs. or biotechnology company..g.[1] Despite this. Usually. then investigators at other locations who can obtain the same kind of patients to receive the treatment would be recruited into the study. the sponsor or investigator first identifies the medication or device to be tested. devices.. followed by larger scale studies in patients that often compare the new product with the currently prescribed treatment. Due to the sizable cost a full series of clinical trials may incur. These data include measurements like vital signs. i. Some examples of what a clinical trial may be designed to do: • Assess the safety and effectiveness of a new medication or device on a specific kind of patient (e. concentration of the study drug in the blood. children.g.[2] In coordination with a panel of expert investigators (usually physicians well known for their publications and clinical experience). and collect data on the patients' health for a defined time period. a disease for which the drug is not specifically approved • Assess whether the new medication or device is more effective for the patient's condition than the already used. Therapy B) . effectiveness is how well a treatment works in practice and efficacy is how well it works in a clinical trial. Clinical trials can vary in size from a single center in one country to multicenter trials in multiple countries. diagnostics. drugs. During the clinical trial. Overview In planning a clinical trial. administer the treatment(s).S.. 
Depending on the type of product and the stage of its development.

Clinical trial

Note that while most clinical trials compare two medications or devices, some trials compare three or four medications, doses of medications, or devices against each other. The most commonly performed clinical trials evaluate new drugs, medical devices (like a new catheter), biologics, psychological therapies, or other interventions. Clinical trials may be required before the national regulatory authority[3] approves marketing of the drug or device, or a new dose of the drug, for use on patients. Because the clinical trial is designed to test hypotheses and rigorously monitor and assess what happens, clinical trials can be seen as the application of the scientific method to understanding human or animal biology. Synonyms for 'clinical trials' include clinical studies, research protocols and clinical research.

Beginning in the 1980s, harmonization of clinical trial protocols was shown as feasible across countries of the European Union. At the same time, coordination between Europe, Japan and the United States led to a joint regulatory-industry initiative on international harmonization named after 1990 as the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH).[4] Currently, most clinical trial programs follow ICH guidelines, aimed at "ensuring that good quality, safe and effective medicines are developed and registered in the most efficient and cost-effective manner. These activities are pursued in the interest of the consumer and public health, to prevent unnecessary duplication of clinical trials in humans and to minimize the use of animal testing without compromising the regulatory obligations of safety and effectiveness."[5]

Except for very small trials limited to a single location, the clinical trial design and objectives are written into a document called a clinical trial protocol. The protocol is the 'operating manual' for the clinical trial, and ensures that researchers in different locations all perform the trial in the same way on patients with the same characteristics. (This uniformity is designed to allow the data to be pooled.) A protocol is always used in multicenter trials.

History

The history of clinical trials before 1750 is easily summarized: there were no clinical trials.[6] [7] Clinical trials were first introduced in Avicenna's The Canon of Medicine in 1025 AD, in which he laid down rules for the experimental use and testing of drugs and wrote a precise guide for practical experimentation in the process of discovering and proving the effectiveness of medical drugs and substances.[8] He laid out the following rules and principles for testing the effectiveness of new drugs and medications, which still form the basis of modern clinical trials:[9] [10]

1. The drug must be free from any extraneous accidental quality.
2. It must be used on a simple, not a composite, disease.
3. The drug must be tested with two contrary types of diseases, because sometimes a drug cures one disease by its essential qualities and another by its accidental ones.
4. The quality of the drug must correspond to the strength of the disease. For example, there are some drugs whose heat is less than the coldness of certain diseases, so that they would have no effect on them.
5. The time of action must be observed, so that essence and accident are not confused.
6. The effect of the drug must be seen to occur constantly or in many cases, for if this did not happen, it was an accidental effect.
7. The experimentation must be done with the human body, for testing a drug on a lion or a horse might not prove anything about its effect on man.

One of the most famous clinical trials was James Lind's demonstration in 1747 that citrus fruits cure scurvy.[11] He compared the effects of various different acidic substances, ranging from vinegar to cider, on groups of afflicted sailors, and found that the group who were given oranges and lemons had largely recovered from scurvy after 6 days.

Frederick Akbar Mahomed (d. 1884), who worked at Guy's Hospital in London,[12] made substantial contributions to the process of clinical trials during his detailed clinical studies, where "he separated chronic nephritis with secondary hypertension from what we now term essential hypertension."

He also founded "the Collective Investigation Record for the British Medical Association"; this organization collected data from physicians practicing outside the hospital setting and was the precursor of modern collaborative clinical trials.[13]

Types

One way of classifying clinical trials is by the way the researchers behave.
• In an observational study, the investigators observe the subjects and measure their outcomes. The researchers do not actively manage the experiment. An example is the Nurses' Health Study.
• In an interventional study, the investigators give the research subjects a particular medicine or other intervention. Usually, they compare the treated subjects to subjects who receive no treatment or standard treatment. Then the researchers measure how the subjects' health changes.

Another way of classifying trials is by their purpose. The U.S. National Institutes of Health (NIH) organizes trials into five (5) different types:[14]
• Prevention trials: look for better ways to prevent disease in people who have never had the disease or to prevent a disease from returning. These approaches may include medicines, vitamins, vaccines, minerals, or lifestyle changes.
• Screening trials: test the best way to detect certain diseases or health conditions.
• Diagnostic trials: conducted to find better tests or procedures for diagnosing a particular disease or condition.
• Treatment trials: test experimental treatments, new combinations of drugs, or new approaches to surgery or radiation therapy.
• Quality of life trials: explore ways to improve comfort and the quality of life for individuals with a chronic illness (a.k.a. Supportive Care trials).
• Compassionate use trials or expanded access: provide partially tested, unapproved therapeutics to a small number of patients that have no other realistic options. Usually, this involves a disease for which no effective therapy exists, or a patient that has already attempted and failed all other standard treatments and whose health is so poor that he does not qualify for participation in randomized clinical trials. Usually, case by case approval must be granted by both the FDA and the pharmaceutical company for such exceptions.

Design

A fundamental distinction in evidence-based medicine is between observational studies and randomized controlled trials. Types of observational studies in epidemiology such as the cohort study and the case-control study provide less compelling evidence than the randomized controlled trial. In observational studies, the investigators only observe associations (correlations) between the treatments experienced by participants and their health status or diseases. A randomized controlled trial is the study design that can provide the most compelling evidence that the study treatment causes the expected effect on human health.

Currently, some Phase II and most Phase III drug trials are designed as randomized, double blind, and placebo-controlled.
• Randomized: Each study subject is randomly assigned to receive either the study treatment or a placebo.
• Blind: The subjects involved in the study do not know which study treatment they receive. If the study is double-blind, the researchers also do not know which treatment is being given to any given subject. This 'blinding' is to prevent biases, since if a physician knew which patient was getting the study treatment and which patient was getting the placebo, he/she might be tempted to give the (presumably helpful) study drug to a patient who could more easily benefit from it. In addition, a physician might give extra care to only the patients who receive the placebos to compensate for their ineffectiveness. A form of double-blind study called a "double-dummy" design allows additional insurance against bias or placebo effect.
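The random, balanced assignment described above can be sketched in a few lines. This is only an illustrative sketch, not the procedure of any particular trial system: it uses randomly permuted blocks (a common technique for keeping the two arms balanced as subjects enroll), and the subject IDs and block size are made up for the example.

```python
import random

def block_randomize(subject_ids, block_size=4, seed=None):
    """Assign subjects to 'treatment' or 'placebo' in randomly permuted
    blocks, so the two arms stay balanced as enrollment proceeds."""
    assert block_size % 2 == 0, "need equal slots per arm in each block"
    rng = random.Random(seed)
    assignments = {}
    for start in range(0, len(subject_ids), block_size):
        block = subject_ids[start:start + block_size]
        # Equal numbers of each arm per block, in a random order.
        arms = (["treatment", "placebo"] * (block_size // 2))[:len(block)]
        rng.shuffle(arms)
        assignments.update(zip(block, arms))
    return assignments

alloc = block_randomize([f"S{i:03d}" for i in range(8)], seed=42)
print(sum(arm == "treatment" for arm in alloc.values()))  # balanced: 4
```

In a blinded trial, a table like `alloc` would be held by an unblinded party (for example, the pharmacy), while subjects and, in a double-blind design, investigators see only coded kit numbers.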

In this kind of study, all patients are given both placebo and active doses in alternating periods of time during the study.
• Placebo-controlled: The use of a placebo (fake treatment) allows the researchers to isolate the effect of the study treatment.

Although the term "clinical trials" is most commonly associated with the large, randomized studies typical of Phase III, many clinical trials are small. They may be "sponsored" by single physicians or a small group of physicians, and are designed to test simple questions. In the field of rare diseases sometimes the number of patients might be the limiting factor for a clinical trial. Other clinical trials require large numbers of participants (who may be followed over long periods of time), and the trial sponsor is a private company, a government health agency, or an academic research body such as a university.

Active comparator studies

Of note, during the last ten years or so it has become a common practice to conduct "active comparator" studies (also known as "active control" trials) when a treatment exists that is clearly better than doing nothing for the subject (i.e. giving them the placebo); in these, the alternate treatment would be a standard-of-care therapy. In other words, the study would compare the 'test' treatment to standard-of-care therapy.

A growing trend in the pharmacology field involves the use of third-party contractors to obtain the required comparator compounds. Such third parties provide expertise in the logistics of obtaining, storing, and shipping the comparators. As an advantage to the manufacturer of the comparator compounds, a well-established comparator sourcing agency can alleviate the problem of parallel importing (importing a patented compound for sale in a country outside the patenting agency's sphere of influence).

Clinical trial protocol

A clinical trial protocol is a document used to gain confirmation of the trial design by a panel of experts and adherence by all study investigators, even if conducted in various countries. The protocol describes the scientific rationale, objective(s), design, methodology, statistical considerations, and organization of the planned trial. The protocol contains a precise study plan for executing the clinical trial, not only to assure safety and health of the trial subjects, but also to provide an exact template for trial conduct by investigators at multiple locations (in a "multicenter" trial) to perform the study in exactly the same way. This harmonization allows data to be combined collectively as though all investigators (referred to as "sites") were working closely together. The protocol also gives the study administrators (often a contract research organization) as well as the site team of physicians, nurses and clinic administrators a common reference document for site responsibilities during the trial. Details of the trial are also provided in other documents referenced in the protocol, such as an Investigator's Brochure.

The format and content of clinical trial protocols sponsored by pharmaceutical, biotechnology or medical device companies in the United States, European Union, or Japan has been standardized to follow Good Clinical Practice guidance[15] issued by the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH).[16] Regulatory authorities in Canada and Australia also follow ICH guidelines. Some journals, e.g. Trials, encourage trialists to publish their protocols in the journal.

Design features

Informed consent

An essential component of initiating a clinical trial is to recruit study subjects following procedures using a signed document called "informed consent."[17] Informed consent is a legally-defined process of a person being told about key facts involved in a clinical trial before deciding whether or not to participate. To fully describe participation to a candidate subject, the doctors and nurses involved in the trial explain the details of the study using terms the person will understand. Foreign language translation is provided if the participant's native language is not the same as the study protocol. The research team provides an informed consent document that includes trial details, such as its purpose, duration, required procedures, risks, potential benefits and key contacts. The participant then decides whether or not to sign the document in agreement. Informed consent is not an immutable contract, as the participant can withdraw at any time without penalty.

Statistical power

In designing a clinical trial, a sponsor must decide on the target number of patients who will participate. The sponsor's goal usually is to obtain a statistically significant result showing a significant difference in outcome (e.g. number of deaths after 28 days in the study) between the groups of patients who receive the study treatments. The number of patients required to give a statistically significant result depends on the question the trial wants to answer. For example, to show the effectiveness of a new drug in a non-curable disease such as metastatic kidney cancer requires many fewer patients than in a highly curable disease such as seminoma, if the drug is compared to a placebo.

The number of patients enrolled in a study has a large bearing on the ability of the study to reliably detect the size of the effect of the study intervention. This is described as the "power" of the trial. The larger the sample size or number of participants in the trial, the greater the statistical power. However, in designing a clinical trial, this consideration must be balanced with the fact that more patients make for a more expensive trial. The power of a trial is not a single, unique value; it estimates the ability of a trial to detect a difference of a particular size (or larger) between the treated (tested drug/device) and control (placebo or standard treatment) groups. For example, a trial of a lipid-lowering drug versus placebo with 100 patients in each group might have a power of .90 to detect a difference between patients receiving study drug and patients receiving placebo of 10 mg/dL or more, but only have a power of .70 to detect a difference of 5 mg/dL.

Placebo groups

Merely giving a treatment can have nonspecific effects, and these are controlled for by the inclusion of a placebo group. Subjects in the treatment and placebo groups are assigned randomly and blinded as to which group they belong. Since researchers can behave differently towards subjects given treatments or placebos, trials are also double-blinded so that the researchers do not know to which group a subject is assigned. Assigning a person to a placebo group can pose an ethical problem if it violates his or her right to receive the best available treatment. The Declaration of Helsinki provides guidelines on this issue.
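A power calculation of the kind sketched above can be illustrated with a short example. This is only a sketch under simplifying assumptions (a two-sided, two-sample z-test with a known common standard deviation); the 20 mg/dL standard deviation is an invented value for illustration, so the numbers will not exactly reproduce any particular trial's calculation.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test to detect a
    true difference in means `delta`, given a common standard deviation
    `sigma` and equal group sizes."""
    z = NormalDist()
    se = sigma * sqrt(2.0 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    return z.cdf(delta / se - z_crit)      # chance of a significant result

# 100 patients per group, assumed SD of 20 mg/dL (illustrative only):
print(round(power_two_sample(10, 20, 100), 2))  # well powered for 10 mg/dL
print(round(power_two_sample(5, 20, 100), 2))   # underpowered for 5 mg/dL
```

As in the lipid-lowering example, the same trial is well powered for a large difference and poorly powered for a small one; increasing `n_per_group` raises both figures, at the cost of a more expensive trial.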

Phases

Clinical trials involving new drugs are commonly classified into four phases. Each phase of the drug approval process is treated as a separate clinical trial. The drug-development process will normally proceed through all four phases over many years. If the drug successfully passes through Phases I, II, and III, it will usually be approved by the national regulatory authority for use in the general population. Phase IV are 'post-approval' studies.

Pre-clinical studies

Before pharmaceutical companies start clinical trials on a drug, they conduct extensive pre-clinical studies. These involve in vitro (test tube) and in vivo (animal or cell culture) experiments using wide-ranging doses of the study drug to obtain preliminary efficacy, toxicity and pharmacokinetic information. Such tests assist pharmaceutical companies to decide whether a drug candidate has scientific merit for further development as an investigational new drug.

Phase 0

Phase 0 is a recent designation for exploratory, first-in-human trials conducted in accordance with the United States Food and Drug Administration's (FDA) 2006 Guidance on Exploratory Investigational New Drug (IND) Studies.[18] Phase 0 trials are also known as human microdosing studies and are designed to speed up the development of promising drugs or imaging agents by establishing very early on whether the drug or agent behaves in human subjects as was expected from preclinical studies. Distinctive features of Phase 0 trials include the administration of single subtherapeutic doses of the study drug to a small number of subjects (10 to 15) to gather preliminary data on the agent's pharmacokinetics (how the body processes the drug) and pharmacodynamics (how the drug works in the body).[19] A Phase 0 study gives no data on safety or efficacy, being by definition a dose too low to cause any therapeutic effect. Drug development companies carry out Phase 0 studies to rank drug candidates in order to decide which has the best pharmacokinetic parameters in humans to take forward into further development. They enable go/no-go decisions to be based on relevant human models instead of relying on sometimes inconsistent animal data. Questions have been raised by experts about whether Phase 0 trials are useful, ethically acceptable, feasible, speed up the drug development process or save money, and whether there is room for improvement.[20]

Phase I

Phase I trials are the first stage of testing in human subjects. Normally, a small (20-100) group of healthy volunteers will be selected. This phase includes trials designed to assess the safety (pharmacovigilance), tolerability, pharmacokinetics, and pharmacodynamics of a drug. These trials are often conducted in an inpatient clinic, where the subject can be observed by full-time staff. The subject who receives the drug is usually observed until several half-lives of the drug have passed. Phase I trials also normally include dose-ranging, also called dose escalation, studies so that the appropriate dose for therapeutic use can be found. The tested range of doses will usually be a fraction of the dose that causes harm in animal testing. Phase I trials most often include healthy volunteers. However, there are some circumstances when real patients are used, such as patients who have terminal cancer or HIV and lack other treatment options. Volunteers are paid an inconvenience fee for their time spent in the volunteer centre. Pay ranges from a small amount of money for a short period of residence, to a larger amount of up to approx $6000 depending on length of participation.

There are different kinds of Phase I trials:

SAD
Single Ascending Dose studies are those in which small groups of subjects are given a single dose of the drug while they are observed and tested for a period of time. If they do not exhibit any adverse side effects, and the pharmacokinetic data is roughly in line with predicted safe values,
the dose is escalated, and a new group of subjects is then given a higher dose. This is continued until pre-calculated pharmacokinetic safety levels are reached, or intolerable side effects start showing up (at which point the drug is said to have reached the Maximum tolerated dose (MTD)).

MAD
Multiple Ascending Dose studies are conducted to better understand the pharmacokinetics & pharmacodynamics of multiple doses of the drug. In these studies, a group of patients receives multiple low doses of the drug, while samples (of blood, and other fluids) are collected at various time points and analyzed to understand how the drug is processed within the body. The dose is subsequently escalated for further groups, up to a predetermined level.

Food effect
A short trial designed to investigate any differences in absorption of the drug by the body caused by eating before the drug is given. These studies are usually run as a crossover study, with volunteers being given two identical doses of the drug on different occasions; one while fasted, and one after being fed.

Phase II

Once the initial safety of the study drug has been confirmed in Phase I trials, Phase II trials are performed on larger groups (20-300) and are designed to assess how well the drug works, as well as to continue Phase I safety assessments in a larger group of volunteers and patients. When the development process for a new drug fails, this usually occurs during Phase II trials, when the drug is discovered not to work as planned or to have toxic effects.

Phase II studies are sometimes divided into Phase IIA and Phase IIB.
• Phase IIA is specifically designed to assess dosing requirements (how much drug should be given).
• Phase IIB is specifically designed to study efficacy (how well the drug works at the prescribed dose(s)).
Some trials combine Phase I and Phase II, and test both efficacy and toxicity.

Trial design
Some Phase II trials are designed as case series, demonstrating a drug's safety and activity in a selected group of patients. Other Phase II trials are designed as randomized clinical trials, where some patients receive the drug/device and others receive placebo/standard treatment. Randomized Phase II trials have far fewer patients than randomized Phase III trials.

Phase III

Phase III studies are randomized controlled multicenter trials on large patient groups (300–3,000 or more depending upon the disease/medical condition studied) and are aimed at being the definitive assessment of how effective the drug is, in comparison with current 'gold standard' treatment. Because of their size and comparatively long duration, Phase III trials are the most expensive, time-consuming and difficult trials to design and run, especially in therapies for chronic medical conditions.

It is common practice that certain Phase III trials will continue while the regulatory submission is pending at the appropriate regulatory agency. This allows patients to continue to receive possibly lifesaving drugs until the drug can be obtained by purchase. Other reasons for performing trials at this stage include attempts by the sponsor at "label expansion" (to show the drug works for additional types of patients/diseases beyond the original use for which the drug was approved for marketing), to obtain additional safety data, or to support marketing claims for the drug. Studies in this phase are by some companies categorised as "Phase IIIB studies."[21] [22]

While not required in all cases, it is typically expected that there be at least two successful Phase III trials, demonstrating a drug's safety and efficacy, in order to obtain approval from the appropriate regulatory agencies such as the FDA (USA) or the EMA (European Union).
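The single-ascending-dose logic — escalate while each cohort tolerates its dose, stop at a pre-planned ceiling or at the maximum tolerated dose — can be sketched as a simple loop. The dose levels and the `tolerated` check below are placeholders for illustration, not a real trial protocol.

```python
def escalate_doses(dose_levels, tolerated):
    """Walk up a pre-planned ladder of dose levels, giving each dose to a
    new cohort. Stop at the first dose a cohort does not tolerate; the
    highest tolerated dose is the maximum tolerated dose (MTD)."""
    mtd = None
    for dose in dose_levels:        # each dose goes to a fresh cohort
        if not tolerated(dose):     # adverse effects observed
            break
        mtd = dose                  # highest dose tolerated so far
    return mtd

# Hypothetical example: cohorts tolerate doses up to 40 mg.
levels = [5, 10, 20, 40, 80]        # fractions of the harmful animal dose
print(escalate_doses(levels, lambda d: d <= 40))  # -> 40
```

In a real SAD study the "tolerated" decision also weighs the pharmacokinetic data against pre-calculated safety levels, and escalation stops when those levels are reached even if no side effects appear.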

Once a drug has proved satisfactory after Phase III trials, the trial results are usually combined into a large document containing a comprehensive description of the methods and results of human and animal studies, manufacturing procedures, formulation details, and shelf life. This collection of information makes up the "regulatory submission" that is provided for review to the appropriate regulatory authorities[3] in different countries. They will review the submission, and, it is hoped, give the sponsor approval to market the drug.

Most drugs undergoing Phase III clinical trials can be marketed under FDA norms with proper recommendations and guidelines, but in case of any adverse effects being reported anywhere, the drugs need to be recalled immediately from the market. While most pharmaceutical companies refrain from this practice, it is not abnormal to see many drugs undergoing Phase III clinical trials in the market.[23]

Phase IV

A Phase IV trial is also known as a Post Marketing Surveillance Trial. Phase IV trials involve the safety surveillance (pharmacovigilance) and ongoing technical support of a drug after it receives permission to be sold. Phase IV studies may be required by regulatory authorities or may be undertaken by the sponsoring company for competitive (finding a new market for the drug) or other reasons (for example, the drug may not have been tested for interactions with other drugs, or on certain population groups such as pregnant women, who are unlikely to subject themselves to trials). The safety surveillance is designed to detect any rare or long-term adverse effects over a much larger patient population and longer time period than was possible during the Phase I-III clinical trials. Harmful effects discovered by Phase IV trials may result in a drug being no longer sold, or restricted to certain uses: recent examples involve cerivastatin (brand names Baycol and Lipobay), troglitazone (Rezulin) and rofecoxib (Vioxx).

Length

Clinical trials are only a small part of the research that goes into developing a new treatment. Potential drugs, for example, first have to be discovered, purified, characterized, and tested in labs (in cell and animal studies) before ever undergoing clinical trials. In all, about 1,000 potential drugs are tested before just one reaches the point of being tested in a clinical trial. For example, a new cancer drug has, on average, 6 years of research behind it before it even makes it to clinical trials. But the major holdup in making new cancer drugs available is the time it takes to complete clinical trials themselves. On average, about 8 years pass from the time a cancer drug enters clinical trials until it receives approval from regulatory agencies for sale to the public. Drugs for other diseases have similar timelines.

Some reasons a clinical trial might last several years:
• For chronic conditions like cancer, it takes months, if not years, to see if a cancer treatment has an effect on a patient.
• For drugs that are not expected to have a strong effect (meaning a large number of patients must be recruited to observe any effect), recruiting enough patients to test the drug's effectiveness (i.e., getting statistical power) can take several years.
• Only certain people who have the target disease condition are eligible to take part in each clinical trial. Researchers who treat these particular patients must participate in the trial. Then they must identify the desirable patients and obtain consent from them or their families to take part in the trial.

The biggest barrier to completing studies is the shortage of people who take part. All drug and many device trials target a subset of the population, meaning not everyone can participate. Some drug trials require patients to have unusual combinations of disease characteristics. It is a challenge to find the appropriate patients and obtain their consent, especially when they may receive no direct benefit (because they are not paid, the study drug is not yet proven to work, or the patient may receive a placebo). In the case of cancer patients, fewer than 5% of adults with cancer will participate in drug trials. According to the Pharmaceutical Research and Manufacturers of America (PhRMA), about 400 cancer medicines were being tested in clinical trials in 2005. Not all of these will prove to be useful, but those that are may be delayed in getting approved because the number of participants is so low.[24]

Clinical trials that do not involve a new drug usually have a much shorter duration.[25] [26] (Exceptions are epidemiological studies like the Nurses' Health Study.)

For clinical trials involving a seasonal indication (such as airborne allergies, Seasonal Affective Disorder, influenza, and others), the study can only be done during a limited part of the year (such as Spring for pollen allergies), when the drug can be tested. This can be an additional complication on the length of the study, yet proper planning and the use of trial sites in the southern as well as northern hemispheres allows for year-round trials, which can reduce the length of the studies.

Administration

Clinical trials designed by a local investigator and (in the U.S.) federally funded clinical trials are almost always administered by the researcher who designed the study and applied for the grant. Small-scale device studies may be administered by the sponsoring company. Phase III and Phase IV clinical trials of new drugs are usually administered by a contract research organization (CRO) hired by the sponsoring company. (The sponsor provides the drug and medical oversight.) A CRO is a company that is contracted to perform all the administrative work on a clinical trial. It recruits participating researchers, trains them, provides them with supplies, coordinates study administration and data collection, sets up meetings, monitors the sites for compliance with the clinical protocol, and ensures that the sponsor receives 'clean' data from every site. Recently, site management organizations have also been hired to coordinate with the CRO to ensure rapid IRB/IEC approval and faster site initiation and patient recruitment.

At a participating site, one or more research assistants (often nurses) do most of the work in conducting the clinical trial. The research assistant's job can include some or all of the following: providing the local Institutional Review Board (IRB) with the documentation necessary to obtain its permission to conduct the study, assisting with study start-up, identifying eligible patients, obtaining consent from them or their families, administering study treatment(s), collecting and statistically analyzing data, maintaining and updating data files during followup, and communicating with the IRB. They must understand the federal patient privacy (HIPAA) law and good clinical practice.

Ethical conduct

Clinical trials are closely supervised by appropriate regulatory authorities. All studies that involve a medical or therapeutic intervention on patients must be approved by a supervising ethics committee before permission is granted to run the trial. The local ethics committee has discretion on how it will supervise noninterventional studies (observational studies or those using already collected data). In the U.S., this body is called the Institutional Review Board (IRB). Most IRBs are located at the local investigator's hospital or institution, but some sponsors allow the use of a central (independent/for profit) IRB for investigators who work at smaller institutions. In some U.S. locations, the local IRB must certify researchers and their staff before they can conduct clinical trials. (One of the IRB's main functions is ensuring that potential patients are adequately informed about the clinical trial.)

To be ethical, researchers must obtain the full and informed consent of participating human subjects. (If the patient is unable to consent for him/herself, researchers can seek consent from the patient's legally authorized representative. In California, the state has prioritized[27] the individuals who can serve as the legally authorized representative.) The notion of informed consent of participating human subjects exists in many countries all over the world, but its precise definition may still vary.

International Conference of Harmonisation Guidelines for Good Clinical Practice (ICH GCP) is a set of standards used internationally for the conduct of clinical trials. The guidelines aim to ensure that the "rights, safety and well being of trial subjects are protected".

Informed consent is clearly a necessary condition for ethical conduct but does not ensure ethical conduct. The final objective is to serve the community of patients or future patients in a best-possible and most responsible way. However, it may be hard to turn this objective into a well-defined quantified objective function. In some cases this can be done, however, as for instance for questions of when to stop sequential treatments (see Odds algorithm), and then quantified methods may play an important role.

Safety

Responsibility for the safety of the subjects in a clinical trial is shared between the sponsor, the local site investigators (if different from the sponsor), the various IRBs that supervise the study, and (in some cases, if the study involves a marketable drug or device) the regulatory agency for the country where the drug or device will be sold.

Sponsor
• Throughout the clinical trial, the sponsor is responsible for accurately informing the local site investigators of the true historical safety record of the drug, device or other medical treatments to be tested, and of any potential interactions of the study treatment(s) with already approved medical treatments. This allows the local investigators to make an informed judgment on whether to participate in the study or not.
• The sponsor and the local site investigators are jointly responsible for writing a site-specific informed consent that accurately informs the potential subjects of the true risks and potential benefits of participating in the study, while at the same time presenting the material as briefly as possible and in ordinary language. FDA regulations and ICH guidelines both require that “the information that is given to the subject or the representative shall be in language understandable to the subject or the representative." If the participant's native language is not English, the sponsor must translate the informed consent into the language of the participant.
• For safety reasons, many clinical trials of drugs are designed to exclude women of childbearing age, pregnant women, and/or women who become pregnant during the study. In some cases the male partners of these women are also excluded or required to take birth control measures.
• The sponsor is responsible for collecting adverse event reports from all site investigators in the study, and for informing all the investigators of the sponsor's judgment as to whether these adverse events were related or not related to the study treatment. This is an area where sponsors can slant their judgment to favor the study treatment.
• The sponsor is responsible for monitoring the results of the study as they come in from the various sites, as the trial proceeds. In larger clinical trials, a sponsor will use the services of a Data Monitoring Committee (DMC, known in the U.S. as a Data Safety Monitoring Board). This is an independent group of clinicians and statisticians. The DMC meets periodically to review the unblinded data that the sponsor has received so far. The DMC has the power to recommend termination of the study based on their review, for example if the study treatment is causing more deaths than the standard treatment, or seems to be causing unexpected and study-related serious adverse events.[28]
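The Odds algorithm mentioned above (Bruss's solution to the last-success optimal stopping problem) is one case where a stopping question does reduce to a well-defined quantified rule. A minimal sketch follows; the event probabilities at the end are made up for illustration and do not model any particular trial.

```python
def odds_algorithm(probs):
    """Bruss's odds algorithm: given independent events with success
    probabilities `probs` (in the order they will be observed), return
    the index from which one should stop at the first success, plus the
    probability that this rule stops on the last success."""
    odds = [p / (1 - p) for p in probs]
    total = 0.0
    start = len(probs)                 # default: only stop at the very end
    for j in range(len(probs) - 1, -1, -1):
        total += odds[j]
        start = j
        if total >= 1:                 # summed odds reach 1: optimal cutoff
            break
    q_product = 1.0                    # chance of no success from the cutoff on
    for p in probs[start:]:
        q_product *= 1 - p
    return start, q_product * total    # cutoff index, win probability

start, win = odds_algorithm([0.2, 0.3, 0.4, 0.4])
print(start, round(win, 2))  # -> 2 0.48
```

The rule is: ignore events before index `start`, then act on the first success observed. Summing the odds backwards until they reach 1 is what gives the algorithm its name.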

Local site investigators

A physician's first duty is to his/her patients, and if a physician investigator believes that the study treatment may be harming subjects in the study, the investigator can stop participating at any time. On the other hand, investigators often have a financial interest in recruiting subjects, and can act unethically in order to obtain and maintain their participation.
• The local investigators are responsible for conducting the study according to the study protocol, and for supervising the study staff throughout the duration of the study.
• The local investigator or his/her study staff are responsible for ensuring that potential subjects in the study understand the risks and potential benefits of participating in the study; in other words, that they (or their legally authorized representatives) give truly informed consent.
• The local investigators are responsible for reviewing all adverse event reports sent by the sponsor. (These adverse event reports contain the opinion of both the investigator at the site where the adverse event occurred and the sponsor, regarding the relationship of the adverse event to the study treatments.) The local investigators are responsible for making an independent judgment of these reports, and for promptly informing the local IRB of all serious and study-treatment-related adverse events.
• The local investigator is responsible for being truthful to the local IRB in all communications relating to the study.
• When a local investigator is the sponsor, there may not be formal adverse event reports, but study staff at all locations are responsible for informing the coordinating investigator of anything unexpected.
• In the U.S., the FDA can audit the files of local site investigators after they have finished participating in a study, to see if they were correctly following study procedures. This audit may be random, or for cause (because the investigator is suspected of fraudulent data). Avoiding an audit is an incentive for investigators to follow study procedures.

IRBs

Approval by an IRB, or ethics board, is necessary before all but the most informal medical research can begin.
• In commercial clinical trials, the study protocol is not approved by an IRB before the sponsor recruits sites to conduct the trial. However, the study protocol and procedures have been tailored to fit generic IRB submission requirements. In this case, and where there is no independent sponsor, each local site investigator submits the study protocol, the consent(s), the data collection forms, and supporting documentation to the local IRB. Universities and most hospitals have in-house IRBs; other researchers (such as in walk-in clinics) use independent IRBs.
• The IRB scrutinizes the study for both medical safety and protection of the patients involved in the study, before it allows the researcher to begin the study. It may require changes in study procedures or in the explanations given to the patient. A required yearly "continuing review" report from the investigator updates the IRB on the progress of the study and any new safety information related to the study.

Regulatory agencies

If a clinical trial concerns a new regulated drug or medical device (or an existing drug for a new purpose), the appropriate regulatory agency for each country where the sponsor wishes to sell the drug or device is supposed to review all study data before allowing the drug/device to proceed to the next phase, or to be marketed. However, if the sponsor withholds negative data, or misrepresents data it has acquired from clinical trials, the regulatory agency may make the wrong decision. Different countries have different regulatory requirements and enforcement abilities. "An estimated 40 percent of all clinical trials now take place in Asia, Eastern Europe, and Central and South America. There is no compulsory registration system for clinical trials in these countries and many do not follow European directives in their operations," says Dr. Jacob Sijtsma of the Netherlands-based WEMOS, an advocacy health organisation tracking clinical trials in developing countries.[29]

Accidents

In March 2006 the drug TGN1412 caused catastrophic systemic organ failure in the individuals receiving the drug during its first human clinical trials (Phase I) in Great Britain. Following this, an Expert Group on Phase One Clinical Trials published a report.[30]

Economics

Sponsor

The cost of a study depends on many factors, especially the number of sites that are conducting the study, the number of patients required, and whether the study treatment is already approved for medical use. Clinical trials follow a standardized process. The costs to a pharmaceutical company of administering a Phase III or IV clinical trial may include, among others:
• manufacturing the drug(s)/device(s) tested
• staff salaries for the designers and administrators of the trial
• payments to the contract research organization, the site management organization (if used) and any outside consultants
• payments to local researchers (and their staffs) for their time and effort in recruiting patients and collecting data for the sponsor
• study materials and shipping
• communication with the local researchers, including onsite monitoring by the CRO before and (in some cases) multiple times during the study
• one or more investigator training meetings
• costs incurred by the local researchers, such as pharmacy fees, IRB fees and postage
• any payments to patients enrolled in the trial (all payments are strictly overseen by the IRBs to ensure that patients do not feel coerced to take part in the trial by overly attractive payments)
These costs are incurred over several years. In the U.S., there is a 50% tax credit for sponsors of certain clinical trials.[31]

National health agencies such as the U.S. National Institutes of Health offer grants to investigators who design clinical trials that attempt to answer research questions that interest the agency. In these cases, the investigator who writes the grant and administers the study acts as the sponsor, and coordinates data collection from any other sites. These other sites may or may not be paid for participating in the study, depending on the amount of the grant and the amount of effort expected from them. Clinical trials are traditionally expensive and difficult to undertake; using internet resources can, in some cases, reduce the economic burden.[32]

Investigators

Many clinical trials do not involve any money. However, when the sponsor is a private company or a national health agency, investigators are almost always paid to participate. These amounts can be small, just covering a partial salary for research assistants and the cost of any supplies (usually the case with national health agency studies), or be substantial and include "overhead" that allows the investigator to pay the research staff during times in between clinical trials.

Patients

In Phase I drug trials, participants are paid because they give up their time (sometimes away from their homes) and are exposed to unknown risks, without the expectation of any benefit. In most other trials, however, patients are not paid, in order to ensure that their motivation for participating is the hope of getting better or contributing to medical knowledge, without their judgment being skewed by financial considerations. However, they are often given small payments for study-related expenses like travel, or as compensation for their time in providing follow-up information about their health after they are discharged from medical care.

Participating in a clinical trial

Phase 0 and Phase I drug trials seek healthy volunteers. Most other clinical trials seek patients who have a specific disease or medical condition.

Locating trials

Depending on the kind of participants required, sponsors of clinical trials use various recruitment strategies, including patient databases, newspaper and radio advertisements, flyers, posters in places the patients might go (such as doctor's offices), and personal recruitment of patients by investigators.

Image: Newspaper advertisements seeking patients and healthy volunteers to participate in clinical trials.

Volunteers with specific conditions or diseases have additional online resources to help them locate clinical trials. For example, people with Parkinson's disease can use PDtrials to find up-to-date information on Parkinson's disease trials currently enrolling participants in the U.S. and Canada, and to search for specific Parkinson's clinical trials using criteria such as location, trial type, and symptom.[33] Other disease-specific services exist for volunteers to find trials related to their condition.[34] Volunteers may also search directly on ClinicalTrials.gov to locate trials using a registry run by the U.S. National Institutes of Health and National Library of Medicine.[35]

However, many clinical trials will not accept participants who contact them directly to volunteer, as it is believed this may bias the characteristics of the population being studied. Such trials typically recruit via networks of medical professionals who ask their individual patients to consider enrollment.

Steps for volunteers

Before participating in a clinical trial, interested volunteers should speak with their doctors, family members, and others who have participated in trials in the past. After locating a trial, volunteers will often have the opportunity to speak or e-mail the clinical trial coordinator for more information and to answer any questions. After receiving consent from their doctors, volunteers then arrange an appointment for a screening visit with the trial coordinator.

All volunteers being considered for a trial are required to undertake a medical screen. There are different requirements for different trials, but typically volunteers will have the following tests:[36]
• Measurement of the electrical activity of the heart (ECG)
• Measurement of blood pressure, heart rate and temperature
• Blood sampling
• Urine sampling
• Weight and height measurement
• Drug abuse testing
• Pregnancy testing (females only)

Criticism

Marcia Angell has been a stern critic of U.S. health care in general and the pharmaceutical industry in particular. She is scathing on the topic of how clinical trials are conducted in America:

"Many drugs that are assumed to be effective are probably little better than placebos, but there is no way to know because negative results are hidden. ... Because favorable results were published and unfavorable results buried ... the public and the medical profession believed these drugs were potent. ... Clinical trials are also biased through designs for research that are chosen to yield favorable results for sponsors. For example, the sponsor's drug may be compared with another drug administered at a dose so low that the sponsor's drug looks more powerful. Or a drug that is likely to be used by older people will be tested in young people, so that side effects are less likely to emerge. A common form of bias stems from the standard practice of comparing a new drug with a placebo, when the relevant question is how it compares with an existing drug. ... In short, it is often possible to make clinical trials come out pretty much any way you want, which is why it's so important that investigators be truly disinterested in the outcome of their work. ... It is simply no longer possible to believe much of the clinical research that is published, or to rely on the judgment of trusted physicians or authoritative medical guidelines. I take no pleasure in this conclusion, which I reached slowly and reluctantly over my two decades as an editor of the New England Journal of Medicine."[37]

Angell believes that members of medical school faculties who conduct clinical trials should not accept any payments from drug companies except research support, and that that support should have no strings attached, including control by the companies over the design, interpretation, and publication of research results. She has speculated that "perhaps most" of the clinical trials are viewed by critics as "excuses to pay doctors to put patients on a company's already-approved drug".[38] Seeding trials are particularly controversial.[39]

See also
• Academic clinical trials
• Bioethics
• CIOMS Guidelines
• Clinical trial management
• ClinicalTrials.gov
• Clinical data acquisition
• Clinical site
• Community-based clinical trial
• Contract Research Organization
• Data Monitoring Committees
• Drug development
• Drug recall
• Electronic Common Technical Document
• Ethical problems using children in clinical trials
• European Medicines Agency
• FDA Special Protocol Assessment
• Health care
• Health care politics
• IFPMA
• Investigational Device Exemption
• ISO 10006
• Medical ethics
• Nocebo
• Nursing ethics
• Odds algorithm
• Orphan drug
• Philosophy of Healthcare
• Randomized controlled trial
• World Medical Association
• Clinical Data Interchange Standards Consortium
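The criticism about comparator choice can be made concrete with a toy simulation. The response rates below are invented: a hypothetical new drug that is clearly better than placebo, yet worse than an existing drug, will "win" almost every placebo-controlled trial while losing head-to-head comparisons:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

def simulate_trial(p_treat, p_comp, n=500):
    """One two-arm trial with binary 'improved' outcomes; returns observed rates."""
    treat = sum(random.random() < p_treat for _ in range(n)) / n
    comp = sum(random.random() < p_comp for _ in range(n)) / n
    return treat, comp

# Hypothetical response rates: the new drug (55%) is worse than the
# existing drug (65%) but far better than placebo (30%).
NEW, EXISTING, PLACEBO = 0.55, 0.65, 0.30

# Count how often the new drug shows a higher observed rate, over 200 trials.
wins_vs_placebo = sum(t > c for t, c in
                      (simulate_trial(NEW, PLACEBO) for _ in range(200)))
wins_vs_existing = sum(t > c for t, c in
                       (simulate_trial(NEW, EXISTING) for _ in range(200)))
```

Nearly all of the placebo-controlled replications favor the new drug, while almost none of the head-to-head ones do, which is exactly why "better than placebo" answers a different question from "better than current treatment".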

References
• Rang HP, Dale MM, Ritter JM, Moore PK (2003). Pharmacology, 5th ed. Edinburgh: Churchill Livingstone. ISBN 0-443-07145-4
• Finn R (1999). Cancer Clinical Trials: Experimental Treatments and How They Can Help You. Sebastopol: O'Reilly & Associates. ISBN 1-56592-566-1
• Chow S-C and Liu JP (2004). Design and Analysis of Clinical Trials: Concepts and Methodologies. John Wiley & Sons. ISBN 0-471-24985-8
• Pocock SJ (2004). Clinical Trials: A Practical Approach. John Wiley & Sons. ISBN 0-471-90155-5

External links
• Clinical Trials at the Open Directory Project (http://www.dmoz.org/Business/Biotechnology_and_Pharmaceuticals/Pharmaceuticals/Products_Evaluation/Clinical_Trials/)
• The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) (http://www.ich.org)
• The International Clinical Trials Registry Platform (ICTRP) (http://www.who.int/trialsearch)
• IFPMA Clinical Trials Portal (IFPMA CTP) to find ongoing and completed trials of new medicines (http://clinicaltrials.ifpma.org)
• ClinicalTrials.gov (http://clinicaltrials.gov)
• PDtrials, for Parkinson's disease clinical trials (http://pdtrials.org)
• ICH GCP Guidelines (http://www.gcphelpdesk.com)

References
[1] Avorn J (2004). Powerful Medicines, pp. 129–133. Alfred A. Knopf.
[2] Van Spall HG, Toren A, Kiss A, Fowler RA (March 2007). "Eligibility criteria of randomized controlled trials published in high-impact general medical journals: a systematic sampling review". JAMA 297 (11): 1233–40. doi:10.1001/jama.297.11.1233. PMID 17374817.
[3] The regulatory authority in the USA is the Food and Drug Administration; in Canada, Health Canada; in the European Union, the European Medicines Agency; and in Japan, the Ministry of Health, Labour and Welfare.
[4] Pmda.go.jp 独立行政法人 医薬品医療機器総合機構 (Japanese)
[5] ICH (http://www.pmda.jp/ich/s/s1b_98_7_9e.html)
[6] Green S, Benedetti J, Crowley J (2003). Clinical Trials in Oncology. CRC Press. p. 1. ISBN 1584883022
[7] Gad SC (2009). Clinical Trials Handbook. John Wiley and Sons. p. 118. ISBN 0471213888
[8] Huff TE (2003). The Rise of Early Modern Science: Islam, China, and the West. Cambridge University Press. p. 218. ISBN 0521529948
[9] Tschanz DW (May/June 1997). "The Arab Roots of European Medicine". Saudi Aramco World 48 (3): 20–31.
[10] Brater DC and Daly WJ (2000). "Clinical pharmacology in the Middle Ages: Principles that presage the 21st century". Clinical Pharmacology & Therapeutics 67 (5): 447–450 [448].
[11] "James Lind: A Treatise of the Scurvy (1754)" (http://www.bruzelius.info/Nautica/Medicine/Lind(1753).html). 2001. Retrieved 2007-09-09.
[12] O'Rourke MF (1992). "Frederick Akbar Mahomed". Hypertension (American Heart Association) 19: 212–217 [212]
[13] O'Rourke MF (1992). "Frederick Akbar Mahomed". Hypertension (American Heart Association) 19: 212–217 [213]
[14] Glossary of Clinical Trial Terms, NIH ClinicalTrials.gov (http://clinicaltrials.gov/ct2/info/glossary)
[15] ICH Guideline for Good Clinical Practice: Consolidated Guidance (http://www.fda.gov/cder/guidance/959fnl.pdf)
[16] International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (http://www.ich.org/cache/compo/276-254-1.html)
[17] What is informed consent? US National Institutes of Health, ClinicalTrials.gov (http://clinicaltrials.gov/ct2/info/understand)
[18] "Error: no |title= specified when using {{Cite web}}". Retrieved 2008-08-21.
[19] The Lancet (2009). "Phase 0 trials: a platform for drug development?". Lancet 374 (9685): 176. doi:10.1016/S0140-6736(09)61309-X.
[20] Camporesi S (October 2008). "Phase 0 workshop at the 20th EORTC-NCI-AACR symposium, Geneva". ecancermedicalscience. Retrieved 2008-11-07.
[21] "Guidance for Institutional Review Boards and Clinical Investigators" (http://www.fda.gov/oc/ohrt/irbs/drugsbiologics.html). Food and Drug Administration. 1999-03-16. Retrieved 2007-03-27.
[22] "Periapproval Services (Phase IIIb and IV programs)" (http://www.covance.com/periapproval/svc_phase3b.asp). Covance Inc. 2005. Retrieved 2007-03-27.
[23] Arcangelo VP, Peterson AM (2005). Pharmacotherapeutics for Advanced Practice: A Practical Approach. Lippincott Williams & Wilkins. ISBN 0781757843.
[24] Web Site Editor; Crossley MJ; Turner P; Thordarson P (2007). "Clinical Trials - What You Need to Know" (http://www.cancer.org/docroot/ETO/content/ETO_6_3_Clinical_Trials_-_Patient_Participation.asp). American Cancer Society 129 (22): 7155. doi:10.1021/ja0713781. PMID 17497782.
[25] Khan Y and Tilly S. "Seasonality: The Clinical Trial Manager's Logistical Challenge" (http://www.pharm-olam.com/pdfs/POI-Seasonality.pdf). Pharm-Olam International. Retrieved 26 April 2010.
[26] Khan Y and Tilly S. "Flu, Season, Diseases Affect Trials" (http://appliedclinicaltrialsonline.findpharma.com/appliedclinicaltrials/Drug+Development/Flu-Season-Diseases-Affect-Trials/ArticleStandard/Article/detail/652128). Applied Clinical Trials Online, 15 January 2009. Retrieved 26 February 2010.
[27] http://irb.ucsd.edu/ab_2328_bill_20020826_enrolled.pdf
[28] Back Translation for Quality Control of Informed Consent Forms (http://www.gts-translation.com/medicaltranslationpaper.pdf)
[29] Common Dreams (http://www.commondreams.org/archive/2007/12/14/5838/)
[30] Expert Group on Phase One Clinical Trials (Chairman: Professor Gordon W. Duff) (2006-12-07). "Expert Group on Phase One Clinical Trials: Final report" (http://www.dh.gov.uk/en/Publicationsandstatistics/Publications/PublicationsPolicyAndGuidance/DH_063117). The Stationery Office. Retrieved 2007-05-24.
[31] "Tax Credit for Testing Expenses for Drugs for Rare Diseases or Conditions" (http://www.fda.gov/orphan/taxcred.htm). Food and Drug Administration. 2001-04-17. Retrieved 2007-03-27.
[32] Paul J, Seib R, Prescott T (March 2005). "The Internet and clinical trials: background, online resources, examples and issues" (http://www.jmir.org/2005/1/e5/) (free full text). Journal of Medical Internet Research 7 (1): e5. doi:10.2196/jmir.7.1.e5. PMC 1550630. PMID 15829477.
[33] http://www.pdtrials.org/en/about_PDtrials_what
[34] http://www.mlanet.org/resources/hlth_tutorial/mod4c.html
[35] http://www.pdtrials.org/en/participate_clinicalresearch_how
[36] Life on a Trial - What to Expect (http://www.beavolunteer.co.uk/index.php?option=com_content&view=article&id=25&Itemid=21)
[37] Angell M (2009). "Drug Companies & Doctors: A Story of Corruption". The New York Review of Books 56 (1), 15 January 2009.
[38] Angell M (2004). The Truth About Drug Companies. p. 30.
[39] Sox HC, Rennie D (August 2008). "Seeding trials: just say 'no'" (http://www.annals.org/cgi/pmidlookup?view=long&pmid=18711161). Ann Intern Med 149 (4): 279–80. PMID 18711161.

Article Sources and Contributors

All articles in this collection are from the English Wikipedia (http://en.wikipedia.org).

License

Creative Commons Attribution-Share Alike 3.0 Unported: http://creativecommons.org/licenses/by-sa/3.0/
