Biostatistics: I. Types of Data
CHARLES HERRING
15
I. TYPES OF DATA1
A. Nonparametric (aka discrete) variables1
1. Nominal: Numbers are purely arbitrary or without regard to any order of ranking of severity.2–6
This includes dichotomous (binary) data (lived/died, yes/no, hospitalized/not hospitalized) and
categorical data without order or inherent value (race, eye color, hair color, religion, blood type,
acute renal failure [ARF]/congestive heart failure [CHF]/diabetes mellitus [DM]).5,6
2. Ordinal: Categorical, but scored on a continuum, without a consistent level of magnitude of
difference between ranks (pain scale, New York Heart Association [NYHA] class, trauma score,
coma score).2–8
B. Parametric (aka continuous or measuring): Order and consistent level of magnitude of difference
between data units (drug concentrations, glucose, forced expiratory volume in 1 second [FEV1], heart
rate, blood pressure [BP]).2–6
F. CI: In medical literature, a 95% CI is most frequently used, and it is a range of values that “if the entire
population could be studied, 95% of the time the true population value would fall within the CI estimated from
the sample.”9 CIs are descriptive and inferential. All values contained in the CI are statistically possible.
Decision from statistical test     Reality: difference exists (H0 false)       Reality: no difference exists (H0 true)
Difference found (Reject H0)       Correct; no error                           Incorrect; Type 1 error (aka false positive)
No difference found (Accept H0)    Incorrect; Type 2 error (false negative)    Correct; no error
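The decision table above can be restated as a small lookup. This is only an illustrative sketch: the function name `classify_decision` and its boolean arguments are hypothetical, and in a real study the truth of H0 is not known.

```python
def classify_decision(reject_h0: bool, h0_true: bool) -> str:
    """Label a test decision against the (normally unknown) truth about H0."""
    if reject_h0 and h0_true:
        # A difference was "found" although none exists.
        return "type 1 error (false positive)"
    if not reject_h0 and not h0_true:
        # No difference was found although one exists.
        return "type 2 error (false negative)"
    return "correct (no error)"


# Walk through all four cells of the table.
for reject_h0 in (True, False):
    for h0_true in (False, True):
        print(reject_h0, h0_true, classify_decision(reject_h0, h0_true))
```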
(if reported), the existing evidence-based or expert consensus statements, and any cost-effectiveness
or decision analyses that have been performed.8,12 Absent such guidance, require
that the minimum worthwhile effect be large when the intervention is costly (e.g., in terms of
time, money, or other resources), the intervention is high risk, a patient is risk averse, or the
outcome is unimportant or has intermediate importance but with uncertain benefit to patients.8,12
Accept the minimum worthwhile effect as small when the intervention is low cost, the intervention
is low risk, the patient is risk taking, or the intervention is important and has an unambiguous
outcome (e.g., death).8,12
School A 90 5
School B 120 9
School C 130 11
b. There are two possible results (passing vs. failing) for three schools of pharmacy. Therefore,
this is called a 2 × 3 table or matrix.1
C. Once chi-square calculations (for greater than 2 × 2 contingency table) indicate a statistically
significant difference, one must perform post hoc tests to determine which groups or treatments
differ from one another.5 These post hoc tests should only be performed if the chi-square test
was significant.1
D. Fisher exact test may be used when the sample size for a nominal data set is between 20 and 40.14 It
may also be used for a 2 × 2 matrix when a nominal data set is 20 or less.4,14 In addition, Fisher exact
test may be used for matched or paired data (i.e., crossover or pre-post test design).1
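As a rough illustration of the chi-square calculation, the sketch below works through the three-school counts listed earlier. Treating the two numeric columns as passed and failed is an assumption, because the column header did not survive; the helper name `chi_square` is hypothetical. The closed-form p-value used here holds only for 2 degrees of freedom.

```python
import math

# Counts from the three-school table. Reading the columns as
# (passed, failed) is an assumption: the original header is missing.
observed = [
    [90, 5],    # School A
    [120, 9],   # School B
    [130, 11],  # School C
]


def chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


statistic, df = chi_square(observed)
# For exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
# Any other df would need a statistics library instead of this shortcut.
p_value = math.exp(-statistic / 2)
print(f"chi-square = {statistic:.3f}, df = {df}, p = {p_value:.3f}")
```

With these counts the statistic is small and the p-value is well above 0.05, so under the stated assumption no post hoc testing would follow.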
F. ANOVA rank tests are generally used to account for confounders in ordinal data with the exception
of using repeated measures regression to account for two or more confounders in crossover
design.1
between stroke and hypertension reporting an r = 0.70, then r² = 0.49, and one could say that 49%
of stroke risk may be explained by BP.1
1. Conversely, 1 − r² represents the proportion of the variation that is not related to the independent
variables (i.e., the residual variation).11 This is sometimes referred to as the coefficient of
nondetermination.15
2. As with correlation, CIs can be calculated for regression analyses. Also, as with correlation, the
existence of this kind of statistical association is not in itself evidence of causality. One must take
into account what type of analysis is being performed (i.e., case control vs. cohort vs. randomized
controlled trial [RCT]).1
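The arithmetic behind the stroke and hypertension example can be checked in a few lines. This is a minimal sketch; the function name `coefficient_of_determination` is chosen here for illustration only.

```python
def coefficient_of_determination(r: float) -> float:
    """Square a correlation coefficient to get the explained proportion."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r ** 2


r = 0.70  # value reported in the example above
r_squared = coefficient_of_determination(r)
unexplained = 1 - r_squared  # coefficient of nondetermination
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}, 1 - r^2 = {unexplained:.2f}")
# prints: r = 0.70, r^2 = 0.49, 1 - r^2 = 0.51
```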
C. Multiple regression. When more than one independent variable is used to predict a dependent variable,
multiple (or multivariate) regression analysis (MRA) is used.4,15 For example, the national cholesterol
guidelines use multiple regression to help establish 10-year coronary heart disease (CHD)
risk for patients based on population data. A patient’s 10-year CHD risk is the dependent variable
because its estimate “depends on” several independent variables. The independent variables include
age, total cholesterol, high-density lipoprotein (HDL) cholesterol, smoking status, and systolic BP. All
of these independent variables are used to help predict a patient’s 10-year CHD risk.1
1. Multiple linear regression is “used with parametric (aka continuous) outcomes like BP” and lipid
values.17
2. Logistic regression is used with nominal outcomes such as death and hospitalization.8,17
3. Cox proportional hazards regression is “used when an outcome is the length of time to an event.”17
For example, time until death or hospitalization or time until discharge.5,17
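The pairing of outcome type and regression method listed above can be written as a lookup table. The keys and the function name `suggest_regression` are hypothetical labels introduced for this sketch; it is a memory aid, not a substitute for judging a specific analysis.

```python
# Outcome type -> regression method, following the three items above.
REGRESSION_BY_OUTCOME = {
    "continuous": "multiple linear regression",          # e.g., BP, lipid values
    "nominal": "logistic regression",                    # e.g., death, hospitalization
    "time_to_event": "Cox proportional hazards regression",  # e.g., time until death
}


def suggest_regression(outcome_type: str) -> str:
    """Return the regression method the outline associates with an outcome type."""
    try:
        return REGRESSION_BY_OUTCOME[outcome_type]
    except KeyError:
        raise ValueError(f"unknown outcome type: {outcome_type!r}") from None


print(suggest_regression("time_to_event"))
```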
X. ERROR VERSUS BIAS. Errors are “mistakes that do not systematically under- or overestimate
effect size.”18 Biases are systematic errors/flaws in study design that lead to incorrect results.5,6,18 More
common types of biases include publication bias, investigator bias, compliance bias, selection bias,
diagnostic or detection bias, recall bias, and channeling bias (aka confounding by indication).5–7,18–20
The best way to minimize bias is through proper study design (e.g., randomization, inclusion/exclusion
criteria, blinding, using controls and objective outcome measures).5,6,8,18,19
XI. CONFOUNDING. Confounders are “causes, other than the one studied, which may be linked
to the studied outcomes and/or the hypothesized cause.”5,7,19 Although these are sometimes difficult
to detect, investigators should account for known confounders.5,19 For example, atherosclerosis causes
myocardial infarction (MI). There is an association between atherosclerosis and smoking, smoking
and risk for an MI, and atherosclerosis and risk for an MI. The proposed cause is atherosclerosis.
The potential confounder is smoking, so investigators need to account for any significant smoking
differences among studied groups.19
[Figure; label: Outcome studied (heart attack)]
A. Ways of controlling for confounders include proper study design (e.g., randomization, inclusion/
exclusion criteria, blinding, using controls and objective outcome measures) and proper statistical
analysis (stratification, MRA, and use of ANOVA [for parametric data]).5–8,18,19
XIV. VALIDITY
A. Internal validity. To what degree does a study appropriately test and answer the question(s) being
asked and measure what is claimed to be measured? It “addresses issues of bias, confounding, and
measurement of endpoints.”19 It directly affects external validity.7,18,19
B. External validity. Presuming internal validity, this assesses whether the results can be extrapolated to
the general population, to other groups, patients, or systems.7,18,19
C. Odds ratio (OR) is an estimate of RR and is generally used for case-control trials.5,15
D. Hazard ratio (HR) estimates RR and is generally used with Cox proportional hazards regression
analysis.
E. OR and HR are fairly accurate estimates of RR if the incidence of an outcome is < 15%.17
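To see why the OR tracks the RR only when the outcome is uncommon, the sketch below compares the two on made-up counts. All numbers are hypothetical, and the cohort-style 2 × 2 layout is used purely so that both measures can be computed from the same table.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """a, b: exposed with/without outcome; c, d: unexposed with/without outcome."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product ratio for the same 2 x 2 layout."""
    return (a * d) / (b * c)


# Hypothetical counts: the outcome is rare in the first table, common in the second.
rare = (20, 980, 10, 990)       # incidence 2% vs. 1%
common = (400, 600, 200, 800)   # incidence 40% vs. 20%

for label, table in (("rare", rare), ("common", common)):
    print(f"{label}: RR = {relative_risk(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```

In the rare-outcome table the two measures nearly coincide, while in the common-outcome table the OR overstates the RR.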
[Figure legend: + = disease present; − = disease absent]
[Arrow chart; labels: Outcome, Exposure, Today, (?)]
a. These are retrospective, identify patients based on outcome or disease, and are therefore good
for rare outcomes or diseases.6,19 These are good for identifying possible “causal influences on
relatively uncommon outcomes” or slow-developing diseases, can be used in diseases with
long latencies such as Alzheimer, and are inexpensive and quick.5,18,19 Problems include their
being only hypothesis generating, subject to bias and confounding, not good for rare exposures,
and that selection of controls can be challenging.5,18,19
2. Retrospective cohort study (aka “outcomes studies”)19:
[Arrow chart; labels: Exposure, Outcome, Today, (?)]
[Arrow chart; labels: Exposure, Today, Outcome (?)]
a. This is called a prospective cohort, but it is really a retrospective design because exposure took
place in the past.19
4. Very rarely, one may see the following type of prospective cohort study, which is truly prospective18:
[Arrow chart; labels: Exposure, Today, Outcome (?)]
a. Cohorts are good for studying frequent outcomes or diseases such as atherosclerosis.5,18
Subjects are identified based on exposures, which allows investigators to study multiple diseases
and/or uncommon exposures.6,19 These are subject to less bias than case control, can
help determine incidence of disease, and can help define temporal trends between exposure
and disease.5,6,18,19 Problems include loss of follow-up, inability to control exposures, bias
(although less than with case control), time, and expense.1
D. Randomized trials. Randomization (aka allocation) means that all within a population have an equal
and independent opportunity of being selected as part of a sample.1
1. Parallel design (arrow charts for trial designs shown hereafter) 19
[Arrow chart; labels: Randomization, Rx 2, End point, Evaluation/Analysis]
2. Crossover design. Used when there is wide, interpatient variability. “Since each patient serves as
his or her own control, variation between treatment groups is minimized.”20
[Arrow chart; labels: Randomization, Evaluation/Analysis]
a. Crossovers may be used for certain “chronic, stable diseases, such as osteoarthritis, or for
pharmacokinetic studies.”20 Crossovers are not appropriate for chronic, unstable diseases (e.g.,
allergic rhinitis or asthma).20 Also, crossovers are not suitable for “acute conditions, such as
postoperative pain or infections.”20 They are not appropriate for certain types of treatment questions
(e.g., treatment of nausea/vomiting in chemotherapy trials). Also, there may be cases where
crossovers are unethical or impractical (e.g., a smoking cessation trial). To prevent carryover effect, “a
typical washout period should last at least 5 half-lives of the study drug or its active metabolite.”20
3. Randomized trials are the “best design for determining causality” by minimizing bias and
dividing confounders equally.5,19 However, these are expensive and time and labor intensive, and
generalization of results depends highly on appropriate inclusion/exclusion criteria.5,19
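The washout guidance quoted for crossover designs (at least 5 half-lives) reduces to simple arithmetic. The sketch below assumes first-order elimination when estimating how much drug remains; the 12-hour half-life and both function names are hypothetical.

```python
def minimum_washout(half_life_hours: float, multiples: float = 5.0) -> float:
    """Shortest washout, in hours, under the 5-half-lives rule of thumb."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return multiples * half_life_hours


def fraction_remaining(n_half_lives: float) -> float:
    """Fraction of drug left after n half-lives, assuming first-order elimination."""
    return 0.5 ** n_half_lives


half_life = 12.0  # hypothetical drug half-life in hours
print(f"washout of at least {minimum_washout(half_life):.0f} h")
print(f"about {fraction_remaining(5):.1%} of the drug remains after 5 half-lives")
```

If an active metabolite has the longer half-life, the quoted rule suggests basing the calculation on the metabolite instead.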
XVII. REFERENCES
1. Herring C. Quick Stats: Basics for Medical Literature Evaluation. 3rd ed. Acton, MA: Copley; 2009.
2. Gaddis ML, Gaddis GM. Introduction to biostatistics: Part 1, basic concepts. Ann Emerg Med.
1990;19(1):86–89.
3. Glaser AN. High Yield Biostatistics. Media, PA: Williams & Wilkins; 1995.
4. DeYoung GR. Biostatistics: A Refresher (handout). 2000 Updates in Therapeutics: The
Pharmacotherapy Preparatory Course.
5. Kaye KS. Clinical Epidemiology and Biostatistics: Overview and Basic Concepts (handout). Faculty
Development Seminar, Campbell University School of Pharmacy, Department of Pharmacy
Practice. 2001.
6. Kaye KS. Clinical Epidemiology and Biostatistics, Part 2 (handout). Faculty Development Seminar,
Campbell University School of Pharmacy, Department of Pharmacy Practice. 2001.
7. Berensen NM. Statistics: A Review (handout). 2001.
8. DeYoung GR. Understanding statistics: an approach for the clinician. In: Pharmacotherapy
Self-Assessment Program. Book 5: The Science and Practice of Pharmacotherapy 1. 5th ed. Kansas City, MO:
American College of Clinical Pharmacy; 2005:1–17.
9. Gaddis GM, Gaddis ML. Introduction to biostatistics: Part 2, descriptive statistics. Ann Emerg Med.
1990;19(3):309–315.
10. Gaddis GM, Gaddis ML. Introduction to biostatistics: Part 3, sensitivity, specificity, predictive value,
and hypothesis testing. Ann Emerg Med. 1990;19(5):591–597.
11. Gaddis ML, Gaddis GM. Introduction to biostatistics: Part 6, correlation and regression. Ann Emerg
Med. 1990;19(12):1462–1468.
12. Froehlich GW. What is the chance that this study is clinically significant? A proposal for Q values.
Eff Clin Pract. 1999;2:234–239.
13. Gaddis GM, Gaddis ML. Introduction to biostatistics: Part 4, statistical inference techniques in
hypothesis testing. Ann Emerg Med. 1990;19(7):820–825.
14. Gaddis GM, Gaddis ML. Introduction to biostatistics: Part 5, statistical inference techniques for
hypothesis testing with nonparametric data. Ann Emerg Med. 1990;19(9):1054–1059.
15. De Muth JE. Basic Statistics and Pharmaceutical Statistical Applications. 2nd ed. Boca Raton, FL:
Chapman & Hall/CRC, Taylor & Francis Group; 2006.
16. Kelly WD, Ratliff TA, Nenadic CM. Basic Statistics for Laboratories: A Primer for Laboratory Workers.
Hoboken, NJ: John Wiley and Sons; 1992:93.
17. Katz MH. Multivariable analysis: A primer for readers of medical research. Ann Intern Med.
2003;138:644–650.
18. Drew R. Clinical Research Introduction (handout). Drug Literature Evaluation/Applied Statistics
Course. Campbell University School of Pharmacy. 2003.
19. DeYoung GR. Clinical Trial Design (handout). 2000 Updates in Therapeutics: The Pharmacotherapy
Preparatory Course.
20. West PM. Literature evaluation. In: Pharmacotherapy Self-Assessment Program. Book 5: The
Science and Practice of Pharmacotherapy 2. 5th ed. Kansas City, MO: American College of Clinical
Pharmacy; 2005.
Study Questions*
1. Which measure(s) of central tendency is/are sensitive to outliers?
(A) Mean
(B) Median
(C) Mode
2. For what type of data can standard deviation (SD) be used?
(A) Parametric data
(B) Ordinal data
(C) Nominal data
3. Which of the following is correct regarding measures of variability?
(A) Range can be both descriptive and inferential.
(B) Standard error of the mean (SEM) is always larger than SD.
(C) All values contained in a confidence interval (CI) are statistically possible.
(D) CI is a descriptive measure only.
4. A study was performed to determine the effect of a new antipsychotic agent (Drug A) on psychosis in
patients with underlying schizophrenia as compared to placebo. A sample size of 300 patients was
calculated to be needed based on an α of 0.05 and a β of 0.2. The double-blind, parallel, superiority trial
was performed in 350 patients for 8 weeks. At the end of the 8-week period, the new antipsychotic agent
was found to induce remission in 20% of patients as compared to 19% in the placebo group (P = 0.04).
Which of the following statements is true based on the results of the study?
(A) Drug A was found to have a statistically significant and clinically significant difference
on remission of psychosis as compared to placebo.
(B) Drug A was found to have a statistically significant difference but not a clinically
significant difference on remission of psychosis as compared to placebo.
(C) Drug A was found to have a clinically significant difference but not a statistically significant
difference on remission of psychosis as compared to placebo.
(D) Drug A was not found to have a clinically or statistically significant difference on remission of
psychosis as compared to placebo.
(For the next two questions) A study of the effects of bupropion (Zyban) versus nicotine gum (Nicorette)
on the primary end point of change in the number of cigarettes smoked per day in a parallel, randomized trial.
The investigators plan to include 450 subjects (150 in each arm) to reach statistical significance based on
a β of 0.2 and α of 0.05.
5. Which of the following statistical tests would be the most appropriate? (Hint: assume no confounders)
(A) One-way ANOVA
(B) Chi-square (χ²)
(C) Fisher exact test
(D) Friedman test
(E) t test
6. Which of the following statistical tests would be the most appropriate if the study had evaluated three
groups instead of only two? (i.e., bupropion [Zyban] vs. nicotine patches [Nicoderm CQ] vs. nicotine gum
[Nicorette])
(A) One-way ANOVA
(B) Chi-square (χ²)
(C) Fisher exact test
(D) Friedman test
(E) Student’s t test
7. A study is designed to evaluate the change in blood pressure lowering between metoprolol tartrate
(Lopressor®) and metoprolol succinate (Toprol XL®). The investigators decide to perform a parallel trial
in 200 patients. There were significant baseline differences between the groups in diet and exercise.
Which of the following statistical tests would be most appropriate?
(A) One-way ANOVA
(B) Chi-square (χ²)
(C) Fisher exact test
(D) ANCOVA
(E) Student’s t test
* These study questions were composed by Melanie Pound, PharmD, BCPS and Rebekah Grube, PharmD, BCPS.
8. The makers of eplerenone (Inspra) want to design a study to compare their medication to the current
standard of spironolactone (Aldactone) in the treatment of heart failure. They decide to perform a
parallel trial of the two agents in 2000 patients with NYHA classes II to IV heart failure over 2 years. The
primary end point is mortality. Which of the following statistical tests would be most appropriate to use?
(A) ANOVA
(B) Fisher exact test
(C) Chi-square (χ²)
(D) Mann-Whitney U test
(E) Student’s t test
9. A retrospective study produces correlation/regression analysis between a high sodium intake (> 2.4 g/day)
and hypertension (HTN) reporting an r = 0.45. Which of the following is correct?
(A) 20% of HTN may be explained by high sodium intake.
(B) 20% of HTN is not explained by high sodium intake.
(C) 80% of HTN is explained by high sodium intake.
(D) 55% of HTN is not explained by high sodium intake.
(E) 45% of HTN may be explained by high sodium intake.
10. A study was performed to evaluate a possible correlation between the use of the herbal product
Goldenseal and changes in pain relief (based on pain scale scores). Which type of correlation analysis
should be used in this trial?
(A) Pearson
(B) Spearman
(C) Linear
(D) Cox
(For the next two questions) In the RE-LY trial, dabigatran (Pradaxa) was compared with warfarin (Coumadin) for
the prevention of cerebrovascular accident (CVA) in atrial fibrillation (AF) patients. The primary outcome in this trial
was CVA or systemic thromboembolism (TE). The results are presented as follows:
End point    Dabigatran (n = 6076)    Warfarin (n = 6022)    RR, 95% CI
CVA or TE    134 (2.2%)               159 (2.6%)             0.66 (0.53–0.82)
MI           89 (1.5%)                63 (1.0%)              1.38 (1.00–1.91)
11. What can be concluded about the outcome “CVA or TE”?
(A) Dabigatran (Pradaxa) has a clinically significant lower risk than warfarin (Coumadin), although it
is not statistically significant because the CI does not include 1.
(B) Dabigatran (Pradaxa) has a clinically significant lower risk than warfarin (Coumadin), although it
is not statistically significant because the CI does not include 0.
(C) Dabigatran (Pradaxa) has a clinically significant higher risk than warfarin (Coumadin), and it is
statistically significant because the CI does not include 0.
(D) Dabigatran (Pradaxa) has a clinically significant lower risk than warfarin (Coumadin), and it is
statistically significant because the CI does not include 1.
12. What can be concluded about the outcome “MI”?
(A) Dabigatran (Pradaxa) has a higher MI risk than warfarin (Coumadin), although it is not
statistically significant because the CI includes 1.
(B) Dabigatran (Pradaxa) has a higher MI risk than warfarin (Coumadin), although it is not
statistically significant because the CI does not include 0.
(C) Dabigatran (Pradaxa) has a higher MI risk than warfarin (Coumadin), and it is statistically
significant because the CI includes 1.
(D) Dabigatran (Pradaxa) has a higher MI risk than warfarin (Coumadin), and it is statistically
significant because the CI does not include 0.
13. A researcher was interested in examining the association between postmenopausal hormone
replacement therapy (HRT) and development of heart disease. All women who were characterized as
postmenopausal were approached regarding their interest in participating in the study by answering a
questionnaire annually regarding their medication use and medical conditions. Of the 16,168 women who
provided consent, the average length of follow-up was 12.5 years (range, 6 to 16 years). Which of the
following best describes the study design?
(A) Case-control study
(B) Prospective cohort study
(C) Randomized controlled trial
(D) Meta-analysis
14. An investigator wishes to study a new drug for the treatment of hypertension in patients with diabetes.
What is the best type of trial design the investigator should use for determining causality in this particular
study?
(A) A case-control study
(B) A prospective cohort study
(C) A prospective, randomized, placebo-controlled trial
(D) A prospective, randomized, standard-of-care comparison trial
(E) A meta-analysis
15. Which of the following would be appropriate for a crossover design study?
(A) Effects of Drug A versus Drug B on hypertension in 100 patients
(B) Effects of varenicline (Chantix) compared to placebo on smoking cessation
(C) Effects of fluticasone/salmeterol (Advair) and budesonide/formoterol (Symbicort) on asthma
exacerbations
(D) Effects of hydralazine and hydrochlorothiazide (Microzide) on all-cause mortality
8. The answer is C [see VI.A–B].
Answer C is correct because chi-square is used to detect statistical differences for nominal data (mortality)
when there are large numbers of patients in each treatment group. Answers A and E are incorrect because
ANOVA and t test are used to test parametric data. Answer B is incorrect because Fisher exact test is used
when there are small numbers of patients in each treatment group (< 40 patients). Answer D is incorrect
because Mann-Whitney U test is used to test ordinal data.
9. The answer is A [see IX.B].
Answer A is correct because r = 0.45, r-squared (r²) = 0.2 or 20%. Therefore, 20% of one variable
(hypertension) may be explained by the other variable (high-sodium diet). This would mean that 1 − 0.2 = 0.8
or 80% of hypertension would not be explained by a high-sodium diet. Therefore, answers B, C, D, and E
are incorrect.
10. The answer is B [see VIII.B].
Answer B is correct because pain scale scores are ordinal data, and Spearman is the sample correlation
coefficient for ordinal data. Answer A is incorrect because Pearson is the sample correlation coefficient for
linear (parametric) data. Answer C is incorrect because pain scale is ordinal data, not linear (parametric).
Answer D is incorrect because Cox is a type of regression analysis, not a form of correlation analysis.
11. The answer is D [see III.F.1.b, IV.E, and XV.B].
Answer D is correct because the outcome measure is relative risk (RR). For ratios like relative risk, it is the
difference from 1 that determines statistical significance. Answer A is incorrect because there is a
statistically significant difference in CVA or TE because the 95% CI does not include 1. Answers B and C are
incorrect because, for ratios like relative risk, it is the difference from 1 that determines statistical
significance, not difference from 0.
12. The answer is A [see III.F.1.b, IV.E, and XV.B].
Answer A is correct because for ratios like relative risk, it is the difference from 1 that determines
statistical significance. The 95% CI included 1, so although this may be clinically meaningful, it is not
statistically significant. Answers B and D are incorrect because for ratios like relative risk, it is the
difference from 1 that determines statistical significance, not difference from 0. Answer C is incorrect
because all values within the 95% CI are statistically possible and the 95% CI for MI contains 1. Therefore,
it is statistically possible that there is no difference between dabigatran (Pradaxa) and warfarin (Coumadin)
for the outcome of MI.
13. The answer is B [see XVI.C.4].
Answer B is correct because postmenopausal women were identified based on exposure (medications they
were taking, i.e., whether or not they were taking HRT) as is done in cohort studies. Answer A is incorrect
because, for case-control studies, subjects are identified based on disease (heart disease in this case) rather
than exposure (HRT), which was not the setup for this study. Answer C is incorrect because patients were not
randomized to an intervention. Answer D is incorrect because meta-analyses include multiple studies, which
is not the case in this example.
14. The answer is D [see XII.C.4 and XVI.D.3].
Answer D is correct because the most robust and ethical way of determining differences between hypertension
treatments and determining causality is through a randomized trial with a standard of care control.
Answer A is incorrect because a case-control study is very weak at determining causality. Answers B and E are
incorrect because cohort studies and meta-analyses are weaker than randomized trials for establishing
causality. Answer C is incorrect because it would be unethical to treat hypertensive, diabetic patients with a
placebo rather than an established therapy that has been shown to improve cardiovascular outcomes.
15. The answer is A [see XVI.D.2.a].
Answer A is correct. Answer B is incorrect because it would be unethical to ask those who had stopped
smoking in the first part of the trial to restart smoking in order to obtain data for the second part of the study.
Answer C is incorrect because crossovers are not good for treatment evaluation in unstable diseases. Asthma
severity may vary depending on seasons. Answer D is incorrect because those who died in the first part of
the crossover trial could not be evaluated during the second part of the trial.
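The rule applied in the explanations for questions 11 and 12 (for a ratio such as RR, a 95% CI that excludes 1 indicates statistical significance) can be sketched as a one-line check. The function name `ratio_ci_is_significant` is hypothetical; the intervals are the ones given in the RE-LY table in the study questions.

```python
def ratio_ci_is_significant(lower: float, upper: float, null_value: float = 1.0) -> bool:
    """True when the CI for a ratio (RR, OR, HR) excludes the no-difference value of 1."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return not (lower <= null_value <= upper)


# (RR, lower 95% bound, upper 95% bound) as given in the study questions.
results = {
    "CVA or TE": (0.66, 0.53, 0.82),
    "MI": (1.38, 1.00, 1.91),
}

for end_point, (rr, lower, upper) in results.items():
    verdict = "significant" if ratio_ci_is_significant(lower, upper) else "not significant"
    print(f"{end_point}: RR {rr} ({lower}-{upper}) is {verdict}")
```

For a difference measure (for example, a difference in means) the no-difference value would be 0 rather than 1, which is why `null_value` is left as a parameter.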