Epi Survival Guide


I. Sampling

1. General Info

a. Impossible to obtain data from every member of a population

i. Need to take representative sample of underlying population

1. then draw inferences about the population from the sample.

b. Sample always involves an element of random variation/error.

c. Sampling statistics are essentially about characterizing the nature and magnitude of this random error.

2. Random Error

a. Def. the variation that is due to chance

i. Inherent feature of sampling, statistical inference, and measurement of biological phenomenon

1. Blood pressure measurement

b. Statistics as a field concerned w/ random error

i. Thus, a significant p-value or a precise confidence interval CANNOT tell you if the underlying data is

accurate/unbiased

1. P-value/CI are only statistical → no info on biases
2. The further the point estimate is from the null, the smaller the p-value (i.e., the further away from the null, the stronger the statistical evidence)

c. Not as important as systematic error

3. Systematic Error

a. Def. any process that acts to distort data or findings from their true value.

b. More important than random error.

i. It can be removed by better processes

c. Can be seen as selection bias, measurement bias, or confounding bias

4. Statistical Inference

a. Def. the process whereby one draws conclusions regarding a population from the results observed in a

sample taken from that population

b. Types

i. Estimation: estimating the specific value of a parameter
1. Used with confidence intervals
ii. Hypothesis Testing: making a decision about a hypothesized value of a parameter

II. Variability
1. General Info

a. There is inherent variability in data

b. All variation is additive

i. The net observed variation is a result of the all the individual sources of variation

c. Two main categories (biological and measurement)

2. Biological Variation

a. Def. variation in the actual entity being measured

b. Outside of the science stuff, can be subdivided

i. Variation within a person (intra-person)

1. Your BP changes as a result of stimuli (time of day, posture, emotions)

ii. Variation between people (inter-person)

c. Without it, there would be nothing for epidemiologists to measure

i. The presence of biological variation is sine qua non

d. Net effect: it adds to the level of random error in any measurement process

i. Can be reduced with repeated measurements

3. Measurement Variation

a. Def. variation due to the measurement process

b. Causes

i. Instrument error (inaccuracy in the instrument)


ii. Operator error (inaccuracy in the person operating the test)

c. Can introduce BOTH random and systematic error

i. Systematic differences are why different laboratories establish their own reference ranges

d. Types

i. Inter-observer variability: different observers reading the same test
ii. Intra-observer variability: the same person observing the test at different times
e. Net effect: the use of specific operational standards can reduce the impact of measurement bias

III. Validity and Reliability
1. Validity

a. Def. the degree to which a measurement process tends to measure what it is intended to
i. It is the accuracy
ii. A valid instrument/test is free of any systematic error/bias

1. Will be close to the underlying true value

b. Can be determined by comparing to an accepted gold standard

c. When no gold standards exist, we measure some specific phenomena or construct

i. Constructs are then used to develop a clinical scale which can then be used to measure the

phenomenon in practice.

d. Types of validity

i. Content validity: includes all the dimensions to be measured
1. If measuring for pain, you would include questions on aching, throbbing, burning (but not itching, nausea, tingling)
ii. Construct validity: the scale correlates with other known measures
1. A scale for depression includes questions related to it, such as those about fatigue and headache
iii. Criterion validity: the scale predicts a directly observable phenomenon
1. To see if responses to pain bear a predictable relationship to pain of known severity
e. For dichotomous data: validity is usually expressed in terms of sensitivity and specificity
f. For continuous data: can use mean, SD, correlation, and regression analysis

2. Reliability

a. Def. the extent that repeated measures of a phenomenon tend to yield the same results regardless of the

correctness

i. It is the reproducibility

ii. No comparison to a reference or gold standard

iii. Refers to the lack of random error

b. Classified as intra-observer variability or inter-observer variability

c. If not direct observation, reliability can be assessed with the test-retest method

i. Respondents answer the same question at two different times.

ii. Measures a form of intra-person reliability

d. Type of data measured dictates the exact statistical approach
i. Categorical data → Kappa
ii. Interval data → intra-class correlation
(Margin note: Validity = get what you're supposed to get, checked against family/neighbors/a gold standard; Reliability = always get the same result, e.g. test-retest)
IV. Statistical aspects of variability

1. General Info

a. Measures of variation

i. Is basically a measure of dispersion

1. variance (σ²), SD (σ), and range

b. Measures of agreement

i. Correlation (r) and Kappa

2. Standard Deviation (σ)
a. Def. roughly, the average distance of individual values from the mean; it is calculated by taking the square root of the variance
1. ±1 SD ≈ 68% of total observations
2. ±2 SD ≈ 95% of total observations
a. >2 SD away from the mean is considered abnormal (worked sketch below)
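A minimal Python sketch of the SD and the 68/95 rule above; the BP-like numbers are invented for illustration.

```python
import numpy as np

# Hypothetical sample of systolic BP readings (illustrative values only)
x = np.array([118, 122, 125, 130, 115, 128, 121, 135, 119, 127])

mean = x.mean()
sd = x.std(ddof=1)          # sample SD = square root of the sample variance

# Fraction of observations within +/-1 SD and +/-2 SD of the mean
within_1sd = np.mean(np.abs(x - mean) <= 1 * sd)
within_2sd = np.mean(np.abs(x - mean) <= 2 * sd)

print(f"mean = {mean:.1f}, SD = {sd:.1f}")
print(f"within 1 SD: {within_1sd:.0%} (~68% for a normal distribution)")
print(f"within 2 SD: {within_2sd:.0%} (~95% for a normal distribution)")
```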


3. Correlation (r)

a. Def. the correlation coefficient expresses the reliability of a continuous measurement (interval data)

i. It measures the strength of the linear relationship between two continuous variables

b. Ranges from -1 to 1 (zero is no correlation)

c. Takeaway

i. If compared against actual/true values, then correlation is a test of validity
ii. In most cases, correlation assesses reliability
d. It is possible to have high r values, yet have little direct agreement between observers

i. A perfect r (1.0) can be obtained if Lab A results are always exactly 10mg/dl higher than those of Lab

B.

e. It is also often used in test-retest studies

i. Used for intra-rater or intra-person variability
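A minimal Python sketch of the point above that r measures the linear relationship, not agreement: if Lab A always reads exactly 10 mg/dl higher than Lab B, r is still 1.0 even though the two labs never report the same value (values invented).

```python
import numpy as np

lab_b = np.array([80.0, 95.0, 110.0, 125.0, 140.0])   # hypothetical glucose values
lab_a = lab_b + 10.0                                    # Lab A always 10 mg/dl higher

r = np.corrcoef(lab_a, lab_b)[0, 1]
exact_agreement = np.mean(lab_a == lab_b)

print(f"r = {r:.2f}")                               # 1.00: perfect linear relationship
print(f"exact agreement = {exact_agreement:.0%}")   # 0%: the labs never agree exactly
```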

4. Kappa

a. Def. the measure by which reliability is characterized for categorical/qualitative data
i. It corrects the overall level of agreement for the degree of chance agreement
ii. It tells us how much agreement over and above chance the reviewers have achieved

b. The ability of kappa to adjust for chance agreement is important clinically

i. The prevalence of a particular condition being evaluated affects the likelihood that observers will

agree purely due to chance

1. Even if two people have no idea what they are doing, there will be substantial agreement

by chance alone

2. The magnitude of the agreement by chance increases as the proportion of positive (or

negative) assessments increases.

3. Two people each repeatedly toss a coin

a. Four possible options (HH, TT, TH, HT)

i. agreement 50% of the time due to chance

ii. Thus any percentage above the 50% is what we care about.

ii. If the prevalence of the attribute is either very high or very low, then the overall percent agreement will also be high. In other words, if something is obviously right or obviously wrong, people are more likely to agree as such.
1. Very high prevalence → high chance agreement → high overall agreement
2. Very low prevalence → high chance agreement → high overall agreement
3. Prevalence near 50% → chance agreement at its minimum (50%, as in the coin example)

c. Kappa ranges from -1 to 1
i. Negative → agreement worse than chance
ii. Zero → agreement no better than chance
iii. Positive → the amount of agreement above chance
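A minimal Python sketch of Cohen's kappa for a 2x2 agreement table between two reviewers (counts are hypothetical): observed agreement minus chance agreement, scaled by the maximum possible agreement above chance.

```python
def cohens_kappa(a, b, c, d):
    """2x2 agreement table: a = both say yes, b = rater 1 yes / rater 2 no,
    c = rater 1 no / rater 2 yes, d = both say no."""
    n = a + b + c + d
    p_observed = (a + d) / n
    # Chance agreement from each rater's marginal proportions of "yes" and "no"
    p_chance = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical counts: two radiologists reading the same 100 films
print(f"kappa = {cohens_kappa(a=40, b=10, c=5, d=45):.2f}")
```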

V. Types of Data

1. Categorical

a. Nominal: no order
i. Alive vs. dead, male vs. female, blood type (A, B, AB, O)


b. Ordinal: in a natural order, but not equally spaced

i. 1st/2nd/3rd degree burn, pain scale for migraines (none, mild, moderate, severe), Glasgow Coma Scale

2. Numerical
a. Discrete: on a number line
i. There is equal spacing between values (no fractions)
ii. Examples: # of live births, stools/day, # of sexual partners
b. Continuous: lots of possible values within the clinically possible range

i. BP, IQ, BMI, random blood glucose

VI. The Normal Distribution
1. General Info
a. The normal (Gaussian) distribution is the bell-shaped curve

i. The mean, median, and mode are all equal

b. Two ways to summarize distribution

i. Central tendency: mean, median, mode
ii. Dispersion: standard deviation

2. Central Tendency

a. Mean: pulled by outliers
i. The center of gravity of the distribution
b. Median: best when values are skewed
i. Will be between mode and mean when the data is skewed
c. Mode: least sensitive to skewed data
i. The most frequent value (the peak of the distribution)

3. Dispersion

a. Standard deviation: used for normal (or near normal) distributions
i. ±1 SD ≈ 2/3 of the observations
ii. ±2 SD ≈ 95% of the observations

VII. Misc.

1. Abnormality

a. Abnormality depends on the population and their respective distribution

i. The cut-off will differ b/w populations

b. Best definition of abnormality

i. Being unusual: greater than 2 SD from the mean
ii. Sick: observation regularly associated with disease
1. Most common definition
iii. Treatable: only considered abnormal if treatment leads to improved outcome

2. Sub-group sampling

a. May need to obtain a larger sample from important subgroups and select subjects at random within subgroup

3. Regression to the mean: some outliers may be due to random error; retesting them will cause them to move closer to the mean.

I. Quantifying Uncertainty

1. General Info

a. Uncertainty can be characterized as

i. Qualitative: unlikely, possible, suspected, etc.
ii. Quantitative: probability and odds

1. Can be converted back and forth

b. Often used to quantify a physician's opinion


c. Quantitative statements can force one to be more exact than is justified
2. Probability
a. Expresses uncertainty explicitly
b. Numerical value between 0 and 1
c. Calculated as a proportion: P = a/b, where a = number of events and b = the total number at risk

3. Odds

a. The ratio of the probability of the event occurring over the probability of the event not occurring

a. Small probability (<10%) → little difference b/w probability and odds
b. Large probabilities → big difference b/w probability and odds
c. Probability and odds are more alike the lower the absolute P (risk)
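A minimal Python sketch of the probability/odds conversion (odds = P/(1-P), P = odds/(1+odds)), showing they are nearly equal for small P and diverge as P grows.

```python
def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(odds):
    return odds / (1 + odds)

for p in (0.01, 0.05, 0.10, 0.50, 0.80):
    odds = prob_to_odds(p)
    # small p: odds is nearly equal to p; large p: odds pulls far away from p
    print(f"P = {p:.2f} -> odds = {odds:.3f} -> back to P = {odds_to_prob(odds):.2f}")
```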

1. Ratios

a. Expressed as (A/B) where A is NOT a part of B

b. In other words, A & B are mutually exclusive frequencies

c. Ratio of blacks to whites in a school was 15/300 or 1:20

2. Proportion

a. Expressed as (A/B) where A is INCLUDED in B

b. Based on a fraction in which the numerator (frequency of disease or condition) is included in the denominator

(the population)

c. The proportion of blacks in the school was 15/315 or 4.8%

3. Rates

a. Special types of proportions that are evaluated over a specified time period

i. Express the relationship b/w an event (e.g. disease) during a given time period and a defined

population at risk over the same time period

b. Must have a population-at-risk and a specific time period

1. Prevalence

a. The proportion of the number of cases observed compared to the population at risk at a given point in time.

i. no time dimension

ii. Is the pretest probability

b. Refers to all cases of disease observed

at a given moment

c. Is a function of both the incidence rate and the mean duration of the disease in the population
i. Ex. Arthritis: no cure, so there is a long duration. Thus, the prevalence is high (for a given incidence rate)
ii. Ex. Rabies: lethal disease, so the duration is very short. Thus, the prevalence is very low (for a given incidence rate)
(Margin note: Prevalence = burden; Incidence = risk)
d. Conveys the disease burden → preferred by epidemiological studies for measuring disease burden

2. Cumulative Incidence Rate (CIR)
a. The most commonly used measure of incidence.

i. Used for fixed/closed populations, only counts the first event (you only die once), and normally

measures stable rates (cancer rates)

1. Example Fixed Cohort (a medical school class)

b. Def. the proportion of a fixed population that becomes diseased during a stated period of time.

c. A measure of average risk.

i. The probability that a person

develops the disease in a

specified time period


ii. If you have a 5-year CIR of disease is 10%, then you have a 10% chance of developing the disease

over the next five years.

d. Range of 0 to 1 and must have a reference to time
i. Thus, it must increase with time (↑ time : ↑ CIR)
(Margin note: CFR = Deaths/Cases; Mortality rate = Deaths/Population; Incidence rate = Cases/Population)
e. It is the event rate in the context of randomized trials

i. Control event rate (CER) for the baseline/control group

ii. Experimental event rate (EER) for the treatment group

f. Case-Fatality Rate (CFR) proportion of affected individuals that die from the disease

i. CFR = deaths/affected → thus, we need the number of affected in the denominator
1. This contrasts with the mortality rate, in which the denominator is the entire population
ii. Associated with the seriousness and/or virulence of the disease
1. ↑ CFR : the more virulent the disease

iii. Best Measure for the lethality of the condition

g. Attack Rate number of people affected divided by the number at risk

i. Used as a measure of morbidity (illness) in outbreak investigations

3. Incidence Density Rate (IDR)
a. More commonly used in larger epidemiologic studies

i. Used for open populations at a variable starting point (randomized trial where it takes time to enroll

patients), single or multiple outcomes (UTIs), and measures highly variable rates (outbreaks)

1. Example an open cohort (a randomized control trial)

b. It represents the speed/instantaneous rate at a given point in time that disease is occurring in the

population (analogous to miles/hr a car is traveling)

i. An incidence rate of 25 cases per 100,000 person-years expresses the instantaneous speed at which the disease is affecting the population.

ii. It is a dynamic measure that can

freely change

iii. IDR = 0 → disease is not occurring in the population
iv. IDR = infinity → theoretical maximum; implies an instantaneous, universal effect (nuclear explosion!)

c. The numerator is the same as CIR; however, the denominator is now person-time

i. Denominator = the sum of the disease-free time experience

ii. In chronic studies, the standard measure is 100,000 person years

iii. In outbreaks, they will make more sense (such as person-days)

iv. Person time is approximated by estimating the population size midway through the time period
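A minimal Python sketch of an incidence density rate: cases divided by the summed disease-free person-time (all follow-up values are hypothetical).

```python
# Hypothetical open cohort: each tuple is (years of disease-free follow-up, developed disease?)
follow_up = [(2.0, False), (5.0, True), (1.5, False), (4.0, False), (0.5, True), (3.0, False)]

cases = sum(1 for _, diseased in follow_up if diseased)
person_years = sum(years for years, _ in follow_up)

idr = cases / person_years
print(f"{cases} cases / {person_years} person-years = {idr:.3f} cases per person-year")
print(f"= {idr * 100_000:.0f} cases per 100,000 person-years")
```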

4. Mortality Rate
a. Def. the frequency of death in a defined population during a specified time period
i. Measured using either the CIR (Cumulative Incidence Rate) or the IDR (Incidence Density Rate)
1. Often given as # per person-time (which is an IDR)

b. Measures #s of deaths rather than # of disease events

c. Biggest contributor of mortality is incidence

i. The mortality rate is therefore some fraction of the underlying incidence rate depending on

the lethality of the condition

d. Termed all-cause mortality when all deaths are combined (regardless of cause)

e. The denominator is the population at risk for dying from the condition

i. The denominator in the CFR (Case Fatality Rate) is the number of affected individuals


1. Why the CFR is the best measure for the lethality of a condition
2. CFRs can be similar between two populations when intuition tells you otherwise (aka, once you have it you're fucked; it's just a matter of time)
a. In other words, the CFR doesn't tell you true mortality or incidence, as you don't know who is going to get what and how often they may do so. What the CFR does tell you is that once you get it, this is how lethal it is, based on those who died divided by those who have it.
b. For example, acute myocardial infarction between males and females. Males (10% CFR) and females (12% CFR), while mortality was 110/100,000 person-years for males and 35/100,000 person-years for females.

I. Risks and Measures of Effect

1. General Idea

a. In clinical studies, it is common to calculate the risk/CIR of an event in different populations

i. By taking the ratio or difference between these two measures, we can calculate two fundamental

measures of effect

1. The relative risk
2. The absolute risk difference

b. The risk in the control group = baseline risk

2. Relative Risk (RR)
a. Def. the ratio of the risk in the treated group (Riskt) relative to the risk in the control group (Riskc).

i. It measures the strength/magnitude of the effect of the new treatment on mortality relative to the

effect of the standard.

ii. Null value = 1.0

iii. Magnitude is important: a larger # (further from 1.0) conveys less chance/confounding bias
1. Factors that increase risk (a high number indicates higher risk in the treatment group)
a. <2 = small effect
b. 2-5 = moderate effect
c. >5 = large effect
2. Factors that decrease risk (a low number indicates lower risk in the treatment group)
a. >0.5 = small effect
b. 0.5-0.2 = medium effect
c. <0.2 = large effect
3. If RR < 1.0 → treatment group lowered the event rate
4. If RR > 1.0 → treatment group increased the event rate
a. Harmed more than the control.
(Margin note: Relative Risk = efficacy of a treatment)

b. In cohort studies, RR measures the strength/magnitude of association between an exposure and an outcome

c. Only measured where the actual incidence or risk of an event can be measured

i. RCTs and cohort studies

ii. CCS & XS cannot, since they cannot measure the actual incidence/risk in a population

d. RR is favored by epidemiologists b/c it is fairly constant across different populations

i. Can be transported from one study to another

e. Limitations

i. Limited clinical usefulness: fails to convey information on the likely effectiveness of a clinical intervention
ii. Is not symmetrical (the OR is symmetrical)
iii. No measure of impact: not a very useful measure of the impact of a risk factor on a population

1. No info on frequency or prevalence of risk factor

3. Relative Risk Reduction (RRR)
a. Applied in the context of a treatment that reduces the risk of some adverse outcome

i. It indicates the magnitude of the treatment effect in relative terms

1. <10% = small treatment effect


2. 10-30% = moderate treatment effect

3. >30% = large treatment effect

ii. A RRR of 38% would be interpreted as the death rate being 38% lower after the new treatment

compared to the old treatment

iii. It represents the proportion of the original baseline (control) risk that is removed by the

treatment.

b. It is nothing more than a re-expression of the RR (hence they add to 1)

c. Commonly used in context of RCT RRR how much risk was removed

d. More clinically important: has a more direct meaning
i. It indicates by how much, in relative terms, the event rate is decreased by the treatment
ii. RRR × Riskc = (Riskc - Riskt) = ARR

4. Absolute Risk Reduction (ARR)
a. It is simply the absolute difference in risks between the control and the treatment groups

i. Simple and direct measure of the impact of treatment

b. More clinically useful it is a measure of the absolute benefit of

intervention

i. Preferred measure when discussing the benefits of clinical

intervention at an individual patient level

c. It will vary based on the baseline risk in the control group

i. At constant RRR, the ARR will vary based on the baseline risk

ii. The absolute benefit of treatment depends upon how much

risk there is in the population before the treatment is applied

1. An ARR in one study CANNOT be transported to

another

d. null value is zero

Comparison of RR, RRR, and ARR:
- RR: Equation = Riskt / Riskc. Measures the strength of the effect of T relative to C (magnitude of the association between exposure and outcome). Portability: yes (assumed the same for all populations). Example: "The risk of the outcome is X times lower (or higher) in T compared to C."
- RRR: Equation = 1 - RR. Measures the magnitude of the treatment effect (the proportion of the baseline risk that was removed by treatment). Portability: yes (assumed the same for all populations). Example: "The outcome is X% lower in T compared to C."
- ARR: Equation = Riskc - Riskt. Measures the absolute difference between T & C. Portability: no (will vary between populations). Example: "The outcome is (Riskc - Riskt) lower in T compared to C."
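A minimal Python sketch computing the three measures in the comparison above from a control-group and a treatment-group risk; the 20% and 15% risks are hypothetical.

```python
def effect_measures(risk_control, risk_treated):
    rr = risk_treated / risk_control          # relative risk
    rrr = 1 - rr                              # relative risk reduction
    arr = risk_control - risk_treated         # absolute risk reduction
    return rr, rrr, arr

# Hypothetical trial: 20% of controls and 15% of treated patients have the outcome
rr, rrr, arr = effect_measures(0.20, 0.15)
print(f"RR  = {rr:.2f}  (risk in T is {rr:.2f} times the risk in C)")
print(f"RRR = {rrr:.0%} (proportion of the baseline risk removed by treatment)")
print(f"ARR = {arr:.0%}  (absolute difference between C and T)")
```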

5. Number Needed to Treat (NNT)
a. The number of patients who would need to be treated in order to prevent one adverse outcome

i. It is the amount of work required to take advantage of the potential clinical benefit of an intervention

ii. ↑ # = lots of work to gain any benefit
b. It is a simpler way to interpret absolute probabilities
i. Converts the probabilities into real numbers
c. As a result of its relationship with ARR (NNT = 1/ARR)
i. Needs to be accompanied by a time frame
1. As ↑ time : ↑ ARR : ↓ NNT
ii. Will be influenced by the baseline risk
1. As ↓ baseline risk : ↑ NNT
a. The less risk there is, the more people we need to treat to show anything
(Margin note: baseline risk & RRR are inversely related to NNT)
d. Also depends on the relative efficacy of the treatment (RRR)
i. As ↓ RRR : ↑ NNT

1. As treatment gets less

effective, we need to treat more

to get the same result


e. Use NNT and NNH in concert with each other to make decisions

i. Will describe in absolute terms the trade off in both benefits and harm
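A minimal Python sketch of NNT = 1/ARR and of the point above that, at a fixed RRR, NNT rises as the baseline risk falls (baseline risks are hypothetical).

```python
def nnt(risk_control, rrr):
    arr = risk_control * rrr       # ARR = RRR x baseline (control) risk
    return 1 / arr

rrr = 0.25  # hypothetical: treatment removes 25% of the baseline risk
for baseline in (0.20, 0.05, 0.01):
    print(f"baseline risk {baseline:.0%}: ARR = {baseline * rrr:.1%}, "
          f"NNT = {nnt(baseline, rrr):.0f}")
# Lower baseline risk -> smaller ARR -> more patients must be treated to prevent one event
```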

1. General Info

a. In order to understand the impact of a risk factor on the incidence of disease in a population, we need to

know

i. The relative effect of the risk factor

ii. The prevalence of the risk factor in the population

b. In order to quantify the impact of the risk factors, we have the implicit assumption that the risk factor is a

cause of the disease.

c. PAR and PARF indicate the potential public health significance of a risk factor

i. Risk factor with big effect (RR =10) but is rare (P = 0.01%) has a PARF of 1%

ii. Risk factor with small effect (RR=2) but is common (P=40%) has a PARF of 44%.

2. Population Attributable Risk (PAR)
a. The excess disease in a population that is associated with a risk factor

b. In other words, it is the excess disease (incidence) in the population that is caused by the risk factor

3. Population Attributable Risk Fraction (PARF)
a. The fraction of total disease in the population that is attributable to the risk factor
b. In other words, it is the proportion of the total incidence in the population that is attributable to the risk factor
i. Prevalence here refers to the prevalence of the risk factor
c. It also represents the maximum potential impact of prevention efforts on the incidence of disease in the population if the risk factor was eliminated
d. Used in cohort studies
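The notes do not show the PARF formula, but the standard Levin formula is PARF = P(RR - 1) / [1 + P(RR - 1)], where P is the prevalence of the risk factor. A minimal Python sketch with arbitrary illustrative inputs (not the examples quoted above):

```python
def parf(prevalence, rr):
    """Levin's formula for the population attributable risk fraction."""
    excess = prevalence * (rr - 1)
    return excess / (1 + excess)

# A strong but rare factor vs. a weak but common one (illustrative numbers only)
print(f"RR=10, P=1%:   PARF = {parf(0.01, 10):.1%}")
print(f"RR=1.5, P=50%: PARF = {parf(0.50, 1.5):.1%}")
```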

III. The Odds Ratio
OR = odds of exposure in cases / odds of exposure in controls = (a/c) / (b/d) = ad/bc
1. The Odds Ratio

a. Measure of effect of choice for case-control studies (CCS)

i. CCS is not able to quantify the actual

incidence or risk of disease

b. Is a good approximation of the RR

i. When outcome of interest is rare (<10%)

OR more closely approximates RR

1. Odds and probability are more

alike when the risk is small

2. Odds ratio can only be interpreted

as RR when baseline risk <10%

c. It Is the odds of exposure in cases compared to the

odds of exposure in controls

i. Describes both the magnitude and strength

of an association between exposure and

outcome

1. Null value = 1.0

2. OR >1.0 = positive association between exposure and the disease

3. OR <1.0 = negative association between exposure and the disease

d. Calculate the ratio of odds of exposure in the cases (95/5) divided by the odds of exposure among the

controls (56/44)


i. The odds of death due to lung cancer was 15.6 times higher in smokers compared to non-smokers

ii. Odds ratio is symmetrical if you calculated the odd of disease among the exposed (a/b) and

divided it by the odds of disease among the non-exposed (c/d), you would get the same odds ratio.
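A minimal Python sketch of the cross-product calculation for the 2x2 case-control table (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls), using the rounded exposure counts quoted above, and showing the symmetry point by computing the OR both ways.

```python
def odds_ratio(a, b, c, d):
    """OR = (odds of exposure in cases) / (odds of exposure in controls) = (a/c)/(b/d) = ad/bc."""
    return (a * d) / (b * c)

a, b, c, d = 95, 56, 5, 44   # rounded counts from the notes (cases 95/5, controls 56/44)

or_exposure = (a / c) / (b / d)   # odds of exposure: cases vs. controls
or_disease = (a / b) / (c / d)    # odds of disease: exposed vs. unexposed (same value)
print(f"OR = {odds_ratio(a, b, c, d):.1f} (symmetry: {or_exposure:.1f} = {or_disease:.1f})")
# With these rounded counts the OR works out to roughly 15
```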

a. OR deviates from the true RR as the baseline risk in the untreated group increases

i. Noticeable once risk >10%; often the case in RCTs

b. OR deviates from the true RR as the treatment effect gets larger → the OR overestimates the treatment effect

c. OR is always further away from the null value of 1.0 than the RR

i. thus the treatment effect is always over-estimated

d. There is nothing clinically intuitive about using the OR

I. Classical Hypothesis (Significance) Testing

1. General Overview

a. Concerned with making a decision about the value of an unknown parameter

b. Views experimentation as a decision making process

c. Null Hypothesis (Ho) no difference in the groups being compared with respect to the measured quantity of

interest

d. Alternative Hypothesis (HA) the groups being compared are different

i. can be specified for direction (one-sided alternative) instead of any difference (two-sided alternative)
ii. if a difference is found, regardless of direction, it is called the treatment effect
iii. we can never prove Ha is true, we can only reject Ho

e. Process of testing null hypothesis consists of calculating the probability of obtaining the results observed

assuming the null hypothesis is true

i. This probability is known as the p-value
1. It is the probability of observing a test statistic at least as large as the one observed, under the assumption that the null hypothesis is true
2. P = the probability of seeing a result at least this extreme P% of the time, assuming the null hypothesis is true

f. Alpha (α): the significance level
i. By convention, set to 5%; can be altered to suit the researcher's needs.
g. If the p-value is less than alpha, the null hypothesis is rejected (the observed result would be less probable under the null than the threshold we define as significant).

2. Steps of hypothesis testing
a. Define the null hypothesis

b. Define the alternative hypothesis

c. Calculate the P-value assuming the null hypothesis is true, this is the probability of obtaining the results

found in the data

d. Accept or reject the null hypothesis

i. If the probability of observing the actual data under the null hypothesis is smaller than the significance level (p < α), then we reject the null.

e. Accept the alternative hypothesis

3. The T-test

a. Tests means between two groups using continuous data, assuming the data is normally distributed

b. Larger values of t result in smaller p values which are more consistent with Ho being false

i. Numerator: larger differences in the means result in larger t values
1. ↑ difference : ↑ t : ↓ P → more evidence Ho is false
ii. Denominator: a measure of the standard error of the difference
1. As the sample size increases, the denominator decreases, and t increases
a. ↑ sample size : ↑ t : ↓ P → more evidence Ho is false
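A minimal Python sketch of a two-sample t-test with scipy on invented data; the only point is that a larger mean difference or a larger n drives t up and p down.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical continuous outcome (e.g. systolic BP) in two groups
group_a = rng.normal(loc=120, scale=10, size=30)
group_b = rng.normal(loc=126, scale=10, size=30)

t, p = stats.ttest_ind(group_a, group_b)   # assumes roughly normal data, equal variances
print(f"t = {t:.2f}, p = {p:.4f}")
```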

4. Type I and Type II errors
a. Type I (FP) error

i. Occurs when we determine a difference exists when there is not one

ii. A statistically significant p value is obtained

1. Even though there is no underlying difference between the groups being compared

iii. The rate at which false positives occur is the significance level (α)

1. Also known as the Type I error rate

2. Set at 5% b/c scientists are by nature cautious

a. Want to avoid false alarms → only find the person guilty beyond a reasonable doubt
(Margin note: Type I error (FP) = significance level (α); Type II error (FN) = β)


3. This makes sense b/c everything under this 5% will be deemed significant (even though it is not, assuming we are still under the FP pretense), as the null hypothesis will be rejected. In other words, 5 times out of 100 we will have a Type I error, because by definition the p-value falls below alpha (0.05) in 5% of tests when the null is actually true.

b. Type II (FN) error

i. Occurs when we determine a difference does

not exist when in fact it does

1. When we accept Ho even though it is false

ii. A statistically non-significant p-value is

obtained

1. Even though there is a difference

between the groups being compared

iii. The rate that false negatives occurs is beta

1. Also known as Type II error rate

2. Sample size estimates are based on

setting beta at either 20% or as low

as 10%

3. This means that a real difference would be missed 20% of the time

iv. For smaller studies, the probability of a Type II error is a lot higher

c. α & β have an inverse relationship

i. As one increases, the other decreases

5. Power and Sample Size (Power = 1 - β = 'sensitivity' = the probability of correctly rejecting Ho when Ho is false)

a. Power

i. The complement of the Type II error rate: power = (1 - β)

ii. The probability of correctly rejecting Ho when Ho is false

1. The probability of the study finding a difference when a difference truly exists.

iii. Most studies have power = 0.8 or greater

iv. Power is analogous to sensitivity

v. Easiest way to increase power = ↑ sample size

b. 4 parameters

i. α (FP) error rate
1. a smaller alpha increases beta, which lowers the power of the study, making it harder to identify a real difference
a. a low α → a more stringent test → harder to prove a difference exists (harder to reject Ho)
i. More likely to get a Type II error (β)
2. as ↓ α : ↑ β : ↓ power

ii. β (FN) error rate
1. the smaller the beta, the easier it will be to identify a difference
a. this can be accomplished by increasing the sample size or increasing α
2. ↓ β : easier to find a difference (↑ power)

iii. Effect Size

1. The magnitude of the treatment difference you are trying to detect

a. Bigger differences are easier to detect than smaller differences

b. Size does matter

2. The study will be powered to find the minimal clinically important difference (the smallest difference b/w 2 treatments that would be clinically beneficial)

iv. The variability of the data

1. The greater the variability in the data the harder it will be to detect a difference

a. ↑ variability : ↓ power

2. It is harder to detect the true signal when there is a lot of noise to contend with

3. Also true with rare events (death, relapse in follow-up study)

a. rare outcome : ↓ power

c. Problem with low power studies

i. It is difficult to interpret negative results

1. No effect? Or was there a failure to detect a true effect b/c of too small #s or outcomes

2. Low power studies also indicate imprecise measurements (wide CIs)

ii. Low power studies → ↑ Type II errors (β)
iii. Low power studies → no effect on Type I errors (α)
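A minimal Python simulation sketch of power: the fraction of repeated hypothetical trials (with a real difference present) that reach p < alpha; shrinking the sample size lowers that fraction. Effect size, SD, and n are all invented.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_group, true_diff, sd=10.0, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated trials that correctly reject H0 when a real difference exists."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, sd, n_per_group)
        b = rng.normal(true_diff, sd, n_per_group)
        _, p = stats.ttest_ind(a, b)
        if p < alpha:
            hits += 1
    return hits / n_sims

print(f"n=64/group, diff=5: power ~ {simulated_power(64, 5):.2f}")
print(f"n=20/group, diff=5: power ~ {simulated_power(20, 5):.2f}")   # smaller study -> lower power
```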

II. Estimation and Confidence Intervals
1. In a nutshell

a. Approaches statistical inference as a measurement exercise

i. Estimating the specific value and the precision in which the specific value is measured

b. Same info as p-value; however also gives

i. Size of treatment difference


ii. The precision of the estimated difference

iii. Information that aids in interpreting a negative result

a. Point estimate: the observed single best estimate
i. Conveys the magnitude of an effect
b. Confidence interval: the set of all possible values that are consistent with the data
i. It quantifies the precision
ii. CI = point estimate ± (percentile of the distribution × standard error)
1. Percentile of the distribution = measure of confidence (e.g. 1.96 for 95%)
2. Standard error = σ/√n
a. ↑ N : ↓ standard error : narrower CI

iii. Not a uniform distribution: #s closer to the estimate are more likely

1. the 95% CI is symmetrical

a. (CI +1) has the same probability as (CI -1)

b. However, (CI +.5) has a larger probability of occurring than (CI+1) b/c

(CI+0.5) is closer to the point estimate so it is more likely

iv. the further the point estimate from the null, the more extreme the p-value

1. At one extreme, or the positive end (in a study to prove an increased effect)

a. then p-value would be less than 0.05 (assuming 95% CI)

2. At the other extreme, or the negative end (in the same study to prove an increased effect)

a. then p-value would be much greater than 0.05 (assuming 95% CI)

v. All the values outside the 95% CI would be statistically significantly different from the point estimate at p < 0.05
vi. If the CI includes 0 (or crosses into negative values) → the results are not statistically significant

1. However, results from these negative trials may still be clinically significant
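A minimal Python sketch of the CI formula above (point estimate ± percentile × standard error) for a difference in means; the summary numbers are hypothetical and 1.96 is the 95% percentile of the normal distribution.

```python
import math

# Hypothetical summary data: mean, SD, n for treatment and control
mean_t, sd_t, n_t = 12.0, 8.0, 100
mean_c, sd_c, n_c = 15.0, 9.0, 100

diff = mean_t - mean_c                               # point estimate
se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)        # standard error of the difference
z = 1.96                                             # 95% percentile of the distribution

lower, upper = diff - z * se, diff + z * se
print(f"difference = {diff:.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
# If the CI excludes 0 (the null value for a difference), p < 0.05 for the two-sided test
```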

3. Clinical Relevance

a. Clinicians should only adjust their practices if there is a treatment difference and that difference is large enough to be clinically important.

b. With wide confidence intervals, clinicians can determine what they think is clinically important and then reach

conclusions appropriate for their practice.

III. Multiple Comparisons
1. When two identical groups of patients are compared, there is a chance (α) that a statistically significant p-value will be obtained (Type I error)
a. When multiple comparisons are performed, the risk of one or more false-positive p-values increases
i. If you choose enough outcomes, you will eventually get data that is statistically significant

2. Bonferroni Correction

a. Method for reducing the overall Type I error risk when making multiple comparisons

b. Divide the overall Type I error risk desired (0.05) by the number of comparisons → the new value is the α for each individual test
c. Controls the Type I error risk, but reduces the power (↑ Type II error risk)
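A minimal Python sketch of the Bonferroni correction just described: divide alpha by the number of comparisons and test each p-value against the stricter threshold (the p-values are made up).

```python
alpha = 0.05
p_values = [0.001, 0.012, 0.030, 0.049, 0.20]   # hypothetical p-values from 5 outcome comparisons

alpha_per_test = alpha / len(p_values)           # 0.01 per individual test
for p in p_values:
    verdict = "significant" if p < alpha_per_test else "not significant"
    print(f"p = {p:.3f} vs corrected alpha {alpha_per_test:.3f}: {verdict}")
```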

I. Clinical Testing (Diagnostic Strategies)

1. Hypothetico-deductive reasoning

a. Diagnostic strategy that nearly all clinicians use most of the time

b. Steps

i. Formulate hypotheses for the patient's primary problem
ii. First consider explanations that are most likely and/or those that are particularly harmful to miss

1. Simultaneously rule out those that would be particularly harmful or catastrophic and try to

rule-in those that are considered to be most likely.

iii. Continue until list is shortened and/or candidate disease has very high likelihood (>90%)

c. The list of possibilities is reduced by considering the evidence for and against each, discarding those which are very unlikely and conducting further tests to increase the likelihood of the most plausible candidates

1. Se & Sp Overview

a. Due to inherent variability in biological systems


i. FPs and FNs will always occur

b. Interpretation of diagnostic results is essentially concerned with comparing the relative frequencies of the

incorrect results (FN/FP) to the correct results (TP/TN)

c. As tests normally have a continuum of values, positive and negative are divided due to a cut-off point that

differentiates between normal and abnormal

i. To the extent that the two populations (see figure) have similar

measurements, the test will not be able to discriminate between

them

1. Degree of overlap = measure of the test effectiveness

a. Sp and Se quantify this

ii. ↑ overlap : ↓ discriminatory power of the test

d. The presence or absence of a disease must be determined by a gold standard

2. Sensitivity

a. Defined as, the proportion of individuals with disease that have a positive test

result or the ability of a test to detect a disease when it is present

i. The true-positive rate

1. Test positives divided by total disease positives

ii. Calculated from diseased individuals

iii. The conditional probability of being test positive given the disease is present

1. Se = P (T+|D+)

2. When the disease is present, the test will be positive

b. The more sensitive a test, the better the NPV

c. A perfectly sensitive test

i. Test recognizes all actual positives → it rarely misses
ii. No Type II errors (FN) → we won't miss the disease
1. Negative results rule out disease → should be reassuring
2. All negatives must be TNs
3. No FNs
iii. Does not tell you if disease is present
1. Test gives no information on false positives
(Box: all diseased patients test (+); no FN results; all TN patients are disease free; a sizable portion of D− may still test positive)

iv. Three scenarios when high sensitivity tests should be selected

1. Early stages of work-up when large # of potential diseases are being considered

a. (-) result rules out that disease; helps narrow down choices

2. When there is an important penalty for missing the disease

a. TB, syphilis, etc

b. ↓ FNs: since they are treatable, we want to make sure we don't miss them

3. Probability of disease is relatively low (low prevalence)

a. Purpose is to discover asymptomatic disease

Examples (patients with this disorder will have this finding X% of the time; the X% indicates the Se):
- Duodenal ulcer: history of ulcer, 50+ years, pain relieved by eating or pain after eating (95%)
- Favorable prognosis following non-traumatic coma: positive corneal reflex (92%)
- ↑ intracranial pressure: absence of spontaneous pulsation of retinal veins (100%)
- DVT: (+) D-dimer (89%)
- Pancreatic cancer: (+) ERCP (95%)

e. Most helpful when the test is negative → rules out disease
i. (+) results will depend on the rate of FPs (specificity)
f. Used for screening in diseases with low prevalence
g. SnNOUT: a highly SeNsitive test, when Negative, rules OUT disease (↓ FNs, ↑ NPV)

3. Specificity

a. Defined as, the proportion of individuals without disease that have a negative test result or the ability of a

test to indicate non-disease when disease is not present.

i. The true-negative rate (SpPIN: ↑ sPecificity = ↓ FPs = ↑ PPV)

1. Test negatives divided by total disease negatives

ii. Calculated from non-diseased individuals

iii. The conditional probability of being test negative given the disease is absent

1. Sp = P(T-|D-)

2. When the disease is absent, the test will be negative

iv. If Sp were 75%, the FP rate would be 25%

b. Low specificity

i. Will have lots of FPs

ii. We want to make sure we don't miss anything

(similar to airport screening)

iii. Analogous to high sensitivity


c. The more specific a test, the better the PPV

d. A perfectly specific test

i. Test recognizes all actual negatives → confirms health
ii. No FP errors → a positive result truly means disease
1. Positive results rule in disease → true bad news
2. All positives must be TPs
iii. Does not tell you if disease is absent
1. Test gives no information on false negatives
(Box: all non-diseased patients test (−); no FP results; all test-positive patients have disease; a sizable portion of D+ may still test negative)

iv. Two scenarios when high specificity tests should be selected

1. To rule-in a diagnosis that has been suggested by other tests

2. When FPs can harm the patient physically or emotionally

a. Confirmation of HIV or cancer

b. When we want to be absolutely sure a condition is present

Examples (patients without this disorder will have this finding X% of the time; the X% indicates the Sp):
- Alcohol dependency: "No" to 3 or more of the 4 CAGE questions (99.7%)
- Fe-deficiency anemia: (−) serum ferritin (90%)
- Breast cancer: (−) fine needle aspirate (98%)
- Strep throat: (−) pharyngeal gram stain (96%)

f. SpPIN: a highly SPecific test, when Positive, rules IN disease (↓ FPs, ↑ PPV)

4. Trade off Between Sensitivity and Specificity

a. No such thing as a perfect test; must have a trade-off

b. For continuous scales, location of cut-off point is arbitrary

i. Can be modified for the purpose of the test

c. Lowering the cut-off point → ↑ Se but ↓ Sp
i. ↓ FN; ↑ FP

5. ROC (Receiver Operating Characteristic) Curves
a. Uses

i. Compare the accuracy of two or more tests

ii. Illustrate the trade-off b/w Se & Sp as the cut-point is changed

1. Slope of ROC curve (TP rate : FP rate) is the likelihood ratio

b. Constructed by plotting the sensitivity (true positive rate) against the false positive rate (1-specificity) for a series

of cut-points

c. Best discriminating tests lie further to the north-

west

i. ↓ FN rates (indicated by high Se)
ii. ↓ FP rates (indicated by high Sp)

d. To discriminate b/w diseased and non-disease individuals

i. Area under ROC curve (AUROCC) indicates the

overall accuracy of the test

1. 0.5 = no discriminating ability
2. 1.0 = perfect accuracy

e. Tests w/ no discriminating ability → a diagonal straight line

i. Equivalent to likelihood ratio (LR) of 1.0

f. Deciding on cut point

i. Influenced by

1. Likelihood of disease (prevalence)

2. Relative costs (risk/benefit) associated

with errors in diagnosis

a. Includes both FN & FP

ii. If the cost of missing a diagnosis (FN) is high (compared to FP) → we want a low FN

1. Operate on horizontal part of the curve (60 units)

a. FN result are minimized at the expense of FP results

b. Maximizes sensitivity (~90%) while providing reasonable specificity (~50%)

iii. If the cost of falsely labeling a healthy person as diseased (FP) is high compared to the cost of missing a diagnosis (FN) → we want a low FP

1. Operate on the vertical part of the curve (320 units)

a. FP results are minimized at the expense of FN results

b. Maximizes specificity (~99%) while providing moderate sensitivity (~40%).

1. General Idea


a. Se & Sp can only be calculated if the true disease status is known

i. They are conditional on the disease status being either positive (Se) or negative (Sp)

b. Clinician is using test precisely b/c they do not know the disease state

i. We want to know the conditional probability of disease given a test result

2. Positive Predictive Value (PVP / PPV)
a. Defined as the probability of the disease in a patient with a positive (abnormal) test

i. True Positives

b. The conditional probability of being diseased given that the test was positive

i. PVP = P(D+|T+)

ii. Sp & PVP are linked as both provide info on FP rate

c. Influenced by Sp and Prevalence

i. Decreases as prevalence decreases

1. The relative size of D+ (cell a) to D- (cell b) is now much smaller

2. ↓ prevalence = ↓ PPV

ii. Increases as specificity increases

1. The more specific the test, the less FPs there will be and thus, more likely those that test

positive will actually have the disease/condition

2. ↑ Sp = ↑ PPV

d. A highly specific test helps rule-in disease b/c PVP is maximized

i. SpPIN implies that 1-PPV is very small

1. ↓ (1 − PPV) = they probably have it (as PPV is ↑)
2. ↑ (1 − PPV) = less able to rule in disease (not all test-positive patients require follow-up)
ii. ↑ test specificity = better the PPV

3. Negative Predictive Value (PVN / NPV)
a. Defined as the probability of not having the disease when the test result is negative (normal)

i. True Negatives

b. The conditional probability of not being diseased given the test was negative

i. PVN = P(D-|T-)

ii. Se & PVN are linked as both provide info on FN rate

c. Influenced by Se & Prevalence

i. Increases as prevalence decreases

1. The relative size of D- (cell d) to D+ (cell c) is much larger

2. ↓ prevalence = ↑ NPV

d. A highly sensitive test helps to rule-out disease because PVN is maximized

i. ↑ test sensitivity = better NPV

e. Clinically, more interested in the complement of PVN (1-PVN)

i. = P(D+|T-)

ii. Tells the clinician what the probability is of having the disease despite testing negative

1. Informs on the rate of FNs among all negative test results

2. SnNOUT implies that 1 − NPV is very small
a. ↓ (1 − NPV) = they really don't have it (as NPV is ↑)
b. ↑ (1 − NPV) = less able to rule out the disease (can't send a test-negative patient home with a clean bill of health)

f. ↑ PVN = few FNs among all test-negative results

i. Indicates an alternative diagnosis should be sought

4. Prevalence

a. Represents the proportion of the total population tested that has the

disease

b. It is the third force that often goes unnoticed before revealing

its influence in dramatic fashion

i. Has dramatic influence on PVN & PVP

c. AKA likelihood of disease, prior probability, prior belief, prior odds,

pre-test probability & pre-test odds.

d. As prevalence falls
i. ↓ PPV
ii. ↑ NPV

e. As prevalence increases


i. ↑ PPV
ii. ↓ NPV

IV. Bayes' Theorem (Revising Probabilities)
1. General Idea

a. Calculating the predictive values for any combination of Se, Sp, & prevalence (using 2x2 tables)

b. The process by which disease probabilities are revised in face of new test information

c. Can be used to calculate PPV, NPV, Se, Sp, and Prev.

2. Steps (2x2 grid method)
a. Make a grid, fix N, and then calculate the expected number of diseased by applying the prevalence rate to N

b. Calculate D+|T+ (cell a) by multiplying # diseased by Se

i. Place the remainder under D+|T− (cell c), as these are the FNs
1. FNs = (1 − Se) × # diseased

c. Calculate D-|T- (cell d) by multiplying # healthy by Sp

i. Place the remainder under D−|T+ (cell b), as these are the FPs
1. FPs = (1 − Sp) × # healthy

d. Use the top row (cells a & b) to calculate the PPV and the bottom row (cells c & d) to calculate the NPV
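A minimal Python sketch of the 2x2 grid procedure just described: fix N, split by prevalence, fill the four cells from Se and Sp, then read the PPV off the test-positive row and the NPV off the test-negative row (Se, Sp, prevalence, and N are hypothetical).

```python
def predictive_values(sensitivity, specificity, prevalence, n=10_000):
    diseased = n * prevalence
    healthy = n - diseased

    tp = diseased * sensitivity            # cell a: D+ T+
    fn = diseased * (1 - sensitivity)      # cell c: D+ T-
    tn = healthy * specificity             # cell d: D- T-
    fp = healthy * (1 - specificity)       # cell b: D- T+

    ppv = tp / (tp + fp)                   # test-positive row
    npv = tn / (tn + fn)                   # test-negative row
    return ppv, npv

# Same hypothetical test (Se 90%, Sp 95%) at two different prevalences
for prev in (0.20, 0.01):
    ppv, npv = predictive_values(0.90, 0.95, prev)
    print(f"prevalence {prev:.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
# As prevalence falls, the PPV drops sharply while the NPV rises
```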

V. Multiple Testing Strategies

1. General Idea

a. Tests with sufficiently high Se & Sp that can simultaneously

rule out and rule in are very rare

i. We only have access to an array of imperfect tests

b. Get more information by combining tests

2. Parallel Testing

a. Situation where several tests are run simultaneously

i. Any one positive test leads to further evaluation

b. Used in early phases of the work-up when trying to rule stuff out

i. Lots of negative tests = condition ruled out

c. Positive test only means that more testing is needed

i. b/c of the ↑ in FPs

d. Very costly, highly inefficient, dangerous to the patient, and is bad medicine

e. Best when need highly sensitive test (yet only have a couple insensitive tests)

i. Combining the relatively insensitive tests in parallel maximizes your chance of identifying

diseased subjects.

f. Net effect

i. ↑ likelihood of detecting disease
1. ↑ Se (there are multiple opportunities to find a positive test result)
2. ↑ PVN
ii. ↑ risk of FP
1. ↓ Sp
2. ↓ PVP

3. Serial Testing

a. Situation where several tests are run in order

i. Each subsequent test is only run if the prior test was positive

b. Any negative test → work-up stopped

c. Best used when

i. We want to be sure a disease is ruled in w/ certainty

1. Used when we have no time constraints

ii. If the definitive test is expensive, difficult, or invasive

1. To avoid overusing, we make sure the patient is positive to other tests before advancing

a. Example colonoscopy after (+) fecal occult blood test

d. Great example of the logic of Bayes theorem to revise probabilities

i. The results of the first test are used to provided pre-test probabilities for the second test

e. Net effect

i. ↑ Sp & ↑ PPV b/c each case has to test positive on multiple tests
1. ↓ FPs
ii. ↓ Se & ↓ NPV
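A minimal Python sketch of the net effect of combining two imperfect tests, assuming (an assumption the notes do not state) that the tests are independent given disease status: parallel = positive if either test is positive; serial = positive only if both are.

```python
def parallel(se1, sp1, se2, sp2):
    # Positive if EITHER test is positive: a case is missed only if both tests miss it
    se = 1 - (1 - se1) * (1 - se2)
    sp = sp1 * sp2                       # a healthy person must be negative on both
    return se, sp

def serial(se1, sp1, se2, sp2):
    # Positive only if BOTH tests are positive: a case is detected only if both detect it
    se = se1 * se2
    sp = 1 - (1 - sp1) * (1 - sp2)       # a healthy person is cleared by either negative
    return se, sp

# Two hypothetical imperfect tests
se1, sp1 = 0.80, 0.90
se2, sp2 = 0.85, 0.88

print("parallel: Se %.2f, Sp %.2f" % parallel(se1, sp1, se2, sp2))  # Se up, Sp down
print("serial:   Se %.2f, Sp %.2f" % serial(se1, sp1, se2, sp2))    # Sp up, Se down
```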


I. Introduction

1. General Idea

a. Goal of screening: ↓ mortality & morbidity (and/or ↓ expensive or toxic treatment)

i. Is a form of secondary prevention

ii. Designed to detect disease early in asymptomatic phase

1. Early treatment either slows disease progression or provides a cure

iii. Premise is based on concept that early treatment will stop/retard disease progression

iv. Screening has both diagnostic and therapeutic components

2. Results of screening

a. Unlikely to have the disease (both FN & TN)

b. Likely to have disease therefore requires further diagnostic evaluation

Testing vs. Screening:
- Testing: sick patients are tested; diagnostic intent; ↑ disease prevalence
- Screening: healthy non-patients are screened; no diagnostic intent; ↓ disease prevalence

a. Mass/population-based: application of screening tests to large, unselected populations
i. Mammography screening for breast cancer in women <40 years
b. Case finding: use of screening by clinicians to identify disease in patients who present for other unrelated problems

i. Blood pressure measurements

2. The Detectable Pre-Clinical Phase (PCP)
a. Defined as the period between when early detection by screening is possible and when the clinical diagnosis would usually have been made

b. Must be sufficiently long in order for a disease to be a suitable candidate for screening

i. The point that a typical person seeks medical attention depends upon availability of medical care, as

well as the level of medical awareness in the population

c. Examples

i. Long PCP → screening might be useful, e.g. colorectal cancer (PCP = 7-10 years)
ii. Short PCP → screening unlikely to help, e.g. childhood diabetes (PCP = weeks to months)

d. The prevalence of detectable pre-clinical disease in a population is a critical determinant of the

potential utility of screening

i. Prevalence of disease itself is not the critical component

ii. Depends on

1. Incidence of disease

2. Average duration of pre-

clinical phase

3. Recent screening (↓ prevalence of detectable pre-clinical disease)
4. Detection capabilities of the test (a more sensitive test = ↑ detectable prevalence)

II. Lead Time
1. General Idea

a. Defined as, the interval from detection by screening to the time at which diagnosis would have been

made without screening

i. It is the central rational of screening as it equals the amount of time by which treatment is

advanced or made early

ii. Results in longer awareness of disease

b. Does not necessarily imply any improved outcome

i. After lead time has occurred, early treatment must then be effective for screening to be beneficial


2. Lead time is not a theory or a statistical artifact

a. It is what is expected w/ an early diagnosis

b. It must occur for screening to be worthwhile

i. It is therefore a necessary condition for screening to be effective in reducing mortality

3. Distribution of lead time is important

a. It indicates the length of time by which detection and treatment must be advanced in order to achieve a

level of improved mortality

b. Suggests how often screens should be done

1. Sensitivity

a. Defined as, the proportion of cases with a positive screening test among all cases of pre-clinical disease

b. In order to be accurate, all pre-clinical disease individuals must be identified w/an acceptable gold standard

diagnostic test

i. The true disease status of negative screening individuals is impossible to verify

1. No justification for a full diagnostic work-up

a. Excellent example of verification bias

c. The more sensitive a test = the better the NPV

d. Imperfect sensitivity affects a few (the cases)

i. An imperfect (↓ sensitive) test will have a lot of FNs, so a lot of diseased people will be classified as negative; thus, it affects the cases

e. Can only be estimated in screening studies by counting the # of interval cases that occur over a

specified period in persons who tested negative to the screening test

i. In other words, count the people who got the disease but tested negative

1. ↑ false negatives (FNs) = ↓ screening Se

2. Specificity

a. Defined as, the ability of screening test to designate as negative people who do not have pre-clinical disease

b. Imperfect specificity affects many (the healthy!)
i. An imperfect (↓ specific) test will have lots of FPs, so a lot of healthy people will be classified as positive; thus, it affects healthy people
c. The more specific a test = the better the PPV

d. For screening to be feasible the FP rate (1-Sp) needs to be sufficiently low

i. Since the prevalence of pre-clinical disease is always low, the positive predictive value (PPV) will be low in most screening programs
1. ↓ pre-clinical prevalence : ↓ PPV; thus, we need ↓ FP
ii. PPV can be improved by
1. screening only high-risk populations
2. using a lower frequency of screening (which ↑ pre-clinical prevalence)
a. repeated screening will catch everyone
i. ↓ pre-clinical prevalence
1. No one else left to catch
ii. PVP will ↓ in a successful screening program
1. Fewer people left to catch

3. Yield

a. Defined as, the amount of previously unrecognized disease that is diagnosed and brought to treatment

as a result of screening

b. Affected by

i. the sensitivity of the screening test

1. ↓ Se = a smaller fraction of diseased individuals is detected at any screening
ii. Pre-clinical disease prevalence in the population
1. ↑ pre-clinical prevalence = ↑ yield
a. Aiming screening programs at high-risk populations will ↑ efficiency

1. Methods

a. Experimental

i. Conduct a RCT of the screening modality

1. compare the disease specific cumulative mortality rate

a. groups randomized to screening

b. control

2. allows one to study effects of early treatment

3. estimate distribution of lead times

4. identify prognostic factors

ii. randomized design critical

1. eliminating confounding (known & unknown)

2. allowing a valid comparison between groups


a. unaffected by time bias

iii.Problems

1. Expensive, time, ethical concerns

b. Non-experimental

i. Cohort comparison of advanced illness or death rate b/w people who chose to be screened and

that do not

ii. CCS comparison of screening history b/w people w/ advanced disease/death and those unaffected

(healthy)

iii. Ecological correlation of screening patterns and disease experience of several populations

c. Problems with non-experimental

i. Confounding due to health awareness

1. Those that choose to get screened are more health conscious and have lower mortality

ii. Poor quality, often retrospective data

iii. Difficult to distinguish screening from diagnostic examinations

2. Measures of effect

a. Comparison of survival experience/duration

i. the efficacy of a screening program cannot be assessed by comparing the duration of survival of

screen detected cases versus cases diagnosed clinically

1. Although common, they over-estimate the effect of screening

a. Selection bias patients who chose to get screened are more health

conscious, better educated, and have an inherently better prognosis

i. may also occur when subjects decide to get screened b/c they have

symptoms

b. Lead time screen-detected cases will survive longer even without benefit of

early treatment

i. Simply b/c they were detected earlier!

ii. Survival is increased due to lead time

c. Length-based sampling screen detected cases represent a sample of

cases prevalent in the asymptomatic pre-clinical phase

i. It is not simply a sample of all cases in a population

ii. screening preferentially identifies slow growing, indolent cases that

have a long pre-clinical phase

1. Slow growing tumors will obviously have a better prognosis

as they have a long pre-clinical phase and a long clinical

phase

b. Disease-specific mortality rate (DSMR)

i. The only truly valid measure of the efficacy of a screening program is to conduct a randomized screening trial where the DSMRs are compared b/w groups assigned screening or no screening

ii. Unlike survival time, the DSMR will not be changed by early diagnosis/lead time

1. DSMR accurately reflects benefit of screening

iii. One major problem with DSMR

1. Within the confines of a randomized screening trial, the specific cause of death is usually

assigned by an adjudication committee

a. Since they get all the information they need to properly figure out the cause of

death, they can pretty much figure out what screening group they were in

i. Study becomes un-blinded

ii. Tendencies in a breast cancer trial
1. Deaths in the mammography group → tend to be called not breast cancer related
2. Deaths in the control group → breast cancer tends to be over-diagnosed as the cause of death

2. Debate is now if the ideal measure should be all-cause mortality

a. Not subject to these biases

1. General Idea

a. A potential negative side-effect of screening is pseudo-disease or over-diagnosis

i. Identifying disease that wouldn't become clinically apparent if not for screening

b. Involves three forms

i. Over-diagnosis

1. Cases are detected that would have never progressed to a clinical state

a. Ex. Cancer cases w/limited malignant potential; PSA testing and low-grade

prostate cancer; mammography and ductal carcinoma in situ

b. Is an extreme form of length-based sampling

2. Pap testing

a. ↓ incidence of invasive cervical cancer
b. ↑ in overall incidence of cervical cancer b/c of over-diagnosis of carcinoma in situ

ii. Competing risks


1. Cases are identified that would have been interrupted by an unrelated death

a. Identification of prostate cancer in an 85 year old man who would have died of

stroke

iii. Serendipity

1. The identification of disease due to non-related diagnostic test

a. Chest X-ray for TB identifies lung cancer

1. Prevention paradox

a. Occurs when a majority of the patients come from a low-to-moderate-risk pool (low to moderate hypertension) while only a few come from a high-risk pool (extreme hypertension). Also seen with Down syndrome: a majority of Down syndrome babies are born to younger, lower-risk mothers rather than to older, higher-risk mothers.
i. Paradox = it is common (and seems logical) to equate high-risk populations with the majority of the cases, yet most cases actually arise from the much larger lower-risk pool

b. A preventative measure may provide a large benefit to the community at-large, but very little to the individual. It

explains how the absolute benefit provided by a preventive action to the individual can be small, yet, collectively

the benefit may be significant. Example, if everyone in a community always used a seatbelt, over the lifetime

one subject out of 400 would be saved from dying in a motor vehicle accident. The net benefit on an individual

level is small, but it is large when judged from the community level.

2. Other important Issues

a. Assessability

i. Program should be convenient, free of discomfort, efficient, and economical

b. Efficiency

i. ↓ PPV = wasteful program
1. Most of the test-positive individuals will not have the disease
ii. ↑ PPV = normally good

1. Can still be associated w/only a few cases detected and thus, only a small reduction in

overall mortality

iii. No reduction in mortality if

1. Mortality from the disease is normally low

2. Risk of death from other causes is high (in the aged)

c. Cost-effectiveness

i. Should these health dollars be spent on this program?

1. Most population-based screening programs cost roughly $30-50K per year of life saved

a. Subjects who would develop the condition that you are trying to prevent fall into one of the following three groups

i. A cure is necessary, but not possible (Nec, NotPos)

1. If target population is death from lung cancer, these subjects are going to die regardless

screening would not be beneficial

ii. Cure is possible but not necessary (Pos, NotNec)

1. If target subjects includes those who develop lung cancer but will not die of it (over-

diagnosis will die of something else before dying of lung cancer) screening would not

be beneficial

iii. Cure is necessary and maybe possible (Nec, Pos)

1. This target group is the only group that can benefit from screening. It represents the cases

of lung cancer that individuals would have died from if not for the screening program

b. It is helpful to consider the relative sizes of these three groups

i. A reasonable estimate can be made on knowledge of the natural history of disease, the potential of

the intervention to identify the condition early, potential treatment to impact the outcome, and the

potential to identify undiagnosed but benign disease

I. Introduction to the RCT

1. General Idea

a. Experimental study conducted on clinical patients

b. Investigator controls everything

i. The exposure type, amount, and duration

ii. Randomization who receives what

c. Most scientifically rigorous study

i. Groups are equivalent w/respect to baseline prognosis

1. The unpredictable random assignment eliminates/reduces confounding from known and

unknown prognostic factors.

ii. No biased measurements


1. Blinding ensures that outcomes are measured with the same degree of accuracy and

completeness in every patient

iii. Main potential biases are selection and measurement and they are small compared to cohort,

CCS, XS

d. Can confidently attribute cause and effect

i. As a result of the conditions, the presence or absence of treatment is the only thing that differed

between the two groups

1. thus, any difference in outcome can be attributed to the treatment

ii. Has a high internal validity

1. The experimental design ensures that strong cause and effects conclusions can be drawn

from the results

e. Gold standard to determine the efficacy of treatment

a. Prophylactic trials

i. Evaluate the efficacy of an intervention designed to prevent disease

1. Vaccine, vitamin supplement, patient education, screening

b. Treatment trials

i. Evaluate efficacy of a curative drug or intervention

1. Designed to manage/mitigate signs and symptoms of disease

3. Levels of RCTs

a. Individual level highly select group, tightly controlled conditions

b. Community level large groups, less rigidly controlled conditions

i. Test interventions for primary prevention purposes

1. Inclusion criteria

a. Done in order to optimize

i. The rate of the primary outcome

ii. The expected efficacy of the treatment

iii. The generalizability of the results

iv. The recruitment, follow-up, and compliance of patients

b. Goal is to identify a sub-population in whom the intervention is feasible and will produce the desired effect

i. Choice of inclusion criteria represents a balance between

1. Picking the people who are most likely to benefit

2. Not sacrificing the generalizability of the study

ii. If too restrictive → study population is so unique that the results can't be applied to other populations

2. Exclusion criteria

a. Valid reasons for excluding patients that would mess-up the study

i. The risk of treatment/placebo is unacceptable

ii. Treatment is unlikely to be effective for the respective patient

1. Disease is too severe, too mild, or treatment has already failed in the patient

iii. Co-morbidities interfere w/ intervention, measurement of outcome, or expected length of follow-up (terminal

cancer)

iv. Compliance patient unlikely to complete follow-up or adhere to protocol

v. Other practical reasons

1. Language, cognitive barriers, no phone

b. Goal is still to identify a sub-population in whom the intervention is feasible and will produce the desired

effect

i. Avoid excessive exclusions

1. Will add to complexity of screening process (every patient needs to be assessed, so

↑ exclusions = ↑ time) and ultimately decrease recruitment

a. ↑ exclusions = ↑ complexity = ↓ recruitment

ii. Trade off between

1. Patients more likely to make the study a success

2. Sacrificing generalizability

a. If too restrictive, wont apply to real world

b. ↑ internal validity; ↓ external validity

3. Baseline Measurements

a. Necessary to collect sufficient (but not excessive) demographic info to illustrate that the randomization process

resulted in identical groups

b. Baseline info to be collected

i. Demographics of participants

1. Important to demonstrate that randomization process worked

ii. Contact & identifying info from patient and contact info from friends, family, etc

1. Important to track subjects during the study → prevent loss-to-follow-up


iii. Major clinical and prognostic factors for the primary outcome that can be evaluated in pre-specified

subgroup analyses

1. If we thought the treatment effect would depend on gender, we would collect info on gender

a. Randomization process should be reproducible, unpredictable, & tamper proof

i. The process of generating the randomization scheme & steps taken to ensure concealment should be

described in detail

ii. The scheme itself should be unpredictable (can't predict what group the next person is going to be

in)

1. Unpredictability is assured through concealment

a. Concealment critical in preventing selection bias (the potential for

investigators to manipulate who gets what treatment)

b. Results in balance of known and unknown confounders

c. Randomization and concealment are separate from blinding

i. Concealment ≠ Blinding (concealment operates before the study has begun)

1. Concealment → designed to prevent selection bias

2. Randomization → prevents confounding bias

3. Blinding → reduces measurement bias

ii. They are mutually independent (you can have one w/o the other)

d. Randomization schemes (After study has begun)

i. Simple coin flip

ii. Blocked Randomization randomization done w/in blocks of 4-8 subjects

1. Ensures that there is an equal balance b/w groups

iii. Stratified Blocked Randomization strata are defined according to a critically important

factor/subgroup (gender, disease severity, or study center)

1. Ensures balance b/w groups and within each subgroup
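A minimal Python sketch of blocked randomization (the block size and arm labels are arbitrary assumptions for illustration):

# Hypothetical sketch of blocked randomization with two arms and a block size of 4
import random

def blocked_randomization(n_subjects, block_size=4, arms=("A", "B")):
    assignments = []
    per_arm = block_size // len(arms)
    while len(assignments) < n_subjects:
        block = list(arms) * per_arm     # e.g., ['A', 'B', 'A', 'B']
        random.shuffle(block)            # order within each block stays unpredictable
        assignments.extend(block)
    return assignments[:n_subjects]

print(blocked_randomization(12))         # after every complete block the arms are exactly balanced

For stratified blocked randomization, the same scheme would simply be run separately within each stratum (e.g., per study center).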

5. Intervention

a. Important to balance potential benefits vs. risks of intervention

i. Everyone is exposed to potential side effects of interventions

ii. Yet, not everyone will benefit from the intervention

1. Not everyone will develop the outcome; no intervention is ever 100% effective

iii. Thus, caution dictates using the lowest effective dose

b. RCTs designed on premise that serious side effects will occur much less frequently than the outcome

i. Thus, RCTs are very under-powered to detect side effects (rare events → ↓ power)

1. Phase IV post-marketing surveillance studies are around to check serious side effects once

drugs make it to market b/c many more people will ↑ the power, so rare, but serious, side

effects can be uncovered.

c. Control group measures the cumulative effects of all other

factors except for the treatment

i. Spontaneous improvements due to natural history

ii. Hawthorne effect subjects improve simply b/c they

are being studied

iii. Placebo Effect: It's in your head

6. Blinding (masking)

a. Cardinal feature of RCT

i. It preserves the benefits of randomization by

preventing biased assessment of outcomes

b. Blinding = prevents measurement bias

c. Helps reduce non-compliance, contamination, & cross-overs

i. Especially true in control group (they are unaware

that they are not getting the active treatment)

d. Types

i. Single blind either patient or physician is blinded

ii. Double blind both patient and physician are blinded

iii. Ideally patients, caregivers, collectors of outcome data, adjudicators of the outcome data, & the

data analyst

e. Placebo

i. Defined as, any agent or process that attempts to mask, or blind, the identity of the true active

treatment

ii. Common feature of drug trials

iii. Especially important when primary outcome measure is non-specific (soft)

1. soft = patients self reporting pain, nausea, depression, etc

2. Placebo effect the tendency for such soft outcomes to improve in study

participants (regardless of control vs. treatment)

a. The effect is regarded as the baseline against which to measure the effect of

active treatment

iv. Placebos are not justified when a known standard of care already exists


1. Must give patients the minimum standard of care

a. Ex. In stroke prevention trial w/ anti-platelet drugs, aspirin would be the minimum

standard of care that would be used as the control group

f. Many times blinding/placebo are not feasible (surgical interventions)

i. Difficult to mask who got surgery and who didn't

ii. Study referred to as an open trial

iii. Blinding may be hard to maintain

1. When treatment has clear and obvious benefit or side effect

a. Turns urine orange, etc

iv. In such cases, use hard outcomes to standardize treatment/data collection

1. Blinding not feasible → use hard outcomes (any-cause death)

a. General Idea

i. All patients should be accounted for throughout the trial

ii. All three of these are potential problems with RCTs

1. Will ↓ the # of patients, so this ↓ power

2. If occurs in a non-random fashion, will introduce bias

iii. Will eventually translate to the slow, prolonged death of the trial

1. The original RCT degenerates into an observational (cohort) study

a. The active/compliant participants in each arm are no longer under the control of

the study investigator.

iv. Trialists go to great lengths as they attempt to reduce these problems

1. Try to enroll patients who are more likely to be compliant and not LTFU

a. Use two screening visits (time wasters), pre-randomization run-in periods for

drug trials (to weed out early non-adherers)

2. Once patients enrolled, imperative to minimize LTFU by tracking and following-up on

patients

a. VERY HARD TO DO!

b. Loss to follow-up (LTFUs)

i. Normally related to outcomes of interest

1. LTFU more likely if side effects, patient moved away, got worse, got better, or simply lost

interest

2. Death also is a cause for LTFU

3. ↑ LTFU = ↓ sample size = ↓ power of the study

ii. If final outcome of subjects LTFU is unknown, that patient cannot be included in the final

analysis

1. Can have significant negative effect on the study's conclusions

a. ↓ power (smaller sample size)

b. Biased results (differential LTFU, not equal b/w groups)

iii. Negative effects of LTFU CANNOT be easily corrected by the intention-to-treat analysis

1. w/o knowledge of final outcome status, these patients must be dropped from the analysis.

iv. Major problem with trying to do an intention-to-treat-analysis (see below)

c. Poor Compliance

i. Can be related to outcomes of interest

1. Presence of side-effects, iatrogenic drug reactions, patient got better/worse or simply lost

interest

ii. Non-compliance = expected to have worse outcome (than those who comply)

1. Regardless of what treatment group they were assigned to

iii. Important to assess the degree of non-compliance and the degree to which it differentially affects one

arm of the study vs. another

d. Contamination

i. Defined as, the situation when subjects cross-over from one arm of the study into the other; thereby

contaminating the initial randomization process

ii. Ex. Early AIDS RCTs; patients got treatments assayed by private labs to see if they were getting

the placebo or active drug (AZT). If they were getting the placebo, they would buy AZT on the street

e. Solutions

i. Intention-to-treat-analysis (ITT)

1. Defined as, the principle that ALL participants are analyzed according to their

original randomization group or arm regardless of protocol violations

a. All subjects should be included in both numerator and denominator of group

event rates

2. Gold standard for RCT

3. LTFUs make ITT very problematic

a. If subject included in denominator but not numerator

i. The event rate in that group will be underestimated

b. If subjects simply dropped from the study (which is common)

i. Analysis can be biased

1. de facto per protocol (PP) analysis if dropped b/c

outcome status is known


c. In order to mitigate problems

i. Trials will impute/extrapolate an outcome based on a missing-data protocol

1. Using the last or worst observation

2. Attempting to predict the unobserved outcome based on the

characteristics of the subject

ii. Results should always be viewed with caution

ii. 5 & 20 rule

1. Technique to assess the likely impact of poor compliance & LTFU; The percentages are

those of the study participants affected by LTFU or non-compliance

a. If affects <5% = bias is minimal

b. If it affects >20% = bias is likely to be considerable

iii. best case/worst case sensitivity analysis

1. Assess potential impact of LTFU

2. Best case

a. LTFU subjects assumed to have best outcome (no adverse outcomes)

b. Event rates calculated counting all LTFU in denominator but not in the numerator

3. Worst case

a. LTFU subjects assumed to have worst outcome

b. Event rates calculated counting all LTFU in both numerator and denominator

4. Overall potential impact is then gauged by comparing the actual results with the range of

findings from the best case/worst case sensitivity analysis

a. A wide range of estimates implies the study's findings are questionable (worked sketch below)
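A minimal Python sketch of the best case/worst case calculation for a single study arm (all counts are hypothetical):

# Hypothetical sketch of a best-case/worst-case sensitivity analysis for one arm
events, completed, ltfu = 20, 180, 20                  # 200 randomized, 20 lost to follow-up
observed_rate = events / completed                     # ignores the LTFU subjects entirely
best_case  = events / (completed + ltfu)               # LTFU counted in denominator only (no events)
worst_case = (events + ltfu) / (completed + ltfu)      # LTFU counted as events in numerator & denominator
print(round(observed_rate, 3), round(best_case, 3), round(worst_case, 3))
# A wide spread between best and worst case suggests the study's findings are fragile.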

a. Primary (1°) & secondary (2°) study outcomes (w/ associated definitions) should be defined before the study is started

i. Termed a priori or pre-specified comparisons

b. Best outcomes are hard, clinically relevant end points (disease rates, death, recovery, complications,

hospital/ER use)

i. Need to be measured w/ accuracy, precision, and measured in the same manner in both groups

ii. Outcomes need to be clinically relevant to the patients themselves

1. Death, recovery, complications = patient-important or patient-relevant outcomes

c. Surrogate end points

i. Hard outcomes cannot always be used

1. It takes too long to measure disease mortality

2. Thus, these end points are used to reduce the length and/or size of the intended study

→ should be based on validated, biologically relevant endpoints

ii. Need to ensure whether a surrogate end point used is an adequate measure of the real outcome of

interest

iii. Ideally, a prior RCT should prove that the end point is a valid surrogate measure for the real

outcome of interest

1. Ex. Study designed to reduce stroke incidence

a. Degree of BP reduction is considered a valid surrogate end point b/c of the

known causal relationship b/w BP & stroke risk

d. Pre vs. post-hoc sub-group analyses

i. Sub-group analyses

1. Defined as, the examination of the primary outcome among study sub-groups

defined by key prognostic variables such as age, gender, race, disease severity, etc

2. Identifies whether treatment has different effect w/in specific sub-populations

a. Differences in efficacy of treatment b/w sub-groups may be described in terms of

a treatment-by-sub-group interaction

3. All sub-group analyses need to be PRE-SPECIFIED ahead of time

a. Natural tendency among authors of trials that didn't show a positive result to

go fishing for any positive results w/in the subgroup comparisons

b. Leads to multiple comparisons → ↑ potential for FPs

i. Thus, post-hoc (non pre-specified comparisons) should be

regarded as exploratory findings that should be re-examined in

future RCTs as pre-planned comparisons

9. Statistical Analyses

a. Should be straightforward as the design should have created balance b/w all the factors except for the

intervention

i. Simple matter of comparing 1° (primary) outcomes b/w groups

1. Continuous data → t-test

2. Categorical outcomes → chi-square

3. Small data or non-Gaussian data → non-parametric methods

4. Survival-type studies → Kaplan-Meier survival curves or Cox regression modeling (will

measure the fraction of patients living for a certain amount of time after treatment)
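A minimal Python sketch of these comparisons, assuming scipy is available (all data are made up):

# Hypothetical sketch of the simple RCT comparisons listed above
from scipy import stats

treated = [5.1, 4.8, 6.0, 5.5, 4.9]                    # continuous outcome in the treatment arm
control = [3.9, 4.2, 3.5, 4.0, 4.4]                    # continuous outcome in the control arm
t_stat, p_t = stats.ttest_ind(treated, control)        # two-sample t-test

table = [[30, 70],                                     # treatment arm: events vs. no events
         [45, 55]]                                     # control arm:   events vs. no events
chi2, p_chi, dof, expected = stats.chi2_contingency(table)   # chi-square on the 2x2 table

u_stat, p_u = stats.mannwhitneyu(treated, control)     # non-parametric alternative for small/non-Gaussian data

print(p_t, p_chi, p_u)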

b. Intention-to-Treat Analysis (ITT)

i. Most important concept for RCT


ii. Compares outcomes based on the original treatment arm that each individual participant was

randomized to regardless of protocol violations

1. Violations include ineligibility, non-compliance, contamination, or LTFU

iii. Results are the most valid, but conservative estimate of the true treatment effect

1. Approach is the truest to the principles of randomization (which sticks to the perfectly

comparable groups at study outset)

2. However, ITT cannot fix the problem of LTFU unless the missing outcomes are imputed

using a valid method (which can never be fully verified)

a. Thus, no amount of statistical analysis can fix the problem that the final outcome is unknown for a sub-set of subjects

c. Per Protocol (PP)

i. Fundamentally Flawed

ii. Persons dropped

1. Those in treatment arm who did not comply

2. Control subjects who got treated (cross-overs)

iii. Analyzed

1. Only those who complied w/ the original randomization

iv. Answers the question as to whether the treatment works among those who complied

1. It can never provide an unbiased assessment of the true treatment effect

a. The decision to comply w/ treatment is unlikely to occur at random

2. Basically the same thing when, during an ITT analysis, subjects are dropped b/c of

unknown outcome

v. Aka. Efficacy, exploratory, or effectiveness analyses

d. As Treated (AT)

i. Fundamentally Flawed

ii. Analyzed

1. Everyone assuming subjects got the treatment or did not (regardless of which group they

were originally assigned to)

iii. Basically the same as analyzing a trial as if a cohort study had been done → completely destroys

any of the advantages afforded by randomization

1. everyone decided themselves whether to get treated or not

iv. Published studies use AT when studies do not show positive ITT analysis

1. Have to ask, what was the point of doing the trial in the first place if you ended up doing an

AT analysis → the AT approach is w/o merit

e. Example

i. Note the very high death rate in the 26 subjects that were slated for surgery but received medical

treatment

1. At baseline, were probably a very sick group of patients who died before surgery or were

too sick to undergo it

ii. Note the 50 subjects who should have gotten medical treatment but got surgery instead and their

much lower death rate

1. At baseline, these men were probably healthier so impossible to judge the relative merits of

surgery based on this info

iii. Analysis

1. ITT → surgery has a small, significant benefit

2. PP or AT → surgery has a larger and statistically significant benefit

a. This estimate is biased!

i. The 26 high risk subjects were either dropped from the surgery

group (PP) or moved to the medical group (AT which makes

medical look much worse)
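A minimal Python sketch of how the three analyses diverge; the counts below are hypothetical and only mimic the pattern described above (sick crossovers out of the surgery arm, healthier crossovers out of the medical arm):

# Hypothetical sketch comparing ITT, PP, and AT analyses of the same trial
surgery_compliant  = dict(n=374, deaths=30)
surgery_crossovers = dict(n=26,  deaths=13)    # slated for surgery, received medical care (high risk)
medical_compliant  = dict(n=350, deaths=42)
medical_crossovers = dict(n=50,  deaths=3)     # slated for medical care, received surgery (low risk)

def death_rate(*groups):
    return sum(g["deaths"] for g in groups) / sum(g["n"] for g in groups)

itt = (death_rate(surgery_compliant, surgery_crossovers),           # analyzed as randomized
       death_rate(medical_compliant, medical_crossovers))
pp  = (death_rate(surgery_compliant), death_rate(medical_compliant))  # crossovers dropped
at  = (death_rate(surgery_compliant, medical_crossovers),           # analyzed as treated
       death_rate(medical_compliant, surgery_crossovers))

print(itt, pp, at)   # PP and AT make surgery look progressively better than the unbiased ITT comparison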


10. Meta-analyses

a. assessing trial quality, trial reporting, and trial registration; improve effect size estimates (narrow the CI)

b. Meta-analyses are fast becoming the undisputed king of the evidence-based tree

c. Three important implications for RCTs

i. Assessment of study quality

1. b/c of the variability in the quality of published RCTs, meta-analysts will attempt to assess

their quality to determine whether the quality of a trial has an impact on the overall results

2. all approaches focus on

a. a description of randomization process

b. the use of concealment & blinding

c. a description of the LTFU and non-compliance rates

ii. Trial reporting

1. Reports on quality assessment of trials (using Jadad scale or similar tool)

2. If a trial is of marginal or poor quality

a. Probably did not report info on key quality criteria

i. Randomization, concealment, blinding, LTFU

b. Not sure if author simply failed to mention or if they simply did not follow these

steps

3. Lead to development of specific guidelines for the reporting of clinical trials

a. CONSORT Statement aims to make sure trials are reported in a consistent

fashion and that specific descriptions are included so the validity can be

independently assessed

iii. Trial registration

1. Big problem for meta-analyses is potential for publication bias

2. Results from meta-analyses can be seriously biased if there is a tendency to not

publish trials w/ negative or null results

a. Thus, when we collect data, we are collecting relatively much more positive data

than what is truly representative

i. The negative data is hidden

b. Unpublished negative trials

i. Either small (↓ power)

ii. Large, drug company sponsored trials

1. Company doesn't want to release info

3. International Committee of Medical Journal Editors

a. Requires that for a trial to be published in any of these journals, it must

have been registered prior to starting it

i. Thus, the scientific community will then have a registry of all trials undertaken on

a respective subject

a. As many conditions have a standard treatment that makes use of a placebo trial ethically unacceptable, new

drugs need to be compared to this active control/standard treatment

i. However, it is increasingly difficult to prove that a new drug is better than an existing drug

b. Alternative approach = prove that the new drug is NO WORSE than the active control (w/in a given

tolerance or equivalence margin)

i. ↑ emphasis at the federal level on the conduct of comparative effectiveness trials

1. Trials done directly to compare alternative treatments

c. Equivalence Trials

i. Most often used to prove a new drug is equivalent to an existing

standard drug w/in a given tolerance or equivalence margin

1. Most often used in generic drug development

a. Prove similar bioavailability, pharmacology

d. Non-inferiority Trials

i. Designed to prove that a new drug is no less effective than an existing

standard drug

1. One sided equivalence test

2. Interest in non-inferiority trials assumes the new drug has other advantages


a. Better safety profile (↓ side effects, ↓ monitoring)

b. Easier dosing schedule

i. ↑ compliance

c. Cheaper

3. May involve the evaluation of the same drug given using a different strategy, dose, or

duration

ii. Methodological challenges

1. Null hypothesis is opposite that of typical superiority trial

a. Superiority trail

i. Ho: new drug = active control

ii. Ha: New drug ≠ active control

b. Non-inferiority trail

i. Ho: New drug + equivalence margin < active control

1. the active control is substantially better than the new drug

2. Rejecting Ho → the new drug is not inferior to the active

control within the bounds of the equivalence margin

ii. Ha: New drug + equivalence margin ≥ active control

2. Equivalence margin = how much we are willing to accept that the new drug can have worse

efficacy

a. Set by clinically deciding how big a difference there would have to be between

the two drugs before we would decide that the drugs are clinically not equivalent

b. It is the critical determinant of the success of the trial & its sample size

c. Smaller margins = more conservative

d. Larger margins = more liberal
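A minimal Python sketch of how an equivalence margin is used in practice (the margin, sample sizes, and success counts are hypothetical):

# Hypothetical sketch: judge non-inferiority from the CI of the risk difference
import math

margin = -0.05                          # new drug may be at most 5 percentage points worse
new_success, new_n = 790, 1000          # 79% success on the new drug
ctl_success, ctl_n = 780, 1000          # 78% success on the active control

p_new, p_ctl = new_success / new_n, ctl_success / ctl_n
diff = p_new - p_ctl
se = math.sqrt(p_new * (1 - p_new) / new_n + p_ctl * (1 - p_ctl) / ctl_n)
lower_95 = diff - 1.96 * se             # lower bound of the 95% CI for the difference

# Non-inferiority is declared only if even the lower bound stays above the margin
print(round(diff, 3), round(lower_95, 3), lower_95 > margin)   # here: -0.026 > -0.05 -> non-inferior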

iii. Other problems/limitations of non-inferiority trials

1. Assay sensitivity

a. A poorly conducted trial may falsely show that the 2 drugs are equivalent

i. Poor trial conduct (compliance, follow-up, blinding, etc) will favor the

non-inferiority

2. Blinding

a. Vital step to reduce measurement bias in superiority trials

b. It cannot protect against the investigators giving the same outcomes/ratings to all

subjects

i. Thereby showing non-inferiority

3. ITT analysis

a. ITT is the gold standard in superiority trials

b. ITT in non-inferiority trials tends to bias towards finding non-inferiority

i. Including non-compliant subjects in treatment/control tends to minimize the

differences b/w groups

1. Thus, this can show an inferior drug to be non-inferior

c. PP analysis can introduce bias in either direction

i. Not recommended as it can compound the problem

d. Best bet = do both ITT & PP and hope findings are consistent

i. Even so, accepting Ha doesn't rule out the possibility of bias

1. Advantages

a. ↑ internal validity

b. Control of exposure (amount, timing, frequency, duration)

c. Randomization

i. Ensures balance of factors that could influence outcome

1. controls the effect of known and unknown confounders

d. A true measure of efficacy

2. Disadvantages

a. Limited external validity

b. Artificial environment

i. Strict eligibility criteria and conducted in specialized tertiary care medical centers

1. Greatly limits generalizability

c. Difficult/complex to conduct

i. Takes time and is expensive

d. Limited scope due to ethical concerns

i. Mostly therapeutic/preventive only

I. General Info

1. Overview


a. Observational study = investigator has no control over exposure

b. Descriptive

i. Case reports & case series (Clinical)

1. Profile of a clinical case or case series which should

a. Illustrate a new finding

b. Emphasize a clinical principle

c. Generate a new hypothesis

2. It is not a measure of disease occurrence

3. As there is no control or comparison group, we usually cannot identify risk factors or

the cause

a. Exception 12 cases w/ salmonella infection, 10 had eaten cantaloupe

ii. Cross-sectional (Epidemiological) → a prevalence survey, i.e., collecting data at a single point in time

c. Analytical

i. Cohort

ii. Case-control

iii. Ecological

1. we don't know individual exposures, but the people who are affected are related, so we study the

relationship at the group level (e.g., workers and asbestos)

2. Risk Factor

d. Heard daily with cholesterol (heart disease), HPV (cervical cancer), cell phones (brain cancer), TV watching

(childhood obesity), etc

i. However, association does not mean causation

ii. Ex. almost complete overlap in the serum cholesterol distributions of men who did and did not

develop coronary heart disease

1. Even though cholesterol is a proven risk factor

2. If you are just given one cholesterol value for an individual, it is difficult to predict the

outcome b/c of the near-complete overlap

e. How are risk factors used

i. Identifying individuals/groups at risk

1. Even though the ability to predict future disease in individual patients is very limited (even for well-established risk factors like cholesterol/CHD), it still helps identify populations

ii. Causation causative agent vs. a marker

iii. Establish pretest probability (Bayes theorem)

iv. Risk stratification

1. Helps to identify target populations (age >40 for mammography screening)

v. Prevention

1. Remove causative agent → prevent disease

a. The relationship between a risk factor and disease can be due to

i. The risk factor being a cause of disease = causative agent

ii. The risk factor is NOT a cause but merely associated w/ the

disease = a marker

b. Need to guard against thinking, A causes B when really, B causes A

i. Called reverse causation

4. Prevention

c. Removing a true cause = ↓ disease incidence

i. Decrease aspirin use = ↓ Reye's Syndrome

ii. Discourage prone position while infants are sleeping

1. Back to Sleep = ↓ SIDS

5. Observational Studies

d. XS, Cohort, and CCS are all analytical observational studies

II. Cross-Sectional Study (XS)

1. General Idea

a. Exposure & Outcome at the same time

b. Also called prevalence study

i. Prevalence measured by conducting a survey of the population

of interest

c. Mainstay of descriptive epidemiology

i. Patterns of occurrence by time, place, and person

ii. Estimate disease frequency (prevalence) and time trends


iii. when trying to get a handle on an idea, i.e., trying to get clues on the origin of disease by looking at

subgroups

d. Useful for

i. Program planning

ii. Resource allocation

iii. Generate hypotheses

2. How

a. Select sample of individual subjects and report disease prevalence (%)

b. Can also simultaneously classify subjects according to exposure and disease status to draw inferences

i. Describe association using the Odds Ratio (OR)

3. Examples

a. Prevalence of asthma in school-aged children in MI

b. Trends and changing epidemiology of hepatitis in Italy

c. Characteristics of teenage smokers in MI

d. Prevalence of stroke in Olmstead County, MN

(Note: a cross-sectional study does not measure incidence → no RR or AR; only use the Odds Ratio)

a. Advantages

i. Quick

ii. Inexpensive

iii. Useful

b. Disadvantages

i. Uncertain temporal relationship

ii. Survivor effect

iii. Low prevalence due to

1. Rare disease

2. Short duration of disease

III. Cohort Study (Incidence Study): finds incidence through nature

Good for rare exposures; not good for rare outcomes

Has very high potential for confounding bias (selection and measurement bias also occur)

1. General Idea

a. Exposure → Outcome

b. Is a group w/ something in common (an exposure)

c. Start with disease-free at-risk population

i. They are susceptible to the disease of interest

ii. Have exposed and non-exposed (comparison) groups

d. Determine eligibility and exposure status

e. Follow-up and count incident status

f. Very similar to RCT; however, in cohort studies exposures are chosen by

nature rather than by randomization

a. Population based (one sample)

i. Select entire population (N) or a known fraction of the population (n)

ii. p(exposed) in population can be determined

iii. Exp (+) = IDR exposed; Exp (-) = IDR unexposed

b. Multi Sample

i. Select subgroups with known exposure

1. Ex. Smokers vs. non-smokers; coal miners vs. uranium

miners

ii. p(exposed) in population cannot be determined

iii. Multi-sample cohort studies are done in occupational studies

1. Fireman and cancer risk?

2. Exp (+) = firemen; Exp (-) = non-firemen (police)

3. Relative Risk

a. The standard measure of association in cohort studies

b. Describes the magnitude and direction of the association

c. Incidence can be measured as IDR or CIR

d. Interpretation

i. RR = 1.0 → null

ii. RR = 2.0 → risk is twice as high in exposed vs. non-exposed

iii. RR = 0.5 → risk in exposed is half that in non-exposed

(RR magnitude scale: the further from 1.0 in either direction, the bigger the association; roughly 0.5 or 2 = moderate, 0.2 or 5 = big, values near 1 = small)
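A minimal worked example of the RR calculation (counts are hypothetical):

# Hypothetical sketch: RR and AR from cumulative incidence in a two-sample cohort
exposed_cases,   exposed_n   = 40, 1000      # CIR = 0.040 among the exposed
unexposed_cases, unexposed_n = 20, 1000      # CIR = 0.020 among the unexposed

cir_exposed   = exposed_cases / exposed_n
cir_unexposed = unexposed_cases / unexposed_n
rr = cir_exposed / cir_unexposed             # 2.0 -> risk twice as high in the exposed
ar = cir_exposed - cir_unexposed             # 0.02 -> excess risk attributable to exposure
print(rr, ar)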

4. Sources of Cohorts


a. Geographically defined groups

i. Framingham, MA heart study

b. Special resource groups

i. Medical plans (Kaiser Permanente), Medical professionals (Physicians health study, Nurses Health

Study), Veterans, College Grads

c. Special Exposure Groups

i. Occupational exposures

1. Lead workers, uranium miners

a. If everyone was exposed, then you need an external (non-exposed) cohort for

comparison purposes

i. Lead workers compared to car assembly workers

5. Cohort Design Options

a. Variation in timing of exposure and disease measurement

b. Types

i. Prospective

ii. Historical Look back at the entire cohort (Exposure) and see who gets the outcome

iii. Retrospective

1. Go back in time to figure out exposure

2. Comparing exposure and non-exposure

3. Doesn't happen often, but a sweet way to do it.

4. Examples

a. Aware of cases of fibromyalgia in women within a large HMO. Go back and

determine who had silicone breast implants (past exposure). Compare incidence

of disease in exposed and non-exposed.

i. Go back and look

1. Who had fibromyalgia

2. Who had silicone breast implants

ii. Case control study would look like this

1. Ask women w/fibromyalgia if they had silicone implants

2. Determine a control group & then only compare the 2 groups

no population comparison

b. Framingham Study: used frozen blood bank to determine baseline levels of hs-

CRP and then measure incidence of CHD by risk groups (quartiles)

i. They measured the CRP levels in the blood from the 60s and then

figured out who currently had CHD

5. We know that the population is composed of cases and non-cases. In case control

studies we find the cases/controls and then track backward to determine exposure.

In retrospective cohort studies, we start by figuring out exposure retroactively and

then track them forward to figure out if they developed into cases

6. Need complete population data in order to do this

a. need to classify everyone

6. Population Attributable Risk (PAR & PARF)

a. General idea

i. Important question for public health

1. How much can we lower disease incidence if we intervene to remove this risk factor?

2. We want to know how much disease did an exposure cause in a respective population

ii. PAR & PARF assume that the risk factor is causal it caused the disease

b. Population Attributable Risk

i. The incidence of disease in a population that is associated with a risk factor

ii. PAR = (attributable risk) × (P = prevalence of exposure)

iii. Equals the excess incidence of X in the population due to risk factor Y

c. Population Attributable Risk Fraction

i. The fraction of disease in a population that is attributed to a risk factor

ii. PARF = PAR/total incidence

1. = P (RR-1)/[1+P(RR-1)]

a. A risk factor w/ a small RR but a large prevalence can cause more disease in a

population than a risk factor with a big RR and a low prevalence

b. ↑ prevalence & ↓ RR trumps ↓ prevalence & ↑ RR

iii. Represents the maximum potential impact on disease incidence if risk factor was removed
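A minimal worked example of PAR and PARF using the formulas above (the prevalence, RR, and baseline incidence are hypothetical, and the risk factor is assumed to be causal):

# Hypothetical sketch of PAR and PARF
p  = 0.30                                    # prevalence of exposure in the population
rr = 2.0                                     # relative risk
incidence_unexposed = 0.010                  # baseline (unexposed) incidence

ar   = incidence_unexposed * (rr - 1)        # attributable risk among the exposed = 0.010
par  = ar * p                                # excess incidence in the population  = 0.003
parf = p * (rr - 1) / (1 + p * (rr - 1))     # fraction of all cases attributable ~ 0.231

total_incidence = incidence_unexposed + par  # 0.013
print(par, round(parf, 3), round(par / total_incidence, 3))   # both PARF routes agree (~23%)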

7. Bias

a. Selection Bias

i. Can occur at any time once the cohort is first assembled

1. Patients assembled for the study differ in many ways other than the exposure under

study and these factors may determine the outcome

a. Ex. Only the uranium miners at the highest risk for lung cancer (smokers, prior

family history) agree to participate.

ii. Can occur during the study

1. Differential LTFU in exposed vs. non-exposed


a. LTFU doesnt occur at random

iii. It's basically inevitable

b. Confounding Bias

i. As the exposure of interest is not assigned at random & other risk factors may be associated

w/ both the exposure and the disease, confounding bias can occur in these cohort studies

ii. Confounding bias → the big one for cohort studies

8. Advantages/Disadvantages

a. Advantages

i. Can measure disease incidence, can study the natural history of disease, provides strong

evidence of a causal association between E/D (b/c time order is known), provides info on time lag,

multiple diseases can be examined, good choice if exposure is rare (assemble special exposure

cohort), generally less susceptible to bias vs. CCS

b. Disadvantages

i. Takes time, large samples, is expensive, complicated to implement and conduct, not useful for

rare diseases/outcomes, problems of selection bias (assembling at start and LTFU during) &

prolonged time period compounds LTFU, and confounding

IV. Case Control Studies (CCS)

(Notes: Effect ← cause. Begin with the OUTCOME (case) and then GO BACK (→ recall bias) looking for the

ODDS OF EXPOSURE. Good for RARE OUTCOMES; use a cohort for rare exposures. Cannot calculate

incidence (no RR or AR). High bias for everything: recall, selection, confounding, & measurement.)

1. General Idea

a. An alternative observational design to identify risk factors for a disease/outcome

i. Two samples are selected

1. Patients who had developed the disease in question

2. Otherwise similar people who did not develop the disease in question

ii. Find a case (45-year-old female) with a control (45-year-old female)

1. Distribution of age and gender are the same b/w the groups → they can no longer

confound

iii. They already have the outcome

b. Question: how do diseased cases differ from non-diseased (controls) w/ respect to prior exposure

history?

c. In the population we find those that are diseased, and then match controls that are not diseased. In

other words, we figure out cases and controls first and then figure out if exposures occurred.

i. We know that the population is composed of cases and non-cases. In case control studies we find

the cases/controls and then track backward to determine exposure. In retrospective cohort

studies, we start by figuring out exposure retroactively and then track them forward to figure

out if they developed into cases

d. Compare the frequency of exposure among cases and control

i. Effect ← cause

e. Cannot calculate disease incidence rates b/c the CCS does not

follow a disease free population over time

f. For CCS, all the cases had the outcome

g. Basically, we identify cases and then look backward to find

causes of disease (& non-disease)

i. Look for common exposure

ii. Still set up a control group & then look back at that group as

well

2. Nested CCS

a. Study of MHG in infants

i. Not only did they look forward to see how the infants were

affected, they set up a control group in both those w/MHG &

those w/o MHG & looked back for exposures

3. Examples of CCS

a. Outbreak investigations ( what is causing young women to die of

toxic shock)

b. Birth defects (drug exposures and heart teratology)

c. New (unrecognized) disease (DES and vaginal cancer in adolescents)

a. Directionality outcome to exposure

b. Timing retrospective for exposure, but case ascertainment can be either retrospective or prospective

c. Rare/new disease design of choice if disease is rare or if a quick answer is needed

i. Cohort design is not useful

d. Challenging the most difficult type of study to design and execute

e. Design options


i. Selection of cases

1. Requires case definition

a. Need for standard diagnostic criteria, consider severity of disease, and consider

duration of disease (prevalent or incident case?)

2. Requires eligibility criteria

a. Area of residence, age, gender, etc

ii. Sources of cases

1. Population based

a. Identify and enroll all incident cases from a defined population

i. Ex. Disease registry, defined geographic area, vital records

2. Hospital based

i. Popular in the USA b/c we don't have good national/regional databases

b. Identify cases where you can find them

i. Hospitals, clinics

ii. Issues of representativeness, prevalent or incident cases?

iii. Selection of controls

1. Controls reveal the normal/expected level of exposure in the population that gave

rise to the cases

2. Should have the same eligibility criteria as the cases

3. Issue

a. Control comparability to cases concept of the study base

i. Controls should be from same underlying population

ii. Need to determine: if the control had developed the disease, would s/he have been included as a

case in the study?

1. If no, then don't include as a control

iv. Sources of controls

1. Population based

a. Ideal as it represents the exposure distribution in the general population

b. However, if there is a low participation rate → response bias is likely (selection

bias)

2. Hospital based

a. Used when population based controls are not feasible

b. Much more susceptible to bias

c. Advantages

i. Similar to cases? (it is a hospital after all..), more likely to participate,

and efficient (they're already in the hospital)

d. Disadvantages

i. Are they representative of the study base?

ii. They already have some kind of disease/co-morbidity

1. Don't select if a risk factor for their disease is similar to the

disease under study (COPD & lung cancer)

3. Other sources

a. Relatives, neighbors, friends (of cases)

i. Advantages

1. Similar to cases and more willing to cooperate

ii. Disadvantages

1. More time consuming, may not be willing to give info, may

have similar risk factors

5. Analysis of CCS

a. The only valid measure of association for the CCS is the Odds Ratio (OR)

b. Under reasonable assumptions (the rare disease assumption), the OR approximates the RR

c. Odds Ratio

i. Odds of exposure among cases = a/c

ii. Odds of exposure among controls = b/d

iii. Similar interpretation as RR

iv. Provides the same information as RR if

1. Controls represent the target population

2. Cases represent all cases

3. Rare disease assumption holds

a. Or if the CCS is undertaken w/ population-based sampling

v. OR can be calculated for any design

1. OR is the only valid measure for the CCS

2. RR can only be calculated for RCT & cohort studies

3. Publications will occasionally mislabel OR & RR
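A minimal Python sketch of the 2x2 case-control calculation (cell counts are hypothetical):

# Hypothetical sketch of the odds ratio from a case-control 2x2 table
a, b = 60, 30     # exposed:   a = cases, b = controls
c, d = 40, 70     # unexposed: c = cases, d = controls

odds_exposure_cases    = a / c            # odds of exposure among cases
odds_exposure_controls = b / d            # odds of exposure among controls
odds_ratio = odds_exposure_cases / odds_exposure_controls    # = (a*d)/(b*c) = 3.5
print(odds_ratio)   # cases had 3.5x the odds of prior exposure compared with controls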

6. Confounding

a. Exposure of interest may be confounded by a factor that is associated with the exposure and the

disease; i.e., it is an independent risk factor for the disease

b. Can be controlled


i. At the design phase

1. Randomization, restriction, matching

ii. At the analysis phase

1. Age-adjustment, stratification, multivariable adjustment

c. Matching

i. Used to control an extraneous variable by matching controls to cases on a factor you know is

an important risk factor or marker for disease

1. Age (w/in 5 years), sex, neighborhood

ii. If the factor is fixed to be the same in both the cases and controls, then it can't confound

iii. Analysis of matched CCS needs to account for the matched case-control pairs

1. Only pairs that are discordant with respect to exposure provide useful information

2. McNemars OR = b/c

a. Case (+/-) vs. Control (+/-) and then match in a 2X2 table

i. Each box entered as a pair of one case and one control

ii. Concordant pair = both smokers or both non-smokers

iii. Discordant cells = contribute to Odds Ratio

1. Case is a smoker, control is not

2. Case is a non-smoker, control is a smoker

b. the only pairs that give any information are the discordant pairs (see the sketch at the end of this matching section)

iv. Can ↑ power by matching more than one control per case

1. 4 controls to 1 case = ↑ power

2. Useful if few cases are available

v. Over-matching

1. Matching can result in controls being so similar to cases that all exposures are the same

a. Ex. 8 cases of GRID (LA county 1981) in which all cases were gay men so they

were matched using a 4:1 matching ratio to other gay men who did not have

signs of GRID (32 controls)

i. No differences found in sexual or other lifestyle habits
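A minimal Python sketch of the matched-pair (McNemar) odds ratio described in the matching section above (pair counts are hypothetical):

# Hypothetical sketch of McNemar's OR for a matched case-control analysis
both_exposed      = 25    # concordant pairs: contribute no information
case_only_exposed = 40    # discordant pairs, 'b': case exposed, matched control not
ctrl_only_exposed = 16    # discordant pairs, 'c': control exposed, matched case not
neither_exposed   = 19    # concordant pairs: contribute no information

mcnemar_or = case_only_exposed / ctrl_only_exposed    # OR = b/c = 2.5
print(mcnemar_or)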

d. Recall Bias

i. Presence of disease may affect ability to recall or report the exposure

1. Is a form of measurement bias

2. Ex. Exposure to OTC drugs during pregnancy use by moms of normal and congenitally

abnormal babies

a. It's pretty hard to remember if/when you may have taken a Tylenol

ii. To lessen potential

1. Blind participants and study personnel to study hypothesis

2. Use explicit definitions for exposure

3. Use controls w/ an unrelated but similar disease

a. Ex. heart teratology (cases), hypospadias (controls)

e. Reverse Causation

i. The disease or sub-clinical manifestations of it results in a change in behavior (exposure)

ii. Ex. Obese children found to be less physically active than non-obese children

a. Advantages

i. Quick & cheap (relatively)

1. So ideal for outbreaks

ii. Can study rare or new diseases

iii. Can evaluate multiple exposures

b. Disadvantages

i. Uncertain of the E → D relationship

1. Especially uncertain of timing

ii. Cannot estimate disease rates

iii. Worry about representativeness of controls

iv. Inefficient if exposures are rare

v. Bias: selection, confounding, measurement (recall)

Measures available by study design:

Measure      RCT   Cohort   CCS   XS
Odds Ratio   Yes   Yes      Yes   Yes
RR           Yes   Yes      No    No
AR (RD)      Yes   Yes      No    No
PARF         No    Yes      No    No

Bias susceptibility by study design (RCT / Cohort / CCS / XS): selection, confounding, measurement (recall)

Cohort (incidence) | Case Control | Cross-Sectional (prevalence)

Begins with a defined population at risk | Population at risk may be undefined | Begins with a defined population

Cases not selected but ascertained by surveillance | Cases selected by investigator from an available pool of patients | Cases not selected but ascertained by a single examination of the population

Controls (the comparison group) are not selected, they evolve naturally | Controls selected by investigator to resemble cases | Noncases include those free of disease at the single examination

Exposure measured before the development of disease | Exposure measured, reconstructed, or recollected after development of disease | Exposure measured at the same time as disease

Risk, incidence of disease, & RR measured directly | Risk & incidence cannot be measured directly | Risk & incidence cannot be measured directly

- | RR can be estimated by the odds ratio | RR can be estimated by the odds ratio


EPI 547

I. Random Stuff

1. Bias

a. It is a deviation from the truth (Grimes, Bias and Causal Associations, Lancet 2002)

b. Systematic Error (Bias) - Error in study design which may skew the results leading to a deviation from the

truth.

i. This is when all measurements are consistently all high, or low.

1. Ex: A spectroscopy machine consistently gives high readings because it wasn't calibrated.

c. Three broad classes of Bias

i. Confounding

1. Factor that distorts the true relationship of the study variable of interest by being related to both

a. The outcome of interest

b. The study variable

c. Confounding = bias that we can control

2. Two mechanisms

a. Confounding by Indication

i. Intractable problem where prognostic factors influence treatment

decision

1. Problematic when evaluating treatment effects from

observational data

ii. Ex. Asthma studies of the 1980s showed an association between a β-

agonist (Fenoterol) and death from asthma

1. However, it was argued that patients who had more severe

asthma were therefore at a higher risk of mortality from

asthma and thus were likely to be prescribed Fenoterol in

the first place.

2. The severity of the disease confounds the association

between the drug and the adverse outcome

b. Channeling effect (bias)

i. Tendency for clinicians to prescribe certain treatments based on a patient's underlying prognosis or

comorbidity profile

1. results in differences in baseline risk

ii. Solution

1. Adapt the design of the study

2. Statistically adjust for the baseline risk differences

(Reducing Bias: Confounding → use RANDOMIZATION; Selection → use CONCEALMENT; Measurement → use BLINDING)

ii. Selection

1. Internal validity questions for Dx & Px

2. Cohort studies

3. Selection bias = bias that we cannot control (compared to confounding)

iii. Measurement

1. Internal validity questions for harm

2. Case Control Study

a. Assessment / Measurement Bias Occurs when one group of patients has a better (or worse) chance of

having their outcome detected than another group (EPI 547 CP).

i. Particularly likely to occur for soft outcomes like side effects, mild disabilities, sub-clinical disease or

specific cause of death

ii. Minimize bias by adhering to the following 3 principles

1. Ensuring that all observations are carried out by observers who are blinded to the exposure

status of the particular patients.

a. Blinding eliminates measurement bias → outcomes are measured with the same

degree of accuracy and completeness in every participant

2. Develop (and use) careful criteria or rules for deciding whether an outcome event has

occurred

3. Apply equally rigorous efforts to ascertain all events regardless of exposure group

b. Information Bias / Observation Bias / Classification Bias / Measurement Bias Results from incorrect

determination of exposure or outcome, or both (Grimes 2002)


c. Interviewer bias - error introduced by an interviewer's conscious or subconscious gathering of selective data

(the interviewer might think that people are sicker than they really are).

d. Recall bias error due to differences in accuracy or completeness of recall to memory of past events or

experiences. Particularly relevant to case control studies (CCS).

e. Selection bias an error in patient assignment between groups that permits a confounding variable to arise

from the study design rather than by chance alone.

i. Occurs when the groups of exposed and non-exposed assembled for the study differ in some way

other than the prognostic factors under study.

ii. When extraneous variables affect the outcome of the study

iii. This stems from an absence of comparability between groups being studied

iv. Spectrum Bias

1. Defined as: the difference in both the spectrum and severity of disease between

a. The population among whom the test was first developed (the study population)

i. Phase I evaluations to see if (+) in sick people

1. The sickest of the sick

a. Easier time picking out the obvious

b. Se will be overestimated

ii. Phase I evaluations to see if (-) in normal people

1. The wellest of the well

a. Healthier and younger than clinically relevant

population

b. Less likely to have other Dx or co-morbidities

c. Fewer FP results (i.e., many TNs)

d. Overestimates Sp

iii. NET EFFECT: new diagnostic tests are overly optimistic

a. Overestimate Se and Sp

b. The population that the test will be used in practice (clinically relevant population)

i. Phase II evaluations clinical population that has a whole array of

conditions that are a part of the DDX

1. Patients WITHOUT the DISEASE of INTEREST

a. Conditions that cause FPs

b. This underestimates the Sp

c. Opposite of the wellest of the well

v. Assembly or Susceptibility bias Is an example of selection bias, since the bias occurs when the

subjects are first selected.

1. Survival Cohort This is a special type of assembly bias where only the patients that

survived the outcome are taken into account

2. A survival cohort describes the past history of prevalent cases and NOT that of a true

inception cohort

3. Individuals who would have been included in a true inception cohort are not accounted for

because they died soon after the onset of treatment

vi. Migration Bias / Loss-to-Follow-Up This is another form of selection bias which occurs when

patients drop out of the study prematurely

vii. Referral / Sampling Bias This is a selective referral of patients to tertiary (academic) medical

centers where many publications concerning prognostic aspects of disease are conducted and this

selection bias alters the clinical spectrum of disease

1. The proportion of more severe or unusual cases tends to be artificially higher at tertiary

care centers.

2. People who are treated at primary care centers are often TOO SICK to be referred to or

even make the trip to the tertiary care center.

3. The people that survived the referral to the academic centers are the ones getting studied.

f. Volunteer bias people who choose to enroll in clinical research may be systematically different (e.g.

healthier, or more motivated) from your patients.

g. Verification Bias / Workup bias when the decision to conduct the confirmatory or reference standard test is

influenced by the result of the diagnostic test under study.

i. Ask: Were all the patients subjected to a gold standard?

ii. Fecal occult blood test and colonoscopy

1. FOBT (-) are not referred to colonoscopy, but they could be FNs

a. Overestimates Se (b/c the FNs are underestimated)

b. Underestimates Sp (b/c the # of TNs is underestimated)


3. Cochrane Collaboration

a. This international group, named for Archie Cochrane, is a unique initiative in the evaluation of healthcare

interventions that prepares, disseminates, and continuously updates systematic reviews of controlled trials for

specific patient problems.

b. This team will gather all of the studies on a subject, disregard the poor studies, and come up with a consensus

on the final outcomes of the good studies.

c. Key points for systematic reviews

i. Grade concealment of allocation

ii. Describe key quality parameters relevant to topic

iii. Report risk of bias table

4. Effectiveness vs. Efficacy

a. Effectiveness: A measurement of benefit resulting from an intervention for a given health problem under

conditions of usual practice. This form of evaluation considers both the efficacy of an intervention and its

acceptance by those to whom it is offered. It helps answer does the practice do more good than harm to

people to whom it is offered?

i. Think of as the clinical trials that try to figure out if the treatment works under Usual and Ordinary

circumstances

b. Efficacy: A measure of benefit resulting from an intervention for a given health problem under conditions of

ideal practice. It helps answer does the practice do more good than harm to people who fully comply with the

recommendations? (N.B. It is the job of RCTs to measure efficacy)

i. Think of as the clinical trial checking to see if the treatment can work under Ideal situations (Fletcher

137)

5. Evidence-Based Medicine (EBM)

a. The conscientious, explicit, and judicious use of current best evidence in making decisions about the care of

individual patients.

i. Clinical Expertise

ii. Research Evidence

iii. Patient Preference

b. The practice of evidence-based medicine requires integration of individual clinical expertise and patient

preferences with the best available external clinical evidence from systematic research.

c. It is the application of clinical epidemiology to the care of patients and includes the following concepts:

i. Formulate: Converting the information need into an answerable clinical question

1. EBM clinical question PICO

a. P = Population

b. I = Intervention

c. C = Control

d. O = Outcome

2. Clinical question type

a. RCT = Therapy

b. Cohort = Dx & Px

i. Having exposure and looking for outcome

c. Case Control = Harm (outcome)

i. Have outcome and looking back for exposure/harm

ii. Search: Track down/search for the best evidence because EBM must include the best available

research evidence

iii. Appraise: Critically appraise and judge whether the evidence is strong enough to base clinical

decisions on

1. Systematic approach

a. Validity

b. Results

c. Application

iv. Apply: Integrate the critical appraisal with clinical expertise and the patient's biology, values, and

circumstances

6. Hazard Function

a. The probability of an event (such as death or relapse) at a given moment in time (t) (EPI 547 CP).

b. It is a direct measure of prognosis and indicates that, given the patient has survived to a certain point in time,

what is the probability of the patient failing during the next time period?

i. ↑ in hazard function indicates that the prognosis worsens with time


ii. ↓ in hazard function indicates that the prognosis improves for those patients that survive longer
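A minimal Python sketch of a discrete-time hazard (counts are hypothetical):

# Hypothetical sketch: discrete hazard = deaths during an interval / patients still at risk at its start
at_risk = [100, 90, 75, 60]          # patients alive at the start of each successive year
deaths  = [10, 15, 15, 12]           # deaths occurring during that year

hazard = [d / n for d, n in zip(deaths, at_risk)]
print([round(h, 2) for h in hazard]) # [0.1, 0.17, 0.2, 0.2] -> a rising hazard means prognosis worsens with time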

7. Censoring

a. When the event of interest does not occur in all individuals because

i. Study stopped/ended before outcome occurred

ii. LTFU

iii. Death from other (competing) causes eg. Road accident

8. Necessary vs. Sufficient Causes

a. Necessary - The factors that MUST ALWAYS BE PRESENT for the disease to occur

i. If the factor is NOT present, then the disease CANNOT occur

ii. If factor present, then disease CAN occur, but disease DOES NOT HAVE TO occur.

b. Sufficient Once the factor IS PRESENT, the disease WILL ALWAYS occur

i. Once factor is present, then disease MUST ALWAYS occur, even though it is not the only important

thing for the disease to occur

c. Examples

i. Rabies Virus and Human Rabies Rabies virus is BOTH NECESSARY & SUFFICIENT because

virus must be present for the disease to occur and once it is present, the disease must occur

ii. Mycobacterium tuberculosis and Clinical TB disease This bacteria is ONLY NECESSARY to cause

tuberculosis because it must be present for TB to occur, but it is not sufficient, because not everyone

that has the bacteria will get the disease

iii. Smoking and Lung Cancer - Smoking is NEITHER necessary NOR sufficient. It is not necessary

because lung cancer can occur even when smoking is absent; it is not sufficient because when

smoking is present lung cancer does not always occur.

iv. Maternal Alcohol Use and Fetal Alcohol Syndrome As for the maternal drinking and fetal alcohol

syndrome, it is ONLY NECESSARY for a mother to drink in order for the child to be born with FAS. It

is not sufficient, because other factors play a role too, one being the dose etc.

v. Prone Sleeping and Sudden Infant Death Syndrome This is NEITHER necessary because prone

sleeping is not always present in SIDS cases and it is not sufficient because prone sleeping does not

mean that SIDS will occur

vi. PhenylPropanolAmine and hemorrhagic stroke - PPA is NEITHER necessary NOR sufficient. It is more

a risk factor than the cause in the narrow sense of the word

vii. HIV and AIDS - HIV is BOTH NECESSARY & SUFFICIENT: NECESSARY because HIV must be

present for AIDS to occur, and SUFFICIENT because once the virus is present, the disease will

(eventually) occur

9. Study Designs

a. Case Series A collection or a report of the series of patients with an outcome of interest. No control group is

involved.

b. Case Control Study (CCS): identifies patients who have a condition or outcome of interest (cases) and patients who do not have the condition or outcome (controls). The frequency with which subjects were exposed to a risk factor of interest is then compared between the cases and controls. Because of the design of the CCS, disease rates cannot be directly measured (contrast this with the cohort study design). Thus the comparison between cases and controls is actually done by calculating the odds of exposure in cases and controls. The ratio of these two odds gives the odds ratio (OR), which is usually a good approximation of the relative risk (RR) when the outcome is rare (see the worked 2x2 sketch after this item)

i. Advantages

1. It is relatively quick and inexpensive, requiring fewer subjects than other study designs.
2. It is often the only feasible method for investigating very rare disorders or when a long lag time exists between an exposure of interest and development of the outcome/disease of interest.

3. It is also particularly helpful in studies of outbreak investigations where a quick answer

followed by a quick response is required.

ii. Disadvantages: recall bias, unknown confounding variables, and difficulty selecting appropriate

control groups.
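A minimal Python sketch (hypothetical 2x2 counts, not from these notes) of the case-control odds ratio, computed as the ratio of the odds of exposure in cases to the odds of exposure in controls:

# Hypothetical 2x2 counts (for illustration only)
exposed_cases, unexposed_cases = 40, 60
exposed_controls, unexposed_controls = 20, 80
odds_exposure_cases = exposed_cases / unexposed_cases           # 40/60
odds_exposure_controls = exposed_controls / unexposed_controls  # 20/80
odds_ratio = odds_exposure_cases / odds_exposure_controls       # about 2.67
print(f"OR = {odds_ratio:.2f}")

In this made-up example, cases have roughly 2.7 times the odds of exposure compared with controls.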

c. Crossover Design A method of comparing 2 or more treatments or interventions in which all subjects are

switched to the alternate treatment after completion of the first treatment. Typically allocation to the first

treatment is by a random process. Since all subjects serve as their own controls, error variance is reduced.

d. Cross-Sectional Survey: the observation of a defined population at a single point in time or during a specific time interval. Exposure and outcome are determined simultaneously. Also referred to as a prevalence survey because prevalence is the only epidemiological frequency measure that can be obtained (in other words, incidence rates cannot be generated from this design)

e. Cohort Study: involves identification of two groups (cohorts) of patients who are defined according to whether they were exposed to a factor of interest (e.g., smokers and non-smokers). The cohorts are then followed over time and the incidence rates for the outcome of interest in each group are measured. The ratio of these incidence rates gives the relative risk (RR), which quantifies the magnitude of association between the factor and outcome (disease) (see the incidence-rate sketch after this item). Note that when the follow-up occurs in a forward direction the study is referred to as a prospective cohort; when follow-up is done based on historical information it is referred to as a retrospective cohort.


i. Advantages

1. can establish clear temporal relationships between exposure and disease onset

2. Able to generate incidence rates.

ii. Disadvantages

1. control/unexposed groups may be difficult to identify, exposure to a variable may be linked

to a hidden confounding variable, blinding is often not possible, randomization is not

present.

2. For relatively rare diseases of interest, cohort studies require huge sample sizes and long follow-up (hence they are slow and expensive).
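A minimal Python sketch (hypothetical cohort counts, not from these notes) of the relative risk as the ratio of the incidence in the exposed cohort to the incidence in the unexposed cohort:

# Hypothetical cohort counts (for illustration only)
exposed_cases, exposed_total = 30, 1000
unexposed_cases, unexposed_total = 10, 1000
incidence_exposed = exposed_cases / exposed_total         # 0.030
incidence_unexposed = unexposed_cases / unexposed_total   # 0.010
relative_risk = incidence_exposed / incidence_unexposed   # 3.0
print(f"RR = {relative_risk:.1f}")

With these made-up numbers the exposed cohort has three times the incidence of the outcome.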

f. N-of-1 Trial When an individual patient undergoes pairs of treatment periods organized so that one period

involves use of the experimental treatment and the other involves use of a placebo or alternate therapy. Ideally

the patient and physician are both blinded, and outcomes are measured. Treatment periods are replicated until

patient and clinician are convinced that the treatments are definitely different or definitely not different.

g. Randomized Controlled Trial (RCT): a group of patients is randomized into an experimental group and a control group. These groups are then followed up and various outcomes of interest are documented. RCTs are the ultimate standard by which new therapeutic maneuvers are judged. Randomization should result in the equal distribution of both known and unknown confounding variables into each group. An unbiased RCT also requires allocation concealment and, where feasible, blinding.

i. It is the gold standard of clinical research designs because it REDUCES CONFOUNDING from

known and unknown confounders

ii. Randomization should ensure that there are NO differences between the groups at baseline.

1. This can be described as the groups having an equal prognosis at baseline

2. It can also be described as controlling for the known and unknown confounding variables

3. It CANNOT be applied to ALL clinical questions of interest

4. RCTs do NOT always have to include a placebo group, but they must have a comparison group (e.g., a different drug)

iii. Disadvantages: often impractical, limited generalizability, volunteer bias, significant expense, and

sometimes ethical difficulties.

h. Systematic Review A formal review of a focused clinical question based on a comprehensive search

strategy and structured critical appraisal designed to reduce the likelihood of bias.

i. No quantitative summary is generated, however (contrast with meta-analysis below).

ii. Any summary of research that attempts to address a focused clinical question in a systematic,

reproducible manner.

iii. These reviews provide a summary of studies that have been searched out comprehensively with an explicit and reproducible search strategy intended to answer a specific clinical question

iv. These reviews incorporate some sort of inclusion criteria, valid quality assessment methods, a

rigorous appraisal of the evidence offered, and some summary conclusions

v. High quality SRs

1. Assess quality of individual studies

2. Report results

i. Narrative Review This provides a general overview which may not follow rigorous, reproducible scientific

methods

i. This may result in biased or erroneous conclusions

ii. These reviews may nonetheless provide practical information about managing common clinical conditions

iii. One is unlikely to find a detailed and discriminating appraisal of the evidence

j. Meta-Analysis A systematic review which uses quantitative methods to combine the results of several

studies into a pooled summary estimate.

i. The quantitative synthesis that yields a single best estimate of, for instance, treatment effect.

ii. This is a subset of systematic reviews in which the investigators report a combined summary statistic for a variable of interest.

10. Survival Function S(t)
a. The survival experience of a population: the probability of surviving to a given point in time (t).
i. S(3) = 60% indicates that 60% of the population survived to 3 years

b. Median Survival Time: a crude but commonly used measure; the time at which half the patients have failed

c. The hazard function (item 6) is closely linked to S(t)

11. Likelihood Ratio (LR) the ODDS that a given test result would occur in (D+) compared to (D-)

a. Defined as: the probability of observing the test result in the presence of disease divided by the probability of observing the test result in the absence of disease.
i. It is therefore the ODDS that a given test result (x) would occur in a diseased individual (D+) compared to a non-diseased individual (D-).

b. Makes the application of Bayes' Theorem more practical/easier
i. An alternative way of describing the performance of a diagnostic test
c. Prevalence
i. It is the prior probability of disease: the clinician's best guess/opinion prior to ordering the test.


ii. Fundamental Fact #1

1. The interpretation of test results depends on the probability of disease before the test was

run

d. Odds-LR form of Bayes Theorem

i. The environment in which the test is applied (indicated by pre-test odds) is as important as the

information provided by the test (indicated by the LR); in other words, each aspect is only half of the

story

ii. The LR can be obtained online; the hard part is for the clinician to provide an accurate estimate of the pre-test odds.

iii. (Pre-test ODDS) × (LR) = (Post-test ODDS) (a worked numeric sketch follows at the end of this LR section)
1. Pre-test ODDS = Prev / (1 - Prev)
2. Post-test PROBABILITY = (Post-test ODDS) / (1 + Post-test ODDS)
a. Post-test probability = PVP (or PVN)

iv. Thus, can calculate PVP or PVN from

1. LR+

2. LR-

3. Pre-test odds

v. Positive post-test probability: the likelihood of disease given that the test is (+) = P(D+|T+)
vi. Negative post-test probability: the likelihood of disease given that the test is (-) = P(D+|T-)
1. This one isn't of much use to us, as it is the complement of PVN, i.e., 1 - P(D-|T-)
a. Therefore, we calculate PVN = 1 - post-test probability

e. Can be calculated for a range of test results (ordinal or continuous test), thereby preserving clinical information

f. Test results provide the maximum amount of information (the change between prior and post-test probability) when the prevalence of disease is between 40-60%.

g. Theoretical advantage of LR over Se and Sp

i. Se and Sp remain constant regardless of the prior probability of disease

ii. LR is less susceptible to changes in the underlying prevalence of disease because they are

calculated from smaller slices of the data.
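A minimal Python sketch (hypothetical prevalence and LR values, not from these notes) of the odds-LR form of Bayes' Theorem, converting a pre-test probability into a post-test probability (PVP):

prevalence = 0.20      # hypothetical pre-test probability (clinician's estimate)
lr_positive = 8.0      # hypothetical LR+ for the observed test result
pre_test_odds = prevalence / (1 - prevalence)                    # 0.25
post_test_odds = pre_test_odds * lr_positive                     # 2.0
post_test_probability = post_test_odds / (1 + post_test_odds)    # about 0.67 = PVP
print(f"Post-test probability (PVP) = {post_test_probability:.2f}")

With these made-up numbers, a positive result moves the probability of disease from 20% to about 67%.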

12. Kaplan-Meier Estimator
a. Widely accepted method of estimating S(t), expressed as the product of conditional probabilities
i. S(3) = S(1) × S(2|1) × S(3|2) (a short numeric sketch follows below)
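A minimal Python sketch (hypothetical conditional survival probabilities, not from these notes) of the Kaplan-Meier product:

# Hypothetical P(survive year k | alive at start of year k), for illustration only
conditional_survival = [0.90, 0.80, 0.75]
s_t = 1.0
for year, p in enumerate(conditional_survival, start=1):
    s_t *= p
    print(f"S({year}) = {s_t:.3f}")
# S(3) = 0.90 * 0.80 * 0.75 = 0.540

So with these made-up numbers, an estimated 54% of the population survives to 3 years.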

13. Log-rank test

a. A formal statistical test of the difference in survival distribution

b. It compares the observed number of events in one group to the expected number of events if the two groups

had identical hazard functions.

14. Cox proportional hazard model

a. Very powerful regression modeling technique based on the hazard function

i. Allows for full application and flexibility of regression analysis to be applied to survival data

ii. In its simplest form, it is an extension of the log-rank test

iii. The measure of effect is the HR

b. Advantages

i. Able to handle a large number of prognostic variables (discrete and continuous)

ii. Can adjust for confounding variables
iii. Can evaluate interaction effects

15. Heterogeneity

a. Test done in systematic reviews and meta-analyses to determine how similar the individual study RESULTS are

b. Q statistic

i. Based on the chi-square test (the test has low power to detect heterogeneity)

ii. H0: there is homogeneity among the results (p > 0.5)
iii. Ha: heterogeneity is present (p < 0.5)
1. Unlikely that chance alone explains the differences between the studies

iv. NOTE: the H0 is OPPOSITE of what we would normally think

c. Inconsistency Index or I² (see the sketch at the end of this heterogeneity item)
i. Estimate of the variability in results due to true differences in treatment effect vs. chance
ii. I² < 25% = low heterogeneity
iii. I² 25-75% = moderate heterogeneity
iv. I² > 75% = high heterogeneity

d. Sources of heterogeneity

i. Clinical heterogeneity
1. Population
2. Intervention
3. Outcome
ii. Methodological heterogeneity

1. Design

iii. Chance
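A minimal Python sketch of one common way to compute I² from the Q statistic, Higgins' formula I² = (Q - df) / Q × 100% with df = number of studies - 1 (the formula itself is not spelled out in these notes, and the numbers below are hypothetical):

def i_squared(q_statistic, n_studies):
    # Higgins' formula: I^2 = max(0, (Q - df) / Q) * 100, with df = k - 1
    df = n_studies - 1
    return max(0.0, (q_statistic - df) / q_statistic) * 100.0

print(f"I^2 = {i_squared(q_statistic=20.0, n_studies=8):.0f}%")  # 65%, i.e., moderate heterogeneity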

16. Pooled (Summary) Estimate
a. Weighted average of the individual study results (see the inverse-variance sketch at the end of this item)


i. Larger trials are given more weight, as a simple mean may provide an unbalanced estimate of the effect size

ii. Types

1. Fixed effect model = used with LOW heterogeneity

a. Inference based on studies at hand

b. The assumption is that identical studies should produce identical results

i. Thus, any difference is only from within-study random variation

c. Combines all the studies according to weight

2. Random effect model = used with HIGH heterogeneity

a. Treats the included studies as a random sample from the (hypothetical) universe of all possible studies

b. Accounts for between and within study random variation

c. More weight given to smaller studies

d. Wider 95% confidence interval
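A minimal Python sketch (hypothetical effect sizes and variances, not from these notes) of a fixed-effect pooled estimate using inverse-variance weights, so that larger, more precise studies contribute more:

# Hypothetical study effect estimates and within-study variances (illustration only)
effects = [0.50, 0.30, 0.45]
variances = [0.04, 0.01, 0.09]
weights = [1 / v for v in variances]   # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
print(f"Pooled estimate = {pooled:.3f}")

A random-effects version would add a between-study variance term to each weight, which pulls the weights closer together and widens the confidence interval.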

