Relative Risk
Risk
o The probability of developing a disease
Incidence is a measure of risk
Prevalence is not a measure of risk
Because disease may not be newly developed
o Relative risk is the ratio of the absolute risk for disease in an exposed group
versus an unexposed group over a defined time interval
Examples of this would be: relative risk, risk ratio and rate ratio
Relative risk is the ratio of the risk for disease in an exposed versus
unexposed group over a defined time interval.
RR = Incidence in exposed during specified time interval/ Incidence in
unexposed during specified time interval.
o Absolute risk used for relative risk is the probability that a disease-free individual
will develop a given disease over a specified time interval
Examples of this probability are: Cumulative incidence, attack rate, and
incidence rate
Absolute risk shows the magnitude of risk in a group of people
Absolute risk does not involve a comparison
Absolute risk does not tell us the excess (or decreased) risk associated
with the exposure
? Which is why it is not used to calculate the interaction of
exposures.?
Absolute risk does not provide any information about associations
between exposure and disease.
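The RR formula given earlier in this section can be sketched in a few lines. This is only an illustration: the function name and the counts are hypothetical and do not come from the notes.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """RR = incidence in exposed / incidence in unexposed, same time interval."""
    risk_exposed = cases_exposed / n_exposed        # absolute risk, exposed group
    risk_unexposed = cases_unexposed / n_unexposed  # absolute risk, unexposed group
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2))  # 3.0
```

Each absolute risk (0.30 and 0.10) describes one group on its own; only the ratio compares the groups.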
Measures of morbidity
o Prevalence = (the number of cases of a disease in a population during a specified
period of time)/ (the number of persons in the population during that period of
time)
o Incidence:
Cumulative incidence = (Number of new cases of a disease occurring in
the population during a specified period of time) / ( the number of
persons at risk of developing the disease during that period of time)
Incidence Rate = (number of new cases of a disease occurring in a
population during follow-up) / (Total time at risk experienced by the
individuals in the population)
***Time at risk is typically measured in person-years (any unit of time can be
used in place of years)***
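As a rough sketch, the three morbidity measures defined above differ only in what goes in the numerator and denominator. The function names and the numbers are assumptions made for illustration.

```python
def prevalence(cases, population):
    """Cases of disease in the population / persons in the population."""
    return cases / population

def cumulative_incidence(new_cases, persons_at_risk):
    """New cases / persons at risk of developing the disease."""
    return new_cases / persons_at_risk

def incidence_rate(new_cases, person_time_at_risk):
    """New cases / total time at risk (for example, person-years)."""
    return new_cases / person_time_at_risk

# Hypothetical population of 1,000: 50 already have the disease at baseline.
# The 950 disease-free people contribute 1,900 person-years; 19 develop disease.
print(prevalence(50, 1000))           # 0.05
print(cumulative_incidence(19, 950))  # 0.02
print(incidence_rate(19, 1900))       # 0.01 cases per person-year
```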
Study Design       Prevalence   Incidence
Cross-sectional    YES          NO
Case-control       NO           NO
Cohort             YES          YES
Randomized trial   YES          YES
Odds Ratio
Prevalence Ratio
o Similar to RR, except that PR uses prevalence instead of incidence
o Can be used in
Randomized clinical trial (RCT)
Cohort Study
Cross-sectional Study
This is the study design in which RR cannot be calculated
Typically used only in cross-sectional studies, because RR can be used in RCTs
and cohort studies
In practice, not often used (other options for cross-sectional studies are
available)
o PR = (Prevalence of disease of exposed group) / (Prevalence of disease of
unexposed group)
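A minimal sketch of the PR formula, assuming hypothetical counts from a cross-sectional survey (the function name is illustrative):

```python
def prevalence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """PR = prevalence of disease in exposed / prevalence of disease in unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Hypothetical survey: 40 of 200 exposed and 20 of 200 unexposed have the disease.
pr = prevalence_ratio(40, 200, 20, 200)
print(pr)  # 2.0
```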
Odds
o Odds = Probability of an event happening / Probability of an event NOT
happening
o Odds = P / (1 - P)
Odds of Exposure
o Odds that the case was exposed = Probability that a case was exposed /
Probability that a case was NOT exposed
o Odds that the control was exposed = Probability that a control was exposed /
Probability that a control was NOT exposed
Odds Ratio (OR) case-control study = Odds that a case was exposed / Odds that a
control was exposed
o This is the odds ratio of exposure
o Sometimes called relative odds
o This is the only measure of association that can be used in case-control studies
o May also calculate this for a cross-sectional study
OR Key:
o OR > 1: Cases were more likely to be exposed, consistent with when exposure
increases the odds of disease.
o OR = 1: Cases and controls are equally likely to be exposed, consistent with when
exposure is not related to disease.
o OR < 1: Cases were less likely to be exposed, consistent with when exposure
decreases the odds of disease (or is protective against disease).
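The odds, the odds ratio of exposure, and the OR key can be sketched as follows. The function names and the case-control counts are hypothetical, and the interpretation helper ignores sampling variability.

```python
def odds(p):
    """Odds = P / (1 - P) for a probability p strictly between 0 and 1."""
    return p / (1 - p)

def odds_ratio_of_exposure(cases_exposed, cases_unexposed,
                           controls_exposed, controls_unexposed):
    """OR = odds that a case was exposed / odds that a control was exposed."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls

def interpret(or_value):
    """Map an OR onto the key above."""
    if or_value > 1:
        return "cases more likely to be exposed"
    if or_value < 1:
        return "cases less likely to be exposed"
    return "cases and controls equally likely to be exposed"

print(odds(0.5))  # 1.0

# Hypothetical study: 50 of 100 cases and 20 of 100 controls were exposed.
or_exposure = odds_ratio_of_exposure(50, 50, 20, 80)
print(or_exposure, interpret(or_exposure))  # 4.0 cases more likely to be exposed
```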
Protective Odds Ratio
o If the odds ratio is <1.0
There is a decreased odds or protective effect
Example is on slide 26 of PowerPoint
Dose-response
o You can also observe a trend in dose-response with the OR
Matched case-control studies
o Cases and controls are matched by a third variable such as age, sex, etc.
o This is a method to make sure there are the same proportions of ages, sexes, etc. in
the case and control groups so that these variables do not influence the
outcome
o Concordant pairs: case and control share the same exposure status
o Discordant pairs: case and control have different exposure status
o To calculate the OR in a matched case-control study you will only use the
discordant pairs
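The notes do not spell out the matched-pair formula. The standard estimator divides the number of pairs in which only the case was exposed by the number in which only the control was exposed. A sketch with hypothetical pair counts:

```python
def matched_pairs_odds_ratio(case_only_exposed, control_only_exposed):
    """Matched-pair OR: ratio of the two kinds of discordant pairs."""
    return case_only_exposed / control_only_exposed

# Hypothetical 100 matched pairs.
pairs = {
    "both_exposed": 20,          # concordant, not used
    "neither_exposed": 50,       # concordant, not used
    "case_only_exposed": 20,     # discordant
    "control_only_exposed": 10,  # discordant
}
matched_or = matched_pairs_odds_ratio(pairs["case_only_exposed"],
                                      pairs["control_only_exposed"])
print(matched_or)  # 2.0
```

The 70 concordant pairs drop out because they carry no information about whether exposure differs between a case and its matched control.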
Odds ratio of exposure
o In a case-control study, we calculate the odds ratio of exposure
This is because in a case-control study participants are selected based on
case/control status, so the estimate is based on the odds of exposure
among cases and controls.
Odds of disease:
o Odds of disease among exposed = (Number of exposed with disease)/ (Number
of exposed without disease)
o Odds of disease among unexposed = (Number of unexposed with disease)/
(Number of unexposed without disease)
Odds Ratio (OR) for cohort study
o OR = Odds of disease among exposed/ Odds of disease among unexposed
Odds ratio of disease
o In a cohort study, we can calculate an odds ratio, but it is the odds ratio of
disease
This is because in a cohort study participants are not selected by their
disease status (the disease develops over time) so you can compare odds
of disease by exposure
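A short sketch of the odds ratio of disease from a 2x2 table, with hypothetical cohort counts. It also checks that the odds ratio of exposure computed from the same four cells gives the same number, since both reduce to the cross-product (a x d) / (b x c). The cell labels a to d are a common convention, not taken from the notes.

```python
import math

def odds_ratio_of_disease(a, b, c, d):
    """a: exposed with disease, b: exposed without disease,
    c: unexposed with disease, d: unexposed without disease."""
    return (a / b) / (c / d)

def odds_ratio_of_exposure(a, b, c, d):
    """Same cells, read by disease status instead of exposure status."""
    return (a / c) / (b / d)

# Hypothetical cohort: 100 exposed (30 diseased), 100 unexposed (10 diseased).
a, b, c, d = 30, 70, 10, 90
or_disease = odds_ratio_of_disease(a, b, c, d)
or_exposure = odds_ratio_of_exposure(a, b, c, d)
cross_product = (a * d) / (b * c)

print(round(or_disease, 2))  # 3.86
print(math.isclose(or_disease, or_exposure)
      and math.isclose(or_disease, cross_product))  # True
```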
Odds ratio vs. Relative risk
o Both are useful measures of association between exposure and disease
o RR can only be calculated in cohort or experimental studies because incidence
is needed for the calculation
o In case-control studies, only the OR can be calculated because incidence is not a
morbidity measure that can be obtained in this type of study design
o The two measures are related to each other
OR tends to exaggerate the RR
Can see this in a cohort study
o The two (OR and RR) are similar when:
The cases (exposed) are representative, with regard to history of
exposure, of all people with the disease in the population from which
cases are drawn, and controls (unexposed) are representative, with
regard to history of exposure, of all people without the disease in the
population from which the cases were drawn,
The exposed and unexposed participants would have to be
selected as a representative sample and not
Enrolled based on exposure and unexposed status
And the disease being studied does not occur frequently (< 10%).
When a disease is rare, the number of exposed individuals with
the disease and the number of unexposed individuals with the
disease will be relatively small compared to the exposed and
unexposed individuals without the disease
However, cohort studies are not used for rare diseases because
they are inefficient
o Cohort studies are used for rare exposures
o So typically, since a cohort study would not be used in the
first place, for a rare disease unless the exposure was rare,
a similar OR and RR would not be seen.
o RR is the preferred measure of association because it incorporates the
development of disease, i.e., risk
o Under certain conditions such as: (1) a representative sample with regard to
history of exposure and (2) the disease does not occur frequently (Prevalence of
disease <10%) the OR can be used to approximate the RR.
o The OR always exaggerates the RR
OR will be farther from the null value, making the effect observed seem
bigger
If RR and OR > 1, the OR > RR
If RR and OR <1, the OR < RR
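The relationship between the two measures can be checked numerically. The sketch below uses three hypothetical cohorts (a common disease, a rare disease, and a protective exposure); the function name and counts are illustrative only.

```python
def rr_and_or(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Return (RR, OR) computed from the same cohort counts."""
    risk_exp = cases_exposed / n_exposed
    risk_unexp = cases_unexposed / n_unexposed
    rr = risk_exp / risk_unexp
    odds_ratio = (risk_exp / (1 - risk_exp)) / (risk_unexp / (1 - risk_unexp))
    return rr, odds_ratio

common = rr_and_or(60, 100, 30, 100)      # disease risk of 30% to 60%
rare = rr_and_or(20, 1000, 10, 1000)      # disease risk of 1% to 2%
protective = rr_and_or(30, 100, 60, 100)  # exposure halves the risk

print([round(x, 2) for x in common])      # [2.0, 3.5]
print([round(x, 2) for x in rare])        # [2.0, 2.02]
print([round(x, 2) for x in protective])  # [0.5, 0.29]
```

With a common disease the OR (3.5) is well above the RR (2.0). With a rare disease the two nearly coincide. For the protective exposure the OR (0.29) is again farther from 1 than the RR (0.5).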
Study Designs & Measures of association
                          SIR/SMR   KM    RR    AR    OR    PR
Case-report/case-series   -         -     -     -     -     -
Cross-sectional           -         -     -     -     YES   YES
Case-control              -         -     -     -     YES   -
Cohort                    YES       YES   YES   YES   YES   YES
RCT                       -         YES   YES   YES   YES   -
Attributable Risk
Hypothesis testing
o Research question—the first step before any epidemiological study can be
designed
Must be clearly defined and explicitly stated
o Study hypothesis
Null hypothesis (Ho)
the exposure is not associated with the outcome
Alternative Hypothesis (Ha)
The exposure is associated with outcome
o Association
If your results do not reject the null hypothesis --> no association
If your results reject the null hypothesis, which by default accepts the
alternative hypothesis --> there is an association
Statistical significance
Measures of association
o Examples:
RR, OR, PR, AR, PAR and regression coefficients
Introduction to causal inference
o Causation
Association is not causation
In order to improve public health, we have to identify true causes from
factors which simply have an association with health outcomes
Causal inference
Causal inference is the process of drawing a conclusion about
whether a relationship is causal
Components of causal inference:
o Evaluation of the likelihood of a causal association in a
given study
o Evaluation of all epidemiology studies
o Evaluation of all data
Inference from an individual study
o First, is there an association in the study
o Second, is the association in an individual study likely to be
causal?
o If there is a significant association:
It might be real
Statistical significance
o It may be causal
o It might not be causal
Confounding
It might occur by chance (spurious)
Statistical insignificance
It might not be real (i.e. there is not really an
association)
bias
Statistical significance
o Factors that influence statistical significance
Sample size
Multiple comparisons
Bonferroni corrections and reduced p-value or confidence interval
thresholds are all ways to control for the influence of
multiple comparisons on the significance of an association
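As an illustration of the Bonferroni correction, the significance threshold is divided by the number of comparisons. The p-values below are made up for the example, and the 0.05 starting threshold is an assumption.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison threshold that keeps the overall error rate near alpha."""
    return alpha / n_comparisons

# Hypothetical p-values from four exposure-outcome comparisons.
p_values = [0.004, 0.020, 0.030, 0.300]
threshold = bonferroni_threshold(0.05, len(p_values))
significant = [p <= threshold for p in p_values]

print(threshold)    # 0.0125
print(significant)  # [True, False, False, False]
```

Two results that would pass an uncorrected 0.05 cutoff (0.020 and 0.030) no longer count as significant after the correction.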
Borderline significance
o A term used to describe results which are close to being statistically significant
Typically, p is between 0.05 and 0.10
o Rationale to use borderline significance
Cutoff points are arbitrary anyway
Significance is highly dependent on the sample size
In some studies it can be difficult to obtain a large sample size
Confounding
Confounding
o In confounding a third variable (the confounder) alters the observed association
between exposure and outcome
The causal association we are interested in is masked by the effect of the
confounder
o Confounding generally represents a real association within data
It is not caused by error (bias) or by chance
But when you do not address confounding, it prevents you from
determining causal relationships
o The classic criteria for a confounder are as follows:
(1) The confounding factor must be causally associated with the outcome
(disease)
(2) The confounding factor must be causally or non-causally associated
with exposure
(3) The confounding factor must not be an intermediate variable in the
causal pathway between exposure and disease
o Positive confounding: the exposure-outcome association is exaggerated
(further away from the null value)
The magnitude of the crude OR is exaggerated
Crude OR > Adjusted OR (when the OR is above 1)
o Negative confounding: the exposure-outcome association is attenuated
(closer to the null value)
The magnitude of the crude OR is attenuated
Crude OR < Adjusted OR (when the OR is above 1)
Intermediate variable – mediator
o A mediator represents an intermediate effect between exposure and disease, or
an effect in the causal pathway between exposure and disease.
o NOT a confounder
o Example: slide 54
Confounding, observational & experimental studies
o Confounding is the most important cause of spurious associations in
observational epidemiology studies
o Confounding is not a problem in experimental lab studies
They are designed so that the only difference between study groups is the
exposure: thus, no confounder is possible
o Confounding is usually not present in randomized trials
Randomization is performed so that the comparison groups are as similar
to each other as possible with respect to every variable except for the
exposure
Identifying confounders
o During design
Biological model or underlying theory should allow you to specify
potential confounders in advance of study/analysis
Collect information on potential confounders when possible
o During analysis
Assess for confounding in a systematic way
Known or potential confounding factors
Other factors not previously known to be confounding factors, but which may be
confounders in your population
Evaluate by comparing the distribution of the factor across both exposure and
outcome
o Confounding in regression
“informal rule”
Alternate method to identify confounders in multiple regression
models
Compare how much the estimate for your exposure changes
when using
o A model with the confounder
o A model without the confounder
o Example: slides 22-24
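The comparison of models with and without the confounder can be expressed as a percent change in the exposure estimate. The notes do not give a cutoff. The 10% threshold below is a commonly used convention and is an assumption here, as are the two OR values.

```python
def percent_change(crude, adjusted):
    """Percent change in the exposure estimate after adding the confounder."""
    return abs(crude - adjusted) / adjusted * 100

# Hypothetical estimates for the exposure from two regression models.
crude_or = 2.5     # model without the potential confounder
adjusted_or = 1.8  # model with the potential confounder

CHANGE_THRESHOLD = 10.0  # assumed convention, not stated in the notes
change = percent_change(crude_or, adjusted_or)
flag_as_confounder = change > CHANGE_THRESHOLD

print(round(change, 1))    # 38.9
print(flag_as_confounder)  # True
```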
o Identifying confounders using only your data is not suggested, because your data
can still be influenced by other variables or flaws in data collection, data analysis,
etc.
It is critical to use other methods as well
Prior (external) information
Evaluation of study design and conduct
DAGs—Directed Acyclic Graphs
Type of causal diagram
o Directed—indicated direction of effect
o Acyclic—no path creates a circular loop
o Graph—created in graphical form
o Method used to select confounders
Solutions for confounding
o At design stage
Restriction (eligibility criteria)
Matching (matched case-control study)
Individual and group matching
Randomization (randomized trial)
o At analysis stage
Direct or indirect adjustments
Age adjusted rates
o Adjusted rates are relative indexes rather than actual
measures of risk
SMRs
Example: slides 31-34
Stratified analysis
Or divide your population on different values of the confounder
and analyze within each subgroup
Interpretation for stratified analysis: slides 36-41
o Can be done with RR, OR, or even PR (any measure of
association)
Mantel-Haenszel Pooled Estimates
Slide 42
Regression analysis
Adjusted models = adjust for confounders
Slide 43
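Stratified analysis and the Mantel-Haenszel pooled estimate can be sketched with two hypothetical strata of a confounder. The pooled OR uses the standard Mantel-Haenszel formula, sum(a x d / n) / sum(b x c / n). The counts are invented so that the crude OR suggests an association that disappears within each stratum.

```python
def odds_ratio(a, b, c, d):
    """a: exposed diseased, b: exposed not diseased,
    c: unexposed diseased, d: unexposed not diseased."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Pooled OR across strata: sum(a*d/n) / sum(b*c/n)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical strata of the confounder, each as (a, b, c, d).
strata = [(5, 95, 20, 380),   # low-risk stratum, few exposed
          (80, 320, 20, 80)]  # high-risk stratum, mostly exposed

crude_table = [sum(cells) for cells in zip(*strata)]  # collapse over strata
crude_or = odds_ratio(*crude_table)
stratum_ors = [odds_ratio(*s) for s in strata]
mh_or = mantel_haenszel_or(strata)

print(round(crude_or, 2))  # 2.36
print(stratum_ors)         # [1.0, 1.0]
print(mh_or)               # 1.0
```

Here the crude OR (2.36) is farther from 1 than the adjusted OR (1.0), so the example shows positive confounding.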
Types of confounding
o Induced
It is possible to create (induce) confounding where it did not exist
previously
In selection of participants
o Selection is based on criteria associated with exposure
(cohort) or outcome (case-control)
In matching of data
o Matching subjects in a case-control study on a confounder
can make this a confounder even if it is not a risk factor for
disease
In analysis
o Controlling for a variable which is not a confounder may
inadvertently create confounding elsewhere
o Residual
When strata of a variable are broad, there may be confounding within
the strata
Confounding remaining due to inaccurately measured confounding factor
Lack of adjustment for factors that are confounders
o Unknown
Known knowns
Confounders that you are aware of and have been accounted for
by design or analysis
Known unknowns:
Confounders that you are aware of but were not able to account
for
Unknown unknowns:
Confounders that you are unaware of and were not able to
account for
Bias
Bias is any systematic error that results in a mistaken estimate of an exposure's effect on
the risk of disease
o Can occur in design, conduct, or analysis
o Affects internal validity
Types of error
o Random
Statistical variation
Confidence interval
o Systematic (Bias)
Main types of bias:
Selection Bias
o Systematic errors in selecting study participants
o Distorts the relationship between exposure and outcome
Information Bias
o Systematic errors in collecting information
o Mistakes in exposure, disease status
o Distorts the relationship between exposure and outcome
Selection Bias
o Occurs during the process used to recruit and enroll participants and results in a
distorted relationship between the exposure and outcome
Examples of this can be seen in a
Cohort study: exposed vs. Non-exposed
Case-control study: diseased vs. Non-diseased
o Types of selection bias
Response bias
Differential loss to follow-up
Participation in the study is related to exposure (cohort) or
disease (case-control)
o At enrolment (agreement/refusal)
o During follow-up (response/ non-response)
If subjects in a particular exposure-disease group are more likely
or less likely to participate than other subjects, the observed
measure of association will be biased
Example on slides: 25-29
Exclusion bias
Systematic difference in eligibility criteria of cases/controls
(exposed/ unexposed) that is related to exposure (or disease)
Important to ensure that the only difference in eligibility of cases
and control is the disease status
Example on slide 31
Berkson's bias
Applied to hospital-based case-control studies
Systematic difference in case/control selection
o Occurs when combination of exposure and disease
increase the risk of admission to hospital
Example on slide 32: coffee and pancreatic cancer
Neyman's bias
Also known as incidence-prevalence bias
Exposures are related to survival or to disease status
o Especially when incidence may precede diagnosis
Case-control example: on slide 33
Cross-sectional example: on slide 33
Typically leads to an underestimation of odds ratio (OR)
Surveillance/ diagnosis bias
Selection bias in case-control studies
Individuals with known risk factors are more likely to be diagnosed
with disease due to increased medical surveillance
o Those with exposure are more likely to have identified
disease (especially subclinical)
Individuals with a family history of cancer may be more likely to
have a cancer screening test
Diabetics are more frequently screened for development of
hypertension
Generalizability vs. Selection bias
Remember generalizability relates to the external validity of a
study and the study populations.
o Is the study population similar to the reference
population?
Population at risk?
Selection bias impacts the internal validity of a study
o This happens when there is a difference in how the groups
for the study population are selected, which results in
biased risk estimates.
o When internal validity is threatened or compromised,
external validity is also compromised, which decreases the
generalizability to the reference population.
Information bias
o Systematic difference in the way the information on exposure or disease is
obtained from study groups
o Results in participants being incorrectly classified as either exposed or
unexposed / diseased or not diseased
Misclassification of exposure or disease
Information bias results from Misclassification
Misclassification of exposure
o Exposed as unexposed
o Unexposed as exposed
Misclassification of outcome
o Diseased as non-diseased
o Non-diseased as diseased
Differential vs. Non-differential
o Non-differential misclassification
Error in assessing exposure (or disease) is similar
between comparison groups
Measure of effect tends toward the null
o Differential misclassification
Error in assessing exposure (or disease) differs in
comparison groups
May increase or decrease measure of effect
Example on slides: 45-50
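The effect of misclassification on the OR can be explored with expected counts. In this sketch the true case-control counts and the sensitivity and specificity of the exposure measurement are all hypothetical. Non-differential error uses the same values for cases and controls, while the differential scenario assumes cases recall exposure more completely than controls.

```python
def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected numbers classified as exposed/unexposed under imperfect measurement."""
    obs_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    obs_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return obs_exposed, obs_unexposed

def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)

# True exposure status: 60/40 among cases, 30/70 among controls.
true_or = odds_ratio(60, 40, 30, 70)

# Non-differential: same error in both groups (sensitivity 0.8, specificity 0.9).
nondiff_or = odds_ratio(*observed_counts(60, 40, 0.8, 0.9),
                        *observed_counts(30, 70, 0.8, 0.9))

# Differential: cases report exposure better than controls (0.95 vs 0.70).
diff_or = odds_ratio(*observed_counts(60, 40, 0.95, 1.0),
                     *observed_counts(30, 70, 0.70, 1.0))

print(true_or)               # 3.5
print(round(nondiff_or, 2))  # 2.41
print(round(diff_or, 2))     # 4.99
```

The non-differential error pulls the OR toward the null (3.5 to 2.41). This particular differential error pushes it away from the null, but other differential patterns can move it in either direction.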
o Occurs after subjects have entered the study
o Types of information bias
Recall bias
Of particular concern in case-control studies
Systematic error due to differences in accuracy or completeness
of recall (memory)
o Cases tend to recall exposure more than controls
o Results in an overestimate of the OR
More common with exposures that are
o Involuntary
o Not associated with social stigma
Example: Home pesticides use and birth defects
Reporting bias
Of particular concern in case-control studies and cross-sectional
studies
Exposures that are associated with a stigma are likely to be
underreported
o Attitudes, perceptions, or beliefs about exposure
o Exposures that are not socially acceptable
o Results in an underestimation of the OR
Example: Alcohol consumption and fetal alcohol syndrome
Interviewer bias
Systematic error due to interviewers' subconscious or conscious
data gathering
o Might more thoroughly question cases
o Usually a problem in case-control studies
o Generally, more of a problem when data gathered is
subjective
Similar biases
o Observer bias
Seen in RCT
Corrected for with double blinding
o Responder (interviewee) bias
Corrected for with blinding of participants
Surveillance (diagnosis) bias
Information bias in a cohort study
When exposed are more likely to be under medical surveillance
o Disease is more likely to be diagnosed
o Especially when subclinical
Solutions to Bias
o Avoid bias by
Implementing clearly thought-out inclusion/exclusion criteria
Minimizing loss to follow-up
Retention methods and tracing
Applying the same methods for assessing exposure (or disease) for all
participants
Training (and retraining) the data collection personnel
Improved reliability
Blinding (masking) interviewers and study participants
Using a control group of diseased individuals—or multiple control groups
Measurement methods
Are methods valid and reliable?
o Validity—accuracy
Indicates how close a measurement is to the truth,
or the true state of nature.
It is assessed with
Sensitivity
o The probability of a positive test
given that the person truly has
disease as determined by a "gold
standard" test
o Example on slide 24 (screening test)
Specificity
o The probability of a negative test
given that the person truly does not
have disease as determined by a
"gold standard" test
o Example on slide 25 (screening test)
Positive predictive value
Negative predictive value
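The four validity measures can be computed from a 2x2 table of test result against the gold standard. The notes define only sensitivity and specificity, so the predictive values below use their standard definitions. The counts are hypothetical.

```python
def validity_measures(tp, fn, fp, tn):
    """tp/fn: diseased who test positive/negative,
    fp/tn: non-diseased who test positive/negative."""
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | disease)
        "specificity": tn / (tn + fp),  # P(test negative | no disease)
        "ppv": tp / (tp + fp),          # P(disease | test positive)
        "npv": tn / (tn + fn),          # P(no disease | test negative)
    }

# Hypothetical screening of 1,000 people, 100 of whom truly have the disease.
measures = validity_measures(tp=90, fn=10, fp=50, tn=850)
for name, value in measures.items():
    print(name, round(value, 3))
# sensitivity 0.9
# specificity 0.944
# ppv 0.643
# npv 0.988
```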
o Reliability—Repeatability
Agreement between multiple measurements on
the same sample
Intra-observer (intra-subject) variation
Inter-observer variation
Assessed with:
Percent agreement
Kappa statistic
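Percent agreement and kappa can be sketched for two observers rating the same samples as positive or negative. This uses Cohen's kappa for two raters, which is an assumption about which kappa the notes mean. The counts are hypothetical.

```python
def percent_agreement(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Share of samples on which the two observers agree."""
    total = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    return (both_pos + both_neg) / total

def cohen_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    total = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = percent_agreement(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg)
    a_pos = (both_pos + a_pos_b_neg) / total  # observer A's positive rate
    b_pos = (both_pos + a_neg_b_pos) / total  # observer B's positive rate
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)

# Hypothetical 100 samples read by observers A and B.
counts = (40, 10, 5, 45)
print(round(percent_agreement(*counts), 2))  # 0.85
print(round(cohen_kappa(*counts), 2))        # 0.7
```

Kappa is lower than percent agreement because half of the observed agreement in this example would be expected by chance alone.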
Consider using multiple measurements
o Sequential testing
Sensitivity reduced
High specificity method
Example on slide 40 (screening test)
o Simultaneous testing
High sensitivity method
Specificity reduced
Example on slide 45 (screening test)
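The trade-off between sequential and simultaneous testing can be shown with net sensitivity and specificity. The sketch assumes the two tests err independently, that sequential testing calls a person positive only if both tests are positive, and that simultaneous testing calls a person positive if either test is positive. The test characteristics are hypothetical.

```python
def sequential(se1, sp1, se2, sp2):
    """Positive only if both tests are positive."""
    net_sensitivity = se1 * se2
    net_specificity = sp1 + sp2 - sp1 * sp2
    return net_sensitivity, net_specificity

def simultaneous(se1, sp1, se2, sp2):
    """Positive if either test is positive."""
    net_sensitivity = se1 + se2 - se1 * se2
    net_specificity = sp1 * sp2
    return net_sensitivity, net_specificity

# Hypothetical tests: test 1 (Se 0.8, Sp 0.9), test 2 (Se 0.9, Sp 0.8).
seq = sequential(0.8, 0.9, 0.9, 0.8)
sim = simultaneous(0.8, 0.9, 0.9, 0.8)

print([round(x, 2) for x in seq])  # [0.72, 0.98]
print([round(x, 2) for x in sim])  # [0.98, 0.72]
```

Sequential testing gives up sensitivity to gain specificity, and simultaneous testing does the reverse, matching the pattern listed above.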
Regular calibration of instruments
o This improves the reliability
o Assess whether bias exists by
Comparing participants to nonparticipants
Responders and non-responders
Retained to those lost to follow-up
Analyzing data by potential sources of bias
Interviewer
Laboratory batch
Exposure assessment method (if varied)
Outcome assessment method (if varied)
Secondary control group
o Eliminate bias when possible
Frequently, this is not possible because extent/direction of bias is often
unknown
If extent of bias is known:
Selection bias –make groups comparable
Information bias –correct error in exposure /disease assessment
Examples of bias slides: 58-64
Interaction