October 2015
Introduction to observational epidemiology
case-control studies and prospective cohort studies follow these to try to clarify associations
and establish causality
large scale prospective studies have established the most reliable evidence for many
associations in NCDs
ongoing prospective studies include biological samples for genetic & metabolomic analyses,
and often combine with other cohort studies in collaborative meta-analyses to increase power
• Randomised trials in NCDs difficult (think of smoking exposure and heart disease; UV
exposure and skin cancer)
ecological study
case-control study
This is done by comparing a group of patients who have the disease or condition (cases) with a group of
people who do not have it (controls) but who are otherwise as similar as possible (in characteristics
thought to be unrelated to the causes of the disease or condition).
This means the researcher can look for aspects of their lives that differ to see if they may cause the
condition.
For example, a group of people with lung cancer might be compared with a group of people of the same
age who do not have lung cancer. The researcher could compare how long both groups had been
exposed to tobacco smoke.
Case-control studies are retrospective because they look back in time from the outcome to the possible
causes of a disease or condition.
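As a minimal sketch of how such a comparison is usually summarised, the Python below computes an odds ratio from a 2x2 table of cases and controls. The counts and the function name `odds_ratio` are hypothetical and chosen only for illustration; they do not come from any real study.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Hypothetical counts: 100 lung cancer cases and 100 controls of the same age.
# "Exposed" here means long-term exposure to tobacco smoke.
example_or = odds_ratio(
    exposed_cases=80, unexposed_cases=20,
    exposed_controls=40, unexposed_controls=60,
)
print(f"Odds ratio = {example_or:.1f}")
```

An odds ratio above 1 would suggest the exposure is more common among cases than controls; it says nothing by itself about bias or confounding.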
The 95% confidence interval is constructed so that, if the study were repeated many times, about 95% of
the intervals obtained would contain the true value of the association, assuming the study is not biased.
The wider the confidence interval, the less certain we can be that we have precisely estimated the strength of
the association.
The width of the confidence interval is determined by the number of participants with the outcome of
interest, which in turn is determined by the sample size, so larger studies produce narrower CIs and greater
precision.
We can obtain a narrow confidence interval and yet have inaccurate results if our study is biased.
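The link between study size and interval width can be sketched in Python. The example below uses the Woolf (log) approximation for the 95% CI of an odds ratio, which is one common method and is an assumption here, not something the slides specify. The function name `odds_ratio_ci` and all counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Woolf (log) method.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


small = odds_ratio_ci(8, 12, 4, 16)          # hypothetical study of 40 people
large = odds_ratio_ci(800, 1200, 400, 1600)  # same proportions, 100 times larger

for label, (or_, lower, upper) in (("small", small), ("large", large)):
    print(f"{label}: OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both studies give the same point estimate, but the larger one gives a much narrower interval. Neither interval reflects bias: a biased study of any size would simply be precise about the wrong value.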
Cohort study
For example, the EPIC-Oxford cohort study recruited vegetarians and non-vegetarians and followed them
over a number of years to compare disease rates between these groups
Look at these abstracts of published research papers. Some of them may have some details
blocked out.
Sort the abstracts into piles of different types of study e.g. prospective cohort study, cross-
sectional study.
If you identify an interventional study, keep that separate from the observational studies.
Work in small groups: task 1 (10 minutes)
Look at the print-outs we provide of the published research papers. Answer the following
questions about each:
2. PECO:
- What is the population group being investigated?
- What is the exposure (factor of interest) being investigated?
- What is the comparison group being used?
- What is the outcome being investigated?
3. Why do you think this was the study type used? The authors might specify this, or you
might be left to deduce this from the research topic, setting or other features of the study.
Measurement error, bias and confounding
• What these are
• Some examples
• Variability in human populations means there is always some error, or ‘noise’, in gathering
information on the exposure, outcome, and any covariable information.
• Random error can be reduced by large study sizes: bigger studies give more precise estimates
(narrower confidence intervals)
A small study – wide confidence intervals – low precision
Bias
Bias = systematic error
Bias is an error in measuring or collecting information that differs systematically between groups of
participants. Bias can result from the design, conduct, or analysis of a study
• Even when you have reduced random error (increased precision) by having a large study, bias can still
affect your results (as well as confounding and other problems).
• There are many types of bias. Some can be avoided by good study design. Always look out for biases,
even in randomised studies.
Bias in measuring body size – a paper looking at this.
Body size and mortality association – the impact of
bias in reporting height and weight
• Look at this association of BMI and mortality
• Now look at this information on how individuals tend to report their height and weight:
Individuals under-report their BMI (by underestimating weight and overestimating height), unless
their BMI is low
• If these biases are present in the BMI-mortality study, what might the impact be on the reported
results?
• Now let’s look at some more papers and find our own examples
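To make the height and weight reporting bias concrete, the sketch below compares a measured BMI with a self-reported one. The size of the reporting errors (3 kg and 2 cm) and the individual's measurements are hypothetical values chosen for illustration only; the cut-off of 30 kg/m² for obesity is the usual WHO threshold.

```python
def bmi(weight_kg, height_m):
    """Body mass index = weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


measured = bmi(weight_kg=93.0, height_m=1.75)
# Hypothetical reporting errors: weight under-reported by 3 kg,
# height over-reported by 2 cm.
self_reported = bmi(weight_kg=90.0, height_m=1.77)

OBESITY_CUTOFF = 30.0  # kg/m^2
print(f"Measured BMI:      {measured:.1f} (obese: {measured >= OBESITY_CUTOFF})")
print(f"Self-reported BMI: {self_reported:.1f} (obese: {self_reported >= OBESITY_CUTOFF})")
```

In this illustration the same person is classified as obese when measured but not when self-reporting, so some high-BMI participants would be analysed in a lower BMI category.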
Types of bias
Selection bias
- Example: Participants in a prospective study differ systematically from people who do not participate in the prospective study.
- Example in context: Participants in EPIC-Oxford have lower mortality rates than the general UK population.
- Potential impact on observed association:
From NICE
https://www.nice.org.uk/glossary
Confounding
Confounding factor = a characteristic that can cause or prevent the disease and is also associated with the
exposure of interest.
An observational study aims to identify the effect of an exposure. Sometimes the apparent association
with an exposure is actually an association with another factor which is associated with the exposure and
with the outcome.
This other factor is a confounder, provided that it is not an intermediate step between the exposure and
the outcome (e.g. high blood pressure is an intermediate step between obesity and CVD, rather than being
a confounder).
- an RCT uses randomisation to deal with both known and unknown factors that may differ between
exposed and unexposed groups
Example
• In your prospective study, you observe that saccharin intake is associated with increased risk of kidney
cancer. Obesity is established to increase risk of kidney cancer. Could obesity be a confounder?
It can be difficult to identify confounding. If a factor is an intermediate on a causal pathway (e.g. if obesity
causes raised blood pressure which then causes CVD) then adjusting for it will take away a genuine
association.
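One way to see confounding in the saccharin example is to compare the crude association with the association inside each level of the suspected confounder. The sketch below does this with entirely hypothetical counts, constructed so that obese participants both consume more saccharin and have higher kidney cancer risk. The function name `risk_ratio` and the data layout are illustrative assumptions.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Hypothetical cohort, stratified by obesity:
# (cases among saccharin users, saccharin users, cases among non-users, non-users)
strata = {
    "obese": (32, 800, 8, 200),
    "not obese": (2, 200, 8, 800),
}

# Crude analysis: add the strata together and ignore obesity.
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: estimate the association within each obesity group.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"Crude RR: {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"RR among {name}: {rr:.2f}")
```

With these invented numbers the crude risk ratio is well above 1, yet within each obesity group saccharin users and non-users have the same risk, so the crude association is produced entirely by obesity.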
• Now let’s look at some papers and find our own examples
Illustration of confounding
Effect modification
Illustration of effect modification
Approaches to deal with confounding
Study design
• Randomisation to deal with known and unknown confounders
In the analysis
• Multivariable analysis
Reverse causation
• The observed association will be inaccurate, sometimes completely in the wrong direction.
Example
We might observe in a case-control study of pancreatic cancer risk that participants with lower
body weight had higher odds of pancreatic cancer than those with higher body weight.
However, pancreatic cancer can cause weight loss, often before cancer is diagnosed.
To avoid this reverse causation bias, we might perform a prospective study in which BMI is recorded
at enrolment and the association with pancreatic cancer risk examined several years later. This may
reveal that those with higher BMI had increased risk of pancreatic cancer compared with those with
lower BMI.
Looking out for measurement error, bias and
confounding
• When designing observational studies, plan strategies to reduce errors and biases
• RCTs aim to eradicate bias but are still subject to many biases; see further
reading
We are trying to develop an awareness of the error, bias, confounding within observational
studies.
Can you identify any problems with this study? Some examples might be:
- Study seems too small – confidence intervals are very wide
- The reported associations seem implausible biologically
- What is reported in the abstract is not representative of what the full results section
shows
- the wrong type of study design was used to try to answer this research question
- The study does not mention the possibility of confounding
Biases we found
• Rule out random error
• Rule out bias
• Deal with confounding
Valid association
sequence in time
strength of association
specificity of association
biological plausibility
experimental evidence
Measures of disease frequency
• Difference between rate and risk
• OR, HR
Questions to consider
• Can epidemiological studies be wrong?
Risk: probability that an event will occur = no. of new cases in time period / no. of disease-free
participants at start of time period
Rate: no. of new cases in a defined time period / total person-time at risk
Relative risk (RR): risk in exposed group divided by risk in unexposed group
Odds ratio (OR): odds in exposed group divided by odds in unexposed group
Absolute risk reduction / risk difference: risk in exposed group minus risk in unexposed group
Reliability: the ability to obtain the same result each time a study is repeated with a different
population or group.
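The relative risk, odds ratio and risk difference defined above can all be computed from the same cohort counts. The sketch below is a minimal illustration; the function name `measures` and the counts are hypothetical.

```python
def measures(cases_exp, n_exp, cases_unexp, n_unexp):
    """Relative risk, odds ratio and risk difference from cohort counts."""
    risk_exp = cases_exp / n_exp
    risk_unexp = cases_unexp / n_unexp
    odds_exp = cases_exp / (n_exp - cases_exp)
    odds_unexp = cases_unexp / (n_unexp - cases_unexp)
    return {
        "relative_risk": risk_exp / risk_unexp,
        "odds_ratio": odds_exp / odds_unexp,
        "risk_difference": risk_exp - risk_unexp,
    }


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the outcome.
result = measures(cases_exp=30, n_exp=100, cases_unexp=10, n_unexp=100)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the outcome is common in this invented cohort, the odds ratio is noticeably further from 1 than the relative risk; the two are close only when the outcome is rare.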
Confidence interval
The confidence interval (CI) indicates how certain we are about an estimate: the CI gives a range
that is likely to include the 'true' value for the population.
Typically the 95% CI is reported, indicating that the range of values has a 95 in 100 chance of
including the 'true' value.
A wide confidence interval indicates a lack of certainty about the true estimate, often due to a small
sample size. A narrow confidence interval indicates a more precise estimate, e.g. when a large sample
size is used.
P values
p value = the probability of obtaining results at least as extreme as those observed, if the null hypothesis were true.
The null hypothesis, set up to be disproved, usually states that an exposure has no effect on the outcome.
If you perform 20 tests, you would expect 1 of these (on average) to produce a “significant” p value of P<0.05, purely by
chance. When a study reports a large number of p values, some of these will appear statistically significant just by chance.
The p value needs to be interpreted alongside the size of the observed association and its confidence interval, the biological
plausibility of such an association, and what you know of the study’s design, conduct and analysis.
Looking at a study’s results, some associations may be statistically significant but have no relevance.
Some associations may be important but fail to achieve a statistically significant p value in that study.
The p value depends on the sample size. Large studies can produce “significant” p values even for very small associations that have little practical relevance.
P values must be interpreted with caution, clinical knowledge and common sense.
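The point about performing 20 tests can be checked with a short calculation. The sketch below assumes the tests are independent and that every null hypothesis is true; the function name `prob_at_least_one_false_positive` is hypothetical.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more 'significant' results when all nulls are true.

    Assumes the tests are independent of each other.
    """
    return 1 - (1 - alpha) ** n_tests


n_tests = 20
alpha = 0.05
expected_false_positives = n_tests * alpha
p_any = prob_at_least_one_false_positive(n_tests, alpha)

print(f"Expected false positives in {n_tests} tests: {expected_false_positives:.0f}")
print(f"Probability of at least one: {p_any:.2f}")
```

Under these assumptions, a study reporting 20 p values is more likely than not to show at least one "significant" result even when no real association exists.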
Risk
The probability that an event will occur
• Relates the number of new cases to the size of the population at risk in the beginning of the time
period studied
= Number of new cases in a defined time period/Number of disease free people at the beginning
of the time period
Rate
= Number of new cases in a defined time period/Total person-time at risk during the follow-up period
Person-time is a measure that takes into account changes in the size of the population at risk during
the follow-up period
For each study participant, time at risk is the time from enrolment until the earliest of:
a) they develop the outcome
b) they become lost to follow-up
c) the study ends
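The difference between risk and rate can be sketched with a handful of follow-up records. The five participants below are hypothetical, and the risk calculation is a simple illustration that does not adjust for loss to follow-up.

```python
# Hypothetical follow-up records: (years at risk, developed the outcome?)
participants = [
    (5.0, False),  # followed until the study ended
    (2.0, True),   # developed the outcome after 2 years
    (3.5, False),  # lost to follow-up after 3.5 years
    (5.0, False),  # followed until the study ended
    (4.5, True),   # developed the outcome after 4.5 years
]

new_cases = sum(1 for _, had_outcome in participants if had_outcome)
person_years = sum(years for years, _ in participants)

risk = new_cases / len(participants)  # proportion of people over the study period
rate = new_cases / person_years       # new cases per person-year at risk

print(f"Risk: {risk:.2f} over the study period")
print(f"Rate: {rate:.2f} per person-year")
```

The risk uses people in the denominator, while the rate uses the time each person actually contributed, so participants who leave early count for less.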
Non-differential (random) misclassification occurs when misclassification of disease status or
exposure occurs equally in all study groups being compared.
Non-differential misclassification increases the similarity between the exposed and non-exposed
groups, and may result in an underestimate (dilution) of the true strength of an association
between exposure and disease.
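The dilution effect can be illustrated by applying the same imperfect exposure measurement to cases and controls. In the sketch below the true counts, the sensitivity (0.8) and the specificity (0.9) are all hypothetical, and the functions `misclassify` and `odds_ratio` are illustrative helpers that work with expected counts.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts after imperfect exposure classification."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    """Odds of exposure among cases divided by odds among controls."""
    return (case_exp / case_unexp) / (control_exp / control_unexp)


# Hypothetical true exposure: cases 80 exposed / 20 unexposed,
# controls 40 exposed / 60 unexposed.
true_or = odds_ratio(80, 20, 40, 60)

# The same measurement error is applied to cases and controls (non-differential).
obs_cases = misclassify(80, 20, sensitivity=0.8, specificity=0.9)
obs_controls = misclassify(40, 60, sensitivity=0.8, specificity=0.9)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"True OR: {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

In this example the observed odds ratio lies between 1 and the true odds ratio, which is the underestimate described above for a two-level exposure.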
Standardised mortality ratio (SMR)
Standardisation is used when comparing mortality in two population groups
with different demographic structures, to remove the effect of differences in age (or other confounding variables
that affect mortality rate) between the population groups.
Standardisation can be either direct, giving an age-standardised mortality rate, or indirect, producing a
standardised mortality ratio (SMR).
The standardised mortality ratio is the ratio of observed deaths in the study group to the deaths that would be
expected if the study group had the same age-specific mortality rates as the general population.
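Indirect standardisation can be sketched as follows: apply the general population's age-specific death rates to the study group's person-years to get expected deaths, then divide observed by expected. The age bands, rates, person-years and observed deaths below are hypothetical.

```python
# Hypothetical age bands:
# (general population death rate per person-year, study group person-years)
age_bands = {
    "40-59": (0.002, 10_000),
    "60-79": (0.010, 5_000),
    "80+": (0.050, 1_000),
}
observed_deaths = 90  # hypothetical deaths observed in the study group

# Deaths expected if the study group had the general population's rates.
expected_deaths = sum(
    rate * person_years for rate, person_years in age_bands.values()
)
smr = observed_deaths / expected_deaths

print(f"Expected deaths: {expected_deaths:.0f}")
print(f"SMR: {smr:.2f}")
```

An SMR below 1 would indicate fewer deaths in the study group than expected from general population rates, after allowing for its age structure.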
16 million NCD deaths occur before the age of 70; 82% of these "premature" deaths occur in LMICs
By 2020 NCDs are predicted to cause 70% of deaths in developing regions, compared with <50% currently
Tobacco use, physical inactivity, the harmful use of alcohol and unhealthy diets all increase the risk of
dying from an NCD.
http://www.who.int/mediacentre/factsheets/fs355/en/
http://www.who.int/nmh/publications/ncd-profiles-2014/en/
http://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2812%2961766-8/abstract
In 1997
http://www.who.int/gho/ncd/mortality_morbidity/en/
http://www.who.int/mediacentre/factsheets/fs310/en/
apps.who.int/iris/bitstream/10665/112738/1/9789240692671_eng.pdf