Epi 5,6

COHORT STUDIES
The general idea of a cohort study is that a group of persons who do
not have a disease is identified and defined on the basis of different
exposures.
Can measure multiple exposures
These persons are then followed, and the occurrence of disease is
measured in the population over a period of time.
Can measure multiple diseases
Experimental studies are a form of cohort study
Persons are free of disease at outset
Some are exposed, others not
Measure occurrence of disease/cure/etc over time
But the term cohort study is usually reserved for observational
studies, i.e., exposures are not assigned but occur naturally or
are chosen purposely by subjects, by their physicians, etc.
The effects of most risk factors in humans cannot be
studied with experimental studies. Consider some of the
risk questions that concern us today:
Are inactive people at increased risk for cardiovascular
disease, everything else being equal?
Do cellular phones cause brain cancer?
Does obesity increase the risk of cancer?
For such questions, it is usually not possible to conduct
an experiment.
First, it would be unethical to impose possible risk
factors on a group of healthy people for the purposes of
scientific research.
Second, most people would balk at having their diets
and behaviors constrained by others for long periods of
time.
Finally, the experiment would have to go on for many
years, which is difficult and expensive.
As a result, it is usually necessary to study risk in less obtrusive
ways.
Clinical studies in which the researcher gathers data by simply
observing events as they happen, without playing an active part
in what takes place, are called observational studies.
Most studies of risk are observational studies and are either
cohort studies, described in the rest of this chapter, or
case-control studies.
Total population can be studied. Include children, the elderly,
the mentally incompetent, intensive care patients, and people with early,
minimal, or advanced disease (all usually excluded from RCTs, especially
pharmaceutical trials)
Findings more likely to be applicable in real world
Adverse effects of interventions will be much more accurately measured
Population based estimates of exposure effects can be made
This implies that in cohort studies you MUST include as wide a
spectrum of patients as possible
Selection bias – Persons who get exposed not same as unexposed
Surgery – who is ‘operable’ vs ‘inoperable’
Smoking – not the only difference
Healthy worker effect
Exposures that seem same, may not be
Also potential bias in measuring
Drop-outs – reduce power, may bias (a lot)
Outcome assessment can be biased
Cohort is used to describe a group of people who have
something in common when they are first assembled and who
are then observed for a period of time to see what happens to
them.
Whatever members of a cohort have in common, observations of
them should fulfill three criteria if the observations are to
provide sound information about risk of disease.
1. They do not have the disease (or outcome) in question at the time they
are assembled.
2. They should be observed over a meaningful period of time in the
natural history of the disease in question so that there will be sufficient
time for the risk to be expressed. For example, if one wanted to learn
whether neck irradiation during childhood results in thyroid neoplasms,
a 5-year follow-up would not be a fair test of this hypothesis, because the
usual time period between radiation exposure and the onset of disease is
considerably longer.
3. All members of the cohort should be observed over the full period of
follow-up or methods must be used to account for dropouts. To the
extent that people drop out of the study and their reasons for dropping
out are related in some way to the outcome, the information provided by
an incomplete cohort can misrepresent the true state of affairs.
Cohort Studies
A group of people (a cohort) is assembled, none of whom has
experienced the outcome of interest, but all of whom could
experience it.
(For example, in a study of risk factors for endometrial cancer, each
member of the cohort should have an intact uterus.)
Upon entry into the study, people in the cohort are classified
according to those characteristics (possible risk factors) that might be
related to outcome. For each possible risk factor, members of the
cohort are classified either as exposed (i.e., possessing the factor in
question, such as hypertension) or unexposed.
All the members of the cohort are then observed over time to see which of
them experience the outcome, say, cardiovascular disease, and the rates of the
outcome events are compared in the exposed and unexposed groups.
It is then possible to see whether potential risk factors are related to
subsequent outcome events.
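The classify-then-compare step described above can be sketched in a few lines. This is only an illustration: the `Member` record, the `incidence` helper, and the cohort counts (1,000 exposed with 30 events, 4,000 unexposed with 40 events) are hypothetical and do not come from any study in these notes.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """One cohort member (hypothetical record layout)."""
    exposed: bool            # classified at entry, e.g. hypertension present
    developed_outcome: bool  # observed during follow-up


def incidence(members, exposed):
    """Proportion of the chosen exposure group that developed the outcome."""
    group = [m for m in members if m.exposed == exposed]
    cases = sum(m.developed_outcome for m in group)
    return cases / len(group)


# Hypothetical cohort, everyone free of the outcome at entry.
cohort = (
    [Member(True, True)] * 30 + [Member(True, False)] * 970
    + [Member(False, True)] * 40 + [Member(False, False)] * 3960
)

incidence_exposed = incidence(cohort, exposed=True)
incidence_unexposed = incidence(cohort, exposed=False)

print(f"Incidence in exposed:   {incidence_exposed:.3f}")
print(f"Incidence in unexposed: {incidence_unexposed:.3f}")
```

The sketch assumes every member is followed for the whole period; it does nothing to account for dropouts.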
Other names for cohort studies are longitudinal studies, which emphasize that
patients are followed over time; prospective studies, which imply the forward
direction in which the patients are pursued; and incidence studies, which
call attention to the basic measure of new disease events over time.
Subjects without disease are enrolled and then followed over
time to determine occurrence (incidence) of diseases (outcomes)
Exposures are usually measured directly at baseline, and may be
measured concurrently with outcomes
The cohort can be assembled in the present and
followed into the future
Exposure is defined based upon a past single event (e.g.,
Hiroshima survivors) or period of exposure (e.g., worked in gas
mask factory 1940-45)
Outcomes may be ascertained directly, or may have already occurred
The cohort can be identified from past records and
followed forward from that time up to the present.
Another method using computerized medical databases in cohort studies is the
case-cohort design. Conceptually, it is a modification of the retrospective
cohort design that takes advantage of the ability to determine the frequency of a
given medical condition in a large group of people. In a case-cohort study, all
exposed people in a cohort, but only a small random sample of unexposed
people are included in the study and followed for some outcome of interest.
For efficiency, the group of unexposed people is “enriched” with all those who
subsequently suffer the outcome of interest (i.e., become cases). The results are
then adjusted to reflect the sampling fractions used to obtain the sample. This
efficient approach to a cohort study requires that frequencies of outcomes be
determined in the entire group of unexposed people; thus, the need for a large,
computerized, medical database.
The basic expression of risk is incidence, which is defined in Chapter
2 as the number of new cases of disease arising during a given period
of time in a defined population that is initially free of the condition. In
cohort studies, the incidence of disease is compared in two or more
groups that differ in exposure to a possible risk factor.
To compare risks, several measures of the association between
exposure and disease, called measures of effect, are commonly used.
These measures represent different concepts of risk, elicit different
impressions of the magnitude of a risk, and are used for different
purposes.
Absolute Risk = What is the incidence of disease in a group
initially free of the condition?
Absolute risk = # of new cases over a given time period / # of people in group
Attributable Risk = What is the incidence of disease
attributable to exposure?
AR = Incidence in E − Incidence in Non E
Relative Risk = How many times more likely are exposed
persons to become diseased, relative to non-exposed persons?
RR = Incidence in E / Incidence in Non E
Population-attributable risk = What is the incidence of disease
in a population, associated with the prevalence of a risk
factor?
ARP = AR × Prevalence of exposure to a risk factor
Population-attributable fraction = What fraction of the disease
in a population is attributable to exposure to a risk factor?
AFP = ARP / Total incidence of disease in a population
Simple Risks
Death rate (absolute risk/incidence) from lung cancer in
smokers is 341.3/100,000/yr
Death rate (absolute risk/incidence) from lung cancer in non-
smokers is 14.7/100,000/yr
Prevalence of cigarette smoking 32.1%
Lung cancer mortality rate in population 119.4/100,000/yr
Absolute risk is the probability of an event in a population under
study. Its value is the same as that for incidence, and the terms are
often used interchangeably.
Absolute risk is the best way for individual patients and
clinicians to understand how risk factors may affect their
lives.
One might ask, “What is the additional risk (incidence) of
disease following exposure, over and above that experienced by
people who are not exposed?”
The answer is expressed as attributable risk, the absolute risk
(or incidence) of disease in exposed persons minus the absolute
risk in non-exposed persons.
Often, it is expressed as a percentage: the attributable risk
percent (AR%) is the AR divided by the incidence in exposed
persons, multiplied by 100.
“How many times more likely are exposed persons to get the
disease relative to non-exposed persons?”
The answer is the relative risk, or risk ratio: the ratio of
incidence in exposed persons to incidence in non-exposed
persons.
Because relative risk indicates the strength of the association
between exposure and disease, it is a useful measure of effect for
studies of disease etiology.
Population-attributable risk is the product of the attributable risk
and the prevalence of exposure to the risk factor in a population.
It measures the excess incidence of disease in a community that is
associated with a risk factor.
One can also describe the fraction of disease occurrence in a
population associated with a particular risk factor, the
population-attributable fraction.
It is obtained by dividing the population attributable risk by the
total incidence of disease in the population.
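As a worked example, the four measures can be computed from the lung cancer figures listed under "Simple Risks". This is a minimal sketch: the variable names are chosen here for illustration, and the formulas are the AR, RR, ARP and AFP definitions given in these notes.

```python
# Figures from the "Simple Risks" list: lung cancer deaths per 100,000 per year.
incidence_exposed = 341.3      # smokers
incidence_unexposed = 14.7     # non-smokers
prevalence_exposure = 0.321    # 32.1% of the population smokes
incidence_population = 119.4   # whole population

# AR = Incidence in E - Incidence in Non E
attributable_risk = incidence_exposed - incidence_unexposed

# RR = Incidence in E / Incidence in Non E
relative_risk = incidence_exposed / incidence_unexposed

# ARP = AR x prevalence of exposure
population_attributable_risk = attributable_risk * prevalence_exposure

# AFP = ARP / total incidence in the population
population_attributable_fraction = (
    population_attributable_risk / incidence_population
)

print(f"AR  = {attributable_risk:.1f} per 100,000 per year")
print(f"RR  = {relative_risk:.1f}")
print(f"ARP = {population_attributable_risk:.1f} per 100,000 per year")
print(f"AFP = {population_attributable_fraction:.1%}")
```

With these inputs the attributable risk is about 326.6 per 100,000 per year, the relative risk about 23.2, the population-attributable risk about 104.8 per 100,000 per year, and the population-attributable fraction about 88%.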
Risk: From Disease to Exposure
• Cohort studies are a wonderfully logical and direct way of studying risk,
but they have practical limitations. Most chronic diseases take a long time
to develop. The latency period, the period of time between exposure to a
risk factor and the expression of its pathologic effects, is measured in
decades for most chronic diseases. For example, smoking precedes
coronary disease, lung cancer, and chronic bronchitis by 20 years or more,
and osteoporosis with fractures occurs in the elderly because of diet and
exercise patterns throughout life. Also, relatively few people in a cohort
develop the outcome of interest, even though it is necessary to measure
exposure in, and to follow-up, all members of the cohort. The result is
that cohort studies of risk require a lot of time and effort, not to mention
money, to get an answer. The inefficiency is especially limiting for very
rare diseases.
• This chapter describes another way of studying the relationship between a
potential risk (or protective) factor and disease more efficiently:
case-control studies. This approach has two main advantages over cohort
studies. First, it bypasses the need to collect data on a large number of
people, most of whom do not get the disease and so contribute little to the
results. Second, it is faster because it is not necessary to wait from
measurement of exposure until effects occur.
• But efficiency and timeliness come at a cost: Managing bias is a more
difficult and sometimes uncertain task in case-control studies. In addition,
these studies produce only an estimate of relative risk and no direct
information on other measures of effect such as absolute risk, attributable
risk, and population risks.
• A case-control study is designed to help determine if an exposure is
associated with an outcome (i.e., disease or condition of interest). In
theory, the case-control study can be described simply.
• First, identify the cases (a group known to have the outcome) and the
controls (a group known to be free of the outcome). Then, look back
in time to learn which subjects in each group had the exposure(s),
comparing the frequency of the exposure in the case group to the
control group.
The validity of case-control studies depends on the care with which
cases and controls are selected, how well exposure is measured, and
how completely potentially confounding variables are controlled.
Selecting Cases
• The cases in case-control research should be new (incident) cases, not
existing (prevalent) ones.
• At best, a case-control study should include all the cases or a representative
sample of all cases that arise in a defined population.
• Some case-control studies, especially older ones, have identified cases in
hospitals and referral centers where uncommon diseases are most likely to
be found. This way of choosing cases is convenient, but it raises validity
problems.
• However the cases might be identified, it should be possible for both them
and controls to be exposed to the risk factor and to experience the outcome.
For example, in a case-control study of exercise and sudden death, cases and
controls would have to be equally able to exercise (if they chose to) to be
eligible.
Selecting Controls
• Above all, the validity of case-control studies depends on the
comparability of cases and controls. To be comparable, cases and
controls should be members of the same base population and have an
equal opportunity of being exposed.
• The best approach to meeting these requirements is to ensure that
controls are a random sample of all non-cases in the same population or
cohort that produced the cases.
The Population Approach
• Studies in which cases and controls are a complete or random sample
of a defined population are called population-based case-control
studies. In practice, most of these populations are dynamic—that is,
continually changing, with people moving in and out of the
population.
The Cohort Approach
• Another way of ensuring that cases and controls are comparable is to
draw them from the same cohort. In this situation, the study is said to
be a nested case-control study (it is “nested” in the cohort).
• With nested case-control studies, there is an opportunity to obtain
both a crude measure of incidence from a cohort analysis and a
strong estimate of relative risk, one that takes into account a rich set
of variables of direct interest, from a case-control analysis.
Measuring Exposure
• The validity of case-control studies also depends on avoiding
misclassification when measuring exposure. The safest approach is to
depend on complete, accurate records that were collected before disease
developed.
• Examples include pharmacy records for studies of prescription drug risks,
surgical records for studies of surgical complications, and stored blood
specimens for studies of risk related to biomolecular abnormalities. With
such records, knowledge of disease status cannot bias reporting of
exposure.
• However, many important exposures can only be measured by asking cases
and controls or their proxies about them. Among these are exercise, diet,
and over-the-counter and recreational drug use.
• When cases and controls are asked to recall their previous exposures, bias
can occur for several reasons. Cases, knowing they have the disease under
study, may be more likely to remember whether they were exposed, a
problem called recall bias.
THE ODDS RATIO: AN ESTIMATE
OF RELATIVE RISK
It shows the dichotomous classification of exposure and disease typical
of both cohort and case-control studies and compares how risk is
calculated differently for the two. These concepts are illustrated with
different studies that have or had both a cohort and a case-control
component.
The odds ratio is defined as
the odds that a case is exposed divided by the odds
that a control is exposed
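A short numerical sketch can show why the odds ratio serves as an estimate of relative risk. The two tables below are hypothetical (not taken from any study in these notes), and the function names are chosen here for illustration. When the outcome is rare, the odds ratio is close to the relative risk; when the outcome is common, the two drift apart.

```python
def relative_risk(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Incidence in exposed divided by incidence in unexposed (cohort view)."""
    incidence_exposed = exp_cases / (exp_cases + exp_noncases)
    incidence_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return incidence_exposed / incidence_unexposed


def odds_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Odds that a case is exposed divided by odds that a non-case is exposed."""
    odds_cases = exp_cases / unexp_cases
    odds_noncases = exp_noncases / unexp_noncases
    return odds_cases / odds_noncases


# Hypothetical rare outcome: 3% of 1,000 exposed, 1% of 1,000 unexposed.
rare = (30, 970, 10, 990)
# Hypothetical common outcome: 30% of 1,000 exposed, 10% of 1,000 unexposed.
common = (300, 700, 100, 900)

rare_rr, rare_or = relative_risk(*rare), odds_ratio(*rare)
common_rr, common_or = relative_risk(*common), odds_ratio(*common)

print(f"Rare outcome:   RR = {rare_rr:.2f}, OR = {rare_or:.2f}")
print(f"Common outcome: RR = {common_rr:.2f}, OR = {common_or:.2f}")
```

In both tables the relative risk is 3.0, but the odds ratio is about 3.06 for the rare outcome and about 3.86 for the common one.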
Case-Control Study
• Cases (people with illness) and controls (people with no illness)
• Compare foods eaten by cases and controls
• Foods more commonly eaten by cases than controls might be
associated with illness
[Figure: the population at risk gives rise to cases and controls; each group is divided into those who ate the food and those who did not eat the food.]
> Case-control studies

• In a case-control study, subjects are enrolled based on whether they
have (or had) the disease associated with the outbreak or not. (In
these studies, persons with the disease of interest are called “cases”
or “case-patients” and persons without the disease are called
“controls”).
• Prior exposures, such as eating a particular food item, are compared
between cases and controls to see if there is a relationship between
the disease and the exposure.
• (Notice how this differs from cohort studies. In a cohort study we
look at people who ate a food or did not eat a food and determine if
they became ill or not. With a case-control study, we look at people
who were ill or not and look back to see if they ate the food or not.)
• Two sisters and their mother from Vancouver, British Columbia,
developed signs and symptoms suggestive of botulism. After
these cases were publicized, 34 additional cases of botulism
were identified in the area. All case-patients had eaten at a
single, family-styled restaurant.
• A case-control study was undertaken to determine the source of
the outbreak at the restaurant. Cases were persons who had
eaten at the Vancouver restaurant who had neurologic signs and
symptoms suggestive of botulism. Controls were persons who
ate at the restaurant with case-patients but developed no
gastrointestinal or neurologic symptoms in the following 2 weeks.
Twenty-two case-patients and 22 controls were interviewed. It
was determined that 20 (91%) of 22 case-patients but only 3
(14%) of 22 controls ate a beef dip sandwich at the restaurant.
Outbreak of Botulism in Vancouver, B.C.
• 36 cases of botulism among patrons of Restaurant X
• Case-control study undertaken
• 20 (91%) of 22 cases ate beef dip sandwich
• 3 (14%) of 22 controls ate beef dip sandwich

Odds Ratio (OR)
• Measure of association for a case-control study
• Compares odds of cases having eaten a certain
food to odds of controls having eaten the food
odds ratio = (odds of eating food among cases) / (odds of eating food among controls)
• Answers the question “How much higher are the
odds of eating the food among cases than
controls?”

Odds Ratio (Optional)
Two-by-two table:
                     Case    Control
Ate food              a        b
Did not eat food      c        d
TOTAL                a+c      b+d
odds ratio = odds of eating food (cases) / odds of eating food (controls) = (a/c) / (b/d)
odds ratio = (a × d) / (b × c)   (cross product)

Odds Ratio
• Close to 1.0 = odds of eating food are similar among
cases and controls → no association between food
and illness
• Greater than 1.0 = odds of eating food among cases are
higher than among controls → food could be risk
factor
• Less than 1.0 = odds of eating food among cases are
lower than among controls → food could be
“protective factor”
• Magnitude reflects strength of association between
illness and eating the food.

What is our Odds Ratio for our Outbreak of Botulism in Vancouver?
Outbreak of Botulism in Vancouver
Returning to the outbreak of botulism:
• 20 of 22 cases ate beef dip sandwich (2 didn’t)
• 3 of 22 controls ate beef dip sandwich (19 didn’t)
odds ratio = odds of eating food (cases) / odds of eating food (controls) = (20/2) / (3/19)
odds ratio ≈ 63

• An odds ratio of 63 means that the odds that cases ate the
beef dip sandwich were 63 times higher than the odds
among controls. Eating the beef dip sandwich might be a
risk factor for botulism in this outbreak.
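The botulism calculation can be checked with a few lines of code. This is a sketch only: the `odds_ratio` helper and its argument names are made up for this example, and `fractions.Fraction` is used so that the ratio-of-odds form and the cross-product form can be compared exactly.

```python
from fractions import Fraction


def odds_ratio(cases_ate, cases_did_not, controls_ate, controls_did_not):
    """Odds of eating the food among cases divided by the odds among controls."""
    odds_cases = Fraction(cases_ate, cases_did_not)
    odds_controls = Fraction(controls_ate, controls_did_not)
    return odds_cases / odds_controls


# Botulism outbreak: 20 of 22 cases and 3 of 22 controls ate the sandwich.
botulism_or = odds_ratio(20, 2, 3, 19)

# Cross-product form (a x d) / (b x c) with a=20, b=3, c=2, d=19.
cross_product = Fraction(20 * 19, 3 * 2)

print(f"Odds ratio = {botulism_or} (about {float(botulism_or):.1f})")
print(f"Cross product gives the same value: {botulism_or == cross_product}")
```

The exact value is 190/3, which rounds to 63. The helper raises `ZeroDivisionError` if no cases or no controls fall in the "did not eat" cell, so a table with an empty cell needs separate handling.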
• Cyclosporiasis is a parasitic disease caused by the microorganism
Cyclospora cayetanensis. Cyclospora infects the small bowel and
usually causes watery diarrhea, bloating, increased gas, stomach
cramps, nausea, loss of appetite, and profound weight loss.
Cyclosporiasis is transmitted in food or water.
• In 1996, a number of outbreaks of cyclosporiasis were occurring
across the United States. In late June, the New Jersey Department
of Health and Senior Services (NJDHSS) undertook a case-control
study to examine an association between cyclosporiasis and eating
raspberries. The cases did not come from one particular setting or
event but were spread across the state. (Sporadic is the term often
used to describe cases that do not appear to be related to a
particular event or exposure.)
• In the case-control study, 21 (70%) of 30 cases reported eating
raspberries in the week before onset of illness whereas 4 (7%) of 60
controls ate raspberries.
• What is the odds ratio for the outbreak?
• What does this odds ratio mean?

Class Question
• Outbreak of cyclosporiasis in New Jersey not
associated with particular event/establishment
• Case-control study undertaken
– 21 (70%) of 30 cases ate raspberries
– 4 (7%) of 60 controls ate raspberries
