Cohort Studies: A Summary of Study Types, Their Strengths and Limitations, and Results Calculation and Reporting

NURSING RESEARCH, STEP BY STEP
By Bernadette Capili, PhD, NP-C, and Joyce K. Anastasi, PhD, DrNP, FAAN
A series coordinated by the Heilbrunn Family Center for Research Nursing at Rockefeller University
Editor’s note: This is the seventh article in a series on clinical research by nurses. The series is designed to give nurses the knowledge and skills they need to participate in research, step by step. Each column will present the concepts that underpin evidence-based practice, from research design to data interpretation. The articles will be accompanied by a podcast offering more insight and context from the authors. To see all the articles in the series, go to http://links.lww.com/AJN/A204.
Continuing our discussion of observational study designs, this column focuses on cohort studies. The word “cohort” derives from the ancient Roman military term for a group of several hundred soldiers who march together to achieve a tactical purpose.[1] The epidemiology community began using the word during the 1930s to mean a designated group of people with a common characteristic or characteristics (such as smokers, ICU nurses, people exposed to lead in drinking water) who are followed or traced over a period of time.[1] This group is followed longitudinally, with periodic measurements to determine the incidence of specific health outcomes or events. Since cohort studies are observational, study participants are monitored but study interventions are not provided. This article describes prospective (following a group from the present into the future) and retrospective (studying a group from the past through to the present) cohort designs, examines their strengths and weaknesses, and discusses methods for reporting the study results.

USES OF THE COHORT DESIGN
The cohort study design is excellent for understanding an outcome or the natural history of a disease or condition in an identified study population. Since participants do not have the outcome or disease at the start of the study, this study design can be used to assess the relationship over time between exposure and outcome.

A vital feature of a cohort study is selecting two groups of study participants based on shared characteristics such as geographic location, birth year, or occupation. Cohorts are also selected based on exposure and nonexposure status. In such cases, both groups should be similar except for exposure status at study entry. For example, an investigator could recruit people living with HIV (PLWH) who smoke and PLWH who have never smoked from the same community and follow them for five years to determine the relationship between smoking status and incidence of heart disease and stroke in this population. Alternatively, the smokers could be categorized by pack-years (less than five pack-year smokers and more than five pack-year smokers, for example) to determine whether heart disease and stroke are associated with the amount and duration of smoking.

Prospective cohort studies are also referred to as longitudinal studies. These studies are used to answer a specific question or questions in a selected area. Investigators recruit a sample of participants and follow them over time, from the present to the future. At predetermined time points, characteristics are measured (via interviews, questionnaires, biological assays, or physiological measures, for example) to understand the relationship between the cohort and the study outcome (see Figure 1).

During the recruitment phase, investigators need to identify and exclude potential participants who plan to move and thus may be difficult to reach during the study’s follow-up period. The eligibility criteria should reflect this consideration. Investigators should collect contact information from enrolled participants: telephone number, e-mail address, mailing address, and contact information for at least two friends or family members in case participants move or die during follow-up. Additionally, the study protocol should indicate periodic contact with the participants, such as telephone calls to provide assessment results, a study newsletter, or study incentives (gift cards) to keep the participants engaged.
ajn@wolterskluwer.com AJN ▼ December 2021 ▼ Vol. 121, No. 12 45

Using the HIV study example above, participants are recruited from local New York City HIV primary care clinics and are evaluated annually for 10 years to determine heart disease and stroke incidence. PLWH are eligible to join if they smoke cigarettes and have well-controlled HIV (undetectable viral load). At study entry, individual smoking exposure (pack-years) is determined, medical history is taken, and cardiovascular health is evaluated. Participants identified at baseline to have a history of heart disease or stroke are excluded. Participants are categorized into two groups based on smoking exposure: less than five pack-years or more than five pack-years. The independent variables (outcome predictors: pack-years, blood pressure, weight, waist circumference, lipid levels) and the dependent variables (outcomes measured: incidence of heart disease and stroke) are assessed annually. The longitudinal design allows investigators to compare changes over time and determine if the level of smoking exposure (pack-years) and other independent variables are associated with the outcome (incidence of heart disease and stroke).

Strengths and weaknesses. A primary strength of the prospective cohort design is that it allows investigators to determine incidence (the number of new cases of an outcome occurring over time), in this case the incidence of new-onset heart disease and stroke. In addition, measuring the independent (predictor) variables before the onset of the outcome strengthens the investigators’ ability to assess the sequence of events and infer the causal basis of an association between the predictor variables and the outcome.[2] A limitation of this design is that it requires a large sample size when the incidence of the outcome is smaller than the prevalence of the exposure. Additionally, conducting the study may be costly in terms of participant recruitment; the number of staff needed; and the collection, storage, and analysis of the outcome measurements. Moreover, some conditions, despite being relatively common (such as breast cancer or chronic obstructive pulmonary disease), can occur at low rates in any given evaluation period and not provide meaningful results. In that case, participants need to be followed for a longer duration, even though this increases the cost as well as the possibility of participants withdrawing from the study or being lost to follow-up.

[Figure 1. Prospective and Retrospective Cohort Designs. In both designs, a defined sample is divided into exposed and unexposed groups, and each group is classified as outcome/disease or no outcome/no disease. Prospective: start 2021, end 2031. Retrospective: start 2021, end 2011.]

Retrospective cohort studies are also called historical cohort studies. The term “historical” is fitting since data analysis occurs in the present but participants’ baseline measurements and follow-ups happened in the past. This type of study is feasible if an investigator has access to a data set that fits the research question. The data set must also have adequate measurements of the predictor variables.

Generally, the data for a retrospective cohort study were originally collected for other purposes, such as electronic health records (EHRs) or an administrative database such as Medicare.[2] This study design’s primary goal is to examine events or outcomes by reviewing past data in relation to predictor variables. Institutional review board approval is required even though actual patient interactions do not occur. For example, to ascertain the incidence of heart disease and stroke among PLWH who smoke, the EHRs of 500 HIV patients from a local HIV primary care clinic are examined over 10 years, from 2010 to 2020. In this example, HIV patients are categorized by their smoking exposure status: less than five pack-years or more than five pack-years. The outcome of interest is the incidence of heart disease or stroke.

Strengths and weaknesses. The ability to analyze an outcome based on data from already collected measurements and participant follow-ups is one strength of the retrospective cohort design. This type of study is also inexpensive to conduct. One limitation is that, because the data were collected for other purposes, the data set may be incomplete or inaccurate or include measurements that do not fit the research question.[2] In other words, the investigators do not have control over the data collection methods and procedures.
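The retrospective EHR example above (grouping patients by pack-years and counting who developed heart disease or stroke) can be sketched in a few lines of Python. This is only an illustrative sketch: the record fields, the function names, and the six sample records are invented for the example, pack-years is computed in the usual way as packs smoked per day multiplied by years smoked, and the handling of a patient with exactly five pack-years (placed in the higher group here) is an assumption, since the article does not specify it.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical fields abstracted from an EHR; not a real schema."""
    patient_id: str
    packs_per_day: float
    years_smoked: float
    had_event: bool  # heart disease or stroke recorded during the review window


def pack_years(record: PatientRecord) -> float:
    # Pack-years = packs smoked per day x number of years smoked.
    return record.packs_per_day * record.years_smoked


def exposure_group(record: PatientRecord, cutoff: float = 5.0) -> str:
    # Assumption: exactly `cutoff` pack-years falls in the higher group.
    return "lower" if pack_years(record) < cutoff else "higher"


def cumulative_incidence_by_group(records) -> dict:
    """Proportion of patients in each exposure group who had the outcome."""
    totals = {}
    events = {}
    for record in records:
        group = exposure_group(record)
        totals[group] = totals.get(group, 0) + 1
        events[group] = events.get(group, 0) + int(record.had_event)
    return {group: events[group] / totals[group] for group in totals}


# Invented records for illustration only.
records = [
    PatientRecord("P001", 0.5, 4, False),
    PatientRecord("P002", 1.0, 12, True),
    PatientRecord("P003", 0.25, 10, False),
    PatientRecord("P004", 1.5, 8, True),
    PatientRecord("P005", 1.0, 6, False),
    PatientRecord("P006", 0.5, 6, True),
]

incidence = cumulative_incidence_by_group(records)
for group, value in sorted(incidence.items()):
    print(f"{group} pack-year group: cumulative incidence = {value:.2f}")
```

In a real retrospective study, these fields would be only as complete and accurate as the original records, which is the limitation described above.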
METHODS FOR REPORTING RESULTS
During a cohort study’s scheduled evaluation periods, investigators determine the incidence of

Table 1. Calculating the Results of a Cohort Study

Outcome                  Disease (Heart     No Disease (No Heart   Total                  Total Person-Time (Years)*
                         Disease/Stroke)    Disease/Stroke)
Exposed (Smoker)         125 (a)            375 (b)                500 (a + b)            (125 × 5) + (375 × 10) = 4,375
Unexposed (Nonsmoker)    25 (c)             475 (d)                500 (c + d)            (25 × 5) + (475 × 10) = 4,875
Total                    150 (a + c)        850 (b + d)            1,000 (a + b + c + d)  4,375 + 4,875 = 9,250

a = exposed participant and acquires the outcome of interest
b = exposed participant and does not acquire the outcome of interest
c = unexposed participant and acquires the outcome of interest
d = unexposed participant and does not acquire the outcome of interest

*Person-time = (participants with the outcome × 5 years) + (participants without the outcome × 10 years). Participants diagnosed with heart disease/stroke were diagnosed at the end of 5 years of follow-up; participants with no heart disease/stroke were followed to the end of the 10-year study duration.

Risk (cumulative incidence) of heart disease/stroke among PLWH: (a + c) / (a + b + c + d) = 150/1,000 = 0.15 × 100 = 15%
Risk Ratio for heart disease/stroke among PLWH who smoke: [a/(a + b)] / [c/(c + d)] = (125/500) / (25/500) = 0.25/0.05 = 5
Rate (incidence rate) of heart disease/stroke among PLWH over 10 years: (a + c) / {[(a × 5) + (b × 10)] + [(c × 5) + (d × 10)]} = 150/9,250 = 0.016 cases/person-year
Rate Ratio (IRR): {a / [(a × 5) + (b × 10)]} / {c / [(c × 5) + (d × 10)]} = (125/4,375) / (25/4,875) = 0.029/0.005 = 5.8 (using the rounded rates; about 5.6 without intermediate rounding)

Interpretation of Risk Ratio or Rate Ratio
Risk Ratio or Rate Ratio = 1: Exposure is not preventive or harmful
Risk Ratio or Rate Ratio > 1: Exposure is harmful
Risk Ratio or Rate Ratio < 1: Exposure is protective

IRR = incidence rate ratio; PLWH = people living with HIV.
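As a rough check on the arithmetic in Table 1, the four measures can be reproduced with a short Python sketch. The counts and follow-up times are taken from the table; the variable names, the `person_time` helper, and the `interpret` helper are illustrative assumptions rather than anything prescribed by the article.

```python
import math

# Cell counts from Table 1.
a, b = 125, 375  # exposed (smokers): with / without heart disease or stroke
c, d = 25, 475   # unexposed (nonsmokers): with / without heart disease or stroke

# Follow-up assumptions from the table footnotes.
YEARS_IF_CASE = 5      # all cases were diagnosed at the end of year 5
YEARS_IF_NONCASE = 10  # noncases were followed for the full 10 years


def person_time(cases: int, noncases: int) -> int:
    """Total person-years contributed by one exposure group."""
    return cases * YEARS_IF_CASE + noncases * YEARS_IF_NONCASE


def interpret(ratio: float) -> str:
    """Map a risk ratio or rate ratio to the wording used in Table 1."""
    if math.isclose(ratio, 1.0):
        return "Exposure is not preventive or harmful"
    return "Exposure is harmful" if ratio > 1 else "Exposure is protective"


# Risk (cumulative incidence) and risk ratio.
risk_overall = (a + c) / (a + b + c + d)
risk_ratio = (a / (a + b)) / (c / (c + d))

# Rate (incidence rate) and rate ratio.
pt_exposed = person_time(a, b)
pt_unexposed = person_time(c, d)
rate_overall = (a + c) / (pt_exposed + pt_unexposed)
rate_ratio = (a / pt_exposed) / (c / pt_unexposed)

print(f"Risk: {risk_overall:.0%}")
print(f"Risk ratio: {risk_ratio:.1f} ({interpret(risk_ratio)})")
print(f"Person-time: {pt_exposed} + {pt_unexposed} = {pt_exposed + pt_unexposed}")
print(f"Rate: {rate_overall:.3f} cases/person-year")
print(f"Rate ratio: {rate_ratio:.2f} ({interpret(rate_ratio)})")
```

Because the sketch does not round the two group rates before dividing, it gives a rate ratio of about 5.57, whereas dividing the rounded rates (0.029/0.005) gives the 5.8 shown in Table 1.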
the outcome of interest by counting the number of participants who develop the outcome, such as those PLWH in our example who develop heart disease or stroke. Incidence is measured using risks and rates.[3] Both can provide additional information about the exposure of interest (smoking, nonsmoking) by calculating the risk ratio and rate ratio.

Risk and risk ratio. Risk is also known as cumulative incidence, and is defined as the number of participants who develop the outcome of interest divided by the total population of participants at risk. For instance, investigators conduct a study to evaluate the association between smoking and heart disease and stroke among PLWH. They follow 1,000 PLWH for 10 years, 500 of whom are smokers and 500 nonsmokers. Participants are evaluated annually. A total of 125 heart disease and/or stroke cases were diagnosed in the smoking group while only 25 were diagnosed in the nonsmoking group (see Table 1). All cases of heart disease/stroke were diagnosed at the fifth year of follow-up.

Risk = Number of participants who develop the outcome / Total number of participants at risk

A total of 150 cases of heart disease and stroke were identified in the cohort sample of 1,000 participants. Based on these calculations (150/1,000 = 0.15), there was a 15% risk of developing heart disease or stroke among the study participants. Additional analyses used risk ratio to compare the risk between exposed (smoker) and unexposed (nonsmoker) participants, providing further information about the data. Risk ratio illustrates the relative increase or decrease in incidence of the outcome between the exposed and unexposed groups.[3]

Risk Ratio = Risk(exposed) / Risk(unexposed)

Using the formula shown in Table 1, the risk ratio was 5. The results demonstrate that PLWH who were smokers (exposed) were five times more likely to be diagnosed with heart disease or stroke
than PLWH who were nonsmokers. To further understand the meaning of the risk ratio results, if the result had been equal to 1, then the exposure (smoking) would not have affected the outcome. In other words, the risk would have been the same for the exposed and unexposed groups. If the risk ratio had been less than 1, the exposure (smoking) would have been protective for heart disease and stroke (see Figure 2).

[Figure 2. Interpretation of Risk Ratio or Rate Ratio. A scale centered on 1: ratios less than 1 are labeled “Risk: less-exposed group” and ratios greater than 1 are labeled “Risk: more-exposed group.”]

Rate and rate ratio. Rate is also known as incidence rate and is defined as the number of participants who develop the outcome of interest (heart disease and stroke) divided by the person-time (days, months, years) at risk during follow-up. Person-time is the sum of each participant’s total time free from the outcome of interest (no heart disease or stroke).

Rate = Number of new cases / Total person-time at risk

This measure reflects both the accumulated events (cases of heart disease and stroke) and the speed at which new health outcomes occur in a study cohort. The rate ratio is used to compare that speed (faster or slower) between the exposed and unexposed groups.

Rate Ratio (Incidence Rate Ratio [IRR]) = Rate(exposed) / Rate(unexposed)

The calculated rate shown in Table 1 is 0.016, indicating that 0.016 cases of heart disease and stroke per person-year (about 16 cases per 1,000 person-years) occurred in the sample. The rate ratio was 5.8 when calculated from the rounded rates in Table 1 (about 5.6 without intermediate rounding), indicating that heart disease and stroke rates were nearly six times greater in the exposed group than in the unexposed group. As with risk ratio, if the result had been equal to 1, then the smoking exposure would not have affected the outcome. If it had been less than 1, then the smoking exposure would have been protective for heart disease and stroke. The further the rate ratio is from 1 (null association, the exposure is not preventive or harmful), the more impact the exposure has on the study cohort.

REPORTING RECOMMENDATIONS
As with cross-sectional studies, which were discussed in the last column, cohort studies also use the guideline Strengthening the Reporting of Observational Studies in Epidemiology (STROBE)[4] to explain how a study is conducted and how the results are obtained. This guideline provides specific recommendations for cohort studies in its 22-item checklist to guide investigators. The checklist provides criteria for understanding research, including study planning, conduct, findings, and conclusions. Additionally, the checklist contains information on how a study might be replicated, how it can be used to make clinical decisions, and what constitutes sufficient information to be included in a systematic review. See www.equator-network.org/reporting-guidelines/strobe.

CONCLUSION
The cohort study design is appropriate to use to determine the incidence of a health outcome or event. It is especially helpful in understanding the natural history of a disease or condition in an identified study population. Additionally, the cohort study allows an investigator to examine the timing between an exposure and an outcome or outcomes.

The next article in this series will discuss the case–control study design.

Bernadette Capili is director of the Heilbrunn Family Center for Research Nursing at Rockefeller University, New York City. Joyce K. Anastasi is the Independence Foundation Professor of Nursing at the New York University Rory Meyers College of Nursing, New York City. This manuscript was supported in part by grant No. UL1TR001866 from the National Institutes of Health’s National Center for Advancing Translational Sciences Clinical and Translational Science Awards Program. Contact author: Bernadette Capili, bcapili@rockefeller.edu. The authors have disclosed no potential conflicts of interest, financial or otherwise. A podcast with the authors is available at www.ajnonline.com.

REFERENCES
1. Hood MN. A review of cohort study design for cardiovascular nursing research. J Cardiovasc Nurs 2009;24(6):E1-E9.
2. Hulley SB, et al. Designing clinical research. 4th ed. Philadelphia: Wolters Kluwer/Lippincott Williams and Wilkins; 2013.
3. Alexander LK, et al. Risk and rate in cohort studies. ERIC Notebook 2015;2(7). https://sph.unc.edu/epid/eric.
4. von Elm E, et al. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: guidelines for reporting observational studies. Int J Surg 2014;12(12):1495-9.