
1

College of Computing and Informatics


Department of Statistics

Biostatistics and Epidemiology


Chapter 1: Principles and Methods of Epidemiology

By: Dugo G. (MSc.)


Email: dugojgadisa@gmail.com or
Dugo.Gadisa@haramaya.edu.et
2 Introduction to Epidemiology and Biostatistics
What is the difference between the two?
 Biostatistics is the application of statistical methods
in biology, medicine, public health, and other fields
of study.
 Epidemiology is the study of patterns of health and
illness and associated factors at the population level
(disease distribution, prevalence, mechanisms of
prevention, etc.)
3 Introduction to Epidemiology
Definitions of Epidemiology
 It is a study of the distribution of a disease or a
physiological condition in human populations and the
factors that influence this distribution.
 It is a study of the distribution and determinants of health-
related states and events in populations and the application
of this study to control health problems.
4
What is Epidemiology?
 In general, Epidemiology can be
defined as the study of determinants,
distribution, and frequency of
disease.
5 Uses of Epidemiology
a. Community diagnosis; i.e., what are the major health problems
occurring in a community
b. Establishing the history of a disease in a population; e.g.,
identifying the periodicity of an infectious disease
c. Describing the natural history of disease in the individual; e.g.,
natural history of Cancer in the individual (from its pathological
onset (inception) to resolution (recovery or death), clinical stages)
6 Uses of Epidemiology
d. Describing the clinical picture of the disease; i.e., who gets the
disease, who dies from the disease, and what the outcome of the
disease is
e. Estimating risk; e.g., what factors increase the risk of heart
disease, automobile accidents, and violence
f. Identifying syndromes and precursors; e.g., the relationship of high
blood pressure to stroke, kidney disease, and heart disease
 Syndromes: a group of signs and symptoms that consistently occur
together and characterize a particular abnormality or condition.
 Precursors: a substance, cell, or cellular component from which
another substance, cell, or cellular component is formed
7 Uses of Epidemiology
g. Evaluating prevention/intervention programs; e.g., vaccine
and clinical trials
h. Investigating epidemics/diseases of unknown etiology.
 Etiology encompasses understanding why a particular
condition or disease occurs.
8 Some Epidemiologic Concepts
Catchment area:
 The geographical area from which the people attending a
particular health facility come.
Catchment population:
 The people attending a particular health facility.
Population at risk:
 All people at risk of developing a disease or having a health
problem, as well as those who are currently suffering from it. It is
vital to know the size of this group.
9 Some Epidemiologic Concepts
 Incidence: the number of new cases, or events
occurring over a defined period of time, commonly one
year.
 Prevalence: the total number of existing cases, episodes or
events occurring at one point in time, commonly on a
particular day.
11 Some Epidemiologic Concepts
 Case: A person who is identified as having a particular characteristic
such as a disease, behavior, or condition. Cases may be divided into
possible, probable, and definite, depending on how well specific
criteria are satisfied
 Controls: refer to a specific group of individuals who serve as a
comparison for the group of people with a particular disease (known
as cases).
12 Some Epidemiologic Concepts
 Epidemic: the occurrence in a community or region of cases of an
illness or other similar event clearly in excess of what is normally
expected. The characteristics of the illness, the area and the season
all have to be taken into account.
 Epidemic incidence curve: a graph that plots cases of the disease
by the time of onset of the illness. It is an essential part of the
analysis, because it can indicate the nature of the outbreak and the
probable source.
13 Mortality Rate
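Crude Death Rate (CDR) = $\frac{\text{Total number of deaths during a given period}}{\text{Estimated mid-period population}} \times 1000$
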
The crude death rate was 5.995 per 1,000 in Ethiopia in 2023.

 The crude death rate in Addis Ababa was approximately 6.29 per
1000 (data of 2020).
14 Age-specific Mortality Rate

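Age-specific mortality rate = $\frac{\text{Number of deaths in a specific age group during a given period}}{\text{Estimated mid-period population of the same age group}} \times 1000$

One example of an age-specific mortality rate is the Infant Mortality Rate.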
15 Sex-Specific Mortality Rate

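Sex-specific mortality rate = $\frac{\text{Number of deaths in a specific sex during a given period}}{\text{Estimated mid-period population of the same sex}} \times 1000$

Example: The average total population of “Town A” in 2019 was
6000 (3500 female and 2500 male). In the same year, 300 people died
(100 female and 200 male). Calculate the crude death rate and the
sex-specific mortality rate for females.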
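
The Town A example can be worked through with a small helper. This is a minimal sketch: the function name `rate_per` is a hypothetical choice, and it assumes the usual definition of a mortality rate as deaths divided by the corresponding population, scaled by 1000.

```python
def rate_per(deaths: int, population: int, factor: int = 1000) -> float:
    """Deaths per `factor` people in the given population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return factor * deaths / population


# Town A, 2019 (figures taken from the example above)
crude_death_rate = rate_per(300, 6000)    # all deaths / total population
female_death_rate = rate_per(100, 3500)   # female deaths / female population

print(f"Crude death rate: {crude_death_rate:.1f} per 1000")
print(f"Female mortality rate: {female_death_rate:.1f} per 1000")
```

The same helper gives the male rate with `rate_per(200, 2500)`.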
16 Case Fatality Rate

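 Case Fatality Rate (CFR) = $\frac{\text{Number of deaths from a specific disease during a given period}}{\text{Number of cases of that disease during the same period}} \times 100$

 Example: In 1996, there were 1000 tuberculosis patients in
one region. Out of the 1000 patients, 100 died in the same
year. Calculate the case fatality rate of tuberculosis.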
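
Working the tuberculosis example through, assuming the usual definition of CFR as deaths among cases of the disease, expressed as a percentage:

$$\text{CFR} = \frac{100}{1000} \times 100 = 10\%$$

That is, 10 out of every 100 tuberculosis patients in the region died within the year.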
17 Neonatal Mortality Rate
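Neonatal Mortality Rate = $\frac{\text{Number of deaths under 28 days of age during a given period}}{\text{Number of live births during the same period}} \times 1000$

 Example: In 2010, there were a total of 5000 live births in “Zone
B”. Two hundred of them died before 28 days after birth. Calculate
the Neonatal Mortality Rate (NMR).

Answer: NMR = $\frac{200}{5000} \times 1000 = 40$ per 1000 live births
 That means that out of every 1000 live births in 2010, 40 died before
28 days after birth.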
18 Infant Mortality Rate (IMR)

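 Infant Mortality Rate (IMR) = $\frac{\text{Number of deaths under one year of age during a given period}}{\text{Number of live births during the same period}} \times 1000$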


19
Under-Five Mortality Rate
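 Under-five mortality rate = $\frac{\text{Number of deaths of children aged 0-4 years during a given period}}{\text{Average number of children under five years of age during the same period}} \times 1000$
 NB: The numerator says 0-4 years. In this formula, 0-4 years means children
from birth to less than five years of age, i.e., the upper age limit is not 4.

 Example: In 1996, the total number of children under 5 years of age was
10,000 in “Zone C”. In the same year, 200 children under five years of age
died. Calculate the under-five mortality rate (U5MR).

U5MR = $\frac{200}{10{,}000} \times 1000 = 20$ per 1000 children under five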
20
Maternal Mortality Rate
 Maternal Mortality Rate = $\frac{\text{Number of pregnancy-related deaths of mothers during a given period}}{\text{Number of live births during the same period}} \times 100{,}000$

 Maternal Mortality Rate reflects the standards of all aspects of
maternal care (antenatal, delivery, and postnatal).

 The Maternal Mortality Rate in Ethiopia in 2020 is estimated to be
between 267 and 412 per 100,000 live births.

 That means that, for every 100,000 live births, around 340 mothers
on average die of pregnancy-related causes.
21 Scope of Epidemiology
 Its scope at the beginning was limited to understanding epidemics
 Now it is the basis for advancing our understanding of all kinds of diseases,
including:
 Nutritional deficiencies
 Infectious and non-infectious diseases
 Injuries and accidents
 Mental disorders
 Maternal and Child Health
 Cancer
 Occupational Health
 Environmental Health
 Health behaviors
22
Measuring Disease Frequency
 It has several components:
 Classifying and categorizing disease
 Defining the period of time of risk of disease
 Deciding what constitutes a case of disease in a study
 Obtaining permission to study people
 Making measurements of disease frequency
 Finding a source for ascertaining the cases
 Relating cases to population and time at risk
 Defining the population at risk of disease
23 The Basic Triad of Descriptive Epidemiology
 The three essential characteristics of disease we look for in
descriptive epidemiology:

TIME
PLACE
PERSON
Time
24
There are three major kinds of changes in disease occurrence over
time.
1. SECULAR TRENDS (slow changes): gradual changes over a long
period of time, such as years or decades.
E.g. AIDS, cancer.
2. PERIODIC OR CYCLIC CHANGES: recurrent alterations in the
frequency of diseases.
 Cycles may be annual or have some other periodicity. E.g. measles,
malaria, meningitis.
3. SPORADIC CHANGES: occurrence at irregular and unpredictable intervals.
 E.g. influenza, allergies
25
Time (Cont’d…)
 Secular trend can be due to one or more of the following factors.
1. Change in diagnostic technique
2. Change in accuracy of enumerating population at risk.
3. Change in age distribution of the people.
4. Change of survival from disease.
5. Change in actual incidence of the disease.
26
Time (Cont’d…)
 Changing or Stable
 Seasonal Variation
 Clustered (Epidemic) or evenly distributed (Endemic)
 Point source or Propagated
27
Place
 The frequency of disease is different in different places.
 Natural barriers: environmental or climatic conditions, such as
temperature, humidity, rainfall, altitude, mineral content of soil, or
water supply
 Political boundaries: Intended for planning and allocation of
resources
 Urban-rural differences in disease occurrence, in terms of
migration, style of living, and differential environmental exposures,
are also helpful
28
Place (Cont’d…)
 Geographically restricted or widespread (Pandemic)
 Relation to food or water supply
 Multiple clusters or one
29
Person
Age
Socio-economic status
Gender
Ethnicity
Behavior
Person (Who)
30
 Young vs Old
 Female vs male
 Rich vs Poor
 Illiterate vs Educated

Place (Where)
 Lowland vs Highland
 Urban vs Rural

Time (When)
 Day/night variation
 Seasonal variation
 Long term
31
Disease Occurrence
Dynamics of Disease Transmission
 Interaction of agents and environmental factors with human
hosts
 Distribution of severity of diseases
 Modes of disease transmission
 Level of disease in community when transmission stops
32 The Basic Triad of Analytic Epidemiology
 The three phenomena assessed in Analytic Epidemiology
are:
[Figure: epidemiologic triangle with Host, Agent, and Environment at its vertices]
33 The Basic Triad of Analytic Epidemiology
 Host: In epidemiology, the host is usually a human who gets sick,
but it can also be an animal that carries the disease and may or
may not show illness.
 Agent: Agents in the epidemiologic triangle include bacteria, viruses,
fungi, protozoa, et cetera.
 Environment: The environment represents the favorable
conditions for an agent to cause a health event. Environmental
factors include physical features like geology or climate, biological
factors like the presence of disease-transmitting insects, and
socioeconomic factors like crowding, sanitation, and access to
health services.
34 Measuring Disease Frequency
 Incidence
 Prevalence
 Defined time period
 Population at risk
35 When Calculating Disease Frequency (DF)
 The numerator (number of cases/episodes)
 The denominator (total population at risk)
 Factor (e.g. 100, 1000, 10000)
 Time period (days, weeks, months, or years)
 We use rates to determine DF:
 Incidence rates
 Prevalence rates
36 Incidence and Prevalence rates
 Decide what you are counting:
 episodes/cases, people, attendances, or something else?
 Decide what the service counts when filling in monthly statistics, e.g. for
diarrhea or malaria:
 People may get repeated attacks in one month and attend your service each time.
 That is one sick person who has suffered several separate
episodes in one year and attended your service several times.
37 Incidence and Prevalence rates
 Incidence: count episodes/cases.
 Prevalence: for chronic conditions/diseases, count the total
number of sick people.
 To study the use of health services, information on new attendances
and repeat attendances is required.
38 When Calculating:
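 Incidence rate: $\frac{\text{Number of new cases during a given period}}{\text{Population at risk during the same period}} \times \text{factor}$

 Prevalence rate: $\frac{\text{Number of existing cases at a point in time}}{\text{Total population at that time}} \times \text{factor}$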
39
Example 1: In September 1995 there were 200 new cases of
relapsing fever in “Kebele X”. The average total population of
“Kebele X” was 4000. Calculate the incidence rate of relapsing fever
in “Kebele X” in September 1995.
Answer: 200/4000 × 1000 = 50 new cases per 1000 population
Example 2: In 2009, an estimated 5,600,000 people in South Africa were
infected with HIV, out of a total population of 53 million. What
is the prevalence of HIV in South Africa?
Answer: 5,600,000/53,000,000 × 100 ≈ 10.6% (about 106 per 1000 population)
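
Both examples can be checked with one short function. This is a sketch only: `frequency` is a hypothetical helper name, and it assumes that both measures are computed as cases divided by population, times a scaling factor.

```python
def frequency(cases: int, population: int, factor: int) -> float:
    """Cases per `factor` people (factor=100 gives a percentage)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return factor * cases / population


# Example 1: new cases of relapsing fever in Kebele X, September 1995
incidence_per_1000 = frequency(200, 4000, factor=1000)

# Example 2: people living with HIV in South Africa, 2009
prevalence_percent = frequency(5_600_000, 53_000_000, factor=100)

print(f"Incidence: {incidence_per_1000:.0f} new cases per 1000")
print(f"Prevalence: {prevalence_percent:.1f}%")
```

The only difference between the two calls is what goes in the numerator: new cases for incidence, all existing cases for prevalence.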
40 Comparing Incidence and Prevalence
Incidence
 New cases or events over a period of time
 Useful to study factors causing risks
Prevalence
 All cases at a point/interval of time
 Useful for measuring the size of the problem and planning
41 Relationship of Incidence to Prevalence
 Prevalence depends on both the incidence rate and the duration of
disease
 Because prevalence is affected by factors such as migration and
duration, incidence is preferred for studying etiology.
 Prevalence = Incidence × Duration
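
The relation Prevalence = Incidence × Duration is best read as an approximation that holds when incidence and duration are stable over time and prevalence is low. The sketch below uses hypothetical numbers (not taken from any study) to show how two conditions with the same incidence can have very different prevalence.

```python
def approx_prevalence(incidence_rate: float, mean_duration_years: float) -> float:
    """Approximate prevalence (as a proportion) from incidence and duration.

    incidence_rate: new cases per person per year (hypothetical input)
    mean_duration_years: average time a case lasts, in years
    """
    return incidence_rate * mean_duration_years


# Hypothetical: both conditions have 5 new cases per 1000 people per year
chronic = approx_prevalence(0.005, 4.0)    # cases last about 4 years
acute = approx_prevalence(0.005, 0.25)     # cases last about 3 months

print(f"Chronic condition: {chronic * 1000:.2f} per 1000")
print(f"Acute condition:   {acute * 1000:.2f} per 1000")
```

With equal incidence, the longer-lasting condition accumulates more existing cases, which is why prevalence alone says little about the risk of developing a disease.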
42 Relationship between Incidence, Prevalence and Disease Duration
[Figure: Incidence adds cases to Prevalence; Death, Cure, and Lost to follow up remove them]
43 Attack Rate
 Example: Consider the outbreak of cholera in country Y in
March 2016. There were 490 people with cholera, and the population at
risk was 18,600. What is the attack rate (AR)?
 Answer: AR = 490/18,600 × 100 ≈ 2.63% (about 26 cases per 1000 population at risk)
44 Relative Risk (RR) or Risk Ratio
 Defined as the ratio of the incidence of disease in the exposed
group to the incidence in the corresponding non-exposed group.

              Disease
Exposure    Yes     No      Total
Yes         a       b       a+b
No          c       d       c+d
Total       a+c     b+d     N
45
Relative Risk (RR) or Risk Ratio
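 $RR = p_1 / p_2$, where $p_1$ is the probability of disease in the exposed group and $p_2$ is the probability of disease in the non-exposed group.
 A point estimate of the risk ratio is given by: $\widehat{RR} = \dfrac{a/(a+b)}{c/(c+d)}$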
46
Relative Risk (RR) or Risk Ratio

Example
Age at first      Breast Cancer
birth             Yes     No      Total
≥25 years         31      1597    1628
<25 years         65      4475    4540
Total             96      6072    6168
47
Relative Risk (RR) or Risk Ratio

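 $\widehat{RR} = \dfrac{31/1628}{65/4540} \approx 1.33$
 Women who first give birth at age 25 or older are about 33%
more likely to develop breast cancer than women who first give birth before age 25.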
48
Relative Risk (RR) or Risk Ratio
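 To obtain a CI for the RR, construct a CI for ln(RR) and exponentiate its limits:
$$\exp\left[\ln(\widehat{RR}) \pm z_{1-\alpha/2}\,\widehat{SE}\big(\ln \widehat{RR}\big)\right]$$
 Where $\widehat{SE}\big(\ln \widehat{RR}\big) = \sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}$ and
ln is the natural logarithm.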
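
As an illustration, the risk ratio and an approximate 95% CI for the breast cancer table can be computed as follows. This is a sketch under stated assumptions: it uses the common large-sample standard error of ln(RR), sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)), which may differ in form from the formula on the original slide, and `risk_ratio_ci` is a hypothetical function name.

```python
import math


def risk_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Risk ratio with a large-sample CI from a 2x2 table.

    a, b: exposed subjects with / without disease
    c, d: unexposed subjects with / without disease
    z: normal quantile (1.96 for a 95% interval)
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR), large-sample approximation (assumed form)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# First birth at >=25 years (exposed) vs <25 years (unexposed)
rr, lower, upper = risk_ratio_ci(31, 1597, 65, 4475)
print(f"RR = {rr:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

For these data the interval spans 1, so under this approximation the increase in risk is not statistically significant at the 5% level.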
49
The Odds Ratio
 The odds ratio (OR) is the odds in favor of disease for the exposed
group divided by the odds in favor of disease for the unexposed group.
 The odds in favor of disease are $p/(1-p)$, where p is the probability of
disease.
51
The Odds Ratio
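 The odds ratio is defined as: $OR = \dfrac{p_1/(1-p_1)}{p_2/(1-p_2)}$, where $p_1$ and $p_2$ are the probabilities of disease in the exposed and unexposed groups.

 It is estimated by: $\widehat{OR} = \dfrac{a/b}{c/d} = \dfrac{ad}{bc}$, where a, b, c, and d are the cells of the 2×2 table.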
52
The Odds Ratio
 Example: in a study of the risk factors for invasive cervical
cancer, the following data were collected (case-control):

             Smoker   Nonsmoker   Total
Cancer       108      117         225
No Cancer    163      268         431
Total        271      385         656
53
The Odds Ratio
 The odds ratio is estimated by: $\widehat{OR} = \dfrac{108 \times 268}{117 \times 163} \approx 1.52$

 Women with cervical cancer have odds of smoking that are 1.52
times the odds for women without cancer.
54 The Odds Ratio
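 A CI can be constructed for ln(OR) as: $\ln(\widehat{OR}) \pm z_{1-\alpha/2}\,\widehat{SE}\big(\ln \widehat{OR}\big)$, where $\widehat{SE}\big(\ln \widehat{OR}\big) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$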
55 The Odds Ratio

 A CI for the OR itself is obtained by exponentiating the upper and
lower confidence limits for the natural log of the OR.
56
The Odds Ratio

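 For the cervical cancer data: $\widehat{SE}\big(\ln \widehat{OR}\big) = \sqrt{\frac{1}{108} + \frac{1}{117} + \frac{1}{163} + \frac{1}{268}} \approx 0.166$

 Therefore, a 95% CI for ln(OR) is
ln(1.52) ± 1.96(0.166)
or
(0.093, 0.744)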
57
The Odds Ratio

 A 95% CI for the OR itself is
$(e^{0.093}, e^{0.744})$
or
(1.10, 2.10)
 This interval does not contain the value 1
 We conclude that the odds of developing cervical cancer
are significantly higher for smokers than for nonsmokers
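
The steps for the cervical cancer data can be sketched in code. The function name `odds_ratio_ci` is hypothetical, and the sketch assumes the usual large-sample standard error of ln(OR), sqrt(1/a + 1/b + 1/c + 1/d), which reproduces the 0.166 used above.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio (ad/bc) with a large-sample CI from a 2x2 table."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR), large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


# Cells in table order: cancer/smoker, cancer/nonsmoker,
# no cancer/smoker, no cancer/nonsmoker
odds_ratio, se_log_or, lower, upper = odds_ratio_ci(108, 117, 163, 268)
print(f"OR = {odds_ratio:.2f}, SE(ln OR) = {se_log_or:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Because the code carries the unrounded OR through the calculation, its limits can differ in the second decimal place from a hand calculation that starts from 1.52. The lower limit is above 1 either way.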
58 Quiz (5%)
 Consider the total 22,071 people under study;
where 11,037 were assigned to the Aspirin user
group and the rest were assigned to a placebo
group. If 104 people among the Aspirin users have
a Myocardial Infarction case and in total, there are
293 Myocardial Infarction cases, find the Odds
Ratio and Interpret the result.
59
Bias
 Describes errors that arise from the design or execution of
the study.
 It’s undesirable
 It can’t be adjusted
 Useful to consider in any study
 Essential to consider in critical appraisal
60
Bias
 It’s a systematic error introduced to the study design.
 Two major forms
 Selection Bias: refers to any error that arises in the
process of identifying the study subjects.
 Information Bias: includes any systematic error in
the measurements on either exposure or outcome
variable.
61 Selection Bias
 Selection bias occurs when identification of subjects for
inclusion into a study depends on the interest of the data
collector or investigator.
 If selection of cases and controls (eg in case control
study) is based on different criteria, then bias can occur.
 There are many circumstances in which selection bias can occur,
but there are two major known forms.
62
Types of Selection Bias
 Response Bias:
 Those who agree to be in a study may be in some way different
from those who refuse to participate.
 Volunteers may be different from those who are listed.
 Berksonian Bias:
 Bias that is introduced due to differences in criteria/probabilities of
admission to the hospital for those with the disease and those
without the disease.
 Admission criteria of the hospital
63
Information Bias
 In analytical studies usually one factor is known and another is
measured.
 E.g. in case control studies, the “outcome” is known and the
“exposure” is measured.
 E.g. in cohort studies, the exposure is known and the outcome is
measured.
64 Information Bias
 Error in the measurements/information obtained in the study could
be:
 Error due to participants
 Error due to “observers”
 Differential (Non-random)
 Non-differential (Random)
• (i.e., does it influence the exposure and the outcome equally?)
65 Types of Information Bias
1. Interviewer Bias: an interviewer’s knowledge of the exposure and
outcome may influence the structure of questions and the manner
of presentation which may influence the response.
2. Recall Bias: those with a particular outcome or exposure may
remember events more clearly or amplify their memories.
3. Observer Bias: Observers may have preconceived expectations of
what they should find in an examination.
4. Loss to follow-up: those who are lost to follow-up or who
withdraw from the study may be different from those who are
followed for the entire study.
Types of Information Bias
66
5. Hawthorne effect: an effect first documented at the Hawthorne
manufacturing plant; people act differently if they know they are
being watched.
6. Surveillance Bias: The group with the known exposure or outcome
may be followed more closely or longer than the comparison
group.
7. Misclassification Bias: Errors are made in classifying either the
disease or exposure status.
67
Confounding Variable
 The word came from Latin, “confundere” meaning “to
mix up”.
 The measured effect of an exposure is distorted because
of the association of the exposure with another factor
(confounder) that influences the outcome.
[Figure: Confounder linked to both Exposure and Outcome]
Confounding
68
 A problem resulting from the fact that one feature of study
subjects has not been separated from the second feature and has
thus been confounded with it producing a spurious result.
 The spuriousness arises from the effect of the first
feature being mistakenly attributed to the second feature.
 Confounding can produce either a type I or a type II error, but we
usually focus on the type I errors.
69 Confounding
 At the simplest level, confounding can be thought of as a
confusion of effects.
 The apparent effect of the exposure of interest is distorted
because the effect of an extraneous third factor is mixed
with the actual effect.
70
Difference from Bias…
 Bias creates an association that is not true; however, confounding
describes an association that is true, but potentially misleading.
 A key principle of confounding is that a confounder must
be associated with both the independent and dependent variables
(i.e., with the exposure and the disease).
 Association of the confounder with just one of the two variables
is not enough to produce a spurious result.
71
Effect of a confounder
 Could be large
 May produce an over or underestimate of the true effect
 May change the apparent direction of the effect
Controls for confounding
72
 Controls for confounding may be built into the design or analysis
stages of the study
 Design stage
 Randomization
 Restriction
 Matching (on the basis of the potential confounding variables;
especially, age and gender)
 Cases and controls can be individually matched for one or more
variables, or they can be group matched.
 Matching is more expensive and requires specific analytic
techniques
Control Confounding: Analysis Stage
73
 Stratification
 Multivariate Analysis: Multiple Linear Regression
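
To illustrate stratification, the sketch below compares a crude odds ratio with a Mantel-Haenszel pooled odds ratio across two strata of a potential confounder. The Mantel-Haenszel estimator is one common way to summarise a stratified analysis; it is not named in the slides, and all counts are hypothetical, chosen so that the exposure has no effect within either stratum.

```python
# Each stratum is (a, b, c, d):
#   a = exposed cases,   b = exposed non-cases,
#   c = unexposed cases, d = unexposed non-cases
# Hypothetical counts, e.g. stratified by age group.
strata = [
    (10, 90, 40, 360),    # younger: low risk, few exposed
    (160, 240, 40, 60),   # older: high risk, mostly exposed
]


def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Pooled OR: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Crude table: add the strata together cell by cell
crude_cells = [sum(cells) for cells in zip(*strata)]
crude_or = odds_ratio(*crude_cells)
stratum_ors = [odds_ratio(*table) for table in strata]
adjusted_or = mantel_haenszel_or(strata)

print(f"Crude OR: {crude_or:.2f}")
print(f"Stratum-specific ORs: {stratum_ors}")
print(f"Mantel-Haenszel OR: {adjusted_or:.2f}")
```

In this constructed example the crude OR suggests a strong association, while each stratum and the pooled estimate show none: the stratifying variable is associated with both exposure and outcome, which is the pattern confounding requires.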
74 Matching
 One approach to dealing with potential confounders is matching.
 Matching is a technique used to evaluate the effect of a
treatment by comparing the treated and the non-treated units in a study. It
selects subjects so that the distribution of potential confounders is
similar in both groups.
 For example, if we are assessing the effect of opium on total mortality and sex is
a potential confounder, one can match a male opium user to a male opium non-
user and a female opium user to a female opium non-user. This way users and
non-users will be exactly the same for sex, and thus sex could not confound the
association.
 By extension, one can match for more than one variable, such as by age and sex.
 For example, a 56-year-old male opium user can be matched to a 56-year-old
male non-user.
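
The opium example can be sketched as exact one-to-one matching on sex and age. The records and the function name `match_exact` are hypothetical and only illustrate the idea; real studies often match within age bands rather than on exact age.

```python
from collections import defaultdict

# Hypothetical subjects: (id, sex, age). Not data from any study.
users = [("u1", "M", 56), ("u2", "F", 43), ("u3", "M", 61)]
non_users = [("n1", "F", 43), ("n2", "M", 56), ("n3", "M", 56), ("n4", "F", 70)]


def match_exact(exposed, unexposed):
    """Pair each exposed subject with one unused unexposed subject
    of the same sex and age; return (pairs, unmatched exposed ids)."""
    pool = defaultdict(list)
    for subject_id, sex, age in unexposed:
        pool[(sex, age)].append(subject_id)

    pairs, unmatched = [], []
    for subject_id, sex, age in exposed:
        candidates = pool[(sex, age)]
        if candidates:
            # Take the first available match so nobody is used twice
            pairs.append((subject_id, candidates.pop(0)))
        else:
            unmatched.append(subject_id)
    return pairs, unmatched


pairs, unmatched = match_exact(users, non_users)
print("Matched pairs:", pairs)
print("Unmatched users:", unmatched)
```

Subjects without a match, such as the 61-year-old male user here, drop out of the comparison, which is one practical cost of matching on several variables.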
