PREVALENCE STUDIES
INTRODUCTION
Definition
Use in public health and research
MEASUREMENTS OF PREVALENCE
Point prevalence
Period prevalence
Lifetime prevalence
EXAMPLES OF PREVALENCE STUDIES
Seroprevalence studies
Repeat prevalence studies
METHODOLOGY
Sampling
Sample size
Primary and secondary source of data
ADDITIONAL READING
EXERCISES
INTRODUCTION
Definition
Prevalence (cross-sectional) studies are the most common population-based epidemiological
studies. They are designed to estimate the frequency of a health event in the population at a
point in time or over a short period of time. Cross-sectional studies can also be used to
investigate associations between risk factors and disease, although this is not the most
efficient design to study causality.
The population at risk is usually the population living in the study area, or it is defined by
geographical, administrative, demographic, occupational, or other parameters, such as
health services clients. Prevalence is reported on a population base, e.g., 5 cases of a
disease per 100 inhabitants (5%).
Prevalence is influenced by the incidence (I) and the mean duration (D) of the disease. As a
proportion, its numerator is part of its denominator, so it has no unit and ranges from 0 to
1. When incidence and population dynamics are constant, prevalence (P) may be
calculated as:
P = I × D
The duration of a disease can be obtained when the incidence and prevalence are known.
An area reporting, for example, an incidence of 3.3 new cases of tuberculosis per year, per
100,000 inhabitants, and prevalence rate of 19.8/100,000 will estimate an average duration
of the disease as :
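D = P / I = 19.8 / 3.3 = 6 years
(the result takes the time unit of the incidence rate, here per year; it would equal 6 months only if the incidence were expressed per month)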
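The relation D = P / I used above can be checked numerically. The following is a minimal sketch in Python (it is not part of the Epi Info exercises, and the function names are hypothetical). It assumes a steady state, and that prevalence and incidence are expressed on the same population base; the resulting duration is in whatever time unit the incidence rate uses.

```python
def mean_duration(prevalence: float, incidence: float) -> float:
    """Mean disease duration D = P / I under steady-state assumptions.

    Both inputs must share the same population base (e.g. per 100,000).
    The result is expressed in the time unit of the incidence rate.
    """
    if incidence <= 0:
        raise ValueError("incidence must be positive")
    return prevalence / incidence


def steady_state_prevalence(incidence: float, duration: float) -> float:
    """Prevalence P = I * D when incidence and population dynamics are constant."""
    return incidence * duration


# Figures from the tuberculosis example in the text (per 100,000 inhabitants).
d = mean_duration(prevalence=19.8, incidence=3.3)
print(f"D = {d:.1f} (in the time unit of the incidence rate)")
print(f"P = {steady_state_prevalence(3.3, d):.1f} per 100,000")
```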
of a follow-up to exclude those who are already sick/infected, and the other to detect the
emergence of new cases. For infectious diseases of rapid evolution prevalence
measurement has no significance. For events (infections and diseases) of longer or chronic
duration, prevalence may indicate the risk of exposure for susceptible individuals.
Use in Public Health and Research - Prevalence studies are often used as a baseline
measurement for the monitoring of control programmes. They are also used in the
selection of participants for other studies such as case-control, cohort and clinical trials.
For example, an initial serological screening for Trypanosoma cruzi infection in a
large population of schoolchildren in a rural area of Brazil reported a prevalence of 7.9%
(95% confidence interval 6.8%-9.1%) (Andrade et al., 1992). A sample of the
seropositive children was then selected to participate in a clinical trial to evaluate the
efficacy of benznidazole as a specific treatment. In addition, seropositive children and
matched seronegative controls were compared in a case-control study to evaluate
environmental, familial and nutritional risk factors associated with T. cruzi infection.
MEASUREMENTS OF PREVALENCE
The most commonly used types of prevalence are point, period and lifetime
prevalence.
Lifetime prevalence - is the total number of persons known to have had the disease or
attribute for at least part of their life.
Figure 1 illustrates the concepts of point and period prevalence in malaria. The point
prevalence in endemic areas of malaria can be obtained by the parasitological screening of
a population over a short period of time. Differences between the prevalence of infection
and the incidence of clinical cases depend on the levels of endemicity. According to the
example in figure 1, at the beginning of 1992 the point prevalence of symptomatic
malaria was 4 cases, and 5 new cases were diagnosed during the year (incidence), yielding
a period prevalence of 9. At the beginning of 1993 the point prevalence of infection
was 12 cases and the number of clinical cases 3, which illustrates the differences between
point prevalences of infection and disease, respectively.
Prevalence studies
Prevalence estimates in control activities are influenced by operational and diagnostic
criteria. Changes in case definition, treatment schemes and discharge criteria may change
prevalence figures. Mass interventions potentially interfere with the transmissibility of an
infectious disease, its incidence, duration and characteristics of the infection/disease of
existing cases. In the case of leprosy control, for example, target areas for elimination are
defined as those with prevalence rates below 1 case per 10,000.
Figure 2 illustrates the concept of point and period prevalence for leprosy. Assume 500
cases (N) at the beginning of the period (t0) and that all new cases (A = 250), regardless of
their clinical form, occurred at the same time, at mid-year (t1). The period prevalence
(Δt1) is 750 cases: 500 at the start of the period plus 250 new cases. Assuming that at time
t1 there were 350 discharges (B = 350), the prevalence at point t1 is the number of
remaining cases (N - B = 150) plus the new cases (A = 250), which totals 400 cases. Thus, in a
situation of stable incidence, reduction of the point prevalence will depend on the number
of patients treated (cured or discharged) and the proportion that defaulted from treatment.
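The arithmetic of the malaria and leprosy examples can be written out as a short Python sketch. The function names are hypothetical and the model is deliberately simplified: it only counts cases at the start, new cases, and discharges, exactly as in the text.

```python
def period_prevalence(cases_at_start: int, new_cases: int) -> int:
    """All cases present at any time during the period."""
    return cases_at_start + new_cases


def point_prevalence(cases_at_start: int, new_cases: int, discharges: int) -> int:
    """Cases still on the register at a given point in time."""
    return cases_at_start - discharges + new_cases


# Leprosy example from the text: N = 500, A = 250, B = 350.
N, A, B = 500, 250, 350
print("Leprosy period prevalence:", period_prevalence(N, A))
print("Leprosy point prevalence at t1:", point_prevalence(N, A, B))

# Malaria example from the text: 4 prevalent cases plus 5 new cases in the year.
print("Malaria period prevalence, 1992:", period_prevalence(4, 5))
```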
Figure 2. Leprosy - Point and period prevalence
[Diagram: N = 500 cases at t0; new cases (A = 250); discharges (B = 350), comprising treatment and defaulters; N - B = 150; time axis t0, t1, t2]
EXAMPLES OF PREVALENCE STUDIES
Seroprevalence studies - are particularly useful for infectious diseases that induce
an antibody response or other biological markers. Seroprevalence studies are used to
determine the geographic distribution of many infections, such as hepatitis A, B and C
and HIV, and also in surveys before and after vaccination to evaluate antibody seroconversion.
Prevalence is estimated with respect to age and sex in order to understand the dynamics of
transmission of infection in the community. This type of analysis allows the identification
of areas of high risk within the community, carriers, immune and susceptible individuals.
The analysis should indicate the current and past disease/infection/immunity situation,
providing useful information to predict future risk of transmission.
METHODOLOGY
Sampling
This also allows for the extrapolation of study results to other communities (external
validity).
Stratified sampling - this involves dividing the population into distinct subgroups
(strata) according to some important characteristic and selecting a random sample from each
subgroup. If the proportion of the sample drawn from each stratum is the same as that stratum's
proportion of the total population, then all strata will be fairly represented with regard to the
number of persons in the sample. A two-stage sampling method was developed by EPI-WHO to
evaluate vaccination coverage and the quality of health services: 30 urban settlements are
selected, and 7 children in the given age group are selected in each settlement.
Sample size
While a probability sample gives a study internal validity, the precision of the prevalence
estimate obtained depends on the sample size. Thus, the width of the confidence interval
(the interval estimate of the prevalence in the population) reflects the degree of
precision conferred by the size of the sample chosen.
n = Z² × P × (1 - P) / D²
where:
n = the required sample size
Z = the value of the standard normal distribution for the desired confidence level (Z =
1.96 for the 95% confidence interval, 95% CI)
P = the expected prevalence
D = the highest acceptable error in the estimate (half-width of the CI, a measure of
precision)
For example, to estimate the seropositivity for dengue virus antibody in a population of
about 1 million inhabitants, with an expected prevalence of 15% (P = 0.15) and a 95% CI
with a total width of 12% (D = 0.06), the number of persons to be studied would be:
n = (1.96 × 1.96) × [0.15 × (1 - 0.15)] / (0.06 × 0.06) ≈ 136
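The sample size formula above can be evaluated with a few lines of Python. This is a minimal sketch, not the Epi Info or OpenEpi implementation: the function names are hypothetical, no finite population correction is applied (with about 1 million inhabitants it would be negligible), and the 10% allowance for refusals is an assumed figure used only for illustration.

```python
import math


def sample_size_for_prevalence(p: float, d: float, z: float = 1.96) -> float:
    """n = Z^2 * P * (1 - P) / D^2 for estimating a proportion.

    p: expected prevalence (0 < p < 1)
    d: highest acceptable error (half-width of the confidence interval)
    z: standard normal value for the confidence level (1.96 for 95%)
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    return z * z * p * (1 - p) / (d * d)


def inflate_for_losses(n: float, loss_rate: float) -> int:
    """Increase n so that n subjects remain after an expected loss_rate."""
    return math.ceil(n / (1 - loss_rate))


# Dengue example from the text: P = 0.15, D = 0.06, 95% confidence.
n = sample_size_for_prevalence(p=0.15, d=0.06)
print(f"Calculated n = {n:.1f}, rounded up to {math.ceil(n)}")

# ASSUMPTION: 10% refusals or losses (hypothetical figure, not from the text).
print("Allowing for 10% losses:", inflate_for_losses(n, 0.10))
```

Rounding up rather than to the nearest integer is a conservative choice, so that the achieved precision is never worse than the one requested.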
The estimated sample size should be increased to compensate for possible refusals or losses.
The sample size conveys an idea of the order of magnitude of the population needed for
the study, but must not be rigid, as it is calculated on the basis of an estimated parameter
(expected prevalence). This estimate is usually obtained from a review of the literature.
Sample sizes must be based on different estimates of prevalence and precision in keeping
with the purpose of the study. A balance between what is desirable and what is practically
possible should be achieved. Opinion surveys are generally conducted on about 1,000
persons to obtain good precision (for example, 95% CI with a maximum width of 6%). It
should be emphasized that prevalence studies are not suited for events of low frequency of
occurrence.
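The link between sample size and precision, including the opinion-survey figure of about 1,000 persons, can be illustrated with a short Python sketch. It uses the normal (Wald) approximation for the confidence interval of a proportion, which is an assumption of this example and becomes unreliable for very small samples or prevalences close to 0 or 1. The function name is hypothetical.

```python
import math


def ci_half_width(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation CI for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)


# p = 0.5 gives the widest interval, so it bounds the width for any prevalence.
for n in (100, 400, 1000, 4000):
    half = ci_half_width(0.5, n)
    print(f"n = {n:>4}: 95% CI half-width = {half:.3f}, total width = {2 * half:.3f}")
```

For n = 1,000 the total width is close to 6%, which agrees with the figure quoted for opinion surveys. Halving the width requires roughly four times as many subjects.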
In some cases, data generated from information systems of control programmes make it
possible to build up time series. Other sources are the medical histories of general or
referral hospitals and of sentinel hospitals for infectious diseases.
The interpretation of secondary data requires a knowledge of the coverage and quality of
the information; of changes in the definition of cases over time; of administrative actions
such as changes from voluntary to transitory reporting, and changes in established
interventions and report forms.
The epidemiological interpretation should recognize the limitations, quality of the existing
database, potential biases associated with determination of the disease and the selection of
cases for treatment.
Types of bias
Survivor bias – systematic error arising in cross-sectional studies from including only
prevalent cases. Cases with rapid evolution and early deaths are excluded, while longer
survival cases tend to be over-represented. Since the probability of surviving a disease
affects its prevalence, studies based on prevalent cases generate associations that reflect
determinants of the survival of cases.
Stored collections of clinical and laboratory specimens are sometimes used to estimate
the prevalence of some diseases. Serum and biological material banks that do not record a
description of the population from which the specimens have been taken, the sampling
method used, and the circumstances in which they were obtained are without value for
epidemiological purposes. In order for the results obtained from these tests to represent
the actual prevalence, all the requirements of a project design must be satisfied: (a) clear
purposes, (b) representation of the population of interest, (c) sample size, and (d)
knowledge of the tests to be used, their sensitivity and specificity, the limits of their
accuracy and their significance.
DATA ANALYSIS
Measurement of prevalence
In prevalence studies the association between exposure and disease can be evaluated.
The relative risk can be estimated by the odds ratio, especially when the
disease/outcome is rare. The ratio between the two prevalences (exposed and not
exposed), called the prevalence ratio (PR), can also be used.
For example, to study the association between history of sexually transmitted disease
(STD) and homelessness among children, 496 children (101 homeless and 395 family children
working in the streets) were interviewed for histories of STD. STD history was reported by
24.8% of homeless children and 3.5% of family children working in the streets (Porto et al.,
1994). The results are presented in the following table:
Therefore, the prevalence of reported STD was about 7 times as high among homeless children
as among family children working in the streets (PR = 24.8% / 3.5% ≈ 7.1).
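The prevalence ratio and an approximate confidence interval can be sketched in Python as follows. Two points are assumptions of this example rather than information from the study: the case counts (25 of 101 and 14 of 395) are back-calculated from the reported percentages, and the interval uses the common log-transformation (Katz) method, which may differ from the method used by the authors or by Epi Info. The function name is hypothetical.

```python
import math


def prevalence_ratio_with_ci(cases_exp: int, n_exp: int,
                             cases_unexp: int, n_unexp: int,
                             z: float = 1.96) -> tuple[float, float, float]:
    """Prevalence ratio (exposed / unexposed) with a log-method confidence interval."""
    pr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    # Standard error of ln(PR) for two independent proportions.
    se_log = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = math.exp(math.log(pr) - z * se_log)
    upper = math.exp(math.log(pr) + z * se_log)
    return pr, lower, upper


# ASSUMPTION: counts reconstructed from 24.8% of 101 and 3.5% of 395;
# they are not taken from the original table.
pr, lower, upper = prevalence_ratio_with_ci(25, 101, 14, 395)
print(f"PR = {pr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```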
Stratification - The main technique for evaluating confounding and for
examining interaction (effect modification) between risk factors is stratification.
Stratified analysis is usually done in the following stages:
(a) Divide the study population into strata of the potential confounding variable;
(b) Calculate estimates of the effect of exposure (prevalence ratio and confidence interval)
for each stratum in relation to the baseline exposure level;
(c) Determine whether the magnitudes of the differences between the prevalence ratios of
the different strata suggest interaction or confounding;
(d) Estimate a summary (pooled) prevalence ratio using the Mantel-Haenszel method in case of
confounding.
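The stages of a stratified analysis can be sketched in Python. The data below are entirely hypothetical and were chosen so that the crude and stratum-specific prevalence ratios differ, which is the pattern expected under confounding. The pooled estimate uses the standard Mantel-Haenszel weights for a ratio of proportions; all names are illustrative.

```python
from typing import NamedTuple, Sequence


class Stratum(NamedTuple):
    """Counts for one stratum of the potential confounder."""
    cases_exposed: int
    n_exposed: int
    cases_unexposed: int
    n_unexposed: int


def prevalence_ratio(s: Stratum) -> float:
    """Stratum-specific prevalence ratio (exposed vs. unexposed)."""
    return (s.cases_exposed / s.n_exposed) / (s.cases_unexposed / s.n_unexposed)


def crude_prevalence_ratio(strata: Sequence[Stratum]) -> float:
    """Prevalence ratio obtained by ignoring the stratification variable."""
    pooled = Stratum(
        sum(s.cases_exposed for s in strata),
        sum(s.n_exposed for s in strata),
        sum(s.cases_unexposed for s in strata),
        sum(s.n_unexposed for s in strata),
    )
    return prevalence_ratio(pooled)


def mantel_haenszel_pr(strata: Sequence[Stratum]) -> float:
    """Mantel-Haenszel summary prevalence ratio across strata."""
    num = sum(s.cases_exposed * s.n_unexposed / (s.n_exposed + s.n_unexposed)
              for s in strata)
    den = sum(s.cases_unexposed * s.n_exposed / (s.n_exposed + s.n_unexposed)
              for s in strata)
    return num / den


# HYPOTHETICAL data: two strata of a confounder (e.g. two age groups).
strata = [Stratum(40, 200, 5, 50), Stratum(20, 50, 40, 200)]

for i, s in enumerate(strata, start=1):
    print(f"Stratum {i}: PR = {prevalence_ratio(s):.2f}")
print(f"Crude PR = {crude_prevalence_ratio(strata):.2f}")
print(f"Mantel-Haenszel PR = {mantel_haenszel_pr(strata):.2f}")
```

In this constructed example the stratum-specific ratios are equal to each other but differ from the crude ratio, which suggests confounding rather than interaction. Had the stratum-specific ratios differed markedly from one another, a single summary estimate would not be appropriate.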
Logistic Regression – Although stratification can be used to adjust for
more than one confounding variable, a large number of strata tends to produce cells with
small numbers of observations, with loss of precision in the calculations. This limitation
of stratification in the simultaneous adjustment of several confounding variables can be
overcome to some extent by the use of modeling techniques. Multivariate models help in
understanding the predictive value of a set of variables related to a particular
outcome. If the endpoint is binary, logistic regression models can be applied to
prevalence studies to assess the effect of one exposure in the presence of other
risk factors. When the endpoint is continuous, linear regression is the option. Logistic
regression is usually done in the following steps:
Prevalence studies are not suitable for rare or short-duration diseases, which will affect
few persons at a given point in time. It is frequently difficult to separate cause and effect
(risk factor and disease) because the measurements of exposure and disease are made at the
same time; for this reason, these studies cannot be used to test etiological hypotheses.
ADDITIONAL READING
HENNEKENS, C.H. & BURING, J.E. Epidemiology in Medicine, 5th ed. Boston/Toronto:
Little, Brown and Company, 1987.
PAUL, J.R. & WHITE, C. Serological epidemiology. Academic Press New York and
London, 1973.
EXERCISES
Files: 1. ViewScreen
2. ViewHepbprev
Exercise 1
Serologic screening for Trypanosoma cruzi. A serologic survey was conducted to
estimate the prevalence of T. cruzi infection in schoolchildren aged 7 to 12 years, residing
in endemic rural areas of central Brazil. Blood samples were collected on filter
paper from 1,990 children for indirect hemagglutination (IHA), indirect
immunofluorescence (IIF) and ELISA. Details of the study and methodology are
given in Andrade et al., 1992. The plan of analysis includes (a) comparison of
seroprevalence by each technique, and (b) prevalence ratio by sex and age group. Use
the screen data table, included in the EpiGuide.MDB project, to answer the following
questions.
**Before starting the exercise, route out the results to an HTML file named “Results
SCREEN”. ROUTEOUT 'Results SCREEN' [Figure 1]
1- Click on RouteOut
4 – Click OK
1- Click Frequencies
Note 2: [To continue the exercise you must create a new variable
called “POS”. Consider (+) as positive to at least two
serological tests and (-) as negative.]
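The rule in Note 2 can be expressed outside Epi Info as a short Python sketch. The codes "P" and "N" follow the codebook for the Screen file; the function name and the example records are hypothetical and are not taken from the study data.

```python
def classify_pos(iha: str, iif: str, elisa: str) -> str:
    """Return '+' when at least two of the three tests are positive ('P'), else '-'."""
    n_positive = sum(result == "P" for result in (iha, iif, elisa))
    return "+" if n_positive >= 2 else "-"


# HYPOTHETICAL records for illustration only.
records = [
    {"ID": 1, "IHA": "P", "IIF": "P", "ELISA": "P"},
    {"ID": 2, "IHA": "P", "IIF": "N", "ELISA": "P"},
    {"ID": 3, "IHA": "P", "IIF": "N", "ELISA": "N"},
    {"ID": 4, "IHA": "N", "IIF": "N", "ELISA": "N"},
]

for record in records:
    record["POS"] = classify_pos(record["IHA"], record["IIF"], record["ELISA"])

n_pos = sum(record["POS"] == "+" for record in records)
print(f"POS = '+' in {n_pos} of {len(records)} records")
```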
For Epitable:
Run EPITABLE to calculate the 95% CI – select
DESCRIBE, then select PROPORTION, then select
SIMPLE RANDOM SAMPLING
[Use the results of the previous table]
[Figure 6 – IF command]
3- Choose the Outcome variable
4 – Click OK
1- Click on Tables
1- Click Proportion
2- Click Enter New Data
4 – Click Calculate
[If you are doing the Advanced exercise leave Analysis OPEN
and proceed to Question 5]
[If you are not doing the Advanced analysis of this exercise you
can Exit Analysis]
EXIT [to leave ANALYSIS]
Advanced Exercise
Question 5. What are the adjusted prevalence odds ratios (OR) for age group and sex
after applying a logistic regression technique? What
happened to the associations between exposure to T. cruzi and
sex or age?
DEFINE AGEGR_R
RECODE AGEGR TO AGEGR_R
1=1
2=0
END
1 – Click Logistic Regression
2 – Choose the Outcome
3 – Choose the Independent variables
4 – Click Make Dummy to create dummy variables
Exercise 2
Prevalence of and risk factors for hepatitis B infection. A cross-sectional study was
designed to measure the prevalence of serologic markers for hepatitis B virus infection
(HBV) in first-time blood donors and convicts, and to evaluate risk factors associated
with seropositivity. The viewhepbprev data table, part of the EPIGUIDE.MDB project, includes
results of HBsAg and anti-HBs (ELISA) for 1,033 blood donors and 201 convicts, and
14 potential risk factor variables. Details of the methodology and population are in
Martelli et al., 1990. Positivity to HBsAg or to anti-HBs was taken as HBV infection.
The plan of analysis was designed to (1) evaluate the prevalence of the HBsAg and
anti-HBs markers in the group of donors and convicts; (2) compare sex and age
distribution and frequency of potential risk factors between groups, and (3) calculate
the prevalence ratio of HBV positivity between exposure groups.
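The case definition used in this exercise (positivity to HBsAg or to anti-HBs) can be sketched in Python. The codes 1, 2 and -1 follow the codebook for the Hepbprev file. Treating a subject as unclassifiable when neither marker is positive and at least one result is missing is an assumption of this sketch, not a rule stated in the text; the function names and example values are hypothetical.

```python
from typing import Iterable, Optional

POSITIVE, NEGATIVE, NO_INFORMATION = 1, 2, -1  # codes from the Hepbprev codebook


def hbv_infected(hbsag: int, antihbsag: int) -> Optional[bool]:
    """Classify HBV infection from the two serological markers.

    Returns True if either marker is positive, False if both are negative,
    and None when the available results do not allow classification.
    """
    if hbsag == POSITIVE or antihbsag == POSITIVE:
        return True
    if hbsag == NEGATIVE and antihbsag == NEGATIVE:
        return False
    return None  # ASSUMPTION: missing results make the subject unclassifiable


def prevalence(statuses: Iterable[Optional[bool]]) -> float:
    """Proportion infected among subjects that could be classified."""
    known = [s for s in statuses if s is not None]
    if not known:
        raise ValueError("no classifiable subjects")
    return sum(known) / len(known)


# HYPOTHETICAL (HBSAG, ANTIHBSAG) pairs for illustration only.
markers = [(1, 2), (2, 1), (2, 2), (2, 2), (2, -1)]
statuses = [hbv_infected(h, a) for h, a in markers]
print(f"HBV prevalence among classifiable subjects: {prevalence(statuses):.2f}")
```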
**Before starting the exercise, route out the results to an HTML file named “Results
HEPBPREV”
ROUTEOUT 'Results HEPBPREV'
1- Click Frequencies
2 - Click OK to cancel current selection criteria
5 – Click OK
1- Click on Tables
REFERENCES
For Analysis:
DEAN A.G., DEAN J.A., COULOMBIER D. et al. Epi Info™, Version 6.04, a word
processing, database, and statistics program for public health on IBM-compatible
microcomputers. http://www.cdc.gov/epiinfo/Epi6/ei6.htm
DEAN, A., SULLIVAN, K, & SOE, M.M. OpenEpi - Open Source Epidemiologic
Statistics for Public Health. http://www.openepi.com
Project: EPIGUIDE.MDB
File: Screen
Variable  Description              Code       Description of code
ID        Identification number    1 to 1991
MUN       Municipality             1          Posse
                                   2          Simolândia
                                   3          Guarani
SEX       Sex                      1          Male
                                   2          Female
IHA       Hemagglutination test    P          Positive
                                   N          Negative
IIF       Immunofluorescence test  P          Positive
                                   N          Negative
ELISA     ELISA                    P          Positive
                                   N          Negative
Project: EPIGUIDE.MDB
File: Hepbprev
Variable   Description                             Code  Description of code
ID         Identification number
INJMED     Use of injectable medicine              -1    No information
                                                   1     Yes
                                                   2     No
INJDRUG    Use of injectable drug                  -1    No information
                                                   1     Yes
                                                   2     No
TATTOO     Presence of tattoo                      -1    No information
                                                   1     Yes
                                                   2     No
ACP        Acupuncture                             -1    No information
                                                   1     Yes
                                                   2     No
HBSAG      Serology for HBsAg                      -1    No information
                                                   1     Positive
                                                   2     Negative
ANTIHBSAG  Serology for anti-HBsAg                 -1    No information
                                                   1     Positive
                                                   2     Negative
VDRL       Serology for VDRL                       -1    No information
                                                   1     Positive
                                                   2     Negative
YEXP       Years of incarceration                  -1    Blood donors
                                                   0     Less than 1 year
                                                   1     1 year
                                                   2     2 or more years
GROUP      Population under study                  1     Convicts
                                                   2     First-time donors
STD        Report of sexually transmitted disease  1     Yes
                                                   2     No