
Sources of Data for Use in Epidemiology

SB Boadi-Kusi
Learning Objectives
• Discuss criteria for assessing the quality
and utility of epidemiologic data
• Indicate privacy and confidentiality issues
that pertain to epidemiologic data
• Discuss the uses, strengths, and
weaknesses of various epidemiologic data
sources
Criteria for the Quality and
Utility of Epidemiologic Data

• Nature of the data
• Availability of the data
• Completeness of population coverage
– Representativeness (external validity)
– Thoroughness
• Value and limitations
Nature of the Data

• Refers to the source of data, e.g., vital
statistics, case registries, physicians’
records, surveys of the general population,
or hospital and clinic cases.
• Will affect the types of statistical analyses
and inferences that are possible.
Availability of the Data

• Refers to the investigator’s access to data.
• For example, medical records and other
data with personal identifiers may not be
used without patients’ consent.
Representativeness and
Thoroughness

• Representativeness (external validity)--
generalizability of findings to the
population from which the data have
been taken.
• Thoroughness--the extent to which all
cases of a health phenomenon have
been identified.
Value and Limitations
• The utility of the data for various types of
epidemiologic research.
• Factors inherent in the data may limit their
usefulness.
– Incomplete diagnostic information.
– Case duplication.
Computerized Bibliographic
Databases
• Facts related to the distribution of
diseases can be obtained through such
sources as: Index Medicus, Psychological
Abstracts, Sociological Abstracts,
Education Index.
• On-line databases include Medline,
Toxline, and DIALOG.
• Internet, including World Wide Web.
Confidentiality
• Privacy Act of 1974
– Prohibits the release of confidential data
without the consent of the individual.
• Freedom of Information Act
– Mandates the release of government
information to the public, except for
personal and medical files.
• The Public Health Service Act
– Protects confidentiality of information
collected by some federal agencies, e.g.,
NCHS.
Data Sharing
• Refers to the voluntary release of
information by one investigator or
institution to another for the purpose of
scientific research.
• Key issue is the primary investigator’s
potential loss of control over information.
Record Linkage
• Joining data from two or more sources,
e.g., employment records and mortality
data.
• Applications include genetic research,
planning of health services, and chronic
disease tracking.
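A minimal sketch of the linkage idea described above, assuming both sources share an exact identifier. The field names (`worker_id`, `plant`, `cause_of_death`) and all records are hypothetical illustrations, not real data. Real linkage projects often lack a shared identifier and must match probabilistically on name, date of birth, and similar fields.

```python
# Hypothetical employment and mortality records (illustrative only).
employment = [
    {"worker_id": "W001", "plant": "North"},
    {"worker_id": "W002", "plant": "South"},
    {"worker_id": "W003", "plant": "North"},
]
mortality = [
    {"worker_id": "W002", "cause_of_death": "ischaemic heart disease"},
    {"worker_id": "W004", "cause_of_death": "stroke"},
]


def link_records(left, right, key):
    """Deterministic linkage: pair records whose key values match exactly."""
    # Index the second source by the shared key for fast lookup.
    index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            # Merge the fields of the two matching records into one.
            linked.append({**row, **match})
    return linked


linked = link_records(employment, mortality, "worker_id")
for record in linked:
    print(record)
```

Only worker W002 appears in both sources, so the linked file contains a single record that carries both the employment and the mortality fields.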
Statistics Derived from the
Vital Registration System
• Mortality statistics
• Birth statistics: certificates of birth and
fetal death.
Mortality Statistics
• Mortality data are nearly complete in
most developed countries, where few
deaths go unreported.
• Death certificates include demographic
information about the deceased and cause
of death (immediate cause and contributing
factors).
• The situation is different in developing
countries, where death registration is
often incomplete.
Limitations of Mortality Data
• Certification of cause of death.
– For example, in an elderly person with
chronic illness, exact cause of death may be
unclear.
• Lack of standardization of diagnostic
criteria.
• Stigma associated with certain diseases,
e.g., AIDS, may lead to inaccurate
reporting.
Limitations of Mortality Data
(cont’d)
• Errors in coding by nosologist.
• Changes in coding.
– Revisions of the International
Classification of Diseases (ICD).
– Sudden increases or decreases in a
particular cause of death may be due to
changes in coding.
Birth Statistics: Certificates of
Birth and of Fetal Death
• Birth certificate includes information that
may affect the neonate, such as
congenital malformations, birth weight,
and length of gestation.
• Sources of unreliability:
– Mothers’ recall of events during pregnancy
may be inaccurate.
– Conditions that affect neonate may not be
present at birth.
Birth Statistics (cont’d)
• Varying state requirements for fetal death
certificates.
• Both types of certificates have been used
in studies of environmental influences
upon congenital malformations.
• Both provide nearly complete data.
Reportable Disease Statistics
• Statutes require health care
providers to report cases of
diseases classified as reportable and
notifiable.
– These include infectious and communicable
diseases that endanger a population, e.g.,
STDs, measles, foodborne illness.
Limitations of Reportable
Disease Statistics
• Possible incompleteness of population
coverage.
– For example, asymptomatic persons would
not seek treatment.
• Failure of physicians to fill out required
forms.
• Unwillingness to report cases that carry a
social stigma.
Screening Surveys
• Conducted on an ad hoc basis to identify
individuals who may have infectious or
chronic diseases. Examples: breast
cancer screenings, health fairs.
• Clientele are highly selected.
– Individuals who participate are concerned
about the particular health issue.
Multiphasic Screening
Programs
• Ongoing screening programs often are
carried out at worksites.
• Data can be useful for research on
occupational health problems.
• Data may be biased by worker attrition and
turnover.
• Data may not contain etiologic information.
Disease Registries
• Registry--a centralized database for
collection of data about a disease.
• Coding algorithms are used to maintain
patient confidentiality.
• Applications of registries:
– Patient tracking
– Identification of trends in rates of disease
– Case-control studies
• Example: SEER program
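The slide does not say which coding algorithms registries use. One possible approach, sketched here purely as an assumption, is to replace each patient identifier with a keyed hash so that records can still be tracked and de-duplicated without storing the identifier itself. The key, the identifier format, and the `pseudonymize` helper are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret held only by the registry; not a real key.
SECRET_KEY = b"replace-with-a-registry-held-secret"


def pseudonymize(patient_identifier: str) -> str:
    """Return a stable code for a patient without exposing the identifier."""
    # Normalize so trivial formatting differences map to the same code.
    normalized = patient_identifier.strip().upper()
    digest = hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


code_a = pseudonymize("GH-1234567")
code_b = pseudonymize(" gh-1234567 ")  # same patient, different formatting
code_c = pseudonymize("GH-7654321")    # a different patient

print(code_a == code_b)  # same patient maps to the same code
print(code_a == code_c)  # different patients map to different codes
```

Because the same identifier always yields the same code, repeat reports for one patient can be recognized, which also helps with the case-duplication problem noted earlier.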
Surveillance, Epidemiology, and
End Results (SEER) Program
• Conducted by the National Cancer
Institute (NCI), USA.
• Collects cancer data from different
cancer registries across the U.S.
• Provides information about trends in
cancer incidence, mortality, and survival.
Morbidity Surveys of the
General Population
• Morbidity surveys collect data on the
health status of a population group.
• Obtain more comprehensive information
than would be available from routinely
collected data.
• Example: National Health Survey
National Health Survey
• Authorized under the National Health
Survey Act of 1956 to obtain information
about the health of the U.S. population.
• Conducted by the NCHS; consists of
three programs:
– National Health Interview Survey (HIS, a
household health interview survey)
– Health Examination Survey (HES)
– Surveys of health resources
Household Interview Survey
(HIS)
• General household health survey of the
U.S. civilian noninstitutionalized
population.
• Studies a comprehensive range of
conditions such as diseases, injuries,
disabilities, and impairments.
• A comparable household survey in Ghana:
the Ghana Demographic and Health Survey.
Health Examination Survey
(HES)
• Provides direct information about morbidity
through examinations, measurements, and
clinical tests.
– Identifies conditions previously unreported or
undiagnosed.
– Provides information not previously available
for a defined population.
• Now known as the Health and Nutrition
Examination Survey (HANES).
Insurance Data
• Sources include:
– Social Security--provides data on disability
benefits and Medicare.
– Health insurance--provides data on those
who receive care through a prepaid medical
program.
– Life insurance--provides information on
causes of mortality; also provides results of
physical examinations.
Hospital Data
• Consists of both inpatient and outpatient
data.
• Deficiencies of data:
– Not representative of any specific
population.
– Different information collected on each
patient.
– Settings may differ according to social class
of patients; e.g., specialized clinics,
emergency rooms.
Diseases Treated in Special
Clinics and Hospitals
• Data cannot be generalized because
patients are a highly selected group.
• Case-control studies can be done with
unusual and rare diseases.
– However, it is not possible to determine
incidence and prevalence rates without
knowing the size of the denominator.
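To illustrate the denominator problem above: a rate is the number of cases divided by the population from which those cases arose. The sketch below uses made-up numbers; the catchment population in particular is an assumption that a special clinic usually cannot supply, which is why its case counts alone do not yield incidence or prevalence.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 population; requires a known, positive denominator."""
    if population <= 0:
        raise ValueError("population (denominator) must be positive")
    return cases * 100_000 / population


# Hypothetical figures for illustration only.
clinic_cases = 12                # numerator: cases seen at the clinic
catchment_population = 240_000   # denominator: assumed known here

rate = rate_per_100k(clinic_cases, catchment_population)
print(f"{rate:.1f} cases per 100,000")
```

With only `clinic_cases` available, the function cannot be called meaningfully: the same 12 cases imply very different rates depending on the size of the source population.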
Data from Physicians’ Practices
• Limited application due to:
– Confidentiality of patient data.
– Highly selected group of patients.
– Lack of standardization of information
collected.
• Useful for the purposes of:
– Verification of self-reports.
– Source of exposure data.
Absenteeism Data
• Records of absenteeism from work or
school.
• Possible deficiencies:
– Data omit people who neither work nor
attend school.
– Not all people who are ill take time off.
– Those absent are not necessarily ill.
• Useful for the study of rapidly spreading
conditions.
School Health Programs
• Provide information about
immunizations, physical exams, and self-
reports of illness.
• Have been used in studies of
intelligence, mental retardation, and
disease etiology.
• Paffenbarger et al. used information
from health records of college students
to track causes of chronic diseases.
Morbidity Data from the Armed
Forces
• Reports from physicals, hospitalizations,
and selective service examinations.
• Data have been used for:
– Studies of disease etiology.
• Study of twins who served in the Korean War
or WWII to determine the influence of “nature
and nurture” on the cause of disease.
– Studies investigating genetic factors in
obesity.
Other Data Sources Relevant
to Epidemiologic Studies

• Census Bureau publications:
– Statistical Abstract of Ghana
– Regional & District Data Book
– Decennial Censuses of Population and
Housing
Bureau of the Census
• Provides information on the general,
social, and economic characteristics of
the population.
• Census administered every 10 years.
– Attempts to account for every person and
his or her residence.
– Characterizes population according to sex,
age, family relationships, and other
demographic variables.
Census Tracts
• Small geographic subdivisions of regions,
districts, and sub-districts.
• Are designed to provide a degree of
uniformity in population characteristics,
economic status, and living conditions
in each tract.
THANK YOU
