Measuring Population Health

Week 3 – Data sources

Dr Linda Ng Fat
Department of Epidemiology and Public Health
l.ngfat@ucl.ac.uk, @lindangfat
Outline

• 10.05-10.45 Lecture
• 10.45-10.55 Break
• 10.55-11.45 Lecture part II
• 11.45-12.00 Break
• 12.00-12.45 Computer Practical
• 12.45-13.00 Summing up

Objectives

• Learn about different data sources that are
necessary to measure population health
• Consider advantages and disadvantages of
different sources of data
• Understand ethical issues concerning the use of
data including the Data Protection Act 1998

What data do we need to measure population
health?
Population Health

Health Status
Examples
• Chronic diseases
• Infectious diseases
• Physical/Mental health indicators

Lifestyle factors
Examples
• Physical activity
• Diet
• Tobacco and Alcohol consumption

Demographic factors
Examples
• Sex
• Age
• Marital status
• Births/Deaths
• Geography/Environment

Social and Economic Indicators
Examples
• Social class, Employment, Education, Income
• Housing
• Ethnicity

• Discuss with the person sitting next to you all
the forms of official data that you are aware
have been collected from you.

Data sources
• Vital registration systems
• Censuses
• Surveys
• Surveillance systems
• Administrative
– Primary Care records
– Hospital records
– Non-health (sector) records
• Maps
• Qualitative
• Big Data

Vital registration
• Register vital events, for example, births,
marriages and deaths
o Universal
o Compulsory
• These systems are the best and most reliable
source for fertility, mortality, life expectancy and
cause-of-death statistics
• Requires expensive infrastructure
• Incomplete in many low- and middle-income
countries

Vital registration in UK

• UK, since 1837 – births, marriages and deaths


• Legal requirement to register birth within 42 days,
death within 5 days (8 days in Scotland)

• Births – sex, date, occupation of father, address;
birth weight not recorded
• Death – date, cause, occupation, age

Global vital registration systems

• In 2009…

o only 25% of the world population lived in countries where at
least 90% of births and deaths are registered

o 74 countries lacked data altogether about births and deaths

o In the WHO African Region 42 out of 46 countries had no
death registration data

Global birth registrations

• Industrialised countries: 98% of births registered

• Sub-Saharan Africa: 45% of births registered

• India, China
– Longitudinal registration of demographic events in a
nationally representative sample

Censuses
• Count of a national population at a single point
in time
• Collected at regular intervals e.g. UK every ten
years since 1801
• The high cost of full coverage limits the number of
questions and their accuracy
• Essential features defined by the UN:
– Individual enumeration
– Universality within a defined territory
– Simultaneity
– Defined periodicity

General health question in the 2011 UK
Census

Disability question in the 2011 UK Census

Surveys
• Information is collected on a sample of the
population
• Data are representative of a specific
population (often national) but the survey needs to
satisfy certain criteria, e.g. random sampling, a
large enough sample and a good response rate
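
The criteria above can be illustrated with a minimal sketch. It assumes a made-up sampling frame of 10,000 household IDs and invented fieldwork numbers; the function names are hypothetical and are not taken from any survey's actual methodology. It draws a simple random sample, computes the response rate, and gives a normal-approximation 95% confidence interval for an estimated proportion.

```python
import math
import random


def simple_random_sample(sampling_frame, n, seed=0):
    """Draw n units without replacement, each equally likely to be selected."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)


def response_rate(n_responded, n_selected):
    """Share of selected units that actually took part."""
    return n_responded / n_selected


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


# Hypothetical frame of 10,000 household IDs, of which 1,000 are selected.
frame = list(range(10_000))
selected = simple_random_sample(frame, 1_000)

# Invented fieldwork outcome: 600 households respond, 150 report a condition.
rate = response_rate(600, len(selected))
low, high = proportion_ci(150, 600)
print(f"response rate: {rate:.0%}")
print(f"95% CI for the proportion: {low:.3f} to {high:.3f}")
```

A larger responding sample narrows the interval, but a low response rate can still bias the estimate if non-responders differ from responders.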

Uses
• Have rich data on a specific health topic as well as
living standards and other complementary variables
• Often repeated over time (aka routine data), allowing
for measurement of time trends
• Conducted in multiple countries, allowing for
benchmarking
• Currently the most common and overall most reliable
data source for health monitoring in low- and middle-
income countries

Demographic and Health Surveys (DHS)

• Nationally-representative household surveys


• 5,000-30,000 households every 5 years
• Wide range of monitoring and impact evaluation
indicators in the areas of population, health, and
nutrition
• Funded by US Agency for International
Development
• http://www.measuredhs.com/

Demographic and Health Surveys countries

The Health Survey for England
• A series of annual cross-sectional surveys since 1991
• Approx. 4000-15000 households randomly selected and
interviewed each year
• Self-reported data is initially collected through face-to-face
interviews
• Biomedical data collected by a nurse at a later visit
• A large, randomly selected sample means we can
make inferences about the general population living in
private households

Health Survey for England
Covers
• Social & demographic indicators
• Physical health
• Lifestyle behaviours
• Social care
• Physical measures
• Mental health and wellbeing
• Some years include additional boost samples of
specific subgroups, e.g. ethnic minorities, children
http://healthsurvey.hscic.gov.uk/media/63790/HSE2016-pres-med.pdf
Ng Fat et al. BMC Public Health 2018 18:1090
https://bmcpublichealth.biomedcentral.com/articles/10.1186/s12889-018-5995-3
Non-drinking is on the increase among
young people

What was the impact of smokefree
legislation on exposure to secondhand
smoke?

Base: non-smoking adults

Sims, M et al, Environ Health Perspect. 2012 Mar;120(3):425-30.


Other national UK health surveys

• Adult Dental Health Survey


• Adult Psychiatric Morbidity Survey
• Children’s Dental Health Survey
• Infant Feeding Survey
• Local Health and Wellbeing Survey for Younger People
• National Study of Health and Wellbeing
• NHS Stop Smoking Services
• Smoking at Time of Delivery
• Smoking, Drinking and Drug Use among Young People in England
• Survey of Carers in Households, England
• What About Youth
Large-scale UK surveys

• General Lifestyle Survey (General Household Survey)


• Labour Force Survey
• Health Survey for England/Wales/Scotland
• Living Costs and Food Survey (Expenditure and Food Survey)
• Crime Survey for England and Wales (British Crime Survey)
• Family Resources Survey
• Opinions and Lifestyle Survey (ONS Opinions Survey, ONS
Omnibus Survey)
• English Housing Survey (Survey of English Housing)
• British Social Attitudes
• National Travel Survey
http://ukdataservice.ac.uk
Benefits of large-scale government data

• Good quality data


– produced by experienced research organisations
– usually nationally representative with large samples
– good response rates
– very well documented
• Continuous data
– allows comparison over time
– data is largely cross-sectional
• Hierarchical data
– intra-household differences
– household effects on individuals
Limitations of surveys
• Biases can arise including
o Selection bias (e.g. low response rates)
o Non-response
o Measurement issues – reliability & validity (e.g. social
desirability, recall biases)
o Respondent fatigue
• Survey may not be representative of small
subpopulations of interest
• May exclude segments of the population, e.g.
homeless people, people living in mental health institutions
Types of surveys
Cross-sectional
• Observational data is collected from a
population at a specific point in time

Longitudinal/prospective
• A survey that records observations from the
same groups of individuals over time
• Cohort or panel design

Now we are fifty:
http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=112&sitesectiontitle=Published+work+using+NCDS+data

NCDS Sweeps/Waves (data collection points)

Establishing the effects of maternal smoking

http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=112&sitesectiontitle=Published+work+using+NCDS+data
Knuppel et al. Scientific Reports. 2017 7;3287
Other UK longitudinal studies

• Hertfordshire Cohort Study


• MRC National Survey of Health and Development (1946)
• 1958 National Child Development Study
• 1970 British Cohort Study
• Avon Longitudinal Study of Parents and Children
• Southampton Women’s Survey
• Millennium Cohort Study
• Understanding Society: The UK Household Longitudinal Study
• Whitehall study I & II
• British Regional Heart Study

http://www.closer.ac.uk/data-resources/timeline/
MRC National Survey of Health and Development
1946-2016, celebrating 70 years

• https://www.nshd.mrc.ac.uk/nshd/70th-birthday-brochure/
• Further reading: ‘The Life Project’ Helen Pearson
Longitudinal/Prospective surveys

• Advantages
o Can examine events and outcomes across time,
including a life span
o The ‘reverse causality’ problem of cross-sectional
studies is less severe
• Disadvantages
o Costly to administer, not widely available in low-income
countries (LIC)
o Attrition can be a problem
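
Attrition can be quantified as the share of baseline participants who do not appear in a later sweep. The sketch below is illustrative only: the participant IDs are invented and `attrition_rate` is a hypothetical helper, not part of any cohort study's tooling.

```python
def attrition_rate(baseline_ids, followup_ids):
    """Share of baseline participants missing from a later sweep."""
    baseline = set(baseline_ids)
    retained = baseline & set(followup_ids)
    return 1 - len(retained) / len(baseline)


# Invented participant IDs at two data collection points.
sweep_1 = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110]
sweep_2 = [101, 102, 104, 105, 108, 110]

rate = attrition_rate(sweep_1, sweep_2)
print(f"attrition between sweeps: {rate:.0%}")
```

Attrition matters most when those who drop out differ systematically from those who stay, because the remaining sample then no longer represents the original cohort.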

UK Data Service - https://www.ukdataservice.ac.uk/

Administrative sources
• Data is primarily collected for
administrative purposes and not for health
research/monitoring purposes

• Largely collected by institutions, e.g.
hospitals, and/or the UK Government, e.g.
welfare, educational, tax record systems

Administrative
• Primary Care records
• Hospital records
• Non-health (sector) records

Hospital Episode Statistics in England
• Hospital Episode Statistics (HES): Data
warehouse that contains details of:
• Inpatient admissions (1989/90 – present)
• Outpatient appointments (2003/4 – present)
• Accident and Emergency (A&E) attendances
(2007/8 – present)
• Record based system that covers all NHS Trusts
in England

Hospital Episode Statistics in England

• There are hundreds of variables. Each HES record
contains a wide range of information about an individual
patient admitted to an NHS hospital:
o Clinical: diagnoses (ICD-10), procedures (OPCS4)
o Patient information: age, gender, ethnicity
o Administrative: waiting times, method of admission
o Geographies: of treatment, residence (LA, CCG, GP)
o Specifics: maternity (birth and maternity episodes)
o Costing: Healthcare Resource Group (HRG), length of
stay

What are the limitations of HES?

• Specific data-quality issues:
• Boundary and organisational changes over time.
• Poor data: Maternity, Critical Care, earlier years.
• Excludes (most) activity in private hospitals
• Coding issues (e.g. ethnicity)
• Not all illnesses result in hospital activity – only patients
who have been admitted are recorded.
• It does not provide the prevalence or incidence of diseases
• HES is not a live system
• Time lag – annual extract is 9 months behind.

Strengths of HES

• Large medical database
• Uninterrupted data collection since 1989
• Resident and registered populations
• Geographically and temporally referenced
• Diagnoses & procedures coded using standardised coding
frames - ICD9, ICD10 & OPCS4, HRG (latest 3.5)
• Covers population of approx 50,000,000 +
• Covers all NHS hospitals ~ 90-95% of all in-patient care

Primary care in UK

• Most of population registered with GP


• Cannot register with more than 1 GP
• Limited information: age, sex, place of residence
• Used to measure migration within the UK
• Not necessarily true in other countries

Other sources
• Cancer registries
• Disease notifications
• Prescription data in UK
– The Health and Social Care Information Centre has
published monthly general practice level prescribing data
since 2011

Cancer registrations

• Cancer registries cover all of the US population


• 11 cancer registries in the UK
– Collect data from a variety of sources
– Aim: to deliver timely, comparable and high-quality
cancer data
• The three most common cancers for men in 2012: prostate
(25.9%), lung (13.6%) and colorectal (13.4%).
• The three most common cancers for women in 2012: breast
(30.9%), lung (11.9%) and colorectal (10.9%).

Data linkage
[Diagram: linkage of administrative records with other administrative records, surveys and contextual data]

Oyebode O et al. Journal of Epidemiology and Community Health. 2014; 68:856-862
https://jech.bmj.com/content/68/9/856
De-identification of data for linkage purposes
https://www.adrn.ac.uk/policies-procedures/protecting-privacy/de-identification/

De-identifying data removes information such as:
• Names
• Addresses
• Exact date of birth
• National Insurance number
• National Health Service number
• Tax reference number
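
As a rough illustration of the list above, the sketch below strips direct identifiers from a record and replaces them with a keyed pseudonym so that records about the same person can still be linked. It is a sketch under stated assumptions, not the procedure used by any data service: the field names, the example record and the choice of HMAC-SHA256 are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical field names for the identifiers listed above.
DIRECT_IDENTIFIERS = {
    "name", "address", "date_of_birth",
    "ni_number", "nhs_number", "tax_reference",
}


def pseudonym(identifier: str, secret_key: bytes) -> str:
    """Keyed hash: same input and key give the same linkage ID."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers; keep a linkage ID and the birth year only."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["link_id"] = pseudonym(record["nhs_number"], secret_key)
    cleaned["birth_year"] = int(record["date_of_birth"][:4])  # ISO date assumed
    return cleaned


# Entirely made-up record and key, for illustration only.
KEY = b"example-key-held-by-a-trusted-third-party"
record = {
    "name": "A. Person",
    "address": "1 Example Street",
    "date_of_birth": "1958-03-05",
    "ni_number": "XX000000X",
    "nhs_number": "0000000000",
    "tax_reference": "0000000000",
    "region": "London",
    "smoker": False,
}
clean = de_identify(record, KEY)
print(sorted(clean))
```

Keeping the key away from researchers is what stops the linkage ID from being reversed by simply hashing known identifiers.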

Data Protection Act 1998
Ensures data is:
• used fairly and lawfully
• used for limited, specifically stated purposes
• used in a way that is adequate, relevant and not excessive
• accurate
• kept for no longer than is absolutely necessary
• handled according to people’s data protection rights
• kept safe and secure
• not transferred outside the European Economic Area
without adequate protection

Data confidentiality
‘Patient and service user information is generally
held under legal and ethical obligations of
confidentiality. Information provided in
confidence should not be used or disclosed in a
form that might identify a patient or service user
without his or her consent. There are a number of
important exceptions to this rule, described later in
this document, but it applies in most circumstances’

NHS code of practice


Administrative sources

Advantages
• Good coverage, and already routinely collected
• Can provide small area statistics
• Can be linked to other databases
Disadvantages
• Limited contextual information
• Data may be erroneous
• Can take time to get access due to data
protection laws
Maps

Walkability:
Residential density
+
Junction density
+
Land use mix

1= lowest walkability 4= highest walkability
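
One way an index of this shape could be computed is sketched below: each component is standardised as a z-score, the three z-scores are summed, and areas are ranked into four groups (1 = lowest, 4 = highest). This is an assumption made for illustration and may differ from the method in the cited study; the area values are invented.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def walkability_quartiles(residential, junction, land_use):
    """Sum the standardised components, then rank areas into groups 1-4."""
    components = [z_scores(residential), z_scores(junction), z_scores(land_use)]
    index = [sum(parts) for parts in zip(*components)]
    order = sorted(range(len(index)), key=lambda i: index[i])
    quartile = [0] * len(index)
    for rank, i in enumerate(order):
        quartile[i] = 1 + (rank * 4) // len(index)
    return index, quartile


# Invented values for eight areas.
residential = [10, 20, 30, 40, 50, 60, 70, 80]        # dwellings per hectare
junction = [1, 2, 3, 4, 5, 6, 7, 8]                   # junctions per hectare
land_use = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]   # land use mix score

index, quartile = walkability_quartiles(residential, junction, land_use)
print(quartile)
```

Standardising first stops the component with the largest raw scale from dominating the sum.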


Stockton et al. 2016 Development of a novel walkability index for London, United Kingdom:
cross-sectional application to the Whitehall II Study. BMC Public Health. 2016;16:416
Qualitative
• E.g. In-depth interview transcripts, diaries,
anthropological field notes, answers to open-
ended survey questions, audio-visual recordings
and images
• Favours depth over breadth
• Can provide contextual information that surveys
lack; however, due to small samples, findings are
often not generalisable
• But, generalisability may not be the goal.
• Mixed methods research: Qual & Quant combined
Big Data
• Datasets that are large and complex occurring in
real-time, commonly as a by-product of everyday
use of technology
• Data is often unstructured, not easily interpreted
by computer programs or traditional data mining
software
• Very broad term encompassing data in health
care systems, social media posts – tweets,
images, videos, Google searches, Oyster card use,
GPS etc.
Data source appraisal

• Who is included?
• What was the original purpose?
• Social construction of the data
• Quality and monitoring of data
• Consistency between sources
• Data collection methods
• Definitions and analysis
Computer practical – Gordon St (25) -105 Public
Cluster
Topic: Health Survey for England 2013
• Log into Moodle
• Click on Week 3 tab
• Click on ‘Week 3 – Practical Exercise’
• Click Enter
• Complete the step by step instructions
• By the end you should have registered with the UK
Data Service
• Don’t click back on your browser!
