SC Workshop II - Administrative Data and Data Validation
Administrative data
and Data validation
Workshop II on Monitoring and Evaluation for the Supreme Court
M Maines
Research Associate, IPA
PRE-TEST
https://tinyurl.com/scvalid-pre
OUTLINE
2. Information gathering
3. Implementation
Examples:
● Medical records
● Educational records
● Arrest records
● Banking records
● Personnel records
Photo credit: Shutterstock | moreimages
Who is in the data?
What exactly does the data measure?
What identifiers are available to link different sources of data?
Image: Shutterstock
Data flow
Identified Finder File Administrative Data File
Matching administrative data with program/evaluation data
Matching process
Exact
● Minor discrepancies are not well accounted for → false negatives
FINDER FILE                               ADMINISTRATIVE DATA
ID  NAME                    DOB           ID  NAME                    DOB         MATCH
1   Bhebhegurl dela Torres  5/1/1950      A   Bhebhegurl dela Torres  1/5/1950    ✗
2   Michael Santos          7/1/1975      B   Mike Santos             7/1/1975    ✗
3   Marilyn Cruz            8/23/1987     C   Marilyn Cruz            8/23/1987   ✓
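As a rough illustration of exact matching, the sketch below links the three example records on an identical name and date of birth. The field names (`id`, `name`, `dob`) and the helper `exact_match` are assumptions made for this example, not part of the workshop materials.

```python
# Example records copied from the slide; field names are hypothetical.
finder = [
    {"id": 1, "name": "Bhebhegurl dela Torres", "dob": "5/1/1950"},
    {"id": 2, "name": "Michael Santos", "dob": "7/1/1975"},
    {"id": 3, "name": "Marilyn Cruz", "dob": "8/23/1987"},
]
admin = [
    {"id": "A", "name": "Bhebhegurl dela Torres", "dob": "1/5/1950"},
    {"id": "B", "name": "Mike Santos", "dob": "7/1/1975"},
    {"id": "C", "name": "Marilyn Cruz", "dob": "8/23/1987"},
]


def exact_match(finder_rows, admin_rows):
    """Link records only when name and DOB are character-for-character equal."""
    index = {(row["name"], row["dob"]): row["id"] for row in admin_rows}
    return {row["id"]: index.get((row["name"], row["dob"])) for row in finder_rows}


matches = exact_match(finder, admin)
print(matches)  # {1: None, 2: None, 3: 'C'}
```

Records 1 and 2 are very likely the same people in both files, yet the swapped day/month and the nickname leave them unmatched, which is the false-negative problem noted above.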
SC Workshop II | Admin Data
Implementation
Non-Exact Deterministic
● A set of criteria, defined in advance, that two records must meet to be considered a match → risk of false positives
FINDER FILE                               ADMINISTRATIVE DATA
ID  NAME                    DOB           ID  NAME                    DOB         MATCH
1   Bhebhegurl dela Torres  5/1/1950      A   Bhebhegurl dela Torres  1/5/1950    ✓
2   Michael Santos          7/1/1975      B   Mike Santos             7/1/1975    ✓
3   Marilyn Cruz            8/23/1987     C   Marilyn Cruz            8/23/1987   ✓
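A minimal sketch of non-exact deterministic matching, assuming two rules fixed in advance: names must agree after a nickname lookup, and dates of birth must agree either exactly or with day and month swapped. The rules, the `NICKNAMES` table, and all function names are hypothetical choices for illustration; a real project would define its own criteria.

```python
# Hypothetical nickname lookup; a real project would need a fuller list.
NICKNAMES = {"mike": "michael"}

finder = [
    {"id": 1, "name": "Bhebhegurl dela Torres", "dob": "5/1/1950"},
    {"id": 2, "name": "Michael Santos", "dob": "7/1/1975"},
    {"id": 3, "name": "Marilyn Cruz", "dob": "8/23/1987"},
]
admin = [
    {"id": "A", "name": "Bhebhegurl dela Torres", "dob": "1/5/1950"},
    {"id": "B", "name": "Mike Santos", "dob": "7/1/1975"},
    {"id": "C", "name": "Marilyn Cruz", "dob": "8/23/1987"},
]


def normalize_name(name):
    """Lowercase the name and replace a known nickname in the first token."""
    tokens = name.lower().split()
    tokens[0] = NICKNAMES.get(tokens[0], tokens[0])
    return " ".join(tokens)


def dob_variants(dob):
    """Return the date as written plus the version with the first two parts swapped."""
    first, second, year = dob.split("/")
    return {(first, second, year), (second, first, year)}


def is_match(a, b):
    """Apply the pre-specified rules: same normalized name and compatible DOB."""
    same_name = normalize_name(a["name"]) == normalize_name(b["name"])
    same_dob = bool(dob_variants(a["dob"]) & dob_variants(b["dob"]))
    return same_name and same_dob


matches = {
    f["id"]: next((a["id"] for a in admin if is_match(f, a)), None)
    for f in finder
}
print(matches)  # {1: 'A', 2: 'B', 3: 'C'}
```

The same looseness that recovers records 1 and 2 would also link two different people who share a name and were born on 5 January and 1 May of the same year, which is how false positives arise.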
Probabilistic
● Estimates the probability that two records belong to the same individual. Techniques include:
○ calculating the similarity of names based on phonetic computation
ID  NAME                     DOB          ID  NAME                    DOB          MATCH
1   Bhebhegurl de la Torres  5/10/1985    1   Bhebhegurl Dela Torres  10/5/1985    ✓
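The idea of phonetic name similarity can be sketched as follows. This example uses Soundex as one possible phonetic code (the slides do not name a specific algorithm) and combines it with date-of-birth agreement into a simple point score. The weights and the threshold of 8 are made up for illustration; a real probabilistic linkage would estimate such weights from the data rather than fix them by hand.

```python
# Soundex digit groups for consonants; vowels and h, w, y carry no code.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of a word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (encoded + "000")[:4]


def match_score(a, b):
    """Score agreement out of 10 using hypothetical, hand-picked weights."""
    score = 0
    first_a, last_a = a["name"].split()[0], a["name"].split()[-1]
    first_b, last_b = b["name"].split()[0], b["name"].split()[-1]
    if soundex(first_a) == soundex(first_b):
        score += 3
    if soundex(last_a) == soundex(last_b):
        score += 3
    m_a, d_a, y_a = a["dob"].split("/")
    m_b, d_b, y_b = b["dob"].split("/")
    if y_a == y_b:
        score += 2
    if (m_a, d_a) == (m_b, d_b):
        score += 2  # full agreement on month and day
    elif (m_a, d_a) == (d_b, m_b):
        score += 1  # partial credit for a likely day/month swap
    return score


finder_row = {"name": "Bhebhegurl de la Torres", "dob": "5/10/1985"}
admin_row = {"name": "Bhebhegurl Dela Torres", "dob": "10/5/1985"}

THRESHOLD = 8  # hypothetical cut-off for declaring a match
score = match_score(finder_row, admin_row)
print(score, score >= THRESHOLD)  # 9 True
```

Unlike the deterministic rules, the score degrades gradually: a pair that disagrees on one field can still clear the threshold, while pairs near the cut-off can be set aside for manual review.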
• No over-promising
2. Which individuals are included in the data and which are excluded,
and why?
a. What steps have to occur before appearing in the data?
b. Does the intervention affect reporting of outcomes?
Reporting Bias
To address:
Data collection → Data storage → Data management & cleaning → Data analysis
Source: Georgina Evans, Gary King, Adam D. Smith, and Abhradeep Thakurta. Working Paper. “Differentially Private Survey Research”. Copy at https://j.mp/3jAYXo3
Data collection → Data storage → Data management & cleaning → Data analysis
File formats: .csv, .dta
Data versions: Raw data, Anonymized data, Clean data
POST-TEST
https://tinyurl.com/scvalid-post
Thank you very much!