Types
• There are two basic types of linkage strategy,
deterministic and probabilistic; both are
considered to be forms of exact matching
[Figure: Types of record linkage strategies: deterministic and probabilistic]
Deterministic record linkage
• A pair of records is said to be a link if the two records
agree exactly on each element within a collection of
identifiers called the match key.
• All or none: disagreement on any single identifier makes the pair a non-link
• First step
• E.g.: for input data belonging to Mr. William Marcus Smith, different
individuals could have entered the name as:
– Smith W. M.
– William M. Smith
– W.M. Smith
– W.M. Smithe, etc.
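
The all-or-none rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the field names (surname, first_initial, middle_initial) are hypothetical, and it assumes the name variants have already been parsed into parts.

```python
def match_key(record, fields):
    """Return the match key: the identifier values, lightly normalised."""
    return tuple(record[f].strip().lower() for f in fields)


def is_link(rec_a, rec_b, fields):
    """All or none: a link only if every identifier in the key agrees exactly."""
    return match_key(rec_a, fields) == match_key(rec_b, fields)


# Hypothetical identifiers chosen for illustration only.
FIELDS = ("surname", "first_initial", "middle_initial")

smith_wm = {"surname": "Smith", "first_initial": "W", "middle_initial": "M"}    # "Smith W. M."
wm_smith = {"surname": "Smith", "first_initial": "W", "middle_initial": "M"}    # "W.M. Smith"
wm_smithe = {"surname": "Smithe", "first_initial": "W", "middle_initial": "M"}  # "W.M. Smithe"

print(is_link(smith_wm, wm_smith, FIELDS))   # True
print(is_link(smith_wm, wm_smithe, FIELDS))  # False: one extra letter breaks the link
```

A single spelling variant such as "Smithe" is enough to turn a true match into a non-link, which is the main weakness of a purely deterministic strategy.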
Blocking:
• Blocking reduces the search space (i.e. the
number of record pairs to be compared) by comparing
only records that agree on a blocking key
• FRIL (Fine-grained Records Integration and Linkage tool)
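
As a rough sketch of how blocking shrinks the number of comparisons, the Python below groups records by a blocking key and pairs up only records within the same block. The blocking key (first letter of the surname) and the sample records are assumptions made for illustration; this is not how FRIL or any particular tool implements blocking.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, blocking_key):
    """Return index pairs of records that share the same blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = []
    for members in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.extend(combinations(members, 2))
    return pairs


# Hypothetical sample data.
records = [
    {"surname": "Smith"},
    {"surname": "Smithe"},
    {"surname": "Jones"},
    {"surname": "Johnson"},
    {"surname": "Smith"},
]

all_pairs = list(combinations(range(len(records)), 2))
blocked = candidate_pairs(records, lambda r: r["surname"][0].upper())

print(len(all_pairs))  # 10 pairs without blocking
print(len(blocked))    # 4 pairs with blocking
```

The trade-off is that true matches whose blocking keys disagree (for example a surname mistyped in its first letter) are never compared, so the key should be an identifier that is rarely recorded wrongly.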
General record linkage system
Uses
• The system is used to improve data quality and coverage, for
long-term medical follow-up of cohorts, for creating patient-oriented
rather than event-oriented data, for building new
data sources, and for a range of other statistical purposes