On Data Quality: Suppose You Have a Database Sitting in Front of You, and I Ask "Is It a Good Quality Database?"
On Data Quality
Suppose you have a database sitting in front
of you, and I ask "Is it a good quality
database?"
What is your answer? What does quality
depend on?
Note: this is about the data themselves, not
the system used to access them.
These slides are at: http://www.macs.hw.ac.uk/~dwcorne/Teaching/
A Conventional Definition of
Data Quality
Good quality data are:
Accurate, Complete, Unique,
Up-to-date, and Consistent;
meaning …
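As a rough illustration only, two of these properties (Complete and Unique) can be scored as simple ratios, in the spirit of the ratio-style measures discussed by Pipino, Lee and Wang (2002). The sketch below is not taken from the slides: the records, the field names, the use of None to mark a missing value, and the helper names completeness and uniqueness are all assumptions made for the example.

```python
# Hypothetical records (not from the slides); None marks a missing value.
records = [
    {"id": 1, "name": "Ann", "email": "ann@example.com"},
    {"id": 2, "name": "Bob", "email": None},
    {"id": 3, "name": "Ann", "email": "ann@example.com"},
    {"id": 4, "name": "Cy", "email": "cy@example.com"},
]


def completeness(rows):
    """Fraction of field values that are present (not None)."""
    total = sum(len(row) for row in rows)
    if total == 0:
        return 1.0
    missing = sum(1 for row in rows for value in row.values() if value is None)
    return 1 - missing / total


def uniqueness(rows, key_fields):
    """Fraction of rows that are distinct on the chosen key fields."""
    if not rows:
        return 1.0
    distinct = {tuple(row[field] for field in key_fields) for row in rows}
    return len(distinct) / len(rows)


print(completeness(records))                   # 1 of 12 values is missing
print(uniqueness(records, ["name", "email"]))  # 3 distinct keys in 4 rows
```

Both scores lie between 0 and 1, with 1 meaning no problem was detected. Which fields count as the key for uniqueness is a judgment call that depends on the data. Accuracy, currency and consistency generally need outside knowledge (a trusted reference, timestamps, or business rules) and are not covered by this sketch.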
References
• A PowerPoint presentation called "Data Quality and Data
Cleaning: An Overview" by Tamraparni Dasu and
Theodore Johnson, AT&T Labs.
• A paper called "Data Cleaning: Problems and
Current Approaches" by Erhard Rahm and Hong
Hai Do, University of Leipzig, Germany.
• Pipino, L. L., Lee, Y. W., & Wang, R. Y. (2002).
Data quality assessment. Communications of the
ACM, 45(4), 211-218.