Lecture1 - Introduction To Data Quality
-Ashu Mehta
Database systems
Data quality
• Data quality is a key factor in Business Intelligence
success.
• A common saying in analytics circles is
“Garbage in, garbage out.”
• The saying refers to data quality.
– It means that if you produce something using
poor-quality materials, the thing you produce will also
be of poor quality.
• If your data is poor, then the reports, and the decisions
made from those reports, will be equally poor.
Data Quality Dimensions
• The six dimensions of data quality are:
– Completeness
– Conformity
– Accuracy/Correctness
– Timeliness
– Consistency
– Integrity
DIMENSIONS OF DATA QUALITY
• Completeness
– Are critical data values missing? A database with
missing data values is not unusual, but when the
missing information is critical, completeness is
an issue.
– If a customer’s first name and last name are
mandatory but the job title is optional, a record can
be considered complete even if a job title is not
available.
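The mandatory-versus-optional rule above can be sketched in a few lines of Python. This is only an illustration: the field names (first_name, last_name, job_title) and the helper is_complete are hypothetical, chosen to mirror the example, and treating a blank string as missing is an assumption.

```python
# Hypothetical field names mirroring the example above; only these
# fields are treated as mandatory (an assumption for this sketch).
MANDATORY_FIELDS = ("first_name", "last_name")


def is_complete(record: dict) -> bool:
    """Return True when every mandatory field holds a non-blank value."""
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        # A field that is absent, None, or only whitespace counts as missing.
        if value is None or str(value).strip() == "":
            return False
    return True


# The optional job title is missing, yet the record is still complete.
without_title = {"first_name": "Ada", "last_name": "Lovelace", "job_title": None}
# A mandatory field is blank, so the record is incomplete.
missing_last = {"first_name": "Ada", "last_name": "", "job_title": "Analyst"}

print(is_complete(without_title))
print(is_complete(missing_last))
```

Keeping the list of mandatory fields separate from the check makes it easy to adjust which values count as critical for a given data set.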
• Conformity
– Does the data follow standard data definitions?
For example, are dates in a standard format?
– Maintaining conformity to standard formats is
important for keeping structure and nomenclature
consistent, both for sharing and for internal data
management.
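As a minimal sketch of a conformity check on dates, the Python below tests whether a string matches one agreed format. The choice of YYYY-MM-DD as the standard and the helper name conforms_to_date_format are assumptions made for illustration; a real data set would use whatever format its data definitions specify.

```python
from datetime import datetime

# Assumed standard for this sketch: ISO-style YYYY-MM-DD.
STANDARD_DATE_FORMAT = "%Y-%m-%d"


def conforms_to_date_format(value: str, fmt: str = STANDARD_DATE_FORMAT) -> bool:
    """Return True when value is a valid date written exactly in fmt."""
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        # Wrong layout or an impossible date such as 30 February.
        return False
    # strptime tolerates missing zero padding ("2024-3-15"), so format the
    # parsed date back and require an exact match with the input.
    return parsed.strftime(fmt) == value


for candidate in ["2024-03-15", "15/03/2024", "2024-3-15", "2024-02-30"]:
    print(candidate, conforms_to_date_format(candidate))
```

The round trip through strftime is what makes the check strict: a value can be a valid date and still fail conformity because it is not written in the agreed form.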
• Accuracy
– Does the data match the expected “real-world” values?
– Incorrect spellings, misplaced decimals, and out-of-date
data can lead to inaccurate analysis.
– If the sales from a customer are not the true sales or the
email address of a contact is misspelled, the data is not
accurate.
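Accuracy can only be judged against something known to be true. The Python sketch below assumes that a trusted reference record is available for comparison, which will not always be the case in practice. The helper find_inaccurate_fields and the sample values are hypothetical.

```python
def find_inaccurate_fields(record: dict, reference: dict) -> list:
    """Return the names of fields whose stored value differs from the reference."""
    return [
        field
        for field in reference
        if record.get(field) != reference[field]
    ]


# Hypothetical stored record: misspelled email domain and a misplaced decimal.
stored = {"email": "jane.doe@exmaple.com", "sales": 1250.0}
# Hypothetical trusted values, e.g. confirmed with the customer.
trusted = {"email": "jane.doe@example.com", "sales": 12500.0}

print(find_inaccurate_fields(stored, trusted))
```

Both fields are reported here, because the stored email and the stored sales figure each differ from the trusted values.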