
Case Study on Data Quality - CS2179

Bourmpoulia Victoria (ID: 205006)

The data presented in the Excel file are, first of all, not accurate. In many cases they are also incomplete and inconsistent. For example, the “Employee” column (see picture) contains empty cells that do not indicate the name of the employee, and the employee names are not written in a consistent way, e.g., “Callahan, Lora” and “Callahan L.,”. The same problems occur in the “Shipped Date” column. In addition, a lot of information is missing in the “Ship Region” and “Ship Postal Code” columns.
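
The gaps described above can be quantified before any cleaning is attempted. The following is a minimal sketch in Python, assuming the worksheet has been loaded as a list of dictionaries keyed by column name. The column names come from the case study; the sample values, and the helper names `is_missing`, `completeness`, and `completeness_report`, are hypothetical and serve only as illustration.

```python
# Hypothetical sample rows. The column names come from the case study;
# the values are invented for illustration and are not the real data.
SAMPLE_ROWS = [
    {"Employee": "Callahan, Lora", "Shipped Date": "2014-07-16",
     "Ship Region": "", "Ship Postal Code": "51100"},
    {"Employee": "Callahan L.,", "Shipped Date": "",
     "Ship Region": "RJ", "Ship Postal Code": ""},
    {"Employee": "", "Shipped Date": "2014-07-12",
     "Ship Region": "", "Ship Postal Code": "1204"},
]


def is_missing(value):
    """Treat None and blank or whitespace-only cells as missing."""
    return value is None or str(value).strip() == ""


def completeness(rows, column):
    """Return the fraction of rows with a non-missing value in `column`."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if not is_missing(row.get(column)))
    return filled / len(rows)


def completeness_report(rows, columns):
    """Return a {column: completeness ratio} mapping for the given columns."""
    return {column: completeness(rows, column) for column in columns}


if __name__ == "__main__":
    columns = ["Employee", "Shipped Date", "Ship Region", "Ship Postal Code"]
    for column, ratio in completeness_report(SAMPLE_ROWS, columns).items():
        print(f"{column}: {ratio:.0%} complete")
```

A ratio well below 1.0 for a column such as “Ship Region” signals that reports grouped by that column would rest on partial data.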

If the data are left like this, without the necessary processing, there will be inaccuracies in tracking employee work, identifying valuable customers, and determining which regions and countries are doing better than others.


A company should perform a data gap analysis to determine whether it can meet business expectations, while identifying possible data gaps or places where data might be missing (Baltzan et al., 2012). It should also improve its Master Data Management (MDM) and run data validation tests to ensure the accuracy, consistency, completeness, and correctness of the data (Baltzan et al., 2012). Moreover, according to Batini et al. (2009), careful standardization, which replaces or complements nonstandard data values with corresponding standard values, would increase the completeness of the data. In addition, error localization and correction features already embedded in the software, or in Excel, would help eliminate data quality errors by detecting the records that do not satisfy a given set of quality rules (Batini et al., 2009).
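
Standardization and rule-based error localization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure prescribed by Batini et al. (2009): the master list of employees, the “Last, First” canonical form, the ISO date format, and the names `standardize_employee`, `valid_date`, `RULES`, and `localize_errors` are all hypothetical.

```python
import re
from datetime import datetime

# Hypothetical master list of employee names in canonical "Last, First" form.
MASTER_EMPLOYEES = ["Callahan, Lora"]


def standardize_employee(raw, master=MASTER_EMPLOYEES):
    """Map a nonstandard name to its canonical form.

    Returns None when the value is empty or has no unique match.
    """
    if raw is None:
        return None
    tokens = re.findall(r"[A-Za-z]+", raw)
    if not tokens:
        return None
    last = tokens[0].lower()
    initial = tokens[1][0].lower() if len(tokens) > 1 else None
    matches = []
    for name in master:
        master_last, master_first = [p.strip() for p in name.split(",", 1)]
        if master_last.lower() != last:
            continue
        if initial is not None and not master_first.lower().startswith(initial):
            continue
        matches.append(name)
    return matches[0] if len(matches) == 1 else None


def valid_date(value):
    """Quality rule: the date must parse as YYYY-MM-DD (assumed format)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


# One quality rule per column; a cell is in error when its rule returns False.
RULES = {
    "Employee": lambda v: standardize_employee(v) is not None,
    "Shipped Date": valid_date,
}


def localize_errors(rows, rules=RULES):
    """Return (row_index, column) for every cell that violates a rule."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, rule in rules.items()
        if not rule(row.get(column))
    ]
```

With these assumptions, both “Callahan, Lora” and “Callahan L.,” map to the same canonical entry, while an empty employee cell or a date in another format is reported with its row index and column so it can be corrected at the source.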

References

Baltzan, P., Detlor, B., & Welsh, C. (2012). Business driven information systems. Whitby, Ont.: McGraw-Hill Ryerson.

Batini, C., Cappiello, C., Francalanci, C., & Maurino, A. (2009). Methodologies for data quality assessment and improvement. ACM Computing Surveys, 41(3), 1-52. https://doi.org/10.1145/1541880.1541883
