Data Cleaning
1. Incorrect data fields: Manual input errors or flawed data transfers can result in a mix of accurate and
erroneous data within an organization's dataset. Even a small amount of flawed data can compromise
the entire dataset's reliability.
2. Outdated data values: Datasets such as customer records may need adjusting when customers leave or
product lines are discontinued. Historical data that is no longer relevant must be updated so that it
reflects the current state of the business.
3. Missing data: Changes in data collection methods or in the types of data gathered may leave gaps in
historical entries. In that case, an organization may need to enter the data manually or use
algorithms or machine learning to predict and fill in the missing values.
4. Data silos: If an organization's data is spread across multiple spreadsheets or different types of
databases, it may need to be consolidated into a single central store. While the foundation of any
business analytics strategy is typically first-party data (collected directly from stakeholders),
incorporating third-party data (purchased or obtained from other organizations) can provide additional
external insights.
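
As a rough illustration of how the four issues above might be handled together, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than something taken from the text: the field names (customer_id, age, last_order), the sample records, the plausible age range, the cutoff date, and the helper names consolidate and clean. Median imputation stands in for the more sophisticated predictive methods mentioned in item 3.

```python
from datetime import date
from statistics import median

# Hypothetical records from two separate sources (for example, two spreadsheets).
crm_rows = [
    {"customer_id": 1, "age": 34, "last_order": date(2024, 5, 1)},
    {"customer_id": 2, "age": -5, "last_order": date(2019, 1, 15)},    # invalid age
    {"customer_id": 3, "age": None, "last_order": date(2024, 3, 20)},  # missing age
]
billing_rows = [
    {"customer_id": 3, "age": None, "last_order": date(2024, 3, 20)},  # duplicate of a CRM row
    {"customer_id": 4, "age": 52, "last_order": date(2023, 11, 2)},
]


def consolidate(*sources):
    """Merge several sources into one list, keeping the first record per customer_id."""
    merged = {}
    for rows in sources:
        for row in rows:
            # Copy each row so the original source data is never modified.
            merged.setdefault(row["customer_id"], dict(row))
    return list(merged.values())


def clean(rows, cutoff, min_age=0, max_age=120):
    """Blank out implausible ages, drop stale records, then impute missing ages."""
    current = []
    for row in rows:
        # Incorrect field: treat an implausible value as missing.
        if row["age"] is not None and not (min_age <= row["age"] <= max_age):
            row["age"] = None
        # Outdated value: keep only customers with an order on or after the cutoff.
        if row["last_order"] >= cutoff:
            current.append(row)
    # Missing data: fill gaps with the median of the known ages (a simple stand-in).
    known = [r["age"] for r in current if r["age"] is not None]
    fill = median(known) if known else None
    for row in current:
        if row["age"] is None:
            row["age"] = fill
    return current


cleaned = clean(consolidate(crm_rows, billing_rows), cutoff=date(2023, 1, 1))
for row in cleaned:
    print(row)
```

With these sample records, customer 2 is dropped as outdated, the duplicate entry for customer 3 is merged away, and customer 3's missing age is filled with the median of the remaining known ages. In practice the validation rules, the cutoff, and the imputation method would all depend on the dataset and the business context.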