
Data Mining, An Introduction by Ruth Dilly, 1995

Data mining problems/issues

Data mining systems rely on raw data drawn from different data sources. The following are
the main problems or issues that a data analyst faces when mining data []

a. Limited Information

A database, data mart, or data warehouse is usually designed to run day-to-day business
and to produce some summary reports, not to support data mining. As a result, the
attributes needed to fulfil the stakeholder's objective may not be available. For example,
malaria cannot be diagnosed from a patient database if that database does not contain the
patients' red blood cell counts.
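
Before mining starts, it can help to compare the attributes the objective needs with the
attributes the data source actually holds. The following is a minimal sketch of such a
check. The attribute names and the helper missing_attributes are hypothetical and chosen
only for illustration; they are not taken from any real patient database.

```python
# Attributes the analysis is assumed to need (hypothetical names).
REQUIRED_ATTRIBUTES = {"patient_id", "body_temperature", "red_blood_cell_count"}


def missing_attributes(available, required):
    """Return the required attributes that the data source does not provide."""
    return sorted(set(required) - set(available))


# Columns assumed to exist in the operational patient table (hypothetical).
patient_table_columns = ["patient_id", "name", "admission_date", "body_temperature"]

gaps = missing_attributes(patient_table_columns, REQUIRED_ATTRIBUTES)
if gaps:
    print("Cannot meet the objective; missing attributes:", gaps)
```

An empty result means every required attribute is present. A non-empty result names what
has to be collected before the mining task can go ahead.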

b. Noise and missing values

Data is entered into databases by operators or, in some cases, automatically, so it is
vulnerable to errors as well as missing values. Attributes that rely on subjective judgment
can give rise to errors or misclassified records. Errors in the values of attributes are
called noise. For accurate data mining results, noise and missing values must be dealt with.

Missing data can be treated in the following ways:

• Discard the missing values
• Discard the rows that have missing values
• Infer general values from other records
• Assign a special value to the record
• Use an average value
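
Three of the treatments above can be sketched in a few lines. This is only an illustration
under stated assumptions: the records, the field name "age", the marker "UNKNOWN", and the
function names are all hypothetical, and None stands for a missing value.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 50},
]


def discard_rows(rows, field):
    """Drop every row whose value for `field` is missing."""
    return [r for r in rows if r[field] is not None]


def fill_special(rows, field, marker="UNKNOWN"):
    """Replace each missing value with a special marker value."""
    return [{**r, field: marker if r[field] is None else r[field]} for r in rows]


def fill_average(rows, field):
    """Replace each missing value with the mean of the known values."""
    known = [r[field] for r in rows if r[field] is not None]
    average = mean(known)
    return [{**r, field: average if r[field] is None else r[field]} for r in rows]


print(discard_rows(records, "age"))
print(fill_special(records, "age"))
print(fill_average(records, "age"))
```

Each function returns a new list and leaves the original records unchanged, so the
treatments can be compared side by side. Discarding rows loses data, while filling with an
average keeps every row but makes the values look less varied than they may really be.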

c. Uncertainty

Uncertainty refers to the severity of errors and the degree of noise in the data elements.
For example, there may be a great difference between records in terms of their values.

d. Data cohesiveness

Cohesiveness means how relevant the data attributes are to each other.
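
The text does not name a measure for cohesiveness. As one possible illustration, and only
as an assumption, the Pearson correlation coefficient can indicate how strongly two numeric
attributes move together. The attribute names and values below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical attribute values for four records.
height_cm = [150, 160, 170, 180]
weight_kg = [50, 58, 66, 74]   # rises steadily with height
locker_no = [3, 1, 4, 2]       # assigned without regard to height

print(pearson(height_cm, weight_kg))  # close to 1: strongly related
print(pearson(height_cm, locker_no))  # 0: no linear relation
```

A value near 1 or -1 suggests the two attributes are closely related, and a value near 0
suggests they are not linearly related. Correlation only captures linear relationships
between numeric attributes, so it is a rough guide and not a full test of relevance.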
