Assignments Unit1,2,3
1. What is data mining? Describe the steps involved in data mining when viewed as a
process of knowledge discovery.
2. Describe the different types of databases. Identify the challenges associated with
each type and the data mining functionalities that can be applied to it.
3. Define each of the following data mining functionalities: characterization,
discrimination, association and correlation analysis, classification, prediction,
clustering, and evolution analysis. Give examples of each data mining functionality,
using a real-life database that you are familiar with.
1. Take a real-world scenario (Healthcare, Education, Sales, etc.). Identify the process of
data collection from appropriate sources and write the data set description.
2. Data Cleaning:
(a) List the methods to handle missing values.
(b) Consider the dataset D = {12, 14, 3, 23, 16, 7, 8, 4, 11, 10, 20, 5} and perform
smoothing using binning methods: (1) Smoothing by Bin Boundaries and (2) Smoothing by
Bin Means. Use a bin size of 3.
(c) List other methods to handle noisy data.
3. Data Integration:
(a) With a suitable example, differentiate between Schema and Instance Integration.
(b) List the data value conflicts faced during data integration and resolve them.
4. Data Reduction and Transformation:
(a) Generate a concept hierarchy for home location.
(b) Perform data aggregation on an education dataset.
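
The binning exercise in question 2(b) can be checked with a short sketch like the one below. It is only one possible reading of the task: it assumes "bin size 3" means equal-depth bins holding 3 sorted values each (not 3 bins in total), and it assumes that a value equally distant from both bin boundaries is replaced by the lower boundary. The function names are hypothetical and chosen for illustration.

```python
def equal_depth_bins(values, depth):
    """Sort the values and split them into consecutive bins of `depth` items.

    Assumption: "bin size" is the number of values per bin.
    """
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin with the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of the bin's min and max.

    Assumption: ties go to the lower boundary.
    """
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]  # bins are already sorted
        smoothed.append([lo if (v - lo) <= (hi - v) else hi for v in b])
    return smoothed


D = [12, 14, 3, 23, 16, 7, 8, 4, 11, 10, 20, 5]
bins = equal_depth_bins(D, 3)
by_means = smooth_by_means(bins)
by_boundaries = smooth_by_boundaries(bins)

print("Bins:         ", bins)
print("By means:     ", by_means)
print("By boundaries:", by_boundaries)
```

Under these assumptions the sorted data splits into four bins, starting with [3, 4, 5]. A different tie-breaking rule or an equal-width partition would give different smoothed values, so the convention used should be stated in the answer.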