1. (6 points) Suppose that the data for analysis include the attribute age. The age values
for the data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25,
25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
a. (4 points) Use smoothing by bin means to smooth the above data, using a bin
depth of 3.
Bin #   Values        Smoothed values
1       13, 15, 16    14.67, 14.67, 14.67
2       16, 19, 20    18.33, 18.33, 18.33
3       20, 21, 22    21, 21, 21
4       22, 25, 25    24, 24, 24
5       25, 25, 30    26.67, 26.67, 26.67
6       33, 33, 35    33.67, 33.67, 33.67
7       35, 35, 35    35, 35, 35
8       36, 40, 45    40.33, 40.33, 40.33
9       46, 52, 70    56, 56, 56
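
The table can be checked with a short script. This is a minimal sketch, assuming equal-depth bins taken in sorted order and means rounded to two decimals; the function name smooth_by_bin_means is made up for illustration and does not come from the assignment.

```python
def smooth_by_bin_means(values, depth):
    """Sort the values, split them into bins of `depth` items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        mean = sum(bin_values) / len(bin_values)
        # Every member of the bin gets the same (rounded) mean.
        smoothed.extend([round(mean, 2)] * len(bin_values))
    return smoothed


ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

smoothed = smooth_by_bin_means(ages, depth=3)

# Print one row per bin: bin number, original values, smoothed values.
for i in range(0, len(ages), 3):
    print(i // 3 + 1, ages[i:i + 3], smoothed[i:i + 3])
```

With 27 values and a depth of 3 the data splits evenly into 9 bins. If the count were not a multiple of the depth, this sketch would leave a shorter final bin and average over whatever values it holds.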
2. (4 points) List and briefly describe the steps involved in data mining when viewed as
a process of knowledge discovery.
• Data selection: Selecting data relevant for the data mining task.
• Data cleaning: Removing and fixing errors in data.
• Data integration and aggregation: Merging data from different sources under a single
coherent schema, then summarizing and aggregating the merged data.
• Data reduction: Reducing the volume of data to be mined while preserving the information needed for the task.
• Data mining: Discovering and extracting interesting structures and patterns in the data.
• Knowledge visualization and evaluation: Presenting the discovered patterns and
assessing which of them are genuinely interesting or useful.
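
The steps above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the two sources, their field names, the decade-wide discretization used for reduction, and the frequency count standing in for "mining" are all hypothetical choices, not part of the original answer.

```python
from collections import Counter

# Hypothetical toy sources; ids, cities, and ages are invented for illustration.
source_a = [{"id": 1, "age": 13, "city": "X"},
            {"id": 2, "age": None, "city": "Y"},
            {"id": 3, "age": 35, "city": "X"}]
source_b = [{"id": 4, "age": 36, "city": "Z"},
            {"id": 5, "age": 33, "city": "Y"},
            {"id": 6, "age": 70, "city": "X"}]


def select(rows, fields):
    # Data selection: keep only the attributes relevant to the task.
    return [{f: row[f] for f in fields} for row in rows]


def clean(rows):
    # Data cleaning: here, simply drop rows with a missing age.
    return [row for row in rows if row["age"] is not None]


def integrate(*sources):
    # Data integration: merge all sources into one list with a shared schema.
    return [row for source in sources for row in source]


def reduce_ages(rows, width=10):
    # Data reduction: discretize each age into the start of its decade.
    return [(row["age"] // width) * width for row in rows]


def mine(decades):
    # Data mining (stand-in): count how often each decade occurs.
    return Counter(decades).most_common()


def evaluate(patterns, min_count=2):
    # Evaluation: keep only patterns that occur at least `min_count` times.
    return [(decade, count) for decade, count in patterns if count >= min_count]


selected = [select(s, ["age"]) for s in (source_a, source_b)]
cleaned = [clean(s) for s in selected]
merged = integrate(*cleaned)
decades = reduce_ages(merged)
patterns = mine(decades)
interesting = evaluate(patterns)
print(interesting)
```

In practice the steps are iterative rather than strictly linear; the straight-line order here just mirrors the order of the list.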