IS328-Data Mining

Data Types and Distance Metrics

Tutorial 3 Exercises
---------------------------------------------------------------------------
PART A – Multiple Choice Questions

Q1. Which of the following is not an attribute type in data mining?


A Interval
B Ratio
C Random
D Ordinal
E Nominal

Answer C

Q2 Nominal and ordinal attributes are collectively referred to as _________ attributes.


A. qualitative.
B. perfect.
C. consistent.
D. quantitative

Answer A

Q3. Which of the following is the correct order of steps in the knowledge discovery process?
A Data Selection, Data Transformation, Data Mining, Evaluation
B Data Cleaning, Data Selection, Data Mining, Data Transformation, Evaluation
C Data Selection, Data Transformation, Data Cleaning, Evaluation, Data Mining
D Data Cleaning, Data Selection, Data Transformation, Data Mining, Evaluation

Answer D

Q4 Web Mining involves

A Web content mining
B Web structure mining
C Web usage mining
D Both A and B
E All of the above

Answer E

Q5 Which of the following is not time series data?

A The continuous monitoring of a person's heart rate
B Hourly readings of air temperature
C Daily closing price of a company stock
D Monthly rainfall data
E Survey data collected from 25 online students

Answer E

Consider the following data and answer questions 6-10.


ID   Age   Marital Status   Income   Height (m)   No of Children
1    32    Single           Low      1.68         0
2    40    Married          High     1.70         2
3    28    Single           Medium   1.62         0
4    50    Married          High     1.73         3
5    55    Divorced         High     1.65         4
6    48    Married          Medium   1.78         1
7    39    Divorced         Medium   1.70         2
8    25    Single           Low      1.69         0
9    35    Married          Medium   1.73         4
10   36    Single           Low      1.68         0

Age: Min = 25, Max = 55

Q6 What is the data type of ‘Age’?


A. Qualitative
B. Quantitative
C Interval
D Ratio
E None of the above

Answer C

Q7 What is the data type of ‘Marital Status’?


A. Nominal
B. Ordinal
C Interval
D Ratio
E None of the above

Answer A

Q8 What is the data type of ‘Income’?


A. Nominal
B. Ordinal
C Interval
D Numeric
E None of the above
Answer B

Q9 What is the data type of ‘Height’?


A. Qualitative
B. Quantitative
C Discrete
D Continuous
E None of the above

Answer D
Q10 What is the data type of ‘No of Children’?
A. Nominal
B. Numeric
C Discrete
D Continuous
E None of the above

Answer C

Q11 Data mining is done using which type of data?


A Text Data
B Image data
C Video data
D Web data
E All of the above

Q12 _____________ is the application of data mining to geographical information to produce business intelligence or other results.

A Web Mining
B Big Data Mining
C Spatial Mining
D Time series Mining
E Text Mining

Answer C

Q13 The Euclidean distance between P(2, -3, 5, 7) and Q(4, 6, -1, 7) is

A 9
B 11
C 13
D 17
E 19

Answer B
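
The answer to Q13 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `euclidean` is chosen here for illustration and does not come from the tutorial.

```python
import math

def euclidean(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

P = (2, -3, 5, 7)
Q = (4, 6, -1, 7)

# Squared differences: 4 + 81 + 36 + 0 = 121
print(euclidean(P, Q))  # 11.0
```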

Q14 The Manhattan distance between P (2, -3, 5, -7) and Q (4, 5, 2, -4) is

A 12
B 14
C 16
D 18

Answer C
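
As a quick check of Q14, the Manhattan distance is the sum of the absolute coordinate differences. The sketch below assumes nothing beyond plain Python; `manhattan` is an illustrative name.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))

P = (2, -3, 5, -7)
Q = (4, 5, 2, -4)

# Absolute differences: 2 + 8 + 3 + 3 = 16
print(manhattan(P, Q))  # 16
```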

Q15. Suppose that X and Y are two vectors as follows:

X = {5, -1, 3, 2, 4} and Y = {3, 2, 4, -1, 5}.

The cosine similarity between the two vectors is


A 0.65
B 0.78
C 0.87
D 0.94

Answer B
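
The cosine similarity in Q15 is the dot product of the two vectors divided by the product of their lengths. A minimal sketch of that calculation follows; the function name `cosine_similarity` is an assumption made for this example.

```python
import math

def cosine_similarity(x, y):
    """Dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

X = (5, -1, 3, 2, 4)
Y = (3, 2, 4, -1, 5)

# Dot product = 43, both squared norms = 55, so the result is 43/55
print(round(cosine_similarity(X, Y), 2))  # 0.78
```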

Two objects, A and B, are represented by four attributes as follows. Answer questions 16 and 17.

           A1   A2   A3   A4
Object A    0    3    4    5
Object B    7    6    3   -1

Q16 The squared Euclidean distance between objects A and B


A 80
B 98
C 95
D 101

Answer C

Q17 The Minkowski distance of order 3 between objects A and B


A 3.873
B 8.373
C 7.383
D 3.378

Answer B
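
Both Q16 and Q17 can be verified with one general Minkowski function, since order 2 gives the Euclidean distance and order 3 is the case asked for in Q17. This is a sketch only; `minkowski` is a name introduced here, not one used by the tutorial.

```python
def minkowski(p, q, order):
    """Minkowski distance: (sum |a - b|^order)^(1/order)."""
    return sum(abs(a - b) ** order for a, b in zip(p, q)) ** (1 / order)

A = (0, 3, 4, 5)
B = (7, 6, 3, -1)

# Q16: squared Euclidean distance = 49 + 9 + 1 + 36
squared_euclidean = sum((a - b) ** 2 for a, b in zip(A, B))
print(squared_euclidean)  # 95

# Q17: cube root of 343 + 27 + 1 + 216 = 587
print(round(minkowski(A, B, 3), 3))  # 8.373
```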

Given the following two objects, answer questions 18-19.

           A1   A2   A3   A4   A5   A6   A7   A8
Object 1    1    0    1    0    0    1    1    0
Object 2    1    0    1    1    0    1    0    1

Q18 What is the distance between the objects if all variables are symmetric?

A 0.25
B 0.5
C 0.67
D 0.375

Answer D

Q19 What is the distance between the objects if all variables are asymmetric?

A 0.25
B 0.67
C 0.5
D 0.75

Answer C
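
The two binary distances in Q18 and Q19 differ only in whether 0/0 matches count in the denominator. The sketch below reads Object 1 as 1 0 1 0 0 1 1 0 and Object 2 as 1 0 1 1 0 1 0 1; the function name and the `symmetric` flag are illustrative choices.

```python
def binary_dissimilarity(x, y, symmetric=True):
    """Mismatches divided by the attributes that count.

    Symmetric: all attributes count (simple matching distance).
    Asymmetric: 0/0 matches are ignored (Jaccard distance).
    """
    q = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)  # 1/1 matches
    r = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)  # 1/0 mismatches
    s = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)  # 0/1 mismatches
    t = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)  # 0/0 matches
    denominator = q + r + s + t if symmetric else q + r + s
    return (r + s) / denominator

obj1 = (1, 0, 1, 0, 0, 1, 1, 0)
obj2 = (1, 0, 1, 1, 0, 1, 0, 1)

print(binary_dissimilarity(obj1, obj2, symmetric=True))   # 0.375  (3/8)
print(binary_dissimilarity(obj1, obj2, symmetric=False))  # 0.5    (3/6)
```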

Q20 The following are four patients and their attributes:

          A1   A2   A3   A4   A5   A6   A7
Mary       1    0    1    0    0    1    0
Amanda     1    0    1    1    0    1    0
John       1    1    0    1    1    0    1
Stephen    0    1    1    0    1    0    0

A new patient, Diana, has the following details:

          A1   A2   A3   A4   A5   A6   A7
Diana      0    1    0    1    0    1    0

Assuming that all attributes are symmetric, the nearest neighbour of Diana is

A Mary
B Amanda
C John
D Stephen

Answer B
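
For Q20, the nearest neighbour is the patient with the smallest symmetric binary distance to Diana, that is, the fewest mismatching attributes out of seven. A minimal sketch, with the dictionary layout and function name assumed for illustration:

```python
patients = {
    "Mary":    (1, 0, 1, 0, 0, 1, 0),
    "Amanda":  (1, 0, 1, 1, 0, 1, 0),
    "John":    (1, 1, 0, 1, 1, 0, 1),
    "Stephen": (0, 1, 1, 0, 1, 0, 0),
}
diana = (0, 1, 0, 1, 0, 1, 0)

def simple_matching_distance(x, y):
    """Fraction of attributes on which the two objects disagree."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

distances = {name: simple_matching_distance(values, diana)
             for name, values in patients.items()}
nearest = min(distances, key=distances.get)

# Mismatch counts: Mary 4, Amanda 3, John 4, Stephen 4 (out of 7)
print(nearest)  # Amanda
```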

___________________________________________________________________

PART B – Distance Calculations


Q1. Distance Matrix
Consider four objects:
A: (1, 2, 3, 0)
B: (0, 1, 2, 3)
C: (3, 0, 1, 2)
D: (2, 3, 0, 1)

Complete the following distance matrix using


(a) The Euclidean distances
(b) The Manhattan distances.

(a) The Euclidean Distance

   A B C D
 A  0      
 B  3.46  0    
 C  4  3.46  0  
 D  3.46  4  3.46  0

(b) The Manhattan Distance

   A B C D
 A  0      
 B  6  0    
 C  8  6  0  
 D  6  8  6  0
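
Both matrices can be reproduced by applying a distance function to every pair of objects. The following is a sketch under the assumption that the objects are A (1, 2, 3, 0), B (0, 1, 2, 3), C (3, 0, 1, 2) and D (2, 3, 0, 1); the helper names are introduced here for illustration.

```python
import math

points = {
    "A": (1, 2, 3, 0),
    "B": (0, 1, 2, 3),
    "C": (3, 0, 1, 2),
    "D": (2, 3, 0, 1),
}

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def distance_matrix(pts, metric):
    """Map each ordered pair of object names to its distance."""
    return {(a, b): metric(pts[a], pts[b]) for a in pts for b in pts}

euc = distance_matrix(points, euclidean)
man = distance_matrix(points, manhattan)

# Print each matrix row by row, rounded to two decimals
for label, matrix in (("Euclidean", euc), ("Manhattan", man)):
    print(label)
    for a in points:
        print(a, [round(matrix[a, b], 2) for b in points])
```

Each squared Euclidean distance here is either 12 or 16, which is why only 3.46 and 4 appear in the first matrix.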

  
Q2 The Pearson Correlation Coefficient

X and Y are two vectors as defined below:


X = (3, 6, 2, 5, 7, 1)
Y = (1, 3, 2, 7, 6, 5)

Calculate the Pearson correlation coefficient between X and Y using the formula below:

r = Sum[(X - X̄)(Y - Ȳ)] / sqrt( Sum[(X - X̄)²] * Sum[(Y - Ȳ)²] )

X       Y       X - X̄   Y - Ȳ   (X - X̄)(Y - Ȳ)   (X - X̄)²   (Y - Ȳ)²
3       1       -1      -3        3               1          9
6       3        2      -1       -2               4          1
2       2       -2      -2        4               4          4
5       7        1       3        3               1          9
7       6        3       2        6               9          4
1       5       -3       1       -3               9          1
X̄ = 4   Ȳ = 4                    Sum = 11        Sum = 28   Sum = 28

The Pearson coefficient = 11 / (sqrt(28) * sqrt(28)) = 11 / (5.29 * 5.29) = 0.393
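
The same calculation can be sketched in Python, following the columns of the table: deviations from the mean, their products, and their squares. The function name `pearson` is an assumption made for this example.

```python
import math

def pearson(x, y):
    """Sum of deviation products over the root of the product of squared-deviation sums."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

X = (3, 6, 2, 5, 7, 1)
Y = (1, 3, 2, 7, 6, 5)

# Both means are 4, the cross sum is 11, and both squared sums are 28
print(round(pearson(X, Y), 3))  # 0.393
```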
