
Vishwakarma Institute of Technology Issue 01 : Rev No. 0 : Dt. 16/03/16

Title : Question Paper FF No. 868


Reg.No.

Bansilal Ramnath Agarwal Charitable Trust’s


VISHWAKARMA INSTITUTE OF TECHNOLOGY, PUNE – 411037.
(An Autonomous Institute Affiliated to Savitribai Phule Pune University)

Examination: End Semester Examination


Year: S.Y. Branch: Common – Under DOME
Subject: Data Science Subject Code: MD 2201
Max. Marks:50 Total Pages of Question Paper:
Day & Date: Time:

Instructions to Candidate
1. All questions are compulsory.
2. Neat diagrams must be drawn wherever necessary.
3. Figures to the right indicate full marks.

Q.1: (CO1) Study the data distribution given in Table no. 1 and answer the Questions –

Value                                             1   2   3   4   5   6   7   8
Frequency (no. of data points with that value)    1   0   0   3   4  10  12   8

Table no. 1

a. What is the mean value? ………. 1 M


b. How would you describe the data distribution? Why? ………. 3 M
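The summary statistics that Q.1 asks about can be checked with a short sketch. It assumes the frequencies are exactly as printed in Table no. 1; the names `freq` and `values` are illustrative and not part of the paper.

```python
import statistics

# Frequency table from Table no. 1: value -> number of data points.
freq = {1: 1, 2: 0, 3: 0, 4: 3, 5: 4, 6: 10, 7: 12, 8: 8}

n = sum(freq.values())                        # total number of data points
total = sum(v * f for v, f in freq.items())   # sum of all observations
mean = total / n                              # frequency-weighted mean

# Expand the table into raw observations to get the median and mode.
values = [v for v, f in sorted(freq.items()) for _ in range(f)]
median = statistics.median(values)
mode = statistics.mode(values)

print(f"n = {n}, mean = {mean:.3f}, median = {median}, mode = {mode}")
```

A mean that lies below the median and mode, with a single low-valued point far from the bulk of the data, is the usual signature of a left-skewed (negatively skewed) distribution.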

Q.2: (CO2) What are type 1 and type 2 errors in hypothesis testing? Explain with the help of
suitable examples. ………. 4 M

Q.3: a. (CO3) What are the advantages and disadvantages of using the L1 norm? ………. 1 M

Q.3: b. (CO3) Draw a typical Hessian matrix. Indicate how it is used in optimization. ………. 3 M

Q.4: a. (CO4) What is the significance of R² in regression? If a regression activity returns R² as
0.9354, what is your interpretation of the same? ………. 4 Marks

Q.4: b. (CO4) Which regression is used in modelling of sensor characteristics? Why? ………. 4 Marks

Q.4: c. (CO4) What do you mean by interpretation of beta coefficients? Explain with
examples. ………. 5 Marks

OR

Q.4: c. (CO4) What do you understand by Logistic regression? What are dichotomous variables
in the context of Logistic regression? ………. 5 Marks

Q.5: a. (CO5) How does the KNN algorithm make predictions on an unseen dataset? ………. 5 Marks

OR

Q.5: a. (CO5) Why do we measure the impurity of a resulting node in a decision tree? List the
different measures of impurity used in decision trees. ………. 5 Marks

Q.5: b. (CO5) Is feature scaling required for the KNN algorithm? Explain with proper
justification. ………. 4 Marks

OR

Q.5: b. (CO5) There are 4 coins A, B, C and D, of which 3 are of equal weight and one
is heavier. Find the heavier coin using a decision tree. ………. 4 Marks

Q.5: c. (CO5) Compare and contrast divisive and agglomerative clustering algorithms.
………. 4 Marks

Q.6: a. (CO6) The confusion matrix for a certain classification activity is as shown in Table no. 2

               Predicted: NO    Predicted: YES
Actual: NO          50                10
Actual: YES          5               100

Table no. 2
Find the following classifier performance measures –
1. Accuracy
2. Precision
3. Recall
4. Specificity
5. F-Score
6. Error rate …………… 6 Marks
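The six measures listed above follow directly from the four cells of Table no. 2. The sketch below assumes that YES is the positive class and that "F-Score" means the F1 score (the harmonic mean of precision and recall); the paper states neither explicitly.

```python
# Cells of the confusion matrix in Table no. 2 (assumption: YES = positive class).
tn, fp = 50, 10    # Actual NO:  predicted NO, predicted YES
fn, tp = 5, 100    # Actual YES: predicted NO, predicted YES

total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)        # of predicted YES, the fraction that is truly YES
recall = tp / (tp + fn)           # of actual YES, the fraction that is found
specificity = tn / (tn + fp)      # of actual NO, the fraction correctly rejected
f_score = 2 * precision * recall / (precision + recall)   # F1 (assumed)
error_rate = (fp + fn) / total    # equals 1 - accuracy

for name, value in [("Accuracy", accuracy), ("Precision", precision),
                    ("Recall", recall), ("Specificity", specificity),
                    ("F-Score", f_score), ("Error rate", error_rate)]:
    print(f"{name:12s} {value:.4f}")
```

If NO were treated as the positive class instead, precision, recall, specificity and the F-score would all change, while accuracy and error rate would stay the same.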

Q.6: b. (CO6) Explain the following methods used for training and testing –
1. Resubstitution
2. K-fold cross-validation
3. Bootstrapping …………. 6 Marks

OR
Q.6: (CO6) Using the Naïve Bayes Classifier approach based on the training data set given in
Table no. 3, predict Class = Buy Laptop: Yes or No for the feature set: {Income = Low; Student
= No; Credit Rating = Excellent} ………. 12 Marks

Sr. No.   Income   Student   Credit Rating   Buy Laptop
1         High     No        Fair            No
2         High     No        Excellent       No
3         High     No        Fair            Yes
4         Medium   No        Fair            Yes
5         Low      Yes       Fair            Yes
6         Low      Yes       Excellent       No
7         Low      Yes       Excellent       Yes
8         Medium   No        Fair            No
9         Low      Yes       Fair            Yes
10        Medium   Yes       Fair            Yes
11        Medium   Yes       Excellent       Yes
12        Medium   No        Excellent       Yes
13        High     Yes       Fair            Yes
14        Medium   No        Excellent       No
Table no. 3
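The Naïve Bayes calculation for the feature set in Q.6 can be sketched as below. It is a minimal illustration that uses plain relative frequencies without Laplace smoothing, which is an assumption since the paper does not specify one; the function name `nb_scores` is hypothetical. Exact fractions are used so the arithmetic can be compared with a hand calculation.

```python
from collections import Counter
from fractions import Fraction

# Training data from Table no. 3: (Income, Student, Credit Rating, Buy Laptop).
rows = [
    ("High", "No", "Fair", "No"),        ("High", "No", "Excellent", "No"),
    ("High", "No", "Fair", "Yes"),       ("Medium", "No", "Fair", "Yes"),
    ("Low", "Yes", "Fair", "Yes"),       ("Low", "Yes", "Excellent", "No"),
    ("Low", "Yes", "Excellent", "Yes"),  ("Medium", "No", "Fair", "No"),
    ("Low", "Yes", "Fair", "Yes"),       ("Medium", "Yes", "Fair", "Yes"),
    ("Medium", "Yes", "Excellent", "Yes"), ("Medium", "No", "Excellent", "Yes"),
    ("High", "Yes", "Fair", "Yes"),      ("Medium", "No", "Excellent", "No"),
]
query = ("Low", "No", "Excellent")   # Income, Student, Credit Rating

def nb_scores(rows, query):
    """Return the unnormalised score P(class) * prod P(feature | class) per class."""
    class_counts = Counter(r[-1] for r in rows)
    scores = {}
    for label, count in class_counts.items():
        score = Fraction(count, len(rows))            # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            score *= Fraction(matches, count)         # likelihood P(value | class)
        scores[label] = score
    return scores

scores = nb_scores(rows, query)
prediction = max(scores, key=scores.get)
for label, score in scores.items():
    print(f"{label}: {score} = {float(score):.5f}")
print("Predicted Buy Laptop:", prediction)
```

The scores are not probabilities until they are divided by their sum, but the class with the larger score is the predicted class either way.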
****************************************************************************
