a) Define normalization and standardization, and explain the circumstances that
may require each to be applied to the data in a dataset. [4 marks]
b) The goal of resampling methods is to make the best use of your training data in
order to accurately estimate the performance of a model on new unseen data.
You can use either a train/test split of your data or k-fold cross-validation
to resample your data. Describe scenarios in which each technique is
appropriate. [4 marks]
c) Describe the following algorithm evaluation metrics and provide their Python
functions.
i) Classification Accuracy [2 marks]
ii) Mean Absolute Error [2 marks]
d) Using equations, differentiate the statistical learning perspective from the
computer science learning perspective. [4 marks]
e) Give one benefit and one limitation each of parametric and non-parametric
algorithms. [4 marks]
f) Provide any two supervised machine learning algorithms for regression
problems [2 marks]
g) Differentiate Classification from Regression Machine Learning problems
[2 marks]
h) Using the k-Nearest Neighbours and Support Vector Machine algorithms,
discuss the term bias-variance trade-off. [5 marks]
i) Differentiate over-fitting from under-fitting of a machine learning
algorithm. [1 mark]
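For part (c), the two measures can be sketched in plain Python as below. This is only an illustrative sketch: the function names and the small label and value lists are assumptions made for the example, not part of the question. In scikit-learn the corresponding functions are `accuracy_score` and `mean_absolute_error` in `sklearn.metrics`.

```python
def classification_accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true class labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between predictions and true values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and values, used only to exercise the functions.
acc = classification_accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])
mae = mean_absolute_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(acc, mae)
```

Accuracy applies to classification problems, where a prediction is either right or wrong; mean absolute error applies to regression problems, where it reports how far predictions are from the true values on average.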
Suppose you have a CSV file called iris.data.CSV with the features
sepal-length, sepal-width, petal-length, petal-width and class, and that you
have already loaded all the necessary libraries in a Python programming
environment of your choice. Provide code snippets to perform the following tasks.
a) Load the dataset [3 marks]
b) Print dimension and the first 30 rows of the dataset [2 marks]
c) Display both box-and-whisker plots and histograms of each
attribute in the dataset [4 marks]
d) Create a validation dataset using the split-out validation technique.
Use a validation dataset size of 20% and a seed value of 7. [6 marks]
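Tasks (a), (b) and (d) might be approached as in the sketch below, assuming pandas and scikit-learn are the libraries in use and that the file has no header row. So that the sketch runs without the file, a few placeholder rows are read from an in-memory buffer; the variable names and the placeholder rows are illustrative assumptions. With the real file, the path `iris.data.CSV` would be passed to `read_csv` instead.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder rows standing in for iris.data.CSV (illustration only).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.4,3.2,4.5,1.5,Iris-versicolor\n"
    "6.9,3.1,4.9,1.5,Iris-versicolor\n"
    "5.5,2.3,4.0,1.3,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "5.8,2.7,5.1,1.9,Iris-virginica\n"
    "7.1,3.0,5.9,2.1,Iris-virginica\n"
)
names = ["sepal-length", "sepal-width", "petal-length", "petal-width", "class"]

# a) Load the dataset, supplying the column names explicitly.
dataset = pd.read_csv(sample, names=names)

# b) Print the dimensions (rows, columns) and the first 30 rows.
print(dataset.shape)
print(dataset.head(30))

# d) Split-out validation: hold back 20% of the rows, with seed 7.
array = dataset.values
X = array[:, 0:4]  # the four measurement columns
y = array[:, 4]    # the class column
X_train, X_validation, y_train, y_validation = train_test_split(
    X, y, test_size=0.20, random_state=7
)
print(len(X_train), len(X_validation))
```

For task (c), pandas offers `dataset.plot(kind="box", subplots=True)` and `dataset.hist()`, both of which draw through matplotlib; they are left out of the sketch so that it runs without a display.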
Use the following dataset, which describes two categorical input variables
and a class variable that has two outputs, to answer the questions that follow.
Use the Naïve Bayes algorithm.