[Figure: Holdout method. The dataset is split into a training dataset used to train the model and a testing dataset used to evaluate the model.]
There are two parts to the dataset in the diagram above. One split is held aside as a training set. Another set is held back for testing or evaluation of the model. The percentage of the split is determined based on the amount of training data available. A typical split of 70-30% is used, in which 70% of the dataset is used for training and 30% is used for testing the model.

Follow the steps below for using the hold-out method for model evaluation:

1. Split the dataset in two (preferably 70-30%; however, the split percentage can vary and should be random).
2. Now, we train the model on the training dataset by selecting some fixed set of hyper-parameters.
3. Use the hold-out test dataset to evaluate the model.
4. Use the entire dataset to train the final model so that it can generalize better on future datasets.

In this process, the dataset is split into training and test sets, and a fixed set of hyper-parameters is used to evaluate the model.

3.2.1 K-fold cross validation method

In machine learning, we cannot simply fit the model on the training data and claim that the model will work accurately for the real data. We have to ensure that our model has learned the correct patterns from the data and is not picking up too much noise. For this purpose, we use the cross-validation technique.
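As a concrete illustration, the hold-out split from steps 1-4 above can be sketched in plain Python (a minimal sketch: the dataset, the seed, and the `holdout_split` helper are placeholders invented for this example, not part of the text):

```python
import random

def holdout_split(data, train_frac=0.7, seed=42):
    """Randomly split a dataset into training and test portions (step 1)."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)      # the split should be random
    cut = int(len(data) * train_frac)         # e.g. a 70-30% split
    train = [data[i] for i in indices[:cut]]
    test = [data[i] for i in indices[cut:]]
    return train, test

dataset = list(range(100))                    # placeholder dataset
train_set, test_set = holdout_split(dataset)
print(len(train_set), len(test_set))          # 70 30
```

Steps 2-4 (training with a fixed set of hyper-parameters, evaluating on the held-out set, then retraining on the full dataset) would then operate on `train_set`, `test_set`, and `dataset` respectively.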
Fundamental of Machine Learning / 2023 /4
Cross validation is a technique used in machine learning to evaluate the performance of a model on unseen data. It involves dividing the available data into multiple folds or subsets, using one of these folds as a validation set, and training the model on the remaining folds. This process is repeated multiple times, each time using a different fold as the validation set. Finally, the results from each validation step are averaged to produce a more robust estimate of the model's performance.

The main purpose of cross validation is to overcome overfitting, which occurs when a model is trained too well on the training data and performs poorly on new, unseen data.

In the K-Fold Cross Validation method, we split the dataset into k number of subsets (known as folds), then we perform training on k-1 of the subsets and leave one subset for the evaluation of the trained model. We iterate k times, with a different subset reserved for testing purpose each time.

The general steps to implement K-Fold cross validation are as follows:

Step 1: Shuffle the dataset randomly.
Step 2: Split the dataset into k groups.
Step 3: For each unique group:
  a. Take the group as a hold-out or test data set
  b. Take the remaining groups as a training data set
  c. Fit a model on the training set and evaluate it on the test set
  d. Retain the evaluation score and discard the model
Step 4: Summarize the skill of the model using the sample of model evaluation scores.
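The K-Fold steps above can be sketched as follows (a minimal sketch: the `fit_and_score` callback is a placeholder standing in for fitting a real model and computing a real metric):

```python
import random

def k_fold_scores(data, k, fit_and_score, seed=0):
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)              # Step 1: shuffle randomly
    folds = [indices[i::k] for i in range(k)]         # Step 2: split into k groups
    scores = []
    for i in range(k):                                # Step 3: for each unique group
        test_idx = folds[i]                           #   a. hold-out / test data set
        train_idx = [j for m in range(k) if m != i    #   b. remaining groups as
                     for j in folds[m]]               #      training data set
        scores.append(fit_and_score(train_idx, test_idx))  # c. fit, evaluate; d. retain
    return sum(scores) / k                            # Step 4: summarize (mean score)

# Placeholder "model": the score is just the test fold's share of the data
mean_score = k_fold_scores(list(range(50)), k=5,
                           fit_and_score=lambda tr, te: len(te) / 50)
print(mean_score)  # 0.2
```

In practice `fit_and_score` would train the model on the training indices and return, for example, its accuracy on the test indices.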
[Figure: K-Fold cross validation (k = 5). In each of the 1st through 5th iterations, a different fold serves as the validation fold while the remaining folds are training folds; the performance of each iteration is recorded and averaged.]
Advantages of cross-validation:
- More accurate estimate of out-of-sample accuracy.
- More "efficient" use of data, as every observation is used for both training and testing.
- Overcomes the issue of overfitting.
Modeling and Evaluation 27
[Figure: Bagging. Bootstrap samples are drawn from the training set, a model is trained on each subset, and the individual predictions are combined.]
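As the figure suggests, bagging fits one model per bootstrap sample and then combines their predictions. A minimal sketch (the base "model" here is a trivial most-common-label rule invented purely for illustration):

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw a sample of the same size as the data, with replacement."""
    return [rng.choice(data) for _ in data]

def bagged_majority_vote(labels, n_models=25, seed=1):
    """Fit a trivial 'predict the most common label' model on each
    bootstrap sample, then combine the models by majority voting."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        sample = bootstrap_sample(labels, rng)
        votes.append(Counter(sample).most_common(1)[0][0])  # one model's prediction
    return Counter(votes).most_common(1)[0][0]              # combined prediction

labels = ["A"] * 8 + ["B"] * 2
print(bagged_majority_vote(labels))
```

A real bagging ensemble would train a full model (such as a decision tree) on each bootstrap sample instead of the most-common-label rule.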
Boosting: Boosting is a sequential process that creates an ensemble of models by iteratively improving the performance of a single model. In boosting, we train a base model on the entire training set. Then, we assign weights to each sample in the training set and train a new model on the same dataset. In each iteration, we give more weight to the misclassified samples and less weight to the correctly classified samples. Finally, we combine the results of all models to make the final prediction.
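The reweighting step described above can be sketched as follows (a sketch of a single boosting round; the AdaBoost-style update rule is an assumed concrete choice, and the round's error is assumed to lie strictly between 0 and 1):

```python
import math

def boost_reweight(weights, correct):
    """One boosting round: increase the weight of misclassified samples
    and decrease the weight of correctly classified ones."""
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    alpha = 0.5 * math.log((1 - err) / err)   # this round's say in the final vote
    new = [w * math.exp(alpha if not ok else -alpha)
           for w, ok in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new]           # renormalize so weights sum to 1

# Four equally weighted samples; the second one was misclassified
print(boost_reweight([0.25] * 4, [True, False, True, True]))
# the misclassified sample's weight rises to 0.5, the others drop to 1/6
```

Repeating this round with a new model each time, and combining the models' outputs weighted by their `alpha` values, yields the final boosted prediction.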
[Figure: The process of boosting. Models are trained sequentially on the training set, each round focusing on the previous round's false predictions; the individual predictions are combined into the overall prediction.]
Stacking: Stacking is a process that involves combining multiple models that use different algorithms or different representations of the data. In stacking, we train multiple models on the same dataset. Then, we train a meta-model on the output of these models. The meta-model learns how to combine the outputs of the models to make the final prediction.
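Stacking can be sketched as follows (a minimal sketch: the two base "models" and the meta-model's weights are made up for illustration; in practice the meta-model's weights would themselves be learned from the base models' outputs):

```python
def base_model_a(x):
    """First base model: a simple threshold rule (placeholder)."""
    return 1.0 if x > 5 else 0.0

def base_model_b(x):
    """Second base model: a scaled score (placeholder)."""
    return x / 10.0

def meta_model(outputs, weights=(0.6, 0.4)):
    """Meta-model: combines the base models' outputs (weights hard-coded
    here; a real meta-model would learn them from data)."""
    return sum(w * o for w, o in zip(weights, outputs))

def stacked_predict(x):
    outputs = (base_model_a(x), base_model_b(x))   # level-0 predictions
    return meta_model(outputs)                     # level-1 (meta) prediction

print(stacked_predict(8))  # 0.6*1.0 + 0.4*0.8 = 0.92
```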
[Figure: Stacking. Multiple models are trained on the training set; their outputs are fed to a meta-model that produces the final predictions.]
                 Bagging                 Boosting             Stacking
Goal             Reduce Variance         Reduce Bias          Improve Accuracy
Combination      Max Voting, Averaging   Weighted Averaging   Weighted Averaging
One powerful ensemble-based technique is the Random Forest. In a Random Forest, a large number of decision trees are trained on randomly selected subsets of the data. Each tree is trained on a different subset of the data, using a different set of randomly selected features. This randomness helps to reduce overfitting and improve the generalization ability of the model.
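The two kinds of randomness described here, a bootstrap sample of the rows and a random subset of the features for each tree, can be sketched as follows (only the sampling step is shown; the tree training itself is omitted, and the feature names are placeholders):

```python
import random

def forest_training_sets(n_rows, feature_names, n_trees, n_features, seed=7):
    """For each tree, draw a bootstrap sample of row indices and a
    random subset of the features."""
    rng = random.Random(seed)
    plans = []
    for _ in range(n_trees):
        rows = [rng.randrange(n_rows) for _ in range(n_rows)]  # rows, with replacement
        feats = rng.sample(feature_names, n_features)          # random feature subset
        plans.append((rows, feats))
    return plans

plans = forest_training_sets(100, ["age", "income", "city", "score"],
                             n_trees=3, n_features=2)
for rows, feats in plans:
    print(len(rows), sorted(feats))
```

Each `(rows, feats)` pair would be used to train one decision tree; the forest's prediction is the combination (for classification, the majority vote) of all the trees.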
[Figure: Random Forest. An instance is classified by each randomly trained tree, and the class votes (e.g. Class-A) are combined into the final prediction.]
However, the ensemble learning approach also has some drawbacks. It can be computationally expensive and may require a large amount of memory. In addition, the performance of the ensemble may be highly dependent on the performance of the individual models. Therefore, it is important to carefully select the models and ensure that they are diverse and complementary to each other.

3.5.2 Model Parameter Tuning

Model parameter tuning is the process of adjusting the hyperparameters of a machine learning model in order to improve its performance. Hyperparameters are the parameters that are set before the training of the model and cannot be learned from the data. Examples of hyperparameters include the learning rate, the number of hidden layers in a neural network, and the regularization parameter.

The model parameter tuning approach in machine learning involves the following steps:

Step 1: Define the hyperparameters: The first step is to define the hyperparameters that need to be tuned. This can be done by looking at the model architecture and understanding how the hyperparameters affect the performance of the model.

Step 2: Define the search space: The next step is to define the search space for the hyperparameters. The search space is a range of possible values for each hyperparameter. The search space can be explored using a grid search or a random search.

Step 3: Select a performance metric: The performance metric is the metric that will be used to evaluate the performance of the model. The most common performance metrics are accuracy, precision, recall, and F1-score.

Step 4: Evaluate the model: The model is trained on the training data using different combinations of hyperparameters. The performance of the model is evaluated on the validation data using the selected performance metric.

Step 5: Select the best hyperparameters: The hyperparameters that give the best performance on the validation data are selected as the final hyperparameters. These hyperparameters are then used to train the model on the entire training dataset.

Step 6: Evaluate the final model: The final model is evaluated on the test data using the selected performance metric to determine the generalization performance of the model.

The model parameter tuning approach is an iterative process that involves repeating the above steps until the desired level of performance is achieved. It is important to note that model parameter tuning can be time-consuming and computationally expensive, especially for large datasets and complex models.
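Steps 1 to 5 can be sketched together as a grid search (a minimal sketch: the search-space values are examples drawn from the hyperparameters named in the text, and `validation_score` is a made-up stand-in for training a model and evaluating it on validation data):

```python
from itertools import product

# Steps 1-2: the hyperparameters to tune and their search space (a grid)
search_space = {
    "learning_rate": [0.01, 0.1, 1.0],
    "num_layers": [1, 2, 3],
}

def validation_score(params):
    """Placeholder for Steps 3-4: train on the training data and score
    on the validation data with the chosen metric. Here the 'score' is
    an arbitrary function peaking at learning_rate=0.1, num_layers=2."""
    return -abs(params["learning_rate"] - 0.1) - abs(params["num_layers"] - 2)

def grid_search(space, score):
    best_params, best_score = None, float("-inf")
    keys = list(space)
    for values in product(*(space[k] for k in keys)):  # every combination
        params = dict(zip(keys, values))
        s = score(params)
        if s > best_score:                             # Step 5: keep the best
            best_params, best_score = params, s
    return best_params, best_score

best, score = grid_search(search_space, validation_score)
print(best)  # {'learning_rate': 0.1, 'num_layers': 2}
```

Step 6 would then retrain with `best` on the full training set and evaluate once on the held-out test data.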
Question Bank

Short-Answer Questions (3 or 4 marks):
1. Define model. How can you train a model?
2. Give the difference between predictive model and descriptive model.
3. State any four real-world problems solved by predictive models. Explain any one in brief.
4. State any four real-world problems solved by descriptive models. Explain any one in brief.
5. Write down steps to use holdout method for model training.
6. Draw a detailed diagram to show the approach of 10-fold cross validation.
7. Give the difference between Bagging and Boosting.
8. Define overfitting. When does it happen?
9. Define underfitting. When does it happen?
10. Write a short note on bias-variance trade-off in context of model fitting.
11. Explain structure of confusion matrix.
12. Define following terms: Sensitivity, Specificity.
13. Write a brief note on stacking.
14. State various ways to improve performance of a model.

Long-Answer Questions (7 marks):
1. Explain different types of model.
2. Explain Holdout method in detail.
3. Describe k-fold cross validation in detail.
4. Explain bagging, boosting and stacking in detail.
5. Explain Ensemble learning approach in detail.
6. Describe the following confusion matrix of the win/loss prediction of cricket match. Calculate the accuracy, error rate, sensitivity, specificity, precision, recall and F-measure of the model.

                     Actual Win    Actual Loss
   Predicted Win          8             7
   Predicted Loss         3

7. While predicting malignancy of tumour of a set of patients using a classification model, following are the data recorded:
   1. Correct predictions: 20 malignant, 70 benign
   2. Incorrect predictions: 4 malignant, 6 benign
   Create confusion matrix for the same. And, calculate the accuracy, error rate, sensitivity, specificity, precision, recall and F-measure of the model.

Check your knowledge:
1. What is the difference between bagging and boosting?
   a) Bagging reduces the variance of a model, while boosting reduces the bias of a model.
   b) Bagging reduces the bias of a model, while boosting reduces the variance of a model.
   c) Bagging increases the complexity of a model, while boosting reduces the complexity of a model.
   d) Bagging reduces the complexity of a model, while boosting increases the complexity of a model.
2. Which of the following is not a metric that can be calculated using a confusion matrix?
   a) Precision   b) Recall   c) F1 score   d) R-squared
3. Which of the following is a common technique used to prevent overfitting in a machine learning model?
   a) Adding more features to the model
   b) Increasing the number of epochs during training
   c) Reducing the size of the model
   d) Decreasing the learning rate during training
4. What is the holdout method used for?
   a) To evaluate the performance of a machine learning model on a dataset that it hasn't seen during training
   b) To train a machine learning model on a subset of a dataset
   c) To generate new data for a machine learning model
   d) To reduce the complexity of a machine learning model
5. Which of the following is a common technique used to prevent underfitting in a machine learning model?
   a) Adding more features to the model
   b) Increasing the number of epochs during training
   c) Reducing the size of the model
   d) Decreasing the learning rate during training