
Model Evaluation:

Training, testing & cross-validation

Dr. Karuna Reddy


STEMP - USP

1
Overview
 Evaluate models by training, testing, and cross-validating
 Correct overfitting and underfitting
problems in your models
 Improve the generalization of your models
using regularization

2
Model Evaluation
 Split our dataset into training and testing
sets
 Build and train the model with a training
set
 To assess performance - compute metrics on
the testing set
 Techniques for handling cases with small
datasets
3
Model Evaluation
 Evaluating the model based on data you have
trained it on will most likely lead to an
optimistic result
 Solution: split your data into training and
testing sets

4
Training and test sets
 Training set - subset of data used to train or
build the model.
 Test set - subset of data used to test the
trained model. Tests how well your model will
perform in the real world

5
Training and test sets

6
Training and test sets

7
Problem
 Predict the likelihood of a flight delay on a
given airline
 Data analysis in R for data cleaning,
exploratory data analysis, model development,
and model evaluation

8
Airline data
 Data Asset eXchange
 ibm.biz/data-exchange
 Data in the Airline Reporting Carrier On-Time
Performance dataset
 https://dax-cdn.cdn.appdomain.cloud/dax-airline/1.0.1/lax_to_tar.gz

9
Airline data
 ArrDelay (minutes) – target variable.
 If ArrDelay is positive, the flight arrived late; if negative, it arrived early
 Predictors: "Distance", "CarrierDelay",
"WeatherDelay", "SecurityDelay",
etc.

10
Training and test sets in tidymodels
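The slide's code is not shown in this extraction; below is a minimal tidymodels (rsample) sketch of an 80/20 split. The built-in `cars` dataset stands in for the airline data, which must be downloaded separately.

```r
# Minimal train/test split with rsample (part of tidymodels).
# The built-in `cars` dataset stands in for the airline data.
library(rsample)

set.seed(1234)                           # make the split reproducible
split      <- initial_split(cars, prop = 0.8)
train_data <- training(split)            # 80%: used to build the model
test_data  <- testing(split)             # 20%: held out for evaluation
```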

11
Train a linear regression
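A hedged sketch of fitting a linear model with parsnip; `dist ~ speed` on `cars` stands in for ArrDelay on the airline predictors, since the original slide code is not shown.

```r
# Fit a linear regression with parsnip (tidymodels).
library(parsnip)
library(rsample)

set.seed(1234)
split      <- initial_split(cars, prop = 0.8)
train_data <- training(split)

lm_spec <- linear_reg() |>     # model type: linear regression
  set_engine("lm")             # computed by base R's lm()

lm_fit <- lm_spec |>
  fit(dist ~ speed, data = train_data)
```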

12
Make predictions
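A sketch of prediction on the held-out set, again with `cars` standing in for the airline data:

```r
# Predict on the test set; predict() on a parsnip fit returns
# a tibble with a .pred column, one row per test observation.
library(parsnip)
library(rsample)

set.seed(1234)
split  <- initial_split(cars, prop = 0.8)
lm_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(dist ~ speed, data = training(split))

preds <- predict(lm_fit, new_data = testing(split))
```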

13
Evaluate models
Compare metrics (e.g., RMSE, R²) on the testing and training sets
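A sketch of computing test-set metrics with yardstick; the `cars` model stands in for the slide's airline model.

```r
# metrics() from yardstick reports RMSE, R-squared, and MAE.
library(parsnip); library(rsample); library(yardstick); library(dplyr)

set.seed(1234)
split  <- initial_split(cars, prop = 0.8)
lm_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(dist ~ speed, data = training(split))

# Attach predictions to the test set, then score them
test_results <- testing(split) |>
  bind_cols(predict(lm_fit, new_data = testing(split)))

test_metrics <- test_results |>
  metrics(truth = dist, estimate = .pred)
# The same call on the training split gives training metrics to compare.
```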

14
Visualize our results
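The slide's plot is lost in extraction; a plausible sketch is an observed-vs-predicted scatter plot with ggplot2, using a base `lm()` on `cars` as a stand-in model.

```r
# Points near the dashed y = x line indicate accurate predictions.
library(ggplot2)

model   <- lm(dist ~ speed, data = cars)
results <- data.frame(observed = cars$dist, predicted = fitted(model))

p <- ggplot(results, aes(x = observed, y = predicted)) +
  geom_point() +
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "Observed stopping distance", y = "Predicted stopping distance")
```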

15
Problem
 Small datasets can:
 Introduce bias into your testing
 Keep useful examples out of training
 Lead to an ineffective evaluation of model performance

16
Cross validation
 The most common out-of-sample evaluation technique
 More effective use of data

17
Why use cross validation?
 Tests the generalizability of the model
 Avoids overfitting to a particular training set
 Works well with small amounts of data (e.g., k-fold)

18
Cross validation in R
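The slide's code is not shown; a minimal tidymodels sketch of k-fold cross-validation on `cars` (standing in for the airline data) looks like this:

```r
# 10-fold cross-validation with rsample + tune: refit the model on each
# training fold and average the metrics across the held-out folds.
library(rsample); library(parsnip); library(workflows); library(tune)

set.seed(1234)
folds <- vfold_cv(cars, v = 10)   # 10 folds of the 50-row cars data

lm_wf <- workflow() |>
  add_model(linear_reg() |> set_engine("lm")) |>
  add_formula(dist ~ speed)

cv_results <- fit_resamples(lm_wf, resamples = folds)
cv_metrics <- collect_metrics(cv_results)   # mean RMSE and R² across folds
```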

19
Overfitting and Underfitting

20
The bias-variance trade-off
 Avoid overfitting: an overly flexible model fits the noise in the training data and generalizes poorly
 Avoid underfitting: an overly simple model ignores important general features of the data
 How do we balance the two?

21
The bias-variance trade-off

22
Underfitting vs overfitting

23
Example in R: Underfitting
 Model is the mean of stopping distance, i.e., d̂ist = mean(dist)
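In R, this underfit "mean model" is an intercept-only regression on the built-in `cars` data:

```r
# An intercept-only model predicts mean(dist) regardless of speed.
mean_model <- lm(dist ~ 1, data = cars)
coef(mean_model)   # the single coefficient equals mean(cars$dist)
```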

24
How to prevent underfitting?
 Increase the model complexity
 Add more features to the training data
 Try different models, such as regression trees or random forests

25
How to prevent underfitting?
 More complex model, e.g.,
d̂ist = β₀ + β₁·speed + β₂·speed² + ⋯ + β₈·speed⁸
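The slide's degree-8 polynomial can be fit in base R with `poly()`:

```r
# On the 50-row cars data, a degree-8 polynomial is flexible enough to
# chase noise: in-sample fit improves while generalization suffers.
poly8_model <- lm(dist ~ poly(speed, 8), data = cars)
summary(poly8_model)$r.squared   # high in-sample R² alone is not a good sign
```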

26
How to prevent overfitting?
 Reduce model complexity
 Train with more data
 Cross-validation
 Regularization

27
Example in R: Best model
 Reduce model complexity
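The slide's selection code is not shown; one plausible sketch compares polynomial degrees by held-out RMSE and keeps the simplest adequate model (an assumed criterion, not necessarily the slide's).

```r
# Compare held-out RMSE across polynomial degrees on the cars data.
set.seed(1234)
idx   <- sample(nrow(cars), 0.8 * nrow(cars))
train <- cars[idx, ]
test  <- cars[-idx, ]

rmse_for_degree <- function(d) {
  fit <- lm(dist ~ poly(speed, d), data = train)
  sqrt(mean((test$dist - predict(fit, newdata = test))^2))
}

rmse_by_degree <- sapply(1:8, rmse_for_degree)  # low degrees typically do best
```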

28
What is Regularization?
 A way to deal with the problem of overfitting
 A technique to reduce the complexity of the model
 Makes the model less likely to fit the noise of the training data, improving its ability to generalize
29
Why use Regularization?
 Regularization is a technique used to avoid
this overfitting problem

30
Types of Regularization
 Ridge (or L2) regularization
 Lasso (or L1) regularization
 Elastic net (L1/L2)
regularization

31
What is Regularization?
 With regularization, you start with a slightly worse fit on the training data, but the regularized regression can provide better long-term predictions
 Ridge regression minimizes: sum of squared residuals + λ · Σ(weight²)
 Lasso regression minimizes: sum of squared residuals + λ · Σ|weight|
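A hedged sketch of both penalties with the glmnet package; `mtcars` stands in for the airline data, and `lambda = 1` is an arbitrary illustrative choice.

```r
# Ridge vs. lasso with glmnet: alpha = 0 gives ridge (L2),
# alpha = 1 gives lasso (L1).
library(glmnet)

x <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt")])  # predictor matrix
y <- mtcars$mpg                                         # target

ridge_fit <- glmnet(x, y, alpha = 0, lambda = 1)  # shrinks all weights toward 0
lasso_fit <- glmnet(x, y, alpha = 1, lambda = 1)  # may set some weights exactly to 0

coef(ridge_fit)
coef(lasso_fit)
```

In practice, λ is chosen by cross-validation (e.g., `cv.glmnet()`) rather than fixed by hand.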
32
Questions

33
