
Introduction to regression trees
MACHINE LEARNING WITH TREE-BASED MODELS IN R

Erin LeDell
Instructor
Train a Regression Tree in R
rpart(formula = ___,
data = ___,
method = ___)
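
For example, a minimal filled-in call might look like the sketch below. The `train` data frame and `response` column are placeholders, not names from the course; `method = "anova"` requests a regression tree.

# Sketch: fit a regression tree (placeholder data/column names)
model <- rpart(formula = response ~ .,
               data = train,
               method = "anova")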

Train/Validation/Test Split
training set

validation set

test set
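
A minimal sketch of one way to create such a split, assuming a hypothetical `grade` data frame and a 70/15/15 allocation:

# Assumed example: random 70/15/15 train/validation/test split
set.seed(1)
idx <- sample(1:3, size = nrow(grade), replace = TRUE,
              prob = c(0.70, 0.15, 0.15))
train <- grade[idx == 1, ]
valid <- grade[idx == 2, ]
test  <- grade[idx == 3, ]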

Let's practice!
Performance metrics for regression
Common metrics for regression
Mean Absolute Error (MAE)

$$\mathrm{MAE} = \frac{1}{n} \sum \left| \mathrm{actual} - \mathrm{predicted} \right|$$

Root Mean Square Error (RMSE)

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum \left( \mathrm{actual} - \mathrm{predicted} \right)^2}$$
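
Both metrics can be computed directly from their definitions as a quick check (toy vectors, assumed for illustration):

# Toy example: compute MAE and RMSE from the formulas above
actual    <- c(3.2, 4.1, 5.0)
predicted <- c(3.0, 4.5, 4.4)

mean(abs(actual - predicted))        # MAE
sqrt(mean((actual - predicted)^2))   # RMSE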

Evaluate a regression tree model
pred <- predict(object = model, # model object
newdata = test) # test dataset

library(Metrics)

# Compute the RMSE
rmse(actual = test$response,  # the actual values
     predicted = pred)        # the predicted values

[1] 2.278249
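
The Metrics package also provides mae() with the same interface, so the MAE can be computed analogously:

# Compute the MAE on the same predictions
mae(actual = test$response,
    predicted = pred)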

Let's practice!
What are the hyperparameters for a decision tree?
Decision tree hyperparameters
?rpart.control

Decision tree hyperparameters
minsplit: the minimum number of data points required to attempt a split

cp: the complexity parameter

maxdepth: the maximum depth of the decision tree
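
These can be passed explicitly through rpart.control(); a sketch with illustrative values (minsplit = 20 and cp = 0.01 happen to be rpart's defaults, maxdepth = 10 is arbitrary):

model <- rpart(formula = response ~ .,
               data = train,
               method = "anova",
               control = rpart.control(minsplit = 20,
                                       cp = 0.01,
                                       maxdepth = 10))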

Cost-Complexity Parameter (CP)
plotcp(grade_model)
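
plotcp() plots the cross-validated relative error (xerror) for each candidate cp value; by convention, cp values whose error falls below the dashed reference line (the minimum error plus one standard error) are good pruning candidates.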

Cost-Complexity Parameter (CP)
print(model$cptable)

          CP nsplit rel error    xerror       xstd
1 0.06839852      0 1.0000000 1.0080595 0.09215642
2 0.06726713      1 0.9316015 1.0920667 0.09543723
3 0.03462630      2 0.8643344 0.9969520 0.08632297
4 0.02508343      3 0.8297080 0.9291298 0.08571411
5 0.01995676      4 0.8046246 0.9357838 0.08560120
6 0.01817661      5 0.7846679 0.9337462 0.08087153
7 0.01203879      6 0.7664912 0.9092646 0.07982862
8 0.01000000      7 0.7544525 0.9407895 0.08399125
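
A common way to pick the pruning value is the cp with the smallest cross-validated error (xerror); a sketch:

# Retrieve the cp value that minimizes cross-validated error
opt_index <- which.min(model$cptable[, "xerror"])
cp_opt <- model$cptable[opt_index, "CP"]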

Cost-Complexity Parameter (CP)
# Prune the model to the optimal cp value
model_opt <- prune(tree = model, cp = cp_opt)

Let's practice!
Grid Search for model selection
Grid Search
What is a model hyperparameter?

What is a "grid"?

What is the goal of a grid search?

How is the best model chosen?

Set up the grid
# Establish a list of possible values for minsplit & maxdepth
splits <- seq(1, 30, 5)
depths <- seq(5, 40, 10)

# Create a data frame containing all combinations
hyper_grid <- expand.grid(minsplit = splits,
                          maxdepth = depths)

hyper_grid[1:10,]

   minsplit maxdepth
1         1        5
2         6        5
3        11        5
4        16        5
5        21        5
6        26        5
7         1       15
8         6       15
9        11       15
10       16       15

Grid Search in R: Train models
# Create an empty list to store models
models <- list()

# Execute the grid search
for (i in 1:nrow(hyper_grid)) {

    # Get minsplit, maxdepth values at row i
    minsplit <- hyper_grid$minsplit[i]
    maxdepth <- hyper_grid$maxdepth[i]

    # Train a model and store it in the list
    models[[i]] <- rpart(formula = response ~ .,
                         data = train,
                         method = "anova",
                         minsplit = minsplit,
                         maxdepth = maxdepth)
}

# Create an empty vector to store RMSE values
rmse_values <- c()

# Compute validation RMSE for each model
for (i in 1:length(models)) {

    # Retrieve the i^th model from the list
    model <- models[[i]]

    # Generate predictions on the validation set
    pred <- predict(object = model,
                    newdata = valid)

    # Compute validation RMSE and add it to the vector
    rmse_values[i] <- rmse(actual = valid$response,
                           predicted = pred)
}
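
With a validation RMSE for every candidate, the best model is the one with the smallest value; a sketch:

# Identify the model with the smallest validation RMSE
best_model <- models[[which.min(rmse_values)]]

# Inspect the winning hyperparameters
best_model$control$minsplit
best_model$control$maxdepth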

Let's practice!