
MH4510 - Statistical Learning and Data Mining - AY1819 S1 Lab 09

MH4510 - Decision Tree


Matthew Zakharia Hadimaja

19th October 2018 (Fri) - Decision Tree


Course instructor: PUN Chi Seng
Lab instructor: Matthew Zakharia Hadimaja

References
Chapter 8.3, [ISLR] An Introduction to Statistical Learning (with Applications in R). The book is freely
available for download at http://www-bcf.usc.edu/~gareth/ISL/
To see the help file of a function funcname, type ?funcname.

1. Preparation

Install the packages below if needed, then load them (an install sketch follows the code).


library(MASS) # dataset
library(tree) # decision tree
library(randomForest) # random forest
library(gbm) # gradient boosting
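If any of these packages are missing, a minimal install sketch (pkgs and missing are names introduced here):

pkgs <- c("MASS", "tree", "randomForest", "gbm")
missing <- pkgs[!pkgs %in% rownames(installed.packages())] # packages not yet installed
if (length(missing) > 0) install.packages(missing)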

2. Regression Trees

Load dataset
data(Boston) # as usual, predict medv
str(Boston)

Data split
set.seed(1)
train <- sample(nrow(Boston), 0.7 * nrow(Boston)) # row indices for a 70/30 train/test split

Tree
reg.tree <- tree(medv ~ ., Boston[train, ]) # fit a regression tree on the training set
summary(reg.tree)
reg.tree # print the splitting rules
plot(reg.tree)
text(reg.tree) # label the splits

CV for tree
cv.reg.tree <- cv.tree(reg.tree) # cost-complexity pruning with 10-fold CV
cv.reg.tree
plot(cv.reg.tree$size, cv.reg.tree$dev, type = 'b') # CV deviance vs subtree size
(min.cv.reg <- cv.reg.tree$size[which.min(cv.reg.tree$dev)]) # CV-optimal size
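cv.tree assigns observations to folds at random, so dev can change between runs; a sketch that fixes the fold assignment (cv.reg.tree2 is a name introduced here; K = 10 folds is the default):

set.seed(2)
cv.reg.tree2 <- cv.tree(reg.tree, K = 10) # re-run CV with a fixed seed
cv.reg.tree2$dev # compare with cv.reg.tree$dev above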

Prune Tree


prune.reg.tree <- prune.tree(reg.tree, best = min.cv.reg) # prune to the CV-optimal size


plot(prune.reg.tree)
text(prune.reg.tree)

Predict
medv.pred <- predict(prune.reg.tree, newdata = Boston[-train, ])
medv.true <- Boston$medv[-train]
plot(medv.pred, medv.true)
abline(0, 1) # perfect predictions would lie on this 45-degree line
mean((medv.pred - medv.true) ^ 2) # test MSE
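Since medv is the median home value in $1000s, the square root of the test MSE is easier to interpret; a quick sketch:

sqrt(mean((medv.pred - medv.true) ^ 2)) # test RMSE, in $1000s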


3. Classification Trees

Since Boston doesn’t have a categorical variable, we will create one and call it cmed. It indicates whether the crime
rate crim is above the median. Note that we use the median of the training set, so no test-set information leaks into the labels.
Boston.cls <- Boston
Boston.cls$cmed <- "No"
Boston.cls$cmed[Boston.cls$crim > median(Boston.cls[train,]$crim)] <- "Yes"
Boston.cls$cmed <- factor(Boston.cls$cmed)
Boston.cls <- Boston.cls[-1] # drop the crim variable
str(Boston.cls)
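Because the cutoff is the training-set median, the two classes are roughly balanced on the training rows but not necessarily on the test rows; a quick check:

table(Boston.cls$cmed[train]) # class counts on the training rows
table(Boston.cls$cmed[-train]) # class counts on the test rows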

Tree
cls.tree <- tree(cmed ~ ., Boston.cls[train, ])
summary(cls.tree)
cls.tree
plot(cls.tree)
text(cls.tree)

CV for tree
cv.cls.tree <- cv.tree(cls.tree, FUN = prune.misclass) # CV guided by misclassification rate
cv.cls.tree
plot(cv.cls.tree$size, cv.cls.tree$dev, type='b') # here dev is the CV misclassification count
(min.cv.cls <- cv.cls.tree$size[which.min(cv.cls.tree$dev)])

Prune + Predict
prune.cls.tree <- prune.misclass(cls.tree, best = min.cv.cls) # prune with the same criterion as the CV above
cmed.pred <- predict(prune.cls.tree, newdata = Boston.cls[-train, ], type = "class")
cmed.true <- Boston.cls$cmed[-train]
table(cmed.pred, cmed.true) # confusion matrix
mean(cmed.pred == cmed.true) # test accuracy
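For comparison, a sketch of the unpruned tree’s test accuracy (cmed.pred.full is a name introduced here):

cmed.pred.full <- predict(cls.tree, newdata = Boston.cls[-train, ], type = "class")
mean(cmed.pred.full == cmed.true) # accuracy of the unpruned tree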

4. Random Forest

We’ll set m = 4: a common choice for classification is m ≈ sqrt(p), and here p = 13 predictors gives sqrt(13) ≈ 3.6. Here’s a classification example.


set.seed(1)
rf.cls <- randomForest(cmed ~ .,
                       data = Boston.cls[train, ],
                       mtry = 4,
                       ntree = 100,
                       importance = TRUE)
rf.cls
importance(rf.cls) # permutation (accuracy) and Gini importance measures
varImpPlot(rf.cls)

Predict
cmed.rf <- predict(rf.cls, newdata = Boston.cls[-train, ], type = "class")
table(cmed.rf, cmed.true)
mean(cmed.rf == cmed.true)
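To see how sensitive the forest is to m, a sketch looping over a few mtry values (rf.m and pred.m are names introduced here):

for (m in c(2, 4, 7, 13)) {
  set.seed(1)
  rf.m <- randomForest(cmed ~ ., data = Boston.cls[train, ], mtry = m, ntree = 100)
  pred.m <- predict(rf.m, newdata = Boston.cls[-train, ])
  cat("mtry =", m, "accuracy =", mean(pred.m == cmed.true), "\n")
}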


5. Bagging

Here’s a regression example.


set.seed(1)
bag.reg <- randomForest(medv ~ .,
                        data = Boston[train, ],
                        mtry = 13, # when m = p, it's bagging
                        ntree = 100,
                        importance = TRUE)
bag.reg
importance(bag.reg)
varImpPlot(bag.reg)

Predict
medv.bag <- predict(bag.reg, newdata = Boston[-train, ])
mean((medv.bag - medv.true) ^ 2) # test MSE
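randomForest also tracks out-of-bag (OOB) error as trees are added, giving an internal error estimate that never touches the test set; a sketch (mse is the OOB MSE vector stored in a regression randomForest object):

plot(bag.reg) # OOB MSE as a function of the number of trees
tail(bag.reg$mse, 1) # OOB MSE after all 100 trees, to compare with the test MSE above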

6. Boosting

set.seed(1)
boost.reg <- gbm(medv ~ .,
                 data = Boston[train, ],
                 distribution = 'gaussian', # bernoulli for classification
                 n.trees = 5000,
                 interaction.depth = 1) # depth-1 trees (stumps)
boost.reg
summary(boost.reg) # relative influence of each variable
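The fit above leaves the learning rate at the package default; gbm can also choose the number of trees by cross-validation via its cv.folds argument and the gbm.perf helper. A sketch (boost.cv, best.iter, and shrinkage = 0.01 are choices introduced here):

set.seed(1)
boost.cv <- gbm(medv ~ .,
                data = Boston[train, ],
                distribution = 'gaussian',
                n.trees = 5000,
                interaction.depth = 1,
                shrinkage = 0.01, # slower learning rate, for illustration
                cv.folds = 5)
(best.iter <- gbm.perf(boost.cv, method = 'cv')) # CV-optimal number of trees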

par(mfrow = c(1, 2)) # show the two plots side by side


plot(boost.reg, i = 'rm') # partial dependence of medv on rm
plot(boost.reg, i = 'lstat') # partial dependence of medv on lstat

Predict
medv.boost <- predict(boost.reg, newdata = Boston[-train, ], n.trees = 5000)
mean((medv.boost - medv.true) ^ 2) # test MSE
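The comment in the fit above notes distribution = 'bernoulli' for classification; gbm expects a 0/1 numeric response, so here is a sketch on cmed (Boston.gbm, cmed01, boost.cls, and prob are names introduced here):

Boston.gbm <- Boston.cls
Boston.gbm$cmed01 <- as.numeric(Boston.gbm$cmed == "Yes") # 0/1 recoding of cmed
Boston.gbm$cmed <- NULL # drop the factor version
set.seed(1)
boost.cls <- gbm(cmed01 ~ .,
                 data = Boston.gbm[train, ],
                 distribution = 'bernoulli',
                 n.trees = 1000,
                 interaction.depth = 1)
prob <- predict(boost.cls, newdata = Boston.gbm[-train, ],
                n.trees = 1000, type = 'response') # predicted P(cmed = "Yes")
mean(ifelse(prob > 0.5, "Yes", "No") == cmed.true) # test accuracy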
