L3 - Basic R Commands (Decision Tree, Random Forest) PDF

17/6/2562 Basic R Commands (Decision Tree, Random Forest) – Pang Taithong
Pang Taithong
Basic R Commands (Decision Tree, Random Forest)
Examples of basic R commands for building a Decision Tree model and a Random Forest model, continuing from the articles on building a Classification Model in Part 1 (http://wp.me/p4UVPR-20) and Part 2 (http://wp.me/p4UVPR-2h).
1. The main libraries used for building the Decision Tree model and the Random Forest model, and for drawing the plots, are as follows:
# For building the Decision Tree model
install.packages("rpart")
# For plotting the Decision Tree
install.packages("rattle")
install.packages("rpart.plot")
install.packages("RColorBrewer")
# For building the Random Forest model
install.packages("randomForest")
# For drawing bar charts
install.packages("plotly")
# Load the libraries
library(rpart)
library(rattle)
library(rpart.plot)
library(RColorBrewer)
library(randomForest)
library(plotly)
2. Command for importing data from a CSV file
raw_data <- read.csv(file = "path_to_file/priceday.csv", header = TRUE, sep = ",")
Note: header = TRUE means the CSV file has the column names in its first row.
sep = "," means this CSV file uses "," to separate its fields.
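To see how these two arguments behave, here is a minimal sketch that writes a small made-up CSV to a temporary file and reads it back. The column names and values are invented for illustration and are not taken from priceday.csv. The stringsAsFactors = TRUE argument is an addition that the original command does not use.

```r
# Hypothetical data, written to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("usd,price", "35.1,high", "34.8,low", "35.4,high"), tmp)

# header = TRUE: the first row holds the column names
# sep = ",": fields are separated by commas
# stringsAsFactors = TRUE (an assumption, not in the original command):
# read text columns as factors, which classification functions expect for the target
demo_data <- read.csv(file = tmp, header = TRUE, sep = ",", stringsAsFactors = TRUE)

str(demo_data)
```

Since R 4.0, read.csv keeps text columns as character by default, so converting the target column to a factor (here through stringsAsFactors) may be needed before training a classifier.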
3. Commands for random sampling
smp_size <- floor(0.8 * nrow(raw_data))
set.seed(123)
train_ind <- sample(seq_len(nrow(raw_data)), size = smp_size)
4. Commands for splitting the data into training and testing sets
training_data <- raw_data[train_ind, ] # store the randomly sampled 80% in training_data
testing_data <- raw_data[-train_ind, ] # store the remaining 20% in testing_data
split_data <- list(training_data, testing_data)
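The sampling and splitting steps can be checked on a small example. This is only a sketch: the 10-row data frame below is a made-up stand-in for raw_data, and the id column exists only to make the check easy.

```r
# Hypothetical stand-in for raw_data: 10 rows with made-up values
demo_data <- data.frame(id = 1:10, price = rep(c("high", "low"), 5))

smp_size <- floor(0.8 * nrow(demo_data))   # number of rows for training
set.seed(123)                              # make the random draw reproducible
train_ind <- sample(seq_len(nrow(demo_data)), size = smp_size)

training_data <- demo_data[train_ind, ]    # the sampled rows
testing_data  <- demo_data[-train_ind, ]   # the rows that were not sampled

# The two sets should share no rows and together cover the whole data set
nrow(training_data)
nrow(testing_data)
length(intersect(training_data$id, testing_data$id))
```

Negative indexing with -train_ind drops the sampled rows, which is why the test set is exactly the complement of the training set.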
5. Commands for the Decision Tree model
1. Train Model command
# Train on training_data, with 'price' as the target (label)
tree_fit <- rpart(price ~ ., training_data, method = "class")
2. Test Model command
# Predict on testing_data; the target does not need to be specified
tree_pred <- predict(tree_fit, testing_data, type = "class")
3. Decision Tree plotting command
fancyRpartPlot(tree_fit) # draw the Decision Tree (colored version, uses the RColorBrewer library)
# rpart.plot(tree_fit, type = 4, extra = 100, fallen.leaves = TRUE) (plain version)
4. Tree pruning command (addresses overfitting)
ptree <- prune(tree_fit, cp = tree_fit$cptable[which.min(tree_fit$cptable[, "xerror"]), "CP"])
ptree_pred <- predict(ptree, testing_data, type = "class")
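The pruning command packs several steps into one line. The sketch below spells them out, using the built-in iris data set as a stand-in because the price data is not available here. The variable names (fit, best_row, best_cp, pruned) are chosen for illustration only.

```r
library(rpart)

# Stand-in data: iris replaces the price data, Species replaces price
set.seed(123)
train_ind <- sample(seq_len(nrow(iris)), size = floor(0.8 * nrow(iris)))
train <- iris[train_ind, ]
test  <- iris[-train_ind, ]

# Train a classification tree
fit <- rpart(Species ~ ., data = train, method = "class")

# cptable has one row per candidate tree size;
# the xerror column is the cross-validated error for that size
best_row <- which.min(fit$cptable[, "xerror"])
best_cp  <- fit$cptable[best_row, "CP"]

# Prune back to the subtree with the lowest cross-validated error
pruned <- prune(fit, cp = best_cp)

# Predict class labels for the held-out rows
pred <- predict(pruned, test, type = "class")
mean(pred == test$Species)   # share of correct predictions
```

Calling printcp(fit) prints the same table, which can help when choosing a cp value by hand instead of taking the minimum.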
6. Commands for the Random Forest model
1. Train Model command
rf_fit <- randomForest(price ~ ., training_data)
2. Test Model command
rf_pred <- predict(rf_fit, testing_data)
3. Plotting command
plot(rf_fit, log = "y")
7. Example of measuring performance
The performance of the Decision Tree and Random Forest models is measured by
1. Precision
2. Recall
3. F-Measure
4. Accuracy
# Store the target field in a vector; it will be plugged into the performance formulas shortly
testing_price <- testing_data$price
# Compute TP, TN, FP, FN for predicting a high rice price (high)
t_positive <- length(tree_pred[tree_pred == 'high' & tree_pred == testing_price])
t_negative <- length(tree_pred[tree_pred != 'high' & tree_pred == testing_price])
f_positive <- length(tree_pred[tree_pred == 'high' & tree_pred != testing_price])
f_negative <- length(tree_pred[tree_pred != 'high' & tree_pred != testing_price])
# Store TP, TN, FP, FN in a data.frame
conf_table <- data.frame(TP = t_positive, TN = t_negative, FP = f_positive, FN = f_negative)
------- At this point we have TP, TN, FP, and FN, ready to plug into the performance formulas -------
# Compute Precision
precision <- conf_table[["TP"]] / (conf_table[["TP"]] + conf_table[["FP"]]) * 100
# Compute Recall
recall <- conf_table[["TP"]] / (conf_table[["TP"]] + conf_table[["FN"]]) * 100
# Compute F-measure
f_measure <- (2 * precision * recall) / (precision + recall)
# Compute Accuracy
accuracy <- (conf_table[["TP"]] + conf_table[["TN"]]) / (conf_table[["TP"]] + conf_table[["FP"]] + conf_table[["TN"]] +
conf_table[["FN"]]) * 100
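As a cross-check, the same four counts can be read off a cross-tabulation built with table(). The sketch below assumes a two-class problem ("high" and "low"), and the prediction and truth vectors are made up for illustration. They are not results from the price data.

```r
# Hypothetical predictions and true labels for a two-class problem
actual    <- factor(c("high", "high", "low", "low",  "high", "low", "low", "high", "low"))
predicted <- factor(c("high", "low",  "low", "high", "high", "low", "low", "high", "high"))

# Rows are predictions, columns are the true labels
cm <- table(predicted = predicted, actual = actual)

tp <- cm["high", "high"]   # predicted high, actually high
fp <- cm["high", "low"]    # predicted high, actually low
fn <- cm["low",  "high"]   # predicted low, actually high
tn <- cm["low",  "low"]    # predicted low, actually low

# Same formulas as above, expressed as percentages
precision <- tp / (tp + fp) * 100
recall    <- tp / (tp + fn) * 100
f_measure <- (2 * precision * recall) / (precision + recall)
accuracy  <- (tp + tn) / sum(cm) * 100

round(c(precision = precision, recall = recall,
        f_measure = f_measure, accuracy = accuracy), 1)
```

Printing cm itself shows the full confusion matrix, which makes it easier to spot which kind of error dominates.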
8. Statistical plots
plot(table(raw_data$usd), col = rgb(150, 100, 400, 500, maxColorValue = 500),
xlab = "USD (Days)", ylab = "", main = "USD Frequency")
MARCH 26, 2016 PURINKO
https://purinko.wordpress.com/2016/03/26/คําสังภาษา-r-เบืองต ้น-decision-tree-random-fo/ 2/2
