
Untitled

Matias Donoso

11/4/2019
Problem #8: In the lab, a classification tree was applied to the Carseats data set after
converting Sales into a qualitative response variable. Now we will seek to predict Sales
using regression trees and related approaches, treating the response as a quantitative
variable.
library(ISLR)
library(tree)
library(randomForest)

## randomForest 4.6-14

## Type rfNews() to see new features/changes/bug fixes.

set.seed(1)
train = sample(1:nrow(Carseats), nrow(Carseats) / 2)
Carseats.train = Carseats[train, ]
Carseats.test = Carseats[-train, ]

(d) Use the bagging approach in order to analyze this data. What test MSE do you obtain?
Use the importance() function to determine which variables are most important.
# bagging = a random forest with mtry equal to all 10 predictors
bag.carseats = randomForest(Sales ~ ., data = Carseats.train, mtry = 10,
                            ntree = 500, importance = TRUE)
yhat.bag = predict(bag.carseats, newdata = Carseats.test)
mean((yhat.bag - Carseats.test$Sales)^2)

## [1] 2.623527

importance(bag.carseats)

##                %IncMSE IncNodePurity
## CompPrice   24.6476262     168.57576
## Income       5.3355144      90.64911
## Advertising 11.7975748     100.81807
## Population  -2.4347577      58.94510
## Price       55.1387362     500.32765
## ShelveLoc   46.3295341     376.78200
## Age         17.9245890     162.68705
## Education    0.8706811      43.04402
## Urban        0.6850649       8.77639
## US           4.3005762      17.96535
The test MSE obtained in this case is 2.62, so bagging reduces the test MSE relative to a single regression tree. The importance() function shows that Price and ShelveLoc are the most important variables (followed by CompPrice, Age, and Advertising).
(e) Use random forests to analyze this data. What test MSE do you obtain? Use the
importance() function to determine which variables are most important. Describe the
effect of m, the number of variables considered at each split, on the error rate
obtained.
# random forest with m = 3 variables considered at each split
rf.carseats = randomForest(Sales ~ ., data = Carseats.train, mtry = 3,
                           ntree = 500, importance = TRUE)
yhat.rf = predict(rf.carseats, newdata = Carseats.test)
mean((yhat.rf - Carseats.test$Sales)^2)

## [1] 3.001375

importance(rf.carseats)

##                %IncMSE IncNodePurity
## CompPrice   15.0614295     155.38762
## Income       2.8372504     125.35841
## Advertising  8.5912531     108.52715
## Population  -2.2524534     101.40095
## Price       37.9562323     398.55509
## ShelveLoc   36.9148773     289.37326
## Age         11.0749075     173.58313
## Education    0.9820296      70.20401
## Urban        0.7161522      15.45718
## US           6.1094256      33.75805

The test MSE is 3.00, which is higher than the one obtained with bagging (2.62). Running the importance() function, we see that the same two variables are still the most important ones (Price and ShelveLoc). As for the effect of m: lowering m from 10 (bagging) to 3 increases the test MSE here, so decorrelating the trees does not pay off for this data set.
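
One way to examine the effect of m across its full range is to refit the forest for each value of mtry; a minimal sketch (exact numbers will vary with the seed):

# test MSE as a function of m, the number of variables tried at each split
m.mse = rep(NA, 10)
for (m in 1:10) {
  rf.fit = randomForest(Sales ~ ., data = Carseats.train, mtry = m, ntree = 500)
  m.mse[m] = mean((predict(rf.fit, newdata = Carseats.test) -
                   Carseats.test$Sales)^2)
}
plot(1:10, m.mse, type = "b", xlab = "m (mtry)", ylab = "Test MSE")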
Problem #10: We now use boosting to predict Salary in the Hitters data set.
(a) Remove the observations for whom the salary information is unknown, and then log-
transform the salaries.
Hitters = na.omit(Hitters)
Hitters$Salary = log(Hitters$Salary)

(b) Create a training set consisting of the first 200 observations, and a test set consisting
of the remaining observations.
train = 1:200
Hitters.train = Hitters[train, ]
Hitters.test = Hitters[-train, ]

(c) Perform boosting on the training set with 1,000 trees for a range of values of the
shrinkage parameter λ. Produce a plot with different shrinkage values on the x-axis
and the corresponding training set MSE on the y-axis.
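A minimal sketch of one way to do this, assuming the gbm package is installed (the shrinkage grid below is an arbitrary choice, and exact values will vary with the seed):

library(gbm)

set.seed(1)
lambdas = 10^seq(-4, 0, by = 0.1)   # grid of shrinkage values (arbitrary choice)
train.mse = rep(NA, length(lambdas))
for (i in seq_along(lambdas)) {
  boost.hitters = gbm(Salary ~ ., data = Hitters.train, distribution = "gaussian",
                      n.trees = 1000, shrinkage = lambdas[i])
  pred.train = predict(boost.hitters, Hitters.train, n.trees = 1000)
  train.mse[i] = mean((pred.train - Hitters.train$Salary)^2)
}
plot(lambdas, train.mse, type = "b", xlab = "Shrinkage (lambda)",
     ylab = "Training MSE")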
(d) Produce a plot with different shrinkage values on the x-axis and the corresponding
test set MSE on the y-axis.
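The same loop can be reused for the test set; the sketch below refits the models over the lambdas grid from part (c) and records the test MSE instead:

set.seed(1)
test.mse = rep(NA, length(lambdas))
for (i in seq_along(lambdas)) {
  boost.hitters = gbm(Salary ~ ., data = Hitters.train, distribution = "gaussian",
                      n.trees = 1000, shrinkage = lambdas[i])
  pred.test = predict(boost.hitters, Hitters.test, n.trees = 1000)
  test.mse[i] = mean((pred.test - Hitters.test$Salary)^2)
}
plot(lambdas, test.mse, type = "b", xlab = "Shrinkage (lambda)",
     ylab = "Test MSE")
min(test.mse)   # best test MSE over the grid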

(e) Compare the test MSE of boosting to the test MSE that results from applying two of the
regression approaches seen in Chapters 3 and 6.
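One possible comparison: least squares (Chapter 3) and the lasso (Chapter 6). The lasso sketch assumes the glmnet package is installed:

# least squares regression (Chapter 3)
lm.fit = lm(Salary ~ ., data = Hitters.train)
mean((predict(lm.fit, Hitters.test) - Hitters.test$Salary)^2)

# the lasso (Chapter 6), with lambda chosen by cross-validation
library(glmnet)
x.train = model.matrix(Salary ~ ., Hitters.train)[, -1]
x.test = model.matrix(Salary ~ ., Hitters.test)[, -1]
set.seed(1)
lasso.fit = cv.glmnet(x.train, Hitters.train$Salary, alpha = 1)
lasso.pred = predict(lasso.fit, s = "lambda.min", newx = x.test)
mean((lasso.pred - Hitters.test$Salary)^2)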

(f) Which variables appear to be the most important predictors in the boosted model?
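A sketch, reusing lambdas and test.mse from the loops above to refit the boosted model at the shrinkage value that minimized the test MSE in (d), then inspecting the relative influence of each predictor:

boost.best = gbm(Salary ~ ., data = Hitters.train, distribution = "gaussian",
                 n.trees = 1000, shrinkage = lambdas[which.min(test.mse)])
summary(boost.best)   # relative influence table and bar plot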

(g) Now apply bagging to the training set. What is the test set MSE for this approach?
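A sketch of bagging on this training set, again using randomForest with mtry set to the full number of predictors (19 for Hitters, once Salary is excluded):

set.seed(1)
# bagging = random forest with mtry equal to the number of predictors
bag.hitters = randomForest(Salary ~ ., data = Hitters.train,
                           mtry = ncol(Hitters) - 1, ntree = 500)
yhat.bag = predict(bag.hitters, newdata = Hitters.test)
mean((yhat.bag - Hitters.test$Salary)^2)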
