
Problem 1

Consider two curves, g1 and g2, where g^(m) represents the mth derivative of g.

g2 will have the smaller training RSS because it is a higher-order polynomial and, with its extra degree of freedom, can follow the training data more closely.

g1 will likely have the smaller test RSS because g2, with its extra degree of freedom, is more likely to overfit the data.

When λ = 0, the penalty term vanishes, and because the loss function is the same for g1 and g2, they will have the same training and test RSS.
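The λ = 0 case can be written out as a short derivation. This is a sketch that assumes the usual penalized least-squares form for both fits; the derivative orders m_1 and m_2 are placeholders, since the problem's defining equations are not reproduced above.

```latex
\hat{g}_j \;=\; \arg\min_{g} \left( \sum_{i=1}^{n} \bigl(y_i - g(x_i)\bigr)^2
  \;+\; \lambda \int \bigl[g^{(m_j)}(x)\bigr]^2 \, dx \right), \qquad j = 1, 2.

\lambda = 0 \;\Longrightarrow\;
\hat{g}_1 \;=\; \hat{g}_2 \;=\; \arg\min_{g} \sum_{i=1}^{n} \bigl(y_i - g(x_i)\bigr)^2 .
```

With the penalty gone, the two criteria are identical, so the training and test RSS must coincide.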

Problem 2

Suppose that we carry out backward stepwise, forward stepwise, and best subset all on the same data set. Each

approach will yield a sequence of models with k = 0 up through k = p predictors.

a. Which approach with k predictors will have the smallest test residual sum of squares? Explain.

While best subset selection is able to filter through all potential models, it is less likely to find a model that fits the test data, because searching so many models raises the risk of overfitting, especially as p increases. Forward and backward stepwise selection evaluate fewer models, making them less likely to overfit the data. However, if p > n, forward stepwise selection is the only one of the three that remains viable and can provide an accurate test RSS.

b. Which approach with k predictors will have the smallest training residual sum of squares? Explain.

Best subset selection will have the smallest training RSS. With a larger search space, it is more likely to find a model that looks good on the training data, because it filters through all 2^p possible models, unlike forward and backward stepwise selection, which each filter through only 1+p(p+1)/2 models.
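To make the difference in search-space size concrete, here is a small sketch in base R. The function names are illustrative only and do not come from the homework.

```r
# Number of candidate models each procedure evaluates for p predictors.
best_subset_models <- function(p) 2^p
stepwise_models <- function(p) 1 + p * (p + 1) / 2

p <- c(5, 10, 20)
data.frame(p = p,
           best_subset = best_subset_models(p),
           stepwise = stepwise_models(p))
```

Already at p = 20, best subset considers over a million models while each stepwise procedure considers 211.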

c. True or False: i. The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k+1)-variable model identified by backward stepwise selection.

False. Predictors chosen by forward stepwise are not necessarily the same ones chosen by backward stepwise, because neither method evaluates all possible models and the two can follow different paths.

ii. The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k+1)-variable model identified by forward stepwise selection.

False. Predictors chosen by backward stepwise are not necessarily the same ones chosen by forward stepwise, because neither method evaluates all possible models and the two can follow different paths.

iii. The predictors in the k-variable model identified by best subset are a subset of the predictors in the (k+1)-variable model identified by best subset selection.

le://localhost/Users/alexnutkiewicz/Desktop/STATS%20216/Homework__3.html 1/26

3/6/2017 Homework__3.html

False. The best k-variable and (k+1)-variable models are each chosen independently from all subsets of that size, so the k-variable model need not be a subset of the (k+1)-variable model.

iv. The predictors in the k-variable model identified by backward stepwise are a subset of the predictors in the (k+1)-variable model identified by backward stepwise selection.

True. The k-variable model contains every feature of the (k+1)-variable model except the one whose removal costs the least in RSS.

v. The predictors in the k-variable model identified by forward stepwise are a subset of the predictors in the (k+1)-variable model identified by forward stepwise selection.

True. The (k+1)-variable model contains all k chosen features, plus the single best additional feature.
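The nesting in (v) follows directly from how forward stepwise is built. The sketch below is a minimal base-R implementation on simulated data (the data, variable names, and helper functions are all assumptions for illustration) that records the selected set at every step and checks that each set is contained in the next.

```r
set.seed(1)
n <- 100; p <- 6
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- X[, 1] - 2 * X[, 3] + 0.5 * X[, 5] + rnorm(n)

# Training RSS of the least-squares fit on a given set of columns.
rss_of <- function(vars) {
  if (length(vars) == 0) return(sum((y - mean(y))^2))
  sum(resid(lm(y ~ X[, vars, drop = FALSE]))^2)
}

# Greedy forward stepwise: at each step add the predictor that lowers RSS most.
forward_path <- function() {
  chosen <- integer(0)
  path <- vector("list", p)
  for (k in 1:p) {
    candidates <- setdiff(1:p, chosen)
    rss <- sapply(candidates, function(j) rss_of(c(chosen, j)))
    chosen <- c(chosen, candidates[which.min(rss)])
    path[[k]] <- chosen
  }
  path
}

path <- forward_path()
# TRUE for every k: the k-variable set is inside the (k+1)-variable set.
nested <- sapply(1:(p - 1), function(k) all(path[[k]] %in% path[[k + 1]]))
nested
```

Best subset selection has no such guarantee, because each model size is searched from scratch.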

Problem 3

This question uses the variables dis (the weighted mean of distances to five Boston employment centers) and nox (nitrogen oxides concentration in parts per 10 million) from the Boston data (in the MASS library). We will treat dis as the predictor and nox as the response.

a. Use the poly() function to fit a cubic polynomial regression to predict nox using dis. Report the regression output, and plot the resulting data and polynomial fits.

require(MASS)

library(splines)

attach(Boston)

# fit a cubic polynomial regression of nox on dis

polyfit = lm(nox ~ poly(dis, 3), data = Boston)

summary(polyfit)


##

## Call:

## lm(formula = nox ~ poly(dis, 3), data = Boston)

##

## Residuals:

## Min 1Q Median 3Q Max

## -0.121130 -0.040619 -0.009738 0.023385 0.194904

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 0.554695 0.002759 201.021 < 2e-16 ***

## poly(dis, 3)1 -2.003096 0.062071 -32.271 < 2e-16 ***

## poly(dis, 3)2 0.856330 0.062071 13.796 < 2e-16 ***

## poly(dis, 3)3 -0.318049 0.062071 -5.124 4.27e-07 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## Residual standard error: 0.06207 on 502 degrees of freedom

## Multiple R-squared: 0.7148, Adjusted R-squared: 0.7131

## F-statistic: 419.3 on 3 and 502 DF, p-value: < 2.2e-16

dislims = range(dis)

# create a fine grid of x-axis points at which we want to predict

dis.grid = seq(from=dislims[1], to=dislims[2], by=0.1)

# predict values (with standard errors) at each grid point

polypreds = predict(polyfit, newdata = list(dis=dis.grid), se=TRUE)

require(ggplot2)

# fitted cubic in raw powers of x = dis (fragment of the plotting call):

# 0.9341281 - 0.1820817*x + 0.0219277*x^2 - 0.0008850*x^3


Based on the results above, the linear (1), quadratic (2), and cubic (3) coefficients are each statistically significant.

b. Plot the polynomial fits for a range of different polynomial degrees (say, from 1 to 10), and report the associated residual sum of squares.

# store the training RSS for each polynomial degree

rss = rep(NA, 10)

for(i in 1:10){

  polyfit = lm(nox ~ poly(dis, i), data=Boston)

  rss[i] = sum(polyfit$residuals^2)

}

plot(1:10, rss, xlab = "Degree", ylab = "RSS", type = "b")


rss[10]

## [1] 1.832171

Based on the plot, the training RSS decreases as the polynomial degree increases, as expected, since each higher-degree fit contains the lower-degree fits as special cases. Therefore, we see a minimum RSS of 1.832171 at degree 10.

c. Perform cross-validation or another approach to select the optimal degree for the polynomial, and explain

your results.

require(boot)

set.seed(36)

cv.error = rep(0,10)

for (i in 1:10){

  polyfit = glm(nox ~ poly(dis, i), data=Boston)

  # delta[2] is the bias-adjusted 10-fold CV estimate of the test MSE

  cv.error[i] = cv.glm(Boston, polyfit, K=10)$delta[2]

}

plot(1:10, cv.error, xlab = "Degree", ylab = "Test MSE", type = "b", col="darkgreen")


Looking at our test MSE curve, we see the traditional U-shape, bottoming out at the 3rd degree. As we increase the degree of the polynomial, the model likely starts overfitting our data, which is why we see peaks in test MSE at the 7th and 9th degrees. I used 10-fold cross-validation (K=10) rather than LOOCV to reduce computation time.
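For readers who want to see what K-fold cross-validation does step by step, here is a hand-rolled sketch on the same Boston data. The helper `kfold_mse` is a hypothetical name, the folds are drawn differently from `cv.glm`, and the result is a plain average over folds rather than the bias-adjusted `delta[2]`, so the numbers need not match the plot above.

```r
library(MASS)  # provides the Boston data

# Manual K-fold cross-validation of a degree-d polynomial fit of nox on dis.
kfold_mse <- function(d, K = 10, seed = 36) {
  set.seed(seed)
  # Randomly assign each row to one of K roughly equal folds.
  fold <- sample(rep(1:K, length.out = nrow(Boston)))
  fold_mse <- sapply(1:K, function(k) {
    train <- Boston[fold != k, ]
    test  <- Boston[fold == k, ]
    fit <- lm(nox ~ poly(dis, d), data = train)
    mean((test$nox - predict(fit, newdata = test))^2)
  })
  mean(fold_mse)
}

cv_by_degree <- sapply(1:10, kfold_mse)
which.min(cv_by_degree)  # degree with the lowest estimated test MSE
```

Because every degree reuses the same seed, all ten fits are compared on identical folds.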

d. Use the bs() function to fit a regression spline to predict nox using dis. Report the output for the fit using four degrees of freedom. How did you choose the knots? Plot the resulting fit.

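library(splines)

# Note: degree=3 is passed to lm() here rather than to bs(), which triggers the

# warning below; bs() defaults to degree 3, so the fitted cubic spline is unaffected.

bs.fit = lm(nox ~ bs(dis, knots = c(6)), degree=3, data=Boston)

## extra argument 'degree' will be disregarded

summary(bs.fit)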


##

## Call:

## lm(formula = nox ~ bs(dis, knots = c(6)), data = Boston, degree = 3)

##

## Residuals:

## Min 1Q Median 3Q Max

## -0.12387 -0.04012 -0.01033 0.02308 0.19446

##

## Coefficients:

## Estimate Std. Error t value Pr(>|t|)

## (Intercept) 0.76037 0.01018 74.667 < 2e-16 ***

## bs(dis, knots = c(6))1 -0.23672 0.02321 -10.200 < 2e-16 ***

## bs(dis, knots = c(6))2 -0.36177 0.02548 -14.200 < 2e-16 ***

## bs(dis, knots = c(6))3 -0.33337 0.04044 -8.244 1.47e-15 ***

## bs(dis, knots = c(6))4 -0.36220 0.05105 -7.095 4.45e-12 ***

## ---

## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##

## Residual standard error: 0.06208 on 501 degrees of freedom

## Multiple R-squared: 0.7152, Adjusted R-squared: 0.7129

## F-statistic: 314.6 on 4 and 501 DF, p-value: < 2.2e-16

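# predictions (with standard errors) from the spline fit on the grid

bs.pred = predict(bs.fit, newdata = list(dis=dis.grid), se=TRUE)

# plot the data, overlay the fitted spline, and mark the knot

plot(dis, nox, col="darkgrey")

lines(dis.grid, bs.pred$fit, col="darkblue", lwd=2)

abline(v=c(6), lty=2, col="darkblue")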


The goal in choosing knots is for all terms to be significant. So, to select the knot location effectively, we want a value that gives each term importance, judged by the coefficients and t-statistics. I chose degree 3 (because of our CV error results), resulting in the above fitted curve.
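As an alternative to picking the knot by hand, `bs()` can choose it from the requested degrees of freedom. The sketch below is not the approach used above: it asks for `df = 4` with the default cubic degree, which places a single interior knot at the median of dis.

```r
library(MASS)     # Boston data
library(splines)  # bs()

# With df = 4 and the default cubic degree, bs() places one interior knot.
basis <- bs(Boston$dis, df = 4)
attr(basis, "knots")

fit_df4 <- lm(nox ~ bs(dis, df = 4), data = Boston)
length(coef(fit_df4))  # intercept + 4 basis coefficients
```

Both this fit and the hand-picked knot at 6 use four degrees of freedom, so their summaries have the same number of coefficients and can be compared directly.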

e. Now fit a regression spline for a range of degrees of freedom, and plot the resulting fits and report the resulting RSS. Describe the results obtained.

# vary the spline degree i with one knot at 6 (so each fit has i + 1 degrees of freedom)

for(i in 1:10){

  splineFit = lm(nox ~ bs(dis, knots=c(6), degree=i), data=Boston)

  rss[i] = sum(splineFit$residuals^2)

  plot(dis,nox,col="darkgrey")

  lines(dis.grid,predict(splineFit,list(dis=dis.grid)),col="darkblue",lwd=2)

  abline(v=c(6),lty=2,col="darkgreen")

}


Referring to the above plots, we see the fitted curves get smoother and then bumpier as a result of the model's increased flexibility. As mentioned in lecture, splines become more unstable at the tails of our graphs, especially as the degrees of freedom increase.

f. Perform cross-validation or another approach in order to select the best degrees of freedom for a

regression spline. Describe your results.

set.seed(36)

reg.cv.error = rep(NA,10)

for (i in 1:10){

regPolyfit = glm(nox ~ bs(dis, knots= c(6), degree = i), data=Boston)

reg.cv.error[i] = cv.glm(Boston, K=10, regPolyfit)$delta[2]

}

## 10.7103: some 'x' values beyond boundary knots may cause ill-conditioned

## bases


## 12.1265: some 'x' values beyond boundary knots may cause ill-conditioned

## bases


plot(1:10, reg.cv.error, xlab = "Degree", ylab = "Test MSE", type = "b", col="darkred")

reg.cv.error[1]

## [1] 0.004429047


Looking at our test MSE plot, we see the test MSE slowly increases as the degree increases, so we can conclude that degree 1 is the best option to fit our data. The minimum test MSE, at degree 1, is 0.004429047.
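The loop above varies the polynomial degree of the spline with one fixed knot. A variant that varies the degrees of freedom directly, letting `bs()` place the knots at quantiles of dis, might look like the sketch below. The grid `3:10` is an assumption, and the boundary-knot warnings seen above are suppressed.

```r
library(MASS)     # Boston data
library(splines)  # bs()
library(boot)     # cv.glm()

set.seed(36)
df_grid <- 3:10
# 10-fold CV error for a cubic regression spline with d degrees of freedom.
df_cv_error <- suppressWarnings(sapply(df_grid, function(d) {
  fit <- glm(nox ~ bs(dis, df = d), data = Boston)
  cv.glm(Boston, fit, K = 10)$delta[2]
}))
df_grid[which.min(df_cv_error)]  # df with the lowest estimated test MSE
```

The knots are recomputed inside each training fold, which is why some held-out dis values fall beyond the boundary knots and trigger the warnings.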

Problem 4

This problem works with the body dataset, which you can download from the homework folder on the class website. The goal of this problem is to perform and compare Principal Components Regression and Partial Least Squares on the problem of trying to predict someone's weight.

a. Read the body dataset into R using the load() function. This dataset contains: X - A dataframe containing 21 different types of measurements on the human body and Y - A dataframe that contains the age, weight (kg), height (cm), and the gender of each person in the sample. Let's say we forgot how the gender is coded in this dataset. Using a simple visualization, explain how you can tell which gender is which.

load("/Users/alexnutkiewicz/Downloads/body.rdata")

genderCode = as.factor(Y$Gender)

par(mfrow = c(3,1))

plot(Y$Weight, Y$Gender, col = "darkblue")

plot(X$Bicep.Girth, Y$Gender, col = "darkgreen")

plot(X$Forearm.Girth, Y$Gender, col = "darkgrey")

The above plots show weight, bicep girth, and forearm girth versus gender, which lets us intuitively figure out whether 0 or 1 is male. We can assume men are likely to be heavier and to have larger girth measurements than women. Since the observations coded 1 tend to have the larger values, we can assume that 1 is coded as male.


b. Reserve 200 observations from your dataset to act as a test set and use the remaining 307 as a training set. On the training set, use both pcr and plsr to fit models to predict a person's weight based on the variables in X. Use the options scale = TRUE and validation = "CV". Why does it make sense to scale our variables in this case?

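set.seed(36)

testing = sort(sample(1:nrow(X), 200))

training = (1:nrow(X))[-testing]

library(pls)

##

## Attaching package: 'pls'

## The following object is masked from 'package:stats':

##

## loadings

pcrFit = pcr(Y$Weight ~., data=X, subset=training, scale=TRUE, validation="CV")

plsrFit = plsr(Y$Weight ~., data=X, subset=training, scale=TRUE, validation="CV")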

We scale our variables to improve the stability of our analysis. It is much easier to compare predictors when they are on the same scale (e.g., comparing mm to mm rather than cm to mm), and without scaling the components would be dominated by whichever measurements have the largest variance. Additionally, we cross-validate within the PCR/PLSR model fitting, which cross-validates our choice of model on the 307 training observations and helps prevent us from overfitting the data.
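The effect of scaling can be seen on a toy example. The sketch below uses two simulated measurements (hypothetical names, not columns of the body data) that carry the same signal but are recorded on very different numeric scales, and compares the first principal component with and without scaling.

```r
set.seed(1)
n <- 100
shared <- rnorm(n)
# Two simulated measurements of the same underlying quantity.
measurements <- cbind(
  girth_large_units = 100 * (shared + rnorm(n)),  # large numeric scale
  girth_small_units = shared + rnorm(n)           # small numeric scale
)

unscaled <- prcomp(measurements, scale. = FALSE)
scaled   <- prcomp(measurements, scale. = TRUE)

# Absolute loadings of the first principal component.
round(abs(unscaled$rotation[, 1]), 3)  # almost all weight on the large-scale column
round(abs(scaled$rotation[, 1]), 3)    # equal weight on both columns
```

Without scaling, the first component is essentially just the large-scale variable; with scaling, both variables contribute equally.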

c. Run summary() on each of the objects calculated above, and compare the training % variance explained

from the pcr output to the plsr output. Do you notice any consistent patterns (in comparing the two)? Is that

pattern surprising? Explain why or why not.

summary(pcrFit)


## Y dimension: 307 1

## Fit method: svdpc

## Number of components considered: 21

##

## VALIDATION: RMSEP

## Cross-validated using 10 random segments.

## (Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps

## CV 12.95 3.370 3.193 2.976 2.940 2.936 2.936

## adjCV 12.95 3.369 3.191 2.974 2.937 2.933 2.932

## 7 comps 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## CV 2.915 2.895 2.892 2.896 2.913 2.925 2.924

## adjCV 2.911 2.888 2.886 2.888 2.906 2.916 2.916

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## CV 2.943 2.909 2.861 2.850 2.821 2.832

## adjCV 2.934 2.895 2.850 2.815 2.809 2.820

## 20 comps 21 comps

## CV 2.842 2.843

## adjCV 2.829 2.831

##

## TRAINING: % variance explained

## 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps 7 comps

## X 62.49 74.20 79.27 83.69 86.21 88.29 89.98

## Y$Weight 93.23 93.97 94.81 94.99 95.07 95.09 95.15

## 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## X 91.41 92.66 93.81 94.92 95.87 96.75

## Y$Weight 95.26 95.31 95.35 95.35 95.39 95.42

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## X 97.53 98.06 98.53 98.93 99.32 99.59

## Y$Weight 95.42 95.66 95.76 95.92 95.92 95.93

## 20 comps 21 comps

## X 99.82 100.00

## Y$Weight 95.93 95.94

summary(plsrFit)


## Y dimension: 307 1

## Fit method: kernelpls

## Number of components considered: 21

##

## VALIDATION: RMSEP

## Cross-validated using 10 random segments.

## (Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps

## CV 12.95 3.273 2.956 2.859 2.843 2.811 2.807

## adjCV 12.95 3.272 2.954 2.855 2.832 2.802 2.796

## 7 comps 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## CV 2.802 2.80 2.801 2.804 2.805 2.804 2.805

## adjCV 2.792 2.79 2.791 2.793 2.794 2.793 2.794

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## CV 2.805 2.804 2.804 2.804 2.804 2.804

## adjCV 2.794 2.794 2.794 2.794 2.794 2.794

## 20 comps 21 comps

## CV 2.804 2.804

## adjCV 2.794 2.794

##

## TRAINING: % variance explained

## 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps 7 comps

## X 62.48 72.47 78.75 80.70 83.45 86.13 87.99

## Y$Weight 93.67 94.99 95.43 95.77 95.87 95.92 95.94

## 8 comps 9 comps 10 comps 11 comps 12 comps 13 comps

## X 89.31 90.48 91.65 92.79 93.58 94.61

## Y$Weight 95.94 95.94 95.94 95.94 95.94 95.94

## 14 comps 15 comps 16 comps 17 comps 18 comps 19 comps

## X 95.37 96.13 96.81 97.47 98.07 98.81

## Y$Weight 95.94 95.94 95.94 95.94 95.94 95.94

## 20 comps 21 comps

## X 99.61 100.00

## Y$Weight 95.94 95.94

Both models explain a similar percentage of the training variance. At any given number of components, PCR explains at least as much of the variance in X (e.g., 74.20% vs. 72.47% with two components), while PLSR explains at least as much of the variance in weight (e.g., 94.99% vs. 93.97% with two components). Although the two methods are quite different (PLSR is a supervised learning method, PCR is unsupervised), this is not a surprising result. Both are used to model a response when the number of predictors p is large, especially if the predictors are highly correlated. PCR creates linear combinations of the original set of predictors without consideration for the response variable. PLSR, however, does consider the response variable, which is why it typically needs fewer components. Despite their differences, both approaches create linear combinations of our original set of predictors, so their similarity in results is unsurprising.
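The contrast between the two methods shows up already in the first component. The sketch below uses simulated data in base R (not the body data or the pls package) and takes the first PCR direction as the leading principal component and the first PLS direction as proportional to X'y, which is the standard single-response PLS weight. The first PCR direction captures at least as much X variance by construction; the comparison on y is typical but is not asserted here.

```r
set.seed(1)
n <- 200; p <- 5
# Correlated, standardized predictors (simulated; not the body data).
X <- scale(matrix(rnorm(n * p), n, p) %*% matrix(runif(p * p), p, p))
y <- as.numeric(X %*% c(1, -1, 0.5, 0, 0) + rnorm(n))
y <- y - mean(y)

unit <- function(v) as.numeric(v) / sqrt(sum(v^2))
w_pcr <- unit(prcomp(X)$rotation[, 1])  # direction of maximum X variance
w_pls <- unit(crossprod(X, y))          # direction of maximum covariance with y

x_share <- function(w) sum((X %*% w)^2) / sum(X^2)    # share of X variance captured
y_r2    <- function(w) cor(as.numeric(X %*% w), y)^2  # share of y variance explained

rbind(PCR = c(X = x_share(w_pcr), y = y_r2(w_pcr)),
      PLS = c(X = x_share(w_pls), y = y_r2(w_pls)))
```

This mirrors the pattern in the two summaries: PCR leads on the X rows, PLSR on the Y$Weight rows.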

d. For each of the models, pick a number of components that you would use to predict future values of weight

from X. Please include any further analysis you use to decide on the number of components.

validationplot(pcrFit, val.type = "RMSEP", type = "b", main = "PCR Fit")

validationplot(plsrFit, val.type = "RMSEP", type = "b", main = "PLSR Fit")


Using validationplot from the pls library, we find that most of the drop in RMSEP occurs after adding just one component, so we'll move forward in our analysis of the body dataset with one component.

e. Practically speaking, it might be nice if we could guess a person's weight without measuring 21 different quantities. Do either of the methods performed above allow us to do that? If not, pick another method that will, and fit it on the training data.

Neither method does, because every PCR or PLSR component is a linear combination of all 21 measurements. If we want to reduce the number of predictors, i.e., simplify the model via feature selection, the lasso is a good option.

library(ISLR)

library(glmnet)

# grid of lambda values; assumed here (as in the ISLR lab), since its definition is not shown

grid = 10^seq(10, -2, length = 100)

lassoX = scale(model.matrix(Y$Weight ~ ., data = X)[, -1])

lassoY = Y$Weight

lassoFit = glmnet(lassoX[training,], lassoY[training], alpha = 1, lambda = grid)

plot(lassoFit)

set.seed(36)

# choose lambda by cross-validation on the training set only

lassoCV = cv.glmnet(lassoX[training,], lassoY[training], alpha = 1)

bestLambda = lassoCV$lambda.1se

predict(lassoFit, type = "coefficients", s = bestLambda)


## 1

## (Intercept) 69.00221166

## Wrist.Diam 0.04076614

## Wrist.Girth .

## Forearm.Girth 1.35906177

## Elbow.Diam 0.54778799

## Bicep.Girth .

## Shoulder.Girth 1.62958068

## Biacromial.Diam 0.42140884

## Chest.Depth 0.47952509

## Chest.Diam 0.21704888

## Chest.Girth 1.39778183

## Navel.Girth .

## Waist.Girth 3.26613218

## Pelvic.Breadth 0.35693112

## Bitrochanteric.Diam .

## Hip.Girth 1.45081886

## Thigh.Girth 0.82597085

## Knee.Diam 0.32318685

## Knee.Girth 1.13763065

## Calf.Girth 0.84002676

## Ankle.Diam 0.44355954

## Ankle.Girth 0.19631757

bestLambda

## [1] 0.5396431
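The dots in the coefficient table mark predictors the lasso has set exactly to zero. A minimal way to see why this happens is the soft-thresholding rule, which is the lasso solution in the idealized case of orthonormal predictors under the objective (1/2)·RSS + λ·Σ|β_j|. That setting is an assumption for illustration; it is not how glmnet fits the body data, and `soft_threshold` is a hypothetical helper.

```r
# Soft-thresholding: shrink each least-squares coefficient toward zero by
# lambda, and set it exactly to zero if its magnitude is below lambda.
soft_threshold <- function(beta_ols, lambda) {
  sign(beta_ols) * pmax(abs(beta_ols) - lambda, 0)
}

# Hypothetical least-squares coefficients.
soft_threshold(c(3, -0.2, 0.5, -1.5), lambda = 0.5)
# values: 2.5, 0, 0, -1
```

Small coefficients are removed entirely while large ones are only shrunk, which is what makes the lasso a feature-selection method.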

f. Compare all 3 methods in terms of performance on the test set. Keep in mind that you should only run one

version of each model on the test set. Any necessary selection of parameters should be done only with the

training set.

# pcrPredict and plsrPredict hold the test-set predictions from pcrFit and plsrFit,

# e.g. predict(pcrFit, X[testing, ], ncomp = ...) with the chosen number of components

mean((pcrPredict - lassoY[testing])^2)

## [1] 8.562787

mean((plsrPredict - lassoY[testing])^2)

## [1] 7.952771

lassoPredict = predict(lassoFit, s = bestLambda, newx = lassoX[testing,])

mean((lassoPredict - lassoY[testing])^2)

## [1] 8.141433

These results show that PLSR has the lowest test MSE (7.95), followed by the lasso with λ chosen by the one-standard-error rule (8.14) and then PCR (8.56).
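For reference, the one-standard-error rule behind `lambda.1se` can be sketched in a few lines of base R: take the largest λ whose cross-validated error is within one standard error of the minimum. The helper name and the numbers below are hypothetical and are not taken from the fit above.

```r
# One-standard-error rule: the largest lambda whose CV error is within
# one standard error of the minimum CV error.
one_se_choice <- function(lambda, cv_mean, cv_se) {
  i_min <- which.min(cv_mean)
  threshold <- cv_mean[i_min] + cv_se[i_min]
  max(lambda[cv_mean <= threshold])
}

# Hypothetical CV results for four candidate lambdas.
lambda  <- c(1, 0.5, 0.1, 0.01)
cv_mean <- c(10, 8.3, 8.0, 8.1)
cv_se   <- c(0.5, 0.4, 0.4, 0.4)
one_se_choice(lambda, cv_mean, cv_se)
# value: 0.5
```

The minimum error is at λ = 0.1, but λ = 0.5 is within one standard error of it and gives a sparser model, so the rule prefers it.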
