You are on page 1of 15

a) λ=∞,m=0

b)
λ=∞,m=1:
c) λ=∞, m=2:
d)
λ=∞, m=3:
e)
λ=0, m=3:
3. Suppose we fit a curve with basis func>ons b1(X) = X, b2(X) = (X − 1) ^2 * I (X >= 1). (Note that
I (X >= 1) equal 1 for X (1 and 0 otherwise.) We fit the linear regression model Y = beta0 +
beta1*b1(X) + beta2*b2(X) + error term", and obtain coefficient es>mates ˆ Beta 0 = 1, ˆ Beta1 =
1, ˆBeta 2 = −2. Sketch the es>mated curve between X = −2 and X = 2. Note the intercepts,
slopes, and other relevant informa>on.
9. This ques>on uses the variables dis (the weighted mean of distances to five Boston
employment centers) and nox (nitrogen oxides concentra>on in parts per 10 million) from the
Boston data. We will treat dis as the predictor and nox as the response.
(a) Use the poly () func>on to fit a cubic polynomial regression to predict nox using dis.
Report the regression output and plot the resul>ng data and polynomial fits.
(b) Plot the polynomial fits for a range of different polynomial degrees (say, from 1 to 10),
and report the associated residual sum of squares.
(c) Perform cross-valida>on or another approach to select the op>mal degree for the
polynomial and explain your results.

While all lines generally follow the same trend, indicating a decrease in nitrogen
oxide concentration with increased distance from employment centers, there is
slight divergence at the extremes.
(d) Use the bs () func>on to fit a regression spline to predict nox using dis. Report the
output for the fit using four degrees of freedom. How did you choose the knots? Plot the
resul>ng fit.
(e) Now fit a regression spline for a range of degrees of freedom and plot the resul>ng fits
and report the resul>ng RSS. Describe the results obtained.

It includes polynomial regression lines for degrees of freedom from 3 to 10, each
represented by a different color and labeled in the legend. The lines demonstrate
how the fitted model's complexity increases with the degrees of freedom, aiming
to capture the trend in the data, with the potential trade-off of overfitting at
higher degrees. The plot's title "Fits for Different Degrees of Freedom" indicates
that the graph is used to evaluate the fit of various polynomial degrees to the
observed data points.
(f) Perform cross-valida>on or another approach to select the best degrees of freedom for
a regression spline on this data Describe your results.

As the degrees of freedom increase from 3 to around 5, there is a sharp decrease


in cross-validation error because the model's predictive performance is
improving. After reaching a minimum at around 5 degrees of freedom, the cross-
validation error starts to fluctuate slightly as the degrees of freedom further
increase, but overall remains fairly stable without a clear increasing or decreasing
trend.

You might also like