1)
$$\ell(\pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln p(x^{(n)}, z^{(n)} \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \left[ \ln p(x^{(n)} \mid z^{(n)}; \mu, \Sigma) + \ln p(z^{(n)} \mid \pi) \right]$$
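We would get:
$$\mu_k = \frac{\sum_{n=1}^{N} \mathbb{1}[z^{(n)} = k]\, x^{(n)}}{\sum_{n=1}^{N} \mathbb{1}[z^{(n)} = k]}, \qquad \Sigma_k = \frac{\sum_{n=1}^{N} \mathbb{1}[z^{(n)} = k]\, (x^{(n)} - \mu_k)(x^{(n)} - \mu_k)^T}{\sum_{n=1}^{N} \mathbb{1}[z^{(n)} = k]}$$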
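As a minimal sketch of these closed-form estimates in R, the snippet below computes the per-class mean and covariance on simulated labelled data. The data, the variable names `x` and `z`, and the helper `gmm_mle_labelled` are illustrative assumptions, not part of the original answer. The covariance divides by the class count (the maximum-likelihood form), whereas R's `cov()` divides by the count minus one.

```r
# Hypothetical helper: ML estimates for a Gaussian mixture when the labels z are observed.
gmm_mle_labelled <- function(x, z) {
  lapply(split(as.data.frame(x), z), function(xk) {
    xk <- as.matrix(xk)
    n_k <- nrow(xk)                       # number of points with z = k
    mu_k <- colMeans(xk)                  # class mean
    centred <- sweep(xk, 2, mu_k)         # subtract the class mean from each row
    sigma_k <- crossprod(centred) / n_k   # ML covariance: divide by n_k, not n_k - 1
    list(mu = mu_k, sigma = sigma_k)
  })
}

# Simulated two-class data (assumption, for illustration only)
set.seed(1)
z <- rep(1:2, each = 50)
x <- cbind(rnorm(100, mean = 3 * z), rnorm(100, mean = -z))
est <- gmm_mle_labelled(x, z)
```

Each element of `est` holds the estimates for one class, for example `est[["1"]]$mu` and `est[["1"]]$sigma`.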
$q(y)$ for $y \in \{1 \dots k\}$ and $q_j(x \mid y)$ for $j \in \{1 \dots d\}$, $y \in \{1 \dots k\}$, $x \in \{-1, +1\}$ that maximize
$$L(\theta) = \sum_{i=1}^{n} \log q(y^{(i)}) + \sum_{i=1}^{n} \sum_{j=1}^{d} \log q_j(x_j^{(i)} \mid y^{(i)})$$
subject to the following constraints:
1. $q(y) \ge 0$ for all $y \in \{1 \dots k\}$, and $\sum_{y=1}^{k} q(y) = 1$.
2. For all $y, j, x$, $q_j(x \mid y) \ge 0$. For all $y \in \{1 \dots k\}$ and all $j \in \{1 \dots d\}$, $\sum_{x \in \{-1,+1\}} q_j(x \mid y) = 1$.
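For reference, this constrained problem has a well-known closed-form maximizer, obtained by attaching one Lagrange multiplier to each normalization constraint: every parameter is a relative frequency. The count notation below is introduced here for illustration and does not appear in the original, and the formula for $q_j$ assumes every label $y$ occurs at least once in the data.

```latex
q(y) = \frac{\mathrm{count}(y)}{n},
\qquad
q_j(x \mid y) = \frac{\mathrm{count}_j(x \mid y)}{\mathrm{count}(y)},
\qquad\text{where}\quad
\mathrm{count}(y) = \sum_{i=1}^{n} \mathbb{1}[y^{(i)} = y],
\quad
\mathrm{count}_j(x \mid y) = \sum_{i=1}^{n} \mathbb{1}[y^{(i)} = y,\ x_j^{(i)} = x].
```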
PROBLEM 2 ANSWERS.
1a) $f_1(x) = a_1 + b_1 x + c_1 x^2 + d_1 x^3$
1b) $f_2(x) = a_2 + b_2 x + c_2 x^2 + d_2 x^3$
such that
2) Show that $f_1(\xi) = f_2(\xi)$. That is, $f(x)$ is continuous at $\xi$.
We have
$$f_1(\xi) = \beta_0 + \beta_1 \xi + \beta_2 \xi^2 + \beta_3 \xi^3$$
and
$$f_2(\xi) = (\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3 \xi^2 \beta_4)\xi + (\beta_2 - 3 \beta_4 \xi)\xi^2 + (\beta_3 + \beta_4)\xi^3 = \beta_0 + \beta_1 \xi + \beta_2 \xi^2 + \beta_3 \xi^3.$$
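The coefficients of $f_2$ used in this check are consistent with a truncated-power cubic spline. Assuming (this form is not stated in the text above) that $f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)_+^3$, expanding the last term for $x > \xi$ and collecting powers of $x$ gives:

```latex
\begin{aligned}
\beta_4 (x - \xi)^3 &= \beta_4 x^3 - 3\beta_4 \xi x^2 + 3\beta_4 \xi^2 x - \beta_4 \xi^3, \\
f_2(x) &= (\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3\xi^2 \beta_4)\,x + (\beta_2 - 3\beta_4 \xi)\,x^2 + (\beta_3 + \beta_4)\,x^3 .
\end{aligned}
```

For $x \le \xi$ the truncated term vanishes under the same assumption, so $f_1$ simply has coefficients $\beta_0, \beta_1, \beta_2, \beta_3$.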
For the first derivatives, we have
$$f_1'(\xi) = \beta_1 + 2\beta_2 \xi + 3\beta_3 \xi^2$$
and
$$f_2'(\xi) = \beta_1 + 3\xi^2 \beta_4 + 2(\beta_2 - 3\beta_4 \xi)\xi + 3(\beta_3 + \beta_4)\xi^2 = \beta_1 + 2\beta_2 \xi + 3\beta_3 \xi^2.$$
e. Show $f_1''(\xi) = f_2''(\xi)$. That is, $f''(x)$ is continuous at $\xi$. Therefore, $f(x)$ is indeed a cubic spline.
$$f_1''(\xi) = 2\beta_2 + 6\beta_3 \xi$$
and
$$f_2''(\xi) = 2(\beta_2 - 3\beta_4 \xi) + 6(\beta_3 + \beta_4)\xi = 2\beta_2 + 6\beta_3 \xi.$$
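The three continuity results can also be checked numerically. The sketch below plugs arbitrary values into the coefficient expressions from the derivation and compares the two cubic pieces at the knot. The numbers in `beta` and `xi` and the helper `cubic_deriv` are assumptions chosen only for illustration.

```r
# Arbitrary (assumed) coefficients beta_0 ... beta_4 and knot xi.
beta <- c(0.5, -1.2, 2.0, 0.7, -3.1)
xi <- 1.3

# Coefficients (constant, x, x^2, x^3) of the two cubic pieces.
f1_coef <- beta[1:4]
f2_coef <- c(beta[1] - beta[5] * xi^3,
             beta[2] + 3 * xi^2 * beta[5],
             beta[3] - 3 * beta[5] * xi,
             beta[4] + beta[5])

# Hypothetical helper: m-th derivative (m = 0, 1, 2, 3) of a cubic at x.
cubic_deriv <- function(cf, x, m = 0) {
  powers <- 0:3
  keep <- powers >= m   # lower powers vanish after m derivatives
  # d^m/dx^m x^p = p! / (p - m)! * x^(p - m)
  mult <- factorial(powers[keep]) / factorial(powers[keep] - m)
  sum(cf[keep] * mult * x^(powers[keep] - m))
}

# Difference f1 - f2 at the knot for the value and the first three derivatives.
gaps <- sapply(0:3, function(m) cubic_deriv(f1_coef, xi, m) - cubic_deriv(f2_coef, xi, m))
print(gaps)
```

The first three entries of `gaps` vanish up to rounding, while the third derivative differs by $-6\beta_4$. That jump is the extra flexibility the knot contributes.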
PROBLEM 5 ANSWERS.
1)
set.seed(1)
# Packages: Boston data (MASS), pipes and plotting (tidyverse, ggthemes),
# model summaries as tables (broom, knitr)
library(MASS); library(tidyverse); library(ggplot2); library(ggthemes)
library(broom); library(knitr); library(caret)
theme_set(theme_tufte(base_size = 14) + theme(legend.position = 'top'))
data('Boston')
# Cubic polynomial regression of nox on dis
model <- lm(nox ~ poly(dis, 3), data = Boston)
tidy(model) %>%
  kable(digits = 3)
# Overlay the fitted curve on the observed data
Boston %>%
  mutate(pred = predict(model, Boston)) %>%
  ggplot() +
  geom_point(aes(dis, nox, col = '1')) +
  geom_line(aes(dis, pred, col = '2'), size = 1.5) +
  scale_color_manual(name = 'Value Type',
                     labels = c('Observed', 'Predicted'),
                     values = c('#56B4E9', '#E69F00'))
The model finds the coefficient on each power of dis to be statistically significant. On the plot,
the fitted curve tracks the data well without obvious signs of overfitting.
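One way to read the significance claim directly from the fit is to pull the p-values out of the coefficient table. This is a minimal sketch using base R's `summary()` rather than `tidy()`. The 0.05 threshold and the names `fit`, `coef_table`, and `significant` are assumptions made for illustration.

```r
library(MASS)  # provides the Boston data set

# Same cubic polynomial model as in the answer above
fit <- lm(nox ~ poly(dis, 3), data = Boston)

# Coefficient table: estimate, standard error, t value, p-value
coef_table <- coef(summary(fit))
p_values <- coef_table[, "Pr(>|t|)"]

# Flag terms below the (assumed) 0.05 threshold
significant <- p_values < 0.05
print(round(coef_table, 4))
print(significant)
```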
2) Plot the polynomial fits for a range of different polynomial degrees (say, from 1 to
10), and report the associated residual sum of squares.
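errors <- list()
models <- list()
pred_df <- tibble(V1 = 1:506)  # placeholder column, overwritten by the degree-1 fit
for (i in 1:9) {
  # Fit a degree-i polynomial and store its in-sample predictions
  models[[i]] <- lm(nox ~ poly(dis, i), data = Boston)
  preds <- predict(models[[i]])
  pred_df[[i]] <- preds
  # In-sample RMSE; the residual sum of squares equals n * RMSE^2
  errors[[i]] <- sqrt(mean((Boston$nox - preds)^2))
}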
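The loop above stores the root mean squared error, while the question asks for the residual sum of squares. The two are related by RSS = n * RMSE^2. A minimal sketch that reports both per degree is given below. The names `degrees`, `rss`, and `rss_table` are illustrative assumptions.

```r
library(MASS)  # Boston data

degrees <- 1:9
# Residual sum of squares of the degree-d polynomial fit
rss <- sapply(degrees, function(d) {
  fit <- lm(nox ~ poly(dis, d), data = Boston)
  sum(resid(fit)^2)
})

n <- nrow(Boston)
rmse <- sqrt(rss / n)  # same quantity the loop above stores in errors
rss_table <- data.frame(degree = degrees, rss = rss, rmse = rmse)
print(rss_table)
```

Because the polynomial models are nested, the in-sample RSS cannot increase with the degree, so it is not by itself a guide to choosing the degree.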
4)
library(splines)
# Cubic B-spline basis in dis with 4 degrees of freedom (one interior knot)
model <- lm(nox ~ bs(dis, df = 4), data = Boston)
kable(tidy(model), digits = 3)
# Overlay the fitted spline on the observed data
Boston %>%
  mutate(pred = predict(model)) %>%
  ggplot() +
  geom_point(aes(dis, nox, col = '1')) +
  geom_line(aes(dis, pred, col = '2'), size = 1.5) +
  scale_color_manual(name = 'Value Type',
                     labels = c('Observed', 'Predicted'),
                     values = c('#56B4E9', '#E69F00')) +
  theme_tufte(base_size = 13)
The model finds the coefficient on every basis function to be statistically significant. The fitted
curve tracks the data well without obvious signs of overfitting.
5)
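errors <- list()
models <- list()
pred_df <- tibble(V1 = 1:506)  # placeholder column, overwritten by the first fit
for (i in 1:9) {
  # Cubic B-spline fit with i degrees of freedom; bs() raises df below 3 to 3 with a warning
  models[[i]] <- lm(nox ~ bs(dis, df = i), data = Boston)
  preds <- predict(models[[i]])
  pred_df[[i]] <- preds
  # In-sample RMSE for this fit
  errors[[i]] <- sqrt(mean((Boston$nox - preds)^2))
}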
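One detail of this loop is worth checking: a cubic `bs()` basis without an intercept needs at least 3 degrees of freedom, so requests for `df = 1` or `df = 2` are raised to 3 with a warning. The sketch below compares the first few fits to show this. The helper `fit_df` and the names `n_coef` and `same_as_df3` are assumptions for illustration.

```r
library(MASS)     # Boston data
library(splines)  # bs()

# Hypothetical helper: spline fit with k degrees of freedom.
# bs() warns when k is below 3, so the warning is silenced for the comparison.
fit_df <- function(k) {
  suppressWarnings(lm(nox ~ bs(dis, df = k), data = Boston))
}

fits <- lapply(1:4, fit_df)

# Number of estimated coefficients (intercept plus basis functions)
n_coef <- sapply(fits, function(f) length(coef(f)))

# Do the fitted values match those of the df = 3 fit?
same_as_df3 <- sapply(fits, function(f) isTRUE(all.equal(fitted(f), fitted(fits[[3]]))))
print(n_coef)
print(same_as_df3)
```

The first three columns of `pred_df` and the first three entries of `errors` in the loop above therefore describe the same model.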
6)
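# Assign each observation to one of 10 folds at random
folds <- sample(1:10, size = 506, replace = TRUE)
errors <- matrix(NA, 10, 9)  # rows: folds, columns: degrees of freedom
models <- list()
for (k in 1:10) {
  for (i in 1:9) {
    # Fit a spline in dis on every fold except k
    models[[i]] <- lm(nox ~ bs(dis, df = i), data = Boston[folds != k,])
    # Predict the held-out fold and record its RMSE
    pred <- predict(models[[i]], Boston[folds == k,])
    errors[k, i] <- sqrt(mean((Boston$nox[folds == k] - pred)^2))
  }
}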
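A natural next step is to average the held-out errors over folds and pick the setting with the smallest mean. The sketch below does this for any folds-by-settings matrix. The helper `summarise_cv` is hypothetical, and the `toy` matrix contains made-up numbers, not results from the Boston data.

```r
# Hypothetical helper: summarise a folds-by-settings matrix of held-out errors.
summarise_cv <- function(errors) {
  mean_err <- colMeans(errors)                           # average error per setting
  se_err <- apply(errors, 2, sd) / sqrt(nrow(errors))    # standard error across folds
  list(mean = mean_err, se = se_err, best = which.min(mean_err))
}

# Toy 3-fold, 4-setting matrix (made-up numbers, for illustration only)
toy <- matrix(c(0.9, 0.6, 0.5, 0.7,
                1.0, 0.7, 0.4, 0.6,
                0.8, 0.5, 0.6, 0.8), nrow = 3, byrow = TRUE)
cv <- summarise_cv(toy)
print(cv$mean)
print(cv$best)
```

Applied to the matrix built in the loop above, `summarise_cv(errors)$best` would give the column index, and hence the degrees of freedom, with the lowest average held-out RMSE.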