Talha Farooq
School of Natural Sciences
MS Statistics
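The lasso estimate is defined as
\[
\hat{\beta} = \underset{\beta \in \mathbb{R}^{p}}{\operatorname{argmin}} \; \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} |\beta_j| \tag{1}
\]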
The tuning parameter λ controls the strength of the penalty: β̂ is the
linear regression estimate when λ = 0, and β̂ = 0 when λ = ∞.
For λ between these two extremes, we are balancing two ideas:
fitting a linear model of y on X, and shrinking the coefficients. But the
nature of the ℓ1 penalty causes some coefficients to be shrunk to exactly
zero.
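As a rough illustration of the two extremes, the sketch below evaluates the objective in (1) on synthetic data. The function name `lasso_objective`, the data, and the value `lam_big` are made up for this example and are not part of the original notes.

```python
import numpy as np

def lasso_objective(beta, X, y, lam):
    """Squared-error loss plus lam times the l1 norm of beta, as in (1)."""
    residual = y - X @ beta
    return residual @ residual + lam * np.sum(np.abs(beta))

# Synthetic data (illustrative only): 50 observations, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, 0.0, -1.0]) + 0.1 * rng.normal(size=50)

# lam = 0: the objective is plain least squares, so the OLS fit minimizes it.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Very large lam: the penalty dominates, so beta = 0 scores better than OLS.
lam_big = 1e6
zero = np.zeros(3)

print(lasso_objective(beta_ols, X, y, 0.0) <= lasso_objective(zero, X, y, 0.0))
print(lasso_objective(zero, X, y, lam_big) < lasso_objective(beta_ols, X, y, lam_big))
```

Comparing only two candidate vectors does not locate the minimizer; it only shows which way the balance tips as λ moves between the extremes.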
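Let L denote the objective in (1). Differentiating L with respect to β and
setting the derivative equal to zero gives the following, where ±λ is the
derivative of the penalty term (+λ for a component with β_j > 0, −λ for
β_j < 0).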
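\[
\begin{aligned}
\frac{\partial L}{\partial \beta}
&= 2\left[\frac{\partial}{\partial \beta}(y - X\beta)\right]^{T}(y - X\beta) \pm \lambda \\
&= -2X^{T}(y - X\beta) \pm \lambda
\end{aligned}
\]
\[
\begin{aligned}
-2X^{T}y + 2X^{T}X\beta \pm \lambda &= 0 \\
-X^{T}y + X^{T}X\beta \pm \frac{\lambda}{2} &= 0
\end{aligned}
\]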
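The gradient expression can be checked numerically. The sketch below compares −2Xᵀ(y − Xβ) + λ·sign(β) with central finite differences at a point where every β_j is nonzero. Reading ±λ as λ·sign(β_j) is an interpretation of the notation, and the function names and random data are hypothetical.

```python
import numpy as np

def lasso_objective(beta, X, y, lam):
    """Squared-error loss plus lam times the l1 norm of beta."""
    residual = y - X @ beta
    return residual @ residual + lam * np.sum(np.abs(beta))

def lasso_gradient(beta, X, y, lam):
    """-2 X^T (y - X beta) + lam * sign(beta); valid only if every beta_j != 0."""
    return -2.0 * X.T @ (y - X @ beta) + lam * np.sign(beta)

# Random test problem (illustrative only).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
y = rng.normal(size=20)
beta = np.array([0.5, -1.2, 2.0, -0.3])  # all components nonzero
lam = 0.7

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(beta)
for j in range(beta.size):
    step = np.zeros_like(beta)
    step[j] = eps
    numeric[j] = (lasso_objective(beta + step, X, y, lam)
                  - lasso_objective(beta - step, X, y, lam)) / (2 * eps)

analytic = lasso_gradient(beta, X, y, lam)
max_abs_diff = np.max(np.abs(numeric - analytic))
print(max_abs_diff)
```

The check is meaningful only away from β_j = 0, where |β_j| has a kink and the ordinary derivative does not exist.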
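Solving for β under the assumption XᵀX = I (identity matrix), i.e. the
columns of X are orthonormal:
\[
\hat{\beta} = X^{T}y \mp \frac{\lambda}{2}
\]
The sign flips because the penalty term moves to the other side of the
equation: a component with β̂_j > 0 equals the corresponding entry of Xᵀy
minus λ/2, and a component with β̂_j < 0 equals that entry plus λ/2.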
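The closed form can be tried out on a design matrix with orthonormal columns. The sketch below shrinks each entry of Xᵀy toward zero by λ/2 and sets it to exactly zero when the shrinkage would cross zero. That clipping step is the standard soft-thresholding completion and is an addition here, since the derivation above only covers nonzero coefficients. The function names, the QR-based construction of X, and the data are assumptions made for illustration.

```python
import numpy as np

def lasso_objective(beta, X, y, lam):
    """Squared-error loss plus lam times the l1 norm of beta."""
    residual = y - X @ beta
    return residual @ residual + lam * np.sum(np.abs(beta))

def lasso_orthonormal(X, y, lam):
    """Closed-form lasso fit, assuming X.T @ X is the identity."""
    z = X.T @ y
    # Shrink each entry toward zero by lam/2; entries with |z_j| <= lam/2 become 0.
    return np.sign(z) * np.maximum(np.abs(z) - lam / 2.0, 0.0)

rng = np.random.default_rng(2)
# Reduced QR factorization gives a 30x4 matrix with orthonormal columns.
X, _ = np.linalg.qr(rng.normal(size=(30, 4)))
y = X @ np.array([3.0, 0.1, -2.0, 0.0]) + 0.05 * rng.normal(size=30)
lam = 1.0

beta_hat = lasso_orthonormal(X, y, lam)
print(beta_hat)

# Sanity check: random perturbations of beta_hat should not lower the objective.
best = lasso_objective(beta_hat, X, y, lam)
perturbed = [lasso_objective(beta_hat + 0.1 * rng.normal(size=4), X, y, lam)
             for _ in range(1000)]
print(min(perturbed) >= best)
```

With these inputs the two small true coefficients fall below the λ/2 threshold, which is the "shrunk to exactly zero" behavior described earlier.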