
Li Huiteng Homework 1 April 25, 2023

1 Likelihood and Bayesian Inference


(a)
The log-likelihood is
$$\ell(\beta, \sigma^2) = -\frac{n}{2}\log \sigma^2 - \frac{1}{2\sigma^2}(y - X\beta)^T (y - X\beta) - n \log \sqrt{2\pi},$$
and $(\hat\beta, \hat\sigma^2)$ solves the score equations
$$\frac{\partial \ell}{\partial \beta} = -\frac{1}{\sigma^2} X^T (X\beta - y) = 0,$$
$$\frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2(\sigma^2)^2}(y - X\beta)^T (y - X\beta) = 0.$$
We thus obtain
$$\hat\beta = (X^T X)^{-1} X^T y, \tag{1}$$
$$\hat\sigma^2 = \frac{1}{n}\big(y - X\hat\beta\big)^T \big(y - X\hat\beta\big) = \frac{1}{n}\, y^T \big(I - X(X^T X)^{-1} X^T\big) y. \tag{2}$$
Denoting $\hat y := X\hat\beta$, the expression for $\hat\sigma^2$ simplifies to
$$\hat\sigma^2 = \frac{1}{n}(y - \hat y)^T (y - \hat y) = \frac{1}{n}\sum_{i=1}^n (\hat y_i - y_i)^2. \tag{3}$$
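As a quick numerical sanity check of (1)-(3), here is a minimal numpy sketch on simulated data; the dimensions, coefficients, and noise level below are illustrative assumptions, not part of the assignment.

```python
# Sanity check of equations (1)-(3) on simulated data (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # design with intercept
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.7, size=n)

# Equation (1): beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Equation (3): sigma2_hat = (1/n) * sum (yhat_i - y_i)^2  (MLE, no df correction)
y_hat = X @ beta_hat
sigma2_hat = np.mean((y_hat - y) ** 2)

# Equation (2): the projection form y^T (I - X (X^T X)^{-1} X^T) y / n agrees
H = X @ np.linalg.solve(X.T @ X, X.T)
assert np.isclose(sigma2_hat, y @ (np.eye(n) - H) @ y / n)
```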

(b)

$$\big(\hat\beta_{(0)}, \hat\sigma^2_{(0)}\big) = \arg\max \ell\big(\beta_{(0)}, \sigma^2\big) = \arg\min \left[ \frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\big(y - X\beta_{(0)}\big)^T \big(y - X\beta_{(0)}\big) \right]$$
implies that
$$\hat\beta_{(0)} = \arg\min \big(y - X\beta_{(0)}\big)^T \big(y - X\beta_{(0)}\big) = \arg\min (y - \beta_0 \mathbf{1})^T (y - \beta_0 \mathbf{1}),$$
since the null model contains only the intercept, i.e. $X\beta_{(0)} = \beta_0 \mathbf{1}$. Hence
$$\hat\beta_0 = (\mathbf{1}^T \mathbf{1})^{-1} \mathbf{1}^T y = \frac{1}{n}\sum_{i=1}^n y_i = \bar y, \tag{4}$$
$$\hat\sigma^2_{(0)} = \frac{1}{n}\big(y - X\hat\beta_{(0)}\big)^T \big(y - X\hat\beta_{(0)}\big) = \frac{1}{n}\sum_{i=1}^n (y_i - \bar y)^2. \tag{5}$$

Since
$$\ell\big(\hat\beta, \hat\sigma^2\big) = -\frac{n}{2}\log\big(2\pi\hat\sigma^2\big) - \frac{n}{2},$$
$$\ell\big(\hat\beta_{(0)}, \hat\sigma^2_{(0)}\big) = -\frac{n}{2}\log\big(2\pi\hat\sigma^2_{(0)}\big) - \frac{n}{2},$$
we obtain
$$W = 2\left[\ell\big(\hat\beta, \hat\sigma^2\big) - \ell\big(\hat\beta_{(0)}, \hat\sigma^2_{(0)}\big)\right] = n\log\left(\frac{\hat\sigma^2_{(0)}}{\hat\sigma^2}\right).$$

The F-statistic for regression is the ratio of the Mean Square for the Model (MSM) to the Mean Square for Error (MSE), i.e.,
$$F = \frac{MSM}{MSE} = \frac{\sum_i (\hat y_i - \bar y)^2/(n-1)}{\sum_i (\hat y_i - y_i)^2/(n-p)} = \frac{n-p}{n-1} \cdot \frac{\sum_i (y_i - \bar y)^2 - \sum_i (\hat y_i - y_i)^2}{\sum_i (\hat y_i - y_i)^2} = \frac{n-p}{n-1}\left(\frac{\hat\sigma^2_{(0)}}{\hat\sigma^2} - 1\right),$$
where the second equality uses the ANOVA decomposition $\sum_i (y_i - \bar y)^2 = \sum_i (\hat y_i - \bar y)^2 + \sum_i (\hat y_i - y_i)^2$, valid because the model contains an intercept.

Therefore, W is a monotone increasing function of the F-statistic for regression:
$$W = n\log\left(\frac{n-1}{n-p}\, F + 1\right). \tag{6}$$
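Equation (6) can be verified numerically. The sketch below uses simulated data and assumes the convention above (MSM divided by n-1, MSE by n-p); all names and dimensions are illustrative.

```python
# Check W = n * log((n-1)/(n-p) * F + 1) on simulated data (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([0.5, 1.0, 0.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat, y_bar = X @ beta_hat, y.mean()

sigma2_hat = np.mean((y - y_hat) ** 2)  # full model, equation (3)
sigma2_0 = np.mean((y - y_bar) ** 2)    # intercept-only model, equation (5)

W = n * np.log(sigma2_0 / sigma2_hat)   # likelihood-ratio statistic
F = (np.sum((y_hat - y_bar) ** 2) / (n - 1)) / (np.sum((y - y_hat) ** 2) / (n - p))

assert np.isclose(W, n * np.log((n - 1) / (n - p) * F + 1))  # equation (6)
```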

(c)

$$\begin{aligned}
f(\beta \mid y) \propto f(y \mid \beta)\, f(\beta) &\propto \exp\left(-\frac{1}{2\sigma^2}(y - X\beta)^T (y - X\beta) - \frac{1}{2\tau^2}\beta^T \beta\right) \\
&= \exp\left(-\frac{1}{2\sigma^2}\big(\beta^T A \beta - 2\beta^T X^T y + y^T y\big)\right) \\
&= \exp\left(-\frac{1}{2\sigma^2}\big(\langle \beta, \beta \rangle - 2\langle \beta, A^{-1} X^T y \rangle + y^T y\big)\right) \\
&= C \exp\left(-\frac{1}{2\sigma^2}\langle \beta - \mu, \beta - \mu \rangle\right) \\
&= C \exp\left(-\frac{1}{2\sigma^2}(\beta - \mu)^T A (\beta - \mu)\right),
\end{aligned}$$

where
$$A = X^T X + \lambda I \quad \text{with } \lambda := \sigma^2/\tau^2,$$
$$\langle x_1, x_2 \rangle = x_1^T A x_2, \quad \forall x_1, x_2 \in \mathbb{R}^p,$$
$$\mu = A^{-1} X^T y,$$
and $C$ is independent of $\beta$; completing the square contributes only the constant $\langle \mu, \mu \rangle$, which is absorbed into $C$. The last three steps hold since $A$ is a positive definite matrix and thus induces a well-defined inner product $\langle \cdot, \cdot \rangle$.

Hence the posterior distribution of $\beta$ is normal, with
$$E(\beta \mid y) = \mu = \big(X^T X + \lambda I\big)^{-1} X^T y, \tag{7}$$
$$\operatorname{cov}(\beta \mid y) = A^{-1}\sigma^2 = \big(X^T X + \lambda I\big)^{-1}\sigma^2. \tag{8}$$
As $\tau^2 \to \infty$ (equivalently $\lambda \to 0$), the posterior tends to $N\big((X^T X)^{-1} X^T y,\ (X^T X)^{-1}\sigma^2\big)$, so a flat prior recovers the least-squares estimator as the posterior mean.
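A short sketch of (7)-(8) and the limit above, on simulated data with σ² fixed at 1; the helper `posterior` and every input here are assumptions made only for this check.

```python
# Posterior mean/covariance of equations (7)-(8), and the lambda -> 0 limit.
import numpy as np

rng = np.random.default_rng(2)
n, p, sigma2 = 50, 3, 1.0
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)

def posterior(lam):
    """Posterior of beta | y under the ridge prior, with lam = sigma^2 / tau^2."""
    A = X.T @ X + lam * np.eye(p)
    mu = np.linalg.solve(A, X.T @ y)   # equation (7)
    cov = sigma2 * np.linalg.inv(A)    # equation (8)
    return mu, cov

# As tau^2 -> infinity (lam -> 0), the posterior mean approaches the OLS estimator.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
mu_flat, _ = posterior(1e-10)
assert np.allclose(mu_flat, beta_ols)
```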

(d)

(TBC)

2 Exercise 3.2 of HTF


(TBC)

3 Exercise 3.12 of HTF


Denote
$$\tilde X = \begin{pmatrix} X \\ \sqrt{\lambda}\, I_p \end{pmatrix}, \qquad \tilde y = \begin{pmatrix} y \\ 0_p \end{pmatrix},$$
so that $\tilde X^T \tilde X = X^T X + \lambda I_p$ and $\tilde X^T \tilde y = X^T y$. The least-squares solution on the augmented data is then
$$\hat\beta_{LS} = \big(\tilde X^T \tilde X\big)^{-1} \tilde X^T \tilde y = \big(X^T X + \lambda I_p\big)^{-1} X^T y,$$
which equals the ridge regression estimator $\hat\beta_{ridge}$.
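The same identity is easy to confirm numerically; below is a minimal sketch on simulated data, with the dimensions and λ chosen arbitrarily.

```python
# Exercise 3.12: OLS on the augmented data (X~, y~) reproduces ridge regression.
import numpy as np

rng = np.random.default_rng(3)
n, p, lam = 60, 4, 2.5
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

# Stack sqrt(lam) * I_p under X and a zero vector under y.
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])

beta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)          # least squares on augmented data
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)  # closed-form ridge estimator

assert np.allclose(beta_aug, beta_ridge)
```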

4 Wine Quality Data


5 Classification of Wine Quality Data

6 Bayesian Model for Smoothing

By
$$p(\theta) \propto \exp\left(-\frac{\lambda}{2\sigma^2}\theta^T \Omega \theta\right),$$
$$p(y \mid \theta) \propto \exp\left(-\frac{1}{2\sigma^2}(y - N\theta)^T (y - N\theta)\right),$$

we obtain
$$\begin{aligned}
p(\theta \mid y) &\propto \exp\left(-\frac{\lambda}{2\sigma^2}\theta^T \Omega \theta - \frac{1}{2\sigma^2}(y - N\theta)^T (y - N\theta)\right) \\
&= \exp\left(-\frac{1}{2\sigma^2}\big(\theta^T A \theta - 2\theta^T N^T y + y^T y\big)\right) \\
&= C \exp\left(-\frac{1}{2\sigma^2}(\theta - \mu)^T A (\theta - \mu)\right),
\end{aligned}$$

where
$$A = N^T N + \lambda \Omega, \qquad \mu = A^{-1} N^T y,$$
and $C$ is independent of $\theta$. The last step follows as in Section 1(c).

Therefore, the posterior distribution of $\theta$ given $y$ is again Gaussian, with
$$E(\theta \mid y) = \mu, \qquad \operatorname{cov}(\theta \mid y) = A^{-1}\sigma^2.$$
It follows that $N\theta \mid y \sim N\big(N\mu,\ \sigma^2 N A^{-1} N^T\big)$; that is,
$$E(N\theta \mid y) = N\big(N^T N + \lambda \Omega\big)^{-1} N^T y,$$
$$\operatorname{cov}(N\theta \mid y) = \sigma^2 N\big(N^T N + \lambda \Omega\big)^{-1} N^T.$$
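For concreteness, here is a sketch of these formulas with a toy basis matrix N and penalty Ω; the random N, identity Ω, and the values of λ and σ² are illustrative assumptions, since the assignment's actual basis and penalty are not reproduced here.

```python
# Posterior of the smooth fit N theta | y, with toy N and Omega (illustrative).
import numpy as np

rng = np.random.default_rng(4)
n, k, lam, sigma2 = 40, 8, 1.0, 0.5
N = rng.normal(size=(n, k))   # stand-in for a spline basis matrix
Omega = np.eye(k)             # stand-in for the roughness penalty matrix
y = rng.normal(size=n)

A = N.T @ N + lam * Omega
mu = np.linalg.solve(A, N.T @ y)                  # E(theta | y)

fitted_mean = N @ mu                              # E(N theta | y)
fitted_cov = sigma2 * N @ np.linalg.inv(A) @ N.T  # cov(N theta | y)
```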
