Quality of Fit
[Figure: panels plotting Price (vertical axis) against Size (horizontal axis)]
Overfitting:
• The learned hypothesis may fit the training set very
well ($J(\theta) \approx 0$)
• ...but fails to generalize to new examples
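The two bullets above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `true_price` relation, the eight training sizes, and the fixed alternating ±0.2 perturbation that stands in for noise. A degree-7 polynomial through eight points drives the training cost to essentially zero, yet on new examples its cost is far larger than that of a straight line.

```python
import numpy as np

def cost(coeffs, x, y):
    # Unregularized J = 1/(2n) * sum of squared residuals for a polynomial hypothesis.
    residuals = np.polyval(coeffs, x) - y
    return float(residuals @ residuals / (2 * len(x)))

def true_price(size):
    # Hypothetical underlying relation between size and price.
    return 1.0 + 2.0 * size

# Eight training examples; the alternating +/-0.2 term is a deterministic stand-in for noise.
x_train = np.linspace(0.0, 1.0, 8)
y_train = true_price(x_train) + 0.2 * (-1.0) ** np.arange(8)

# New examples: midpoints between the training sizes, lying on the underlying line.
x_new = (x_train[:-1] + x_train[1:]) / 2
y_new = true_price(x_new)

line = np.polyfit(x_train, y_train, deg=1)    # 2 parameters
wiggly = np.polyfit(x_train, y_train, deg=7)  # 8 parameters: passes through every training point

for name, coeffs in [("degree 1", line), ("degree 7", wiggly)]:
    print(name,
          "| train J =", cost(coeffs, x_train, y_train),
          "| new J =", cost(coeffs, x_new, y_new))
```

The perturbation pattern is deliberately contrived so the result does not depend on a random seed; with random noise the same effect would typically appear, though the exact numbers would vary.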
Regularization
• Linear regression objective function
$$J(\theta) = \frac{1}{2n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{d}\theta_j^2$$
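As a minimal sketch of how this objective could be evaluated, assuming a linear hypothesis $h_\theta(x) = \theta^\top x$ and a design matrix `X` whose first column is the constant feature $x_0 = 1$ (the function name and the tiny data set are hypothetical):

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """J(theta) = 1/(2n) * sum((X @ theta - y)^2) + lam/2 * sum(theta[1:]^2)."""
    n = X.shape[0]
    residuals = X @ theta - y
    data_term = residuals @ residuals / (2 * n)
    penalty = lam / 2 * np.sum(theta[1:] ** 2)  # the sum starts at j = 1, so theta_0 is excluded
    return float(data_term + penalty)

# Tiny hypothetical data set that theta = (0, 2) fits exactly.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])
theta = np.array([0.0, 2.0])

print(regularized_cost(theta, X, y, lam=0.0))  # data term only: 0.0 for a perfect fit
print(regularized_cost(theta, X, y, lam=1.0))  # adds the penalty 1/2 * 2^2 = 2.0
```

Even a hypothesis that fits the data perfectly pays a cost once $\lambda > 0$, which is what pushes the minimizer toward smaller coefficients.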
Understanding Regularization
$$J(\theta) = \frac{1}{2n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{d}\theta_j^2$$
• Note that $\sum_{j=1}^{d}\theta_j^2 = \lVert\theta_{1:d}\rVert_2^2$
– This is the squared magnitude of the feature coefficient vector!
• Gradient update:
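$$\theta_0 \leftarrow \theta_0 - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)$$
$$\theta_j \leftarrow \theta_j - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)x_j^{(i)} - \underbrace{\alpha\lambda\theta_j}_{\text{regularization}}$$
(the quantities multiplied by $\alpha$ are $\frac{\partial}{\partial\theta_0}J(\theta)$ and $\frac{\partial}{\partial\theta_j}J(\theta)$, respectively)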
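The update rules above can be sketched in vectorized form. This is an illustration under stated assumptions, not the course's reference code: the hypothesis is taken to be linear, $h_\theta(x) = \theta^\top x$, the first column of `X` is the constant feature, and the function name, data, step size and $\lambda$ are all hypothetical.

```python
import numpy as np

def gradient_step(theta, X, y, alpha, lam):
    """One simultaneous update of all parameters for regularized linear regression."""
    n = X.shape[0]
    errors = X @ theta - y          # h_theta(x^(i)) - y^(i) for every example
    grad_fit = X.T @ errors / n     # (1/n) * sum_i errors_i * x_j^(i), for every j
    shrink = lam * theta            # lambda * theta_j ...
    shrink[0] = 0.0                 # ... except theta_0, which is not regularized
    return theta - alpha * (grad_fit + shrink)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

theta = np.zeros(2)
for _ in range(5000):
    theta = gradient_step(theta, X, y, alpha=0.1, lam=0.1)
print(theta)
```

Zeroing the first entry of `shrink` is what distinguishes the $\theta_0$ rule from the $\theta_j$ rule; everything else is shared between the two updates.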
Regularized Linear Regression
$$J(\theta) = \frac{1}{2n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{d}\theta_j^2$$
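$$\theta_0 \leftarrow \theta_0 - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)$$
$$\theta_j \leftarrow \theta_j - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)x_j^{(i)} - \alpha\lambda\theta_j$$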
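Grouping the two $\theta_j$ terms in the update for $j \geq 1$ makes the effect of the penalty easier to see. The rearrangement below is a worked step using only the symbols already defined ($\alpha$, $\lambda$, $n$):

```latex
\begin{align*}
\theta_j &\leftarrow \theta_j - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)x_j^{(i)} - \alpha\lambda\theta_j \\
\theta_j &\leftarrow (1 - \alpha\lambda)\,\theta_j - \alpha\,\frac{1}{n}\sum_{i=1}^{n}\left(h_\theta\left(x^{(i)}\right) - y^{(i)}\right)x_j^{(i)}
\end{align*}
```

When $0 < \alpha\lambda < 1$, each step first shrinks $\theta_j$ toward zero by the factor $1 - \alpha\lambda$ and then applies the usual unregularized step. $\theta_0$ is not shrunk.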
Regularized Linear Regression
• To incorporate regularization into the closed-form
solution:
$$\theta = \left(X^\top X + \lambda\begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}\right)^{-1} X^\top y$$
• Can derive this the same way, by solving $\frac{\partial}{\partial\theta}J(\theta) = 0$
• Can prove that for $\lambda > 0$, the inverse exists in the
equation above
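A minimal sketch of the closed-form solution, under the assumption that the first column of `X` is the constant feature $x_0 = 1$. The function name and the data are hypothetical; the data set deliberately repeats a feature column so that $X^\top X$ on its own is singular, which illustrates the invertibility bullet above.

```python
import numpy as np

def regularized_normal_equation(X, y, lam):
    """Solve (X^T X + lam * E) theta = X^T y, where E is the identity with E[0, 0] = 0."""
    E = np.eye(X.shape[1])
    E[0, 0] = 0.0  # leave theta_0 unpenalized
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + lam * E, X.T @ y)

# Hypothetical data with a duplicated feature column, so X^T X alone is singular.
X = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

print(np.linalg.matrix_rank(X.T @ X))  # 2, fewer than the 3 columns
print(regularized_normal_equation(X, y, lam=0.5))
```

One detail worth checking when pairing this with the objective above: because the data term of $J(\theta)$ is scaled by $\frac{1}{2n}$, setting its gradient to zero gives $X^\top X + n\lambda E$ rather than $X^\top X + \lambda E$. The $\lambda$ in the closed form can be read as having absorbed that factor of $n$, so passing `lam = n * lambda` reproduces the minimizer of $J$ exactly. The invertibility claim follows from $v^\top(X^\top X + \lambda E)v = \lVert Xv\rVert^2 + \lambda\sum_{j \geq 1} v_j^2$, which is zero only for $v = 0$ when $\lambda > 0$ and the first column of $X$ is all ones.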