Lecture 6
David Sontag
New York University
Dual: the solution depends on the data only through dot products.
[Figure: separating hyperplane w.x + b = 0, with margin boundaries w.x + b = +1 and w.x + b = −1]
Final solution tends to be sparse:
• αj = 0 for most j
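This sparsity can be checked numerically. Below is a minimal sketch, not the lecture's exact formulation: it solves a simplified, bias-free SVM dual by projected gradient ascent on a toy 2-D dataset (all data, step size, and iteration counts are my own illustrative assumptions). Most entries of α end up at exactly 0; the survivors are the support vectors.

```python
import numpy as np

# Toy data: two well-separated clusters, labels +1 / -1 (illustrative).
X = np.array([[2., 2.], [3., 2.], [2., 3.], [3., 3.],
              [-2., -2.], [-3., -2.], [-2., -3.], [-3., -3.]])
y = np.array([1., 1., 1., 1., -1., -1., -1., -1.])

K = X @ X.T                               # linear kernel matrix
Q = (y[:, None] * y[None, :]) * K         # Q_ij = y_i y_j K_ij
alpha, C, eta = np.zeros(len(y)), 1.0, 0.005

for _ in range(20000):
    grad = 1.0 - Q @ alpha                # gradient of the dual objective
    alpha = np.clip(alpha + eta * grad, 0.0, C)  # project onto the box [0, C]

print("nonzero alphas:", (alpha > 1e-3).sum(), "of", len(y))
```

Points far from the boundary get their α clipped to 0; only the closest point in each cluster keeps a nonzero coefficient.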
For d = 2, define φ(u) = (u1², √2 u1u2, u2²). Then

φ(u).φ(v) = u1²v1² + 2 u1u2 v1v2 + u2²v2²
          = (u1v1 + u2v2)²
          = (u.v)²

For any d (we will skip the proof): φ(u).φ(v) = (u.v)^d
• Cool! Taking a dot product and raising it to a power gives the same
result as mapping into a high-dimensional space and then taking the
dot product there
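This identity is easy to verify numerically for d = 2, using the feature map φ(u) = (u1², √2 u1u2, u2²) from the derivation above:

```python
import numpy as np

# phi(u) = (u1^2, sqrt(2)*u1*u2, u2^2) satisfies phi(u).phi(v) = (u.v)^2
def phi(x):
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

rng = np.random.default_rng(0)
u, v = rng.standard_normal(2), rng.standard_normal(2)

lhs = phi(u) @ phi(v)          # explicit feature map, then dot product
rhs = (u @ v) ** 2             # kernel: dot product, then square
print(lhs, rhs)                # equal up to floating-point error
```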
Quadratic kernel
[Tommi Jaakkola]
Quadratic kernel
[Cynthia Rudin]
Common kernels
• Polynomials of degree exactly d: K(u,v) = (u.v)^d
• Polynomials of degree up to d: K(u,v) = (u.v + 1)^d
• Gaussian kernels: K(u,v) = exp(−‖u − v‖² / 2σ²)
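The three kernels above, written as plain functions (function names and the test vectors are illustrative; σ is the free bandwidth parameter):

```python
import numpy as np

def poly_exact(u, v, d):      # polynomials of degree exactly d
    return (u @ v) ** d

def poly_up_to(u, v, d):      # polynomials of degree up to d
    return (1.0 + u @ v) ** d

def gaussian(u, v, sigma):    # Gaussian (RBF) kernel
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2))

u, v = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(poly_exact(u, v, 3), poly_up_to(u, v, 3), gaussian(u, v, 1.0))
```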
(The exponent of the Gaussian kernel contains ‖u − v‖², the Euclidean distance, squared.)
Q: How would you prove that the “Gaussian kernel” is a valid kernel?
A: Expand the Euclidean norm as follows:
‖u − v‖² = u.u − 2 u.v + v.v
so that exp(−‖u − v‖²/2σ²) = exp(−u.u/2σ²) · exp(u.v/σ²) · exp(−v.v/2σ²). The middle factor is a limit of polynomial kernels (its Taylor series in u.v), and the outer factors only rescale the kernel by f(u)f(v), which preserves validity.
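As an empirical sanity check (not a proof): a valid kernel must give a positive semidefinite Gram matrix on any sample, so the Gaussian kernel matrix should have no negative eigenvalues. A quick sketch, using the norm expansion above to build the matrix (sample size, dimension, and σ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))          # 20 random points in R^3
sigma = 1.0

# Pairwise squared distances via ||u - v||^2 = u.u - 2 u.v + v.v
sq = (X ** 2).sum(axis=1)
D2 = sq[:, None] - 2.0 * X @ X.T + sq[None, :]
K = np.exp(-D2 / (2.0 * sigma ** 2))

eigs = np.linalg.eigvalsh(K)              # symmetric -> real eigenvalues
print("min eigenvalue:", eigs.min())      # >= 0 up to floating-point error
```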
• LIBSVM
• Both of these handle multi-class, weighted SVM for
unbalanced data, etc.
regression
Once the best parameters (e.g., choice of C for the SVM) are found, re-train
using all of the training data
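This protocol can be sketched with any learner. Below, ridge regression stands in for the SVM and its regularization strength λ plays the role of C (the dataset, the λ grid, and the 70/30 split are all illustrative assumptions, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(100)

def fit_ridge(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# 1) Pick the best regularization strength on a held-out split.
Xtr, ytr, Xva, yva = X[:70], y[:70], X[70:], y[70:]
lams = [0.01, 0.1, 1.0, 10.0]
errs = [np.mean((Xva @ fit_ridge(Xtr, ytr, l) - yva) ** 2) for l in lams]
best = lams[int(np.argmin(errs))]

# 2) Re-train on ALL of the training data with the chosen parameter.
w_final = fit_ridge(X, y, best)
```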
LOOCV (Leave-One-Out Cross-Validation)
There are N data points. Repeat learning N times, each time holding out a single point for validation.
k-fold cross-validation: the same idea, but the data are split into k folds and learning is repeated k times.
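A minimal LOOCV loop, using a 1-nearest-neighbour classifier as a stand-in learner (the dataset and the choice of classifier are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
# Two separated clusters: class 0 near (0,0), class 1 near (4,4).
X = np.vstack([rng.standard_normal((15, 2)),
               rng.standard_normal((15, 2)) + 4.0])
y = np.array([0] * 15 + [1] * 15)
N = len(y)

correct = 0
for i in range(N):                        # repeat learning N times
    mask = np.arange(N) != i              # hold out point i
    Xtr, ytr = X[mask], y[mask]
    # "Learning" for 1-NN is just storing the data; now predict point i.
    nearest = np.argmin(np.sum((Xtr - X[i]) ** 2, axis=1))
    correct += (ytr[nearest] == y[i])

print("LOOCV accuracy:", correct / N)
```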