Kernel Methods: Feature Mapping at No Cost
$\Phi: x \to \Phi(x), \quad \mathbb{R}^d \to F$
non-linear mapping to F
1. high-D space
2. infinite-D countable space :
3. function space (Hilbert space)
example: $\Phi(x, y) = (x^2, y^2, \sqrt{2}\,xy)$
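A quick numerical check of why the example map costs nothing extra: assuming the third component carries the factor $\sqrt{2}$, the dot product in feature space equals the squared dot product in the input space. The function name `phi` and the test vectors `a`, `b` are illustrative choices, not part of the slides.

```python
import numpy as np

def phi(v):
    """Example feature map (x, y) -> (x^2, y^2, sqrt(2)*x*y)."""
    x, y = v
    return np.array([x**2, y**2, np.sqrt(2) * x * y])

# Two arbitrary 2-D points (illustrative values).
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

feature_space = phi(a) @ phi(b)   # dot product after mapping to 3-D
kernel_value = (a @ b) ** 2       # same quantity computed in 2-D

print(feature_space, kernel_value)
```

Both numbers agree up to rounding, so the mapped vectors never have to be built explicitly.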
How a Kernel Solves XOR
Find the weight vector to solve the XOR under the following
See the earlier notes: we discussed this as an exercise problem.
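One possible solution can be sketched as follows. The setup here is an assumption, since the exact conditions of the exercise are not reproduced in these notes: inputs and labels are encoded as ±1, the feature map is $(x_1, x_2) \to (x_1^2, x_2^2, \sqrt{2}\,x_1 x_2)$, and the classifier is $\mathrm{sign}(w \cdot \Phi(x))$ with no bias term. Under those assumptions a weight vector that uses only the cross term separates the classes.

```python
import numpy as np

def phi(v):
    """Quadratic feature map (x1, x2) -> (x1^2, x2^2, sqrt(2)*x1*x2)."""
    x1, x2 = v
    return np.array([x1**2, x2**2, np.sqrt(2) * x1 * x2])

# XOR with a +/-1 encoding (assumed): label is +1 when the inputs differ.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
labels = np.array([-1, 1, 1, -1])

# Candidate weight vector: ignore the squares, weight the cross term.
w = np.array([0.0, 0.0, -1.0 / np.sqrt(2)])

scores = np.array([w @ phi(x) for x in X])   # equals -x1*x2 for each point
predictions = np.sign(scores).astype(int)

print(predictions)
```

No linear classifier on the raw inputs can do this, but in the mapped space a single hyperplane is enough.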
QUADRATIC FEATURE MAPPING
Cubic Kernel
Complexity cost
Polynomial Kernel
Computational cost saving
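The saving can be illustrated with a small sketch. It assumes the homogeneous polynomial kernel $k(x, y) = \langle x, y \rangle^d$; the helper names `num_monomials` and `explicit_quadratic_features` and the dimension `n = 100` are illustrative. The explicit degree-2 map below uses all $n^2$ ordered products $x_i x_j$, which is one of several equivalent ways to write it.

```python
import math
import numpy as np

def num_monomials(n, d):
    """Number of distinct degree-d monomials in n variables."""
    return math.comb(n + d - 1, d)

def explicit_quadratic_features(x):
    """All n*n ordered products x_i * x_j (an explicit degree-2 map)."""
    return np.outer(x, x).ravel()

rng = np.random.default_rng(0)
n = 100
x = rng.standard_normal(n)
y = rng.standard_normal(n)

# Explicit route: build n^2 features per point, then take a dot product.
explicit = explicit_quadratic_features(x) @ explicit_quadratic_features(y)

# Kernel route: one n-dimensional dot product, then a square.
kernel = (x @ y) ** 2

print(num_monomials(n, 2), num_monomials(n, 3))
print(explicit, kernel)
```

The feature count grows quickly with the degree, while the kernel evaluation stays at one dot product in the original space.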
Properties of Kernels
The Kernel Matrix of dot products
(contains all the information the algorithm needs)
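A minimal sketch of building the kernel (Gram) matrix, with entry $K_{ij} = k(x_i, x_j)$. The function name `gram_matrix` and the three sample points are illustrative; the linear kernel is used so the result can be compared with $X X^\top$.

```python
import numpy as np

def gram_matrix(X, kernel):
    """Return K with K[i, j] = kernel(X[i], X[j])."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

# Three illustrative 2-D points.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Linear kernel: plain dot product, so K should equal X @ X.T.
K = gram_matrix(X, lambda a, b: a @ b)

print(K)
```

A kernel algorithm only ever reads `K`, never the coordinates of the points themselves.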
Formal Definitions
PSD property (not needed for this course)
Mercer’s condition (not needed for this course)
CAN ANY FUNCTION BE USED AS A KERNEL FUNCTION?
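Not every function qualifies: a valid kernel must give a positive semi-definite Gram matrix on any set of points. The sketch below checks this numerically on one random sample, which can expose a bad candidate but cannot prove a good one. The helper names, the sample size, and the choice of the negative squared distance as a counterexample are illustrative assumptions.

```python
import numpy as np

def gram(X, k):
    """Gram matrix K[i, j] = k(X[i], X[j])."""
    return np.array([[k(a, b) for b in X] for a in X])

def is_psd(K, tol=1e-9):
    """True if the symmetric part of K has no eigenvalue below -tol."""
    eigvals = np.linalg.eigvalsh((K + K.T) / 2)
    return bool(eigvals.min() >= -tol)

def rbf(a, b):
    """Gaussian kernel exp(-||a - b||^2 / c) with c = 2 (assumed value)."""
    return np.exp(-np.sum((a - b) ** 2) / 2.0)

def neg_sq_dist(a, b):
    """Candidate 'kernel' -||a - b||^2; symmetric but not PSD."""
    return -np.sum((a - b) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))

rbf_ok = is_psd(gram(X, rbf))
neg_ok = is_psd(gram(X, neg_sq_dist))

print(rbf_ok, neg_ok)
```

The second candidate has a zero diagonal, so its Gram matrix has zero trace and must contain a negative eigenvalue whenever the points are distinct.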
Good
• Kernel algorithms are typically constrained convex optimization
problems, solved with either spectral methods or convex optimization tools.
• Efficient algorithms do exist in most cases.
• The similarity to linear methods facilitates analysis. There are strong
generalization bounds on test error.
Bad
• You need to choose the appropriate kernel
• Kernel learning is prone to over-fitting
• All information must go through the kernel bottleneck (the Gram matrix).
Modularity
Kernel methods consist of two modules:
some kernels:
$k(x, y) = e^{-\|x - y\|^2 / c}$
$k(x, y) = \langle x, y \rangle^d$
$k(x, y) = \tanh(\langle x, y \rangle)$
$k(x, y) = \dfrac{1}{\|x - y\|^2 + c^2}$
some kernel algorithms:
- support vector machine
- Fisher discriminant analysis
- kernel regression
- kernel PCA
- kernel CCA
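The modularity can be sketched by writing one algorithm that accepts any kernel as an argument. Everything named here is an assumption for illustration: kernel ridge regression stands in for the "kernel regression" entry, `lam` is a small regularization constant, and the toy data (fitting $t = x^2$ on 15 points) is made up.

```python
import numpy as np

# --- Module 1: kernels (parameters c and d are assumed defaults) ---
def rbf_kernel(x, y, c=1.0):
    return np.exp(-np.sum((x - y) ** 2) / c)

def polynomial_kernel(x, y, d=2):
    return (x @ y) ** d

# --- Module 2: an algorithm that only sees kernel values ---
def gram(A, B, k):
    """Matrix of kernel values k(a, b) for a in A, b in B."""
    return np.array([[k(a, b) for b in B] for a in A])

def fit_kernel_ridge(X, t, k, lam=1e-3):
    """Solve (K + lam*I) alpha = t for the dual coefficients."""
    K = gram(X, X, k)
    return np.linalg.solve(K + lam * np.eye(len(X)), t)

def predict(X_train, alpha, k, X_new):
    return gram(X_new, X_train, k) @ alpha

# Toy 1-D regression problem: targets are x^2.
X = np.linspace(-1, 1, 15).reshape(-1, 1)
t = X[:, 0] ** 2
x_new = np.array([[0.5]])

predictions = {}
for name, k in [("rbf", rbf_kernel), ("poly", polynomial_kernel)]:
    alpha = fit_kernel_ridge(X, t, k)
    predictions[name] = float(predict(X, alpha, k, x_new)[0])

print(predictions)
```

Swapping the kernel changes the model without touching the fitting code, and the same kernels could be passed to a different kernel algorithm in the same way.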