Cheat Sheet: Algorithms for Supervised and Unsupervised Learning
k-nearest neighbour

Description: The label of a new point x̂ is classified with the most frequent label t̂ of the k nearest training instances.

Model:

    t̂ = arg max_C Σ_{i : x_i ∈ N_k(x, x̂)} δ(t_i, C)

  where:
  • N_k(x, x̂) ← the k points in x closest to x̂
  • Euclidean distance formula: √( Σ_{i=1}^D (x_i − x̂_i)² )
  • δ(a, b) ← 1 if a = b; 0 o/w

Objective: No optimisation needed.

Training: Use cross-validation to learn the appropriate k; otherwise no training — classification is based on the existing points.

Regularisation: k acts to regularise the classifier: as k → N the boundary becomes smoother.

Complexity: O(NM) space complexity, since all training instances and all their features need to be kept in memory.

Non-linearity: Natively finds non-linear boundaries.

Online learning: To be added.
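The model above can be sketched in a few lines — a minimal k-NN classifier with majority voting over the k nearest points (function and variable names here are illustrative, not from the cheat sheet):

```python
# Minimal k-NN sketch: t_hat = arg max_C sum over the k nearest x_i of delta(t_i, C).
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_classify(X, t, x_new, k=3):
    """Label x_new by majority vote among its k nearest training points."""
    # Sort training instances by Euclidean distance sqrt(sum_d (x_d - x_new_d)^2)
    nearest = sorted(zip(X, t), key=lambda pair: dist(pair[0], x_new))[:k]
    # delta(a, b) = 1 if a == b else 0  ->  count one vote per nearest neighbour
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two clusters
X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
t = ["red", "red", "blue", "blue", "blue"]
print(knn_classify(X, t, (1.0, 0.9), k=3))  # -> blue
```

Note the O(NM) memory cost stated above: the whole training set (X, t) is kept and scanned at query time; there is no training step.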
Perceptron

Description: Directly estimate the linear function y(x) by iteratively updating the weight vector when incorrectly classifying a training instance.

Model: Binary, linear classifier:

    y(x) = sign(wᵀx)

  where:

    sign(x) = +1 if x ≥ 0; −1 if x < 0

  Multiclass perceptron:

    y(x) = arg max_{C_k} wᵀφ(x, C_k)

Objective: Tries to minimise the error function, i.e. the number of incorrectly classified input vectors:

    arg min_w E_P(w) = arg min_w − Σ_{n∈M} wᵀx_n t_n

  where M is the set of misclassified training vectors.

Training: Iterate over each training example x_n, and update the weight vector on misclassification:

    w^{i+1} = w^i − η∇E_P(w) = w^i + η x_n t_n

  where typically η = 1. For the multiclass perceptron:

    w^{i+1} = w^i + φ(x, t) − φ(x, y(x))

Regularisation: The Voted Perceptron: run the perceptron i times and store each iteration's weight vector. Then:

    y(x) = sign( Σ_i c_i × sign(w_iᵀx) )

  where c_i is the number of correctly classified training instances for w_i.

Complexity: O(INML), since each combination of instance, class and features must be calculated (see log-linear).

Non-linearity: Use a kernel K(x, x′), and 1 weight per training instance:

    y(x) = sign( Σ_{n=1}^N w_n t_n K(x, x_n) )

  with the update: w_n^{i+1} = w_n^i + 1.

Online learning: The perceptron is an online algorithm per default.
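The training rule above can be sketched as follows — a binary perceptron with η = 1, updating w ← w + η x_n t_n on each misclassified example (all names are illustrative; the bias is folded in as a constant feature, an assumption not spelled out in the cheat sheet):

```python
# Binary perceptron sketch: on a mistake (t_n * w.x_n <= 0), set w <- w + eta * x_n * t_n.
def sign(v):
    return 1.0 if v >= 0 else -1.0  # sign(x) = +1 if x >= 0; -1 if x < 0

def train_perceptron(X, t, eta=1.0, epochs=10):
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x_n, t_n in zip(X, t):
            # Misclassified iff t_n * w.x_n <= 0
            if t_n * sum(wi * xi for wi, xi in zip(w, x_n)) <= 0:
                w = [wi + eta * xi * t_n for wi, xi in zip(w, x_n)]
    return w

# Linearly separable toy data; first feature is a constant 1.0 (bias term)
X = [(1.0, 2.0, 1.0), (1.0, 3.0, 2.0), (1.0, -1.0, -2.0), (1.0, -2.0, -1.0)]
t = [1.0, 1.0, -1.0, -1.0]
w = train_perceptron(X, t)
print([sign(sum(wi * xi for wi, xi in zip(w, x))) for x in X])  # matches t
```

Because each update uses only the current (x_n, t_n), this loop also works one example at a time, which is why the perceptron is online per default.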
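The kernelised variant keeps one weight per training instance and increments it by 1 on a mistake, exactly the update w_n^{i+1} = w_n^i + 1 above. A sketch under the assumption of an RBF kernel (the cheat sheet only says K(x, x′); the kernel choice and all names here are illustrative):

```python
# Kernel perceptron sketch: y(x) = sign(sum_n w_n t_n K(x, x_n)); on a mistake
# on instance n, increment w_n by 1.
from math import exp

def rbf_kernel(a, b, gamma=1.0):
    return exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def sign(v):
    return 1.0 if v >= 0 else -1.0

def predict(x, X, t, w, K):
    return sign(sum(w[m] * t[m] * K(x, X[m]) for m in range(len(X))))

def train_kernel_perceptron(X, t, K, epochs=10):
    w = [0] * len(X)  # one weight per training instance
    for _ in range(epochs):
        for n, (x_n, t_n) in enumerate(zip(X, t)):
            if predict(x_n, X, t, w, K) != t_n:
                w[n] += 1  # w_n^{i+1} = w_n^i + 1
    return w

# XOR-style data: not linearly separable, but the kernel finds the boundary
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
t = [1.0, 1.0, -1.0, -1.0]
w = train_kernel_perceptron(X, t, rbf_kernel)
print([predict(x, X, t, w, rbf_kernel) for x in X])
```

This is how the perceptron natively linear model is given non-linear boundaries: the decision function is a weighted sum of kernel evaluations against the training instances rather than a single weight vector.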
Created by Emanuel Ferm, HT2011, for semi-procrastinational reasons while studying for a Machine Learning exam. Last updated May 5, 2011.