Note to other teachers and users of these slides: We would be delighted if you found our
material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify
them to fit your own needs. If you make use of a significant portion of these slides in your own
lecture, please include this message, or a link to our web site: http://www.mmds.org
Large-Scale Machine Learning: k-NN, Perceptron
Mining of Massive Datasets
Jure Leskovec, Anand Rajaraman, Jeff Ullman
Stanford University
http://www.mmds.org
New Topic: Machine Learning!
[Course map: Locality-sensitive hashing; Filtering data streams; PageRank, SimRank; Recommender systems; SVM; Dimensionality reduction; Duplicate document detection; Spam detection; Queries on streams; Perceptron, kNN]
Main question:
How to efficiently train (build a model / find model parameters)?
[Figure: k-NN classification example with k = 9 nearest neighbors]
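To make the k-NN idea pictured above concrete, here is a minimal sketch of k-NN classification (function and variable names are illustrative, not from the slides): predict by majority vote over the k nearest training points under Euclidean distance.

```python
import numpy as np

def knn_predict(X_train, y_train, query, k=9):
    """Classify `query` by majority vote among its k nearest training points
    under Euclidean distance. Illustrative sketch; k = 9 matches the figure."""
    dists = np.linalg.norm(X_train - query, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]                   # indices of the k closest points
    return 1 if y_train[nearest].sum() >= 0 else -1   # majority vote over {+1, -1} labels
```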
Kernel Regression
Distance metric: Euclidean
How many neighbors to look at? All of them (!)
Weighting function: w_i = exp(-d(x_i, q)^2 / K_w)
[Figure: weight w_i as a function of distance; the weight is largest at d(x_i, q) = 0, so points near the query q are weighted most]
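A short runnable sketch of the kernel-regression prediction described above (the function name and the default kernel width are illustrative): every training point contributes, weighted by w_i = exp(-d(x_i, q)^2 / K_w). Taking the weighted average of the targets as the prediction is a standard completion, not spelled out on this slide.

```python
import numpy as np

def kernel_regression_predict(X_train, y_train, q, kernel_width=1.0):
    """Predict at query q using ALL training points, each weighted by its
    distance to q: w_i = exp(-d(x_i, q)^2 / K_w). The kernel width value
    and the weighted-average prediction are illustrative assumptions."""
    d2 = np.sum((X_train - q) ** 2, axis=1)   # squared Euclidean distances d(x_i, q)^2
    w = np.exp(-d2 / kernel_width)            # nearby points get larger weights
    return np.dot(w, y_train) / np.sum(w)     # prediction = weighted average of targets
```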
Perceptron Algorithm
Start with w(0) = 0
Pick training examples x(t) one by one (from disk)
Predict class of x(t) using current weights:
y' = sign(w(t) ∙ x(t))
If y' is correct (i.e., y(t) = y'):
No change: w(t+1) = w(t)
If y' is wrong: adjust w(t):
w(t+1) = w(t) + η ∙ y(t) ∙ x(t)
η is the learning rate parameter
x(t) is the t-th training example
y(t) is the true class label of the t-th example ({+1, -1})
[Figure: on a mistake for a positive example (x(t), y(t) = +1), adding η ∙ y(t) ∙ x(t) rotates w(t) to w(t+1), moving the decision boundary w ∙ x = 0 toward classifying x(t) correctly]
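A minimal runnable sketch of the training loop above (the learning-rate value, the number of passes over the data, and the function name are illustrative choices, not from the slides):

```python
import numpy as np

def perceptron_train(examples, dim, eta=1.0, epochs=10):
    """Train a perceptron on a list of (x, y) pairs with y in {+1, -1},
    following the update rule above. eta and epochs are illustrative."""
    w = np.zeros(dim)                          # start with w(0) = 0
    for _ in range(epochs):
        for x, y in examples:                  # pick training examples one by one
            y_pred = 1 if w @ x >= 0 else -1   # y' = sign(w(t) . x(t))
            if y_pred != y:                    # if y' is wrong: adjust w
                w = w + eta * y * x            # w(t+1) = w(t) + eta * y(t) * x(t)
            # if y' is correct: no change, w(t+1) = w(t)
    return w

# Toy usage on two hypothetical, linearly separable points:
data = [(np.array([1.0, 1.0]), +1), (np.array([-1.0, -1.0]), -1)]
w = perceptron_train(data, dim=2)
```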
Perceptron Convergence
Perceptron Convergence Theorem:
If there exists a set of weights consistent with the data (i.e.,
the data is linearly separable), the Perceptron learning
algorithm will converge
How long would it take to converge?
Perceptron Cycling Theorem:
If the training data is not linearly separable, the Perceptron
learning algorithm will eventually repeat the same set of
weights and therefore enter an infinite loop
How to provide more robustness and expressivity?
Overfitting
Mediocre generalization: finds a "barely" separating solution
• Classification rule:
• f(x) = +1 if (w+ − w−) ∙ x ≥ θ
• Update rule:
• If mistake:
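A small sketch of the classification rule above, keeping separate non-negative weight vectors w+ and w− and a threshold θ (names illustrative). The mistake-driven update is not spelled out in these slides, so only the decision rule is shown:

```python
import numpy as np

def classify(x, w_plus, w_minus, theta):
    """Classification rule from the slide: f(x) = +1 if (w+ - w-) . x >= theta,
    otherwise -1. The update performed on a mistake is not shown here."""
    return +1 if (w_plus - w_minus) @ x >= theta else -1
```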