Cornell CS578: KNN

Memory-Based Learning / Instance-Based Learning / K-Nearest Neighbor
Motivating Example

Inductive Assumption
Similar inputs map to similar outputs
– If not true => learning is impossible
– If true => learning reduces to defining “similar”
Not all similarities created equal
– predicting a person’s height may depend on different attributes than predicting their IQ
1-Nearest Neighbor 
works well if no attribute noise, class noise, class overlap
can learn complex functions (sharp class boundaries)
as number of training cases grows large, error rate of 1-NN is at most 2 times the Bayes optimal rate (i.e. if you knew the true probability of each class for each test case)
$$Dist(c_1, c_2) = \sum_{i=1}^{N} \big(attr_i(c_1) - attr_i(c_2)\big)^2$$

$$NearestNeighbor = \mathrm{MIN}_j\big(Dist(c_j, c_{test})\big)$$

$$prediction_{test} = class_j \quad (\text{or } value_j)$$
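A minimal Python sketch of the 1-NN rule above, assuming numeric attributes held in NumPy arrays (the function and variable names are illustrative, not from the course):

```python
import numpy as np

def one_nn_predict(train_X, train_y, x_test):
    """Predict for x_test using its single nearest training case.

    train_X : (num_cases, N) array of attribute values
    train_y : (num_cases,) array of class labels (or numeric values)
    x_test  : (N,) attribute vector of the test case
    """
    # Dist(c_j, c_test) = sum_i (attr_i(c_j) - attr_i(c_test))^2
    dists = np.sum((train_X - x_test) ** 2, axis=1)
    j = np.argmin(dists)      # index of the nearest neighbor
    return train_y[j]         # prediction_test = class_j (or value_j)
```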
 
k-Nearest Neighbor 
Average of k points more reliable when:
– noise in attributes
– noise in class labels
– classes partially overlap
$$Dist(c_1, c_2) = \sum_{i=1}^{N} \big(attr_i(c_1) - attr_i(c_2)\big)^2$$

$$NearestNeighbors = \big\{\, \text{the } k \text{ cases with smallest } Dist(c_i, c_{test}) \,\big\}$$

$$prediction_{test} = \frac{1}{k}\sum_{i=1}^{k} class_i \quad \Big(\text{or } \frac{1}{k}\sum_{i=1}^{k} value_i\Big)$$
[Figure: training cases plotted in a two-dimensional attribute space (attribute_1 vs. attribute_2), with overlapping regions of ‘+’ and ‘o’ classes]
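As a rough sketch of the averaging rule above (again with illustrative names, not course code), a k-NN predictor can be written as follows; for discrete classes, averaging class indicators amounts to a majority vote among the k neighbors:

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, x_test, k=5, classification=True):
    """Predict for x_test from its k nearest training cases.

    Classification: majority class among the k neighbors
    (the slide's average of class indicators).
    Regression: mean of the k neighbors' values.
    """
    dists = np.sum((train_X - x_test) ** 2, axis=1)   # squared Euclidean distances
    nearest = np.argsort(dists)[:k]                   # indices of the k closest cases
    if classification:
        return Counter(train_y[nearest]).most_common(1)[0][0]
    return float(np.mean(train_y[nearest]))
```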
How to choose “k”
Large k:
– less sensitive to noise (particularly class noise)
– better probability estimates for discrete classes
– larger training sets allow larger values of k
Small k:
– captures fine structure of problem space better
– may be necessary with small training sets
Balance must be struck between large and small k 
As training set approaches infinity, and k grows large, kNN becomes Bayes optimal
[Figures from Hastie, Tibshirani & Friedman 2001, pp. 418-419]
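One way to make the balance between large and small k concrete is to sweep a few candidate values and keep the one with the lowest error on a held-out validation set. This is only a sketch that reuses the hypothetical knn_predict above; the candidate values are arbitrary:

```python
import numpy as np

def choose_k(train_X, train_y, val_X, val_y, candidate_ks=(1, 3, 5, 11, 21)):
    """Pick k by classification error on a held-out validation set."""
    best_k, best_err = None, float("inf")
    for k in candidate_ks:
        preds = np.array([knn_predict(train_X, train_y, x, k=k) for x in val_X])
        err = np.mean(preds != val_y)   # fraction of validation cases misclassified
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```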
Cross-Validation
Models usually perform better on training data than on future test cases
1-NN is 100% accurate on training data!
Leave-one-out cross-validation (LOOCV):
– “remove” each case one at a time
– use it as a test case, with the remaining cases as the training set
– average performance over all test cases
LOOCV is impractical with most learning methods, but extremely efficient with MBL!
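Why LOOCV is so cheap for memory-based learning can be seen in a sketch like the one below (assumed, not course code): the pairwise distance matrix is computed once, and leaving each case out is then just a matter of ignoring its own entry:

```python
import numpy as np
from collections import Counter

def loocv_error(train_X, train_y, k=1):
    """Leave-one-out error of kNN: each case is predicted from all the others."""
    train_X = np.asarray(train_X, dtype=float)
    train_y = np.asarray(train_y)

    # Squared Euclidean distances between every pair of training cases, computed once
    diffs = train_X[:, None, :] - train_X[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    np.fill_diagonal(dists, np.inf)     # a case may not count as its own neighbor

    errors = 0
    for i in range(len(train_X)):
        nearest = np.argsort(dists[i])[:k]
        pred = Counter(train_y[nearest]).most_common(1)[0][0]
        errors += int(pred != train_y[i])
    return errors / len(train_X)
```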
Distance-Weighted kNN
tradeoff between small and large k can be difficult
– use a large k, but put more emphasis on nearer neighbors?
$$prediction_{test} = \frac{\sum_{i=1}^{k} w_i \, class_i}{\sum_{i=1}^{k} w_i} \quad \Bigg(\text{or } \frac{\sum_{i=1}^{k} w_i \, value_i}{\sum_{i=1}^{k} w_i}\Bigg)$$

$$w_i = \frac{1}{Dist(c_i, c_{test})}$$
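A sketch of the weighted rule under the same assumptions as the earlier snippets (squared-Euclidean distance, illustrative names); a small epsilon guards against a test case that coincides exactly with a training case:

```python
import numpy as np

def weighted_knn_predict(train_X, train_y, x_test, k=10, eps=1e-12):
    """Distance-weighted kNN: nearer neighbors get weight w_i = 1 / Dist(c_i, c_test)."""
    dists = np.sum((train_X - x_test) ** 2, axis=1)   # squared Euclidean distances
    nearest = np.argsort(dists)[:k]
    w = 1.0 / (dists[nearest] + eps)                  # eps avoids division by zero

    # Weighted vote over the neighbors' classes; for regression use
    # np.sum(w * train_y[nearest]) / np.sum(w) instead.
    votes = {}
    for wi, yi in zip(w, train_y[nearest]):
        votes[yi] = votes.get(yi, 0.0) + wi
    return max(votes, key=votes.get)
```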
