
K-NN Classifier

U.A NULI
K-Nearest Neighbors (K-NN) classifier

The k-nearest neighbors algorithm (k-NN) is a non-parametric method used
for classification and regression. In both cases, the input consists of the k closest
training examples in the feature space. The output depends on whether k-NN is
used for classification or regression:
• In k-NN classification, the output is a class membership. An object is classified
by a majority vote of its neighbors, with the object being assigned to the class
most common among its k nearest neighbors (k is a positive integer, typically
small). If k = 1, then the object is simply assigned to the class of that single
nearest neighbor.
• In k-NN regression, the output is the property value for the object. This value is
the average of the values of its k nearest neighbors.
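The two output rules above can be sketched in a few lines of Python. This is only an illustration: the function names `knn_classification_output` and `knn_regression_output` are hypothetical, and the sketch assumes the k nearest neighbors have already been found.

```python
from collections import Counter
from statistics import mean


def knn_classification_output(neighbor_labels):
    """Classification: return the class most common among the k neighbors."""
    # most_common(1) returns [(label, count)]; ties go to the label seen first.
    return Counter(neighbor_labels).most_common(1)[0][0]


def knn_regression_output(neighbor_values):
    """Regression: return the average of the k neighbors' values."""
    return mean(neighbor_values)


# Hypothetical neighbors, for illustration only.
print(knn_classification_output(["A", "B", "A"]))  # A
print(knn_regression_output([2.0, 4.0, 6.0]))      # 4.0
```

With k = 1 the label list has a single entry, so the classification rule reduces to returning the class of that one neighbor.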
Algorithm:

K nearest neighbours is a simple algorithm that stores all available cases and classifies new cases
based on a similarity measure (e.g., distance functions).

The algorithm can be summarized as:


1. A positive integer k is specified, along with a new sample to classify.
2. The distance from the new sample to every entry in the dataset is measured and
stored.
3. The k entries in the dataset that are closest to the new sample (at minimum distance
from the new sample) are selected.
4. The most common class among these k entries is found (the class of the majority of
the neighbours).
5. This class is assigned to the new sample.
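The five steps can be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes Euclidean distance, numeric feature tuples, and Python 3.8+ (for `math.dist`). The name `knn_classify` and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(training_data, new_sample, k):
    """Classify new_sample from a list of (features, label) pairs."""
    # Step 2: measure and store the distance to every entry in the dataset.
    distances = [(math.dist(features, new_sample), label)
                 for features, label in training_data]
    # Step 3: select the k entries at minimum distance from the new sample.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Steps 4-5: the majority class among those k entries is the result.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy dataset, for illustration only.
toy_data = [
    ((1.0, 1.0), "red"),
    ((1.2, 0.8), "red"),
    ((0.9, 1.3), "red"),
    ((5.0, 5.0), "blue"),
    ((5.5, 4.5), "blue"),
]

print(knn_classify(toy_data, (1.1, 1.0), k=3))  # red
```

Sorting every distance costs O(n log n) per query, which is acceptable for small datasets like the ones in these notes.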
Distance metric for continuous variables
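The worked examples in these notes use the Euclidean distance. For reference, for two points x and y with n continuous features it can be written as below (the symbols x, y and n are notation assumed here, not taken from the notes):

```latex
d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
```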
Distance metric for categorical values

For categorical variables the Hamming distance is used instead: it counts the number of attributes on which two cases differ.
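A minimal sketch of the Hamming distance for two records of categorical attributes. The function name `hamming_distance` and the attribute values are hypothetical, chosen only for illustration.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical categorical records: (gender, marital status, housing).
first = ("male", "single", "renter")
second = ("male", "married", "owner")

print(hamming_distance(first, second))  # 2
```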


We can now use the training set to classify an unknown case (Age=48 and Loan=$142,000) using Euclidean
distance. If K=1 then the nearest neighbour is the last case in the training set with Default=Y.

D = sqrt[(48 - 33)^2 + (142000 - 150000)^2] ≈ 8000.01, so the predicted class is Default=Y
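The distance above can be checked with a short script. This sketch uses only the two points that appear in the calculation. The variable names are assumptions, and `math.dist` requires Python 3.8+.

```python
import math

query = (48, 142_000)      # (Age, Loan) of the unknown case
neighbour = (33, 150_000)  # (Age, Loan) of the training case in the calculation

d = math.dist(query, neighbour)
print(round(d, 2))  # 8000.01
```

Of the squared distance, 15^2 = 225 comes from Age and 8000^2 = 64,000,000 comes from Loan, so the result is almost entirely determined by the loan amount.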


[Figure: Iris setosa, Iris versicolor, Iris virginica]
Test sample (SL = sepal length, SW = sepal width, PL = petal length, PW = petal width):

SL    SW    PL    PW    Flower type
6     2.7   5.1   1.6   Iris-versicolor

Training samples, with their squared distance and distance to the test sample (NN = rank among the five nearest neighbours):

SL    SW    PL    PW    Flower type        Distance^2   Distance   NN
5.1   3.5   1.4   0.2   Iris-setosa        17.1         4.135215
4.9   3     1.4   0.2   Iris-setosa        16.95        4.117038
4.7   3.2   1.3   0.2   Iris-setosa        18.34        4.282523
5.6   3     4.5   1.5   Iris-versicolor    0.62         0.787401   3
5.8   2.7   4.1   1     Iris-versicolor    1.4          1.183216   5
6.2   2.2   4.5   1.5   Iris-versicolor    0.66         0.812404   4
5.6   2.5   3.9   1.1   Iris-versicolor    1.89         1.374773
5.9   3.2   4.8   1.8   Iris-versicolor    0.39         0.6245     2
6.9   3.2   5.7   2.3   Iris-virginica     1.91         1.382027
7.7   2.8   6.7   2     Iris-virginica     5.62         2.370654
6.3   2.7   4.9   1.8   Iris-virginica     0.17         0.412311   1
6.7   3.3   5.7   2.1   Iris-virginica     1.46         1.208305
7.2   3.2   6     1.8   Iris-virginica     2.54         1.593738

$$\text{Distance} = \sqrt{(SL_{test} - SL_{train})^2 + (SW_{test} - SW_{train})^2 + (PL_{test} - PL_{train})^2 + (PW_{test} - PW_{train})^2}$$
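The Iris table above can be reproduced with a short script. This is a sketch under stated assumptions: it uses Euclidean distance via `math.dist` (Python 3.8+), and the names `nearest_neighbours` and `classify` are hypothetical. The data values are copied from the table.

```python
import math
from collections import Counter

test_sample = (6.0, 2.7, 5.1, 1.6)  # (SL, SW, PL, PW) of the test flower

training = [
    ((5.1, 3.5, 1.4, 0.2), "Iris-setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris-setosa"),
    ((4.7, 3.2, 1.3, 0.2), "Iris-setosa"),
    ((5.6, 3.0, 4.5, 1.5), "Iris-versicolor"),
    ((5.8, 2.7, 4.1, 1.0), "Iris-versicolor"),
    ((6.2, 2.2, 4.5, 1.5), "Iris-versicolor"),
    ((5.6, 2.5, 3.9, 1.1), "Iris-versicolor"),
    ((5.9, 3.2, 4.8, 1.8), "Iris-versicolor"),
    ((6.9, 3.2, 5.7, 2.3), "Iris-virginica"),
    ((7.7, 2.8, 6.7, 2.0), "Iris-virginica"),
    ((6.3, 2.7, 4.9, 1.8), "Iris-virginica"),
    ((6.7, 3.3, 5.7, 2.1), "Iris-virginica"),
    ((7.2, 3.2, 6.0, 1.8), "Iris-virginica"),
]


def nearest_neighbours(k):
    """Return the k training rows closest to the test sample."""
    ranked = sorted(training, key=lambda row: math.dist(row[0], test_sample))
    return ranked[:k]


def classify(k):
    """Majority vote among the k nearest training rows."""
    labels = [label for _, label in nearest_neighbours(k)]
    return Counter(labels).most_common(1)[0][0]


for k in (1, 3, 5):
    print(k, classify(k))
# 1 Iris-virginica
# 3 Iris-versicolor
# 5 Iris-versicolor
```

With k = 1 the single nearest neighbour is an Iris-virginica sample, so the test flower is misclassified. With k = 3 or k = 5 the majority vote gives Iris-versicolor, which matches the label of the test sample.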
