
Monday: The K-Nearest Neighbours (KNN)
Agenda
5 min: Overview

55 min: Suggested Readings

1 hr: Exercise

Specific Learning Outcomes


I can understand the difference between parametric and non-parametric learning.
I can outline examples of instance-based learning algorithms.
I can explain the use of distance measures while working with the KNN algorithm.
I can understand and apply the KNN algorithm to solving classification and regression
problems.
I can understand the limitations of the KNN algorithm.
I can evaluate the performance of the KNN algorithm.

Overall Learning Outcome


I can understand and apply supervised learning algorithms such as regression, decision trees,
KNN, SVM, naive Bayes, and random forests to solve business problems.
I can understand the benefits, limitations, and requirements of various supervised learning
algorithms.

Overview
During this session, we will learn about another machine learning algorithm: the K-Nearest
Neighbours (KNN). Before we get into what this algorithm does, we will briefly look at why it is
categorized as non-parametric.

Overall, machine learning algorithms are categorized as either parametric or non-parametric,
according to how their number of parameters behaves. An algorithm is considered parametric if it
has a fixed number of parameters, and non-parametric if it uses a flexible number of parameters.

A parametric algorithm is computationally faster but makes stronger assumptions about the data,
so it may work well when those assumptions turn out to be correct (e.g. that the data is normally
distributed) and, conversely, may perform badly when they are wrong. A common example of a
parametric algorithm is linear regression.

On the other hand, a non-parametric algorithm uses a flexible number of parameters, and the
number of parameters often grows as the algorithm learns from more data. A non-parametric
algorithm is computationally slower but makes fewer assumptions about the data. A common
example of a non-parametric algorithm is K-Nearest Neighbours, the algorithm we will learn
about in this session.
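
To make this contrast concrete, here is a minimal sketch (our own illustration, not part of the lesson) using scikit-learn as the library of choice. A fitted linear regression is summarized by a fixed handful of parameters, while a fitted KNN model keeps every training point:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# Toy one-dimensional regression data: y = 2x + 1.
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Parametric: linear regression compresses the data into a fixed
# set of parameters (one coefficient plus an intercept here),
# no matter how many training points we add.
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)  # ~[2.] 1.0

# Non-parametric: KNN keeps the entire training set, so what it
# "remembers" grows with the data.
knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)
print(knn.n_samples_fit_)  # 10 -- all training points are stored
```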

The K-Nearest Neighbours (KNN) algorithm also belongs to a family of algorithms called instance-
based algorithms, which do not perform explicit generalization; instead, they compare new
problem instances with instances seen in training, which have been stored in memory. In simple
terms, instance-based algorithms look at the nearest neighbours to decide what any queried point
should be.

Now looking at KNN itself: it can be used for both classification and regression problems. It
stores all available cases and classifies new cases by a majority vote of their K nearest
neighbours. A prediction is made for a new data point by searching through the entire training
set for the K most similar instances (the neighbours) and summarizing the output variable of those
K instances. For example, if we take K=3 and want to decide which class a new example belongs to,
we consider the 3 points closest to it (usually by Euclidean distance).
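
To make the voting procedure concrete, here is a minimal from-scratch sketch of KNN classification with Euclidean distance (the function and data are our own illustration, not from the lesson):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_new, k=3):
    """Predict the class of x_new by a majority vote of its k
    nearest training points, using Euclidean distance."""
    # Euclidean distance from x_new to every training point.
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k closest training points.
    nearest = np.argsort(distances)[:k]
    # Majority vote among their labels.
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Tiny 2-D example: two well-separated clusters.
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

print(knn_classify(X_train, y_train, np.array([1.5, 1.5])))  # "A"
print(knn_classify(X_train, y_train, np.array([8.5, 8.5])))  # "B"
```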

In regression, the prediction would instead be the mean of the output variable over those K
neighbours, as sketched below.
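
A minimal sketch of this (again our own illustration), reusing the same distance computation but averaging the neighbours' targets instead of voting:

```python
import numpy as np

def knn_regress(X_train, y_train, x_new, k=3):
    """Predict a value for x_new as the mean target of its k
    nearest training points (Euclidean distance)."""
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()

X_train = np.array([[1.0], [2.0], [3.0], [10.0], [11.0]])
y_train = np.array([1.1, 2.0, 2.9, 10.2, 11.1])

# The 3 nearest neighbours of 2.5 are 2.0, 3.0 and 1.0, so the
# prediction is the mean of their targets: (2.0 + 2.9 + 1.1) / 3.
print(knn_regress(X_train, y_train, np.array([2.5])))  # 2.0
```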

Below are the advantages and disadvantages of using the KNN Algorithm.

Advantages

This type of algorithm is easy to use.
Quick calculation time, since there is no explicit training phase.
It does not make assumptions about the data.

Disadvantages

1. The accuracy of the algorithm depends on the quality of the data.
2. One needs to find an optimal k value (the number of nearest neighbours); see the sketch after
this list.
3. It is poor at classifying data points on a boundary, where they could be classified one way or
another.
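
On disadvantage 2: a common way to find an optimal k is to try several candidate values and keep the one that scores best under cross-validation. Here is a minimal sketch using scikit-learn (our choice of library and dataset, purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate odd k values (odd k avoids tied votes between two
# classes) by 5-fold cross-validated accuracy.
scores = {}
for k in range(1, 16, 2):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

# Keep the k with the best average accuracy.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```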

Let's now go through the following resources and readings.

Suggested Readings
K-nearest neighbors, Clearly Explained. [Link (https://www.youtube.com/watch?v=HVXime0nQeI)]
How the KNN Algorithm works. [Link (https://www.youtube.com/watch?v=UqYde-LULfs)]

Exercise
Python Programming: K-Nearest Neighbours (KNN). [Link (https://colab.research.google.com/drive/1DsvJjfbpq0vMSHttWL85kDvuJX4jQuqL?usp=sharing)]

"Machine learning will automate jobs that most people thought could only be done by people." ~
Dave Waters
