
Monday: The K-Nearest Neighbours (KNN)
Agenda
5 min: Overview

55 min: Suggested Readings

1 hr: Exercise

Specific Learning Outcomes


I can understand the difference between parametric and non-parametric learning.
I can outline examples of instance-based learning algorithms.
I can explain the use of distance measures while working with the KNN algorithm.
I can understand and apply the KNN algorithm to solving classification and regression
problems.
I can understand the limitations of the KNN algorithm.
I can evaluate the performance of the KNN algorithm.

Overall Learning Outcome


I can understand and apply supervised learning algorithms such as regression, decision trees,
KNN, SVM, naive Bayes, and random forests to solve business problems.
I can understand the benefits, limitations, and requirements of various supervised learning
algorithms.

Overview
During this session, we will learn about another machine learning algorithm: the K-Nearest
Neighbours (KNN). Before we get into what this algorithm does, we will briefly look at why it is
categorized as non-parametric.

Overall, machine learning algorithms are categorized as either parametric or non-parametric,
according to how their number of parameters behaves. An algorithm is considered parametric if it
has a fixed number of parameters, and non-parametric if it uses a flexible number of parameters.

A parametric algorithm is computationally faster but makes stronger assumptions about the data,
so it may work well when those assumptions turn out to be correct (e.g. that the data is normally
distributed) and, conversely, may perform badly when they are wrong. A common example of a
parametric algorithm is linear regression.

On the other hand, a non-parametric algorithm uses a flexible number of parameters, and the
number of parameters often grows as the algorithm learns from more data. A non-parametric
algorithm is computationally slower but makes fewer assumptions about the data. A common
example of a non-parametric algorithm is K-Nearest Neighbours, the algorithm we will learn
about in this session.
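
To make this contrast concrete, here is a minimal sketch (our own illustration, not part of the lesson) using scikit-learn as the library of choice. A fitted linear regression is summarized by a fixed handful of parameters, while a fitted KNN model keeps every training point:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# Toy one-dimensional regression data: y = 2x + 1.
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Parametric: linear regression compresses the data into a fixed
# set of parameters (one coefficient plus an intercept here),
# no matter how many training points we add.
lin = LinearRegression().fit(X, y)
print(lin.coef_, lin.intercept_)  # ~[2.] 1.0

# Non-parametric: KNN keeps the entire training set, so what it
# "remembers" grows with the data.
knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)
print(knn.n_samples_fit_)  # 10 -- all training points are stored
```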

The K-Nearest Neighbours (KNN) algorithm also belongs to a family of algorithms called instance-
based algorithms, which do not perform explicit generalization; instead, they compare new
problem instances with instances seen in training, which have been stored in memory. In simple
terms, instance-based algorithms look at the nearest neighbours to decide what any queried point
should be.

Now looking at KNN itself: it can be used for both classification and regression problems. It
stores all available cases and classifies new cases by a majority vote of their K nearest
neighbours. A prediction is made for a new data point by searching through the entire training
set for the K most similar instances (the neighbours) and summarizing the output variable of those
K instances. For example, if we take K=3 and want to decide which class a new example belongs to,
we consider the 3 points closest to it (usually by Euclidean distance).
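
To make the voting procedure concrete, here is a minimal from-scratch sketch of KNN classification with Euclidean distance (the function and data are our own illustration, not from the lesson):

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_new, k=3):
    """Predict the class of x_new by a majority vote of its k
    nearest training points, using Euclidean distance."""
    # Euclidean distance from x_new to every training point.
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k closest training points.
    nearest = np.argsort(distances)[:k]
    # Majority vote among their labels.
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Tiny 2-D example: two well-separated clusters.
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array(["A", "A", "A", "B", "B", "B"])

print(knn_classify(X_train, y_train, np.array([1.5, 1.5])))  # "A"
print(knn_classify(X_train, y_train, np.array([8.5, 8.5])))  # "B"
```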

In regression, the prediction would instead be the mean of the output variable over those K
neighbours, as sketched below.
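
A minimal sketch of this (again our own illustration), reusing the same distance computation but averaging the neighbours' targets instead of voting:

```python
import numpy as np

def knn_regress(X_train, y_train, x_new, k=3):
    """Predict a value for x_new as the mean target of its k
    nearest training points (Euclidean distance)."""
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()

X_train = np.array([[1.0], [2.0], [3.0], [10.0], [11.0]])
y_train = np.array([1.1, 2.0, 2.9, 10.2, 11.1])

# The 3 nearest neighbours of 2.5 are 2.0, 3.0 and 1.0, so the
# prediction is the mean of their targets: (2.0 + 2.9 + 1.1) / 3.
print(knn_regress(X_train, y_train, np.array([2.5])))  # 2.0
```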

Below are the advantages and disadvantages of using the KNN Algorithm.

Advantages

This type of algorithm is easy to use.
Quick calculation time, since there is no explicit training phase.
It does not make assumptions about the data.

Disadvantages

1. The accuracy of the algorithm depends on the quality of the data.
2. One needs to find an optimal k value (the number of nearest neighbours); see the sketch after
this list.
3. It is poor at classifying data points on a boundary, where they could be classified one way or
another.
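
On disadvantage 2: a common way to find an optimal k is to try several candidate values and keep the one that scores best under cross-validation. Here is a minimal sketch using scikit-learn (our choice of library and dataset, purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate odd k values (odd k avoids tied votes between two
# classes) by 5-fold cross-validated accuracy.
scores = {}
for k in range(1, 16, 2):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

# Keep the k with the best average accuracy.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```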

Let's now go through the following resources and readings.

Suggested Readings
K-nearest neighbors, Clearly Explained. [Link (https://www.youtube.com/watch?v=HVXime0nQeI)]
How the KNN Algorithm works. [Link (https://www.youtube.com/watch?v=UqYde-LULfs)]

Exercise
Python Programming: K-Nearest Neighbours (KNN). [Link (https://colab.research.google.com/drive/1DsvJjfbpq0vMSHttWL85kDvuJX4jQuqL?usp=sharing)]

"Machine learning will automate jobs that most people thought could only be done by people." ~
Dave Waters
