Artificial Intelligence
OUTLINE
Chapter 1: Overview of AI
Chapter 2: Artificial Neural Networks
Chapter 3: Searching, Knowledge, Reasoning, and Planning
Chapter 4: Machine learning
W1 W2 W3 W4 W5 W6 W7 W8 W9 W10
L L L I-T L L L L P P
03/11/2023
Objectives
1. Understand how the k-Nearest Neighbor (k-NN) classifier works
Main contents
1. Working with image dataset
3. Real applications
Working with image dataset
1. “Animals” dataset
Working with image dataset
2. A Basic Image Preprocessor
The k-Nearest Neighbor (k-NN) classifier is arguably the simplest machine learning and image
classification algorithm. In fact, it’s so simple that it doesn’t actually “learn” anything.
Instead, this algorithm directly relies on the distance between feature vectors (which in
our case, are the raw RGB pixel intensities of the images).
Simply put, the k-NN algorithm classifies unknown data points by finding the most
common class among the k closest examples. Each data point in the k closest data points
casts a vote, and the category with the highest number of votes wins. Or, in plain English:
“Tell me who your neighbors are, and I’ll tell you who you are”.
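The voting rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's implementation: the function name `knn_predict` and the toy 2-D feature vectors are assumptions made for the example, and Euclidean distance is assumed as the metric.

```python
from collections import Counter

import numpy as np


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training feature vector
    dists = np.linalg.norm(train_X - query, axis=1)
    # Indices of the k smallest distances (the k nearest neighbors)
    nearest = np.argsort(dists)[:k]
    # Each neighbor casts one vote for its own label
    votes = Counter(train_y[int(i)] for i in nearest)
    # The label with the most votes wins
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors (hypothetical data, not the Animals dataset)
train_X = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # "cat" cluster
    [5.0, 5.0], [5.1, 4.9], [4.8, 5.2],   # "dog" cluster
])
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

label_near_cats = knn_predict(train_X, train_y, np.array([0.3, 0.3]), k=3)
label_near_dogs = knn_predict(train_X, train_y, np.array([4.9, 5.0]), k=3)
```

Note that there is no training step: the function only stores nothing beyond the labeled examples and does all of its work at prediction time, which is what "doesn't actually learn anything" means in practice.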
There are two clear hyperparameters that we are concerned with when running
the k-NN algorithm.
The first is obvious: the value of k. What is the optimal value of k? If it’s too small
(such as k = 1), then we gain efficiency but become susceptible to noise and outlier
data points. However, if k is too large, then we are at risk of over-smoothing our
classification results and increasing bias.
The second parameter we should consider is the actual distance metric. Is the
Euclidean distance the best choice? What about the Manhattan distance?
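To make the second hyperparameter concrete, the sketch below computes both distances for the same pair of points. The helper names `euclidean` and `manhattan` are illustrative choices for this example, not part of any library.

```python
import numpy as np


def euclidean(a, b):
    """L2 distance: square root of the summed squared differences."""
    return float(np.sqrt(np.sum((a - b) ** 2)))


def manhattan(a, b):
    """L1 distance: sum of the absolute differences."""
    return float(np.sum(np.abs(a - b)))


a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])

d_l2 = euclidean(a, b)  # sqrt(3^2 + 4^2) = 5.0
d_l1 = manhattan(a, b)  # |3| + |4| = 7.0
```

The two metrics can rank neighbors differently, so the choice of metric is usually tuned together with k rather than fixed in advance.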
The goal of this section is to train a k-NN classifier on the raw pixel intensities of
the Animals dataset and use it to classify unknown animal images. We’ll be using
our four-step pipeline to train classifiers:
Step #1 – Gather Our Dataset: The Animals dataset consists of 3,000 images, with
1,000 images for each of the dog, cat, and panda classes. Each image is represented
in the RGB color space. We will preprocess each image by resizing it to 32×32
pixels. Taking into account the three RGB channels, the resized image dimensions
imply that each image in the dataset is represented by 32×32×3 = 3,072 integers.
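The arithmetic in Step #1 can be checked with a short NumPy sketch. It assumes the images have already been resized to 32×32 (for example with an image library's resize function) and uses random pixels as a stand-in for real images; the array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for 6 already-resized RGB images (random pixels, not real data):
# shape is (num_images, height, width, channels)
images = rng.integers(0, 256, size=(6, 32, 32, 3), dtype=np.uint8)

# Flatten each image into one feature vector of raw pixel intensities
features = images.reshape(images.shape[0], -1)

values_per_image = features.shape[1]  # 32 * 32 * 3
```

Each row of `features` is the feature vector that k-NN compares by distance; for the full dataset the matrix would have 3,000 rows and 3,072 columns.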
Step #2 – Split the Dataset: For this simple example, we’ll be using two splits
of the data. One split for training, and the other for testing. We will leave out
the validation set for hyperparameter tuning and leave this as an exercise to
the reader.
Step #3 – Train the Classifier: Our k-NN classifier will be trained on the raw
pixel intensities of the images in the training set.
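Steps #2 and #3 might look like the following with scikit-learn. This is a sketch under stated assumptions: the data is synthetic (three well-separated clusters standing in for the cat, dog, and panda classes), and the 75/25 split ratio, `random_state`, and `n_neighbors=3` are illustrative choices, not values given in the slides.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)

# Synthetic stand-in for flattened 32x32x3 images: 20 samples per class,
# each class centered on a different mean pixel intensity (assumed values)
class_names = ["cat", "dog", "panda"]
labels = np.repeat(np.arange(3), 20)
centers = np.array([40.0, 120.0, 200.0])
X = centers[labels][:, None] + rng.uniform(-10, 10, size=(60, 3072))

# Step #2: one split for training, one for testing (no validation set here)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=42
)

# Step #3: "training" k-NN simply stores the training vectors and labels
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Fraction of test images whose predicted class matches the true class
accuracy = model.score(X_test, y_test)
predicted_name = class_names[int(model.predict(X_test[:1])[0])]
```

`KNeighborsClassifier` uses the Minkowski metric with `p=2` (Euclidean) by default; passing `p=1` switches to the Manhattan distance. Real raw-pixel images are far less separable than this synthetic data, so accuracy on the Animals dataset should be expected to be much lower.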
IV. Quiz
1. Download other images and test the classifier on them