
CS 480

Introduction to Artificial Intelligence

November 9, 2023
Announcements / Reminders
• Please follow the Week 12 To Do List instructions (if you haven't already)
• Work on your Written Assignment #04
• Programming Assignment #02 due on Monday (11/27/23) 11:59 PM CST
• Final Exam date:
  – Thursday 11/30/2023 (last week of classes!)
  • Ignore the date provided by the Registrar

2
Plan for Today
• Casual introduction to Machine Learning

3
Traditional Programming vs ML
Traditional programming:
Input data + Program (Rules) → Output

Machine learning:
Input data + Output → Program (Rules)

4
Main Machine Learning Categories
Supervised learning:
Supervised learning is one of the most common techniques in machine learning. It is based on known relationship(s) and patterns within data (for example: the relationship between inputs and outputs). Frequently used types: regression and classification.

Unsupervised learning:
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

Reinforcement learning:
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm. Rewards and punishments are based on the algorithm's actions within its environment.

5
Artificial Neuron (Perceptron)
A (single-layer) perceptron is a model of a biological neuron. It is made of the following components:
• inputs xi - numerical values representing information
• weights wi - numerical values representing how "important" the corresponding input is
• weighted sum: Σ wi * xi
• activation function f that decides if the neuron "fires"

[Diagram: inputs x1, x2, x3, x4 (numbers) → weights w1, w2, w3, w4 → Σ → output y]

6
Single-layer Perceptron as a Classifier
[Diagram: two copies of the perceptron with inputs x1 to x4, weights w1 to w4, summation Σ, and output y; the first one does not fire, the second one fires]

Σ wi * xi < 0 → f = 0 → NOT CAT
Σ wi * xi ≥ 0 → f = 1 → CAT
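
The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a step activation that fires when the weighted sum is at least 0; the weight and input values are hypothetical and chosen only to show both outcomes.

```python
def weighted_sum(weights, inputs):
    """Return the sum of w_i * x_i over all inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def step(total):
    """Threshold activation: fire (1) when the weighted sum is >= 0."""
    return 1 if total >= 0 else 0


def classify(weights, inputs):
    """Map the neuron's output to the two labels used on the slide."""
    return "CAT" if step(weighted_sum(weights, inputs)) == 1 else "NOT CAT"


# Hypothetical weights and inputs, for illustration only.
weights = [0.5, -1.0, 0.25, 0.75]
print(classify(weights, [1.0, 0.2, 0.4, 0.1]))  # weighted sum is positive
print(classify(weights, [0.1, 1.0, 0.0, 0.2]))  # weighted sum is negative
```

A real classifier would learn the weights from labeled data rather than have them written by hand.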

7
Classification: Linear Separation?
[Figure: scatter plot of intermixed HAM and SPAM data points]

Sometimes the decision boundary CANNOT be linear: the function f() is not linearly separable.

8
Hypothesis: Classification “Boundary”

9
XOR: Not a Linearly Separable f()

Logical XOR is an example of a function that is NOT linearly separable

10
XOR: Not a Linearly Separable f()?

11
Basic Neural Unit
[Diagram: input layer → weights → output layer]

12
Hidden Layer
[Diagram: input layer → weights → hidden layer → weights → output layer]

13
XOR: Hidden Layer Approach
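
One possible way to see why a hidden layer helps is to wire up XOR by hand. The sketch below is an assumption for illustration, not the construction shown on the slide: one hidden neuron acts as OR, the other as AND, and the output neuron fires when OR is on and AND is off. All weights and thresholds are hand-picked.

```python
def step(total):
    """Threshold activation: fire (1) when the weighted sum is >= 0."""
    return 1 if total >= 0 else 0


def xor_network(x1, x2):
    """Two-layer network with hand-picked (hypothetical) weights."""
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)    # fires if at least one input is 1
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)   # fires only if both inputs are 1
    return step(1.0 * h_or - 1.0 * h_and - 0.5)  # OR but not AND


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

Each hidden neuron on its own is still a linear separator; it is the combination in the output neuron that produces the non-linear XOR boundary.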

14
Hidden Layer
[Diagram: features → input layer → weights → hidden layer → weights → output layer → output]

15
Feedforward Neural Network
[Diagram: features → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → output]

Also called (historically): multi-layer perceptron

17
Artificial Neural Network (ANN)
An artificial neural network is made of multiple artificial neuron layers.

[Diagram: input layer → hidden layer → hidden layer → output layer]

18
ANN as an Image Classifier
An artificial neural network can be used as a classifier as well.

[Diagram: an input image feeds the input layer, followed by two hidden layers and an output layer; one output is labeled "Other"]

19
ANN as a Classifier
[Diagram: features → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → label]

20
ANN: Supervised Learning
In order to work properly a classifier needs to be trained first with labeled data.
[Diagram: input data (features) → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → label; one output is labeled "Other"]

Training will adjust all the weights within this artificial neural network.

21
Training Data: Features + Labels
Typically input data will be represented by a limited set of features.
Features                                   Label
Wheels    Weight    Passengers
4         8 tons    1                      Truck
6         8 tons    1                      Truck
4         1 ton     4                      Car
4         2 tons    4                      Car
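
In code, the four training examples above might be stored as feature vectors paired with labels. This is only a sketch; the `Vehicle` type and the field names are hypothetical choices for illustration.

```python
from typing import NamedTuple


class Vehicle(NamedTuple):
    """One row of features (names are illustrative, not from the slide)."""
    wheels: int
    weight_tons: float
    passengers: int


# The four labeled examples from the slide.
training_data = [
    (Vehicle(4, 8, 1), "Truck"),
    (Vehicle(6, 8, 1), "Truck"),
    (Vehicle(4, 1, 4), "Car"),
    (Vehicle(4, 2, 4), "Car"),
]

# Split into the two parts a supervised learner expects.
features = [list(vehicle) for vehicle, _ in training_data]
labels = [label for _, label in training_data]

print(features)
print(labels)
```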

22
ANN: Supervised Learning
[Diagram: features (wheels, weight, passengers) → input layer → weights → hidden layer → weights → hidden layer → weights → output layer]

23
Training Data: Images + Labels
A classifier needs to be “shown” thousands of labeled examples to learn.

[Figure: grid of example images with labels: BUS, CAR, BRIDGE, PALM, TRAFFIC LIGHT, TAXI, CROSSWALK, CHIMNEY, MOTORCYCLE, STREET SIGN, HYDRANT, BICYCLE]
Note how some images are “incomplete” and “flawed”.

24
Digit Image as ANN Feature Set
Individual features need to be “extracted” from an image. An image is numbers.

Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/
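
"An image is numbers" can be made concrete with a small sketch. The 4x4 grayscale grid below is a made-up example, and scaling pixel values to the range 0 to 1 is an assumed preprocessing choice, not something stated on the slide.

```python
# Hypothetical 4x4 grayscale image: 0 is black, 255 is white.
image = [
    [0,   0, 255, 0],
    [0, 255, 255, 0],
    [0,   0, 255, 0],
    [0,   0, 255, 0],
]


def to_feature_vector(pixels):
    """Flatten the 2D grid row by row and scale each pixel to [0, 1]."""
    return [value / 255 for row in pixels for value in row]


features = to_feature_vector(image)
print(len(features))   # one feature per pixel
print(features[:4])    # first row of the image
```

Each entry of `features` would feed one neuron of the input layer.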

25
ANN: Supervised Learning
An untrained classifier will NOT label input data correctly.
[Diagram: untrained network producing output values 0.12, 0.99, and 0.55; one output is labeled "Other"]

26
ANN: Training
Given: input data and its corresponding expected label (DOG), calculate the "error".

[Diagram: network producing output values 0.12 (annotated "Should be 1!"), 0.99, and 0.55; one output is labeled "Other"]

“Error” = 0.88. Go back and adjust all the weights to ensure it is lower next time.

27
ANN: Training
Show data / label pair: [image] / DOG.

[Diagram: network producing output values 0.12 (annotated "Should be 1!"), 0.99, and 0.55; one output is labeled "Other"]

Correct all the weights. Repeat many times.
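
The "compute the error, correct the weights, repeat" loop can be sketched for a single neuron. This is a simplified illustration under stated assumptions: one sigmoid neuron trained with the delta rule, not a full multi-layer network, and the starting weights, inputs, and learning rate are hypothetical. The starting weights were picked so that the first output is about 0.12, matching the error of about 0.88 on the slides.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Output of one sigmoid neuron."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_step(weights, bias, inputs, target, learning_rate):
    """Nudge the weights so the output moves toward the target."""
    output = predict(weights, bias, inputs)
    error = target - output
    # Delta rule: scale the error by the slope of the sigmoid.
    delta = error * output * (1.0 - output)
    new_weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * delta
    return new_weights, new_bias, error


# Hypothetical starting point (not from the slides).
weights, bias = [-1.0, -1.0], 0.0
inputs, target = [1.0, 1.0], 1.0  # target 1 stands for "this is a DOG"

errors = []
for _ in range(200):  # "Repeat many times."
    weights, bias, error = train_step(weights, bias, inputs, target, 0.5)
    errors.append(error)

print(round(errors[0], 2), round(errors[-1], 2))  # the error shrinks over time
```

In a multi-layer network the same idea is applied to every weight, typically via backpropagation.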

28
Exercise: ANN Demo
http://playground.tensorflow.org/

29
Exercise: Train a Classifier!
https://teachablemachine.withgoogle.com/

30
ANN for Classification
[Diagram: features → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → label]

31
ANN for Regression
[Diagram: features → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → prediction]

32
ANN for Regression: Used Car Price
Used car price predictor: train it first with used car data - price pairs.
[Diagram: features (model, age, mileage) → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → prediction]

33
K Nearest Neighbors
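
As a rough sketch of the idea behind k nearest neighbors: to classify a new point, find the k training points closest to it and take a majority vote over their labels. The example assumes Euclidean distance and uses made-up 2D points with hypothetical labels "A" and "B".

```python
import math
from collections import Counter


def knn_classify(training_data, query, k):
    """Label the query by majority vote among its k nearest neighbors."""
    neighbors = sorted(
        training_data,
        key=lambda item: math.dist(item[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points, for illustration only.
training_data = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]

print(knn_classify(training_data, (2, 2), k=3))
print(knn_classify(training_data, (6, 5), k=3))
```

Larger values of k smooth the decision boundary, at the cost of letting more distant points influence the vote.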

34
k = 11 Nearest Neighbors

35
k = 11 Nearest Neighbors

36
k = 11 Nearest Neighbors

37
k = 25 Nearest Neighbors

38
k = 25 Nearest Neighbors

39
k = 25 Nearest Neighbors

40
How Would kNN Do Here?

41
Classifier Evaluation: Confusion Matrix
                   Predicted Positive                    Predicted Negative
Actual Positive    True Positive (TP)                    False Negative (FN), Type II Error
Actual Negative    False Positive (FP), Type I Error     True Negative (TN)

Sensitivity = $\frac{TP}{TP + FN}$
Specificity = $\frac{TN}{TN + FP}$
Precision = $\frac{TP}{TP + FP}$
Negative Predictive Value = $\frac{TN}{TN + FN}$
Accuracy = $\frac{TP + TN}{TP + TN + FP + FN}$
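
The five metrics on this slide follow directly from the four confusion-matrix counts. The sketch below shows the arithmetic; the function name and the example counts are hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute the slide's metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for a classifier evaluated on 100 examples.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against a zero denominator, which can occur when a class is never predicted or never present.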

42
Training / Validation / Test Sets
In order to create the best model possible, given
some (relatively large) data set, we should divide it
into:
• training set: to train candidate models
• validation set: to evaluate candidate models and pick the best one
• test set: to do the final evaluation of the model
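
A three-way split might look like the sketch below. The 60/20/20 proportions and the fixed random seed are assumptions for illustration; the slide does not prescribe specific fractions.

```python
import random


def split_dataset(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the data, then cut it into training, validation, and test sets."""
    items = list(data)
    random.Random(seed).shuffle(items)  # shuffle a copy, leave the input untouched
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    validation = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before splitting helps avoid an accidental ordering in the data (for example, sorted by date or label) leaking into one of the sets.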

43
K-Fold Cross-Validation
Validation:
Train | Validate → Score

4-fold cross-validation:
Train | Train | Train | Validate → ScoreA
Train | Train | Validate | Train → ScoreB
Train | Validate | Train | Train → ScoreC
Validate | Train | Train | Train → ScoreD

$Score = \frac{ScoreA + ScoreB + ScoreC + ScoreD}{4}$
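
The rotation of the validation fold and the final averaging can be sketched as follows. The scoring function is a hypothetical stand-in (it "trains" by taking the mean of the training values and scores by negative mean absolute error); any function that trains a model and returns a score could be passed instead.

```python
def make_folds(data, k):
    """Split data into k folds by dealing items out in turn."""
    return [data[i::k] for i in range(k)]


def cross_validate(data, k, train_and_score):
    """Average the validation score over k train/validate rounds."""
    folds = make_folds(data, k)
    scores = []
    for i in range(k):
        validate = folds[i]  # fold i is held out this round
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        scores.append(train_and_score(train, validate))
    return sum(scores) / k


def mean_model_score(train, validate):
    """Hypothetical scorer: predict the training mean, return negative mean absolute error."""
    mean = sum(train) / len(train)
    return -sum(abs(value - mean) for value in validate) / len(validate)


data = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
print(cross_validate(data, 4, mean_model_score))
```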

44
Ensemble Learning
In ensemble learning we create a collection
(an ensemble) of hypotheses (models) h1, h2, ..., hN
and combine their predictions by averaging, voting,
or another level of machine learning. Individual
hypotheses (models) are base models and their
combination is the ensemble model.
• Bagging
• Boosting
• Random Trees
• etc.

45
Bagging: Regression
In bagging we generate K training sets by sampling
with replacement from the original training set.

Train (M data points) → Model 1 | h1
Train (M data points) → Model 2 | h2
Train (M data points) → Model 3 | h3
....
Train (M data points) → Model K | hK

Output: $h(\mathbf{x}) = \frac{1}{K} \sum_{i=1}^{K} h_i(\mathbf{x})$

Bagging tends to reduce variance and helps with smaller data sets.
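
A minimal sketch of bagging for regression is shown below. The base model is deliberately trivial and hypothetical (it always predicts the mean target of its bootstrap sample) so that the sampling-with-replacement and averaging steps stay visible; the data values are made up.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) points from data, sampling with replacement."""
    return [rng.choice(data) for _ in data]


def fit_mean_model(sample):
    """Hypothetical base model: always predicts the mean target of its sample."""
    mean = sum(y for _, y in sample) / len(sample)
    return lambda x: mean


def bagged_regressor(data, k, seed=0):
    """Train k base models on bootstrap samples and average their predictions."""
    rng = random.Random(seed)
    models = [fit_mean_model(bootstrap_sample(data, rng)) for _ in range(k)]
    return lambda x: sum(h(x) for h in models) / k


# Made-up (feature, target) pairs.
data = [(1, 10.0), (2, 12.0), (3, 11.0), (4, 15.0), (5, 13.0)]
h = bagged_regressor(data, k=25)
print(h(3))
```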

46
Bagging: Classification
In bagging we generate K training sets by sampling
with replacement from the original training set.

Train (M data points) → Model 1 | h1
Train (M data points) → Model 2 | h2
Train (M data points) → Model 3 | h3
....
Train (M data points) → Model K | hK

Output: plurality vote over h1, h2, ..., hK

Bagging tends to reduce variance and helps with smaller data sets.
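
For classification, the combining step is a plurality vote instead of an average. The sketch below shows only that voting step; the three threshold "models" are hypothetical stand-ins for classifiers trained on bootstrap samples.

```python
from collections import Counter


def plurality_vote(models, x):
    """Return the label predicted by the largest number of models."""
    votes = Counter(h(x) for h in models)
    return votes.most_common(1)[0][0]


# Hypothetical base models: each flags a message score above its own threshold.
models = [
    lambda score: "SPAM" if score > 0.3 else "HAM",
    lambda score: "SPAM" if score > 0.5 else "HAM",
    lambda score: "SPAM" if score > 0.7 else "HAM",
]

print(plurality_vote(models, 0.6))  # two of three models say SPAM
print(plurality_vote(models, 0.4))  # two of three models say HAM
```

Using an odd number of models for a two-class problem avoids tied votes.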

47
scikit-learn Algorithm Cheat Sheet

Source: https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

48
Distance Measures

Source: https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa

49
Practical ML: Feature Engineering
• One-hot encoding
  red = [1, 0, 0]
  yellow = [0, 1, 0]
  green = [0, 0, 1]
• Binning / Bucketing
• Normalization
• Dealing with missing data / features
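
The four techniques listed above can each be sketched in a few lines. The specific choices here are assumptions for illustration: min-max scaling as the normalization method, the mean as the fill-in value for missing data, and made-up age bin edges.

```python
COLORS = ["red", "yellow", "green"]


def one_hot(value, categories):
    """Encode a category as a vector with a single 1."""
    return [1 if value == category else 0 for category in categories]


def bin_index(value, edges):
    """Return the index of the bin a value falls into, given sorted lower edges."""
    return sum(1 for edge in edges if value >= edge)


def min_max_normalize(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def fill_missing(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


print(one_hot("yellow", COLORS))
print(bin_index(25, edges=[18, 40, 65]))  # hypothetical age buckets
print(min_max_normalize([1.0, 2.0, 8.0]))
print(fill_missing([1.0, None, 3.0]))
```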

50
Unsupervised Learning

51
What is Unsupervised Learning?
Idea:
Unsupervised learning involves finding underlying
patterns within data. Typically used in clustering
data points (similar customers, etc.).

In other words:
• there is some structure (groups / clusters) in data (for example: customer information)
• we don't know what it is (= no labels!)
• unsupervised learning tries to discover it
52
Unsupervised Learning:
K-Means Clustering

54
K-Means Clustering: The Idea

Source: https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
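
A compact sketch of the usual K-Means loop follows: pick k starting centroids, assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat. Picking the starting centroids at random from the data, the fixed iteration count, and the example points are assumptions for illustration.

```python
import math
import random


def k_means(points, k, iterations=10, seed=0):
    """Cluster points into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid in place
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

No labels are used anywhere: the grouping emerges from distances alone.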

55
Exercise: K-Means Clustering
https://lalejini.com/my_empirical_examples/KMeansClusteringExample/web/kmeans_clustering.html

56
3D K-Means Clustering Visualized

Source: https://github.com/Gautam-J/Machine-Learning

57
Where Would You Use Clustering?

58
Reinforcement Learning (RL)

59
What is Reinforcement Learning?
Idea:
Reinforcement learning is inspired by behavioral
psychology. It is based on rewarding / punishing
an algorithm.

Rewards and punishments are based on the algorithm's
actions within its environment.

60
RL: Agents and Environments
[Diagram: the Agent sends an Action to the Environment; the Environment returns a State and a Reward to the Agent]
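
The agent-environment loop can be sketched with a toy example. Everything below is hypothetical: a five-cell corridor as the environment, a reward of 1 for reaching the last cell and a small penalty otherwise, and an agent that acts at random without learning. It only illustrates how state, action, and reward flow around the loop.

```python
import random


class CorridorEnvironment:
    """Toy environment: walk from cell 0 to the last cell of a corridor."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # initial state

    def step(self, action):
        """Apply an action (-1 or +1) and return (state, reward, done)."""
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # reward the goal, punish wasted steps
        return self.position, reward, done


class RandomAgent:
    """Placeholder agent that ignores the state and acts at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, +1])


def run_episode(env, agent, max_steps=100):
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)                 # agent -> environment
        state, reward, done = env.step(action)    # environment -> agent
        total_reward += reward
        if done:
            break
    return total_reward


print(run_episode(CorridorEnvironment(), RandomAgent()))
```

A learning agent would use the rewards it receives to change how `act` chooses actions over time.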

61
Reinforcement Learning in Action

62
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=x4O8pojMF0w

63
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=kopoLzvh5jY

64
Reinforcement Learning in Action

Source: https://www.youtube.com/watch?v=Tnu4O_xEmVk

65
ANN for Simple Game Playing

[Diagram: game state → input layer → hidden layer → hidden layer → output layer → UP / DOWN / JUMP]

66
ANN for Simple Game Playing
The current game state is the input. Decisions (UP/DOWN/JUMP) are rewarded/punished.

[Diagram: game state → input layer → hidden layer → hidden layer → output layer → UP / DOWN / JUMP]

Correct all the weights using Reinforcement Learning.

67
RL: Agents and Environments
[Diagram: agent-environment loop with State, Reward, and Action; the agent box asks "What's inside?"]

68
RL: Agents and Environments
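[Diagram: agent-environment loop: State and Reward flow from the Environment to the agent, Action flows from the agent to the Environment]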

69
