CS480 Lecture November 09th

CS 480
Introduction to Artificial Intelligence
November 9, 2023
Announcements / Reminders
• Please follow the Week 12 To Do List instructions (if you haven't already)
• Work on your Written Assignment #04
• Programming Assignment #02 due on Monday (11/27/23) 11:59 PM CST
• Final Exam date:
  – Thursday 11/30/2023 (last week of classes!)
    • Ignore the date provided by the Registrar
2
Plan for Today
• Casual introduction to Machine Learning
3
Traditional Programming vs ML
Traditional programming: Input data + Program (Rules) → Output
Machine learning: Input data + Output → Program (Rules)
4
Main Machine Learning Categories
Supervised learning:
Supervised learning is one of the most common techniques in machine learning. It is based on known relationship(s) and patterns within data (for example: relationship between inputs and outputs). Frequently used types: regression and classification.

Unsupervised learning:
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

Reinforcement learning:
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm. Rewards and punishments are based on the algorithm's action within its environment.
5
Artificial Neuron (Perceptron)
A (single-layer) perceptron is a model of a biological neuron. It is made of the following components:
• inputs xi - numerical values representing information
• weights wi - numerical values representing how "important" the corresponding input is
• weighted sum: Σ wi * xi
• activation function f that decides if the neuron "fires"
[Diagram: inputs x1..x4 (numbers), each multiplied by its weight w1..w4, feed the weighted sum Σ, which produces the output y]
6
Single-layer Perceptron as a Classifier
[Diagram: two copies of the perceptron with inputs x1..x4, weights w1..w4, weighted sum Σ, and output y; one for each case below]
Σ wi * xi < 0 → f = 0 → NOT CAT
Σ wi * xi ≥ 0 → f = 1 → CAT
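The threshold rule above can be sketched in a few lines of Python. This is a minimal illustration, not a trained model: the weight and input values are hypothetical, chosen by hand so that one example lands on each side of zero.

```python
def weighted_sum(weights, inputs):
    """Return the sum of w_i * x_i over all inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def step(z):
    """Step activation: the neuron 'fires' (1) when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def perceptron(weights, inputs):
    """Single-layer perceptron: weighted sum followed by the step activation."""
    return step(weighted_sum(weights, inputs))


# Hypothetical weights and inputs (assumptions, not values from the slides).
weights = [0.5, -1.0, 0.25, 2.0]
cat_like = [1.0, 0.0, 2.0, 1.0]      # weighted sum = 3.0
not_cat_like = [0.0, 3.0, 1.0, 0.0]  # weighted sum = -2.75

for x in (cat_like, not_cat_like):
    label = "CAT" if perceptron(weights, x) == 1 else "NOT CAT"
    print(weighted_sum(weights, x), label)
```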
7
Classification: Linear Separation?
[Figure: scatter plot of HAM and SPAM data points]
Sometimes the decision boundary CANNOT be linear: f() is not linearly separable.
8
Hypothesis: Classification “Boundary”
9
XOR: Not a Linearly Separable f()
Logical XOR is an example of a function that is NOT linearly separable
10
XOR: Not a Linearly Separable f()?
11
Basic Neural Unit
[Diagram: input layer → weights → output layer]
12
Hidden Layer
[Diagram: input layer → weights → hidden layer → weights → output layer]
13
XOR: Hidden Layer Approach
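One way to see why a hidden layer helps is to wire a tiny network by hand. The sketch below is an assumption for illustration, not the construction from the slide: the weights are hand-picked rather than trained, and each neuron gets a bias term, which the slides do not show. The two hidden neurons compute OR and AND, and the output neuron fires when OR is true but AND is not.

```python
def step(z):
    """Step activation: 1 when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def neuron(weights, bias, inputs):
    """Weighted sum plus a bias (an added assumption), then the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(x1, x2):
    """Two-layer network with hand-picked (not learned) weights."""
    h_or = neuron([1.0, 1.0], -0.5, [x1, x2])    # fires if x1 OR x2
    h_and = neuron([1.0, 1.0], -1.5, [x1, x2])   # fires if x1 AND x2
    return neuron([1.0, -1.0], -0.5, [h_or, h_and])  # OR and not AND


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each hidden neuron draws one straight line in the input plane; the output neuron combines the two half-planes, which a single perceptron cannot do.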
14
Hidden Layer
[Diagram: features → input layer → weights → hidden layer → weights → output layer → output]
15
Hidden Layer
[Diagram: features → input layer → weights → hidden layer → weights → output layer → output]
16
Feedforward Neural Network
[Diagram: features → input layer → weights → hidden layer → weights → hidden layer → weights → output layer → output]
Also called (historically): multi-layer perceptron
17
Artificial Neural Network (ANN)
An artificial neural network is made of multiple artificial neuron layers.

18
ANN as an Image Classifier
An artificial neural network can be used as a classifier as well.
[Diagram: an input image is fed to the network; the output classes include "Other"]
19
ANN as a Classifier
[Diagram: features → weights → weights → weights → label]

20
ANN: Supervised Learning
In order to work properly a classifier needs to be trained first with labeled data.
[Diagram: input data is fed to the network; the output classes include "Other"]
Training will adjust all the weights within this artificial neural network.
21
Training Data: Features + Labels
Typically input data will be represented by a limited set of features.
Features                                  Label
Wheels    Weight    Passengers
4         8 tons    1                     Truck
6         8 tons    1                     Truck
4         1 ton     4                     Car
4         2 tons    4                     Car
22
[Diagram: features wheels, weight, and passengers → weights → weights → weights → output]

23
Training Data: Images + Labels
A classifier needs to be “shown” thousands of labeled examples to learn.
[Figure: grid of labeled images]
Labels: BUS, CAR, BRIDGE, PALM, TRAFFIC LIGHT, TAXI, CROSSWALK, CHIMNEY, MOTORCYCLE, STREET SIGN, HYDRANT, BICYCLE
Note how some images are “incomplete” and “flawed”.
24
Digit Image as ANN Feature Set
Individual features need to be “extracted” from an image. An image is numbers.
Source: https://nikolanews.com/not-just-introduction-to-convolutional-neural-networks-part-1/
25
An untrained classifier will NOT label input data correctly.
[Diagram: the untrained network produces outputs 0.12, 0.99, 0.55, ... for its classes, one of which is "Other"]

26
ANN: Training
Given: input data and its corresponding expected label (DOG), calculate the "error".
[Diagram: the network produces outputs 0.12, 0.99, 0.55, ...; the DOG output is 0.12. Should be 1!]
“Error” = 0.88. Go back and adjust all the weights to ensure it is lower next time.
27
ANN: Training
Show a data / label pair: [image] / DOG.
[Diagram: the network produces outputs 0.12, 0.99, 0.55, ...; the DOG output should be 1!]
Correct all the weights. Repeat many times.
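The "calculate the error, adjust the weights, repeat" loop can be sketched for a single neuron. The slides do not say which update rule is used, so this example assumes gradient descent on the squared error of one sigmoid neuron; a full network would propagate the same kind of correction back through every layer. The feature values, starting weights, and learning rate are hypothetical.

```python
import math


def sigmoid(z):
    """Smooth activation that maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, inputs):
    """Output of one sigmoid neuron for the given inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def train_step(weights, inputs, target, learning_rate=0.5):
    """One gradient-descent step on the squared error 0.5 * (target - output)^2."""
    output = predict(weights, inputs)
    error = target - output
    # The derivative of the sigmoid is output * (1 - output).
    delta = error * output * (1.0 - output)
    return [w + learning_rate * delta * x for w, x in zip(weights, inputs)]


inputs = [1.0, 0.5, -0.2]    # hypothetical features of one training example
target = 1.0                 # expected output for the correct label
weights = [-1.0, -1.0, 1.0]  # hypothetical starting weights

errors = []
for _ in range(200):
    errors.append(abs(target - predict(weights, inputs)))
    weights = train_step(weights, inputs, target)

print(f"error before training: {errors[0]:.3f}, after: {errors[-1]:.3f}")
```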
28
Exercise: ANN Demo
http://playground.tensorflow.org/
29
Exercise: Train a Classifier!
https://teachablemachine.withgoogle.com/
30
ANN for Classification

31
ANN for Regression
[Diagram: features → weights → weights → weights → prediction]

32
ANN for Regression: Used Car Price
Used car price predictor: train it first with used car data - price pairs.
[Diagram: features model, age, and mileage → weights → weights → weights → prediction]

33
K Nearest Neighbors
34
k = 11 Nearest Neighbors
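A minimal sketch of k nearest neighbors classification, assuming Euclidean distance and a plurality vote among the k closest training points. The data points and labels are hypothetical; with only six of them the example uses k = 3 rather than 11.

```python
import math
from collections import Counter


def knn_predict(training_points, query, k):
    """Classify `query` by a vote among its k nearest training points.

    training_points is a list of ((x, y), label) pairs.
    """
    by_distance = sorted(training_points,
                         key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled points (assumptions, not data from the slides).
training = [
    ((1, 1), "HAM"), ((1, 2), "HAM"), ((2, 1), "HAM"),
    ((6, 6), "SPAM"), ((7, 6), "SPAM"), ((6, 7), "SPAM"),
]

print(knn_predict(training, (1.5, 1.5), k=3))
print(knn_predict(training, (6.5, 6.5), k=3))
```

An odd k is a common choice for two-class problems because it avoids tied votes.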
35
36
37
38
39
40
How Would kNN Do Here?
41
Classifier Evaluation: Confusion Matrix
                    Predicted Positive                  Predicted Negative
Actual Positive     True Positive (TP)                  False Negative (FN), Type II Error
Actual Negative     False Positive (FP), Type I Error   True Negative (TN)

Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
Precision = TP / (TP + FP)
Negative Predictive Value = TN / (TN + FN)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
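The five measures above follow directly from the four counts. The sketch below computes them for a small set of hypothetical actual and predicted labels (1 = positive, 0 = negative); the function names are illustrative.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) for two equal-length label sequences."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # Type I error
        elif a == positive:
            fn += 1  # Type II error
        else:
            tn += 1
    return tp, fp, tn, fn


def evaluation_metrics(tp, fp, tn, fn):
    """Compute the measures listed alongside the confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical labels (assumptions for illustration only).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(actual, predicted)
metrics = evaluation_metrics(*counts)
print(counts, metrics)
```

The sketch does not guard against a zero denominator, which occurs when a class never appears or is never predicted.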
42
Training / Validation / Test Sets
In order to create the best model possible, given
some (relatively large) data set, we should divide it
into:
• training set: to train candidate models
• validation set: to evaluate candidate models and pick the best one
• test set: to do the final evaluation of the model
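A minimal sketch of such a split. The 60 / 20 / 20 proportions and the fixed shuffle seed are assumptions; the slides do not prescribe particular sizes.

```python
import random


def split_dataset(examples, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle the examples and cut them into training, validation, and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before cutting matters when the data set is ordered (for example, by date or by label); the test set should be touched only once, after the model has been chosen.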
43
K-Fold Cross-Validation
Validation:
Train | Validate → Score

4-fold cross-validation:
Train    | Train    | Train    | Validate → ScoreA
Train    | Train    | Validate | Train    → ScoreB
Train    | Validate | Train    | Train    → ScoreC
Validate | Train    | Train    | Train    → ScoreD

Score = (ScoreA + ScoreB + ScoreC + ScoreD) / 4
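The fold rotation can be sketched as follows. The helper names and the `train_and_score` callback are hypothetical: the callback stands in for "train a model on the training part and return its score on the validation part".

```python
def k_fold_indices(n_examples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    fold_size = n_examples // k
    for fold in range(k):
        start = fold * fold_size
        # The last fold absorbs any leftover examples.
        end = n_examples if fold == k - 1 else start + fold_size
        validation = list(range(start, end))
        train = list(range(0, start)) + list(range(end, n_examples))
        yield train, validation


def cross_validation_score(examples, k, train_and_score):
    """Average the k validation scores, one per fold."""
    scores = []
    for train_idx, validation_idx in k_fold_indices(len(examples), k):
        train = [examples[i] for i in train_idx]
        validation = [examples[i] for i in validation_idx]
        scores.append(train_and_score(train, validation))
    return sum(scores) / k


folds = list(k_fold_indices(8, 4))
for train_idx, validation_idx in folds:
    print("train:", train_idx, "validate:", validation_idx)
```

Every example is used for validation exactly once and for training k - 1 times, so the averaged score depends less on one lucky or unlucky split.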
44
Ensemble Learning
In ensemble learning we create a collection (an ensemble) of hypotheses (models) h1, h2, ..., hN and combine their predictions by averaging, voting, or another level of machine learning. Individual hypotheses (models) are base models and their combination is the ensemble model.
• Bagging
• Boosting
• Random Trees
• etc.
45
Bagging: Regression
In bagging we generate K training sets by sampling
with replacement from the original training set.
[Diagram: K training sets (M data points each), sampled from the original training set, train Model 1 | h1, Model 2 | h2, Model 3 | h3, ..., Model K | hK; the K predictions are averaged to produce the Output]
h(x) = (1/K) * Σ hi(x), for i = 1, ..., K
Bagging tends to reduce variance and helps with smaller data sets.
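A minimal sketch of bagging for regression: draw K bootstrap samples, fit one base model per sample, and average the K predictions. The base model here (a least-squares line through one-dimensional data), the data, and the choice to make each sample as large as the original set are assumptions for illustration.

```python
import random


def fit_line(points):
    """Least-squares fit y = a*x + b; falls back to a constant if all x are equal."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return lambda x: mean_y + slope * (x - mean_x)


def bagging_fit(points, K, seed=0):
    """Train K base models, each on a sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(K):
        sample = [rng.choice(points) for _ in points]
        models.append(fit_line(sample))
    return models


def bagging_predict(models, x):
    """Ensemble prediction: the average of the individual predictions."""
    return sum(h(x) for h in models) / len(models)


# Hypothetical noisy data scattered around y = 2x + 1.
data = [(x, 2 * x + 1 + (0.5 if x % 2 else -0.5)) for x in range(10)]
models = bagging_fit(data, K=25)
prediction = bagging_predict(models, 3.0)
print(prediction)
```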
46
Bagging: Classification
In bagging we generate K training sets by sampling
with replacement from the original training set.
[Diagram: K training sets (M data points each), sampled from the original training set, train Model 1 | h1, Model 2 | h2, Model 3 | h3, ..., Model K | hK; a plurality vote over the K predictions produces the Output]
Bagging tends to reduce variance and helps with smaller data sets.
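For classification the averaging step is replaced by a plurality vote. The sketch below shows only the voting step; the five "models" are hypothetical threshold rules standing in for classifiers trained on different bootstrap samples.

```python
from collections import Counter


def plurality_vote(labels):
    """Return the label predicted most often (first seen wins a tie)."""
    return Counter(labels).most_common(1)[0][0]


def bagging_classify(models, x):
    """Ask every base model for a label and return the most common one."""
    return plurality_vote([h(x) for h in models])


def threshold_model(threshold):
    """Hypothetical base model: label SPAM when the feature exceeds a threshold."""
    return lambda x: "SPAM" if x > threshold else "HAM"


# Stand-ins for K = 5 models trained on different bootstrap samples.
models = [threshold_model(t) for t in (0.5, 0.4, 0.6, 0.45, 0.9)]

print(bagging_classify(models, 0.55))  # three of the five models say SPAM
print(bagging_classify(models, 0.1))   # all five models say HAM
```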
47
scikit-learn Algorithm Cheat Sheet
Source: https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
48
Distance Measures
Source: https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa
49
Practical ML: Feature Engineering
• One-hot encoding
    red = [1, 0, 0]
    yellow = [0, 1, 0]
    green = [0, 0, 1]
• Binning / Bucketing
• Normalization
• Dealing with missing data / features
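Each of the four techniques can be sketched in a few lines. The slide only names them, so the specific choices below (bucket edges as upper bounds, min-max scaling for normalization, mean imputation for missing values) are assumptions; other variants are equally common.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector with a single 1."""
    return [1 if value == c else 0 for c in categories]


def bin_index(value, edges):
    """Return the index of the bucket a value falls into.

    `edges` are sorted upper bounds; values past the last edge go in a final bucket.
    """
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


colors = ["red", "yellow", "green"]
print(one_hot("yellow", colors))
print(bin_index(25, [18, 40, 65]))             # hypothetical age buckets
print(min_max_normalize([1, 2, 8]))            # e.g. vehicle weights in tons
print(fill_missing_with_mean([1.0, None, 3.0]))
```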
50
Unsupervised Learning
51
What is Unsupervised Learning?
Idea:
Unsupervised learning involves finding underlying
patterns within data. Typically used in clustering
data points (similar customers, etc.).
In other words:
• there is some structure (groups / clusters) in data (for example: customer information)
• we don't know what it is (= no labels!)
• unsupervised learning tries to discover it
52
Main Machine Learning Categories
Supervised learning:
Supervised learning is one of the most common techniques in machine learning. It is based on known relationship(s) and patterns within data (for example: relationship between inputs and outputs). Frequently used types: regression and classification.

Unsupervised learning:
Unsupervised learning involves finding underlying patterns within data. Typically used in clustering data points (similar customers, etc.).

Reinforcement learning:
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm. Rewards and punishments are based on the algorithm's action within its environment.
53
Unsupervised Learning:
K-Means Clustering
54
K-Means Clustering: The Idea
Source: https://stanford.edu/~cpiech/cs221/handouts/kmeans.html
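A minimal sketch of the K-means loop: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat. The points are hypothetical, and the starting centroids are passed in explicitly to keep the example deterministic (in practice they are usually chosen at random).

```python
import math


def assign_clusters(points, centroids):
    """Group each point with its nearest centroid (Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def update_centroids(clusters, old_centroids):
    """Move each centroid to the mean of its cluster; keep it if the cluster is empty."""
    new_centroids = []
    for cluster, old in zip(clusters, old_centroids):
        if cluster:
            new_centroids.append(
                tuple(sum(coord) / len(cluster) for coord in zip(*cluster)))
        else:
            new_centroids.append(old)
    return new_centroids


def k_means(points, centroids, iterations=10):
    """Alternate the assignment and update steps a fixed number of times."""
    for _ in range(iterations):
        clusters = assign_clusters(points, centroids)
        centroids = update_centroids(clusters, centroids)
    return centroids, assign_clusters(points, centroids)


# Hypothetical 2D points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Note that no labels are used anywhere: the groups emerge from distances alone.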
55
Exercise: K-Means Clustering
https://lalejini.com/my_empirical_examples/KMeansClusteringExample/web/kmeans_clustering.html
56
3D K-Means Clustering Visualized
Source: https://github.com/Gautam-J/Machine-Learning
57
Where Would You Use Clustering?
58
Reinforcement Learning (RL)
59
What is Reinforcement Learning?
Idea:
Reinforcement learning is inspired by behavioral psychology. It is based on rewarding / punishing an algorithm.
Rewards and punishments are based on the algorithm's action within its environment.
60
RL: Agents and Environments
[Diagram: the Agent sends an Action to the Environment; the Environment sends a State and a Reward back to the Agent]
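The interaction loop between agent and environment can be sketched as below. Everything here is a hypothetical illustration: a five-position corridor with a reward for reaching the far end, an agent that acts at random, and one that always moves right. No learning takes place; the sketch shows only how state, action, and reward are exchanged.

```python
import random


class CorridorEnvironment:
    """Hypothetical environment: positions 0..4, with the goal at position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        """Apply an action (-1 or +1) and return (new_state, reward, done)."""
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else -0.1  # small punishment for every extra step
        return self.state, reward, done


class RandomAgent:
    """Picks an action at random, ignoring the state."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])


class AlwaysRightAgent:
    """Always moves toward the goal."""

    def act(self, state):
        return 1


def run_episode(env, agent, max_steps=1000):
    """Run the state → action → reward loop until the goal or the step limit."""
    state, total_reward = env.state, 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return state, total_reward


print(run_episode(CorridorEnvironment(), AlwaysRightAgent()))
print(run_episode(CorridorEnvironment(), RandomAgent()))
```

A learning agent would use the rewards it receives to change how `act` chooses actions, so that the total reward grows over many episodes.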
61
Reinforcement Learning in Action
62
Source: https://www.youtube.com/watch?v=x4O8pojMF0w
63
Source: https://www.youtube.com/watch?v=kopoLzvh5jY
64
Source: https://www.youtube.com/watch?v=Tnu4O_xEmVk
65
ANN for Simple Game Playing
[Diagram: game state → network → outputs UP / DOWN / JUMP]

66
ANN for Simple Game Playing
The current game state is the input. Decisions (UP / DOWN / JUMP) are rewarded / punished.
[Diagram: game state → network → outputs UP / DOWN / JUMP]

Correct all the weights using Reinforcement Learning.
67
[Diagram: State, Reward, Action, and Environment, with the question "What's inside?"]
68
[Diagram: State, Reward, Action, and Environment]
69
