
Machine Learning Concepts

SE-805 Advanced AI – Lecture
Terminologies

Khawir Mahmood SE-805 AAI - Machine Learning Concepts 2


Role of Data
• Data is everywhere
  1. Google: processes 24 petabytes of data per day
  2. Facebook: 10 million photos uploaded every hour
  3. YouTube: 1 hour of video uploaded every second
  4. Twitter: 400 million tweets per day
  5. Astronomy: satellite data is in hundreds of PB
  6. …
• Data types
  – Texts
  – Numbers
  – Clickstreams
  – Graphs
  – Tables
  – Images
  – Transactions
  – Videos
  – Some or all of the above!



Traditional Programming vs ML Paradigm

• Traditional Programming: Data + Program → Computer → Output

• Machine Learning: Data + Outputs (optional) → Computer → Model (Program)

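The contrast between the two paradigms can be sketched in code. This is an illustrative toy, not from the slides: the spam rule and the `learn_threshold` helper are hypothetical. In traditional programming we hand-write the rule; in machine learning the "program" (the model) is fit from data and outputs.

```python
# Traditional programming: the rule is written by hand.
def is_spam_rule(text):
    return "free money" in text.lower()

# Machine learning: the "program" (here, a single threshold) is learned
# from example (feature, label) pairs instead of being hand-coded.
def learn_threshold(examples):
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    # Midpoint between the class means -- a minimal "learned model".
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

threshold = learn_threshold([(0.9, 1), (0.8, 1), (0.2, 0), (0.1, 0)])

def classify(x):
    return 1 if x > threshold else 0
```

The hand-written rule only covers cases its author anticipated; the learned threshold adapts if the training examples change.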


The Data Science Process



Machine Learning
“Machine learning is programming computers to optimize a performance criterion using example data or past experience”
– Ethem Alpaydın

• ML is an algorithmic field that
– Blends ideas from statistics, computer science etc. to design methods that learn from data and help make decisions
Machine Learning
• Nicolas learns about trucks and dumpers



Machine Learning
• But will he recognize others?

So learning involves the ability to generalize from labeled examples.
Machine Learning
• How do we create computer programs that improve with
experience?

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E”
– Tom Mitchell (1997)



Learning Problems - Example
• Learning = Improve with experience over some task
– Improve over task T,
– With respect to performance measure P,
– Based on experience E.
• Example
– T = recognizing handwritten words
within images
– P = % of words correctly recognized
– E = a database of handwritten words
with given classification



Learning Problems - Example
• Learning = Improve with experience over some task
– Improve over task T,
– With respect to performance measure P,
– Based on experience E.
• Example
– T = autonomous driving
– P = average distance traveled before an error
– E = a sequence of images and steering
commands recorded while observing a
human driver



Machine Learning Tasks
• Classification: Predicting the item class/ category of a case
• Regression: Predicting continuous values
• Clustering: Finding the structure of data; summarization
• Associations: Associating frequent co-occurring items/ events
• Anomaly detection: Discovering abnormal and unusual cases
• Sequence mining: Predicting next event
• Dimension reduction: Reducing the number of features/ dimensions of the data
• Recommendation systems: Recommending items



Types of Machine Learning

• Supervised Learning – learn from labelled data
  – Regression (continuous target variable), e.g. housing price prediction
  – Classification (categorical target variable), e.g. medical imaging
• Unsupervised Learning – learn from unlabeled data
  – Clustering, e.g. customer segmentation
  – Dimensionality Reduction, e.g. feature selection
• Reinforcement Learning – learn from experience, e.g. driverless cars


Machine Learning Algorithms



Supervised Learning



Supervised Learning

(Figure: a cat image decomposed into its Red, Green and Blue channels, mapped to the label 1 (cat) vs 0 (non-cat))



Notation
(𝑥, 𝑦) with 𝑥 ∈ ℝ^(n_x), 𝑦 ∈ {0, 1}

m examples: (𝑥^(1), 𝑦^(1)), (𝑥^(2), 𝑦^(2)), …, (𝑥^(m), 𝑦^(m))

𝑋 = [𝑥^(1) 𝑥^(2) … 𝑥^(m)] ∈ ℝ^(n_x × m)
𝑌 = [𝑦^(1) 𝑦^(2) … 𝑦^(m)] ∈ ℝ^(n_y × m)
Notation
• Sizes:
– m: number of examples in the dataset
– n_x: input size (attributes/ features/ dimension)
– n_y: output size (number of classes)
• Objects:
– 𝑋 ∈ ℝ^(n_x × m): the input matrix
– 𝑥^(i) ∈ ℝ^(n_x): the i-th example, represented as a column vector
– 𝑌 ∈ ℝ^(n_y × m): the label matrix
– 𝑦^(i) ∈ ℝ^(n_y): the output label of the i-th example
– ŷ ∈ ℝ^(n_y): the predicted output vector

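As a quick sanity check on this notation, a toy dataset (all values hypothetical) with the examples stored as columns of X, matching X ∈ ℝ^(n_x × m):

```python
# m = 4 examples, each with n_x = 3 features; labels are scalar (n_y = 1).
m, n_x = 4, 3
examples = [[1.0, 2.0, 3.0],   # x^(1)
            [4.0, 5.0, 6.0],   # x^(2)
            [7.0, 8.0, 9.0],   # x^(3)
            [0.0, 1.0, 0.0]]   # x^(4)
labels = [1, 0, 1, 0]          # y^(1) ... y^(m)

# Stack the examples as columns, so X has shape n_x x m:
# X[j][i] is feature j of example x^(i).
X = [[examples[i][j] for i in range(m)] for j in range(n_x)]
```

Each column of X is one example; each row is one feature across all examples.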


Supervised vs Unsupervised Learning
• Supervised Learning: learning a model from labeled data
– Given: (𝑥^(1), 𝑦^(1)), …, (𝑥^(m), 𝑦^(m)) with 𝑥^(i) ∈ ℝ^(n_x) and 𝑦^(i) the label
– Task: 𝑋 → 𝑌
  Classification: 𝑦 is discrete, e.g. 𝑦 ∈ {1, …, n_y}
  Regression: 𝑦 is a real value, i.e. 𝑦 ∈ ℝ

• Unsupervised Learning: learning a model from unlabeled data
– Given: 𝑥^(1), 𝑥^(2), …, 𝑥^(m) with 𝑥^(i) ∈ ℝ^(n_x)
– Task: ℝ^(n_x) → {𝐶_1, 𝐶_2, …, 𝐶_k} (clustering/ segmentation)



Unsupervised Learning

Methods: K-means, Gaussian mixtures, hierarchical clustering etc.


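As an illustration of one of these methods, a minimal K-means (Lloyd's algorithm) sketch on 2-D points; the function name and the toy data are ours, not from the slides:

```python
import random

def k_means(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [tuple(sum(d) / len(cl) for d in zip(*cl)) if cl
                     else centroids[j] for j, cl in enumerate(clusters)]
    return centroids, clusters
```

On well-separated data the two steps quickly stop changing, which is when the algorithm has converged.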
Supervised Learning

Methods: Decision Trees, Support Vector Machines, Neural Networks, K-nearest neighbors, Naïve Bayes etc.
Supervised Learning
• Classification



Supervised Learning
• Classification



Supervised Learning
• Non-linear Classification



Supervised Learning
• Regression



Training and Testing

Training set (income, gender, age, family status, zipcode) → ML Algorithm → Model (𝒇) → Credit yes/no, Credit amount



K-nearest neighbors
• Not every ML method builds a model!
• K-NN: an instance-based ML algorithm
• Main idea: uses the similarity between examples
• Assumption: two similar examples should have the same label
• Assumes all examples (instances) are points in the n_x-dimensional space ℝ^(n_x)



K-nearest neighbors
• K-NN uses the standard Euclidean distance to define nearest neighbors
• Given two examples 𝑥^(i) and 𝑥^(j):

d(𝑥^(i), 𝑥^(j)) = √( Σ_(k=1)^(n_x) ( 𝑥_k^(i) − 𝑥_k^(j) )² )

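The distance above translates directly into code; a minimal sketch (the function name is ours):

```python
import math

def euclidean_distance(xi, xj):
    # d(x^(i), x^(j)) = sqrt( sum_k (x_k^(i) - x_k^(j))^2 )
    assert len(xi) == len(xj)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))
```

For example, the distance between (0, 0) and (3, 4) is 5.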


K-nearest neighbors
• Training algorithm:
– Add each training example (𝑥, 𝑦) to the dataset 𝐷
– 𝑥^(i) ∈ ℝ^(n_x) and 𝑦^(i) ∈ ℝ^(n_y)
• Classification algorithm:
– Given an example 𝑥^(q) to be classified, let 𝑁_k(𝑥^(q)) be the set of the K nearest neighbors of 𝑥^(q). Then, for labels 𝑦^(i) ∈ {−1, +1}:

ŷ^(q) = sign( Σ_(𝑥^(i) ∈ 𝑁_k(𝑥^(q))) 𝑦^(i) )

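Putting training and classification together, a minimal K-NN sketch. It uses a majority vote over the neighbors' labels, which matches the sign rule above when labels are −1/+1; the function names and toy dataset are hypothetical:

```python
import math
from collections import Counter

def euclidean(xi, xj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_predict(D, xq, k=3):
    # "Training" is just storing D, a list of (x, y) pairs.
    # Classification: find the k stored examples nearest to xq ...
    neighbors = sorted(D, key=lambda pair: euclidean(pair[0], xq))[:k]
    # ... and take a majority vote over their labels.
    votes = Counter(y for _, y in neighbors)
    return votes.most_common(1)[0][0]

D = [([0, 0], 0), ([0, 1], 0), ([5, 5], 1), ([6, 5], 1)]
```

For example, `knn_predict(D, [5, 6], k=3)` classifies the query by its three nearest stored points, two of which carry label 1.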


K-nearest neighbors

3-NN



K-nearest neighbors

3-NN



K-nearest neighbors
• Pros:
– Simple to implement
– Works well in practice
– Does not require building a model, making assumptions, or tuning parameters
– Can be extended easily with new examples
• Cons:
– Requires large space to store the entire training dataset
– Slow! Given 𝑚 examples and n_x features, each prediction takes O(𝑚 × n_x) time
– Suffers from the curse of dimensionality



Applications of K-NN
• Information retrieval
• Handwritten character classification using nearest neighbor
in large databases
• Recommender systems (users like you may like similar movies)
• Cancer diagnosis
• Medical data mining (similar patient symptoms)
• Pattern recognition in general



Training and Testing
Training set (income, gender, age, family status, zipcode) → ML Algorithm → Model (𝒇) → Credit yes/no, Credit amount

How can we be confident about 𝒇 ?


Training and Testing
• We calculate the cost function ℑ_train, the in-sample error (training error or empirical error/ risk):

ℑ_train(𝑓) = Σ_(i=1)^(m) ℒ(ŷ^(i), 𝑦^(i))   where ŷ^(i) = 𝑓(𝑥^(i))

• Examples of loss functions:

ℒ(ŷ^(i), 𝑦^(i)) = 1 if sign(ŷ^(i)) ≠ sign(𝑦^(i)), 0 otherwise

ℒ(ŷ^(i), 𝑦^(i)) = (ŷ^(i) − 𝑦^(i))²

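The two loss functions, and the training error that sums them over the dataset, as a short sketch (function names are ours):

```python
def zero_one_loss(y_hat, y):
    # 1 if the predicted sign disagrees with the true sign, else 0.
    return 1 if (y_hat >= 0) != (y >= 0) else 0

def squared_loss(y_hat, y):
    return (y_hat - y) ** 2

def training_error(f, data, loss):
    # J_train(f) = sum_i loss(f(x^(i)), y^(i)) over the training set.
    return sum(loss(f(x), y) for x, y in data)
```

The 0-1 loss suits classification; the squared loss suits regression.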


Training and Testing
• We calculate the cost function ℑ_train, the in-sample error (training error or empirical error/ risk):

ℑ_train(𝑓) = Σ_(i=1)^(m) ℒ(ŷ^(i), 𝑦^(i))   where ŷ^(i) = 𝑓(𝑥^(i))

• We aim to make ℑ_train(𝑓) small, i.e. minimize it

• We hope that ℑ_test(𝑓), the out-of-sample (test/ true) error, will be small too



Overfitting/ underfitting



Bias Variance Tradeoff



Training and Testing



Avoid overfitting
• In general, use simple models!
– Reduce the number of features manually, or do feature selection
– Do model selection
– Use regularization
  keep the features but reduce their importance by setting small parameter values
  dropout regularization
– Do cross-validation to estimate the test error

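To make the regularization idea concrete, a tiny ridge-style example for fitting y ≈ w·x with no intercept (the setup and function name are ours, not from the slides); the penalty λ shrinks the learned parameter toward zero, keeping the feature but reducing its importance:

```python
def ridge_1d(xs, ys, lam):
    # Minimizes sum_i (w*x_i - y_i)^2 + lam * w^2; the closed-form
    # solution is w = sum(x*y) / (sum(x^2) + lam).
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)
```

With `lam=0` this is ordinary least squares; increasing `lam` gives a simpler (smaller-weight) model.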


Train, Validation and Test
Train Validate Test

• Training set is a set of examples used for learning a model


• Validation set is a set of examples that cannot be used for
learning the model but can help tune model parameters
– e.g., selecting K in K-NN
– Validation helps control overfitting.
• Test set is used to assess the performance of the final model
and provide an estimation of the test error
– Never use the test set in any way to further tune the parameters or
revise the model
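A common way to obtain these three sets is a single random split; a sketch under our own choice of fractions and function name:

```python
import random

def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    # Shuffle once with a fixed seed, then carve off test and validation.
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = data[:n_test]
    val = data[n_test:n_test + n_val]
    train = data[n_test + n_val:]   # the remaining ~60% is for learning
    return train, val, test
```

The fixed seed makes the split reproducible; the test portion is set aside untouched until the final evaluation.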
K-fold Cross Validation
• A method for estimating test error using training data
• Algorithm:
– Given a learning algorithm Α and a dataset 𝐷
– Step 1: Randomly partition 𝐷 into 𝑘 equal-size subsets 𝐷_1, 𝐷_2, …, 𝐷_k
– Step 2: For 𝑗 = 1 to 𝑘
  Train Α on all 𝐷_i with 𝑖 ∈ {1, 2, …, 𝑘}, 𝑖 ≠ 𝑗, to get 𝑓_j
  Apply 𝑓_j to 𝐷_j and compute ℑ_(D_j)
– Step 3: Average the error over all folds:

(1/𝑘) Σ_(j=1)^(k) ℑ_(D_j)

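The three steps above as a generic sketch, parameterized by a training routine and an error measure supplied by the caller (the function names are ours):

```python
def k_fold_cv(train_fn, error_fn, data, k=5):
    # Step 1: partition data into k (roughly) equal folds.
    folds = [data[j::k] for j in range(k)]
    errors = []
    for j in range(k):                          # Step 2
        held_out = folds[j]
        train_data = [ex for i, f in enumerate(folds) if i != j for ex in f]
        model = train_fn(train_data)            # train on the other k-1 folds
        errors.append(error_fn(model, held_out))
    return sum(errors) / k                      # Step 3: average over folds
```

For instance, with a model that just predicts the mean training label and a mean-squared-error measure, the returned average estimates the test error without ever touching a separate test set.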


Confusion matrix

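Since the slide's figure is not reproduced here, a minimal binary confusion-matrix sketch (the dict layout and toy labels are our choice): it counts true/false positives and negatives, from which accuracy and similar metrics follow.

```python
def confusion_matrix(y_true, y_pred):
    # Binary case: TP/TN = correct predictions, FP/FN = the two error types.
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

cm = confusion_matrix([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
accuracy = (cm["TP"] + cm["TN"]) / 5   # (TP + TN) / total
```

Unlike plain accuracy, the matrix separates the two kinds of mistakes, which matters when classes are imbalanced.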


Review



Credit
• The Elements of Statistical Learning: Data Mining, Inference, and Prediction. T. Hastie, R. Tibshirani, J. Friedman. Springer, 2nd Edition, 2009.
• Machine Learning. T. Mitchell. McGraw-Hill, 1997.

