2-Capacity, Underfitting, overfitting-15-Jul-2020Material - I - 15-Jul-2020 - ML - Fundamentals

MACHINE LEARNING
FUNDAMENTALS
Sudhakar MS
School of Electronics Engineering
Vellore Institute of Technology
MACHINE LEARNING
• A computer program that learns from experience with respect to some class of tasks.
• Teaching machines/computers to do things naturally by learning through experience.
• Algorithms that can learn from data without relying on rule-based programming.
• Practice of using algorithms to parse data, learn from it and then make a prediction or determination.
• Science of getting computers to act without being explicitly programmed.
• Can figure out how to perform important tasks by generalisation from examples.
“The field of Machine Learning seeks to answer the questions”
Traditional Programming
Data + Program → Computer → Output
Machine Learning
Data + Output → Computer → Program
Slide credit: Pedro Domingos
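The contrast above can be sketched in a few lines. This is only an illustration under assumed names: the temperature-conversion task, `celsius_to_fahrenheit`, and `fit_line` are not from the slides. In the first half a human writes the rule; in the second half the rule is recovered from input/output pairs by a least-squares line fit.

```python
# Traditional programming: a human writes the rule (the "program").
def celsius_to_fahrenheit(c):
    return 1.8 * c + 32.0

# Machine learning: we only have data (inputs) and outputs,
# and let the computer recover the rule from them.
inputs = [-40.0, 0.0, 10.0, 25.0, 100.0]
outputs = [celsius_to_fahrenheit(c) for c in inputs]  # stands in for measured labels

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(inputs, outputs)
print(f"learned program: f = {slope:.2f} * c + {intercept:.2f}")
```

Because the toy outputs are noise-free, the fitted slope and intercept match the hand-written rule up to rounding; with real measurements they would only approximate it.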

When Do We Use Machine
Learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)
Learning isn’t always useful:

• There is no need to “learn” to calculate payroll
Based on slide by E. Alpaydin
A classic example of a task that requires machine learning:
It is very hard to say what makes a 2
Slide credit: Geoffrey Hinton

Some more examples of tasks that are best solved by using a
learning algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
Slide credit: Geoffrey Hinton

Sample Applications
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software
• [Your favorite area]
Slide credit: Pedro Domingos

What is it? Where is it going? How is it possible?
• It can predict or forecast
• It learns through training
• Often confused with AI/Deep Learning/NN/Big Data
• 100% accuracy is not required
• ML in Big Data/Data Science/Data Engineering
• ML programming
Dimensions of Learning Systems
• Type of feedback
– supervised (labeled examples)
– unsupervised (unlabeled examples)
– reinforcement (reward)
• Representation
– attribute-based (feature vector)
– relational (first-order logic)
• Use of knowledge
– empirical (knowledge-free)
– analytical (knowledge-guided)
Types of Learning
• Supervised (inductive) learning
– Given: training data + desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions
Based on slide by Pedro Domingos

What is Supervised Machine Learning?
• “Well labeled” data means the data is already tagged with the right answer.
• Learning takes place in the presence of a supervisor or a teacher.
• A supervised learning algorithm learns from labeled training data.
• Helps to predict outcomes for unseen data.

What is Unsupervised Learning?
• Deals with unlabeled data.
• The model works on its own to discover information.
• Can perform more complex processing tasks than supervised learning.
• Can be more unpredictable than other learning methods such as deep learning and reinforcement learning.
Why Supervised Learning?
• Supervised learning lets you collect data or produce a data output from previous experience.
• Helps to optimize performance criteria using experience.
• Supervised machine learning solves various types of real-world computation problems.
Why Unsupervised Learning?

Here are the prime reasons for using unsupervised learning:
• Unsupervised machine learning finds all kinds of unknown patterns in data.
• Unsupervised methods find features that can be useful for categorization.
• It can take place in real time, so all the input data can be analyzed and labeled in the presence of learners.
• It is easier to get unlabeled data from a computer than labeled data, which needs manual intervention.
How does Supervised Learning work?
Suppose you want to predict how long it takes to drive home from your workplace. You start by creating a set of labeled data. This data includes:
• Weather conditions
• Time of the day
• Holidays
All these details are inputs. The output is the amount of time it took to drive back home on that specific day.
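A minimal sketch of this setup, assuming made-up records and a deliberately simple learner (the average drive time for each combination of inputs). The data values and the helper names `train` and `predict` are hypothetical, not from the slides.

```python
from collections import defaultdict

# Hypothetical labeled examples: (weather, time_of_day, is_holiday) -> minutes to drive home.
labeled_data = [
    (("rain", "evening", False), 50),
    (("rain", "evening", False), 54),
    (("clear", "evening", False), 35),
    (("clear", "evening", False), 37),
    (("clear", "afternoon", True), 20),
]

def train(examples):
    """Learn the average commute time for each combination of inputs."""
    totals = defaultdict(lambda: [0.0, 0])
    for features, minutes in examples:
        totals[features][0] += minutes
        totals[features][1] += 1
    return {features: total / count for features, (total, count) in totals.items()}

def predict(model, features, default):
    """Return the learned average, or a fallback for unseen combinations."""
    return model.get(features, default)

model = train(labeled_data)
overall_mean = sum(minutes for _, minutes in labeled_data) / len(labeled_data)
print(predict(model, ("rain", "evening", False), overall_mean))
```

A lookup table like this cannot generalize to combinations it has never seen, which is why the sketch falls back to the overall mean; practical models learn a function of the inputs instead.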
| Parameters | Supervised machine learning technique | Unsupervised machine learning technique |
| Process | Input and output variables are given. | Only input data is given. |
| Input Data | Algorithms are trained using labeled data. | Algorithms are used against data which is not labeled. |
| Algorithms Used | Support vector machine, neural network, linear and logistic regression, random forest, and classification trees. | Unsupervised algorithms can be divided into different categories, such as cluster algorithms, K-means, hierarchical clustering, etc. |
| Computational Complexity | Supervised learning is a simpler method. | Unsupervised learning is computationally complex. |
| Use of Data | Uses training data to learn a link between the inputs and the outputs. | Does not use output data. |
| Accuracy of Results | Highly accurate and trustworthy method. | Less accurate and trustworthy method. |
| Real Time Learning | Learning takes place offline. | Learning takes place in real time. |
| Number of Classes | Number of classes is known. | Number of classes is not known. |
| Main Drawback | Classifying big data can be a real challenge. | You cannot get precise information regarding data sorting and the output, because the data used in unsupervised learning is unlabeled and not known. |
Supervised Learning: Regression
• Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)

• Learn a function f(x) to predict y given x
– y is real-valued == regression
Data from G. Witt. Journal of Statistics Education, Volume 21, Number 1 (2013)
Supervised Learning: Classification
• Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)
• Learn a function f(x) to predict y given x
– y is categorical == classification
Based on example by Andrew Ng
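As a rough sketch of learning f(x) for a categorical y, the block below uses a nearest-centroid classifier. The choice of classifier, the toy (tumor size, age) values, and the 0/1 labels are all assumptions made for illustration; the slides do not prescribe a method.

```python
# Toy labeled data (made up): x = (tumor size, age), y = 0 or 1.
training_set = [
    ((1.0, 30.0), 0), ((1.5, 35.0), 0), ((2.0, 40.0), 0),
    ((4.0, 55.0), 1), ((4.5, 60.0), 1), ((5.0, 65.0), 1),
]

def fit_centroids(examples):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in examples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def classify(centroids, x):
    """f(x): return the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

centroids = fit_centroids(training_set)
print(classify(centroids, (1.2, 33.0)), classify(centroids, (4.8, 62.0)))
```

In practice the features would usually be rescaled first, since age and tumor size are on different scales and the larger one dominates the distance.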

Supervised Learning
• x can be multi-dimensional
– Each dimension corresponds to an attribute
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
…
[Figure: scatter plot with axes Tumor Size and Age]
Based on example by Andrew Ng

Unsupervised Learning
• Given x_1, x_2, ..., x_n (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
Genomics application: group individuals by genetic similarity
[Figure axes: Genes, Individuals]
[Source: Daphne Koller]
• Organize computing clusters
• Social network analysis
• Market segmentation
• Astronomical data analysis
Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)

Slide credit: Andrew Ng
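One common way to find such hidden structure is k-means clustering. The sketch below is a minimal one-dimensional version on made-up numbers; the function name `kmeans_1d` and the choice of initial centroids are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on scalar data: assign each point to its nearest
    centroid, then move each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]  # unlabeled x_1, ..., x_n
centroids, clusters = kmeans_1d(points, centroids=[points[0], points[-1]])
print(centroids, clusters)
```

No labels are supplied anywhere: the two groups emerge only from the distances between points. The result of k-means depends on the initial centroids, which are fixed here to keep the example deterministic.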
• Independent component analysis – separate a
combined signal into its original sources
Image credit: statsoft.com Audio from http://www.ism.ac.jp/~shiro/research/blindsep.html

Reinforcement Learning
• Given a sequence of states and actions with
(delayed) rewards, output a policy
– Policy is a mapping from states → actions that tells you what to do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze
– Balance a pole on your hand
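A small sketch of the "robot in a maze" idea, assuming a five-cell corridor and tabular Q-learning. The environment, the algorithm choice, and all hyperparameter values are illustrative assumptions. The reward is delayed: it arrives only when the goal cell is reached, and the learned policy is read off as the best action per state.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
GOAL = N_STATES - 1
ACTIONS = {"left": -1, "right": +1}

def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(list(ACTIONS))  # explore with random actions
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted value of the next state.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = q_learning()
# The policy maps each non-terminal state to the action with the highest learned value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The credit assignment problem shows up in how the value of the final reward propagates backwards: cells far from the goal only learn that "right" is good after the cells next to the goal have learned it.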
Machine Learning Steps
• Gathering data/Data collection/Data Collating
• Data Cleaning/Preprocessing/Preparing that data
• Choosing a model/Supervised/Unsupervised
• Training/splitting Data Sets
• Evaluation/Performance measures/Epochs/Stopping condition/Cost function
• Parameter tuning/Hyperparameter tuning
• Prediction/Forecasting/Classification/Recognition/Identification
Machine learning tasks
• Classification
• Classification with missing inputs
• Regression
• Transcription
• Machine Translation
• Structured output
• Anomaly detection
• Synthesis and sampling
• Imputation of missing values
• Denoising
• Density estimation
Learning Algorithm (LA)
• Ability to learn from data
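• A program learns from experience ‘E’ with respect to tasks ‘T’ and performance measure ‘P’ if its performance at tasks in T, as measured by P, improves with experience E
• Example – a collection of features that have been quantitatively measured from some object or event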
Performance Measure
• Accuracy – Proportion of examples for which the model produces the correct output
• Error rate / Expected loss - Proportion of examples for which the model produces the incorrect output
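Both measures follow directly from their definitions. The labels and predictions below are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Proportion of examples for which the prediction matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def error_rate(y_true, y_pred):
    """Proportion of examples for which the prediction is wrong (0-1 loss)."""
    return 1.0 - accuracy(y_true, y_pred)

labels      = ["cat", "dog", "dog", "cat", "dog"]
predictions = ["cat", "dog", "cat", "cat", "cat"]
print(accuracy(labels, predictions), error_rate(labels, predictions))
```

Three of the five predictions match, so the accuracy is 3/5 and the error rate is the remaining 2/5.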
Model evaluation
• Data set (Design Matrix) - Collection of examples
• Test data, Test error (generalization error)
• Train data, Training error
Simple example of a Learning algorithm
• Linear Regression
• System that takes a vector x as input and predicts a scalar y as its output: ŷ = wᵀx
w – vector of parameters that control the behaviour of the system
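A minimal sketch of such a system, assuming the parameters are found by gradient descent on the mean squared error of the training set. The optimizer, the learning rate, and the toy data are assumptions; a closed-form least-squares solution would serve equally well. A constant 1 is prepended to each input so that one of the weights acts as the intercept.

```python
def predict(w, x):
    """y_hat = w . x  (x already includes a leading 1 for the intercept)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_linear_regression(X, y, lr=0.1, steps=2000):
    """Minimise mean squared error on the training set by gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        errors = [predict(w, xi) - yi for xi, yi in zip(X, y)]
        grad = [2.0 / n * sum(e * xi[j] for e, xi in zip(errors, X)) for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Toy data generated from y = 0.5 + 2*x1 - 1*x2 (made up for illustration).
X = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [1.0, 0.5, 0.5]]
y = [0.5, 2.5, -0.5, 1.5, 1.0]
w = fit_linear_regression(X, y)
print([round(wj, 3) for wj in w])
```

Since the toy targets are exactly linear in the inputs, the learned w approaches the generating coefficients; on noisy data it would settle on the least-squares compromise instead.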
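Performance of a machine learning algorithm (MLA) depends on its ability to:
• Make the training error small
• Make the gap between the training and test error small
• IID – examples are Independent and Identically Distributed: they are independent of each other, and the training and test sets come from the same distribution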
ML Model Metrics
• Statistical Fit
• Overfitting – the gap between the training error and the test error is too large
• Underfitting – the model cannot attain a sufficiently low error value on the training set
• Capacity – the model's ability to fit a wide variety of functions
• Hypothesis space – the set of functions that the learning algorithm (LA) is allowed to select as the solution
• Bias – in linear regression, the intercept term b: the value the output is biased toward in the absence of any input
• Variance
• Generalisation – Ability to perform well on unobserved inputs
• Regularisation – Modification of LA to reduce its generalization error
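To make capacity, underfitting, and overfitting concrete, the sketch below compares two extreme models on synthetic data: one that always predicts the mean label (very low capacity) and a 1-nearest-neighbour model that memorises the training set (very high capacity). The data-generating process (y = x² plus noise), the seed, and both model choices are assumptions for illustration.

```python
import random

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_mean(train):
    """Low capacity: always predict the mean training label."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_1nn(train):
    """High capacity: copy the label of the nearest training input."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]

rng = random.Random(0)

def sample(n):
    # Assumed data-generating process: y = x^2 plus Gaussian noise.
    return [(x, x * x + rng.gauss(0.0, 0.3)) for x in (rng.uniform(-2.0, 2.0) for _ in range(n))]

train, test = sample(30), sample(30)
for name, fit in [("mean predictor", fit_mean), ("1-nearest neighbour", fit_1nn)]:
    model = fit(train)
    print(f"{name}: train MSE={mse(model, train):.3f}, test MSE={mse(model, test):.3f}")
```

The 1-nearest-neighbour model reproduces every training label, so its training error is zero and whatever error it makes on the test set is pure train/test gap. The mean predictor cannot bend to follow the curve, so its error stays well above zero even on the training set, which is the signature of underfitting.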
Underfitting and overfitting -Example
Graphical illustration of Performance metrics
Techniques to reduce underfitting:
1. Increase model complexity.
2. Increase the number of features, performing feature engineering.
3. Remove noise from the data.
4. Increase the number of epochs or increase the duration of training to get better results.
Techniques to reduce overfitting:
1. Increase training data.
2. Reduce model complexity.
3. Early stopping during the training phase (keep an eye on the loss over the training period; as soon as the loss begins to increase, stop training).
Source: https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning/
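Early stopping can be sketched as a simple rule over the loss recorded after each epoch. The version below assumes the monitored quantity is a validation loss and adds a `patience` parameter; both are common conventions rather than something stated in the source, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return the index of the epoch to keep: training stops once the
    validation loss has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Hypothetical per-epoch validation losses: falling, then rising as overfitting sets in.
losses = [0.90, 0.60, 0.45, 0.40, 0.43, 0.50, 0.62]
print(early_stopping_epoch(losses))
```

A patience larger than 1 tolerates brief upticks in a noisy loss curve before giving up; the parameters saved at the returned epoch are the ones to keep.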
Generalisation & Regularisation
• The central challenge in machine learning is that we must perform well on new, previously unseen inputs, not just those on which our model was trained. The ability to perform well on previously unobserved inputs is called generalization.
• The generalization error (test error) should be low.
• We typically estimate the generalization error of a machine learning model by measuring its performance on a test set of examples that were collected separately from the training set.
• Regularization – any modification we make to a learning algorithm that is intended to reduce its generalization error but not its training error. We design our machine learning algorithms to perform well on a specific task, and regularization can effectively increase or decrease the model’s capacity. There is no best form of regularization; instead we must choose a form of regularization that is well suited to the particular task we want to solve.
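One widely used form of regularization is an L2 penalty on the weights (weight decay, or ridge regression). The slides do not name a specific form, so this choice is an assumption, as are the toy data and the helper name `ridge_slope`. For a single weight w and no intercept, minimising Σ(w·x − y)² + λ·w² gives w = Σxy / (Σx² + λ), so a larger λ shrinks w toward zero and lowers the effective capacity.

```python
def ridge_slope(xs, ys, lam):
    """Minimise sum((w*x - y)^2) + lam * w^2 for a single weight w (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]   # made-up data, roughly y = 2x
for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lambda={lam:>5}: w={ridge_slope(xs, ys, lam):.3f}")
```

With λ = 0 this is ordinary least squares; how much shrinkage helps depends on the task, which is why λ is usually treated as a hyperparameter and tuned on held-out data.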
Hyperparameters and Validation Sets
• Most machine learning algorithms have several settings that we can use
to control the behavior of the learning algorithm. These settings are
called hyperparameters
• Typically, one uses about 80% of the training data for training and 20%
for validation. Since the validation set is used to “train” the
hyperparameters, the validation set error will underestimate the
generalization error, though typically by a smaller amount than the
training error. After all hyperparameter optimization is complete, the
generalization error may be estimated using the test set.
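• Cross-validation – in k-fold cross-validation the data is split into k non-overlapping subsets; each subset is used once as the test set while the remaining data is used for training, and the test error is averaged over the k trials.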
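The two splitting schemes above can be sketched with index bookkeeping alone. The function names, the fixed seed, and the 10-example dataset are assumptions for illustration.

```python
import random

def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Shuffle, then hold out a fraction of the training data for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k non-overlapping folds of range(n)."""
    indices = list(range(n))
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

train, val = train_validation_split(range(10))
folds = list(k_fold_indices(10, 5))
print(len(train), len(val), [test_idx for _, test_idx in folds])
```

Every example lands in exactly one test fold, so averaging the k test errors uses all the data for evaluation, which is most useful when the dataset is small.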
Machine Learning (ML) Algorithms
• Deep Learning
• Ensemble
• Neural Networks
• Regularisation
• Rule Based
• Regression
• Bayesian
• Decision Tree
• Dimensionality Reduction
• Instance Based
• Clustering
