
MACHINE LEARNING

FUNDAMENTALS

Sudhakar MS
School of Electronics Engineering
Vellore Institute of Technology
MACHINE LEARNING
• A computer program to learn from experience with respect to some class of tasks.
• Teaching machines/computers to do things naturally by learning through experience.
• Science of getting computers to act without being explicitly programmed.
• Algorithms that can learn from data without relying on rule-based programming.
• Practice of using algorithms to parse data, learn from it, and then make a prediction or determination.
• Can figure out how to perform important tasks by generalisation from examples.

“The field of Machine Learning seeks to answer the questions”
Traditional Programming
[Diagram: Data + Program → Computer → Output]

Machine Learning
[Diagram: Data + Output → Computer → Program]

Slide credit: Pedro Domingos


When Do We Use Machine Learning?
ML is used when:
• Human expertise does not exist (navigating on Mars)
• Humans can’t explain their expertise (speech recognition)
• Models must be customized (personalized medicine)
• Models are based on huge amounts of data (genomics)

Learning isn’t always useful:


• There is no need to “learn” to calculate payroll
Based on slide by E. Alpaydin

A classic example of a task that requires machine
learning: It is very hard to say what
makes a 2

Slide credit: Geoffrey Hinton


Some more examples of tasks that are best solved by using a
learning algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates

Slide credit: Geoffrey Hinton


Sample Applications
• Web search
• Computational biology
• Finance
• E-commerce
• Space exploration
• Robotics
• Information extraction
• Social networks
• Debugging software
• [Your favorite area]

Slide credit: Pedro Domingos


What is it? Where is it going? How is it possible?
• It can predict/Forecast
• Training
• Often confused with AI/Deep Learning/NN/Big Data
• 100 % accuracy – not needed
• ML in Big Data/Data Science/Data Engineering
• ML programming
Dimensions of Learning Systems
• type of feedback
• supervised (labeled examples)
• unsupervised (unlabeled examples)
• reinforcement (reward)
• representation
• attribute-based (feature vector)
• relational (first-order logic)
• use of knowledge
• empirical (knowledge-free)
• analytical (knowledge-guided)
Types of Learning
• Supervised (inductive) learning
– Given: training data + desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions

Based on slide by Pedro Domingos


What is Supervised Machine Learning?

• "Labeled" data means that each example is pre-tagged with the right answer

• Learning takes place in the presence of a supervisor or teacher

• A supervised learning algorithm learns from labeled training data

• The trained model helps to predict outcomes for unseen data


What is Unsupervised Learning?

• Deals with unlabeled data
• The model works on its own to discover information.
• Can perform more complex processing tasks than supervised learning.
• Can be less predictable than other learning methods such as deep learning and reinforcement learning.
Why Supervised Learning?
• Supervised learning lets you collect data or produce a data output from previous experience.
• Helps to optimize performance criteria using experience.
• Supervised machine learning helps to solve various types of real-world computation problems.

Why Unsupervised Learning?


Here are the prime reasons for using unsupervised learning:
• Unsupervised machine learning finds all kinds of unknown patterns in data.
• Unsupervised methods help to find features that can be useful for categorization.
• It can take place in real time, so all the input data is analyzed and labeled in the presence of learners.
• It is easier to get unlabeled data from a computer than labeled data, which needs manual intervention.
How Does Supervised Learning Work?
Suppose you want to predict how long it takes to drive home from your workplace. You start by creating a set of labeled
data. This data includes
•Weather conditions
•Time of the day
•Holidays
All these details are inputs. The output is the amount of time it took to drive back home on that specific
day.
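The drive-home example above can be sketched in a few lines. This is only an illustration under stated assumptions: the feature encoding (is_raining, hour_of_day, is_holiday), the numbers in `training_data`, and the choice of a 1-nearest-neighbour predictor are all invented here and are not taken from the slides.

```python
def nearest_neighbour_predict(query, labeled_examples):
    """Return the label of the training example closest to `query`."""
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(labeled_examples, key=lambda ex: squared_distance(ex[0], query))
    return label


# Labeled examples: (is_raining, hour_of_day, is_holiday) -> minutes to drive home.
# All values are hypothetical and exist only to make the sketch runnable.
training_data = [
    ((0, 17, 0), 35.0),
    ((1, 17, 0), 50.0),
    ((0, 20, 0), 22.0),
    ((1, 20, 0), 30.0),
    ((0, 17, 1), 20.0),
]

# Predict the commute for a rainy, non-holiday day at 18:00.
estimate = nearest_neighbour_predict((1, 18, 0), training_data)
print(estimate)
```

The inputs are the features and the output is the recorded driving time, which is exactly the input/label pairing that makes the problem supervised. A practical model would also rescale the features so that the hour does not dominate the distance.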
| Parameters | Supervised machine learning technique | Unsupervised machine learning technique |
| --- | --- | --- |
| Process | In a supervised learning model, both input and output variables are given. | In an unsupervised learning model, only input data is given. |
| Input Data | Algorithms are trained using labeled data. | Algorithms are applied to data that is not labeled. |
| Algorithms Used | Support vector machines, neural networks, linear and logistic regression, random forests, and classification trees. | Clustering algorithms such as K-means and hierarchical clustering, among other categories. |
| Computational Complexity | Supervised learning is a simpler method. | Unsupervised learning is computationally complex. |
| Use of Data | The model uses training data to learn a link between the inputs and the outputs. | Unsupervised learning does not use output data. |
| Accuracy of Results | Highly accurate and trustworthy method. | Less accurate and trustworthy method. |
| Real Time Learning | Learning takes place offline. | Learning takes place in real time. |
| Number of Classes | Number of classes is known. | Number of classes is not known. |
| Main Drawback | Classifying big data can be a real challenge in supervised learning. | You cannot get precise information about how the data is sorted, because the data used in unsupervised learning is unlabeled and the output is not known. |
Supervised Learning: Regression

• Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)


• Learn a function f(x) to predict y given x
– y is real-valued == regression

Data from G. Witt. Journal of Statistics Education, Volume 21, Number 1 (2013)
Supervised Learning: Classification
• Given (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)
• Learn a function f(x) to predict y given x
– y is categorical == classification

Based on example by Andrew Ng


Supervised Learning
• x can be multi-dimensional
– Each dimension corresponds to an attribute

[Figure: scatter plot with axes Tumor Size and Age. Further attributes: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape]

Based on example by Andrew Ng


Unsupervised Learning
• Given x_1, x_2, ..., x_n (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
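As a sketch of what "output hidden structure" can mean for clustering, the block below implements plain k-means on a handful of 2-D points. K-means is one common clustering method chosen here for illustration. The points, the value k = 2, and the function names are assumptions made for this example.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, iterations=20, seed=0):
    """Group unlabeled points into k clusters (no labels are used)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = tuple(sum(coords) / len(members) for coords in zip(*members))
    return centroids, clusters


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

The algorithm never sees a label; the two groups emerge only from the distances between the x's.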
Unsupervised Learning
Genomics application: group individuals by genetic similarity

[Figure: data matrix with axes labeled Genes and Individuals]
[Source: Daphne Koller]
Unsupervised Learning

• Organize computing clusters
• Social network analysis
• Market segmentation
• Astronomical data analysis

Image credit: NASA/JPL-Caltech/E. Churchwell (Univ. of Wisconsin, Madison)


Slide credit: Andrew Ng
Unsupervised Learning
• Independent component analysis – separate a
combined signal into its original sources

Image credit: statsoft.com Audio from http://www.ism.ac.jp/~shiro/research/blindsep.html


Reinforcement Learning
• Given a sequence of states and actions with
(delayed) rewards, output a policy
– Policy is a mapping from states → actions that
tells you what to do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze
– Balance a pole on your hand
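A minimal sketch of learning a policy from delayed rewards is tabular Q-learning on a one-dimensional corridor, loosely in the spirit of the "robot in a maze" example. Q-learning is one standard algorithm chosen here for illustration. The environment, the reward of 1 at the right-most cell, and all parameter values are assumptions made for this example.

```python
import random


def q_learning_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                        epsilon=0.2, seed=0):
    """Learn a state -> action policy for a corridor whose goal is the last cell."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:    # explore
                action = rng.randrange(2)
            else:                         # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0   # delayed reward
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    # The policy maps each non-terminal state to its best-valued action.
    policy = {s: ("right" if q[s][1] > q[s][0] else "left") for s in range(goal)}
    return policy, q


policy, q_values = q_learning_corridor()
print(policy)
```

Only the final step is rewarded, yet the learned values propagate backwards so that earlier states also prefer the action that leads towards the goal. This is the credit assignment problem in miniature.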
Machine Learning Steps
• Gathering data (data collection and collation)
• Data cleaning/preprocessing (preparing the data)
• Choosing a model (supervised or unsupervised)
• Training (including splitting the data sets)
• Evaluation (performance measures, epochs, stopping condition, cost function)
• Parameter tuning/hyperparameter tuning
• Prediction (forecasting, classification, recognition, identification)
Machine learning tasks
• Classification
• Classification with missing inputs
• Regression
• Transcription
• Machine Translation
• Structured output
• Anomaly detection
• Synthesis and sampling
• Imputation of missing values
• Denoising
• Density estimation
Learning Algorithm (LA)
• Ability to learn from data
• Learns from experience ‘E’ with respect to tasks ‘T’ and performance measure ‘P’
• Example – a collection of features that have been quantitatively measured from an object or event

Performance Measure
• Accuracy – Proportion of examples for which the model produces the correct output
• Error rate / Expected loss - Proportion of examples for which the model produces the incorrect output
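The two performance measures above can be computed directly from their definitions. The sketch below assumes classification with exact-match labels. The label and prediction lists are hypothetical values used only to exercise the functions.

```python
def accuracy(y_true, y_pred):
    """Proportion of examples for which the prediction equals the label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def error_rate(y_true, y_pred):
    """Proportion of examples for which the prediction is wrong."""
    return 1.0 - accuracy(y_true, y_pred)


# Hypothetical labels and model outputs, invented for illustration.
labels = [1, 0, 1, 1, 0]
predictions = [1, 0, 0, 1, 1]
print(accuracy(labels, predictions))    # 3 of the 5 predictions match
print(error_rate(labels, predictions))  # the remaining 2 of 5 do not
```

Accuracy and error rate always sum to one, so reporting either of them is sufficient.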

Model evaluation
• Data set (Design Matrix) - Collection of examples
• Test data, Test error (generalization error)
• Train data, Training error
Simple example of a Learning algorithm
• Linear Regression
• A system that takes a vector as input and predicts a scalar as output

w – vector of parameters that control the behaviour of the system
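A minimal sketch of linear regression, assuming the usual form y_hat = w^T x and a mean-squared-error training objective minimised by gradient descent. The slide does not state the fitting procedure, so that choice, the learning rate, and the toy data are assumptions.

```python
def predict(w, x):
    """Linear model: y_hat = w^T x (dot product of parameters and input)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mean_squared_error(w, X, y):
    return sum((predict(w, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)


def fit(X, y, learning_rate=0.05, steps=2000):
    """Minimise the training MSE by plain gradient descent."""
    w = [0.0] * len(X[0])
    n = len(y)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += 2.0 * err * xij / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Toy data generated from y = 2*x1 - 1*x2 (invented for illustration).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -1.0, 1.0, 3.0]
w = fit(X, y)
print(w, mean_squared_error(w, X, y))
```

Each entry of w scales the influence of one input feature on the prediction, which is the sense in which w controls the behaviour of the system.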

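The performance of a machine learning algorithm (MLA) is highly dependent upon two factors, namely its ability to:

• Minimize the training error
• Minimize the gap between the training and test error

These factors are studied under the IID assumption: the examples are Independent and Identically Distributed.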
ML Model Metrics
• Statistical Fit
• Overfitting – gap between the training and test error is large
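• Underfitting – inability to attain a low error value on the training set
• Capacity – ability to fit a wide variety of functions
• Hypothesis space – set of functions that the LA is allowed to select as the solution
• Bias – systematic error: how far the model's expected prediction is from the true value
• Variance – how much the learned model changes with the particular training sample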
• Generalisation – Ability to perform well on unobserved inputs
• Regularisation – Modification of LA to reduce its generalization error
Underfitting and overfitting - Example
Graphical illustration of Performance metrics
Techniques to reduce underfitting:
1. Increase model complexity.
2. Increase the number of features by performing feature engineering.
3. Remove noise from the data.
4. Increase the number of epochs or the duration of training to get better results.

Techniques to reduce overfitting:
1. Increase training data.
2. Reduce model complexity.
3. Use early stopping during the training phase: keep an eye on the loss over the training period and stop training as soon as the loss begins to increase.

Source:https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning/
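The early-stopping rule listed above can be sketched as a small helper. It assumes the monitored loss is measured once per epoch on held-out validation data. The `patience` parameter and the loss values are hypothetical additions for illustration.

```python
def early_stopping_epoch(validation_losses, patience=1):
    """Return the epoch with the lowest loss seen before training is stopped.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # the loss has started to rise: stop training
    return best_epoch


# Hypothetical per-epoch validation losses, invented for illustration.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.49]
print(early_stopping_epoch(losses, patience=1))
print(early_stopping_epoch(losses, patience=3))
```

A larger patience tolerates short-lived increases in the loss at the cost of extra training epochs.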
Generalisation & Regularisation
• The central challenge in machine learning is that we must perform well on
new, previously unseen inputs, not just those on which our model was trained.
The ability to perform well on previously unobserved inputs is called generalization.
• The generalization error (test error) should be low.
• We typically estimate the generalization error of a machine learning model by
measuring its performance on a test set of examples that were collected
separately from the training set.
• Regularization – any modification we make to a learning algorithm that is
intended to reduce its generalization error, for example by increasing or
decreasing the model’s capacity. There is no best form of regularization;
instead we must choose a form of regularization that is well suited to the
particular task we want to solve.
Hyperparameters and Validation Sets
• Most machine learning algorithms have several settings that we can use
to control the behavior of the learning algorithm. These settings are
called hyperparameters
• Typically, one uses about 80% of the training data for training and 20%
for validation. Since the validation set is used to “train” the
hyperparameters, the validation set error will underestimate the
generalization error, though typically by a smaller amount than the
training error. After all hyperparameter optimization is complete, the
generalization error may be estimated using the test set.
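• Cross-validation – k-fold cross-validation splits the dataset into k non-overlapping subsets; each subset is held out once for evaluation while the remaining subsets are used for training.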
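The 80/20 split and the k-fold scheme described above can be sketched on example indices alone. The function names are hypothetical. Shuffling is left out for brevity, whereas in practice the examples would usually be shuffled first.

```python
def train_validation_split(n_examples, train_fraction=0.8):
    """Split indices into a training part and a validation part."""
    cut = int(n_examples * train_fraction)
    indices = list(range(n_examples))
    return indices[:cut], indices[cut:]


def k_fold_indices(n_examples, k):
    """Partition indices 0..n_examples-1 into k non-overlapping folds."""
    if not 2 <= k <= n_examples:
        raise ValueError("k must be between 2 and n_examples")
    sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def k_fold_splits(n_examples, k):
    """Yield (train_indices, held_out_indices), one pair per fold."""
    folds = k_fold_indices(n_examples, k)
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


print(train_validation_split(10))
for train_idx, held_out_idx in k_fold_splits(10, 5):
    print(train_idx, held_out_idx)
```

Averaging the error over the k held-out folds gives an estimate that uses every example for evaluation exactly once, which helps when the dataset is too small for a single fixed split.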
Machine Learning (ML) Algorithms
• Deep Learning
• Ensemble
• Neural Networks
• Regularisation
• Rule Based
• Regression
• Bayesian
• Decision Tree
• Dimensionality Reduction
• Instance Based
• Clustering
