
SUPPORT VECTOR MACHINE

(Optimization in Machine Learning)

Presented By
Anisha N Rao & Aditi Goel

Department Of Data Science


Prasanna School Of Public Health
Manipal Academy Of Higher Education

May 13, 2022

Table of Contents

1 Introduction

2 Support Vector Classification

3 Support Vector Regression

4 Case Study

5 SVC Implementation

6 Conclusion

Introduction

Support Vector Machine (SVM) is one of the most popular supervised learning algorithms; it is used for classification as well as regression problems.
The whole concept of SVM relies on the appropriate construction of a hyperplane, in both classification and regression. SVM generates this hyperplane iteratively so that the error is minimized.

Support Vector Classification

Basic Concepts
• Support Vectors : Data points that are closest to the hyperplane are called support vectors. The separating line is defined with the help of these data points.
• Hyperplane : As the diagram shows, it is the decision plane or boundary that divides a set of objects belonging to different classes.
• Margin : The gap between the two lines through the closest data points of different classes. It can be calculated as the perpendicular distance from the separating line to the support vectors. A large margin is considered a good margin and a small margin a bad one.

Figure: Classification Visualization Using SVC
Hyperplane

• In geometry, a hyperplane is a subspace whose dimension is one less than that of its ambient space. A hyperplane separates the space into two half-spaces.
• If a space is 3-dimensional, then its hyperplanes are 2-dimensional planes.
• If a space is 2-dimensional, then its hyperplanes are 1-dimensional lines.
• If a space is 1-dimensional, then its hyperplanes are single points.

Figure: Hyperplane in 3D
Hyperplane Visualization

Figure: Hyperplane in 2D

Figure: Hyperplane in 1D

Geometry

• The distance between a point (x₀, y₀) and a line ax + by + c = 0 is equal to:

d = |ax₀ + by₀ + c| / √(a² + b²)

Figure: Perpendicular distance
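As a quick numerical check, a minimal Python sketch of this formula (the function name here is ours, for illustration):

```python
import math

def point_line_distance(x0, y0, a, b, c):
    """Perpendicular distance from the point (x0, y0) to the line ax + by + c = 0."""
    return abs(a * x0 + b * y0 + c) / math.sqrt(a ** 2 + b ** 2)

# Example: distance from (1, 2) to the line x + y - 1 = 0
print(point_line_distance(1, 2, 1, 1, -1))  # 2 / sqrt(2) ≈ 1.4142
```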


Types Of SVC

• Linear SVC : Used for linearly separable data. If a data set can be classified into two classes by a single straight line, the data are termed linearly separable, and the classifier used is called a linear SVM classifier.
• Non-linear SVC : Used for non-linearly separable data. If a data set cannot be classified by a straight line, the data are termed non-linear, and the classifier used is called a non-linear SVM classifier. A sketch contrasting the two follows below.
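A minimal scikit-learn sketch contrasting the two classifiers (the data set and parameter values are illustrative assumptions, not from the original slides):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: a classic non-linearly separable data set
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_clf = SVC(kernel="linear").fit(X_train, y_train)  # linear SVC
rbf_clf = SVC(kernel="rbf").fit(X_train, y_train)        # non-linear SVC

print("Linear SVC accuracy:", linear_clf.score(X_test, y_test))
print("RBF SVC accuracy:", rbf_clf.score(X_test, y_test))
```

On data like this, the linear classifier cannot follow the curved class boundary, while the RBF classifier can.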

Maximum Margin Classifier - Hard Margin

• The maximum margin classifier (MMC) is the hyperplane that, among all separating hyperplanes, makes the biggest gap (margin) between the two classes.
• The core idea of the hard margin is to maximize the margin under the constraint that the classifier does not make any mistake.

Figure: MMC
If we numerically define blue circles as +1 and green circles as −1, any good linear model f(x) = wᵀx + b is expected to satisfy:

yᵢ(wᵀxᵢ + b) ≥ 1 for every training point (xᵢ, yᵢ)

The optimization problem is therefore:

minimize (1/2)‖w‖² over w, b subject to yᵢ(wᵀxᵢ + b) ≥ 1, i = 1, …, n
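scikit-learn has no explicit hard-margin mode, but on separable data a very large C approximates one; a sketch under that assumption:

```python
import numpy as np
from sklearn.svm import SVC

# Linearly separable toy data: two well-separated groups labelled -1 / +1
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A huge C makes any margin violation prohibitively expensive,
# so the solution approaches the maximum margin classifier
clf = SVC(kernel="linear", C=1e6).fit(X, y)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
print("support vectors:\n", clf.support_vectors_)
```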

Support Vector Classifier - Soft Margin

We can extend the concept of a separating hyperplane in order to develop a hyperplane that almost separates the classes, using a so-called soft margin.
• The generalization of the maximal margin classifier using a soft margin is known as the support vector classifier (SVC).
• It can be worthwhile to misclassify a few training observations in order to do a better job of classifying the remaining observations.
• Slack variables ξᵢ allow some observations to fall on the wrong side of the margin, but such violations are penalized through the parameter C, the cost of misclassification. The standard soft-margin problem is:

minimize (1/2)‖w‖² + C Σᵢ ξᵢ subject to yᵢ(wᵀxᵢ + b) ≥ 1 − ξᵢ, ξᵢ ≥ 0
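A small sketch of how C trades margin width against misclassification (the data and C values are illustrative):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping classes, so some slack is unavoidable
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1, 100):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # A small C tolerates more margin violations, giving more support vectors
    print(f"C={C}: {len(clf.support_vectors_)} support vectors")
```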

Figure: Soft Margin
The Kernel Trick

Sometimes a linear boundary simply won't work, no matter what the value of C is.

Figure: Classification In Higher Dimension

The kernel is a function that quantifies the similarity between observations by summarizing the relationship between every pair of points in the training set.
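For instance, the widely used RBF kernel scores how similar two observations are; a minimal sketch:

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2): 1 for identical points, -> 0 as they move apart."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([2.0, 3.0])
print(rbf_kernel(x, z))  # exp(-2) ≈ 0.1353
```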

Types Of Kernel

Figure: Kernels

Comparing Kernels

Figure: Accuracy Of Kernels
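The figure reports the accuracies of the different kernels; a sketch of how such a comparison can be produced (the data set here is an illustrative stand-in, not the one from the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Cross-validated accuracy for each built-in kernel
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print(f"{kernel:8s} mean accuracy: {scores.mean():.3f}")
```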


Support Vector Regression

SVR Overview
• A regression model estimates a continuous-valued multivariate function.
• SVMs formulate binary classification as a convex optimization problem.
Optimization problem goals:
• find the maximum-margin separating hyperplane, and
• at the same time, correctly classify as many training points as possible.
• SVMs represent this optimal hyperplane with support vectors.

SVR: Concepts, Mathematical Model, and Graphical
Representation

• Consider the following image where the middle line,

f (x) = wx + b

represents a hyperplane and the two dotted lines around the hyperplane

y = f (x) + ϵ
and
y = f (x) − ϵ
represent the decision boundaries, where ϵ is the distance from the hyperplane.

Graphical Presentation

• The main aim of the support vector regression model is to place the decision boundaries at a distance ϵ from the original hyperplane such that the data points closest to the hyperplane, the support vectors, lie within those boundary lines. Any hyperplane that satisfies the SVR should therefore also satisfy:

−ϵ ≤ y − (wx + b) ≤ ϵ for every training point
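A minimal scikit-learn SVR sketch showing the role of ϵ (all values are illustrative):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)

# Points strictly inside the epsilon-tube incur no loss,
# so a wider tube needs fewer support vectors
for eps in (0.01, 0.1, 0.5):
    svr = SVR(kernel="rbf", epsilon=eps).fit(X, y)
    print(f"epsilon={eps}: {len(svr.support_)} support vectors")
```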

SVR generalization to SVC

Introduce an ϵ-insensitive region around the function, called the ϵ-tube. This tube reformulates the optimization problem: find the tube that best approximates the continuous-valued function while balancing model complexity and prediction error.
• The hyperplane is represented in terms of support vectors.
• Training and test data are assumed to be independent and identically distributed (iid), drawn from the same fixed but unknown probability distribution, in a supervised-learning context.
Adopting a soft-margin approach similar to that employed in SVM classification, slack variables ξ, ξ∗ can be added to guard against outliers. These variables determine how many points can be tolerated outside the tube, giving the standard formulation:

minimize (1/2)‖w‖² + C Σᵢ(ξᵢ + ξᵢ∗) subject to yᵢ − (wᵀxᵢ + b) ≤ ϵ + ξᵢ, (wᵀxᵢ + b) − yᵢ ≤ ϵ + ξᵢ∗, ξᵢ, ξᵢ∗ ≥ 0
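The corresponding ϵ-insensitive loss, in a short sketch, is zero inside the tube and grows linearly outside it:

```python
import numpy as np

def epsilon_insensitive_loss(y, f_x, epsilon=0.1):
    # Residuals smaller than epsilon cost nothing; larger ones cost linearly
    return np.maximum(0.0, np.abs(y - f_x) - epsilon)

print(epsilon_insensitive_loss(np.array([1.0, 2.0]), np.array([1.05, 3.0])))
# -> [0.  0.9]
```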

Basic details on Loss Function

Interpret L(x, y, f(x)) as the cost, or loss, of predicting y by f(x) when x is observed.
• The smaller the value of L(x, y, f(x)), the better f(x) predicts y in the sense of L.
• L penalizes predictions whose signs disagree with that of y.
• Constant loss functions, such as L := 0, are rather meaningless for our purposes.
Loss functions should be convex to ensure that the optimization problem
has a unique solution that can be found in a finite number of steps. A few
examples of loss functions:

A few examples of loss functions

(a) Linear loss function
(b) Quadratic loss function
(c) Huber loss function
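Sketches of these three losses as functions of the residual r = y − f(x) (the Huber threshold δ is an assumed parameter):

```python
import numpy as np

def linear_loss(r):
    return np.abs(r)       # (a) linear: robust to outliers, non-smooth at 0

def quadratic_loss(r):
    return r ** 2          # (b) quadratic: smooth, but outlier-sensitive

def huber_loss(r, delta=1.0):
    # (c) Huber: quadratic near zero, linear in the tails
    return np.where(np.abs(r) <= delta,
                    0.5 * r ** 2,
                    delta * (np.abs(r) - 0.5 * delta))

r = np.linspace(-3.0, 3.0, 7)
print(linear_loss(r), quadratic_loss(r), huber_loss(r), sep="\n")
```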
Figure: Solutions For SVR with various orders of polynomial
This graph visualizes how the magnitude of the weights can be interpreted as a measure of flatness.
• The horizontal line is a 0th-order polynomial solution; it deviates greatly from the desired outputs, so the error is large.
• The linear function produces better approximations for a portion of the data but still underfits the training data.
• The 4th-order solution produces the best tradeoff between function flatness and prediction error.
• The higher-order solution achieves zero training error at the price of high complexity, and overfits on yet-unseen data; see the sketch below.
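Below, polynomial kernels of increasing degree reproduce that tradeoff (the data set and degrees are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = np.linspace(-1, 1, 40).reshape(-1, 1)
y = np.sin(3 * X).ravel() + rng.normal(0, 0.1, 40)

# coef0=1 gives the full polynomial basis up to the chosen degree
for degree in (1, 4, 12):
    svr = SVR(kernel="poly", degree=degree, coef0=1, C=100).fit(X, y)
    # Training fit improves with degree, but very high degrees overfit
    print(f"degree={degree}: train R^2 = {svr.score(X, y):.3f}")
```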

Case Study of SVM for Handwriting Recognition

(We also discuss the difference between offline sensing and online recognition.)


• We present here a case study on developing an efficient writer-independent handwriting recognition (HWR) system for isolated letters, using pen-trajectory modeling for feature extraction.
• The proposed HWR workflow is composed of preprocessing; feature extraction; and a hierarchical, three-stage classification phase.
• Preprocessing comprises correcting the slant, normalizing the dimensions of the letter, and shifting the coordinates with respect to the center of mass.

Preprocessing

Figure: Examples of letters before (left) and after (right) preprocessing

Feature Extraction

The preprocessed data consist of strokes of coordinate pairs [x(t), y(t)]. Modeling these data with a pen-trajectory technique, a set of features is obtained by averaging the following functions:

Hierarchical, Three-Stage SVM

A three-stage classifier recognizes one of the 52 classes (26 lowercase and 26 uppercase letters).
• Using a binary SVM classifier, the first stage classifies the instance as one of two classes: uppercase or lowercase letter.
• Using a one-against-all (OAA) SVM, the second stage classifies the instance into one of the manually determined clusters.
• Using an OAA SVM with a simple majority vote, the third stage identifies the letter as one of the 52 classes (or subclusters). A schematic sketch follows below.
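A schematic sketch of the hierarchy with synthetic data standing in for the letter features (the real system's features and clusters differ, and stages two and three are collapsed into one per-case classifier here):

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in features; labels 0-25 = lowercase, 26-51 = uppercase
X = rng.normal(size=(520, 10))
letters = np.repeat(np.arange(52), 10)
is_upper = (letters >= 26).astype(int)

# Stage 1: binary SVM decides uppercase vs lowercase
stage1 = SVC(kernel="rbf").fit(X, is_upper)

# Stages 2-3 (collapsed): one-against-all SVMs within each case group
lower_clf = OneVsRestClassifier(SVC()).fit(X[letters < 26], letters[letters < 26])
upper_clf = OneVsRestClassifier(SVC()).fit(X[letters >= 26], letters[letters >= 26])

def predict_letter(x):
    # Route the instance through stage 1, then classify within its case group
    x = x.reshape(1, -1)
    clf = upper_clf if stage1.predict(x)[0] == 1 else lower_clf
    return clf.predict(x)[0]

print(predict_letter(X[0]))
```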

Clustering for the uppercase and lowercase letters of the alphabet.

3-stage Hierarchical SVM Block Diagram

Confusion plot for classified label versus true label

Table Of Experimental Results

Experimental results showed an average accuracy of 91.7%. The three stages of the classifier achieved 99.3%, 95.7%, and 96.5% accuracy, respectively. Our proposed preprocessing helped improve the overall accuracy of the recognizer by approximately 1.5% to 2%.
Recognition rates comparison:

SVC Implementation
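A self-contained end-to-end SVC workflow in scikit-learn; the data set, split, and parameter grid below are illustrative assumptions, not the original slides' code:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a binary classification data set and hold out a test split
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Scale features first: SVMs are sensitive to feature magnitudes
pipe = make_pipeline(StandardScaler(), SVC())

# Search over kernel, regularization strength C, and RBF width gamma
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(classification_report(y_test, search.predict(X_test)))
```

Wrapping the scaler and classifier in one pipeline ensures the scaler is fitted only on the training folds during the grid search, avoiding data leakage.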

Conclusion

SVM Advantages
• Support vector machines are very effective even with high-dimensional data.
• When the number of features exceeds the number of rows of data, SVM can still perform well.
• When the classes in the data are well separated, SVM works really well.
• SVM can be used for both regression and classification problems.
• Last but not least, SVM can work well with image data.

SVM Disadvantages
• When the classes in the data are not well separated, that is, when there are overlapping classes, SVM does not perform well.
• We need to choose an optimal kernel for SVM, and this task is difficult.
• SVM comparatively takes more time to train on large data sets.
• SVM is not a probabilistic model, so we cannot explain the classification in terms of probability.
• The SVM model is more difficult to understand and interpret than, for example, a decision tree, as SVM is more complex.

THANK YOU
