
GAMBELLA UNIVERSITY
FACULTY OF NATURAL AND COMPUTATIONAL SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
INDIVIDUAL ASSIGNMENT

NAME: BRKE YESHAMBEL    ID: 0186

Submission date: Dec 18, 2023

Submission to: Mr. Aklilu Thomas


Table of Contents

1. What is a support vector machine?
1.2 How does a support vector machine work?
1.3 Formula of the support vector machine
1.4 Advantages of the support vector machine
1.5 Disadvantages of the support vector machine

1. What is a support vector machine?

Support Vector Machine (SVM) is a supervised machine learning algorithm. SVM's purpose is to predict the class of a query sample from labeled input data that are separated into two classes by a margin. Specifically, the data is transformed into a higher dimension, and a support vector classifier is used as a threshold (or hyperplane) to separate the two classes with minimum error.

A support vector machine (SVM) is a type of supervised learning algorithm used in machine learning to solve classification and regression tasks; SVMs are particularly good at solving binary classification problems, which require classifying the elements of a data set into two groups.

The aim of a support vector machine algorithm is to find the best possible line,
or decision boundary, that separates the data points of different data classes. This
boundary is called a hyperplane when working in high-dimensional feature spaces.
The idea is to maximize the margin, which is the distance between the hyperplane
and the closest data points of each category, thus making it easy to distinguish
data classes.
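The decision rule described above can be sketched in a few lines of plain Python: a point is assigned to a class according to which side of the hyperplane it falls on. The weight vector and bias below are hypothetical, chosen only to illustrate the rule, not learned from data.

```python
# Toy decision rule: classify a point by the sign of w·x + b.
# The weights w and bias b here are hypothetical, not learned.
w = [2.0, -1.0]   # hypothetical weight vector
b = -0.5          # hypothetical bias term

def classify(x):
    # Signed score relative to the hyperplane w·x + b = 0.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "Class B" if score >= 0 else "Class A"

print(classify([1.0, 1.0]))   # positive side of the hyperplane
print(classify([0.0, 1.0]))   # negative side of the hyperplane
```

A real SVM differs only in how w and b are chosen: it picks the pair that maximizes the margin on the training data.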

1.2 How does a support vector machine work?

Step 1: Transform training data from a low dimension into a higher dimension.

Step 2: Find a Support Vector Classifier [also called Soft Margin Classifier] to

separate the two classes [Kernel Trick].

Step 3: Return the class label → prediction of the query sample!
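As a concrete sketch of these three steps, the snippet below uses scikit-learn (assumed to be available). SVC with an RBF kernel performs the implicit mapping into a higher dimension and fits the soft margin classifier (Steps 1 and 2, via the kernel trick), and predict returns the class label of a query sample (Step 3). The toy data is made up for illustration.

```python
from sklearn.svm import SVC

# Toy training data: two well-separated clusters (hypothetical values).
X = [[0, 0], [1, 1], [1, 0], [8, 8], [9, 9], [8, 9]]
y = ["A", "A", "A", "B", "B", "B"]

# RBF kernel handles the higher-dimensional transform implicitly;
# C controls the softness of the margin.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)

# Step 3: predict the class label of query samples.
print(clf.predict([[0.5, 0.5], [8.5, 8.5]]))
```

Query samples near each cluster are assigned that cluster's label.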

Example of the Algorithm

Let’s start off with the basics…

Maximal Margin Classifier — the threshold is assigned to the midpoint between the edge observations of each class cluster; this threshold gives the largest distance between the two classes, yielding the maximal margin.

Maximal Margin Classifier — Correct Classification Example [Image by Author]

 Example: Since the query sample falls to the right of the threshold, the query sample is classified as Class B, which is intended! There is a bias/variance tradeoff: high bias (the selected threshold is not sensitive to outliers) and low variance (it performs well on new query samples).

 Issue: What happens when an outlier is present?

Maximal Margin Classifier — Incorrect Classification Example [Image by Author]

 Example: Since the query sample falls to the left of the threshold, the query sample is classified as Class A, which is NOT intended! Intuitively, this does not make sense, as the query sample is closer to the Class B cluster than to the Class A cluster. There is a bias/variance tradeoff: low bias (the selected threshold is sensitive to outliers) and high variance (it performs poorly on new query samples).

 Solution: Since the Maximal Margin Classifier is very sensitive to outliers in the training data, it is necessary to select a threshold that is not sensitive to outliers and allows misclassifications → Soft Margin Classifier.
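This outlier sensitivity can be seen in a tiny one-dimensional sketch (the observations below are made up): the maximal-margin threshold sits at the midpoint between the closest edge observations of the two clusters, so a single outlier drags it far toward the other class.

```python
# 1-D Maximal Margin Classifier: threshold at the midpoint between the
# closest edge observations of the two clusters (hypothetical toy data).
class_a = [1.0, 2.0, 3.0]
class_b = [7.0, 8.0, 9.0]
threshold = (max(class_a) + min(class_b)) / 2   # midpoint of 3.0 and 7.0

# A single Class A outlier near the Class B cluster drags the threshold:
class_a_with_outlier = class_a + [6.5]
shifted = (max(class_a_with_outlier) + min(class_b)) / 2

print(threshold, shifted)  # 5.0 6.75
```

With the shifted threshold, new samples near the Class B cluster would be mislabeled as Class A, which is exactly the failure mode described above.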

Soft Margin Classifier — the threshold is allowed to make an acceptable number of misclassifications while still classifying new data points correctly; cross-validation is used to determine how many misclassifications and observations are allowed inside the soft margin to obtain the best classification. [Support Vector Classifier is another name for the Soft Margin Classifier.]

Soft Margin Classifier — Correct Classification Example [Image by Author]

Example: Since the query sample falls to the right of the threshold, the query sample is classified as Class B, which is intended! One misclassification was made to find the optimal threshold.
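One common way the soft margin's "acceptable misclassifications" are made precise is the hinge loss, max(0, 1 - y(w·x + b)), which charges a penalty for points inside the margin or on the wrong side while leaving well-classified points free. The weights and sample points below are hypothetical, chosen only to show the three cases.

```python
# Hinge loss used by the soft-margin formulation: max(0, 1 - y*(w·x + b)).
# Labels y are +1 or -1; w and the samples below are hypothetical.
def hinge(w, b, x, y):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)

w, b = [1.0, 0.0], 0.0
print(hinge(w, b, [2.0, 0.0], +1))   # 0.0: correct and outside the margin
print(hinge(w, b, [0.5, 0.0], +1))   # 0.5: correct but inside the margin
print(hinge(w, b, [-1.0, 0.0], +1))  # 2.0: misclassified outlier
```

Minimizing the total hinge loss (plus a term that widens the margin) is what lets the classifier tolerate a few bad points instead of bending to every outlier.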


1.3 Formula of the support vector machine:

Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression analysis. It is based on the concept of finding a hyperplane that best separates the data points into different classes.

The hyperplane is chosen such that it maximizes the margin between the two classes. The formula for the hyperplane is given by:

wᵀx + b = 0

where w is the weight vector, x is the input vector, and b is the bias. The distance between the hyperplane and the closest data points from each class is called the margin. The goal of SVM is to maximize this margin. The width of the margin can be calculated as:

2 / ∥w∥

where ∥w∥ is the Euclidean norm of the weight vector. SVM can be used for both linear and non-linear classification using the kernel trick. The kernel trick maps the input data into a higher-dimensional space where it is easier to find a separating hyperplane.
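As a quick numerical check of the margin formula, the snippet below computes 2/∥w∥ for a hypothetical weight vector:

```python
import math

# Margin width 2/∥w∥ for a hypothetical weight vector w.
w = [3.0, 4.0]
norm_w = math.sqrt(sum(wi * wi for wi in w))  # Euclidean norm ∥w∥ = 5.0
margin = 2.0 / norm_w                         # margin width = 0.4
print(norm_w, margin)
```

Scaling w up shrinks the margin, which is why the SVM optimization minimizes ∥w∥ subject to the classification constraints.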

1.4 Advantages of the support vector machine:

1. SVM is very effective even with high-dimensional data.
2. When the number of features exceeds the number of rows of data, SVM can still perform well.
3. When the classes in the data are well separated, SVM works really well.
4. SVM can be used for both regression and classification problems.
5. Last but not least, SVM can also work well with image data.

1.5 Disadvantages of the support vector machine:

1. When the classes in the data are not well separated (i.e., the classes overlap), SVM does not perform well.
2. Choosing an optimal kernel for SVM is difficult.
3. SVM takes comparatively more time to train on large data sets.
4. SVM is not a probabilistic model, so we cannot explain the classification in terms of probability.
