1. What is a support vector machine?
The aim of a support vector machine algorithm is to find the best possible line,
or decision boundary, that separates the data points of different data classes. This
boundary is called a hyperplane when working in high-dimensional feature spaces.
The idea is to maximize the margin, which is the distance between the hyperplane
and the closest data points of each category, thus making it easy to distinguish
data classes.
Step 1: Transform training data from a low dimension into a higher dimension.
Step 2: Find a Support Vector Classifier [also called a Soft Margin Classifier] that separates the higher-dimensional data into two classes.
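The notes do not name a library, but as a minimal sketch these two steps can be reproduced with scikit-learn (an assumption): the kernel performs the higher-dimensional transformation implicitly, and SVC fits the soft margin classifier in that space.

```python
# Minimal sketch of the two steps, assuming scikit-learn and a toy dataset.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy two-class data that is not linearly separable in 2-D.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 1 + Step 2: the RBF kernel implicitly lifts the data into a
# higher-dimensional space, where a soft-margin classifier is fit.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
```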
Maximal Margin Classifier: the threshold is assigned to the midpoint between the observations on the edges of each class cluster; this threshold gives the largest possible margin.
Example: Since the query sample falls to the right of the threshold, the query sample is classified as Class B, as intended.
However, the maximal margin classifier is sensitive to outliers in the training data. If a Class A outlier lies close to the Class B cluster, the midpoint threshold is dragged toward Class B.
Example: Since the query sample falls to the left of the threshold, the query sample is classified as Class A, which is NOT intended! Intuitively, this does not make sense, as the query sample is closer to the Class B cluster. Here the maximal margin classifier makes a poor tradeoff: it has low bias (the selected threshold is sensitive to outliers and fits the training data tightly) but high variance (it performs poorly on new data).
Support Vector Classifier [Soft Margin Classifier]: allow a small amount of misclassifications so that new data points can still be classified correctly. This is a better bias/variance tradeoff, since there is high bias (the selected threshold is not sensitive to outliers) but low variance (it performs well on new data). Cross-validation is typically used to decide how many observations are allowed inside the soft margin to obtain the best classifier [hence the name Soft Margin Classifier].
Example: Since the query sample falls to the right of the threshold, the query sample is now classified as Class B, as intended.
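A hedged sketch of the outlier scenario above, again assuming scikit-learn (not named in the notes): a one-dimensional dataset with a Class A outlier near the Class B cluster, where the regularization parameter C controls how soft the margin is.

```python
import numpy as np
from sklearn.svm import SVC

# Class A clustered near -2, plus one outlier at +1.5 that sits near Class B.
class_a = np.array([-2.4, -2.2, -2.0, -1.8, -1.6, 1.5]).reshape(-1, 1)
# Class B clustered near +2.
class_b = np.array([2.0, 2.2, 2.4, 2.6, 2.8]).reshape(-1, 1)
X = np.vstack([class_a, class_b])
y = np.array([0] * len(class_a) + [1] * len(class_b))

query = np.array([[1.7]])  # intuitively much closer to the Class B cluster
for C in (1.0, 1e6):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C}: query -> class {clf.predict(query)[0]}")
```

With C=1.0 the soft margin tolerates the outlier and the query is assigned to Class B; with a very large C the classifier approximates the maximal margin classifier, the outlier drags the threshold toward Class B, and the query is misclassified as Class A.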
The hyperplane is chosen such that it maximizes the margin between the two
classes. The formula for the hyperplane is given by:
w^T x + b = 0
Where w is the weight vector, x is the input vector, and b is the bias. The distance
between the hyperplane and the closest data points from each class is called the
margin. The goal of SVM is to maximize this margin. The margin can be calculated
as:
2 / ∥w∥
Where ∥w∥ is the Euclidean norm of the weight vector. SVM can be used for both
linear and non-linear classification using the kernel trick. The kernel trick maps the
input data into a higher-dimensional space where it is easier to find a hyperplane.
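To make the formulas concrete, here is a small sketch (assuming scikit-learn, which the text does not name) that reads w and b off a fitted linear SVM and evaluates the margin 2/∥w∥.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs, so a near-maximal margin is meaningful.
X, y = make_blobs(n_samples=60, centers=2, cluster_std=0.6, random_state=0)

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C ~ maximal margin
w = clf.coef_[0]        # weight vector of the hyperplane w^T x + b = 0
b = clf.intercept_[0]   # bias term

margin_width = 2.0 / np.linalg.norm(w)       # the 2/||w|| formula above
print("w =", w, " b =", b)
print("margin width 2/||w|| =", margin_width)

# The support vectors lie on the margin boundaries w^T x + b = +/-1:
print("support vector scores:", X[clf.support_] @ w + b)
```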
Advantages of SVM:
1. Support vector machines are very effective even with high-dimensional data.
2. SVM also performs well when the number of features is larger than the number of rows of data.
3. When the classes in the data are well separated, SVM works really well.
4. SVM can be used for both regression and classification problems.
5. Last but not least, SVM can work well with image data.
Disadvantages of SVM:
1. When the classes in the data are not well separated, i.e., there are overlapping classes, SVM does not perform well.
2. We need to choose an optimal kernel for SVM, and this task is difficult.
3. SVM takes comparatively more time to train on large datasets.
4. SVM is not a probabilistic model, so we cannot explain the classification in terms of probability.
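To illustrate point 4, a brief sketch assuming scikit-learn: a plain SVC exposes signed distances to the hyperplane rather than class probabilities; probability estimates require an extra post-hoc calibration step.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, random_state=0)
clf = SVC(kernel="rbf").fit(X, y)

# decision_function returns margin scores, which are not probabilities.
print(clf.decision_function(X[:3]))
# SVC(probability=True) adds a calibrated estimate on top (Platt scaling),
# but this is a separate post-hoc fit, not a native probabilistic output.
```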