
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

RESEARCH ARTICLE

OPEN ACCESS

Data Classification Using Support Vector Machine: A Review


Er. Manju Rani [1], Er. Lekha Bhambhu [2]
M.Tech scholar [1], HOD [2]
Department of Computer Science and Engineering and Information Technology
JCDMCOE, Sirsa
Haryana - India

ABSTRACT
Classification is a learning function that maps a given data item into one of several predefined classes. The learned function helps to distinguish known samples and to identify unknown samples in a dataset. Support vector machines compare favourably with other classification techniques. They are machine learning algorithms widely used in applications such as tone recognition, voice recognition, fingerprint matching and handwriting recognition. This paper presents an overview of the basic ideas of Support Vector (SV) machines and of recent developments in the field. Furthermore, we include a summary of the kernel functions currently used for training and testing SV machines.
Keywords:- Data Classification, Support Vector Machine, Kernel Function.

I. INTRODUCTION

The support vector algorithm is a nonlinear generalization of work developed in Russia in 1963 by Vapnik and Lerner, and it is grounded in the framework of statistical learning theory. Since the introduction of the SV algorithm, many researchers have extended the algorithms and their theoretical analysis, merging concepts from statistics, machine learning and functional analysis and introducing the soft margin classifier. In 1992 the SV machine was developed at AT&T Bell Laboratories by Vapnik and co-workers, and in 1995 it was extended to the SVM regression algorithm, which contains polynomial classifiers, neural networks and radial basis function (RBF) networks as particular cases.

The main feature that makes SVM more popular than other classifiers is that it has been developed as a robust tool for classification and regression on both linear and nonlinear datasets, and it remains accurate and effective whether the training sample is very large or very small. SVMs perform classification by constructing, in an N-dimensional space, a maximum-margin hyperplane that separates the data of the two classes; this is why SVMs are called maximum margin classifiers. Such a hyperplane generalizes to unseen as well as seen data. The main objective of SVM is therefore to find the best separating hyperplane, that is, the hyperplane that provides the maximum margin between the nearest points of the two classes: the larger the margin, the lower the generalization error of the classifier. Figure 1 shows three separating hyperplanes: H1 does not separate the two classes, H2 separates them but with a very thin margin, and H3 separates the two classes with a much better margin than H2.

II. SVM CLASSIFICATION


SVM is a nonlinear classification algorithm. In contrast to linear classification methods, the kernel method maps the training vectors into a higher-dimensional space through a nonlinear kernel function, without ever computing that nonlinear mapping explicitly.
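To make this implicit mapping concrete, the following minimal NumPy sketch (ours, not taken from the paper) shows that the homogeneous polynomial kernel of degree 2 on two-dimensional inputs returns exactly the dot product in a three-dimensional feature space, without ever building that space:

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (x1, x2)."""
    x1, x2 = v
    return np.array([x1 * x1, x2 * x2, np.sqrt(2) * x1 * x2])

def poly2_kernel(a, b):
    """Homogeneous polynomial kernel of degree 2: K(a, b) = (a . b)^2."""
    return np.dot(a, b) ** 2

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

# Both values agree: the kernel computes the dot product in the
# higher-dimensional space without building phi(a) and phi(b) explicitly.
print(poly2_kernel(a, b))       # 16.0
print(np.dot(phi(a), phi(b)))   # 16.0
```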

Figure 1: Separating hyperplanes H1, H2 and H3

2.1 LINEAR SVM

Linear classification is a separation of two linearly separable classes by a hyperplane over all data points, i.e. points belonging to class x are labeled as +1 and points belonging to class y are labeled as -1. We consider some training data D, a set of n points of the form

D = {(xn, yn) | xn ∈ Rp, yn ∈ {-1, 1}}

or, written out,

D = {(x1, y1), (x2, y2), (x3, y3), (x4, y4), ..., (xn, yn)},

where each yn is either 1 or -1 and each xn is a p-dimensional real vector. We want to find the maximum-margin hyperplane that divides the points having yn = 1 from those having yn = -1. A hyperplane can be described by

w . x + b = 0,

where w is the (p-dimensional) normal vector to the hyperplane and . denotes the dot product. The parameter b determines the offset of the hyperplane from the origin. If the training data are linearly separable, we can select two parallel hyperplanes that separate the data so that there are no points between them, and then maximize the distance between these hyperplanes. As we want the maximum margin, these hyperplanes are described by the equations

w . x + b = 1 and w . x + b = -1.

The distance between these two hyperplanes is 2 / ||w||, so we have to minimize ||w||. We also want no data points to fall into the margin, so for each i we add the constraint

w . xi + b ≥ 1 or w . xi + b ≤ -1,

which can be written as

yi (w . xi + b) ≥ 1, for 1 ≤ i ≤ n.
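Off-the-shelf SVM libraries solve exactly this optimisation. As an illustrative sketch only (it assumes scikit-learn, which is not a tool discussed in the paper), a linear SVM can be trained on a toy separable dataset and the learned w, b and margin 2 / ||w|| read back and checked against the constraint above:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable training data: class +1 upper-right, class -1 lower-left.
X = np.array([[2.0, 2.0], [3.0, 3.0], [3.5, 1.5],
              [-2.0, -1.0], [-3.0, -2.5], [-1.5, -3.0]])
y = np.array([1, 1, 1, -1, -1, -1])

# A large C approximates the hard-margin formulation described above.
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector w of the separating hyperplane
b = clf.intercept_[0]   # offset b, so the hyperplane is w . x + b = 0
margin = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b, "margin =", margin)

# Every training point should satisfy yi (w . xi + b) >= 1
# (up to numerical tolerance) when the data are separable.
print(y * (X @ w + b) >= 1 - 1e-6)
```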

2.2 Nonlinear classification


In 1992, Bernhard E. Boser and Vladimir N. Vapnik suggested a way to create nonlinear classifiers by using kernel functions. The use of kernel functions is usually referred to as the "kernel trick"; it was introduced by Aizerman et al. (1964). The nonlinear algorithm is formally similar to the linear one, except that every dot product is replaced by a nonlinear kernel function K(xi, xj) = φ(xi)T φ(xj).

Figure 2

There are many kernel functions that can be used with SVMs. For general purposes, the most popular kernel functions are:

Linear kernel: K(xi, xj) = xiT xj

Polynomial (homogeneous): K(xi, xj) = (xiT xj)^d

Polynomial (inhomogeneous): K(xi, xj) = (xiT xj + 1)^d

RBF kernel: K(xi, xj) = exp(-γ ||xi - xj||^2), γ > 0

Sigmoid kernel: K(xi, xj) = tanh(γ xiT xj + r)

Here, γ, r and d are kernel parameters. Among these popular kernel functions, the RBF kernel is the most widely used, because with the RBF kernel the SV algorithm automatically determines centers, weights and threshold, it suffers fewer numerical difficulties, and it minimizes an upper bound on the observed classification error while maximizing the margin.
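The kernels listed above translate directly into code. The sketch below is a minimal NumPy rendering of those formulas; the parameter names gamma, d and r mirror γ, d and r, and the chosen values are purely illustrative:

```python
import numpy as np

def linear_kernel(xi, xj):
    return np.dot(xi, xj)

def polynomial_kernel(xi, xj, d=3, c=0.0):
    # c = 0 gives the homogeneous form, c = 1 the inhomogeneous form
    return (np.dot(xi, xj) + c) ** d

def rbf_kernel(xi, xj, gamma=0.5):
    # gamma must be strictly positive
    return np.exp(-gamma * np.linalg.norm(xi - xj) ** 2)

def sigmoid_kernel(xi, xj, gamma=0.5, r=1.0):
    return np.tanh(gamma * np.dot(xi, xj) + r)

xi = np.array([1.0, 2.0, 3.0])
xj = np.array([0.5, -1.0, 2.0])

for name, value in [("linear", linear_kernel(xi, xj)),
                    ("poly (inhomogeneous, d=3)", polynomial_kernel(xi, xj, d=3, c=1.0)),
                    ("rbf", rbf_kernel(xi, xj)),
                    ("sigmoid", sigmoid_kernel(xi, xj))]:
    print(name, "=", value)
```

In LIBSVM [4], for example, these parameters correspond to the degree, gamma and coef0 options.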

III. APPLICATIONS OF SVMS

SVMs are used to solve various real-world problems:

- SVMs are helpful in text and hypertext categorization (a small sketch follows this list).
- Classification of images can also be performed using SVMs; experimental results show higher search accuracy than traditional query refinement schemes.
- SVMs are also useful in medical science, for example to classify proteins.
- Hand-written characters can be recognized using SVMs.
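As a small illustration of the text categorization use case mentioned above (assuming scikit-learn with a TF-IDF bag-of-words representation, neither of which is discussed in the paper), a linear SVM text classifier can be set up as follows:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative corpus with two topical classes.
texts = ["the team won the football match",
         "a thrilling cricket innings last night",
         "new processor doubles machine learning speed",
         "the compiler optimises vector instructions"]
labels = ["sports", "sports", "tech", "tech"]

# TF-IDF turns each document into a sparse feature vector;
# LinearSVC then learns a maximum-margin separating hyperplane.
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(texts, labels)

print(clf.predict(["fastest football player"]))  # should print ['sports'] for this toy corpus
```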

LIMITATIONS OF SVM

i. The biggest limitation of SVM lies in the choice of the best kernel.
ii. A second limitation is speed and size, mostly in training for large training sets.
iii. SVMs take a long time to train, and the learned function is difficult to interpret.
iv. The optimal design of multiclass SVM classifiers is a further open issue.

IV. CONCLUSION

SVMs can produce accurate and robust classification results even when the input data are not linearly separable. The support vector machine has been applied to many aspects of data mining, including classification, regression and outlier detection. SVMs belong to a family of generalized linear classifiers. A special property is that they simultaneously minimize the empirical classification error and maximize the geometric margin, which keeps the expected test error low.

REFERENCES

[1] Hsu, Chih-Wei; Chang, Chih-Chung; and Lin, Chih-Jen (2003). A Practical Guide to Support Vector Classification. Department of Computer Science and Information Engineering, National Taiwan University.
[2] Duan, K. B.; Keerthi, S. S. (2005). "Which Is the Best Multiclass SVM Method? An Empirical Study". Multiple Classifier Systems.
[3] Hsu, Chih-Wei; and Lin, Chih-Jen (2002). "A Comparison of Methods for Multiclass Support Vector Machines".
[4] Chang, C.-C. and Lin, C.-J. (2001). LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[5] Srivastava, Durgesh K. and Bhambhu, Lekha (2009). "Data Classification Using Support Vector Machine". Journal of Theoretical and Applied Information Technology.
[6] Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer-Verlag.
[7] Meyer, D.; Leisch, F.; Hornik, K. (2003). "The support vector machine under test". Neurocomputing.
[8] Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer-Verlag.
[9] Yang, Ming-Hsuan. "Gentle Guide to Support Vector Machines".
[10] Mukherjee, Sayan. "Classifying Microarray Data Using Support Vector Machines".
[11] Chen, P.-H.; Lin, C.-J.; and Schölkopf, B. (2005). "A tutorial on ν-support vector machines". Applied Stochastic Models in Business and Industry, 21:111-136. http://www.csie.ntu.edu.tw/~cjlin/papers/nusvmtutorial.pdf.
[12] Arora, Minaxi and Bhambhu, Lekha. "Role of Scaling in Data Classification Using SVM".

