
Intrinsic Discrepancies (Correlation between the targets)

Intrinsic discrepancy is a symmetrized Kullback–Leibler distance between the disease and no-disease feature distributions. Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution differs from a second, reference probability distribution. Applications include characterizing relative (Shannon) entropy in information systems, randomness in continuous time series, and information gain when comparing statistical models of inference. In contrast to variation of information, it is a distribution-wise asymmetric measure and thus does not qualify as a statistical metric of spread (it also does not satisfy the triangle inequality). In the simplest case, a Kullback–Leibler divergence of 0 indicates that the two distributions in question are identical. In simplified terms, it is a measure of surprise, with diverse applications in fields such as applied statistics, fluid mechanics, neuroscience and machine learning. The Kullback–Leibler divergence is a special case of a broader class of statistical divergences called f-divergences, as well as of the class of Bregman divergences; it is the only divergence over probabilities that is a member of both classes. Although it is often intuited as a way of measuring the distance between probability distributions, the Kullback–Leibler divergence is not a true metric.
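As an illustrative sketch (not necessarily the exact computation used in this work), the intrinsic discrepancy of a single feature can be estimated as a symmetrized Kullback–Leibler divergence between its class-conditional histograms. The symmetrization used below (the sum of the two directed divergences), the binning, the smoothing constant, and the column names are assumptions for demonstration only; other symmetrizations (e.g. the minimum of the two directed divergences) also appear in the literature.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

def intrinsic_discrepancy(feature, target, bins=10, eps=1e-9):
    """Symmetrized KL divergence between a feature's class-conditional
    distributions (disease vs. no-disease), estimated from histograms."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    p, _ = np.histogram(feature[target == 1], bins=edges)
    q, _ = np.histogram(feature[target == 0], bins=edges)
    p = (p + eps) / (p + eps).sum()   # smooth to avoid zero-count bins
    q = (q + eps) / (q + eps).sum()
    return entropy(p, q) + entropy(q, p)

# Illustrative usage with assumed column names from the heart-disease data:
# disc = intrinsic_discrepancy(df["thal"].to_numpy(), df["target"].to_numpy())
```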

Intrinsic discrepancies between disease and no-disease, in decreasing order:

0.609044 (thal)
0.584354 (cp)
0.478634 (ca)
0.456496 (thalach)
0.420033 (oldpeak)
0.352659 (exang)
0.312049 (slope)
0.198552 (age)
0.156002 (sex)
0.058716 (restecg)
0.033252 (restbp)
0.030527 (chol)
0.000000 (fbs)

Classification using SVM

SVM is a supervised machine learning algorithm that works on the concept of decision planes, which define decision boundaries. A decision boundary separates the objects of one class from the objects of another class. Support vectors are the data points nearest to the hyperplane. A kernel function is used to separate non-linearly separable data by transforming the input into a higher-dimensional space. The Gaussian radial basis function (RBF) kernel is used in our proposed method.

Classification using kNN

The K-nearest neighbors (KNN) algorithm is a type of supervised machine learning algorithm. KNN is extremely easy to implement in its most basic form, yet it can perform quite complex classification tasks. It is a lazy learning algorithm, since it has no specialized training phase; rather, it uses all of the training data when classifying a new data point or instance. KNN is also a non-parametric learning algorithm, which means that it assumes nothing about the underlying data. This is an extremely useful property, since most real-world data does not really follow any theoretical assumption, e.g. linear separability or a uniform distribution.
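A minimal sketch of a basic KNN classifier under the same assumed data layout as above; k = 5 and the scaling step are illustrative choices rather than the configuration of the proposed method.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

df = pd.read_csv("heart.csv")          # assumed file and column names
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Lazy learner: fit() only stores the training data; distances are computed at query time
knn_clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn_clf.fit(X_train, y_train)
print("KNN test accuracy:", knn_clf.score(X_test, y_test))
```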

A refinement of the k-NN classification algorithm is to weight the contribution of each of the k neighbors according to its distance to the query point $x_q$, giving greater weight $w_i$ to closer neighbors. The prediction is given by

$$\hat{f}(x_q) = \frac{\sum_{i=1}^{k} w_i\, f(x_i)}{\sum_{i=1}^{k} w_i},$$

where the weight is

$$w_i = \frac{1}{d(x_q, x_i)^2}.$$

In case $x_q$ exactly matches one of the $x_i$, so that the denominator becomes zero, we assign $\hat{f}(x_q) = f(x_i)$. If weighting is used, it makes sense to use all training examples rather than only the k nearest; the algorithm then becomes a global one. The only disadvantage is that the algorithm will run more slowly.
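A small sketch implementing the distance-weighted rule above directly in NumPy; the inverse-square weights and the exact-match handling follow the formula as stated, while the function and array names are illustrative.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_query, k=5):
    """Distance-weighted k-NN: each neighbor is weighted by 1 / d(x_q, x_i)^2."""
    d = np.linalg.norm(X_train - x_query, axis=1)   # distances to all training points
    idx = np.argsort(d)[:k]                         # indices of the k nearest neighbors
    if d[idx[0]] == 0.0:                            # exact match: return its label directly
        return y_train[idx[0]]
    w = 1.0 / d[idx] ** 2
    # Weighted vote: the predicted class is the one with the largest total weight,
    # which generalizes the weighted-average formula to discrete labels.
    classes = np.unique(y_train)
    scores = [w[y_train[idx] == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]
```

In scikit-learn, a similar effect is available via KNeighborsClassifier(weights="distance"), which weights neighbors by inverse distance rather than inverse squared distance.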

Classification using Naïve Bayes Classifier

Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bayesian network models but, coupled with kernel density estimation, they can achieve higher accuracy levels. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. Maximum-likelihood training can be done by evaluating a closed-form expression, which takes linear time, rather than by the expensive iterative approximation used for many other types of classifiers.
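A minimal sketch using a Gaussian naive Bayes classifier, which models every feature with a per-class normal distribution; this treatment of the categorical columns, the file and column names, and the split are illustrative assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

df = pd.read_csv("heart.csv")          # assumed file and column names
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Gaussian naive Bayes: fits a per-class mean and variance for each feature
# (closed-form maximum-likelihood estimates, no iterative optimization)
nb_clf = GaussianNB()
nb_clf.fit(X_train, y_train)
print("Naive Bayes test accuracy:", nb_clf.score(X_test, y_test))
```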
