
Classification Algorithms

Logistic Regression
Logistic regression is used to solve classification problems.

The logistic regression equation is represented as:

y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))

Here y is the probability of the positive class for a given input X,

so we can write it as

P(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))

After a mathematical transformation we get

ln(p(X) / (1 - p(X))) = b0 + b1*X

The left-hand side is the log of the odds, so this can be written as ln(odds) = b0 + b1*X,

and exponentiating both sides gives

odds = e^(b0 + b1*X)

where odds is the ratio of the probability that something happens to the probability that it does not: odds = p(X) / (1 - p(X)).
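
To make these transformations concrete, here is a minimal Python sketch; the coefficients b0, b1 and the input x are assumed values chosen purely for illustration:

import math

# Assumed coefficients and input, purely for illustration
b0, b1 = -1.5, 0.8
x = 2.0

log_odds = b0 + b1 * x           # the linear part: -1.5 + 0.8*2.0 = 0.1
odds = math.exp(log_odds)        # odds = e^(b0 + b1*x), about 1.105
p = odds / (1 + odds)            # sigmoid: p = odds / (1 + odds), about 0.525

# Round trip: ln(p / (1 - p)) recovers the log odds
assert abs(math.log(p / (1 - p)) - log_odds) < 1e-9
print(log_odds, odds, p)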

Logistic regression is a binary classifier which uses the sigmoid function as its underlying function for classification; the sigmoid also helps deal with outliers by squashing extreme values.

The decision is made based on a probability threshold: the model calculates the probability y for the given values of the x variable, and that probability determines which class the point belongs to. The threshold itself depends on the problem and varies from case to case.

Logistic regression converts logit values (log odds), which can range from -infinity to +infinity, into a range between 0 and 1, since the logistic function's output is the probability of an event occurring.

Log odds is nothing but the log of the odds; the sigmoid function scales this score back to a probability between 0 and 1.

So if we modeled the probability with a plain linear function, there is a chance we would get values less than 0 or greater than 1, which does not make sense for a probability. We need a function that always gives values between 0 and 1, and that is why we use the sigmoid function as the underlying function in logistic regression.
Non-linear problems can’t be solved with logistic regression because it has a linear decision surface. Linearly
separable data is rarely found in real-world scenarios.
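
As a small end-to-end sketch (assuming scikit-learn and a synthetic dataset generated only for demonstration), logistic regression outputs class probabilities via the sigmoid and applies a 0.5 threshold by default:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, assumed purely for illustration
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

# predict_proba gives the sigmoid probabilities; predict applies a 0.5 threshold
print(clf.predict_proba(X_test[:3]))
print(clf.predict(X_test[:3]))
print("accuracy:", clf.score(X_test, y_test))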

SVM (Support Vector Machine)


SVM is a supervised machine learning algorithm which is widely used for classification and regression problems.

SVM is used for data which is linearly separable.

For non-linearly separable data we use kernel SVM.

SVM separates the classes using a hyperplane. This hyperplane should have the highest marginal distance between the classes in the high-dimensional space in order to separate the data.

Assume that we have 2 classes in our dataset. SVM first creates the hyperplane between the classes and then creates marginal hyperplanes parallel to it which pass through the nearest data points.

The margin between the classes represents the distance between the closest points of the two classes; this is known as the marginal distance.

The main concept behind SVM is to find the maximum marginal distance between the support vectors in order to classify new data points.

Support Vector

Support vectors are data points which are closer to the marginal hyperplane and influence the position and
orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier.

Deleting the support vectors will change the position of the hyperplane. These are the points that help us build our
SVM.

SVM can create many hyperplanes that classify the data points, but it selects only the one which has the highest marginal distance from the support vectors.
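
A minimal sketch of this idea using scikit-learn's SVC (the toy points below are assumed values): after fitting, the model exposes the support vectors that determine the maximum-margin hyperplane.

import numpy as np
from sklearn.svm import SVC

# Tiny linearly separable toy set, assumed values for illustration
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 7], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1.0).fit(X, y)

# The fitted model exposes the support vectors that define the margin
print("support vectors:\n", clf.support_vectors_)
print("prediction for [4, 4]:", clf.predict([[4, 4]]))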

Non-linear SVM (kernel SVM)


If a dataset is not linearly separable, we use a kernel function to map it into a higher-dimensional space where we can find a hyperplane that can separate the samples.

Kernel functions convert data from a low-dimensional space into a higher-dimensional space in order to classify it.

Types of kernels:

1. Linear kernel
2. Polynomial kernel
3. Radial basis function (RBF) / Gaussian kernel
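
As a brief sketch of why kernels matter (assuming scikit-learn and a synthetic two-circles dataset, which is not linearly separable), the RBF kernel should separate the classes where a linear kernel cannot:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in 2-D, assumed toy data
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_clf = SVC(kernel='linear').fit(X, y)
rbf_clf = SVC(kernel='rbf').fit(X, y)

# The RBF kernel implicitly maps the data to a higher-dimensional space,
# so it should fit the circles far better than the linear kernel
print("linear kernel accuracy:", linear_clf.score(X, y))
print("RBF kernel accuracy:", rbf_clf.score(X, y))
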
What is Hinge Loss?

Explanation: Hinge loss is a loss function which penalises the SVM model for inaccurate predictions.

If yi(w^T*xi + b) ≥ 1, the hinge loss is 0, i.e. the point is correctly classified and lies outside the margin. When yi(w^T*xi + b) < 1, the hinge loss is 1 - yi(w^T*xi + b), so points inside the margin or on the wrong side are penalised. In general, hinge loss = max(0, 1 - yi(w^T*xi + b)).
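
A minimal sketch of this computation; the weight vector, bias, and points are assumed values chosen so each case (outside the margin, inside the margin, misclassified) appears once:

import numpy as np

def hinge_loss(w, b, X, y):
    # y must be in {-1, +1}; loss = max(0, 1 - y * (w.x + b)) per point
    margins = y * (X @ w + b)
    return np.maximum(0, 1 - margins)

w = np.array([1.0, -1.0])   # assumed weight vector
b = 0.0                     # assumed bias

X = np.array([[3.0, 1.0],   # margin = 2.0 -> loss 0 (correct, outside margin)
              [1.5, 1.0],   # margin = 0.5 -> loss 0.5 (inside margin)
              [1.0, 2.0]])  # margin = -1.0 -> loss 2.0 (misclassified)
y = np.array([1, 1, 1])

print(hinge_loss(w, b, X, y))  # [0.  0.5 2. ]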

SVM Hyperparameters


Support Vector Machine Classifier

from sklearn.svm import SVC
clf = SVC(kernel='linear', C=1.0, random_state=0)


For example:

kernel --> 'linear' for linear classification, or kernel='rbf' for non-linear classification.

C --> the penalty parameter of the error term.

random_state --> the seed of the pseudo-random number generator.

Parameters:

C: Inverse of the strength of regularization.

Behavior: as the value of C increases, the model tends to overfit; as the value of C decreases, the model tends to underfit.

γ: gamma (used only for the RBF kernel)

Behavior: as the value of γ increases, the model tends to overfit; as the value of γ decreases, the model tends to underfit.
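
A small sketch of this behavior (the synthetic dataset and the particular C/γ grid are assumptions for illustration): large C and γ push the model toward overfitting, small values toward underfitting, which shows up as a gap between train and test accuracy.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy synthetic data, assumed purely for illustration
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, flip_y=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Sweep C and gamma and compare train vs. test accuracy
for C in (0.01, 1, 100):
    for gamma in (0.01, 1, 100):
        clf = SVC(kernel='rbf', C=C, gamma=gamma).fit(X_train, y_train)
        print(f"C={C:<6} gamma={gamma:<6} "
              f"train={clf.score(X_train, y_train):.2f} "
              f"test={clf.score(X_test, y_test):.2f}")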

K-Nearest Neighbors (K-NN) algorithm


K-Nearest Neighbors (K-NN) algorithm is a supervised machine learning algorithm that can be used to solve both
classification and regression problems.

It is sometimes called a lazy algorithm because it simply memorizes the training data rather than learning from it; K-NN does not require any explicit training phase.

K-NN works based on similarity measures (sometimes called distance, proximity, or closeness); these measures are typically calculated using Euclidean distance or Manhattan distance, as sketched below.
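
Here is a small sketch of the two measures; the points are assumed values for illustration:

import math

def euclidean(a, b):
    # square root of the sum of squared differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # sum of absolute differences
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (1, 2), (4, 6)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7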

step 1: Initialize K (should be an odd number); K represents the number of neighbors

step 2: For each sample in the data:


2.1: Calculate the distance between the query point and the current point from the data.

2.2: Add the distance and the index to an ordered collection

step 3: Sort the ordered collection of distances and indices from smallest to largest (in ascending order) by the distances

step 4: Pick the first K entries from the sorted collection

step 5: Get the labels of the selected K entries

step 6: If regression, return the mean of the K labels

step 7: If classification, return the mode of the K labels
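
Putting these steps together, here is a minimal from-scratch sketch in Python; the toy dataset, query point, and the helper name knn_predict are assumptions for illustration:

import math
from collections import Counter

def knn_predict(data, query, k, mode="classification"):
    # data: list of (point, label) pairs; query: tuple of coordinates
    # Steps 2-3: compute the distance to every sample and sort ascending
    distances = sorted(
        ((math.dist(point, query), label) for point, label in data),
        key=lambda pair: pair[0],
    )
    # Steps 4-5: take the labels of the first K entries
    k_labels = [label for _, label in distances[:k]]
    # Steps 6-7: mean for regression, mode for classification
    if mode == "regression":
        return sum(k_labels) / k
    return Counter(k_labels).most_common(1)[0][0]

# Toy dataset of (point, class label) pairs, assumed values
data = [((1, 1), "A"), ((2, 2), "A"), ((3, 3), "A"),
        ((6, 6), "B"), ((7, 7), "B"), ((8, 8), "B")]

print(knn_predict(data, (2.5, 2.5), k=3))  # "A"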
