
What is Logistic Regression?

Logistic regression is a supervised machine learning algorithm used for classification.

It is a statistical method that predicts the probability that an observation belongs to each class of a categorical target variable.
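
To make "predicting a probability" concrete, here is a minimal sketch of the logistic (sigmoid) function for the two-class case. The weights, bias, and feature values are made up for illustration and do not come from the notebook. The three-class model fitted later generalizes the same idea.

```python
import math


def sigmoid(z):
    """Map any real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one sample (binary case)."""
    # Linear score: weighted sum of the features plus a bias term.
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


# Hypothetical weights and bias, chosen only for illustration.
weights = [0.8, -0.5]
bias = 0.1
p = predict_probability([1.0, 2.0], weights, bias)
print(round(p, 3))
```

A score of zero maps to a probability of exactly 0.5, which is why 0.5 is the usual decision threshold in the binary case.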
Example:

Here we load the Iris dataset and fit a logistic regression classifier to it.

In [35]:
import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

In [31]:
df= pd.read_csv('C:/Users/anirb/Downloads/New folder/Iris.csv')

df

Out[31]: SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 5.1 3.5 1.4 0.2 Iris-setosa

1 4.9 3.0 1.4 0.2 Iris-setosa

2 4.7 3.2 1.3 0.2 Iris-setosa

3 4.6 3.1 1.5 0.2 Iris-setosa

4 5.0 3.6 1.4 0.2 Iris-setosa

... ... ... ... ... ...

145 6.7 3.0 5.2 2.3 Iris-virginica

146 6.3 2.5 5.0 1.9 Iris-virginica

147 6.5 3.0 5.2 2.0 Iris-virginica



148 6.2 3.4 5.4 2.3 Iris-virginica

149 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 5 columns

In [3]:
df['Species'].unique()

Out[3]: array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)

In [32]:
df['Species'].replace({'Iris-setosa':'1', 'Iris-versicolor':'2', 'Iris-virginica':'3'},inp
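
The cell above is cut off in this export, so its remaining arguments are unknown. As a hedged alternative, the same recoding can be written as an explicit assignment with `Series.map`, which avoids in-place modification of a selected column (recent pandas versions may warn about that pattern). The small frame below is a stand-in for the real data, and `species_codes` is a name introduced here for illustration.

```python
import pandas as pd

# Tiny stand-in frame; the notebook itself loads Iris.csv instead.
df = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]}
)

# Same mapping as in the notebook: species names to string codes.
species_codes = {
    "Iris-setosa": "1",
    "Iris-versicolor": "2",
    "Iris-virginica": "3",
}

# Assign the mapped column back instead of modifying it in place.
df["Species"] = df["Species"].map(species_codes)
print(df["Species"].tolist())
```

Note that the codes are strings, not integers, which is why the later predictions appear as `'1'`, `'2'`, `'3'` with `dtype=object`.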

In [5]:
df

Out[5]: SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species

0 5.1 3.5 1.4 0.2 1

1 4.9 3.0 1.4 0.2 1

2 4.7 3.2 1.3 0.2 1

3 4.6 3.1 1.5 0.2 1

4 5.0 3.6 1.4 0.2 1

... ... ... ... ... ...

145 6.7 3.0 5.2 2.3 3

146 6.3 2.5 5.0 1.9 3

147 6.5 3.0 5.2 2.0 3

148 6.2 3.4 5.4 2.3 3

149 5.9 3.0 5.1 1.8 3

150 rows × 5 columns

In [6]:
from sklearn.model_selection import train_test_split

In [10]:
x_train,x_test,y_train,y_test=train_test_split(df[['SepalLengthCm','SepalWidthCm','PetalLe

In [11]:
len(x_train)

Out[11]: 120

In [12]:
len(x_test)

Out[12]: 30
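
The `train_test_split` call above is truncated in this export, so its arguments are not visible. The sketch below shows one call that would produce the same 120/30 split. It assumes `test_size=0.2`, uses scikit-learn's bundled copy of Iris as a stand-in for the CSV, and adds a hypothetical `random_state` for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Stand-in for the notebook's CSV: scikit-learn's bundled Iris data.
X, y = load_iris(return_X_y=True)

# test_size=0.2 is an assumption consistent with the 120/30 split above;
# random_state=0 is a hypothetical seed, not taken from the notebook.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(len(x_train), len(x_test))
```

Without a fixed `random_state`, each run draws a different test set, so the rows and scores shown in this notebook would not be reproduced exactly.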

In [14]:
from sklearn.linear_model import LogisticRegression

In [18]:
lr = LogisticRegression()

In [23]:
lr.fit(x_train,y_train)

Out[23]: LogisticRegression()

In [21]:
lr.predict(x_test)

Out[21]: array(['1', '3', '3', '1', '2', '1', '1', '1', '2', '1', '1', '3', '2',

'2', '1', '2', '1', '3', '1', '2', '2', '3', '1', '3', '1', '2',

'3', '3', '2', '3'], dtype=object)
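
`predict` returns only the most likely class. Since logistic regression is a probability model, the per-class probabilities behind those labels can be inspected with `predict_proba`. This is a minimal sketch on scikit-learn's bundled Iris data, not the notebook's CSV; `max_iter=1000`, `test_size=0.2`, and `random_state=0` are assumptions added here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# max_iter is raised (an assumption) to give the solver room to converge.
lr = LogisticRegression(max_iter=1000)
lr.fit(x_train, y_train)

# One row per test sample, one column per class; each row sums to 1.
proba = lr.predict_proba(x_test)

# The predicted label is the class with the highest probability.
predicted = lr.classes_[np.argmax(proba, axis=1)]
print(proba[:3].round(3))
```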

In [25]:
x_test

Out[25]: SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm

33 5.5 4.2 1.4 0.2

142 5.8 2.7 5.1 1.9

129 7.2 3.0 5.8 1.6

11 4.8 3.4 1.6 0.2

64 5.6 2.9 3.6 1.3

42 4.4 3.2 1.3 0.2

38 4.4 3.0 1.3 0.2

26 5.0 3.4 1.6 0.4

54 6.5 2.8 4.6 1.5

27 5.2 3.5 1.5 0.2

12 4.8 3.0 1.4 0.1

119 6.0 2.2 5.0 1.5

72 6.3 2.5 4.9 1.5

82 5.8 2.7 3.9 1.2

20 5.4 3.4 1.7 0.2

57 4.9 2.4 3.3 1.0

40 5.0 3.5 1.3 0.3

140 6.7 3.1 5.6 2.4

21 5.1 3.7 1.5 0.4

90 5.5 2.6 4.4 1.2

84 5.4 3.0 4.5 1.5

117 7.7 3.8 6.7 2.2

10 5.4 3.7 1.5 0.2

139 6.9 3.1 5.4 2.1

43 5.0 3.5 1.6 0.6

87 6.3 2.3 4.4 1.3

128 6.4 2.8 5.6 2.1

120 6.9 3.2 5.7 2.3



81 5.5 2.4 3.7 1.0

138 6.0 3.0 4.8 1.8

In [26]:
lr.score(x_test,y_test)

Out[26]: 1.0
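
For a classifier, `score` returns the mean accuracy on the data passed in. The sketch below recomputes that number by hand and adds a confusion matrix, which shows where any errors fall. It runs on scikit-learn's bundled Iris data with an assumed `test_size=0.2`, `random_state=0`, and `max_iter=1000`, so its numbers need not match the notebook's.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

lr = LogisticRegression(max_iter=1000)
lr.fit(x_train, y_train)

y_pred = lr.predict(x_test)

# Accuracy is the fraction of test samples predicted correctly.
manual_accuracy = float(np.mean(y_pred == y_test))

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=lr.classes_)
print(manual_accuracy)
print(cm)
```

With only 30 test rows, a single misclassification changes the accuracy by about 3.3 percentage points, so a score from one small split is best read as a rough estimate.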

In [27]:
import seaborn as sns

In [36]:
sns.set_style("whitegrid")

sns.pairplot(df,hue= 'Species',height= 3)

plt.show()
