ML 5th

Experiment-5
AIM:
(a) Execute Logistic Regression on the diabetes data set. Analyse the
result and identify how well the model performed on the test set. Briefly describe the
steps you followed to analyse the data set.
(b) Implement Logistic Regression using python.
THEORY:
Logistic regression is a fundamental and widely used supervised machine learning method for
binary classification tasks.
It models the probability that an instance belongs to a particular class based on
one or more predictor variables. It is particularly useful when the dependent variable (target) is
categorical with two levels, commonly referred to as the binary classification problem.
For example, given two classes, Class 0 and Class 1, an input is assigned to Class 1 if the value of the
logistic function for that input is greater than 0.5 (the threshold value); otherwise it is assigned to Class 0.
The method is called regression because it extends linear regression by passing a linear combination of the
predictors through the logistic function, but it is mainly used for classification problems.
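The thresholding rule described above can be sketched in a few lines of Python. This is only a minimal illustration: the helper names `sigmoid` and `predict_class`, and the weights and bias, are hypothetical values chosen for the example and are not taken from any trained model.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_class(x, weights, bias, threshold=0.5):
    """Return (probability of Class 1, predicted class) for one input."""
    z = np.dot(weights, x) + bias   # linear combination, as in linear regression
    p = sigmoid(z)                  # squashed into a probability
    return p, int(p > threshold)    # Class 1 only if p exceeds the threshold

# Hypothetical weights and bias, chosen only for illustration
weights = np.array([0.8, -0.5])
bias = 0.1

p, label = predict_class(np.array([2.0, 1.0]), weights, bias)
print(f"P(Class 1) = {p:.3f}, predicted class = {label}")
```

Here the linear part evaluates to 1.2, which the logistic function maps to a probability above 0.5, so the input is assigned to Class 1.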
Types of Logistic Regression
Based on the number and ordering of the target categories, logistic regression can be classified into three types:
1. Binomial: In binomial logistic regression, the dependent variable has only two possible
values, such as 0 or 1, Pass or Fail, etc.
2. Multinomial: In multinomial logistic regression, the dependent variable has 3 or more possible
unordered values, such as “cat”, “dog”, or “sheep”.
3. Ordinal: In ordinal logistic regression, the dependent variable has 3 or more possible ordered
values, such as “Low”, “Medium”, or “High”.
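As a rough illustration of the multinomial case, the sketch below fits scikit-learn's `LogisticRegression` to the three-class Iris data set. The choice of Iris is an assumption made for the example, since it is not part of this experiment. In recent scikit-learn versions the default solver fits a multinomial model when the target has more than two classes. Ordinal logistic regression is not provided by this class.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Iris has three unordered classes, so this is a multinomial problem
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# max_iter is raised so the solver has room to converge on unscaled features
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# One probability per class for each sample; each row sums to 1
probabilities = model.predict_proba(X_test[:3])
print(probabilities.round(3))
print("Predicted classes:", model.predict(X_test[:3]))
print("Test accuracy:", model.score(X_test, y_test))
```

The predicted class for each sample is the one with the highest probability in its row.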
Gungun Sahu EN21CS301277

Procedure:
a) Logistic Regression on Weka:
1. Load Data: Open Weka Explorer and load your dataset. Weka supports various file
formats, such as ARFF and CSV.
2. Preprocess Data (if needed): If your dataset requires preprocessing, you can perform
tasks like handling missing values, normalization, or feature selection using Weka's
preprocessing tools.
3. Choose Logistic Regression Algorithm: Navigate to the "Classify" tab in Weka Explorer.
Click on the "Choose" button to select the logistic regression algorithm. In Weka, logistic
regression is implemented as the "Logistic" classifier under the "functions" category.
4. Set Options: Configure any specific options for the Logistic classifier, such as the
ridge (regularization) parameter or other settings.
5. Split Data (optional): Optionally, split your dataset into training and testing sets
using the "Percentage split" test option to evaluate the performance of the logistic
regression model. First use 80% for training and 20% for testing. Then repeat with 60%
for training and 40% for testing.
6. Run Logistic Regression: Once you have selected the logistic regression algorithm and
set the options, click on the "Start" button to run the logistic regression model on your
dataset.
7. Evaluate Results: After running the logistic regression model, evaluate its performance
using appropriate evaluation metrics. Weka provides tools for computing various
performance metrics such as accuracy, precision, recall, F1-score, ROC curve, and
AUC-ROC.
8. Interpret Results: Interpret the results of the logistic regression model, including the
coefficients of the predictor variables, odds ratios, and any other relevant statistics.
Weka provides visualization tools and summary statistics to help interpret the results.
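The two percentage splits from the procedure above (80/20 and 60/40) can also be compared outside Weka. The sketch below is a Python approximation under stated assumptions. It uses scikit-learn's `load_diabetes` data with a median-based binary target as a stand-in for the data set loaded into Weka. The helper `evaluate_split` is a hypothetical name introduced here, and the numbers it prints will not match Weka's output.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a median-based binary target stands in for the Weka class attribute
X, y = load_diabetes(return_X_y=True)
y_binary = (y > np.median(y)).astype(int)

def evaluate_split(test_size):
    """Train on (1 - test_size) of the data and score on the held-out rest."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y_binary, test_size=test_size, random_state=42, stratify=y_binary)
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]   # probability of class 1
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_prob),
    }

results = {}
for test_size in (0.2, 0.4):   # 80/20 split, then 60/40 split
    results[test_size] = evaluate_split(test_size)
    print(f"train {1 - test_size:.0%} / test {test_size:.0%}")
    for name, value in results[test_size].items():
        print(f"  {name}: {value:.3f}")
```

Comparing the two result sets shows how sensitive each metric is to the amount of training data. A large gap between the splits suggests the estimate from a single split should be treated with caution.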
b) Logistic Regression using Python:

Steps to perform logistic regression in Python:
1. Import Libraries: Import the necessary libraries for data manipulation, visualization, and
modeling.
2. Preprocess Data (if needed): Handle missing values, encode categorical variables, and
perform feature scaling if necessary.
3. Split Data: Split your dataset into training and testing sets.
4. Instantiate Model: Create an instance of the logistic regression model.
5. Train Model: Fit the model on the training data.
6. Make Predictions: Use the trained model to make predictions on the test data.
7. Evaluate Model: Evaluate the performance of the model using appropriate metrics.
8. Interpret Results: Interpret the results of the logistic regression model, including
coefficients and odds ratios if needed.
9. Iterate and Refine (if needed): Depending on the performance of the model, you may need
to iterate and refine your approach by experimenting with different preprocessing
techniques, feature selection methods, or hyperparameter tuning.
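For the "Interpret Results" step, one common approach is to exponentiate the fitted coefficients to obtain odds ratios. The sketch below assumes scikit-learn's `load_diabetes` data with a median-based binary target. The table name `summary` and its column names are choices made for this example. Because the features are standardized, each odds ratio describes the effect of a one-standard-deviation increase in that feature.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = load_diabetes()
X = StandardScaler().fit_transform(data.data)
y_binary = (data.target > np.median(data.target)).astype(int)

model = LogisticRegression().fit(X, y_binary)

# exp(coefficient) is the multiplicative change in the odds of class 1
# for a one-unit (here: one standard deviation) increase in the feature
summary = pd.DataFrame({
    "feature": data.feature_names,
    "coefficient": model.coef_[0],
    "odds_ratio": np.exp(model.coef_[0]),
}).sort_values("odds_ratio", ascending=False)

print(summary.to_string(index=False))
```

An odds ratio above 1 means the feature raises the odds of class 1 when the other features are held fixed, and a value below 1 means it lowers them. Note that scikit-learn applies L2 regularization by default, so these coefficients are shrunk relative to an unregularized fit.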
Program for logistic regression:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_curve, auc)

# Load the diabetes dataset (the target is a continuous disease-progression score)
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Convert the target variable to binary (1 if above the median score, else 0)
y_binary = (y > np.median(y)).astype(int)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y_binary, test_size=0.2, random_state=42)

# Standardize features (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train the Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Evaluate the model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: {:.2f}%".format(accuracy * 100))

# Visualize the test samples by class (BMI is feature index 2, age is index 0)
plt.figure(figsize=(8, 6))
sns.scatterplot(x=X_test[:, 2], y=X_test[:, 0], hue=y_test,
                palette={0: 'blue', 1: 'red'}, marker='o')
plt.xlabel("BMI")
plt.ylabel("Age")
plt.title("Logistic Regression Test Samples\nAccuracy: {:.2f}%".format(
    accuracy * 100))
plt.legend(title="Class (1 = above median)", loc="upper right")
plt.show()

# Print the confusion matrix and classification report
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

# Plot ROC Curve
y_prob = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2,
         label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve\nAccuracy: {:.2f}%'.format(
    accuracy * 100))
plt.legend(loc="lower right")
plt.show()
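As a possible follow-up to the "Iterate and Refine" step, the regularization strength of the model can be tuned with cross-validation rather than left at its default. The sketch below is one way to do this with scikit-learn's `GridSearchCV`. The grid of `C` values is a hypothetical choice for illustration, and the data preparation repeats the median-based binarization used in the program.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
y_binary = (y > np.median(y)).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y_binary, test_size=0.2, random_state=42)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hypothetical grid: C is the inverse of the regularization strength
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best C:", search.best_params_["logisticregression__C"])
print(f"Cross-validated accuracy: {search.best_score_:.3f}")
print(f"Test accuracy: {search.score(X_test, y_test):.3f}")
```

The test set is used only once, after the best `C` has been chosen, so the final accuracy is not biased by the tuning itself.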

OUTPUT:
