BUSI 650 – Class Activity 3

Regression Analysis in Python

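The objective of this tutorial is to get hands-on practice with regression analysis in Python. The model
trained below applies a sigmoid to a linear combination of the inputs and is fit by gradient descent on a
negative log-likelihood error, i.e. logistic regression.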

Step 1: Importing Libraries


Open Google Colab and create a new notebook.

Import the necessary libraries for regression analysis.

import pandas as pd
import numpy as np
import scipy.special as sps
import matplotlib.pyplot as plt

Step 3: Uploading and loading the data file


data = pd.read_csv('/content/data_activity_3.csv')

Step 4: Converting the dataframe to a numpy array:


data = np.array(data)

Step 5: Exploratory Data Analysis (EDA):


Before performing regression, it's important to understand the dataset. Explore the dataset by examining
its dimensions and viewing the first few rows. Visualize the data using a scatter plot.
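
The exploration described above could look roughly like the sketch below. Since `data` is already a numpy array at this point, it uses array slicing instead of dataframe methods. The synthetic array is only a stand-in so the snippet runs on its own; it assumes four numeric columns with a 0/1 output in the last one, which the activity does not state. Replace it with the array loaded from the CSV file.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the loaded array (assumption: 4 numeric columns,
# the last one a 0/1 output). Replace with the real `data`.
rng = np.random.default_rng(0)
features = rng.random((100, 3))
labels = (features.sum(axis=1) > 1.5).astype(float)
data = np.column_stack([features, labels])

# Dimensions: (number of rows, number of columns).
print("shape:", data.shape)

# First five rows.
print(data[:5])

# Scatter plot of each input feature against the output column.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for j, ax in enumerate(axes):
    ax.scatter(data[:, j], data[:, 3], s=10)
    ax.set_xlabel("feature {}".format(j + 1))
    ax.set_ylabel("feature 4 (output)")
plt.tight_layout()
plt.show()
```

Checking the range of each feature with `data.min(axis=0)` and `data.max(axis=0)` is also worthwhile, because the scale of the inputs affects how gradient descent behaves later on.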

Step 6: Define training input and output:


Use features 1 to 3 (the first three columns) as input and feature 4 (the fourth column) as output.
X = data[:,:3]
Y = data[:,3]

Step 7: Initializing parameters:


w = np.random.rand(3)

Step 8: Training process in a for loop:

# Maximum number of iterations.
max_iter = 500

# define an error vector to save all error values over all iterations.
error_all = []

# Learning rate for gradient descent.


eta = 0.5

for epoch in range(max_iter):

    # Predicted output: sigmoid of the linear combination of the inputs.
    Y_hat = sps.expit(np.dot(X, w))

    # Compute the error (negative log-likelihood).
    e = -np.mean(np.multiply(Y, np.log(Y_hat)) + np.multiply((1 - Y), np.log(1 - Y_hat)))

    # Add this error to the end of error vector.
    error_all.append(e)

    # Gradient of the error with respect to w.
    grad_e = np.mean(np.multiply((Y_hat - Y), X.T), axis=1)

    # Gradient descent update.
    w_old = w
    w = w - eta*grad_e

    print('epoch {0:d}, negative log-likelihood {1:.4f}, w={2}'.format(epoch, e, w.T))
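
One way to convince yourself that the `grad_e` expression really is the gradient of the error `e` is to compare it with a finite-difference estimate. The sketch below does this on synthetic data. The helper names `sigmoid`, `nll` and `nll_grad` are made up for illustration, `sigmoid` is a NumPy-only stand-in for `sps.expit`, and the random data is an assumption, not the activity's dataset.

```python
import numpy as np

def sigmoid(z):
    # NumPy-only stand-in for scipy.special.expit.
    return 1.0 / (1.0 + np.exp(-z))

def nll(w, X, Y):
    # Negative log-likelihood, same expression as `e` in the loop above.
    Y_hat = sigmoid(X @ w)
    return -np.mean(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

def nll_grad(w, X, Y):
    # Same expression as `grad_e` in the loop above.
    Y_hat = sigmoid(X @ w)
    return np.mean((Y_hat - Y) * X.T, axis=1)

# Synthetic data (assumption): 50 samples, 3 features, random 0/1 outputs.
rng = np.random.default_rng(0)
X = rng.random((50, 3))
Y = (rng.random(50) > 0.5).astype(float)
w = rng.random(3)

# Central finite differences, one coordinate of w at a time.
h = 1e-6
numeric = np.zeros_like(w)
for j in range(w.size):
    step = np.zeros_like(w)
    step[j] = h
    numeric[j] = (nll(w + step, X, Y) - nll(w - step, X, Y)) / (2 * h)

analytic = nll_grad(w, X, Y)
max_diff = np.max(np.abs(analytic - numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_diff)
```

If the two vectors agree to several decimal places, the update `w = w - eta*grad_e` is moving along the true negative gradient, so any odd behaviour in training is more likely due to the learning rate or the data than to the gradient formula.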

Step 9: Plotting the error:


Plot the error values during the iterations and explain why this plot is oscillating.
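
A minimal sketch of the plot is given below. To keep it runnable on its own, it wraps the Step 8 loop in a hypothetical helper `train` and uses synthetic data, both of which are assumptions rather than part of the activity. The clipping of `Y_hat` is also an addition, there only to keep `log()` finite. In the notebook itself, `plt.plot(error_all)` on the list filled in Step 8 is enough.

```python
import numpy as np
import matplotlib.pyplot as plt

def train(X, Y, eta, max_iter=500, seed=0):
    """Gradient descent as in Step 8; returns the error at every iteration."""
    rng = np.random.default_rng(seed)
    w = rng.random(X.shape[1])
    errors = []
    eps = 1e-12  # assumption: clipping bound, not in the original loop
    for _ in range(max_iter):
        Y_hat = 1.0 / (1.0 + np.exp(-(X @ w)))
        # Clip only inside the error so that log() never sees 0.
        Y_safe = np.clip(Y_hat, eps, 1 - eps)
        errors.append(-np.mean(Y * np.log(Y_safe) + (1 - Y) * np.log(1 - Y_safe)))
        w = w - eta * np.mean((Y_hat - Y) * X.T, axis=1)
    return np.array(errors)

# Synthetic data (assumption): 200 samples, 3 features in [0, 1], 0/1 output.
rng = np.random.default_rng(1)
X = rng.random((200, 3))
Y = (X.sum(axis=1) > 1.5).astype(float)

# Train with the activity's learning rate and with a smaller one.
curves = {eta: train(X, Y, eta) for eta in (0.5, 0.05)}

for eta, errors in curves.items():
    plt.plot(errors, label="eta = {}".format(eta))
plt.xlabel("iteration")
plt.ylabel("negative log-likelihood")
plt.legend()
plt.show()
```

Whether a curve oscillates depends on the data as well as on `eta`, so the synthetic curves need not look like the ones from the activity's dataset. Rerunning Step 8 on the real data with a smaller `eta` and comparing the two plots is one way to investigate: a learning rate that is too large for the scale of the features is a common cause of oscillation and worth checking.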
