
Using a three layer deep neural network to solve an unsupervised learning problem

May 1, 2022

To apply a three-layer neural network, we first need to preprocess the training and test data.

1 Importing libraries

The necessary libraries for preprocessing the training and test data are imported below.
[49]: import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn import linear_model
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import time

2 Loading the Dataset

[50]: from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call
drive.mount("/content/drive", force_remount=True).
I loaded the training and test data in CSV format from Google Drive with pandas.
[51]: train_data = pd.read_csv('/content/drive/MyDrive/trnData4D.csv',header = None)
test_data = pd.read_csv('/content/drive/MyDrive/tstData4D.csv',header = None)

3 Checking the train and test data

We can see that both the training and test data are four-dimensional and contain 15000 examples each.

[52]: train_data.shape

[52]: (15000, 4)

[53]: test_data.shape

[53]: (15000, 4)

4 Create training classes

According to the training dataset MATLAB file, the first 7500 rows belong to class 1 and the remaining 7500
rows belong to class 2. We need to create a new column 'class_name' that labels the first 7500 examples as 0
(for class 1) and the last 7500 examples as 1 (for class 2).

[54]: class_name = [i for i in range(1, 15001)]
train_data['class_name'] = class_name
train_data['class_name'].iloc[0:7500] = 0      # data belonging to class_1 is labeled '0'
train_data['class_name'].iloc[7500:15000] = 1  # data belonging to class_2 is labeled '1'

/usr/local/lib/python3.7/dist-packages/pandas/core/indexing.py:1732: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
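
As an aside, the SettingWithCopyWarning above comes from the chained indexing pattern train_data['class_name'].iloc[...] = .... A minimal sketch of an equivalent labeling that avoids the warning (my own variant, not what was run above):

train_data['class_name'] = 0             # first 7500 rows: class 1, labeled 0
train_data.loc[7500:, 'class_name'] = 1  # remaining 7500 rows: class 2, labeled 1
# or, equivalently, build the whole column in one assignment:
# train_data['class_name'] = np.repeat([0, 1], 7500)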

5 Checking the updated data

Here are the first five rows of the labeled data, all of which belong to class_1.
[55]: train_data.head()

[55]: 0 1 2 3 class_name
0 0.351074 3.635034 6.047802 -3.335702 0
1 6.391923 4.069761 5.844878 -7.343990 0
2 -10.094422 -5.324229 0.806393 -4.402726 0
3 -9.151437 -1.194381 6.352514 -0.786227 0
4 3.057876 -9.318271 -2.736060 -6.155479 0

6 Randomize the Dataset

We need to shuffle the training data so that examples from both classes are mixed, which helps the deep neural
network learn properly.
[56]: from collections import Counter
Counter(train_data["class_name"])
train_data_updated = train_data.sample(frac=1).reset_index(drop=True)
train_data_updated.head()

[56]: 0 1 2 3 class_name
0 7.775054 7.431022 -2.588864 -0.022134 0
1 -5.181603 14.139670 -0.160220 2.571873 0
2 -4.486467 4.666326 -0.147838 7.182762 0
3 11.431682 10.108164 8.937301 1.930146 0
4 8.966712 -12.707098 -0.733887 7.864671 0
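
Note that sample(frac=1) reshuffles differently on every run; if a reproducible shuffle is wanted, pandas accepts a random_state argument (a minimal variant of the call above):

train_data_updated = train_data.sample(frac=1, random_state=42).reset_index(drop=True)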

7 Create features and labels from the data

Before implementing the neural network, we first have to separate the four feature columns of the
training data as input ( x_train ) and the last column ( class_name ) as output ( y_train ).

[57]: x_train, y_train = train_data_updated.iloc[:, :-1], train_data_updated.iloc[:, [-1]]

[58]: x_train.shape

[58]: (15000, 4)

[59]: y_train.shape

[59]: (15000, 1)

8 Implement the neural network

It is easier for the neural network to learn if the input features are on a similar, small scale. Therefore, I
normalized the input data (preprocessing.normalize rescales each sample to unit length), but I did not touch the output data as it contains either 0 or 1.
[60]: x_train = np.array(x_train)
y_train = np.array(y_train)

[61]: x_train = preprocessing.normalize(x_train)
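
If an actual 0-to-1 range per feature were preferred over unit-length rows, a MinMaxScaler could be used instead; a minimal sketch of that alternative (not what was run here):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()                          # maps each feature column into [0, 1]
x_train_minmax = scaler.fit_transform(x_train)   # would replace preprocessing.normalize(x_train)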

[62]: x_train

[62]: array([[ 0.70284308, 0.67174356, -0.23402605, -0.00200085],
[-0.33915294, 0.92548786, -0.01048693, 0.16833754],
[-0.46393742, 0.48253631, -0.01528766, 0.7427564 ],
…,
[ 0.30323523, -0.23685057, 0.02507769, 0.92267075],
[ 0.6515841 , 0.5779838 , 0.29274954, -0.39455113],
[ 0.85834407, -0.08024698, -0.11541187, 0.49344299]])

[63]: x_train.shape

[63]: (15000, 4)

[64]: y_train

[64]: array([[0],
[0],
[0],
…,
[1],
[0],
[0]])

I then split the data to validate the training of the network: 85% of the data is used to train the network
and 15% to validate the training accuracy.
[65]: X_train, X_val, Y_train, Y_val = train_test_split(x_train, y_train, test_size=0.15, random_state=42)

[66]: X_train.shape

[66]: (12750, 4)
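
The two classes are perfectly balanced (7500 each), so a plain random split is adequate; if class balance were a concern, the split could also be stratified on the labels (a minimal variant of the call above, using scikit-learn's stratify option):

X_train, X_val, Y_train, Y_val = train_test_split(
    x_train, y_train, test_size=0.15, random_state=42, stratify=y_train.ravel())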

9 Creating the DeepNeuralNetwork class

Here, I created a class named DeepNeuralNetwork. I set the number of epochs to 10000 and the
learning rate to 0.01. I used the sigmoid activation function for the output layer, as this is a binary
classification problem, and the ReLU activation function for all the other layers.
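
For reference, the standard formulas the class below implements (my own summary, added here rather than taken from the original write-up) are the sigmoid and ReLU activations, the per-example binary cross-entropy loss and its derivative with respect to the output activation (the dAL term in backward), and the averaged cost:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \mathrm{ReLU}(z) = \max(0, z)$$

$$\mathcal{L}(a, y) = -\big[\, y \log a + (1 - y)\log(1 - a) \,\big], \qquad \frac{\partial \mathcal{L}}{\partial a} = -\left(\frac{y}{a} - \frac{1 - y}{1 - a}\right), \qquad J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(a_i, y_i)$$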
[67]: class DeepNeuralNetwork:
    def __init__(self, sizes, epochs=10000, learning_rate=.01):
        self.sizes = sizes
        self.epochs = epochs
        self.learning_rate = learning_rate

        # Initialize weights and biases
        self.W = [None] * (len(self.sizes) - 1)
        self.b = [None] * (len(self.sizes) - 1)
        # set random values for the weights and zeros for the biases
        for i in range(len(self.sizes) - 1):
            self.W[i] = np.random.randn(self.sizes[i], self.sizes[i + 1])
            self.b[i] = np.zeros((1, self.sizes[i + 1]))

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z)), z

    def sigmoid_backward(self, da, z):
        c = 1 / (1 + np.exp(-z))
        return da * c * (1 - c)

    def relu(self, z):
        return np.maximum(0, z), z

    def relu_backward(self, dA, z):
        dz = np.array(dA, copy=True)
        dz[z <= 0] = 0
        return dz

    def compute_cost(self, prediction, Y):
        m = Y.shape[0]
        cost = (1. / m) * (-np.dot(np.log(prediction).T, Y) - np.dot(np.log(1 - prediction).T, 1 - Y))
        cost = np.squeeze(cost)
        return round(float(cost), 2)

    def linear_forward(self, A, W, b):
        Z = np.dot(A, W) + b
        cache = (A, W, b)
        return Z, cache

    def linear_activation_forward(self, A_prev, W, b, activation):
        if activation == 'sigmoid':
            Z, linear_cache = self.linear_forward(A_prev, W, b)
            A, activation_cache = self.sigmoid(Z)
        elif activation == 'relu':
            Z, linear_cache = self.linear_forward(A_prev, W, b)
            A, activation_cache = self.relu(Z)
        cache = (linear_cache, activation_cache)
        return A, cache

    def linear_backward(self, dZ, cache):
        A_prev, W, b = cache
        m = A_prev.shape[0]
        dW = np.dot(A_prev.T, dZ) / m
        db = np.sum(dZ, axis=0, keepdims=True) / m
        dA_prev = np.dot(dZ, W.T)
        return dA_prev, dW, db

    def linear_activation_backward(self, dA, cache, activation):
        linear_cache, activation_cache = cache
        if activation == 'sigmoid':
            dZ = self.sigmoid_backward(dA, activation_cache)
            dA_prev, dW, db = self.linear_backward(dZ, linear_cache)
        elif activation == 'relu':
            dZ = self.relu_backward(dA, activation_cache)
            dA_prev, dW, db = self.linear_backward(dZ, linear_cache)
        return dA_prev, dW, db

    def forward(self, X):
        caches = []
        A = X
        L = len(self.sizes) - 1
        for l in range(1, L):
            A_prev = A
            A, cache = self.linear_activation_forward(A_prev, self.W[l - 1], self.b[l - 1], 'relu')
            caches.append(cache)
        AL, cache = self.linear_activation_forward(A, self.W[L - 1], self.b[L - 1], 'sigmoid')
        caches.append(cache)
        return AL, caches

    def backward(self, AL, Y, caches):
        grads = {}
        L = len(self.sizes) - 1
        Y = Y.reshape(AL.shape)
        # First-order derivative of the loss with respect to AL
        dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
        current_cache = caches[L - 1]
        grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = self.linear_activation_backward(dAL, current_cache, 'sigmoid')
        for l in reversed(range(L - 1)):
            current_cache = caches[l]
            dA_prev_temp, dW_temp, db_temp = self.linear_activation_backward(grads["dA" + str(l + 2)], current_cache, 'relu')
            grads["dA" + str(l + 1)] = dA_prev_temp
            grads["dW" + str(l + 1)] = dW_temp
            grads["db" + str(l + 1)] = db_temp
        return grads

    def update_parameters(self, grads, learning_rate):
        L = len(self.sizes) - 1
        for l in range(L):
            self.W[l] = self.W[l] - learning_rate * grads["dW" + str(l + 1)]
            self.b[l] = self.b[l] - learning_rate * grads["db" + str(l + 1)]

    def train(self, X, Y, X_val, Y_val):
        train_losses, train_accuracies = [], []
        validation_losses, validation_accuracies = [], []
        for i in range(self.epochs):
            AL, caches = self.forward(X)
            grads = self.backward(AL, Y, caches)
            self.update_parameters(grads, self.learning_rate)
            cost = self.compute_cost(AL, Y)
            # measure accuracy
            prediction = np.where(AL > .5, 1, 0)
            accuracy = round(accuracy_score(prediction, Y) * 100, 2)
            # Validation
            AL_val, _ = self.forward(X_val)
            val_cost = self.compute_cost(AL_val, Y_val)
            val_prediction = np.where(AL_val > .5, 1, 0)
            val_accuracy = round(accuracy_score(val_prediction, Y_val) * 100, 2)
            if i % 500 == 0:
                print(f'Epoch: {i+1},\n Train: \t Accuracy: {accuracy}, cost: {cost}\n '
                      f'Validation: \t Accuracy: {val_accuracy}, cost: {val_cost}')
            train_losses.append(cost)
            train_accuracies.append(accuracy)
            validation_losses.append(val_cost)
            validation_accuracies.append(val_accuracy)
        return train_losses, train_accuracies, validation_losses, validation_accuracies

    def predict(self, X):
        AL, caches = self.forward(X)
        return np.where(AL > .5, 1, 0)

    def get_test_accuracy(self, X, Y):
        predict = self.predict(X)
        return round(accuracy_score(predict, Y) * 100, 2)

I created an object dnn from the DeepNeuralNetwork class. Here I set the number of neurons
in the first and second hidden layers to 32 and 24 respectively. Finally, I trained the network
using the train method.
[ ]: sizes = (4, 32, 24, 1)
dnn = DeepNeuralNetwork(sizes, epochs=10000)
train_losses, train_accuracies, validation_losses, validation_accuracies = dnn.train(X_train, Y_train, X_val, Y_val)

Epoch: 1,
Train: Accuracy: 64.97, cost: 1.5
Validation: Accuracy: 66.13, cost: 1.48
Epoch: 501,
Train: Accuracy: 81.63, cost: 0.43
Validation: Accuracy: 80.76, cost: 0.48
Epoch: 1001,
Train: Accuracy: 82.26, cost: 0.4
Validation: Accuracy: 81.47, cost: 0.44
Epoch: 1501,
Train: Accuracy: 82.72, cost: 0.39
Validation: Accuracy: 81.78, cost: 0.42
Epoch: 2001,
Train: Accuracy: 82.99, cost: 0.38
Validation: Accuracy: 81.91, cost: 0.41
Epoch: 2501,
Train: Accuracy: 83.06, cost: 0.38
Validation: Accuracy: 82.09, cost: 0.41
Epoch: 3001,
Train: Accuracy: 83.16, cost: 0.37
Validation: Accuracy: 82.31, cost: 0.4
Epoch: 3501,
Train: Accuracy: 83.22, cost: 0.37
Validation: Accuracy: 82.27, cost: 0.4
Epoch: 4001,
Train: Accuracy: 83.18, cost: 0.37
Validation: Accuracy: 82.4, cost: 0.4
Epoch: 4501,
Train: Accuracy: 83.22, cost: 0.37
Validation: Accuracy: 82.49, cost: 0.4
Epoch: 5001,
Train: Accuracy: 83.26, cost: 0.37
Validation: Accuracy: 82.58, cost: 0.4
Epoch: 5501,
Train: Accuracy: 83.2, cost: 0.37
Validation: Accuracy: 82.58, cost: 0.39
Epoch: 6001,
Train: Accuracy: 83.3, cost: 0.37
Validation: Accuracy: 82.8, cost: 0.39
Epoch: 6501,
Train: Accuracy: 83.44, cost: 0.36
Validation: Accuracy: 82.93, cost: 0.39
Epoch: 7001,
Train: Accuracy: 83.41, cost: 0.36
Validation: Accuracy: 82.84, cost: 0.39
Epoch: 7501,
Train: Accuracy: 83.43, cost: 0.36
Validation: Accuracy: 82.76, cost: 0.39
Epoch: 8001,
Train: Accuracy: 83.47, cost: 0.36
Validation: Accuracy: 82.62, cost: 0.39
Epoch: 8501,
Train: Accuracy: 83.44, cost: 0.36
Validation: Accuracy: 82.62, cost: 0.39
Epoch: 9001,
Train: Accuracy: 83.45, cost: 0.36
Validation: Accuracy: 82.67, cost: 0.39
Epoch: 9501,
Train: Accuracy: 83.5, cost: 0.36
Validation: Accuracy: 82.67, cost: 0.39
We can see that the final training and validation accuracies are 83.5% and 82.67% respectively.

10 Plotting the loss and accuracy graphs for the train and validation data

We plotted the loss and accuracy curves for the training and validation data in order to check for
overfitting. The plots clearly show there is no overfitting. We will further verify this by checking the
prediction accuracy on the test data: if the test accuracy is consistent with, or close to, the training
accuracy, we can say the fitted model generalizes.
[ ]: plt.plot(train_losses, label="Train loss")
plt.plot(validation_losses, label="Validation loss")
plt.legend()
plt.title("Losses")

[ ]: Text(0.5, 1.0, 'Losses')

[Figure: "Losses" – train loss and validation loss vs. epoch]

[ ]: plt.plot(train_accuracies, label="Train accuracy")
plt.plot(validation_accuracies, label="Validation accuracy")
plt.legend()
plt.title("accuracy")

[ ]: Text(0.5, 1.0, 'accuracy')

[Figure: "accuracy" – train accuracy and validation accuracy vs. epoch]

11 Applying the model to the test data

As instructed in the test data MATLAB file, the expected class labels follow the pattern 2, 2, 1, 1, 2, 2, 1, 1, ....
In our encoding this becomes 1, 1, 0, 0, 1, 1, 0, 0, .... Now we preprocess the test data in the same way as
the training data.
[ ]: # labeling the data in the described order 2,2,1,1,2,2,1,1,... ('0' for class 1 and '1' for class 2)
test_data['class_name'] = np.asarray([1, 1, 0, 0] * 3750)
test_data_updated = test_data.sample(frac=1).reset_index(drop=True)
x_test, y_test = test_data_updated.iloc[:, :-1], test_data_updated.iloc[:, [-1]]
x_test = np.array(x_test)
y_test = np.array(y_test)
x_test = preprocessing.normalize(x_test)

We checked the accuracy on the test data by applying the get_test_accuracy method which we
created inside the DeepNeuralNetwork class. The test accuracy is 83.39%.
[ ]: dnn.get_test_accuracy(x_test, y_test)

[ ]: 83.39

We could improve the accuracy slightly by tuning some hyperparameters, for example by increasing the
number of hidden layers. As that is outside the scope of the assignment, I kept the three layer neural
network as my final model.
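
Because the class accepts an arbitrary sizes tuple, trying a deeper network would only require a different constructor call. A minimal sketch of what such an experiment might look like (not run here; the extra hidden layer and its widths are hypothetical choices):

sizes_deep = (4, 64, 32, 16, 1)   # one extra hidden layer; widths chosen arbitrarily
dnn_deep = DeepNeuralNetwork(sizes_deep, epochs=10000)
_ = dnn_deep.train(X_train, Y_train, X_val, Y_val)
print(dnn_deep.get_test_accuracy(x_test, y_test))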

12 The End

[ ]: from IPython.display import set_matplotlib_formats
set_matplotlib_formats('pdf', 'svg')

[ ]: %%capture
!wget -nc https://raw.githubusercontent.com/brpy/colab-pdf/master/colab_pdf.py
from colab_pdf import colab_pdf
colab_pdf('Using a three layer deep neural network to solve an unsupervised learning problem.ipynb')

[ ]: !wget -nc https://raw.githubusercontent.com/brpy/colab-pdf/master/colab_pdf.py
from colab_pdf import colab_pdf
colab_pdf('Using a three layer deep neural network to solve an unsupervised learning problem.ipynb')

File ‘colab_pdf.py’ already there; not retrieving.

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

[NbConvertApp] Converting notebook /content/drive/MyDrive/Colab Notebooks/Using a three layer deep neural network to solve an unsupervised learning problem.ipynb to pdf
[NbConvertApp] Support files will be in Using a three layer deep neural network
to solve an unsupervised learning problem_files/
[NbConvertApp] Making directory ./Using a three layer deep neural network to
solve an unsupervised learning problem_files
[NbConvertApp] Making directory ./Using a three layer deep neural network to
solve an unsupervised learning problem_files
[NbConvertApp] Making directory ./Using a three layer deep neural network to
solve an unsupervised learning problem_files
[NbConvertApp] Making directory ./Using a three layer deep neural network to
solve an unsupervised learning problem_files
[NbConvertApp] Writing 68204 bytes to ./notebook.tex
[NbConvertApp] Building PDF
[NbConvertApp] Running xelatex 3 times: ['xelatex', './notebook.tex', '-quiet']
[NbConvertApp] Running bibtex 1 time: ['bibtex', './notebook']
[NbConvertApp] WARNING | bibtex had problems, most likely because there were no
citations
[NbConvertApp] PDF successfully created
[NbConvertApp] Writing 98768 bytes to /content/drive/My Drive/Using a three layer deep neural network to solve an unsupervised learning problem.pdf
<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

[ ]: 'File ready to be Downloaded and Saved to Drive'

[ ]:

