TMA01 Question 1 (45 Marks)

TMA01 Question 1 (45 marks)
Name: Parth Shah

PI: E395923X
In this question, you will create some neural network models that use a dataset of natural
scenes. The dataset is currently hosted on Kaggle and has been divided into separate
datasets for this TMA.
In this question we will only use two classes of the dataset (in question 2 you will use all six
classes).
In this question, you are not being asked to select a "best" model. Therefore, for the
purposes of the question, you'll be using the test dataset to explore how models behave
differently on seen and unseen data. Don't do this on models you want to deploy, or in
subsequent TMAs!
Completing the TMA

The tasks in this notebook can be addressed using the techniques discussed in the
Foundation and Block 1 of the module materials, and the associated notebooks.
You should be able to complete this question when you have completed the
practical activities in Block 1
You should look at the notebooks for Block 1 while working through this
question. You will find many useful examples in those notebooks which will help
you in this assignment.
Record all your activity and observations in this notebook. Insert additional notebook cells
as required. Remember to run each cell in sequence and to rerun cells if you make any
changes in earlier cells.
Include Markdown cells (like this one) liberally in your solutions, to describe what you are
doing. This will help your tutor give full credit for all you have done, and is invaluable in
reminding you what you were doing when you return to the TMA after a few days away.
Before you submit your notebook make sure you run all cells in order and check that you
get the results you expect. (It is not unknown to receive notebooks which don't work when
the cells are run in order.)
See the VLE for details of how to submit your completed notebook. You should submit only
this notebook file for this question.
Marks are based on process, not results

In this notebook, you will be asked to create, train, and evaluate several neural networks.
Training neural networks is inherently a stochastic process, based on the random
allocation of initial weights and the shuffled order of training examples. Therefore, your
results will differ from results generated by other students, and those generated by the
module team and presented in the tutor's marking guide.
The marks in this question are awarded solely on your ability to carry out the steps of
training and evaluation, not on any particular results you may achieve. There are no
thresholds for accuracy (or any other metric) you must achieve. You will gain credit
for carrying out the tasks specified in this question, including honest evaluations of how the
models perform.
Setup
This imports the required libraries.
import tensorflow as tf
from tensorflow.keras import layers, optimizers, metrics, Sequential,
utils
import os
import json
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Loading and preparing the dataset

This section of the notebook loads the dataset and makes it available for training.
First, we define some constants we will use later, such as the image size, and define some
metrics to use for model evaluation.
BATCH_SIZE = 64
IMAGE_SIZE = (150, 150, 3)

IMAGE_RESCALE = (IMAGE_SIZE[0], IMAGE_SIZE[1])
METRICS = [
lambda : tf.keras.metrics.TruePositives(name='tp'),
lambda : tf.keras.metrics.FalsePositives(name='fp'),
lambda : tf.keras.metrics.TrueNegatives(name='tn'),
lambda : tf.keras.metrics.FalseNegatives(name='fn'),
lambda : tf.keras.metrics.BinaryAccuracy(name='accuracy'),
lambda : tf.keras.metrics.Precision(name='precision'),
lambda : tf.keras.metrics.Recall(name='recall'),
lambda : tf.keras.metrics.AUC(name='auc'),
]
def fresh_metrics():
return [metric() for metric in METRICS]
The dataset contains several classes of image. For this notebook, we'll use just two then
convert them to a tensor of strings.
# Where to find the test data
base_dir = '/datasets/intel-multiclass/'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
label_list = ['buildings', 'sea']
desired_labels_tensor = tf.constant([l.encode('utf-8') for l in

label_list])
desired_labels_tensor
<tf.Tensor: shape=(2,), dtype=string, numpy=array([b'buildings',

b'sea'], dtype=object)>
Some dicts to convert between numbers and text labels.

# Human-sensible labels for the classification
class_names = {i: l for i, l in enumerate(sorted(label_list))}
class_numbers = {l: i for i, l in enumerate(sorted(label_list))}
num_classes = len(label_list)
class_names, class_numbers, num_classes
({0: 'buildings', 1: 'sea'}, {'buildings': 0, 'sea': 1}, 2)
Some Tensorflow functions for use in defining the datasets we'll use.
• A predicate that determines if a given image is in one of our desired classes.
• A function that finds the class number from a label (in the image file's directory
name)
• A function that loads an image, given a path.
def desired_class(image_path):
label_text = tf.strings.split(image_path, os.path.sep)[-2]
return tf.math.reduce_any(label_text == desired_labels_tensor)
def lookup_class_label(label_text):
return class_numbers[label_text.numpy().decode('utf-8')]
def load_image(image_path):
# read the image from disk, decode it, resize it, and scale the
# pixels intensities to the range [0, 1]
image = tf.io.read_file(image_path)
image = tf.io.decode_jpeg(image, channels=3)
image = tf.image.resize(image, IMAGE_RESCALE)
image /= 255.0
# grab the label and encode it
label_text = tf.strings.split(image_path, os.path.sep)[-2]
label = tf.py_function(lookup_class_label, inp=[label_text],
Tout=tf.int32)
label = tf.ensure_shape(label, [])
# return the image and the integer encoded label

return (image, label)
These are all the image filenames we will use. Note the use of the filter to retain only the
two classes we're interested in.
train_dataset_files = tf.data.Dataset.list_files(
os.path.join(train_dir, '*', '*.jpg'),
shuffle=True).filter(desired_class)
train_data = train_dataset_files.map(load_image,
num_parallel_calls=tf.data.AUTOTUNE)
train_data = train_data.cache()
train_data = train_data.shuffle(20000)
train_data = train_data.batch(BATCH_SIZE)
train_data = train_data.prefetch(tf.data.AUTOTUNE)
validation_dataset_files = tf.data.Dataset.list_files(
os.path.join(validation_dir, '*', '*.jpg'),
shuffle=True).filter(desired_class)
validation_data = validation_dataset_files.map(load_image,
validation_data = validation_data.cache()
validation_data = validation_data.batch(BATCH_SIZE)
validation_data = validation_data.prefetch(tf.data.AUTOTUNE)
test_dataset_files = tf.data.Dataset.list_files(
os.path.join(test_dir, '*', '*.jpg'),
shuffle=True
).filter(desired_class)
test_data = test_dataset_files.map(load_image,
test_data = test_data.cache()
test_data = test_data.batch(BATCH_SIZE)
test_data = test_data.prefetch(tf.data.AUTOTUNE)
A check that we have what we're expecting: a dataset of batches of 150×150×3 images,
each paired with an integer label.
train_data
<PrefetchDataset element_spec=(TensorSpec(shape=(None, 150, 150, 3),

dtype=tf.float32, name=None), TensorSpec(shape=(None,),
dtype=tf.int32, name=None))>
Examining the data
Now we've loaded the data, we can look at some example images. We treat the dataset as
an iterator of Numpy arrays (each element is a batch of images and labels), and use that
interface to load a batch of images into memory. We then display them as a grid.
sample_imgs, sample_labels = train_data.as_numpy_iterator().next()
sample_imgs.shape, sample_labels.shape
((64, 150, 150, 3), (64,))
A batch of 64 images (each 150×150 pixels, 3 colour channels) and a batch of 64 labels.
plt.figure(figsize=(10,10))
for i in range(25):
plt.subplot(5,5,i+1)
plt.imshow(sample_imgs[i])
plt.xticks([])
plt.yticks([])
plt.grid(False)
plt.title(class_names[sample_labels[i]])
plt.show()
Jittered labels
The labels of the validation set, jittered. These may be useful for charts similar to those in
the Foundations notebooks.
test_labels = np.array(list(test_data.unbatch().map(lambda x, y:
y).as_numpy_iterator()))
test_labels.shape
(1353,)
jittered_labels = test_labels + (np.random.random(test_labels.shape) *

0.8)
jittered_labels.shape
(1353,)
Define and train a sample model
We now create and train a simple model using these datasets.
You should use this example as a basis for the models of your own that you create in this
question.
model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=IMAGE_SIZE),
tf.keras.layers.Dense(1024, activation='sigmoid'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
Note that we're using binary cross entropy as the loss function (as there are two classes).
Categorical cross-entropy is used when there are multiple classes, one-hot encoded.
opt = tf.keras.optimizers.SGD()
model.compile(optimizer=opt,
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(train_data,
validation_data=validation_data,
epochs=5)
Epoch 1/5
58/58 [==============================] - 2s 25ms/step - loss: 0.8806 -
accuracy: 0.5477 - val_loss: 0.6577 - val_accuracy: 0.5667
Epoch 2/5
58/58 [==============================] - 1s 14ms/step - loss: 0.6805 -
Epoch 3/5
58/58 [==============================] - 1s 14ms/step - loss: 0.6518 -
Epoch 4/5
58/58 [==============================] - 1s 14ms/step - loss: 0.6300 -
Epoch 5/5
58/58 [==============================] - 1s 14ms/step - loss: 0.6191 -
Save and reload the model and the training history.

model.save('q1_sample.h5')
with open('q1_sample_history.json', 'w') as f:

json.dump(history.history, f)
model = tf.keras.models.load_model('q1_sample.h5')
with open('q1_sample_history.json') as f:
sample_history = json.load(f)
Plot the training history.

acc = sample_history['accuracy']
val_acc = sample_history['val_accuracy']
loss = sample_history['loss']
val_loss = sample_history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'ro', label='Training acc')

plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'ro', label='Training loss')

plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
Update the metrics used on the model and evaluate them on the validation data.
model.compile(metrics=fresh_metrics())
model.evaluate(validation_data, return_dict=True)
6/6 [==============================] - 0s 13ms/step - loss: 0.0000e+00

- tp: 136.0000 - fp: 72.0000 - tn: 103.0000 - fn: 49.0000 - accuracy:
0.6639 - precision: 0.6538 - recall: 0.7351 - auc: 0.7169
{'loss': 0.0,
'tp': 136.0,
'fp': 72.0,
'tn': 103.0,
'fn': 49.0,
'accuracy': 0.6638888716697693,
'precision': 0.6538461446762085,
'recall': 0.7351351380348206,
'auc': 0.7169111967086792}
You are now able to work on the tasks in this TMA question.
(a) (5 marks)
Referring to the sample model above:
• show how many trainable parameters it has
• show the shape of inputs to the model
• describe in words the shape of the inputs.
Give your answer below
Insert additional code and markdown cells as needed.
1a.)
• The input layer consists of 150 ×150 ×3=67 , 500 inputs.
• The second layer has 1 , 024 inputs, so there must be 67 , 500 ×1 , 024 weights,
resulting in 69 , 120 ,000 parameters so far.
• Each input in the second layer also has a bias, resulting in
69 , 120 ,000+ 1, 024=69 , 121 ,024 parameters so far.
• The third layer only has 1 output, which results in 1 , 024 weights and 1 bias, so the
total number of trainable parameters are 69 , 121, 024 +1 ,024 +1=69 , 122 ,049 .
The shape of the input is:
model.layers[0].input.shape
TensorShape([None, 150, 150, 3])
This is a 150 x 150 pixel image with 3 color channels. It can be represented by an array of
150 arrays, each of which has 150 arrays too, which contain 3 pixel values, each for red,
green and blue:
# [
# [[r1_1, g1_1, b1_1], ..., [r1_150, g1_150, b1_150]]
# ...
# [[r150_1, g150_1, b150_1], ..., [r_150_150, g_150_150,
b150_150]]
# ]
(b) (10 marks)

Using the sample model defined above, create and train a new classifier model of this
dataset. Your new model should be different from the sample one in the following way:
• Insert an additional Dense layer of 256 neurons between the two existing Dense
layers
• All Dense layers, except the last, should use ReLU activitation
• Training should use the SGD optimiser with a learning rate of 0.001
• Remember to use binary_crossentropy as the loss function
Train your modified model for 40 epochs. Show plots of how the accuracy and loss changed
over training, for both the training and validation datasets.
(You may wish to save your model and the training history.)
model2 = tf.keras.Sequential([
tf.keras.layers.Dense(1024, activation='relu'),
])
opt = tf.keras.optimizers.SGD(learning_rate=0.001)
model2.compile(
optimizer=opt,
metrics=['accuracy']
)
history = model2.fit(
train_data,
epochs=40
)
Epoch 1/40
58/58 [==============================] - 1s 17ms/step - loss: 0.6661 -
Epoch 2/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5883 -
Epoch 3/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5891 -
Epoch 4/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5569 -
Epoch 5/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5475 -
Epoch 6/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5368 -
Epoch 7/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5190 -
Epoch 8/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5162 -
Epoch 9/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5054 -
Epoch 10/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5030 -
Epoch 11/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4885 -
Epoch 12/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4790 -
Epoch 13/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4798 -
Epoch 14/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4616 -
Epoch 15/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4646 -
Epoch 16/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4518 -
Epoch 17/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4527 -
Epoch 18/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4476 -
Epoch 19/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4307 -
Epoch 20/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4290 -
Epoch 21/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4401 -
Epoch 22/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4221 -
Epoch 23/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4093 -
Epoch 24/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4116 -
Epoch 25/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4036 -
Epoch 26/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3975 -
Epoch 27/40
58/58 [==============================] - 1s 15ms/step - loss: 0.3879 -
Epoch 28/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3920 -
Epoch 29/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3951 -
Epoch 30/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3743 -
Epoch 31/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3818 -
Epoch 32/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3686 -
Epoch 33/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3702 -
Epoch 34/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3675 -
Epoch 35/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3509 -
Epoch 36/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3498 -
Epoch 37/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3547 -
Epoch 38/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3654 -
Epoch 39/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3580 -
Epoch 40/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3583 -
(c) (5 marks)
Comment on the plots of loss and accuracy, for both training and validation data, during the
training of this model. Do you think this model would benefit from additional training?
acc = history.history["accuracy"]
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.legend()
plt.figure()

plt.legend()
plt.show()
print("Epoch 1 Training Accuracy: ", str(history.history['accuracy']
[0] * 100) + '%')
[4] * 100) + '%')
[39] * 100) + '%')
Epoch 1 Training Accuracy: 59.9351167678833%

Some epochs saw a worse training accuracy compared to the previous epoch, but overall,
the training accuracy gradually improved as the epochs increased, reaching 88% in the
final epoch. It is worth mentioning that the rate at which the training accuracy improved
slowed down significantly after 5 epochs — the training accuracy improved by ~13%
between the first and fifth epoch, but took a further 35 epochs to improve by another
~13%:
Although the validation accuracy gradually increased over the 40 epochs, it was much more
volatile. For example:
print("Epoch 37 Validation Accuracy: ",
str(history.history['val_accuracy'][36] * 100) + '%')
print("Epoch 38 Validation Accuracy: ",
str(history.history['val_accuracy'][37] * 100) + '%')
Epoch 37 Validation Accuracy: 66.38888716697693%
Epoch 38 Validation Accuracy: 79.72221970558167%
The validation accuracy between the 37th and 38th epoch decreased by ~13%. Like the
training accuracy, the validation accuracy also saw diminishing returns as the epochs
increased, settling between 75% - 80% from the 10th epoch.
The training loss decreased at a relatively stable pace, reducing from ~65% to ~35% over
the 40 epochs. The validation loss however suffered from diminishing returns, settling
between ~45% to ~50% from the 15th epoch.
To conclude, the training accuracy is already very good, and despite the training loss is
35%, it would most likely decrease significantly if the model is trained further. However,
the validation accuracy and loss have flattened and do not show any signs of improvement.
Therefore, training the model further would just lead to the training loss/accuracy
increasing to perfect-like values and the validation loss/accuracy remaining constant.
There is not much benefit of training the model further, as the validation performance, and
therefore, real-world performance too, may not increase by much (if at all). Training the
model further also increases the risk of over-fitting.
(d) (10 marks)

Recompile the model from part (b) above to use the metrics defined by the
fresh_metrics function defined above.
Evaluate the model, using these metrics, on all three of the train, validation, and test
datasets.
Use that model to generate predicted classes for all elements in the test dataset. Plot a
scatter chart of the predicted results with the actual results (defined above as either
test_labels or jittered_labels.)
Comment on these results.

model2.compile(metrics=fresh_metrics())
train_metric = model2.evaluate(train_data, return_dict=True)

validation_metric = model2.evaluate(validation_data, return_dict=True)
test_metric = model2.evaluate(test_data, return_dict=True)
58/58 [==============================] - 1s 10ms/step - loss:

0.0000e+00 - tp: 1666.0000 - fp: 131.0000 - tn: 1665.0000 - fn:
237.0000 - accuracy: 0.9005 - precision: 0.9271 - recall: 0.8755 -
auc: 0.9655
6/6 [==============================] - 0s 11ms/step - loss: 0.0000e+00
22/22 [==============================] - 0s 9ms/step - loss:
0.0000e+00 - tp: 549.0000 - fp: 92.0000 - tn: 565.0000 - fn: 147.0000
- accuracy: 0.8234 - precision: 0.8565 - recall: 0.7888 - auc: 0.8947
train_metric
{'loss': 0.0,
'tp': 1666.0,
'fp': 131.0,
'tn': 1665.0,
'fn': 237.0,
'accuracy': 0.9005136489868164,
'precision': 0.9271007180213928,
'recall': 0.8754597902297974,
'auc': 0.9654821753501892}
validation_metric
{'loss': 0.0,
'tp': 141.0,
'fp': 30.0,
'tn': 145.0,
'fn': 44.0,
'accuracy': 0.7944444417953491,
'precision': 0.8245614171028137,
'recall': 0.7621621489524841,
'auc': 0.8669652938842773}
test_metric
{'loss': 0.0,
'tp': 549.0,
'fp': 92.0,
'tn': 565.0,
'fn': 147.0,
'accuracy': 0.823355495929718,
'precision': 0.8564742803573608,
'recall': 0.7887930870056152,
'auc': 0.8947311043739319}
The accuracy, precision, and recall values are the lowest in the validation set and highest in
the training set, with the testing set falling in between, but leaning more closely to the
validation set.
The fact that the accuracy, precision, and recall values in the training set are higher than
the values in the testing/validation set is to be expected. This is because the model is tuned
to optimise the training data. Some over-fitting is inevitably present.
model2_input = np.array(list(test_data.unbatch().map(lambda x, y:
x).as_numpy_iterator()))
expected_output = np.array(list(test_data.unbatch().map(lambda x, y:
y).as_numpy_iterator()))
model2_predictions = model2.predict(model2_input)[:, 0]
is_sea = np.count_nonzero((model2_predictions >= 0.5) &

(expected_output >= 0.5))
is_building = np.count_nonzero((model2_predictions < 0.5) &
(expected_output < 0.5))
plt.figure(figsize=(10, 10))
misclassifications = 0
for i in range(25):
plt.subplot(5, 5, i + 1)
plt.imshow(model2_input[i])
plt.xticks([])
plt.yticks([])
plt.grid(False)
if model2_predictions[i] < 0.5 and expected_output[i] >= 0.5:
print(f"Misclassification at row {(i // 5) + 1}, column {(i %
5) + 1} - expecting {class_names[1]}.")
misclassifications += 1
elif model2_predictions[i] >= 0.5 and expected_output[i] < 0.5:
print(f"Misclassification at row {(i // 5) + 1}, column {(i %
5) + 1} - expecting {class_names[0]}.")
misclassifications += 1
plt.title(class_names[1 if model2_predictions[i] >= 0.5 else 0])
print(f"{(misclassifications/25) * 100}% of data points were
misclassified.")
plt.show()
43/43 [==============================] - 0s 5ms/step

Misclassification at row 1, column 1 - expecting buildings.
Misclassification at row 1, column 5 - expecting sea.
Misclassification at row 2, column 2 - expecting sea.
28.000000000000004% of data points were misclassified.
plt.scatter(jittered_labels, model2_predictions)
plt.xlabel('Label')
plt.ylabel('Prediction');
plt.axline((0, 0.5), (2, 0.5), c="tab:orange", label="Threshold")
plt.legend()
<matplotlib.legend.Legend at 0x7f81987d5370>
print(f"False Negatives: {test_metric['fn']}")
print(f"False Positives: {test_metric['fp']}")
False Negatives: 147.0

False Positives: 92.0
As can be seen from the graph above, there are more points in the lower right quadrant
(false negatives) compared to the top left quadrant (false positives).
• points in the scatter chart are clustered at the more extreme values for each class, with few
with predicted values near 0.5. This indicates that the model can classify well.
• for the test data, precision, accuracy and recall are all high and similar, indicating that there
(e) (5 marks) isn’t a systematic bias in the results and that both classes are classified correctly.
• for the training data, the values of all metrics are extremely high, indicating that the model
has learnt most features of the training data.
Referring to the evaluations in part (d), explain why it is important to have separate
datasets for training and testing ML models. What would be the implications if the same
dataset was used for both roles? When should testing data be used? What is the advantage
of having a separate validation dataset?

Implications if same dataset is used for both training and testing

If the same dataset is used for both training and testing, it becomes difficult to evaluate the
performance of the model. Neural Networks are trained to minimise the loss between the
input data and the expected output, and so it is obvious that the evaluation results would
appear to be unnaturally good if the same dataset is used for both training and testing. This
may not be an accurate representation of the actual performance of the model.
When testing data should be used
The model should first be trained using training data. The testing data should only be
touched at the final model evaluation stage. This is to allow for a good approximation of
how the model would perform with real-world data.
Advantage of having a separate validation dataset

Without a validation dataset, a model is improved by looking at the results from the testing
data and using that to fine-tune parameters in a way which provides better results to the
testing data. However, this means that some of the testing data is being converted into
training data, defeating the purpose of splitting the testing and training data in the first
place. When a validation dataset is used, this problem can be avoided, as the validation data
can be used to fine-tune the model parameters, leaving the training data only to be used at
the final step. Validation data is useful during model development, as a source of unseen data for validation
of the model.However, it is still possible for the model development to focus on optimising performance on
the validation data, so testing data is still needed for final evaluation.
(f) (10 marks)

Starting from the model defined in part (b), create and train two new models.
• Model S should have a different number of neurons in its hidden layer (you decide
how many)
• Model L should use a different value for the learning rate with the SGD optimizer
(you decide what it should be).
Apart from these changes, the model creation and training should be identical to what you
did in part (b) above. Comment on any relevant observations you make of the training.
Evaluate your new models and compare their performance with each other and what you
found in part (d) above. Comment on your results.
As a reminder, note that there are no marks in this TMA for whether your
modified models work better or worse then the original. The marks in this
question are purely for the experiment and commenting on the results.

modelS = tf.keras.Sequential([
])
modelS.compile(
optimizer=opt,
)
historyS = modelS.fit(
train_data,
epochs=40
)
acc = historyS.history["accuracy"]
val_acc = historyS.history['val_accuracy']
loss = historyS.history['loss']
val_loss = historyS.history['val_loss']

plt.legend()
plt.figure()

plt.legend()
plt.show()
modelS.compile(metrics=fresh_metrics())
train_metric = modelS.evaluate(train_data, return_dict=True)
validation_metric = modelS.evaluate(validation_data, return_dict=True)
test_metric = modelS.evaluate(test_data, return_dict=True)
Epoch 1/40
58/58 [==============================] - 1s 10ms/step - loss: 0.6722 -
Epoch 2/40
58/58 [==============================] - 0s 8ms/step - loss: 0.6510 -
Epoch 3/40
58/58 [==============================] - 0s 8ms/step - loss: 0.6348 -
Epoch 4/40
58/58 [==============================] - 0s 8ms/step - loss: 0.6189 -
Epoch 5/40
58/58 [==============================] - 0s 8ms/step - loss: 0.6053 -
Epoch 6/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5957 -
Epoch 7/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5852 -
Epoch 8/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5757 -
Epoch 9/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5710 -
Epoch 10/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5587 -
Epoch 11/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5543 -
Epoch 12/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5461 -
Epoch 13/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5413 -
Epoch 14/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5344 -
Epoch 15/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5250 -
Epoch 16/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5254 -
Epoch 17/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5165 -
Epoch 18/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5128 -
Epoch 19/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5103 -
Epoch 20/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5049 -
Epoch 21/40
58/58 [==============================] - 0s 8ms/step - loss: 0.5019 -
Epoch 22/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4986 -
Epoch 23/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4942 -
Epoch 24/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4878 -
Epoch 25/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4868 -
Epoch 26/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4809 -
Epoch 27/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4796 -
Epoch 28/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4743 -
Epoch 29/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4781 -
Epoch 30/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4646 -
Epoch 31/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4650 -
Epoch 32/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4595 -
Epoch 33/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4548 -
Epoch 34/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4551 -
Epoch 35/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4486 -
Epoch 36/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4467 -
Epoch 37/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4424 -
Epoch 38/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4385 -
Epoch 39/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4401 -
Epoch 40/40
58/58 [==============================] - 0s 8ms/step - loss: 0.4334 -
58/58 [==============================] - 1s 8ms/step - loss:
0.0000e+00 - tp: 1752.0000 - fp: 588.0000 - tn: 1208.0000 - fn:
auc: 0.9003
6/6 [==============================] - 0s 10ms/step - loss: 0.0000e+00
22/22 [==============================] - 0s 8ms/step - loss:
0.0000e+00 - tp: 616.0000 - fp: 292.0000 - tn: 365.0000 - fn: 80.0000
The number of neurons in both middle layers was changed to 32. This significantly reduced
the training time, from ~1 second per step to ~8 ms per step. This is attributed to there
being exponentially fewer weights and biases to update upon every learning iteration.
Surprisingly, reducing the number of neurons only reduced the test-set accuracy slightly
(77% compared to ~81% seen in 1(d)).
A possible explanation for why good accuracy is retained when reducing the number of
neurons significantly could be the fact that there are fewer features distinguishing an
image of a sea and an image of a building, compared to something more complex like an
image of a cat or a dog. An image of a sea generally has more bluer colors and has less
granular detail (smoothness of water compared to a building facade with many windows).
Therefore, as images of seas and buildings could theoretically be more linearly separable,
good results still may be achieved with fewer neurons.
The compromise in accuracy may be worth the gain in computation time for some non-
critical applications.
modelL = tf.keras.Sequential([
])
modelL.compile(
optimizer=opt,
)
historyL = modelL.fit(
train_data,
epochs=40
)
acc = historyL.history["accuracy"]
val_acc = historyL.history['val_accuracy']
loss = historyL.history['loss']
val_loss = historyL.history['val_loss']

plt.legend()
plt.figure()

plt.legend()
plt.show()
modelS.compile(metrics=fresh_metrics())
train_metric = modelS.evaluate(train_data, return_dict=True)
validation_metric = modelS.evaluate(validation_data, return_dict=True)
test_metric = modelS.evaluate(test_data, return_dict=True)
Epoch 1/40
58/58 [==============================] - 1s 16ms/step - loss: 1.2910 -
Epoch 2/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6697 -
Epoch 3/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6594 -
Epoch 4/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6441 -
Epoch 5/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6634 -
Epoch 6/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6326 -
Epoch 7/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6072 -
Epoch 8/40
58/58 [==============================] - 1s 14ms/step - loss: 0.6009 -
Epoch 9/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5946 -
Epoch 10/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5633 -
Epoch 11/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5585 -
Epoch 12/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5401 -
Epoch 13/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5368 -
Epoch 14/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5351 -
Epoch 15/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5292 -
Epoch 16/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5097 -
Epoch 17/40
58/58 [==============================] - 1s 14ms/step - loss: 0.5106 -
Epoch 18/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4934 -
Epoch 19/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4866 -
Epoch 20/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4896 -
Epoch 21/40
58/58 [==============================] - 1s 15ms/step - loss: 0.4754 -
Epoch 22/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4450 -
Epoch 23/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4402 -
Epoch 24/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4657 -
Epoch 25/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4589 -
Epoch 26/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4413 -
Epoch 27/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4338 -
Epoch 28/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4484 -
Epoch 29/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4451 -
Epoch 30/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4229 -
Epoch 31/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4278 -
Epoch 32/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4228 -
Epoch 33/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3744 -
Epoch 34/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4053 -
Epoch 35/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4169 -
Epoch 36/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4946 -
Epoch 37/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4370 -
Epoch 38/40
58/58 [==============================] - 1s 14ms/step - loss: 0.4342 -
Epoch 39/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3885 -
Epoch 40/40
58/58 [==============================] - 1s 14ms/step - loss: 0.3784 -
58/58 [==============================] - 1s 8ms/step - loss:
0.0000e+00 - tp: 1752.0000 - fp: 588.0000 - tn: 1208.0000 - fn:
auc: 0.9003
6/6 [==============================] - 0s 10ms/step - loss: 0.0000e+00
22/22 [==============================] - 0s 8ms/step - loss:
0.0000e+00 - tp: 616.0000 - fp: 292.0000 - tn: 365.0000 - fn: 80.0000
The learning rate was changed from 0.001 to 0.03. This led to a decrease in accuracy and
more volatile jumps in training loss and accuracy.
A key motivation behind increasing the learning rate by such a high margin was to discover
if the results from 1(d) were constrained by the Stochastic Gradient Descent algorithm
getting stuck in a local minimum.
A higher learning rate allows the algorithm to be less concerned about deeply honing into
anything that appears to lead to a minimum, and more freely explore the nearby space,
which could potentially lead to a lower minimum. This is why the training loss and
accuracy is more volatile than previously.
However, since worse results for accuracy and precision were achieved on the testing set
compared to previously, it appears that the learning rate is perhaps too high. It could be
interesting to experiment with a learning rate lower than 0.001 to see whether better
accuracy can be achieved. This would require more epochs though, which would increase
computation time.
I am expecting comments on training and validation data sets and any evidence of over fitting.
I am also expecting comments on accuracy, precision and recall scores as well.

TMA01 Question 1 (45 Marks)

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

TMA01 Question 1 (45 Marks)

Uploaded by

Copyright:

Available Formats

TMA01 Question 1 (45 marks)

Name: Parth Shah

Completing the TMA

Marks are based on process, not results

Loading and preparing the dataset

IMAGE_SIZE = (150, 150, 3)

label_list = ['buildings', 'sea']

desired_labels_tensor = tf.constant([l.encode('utf-8') for l in

<tf.Tensor: shape=(2,), dtype=string, numpy=array([b'buildings',

Some dicts to convert between numbers and text labels.

({0: 'buildings', 1: 'sea'}, {'buildings': 0, 'sea': 1}, 2)

# return the image and the integer encoded label

<PrefetchDataset element_spec=(TensorSpec(shape=(None, 150, 150, 3),

((64, 150, 150, 3), (64,))

jittered_labels = test_labels + (np.random.random(test_labels.shape) *

Save and reload the model and the training history.

with open('q1_sample_history.json', 'w') as f:

Plot the training history.

plt.plot(epochs, acc, 'ro', label='Training acc')

plt.plot(epochs, loss, 'ro', label='Training loss')

6/6 [==============================] - 0s 13ms/step - loss: 0.0000e+00

TensorShape([None, 150, 150, 3])

(b) (10 marks)

plt.plot(epochs, acc, 'ro', label='Training acc')

plt.plot(epochs, loss, 'ro', label='Training loss')

Epoch 1 Training Accuracy: 59.9351167678833%

(d) (10 marks)

Comment on these results.

Give your answer below

train_metric = model2.evaluate(train_data, return_dict=True)

58/58 [==============================] - 1s 10ms/step - loss:

is_sea = np.count_nonzero((model2_predictions >= 0.5) &

43/43 [==============================] - 0s 5ms/step

False Negatives: 147.0

Give your answer below

Implications if same dataset is used for both training and testing

Advantage of having a separate validation dataset

(f) (10 marks)

Give your answer below

plt.plot(epochs, acc, 'ro', label='Training acc')

plt.plot(epochs, loss, 'ro', label='Training loss')

plt.plot(epochs, acc, 'ro', label='Training acc')

plt.plot(epochs, loss, 'ro', label='Training loss')

You might also like