Data Augmentation

So far, we've selected a model architecture that vastly improves the model's performance, as it is designed to
recognize important features in the images. The validation accuracy still lags behind the training accuracy,
however, which is a sign of overfitting: the model gets confused by examples it has not seen before when it is
evaluated against the validation dataset.

In order to teach our model to be more robust when looking at new data, we're going to programmatically
increase the size and variance in our dataset. This is known as data augmentation (https://link.springer.com
/article/10.1186/s40537-019-0197-0), a useful technique for many deep learning applications.

The increase in size gives the model more images to learn from while training. The increase in variance helps
the model ignore unimportant features and select only the features that are truly important in classification,
allowing it to generalize better.

Objectives

Augment the ASL dataset
Use the augmented data to train an improved model
Save the well-trained model to disk for use in deployment

Preparing the Data

As we're in a new notebook, we will load and process our data again. To do this, execute the following cell:

In [1]: import tensorflow.keras as keras
import pandas as pd

# Load in our data from CSV files
train_df = pd.read_csv("data/asl_data/sign_mnist_train.csv")
valid_df = pd.read_csv("data/asl_data/sign_mnist_valid.csv")

# Separate out our target values
y_train = train_df['label']
y_valid = valid_df['label']
del train_df['label']
del valid_df['label']

# Separate out our image vectors
x_train = train_df.values
x_valid = valid_df.values

# Turn our scalar targets into binary categories
num_classes = 24
y_train = keras.utils.to_categorical(y_train, num_classes)
y_valid = keras.utils.to_categorical(y_valid, num_classes)

# Normalize our image data
x_train = x_train / 255
x_valid = x_valid / 255

# Reshape the image data for the convolutional network
x_train = x_train.reshape(-1, 28, 28, 1)
x_valid = x_valid.reshape(-1, 28, 28, 1)

Model Creation

We will also need to create our model again. To do this, execute the following cell. You will notice this is the
same model architecture as the last section:

In [2]: from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Dense,
    Conv2D,
    MaxPool2D,
    Flatten,
    Dropout,
    BatchNormalization,
)

model = Sequential()
model.add(Conv2D(75, (3, 3), strides=1, padding="same", activation="relu",
                 input_shape=(28, 28, 1)))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2), strides=2, padding="same"))
model.add(Conv2D(50, (3, 3), strides=1, padding="same", activation="relu"))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2), strides=2, padding="same"))
model.add(Conv2D(25, (3, 3), strides=1, padding="same", activation="relu"))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2), strides=2, padding="same"))
model.add(Flatten())
model.add(Dense(units=512, activation="relu"))
model.add(Dropout(0.3))
model.add(Dense(units=num_classes, activation="softmax"))

Data Augmentation

Before compiling the model, it's time to set up our data augmentation.

Keras comes with an image augmentation class called ImageDataGenerator . We recommend checking
out the documentation here (https://keras.io/api/preprocessing/image/#imagedatagenerator-class). It accepts
a series of options for augmenting your data. Later in the course, we'll have you select a proper augmentation
strategy. For now, take a look at the options we've selected below, and then execute the cell to create an
instance of the class:

In [3]: from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=10,  # randomly rotate images in the range (degrees, 0 to 180)
    zoom_range=0.1,  # randomly zoom image
    width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,  # randomly flip images horizontally
    vertical_flip=False,  # don't randomly flip images vertically
)

Take a moment to think about why we would want to flip images horizontally, but not vertically. When you
have an idea, reveal the text below.

Our dataset is pictures of hands signing the alphabet. If we want to use this model to classify hand images
later, it's unlikely that those hands will be upside-down, but they might be left-handed. This kind of
domain-specific reasoning can help you make good decisions for your own deep learning applications.
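
To make this concrete, here is a minimal sketch (not part of the original notebook; it assumes x_train is loaded as above) that uses NumPy to show one training image next to its horizontal flip, the kind of left-handed variant the generator can produce:

import numpy as np
import matplotlib.pyplot as plt

image = np.squeeze(x_train[0])  # one 28x28 training image
flipped = np.fliplr(image)      # its mirror image: a "left-handed" sign

fig, (ax1, ax2) = plt.subplots(ncols=2)
ax1.imshow(image)
ax1.set_title("original")
ax2.imshow(flipped)
ax2.set_title("horizontal flip")
plt.show()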

Batch Size
Another benefit of the ImageDataGenerator is that it batches (https://machinelearningmastery.com/how-
to-control-the-speed-and-stability-of-training-neural-networks-with-gradient-descent-batch-size/) our data so
that our model can train on a random sample.

If the sampling is truly random (http://sites.utexas.edu/sos/random/), meaning that the data is properly shuffled
so it's fair like a deck of cards, then our sample can do a good job of representing all of our data even though
it is a tiny fraction of the population. For each step of the training, the model will be dealt a new batch.

In practice, a batch size of 32 or 64 does well. Run the cell below to see what kind of batches we'll be
training our model with. Is our randomizer fairly randomizing? Are all of the images recognizable ASL letters?

In [4]: import matplotlib.pyplot as plt
import numpy as np

batch_size = 32
img_iter = datagen.flow(x_train, y_train, batch_size=batch_size)

x, y = img_iter.next()
fig, ax = plt.subplots(nrows=4, ncols=8)
for i in range(batch_size):
    image = x[i]
    ax.flatten()[i].imshow(np.squeeze(image))
plt.show()

Fitting the Data to the Generator

Next, the generator must be fit on the training dataset.

In [5]: datagen.fit(x_train)
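
Note that with the options chosen above, this fit is effectively a no-op: ImageDataGenerator only needs fitting when it computes statistics from the data. A short sketch (illustrative only, not used in this notebook) of a configuration that genuinely requires it:

# These featurewise options standardize images using dataset-wide statistics,
# so fit() must run first to compute the mean and std over x_train.
stats_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
)
stats_datagen.fit(x_train)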

Compiling the Model

With the data generator instance created and fit to the training data, the model can now be compiled in the
same way as our earlier examples:

In [6]: model.compile(loss='categorical_crossentropy', metrics=['accuracy'])

Training with Augmentation

When using an image data generator with Keras, a model trains a bit differently: instead of just passing the
x_train and y_train datasets into the model, we pass the generator in, calling the generator's flow
(https://keras.io/api/preprocessing/image/) method. This causes the images to get augmented live and in
memory right before they are passed into the model for training.

Generators can supply an indefinite amount of data, so when we use one for training, we need to explicitly
set how long each epoch should run; otherwise the epoch will go on indefinitely, with the generator producing
an endless stream of augmented images for the model.

We explicitly set how long we want each epoch to run using the steps_per_epoch named argument.
Because steps * batch_size = number_of_images_trained per epoch, a common practice,
which we will use here, is to set the number of steps equal to the non-augmented dataset size divided by the
batch_size (which has a default value of 32).
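
As a quick sanity check of that arithmetic (a sketch; the Sign Language MNIST training set contains 27,455 images, which matches the 858 steps per epoch in the output below):

# Each epoch then covers roughly one pass over the non-augmented dataset.
steps = len(x_train) / batch_size
print(steps)  # 27455 / 32 ≈ 857.97, so Keras runs 858 steps per epoch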

Run the following cell to see the results. The training will take longer than before, which makes sense given
we are now training on more data than previously:

In [7]: model.fit(img_iter,
          epochs=20,
          steps_per_epoch=len(x_train)/batch_size,  # Run same number of steps we would if we were not using a generator.
          validation_data=(x_valid, y_valid))

Epoch 1/20
858/857 [==============================] - 8s 10ms/step - loss: 1.0038 - accuracy: 0.6756 - val_loss: 0.3289 - val_accuracy: 0.8733
Epoch 2/20
858/857 [==============================] - 7s 9ms/step - loss: 0.2871 - accuracy: 0.9029 - val_loss: 2.2686 - val_accuracy: 0.6238
Epoch 3/20
858/857 [==============================] - 7s 8ms/step - loss: 0.1903 - accuracy: 0.9358 - val_loss: 0.4617 - val_accuracy: 0.8716
Epoch 4/20
858/857 [==============================] - 8s 9ms/step - loss: 0.1395 - accuracy: 0.9542 - val_loss: 0.5177 - val_accuracy: 0.8593
Epoch 5/20
858/857 [==============================] - 7s 9ms/step - loss: 0.1171 - accuracy: 0.9627 - val_loss: 0.1270 - val_accuracy: 0.9576
Epoch 6/20
858/857 [==============================] - 7s 8ms/step - loss: 0.1093 - accuracy: 0.9666 - val_loss: 0.0894 - val_accuracy: 0.9745
Epoch 7/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0913 - accuracy: 0.9716 - val_loss: 0.6857 - val_accuracy: 0.8107
Epoch 8/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0841 - accuracy: 0.9741 - val_loss: 0.0666 - val_accuracy: 0.9776
Epoch 9/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0774 - accuracy: 0.9771 - val_loss: 0.0723 - val_accuracy: 0.9727
Epoch 10/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0690 - accuracy: 0.9797 - val_loss: 0.9610 - val_accuracy: 0.8537
Epoch 11/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0704 - accuracy: 0.9800 - val_loss: 0.1252 - val_accuracy: 0.9653
Epoch 12/20
858/857 [==============================] - 8s 9ms/step - loss: 0.0630 - accuracy: 0.9813 - val_loss: 0.4922 - val_accuracy: 0.9037
Epoch 13/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0680 - accuracy: 0.9807 - val_loss: 0.0504 - val_accuracy: 0.9876
Epoch 14/20
858/857 [==============================] - 7s 9ms/step - loss: 0.0595 - accuracy: 0.9831 - val_loss: 0.0459 - val_accuracy: 0.9848
Epoch 15/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0644 - accuracy: 0.9825 - val_loss: 0.0119 - val_accuracy: 0.9948
Epoch 16/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0539 - accuracy: 0.9849 - val_loss: 0.1671 - val_accuracy: 0.9615
Epoch 17/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0544 - accuracy: 0.9854 - val_loss: 0.0752 - val_accuracy: 0.9757
Epoch 18/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0613 - accuracy: 0.9826 - val_loss: 0.1388 - val_accuracy: 0.9637
Epoch 19/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0547 - accuracy: 0.9858 - val_loss: 0.0358 - val_accuracy: 0.9870
Epoch 20/20
858/857 [==============================] - 7s 8ms/step - loss: 0.0500 - accuracy: 0.9854 - val_loss: 0.2121 - val_accuracy: 0.9388
Out[7]: <tensorflow.python.keras.callbacks.History at 0x7f96d9c24eb8>

Discussion of Results

You will notice that the validation accuracy is higher and more consistent. This means that our model is no
longer overfitting the way it was; it generalizes better, making better predictions on new data.
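
To check this directly, here is a minimal sketch (not in the original notebook) that asks Keras for the final loss and accuracy on the validation set:

# model.evaluate returns the loss followed by each compiled metric.
val_loss, val_accuracy = model.evaluate(x_valid, y_valid)
print(f"validation accuracy: {val_accuracy:.4f}")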

Saving the Model

Now that we have a well-trained model, we will want to deploy it to perform inference on new images.

It is common, once we have a trained model that we are happy with, to save it to disk.

Saving the model in Keras is quite easy using the save method. There are different formats that we can save
in, but we'll use the default for now. If you'd like, feel free to check out the documentation
(https://www.tensorflow.org/guide/keras/save_and_serialize). In the next notebook, we'll load the model and
use it to read new sign language pictures:

In [8]: model.save('asl_model')

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: asl_model/assets
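
As a preview of the next notebook, loading the saved model back is the mirror image of saving it. A minimal sketch using the standard Keras counterpart to save:

from tensorflow import keras

# load_model restores the architecture, weights, and training configuration.
loaded_model = keras.models.load_model('asl_model')
loaded_model.summary()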

Summary

In this section you used Keras to augment your dataset. The result is a trained model that overfits less
and generalizes well to the validation images.

Clear the Memory

Before moving on, please execute the following cell to clear up the GPU memory.

In [9]: import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

Out[9]: {'status': 'ok', 'restart': True}

Next

Now that you have a well-trained model saved to disk, you will deploy it in the next section to make
predictions on images it has not yet seen.

Please continue to the next notebook: Model Predictions (04b_asl_predictions.ipynb).
