
Digit recognizer using CNN

Building a simple Convolutional Neural Network using the MNIST dataset to recognize handwritten digits.

Dataset:

MNIST (“Modified National Institute of Standards and Technology”) is the de facto “Hello World” dataset of computer vision. Since its release in 1999, this classic dataset of handwritten images has served as the basis for benchmarking classification algorithms. As new machine learning techniques emerge, MNIST remains a reliable resource for researchers and learners alike.

Data Processing:
import tensorflow as tf

# Load MNIST as (images, labels) pairs for the training and test splits
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

The dataset contains 60,000 training images and 10,000 testing images. Here I load the data already split into training and testing sets. x_train and x_test contain the grayscale pixel codes, while y_train and y_test contain the labels, integers from 0 to 9 that give the digit each image represents.

Next, check the shape of the dataset to see whether it is compatible with a CNN. x_train.shape returns (60000, 28, 28), which means that we have 60,000 images in the training set and each image is 28 * 28 pixels.

The Keras Conv2D layer expects a 4-dimensional array (samples, height, width, channels), but we can see from above that we have a 3-dimensional NumPy array.
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
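
To make the reshape concrete, here is a minimal NumPy sketch. It uses a small hypothetical batch of blank images (the name `images` is a stand-in, not the real `x_train`) and shows that the reshape only appends a channel axis of size 1, the same result `np.expand_dims` would give.

```python
import numpy as np

# Hypothetical stand-in for x_train: 5 blank images of MNIST's size.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Append a trailing channel axis of size 1 (one grayscale channel).
reshaped = images.reshape(images.shape[0], 28, 28, 1)

# np.expand_dims adds the same axis without spelling out the sizes.
expanded = np.expand_dims(images, axis=-1)

print(images.shape)    # (5, 28, 28)
print(reshaped.shape)  # (5, 28, 28, 1)
print(np.array_equal(reshaped, expanded))  # True
```

No pixel values change in this step; only the array's shape metadata does.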

So, here we convert the 3-dimensional NumPy array into a 4-dimensional one, and after that we set the type to float so that we have floating-point values after the division.
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

Now coming to the normalizing part, which we will always need to do in our neural networks. This is done by dividing the pixel values by 255 (the maximum grayscale code, 255, minus the minimum, 0), which maps every pixel into the range 0 to 1.
x_train /= 255
x_test /= 255
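
The order of the two steps matters, and a small sketch can show why. The example below uses a hypothetical three-pixel array (`pixels` is an illustrative name) rather than the real dataset: in-place division on the raw integer type is rejected by NumPy, while casting to float32 first gives values between 0 and 1.

```python
import numpy as np

# Hypothetical pixel values stored as unsigned 8-bit integers, like raw MNIST.
pixels = np.array([0, 51, 255], dtype=np.uint8)

# In-place true division cannot write float results back into a uint8 array.
try:
    pixels /= 255
    in_place_ok = True
except TypeError:
    in_place_ok = False

# Casting first, then dividing, yields floats in the range [0, 1].
scaled = pixels.astype('float32') / 255

print(in_place_ok)                 # False
print(scaled.dtype)                # float32
print(scaled.min(), scaled.max())  # 0.0 1.0
```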

Building the Model:


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Conv2D, Dropout,
                                     Flatten, MaxPooling2D)

model = Sequential()
# 28 filters of size 3x3 applied to the 28x28x1 input
model.add(Conv2D(28, kernel_size=(3, 3), input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the 2D feature maps into a 1D vector for the dense layers
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dropout(0.2))
# One output per digit class, as probabilities
model.add(Dense(10, activation=tf.nn.softmax))
I use the Keras API to build the model since I have a TensorFlow background. I import the Sequential model from Keras and add Conv2D, MaxPooling2D, Flatten, Dropout, and Dense layers.
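
It can help to work out by hand what each layer does to the data. The sketch below is plain arithmetic, not Keras code, and assumes the Conv2D defaults of stride 1 and 'valid' padding; the variable names are illustrative only.

```python
# Shape and parameter arithmetic for the layers in the model above.
height, width, channels = 28, 28, 1
filters, kernel = 28, 3

# A 3x3 convolution with 'valid' padding shrinks each side by kernel - 1.
conv_h, conv_w = height - kernel + 1, width - kernel + 1   # 26, 26
# Each filter has kernel*kernel*channels weights plus one bias.
conv_params = (kernel * kernel * channels + 1) * filters   # 280

# 2x2 max pooling halves each spatial side and has no parameters.
pool_h, pool_w = conv_h // 2, conv_w // 2                  # 13, 13

# Flatten turns the 13x13x28 feature maps into one vector.
flat = pool_h * pool_w * filters                           # 4732

# Dense layers have inputs*units weights plus one bias per unit.
dense1_params = flat * 128 + 128                           # 605824
dense2_params = 128 * 10 + 10                              # 1290

total = conv_params + dense1_params + dense2_params
print(total)  # 607394
```

Almost all of the parameters sit in the first Dense layer, which is a common pattern when a Flatten layer feeds a wide fully connected layer.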

Dropout layers fight overfitting by randomly disregarding some of the neurons during training, while the Flatten layer flattens the 2D feature maps into a 1D array before the fully connected layers.
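
As a rough illustration of these two layers, here is a NumPy sketch under stated assumptions: the feature maps are hypothetical arrays of ones, and dropout is written in the "inverted" form, where surviving values are scaled up by 1 / (1 - rate) during training. It is not the Keras implementation, only the idea behind it.

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.2  # fraction of values to drop, as in Dropout(0.2)

# Hypothetical pooled feature maps: 2 samples of 13x13 maps with 28 channels.
feature_maps = np.ones((2, 13, 13, 28), dtype=np.float32)

# Flatten: keep the sample axis, merge everything else into one vector.
flat = feature_maps.reshape(feature_maps.shape[0], -1)

# Dropout (training time): zero out roughly `rate` of the values at random
# and rescale the rest so the expected sum stays the same.
keep_mask = rng.random(flat.shape) >= rate
dropped = flat * keep_mask / (1.0 - rate)

print(flat.shape)        # (2, 4732)
print(keep_mask.mean())  # roughly 0.8
```

At inference time no values are dropped, which is why the rescaling is done during training instead.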

Compiling and fitting the Model:

So far, we have created an untrained CNN. Next I set an optimizer and a loss function, choose a metric to report, and fit the model on our training data. The Adam optimizer is a widely used default that often works well with little tuning, which is why I used it.
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x=x_train,y=y_train, epochs=10)
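
The loss name deserves a short explanation: "sparse" means the labels are plain integers (0 to 9) rather than one-hot vectors. The sketch below is a hand-written NumPy version of the idea, using a hypothetical helper and made-up probabilities for a 3-class problem; it is not the Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability the model assigned to the true class)."""
    # Pick, for each row, the probability at that row's integer label.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Made-up softmax outputs for two samples and their integer labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])

loss = sparse_categorical_crossentropy(probs, labels)
print(round(loss, 4))  # 0.2899
```

The loss shrinks toward 0 as the probability on the correct class approaches 1, which is what training pushes the model toward.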

Here we get pretty high accuracy with just 10 epochs. Since the dataset doesn’t need heavy computational power, you can play around with the number of epochs. You can also play around with the optimizer, loss function, and metrics.

Model Evaluation:
model.evaluate(x_test, y_test)

When this model is evaluated, we see that just 10 epochs gave us an accuracy of 98.59% at a very low loss.
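
For intuition about what the accuracy metric measures, here is a small NumPy sketch with made-up probabilities for a 2-class problem (the arrays are hypothetical, not model output): the predicted class is the index of the largest probability, and accuracy is the fraction of predictions that match the labels.

```python
import numpy as np

# Made-up class probabilities for four samples, plus their true labels.
probs = np.array([[0.1, 0.9],
                  [0.8, 0.2],
                  [0.3, 0.7],
                  [0.6, 0.4]])
labels = np.array([1, 0, 0, 0])

# The predicted class is the index of the highest probability in each row.
predicted = probs.argmax(axis=1)

# Accuracy is the share of samples where prediction and label agree.
accuracy = float((predicted == labels).mean())

print(predicted)  # [1 0 1 0]
print(accuracy)   # 0.75
```
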
Now to check its prediction:
import matplotlib.pyplot as plt

image_index = 2853
# Show the chosen test image
plt.imshow(x_test[image_index].reshape(28, 28), cmap='Greys')
plt.show()
# Predict on a batch that contains just this one image
pred = model.predict(x_test[image_index].reshape(1, 28, 28, 1))
print(pred.argmax())

Here we select an image, run it through the model to get the prediction, and then display both the image and the prediction to see if it’s accurate.

Image by author

And that is how you can build and implement a simple convolutional neural network. You can apply the same approach to many other kinds of classification problems.
