Experiment No. 04
Aim :- To implement Handwritten Digit Recognition.
Theory:-
Handwritten Character Recognition
Handwritten character recognition is a field of research in artificial intelligence, computer vision, and pattern
recognition. A computer performing handwriting recognition is said to be able to acquire and detect
characters in paper documents, pictures, touch-screen devices and other sources and convert them into
machine-encoded form. Its application is found in optical character recognition, transcription of handwritten
documents into digital documents and more advanced intelligent character recognition systems.
Handwritten character recognition can be thought of as a subset of the image recognition problem.
Basically, the algorithm takes an image (image of a handwritten digit) as an input and outputs the likelihood
that the image belongs to each of the classes (the machine-encoded digits, 0–9).
The goal is to take an image of a handwritten digit and determine what that digit is. The digits range from
zero (0) through nine (9).
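As a minimal sketch of how a classifier can report a likelihood for each class, the snippet below converts ten raw scores into probabilities with a softmax and picks the most likely digit. The scores are made-up values for illustration and are not taken from this experiment.

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical raw scores for the ten digit classes 0..9 (illustrative only)
scores = np.array([0.1, 0.3, 0.2, 2.5, 0.0, 0.4, 0.1, 0.9, 0.2, 0.3])
probabilities = softmax(scores)

# The predicted digit is the class with the highest probability
predicted_digit = int(np.argmax(probabilities))
print(predicted_digit)
```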
We will look into the Support Vector Machines (SVMs) and Nearest Neighbor (NN) techniques to solve the
problem. The tasks involved are the following:
1. Download the MNIST dataset
2. Preprocess the MNIST dataset
3. Train a classifier that can categorize the handwritten digits
4. Apply the model on the test set and report its accuracy
The dataset for this problem will be downloaded from Kaggle; it is derived from the well-known MNIST
(Modified National Institute of Standards and Technology) dataset.
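To illustrate the Nearest Neighbor idea mentioned above, here is a minimal NumPy sketch of a 1-nearest-neighbor classifier. It runs on tiny made-up vectors rather than MNIST, and the helper name `predict_1nn` is hypothetical.

```python
import numpy as np

def predict_1nn(train_x, train_y, query_x):
    """Give each query the label of its closest training example (Euclidean distance)."""
    # Pairwise squared distances, shape (n_queries, n_train)
    diffs = query_x[:, None, :] - train_x[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(sq_dists, axis=1)
    return train_y[nearest]

# Toy "flattened images" with 4 pixels each (made-up values, not MNIST)
train_x = np.array([[0.0, 0.0, 1.0, 1.0],
                    [1.0, 1.0, 0.0, 0.0],
                    [0.9, 1.0, 0.1, 0.0]])
train_y = np.array([1, 7, 7])

query_x = np.array([[0.1, 0.0, 0.9, 1.0],
                    [1.0, 0.8, 0.0, 0.1]])
predictions = predict_1nn(train_x, train_y, query_x)
print(predictions)
```

On real MNIST data the same function would take the flattened 784-value images, although computing all pairwise distances at once would need to be done in batches to fit in memory.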
Metrics
We will use the accuracy score to quantify the performance of our model. Accuracy tells us what
percentage of our test data was classified correctly. It is a good choice of metric because it makes it
easy to compare our model’s performance with that of the benchmark, which uses the same metric. Also,
our dataset is balanced (roughly equal number of training examples for each label), which makes accuracy
appropriate for this problem.
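The accuracy score described above can be sketched in a few lines. The labels below are made up for illustration, and the helper name `accuracy` is hypothetical.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

# Made-up true and predicted labels for eight test images
y_true = np.array([7, 2, 1, 0, 4, 1, 4, 9])
y_pred = np.array([7, 2, 1, 0, 4, 1, 9, 9])

acc = accuracy(y_true, y_pred)
print(f"{acc:.3f}")  # 7 of 8 correct -> 0.875
```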
We counted the number of occurrences of each label in the training set. The figure below illustrates
the distribution of these labels. The figure shows that the distribution is close to uniform, meaning our
dataset is balanced.
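Counting the labels can be sketched with `np.bincount`. The ten labels below are made up for illustration; with the real data the same call would be applied to the training labels.

```python
import numpy as np

# Made-up labels standing in for the training labels
labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4])

# counts[d] is the number of examples whose label is the digit d
counts = np.bincount(labels, minlength=10)

for digit, count in enumerate(counts):
    print(digit, count)
```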
We would also like to know more about the average intensity, that is, the average value of a pixel in an
image, for the different digits. Intuition suggests that the digit “1” will on average have a lower intensity
than, say, an “8”. As we can see, there are differences in intensities and our intuition was correct: “8” has
a higher intensity than “1”. Also, “0” has the highest intensity, even higher than “8”, which is surprising.
This could be attributed to the fact that different people write their digits differently. Calculating the
standard deviation of intensities gives a value of 11.08, which shows that there is some variation in the way
the digits are written.
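A minimal sketch of the intensity comparison follows. It uses four tiny made-up 2x2 images instead of MNIST, and it takes the standard deviation over the per-digit averages, which is only one possible reading of the figure quoted above.

```python
import numpy as np

# Four made-up 2x2 grayscale images and their labels (not MNIST data)
images = np.array([[[0, 255], [0, 255]],
                   [[0, 0], [0, 255]],
                   [[255, 255], [255, 255]],
                   [[255, 255], [0, 255]]], dtype=np.uint8)
labels = np.array([1, 1, 8, 8])

# Average pixel value of each image
per_image_mean = images.reshape(len(images), -1).mean(axis=1)

# Average intensity per digit class
mean_by_digit = {int(d): float(per_image_mean[labels == d].mean())
                 for d in np.unique(labels)}

# Spread of the per-digit averages (assumed interpretation of the reported value)
std_of_means = float(np.std(list(mean_by_digit.values())))
print(mean_by_digit, std_of_means)
```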
10/23/23, 10:18 AM SC_LAB_04.ipynb - Colaboratory
created by Y.A
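import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import numpy as np
# The loading cell is missing from the printout; keras.datasets.mnist is assumed here
# because it yields the 60000/10000 split shown below.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()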
len(x_train)
60000
len(x_test)
10000
x_train[0].shape
(28, 28)
x_train[0]
array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3,
18, 18, 18, 126, 136, 175, 26, 166, 255, 247, 127, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 30, 36, 94, 154, 170,
253, 253, 253, 253, 253, 225, 172, 253, 242, 195, 64, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 49, 238, 253, 253, 253, 253,
253, 253, 253, 253, 251, 93, 82, 82, 56, 39, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 18, 219, 253, 253, 253, 253,
253, 198, 182, 247, 241, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 80, 156, 107, 253, 253,
205, 11, 0, 43, 154, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 14, 1, 154, 253,
90, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 139, 253,
190, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 11, 190,
253, 70, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 35,
241, 225, 160, 108, 1, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
81, 240, 253, 253, 119, 25, 0, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 45, 186, 253, 253, 150, 27, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 16, 93, 252, 253, 187, 0, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
https://colab.research.google.com/drive/19OgK5-p_trwMC3ixplDxM4anOBYMwwMd#printMode=true 1/7
0, 0, 0, 0, 249, 253, 249, 64, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 46, 130, 183, 253, 253, 207, 2, 0, 0, 0, 0, 0,
0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 39, ...
plt.matshow(x_train[8])
<matplotlib.image.AxesImage at 0x79ed41e5d840>
y_train[2]
y_train[:5]
x_train = x_train/255
x_test = x_test/255
x_train_flattened = x_train.reshape(len(x_train),28*28)
x_train_flattened.shape
(60000, 784)
x_test_flattened = x_test.reshape(len(x_test),28*28)
x_test_flattened.shape
(10000, 784)
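The scaling and flattening steps above can be checked on a small stand-in batch. The sketch below uses random pixel values rather than MNIST and confirms that pixel (row r, column c) of an image ends up at index r * 28 + c of its flattened vector.

```python
import numpy as np

# Stand-in batch of 3 grayscale 28x28 images with uint8 pixels (random, not MNIST)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

scaled = images / 255.0                      # pixel values now lie in [0, 1]
flattened = scaled.reshape(len(scaled), -1)  # each 28x28 image becomes a 784-vector

print(flattened.shape)

# Row-major flattening: pixel (r, c) lands at index r * 28 + c
r, c = 5, 12
flat_index = r * 28 + c
print(flat_index, flattened[0, flat_index] == scaled[0, r, c])
```

This matches the printout: the value 3 at row 5, column 12 of x_train[0] appears as 0.01176471 (3/255) at index 152 of x_train_flattened[0].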
arr = np.around(x_train[0])
x_train_flattened[0]
array([0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0.01176471, 0.07058824, 0.07058824,
0.07058824, 0.49411765, 0.53333333, 0.68627451, 0.10196078,
0.65098039, 1. , 0.96862745, 0.49803922, 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0.11764706, 0.14117647, 0.36862745, 0.60392157,
0.66666667, 0.99215686, 0.99215686, 0.99215686, 0.99215686,
0.99215686, 0.88235294, 0.6745098 , 0.99215686, 0.94901961,
0.76470588, 0.25098039, 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0.19215686, 0.93333333,
0.99215686, 0.99215686, 0.99215686, 0.99215686, 0.99215686,
0.99215686, 0.99215686, 0.99215686, 0.98431373, 0.36470588,
0.32156863, 0.32156863, 0.21960784, 0.15294118, 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0.07058824, 0.85882353, 0.99215686, 0.99215686,
0.99215686, 0.99215686, 0.99215686, 0.77647059, 0.71372549,
0.96862745, 0.94509804, 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0.31372549, 0.61176471, 0.41960784, 0.99215686, 0.99215686,
0.80392157, 0.04313725, 0. , 0.16862745, 0.60392157,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0.05490196, ...
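model = keras.Sequential([
    keras.layers.Dense(10, input_shape=(784,), activation="sigmoid")
])
# Compile/fit cell is missing from the printout; the optimizer is assumed.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train_flattened, y_train, epochs=5)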
Epoch 1/5
1875/1875 [==============================] - 10s 3ms/step - loss: 0.4755 - accuracy: 0.8752
Epoch 2/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.3042 - accuracy: 0.9160
Epoch 3/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2833 - accuracy: 0.9206
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2731 - accuracy: 0.9238
Epoch 5/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2669 - accuracy: 0.9252
<keras.src.callbacks.History at 0x79ed3ebbc190>
model.evaluate(x_test_flattened, y_test)
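To make clear what the single Dense layer above computes, here is a minimal NumPy sketch of its forward pass: each of the 10 outputs is a sigmoid of a weighted sum of the 784 input pixels plus a bias. The weights are random and untrained, and the inputs are stand-in values, so the outputs carry no meaning beyond their shape and range.

```python
import numpy as np

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_inputs, n_classes = 784, 10

# Untrained, randomly initialised parameters (illustrative only)
weights = rng.normal(scale=0.01, size=(n_inputs, n_classes))
bias = np.zeros(n_classes)

x = rng.random((2, n_inputs))          # two stand-in flattened images in [0, 1)
outputs = sigmoid(x @ weights + bias)  # shape (2, 10): one score per digit

# 784 * 10 weights plus 10 biases
n_parameters = weights.size + bias.size
print(outputs.shape, n_parameters)
```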
plt.matshow(x_test[10])
plt.matshow(x_test[1])
plt.matshow(x_test[2])
<matplotlib.image.AxesImage at 0x79ed10740a30>
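y_pred = model.predict(x_test_flattened)
y_pred[0]
# The cell for the next output is missing from the printout; a list comprehension
# over np.argmax is assumed.
y_pred_labels = [np.argmax(p) for p in y_pred]
y_pred_labels[:5]
[7, 2, 1, 0, 4]
y_test[:5]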
import numpy as np
np.argmax(y_pred[10])
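The argmax step above works on one prediction at a time. As a sketch, the same idea can be applied to a whole batch with `axis=1`, and the resulting labels can be tallied against the true labels in a confusion matrix. The scores and true labels below are made up for illustration.

```python
import numpy as np

# Made-up model outputs for 3 images over the 10 digit classes
y_scores = np.full((3, 10), 0.05)
y_scores[0, 7] = 0.9
y_scores[1, 2] = 0.8
y_scores[2, 1] = 0.7

# One predicted label per row: the index of the largest score
predicted_labels = np.argmax(y_scores, axis=1)

# Made-up true labels; the third image is deliberately misclassified
true_labels = np.array([7, 2, 0])

# confusion[t, p] counts images with true label t predicted as p
confusion = np.zeros((10, 10), dtype=int)
np.add.at(confusion, (true_labels, predicted_labels), 1)

print(predicted_labels)
print(confusion[0, 1])  # the one image of a 0 that was predicted as 1
```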
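model = keras.Sequential([
    keras.layers.Dense(100, input_shape=(784,), activation="relu"),
    keras.layers.Dense(10, activation="sigmoid")
])
# Compile/fit cell is missing from the printout; the optimizer is assumed.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train_flattened, y_train, epochs=5)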
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2769 - accuracy: 0.9211
Epoch 2/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.1253 - accuracy: 0.9637
Epoch 3/5
1875/1875 [==============================] - 5s 2ms/step - loss: 0.0869 - accuracy: 0.9742
Epoch 4/5
1875/1875 [==============================] - 5s 2ms/step - loss: 0.0652 - accuracy: 0.9808
Epoch 5/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.0520 - accuracy: 0.9840
<keras.src.callbacks.History at 0x79ed422c55d0>
model.evaluate(x_test_flattened, y_test)
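The network with a hidden layer can be sketched the same way in NumPy: a ReLU layer of 100 units followed by a sigmoid layer of 10 units. The weights are random and untrained, so this only illustrates the shapes involved and the parameter count.

```python
import numpy as np

def relu(z):
    """Keep positive values and replace negatives with zero."""
    return np.maximum(z, 0.0)

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Untrained, randomly initialised parameters (illustrative only)
w_hidden = rng.normal(scale=0.05, size=(784, 100))
b_hidden = np.zeros(100)
w_out = rng.normal(scale=0.05, size=(100, 10))
b_out = np.zeros(10)

x = rng.random((2, 784))                   # two stand-in flattened images
hidden = relu(x @ w_hidden + b_hidden)     # shape (2, 100)
outputs = sigmoid(hidden @ w_out + b_out)  # shape (2, 10)

hidden_params = w_hidden.size + b_hidden.size  # 784 * 100 + 100 = 78500
output_params = w_out.size + b_out.size        # 100 * 10 + 10 = 1010
total_params = hidden_params + output_params
print(hidden.shape, outputs.shape, total_params)
```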
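model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),  # flattens each 28x28 image to 784 values
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="sigmoid")
])
# Compile/fit cell is missing from the printout; the optimizer is assumed.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5)  # unflattened images, since the model flattens them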
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2678 - accuracy: 0.9242
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1173 - accuracy: 0.9651
Epoch 3/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.0819 - accuracy: 0.9754
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.0619 - accuracy: 0.9813
Epoch 5/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.0502 - accuracy: 0.9846
<keras.src.callbacks.History at 0x79ec8392be20>