
MNIST HANDWRITTEN DIGITS CLASSIFICATION USING DIFFERENT CLASSIFIERS
INTRODUCTION

 The MNIST database is a large database of handwritten digits that is commonly used for training various image-processing systems.
 MNIST is often called the "Hello World" of machine learning.
DATASET EXPLORATION
 MNIST is a dataset of handwritten digits.
 Total images: 70,000
 Dataset divided into two parts:
 Training set: 60,000 samples
 Test set: 10,000 samples
 All images are greyscale and the same size.
 Each image fits in a square 28x28 pixel box.
 Digits range from 0 to 9.
 Classes = 10
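The layout above can be sketched in code. This is a minimal sketch using stand-in NumPy arrays with the MNIST shapes (the real data would come from a loader such as `keras.datasets.mnist`, which is not assumed here):

```python
import numpy as np

# Stand-in arrays with the MNIST layout described on this slide:
# 70,000 greyscale 28x28 images, labels drawn from 10 digit classes.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(70_000, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=70_000)

# Standard split: first 60,000 for training, last 10,000 for testing.
train_images, test_images = images[:60_000], images[60_000:]
train_labels, test_labels = labels[:60_000], labels[60_000:]

print(train_images.shape)      # (60000, 28, 28)
print(test_images.shape)       # (10000, 28, 28)
print(len(np.unique(labels)))  # 10
```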
OBJECTIVE

We apply multiple classifiers to the MNIST handwritten-digit dataset and compare the results. The classifiers are as follows:

 Logistic Regression
 KNN
 Decision Tree
 Random Forest
 CNN
LOGISTIC REGRESSION

Logistic regression is a supervised learning classification algorithm used to predict the probability of a target variable.
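A minimal sketch of this classifier, using scikit-learn's bundled 8x8 digits dataset as a small offline stand-in for MNIST (the slides do not specify the library or hyperparameters; `max_iter` here is an illustrative choice):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data: scikit-learn's bundled 8x8 digits (10 classes, like MNIST).
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel intensities to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Multinomial logistic regression predicts a probability for each digit class.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print(f"logistic regression accuracy: {clf.score(X_test, y_test):.3f}")
```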
K-NEAREST NEIGHBORS

The k-nearest neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning
algorithm that can be used to solve both classification and regression problems.
The KNN algorithm assumes that similar things exist in close proximity; in other words, similar things are near each other.

“Birds of a feather flock together.”
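This idea can be sketched with scikit-learn, again using its bundled 8x8 digits as a stand-in for MNIST (`n_neighbors=3` is an illustrative choice, not a value from the slides):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: scikit-learn's bundled 8x8 digits (10 classes, like MNIST).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each test digit is labelled by a majority vote among its k closest
# training digits in pixel space: "similar things are near each other".
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(f"KNN accuracy: {knn.score(X_test, y_test):.3f}")
```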


DECISION TREE
A decision tree is a flow-chart-like structure in which each internal node represents a test on a feature (e.g., whether a coin flip comes up heads or tails), each leaf node represents a class label (the decision taken after evaluating all features), and the branches represent conjunctions of features that lead to those class labels.
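A minimal sketch with scikit-learn on its bundled 8x8 digits (a stand-in for MNIST; the tree's hyperparameters are left at their defaults as an illustrative choice):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: scikit-learn's bundled 8x8 digits (10 classes, like MNIST).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each internal node tests one pixel feature against a threshold;
# each leaf assigns one of the 10 digit classes.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)
print(f"decision tree accuracy: {tree.score(X_test, y_test):.3f}")
```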
RANDOM FOREST
Random forest is a supervised learning algorithm used for both classification and regression, though it is mainly used for classification problems. Just as a forest is made up of trees, and more trees mean a more robust forest, the random forest algorithm builds decision trees on data samples, gets a prediction from each of them, and finally selects the best answer by voting. It is an ensemble method that outperforms a single decision tree because it reduces over-fitting by averaging the results.
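The voting ensemble can be sketched as follows, again on scikit-learn's bundled 8x8 digits as a MNIST stand-in (`n_estimators=100` is the scikit-learn default, not a value from the slides):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data: scikit-learn's bundled 8x8 digits (10 classes, like MNIST).
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 decision trees, each fit on a bootstrap sample of the training data;
# the forest's prediction aggregates the per-tree votes.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(f"random forest accuracy: {forest.score(X_test, y_test):.3f}")
```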
CNN
Among neural networks, convolutional neural networks (ConvNets or CNNs) are one of the main architectures for image recognition and image classification. Object detection, face recognition, etc., are some of the areas where CNNs are widely used. For image classification, a CNN takes an input image, processes it, and classifies it under one of a set of categories.
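A small ConvNet for 28x28 greyscale digits can be sketched in Keras. This assumes TensorFlow/Keras is installed, and the layer sizes below are illustrative choices, not the architecture from the slides:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small illustrative ConvNet for 28x28x1 greyscale digit images,
# ending in a 10-way softmax over the digit classes.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_images, train_labels, epochs=5) would then train it
# on the MNIST arrays (not run here).
print(model.output_shape)
```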
ARCHITECTURE DIAGRAM

[Pipeline: Dataset → Pre-Processing → Test-Train Split → Classifiers → Result Evaluation]
IMAGE PRE-PROCESSING

[Pipeline: Input Image (greyscale 2D image) → Reshaping → Normalizing → Image Augmentation → Output (Pre-Processed Image)]
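The reshaping and normalizing steps can be sketched with NumPy on a dummy image batch (the channel axis and the [0, 1] scaling are conventional choices; the slides do not fix the exact targets):

```python
import numpy as np

# Dummy batch of 5 greyscale 28x28 images with 8-bit pixel values.
images = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Reshaping: add a channel axis so each sample is 28x28x1,
# the shape a CNN input layer typically expects.
reshaped = images.reshape(-1, 28, 28, 1)

# Normalizing: scale pixel intensities from [0, 255] to [0, 1].
normalized = reshaped.astype("float32") / 255.0

print(normalized.shape)
print(float(normalized.max()))
```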


DATA AUGMENTATION
The amount of data gathered was low and could cause the models to under-fit. Hence, we use data augmentation to increase the amount of data. This technique relies on rotations, flips, etc. to create similar images.
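A minimal augmentation sketch using SciPy's image transforms (the angle and shift values are illustrative; note that for digits, small rotations and shifts are usually safer than full flips, since a mirrored digit is generally no longer a valid digit):

```python
import numpy as np
from scipy.ndimage import rotate, shift

def augment(image, angle=10.0, dx=1, dy=1):
    """Return extra training samples derived from one image:
    a slightly rotated copy and a slightly shifted copy."""
    rotated = rotate(image, angle=angle, reshape=False)
    shifted = shift(image, (dy, dx))
    return [rotated, shifted]

# Crude vertical-stroke stand-in for a handwritten "1".
digit = np.zeros((28, 28), dtype=np.float32)
digit[8:20, 12:16] = 1.0

extra = augment(digit)
print(len(extra), extra[0].shape)
```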
CHECK: IS THE DATASET BALANCED?

[Bar chart of per-class sample counts for digits 0-9 — the dataset is approximately balanced.]
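The balance check amounts to counting samples per class. A sketch with NumPy, using random stand-in labels in place of the real MNIST labels:

```python
import numpy as np

# Stand-in for the 70,000 MNIST labels (digits 0-9).
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=70_000)

# Count how many samples fall into each of the 10 digit classes.
counts = np.bincount(labels, minlength=10)
for digit, n in enumerate(counts):
    print(digit, n)

# "Approximately balanced" here: no class count strays far from the rest.
print(counts.max() / counts.min())
```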
COMPARISON OF THE LEARNING METHODS

   METHOD               ACCURACY (%)
1  Logistic Regression  92.05
2  KNN                  96.94
3  Decision Tree        85.25
4  Random Forest        96.87
5  CNN                  97.39
Comparison of multiple classifiers

[Bar chart of accuracies (%): LR 92.05, KNN 96.94, DT 85.25, RF 96.87, CNN 97.39]
THANK YOU
