Professional Documents
Culture Documents
RI - Assignment, Shaima Jahan 201-15-14198, F
RI - Assignment, Shaima Jahan 201-15-14198, F
Assignment - 1
Course Title: Research & Innovation
Course code: CSE 326
Topic name: Methodology (Handwritten Digit Recognition)
Submitted to:
Sharun Akter khushbu
Lecturer
Department of CSE
Daffodil International University
Submitted by:
Name ID Section
Shaima Jahan 201-15-14198
F
2. Methodology
The proposed paper makes use of well-known Machine Learning applied to the customized input
datasets for recognizing the characters and segregating digits from letters.
3
2.1 Data Collection: Dataset being used in this project is MNIST dataset and Kaggle. The MNIST
database is a huge database of handwritten digits used for training image processing systems. It
has a total of 70000 images of which training set is of 60,000 examples, and test set of 10,000
examples. The images are stored in a file format that is designed for storing vectors and
multidimensional matrices. The images have grey levels by applying an anti-aliasing technique
used by the normalization algorithm. By calculating the center of mass of the pixels of the image
and moving the image so as to position this point at the center, the image is centered in the
28*28 field. Handwritten Letters are collected from Kaggle platform. The dataset has 26 folders of
Alphabets A-Z, that includes handwritten images in size 28*28 pixels, each alphabet in the image
is center fitted to 20*20-pixel box.
The dataset contained examples from approximately 250 writers. The sets of writers for the dataset
were disjoint. The Sample digit database of MNIST is shown in figure 1.
√
n
dist ( x 1 , x 2 )= ∑ (x 1i −x2 i )2
i=1
3
Before using the equation above, each attribute is normalized as it helps to prevent attributes with
initially large ranges from outweighed attributes with initially small ranges. This can be done by Min
– max normalization, for transforming a value v of a numeric attribute A to v’ in the range [0,1] by
computing:
v−min A
v' =
max A−min A
where minA and maxA are the minimum and maximum values of attribute A.