
Recognition of Handwriting Using Convolutional Neural Network for Personality Identification

Prince Kumar¹, Sakshi Srivastava¹, Noor Mohd¹, Fateh Singh Gill², Vaishali Chaudhary³

¹ Computer Science and Engineering Department, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India.
² Department of Allied Sciences, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India.
³ Department of Biotechnology, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India.

kumar.prince976@gmail.com

Abstract: Graphology, or handwriting analysis, is a method of identifying the behavioral characteristics of an individual by analyzing his handwriting. Handwriting can be analyzed on the basis of various parameters, such as the amount of pen pressure applied, the spacing between letters and words, and the baseline. Manual graphology is time-consuming and tiresome, so automated handwriting analysis using image processing and deep learning is preferred over a manual approach. The proposed work analyzes the handwriting of different people using image processing algorithms for image enhancement and a Convolutional Neural Network (CNN), a specific type of Artificial Neural Network, for image classification. The CNN classifies the handwriting into eight different classes, and in this way the personality traits of the writer are identified.

Keywords: Graphology, Deep Learning, ANN, CNN, Image Processing.

1 Introduction

Recognizing the personality of a person on the basis of his handwriting is in demand these days; employment profiling, in which a human resource manager assesses a candidate on the basis of how he writes or signs, is one application of graphology. Handwriting analysis, or graphology, is a technique used to predict the behavioral traits of a person by studying his handwriting. The way in which a person writes (the patterns, strokes, lines, shape, size, baseline, amount of pen pressure applied, spacing between letters, spacing between words, slant of the letters, and so on) can reveal an individual's feelings, current mood and overall personality. Table 1 shows the different parameters considered during the analysis of handwriting and their corresponding personality traits [15].
Graphology mainly comprises two approaches: one is signature analysis and the other is letter recognition, i.e. handwriting analysis [1]. This paper is concerned only with handwriting analysis. The analysis of handwriting, if done manually, is very tiresome and difficult, as it requires a lot of time, patience and practice. So, automated handwriting recognition techniques are preferred over manual analysis, as the use of machines and models reduces the time required.
We use a Convolutional Neural Network, a class of deep learning model, to predict the personality traits of a person by performing handwriting analysis.

Table 1. Some Graphology Studies

Category | Subcategory | Personality Trait
Baseline | Rising upwards | Optimistic, excited, joy
Baseline | Slanting downwards | Pessimistic, fatigue, discouragement, depressed, ill
Baseline | Straight | Reliable, even-tempered, control of emotions
Letter slant | Moderate right | Ability to express opinions, future orientation, expressive
Letter slant | Extreme right | Lack of self-control, impulsive, low frustration tolerance
Letter slant | Moderate left | Reflective, independent, difficulty in adapting
Letter slant | Extreme left | Early rejection, fear of the future, defensive
Amount of pen pressure | Heavy | Very deep and enduring feelings, forgives but never forgets, feels situations intensely
Amount of pen pressure | Light | Endures traumatic feelings without being seriously affected, emotional experiences do not make a lasting impression
Word spacing | Less | Less disposition towards criticism and argument
Word spacing | More | More disposition towards criticism and argument
Letter spacing | Less | Less openness of sentiment/intelligence
Letter spacing | More | More openness of sentiment/intelligence
Size of letters | Large and bold | Desire to get noticed
Size of letters | Small | Less desire to get noticed

1.1 Deep Learning

Machine learning (an application of Artificial Intelligence) is used to train systems by giving them the ability to learn and improve automatically using statistical methods such as Decision Trees, Naive Bayes, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), etc. Deep learning, on the other hand, is a subset of machine learning and uses neural networks to learn data features and the relationships between them. Examples of deep learning models are the Artificial Neural Network (ANN), the Convolutional Neural Network (CNN), Boltzmann Machines, etc. A neural network is composed of artificial neurons, called nodes, that communicate with each other through weighted connections (analogous to the dendrites of biological neurons) in order to train the system and improve from past experience.

1.2 Significance of CNN

Machine learning algorithms for predicting human behavior are less accurate than deep learning algorithms. The proposed work uses a CNN to predict the personality traits of an individual on the basis of his handwriting. A CNN is chosen over a plain ANN because a basic ANN comprises only three main kinds of layers (the input layer, one or more hidden layers depending upon the size of the dataset, and the output layer), whereas a CNN is a specific type of artificial neural network built from four steps: convolution, max pooling, flattening and full connection. Like other networks of perceptrons, it is trained over several epochs, where an epoch refers to one full pass through the entire dataset used to train the model. Owing to these additional steps, the resulting accuracy of the overall model is higher for a CNN [17]. The input to a plain ANN is usually flat, tabular data (generally read from a .csv file), so the spatial structure of an image cannot be exploited, whereas the input to a CNN is an image and is therefore in 3D form (height, width and channels), which constrains the architecture in a more sensible way by transforming a 3D input volume into a 3D output volume using activation functions such as the rectifier and softmax [18].
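
A minimal sketch of these four steps in Keras is given below; the filter count, the dense-layer width and the 128 × 128 single-channel input shape are assumptions made only for illustration, while the eight output classes and the ReLU and softmax activations follow the text.

```python
# Illustrative sketch only: layer sizes and the 128x128 grayscale input shape are
# assumptions; the eight output classes and ReLU/softmax activations follow the paper.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu',
                  input_shape=(128, 128, 1)),      # step 1: convolution (feature maps)
    layers.MaxPooling2D((2, 2)),                   # step 2: max pooling
    layers.Flatten(),                              # step 3: flattening
    layers.Dense(128, activation='relu'),          # step 4: full connection
    layers.Dense(8, activation='softmax')          # eight personality-trait classes
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# One epoch is one full pass through the training set, e.g.:
# model.fit(train_set, validation_data=test_set, epochs=25)
```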

2 Literature Review

H.E.S. et al. performed texture analysis (a text-independent method) on handwriting samples in order to identify different people. Initially, the skewed words in the handwriting were normalized by proper alignment of the spaces between horizontal and vertical lines, and 128 × 128 blocks of text were generated. Then the features were
extracted using Multi-channel Gabor filtering (MCGM) and Grayscale Co-occurrence
Matrix (GSCM) techniques. For recognition of the writer, the Weighted Euclidean
Distance (WED) classifier and the K-Nearest Neighbor classifier (KNN) were used.
The classifiers compared the extracted features with the original features of known
writers. Later the accuracy of both the classifiers was compared. It was found that the
WED classifier and the MCGM technique showed better performance [11]. Later, biometric systems involving the recognition of fingerprints, iris and retina, hand geometry, keystroke dynamics, etc. were used for the verification or identification of humans. Identification was found to be more challenging, as accuracy decreases as the size of the database increases. Biometrics was divided into two categories: devices that relied on physiological traits (such as fingerprints or hand geometry) and systems that relied on behavioral traits (such as signature-detection systems). A biometric authentication system can make two types of error: Type 1, or false rejection, and Type 2, or false acceptance. Parameters such as universality, uniqueness, circumvention, performance, acceptability, permanence and collectability were studied, on the basis of which a biometric trait was chosen [12].
A biometric system gave only two results, whether the person was genuine or not.
H. Gamboa et al. used human-computer interaction to verify the identity of a human, developing a new technique that hinges on interaction through a pointing device such as a mouse. A WIDAM (Web Interaction Display and Monitoring) system was created to record user data from computer interactions, such as successive clicks called strokes, which were stored in a file by a data acquisition system. This file was fed to the recognition system, which was responsible for extracting features by performing operations such as preprocessing (where signals were cleaned through a smoothing procedure), using the arctan function to create a spatial vector, computing time-related features as a temporal vector, and finally generating the features. Sequential Forward Selection (SFS) was used to create a subset of the best features to distinguish between different users, and sequential classifiers were used to distinguish between the genuine and impostor distributions [6].
R. V. Yampolskiy classified behavioral biometric into 29 types. Audit logs, blinking
patterns, call-stack systems, calling behavior, usage of credit card, car driving style
recognition systems, face recognition systems, gait analysis, observing e-mail behav-
ior, game strategies, interaction with the GUI, lip dynamic, keyboard/typing dynam-
ics, mouse dynamics, painting style, graphology, programming style, storage activi-
ties, etc. were used for human identification. A generalized algorithm was applied to
these human activities and the user was verified or rejected. A summary of behavioral biometrics was compiled, pairing each behavior with its corresponding traits and features [9].
Human behavior was divided into two categories - microscopic and macroscopic.
Microscopic meant short time frame behaviors, like a blink, yawn, etc. whereas
macroscopic meant longer time frames like pattern recognition systems. A combined
use of both these behaviors was found in ambient training which involved predicting
daily activities of people in an environment containing sensors using SVM and Con-
ditional Random Fields. Human pose and gaze estimation was done via visual action
recognition systems by considering a human body as a collection of feature points,
shapes and figures. Behavioral cues were used to tell about the personality of a person
by using social signal processing, done by analyzing the emotion in the speech by
training SVM classifiers. Non-verbal behavior was also studied by analyzing the fa-
cial expressions of a person [7].
S. Prasad et al. used a segmentation method to predict the personality of the writer. The images of the handwriting of 100 different writers were studied, preprocessed (in order to remove noise) and resized. Next, the image was segmented in three ways: word segmentation (for finding the writer's attitude towards criticism), letter segmentation (for predicting personality) and line segmentation (for figuring out the emotional stability
of the writer). Then the features like size of letters, its slant, baseline, amount of pen
pressure and the spacing were extracted and values in the form of -1, 0 and 1 were ob-
tained. These values were combined and fed to an SVM classifier which used Radial
Basis Function (RBF) to predict human behavior [13].
To increase accuracy, deep learning models came to be used over machine learning ones for the prediction of human behavior. Champa H.N. et al. created an ANN model for predicting the personality traits of a human by studying different samples of the letter 't' on the basis of writing pressure, baseline and the height of the t-bar on the stem. The baseline, extracted using a polygonization method, classified whether the person is pessimistic, optimistic or level-headed. The writing pressure indicates emotional intensity: the greater the pressure, the deeper and more enduring the feelings. The height of the bar on the letter 't', matched against predefined templates, revealed the writer's self-esteem as high, moderate, low or that of a dreamer. These three criteria for studying the letter 't' were fed to an ANN, and the model was trained using the back-propagation algorithm. The output of the ANN indicated the personality of the writer [8].
It was found that 3D face recognition, hand geometry, fingerprint, iris and retina recognition systems cannot be used on their own for identification because of the complications they pose for physically impaired people. So the combinations of "speech and features of signature" and "face and iris after application of the discrete wavelet transform" were selected for authentication. Next-generation technology involved a "face-gait fusion model", which was tested on a video sequence database using three different classifiers, namely K-Nearest Neighbour (KNN), Bayesian linear and Bayesian quadratic. Other approaches included the "static and dynamic body fusion model", which used the Procrustes shape analysis method to obtain a compact representation and fused the static information of the body with the dynamic information of gait; the "multiple camera model", which used multiple camera views simultaneously; the "holistic and hierarchical fusion protocols", where face and gait features were integrated; and the "adaptive face-gait fusion model". The same video sequence database was used for the extraction of primary and secondary identifiers such as gender, age and emotion, to increase the reliability of the decisions taken by the next-generation model, using Canonical Correlation Analysis (CCA) and the Gait Energy Image (GEI). These primary and secondary identifiers were then fused using an apt protocol [5].
Dynamic signature data from the MCYT database was analyzed by J. Galbally et al. on the basis of the Kinematic Theory of rapid human movements and the Sigma-Lognormal model with an eight-feature set, which used the starting and ending times of the signature to compute a signal-to-noise ratio (SNR) used to assess whether the quality of the signature was good or bad. After carrying out two-fold cross-validation to avoid biased results, it was observed that the quality of the signature was better when the strokes were fewer, which led to dexterity-controlled hand movement and correspondingly shorter sample signatures. Well-performed signatures had a higher SNR than the worst ones and were thus better represented by the Sigma-Lognormal model [3].
P.K. Grewal et al. proposed a work to predict human behavior using a Feed Forward Network (a class of ANN). The authors scrutinized the baseline, the slant of the letters, the amount of pen pressure applied and the formation of the letters 'f' and 'i', and computed the values using a polygonization method (for baseline and letter slant), a gray-level threshold value (for pen pressure) and template matching (for the letters 'i' and 'f'). These were fed to the ANN as input, the model was trained using the back-propagation method and the output was obtained. The mean squared error (MSE) was calculated and a regression line was plotted. In this way, personality traits were found [10].
E.C. Djamal et al. recognized personality traits based on signature analysis as well as digit and character recognition using multiple ANNs. For digit recognition, the testing data was fed to the data acquisition machine and then pre-processed by applying grayscale, threshold and segmentation operations, after which the normalized vector was calculated and used to identify the personality of the writer. Signature analysis involved nine features, five extracted using an MLPNN, a class of ANN, and four using a multi-structure algorithm. LVQ was slower than the MLPNN for signature analysis [2].
Swarna Bajaj et al. analyzed typing speed for password protection via typing biometrics, following the approach that every person types in a distinctive manner. Keystroke dynamics with eight distinctive features was used to find the dwell, flight, pressing and total times while entering the password in a Java environment, with the mean and average times computed using statistical methods. At the back end, the password, time intervals and time-stamps were stored in a database, from which the actual value of the password was compared with the current one. A user was called authenticated if his current value matched the actual value. The FAR and FRR were calculated for further analysis [4]. An Artificial Neural Network was also used to analyze human behavior on the basis of handwriting. The input image was a signature, which was fed to the ANN after operations such as resizing, thinning, cropping and alignment were performed on it, and the resulting normalized value was used to identify the personality trait. The Mean Square Error method was used for predicting the final output [1].
A. Chanchlani et al. implemented a Convolutional Neural Network (CNN), taking the NIST dataset as input for the prediction of human behavior. The dataset was split into two parts, training and test: 20% of the dataset was used for testing or validation and the rest comprised the training set. OpenCV was used to load the dataset into memory. Input images were resized to 128 × 128 and converted to grayscale. The standard back-propagation method was used to train the CNN, which used ReLU (the rectifier) as the activation function and Adam as the optimizer. While testing the model, real-time handwriting images were used. The CNN classified human behavior into five categories, namely Criminal Intent, Excitable, Honest, Narcissistic and Self-centered, and Persistent. In this way, human behavior was determined by choosing the category or class with the maximum probability [14].

3 Dataset Collection

For solving a problem in any domain with a deep learning model, it is important to have an appropriate dataset. For the prediction of behavioral traits, we collected handwriting samples from different people at our University. These samples were scanned into the computer and the scanned images were then cropped to create the dataset. The dataset was then split into two parts: the first part was the training set and the second was the test set. In general, 20% of the dataset was used for validation and the rest was used to train the CNN.
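
As an illustration, such a split can be produced with Keras' ImageDataGenerator; the directory name dataset/ (with one sub-folder per class) and the 128 × 128 target size are assumptions made only for this sketch.

```python
# Sketch of an 80/20 train/validation split; the directory layout ("dataset/<class>/...")
# and the 128x128 target size are assumptions for illustration.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

train_set = datagen.flow_from_directory('dataset/', target_size=(128, 128),
                                        color_mode='grayscale',
                                        class_mode='categorical', subset='training')
test_set = datagen.flow_from_directory('dataset/', target_size=(128, 128),
                                       color_mode='grayscale',
                                       class_mode='categorical', subset='validation')
```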

4 Methodology

4.1 Digital Image Processing


Digital Image Processing is a method in which a digital image is processed on a digital computer using the following steps.

Pre-processing.
Images may contain several kinds of noise, namely Gaussian noise, salt-and-pepper noise and gamma noise. It is therefore important to remove the unwanted noise present in the image samples of the handwriting dataset. To remove the noise, filters such as mean (average) filters are applied. However, the image quality decreases when filters are applied, as filtering reduces the level of detail in the image, so the image is subsequently enhanced and point transformations are applied.
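
A minimal OpenCV sketch of this step is shown below; the file name sample.png, the 3 × 3 mean-filter kernel and the contrast-enhancement parameters are illustrative assumptions.

```python
# Sketch of noise removal with a mean (average) filter; the file name and the
# 3x3 kernel size are assumptions for illustration.
import cv2

img = cv2.imread('sample.png')          # original RGB handwriting sample
denoised = cv2.blur(img, (3, 3))        # mean/average filter to suppress noise
# Optional point transformation to enhance contrast lost during filtering:
enhanced = cv2.convertScaleAbs(denoised, alpha=1.2, beta=0)
```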

Fig. 1. (a) Original RGB image; (b) image converted into grayscale; (c) image after noise removal.
Grayscale and Binary Conversion.
The images in our dataset are colored, i.e. in RGB format. An RGB image consists of three channels, Red (0-255), Green (0-255) and Blue (0-255), whereas a grayscale image consists of only one channel (0-255). It is more difficult to extract features from three channels than from one, so the image is converted into grayscale. Then an inverted binary threshold function is applied, which converts the pixel values below a specific threshold (the dark foreground ink) to 255 (white) and those above the threshold (the bright background) to 0 (black). Equation (1) shows the inverted binary function.

dst(x, y) = maxValue (255) if src(x, y) < threshold, 0 otherwise        (1)
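
In OpenCV this step can be sketched as follows; the threshold value of 127 is an illustrative assumption (Otsu's method could be used instead).

```python
# Sketch of grayscale conversion and inverted binary thresholding; the threshold
# value 127 is an assumption made for illustration.
import cv2

img = cv2.imread('sample.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)                 # one channel, 0-255
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
# Ink pixels (below the threshold) become white (255); the paper background becomes black (0).
```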

Contour and Warp Affine Transformation.


The OpenCV [16] library provides functions for transforming the image. We applied dilation, contour extraction and warp affine transformation to our handwriting samples. Dilation is a morphological operation in which the boundary pixels of the objects in the image are expanded.

Fig. 2. (a) A sample image with black background pixels and white foreground pixels; (b) the same image after applying dilation with a 5 × 100 kernel, which spreads the foreground pixels horizontally.

A contour represents the boundary of an object in an image; in our case, it represents the boundaries of the letters.
Warp affine transformation is used to geometrically transform the image without changing its content, by deforming the pixel grid and mapping the deformed grid onto the destination image.
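
The sketch below strings these OpenCV operations together; the 5 × 100 kernel follows the dilation example in Fig. 2, while the file name and the idea of deriving the rotation angle from the largest contour's minimum-area rectangle are assumptions made only for illustration.

```python
# Sketch of dilation, contour extraction and warp affine transformation; deriving the
# rotation angle from the largest contour's minimum-area rectangle is an assumption.
import cv2
import numpy as np

binary = cv2.imread('binary_sample.png', cv2.IMREAD_GRAYSCALE)   # white text on black
kernel = np.ones((5, 100), np.uint8)                 # spreads foreground horizontally
dilated = cv2.dilate(binary, kernel, iterations=1)

contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]   # works on OpenCV 3.x and 4.x
largest = max(contours, key=cv2.contourArea)
(cx, cy), (w, h), angle = cv2.minAreaRect(largest)

rows, cols = binary.shape
M = cv2.getRotationMatrix2D((cx, cy), angle, 1.0)    # 2x3 affine matrix
aligned = cv2.warpAffine(binary, M, (cols, rows))    # remap the deformed pixel grid
```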

Horizontal and Vertical Projections.


Each projection is stored as a Python list. The horizontal projection gives the sum of the pixel values of each row of the image, while the vertical projection gives the sum of the pixel values of each column of the image.
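
For a binarized image stored as a NumPy array (white text on a black background), the projections can be sketched as follows; the file name is illustrative.

```python
# Sketch: horizontal and vertical projections of a binary image as Python lists.
import cv2

binary = cv2.imread('binary_sample.png', cv2.IMREAD_GRAYSCALE)
horizontal_projection = binary.sum(axis=1).tolist()   # one sum per row
vertical_projection = binary.sum(axis=0).tolist()     # one sum per column
```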

Fig. 3 Affine Transformation

4.2 Convolutional Neural Network

After the image processing is done, the dataset is fed to the CNN model for classification in order to identify the behavioral traits of the writer. We use a Convolutional Neural Network (CNN) to train the model and predict personality traits by analyzing images of handwriting samples. A CNN is used primarily for image classification and is trained over several epochs. A CNN can classify images by analyzing the objects, structures and patterns they contain. A CNN, or ConvNet, arranges its neurons in a three-dimensional structure. The input to the CNN is the dataset of handwriting samples of different people. A CNN involves four steps: the convolution operation, pooling, flattening and full connection.

Convolution Operation.
An image is represented as an i × j matrix in which each cell [i, j] holds a pixel value. In this step, a filter called a feature detector, usually of size 3 × 3, is placed over the image starting from the top-left corner and a dot product is computed. The resulting matrix is called the feature map or activation map. The process is repeated by moving the feature detector one cell at a time over the image until the dot product has been computed for all positions. In this way, only the essential information of the image remains intact, while noise and unwanted information are removed.
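
The sliding dot product described above can be sketched directly in NumPy; the 3 × 3 vertical-edge detector used here is only an example kernel, not a filter learned by the CNN.

```python
# Sketch of the convolution step: slide a 3x3 feature detector over the image and
# take the dot product at every position ("valid" convolution, stride 1).
import numpy as np

def convolve2d(image, detector):
    ih, iw = image.shape
    kh, kw = detector.shape
    feature_map = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(feature_map.shape[0]):
        for j in range(feature_map.shape[1]):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * detector)
    return feature_map

# Example 3x3 feature detector (a vertical-edge detector, chosen for illustration):
detector = np.array([[1, 0, -1],
                     [1, 0, -1],
                     [1, 0, -1]])
```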
Activation functions are then applied to all the generated feature maps. Here, we have applied the rectifier function (ReLU) after the convolutional layer and the softmax function at the output layer.
The rectifier function is applied in order to increase non-linearity in the images: only non-negative cell values (the gray and white pixels) are kept, while negative values are set to 0 (black). Equation (2) shows the mathematical formula for the rectifier function.

F(x) = max(x,0) (2)

The softmax function is used for classification problems. It squashes the output of each cell to a value between 0 and 1 and normalizes the outputs so that they sum to 1. Equation (3) shows the mathematical formula for the softmax function, where z is the vector of inputs to the output layer and j indexes the output units.

σ(z)_j = e^(z_j) / Σ_{k=1..K} e^(z_k)        (3)
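
Equations (2) and (3) can be sketched in a few lines of NumPy:

```python
# Sketch of the rectifier (2) and softmax (3) functions defined by the equations above.
import numpy as np

def relu(x):
    return np.maximum(x, 0)          # F(x) = max(x, 0)

def softmax(z):
    e = np.exp(z - np.max(z))        # subtract max(z) for numerical stability
    return e / e.sum()               # outputs lie in (0, 1) and sum to 1
```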

Pooling.
Images belonging to the same class may differ in structure. Pooling helps the neural network recognize images despite these differences in structure and pattern. In our case, people's handwriting differs, so for the CNN to recognize every handwriting image, max pooling (or pooling in general) is performed. We apply a 2 × 2 filter over every 2 × 2 block of cells of the feature map, starting from the top-left corner, and the maximum value of the four cells is selected to generate a pooled feature map.
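
A small NumPy sketch of 2 × 2 max pooling with stride 2 is shown below; any odd trailing row or column of the feature map is simply dropped.

```python
# Sketch of 2x2 max pooling with stride 2 over a single feature map.
import numpy as np

def max_pool_2x2(feature_map):
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]        # drop odd trailing row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```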

Flattening.
In this step, the multiple pooled feature maps are flattened into a single column, which forms the input layer of the subsequent fully connected network.

Full Connection.
In this step, a fully connected ANN is joined to the CNN. In a simple ANN we have hidden layers between the input and output layers; here, we have a fully connected layer instead of a hidden layer. The ANN takes the data from the flattened layer, detects features in the fully connected layer and preserves their values. The network then maps these values to the different classes, and in this way the image is classified.

Fig. 4. Flow chart to demonstrate the proposed methodology

5 Result and Observation

The image was digitally processed by applying filters, image enhancement techniques, point transformations, dilation, contour extraction and warp affine transformation. The processed image was then fed to the Convolutional Neural Network, which classified it into one of the eight personality-trait categories listed in Table 2.

Table 2. Classification of personality into eight categories

Personality Trait | Given By
Emotional stability | Baseline and slant angle
Mental energy or will power | Letter size and pen pressure
Modesty | Top margin and letter size
Personal harmony and flexibility | Line spacing and word spacing
Lack of discipline | Top margin and slant angle
Poor concentration power | Letter size and line spacing
Non-communicativeness | Letter size and word spacing
Social isolation | Line spacing and word spacing

6 Conclusion

This paper emphasizes the importance of identifying the personality traits of an individual by analyzing his handwriting using a CNN. The behavior of a person can be predicted by carefully scrutinizing the strokes, patterns, shapes, spaces between letters and words, baseline, etc. in a sample of handwriting. The images are pre-processed and a CNN is then trained, by feeding it the dataset of handwriting image samples, to determine into which of the eight mentioned categories the writer's personality falls. In this way, the behavioral characteristics of a writer are identified. Further, the present work could be extended to predict the personality of a person from his signature.

References
1. Sandeep Dang, Prof. Mahesh Kumar: “Handwriting Analysis of Human Behaviour Based on Neural Network”. International Journal of Advanced Research in Computer Science and Software Engineering 4(9), pp. 227–232 (2014).
2. Esmeralda C Djamal, Sheldy Nur Ramdlan, Jeri Saputra: “Recognition of Handwriting
Based on Signature and Digit of Character Using Multiple of Artificial Neural Networks in
Personality Identification”. Information Systems International Conference (ISICO),
(2013).
3. Javier Galbally, Julian Fierrez, Marcos Martinez-Diaz, Réjean Plamondon (École Polytechnique de Montréal): “Quality Analysis of Dynamic Signature Based on the Sigma-Lognormal Model”. International Conference on Document Analysis and Recognition, pp. 633-637 (2011).
4. Swarna Bajaj, Sumeet Kaur: “Typing Speed Analysis of Human for Password Protection
(based on Keystrokes Dynamics)”. International Journal of Innovative Technology and Ex-
ploring, vol. 3, issue 2, (2013).
5. S. M. E. Hossain, G. Chetty: “Human Identity Verification by Using Physiological and Be-
havioural Biometric Traits”. International Journal of Bioscience, Biochemistry and Bioin-
formatics 1(3), (2011).
6. Hugo Gamboaa, Ana Fredb: “A Behavioral Biometric System Based on Human Computer
Interaction”.
7. Albert Ali Salah, Theo Gevers, Nicu Sebe, Alessandro Vinciarelli: “Challenges of Human
Behavior Understanding”.
8. Champa H N, Dr. K R AnandaKumar: “Artificial Neural Network for Human Behavior
Prediction through Handwriting Analysis”. International Journal of Computer Applica-
tions, 2(2), pp. 36-41 (2010).
9. Roman V. Yampolskiy, Venu Govindaraju: “Behavioral biometrics: a survey and classifi-
cation”. International Journal of Biometrics, vol. 1, no. 1, (2008).
10. Parmeet Kaur Grewal, Deepak Prashar: “Behavior Prediction Through Handwriting Anal-
ysis”. IJCST, 3(2), (2012).
11. H.E.S.Said, T.N.Tan, K.D.Baker: “Writer Identification Based on Handwriting”. IEE
Third European workshop on Handwriting Analysis and Recognition, vol.33, no.1, pp.
133-148 (2000).
12. V. M. Jr, Z. Riha: “Biometric authentication systems”. Tech. Rep. FIMU-RS-2000-08,
FIMU.Reportseries, http://www.fi.muni.cz/informatics/reports/files/older/FIMU-RS-2000-
08.pdf, (2000).
13. Shitala Prasad, V.K. Singh, Akshay Sapre: “Handwriting Analysis based on Segmentation
Method for Prediction of Human Personality using Support Vector Machine”. Interna-
tional Journal of Computer Applications, vol.8, no.12, (2010).
14. Prof. Akshita Chanchlani, Aakanksha Jaitly, Pratima Kharade, Rutuja Kapase, Sonal Jan-
valkar: “Prediction Human Behaviour through Handwriting”. International Journal for Re-
search in Applied Science & Engineering Technology, vol. 6, issue VI, (2018).
15. Alex Pentland, Andrew Liu: “Modeling and Prediction of Human Behavior”. Massachu-
setts Institute of Technology, Cambridge, MA 02139, U.S.A. (1999).
16. https://docs.opencv.org/3.4.1/ (last accessed 2019/06/19)
17. Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A.W.M. van der Laak, Bram van Ginneken, Clara I. Sánchez: “A Survey on Deep Learning in Medical Image Analysis”. Medical Image Analysis, vol. 42, pp. 60-88 (2017).
18. Alexander Selvikvåg Lundervold, Arvid Lundervold: “An Overview of Deep Learning in
Medical Imaging Focusing on MRI”, Z Med Phys, volume 29, issue 2, pp 102-127(2019).
