
Personality Prediction based on Handwriting using Machine Learning

Nikita Lemos, Krish Shah, Rajas Rade, Dharmil Shah
Department of Information Technology
Xavier Institute of Engineering
Mumbai, India
nikita.l@xavierengg.com, krishshah24091997@gmail.com, rajasrade@gmail.com, shahsunny0871@gmail.com

Abstract—Human lifestyles have changed in the digital age, where everything can be handled at the tip of a finger. These conveniences come at the cost of security and fraud: a person can mask their identity with a fake one, which is not possible with handwriting. Handwriting is unique to each person, just as a fingerprint is. Someone can imitate another person's handwriting for no more than a few words, which makes it distinctive. Handwriting reveals the character of a person because writing is coupled with the brain, which subconsciously leaves a trail of temperament attributes such as optimistic, pessimistic, balanced, or shy that can be detected. Various handwriting features are taken into consideration, such as the slope or angle of the sentence, the number of words in a region, and the left or right slant of the sentence. The complete system evaluates handwriting samples based on these features and is divided into four modules. The first module is the input, where the image of handwritten text is taken from the user. It is followed by image pre-processing, which removes noise and sharpens the contrast of the image for better results. The image is then passed to the Convolutional Neural Network (CNN) module, which analyses the input image with a CNN model trained on the training dataset and labels the input image accordingly. The last module is the output, where the labeled images from the previous module are used to find the percentage of various traits present in the handwriting sample of the subject.

Keywords—CNN, machine learning, handwriting, personality, adaptive Gaussian thresholding, traits.

I. INTRODUCTION

Handwriting is one of the distinctive attributes humans have. Everybody has a style of writing, and handwriting is often used to obtain information about the temperament of the person. Distinctive handwriting does not mean a person has a unique personality, but it can be compared with certain patterns, which in turn can be used to find out the character of a person. Handwriting style is not something humans develop overnight; it is a continuous process. A person's handwriting style also changes with mood: if the person is drowsy, the handwriting tends to be harder to read, whereas if the person is enthusiastic, the handwriting tends to be more legible and neater. The study of such patterns and the analysis of various handwriting styles is called graphology. Writing is linked with the brain. The brain sends signals to the writing hand, and during this process it also subconsciously sends other signals that are reflected in the writing. The handwritten text thus contains a trail that was subconsciously left by the brain while writing; this trail is noticeable and can be used to learn the character or the thinking of the person while writing. Handwriting analysis is thus the study of handwriting styles, in which a handwriting expert examines the handwriting sample, checks for the numerous trails present in it, and predicts the personality traits present in the sample. The traditional method of analyzing handwriting is to write on plain white paper and then let a handwriting expert check the sample. This method is feasible if the number of handwriting samples to be checked is small; if the number of samples is large, the time needed for the analysis increases considerably. The time required can be reduced by computational means, where an image of the handwriting sample is taken and then analyzed by the system. In the digital era, everything is shifting from handwritten resources to digital resources, reducing human work and making it feasible to explore and implement new concepts. With the exponential growth of the digital world, the inclusion of a traditional technique could offer vital results.

The world is constantly developing, particularly the Information Technology (IT) industry, which recruits countless new employees each year. This recruiting process is often burdensome for the Human Resource (HR) department, and this is where handwriting analysis can provide a heads-up regarding the character of an individual: an individual could lie in an interview, but handwriting ("brain writing") is hard to manipulate. The digital world has also shifted traditional matchmaking to matrimonial websites. Digitalization solved the problem of gathering a large quantity of information, and for quality results, efficient and well-tested strategies should be applied. Handwriting analysis can be used on matrimonial websites, where the handwriting of a user is tested against various proven data sets, the results generated from the user's handwriting are cross-referenced with those of other users, and the pair with the most similar personalities is tagged as the best match. This paper focuses on analyzing handwriting features such as the clockwise and anticlockwise inclination of the baseline of a sentence [1], the spacing between words, and the forward and backward slant [2], which are used to identify a set of traits associated with the person.

II. LITERATURE REVIEW

Subham Nagar et al. analysed personality based only on space in the handwritten text. Skew normalization was performed on the image, and the new image was then compared with the old one. Loops in the characters were considered to decide the personality [1].

Anamika Sen and Harsh Shah checked for features like the flow of handwriting, shortness, etc. It was created in

978-1-5386-7709-4/18/$31.00 ©2018 IEEE
MATLAB GUIDE, and pre-processing was also done in MATLAB. The system did not group different types of people having the same characteristic. The features it focused on were predefined, and it did not compare the input with other sets of features. Only one feature at a time was compared with the database of images, rather than all the features [2].

Bala Mallikarjunarao Garlapati and Srinivasa Rao Chalamala separated handwritten text and printed text and output them in separate documents. An SVM was used to classify the text into a category. The system separated only two classes. It used 10-fold cross-validation to measure accuracy [3].

Vasundhara Bhade and Trupti Baraskar classified photos of handwriting based on only three features: left margin, right margin, and word spacing. The bounding box technique, which comes under image processing, was used in their study. The processing was done by forming boxes around the words and calculating the distance between these boxes [4].

Usha Tiwari et al. compared OCR, neural networks, intelligent character recognition (ICR), and intelligent word recognition (IWR). The study reached the following conclusions: OCR is good only for character recognition, not for comparing and analyzing data. A neural network finds the similarity with the training images and then concludes the result based on similar factors. Neural networks have a high tolerance for noise [5].

Behnam Fallah et al. used a hidden Markov model and a neural network to perform classification. These were used to identify properties that are related to the writer and those that are not. Various pre-processing steps like removing noise, smoothing, word fonts, etc. were performed. The study combined data from an online test with handwriting. It achieved an efficiency of 76%, compared with 20% for k-means on the same set of data and operations [6].

Xinxin Xie et al. described a Gaussian algorithm that removes noise in the input images and improves the contrast of the image using only an adaptive threshold algorithm, unlike other approaches where noise removal and contrast enhancement are done by two separate algorithms [7].

Vaishali R. Lokhande and Bharti W. Gawali used a signature to determine the personality of a person, although signatures can be copied more easily than handwriting. The system checked for personality based on parameters such as the baseline under the main signature, the style of the dot over certain letters, the start style of the signature, the end style of the signature, and the space or gap in the signature. The dataset included only 60 images, taken from only 10 different people. The output was determined based on segmentation, where the main parameters taken into consideration were horizontal segmentation and vertical segmentation [8].

Afnan H. Garoot et al. performed a survey of handwriting analysis and described in detail what graphology means and the advantages and disadvantages of implementing it through computerized methods. They compared various papers on handwriting analysis that were implemented using different methods such as image processing or artificial intelligence. The survey also stated that deep learning had not yet been applied to the field of handwriting analysis [9].

Anupam Varshney and Shalini Puri performed a survey on human personality identification based on handwriting using artificial neural networks. It covered the various aspects through which handwriting can be analyzed. The parameters were: zones, where the handwritten text is divided into three categories and analyzed accordingly; connections, which look at how the connections between letters in a word are made; slant, which checks the inclination of the handwritten text line; spacing, where the spacing between words is considered; margin, the space left at the start of the sentence or between the page border and the handwritten text; letter size, where the size of every word is taken into consideration; large and small middle zone, where the main center part of the handwritten text line is considered; speed, since fast writing tends to slant in the direction of writing; clarity, the legibility of the handwritten text, where clear handwriting suggests a well-organized person; and pressure, since some people write with more pressure and some with a light hand. The paper explains all of these attributes for analyzing a handwritten sample [10].

Some of the major limitations of the above papers were that the system grouped only two different styles [3]; the system looked for only one personality trait at a time [2]; the accuracy was low because the bounding boxes overlapped [4]; and the inclination of the baseline was measured using the skew of the baseline, which is often introduced during image pre-processing and affects the overall accuracy [1].

III. PROPOSED WORK

The proposed work focuses on eliminating the limitations mentioned in the previous section by using a convolutional neural network. A CNN can produce multiple groups having similar temperament traits, and it promptly checks for all the possible personality traits present in the handwriting. It also eliminates the inaccuracy introduced by skewing the image [1] by performing the pre-processing on the training dataset as well as on the input image, which counters the inaccuracy produced.

The proposed work is divided into several modules, and every module is explained in detail below:

In the first module, an image of handwriting is taken from the user through a website. The user uploads the image on the website, and the image is passed to the system as shown in Fig. 1.

2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS)
Fig. 1. Input Module

In the second module, pre-processing is performed on the image using various techniques to remove noise and to smooth the image for better results [3]. The first step of image pre-processing is to convert the input image to a black-and-white image, as it has lower inherent complexity [7]. In the second step, the contrast or sharpness of the image is improved to brighten the image, as this makes the handwriting curves much sharper, which increases the accuracy [7]. In the third step, thresholding is performed on the image. The image is converted into an array of pixels, and a threshold value is calculated for each pixel. A certain region of the image is considered, the Gaussian weighted average (WA) of all the pixels in the selected region is calculated, and the process is applied to the entire image [7]. Pixel values near the center of the region have a higher weight. Subtracting a constant parameter from the weighted average gives the threshold value for that pixel location.

The formula is given by:

$$T(x, y) = WA(x, y) - \mathit{param1} \tag{1}$$

where T(x, y) is the threshold value of the pixel at location (x, y), WA(x, y) is the weighted average of the pixel at location (x, y), and param1 is the constant parameter.

Once the threshold value is determined, pixels with a value greater than the threshold are set to 1 in the binary image, and pixels with a value less than the threshold are set to 0 [7]. This converts the image into a binary image, which is necessary for the neural network to work [5].

Fig. 2. Flowchart for Personality Prediction

In the third module, the dataset is created and training is performed on this dataset to create a model. A convolutional neural network is used because it is very flexible [6]. It can perform well on a small dataset, and if the dataset is later enlarged, it still works without any changes. It gives better accuracy than image processing [2]. Noise resistance is quite high in a CNN [6], and adding more features is easy.

A neural network has three types of layers: the input layer, the hidden layer, and the output layer. An appropriate number of layers is deployed to increase the proficiency of the model [5]. The model is then trained with the selected number of layers. The proposed work mainly considers 5 classes through which handwriting samples are differentiated, which means there are 5 output labels. Each output label represents a specific characteristic or personality trait found in the handwriting.

In the last module, once the main processing of the input image is complete, the test handwriting sample is passed to the CNN model. Feature extraction takes place, and we obtain the probability of each output label. By observing the probabilities of the different output labels for an input image, we can predict the most dominant class to which the test handwriting sample belongs, that is, the personality traits that the person has.

Table 1 shows various types of handwriting samples and their corresponding personality traits, such as optimistic, pessimistic, balanced, good taste, and poor taste [1][2].
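The thresholding rule of Eq. (1) in the second module can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `gaussian_kernel` and `adaptive_gaussian_threshold`, the parameters `block_size` and `sigma`, their default values, and the clamping of coordinates at the image border are all assumptions made for illustration. The image is assumed to be a list of rows of grayscale values in the range 0 to 255.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalised size x size Gaussian weight matrix (size is odd)."""
    half = size // 2
    weights = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
         for dx in range(-half, half + 1)]
        for dy in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in weights)
    # Normalise so the weights sum to 1; the centre weight is the largest.
    return [[w / total for w in row] for row in weights]


def adaptive_gaussian_threshold(gray, block_size=3, sigma=1.0, param1=2.0):
    """Binarise a grayscale image using T(x, y) = WA(x, y) - param1."""
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    kernel = gaussian_kernel(block_size, sigma)
    binary = []
    for y in range(height):
        row = []
        for x in range(width):
            wa = 0.0
            for ky in range(block_size):
                for kx in range(block_size):
                    # Assumption: border pixels reuse the nearest edge value.
                    yy = min(max(y + ky - half, 0), height - 1)
                    xx = min(max(x + kx - half, 0), width - 1)
                    wa += kernel[ky][kx] * gray[yy][xx]
            threshold = wa - param1  # Eq. (1)
            # Pixels above their local threshold become 1, the rest 0.
            row.append(1 if gray[y][x] > threshold else 0)
        binary.append(row)
    return binary
```

For example, calling `adaptive_gaussian_threshold([[200, 200, 200], [200, 50, 200], [200, 200, 200]])` marks only the dark centre pixel as 0. Libraries such as OpenCV offer a comparable operation (`cv2.adaptiveThreshold` with `cv2.ADAPTIVE_THRESH_GAUSSIAN_C`); the paper does not state which implementation was used.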

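The output step, in which the per-label probabilities are reported as percentages and the dominant class is selected, might look like the following sketch. It assumes the network's final layer produces one raw score per label and that these scores are normalised with a softmax; the paper does not state this. The five trait names in `TRAITS` are a hypothetical choice, since the paper says there are 5 output labels without listing them, and the function names are illustrative only.

```python
import math

# Hypothetical label order; the paper does not name its 5 output labels.
TRAITS = ["optimistic", "pessimistic", "balanced", "good taste", "poor taste"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    # Subtract the maximum first for numerical stability.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def trait_percentages(scores, labels=TRAITS):
    """Map each label to its share of the total, expressed as a percentage."""
    return {label: 100.0 * p for label, p in zip(labels, softmax(scores))}


def dominant_trait(percentages):
    """Return the label with the highest percentage."""
    return max(percentages, key=percentages.get)
```

With scores such as `[2.0, 1.0, 0.5, 0.1, -1.0]`, `trait_percentages` returns five percentages that add up to one hundred, and `dominant_trait` picks the first label because it has the largest score.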
Table 1. Handwriting Characteristics

Type               Writing Sample   Personality Traits
Ascending          [image]          Optimistic
Descending         [image]          Pessimistic
Level (straight)   [image]          Balanced
Wide Spacing       [image]          Good taste, independent
Narrow Spacing     [image]          Poor taste
Right Slant        [image]          Pragmatic
Left Slant         [image]          Likes to work behind the scenes

IV. CONCLUSION AND FUTURE SCOPE

In this project, personality traits such as optimistic, pessimistic, balanced, poor taste, and pragmatic are detected by performing handwriting analysis on the input image. The system compares the input image with the CNN model, which is created by training the convolutional neural network on the training dataset. The key feature of this project is extracting all the possible traits using the CNN. It reports which of the personality traits mentioned above are present, and it indicates each personality trait as a percentage, where the percentages of all traits add up to one hundred percent.

This project can be extended to detect medical conditions such as Parkinson's, Alzheimer's, and even cancer through the Kanfer test. The system can be switched from offline handwriting samples to online handwriting samples by allowing users to provide samples through the touch screen of any device. The system requires heavy computation to develop the CNN model; once the model is developed, input images can be compared with it within a couple of minutes. A future upgrade should therefore reduce the time taken to develop the CNN model.

REFERENCES

[1] S. Nagar, S. Chakraborty, A. Sengupta, J. Maji and R. Saha, "An efficient method for character analysis using space in handwriting image," 2016 Sixth International Symposium on Embedded Computing and System Design (ISED), Patna, 2016, pp. 210-216.
[2] A. Sen and H. Shah, "Automated handwriting analysis system using principles of graphology and image processing," International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, 2017, pp. 1-6.
[3] B. M. Garlapati and S. R. Chalamala, "A System for Handwritten and Printed Text Classification," UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim), Cambridge, 2017, pp. 50-54.
[4] V. Bhade and T. Baraskar, "A Model for Determining Personality by Analyzing Off-line Handwriting," in D. Reddy Edla, P. Lingras and K. Venkatanareshbabu (eds), Advances in Machine Learning and Data Science, Advances in Intelligent Systems and Computing, vol. 705, Springer, Singapore, 2018, pp. 345-354.
[5] U. Tiwari, M. Jain and S. Mehfuz, "Handwritten Character Recognition—An Analysis," in S. Singh, F. Wen and M. Jain (eds), Advances in System Optimization and Control, Lecture Notes in Electrical Engineering, vol. 509, Springer, Singapore, 2019, pp. 207-212.
[6] B. Fallah and H. Khotanlou, "Identify human personality parameters based on handwriting using neural network," 2016 Artificial Intelligence and Robotics (IRANOPEN), Qazvin, 2016, pp. 120-126.
[7] X. Xie, W. Huang, H. H. Wang and Z. Liu, "Image de-noising algorithm based on Gaussian mixture model and adaptive threshold modeling," 2017 International Conference on Inventive Computing and Informatics (ICICI), Coimbatore, 2017, pp. 226-22.
[8] V. R. Lokhande and B. W. Gawali, "Analysis of signature for the prediction of personality traits," 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM), Aurangabad, 2017, pp. 44-49.
[9] A. H. Garoot, M. Safar and C. Y. Suen, "A Comprehensive Survey on Handwriting and Computerized Graphology," 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 621-626.
[10] A. Varshney and S. Puri, "A survey on human personality identification on the basis of handwriting using ANN," International Conference on Inventive Systems and Control (ICISC), Coimbatore, 2017, pp. 1-6.

