ICIC Express Letters, © ICIC International 2013, ISSN 1881-803X
Volume 7, Number 2, February 2013, pp. 1–ICICIC2012-235

FINGERPRINT PREDICTION USING STATISTICAL AND MACHINE LEARNING METHODS
Promise Molale¹, Bhekisipho Twala² and Solly Seeletse¹

¹Department of Statistics and Operations Research
University of Limpopo
P.O. Box 107, Medunsa, 0204, South Africa
pmolale@csir.co.za; solly.seeletse@ul.ac.za

²Department of Electrical and Electronic Engineering Science
University of Johannesburg
P.O. Box 524, Auckland Park 2006, South Africa
btwala@uj.ac.za
Received May 2012; accepted July 2012

Abstract. There are various fingerprint prediction models, but it is difficult to determine which model is ideal on the basis of a single performance measure. This paper examines four machine learning methods and one statistical method: k-Nearest Neighbour (k-NN), Artificial Neural Network (ANN), Decision Trees (DT), Support Vector Machine (SVM) and Linear Regression (LR). The performance of the classifiers is evaluated in terms of the following measures: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE), Correlation Coefficient (CC) and the time taken to build the model. The assessment was done using the National Institute of Standards and Technology (NIST) fingerprint image database. Examining the performance of the classifiers showed the DT method to be better than all of the compared methods.
Keywords: Fingerprint, Classifier, Machine learning, Prediction, Performance measures

1. Introduction. Fingerprint identification has been in use since 1901, when it was introduced at Scotland Yard in the United Kingdom, and it has since played a key role in law enforcement and authentication. In 1969, a major push from the Federal Bureau of Investigation (FBI) to automate its fingerprint identification process led to the birth of the Integrated Automated Fingerprint Identification System (IAFIS) [15].
Biometric recognition is a method of identifying a person based on his or her physiological or behavioural characteristics. Such biological characteristics cannot be forgotten (like passwords) and cannot easily be shared or misplaced (like keys); they are therefore generally considered a more reliable approach to the personal identification problem.
However, with the emergence of computing technologies and recent developments in the automated processing of digitized inked fingerprints (including the effects of image compression on image quality, classification, minutiae extraction and matching), the need for classifiers that accurately predict fingerprint patterns has never been more apparent.
Accurate identification of a person can deter crime and fraud, streamline business processes and save critical resources. The development of high-speed computer networks has opened opportunities for electronic commerce and electronic purse applications, so accurately authenticating identities over networks has become one of the important applications of biometrics.
Fingerprint classification has long been an important part of any fingerprint system. Identifying a person in a large database takes a long time, and prolonged query times degrade the performance of the fingerprint recognition system. The purpose of classification is to partition a large database into clusters so that the search space becomes smaller, reducing search time and improving identification accuracy.
Our survey of work related to fingerprint classification shows that most existing studies use different classifiers that generate different model forms: linear, density estimation, networks and trees [2,4,6,7,9,11]. Some of this work uses only misclassification error as the sole performance measure, e.g., [14], or ROC analysis [5]. However, several other performance measures can be used to evaluate the accuracy of a particular algorithm, e.g., Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE) and Correlation Coefficient (CC).
The major contribution of this study is to show the accuracy of five classifiers (k-Nearest Neighbour (k-NN), Artificial Neural Network (ANN), Decision Trees (DT), Linear Regression (LR) and Support Vector Machine (SVM)) for predicting fingerprint patterns in terms of the following performance measures: Mean Absolute Error (MAE), Relative Absolute Error (RAE), Root Relative Squared Error (RRSE), Root Mean Square Error (RMSE), Correlation Coefficient (CC) and the time taken to build the model.
The paper is organized as follows. Section 2 explains the research background. Section
3 presents the research methodology followed in this paper. The results of the experiments
are given in Section 4. Finally, the paper is concluded in Section 5.

2. Supervised and Statistical Learning Methods. In this paper we use four machine learning techniques and one statistical technique to classify fingerprints into their correct class. k-Nearest neighbour, artificial neural networks, decision trees, linear regression and support vector machines have seen an explosion of interest over the years and have been applied successfully in various areas, including fingerprint classification.
2.1. k-Nearest Neighbour (k-NN). One of the most venerable algorithms in Machine Learning (ML) is the nearest neighbour. k-NN methods are sometimes referred to as memory-based reasoning, Instance-Based Learning (IBL) or case-based learning techniques [1]. They learn by assigning to an unclassified sample point the classification of the nearest of a set of previously classified points. The entire training set is stored in memory. To classify a new instance, the (possibly weighted) Euclidean distance is computed between the instance and each stored training instance, and the new instance is assigned the class of the nearest neighbouring instance. We apply the Euclidean distance metric to determine the distance between a pair of instances, and in our experiments k is set to 1 [13].
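As an illustration, the 1-NN rule described above can be sketched in a few lines of NumPy (the toy data and variable names here are ours, not from the paper):

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=1):
    """Assign x_new the majority class among its k nearest
    training instances under the Euclidean distance."""
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]                  # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # majority vote

# Toy data: two clusters labelled 0 and 1
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([4.8, 5.1])))       # nearest neighbour has class 1
```

With k = 1, as in our experiments, the vote reduces to taking the class of the single closest training instance.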
2.2. Artificial Neural Network (ANN). Artificial neural networks [8] are usually nonparametric approaches (i.e., no assumptions are made about the data). They are represented by connections between a very large number of simple computing processors or elements (neurons), and they have been used for a variety of classification and regression problems. There are many types of ANNs, but for the purposes of this study we concentrate on single-unit perceptrons and multi-layer perceptrons, also known as "back-propagation networks". The ANN is trained by supplying it with a large number of numerical observations of the patterns to be learned (the input data) whose corresponding classifications (target values or desired outputs) are known [13].
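A single-unit perceptron of the kind mentioned above can be trained with the classic error-correction rule. The following toy sketch (our own, on a linearly separable AND problem; the learning rate and epoch count are arbitrary choices) shows the idea:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Single-unit perceptron: learn weights w and bias b so that
    step(w.x + b) reproduces the 0/1 targets y."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, ti in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (ti - pred) * xi   # adjust weights in proportion to the error
            b += lr * (ti - pred)
    return w, b

# Linearly separable toy problem: logical AND
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
preds = [1 if xi @ w + b > 0 else 0 for xi in X]
print(preds)  # [0, 0, 0, 1]
```

Multi-layer perceptrons extend this by stacking such units and propagating the error backwards through the layers, which is what "back-propagation networks" refers to.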
2.3. Decision Trees (DT). The decision tree is a methodology used for classification and regression. It provides a modelling technique that is easy for humans to comprehend and simplifies the classification process; its advantage lies in the fact that it is easy to understand and that it can predict patterns with missing values and categorical attributes. A decision tree algorithm is a data mining induction technique that recursively partitions a dataset of records, using a depth-first greedy approach or a breadth-first approach, until all the data items belong to a particular class. A decision tree structure is made up of nodes (root, internal and leaf) and arcs, and the tree structure is used to classify unknown data records. At each internal node of the tree, the best split is decided using impurity measures. We have used M5P as implemented in the WEKA tool [13]; this method generates M5 model rules and trees.
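The best-split decision at an internal node can be illustrated with variance reduction, the family of impurity measures behind regression trees such as M5. This is a simplified sketch of our own, not WEKA's actual M5P implementation:

```python
import numpy as np

def best_split(x, y):
    """Return the threshold on feature x that maximises the impurity
    reduction, measured here as the reduction in (size-weighted)
    variance of the target y after the split."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    parent = y.var() * len(y)                # impurity before splitting
    best_t, best_gain = None, 0.0
    for i in range(1, len(y)):
        if x[i] == x[i - 1]:
            continue                         # cannot split between equal values
        left, right = y[:i], y[i:]
        child = left.var() * len(left) + right.var() * len(right)
        gain = parent - child                # impurity reduction of this split
        if gain > best_gain:
            best_gain, best_t = gain, (x[i] + x[i - 1]) / 2
    return best_t

# Two well-separated groups: the best threshold falls between them
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0.0, 0.1, 0.0, 5.0, 5.1, 5.0])
print(best_split(x, y))  # 6.5
```

A tree builder applies this search recursively to each resulting partition until a stopping criterion is met, then prunes the grown tree.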

2.4. Linear Regression (LR). Linear regression analyses the relationship between two variables, X and Y, where one variable is the dependent variable and the other is the independent variable. To do so, it finds the line that minimizes the sum of the squared vertical distances of the points from the line. In other words, it is a method of estimating the conditional expected value of one variable y given the values of some other variable or variables x [13].
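For a single predictor, the least-squares line has a closed form; a minimal sketch (with toy data of our own choosing):

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares for one predictor: minimise the sum of
    squared vertical distances to the line y = a*x + b."""
    a = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    b = y.mean() - a * x.mean()
    return a, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # exactly y = 2x + 1
a, b = fit_line(x, y)
print(a, b)  # 2.0 1.0
```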

2.5. Support Vector Machine (SVM). The Support Vector Machine (SVM) is a learning technique used to classify unseen data correctly. To do so, the SVM builds a hyperplane which separates the data into different categories. The dataset may or may not be linearly separable; by 'linearly separable' we mean that the cases can be completely separated, i.e., the cases in one category lie on one side of the hyperplane and the cases in the other category on the other side. SVMs can also be extended to non-linear boundaries using the kernel trick: the kernel function transforms the data into a higher-dimensional space to make the separation easy. We have used the SVM to estimate a continuous variable, i.e., the class [13].
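The kernel trick can be illustrated with a degree-2 polynomial kernel, which computes an inner product in a quadratic feature space without ever constructing that space explicitly. This is an illustrative sketch of our own; the kernel actually used in the experiments is whatever WEKA's SVM defaults to:

```python
import numpy as np

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: the inner product in the quadratic
    feature space, computed directly from the original inputs."""
    return (x @ z) ** 2

def phi(x):
    """Explicit quadratic feature map for 2-D input (for comparison)."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

x = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])
# Both give 25 (up to rounding): the kernel never builds phi explicitly
print(poly_kernel(x, z), phi(x) @ phi(z))
```

A linear hyperplane in the phi-space corresponds to a non-linear boundary in the original space, which is why kernels let the SVM separate data that is not linearly separable.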

3. Performance Measures. We have used the following evaluation criteria to evaluate the chosen classifiers:

3.1. Root Mean Square Error (RMSE). The root mean square error:

E = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2 }    (1)

where P_i is the predicted value for data point i, A_i is the actual value for data point i, and n is the total number of data points. If P_i = A_i for all i = 1, 2, ..., n, then E = 0 (the ideal case). Thus, the range of E is from 0 to infinity.

3.2. Mean Absolute Error (MAE). The mean absolute error:

E = \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i} \right|    (2)

where P_i is the predicted value for data point i, A_i is the actual value for data point i, and n is the total number of data points. If P_i = A_i for all i = 1, 2, ..., n, then E = 0 (the ideal case). Thus, the range of E is from 0 to infinity.

3.3. Relative Absolute Error (RAE). The relative absolute error of individual dataset j is defined as:

E_j = \frac{ \sum_{i=1}^{n} |P_{ij} - A_i| }{ \sum_{i=1}^{n} |A_i - A_m| }    (3)

where P_{ij} is the value predicted by the individual dataset j for data point i; A_i is the actual value for data point i; n is the total number of data points; and A_m is the mean of the A_i. In the ideal case the numerator equals 0 and E_j = 0. Thus, the range of E_j is from 0 to infinity.

3.4. Root Relative Squared Error (RRSE). The root relative squared error of individual dataset j is defined as:

E_j = \sqrt{ \frac{ \sum_{i=1}^{n} (P_{ij} - A_i)^2 }{ \sum_{i=1}^{n} (A_i - A_m)^2 } }    (4)

where A_i is the actual value for data point i; n is the total number of data points; and A_m is the mean of the A_i. In the ideal case the numerator equals 0 and E_j = 0. Thus, the range of E_j is from 0 to infinity.

3.5. Correlation Coefficient (CC). Correlation measures the strength of a relationship between two variables, and this strength is indicated by the correlation coefficient: the larger its value, the stronger the relationship.
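The measures above are straightforward to implement; the NumPy sketch below uses the standard per-point forms (note that MAE is computed here as the mean absolute difference, whereas Equation (2) normalises each difference by A_i):

```python
import numpy as np

def rmse(P, A):
    """Root mean square error, Equation (1)."""
    return np.sqrt(((P - A) ** 2).mean())

def mae(P, A):
    """Mean absolute error (standard per-point form)."""
    return np.abs(P - A).mean()

def rae(P, A):
    """Relative absolute error, Equation (3), for a single model."""
    return np.abs(P - A).sum() / np.abs(A - A.mean()).sum()

def rrse(P, A):
    """Root relative squared error, Equation (4), for a single model."""
    return np.sqrt(((P - A) ** 2).sum() / ((A - A.mean()) ** 2).sum())

def cc(P, A):
    """Correlation coefficient between predicted and actual values."""
    return np.corrcoef(P, A)[0, 1]

A = np.array([1.0, 2.0, 3.0, 4.0])
P = np.array([1.1, 1.9, 3.2, 3.8])
print(rmse(P, A), mae(P, A), rae(P, A), rrse(P, A), cc(P, A))
```

RAE and RRSE compare the model against the trivial predictor that always outputs the mean of the actual values, so a value below 1 (100%) means the model beats that baseline.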

3.6. Time taken to build the model. Finally, we consider the time taken to build the models, reported in seconds for each classification method, as a sixth performance measure. As mentioned in the introduction, the performance of a fingerprint recognition system is degraded by prolonged query times when the population size is relatively large, so the less time it takes to build the model, the better the performance of the system as a whole. It is therefore important to consider the cost of time as well.

4. Experimental Design. This section discusses the data and resources that we used
in our experiments, the experimental set-up and the model prediction results.

4.1. Empirical data collection. The data we used is the NIST Special Database 4 biometric image dataset [3,10]. The NIST dataset consists of 256-grey-level images; two different fingerprint instances (F = first, S = second) are present for each finger. Each fingerprint was manually analysed by a domain expert and assigned to one of the following 5 classes: arch (A), left loop (L), right loop (R), tented arch (T) and whorl (W). The dataset contains 2000 fingerprint pairs distributed uniformly across the five classes, although the classes occur with very different natural frequencies (A = 3.75%; T = 2.9%; L = 33.8%; R = 31.7%; W = 27.9%). However, only 3998 of the fingerprint images were used for experimentation: 798 images are arch, 800 are tented arch, 800 are left loop, 800 are right loop, and 800 are whorl.
The dataset consists of 6 variables: five independent numeric variables and one dependent variable. The dependent variable, i.e., the class, was manually assigned by a human expert, and the five independent numeric variables, A-Orient, B-Orient, C-Orient, D-Orient and X-Orient, were extracted from the fingerprint orientation map. A fingerprint orientation map is a matrix whose cells contain the local orientation of every ridge in the original fingerprint image. Each of the five numeric variables represents an average orientation value for a 25 by 25 square block of pixels, where the first four blocks (A, B, C and D) are situated at the corners of the chosen region of interest (ROI), while the fifth block (X) is situated at the centre of the ROI. Figure 1 shows an original fingerprint image overlaid with a typical ROI and the respective locations of the blocks that represent these five numeric variables.
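Extracting one such block feature can be sketched as follows. Ridge orientation is circular (0 degrees equals 180 degrees), so a robust block average uses doubled-angle vectors rather than a raw mean of the angles. This is our own sketch; the paper does not specify its averaging method, and the map size and block coordinates below are hypothetical:

```python
import numpy as np

def block_orientation(omap_deg, row, col, size=25):
    """Average ridge orientation (in degrees) over a size x size block of
    an orientation map. Because orientations wrap at 180 degrees, we
    average doubled-angle unit vectors instead of the raw angles."""
    doubled = 2.0 * np.deg2rad(omap_deg[row:row + size, col:col + size])
    mean_angle = np.arctan2(np.sin(doubled).mean(), np.cos(doubled).mean())
    return np.rad2deg(mean_angle) / 2.0

# Hypothetical 100 x 100 orientation map; block "A" would be the
# 25 x 25 block at the top-left corner of the chosen ROI.
rng = np.random.default_rng(1)
omap = rng.uniform(0.0, 180.0, size=(100, 100))
print(block_orientation(omap, 0, 0))
```

The doubled-angle form correctly treats, e.g., orientations of 10 and 170 degrees as nearly parallel, where a naive arithmetic mean would report 90 degrees.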

4.2. Resources. The five classifiers are run using the Waikato Environment for Knowledge Analysis (WEKA) software [13]. WEKA is a collection of machine learning algorithms for data mining tasks; the algorithms can either be applied directly to a dataset or called from your own Java code. For the purposes of this study we follow the former procedure, using the default parameter values. All statistical tests are conducted using Excel 2010.

Figure 1. A fingerprint image overlaid with a typical ROI, showing the blocks that represent the mentioned five numeric variables

4.3. Model prediction results. The NIST dataset was used to carry out the prediction with the estimated classifier models. Accuracy was estimated using k-fold cross validation with k = 5: the dataset was divided into a training set and a validation set in the ratio 8:2, so that 80% of the data was used for training the model and 20% for validating its accuracy. The dataset was divided into 5 subsets (20% each) and the experiments were repeated five times for each classifier, changing the test subset in each of the 5 runs. The five learning methods were then compared on these results.
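The 5-fold split can be sketched as follows (the random seed is our own choice for reproducibility):

```python
import numpy as np

def five_fold_indices(n, seed=0):
    """Split n sample indices into 5 disjoint folds; each fold serves once
    as the 20% validation set while the remaining 80% is used for training."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), 5)
    for k in range(5):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, val

for train, val in five_fold_indices(10):
    print(len(train), len(val))  # 8 2 on each of the five runs
```

Averaging a performance measure over the five runs gives the cross-validated estimate reported for each classifier.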

Table 1. Results of the five classifiers given six performance measures

Performance measure                       k-NN     ANN      DT       NB       SVM
Root Mean Square Error (RMSE) %           170.57   149.24   128.23   135.95   140.57
Mean Absolute Error (MAE) %               118.21   124.82   107.56   120.69   118.51
Relative Absolute Error (RAE) %            98.49   103.99    89.62   100.56    98.74
Root Relative Squared Error (RRSE) %      120.60   105.51    90.66    96.12    99.39
Correlation Coefficient (CC)                0.28     0.33     0.43     0.28     0.23
Time taken to build the model (seconds)     0.00     1.61     0.62     0.04     8.46

(k-NN = k-nearest neighbour; ANN = artificial neural network; DT = decision trees; NB = Naive Bayes classifier; SVM = support vector machines)

The model with the lowest RMSE, MAE, RAE and RRSE, the highest correlation coefficient and the least time to build is considered the best. As shown in Table 1, the decision tree method is found to be the most effective at predicting fingerprint class. The differences in performance between the individual classifiers are significant at the 5% level.

5. Summary and Conclusions. In this research we have made a comparative analysis of five methods for predicting fingerprint class, using data obtained from the NIST database. The results show that decision trees had the lowest error rates and, at 0.43, the highest correlation coefficient of the compared classifiers, while k-nearest neighbour took the least time to build its model. Overall, DT had the best performance. The good performance of DT could be attributed to its pruning strategy, which removes outliers from the dataset and so handles noisy data. Researchers and practitioners who use biometric data may apply the decision tree method for classification and estimation tasks, taking into consideration the time it takes to build the model. k-Nearest neighbour, support vector machines and Naïve Bayes are among the top ten algorithms in data mining [12].
Future work can replicate this study for iris data. We plan to build iris class prediction models based on other machine learning algorithms and regression methods, which should help improve the search time and accuracy of iris recognition systems.
Acknowledgement. This study was carried out with the financial support of the Council for Scientific and Industrial Research (CSIR) and the Department of Statistics and Operations Research at the University of Limpopo, Medunsa Campus. We would also like to thank NIST for granting us access to the Special Database 4 fingerprint images and allowing their use in all our experiments, and Ishmael Msiza at the CSIR for helping with the feature extraction process.
REFERENCES
[1] D. W. Aha, D. Kibler and M. K. Albert, Instance-based learning algorithms, Machine Learning, vol.6, pp.37-66, 1991.
[2] Y. Amit, D. Geman and K. Wilder, Joint induction of shape features and tree classifiers, IEEE Trans. on Pattern Anal. and Machine Intell., vol.19, no.11, pp.1300-1305, 1997.
[3] G. T. Candela and R. Chellappa, Comparative performance of classification methods for fingerprints, NIST Technical Report NISTIR 5163, 1993.
[4] R. Cappelli, A. Lumini, D. Maio and D. Maltoni, Fingerprint classification by direct image parti-
tioning, IEEE Trans. on Pattern Anal. and Machine Intell., vol.21, no.5, pp.402-421, 1999.
[5] J. P. Egan, Signal Detection Theory and ROC Analysis, Academic Press, 1975.
[6] A. K. Jain, S. Prabhakar and L. Hong, A multichannel approach to fingerprint classification, IEEE Trans. on Pattern Anal. and Machine Intell., vol.21, no.4, pp.348-359, 1999.
[7] S. Oteru, H. Kobayashi, T. Kato, F. Noda and H. Kimura, Automated fingerprint classifier, Proc. of Int. Conf. on Pattern Recognition, pp.185-189, 1974.
[8] B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, UK, 1996.
[9] A. Senior, A combination fingerprint classifier, IEEE Trans. on Pattern Anal. and Machine Intell.,
vol.23, no.10, pp.1165-1174, 2001.
[10] C. T. Watson and C. L. Wilson, NIST special database 4: Fingerprint database, Technical Report,
National Institute of Standards and Technology, 1992.
[11] C. L. Wilson, G. T. Candela and C. I. Watson, Neural network fingerprint classification, Journal of
Artificial Neural Networks, vol.1, no.2, pp.203-228, 1993.
[12] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand and D. Steinberg, Top 10 algorithms in data mining, Knowledge and Information Systems, vol.14, pp.1-37, 2008.
[13] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann Publishers, San Francisco, CA, USA, 2005.
[14] P. Molale, B. Twala and S. Seeletse, Fingerprint prediction using classifier ensembles, The 53rd
Annual South African Statistical Association Conference, pp.47-61, 2011.
[15] Federal Bureau of Investigation, The Science of Fingerprints: Classification and Uses, U.S. Govern-
ment Printing Office, Washington DC, 1984.
