ABSTRACT
Existing methods of facial emotion recognition are limited in performance in terms of recognition accuracy and execution time, so efficient techniques for improving this performance are highly important. In this article, the authors present an automatic facial image retrieval approach combining the advantages of color normalization by texture estimators with the gradient vector. Starting from a query face image, an efficient hybrid feature extraction algorithm for human faces provides very interesting results.
Keywords
Color Normalization, Gradient Vector, Similarity Distance, Texture Estimators
1. INTRODUCTION
DOI: 10.4018/IJSE.2020010102
Copyright © 2020, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
International Journal of Synthetic Emotions
Volume 11 • Issue 1 • January-June 2020
the problems related to linear transformations (rotation, scaling and translation) on images, or by structuring the content of the image. We propose a feature-based modeling approach that combines the gradient vector and normalized steering kernels from each color channel with covariance estimators. The proposed methodology combines the advantages of color normalization by texture estimators with the gradient vector.
The paper is organized as follows: the next section reviews the state of the art of recognition
and search models. In section 3, we introduce the proposed methodology for features modeling and
selection. The similarity evaluation strategy is presented in section 4. Classification results, as well
as discussion and main findings, are exposed in section 5. Section 6 summarizes our contributions
and concludes the paper.
2. RELATED WORK
Considerable work has been carried out on content-based image indexing. Existing content-based image retrieval systems use dominant colors as well as the complexity of image content, relying on generic attributes such as color, shape or texture (Israel et al., 2004; Zhou et al., 2017). Other systems use XML schemas to search images by their semantic and visual content (Hong & Nah, 2004). These visual primitives can be categorized into three main types: color-based descriptors, texture-based descriptors and shape-based descriptors. Histograms (Boujemaa, Boughorbel, & Vertan, 2001) and color angles (Wang et al., 2010) are typical examples of the first type. In particular, color angles (Wang et al., 2010) are considered among the most powerful discriminative algorithms and have been applied to diverse classification problems, including face recognition. In fact, color angles are known to be particularly efficient at coping with high-dimensional data spaces (Costa, Humpire-Mamani, & Traina, 2012). However, color angles are frame-based classifiers, i.e., they are inherently unable to model pixel dependencies.
The co-occurrence matrix (Eleyan & Demirel, 2011) is another well-known discriminative technique for texture-based descriptors and face recognition. Gabor filters (Abhishree, Latha, Manikantan, & Ramachandran, 2015) and wavelet transformations (Ashraf et al., 2018) are other types of models that belong to the transform-based domains, especially the texture-based ones. In spite of having less discriminative power than color angles and other discriminative classifiers, they have the advantage of efficiently modeling sequences and temporal data due to their internal network configuration. This property has made color angles very popular in the face recognition literature (Wang et al., 2010). For instance, color angles were applied in (Mahoor & Abdel-Mottaleb, 2008) for multimodal face modeling and video indexing in a smart environment.
Other proposed texture analysis and classification techniques, such as the Fourier-Mellin transform (Goecke, Asthana, Pettersson, & Petersson, 2007), algebraic moments (De Siqueira, Schwartz, & Pedrini, 2013) and contour models (Bouhini, Géry, & Largeron, 2013), were designed to detect specific situations, and the classification results were quite interesting. Other face recognition studies applying the Fourier-Mellin transform can be found in (Derrode & Ghorbel, 2001). However, it seems difficult to find attributes that can model an image according to all of the aspects described above. More recently, Karmakar (2019) proposed a retrieval technique for medical images; the main idea is to find the requested data in the DWT domain, using a simple linear function. A comparison and analysis of image retrieval algorithms on a large image dataset was also carried out.
The major problem with all these approaches is that they do not consider all image features during retrieval: many queries fail to exploit the information available about the various image features and the various computed signatures. Moreover, the association between face images and their signatures has not been studied for deriving useful matches. To cope with this limitation, other approaches are based on the combination of three descriptor-based techniques for extracting image features, which is the aim of our approach.
The purpose of this paper is to present a novel approach that enriches image retrieval with semantic content. Users then also have the opportunity to add key terms in order to guide and filter relevant images according to their interests and preferences, adapting the search process to their specific needs.
3. PROPOSED METHODOLOGY
We propose the segmentation of color images into coherent regions using a gradient method. This method is applied to all regions to extract local maxima. The shape descriptor is formed from the regions of the segmented image and combined with the other descriptors: color channel normalization and covariance estimators. Then, all descriptor vectors (De Siqueira, Schwartz, & Pedrini, 2013) are concatenated to obtain the image feature vector. Finally, the Euclidean distance is applied to match input images based on their feature vectors. The benefits include faster processing, reduced cost and an efficient index.
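As a rough sketch of the matching step described above, the three descriptor vectors can be concatenated into a single signature and ranked by Euclidean distance. All array sizes and values below are illustrative, not taken from the paper:

```python
import numpy as np

def image_signature(color_vec, texture_vec, shape_vec):
    """Concatenate the three descriptor vectors into one image signature."""
    return np.concatenate([color_vec, texture_vec, shape_vec])

def rank_by_euclidean(query_sig, database_sigs):
    """Rank database signatures by Euclidean distance to the query signature."""
    dists = np.linalg.norm(database_sigs - query_sig, axis=1)
    return np.argsort(dists), dists

# Toy database of three images described by hypothetical 5-D signatures.
db = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
               [0.9, 0.8, 0.7, 0.6, 0.5],
               [0.1, 0.2, 0.3, 0.4, 0.6]])
order, dists = rank_by_euclidean(db[0], db)  # query with the first image
```

Because the query is itself in the database, the first image is ranked first with distance zero.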
The focus of this paper is to apply similarity-based techniques over the various visual attributes to derive a complete image signature, and to validate the relevance and significance of each face image through the effective use of its extracted features.
The proposed method improves facial recognition accuracy and execution time using a reduced feature vector size. The three signatures (color, texture and shape), computed on the statistical distribution of the image, highlight the potential differences between images.
I1 = (R(x, y) + V(x, y) + B(x, y)) / 3 : intensity at pixel location (x, y)

I2 = (R(x, y) − V(x, y) + 1) / 2 : color difference (red/green) (1)

I3 = (R(x, y) + V(x, y) − 2B(x, y) + 2) / 4 : color difference (yellow/blue)
This normalization clearly describes the color content of the image. This content is translated into the feature vector and saved as a signature, together with the initial image, in the image database.
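A minimal sketch of the normalization in Eq. (1), assuming V denotes the green channel (consistent with the red/green difference) and pixel values scaled to [0, 1]:

```python
import numpy as np

def color_normalization(rgb):
    """Color normalization of Eq. (1); rgb is an (H, W, 3) float array
    with channels R, V (green) and B, values scaled to [0, 1]."""
    R, V, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I1 = (R + V + B) / 3.0                # intensity
    I2 = (R - V + 1.0) / 2.0              # red/green color difference
    I3 = (R + V - 2.0 * B + 2.0) / 4.0    # yellow/blue color difference
    return I1, I2, I3

pixel = np.array([[[1.0, 0.0, 0.0]]])     # a single pure-red pixel
I1, I2, I3 = color_normalization(pixel)
```

For a pure-red pixel this gives I1 = 1/3, I2 = 1.0 and I3 = 0.75.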
matrix P_d,θ(i, j). Selecting the same distance values d and orientations θ equal to 0°, 45°, 90° and 135°, the texture feature vector is estimated as follows:

COV(d, θ) = Σ_{i=1}^{N} Σ_{j=1}^{N} |i − j| · (6dᵢ² / (n(n² − 1))) · P_d,θ(i, j) (2)
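The co-occurrence step can be sketched as follows. Since the exact normalization in Eq. (2) is hard to recover from the text, this sketch uses a standard GLCM contrast statistic, Σ(i − j)² · P(i, j), as a stand-in; the test image, displacement encoding and gray-level count are illustrative:

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Normalized gray-level co-occurrence matrix P_{d,theta} of an integer
    image, with the displacement (dx, dy) encoding distance and orientation."""
    P = np.zeros((levels, levels))
    H, W = img.shape
    for y in range(H):
        for x in range(W):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < W and 0 <= y2 < H:
                P[img[y, x], img[y2, x2]] += 1
    total = P.sum()
    return P / total if total else P

def texture_features(img, d, levels):
    """Contrast-style texture feature for theta = 0, 45, 90, 135 degrees."""
    i, j = np.indices((levels, levels))
    offsets = [(d, 0), (d, -d), (0, -d), (-d, -d)]  # the four orientations
    return [float(((i - j) ** 2 * glcm(img, dx, dy, levels)).sum())
            for dx, dy in offsets]

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
feats = texture_features(img, d=1, levels=4)  # one value per orientation
```

The four values form the texture part of the image signature.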
3.3. Gradient Vector
Our goal is to find homogeneous regions defined through the boundaries given by the gradient vector. The gradient norm at each pixel of a window W of size d is defined as follows:
1. By direct application of the continuous derivative at each pixel (x, y) of an image I, we compute the partial derivatives Gx and Gy with respect to x and y:
Gx = I(x + 1, y) − I(x, y)
Gy = I(x, y + 1) − I(x, y) (3)

Hx = [−1, 1], Hy = [−1, 1]ᵀ (4)

G(x, y) = √(Gx² + Gy²) (5)

D(x, y) = arctan(Gx / Gy) (6)
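Eqs. (3)-(6) can be sketched with forward differences; note that np.arctan2(Gx, Gy) follows the paper's Arctg(Gx/Gy) argument order (many references use Gy/Gx instead). The test image is illustrative:

```python
import numpy as np

def gradient_features(I):
    """Gradient features of Eqs. (3)-(6): forward differences Gx, Gy,
    gradient norm G and direction D at each pixel."""
    I = I.astype(float)
    Gx = np.zeros_like(I)
    Gy = np.zeros_like(I)
    Gx[:, :-1] = I[:, 1:] - I[:, :-1]   # Gx = I(x+1, y) - I(x, y)
    Gy[:-1, :] = I[1:, :] - I[:-1, :]   # Gy = I(x, y+1) - I(x, y)
    G = np.sqrt(Gx ** 2 + Gy ** 2)      # Eq. (5)
    D = np.arctan2(Gx, Gy)              # Eq. (6): Arctg(Gx / Gy)
    return Gx, Gy, G, D

I = np.array([[0, 1, 2],
              [0, 1, 2],
              [0, 1, 2]])
Gx, Gy, G, D = gradient_features(I)   # image varies only horizontally
```

Here every interior pixel has Gx = 1 and Gy = 0, so the gradient norm is 1.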
4. SIMILARITY EVALUATION

An information retrieval system that takes images as queries belongs to the query-by-example category. Most such systems use an image as the query, and the result is usually a set of similar images. Our approach tries to merge the benefits of both visual and textual querying. First, the user uploads a face image; then the system generates a list of the most similar images, as in a Content-Based Emotional Image Retrieval (CBEIR) system. Our approach thus presents a list of face images that satisfies the user's query.
Color: Sc(I, Q) = [ Σ_{i=0}^{255} ( min(I1ᵢ, I1_Q) + min(I2ᵢ, I2_Q) + min(I3ᵢ, I3_Q) ) ] / (3 · min(|I|, |Q|))

Shape: Sf(I, Q) = Σᵢ ( max_{d,i}^I − max_{d,i}^Q )²

Color, texture and shape: Sctf(I, Q) = (wc · Sc(I, Q) + wt · St(I, Q) + wf · Sf(I, Q)) / (wc + wt + wf)
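A sketch of the similarity measures, assuming |I| and |Q| denote pixel counts and that the texture similarity St is computed analogously; the histograms and weights below are illustrative:

```python
import numpy as np

def color_similarity(hists_I, hists_Q, n_I, n_Q):
    """S_c(I, Q): histogram intersection over the three normalized color
    channels I1, I2, I3, divided by 3 * min(|I|, |Q|) (pixel counts)."""
    inter = sum(np.minimum(hI, hQ).sum() for hI, hQ in zip(hists_I, hists_Q))
    return float(inter) / (3.0 * min(n_I, n_Q))

def combined_similarity(Sc, St, Sf, wc=1.0, wt=1.0, wf=1.0):
    """S_ctf(I, Q): weighted mean of the color, texture and shape scores."""
    return (wc * Sc + wt * St + wf * Sf) / (wc + wt + wf)

# Three identical 2-bin histograms for a hypothetical 4-pixel image.
h = [np.array([2.0, 2.0])] * 3
Sc = color_similarity(h, h, n_I=4, n_Q=4)
S = combined_similarity(Sc, 1.0, 1.0)
```

Comparing an image with itself gives a color similarity of 1, and equal weights leave the combined score unchanged.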
5. EXPERIMENTAL RESULTS
The proposed approach based on three feature descriptors is implemented with the help of the C++ Builder tool. The multimedia indexing and search system is evaluated on 30 video sequences, each consisting of 30 images, which helps analyze the efficiency of the proposed feature extraction system. From the collected regions, the various features (color, shape and texture) are extracted and trained using the SVM method, and the result is stored as a template in the database. Figure 1 shows the list of face images used in the experiments. The efficiency of the system is evaluated using accuracy and response time.
using C++ Builder. It is noticed from Table 3 that the execution time of our technique is acceptable, and the system displays all images similar to the query image whose similarity distance is greater than 0.75 (see Table 2).
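The retrieval step described here (returning every database image whose similarity to the query exceeds 0.75) can be sketched as follows; the names and scores are illustrative:

```python
def retrieve(similarities, names, threshold=0.75):
    """Return (name, score) pairs whose similarity exceeds the threshold,
    best match first."""
    hits = [(n, s) for n, s in zip(names, similarities) if s > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

results = retrieve([0.9, 0.5, 0.8], ["face1", "face2", "face3"])
```

Only the two images above the threshold are returned, ordered by decreasing similarity.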
Table 2. Accuracy of the proposed face image recognition using color attribute
Table 3. Response time of the proposed face image recognition using color attribute
Table 4. Accuracy of the proposed face image recognition using texture attribute
using the texture attribute. The system displays all images similar to the query image when the similarity is greater than 0.75. The obtained results are encouraging and demonstrate high efficiency (Table 5).
Table 6. Accuracy of the proposed face image recognition using shape attribute
Table 8. Accuracy of the proposed face image recognition using hybrid feature extraction
Table 9. Response time of the proposed face image recognition using hybrid feature extraction
6. CONCLUSION
In this paper, we proposed an efficient hybrid facial image indexing and retrieval method based on three descriptors to improve the selection and matching process. The proposed image features benefit from the combination of color normalization, covariance estimators and the gradient vector for retrieving pertinent facial images. Our approach offers users several possibilities for querying an image database and returns, as a response, the images most similar to the query image. The numerical results are encouraging and demonstrate high efficiency and flexibility. Future work will attempt to enhance this method by using different word embeddings in order to investigate their impact on the retrieval task.
REFERENCES
Abhishree, T. M., Latha, J., Manikantan, K., & Ramachandran, S. (2015). Face recognition using Gabor filter
based feature extraction with anisotropic diffusion as a pre-processing technique. Procedia Computer Science,
45, 312–321. doi:10.1016/j.procs.2015.03.149
Ashraf, R., Ahmed, M., Jabbar, S., Khalid, S., Ahmad, A., Din, S., & Jeon, G. (2018). Content based image
retrieval by using color descriptor and discrete wavelet transform. Journal of Medical Systems, 42(3), 44.
doi:10.1007/s10916-017-0880-7 PMID:29372327
Bouhini, C., Géry, M., & Largeron, C. (2013). User-Centered Social Information Retrieval Model Exploiting
Annotations and Social Relationships. In Asia Information Retrieval Symposium (pp. 356-367). Springer.
doi:10.1007/978-3-642-45068-6_31
Boujemaa, N., Boughorbel, S., & Vertan, C. (2001). Color Soft Signature for Image Retrieval by Content.
Eusflat, 2, 394–401.
Costa, A. F., Humpire-Mamani, G., & Traina, A. J. M. (2012). An efficient algorithm for fractal analysis of
textures. In 2012 IEEE 25th SIBGRAPI Conference on Graphics, Patterns and Images (pp. 39-46). doi:10.1109/
SIBGRAPI.2012.15
De Siqueira, F. R., Schwartz, W. R., & Pedrini, H. (2013). Multi-scale gray level co-occurrence matrices for
texture description. Neurocomputing, 120, 336–345. doi:10.1016/j.neucom.2012.09.042
Derrode, S., & Ghorbel, F. (2001). Robust and efficient Fourier–Mellin transform approximations for gray-level
image reconstruction and complete invariant description. Computer Vision and Image Understanding, 83(1),
57–78. doi:10.1006/cviu.2001.0922
Eleyan, A., & Demirel, H. (2011). Co-occurrence matrix and its statistical features as a new approach for face
recognition. Turkish Journal of Electrical Engineering and Computer Sciences, 19(1), 97–107.
Goecke, R., Asthana, A., Pettersson, N., & Petersson, L. (2007). Visual vehicle egomotion estimation using
the fourier-mellin transform. In 2007 IEEE Intelligent Vehicles Symposium (pp. 450-455). doi:10.1109/
IVS.2007.4290156
Hong, S., & Nah, Y. (2004). An intelligent image retrieval system using XML. In IEEE 10th International
Conference on Multimedia Modelling (p. 363). doi:10.1109/MULMM.2004.1265010
Israel, M., Broek, E. L., Putten, P. V., & Den, M. J. (2004). Automating the construction of scene classifiers for
content-based video retrieval. Seattle, WA: Academic Press.
Karmakar, D. (2019). Multimodal Biometric Recognition in Feature Level Fusion using Statistical Moment
Measure of Color Values. Journal of Computer and Mathematical Sciences, 10(3), 584–592. doi:10.29055/
jcms/1041
Mahoor, M. H., & Abdel-Mottaleb, M. (2008). A multimodal approach for face modeling and recognition. IEEE
Transactions on Information Forensics and Security, 3(3), 431–440. doi:10.1109/TIFS.2008.924597
Van Den Broek, E. L., & van Rikxoort, E. V. (2004). Evaluation of color representation for texture analysis.
Proceedings of the 16th Belgium-Netherlands Artificial Intelligence Conference, 35-42.
Van Rikxoort, E. M., van den Broek, E. L., & Schouten, T. E. (2005). Object based image retrieval: Utilizing
color and texture. Academic Press.
Wang, K., Wu, D., Chen, F., Liu, Z., Luo, X., & Liu, S. (2010). Angular color uniformity enhancement of
white light-emitting diodes integrated with freeform lenses. Optics Letters, 35(11), 1860–1862. doi:10.1364/
OL.35.001860 PMID:20517442
Zhou, D., Wu, X., Zhao, W., Lawless, S., & Liu, J. (2017). Query expansion with enriched user profiles for
personalized search utilizing folksonomy data. IEEE Transactions on Knowledge and Data Engineering, 29(7),
1536–1548. doi:10.1109/TKDE.2017.2668419
Adel Alti obtained a Master's degree from the University of Setif (UFAS), Algeria, in 1998, and a Ph.D. in software engineering from UFAS University of Sétif, Algeria, in 2011. He is currently an associate professor (HDR) at the University of Sétif and head of the Smart Semantic Context-aware Services research group at the LRSD laboratory. His areas of interest include mobility; ambient, pervasive and ubiquitous computing; automated software engineering; mapping multimedia concepts into UML; semantic integration of architectural descriptions into MDA platforms; context-aware quality software architectures; automated service management; and context and QoS. He has published a number of papers on these subjects.