Abstract: In this paper, an Interval Type-2 neuro-fuzzy inference system based emotion recognition system is proposed. The employed fuzzy inference system is a four layer network realizing the Takagi-Sugeno-Kang fuzzy inference mechanism, with an input layer, a rule layer, a normalization layer and an output layer. The rule layer employs an Interval Type-2 fuzzy membership function to handle the uncertainty in the facial emotions of different individuals. The rules for this network are generated by employing a meta-cognitive projection based learning algorithm. The aim of the proposed approach is to approximate the decision surface separating different emotions based on noisy input features. During learning, as a sample is presented to the network, it calculates the prediction error and knowledge content in the sample to decide whether to learn the sample, when to learn it and which technique to use to learn it. The meta-cognitive learning mechanism helps in avoiding overtraining and achieving better generalization performance. The projection based learning approach employed to learn the knowledge in a given sample works by minimizing the total energy in the network in a linear least squares sense.

The proposed emotion recognition system is tested on two well-known publicly available datasets: the Japanese female facial expression dataset and the Taiwanese female expression image database. Local binary pattern based features are extracted from these databases, as they have been shown to describe facial features such as edges and spots efficiently. Two different studies are performed: a 5-fold cross-validation study on the emotion recognition ability of the system and a database independent study. The performance comparison with other approaches clearly highlights the advantage of the proposed system.

Index Terms: Emotion recognition, fuzzy inference system, Type-2 fuzzy set, meta-cognition

I. INTRODUCTION

Automatic recognition of emotions has wide implications in applications such as data-driven animation, operator fatigue detection, and human-computer interaction. However, there are various challenges in the state-of-the-art approaches, with a few being less researched than others. Intra/inter-personal variability in expressing emotions is one such challenge in computer based recognition of human emotions. The gravity of this issue becomes evident upon considering the fact that people may express emotions differently at different times. Further, there are certain differences in how emotions are expressed across cultures. Although considerable research has been carried out on automatic emotion recognition, very little work is available on handling uncertainty in the emotions.

A given emotion recognition system involves two tasks: emotion representation and emotion recognition. Emotion representation involves extraction of facial features that could efficiently and distinctly represent the various emotions, whereas emotion recognition involves learning and identifying different emotions based on the facial features. The facial action coding system (FACS) [1] is one of the first complete works on facial emotion representation, and it involved human operators observing the face for a given set of characteristic movements to analyse the emotions. Such systems are not reliable, as judgement may vary drastically across human operators. However, these systems possess the human ability to handle uncertainty in facial emotions.

In the past two decades, various emotion representation and automatic recognition systems have been proposed. In [2], the authors employed linear discriminant analysis to classify principal component features of the face images, while an independent component analysis based approach for expression recognition was used in [3]. Here, the authors employ wavelet based features to represent the facial emotions. In the literature, authors have also employed general discriminant analysis for extracting representative features from facial images [4]. Facial expression contains different motion information which could be separated by employing motion based information. This information was employed by [5], [6] to develop an optical flow based technique to model facial features, as optical flow based features could intrinsically capture the apparent movement in pixels. However, optical flow based features suffer from their sensitivity to lighting variation and disturbance caused by nonrigid motions, in addition to inaccurate image representation and discontinuous motion. Recently, local binary patterns (LBPs) [7] have been shown to be effective features for facial image analysis [8], [9]. LBPs are non-parametric descriptors that describe the local spatial structure of an image. They are extracted by employing the LBP operator, which thresholds a given pixel in a window of an image with respect to its neighbors. The derived binary thresholds are represented as LBP codes. In [10], a meta-cognitive neuro-fuzzy inference system based classifier is employed for efficient recognition of facial emotions. A local binary pattern based feature is employed here. Another efficient feature representation technique is Gabor filter based features. Gabor filter based feature is obtained by convolving a given image
[Figure: network architecture, Layer 1 to Layer 6, mapping inputs x_1, ..., x_m to outputs y_1, ..., y_n]

The output of each node is given as

$$F_k^t = \alpha F_k^{lo,t} + (1 - \alpha) F_k^{up,t}; \quad k = 1, \ldots, K \tag{8}$$

where $\alpha$ is the weight measure of uncertainty. In our study, we choose $\alpha = 0.5$.

Layer 5 - Normalization layer: This layer consists of K nodes and each node normalizes the firing strength of a rule that is generated by the type reduction layer:

$$\bar{F}_k^t = \frac{F_k^t}{\sum_{p=1}^{K} F_p^t}, \quad k = 1, \ldots, K \tag{9}$$
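As an illustration of Eqs. (8) and (9), the following minimal sketch blends the lower and upper firing strengths of each rule and then normalizes the result. It is not the authors' implementation: the function names, the parameter name `alpha` for the uncertainty weight, and the example firing strengths are assumptions made here for illustration only.

```python
def type_reduce(f_lo, f_up, alpha=0.5):
    """Eq. (8): weighted blend of lower and upper firing strengths per rule.

    alpha is the uncertainty weight; 0.5 is the value chosen in the study.
    """
    return [alpha * lo + (1.0 - alpha) * up for lo, up in zip(f_lo, f_up)]


def normalize(f):
    """Eq. (9): divide each firing strength by the sum over all K rules."""
    total = sum(f)
    if total == 0.0:
        raise ValueError("all firing strengths are zero; cannot normalize")
    return [v / total for v in f]


# Hypothetical lower/upper firing strengths for K = 3 rules.
f_lo = [0.2, 0.1, 0.0]
f_up = [0.6, 0.3, 0.2]

f = type_reduce(f_lo, f_up)   # type-reduced firing strengths
f_bar = normalize(f)          # normalized firing strengths, summing to 1
```

With the weight at 0.5, each type-reduced strength is simply the midpoint of the lower and upper firing strengths, so the normalization layer sees one crisp value per rule.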
The knowledge content is measured by the spherical potential and is given as

$$\psi^t = \frac{1}{K_s} \sum_{k=1}^{K_s} F_k^t \tag{13}$$

where $K_s$ is the number of rules with $F_k^t > 0.01$.

Upon monitoring the above measures, the meta-cognitive learning mechanism decides on one of three strategies: sample deletion, sample learning and sample reserve.

Sample Deletion Strategy: If the maximum prediction error for a sample is less than the delete threshold, then the knowledge contained in the sample is similar to that present in the network and hence it could be deleted without being used in learning. This helps in avoiding over-fitting and reducing training time.

Sample Learning Strategy: A sample is learnt by either growing a new rule or by adapting the parameters of the network to learn the knowledge contained in the sample.

Rule growing strategy: If the maximum prediction error of a sample is very high and the spherical potential is below the novelty threshold, then a new rule is added to the network to capture the knowledge in the current sample. When a new rule is added the center is initialized as,

Upon adding a new rule, the corresponding output weights are estimated by the Projection Based Learning (PBL) algorithm. The PBL algorithm aims to find optimal weights that minimize the sum of squared errors in the network, i.e.,

$$W^* = \arg\min_{W} J(W) \tag{19}$$

where,

$$J(W) = \sum_{l=1}^{t} \sum_{j=1}^{n} \begin{cases} \left( y_j^l - \hat{y}_j^l \right)^2 & \text{if } y_j^l \hat{y}_j^l < 1 \\ 0 & \text{otherwise} \end{cases} \tag{20}$$

The optimal solution to the above minimization problem could be obtained by equating the first order partial derivative of J(W) to zero and re-arranging as,

$$\sum_{k=1}^{K} \sum_{t=1}^{N} F_k^t F_p^t w_{jk} = \sum_{t=1}^{N} F_p^t y_j^t, \quad p = 1, \ldots, K; \; j = 1, \ldots, n \tag{21}$$

which in turn is written as

$$\sum_{k=1}^{K} w_{jk} a_{kp} = b_{pj} \quad \Rightarrow \quad A W = B \tag{22}$$
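The spherical potential of Eq. (13), the deletion and rule-growing checks, and the normal equations of Eqs. (21)-(22) can be sketched as below. This is only an illustrative reading of the text, not the authors' code: all function names, the three threshold parameters and the example data are hypothetical. The criteria for parameter update and sample reserve are not specified in this excerpt, so they are lumped into one return value, and the hinge condition of Eq. (20) is not applied, in line with Eq. (21), which sums over all samples.

```python
def spherical_potential(f, min_strength=0.01):
    """Eq. (13): mean firing strength over the rules with F_k^t > 0.01."""
    active = [v for v in f if v > min_strength]
    return sum(active) / len(active) if active else 0.0


def choose_strategy(max_error, potential, delete_thr, add_thr, novelty_thr):
    """Deletion and rule-growing checks; thresholds are hypothetical inputs."""
    if max_error < delete_thr:
        return "delete"
    if max_error > add_thr and potential < novelty_thr:
        return "grow"
    return "update_or_reserve"  # criteria not given in this excerpt


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(a)
    m = [list(a[i]) + list(b[i]) for i in range(k)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("projection matrix A is singular")
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(k):
            if r != col and m[r][col] != 0.0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[k:] for row in m]


def pbl_weights(firing, targets):
    """Eqs. (21)-(22): build A and B, then solve A W = B.

    firing:  N x K list, row t holds F_1^t ... F_K^t
    targets: N x n list, row t holds y_1^t ... y_n^t
    Returns a K x n list whose entry [k][j] is w_jk.
    """
    n_rules, n_out = len(firing[0]), len(targets[0])
    # a_kp = sum_t F_k^t F_p^t
    a = [[sum(f[k] * f[p] for f in firing) for p in range(n_rules)]
         for k in range(n_rules)]
    # b_pj = sum_t F_p^t y_j^t
    b = [[sum(f[p] * y[j] for f, y in zip(firing, targets))
          for j in range(n_out)] for p in range(n_rules)]
    return solve_linear(a, b)


# Hypothetical data: N = 3 samples, K = 2 rules, n = 2 outputs.
firing = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]
w_true = [[1.0, -1.0], [-1.0, 1.0]]
targets = [[sum(f[k] * w_true[k][j] for k in range(2)) for j in range(2)]
           for f in firing]
w = pbl_weights(firing, targets)  # recovers w_true up to rounding
```

Because the targets in this toy example are generated exactly from `w_true`, the least-squares solution coincides with it, which makes the sketch easy to check by hand.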
TABLE II
PERFORMANCE COMPARISON OF DIFFERENT EMOTION RECOGNITION SYSTEMS ON JAFFE AND TFEID: DATABASE INDEPENDENT STUDY FOR LBP FEATURES
and the output weights are updated according to:

$$W_K = W_K + A^{-1} \left( F^t \right)^T \left( e^t \right)^T \tag{28}$$

Sample Reserve Strategy: A sample satisfying none of the above strategies is reserved to be considered for learning at a later point in time.

A detailed description of this learning algorithm could be found in [21]. Next, the performance of the proposed emotion recognition system is analysed on two benchmark emotion recognition databases.

III. PERFORMANCE EVALUATION

In this study, two databases, JAFFE [22] and TFEID [23], are employed. The JAFFE database consists of 213 images of seven frontal shots of facial expressions posed by ten Japanese female models. The TFEID database is a growing database; however, only frontal facial images are considered in this study. It consists of 336 images with eight emotions, out of which the 268 images with emotions matching those in JAFFE are considered here.

Two different tests are performed: a 10-fold cross-validation study and a database independent emotion recognition study. In order to extract 10-fold data for JAFFE, one image per person per emotion is employed for testing and the rest of the images are used in training. In TFEID, as the number of images per person is not the same, 66% of images per emotion are used for training and the rest for testing. The performance of the proposed approach is compared with two state-of-the-art classifiers, McFIS [26] and SVM [27].

10-Fold Cross-Validation Study: The performance of the three classifiers, McFIS, SVM and McIT2FIS, is tabulated in Table I. It could be observed from the table that McIT2FIS outperforms the existing classifiers. It should be noted that McFIS and McIT2FIS both employ a meta-cognitive learning mechanism to help improve generalization performance. However, the use of Interval Type-2 fuzzy sets might have helped McIT2FIS in achieving better recognition than McFIS.

Database Independent Study: The performance comparison of the three classifiers for the database independent study is tabulated in Table II. From the table it could be observed that for both the cases, none of the classifiers is able to obtain good recognition. However, among the classifiers, McIT2FIS achieves better recognition. More work needs to be done in database independent automatic emotion recognition systems.

IV. CONCLUSION

In this paper, a Meta-Cognitive Interval Type-2 Neuro-Fuzzy Inference System (McIT2FIS) based emotion recognition system was proposed. The use of Interval Type-2 fuzzy sets has helped the system handle uncertainty in the representation of facial emotions. Further, the use of the meta-cognitive learning mechanism in combination with projection based learning has helped the system in attaining better generalization while employing fewer resources. The performance of the proposed McIT2FIS based emotion recognition system was evaluated on two databases: JAFFE and TFEID. Two studies were performed: a 10-fold cross-validation study and a database independent emotion recognition study. Based on the study it could be concluded that McIT2FIS is able to better learn and classify emotions within a database. However, the performance degrades significantly in the inter-database study.

In the future, more research would be carried out to better represent and recognize emotions across different cultures. Other feature representation techniques such as curvelet, Gabor filter and optical flow features would be analyzed. Further, fusion of data at the feature level as well as the classifier level would be studied.

REFERENCES

[1] P. Ekman and E. L. Rosenberg, What the face reveals: Basic and applied studies of spontaneous expression using the Facial Action Coding System (FACS). Oxford University Press, 1997.
[2] A. J. Calder, A. M. Burton, P. Miller, A. W. Young, and S. Akamatsu, "A principal component analysis of facial expression," Vision Research, vol. 41, pp. 1179-1208, 2001.
[3] I. Buciu, C. Kotropoulos, and I. Pitas, "ICA and Gabor representation for facial expression recognition," in IEEE International Conference on Image Processing, vol. 2, no. II, 2003, pp. 855-858.
[4] L. Shen, L. Bai, and M. Fairhurst, "Gabor wavelets and general discriminant analysis for face identification and verification," Image and Vision Computing, vol. 25, pp. 553-563, 2007.
[5] Y. Yacoob and L. S. Davis, "Recognizing human facial expression from long image sequences using optical flow," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 6, pp. 636-642, 1996.
[6] I. Essa and A. Pentland, "Coding, analysis, interpretation, and recognition of facial expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 757-763, 1997.
[7] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, 2002.
[8] A. Hadid, M. Pietikainen, and T. Ahonen, "A discriminative feature space for detecting and recognizing faces," in IEEE Conf. Computer Vision and Pattern Recognition, 2004, pp. II797-II804.
[9] C. Shan, S. Gong, and P. W. McOwan, "Robust facial expression recognition using local binary patterns," in IEEE International Conference Image Processing, 2005, pp. 370-373.
[10] K. Subramanian and S. Suresh, "Human action recognition using meta-cognitive neuro-fuzzy inference system," in IEEE International Joint Conference Neural Networks, Brisbane, Australia, 2012, pp. 1-8.
[11] S. M. Lajevardi and M. Lech, "Averaged Gabor filter features for facial expression recognition," in Digital Image Computing: Techniques and Applications, 2008, pp. 71-76.
[12] J. Ou, X. B. Bai, Y. Pei, L. Ma, and W. Liu, "Automatic facial expression recognition using Gabor filter and expression analysis," in IEEE International Conference on Computer Modeling and Simulation, 2010, pp. 215-218.
[13] D. Lowe and B. Vancouver, "Object recognition from local scale-invariant features," in IEEE Intl. Conf. Computer Vision, 1999, pp. 1150-1157.
[14] H. Soyel and H. Demirel, "Facial expression recognition based on discriminative scale invariant feature transform," Electronics Letters, vol. 46, no. 5, pp. 343-345, 2010.
[15] H. F. Huang and S. C. Tai, "Facial expression recognition using new feature extraction algorithm," Electronic Letters on Computer Vision and Image Analysis, vol. 11, no. 1, pp. 41-54, 2012.
[16] M. Tang and F. Chen, "Facial expression recognition and its application based on curvelet transform and PSO-SVM," Optik - International Journal for Light and Electron Optics, vol. 124, no. 22, pp. 5401-5406, 2013.
[17] E. Candes and D. Donoho, "Curvelets - a surprisingly effective nonadaptive representation for objects with edges," in Curves and Surface Fitting: Saint-Malo, A. Cohen, C. Rabut, and L. Schumaker, Eds., 2000, pp. 105-120.
[18] A. Saha and Q. M. J. Wu, "Facial expression recognition using curvelet based local binary patterns," in IEEE International Conference Acoustics Speech and Signal Processing, 2010, pp. 2470-2473.
[19] J. Mendel, "Advances in type-2 fuzzy sets and systems," Information Sciences, vol. 177, no. 1, pp. 84-110, 2007.
[20] O. Castillo and P. Melin, "A review on the design and optimization of interval type-2 fuzzy controllers," Applied Soft Computing, vol. 12, no. 4, pp. 1267-1278, 2012.
[21] K. Subramanian, A. K. Das, S. Sundaram, and S. Ramasamy, "A meta-cognitive interval type-2 fuzzy inference system and its projection based learning algorithm," Evolving Systems, vol. 5, no. 4, pp. 219-230, 2014.
[22] M. J. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, "Coding facial expressions with Gabor wavelets," in IEEE International Conference on Automatic Face and Gesture Recognition, 1998, pp. 200-205.
[23] L.-F. Chen and Y.-S. Yen, "Taiwanese facial expression image database," Brain Mapping Laboratory, Institute of Brain Science, National Yang-Ming University, Taipei, Taiwan, 2007.
[24] D. Huang, C. Shan, A. Mohsen, and L. Chen, "Facial image analysis based on local binary patterns: A survey," submitted for IEEE publication.
[25] D. Wu, "An overview of alternative type-reduction approaches for reducing the computational cost of interval type-2 fuzzy logic controllers," in IEEE Intl. Conf. on Fuzzy Systems, 2012, pp. 1-8.
[26] K. Subramanian, R. Savitha, and S. Suresh, "Zero-error density maximization based learning algorithm for a neuro-fuzzy inference system," in Fuzzy Systems (FUZZ), 2013 IEEE International Conference on. IEEE, 2013, pp. 1-7.
[27] C. Chang and C. Lin, "LIBSVM: A library for support vector machines," 2003, National Taiwan University, Taiwan, Department of Computer Science and Information Engineering. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm/