
TABLE I. CONFUSION MATRIX ON FER2013

        Ang.  Dis.  Fea.  Hap.  Sad   Sur.  Neu.
Ang.     626    29   135    39   141    34    84
Dis.       8    68     2     1     3     0     2
Fea.      88     2   510    38   138    55    58
Hap.      22     2    20  1548    44    30    77
Sad      108     6   183    35   685    13   157
Sur.      16     1    71    32    17   678    19
Neu.      90     3   103    81   219    21   836

TABLE II. CONFUSION MATRIX ON CK+

        Ang.  Dis.  Fea.  Hap.  Sad   Sur.  Neu.
Ang.      23     0     0     0     1     0     0
Dis.       4    33     4     0     0     0     1
Fea.       0     3    38     0     0     0     0
Hap.       0     0     0    33     0     0     0
Sad        5     0     0     0    36     0     2
Sur.       0     0     2     0     0    32     1
Neu.       0     2     0     0     2     0    30
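As a quick sanity check (a sketch, not part of the paper), the overall accuracies reported in the text can be recomputed from Tables I and II by dividing the diagonal sum by the total sample count; rows are read as the true class and columns as the predicted class, which is the usual convention (the accuracy is the same either way):

```python
# Confusion matrices transcribed from Table I (FER2013) and Table II (CK+),
# class order: Ang., Dis., Fea., Hap., Sad, Sur., Neu.
fer2013 = [
    [626, 29, 135, 39, 141, 34, 84],
    [8, 68, 2, 1, 3, 0, 2],
    [88, 2, 510, 38, 138, 55, 58],
    [22, 2, 20, 1548, 44, 30, 77],
    [108, 6, 183, 35, 685, 13, 157],
    [16, 1, 71, 32, 17, 678, 19],
    [90, 3, 103, 81, 219, 21, 836],
]
ck_plus = [
    [23, 0, 0, 0, 1, 0, 0],
    [4, 33, 4, 0, 0, 0, 1],
    [0, 3, 38, 0, 0, 0, 0],
    [0, 0, 0, 33, 0, 0, 0],
    [5, 0, 0, 0, 36, 0, 2],
    [0, 0, 2, 0, 0, 32, 1],
    [0, 2, 0, 0, 2, 0, 30],
]

def accuracy(cm):
    """Overall accuracy: sum of the diagonal over the total sample count."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

print(f"FER2013: {accuracy(fer2013):.2%} of {sum(sum(r) for r in fer2013)}")
print(f"CK+:     {accuracy(ck_plus):.2%} of {sum(sum(r) for r in ck_plus)}")
```

The row sums total 7178 and 252 samples, matching the text; the diagonals give 68.97% on FER2013 and 89.29% on CK+ (reported as 89.2% in the text).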

Since there is no neutral expression in the CK+ dataset, we take the first frame of each emotion sequence to generate the neutral-expression samples. During the training process, the emotion images of each dataset are converted into CSV format for training and testing, and cross-validation is carried out on the FER2013 and CK+ datasets respectively.

We first test the emotion recognition model on the FER2013 and CK+ datasets. Table I is the confusion matrix on FER2013; the test accuracy is 68.97% on 7,178 test samples. Table II is the confusion matrix on CK+; the test accuracy is 89.2% on 252 test samples.

B. Experiments on Nao robot

Although artificial intelligence has made great progress, humans still cannot interact with robots deeply, because robots cannot understand people's emotions. Emotion will therefore play an increasingly important role in human-robot interaction. NAO is a humanoid robot from SoftBank (Fig. 6(a)) and is especially suitable as a research platform for service robots.

After training and testing the proposed model, we deploy it on the Nao robot. The test results are shown in Figure 6 and show that the trained model can recognize the expressions of different people accurately and in real time.

Figure 6. Recognition results on the Nao robot: (a) Nao robot, (b) angry, (c) neutral, (d) surprised, (e) disgust, (f) fear, (g) sad, (h) happy.

V. CONCLUSIONS

In this paper, a facial expression recognition framework based on MobileNetV2 and SSD is proposed and applied on the Nao robot. We use the depthwise separable convolutions of the MobileNetV2 model, which not only reduce the parameters of the convolutional network but also improve the precision of expression recognition. Experimental verification shows that this method can perform facial expression recognition accurately and in real time on the Nao robot platform.

ACKNOWLEDGMENTS

This work is supported in part by the project of the Jilin Provincial Science and Technology Department under Grant 20180201003GX and the project of the Jilin Province Development and Reform Commission under Grant 2019C053-4. All authors gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.


