
Real-time student emotion and performance analysis

Pramodini Metgud
pramodinimetgud@gmail.com

Navya Dayanand Naik
navyanaik292@gmail.com

Sukrutha M S
sukruthams1@gmail.com

Anita S Prasad
Department of Electronics and Communication
JSS Science and Technology University
Mysuru, India
anith.sp@sjce.ac.in

978-1-6654-9781-7/22/$31.00 ©2022 IEEE


Abstract – The emotional state of a learner plays a major role in any course. By analyzing facial emotion accurately, student performance can be improved. The proposed system predicts emotions using the FER-2013 data set, and the EfficientNet-B0 model achieves an accuracy of around 85.72%. After facial expression detection, face recognition is performed to fetch the student's information. The student information, along with performance, is sent to the concerned faculty and parents.

Keywords – Real-time facial emotion recognition, EfficientNet-B0, Student data.

I. INTRODUCTION
The face is the most expressive part of a person [4]. Facial emotion recognition identifies a person's emotion from their facial expression. All human emotions can be expressed as a combination of 7 basic emotions (anger, fear, disgust, sadness, surprise, happiness, and neutral).

Emotion recognition and analysis are of great importance in medical practice, education, and social practices. In the field of education, the mental health of students is very important for their academic excellence [5]. The traditional way of assessing students' emotions is manual analysis of their marks and of observations from feedback forms.

II. RELATED WORK
Emotions have a significant impact on students' learning processes, according to recent studies [8][9][10][11]. The available research indicates that positive emotions affect students' attention, motivation, and self-regulation of learning, which in turn influences learning. Negative emotions, on the other hand, affect students' performance and achievement, which also affects learning. For instance, guilt and worry lower motivation and interest. Different emotions may elicit various non-task-related cognitions, and these cognitions may vary in intensity and permanence. As a result, both positive and negative emotions play a crucial part in how information is stored and retrieved during learning. According to [11], the activation of emotions encourages the activation of associated data in long-term memory, which makes recall easier.

Many researchers are interested in improving the accuracy of facial emotion recognition. Imane Lasri [1] proposed a system to recognize students' emotions using deep learning techniques such as CNNs. The data set used is FER-2013. The aim was to help teachers modify their presentation according to the result.

Mingxing Tan and Quoc Le performed a systematic study of model scaling to improve the accuracy of Convolutional Neural Networks (CNNs). Based on their observations, they proposed a new scaling method that uniformly scales all dimensions of depth, width, and resolution using a highly effective compound coefficient. Further, they used neural architecture search to design a new baseline network and scaled it up to obtain a family of models called EfficientNets [2].

Research has shown that emotion plays a very important role in students' lives, both personally and academically. Academically, students experience a large variety of self-referenced and social emotions such as pride, shame, peer pressure, and anxiety [6]. Hence there is a need for regular analysis of students' emotions.

Naman Gupta [7] proposed an automatic attendance system to overcome the drawbacks of the traditional system, in which a teacher has to mark attendance manually, which is both time-consuming and prone to mistakes.

In this paper, emotion recognition is implemented to analyze student emotions based on their facial expressions using EfficientNet-B0. EfficientNet is a group of convolutional neural network models and gives better results than traditional machine learning algorithms. The project is divided into three main parts: face emotion recognition, student data acquisition, and compiling and sending the required data to the concerned teacher.

III. PROPOSED APPROACH
The main purpose of the project is to analyze the emotions of the students twice every semester: once at the beginning, a few weeks after school/college starts, and again just before the final exams. The results and data obtained will be shared with the concerned faculty and parents if the student displays any negative emotions such as anger, sorrow, or fear.

The project can be divided into three sections. The first section is student emotion analysis, which is done using EfficientNet-B0 and the FER-2013 data set. The second section is student face recognition; this information is used to find all the data related to the student. In the third section, all the data obtained in the first two sections is sent to the concerned faculty and the student's parents.

Proposed Algorithm
Step 1: Capture an image from the webcam.
Step 2: Analyze emotions using EfficientNetB0.
Step 3: If a negative emotion is indicated:
            Perform facial detection.
            Send a notification to the concerned faculty and parents.
        Else: Exit.
Step 4: End

A. Student emotion detection

The popularity of CNNs for image analysis has grown over the past few years, largely due to the impressive results that have been obtained. However, a CNN requires a large amount of training data before it can begin to work. Training CNNs on millions of images addresses this problem, but it is time-consuming and requires a huge amount of data. Using a pre-trained model avoids the need for a large amount of data and makes better use of processing resources. The most common problem is that the intended data collection is too small to be useful. There may be instances in which overfitting is an issue, and data augmentation is not always enough to solve it. This study proposes a transfer learning method that uses
EfficientNetB0 in an emotion recognition system that focuses on salient face areas.

Data set
The data set used is FER-2013. It consists of 48x48 pixel grayscale images of faces. The faces are automatically registered so that each face is more or less centered and occupies about the same amount of space in each image. The training set consists of 28,709 grayscale images and the test set consists of 3,589 grayscale images. It contains 7 emotions (anger, happy, sad, fear, surprised, neutral, and disgust).

Proposed approach
In this paper, we suggest that the EfficientNetB0 convolutional neural network model be used as the basis for transfer learning. Real-world photographs of facial expressions were used to test the suggested approach. The image data generator is configured with the following settings: rescaling pixel values by 1/255, a 20 percent validation split, a rotation range of 5, height and width shifts of 20 percent, and a zoom range of 20 percent. After being cleaned up, the data were divided into three sets: training, validation, and testing. A target image size of 128x128 and a batch size of 64 are used for all three sets. Seven class labels are used (angry, disgusted, fearful, happy, neutral, sad, and surprised). The training set has 22,968 images, the validation set contains 5,741, and the test set contains 7,178. EfficientNetB0 is then loaded with ImageNet weights for transfer learning. The model is compiled with the Adam optimizer and the categorical cross-entropy loss function, and it is trained for 10, 50, and 100 epochs. Once training is complete, the trained weights are used for testing. A Haar cascade frontal face classifier from OpenCV is used to detect faces, and the model then determines the emotional state from the seven categories.

a) EfficientNet-B0 model

EfficientNet is a family of convolutional neural networks that proves to be more efficient than most of its predecessors. The family contains eight models, B0 through B7, each successive model having higher accuracy and more parameters. EfficientNet-B0 was trained on the ImageNet collection, which contains more than one million images. The network can classify images into 1000 object categories, such as keyboards, mice, and many animals. As a result, it has learned rich feature representations for a broad variety of images. The network accepts images with a resolution of 224 by 224 pixels [2].

EfficientNet Architecture
• In this section, we discuss the EfficientNet-B0 architecture in detail.
• B0 is a mobile-sized architecture with about 5.3M parameters [2].
• Table 1 shows the structure of the architecture.

Table 1: EfficientNet-B0 network

In the architecture, 7 inverted residual blocks are used, but each is configured differently. Additionally, these blocks use squeeze-and-excitation blocks along with swish activation.

Swish Activation
Swish is the product of a linear and a sigmoid activation.
Swish(x) = x * sigmoid(x)

Inverted Residual Block
In MobileNet's residual block, convolution is based on depthwise separable convolution, which applies a depthwise convolution first and a pointwise convolution after that. The result is a reduction in the number of trainable parameters. In the inverted residual block, skip connections link the narrow layers, while the wider layers sit between the skip connections.

Squeeze and Excitation Block
When generating an output feature map, a CNN normally weights all channels equally. Rather than treating each channel equally, the squeeze and excitation (SE) block assigns each channel its own weight. The SE block produces an output of shape (1 x 1 x channels) that specifies the weight for each channel. The network learns these weights by itself, like its other parameters, which is a significant advantage.

EfficientNet's MBConv Block

The MBConv block takes two inputs: first the data, and second the block arguments. The data is the output of the previous layer. The block arguments are a collection of attributes used inside an MBConv block, such as the number of input and output channels, the expansion ratio, and the squeeze ratio.

b) Transfer Learning
With transfer learning, a model that has been trained for one task can be transferred to another, similar task. Deep learning networks with many parameters are expensive and time-consuming to train. Because they are trained on very large volumes of data, these networks are less prone to overfitting. As a result, researchers typically invest a lot of time in training cutting-edge models. The concept of transfer learning arose from the premise that a state-of-the-art model is trained using a lot of resources, so the benefits of that investment should be realized many times over. A key benefit is that transfer learning
allows you to reuse all or part of a trained model. In this manner, we can avoid training the entire model.

c) Loss Function: Categorical Cross-Entropy
For categorical classification, the cross-entropy loss contributed by training data point $i$, $(x_i, y_i)$, is simply the "negative log-likelihood (NLL)":

$$ L_i = -\log \hat{p}(y_i \mid x_i) $$

where $\hat{p}(y_i \mid x_i)$ is the probability the model assigns to the correct label. Only this one term remains, since the ground truth probability is one for the correct label $y_i$ and zero for every other label.

d) Adam Optimizer (Adaptive Moment Estimation)

Adam is a gradient descent optimization algorithm. It works well when tackling large problems with lots of data or parameters. It is efficient and requires little memory. Intuitively, it combines the "gradient descent with momentum" and "RMSProp" algorithms.

The estimates of the first moment (mean) and second moment (uncentered variance) of the gradients are denoted by m_t and v_t, respectively. The authors of Adam observed that because m_t and v_t are initialised as vectors of zeros, they are biased toward zero, particularly during the initial time steps and especially when the decay rates are small (that is, when β1 and β2 are close to 1).

Proposed Algorithm for emotion classification
Step 1: Load the FER2013 dataset.
Step 2: Preprocess the images by defining the image data generator.
Step 3: Load the training set, validation set, and test set using a target image size of 128x128 and a batch size of 64.
Step 4: Select the class labels for 7 categories: {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy", 4: "Neutral", 5: "Sad", 6: "Surprised"}.
Step 5: Divide the images into training, testing, and validation sets.
Step 6: Apply the EfficientNetB0 model to train the network.
Step 7: Load the ImageNet weights.
Step 8: Use transfer learning from a related predictive modeling problem with an abundance of data.
Step 9: Compile the model using the Adam optimizer.
Step 10: Calculate the loss using categorical cross-entropy.
Step 11: Train the model for 10, 50, and 100 epochs to detect facial emotions.
Step 12: Test the model with OpenCV, using a Haar cascade frontal face classifier to detect faces and predicting the emotion from the seven categories.
Step 13: Obtain the expression detection results.

The validation accuracy obtained with this method after running 20 epochs is 85.72% and the loss is 18%, as shown in Fig 1. The model reaches an accuracy of 85.72% by the end of the 7th epoch, and the loss reaches its minimum of 18% at the end of the 10th epoch. Both values are therefore reached within 20 epochs.

Fig 1. Training and validation accuracy.

B. Student face recognition
In this section, the input is the same real-time image captured in the first section, which is now used for face recognition. In a class of 200 students, the teacher cannot remember all the names, so the student's name is sent to the teacher along with the emotion to make the teacher's job easier.

In this method, two databases are created: a student database and an unknown database. The student database contains the images of the students of a class. The unknown database contains the data taken from a camera installed in the classroom. Both the face recognition and the emotion detection algorithms are applied.

Student database creation
The student database is created by storing the images of the students of a class, along with their names. A Google spreadsheet is also created with the names, addresses, CGPAs, etc. of the students of that class.

With the help of the images stored in the student database, facial recognition can be performed to identify the student who is showing negative emotions.

Face detection
Faces are detected by identifying the 68 landmarks present on a person's face. Based on these landmarks, the Haar cascade (Viola and Jones) algorithm [3] is used to create a face bounding box.

Face recognition
The face_recognition library is used to recognize the student's face. It is described as the world's simplest face recognition library and is built on dlib's state-of-the-art face recognition model, which uses deep learning. The model has an accuracy of 99.38% on the Labeled Faces in the Wild benchmark.
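The matching step behind this kind of face recognition can be sketched as follows. This is only an illustration under stated assumptions, not the project's code: it assumes each face has already been converted to a 128-dimensional encoding (as dlib-based libraries do), it uses randomly generated vectors in place of real encodings, and the function name identify_student, the student names, and the 0.6 distance tolerance are hypothetical choices made for the example.

```python
import numpy as np


def identify_student(known_encodings, known_names, probe_encoding, tolerance=0.6):
    """Return the name of the closest known face, or None if none is close enough.

    Faces are compared by Euclidean distance between their encodings;
    a match is accepted only if the distance is within `tolerance`.
    """
    known = np.asarray(known_encodings, dtype=float)
    if known.size == 0:
        return None  # empty student database: nothing to match against
    probe = np.asarray(probe_encoding, dtype=float)
    distances = np.linalg.norm(known - probe, axis=1)
    best = int(np.argmin(distances))
    return known_names[best] if distances[best] <= tolerance else None


# Toy stand-ins for real encodings (assumption: 128-dimensional vectors).
rng = np.random.default_rng(0)
names = ["student_a", "student_b"]
encodings = rng.normal(size=(2, 128))

# A slightly perturbed copy of student_b's encoding, imitating a new photo.
probe = encodings[1] + 0.01 * rng.normal(size=128)
# An encoding far away from every stored one, imitating an unknown person.
stranger = encodings[0] + 10.0

print(identify_student(encodings, names, probe))
print(identify_student(encodings, names, stranger))
```

Returning None for faces outside the tolerance corresponds to the case in which the captured face is not present in the student database.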
Face recognition is implemented in this project as shown in Fig 2. The student database has labeled images of the students of a class. After the student is identified, the data obtained from the emotion detection of the student is shared with the teacher. This is discussed in the next section.

Fig. 2. Face recognition system

Proposed Algorithm for face recognition
Step 1: Start.
Step 2: Create the student database.
Step 3: Use the camera to capture the faces of students.
Step 4: Face detection.
Step 5: Face recognition by comparing images taken from the camera with the student database.
Step 6: IF: face present in the student database.
            Go to step 7.
        ELSE: go back to step 2.
Step 7: Save and send the data.
Step 8: End.

C. Sharing of data
In this section, the data obtained from the previous two sections is sent to the concerned faculty and the student's parents. The student database includes a Google Sheet that contains all the data related to the students, such as the name, address, backlogs, CGPA, etc. Once the student is identified, all the data present in the sheet, along with the emotion analysis data, is sent to the concerned faculty and that student's parents. Fig 3 is an example of student information collected and stored by the school/college.

Fig 3. Student data.

IV. RESULTS
The following results were obtained from this project. Fig 4 shows real-time student emotion analysis; the graph shows negative emotions such as fear and anger. Fig 5 shows the face recognition performed for a student showing negative emotions. Fig 6 shows all the data obtained being sent as an email to the concerned faculty and the student's parents.

Fig 4. Real-time results of a student showing negative emotions

Fig 5. Face recognition of the student.
Fig 6. Student data was sent to the concerned faculty.

V. CONCLUSION
This paper proposes a method for real-time student emotion analysis, face recognition, and sharing of student information with the concerned faculty. The model used in this project gives 85.72% accuracy for emotion detection on the FER-2013 database, and the face recognition model has an accuracy of 99.38% on labeled faces. The notification helps teachers and parents improve the student's performance, which in turn helps improve the learning environment.

This framework helps to monitor the student's emotional state. This approach also helps teachers and parents stay aware of the student's academic performance. In the future, this work can be extended to 3D facial emotion analysis, which can give better results because it considers spontaneous changes in the face.

REFERENCES
[1] Imane Lasri, Anouar Riad Solh, “Facial Emotion Recognition of Students using Convolution Neural Network,” in 2019 Third International Conference on Intelligent Computing in Data Sciences (ICDS), 2019.
[2] Mingxing Tan, Quoc Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6105-6114, 2019.
[3] P. Viola and M. J. Jones, “Robust real-time face detection,” International Journal of Computer Vision, vol. 57, no. 2, 2004.
[4] R. G. Harper, A. N. Wiens, and J. D. Matarazzo, Nonverbal communication: the state of the art. New York: Wiley, 1978.
[5] G. H. Bower, “Mood and memory,” American Psychologist, vol. 36, pp. 129–148, 1981.
[6] R. Pekrun, T. Goetz, W. Titz, and R. P. Perry, “Academic Emotions in Students’ Self-Regulated Learning and Achievement: A Program of Qualitative and Quantitative Research,” Educ. Psychol., vol. 37, no. 2, pp. 91–105, Jun. 2002.
[7] Naman Gupta, Purushottam Sharma, Vikas Deep, “Automatic Attendance System Using OpenCV.”
[8] M. Feidakis, T. Daradoumis, and S. Caballe, “Endowing e-Learning Systems with Emotion Awareness,” in 2011 Third International Conference on Intelligent Networking and Collaborative Systems, 2011, pp. 68–75.
[9] J. M. Harley, S. P. Lajoie, C. Frasson, and N. C. Hall, “An Integrated Emotion-Aware Framework for Intelligent Tutoring Systems,” in Artificial Intelligence in Education, 2015, pp. 616–619.
[10] F. Benmarrakchi, J. E. Kafi, A. Elhore, and S. Haie, “Exploring the use of the ICT in supporting dyslexic students’ preferred learning styles: A preliminary evaluation,” Educ. Inf. Technol., pp. 1–19, Oct. 2016.
[11] G. H. Bower, “Mood and memory,” American Psychologist, vol. 36, pp. 129–148, 1981.
