AURA – AI for every developer
Submitted in partial fulfillment of the requirement for the award of Degree
of Bachelor of Technology in Computer Science & Engineering
Submitted to:
Submitted by:
IPS ACADEMY,
2022-23
Project-I entitled
CERTIFICATE
This is to certify that Project-III entitled
I would like to express my heartfelt thanks to Dr. Neeraj Shrivastava, CS, for his
guidance, support, and encouragement during the course of my study for B.Tech
(CS) at IPS Academy, Institute of Engineering & Science, Indore. Without his
endless effort, knowledge, patience, and answers to my numerous questions, this
Project would never have been possible. It has been a great honor and pleasure for
me to do this Project under his supervision.
My gratitude will not be complete without mention of Dr. Archana Keerti
Chowdhary, Principal, IPS Academy, Institute of Engineering & Science; Dr.
Neeraj Shrivastava, HOD CSE, IPS Academy, Institute of Engineering & Science;
and Mr. Arvind Upadhyay, Branch Coordinator CSIT, IPS Academy, Institute of
Engineering & Science, for their encouragement and for giving me the opportunity
for this project work. I also thank my friends who have spared their valuable time
for discussion and suggestions on the critical aspects of this report. I want to
acknowledge the contribution of my parents and my family members for their
constant motivation and inspiration.
Finally, I thank the Almighty God, who has been my guardian and a source of
strength and hope in this period.
List of Abbreviation ii
Abstract iii
CHAPTER 4: DESIGNS 15
4.1 Use Case Diagram 16
4.2 Sequence Diagram 17
4.3 Flow Chart 18

LIST OF FIGURES
1 Activity Diagram 05
3 Sequence Diagram 17
4 Flow Chart 18
LIST OF ABBREVIATION
CV Computer Vision
CD Continuous Delivery
TF TensorFlow
DC Digital Camera
ABSTRACT
This project aims to develop a system for image recognition and emotion detection
using deep learning algorithms. The system will be trained on a dataset of images
with labeled emotions to recognize and classify facial expressions into different
emotion categories such as happy, sad, angry, surprised, and neutral. The project will
use state-of-the-art deep learning techniques such as convolutional neural networks
(CNNs) and transfer learning to improve the accuracy and efficiency of the model.
The system will be implemented using Python and popular deep learning frameworks
such as TensorFlow and Keras. The final output will be a user-friendly interface that
allows users to upload an image and receive the corresponding emotion category label
with a confidence score. This system has various applications in fields such as
psychology, marketing, and entertainment.
CHAPTER-1
INTRODUCTION
Facial expression is the visible manifestation of the affective state, cognitive
activity, intention, personality, and psychopathology of a person, and it plays a
communicative role in interpersonal relations. It has been studied for a long period
of time, with considerable progress in recent decades. Though much progress has
been made, recognizing facial expressions with high accuracy remains difficult due
to the complexity and variety of facial expressions.
Generally, human beings convey intentions and emotions through nonverbal means
such as gestures, facial expressions, and involuntary body language. A system that
reads these cues can provide a significantly useful, nonverbal way for people to
communicate with each other. The important question is how reliably the system
detects or extracts the facial expression from an image. The field is attracting
growing attention because it could be widely used in many areas such as lie
detection, medical assessment, and human-computer interfaces.
The system classifies facial expressions of the same person into the basic emotions,
namely anger, disgust, fear, happiness, sadness, and surprise. The main purpose of
this system is efficient interaction between human beings and machines using eye
gaze, facial expressions, cognitive modeling, etc. Here, the detection and
classification of facial expressions can be used as a natural way for interaction
between man and machine.
Expression intensity varies from person to person and also with age, gender, and
the size and shape of the face; further, even the expressions of the same person do
not remain constant over time. Moreover, the inherent variability of facial images
caused by factors such as variations in illumination, pose, alignment, and occlusion
makes expression recognition a challenging task. Several surveys on facial feature
representations for face recognition and expression analysis have addressed these
challenges and possible solutions in detail.
1.1 Overview
Image recognition refers to the ability of a machine to identify objects, people, or
other elements within an image. It involves using computer vision techniques and
machine learning algorithms to analyze and understand the contents of an image.
The process typically involves segmenting the image into smaller parts, extracting
features from those parts, and using those features to classify the image.
Face detection, on the other hand, is a specific application of image recognition that
focuses on identifying human faces within an image. It involves using computer
algorithms to detect and locate the position of human faces within an image. This
process often involves identifying specific features of the face, such as the eyes,
nose, and mouth, and using those features to locate the face.
Both image recognition and face detection have a wide range of applications in
various fields, including security, entertainment, marketing, and healthcare.
Facial expression involves one or more motions or positions of the muscles beneath
the skin of the face. These movements convey the emotional state of an individual
to observers. Facial expressions are the most important information for the
perception of emotions in face-to-face communication and are a form of nonverbal
communication. They are a primary means of conveying social information between
humans, but they also occur in most other mammals and some other animal species.
Facial expression detection is one of the most relevant applications of image
processing and biometric systems. It is not an easy task because of circumstances
such as illumination, facial occlusions, and the color and shape of the face, and
these expressions can vary between individuals. Face detection systems face many
problems relating to pose, lighting, facial expression, and picture quality; these can
be mitigated by applying some form of image preprocessing before further analysis.
Face detection determines the locations and sizes of faces in an input image. Faces
are easily located in cluttered scenes by infants and adults alike; however,
automatic face detection by computers is a very challenging task because face
patterns can have significantly variable image appearances. For example, human
faces vary across genders, ages, hairstyles, races, etc. In addition, variations in the
scale, shape, and pose of faces in images also hinder the success of automatic face
detection systems. Several different approaches have been proposed to solve the
problem of face detection, and each approach has its own
advantages and disadvantages.
Emotions are read by humans through the movement of facial muscles and the
behavior of certain features on the face. We detect faces effortlessly in a wide range
of conditions, even under bad lighting or from a great distance, but such conditions
may not allow us to read the accurate expression of the face. This system will help
reduce the difficulty of detecting human expressions accurately. Looking into the
problem statement and drawing on already existing systems, the system will be able
to detect the primary emotions of sadness, anger, and happiness, and to easily tell if
the user is in any of these three states. The system will be a human-computer
interaction system that helps detect human facial expressions and emotions; it will
provide the ability to perform real-time, frame-by-frame analysis of the emotional
responses of users, detecting and tracking expressions.
Fig.: Face Database (Kaggle)
1.2 Literature Survey
Research in the fields of face detection and tracking has been very active and there
is exhaustive literature available on the same. The major challenge that the
researchers face is the non-availability of spontaneous expression data. Capturing
spontaneous expressions on images and video is one of the biggest challenges
ahead. Many attempts have been made to recognize facial expressions. Zhang et al.
investigated two types of features, geometry-based features and Gabor
wavelet-based features, for facial expression recognition. The Facial Action Coding
System (FACS), which was proposed in 1978 by Ekman and refined in 2002, is a
very popular facial expression analysis tool.
CHAPTER-2
PROBLEM IDENTIFICATION AND SCOPE
2.1 Problem Statement
Human emotions and intentions are expressed through facial expressions, and
deriving an efficient and effective feature representation is the fundamental
component of a facial expression system. Face recognition is important for the
interpretation of facial expressions in applications such as intelligent man-machine
interfaces and communication, intelligent visual surveillance, teleconferencing, and
real-time animation from live motion images. Facial expressions are useful for
efficient interaction. Most research and systems in facial expression recognition are
limited to six basic expressions (joy, sadness, anger, disgust, fear, surprise). It has
been found that these are insufficient to describe all facial expressions, and such
expressions are categorized based on facial actions.
1. Detecting the face and recognizing the facial expression is a very complicated
task, as it is vital to pay attention to primary components such as face
configuration, orientation, and the location where the face is set.
2. The different states of mind change the production level and the efficiency of
every individual.
The scope of this system is to tackle problems that can arise in day-to-day life.
Some of the scopes are:
1. The system is fully automatic and has the capability to work with an image
feed. It can recognize spontaneous expressions.
2. Our system can be used in digital cameras wherein the image is captured
only when the person smiles.
3. In security systems, it can identify a person in any form of expression he
presents himself.
4. Rooms in homes can set the lights and television to a person's taste when
they enter the room.
5. Doctors can use the system to understand the intensity of pain or illness of a
deaf patient.
6. Our system can be used to detect and track a user's state of mind, and in
mini-marts and shopping centers to view the feedback of customers to
enhance the business, and many more.
CHAPTER-3
SOFTWARE ENGINEERING APPROACH
3.1 Software model used
PROTOTYPE MODEL –
3.1.1 Description
Here we built a prototype, tested it, and then reworked it as necessary until an
acceptable outcome was achieved, from which the complete system or product
could be developed.
B) OpenCV Framework-
C) Independent platform OS
An independent platform OS, also known as a cross-platform OS, is an
operating system that can run on different hardware architectures and
processor types. It can also run software applications that were developed for
different operating systems, without requiring modification or recompilation
of the source code.
Independent platform OSs are useful for developers who want to create
software that can run on multiple platforms without having to develop separate
versions for each one. They are also beneficial for users who want to switch
between different hardware platforms without having to learn a new operating
system.
The CNN consists of 7 layers:
1. Input layer: This layer receives the input image and passes it to the next
layer for processing.
2. Convolutional layer: This layer applies a set of learnable filters to the input
image, producing feature maps that capture local patterns.
3. ReLU layer: This layer applies the rectified linear unit (ReLU) activation
function to the output of the previous convolutional layer, introducing
non-linearity into the model.
4. Pooling layer: This layer reduces the spatial dimensions of the feature maps
produced by the previous convolutional layer by applying a pooling
operation, such as max-pooling or average-pooling.
5. Flattening layer: This layer converts the pooled feature maps into a
one-dimensional feature vector that can be fed into a fully connected layer.
6. Fully connected layer: This layer connects every neuron in the previous
layer to every neuron in the current layer, applying a set of weights and
biases to the input to produce an output.
7. Output layer: This layer produces the final output of the CNN, which can
be a classification label or a prediction of a continuous value.
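As an illustrative sketch (not the exact model of this project), the seven layers listed above can be assembled with Keras roughly as follows; the 48x48 grayscale input and the seven emotion classes are assumptions based on common Kaggle facial expression datasets:

```python
from tensorflow.keras import layers, models

# Illustrative 7-layer CNN following the list above. The 48x48 grayscale
# input and 7 emotion classes are assumptions (typical of Kaggle FER data).
model = models.Sequential([
    layers.Input(shape=(48, 48, 1)),            # 1. input layer
    layers.Conv2D(32, (3, 3), padding="same"),  # 2. convolutional layer
    layers.ReLU(),                              # 3. ReLU layer
    layers.MaxPooling2D((2, 2)),                # 4. pooling layer (max-pooling)
    layers.Flatten(),                           # 5. flattening layer
    layers.Dense(128, activation="relu"),       # 6. fully connected layer
    layers.Dense(7, activation="softmax"),      # 7. output layer (7 emotions)
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

A real model for this task would typically stack several convolution/pooling blocks and add Dropout, but the single pass above mirrors the seven-layer description exactly.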
The above sequential model structure is based on the facial emotion recognition
model that we have implemented in our project.
3.2.2 Hardware Requirement
Following are the hardware requirements that are most important for the project:
A) A smoothly working laptop
B) A minimum of 4 GB RAM
C) A web camera
CHAPTER-4
DESIGNS
4.1 USE CASE DIAGRAM
System design shows the overall design of the system. In this section we discuss
the design aspects of the system in detail:
4.2 SEQUENCE DIAGRAM
4.3 FLOWCHART
CHAPTER-5
IMPLEMENTATION PHASE
5.1 Language Used & its Characteristics
A) Python
Python is a dynamic, interpreted (bytecode-compiled) language. There are no type
declarations of variables, parameters, functions, or methods in code. This makes
the code short and flexible, but you lose the compile-time checking of the source
code. Python tracks the types of all values at runtime and flags code that does not
make sense as it runs.
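For example, the following short sketch shows this runtime behavior:

```python
# Short sketch of Python's runtime typing: no declarations are needed,
# and type errors surface only when the offending line actually runs.
def double(x):
    return x * 2              # works for any type that supports *

print(double(21))             # 42
print(double("ab"))           # abab -- same function, different type

try:
    len(42)                   # ints have no length; flagged only at runtime
except TypeError as err:
    print("runtime type error:", err)
```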
LIBRARIES OF PYTHON-
1. Pyttsx: Pyttsx is a cross-platform text-to-speech library for Python. It allows
you to generate speech from text, with support for multiple TTS engines and
languages.
2. Datetime: This module allows developers to work with dates and times. It
provides classes for working with dates, times, time deltas, and more.
3. Wikipedia: This module allows developers to search and retrieve information
from Wikipedia. It provides an API for accessing Wikipedia pages and retrieving
information.
4. Webbrowser: This module provides a simple way to open a web browser
from a Python script. Developers can use it to open a web page or search for
something on the internet.
5. Pyjokes: This module provides a collection of funny jokes in Python.
Developers can use it to add humor to their Python applications.
6. Time: This module provides various functions to work with time in Python. It
can be used to measure the time taken to execute a piece of code, to delay the
execution of a program, and more.
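As a small illustration, the datetime and time modules described above can be combined as follows (the timestamp is a fixed example value, not the current time):

```python
import datetime
import time

# Sketch combining the datetime and time modules described above.
start = time.perf_counter()
moment = datetime.datetime(2023, 5, 1, 14, 30)   # fixed example timestamp
next_day = moment + datetime.timedelta(days=1)   # timedelta arithmetic
elapsed = time.perf_counter() - start            # elapsed-time measurement

print(moment.strftime("%Y-%m-%d %H:%M"))         # 2023-05-01 14:30
print(next_day.day)                              # 2
print(elapsed >= 0.0)                            # True
```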
B) OpenCV
OpenCV (Open Source Computer Vision Library) is an open-source computer
vision and machine learning software library. OpenCV was built to provide a
common infrastructure for computer vision applications and to accelerate the use
of machine perception in commercial products. Being a BSD-licensed product,
OpenCV makes it easy for businesses to utilize and modify the code.
The library has more than 2500 optimized algorithms, which includes a
comprehensive set of both classic and state-of-the-art computer vision and machine
learning algorithms. These algorithms can be used to detect and recognize faces,
identify objects, classify human actions in videos, track camera movements, track
moving objects, extract 3D models of objects, produce 3D point clouds from stereo
cameras, stitch images together to produce a high-resolution image of an entire
scene, find similar images from an image database, remove red eyes from images
taken using flash, follow eye movements, recognize scenery and establish markers
to overlay it with augmented reality, etc. It has C++, C, Python, Java and
MATLAB interfaces.
(C) TensorFlow
TensorFlow is a popular open-source machine learning framework developed by
Google. It is designed to make it easy to develop and train machine learning
models, particularly deep neural networks, on a variety of platforms including
CPUs, GPUs, and TPUs.
TensorFlow has gained significant popularity in the field of computer science due
to its versatility and scalability. It provides a range of pre-built machine learning
models, as well as a flexible framework for building custom models, making it
useful for a wide range of applications.
Overall, TensorFlow has become an essential tool for computer scientists and
machine learning practitioners, and it continues to evolve with new features and
capabilities.
(D) Keras
Keras is a deep learning API written in Python, running on top of the machine
learning platform TensorFlow. It was developed with a focus on enabling fast
experimentation. Being able to go from idea to result as fast as possible is key to
doing good research.
Keras is:
• Simple -- but not simplistic. Keras reduces developer cognitive load to free you
to focus on the parts of the problem that really matter.
• Flexible -- Keras adopts the principle of progressive disclosure of complexity:
simple workflows should be quick and easy, while arbitrarily advanced workflows
should be possible via a clear path that builds upon what you have already learned.
LAYERS OF KERAS:
2. Dense: This layer is a fully connected layer that connects every neuron in the
previous layer to every neuron in the current layer. It applies a set of weights
and biases to the input to produce an output.
4. Dropout: This layer randomly drops out a fraction of the input units during
training, preventing overfitting by forcing the network to learn more robust
features.
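As a rough illustration of the idea behind the Dropout layer (a simplified sketch of "inverted dropout", not Keras's actual implementation):

```python
import random

def dropout(values, rate, seed=0):
    """Simplified 'inverted dropout': zero each value with probability
    `rate` during training and scale the survivors by 1/(1-rate)."""
    rng = random.Random(seed)      # fixed seed for a reproducible example
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

out = dropout([1.0] * 8, rate=0.5)
print(out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

The rescaling keeps the expected sum of activations unchanged, which is why dropout can simply be switched off at inference time.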
5.2 Dataset
A dataset is an organized collection of data; in this project, it is the collection of
labeled facial images used for training and testing.
5.2.1 KAGGLE
Kaggle is a platform for data science competitions and collaborative data science
projects. The platform also hosts a large collection of datasets that can be used for a
variety of data analysis and machine learning tasks.
Kaggle datasets are contributed by the Kaggle community, and they are often
accompanied by documentation and tutorials that help users get started with the
data. Users can search for datasets by keyword, browse by category, or sort by
popularity or date uploaded.
5.3 EMOTIONS
HAPPY:
SAD:
ANGER:
DISGUST:
NEUTRAL:
FEAR:
SURPRISE:
CHAPTER-6
TESTING METHOD
6.1 Testing Method
System testing was done by giving different training and testing datasets. This test
was done to evaluate whether the system was predicting accurate results or not.
During the development of the system, our system was tested time and again.
The series of tests conducted is as follows:
Unit Testing: Developers in a test-driven environment will typically write and run
the tests prior to the software or feature being passed over to the test team. Unit
testing can be conducted manually, but automating the process will speed up
delivery cycles and expand test coverage. Unit testing also makes debugging easier,
because finding issues earlier means they take less time to fix than if they were
discovered later in the testing process. TestLeft is a tool that allows advanced
testers and developers to shift left with the fastest test automation tool embedded in
any IDE.
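As a hedged sketch of a unit test in Python, the project's language (normalize_pixels below is a hypothetical helper for illustration, not part of the actual system):

```python
import unittest

def normalize_pixels(pixels):
    """Hypothetical preprocessing helper: scale 0-255 values to 0-1."""
    return [p / 255.0 for p in pixels]

class TestNormalizePixels(unittest.TestCase):
    def test_bounds(self):
        out = normalize_pixels([0, 127, 255])
        self.assertEqual(out[0], 0.0)
        self.assertEqual(out[-1], 1.0)
        self.assertTrue(all(0.0 <= v <= 1.0 for v in out))

# Run the test case programmatically (python -m unittest would also work).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNormalizePixels)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```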
This is the accuracy of our trained model.
Integration Testing: These tests are often framed by user scenarios, such as logging
into an application or opening files. Integration tests can be conducted by either
developers or independent testers and usually comprise a combination of automated
functional and manual tests.
Beta Testing: Beta testing helps minimize the risk of product failure and increases
the quality of the product through customer validation. It is the final test before
shipping a product to the customers. One of the major advantages of beta testing is
direct feedback from customers.
CHAPTER-7
CONCLUSION
This project proposes an approach for recognizing the category of facial expressions.
Face Detection and Extraction of expressions from facial images is useful in many
applications, such as robotics vision, video surveillance, digital cameras, security, and
human-computer interaction.
In this project, seven different facial expressions from images of different persons
from different datasets have been analyzed. This project involves preprocessing of
captured facial images, followed by feature extraction using Local Binary Patterns
and classification of facial expressions based on training on datasets of facial
images using Support Vector Machines. This project recognizes more facial
expressions based on the KAGGLE face database. To measure the performance of
the proposed algorithm and methods and check the accuracy of the results, the
system has been evaluated using Precision, Recall, and F-score. The same datasets
were used for both training and testing by dividing the KAGGLE datasets into
training samples and testing samples in the ratio of 9:1.
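The 9:1 division described above can be sketched in plain Python as follows (the sample list is a stand-in for the actual KAGGLE data):

```python
import random

def split_9_1(samples, seed=42):
    """Shuffle and divide samples into 90% training and 10% testing sets."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.9)
    return shuffled[:cut], shuffled[cut:]

samples = list(range(100))         # stand-in for labeled face images
train_set, test_set = split_9_1(samples)
print(len(train_set), len(test_set))   # 90 10
```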
Experimental results on the KAGGLE dataset show that our proposed method can
achieve good performance. Facial expression recognition is a very challenging
problem. More efforts should be made to improve the classification performance
for important applications. Our future work will focus on improving the
performance of the system and deriving more appropriate classifications, which
may be useful in many real-world applications.
CHAPTER-8
FUTURE ENHANCEMENTS
Face expression recognition systems have improved a lot over the past decade. The
focus has shifted from posed expression recognition to spontaneous expression
recognition. Promising results can be obtained under face registration errors, with
fast processing times and a high correct recognition rate (CRR), and significant
performance improvements can be obtained in our system. The system is fully
automatic and has the capability to work with an image feed; it can recognize
spontaneous expressions. Our system can be used in digital cameras wherein the
image is captured only when the person smiles, and in security systems which can
identify a person in any form of expression he presents himself. Rooms in homes
can set the lights and television to a person's taste when they enter the room.
Doctors can use the system to understand the intensity of pain or illness of a deaf
patient. Our system can also be used to detect and track a user's state of mind, and
in mini-marts and shopping centers to view the feedback of customers to enhance
the business, etc.
REFERENCES
[1] Bettadapura, V. (2012). Face expression recognition and analysis: the state of
the art.
[2] Shan, C., Gong, S., & McOwan, P. W. (2005, September). Robust facial
expression recognition using local binary patterns. In Image Processing, 2005. ICIP
2005. IEEE International Conference on (Vol. 2, pp. II-370). IEEE.
[3] Ahmed, F., Bari, H., & Hossain, E. (2014). Person-independent facial
expression recognition based on compound local binary pattern (CLBP). Int. Arab
J. Inf. Technol.
[4] Happy, S. L., George, A., & Routray, A. (2012, December). A real time facial
expression classification system using Local Binary Patterns. In Intelligent Human
Computer Interaction (IHCI), 2012 4th International Conference on (pp. 1-5). IEEE.
[5] Zhang, S., Zhao, X., & Lei, B. (2012). Facial expression recognition based on
local binary patterns and local fisher discriminant analysis. WSEAS Trans. Signal
Process.
[6] Bhatt, D. H., Rathod, K. R., & Agravat, S. J. (2014). A Study of Local Binary
Pattern Method for Facial Expression Detection.
[7] Chen, J., Chen, Z., Chi, Z., & Fu, H. (2014, August). Facial expression
recognition based on facial components detection and hog features. In International
Workshops on Electrical and Computer Engineering Subfields.