
Emotion Based Music Player System

Salvina Desai
Information Technology
Datta Meghe College of Engineering
Navi Mumbai
salvinadesai12@gmail.com

Vrushali Paul
Information Technology
Datta Meghe College of Engineering
Navi Mumbai
vrushalipaul02@gmail.com

Pranita Gode
Information Technology
Datta Meghe College of Engineering
Navi Mumbai
pranitagode19@gmail

Sayali Jadhav
Information Technology
Datta Meghe College of Engineering
Navi Mumbai
sayalijadhav8291@gmail.com

Abstract— This paper proposes an intelligent agent that sorts a music collection based on the emotions conveyed by each song and then suggests an appropriate playlist to the user based on his or her current mood. The user's local music collection is initially clustered based on the emotion the song conveys, i.e., the mood of the song. This is calculated taking into consideration the lyrics of the song as well as the melody. Every time the user wishes to generate a mood-based playlist, the user takes a picture of themselves at that instant. This image is subjected to facial detection and emotion recognition techniques, recognising the emotion of the user. The music that best matches this emotion is then recommended to the user as a playlist. Facial emotion recognition is a form of image processing: the process of converting the movements of a person's face into a digital representation using various image processing techniques. Recently, a number of new technologies for facial expression recognition have been developed. This technique, with an equal contribution from AI, can revolutionise the robotics industry; human-robot interaction will be taken to a whole new level when robots can understand human emotions and play music based on them.

Index Terms: music classification, emotion recognition, intelligent music player.

I. INTRODUCTION

Every day, each and every person undergoes a lot of troubles, and the reliever of all the stress that is encountered is music. Music plays a very important role in the life of the individual because it is an important medium of entertainment for music lovers and listeners. In today's world, with ever-increasing progress in the fields of multimedia and technology, various music players have been developed with functions like fast forward, reverse, variable playback speed (time search and compression), local playback, streaming playback with multicast streams, volume modulation, genre classification, etc. Although these functions satisfy the basic requirements of the user, the user still has to face the task of manually scrolling through the playlist and choosing songs based on his current mood and behaviour.

Imagine yourself in a world where humans interact with computers. You are sitting in front of your personal computer that can listen, talk, or even scream aloud. It has the ability to gather information about you and interact with you through special techniques like facial recognition, speech recognition, etc. It can even understand your emotions at the touch of the mouse. It verifies your identity, feels your presence, and starts interacting with you. A vital part of listening to a song is that it be played in a facilitated way, that is, that the player is able to play the song in accordance with the person's mood. This paper proposes an emotion-based music player, an approach that helps automatically play songs based on the user's emotions. Emotion recognition is an aspect of artificial intelligence that is becoming increasingly relevant for the purpose of automating various processes that are relatively more tedious to perform manually. The human face is an important organ of the human body, playing an especially important role in conveying a person's behaviour and emotional state. Identifying a person's state of mind based on the emotions they display is an important part of making efficient automated decisions best suited to the person in question, for a variety of applications. One important aspect of this would be in the entertainment field, for the purpose of providing recommendations to a person based on their current mood. We study this from the perspective of providing a person with customised music recommendations based on their state of mind, as detected from their facial expressions. Most music connoisseurs have extensive music collections that are often sorted only based on parameters such as artist, album, genre and number of times played. However, this often leaves the users with the arduous task of making mood-based playlists, i.e. sorting the music based on the emotion conveyed by the songs, something that is much more essential to the listening experience than it often appears to be. This task only increases in complexity with larger music collections, and automating the process would save many users the effort spent doing the same manually, while improving their overall experience and allowing for a better enjoyment of the music. The proposed system recognizes the emotion on the user's face and plays songs according to that emotion.
II. RELATED WORK

Charles Darwin was the first scientist to recognize that facial expression is one of the most powerful and immediate means for human beings to communicate their emotions, intentions and opinions to each other. Rosalind Picard (1997) describes why emotions are important to the computing community.
There are two aspects of affective computing: giving the computer the ability to detect emotions and giving the computer the ability to express emotions. Not only are emotions crucial for rational decision making, as Picard describes, but emotion detection is an important step towards an adaptive computer system. Building an adaptive, smart computer system has been driving our efforts to detect a person's emotional state. An important element of incorporating emotion into computing is productivity for the computer user. A study (Dryer & Horowitz, 1997) has shown that people with personalities that are similar or complement each other collaborate well. Dryer (1999) has also shown that people view their computer as having a personality. For these reasons, it is important to develop computers which can work well with their users.
In the year 2011, Ligang Zhang and Dian Tjondronegoro developed a facial emotion recognition (FER) system using a dynamic 3D Gabor feature approach. Using this approach, they obtained the highest correct recognition rate (CRR) on the JAFFE database, and their FER system is among the top performers on the Cohn-Kanade (CK) database. They demonstrated the effectiveness of the proposed approach through recognition performance, computational time, and comparison with state-of-the-art performance, and concluded that patch-based Gabor features perform better than point-based Gabor features in terms of extracting regional features, keeping the position information, achieving a better recognition performance, and requiring a smaller number of features.
As per the survey done up to 2010 by R. A. Patil, Vineet Sahula and A. S. Mandal (CEERI Pilani) on expression recognition, the problem is divided into three sub-problems: face detection, feature extraction and facial expression classification. Most of the existing systems assume that the presence of the face in a scene is ensured, and deal only with feature extraction and classification, assuming that the face is already detected.
The algorithms which we have selected for detecting emotions are from the papers "Image Edge Detection Algorithm Based on Improved Canny Operator" of 2013 and "Rapid Object Detection using a Boosted Cascade of Simple Features" by Viola and Jones. In the former paper, an improved Canny edge detection algorithm is proposed, because the traditional Canny algorithm has difficulty in treating images which contain salt & pepper noise and cannot adapt to the variance of the Gaussian filtering. For this reason, a new Canny algorithm is presented in which open-close filtering is used instead of Gaussian filtering; the traditional Canny operator is improved by using morphology filtering to pre-treat the noisy image. The final edge image effectively reduces the influence of noise, keeps the edge strength and more complete details, and gives a more satisfactory subjective result. By objective evaluation standards, compared with the traditional Canny operator, the information entropy, average gradient, peak signal-to-noise ratio, correlation coefficient and distortion degree have also increased significantly, so the new algorithm is an effective and practical method of edge detection. Automatic face recognition is all about extracting meaningful features from an image, putting them into a useful representation and performing some kind of classification on them. Face recognition based on the geometric features of a face is probably the most intuitive approach to face recognition.

METHODOLOGY

There are several methods for handling the implementation and the consequent conversion from the old to the new computerized system.
The most secure method for conversion from the old system to the new system is to run the old and new systems in parallel. In this approach, a person may operate the manual older processing system as well as start operating the new computerized system. This method offers high security, because even if there is a flaw in the computerized system, we can depend upon the manual system. However, the cost of maintaining two systems in parallel is very high, which outweighs its benefits.
Another common method is a direct cut-over from the existing manual system to the computerized system. The change may happen within a week or within a day. There are no parallel activities; however, there is no remedy in case of a problem. This strategy requires careful planning.
A working version of the system can also be implemented in one part of the organization, with the personnel piloting the system so that changes can be made as and when required. But this method is less preferable due to the loss of entirety of the system.

A. User Emotion Recognition
For this model, the Cohn-Kanade extended dataset is used. A Python script is used to fetch images containing faces along with their emotional descriptor values. The images are contrast enhanced by contrast-limited adaptive histogram equalization and converted to grayscale in order to maintain uniformity and increase the effectiveness of the classifiers. A cascade classifier trained with face images is used for face detection, where the image is split into fixed-size windows. Each window is passed through the cascade of classifiers and is accepted only if it passes all the classifiers; otherwise it is rejected. The detected faces are then used to train the Fisherface classifier, which works on reduction of variance between classes. The Fisherface recognition method proves to be efficient, as it works better with additional features such as spectacles and facial hair, and is also relatively invariant to lighting. A picture is taken during the runtime of the application, which, after preprocessing, is predicted to belong to one of the emotion classes by the Fisherface classifier. The model also permits the user to customize the model, initially or periodically, in order to further reduce the variance within the classes, such that the only remaining variance would be that of emotion change.

B. Music Recommendation
The emotion detected from the image processing is given as input to the clusters of music to select a specific cluster. In order to avoid interfacing with a music app or music modules, which would involve extra installation, the support from the operating system is used instead to play the music file. The playlist selected by the clusters is played by creating a subprocess; the forked subprocess returns control back to the Python script on the completion of its execution, so that other songs can be played. This makes the program play music on any system regardless of its music player.
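The emotion-to-cluster hand-off and the subprocess-based playback described in subsection B can be sketched as follows. The emotion names, playlist contents, and the `xdg-open`/`open` launcher names are illustrative assumptions, not part of the original implementation:

```python
# Sketch of the emotion -> cluster -> OS-playback hand-off (subsection B).
# CLUSTERS is a hypothetical output of the mood-based clustering step.
import subprocess
import sys

CLUSTERS = {
    "happy":   ["upbeat_1.mp3", "upbeat_2.mp3"],
    "sad":     ["mellow_1.mp3"],
    "neutral": ["ambient_1.mp3"],
}

def select_playlist(emotion):
    """Pick the cluster matching the detected emotion (neutral fallback)."""
    return CLUSTERS.get(emotion, CLUSTERS["neutral"])

def play(playlist):
    """Hand each file to the OS default player in a subprocess.

    subprocess.run waits for the spawned process to exit before the next
    song is started, returning control to the script in between.
    """
    opener = "open" if sys.platform == "darwin" else "xdg-open"  # assumption: macOS/Linux
    for song in playlist:
        subprocess.run([opener, song])

print(select_playlist("sad"))  # ['mellow_1.mp3']
```

Deferring playback to the operating system, as the paper describes, keeps the script independent of any particular music player.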

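The window-level accept/reject logic of the cascade detector described in subsection A can be sketched in plain Python. The two stage classifiers and the toy image below are hypothetical stand-ins for a trained Haar cascade:

```python
# Sketch of an attentional cascade over fixed-size windows (subsection A):
# a window is accepted only if every stage classifier in turn passes it.

def sliding_windows(image, size):
    """Yield (x, y, window) for every size-by-size window in a 2D image."""
    h, w = len(image), len(image[0])
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            window = [row[x:x + size] for row in image[y:y + size]]
            yield x, y, window

def cascade_accepts(window, stages):
    """Reject as soon as any stage fails; accept only if all stages pass."""
    return all(stage(window) for stage in stages)

def detect(image, size, stages):
    """Return top-left corners of all windows accepted by the cascade."""
    return [(x, y) for x, y, win in sliding_windows(image, size)
            if cascade_accepts(win, stages)]

# Toy example: the "object" is a bright 2x2 region; two hypothetical stages.
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
stages = [
    lambda w: sum(sum(row) for row in w) > 20,   # stage 1: bright enough overall
    lambda w: min(min(row) for row in w) > 0,    # stage 2: no dark pixels
]
print(detect(image, 2, stages))  # [(1, 1)]
```

A real cascade uses many boosted stages of Haar-like features, but the early-rejection control flow is the same.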
C. Haar cascade classifiers

The core basis for Haar classifier object detection is the Haar-like feature. These features, rather than using the intensity values of a pixel, use the change in contrast values between adjacent rectangular groups of pixels. The contrast variances between the pixel groups are used to determine relative light and dark areas. Two or three adjacent groups with a relative contrast variance form a Haar-like feature. Haar-like features are used to detect objects in an image. Haar features can easily be scaled by increasing or decreasing the size of the pixel group being examined. This allows features to be used to detect objects of various sizes.
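As an illustration of the contrast-difference idea above, a two-rectangle Haar-like feature can be evaluated in constant time from an integral image (summed-area table). This minimal sketch uses plain Python and illustrative pixel values:

```python
# Evaluating a two-rectangle Haar-like feature with an integral image.
# The feature value is the pixel sum of one rectangle minus that of its
# adjacent neighbour, measuring relative light vs. dark areas.

def integral_image(img):
    """ii[y][x] = sum of img over rows < y and cols < x (zero-padded border)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Left rectangle minus the adjacent right rectangle (an edge feature)."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)

# A 4x4 patch whose left half is bright and right half is dark.
img = [
    [8, 8, 1, 1],
    [8, 8, 1, 1],
    [8, 8, 1, 1],
    [8, 8, 1, 1],
]
ii = integral_image(img)
print(two_rect_feature(ii, 0, 0, 2, 4))  # 64 - 8 = 56
```

Because every rectangle sum needs only four lookups in the integral image, scaling the feature to larger windows costs no extra time, which is what makes cascade detection fast.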

D. Implementation plan
The implementation plan includes a description of all the
activities that must occur to implement the new system and
to put it into operation. It identifies the personnel
responsible for the activities and prepares a time chart for
implementing the system. The implementation plan consists of the following steps: list all files required for implementation; identify all data required to build new files during the implementation; and list all new documents and procedures that go into the new system.
The implementation plan should anticipate possible problems and must be able to deal with them. The usual problems may be missing documents, mixed data formats between current and new files, errors in data translation, missing data, etc.

E. DFD
A data flow diagram (DFD) is a graphical representation of the flow of data through an information system. A data flow diagram can also be used for the visualization of data processing (structured design). It is common practice for a designer to draw a context-level DFD first, which shows the interaction between the system and outside entities. This context-level DFD is then exploded to show more detail of the system being modeled.
Symbols:
The components of a data flow diagram (DFD) include:
External Entities/Terminators are outside of the system being modeled. Terminators represent where information comes from and where it goes. In designing a system, we have no idea about what these terminators do or how they do it.
Processes modify the inputs in the process of generating the outputs.
Data Stores represent a place where data comes to rest. A DFD does not say anything about the relative timing of the processes, so a data store might be a place to accumulate data over a year for the annual accounting process.

III. DESIGN

Now that a face has been detected, that face image can be used for face recognition. However, face recognition performed directly on a normal photo image will probably achieve less than 10% accuracy!
It is extremely important to apply various image pre-processing techniques to standardize the images supplied to a face recognition system. Most face recognition algorithms are extremely sensitive to lighting conditions: a system trained to recognize a person in a dark room will probably not recognize them in a bright room. This problem is referred to as being "illumination dependent". There are also many other issues: the face should be in a very consistent position within the images (such as the eyes being in the same pixel coordinates), with consistent size, rotation angle, hair and makeup, emotion (smiling, angry, etc.), and position of lights (to the left or above, etc.). This is why it is so important to use good image preprocessing filters before applying face recognition. It also helps to remove the pixels around the face that aren't used, such as with an elliptical mask that shows only the inner face region, not the hair and image background, since they change more than the face does.
For simplicity, the face recognition system shown here is Eigenfaces on greyscale images. Color images are therefore first converted to greyscale (also called "grayscale"), and then Histogram Equalization is applied as a very simple method of automatically standardizing the brightness and contrast of the facial images. For better results, one could use color face recognition (ideally with color histogram fitting in HSV or another color space instead of RGB), or apply more processing stages such as edge enhancement, contour detection, motion detection, etc. Also, the code resizes images to a standard size, which might change the aspect ratio of the face.
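The grayscale histogram equalization step described above can be sketched in plain Python. In practice a library routine (such as OpenCV's) would be used; the low-contrast sample patch here is illustrative:

```python
# Minimal sketch of histogram equalization on an 8-bit grayscale image:
# intensities are remapped through their cumulative distribution function
# (CDF) so that brightness and contrast are standardized automatically.

def equalize_histogram(img, levels=256):
    """Remap intensities so their cumulative distribution is roughly uniform."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    # Standard remapping: stretch the CDF over the full intensity range.
    # (A completely flat image, where n == cdf_min, maps to 0.)
    lut = [round((c - cdf_min) / (n - cdf_min) * (levels - 1)) if n > cdf_min else 0
           for c in cdf]
    return [[lut[p] for p in row] for row in img]

# A low-contrast patch: values huddle in a narrow band around 100-103.
img = [
    [100, 100, 101, 101],
    [101, 102, 102, 102],
    [102, 102, 103, 103],
    [103, 103, 103, 103],
]
out = equalize_histogram(img)
print(min(min(r) for r in out), max(max(r) for r in out))  # 0 255
```

After equalization the narrow 100-103 band is spread over the full 0-255 range, which is exactly the lighting standardization the design section calls for.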
IV. FUTURE WORK

The future scope of the system would be to design a mechanism that would be helpful in music therapy treatment and provide the music therapist the help needed to treat patients suffering from disorders like mental stress, anxiety, acute depression and trauma. The proposed system also aims, in future, to avoid the unpredictable results produced in extremely bad lighting conditions and with very poor camera resolution. The system could be extended to detect more facial features, gestures and other emotional states (stress level, lie detection, etc.). Facial recognition could also be used for authentication purposes. Further directions include Android development, detecting a sleepy mood while driving, and determining the mood of physically and mentally challenged people.

REFERENCES

[1] Adolf, F. How to build a cascade of boosted classifiers based on Haar-like features. http://robotik.inflomatik.info/other/opencv/OpenCV_ObjectDetection_HowTo.pdf, June 20, 2003.
[2] Bradski, G. Computer vision face tracking for use in a
perceptual user interface. Intel Technology Journal, 2nd
Quarter, 1998.
[3] Cristinacce, D. and Cootes, T. Facial feature detection
using AdaBoost with shape constraints. British Machine
Vision Conference, 2003.
[4] The Facial Recognition Technology (FERET) Database.
National Institute of Standards and Technology, 2003.
http://www.itl.nist.gov/iad/humanid/feret
[5] Lienhart, R. and Maydt, J. An extended set of Haar-like
features for rapid object detection. IEEE ICIP 2002, Vol. 1,
pp. 900-903, Sep. 2002
[6] Menezes, P., Barreto, J.C. and Dias, J. Face tracking
based on Haar-like features and Eigen faces. 5th IFAC
Symposium on Intelligent Autonomous Vehicles, Lisbon,
Portugal, July 5-7, 2004.
[7] Muller, N., Magaia, L. and Herbst B.M. Singular value
decomposition, Eigen faces, and 3D reconstructions. SIAM
Review, Vol. 46 Issue 3, pp. 518–545. Dec. 2004.
[8] Open Computer Vision Library Reference Manual.
Intel Corporation, USA, 2001.
[9] Viola, P. and Jones, M. Rapid object detection using
boosted cascade of simple features. IEEE Conference
on Computer Vision and Pattern Recognition, 2001.
