

Mood Classification of Hindi Songs based on Lyrics

Braja Gopal Patra, Dipankar Das, and Sivaji Bandyopadhyay


Department of Computer Science & Engineering, Jadavpur University, Kolkata, India
{brajagopal.cse,dipankar.dipnil2005}@gmail.com,
sivaji_cse_ju@yahoo.com

Abstract

Digitization of music has led to easier access to different forms of music across the globe. Increasing work pressure denies people the time needed to listen to and evaluate music for the creation of a personal music library. One solution might be to develop a music search engine or recommendation system based on different moods. In fact, the mood label is considered an emerging meta-data in digital music libraries and online music repositories. In this paper, we propose a mood taxonomy for Hindi songs and prepare a mood annotated lyrics corpus based on this taxonomy. We also annotated the lyrics with positive and negative polarity. Instead of adopting a traditional approach to music mood classification based solely on audio features, the present study describes a mood classification system using lyrics, combining a wide range of semantic and stylistic features extracted from the lyrics. We also developed a supervised system to identify the sentiment of Hindi song lyrics based on the above features. The Hindi song polarity and mood classification systems achieved maximum average F-measures of 68.30% and 38.49% respectively using the lyric features.

1 Introduction

Studies on music information retrieval (MIR) have shown moods to be a desirable access point to music repositories and collections (Hu and Downie, 2010a). In the recent decade, much work on western music mood classification has been performed using audio signals and lyrics (Hu and Downie, 2010a; Mihalcea and Strapparava, 2012). Studies indicating contradictory emphasis of lyrics or audio in predicting music moods are prevalent in the literature (Hu and Downie, 2010b). Indian music is considered one of the oldest musical traditions in the world. Indian music can be divided into two broad categories, "classical" and "popular" (Ujlambkar and Attar, 2012). Further, the classical music tradition of India has two main variants, namely Hindustani and Carnatic. The prevalence of Hindustani classical music is found largely in the north and central parts of India, whereas Carnatic classical music dominates largely in the southern parts of India.

Indian popular music, also known as Hindi Bollywood music or Hindi music, is mostly present in Hindi cinemas or Bollywood movies. Hindi is one of the official languages of India and is the fourth most widely spoken language in the world[1]. Hindi or Bollywood songs make up 72% of the total music sales in India (Ujlambkar and Attar, 2012). Unfortunately, not much computational and analytical work has been done in this area.

Therefore, a mood taxonomy especially for Hindi songs has been introduced here in order to closely investigate the role played by lyrics in music mood classification. The lyrics corpus is annotated in two steps. In the first step, mood is annotated based on the listener's perspective. In the second step, the same corpus is annotated with polarity based on the reader's perspective. Further, we developed a mood classification system by incorporating different semantic and textual stylistic features extracted from the lyrics. In addition, we also developed a polarity classification system based on the above features.

The paper is organized as follows: Section 2 reviews related work on music mood classification. Section 3 introduces the proposed mood classes.

[1] https://www.redlinels.com/most-widely-spoken-languages/

The detailed annotation process and the dataset used in the study are described in Section 4. Section 5 describes the features of the lyrics used in the experiments, which is followed by the results obtained so far, our findings and further prospects in Section 6. Finally, Section 7 concludes and suggests future work.

2 Related Work

Dataset and Taxonomy: Preparation of an annotated dataset requires the selection of proper mood classes. With respect to Indian music, limited work on mood detection using audio features has been reported to date. Koduri and Indurkhya (2010) worked on the mood classification of South Indian classical music, i.e. Carnatic music. The main goal of their experiment was to verify the raagas that really evoke a particular rasa(s) (emotion) specific to each user. They considered a taxonomy consisting of ten rasas, e.g., Srungaram (Romance, Love), Hasyam (Laughter, Comedy), etc. Similarly, Velankar and Sahasrabuddhe (2012) prepared a dataset for mood classification of Hindustani classical music consisting of 13 mood words (Happy, Exciting, Satisfaction, Peaceful, Graceful, Gentle, Huge, Surrender, Love, Request, Emotional, Pure, Meditative).

In the case of audio based Hindi music mood classification, Patra et al. (2013a) used the standard Music Information Retrieval eXchange[2] (MIREX) taxonomy, whereas Ujlambkar and Attar (2012) used five mood classes, namely Happy, Sad, Silent, Excited and Romantic, along with three or more sub-classes based on the two dimensional "Energy and Stress" model.

Mood Classification using Audio Features: Automatic music mood classification systems based on audio features, where spectral, rhythm and intensity features are the most popular, have been developed over the last few decades. MIREX is an annual evaluation campaign of different MIR related systems and algorithms. The "Audio Mood Classification (AMC)" task has been running each year since 2007 (Hu et al., 2008). Among the various audio-based approaches tested at MIREX, spectral features and Support Vector Machine (SVM) classifiers were widely used and found quite effective (Hu and Downie, 2010a). The "Emotion in Music" task was started in the year 2014 at the MediaEval Benchmark Workshop. In that task, the arousal and valence scores were estimated continuously in time for every music piece using several regression models[3]. Notable works on Indian music mood classification using audio features can be found for several music categories, such as Hindi music (Ujlambkar and Attar, 2012; Patra et al., 2013a; Patra et al., 2013b; Patra et al., 2015), Hindustani classical music (Velankar and Sahasrabuddhe, 2012) and Carnatic classical music (Koduri and Indurkhya, 2010).

Mood Classification from Lyric Features: Multiple experiments have been carried out on Western music mood classification based on bag of words (BOW), emotion lexicons and other stylistic features (Zaanen and Kanters, 2010; Hu and Downie, 2010a; Hu and Downie, 2010b).

Multimodal Music Mood Classification: Much literature on mood classification of Western music has been published based on both audio and lyrics (Hu, 2010). The system developed by Yang and Lee (2004) is often regarded as one of the earliest studies on combining audio and lyric features for multimodal music mood classification.

To the best of the authors' knowledge, Indian music mood classification based on lyrics has not been attempted yet. Moreover, in the context of Indian music, multimodal music mood classification has not been explored either.

3 Taxonomy

In the context of western music, the adjective list (Hevner, 1936), Russell's circumplex model (Russell, 1980) and the MIREX taxonomy (Hu et al., 2008) are the most popular mood taxonomies used by researchers worldwide in this arena. Though several mood taxonomies have been proposed by different researchers, all such psychological models were proposed in laboratory settings and thus were criticized for the lack of social context of music listening (Hu, 2010; Laurier et al., 2009).

Russell (1980) proposed the circumplex model of affect (consisting of 28 affect words) based on the two dimensions denoted as "pleasant-unpleasant" and "arousal-sleep" (as shown in Figure 1). The most well-known example of such a taxonomy is the Valence-Arousal (V-A) representation, which has been used in several previous experiments (Soleymani et al., 2013).

[2] www.music-ir.org/mirex/wiki/MIREX_HOME
[3] http://www.multimediaeval.org/mediaeval2015/emotioninmusic2015/

Valence indicates positive versus negative polarity, whereas arousal indicates the intensity of moods.

[Figure 1: Russell's circumplex model of 28 affect words]

We opted to use Russell's circumplex model by clustering the similar affect words (as shown in Figure 1). For example, we considered the affect words calm, relaxed and satisfied together to form one mood class, i.e., Calm, denoted as Class_Ca. The present mood taxonomy contains five mood classes with three sub-classes in each. One of the main reasons for developing such a taxonomy was to collect similar songs and cluster them into a single mood class. Preliminary observations showed significant invariability of the audio features of the sub-classes with respect to their corresponding main or coarse class. Basically, the preliminary observations on annotation relate to the psychological factors that influence the annotation process while annotating a piece of music after listening to it. For example, a happy and a delighted song have high valence, whereas an aroused and an excited song have high arousal. The final mood taxonomy used in our experiments is shown in Table 1.
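Concretely, this clustering amounts to a fixed lookup from Russell's affect words to the five coarse classes of Table 1. A minimal sketch in Python, using only the fifteen sub-class words listed in Table 1 (the function name and the final check are our own illustration, not part of the original system):

```python
# Coarse mood classes from Table 1: each class groups three of
# Russell's affect words that share a valence-arousal quadrant.
AFFECT_TO_CLASS = {
    "excited": "Class_Ex", "astonished": "Class_Ex", "aroused": "Class_Ex",
    "delighted": "Class_Ha", "happy": "Class_Ha", "pleased": "Class_Ha",
    "calm": "Class_Ca", "relaxed": "Class_Ca", "satisfied": "Class_Ca",
    "sad": "Class_Sa", "gloomy": "Class_Sa", "depressed": "Class_Sa",
    "angry": "Class_An", "alarmed": "Class_An", "tensed": "Class_An",
}

def coarse_class(affect_word: str) -> str:
    """Map a fine-grained affect word to its coarse mood class."""
    return AFFECT_TO_CLASS[affect_word.lower()]

assert coarse_class("Relaxed") == "Class_Ca"
```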
4 Dataset Annotation Perspective based on Listeners and Readers

Till date, there is no mood annotated lyrics corpus available on the web. In the present work, we collected the lyrics data from different web archives corresponding to the audio data that was developed by Patra et al. (2013a). Some more lyrics were added as per the increment of the audio data in Patra et al. (2013b). The lyrics are basically written in Romanized English characters. The prerequisite resources like Hindi sentiment lexicons and stopwords are available in UTF-8 character encoding. Thus, we transliterated the Romanized English characters to UTF-8 characters using the transliteration tool available in the EILMT project[4]. We observed several errors in the transliteration process and hence corrected the mistakes manually.

It has to be mentioned that we used only the coarse grained classes for all of our experiments. It should also be noted that we started annotating the lyrics at the same time as annotating their corresponding audio files by listening to them. All of the annotators were undergraduate students who worked voluntarily and belong to the age group of 18-24. Each of the songs was annotated by five annotators. We achieved an inter-annotator agreement of 88% for the lyrics data annotated with the five coarse grained mood classes (the class labels shown in Table 1). While annotating the songs, we observed that confusions occur between pairs of mood classes like "Class_An and Class_Ex", "Class_Ha and Class_Ex" and "Class_Sa and Class_Ca", as these classes have similar acoustic features.

To validate the annotation in a consistent way, we tried to assign our proposed coarse mood classes (e.g., Class_Ha) to a lyric after reading its lexical contents. But it was too difficult to annotate a lyric with such coarse mood classes, as the lyric of a single song may contain multiple emotions within it. On the other hand, the annotators felt different emotions while listening to the audio and reading the corresponding lyrics, separately. For example, Bhaag D.K.Bose Aandhi Aayi[5] is annotated as Class_An while listening to it, whereas it is annotated as Class_Sa while reading the corresponding lyric. Therefore, in order to avoid such problems and confusion, we decided to annotate the lyrics with one of two coarse grained sentiment classes, viz. positive or negative.

We calculated the inter-annotator agreement and obtained 95% agreement on the lyrics data annotated with the two coarse grained sentiment classes. To justify the two annotation schemes, one could argue that a song is generally considered positive if it belongs to the happy mood class.

[4] http://tdil-dc.in/index.php?option=com_vertical&parentid=72
[5] http://www.lyricsmint.com/2011/05/bhaag-dk-bose-aandhi-aayi-delhi-belly.html
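The paper does not state how the 88% and 95% agreement figures were computed. One plausible reading, given five annotators per song, is average pairwise percentage agreement; the sketch below makes that assumption explicit (the function and variable names are ours):

```python
from itertools import combinations

def pairwise_agreement(annotations: list[list[str]]) -> float:
    """Average pairwise percentage agreement across all songs.

    annotations[i] holds the labels the five annotators gave song i,
    e.g. ["Class_Ha", "Class_Ha", "Class_Ex", "Class_Ha", "Class_Ha"].
    """
    agree, total = 0, 0
    for labels in annotations:
        for a, b in combinations(labels, 2):  # all annotator pairs
            agree += (a == b)
            total += 1
    return 100.0 * agree / total

# Toy example: 4 of 5 annotators agree on a single song -> 60.0
print(pairwise_agreement([["Class_Ha"] * 4 + ["Class_Ex"]]))
```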

Class_Ex     Class_Ha    Class_Ca    Class_Sa    Class_An
Excited      Delighted   Calm        Sad         Angry
Astonished   Happy       Relaxed     Gloomy      Alarmed
Aroused      Pleased     Satisfied   Depressed   Tensed

Table 1: Proposed Mood Taxonomy

But, in our case, we observed a different scenario. Initially, the annotators annotated a lyric with Class_Ha after listening to the audio, but later on, the same annotator annotated the same lyric with negative polarity after finishing reading its contents. Therefore, a few cases where the mood class does not coincide with the conventional polarity at the lyrics level (e.g., Class_Ha and positive, Class_An and negative) were identified, and we present the confusion matrix in Table 2.

              Positive   Negative   No. of Songs
Class_An             1         49             50
Class_Ca            83         12             95
Class_Ex            85          6             91
Class_Ha            96          4            100
Class_Sa             7        117            125
Total Songs                                  461

Table 2: Confusion matrix of the two annotation schemes and statistics of total songs

5 Classification Framework

We adopted a wide range of textual features such as sentiment lexicons, stylistic features and n-grams in order to develop the music mood classification framework. We illustrate all the features below.
5.1 Features based on Sentiment Lexicons

We used three Hindi sentiment or emotion lexicons to identify the sentiment or emotion words present in the lyrics: the Hindi Subjective Lexicon (HSL) (Bakliwal et al., 2012), Hindi SentiWordNet (HSW) (Joshi et al., 2010) and Hindi WordNet Affect (HWA) (Das et al., 2012). HSL contains two lists, one for adjectives (3909 positive, 2974 negative and 1225 neutral) and another for adverbs (193 positive, 178 negative and 518 neutral). HSW consists of 2168 positive, 1391 negative and 6426 neutral words along with their parts-of-speech (POS) and synset ids extracted from the Hindi WordNet. HWA contains 2986, 357, 500, 3185, 801 and 431 words with their parts-of-speech from the angry, disgust, fear, happy, sad and surprise classes, respectively. The statistics of the sentiment words found in the whole corpus using the three sentiment lexicons are shown in Table 3.

Class      HWA        Class      HSL    HSW
Angry       241
Disgust      13       Positive   1172    857
Fear         13
Happy       349       Negative    951    628
Sad         107
Surprise     38

Table 3: Sentiment words identified using HWA, HSL and HSW
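A minimal sketch of how such lexicon counts can be turned into features for one lyric, assuming HSL and HWA have already been loaded into Python sets (the loading step and file formats are not described in the paper, so the argument names are placeholders):

```python
# Hypothetical pre-loaded lexicons: hsl_pos/hsl_neg are sets of Hindi
# words (UTF-8 Devanagari); hwa maps an emotion class name such as
# "happy" or "angry" to its set of words.
def lexicon_features(tokens, hsl_pos, hsl_neg, hwa):
    """Count sentiment/emotion lexicon hits in a tokenized lyric."""
    feats = {
        "hsl_positive": sum(t in hsl_pos for t in tokens),
        "hsl_negative": sum(t in hsl_neg for t in tokens),
    }
    # One count per HWA emotion class (angry, disgust, fear, ...).
    for emotion, words in hwa.items():
        feats[f"hwa_{emotion}"] = sum(t in words for t in tokens)
    return feats
```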
5.2 Text Stylistic Features

The text stylistic features considered in our experiments were the number of unique words, the number of repeated words, the number of lines, the number of unique lines and the number of lines ending with the same words.
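These counts are straightforward to compute from the raw lyric text. The sketch below is one plausible reading of the five features (for instance, we interpret "repeated words" as word types occurring more than once, which the paper does not define precisely):

```python
from collections import Counter

def stylistic_features(lyric: str) -> dict:
    """The five text stylistic counts described in Section 5.2."""
    lines = [ln.strip() for ln in lyric.splitlines() if ln.strip()]
    words = Counter(w for ln in lines for w in ln.split())
    last_words = Counter(ln.split()[-1] for ln in lines)
    return {
        "unique_words": len(words),
        "repeated_words": sum(1 for c in words.values() if c > 1),
        "num_lines": len(lines),
        "unique_lines": len(set(lines)),
        # lines whose final word recurs as a line ending (rhyme-like)
        "lines_ending_same_word": sum(c for c in last_words.values() if c > 1),
    }
```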
5.3 Features based on N-grams

Much research has shown that n-gram features work well for lyrics mood classification (Zaanen and Kanters, 2010) as compared to stylistic or sentiment features. Thus, we considered Term Frequency-Inverse Document Frequency (TF-IDF) scores of up to trigrams, as the results got worse after including higher order n-grams. We removed the stopwords while computing the n-grams and considered only the n-grams with a document frequency greater than one.

We used the correlation based supervised feature selection technique available in the WEKA[6] toolkit. Finally, we performed our experiments with 10 sentiment features, 13 textual stylistic features and 1561 n-gram features.

[6] http://www.cs.waikato.ac.nz/ml/weka/
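The n-gram setup (TF-IDF scores of up to trigrams, stopwords removed, document frequency greater than one) maps directly onto a standard vectorizer. The paper uses WEKA; the sketch below shows the equivalent configuration in Python with scikit-learn on toy data (the real input would be the 461 transliterated lyrics and a Hindi stopword list):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-ins; the real corpus is the 461 transliterated lyrics.
lyrics = ["dil mera dil", "mera pyaar mera pyaar", "dil pyaar dil"]
hindi_stopwords = ["hai", "ka", "ki"]  # placeholder stopword list

vectorizer = TfidfVectorizer(
    ngram_range=(1, 3),          # TF-IDF of unigrams up to trigrams
    min_df=2,                    # keep n-grams with document frequency > 1
    stop_words=hindi_stopwords,
)
X = vectorizer.fit_transform(lyrics)  # songs x n-gram feature matrix
print(X.shape)
```

WEKA's correlation-based feature selection (CfsSubsetEval) has no drop-in scikit-learn equivalent; a univariate selector such as SelectKBest would be a rough stand-in for reducing the n-gram set to the 1561 features used here.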

6 Results and Discussion

Support Vector Machines (SVM) are widely used for western song lyrics mood classification (Hu et al., 2009; Hu and Downie, 2010a). Even for mood classification from audio data at MIREX, LibSVM performed better than the SMO algorithm and K-Nearest Neighbors (KNN) implemented in the WEKA machine learning software (Hu et al., 2008).

To develop the automatic system for mood classification using lyrics, we used several machine learning algorithms, but the LibSVM implemented in the WEKA tool performed better than the other available classifiers in our case also. Initially, we tried LibSVM with the polynomial kernel, but the radial basis function (RBF) kernel gave better results.
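This setup can be sketched with scikit-learn, whose SVC class wraps LibSVM. The synthetic data below only mimics the shape of our task (461 songs; 10 + 13 + 1561 features; five classes) and is not the paper's dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in data with the rough shape of the task described above.
X, y = make_classification(n_samples=461, n_features=1584,
                           n_informative=50, n_classes=5)

clf = SVC(kernel="rbf")  # SVC wraps LibSVM; RBF beat the polynomial kernel
scores = cross_val_score(clf, X, y, cv=10, scoring="f1_macro")
print(f"mean F-measure over 10 folds: {scores.mean():.4f}")
```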
We developed two systems for the data annotated with the two different annotation schemes. In the first system, we tried to classify the lyrics into the five coarse grained mood classes. In the second system, we classified the polarities (positive or negative) of the lyrics, which were assigned to a song only after reading its corresponding lyrics. In order to get reliable accuracy, we performed 10-fold cross validation for both systems. We show the F-measures of both systems in Table 4.

In Table 4, we observe that the F-measure of the second system is high compared to the first system. In the case of English, the maximum accuracy achieved in Hu and Downie (2010b) is 61.72 over a dataset of 5,296 unique lyrics comprising 18 mood categories. But, in the case of Hindi, we achieved an F-score of only 38.49, on a dataset of 461 lyrics with five mood classes. These observations suggest that the lyrics patterns of English and Hindi are completely different. We observed various dissimilarities (w.r.t. singer and instruments) of Hindi songs compared to English music. There are multiple moods in a Hindi lyric, and the mood changes between annotating a song while listening to the audio and while reading its corresponding lyric.

To the best of our knowledge, there is no existing system available for lyrics based mood classification in Hindi. As the lyrics data was developed on the audio dataset of Patra et al. (2013a), we compared our lyrics based mood classification system with the audio based mood classification systems developed in Patra et al. (2013a; 2013b). Our lyrics based system performed poorly as compared to the audio based systems (accuracies of 51.56% and 48%), although the current lyrics dataset contains more instances than the number of audio files used for the audio based system in Patra et al. (2013a). They divided a song into multiple audio clips of 60 seconds, whereas we considered the total lyrics of a song for our experiment. This may be one of the reasons for the poor performance of the lyrics based mood classification system, as the mood varies over a full length song, but in the present task we performed the classification on a whole lyric. It is also observed that, in the context of Hindi songs, the mood aroused while listening to the audio is different from the mood aroused at the time of reading a lyric. The second system achieves the best F-measure of 68.30. We observe that the polarity does not change over a piece of music, i.e. if a lyric is positive, then the positivity is observed throughout the lyric. We also observed that the n-gram features alone yield F-measures of 35.05% and 64.2% for the mood and polarity classification systems respectively. The main reason may be that Hindi is a free word order language. Hindi lyrics are even more free in word order than the Hindi language itself, as they match the ends of lines.

7 Conclusion and Future Work

In this paper, we proposed mood and polarity classification systems based on the lyrics of songs. We achieved best F-measures of 38.49 and 68.3 for the mood and polarity classification of Hindi songs, respectively. We also observed that the listener's perspective and the reader's perspective on emotion differ between audio and its corresponding lyrics. The mood is transparent when adopting the audio only, whereas the polarity is transparent in the case of lyrics.

In future, we plan to perform the same experiment on a wider set of textual features. Later on, we plan to develop a hybrid mood classification system based on audio and lyrics features. We also plan to improve the accuracy of the lyrics mood classification system using multi-level classification.

Acknowledgments

The first author is supported by the Visvesvaraya Ph.D. Fellowship funded by the Department of Electronics and Information Technology (DeitY), Government of India. The authors are also thankful to the anonymous reviewers for their helpful comments.

Systems          Features                   Precision   Recall   F-measure
System 1:        Sentiment Lexicon (SL)         29.82    29.80       29.81
Mood             SL + Text Stylistic (TS)       33.60    33.56       33.58
Classification   N-Gram (NG)                    34.10    36.00       35.05
                 SL + TS + NG                   40.58    36.40       38.49
System 2:        SL                             62.30    62.26       65.28
Polarity         SL + TS                        65.54    65.54       65.54
Classification   NG                             65.40    63.00       64.20
                 SL + TS + NG                   70.30    66.30       68.30

Table 4: System performance

References

Akshat Bakliwal, Piyush Arora, and Vasudeva Varma. 2012. Hindi subjective lexicon: A lexical resource for Hindi polarity classification. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC).

Dipankar Das, Soujanya Poria, and Sivaji Bandyopadhyay. 2012. A classifier based approach to emotion lexicon construction. In Proceedings of the International Conference on Application of Natural Language to Information Systems, pages 320–326.

Kate Hevner. 1936. Experimental studies of the elements of expression in music. The American Journal of Psychology, 48(2):246–268.

Xiao Hu and J. Stephen Downie. 2010a. Improving mood classification in music digital libraries by combining lyrics and audio. In Proceedings of the 10th Annual Joint Conference on Digital Libraries, pages 159–168.

Xiao Hu and J. Stephen Downie. 2010b. When lyrics outperform audio for music mood classification: A feature analysis. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pages 619–624.

Xiao Hu, J. Stephen Downie, Cyril Laurier, Mert Bay, and Andreas F. Ehmann. 2008. The 2007 MIREX audio mood classification task: Lessons learned. In Proceedings of the 9th International Society for Music Information Retrieval Conference (ISMIR 2008), pages 462–467.

Xiao Hu, J. Stephen Downie, and Andreas F. Ehmann. 2009. Lyric text mining in music mood classification. In Proceedings of the 10th International Society for Music Information Retrieval Conference (ISMIR 2009), pages 411–416.

Xiao Hu. 2010. Music and mood: Where theory and reality meet. In Proceedings of the iConference 2010.

Aditya Joshi, A. R. Balamurali, and Pushpak Bhattacharyya. 2010. A fall-back strategy for sentiment analysis in Hindi: A case study. In Proceedings of the 8th International Conference on Natural Language Processing.

Gopala Krishna Koduri and Bipin Indurkhya. 2010. A behavioral study of emotions in south Indian classical music and its implications in music recommendation systems. In Proceedings of the 2010 ACM Workshop on Social, Adaptive and Personalized Multimedia Interaction and Access, pages 55–60.

Cyril Laurier, Mohamed Sordo, Joan Serra, and Perfecto Herrera. 2009. Music mood representations from social tags. In Proceedings of the 10th International Society for Music Information Retrieval Conference (ISMIR 2009), pages 381–386.

Rada Mihalcea and Carlo Strapparava. 2012. Lyrics, music, and emotions. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 590–599.

Braja G. Patra, Dipankar Das, and Sivaji Bandyopadhyay. 2013a. Automatic music mood classification of Hindi songs. In Proceedings of the 3rd Workshop on Sentiment Analysis where AI meets Psychology (IJCNLP 2013), pages 24–28.

Braja G. Patra, Dipankar Das, and Sivaji Bandyopadhyay. 2013b. Unsupervised approach to Hindi music mood classification.

Braja G. Patra, Dipankar Das, and Sivaji Bandyopadhyay. 2015. Music emotion recognition system. In Proceedings of the International Symposium Frontiers of Research Speech and Music (FRSM-2015), pages 114–119.

James A. Russell. 1980. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161–1178.

Mohammad Soleymani, Micheal N. Caro, Erik M. Schmidt, Cheng-Ya Sha, and Yi-Hsuan Yang. 2013. 1000 songs for emotional analysis of music. In Proceedings of the 2nd ACM International Workshop on Crowdsourcing for Multimedia, pages 1–6.

Aniruddha M. Ujlambkar and Vahida Z. Attar. 2012. Mood classification of Indian popular music. In Proceedings of the CUBE International Information Technology Conference, pages 278–283.

Makarand R. Velankar and Hari V. Sahasrabuddhe. 2012. A pilot study of Hindustani music sentiments. In Proceedings of the 2nd Workshop on Sentiment Analysis where AI meets Psychology (SAAIP-2012), pages 91–98.

Dan Yang and Won-Sook Lee. 2004. Disambiguating music emotion using software agents. In Proceedings of the 5th International Society for Music Information Retrieval Conference (ISMIR 2004), pages 218–223.

Menno van Zaanen and Pieter Kanters. 2010. Automatic mood classification using TF*IDF based on lyrics. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), pages 75–80.


