
Modeling and Analysis of Indian Carnatic Music Using Category Theory

Article  in  IEEE Transactions on Systems, Man, and Cybernetics: Systems · January 2017


DOI: 10.1109/TSMC.2016.2631130


This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS 1

Modeling and Analysis of Indian Carnatic Music Using Category Theory

Sarala Padi, Spencer Breiner, Eswaran Subrahmanian, and Ram D. Sriram, Fellow, IEEE

Abstract: This paper presents a category theoretic ontology of Carnatic music. Our goals here are twofold. First, we will demonstrate the power and flexibility of conceptual modeling techniques based on a branch of mathematics called category theory (CT), using the structure of Carnatic music as an example. Second, we describe a platform for collaboration and research sharing in this area. The construction of this platform uses formal methods of CT (colimits) to merge our Carnatic ontology with a generic model of music information retrieval tasks. The latter model allows us to integrate multiple analytical methods, such as hidden Markov models, machine learning algorithms, and other data mining techniques like clustering, bagging, etc., in the analysis of a variety of different musical features. Furthermore, the framework facilitates the storage of musical performances based on the proposed ontology, making them available for additional analysis and integration. The proposed framework is extensible, allowing future work in the area of rāga recognition to build on our results, thereby facilitating collaborative research. Generally speaking, the methods presented here are intended as an exemplar for designing collaborative frameworks supporting reproducibility of computational analysis and simulation.

Index Terms: Categorical framework for Carnatic music, categorical structure for rāga, category theory (CT), ontology.

I. INTRODUCTION

Around the world, a tremendous number of audio music files are accessed every day for personal usage, research, and analysis purposes. The area of music information retrieval (MIR) studies techniques and methods for helping users find the music files they want. Thus, MIR concentrates on archival methods, providing metadata, annotations, search algorithms, and analysis of music to help the users filter an ocean of musical content. Especially as listening habits migrate from hard drives to the cloud, MIR is becoming an increasingly critical area of research [1].

There are two general classes of MIR: 1) content-based and 2) metadata-based [2], [3]. The former involves retrieval based on musical content, which a user might provide by humming or singing. However, the vast majority of MIR is metadata-based; we retrieve music segments by searching for text related to the segment, such as its title or artist. In this paper, we will focus exclusively on metadata-based MIR.

Although metadata-based retrieval is convenient for users and easy to implement, the metadata on which these searches depend must be expressive enough and accurate enough to support retrieval tasks. This makes metadata creation and evaluation one of the most significant challenges in metadata-based MIR. In many cases metadata collection requires explicit supervision to ensure accurate results. It may also be difficult to maintain a consistent format (including the choice of data fields) in our music databases, especially when these grow rapidly.

One way to ease, though not eliminate, these challenges is through the use of domain modeling. If we have some rules about the way that different pieces of metadata relate to one another, then we can evaluate new data against these rules to recognize some incorrect inferences. Similarly, modeling the structure of a musical domain can allow us to cross-check related inferences.

There is a significant body of literature modeling Western classical music, including tempo estimation, beat tracking, instrumental music segmentation, instruments classification, and transcription [4]–[9]. By comparison, there has been little effort to analyze Indian classical music and, correspondingly, most MIR technologies have not been applied in this context [10]–[12]. Here, we try to address this gap.

Indian classical music has two main traditions, namely, Carnatic and Hindustani. Carnatic music is popular in the southern part of India while Hindustani is popular in the north. Though these two traditions are similar in certain respects, in this paper, we choose to focus only on Carnatic music, leaving the extension to Hindustani music and the relationship between the two branches for future work. A reader interested in learning more about Hindustani music should consult [13], [14].

A song in Carnatic music is composed in a specific melody (rāga) but, unlike the case in Western classical music, the rhythm can be rendered differently by different musicians [13], [15]. This is primarily because Carnatic music has been handed down from teacher to student in an oral tradition. Consequently, Carnatic music typically does not have standardized notations as are available in Western music, and what does exist varies from school to school. This adds an additional difficulty for the archiving and analysis of Indian classical music.

Manuscript received May 12, 2016; revised August 29, 2016; accepted November 4, 2016. This paper was recommended by Associate Editor L. Sheremetov.
This paper has supplementary downloadable multimedia material available at http://ieeexplore.ieee.org provided by the authors.
The authors are with the Department of ITL, National Institute of Standards and Technology, Gaithersburg, MD 20899 USA (e-mail: sarala.padi@nist.gov).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMC.2016.2631130
2168-2216 © 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Despite all this flexibility, Carnatic music is still constrained by rules, and our goal here is to model these rules using a mathematical discipline called category theory (CT). CT provides a precise mathematical representation for these aspects of Carnatic music so that these properties and characteristics can be clearly elucidated. This will allow us to check MIR metadata for consistency with these characteristics.

There are, of course, many different modeling languages and techniques; CT is noteworthy for its focus on the relationships between informational entities, rather than the entities themselves. Because of this, categorical models are largely agnostic with respect to specific choices of representational form or formalism. The success of these methods is demonstrated by the impressive breadth of categorical methods, which have provided insights into fields as varied as music modeling [16], computer science [17]–[20], engineering [21]–[23], theoretical physics [24], and biosciences [25], [26]. Importantly for our purposes, CT models are closely related to database schemas, a fact which facilitates MIR activities, such as storage, retrieval, and analysis.

We hope that this paper will find interested readers in several different areas. First of all, we hope that researchers interested in Carnatic music will be able to use and extend our model to assist in their own work, and thereby facilitate collaboration in this area. More generally, we offer this example as a fairly detailed study in the development and application of category theoretic models, of interest in any area of collaborative science. In the interests of these readers we do not assume familiarity with CT methods, and include a short introduction in the supplementary material. However, we also hope that experienced practitioners of CT may enjoy this paper, and find in it some small progress toward a methodology of applied CT, an area of study still largely undeveloped.

This paper begins in Section II, where we describe the individual building blocks and structure of Carnatic music that will be used for MIR tasks. In Section III, we translate this description into a categorical model. The basis of this translation is provided as supplementary material giving a brief introduction to CT; readers without a CT background are strongly advised to read the supplemental material before proceeding to Section III. Section IV broadens the scope of our models by merging the individual building blocks into a single conceptual framework to model Carnatic music as a whole. Finally, we discuss some of the ways that categorical methods can facilitate collaboration and group research.

II. INTRODUCTION TO CARNATIC MUSIC

Indian classical music is one of the world’s oldest musical traditions. It has been developed over centuries, and been influenced by many religions and cultures. The present system of Indian music is based on two pillars: 1) rāga and 2) tāla. The first, rāga, is the melodic component of the music, corresponding to the “mode” or “scale” in Western classical music. The tāla, by contrast, describes the rhythmic component of the music. Indian classical music contains two traditions, Carnatic and Hindustani, associated with southern and northern India, respectively. Though both traditions involve the concepts of rāga and tāla, the interpretations of these notions vary based on the performance practices of the two traditions [27]. In this paper, we focus on the Carnatic tradition, leaving Hindustani music for future analysis.

The most basic component in Carnatic music is the svara, which is roughly analogous to a note in Western music. However, as we will see, the relationship between svara and pitch is more complicated than in Western music. In addition, these svaras determine the articulation of a note, the way that it is sung or played. Fundamentally there are seven svaras in Carnatic music, namely, Sadja (S), Rishaba (R), Ghandhara (G), Madhyama (M), Panchama (P), Daivatha (D), and Nishada (N) [13]. As shown in Fig. 1, the first note, Sadja, is the tonic, meaning that the frequency of the other svaras is measured relative to the pitch¹ of the Sadja. Thus, the frequency of the Panchama svara is always one and a half times the frequency of the Sadja.

Fig. 1. Basic svaras that make up a melody in Carnatic music tradition.

However, the remaining svaras (R, G, M, D, and N) may each occur in one of two or three variations. Rishaba, for example, may have a frequency ratio of (16/15), (9/8), or (6/5). We refer to these variations by R1, R2, and R3, respectively. Moreover, some different svaras may share the same pitch frequency: G2 also has a relative pitch frequency of (6/5). Altogether there are 16 svara variations spread across 12 pitch frequencies (svarastanas). The fixed svaras, S and P, are called shudh svaras.

The situation is similar to that in Western classical music, where E and B are pure notes in the sense that they do not have sharp and flat modifications, whereas the remaining notes, C, D, F, G, and A, may be modulated by sharp (♯) and flat (♭) notations. It is also worth noting that when the S svara in Carnatic music is fixed to C major in Western classical music then the svaras S, R, G, M, P, D, and N correspond to the seven notes of the C major scale, C, D, E, F, G, A, and B [28].

Table I lists the pitch ratios associated with each svara. See [10], [29] for discussions of the allowed frequencies for the 12 svarastanas and their frequency ratios relative to the tonic. Notice that some svaras share the same relative pitch; these are nonetheless distinguished by their articulation. This is displayed in Fig. 1, which shows the grouping of svaras by pitch (vertical groupings) and articulation (horizontal groupings).

¹ Here, we use “pitch” to refer to the frequency of a tone (in hertz) and we use these two terms interchangeably. In later sections pitch will be the named feature which measures the frequency in Hz for rāga recognition tasks.

TABLE I
CARNATIC MUSIC SVARAS AND THEIR FREQUENCY RATIOS

Fig. 2. Svara sequence (both ascending and descending) and its generating regular expression for a Melakartha rāga system [panels (a) and (b)].

A. Rāga

The melodic component of a piece of Carnatic music is called its rāga. Rāgas are constructed by grouping together the svaras introduced above in different ways, and each of these groupings is associated with one of nine corresponding emotions [15]. The rāgas are developed to elicit these emotions and, in fact, the word “rāga” is derived from the Sanskrit word for “color” or “passion.”

A rāga is uniquely determined by a sequence of svaras. Additionally, this sequence can be decomposed into an aarohana (ascending) sequence followed by an avarohana (descending) sequence. In the aarohana sequence the pitch tends to increase, beginning at the tonic and ending again at the tonic, one full scale higher (i.e., at twice the frequency of the initial tonic). The avarohana sequence, on the other hand, tends to decrease, beginning at the high tonic and ending at the low.

Rāgas are broadly grouped into two classes, namely, Melakartha and Janya rāgas. While both classes contain aarohana and avarohana sequences, the Melakartha rāgas are more restrictive in the sequences they allow. Rāgas may be further classified as sampoorna (complete) or asampoorna. A sampoorna rāga includes exactly seven svaras in both the aarohana and the avarohana sequence. Any other rāga is asampoorna. Therefore, all Melakartha rāgas are sampoorna rāgas while Janya rāgas may be either sampoorna or asampoorna rāgas.

In Melakartha rāgas, both the aarohana and avarohana sequences are required to contain exactly one note from each svara class (i.e., the horizontal groupings as shown in Fig. 1). Moreover, in the aarohana sequence the pitch of each svara must be higher than the one before. Thus, e.g., the svara R2 can be followed by G2 or G3, but not by G1. Similar (but reversed) requirements hold for the avarohana sequence. This means that the frequency of notes in a rāga gradually increases from the tonic to its middle note (one full scale higher), and then gradually descends to return to the tonic at the end of the rāga. Fig. 2(a) gives a finite state machine and a regular expression which will produce all of the aarohana sequences in Melakartha rāgas.

By contrast, Janya rāgas are much more flexible. Their sequences usually contain fewer than seven notes, although sometimes they may have seven or more notes. A given sequence in a Janya rāga may also repeat certain svaras, and need not be strictly ascending or descending. For a comparison of Melakartha and Janya rāgas, see Fig. 2(b).

There are further restrictions on Melakartha rāgas which Janya rāgas do not share. For Melakartha rāgas, both the aarohana and avarohana sequences must contain exactly the same svaras, and these sequences must be exactly the reverse of one another. Thus, all together there are 72 possible Melakartha rāgas constructed from the allowed combinations of svaras indicated in Fig. 2.

There is, however, an important relationship between the two types of rāgas: every Janya rāga is derived from a Melakartha rāga. Starting from a Melakartha rāga, one usually drops or adds a small number of svaras to arrive at an associated Janya rāga.

This relationship is also reflected in an additional characteristic of rāgas: the emotion associated with a rāga. Traditionally, each rāga is associated with one of nine emotions, including Bhakti (ritual or devotional) and Viram (bravery or fury). A thorough discussion is given in [30]. Here, the important observation is that a Janya rāga shares the same emotion as the Melakartha rāga from which it was derived.

B. Tāla

In the Carnatic music tradition tāla and rāga are concepts of equal importance in rendering a composition. Tāla defines the rhythmic structure or framework within which music is performed. In general tāla means clap and is used to maintain the time, which in turn determines the rhythmic structure or pattern for a musical composition. Generally, there are no special instruments to maintain the tāla in Indian classical music performance. It is usually maintained by the musician by tapping the hand on the lap or by clapping with both hands. As shown in Table II, seven tālas are characterized using three different notations, namely, laghu (I), drutam (O), and anudrutam (U).²

TABLE II
TĀLA NAMES AND CORRESPONDING STRUCTURES (PATTERN) IN CARNATIC MUSIC TRADITION. THE NOTATIONS IN THE TABLE ARE: LAGHU (I), DRUTAM (O), AND ANUDRUTAM (U)

TABLE III
JATI NAMES AND NUMBER OF BEATS FOR LAGHU (I) SPECIFIED IN TĀLA PATTERN

TABLE IV
LIST OF GATIS AND THE NUMBER OF NOTES PER BEAT

Fig. 3. Illustration of tāla structure in Carnatic music tradition.
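Returning briefly to the Melakartha constraints of Section II-A, the count of 72 rāgas can be checked by brute-force enumeration. The sketch below picks one variant from each of the seven svara classes and keeps only the strictly ascending choices. It is not the finite state machine of Fig. 2(a). The table `POSITION`, which assigns each of the 16 svara variants to one of the 12 svarastanas, is an assumption based on the customary layout, since Table I is not reproduced here; it agrees with the text in placing R3 and G2 at the same pitch. All function names are hypothetical, and the closing upper tonic is omitted.

```python
from itertools import product

# ASSUMED svarastana index (0 to 11) for each of the 16 svara variants.
# Variants that share an index share a pitch (e.g., R3 and G2).
POSITION = {
    "S": 0,
    "R1": 1, "R2": 2, "R3": 3,
    "G1": 2, "G2": 3, "G3": 4,
    "M1": 5, "M2": 6,
    "P": 7,
    "D1": 8, "D2": 9, "D3": 10,
    "N1": 9, "N2": 10, "N3": 11,
}
CLASSES = ["S", "R", "G", "M", "P", "D", "N"]


def variants(svara_class):
    """Return all variants belonging to one svara class, e.g. R -> R1, R2, R3."""
    return [name for name in POSITION if name[0] == svara_class]


def is_strictly_ascending(sequence):
    """True when every svara is higher in pitch than the one before it."""
    return all(POSITION[a] < POSITION[b] for a, b in zip(sequence, sequence[1:]))


def melakartha_aarohanas():
    """Enumerate aarohana sequences with one variant per class, ascending."""
    choices = [variants(c) for c in CLASSES]
    return [seq for seq in product(*choices) if is_strictly_ascending(seq)]


if __name__ == "__main__":
    scales = melakartha_aarohanas()
    print(len(scales))
```

Under these assumptions the enumeration yields 72 sequences, and no sequence places G1 after R2, in line with the example in the text. Because the avarohana of a Melakartha rāga is the exact reverse of its aarohana, each sequence determines one rāga.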
Any piece of Carnatic music has a fixed cycle called an avartana, and each cycle is divided into basic time intervals called units or aksharas. The number of aksharas in a cycle is determined by two pieces of information: 1) a tāla and 2) a jati. A tāla is built up as a combination of three different clapping styles. Anudrutam (indicated using U) consists of a single beat counted with an ordinary (palm-to-palm) clap. Drutam (O) consists of two beats: one ordinary clap followed by an “empty” clap placing the back of one hand into the palm of the other. Finally, laghu (I) consists of a variable number of beats. The first is an ordinary clap, followed by counting the remaining beats on the fingers of one hand. There are seven different tālas, each corresponding to a different sequence of these components. Thus, for example, the Jampa tāla corresponds to the sequence IUO while Ata tāla corresponds to IIOO. A list of all these tālas, together with the associated sequences, is given in Table II.

The variability of the laghu component of a tāla is determined by a second characteristic of Carnatic rhythm: the jati. There are five traditional options for jati, each of which determines a different number of beats for the laghu component of a tāla. So the Tisra jati involves three beats for each laghu component while Misra involves seven. These various jatis, along with their number of beats, are indicated in Table III.

The rhythm of a Carnatic composition is determined by a tāla and a jati, a pair which we call a tāla structure. Any combination is valid, so there are 7 × 5 = 35 tāla structures. From a tāla structure we can compute the number of beats in a full musical cycle. For example, the Ata tāla (with pattern IIOO) in the Tisra jati would consist of 3 + 3 + 2 + 2 = 10 beats per cycle, whereas the same tāla in the Misra jati would consist of 7 + 7 + 2 + 2 = 18 beats per cycle.

² Laghu indicates one beat of the palm followed by counting the fingers, drutam indicates one beat of the palm and turning it over, and anudrutam indicates just one beat of the palm. The number of aksharas (beats) for drutam is 2, for anudrutam is 1, and for laghu depends on the type of jati.

The final component of Carnatic rhythm is called its gati. This determines the number of notes which are played or sung for each beat (akshara). Combining this with the number of aksharas per cycle we can also compute the number of notes per cycle. One point of potential confusion here arises from the fact that the gati names are the same as the jati names, although the two can vary independently. In addition, the number of notes per akshara for a given gati is the same as the number of beats per laghu (I) for the jati of the same name (see Table IV). Thus, a performance in Tisra gati will have three notes per beat, regardless of its jati.

A full example of this rhythmic structure is illustrated in Fig. 3 for Rupaka tāla Kanda jati with Tisra gati. Because Rupaka tāla has the pattern OI, the entire cycle is divided into two parts: 1) drutam (O) and 2) laghu (I). As always, drutam has a fixed number of beats (2) while the number of beats for laghu is determined by the Kanda jati (5), making seven in total. Finally, the Tisra gati entails three notes per akshara, meaning that the full cycle will consist of 21 notes.

C. Performance in Carnatic Music

In the Carnatic music tradition, performance can be vocal or instrumental. In vocal performances the lead musician is a vocalist, and in instrumental performances the lead musician is an instrumentalist (violin, veena, mridangam, etc.). In Carnatic music performances the lead musician is generally accompanied by instruments, namely, violin, mridangam, ghatam, morsing, and tanpura. The tanpura is used to maintain the basic pitch throughout the concert, usually referred to as the tonic of the performance. All these instruments are tuned to the basic pitch of the lead performer. In general, any concert or musical performance is a combined effort from the lead musician and the various instrumentalists who accompany the lead performer. Therefore, a piece of music is usually tagged with song name, rāga name, tāla name, and composer name along with the names of all the musicians involved in a performance. This data is referred to as metadata; it is critical for MIR, and providing it is the main archival goal in MIR. The following section gives an overview of current research on archival methods and retrieval of Carnatic music.

Fig. 4. Research tasks in the MIR community for Indian classical music.

Fig. 5. Methods and features used for rāga recognition task in Carnatic music tradition.

1) Research in Carnatic Music (State of the Art): This section gives an overview of the state of the research in both the Carnatic and the Hindustani musical traditions. In particular, various methods and features used in the literature to recognize the rāga of a given music segment are described fully to help understand the categorical models discussed in later sections. Fig. 4 illustrates the state of research useful for MIR purposes in Indian classical music, an area which has grown substantially in the last ten years.

One aspect of this research, CompMusic,³ is a research project focused on a wide variety of musical traditions, including Indian classical music, Chinese music, Turkish music, etc. “Dunya,” a culture-specific toolbox developed under this project, includes tools for music segmentation, feature extraction, meta-data extraction, and meta-data-based retrieval of music segments in a wide variety of musical traditions [31]–[38]. Apart from the CompMusic project, other efforts have focused on analyzing Carnatic and Hindustani music for various processing tasks (both vocal and instrumental music) which are useful for MIR [39]–[44].

For most MIR-related tasks, metadata plays a dominant role; for example, a composer’s name is among the most useful metadata for information retrieval. Thus, composer identification is one of the most critical tasks among those given in Fig. 4. In Carnatic music, composers often have a unique signature (mudra) which they weave into their compositions. Thus, one possible way to identify the composer of a music segment is to identify its mudra, which occurs near the end of a composition. As yet, no one has developed any technique or feature to find the mudra of a composition segment for either Carnatic or Hindustani music. The different borders around the tasks in Fig. 4 distinguish them into three classes: 1) those which have not yet been attempted; 2) those which are in progress; and 3) those which have been successfully completed.

³ http://compmusic.upf.edu/

Fig. 5 elaborates on the rāga recognition task by giving an overview of the features and methods used for identifying the rāga of a music segment. Such a task involves a particular method which is used to analyze a particular feature in hopes of identifying the rāga of a music segment. For example, two recognition procedures considered in this paper are the application of hidden Markov models (the method HMMs) [45] to mel frequency cepstral coefficients features [46], and the application of the K-nearest neighbor [47] method to the pitch feature [48]. Many other features (pitch class distributions and pitch class dyad distributions) and methods (support vector machines and linear discriminant analysis) are available for these recognition tasks.

At present, there is no common platform used to store, share, and integrate the results obtained from these tasks. This state of affairs is due to the lack of a common model of Carnatic music that is needed to integrate these studies. The categorical model that we propose will allow users to develop and work with such common database collections and to easily share their results with a larger group. In a broad sense, the model proposed in this paper provides the complete structure necessary for a group of researchers to work together, share their data and results, and store the data in a collective database. The proposed categorical framework is intended to facilitate a platform for collaborative research and it can be adapted to other domains as well.

III. CATEGORY THEORETIC REPRESENTATION OF CARNATIC MUSIC

Knowledge representation plays a fundamental role in describing any domain problem in which a computer must interpret and process data for further analysis. In many real world applications, modeling the objects and relationships in a domain constitutes a major contribution to such knowledge representation. The most important criterion for representing any domain problem is that knowledge should be properly defined so that it is consistent with the problem domain. Therefore, the primary challenges of knowledge representation are as follows.
1) Choosing the problem to solve.
2) Representing a problem and providing the required knowledge to solve the problem.
3) Validating the appropriateness of the knowledge before solving the problem.
4) Solving the problem computationally and evaluating the correctness of the solution to the problem.
5) Comprehensive representation of the knowledge so that a computer can solve the problem for a particular domain.

There are many models for knowledge representation. One class of such models is called ontologies, which provide an explicit specification and abstraction of knowledge [49]–[52] in either human-readable form or a formal language. These typically begin by specifying the types of entities which occur in the problem domain, and are then supplemented with a collection of rules that the entities are expected to obey.

The most popular approach to ontologies relies on a family of Web ontology languages (OWL) [53]–[55]. In MIR, these have been applied to model music for segmentation and the extraction of semantic information [56], [57]. The OWL approach is based on description logic and is often used in conjunction with other technologies, such as the resource description framework and the SPARQL query language, and can also be mapped to relational database schemas for storage and retrieval [58], [59].

One shortcoming of existing ontological approaches is a lack of extensibility. It can be difficult to modify an existing ontology in order to extend or debug its domain. In particular, it is difficult to relate different ontologies to one another, particularly in cases where neither ontology embeds into the other. A lack of formal relationships between ontologies also leads to difficulties when migrating data from one ontology to another.

Similar issues arise for ontological integration. Domain-specific OWL ontologies are often specialized from large and complex “upper” ontologies, e.g., [60]. The advantage is that this method can align multiple small ontologies derived from the same upper ontology. The disadvantage is that these upper ontologies can be difficult for developers to write and debug and for users to navigate. A more modular approach, based on identification of overlap between small ontologies, depends on a concrete representation of that overlap.

In this paper, we investigate an alternative approach to ontologies based on CT. CT was developed in the 1940s by Eilenberg and MacLane [61] to study the relationship between two areas of mathematics: 1) topology and 2) algebra. Subsequent research has led to applications throughout mathematics as well as in theoretical physics and computer science.

There is also a broad and substantial literature on the use of CT in formal modeling. One line of inquiry following on this database work can be found in the work of Johnson and Rosebrugh [63], who provide a categorical interpretation of entity–attribute–relation diagrams. This work was further elaborated by Diskin [66], [67], and Rutle et al. [68], with a particular emphasis on software engineering. Most importantly, whereas earlier work was largely theoretical, much of this more recent work has been implemented and is directly informed by engineering practice.

Along broadly similar lines, CT can be used to understand object-oriented class modeling as found, for example, in the Unified Modeling Language (UML). Although it is ubiquitous in (especially the early stages of) software engineering, UML lacks well-defined semantics. Both Diskin [69] and Padi et al. [70] have associated (fragments of) the UML class diagram syntax with constructions in CT. Turning this relationship around, CT straightforwardly inherits many of the methods used in object-oriented class modeling.

The formal mathematics of CT allow us to sidestep some of the difficulties of other ontological representations. For example, one advantage of categorical ontologies is that we can express their inter-relationships using maps called functors. More generally, a diagram of functors can describe sophisticated relationships between families of ontologies, which can then be merged together using a construction called a colimit. Kan extensions provide a formal method for data migration between schemas. These powerful formal methods allow us to manipulate categorical ontologies in a way that is both correct and mathematically justified [71]–[73].

The previous section described some fundamental aspects of Carnatic music. In this section, we translate that description into a category-theoretic ontology. Our supplementary materials provide a brief introduction to the methods of CT (as well as our notation); we strongly advise those without a background in CT to study that material before proceeding.

In our categorical model, each object represents a set of entities, and each arrow represents a function between these sets. In diagrams the objects will be represented as a box containing text which describes a typical element of the set. For example, a box labeled “a svara” represents the set of svaras (notes) in Carnatic music. For readability, we use corner braces rather than full boxes when discussing objects in the text: e.g., ⌜a svara⌝.
In 1990s, Rosebrugh and Wood [62] showed that a database As discussed in Section II, a Carnatic melody is built up
schema can be viewed as a category, and a database instance from 16 basic notes, spread across 12 pitches and 7 articula-
as a functor on this category [63]. More recently, Spivak [64] tions. Each of these will become an object in our categorical
has used this point of view to apply the methods of CT to representation. Since each of these objects represents a single
problems of data management. Generally speaking, this line individual, these can all be modeled by terminal objects in our
of work provides a dictionary of categorical interpretations category. We then group these notes by articulation, as in the
for the standard vocabulary of databases, such as schemas, horizontal groupings of Fig. 1. As shown in Fig. 6 and 7,
instances, queries, updates, and data migration. This paper has categorically, these groupings take the form of coproducts,
also been implemented in the functorial query language, soft- allowing us to represent the objects Rishabham, Gandharam,
ware for building and analyzing databases from the categorical Dhaivata, Nishada, Madhyama, Sadja, and Panchama as finite
perspective [65]. sets (also called enumerations).
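The grouping of notes into finite sets can be illustrated with a minimal Python sketch. This is not the authors' implementation: each articulation is written as an enumeration and the coproduct of all seven as their disjoint union. The member labels other than R1, R2 and G1, and the helper name `copair`, are assumptions introduced here for illustration.

```python
from enum import Enum

# Each articulation is a finite set (an enumeration). Member labels other
# than R1, R2 and G1 are assumed names, not taken from the paper.
Sadja = Enum("Sadja", ["S"])
Rishabham = Enum("Rishabham", ["R1", "R2", "R3"])
Gandharam = Enum("Gandharam", ["G1", "G2", "G3"])
Madhyama = Enum("Madhyama", ["M1", "M2"])
Panchama = Enum("Panchama", ["P"])
Dhaivata = Enum("Dhaivata", ["D1", "D2", "D3"])
Nishada = Enum("Nishada", ["N1", "N2", "N3"])

ARTICULATIONS = [Sadja, Rishabham, Gandharam, Madhyama,
                 Panchama, Dhaivata, Nishada]

# The coproduct is the disjoint union of the seven enumerations; members of
# different Enum classes are never equal, so the union is automatically disjoint.
SVARAS = [member for articulation in ARTICULATIONS for member in articulation]


def copair(handlers):
    """Universal property of the coproduct (hypothetical helper): one
    function per summand determines a single function out of the union."""
    def out_of_coproduct(svara):
        return handlers[type(svara)](svara)
    return out_of_coproduct


# Example: the map sending every svara to the name of its articulation.
articulation_of = copair(
    {a: (lambda s, a=a: a.__name__) for a in ARTICULATIONS}
)
```

Under these assumptions `len(SVARAS)` is 16, matching the count of basic notes given in the text, and `articulation_of(Rishabham.R2)` returns `"Rishabham"`.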
Fig. 6. Categorical representation of notes in Carnatic music tradition.
Fig. 7. Categorical representation of notes with respect to isomorphism.
A. Modeling rāgas
Let ⌜a svara⌝ denote the coproduct of the seven objects ⌜Sadja⌝, ⌜Rish.⌝, etc., as described in Table I. Altogether this will be a coproduct of 16 terminal objects (2 × 1 element + 1 × 2 elements + 4 × 3 elements). Table I defines a function⁴ freq_ratio : ⌜a svara⌝ → Q.
⁴ Here, Q, the rational numbers, is the set of fractions.
As a minor abuse of notation we will use the same name freq_ratio to denote the composition of the above map with the coproduct injections, e.g., ⌜Rish.⌝ → ⌜a svara⌝ → Q.
These svara objects are then connected into two sequences: 1) the arohana and 2) the avarohana, to form a rāga. Here, we must distinguish between Melakartha and Janya rāgas, as their mathematical structure is quite different. For a Melakartha rāga, we can identify its Sadja svara, its Rishabham svara, etc. Mathematically, this corresponds to a family of functions from a sequence object into each of the svara objects, as displayed in Fig. 8 for the Melakartha arohana (M-arohana) sequence. Together, these functions define a single map into the product object (see Fig. 9).
Fig. 8. Projections from an M-arohana sequence to its svaras.
Fig. 9. Cartesian product formed from individual svara objects.
Indeed, the arohana sequence is completely determined by the choice of these seven svaras, so the map ⌜M-arohana sequence⌝ → ⌜Svara product⌝ will be a monomorphism (i.e., an injection). We can say that these arrows are jointly monic.
However, not every choice of svaras is allowed as an M-arohana sequence; the rules for identifying those which are acceptable were described in Section II. We can model these rules using truth functions and pullbacks. One of these rules says that the frequency of the Rishabham svara is less than that of the Gandharam svara (as displayed in Table I). To say this categorically we make use of a truth function less_than : Q × Q → {True, False}. This function is defined by the rule
$$\mathrm{less\_than}(p, q) = \begin{cases} \mathrm{True} & \text{if } p < q \\ \mathrm{False} & \text{otherwise.} \end{cases}$$
Using the universal mapping property of the product we can use this to define a truth function on ⌜Rish.⌝ × ⌜Gand.⌝ as in Fig. 10.
Fig. 10. Universal Mapping Property (UMP) of the product to compare frequency ratios between svaras.
Notice that the object Q appears twice in the preceding diagram. Usually when this occurs it means that the same object is involved in two or more different arrows; we write it multiple times for readability, even though it really is the same object. The best way to understand such diagrams is by tracing an element through the diagram; since different functions give different values, this allows us to see why the object appears twice. For example, we might trace the pair (R1, G1) through the diagram in Fig. 11.
Fig. 11. Illustrating the Universal Mapping Property of svaras shown in Fig. 10 with an example.
Since (R1, G1) maps to True, this is an allowable transition in Melakartha rāgas. If we traced (R2, G1) through the diagram we would end up at False, so this pair is not allowed. Distinguishing these cases requires the use of the pullback operation.
Note that there is a map True : 1 → {True, False} which sends the unique element of 1 to True. As discussed in the supplementary section on CT, pulling our truth function back along this map will yield a subset of ⌜Rish.⌝ × ⌜Gand.⌝ consisting of only those pairs which map to the value True. We denote this pullback by ⌜Rish. < Gand.⌝. See Fig. 12.
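The pullback just described can be sketched at the level of finite sets as a filtered Cartesian product. The following is a minimal illustration, not the authors' implementation. Table I is not reproduced in this section, so the frequency ratios below are assumed just-intonation values standing in for it, and the names `FREQ_RATIO`, `truth_on_pairs`, and `RISH_LT_GAND` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed frequency ratios standing in for Table I (not taken from the paper).
FREQ_RATIO = {
    "R1": Fraction(16, 15), "R2": Fraction(9, 8), "R3": Fraction(6, 5),
    "G1": Fraction(9, 8), "G2": Fraction(6, 5), "G3": Fraction(5, 4),
}
RISH = ["R1", "R2", "R3"]
GAND = ["G1", "G2", "G3"]


def less_than(p, q):
    """Truth function Q x Q -> {True, False}."""
    return p < q


def truth_on_pairs(pair):
    """Truth function on Rish. x Gand.: apply freq_ratio to each
    component of the pair, then compare with less_than."""
    rish, gand = pair
    return less_than(FREQ_RATIO[rish], FREQ_RATIO[gand])


# Pulling the truth function back along True : 1 -> {True, False} keeps
# exactly the pairs that are sent to True.
RISH_LT_GAND = [pair for pair in product(RISH, GAND) if truth_on_pairs(pair)]
```

With these assumed ratios, (R1, G1) is kept and (R2, G1) is excluded, in agreement with the two pairs traced in the text; six of the nine pairs survive.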
Fig. 12. Defining frequency-ordered pairs of svaras using a pullback property.
Similar reasoning applies to the restriction between Dhaivata and Nishada svaras. This gives us a final definition shown in Fig. 13.
Fig. 13. Representation of M-arohana sequence as a Cartesian product.
Next, we turn to the relationship between arohana and avarohana sequences. In Melakartha rāgas, each is the reverse of the other. Using products and coproducts one can define a list operator ∗ which acts on objects in the category; given a set A, A∗ is the set of lists whose elements come from A. In particular, each arohana sequence is a list of svaras, so we will have a monic arrow from ⌜an M-arohana sequence⌝ into ⌜a list of svaras⌝.
The advantage of this point of view is that list objects come equipped with a variety of operations. In particular, we may reverse any sequence, corresponding to a map A∗ → A∗. We can then specify the relationship between arohana and avarohana in Melakartha rāgas as an equation between two paths, requiring that the diagram in Fig. 14 commutes.
Fig. 14. Commutative diagram expressing the reversal symmetry of Melakartha rāga.
As discussed in Section II, this categorical model says that every Melakartha rāga is a sampoorna rāga, which is indicated by the commutativity of the above diagram.
Next consider Janya rāgas; these are much less structured than Melakartha rāgas and, consequently, there is much less that we can say about them at a categorical level. It is true that Janya rāgas contain an arohana and an avarohana sequence. We can also classify which Janya rāgas are sampoorna using an analysis very similar to the one above. There is also a relationship between the two types of rāgas: every Janya rāga is derived from a particular Melakartha rāga. In our category, this corresponds to the map provided in Fig. 15.
Fig. 15. An additional relationship between Janya and Melakartha rāgas.
In general, a rāga may be either a Melakartha rāga or a Janya rāga (but not both), so we can represent the object ⌜a rāga⌝ as the coproduct of the ⌜a Melakartha rāga⌝ object and the ⌜a Janya rāga⌝ object.
We also find it useful to represent this information as a typing map. Consider the two-element set {J-type, M-type} ≅ 1 + 1. Using the fact that any object has a unique map to the terminal object, the universal property of the coproduct ⌜a rāga⌝ yields our typing map. See Fig. 16.
Fig. 16. Using the UMP of the coproduct to define the type of a rāga.
One final piece of information usually associated with a rāga is its emotion; historically, this is usually classified into one of nine categories. This corresponds to an arrow has : ⌜a rāga⌝ → ⌜an emotion⌝. In addition, this emotion is the same between a Janya rāga and the Melakartha rāga which it is derived from, corresponding to a final condition requiring that the two paths ⌜a Janya rāga⌝ → ⌜an emotion⌝ in the diagram in Fig. 17 agree.
Fig. 17. Illustrating the relationship between Melakartha rāga and Janya rāga.
The above descriptions can be regarded as a recipe for defining a category which models the relationships between pieces and types of Carnatic rāgas. One begins by assembling all of the objects and arrows referred to above into a single graph (which we omit here for reasons of space). Next, one assembles a list of declared path equations and, finally, a list of all categorical constructions (i.e., products, pullbacks, monic arrows, coproducts, and finite sets) specified in our description. Mathematically, these lists of categorical constructions are called a sketch for our model, and there are well-known methods for generating a category from these basic constructions (see [74]); a complete description is beyond the scope of this paper, but is close in spirit to the categories of Examples 2 and 2.1 in the supplementary section.
B. Modeling Rhythm Structure
As discussed in Section II, in order to describe a piece of music we must specify both its melody and its rhythm. In Carnatic music the rāga provides the first, while the second is determined by what we call a Carnatic rhythm structure. This can be broken down into three pieces: 1) the tāla; 2) the jati; and 3) the gati; the first two pieces determine a tāla structure.
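The three-part decomposition of a rhythm structure can be made concrete with a small Python sketch. It is an illustration under stated assumptions rather than the authors' model: the tāla names and beat patterns, the jati names, and the gati counts below are assumed stand-ins for Tables II to IV, which are not reproduced here. The symbols O, U and I denote drutam, anudrutam and laghu beats, with drutam counted as two and anudrutam as one.

```python
from itertools import product

# Assumed stand-ins for Tables II-IV: seven talas as beat patterns over
# {O, U, I}, five jatis giving the laghu (I) count, five gatis giving
# the number of notes per beat.
TALA_PATTERN = {
    "Dhruva": "IOII", "Matya": "IOI", "Rupaka": "OI", "Jhampa": "IUO",
    "Triputa": "IOO", "Ata": "IIOO", "Eka": "I",
}
JATI_LAGHU = {"Tisra": 3, "Chatusra": 4, "Khanda": 5, "Misra": 7, "Sankeerna": 9}
GATI_NOTES = dict(JATI_LAGHU)  # assumed: same names and same counts as the jatis

# A rhythm structure is a triple (tala, jati, gati): a product of finite sets.
RHYTHM_STRUCTURES = list(product(TALA_PATTERN, JATI_LAGHU, GATI_NOTES))


def beat_count(jati):
    """The function {O, U, I} -> N determined by a jati."""
    return {"O": 2, "U": 1, "I": JATI_LAGHU[jati]}


def beats_per_cycle(tala, jati):
    """Apply the jati's beat count entry by entry to the tala's pattern
    (a map between lists), then sum the resulting list of numbers."""
    counts = beat_count(jati)
    return sum(counts[beat] for beat in TALA_PATTERN[tala])


def notes_per_cycle(tala, jati, gati):
    """Combine beats per cycle with the gati's notes per beat."""
    return beats_per_cycle(tala, jati) * GATI_NOTES[gati]
```

For example, under these assumptions `beats_per_cycle("Triputa", "Chatusra")` evaluates 4 + 2 + 2 = 8, and the product contains 7 × 5 × 5 = 175 rhythm structures.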
As explained in Section II-B, there are a fixed number of tālas (seven), of jatis (five), and of gatis (five), so each of these can be represented as a finite set (i.e., a coproduct of terminal objects). Each of these varies independently, so the ⌜a Carnatic rhythm structure⌝ object will be a product of these three (and consequently a coproduct of 7 × 5 × 5 = 175 terminal objects). We also distinguish the object of pairs (tāla, jati), which we call ⌜a tāla structure⌝.
The structural pattern associated with each tāla (as displayed in Table II) can be represented as a function, as shown in Fig. 18. Here, we use the “Kleene star” A∗ to denote the set of lists with elements from A.
Fig. 18. A tāla determines a sequence (list) of beats (of type O, U or I).
As given on the left-hand side of the same table, the number of laghu (I) beats associated with each jati determines a function from ⌜a jati⌝ to the natural numbers. More usefully, because we know that the numbers of beats associated with drutam (O) and anudrutam (U) are always the same (two and one, respectively), each jati determines a function {O, U, I} → N. For example, the Misra jati (I = 7) would be associated with the function in Fig. 19.
As discussed in the supplementary material, applying this function entry by entry determines a related map between lists, an element of $(\mathbb{N}^{*})^{\{O,U,I\}^{*}}$. This is essentially the functoriality of the list operator.
This shows that a tāla structure determines both an element of {O, U, I}∗ and a function in $(\mathbb{N}^{*})^{\{O,U,I\}^{*}}$. Since one is a function and the other is an input, we can pair these together and apply the evaluation function to obtain a list of natural numbers. Summing the resulting list, we obtain the number of beats per cycle associated with a tāla structure, shown in Fig. 20.
Fig. 19. Categorical representation of tāla structure.
Fig. 20. A jati associates each type of beat (O, U or I) with a natural number, its beat count. (This figure number should be 19.)
Just as we did for rāgas, we can examine this diagram by tracking an element through it (Fig. 21), although this is made more difficult by the fact that a function List{O, U, I} → List(N) contains an infinite number of values.
Fig. 21. Categorical representation of an instance of a tāla structure.
Finally, we can combine the beats/cycle information generated by a tāla structure with the notes/beat information given by the gati in order to determine the total number of notes/cycle in the segment (Fig. 22).
Fig. 22. Categorical representation of Carnatic music rhythm structure.
Although we do not need to use it, we should also note that there is an isomorphism between ⌜a gati⌝ and ⌜a jati⌝, implicit in Tables III and IV, and that this isomorphism commutes with the relevant counts associated with these objects, as shown in Fig. 23.
Fig. 23. Jatis and gatis with the same name also have related beat counts.
IV. SINGLE CONCEPTUAL MODEL FOR ANALYSIS
In this section, we describe a method of integrating the categorical models just defined (and others) into a single conceptual framework. We have already seen applications of limits
and colimits inside a category. Now, we employ some of the same ideas to study relationships between categories.
First consider the following simple category (Fig. 24) which contains metadata related to a musical segment. A raw file⁵ records a Carnatic music segment, and this segment has five relevant pieces of metadata, including its title, artist, and composer.
⁵ We limit our attention to raw files that are Carnatic music segments.
Fig. 24. Categorical representation for analysis of Carnatic music.
This is acceptable as a structure for storing data, but for analysis and validation we would like to integrate it with the categorical models from the previous sections. We can do this using pushouts of categories.
Let R denote the categorical ontology for rāgas which we developed in the previous section, and M the metadata category presented in Fig. 24. Notice that both of these categories contain an object called ⌜a rāga⌝.⁶ We would like to glue these two categories together by identifying their rāga objects.
⁶ Note, however, that unlike some other ontological approaches it is not important that the naming of these objects agree.
In CT, functors are used to relate structures in one category to analogous constructions in another [75]. Technically, these functors must preserve the categorical structures (e.g., products and coproducts) included in our model.
Let 1 denote the category which contains only a single object and its identity arrow. Because of its particularly simple form, a functor from 1 into some other category C is determined by the choice of a single object in C. In particular, the ⌜a rāga⌝ objects in R and M define two functors ragaM : 1 → M and ragaR : 1 → R.
Consider the pushout diagram in Fig. 25. The functors into RM show that this composite category contains both M and R as subcategories. Moreover, the two paths 1 → RM agree, meaning that the rāga objects in the two categories have been identified. By its universal property, RM is the smallest category satisfying these two properties: it contains nothing except the objects and arrows which come from R and M. This gives a formal description of our intuitive goal: we have glued our models together along their common piece. In particular, RM is fully specified by the categories M and R and the two maps ragaM and ragaR (see Fig. 25).
Fig. 25. Integrating rāga and metadata models using pushout.
Similarly, let T denote the categorical ontology for tāla structures; exactly the same sort of argument allows us to integrate M and T. Moreover, these two pushouts form the legs of a third diagram, allowing us to iterate the pushout procedure. This yields a new category integrating both ontologies into our metadata category. Of course, given ontologies for the other metadata, such as database specifications for artist and composer data, we could glue these into our framework in exactly the same fashion (see Fig. 26).
Fig. 26. Integrating rāga, tāla and metadata models through iterated pushouts.
Finally, we need to evaluate our model for various experimental conditions. Fig. 27 shows the categorical representation of a task. The object ⌜a task⌝ is a product consisting of three pieces of information: 1) a feature to be used; 2) a method of analysis required; and 3) the type of information to be identified. The last of these, ⌜a task type⌝, is a finite set consisting of the elements shown in Fig. 28.
Fig. 27. Categorical representation of task.
Fig. 28. The task type object is a finite set consisting of five tasks.
Fig. 29. Each piece of MIR metadata addresses a specific task type.
Fig. 30. Categorical model for an experiment evaluation.
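The gluing of M and R along their rāga objects, described earlier in this section (Fig. 25), can be sketched on the generating graphs of the two categories. This is a simplified illustration, not the authors' construction: it handles only objects and generating arrows, ignoring path equations and the structure that the functors must preserve. The function name `glue_along_object` and the toy objects and arrows in `M` and `R` are hypothetical.

```python
def glue_along_object(tag_a, cat_a, obj_a, tag_b, cat_b, obj_b, glued):
    """Pushout of two graph presentations along a single shared object:
    take the disjoint union, then identify obj_a (in cat_a) with obj_b
    (in cat_b) under the common name `glued`."""
    def inclusion(tag, chosen):
        # Objects are tagged by their source category to keep the union
        # disjoint; only the chosen object is sent to the shared name.
        return lambda obj: glued if obj == chosen else f"{tag}.{obj}"

    into_a, into_b = inclusion(tag_a, obj_a), inclusion(tag_b, obj_b)
    objects = ({into_a(o) for o in cat_a["objects"]}
               | {into_b(o) for o in cat_b["objects"]})
    arrows = {}
    for tag, cat, into in ((tag_a, cat_a, into_a), (tag_b, cat_b, into_b)):
        for name, (src, tgt) in cat["arrows"].items():
            arrows[f"{tag}.{name}"] = (into(src), into(tgt))
    return {"objects": objects, "arrows": arrows}


# Toy fragments of the metadata category M and the raga ontology R
# (hypothetical; the rāga object deliberately has different names).
M = {"objects": {"raw file", "segment", "raga", "title"},
     "arrows": {"records": ("raw file", "segment"),
                "has_raga": ("segment", "raga"),
                "has_title": ("segment", "title")}}
R = {"objects": {"a raga", "emotion", "raga type"},
     "arrows": {"has": ("a raga", "emotion"),
                "type": ("a raga", "raga type")}}

RM = glue_along_object("M", M, "raga", "R", R, "a raga", glued="raga")
```

In the result, the arrow `M.has_raga` ends at the same object where `R.has` begins, so a segment's emotion can be reached by composing arrows that originally lived in different models. The differing names `"raga"` and `"a raga"` reflect the remark in footnote 6 that the names of the identified objects need not agree.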
If we let ⌜meta-data for MIR⌝ denote the coproduct [diagram], attached to this object there is a typing map constructed in the same fashion as the rāga typing map introduced in Section III (see Fig. 29).
Next, we have the notion of an experiment as depicted in Fig. 30. Roughly speaking, an experiment is an instance of a particular task which analyzes a raw file to generate a single result. This result can be mapped to ⌜meta-data for MIR⌝, corresponding to an arrow [diagram] (see Fig. 32).
There is an obvious consistency check here, which can be represented as a commutative diagram: the type of result from an experiment is the same as the type of the task which the experiment performed. This is shown diagrammatically in Fig. 31.
Fig. 31. A commutative diagram expressing coherence between the task and result of an experiment.
As shown in Fig. 30, we can also keep track of some additional information like the researcher and the date on which the experiment is conducted. This categorical model allows a group of people to run experiments on a given music segment with different methods and features, and different users can store, retrieve, and update the results. This framework can be used by a group of people for collaborative analysis and implementation purposes. Notice that the models of an experiment and a task may be glued together along their overlap using the same colimit methods discussed above. Moreover, in any particular domain the feature parameters, method parameters and results objects may be expanded in a similar way to represent structured data.
Fig. 32 illustrates the complete categorical model for analysis of Carnatic music with experimental evaluation, including the tasks performed on music segments. As suggested by the figure, an experiment is performed on a raw file which contains a recording of a Carnatic music segment. Every experiment involves a task which is performed using features and methods with appropriate parameters. The parameters for feature and method vary based on the type of feature and type of method with respect to a particular task.
The idea behind this categorical model is that any user can execute the model and store his/her outputs for a particular task using different features and methods. These can be accessed or cross-verified by other people who work on the same music segment. The model shares information between a group of people and allows them to collaborate on research using the same music databases. This collaborative platform allows users to share a common model of music for any processing task, share code, and use existing results for analysis and verification. This platform creates a common workspace, where anyone can use a shared database and perform analysis in a consistent manner.
Notice, in particular, that the dotted line [diagram] represents a ground truth. Every file under consideration records a particular piece of Carnatic music, so there is such a function. However, this information is, in general, not available to us. In some sense, our goal is to identify (the metadata of) the recorded segment based on characteristics of the file. Alternatively, in situations where we know the ground truth of a music segment, this provides us with additional paths from ⌜an experiment⌝ to ⌜meta-data for MIR⌝ (refer to Fig. 33).
These alternate paths can help to verify our metadata. Ideally, these paths should agree (for any particular task type), but in practical experimental evaluations, this cannot be expected in all cases. One path is derived from the ground truth about the music segment and the other path is taken from the experimental evaluation. We can compare the results of both paths to judge the accuracy of our experimental methods.
For example, Fig. 35 provides an instance of the task model for the rāga recognition task using the HMM method applied to the pitch feature. It also gives an example of one element in the model, traced through the various maps. Other tasks may involve different parameters, but this basic model provides a framework enabling end users to integrate various pieces of information for implementing other tasks like tāla recognition, singer identification, song name identification, etc.
V. USE CASES
This section provides experimental evaluation and database schemas for the categorical models discussed in Section IV. A database is a collection of data that is stored and organized so that the end user can easily and quickly access, manage, and update the collection. In particular, relational database models involve the storage of data in tables (also called relations). In the relational model, each column in a table is called an attribute and each row is called an entity or record. Each record is uniquely determined by an identifier called a primary key. Columns of other tables which refer to this primary key are called foreign keys, and these can be regarded as arrows in a category.
In fact, Spivak has shown [64], [65] that (finitely presented) categories can be converted into database specifications (and vice versa), so in defining our categorical models we have already provided a database model. The fact that categorical models can be translated into database schemas facilitates the storage, analysis and fine-tuning of the data based on our models. The resulting database provides the users access to the data for further analysis and processing tasks. See [64], [65] for an explanation of how a categorical model can be represented as a database schema along with its integrity constraints.
Via this approach we can use our existing models of rāga, tāla, etc., to define our underlying database models. In fact, in most cases we do not want to store all of the data associated
with our model. For example, the database schema depicted in Fig. 36 includes only some of the objects and arrows from the detailed model in Section III; this might be the only data needed for metadata-based retrieval purposes.
Fig. 32. Single conceptual model for analysis of Carnatic music.
Fig. 33. Categorical representation of an experimental analysis of a music segment.
Fig. 34. Categorical representation of rāga task.
Fig. 35. Categorical representation of rāga task with example elements.
Fig. 36. Database model for the Carnatic music conceptual structure.
The decision of which data to store and which to throw away can be encoded as a functor from a reduced database schema into the full categorical model. It sends each table in Fig. 36 to the object of the categorical model which has the same name. Based on this mapping, there is a canonical way to migrate data from the complex schema into the simpler one. These are called Δ-migrations, and are discussed in [65].
Fig. 37 depicts a generic model for implementation and analysis of music segments. This generic model can be
visualized as a three-layered structure. The first layer concentrates on a categorical representation of the characteristics of Carnatic music. This layer also details the methods, features, and tasks which are used to perform a particular analysis. While implementing such a task, we may pull characteristics of music from its corresponding models. After concluding a task, its results are stored in the database of the second layer.
Fig. 37. Graphical interpretation of generic model for analysis of Carnatic music.
As explained in Fig. 36, the second layer in the generic model is a database schema which mediates between our category theoretic ontology and the query processing system presented to users. This layer allows us to create a database schema based on (a fragment of) the CT models provided in the first layer so that the end user can access the required information from the database through the query processing system.
The last layer of the model supports query analysis for information retrieval purposes. This layer also integrates the processing of results based on different methods which may be evaluated on different features. This layer allows the end user to access or retrieve information about desired music segments. The end user can also find the methods and features used for a particular task for further analysis purposes. This layer extracts the data from the database for any query analysis. In this way, the unified conceptual model allows a group of people to share the results and to get complete details about the tasks, features and results of a music segment in the database. We can say that this model allows people to do collaborative research.
VI. CONCLUSION
In the area of MIR for Carnatic music, developing metadata for music segments is an important and challenging task. Presently, some efforts have been made to provide metadata for music segments, but these efforts have been somewhat disparate, making it difficult to compare their results. This paper discusses a categorical approach for modeling Carnatic music for various processing tasks in MIR applications. Despite the complexity of traditional Carnatic music and its basis in rāga and tāla structures, we have formally characterized these complexities using the categorical approach. The conceptual model proposed in this paper provides a means to model any music segment for finding metadata for information retrieval purposes.
With respect to collaborative research, the proposed categorical model provides a common platform that allows users or researchers to share data and verify the results on a common database. This paper also addresses the mapping of categorical structures into database schemas for storage and retrieval purposes. The framework developed in this paper will allow a group of people to collaboratively conduct analysis using a categorical model on a common corpus of music segments.
This paper provides a framework to formalize the relationship between Carnatic music structure, prior research, tasks previously attempted, and the methods and features used for different music processing tasks. The general model proposed in this paper can be adapted to other domains by replacing our model of Carnatic music by models applicable to other domain problems. As future work, the proposed categorical model can be extended to characterize other musical traditions, including Western classical music and Hindustani music.
ACKNOWLEDGMENT
The authors would like to thank D. Spivak who helped to improve the CT models defined in this paper, especially for defining models for tāla structure. The authors also thank
R. Wisnesky and S. Adhinarayanan for their valuable comments and suggestions for improving the consistency and readability of this paper.
REFERENCES
[1] N. Orio, “Music retrieval: A tutorial and review,” Found. Trends Inf. Retrieval, vol. 1, no. 1, pp. 1–90, 2006.
[2] M. A. Casey et al., “Content-based music information retrieval: Current directions and future challenges,” Proc. IEEE, vol. 96, no. 4, pp. 668–696, Apr. 2008.
[3] R. Typke, F. Wiering, and R. C. Veltkamp, “A survey of music information retrieval systems,” in Proc. ISMIR, London, U.K., 2005, pp. 153–160.
[4] E. D. Scheirer, “Tempo and beat analysis of acoustic musical signals,” J. Acoust. Soc. America, vol. 103, no. 1, pp. 588–601, 1998.
[5] M. A. Alonso, G. Richard, and B. David, “Tempo and beat estimation of musical signals,” in Proc. ISMIR, Barcelona, Spain, 2004, pp. 158–163.
[6] P. Herrera-Boyer, G. Peeters, and S. Dubnov, “Automatic classification of musical instrument sounds,” J. New Music Res., vol. 32, no. 1, pp. 3–21, 2003.
[7] T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, “Instrument identification in polyphonic music: Feature weighting to minimize influence of sound overlaps,” EURASIP J. Appl. Signal Process., vol. 1, no. 1, p. 155, 2007.
[8] B. L. Sturm, M. Morvidone, and L. Daudet, “Musical instrument identification using multiscale mel-frequency cepstral coefficients,” in Proc. 18th Eur. Signal Process. Conf., Aalborg, Denmark, Aug. 2010, pp. 477–481.
[9] M. Marolt, “SONIC: Transcription of polyphonic piano music with neural networks,” in Proc. Workshop Current Res. Directions Comput. Music, Barcelona, Spain, 2001, pp. 217–224.
[10] H. G. Ranjani, S. Arthi, and T. V. Sreenivas, “Carnatic music analysis: Shadja, swara identification and raga verification in alapana using stochastic models,” in Proc. IEEE Workshop Appl. Signal Process. Audio Acoust. (WASPAA), New Paltz, NY, USA, Oct. 2011, pp. 29–32.
[11] R. Sridhar and T. V. Geetha, “Raga identification of Carnatic music for music information retrieval,” Int. J. Recent Trends Eng., vol. 1, no. 1, pp. 571–574, 2009.
[12] R. Sridhar and T. V. Geetha, “Music information retrieval of Carnatic songs based on Carnatic music singer identification,” in Proc. Int. Conf.
[24] J. C. Baez and A. Lauda, “A prehistory of n-categorical physics,” in Deep Beauty: Understanding the Quantum World Through Mathematical Innovation. 2009, pp. 13–128.
[25] J.-C. Letelier, J. Soto-Andrade, F. G. Abarzua, A. Cornish-Bowden, and M. Luz Cárdenas, “Organizational invariance and metabolic closure: Analysis in terms of (M,R) systems,” J. Theor. Biol., vol. 238, no. 4, pp. 949–961, 2006.
[26] D. Ellerman, “Determination through universals: An application of category theory in the life sciences,” unpublished paper. [Online]. Available: https://arxiv.org/abs/1305.6958
[27] V. G. Paranjape, “Indian music and aesthetics,” J. Music Acad. Madras, vol. XXVIII, pp. 68–71, 1957.
[28] P. Lavezzoli, The Dawn of Indian Music in the West. New York, NY, USA: Bhairavi, Continuum, 2006.
[29] A. Bellur, V. Ishwar, X. Serra, and H. A. Murthy, “A knowledge based signal processing approach to tonic identification in Indian classical music,” in Proc. 2nd CompMusic Workshop, Istanbul, Turkey, 2012, pp. 113–116.
[30] G. K. Koduri and B. Indurkhya, “A behavioral study of emotions in South Indian classical music and its implications in music recommendation systems,” in Proc. ACM Workshop Soc. Adapt. Pers. Multimedia Interact. Access, Florence, Italy, 2010, pp. 55–60.
[31] P. Sarala, V. Ishwar, A. Bellur, and H. A. Murthy, “Applause identification and its relevance to archival of Carnatic music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 66–71.
[32] V. Ishwar, A. Bellur, and H. A. Murthy, “Motivic analysis and its relevance to raga identification in Carnatic music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 153–157.
[33] S. Gulati, J. Salamon, and X. Serra, “A two-stage approach for tonic identification in Indian art music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 119–127.
[34] G. K. Koduri, J. Serrá, and X. Serra, “Characterization of intonation in Karnataka music by parametrizing context-based Svara Distributions,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 128–132.
[35] J. C. Ross and P. Rao, “Detection of raga-characteristic phrases from Hindustani classical music audio,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 133–138.
[36] A. Vidwans, K. K. Ganguli, and P. Rao, “Classification of Indian classical vocal styles from melodic contours,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 139–146.
[37] B. Bozkurt, “Features for analysis of Makam music,” in Proc. Workshop Comput. Music, Istanbul, Turkey, Jul. 2012, pp. 61–65.
[38] A. Srinivasamurthy, S. Subramanian, T. Gregoire, and P. Chordia,
Comput. Elect. Eng., 2008, pp. 407–411.
“A beat tracking approach to complete description of rhythm in Indian
[13] L. Pesch, The Oxford Illustrated Companion to South Indian Classical
classical music,” in Proc. Workshop Comput. Music, Istanbul, Turkey,
Music. New Delhi, India: Oxford Univ. Press, 2009.
Jul. 2012, pp. 72–78.
[14] R. M. Rangayyan, “An introduction to the classical music of India,” [39] P. Sarala and H. A. Murthy, “Cent filter-banks and its relevance
Dept. Elect. Comput. Eng., Univ. Calgary, Calgary, AB, Canada. to identifying the main song in Carnatic music,” in Proc. Comput.
[15] T. M. Krishna, A Southern Music: The Karnatik Story, HarperCollins, Music Multidscipl. Res. (CMMR), Marseilles, France, Oct. 2013,
Noida, India, 2013. pp. 659–681.
[16] M. Andreatta, A. Ehresmann, R. Guitart, and G. Mazzola, “Towards [40] V. Ishwar, S. Dutta, A. Bellur, and H. A. Murthy, “Motif spot-
a categorical theory of creativity for music, discourse, and cogni- ting in an alapana in Carnatic music,” in Proc. Int. Soc. Music Inf.
tion,” in Mathematics and Computation in Music. Heidelberg, Germany: Retrieval (ISMIR), Curitiba, Brazil, Nov. 2013, pp. 499–504.
Springer, 2013, pp. 19–37. [41] A. Bellur and H. A. Murthy, “A novel application of group delay function
[17] E. Dennis-Jones and D. E. Rydeheard, “Categorical ML—Category- for identifying tonic in Carnatic music,” in Proc. EUSIPCO, Marrakesh,
theoretic modular programming,” Formal Aspects Comput., vol. 5, no. 4, Morocco, 2013, pp. 1–5.
pp. 337–366, 1993. [42] J. C. Ross, T. P. Vinutha, and P. Rao, “Detecting melodic motifs from
[18] K. Williamson, M. Healy, and R. Barker, “Industrial applications of audio for Hindustani classical music,” in Proc. Int. Soc. Music Inf.
software synthesis via category theory—Case studies using Specware,” Retrieval (ISMIR), Porto, Portugal, Oct. 2012, pp. 193–198.
Autom. Softw. Eng., vol. 8, no. 1, pp. 7–30, 2001. [43] P. Sarala and H. A. Murthy, “Inter and intra item segmentation of
[19] R. Tate, M. Stepp, and S. Lerner, “Generating compiler optimizations continuous audio recordings of Carnatic music for archival,” in Proc.
from proofs,” ACM SIGPLAN Notices, vol. 45, no. 1, pp. 389–402, Int. Soc. Music Inf. Retrieval (ISMIR), Curitiba, Brazil, Nov. 2013,
2010. pp. 487–492.
[20] J. C. Reynolds, “Using category theory to design implicit conversions [44] J. Kuriakose, J. C. Kumar, P. Sarala, H. A. Murthy, and U. K. Sivaraman,
and generic operators,” in Semantics-Directed Compiler Generation. “Akshara transcription of mrudangam strokes in Carnatic music,” in
New York, NY, USA: Springer, 1980, pp. 211–258. Proc. 21st IEEE Nat. Conf. Commun. (NCC), Mumbai, India, 2015,
[21] Z. Diskin and T. Maibaum, “Category theory and model-driven engi- pp. 1–6.
neering: From formal semantics to design patterns and beyond,” [45] CUED. (2002). HTK Speech Recognition Toolkit. [Online]. Available:
in Model-Driven Engineering of Information Systems: Principles, http://htk.eng.cam.ac.uk
Techniques, and Practice. Toronto, ON, Canada: Apple Acad. Press, [46] B. Logan, “Mel frequency cepstral coefficients for music modeling,” in
2014, p. 173. Proc. Int. Symp. Music Inf. Retrieval, Plymouth, MA, USA, 2000.
[22] T. Giesa, D. I. Spivak, and M. J. Buehler, “Category theory based solu- [47] M.-L. Zhang and Z.-H. Zhou, “ML-KNN: A lazy learning approach to
tion for the building block replacement problem in materials design,” multi-label learning,” Pattern Recognit., vol. 40, no. 7, pp. 2038–2048,
Adv. Eng. Mater., vol. 14, no. 9, pp. 810–817, 2012. 2007.
[23] D. I. Spivak, T. Giesa, E. Wood, and M. J. Buehler, “Category theoretic [48] D. Gerhard, “Pitch extraction and fundamental frequency: History and
analysis of hierarchical protein materials and social networks,” PLoS current techniques,” Dept. Comput. Sci., Univ. Regina, Regina, SK,
ONE, vol. 6, no. 9, pp. 1–15, Sep. 2011. Canada, Tech. Rep. TR-CS 2003-6, 2003.

[49] B. Chandrasekaran, J. R. Josephson, and V. R. Benjamins, “What are ontologies, and why do we need them?” IEEE Intell. Syst., vol. 14, no. 1, pp. 20–26, Jan. 1999.
[50] P. M. Simons, Parts: A Study in Ontology. Oxford, U.K.: Oxford Univ. Press, 1987.
[51] D. Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, 2nd ed. Heidelberg, Germany: Springer-Verlag, 2003.
[52] D. Fensel, “Ontology-based knowledge management,” Computer, vol. 35, no. 11, pp. 56–59, 2002.
[53] G. Antoniou and F. Van Harmelen, “Web ontology language: Owl,” in Handbook on Ontologies. Heidelberg, Germany: Springer, 2004, pp. 67–92.
[54] G. Antoniou, E. Franconi, and F. Van Harmelen, “Introduction to semantic Web ontology languages,” in Reasoning Web. Heidelberg, Germany: Springer, 2005, pp. 1–21.
[55] D. L. McGuinness and F. Van Harmelen, “Owl Web ontology language overview,” W3C Recommendation, vol. 10, no. 10, 2004. [Online]. Available: https://www.w3.org/TR/owl-features/
[56] S. Song, M. Kim, S. Rho, and E. Hwang, “Music ontology for mood and situation reasoning to support music retrieval and recommendation,” in Proc. 3rd Int. Conf. Digit. Soc. (ICDS), Cancún, Mexico, Feb. 2009, pp. 304–309.
[57] B. Fields, K. Page, D. De Roure, and T. Crawford, “The segment ontology: Bridging music-generic and domain-specific,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Barcelona, Spain, 2011, pp. 1–6.
[58] A. Gali, C. X. Chen, K. T. Claypool, and R. Uceda-Sosa, “From ontology to relational databases,” in Conceptual Modeling for Advanced Application Domains. Heidelberg, Germany: Springer, 2004, pp. 278–289.
[59] B. Motik, I. Horrocks, and U. Sattler, “Bridging the gap between owl and relational databases,” Web Semantics Sci. Services Agents World Wide Web, vol. 7, no. 2, pp. 74–89, 2009.
[60] A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, and L. Schneider, “Sweetening Ontologies with DOLCE,” in Proc. Int. Conf. Knowl. Eng. Knowl. Manag., Sigüenza, Spain, 2002, pp. 166–181.
[61] S. Eilenberg and S. MacLane, “General theory of natural equivalences,” Trans. Amer. Math. Soc., vol. 58, no. 2, pp. 231–294, 1945.
[62] R. Rosebrugh and R. J. Wood, “Relational databases and indexed categories,” in Proc. Int. Category Theory Meeting CMS Conf., vol. 13. 1992, pp. 391–407.
[63] M. Johnson and R. Rosebrugh, “Sketch data models, relational schema and data specifications,” Electron. Notes Theor. Comput. Sci., vol. 61, pp. 51–63, Jan. 2002.
[64] D. I. Spivak, Category Theory for the Sciences. Cambridge, MA, USA: MIT Press, 2014.
[65] H. Forssell, H. R. Gylterud, and D. I. Spivak, “Type theoretical databases,” in Proc. Int. Symp. Logic. Found. Comput. Sci., 2016, pp. 117–129.
[66] Z. Diskin, “Generalized sketches as an algebraic graph-based framework for semantic modeling and database design,” Faculty Phys. Math., Univ. Latvia, Riga, Latvia, Tech. Rep. M-97, 1997.
[67] Z. Diskin and U. Wolter, “A diagrammatic logic for object-oriented visual modeling,” Electron. Notes Theor. Comput. Sci., vol. 203, no. 6, pp. 19–41, 2008.
[68] A. Rutle, A. Rossini, Y. Lamo, and U. Wolter, “A category-theoretical approach to the formalisation of version control in MDE,” in Proc. Int. Conf. Fundam. Approaches Softw. Eng., York, U.K., 2009, pp. 64–78.
[69] Z. Diskin, “Mathematics of UML,” in Practical Foundations of Business System Specifications. Amsterdam, The Netherlands: Springer, 2003, pp. 145–178.
[70] S. Padi, S. Breiner, E. Subrahmanian, and R. D. Sriram, “Category theoretical approaches for deconstructing the semantics of UML class diagrams,” in preparation.
[71] I. Cafezeiro and E. H. Haeusler, “Semantic interoperability via category theory,” in Proc. 26th Int. Conf. Conceptual Model. Tuts. Posters Panels Ind. Contribut. (ER), vol. 83. Auckland, New Zealand, 2007, pp. 197–202.
[72] F. McNeill, A. Bundy, and C. Walton, “Diagnosing and repairing ontological mismatches,” in Proc. Stairs 2nd Starting Ai Res. Symp., vol. 109. Amsterdam, The Netherlands, 2004, p. 241.
[73] F. McNeill, “Dynamic ontology refinement,” Ph.D. dissertation, Div. Informat., Univ. at Edinburgh, Edinburgh, U.K., 2006.
[74] M. Barr and C. Wells, Eds., Category Theory for Computing Science, 2nd ed. Hertfordshire, U.K.: Prentice-Hall, 1995.
[75] S. Mac Lane, Categories for the Working Mathematician (Graduate Texts in Mathematics), vol. 5. New York, NY, USA: Springer-Verlag, 1971.

Sarala Padi received the Ph.D. degree from the Indian Institute of Technology Madras, Chennai, India, in 2014.
She is currently a Guest Researcher with the National Institute of Standards and Technology, Gaithersburg, MD, USA. Her areas of specialization are machine learning and signal processing techniques for efficient music information retrieval and automatic indexing, text mining, and NLP methods for processing text documents for various applications. Her current research interests include applying category theory to modeling and knowledge representation, and applying deep learning models to medical image analysis.

Spencer Breiner received the master’s and Ph.D. degrees from Carnegie Mellon University, Pittsburgh, PA, USA, in 2010 and 2013, respectively.
He joined the U.S. National Institute of Standards and Technology, Gaithersburg, MD, USA, in 2015 under a Post-Doctoral Grant from the National Research Council. His current research interests include applications of the mathematical field of category theory to real-world problems.

Eswaran Subrahmanian received the Ph.D. degree from Carnegie Mellon University, Pittsburgh, PA, USA, in 1987.
He is a Research Professor with the Institute for Complex Engineered Systems and Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA, USA. He has published over 100 refereed papers and co-edited three books on Empirical Studies in Design, Design Engineering, and Organizational Implications of Knowledge Management. He also co-authored a book setting the global research agenda for ICT in Development. He has also co-edited three special issues in the areas of design theory, engineering informatics, and annotations in engineering design. His current research interests include design theory, design support systems, information modeling for engineering, collaborative engineering, and engineering education.
Dr. Subrahmanian was a recipient of the Steven Fenves Award for Systems Engineering at CMU. He is a Distinguished Scientist of the Association for Computing Machinery, a Fellow of the American Association for the Advancement of Science, and a member of the Design Society.

Ram D. Sriram (S’82–M’85–SM’00–F’17) was on the engineering faculty at the Massachusetts Institute of Technology, Cambridge, MA, USA, from 1986 to 1994 and was instrumental in setting up the Intelligent Engineering Systems Laboratory.
He is currently Division Chief of the Software and Systems Division with the National Institute of Standards and Technology, Gaithersburg, MD, USA. He is a Distinguished Alumnus of the Indian Institute of Technology and Carnegie Mellon University, Pittsburgh, PA, USA. He has authored or co-authored nearly 250 papers, books, and reports. His current research interests include developing knowledge-based expert systems, natural language interfaces, machine learning, object-oriented software development, life-cycle product and process models, geometrical modelers, object-oriented databases for industrial applications, health care informatics, bioinformatics, and bioimaging.
Dr. Sriram was a recipient of the NSF’s Presidential Young Investigator Award in 1989, the ASME Design Automation Award in 2011, the ASME CIE Distinguished Service Award in 2014, and the Washington Academy of Sciences’ Distinguished Career in Engineering Sciences Award in 2015.
