Abstract

Recommender systems, which recommend appropriate information to users from an enormous amount of information, are becoming popular. There are two methods to realize recommender systems. One is content-based filtering, and the other is collaborative filtering. Many systems using the former method deal with text data, and few systems deal with music data. This paper proposes a content-based filtering system that targets music data in MIDI format. First, we analyze characteristics of feature parameters of music data in MIDI format. Then we propose a filtering method based on these feature parameters. Finally, we build a prototype system with standard Internet technology.

1 Introduction

With the development of the Internet, the quantity of information is explosively increasing. In this situation, users need to acquire appropriate information from the huge network of information. On the other hand, information providers need to provide appropriate information depending on their customers. Providing customers with appropriate information is expected as one of the functions of CRM (Customer Relationship Management)[12], which has recently been emphasized in e-business[1].

In the early days of the Internet, most of the data were text because bandwidth was limited. Recently, however, major portal sites such as amazon.com and Yahoo! have begun to distribute music data in formats like MP3 and RealPlayer. Therefore, we expect the above-mentioned needs of users and information providers to increase.

Recommender systems have been studied to support users in acquiring information[9][7]. Recommender systems provide users with appropriate information that meets their preferences or interests from an enormous amount of information. There are two methods for realizing recommender systems. One is content-based filtering, and the other is collaborative filtering. Many systems using the former method deal with text data, and few systems deal with music data.

We construct a content-based filtering system for music data. The target data of our system are not written in MP3 or RealPlayer formats, which record the audio wave, but in MIDI data, which record note data. We aim to validate how effectively content-based filtering works for music data. The music data used in our system are popular music, because popular music is the most widely distributed in the market. The filtering method is as follows. The system extracts feature parameters both from music data that the user has already rated in a questionnaire and from target music data for recommendation. The system compares the two sets of feature parameters by using a classification algorithm (concretely, a decision tree). Then, the system recommends music data that the user will like. The system will reduce the users' burden in acquiring music data.

The paper is organized as follows. Section 2 describes the approach of our research and Section 3 reviews related work. Section 4 explains the feature parameters used in our filtering system. Section 5 explains our filtering method based on these feature parameters. Finally, in Section 6, we offer a summary and discuss future work.

2 Outline of Research

Section 2.1 presents the overview of our filtering method. Section 2.2 explains the procedure of our research to construct the filtering system.

2.1 Overview of Our Filtering System

We use MIDI data as music data. Compared with audio-wave data, MIDI data allow us to easily extract many features of music because MIDI data are note data. Although MIDI data are used in our system, the system can recommend not only MIDI data but also MP3 or RealPlayer data: if the system can specify the name of the music data, it can recommend the MP3 or RealPlayer data corresponding to that name.
Proceedings of the 2004 International Symposium on Applications and the Internet Workshops (SAINTW’04)
0-7695-2050-2/04 $20.00 © 2004 IEEE
• Step 4: Evaluation of the filtering system.
We evaluate the system by precision (the percentage
of the user’s favorite music data out of the music data
recommended by the system) and recall (the percent-
age of the music data recommended by the system out
of the user’s favorite music data).
3 Related Work
in both the pairs of attribute and attribute value and time-series patterns. In music information retrieval systems, the purpose is to output music data that match a search query given by a user. Therefore, their feature parameters are not only pairs of attribute and attribute value but also time-series patterns. For example, in the case that a user inputs a hum by using a microphone, the system expresses the phrase input by the user in the form of a time-series pattern and outputs music data which have the same pattern. However, information filtering systems select music data that match a user profile. The system must estimate which kinds of music the user generally prefers from the set of the user's favorite songs. In order to estimate these tastes in music, the system needs feature parameters which express not the features of a local part of the music but the features of the whole music. Therefore, this section surveys music data by focusing on feature parameters which are represented in the form of pairs of attribute and attribute value.

There are two kinds of feature parameters in the form of pairs of attribute and attribute value. One kind is extracted in every channel (CH) and the other is common to all CHs. Our system extracts feature parameters from the entire music, the melody CH, the chord CH, the bass CH and the drum CH. This is because our research targets pop music, which mainly consists of the above four parts.

4.2 Candidates of Feature Parameters

From the feature parameters used in existing music information retrieval systems, we select tempo, tonality, rhythm, tone, meter, pitch, the difference of pitch and the difference of duration as candidates of feature parameters. They are used by many researchers such as Chen[2], Kurose[6], Ikezoe[4], Satou[10], Kumamoto[5] and Clausen[3]. Note that the average difference of pitch is calculated from the pitches of each note and its next note, and that the average difference of duration is calculated from the durations of each note and its next note. There are some feature parameters that are not used in the existing music information retrieval systems but that are likely to express features of music data. From such feature parameters, we select the number of changing tonality, the percentage of each chord type and the key. This is because they can be extracted from many music data with ease and they do not depend on the writers' ability to create MIDI data (writers with high technique can create MIDI data with various added effects; our research uses features which are found in any music data, created not only by expert writers but also by amateur writers). Table 1 shows the candidates of feature parameters used in our filtering system.

Table 1. Candidates of Feature Parameters

  the whole music: 1. meter, 2. tonality, 3. the number of changing tonality, 4. average tempo, 5. rhythm, 6. the percentage of major chord, 7. the percentage of minor chord, 8. the percentage of sus4 chord, 9. key
  each CH: 10. tone, 11. average pitch, 12. average difference of pitch, 13. average duration, 14. average difference of duration

4.3 Extraction of Feature Parameters

In Table 1, Feature parameters 1 to 4 and Feature parameters 10 to 12 are written explicitly in MIDI data. Feature parameters 13 and 14 are extracted by using the note-on event, the note-off event and the delta time between the two events. Feature parameter 5 is extracted by the method that Ikezoe[4] uses. The extraction of Feature parameters 6 to 9 requires the estimation of the chord. The chord type is estimated in each half measure (the chord of popular music rarely changes within less than a half measure). Our estimation method of the chord type is as follows. First, all notes in a half measure are collected and a root note is calculated from the collected notes. After that, the chord type is estimated from the relationship between the root note and the other notes. The key is estimated based on the root notes.

4.4 Estimation of CH

In order to acquire feature parameters in each CH, the system has to identify each CH. The drum CH, bass CH and chord CH are estimated easily with nearly 100% accuracy. However, it is not easy to estimate the melody CH with high accuracy. Therefore, we surveyed the features of the melody CH. From the result of this survey, we estimate the melody CH by using the average difference of pitch, the number of notes performed at the same time and the CH number (see the details in the appendix). We conducted an experiment using another 50 MIDI data to see whether or not the melody CH is correctly estimated. As a result, this method estimates the melody CH with 94% accuracy.

4.5 Decision of Feature Parameters

Using 30 MIDI data, we validate how effectively the feature parameters explained in Section 4.2 classify music data. Figure 2 shows the results of the dispersion of the feature parameters.
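The per-half-measure chord-type estimation of Section 4.3 can be sketched as follows. The paper does not specify how the root note is calculated or which intervals define each chord type, so the lowest-note root heuristic and the interval tests below are illustrative assumptions, not the authors' method:

```python
# Illustrative sketch of chord-type estimation for one half measure.
# Assumption: the lowest sounding note is taken as the root; chord types
# are matched by standard triad intervals relative to that root.

MAJOR, MINOR, SUS4 = "major", "minor", "sus4"

def estimate_chord(pitches):
    """pitches: MIDI note numbers sounding within one half measure."""
    classes = {p % 12 for p in pitches}       # pitch classes, octave-folded
    if not classes:
        return None
    root = min(pitches) % 12                  # assumed root: lowest note
    intervals = {(c - root) % 12 for c in classes}
    if {4, 7} <= intervals:
        return MAJOR                          # major third + perfect fifth
    if {3, 7} <= intervals:
        return MINOR                          # minor third + perfect fifth
    if {5, 7} <= intervals:
        return SUS4                           # perfect fourth + perfect fifth
    return None                               # no match: leave undetermined

print(estimate_chord([60, 64, 67]))           # C-E-G
```

The percentages in Feature parameters 6 to 8 would then be the fraction of half measures labeled with each chord type.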
Figure 2. Dispersion of Feature Parameters
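Feature parameters 11 to 14 of Table 1 can be computed from note events. A small sketch, assuming the note-on/note-off events have already been decoded into (pitch, onset, offset) triples for one CH; the numbers in the example are invented:

```python
# Sketch of per-CH Feature parameters 11-14 (Table 1, Section 4.3).
# Differences are taken between each note and its next note, as in Section 4.2.

def channel_features(notes):
    """notes: list of (pitch, onset, offset) for one CH, in onset order."""
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    pitches = [p for p, _, _ in notes]
    durations = [off - on for _, on, off in notes]
    pitch_diffs = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    dur_diffs = [abs(b - a) for a, b in zip(durations, durations[1:])]
    return {
        "average pitch": mean(pitches),                     # feature 11
        "average difference of pitch": mean(pitch_diffs),   # feature 12
        "average duration": mean(durations),                # feature 13
        "average difference of duration": mean(dur_diffs),  # feature 14
    }

feats = channel_features([(60, 0.0, 0.5), (64, 0.5, 1.0), (67, 1.0, 1.5)])
```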
Generally, there are two types of classification algorithms. One type expresses the feature parameters as the components of a vector and classifies data based on the distance between vectors. The other type focuses on each feature parameter individually; the decision tree used in our system is of this type.

5.2 User Profile and Content Model

The user profile in our system is a decision tree learned for each user in advance. The user profile is made as follows. The user rates music data presented by the system on a scale of one to three: "like", "neutral" and "dislike". Rating on any discrete scale, such as a 5-point or 7-point scale, is possible, but a decision tree cannot deal with graded ratings. Therefore, we represent the users' preferences without grades. The system learns from the user's ratings and builds a decision tree. The nodes of the decision tree hold conditions on the feature parameters. The leaves of the decision tree hold a class that expresses "like", "neutral" or "dislike".

The content model is made for each piece of music data. The content model consists of the feature parameters extracted automatically from the music data by the system.

Filtering is carried out by comparing the user profile with the content model of the target music data. If, as the result of the comparison, the music data fall into the class representing "like", the system recommends them to the user. If the music data fall into the class representing "dislike" or "neutral", the system does not recommend them. Furthermore, the user rates the music data recommended by the system, and the system updates the user profile based on these ratings.

5.3 Prototype System

Our prototype system with the above filtering method, called "C-base MR", is implemented as a Java servlet. Figure 3 shows our system architecture. The system consists of a user interface layer, a servlet layer and a database layer.

Figure 3. System Architecture of C-base MR

The process flow of the system is as follows. First, the user inputs his/her name and selects a menu in the user interface. Then, the request manager in the servlet gets the user's name and the selected menu. In the case that the user selects "questionnaire", the questionnaire module makes a questionnaire page in HTML. The questionnaire module sends the HTML file to the user interface, and the user interface displays the questionnaire page (Figure 4 (a)). When the user answers the questionnaire, the user's answers are sent to the rating database. The rating database stores users' ratings of music data, which are used to build decision trees.

In the case that the user selects "recommendation", the decision tree module builds a user profile using the user's ratings in the rating database and the feature parameters of the music data rated by the user in the feature parameter database. Then, the comparison module compares the feature parameters of the target music data with the user profile and selects music data for recommendation. The recommendation module makes a recommendation page in HTML. The recommendation module sends the HTML file to the user interface, and the user interface displays the recommendation page (Figure 4 (b)). The user rates the music data recommended by the system, and this leads to the update of his/her user profile. The extraction module extracts feature parameters from the MIDI data in the MIDI database offline.

Figure 4. Sample Recommendation of C-base MR
Figure 4 shows a sample recommendation of our system. A user rates music data in the questionnaire page (Figure 4 (a)). This user tends to rate music data with a fast tempo as "like". A decision tree which reflects the user's ratings is constructed (Figure 4 (c)); Figure 5 shows a simplified version of Figure 4 (c). The rule in the root node reflects the most remarkable feature of the user's ratings, namely that the tempo is fast. Using the decision tree, our system recommends music data with a fast tempo to the user (Figure 4 (b)).
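The user profile described in Section 5.2 pairs conditions on feature parameters with class leaves. A minimal sketch of such a tree and its traversal; the features, thresholds and ratings below are invented for illustration, and a real profile would be learned from the user's questionnaire ratings:

```python
# Sketch of the Section 5.2 user profile: internal nodes hold a condition
# on one feature parameter, leaves hold a class. Thresholds are invented.

def leaf(label):
    return {"class": label}

def node(feature, threshold, left, right):
    # left branch: feature value <= threshold; right branch: value > threshold
    return {"feature": feature, "threshold": threshold,
            "left": left, "right": right}

# Toy profile for a user who likes fast songs and dislikes slow, minor-heavy ones.
profile = node("average tempo", 120,
               node("percentage of minor chord", 0.5,
                    leaf("neutral"), leaf("dislike")),
               leaf("like"))

def classify(tree, song):
    """song: dict of feature-parameter values (the content model)."""
    while "class" not in tree:
        branch = "left" if song[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["class"]

song = {"average tempo": 160, "percentage of minor chord": 0.2}
print(classify(profile, song))  # a fast song falls into the "like" leaf
```

Only songs classified as "like" would then be recommended.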
We sorted all CHs except the drum CH, bass CH and chord CH in descending order of the value of each feature. Then, we examined the rank of the melody CH for each feature. Figure 6 depicts the result.

The melody CH ranks first or second in the number of notes performed at the same time, and ranks within the top three in the average difference of pitch, for most of the music data. Furthermore, the CH number of the melody CH is relatively small. Therefore, to estimate the melody CH, we target the CHs that rank within the top two in the number of notes performed at the same time and within the top three in the average difference of pitch. We then estimate the melody CH by using the CH's rank for the above two features and the CH number.

A.2 Estimation Method of Melody CH

Figure 7. Flow of Estimation for Melody CH

Figure 7 depicts the flow of the estimation of the melody CH. First, CHs whose performance time is short are eliminated, because those CHs play a decorative role in the music. Second, the drum CH, bass CH and chord CH are estimated and eliminated from the rest of the CHs. Then, the CHs which rank first or second (A = 1st, B = 2nd) in the number of notes performed at the same time are extracted. In the same way, the CHs which rank within the top three (a = 1st, b = 2nd, c = 3rd) in the average difference of pitch are extracted. After that, the number of CHs common to (A, B) and (a, b, c) is counted. In the case that there is only one common CH, that CH is selected as the melody CH. In the case that there are two common CHs, the flow of estimation is as follows. If A equals a (i.e., the CH ranks first in both parameters), that CH is selected as the melody CH, because the CH ranking first in both parameters is the most likely to be the melody CH. If B equals c (i.e., the CH ranks second in the number of notes performed at the same time and third in the average difference of pitch), that CH is eliminated and the other common CH is selected as the melody CH, because such a CH is the least likely to be the melody CH. Otherwise, the CH with the smaller CH number is selected as the melody CH. In the case that there are no common CHs between the two parameters, the system gives up the estimation.
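The decision flow above can be sketched as follows. The per-CH statistics are assumed to be precomputed, with short, drum, bass and chord CHs already removed, and at least two CHs remaining; all values in the example are invented:

```python
# Sketch of the melody-CH estimation flow of Appendix A.2.

def estimate_melody_ch(simultaneous_notes, pitch_diff):
    """simultaneous_notes / pitch_diff: dicts mapping CH number to the
    feature value for that CH; CHs are ranked in descending order."""
    by_notes = sorted(simultaneous_notes, key=simultaneous_notes.get, reverse=True)
    by_pitch = sorted(pitch_diff, key=pitch_diff.get, reverse=True)
    A, B = by_notes[:2]                      # ranks 1-2 in simultaneous notes
    top_pitch = by_pitch[:3]                 # ranks 1-3 in pitch difference
    common = [ch for ch in (A, B) if ch in top_pitch]
    if len(common) == 1:
        return common[0]                     # only one common CH: select it
    if len(common) == 2:                     # the common CHs are exactly A and B
        if A == by_pitch[0]:
            return A                         # A is first in both parameters
        if len(by_pitch) >= 3 and B == by_pitch[2]:
            return A                         # B is 2nd in notes and 3rd in pitch:
                                             # least likely, so keep the other CH
        return min(A, B)                     # otherwise: smaller CH number wins
    return None                              # no common CH: give up
```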