Applying Semantic Web Technologies for Enriching Master Classes
Paloma de Juan, Depto. Ing. Sistemas Telemáticos, Univ. Politécnica de Madrid, paloko@gsi.dit.upm.es
José C. González, Depto. Ing. Sistemas Telemáticos, Univ. Politécnica de Madrid, jcg@gsi.dit.upm.es
Carlos A. Iglesias, Germinus XXI (Grupo Gesfor), cif@germinus.com
Abstract
This article presents how semantic web technologies have been applied to enrich existing contents within the SEMUSICI project. The SEMUSICI project aims to investigate how semantic web technologies can be applied to digital libraries, and how this can improve searchability and accessibility. The project takes the results of the eContent project HARMOS, which defined a musical taxonomy for cataloguing master classes, and proposes a methodology for evolving this taxonomy into an ontology and migrating the contents accordingly.
1. Introduction
Cataloguing standards such as MODS [1], MARC [13] or Dublin Core [14] define metadata following a flat property-value orientation, which provides textual search capabilities. In some contexts, such as musical digital libraries, this approach is too narrow, since some of the metadata are entities themselves, such as compositions or composers. In the Harmos project (section 2) an object-oriented taxonomy was defined, where some of the values, such as compositions, movements or composers, were modeled as entity objects, and an advanced search system based on these properties was developed and is available at [17]. This article presents an evolution of this approach, where semantic technology is used for modeling the relationships of the domain model. The main advantage of this approach is its powerful retrieval and inferential capabilities.

The rest of the article is organized as follows. Sections 2 and 3 give an overview of the Harmos and Semusici projects, respectively, which constitute the context of this research. Section 4 describes a generic methodology for transforming taxonomies into ontologies. This is the main contribution of the SEMUSICI project for content enrichment. Finally, section 7 draws the main conclusions of the article and outlines future work.
2. The HARMOS project
The European eContent HARMOS project [16] aimed to provide access over the Internet to videos of master classes given by great maestros. HARMOS has produced a collection of audiovisual contents that belong to the musical heritage, with education as the principal focus and the project's main objective.

Harmos defined a pedagogical taxonomy [15] which aims to cover the whole spectrum of musical practice and teaching, focusing on pedagogical aspects. The potential semantic descriptors of this taxonomy were structured around three main concepts, the music, the musician and the musical expression, with more than 400 descriptors as detailed in [15]. More than 700 audiovisual hours of recorded master classes have been catalogued according to this taxonomy.
3. The SEMUSICI Project
The SEMUSICI project [18] aims to evolve the results of the Harmos project by introducing semantic web technologies. The Harmos system provides several retrieval facilities, which allow finding a master class based on the previous selections of the user, such as a composer, a composition, a movement, a teacher that has explained this composition, etc., as described in [15]. Introducing new retrieval possibilities required extending the database model and a huge investment in developing the new queries, which had to be tuned and optimised given the large volume of the database. The use of semantic web technology, which allows properties and relationships to be extended easily with new predicates, is expected to make this feasible. In addition, semantic web technology can help improve the quality of metadata, since it can help check the consistency of the cataloguing.

The inclusion of semantic web technology raises several challenges. Firstly, an ontology must be defined that contains the concepts of the Harmos taxonomy. Secondly, since musical analysts should not need to be aware of the use of semantic web technology for cataloguing, easy interfaces must be developed for semantic cataloguing. Thirdly, the whole catalogued Harmos multimedia collection must be migrated to the new semantic schema. Finally, the current status of semantic web technology must be evaluated in terms of throughput and performance, given the size of the multimedia collection. This article covers the first objective.
4. Methodology for transforming taxonomies into ontologies
The central aim of Semusici is to provide a semantic structure that fits the former concepts taxonomy. The purpose of this new approach is to gather information about the relationships between disjoint leaves and build a new representation of both the concepts and these relationships. This leads to a richer representation of the knowledge that is really associated with the digital audiovisual items. This requires a deep understanding of the subject domain. There is no definitive methodology for this task, but a general process together with best practices has been proposed.

Once the problem has been well defined and all the requirements have been identified, a suitable structure has to be chosen. As we are looking to take advantage of the technologies of the semantic web, the most appropriate structure is an ontology. We have chosen this structure because it provides a formal way to represent roles and their corresponding relations in a specific domain. By placing a concept in such a structure, we are stating that it has certain properties and satisfies some restrictions on its meaning. In other words, each leaf of an ontology represents the definition of a certain resource.

The main difference between an ontology and a taxonomy is the kind of structure on which each of them is based. A taxonomy can be represented as a tree where each leaf is a class. No connections are allowed between disjoint branches. Relations between classes can only be established between a concept and its direct children, so an instance of a certain class can be defined as “a kind of” its parent class. An ontology is a graph in which richer definitions can be expressed through a more extensive set of relations. This means that any class can be defined in terms of any other resource that is connected to it, not necessarily its parent or child. Therefore ontologies can store more semantic information than taxonomies, allowing us to infer undeclared knowledge by studying the relations and restrictions of a certain class.
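The contrast between the two structures can be illustrated with a minimal Python sketch. It is not the Semusici implementation: the concept names, the predicates subClassOf and appliesTo, and the single inference rule are assumptions chosen only to show how a graph of relations allows undeclared knowledge to be derived, while a tree only offers parent/child links.

```python
# Taxonomy: a tree in which each concept has at most one parent ("a kind of").
# All concept names here are illustrative assumptions.
TAXONOMY_PARENT = {
    "Vibrato": "Strings technique",
    "Strings technique": "Strings",
    "Strings": "Music",
}


def ancestors(concept):
    """Walk up the tree; parent/child is the only relation available."""
    chain = []
    while concept in TAXONOMY_PARENT:
        concept = TAXONOMY_PARENT[concept]
        chain.append(concept)
    return chain


# Ontology: a graph stored as (subject, predicate, object) triples.
# The predicates are hypothetical, chosen for this example only.
ONTOLOGY = {
    ("Vibrato", "subClassOf", "Technique"),
    ("Vibrato", "appliesTo", "Violin"),
    ("Violin", "subClassOf", "StringInstrument"),
}


def infer_applies_to(triples):
    """Assumed rule: X appliesTo Y and Y subClassOf Z => X appliesTo Z.

    Returns only the triples that were not declared explicitly.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p, y) in list(known):
            if p != "appliesTo":
                continue
            for (y2, q, z) in list(known):
                candidate = (x, "appliesTo", z)
                if q == "subClassOf" and y2 == y and candidate not in known:
                    known.add(candidate)
                    changed = True
    return known - set(triples)


print(ancestors("Vibrato"))
print(infer_applies_to(ONTOLOGY))
```

In the tree, Vibrato can only be reached through its chain of parents. In the graph, the same concept is linked to an instrument, and the relation to the broader instrument class follows from the rule without having been declared.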
5.1. First step: choosing the appropriate tools
There is a wide variety of tools available to create, edit, browse and store ontologies. There are also many inference engines or reasoners, which are very important for obtaining knowledge from the ontology. Several tools have been examined in order to choose the most suitable framework for our purposes, among them Protégé [2], RacerPro [3], Sesame [4], SWOOP [5] and WebODE [6]. A survey was carried out in order to find distinctive features: eleven parameters were chosen and thirteen tools were evaluated according to these key features. Some of these parameters were the supported languages, consistency check support, availability and maintenance. As a result, Protégé and Sesame were chosen.

All these tools support a number of languages. Choosing the right language to implement an ontology is probably the most important step in the process. This depends on how thorough the ontology is intended to be. For Semusici, our initial choice was RDFS, as it is the main language in Sesame. It proved to be complete enough to allow the building of a basic version of the ontology. Later we decided to include some restrictions to enforce the definition of the elements that we had already defined. These restrictions were also intended to help us perform consistency checks when adding new contents. For that purpose, new OWL statements were added.
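The role of such restrictions in consistency checking can be sketched as follows. This is a closed-world validation written in plain Python, so it does not reproduce OWL semantics or the behaviour of any reasoner. The class name Recording, the property hasConcept and the rule that every recording needs at least one concept are assumptions made for illustration.

```python
def violations(instances, target_class, required_property):
    """Return the ids of instances of target_class with no value for
    required_property (a hypothetical 'at least one value' restriction)."""
    return sorted(
        instance_id
        for instance_id, data in instances.items()
        if data["type"] == target_class and not data.get(required_property)
    )


# Illustrative instances; identifiers and values are invented.
INSTANCES = {
    "rec1": {"type": "Recording", "hasConcept": ["Vibrato"]},
    "rec2": {"type": "Recording", "hasConcept": []},
    "comp1": {"type": "Composer"},
}

print(violations(INSTANCES, "Recording", "hasConcept"))
```

A check of this kind could be run whenever new content is added, so that incompletely catalogued items are reported before they enter the collection.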
5.2. Semusici knowledge base
There are two distinct parts in the knowledge base that is to be represented by the ontology. One is intended to capture all the information that is not directly related to the collection and can be useful to locate a recording. The aim of this is to answer any query that is not directly related to the contents of the recording itself. For instance, “give me all the recordings related to composers born in the 18th century”.

The other part of the knowledge base is the concepts taxonomy. The features of this structure have already been discussed. This taxonomy contains over 200 pedagogical concepts that are used as tags to describe the recordings. In the process of cataloguing the content, these recordings are to be labelled according to semantic descriptors that are part of this taxonomy.

The semantic descriptors were defined according to a tree diagram of concepts. This was based on three large branches that served as a starting point: the musician, music and musical expression. Each one of the divisions that structured the tree diagram of concepts was joined to one of these large branches. The smaller branches were then organized according to a series of categories, reaching, in the end, a didactic concept.
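The example query about composers born in the 18th century can be sketched in Python once composers are modeled as entities with their own properties. The data model, the recording titles and the function names below are assumptions for illustration. In the actual system such a query would be posed against the ontology store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Composer:
    name: str
    birth_year: int


@dataclass(frozen=True)
class Recording:
    title: str              # hypothetical titles, not catalogue entries
    composers: tuple = ()


def born_in_century(composer, century):
    """The n-th century spans the years (n - 1) * 100 + 1 to n * 100."""
    first_year = (century - 1) * 100 + 1
    return first_year <= composer.birth_year <= century * 100


def recordings_by_birth_century(recordings, century):
    """Titles of recordings related to a composer born in the given century."""
    return [
        r.title
        for r in recordings
        if any(born_in_century(c, century) for c in r.composers)
    ]


mozart = Composer("Wolfgang Amadeus Mozart", 1756)
beethoven = Composer("Ludwig van Beethoven", 1770)
brahms = Composer("Johannes Brahms", 1833)

RECORDINGS = [
    Recording("Master class A", (mozart,)),
    Recording("Master class B", (brahms,)),
    Recording("Master class C", (brahms, beethoven)),
]

print(recordings_by_birth_century(RECORDINGS, 18))
```

The birth year is a property of the composer rather than of the recording, which is why a flat, text-oriented metadata record would not answer this query directly.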
5.3. Building the ontology
This first ontology had to be built from scratch, as most of the concepts it should represent were new. Following a methodology is strongly recommended for this task. The goal of using a methodology is to avoid losing information when transferring knowledge between the different actors that take part in the process. It also provides a set of steps to follow in order to avoid inconsistency, which would lead to undesirable rework. The quality of the ontology will be strongly affected by the choice of an appropriate methodology [7].

There is no single generic ontology-design methodology [8] that covers all kinds of applications. This means that there is neither a standard way to build an ontology [9] nor a standard mechanism to evaluate a methodology. However, all published methodologies have proven to be useful, as they have all been applied to some process at least once. The key to finding the best guidelines for a certain application is to analyze the purpose those methodologies were used for and find similarities between that purpose and our application. This could be viewed as a way of reusing knowledge. Reuse is a very common practice in ontology engineering.

There are some steps that are common to almost every methodology. The first step is to identify the purpose and scope of the ontology. Both of them have already been mentioned in this document. Next, one must find out which questions the ontology is supposed to answer. These are called competency questions [10]. We gathered a list of over 50 questions and identified keywords that would later become part of the terminology of the ontology.

The next step was to decide which of these keywords should be represented as classes, attributes and instances. The most important thing to consider at this point is how specific we want our ontology to be. Thus we chose those concepts that we found to need a precise definition and separated them from those that constituted the most specific level of the ontology. We also considered reusing some published ontology but finally decided to define our own vocabulary.
5.4. From the concepts taxonomy to an ontology
The first step to turn this taxonomy into an ontology was to create a root class called Concept. Every instance of this class is assigned a concept name. This name is the same as the corresponding tag used to classify the digital recordings. Although the original taxonomy was divided into three main branches, we decided to create a first level of more specific classes. We intended to group concepts that had basic semantic features in common in order to make it easy to define relations between different classes.

The original classification grouped most of the concepts according to the instrument they referred to. For instance, every technique that is related to a string instrument is placed in the subcategory Strings technique, child of Strings. We decided to create a main category, called Technique, to group all the specific technique-related concepts, given that we cannot consider that a technique “is-a” String. Thus we could establish that every instance of a subclass of Technique should be related to some type of instrument.

We followed these same criteria to create the main categories and build the first level of our ontology. We also defined some properties, such as “relatedTo”, “partOf” and “elementOf”. The first one was defined as a symmetric property and was meant to connect concepts that could be interesting to the same users. For instance, a user who searches for a lesson about hammers will probably be interested in videos about keyboards too.

Both “partOf” and “elementOf” are transitive properties. This means that if a first concept is part/element of a second one and this one is part/element of a third one, we can state that the first concept is also part/element of the last one. The difference between them is that if concept A is part of concept B, every instance of B has A (e.g. the frog is part of the bow, because every bow has a part called frog). However, if concept A is element of concept B, only some instances of B have A (e.g. the reed is element of the embouchure, because there are wind instruments that have no reed in their embouchure).
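The effect of declaring a property symmetric or transitive can be sketched in Python by computing the corresponding closure over a set of pairs. The function names are our own, and the pair (Screw, Frog) is an assumption added only to give the transitive rule something to chain. The pairs (Hammers, Keyboards) and (Frog, Bow) come from the examples in the text.

```python
def symmetric_closure(pairs):
    """If (a, b) holds for a symmetric property, (b, a) holds as well."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitive_closure(pairs):
    """If (a, b) and (b, c) hold for a transitive property, so does (a, c)."""
    closure = set(pairs)
    while True:
        new_pairs = {
            (a, d) for a, b in closure for c, d in closure if b == c
        } - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


RELATED_TO = {("Hammers", "Keyboards")}
# ("Screw", "Frog") is a hypothetical statement added for illustration.
PART_OF = {("Screw", "Frog"), ("Frog", "Bow")}

print(symmetric_closure(RELATED_TO))
print(transitive_closure(PART_OF))
```

A search for keyboards would then also reach lessons about hammers, and the assumed screw would be recognised as part of the bow although only its relation to the frog was declared.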
Considering this difference, we can state that if concept A is part of concept B and this concept