
Gdańsk University of Technology
Faculty of Electronics, Telecommunications and Informatics

National University of Ireland, Galway
Digital Enterprise Research Institute

Department (Katedra): Department of Computer Systems Architecture (Katedra Architektury Systemów Komputerowych)
Student's name (Imię i nazwisko): Jarosław Dobrzański
ID number (Nr albumu): 93635/ETI
Type of studies (Rodzaj studiów): Master's (Dzienne magisterskie)
Specialty (Kierunek studiów): Informatics, Distributed Applications and Internet Systems (Informatyka, Aplikacje Rozproszone i Systemy Internetowe)

Master's Thesis (Praca magisterska)

Title (Tytuł pracy): Social Semantic Information Sources for eLearning
Supervisor (Kierujący pracą): prof. dr hab. inż. Henryk Krawczyk, prof. zw. PG
Consultant (Konsultant): mgr inż. Sebastian Ryszard Kruk
Thesis domain (Zakres pracy): Make a thorough analysis of Social Semantic Information Sources in the context of using them in eLearning. Identify the best fitting ontologies used for their description. Define a common object model for them. Develop the framework that supplies Didaskon with information described with this model.

Gdańsk, 2007
Contents

1 Introduction
1.1 Problem description
1.2 Goal of this thesis
1.3 Outline

2 Related work
2.1 eLearning
2.1.1 History of eLearning
2.1.2 Learning Object
2.1.3 Defects of eLearning
2.2 The Semantic Web
2.2.1 The current Web
2.2.2 The Semantic Web
2.2.3 Semantic Web and eLearning
2.3 Web 2.0
2.3.1 AJAX
2.3.2 Democracy
2.3.3 Social network
2.3.4 Tagging
2.3.5 Mashups
2.4 Semantic Web 2.0
2.4.1 Metadata development

3 Social Semantic Information Sources and eLearning 2.0
3.1 Examples of Social Semantic Information Sources
3.1.1 Semantic Blogs
3.1.2 Semantic Wikis
3.1.3 Social Semantic Digital Library
3.2 Model of Social Semantic Information Sources
3.2.1 SIOC
3.3 eLearning 2.0
3.3.1 Is there a place for semantics?
3.3.2 Didaskon

4 Informal Knowledge Harvester
4.1 Capturing informal learning
4.1.1 Existing tools
4.1.2 Limitations
4.2 System Requirement Specification
4.2.1 System scope
4.2.2 System requirements
4.2.3 System Use Cases
4.3 System design
4.3.1 Service-Oriented Architecture
4.3.2 System components
4.3.3 Classes
4.3.4 Extending IKHarvester
4.3.5 Attribute mapping rules

5 System implementation
5.1 Implementation methodology
5.2 Three-tier architecture
5.3 IKHarvester main page
5.4 Environment and necessary tools
5.4.1 Implementation environment
5.4.2 Documentation
5.5 Main problems and solution details
5.5.1 Implementation of REST
5.5.2 Invoking the data tier features
5.5.3 Extending IKHarvester
5.5.4 Adding data to the informal knowledge repository

6 Conclusions
6.1 Achievements
6.1.1 Publications
6.1.2 IKHarvester
6.2 Future work

7 Summary of the thesis in Polish (Streszczenie pracy w języku polskim)
7.1 Introduction
7.1.1 Problem definition
7.1.2 Goals
7.2 Theoretical background
7.2.1 eLearning
7.2.2 Semantic Web
7.2.3 Web 2.0
7.2.4 Semantic Web 2.0
7.3 Social Semantic Information Sources and eLearning 2.0
7.3.1 Examples of Social Semantic Information Sources
7.3.2 Model of Social Semantic Information Sources
7.3.3 eLearning 2.0
7.4 Informal Knowledge Harvester
7.4.1 Capturing informal knowledge
7.4.2 Requirements analysis
7.4.3 System design
7.5 Implementation
7.5.1 Methodology
7.5.2 Environment and necessary tools
7.6 Final remarks
7.6.1 Publications
7.6.2 Future prospects

Bibliography
List of Figures
List of Listings
List of Tables

A Installation guide
A.1 Apache Tomcat
A.2 Sesame
A.3 IKHarvester
A.3.1 Downloading the source code
A.3.2 Configuration

B Output examples
B.1 LOM example
B.2 LO content example
B.3 List of LOs example

Chapter 1

Introduction

This first chapter introduces the Master's thesis. Here, I formulate and
describe the main problems I have to face while developing the final system,
and I set the goals to achieve. I also describe the methodology used to
develop the system. Finally, I present the outline of the thesis.

1.1 Problem description


"Learning is not attained by chance, it must be sought for with ardor and
attended to with diligence."
    Abigail Adams, 1780

People have been learning for ages: to eat, one had to hunt; to feel rested,
one had to sleep; to stay alive, one had to avoid danger. One was obliged to
learn in order to make life easier or, even more, to survive!

It seems nothing has changed. However, the learning process is more organized

now than it used to be. In general, learning can be divided into formal and informal.

Formal learning is all we remember from school or university; it is the old,
traditional approach. Courses are rigid, made once and for all. Students are
pushed to go through a course from beginning to end without the possibility
of changing its content.

Informal learning, also known as self-directed learning, is more natural and
spontaneous, and gives a learner more flexibility in deciding when, where and
what to learn. We often learn that way unconsciously by chatting, taking part
in video conferences, observing others, or reading blogs and wikis. It is
definitely cheaper and, strange as it may seem, more effective. In fact, most
learning occurs in such unstructured processes and is not orchestrated or
directed by learning specialists. According to surveys taken in the USA, 75%
of organizational learning is informal [36]. Informal learning is related to
collaborative learning, which supports communication between learners,
communities of learners, and other forms of shared knowledge creation.

"The 'e' in eLearning stands for experience."
    Elliott Masie, Masie Center

eLearning is a term used to describe a computer-enhanced learning process
[61]; it is naturally suited to flexible distance learning but can also be
used along with the traditional, face-to-face approach. eLearning is no
longer associated with materials delivered on CD-ROMs and sent across the
country [31]. Nowadays, tools like web-based teaching materials and
hypermedia in general (web pages, discussion boards, simulations and games)
are commonly used. eLearning owes its popularity to the expansion of the
Internet. Many online services provide courses either for free or for a fee,
e.g. Nuvvo¹, Berlin High School eLearning Online², eTech Ohio³.

In addition, there is a considerable number of Web 2.0 (see Sec. 2.3)
services such as blogs, wikis, fora, digital libraries, online chats or video
conferences. They allow users to collaborate with peers and share their
opinions. They are sources of informal learning, since a lot of relevant
information is passed there. However, the information is often unordered, and
it can be difficult to find anything.

By applying the Semantic Web (see Sec. 2.2) to Web 2.0 services, we make
their content machine-readable as well. Thus, computers can assist and guide
users so that they are not lost in the sea of information [61]. The
assumptions of the Semantic Web are fulfilled by introducing semantic
annotations of online resources. This way, blogs, wikis, and digital
libraries become Social Semantic Information Sources (see Sec. 3.1).

1 http://www.nuvvo.com/
2 http://moodle.berlinwall.org/
3 http://www.etech.ohio.gov

In this thesis, I focus on informal learning based on Social Semantic
Information Sources (SSIS). I consider semantic blogs, semantic wikis and
social semantic digital libraries to be unfailing sources of informal
knowledge. Web 2.0 and the Semantic Web, combined together, allow us to
create new, better solutions that go beyond current eLearning assumptions.
They form a new learning standard, eLearning 2.0, which aims at giving the
ability to leverage the community as a part of the larger eLearning picture
[23].

Didaskon⁴ is a system designed according to eLearning 2.0 assumptions. It is
a project developed in the Digital Enterprise Research Institute⁵ at the
National University of Ireland, Galway⁶; it was initiated as a working group
project in cooperation with Gdańsk University of Technology⁷, Faculty of
Electronics, Telecommunications and Informatics⁸. Didaskon delivers a
framework for composing an on-demand curriculum from existing learning
objects provided by eLearning services (formal knowledge). Besides, it draws
on Social Semantic Information Sources, which are sources of informal
knowledge. Based on some preconditions, Didaskon creates a learning path that
best fits a specific learner. To achieve that, the system uses initial
information (preconditions) such as a student's needs, skills and learning
history, the anticipated resulting skills and knowledge (goals), and the
technical details of the client's platform.

1.2 Goal of this thesis


The main goal of this thesis is to develop IKHarvester, an extension for the Didaskon

system. IKHarvester will be a Service Oriented Architecture (SOA) layer. Its goal

is to capture informal learning from Social Semantic Information Sources and store

metadata for this information. Stored data should be delivered to Didaskon in the

form of informal learning objects that support formal ones during learning path

composition.

4 http://didaskon.corrib.org/
5 http://deri.org
6 http://www.nuigalway.ie/
7 http://www.pg.gda.pl/
8 http://www.eti.pg.gda.pl/

Social Semantic Information Sources provide heterogeneous data. Hence, the
design and development of IKHarvester should be supported by a thorough
analysis of these sources. The extension should provide data in a consistent,
common object model, so that Didaskon is able to perform more effective
reasoning during course composition.

Figure 1.1: Capturing informal learning with IKHarvester

1.3 Outline
This thesis is divided into six chapters. In Chapter 2, I present related
work; it contains an introduction to the Semantic Web and Web 2.0. In Chapter
3, I introduce Social Semantic Information Sources and new eLearning
solutions. Chapter 4 summarizes the design stage; there I define the system
requirements, use cases, and architecture. In Chapter 5, I describe the
implementation process: the tools I used during system development and the
software that helped me write the documentation. Finally, Chapter 6 presents
conclusions related to my work.

Chapter 2

Related work

In this chapter, I describe the state of the art in eLearning, Web 2.0, and
the Semantic Web. First, I give a definition of eLearning and show how it has
changed over the years. I define a Learning Object as anything one can
acquire, manage and use. Then, I characterize the Semantic Web, a newer,
better Web in which the content of resources is used by both people and
software agents. Afterwards, I introduce Web 2.0, a Web dedicated to
communities whose members share information and collaborate. Finally, I
describe the Semantic Web 2.0, which is a combination of Web 2.0 and the
Semantic Web.

2.1 eLearning
eLearning (Electronic Learning) [40] is the delivery of educational content
through any electronic media, including the Internet, intranets, extranets,
satellite broadcast, audio and video tapes, interactive TV, CD-ROMs,
interactive CDs and computer-based training. It is expected to displace
old-fashioned learning. In the old approach, a student is passive and pushed
to learn. He/she is obliged to obey rules defining when and where the classes
take place and what their actual content is [33]. Thus, the learning process
is constrained and limited.

In contrast, a learner should be given (to some extent) a free hand in
selecting the course schedule. One should be allowed to learn just in time
and on demand. Moreover, he/she should have influence on the contents of the
classes. Learning should be customized, driven by user profiles and business
demands. This is actually what eLearning aims at.

There are two modes of communication used in eLearning: synchronous and
asynchronous. The first expects students to gather face-to-face or to use
chats, videoconferences, etc. The latter is characterized by the use of
blogs, wikis or discussion boards as tools for sharing opinions or gained
experience. Both support informal learning.

2.1.1 History of eLearning


The beginning of eLearning goes back to the late 1950s, when tools such as
calculators, VCRs, radio and bulletin board systems were used. All these
tools have contributed to ideas concerning the uses of eLearning systems
[40].

In the mid-1980s, eLearning started to develop rapidly. It was the time when
the Multimedia Era began; Windows 3.1, PowerPoint, Macintosh and CD-ROMs
became popular and common all over the world [25]. Computers were supposed to
make training more transportable and visually engaging; learning became
Computer-Based Training.

By the mid-1990s, the Internet had become popular enough that training
providers tried to incorporate it into the teaching process. They found
email, web browsers, web pages, media players, and streamed audio and video
very helpful. A great number of companies enriched courses with graphics and
web-based training and made them accessible on the Internet. eLearning
entered students' lives for good.

2.1.2 Learning Object


eLearning is evolving rapidly. Consequently, there is more and more
information that can be learned. A single piece of information is called a
Learning Object (LO).

In general, a Learning Object is something you can acquire, manage and use.
LOs [19] are reusable, modular, flexible, portable and compatible. Efficient
management of learning material is crucial to make a Learning Management
System (LMS) work properly. The problem is how to organize metadata so that
it can be exchanged between different LOs.

SCORM
SCORM¹ (Shareable Content Object Reference Model) is a Web-oriented data
model for content aggregation. It is an XML-based framework used to define
and access information about LOs so that they can be easily shared among
different LMSs. SCORM focuses on the structure of LOs, their runtime
environment, and the description of the learning process [53].

The part of SCORM related to this thesis is SCORM CAM (Content Aggregation
Model), which defines how to create and manage LOs. According to SCORM CAM,
the content of a learning object can be diverse: plain text, HTML code, short
movies or even more complicated interactive courses. SCORM CAM also includes
all the specifications of the IEEE LOM data model.

LOM

LOM (Learning Object Metadata) is a model for representing information about

learning objects and electronic resources in general; it is a standard underlying

SCORM 2004 [5].

LOM defines the way to build metadata for LOs. There are nine categories of
this information [18], each of which focuses on a different aspect (see also
Fig. 2.1):

• General: general information about the LO as a whole
• Lifecycle: features related to the history and current state of the LO and
  those who have affected it during its evolution
• MetaMetadata: information about the metadata instance itself
• Technical: the technical requirements and technical characteristics of the
  LO
• Educational: educational and pedagogic characteristics of the LO
• Rights: intellectual property rights and conditions of use of the LO
• Relation: features defining the relationship between the LO and other
  related LOs
• Annotation: comments on the educational use of the LO and information on
  the author of the comment and the time when it was written
• Classification: describes the LO in relation to a particular classification
  system

Figure 2.1: LOM structure (from [60])

1 http://www.adlnet.gov/scorm/
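The nine categories listed above can be pictured as the top-level keys of a metadata record. The following is a minimal sketch in Python, not the IEEE LOM binding: the class `LomRecord`, its methods, and the choice of free-form dictionaries per category are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

# The nine top-level LOM categories, in the order listed above.
LOM_CATEGORIES = (
    "general", "lifecycle", "metaMetadata", "technical", "educational",
    "rights", "relation", "annotation", "classification",
)


@dataclass
class LomRecord:
    """Hypothetical container: one dict of free-form fields per category."""
    categories: dict = field(
        default_factory=lambda: {name: {} for name in LOM_CATEGORIES}
    )

    def put(self, category, key, value):
        """Store a field, rejecting categories that LOM does not define."""
        if category not in self.categories:
            raise KeyError(f"unknown LOM category: {category}")
        self.categories[category][key] = value

    def filled(self):
        """Return the names of categories that hold at least one field."""
        return [name for name in LOM_CATEGORIES if self.categories[name]]


record = LomRecord()
record.put("general", "title", "Introduction to RDF")  # illustrative values
record.put("rights", "cost", "no")
print(record.filled())  # ['general', 'rights']
```

Keeping the category names in one tuple means a misspelled category fails early instead of silently creating a tenth group.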

2.1.3 Defects of eLearning


eLearning was, and still is, of great importance in the education process.
Its main advantage is the convenience and flexibility a learner is given; one
learns when he/she wants to or has time. Another positive aspect is
communication between learners, which allows them to share opinions on
learning material. eLearning provides greater adaptability to a learner's
needs and more variety in the learning experience. Nevertheless, eLearning
suffers from a considerable number of limitations and disadvantages.

First of all, eLearning is course-oriented, not user-oriented. There is one
course prepared for all. Usually, courses are not personalized [27]; they are
tailored to a generic student at one of the generic levels of skills or
knowledge. The main assumption is that the student simply wants to pass the
course. Learning services usually do not take into account a specific user's
circumstances, such as wishing to broaden knowledge in a wide range of
domains at the same time [52]. Thus, a student is obliged to attend different
courses at the same time, and some information may be repeated across them.

Secondly, current learning services treat students as single entities; they
do not assume community involvement. A learner is merely supposed to go
through the course alone, without feedback from other students who also
attend it. Students are deprived of the possibility to exchange their
observations and comments.

Finally, existing Learning Management Systems use only traditional, formal
methods: they serve only what is supplied by the provider. Currently, there
is a lot of relevant information on the Internet that could support the
learning process. Current LMSs are not capable of understanding the content
of many web pages, which are per se informal sources of knowledge.

eLearning needs management support in order to dene a vision and plan for

learning and to integrate learning into daily work. However, current Web based

solutions do not meet the requirements mentioned above; they bring the problem of

information overload, lack of accurate information or content that is not machine-

understandable. Only the course creators and students can understand the content

of the course.

2.2 The Semantic Web


In this section, I point out the limitations of the current Web. I introduce
the Semantic Web and focus on its core assumptions and solutions. Finally, I
present the Semantic Web 2.0, which links the Semantic Web platform with
existing Web 2.0 features.

2.2.1 The current Web


The World Wide Web (Web) is an invention of Sir Tim Berners-Lee; it is a
system of interlinked hypertext documents (called web pages) that is accessed
over the Internet. Web pages can contain text and multimedia such as images,
movies, music, etc. They are navigated by using hyperlinks and viewed with a
Web browser [63]. Web data is accessed and exchanged through the HTTP
protocol.

Web pages are written in HTML. Each page has its own URL (Uniform Resource
Locator), which is a kind of URI (Uniform Resource Identifier). A web page
can contain hyperlinks that connect it with other pages.

On the one hand, there is an abundance of valuable information. On the other
hand, the information is not machine-readable. Take the following sentence as
an example: "Bob Marley was a Jamaican reggae musician." This sentence is an
established fact, understood by a human. However, it does not convey any
meaning to a machine. One can ask: why should machines understand the
contents of web pages when it is people who look for information? Nothing
could be more misleading.

Simple scenario

Imagine Adam, a young man with a broad taste in music. He had listened to
rock and ska music for ages. Once, he heard a reggae song by Bob Marley on
his favorite Internet radio station. He really liked the song and wanted to
learn something about reggae music and Bob Marley.

In the ideal situation, he finds a community of reggae music fans where a
great deal of information and useful links can be found. Even the previously
mentioned sentence about Bob Marley could, for a start, satisfy Adam, as it
conveys some knowledge. But this sentence can be hidden in the midst of a
huge accumulation of web pages. The computer's task is to find it.

But, again, how can the computer find anything when it does not comprehend
it? Information must be described somehow, so that a machine can distinguish
one piece of information from another. The above scenario portrays the
importance of an appropriate definition of resources. Without it, computers
are not able to help people reap the benefits of the Internet.

2.2.2 The Semantic Web

"The Semantic Web will bring structure to the meaningful content of Web
pages, creating an environment where software agents roaming from page to
page can readily carry out sophisticated tasks for users."
    Sir Tim Berners-Lee

The word "semantic" stands for "the meaning of". The Semantic Web encompasses
efforts to build a new World Wide Web architecture that enhances content with
formal annotations. It is supposed to create a universal medium for
exchanging information in a way understood by computers [62, 17].
Consequently, browsing and searching in cyberspace is simplified.

One of the most important advantages of the Semantic Web is flexibility.
Different kinds of data can be used together, and diverse types of analysis
can be applied to them [64]. For instance, a book can be described with
Dublin Core² [9] annotations, whereas information about the author can be
expressed by using the FOAF³ (Friend-of-a-Friend) vocabulary [7]. Moreover,
vocabularies can be easily extended by creating modules or sets [21].
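To illustrate how two vocabularies can describe one piece of data, the sketch below mixes Dublin Core and FOAF properties in a single list of statements. The namespace URIs are those of the Dublin Core element set and of FOAF; the book, the author, and the helper functions `describe` and `author_name` are hypothetical and exist only for this example.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set
FOAF = "http://xmlns.com/foaf/0.1/"       # FOAF vocabulary
EX = "http://example.com/"                # hypothetical namespace

book = EX + "books#1"
author = EX + "people#1"

# Statements about the book use Dublin Core; statements about the
# author use FOAF. Both live in the same collection.
graph = [
    (book, DC + "title", "An Example Book"),
    (book, DC + "creator", author),
    (author, FOAF + "name", "Jane Doe"),
]


def describe(statements, resource):
    """Collect property -> value for every statement about one resource."""
    return {p: o for s, p, o in statements if s == resource}


def author_name(statements, book_uri):
    """Follow dc:creator from the book, then read foaf:name of the author."""
    creator = describe(statements, book_uri).get(DC + "creator")
    return describe(statements, creator).get(FOAF + "name")


print(author_name(graph, book))  # Jane Doe
```

Because every property is a full URI, the two vocabularies cannot clash even if they happened to use the same local name.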

Thanks to semantic annotations, reasoning and inference can be performed
using a learner's description (his/her identity and relationships). As such,
the Semantic Web represents a promising technology for realizing eLearning
requirements.

Figure 2.2: The Semantic Web Stack (from W3C)

The Resource Description Framework

Semantics entails description issues, so that artifacts are understood and
efficiently processed by machines. The Resource Description Framework (RDF)
is a model for metadata description. It is a W3C⁴ standard for describing web
resources that have been assigned a URI by which they can be identified. It
was designed to be read and processed by machines, not to be displayed to
people. Thanks to RDF, programs or automated scripts (crawlers) can
efficiently search, discover, collect and process information from the Web.

2 http://dublincore.org/
3 http://www.foaf-project.org/
4 http://www.w3.org/

RDF is based on the concept of a statement. A statement consists of a
subject, a predicate and an object; together they are called a triple [58]. A
collection of RDF statements forms a directed graph in which arrows point
from subjects to objects and the labels on the arrows are predicates.

The fact "Ronaldo is a football player" can be represented by an RDF
statement that has the following structure:

• a subject (resource): Ronaldo
• a predicate (property): is a
• an object (value): football player

Supposing all three parts are assigned URIs in the http://example.com
namespace, the above statement can be illustrated by the graph shown in
Fig. 2.3:

Figure 2.3: RDF statement
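The same statement can also be held in a program as a plain tuple. The sketch below is illustrative only: it stores triples as Python tuples and groups them into the directed graph described above, with subjects as nodes and (predicate, object) pairs as labelled outgoing arrows. The function name `to_graph` is an assumption of this example.

```python
from collections import defaultdict

EX = "http://example.com/"  # namespace assumed in the text

# One statement = one (subject, predicate, object) triple.
triples = [
    (EX + "people#Ronaldo",
     EX + "property#is",
     EX + "profession#football_player"),
]


def to_graph(statements):
    """Build adjacency lists: subject -> [(predicate, object), ...]."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_graph(triples)
for subject, edges in graph.items():
    for predicate, obj in edges:
        # Each printed line corresponds to one labelled arrow in the graph.
        print(f"{subject} --{predicate}--> {obj}")
```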

Besides the graph, the RDF N3 notation can be used to show triples and the
relationships between them. List. 2.1 shows the N3 representation of the
graph above.

Listing 2.1: N3 RDF representation

<http://example.com/people#Ronaldo>
    <http://example.com/property#is>
    <http://example.com/profession#football_player> .

Using triples is effective and very popular. However, RDF can also be
serialized in XML (see List. 2.2).

Listing 2.2: RDF/XML representation

<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:property="http://www.example.com/property#">
  <rdf:Description
      rdf:about="http://www.example.com/people#Ronaldo">
    <property:is
        rdf:resource=
          "http://www.example.com/profession#football_player"/>
  </rdf:Description>
</rdf:RDF>
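To show that the XML form carries the same triple, the following sketch reads a document of this shape with Python's standard `xml.etree.ElementTree` module. It is a simplified illustration that assumes only `rdf:Description` elements with `rdf:about` and resource-valued properties; it is not a complete RDF/XML parser, and a real system would rely on a dedicated RDF library. The function name `extract_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:property="http://www.example.com/property#">
  <rdf:Description rdf:about="http://www.example.com/people#Ronaldo">
    <property:is
        rdf:resource="http://www.example.com/profession#football_player"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(document):
    """Yield (subject, predicate, object) for resource-valued properties."""
    root = ET.fromstring(document)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree names tags as "{namespace}local"; joining the
            # two parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is not None:
                yield subject, namespace + local, obj


triples = list(extract_triples(RDF_XML))
for triple in triples:
    print(triple)
```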

RDF Vocabulary Description Language

According to the W3C [59], RDF aims at representing information on the Web so
that it can be processed by machine agents; RDF Schema is a semantic
extension of RDF. It is a language for describing RDF vocabularies [11].
Consequently, it is possible to describe groups of related resources
(classes), the domains and ranges of properties, and the relations between
them.
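Domain and range declarations are typically used for inference: if a property has a declared domain, the subject of any statement using that property can be inferred to belong to that class, and likewise for the range and the object. The sketch below applies these two rules once over plain tuples. The `ex:` names are hypothetical, the prefixed strings stand in for full URIs, and the function `infer_types` is an assumption of this example rather than part of any RDF toolkit.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Hypothetical schema: who can play for what.
schema = [
    ("ex:playsFor", RDFS_DOMAIN, "ex:Player"),
    ("ex:playsFor", RDFS_RANGE, "ex:Club"),
]

# Instance data that uses the property but states no types.
data = [("ex:Ronaldo", "ex:playsFor", "ex:ExampleClub")]


def infer_types(schema_triples, data_triples):
    """Apply the domain and range rules once; return inferred type triples."""
    inferred = set()
    for prop, kind, cls in schema_triples:
        for subject, predicate, obj in data_triples:
            if predicate != prop:
                continue
            if kind == RDFS_DOMAIN:
                inferred.add((subject, RDF_TYPE, cls))
            elif kind == RDFS_RANGE:
                inferred.add((obj, RDF_TYPE, cls))
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```

Note that the rules add knowledge rather than reject data: a statement with an unexpected subject is not an error, it simply leads to one more inferred type.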

Ontologies

Ontology is a word with several meanings. The term is borrowed from
philosophy, where it refers to the science of describing the entities in the
world and the relationships between them.

Although RDF and RDF Schema are helpful in expressing simple statements, they
fall short in more complex cases. That is why the Web Ontology Language (OWL)
was developed. OWL is a markup language for publishing and sharing data using
ontologies on the Internet. It consists of three sub-languages: OWL Lite, OWL
DL and OWL Full. Each sub-language encompasses the preceding ones. It is
mainly the level of restrictions that distinguishes them.

An OWL ontology contains a description of classes, properties and their
instances [56, 57]. It also allows us to define cardinality constraints on
properties and to specify their transitivity and uniqueness.
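Two of these features, transitivity and cardinality, can be illustrated without an OWL reasoner. The sketch below is a deliberately simplified, closed-world approximation: it computes what a transitive property implies and counts the distinct values per subject. An actual OWL reasoner works under the open-world assumption and may instead conclude that two values denote the same individual. All names in the example are hypothetical.

```python
def transitive_closure(pairs):
    """Return all pairs implied by treating the relation as transitive."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def exceeds_max_cardinality(pairs, limit):
    """Return subjects that have more than `limit` distinct values."""
    values = {}
    for subject, value in pairs:
        values.setdefault(subject, set()).add(value)
    return sorted(s for s, v in values.items() if len(v) > limit)


# A hypothetical transitive property "partOf".
part_of = {("Section 2.2", "Chapter 2"), ("Chapter 2", "Thesis")}
print(("Section 2.2", "Thesis") in transitive_closure(part_of))  # True

# A hypothetical property limited to one value per subject.
has_title = [("lo1", "RDF basics"), ("lo1", "RDF primer"), ("lo2", "OWL")]
print(exceeds_max_cardinality(has_title, 1))  # ['lo1']
```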

In general, an ontology represents a domain and the objects within that
domain. It is a form of knowledge representation for such a domain. Below, I
present a list of the most popular RDF Schema metadata definitions for
specific domains of interest:

• people and social networks: Friend Of A Friend (FOAF)⁵
• online discussions: Semantically-Interlinked Online Communities (SIOC)⁶
• career: Description Of A Career (DOAC)⁷
• project: Description Of A Project (DOAP)⁸
• thesauri, taxonomies and subject-heading systems: Simple Knowledge
  Organization System (SKOS)⁹

Searching and browsing

The most popular way to search on the web is text searching. It is supported by

Google, Yahoo and other search engines. One just enters the query string and then

is given a set of possible answers. The list is huge and often consists of garbage

information, though.

Semantics tries to enhance the searching process. This is achieved by
introducing semantic indexing and query refinement. The former makes it
possible to measure the distance between terms; the latter improves an
imprecise query string so that more adequate results are found [29]. Using
dictionaries like WordNet¹⁰ can boost the searching process by eliminating
the ambiguity caused by homonyms, synonyms and words used in a non-basic
form. As described earlier, RDF consists of a graph structure and literals.
Thus, a search can be performed by using both keywords and structured
queries.

5 http://www.foaf-project.org/
6 http://sioc-project.org/
7 http://ramonantonio.net/doac/
8 http://usefulinc.com/doap/
9 http://www.w3.org/2004/02/skos/
10 http://wordnet.princeton.edu/
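The two kinds of search mentioned above can be sketched side by side. In the example below, a small hand-written synonym table stands in for a dictionary such as WordNet (it does not use WordNet itself), and a structured query is expressed as a triple pattern in which `None` acts as a wildcard. All data, names and functions are hypothetical.

```python
# Hypothetical synonym table standing in for a dictionary such as WordNet.
SYNONYMS = {"musician": {"musician", "artist", "performer"}}


def refine(query):
    """Query refinement: expand each term with its known synonyms."""
    terms = set()
    for term in query.lower().split():
        terms |= SYNONYMS.get(term, {term})
    return terms


def keyword_search(documents, query):
    """Return ids of documents containing at least one refined term."""
    terms = refine(query)
    return [doc_id for doc_id, text in documents.items()
            if terms & set(text.lower().split())]


def match(triples, subject=None, predicate=None, obj=None):
    """Structured query: None acts as a wildcard in the triple pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]


documents = {
    "d1": "Bob Marley was a Jamaican reggae artist",
    "d2": "Ska and rock records",
}
triples = [
    ("ex:BobMarley", "ex:genre", "ex:Reggae"),
    ("ex:BobMarley", "ex:origin", "ex:Jamaica"),
]

print(keyword_search(documents, "musician"))  # ['d1']
print(match(triples, predicate="ex:genre"))
```

The keyword search finds the first document even though the word "musician" never occurs in it, while the triple pattern answers a precise question that plain text matching cannot express.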

2.2.3 Semantic Web and eLearning


eLearning aims at just-in-time, task-relevant learning. No longer should
there be a centralized authority (a teacher) who imposes a predefined course
schedule on students. A single schedule cannot satisfy all students' needs
because students differ from one another.

The Semantic Web can be successfully employed for describing LOs, which
represent learning material. Software agents can continuously scan the
semantic descriptions of LOs to build a huge, decentralized knowledge
repository. Additionally, agents may use a commonly agreed service language,
which improves their cooperation. Consequently, creating a course adjusted to
a specific learner becomes significantly simpler and faster, and it becomes
possible to use diverse types of learning objects.

Limitations of the Semantic Web

So far, I have pointed out a great many virtues of the Semantic Web, especially when introducing it to eLearning. It distinguishes itself with perfect theoretical assumptions and solutions. Nevertheless, practical experience has proved that the Semantic Web is far from changing the vision of the Internet; it needs some help to become a reality and face the current problems.

Some society-scale applications are required. The above-mentioned agents are necessary to process decentralized semantic annotations, which must be created as well. To make shared data a reality, more advanced collaborative applications are required.

2.3 Web 2.0
Although Web 2.0 is currently a very popular term, it is difficult to give its precise definition. Even Tim Berners-Lee, the inventor of the World Wide Web, has difficulty doing that:

Web 1.0 was all about connecting people [. . . ] It was an interactive

space, and I think Web 2.0 is of course a piece of jargon, nobody even

knows what it means. If Web 2.0 for you is blogs and wikis, then that

is people to people. But that was what the Web was supposed to be all

along.

Sir Tim Berners-Lee

In short, Web 2.0 is the Web where people meet, collaborate and share anything that is popular by using social software applications. The term refers to the second generation of Internet-based services: blogs, wikis, communication tools and platforms like del.icio.us¹¹, Flickr¹², Skype¹³, Wikipedia¹⁴, last.fm¹⁵ and Technorati¹⁶.

Web 2.0 applications derive from new techniques such as rich internet applications (RIA), Asynchronous JavaScript and XML (AJAX), semantically valid Extensible HyperText Markup Language (XHTML), Cascading Style Sheets (CSS), syndication and aggregation of data in RSS or Atom, and clean and meaningful URLs. A user of Web 2.0 should feel as if he/she were using a traditional desktop application to share anything with the community.

According to Tim O'Reilly [43], the meaning of Web 2.0 can be presented by contrasting the traditional Web with the new Web 2.0, as in Table 2.1.

11 http://del.icio.us/
12 http://www.flickr.com/
13 http://www.skype.com/
14 http://en.wikipedia.org/
15 http://www.last.fm/
16 http://www.technorati.com/

Table 2.1: New trends in the Web (concept: [43]).

              Web 1.0                       Web 2.0
platforms     Netscape, Internet Explorer   Google Services, Flock
web pages     static personal websites      dynamic blogging
portals       Content Management Systems    wikis
encyclopedia  Britannica Online             Wikipedia
arrangement   directories (taxonomies)      tags (folksonomies)

2.3.1 AJAX
AJAX is a web development technique for creating web applications that behave like desktop ones. The aim is to exchange only small amounts of data with a server, and to do so behind the scenes. No longer does the entire page have to be (re)loaded.

One of the first Web 2.0 applications was Google Maps¹⁷, a set of interactive maps of the world. One can watch diverse views of the world, change the way the views are displayed and personalize them. There is a constant dialog between the server and the client application, but the page is not reloaded.

2.3.2 Democracy
Democracy in Web 2.0 is very important [12]. Users, often amateurs, collaborate and share anything that is popular. Without users, many Web 2.0 applications would not live for long.

Del.icio.us is a collection of favorites. The idea is based on keeping bookmarks and sharing them with other users; users collaborate and share information. The same holds for Wikipedia, a free encyclopedia: Wikipedians can write new articles and edit existing ones. Yet, all Wikipedia users are concerned about the quality of their

17 http://maps.google.com/

encyclopedia. There are even Web 2.0 news services, like Reddit¹⁸. It is a set of news items and articles that other people found interesting and consequently added there.

The aforementioned examples show the importance of Internet users. Web 2.0 exists and is becoming more and more popular because users try to evolve, expand and improve it. One can share anything and, in return, is allowed to use what others have produced.

2.3.3 Social network


A social network consists of users who collaborate and share over the Internet, which brings about online communities, that is, social networks (see Fig. 2.4). The main reason why a user belongs to social networks is the desire to share and to meet others with similar interests. Collaboration is a good way of reaching information and knowledge [46].

Communication can be divided into three modes, classified on the basis of the techniques used:

• one-to-one: emails, instant messaging

• one-to-many: web pages, blogs (see Sec. 3.1.1)

• many-to-many: forums, wikis (see Sec. 3.1.2)

Networks have diverse sizes. In a small, tight one, there are few people who form a kind of private area. However, there can also be a lot of participants with loose connections (weak ties). From the collaboration point of view, the latter kind is more valuable as it is more likely to introduce new ideas. Hence, it is better to have connections with several networks than with only one. However, unlimited access to information exchange can involve some risk; there is a possibility that a social network is flooded with unneeded information. To avoid that, or at least to limit the possibility of reaching poor data, rating and annotating of shared resources were introduced.
18 http://reddit.com/

Figure 2.4: An example of a social network

Scale-free network

In a scale-free network, some nodes (hubs) are very well connected, that is, they have a high degree. An important characteristic of scale-free networks is that the ratio of those well-connected hubs to the number of nodes in the rest of the network remains constant as the network changes in size.
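To make the notion of hubs more tangible, here is a minimal sketch of preferential attachment, one well-known growth model that produces networks with a few highly connected hubs. The model is an assumption introduced for illustration (it is not discussed in this thesis), and the function name and parameters are hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n, seed=0):
    """Grow a network node by node; each newcomer links to one existing node
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    endpoints = [0, 1]   # every edge contributes both of its endpoints
    edges = [(0, 1)]
    for new in range(2, n):
        # Picking a random edge endpoint is a degree-proportional choice.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints += [new, target]
    return edges


edges = preferential_attachment(1000)
degree = Counter(v for e in edges for v in e)
hubs = degree.most_common(5)   # the five best-connected nodes
```

Inspecting `degree` shows that most nodes keep a single connection while a handful of early nodes collect many links, which is the hub structure described above.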

2.3.4 Tagging
A tag is a label associated with or assigned to a piece of information such as a web page, a photo or a movie. It is a keyword which files and classifies resources. Popular services that use tags are del.icio.us and Flickr. The former uses tags to label favorite web pages, while the latter employs them to mark photos.

A tag cloud presents a collection of tags in a way that lets a user distinguish more popular tags from less popular ones; the former are written in a bigger font than the latter. Popularity is measured either by the number of items that have been given a tag (as at Flickr) or by the number of times the tag has been applied to a single item (as at last.fm). Clicking on a tag in the cloud shows the list of resources which were labeled with that tag.
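The first popularity measure (number of items carrying a tag) can be turned into font sizes as follows. This is a minimal sketch with linear scaling; the function name, the size range and the sample data are assumptions made for the example, not the algorithm used by any particular service.

```python
from collections import Counter


def tag_cloud(tagged_items, min_size=10, max_size=30):
    """Map each tag to a font size that grows with the number of items carrying it."""
    counts = Counter(tag for tags in tagged_items.values() for tag in set(tags))
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo or 1  # avoid division by zero when all tags are equally popular
    return {
        tag: min_size + (n - lo) * (max_size - min_size) // span
        for tag, n in counts.items()
    }


# Hypothetical photo collection: item -> list of tags.
photos = {
    "photo1": ["galway", "sea"],
    "photo2": ["galway", "sunset"],
    "photo3": ["galway", "sea"],
}
sizes = tag_cloud(photos)
```

Here the most frequent tag gets the largest font and the rarest tag the smallest; real services often use logarithmic rather than linear scaling so that a few very popular tags do not dwarf all the others.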

The TagCommons¹⁹ project is aimed at creating ways to share and interoperate over tagging data. The idea of the project is to benefit from rich social tagging across applications, communities, and spaces by introducing an ontology for tagging descriptions.

2.3.5 Mashups
A mashup is a web page which offers a number of online services from various sources. It allows using existing applications like Google Maps²⁰, Google Calendar²¹ or Yahoo! UI Library (YUI)²². This is possible due to access to their public APIs, Web feeds (RSS or Atom) and JavaScript.

2.4 Semantic Web 2.0


So far, I have introduced the Semantic Web and Web 2.0, new technologies which have impacted the development of the Internet. The former is a low-level solution whose aim is to produce standards and recommendations helpful in interlinking applications; the latter is high-level, user-experience-minded and supposed to provide user applications [21, 65].

Both approaches can be combined to bring even greater benefits. By involving Web 2.0 techniques in Semantic Web solutions, we get Semantic Web 2.0 applications which not only act like desktop ones, with a fine-looking user interface, but also carry information understood by machine agents.

2.4.1 Metadata development


There are three approaches to creating metadata. They can be created by profes-

sionals, by authors or directly by users.

19 http://tagcommons.org/
20 http://maps.google.com/
21 http://www.google.com/calendar
22 http://developer.yahoo.com/yui/

Table 2.2: Metamorphosis of the Web (concept: [21]).

                 Web 1.0                     Web 2.0                             Semantic Web 2.0
web pages        static personal websites    blogs                               Semantic Blogs
portals          Content Management Systems  wikis                               Semantic Wikis
search engines   Altavista, Google           Google Personalized, dumpfind.com   Swoogle, The SHOE Search Engine
books, articles  Project Gutenberg           Google Scholar, Google Book Search  JeromeDL, NDSL
collaboration    Message Boards              Community Portals                   Semantic Forums, Community Portals
socializing      Address Books               Online Social Networks              Semantic Social Networks
Web space                                                                        Social Semantic Information Spaces

In the most traditional approach, the metadata is created by dedicated professionals; it has the form of catalog records created by complying with complex rules which are not understood by laymen. Moreover, organizing and developing the catalogs is expensive and time-consuming.

The author-created metadata approach assumes that authors are responsible for supplying their work with metadata since they know it best. It helps with the scalability problem, but users are still only recipients and have no influence on the data.

User-defined metadata solves the scalability problem and involves users in the cataloging process [35].

Folksonomies

Tags (see Sec. 2.3.4) arose along with Web 2.0; they played the role of taxonomies, that is, they were supposed to classify resources. Taxonomies are developed with a controlled vocabulary which is imposed on users.

The Semantic Web 2.0 has set users free from using predefined vocabularies. It gave them more freedom in that field by introducing folksonomies. Among the most popular services that involve folksonomies are del.icio.us and Flickr. As the name suggests (it is a combination of the words "folk" and "taxonomy"), a folksonomy is developed by folks (users) who collaborate. A more technical definition says it is an open-ended labeling system with low entry costs that enables Internet users to categorize content using tags. Tags in a folksonomy are metadata about categorized resources; they make a body of information considerably easier to search, discover, and navigate over time.

In other words, folksonomies are the simplest way of knowledge representation

in the Semantic Web 2.0 depiction. At the same time, they bring into the Semantic

Web 2.0 the whole potential of Web 2.0. That is why they also appeared in table 2.2.

As I stated earlier in this thesis (see Sec. 2.2.2), the Semantic Web is about describing information so that it is readable and understood by machines.

An important aspect of a folksonomy is the use of a flat namespace. There is no hierarchy among tags: a parent or a sibling of a tag cannot be specified. Instead, a tag can be interlinked with other related tags. The relationship is established by analyzing the URLs that the tags were applied to. Related tags can be used to broaden or narrow the range of found information and to find information somehow associated with the current tag [35].
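One simple way to derive related tags from tagged URLs is to count how often two tags were applied to the same URL. The sketch below assumes this co-occurrence measure; the function name and the bookmark data are hypothetical and do not reproduce the method of any specific service.

```python
from collections import Counter


def related_tags(bookmarks, tag, top=3):
    """Rank other tags by the number of URLs they share with the given tag."""
    co = Counter()
    for tags in bookmarks.values():
        tags = set(tags)
        if tag in tags:
            co.update(tags - {tag})
    # Sort by descending count, then alphabetically, for a stable result.
    ranked = sorted(co.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top]]


# Hypothetical bookmarks: URL -> tags assigned by users.
bookmarks = {
    "http://example.org/a": ["semanticweb", "rdf", "ontology"],
    "http://example.org/b": ["semanticweb", "rdf"],
    "http://example.org/c": ["web2.0", "ajax"],
}
```

With this data, `related_tags(bookmarks, "semanticweb")` ranks `rdf` before `ontology`, because `rdf` shares two URLs with the tag and `ontology` only one.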

The most important limitation of folksonomies is the lack of scope information and systematic guidelines, which results in ambiguity. A tag can cover a large amount of information from different subject matters. There is also no synonym control. How should photos of oneself be classified: with the tag "me" or "selfportrait"? And how can one reach the most appropriate information about Macintosh: by searching for the tag "apple" or "macintosh"? There is also a problem concerning multiple words and spaces, as multiple words are usually not allowed. Finally, there is the problem of plural and singular forms and of inflected words: there are no strict rules about which form should be chosen. As I said, a user is given a free hand in choosing a tag's name, so there is a risk of tags which are senseless, or of several versions of a tag that could have been given one name once and for all.

However, these problems can be addressed in simple ways. One way is to educate users to add better tags. They should be advised to use plurals in basic forms. They should also be taught not to make spelling errors and to avoid personal tags (e.g. "mydog") that are meaningless to the community. Moreover, tagging systems should catch misspelled and not recommended words and give users advice at run-time [34]. There are some initiatives that try to learn how to order tags in folksonomies, such as taga.licio.us²³.
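Run-time advice of this kind can be approximated with very little code. The sketch below normalizes a tag and suggests the closest already-known tag using the standard `difflib` module; the helper names, the similarity cutoff and the tag list are assumptions chosen for the example, not a description of an existing tagging system.

```python
import difflib


def normalize(tag):
    """Lower-case a tag and remove spaces, since multi-word tags are usually not allowed."""
    return "".join(tag.lower().split())


def suggest(tag, known_tags, cutoff=0.8):
    """Return the closest known tag for a possibly misspelled one, or None."""
    matches = difflib.get_close_matches(normalize(tag), known_tags, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical vocabulary already present in the folksonomy.
known = ["macintosh", "apple", "photo"]
```

For instance, `suggest("Macintosch", known)` points the user to the existing tag `macintosh`, while an unrelated string yields `None` and is accepted as a new tag.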

23 Taga.licio.us: a way to integrate del.icio.us  http://frenchfragfactory.net/ozh/archives/2004/10/05/tagaliciou

a-way-to-integrate-delicious/, accessed: December 28, 2006

Chapter 3

Social Semantic Information Sources

and eLearning 2.0

The goal of this Master's Thesis is to employ Social Semantic Information Sources for eLearning; that is why it is necessary to understand what the Semantic Web 2.0 is and how it can be used for eLearning. So far, I have introduced those technologies (see Chapter 2). In this Chapter, I explain the idea of Social Semantic Information Sources (see Fig. 3.1) and review their most popular examples (semantic blogs, semantic wikis, and the Social Semantic Digital Library). Using that information, I define a common model of SSIS and propose a consistent way of describing it. Then, I present eLearning 2.0, a new approach which tracks the informal learning so widely available on the Internet.

3.1 Examples of Social Semantic Information Sources
3.1.1 Semantic Blogs
According to Moeller [37], blogs (weblogs) are online journals or diaries created and run by individuals to publish their opinions, thoughts and web links [42]. Although most blogs are textual, some focus on photographs (photoblogs), videos (vlogs) or audio (podcasting). In general, blogs are part of the wide network of social media.

Figure 3.1: Location of SSIS in the Web (figure concept: [21])

A blog's owner can be motivated by the desire to introduce themselves to others. He/she can also be a beginning writer who is looking for an audience. Finally, a blogger can be a professional in some field of interest, e.g., a specialized photographer or a master of science. Whoever the blogger is, his or her main reason for creating the blog is to share some information with the broader community.

Blogs are updated by regularly writing new entries (posts). These are usually shown to visitors in reverse chronological order. New posts can be syndicated with headlines, hyperlinks and summaries using the RSS or Atom formats. This lets interested readers know about changes to the blog.

Visitors can read posts and annotate them. According to Technorati¹, blogs are powerful since they allow millions of people to easily publish and share their ideas, and millions more to read and respond. They engage the writer and readers in an open conversation, and are shifting the Internet paradigm as we know it.

1 http://www.technorati.com/

Blog as a tool in eLearning
According to Technorati statistics from August 2006², there were fifty million blogs on the Internet, and their number had been doubling every six months or so since November 2002. At that stage, the number was one hundred times bigger than it had been three years earlier. At that time, about 175 000 new blogs and about 1.6 million posts were created each day. These numbers demonstrate the potential of blogs.

Being so popular, blogs can support the learning process. Not only do they remove the technical barriers to writing and publishing online, but, thanks to their format, they also encourage students to share their ideas.

According to O'Hear [41], Will Richardson was one of the pioneering educational bloggers. By using Manila³, a blog software, he encouraged his English literature students to publish a reader's guide to the book The Secret Life of Bees. The author of this book helped in that experiment by answering questions and commenting on what the students had written. This way, a small community of people interested in a certain topic arose.

Will Richardson succeeded since he relied on the main concept of weblogs, the power of collaboration, which can be used in eLearning. Students can use weblogs for exchanging their experience and publishing their notes or gained knowledge. In turn, other students or even teachers can write annotations to express their opinions about these records of one's thinking.

Syndication of blog content

Blogs would not be so helpful in studying if it were not for exposing machine-readable

listings. There is a family of XML-based standards for describing the contents of a

blog. Syndication services generate feeds, which are portions of information about

changes to a blog. The most popular standards [22] are Really Simple Syndication

0.92, Atom and the RDF Site Summary 1.0 (it fully supports RDF).
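To illustrate what "machine-readable" means here, the following minimal sketch reads the items of a small RSS-style feed with Python's standard XML parser. The feed content, URLs and the `read_items` helper are invented for the example; real feeds carry more elements and may use XML namespaces (as RSS 1.0 and Atom do), which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# A hand-written, hypothetical feed with two posts, newest first.
FEED = """<rss version="0.92">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Second post</title>
      <link>http://example.org/blog/2</link>
      <description>Summary of the second post.</description>
    </item>
    <item>
      <title>First post</title>
      <link>http://example.org/blog/1</link>
      <description>Summary of the first post.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return one dict per <item>, mapping child element names to their text."""
    root = ET.fromstring(feed_xml)
    return [
        {child.tag: (child.text or "").strip() for child in item}
        for item in root.iter("item")
    ]


items = read_items(FEED)
```

An aggregator that polls such a feed periodically can compare the returned links with those it has already seen and notify a learner about new posts only.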

2 State of the Blogosphere, August 2006  http://www.sifry.com/alerts/archives/000436.html;

accessed: December 29, 2006


3 http://manila.userland.com/

Semantics for blogs
There is a large number of blog publishing services available, such as Blogger⁴ or WordPress⁵. These services provide a wide range of tools for creating and managing blogs. However, they lack a semantic description of the content: the topic of the posts, their content or their connections with other posts, perhaps from other blogs.

To make a blog also machine readable, rich metadata for its content must be

provided [54]. The metadata can belong to one of two domains:

• structure: information about the composition of a blog. It describes the blog itself or its parts (posts, comments, hyperlinks) and the relations between them

• content: describes a post's topic, which can be an event, a person, a book, etc. The structure of this metadata depends on what it describes.

The data can be either mixed into a post (and seen by a reader) or added in a hidden, computer-understandable RDF format. Thanks to RDF, computers can interpret and process the metadata; machines can find connections between one's blog posts and other blogs, and quickly obtain information about a post's author or a described event. Consequently, browsing and exploring the blogosphere is more efficient.

The Semantically-Interlinked Online Communities (SIOC)⁶ project (see Sec. 3.2.1) delivers a plug-in (SIOC Exporter) for a few of the most popular blogging platforms:

• WordPress⁷: one of the most popular blogging tools

• DotClear⁸: a blogging platform used mostly in French

• Drupal⁹: a content management platform for blogs and fora

• b2evolution¹⁰
4 http://www.blogger.com/
5 http://wordpress.org/
6 http://sioc-project.org/
7 http://wordpress.org
8 http://www.dotclear.net/
9 http://drupal.org/
10 http://b2evolution.net/

The SIOC plug-in adds additional information about the site: a hyperlink for extracting an RDF document for the whole blog or for its posts. This metadata describes the blog which hosts a post and gives some information specific to a blog post: the author, the topic, external links, the date of creation, the content of the post, etc. To learn more about the SIOC project, see Sec. 3.2.1.
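The kind of description listed above can be sketched as a set of triples. The property names below (`sioc:has_container`, `sioc:has_creator`, `sioc:content`, `sioc:topic`, `sioc:links_to`, `dcterms:created`) are my assumption of suitable SIOC and Dublin Core terms; the function, the dictionary keys and the sample post are hypothetical, and the result is not the actual output format of the SIOC Exporter.

```python
def describe_post(post):
    """Return (subject, predicate, object) triples describing one blog post."""
    s = post["url"]
    triples = [
        (s, "rdf:type", "sioc:Post"),
        (s, "sioc:has_container", post["blog"]),    # the blog hosting the post
        (s, "sioc:has_creator", post["author"]),
        (s, "dcterms:created", post["created"]),
        (s, "sioc:content", post["content"]),
    ]
    triples += [(s, "sioc:topic", t) for t in post.get("topics", [])]
    triples += [(s, "sioc:links_to", link) for link in post.get("links", [])]
    return triples


# Hypothetical post used only to exercise the function.
post = {
    "url": "http://example.org/blog/1",
    "blog": "http://example.org/blog",
    "author": "http://example.org/user/alice",
    "created": "2006-12-29",
    "content": "Notes from today's lecture.",
    "topics": ["http://example.org/topic/rdf"],
    "links": ["http://sioc-project.org/"],
}
triples = describe_post(post)
```

Because every statement has the same three-part shape, descriptions exported from different blogs can be merged into one graph and queried together.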

3.1.2 Semantic Wikis


A wiki is an interlinked website developed and maintained by a community. The most popular wiki engines are MediaWiki¹¹ (Wikipedia¹² is based on it) and MoinMoin Wiki¹³.

The inventor of the wiki is Ward Cunningham; he introduced the idea at a programming language pattern group. A wiki has a simple text syntax for creating new pages. Users can easily create content (ad hoc) and edit existing information using a web browser. They do not even have to be logged in to do that [16]. A wiki provides easy and deep linking by using names. In other words, if a wiki page contains a word or phrase which is the topic of another page in that domain, it is automatically linked to that page. This quite straightforward feature improves navigation; moreover, it works for pages which do not exist yet.

As everyone is allowed to change what others see, the contents must be checked; only then is the information a wiki provides reliable. Each community member can be a moderator. Reliability is achieved with versioning and diff features. Each wiki page has a history of changes which can be easily tracked by comparing the differences between versions. Thus, if errors occur, changes can be easily reverted. All the aforementioned features make wikis a powerful tool for collaborative work.

There can be many reasons for creating a wiki. Wikipedia is the most popular wiki-based encyclopedia on the Internet. Wikis can also be used to manage open source software documentation, as Jakarta¹⁴ does. It is convenient

11 http://www.mediawiki.org
12 http://wikipedia.org/
13 http://moinmoin.wikiwikiweb.de/
14 http://jakarta.apache.org/

to use a wiki as a personal information management system. Finally, wikis are commonly used as a discussion platform in companies' intranets (see TWiki¹⁵).

Semantics makes wiki better

Wikis seem to be a good way of making people cooperate and a powerful informal source of knowledge. To make better use of their potential, the structure and the content of wiki pages should be modeled using semantic descriptions. Semantic wikis allow users to add metadata (semantic descriptions) for the described concepts. This data marks the place of its occurrence so that the system is capable of extracting relevant data without understanding the rest of the text. As a result, it helps to organize, search, browse, share, and annotate the wiki's content. Semantics enhances the searching process; it is no longer limited to keyword-based searching but introduces queries similar to those used in structured databases.

For instance, a wiki with articles about rock songs could annotate these pages with little pieces of additional data (written in RDF), such as "this song was made by Red Hot Chili Peppers" or "this song was published in 2000". A user does not have to know RDF syntax to annotate. The wiki can then reason on the annotations and, for instance, retrieve the songs of a specific band.

Currently, there are a few working Semantic Wiki solutions:

• Semantic MediaWiki¹⁶: an extension of MediaWiki (see Sec. 3.1.2)

• IkeWiki¹⁷: a web-based wiki (prototype)

• Makna¹⁸: its engine implementation is based on Janne Jalkanen's JSPWiki¹⁹; it uses Jena²⁰, the Semantic Web engine developed by HP

• SemperWiki²¹: a semantic personal wiki developed for the Gnome desktop [44]

15 http://twiki.org/
16 http://meta.wikimedia.org/wiki/Semantic_MediaWiki
17 http://ikewiki.salzburgresearch.at/
18 http://www.apps.ag-nbi.de/makna/
19 http://www.jspwiki.org/
20 http://jena.sourceforge.net/
21 http://www.semperwiki.org/

There are three ontologies designed to deal with wikis:

• WikiOnt²²: aims at integrating Wikipedia (and by extension other MediaWiki-based sites) into the Semantic Web framework

• SWIFT²³

• SIOC (see Sec. 3.2.1).

Semantic MediaWiki

MediaWiki is one of the most popular wiki engines. The best-known wiki, Wikipedia, is based on it. However, MediaWiki does not meet the demands of the Semantic Web. Although the HTML code is to some extent semantic, there is no place for features like OWL and RDF.

To make a MediaWiki-based wiki a semantic one, one can install the Semantic MediaWiki extension [16]. Its goal is to make important parts of a MediaWiki's knowledge machine processable with as little effort as possible. For that reason, it introduces typed links, attributes and types, as well as semantic templates.

Typed links are treated as semantic relations between two concepts described in articles. A typed link is obtained by extending the syntax of an ordinary hyperlink. Let us take the main page of the Corrib Clan Wiki²⁴ as an example. On this site, there is information about Corrib Clan, such as the projects developed by its members and their supervisors. There are a number of typed links on that page. The hyperlink to the article about Didaskon not only gives the page location but also introduces some additional information, namely that Didaskon is a subproject of Corrib: [[has subproject::Didaskon]]. From this annotation, an HTML hyperlink is generated. The annotation is built from two main parts. The first part (the expression before ::) describes the relation; the second part (after ::) is a hyperlink to the article within the wiki. So, this example says that Didaskon is a subproject of Corrib.

22 http://sw.deri.org/2005/04/wikipedia/wikiont.html
23 http://ontoware.org/projects/swift/
24 http://wiki.corrib.org/

Besides typed links, Semantic MediaWiki introduces a better way to manage attributes of concepts. Since each typed link connects two wiki pages, not all information can be stored as a relation. For that reason, one uses attributes. On the above-mentioned Corrib Clan Wiki main page there are a few attributes as well. For example, [[is supervised by:=sebastian_DOT_kruk@deri.org]] means that Corrib Clan is supervised by Sebastian Kruk. The difference between a typed link and an attribute is the operator; for an attribute it is :=.
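The two annotation forms described above are regular enough to be recognized with a single pattern. The following is a minimal sketch of such a parser; it is not Semantic MediaWiki's own implementation, and the regular expression, the function name and the page label are assumptions made for the example.

```python
import re

# [[relation::Target]] is a typed link, [[attribute:=value]] is an attribute.
ANNOTATION = re.compile(r"\[\[([^\[\]:]+?)(::|:=)([^\[\]]+)\]\]")


def parse_annotations(wikitext, page):
    """Split the annotations found in wikitext into relations and attributes."""
    relations, attributes = [], []
    for name, op, value in ANNOTATION.findall(wikitext):
        triple = (page, name.strip(), value.strip())
        (relations if op == "::" else attributes).append(triple)
    return relations, attributes


text = ("Corrib Clan [[has subproject::Didaskon]] "
        "[[is supervised by:=sebastian_DOT_kruk@deri.org]]")
relations, attributes = parse_annotations(text, "Main Page")
```

Each match yields a (page, property, value) statement, which is the same three-part shape as an RDF triple; ordinary links such as [[Didaskon]] contain no operator and are ignored by the pattern.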


Relations and attributes describe the concept of an article in a machine-processable way. The set of relations and attributes is shown at the bottom of the article page, but machines are not obliged to scrape the content of the page: Semantic MediaWiki allows extracting these annotations as an RDF feed. For http://wiki.corrib.org/index.php/Main_Page, the RDF description is available at http://wiki.corrib.org/index.php/Special:ExportRDF/Main+Page. Moreover, Semantic MediaWiki allows querying on the fly.

DBpedia.org
DBpedia.org²⁵ is a project that aims at extracting structured information from Wikipedia and making this information available on the Web. The information is often published in Wikipedia articles in special boxes by using special templates. DBpedia also allows us to run queries against Wikipedia and to link other datasets on the Web to Wikipedia data.

3.1.3 Social Semantic Digital Library


In this Section, I focus on Semantic Digital Libraries. I present how they bring freshness to traditional libraries. I explain the reason for applying semantics to them. Finally, I describe JeromeDL, the first Social Semantic Digital Library. All in all, I point out the importance of Social Semantic Digital Libraries to the learning process.

25 http://dbpedia.org/

Digital Libraries

A library is a source of organized knowledge in various areas. The popularity of computers and the expansion of the Internet brought about digital libraries [6]. In a digital library, resources are machine readable and a full-text index improves searching. Resources become available and easily accessible through the Internet.

Some quite innovative methods were adopted by digital library communities: taxonomies, thesauri and classification schemes. They were introduced to improve the management of significant collections. Managing resources in a digital library would be impossible if they were not sufficiently described; electronic annotations play an important role since they bring more information about books. The most popular description formats are MARC21, BibTeX and Dublin Core.

Besides searching and reading, users are allowed to download resources for further

use. In fact, downloading seems to be a substitute for traditional book borrowing.

Digital libraries also handle access rights. Some resources can be hidden from users

who do not have enough permissions to access them.

Semantic Digital Libraries

Digital libraries already have controlled vocabulary and taxonomies. All of them

even have metadata in place. In semantic digital libraries, rich and extensive se-

mantic annotations (metadata) make resources accessible not only with machines

but also by machines.

The metadata is modeled with RDF (see Sec. 2.2.2). Searching is more efficient and gives more accurate results; it reflects the meanings of terms.

Currently, there are a few semantic digital libraries:

BRICKS²⁶ is a fully decentralized platform that allows low-cost, transparent access to distributed information sources through Web Services. It is internationalized and easy to install and manage. BRICKS is still in the development stage and is planned to support existing systems, not replace them.

Artifacts, which come from the cultural heritage domain, are arranged in a hierarchical structure and can be stored internally or in any other place by keeping references to them. BRICKS also supports various metadata schemas defined in OWL-DL. Bibliographic resources are described with RDF. Records can be queried in SPARQL.

Fedora²⁷ is a service-oriented platform for managing and delivering digital content. It is developed jointly by Cornell University Information Science and the University of Virginia Library. By using a service-oriented architecture (SOA), Fedora's developers aspired to achieve interoperability and flexibility. Digital objects consist of linkages between data streams (internal or external content files), in-line or external metadata, system metadata and behaviors, which are code objects providing bindings and links to disseminators.

SIMILE²⁸ (Semantic Interoperability of Metadata and Information in unLike Environments) is developed by W3C, HP, MIT Libraries, and MIT's Lab for Computer Science.

The SIMILE project provides some tools for metadata managers and common end-users. They all deal with RDF: they make it possible to extract XML and HTML files and to inspect and edit RDF files. SIMILE extends and leverages DSpace and makes library metadata management easier by facilitating browsing, searching and mapping heterogeneous data in RDF.

JeromeDL: the Social Semantic Digital Library

So far, I have described innovative semantic digital libraries. I have presented how the Semantic Web improves their features. The potential of semantic digital libraries can be improved even further by applying Web 2.0 capabilities. A semantic digital library can give some space for collaboration. Users can leave a trace by annotating and evaluating the resources. By supporting Web 2.0 collaboration aspects (comments, blogs, shared bookmarks, tagging, etc.), a semantic digital library becomes a social semantic platform. It also turns into a dynamic, collaborative, informal knowledge repository. A social semantic digital library aims at integrating information collected from heterogeneous metadata sources, like resource descriptions, user profiles, bookmarks and taxonomies.

JeromeDL²⁹ [49] is developed at the Digital Enterprise Research Institute, Galway (DERI)³⁰ in collaboration with Gdańsk University of Technology³¹ by a group of MSc and PhD students, including myself. It has a 2-layer metadata enrichment. The lower level, the MarcOnt Mediation Service, supports legacy metadata (DublinCore [9], BibTeX and MARC21 [1]), which allows interoperability with already existing digital library systems [48, 47, 26].

The upper level is community oriented [31]; a community of users can interact in a Web 2.0 manner by tagging resources through Social Semantic Collaborative Filtering (SSCF) [50]. Users can evaluate and annotate resources. Users' data is stored in a private bookshelf, in semantically annotated directories. They can share this information with other users, based on their profile, which is managed by FOAFRealm [30, 13].

In JeromeDL, an intelligent search engine simplifies content management and browsing. Users can even formulate queries in natural language (NL) by using query templates.

There are seven ontologies supported by JeromeDL; they can be grouped as follows:

• User profile management component
  - FOAF (Friend-of-a-Friend): describes a person/agent profile
  - FOAFRealm: allows identity management
• JeromeDL resource structure management: JeromeDL
• MarcOnt Mediation Service [48, 47, 26]: metadata about bibliographic resources
  - DublinCore
  - BibTeX
  - MARC21
  - MarcOnt

29 http://jeromedl.org/
30 http://www.deri.ie/
31 http://www.pg.gda.pl/

Among other features, JeromeDL also allows exporting the descriptions of its resources. They can be obtained in BibTeX, DublinCore or MarcOnt, the ontology designed specifically for bibliographic descriptions.

3.2 Model of Social Semantic Information Sources


So far, I have described basic examples of Social Semantic Information Sources (SSIS). The analysis convinced me that SSIS revolve around a few main concepts:

• collaboration (social aspect - online community)

• the content of resources

• rich metadata describing the content (semantics)

• an enormous amount of valuable information available

The last point indicates the potential of SSIS. Collaboration-minded online community sites, such as blogs, wikis and bookmark-sharing systems, allow users to form a network in which they can freely band together: share ideas and opinions, publish links and works, and comment on them; any resource can be annotated. Consequently, plenty of relevant information can be extracted, and this data can support the learning process, for instance as additional reading material. All in all, this data can be treated as informal knowledge.

The main problem with online communities is that they are dispersed over the Internet. Although their content is valuable, it is difficult to reach. Current solutions mainly allow text-based searching, so a user must browse many web pages to find what he/she is looking for.

The Semantic Web assumes a rich description of resources; its main postulate is that semantic annotations make content readable by machines, which allows better navigation and more efficient searching.

Figure 3.2: Online communities overview (from [4]).

3.2.1 SIOC
SIOC (Semantically-Interlinked Online Communities)^32 is an initiative intended to overcome the above-mentioned problem [20]; its goal is to interconnect online communities. SIOC can be used in publish/subscribe mechanisms, as it stores community metadata such as the post's author, enclosed links, the creation time, and connections with other web pages.

32 http://sioc-project.org/

The core of the SIOC framework is the SIOC ontology, which is based on RDF (Resource Description Framework). The ontology consists of a set of classes and the properties that link them:

Site is the location of an online community or set of communities.

Forum is a discussion area, housed on a site.

Post can be an article, a message, or an audio or video clip. A post is written by an author and has a topic, content, external links, etc.

User represents an account held by an online community member.

Usergroup is a set of accounts of users interested in a common subject matter.
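
To make the relations between these classes more concrete, the following minimal sketch stores a tiny SIOC-like graph as plain (subject, predicate, object) tuples. It is only an illustration: the example.org URIs are made up, the helper functions are hypothetical, and the property names (has_host, has_container, has_creator, member_of, links_to) are my reading of the SIOC vocabulary rather than anything taken from this thesis.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical instance URIs, used only for illustration.
site = "http://example.org/"
forum = "http://example.org/blog"
post = "http://example.org/blog/post/1"
user = "http://example.org/user/alice"
group = "http://example.org/group/elearning"

# Each triple is (subject, predicate, object).
triples = [
    (site, RDF_TYPE, SIOC + "Site"),
    (forum, RDF_TYPE, SIOC + "Forum"),
    (forum, SIOC + "has_host", site),        # a forum is housed on a site
    (post, RDF_TYPE, SIOC + "Post"),
    (post, SIOC + "has_container", forum),   # a post belongs to a forum
    (post, SIOC + "has_creator", user),      # a post is written by an author
    (post, SIOC + "links_to", "http://example.org/other-page"),
    (user, RDF_TYPE, SIOC + "User"),
    (user, SIOC + "member_of", group),       # a user belongs to a usergroup
    (group, RDF_TYPE, SIOC + "Usergroup"),
]


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def posts_by(author):
    """Return the URIs of all posts created by the given user account."""
    return sorted(s for s, p, o in triples
                  if p == SIOC + "has_creator" and o == author)
```

A query such as posts_by(user) walks the has_creator links, which is the kind of cross-resource lookup that SIOC metadata is meant to make possible.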

Figure 3.3: Main concepts in SIOC Ontology (from SIOC homepage)

Mappings in RDFS and OWL allow community instance data to be exchanged by importing and exporting SIOC data in different vocabularies. In this manner, the amount of existing available data can be controlled. SIOC also makes cross-site queries and topic-related searches on sites with SIOC metadata more efficient [4]. I have already described the SIOC plug-ins for a few blogging platforms (see Sec. 3.1.1).

The SIOC ontology is still under development; recently, its authors have been trying to apply it to other collaborative services. At the moment, it can be employed for modelling wikis, image galleries, event calendars, address books, audio and video channels, and a few more.

3.3 eLearning 2.0


When we think of eLearning today, we probably think of Learning Objects (LOs) and

Learning Management Systems (LMSs) that provide online courses (see Sec. 2.1).

LMSs seem to be ubiquitous; thousands of instructors and students in a large number

of universities and colleges use products provided by companies such as WebCT,

Blackboard, and Desire2Learn [8]. To recap, an LMS organizes the learning content in a standard way and delivers it to learners in the form of courses.

This approach suffers from many limitations, though. The main problem of current LMSs is that they deliver courses prepared for a generic student. They are personalized, but prepared based on an individual's view and supposed to satisfy all. However, the learning path should be adaptable and created dynamically. Also, LMSs focus on a small group of students, for instance the students in a class; they do not support a broader community. Moreover, students should benefit not only from their repository (formal learning), but also from learning material widely available on the Web [33].

eLearning 2.0 has emerged from Web 2.0 developments. According to the DTI Global Watch Mission [36], its key characteristics are:

• enabling a more active role of the user/learner

• knowledge and information sharing - the core assumption of Web 2.0

• diversity of content and media - Web 2.0 services (blogs, wikis, multimedia and bookmark-sharing systems)

• ease of collaborative learning

• informal learning

Blogs were one of the first Web 2.0 services used in the newer eLearning approach. Students' blog posts are often about something from their own range of interests, rather than a course topic or an assigned project. Students run blogs and read others' blogs; consequently, they create a social network with loads of useful data [36]. Later, wikis, RSS, podcasting services, and other Web 2.0 platforms emerged. All in all, the number of available resources has increased, which poses a problem for content management systems [2].

3.3.1 Is there a place for semantics?


In the Semantic Web, data are processed by both human and machine agents; this is possible thanks to ontologies (see Sec. 7.2.2). Thus, machines can produce intelligent responses to unforeseen situations. But the real power of the Semantic Web is realized when heterogeneous data from diverse environments are collected, processed and passed on for further use [33]. Ontologies organize learning material around good semantic annotations of learning objects. They can also be used to describe user profiles, so that the best course for a given user can be composed with semantic queries.

Description of learning material is essential for course composition. The main problem of current eLearning is that there is no standard that defines how LOs are described. There are many LMSs, and most of them describe LOs in their own specific way. Thus, it is impossible to exchange LOs between different LMSs, integrate learning content used by other LMSs, or create common searchable content and content repositories. The Advanced Distributed Learning (ADL) Initiative introduced SCORM (see Sec. 2.1.2), which is a collection of standards and specifications adapted from multiple sources to provide a comprehensive suite of eLearning capabilities that enable interoperability, accessibility and reusability of Web-based learning content [32]. However, SCORM has introduced its own XML formats and methodologies [3]. One of the standards that underlie SCORM is LOM; its goal is to provide a rich description of learning material (see Sec. 2.1.2). Since LOM is very accurate, many LMSs support it. This way, exchanging LOs between them is, to some extent, facilitated.

Although SCORM tries to introduce semantics to the education community, learning content is still hardly machine-readable. Bringing the Semantic Web to eLearning makes it easier to integrate learning material with other material and to define services; it allows interoperability, flexibility, and machine-readable description of learning material [3, 38]. Thus, it becomes more likely that learners benefit from both formal and informal sources of information (Social Semantic Information Sources).

At the moment, considerable effort is being put into research on the Semantic Web and eLearning. There are a number of Semantic Web educational services and projects:

AQUA is an ontology-driven Question Answering (QA) system. Its goal is to answer questions, written in natural language (English), about academic people and organizations. Heterogeneous data for reasoning can be collected from web pages that contain semantic content. AQUA uses ontologies for refining initial queries, in its similarity algorithm, and in the reasoning process [55].

SES (Student Essay Service) is a service for annotating argumentation in student essays, which facilitates writing essays that really answer the essay question. Annotations are created by using argumentation categorizations stored as ontologies [38].

Elena^33 defines a smart space for eLearning on top of the Edutella [39] peer-to-peer (P2P) infrastructure. It brings interoperability and resource exchange between heterogeneous educational applications and different types of learning resource repositories. It uses SOAP-based Web Services, which are described in WSDL and DAML-S [3, 51].

Edutella is a peer-to-peer (P2P) network that interconnects universities. Within Edutella, a university is both a content provider and a content consumer. The network and all its resources are described in RDF. This allows running efficient queries, performing replication to achieve workload balancing, and mapping, mediating and clustering resources and their metadata [39].

Memora is an ontology-based, document-driven memory, which allows learning material to be managed efficiently by indexing it by means of ontologies [2].

3.3.2 Didaskon
Didaskon^34 is a project developed at the Digital Enterprise Research Institute (DERI)^35, Ireland, by a few students, including myself. It is a research project in the eLearning field. Its main goal is to deliver a framework for assembling an on-demand curriculum from existing Learning Objects (LOs) provided by eLearning services [52].

It has access to a repository of LOs described with semantic annotations based on the LOM ontology. LOs are composed into a learning path for a specific student. Along with formal Learning Objects, Didaskon also uses the potential of Social Semantic Information Sources. It is capable of fulfilling the postulates of informal learning and creates LOs from data harvested from SSIS. Consequently, a user gets a course path prepared from information collected in both formal and informal ways.

34 http://didaskon.corrib.org/
35 http://deri.ie/

Furthermore, the Didaskon composition algorithm takes into account some preconditions regarding the user. Each user is described with the FOAF ontology [7]. Based on the delivered user profile (knowledge level in different domains and goals/expectations for the course), it is capable of returning learning material customized to his/her needs. Moreover, the system allows more scalable helper features for student supervision.

Again, the ontologies used link user needs with the characteristics of the learning material. The produced curriculum not only reflects user requirements, but also introduces a new interdisciplinary, extensible and robust meaning of eLearning.

Chapter 4

Informal Knowledge Harvester

Every project is burdened with some risk; when things go wrong, it can end up failing to meet its initial assumptions. Therefore, the design process is crucial. Besides defining business goals, I must also identify possible problems and risks.

In this chapter, I introduce existing tools for capturing informal learning and describe the scope of my project. I define the functional and non-functional requirements for the system, as well as its use cases. Then, I introduce its architecture: the main components, classes, and Web Services specification. All the information gathered helped me during the implementation stage.

4.1 Capturing informal learning


In this section, I present existing tools for capturing, tagging, and browsing online resources or their metadata. I describe their features and point out their limitations.

4.1.1 Existing tools


PingtheSemanticWeb.com
PingtheSemanticWeb.com^1 is a service for sharing RDF documents. Its engine looks for RDF data either in the content of the resource at the specified URL or in documents this resource links to. If such data is found, it is saved to the shared repository. PingtheSemanticWeb.com supports the FOAF, SIOC, and DOAP ontologies, as well as other RDF documents.

1 http://pingthesemanticweb.com/

The pinging feature is invoked either by typing a URL on the service's home page or by using specially prepared browser buttons. Moreover, PingtheSemanticWeb.com benefits from Semantic Radar^2, an add-on for the Firefox web browser; whenever Semantic Radar detects RDF data on a web page, it informs PingtheSemanticWeb.com so that the data can be added to the repository. Software agents can ask the service for a list of stored RDF documents and use that information for crawling the Semantic Web.

SIMILE Project
The SIMILE Project^3 (Semantic Interoperability of Metadata and Information in unLike Environments) provides tools for metadata managers and ordinary end-users.

Piggy Bank, an add-on for Firefox, turns the browser into a mashup platform by allowing users to capture metadata for online resources and mix them together. Collected data can be stored locally, tagged, searched, and browsed. Piggy Bank can capture RDF documents that a web page links to, and it can also capture data from any web page for which a screen scraper is available. A screen scraper is a small program that collects metadata from web pages, including non-semantic ones. It is written with another SIMILE tool, Solvent.

A user who wants to share his/her collection of metadata publishes it to the Semantic Bank, a communal repository of RDF data.

Zotero
Zotero^4 is an add-on for the Firefox web browser. It helps with collecting, managing, and citing research material, mainly bibliographic resources. Zotero extracts RDF injected into XHTML documents; it works with a few standards and microformats [24]: embedded RDF, COinS, Dublin Core [9], and MARC [1]. Zotero informs the user that it has discovered some markup by showing a special button in the browser toolbar. Clicking the button starts the capturing process.

2 http://sioc-project.org/firefox/
3 http://simile.mit.edu/
4 http://www.zotero.org/

A user can easily edit the data saved by Zotero and append additional information, such as notes, tags, and related files. Moreover, Zotero can be integrated with Microsoft Word and WordPress. Captured data can be searched and browsed both online and offline.

4.1.2 Limitations
All the above-mentioned tools are good metadata harvesters. However, they work differently and have different possible uses.

By providing Web Services, PingtheSemanticWeb.com allows semantic annotations for online resources to be gathered in a shared space. This information can be used, for instance, by crawlers searching for a specific piece of data. However, PingtheSemanticWeb.com offers no way to browse stored data other than viewing raw RDF documents, which is unacceptable for an ordinary user. Also, it does not work with non-semantic sources, like Wikipedia.

Zotero is a powerful tool for researchers and students because it facilitates the management of bibliographic resources. With Zotero, it is easy to browse saved information about books and articles, and to search and cite them. However, it only reads embedded RDF; there is no support for standalone RDF documents, which can convey more knowledge.

Piggy Bank is capable of reading whole RDF documents that a web page links to. Although it does not support non-semantic web pages itself, it is possible to write screen scrapers that do. Even so, it offers little support for eLearning platforms; there is no standardized way for eLearning frameworks, like Learning Management Systems, to use the captured data.

The analysis of existing knowledge management tools resulted in a set of significant characteristics that such a tool must have. It should work not only with semantic sources of information, but also with non-semantic web pages, like Wikipedia. It must be easy to extend so that it supports more types of websites. Furthermore, a user should be supplied with supporting tools for data capturing, like browser buttons or add-ons.

I have also found that captured data can considerably boost informal learning; it can be used in new eLearning frameworks that use both learning material prepared by specialists and material collected by an information harvester.

4.2 System Requirement Specification


4.2.1 System scope
The system I have developed aims at capturing informal learning. It is an SOA layer for the Didaskon system (see Sec. 3.3.2) and works as its extension. The system provides Web Services for harvesting data from SSIS and delivering them in the form of informal Learning Objects (see Fig. 4.1). Data delivered by the system must be described with a common object model so that Didaskon can easily reason over it. Because the system is supposed to collect data, I have named it IKHarvester, from Informal Knowledge Harvester.

The picture of the system scope (see Fig. 4.1) shows SSIS that provide heterogeneous metadata. Their content is enriched with semantic annotations. IKHarvester collects the metadata and stores it in the repository of informal knowledge (not shown in the picture). The collected metadata is well described, so it is machine-readable. This allows relevant portions of learning material to be delivered to Didaskon, based on what it needs during the composition. The description of learning material is formed in a common way, according to the LOM standard, which is popular among eLearning frameworks.

4.2.2 System requirements


Before this stage, I had defined both functional and non-functional requirements. They gave me a view of what the developed system should do and how it should behave.

Requirement description template

Defining the system requirements is very important. They should be organized in a readable and correct way. Table 4.1 is a template for describing the requirements.

Figure 4.1: System scope

In the table, there is the following information:

• Id - unique identifier of the requirement, used throughout the documentation. It has the form XYY, where:
  - X - F for functional, N for non-functional requirements
  - YY - the number of the F or N requirement, starting from 01
• Priority - determines the importance of the requirement; possible values, from most to least important: crucial, required, optional
• Title - the described aspect of the system
• Description - the essence of the requirement
• Source - stakeholder(s) whose knowledge and needs give rise to the requirement
• Related req. - other requirements that the described one relates to

Table 4.1: Requirement description template

Id              XYY         Priority
Title
Description
Source
Related req.
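
The template lends itself to a simple machine-checkable form. The sketch below is only an illustration of the Id and Priority rules described above; the Requirement class and its field names are hypothetical and are not part of IKHarvester.

```python
import re
from dataclasses import dataclass, field

# X is F or N, YY is a two-digit number; numbering starts from 01.
ID_PATTERN = re.compile(r"([FN])(\d{2})")
# Allowed priorities, ordered from most to least important.
PRIORITIES = ("crucial", "required", "optional")


@dataclass
class Requirement:
    id: str
    priority: str
    title: str
    description: str = ""
    source: str = ""
    related: list = field(default_factory=list)

    def __post_init__(self):
        match = ID_PATTERN.fullmatch(self.id)
        if match is None or int(match.group(2)) < 1:
            raise ValueError(f"invalid requirement id: {self.id!r}")
        if self.priority.lower() not in PRIORITIES:
            raise ValueError(f"invalid priority: {self.priority!r}")

    @property
    def functional(self):
        """True for functional (F) requirements, False for non-functional (N)."""
        return self.id.startswith("F")
```

With such a check in place, an identifier that does not follow the XYY scheme is rejected as soon as the requirement is recorded.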

Functional requirements

Functional requirements cover the stakeholders' demands on what the developed system should do. Precisely described stakeholders' needs are the first step towards finishing a project successfully.

Id F01 Priority Crucial

Title Deliver a list of all informal LOs

Description IKHarvester should be able to provide a list of all informal LOs stored in the informal knowledge repository.

Source Didaskon

Related req. F07, F09

Id F02 Priority Optional

Title Deliver a list of informal LOs that have changed since a given date

Description IKHarvester shall provide information on which informal LOs have changed since a given date. This aims at avoiding situations in which out-of-date data is used.

Source Didaskon

Related req. F01, F07, F09
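
Requirement F02 boils down to a date filter over the repository entries. A minimal sketch follows, assuming (hypothetically) that the repository can supply pairs of LO URL and last-modification date, and that "since" is meant inclusively; neither assumption comes from the requirement itself.

```python
from datetime import date

# Hypothetical repository records: (LO URL, date of last change).
records = [
    ("http://example.org/wiki/RDF", date(2007, 3, 1)),
    ("http://example.org/blog/post/1", date(2007, 5, 20)),
    ("http://example.org/library/book/7", date(2007, 6, 2)),
]


def changed_since(entries, since):
    """Return URLs of LOs whose last change is on or after `since`."""
    return [url for url, modified in entries if modified >= since]
```

A client that remembers the date of its last synchronisation can then ask only for the LOs that need refreshing, instead of re-reading the full list required by F01.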

Id F03 Priority Crucial

Title Deliver the manifest of a specic informal LO

Description This is one of the basic features of IKHarvester. Metadata for SSIS resources is stored in the informal knowledge repository and must be accessible by agents. However, it must be described in a common object model. Because of the eLearning background of the system, the metadata must be presented as a Learning Object Manifest.

Source Didaskon

Related req. F01, F02, F07, F09

Id F04 Priority Crucial

Title Deliver the content of a specic LO

Description During course composition, Didaskon uses LO manifests. Finally, it creates a curriculum from the content of the relevant LOs.

The content of informal LOs must not be stored in the repository; it must be collected on the fly. This way, only descriptive information is kept in the informal knowledge repository. There is little point in storing the content of SSIS resources: it can change quite often, so it may be difficult to keep up to date. Also, the amount of data stored would be too large.

Source Didaskon

Related req. F01, F02, F07, F09

Id F05 Priority Crucial

Title Add a LO to the informal knowledge repository

Description Informal Learning Objects will be added either by students (Didaskon

users) or by the administrator. Regarding SSIS, one must know the URL

of the resource from which a LO will be created.

The data should be harvested from:

• blog posts that have support for SIOC

• (semantic) wiki articles, based on MediaWiki engine

• JeromeDL

Source Didaskon

Related req. F01, F02, F07, F09, F10, F11, F12

Id F06 Priority Crucial

Title Remove a LO from the informal knowledge repository

Description If the SSIS resource from which an LO was created no longer exists, the LO shall be removed from the informal knowledge repository. The data should not be physically removed, though. Instead, information about the removal should be added to the repository.

Source Didaskon

Related req. F01, F02, F07, F09
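
What F06 describes is commonly called a soft delete: the entry stays in the store and is only marked as removed. The sketch below illustrates the idea with an in-memory dictionary; the InformalRepository class and its method names are hypothetical and stand in for the actual repository.

```python
from datetime import datetime, timezone


class InformalRepository:
    """In-memory stand-in for the informal knowledge repository."""

    def __init__(self):
        self._entries = {}

    def add(self, url, manifest):
        self._entries[url] = {"manifest": manifest, "removed_at": None}

    def remove(self, url, when=None):
        # Soft delete: keep the data, record only the time of removal.
        self._entries[url]["removed_at"] = when or datetime.now(timezone.utc)

    def list_active(self):
        """URLs of LOs that have not been marked as removed."""
        return sorted(url for url, entry in self._entries.items()
                      if entry["removed_at"] is None)

    def is_stored(self, url):
        """True if the entry physically exists, removed or not."""
        return url in self._entries
```

Because removed entries remain stored, a list of changes since a given date (F02) can still report that an LO has disappeared.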

Id F07 Priority Crucial

Title Access to all functionalities by Web Services

Description IKHarvester shall be an extension to Didaskon, built as an SOA layer. Using SOA ensures efficiency and easy access to the system features. All methods should be accessible through REST-type Web Services.

Source Didaskon

Related req. F01, F02, F03, F04, F05, N04

Id F08 Priority Required

Title Testing background

Description Since IKHarvester should follow a service-oriented architecture (SOA), a user interface is no longer needed (especially for LMS purposes). However, the functionality should be tested to ensure the system is reliable. Also, access should be provided for the administrator of the repository. Thus, some testing pages must be prepared.

Source Didaskon, Jarosław Dobrzański

Related req. F07, N01, N04

Id F09 Priority Crucial

Title The structure of Learning Objects

Description IKHarvester must describe informal LOs in a common object model suitable for eLearning. The structure of the model must be compliant with the SCORM Content Aggregation Model.

Source Didaskon

Related req.

Id F10 Priority Crucial

Title Harvest data from semantic blogs

Description IKHarvester must be able to collect data from semantic blogs that are equipped with the SIOC plug-in. The data should be obtained by using the SIOC exporter.

Source Didaskon

Related req. F05

Id F11 Priority Crucial

Title Harvest data from (semantic) wikis

Description IKHarvester must be able to collect data from both semantic and non-semantic wikis based on the MediaWiki engine.

IKHarvester should use RDF feeds that provide semantic annotations. In addition, it must scrape the article pages to collect more data.

Source Didaskon

Related req. F05

Id F12 Priority Crucial

Title Harvest data from JeromeDL

Description IKHarvester must be able to collect data from JeromeDL, the Social Se-

mantic Digital Library. The data shall be produced by the RDF exporter.

Source Didaskon

Related req. F05

Id F13 Priority Crucial

Title Filter the collected metadata

Description If RDF extractors supply IKHarvester with irrelevant data, that data must be filtered out.

Source Didaskon

Related req. F05
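
One simple way to realise such filtering is a whitelist of predicates: only triples whose predicate is considered relevant are kept. The sketch below illustrates this under stated assumptions. The whitelist (three Dublin Core elements) is a made-up example, not the set of properties actually used by IKHarvester, and the triples are plain tuples rather than objects of any RDF library.

```python
DC = "http://purl.org/dc/elements/1.1/"

# Assumed whitelist of relevant predicates, chosen only for illustration.
RELEVANT_PREDICATES = {DC + "title", DC + "creator", DC + "date"}


def filter_triples(triples, relevant=RELEVANT_PREDICATES):
    """Keep only (subject, predicate, object) triples with a relevant predicate."""
    return [t for t in triples if t[1] in relevant]


harvested = [
    ("http://example.org/post/1", DC + "title", "Learning RDF"),
    ("http://example.org/post/1", DC + "creator", "alice"),
    ("http://example.org/post/1", "http://example.org/ns#pageViews", "1024"),
]
```

Here filter_triples(harvested) keeps the title and creator statements and drops the page-view counter, which carries no value for course composition.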

Non-functional requirements

Non-functional requirements determine the expectations regarding the system itself, but do not concern its interaction with the environment.

Id N01 Priority Required

Title Reliability

Description IKHarvester will be a subsystem working with LMS(s), in particular with

Didaskon. It is necessary to provide reliable responses. SSIS resources

must be precisely described; only then Didaskon can perform proper

reasoning on them.

Source Didaskon, Jarosław Dobrzański

Related req. N06, N07, N08

Id N02 Priority Required

Title Interoperability

Description IKHarvester shall be able to exchange and use information in heterogeneous networks. After all, it is supposed to collect data from SSIS, which carry diverse information.

Source Jarosław Dobrzański

Related req. N05

Id N03 Priority Required

Title Extensibility

Description The system should be developed in a way that allows it to evolve; it must be possible to make corrections, improvements and changes to the existing, working system.

It must also be easy to extend with plug-ins (see Fig. 4.6) that deal with other types of SSIS (wikis based on a different engine, other digital libraries, fora, bookmark-sharing systems, etc.).

Source Didaskon

Related req. N10, N11
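
The extensibility requirement can be pictured as a small plug-in registry: each plug-in declares which URLs it can handle, and the harvester delegates to the first one that accepts the resource. The sketch below is a hypothetical illustration of that idea; the class names, the accepts/harvest interface and the URL heuristics are assumptions and do not describe the actual IKHarvester design shown in Fig. 4.6.

```python
from urllib.parse import urlparse


class HarvesterPlugin:
    """Base class for a source-specific harvester (hypothetical interface)."""
    name = "base"

    def accepts(self, url):
        raise NotImplementedError

    def harvest(self, url):
        raise NotImplementedError


class MediaWikiPlugin(HarvesterPlugin):
    name = "mediawiki"

    def accepts(self, url):
        # Assumption: MediaWiki article URLs contain a "/wiki/" path segment.
        return "/wiki/" in urlparse(url).path

    def harvest(self, url):
        # Placeholder: a real plug-in would read RDF feeds and scrape the page.
        return {"source": self.name, "url": url}


class SiocBlogPlugin(HarvesterPlugin):
    name = "sioc-blog"

    def accepts(self, url):
        # Assumption: blog posts live under a "/blog/" path segment.
        return "/blog/" in urlparse(url).path

    def harvest(self, url):
        # Placeholder: a real plug-in would call the SIOC exporter.
        return {"source": self.name, "url": url}


class PluginRegistry:
    def __init__(self):
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)

    def harvest(self, url):
        """Delegate to the first registered plug-in that accepts the URL."""
        for plugin in self._plugins:
            if plugin.accepts(url):
                return plugin.harvest(url)
        raise LookupError(f"no plug-in accepts {url}")


registry = PluginRegistry()
registry.register(MediaWikiPlugin())
registry.register(SiocBlogPlugin())
```

Supporting a new type of SSIS then means writing one more subclass and registering it, without touching the existing code.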

Id N04 Priority Required

Title Efficiency

Description The system must provide relevant information quickly. Therefore, it should be developed as a lightweight application, for example as Web Services.

However, this is not crucial; the communication takes place over the Internet, so there may be periods when the services are unavailable. This must not happen often, though.

Source Didaskon, Jarosław Dobrzański

Related req.

Id N05 Priority Required

Title Portability

Description The system is required to be platform-independent; it should be deployable on either a UNIX or a Microsoft Windows operating system without changes to the source code. The system should be delivered in a way that allows quick deployment.

Source Didaskon

Related req. N02

Id N06 Priority Required

Title Stability

Description IKHarvester should be stable; it should work well in the environment in which it is deployed. Unexpected interruptions and crashes must be anticipated and avoided.

Source Didaskon

Related req. N01

Id N07 Priority Required

Title Safety

Description Stored information shall be protected; it must not be lost or modified accidentally.

Source Didaskon, Jarosław Dobrzański

Related req. N01, N07

Id N08 Priority Optional

Title Security

Description IKHarvester shall be protected from intentional and unintentional activities and efforts that aim at lowering its efficiency and the quality of its work.

Source Didaskon, Jarosław Dobrzański

Related req. N01, N04, N07

Id N09 Priority Optional

Title Open software

Description During development, only open source software and tools should be used (this does not apply to the operating system).

Source Didaskon

Related req.

Id N10 Priority Crucial

Title Version Control

Description All documents and other products (like software) created during the project will have a version number, which will make it easy to track changes. A version control tool such as SVN is needed.

Source Didaskon

Related req. N03, N11

Id N11 Priority Crucial

Title Integrated development environment

Description There is a need for an integrated development environment for more efficient project management and better support for version control.

Source Didaskon

Related req. N10

4.2.3 System Use Cases


As stated previously (see Sec. 4.2.2), IKHarvester is built for capturing informal

learning. There are two main functions provided:

• providing data stored in the informal knowledge repository

• collecting data from SSIS and storing it in the informal knowledge repository

More detailed use cases are depicted in Fig. 4.2, which shows the basic functionalities and the actor.

Actors

Below, there is a description of the actor that uses functionalities provided by IKHar-

vester.

Figure 4.2: Use Case diagram

Id A01

Title Client

Description IKHarvester will be built as an SOA layer. Initially, it was supposed to be an extension for Didaskon, an eLearning 2.0 LMS. However, since all its features are accessible through Web Services, we expect that more than one actor can use it.

Related actors

Use cases

A use case is an occurrence that takes place while the system works. Each use case is initiated either by the actor's activity or by another use case. It is very important to provide use case scenarios derived from the analysis of the system requirements. Use cases describe more precisely what can happen to the system while it works.

Id UC01

Title Delivery of a list of LOs

Description Providing a list of existing Learning Objects created from

informal knowledge stored in the repository.

Actors A01

Initial occurrence Request for a list of LOs.

Exceptional occurrence Unsuccessful connection to the repository

Related use cases UC07, UC17

Id UC02

Title Delivery of the LO manifest

Description Providing a description of an informal LO, taken from the repository. The LO is described according to the LOM standard.

Actors A01

Initial occurrence Request for the manifest of an informal LO.

Exceptional occurrence Unsuccessful connection to the repository

Related use cases UC07, UC16, UC17

Id UC03

Title Delivery of the LO content

Description Providing the content of an LO. The content can be plain text, HTML or a link to a digital resource. This content will be used in the composed course.

Actors A01

Initial occurrence Request for the content of the LO.

Exceptional occurrence Unsuccessful connection to the resource with given URL; ei-

ther problems with the Internet connection or the resource is

no longer online.

Related use cases UC07, UC17

Id UC04

Title Adding a new LO

Description A Didaskon user or the system administrator is allowed to add new LOs. The actor must have the URL of the resource that should be added.

Actors A01

Initial occurrence Request to add a new LO.

Exceptional occurrence Unsuccessful connection to the repository or problems with

the Internet connection (impossible to harvest metadata).

Related use cases UC07, UC10

Id UC05

Title Update to the LO manifest

Description If the metadata of a resource regarded as a source of informal knowledge changes, it should be updated. Only then can Didaskon perform efficient and sufficient reasoning.

Actors A01

Initial occurrence Request to update the metadata of a resource that has changed.

Exceptional occurrence Unsuccessful connection to the repository or problems with

the Internet connection (impossible to harvest metadata).

Related use cases UC07, UC10

Id UC06

Title Removal of the LO manifest

Description If a resource regarded as informal knowledge no longer exists, it should be removed from the repository. Didaskon will then not use it during course composition.

Actors A01

Initial occurrence Request to remove the LO manifest.

Exceptional occurrence Unsuccessful connection to the repository or problems with

the Internet connection (impossible to harvest metadata).

Related use cases UC07

Id UC07

Title Specifying the input data

Description Specifying the input data consists of defining preconditions or other information (for instance, the resource's URL) needed for harvesting and providing data.

Actors A01

Initial occurrence Request for harvesting or providing data.

Exceptional occurrence

Related use cases UC01, UC02, UC03, UC04, UC05, UC06, UC08, UC09

Id UC08

Title Specifying the adding date

Description Assigning a value to the resource's adding date.

Actors A01

Initial occurrence Request for a list of all LOs. If the adding date is specified, the list will contain only those LOs that have been added since then.

Exceptional occurrence

Related use cases UC07

Id UC09

Title Specifying the URL

Description Assigning a value to the resource's URL.

Actors A01

Initial occurrence Claim for a specific LO.

Exceptional occurrence

Related use cases UC07

Id UC10

Title Harvesting data from SSIS

Description The metadata harvesting procedure is performed for a given

resource's URI.

Actors A01

Initial occurrence Claim for LO content, or for adding or updating the metadata of a

specific LO.

Exceptional occurrence Problems with the Internet connection.

Related use cases UC03, UC04, UC05, UC11, UC12, UC15

Id UC11

Title Using RDF extractor

Description Semantic web pages allow metadata for their resources to be

extracted by special RDF extractors. IKHarvester calls such an

extractor in order to obtain the metadata.

Actors A01

Initial occurrence Harvesting data from SSIS

Exceptional occurrence Problems with the Internet connection or difficulty in resolving

the RDF extractor's URL.

Related use cases UC10

Id UC12

Title Scraping HTML code

Description Reading the HTML code of the resource's web page in order to

find more relevant metadata.

Actors A01

Initial occurrence Harvesting data from (semantic) wikis

Exceptional occurrence Problems with the Internet connection.

Related use cases UC10, UC13

Id UC13

Title Transformation of information to RDF

Description If web page scraping is performed, the collected data must be

transformed to RDF so that it can be stored in the semantic

informal knowledge repository.

Actors A01

Initial occurrence Scraping the web page of a (semantic) wiki article

Exceptional occurrence

Related use cases UC12

Id UC14

Title Filtering metadata

Description Some information provided by RDF extractors may not be

relevant for Didaskon. Thus, the triples must be filtered; only

the crucial information will be saved to the repository.

Actors A01

Initial occurrence Using RDF extractor.

Exceptional occurrence

Related use cases UC02

Id UC15

Title Saving data

Description Saving triples created during SSIS harvesting to the informal

knowledge repository.

Actors A01

Initial occurrence Using RDF extractor.

Exceptional occurrence Unsuccessful connection to the repository

Related use cases UC10

Id UC16

Title Transformation to LOM model

Description Triples collected from the informal knowledge repository must

be delivered to Didaskon in a common model. Because of

learning purposes, the model is LOM.

Actors A01

Initial occurrence Claim for LO manifest

Exceptional occurrence

Related use cases UC02

Id UC17

Title Selection from the repository

Description Metadata which will be delivered to Didaskon must be col-

lected from the informal knowledge repository.

Actors A01

Initial occurrence Claim for LO manifest or LO list

Exceptional occurrence Unsuccessful connection to the repository

Related use cases UC01, UC02

4.3 System design
So far, I have described the system requirements and defined the possible use
cases. In this Section, I describe the architecture of IKHarvester and explain
in more detail how the system works internally.

4.3.1 Service-Oriented Architecture


According to He [15], SOA is an architectural style that aims at loose coupling
among interacting software agents. A number of services each perform a unit of
work to fulfill the service consumer's needs. The services are independent; they do
not rely on the context and state of other services. The architecture demands using
interfaces based on Internet protocols like HTTP, FTP, and SMTP; all messages,
except for binary data attachments, must be described in XML. There are two
main types of Web Services: SOAP and REST.

SOAP

SOAP (Simple Object Access Protocol) Web Services are very popular nowadays.

SOAP is a protocol for transferring data between the source and the destination

through potential intermediate nodes. It forces developers to describe services in

WSDL (Web Services Description Language). Having a WSDL, it is easy to create

the core (stubs) of the client code which can call SOAP Web Services. Messages

sent with SOAP are wrapped by an envelope; within it, there is the content (body)

and some additional information.

REST

REST (REpresentational State Transfer) Web Services are based on the concept of a
resource - anything that is identified by a URI (Uniform Resource Identifier).
In fact, this style is commonly used nowadays in the World Wide Web and Web 2.0 [10, 45].

REST interfaces provide a representation of a resource in XML. There are four
possible HTTP methods:

• GET - for obtaining a representation of a resource

• POST - for updating or creating a representation of a resource

• PUT - for creating or replacing a representation of a resource

• DELETE - for removing a representation of a resource
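As a rough illustration of the four methods, the following sketch builds (but never sends) one request per method with the standard java.net.http API available since Java 11. The resource URI and the XML body are made up for the example; they are not part of IKHarvester.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;
import java.util.List;

final class RestMethodsSketch {
    // Hypothetical resource URI, used only for illustration.
    static final URI RESOURCE = URI.create("http://example.org/resources/42");

    // Builds one request per HTTP method; nothing is sent over the network.
    static List<HttpRequest> buildRequests() {
        String xml = "<resource><title>Example</title></resource>";
        return List.of(
            HttpRequest.newBuilder(RESOURCE).GET().build(),
            HttpRequest.newBuilder(RESOURCE)
                .header("Content-Type", "text/xml")
                .POST(BodyPublishers.ofString(xml)).build(),
            HttpRequest.newBuilder(RESOURCE)
                .header("Content-Type", "text/xml")
                .PUT(BodyPublishers.ofString(xml)).build(),
            HttpRequest.newBuilder(RESOURCE).DELETE().build());
    }

    public static void main(String[] args) {
        for (HttpRequest request : buildRequests()) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

All four requests address the same URI; only the method decides what happens to the resource, which is the core of the resource-oriented style.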

Employing SOA

I have decided to implement IKHarvester as a SOA layer for Didaskon. I have
developed a group of Web Services. Thus, IKHarvester is independent of the LMS,
which improves scalability, efficiency, extensibility and interoperability. This
fulfills a number of non-functional requirements.

Although both SOAP and REST have pros and cons, I have used REST since
it is resource-oriented and therefore more suitable for a Semantic Web solution [14].

4.3.2 System components


In this Section, I describe the details of IKHarvester's architecture. The component
diagram (see Fig. 4.3) depicts the high-level architecture of the system.

The responsibilities of the respective components are as follows:

IKHarvester - the core of the system. It is responsible for integrating its two
subcomponents:

• Harvester - performs informal knowledge harvesting from SSIS

• Provider - delivers informal knowledge stored in the repository in a
form compatible with the LOM standard

Jericho HTML Parser⁵ - a Java library for scraping web pages. It allows analysis
and manipulation of parts of HTML documents, including some common
server-side tags, while reproducing verbatim any unrecognized or invalid
HTML. It is freely available under the GNU Library or Lesser General Public
License (LGPL).

Figure 4.3: Component diagram

Didaskon LOM - a component developed by the Didaskon team. Its goal is to
manage data compatible with the LOM standard; it allows creating a Learning
Object manifest that sufficiently describes the learning material and exporting
it to XML.

Didaskon DB - a component developed by the Didaskon team. It provides the
interface for connecting to RDF storages. Consequently, informal knowledge
can be saved to and retrieved from the repository.

Informal Knowledge Repository - the RDF storage, implemented as a Sesame
repository. It contains triples describing the learning material.

4.3.3 Classes
Fig. 4.4 and Fig. 4.5 present the simplified class diagram for the IKHarvester system.
It covers a number of classes with their most important attributes and methods. The
classes are organized in a few packages.

The class diagram contains the following classes:

DataHarvester - an interface defining three methods for harvesting:
harvestContent(), harvestMetadata(), and removeResource().

DataHarvesterImpl - implements DataHarvester. This is the superclass of the
classes that harvest data from resources of different types. In the current
version of IKHarvester, it has four subclasses: WordPressDataHarvester,
BloggerDataHarvester, MediaWikiDataHarvester, and JeromeDLDataHarvester.

WordPressDataHarvester - used for harvesting informal knowledge from blog
posts that use the WordPress engine. The current version of IKHarvester tracks
only those WordPress blogs that support SIOC.

BloggerDataHarvester - used for harvesting informal knowledge from blogs
hosted on Blogger.

MediaWikiDataHarvester - used for harvesting informal knowledge from articles
hosted on wikis that use the MediaWiki engine.

JeromeDLDataHarvester - used for harvesting informal knowledge from
JeromeDL resources.

HarvestingResults - an enum class that defines how harvesting ends (for instance,
with success or with failure).

MediaWikiScraper - used for scraping web pages with wiki articles in order
to find crucial metadata. Its methods employ Jericho HTML Parser for that
purpose.

BloggerScraper - used for scraping blog posts hosted on Blogger in order to find
crucial metadata. Its methods employ Jericho HTML Parser for that purpose.

DataProvider - an interface that defines two methods for providing data stored
in the informal knowledge repository: getLOManifest() and getLOContent().

DataProviderImpl - implements the above-mentioned interface and calls its
subclasses responsible for providing data that has been collected from different
types of resources. It also delivers methods for obtaining the list of learning
objects stored in the informal knowledge repository.

BlogPostDataProvider - provides methods for retrieving metadata for posts from
the informal knowledge repository and providing them to eLearning
frameworks.

WikiArticleDataProvider - provides methods for retrieving metadata for wiki
articles from the informal knowledge repository and providing them to
eLearning frameworks.

DLResourceDataProvider - provides methods for retrieving metadata for digital
library resources from the informal knowledge repository and providing
them to eLearning frameworks.

NS - a set of a few classes that define namespaces for ontologies used for describing
blog posts, wiki articles and JeromeDL resources: NOTITIOUS, FOAF, XFOAF,
MarcOnt, XMarcOnt, JeromeDL, and SIOC.

NSDL - defines predicates used for digital library resources

NSBlog - defines predicates used for blog posts

NSWiki - defines predicates used for wiki articles

RDFQuery - a helper class containing a set of SeRQL queries

Util - contains a set of helper methods

Constant - defines constants used in IKHarvester classes

WikiArticleJBean - represents a wiki article

BlogPostJBean - represents a blog post

BloggerPostHTMLJBean - represents an HTML snippet with information
about a blog post

LOJBean - represents a LO that is returned in a collection of LOs

FieldValueWrapper - a helper wrapper

FieldValueType - an enum that defines different types of HTTP request
parameters

ContextKeeper - a helper class to access webapp context information

FieldValueWrapperMap - wraps a map constructed when processing a request
query

4.3.4 Extending IKHarvester


I have designed the IKHarvester system in a way that allows programmers to create
new blades - modules for managing other types of resources (see Fig. 4.6). To do
so, a programmer must be familiar with the class diagram (see Fig. 4.4 and Fig. 4.5).

Adding new data harvesters

Current version of IKHarvester captures metadata from SSIS with the following

classes:

• WordPressDataHarvester

Figure 4.4: Class diagram (part #1)

Figure 4.5: Class diagram (part #2)

• BloggerDataHarvester

• MediaWikiDataHarvester

• JeromeDLDataHarvester

Each of the above-mentioned classes works with a specific type of SSIS. It matters
whether a class captures information from, for example, a post hosted on Blogger
or one that runs on the WordPress engine, because the data is exposed differently.
Consequently, to provide support for, say, a new type of blog posts or wiki articles,
a programmer must write a new class that extends DataHarvesterImpl.
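A new harvesting blade could look roughly like the sketch below. Only the type names DataHarvester, DataHarvesterImpl and HarvestingResults and the three method names come from the class description; the method signatures, the enum constants, the in-memory map standing in for the repository, and the ExampleBlogDataHarvester class are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Possible outcomes of a harvesting operation (the constant names are assumed).
enum HarvestingResults { SUCCESS, FAILURE }

// The three method names come from the text; parameters and return types are assumed.
interface DataHarvester {
    String harvestContent(String resourceUri);
    HarvestingResults harvestMetadata(String resourceUri);
    HarvestingResults removeResource(String resourceUri);
}

// Simplified stand-in for the common superclass; a map replaces the RDF repository.
abstract class DataHarvesterImpl implements DataHarvester {
    protected final Map<String, Map<String, String>> repository = new HashMap<>();

    @Override
    public HarvestingResults removeResource(String resourceUri) {
        return repository.remove(resourceUri) != null
                ? HarvestingResults.SUCCESS : HarvestingResults.FAILURE;
    }
}

// Hypothetical blade for a blog engine that is not supported yet.
class ExampleBlogDataHarvester extends DataHarvesterImpl {
    @Override
    public String harvestContent(String resourceUri) {
        // A real blade would download the post; here a placeholder is returned.
        return "<content of " + resourceUri + ">";
    }

    @Override
    public HarvestingResults harvestMetadata(String resourceUri) {
        if (resourceUri == null || resourceUri.isEmpty()) {
            return HarvestingResults.FAILURE;
        }
        Map<String, String> metadata = new HashMap<>();
        metadata.put("dc:title", "Example post"); // made-up value
        repository.put(resourceUri, metadata);
        return HarvestingResults.SUCCESS;
    }
}
```

The shared behaviour (here only removal) stays in the superclass, so a blade has to supply just the engine-specific harvesting logic.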

Adding new data providers

Currently, there are three classes for retrieving metadata for captured resources from
the informal knowledge repository and providing them to eLearning frameworks:

• BlogPostDataProvider

• WikiArticleDataProvider

• DLResourceDataProvider

All the above mentioned classes extend DataProviderImpl that implements the

DataProvider interface. Those three classes support three types of SSIS: blogs, wikis

and digital libraries.

It is assumed that metadata for resources of each of those types is defined in a
common object model, regardless of whether it is, for instance, a post from
Blogger or WordPress, or a wiki article from MediaWiki or IkeWiki. As a result,
extending IKHarvester with a module that captures data from another kind of blog,
wiki or digital library does not require implementing a new providing module.
However, if such a new module is required, a new class that extends
DataProviderImpl must be added.
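A providing module for a completely new SSIS type might be sketched as follows. The names DataProvider, DataProviderImpl, getLOManifest() and getLOContent() are taken from the class description; the signatures, the register() helper, the map standing in for the repository, the simplified XML output and the ForumThreadDataProvider class are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Method names come from the text; parameters and return types are assumed.
interface DataProvider {
    String getLOManifest(String resourceUri);
    String getLOContent(String resourceUri);
}

// Simplified stand-in for the shared superclass; a map of titles replaces the repository.
abstract class DataProviderImpl implements DataProvider {
    private final Map<String, String> titlesByUri = new HashMap<>();

    // Hypothetical helper that simulates metadata already stored in the repository.
    void register(String resourceUri, String title) {
        titlesByUri.put(resourceUri, title);
    }

    // Each subclass names the resource type it provides.
    protected abstract String learningResourceType();

    @Override
    public String getLOManifest(String resourceUri) {
        String title = titlesByUri.get(resourceUri);
        if (title == null) {
            return null; // unknown resource
        }
        // LOM-like fragment for illustration only: no escaping, not schema-valid.
        return "<lom><title>" + title + "</title><learningResourceType>"
                + learningResourceType() + "</learningResourceType></lom>";
    }
}

// Hypothetical provider for a new SSIS type (forum threads).
class ForumThreadDataProvider extends DataProviderImpl {
    @Override
    protected String learningResourceType() {
        return "ForumThread";
    }

    @Override
    public String getLOContent(String resourceUri) {
        return "<content uri=\"" + resourceUri + "\"/>";
    }
}
```

Because the manifest is assembled in the superclass from the common object model, the subclass only contributes what is specific to its resource type.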

4.3.5 Attribute mapping rules


IKHarvester aims at capturing informal learning. Data harvesting means collecting
data from SSIS (in general - online communities) and saving them to the informal
knowledge repository. The repository stores these metadata in RDF triples
from which Learning Objects described according to the LOM standard are created and
delivered to Didaskon.

Figure 4.6: Blades for different SSIS types

Defining mapping rules between resources' attributes, their semantic representations
(predicates), and LOM attributes was crucial for further development. There are
plenty of properties that describe a resource. Semantic RDF feeds are very helpful
since they provide a mapping from attributes to predicates. Because they also give a
lot of information that is unnecessary from the learning perspective, their output must
be filtered while the LO manifest is composed.

In this Section, I describe the attribute mappings for each resource type IKHarvester
supports at the moment (blog posts, wiki articles, and JeromeDL resources).

Blog Posts

Metadata for blog posts is delivered by SIOC data exporters. A blog that supports
SIOC contains some additional information in a link tag (inside the head tag) in the
HTML code. For my blog, which is available at http://dobrzanski.net, it looks
as follows:

Listing 4.1: Support for SIOC information

    <link rel="meta" type="application/rdf+xml" title="SIOC"
          href="http://dobrzanski.net/index.php?sioc_type=site" />

The href attribute value is the URL of the RDF representation of the data on the
current page. Its value changes as the user browses the blog, so it always points to
up-to-date RDF output. In general, the output consists of information
about the blog itself and its posts.

Having the URL of SIOC data for a post, IKHarvester uses the exporter to obtain

the RDF graph which is saved to the informal knowledge repository.
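Locating the SIOC exporter URL in a page can be sketched as below. This is a dependency-free approximation that uses regular expressions from the Java standard library, not the mechanism IKHarvester actually uses; the class and method names are hypothetical, and the matching rule (rel="meta" together with type="application/rdf+xml") is inferred from Listing 4.1.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SiocLinkFinder {
    // Matches a whole <link ...> tag.
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches double-quoted attributes such as rel="meta".
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([a-zA-Z-]+)\\s*=\\s*\"([^\"]*)\"");

    // Returns the href of the first link tag that advertises RDF metadata, if any.
    static Optional<String> findSiocUrl(String html) {
        Matcher tag = LINK_TAG.matcher(html);
        while (tag.find()) {
            String rel = null;
            String type = null;
            String href = null;
            Matcher attribute = ATTRIBUTE.matcher(tag.group());
            while (attribute.find()) {
                String name = attribute.group(1).toLowerCase(Locale.ROOT);
                String value = attribute.group(2).trim();
                if (name.equals("rel")) {
                    rel = value;
                } else if (name.equals("type")) {
                    type = value;
                } else if (name.equals("href")) {
                    href = value;
                }
            }
            if ("meta".equalsIgnoreCase(rel)
                    && "application/rdf+xml".equalsIgnoreCase(type)
                    && href != null) {
                return Optional.of(href);
            }
        }
        return Optional.empty();
    }
}
```

A regular expression is enough for well-formed tags like the one in Listing 4.1; pages with single-quoted or unquoted attributes would need a real HTML parser.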

When it is asked to deliver data, it collects the RDF statements from the repository
and transforms them so that they describe the post in a way compatible with the LOM
standard. Since some of the metadata is not crucial for eLearning purposes, it is
filtered out while the LO manifest is created.

In the following table, I present how post attributes (first column) are mapped
to SIOC ontology predicates (second column) and then to LOM attributes (third
column). Some of the LOM attributes, which cannot be collected from the SIOC
exporter output, are set to default values. Attributes labeled with an asterisk (*) can
occur more than once.

Table 4.2: Mapping: posts attribute - semantic description - LOM.

Attribute | Predicate | LOM
- | sioc:Post | Educational.LearningResourceType=BlogPost
URI | - | Technical.Location &
 | | General.Identifier.Catalog=URI &
 | | General.Identifier.Entry &
 | | Meta-Metadata.Identifier.Catalog=URI &
 | | Meta-Metadata.Identifier.Entry
title | dc:title | General.Identifier.Title
creator | sioc:has_creator | Lifecycle.Contribute.Role=Author &
 | | Lifecycle.Contribute.Entity=Personal info. &
 | | Lifecycle.Contribute.Date=Date of creation &
 | | Meta-Metadata.Contribute.Role=Author &
 | | Meta-Metadata.Contribute.Entity=Personal info. &
 | | Meta-Metadata.Contribute.Date=Date
creation date | dcterms:link | Lifecycle.version=Date
description | SIOC:content | General.Description &
 | | Educational.Description &
 | | Classification.Description
rich content (HTML) | content:encoded | -
topic* | sioc:topic | General.Keyword &
 | | Classification.Keyword
reply* | sioc:has_reply | Annotation.Entity=About author &
 | | Annotation.Date=Date &
 | | Annotation.Description=Content
external link* | sioc:links_to | Relation.Kind=references &
 | | Relation.Resource.Identifier.Catalog=URI &
 | | Relation.Resource.Identifier.Entry &
 | | Relation.Resource.Description=references
language | - | General.Language &
 | | Educational.Language &
 | | Meta-Metadata.Language
- | - | Educational.InteractivityType=expositive
- | - | Educational.InteractivityLevel=medium
- | - | Educational.SemanticDensity=medium
- | - | Educational.IntendedEndUserRole=learner
- | - | Educational.Context=school &
 | | Educational.Context=higher education &
 | | Educational.Context=training &
 | | Educational.Context=other
- | - | Educational.Difficulty=easy
- | - | Rights.Cost=no
- | - | Rights.CopyrightAndOtherRestrictions=no
- | - | General.Structure=atomic
- | - | General.AggregationLevel=1
- | - | MetaMetadata.MetadataSchema=LOMv1.0
- | - | Technical.Requirement.OrComposite...
 | | .Type=operating system
 | | .Name=multi-os
 | | .Type=browser
 | | .Name=any
- | - | LifeCycle.Status=revised
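The mapping in Table 4.2 can be pictured as a lookup from predicates to LOM attributes plus a set of defaults. The sketch below covers only a few rows of the table and is an illustration under simplifying assumptions, not the actual IKHarvester code: statements are reduced to one string value per predicate (so repeated attributes such as topic* are not handled), and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BlogPostLomMapper {
    // A subset of Table 4.2: predicate -> LOM attributes that receive its value.
    static final Map<String, List<String>> PREDICATE_TO_LOM = Map.of(
            "dc:title", List.of("General.Identifier.Title"),
            "SIOC:content", List.of("General.Description",
                    "Educational.Description", "Classification.Description"),
            "sioc:topic", List.of("General.Keyword", "Classification.Keyword"));

    // A subset of the default values listed at the end of Table 4.2.
    static final Map<String, String> DEFAULTS = Map.of(
            "Educational.InteractivityType", "expositive",
            "Educational.Difficulty", "easy",
            "Rights.Cost", "no",
            "General.Structure", "atomic",
            "General.AggregationLevel", "1");

    // Copies mapped values into LOM attributes; unmapped predicates are filtered out.
    static Map<String, String> toLom(Map<String, String> statements) {
        Map<String, String> lom = new LinkedHashMap<>(DEFAULTS);
        for (Map.Entry<String, String> statement : statements.entrySet()) {
            List<String> targets = PREDICATE_TO_LOM.get(statement.getKey());
            if (targets == null) {
                continue; // e.g. content:encoded has no LOM counterpart
            }
            for (String target : targets) {
                lom.put(target, statement.getValue());
            }
        }
        return lom;
    }
}
```

Keeping the rules in data rather than in control flow makes it straightforward to add the analogous tables for wiki articles and JeromeDL resources.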

Wiki Articles

IKHarvester must collect data from semantic and non-semantic wikis which are
based on the MediaWiki engine. Much information (relations and attributes) about
the concept described in an article from a semantic wiki can be obtained by using
an RDF feed. However, harvesting should also be performed for non-semantic wikis, like
Wikipedia. It turns out there is quite a lot of semantics in the HTML code; different
parts of a page, like the title, content and categories, are put inside elements with
formalized identifiers. Thus, scraping the page yields a lot of crucial information.
In fact, I perform scraping for both semantic and non-semantic wikis.
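The idea of scraping by formalized identifiers can be sketched as follows. The sketch uses regular expressions from the standard library as a stand-in for Jericho HTML Parser, and it assumes the conventional MediaWiki markup in which the article title sits in an h1 element with id="firstHeading" and category links carry a title attribute starting with "Category:". Both the markup assumptions and the class name are illustrative, not taken from the IKHarvester source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MediaWikiScrapeSketch {
    // Assumed MediaWiki convention: <h1 id="firstHeading" ...>Title</h1>.
    private static final Pattern TITLE = Pattern.compile(
            "<h1[^>]*\\bid=\"firstHeading\"[^>]*>(.*?)</h1>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    // Assumed convention for category links: title="Category:Name".
    private static final Pattern CATEGORY =
            Pattern.compile("title=\"Category:([^\"]+)\"");

    // Returns the article title with nested markup removed, or null if absent.
    static String title(String html) {
        Matcher matcher = TITLE.matcher(html);
        return matcher.find()
                ? matcher.group(1).replaceAll("<[^>]+>", "").trim()
                : null;
    }

    // Returns the names of all categories linked from the page.
    static List<String> categories(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = CATEGORY.matcher(html);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }
}
```

The extracted title and categories correspond to the title and category* rows of the wiki article mapping; a production scraper would rely on a tolerant HTML parser rather than on regular expressions.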

In the following table, I present the mapping of wiki article attributes (first
column) to SIOC ontology predicates (second column) and then to
LOM attributes (third column). Some of the LOM attributes are set to default
values, as the LOM standard proposes. Attributes labeled with an asterisk
(*) can occur more than once; those with two asterisks (**) are served by RDF
feeds and can occur multiple times as well.

Table 4.3: Mapping: wiki article - semantic description - LOM.

Attribute | Predicate | LOM
- | sioc:WikiArticle | Educational.LearningResourceType=WikiArticle
URI | - | Technical.Location &
 | | General.Identifier.Catalog=URI &
 | | General.Identifier.Entry &
 | | Meta-Metadata.Identifier.Catalog=URI &
 | | Meta-Metadata.Identifier.Entry
title | dc:title | General.Identifier.Title
last modification date | dcterms:link | Lifecycle.version=Date
description | SIOC:content | General.Description &
 | | Educational.Description &
 | | Classification.Description
rich content (HTML) | content:encoded | -
category* | sioc:topic | General.Keyword &
 | | Classification.Keyword
external link* | sioc:links_to | Relation.Kind=references &
 | | Relation.Resource.Identifier.Catalog=URI &
 | | Relation.Resource.Identifier.Entry &
 | | Relation.Resource.Description=references
relation** | relation:xxx | Relation.Kind=xxx &
 | | Relation.Resource.Identifier.Catalog=URI &
 | | Relation.Resource.Identifier.Entry &
 | | Relation.Resource.Description=xxx
attribute** | attribute:xxx | Relation.Kind=has attribute &
 | | Relation.Resource.Identifier.Catalog=URI &
 | | Relation.Resource.Identifier.Entry &
 | | Relation.Resource.Description=has attribute
language | - | General.Language &
 | | Educational.Language &
 | | Meta-Metadata.Language
- | - | Educational.InteractivityType=expositive
- | - | Educational.InteractivityLevel=medium
- | - | Educational.SemanticDensity=medium
- | - | Educational.IntendedEndUserRole=learner
- | - | Educational.Context=school &
 | | Educational.Context=higher education &
 | | Educational.Context=training &
 | | Educational.Context=other
- | - | Educational.Difficulty=medium
- | - | Rights.Cost=no
- | - | Rights.CopyrightAndOtherRestrictions=no
- | - | General.Structure=atomic
- | - | General.AggregationLevel=1
- | - | MetaMetadata.MetadataSchema=LOMv1.0
- | - | Technical.Requirement.OrComposite...
 | | .Type=operating system
 | | .Name=multi-os
 | | .Type=browser
 | | .Name=any
- | - | LifeCycle.Status=revised

JeromeDL resources

JeromeDL exposes information about its resources in a few forms (see Sec. 3.1.3).
I have chosen the MarcOnt ontology, supported by the JeromeDL ontology. The mapping
rules for JeromeDL resources' attributes are presented in the following table.

Table 4.4: Mapping: JeromeDL resource - semantic description -

LOM.

Attribute | Predicate | LOM
- | jeromedl:Book | Educational.LearningResourceType=JeromeDLResource
URI | - | Technical.Location &
 | | General.Identifier.Catalog=URI &
 | | General.Identifier.Entry &
 | | Meta-Metadata.Identifier.Catalog=URI &
 | | Meta-Metadata.Identifier.Entry
title | marcont:hasTitles | General.Identifier.Title
creator | marcont:hasCreator | Lifecycle.Contribute.Role=Author &
 | | Lifecycle.Contribute.Entity=Personal info. &
 | | Lifecycle.Contribute.Date=Date of creation &
 | | Meta-Metadata.Contribute.Role=Author &
 | | Meta-Metadata.Contribute.Entity=Personal info. &
 | | Meta-Metadata.Contribute.Date=Date
abstract | jeromedl:abstract | General.Description &
 | | Educational.Description &
 | | Classification.Description
keyword* | marcont:hasKeyword | General.Keyword &
 | | Classification.Keyword
bookType | jeromedl:bookType | Educational.LearningResourceType
digitalType | jeromedl:digitalType | Technical.Format
protectionType | jeromedl:protectionType | Rights.Copyright=XXX &
 | | Rights.Cost
language | - | General.Language &
 | | Educational.Language &
 | | Meta-Metadata.Language
supervisor | xmarcont:supervisor | Lifecycle.Contribute.Role=Supervisor &
 | | Lifecycle.Contribute.Entity=Personal info. &
 | | Meta-Metadata.Contribute.Role=Supervisor &
 | | Meta-Metadata.Contribute.Entity=Personal info.
consultant | xmarcont:consultant | Lifecycle.Contribute.Role=Consultant &
 | | Lifecycle.Contribute.Entity=Personal info. &
 | | Meta-Metadata.Contribute.Role=Consultant &
 | | Meta-Metadata.Contribute.Entity=Personal info.
uploader | jeromedl:uploader | Lifecycle.Contribute.Role=Uploader &
 | | Lifecycle.Contribute.Entity=Personal info. &
 | | Meta-Metadata.Contribute.Role=Uploader &
 | | Meta-Metadata.Contribute.Entity=Personal info.
- | - | Educational.InteractivityType=expositive
- | - | Educational.InteractivityLevel=medium
- | - | Educational.SemanticDensity=medium
- | - | Educational.IntendedEndUserRole=learner
- | - | Educational.Context=school &
 | | Educational.Context=higher education &
 | | Educational.Context=training &
 | | Educational.Context=other
- | - | Educational.Difficulty=medium
- | - | General.Structure=atomic
- | - | General.AggregationLevel=1
- | - | MetaMetadata.MetadataSchema=LOMv1.0
- | - | Technical.Requirement.OrComposite...
 | | .Type=operating system
 | | .Name=multi-os
 | | .Type=browser
 | | .Name=any
- | - | LifeCycle.Status

Chapter 5

System implementation

In this chapter, I describe the software development methodology I followed during
the development of the IKHarvester system. Then, I present the tools I used and
give a brief description of the software that helped me both in writing
this Thesis and in developing IKHarvester.

5.1 Implementation methodology


IKHarvester has been built according to the waterfall software development model.
This model demands that an application is built by sequentially following a few
specified steps:

Requirements - the initial stage of the system development. This is the time for
collecting information on what the system should do.

Design - having the requirements, the designers create the architecture of the system
and explain how it will fulfill the demands.

Implementation - based on the prepared architecture, the programmers implement
the system. The result of this stage is a working product.

Testing (validation) - when the system works, it is time to validate it and check
whether it does what it should, in the way it should.

Integration - after fixing the bugs and deficiencies discovered during the testing
phase, the system can be deployed in the target environment.

Maintenance - the stage when the system is deployed and works in the
target environment. Although it is the last stage, a lot of effort must be
put into maintenance. Often, this is the time when previously undiscovered errors
occur. Also, the application can still be improved, and new features can be added.

5.2 Three-tier architecture


Although IKHarvester is a SOA layer, it can also present responses in web browsers.
Thus, it needs a presentation layer as well.

Figure 5.1: Three-tier architecture

All in all, IKHarvester has a three-tier architecture consisting of:

Presentation Tier - visualizes responses to the client in a web browser.

Logic Tier - contains the logic of the application. Based on the input arguments
provided by a client, it performs calculations, including those on the data from
the database.

Data Tier - handles the connection and queries to the database in order to get and
save data in the storage.

Each tier is related to a different aspect of the application (presentation, logic
and data). The general idea is that the Logic Tier is the middleware used by the
presentation layer (user's actions) in order to operate on the stored data.

5.3 IKHarvester main page


IKHarvester can be used by a software agent through the exposed Web Services, and by
a user with a web browser. The second approach relies on web pages. Fig. 5.2
shows the main web page with the menu and a
form for adding metadata for online resources to the informal knowledge repository.

Figure 5.2: IKHarvester main page

A user can get metadata for informal Learning Objects (see Sec. B.1), get the

list of informal Learning Objects (see Sec. B.3), get the information and support for

facilitating usage of IKHarvester with web browsers, and learn about the system.

5.4 Environment and necessary tools
5.4.1 Implementation environment
Java Platform

Java Platform is an environment dedicated to programmers who develop applications
using the Java programming language, which is supposed to be "write once, run
anywhere". It was created and is managed by Sun Microsystems. Java Platform
consists of a great many technologies. It has an execution engine which is called the
Java Virtual Machine (JVM).

Below, I briefly describe two of the technologies that Java Platform consists of,
which I decided to use in my system.

Java Standard Edition


Java SE¹ is a collection of Java programming APIs that are broadly used in many
Java platform programs. According to Sun², Java SE allows developing and
deploying applications on desktops and servers. Java SE also includes classes that
support development of Java Web Services, and provides the foundation for Java
Platform, Enterprise Edition (Java EE).

The core products in the Java SE family are:

• Java Runtime Environment (JRE) - provides Java APIs, the Java Virtual
Machine, and other components required to run applications and applets
written in the Java programming language

• Java Development Kit (JDK) - encapsulates the JRE and useful tools for
developers (compilers, debuggers)

1 http://java.sun.com/j2se/
2 http://www.sun.com/

Java Enterprise Edition

Java EE³ provides more classes than Java SE. They are dedicated to programs
running on servers rather than on workstations. Such applications have a multi-tier
architecture, based mainly on modular software running on servers.

Java EE is considered a standard, as providers must agree to certain conformance
requirements in order to declare their products Java EE compliant.
However, it is not a formal standard.

Web server - Apache Tomcat

Tomcat⁴ is a part of the open-source Apache Jakarta project. It is a
platform-independent servlet container, used in the official Reference Implementation
for the Java Servlet⁵ and Java Server Pages⁶ technologies.

RDF data storage - Sesame

Sesame⁷ is an open-source RDF database with support for RDF Schema inferencing
and querying. It is being developed as a part of the On-To-Knowledge project.
Sesame's benefits are good scalability, high query performance and support for
several RDF query languages, including SeRQL and RQL.

IDE - Eclipse

Eclipse⁸ is one of the most popular Integrated Development Environments. It is an
open-source, platform-independent software framework. It is developed, evaluated
and promoted by the Eclipse Foundation along with the community.

The platform has been designed to be extensible. Its power and abilities can
be extended by downloading and installing extensions called plug-ins.

3 http://java.sun.com/j2ee/
4 http://jacarta.apache.org/tomcat/
5 http://java.sun.com/products/servlets
6 http://java.sun.com/products/jsp
7 http://openrdf.org/
8 http://eclipse.org

Building the project - Apache Ant

Apache Ant⁹ is an open-source build tool written in the Java programming
language.

It is similar to Make, but simpler to use. It automates tasks like compiling,
building and deploying Java projects according to a configuration stored in
XML-based files.

Testing/logging - log4j

I have put a lot of effort into testing during the development. I have tried to test
new components as soon as they were created, so the risk of bugs was limited. For this
purpose I have used log4j¹⁰, a logging mechanism, with two levels of logs: errors and
information. It is easy to switch between these logging levels.

Logs are saved to a dedicated file. Each occurrence of an error is described in
detail: there is some information about the error itself, the reason why it occurred
and the place in the source code where it occurred.

Support for group work - Subversion

Subversion¹¹ (SVN) is a revision control system which facilitates application
development by a distributed group of programmers. It is designed to be a modern
replacement for the Concurrent Versions System (CVS)¹². It has a number of features:
atomic commits, versioning of symbolic links, native support for binary files, full
MIME support, etc.

5.4.2 Documentation
Thesis environment - LaTeX

LaTeX¹³ is a document markup language and document preparation system for
the TeX typesetting program.

9 http://ant.apache.org
10 http://logging.apache.org/log4j/
11 http://subversion.tigris.org/
12 http://www.nongnu.org/cvs/
13 http://www.latex-project.org/

It allows an author to focus on the content and meaning of the document he/she
writes instead of how it looks; the visual presentation is defined by using styles. Since
one specifies the logical structure of the document (chapters, sections, paragraphs,
etc.), he/she can easily change the way it looks by using another style.

LaTeX editor - TeXlipse

TeXlipse¹⁴ is a plug-in that adds LaTeX support for the Eclipse IDE (see Sec. 5.4.1). It
facilitates writing TeX documents by highlighting the syntax, outlining the
documents, code folding, and providing BibTeX and table editors.

UML diagrams - JUDE Community

All the UML diagrams used in this document have been created in the free
(Community) version of JUDE¹⁵ (Java and UML Developers' Environment).

JUDE Community supports a number of UML 1.4 diagrams: Class (Object/Package/
Robustness), Use Case, Collaboration, Statechart, Activity, etc. Moreover,
it can generate templates of Java source files, import Java source files, automatically
generate Class diagrams from model information, and more.

Figures - Inkscape

Inkscape¹⁶ is an open-source editor for creating vector graphics using the W3C¹⁷
standard Scalable Vector Graphics (SVG)¹⁸. It is designed to fully support XML, SVG,
and CSS standards.


14 http://texlipse.sourceforge.net/
15 http://jude.change-vision.com/jude-web/product/community.html
16 http://www.inkscape.org/
17 http://www.w3.org/
18 http://w3.org/Graphics/SVG/

5.5 Main problems and solution details
5.5.1 Implementation of REST
As stated before (see Sec. 4.3.1), IKHarvester is a SOA layer for Didaskon. Then,

Didaskon is a client that uses Web Services provided by my system. All available

Web Services must be specied, so that the relevant client code can be implemented.

My REST implementation employs the Java Servlet technology. I have imple-

mented the RESTRequestDispatcherServlet class  the servlet that handles requests

from the client and generates responses.

All requests handled by IKHarvester have URIs composed of the following parts:

• server - the server on which IKHarvester is hosted, for example:
  http://notitio.us/ikh/

• query string - defines the action to be performed. It is built in one of two ways:

  - URI - the URI of the LO to be used (it is put between $ characters). It
    can be followed by the manifest or content keyword

  - type - if a client wants to obtain a list of LOs of a given type, the type
    must be specified

Below are the definitions of all requests that can be sent to IKHarvester. The
following tables define usage from the client's point of view. When some features of
IKHarvester are invoked from a web page (provided along with the system), the query
string mode=admin is used in order to present some output on the page.
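To illustrate the URI scheme described above, the following minimal sketch shows how a dispatcher could split the part of the path that follows /soa into a resource URI, a keyword, or a type. It is not the code of RESTRequestDispatcherServlet; the names SoaPathParser, ParsedRequest, and parse are hypothetical and serve only as an illustration.

```java
/** Hypothetical helper, not part of IKHarvester: parses the path that follows "/soa". */
final class SoaPathParser {

    /** Parsed request; unused fields are null (or "" for the keyword). */
    static final class ParsedRequest {
        final String loUri;   // URI found between the two $ characters, or null
        final String keyword; // "manifest", "content", or "" when no keyword is given
        final String type;    // resource type of a list request, or null

        ParsedRequest(String loUri, String keyword, String type) {
            this.loUri = loUri;
            this.keyword = keyword;
            this.type = type;
        }
    }

    static ParsedRequest parse(String pathAfterSoa) {
        String path = (pathAfterSoa == null) ? "" : pathAfterSoa;
        if (path.startsWith("/")) {
            path = path.substring(1);
        }
        if (path.startsWith("$")) {
            // The keyword never contains '$', so the last '$' closes the URI.
            int close = path.lastIndexOf('$');
            if (close == 0) {
                throw new IllegalArgumentException("Missing closing $ in: " + path);
            }
            return new ParsedRequest(
                    path.substring(1, close), path.substring(close + 1), null);
        }
        // No $...$ part: a list request, optionally restricted to one type.
        return new ParsedRequest(null, "", path.isEmpty() ? null : path);
    }
}
```

In such a design, the HTTP method (GET, PUT, DELETE) together with the parsed keyword would select the action; an empty keyword with PUT or DELETE corresponds to adding or removing a LO.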

Get the list of available Learning Objects

This HTTP request is used for obtaining all the LOs that can be created from the
informal knowledge stored in the repository.

Table 5.1: REST - get LO list

URL           http://[server]/ikh/soa/[type]
Method        GET
Returns       All available LOs or LOs of the specified type
Content type  text/xml

Definitions:

server - the server on which IKHarvester is hosted

type - if not set, the Web Service delivers a list of LOs of all types; if set (BlogPost,
MediaWiki, JeromeDL), only resources of that type are returned

Examples:

http://notitio.us/ikh/soa
http://notitio.us/ikh/soa/BlogPost

Get the manifest of the Learning Object

This HTTP request is used for retrieving from the informal knowledge repository
the manifest of a specific LO in an XML form compatible with the LOM standard.

Table 5.2: REST - get LOM

URL           http://[server]/ikh/soa/$URI$manifest
Method        GET
Returns       LO manifest compatible with the LOM standard
Content type  text/xml

Definitions:

server - the server on which IKHarvester is hosted

URI - the URI of the resource

Examples:

http://notitio.us/ikh/soa/$http://dobrzanski.net/2007/03/15/pandora/$manifest

Get the content of the Learning Object

This HTTP request is used for obtaining the content of a specific LO in an XML
form. The content is collected on the fly; it is not stored in the repository.

Table 5.3: REST - get LO content

URL           http://[server]/ikh/soa/$URI$content
Method        GET
Returns       LO content
Content type  text/xml

Definitions:

server - the server on which IKHarvester is hosted

URI - the URI of the resource

Examples:

http://notitio.us/ikh/soa/$http://dobrzanski.net/2007/03/15/pandora/$content

Add Learning Object

This HTTP request is used for adding an informal LO to the repository or updating
an existing one. All crucial metadata for the given resource, except for the actual
content, are saved as triples describing it.

Table 5.4: REST - add LO

URL           http://[server]/ikh/soa/$URI$
Method        PUT
Returns
Content type

Definitions:

server - the server on which IKHarvester is hosted

URI - the URI of the resource

Examples:

http://notitio.us/ikh/soa/$http://dobrzanski.net/2007/03/15/pandora/$

Remove Learning Object

This HTTP request is used for removing an informal LO from the repository. It
should be invoked if the resource is no longer available on the Internet.

The resource is not physically removed from the repository. Instead, a triple
informing about the removal is added to the repository. This approach is necessary
because of synchronization problems that arise when more than one LMS uses
IKHarvester.
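The removal-by-marking approach described above can be sketched with a small in-memory triple store. This is only an illustration under stated assumptions: the class SoftDeleteSketch, its methods, and the predicate URI http://example.org/ns#removed are hypothetical, and the real system stores its triples in Sesame rather than in a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory illustration of marking a LO as removed. */
final class SoftDeleteSketch {

    // Assumed predicate URI; the predicate actually used by IKHarvester is not shown here.
    static final String REMOVED = "http://example.org/ns#removed";

    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    private final List<Triple> triples = new ArrayList<Triple>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Marks the LO as removed; all existing triples are kept. */
    void remove(String loUri) {
        if (!isRemoved(loUri)) {
            add(loUri, REMOVED, "true");
        }
    }

    boolean isRemoved(String loUri) {
        for (Triple t : triples) {
            if (t.subject.equals(loUri) && t.predicate.equals(REMOVED)) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return triples.size();
    }
}
```

Because the metadata stay in the store, every LMS that synchronizes later can still see the resource together with the information that it has been removed.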

Table 5.5: REST - remove LO

URL           http://[server]/ikh/soa/$URI$
Method        DELETE
Returns
Content type

Definitions:

server - the server on which IKHarvester is hosted

URI - the URI of the resource

Examples:

http://notitio.us/ikh/soa/$http://dobrzanski.net/2007/03/15/pandora/$
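From the client's point of view, the request formats of Tables 5.1 to 5.5 can be summarized in a small URL-building helper. The sketch below is an assumption-laden illustration: the class IkhRequestUrls and its methods are hypothetical and not part of Didaskon or IKHarvester. Following the examples in this section, the resource URI is embedded verbatim between the $ characters; a real client may need to percent-encode it.

```java
/** Hypothetical client-side helper that builds IKHarvester request URLs. */
final class IkhRequestUrls {

    private final String soaBase; // e.g. "http://notitio.us/ikh/soa"

    IkhRequestUrls(String soaBase) {
        // Drop a trailing slash so that paths can be appended uniformly.
        this.soaBase = soaBase.endsWith("/")
                ? soaBase.substring(0, soaBase.length() - 1)
                : soaBase;
    }

    /** GET: list of all LOs, or of one type when type is set. */
    String listUrl(String type) {
        return (type == null || type.isEmpty()) ? soaBase : soaBase + "/" + type;
    }

    /** GET: LOM manifest of the LO. */
    String manifestUrl(String loUri) {
        return resourceUrl(loUri) + "manifest";
    }

    /** GET: content of the LO. */
    String contentUrl(String loUri) {
        return resourceUrl(loUri) + "content";
    }

    /** PUT adds or updates the LO; DELETE marks it as removed. */
    String resourceUrl(String loUri) {
        return soaBase + "/$" + loUri + "$";
    }
}
```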

5.5.2 Invoking the data tier features


IKHarvester operates on data that are stored in the Sesame repository in the form
of RDF triples. The connection to the storage and the execution of queries are
handled in the SesameDBFace class, the only class in the Didaskon DB module
that has been prepared for that purpose.

Retrieving the connection to the data storage

The SesameDBFace class has been implemented according to the singleton pattern,
which here allows only one instance of the class per repository. Thus, there is a private
constructor that can be used only from within the getInstance(...) method, which is
invoked by the logic tier of the application. It is worth noting that the Didaskon
DB module can be used by any application. For that reason, there is a cache
(SESAME_DB_FACE_CACHE) of instantiated objects of that class, keyed by repository ID.

To learn how the connection to the data storage is created and managed, see
List. 5.1.
Listing 5.1: Retrieving the connection to the data storage

/**
 * cache of instances of dbfaces
 */
private static Map<String, SoftReference<SesameDBFace>>
    SESAME_DB_FACE_CACHE = new HashMap<String,
        SoftReference<SesameDBFace>>();

private SesameDBFace() {}

private SesameDBFace(LocalRepository repository1) {
    try {
        repository = repository1;
        graph = repository.getGraph();
        valueFactory = graph.getValueFactory();
    } catch (AccessDeniedException e) {
        throw new RuntimeException(e);
    }
}

/**
 * Returns SesameDbFace object associated with repository with given id
 *
 * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
 * @param repositoryId
 * @return
 */
public static SesameDBFace getInstance(String repositoryId) {
    return getInstance(repositoryId, null, null);
}

/**
 * Returns SesameDbFace object associated with repository with given id
 * and for user with given login and password
 *
 * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
 * @param repositoryId
 * @param login
 * @param password
 * @return
 */
public static SesameDBFace getInstance(String repositoryId,
        String login, String password) {
    SesameDBFace dbFace = null;
    synchronized (SESAME_DB_FACE_CACHE) {
        SoftReference<SesameDBFace> ref =
            SESAME_DB_FACE_CACHE.get(repositoryId);
        if (ref == null || ref.get() == null) {
            LocalService service = getService(login, password);
            try {
                LocalRepository repository =
                    (LocalRepository) service.getRepository(repositoryId);
                dbFace = SesameDBFace.getInstance(repository);
            } catch (UnknownRepositoryException e) {
                try {
                    LocalRepository repository =
                        service.createRepository(repositoryId, false);
                    dbFace = SesameDBFace.getInstance(repository);
                } catch (ConfigurationException e1) {
                    throw new RuntimeException(
                        "Failed to create sesame repository ("
                            + repositoryId + ")", e1);
                }
            } catch (ConfigurationException e) {
                throw new RuntimeException(
                    "Failed to get the repository ("
                        + repositoryId + ")", e);
            }
            SESAME_DB_FACE_CACHE.put(repositoryId,
                new SoftReference<SesameDBFace>(dbFace));
        } else {
            dbFace = ref.get();
        }
    }
    if (dbFace == null) {
        throw new RuntimeException("Failed to get the repository ("
            + repositoryId + ")");
    }
    return dbFace;
}

/**
 * Returns service that allows operations on the repository
 *
 * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
 * @param login
 * @param password
 * @return
 */
private static LocalService getService(String login, String password) {
    LocalService service = SesameServer.getLocalService();
    if (login != null && password != null) {
        try {
            service.login(login, password);
            return service;
        } catch (AccessDeniedException e) {
            throw new RuntimeException(
                String.format(ERR_ACCESS_DENIED, login), e);
        }
    }
    return service;
}

Querying the data storage

All queries to the data storage are handled by the SesameDBFace class from the
Didaskon DB module. Primarily, IKHarvester queries the storage to retrieve data.
For that purpose, it uses the performGraphQuery(String query, Object... args) method,
which takes two arguments (see List. 5.2):

• query - the SERQL query itself, which may contain format placeholders

• args - zero or more arguments that are substituted into the query
Listing 5.2: Querying the data storage

/**
 * Performs a SERQL query and returns a graph containing its result.
 *
 * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
 * @param query
 * @param args
 * @return
 */
public Graph performGraphQuery(String query, Object... args) {
    try {
        return repository.performGraphQuery(QueryLanguage.SERQL,
            String.format(query, args));
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}

To make the code of IKHarvester cleaner and to keep storage-related features
separate, I have prepared the RDFQuery class, which contains all the queries used
by the system (see List. 5.3).

Listing 5.3: RDF SERQL queries definition

public class RDFQuery {

    private RDFQuery() {}

    /**
     * construct * from {subject} predicate {object}
     */
    public static final String SELECT_ALL =
        "construct * from {subject} predicate {object}";

    /**
     * construct * from {<%s>} <%s> {<%s>}
     */
    public static final String SELECT_ALL_FOR_ALL =
        "construct * from {<%s>} <%s> {<%s>}";

    /**
     * construct * from {<%s>} b {c}
     */
    public static final String SELECT_ALL_FOR_SUBJECT =
        "construct * from {<%s>} b {c}";

    /**
     * construct * from {a} <%s> {c}
     */
    public static final String SELECT_ALL_FOR_PREDICATE =
        "construct * from {a} <%s> {c}";

    /**
     * construct * from {<%s>} <%s> {c}
     */
    public static final String SELECT_OBJECT_FOR_SUBJECT_AND_PREDICATE =
        "construct * from {<%s>} <%s> {c}";

    /**
     * construct * from {a} <%s> {<%s>}
     */
    public static final String SELECT_SUBJECT_FOR_PREDICATE_AND_OBJECT =
        "construct * from {a} <%s> {<%s>}";

    /**
     * construct * from {_:%s} b {c}
     */
    public static final String SELECT_ALL_FOR_BLANKNODESUBJECT =
        "construct * from {_:%s} b {c}";

    /**
     * construct * from {_:%s} <%s> {c}
     */
    public static final String
        SELECT_OBJECT_FOR_BLANKNODESUBJECT_AND_PREDICATE =
            "construct * from {_:%s} <%s> {c}";

    /**
     * construct * from {<%s>} b {c} where c like "%s*"
     */
    public static final String
        SELECT_ALL_FOR_SUBJECT_AND_PREDICATE_LIKE =
            "construct * from {<%s>} b {c} where c like \"%s*\"";
}
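The way the query templates are combined with arguments can be shown in isolation. The following sketch copies two templates from the RDFQuery class and fills them with String.format, which is what performGraphQuery does before passing the query to Sesame. The class QueryTemplateSketch, the fill method, and the example.org URIs are hypothetical and used only for illustration.

```java
/** Hypothetical illustration of filling SERQL query templates with String.format. */
final class QueryTemplateSketch {

    // Templates as defined in the RDFQuery class.
    static final String SELECT_OBJECT_FOR_SUBJECT_AND_PREDICATE =
            "construct * from {<%s>} <%s> {c}";
    static final String SELECT_ALL_FOR_SUBJECT_AND_PREDICATE_LIKE =
            "construct * from {<%s>} b {c} where c like \"%s*\"";

    /** Substitutes the arguments into the %s placeholders, in order. */
    static String fill(String template, Object... args) {
        return String.format(template, args);
    }
}
```

Since the arguments are inserted verbatim, this technique presumably relies on the values being well-formed URIs; the sketch performs no escaping.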

5.5.3 Extending IKHarvester
One of the requirements for the IKHarvester system is that it can be extended to
support new types of online resources (see Sec. 4.3.4). Writing new features should
be easy. Therefore, I have decided that new classes for harvesting metadata should
extend the DataHarvesterImpl class, and those for providing metadata from the
informal knowledge repository should extend the DataProviderImpl class. These two
classes implement the DataHarvester and the DataProvider interfaces, respectively.

Since the idea behind both the harvesting and the providing features is the same, in
the following listings (see List. 5.4 and List. 5.5) I present the mechanism only for
the providing classes.

Listing 5.4: DataProvider interface

public interface DataProvider {

    /**
     * Returns a LOJBean manifest containing
     * metadata relevant for course composition
     *
     * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
     * @return
     */
    public LOManifest getLOManifest();

    /**
     * Returns the content of a specific LOJBean.
     *
     * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
     * @param resourceType The type of the resource
     * @param content Output parameter that receives the content
     * @return Information about whether
     * the operation succeeded or failed
     * @throws IOException
     */
    public HarvestingResults getLOContent(String resourceType,
        StringBuffer content) throws IOException;
}

The DataProviderImpl class implements the methods from the DataProvider interface:
getLOManifest() and getLOContent(String resourceType, StringBuffer content).
Based on the type of the resource, the former method creates an instance of the
specific subclass of DataProviderImpl by using Java Reflection. Then, the features of
the subclass are invoked.

The name of the subclass has a prefix equal to the name of the resource type, while
the suffix is always DataProvider, for instance BlogPostDataProvider.
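The naming convention can be tried out in a self-contained form. The sketch below is not the IKHarvester code: ProviderLookupSketch, Provider, and describe are hypothetical names, and nested classes are used only so that the example fits in one file (hence the "$" in the class name, where the real system concatenates the package name and a dot).

```java
import java.lang.reflect.Constructor;

/** Hypothetical illustration of locating a provider class by naming convention. */
final class ProviderLookupSketch {

    interface Provider {
        String describe();
    }

    /** Example provider for the resource type "BlogPost". */
    public static final class BlogPostDataProvider implements Provider {
        private final String uri;

        public BlogPostDataProvider(String uri) {
            this.uri = uri;
        }

        public String describe() {
            return "BlogPost:" + uri;
        }
    }

    /** Builds the name "<resourceType>DataProvider" and instantiates that class. */
    static Provider lookup(String resourceType, String uri) {
        // Nested classes have binary names of the form Outer$Inner.
        String className = ProviderLookupSketch.class.getName()
                + "$" + resourceType + "DataProvider";
        try {
            Class<?> providerClass = Class.forName(className);
            Constructor<?> ct = providerClass.getConstructor(String.class);
            return (Provider) ct.newInstance(uri);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "No provider for resource type " + resourceType, e);
        }
    }
}
```

With this convention, supporting a new resource type only requires adding a class with the matching name; the dispatching code does not change.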

Listing 5.5: DataProviderImpl class

public class DataProviderImpl implements DataProvider {

    /**
     * URI of the resource
     */
    protected String uri = null;

    /**
     * Gives access to a repository
     */
    protected SesameDBFace dbFace = null;

    protected static Logger logger =
        Logger.getLogger(DataProviderImpl.class);

    /**
     * Constructor using resources uri
     * @param uri
     */
    public DataProviderImpl(String uri1) {
        uri = uri1;
        dbFace = SesameDBFace.getInstance(Constant.REPOSITORY_ID);
    }

    /* (non-Javadoc)
     * @see org.corrib.ikharvester.provider.
     * DataProvider#getLOManifest()
     */
    public LOManifest getLOManifest() {
        LOManifest manifest = null;

        // get the resource type
        StatementIterator iter =
            dbFace.getGraphStatements(
                RDFQuery.SELECT_ALL_FOR_SUBJECT_AND_PREDICATE_LIKE,
                uri, NS.NOTITIOUS.resourceType);
        if (iter == null || !iter.hasNext()) {
            return null;
        }

        // there is at most one entry
        String resType = iter.next().getObject().toString().
            substring(NS.NOTITIOUS.resourceType.length());
        if (!Util.isStringSet(resType)) {
            return null;
        }

        try {
            Class providerClass = Class.forName(
                DataProviderImpl.class.getPackage().
                    getName() + "." + resType + "DataProvider");
            Class constrArgsTypes[] = { String.class };
            Constructor ct = providerClass.getConstructor(constrArgsTypes);
            Object args[] = { uri };
            DataProvider provider = (DataProvider) ct.newInstance(args);
            manifest = provider.getLOManifest();
        } catch (SecurityException e) {
            logger.error("SecurityException", e);
        } catch (IllegalArgumentException e) {
            logger.error("IllegalArgumentException", e);
        } catch (ClassNotFoundException e) {
            logger.error("ClassNotFoundException", e);
        } catch (NoSuchMethodException e) {
            logger.error("NoSuchMethodException", e);
        } catch (InstantiationException e) {
            logger.error("InstantiationException", e);
        } catch (IllegalAccessException e) {
            logger.error("IllegalAccessException", e);
        } catch (InvocationTargetException e) {
            logger.error("InvocationTargetException", e);
        }
        return manifest;
    }

    /* (non-Javadoc)
     * @see org.corrib.ikharvester.provider.DataProvider#getLOContent(
     * String resourceType, StringBuffer content)
     */
    public HarvestingResults getLOContent(String resourceType,
            StringBuffer content) throws IOException {
        DataHarvester harvester = new DataHarvesterImpl(uri, resourceType);
        return harvester.harvestContent(content);
    }

    /**
     * Returns a list containing all LOs that can be collected
     * from the informal knowledge repository.
     *
     * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
     * @return
     */
    public static List<LOJBean> getLOList() {
        List<LOJBean> objects = new ArrayList<LOJBean>();
        objects.addAll(getLOList(Constant.RESOURCE_TYPE_BLOG));
        objects.addAll(getLOList(Constant.RESOURCE_TYPE_DL));
        objects.addAll(getLOList(Constant.RESOURCE_TYPE_WIKI));
        Collections.sort(objects);
        return objects;
    }

    /**
     * Returns a list containing all LOs of a specified type
     *
     * @param type
     * @author Jaroslaw Dobrzanski <jaroslaw@dobrzanski.net>
     * @return
     */
    public static List<LOJBean> getLOList(String type) {
        SesameDBFace dbFace = SesameDBFace.getInstance(
            Constant.REPOSITORY_ID);
        List<LOJBean> los = new ArrayList<LOJBean>();

        StatementIterator iter = dbFace.getGraphStatements(
            RDFQuery.SELECT_SUBJECT_FOR_PREDICATE_AND_OBJECT,
            RDF.TYPE, NS.NOTITIOUS.resourceType + type);
        if (iter == null) {
            return los;
        }

        while (iter.hasNext()) {
            String uri = iter.next().getSubject().toString();

            // check if removed, this information must also be provided
            boolean removed = false;
            StatementIterator it = dbFace.getGraphStatements(
                RDFQuery.SELECT_OBJECT_FOR_SUBJECT_AND_PREDICATE,
                uri, NSBlog.removed);
            if (it == null) {
                continue;
            }
            if (it.hasNext()) {
                removed = true;
            }
            los.add(new LOJBean(uri, removed));
            it.close();
        }
        iter.close();

        Collections.sort(los);
        return los;
    }
}

5.5.4 Adding data to the informal knowledge repository


IKHarvester allows adding resources to the informal knowledge repository either
with a client application or by putting their URL in the form on the main page of
the system. The latter method might be bothersome, since every time a user must
go to the above mentioned web page and then return to the page he/she was viewing.

To meet users' needs, I have created an add-on for Firefox, one of the most
popular web browsers. In general, an add-on adds some functionality to a piece of
software. The one I have created works with the implementation of IKHarvester
deployed on the notitio.us¹⁹ project (see Sec. 6.1.2); it adds a button with a "post
to notitio.us" link on web pages of any type supported by IKHarvester (see
Fig. 5.3).

Figure 5.3: IKHarvester - support for web browsers

19 http://notitio.us/

Any time a user visits such a page, he/she can click the above mentioned button.
He/she is then redirected to one of the IKHarvester web pages, where the initial page
can be tagged and saved to the informal knowledge repository. All the information in
that repository is shared.

The IKHarvester add-on for Firefox can be downloaded from
http://notitio.us/ikh/browser.jsp.
Chapter 6

Conclusions

Nowadays, there is a lot of informal knowledge available on the Internet; it can be
found in blogs, fora, digital libraries, wikis, etc. The amount of such data is growing
rapidly, and more and more tools for managing it are being developed.

In this thesis, I have included the results of my research in the field of the Semantic
Web and eLearning. I have presented those two approaches and defined the
model of Social Semantic Information Sources. I have proposed a way of capturing
informal learning from a few types of SSIS and delivering it to eLearning 2.0
frameworks. Then, I have designed the architecture of such a system (IKHarvester)
and developed it. Finally, I have successfully deployed IKHarvester in a real
environment.

6.1 Achievements
6.1.1 Publications
This thesis is dedicated to the issue of collecting informal knowledge from Social
Semantic Information Sources. The idea of how to capture data from online resources
is quite innovative and there is a lot of effort put into research in that field.

The Semantic Infrastructure (SemInf) lab, the core part of the Corrib Cluster¹
in the Digital Enterprise Research Institute², of which I am a member, is also
interested in this area. During the last months, we have created a few publications.

1 http://corrib.org/
2 http://deri.org

The following list presents several of those articles:

• S. R. Kruk, A. Gzella, J. Dobrzański, B. McDaniel, T. Woroniecki. E-Learning
  on the Social Semantic Information Sources. Proceedings of the Second European
  Conference on Technology Enhanced Learning, 2007, Crete, Greece,
  September 17-20, 2007

• J. Dobrzański, S. R. Kruk, T. Nagle, E. Curry, A. Gzella. IKHarvester - Informal
  eLearning with Semantic Web Harvesting. 6th International Semantic
  Web Conference, 2007, Busan, Korea, November 11-15, 2007 (Submitted)

• J. Dobrzański. Employing Social Semantic Information Sources for e-Learning.
  Faculty of Engineering Research Day 2007, Galway, Ireland, April 16, 2007

• J. Jankowski, F. Czaja, J. Dobrzański. Adapting informal sources of knowledge
  to e-Learning. 5th Annual Conference on Teaching and Learning: Learning
  Technologies, Galway, Ireland, June 7-8, 2007

• J. G. Breslin, S. Grzonkowski, A. Gzella, S. R. Kruk, T. Woroniecki, and
  J. Dobrzański. Sharing Information Across Community Portals with FOAFRealm.
  International Journal of Web Based Communities (Submitted)

6.1.2 IKHarvester
Although the current version of IKHarvester is a prototype, it works well and collects
a lot of relevant data from SSIS.

Benefits

To recap, there are a few solutions for capturing and managing semantic annotations
(metadata) for online resources useful in the learning process: PingtheSemanticWeb.com,
Piggy Bank, and Zotero (see Sec. 4.1.1). Although their goal is similar,
they achieve it in different ways. Table 6.1 explicitly shows the differences between
the above mentioned solutions, indicating the level of support for each feature.
Table 6.1: Comparison of tools for collecting informal data

Feature | IKHarvester | Ping the Semantic Web | Piggy Bank | Zotero
Integration with browsers | buttons: FF, Opera, IE; add-on for FF | buttons: FF, Opera, IE | an FF add-on itself | an FF add-on itself
Support for Semantic MediaWikis | full (also additional information besides that from RDF) | full | full | none
Support for Wikipedia | some | none | weak | none
Support for JeromeDL | full | some | full | weak
Tagging | yes | no | yes | yes
Is remote service | yes | yes | no, but works with Semantic Bank | no
Accessible with Web Services | yes | yes | no | no
Allows data sharing | yes | yes | partially - sharing with Semantic Bank | no
Support for new document types (extensibility) | yes - writing new blades | no - dependency on the authors of the tool | yes - writing new screen scrapers | no - dependency on the authors of the tool
Integration with web browsers is crucial for such systems. The more web browsers
the system supports, the better; such a tool should not demand using a specific
browser. Since Piggy Bank and Zotero are Firefox add-ons, they are integrated
only with this browser. IKHarvester and PingtheSemanticWeb.com also support
Internet Explorer and Opera by providing special buttons for capturing data.
Moreover, some features of IKHarvester can be invoked by using a special add-on for
Firefox (see Sec. 5.5.4).

All compared tools, except for Zotero, are able to collect a sufficient amount of
metadata for online resources available on web pages by reading the RDF documents
that those pages link to. By sufficient, we mean more information than the URL
or the title of the resource. For instance, there should be some information about
the author of the resource or related resources. IKHarvester distinguishes itself in
that it also collects metadata from non-semantic web pages, like Wikipedia, which is
a treasury of informal knowledge.

To make better use of metadata for learning purposes, it should be shared
and made available to all. For that reason, it is necessary to expose it through Web
Services, which improves its accessibility and reusability. Also, tagging helps in
managing collected information and facilitates searching and browsing. Again,
IKHarvester acquits itself well. All shared data can be retrieved, saved, and tagged
by calling REST Web Services.

Beyond that, IKHarvester has a considerable eLearning background. It treats
online resources as learning material (informal Learning Objects), and uses captured
data as its description. Moreover, IKHarvester delivers these metadata in a form that
conforms to the LOM standard. This rich information is used by eLearning LMSs
to perform accurate reasoning and provide well tailored courses.

Success stories

IKHarvester can be used in many projects. At the moment it is employed
by two of the SemInf group's projects: Didaskon and notitio.us.

Didaskon

IKHarvester has been designed as a SOA layer for Didaskon³, a system designed
according to eLearning 2.0 assumptions. Didaskon delivers a framework for composing
an on-demand curriculum from existing learning objects provided by eLearning
services (formal knowledge). In addition, it draws on SSIS, the sources of informal
knowledge [53].

Based on some preconditions, Didaskon creates a learning path which best fits a
specific learner. To achieve that, the system uses initial information (preconditions)
like a student's needs, skills, learning history, etc., anticipated resulting skills and
knowledge (goals), and technical details of the client's platform.

Initially, IKHarvester was supposed to work with Didaskon⁴, an eLearning 2.0
framework (see Sec. 3.3.2). However, during the development, I have found another
application that could employ IKHarvester.

notitio.us

notitio.us⁵ is a service for collaborative knowledge aggregation and sharing; it
employs IKHarvester for retrieving RDF information about Web resources bookmarked
by the users. Therefore, it is capable of indexing rich metadata coming from various
types of resources; in contrast to bookmarking services such as del.icio.us, notitio.us
keeps rich, semantically interconnected metadata shared by the users using Social
Semantic Collaborative Filtering [28].

The resources can not only be shared with the bookmarking interface (SSCF),
but also, based on the rich metadata, be searched and browsed using
TagsTreeMaps⁶, a tag browser based on a treemap rendering algorithm, and
MultiBeeBrowse, a collaborative browsing component; these components improve the
user's browsing experience, utilizing metadata delivered by IKHarvester.

One of the modules delivered by IKHarvester allows exposing aggregated metadata
in the LOM standard, which turns notitio.us into a valuable source of learning objects
based on informal knowledge, delivered by IKHarvester (see Fig. 6.1).

Figure 6.1: IKHarvester in the notitio.us service

To learn more about IKHarvester and notitio.us, please visit the project home page:
http://notitio.us/ikh/.

3 http://didaskon.corrib.org/
4 http://didaskon.corrib.org/
5 http://notitio.us/
6 TagsTreeMaps: http://sf.net/projects/tagstreemaps/

6.2 Future work


As stated before, the current version can operate on three resource types: wikis based
on the MediaWiki engine, blogs that support SIOC, and JeromeDL.

The system was designed in a manner that allows extending it so that it works
with other sources of informal knowledge (see Fig. 4.6). In the future, it should support
more types of online resources, among others: Bricks (another digital library), blogs
hosted on Blogger⁷, and other types of wiki engines.

7 http://www.blogger.com/
Bibliography

[1] Marc (machine readable cataloging) and sgml/xml.

http://xml.coverpages.org/marc.html, July 2002.

[2] M.-H. Abel, A. Benayache, D. Lenne, C. Moulin, C. Barry, and B. Chaput.
Ontology-based organizational memory for e-learning. Educational Technology
& Society, 7(4):98-111, 2004.

[3] L. Aroyo and D. Dicheva. The new challenges for e-learning: The educational
semantic web. Educational Technology & Society, 7(4):59-69, 2004.

[4] U. Bojars, J. Breslin, and A. Passant. SIOC browser - towards a richer blog
browsing experience. In Accepted for the 4th Blogtalk Conference (Blogtalk
Reloaded), Vienna, Austria.

[5] J. Brase, M. Painter, and W. Nejdl. Completing LOM - how additional axioms

increase the utility of learning object metadata. In ICALT, page 493. IEEE

Computer Society, 2003.

[6] M. Cygan. Ubiquitous search service component gateway for heterogeneous l2l

network. Master's thesis, Gdansk University of Technology, 2006.

[7] L. Dodds. An Introduction to FOAF.

http://www.xml.com/pub/a/2004/02/04/foaf.html, February 2004.

[8] S. Downes. E-learning 2.0. eLearn magazine. Online; accessed May 1st, 2007;
http://www.elearnmag.org/subpage.cfm?section=articles&article=29-1.
[9] DublinCore Initiative, http://dublincore.org/documents/dces/. Dublin Core

Metadata Element Set, Version 1.1: Reference Description.

[10] R. T. Fielding. Architectural styles and the design of network-based software
architectures, 2000. Online; accessed April 7, 2007;
http://www1.ics.uci.edu/%7Efielding/pubs/dissertation/rest_arch_style.htm.

[11] E. M. Frank Manola. Rdf primer, w3c recommendation. Online; accessed
December 16, 2006; http://www.w3.org/TR/rdf-primer/.

[12] P. Graham. Web 2.0. Online; accessed December 18, 2006;
http://www.paulgraham.com/web20.html.

[13] S. Grzonkowski, A. Gzella, H. Krawczyk, S. R. Kruk, F. J. M.-R. Moyano, and
T. Woroniecki. D-FOAF - Security Aspects in Distributed User Management
System. In TEHOSS'2005.

[14] A. Gzella. Service oriented architecture for distributed identity management

system. Master's thesis, Gdansk University of Technology, September 2006.

[15] H. He. What is service-oriented architecture, September 2003. Online; accessed
April 7, 2007; http://webservices.xml.com/pub/a/ws/2003/09/30/soa.html.

[16] M. V. Heiko Haller, Markus Kroetzsch and D. Vrandecic. Semantic wikipedia.
In D. Riehle and J. Noble, editors, Proceedings of the 2006 International
Symposium on Wikis, 2006, Odense, Denmark, August 21-23, 2006, pages 137-138.
ACM, 2006.

[17] J. Hendler and O. Lassila. The semantic web. Scientific American Magazine,
May 2001.

[18] IEEE. Draft standard for learning object metadata. Technical report,
Institute of Electrical and Electronics Engineers, Inc.,
http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf, 2002.

[19] J. Jankowski. Internetowy system zdalnego nauczania oparty o usługi sieciowe.
Master's thesis, Gdansk University of Technology, 2006.
[20] A. H. John Breslin, Stefan Decker. Sioc: an approach to connect web-based
communities. Int. J. of Web Based Communities, 2:132-142, Jul 2006.

[21] S. D. John Breslin. Semantic web 2.0: Creating social semantic information
spaces.

[22] D. R. Karger and D. Quan. What would it mean to blog on the semantic web?
J. Web Sem, 3:147-157, 2005.

[23] T. Karrer. elearning 2.0: Informal learning, communities, bottom-up vs.
top-down, Feb 2006. Online; accessed September 20, 2006;
http://elearningtech.blogspot.com/2006/02/elearning-20-informal-learning.html.

[24] R. Khare. Microformats: The next (small) thing on the semantic web? IEEE
Internet Computing, 10(1):68-75, 2006.

[25] KnowledgeNet. Knowledgenet - history of e-learning. Online; accessed
November 6, 2006;
http://www.knowledgenet.com/corporateinformation/ourhistory/history.jsp.

[26] S. R. Kruk. MarcOnt Initiative. Technical report, DERI.Galway, Ireland,

http://www.marcont.org/, 10 2004. Bibliographic description and related tools

utilising Semantic Web technologies.

[27] S. R. Kruk. E-learning on semantic web 2.0, 2006. Online; accessed November 5,
2006;
http://www.sebastiankruk.com/storage/presentation/elearning_on_sw20/img0.html.

[28] S. R. Kruk, S. Decker, A. Gzella, S. Grzonkowski, and B. McDaniel. Social
semantic collaborative filtering for digital libraries. Journal of Digital
Information, Special Issue on Personalization, 2006.

[29] S. R. Kruk, S. Grzonkowski, A. Gzella, and M. Cygan. Digime - ubiquitous
search and browsing for digital libraries. In Mobile Data Management, 2006.

[30] S. R. Kruk, S. Grzonkowski, A. Gzella, T. Woroniecki, and H.-C. Choi. D-foaf:
Distributed identity management with access rights delegation. In The Semantic
Web - ASWC 2006, First Asian Semantic Web Conference, Beijing, China,
September 3-7, 2006, Proceedings, volume 4185 of Lecture Notes in Computer
Science, pages 140-154, 2006.

[31] S. R. Kruk, K. Samp, T. Woroniecki, A. Westerski, F. Czaja, and C. O'Nuallain. E-learning based on the social semantic information sources. Submitted to ISWC, 2006.

[32] A. D. Learning. Scorm homepage. Online; accessed May 1st, 2007; http://www.adlnet.gov/scorm/.

[33] R. S. Ljiljana Stojanovic, Steffen Staab. elearning based on the semantic web, Jan. 2001. Online; accessed November 5, 2006; http://www.aifb.uni-karlsruhe.de/WBS/Publ/2001/WebNet_lstsstrst_2001.pdf.

[34] E. T. Marieke Guy. Folksonomies: Tidying up tags? D-Lib Magazine, 12(1), Jan. 2006. Online; accessed April 30th, 2007; http://www.dlib.org/dlib/january06/guy/01guy.html.

[35] A. Mathes. Folksonomies - cooperative classification and communication through shared metadata. December 2004. Online; accessed April 30th, 2007; http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html.

[36] D. G. W. Mission. Beyond elearning: practical insights from the USA. Technical report, May 2006.

[37] K. Moeller. A generalised approach for generating semantic metadata in the blogosphere.

[38] E. Moreale and M. Vargas-Vera. Semantic services in e-learning: an argumentation case study. volume 7, pages 112-128, 2004.

[39] W. Nejdl, B. Wolf, C. Qu, S. Decker, M. Sintek, A. Naeve, M. Nilsson, M. Palmér, and T. Risch. Edutella: A p2p networking infrastructure based on rdf. Jan. 01 2002.
[40] M. of New Media. E-learning - m/cyclopedia of new media, 2006. Online; accessed November 5, 2006; http://wiki.media-culture.org.au/index.php/E-Learning.

[41] S. O'Hear. E-learning 2.0 - how web technologies are shaping education. Online; accessed September 19, 2006; http://www.readwriteweb.com/archives/e-learning_20.php.

[42] S. O'Hear. Seconds out, round two. The Guardian, 2005. Online; accessed September 20, 2006; http://education.guardian.co.uk/elearning/story/0,10577,1642281,00.html.

[43] T. O'Reilly. What is web 2.0. Online; accessed December 16, 2006; http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html.

[44] E. Oren. SemperWiki: a semantic personal Wiki. In Proceedings of the ISWC Workshop on the Semantic Desktop, Nov. 2005.

[45] P. Prescod. Rest and the real world, February 2002. Online; accessed April 7, 2007; http://webservices.xml.com/lpt/a/ws/2002/02/20/rest.html.

[46] H. Rollett, M. Lux, M. Strohmaier, G. Dosinger, and K. Tochtermann. The web 2.0 way of learning with technologies. Int. J. of Learning Technology, 3:87-107, Feb. 07 2007.

[47] K. Z. Sebastian R. Kruk, Marcin Synak. Marcont - integration ontology for bibliographic description formats. In Proceedings of the International Conference on Dublin Core and Metadata Applications (DC-2005), Madrid, Spain, September 2005.

[48] K. Z. Sebastian R. Kruk, Marcin Synak. Marcont initiative - mediation services for digital libraries. In Proceedings of ECDL 2005.

[49] L. Z. Sebastian R. Kruk, Stefan Decker. Jeromedl - managing digital library database with the semantic web technologies. In Proceedings of the 16th International Conference on Database and Expert Systems Applications. Copenhagen, Denmark, 2005.

[50] S. D. Sebastian R. Kruk. Semantic social collaborative filtering with foafrealm. 2005. Semantic Desktop Workshop, ISWC 2005.

[51] B. Simon, Z. Miklós, W. Nejdl, M. Sintek, and J. Salvachúa. Elena: A mediation infrastructure for educational services. In WWW (Alternate Paper Tracks), 2003.

[52] D. Team. Didaskon project description, 2006. Online; accessed September 20, 2006; http://wiki.jeromedl.org/Projects/W2W/WorkingGroupProjectETI/2006/Didaskon.

[53] D. Team. Didaskon project documentation. Technical report, Digital Enterprise Research Institute (DERI), http://didaskon.corrib.org/, 2006.

[54] K. M. Uldis Bojars, John Breslin. Using semantics to enhance the blogging experience. In The Semantic Web: Research and Applications, 3rd European Semantic Web Conference, ESWC 2006, Budva, Montenegro, June 11-14, 2006, Proceedings, volume 4011, pages 679-696. Springer, 2006.

[55] D. M. Vargas-Vera and D. E. Motta. Aqua - ontology-based question answering system. Jan. 01 2004. Online; accessed May 1st; http://kmi.open.ac.uk/publications/pdf/kmi-04-20.pdf.

[56] W3C. Owl web ontology language guide. Online; accessed December 16, 2006; http://www.w3.org/TR/owl-guide/.

[57] W3C. Owl web ontology language overview. Online; accessed March 21, 2007; http://www.w3.org/TR/owl-features/.

[58] W3C. Primer: Getting into rdf & semantic web using n3. Online; accessed December 16, 2006; http://www.w3.org/2000/10/swap/Primer.

[59] W3C. Rdf vocabulary description language 1.0: Rdf schema. Online; accessed March 21, 2007; http://www.w3.org/TR/rdf-schema/.
[60] Wikimedia. Learning object metadata - meta. Online; accessed April 12, 2007; http://meta.wikimedia.org/wiki/Learning_object_metadata.

[61] Wikipedia. E-learning - wikipedia, the free encyclopedia, 2006. Online; accessed September 20, 2006; http://en.wikipedia.org/wiki/Elearning.

[62] Wikipedia. Semantic web - wikipedia, the free encyclopedia, 2006. Online; accessed November 22, 2006; http://en.wikipedia.org/wiki/Semantic_web.

[63] Wikipedia. World wide web - wikipedia, the free encyclopedia, 2006. Online; accessed November 22, 2006; http://en.wikipedia.org/wiki/World_Wide_Web.

[64] D. N. Z. Bjelogrlic. The `a-semantic platform': Solving basic semantic web problems in security-related fields.

[65] D. Zambonini. Is web 2.0 killing the semantic web? O'Reilly XML Blog, October 2005. Online; accessed December 28, 2006; http://www.oreillynet.com/xml/blog/2005/10/is_web_20_killing_the_semantic.html.

List of Figures

1.1 Capturing informal learning with IKHarvester
2.1 LOM structure (from [60])
2.2 The Semantic Web Stack (from W3C)
2.3 RDF statement
2.4 An example of a social network
3.1 Location of SSIS in the Web (figure concept: [21])
3.2 Online communities overview (from [4])
3.3 Main concepts in SIOC Ontology (from SIOC homepage)
4.1 System scope
4.2 Use Case diagram
4.3 Component diagram
4.4 Class diagram (part #1)
4.5 Class diagram (part #2)
4.6 Blades for different SSIS types
5.1 Three-tier architecture
5.2 IKHarvester main page
5.3 IKHarvester - support for web browsers
6.1 IKHarvester in the notitio.us service

List of Listings

2.1 N3 RDF representation
2.2 RDF/XML representation
4.1 Support for SIOC information
5.1 Retrieving the connection to the data storage
5.2 Querying the data storage
5.3 RDF SERQL queries definition
5.4 DataProvider interface
5.5 DataProvider interface
A.1 Informal knowledge repository configuration
A.2 Host context configuration
B.1 Learning Object Metadata
B.2 The content of a Learning Object
B.3 List of Learning Objects

List of Tables

2.1 New trends in the Web (concept: [43])
2.2 Metamorphosis of the Web (concept: [21])
4.1 Requirement description template
4.2 Mapping: posts attribute - semantic description - LOM
4.3 Mapping: wiki article - semantic description - LOM
4.4 Mapping: JeromeDL resource - semantic description - LOM
5.1 REST - get LO list
5.2 REST - get LOM
5.3 REST - get LO content
5.4 REST - add LO
5.5 REST - remove LO
6.1 Comparison of tools for collecting informal data

Appendix A

Installation guide

A.1 Apache Tomcat


IKHarvester runs in Apache Tomcat, a servlet container (version 5.5.23 or newer). Tomcat can be downloaded from its home page, http://tomcat.apache.org/, and should be installed according to the instructions available there.

In the rest of this appendix, TOMCAT_DIR/ denotes the Tomcat installation directory.

A.2 Sesame
Sesame, an RDF store, plays the role of the informal knowledge repository. IKHarvester uses Sesame version 1.2.6, which is available at http://www.openrdf.org/download.jsp.

Again, follow the instructions on the tool's home page. In short, the Sesame webapp must be placed in TOMCAT_DIR/webapps/, and all jars moved from TOMCAT_DIR/webapps/sesame/WEB-INF/lib/ to TOMCAT_DIR/common/lib/.

Once installed, Sesame must be configured. To do so, put the code from List. A.1 into the file TOMCAT_DIR/webapps/sesame/WEB-INF/system.conf, inside the <repositorylist> section.

In the listing, STORAGE_FILENAME is the path to the file where RDF data will be stored.

Listing A.1: Informal knowledge repository configuration

<repository id="joined-repository">
  <acl worldReadable="true" worldWriteable="true" />
  <title>ik-repository</title>
  <sailstack>
    <sail class="org.openrdf.sesame.sailimpl.sync.SyncRdfRepository" />
    <sail class="org.openrdf.sesame.sailimpl.memory.RdfSchemaRepository">
      <param name="file" value="STORAGE_FILENAME" />
    </sail>
  </sailstack>
</repository>
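
Before restarting Tomcat, it may help to confirm that the fragment is well-formed and that the STORAGE_FILENAME placeholder has actually been replaced. The following is only a minimal Python sketch of such a check; the function name storage_file and the sample path /var/data/ik-repository.rdf are illustrative assumptions and are not part of Sesame or IKHarvester.

```python
import xml.etree.ElementTree as ET

# Sample fragment modelled on List. A.1; the storage path is a made-up example.
FRAGMENT = """
<repository id="joined-repository">
  <acl worldReadable="true" worldWriteable="true" />
  <title>ik-repository</title>
  <sailstack>
    <sail class="org.openrdf.sesame.sailimpl.sync.SyncRdfRepository" />
    <sail class="org.openrdf.sesame.sailimpl.memory.RdfSchemaRepository">
      <param name="file" value="/var/data/ik-repository.rdf" />
    </sail>
  </sailstack>
</repository>
"""


def storage_file(fragment):
    """Return the value of the 'file' parameter.

    Raises xml.etree.ElementTree.ParseError if the fragment is not
    well-formed, and ValueError if the parameter is missing or still
    set to the STORAGE_FILENAME placeholder.
    """
    root = ET.fromstring(fragment)
    for param in root.iter("param"):
        if param.get("name") == "file":
            value = param.get("value", "")
            if not value or value == "STORAGE_FILENAME":
                raise ValueError("storage file path has not been set")
            return value
    raise ValueError("no 'file' parameter found")


print(storage_file(FRAGMENT))
```

The check only inspects the fragment itself; it does not verify that the surrounding system.conf file is valid or that the storage path is writable.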

A.3 IKHarvester
IKHarvester can be run in two ways: either by defining a listener in Apache Tomcat, or by building a WAR file and deploying it to the TOMCAT_DIR/webapps/ directory.

A.3.1 Downloading the source code


The IKHarvester source code is available at SourceForge.net (http://sourceforge.net/), as part of the Didaskon project.

The direct link to the source code: https://didaskon.svn.sourceforge.net/svnroot/didaskon/IKHarvester

A.3.2 Configuration

After downloading the application, copy all jar files from IKHARVESTER_DIR/dist/TOMCAT_DIR/common/lib/ to the TOMCAT_DIR/common/lib/ directory. Also delete the commons-fileupload-1.1.jar file (copied from the Sesame 1.2.6 distribution), because the IKHarvester files include a newer version of that library.

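
The jar handling described for the configuration step can be scripted. The sketch below is a hypothetical helper, not something shipped with IKHarvester: the function name install_jars is invented for illustration, and it assumes that the distribution directory is literally named dist/TOMCAT_DIR/common/lib, as the path above is written.

```python
import shutil
from pathlib import Path


def install_jars(ikharvester_dir, tomcat_dir):
    """Copy IKHarvester's bundled jars into Tomcat's common/lib directory.

    The older commons-fileupload-1.1.jar (left over from Sesame 1.2.6) is
    removed first, so that only the version bundled with IKHarvester remains.
    Returns the names of the jars that were copied.
    """
    source = Path(ikharvester_dir) / "dist" / "TOMCAT_DIR" / "common" / "lib"
    target = Path(tomcat_dir) / "common" / "lib"
    target.mkdir(parents=True, exist_ok=True)

    # Remove the outdated library before copying the newer one.
    old_jar = target / "commons-fileupload-1.1.jar"
    if old_jar.exists():
        old_jar.unlink()

    copied = []
    for jar in sorted(source.glob("*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied


# Example call (paths are placeholders for the real directories):
# install_jars("IKHARVESTER_DIR", "TOMCAT_DIR")
```

Tomcat should be stopped while the jars are replaced, since the container loads the contents of common/lib at start-up.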
Running the application

There are two ways of running IKHarvester: by defining a listener in Apache Tomcat, or by creating and deploying a Web ARchive (WAR) file.

Configuring the listener

This is the more convenient way of running IKHarvester. After some additional configuration of Apache Tomcat, the web container picks up changes to the source files every time they are compiled. Consequently, there is no need to redeploy the WAR file or restart the container.

Put the code from List. A.2 in the file TOMCAT_DIR/conf/server.xml, at the end of the <Host name="localhost" ... section, and restart Apache Tomcat. IKHARVESTER_DIR is the path to the IKHarvester directory (with source files), whereas path="/ikh" defines the URL at which IKHarvester is available. In this example it is http://localhost:8080/ikh.

Listing A.2: Host context configuration

<Context path="/ikh" docBase="IKHARVESTER_DIR/WebContent"
         debug="0" reloadable="true"/>

Deploying WAR

This is the more burdensome way of running IKHarvester.

After changing the system source files, one must run the Ant script IKHARVESTER_DIR/build.xml. It compiles the Java classes, creates a Web ARchive file, ikharvester.war, and deploys it to the TOMCAT_DIR/webapps/ directory.

The drawback of this approach is that every time the developer makes a change, a new ikharvester.war file must be created and deployed, and the web container restarted. That is why I suggest using the former approach.

Appendix B

Output examples

B.1 LOM example


IKHarvester provides descriptions of Learning Objects (LOs) stored in the informal knowledge repository (see Tab. 4.2.2 for details of that functional requirement) in the form suggested by the LOM standard (see Sec. 2.1.2).

List. B.1 presents the description of an LO created from information harvested from a blog post available at: http://dobrzanski.net/2007/04/23/ajax-activity-indicator/

Listing B.1: Learning Object Metadata

<?xml version="1.0" encoding="UTF-8" ?>
<lom>
  <general>
    <identifier>
      <catalog>URI</catalog>
      <entry>
        http://dobrzanski.net/2007/04/23/ajax-activity-indicator/
      </entry>
    </identifier>
    <title>
      <langstring xml:lang="en">
        AJAX activity indicator
      </langstring>
    </title>
    <language>en</language>
    <description>
      <langstring xml:lang="en">
        Users are familiar with indications of work performed
        in background since first versions of MS Windo
      </langstring>
    </description>
    <keyword>
      <langstring xml:lang="en">
        http://dobrzanski.net/category/ajax/
      </langstring>
      <langstring xml:lang="en">
        http://dobrzanski.net/category/javascript/
      </langstring>
      <langstring xml:lang="en">
        http://dobrzanski.net/category/web20/
      </langstring>
    </keyword>
    <structure>
      <source>LOMv1.0</source>
      <value>atomic</value>
    </structure>
    <aggregationlevel>
      <source>LOMv1.0</source>
      <value>1</value>
    </aggregationlevel>
  </general>
  <lifecycle>
    <version>
      <langstring xml:lang="en">
        2007-04-23T22:43:15Z
      </langstring>
    </version>
    <contribute>
      <role>
        <source>LOMv1.0</source>
        <value>author</value>
      </role>
      <entity>http://dobrzanski.net/author/admin/</entity>
      <date>
        <datetime>2007-04-23T22:43:15Z</datetime>
        <description>
          <langstring xml:lang="en">Creation date</langstring>
        </description>
      </date>
    </contribute>
    <status>
      <source>LOMv1.0</source>
      <value>revised</value>
    </status>
  </lifecycle>
  <metametadata>
    <identifier>
      <catalog>URI</catalog>
      <entry>
        http://dobrzanski.net/2007/04/23/ajax-activity-indicator/
      </entry>
    </identifier>
    <contribute>
      <role>
        <source>LOMv1.0</source>
        <value>author</value>
      </role>
      <entity>http://dobrzanski.net/author/admin/</entity>
      <date>
        <datetime>2007-04-23T22:43:15Z</datetime>
        <description>
          <langstring xml:lang="en">Creation date</langstring>
        </description>
      </date>
    </contribute>
    <metadataschema>LOMv1.0</metadataschema>
    <language>en</language>
  </metametadata>
  <technical>
    <format>text/html</format>
    <location>
      http://dobrzanski.net/2007/04/23/ajax-activity-indicator/
    </location>
    <requirement>
      <orcomposite>
        <type>
          <source>LOMv1.0</source>
          <value>operating system</value>
        </type>
        <name>
          <source>LOMv1.0</source>
          <value>multi-os</value>
        </name>
      </orcomposite>
    </requirement>
    <requirement>
      <orcomposite>
        <type>
          <source>LOMv1.0</source>
          <value>browser</value>
        </type>
        <name>
          <source>LOMv1.0</source>
          <value>any</value>
        </name>
      </orcomposite>
    </requirement>
  </technical>
  <educational>
    <learningresourcetype>
      <source>Didaskon</source>
      <value>BlogPost</value>
    </learningresourcetype>
    <description>
      <langstring xml:lang="en">
        Users are familiar with indications of work performed in
        background since first versions of MS Windo
      </langstring>
    </description>
    <language>en</language>
    <interactivitytype>
      <source>LOMv1.0</source>
      <value>expositive</value>
    </interactivitytype>
    <interactivitylevel>
      <source>LOMv1.0</source>
      <value>low</value>
    </interactivitylevel>
    <semanticdensity>
      <source>LOMv1.0</source>
      <value>medium</value>
    </semanticdensity>
    <intendedenduserrole>
      <source>LOMv1.0</source>
      <value>learner</value>
    </intendedenduserrole>
    <context>
      <source>LOMv1.0</source>
      <value>training</value>
    </context>
    <context>
      <source>LOMv1.0</source>
      <value>school</value>
    </context>
    <context>
      <source>LOMv1.0</source>
      <value>higher education</value>
    </context>
    <context>
      <source>LOMv1.0</source>
      <value>other</value>
    </context>
    <difficulty>
      <source>LOMv1.0</source>
      <value>medium</value>
    </difficulty>
  </educational>
  <rights>
    <cost>
      <source>LOMv1.0</source>
      <value>no</value>
    </cost>
  </rights>
  <relation>
    <kind>
      <source>LOMv1.0</source>
      <value>references</value>
    </kind>
    <resource>
      <identifier>
        <catalog>URI</catalog>
        <entry>
          http://dobrzanski.net/2007/04/22/using-put-and-delete-methods-in-ajax-requesta-wit
        </entry>
      </identifier>
      <description>
        <langstring xml:lang="en">references</langstring>
      </description>
    </resource>
  </relation>
  <relation>
    <kind>
      <source>LOMv1.0</source>
      <value>references</value>
    </kind>
    <resource>
      <identifier>
        <catalog>URI</catalog>
        <entry>http://www.napyfab.com/ajax-indicators/</entry>
      </identifier>
      <description>
        <langstring xml:lang="en">references</langstring>
      </description>
    </resource>
  </relation>
  <relation>
    <kind>
      <source>LOMv1.0</source>
      <value>references</value>
    </kind>
    <resource>
      <identifier>
        <catalog>URI</catalog>
        <entry>http://ajaxload.info/</entry>
      </identifier>
      <description>
        <langstring xml:lang="en">references</langstring>
      </description>
    </resource>
  </relation>
</lom>

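
To illustrate how a client such as an LMS might consume a LOM description of this shape, the following Python sketch extracts the title, the keywords, and the referenced resources. It is only a sketch under stated assumptions: the sample document is an abbreviated record in the style of List. B.1, and the function name summarize_lom is hypothetical and not part of IKHarvester or Didaskon.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the shape of List. B.1 (not the full record).
LOM_XML = """<?xml version="1.0" encoding="UTF-8"?>
<lom>
  <general>
    <title>
      <langstring xml:lang="en">AJAX activity indicator</langstring>
    </title>
    <keyword>
      <langstring xml:lang="en">http://dobrzanski.net/category/ajax/</langstring>
      <langstring xml:lang="en">http://dobrzanski.net/category/javascript/</langstring>
    </keyword>
  </general>
  <relation>
    <kind><source>LOMv1.0</source><value>references</value></kind>
    <resource>
      <identifier>
        <catalog>URI</catalog>
        <entry>http://ajaxload.info/</entry>
      </identifier>
    </resource>
  </relation>
</lom>
"""


def summarize_lom(lom_xml):
    """Return the title, keywords and 'references' relations of a LOM record."""
    root = ET.fromstring(lom_xml.encode("utf-8"))
    title = root.findtext("general/title/langstring", default="").strip()
    keywords = [
        item.text.strip()
        for item in root.findall("general/keyword/langstring")
        if item.text
    ]
    references = [
        relation.findtext("resource/identifier/entry", default="").strip()
        for relation in root.findall("relation")
        if relation.findtext("kind/value", default="").strip() == "references"
    ]
    return {"title": title, "keywords": keywords, "references": references}


print(summarize_lom(LOM_XML))
```

Values are stripped of surrounding whitespace because the listing places many of them on their own indented lines.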
B.2 LO content example
Apart from the description of a Learning Object in LOM (see List. B.1), IKHarvester can also provide the content of such an LO. The content is intended to be used in a course created by an eLearning framework.

List. B.2 presents the content of an LO created from information harvested from a blog post available at: http://dobrzanski.net/2007/04/23/ajax-activity-indicator/.

Listing B.2: The content of a Learning Object

<?xml version="1.0" encoding="UTF-8" ?>
<LO uri="http://dobrzanski.net/2007/04/23/ajax-activity-indicator/">
<content> <![CDATA[ <h1>AJAX activity indicator</h1><div><p>
Users are familiar with indications of work performed
in background since first versions of MS Windows.
Besides being fancy, they are also informative.</p>
<p>AJAX, a Web 2.0 technique, aim at exchanging only
small amounts of data with a server; this should be
performed behind the scenes. If so, why not expose
the moments when user interaction brings about reqest
and response from a server? Remeber my previous
<a href="http://dobrzanski.net/2007/04/22/using-put-and-delete-methods-in-ajax-requesta-with-prototypejs/">post</a> about using prototype.js
for making AJAX request? I use prototype also for indicating
background actions on web pages that support AJAX.</p>
<p>You&#8217;d never guess how easy it is to such
indicator.</p><p>First, you must register an action
which accurs in case of an AJAX-related event.
The best way to do that is add the following code
in the <code>head</code> section of the HTML code
(remember to include prototype.js library before it!):</p>
<pre><code>&#60;script type="text/javascript"&#62;
&#60;!&#91;CDATA&#91; Ajax.Responders.register({
onCreate: function(){ Element.show('spinner')},
onComplete: function(){Element.hide('spinner')}});
&#93;&#93;>&#60;/script&#62;</code></pre><p>Then,
further in the code inside <code>body</code> section,
add this:</p><pre><code>&#60;img alt="spinner"
id="spinner" src="gfx/spinner.gif" style="display:none;"
/&#62;</code></pre><p>Actually, it&#8217;s all.
Whenever you click an object which sends an AJAX
request to the server, the indicator defined by
<code>img</code> appears and is visible until
the response is obtained.</p><p>Wonder, how to create
an indicator animation? Either download one from
<a href="http://www.napyfab.com/ajax-indicators/">here</a>
or generate one <a href="http://ajaxload.info/">there</a>
</p></div>]]>
</content>
</LO>

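
Because the HTML of the harvested post travels inside a CDATA section, a consumer can read it as ordinary element text and then process it as HTML. The Python sketch below shows one possible way to do this, assuming a shortened sample document in the style of List. B.2; the names TextExtractor and read_lo are hypothetical and are not part of IKHarvester.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Shortened sample in the shape of List. B.2 (the post body is abbreviated).
LO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<LO uri="http://dobrzanski.net/2007/04/23/ajax-activity-indicator/">
  <content><![CDATA[<h1>AJAX activity indicator</h1><div><p>Users are
familiar with indications of work performed in background.</p></div>]]></content>
</LO>
"""


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def read_lo(lo_xml):
    """Return (uri, html, plain_text) for a Learning Object document."""
    root = ET.fromstring(lo_xml.encode("utf-8"))
    html = (root.findtext("content") or "").strip()
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse runs of whitespace so the text reads as a single line.
    plain_text = " ".join(" ".join(extractor.parts).split())
    return root.get("uri"), html, plain_text


uri, html, text = read_lo(LO_XML)
print(uri)
print(text)
```

The XML parser hands the CDATA section over unchanged, so the HTML markup is available for rendering while the plain text can be used, for example, for indexing.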
B.3 List of LOs example
IKHarvester can deliver to an LMS a list of the informal LOs it stores (see Tab. 4.2.2 for details on this functional requirement).

List. B.3 is an example of such a list.

Listing B.3: List of Learning Objects

<?xml version="1.0" encoding="UTF-8" ?>
<LOList>
  <lo uri="http://en.wikipedia.org/wiki/Friend_of_a_Friend" />
  <lo uri="http://en.wikipedia.org/wiki/House" />
  <lo uri="http://en.wikipedia.org/wiki/Subversion_(software)" />
  <lo uri="http://en.wikipedia.org/wiki/Opera" />
  <lo uri="http://scrubs.wikia.com/wiki/Main_Page" />
  <lo uri="http://en.wikipedia.org/wiki/Microsoft" />
  <lo uri="http://en.wikipedia.org/wiki/RDF/A" removed="true"/>
  <lo uri="http://dobrzanski.net/2007/04/10/latex-tables/" />
  <lo uri="http://library.deri.ie/resource/ca19910" />
  <lo uri="http://en.wikipedia.org/wiki/Roman_Catholic_Church" />
  <lo uri="http://en.wikipedia.org/wiki/HCard" />
  <lo uri="http://library.deri.ie/resource/8bf07156" />
  <lo uri="http://library.deri.ie/resource/wYc5hSU1" removed="true"/>
  <lo uri="http://en.wikipedia.org/wiki/HTML" />
  <lo uri="http://en.wikipedia.org/wiki/Machine" />
  <lo uri="http://en.wikipedia.org/wiki/Banff_National_Park" />
  <lo uri="http://wiki.corrib.org/index.php/S3B/MBB/SOA" />
  <lo uri="http://en.wikipedia.org/wiki/hcard" removed="true"/>
  <lo uri="http://microformats.org/wiki/hatom" removed="true"/>
  <lo uri="http://en.wikipedia.org/wiki/Boeing" />
  <lo uri="http://dobrzanski.net/2007/04/03/faculty-research-day/" />
  <lo uri="http://library.deri.ie/resource/DiisD2Rs" />
  <lo uri="http://en.wikipedia.org/wiki/Firefox" />
</LOList>
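
A client receiving such a list will typically want to separate the LOs that are still available from those flagged with removed="true". The Python sketch below does this for a three-entry excerpt of List. B.3. It assumes that the removed attribute marks LOs that have been removed from the repository, which the listing suggests but does not state; the function name split_lo_list is hypothetical.

```python
import xml.etree.ElementTree as ET

# Three entries excerpted from List. B.3.
LOLIST_XML = """<?xml version="1.0" encoding="UTF-8"?>
<LOList>
  <lo uri="http://en.wikipedia.org/wiki/Friend_of_a_Friend" />
  <lo uri="http://en.wikipedia.org/wiki/RDF/A" removed="true"/>
  <lo uri="http://dobrzanski.net/2007/04/10/latex-tables/" />
</LOList>
"""


def split_lo_list(lolist_xml):
    """Return (active, removed): two lists of LO URIs, in document order."""
    root = ET.fromstring(lolist_xml.encode("utf-8"))
    active, removed = [], []
    for lo in root.findall("lo"):
        # Assumption: removed="true" flags an LO that is no longer stored.
        if lo.get("removed") == "true":
            removed.append(lo.get("uri"))
        else:
            active.append(lo.get("uri"))
    return active, removed


active, removed = split_lo_list(LOLIST_XML)
print(active)
print(removed)
```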
