You are on page 1of 6

A Multimedia Database Supports Intelligent English Distance Learning

Ying-Hong Wang, Hui-Hua Huang, Chih-Hao Lin


(Department of Computer Science and Information Engineering, TamKang University, Taipei Hsien, Taiwan,
Email: inhon@mail.tku.edu.tw, HHHuang@cs.tku.edu.tw, CHLin@cs.tku.edu.tw
Abstract
This work presents a novel English distance learning system that was developed through multimedia database and Internet
technologies called English multimedia corpus. The system includes English articles, dialogs, and videos. A student can study English
writing and reading as well as view Web browser listings to connect the Corpus server. In the system, semantic query and Link
grammar" are applied to construct the English multimedia corpus system. Furthermore, it promotes the query level from keyword-
base and content-based query to a semantic level. The main function of this system is to query the English sentence pattern through
keywords from the English multimedia corpus. The other function is to detect grammatical errors in written English. Thus, the system
not only teaches English grammar, but also, due to its database, allows teachers to understand the most frequent mistakes.
Keywords: Multimedia database, Link grammar, Distance learning, Linguistic corpus
1. Introduction
Traditional databases always store the character and
numerical data. A primary function manages this basic data, for
example, Students, School data, or Companies' employee and
financial data. Following computer-based technological
advancements, multiple types of data, such as image, audio,
video, and hypermedia documents, are applied to represent
computer information. Thus, database technologies must support
multimedia data, which is known as a multimedia database. It
includes many facets, such as content-based retrieval [6,15],
shape detection and object recognition [8,10]. Meanwhile, these
technical aspects promote its numerous application domains.
Many examples of distance learning, digital television, and
distance medicine are implemented based on those techniques.
This study presents an Internet-based English learning
environment that applies multimedia database technologies.
Learning English is always a substantial difficulty for
Chinese students. Some experts recommend that the best way to
learn English is to establish a good study environment and
practice it through several various approaches. This should make
it a satisfying learning experience. Based on this viewpoint,
teachers should provide students with a better environment to
learn English. Distance learning based on Internet is one of the
best solutions to create an English learning environment.
Currently, the highly distributed computing environment
has been supported and applied popularly. Networking services
should perform as tutoring tools. Numerous innovative
approaches in this field have been investigated. Through
implementing computer-mediated education, many advocates
emphasize its positive aspects and English learning tutoring
systems, which are computer-based, have been developed by
numerous academic research groups [19-31].
Incidentally, students learning a foreign language through
computing devices might encounter many difficulties. Some
articles have discussed whether students frustrations inhibit their
educational opportunity [20, 21, 23]. These frustrations were
based on three interrelated sources: lack of prompt feedback,
ambiguous Web instructions, and technical problems.
Hence, an easy-to-use English learning system was
developed. The proposed system contains several preparations,
which are relevant to English learning. While treating natural
language processing, the system is constructed based on Link
Grammar. However, in order to separate it from conventional
approaches, modifications were included to improve its ability.
The proposed English-learning corpus stores mistakes that
students make. That is, all errors that students will probably
make on a particular question are determined in advance. Each
type of mistake has correcting suggestions and error descriptions.
Corpus can not only provide suggestions but it also records the
mistakes that Chinese students might make in English. Another
function of this Internet-based multimedia corpus is that a user
can locate sample sentences patterns from movie scripts, live
dialogs and textbook reading. Each sentence from live dialog
and movie captions is parsed and analyzed by link grammar,
which includes special tags. The end product is then stored
within this proposed database system. Consequently, based on
semantic retrieval, the system subscribes to query methodology.
In this paper, English sentence construction will be
analyzed first. Notably, the sentence pattern is divided into nine
classes. Secondly, each sentence pattern is parsed by link
grammar, in which every pattern has a particular tag set.
Consequently, the system constructs the query engine with these
essential tag sets.
In general, similar learning systems store single sentences
or vocabularies and facilitate query through keywords [19-
22,24,25,27-31]. Semantic-based retrieval technologies are
rarely used. This work discusses how to use the appended
information, the special tag sets, which Link grammar generated
to advance the query function from keyword-based to semantic-
based.
Several web sites regarding learning corpus [19,20,21,22],
and study several papers in addition to those that pertain to Link
grammar [4,9,13,16] were surveyed. Furthermore, Schulenburg
proposed realistic feedback processing [9], Brill developed
machine learning and automatic linguistic analysis [12], and
Chang presented automatic linguistic resolution: framework and
applications [3]. Other similar systems are referenced [4,5,14,16,
17].
In section 2, related works regarding Link grammar and
XML are discussed. The system analysis for the proposed
learning functions is introduced in section 3. Section 4 presents
the architecture of the proposed system. Finally, section 5
contains the conclusion and discussions for future research.
2. Theoretical Backgrounds
Link grammar is an English grammar parser system that
was proposed by The School of Computer Science of Carnegie
Mellon University (CMU). Link grammar is a context-free
formula to describe natural language [32]. Link grammar
consists of a set of words, which are the terminal symbols of the
grammar, and each has a linking requirement. The linking
requirements of each word are gathered in a dictionary. Figure 1
illustrates the linking requirements through a simple dictionary
for the words, a/the, cat/mouse, John, ran, chased.
a
the
cat
nouse
John ran chased
5 5
O
D D O 5 5 O
Fig. 1: words and connectors in a dictionary
Each of the intricately shaped boxes is labeled connector.
A pair of compatible connectors will join, and only one of those
attached to a given black dot must be satisfied. Figure 2 depicts
how the linking requirements are satisfied in the sentence, "The
cat chased a mouse."
the cat
nouse
a
chased
5 5 D D
O O
D D
Fig. 2: All linking requirements are satisfied to form a linkage
The linkage can be perceived as a graph and the words can
be treated as vertices, which are connected by labeled arcs. Thus,
the graph is connected and planar. The labeled arcs that connect
words to others on either their left or right are links. A valid parse
is called a linkage. Thus, the following is a simplified form of
the diagram indicating that the cat chased a mouse is part of this
language.
the cat chased a nouse
D
D
O
5
Fig. 3: the simplified form of Fig. 2
Table 1 presents an abridged dictionary, which encodes
linking requirements of the above example.
Table 1: the words and linking requirements in a dictionary
words formula
a / the D+
cat / mouse D- & (O- or S+)
John O- or S+
ran S-
chased S- & O+
The linking requirement for each word is expressed as a
formula involving that includes the operators &, and o r,
parentheses, and connector names. The + or suffix on a
connector indicates the direction in which the matching
connector must lie be placed. Thus, the farther left a connector is
in the within an expression phphrase, the nearer the word to
which it connects must be.
A sequence of words is a sentence defined by grammar, if
links can be established among the words so as to satisfy the
formula of each word. It includes the following meta-rules:
Planarity: Links are drawn above the sentence and do not
cross.
Connectivity: Links connect all the words in the sequence.
Ordering: When the connectors of a formula are traversed from
left to right, the words to which they connect
proceed from near to far. Namely, consider a
word, and consider two links connecting that
word to the word on its left. The link
connecting the closest word (the shorter link)
must satisfy a connector that appears to the left
(in the formula) of that connector in the other
word. The same process occurs for a link to
the right.
Exclusion: No two links may connect the same pair of words.
The use of a formula to specify a link grammar dictionary
is convenient for creating natural language grammars, however it
is cumbersome for mathematical analysis there of, and as well as
in describing algorithms to parse link grammar. An alternate
method of expressing link grammar is known as disjunctive
form, in which each word has an associated set of disjuncts. A
disjunct is denoted as:
((L
1
, L
2
, ,L
m
)(R
n
, R
n-1
, , R
1
))
Where L
1
, L
2
, , L
m
and R
n
, R
n- 1
, , R
1
are the
connectors that must connect to the left and right, respectively.
Thus, simply rewriting each disjunct and combining them
all with the or operator to create an appropriate formula translates
a link grammar in from a disjunctive to a standard form can be
completed by as follows:
(L
1
&L
2
&&L
m
&R
1
& R
2
&&R
n
)
Enumerating all ways that the formula cab be satisfied
translates a formula into a set of disjuncts. For example, the
formula
(A- or ( )) & D- & (B+ or ( )) & (O- or S+)
corresponds to the following eight disjuncts, may be used in
some linkages:
( (A,D) (S,B) )
( (A,D,O) (B) )
( (A,D) (S) )
( (A,D,O) ( ) )
( (D) (S,B) )
( (D,O) (B) )
( (D) (S) )
( (D,O) ( ) )
3. System analysis
The proposed system contains two varieties of learning
functions, or two sub-systems. These sub-systems are English
pattern query system and English sentence error-detect system.
The detail will be discussed in the following.
3.1 English pattern query system
To develop this system, English sentence patterns were
analyzed. Based on general analysis, there are five English
sentence patterns, as follows:
(1) Simple sentence pattern
(2) Negative sentence pattern
(3) Interrogative sentence pattern
(4) WH question sentence pattern (What, When,
Where,)
(5) Imperative sentence pattern
They can then be divided by tense, as follows:
Simple pattern
The simple present tense: He writes a letter everyday.
The simple past tense: He wrote a letter yesterday.
The simple future tense: He will write a letter tomorrow.
Perfective pattern
The present perfective tense: He has written a letter.
The past perfective tense: He had written a letter when I
arrived.
The future perfective tense: He will have written a letter before
I arrive.
Continuous pattern
The present continuous tense: He is writing a letter now.
The past continuous tense: He was writing a letter when I
arrived.
The future continuous tense: He will be writing a letter when I
arrive.
Perfective proceed pattern
The present perfective continuous tense: He is writing a letter
now.
The past perfective continuous tense: He was writing a letter
when I
arrived.
Link grammar parses these sentences to determine the
label correlation of their sentence patterns (Table 2~5).
Table2: Simple sentence pattern and link grammar tags
Active voice Passive voice
Simple
pattern
present
past
future
Ss or Sp
Ss or Sp
Ss or Sp + I
Ss or Sp + Pv
Ss or Sp + Pv
Ss or Sp + Ix + Pv
Continuou
s pattern
present
past
future
Ss or Sp + PP
Ss or Sp + PP
Ss + If + PP
Ss or Sp + ppf + Pv
Ss or Sp + ppf + Pv
Ss or Sp + If + ppf + Pv
Proceed
pattern
present
past
future
Ss or Sp + Pg
Ss or Sp + Pg
Ss or Sp + Ix
+ Pg
Ss or Sp + Pg + Pv
Ss or Sp + Pg + Pv
none
Perfective
continuou
s pattern
present
past
Ss or Sp + ppf
+ Pg
Ss or Sp + ppf
+ Pg
Illustrate:
Ss and Sp: connects subject nouns to finite verbs
I: connects infinitive verb forms to certain words such as
modal verbs and "to"
PP: connects forms of "have" with past participles.
PV: connects forms of the verb "be" to past participles.
Pg: connects forms of the verb "be" to present participles.
ppf: connects forms of the verb "be" to past participles
been.
Subsequently, each pattern receives a label constituent
type. This characteristic parses each sentence in the system,
which stores the resulting label set. For example, (Ss, ppf, pg) is
a label set that was achieved and stored following link grammar
parsing. Then, a user can employ query language to arrive at the
English sentence pattern they desire. Initially, to ensure that a
pattern is accurate, analysis of only one type of pattern is
performed. The system can then be upgraded to handle multi-
patterns. The system can also process multiple queries in several
types of patterns. However, this has some difficulties, which are
discussed in section 3.2.
3.2 English sentence error detect system
In this section, the Enhanced Link Grammar, which is
based on a dictionary with modified disjuncts, is illustrated.
Herein, word linkage and the applications of Enhanced Link
Grammar are discusses, respectively.
Modified Dictionary
In general, the formula of each word is complicated.
Initially, the word that requires connector changes is analyzed.
Various formulas impose distinct constraints on the proposed
system. Thus, although changes may improve the system, many
may not. To illustrate, simple words are tested prior to further
explanation.
For example, some verbs must be followed by an infinitive
to, a gerund or a participle. However, several verbs can accept
several varying types. For example, We enjoy walking in the
rain. is accurate, but We enjoy to walk in the rain. Is not.
However, the link of the word enjoy can be altered to accept
the infinitive to.
a portion of the formula is changed from (Pg) to (Pg o r
TO***e9+). Hence, the word enjoy can be linked to not only a
gerund but also an infinitive to with the additional connector
TO***e9+. Notably, this connector contains a subscript
***e9, which indicates there is a type 9 error. The type 9 error
indicates the mistake is happened when a transitive verb is used
as intransitive verb. Subsequently, to the Link grammar with
original dictionary (Fig. 4) and modified dictionary (Fig. 5),
respectively, there are another two sentences We enjoy
swimming. and We enjoy to swim.
Figure 4presents that the link grammar that has the
original dictionary can not accept an inaccurate sentence. That
is, an error message No complete linkage found. will appear.
Fig.4: the original link grammar parser
However, Fig. 5 presents the experimental result when the
same two sentences are parsed by the link grammar that contains
the modified dictionary.
Fig. 5: the modified link grammar parser
Thus, enjoy can link to either a gerund or an infinitive
to. Incidentally, there are ten types of catalogued error tags for
error detection. In the next section, the error tag, including the
links that can and cannot be modified, is discussed.
4. The System Architecture and Examples
4.1 System architecture
Corpus is a multimedia database that provides a dictionary
as well as record errors, comments, and system feedback.
Similar to English essays, dialog, sentences, words, terms, and
attributes, the proposed English-learning corpus also has positive
study data. It promotes language teaching and research. In
addition to the original data, it also stores multimedia data,
including instructional films, and audio clips, movies, music
videos. Therefore, it provides a vivid, attractive learning
environment. This system, presented in Fig. 9 and 10, is a
multimedia corpus that provides four main functions of basic
language learning; they are listen, speech, read, and write. That
is a user can read an essay, listen to Standard English
pronunciation, write compositions, practice dialog and recite text.
The database is a vital tool for linguistic researchers. Philologists
can analysis mistakes that English as second language students
typically make.
This system can be used over the Internet to access the
data from the multimedia learning corpus. Figure 6 illustrates
how a user can access this system from any location.
Fig. 6 System architecture overview
As stated, the proposed system contains two sub-systems,
which are English pattern query system and English sentence
error detect system. Notably, they share the same database.
Figure 7 depicts the architecture of the English pattern
query system.
Link grammar with the original dictionary parses the data,
which produces link labels. The system then analyses and filters
the link labels, which are stored into the database. When a user
queries a sentence pattern, they use the interface of the system to
obtain the desired pattern. Subsequently, the system will
transform the query into readable data-type and, if possible, it
will locate the relevant link label within the database. The
system will retrieve the data and send it to the multimedia player
on the Web. The advantage of this is that a user can replay
messages several times until they clearly understand the correct
pronunciation of the sentence, as well as the context in which the
sentence should be used.
Figure 8 depicts the architecture of the other second sub-
system, English sentence error detect system.
Lnglish sentence
input
Link grannar
parser
(enhanced)
Label analysis
& Iiliter
ls there error
tag appear
access the data in
the learner
corpus
DataBase
out put
Yes
No
put the error
sentence into th
database
out put
Learner Corpus
The user can input a sentence into the system. Following
analysis, the link grammar with the modified dictionary will
parse the sentence, and filter the link labels if error tags are
detected. The system will locate the description and error
examples within the learner corpus and then display that
information on the web page. However, if the system fails to
understand an error, it may not have been defined or is too
complex. Those sentences will be stored in the database for
philologist analysis and further subsequent system enhancement.
4.2 System examples
In this section, operation examples of the system, which is
a web-based user interface, are presented. Figure 9 depicts the
interface of the English pattern query system.
This provides users with an easy-to-use interface; that is
they simply need to drop down the menu and select the pattern
that they want to read. The system will then return the query
result with the multimedia data on the web browser (Fig. 10).
Lnglish pattern
Query Function
data transIorn
nultinedia
player on the
web
Output
DataBase
Label analysis
& Iilter
Link grannar
Parser
(Original)
data input
Data/5entence
Fig. 7 English pattern query system architecture
Fig. 8 English sentence error detect system architecture
Fig. 9 English pattern query system interface
User(teacher,student..)
multimedia web
page
Essay Database Movies Database Dialog Database
SQL Server,IIS
Internet
This system also supports keyword-based queries.
Figure 11 presents the interface of the English sentence
error detect system.
you can also use this kind of word detect system.
After a sentence is input, link grammar with modified
dictionary will determine errors. Figure 12 illustrates an example
of a correctly input English sentence.
However, Fig. 13 presents the systems response when an
error is made
When the linkage of the users input reveals one error tag,
the system searches the Learner Corpus and provides both correct
and incorrect examples as well as a description thereof (Fig. 14).
5. Conclusions and Future Work
5.1 The Contribution and Remarks
After following analysis and experimentation, the system
can easily use the label set, which is stored in the corpus and
parsed with link grammar. Primarily, it assists junior or senior
high school students to locate proper English sentence patterns.
Numerous linguistic experts recommend that the best way to
improve English speaking ability is to establish a realistic
learning environment and practice English through several
distinct approaches. The proposed interactive multimedia system
promotes this type of learning.
Sleator and Temperley developed Link Grammar, which is
a word based parsing mechanism. Although the system was
though to be an accurate grammar checker, unlike the proposed
system, it fails to focus on fault tolerance ability, which is
particularly useful for non-native English speaking students. The
proposed system can parse sentences that the original Link
Grammar cannot.
Notably, our modified dictionary has prog ressed steadily. It
appears that the idea proposed herein can be applied to other
natural language processing and its corresponding applications.
The system not only queries English sentence patterns, but
also parses input sentences on-line. It is important to philologists
to analysis sentences generated by students, so they can easily
locate common or special mistakes. Subsequently, they can alter
the focus of their course for a complete learning environment.
This system provides a better and more interactive
environment for both teachers and students. Furthermore, it
combines the humanistic education with technology and
Fig. 10 query result
Fig. 12 correct sentence result
Fig. 13 error sentence result
Fig. 14 error suggestion
Fig. 11 English sentence error detect system interface
advances these two sciences for superior achievement.
5.2 Future Works
Generally, words are a basic unit of communication that
are common to all natural language texts and speech-processing
activities. When teaching English, teachers always want to know
the types of mistakes that students may make. Thus, according to
various functional necessities, the proposed system can be
extended to cover encompass more scalable ranges. The notions
presented herein can aid in the development of other similar
applications.
XML has self-descriptive characteristics, which has resulted in
increasing data interchange applications. These applications and
research areas may become a popular subject in the future.
Reference:
[1] Wasfi Al-Khatib, Y. Francis Day, Arif Ghafoor, P. Bruce
Berra, Semantic Modeling and Knowledge Representation
in Multimedia Databases, IEEE Transactions on
Knowledge and Data Engineering, Vol. 11 NO. 1
January/February 1999.
[2] D. Beeferman, A. Berger and J. Lafferty, Text
segmentation using exponential models, Proceedings of the
Second Conference On Empirical Methods in NLP,
Providence, RI, 1997.
[3] Brill, E., Machine learning and automatic linguistic
analysis: the next step, Acoustics, Speech and Signal
Processing, 1998. Proceedings of the 1998 IEEE
International Conference on Volume: 2 , 1998 , Page(s):
1033 -1036 vol.2
[4] C.M. Chung, Yuan-Kai Wang, Ying-Hong Wang and Chih-
Hao Lin, The multimeaia corpus based on link
grammmar, National Computer Symposium 99, Vol.1 A-
300~304.
[5] C.M. Chung, Yuan-Kai Wang, Ying-Hong Wang and Chih-
Hao Lin, The Design of Multimedia Database for English
Learning Application proceeding of 2000 International
Conference on Chinese Language Computing
(ICCLC2000), pp. 77~81.
[6] Shih-Fu Chang, Chen, W., Meng, H.J., Sundaram, H., Di
Zhong Circuits and Systems for Video Technology, A
Fully Automated Content-Based Video Search Engine
Supporting Spatiotemporal Queries, IEEE Transactions on
Volume: 8 5, Sept. 1998, Page(s): 602 615.
[7] Chan, S.W.K., Automatic linguistic resolution: framework
and applications, Systems, Man and Cybernetics, 1996.,
IEEE Internation Conference on Volume: 1 , 1996 , Page(s):
625 -630 vol.1.
[8] Chang, S.-K.; Hsu, A., Image Information systems: where
do we go from here? Knowledge and Data Engineering,
IEEE Transactions on Volume: 4 5 , Oct. 1992 , Page(s):
431 442.
[9] D. Grinberg, J. Lafferty and D. Sleator, A robust parsing
algorithm for link grammars, Proceedings of the Fourth
International Workshop on Parsing Technologies, Prague
and Karlovy Vary, September 1995, pp.111-125.
[10] F. Myron, S. Harpreet, N. Wayne, A. Jonathan, H. Qian, D.
Byron, G. Monika, H. Jim, L. Denis and P. Dragutin,
Query by Image and Video Content: The QBIC System,
Computer v28 n9 Sept 1995 IEEE Los Alamitos CA USA p
23-32.
[11] Yuichi Nakamura, Takeo Kanade, Semantic Analysis for
Video Contents Extraction Spotting by Association in
New Video, ACM Multimedia Conference, 1997.
[12] Schulenburg, D., Sentence processing with realistic
feedback, Neural Networks, 1992. IJCNN., International
Joint Conference on Volume: 4 , 1992 , Page(s): 661 -666
vol.4.
[13] D. Sleator, and D. Temperley, Parsing English with a link
grammar, technical report CMU-CS-91-196, Department
of Computer Science, Carnegie Mellon University, 1991.
[14] chiro Yamada, Hideki Sumiyoshi, Yeun-Bea Kim and
Masahiro Shibata, TV program skimming method using
similarity of scene descriptions, SPIE Conference on
multimedia Storage and Archiving Systems III, November
1998.
[15] Yoshitaka, A.; Ichikawa, T., A survey on content-based
retrieval for multimedia databases, Knowledge and Data
Engineering, IEEE Transactions on Volume: 11 1, Jan.-Feb.
1999, Page(s): 81 93.
[16] Yuan-Kai Wang, Ying-Hong Wang, Hsiu-Ping Lin and
Chih-Feng Chao, An English-Learning Application based
on extended link grammar formalism, National Computer
Symposium 99, Vol. 2, B-306~310.
[17] IHarry W. Agius, Marios C. Angelides, Developing
Knowledge-Based Intelligent Multimedia Tutoring Systems
Using Semantic Content- Based Modeling, Artificial
Intelligence Review 13: 55-83, 1999.
[18] Jian-Kang Wu, Content-Based Indexing of Multimedia
Databases, IEEE Transactions on Knowledge and Data
Engineering, Vol. 9, NO. 6, November/December 1997.
[19] EFLWEB Home Page, http://www.eflweb.com/
[20] English grammar clinic,
http://forums.educhat.com:8080/~grammar
[21] Foreign Languages for Travelers,
http://www.travlang.com/languages/
[22] G.R. Sampson: SUSANNE Scheme,
http://www.cogs.susx.ac.uk/users/geoffs/RSue.html
[23] Interactive Javascript Quizzes for ESL Students:
http://www.aitech.ac.jp/~iteslj/quizzes/js/
[24] Language Dictionaries and Translators:
http://rivendel.com/~ric/resources/dictionary.html
[25] Linguistic Data Consortium LDC,
http://www.ldc.upenn.edu/
[26] Link Grammar, http://bobo.link.cs.cmu.edu/link/
[27] Longman Dictionaries Home, http://www.awl-
elt.com/dictionaries/
[28] Merriam-Webster Online, http://www.m-w.com/home.htm
[29] On line English Grammar,
http://www.edunet.com/english/grammar/
[30] Project Gutenberg Official Home Site, http://promo.net/pg/
[31] YourDictionary.COM, http://www.yourdictionary.com/
[32] John Lafferty, Daniel Sleator, and Davy Temperley
Grammatical Trigrams: A Probabilistic Model of Link
Grammar Presented to the 1992 AAAI Fall Symposium on
Probabilistic Approaches to Natural Language.

You might also like