Professional Documents
Culture Documents
Sentences
Ahmad Al-Taani1, Mohammed Msallam2, and Sana Wedian1
Department of Computer Science, Yarmouk University, Jordan
2
Department of Computer Science, Al-Aqsa University, Palestine
T
JI
IA
Fi
Abstract: Parsing of Arabic sentences is a necessary mechanism for many natural language processing applications such as
machine translation; question answering, knowledge extraction and information retrieval. In this study, we present a top-down
chart parser for parsing simple Arabic sentences, including nominal and verbal sentences within specific domain Arabic
grammar. We used the Context Free Grammar (CFGs) to represent the Arabic grammar. We first developed the Arabic
grammar rules that give precise description of grammatical sentences. Then, we implemented the parser that assigns
grammatical structure to the input sentence. The parser is tested on sentences extracted from real documents. Experimental
results showed the effeteness of the proposed top-down chart parser for parsing modern standard Arabic sentences. From a
practical perspective, the parser is able to satisfy syntactic constraints and reduce parsing ambiguity.
rs
Keywords: NLP, Parser, chart parsing, top-down chart parser, context free grammar, syntactic structures.
1. Introduction
tO
in
nl
io
at
lic
ub
eP
T
JI
IA
2. Related Work
io
at
lic
ub
eP
in
nl
tO
rs
Fi
T
JI
IA
Arabic Grammar
Input sentence
Fi
Parser
Final chart
tO
Word Cat.
Word
Classifier
NS NP
NS NP
NS NP
NS NP
Word
NP
NS
VP
VS
Lexicon
rs
Word
Suffixes
Verb
KX YZ [\Z Y]
cZ deZ dfZ KZ [fZ gZ YeZ
[\eZ
Noun
Pattern
_k mn Q
Noun
[kmn Q
Prefix
OQ
infix
-
OQ
Examples
dpqnQtnQ
YLunQ vnwnQ
[pqnQ tnQ
[LunQ [nwnQ
Class
Verb
io
at
Class
lic
ub
eP
in
nl
Nominative
He
huwa
Ana
[X
She
heya
We
nahnu
YuX
They
hum
[\
You
anta
OX
They
Hum
You
antum
[\nX
You
(pl.)
antum
dnX
You
antuna
YnX
Hunna
Fi
They
rs
Accusative
Pronouns
T
JI
IA
Pronouns
Possessive pronouns
L\~`[ kM
_[m`[ Z - [X -Yw ` -[\`
|[~\` ][ -
in
nl
tO
den\`[ `[ [ `\~[ ][
lic
ub
After that the user can use the parser where he must
press the Run Parser button. Figure 5 shows each
step for parsing the sentence to identify its structure.
io
at
eP
IA
Fi
tO
rs
OX
[mn
42
39
92.9%
42
36
85.7%
Article (Noun)
45
42
93.3%
Article (Verb)
100%
Conjunction
100%
Total
313
285
91%
KM
Total
Correct
Nominal Sentence
36
35
97.2 %
Verbal Sentence
34
31
91.2 %
Total
70
66
94.3 %
Type
io
at
lic
ub
4. Experimental Results
eP
[X
ARTN 1
Verb
Pronoun
in
nl
N1
90.9%
T
JI
NP2 (rule 17 with PRO1 &
CONJ! & PRO2)
NP1
(rule
13)
PRO
CONJ
PRO 2
1
1
159
q`
NP3
(rule 12)
Correct
175
N2
Total
Type
Noun
T
JI
IA
[6]
[7]
[8]
[9]
Fi
[10]
in
nl
tO
rs
[2]
[17]
[5]
[16]
io
[4]
[15]
at
[3]
[14]
lic
ub
[1]
eP
References
[13]
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
-->
VP NP
VP NS
NP NP
NP VP
NP VS
NP NS
ARTV V1
V
V1
V P
V1 P
N
PRO
ARTN N
ARTN PRO
N CONJ N
PRO CONJ PRO
N CONJ PRO
PRO CONJ N
T
JI
IA
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
rs
Fi
tO
lic
ub
eP
.35
.36
.37
.38
.39
.40
.41
.42
.43
.44
.45
.46
.47
.48
.49
.50
.51
.52
.53
.54
.55
.56
.57
.58
.59
.60
.61
.62
.63
.64
.65
.66
.67
.68
.69
.70
io
at
.1
.2
.3
.4
.5
.6
.7
.8
.9
.10
.11
.12
.13
.14
.15
.16
.17
.18
.19
.20
.21
.22
.23
.24
.25
.26
.27
.28
.29
.30
.31
.32
.33
.34
in
nl
\ cn|[ _f`
q[ q _ tM
Yw\` [ Y q|`
dk\` Y [k t
[f{ OuZ w`
[ L` [k` t`
[\ t {w
{m cwe` [ ne`
[m` d g`
d` qX qk` YuX
|u\` Y _|X M
t` KM _[ [
`|[qM q_
[w]{ q\Q [
[k`[ fn|[ X[~`
K n
t`[ q {]
Z[ q[
qm` np K Q
q` KM [mn OX[ X
[wb` L [b
t\u \{ u Q` `\{
tm d`[ _e`
[ftL` q]{Z bt|u [nm`
dpq` tm` t
p q p
{k`[b \[eu] `[[
_\k` cQ[Q[ w`
_[ wnb KM
{]{ t `\|[
] t`
q~b dnX_ [
{ k`
q| `[` Y q`\{
{p Q{ ]q`[b
[w` KM _[ f
[un nX qZ
Y]qf {kb nw` YkZ tQ
tqnu] [w` qnp
[b ~` deZ
q\u` p[mn`]{ _
qu|` Y ]q`[ X Oq
[{] [u` KM Kw|k]
t` n
qZ t \Q
YuX PM[e] d`
q`] q`[ X Oq
`[ f t` ]
[ `{X q
Q ` `\{Qt] ]{
[wM [w]{b t` qn
[k\ [f] q` OnQ
pq`[ nb OL
\`[ k t` qZ
dfb t|[`[
tb ` KX
]q t\` O\
[K[
qm` t[ L
[b{w tkb q\Z\` qp
{|kX ][
tP\` d {q`[
q KM ][M dkZ
m q|~`\{ u P|X
K[\`[ k` KM {`[ [
[[X [LX
q` q]{ Ob[
`\|[KM [w]qM [M
[X[ [Q O]
Yn _\ _\
T
JI
IA
io
at
lic
ub
eP
in
nl
tO
rs
Fi