Introduction
Part of Speech
Part of Speech Problem
Lexical Syntax
01/02/23 2
Introduction
Given a sentence W = w1, w2, ..., wn, the task of POS tagging is
to find the tag sequence T = t1, t2, ..., tn, where ti is the POS
tag of wi, 1 ≤ i ≤ n, as accurately as possible.
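To make the shape of the task concrete, the following minimal sketch maps each word to exactly one tag and measures accuracy against a reference tagging. The toy lexicon, the tag names, and the default tag are hypothetical and chosen only for illustration; this is a trivial lookup baseline, not a method described in the slides.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> single most likely tag (illustration only).
TOY_LEXICON: Dict[str, str] = {
    "the": "DET",
    "dog": "NOUN",
    "barks": "VERB",
}


def tag_sentence(words: List[str], default_tag: str = "NOUN") -> List[str]:
    """Return one tag t_i for every word w_i, so len(tags) == len(words)."""
    return [TOY_LEXICON.get(w.lower(), default_tag) for w in words]


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of positions where the predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("tag sequences must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


words = ["The", "dog", "barks", "loudly"]
predicted = tag_sentence(words)
gold = ["DET", "NOUN", "VERB", "ADV"]  # hypothetical reference tags
print(list(zip(words, predicted)), accuracy(predicted, gold))
```

The unknown word "loudly" falls back to the default tag, which is why this baseline gets one of the four tags wrong here.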
Introduction POS…
There is some debate on the topic: for example, it has been
claimed that the adjective–verb distinction is almost nonexistent
in some languages, such as Mandarin, and that the words in a given
category do not all show the same functional or semantic behavior.
Nevertheless, this minimal set of three categories (noun, verb,
and adjective) is considered universal.
Although the decision about the size and contents of the tagset
(the set of POS tags) is still linguistically oriented, the idea
is to provide a distinct part of speech for every class of words
with distinct grammatical behavior, rather than to arrive at a
classification that supports a particular linguistic theory.
The tagset is usually large, offering a rich repertoire (catalog)
of tags with high discriminative power.
Introduction POS Problem
Lexical Syntax
POS Tagging Approaches
POS Tagging Approaches (Rule-based Approach)
The earliest POS tagging systems were rule-based: a set of rules
is constructed manually and then applied to a given text.
The first rule-based tagging system relied on a large set of
handcrafted rules and a small lexicon to handle the exceptions.
These systems are also not robust, in the sense that they must be
partially or completely redesigned whenever the domain or the
language changes.
These limitations led to the development of statistical models.
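The rule-based setup described above (handcrafted rules plus a small exception lexicon) can be sketched as follows. The specific suffix rules, lexicon entries, and tag names are hypothetical illustrations and do not reproduce any historical system.

```python
import re
from typing import List

# Small exception lexicon, consulted before any rule (hypothetical entries).
EXCEPTIONS = {"the": "DET", "a": "DET", "is": "VERB", "thing": "NOUN"}

# Handcrafted rules, tried in order; the first match wins (illustrative only).
RULES = [
    (re.compile(r".*ly$"), "ADV"),
    (re.compile(r".*ing$"), "VERB"),
    (re.compile(r".*ed$"), "VERB"),
    (re.compile(r".*(ous|ful|able)$"), "ADJ"),
    (re.compile(r"^\d+([.,]\d+)?$"), "NUM"),
]


def rule_based_tag(words: List[str], default_tag: str = "NOUN") -> List[str]:
    """Tag each word using the exception lexicon first, then the rules."""
    tags = []
    for word in words:
        lower = word.lower()
        if lower in EXCEPTIONS:
            tags.append(EXCEPTIONS[lower])
            continue
        for pattern, tag in RULES:
            if pattern.match(lower):
                tags.append(tag)
                break
        else:
            # No rule fired: fall back to the default tag.
            tags.append(default_tag)
    return tags


print(rule_based_tag(["The", "thing", "is", "running", "quickly"]))
```

Every rule here encodes English suffixes, so moving to another language or domain would mean rewriting the rule list and lexicon, which is the robustness problem noted above.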
POS Tagging Approaches – HMM…
POS Tagging Approaches – (Maximum Entropy Model)
The HMM framework has two important limitations for
classification tasks such as POS tagging: strong independence
assumptions and poor use of contextual information.
In HMM POS tagging, it is usually assumed that the tag of a word
does not depend on the previous and next words; that is, a word
in the context supplies no information about the tag of the
target word.
Furthermore, the context is usually limited to the previous one or
two words.
Although there have been some attempts to overcome these
limitations, they do not allow the context to be used in whatever
way a task requires.
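For reference, the standard first-order (bigram) HMM tagging objective makes these assumptions explicit. The formula is the textbook formulation and is not taken from the slides; W = w_1, ..., w_n is the word sequence, T = t_1, ..., t_n the tag sequence, and t_0 an assumed start-of-sentence marker.

```latex
\hat{T} = \arg\max_{T} \prod_{i=1}^{n} P(w_i \mid t_i)\, P(t_i \mid t_{i-1})
```

Each word w_i is conditioned only on its own tag t_i, and each tag only on the preceding tag, so neighboring words never enter the model directly.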
Maximum Entropy (MaxEnt) models provide more flexibility in
dealing with the context and are used as an alternative to
HMMs in the domain of POS tagging.
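The flexibility mentioned above comes from describing the context with arbitrary features. The sketch below, under stated assumptions, extracts a few such features (including the previous and next words) and turns weighted feature sums into a probability distribution over tags. The feature names, tag names, and hand-set weights are hypothetical; in practice the weights would be estimated from an annotated corpus.

```python
import math
from typing import Dict, List, Tuple


def extract_features(words: List[str], i: int) -> List[str]:
    """Describe position i with features drawn from anywhere in the context."""
    word = words[i].lower()
    prev_word = words[i - 1].lower() if i > 0 else "<s>"
    next_word = words[i + 1].lower() if i < len(words) - 1 else "</s>"
    return [
        f"word={word}",
        f"suffix2={word[-2:]}",
        f"prev_word={prev_word}",
        f"next_word={next_word}",
    ]


def maxent_probs(
    features: List[str],
    weights: Dict[Tuple[str, str], float],
    tags: List[str],
) -> Dict[str, float]:
    """p(tag | context) is proportional to exp(sum of active feature weights)."""
    scores = {t: sum(weights.get((f, t), 0.0) for f in features) for t in tags}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {t: math.exp(s - max_score) for t, s in scores.items()}
    z = sum(exp_scores.values())
    return {t: v / z for t, v in exp_scores.items()}


# Hypothetical hand-set weights keyed by (feature, tag).
WEIGHTS = {
    ("prev_word=the", "NOUN"): 2.0,
    ("next_word=the", "VERB"): 1.5,
    ("suffix2=ly", "ADV"): 2.5,
}
TAGS = ["NOUN", "VERB", "ADV"]

words = ["the", "run", "ended"]
probs = maxent_probs(extract_features(words, 1), WEIGHTS, TAGS)
print(max(probs, key=probs.get), probs)
```

Unlike the HMM setting, adding a new kind of contextual evidence here only means adding another feature string; the probability model itself does not change.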
Sequence Labeling…
Question & Answer
Thank You !!!