
Parsing and Syntax

• Syntax: the way words are related to each other in a sentence.
• Syntactic analysis determines:
  - how words are grouped together into phrases;
  - which words modify other words;
  - which words are of central importance to the sentence.
• Syntactic analysis is used in many NLP applications, such as:
  - Grammar Checking
  - Question Answering
  - Information Extraction
  - Machine Translation
Cont…
Noun Phrases
Student, the student, that student, two students, many students
Clever student
A student of computer science
AU students, long queues, the student with long hair, the city where I lived
Adjective Phrases
• incredibly short
• rather difficult
• very happy
• unbelievably quick
• exceedingly sorry about the mistake
• amazingly rich in minerals
Cont…
Verb Phrases
turn, turn on, is turning on, have been working
threatened to throw himself into the window
was an understandable reaction by the visitors
is amazingly rich in minerals
Prepositional Phrases
on the table
across the world
over your head
in the hotel
to their house
Cont…
Adverbial Phrases
• immediately
• unbelievably quickly
• very carefully
Simple Sentences
• The computer is on the table
• He went home
• They are always happy
Complex Sentences
He was driving the car that he bought from his father
We rented our house to friends while we were abroad
Cont…
Examples: Simple Sentences
Cont…
Concepts: Alphabet, String, and Language
• Formal language theory treats a language as a mathematical object defined by
an alphabet, strings and a grammar.
• Alphabet - a finite set of symbols.
  - e.g. Binary Alphabet: {0, 1}
  - Decimal Alphabet: {0, 1, 2, …, 9}
  - English Alphabet: {a, b, c, …, z, A, B, C, …, Z}
• String - a finite sequence of symbols from an alphabet.
  - e.g. Binary String: 0100101, 01101, 00110
  - Decimal String: 176392, 12, 398702
  - English String: killed, Abebe, lion, the
• Language - a (potentially infinite) set of strings over an alphabet.
  - e.g. Binary Language: {0100101, 01101, 00110, …}
  - Decimal Language: {176392, 12, 398702, …}
  - English Language: {killed, Abebe, lion, the, …}
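The three definitions can be illustrated with a minimal Python sketch. The helper name `is_string_over` and the finite sample language are assumptions made for illustration; they are not part of the slides.

```python
# An alphabet is a finite set of symbols.
BINARY = set("01")
DECIMAL = set("0123456789")


def is_string_over(s, alphabet):
    """Return True if every symbol of s belongs to the alphabet."""
    return all(symbol in alphabet for symbol in s)


# A language is a set of strings over an alphabet. This sample is finite
# and reuses the binary strings from the examples.
binary_language = {"0100101", "01101", "00110"}

print(is_string_over("0100101", BINARY))   # True
print(is_string_over("176392", BINARY))    # False
print(is_string_over("176392", DECIMAL))   # True
print(all(is_string_over(s, BINARY) for s in binary_language))  # True
```

Note that the empty string counts as a string over any alphabet, since it contains no symbol outside it.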
Cont…
Grammar - a formalism for generating the strings of a language by repeatedly replacing symbols.
- It is a 4-tuple G = (N, T, P, S), where
• N is a finite set of non-terminal symbols. In natural languages, these can be syntactic
categories, phrases or sentences.
• T is a finite set of terminal symbols (disjoint from N). It consists of elements of the target
language, such as the words or letters of a natural language.
• P is a finite set of production rules of the form α → β, where α contains at least one non-terminal.
• S is a member of N called the start symbol (a special non-terminal symbol). In natural
languages, the start symbol is a sentence.
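As a rough illustration, the 4-tuple can be written down directly in Python and used to derive a sentence by replacing symbols. The toy grammar below, the function name `leftmost_derivation` and the restriction to rules with a single non-terminal on the left-hand side are all assumptions of this sketch.

```python
# G = (N, T, P, S) for a toy fragment of English (hypothetical example).
N = {"S", "NP", "VP", "Det", "Noun", "Verb"}
T = {"the", "lion", "student", "killed"}
P = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Det": [["the"]],
    "Noun": [["lion"], ["student"]],
    "Verb": [["killed"]],
}
S = "S"


def leftmost_derivation(choices):
    """Rewrite the leftmost non-terminal at each step.

    choices[i] is the index of the rule alternative applied at step i.
    Returns the list of sentential forms, starting from [S].
    """
    form = [S]
    steps = [form]
    for choice in choices:
        # Position of the leftmost non-terminal in the current form.
        i = next(k for k, sym in enumerate(form) if sym in N)
        form = form[:i] + P[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps


# Nine replacement steps; the last one picks "student" for Noun.
for form in leftmost_derivation([0, 0, 0, 0, 0, 0, 0, 0, 1]):
    print(" ".join(form))
```

The last line printed is `the lion killed the student`, a string made up only of terminal symbols.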
Cont…
Hierarchy of Grammars/Languages
• Also known as the Chomsky classification, the hierarchy of grammars/languages ranks
grammars by their expressive power.
• The classes of grammars/languages are defined by placing different constraints on
production rules, which results in different structural complexity of the sentences they
can describe.
• The Chomsky classification consists of the following four levels of grammars/languages:
  - Type 0 (Unrestricted / Recursively Enumerable)
  - Type I (Context-Sensitive)
  - Type II (Context-Free)
  - Type III (Regular)
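The constraints on production rules can be sketched as a small classifier. This is an illustrative simplification under stated assumptions: non-terminals are written as upper-case symbols, regular rules are taken in right-linear form (A → a or A → a B), and context-sensitive rules are recognized by the non-contracting condition (the left-hand side is no longer than the right-hand side). The function names are hypothetical.

```python
def is_nonterminal(sym):
    # Assumption of this sketch: non-terminals are upper-case symbols.
    return sym.isupper()


def rule_type(lhs, rhs):
    """Return the most restrictive type (3, 2, 1 or 0) that one rule satisfies."""
    if len(lhs) == 1 and is_nonterminal(lhs[0]):
        # Right-linear forms: A -> a  or  A -> a B
        if len(rhs) == 1 and not is_nonterminal(rhs[0]):
            return 3
        if (len(rhs) == 2 and not is_nonterminal(rhs[0])
                and is_nonterminal(rhs[1])):
            return 3
        return 2  # a single non-terminal on the left: context-free
    if len(lhs) <= len(rhs):
        return 1  # non-contracting: context-sensitive
    return 0      # anything else: unrestricted


def grammar_type(rules):
    """A grammar is only as restricted as its least restricted rule."""
    return min(rule_type(lhs, rhs) for lhs, rhs in rules)


regular = [(("S",), ("a", "S")), (("S",), ("b",))]
context_free = [(("S",), ("a", "S", "b")), (("S",), ("a", "b"))]
context_sensitive = [(("S",), ("a", "S", "B", "C")), (("C", "B"), ("B", "C"))]
unrestricted = [(("A", "B"), ("a",))]

print(grammar_type(regular))            # 3
print(grammar_type(context_free))       # 2
print(grammar_type(context_sensitive))  # 1
print(grammar_type(unrestricted))       # 0
```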
Cont…
Hierarchy of Grammars/Languages: Type 0 (Unrestricted)
Cont…
Hierarchy of Grammars/Languages: Type I (Context-Sensitive)
Cont…
Hierarchy of Grammars/Languages: Type II (Context-Free)
Cont…
Hierarchy of Grammars/Languages: Type III (Regular)
Cont…
Parsing
• Is the process of recognizing sentences and assigning structure to them.
• Is a derivation process that identifies the structure of sentences using a given grammar.
• Can be considered a special case of a search problem.
  - Two basic search strategies are used:
    - top-down strategy
    - bottom-up strategy
  - Methods of improving efficiency:
    - storing lexical rules separately
    - chunking
Cont…
Parsing Strategies: Top-down Parsing
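A top-down parser starts from the start symbol and expands rules until the words of the sentence are reached. The following is a minimal backtracking sketch, not a definitive implementation. The toy grammar and the names `expand`, `expand_sequence` and `parse_top_down` are assumptions, and the grammar deliberately has no left-recursive rules, since this style of parser would loop on them.

```python
# Hypothetical toy grammar; symbols without an entry are terminals (words).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Pron"]],
    "VP": [["V", "NP"], ["V", "PP"], ["V"]],
    "PP": [["P", "NP"]],
    "Det": [["the"]],
    "N": [["computer"], ["table"]],
    "Pron": [["he"]],
    "V": [["is"], ["saw"]],
    "P": [["on"]],
}


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way symbol derives words[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word.
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:  # try each rule for this non-terminal
        yield from expand_sequence(symbol, rhs, words, pos)


def expand_sequence(symbol, rhs, words, pos):
    """Expand the right-hand side left to right, keeping every alternative."""
    partial = [([], pos)]
    for child in rhs:
        partial = [(kids + [tree], nxt)
                   for kids, p in partial
                   for tree, nxt in expand(child, words, p)]
    for kids, end in partial:
        yield (symbol, *kids), end


def parse_top_down(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    words = sentence.lower().split()
    for tree, end in expand("S", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_top_down("The computer is on the table"))
```

For input that the grammar cannot derive, such as `"the is table"`, the function returns `None`.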
Cont…
Parsing Strategies: Bottom-up Parsing
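A bottom-up parser starts from the words and combines them into larger constituents until only the start symbol remains. One common way to realize this is a shift-reduce search, sketched below with simple backtracking. The rule list and the names `search` and `parse_bottom_up` are assumptions of this sketch.

```python
# Hypothetical toy rules as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "PP")),
    ("PP", ("P", "NP")),
    ("Det", ("the",)),
    ("N", ("computer",)),
    ("N", ("table",)),
    ("V", ("is",)),
    ("P", ("on",)),
]


def search(stack, buffer, trace=()):
    """Shift-reduce search with backtracking.

    Returns the list of actions that reduces the input to S, or None.
    """
    if stack == ("S",) and not buffer:
        return list(trace)
    # Reduce: replace a right-hand side on top of the stack by its left-hand side.
    for lhs, rhs in RULES:
        n = len(rhs)
        if stack[-n:] == rhs:
            action = f"reduce {lhs} -> {' '.join(rhs)}"
            result = search(stack[:-n] + (lhs,), buffer, trace + (action,))
            if result is not None:
                return result
    # Shift: move the next word from the buffer onto the stack.
    if buffer:
        return search(stack + (buffer[0],), buffer[1:],
                      trace + (f"shift {buffer[0]}",))
    return None


def parse_bottom_up(sentence):
    return search((), tuple(sentence.lower().split()))


for action in parse_bottom_up("The computer is on the table"):
    print(action)
```

The printed trace begins with `shift the` and ends with `reduce S -> NP VP`.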
Cont…
Towards Efficient Parsing: Separating Lexical Rules
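One way to read this idea: rules that introduce words (such as N → table) are taken out of the grammar and stored in a lexicon, so the parser finds a word's categories by a direct lookup instead of searching through lexical rules. The sketch below assumes a hypothetical lexicon and category names.

```python
# Structural rules mention only syntactic categories (hypothetical example).
STRUCTURAL_RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "PP"], ["V", "NP"]],
    "PP": [["P", "NP"]],
}

# Lexical rules are stored separately: word -> set of possible categories.
LEXICON = {
    "the": {"Det"},
    "computer": {"N"},
    "table": {"N"},
    "house": {"N", "V"},   # an ambiguous word keeps all of its categories
    "is": {"V"},
    "on": {"P"},
}


def lookup_categories(sentence):
    """Return the category set of each word; unknown words get an empty set."""
    return [LEXICON.get(word, set()) for word in sentence.lower().split()]


print(lookup_categories("The computer is on the table"))
```

With this split, adding vocabulary only changes the lexicon and leaves the structural rules untouched.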
Cont…
Towards Efficient Parsing: Chunking
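Chunking groups words into flat, non-overlapping phrases before or instead of full parsing. The sketch below groups an optional determiner, any adjectives and one or more nouns into a noun-phrase chunk. The input is assumed to be already tagged with parts of speech; the tag names and the function `chunk_noun_phrases` are hypothetical.

```python
def chunk_noun_phrases(tagged):
    """Group each maximal Det? Adj* N+ sequence into an ("NP", words) chunk.

    tagged is a list of (word, tag) pairs; other tokens are passed through.
    """
    chunks, i = [], 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "Det":                            # optional determiner
            j += 1
        while j < len(tagged) and tagged[j][1] == "Adj":     # any adjectives
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "N":       # one or more nouns
            k += 1
        if k > j:
            # At least one noun was found: emit the chunk.
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
        else:
            # No noun follows: keep the current token unchunked.
            chunks.append(tagged[i])
            i += 1
    return chunks


tagged = [("the", "Det"), ("clever", "Adj"), ("student", "N"),
          ("rented", "V"), ("the", "Det"), ("house", "N")]
print(chunk_noun_phrases(tagged))
```

For this input the two noun phrases `the clever student` and `the house` are chunked, and the verb `rented` is left as a single token.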
Applications of parsing

• Machine translation (English → tree operations → Chinese)
• Speech synthesis from parses
• Speech recognition using parsing
  - Put the file in the folder.
  - Put the file and the folder.
• Grammar checking
• Indexing for information retrieval
• Information extraction