NATURAL LANGUAGE PROCESSING

Dr. Ahmed El-Bialy

Why Natural Language Processing?
Huge amounts of data:
- Internet: at least 20 billion pages
- Intranets
Applications for processing large amounts of text require NLP expertise:
- Classify text into categories
- Index and search large texts
- Automatic translation
- Speech understanding: understand phone conversations
- Information extraction: extract useful information from resumes
- Automatic summarization: condense 1 book into 1 page
- Question answering
- Knowledge acquisition
- Text generation / dialogues

Where does it fit in the CS taxonomy?
- Computers
  - Databases
  - Artificial Intelligence
    - Robotics
    - Natural Language Processing
      - Information Retrieval
      - Machine Translation
      - Language Analysis
        - Semantics
        - Parsing
    - Search
  - Algorithms
  - Networking

NLP aims at:
- making computers talk
- endowing computers with the linguistic ability of humans

Dialog systems in reality
E-commerce: AINI
- A chatterbot integrated with a 3D animated agent character
- Improves customer service
- Reduces customer reliance on human operators

Dialog systems in reality
E-teaching: AutoTutor (http://www.autotutor.org/what/what.htm)
- An intelligent tutoring system that helps students learn by holding a conversation in natural language
- Animated agent: synthesized speech, intonation, facial expressions, and gestures
- Demo (from 2002)

Machine translation
- Automatically translate a document from one language to another
- Very useful on the web
- Far from a solved problem

The Dream
It'd be great if machines could:
- Process our email (usefully)
- Translate languages accurately
- Help us manage, summarize, and aggregate information
- Use speech as a UI (when needed)
- Talk to us / listen to us
But they can't: language is complex, ambiguous, flexible, and subtle.
Good solutions need linguistics and machine learning knowledge.

The mystery
What's now impossible for computers (and any other species) to do is effortless for humans.

What is NLP?
- Fundamental goal: deep understanding of broad language
- Not just string processing or keyword matching!

Role of Knowledge in Language Understanding
- Understanding word meanings
- Inference
- Speaker goals & assumptions
- Non-monotonic reasoning
Word meanings are not enough.

What is Natural Language Processing (NLP)?
Computers use (analyze, understand, generate) natural language.
Text Processing:
- Lexical: tokenization, part of speech, head, lemmas
- Parsing and chunking
- Semantic tagging: semantic role, word sense
- Certain expressions: named entities
- Discourse: coreference, discourse segments
Speech Processing:
- Phonetic transcription
- Segmentation (punctuation)
- Prosody
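As a minimal sketch of the lexical step (tokenization) listed above, not part of the original slides, the Python snippet below splits raw text into word and punctuation tokens with one regular expression; the function name tokenize and the example sentence are illustrative choices.

    import re

    def tokenize(text):
        """Very rough lexical step: split text into word and punctuation tokens."""
        return re.findall(r"\w+|[^\w\s]", text)

    print(tokenize("Ahmed ate the food, with a spoon!"))
    # ['Ahmed', 'ate', 'the', 'food', ',', 'with', 'a', 'spoon', '!']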

The NLP Problem & the Symbolic Approach
Levels of analysis: Morphology, Syntax, Semantics, Pragmatics, Discourse
- We can go up and down the levels, and combine steps too!
- Every step is equally complex.

Vocabulary
- Prosody
- Phonology
- Morphology
- Syntax
- Semantics
- Pragmatics: "Do you know the time?"
- World Knowledge
- Parsing

Stages of Language Analysis
- Parsing: analysis of the syntactic structure of sentences, producing a parse tree
  Requires: language syntax, morphology
- Semantic interpretation: some semantics
  Requires: word meanings and linguistic structure
- Contextual / world knowledge

Syntax
The description of the legal structures in a language is called a grammar. A grammar is typically written as a set of rewrite rules such as the ones shown here. Boldface symbols, such as the, boy, John, etc., are the words of the language, also known as the terminal symbols. The non-boldface symbols, such as S, NP and VP, are known as non-terminal symbols, in that they can be further re-written. The rule for S, for example, indicates that a sentence can be broken down into two constituents, a noun phrase and a verb phrase. The verb phrase, in turn, can be composed of another verb phrase followed by a prepositional phrase. Given a sentence, we use the grammar to find the legal structures for that sentence. This process is called parsing the sentence. The result is one or more parse trees, such as the one shown here. Our attempt to understand sentences will be based on assigning meaning to the individual constituents and then combining them to construct the meaning of the sentence; in this sense, the constituent phrases are the atoms of meaning.

Specification and parsing using context-free grammars
- Alphabet V: a set of symbols (digits and letters)
- Sentence: x over V, a string of finite length
- Length of x: |x| = number of symbols. E.g. for V = {a, b}, the sentences with |x| = 2 are {aa, ab, bb, ba}, and with |x| = 3, {aaa, aab, aba, abb, ...}
- Concatenation of x with y (where x and y are sentences over V) is xy, and |xy| = |x| + |y|
- The empty string ε is the string of length 0
- V*, the closure of V: the infinite set of all sentences over V, including ε
- V+ = V* − {ε}

Specification and parsing using context-free grammars
E.g. V = {a, b}:
- V* = {ε, a, b, aa, ab, ba, bb, aaa, ...}
- V+ = {a, b, aa, ab, ba, bb, aaa, ...}
Any language is a set of sentences over the alphabet, i.e. a subset of V*.
Informally, a language definition includes some rules for generating the sentences of the language.
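A small illustrative Python sketch (the helper name closure_up_to is an assumption, not from the slides) that enumerates V* for V = {a, b} up to a length bound, since V* itself is infinite:

    from itertools import product

    def closure_up_to(V, max_len):
        """List the strings of V* up to max_len symbols (V* itself is infinite)."""
        strings = [""]  # the empty string epsilon
        for n in range(1, max_len + 1):
            strings += ["".join(p) for p in product(sorted(V), repeat=n)]
        return strings

    print(closure_up_to({"a", "b"}, 2))
    # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']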

Context-free grammars
G = (N, Σ, S, P)
- N is a finite set of nonterminals (variables)
- Σ is a finite set of terminals (constants); N ∩ Σ = ∅, V = N ∪ Σ
- S is the start symbol (one of the nonterminals)
- P is a set of rules/productions of the form X → α, where X is a nonterminal and α is a sequence of terminals and nonterminals (possibly the empty sequence)
- A grammar G generates a language L

Context-free grammars
E.g. N = {S}, Σ = {a, b}, P = {S → aS (rule g1), S → b (rule g2)}
Applying the rules to derive aaaab:
S ⇒ aS ⇒ aaS ⇒ aaaS ⇒ aaaaS ⇒ aaaab   (g1, g1, g1, g1, g2)
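To make the derivation concrete, here is a hedged Python sketch (the rule names g1/g2 follow the slide; the helper apply_rule is mine) that rewrites the leftmost S at each step and prints the sentential forms:

    # Grammar from the slide: g1: S -> a S,  g2: S -> b
    RULES = {"g1": ("S", ["a", "S"]), "g2": ("S", ["b"])}

    def apply_rule(form, name):
        """Rewrite the leftmost occurrence of the rule's left-hand side."""
        lhs, rhs = RULES[name]
        i = form.index(lhs)
        return form[:i] + rhs + form[i + 1:]

    form = ["S"]
    for name in ["g1", "g1", "g1", "g1", "g2"]:
        form = apply_rule(form, name)
        print("".join(form))
    # prints: aS, aaS, aaaS, aaaaS, aaaab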

Context-free grammars
E.g. "ahmed ate the food with a spoon":
S → NP VP
NP → N
NP → art N
NP → NP PP
VP → V NP
VP → VP PP
PP → P NP
N → ahmed | food | spoon
V → ate
P → with
art → the | a

Nonterminals:
S: sentence
NP: noun phrase
VP: verb phrase
art: article
N: noun
PP: prepositional phrase
V: verb
P: preposition
Terminals: ahmed, food, spoon, ate, with, the, a

What is Parsing?
Grammar:
S → NP VP
NP → N
NP → art N
NP → NP PP
VP → V NP
VP → VP PP
PP → P NP
N → ahmed | food | spoon
V → ate
P → with
art → the | a

One parse tree for "ahmed ate the food with a spoon" (PP attached to the VP):
(S (NP (N ahmed))
   (VP (VP (V ate) (NP (art the) (N food)))
       (PP (P with) (NP (art a) (N spoon)))))
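As an illustration of what a parser has to do with this grammar (the brute-force enumerator below is my own sketch, not an algorithm from the slides), the code tries every rule and every split point and collects all parse trees; it finds two trees for the example sentence because the PP can attach either to the VP or to the NP "the food".

    GRAMMAR = {
        "S":   [["NP", "VP"]],
        "NP":  [["N"], ["art", "N"], ["NP", "PP"]],
        "VP":  [["V", "NP"], ["VP", "PP"]],
        "PP":  [["P", "NP"]],
        "N":   [["ahmed"], ["food"], ["spoon"]],
        "V":   [["ate"]],
        "P":   [["with"]],
        "art": [["the"], ["a"]],
    }

    def parses(sym, words):
        """Return every parse tree deriving the word list `words` from symbol `sym`."""
        if len(words) == 1 and sym == words[0]:      # a terminal matches itself
            return [sym]
        trees = []
        for rhs in GRAMMAR.get(sym, []):
            if len(rhs) == 1:                        # unit or lexical rule
                trees += [(sym, t) for t in parses(rhs[0], words)]
            else:                                    # binary rule: try every split point
                for k in range(1, len(words)):
                    for left in parses(rhs[0], words[:k]):
                        for right in parses(rhs[1], words[k:]):
                            trees.append((sym, left, right))
        return trees

    for tree in parses("S", "ahmed ate the food with a spoon".split()):
        print(tree)    # two trees: PP attached to the VP, or to the NP "the food"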

Top-Down Space
S → NP VP
S → Aux NP VP
S → VP
VP → V
VP → V NP
NP → PropN
NP → art N
NP → NP PP
VP → VP PP
PP → P NP

Bottom-Up Parsing
Of course, we also want trees that cover the input words. So we might also start with trees that link up with the words in the right way, and then work our way up from there to larger and larger trees.
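One way to make the bottom-up idea concrete (my own sketch, reusing the earlier toy grammar for "ahmed ate the food with a spoon"; it is not the algorithm depicted on the slides) is a breadth-first search over reductions: repeatedly replace any substring that matches a rule's right-hand side with its left-hand side, and succeed when only S remains.

    from collections import deque

    # The toy grammar, written as (lhs, rhs) reduction rules.
    RULES = [("S", ["NP", "VP"]), ("NP", ["N"]), ("NP", ["art", "N"]), ("NP", ["NP", "PP"]),
             ("VP", ["V", "NP"]), ("VP", ["VP", "PP"]), ("PP", ["P", "NP"]),
             ("N", ["ahmed"]), ("N", ["food"]), ("N", ["spoon"]),
             ("V", ["ate"]), ("P", ["with"]), ("art", ["the"]), ("art", ["a"])]

    def bottom_up_recognize(sentence):
        """Search bottom-up: reduce any matching right-hand side until only S is left."""
        start = tuple(sentence.split())
        seen, queue = {start}, deque([start])
        while queue:
            form = queue.popleft()
            if form == ("S",):
                return True
            for lhs, rhs in RULES:
                n = len(rhs)
                for i in range(len(form) - n + 1):
                    if list(form[i:i + n]) == rhs:
                        reduced = form[:i] + (lhs,) + form[i + n:]
                        if reduced not in seen:
                            seen.add(reduced)
                            queue.append(reduced)
        return False

    print(bottom_up_recognize("ahmed ate the food with a spoon"))   # True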

Bottom-Up Search (search-space figures)

Transition Network Parsers
- The grammar is represented as a finite state machine
- Each network = a transition diagram for one nonterminal
- Each arc = a transition (nonterminal or terminal)
- Nodes are states
- Parsing is top-down

Transition Network Parsers
Here's a very simple example:
S → NP VP
NP → ART NOUN
NP → POSS NOUN
VP → VERB NP
NOUN → frog | dog
ART → a
VERB → ate
POSS → my
Example sentence: "My dog ate a frog."

Parse tree
The tree notation is difficult to compute with directly, so we convert the representation into a more useful form:
(S (NP (POSS my) (NOUN dog))
   (VP (VERB ate) (NP (ART a) (NOUN frog))))
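A tiny sketch (illustrative; the nested-tuple encoding and the function name bracket are assumptions, not from the slides) that renders such a tree in the bracketed list notation:

    def bracket(tree):
        """Render a (label, child, ...) tuple as bracketed list notation."""
        if isinstance(tree, str):
            return tree
        label, *children = tree
        return "(" + label + " " + " ".join(bracket(c) for c in children) + ")"

    tree = ("S", ("NP", ("POSS", "my"), ("NOUN", "dog")),
                 ("VP", ("VERB", "ate"), ("NP", ("ART", "a"), ("NOUN", "frog"))))
    print(bracket(tree))
    # (S (NP (POSS my) (NOUN dog)) (VP (VERB ate) (NP (ART a) (NOUN frog))))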

Transition Network
S network:  S0 --NP--> S1 --VP--> S2
NP network: NP0 --ART/POSS--> NP1 --NOUN--> NP2
VP network: VP0 --VERB--> VP1 --NP--> VP2
When the lexicon gets really big, drawing the networks takes forever!
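Rather than drawing the networks, they can be encoded as data. The sketch below is my own illustration (the dictionaries NETWORKS and LEXICON and the helpers traverse/accepts are assumptions, not from the slides); it stores one small finite-state network per nonterminal and parses top-down, "pushing" into a sub-network whenever an arc is labelled with a nonterminal:

    # Each network maps a state to a list of (arc_label, next_state) pairs.
    NETWORKS = {
        "S":  {"S0": [("NP", "S1")], "S1": [("VP", "S2")], "S2": []},
        "NP": {"NP0": [("ART", "NP1"), ("POSS", "NP1")], "NP1": [("NOUN", "NP2")], "NP2": []},
        "VP": {"VP0": [("VERB", "VP1")], "VP1": [("NP", "VP2")], "VP2": []},
    }
    START = {"S": "S0", "NP": "NP0", "VP": "VP0"}
    FINAL = {"S": {"S2"}, "NP": {"NP2"}, "VP": {"VP2"}}
    LEXICON = {"my": "POSS", "dog": "NOUN", "ate": "VERB", "a": "ART", "frog": "NOUN"}

    def traverse(net, state, words, i):
        """Yield every input position reachable after completing `net` from `state`."""
        if state in FINAL[net]:
            yield i
        for label, nxt in NETWORKS[net][state]:
            if label in NETWORKS:                       # nonterminal arc: push into sub-network
                for j in traverse(label, START[label], words, i):
                    yield from traverse(net, nxt, words, j)
            elif i < len(words) and LEXICON.get(words[i]) == label:   # word-category arc
                yield from traverse(net, nxt, words, i + 1)

    def accepts(sentence):
        words = sentence.lower().split()
        return any(j == len(words) for j in traverse("S", START["S"], words, 0))

    print(accepts("My dog ate a frog"))   # True
    print(accepts("My dog ate frog"))     # False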

Transition Network (diagrams: the S, NP, and VP networks with word arcs for my, dog, ate, a, frog)

Why do we want recursive rules in our grammar?
Natural languages allow us to express an infinite range of ideas using a finite set of rules and symbols:
- The boy drowned.
- The boy with the raft drowned.
- The boy with the raft near the island drowned.
- The boy with the raft near the island in the ocean drowned.
- ...
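A toy sketch of this point (the function name and the fixed list of prepositional phrases are mine): because the rule NP → NP PP is recursive, a single loop can keep appending prepositional phrases and produce ever longer, still grammatical sentences.

    def boy_sentence(n_pps):
        """Stack n_pps prepositional phrases onto 'the boy' via the recursive NP -> NP PP rule."""
        pps = ["with the raft", "near the island", "in the ocean"]
        np = "The boy"
        for i in range(n_pps):
            np += " " + pps[i % len(pps)]
        return np + " drowned."

    for n in range(4):
        print(boy_sentence(n))
    # The boy drowned. / The boy with the raft drowned. / ... and so on, without limit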

A sample recursive transition network: "The boy broke the window with a hammer."


Transition Network Parsers

Thank You
