Natural Language Processing
Zhao Hai 赵海
Department of Computer Science and Engineering
Shanghai Jiao Tong University
2010-2011
zhaohai@cs.sjtu.edu.cn
Outline
Course Goals
Course Schedule
Course Requirements
Overview
21/7/10 09:26 2
Course Goals
Course Schedule (1)
1. Overview (2 lhs = 2 lecture hours)
Course Schedule (2)
2. Lexicons and Lexical Analysis (11 lhs)
Course Schedule (3)
3. Syntactic Processing (14 lhs)
Course Schedule (4)
4. Semantic Interpretation (6 lhs)
Course Schedule (5)
5. Machine Learning Approaches for Natural Language Processing (6 lhs)
Course Schedule (7)
7. Student Workshop (2 lhs)
Course Schedule (8)
Curriculum Schedule
Course Requirements (1)
1. Final Grade
Course Requirements (2)
2. Texts and References
Course Requirements (3)
ACL Anthology
http://www.aclweb.org/anthology-new/
Course Requirements (4)
4. FTP Site and Contact Email
zhaohai@cs.sjtu.edu.cn
Overview (1)
Natural Language Understanding (1)
What is Natural Language?
Overview (2)
Natural Language Understanding (2)
NLP & NLU (1)
Overview (4)
Natural Language Understanding (4)
Why is NLU a Difficult Task? (1)
Overview (5)
Natural Language Understanding (5)
Why is NLU a Difficult Task? (2)
Type of mapping
There are one-to-one, many-to-one, one-to-many, and many-to-many
mappings. One-to-many mappings require a great deal of domain
knowledge beyond the input to make the correct choice among target
representations.
For example (one-to-many): a) a tall giraffe vs. b) a tall poodle.
The same word "tall" implies very different heights in the two phrases
(a poodle is a small dog with thick curly hair and a proud bearing).
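The giraffe/poodle example can be sketched in code. The following is a minimal illustration, not a real NLU component: the height table, the function name interpret_tall and the factor 1.2 are all assumptions invented here to show that the reading of "tall" depends on knowledge about the noun, which is not present in the input itself.

```python
# Hypothetical typical heights in metres, made up for illustration only.
TYPICAL_HEIGHT_M = {"giraffe": 5.0, "poodle": 0.4}


def interpret_tall(noun, factor=1.2):
    """Map 'tall <noun>' to a minimum height using knowledge about the noun.

    The factor is an arbitrary assumption: here 'tall' is read as
    'at least 20% above the typical height of this kind of thing'.
    """
    if noun not in TYPICAL_HEIGHT_M:
        # Without domain knowledge the one-to-many mapping cannot be resolved.
        raise KeyError(f"no domain knowledge about {noun!r}")
    return factor * TYPICAL_HEIGHT_M[noun]


if __name__ == "__main__":
    for noun in ("giraffe", "poodle"):
        print(f"a tall {noun}: at least {interpret_tall(noun):.2f} m")
```

The single input word maps to a different target value for each noun, and the choice is made entirely by the lookup table, which stands in for the domain knowledge the slide refers to.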
Overview (6)
Natural Language Understanding (6)
Why is NLU a Difficult Task? (3)
Overview (7)
Natural Language Understanding (7)
Why is NLU a Difficult Task? (4)
Overview (8)
Natural Language Understanding (8)
Why is NLU a Difficult Task? (5)
Overview (10)
Natural Language Understanding (10)
Computational Linguistics (2)
[Figure: Computational Linguistics and related fields. Labels: Engineering, Science, Bioscience, Psychology, Cognitive Science, AI, Computer Science, Philosophy, Linguistics]
Overview (11)
Natural Language Understanding (11)
Symbolic Processing
http://translate.google.com/?hl=zh-CN&tab=wT#auto|en|
The increasing demand for these services will push providers to improve
their quality;
Translation providers will find ways to enlarge vocabularies and raise
translation quality semi-automatically from terminological resources,
bilingual corpora and similar sources.
Overview (15)
Natural Language Understanding (15)
Machine Translation (4)
Overview (19)
Different Levels of Language Analysis (3)
Morphological Analysis (2)
Overview (20)
Different Levels of Language Analysis (4)
Syntactic Analysis (1)
Its goal is to break down given textual units, e.g. sentences,
into smaller constituents, to assign categorical labels to them, and
to identify the grammatical relations that hold between the
various parts.
In most parsers, the grammar is separated from the processing
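As an illustration of keeping the grammar apart from the processing, here is a minimal sketch of a CKY chart parser. The toy grammar, the lexicon and the names BINARY_RULES, LEXICON and parse are assumptions made up for this example. The sketch only splits a sentence into constituents and assigns category labels; it does not identify grammatical relations.

```python
# Hypothetical toy grammar in Chomsky normal form, stored as plain data.
# (left child, right child) -> parent category
BINARY_RULES = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}
# Hypothetical lexicon: word -> category
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "chased": "V"}


def parse(words, binary_rules=BINARY_RULES, lexicon=LEXICON):
    """CKY parsing: return one tree labelled S over all words, or None."""
    n = len(words)
    chart = {}  # chart[(i, j)] maps a category to a tree over words[i:j]
    for i, word in enumerate(words):
        cell = {}
        label = lexicon.get(word)
        if label is not None:
            cell[label] = (label, word)
        chart[(i, i + 1)] = cell
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            cell = {}
            for split in range(start + 1, end):
                for left_label, left_tree in chart[(start, split)].items():
                    for right_label, right_tree in chart[(split, end)].items():
                        parent = binary_rules.get((left_label, right_label))
                        if parent is not None and parent not in cell:
                            cell[parent] = (parent, left_tree, right_tree)
            chart[(start, end)] = cell
    return chart.get((0, n), {}).get("S")


if __name__ == "__main__":
    print(parse("the dog chased the cat".split()))
```

The parse function never mentions a specific category other than the start symbol S, so a different grammar or lexicon can be passed in without changing the procedure.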
Overview (24)
Different Levels of Language Analysis (8)
Semantic Analysis (2)
Overview (25)
Different Levels of Language Analysis (9)
Pragmatic Analysis
Overview (26)
Different Levels of Language Analysis (10)
Discourse Analysis
Overview (28)
Different Levels of Language Analysis (12)
Examples
Overview (30)
Applied Approaches in NLU Systems (2)
Historical Categories
Overview (32)
Applied Approaches in NLU Systems (4)
SAD-SAM [Lindsay, 1963]
ELIZA [Joseph Weizenbaum, 1966]
ELIZA was built at MIT in 1966 and was the most famous
pattern-matching natural language system. The system assumes the role
of a Rogerian, or “nondirective”, therapist in its dialog with the
user.
It operated by matching the left sides of its rules against the
user’s last sentence and using the corresponding right side to
generate a response. Rules were indexed by keywords, so only a
few had to be matched against a particular sentence. Some rules
had no left side, so they could apply anywhere.
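The rule-matching procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original program: the rules, the responses and the names RULES, DEFAULT_RESPONSES and respond are invented for this example, and keyword ranking is left out.

```python
import re

# Hypothetical rules, invented for illustration. Each keyword indexes a
# list of (left side, right side) pairs: a pattern and a response template.
RULES = {
    "mother": [
        (re.compile(r".*\bmy mother (.*)", re.I),
         "Tell me more about your family."),
    ],
    "i": [
        (re.compile(r".*\bi am (.*)", re.I),
         "How long have you been {0}?"),
    ],
}
# Rules with no left side: they can apply to any sentence.
DEFAULT_RESPONSES = ["Please go on."]


def respond(sentence):
    """Answer the user's last sentence with the first rule that matches."""
    text = sentence.strip().rstrip(".!?")
    # Only rules indexed by a keyword present in the sentence are tried.
    for word in re.findall(r"[a-z']+", text.lower()):
        for pattern, template in RULES.get(word, []):
            match = pattern.match(text)
            if match:
                # Fill the right side with the text captured by the left side.
                return template.format(*match.groups())
    return DEFAULT_RESPONSES[0]


if __name__ == "__main__":
    for line in ("I am sad.", "My mother hates me", "The weather is nice"):
        print(f"User: {line}\nELIZA: {respond(line)}")
```

Indexing the rules by keyword means that a sentence containing none of the keywords skips pattern matching entirely and falls through to a rule without a left side.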
Overview (34)
Applied Approaches in NLU Systems (6)
ELIZA: Sample Data
Word Rank Pattern Outputs
Overview (35)
Applied Approaches in NLU Systems (7)
ELIZA: A Dialogue
User: ELIZA:
Overview (36)
Applied Approaches in NLU Systems (8)
SIR [Bertram Raphael, 1968]
Overview (37)
Applied Approaches in NLU Systems (9)
LUNAR [William Woods, 1973] (1)
Overview (38)
Applied Approaches in NLU Systems (10)
LUNAR [William Woods, 1973] (2)
Overview (39)
Applications of NLU (1)
Text-Based Applications
Overview (41)
CL Research Topics (1)
Call for Papers from ACL-2006 (1)
Overview (42)
CL Research Topics (2)
Call for Papers from ACL-2006 (2)
Overview (43)
CL Research Topics (3)
Call for Papers from ACL-2010 (1)
Discourse, dialogue, and pragmatics
Grammar engineering
Information extraction
Information retrieval
Knowledge acquisition
Large scale language processing
Language generation
Language processing in domains such as bioinformatics, legal, medical, etc.
Language resources, evaluation methods and metrics, science of annotation
Lexical/ontological/formal semantics
Machine translation
Mathematical linguistics, grammatical formalisms
Mining from textual and spoken language data
Overview (44)
CL Research Topics (4)
Call for Papers from ACL-2010 (2)
Multilingual language processing
Multimodal language processing (including speech, gestures, and other communication media)
NLP applications and systems
NLP on noisy unstructured text, such as emails, blogs, SMS
Phonology/morphology, tagging and chunking, word segmentation
Psycholinguistics
Question answering
Semantic role labeling
Sentiment analysis and opinion mining
Spoken language processing
Statistical and machine learning methods
Summarization
Syntax, parsing, grammar induction
Text mining
Textual entailment and paraphrasing
Topic and text classification
Word sense disambiguation
Overview (45)
CL Research Topics (5)
Accepted Regular Paper Statistics for JSCL-2005
Overview (46)
Assignments (1)