
Lecture 1: Introduction

Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net

Course webpage: http://kwchang.net/teaching/NLP16

CS6501– Natural Language Processing 1


CuuDuongThanCong.com https://fb.com/tailieudientucntt
Announcements

• Waiting list: Start attending the first few meetings of the class as if you are registered. Given that some students will drop the class, some space will free up.
• We will use Piazza as an online discussion platform. Please enroll.

Staff

• Instructor: Kai-Wei Chang
  • Email: nlp16@kwchang.net
  • Office: R412 Rice Hall
  • Office hour: 2:00 – 3:00, Tue (after class)
  • Additional office hour: 3:00 – 4:00, Thu
• TA: Wasi Ahmad
  • Email: wua4nw@virginia.edu
  • Office: R432 Rice Hall
  • Office hour: 4:00 – 5:00, Mon

This lecture

• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

What is NLP

• Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

Go beyond keyword matching

• Identify the structure and meaning of words, sentences, texts, and conversations
• Deep understanding of broad language
• NLP is all around us

Machine translation

Facebook translation, image credit: Meedan.org

Statistical machine translation

Image credit: Julia Hockenmaier, Intro to NLP

Dialog Systems

Sentiment/Opinion Analysis

Text Classification

www.wired.com

• Other applications?

Question answering

'Watson' computer wins at 'Jeopardy'

credit: ifunny.com

Question answering

• Go beyond search

Natural language instruction

https://youtu.be/KkOCeAtKHIc?t=1m28s

Digital personal assistant
More on natural language instruction

credit: techspot.com

• Semantic parsing – understand tasks
• Entity linking – “my wife” = “Kellie” in the phone book

Information Extraction

• Unstructured text to database entries

Yoav Artzi: Natural language processing

Language Comprehension

Christopher Robin is alive and well. He is the same
person that you read about in the book, Winnie the Pooh.
As a boy, Chris lived in a pretty home called Cotchfield
Farm. When Chris was three years old, his father wrote
a poem about him. The poem was printed in a magazine
for others to read. Mr. Robin then wrote a book

• Q: Who wrote Winnie the Pooh?
• Q: Where did Chris live?

What will you learn from this course

• The NLP Pipeline
• Key components for understanding text
• NLP systems/applications
• Current techniques & limitations
• Build realistic NLP tools

What’s not covered by this course

• Speech recognition – no signal processing
• Natural language generation
• Details of ML algorithms / theory
• Text mining / information retrieval

This lecture

• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Overview

• New course, first time being offered
  • Comments are welcome
• Aimed at first- or second-year PhD students
• Lecture + Seminar
• No course prerequisites, but I assume
  • programming experience (for the final project)
  • basics of probability, calculus, and linear algebra (HW0)

Grading

• No exam & HW -- hooray
• Lectures & forum
  • Participate in discussion (additional credit)
• Review quizzes (25%): 3 quizzes
• Critical review report (10%)
• Paper presentation (15%)
• Final project (50%)
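As a quick check of the arithmetic, the four graded components above add up to 100%. The sketch below computes a weighted course score from those weights. The function name `course_score` and the example scores are hypothetical, chosen only for illustration, and the additional participation credit is not modeled because the slide does not say how it is counted.

```python
# Weights come from the grading slide; everything else here is illustrative.
WEIGHTS = {
    "quizzes": 0.25,
    "critical_review": 0.10,
    "paper_presentation": 0.15,
    "final_project": 0.50,
}


def course_score(scores, weights=WEIGHTS):
    """Return the weighted sum of component scores (each on a 0-100 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical scores, not real data.
example_scores = {
    "quizzes": 80,
    "critical_review": 90,
    "paper_presentation": 70,
    "final_project": 85,
}
total = course_score(example_scores)
print(f"{total:.1f}")  # prints 82.0
```

With these made-up scores, the final project contributes 42.5 of the 82 points, which shows how heavily it weighs in the final grade.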

Quizzes

• Format
  • Multiple choice questions
  • Fill-in-the-blank
  • Short answer questions
• Each quiz: ~20 min in class
• Schedule: see course website
• Closed book, closed notes, closed laptop

Critical review report

• 1 page maximum
• Pick one paper from the suggested list
• Summarize the paper (use your own words)
• Provide detailed comments
  • What can be improved
  • Potential future directions
  • Other related work
• Some students will be selected to present their critical reviews

Paper presentation

• Each group has 2~3 students
• Pick one paper from the suggested readings, or your favorite paper
  • Cannot be the same as the critical review report
  • Can be related to your final project
• Register your choice early
• 15 min presentation + 2 min Q&A
• Will be graded by the instructor, TA, and other students

Final Project

• Work in groups (2~3 students)
• Project proposal
  • Written report, 2 page maximum
• Project report (35%)
  • < 8 pages, ACL format
  • Due 2 days before the final presentation
• Project presentation (15%)
  • 5-min in-class presentation (tentative)

Late Policy

• Credit of 48 hours for all the assignments
  • Including proposal and final project
  • No accumulation
  • No more grace period
• No make-up exam
  • unless there is an emergency

Cheating/Plagiarism

• No. Ask if you have concerns
• UVA Honor Code: http://www.virginia.edu/honor/

Lectures and office hours

• Participation is highly appreciated!
• Ask questions if you are still confused
• Feedback is welcome
• Lead the discussion in this class
• Enroll in Piazza: https://piazza.com/virginia/fall2016/cs6501004

Topics of this class

• Fundamental NLP problems
• Machine learning & statistical approaches for NLP
• NLP applications
• Recent trends in NLP

What to Read?

• Natural Language Processing
  ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL
  aclweb.org/anthology
• Machine learning
  ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ
• Artificial Intelligence
  AAAI, IJCAI, UAI, JAIR

Questions?

This lecture

• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Challenges – ambiguity

• Word sense ambiguity

Challenges – ambiguity

• Word sense / meaning ambiguity

Credit: http://stuffsirisaid.com

Challenges – ambiguity

• PP attachment ambiguity

Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711

Challenges – ambiguity

• Ambiguous headlines:
  • Include your children when baking cookies
  • Hospitals are Sued by 7 Foot Doctors
  • Iraqi Head Seeks Arms
  • Safety Experts Say School Bus Passengers Should Be Belted

Challenges – ambiguity

• Pronoun reference ambiguity

Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes

Challenges – language is not static

• Language grows and changes
  • e.g., cyber lingo

LOL     Laugh out loud
G2G     Got to go
BFN     Bye for now
B4N     Bye for now
Idk     I don’t know
FWIW    For what it’s worth
LUWAMH  Love you with all my heart
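
One simple way to cope with abbreviations like these is a lookup table that expands them before further processing. The sketch below assumes a hypothetical helper named `expand_lingo` and uses only the entries listed on the slide. A fixed table like this goes stale as new abbreviations appear, which is exactly the challenge the slide points out.

```python
import re

# Abbreviations and expansions copied from the slide (keys lowercased).
CYBER_LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "b4n": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}


def expand_lingo(text, table=CYBER_LINGO):
    """Replace each known abbreviation (case-insensitive) with its expansion."""
    def replace(match):
        token = match.group(0)
        # Unknown tokens are returned unchanged.
        return table.get(token.lower(), token)

    # Treat every run of letters and digits as one candidate token.
    return re.sub(r"[A-Za-z0-9]+", replace, text)


print(expand_lingo("idk, g2g. LOL"))
# prints: I don't know, got to go. laugh out loud
```

An abbreviation that is missing from the table, such as "brb", passes through unchanged, so coverage depends entirely on keeping the table up to date.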

Challenges – language is compositional

Carefully Slide

Challenges – language is compositional

小心: Carefully, Careful, Take Care, Caution
地滑: Slide, Landslip, Wet Floor, Smooth
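
To see why translating word by word goes wrong, the sketch below enumerates every pairing of the glosses listed on the slide. The function name `candidate_translations` and the fixed two-segment split are assumptions made for illustration. A real system would also have to decide how to segment the phrase and which pairing fits the context.

```python
from itertools import product

# Candidate glosses for each segment, as listed on the slide.
GLOSSES = {
    "小心": ["Carefully", "Careful", "Take Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}


def candidate_translations(segments, glosses=GLOSSES):
    """Enumerate every word-by-word translation of a segmented phrase."""
    options = [glosses[segment] for segment in segments]
    return [" ".join(choice) for choice in product(*options)]


candidates = candidate_translations(["小心", "地滑"])
print(len(candidates))  # prints 16
print(candidates[0])    # prints Carefully Slide
```

Of the 16 candidates, "Caution Wet Floor" is the sensible reading for a warning sign, while the first one the enumeration produces is the mistranslation "Carefully Slide".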

Challenges – scale

• Examples:
  • Bible (King James version): ~700K words
  • Penn Treebank: ~1M words from the Wall Street Journal
  • Newswire collection: 500M+ words
  • Wikipedia: 2.9 billion words (English)
  • Web: several billions of words

This lecture

• Course Overview
• What is NLP? Why is it important?
• What will you learn from this course?
• Course Information
• What are the challenges?
• Key NLP components

Part of speech tagging

Syntactic (Constituency) parsing

Syntactic structure => meaning

Image credit: Julia Hockenmaier, Intro to NLP

Dependency Parsing

Semantic analysis

• Word sense disambiguation
• Semantic role labeling

Credit: Ivan Titov

Q: [Chris] = [Mr. Robin]?
Christopher Robin is alive and well. He is the
same person that you read about in the book,
Winnie the Pooh. As a boy, Chris lived in a
pretty home called Cotchfield Farm. When
Chris was three years old, his father wrote a
poem about him. The poem was printed in a
magazine for others to read. Mr. Robin then
wrote a book
Slide modified from Dan Roth

Co-reference Resolution

Christopher Robin is alive and well. He is the
same person that you read about in the book,
Winnie the Pooh. As a boy, Chris lived in a
pretty home called Cotchfield Farm. When
Chris was three years old, his father wrote a
poem about him. The poem was printed in a
magazine for others to read. Mr. Robin then
wrote a book

Questions?
