Haramaya University

College of Computing and Informatics


Department of Computer Science

Chapter Six
Introduction to NLP
Introduction What is language? What is natural language?
Ambiguity Is there artificial language? Who study language?
What is next? What is NLP? Why NLP?
Where Does NLP fits in CS

Introduction
• What is language?
• What is natural language?
• Is there artificial language, as opposed to natural language?
• Who studies language?
• What do we mean by Natural Language Processing (NLP)?
• Why do we need to process natural language?

Tamrat Delessa(tamedase@gmail.com), HU

Introduction: What is language?

• Primarily, language is a means of communication between two or more entities or groups of entities
• Language can take
• Spoken form,
• Written form,
• Signed form,
• Facial expression or gesture,
• or other forms


Introduction: What is natural language?

• Natural language refers to the language that human beings learn from their environment and use to ease communication within their group
• Natural language is not inborn but learned; because we learn it from childhood, we simply consider it natural


Introduction: Is there artificial language?

• Yes, there are many languages that are not natural, such as:
• Language of mathematics (x = y + f(z))
• Programming languages (int x, y, z = 10;)
• Machine language (bit sequences that define data, instructions, addresses, etc.)
• Language of robots (a robot may use spoken language, but it is generated artificially)


Introduction: Who studies language?

• Language is studied intensively by linguists
• Linguists analyze language at seven levels
1. Phonetic or phonological level (deals with pronunciation)
2. Morphological level (deals with the smallest parts of a word that carry meaning (morphs), and with affixes (prefix, suffix, infix))
3. Lexical level (deals with the lexical meaning of a word)

4. Syntactic level (deals with the grammar and structure of sentences in the language)
5. Semantic level (deals with the meaning of words and sentences)
6. Discourse level (deals with the structure of different kinds of text, using document structure)
7. Pragmatic level (deals with knowledge from the outside world that is used in the context)

Levels of Linguistic Analysis: Morphology
• At the morphological level, the smallest parts of words that carry meaning (morphemes), including affixes, are analysed.
English Morphology (Examples)
• preregistration → pre + register + ation
• books → book + s
• converted → convert + ed
• converts → convert + s
• converting → convert + ing
• converter → convert + er
• convertible → convert + ible
• unconvertible → un + convert + ible
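
The kind of analysis done at the morphological level can be sketched with naive affix stripping. This is only an illustration under stated assumptions: the affix lists, the minimum stem length, and the function name `segment` are hypothetical choices, not a real morphological analyser. It strips at most one prefix and one suffix, and it returns a surface stem such as `registr` rather than the dictionary form `register`.

```python
# Hypothetical affix inventories, chosen only to cover the example words.
PREFIXES = ("un", "pre")
SUFFIXES = ("ation", "ible", "ing", "ed", "er", "s")
MIN_STEM = 3  # assumed threshold: never strip an affix if the stem gets shorter than this


def segment(word):
    """Split a word into [prefix, stem, suffix], omitting the parts that are absent."""
    stem = word.lower()

    # Take the first listed prefix that matches and leaves a long enough stem.
    prefix = next(
        (p for p in PREFIXES
         if stem.startswith(p) and len(stem) - len(p) >= MIN_STEM),
        None,
    )
    if prefix:
        stem = stem[len(prefix):]

    # Same idea for suffixes, applied to what is left after prefix removal.
    suffix = next(
        (s for s in SUFFIXES
         if stem.endswith(s) and len(stem) - len(s) >= MIN_STEM),
        None,
    )
    if suffix:
        stem = stem[:-len(suffix)]

    return [part for part in (prefix, stem, suffix) if part]


for w in ["books", "converted", "converter", "unconvertible"]:
    print(w, "->", " + ".join(segment(w)))
```

A real analyser would need a lexicon and spelling rules; string matching alone over-segments many words.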
Syntax

• Syntax refers to the study of structural relationships between words in a sentence.
• Syntactic analysis requires both a grammar and a parser, the output of which is a representation of the sentence that reveals the structural dependency relationships between the words.
• E.g.
Belete gave a book to Chaltu
• Exercise: translate this sentence into Amharic and Afan Oromo
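
The "grammar plus parser" idea can be sketched with a tiny recursive-descent parser. Everything below is an assumption made for illustration: the five-rule grammar, the toy lexicon, and the names `parse_sentence` and `to_brackets` are hypothetical and cover only the example sentence.

```python
# Hypothetical toy lexicon: word -> part-of-speech tag.
LEXICON = {
    "belete": "Name", "chaltu": "Name",
    "gave": "V", "a": "Det", "book": "N", "to": "P",
}

# Assumed toy grammar:
#   S  -> NP VP
#   NP -> Name | Det N
#   VP -> V NP PP | V NP
#   PP -> P NP


def parse_sentence(sentence):
    """Return a nested-tuple parse tree, or raise ValueError if parsing fails."""
    words = sentence.split()
    tags = [LEXICON.get(w.lower()) for w in words]
    pos = 0  # index of the next unread word

    def expect(tag):
        nonlocal pos
        if pos < len(words) and tags[pos] == tag:
            leaf = (tag, words[pos])
            pos += 1
            return leaf
        raise ValueError(f"expected {tag} at position {pos}")

    def parse_np():
        if pos < len(words) and tags[pos] == "Name":
            return ("NP", expect("Name"))
        return ("NP", expect("Det"), expect("N"))

    def parse_pp():
        return ("PP", expect("P"), parse_np())

    def parse_vp():
        verb = expect("V")
        obj = parse_np()
        if pos < len(words) and tags[pos] == "P":
            return ("VP", verb, obj, parse_pp())
        return ("VP", verb, obj)

    tree = ("S", parse_np(), parse_vp())
    if pos != len(words):
        raise ValueError("unparsed words remain")
    return tree


def to_brackets(node):
    """Render a tree in bracket notation, e.g. (NP (Det a) (N book))."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return f"({label} {children[0]})"
    return "(" + label + " " + " ".join(to_brackets(c) for c in children) + ")"


print(to_brackets(parse_sentence("Belete gave a book to Chaltu")))
```

The bracketed output makes the structure explicit: "a book" and "to Chaltu" are both attached under the verb phrase headed by "gave".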

Introduction: What is NLP?

• Natural language processing is a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis, for the purpose of achieving human-like language processing for a range of tasks or applications


Introduction: What is NLP?

• This is a real-world problem that cannot be solved analytically; it needs:
• Proper modeling
• Proper algorithms
• Implementation
• Language knowledge (expert support)


Introduction: Why NLP?

So why do we computer scientists bother about language?

• Information Retrieval: provides a list of potentially relevant documents in response to a user’s query.
• Information Extraction: focuses on the recognition, tagging, and extraction of certain key elements of information (e.g. persons, companies, locations, organizations, etc.) from large collections of text into a structured representation.
• Machine Translation: the automatic translation of text from one language to another.
• Question-Answering: provides the user with either just the text of the answer itself or answer-providing passages.
• Dialogue Systems: agents that converse with human beings in a coherent structure.
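
To make the information retrieval task concrete, here is a minimal sketch that ranks documents for a query with a plain TF-IDF score. The three documents, the whitespace tokenizer, and the function name `rank` are assumptions for illustration; real systems use proper tokenization, indexing, and more refined weighting.

```python
import math
from collections import Counter

# Hypothetical mini collection, used only for this sketch.
DOCS = [
    "machine translation converts text from one language to another",
    "information retrieval returns documents relevant to a query",
    "dialogue systems converse with human beings",
]


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def rank(query, documents):
    """Return indices of documents with a positive score, best first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)

    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in doc_tokens for term in set(tokens))

    scored = []
    for idx, tokens in enumerate(doc_tokens):
        tf = Counter(tokens)
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in tokenize(query)
            if term in df
        )
        scored.append((score, idx))

    # Highest score first; ties are broken by document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for score, idx in scored if score > 0]


print(rank("translation of text", DOCS))
```

Query terms that occur in no document (here "of") are ignored, and a term that occurs in every document contributes nothing because its log factor is zero.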

Introduction: Why NLP?

• We need to make knowledge available to everyone
• We need to communicate with everybody
• We need computers to act on our behalf in our absence (a dialogue system, for example)
• We need speech to be transcribed into text and analyzed immediately


Introduction: Why NLP?

• We need text to be read aloud in real time for visually impaired people
• We need machines to correct essays
• We need to teach grammar, word formation and meaning
• We need to know the science behind natural language


Introduction: Summary of NLP Applications

• Application areas:
• Information extraction
• Named entity recognition
• Machine translation
• Text generation
• Text classification
• Cross-document cross-reference
• Parsing
• Semantic analysis


Introduction: Why NLP?

• Application areas:
• Word sense disambiguation
• Word clustering
• Question answering
• Text Summarization
• Indexing
• Information retrieval
• Document retrieval (filtering, routing)
• Structured text (relational tables)
• Paraphrasing
• etc

Introduction: Where does NLP fit in CS taxonomy?


Computers
• Databases
• Artificial Intelligence
  • Robotics
  • Natural Language Processing
    • IR
    • Machine Translation
    • Language Analysis
      • Semantics
      • Parsing
  • Search
• Algorithms
• Networking

Introduction Definition
Ambiguity Lexical Ambiguity
What is next? Syntactic Ambiguity

Ambiguity: Definition

• One of the most important aspects of NLP is ambiguity
• Ambiguity refers to the possibility of assigning multiple linguistic structures to a given input (word or sentence)
• There are ambiguities at different linguistic processing levels
• Here we will see:
• Lexical ambiguity
• Syntactic ambiguity


Ambiguity: Lexical Ambiguity

• It is the ambiguity of a lexical element of a language
• Example:
• What is the POS of the word “” (Personal-Noun (PN), Common Noun (NN) or Verb)? Such cases can be disambiguated by a POS tagger
• What is the meaning of “Bela”, “hodhuu” (he ate or he is bad)? Such cases can be disambiguated through word sense disambiguation
• Note that such disambiguation is very important in many NLP applications such as TTS systems
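
Word sense disambiguation can be sketched with a simplified Lesk-style heuristic: choose the sense whose dictionary gloss shares the most words with the sentence. The English word "bank", the two glosses, the stopword list, and the name `disambiguate` are all assumptions for illustration; they stand in for a real sense inventory, which an Amharic or Afan Oromo system would also need.

```python
# Hypothetical sense inventory: word -> {sense label: gloss}.
SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "the sloping land beside a river or lake",
    },
}

# Small assumed stopword list, so function words do not count as evidence.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "that", "to",
             "in", "on", "at", "by", "he", "she", "is"}


def disambiguate(word, sentence):
    """Return the sense label whose gloss overlaps most with the sentence."""
    context = set(sentence.lower().split()) - STOPWORDS - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        gloss_words = set(gloss.lower().split()) - STOPWORDS
        overlap = len(context & gloss_words)
        # Strict '>' means the first listed sense wins ties.
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "he sat on the bank of the river"))
print(disambiguate("bank", "she deposits money in the bank"))
```

The heuristic depends entirely on gloss wording, so it fails whenever the context shares no words with the correct gloss; it is shown here only to make the task concrete.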

Ambiguity: Syntactic Ambiguity

• It is ambiguity that arises from the grammar (structure) of the sentence
• Example:
• I saw a man on the mountain with a telescope
• Who is carrying the telescope?
• How far is the man from me, in a relative sense?
• Exercise: give one Amharic/Afan Oromo example sentence, with justification included
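
The ambiguity of the telescope sentence can be made measurable by counting how many parse trees a grammar assigns to it. The sketch below uses CKY-style dynamic programming over a toy grammar in Chomsky normal form. The grammar, the lexicon, and the name `count_parses` are assumptions for illustration; a different grammar would give a different count.

```python
from collections import defaultdict

# Assumed toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # a PP can modify the verb phrase
    ("NP", "NP", "PP"),   # or a PP can modify a noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon: word -> possible categories.
LEXICON = {
    "i": ["NP"], "saw": ["V"], "a": ["Det"], "the": ["Det"],
    "man": ["N"], "mountain": ["N"], "telescope": ["N"],
    "on": ["P"], "with": ["P"],
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees rooted in `start` for the sentence."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))

    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[(i, i + 1)][tag] += 1

    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways

    return chart[(0, n)][start]


print(count_parses("I saw a man on the mountain with a telescope"))  # prints 5
```

Under this grammar the sentence has five readings, because each prepositional phrase can attach either to the seeing or to a preceding noun phrase. Choosing among them requires knowledge beyond syntax.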

Introduction What is next
Ambiguity
What is next?

What is next?

• Form teams of six (each with two distinct sub-groups of three members)
• Each team will take one ASR and one TTS project
• You will work on the project
• Compile proper documentation
• Submit the documentation when you finish the implementation
• Defend the project (theoretically and practically)
