
Mrs. B. Moohambigai
Assistant Professor/ Dept of IT
IT8601 - Computational Intelligence
Natural Language Processing
• Natural Language
• Natural Language Processing
• Need of processing human language
Issues and processing complexities of NLP
• Ambiguity - Words in any natural language
usually have several different possible
meanings.
• Language variability - A large number of
languages exist worldwide, and most of them
differ in character set, structure and
grammar rules.
• Difficult to incorporate human cognition
Phases in NLP
• Morphological Analysis/Lexical Analysis
• Syntactic Analysis (Parsing)
• Semantic Analysis
• Discourse Analysis
• Pragmatic Analysis
MORPHOLOGICAL ANALYSIS
• Morphology is the study of the structure and formation of
words. Its most important unit is the morpheme, which is
defined as the "minimal unit of meaning".

• Morphological analysis and parsing are essential for many
NLP applications such as spelling error detection, machine
translation and information retrieval.
There are three different ways to form words

• Inflection - The process of changing the form of a word so that it
expresses information such as number, person, case, gender, tense, mood and
aspect, while the syntactic category of the word remains unchanged. For
example, the plural form of an English noun is usually formed from the
singular form by adding "s" (cat, cats).

• Derivation - Unlike inflection, derivation changes the syntactic category of
a word. It combines a word stem with a grammatical morpheme, resulting in a
word belonging to a different class (the verb "teach" becomes the noun
"teacher").

• Compounding - The process of merging two or more words to form a new word
("black" + "board" gives "blackboard").
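
The three word-formation processes above can be illustrated with a minimal Python sketch. The lexicon, the suffix lists, and the function name `analyze` are hypothetical and chosen only for illustration. The sketch also assumes that a compound takes the category of its rightmost stem, which holds for many English compounds but not all. Real morphological analyzers handle spelling changes and much larger lexicons.

```python
# Toy lexicon: stem -> syntactic category (hypothetical, for illustration only).
LEXICON = {
    "cat": "noun", "board": "noun",
    "walk": "verb", "teach": "verb",
    "black": "adjective", "quick": "adjective",
}

# Inflectional suffixes keep the category of the stem.
INFLECTIONAL = ("s", "ed", "ing")
# Derivational suffixes map to the category of the derived word.
DERIVATIONAL = {"er": "noun", "ly": "adverb"}


def analyze(word):
    """Return (process, stems, category) for a word, or None if it is unknown."""
    if word in LEXICON:
        return ("base", [word], LEXICON[word])
    # Inflection: known stem + inflectional suffix, category unchanged.
    for suffix in INFLECTIONAL:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in LEXICON:
            return ("inflection", [stem], LEXICON[stem])
    # Derivation: known stem + derivational suffix, category changes.
    for suffix, new_category in DERIVATIONAL.items():
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in LEXICON:
            return ("derivation", [stem], new_category)
    # Compounding: the word splits into two known stems.
    # Assumption: the compound takes the category of the rightmost stem.
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in LEXICON and right in LEXICON:
            return ("compounding", [left, right], LEXICON[right])
    return None


print(analyze("cats"))        # ('inflection', ['cat'], 'noun')
print(analyze("teacher"))     # ('derivation', ['teach'], 'noun')
print(analyze("blackboard"))  # ('compounding', ['black', 'board'], 'noun')
```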


SYNTAX ANALYSIS
• Syntax analysis checks whether the text is well formed
according to the rules of a formal grammar. The objective of
syntactic analysis is to find the syntactic structure of the
sentence, which helps determine the meaning of the sentence.

• The structure can be represented as a tree where internal
nodes represent phrases and leaf nodes are words.
TYPES OF PARSING

TOP-DOWN PARSING : Top-down parsing starts its search from the root node
and works downwards towards the leaf nodes.

BOTTOM-UP PARSING : The parser starts with the input symbols (the words) and
tries to construct the parse tree in an upward direction towards the root.
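
The two directions can be compared with a minimal Python sketch. The grammar, the lexicon, and the function names `parse_top_down` and `parse_bottom_up` are hypothetical and chosen only for illustration. The top-down function is a simple recursive-descent parser, and the bottom-up function is a greedy shift-reduce parser. Both are simplifications that work for this toy grammar but would need backtracking or a chart to handle ambiguous grammars.

```python
# Toy grammar and lexicon (hypothetical, for illustration only).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "Det", "a": "Det", "dog": "N", "cat": "N", "chased": "V"}


def parse_top_down(symbol, words, pos=0):
    """Expand `symbol` from words[pos]; return (tree, next_pos) or None."""
    if symbol not in GRAMMAR:
        # Word category: it must match the category of the next word.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            return (symbol, words[pos]), pos + 1
        return None
    for rhs in GRAMMAR[symbol]:
        children, cur = [], pos
        for child in rhs:
            result = parse_top_down(child, words, cur)
            if result is None:
                break  # this rule fails, try the next alternative
            subtree, cur = result
            children.append(subtree)
        else:
            return (symbol, *children), cur
    return None


def parse_bottom_up(words):
    """Shift each word, then reduce while the stack top matches a rule."""
    stack = []
    for word in words:
        if word not in LEXICON:
            return None
        stack.append((LEXICON[word], word))  # shift
        reduced = True
        while reduced:
            reduced = False
            for lhs, alternatives in GRAMMAR.items():
                for rhs in alternatives:
                    n = len(rhs)
                    if [node[0] for node in stack[-n:]] == rhs:
                        stack[-n:] = [(lhs, *stack[-n:])]  # reduce
                        reduced = True
    if len(stack) == 1 and stack[0][0] == "S":
        return stack[0]
    return None


words = "the dog chased a cat".split()
tree, end = parse_top_down("S", words)
print(tree)
print(parse_bottom_up(words) == tree)  # True: both directions build the same tree
```

A caller of `parse_top_down` should also check that the returned position equals `len(words)`, since the function only parses a prefix of the input.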
SEMANTIC ANALYSIS
• It involves mapping of natural language occurrences to
some representation of meaning.
• A meaning representation language bridges the gap
between linguistic and common sense knowledge.
Characteristics of Meaning representation language

• Verifiable

• Unambiguous

• Support canonical form

• Support inference and variables

• Expressiveness
LANGUAGE MODELS
• A Language Model is a probabilistic model which predicts
the probability that a sequence of tokens belongs to a
language.

• The probabilities returned by a language model are mostly
useful for comparing how likely different sentences are to
be "good sentences". Applications include spell checking,
automatic speech recognition and machine translation.
N-grams
• N-grams are the simplest tool
available to construct a language
model.
• An N-gram is a sequence of N
words.
• An N-gram model predicts the
probability of a given N-gram
within any sequence of words in
the language.
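
As an illustration of the N-gram idea, here is a minimal Python sketch of a bigram (N = 2) model using maximum likelihood estimates, P(word | prev) = count(prev, word) / count(prev). The three-sentence corpus, the boundary markers `<s>` and `</s>`, and the function names are assumptions made for this example. The sketch applies no smoothing, so any unseen bigram gives a probability of zero.

```python
from collections import Counter


def tokenize(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]


def train_bigram_model(sentences):
    """Count contexts and bigrams over a list of sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        contexts.update(tokens[:-1])  # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def bigram_probability(prev, word, contexts, bigrams):
    """Maximum likelihood estimate of P(word | prev), without smoothing."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


def sentence_probability(sentence, contexts, bigrams):
    """Multiply the bigram probabilities along the sentence."""
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_probability(prev, word, contexts, bigrams)
    return prob


# Tiny made-up corpus, for illustration only.
corpus = ["the dog barks", "the cat sleeps", "the dog sleeps"]
contexts, bigrams = train_bigram_model(corpus)

print(bigram_probability("the", "dog", contexts, bigrams))        # 2/3, about 0.667
print(sentence_probability("the dog sleeps", contexts, bigrams))  # 1/3, about 0.333
print(sentence_probability("dog the sleeps", contexts, bigrams))  # 0.0
```

The last two lines show the comparison described above: the model assigns a higher probability to the well-formed word order than to the scrambled one.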
MOTIVATION OF NLP
• Understand language analysis & generation

• Communication

• Language is a window to the mind

• Data is in linguistic form

• Data can be structured (table form), semi-structured (XML
form) or unstructured (sentence form).
APPLICATIONS OF NLP
• Information Retrieval system

• Speech Recognition

• Text to Speech

• Machine Translation System

• Question answering system


THANK YOU !!!
