
UNIT 1.- COMPUTATIONAL LINGUISTICS:
DEFINITION AND ORIGIN

1. WHAT IS COMPUTATIONAL LINGUISTICS?

Its object of study is language, treated as something that can be analysed by computational procedures.

DEFINITION BY HANS USZKOREIT

Computational linguistics (CL) is a discipline between linguistics and computer science which is
concerned with the computational aspects of the human language faculty. It belongs to the
cognitive sciences and overlaps with the field of artificial intelligence (AI), a branch of computer
science, aiming at computational models of human cognition.

Computational linguistics is the scientific and engineering discipline concerned with understanding
written and spoken language from a computational perspective and building artifacts that usefully
process and produce language.

To the extent that language is a mirror of mind, a computational understanding of language also
provides insight into thinking and intelligence. And since language is our most natural and most
versatile means of communication, linguistically competent computers would greatly facilitate our
interaction with machines and software of all sorts.

Computational linguistics has two primary motivations:

- Technological motivation: It focuses on the practical outcome, the product. The methods,
techniques, tools and applications are often subsumed under the terms “language
engineering” or “human language technology”. Its goal is to create software products that
model human language.

- Linguistic or cognitive motivation: It addresses issues in theoretical linguistics and
cognitive science, aiming to gain a better understanding of how humans communicate by using
natural language. Computational linguists develop formal models simulating the human language
faculty. This motivation is shared with theoretical linguistics and psycholinguistics.

Its goals include:

- The formulation of grammatical and semantic frameworks for characterizing languages.
- The discovery of processing techniques that exploit both structural and distributional
(statistical) properties of language.
- The development of cognitively and neuroscientifically plausible computational models of
how language processing and learning might occur in the brain.

GOALS OF THE FIELD

- Build psychologically adequate models of human language processing capabilities on
the basis of knowledge about the way in which humans acquire, store and process
language.
- Build functionally correct models of human language processing capabilities on the
basis of knowledge about the world and about language.

AN EXTREMELY VARIED FIELD

Some of the most prominent application areas are:

- Machine translation (MT).
- Question answering (QA).
- Speech recognition.
- Speech synthesis (the opposite of speech recognition).
- Man-machine interfaces.
- Intelligent word processing: spelling and grammar correction.
- Finding relevant documents in collections.
- Establishing the authorship of documents (forensic linguistics).
- Detecting plagiarism.
- Extracting information from documents: analysis of texts for topic, sentiment, or other
psychological attributes.
- Classifying documents.
- Summarizing documents.
- Dialogue agents for accomplishing particular tasks (e.g. medical advising).

AN INTERDISCIPLINARY FIELD

- Part of Applied Linguistics.
- Part of Computer Science.
- Part of Telecommunication Engineering.
- A field studied by linguistics, computer science, psychology and mathematics.

UNIT 2.- SYNTACTIC PROCESSING
BASIC PARSING TECHNIQUES: METHODS FOR SENTENCE ANALYSIS.

1. Grammars:
   a. Context-free grammars (CFG)
   b. Simple transition networks
   c. Recursive transition networks
   d. Augmented transition networks
   e. Unification
2. Parsing techniques:
   a. Top-down
   b. Bottom-up
   c. Mixed mode

Parsing is the same as analysis.

REPRESENTING SENTENCE STRUCTURE

A tree-like representation:

A grammar: context-free grammar (CFG) rules

- S → NP VP
- VP → VERB NP
- NP → NAME
- NP → ART NOUN
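
The rules above can be written down as a small data structure. The following is a minimal sketch, not part of the original notes: the proper-noun category is written NAME, as in the derivations further on, and the one-word-per-category lexicon is an assumption built from the example sentence "John ate the cat". The helper `expand` is a hypothetical name.

```python
from itertools import product

# Grammar rules from the notes: each left-hand symbol maps to its alternatives.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["NAME"], ["ART", "NOUN"]],
}

# Toy lexicon (an assumption for this sketch): the words allowed per category.
LEXICON = {
    "NAME": ["John"],
    "VERB": ["ate"],
    "ART": ["the"],
    "NOUN": ["cat"],
}


def expand(symbol):
    """Return every word sequence that `symbol` can be rewritten into."""
    if symbol in LEXICON:
        return [[word] for word in LEXICON[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine the expansions of each right-hand-side symbol, in order.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
```

With this toy lexicon the grammar licenses only a handful of sentences, one of which is "John ate the cat". The enumeration terminates because none of the four rules is recursive; a recursive grammar would need a depth limit.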

TWO COMMON PARSING TECHNIQUES

- Top-down parsing: starts from the symbol S and rewrites it until only terminal symbols
(words) remain.
  o S → NP VP
    → NAME VP
    → John VP
    → John VERB NP
    → John ate NP
    → John ate ART NOUN
    → John ate the NOUN
    → John ate the cat
- Bottom-up parsing: starts from the individual words and rewrites them until only the
symbol S remains.
  o John ate the cat
  o NAME ate the cat
  o NAME VERB the cat
  o NAME VERB ART cat
  o NAME VERB ART NOUN
  o NP VERB NP
  o NP VP
  o S

Both derivations use the same rules plus a lexicon:

  S → NP VP
  VP → VERB NP
  NP → NAME
  NP → ART NOUN
  + Lexicon
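
The two strategies can be sketched as small recognizers over the four rules and the example sentence. This is an illustrative sketch under stated assumptions, not the method of the notes: the lexicon entries and the function names `top_down` and `bottom_up` are hypothetical, both functions only answer yes or no (they do not build a tree), and the bottom-up version is a simple backtracking shift-reduce search.

```python
# Grammar rules from the notes; the lexicon is an assumption for this sketch.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["VERB", "NP"]],
    "NP": [["NAME"], ["ART", "NOUN"]],
}
LEXICON = {"John": "NAME", "ate": "VERB", "the": "ART", "cat": "NOUN"}

# Flat list of (left-hand side, right-hand side) pairs for the bottom-up search.
RULES = [(lhs, rhs) for lhs, alternatives in GRAMMAR.items() for rhs in alternatives]


def top_down(symbols, words):
    """True if the list of symbols can be rewritten into exactly `words`."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Non-terminal: try each rule, replacing the symbol by its right-hand side.
        return any(top_down(rhs + rest, words) for rhs in GRAMMAR[first])
    # Lexical category: it must match the category of the next word.
    return bool(words) and LEXICON.get(words[0]) == first and top_down(rest, words[1:])


def bottom_up(stack, words):
    """True if the stack plus the remaining words can be reduced to a single S."""
    if stack == ["S"] and not words:
        return True
    # Reduce: replace a right-hand side on top of the stack by its left-hand side.
    for lhs, rhs in RULES:
        if stack[-len(rhs):] == rhs and bottom_up(stack[:-len(rhs)] + [lhs], words):
            return True
    # Shift: move the next word onto the stack as its lexical category.
    if words and words[0] in LEXICON:
        return bottom_up(stack + [LEXICON[words[0]]], words[1:])
    return False


sentence = "John ate the cat".split()
accepted_top_down = top_down(["S"], sentence)
accepted_bottom_up = bottom_up([], sentence)
```

Both calls accept "John ate the cat" and reject a string such as "John ate cat", which has no article before the noun. The top-down function starts from S and predicts words; the bottom-up function starts from the words and works towards S, mirroring the two derivations above.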
