Welcome to Scribd, the world's digital library. Read, publish, and share books and documents. See more
Download
Standard view
Full view
of .
Save to My Library
Look up keyword
Like this
16Activity
0 of .
Results for:
No results containing your search query
P. 1
Natural Language Processing

Natural Language Processing

Ratings:

5.0

(1)
|Views: 394 |Likes:
Published by api-3806739
Describes about Natural Language processing
Describes about Natural Language processing

More info:

Published by: api-3806739 on Oct 17, 2008
Copyright:Attribution Non-commercial

Availability:

Read on Scribd mobile: iPhone, iPad and Android.
download as DOC, PDF, TXT or read online from Scribd
See more
See less

03/18/2014

pdf

text

original

Abstract

The goal of the Natural Language Processing (NLP) group is to design and build software
that will analyze, understand, and generate languages that humans use naturally, so that
eventually you will be able to address your computer as though you were addressing
another person.

This goal is not easy to reach. "Understanding" language means, among other things,
knowing what concepts a word or phrase stands for and knowing how to link those
concepts together in a meaningful way. It's ironic that natural language, the symbol
system that is easiest for humans to learn and use, is hardest for a computer to master.

The challenges we face stem from the highly ambiguous nature of natural language. As
an English speaker you effortlessly understand a sentence like "Flying planes can be
dangerous". Yet this sentence presents difficulties to a software program that lacks both
your knowledge of the world and your experience with linguistic structures. Is the more
plausible interpretation that the pilot is at risk, or that the danger is to people on the
ground? Should "can" be analyzed as a verb or as a noun? Which of the many possible
meanings of "plane" is relevant? Depending on context, "plane" could refer to, among
other things, an airplane, a geometric object, or a woodworking tool. How much and what
sort of context needs to be brought to bear on these questions in order to adequately
disambiguate the sentence?

We address these problems using a mix of knowledge-engineered and statistical/machine-
learning techniques to disambiguate and respond to natural language input. Our work has
implications for applications like text critiquing, information retrieval, question
answering, summarization, gaming, and translation. The grammar checkers in Office for
English, French, German, and Spanish are outgrowths of our research; Encarta uses our
technology to retrieve answers to user questions; Intellishrink uses natural language
technology to compress cellphone messages; Microsoft Product Support uses our
machine translation software to translate the Microsoft Knowledge Base into other
languages. As our work evolves, we expect it to enable any area where human users can
benefit by communicating with their computers in a natural way.

Natural language processing
Natural language processing (NLP) is a subfield of artificial intelligenceand
linguistics. It studies the problems of automated generation and understanding ofnatural
human languages. Natural language generation systems convert information from

computer databases into normal-sounding human language, and natural language
understanding systems convert samples of human language into more formal
representations that are easier for computer programs to manipulate.

Contents

1 Tasks and limitations
2 Concrete problems
3 Subproblems
4 Statistical NLP
5 The major tasks in NLP
6 Evaluation of natural language processing

7 Organizations and conferences
Tasks and limitations
In theory natural language processing is a very attractive method ofhum an-comp uter
interaction. Early systems such asSHRDLU, working in restricted "blocks worlds"

with restricted vocabularies, worked extremely well, leading researchers to excessive
optimism which was soon lost when the systems were extended to more realistic
situations with real-world ambiguity and complexity.

Natural language understanding is sometimes referred to as anAI-com plete problem,
because natural language recognition seems to require extensive knowledge about the
outside world and the ability to manipulate it. The definition of "understanding" is one of
the major problems in natural language processing.

Concrete problems
Some examples of the problems faced by natural language understanding systems:
\u2022
The sentences We gave the monkeys the bananas because they were hungry and
We gave the monkeys the bananas because they were over-ripe have the same
surface grammatical structure. However, in one of them the wordthey refers to
the monkeys, in the other it refers to the bananas: the sentence cannot be
understood properly without knowledge of the properties and behaviour of
monkeys and bananas.
\u2022
A string of words may be interpreted in myriad ways. For example, the string
Time flies like an arrow may be interpreted in a variety of ways:
o
time moves quickly just like an arrow does;
o
measure the speed of flying insects like you would measure that of an
arrow - i.e. (You should) time flies like you would an arrow.;
o
measure the speed of flying insects like an arrow would - i.e. Time flies in
the same way that an arrow would (time them).;
o
measure the speed of flying insects that are like arrows - i.e. Time those
flies that are like arrows;
o
a type of flying insect, "time-flies," enjoy arrows (compare Fruit flies like
a banana.)
English is particularly challenging in this regard because it has littleinflectional
morphology to distinguish between parts of speech.
\u2022
English and several other languages don't specify which word an adjective applies
to. For example, in the string "pretty little girls' school".
o
Does the school look little?
o
Do the girls look little?
o
Do the girls look pretty?
o
Does the school look pretty?
Subproblems
Speech segmentation

In most spoken languages, the sounds representing successive letters blend into
each other, so the conversion of the analog signal to discrete characters can be a
very difficult process. Also, in natural speech there are hardly any pauses between
successive words; the location of those boundaries usually must take into account
grammatical and semantical constraints, as well as the context.

Text segmentation

Some written languages like Chinese, Japanese and Thai do not have signal word
boundaries either, so any significant text parsing usually requires the
identification of word boundaries, which is often a non-trivial task.

Word sense disambiguation
Many words have more than one meaning; we have to select the meaning which
makes the most sense in context.
Syntactic ambiguity

The grammar for natural languages is ambiguous, i.e. there are often multiple
possible parse trees for a given sentence. Choosing the most appropriate one
usually requires semantic and contextual information. Specific problem
components of syntactic ambiguity include sentence boundary disambiguation.

Imperfect or irregular input

Activity (16)

You've already reviewed this. Edit your review.
1 thousand reads
1 hundred reads
ramjanaprdc liked this
Ahsan Siddiqui liked this
azayrahmad liked this
kvssravs liked this
ffptp liked this
Alexander Han liked this
Ahmad Shuami liked this

You're Reading a Free Preview

Download
/*********** DO NOT ALTER ANYTHING BELOW THIS LINE ! ************/ var s_code=s.t();if(s_code)document.write(s_code)//-->