
LECTURE NOTES ON

SUBJECT: ARTIFICIAL INTELLIGENCE


SUBJECT CODE: RCS-702
Unit-1
B.TECH
BRANCH: CSE & IT
YEAR- 4TH SEMESTER- 7th

(AKTU)

Rohit Mishra
(Assistant Professor)

Department of Computer Science & Engineering and Information Technology

United Institute of Technology, Prayagraj

Unit – I
Introduction to Artificial Intelligence:
• It is a branch of science which deals with helping machines find solutions to complex problems in a more human-like fashion.
• AI is the study of complex information processing problems that often have their roots in some
aspect of biological information processing. The goal of the subject is to identify solvable and
interesting information processing problems, and solve them.
• The intelligent connection of perception to action.
• This generally involves borrowing characteristics from human intelligence and applying them as
algorithms in a computer friendly way.
• A more or less flexible or efficient approach can be taken depending on the requirements established, which influences how artificial the intelligent behavior appears.
• AI is generally associated with computer science, but it has many important links with other fields such as mathematics, psychology, and biology.

Strong and Weak AI:

Artificial Intelligence (AI) is the field of computer science dedicated to developing machines that are able to mimic human behaviour and perform tasks just as a human would. According to AI philosophy, AI is divided into two major types, weak AI and strong AI. Weak AI is focused on developing technology capable of carrying out pre-planned moves based on some rules and applying these to achieve a certain goal. As opposed to that, strong AI aims to develop technology that can think and function similarly to humans, not just mimic human behaviour in a certain domain.

Weak AI:

The principle behind weak AI is simply the fact that machines can be made to act as if they are intelligent. For example, when a human player plays chess against a computer, the human player may feel as if the computer is actually making impressive moves. But the chess application is not thinking and planning at all. All the moves it makes were previously fed into the computer by a human, and that is how it is ensured that the software will make the right moves at the right times.

Strong AI:

The principle behind strong AI is that machines could be made to think, or in other words, could come to represent human minds in the future. If that is the case, those machines will have the ability to reason, think, and perform all the functions that a human is capable of.

Important Features of AI:


1. The use of computers to do reasoning, pattern recognition, learning, or some other form of
inference.
2. A focus on problems that do not respond to algorithmic solutions.
3. A concern with problem-solving using inexact, missing, or poorly defined information.
4. Reasoning about the significant qualitative features of a situation.
5. An attempt to deal with issues of semantic meaning as well as syntactic form.
6. Answers that are neither exact nor optimal, but are in some sense “sufficient”.
7. The use of large amounts of domain-specific knowledge in solving problems.
8. The use of meta-level knowledge to effect more sophisticated control of problem-solving strategies.

History of Artificial Intelligence:
The term artificial intelligence was first coined in 1956, at the Dartmouth Conference, and since then artificial intelligence has expanded because of the theories and principles developed by its dedicated researchers. In 1956 John McCarthy, regarded as the father of AI, organized a conference to draw on the talent and expertise of others interested in machine intelligence for a month of brainstorming.

In 1957 the first version of a new program, the General Problem Solver (GPS), was tested. The program was developed by the same pair (Allen Newell and Herbert Simon) who had developed the Logic Theorist.

In 1958 McCarthy announced his new development, the LISP language, which is still used today. LISP stands for LISt Processing, and it was soon adopted as the language of choice among most AI developers.

Another achievement in the 1970’s was the advent of the expert system. Expert systems predict the probability of a solution under set conditions.

Another development during this time was the PROLOG language, which was proposed in 1972.

Applications of Artificial Intelligence:


The potential applications of artificial intelligence are abundant. They stretch from the military, for autonomous control and target identification, to the entertainment industry, for computer games and robotics. Some of the applications of AI are as follows:

1. Speech Recognition: In the 1990’s computer speech recognition reached a practical level for limited purposes. Thus United Airlines replaced its keyboard tree for flight information with a system using speech recognition of flight numbers and city names. It is quite convenient. On the other hand, while it is possible to instruct some computers using speech, most users have gone back to the keyboard and the mouse as still more convenient.
2. Understanding Natural Language: Just getting a sequence of words into a computer is not enough. Parsing sentences is not enough either. The computer has to be provided with an understanding of the domain the text is about, and this is presently possible only for very limited domains.
3. Computer Vision: The world is composed of three-dimensional objects, but the inputs to the human eye and to computers’ TV cameras are two-dimensional. Some useful programs can work solely in two dimensions, but full computer vision requires partial three-dimensional information that is not just a set of two-dimensional views. At present there are only limited ways of representing three-dimensional information directly, and they are not as good as what humans evidently use.
4. Expert System: A knowledge engineer interviews experts in a certain domain and tries to embody their knowledge in a computer program for carrying out some task. How well this works depends on whether the intellectual mechanisms required for the task are within the present state of AI. When this turned out not to be so, there were many disappointing results. One of the first expert systems was MYCIN (1974), which diagnosed bacterial infections of the blood and suggested treatments. Its ontology included bacteria, symptoms, and treatments, but did not include patients, doctors, hospitals, death, recovery, or events occurring in time. Its interaction depended on a single patient being considered.
5. Heuristic Classification: One of the most feasible kinds of expert system given the present
knowledge of AI is to put some information in one of a fixed set of categories using several
sources of information. An example is advising whether to accept a proposed credit card
purchase. Information is available about the owner of the credit card, his record of payment, and also about the item he is buying and about the establishment from which he is buying it (a sketch of this kind of classifier appears after this list).
6. Game Playing: Much of the early research in state space search was done using common board games such as checkers, chess, and the 15-puzzle. Games can generate extremely large search spaces. These are large and complex enough to require powerful techniques for determining which alternatives to explore.
7. Automated Reasoning and Theorem Proving: Theorem-proving research was instrumental in formalizing search algorithms and in developing formal representation languages such as the predicate calculus and logic programming languages.
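The credit-card example above amounts to combining several weighted pieces of evidence and mapping the result onto a fixed set of categories. Below is a minimal sketch in Python; the field names, rules, and thresholds are all invented purely for illustration, and a real system would combine far more evidence sources.

```python
# Hypothetical heuristic classification for the credit-card example above.

def classify_purchase(purchase):
    """Assign a proposed purchase to one of a fixed set of categories."""
    score = 0
    # Each heuristic contributes evidence for or against acceptance.
    if purchase["payment_record"] == "good":
        score += 2
    if purchase["amount"] > purchase["credit_limit"]:
        score -= 3
    if purchase["merchant_risk"] == "high":
        score -= 2
    # Map the combined evidence onto a category.
    if score >= 2:
        return "accept"
    elif score >= 0:
        return "refer to human"
    return "reject"

print(classify_purchase({"payment_record": "good",
                         "amount": 150,
                         "credit_limit": 5000,
                         "merchant_risk": "low"}))   # -> accept
```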

Intelligent Agent:
In artificial intelligence, an intelligent agent (IA) is an autonomous entity which observes its environment through sensors, acts upon that environment using actuators, and directs its activity towards achieving goals.

• An agent is anything that can be viewed as perceiving its environment through sensors and acting
upon that environment through effectors.
• An agent perceives its environment through sensors. The complete set of inputs at a given time is
called a percept.
• The current percept, or a sequence of percepts can influence the actions of an agent.
• The agent can change the environment through actuators or effectors.
• An operation involving an effector is called an action.
• Actions can be grouped into action sequences.
• The agent can have goals which it tries to achieve.

Thus an agent can be looked upon as a system that implements a mapping from percept sequences to actions. A performance measure has to be used in order to evaluate an agent. Intelligent agents may also learn or use knowledge to achieve their goals. They may be very simple or very complex.

[Figure: An agent observes its environment and acts on it; its actions are shaped by its ability, past experience, goals, and prior knowledge.]

Structure of Intelligent Agent:


An agent is anything that can be viewed as perceiving its environment through sensors and acting
on that environment through effectors. A simple agent program can be defined mathematically as an
agent function which maps every possible percept sequence to a possible action the agent can perform, or to a coefficient, feedback element, function, or constant that affects its eventual actions.

Agent = Architecture + Program

• Agent: Mail sorting robot


• Percepts: Array of pixel intensities
• Actions: Route letter into bin
• Goals: Route letter into correct bin
• Environment: conveyor belt of letters
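As a concrete illustration of Agent = Architecture + Program, here is a minimal sketch of the program half for the mail-sorting robot above. The percept format and the read_zip_code helper are hypothetical stand-ins for a real perception component.

```python
# Hypothetical agent program for the mail-sorting robot example.

def read_zip_code(pixels):
    # Stand-in for a real character-recognition step over the pixel array.
    return "211001"

def mail_sorting_agent(percept_sequence):
    """Map a sequence of percepts (pixel arrays) to an action."""
    current = percept_sequence[-1]            # latest camera image
    zip_code = read_zip_code(current)         # perception: extract the code
    return f"route letter to bin {zip_code[0]}"   # action

print(mail_sorting_agent([[0, 255, 128]]))    # -> route letter to bin 2
```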

An environment provides the conditions under which an entity (agent or object) exists. In other words, it defines the properties of the world in which an agent can and does function. An agent’s environment then consists not only of all the other entities in that environment, but also of the principles and processes under which the agents exist and communicate. Environments are of the following types:

• Fully Observable vs. Partially Observable: If an agent's sensors give it full access to the
complete state of the environment, the environment is fully observable, otherwise it is only partially
observable or unobservable.
• Deterministic vs. Stochastic: If the next state of the environment is completely determined by the
current state and the agent's selected action, the environment is deterministic. An environment may
appear stochastic if it is only partially observable.
• Episodic vs. Sequential: If future decisions do not depend on the actions an agent has taken, just
the information from its sensors about the state it is in, then the environment is episodic.
• Static vs. Dynamic: If the environment can change while the agent is deciding what to do, the environment is dynamic; otherwise it is static.
• Discrete vs. Continuous: If the sets of percepts and actions available to the agent are finite, and the individual elements are distinct and well-defined, then the environment is discrete.

Classes of Intelligent Agents: Intelligent agents can be classified into five classes based on their degree of perceived intelligence and capability:

1. Simple reflex agents


2. Model-based reflex agents
3. Goal-based agents
4. Utility-based agents
5. Learning agents

1. Simple Reflex Agents: Simple reflex agents act only on the basis of the current percept, ignoring
the rest of the percept history. The agent function is based on the condition-action rule: if condition
then action. This agent function only succeeds when the environment is fully observable. Some
reflex agents can also contain information on their current state which allows them to disregard
conditions whose actuators are already triggered.
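A condition-action rule agent can be sketched in a few lines. The following example uses the classic two-square vacuum world; the rule table is illustrative only. Note that only the current percept is consulted, never the percept history.

```python
# A simple reflex agent for a two-square vacuum world (squares "A" and "B").

def reflex_vacuum_agent(percept):
    location, status = percept          # e.g. ("A", "Dirty")
    # Condition-action rules: if condition then action.
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "MoveRight"
    return "MoveLeft"

print(reflex_vacuum_agent(("A", "Dirty")))   # -> Suck
print(reflex_vacuum_agent(("A", "Clean")))   # -> MoveRight
```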

2. Model-based reflex agents: A model-based agent can handle a partially observable environment. Its current state is stored inside the agent, which maintains some kind of structure describing the part of the world that cannot be seen. This knowledge about “how the world works” is called a model of the world, hence the name “model-based agent”. A model-based reflex agent should maintain some sort of internal model that depends on the percept history and thereby reflects at least some of the unobserved aspects of the current state. It then chooses an action in the same way as the reflex agent.
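A minimal sketch of such an internal model, again in the hypothetical vacuum world: the agent records what it has perceived so far and uses that record to reason about the square it cannot currently see.

```python
# A model-based reflex agent: internal state stands in for the unseen
# parts of the world. The update rule and world model are illustrative.

class ModelBasedVacuumAgent:
    def __init__(self):
        self.model = {"A": "Unknown", "B": "Unknown"}  # internal world model

    def act(self, percept):
        location, status = percept
        self.model[location] = status          # update model from percept
        if status == "Dirty":
            return "Suck"
        # Use the model to reason about the square it cannot see.
        other = "B" if location == "A" else "A"
        if self.model[other] != "Clean":
            return "MoveRight" if location == "A" else "MoveLeft"
        return "NoOp"                          # everything known to be clean

agent = ModelBasedVacuumAgent()
print(agent.act(("A", "Dirty")))   # -> Suck
print(agent.act(("A", "Clean")))   # -> MoveRight
```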

3. Goal-based Agents: Goal-based agents further expand on the capabilities of model-based agents by using “goal” information. Goal information describes situations that are desirable. This gives the agent a way to choose among multiple possibilities, selecting the one that reaches a goal state. Although in some instances the goal-based agent appears to be less efficient, it is more flexible, because the knowledge that supports its decisions is represented explicitly and can be modified.

4. Utility-based Agents: A rational utility-based agent chooses the action that maximizes the expected utility of the action’s outcomes, that is, the utility the agent expects to derive on average, given the probabilities and utilities of each outcome. Goal-based agents only distinguish between goal states and non-goal states, but it is possible to define a measure of how desirable a particular state is. This measure can be obtained through the use of a utility function which maps a state to a measure of the utility of that state. A utility-based agent has to model and keep track of its environment, tasks that have involved a great deal of research on perception, representation, reasoning, and learning.
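Expected-utility maximization is easy to show concretely. In this minimal sketch, the actions, outcome probabilities, and utility values are all invented for illustration.

```python
# Choosing the action that maximizes expected utility.

def expected_utility(outcomes):
    """Sum of probability * utility over an action's possible outcomes."""
    return sum(p * u for p, u in outcomes)

actions = {
    # action: [(probability, utility), ...] -- hypothetical values
    "take highway":   [(0.8, 10), (0.2, -10)],   # fast, but may jam
    "take back road": [(1.0, 4)],                # slow but certain
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))     # -> take highway 6.0
```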

5. Learning Agents: Learning has the advantage that it allows agents to initially operate in unknown environments and to become more competent than their initial knowledge alone might allow. The most important distinction is between the “learning element”, which is responsible for making improvements, and the “performance element”, which is responsible for selecting external actions.
The learning element uses feedback from the “critic” on how the agent is doing and
determines how the performance element should be modified to do better in the future.
The last component of the learning agent is the “problem generator”. It is responsible for
suggesting actions that will lead to new and informative experiences.

Computer Vision:

Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions.

• The computer vision covers the core technology of automated image analysis which is used in
many fields.
• The computer vision and machine vision fields have significant overlap.
• Machine vision usually refers to a process of combining automated image analysis with other methods and technologies to provide automated inspection and robot guidance in industrial applications.
• As a scientific discipline, computer vision is concerned with the theory behind artificial systems
that extract information from images.
• The image data can take many forms, such as video sequences, views from multiple cameras, or
multi-dimensional data from a medical scanner.
• As a technological discipline, computer vision seeks to apply its theories and models to the construction of computer vision systems.
• Examples of applications of computer vision include systems for:
• Controlling processes (e.g. an industrial robot or an autonomous vehicle).
• Detecting events (e.g. for visual surveillance)
• Organizing information (e.g. for indexing databases of images and image sequences),
• Modeling objects or environments (e.g. industrial inspection, medical image analysis or
topographical modeling),
• Interaction (e.g. as the input to a device for computer-human interaction).

Relation between Computer vision and various other fields


Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how
to reconstruct structure or other information about the 3D scene from one or several images.

Computer vision often relies on more or less complex assumptions about the scene depicted in an
image.
Machine vision tends to focus on applications, mainly in industry, e.g., vision-based autonomous robots and systems for vision-based inspection or measurement. This implies that image sensor technologies and control theory are often integrated with the processing of image data to control a robot, and that real-time processing is emphasized by means of efficient implementations in hardware and software.

Tasks of computer vision:


Each of the application areas employs a range of computer vision tasks. Some examples of typical computer vision tasks are as follows:

1. Recognition:

The classical problem in computer vision, image processing and machine vision is that of
determining whether or not the image data contains some specific object, feature, or activity. This
task can normally be solved robustly and without effort by a human, but is still not satisfactorily
solved in computer vision for the general case: arbitrary objects in arbitrary situations. The existing
methods for dealing with this problem can at best solve it only for specific objects, such as simple
geometric objects (e.g., polyhedrons), human faces, printed or hand-written characters, or vehicles,
and in specific situations, typically described in terms of well-defined illumination, background,
and pose of the object relative to the camera. Different varieties of the recognition problem are
described:

• Recognition: one or several pre-specified or learned objects or object classes can be recognized,
usually together with their 2D positions in the image or 3D poses in the scene.
• Identification: An individual instance of an object is recognized. Examples: identification of a specific person’s face or fingerprint, or identification of a specific vehicle.
• Detection: the image data is scanned for a specific condition. Examples: detection of possible
abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll
system. Detection based on relatively simple and fast computations is sometimes used for
finding smaller regions of interesting image data which can be further analyzed by more
computationally demanding techniques to produce a correct interpretation.

Several specialized tasks based on recognition exist, such as:

• Content-based image retrieval: finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter, and have no cars in them).
• Pose estimation: estimating the position or orientation of a specific object relative to the camera.
An example application for this technique would be assisting a robot arm in retrieving objects from
a conveyor belt in an assembly line situation.
• Optical character recognition (or OCR): identifying characters in images of printed or
handwritten text, usually with a view to encoding the text in a format more amenable to editing or
indexing (e.g. ASCII).
• 2D Code Reading: Reading of 2D codes such as data matrix and QR codes.

2. Motion:
Several tasks relate to motion estimation, in which an image sequence is processed to produce an estimate of the velocity either at each point in the image or in the 3D scene. Examples of such tasks are:
• Ego motion: determining the 3D rigid motion of the camera.
• Tracking: following the movements of objects (e.g. vehicles or humans).
• Optical flow: to determine, for each point in the image, how that point is moving relative to the image plane, i.e., its apparent motion. This motion is a result both of how the corresponding 3D point is moving in the scene and of how the camera is moving relative to the scene.

3. Scene Reconstruction:

Given two or more images of a scene, or a video, scene reconstruction aims at computing a 3D
model of the scene. In the simplest case the model can be a set of 3D points. More sophisticated
methods produce a complete 3D surface model.

4. Image Restoration:
The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images. The simplest possible approach to noise removal is to apply various types of filters, such as low-pass filters or median filters. More sophisticated methods assume a model of what the local image structures look like, a model which distinguishes them from the noise. By first analysing the image data in terms of the local image structures, such as lines or edges, and then controlling the filtering based on local information from the analysis step, a better level of noise removal is usually obtained compared to the simpler approaches.
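As a minimal sketch of the simplest approach named above, here is a 3x3 median filter removing impulse (salt-and-pepper) noise. The toy image is invented; a real system would use an optimized library routine rather than explicit loops.

```python
import numpy as np

def median_filter_3x3(image):
    """Replace each interior pixel by the median of its 3x3 neighborhood."""
    out = image.copy()
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            out[i, j] = np.median(image[i - 1:i + 2, j - 1:j + 2])
    return out

img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                       # a single noisy pixel
print(median_filter_3x3(img)[2, 2])   # -> 100, the outlier is removed
```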

5. 3D Volume Recognition:
Parallax depth recognition generates a 2.5D plane, which is a 2D plane with depth information. For 3D volume recognition, the 2D pre-processing steps may be applied first (noise reduction, contrast enhancement, etc.), because initially the data may be in the form of 2D images acquired using parallax depth perception.

Computer Vision System Methods:


The organization of a computer vision system is highly application dependent. Some systems are stand-alone applications which solve a specific measurement or detection problem, while others constitute a sub-system of a larger design which, for example, also contains sub-systems for control of mechanical actuators, planning, information databases, man-machine interfaces, etc. The specific implementation of a computer vision system also depends on whether its functionality is pre-specified or whether some part of it can be learned or modified during operation. There are, however, typical functions which are found in many computer vision systems.

• Image Acquisition: A digital image is produced by one or several image sensors which, besides
various types of light-sensitive cameras, includes range sensors, tomography devices, radar, ultrasonic cameras, etc. Depending on the type of sensor, the resulting image data is an ordinary 2D
image, a 3D volume, or an image sequence. The pixel values typically correspond to light
intensity in one or several spectral bands (gray images or color images), but can also be related to
various physical measures, such as depth, absorption or reflectance of sonic or electromagnetic
waves, or nuclear magnetic resonance.
• Pre-Processing: Before a computer vision method can be applied to image data in order to extract
some specific piece of information, it is usually necessary to process the data in order to assure
that it satisfies certain assumptions implied by the method. Examples are:
▪ Re-sampling in order to assure that the image coordinate system is correct.
▪ Noise reduction in order to assure that sensor noise does not introduce false information.
▪ Contrast enhancement to assure that relevant information can be detected.
▪ Scale-space representation to enhance image structures at locally appropriate scales.
• Feature Extraction: Image features at various levels of complexity are extracted from the image
data. Typical examples of such features are:

▪ Lines, edges and ridges.

▪ Localized interest points such as corners, blobs or points.

More complex features may be related to texture, shape or motion. (A sketch of one feature-extraction step, edge detection, appears after this list.)

• Detection/Segmentation: At some point in the processing a decision is made about which image
points or regions of the image are relevant for further processing. Examples are:
▪ Selection of a specific set of interest points
▪ Segmentation of one or multiple image regions which contain a specific object of interest.
• High-level processing: At this step the input is typically a small set of data, for example a set of
points or an image region which is assumed to contain a specific object. The remaining processing
deals with, for example:
▪ Verification that the data satisfy model-based and application specific assumptions.
▪ Estimation of application specific parameters, such as object pose or object size.
▪ Classifying a detected object into different categories
▪ Comparing and combining two different views of the same object.
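Below is the edge-detection sketch referenced under Feature Extraction: a minimal Sobel gradient-magnitude computation. The toy image is invented, and production code would use an optimized library implementation.

```python
import numpy as np

def sobel_edges(image):
    """Return the gradient magnitude at each interior pixel."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # horizontal gradient
    ky = kx.T                                            # vertical gradient
    h, w = image.shape
    mag = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = image[i - 1:i + 2, j - 1:j + 2]
            gx, gy = (patch * kx).sum(), (patch * ky).sum()
            mag[i, j] = np.hypot(gx, gy)
    return mag

img = np.zeros((5, 5))
img[:, 2:] = 1.0                 # a vertical step edge
print(sobel_edges(img)[2])       # strongest response along the edge
```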

Natural Language Processing:


Natural Language Processing is a theoretically motivated range of computational techniques for
analysing and representing naturally occurring texts at one or more levels of linguistic analysis for
the purpose of achieving human-like language processing for a range of tasks or applications.

Firstly, the imprecise notion of ‘range of computational techniques’ is necessary because there are
multiple methods or techniques from which to choose to accomplish a particular type of language
analysis.

‘Naturally occurring texts’ can be of any language, mode, genre, etc. The texts can be oral or written. The only requirement is that they be in a language used by humans to communicate with one another. Also, the text being analysed should not be specifically constructed for the purpose of the analysis; rather, the text should be gathered from actual usage.

The notion of ‘levels of linguistic analysis’ (explained further below) refers to the fact
that there are multiple types of language processing known to be at work when humans produce or
comprehend language. It is thought that humans normally utilize all of these levels since each level
conveys different types of meaning. But various NLP systems utilize different levels, or
combinations of levels of linguistic analysis, and this is seen in the differences amongst various
NLP applications. This also leads to much confusion on the part of non-specialists as to what NLP
really is, because a system that uses any subset of these levels of analysis can be said to be an NLP-
based system. The difference between them, therefore, may actually be whether the system uses
‘weak’ NLP or ‘strong’ NLP.

‘Human-like language processing’ reveals that NLP is considered a discipline within Artificial
Intelligence (AI). And while the full lineage of NLP does depend on a number of other disciplines,
since NLP strives for human-like performance, it is appropriate to consider it an AI discipline.

‘For a range of tasks or applications’ points out that NLP is not usually considered a goal in and
of itself, except perhaps for AI researchers. For others, NLP is the means for accomplishing a
particular task. Therefore, you have Information Retrieval (IR) systems that utilize NLP, as well as
Machine Translation (MT), Question-Answering, etc.

Goal:

The goal of NLP as stated above is “to accomplish human-like language processing”. The choice of
the word ‘processing’ is very deliberate, and should not be replaced with ‘understanding’. For
although the field of NLP was originally referred to as Natural Language Understanding (NLU) in
the early days of AI, it is well agreed today that while the goal of NLP is true NLU, that goal has
not yet been accomplished. A full NLU System would be able to:

1. Paraphrase an input text


2. Translate the text into another language
3. Answer questions about the contents of the text
4. Draw inferences from the text
Divisions of NLP:

While the entire field is referred to as Natural Language Processing, there are in fact two distinct focuses:

• Language Processing: Refers to the analysis of language for the purpose of producing a
meaningful representation.
• Language Generation: Refers to the production of language from a representation.

Levels of Natural Language Processing:


The most explanatory method for presenting what actually happens within a Natural Language
Processing system is by means of the ‘levels of language’ approach. This is also referred to as the
synchronic model of language and is distinguished from the earlier sequential model, which
hypothesizes that the levels of human language processing follow one another in a strictly
sequential manner. Psycholinguistic research suggests that language processing is much more
dynamic, as the levels can interact in a variety of orders. Introspection reveals that we frequently
use information we gain from what is typically thought of as a higher level of processing to assist in
a lower level of analysis. For example, the pragmatic knowledge that the document you are reading
is about biology will be used when a particular word that has several possible senses (or meanings)
is encountered, and the word will be interpreted as having the biology sense. Of necessity, the
following description of levels will be presented sequentially. The key point here is that meaning is
conveyed by each and every level of language and that since humans have been shown to use all
levels of language to gain understanding, the more capable an NLP system is, the more levels of
language it will utilize.

1. Phonology:

This level deals with the interpretation of speech sounds within and across words. There are,
in fact, three types of rules used in phonological analysis: 1) phonetic rules – for sounds
within words; 2) phonemic rules – for variations of pronunciation when words are spoken
together, and; 3) prosodic rules – for fluctuation in stress and intonation across a sentence. In
an NLP system that accepts spoken input, the sound waves are analysed and encoded into a

digitized signal for interpretation by various rules or by comparison to the particular language
model being utilized.

2. Morphology:

This level deals with the componential nature of words, which are composed of morphemes –
the smallest units of meaning. For example, the word preregistration can be morphologically
analysed into three separate morphemes: the prefix pre, the root register, and the suffix tion.
Since the meaning of each morpheme remains the same across words, humans can break
down an unknown word into its constituent morphemes in order to understand its meaning.
Similarly, an NLP system can recognize the meaning conveyed by each morpheme in order to
gain and represent meaning. For example, adding the suffix –ed to a verb, conveys that the
action of the verb took place in the past. This is a key piece of meaning, and in fact, is
frequently only evidenced in a text by the use of the -ed morpheme.
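A minimal sketch of this kind of morphological decomposition follows, with tiny invented affix lists; real analyzers use full rule sets, lexicons, and spelling rules.

```python
# Hypothetical affix stripping: word -> (prefix, root, suffix).

PREFIXES = ["pre", "un", "re"]
SUFFIXES = ["ation", "tion", "ed", "ing", "s"]

def morph_analyse(word):
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    root = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, root, suffix

print(morph_analyse("preregistration"))  # -> ('pre', 'registr', 'ation')
print(morph_analyse("launched"))         # -> ('', 'launch', 'ed')
```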

3. Lexical:

At this level, humans, as well as NLP systems, interpret the meaning of individual words.
Several types of processing contribute to word-level understanding – the first of these being
assignment of a single part-of-speech tag to each word. In this processing, words that can
function as more than one part-of-speech are assigned the most probable part-of speech tag
based on the context in which they occur.

Additionally, at the lexical level, those words that have only one possible sense or meaning
can be replaced by a semantic representation of that meaning. The nature of the representation
varies according to the semantic theory utilized in the NLP system. The following
representation of the meaning of the word launch is in the form of logical predicates. As can
be observed, a single lexical unit is decomposed into its more basic properties. Given that
there is a set of semantic primitives used across all words, these simplified lexical
representations make it possible to unify meaning across words and to produce complex
interpretations, much the same as humans do. For example: launch (a large boat used for carrying people on rivers, lakes, harbors, etc.)

The lexical level may require a lexicon, and the particular approach taken by an NLP system
will determine whether a lexicon will be utilized, as well as the nature and extent of
information that is encoded in the lexicon. Lexicons may be quite simple, with only the words
and their part(s)-of-speech, or may be increasingly complex and contain information on the
semantic class of the word, what arguments it takes, and the semantic limitations on these
arguments, definitions of the sense(s) in the semantic representation utilized in the particular
system, and even the semantic field in which each sense of a polysemous word is used.
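A minimal sketch of the first lexical-level step described above, assigning each word its most probable part-of-speech tag. The toy counts are invented; real taggers are trained on large tagged corpora and also use context, as noted.

```python
# Hypothetical word -> {tag: count} table from an imagined tagged corpus.
TAG_COUNTS = {
    "file":   {"NOUN": 80, "VERB": 20},
    "the":    {"DET": 100},
    "dog":    {"NOUN": 100},
    "chased": {"VERB": 100},
}

def tag(words):
    # Assign each word its most frequently observed tag.
    return [(w, max(TAG_COUNTS[w], key=TAG_COUNTS[w].get)) for w in words]

print(tag(["the", "dog", "chased", "the", "file"]))
# -> [('the', 'DET'), ('dog', 'NOUN'), ('chased', 'VERB'), ('the', 'DET'), ('file', 'NOUN')]
```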

4. Syntactic:

This level focuses on analysing the words in a sentence so as to uncover the grammatical
structure of the sentence. This requires both a grammar and a parser. The output of this level
of processing is a (possibly delinearized) representation of the sentence that reveals the
structural dependency relationships between the words. There are various grammars that can
be utilized, and which will, in turn, impact the choice of a parser. Not all NLP applications
require a full parse of sentences, therefore the remaining challenges in parsing of
prepositional phrase attachment and conjunction scoping no longer stymie those applications
for which phrasal and clausal dependencies are sufficient. Syntax conveys meaning in most
languages because order and dependency contribute to meaning. For example, the two

sentences: ‘The dog chased the cat.’ and ‘The cat chased the dog.’ differ only in terms of
syntax, yet convey quite different meanings.
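A minimal sketch of syntactic analysis for exactly this sentence pattern: a toy lexicon and grammar (S -> NP V NP, NP -> Det N) with a naive parser. Real systems use far richer grammars and parsers, as noted above.

```python
# Toy grammar and parser for "the dog chased the cat"-style sentences.

LEXICON = {"the": "Det", "dog": "N", "cat": "N", "chased": "V"}

def parse_np(words):                      # NP -> Det N
    det, noun = words[0], words[1]
    assert LEXICON[det] == "Det" and LEXICON[noun] == "N"
    return ("NP", det, noun), words[2:]

def parse_sentence(words):                # S -> NP V NP
    subj, rest = parse_np(words)
    verb, rest = rest[0], rest[1:]
    assert LEXICON[verb] == "V"
    obj, _ = parse_np(rest)
    return ("S", subj, ("V", verb), obj)

print(parse_sentence("the dog chased the cat".split()))
# -> ('S', ('NP', 'the', 'dog'), ('V', 'chased'), ('NP', 'the', 'cat'))
```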

5. Semantic:

This is the level at which most people think meaning is determined; however, as we can see from the above definitions of the levels, all the levels contribute to meaning. Semantic processing determines the possible meanings of a sentence by focusing on the interactions among word-level meanings in the sentence. This level of processing can include the semantic disambiguation of words with multiple senses, in an analogous way to how syntactic disambiguation of words that can function as multiple parts-of-speech is accomplished at the syntactic level. Semantic disambiguation permits one and only one sense of polysemous
words to be selected and included in the semantic representation of the sentence. For example,
amongst other meanings, ‘file’ as a noun can mean either a folder for storing papers, or a tool
to shape one’s fingernails, or a line of individuals in a queue. If information from the rest of
the sentence were required for the disambiguation, the semantic, not the lexical level, would
do the disambiguation. A wide range of methods can be implemented to accomplish the
disambiguation, some which require information as to the frequency with which each sense
occurs in a particular corpus of interest, or in general usage, some which require consideration
of the local context, and others which utilize pragmatic knowledge of the domain of the
document.
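A minimal sketch combining two of the methods mentioned above, sense frequency plus local-context cue words, for the ‘file’ example. The sense inventory, cue words, and frequencies are all invented for illustration.

```python
SENSES = {
    "file": [
        # (sense, corpus frequency, local-context cue words) -- hypothetical
        ("folder for papers", 0.6, {"papers", "cabinet", "store"}),
        ("fingernail tool",   0.1, {"nails", "shape"}),
        ("line of people",    0.3, {"queue", "single"}),
    ]
}

def disambiguate(word, context_words):
    candidates = SENSES[word]
    # Prefer a sense whose cue words appear in the local context ...
    for sense, _, cues in candidates:
        if cues & set(context_words):
            return sense
    # ... otherwise fall back to the most frequent sense.
    return max(candidates, key=lambda s: s[1])[0]

print(disambiguate("file", ["she", "kept", "her", "papers", "in", "a"]))
# -> folder for papers (matched by local context)
print(disambiguate("file", ["a", "big", "one"]))
# -> folder for papers (most frequent sense)
```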

6. Discourse:

While syntax and semantics work with sentence-length units, the discourse level of NLP works with units of text longer than a sentence. That is, it does not interpret multi-sentence texts as just concatenated sentences, each of which can be interpreted singly. Rather, discourse
focuses on the properties of the text as a whole that convey meaning by making connections
between component sentences. Several types of discourse processing can occur at this level,
two of the most common being anaphora resolution and discourse/text structure recognition.
Anaphora resolution is the replacing of words such as pronouns, which are semantically
vacant, with the appropriate entity to which they refer (30). Discourse/text structure
recognition determines the functions of sentences in the text, which, in turn, adds to the
meaningful representation of the text. For example, newspaper articles can be deconstructed
into discourse components such as: Lead, Main Story, Previous Events, Evaluation, Attributed
Quotes, and Expectation.

7. Pragmatic:

This level is concerned with the purposeful use of language in situations, and it utilizes context over and above the contents of the text for understanding. The goal is to explain how extra meaning is read into texts without actually being encoded in them. This requires much world knowledge, including the understanding of intentions, plans, and goals. Some NLP applications may utilize knowledge bases and inferencing modules. For example, resolving an anaphoric term such as ‘they’ across a pair of sentences may require pragmatic or world knowledge, not just syntax or semantics.

Approaches to Natural Language Processing:


Natural language processing approaches fall roughly into four categories: symbolic, statistical,
connectionist, and hybrid. Symbolic and statistical approaches have coexisted since the early days
of this field. Connectionist NLP work first appeared in the 1960’s. For a long time, symbolic

approaches dominated the field. In the 1980’s, statistical approaches regained popularity as a result
of the availability of critical computational resources and the need to deal with broad, real-world
contexts. Connectionist approaches also recovered from earlier criticism by demonstrating the
utility of neural networks in NLP. This section examines each of these approaches in terms of their
foundations, typical techniques, differences in processing and system aspects, and their robustness,
flexibility, and suitability for various tasks.

Symbolic Approach:

Symbolic approaches perform deep analysis of linguistic phenomena and are based on explicit
representation of facts about language through well-understood knowledge representation schemes
and associated algorithms (21). In fact, the description of the levels of language analysis in the
preceding section is given from a symbolic perspective. The primary source of evidence in symbolic
systems comes from human-developed rules and lexicons.

A good example of symbolic approaches is seen in logic- or rule-based systems. In logic-based systems, the symbolic structure is usually in the form of logic propositions. Manipulations of such structures are defined by inference procedures that are generally truth preserving. Rule-based systems usually consist of a set of rules, an inference engine, and a workspace or working memory. Knowledge is represented as facts or rules in the rule-base. The inference engine repeatedly selects a rule whose condition is satisfied and executes the rule.
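A minimal sketch of such a rule-based system: a working memory of facts and an inference engine that repeatedly fires any rule whose conditions are satisfied. The rules themselves are invented for illustration.

```python
facts = {"has_fur", "gives_milk"}                 # working memory

rules = [
    ({"has_fur", "gives_milk"}, "mammal"),        # if conditions then fact
    ({"mammal", "lives_in_sea"}, "whale"),
]

changed = True
while changed:                                    # the inference engine
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)                 # fire the rule
            changed = True

print(facts)   # -> now includes 'mammal'; the 'whale' rule never fires
```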

Symbolic approaches have been used for a few decades in a variety of research areas and
applications such as information extraction, text categorization, ambiguity resolution, and lexical
acquisition. Typical techniques include: explanation-based learning, rule-based learning, inductive
logic programming, decision trees, conceptual clustering, and K nearest neighbour algorithms.

Statistical Approach:

Statistical approaches employ various mathematical techniques and often use large text corpora to
develop approximate generalized models of linguistic phenomena based on actual examples of these
phenomena provided by the text corpora without adding significant linguistic or world knowledge.
In contrast to symbolic approaches, statistical approaches use observable data as the primary source
of evidence.

A frequently used statistical model is the Hidden Markov Model (HMM) inherited from the
speech community. HMM is a finite state automaton that has a set of states with probabilities
attached to transitions between states (34). Although outputs are visible, states themselves are not
directly observable, thus “hidden” from external observations. Each state produces one of the
observable outputs with a certain probability.
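A minimal sketch of an HMM and the standard forward computation of the probability of an observed output sequence; the states, transition, and emission probabilities below are invented for illustration.

```python
states = ["Noun", "Verb"]
start  = {"Noun": 0.6, "Verb": 0.4}               # initial state probabilities
trans  = {"Noun": {"Noun": 0.3, "Verb": 0.7},     # transition probabilities
          "Verb": {"Noun": 0.8, "Verb": 0.2}}
emit   = {"Noun": {"dog": 0.5, "barks": 0.1},     # emission probabilities
          "Verb": {"dog": 0.1, "barks": 0.6}}

def forward(observations):
    """P(observations) summed over all hidden state paths."""
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][obs]
                 for s in states}
    return sum(alpha.values())

print(forward(["dog", "barks"]))   # probability of observing "dog barks"
```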

Statistical approaches have typically been used in tasks such as speech recognition, lexical
acquisition, parsing, part-of-speech tagging, collocations, statistical machine translation, statistical
grammar learning, and so on.

Connectionist Approach:

Similar to the statistical approaches, connectionist approaches also develop generalized models
from examples of linguistic phenomena. What separates connectionism from other statistical
methods is that connectionist models combine statistical learning with various theories of
representation - thus the connectionist representations allow transformation, inference, and
manipulation of logic formulae (33). In addition, in connectionist systems, linguistic models are

harder to observe due to the fact that connectionist architectures are less constrained than statistical
ones.

Some connectionist models are called localist models, assuming that each unit represents a
particular concept. For example, one unit might represent the concept “mammal” while another unit
might represent the concept “whale”. Relations between concepts are encoded by the weights of
connections between those concepts. Knowledge in such models is spread across the network, and
the connectivity between units reflects their structural relationship. Localist models are quite similar
to semantic networks, but the links between units are not usually labeled as they are in semantic
nets. They perform well at tasks such as word-sense disambiguation, language generation, and
limited inference.

Natural Language Processing Applications:


Natural language processing provides both theory and implementations for a range of applications.
In fact, any application that utilizes text is a candidate for NLP. The most frequent applications
utilizing NLP include the following:

• Information Retrieval – given the significant presence of text in this application, it is surprising
that so few implementations utilize NLP. Recently, statistical approaches for accomplishing NLP
have seen more utilization.
• Information Extraction (IE) – a more recent application area, IE focuses on the recognition, tagging, and extraction of certain key elements of information, e.g. persons, companies, locations, organizations, from large collections of text into a structured representation. These extractions can then be utilized for a range of applications including question-answering, visualization, and data mining.
• Question-Answering – in contrast to Information Retrieval, which provides a list of potentially
relevant documents in response to a user’s query, question-answering provides the user with either
just the text of the answer itself or answer-providing passages.
• Summarization – the higher levels of NLP, particularly the discourse level, can empower an
implementation that reduces a larger text into a shorter, yet richly constituted abbreviated narrative
representation of the original document.
• Machine Translation – perhaps the oldest of all NLP applications, various levels of NLP have
been utilized in MT systems, ranging from the ‘word-based’ approach to applications that include
higher levels of analysis.
• Dialogue Systems – perhaps the omnipresent application of the future, in the systems envisioned by large providers of end-user applications. Dialogue systems, which usually focus on a narrowly defined application (e.g. your refrigerator or home sound system), currently utilize only the phonetic and lexical levels of language. It is believed that utilization of all the levels of language processing explained above offers the potential for truly habitable dialogue systems.

