
$sagarMalla 2016 Batch 19171R

Artificial Intelligence

CHAPTER - 1
Introduction to AI

1.1 What is AI?

Intelligence
· Intelligence is the ability to learn, understand and create, to solve problems and to make decisions. Intelligence involves sensing, thinking, and acting.
· Sensing: the translation of sensory inputs (percepts) into a conceptual representation. Examples: computer vision, speech recognition, language understanding.
· Thinking: the manipulation of the conceptual representation. Examples: knowledge representation, problem solving/planning, learning (making improvements based on the results of past actions).
· Acting: the translation of intent into (physical) actions, reflexive or deliberative. Examples: robotics, speech and language synthesis.

Artificial Intelligence (AI)
· Artificial Intelligence (AI) is the study and design of non-organic intelligent agents or machines that can perceive changes in their environment and behave like a natural organic system (a human): think, imagine, create, memorize, understand, recognize patterns, make choices, adapt to change, and learn from past experience (i.e., heuristics).
· "The art of creating machines that perform functions that require intelligence when performed by people" – Kurzweil, 1990
· "The science of making computers do things that require intelligence like humans" – Minsky
· "AI is the study of how to make computers do things at which, at the moment, people are better" – Elaine Rich
· The definitions of AI can be organized into four categories.

Goals and Approaches of AI

A) Acting humanly: The Turing Test approach
Alan Turing (1950) proposed the "Turing Test" to provide a satisfactory operational definition of intelligence. The test involves an interrogator who interacts with one human and one machine and must decide which is which. If he is wrong half the time, then the machine is intelligent.

To pass the Turing Test, the computer would need to possess the following capabilities:
· Natural language processing, to enable it to communicate successfully in English;
· Knowledge representation, to store what it knows or hears;
· Automated reasoning, to use the stored information to answer questions and to draw new conclusions;
· Machine learning, to adapt to new circumstances and to detect and extrapolate patterns.

The standard test avoids direct physical interaction between the interrogator and the computer. The version that also tests perceptual and physical abilities is called the "Total Turing Test"; its additional requirements are computer vision, speech recognition, speech synthesis, and robotics.

B) Thinking humanly: The cognitive modeling approach
The workings of the human mind can be studied in three ways: (i) through introspection, trying to catch our own thoughts as they go by; (ii) through psychological experiments, observing a person in action; and (iii) through brain imaging, observing the brain in action. Cognitive science combines computer models from AI with experimental methods from psychology.

C) Thinking rationally: The "laws of thought" approach
Formal logic (late nineteenth and early twentieth centuries) provides a precise notation for statements about all kinds of things and the relations between them. For example: "Socrates is a man; all men are mortal; therefore Socrates is mortal."
The main problems with the "laws of thought" approach are that knowledge is often uncertain, and that the complexity of real problems easily exhausts today's (and tomorrow's) computational resources.

D) Acting rationally: The rational agent approach
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. A human agent has eyes, ears, and other organs for sensors, and hands, legs, a vocal tract, and so on for actuators. A robotic agent might have cameras and infrared range finders for sensors and various motors for actuators.
A rational agent is one that acts so as to achieve the best outcome or, when there is uncertainty, the best expected outcome. AI as rational agent design is more general than the "laws of thought" approach; correct inference is useful but not necessary for achieving rationality.

· These definitions give four possible goals to pursue:
1. Systems that think like humans
2. Systems that act like humans
3. Systems that think rationally
4. Systems that act rationally

https://bit-papers.blogspot.com/ ©17171R


· General AI goals are as follows:
* Replicating human intelligence.
* Solving knowledge-intensive tasks.
* Making an intelligent connection between perception and action.
* Enhancing human-human, human-computer and computer-computer interaction/communication.
* An emphasis on enhancing intelligent behavior.

1.2 History of AI

The term "Artificial Intelligence" was first used in 1956 by the American scientist John McCarthy, who is referred to as the "Father of AI". McCarthy also came up with a programming language called LISP (i.e., List Processing), which is still used to program computers in AI and allows the computer to learn.

1943        The first electronic computer, "Colossus", was developed.
1949        The first commercial stored-program computer was developed.
1950        The "Turing Test" was proposed by Alan Turing.
1951        Marvin Minsky (a student of von Neumann) built a neural network using 3000 vacuum tubes and the "autopilot" from a B-24 bomber. The first NN was a computer-simulated "rat" finding its way through a maze (a web).
1958        John McCarthy develops the LISP programming language.
1966        The "ELIZA" program was written, which behaved like a therapist, responding on the basis of the given symptoms.
1969        The robot "Shakey" was built, which could recognize shapes and colors and navigate a path through colored blocks.
1970        The first expert system appeared. Another AI programming language, PROLOG (i.e., programming in logic), was developed.
1986        Neural networks came back from the dead.
1994        AI systems exist in real environments with real sensory inputs (i.e., intelligent agents).
Late 1990s  Web-bots and crawlers were developed to supply information to Internet search engines.
1997        For the first time, an AI system controlled a spacecraft, named "Deep Space II".
Present     Programmers are still trying to develop a computer which can successfully pass the "Turing Test".

AI Today

• Diagnose lymph-node diseases
• Monitor space shuttle missions
• Automatic vehicle control
• Large-scale scheduling
• Detection of money laundering
• Classify astronomical objects
• Speech understanding systems
• Beat the world's best players in chess, checkers, and backgammon.

Example: ELIZA

• ELIZA: a program that simulated a psychotherapist interacting with a patient, and successfully passed the Turing Test.
• Coded at MIT during 1964-1966 by Joseph Weizenbaum.
• The first script was DOCTOR.
• The script was a simple collection of syntactic patterns, not unlike regular expressions.
• Each pattern had an associated reply, which might include bits of the input (after simple transformations, e.g. my -> your).
• Weizenbaum was shocked at the reactions:
  – Psychiatrists thought it had potential.
  – People unequivocally anthropomorphised it.

AI Techniques

The various techniques that have evolved can be applied to a variety of AI tasks. They are concerned with how we represent, manipulate and reason with knowledge in order to solve problems.

• Techniques, not all "intelligent", but used to behave intelligently:
  § Describe and match
  § Constraint satisfaction
  § Generate and test
  § Goal reduction
  § Tree searching
  § Rule-based systems
• Biology-inspired AI techniques are currently popular:

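The ELIZA mechanism described above, a list of syntactic patterns with reply templates plus simple word transformations, can be sketched in a few lines. The patterns, replies and transformation table below are illustrative inventions, not Weizenbaum's original DOCTOR script:

```python
import re

# A minimal ELIZA-style script: each syntactic pattern is paired with a reply
# template; group references pull bits of the input into the reply after
# simple word transformations (my -> your, me -> you). These patterns and
# replies are made up for illustration.
SCRIPT = [
    (re.compile(r"i feel (.*)", re.I), "Why do you feel {0}?"),
    (re.compile(r"my (.*)", re.I), "Tell me more about your {0}."),
    (re.compile(r".*"), "Please go on."),
]

TRANSFORMS = {"my": "your", "i": "you", "me": "you", "am": "are"}

def transform(fragment: str) -> str:
    """Apply the simple word-level swaps to a captured piece of input."""
    return " ".join(TRANSFORMS.get(w, w) for w in fragment.lower().split())

def reply(utterance: str) -> str:
    """Answer with the reply of the first pattern that matches the utterance."""
    for pattern, template in SCRIPT:
        m = pattern.match(utterance.strip())
        if m:
            return template.format(*(transform(g) for g in m.groups()))
    return "Please go on."

print(reply("My mother hates me"))  # Tell me more about your mother hates you.
```

The charm (and the shock Weizenbaum reported) is that such shallow pattern matching, with no understanding at all, often reads as empathy.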

  § Neural Networks
  § Reinforcement learning
  § Genetic algorithms

i. Describe and Match

■ A model is a description of a system's behavior.
■ A finite state model consists of a set of states, a set of input events and the relations between them. Given a current state and an input event, you can determine the next current state of the model.
■ A computational model is a finite state machine. It includes a set of states, a set of start states, an input alphabet, and a transition function which maps input symbols and current states to a next state.
■ A representation of a computational system includes start and end state descriptions and a set of possible transition rules that might be applied. The problem is to find the appropriate transition rules.
■ Transition relation: if a pair of states (S, S') is such that one move takes the system from S to S', then the transition relation is represented by S => S'.
■ A state-transition system is called deterministic if every state has at most one successor; it is called non-deterministic if at least one state has more than one successor.
■ Puzzle: Towers of Hanoi with only 2 disks.
  Solve the puzzle: move the disks from the leftmost post to the rightmost post, while never putting a larger disk on top of a smaller one; move one disk at a time, from one peg to another; the middle post can be used for intermediate storage. Play the game in the smallest number of moves possible.
■ Examples of some possible transitions between states are shown for the Towers of Hanoi puzzle with 2 disks.
  [Figure: state-transition graph for the 2-disk Towers of Hanoi, from the initial state [1, 2] [ ] [ ] through intermediate states such as [2] [1] [ ], [2] [ ] [1], [ ] [1] [2] and [ ] [2] [1], down to states such as [ ] [ ] [1, 2] (the goal), [1] [ ] [2], [1] [2] [ ] and [ ] [1, 2] [ ].]

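As a sketch of the describe-and-match idea, the 2-disk puzzle above can be solved by searching the state-transition system directly. The state representation and function names here are my own choices; breadth-first search over the transition relation finds the shortest path from the initial description to the goal description:

```python
from collections import deque

# A state is a description ((pegA), (pegB), (pegC)), each peg a tuple of disks
# from bottom to top; the transition relation S => S' is "one legal move
# takes S to S'". Disk 2 is larger than disk 1.
Start = ((2, 1), (), ())   # both disks on the leftmost peg
Goal = ((), (), (2, 1))    # both disks on the rightmost peg

def successors(state):
    """All states reachable by moving one top disk onto an empty peg or a larger disk."""
    for i, src in enumerate(state):
        if not src:
            continue
        disk = src[-1]
        for j, dst in enumerate(state):
            if i != j and (not dst or dst[-1] > disk):
                pegs = [list(p) for p in state]
                pegs[i].pop()
                pegs[j].append(disk)
                yield tuple(tuple(p) for p in pegs)

def shortest_solution(start, goal):
    """Breadth-first search over the state space; returns the state sequence."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in successors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

path = shortest_solution(Start, Goal)
print(len(path) - 1)   # 3 moves, the optimum for 2 disks
```

Because breadth-first search explores level by level, the first path it finds is guaranteed to use the smallest number of moves.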

ii. Goal Reduction

■ Goal-reduction procedures are a special case of the procedural representations of knowledge in AI; they are an alternative to declarative, logic-based representations.
■ The goal-reduction process is illustrated in the form of an AND/OR tree drawn upside-down.
  ◊ Goal levels: higher-level goals are higher in the tree, and lower-level goals are lower in the tree.
  ◊ An arc directed from a higher-level node to a lower-level node represents the reduction of a higher-level goal to a lower-level sub-goal.
  ◊ Nodes at the bottom of the tree represent irreducible action goals.
■ An AND-OR tree/graph structure can represent relations between goals and sub-goals, alternative sub-goals and conjoint sub-goals.
■ Example: Goal Reduction
  An AND-OR tree/graph structure can represent facts such as "enjoyment", "earning/saving money", "old age", etc. Such a tree describes:
  ◊ Hierarchical relationships between goals and subgoals: "going on strike" is a sub-goal of "earning more money", which is a sub-goal of "improving standard of living", which is a sub-goal of "improving enjoyment of life".
  ◊ Alternative ways of trying to solve a goal: "going on strike" and "increasing productivity" are alternative ways of trying to "earn more money" (increase pay); likewise, "improving standard of living" and "working less hard" are alternative ways of trying to "improve enjoyment of life".
  ◊ Conjoint sub-goals: to "provide for old age", one needs not only to "earn more money" but also to "save money".

iii. Constraint Satisfaction Techniques

■ Constraint satisfaction is the process of finding a solution to a set of constraints: the constraints express allowed values for variables, and a solution is an evaluation of the variables that satisfies all constraints.
■ Constraint Satisfaction Problem (CSP) and its solution
  ◊ A Constraint Satisfaction Problem (CSP) consists of:
    ‡ Variables: a finite set X = {x1, ..., xn};
    ‡ Domains: a finite set Di of possible values that each variable xi can take;
    ‡ Constraints: restrictions on the values that the variables can simultaneously take (e.g. x1 != x2).
■ Example 1: N-Queens puzzle
  Problem: Given any integer N, place N queens on an N×N chessboard satisfying the constraint that no two queens threaten each other.
  Solution: To model this problem:
  ◊ Assume that each queen is in a different column;
  ◊ Assign a variable Ri (i = 1 to N) to the queen in the i-th column, indicating the position of the queen in its row;
  ◊ Apply the "no-threatening" constraints between each pair of queens Ri and Rj, and develop the algorithm.
  ◊ Example: the 8-Queens puzzle.

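The column-variable model just described can be sketched with a simple backtracking search. The function names and structure below are my own; the constraints are exactly the "no-threatening" ones from the text (different rows, different diagonals):

```python
# N-Queens as a CSP: variable R[i] per column holds the queen's row.
# Constraints between every pair of columns: rows differ, and the row
# difference never equals the column difference (no shared diagonal).
def n_queens(n):
    """Return one assignment (row index per column) satisfying all constraints, or None."""
    def consistent(rows, r):
        c = len(rows)  # column of the queen being placed
        return all(r != rj and abs(r - rj) != c - cj
                   for cj, rj in enumerate(rows))

    def extend(rows):
        if len(rows) == n:
            return rows
        for r in range(n):
            if consistent(rows, r):
                result = extend(rows + [r])
                if result:
                    return result
        return None       # dead end: backtrack

    return extend([])

print(n_queens(8))   # one valid placement, e.g. [0, 4, 7, 5, 2, 6, 1, 3]
```

Checking constraints as each variable is assigned (rather than generating a full board first) is what distinguishes constraint-satisfaction search from blind generate-and-test.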

[Figure: two distinct solutions of the 8-Queens puzzle shown on a-h / 1-8 chessboards, labelled "Unique solution 1" and "Unique solution 2".]

iv. Tree Searching

■ Many problems (e.g. goal reduction, constraint networks) can be described in the form of a search tree. A solution to the problem is obtained by finding a path through this tree.
■ A search through the entire tree, until a satisfactory path is found, is called exhaustive search.
■ Tree search strategies:
  ◊ Depth-first search
    * Assumes any one path is as good as any other path.
    * At each node, pick an arbitrary path and work forward until a solution is found or a dead end is reached.
    * In the case of a dead end, backtrack to the last node in the tree where a previously unexplored path branches off, and test this path.
  ◊ Breadth-first search
    * Look for a solution amongst all nodes at a given level before proceeding to the next.
  ◊ Hill climbing
    * Like depth-first search, but involving some quantitative decision on the "most likely" path to follow at each node.
  ◊ Best-first search
    * Like beam search, but only proceeding from one "most likely" node at each level.

v. Rule Based Systems (RBS)

Rule-based systems are a simple and successful AI technique.
§ Rules are of the form: IF <condition> THEN <action>.
§ Rules are often arranged in hierarchies ("and/or" trees).
§ When all conditions of a rule are satisfied, the rule is triggered.

vi. Generate and Test (GT)

* The generate-and-test method first guesses a solution and then tests whether this solution is correct, i.e. whether it satisfies the constraints.
◊ This paradigm involves two processes:
  – a Generator to enumerate possible solutions (hypotheses);
  – a Test to evaluate each proposed solution.
◊ The algorithm cycles between the two: the generator proposes a candidate, the test checks it, and the cycle stops when the test reports satisfaction.

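The IF-THEN rule machinery described above can be sketched with forward chaining: a rule fires when all of its conditions are present in working memory, and firing adds the rule's action as a new fact. The rules and facts below are invented purely for illustration:

```python
# A minimal rule-based system: each rule is (set of conditions, action).
# Forward chaining repeatedly triggers any rule whose conditions are all
# satisfied by working memory, until no new facts can be added.
RULES = [
    ({"has_fur", "says_meow"}, "is_cat"),
    ({"is_cat"}, "is_mammal"),
    ({"is_mammal"}, "is_animal"),
]

def forward_chain(facts):
    """Return the closure of the initial facts under the rules."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, action in RULES:
            if conditions <= facts and action not in facts:
                facts.add(action)   # all conditions satisfied: rule triggers
                changed = True
    return facts

print(sorted(forward_chain({"has_fur", "says_meow"})))
# ['has_fur', 'is_animal', 'is_cat', 'is_mammal', 'says_meow']
```

Chained firings like this (cat implies mammal implies animal) are exactly the rule hierarchies the text mentions.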


Areas of Artificial Intelligence

Applications of AI
You can buy machines that can play master-level chess for a few hundred dollars. There is some AI in them, but they play well against people mainly through brute-force computation, looking at hundreds of thousands of positions. To beat a world champion by brute force and known reliable heuristics requires being able to look at 200 million positions per second.

i. Speech recognition
a. A process of converting a speech signal into a sequence of words.
b. In the 1990s, computer speech recognition reached a practical level for limited purposes.
c. Recognizing speech with computers is quite convenient, but most users still find the keyboard and the mouse more convenient.
d. Typical usages are:
  ◊ Voice dialing (call home),
  ◊ Call routing (collect calls),
  ◊ Data entry (credit card numbers),
  ◊ Speaker recognition.

ii. Game playing
a. Games are interactive computer programs, an emerging area in which the goals of human-level AI are pursued.
b. Games are made by creating human-level artificially intelligent entities, e.g. enemies, partners, and support characters that act just like humans.

iii. Understanding natural language
Just getting a sequence of words into a computer is not enough. Parsing sentences is not enough either. The computer has to be provided with an understanding of the domain the text is about, and this is presently possible only for very limited domains.

iv. Computer vision
a. Computer vision is a combination of concepts, techniques and ideas from digital image processing, pattern recognition, artificial intelligence and computer graphics.
b. The world is composed of 3-D objects, but the inputs to the human eye and to computers' TV cameras are 2-D.
c. Some useful programs can work solely in 2-D, but full computer vision requires partial 3-D information that is not just a set of 2-D views.
d. At present there are only limited ways of representing 3-D information directly, and they are not as good as what humans evidently use.
e. Examples:
  ◊ Face recognition: the programs in use by banks.
  ◊ Autonomous driving: the ALVINN system autonomously drove a van from Washington, D.C. to San Diego, averaging 63 mph, day and night, in all weather conditions.
  ◊ Other usages: handwriting recognition, baggage inspection, manufacturing inspection, photo interpretation, etc.

v. Expert systems
A "knowledge engineer" interviews experts in a certain domain and tries to embody their knowledge in a computer program for carrying out some task. How well this works depends on whether the intellectual mechanisms required for the task are within the present state of AI; when this turned out not to be so, there were many disappointing results. One of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of the blood and suggested treatments. It did better than medical students or practicing doctors, provided its limitations were observed. Namely, its ontology included bacteria, symptoms, and treatments
Compiled By: Bhawana Bam Email: bhawana70003@gmail.com

and did not include patients, doctors, hospitals, death, recovery, and events occurring in time. Its interactions depended on a single patient being considered. Since the experts consulted by the knowledge engineers knew about patients, doctors, death, recovery, etc., it is clear that the knowledge engineers forced what the experts told them into a predetermined framework.

vi. Heuristic classification
One of the most feasible kinds of expert system, given the present knowledge of AI, is one that puts some information into one of a fixed set of categories using several sources of information. An example is advising whether to accept a proposed credit card purchase. Information is available about the owner of the credit card, his record of payment, the item he is buying, and the establishment from which he is buying it (e.g., whether there have been previous credit card frauds at this establishment).

Foundations of AI

Different fields have contributed to AI in the form of ideas, viewpoints and techniques.
Philosophy: logic, reasoning, mind as a physical system, foundations of learning, language and rationality.
Mathematics: formal representation and proof, algorithms, computation, undecidability, intractability, probability.
Psychology: adaptation, phenomena of perception and motor control.
Economics: formal theory of rational decisions, game theory.
Linguistics: knowledge representation, grammar.
Neuroscience: physical substrate for mental activities.
Control theory: homeostatic systems, stability, optimal agent design.

Advantages of Artificial Intelligence:

Artificial intelligence is complex in nature. It uses a very complicated mixture of computer science, mathematics and other complex sciences. Complex programming helps these machines replicate the cognitive abilities of human beings.

1. Error Reduction:
Artificial intelligence helps us reduce errors and reach accuracy with a greater degree of precision. It is applied in various fields such as the exploration of space. Intelligent robots are fed with information and are sent to explore space. Since they are machines with metal bodies, they are more resistant and have a greater ability to endure space and hostile atmospheres. They are created and acclimatized in such a way that they cannot be modified, disfigured or broken down in a hostile environment.

2. Difficult Exploration:
Artificial intelligence and the science of robotics can be put to use in mining and other fuel exploration processes. These complex machines can also be used for exploring the ocean floor, and hence overcome human limitations. Due to their programming, robots can perform more laborious and hard work with greater responsibility. Moreover, they do not wear out easily.

3. Daily Application:
Computed methods for automated reasoning, learning and perception have become a common phenomenon in our everyday lives. We have Siri or Cortana to help us out, and we hit the road for long drives and trips with the help of GPS. The smartphone is an apt everyday example of how we use artificial intelligence: utilities can predict what we are going to type and correct human errors in spelling. That is machine intelligence at work.
When we take a picture, an artificial intelligence algorithm identifies and detects the person's face and tags the individuals when we post our photographs on social media sites. Artificial intelligence is also widely employed by financial and banking institutions to organize and manage data, and fraud detection uses artificial intelligence in smart-card-based systems.

4. Digital Assistants:
Highly advanced organizations use "avatars", replicas or digital assistants which can actually interact with users, thus saving the need for human resources. For artificial thinkers, emotions do not come in the way of rational thinking and are not a distraction at all. The complete absence of an emotional side makes robots think logically and take the right program decisions. Emotions are associated with moods that can cloud judgment and affect human efficiency; this is completely ruled out for machine intelligence.
5. Repetitive Jobs:
Repetitive jobs which are monotonous in nature can be carried out with the help of machine intelligence. Machines think faster than humans and can be put to multi-tasking. Machine intelligence can be employed to carry out dangerous tasks, and their parameters, unlike those of humans, can be adjusted. Their speed and time are calculation-based parameters only.
When humans play a computer game or run a computer-controlled robot, we are actually interacting with artificial intelligence. In the game we are playing, the computer is our opponent: the machine intelligence plans the game's moves in response to our moves. We can consider gaming to be the most common use of the benefits of artificial intelligence.

6. Medical Applications:
In the medical field also we find wide application of AI. Doctors assess patients and their health risks with the help of artificial machine intelligence, and it educates them about the side effects of various medicines. Medical professionals are often trained with artificial surgery simulators. AI finds huge application in detecting and monitoring neurological disorders, as it can simulate brain functions. Robotics is often used to help mental-health patients come out of depression and remain active. A popular application of artificial intelligence is radiosurgery, which is used in operating on tumours and can actually help in the operation without damaging the surrounding tissues.

7. No Breaks:
Machines, unlike humans, do not require frequent breaks and refreshments. They are programmed for long hours and can perform continuously without getting bored, distracted or even tired.

Disadvantages of Artificial Intelligence:

1. High Cost:
Creation of artificial intelligence requires huge costs, as these are very complex machines. Their repair and maintenance also require huge costs. They have software programs which need frequent upgrading to cater to the needs of the changing environment and the need for the machines to be smarter by the day.

2. No Replicating Humans:
Intelligence is believed to be a gift of nature, and an ethical argument continues over whether human intelligence should be replicated or not. Machines do not have emotions or moral values. They perform what is programmed and cannot make judgments of right or wrong. They cannot even take decisions if they encounter a situation unfamiliar to them: they either perform incorrectly or break down in such situations.

3. No Improvement with Experience:
Unlike humans, artificial intelligence cannot improve with experience. With time, it can lead to wear and tear. It stores a lot of data, but the way the data can be accessed and used is very different from human intelligence. Machines are unable to alter their responses to changing environments. We are constantly confronted by the question of whether it is really desirable to replace humans with machines. In the world of artificial intelligence there is nothing like working wholeheartedly or passionately; care and concern are not present in the machine-intelligence dictionary. There is no sense of belonging or togetherness or a human touch, and machines fail to distinguish between a hardworking individual and an inefficient individual.

4. No Original Creativity:
Do you want creativity or imagination? These are not the forte of artificial intelligence. While machines can help you design and create, they are no match for the power of thinking that the human brain has, or even the originality of a creative mind. Human beings are highly sensitive and emotional intellectuals: they see, hear, think and feel, and their thoughts are guided by feelings, which machines completely lack. The inherent intuitive abilities of the human brain cannot be replicated.

CHAPTER - 2
Intelligent Agents

2.1 Introduction
· An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
· An AI system is composed of an agent and its environment. Agents act in their environment, and the environment may contain other agents.
Examples:
· A human agent has sensory organs such as eyes, ears, nose, tongue and skin as its sensors, and other organs such as hands, legs and mouth as effectors.
· A robotic agent has cameras and infrared range finders for sensors, and various motors and actuators for effectors.
· A software agent has encoded bit strings as its percepts and actions.

Fig: Architecture of an agent

What do we mean by sensors/percepts and effectors/actions?
For humans:
– Sensors: eyes (vision), ears (hearing), skin (touch), tongue (gustation), nose (olfaction), neuromuscular system (proprioception).
– Percepts: at the lowest level, electrical signals from these sensors; after preprocessing, objects in the visual field (location, textures, colors, ...), auditory streams (pitch, loudness, direction), ...
– Effectors or actuators: limbs, digits, eyes, tongue, ...
– Actions: lift a finger, turn left, walk, run, carry an object, ...

2.2 Properties of the agent

i. Rationality:
A rational agent is one which chooses the action which will make it most successful. To complete that definition we need a way of measuring how successful a course of action is; the criterion for measuring the degree of success is the performance measure, and it varies from agent to agent. The rationality of an agent is restricted by its percepts, since it can only respond to its percept sequence, that is, the history of things it has sensed from the environment. It is also limited by its effectors (motors in the case of robots).
Ideal rational agent: "For each possible percept sequence, an ideal rational agent should do whatever action is expected to maximize its performance measure, on the basis of the evidence provided by the sensors and the built-in knowledge the agent has."

ii. Autonomy: "A system is autonomous to the extent that its behavior is determined by its own experience." If actions depend entirely on built-in knowledge, without considering percepts, the agent lacks autonomy. Autonomy is achieved by giving the agent built-in knowledge together with the ability to learn, just as in nature animals have instincts but also learn from the environment. A truly autonomous intelligent agent should be able to operate successfully in a wide variety of environments, given sufficient time to adapt.

iii. Flexibility: An intelligent agent is a computer system capable of flexible action in some dynamic environment.

iv. Reactiveness: In order to define reactivity we shall first define the notion of a logical agent and consider what knowledge-based agents are.

Concept of Logical Agent

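One way to make the percept/action/performance-measure vocabulary above concrete is the classic two-square vacuum world, a standard textbook toy that these notes do not describe, so everything below is an assumed illustration. The agent perceives (location, dirty?) and the performance measure counts squares cleaned:

```python
import random

# A simple reflex agent in a two-square vacuum world (squares "A" and "B").
# The agent function maps the current percept to an action; the environment
# loop applies the action and scores cleaned squares as the performance measure.
def reflex_vacuum_agent(percept):
    """Agent function: map a (location, dirty) percept to an action."""
    location, dirty = percept
    if dirty:
        return "Suck"
    return "Right" if location == "A" else "Left"

def run(steps=10, seed=0):
    """Run the agent-environment loop and return (score, final world)."""
    rng = random.Random(seed)
    world = {"A": rng.random() < 0.5, "B": rng.random() < 0.5}  # True = dirty
    location, score = "A", 0
    for _ in range(steps):
        action = reflex_vacuum_agent((location, world[location]))
        if action == "Suck" and world[location]:
            world[location] = False
            score += 1            # performance measure: squares cleaned
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return score, world

score, world = run()
print(score, world)
```

Note how rationality here is judged from the outside, by the score the environment assigns, not by any internal notion the agent has of doing well.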

“An agent that can form representations of the world, use a process of inference to 2.5 Properties / types of environment
derive new references of the world and use these new representatives to decide what to do.
The range of task environments that might arise in Ai is obviouslyvast. We can, however, identifya

v. Proactiveness: fairlysmall number of dimensions along which task environments can be categorized. These

– Generating and attempting to achieving goals dimensions determine, to a large extent, the appropriate agent design and the applicability of each of

– Executing actions /giving advice/making recommendations /making the principal families of techniques for agent implementation.

suggestions without an explicit user request.


– Exhibit goal-directed behavior.

2.3 Agent Terminology

· Performance Measure of Agent − It is the criterion which determines how successful an agent is.
· Behavior of Agent − It is the action that the agent performs after any given sequence of percepts.
· Percept − It is the agent's perceptual input at a given instance.
· Percept Sequence − It is the history of all that the agent has perceived till date.
· Agent Function − It is a map from the percept sequence to an action.

2.4 The Nature of Environments

An environment is everything in the world which surrounds the agent, but it is not a part of the agent itself. An environment can be described as a situation in which an agent is present.

The environment is where the agent lives and operates, and it provides the agent with something to sense and act upon. An environment is mostly said to be non-deterministic.

In contrast, some software agents (software robots or softbots) exist in rich, unlimited softbot domains. The simulator has a very detailed, complex environment. The software agent needs to choose from a long array of actions in real time. A softbot designed to scan the online preferences of the customer and show interesting items to the customer works in the real as well as an artificial environment.

The most famous artificial environment is the Turing Test environment, in which one real and one artificial agent are tested on equal ground. This is a very challenging environment, as it is highly difficult for a software agent to perform as well as a human.

The properties of environments include:

· Fully Observable / Partially Observable − If it is possible to determine the complete state of the environment at each time point from the percepts, it is fully observable (e.g. image recognition); otherwise it is only partially observable (e.g. self-driving).
· Static / Dynamic − If the environment does not change while an agent is acting, then it is static (e.g. chess); otherwise it is dynamic (e.g. taxi driving).
· Single agent / Multiple agents − The environment may contain other agents which may be of the same or different kind as that of the agent. For example, an agent solving a crossword puzzle by itself is clearly in a single-agent environment, whereas an agent playing chess is in a two-agent environment.
· Accessible / Inaccessible − If the agent's sensory apparatus can have access to the complete state of the environment, then the environment is accessible to that agent (e.g. chess); otherwise it is inaccessible (e.g. a medical diagnosis system).
· Deterministic / Non-deterministic − If the next state of the environment is completely determined by the current state and the actions of the agent, then the environment is deterministic (e.g. chess); otherwise it is non-deterministic (e.g. taxi driving).
· Episodic / Non-episodic − In an episodic environment, each episode consists of the agent perceiving and then acting. The quality of its action depends just on the episode itself; subsequent episodes do not depend on the actions in previous episodes. Episodic environments are much simpler because the agent does not need to think ahead (e.g. image analysis). In a non-episodic environment, the current decision could affect all future decisions (e.g. chess).
· Discrete / Continuous − If there are a limited number of distinct, clearly defined states of the environment, the environment is discrete (e.g. chess); otherwise it is continuous (e.g. driving).
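The agent function above (a map from percept sequences to actions) can be sketched in Python as a small table-driven agent. The percepts and table entries below are illustrative assumptions for a two-square vacuum world, not taken from the text:

```python
# A minimal table-driven agent: the agent function is a lookup table
# from percept sequences to actions (illustrative vacuum-world percepts).
def make_table_driven_agent(table):
    percepts = []  # the percept sequence observed so far

    def agent(percept):
        percepts.append(percept)
        # The agent function maps the whole percept history to an action.
        return table.get(tuple(percepts), "NoOp")

    return agent

# Hypothetical table for a two-square vacuum world.
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

agent = make_table_driven_agent(table)
print(agent(("A", "Clean")))   # Right
print(agent(("B", "Dirty")))   # Suck
```

Note that the table grows with the length of the percept sequence, which is exactly why table-driven agents are impractical and more compact agent programs (reflex, model-based, and so on) are used instead.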
[Compiled By: Bhawana Bam – Artificial Intelligence (Email: bhawana70003@gmail.com)]
https://bit-papers.blogspot.com/ ©17171R
Environment                  Accessible   Deterministic   Episodic   Static   Discrete
Chess with a clock           Yes          Yes             No         Semi     Yes
Poker                        No           No              No         Yes      Yes
Taxi driving                 No           No              No         No       No
Medical diagnosis system     No           No              No         No       No
Image analysis system        Yes          Yes             Yes        Semi     No
Interactive English tutor    No           No              No         No       Yes

2.6 Rationality and Omniscience

Rationality:
Rationality is nothing but the status of being reasonable, sensible, and having a good sense of judgment.
Rationality is concerned with expected actions and results depending upon what the agent has perceived. Performing actions with the aim of obtaining useful information is an important part of rationality.
What is rational at any given time depends on four things:
· The performance measure that defines the criterion of success.
· The agent's prior knowledge of the environment.
· The actions that the agent can perform.
· The agent's percept sequence to date.
This leads to the definition of a rational agent:
For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has.

Omniscience:
An omniscient agent knows the actual outcome of its actions and can act accordingly; but omniscience is impossible in reality.

2.7 Structure of Agents/Types of Agents

Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better actions over time. These are given below:

1. Simple Reflex Agents

· The simple reflex agents are the simplest agents. These agents take decisions on the basis of the current percepts and ignore the rest of the percept history.
· These agents only succeed in a fully observable environment.
· The simple reflex agent does not consider any part of the percept history during its decision and action process.
· The simple reflex agent works on the condition-action rule, which means it maps the current state to an action. For example, a room-cleaner agent works only if there is dirt in the room.
· Problems with the simple reflex agent design approach:
  • They have very limited intelligence.
  • They do not have knowledge of non-perceptual parts of the current state.
  • The rule tables are mostly too big to generate and to store.
  • They are not adaptive to changes in the environment.
· For example, the vacuum agent is a simple reflex agent, because its decision is based only on the current location and on whether that location contains dirt.

Condition-Action Rule − It is a rule that maps a state (condition) to an action.
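The simple reflex vacuum agent described above can be sketched in a few lines of Python, assuming two locations A and B; the condition-action rules mirror the ones in the text:

```python
# Simple reflex vacuum agent: acts only on the current percept
# (location, status), ignoring all percept history.
def simple_reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":    # condition-action rule: dirt -> Suck
        return "Suck"
    elif location == "A":    # clean at A -> move Right
        return "Right"
    else:                    # clean at B -> move Left
        return "Left"

print(simple_reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(simple_reflex_vacuum_agent(("B", "Clean")))  # Left
```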
function SIMPLE-REFLEX-AGENT(percept) returns an action
    persistent: rules, a set of condition-action rules
    state ← INTERPRET-INPUT(percept)
    rule ← RULE-MATCH(state, rules)
    action ← rule.ACTION
    return action

Fig: A simple reflex agent. It acts according to a rule whose condition matches the current state, as defined by the percept.

2. Model Based Reflex Agents

o The model-based agent can work in a partially observable environment and track the situation.
o A model-based agent has two important factors:
  o Model: It is knowledge about "how things happen in the world," so it is called a model-based agent.
  o Internal State: It is a representation of the current state based on percept history.
o These agents have the model, "which is knowledge of the world," and based on the model they perform actions.
o Updating the agent state requires information about:
  a. How the world evolves.
  b. How the agent's actions affect the world.

They use a model of the world to choose their actions. They maintain an internal state.

Model − knowledge about "how the things happen in the world".

Internal State − It is a representation of unobserved aspects of the current state depending on percept history.

function REFLEX-AGENT-WITH-STATE(percept) returns an action
    static: state, a description of the current world state
            rules, a set of condition-action rules
            action, the most recent action, initially none
    state ← UPDATE-STATE(state, action, percept)
    rule ← RULE-MATCH(state, rules)
    action ← RULE-ACTION[rule]
    return action

3. Goal Based Agents

· The knowledge of the current state of the environment is not always sufficient for an agent to decide what to do.
· The agent needs to know its goal, which describes desirable situations.
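The REFLEX-AGENT-WITH-STATE pseudocode can be sketched in Python. The world model and rules below are illustrative assumptions for a two-square vacuum world, not taken from the text:

```python
# Model-based reflex vacuum agent: keeps an internal state (the believed
# dirt status of each square) that is updated from the percept history.
class ModelBasedVacuumAgent:
    def __init__(self):
        # Internal state: believed status of each location (unknown at first).
        self.model = {"A": None, "B": None}

    def act(self, percept):
        location, status = percept
        self.model[location] = status       # UPDATE-STATE from the percept
        if status == "Dirty":
            self.model[location] = "Clean"  # model: Suck cleans the square
            return "Suck"
        if all(v == "Clean" for v in self.model.values()):
            return "NoOp"                   # believes everything is clean
        return "Right" if location == "A" else "Left"

agent = ModelBasedVacuumAgent()
print(agent.act(("A", "Dirty")))  # Suck
print(agent.act(("A", "Clean")))  # Right (B's status is still unknown)
```

Unlike the simple reflex agent, this agent can stop acting once its internal state says both squares are clean, even though it only perceives one square at a time.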
· Goal-based agents expand the capabilities of the model-based agent by having the "goal" information.
· They choose an action so that they can achieve the goal.
· These agents may have to consider a long sequence of possible actions before deciding whether the goal is achieved or not. Such consideration of different scenarios is called searching and planning, which makes an agent proactive.

Goal − It is the description of desirable situations.

4. Utility Based Agents

An agent generates a goal state with high-quality behavior (utility); that is, if more than one sequence exists to reach the goal state, then the sequence that is more reliable, safer, quicker, and cheaper than the others is selected.

o These agents are similar to the goal-based agent but provide an extra component of utility measurement, which makes them different by providing a measure of success at a given state.
o Utility-based agents act based not only on goals but also on the best way to achieve the goal.
o The utility-based agent is useful when there are multiple possible alternatives and an agent has to choose in order to perform the best action.
o The utility function maps each state to a real number to check how efficiently each action achieves the goals.

Goals are inadequate when −
· There are conflicting goals, out of which only a few can be achieved.
· Goals have some uncertainty of being achieved and you need to weigh the likelihood of success against the importance of a goal.

5. Learning Agents

o A learning agent in AI is the type of agent which can learn from its past experiences, or it has learning capabilities.
o It starts to act with basic knowledge and then is able to act and adapt automatically through learning.
o A learning agent has mainly four conceptual components, which are:
  a. Learning element: It is responsible for making improvements by learning from the environment.
  b. Critic: The learning element takes feedback from the critic, which describes how well the agent is doing with respect to a fixed performance standard.
  c. Performance element: It is responsible for selecting external actions.
  d. Problem generator: This component is responsible for suggesting actions that will lead to new and informative experiences.
o Hence, learning agents are able to learn, analyze performance, and look for new ways to improve the performance.
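The utility function idea (mapping each state to a real number and picking the best-scoring alternative) can be sketched as follows. The candidate routes, their attributes, and the weights are illustrative assumptions:

```python
# Utility-based choice: score each alternative path to the goal with a
# utility function and pick the one with the highest utility.
def utility(route):
    # Hypothetical weights: prefer safe, quick, cheap routes.
    return 5 * route["safety"] - 2 * route["time"] - 1 * route["cost"]

routes = [
    {"name": "highway", "safety": 0.9, "time": 1.0, "cost": 3.0},
    {"name": "backroad", "safety": 0.6, "time": 2.0, "cost": 1.0},
]

best = max(routes, key=utility)
print(best["name"])  # highway
```

Both routes reach the goal; the utility function is what lets the agent prefer the safer, quicker one rather than treating all goal-reaching sequences as equally good.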
2.8 PEAS: Agents and Environments

What's an agent?

Agents include humans, robots, softbots, thermostats, etc. The agent function maps from percept histories to actions:
f : P* → A
The agent program runs on the physical architecture to produce f.

PEAS stands for Performance, Environment, Actuators, and Sensors. Based on these properties, agents can be grouped together or differentiated from each other. Each agent has the following properties defined for it:

Performance: The output which we get from the agent. All the necessary results that an agent gives after processing come under its performance.
Environment: All the surrounding things and conditions of an agent fall in this section. It basically consists of all things under which the agent works.
Actuators: The devices, hardware, or software through which the agent performs any actions or processes any information to produce a result.
Sensors: The devices or components through which the agent observes and perceives its environment.

Agent Type: Taxi driver
  Performance Measure: Be safe, reach destination, maximize profits, obey laws, . . .
  Environment: Urban streets, freeways, traffic, pedestrians, weather, customers, . . .
  Actuators: Steering wheel, accelerator, brake, horn
  Sensors: Video, accelerometers, gauges, engine sensors, keyboard, GPS

Figure: PEAS description of the task environment for an automated taxi.

Agent Type: Medical diagnosis system
  Performance Measure: Healthy patient, reduced costs
  Environment: Patient, hospital, staff, pharmacy
  Actuators: Display of questions, tests, diagnoses, treatments, referrals
  Sensors: Keyboard entry of symptoms, findings, patient's answers

Agent Type: Self-driving car
  Performance Measure: Safety, time, legal drive, comfort
  Environment: Roads, other cars, pedestrians, traffic signs, etc.
  Actuators: Steering, accelerator, brake, horn, etc.
  Sensors: Camera, sonar, GPS, speedometer, accelerometer, engine sensors, keyboard, etc.

Agent Type: Interactive English tutor
  Performance Measure: Student's score on test
  Environment: Set of students, testing agency
  Actuators: Display of exercises, suggestions, corrections
  Sensors: Keyboard entry

Fig: Examples of agent types and their PEAS descriptions.
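A PEAS description is just a four-part record, so it can be captured as a simple data type; the taxi entries below are taken from the automated-taxi figure in this section, and the class itself is my own sketch:

```python
from dataclasses import dataclass, field

# A PEAS record: Performance measure, Environment, Actuators, Sensors.
@dataclass
class PEAS:
    agent_type: str
    performance: list = field(default_factory=list)
    environment: list = field(default_factory=list)
    actuators: list = field(default_factory=list)
    sensors: list = field(default_factory=list)

taxi = PEAS(
    agent_type="Taxi driver",
    performance=["be safe", "reach destination", "maximize profits", "obey laws"],
    environment=["urban streets", "freeways", "traffic", "pedestrians",
                 "weather", "customers"],
    actuators=["steering wheel", "accelerator", "brake", "horn"],
    sensors=["video", "accelerometers", "gauges", "engine sensors",
             "keyboard", "GPS"],
)

print(taxi.agent_type, "-", len(taxi.sensors), "sensors")
```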
CHAPTER - 3
Problem Solving Using Searching

3.1 Introduction

– A problem is a situation which is experienced by an agent. A problem is solved by a sequence of actions that reduce the difference between the initial situation and the goal.
– Problem solving is a mental process that involves defining a problem; determining the cause of the problem; identifying, prioritizing, and selecting alternatives for a solution; and implementing a solution.
– Problem solving is an agent-based system that finds a sequence of actions that lead to desirable states from the initial state.
– The four steps of problem solving are:
  i. Goal Formulation: Helps to organize behavior by isolating and representing the task knowledge necessary to solve the problem.
  ii. Problem Formulation: Define the problem precisely with initial states, final state, and acceptable solutions.
  iii. Searching: Find the most appropriate technique or sequence among all possible techniques.
  iv. Execution: Once the search algorithm returns a solution to the problem, the solution is then executed by the agent.

3.2 Problem Representation in AI

Various techniques have evolved for a variety of AI tasks. These techniques are concerned with how we represent, manipulate, and reason with knowledge in order to solve problems.

Techniques that behave as intelligent:
§ State space representation
§ Goal reduction by AND-OR graph
§ Constraint satisfaction
§ Tree searching
§ Generate and Test
§ Rule based system

Biology-inspired AI techniques:
§ Neural Networks
§ Genetic Algorithms
§ Reinforcement Learning

3.3 Problem Formulation and Problem Definition

Problem formulation means choosing a relevant set of states to consider and a feasible set of operators for moving from one state to another.

Search is the process of considering various possible sequences of operators applied to the initial state and finding out the sequence which culminates in the goal state.

A search problem consists of the following:
S: the full set of states
S0: the initial state
A: S → S, a set of operators
G: the set of final states; G is a subset of S

The search problem is to find the sequence of actions which transforms the agent from the initial state to the goal state.

A problem can be defined formally by five components:
1. The initial state that the agent starts in.
2. A description of the possible actions available to the agent.
3. A description of what each action does; the formal name for this is the transition model. Together, the initial state, actions, and transition model implicitly define the state space of the problem — the set of all states reachable from the initial state by any sequence of actions. The state space forms a directed network or graph in which the nodes are states and the links between nodes are actions. A path in the state space is a sequence of states connected by a sequence of actions.
4. The goal test, which determines whether a given state is a goal state. Sometimes there is an explicit set of possible goal states, and the test simply checks whether the given state is one of them.
5. A path cost function that assigns a numeric cost to each path. The problem-solving agent chooses a cost function that reflects its own performance measure.

Problem formulation is the process of deciding what actions and states to consider, given a goal. This activity is aimed at identifying a problem by specifying:
§ The undesirable and problematic state currently occupied,
§ The resources currently available to move away from that problematic state, particularly the available courses of action, the combinatorial constraints on using them, etc., and
§ The criteria that need to be satisfied to say that a problem no longer exists or is solved.
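The five components above map naturally onto a small Python class; this is a generic sketch (the class and method names are my own, not from the text):

```python
# A search problem defined by its five components: initial state, actions,
# transition model, goal test, and path cost.
class Problem:
    def __init__(self, initial, goal):
        self.initial = initial
        self.goal = goal

    def actions(self, state):            # possible actions in `state`
        raise NotImplementedError

    def result(self, state, action):     # transition model
        raise NotImplementedError

    def goal_test(self, state):          # is `state` a goal state?
        return state == self.goal

    def step_cost(self, state, action):  # path cost = sum of step costs
        return 1

# Example: a trivial number-line problem, moving from 0 to 3 by +1/-1 steps.
class NumberLine(Problem):
    def actions(self, state):
        return [+1, -1]
    def result(self, state, action):
        return state + action

p = NumberLine(initial=0, goal=3)
print(p.goal_test(p.result(p.result(p.result(0, 1), 1), 1)))  # True
```

Any of the concrete problems in this chapter (vacuum world, 8-puzzle, water jug) can be expressed by subclassing in the same way.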
i. Toy problem
The first example we examine is the vacuum world. This can be formulated as a problem as follows:
· States: The state is determined by both the agent location and the dirt locations. The agent is in one of two locations, each of which might or might not contain dirt. Thus, there are 2 × 2² = 8 possible world states. A larger environment with n locations has n · 2ⁿ states.
· Initial state: Any state can be designated as the initial state.
· Actions: In this simple environment, each state has just three actions: Left, Right, and Suck. Larger environments might also include Up and Down.
· Transition model: The actions have their expected effects, except that moving Left in the leftmost square, moving Right in the rightmost square, and Sucking in a clean square have no effect.
· Goal test: This checks whether all the squares are clean.
· Path cost: Each step costs 1, so the path cost is the number of steps in the path.

3.4 State Space Representation of Problem

– A set of all possible states for a given problem is known as the state space for that problem. The major components of state space representation are: (a) initial state, (b) goal state, and (c) operators or legal moves.
– Many problems in AI take the form of state-space search. The states might be legal board configurations in a game, towns and cities in some sort of route map, collections of mathematical propositions, etc.
– The state-space is the configuration of the possible states and how they connect to each other, e.g. the legal moves between states.
– When we don't have an algorithm which tells us definitively how to negotiate the state-space, we need to search the state-space to find an optimal path from a start state to a goal state.

ii. 8 puzzle
Let us consider the problem of the 8-tile puzzle game. The puzzle consists of an 8-square frame and an empty slot. The tiles are numbered from 1 to 8. It is possible to move the tiles in the square field by moving a tile into the empty slot. The objective is to get the squares into numeric order.
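The vacuum-world state count above can be checked with a quick enumeration (a throwaway sketch):

```python
from itertools import product

# Vacuum world: agent location in {A, B}, each square Dirty or Clean.
states = [
    (loc, dirt_a, dirt_b)
    for loc, dirt_a, dirt_b in product("AB", [True, False], [True, False])
]
print(len(states))  # 8, i.e. 2 locations x 2^2 dirt configurations

# For n locations: n * 2**n states.
n = 5
print(n * 2**n)  # 160
```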

As shown in the figure, the tiles are in a jumbled fashion in the initial state and are arranged in sequenced order in the goal state. The operators of the problem can be represented as:
1. UP: If the empty slot is not touching the top frame, move it up.
2. DOWN: If the empty slot is not touching the bottom frame, move it down.
3. LEFT: If the empty slot is not touching the left frame, move it left.
4. RIGHT: If the empty slot is not touching the right frame, move it right.

The state space representation is highly beneficial in AI because it provides all possible states, the goal state, and the operations. If the entire state space representation for a problem is given, it is possible to trace a path from the initial state to the goal state and identify the sequence of operators necessary for doing it.

ii. State Spaces versus Search Trees

State Space:
§ All the possible moves of a problem
§ Set of valid states for a problem
§ Linked by operators
§ E.g., 20 valid states (cities) in the Romanian travel problem

Search Tree:
§ Data structure to search the state-space
§ Root node = initial state
§ Child nodes = states that can be visited from the parent
§ Note that the depth of the tree can be infinite, e.g., via repeated states
§ Partial search tree: portion of the tree that has been expanded so far
§ Fringe: leaves of the partial search tree, candidates for expansion

3.5 Some Well-Defined Problems

1. A Water Jug Problem
• You have a 4-gallon and a 3-gallon water jug.
• You have a faucet with an unlimited amount of water.
• You need to get exactly 2 gallons in the 4-gallon jug.

Puzzle-solving as Search:
§ State representation: (x, y), where x: contents of the 4-gallon jug and y: contents of the 3-gallon jug
§ Start state: (0, 0)
§ Goal state: (2, n)
§ Operators:
  – Fill 3-gallon from faucet, fill 4-gallon from faucet
  – Fill 3-gallon from 4-gallon, fill 4-gallon from 3-gallon
  – Empty 3-gallon into 4-gallon, empty 4-gallon into 3-gallon
  – Dump 3-gallon down drain, dump 4-gallon down drain

Production Rules for the Water Jug Problem

1.  if x < 4                 (x,y) → (4,y)            Fill the 4-gallon jug
2.  if y < 3                 (x,y) → (x,3)            Fill the 3-gallon jug
3.  if x > 0                 (x,y) → (x − d, y)       Pour some water out of the 4-gallon jug
4.  if y > 0                 (x,y) → (x, y − d)       Pour some water out of the 3-gallon jug
5.  if x > 0                 (x,y) → (0,y)            Empty the 4-gallon jug on the ground
6.  if y > 0                 (x,y) → (x,0)            Empty the 3-gallon jug on the ground
7.  if x + y ≥ 4 and y > 0   (x,y) → (4, y − (4 − x)) Pour water from the 3-gallon jug into the 4-gallon jug until the 4-gallon jug is full
8.  if x + y ≥ 3 and x > 0   (x,y) → (x − (3 − y), 3) Pour water from the 4-gallon jug into the 3-gallon jug until the 3-gallon jug is full
9.  if x + y ≤ 4 and y > 0   (x,y) → (x + y, 0)       Pour all the water from the 3-gallon jug into the 4-gallon jug
10. if x + y ≤ 3 and x > 0   (x,y) → (0, x + y)       Pour all the water from the 4-gallon jug into the 3-gallon jug
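Using the production rules above, a breadth-first search over (x, y) states finds a shortest solution mechanically; the helper names below are my own:

```python
from collections import deque

# BFS over water-jug states (x, y): x in the 4-gallon jug, y in the 3-gallon.
def successors(state):
    x, y = state
    return {
        (4, y), (x, 3),                            # fill a jug from the faucet
        (0, y), (x, 0),                            # empty a jug on the ground
        (min(4, x + y), y - (min(4, x + y) - x)),  # pour 3-gal into 4-gal
        (x - (min(3, x + y) - y), min(3, x + y)),  # pour 4-gal into 3-gal
    }

def solve(start=(0, 0), goal_x=2):
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state[0] == goal_x:          # goal test: 2 gallons in 4-gallon jug
            return path
        for nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None

path = solve()
print(path)
```

Because BFS expands states level by level, the first goal state found is reached in the minimum number of moves (six moves, i.e. a seven-state path, for this puzzle).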

One Solution to the Water Jug Problem

Gallons in the 4-Gallon Jug   Gallons in the 3-Gallon Jug   Rule Applied
0                             0                             –
0                             3                             2
3                             0                             9
3                             3                             2
4                             2                             7
0                             2                             5
2                             0                             9

2. The Traveling Salesperson Problem

§ A salesman must visit N cities.
§ Each city is visited exactly once, finishing at the city started from.
§ There is usually an integer cost c(a, b) to travel from city a to city b.
§ However, the total tour cost must be minimum, where the total cost is the sum of the individual costs of each city visited in the tour.
§ Given a road map of n cities, find the shortest tour which visits every city on the map exactly once and then returns to the original city (a Hamiltonian circuit).

3. 8-queens problem

• States: any arrangement of n ≤ 8 queens, or arrangements of n ≤ 8 queens in the leftmost n columns, one per column, such that no queen attacks any other.
• Initial state: no queens on the board.
• Actions: add a queen to any empty square, or add a queen to the leftmost empty square such that it is not attacked by the other queens.
• Goal test: 8 queens on the board, none attacked.
• Path cost: 1 per move.
• The 8-queens puzzle has 92 distinct solutions.

4. Tower of Hanoi with only 2 disks

Problem: Move the disks from the leftmost post to the rightmost post while
§ never putting a larger disk on top of a smaller one;
§ moving one disk at a time, from one peg to another;
§ the middle post can be used for intermediate storage.
Play the game in the smallest number of moves possible.

Solution: Search tree for the TOH problem.

3.6 Searching

- Searching is the process of finding the required states or nodes.
- Searching is performed through the state space.
- The search process is carried out by constructing a search tree.
- Search is a universal problem-solving technique.
- Search involves systematic trial-and-error exploration of alternative solutions.
- Many problems don't have a simple algorithmic solution. Casting these problems as search problems is often the easiest way of solving them.
- Search is useful when the sequence of actions required to solve a problem is not known:
  o Path-finding problems, e.g., eight puzzle, travelling salesman problem
  o Two-player games, e.g., chess and checkers
  o Constraint satisfaction problems, e.g., eight queens
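The 8-queens goal test ("none attacked") is easy to code with the standard one-queen-per-column representation, where board[i] is the row of the queen in column i (an illustrative sketch):

```python
# 8-queens goal test: board[i] is the row of the queen in column i.
def no_queen_attacked(board):
    n = len(board)
    for i in range(n):
        for j in range(i + 1, n):
            if board[i] == board[j]:               # same row
                return False
            if abs(board[i] - board[j]) == j - i:  # same diagonal
                return False
    return True

# A known solution to the 8-queens puzzle (one of the 92).
print(no_queen_attacked([0, 4, 7, 5, 2, 6, 1, 3]))  # True
print(no_queen_attacked([0, 1, 2, 3, 4, 5, 6, 7]))  # False (all on a diagonal)
```

Because each queen sits in its own column by construction, only rows and diagonals need to be checked.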

Steps in Searching
1. Check whether the current state is the goal state or not.
2. Expand the current state to generate new sets of states.
3. Choose one of the new states generated for search, depending upon the search strategy.
4. Repeat steps 1 to 3 until the goal state is reached or there are no more states to be expanded.

Performance Measure of the Algorithm
• Strategies are evaluated along the following dimensions:
  o Completeness: does it guarantee to find a solution if there is any?
  o Optimality: does it always find the highest-quality (least-cost) solution?
  o Time complexity: how long does it take to find a solution?
  o Space complexity: how much memory does it need to perform the search?
• Time and space complexity are measured in terms of
  b: maximum branching factor of the search tree
  d: depth of the least-cost solution
  m: maximum depth of the state space (may be ∞)

Types of Search/Searching strategies:
• A search strategy is defined by picking the order of node expansion.
  i. Uninformed (Blind) Search
  ii. Informed (Heuristic) Search

Uninformed search strategies
- These types of search strategies are provided with the problem definition and don't have additional information about the state space.
- They can only expand the current state to get a new set of states and distinguish a goal state from a non-goal state.
- The uninformed search does not contain any domain knowledge such as closeness or the location of the goal. It operates in a brute-force way, as it only includes information about how to traverse the tree and how to identify leaf and goal nodes.
- It examines each node of the tree until it achieves the goal node.
- Less effective than informed search.
- Distinguishes a goal state from a non-goal state.

It can be divided into five main types:
  o Breadth-first search
  o Uniform cost search
  o Depth-first search
  o Iterative deepening depth-first search
  o Bidirectional search

i. Breadth First Search

· Breadth-first search is the most common search strategy for traversing a tree or graph. This algorithm searches breadthwise in a tree or graph, so it is called breadth-first search.
· The BFS algorithm starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of the next level.
· The breadth-first search algorithm is an example of a general-graph search algorithm.
· Breadth-first search is implemented using a FIFO queue data structure.
· It proceeds level by level down the search tree.
· Starting from the root node (initial state), it explores all children of the root node, left to right.
· If no solution is found, it expands the first (leftmost) child of the root node, then expands the second node at depth 1, and so on.
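A minimal BFS over an explicit graph, using the FIFO queue described above; the example graph is an assumption chosen to match the "find a path from A to D" example:

```python
from collections import deque

# Breadth-first search: expand the shallowest unexpanded node first (FIFO).
def bfs(graph, start, goal):
    frontier = deque([[start]])      # queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                frontier.append(path + [child])
    return None

# Illustrative graph matching the "find a path from A to D" example.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(bfs(graph, "A", "D"))  # ['A', 'B', 'D']
```

Because paths leave the queue in the order they were generated, the first path returned uses the fewest steps.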
• Process:
  • Place the start node in the queue.
  • Examine the node at the front of the queue.
    • If the queue is empty, stop.
    • If the node is the goal, stop.
    • Otherwise, add the children of the node to the end of the queue.
• Expand the shallowest unexpanded node; the fringe is implemented as a FIFO queue.

Advantages:
  o BFS will provide a solution if any solution exists.
  o If there is more than one solution for a given problem, then BFS will provide the minimal solution, which requires the least number of steps.

Disadvantages:
  o It requires lots of memory, since each level of the tree must be saved into memory to expand the next level.
  o BFS needs lots of time if the solution is far away from the root node.

BFS example (Find a path from A to D):
Put the start node in the queue, then examine the first element of the queue. If it is the goal, stop; otherwise put its children in the queue.

Properties of Breadth-First Search
• Completeness: Complete if the goal node is at a finite depth.
• Optimality: It is guaranteed to find the shortest path.
• Time complexity: O(b^(d+1))
• Space complexity: O(b^(d+1))

Weakness of BFS
• High time and memory requirements.

ii. Depth First Search

- DFS proceeds down a single branch of the tree at a time.
- It expands the root node, then the leftmost child of the root node, then the leftmost child of that node, etc.
- It always expands a node at the deepest level of the tree.
- Only when the search hits a dead end (a partial solution which can't be extended) does the search backtrack and expand nodes at higher levels.
• Process: Use a stack to keep track of nodes (LIFO).
  • Put the start node on the stack.
  • While the stack is not empty:
    • Pop the stack.
    • If the top of the stack is the goal, stop.
    • Otherwise, push the nodes connected to the top of the stack onto the stack (provided they are not already on the stack).

DFS example

Weakness
• We may get stuck going down an infinite branch that doesn't lead to a solution.
• As in the given figure, suppose our solution is on the right branch at depth level 1, but the DFS search proceeds down the left branch and continues its search up to thousands of nodes. It might get stuck in that branch without a solution.

Properties of depth-first search
• Completeness: Incomplete, as it may get stuck going down an infinite branch that doesn't lead to a solution.
• Optimality: The first solution found by DFS may not be the shortest.
• Space complexity: With b as the branching factor and d as the tree depth level, space complexity = O(b·d).
• Time complexity: O(b^d)
• It looks for the goal node among all the children of the current node before using the sibling of this node, i.e., it expands the deepest unexpanded node.
• The fringe is implemented as a LIFO queue (= stack).
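The stack-based DFS process above can be sketched directly; the example graph is the same illustrative one used for BFS:

```python
# Depth-first search: expand the deepest unexpanded node first (LIFO stack).
def dfs(graph, start, goal):
    stack = [[start]]                # stack of paths
    visited = {start}
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        # Push children right-to-left so the leftmost child is expanded first.
        for child in reversed(graph.get(node, [])):
            if child not in visited:
                visited.add(child)
                stack.append(path + [child])
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(dfs(graph, "A", "D"))  # ['A', 'B', 'D']
```

The only structural difference from BFS is the data structure holding the fringe: a stack (pop from the end) instead of a queue (pop from the front).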
DFS Evaluation:
Completeness:
  – Does it always find a solution if one exists? NO.
  – If the search space is infinite or the search space contains loops, then DFS may not find a solution.
Time complexity:
  – Let m be the maximum depth of the search tree. In the worst case the solution may exist at depth m.
  – The root has b successors, and each node at the next level again has b successors (b² in total), …
  – Worst case: expand all except the last node at depth m.
  – Total number of nodes generated: b + b² + b³ + … + b^m = O(b^m)
Space complexity:
  – It needs to store only a single path from the root node to a leaf node, along with the remaining unexpanded sibling nodes for each node on the path.
  – Total number of nodes in memory: 1 + b·m = O(bm)
Optimality (i.e., admissibility):
  – DFS expands the deepest node first, so it expands the entire left sub-tree even if the right sub-tree contains goal nodes at level 2 or 3. Thus DFS may not always give an optimal solution.

iii. Uniform Cost Search:

- Breadth-first search finds the shallowest goal, but it is not always guaranteed to find the optimal solution.
- Uniform cost search can be used if the cost of travelling from one node to another is available.
- Uniform cost search always expands the lowest-cost node on the fringe (the collection of nodes that are waiting to be expanded).
- The first solution found is guaranteed to be the cheapest one, because a cheaper one would have been expanded earlier and so would have been found first.

[Figure: example search graph with nodes S, A, B, C, D, E, F, G]

Uniform cost search example (Find a path from A to E):
• Expand A to B, C, D.
• The path to B is the cheapest one, with path cost 2.
• Expand B to E.
• Total path cost = 2 + 9 = 11.
• This might not be the optimal solution, since the path AC has path cost 4 (less than 11).
• Expand C to E.
• Total path cost = 4 + 5 = 9.
• The path cost from A to D is 10 (greater than path cost 9).
• Hence the optimal path is ACE.

Disadvantage
It does not care about the number of steps a path has but only about their cost. Hence it might get stuck in an infinite loop if it expands a node that has a zero-cost action leading back to the same state.

Properties of uniform cost search
• Completeness: Complete if the cost of every step is greater than or equal to some small positive constant ε.
• Optimality: Optimal if the cost of every step is greater than or equal to some small positive constant ε.
• Time complexity: O(b^(C*/ε)), where C* is the cost of the optimal path and ε is a small positive constant.
• Space complexity: O(b^(C*/ε))

iv. Depth Limited Search:

• The unbounded tree problem that appears in DFS can be fixed by imposing a limit on the depth that DFS can reach; this limit we will call the depth limit l. This solves the infinite-path problem.
• A depth-limited search algorithm is similar to depth-first search with a predetermined limit. Depth-limited search can solve the drawback of the infinite path in depth-first search. In this algorithm, the node at the depth limit is treated as if it has no further successor nodes.

Properties of depth limit search
· Completeness:
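Uniform cost search with a priority queue, on a small weighted graph reconstructed from the A-to-E example costs above (since the original figure is missing, the edge list is an assumption consistent with those costs):

```python
import heapq

# Uniform cost search: always expand the lowest-cost node on the fringe.
def ucs(graph, start, goal):
    frontier = [(0, start, [start])]   # priority queue of (cost, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for child, step in graph.get(node, []):
            if child not in explored:
                heapq.heappush(frontier, (cost + step, child, path + [child]))
    return None

# Weighted graph matching the example: A-B=2, A-C=4, A-D=10, B-E=9, C-E=5.
graph = {"A": [("B", 2), ("C", 4), ("D", 10)], "B": [("E", 9)], "C": [("E", 5)]}
print(ucs(graph, "A", "E"))  # (9, ['A', 'C', 'E'])
```

The cheaper path A-C-E (cost 9) is popped from the priority queue before the path A-B-E (cost 11), which is exactly why the first solution returned is the optimal one.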
• Iterative deepening search uses only linear space and not much more time than other uninformed
The limited path introduces another problem which is the case when we choose l < d, in which is our DLS algorithms
will never reach a goal, in this case we can say that DLS is not complete.
An example of Iterative deepening DFS (depth level 3)
· Optimality: First Iteration Search at level l=0

- One can view DFS as a special case of the depth DLS, that DFS is DLS with l = infinity.
- DLS is not optimal even if l > d.

l
· Time Complexity: O(b )
Second Iteration Search at level l=1
· Space Complexity: O(bl)

Third Iteration Search at level l=2

v. Iterative Deepening Depth First Search:

• The iterative deepening algorithm is a combination of DFS and BFS algorithms. This search algorithm finds out
the best depth limit and does it by gradually increasing the limit until a goal is found.
• This algorithm performs depth-first search up to a certain "depth limit", and it keeps increasing the depth limit
Third Iteration Search at level l=3
after each iteration until the goal node is found.
• This Search algorithm combines the benefits of Breadth-first search's fast search and depth-first search's
memory efficiency.
• The iterative search algorithm is useful uninformed search when search space is large, and depth of goal node is
unknown.

• Take the idea of depth limited search one step further.


• Starting at depth limit L = 0, we iteratively increase the depth limit, performing a depth limited search for
each depth limit.
• Stop if no solution is found, or if the depth limited search failed without cutting off any nodes because of
the depth limit.

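The iterative-deepening loop can be sketched by wrapping a depth-limited helper in a loop over increasing limits; the graph encoding and the `max_depth` safety cap are illustrative assumptions:

```python
def depth_limited(graph, node, goal, limit):
    # Depth-limited DFS returning a path or None (cutoff/failure).
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in graph.get(node, []):
        path = depth_limited(graph, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iterative_deepening(graph, start, goal, max_depth=50):
    # Perform a depth-limited search for l = 0, 1, 2, ... until a goal is found.
    for limit in range(max_depth + 1):
        path = depth_limited(graph, start, goal, limit)
        if path is not None:
            return path
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'E': ['G']}
print(iterative_deepening(graph, 'A', 'G'))  # ['A', 'B', 'E', 'G']
```

Each iteration repeats the work of the previous one, which is exactly the disadvantage noted below; the shallow levels are cheap compared with the deepest level, so the total cost stays O(b^d).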
Advantages:
o It combines the benefits of the BFS and DFS search algorithms in terms of fast search and memory efficiency.
Disadvantages:
o The main drawback of IDDFS is that it repeats all the work of the previous phase.

Iterative deepening search evaluation:
i. Completeness:
– YES (no infinite paths)
ii. Time complexity:
– The algorithm seems costly due to the repeated generation of certain states.
– Node generation: the nodes at level d are generated once, at level d-1 twice, at level d-2 three times, …, at level 2 d-1 times, and at level 1 d times.
– Total number of nodes generated: d·b + (d-1)·b² + (d-2)·b³ + … + 1·b^d = O(b^d)
iii. Space complexity:
– It needs to store only a single path from the root node to a leaf node, along with the remaining unexpanded sibling nodes for each node on the path.
– Total number of nodes in memory: 1 + b·d = O(bd)
iv. Optimality:
– YES, if path cost is a non-decreasing function of the depth of the node.

vi. Bidirectional Search:
This search algorithm replaces a single search graph with two smaller graphs -- one starting from the initial state and one starting from the goal state. It then expands nodes from the start and goal states simultaneously, checking at each stage whether a node generated by one search has also been generated by the other, i.e., whether the two searches meet in the middle. If so, the concatenation of the two paths is the solution.

Completeness: Bidirectional search is complete if we use BFS in both searches.
Time Complexity: The time complexity of bidirectional search using BFS is O(b^(d/2)).
Space Complexity: The space complexity of bidirectional search is O(b^(d/2)).
Optimality: Bidirectional search is optimal.

Example:
Forward search: 1 → 4 → 8 → 9
Backward search: 16 → 12 → 10 → 9
Solution path: 1 → 4 → 8 → 9 → 10 → 12 → 16

Drawbacks of uninformed search:
· The criterion used to choose the next node to expand depends only on a global criterion: level.
· It does not exploit the structure of the problem.
· One may prefer to use a more flexible rule that takes advantage of what is being discovered along the way, and of hunches about what can be a good move.
· Very often, we can select which rule to apply by comparing the current state and the desired state.


Comparative study of all uninformed search strategies

Informed Search (Heuristic Search)
• In uninformed search, we don't try to evaluate which of the nodes on the frontier are most promising. We never "look ahead" to the goal.
• Informed search has problem-specific knowledge apart from the problem definition.
• To solve large problems with a large number of possible states, problem-specific knowledge needs to be added to increase the efficiency of the search algorithm.
• Informed search uses domain-specific information to improve the search pattern. In other words, an informed search is one that uses problem-specific knowledge beyond the definition of the problem itself and can find solutions more efficiently than blind search strategies.
• It may be too resource intensive (in both time and space) to use a blind search.
• Use of a heuristic improves the efficiency of the search process.
• The idea is to develop a domain-specific heuristic function h(n). h(n) guesses the cost of getting to the goal from node n.

1. Greedy Best First Search
• The greedy best-first search algorithm always selects the path which appears best at that moment.
• It is a combination of the depth-first search and breadth-first search algorithms. It uses the heuristic function to guide the search.
• Best-first search allows us to take the advantages of both algorithms. With the help of best-first search, at each step, we can choose the most promising node.
• A node is selected for expansion based on an evaluation function f(n).
• The node with the lowest evaluation function value is expanded first.
• The evaluation function must represent some estimate of the cost of the path from a state to the closest goal state.
• In the greedy best-first search algorithm, we expand the node which is closest to the goal node, where the closeness is estimated by the heuristic function, i.e.
f(n) = h(n)
where h(n) = estimated cost from node n to the goal.
• h(n) = 0 for a goal state.
The greedy best-first algorithm is implemented with a priority queue.

Algorithm
Step 1: Place the starting node into the OPEN list.
Step 2: If the OPEN list is empty, stop and return failure.
Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in the CLOSED list.
Step 4: Expand the node n, and generate the successors of node n.
Step 5: Check each successor of node n, and find whether any of them is a goal node. If any successor node is a goal node, then return success and terminate the search; otherwise proceed to Step 6.
Step 6: For each successor node, the algorithm checks its evaluation function f(n) and then checks whether the node is already in the OPEN or CLOSED list. If the node is in neither list, add it to the OPEN list.
Step 7: Return to Step 2.
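Steps 1–7 above can be sketched using Python's `heapq` as the priority queue. The graph and the h-values below are made-up illustrations, and this sketch tests for the goal when a node is removed from the OPEN list, a slight variation of Step 5:

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Expand the open node with the smallest h(n) first (f(n) = h(n))."""
    open_list = [(h[start], start, [start])]      # priority queue ordered by h(n)
    closed = set()
    while open_list:                              # Step 2: fail if OPEN is empty
        _, node, path = heapq.heappop(open_list)  # Step 3: lowest h(n)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for child in graph.get(node, []):         # Step 4: expand successors
            if child not in closed:               # Step 6: skip already-seen nodes
                heapq.heappush(open_list, (h[child], child, path + [child]))
    return None

graph = {'S': ['A', 'B'], 'A': ['C', 'D'], 'B': ['D'], 'D': ['G']}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 6, 'D': 2, 'G': 0}
print(greedy_best_first(graph, h, 'S', 'G'))  # ['S', 'A', 'D', 'G']
```

Note that only h(n) is used, never the accumulated path cost; this is what makes the strategy greedy and, as discussed below, neither complete nor optimal in general.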


Applications
Best-first search and its more advanced variants have been used in such applications as games and web crawlers.
§ In a web crawler, each web page is treated as a node, and all the hyperlinks on the page are treated as unvisited successor nodes. A crawler that uses best-first search generally uses an evaluation function that assigns priority to links based on how closely the contents of their parent page resemble the search query.
§ In games, best-first search may be used as a path-finding algorithm for game characters. For example, it could be used by an enemy agent to find the location of the player in the game world.

Properties of greedy best-first search:
• Complete? No – it can get stuck in loops.
• Time? O(b^m), but a good heuristic can give dramatic improvement.
• Space? O(b^m) -- it keeps all nodes in memory.
• Optimal? No.

Heuristic Searches - Why Use?
· Even if a blind search will work, we may want a more efficient search method.
Informed search uses domain-specific information to improve the search pattern:
– Define a heuristic function, h(n), that estimates the "goodness" of a node n.
– Specifically, h(n) = estimated cost (or distance) of the minimal-cost path from n to a goal state.
– The heuristic function is an estimate, based on domain-specific information that is computable from the current state description, of how close we are to a goal.

Admissible Heuristic:
A heuristic function is said to be admissible if it is no more than the lowest-cost path to the goal. In other words, a heuristic is admissible if it never overestimates the cost of reaching the goal. An admissible heuristic is also known as an optimistic heuristic.
An admissible heuristic is used to estimate the cost of reaching the goal state in an informed search algorithm. For a heuristic to be admissible for a search problem, the estimated cost must never exceed the actual cost of reaching the goal state. The search algorithm uses the admissible heuristic to find an estimated optimal path to the goal state from the current node. For example, in A* search the evaluation function (where n is the current node) is:
f(n) = g(n) + h(n)
where:
f(n) = the evaluation function,
g(n) = the cost from the start node to the current node,
h(n) = the estimated cost from the current node to the goal.
h(n) is calculated using the heuristic function. With a non-admissible heuristic, the A* algorithm could overlook the optimal solution to a search problem due to an overestimation in f(n).
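Admissibility can be checked mechanically on a small graph: compute the true cheapest cost h*(n) from every node to the goal (here with Dijkstra's algorithm over reversed edges) and verify h(n) ≤ h*(n) everywhere. The edge costs and heuristic values below are illustrative assumptions:

```python
import heapq

def true_costs_to_goal(edges, goal):
    """Dijkstra from the goal over reversed edges gives h*(n) for every node."""
    rev = {}
    for (u, v), w in edges.items():
        rev.setdefault(v, []).append((u, w))
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, n = heapq.heappop(pq)
        if d > dist.get(n, float('inf')):
            continue
        for m, w in rev.get(n, []):
            if d + w < dist.get(m, float('inf')):
                dist[m] = d + w
                heapq.heappush(pq, (d + w, m))
    return dist

edges = {('S', 'A'): 1, ('A', 'B'): 2, ('B', 'G'): 3}
h = {'S': 5, 'A': 4, 'B': 3, 'G': 0}          # candidate heuristic
hstar = true_costs_to_goal(edges, 'G')         # true costs: S=6, A=5, B=3, G=0
print(all(h[n] <= hstar[n] for n in h))        # True -> h is admissible
```

Raising any h value above its h* (say h['S'] = 7 > 6) would make the check fail, i.e. the heuristic would overestimate and lose its optimality guarantee for A* tree search.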


2) A* Search Algorithm
· A* search is the most commonly known form of best-first search. It uses the heuristic function h(n) and the cost to reach the node n from the start state, g(n).
· It combines the features of UCS and greedy best-first search, by which it solves problems efficiently.
· The A* search algorithm finds the shortest path through the search space using the heuristic function.
· This search algorithm expands a smaller search tree and provides an optimal result faster. The A* algorithm is similar to UCS except that it uses g(n) + h(n) instead of g(n).
· In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine both costs as follows, and this sum is called the fitness number:
f(n) = g(n) + h(n)
· A* is a best-first, informed graph search algorithm.
· The h(n) part of the f(n) function must be an admissible heuristic; that is, it must not overestimate the distance to the goal. Thus, for an application like routing, h(n) might represent the straight-line distance to the goal, since that is physically the smallest possible distance between any two points or nodes.
· It finds a minimal-cost path joining the start node and a goal node.
· Evaluation function: f(n) = g(n) + h(n)
where
g(n) = cost so far to reach n from the root,
h(n) = estimated cost to the goal from n,
f(n) = estimated total cost of the path through n to the goal.
· A* combines the two by minimizing f(n) = g(n) + h(n).
· A* is informed and, under reasonable assumptions, optimal and complete.

Example: heuristic values
Edge costs: S–A = 1, S–B = 4, A–B = 2, A–C = 5, A–D = 12, B–C = 2, C–D = 3, where D is the goal.
Heuristic values: h(S) = 7, h(A) = 6, h(B) = 2, h(C) = 1, h(D) = 0.

Solution:
For S: f(n) = 0 + 7 = 7 (0 because S is the start node; now expand S)
S → A: f = 1 + 6 = 7
S → B: f = 4 + 2 = 6 (least expensive)
S → B → C: f = (4 + 2) + 1 = 7
S → A → B: f = (1 + 2) + 2 = 5
S → A → C: f = (1 + 5) + 1 = 7
S → A → D: f = (1 + 12) + 0 = 13
S → A → B → C: f = 5 + 1 = 6
S → A → B → C → D: f = 8 + 0 = 8
S → B → C → D: f = 9 + 0 = 9
S → A → C → D: f = 9 + 0 = 9
Therefore, the optimal path is S → A → B → C → D, with cost 8.

Advantages:
o The A* search algorithm performs better than other search algorithms.
o The A* search algorithm is optimal and complete.
o This algorithm can solve very complex problems.

Disadvantages:
o It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
o The A* search algorithm has some complexity issues.
o The main drawback of A* is its memory requirement: it keeps all generated nodes in memory, so it is not practical for various large-scale problems.
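A minimal A* sketch matching f(n) = g(n) + h(n) is shown below. It is run on the edge costs and heuristic values of the worked example above, under the assumption that the figure was transcribed correctly; the tuple-based queue entries are an implementation choice, not part of the notes:

```python
import heapq

def a_star(edges, h, start, goal):
    """A* graph search: expand the frontier node minimizing f(n) = g(n) + h(n)."""
    graph = {}
    for (u, v), w in edges.items():
        graph.setdefault(u, []).append((v, w))
    pq = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {}                            # cheapest g found so far per node
    while pq:
        f, g, node, path = heapq.heappop(pq)
        if node == goal:
            return path, g
        if node in best_g and best_g[node] <= g:
            continue                       # already reached this node more cheaply
        best_g[node] = g
        for child, w in graph.get(node, []):
            heapq.heappush(pq, (g + w + h[child], g + w, child, path + [child]))
    return None, float('inf')

# Edge costs and heuristic values assumed from the worked example above.
edges = {('S', 'A'): 1, ('S', 'B'): 4, ('A', 'B'): 2, ('A', 'C'): 5,
         ('A', 'D'): 12, ('B', 'C'): 2, ('C', 'D'): 3}
h = {'S': 7, 'A': 6, 'B': 2, 'C': 1, 'D': 0}
path, cost = a_star(edges, h, 'S', 'D')
print(path, cost)  # ['S', 'A', 'B', 'C', 'D'] 8
```

The `best_g` table is what distinguishes graph search from tree search: a node re-reached with a worse g is skipped, which keeps the memory cost bounded by the generated nodes, the drawback noted above.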


Example 2:
Optimal path is: S → A → C → G

Complete: The A* algorithm is complete as long as:
o the branching factor is finite;
o the cost of every action is fixed.

Optimal: The A* search algorithm is optimal if it satisfies the following two conditions:
o Admissibility: the first condition required for optimality is that h(n) should be an admissible heuristic for A* tree search. An admissible heuristic is optimistic in nature.
o Consistency: the second required condition is consistency, which applies only to A* graph search.

If the heuristic function is admissible, then A* tree search will always find the least-cost path.

Time Complexity: The time complexity of the A* search algorithm depends on the heuristic function, and the number of nodes expanded is exponential in the depth of the solution d. So the time complexity is O(b^d), where b is the branching factor.

Space Complexity: The space complexity of the A* search algorithm is O(b^d).

Genetic Algorithm:
· A genetic algorithm is a search heuristic inspired by Charles Darwin's theory of natural evolution. The algorithm reflects the process of natural selection, where the fittest individuals are selected for reproduction in order to produce the offspring of the next generation.
· A genetic algorithm is a heuristic search method used in artificial intelligence.
· It is used for finding optimal solutions to search problems.
· Genetic algorithms are excellent for searching through large and complex datasets.
· They are frequently used to find optimal or near-optimal solutions to difficult problems which would otherwise take a lifetime to solve.

Steps

Fitness Function
The fitness function determines how fit an individual is (the ability of an individual to compete with other individuals).

Selection

The idea of the selection phase is to select the fittest individuals and let them pass their genes to the next generation. Two pairs of individuals (parents) are selected based on their fitness scores. Individuals with high fitness have more chance of being selected for reproduction.

Crossover
Crossover is the most significant phase in a genetic algorithm. For each pair of parents to be mated, a crossover point is chosen at random from within the genes.
For example, consider the crossover point to be 3 as shown below.
Offspring are created by exchanging the genes of the parents among themselves until the crossover point is reached.
Exchanging genes among parents
The new offspring are added to the population.
New offspring

Mutation
In certain new offspring formed, some of their genes can be subjected to a mutation with a low random probability. This implies that some of the bits in the bit string can be flipped.
Mutation: Before and After

Algorithm:
START
Generate the initial population
Compute fitness
REPEAT
    Selection
    Crossover
    Mutation
    Compute fitness
UNTIL population has converged
STOP

Example:

Planning:
· Planning in Artificial Intelligence is about the decision-making tasks performed by robots or computer programs to achieve a specific goal.
· The execution of planning is about choosing a sequence of actions with a high likelihood of completing the specific task.
· Given a set of operator instances (defining the possible primitive actions of the agent), an initial state description, and a goal state description or predicate, the planning agent computes a plan.
· The word planning refers to the process of computing several steps of a problem-solving procedure before executing any of them.
· Planning of the problem is done in a hierarchical manner.

What is a plan?
· A plan is a sequence of operator instances such that executing them in the initial state will change the world to a state satisfying the goal state description.
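The definition of a plan above can be checked by simulation: apply each operator's effects in sequence, verifying its preconditions, and test whether the final state satisfies the goal description. The set-of-literals state encoding and the operator names below are illustrative assumptions:

```python
def is_valid_plan(initial, goal, plan, operators):
    """A plan is a sequence of operator instances that transforms the
    initial state into a state satisfying the goal description."""
    state = set(initial)
    for name in plan:
        pre, add, delete = operators[name]
        if not pre <= state:          # a precondition is not satisfied
            return False
        state = (state - delete) | add
    return goal <= state

# Illustrative operators: name -> (preconditions, add-list, delete-list).
operators = {
    'boil-water': (set(),         {'hot-water'}, set()),
    'brew-tea':   ({'hot-water'}, {'tea'},       {'hot-water'}),
}
print(is_valid_plan({'kettle'}, {'tea'}, ['boil-water', 'brew-tea'], operators))  # True
print(is_valid_plan({'kettle'}, {'tea'}, ['brew-tea'], operators))                # False
```

The second call fails because 'brew-tea' is applied in a state that does not satisfy its precondition, which is exactly the legality requirement that the soundness property below refers to.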

Properties of Planning Algorithms
i. Soundness
- A planning algorithm is sound if all solutions found are legal plans:
  - all preconditions and goals are satisfied;
  - no constraints are violated.
ii. Completeness
- A planning algorithm is complete if a solution can be found whenever one actually exists.
iii. Optimality
- A planning algorithm is optimal if the order in which solutions are found is consistent with some measure of plan quality.

Types of Planning
1. Linear planning
- Basic idea: work on one goal until it is completely solved before moving on to the next goal.
- The planning algorithm maintains a goal stack.
Implications
- No interleaving of goal achievement.
- Efficient search if goals don't interact.
Advantages
- Reduced search space, since single goals are solved one at a time.
- Advantageous if goals are (mainly) independent.
- Linear planning is sound.
Disadvantages
- Linear planning may produce suboptimal solutions (based on the number of operations in the plan).
- Linear planning is incomplete.

2. Non-Linear Planning
- Uses a goal set instead of a goal stack, and includes in the search space all possible subgoal orderings.
- Handles goal interactions by interleaving.
Advantages
- Non-linear planning is sound.
- Non-linear planning is complete.
- Non-linear planning may be optimal with respect to plan length (depending on the search strategy employed).
Disadvantages
- Larger search space, since all possible goal orderings may have to be considered.
- Somewhat more complex algorithm; more bookkeeping.

Algorithm
1. Choose a goal 'g' from the goal set.
2. If 'g' does not match the state, then
   · choose an operator 'o' whose add-list matches goal g;
   · push 'o' onto the opstack;
   · add the preconditions of 'o' to the goal set.
3. While all preconditions of the operator on top of the opstack are met in the state:
   · pop operator o from the top of the opstack;
   · state = apply(o, state);
   · plan = [plan; o].

Example
The blocks world uses the four operators:
UNSTACK(A,B) -- pick up clear block A from block B;
STACK(A,B) -- place block A using the arm onto clear block B;
PICKUP(A) -- lift clear block A with the empty arm;
PUTDOWN(A) -- place the held block A onto a free space on the table;
and the five predicates:
ON(A,B) -- block A is on block B.
ONTABLE(A) -- block A is on the table.
CLEAR(A) -- block A has nothing on it.
HOLDING(A) -- the arm holds block A.
ARMEMPTY -- the arm holds nothing.


Means-Ends Analysis in Artificial Intelligence
o Means-Ends Analysis (MEA) is a problem-solving technique used commonly in Artificial Intelligence to limit search in AI problems.
o Most search strategies reason either forward or backward; however, often a mixture of the two directions is appropriate. Such a mixed strategy makes it possible to solve the major parts of a problem first and then go back and solve the smaller problems. This technique is called means-ends analysis.
o The MEA technique is a strategy to control search in problem solving.
o The MEA technique was first introduced in 1961 by Allen Newell and Herbert A. Simon in their problem-solving computer program, which was named the General Problem Solver (GPS).
o The MEA process is centered on the evaluation of the difference between the current state and the goal state.

How means-ends analysis works:
The means-ends analysis process can be applied recursively to a problem. It is a strategy to control search in problem-solving. The following are the main steps which describe the working of the MEA technique for solving a problem.
a. First, evaluate the difference between the Initial State and the final State.
b. Select the various operators which can be applied for each difference.
c. Apply the operator at each difference, which reduces the difference between the current state and the goal state.

Operator Subgoaling
In the MEA process, we detect the differences between the current state and the goal state. Once these differences occur, we can apply an operator to reduce the differences. But sometimes it is possible that an operator cannot be applied to the current state. So we create a sub-problem of the current state in which the operator can be applied. This type of backward chaining, in which operators are selected and then sub-goals are set up to establish the preconditions of the operator, is called operator subgoaling.

Example of Means-Ends Analysis:
Let's take an example where we know the initial state and the goal state as given below. In this problem, we need to reach the goal state by finding differences between the initial state and the goal state and applying operators.

Solution:
To solve the above problem, we will first find the differences between the initial state and the goal state, and for each difference we will generate a new state and apply the operators. The operators we have for this problem are:
o Move
o Delete
o Expand

1. Evaluating the initial state: In the first step, we evaluate the initial state and compare the initial and goal states to find the differences between the two.
2. Applying the Delete operator: The first difference is that in the goal state there is no dot symbol, which is present in the initial state; so first we apply the Delete operator to remove this dot.
3. Applying the Move operator: After applying the Delete operator, a new state occurs, which we again compare with the goal state. After comparing these states, there is another difference: the square is outside the circle. So we apply the Move operator.



4. Applying the Expand operator: Now a new state is generated in the third step, and we compare this state with the goal state. After comparing the states there is still one difference, which is the size of the square; so we apply the Expand operator, and finally it generates the goal state.

Chapter - Adversarial Search
· Competitive environments, in which the agents' goals are in conflict, give rise to adversarial search problems, often known as games.
· In AI, "games" means deterministic, fully observable environments in which there are two agents whose actions must alternate and in which the utility values at the end of the game are always equal and opposite.
• E.g., if the first player wins, the other player necessarily loses.
· Opposition between the agents' utility functions makes the situation adversarial.

Game
· The term game means a sort of conflict in which n individuals or groups (known as players) participate.
· John von Neumann is acknowledged as the father of game theory; Neumann defined game theory in 1928 and 1937.
· Games are an integral attribute of human beings. Games engage the intellectual faculties of humans.
· If computers are to mimic people, they should be able to play games.

A game can formally be defined as a kind of search problem with the following components:
1. Initial state: specifies how the game is set up at the start.
2. A successor function.
3. A terminal test: which is true when the game is over and false otherwise. States where the game has ended are called terminal states.
4. A utility function: a utility function (also called an objective function or payoff function) defines the final numeric value for a game that ends in terminal state s for a player p. In chess, the outcome is a win, loss, or draw, with values +1, 0, or 1/2. Some games have a wider variety of possible outcomes; the payoffs in backgammon range from 0 to +192. A zero-sum game is (confusingly) defined as one where the total payoff to all players is the same for every instance of the game. Chess is zero-sum because every game has a payoff of either 0 + 1, 1 + 0, or 1/2 + 1/2.

[Artificial Intelligence: Compiled By: Bhawana Bam(bhawana70003@gmail.com)]


Game Tree Representation

Minimax Algorithm
· The minimax algorithm is a recursive, backtracking algorithm used in decision-making and game theory. It provides an optimal move for the player, assuming that the opponent is also playing optimally.
· The minimax algorithm uses recursion to search through the game tree.
· The minimax algorithm is mostly used for game playing in AI, such as chess, checkers, tic-tac-toe, Go, and various other two-player games. The algorithm computes the minimax decision for the current state.
· In this algorithm two players play the game; one is called MAX and the other is called MIN.
· Both players fight it out so that the opponent gets the minimum benefit while they get the maximum benefit.
· Both players are opponents of each other: MAX will select the maximized value and MIN will select the minimized value.
· The minimax algorithm performs a depth-first search for the exploration of the complete game tree.
· The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the tree through the recursion.
· The minimax algorithm performs a complete depth-first exploration of the game tree. If the maximum depth of the tree is m and there are b legal moves at each point, then the time complexity of the minimax algorithm is O(b^m) and the space complexity is O(bm).
· We first consider games with two players: MAX and MIN. MAX moves first, and then they take turns moving until the game is over. Each level of the tree alternates. MAX is trying to maximize her score, and MIN is trying to minimize MAX's score in order to undermine her success. At the end of the game, points are awarded to the winning player and penalties are given to the loser.

Working of the Min-Max Algorithm:
• The working of the minimax algorithm can be easily described using an example. Below we have taken an example game tree representing a two-player game.
• In this example there are two players; one is called Maximizer and the other is called Minimizer.
• Maximizer will try to get the maximum possible score, and Minimizer will try to get the minimum possible score.
• This algorithm applies DFS, so in this game tree we have to go all the way down to the leaves to reach the terminal nodes.
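Before walking through the example, the recursive procedure can be sketched directly in Python. The nested-list game tree is an illustrative stand-in for a real game's move generator; its terminal values match those used in the worked example that follows:

```python
def minimax(node, is_max):
    """Depth-first minimax: a leaf is a number (its utility); an internal
    node is a list of children. MAX and MIN alternate at each level."""
    if isinstance(node, (int, float)):   # terminal state: return its utility
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# Two-ply tree with leaf utilities D=[-1,4], E=[2,6], F=[-3,-5], G=[0,7].
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
print(minimax(tree, True))  # 4
```

The returned value 4 is the minimax value of the root: MAX's best guaranteed outcome when MIN also plays optimally, exactly what the step-by-step trace below computes by hand.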


• At the terminal nodes, the terminal values are given, so we will compare those values and backtrack the tree until the initial state occurs. The following are the main steps involved in solving the two-player game tree:
• The minimax value of a terminal state is just its utility.
• It is a depth-first search with limited depth.
• Use a static evaluation function for all leaf states.
• Assume the opponent will make the best move possible.

Algorithm
minimax(player, board)
    if(game over in current board position)
        return winner
    children = all legal moves for player from this board
    if(max's turn)
        return maximal score of calling minimax on all the children
    else (min's turn)
        return minimal score of calling minimax on all the children

Example 1
In the tree diagram below, let's take A as the initial state of the tree. Suppose the maximizer takes the first turn, with worst-case initial value -∞, and the minimizer takes the next turn, with worst-case initial value +∞.

Step 2: First we find the utility values for the Maximizer. Its initial value is -∞, so we compare each value in the terminal states with the initial value of the Maximizer and determine the higher node values. It finds the maximum among them all.
o For node D: max(-1, -∞) => max(-1, 4) = 4
o For node E: max(2, -∞) => max(2, 6) = 6
o For node F: max(-3, -∞) => max(-3, -5) = -3
o For node G: max(0, -∞) => max(0, 7) = 7

Step 3: In the next step it is the minimizer's turn, so it compares all node values with +∞ and finds the third-layer node values.
o For node B = min(4, 6) = 4
o For node C = min(-3, 7) = -3


Step 4: Now it's the Maximizer's turn, and it will again choose the maximum of all node values and find the maximum value for the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be more than 4 layers.
o For node A: max(4, -3) = 4

Properties of the Mini-Max algorithm:
o Complete - The min-max algorithm is complete. It will definitely find a solution (if one exists) in the finite search tree.
o Optimal - The min-max algorithm is optimal if both opponents are playing optimally.
o Time complexity - As it performs DFS over the game tree, the time complexity of the min-max algorithm is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
o Space complexity - The space complexity of the minimax algorithm is also similar to DFS, which is O(bm).

Limitation of the minimax algorithm:
The main drawback of the minimax algorithm is that it gets really slow for complex games such as chess, Go, etc. This type of game has a huge branching factor, and the player has lots of choices to decide between. This limitation of the minimax algorithm can be improved upon with alpha-beta pruning.
That was the complete workflow of the minimax two-player game.

Example 2:

Problem
· Interesting games have too many states to expand to the leaves.
· The number of nodes to expand is exponential in the depth of the tree and the branching factor.

α-β pruning
o Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the minimax algorithm.
o As we have seen in the minimax search algorithm, the number of game states it has to examine is exponential in the depth of the tree. While we cannot eliminate the exponent, we can cut it in half. Hence there is a technique by which, without checking each node of the game tree, we can compute the correct minimax decision; this technique is called pruning. It involves two threshold parameters, alpha and beta, for future expansion, so it is called alpha-beta pruning. It is also called the alpha-beta algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only the tree leaves but also entire sub-trees.
o The two parameters can be defined as:
a. Alpha: the best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
b. Beta: the best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.


Alpha-beta pruning applied to a standard minimax algorithm returns the same move as the standard algorithm does, but it removes all the nodes which do not really affect the final decision but make the algorithm slow. Hence, by pruning these nodes, it makes the algorithm fast.
• It reduces the time required for the search, and the search must be restricted so that no time is wasted searching moves that are obviously bad for the current player.
• The exact implementation of alpha-beta keeps track of the best move for each side as it moves throughout the tree.
• Alpha-beta pruning can be applied to trees of any depth, and it is often possible to prune entire sub-trees rather than just leaves. Remember that minimax search is depth-first, so at any one time we just have to consider the nodes along a single path in the tree.
α = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
β = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.
• Alpha-beta search updates the values of α and β as it goes along and prunes the remaining branches at a node as soon as the value of the current node is known to be worse than the current α or β value for MAX or MIN, respectively.

Condition for alpha-beta pruning:
The main condition required for alpha-beta pruning is:
1. α >= β
Working of Alpha-Beta Pruning:
Key points about alpha-beta pruning:
Let's take an example of two-player search tree to understand the working of Alpha-beta pruning
o The Max player will only update the value of alpha.
Step 1: At the first step the, Max player will start first move from node A where α= -∞ and β= +∞, these
o The Min player will only update the value of beta.
value of alpha and beta passed down to node B where again α= -∞ and β= +∞, and Node B passes the
o While backtracking the tree, the node values will be passed to upper nodes instead of values of same value to its child D.
alpha and beta.
o We will only pass the alpha, beta values to the child nodes.




Step 2: At node D, the value of α will be calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3; max(2, 3) = 3, so α at node D becomes 3, and the node value is also 3.

Step 3: Now the algorithm backtracks to node B, where the value of β will change, as it is Min's turn. β = +∞ is compared with the value just returned: min(+∞, 3) = 3, so at node B, α = -∞ and β = 3.

In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down.

Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5: max(-∞, 5) = 5, so at node E, α = 5 and β = 3. Since α >= β, the right successor of E is pruned; the algorithm will not traverse it, and the value at node E is 5.

Step 5: Next, the algorithm again backtracks the tree, from node B to node A. At node A, alpha is updated to the maximum available value: max(-∞, 3) = 3, with β = +∞. These two values are now passed to the right successor of A, which is node C. At node C, α = 3 and β = +∞, and the same values are passed on to node F.

Step 6: At node F, the value of α is compared with the left child, 0: max(3, 0) = 3, and then with the right child, 1: max(3, 1) = 3. α remains 3, but the node value of F becomes 1.

Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; the value of beta is updated by comparing with 1: min(+∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, node G, is pruned and the algorithm does not compute the entire sub-tree rooted at G.




Alpha Beta Procedure

• At each non-leaf node, store two values: alpha and beta.
• Let alpha be the best (i.e., maximum) value found so far at a "max" node.
• Let beta be the best (i.e., minimum) value found so far at a "min" node.
• Initially assign alpha = -∞ and beta = +∞ at the root.
• Note that alpha is monotonically non-decreasing and beta is monotonically non-increasing as you travel down the tree.
• Given a node n, cut off the search below that node (i.e., generate no more children) if
o n is a "max" node and alpha(n) >= beta(i) for some "min" ancestor i of n, or
o n is a "min" node and beta(n) <= alpha(j) for some "max" ancestor j of n.

Step 8: C now returns the value 1 to A; the best value for A is max(3, 1) = 3. The final game tree shows the nodes that were computed and the nodes that were never computed. Hence the optimal value for the maximizer is 3 in this example.

Algorithm

function alphabeta(node, depth, α, β, maximizingPlayer)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    if maximizingPlayer
        for each child of node
            α := max(α, alphabeta(child, depth - 1, α, β, FALSE))
            if β ≤ α
                break (* β cut-off *)
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth - 1, α, β, TRUE))
            if β ≤ α
                break (* α cut-off *)
        return β
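The pseudocode above can be run directly as a short Python sketch. The nested-list tree encoding is my own convention for illustration: a node is either a number (a heuristic leaf value) or a list of children. The example tree mirrors the worked example above (A is MAX over the MIN nodes B and C; the leaves under pruned branches are arbitrary), so the search returns 3.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing_player):
    """Fail-hard alpha-beta search, following the pseudocode above."""
    if depth == 0 or not isinstance(node, list):
        return node  # heuristic value of a terminal node
    if maximizing_player:
        for child in node:
            alpha = max(alpha, alphabeta(child, depth - 1, alpha, beta, False))
            if beta <= alpha:
                break  # beta cut-off: a MIN ancestor already has a better option
        return alpha
    else:
        for child in node:
            beta = min(beta, alphabeta(child, depth - 1, alpha, beta, True))
            if beta <= alpha:
                break  # alpha cut-off
        return beta

# A (MAX) -> B, C (MIN); B -> D, E (MAX); C -> F, G (MAX); leaves as in the example.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(alphabeta(tree, 3, -math.inf, math.inf, True))  # 3
```

Because the implementation is fail-hard (it returns α or β rather than the exact value of a pruned subtree), intermediate values at pruned nodes are only bounds, but the root value is the exact minimax value.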




Example


Minimax without pruning

Alpha-Beta run

· Depth-first search: visit C, A, F.
· Visit G; the heuristic evaluates it to 2.
· Visit H; the heuristic evaluates it to 3.
· Back up {2, 3} to F: max(F) = 3.
· Back up to A: β(A) = 3; the temporary min(A) is 3. 3 is the ceiling for node A's score.
· Visit B next in depth-first order.
· Visit I; it evaluates to 5, so max(B) >= 5 and α(B) = 5.
· It does not matter what the value of J is: min(A) = 3. β-prune J.

Alpha-beta pruning improves the search efficiency of minimax without sacrificing accuracy.

Effectiveness

• The effectiveness depends on the order in which children are visited.


• In the best case, the effective branching factor will be reduced from b to sqrt(b).
• In an average case (random values of leaves) the branching factor is reduced to b/logb.
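The effect of move ordering can be demonstrated by counting leaf evaluations for the same tree visited best-move-first versus worst-move-first. The 3×3 tree below and the helper names are made up purely for this illustration:

```python
import math

def count_leaf_evals(tree):
    """Run alpha-beta over a nested-list tree and count leaf evaluations."""
    count = 0

    def ab(node, alpha, beta, maximizing):
        nonlocal count
        if not isinstance(node, list):
            count += 1          # a leaf is being evaluated
            return node
        if maximizing:
            for child in node:
                alpha = max(alpha, ab(child, alpha, beta, False))
                if beta <= alpha:
                    break
            return alpha
        for child in node:
            beta = min(beta, ab(child, alpha, beta, True))
            if beta <= alpha:
                break
        return beta

    value = ab(tree, -math.inf, math.inf, True)
    return value, count

good = [[7, 8, 9], [4, 5, 6], [1, 2, 3]]   # best moves examined first
bad = [[3, 2, 1], [6, 5, 4], [9, 8, 7]]    # worst moves examined first
print(count_leaf_evals(good), count_leaf_evals(bad))
```

Both orderings return the same minimax value, but the well-ordered tree evaluates fewer leaves, which is exactly why good move ordering improves pruning.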

Properties of α-β

• Pruning does not affect the final result.
• Good move ordering improves the effectiveness of pruning.
• With "perfect ordering," time complexity = O(b^(m/2)), which doubles the depth of search that can be handled.
• A simple example of the value of reasoning about which computations are relevant.



Artificial Intelligence-Compiled by Yagya Raj Pandeya, NAST, Dhandadhi ©yagyapandeya@gmail.com Page 10



Example 2: (a second game tree, worked step by step in figures not reproduced here)


Constraint Satisfaction Problem


In artificial intelligence, constraint satisfaction is the process of finding an assignment to a set of variables Vi (V1, V2, …, Vn), each with a domain of allowed values, that satisfies a set of constraints Ci (C1, C2, …, Cm). There is no single rule defining a procedure to solve a CSP; the technique depends upon the kind of constraints being considered. The main features of CSP are:
§ A CSP is a high-level description of a problem.
§ A model for the problem is represented by a set of variables and their domains.
§ The problem is stated as constraints specifying the relations between the variables.
§ The constraints only specify the relationship, without specifying a computational procedure to enforce that relationship.
§ The computer has to find the solution of the specified problem.

Ø A constraint satisfaction problem consists of:

v A finite set of variables, where each variable has a domain. Using a set of variables (features) to represent a domain is called a factored representation.
v A set of constraints that restrict variables or combinations of variables.
Ø A constraint satisfaction problem consists of three components, X, D, and C:
· X is a set of variables, {X1, X2, ……, Xn}
· D is a set of domains, {D1, D2, ……, Dn}, one for each variable.
· C is a set of constraints that specify allowable combinations of values.
Ø A CSP is solved by a variable assignment that satisfies the given constraints.



Ø In CSPs, states are explicitly represented as variable assignments. CSP search algorithms take advantage of this structure.
Ø An assignment is complete when every variable has been assigned a value.
Ø A solution to a CSP is a complete assignment that satisfies all constraints.
Ø Some CSPs require a solution that maximizes an objective function.
Examples of applications:
§ Scheduling the time of observations on the Hubble Space Telescope
§ Airline schedules
§ Floor planning
§ Map coloring
§ Cryptography
§ Computer vision → image interpretation
§ Exam scheduling

1. Map Coloring Example
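The map-coloring CSP can be sketched directly as a backtracking search. The regions below are the conventional Australia example (WA, NT, SA, Q, NSW, V, T) with a three-color domain; the adjacency list and all function names are chosen here for illustration, not taken from any library.

```python
# Backtracking search for the classic Australia map-coloring CSP.
neighbors = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],  # Tasmania is adjacent to no mainland region
}
colors = ["red", "green", "blue"]

def consistent(region, color, assignment):
    # A value is consistent if no already-assigned neighbor has the same color.
    return all(assignment.get(n) != color for n in neighbors[region])

def backtrack(assignment):
    if len(assignment) == len(neighbors):  # complete assignment: a solution
        return assignment
    region = next(r for r in neighbors if r not in assignment)  # unassigned variable
    for color in colors:
        if consistent(region, color, assignment):
            result = backtrack({**assignment, region: color})
            if result is not None:
                return result
    return None  # no color works here: backtrack

solution = backtrack({})
print(solution)
```

The solver illustrates the CSP components directly: `neighbors` gives the variables X, `colors` the domains D, and `consistent` enforces the binary inequality constraints C.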


Cryptarithmetic Problem

It is an arithmetic problem which is represented in letters. It involves the decoding of digits represented by characters. It is in the form of an arithmetic equation where the digits are distinctly represented by characters, and the problem requires finding the digit represented by each character. Assign a decimal digit to each of the letters in such a way that the answer to the problem is correct. If the same letter occurs more than once, it must be assigned the same digit each time. No two different letters may be assigned the same digit.

Procedure

1. The cryptarithmetic problem is an interesting constraint satisfaction problem for which different algorithms have been developed. A cryptarithm is a mathematical puzzle in which digits are replaced by letters of the alphabet or other symbols. Cryptarithmetic is the science and art of creating and solving cryptarithms.
2. Two constraints define a cryptarithm:
- Each letter or symbol represents only one digit, and a unique digit, throughout the problem.
- When the digits replace the letters or symbols, the resultant arithmetical operation must be correct.
3. The above two constraints lead to some other restrictions in the problem.

Example:
Solve the following Constraint Satisfaction Problem [CSP]:
   A B C
+ D E F
  G H I

Solution:
Tree Searching Rules:
1. A ≠ B ≠ C ≠ D ≠ E ≠ F ≠ G ≠ H ≠ I
2. C + F = I, or C + F = 10 + I (with a carry of 1)
3. B + E = H, B + E = 10 + H (with a carry of 1), B + E + 1 = H, or B + E + 1 = 10 + H (with a carry of 1)
4. A + D = G, or A + D + 1 = G
Now,
Step 1:
Domain of C = {1, 2, 3, 4, 5, 6, 7, 8, 9}
Domain of F = {1, 2, 3, 4, 5, 6, 7, 8, 9}
So, Domain of I = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Select C = 4 & F = 9; then I = 3 (carry = 1).
So,
   A B 4
+ D E 9
  G H 3
Step 2:
Domain of B = {1, 2, 5, 6, 7, 8}
Domain of E = {1, 2, 5, 6, 7, 8}
Domain of H = {0, 1, 2, 5, 6, 7, 8}
Select B = 2 & E = 8; then B + E + 1 (previous carry) = 11, so H = 1 (carry = 1).
So,
   A 2 4
+ D 8 9
  G 1 3
Step 3:
Domain of A = {0, 5, 6, 7}
Domain of D = {0, 5, 6, 7}
So, Domain of G = {5, 6}
Select A = 0 & D = 5; then G = 6 (with the addition of the carry).
So,
   0 2 4
+ 5 8 9
  6 1 3
Hence, the required solutions are: A = 0, B = 2, C = 4, D = 5, E = 8, F = 9, G = 6, H = 1, I = 3.

Q.
    T R U E
+  T R U E
 F A L S E

Solution:
Step 1:
Domain of F = {1}
So, select F = 1.
Now we have:
    T R U E
+  T R U E
 1 A L S E

Step 2:
Domain of E = {0}
So, select E = 0.
Now we have:
    T R U 0
+  T R U 0
 1 A L S 0

Step 3:
Domain of U = {2, 3, 4, 6, 7, 8, 9}
So, Domain of S = {2, 4, 6, 8}
Select U = 4; then S = 8.
So,
    T R 4 0
+  T R 4 0
 1 A L 8 0

Step 4:
Domain of R = {3, 6}
So, Domain of L = {2, 6}
Select R = 6; then L = 2 (carry = 1).
So,
    T 6 4 0
+  T 6 4 0
 1 A 2 8 0

Step 5:
Select T = 7; then A = 5 (with the carry), giving:
    7 6 4 0
+  7 6 4 0
 1 5 2 8 0

Hence, the required solutions are:
T = 7, R = 6, U = 4, E = 0, F = 1, A = 5, L = 2, S = 8.

2.
   L O V E
+ L O V E
 H A T E

Solution:
Step 1:
Domain of E = {0}
So, select E = 0. Then we have:
   L O V 0
+ L O V 0
 H A T 0

Step 2:
Select V = 4; then T = 8 (V + V = T, with no carry).
So,
   L O 4 0
+ L O 4 0
 H A 8 0

Step 3:
Domain of O = {1, 3, 6}
So, Domain of A = {2, 6}
Select O = 6; then A = 2 (carry = 1).
So,
   L 6 4 0
+ L 6 4 0
 H 2 8 0

Step 4:
Domain of L = {1, 3}
So, Domain of H = {3, 7}
Select L = 3; then H = 7.


Now, finally, we have:
   3 6 4 0
+ 3 6 4 0
 7 2 8 0

Hence, the required solutions are:
L = 3, O = 6, V = 4, E = 0, H = 7, A = 2, T = 8.

  C R O S S
+ R O A D S
D A N G E R

Solution:
Step 1:
Domain of D = {1}
Select D = 1.
So,
  C R O S S
+ R O A 1 S
1 A N G E R

Step 2:
Domain of S = {2, 3, 4, 5, 6, 7, 8, 9}
So, Domain of R = {0, 2, 4, 6, 8}
Select S = 3; then R = 6.
So,
  C 6 O 3 3
+ 6 O A 1 3
1 A N G 4 6

Step 3:
Domain of O = {0, 2, 4, 5, 7, 8, 9}
Domain of A = {0, 2, 4, 5, 7, 8, 9}
So, Domain of G = {0, 2, 5, 7, 9}
Select O = 2 & A = 5; then G = 7.
So,
  C 6 2 3 3
+ 6 2 5 1 3
1 5 8 7 4 6

Step 4:
Domain of C = {9}
So, select C = 9.
Finally, our result will be:
  9 6 2 3 3
+ 6 2 5 1 3
1 5 8 7 4 6

Hence, the required solutions are:
C = 9, R = 6, O = 2, S = 3, A = 5, D = 1, N = 8, G = 7, E = 4.

  S E N D
+ M O R E
M O N E Y

Solution:
Step 1:
Domain of M = {1}
Select M = 1.
So, we have:
  S E N D
+ 1 O R E
1 O N E Y

Step 2:
Domain of S = {9}
Select S = 9; then O = 0 (carry = 1).


 F O R T Y
     T E N
+   T E N
 S I X T Y

Solution:
Step 1:
Domain of N = {0, 5}
Domain of Y = {1, 2, 3, 4, 5, 6, 7, 8, 9}
So, select N = 0 & Y = 6 to obtain Y = 6.
Now we have:
 F O R T 6
     T E 0
+   T E 0
 S I X T 6

Step 2:
Domain of E = {5}
Domain of T = {2, 3, 4, 7, 8, 9}
The carry in the above step will be 2 only when we select E = 5 & T = 8, obtaining T = 8 (carry = 1).
The simplified form of the problem is now:
 F 9 R 8 6
     8 5 0
+   8 5 0
 S 1 X 8 6
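Hand searches like the ones above can be checked mechanically by brute force. The sketch below tries every assignment of distinct digits to the letters of a puzzle; the function and variable names are my own, not from any library, and the permutation search for 8 distinct letters finishes in a few seconds.

```python
from itertools import permutations

def solve(words, result):
    """Brute-force cryptarithmetic solver.

    Each letter gets a signed place-value coefficient (positive in the addends,
    negative in the result); an assignment of distinct digits is a solution
    exactly when the coefficient-weighted digit sum is zero.
    """
    coef = {}
    for word in words:
        for power, letter in enumerate(reversed(word)):
            coef[letter] = coef.get(letter, 0) + 10 ** power
    for power, letter in enumerate(reversed(result)):
        coef[letter] = coef.get(letter, 0) - 10 ** power
    letters = sorted(coef)
    coefs = [coef[l] for l in letters]
    leading = [i for i, l in enumerate(letters) if l in {w[0] for w in words + [result]}]
    for digits in permutations(range(10), len(letters)):
        if any(digits[i] == 0 for i in leading):
            continue  # leading letters cannot be zero
        if sum(c * d for c, d in zip(coefs, digits)) == 0:
            return dict(zip(letters, digits))
    return None

print(solve(["SEND", "MORE"], "MONEY"))  # the classic unique answer: 9567 + 1085 = 10652
```

Running it on SEND + MORE = MONEY completes the search that the hand solution above began: S = 9, E = 5, N = 6, D = 7, M = 1, O = 0, R = 8, Y = 2.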

Exercises:
1. BASE + BALL = GAMES
2. GERALD + DONALD = ROBERT
3. WIRE + MORE = MONEY

CHAPTER - 4
Knowledge Representation, Inference and Reasoning

4.1 Logical Agents

- Knowledge-based agents are useful for the representation of knowledge and for processes of reasoning. As with humans, knowledge and reasoning are important for artificial agents because they enable successful behaviors in various scenarios.

- The central component of a knowledge-based agent is its knowledge base, or KB. A knowledge base is a set of sentences, expressed in a language called a knowledge representation language.

- The agent maintains a knowledge base, KB; this may initially contain some background knowledge. Sometimes we dignify a sentence with the name axiom. An axiom is a sentence or proposition that is not proved or demonstrated and is considered self-evident, or is taken as an initial necessary consensus for building or accepting a theory. As required, new sentences are added to the knowledge base, and further sentences are derived from old axioms and theorems; this derivation is called inference.

- Logic is a method of reasoning in which conclusions are drawn from premises using rules of inference. Logic is a knowledge representation technique that involves:
* Syntax: defines well-formed sentences or legal expression in the language

* Semantics: defines the "meaning" of sentences

* Inference rules: for manipulating sentences in the language

Basically, logic can be classified as:

a) Propositional logic (statement logic, or the propositional calculus)

b) Predicate logic [or First Order Predicate Logic (FOPL)]
Knowledge: Knowledge is awareness or familiarity gained through experience of facts, data, and situations.

Knowledge acquisition refers to the process of extracting, structuring, and organizing domain knowledge from domain experts into a program.

Knowledge Representation and Reasoning: Knowledge representation and reasoning (KRR) is the field of artificial intelligence dedicated to representing information about the world in a form that a computer system can utilize to solve complex tasks.

A knowledge engineer is an expert in AI languages and knowledge representation who investigates a particular problem domain, determines important concepts, and creates correct and efficient representations of the objects and relations in the domain.

Knowledge engineering can be divided into four phases:

1. Planning
2. Knowledge extraction
3. Knowledge analysis

4. Knowledge verification
Knowledge Representation:
Ø To solve large numbers of complex problems in AI we need:
- a large body of knowledge (a knowledge base), and
- mechanisms to manipulate that knowledge to solve problems and create new solutions.
Ø Knowledge representation is the study of the ways in which knowledge can be represented or pictured.
Ø For representation, two entities are important:
i. Fact: a truth in the real world (knowledge base)
ii. Representation of the fact (symbolic representation)

Fig: Mapping between facts and representations. Facts (e.g. stated in English) map to an internal representation through understanding, and back through generation; reasoning programs operate on the internal representation.

Representation Mapping:
i. Forward representation: fact to representation
ii. Backward representation: representation to fact

Types of Knowledge:
There are 5 main types of knowledge representation in Artificial Intelligence.
· Meta Knowledge – knowledge about knowledge and how to gain it.
· Heuristic Knowledge – represents the knowledge of an expert in a field or subject.
· Procedural Knowledge – gives knowledge about how to achieve something.
· Declarative Knowledge – statements that describe a particular object and its attributes, including its behavior in relation to other objects.
· Structural Knowledge – describes what relationships exist between concepts/objects.

Approaches of Knowledge Representation
1. Simple relational knowledge:
- A simple way of storing facts using the relational method, e.g. a tabular representation.
2. Inheritable knowledge:
- Data should be stored in a hierarchy of classes.
- Classes must be arranged in a generalization hierarchy.
- Every individual frame can represent a collection of attributes and their values.
Eg: (frame hierarchy figure)
3. Inferential knowledge:
- Represents knowledge as formal logic.
E.g. All dogs have tails: ∀x: DOG(x) → TAIL(x)
4. Procedural knowledge:
- Procedural knowledge involves knowing how to do something.
- Procedural knowledge can be expressed in different ways in a program.
- Example: a computer program

4.2 Propositional Logic
- Proposition/Statement: a declarative sentence which is either true or false, but not both.
- A proposition is a declarative sentence to which only one of the truth values (TRUE or FALSE) can be assigned, but not both. Hence, propositional logic is also called Boolean logic. When a proposition is true, we say that its truth value is T; otherwise its truth value is F.
For example:
§ The square of 4 is 16 → T
§ The square of 5 is 27 → F
§ Every college will have a computer by 2010 AD → we cannot know its truth value; it may be true (T) or false (F), but not both, so it is a proposition.
- Propositions are of two types:
i. Atomic statement: a statement that cannot be simplified further; a singular sentence.
ii. Compound statement: two or more statements connected together with logical connectives such as AND (∧), OR (∨), implication (→), etc.

Syntax:
There are five connectives in common use:

Name                                  Representation       Meaning
Negation (not)                        ¬p or ~p             "not p"
Conjunction (and)                     p ∧ q or p & q       "p and q"
Disjunction (or)                      p ∨ q                "p or q (or both)"
Exclusive Or                          p ⊕ q                "either p or q, but not both"
Implication (if...then)               p → q                "if p then q"
Bi-conditional or Bi-implication      p ↔ q or p ≡ q       "p if and only if q" or "p iff q"
(if and only if)

Note (Set Theory / Artificial Intelligence / Logic):
Union (∪) / Disjunction (∨) / OR (+)
Intersection (∩) / Conjunction (∧) / AND (.)
Complement / Negation (¬ or ~) / NOT (bar)

Example-1
The meaning of the statements P = "it is raining" and Q = "I am indoors" is transformed when the two are combined with logical connectives:
§ It is raining and I am indoors (P ∧ Q)
§ If it is raining, then I am indoors (P → Q)
§ If I am indoors, then it is raining (Q → P)
§ I am indoors if and only if it is raining (P ↔ Q)
§ It is not raining (¬P)

Example-2
Let P: "This book is good" and Q: "This book is cheap". Now:
§ This book is good and cheap → P ∧ Q
§ This book is not good but cheap → ¬P ∧ Q
§ This book is costly but good → ¬Q ∧ P
§ This book is neither good nor cheap → ¬P ∧ ¬Q
§ This book is either good or cheap → P ∨ Q

Implication (p → q)
Variety of terminology:
• if p then q
• if p, q
• p is sufficient for q
• q if p
• q when p
• p implies q
• p only if q
• a sufficient condition for q is p
• q whenever p
• q is necessary for p
• q follows from p
• a necessary condition for p is q

Bi-conditionals (p ↔ q)
Variety of terminology:
• p is necessary and sufficient for q
• if p then q, and conversely
• p if and only if q
• p iff q
p ↔ q is equivalent to (p → q) ∧ (q → p) = (¬p ∨ q) ∧ (¬q ∨ p)

Semantics
The semantics defines the rules for determining the truth of a sentence (specified by the syntax) with respect to a particular model. In propositional logic, a model simply fixes the truth value—true or false—for every proposition symbol.

A. Truth Table
The truth tables specify the truth value of a complex sentence for each possible assignment of truth values to its components. Truth tables for the five connectives are given in the figure.

@ a ↔ b: it is a formula.
@ a ≡ b: it is not a formula, but it denotes the relationship between a and b.
@ If P → Q is any compound proposition, then:
* Converse: Q → P = ¬Q ∨ P
* Contrapositive: ¬Q → ¬P = P → Q
* Inverse: ¬P → ¬Q = Q → P

Tautology, contradiction & contingency
- A compound proposition whose truth value is always true is called a tautology. For instance, p ∨ ¬p is a tautology.
- A contradiction is always false whatever the truth values of its variables. For instance, p ∧ ¬p is a contradiction.
- A compound proposition that is neither a tautology nor a contradiction is called a contingency. For e.g. ¬p ∧ ¬q is a contingency.
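These three classes can be checked mechanically by enumerating the full truth table of a formula. In the sketch below, a formula is represented as a Python function over a truth assignment (a dict of variable values); `classify` is an illustrative helper name, not a library function.

```python
from itertools import product

def classify(formula, variables):
    """Classify a propositional formula by enumerating its full truth table."""
    results = [formula(dict(zip(variables, values)))
               for values in product([True, False], repeat=len(variables))]
    if all(results):
        return "tautology"       # true in every model
    if not any(results):
        return "contradiction"   # false in every model
    return "contingency"         # true in some models, false in others

print(classify(lambda m: m["p"] or not m["p"], ["p"]))                # tautology
print(classify(lambda m: m["p"] and not m["p"], ["p"]))               # contradiction
print(classify(lambda m: (not m["p"]) and (not m["q"]), ["p", "q"]))  # contingency
```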

Rules of Inference in Propositional Logic

Example-1: “It is not sunny this afternoon and it is colder than yesterday”, “We will go
swimming only if it is sunny”, “If we do not go swimming then we take a canoe trip”, and “If we take
a canoe trip then we will be home by sunset” lead to the conclusion “We will be home by sunset”.

Solution: Let
p = “It is sunny this afternoon”
q = “It is colder than yesterday”
r = “We will go swimming”
s = “We will take a canoe trip”
t = “We will be home by sunset”
Hypothesis: Øp∧q, r→p, Ør→s, s→t
Conclusion: t

Steps Operations Reasons


1. Øp∧q Given hypothesis
2. Øp Using simplification rule of inference on 1
3. r→p Given hypothesis
4. Ør Using modus tollens on 2 and 3
5. Ør→s Given hypothesis
6. s Using modus ponens on 4 and 5
7. s→t Given hypothesis
8. t Using modus ponens on 6 and 7

Hence the given conclusion t is proved.

§ Theorem proving—applying rules of inference directly to the sentences in our knowledge base to construct a proof of the desired sentence without consulting models.

Example-2: "If you send me an e-mail message then I will finish writing the program", "If you do not send me an e-mail message then I will go to sleep early", and "If I go to sleep early then I will wake up feeling refreshed" lead to the conclusion "If I do not finish writing the program then I will wake up feeling refreshed".
Solution: Let
p = "You send me an e-mail message"
q = “I will finish writing the program”
r = “I will go to sleep early”
s = “I will wake up feeling refreshed”
Hypothesis: p→q, Øp→r, r→s
Conclusion: Øq→s
Steps Operations Reasons
1. p→q Given hypothesis
2. Øq→Øp Using contrapositive on 1
3. Øp→r Given hypothesis
4. Øq→r Using hypothetical syllogism on 2 and 3
5. r→s Given hypothesis
6. Øq→s Using hypothetical syllogism on 4 and 5

Hence the given hypotheses lead to the conclusion ¬q→s.
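Derivations like the two above can be sanity-checked by brute-force model enumeration. The sketch below checks semantic entailment (KB |= conclusion); `entails` and `implies` are illustrative helper names, not standard library functions.

```python
from itertools import product

def entails(premises, conclusion, variables):
    """KB |= conclusion iff every model satisfying all premises satisfies the conclusion."""
    for values in product([True, False], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(p(model) for p in premises) and not conclusion(model):
            return False  # counterexample model found
    return True

def implies(a, b):
    return (not a) or b

# Example-2: p->q, not p->r, r->s entail not q->s
premises = [
    lambda m: implies(m["p"], m["q"]),
    lambda m: implies(not m["p"], m["r"]),
    lambda m: implies(m["r"], m["s"]),
]
conclusion = lambda m: implies(not m["q"], m["s"])
print(entails(premises, conclusion, ["p", "q", "r", "s"]))  # True
```

Model enumeration and rule-based proof agree here: the derivation by contrapositive and hypothetical syllogism corresponds to the absence of any counterexample model.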
B. Resolution in Propositional Logic

- The resolution principle was introduced by John Alan Robinson in 1965.
- The resolution technique can be applied to sentences in propositional logic and in first-order logic. It can be used only on disjunctions of literals to derive new conclusions.
- The resolution rule for the propositional calculus can be stated as follows: from (P ∨ Q) and (¬P ∨ R), we can derive (Q ∨ R). Resolution refutation will terminate with the empty clause exactly when the entailment holds (i.e. KB |= p). There are two basic methods for theorem proving using resolution:

§ Forward chaining: start with the given axioms (or universal rules) and use the rules of inference to prove the theorem.
§ Backward chaining: prove that the negation of the result (or goal) cannot be true. This method is commonly known as theorem proving using refutation, or proof by contradiction.

For example: Express the following statements in propositional logic:
§ If it is hot then it is humid.
§ If it is humid, then it will rain.
§ It is hot.
Use resolution refutation to prove the statement "It will rain".

Solution: Let us denote these statements with the propositions H, O, and R:
H = It is humid.
O = It is hot.
R = It will rain.
Formulas corresponding to the sentences, and their CNF:
1. O → H    CNF: 1. ¬O ∨ H
2. H → R         2. ¬H ∨ R
3. O             3. O
To prove: R
The negation of the conclusion is ¬R. Let us assume it will not rain [¬R].
Resolving ¬R with ¬H ∨ R gives ¬H; resolving ¬H with ¬O ∨ H gives ¬O; resolving ¬O with O gives false (the empty clause, a contradiction).
Since an empty clause has been deduced, we say that our assumption is wrong, and hence we have proved "It will rain".

Note:
## A Horn clause is a disjunction of literals of which at most one is positive. So all definite clauses are Horn clauses, as are clauses with no positive literals; these are called goal clauses. Horn clauses are closed under resolution: if you resolve two Horn clauses, you get back a Horn clause.

Example-1: Let
j1 = A ∨ B ∨ ¬D
j2 = A ∨ B ∨ C ∨ D
j3 = ¬B ∨ C
j4 = ¬A
j = C

1. A ∨ B ∨ ¬D [hypothesis j1]
2. A ∨ B ∨ C ∨ D [hypothesis j2]
3. A ∨ B ∨ C [resolution on 1, 2 with ¬D, D]
4. ¬B ∨ C [hypothesis j3]
5. A ∨ C [resolution on 3, 4 with B, ¬B]
6. ¬A [hypothesis j4]
7. C [resolution on 5, 6 with A, ¬A]

This proof may also be represented graphically (as a resolution tree).

Conjunctive Normal Form (CNF)

· A formula is in conjunctive normal form (CNF), or clausal normal form, if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is an AND of ORs.
· Since every sentence of propositional logic is logically equivalent to a conjunction of clauses, any sentence can be converted into an equivalent conjunction of clauses. A sentence expressed as a conjunction of clauses is said to be in conjunctive normal form, or CNF.

Conversion to Conjunctive Normal Form (CNF)
We illustrate the procedure by converting the sentence B ⇔ (P ∨ Q) into CNF. The steps are as follows:
Step 1: Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α):
(B ⇒ (P ∨ Q)) ∧ ((P ∨ Q) ⇒ B)
Step 2: Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β:
(¬B ∨ P ∨ Q) ∧ (¬(P ∨ Q) ∨ B)
Step 3: CNF requires ¬ to appear only in literals, so we "move ¬ inwards" by repeated application of the following equivalences:
§ ¬(¬α) ≡ α (double-negation elimination)
§ ¬(α ∧ β) ≡ (¬α ∨ ¬β) (De Morgan)
§ ¬(α ∨ β) ≡ (¬α ∧ ¬β) (De Morgan)
In the example, we require just one application of the last rule:
(¬B ∨ P ∨ Q) ∧ ((¬P ∧ ¬Q) ∨ B)
Step 4: Now we have a sentence containing nested ∧ and ∨ operators applied to literals. We apply the distributivity law, distributing ∨ over ∧ wherever possible:
(α ∨ (β ∧ γ)) ≡ ((α ∨ β) ∧ (α ∨ γ)) (distributivity of ∨ over ∧)
(¬B ∨ P ∨ Q) ∧ (¬P ∨ B) ∧ (¬Q ∨ B)
The original sentence is now in CNF, as a conjunction of three clauses.
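Resolution refutation over CNF clauses, as in the rain example above, can be mechanized. The sketch below is a minimal, unoptimized saturation prover: clauses are sets of string literals, with "~" marking negation; all names and the clause encoding are chosen here for illustration.

```python
def negate(lit):
    # "~P" <-> "P"
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    """Return all resolvents of two clauses (sets of string literals)."""
    resolvents = []
    for lit in c1:
        if negate(lit) in c2:
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents

def refute(kb, goal):
    """Resolution refutation: add the negated goal and try to derive the empty clause."""
    clauses = {frozenset(c) for c in kb} | {frozenset({negate(goal)})}
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:
                        return True   # empty clause derived: goal is proved
                    new.add(frozenset(r))
        if new <= clauses:
            return False              # saturated without the empty clause: not entailed
        clauses |= new

# The rain example: CNF clauses ~O v H, ~H v R, O; goal R.
kb = [{"~O", "H"}, {"~H", "R"}, {"O"}]
print(refute(kb, "R"))  # True: "It will rain" is proved
```

The same call reproduces Example-1: `refute([{"A","B","~D"}, {"A","B","C","D"}, {"~B","C"}, {"~A"}], "C")` derives the empty clause along essentially the steps listed above.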

4.3 First Order Predicate Logic (FOPL)


- First-order logic is another way of knowledge representation in artificial intelligence. It is an
extension to propositional logic.
- FOL is sufficiently expressive to represent the natural language statements in a concise way.
- First-order logic is also known as Predicate logic or First-order predicate logic. First-order logic is a powerful language that expresses information about objects in a natural way and can also express the relationships between those objects.
- First-order logic does not only assume that the world contains facts, as propositional logic does, but also assumes the following things in the world:

i. Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, etc.


ii. Relations: these can be unary relations such as: red, round, is adjacent; or n-ary relations such as: the sister of, brother of, has color, comes between.
iii. Predicate:
- A predicate is a part of a declarative sentence describing the properties of an object or a relation among objects. For example, "is a student" is a predicate, as in "A is a student" and "B is a student".
- Thus a predicate binds the atoms or terms together.
iv. Function: father of, best friend, third inning of, end of, etc.
v. Quantifiers (described below).

As with a natural language, first-order logic also has two main parts:
a. Syntax: The syntax of FOL determines which collections of symbols form a logical expression in first-order logic.
b. Semantics: The semantics determine the meaning (truth value) of those expressions.

Basic Elements of First-order logic
Following are the basic elements of FOL syntax:
Constants      1, 2, A, John, Mumbai, cat, ....
Variables      x, y, z, a, b, ....
Predicates     Brother, Father, >, ....
Functions      sqrt, LeftLegOf, ....
Connectives    ∧, ∨, ¬, ⇒, ⇔
Equality       ==
Quantifiers    ∀, ∃

# Quantifier
- A quantifier is a symbol that permits one to declare the range or scope of the variables in a logical expression.
- Quantifiers are the symbols which quantify the predicates of predicate logic.
- There are two types of quantifiers:
i. Universal quantifier (∀): a symbol of logical representation which specifies that the statement within its range is true for everything, or for every instance of a particular thing. It is read as "for all" (and can only be defined over a domain).
If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
ii. Existential quantifier (∃): expresses that the statement within its scope is true for at least one instance of something. It is denoted by the logical operator ∃.
If x is a variable, then the existential quantifier is written ∃x or ∃(x), and it is read as:
o There exists an x
o For some x
o For at least one x.

Note:
o The main connective used with the universal quantifier is implication (→).
o The main connective used with the existential quantifier is conjunction (∧).

Properties of Quantifiers:
o For universal quantifiers, ∀x∀y is equivalent to ∀y∀x.
o For existential quantifiers, ∃x∃y is equivalent to ∃y∃x.
o ∃x∀y is not equivalent to ∀y∃x.

Some valid well-formed formulas (WFFs):
§ MAN (ram)
§ PILOT (father_of (hari))
§ ∀x (NUMBER (x) → ∃y GREATERTHAN (y, x))
§ ∀x∀y∀z (FATHER (x, y) ∧ FATHER (y, z) → GRANDFATHER (x, z))

Some invalid WFFs:
§ ∀P P(x) → Q(x) [a predicate cannot be quantified]
§ MAN (¬ram) [negation of a constant is not possible; negation applies to formulas]
§ father_of (Q(x)) [a predicate cannot appear inside a function]
§ MARRIED (MAN, WOMAN) [a predicate cannot take predicates as arguments; the arguments must be terms, i.e. constants written in small letters]

Note: An atomic formula is a WFF, and if P and Q are WFFs then ¬P, P∧Q, P∨Q, P→Q, P↔Q, ∀x P(x) and ∃x P(x) are WFFs.
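The quantifier properties above, in particular that ∃x∀y and ∀y∃x genuinely differ, can be checked by brute force over a small finite domain. A minimal sketch, with hypothetical predicates P and Q chosen only for illustration:

```python
# Brute-force check of the quantifier properties over the domain {0, 1, 2}.
DOMAIN = [0, 1, 2]

def forall(pred):
    return all(pred(v) for v in DOMAIN)

def exists(pred):
    return any(pred(v) for v in DOMAIN)

# P(x, y) = "x <= y" on the domain.
P = lambda x, y: x <= y
# Swapping two universal quantifiers preserves truth: ∀x∀y P ≡ ∀y∀x P.
assert forall(lambda x: forall(lambda y: P(x, y))) == \
       forall(lambda y: forall(lambda x: P(x, y)))

# Q(x, y) = "x == y" separates ∃x∀y from ∀y∃x:
Q = lambda x, y: x == y
print(exists(lambda x: forall(lambda y: Q(x, y))))   # False: no single x equals every y
print(forall(lambda y: exists(lambda x: Q(x, y))))   # True: each y is equalled by some x
```

The last two lines show why the order of mixed quantifiers matters: ∀y∃x Q holds while ∃x∀y Q fails on the same predicate.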
Example: Convert the following sentences into FOPL.

1. All men are people.
➢ ∀x MAN (x) → PEOPLE (x)
2. Marcus was Pompeian.
➢ POMPEIAN (Marcus)
3. All Pompeians were Romans.
➢ ∀x POMPEIAN (x) → ROMAN (x)
4. Ram tries to assassinate (kill) Hari.
➢ ASSASSINATE (Ram, Hari)
5. Everyone is loyal to someone.
➢ ∀x ∃y LOYAL (x, y)
6. Someone is loyal to everyone.
➢ ∃x ∀y LOYAL (x, y)
7. Every gardener likes the sun.
➢ ∀x GARDENER (x) → LIKES (x, sun)
8. Not every gardener likes the sun.
➢ ¬(∀x GARDENER (x) → LIKES (x, sun))
9. It is now 2010.
➢ NOW = 2010
10. All Romans were either loyal to Caesar or hated him.
➢ ∀x ROMAN (x) → LOYAL (x, Caesar) ∨ HATES (x, Caesar)
11. You can fool all the people at one time.
➢ ∀x ∃t PEOPLE (x) ∧ TIME (t) → CAN_BE_FOOLED (x, t)
12. No one likes everyone, i.e. ¬(likes everyone).
➢ ∀x ∃y ¬LIKES (x, y), or ¬(∃x ∀y LIKES (x, y))
13. Everyone is special to someone at least.
➢ ∀x (PERSON (x) → ∃y SPECIAL (x, y))
14. No coat is waterproof unless it has been specially treated.
➢ ∀x (COAT (x) ∧ ¬TREATED (x) → ¬WATERPROOF (x))
15. Socrates is a man. All men are mortal; therefore Socrates is mortal.
➢ MAN (Socrates), ∀x MAN (x) → MORTAL (x), therefore MORTAL (Socrates)
16. Some student in this class has studied mathematics.
➢ Let
o S(x) = "x is a student in this class"
o M(x) = "x has studied mathematics"
Hence, the required expression is: ∃x [S(x) ∧ M(x)] [Note: ∧ appears due to ∃]
17. Every student in this class has visited Mexico.
➢ Let
o S(x) = "x is a student in this class"
o M(x) = "x has visited Mexico"
Hence, the required expression is: ∀x [S(x) → M(x)] [Note: → appears due to ∀]
18. Not all birds can fly.
➢ ¬(∀x BIRD(x) → FLY(x))
19. If anyone can solve the problem then Shyam can.
➢ (∃x SOLVES(x, problem)) → SOLVES(Shyam, problem)
20. Nobody in the math class is smarter than everyone in the AI class.
➢ ¬[∃x TAKES_MATH(x) ∧ (∀y TAKES_AI(y) → SMARTER_THAN(x, y))]
21. No purple mushroom is poisonous.
➢ ¬∃x (PURPLE(x) ∧ MUSHROOM(x) ∧ POISONOUS(x))
or ∀x (PURPLE(x) ∧ MUSHROOM(x) → ¬POISONOUS(x))
22. John hates all people who don't hate themselves.
➢ ∀x PERSON(x) ∧ ¬HATES(x, x) → HATES(John, x)
23. Gita loves all types of clothes.
➢ ∀x clothes(x) → loves(Gita, x)
24. Suits are clothes.
➢ ∀x suits(x) → clothes(x)
25. Jackets are clothes.
➢ ∀x jackets(x) → clothes(x)
26. Anything anyone wears and isn't bad is clothes.
➢ ∀x ∀y wears(x, y) ∧ ¬bad(y) → clothes(y)
27. Sita wears a skirt and is good.
➢ wears(Sita, skirt) ∧ good(Sita)
28. Renu wears anything Sita wears.
➢ ∀x wears(Sita, x) → wears(Renu, x)

Note: For more questions refer to the class notes.

What is knowledge engineering?
The process of constructing a knowledge base in first-order logic is called knowledge engineering. In knowledge engineering, someone who investigates a particular domain, learns the important concepts of that domain, and generates a formal representation of the objects is known as a knowledge engineer.

The knowledge-engineering process
1. Identify the task.
2. Assemble the relevant knowledge.
3. Decide on a vocabulary.
4. Encode general knowledge about the domain.
5. Encode the description of the problem instance.
6. Use inference to get answers.
7. Debug the knowledge base.

4.4.1 Conjunctive Normal Form (CNF)
Clause: A disjunction of literals (atomic sentences) is called a clause. A clause containing a single literal is known as a unit clause.
Conjunctive Normal Form: A sentence represented as a conjunction of clauses is said to be in conjunctive normal form, or CNF.

In propositional logic, the resolution method is applied only to clauses, i.e. disjunctions of literals. The following steps are used to convert a formula into CNF:
1) Eliminate bi-conditionals by replacing A ⇔ B with (A → B) ∧ (B → A).
2) Eliminate implications by replacing A → B with ¬A ∨ B.
3) In CNF, negation (¬) appears only in literals, therefore move it inwards using:
· ¬∀x P(x) ≡ ∃x ¬P(x)
· ¬∃x P(x) ≡ ∀x ¬P(x)
· ¬(¬A) ≡ A (double-negation elimination)
· ¬(A ∧ B) ≡ (¬A ∨ ¬B) (De Morgan)
· ¬(A ∨ B) ≡ (¬A ∧ ¬B) (De Morgan)
4) Eliminate quantifiers.
5) Finally, apply the distributive law to the sentences and form the CNF as:
(A1 ∨ B1) ∧ (A2 ∨ B2) ∧ .... ∧ (An ∨ Bn).

4.4.2 Resolution in FOPL
- Resolution is a theorem-proving technique that proceeds by building refutation proofs, i.e. proofs by contradiction. It was invented by the mathematician John Alan Robinson in 1965.
- Resolution is used when various statements are given and we need to prove a conclusion from those statements. Unification is a key concept in proofs by resolution.
- Resolution is a single inference rule which can efficiently operate on formulas in conjunctive normal form or clausal form.

The resolution inference rule:
The resolution rule for first-order logic is simply a lifted version of the propositional rule. Resolution can resolve two clauses if they contain complementary literals, which are assumed to be standardized apart so that they share no variables. In text form (the original figure is reproduced here): from the clauses

l1 ∨ .... ∨ lk   and   m1 ∨ .... ∨ mn,

where li and mj are complementary literals (i.e. UNIFY(li, ¬mj) = θ), infer

SUBST(θ, l1 ∨ .... ∨ li-1 ∨ li+1 ∨ .... ∨ lk ∨ m1 ∨ .... ∨ mj-1 ∨ mj+1 ∨ .... ∨ mn).

This rule is also called the binary resolution rule because it resolves exactly two literals.

Steps for Resolution:
1. Convert the facts into first-order logic.
2. Convert the FOL statements into CNF.
3. Negate the statement which is to be proved (proof by contradiction).
4. Draw the resolution graph (using unification).

Examples of Resolution in Predicate Logic:
Example 1
a. John likes all kinds of food.
b. Apple and vegetable are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.

Step-1: Conversion of Facts into FOL
In the first step we convert all the given statements into first-order logic.
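The propositional core of the CNF conversion in 4.4.1 (steps 1, 2, 3 and 5; the quantifier handling of step 4 is omitted) can be sketched directly. This is an illustrative sketch, not a full FOL converter; formulas are nested tuples of a made-up shape chosen for this example:

```python
# Formulas: ('var', name), ('not', f), ('and', f, g), ('or', f, g),
#           ('imp', f, g), ('iff', f, g).

def elim_iff_imp(f):
    # Step 1: A <-> B  becomes (A -> B) & (B -> A); step 2: A -> B becomes ~A | B.
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        return ('not', elim_iff_imp(f[1]))
    a, b = elim_iff_imp(f[1]), elim_iff_imp(f[2])
    if op == 'iff':
        return ('and', ('or', ('not', a), b), ('or', ('not', b), a))
    if op == 'imp':
        return ('or', ('not', a), b)
    return (op, a, b)

def push_not(f, neg=False):
    # Step 3: move negation inwards (De Morgan + double-negation elimination).
    op = f[0]
    if op == 'var':
        return ('not', f) if neg else f
    if op == 'not':
        return push_not(f[1], not neg)
    flipped = {'and': 'or', 'or': 'and'}[op] if neg else op
    return (flipped, push_not(f[1], neg), push_not(f[2], neg))

def distribute(f):
    # Step 5: distribute | over & until the formula is a conjunction of clauses.
    op = f[0]
    if op in ('var', 'not'):
        return f
    a, b = distribute(f[1]), distribute(f[2])
    if op == 'or':
        if a[0] == 'and':
            return distribute(('and', ('or', a[1], b), ('or', a[2], b)))
        if b[0] == 'and':
            return distribute(('and', ('or', a, b[1]), ('or', a, b[2])))
    return (op, a, b)

def to_cnf(f):
    return distribute(push_not(elim_iff_imp(f)))

A, B, C = ('var', 'A'), ('var', 'B'), ('var', 'C')
# A -> (B & C)  becomes  (~A | B) & (~A | C):
print(to_cnf(('imp', A, ('and', B, C))))
```

Running the last line yields ('and', ('or', ('not', ('var', 'A')), ('var', 'B')), ('or', ('not', ('var', 'A')), ('var', 'C'))), matching the hand conversion.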

Step-2: Conversion of FOL into CNF
In first-order logic resolution, the FOL must be converted into CNF, since the CNF form makes resolution proofs easier.

o Eliminate all implications (→) and rewrite:
a. ∀x ¬food(x) ∨ likes(John, x)
b. food(Apple) ∧ food(vegetables)
c. ∀x ∀y ¬[eats(x, y) ∧ ¬killed(x)] ∨ food(y)
d. eats(Anil, Peanuts) ∧ alive(Anil)
e. ∀x ¬eats(Anil, x) ∨ eats(Harry, x)
f. ∀x ¬[¬killed(x)] ∨ alive(x)
g. ∀x ¬alive(x) ∨ ¬killed(x)
h. likes(John, Peanuts).

o Move negation (¬) inwards and rewrite:
a. ∀x ¬food(x) ∨ likes(John, x)
b. food(Apple) ∧ food(vegetables)
c. ∀x ∀y ¬eats(x, y) ∨ killed(x) ∨ food(y)
d. eats(Anil, Peanuts) ∧ alive(Anil)
e. ∀x ¬eats(Anil, x) ∨ eats(Harry, x)
f. ∀x killed(x) ∨ alive(x)
g. ∀x ¬alive(x) ∨ ¬killed(x)
h. likes(John, Peanuts).

o Rename (standardize) variables apart:
a. ∀x ¬food(x) ∨ likes(John, x)
b. food(Apple) ∧ food(vegetables)
c. ∀y ∀z ¬eats(y, z) ∨ killed(y) ∨ food(z)
d. eats(Anil, Peanuts) ∧ alive(Anil)
e. ∀w ¬eats(Anil, w) ∨ eats(Harry, w)
f. ∀g killed(g) ∨ alive(g)
g. ∀k ¬alive(k) ∨ ¬killed(k)
h. likes(John, Peanuts).

o Eliminate existential quantifiers.
In this step we eliminate the existential quantifier ∃; this process is known as Skolemization. In this example problem there is no existential quantifier, so all the statements remain the same.

o Drop universal quantifiers.
In this step we drop all universal quantifiers, since all the remaining variables are implicitly universally quantified and the quantifiers are no longer needed:
a. ¬food(x) ∨ likes(John, x)
b. food(Apple)
c. food(vegetables)
d. ¬eats(y, z) ∨ killed(y) ∨ food(z)
e. eats(Anil, Peanuts)
f. alive(Anil)
g. ¬eats(Anil, w) ∨ eats(Harry, w)
h. killed(g) ∨ alive(g)
i. ¬alive(k) ∨ ¬killed(k)
j. likes(John, Peanuts).

o Distribute conjunction (∧) over disjunction (∨).
This step makes no change in this problem.

Step-3: Negate the statement to be proved
We apply negation to the conclusion statement, which is written as ¬likes(John, Peanuts).

Step-4: Draw the resolution graph
Now in this step we solve the problem with a resolution tree using substitution. For the above problem it is given as follows:
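Since the resolution graph is a figure, the same refutation can be checked mechanically at the propositional level, using the ground instances of the clauses above (the substitutions that unification would produce on this problem, e.g. x = Peanuts, y = Anil). A brute-force saturation sketch:

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolve(c1, c2):
    # All resolvents of two clauses on one complementary literal pair.
    return [frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
            for lit in c1 if negate(lit) in c2]

def refutes(clauses):
    # Saturate until the empty clause appears (contradiction) or nothing new.
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True          # empty clause derived
                new.add(r)
        if new <= clauses:
            return False
        clauses |= new

kb = [frozenset(c) for c in [
    {'~food(Peanuts)', 'likes(John,Peanuts)'},
    {'food(Apple)'}, {'food(vegetables)'},
    {'~eats(Anil,Peanuts)', 'killed(Anil)', 'food(Peanuts)'},
    {'eats(Anil,Peanuts)'}, {'alive(Anil)'},
    {'~eats(Anil,Peanuts)', 'eats(Harry,Peanuts)'},
    {'killed(Anil)', 'alive(Anil)'},
    {'~alive(Anil)', '~killed(Anil)'},
    {'~likes(John,Peanuts)'},            # negated goal (Step-3)
]]
print(refutes(kb))   # True: likes(John, Peanuts) follows from the KB
```

Without the negated goal the saturation finds no empty clause, which is exactly the proof-by-contradiction pattern of Steps 3 and 4.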
Example 2. Given: John likes all kinds of foods. Apples are food. Chicken is food. Prove that John likes peanuts using resolution.
➢ FOPL
§ ∀x FOOD (x) → LIKES (John, x), or ¬FOOD (x) ∨ LIKES (John, x)
§ FOOD (apples)
§ FOOD (chicken)
Now, we have to prove: LIKES (John, peanuts).
To prove the statement using resolution (proof by contradiction), take its negation: ¬LIKES (John, peanuts)
Now,
Since ¬LIKES (John, peanuts) is not possible, LIKES (John, peanuts) is proved.

Example 3. Given: Bhaskar is a physician. All physicians know surgery. Prove that Bhaskar knows surgery using the principle of resolution.
➢ FOPL
§ PHYSICIAN (Bhaskar)
§ ∀x PHYSICIAN (x) → KNOWS (x, surgery), or ¬PHYSICIAN (x) ∨ KNOWS (x, surgery)
Now, we have to prove: KNOWS (Bhaskar, surgery)
To prove the statement using resolution (proof by contradiction), take its negation: ¬KNOWS (Bhaskar, surgery)
Now,
Hence the negation of the conclusion is a complete contradiction with the given set of statements. Since ¬KNOWS (Bhaskar, surgery) is not possible, KNOWS (Bhaskar, surgery) is proved.

Example 4. Given: All carnivorous animals have sharp teeth. Tiger is carnivorous. Fox is carnivorous. Prove that the tiger has sharp teeth.
➢ FOPL
§ ∀x CARNIVOROUS (x) → SHARP_TEETH (x), or ¬CARNIVOROUS (x) ∨ SHARP_TEETH (x)
§ CARNIVOROUS (tiger)
§ CARNIVOROUS (fox)
Now, we have to prove: SHARP_TEETH (tiger)
To prove the statement using resolution (proof by contradiction), take its negation: ¬SHARP_TEETH (tiger)
Now,

Since ¬SHARP_TEETH (tiger) is not possible, SHARP_TEETH (tiger) is proved.

Example 5. Given: Gita only likes easy courses. Science courses are hard. All the courses in KMC are easy. KMC302 is a KMC course. Use resolution to answer the question "Which course would Gita like?"
➢ FOPL
§ ∀x EASY_COURSE (x) → LIKES (Gita, x), or ¬EASY_COURSE (x) ∨ LIKES (Gita, x)
§ HARD_COURSE (science)
§ ∀x KMC (x) → EASY_COURSE (x), or ¬KMC (x) ∨ EASY_COURSE (x)
§ KMC (KMC302)
Now, we have to prove: LIKES (Gita, KMC302)
To prove the statement using resolution (proof by contradiction), take its negation: ¬LIKES (Gita, KMC302)
Now,
Since ¬LIKES (Gita, KMC302) is not possible, LIKES (Gita, KMC302) is proved.

Example 6: Consider the following knowledge base:
1. Gita likes all kinds of food.
2. Mango and chapati are food.
3. Gita eats almond and is still alive.
4. Anything eaten by anyone who is still alive is food.
Goal: Gita likes almond.
Solution: Convert the given sentences into FOPL:
1. ∀x: food(x) → likes(Gita, x)
2. food(Mango), food(chapati)
3. ∀x ∀y: eats(x, y) ∧ ¬killed(x) → food(y)
4. eats(Gita, almonds) ∧ alive(Gita)
5. ∀x: ¬killed(x) → alive(x)
6. ∀x: alive(x) → ¬killed(x)

Goal: likes(Gita, almond)
Negated goal: ¬likes(Gita, almond)
Now, rewrite in CNF form:
1. ¬food(x) ∨ likes(Gita, x)
2. food(Mango), food(chapati)
3. ¬eats(x, y) ∨ killed(x) ∨ food(y)
4. eats(Gita, almonds), alive(Gita)
5. killed(x) ∨ alive(x)
6. ¬alive(x) ∨ ¬killed(x)

Finally, construct the resolution graph:

Hence, we have achieved the given goal with the help of proof by contradiction. Thus, it is proved that Gita likes almond.

Difference between Propositional Logic and Predicate Logic
1) Propositional logic is a representational language that makes the assumption that the world can be represented solely in terms of propositions that are TRUE or FALSE. Predicate logic can reason about general properties and relationships that apply to collections of individuals.
2) Propositional logic combines atoms (each TRUE or FALSE) using connectives. Predicate logic talks about objects, their properties, and the relations among them.
3) Propositional logic is the simplest kind of logic. Predicate logic is an extension of propositional logic.
4) Propositional logic does not use any quantifiers. Predicate logic mainly uses two types of quantifiers: universal quantifiers and existential quantifiers.
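Resolution at the predicate level pairs complementary literals via unification (Step 4 of the resolution procedure). A minimal sketch of the unification algorithm, under the assumption (made up for this example) that lowercase strings are variables, capitalized strings are constants, and compound terms are tuples; the occurs check is omitted for brevity:

```python
def is_var(t):
    # Convention for this sketch: lowercase strings are variables.
    return isinstance(t, str) and t[:1].islower()

def substitute(t, s):
    # Apply substitution s to term t, following chains of bindings.
    if is_var(t):
        return substitute(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return tuple(substitute(a, s) for a in t)
    return t

def unify(a, b, s=None):
    # Return a substitution making a and b identical, or None on failure.
    s = {} if s is None else s
    a, b = substitute(a, s), substitute(b, s)
    if a == b:
        return s
    if is_var(a):
        s[a] = b; return s
    if is_var(b):
        s[b] = a; return s
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

# KNOWS(John, x) unifies with KNOWS(y, Mother(y)):
print(unify(('KNOWS', 'John', 'x'), ('KNOWS', 'y', ('Mother', 'y'))))
# {'y': 'John', 'x': ('Mother', 'John')}
```

The resulting substitution {y: John, x: Mother(John)} is exactly what a resolution step would apply (via SUBST) before combining the two clauses.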

4.5 Semantic Network:
- Semantic networks are an alternative to predicate logic for knowledge representation.
- In semantic networks, knowledge is represented in the form of graphical networks. The network consists of nodes representing objects and arcs describing the relationships between those objects.
- Semantic networks can categorize objects in different forms and can also link those objects.
- A semantic network is a knowledge representation model in the form of graphical schemes consisting of nodes and links among the nodes. Semantic networks were among the first knowledge representations to be implemented on computers.

This representation mainly uses two types of relations:
a. IS-A relation (inheritance)
b. Kind-of relation

Example:
Let us make a semantic net with the following piece of information: "Tweety is a yellow bird having wings to fly."
Facts:
Fact 1: Tweety is a bird.
Fact 2: Birds can fly.
Fact 3: Tweety is yellow in color.

The resulting network (the figure in the original): Tweety --(a kind of)--> bird; bird --(can)--> fly; Tweety --(color)--> yellow; bird --(has part)--> wing.

Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.

Advantages of semantic networks:
1. Semantic networks are a natural representation of knowledge.
2. Semantic networks convey meaning in a transparent manner.
3. These networks are simple and easily understandable.

Drawbacks of semantic representation:
1. Semantic networks take more computational time at runtime, as we may need to traverse the complete network tree to answer a question.
2. Semantic networks try to model human-like storage of information, but in practice it is not possible to build such a vast semantic network.
3. These representations are inadequate as they have no equivalent of quantifiers, e.g. "for all", "for some", "none", etc.
4. Semantic networks have no standard definition for the link names.
5. These networks are not intelligent by themselves; they depend on the creator of the system.

4.6 Frames:
· A frame is a record-like structure which consists of a collection of attributes and their values to describe an entity in the world.
· Frames are an AI data structure which divides knowledge into substructures by representing stereotyped situations.
· A frame consists of a collection of slots and slot values. These slots may be of any type and size. Slots have names and values, which are called facets.
· Frames are record-like structures that have slots and slot values for an entity. Using frames, the knowledge about an object or event can be stored together in the knowledge base as a unit.
· A frame is a collection of attributes (usually called slots) and associated values (and possibly constraints on values) that describes some entity in the world.

· In AI, frames are called slot-filler data representations. Slots hold the data values, and fillers may also be attached procedures which are called before, during, or after the slot's value is changed.

Frames as a knowledge representation technique:
· The concept of a frame is defined by a collection of attributes/slots.
· Each slot describes a particular attribute or operation of the frame.
· Slots are used to store values.

Example 1: a frame for a book:
Slots      Fillers
Title      Artificial Intelligence
Genre      Computer Science
Author     Peter Norvig
Edition    Third Edition
Year       1996
Pages      1152

Example 2: Employee details:
(Ram Sharma
  (PROFESSION (VALUE teacher))
  (EMPID (VALUE 502))
  (SUBJECT (VALUE computer)))

i.e. the frame for Ram Sharma has:
PROFESSION: teacher
EMPID: 502
SUBJECT: computer

How are objects related in a frame-based system?
There are three types of relationships between objects:
1. Generalization:
· It denotes a "kind-of" or "is-a" relationship between a super-class and its sub-class.
· For example: a car is a vehicle.
2. Aggregation:
· It is an "a-part-of" or "part-whole" relationship in which several sub-classes representing components are associated with a super-class representing the whole.
· For example: an engine is a part of a car.
3. Association:
· It describes some semantic relationship between different classes which are otherwise unrelated.
· For example: Mr. Joe owns a car.

Advantages of frame-based knowledge representation:
· We can define the given problem in an abstract way.
· Frames provide a means for the structured and concise representation of knowledge.
· In a single entity, a frame combines all the necessary knowledge about a particular object or concept.

Disadvantages:
· The idea behind frame-based systems is easy, but the implementation is difficult.
· A frame system cannot distinguish between essential properties and accidental properties of a frame.
· It is difficult to predict how the features will interact or to explain unexpected interactions, which makes debugging and updating difficult.

4.7 Rule-Based Systems:
· Rule-based systems are used as a way to store and manipulate knowledge in order to interpret information in a useful way. The idea is to use production rules, sometimes called IF-THEN rules. The syntax structure is
IF <premise> THEN <action>
<premise> is a Boolean expression; the AND, and to a lesser degree OR and NOT, logical connectives are possible.
<action> is a series of statements.
· A rule-based system represents knowledge in the form of a set of rules.

· Each rule represents a small chunk of knowledge relating to the given domain.
· A number of related rules, together with some known facts, may collectively correspond to a chain of inferences.
· An interpreter (inference engine) uses the facts and rules to derive conclusions about the current context and situation as presented by the user input.
· Suppose a rule-based system has the following statements:
R1: If A is an animal and A lays no eggs, then A is a mammal.
F1: Lucida is an animal.
F2: Lucida lays no eggs.
The inference engine will update the rule base after interpreting the above set as:
R1: If A is an animal and A lays no eggs, then A is a mammal.
F1: Lucida is an animal.
F2: Lucida lays no eggs.
F3: Lucida is a mammal.

A typical rule-based system has four basic components:
1. A list of rules, or rule base, which is a specific type of knowledge base.
2. An inference engine or semantic reasoner, which infers information or takes action based on the interaction of the input and the rule base.
3. Temporary working memory.
4. A user interface, or other connection to the outside world, through which input and output signals are received and sent.

Advantages of the rule-based approach:
· Naturalness of expression: expert knowledge can often be seen naturally as rules of thumb.
· Modularity: rules are independent of each other, so new rules can be added or revised later. The interpreter is independent of the rules.
· Restricted syntax: allows construction of rules and consistency checking by other programs, and allows (fairly easy) rephrasing into natural language.

Disadvantages (or limitations):
· Rule bases can be very large (thousands of rules).
· The rules may not reflect the actual decision making.
· The only structure in the KB is through the rule chaining.

For example: "If the patient has a stiff neck, high fever and a headache, check for brain meningitis." This can be represented in the rule-based approach as:
IF <FEVER, OVER, 39> AND <NECK, STIFF, YES> AND <HEAD, PAIN, YES> THEN
add(<PATIENT, DIAGNOSE, MENINGITIS>)

Unit-5 Reasoning
7.1 What is Reasoning?
• Reasoning is the act of deriving a conclusion from certain premises using a given methodology.
• Reasoning is a process of thinking; reasoning is logically arguing; reasoning is drawing inferences.
• Any system must reason if it is required to do something which it has not been told explicitly.
• For reasoning, the system must find out what it needs to know from what it already knows.

• Human reasoning capabilities are divided into three areas:
❖ Mathematical reasoning: axioms, definitions, theorems, proofs
❖ Logical reasoning: deductive, inductive, abductive
❖ Non-logical reasoning: linguistic, language

7.2 Types of Reasoning
1. Monotonic Reasoning
• Once you reach a conclusion, the addition of new facts will not change the conclusion.
• Definite clause logic is monotonic in the sense that anything that could be concluded before a clause is added can still be concluded after it is added; adding knowledge does not reduce the set of propositions that can be derived.
e.g. "The Earth rotates around the Sun."
- Adding new facts, such as "Saturn also revolves around the Sun", will not affect the conclusion.
• In monotonic reasoning, if we enlarge the set of axioms we cannot retract any existing assertions or axioms.
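The Lucida rule base from 4.7 can be run with a toy forward-chaining interpreter; it also illustrates the monotonic behaviour just described, since enlarging the fact set can only enlarge the set of derived conclusions. A minimal sketch with the facts written as made-up ground strings:

```python
# Each rule is (set of premises, conclusion); the engine keeps firing
# rules until no new fact can be added (the inference-engine loop).
rules = [
    ({'animal(Lucida)', 'lays_no_eggs(Lucida)'}, 'mammal(Lucida)'),
]

def forward_chain(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)   # fire the rule (adds F3 to the base)
                changed = True
    return facts

base = {'animal(Lucida)', 'lays_no_eggs(Lucida)'}
print(sorted(forward_chain(rules, base)))
# ['animal(Lucida)', 'lays_no_eggs(Lucida)', 'mammal(Lucida)']

# Monotonicity: adding an unrelated fact keeps every earlier conclusion.
assert forward_chain(rules, base) <= \
       forward_chain(rules, base | {'color(Lucida, brown)'})
```

Real rule-based systems use pattern variables and a working memory rather than ground strings, but the fire-until-fixpoint loop is the same.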


2. Non-monotonic Reasoning
• A logic is non-monotonic if the truth of a proposition may change when new information (axioms) is added.
• A logic is non-monotonic if some conclusions can be invalidated by adding more knowledge.
• Non-monotonic reasoning is useful for representing defaults. A default is a rule that can be used unless it is overridden by an exception.

3. Statistical Reasoning
• We have described several representation techniques that can be used to model belief systems in which, at any given point in time, a particular fact is believed to be true, believed to be false, or not considered to be either.
• But for some kinds of problems, it is useful to be able to describe beliefs that are not certain but for which there is some supporting evidence.
• For example, problems that contain genuine randomness, e.g. card games.
• For such problems, statistical measures may serve a very useful function as summaries of the world; rather than looking for all possible exceptions, we can use a numerical summary that tells us how often an exception of some sort can be expected to occur.

Bayes' Theorem
• Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning. It determines the probability of an event with uncertain knowledge.
• In probability theory, it relates the conditional probability and the marginal probabilities of two random events:
P(H|E) = P(E|H) P(H) / P(E)
• Read the expression as the probability of hypothesis H given that we have observed evidence E.
• To compute this, we need to take into account the prior probability of H (the probability that we would assign to H if we had no evidence) and the extent to which E provides evidence for H.
• To do this, we need to define a universe that contains an exhaustive, mutually exclusive set of hypotheses Hi, among which we are trying to discriminate.

Proof of Bayes' rule:
We know that:
P(a|b) = P(a ∧ b) / P(b)
P(a ∧ b) = P(a|b) P(b) .................... (1)
Similarly, P(b|a) = P(a ∧ b) / P(a)
P(a ∧ b) = P(b|a) P(a) .................... (2)
Equating (1) and (2):
P(a|b) P(b) = P(b|a) P(a)
i.e. P(b|a) = P(a|b) P(b) / P(a)

Example:
A doctor knows that the disease meningitis causes the patient to have a stiff neck 50% of the time. The doctor also knows that the probability that a patient has meningitis is 1/50,000, and the probability that any patient has a stiff neck is 1/20. Find the probability that a patient with a stiff neck has meningitis.


Here, we are given:
P(s|m) = 0.5
P(m) = 1/50,000
P(s) = 1/20
Now, using Bayes' rule:
P(m|s) = P(s|m) P(m) / P(s) = (0.5 × 1/50000) / (1/20) = 0.0002

Applications of Bayes' theorem:
o It is used to calculate the next step of a robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.

Bayesian networks:
- A Bayesian network is a probabilistic graphical model which represents a set of variables and their conditional dependencies using a directed acyclic graph.
- It is also called a Bayes network, belief network, decision network, or Bayesian model.
- It is a data structure to represent the dependencies among variables and to give a concise specification of any full joint probability distribution; in many cases it can do so very concisely.
- It can be used in various tasks, including prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision making under uncertainty.
- Bayesian networks can be used for building models from data and expert opinions, and a network consists of two parts:
▪ a directed acyclic graph;
▪ a table of conditional probabilities.

i. Each node corresponds to a random variable.
ii. Arcs, or directed arrows, represent the causal relationships or conditional probabilities between random variables. These directed links or arrows connect pairs of nodes in the graph.
o In the diagram (with nodes A, B, C and D), A, B, C and D are random variables represented by the nodes of the network graph.
o If we consider node B, which is connected to node A by a directed arrow, then node A is called the parent of node B.
o Node C is independent of node A.

Joint probability distribution:
If we have variables x1, x2, x3, ....., xn, then the probabilities of the different combinations of x1, x2, x3, ....., xn are known as the joint probability distribution.
P[x1, x2, x3, ....., xn] = P[x1 | x2, x3, ....., xn] P[x2, x3, ....., xn]
= P[x1 | x2, x3, ....., xn] P[x2 | x3, ....., xn] .... P[xn-1 | xn] P[xn].
In general, for each variable Xi we can write the equation as:
P(Xi | Xi-1, ........., X1) = P(Xi | Parents(Xi))

Question: Consider the burglary-alarm network, in which Burglary (B) and Earthquake (E) are the parents of Alarm (A), and Alarm is the parent of "David calls" (D) and "Sophia calls" (S). Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both David and Sophia have called.


Let's assume:
P(B = True) = 0.002, the probability of a burglary.
P(B = False) = 0.998, the probability of no burglary.
P(E = True) = 0.001, the probability of a minor earthquake.
P(E = False) = 0.999, the probability that an earthquake has not occurred.

We can provide the conditional probabilities as per the tables below.

Conditional probability table for Alarm (A):
The conditional probability of Alarm A depends on Burglary and Earthquake:
B      E      P(A = True)   P(A = False)
True   True   0.94          0.06
True   False  0.95          0.05
False  True   0.31          0.69
False  False  0.001         0.999

Conditional probability table for David calls:
The conditional probability that David will call depends on the probability of Alarm.
A      P(D = True)   P(D = False)
True   0.91          0.09
False  0.05          0.95

Conditional probability table for Sophia calls:
The conditional probability that Sophia calls depends on its parent node "Alarm".
A      P(S = True)   P(S = False)
True   0.75          0.25
False  0.02          0.98

From the formula of the joint distribution, we can write the problem statement in the form of a probability distribution:
P(S, D, A, ¬B, ¬E) = P(S|A) × P(D|A) × P(A|¬B ∧ ¬E) × P(¬B) × P(¬E)
= 0.75 × 0.91 × 0.001 × 0.998 × 0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using the joint distribution.

4. Uncertainty in Reasoning
• In many problem domains it is not possible to create complete, consistent models of the world. Therefore an agent must act in uncertain worlds. We want an agent to make rational decisions even when there is not enough information to prove that an action will work.
• AI systems must have the ability to reason under conditions of uncertainty.


Approaches to reasoning:
There are three different approaches to reasoning under uncertainty:
➢ Symbolic reasoning
➢ Statistical reasoning
➢ Fuzzy logic reasoning

a. Symbolic Reasoning:
• The basis for intelligent mathematical software is the integration of the "power of symbolic mathematical tools" with suitable "proof technology".
• The symbolic methods basically represent an uncertain belief as being True, False, or neither true nor false.
• Some methods also have problems with incomplete knowledge and with contradictions in the knowledge.
Mathematical reasoning enjoys a property called monotonicity, which says: "If a conclusion follows from given premises A, B, C, ... then it also follows from any larger set of premises, as long as the original premises A, B, C, ... are included."
Example: All dogs are mammals. All mammals are animals. Therefore, all dogs are animals.
Human reasoning is not monotonic.

Non-Monotonic Reasoning:
Non-monotonic reasoning is of the following types:
- Default reasoning: This is a very common form of non-monotonic reasoning. The conclusions are drawn based on what is most likely to be true.
- Circumscription: A formalized rule of conjecture (guess), e.g.:
∀x (Bird(x) ∧ ¬Abnormal(x) → Flies(x))
Here, it does not allow us to say that "Tweety flies", since we do not know whether Tweety is normal or abnormal.
- The Truth Maintenance System (TMS):
✓ It is assumption-based reasoning.
✓ It is a program that does the bookkeeping and consistency management for a problem solver.
✓ A TMS maintains the consistency of a knowledge base as soon as new knowledge is added. It considers only one state at a time, so it is not possible to manipulate the environment.

b. Statistical Reasoning
• In the logic-based approaches, we have assumed that everything is either believed false or believed true.
• However, it is often useful to represent the fact that we believe something is probably true, or true with some probability (say 0.65).

c. Fuzzy Logic (FL)
• The term fuzzy refers to things which are not clear, or are vague.
• Fuzzy logic is a logical system which is an extension of multivalued logic.
• Fuzzy logic is a means of specifying how well an object satisfies a vague description.
• The approach of FL imitates the way humans make decisions, by considering all the intermediate possibilities between the digital values YES and NO:
CERTAINLY YES
POSSIBLY YES
CANNOT SAY
POSSIBLY NO
CERTAINLY NO


Why Fuzzy Logic?
• Fuzzy logic is useful for commercial and practical purposes.
• It can control machines and consumer products.
• It may not give accurate reasoning, but it gives acceptable reasoning.
• Fuzzy logic helps to deal with uncertainty in engineering.
• In the real world we often encounter situations where we cannot determine whether a state is true or false; here fuzzy logic provides very valuable flexibility for reasoning. In this way, we can take account of the inaccuracies and uncertainties of any situation.
• The rules for evaluating the fuzzy truth T of a complex sentence are:
T(A ∧ B) = min(T(A), T(B))
T(A ∨ B) = max(T(A), T(B))
T(¬A) = 1 - T(A)

ARCHITECTURE of Fuzzy Logic
Its architecture contains four parts:
▪ RULE BASE: It contains the set of rules and the IF-THEN conditions provided by the experts to govern the decision-making system on the basis of linguistic information.
▪ FUZZIFICATION: It is used to convert inputs, i.e. crisp numbers, into fuzzy sets. Crisp inputs are basically the exact inputs measured by sensors and passed into the control system for processing, such as temperature, pressure, rpm, etc.
▪ INFERENCE ENGINE: It determines the matching degree of the current fuzzy input with respect to each rule and decides which rules are to be fired according to the input field.
▪ DEFUZZIFICATION: It is used to convert the fuzzy sets obtained by the inference engine into a crisp value. Several defuzzification methods are available, and the one best suited to a specific expert system is used to reduce the error.

Application Areas of Fuzzy Logic
The key application areas of fuzzy logic are:
• Automotive systems: automatic gearboxes, four-wheel steering, vehicle environment control.
• Consumer electronic goods: photocopiers, video cameras, televisions.
• Domestic goods: washing machines, microwave ovens, refrigerators, toasters, vacuum cleaners.
• Environment control: air conditioners/dryers/heaters, humidifiers.

Advantages of FLSs
• The mathematical concepts within fuzzy reasoning are very simple.
• You can modify an FLS by just adding or deleting rules, due to the flexibility of fuzzy logic.
• Fuzzy logic systems can take imprecise, distorted, noisy input information.
• FLSs are easy to construct and understand.
• Fuzzy logic is a solution to complex problems in all fields of life, including medicine, as it resembles human reasoning and decision making.
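The min/max truth rules above can be sketched in a few lines; the membership values below are invented purely for the example:

```python
# Fuzzy truth values lie in [0, 1] rather than {True, False}.
def f_and(a, b):
    return min(a, b)       # T(A AND B) = min(T(A), T(B))

def f_or(a, b):
    return max(a, b)       # T(A OR B) = max(T(A), T(B))

def f_not(a):
    return 1.0 - a         # T(NOT A) = 1 - T(A)

# Assumed membership values, purely for illustration.
t_tall = 0.7   # "the person is tall" is fairly true
t_fast = 0.4   # "the person is fast" is somewhat true

print(f_and(t_tall, t_fast))   # 0.4
print(f_or(t_tall, t_fast))    # 0.7
print(f_not(t_tall))           # about 0.3
```

Note that, unlike Boolean logic, the result of a connective is itself a degree of truth, which is what lets a fuzzy rule base fire rules "partially".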


Disadvantages of FLSs
• There is no systematic approach to fuzzy system design.
• They are understandable only when simple.
• They are suitable for problems which do not need high accuracy.

5. Case-Based Reasoning (CBR)
• It is a process of solving new problems based on the solutions of similar past problems.
• Case-based reasoning is reasoning by remembering.
• A case-based reasoner solves new problems by adapting solutions that were used to solve old problems.
• Case-based reasoning is a recent approach to problem solving and learning.
• Some examples:
- An auto mechanic who fixes an engine by recalling another car that exhibited similar symptoms is using case-based reasoning.
- A lawyer who advocates a particular outcome in a trial based on legal precedents, or a judge who creates case law, is using case-based reasoning.
- So, too, an engineer copying working elements of nature (practicing biomimicry) is treating nature as a database of solutions to problems.
• To get a prediction for a new example, the cases that are similar, or close, to the new example are used to predict the value of the target features of the new example.
• A CBR system can be used in risk monitoring, financial markets, defense, and marketing, to name a few. CBR learns from past experiences to solve new problems. Rather than relying on a domain expert to write the rules or make associations along generalized relationships between problem descriptors and conclusions, a CBR system learns from previous experience in the same way a physician learns from his patients.
• Case-based reasoning has been formalized for purposes of computer reasoning as a four-step process:
i. Retrieve: Determine the most similar cases.
ii. Reuse: Solve the new problem by re-using the information and knowledge in the retrieved cases.
iii. Revise: Evaluate the applicability of the proposed solution in the real world.
iv. Retain: Update the case base with the newly learned case for future problem solving.

CBR Assumptions:
• The main assumption is that similar problems have similar solutions.
• Two further assumptions are:
- The world is a regular place: what holds true today will probably hold true tomorrow.
- Solutions repeat: if they did not, there would be no point in remembering them.

Advantages of CBR:
• Solutions are quickly proposed.
• Domains do not need to be completely understood.
• Cases are useful for open-ended / ill-defined concepts.

Limitations of CBR:
• There are limits to CBR technology. The most important limitations relate to how cases are efficiently represented, how indexes are created, and how individual cases are generalized.
• Old cases may be poor.
• The case library may be biased.
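The Retrieve and Reuse steps above can be sketched with a toy case base; the cases and the similarity measure below are invented for illustration (a full system would add Revise and Retain):

```python
# Minimal CBR sketch: retrieve the most similar past case, reuse its solution.
case_base = [
    # (problem features, solution) -- all values are illustrative assumptions
    ({"symptom": "no_start", "battery_v": 11.0}, "charge or replace battery"),
    ({"symptom": "no_start", "battery_v": 12.6}, "check starter motor"),
    ({"symptom": "overheat", "battery_v": 12.4}, "check coolant level"),
]

def similarity(a, b):
    # Crude similarity: a matching symptom counts most, then voltage closeness.
    s = 1.0 if a["symptom"] == b["symptom"] else 0.0
    s += 1.0 / (1.0 + abs(a["battery_v"] - b["battery_v"]))
    return s

def retrieve(problem):
    # Retrieve step: the stored case most similar to the new problem.
    return max(case_base, key=lambda case: similarity(problem, case[0]))

def solve(problem):
    case_features, solution = retrieve(problem)
    return solution          # Reuse step; Revise/Retain would follow here.

print(solve({"symptom": "no_start", "battery_v": 11.2}))
# -> charge or replace battery
```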


Chapter 6 - Learning System

Learning:
• Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from the same population) more effectively the next time.
• Learning denotes changes in a system that enable the system to do the same task more efficiently the next time.
• Learning is an important feature of "Intelligence".
• Learning is constructing or modifying representations of what is being experienced.

Definition:
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Given: a task T,
a performance measure P, and
some experience E with the task.
Goal: generalize the experience in a way that allows you to improve your performance on the task.

Why do you require Machine Learning?
• To understand and improve the efficiency of human learning.
• To discover new things or structure that is unknown.
• To fill in skeletal or incomplete specifications about a domain.

1.2 Learning Agents
An agent is an entity that is capable of perceiving and doing actions. An agent can be viewed as perceiving its environment through sensors and acting upon that environment through actuators. In computer science, an agent is a software agent that assists users and acts in performing computer-related tasks.

Components of a Learning System:
1. Performance Element: The performance element is the agent itself that acts in the world. It takes in percepts and decides on external actions.
2. Learning Element: It is responsible for making improvements. It takes knowledge about the performance element and some feedback, and determines how to improve the performance element.


3. Critic: Tells the learning element how the agent is doing (success or failure) by comparing its behaviour against a fixed standard of performance.
4. Problem Generator: Suggests problems or actions that will generate new examples or experiences that will aid in training the system further.

Types of Learning
1. Rote learning
2. Inductive Learning
• A process of learning by example, where the system tries to induce a general rule from a set of observed instances. The learning method extracts rules and patterns out of massive data sets.
• This involves classification: assigning, to a particular input, the name of a class to which it belongs. Classification is important to many problem-solving tasks.
• A learning system has to be capable of evolving its own class descriptions:
- Initial class definitions may not be adequate.
- The world may not be well understood or may be rapidly changing.
• The task of constructing class definitions is called induction or concept learning.
• Three methods are used in inductive learning:
1. Winston's learning program
2. Version spaces
3. Decision trees
• An example of inductive learning: curve fitting.

3. Decision tree
• A decision tree is a powerful tool for classification and prediction.
• It represents rules that are easily expressed and used to retrieve useful information.
• A decision tree consists of:
- Nodes: test the value of a certain attribute.
- Edges: correspond to the outcomes of a test.
- Leaves: terminal nodes that predict the outcome.

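The node/edge/leaf structure above maps directly onto nested data; the weather-style attributes and outcomes below are invented for illustration:

```python
# A decision tree as nested dicts: internal nodes test an attribute,
# the branch keys are the edge labels, and leaves hold the predicted outcome.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny":    {"attribute": "humidity",
                     "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain":     {"attribute": "wind",
                     "branches": {"strong": "no", "weak": "yes"}},
    },
}

def classify(node, example):
    # A leaf is just the outcome string; an internal node tests an attribute
    # and follows the edge matching the example's value for that attribute.
    if isinstance(node, str):
        return node
    value = example[node["attribute"]]
    return classify(node["branches"][value], example)

print(classify(tree, {"outlook": "sunny", "humidity": "normal", "wind": "weak"}))
# -> yes
```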

4. Reinforcement Learning
• Reinforcement learning refers to a class of problems in machine learning which postulate an agent exploring an environment:
- The agent perceives its current state and takes actions.
- The environment, in return, provides a reward (positive or negative).
- The algorithm attempts to find a policy that maximizes the agent's cumulative reward over the course of the problem.
• In other words, reinforcement learning is "a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment".
• It is a way of programming agents by reward and punishment without needing to specify how the task is to be achieved.
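The perceive/act/reward loop above can be sketched with tabular Q-learning, one common reinforcement-learning algorithm; the corridor environment and all parameter values below are assumptions made for this illustration:

```python
import random

# Tabular Q-learning on a 5-state corridor: the agent starts in state 0 and
# receives reward 1 only on reaching state 4 (the terminal state).
random.seed(0)
n_states = 5
actions = [1, -1]                       # move right / move left
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Explore occasionally; otherwise act greedily on the current Q-values.
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Move Q(s, a) toward the reward plus the discounted best future value.
        best_next = max(q[(s2, act)] for act in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s2

# The greedy policy in every non-terminal state should be "move right" (+1).
policy = [max(actions, key=lambda act: q[(s, act)]) for s in range(n_states - 1)]
print(policy)   # [1, 1, 1, 1]
```

Notice that nothing tells the agent how to reach the goal; the reward signal alone shapes the policy, exactly as the definition above describes.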


5. Explanation-Based Learning (EBL)
• Human learning is accomplished by examining particular situations and relating them to background knowledge in the form of general principles. This kind of learning is called "Explanation-Based Learning (EBL)".
• The system attempts to learn from a single example X by explaining why X is an example of the target concept.
• An EBL system accepts an example (i.e., a training example) and explains what it learns from the example. The EBL system takes only the relevant concepts of the training example. This explanation is translated into a particular form that a problem solver can understand. The explanation is then generalized so that it can be used to solve other problems.
• An EBL system accepts four kinds of inputs:
a. A training example: what the learner sees in the world.
b. A goal concept: a high-level description of what the program is supposed to learn.
c. An operational criterion: a description of which concepts are usable.
d. A domain theory: a set of rules that describe relationships between objects and actions in a domain.
• EBL has two steps:
a. Explanation: the domain theory is used to prune away all unimportant aspects of the training example with respect to the goal concept.
b. Generalization: the explanation is generalized as far as possible while still describing the goal concept.

Fig: In EBL, the training example is explained using the domain theory, generating the goal concept.

Example:
Output (goal concept): Lender(x, y) → Relative(x, y) ∧ Rich(y)
Input (domain theory):
Relative(x, y) ← Uncle(y, x)
Rich(y) ← CEO(y, B) ∧ Bank(B)
Rich(y) ← Own(y, H) ∧ House(H)


Chapter-7 Artificial Neural Networks

Biological Neural Network
The human brain consists of a large number (more than a billion) of neural cells that process information. Each cell works like a simple processor. Only the massive interaction between all cells and their parallel processing makes the brain's abilities possible.
The four parts of a typical nerve cell are:
i. Dendrites: accept the inputs (1 to 10^4 per neuron).
ii. Soma: processes the inputs.
iii. Axon: turns the processed inputs into outputs.
iv. Synapses: the electrochemical contacts between neurons.

Sigmoid function:
The sigmoid function y = 1 / (1 + e^(-x)) is used instead of a step function in artificial neural nets because the sigmoid is continuous, whereas a step function is not, and we need continuity whenever we want to use gradient descent.

Artificial Neural Network
• An artificial neural network is defined as a data processing system consisting of a large number of interconnected processing elements, or artificial neurons.
• A neural network is a system composed of many simple processing elements operating in parallel, whose function is determined by the network structure, the connection strengths, and the processing performed at the computing elements or nodes.
• It is a mathematical or computational model that is inspired by the structural and/or functional aspects of biological neural networks.
• A neural network resembles the human brain in the following two ways:
- A neural network acquires knowledge through learning.
- A neural network's knowledge is stored within the interconnection strengths, known as synaptic weights.

Artificial Neuron Model
• Inputs to the network are represented by the mathematical symbol xn.
• Each of these inputs is multiplied by a connection weight wn:
Sum = w1x1 + ... + wnxn + bias

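The weighted sum plus bias above, squashed by the sigmoid, can be sketched as a single artificial neuron; the input, weight and bias values below are illustrative only:

```python
import math

def sigmoid(x):
    # The logistic squashing function y = 1 / (1 + e^(-x)).
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    # Sum = w1*x1 + ... + wn*xn + bias, then squash with the sigmoid.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(s)

# Illustrative values: the weighted sum is 0.15*0.05 + 0.20*0.10 + 0.35 = 0.3775.
print(neuron([0.05, 0.10], [0.15, 0.20], 0.35))   # about 0.593
```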

• These products are simply summed and fed through the transfer function f() to generate a result, which is then output.

Neural Network Architecture
There are four fundamentally different classes of neural networks:
i. Single-layer feed-forward networks
ii. Multi-layer feed-forward networks
iii. Recurrent networks
iv. Hopfield neural networks

1. McCulloch-Pitts Neuron:
• One of the first neuron models to be implemented.
• Output is 1 or 0.
• Input weights range from -1 to +1.
• Threshold (T).
• It may be divided into two parts. The first part, g, takes the inputs (the "dendrites") and performs an aggregation; based on the aggregated value, the second part, f, makes a decision.
• Neurons with this kind of activation function are also called artificial neurons or linear threshold units.

2. Single-Layer Feed-Forward Neural Network
• A single layer of source nodes projects directly onto an output layer of neurons.
• The single-layer feed-forward network consists of a single layer of weights, where the inputs are directly connected to the outputs via a series of weights.
• It consists of two layers, namely the input layer and the output layer. The input layer neurons receive the input signals and the output layer neurons give the output.
• Every input neuron is connected to every output neuron, but not vice versa.
• The output layer performs the computation, hence it is called single-layer.
• The input layer transmits the signals to the output layer.
Fig. Single-Layer Feed-forward Network

Perceptron
• The perceptron is a single-layer neural network. It is a binary classifier and part of supervised learning. A simple model of the biological neuron in an artificial neural network is known as the perceptron.
• An arrangement of one input layer of neurons feeding forward to one output layer of neurons is known as a single-layer perceptron.
• The perceptron is the simplest form of neural network that is able to classify data into two

classes.
• It consists of a single neuron with a number of adjustable weights.
• A single artificial neuron computes its weighted inputs and uses a threshold activation function.
• The perceptron is a simple neuron model that takes input signals (patterns), coded as a real input vector (x1, x2, ..., xn), through the associative synaptic weights W = (w1, w2, ..., wn+1). The output is determined by:
Output = f(net) = f(w · x)
It is called a Threshold Logic Unit (TLU).
• It uses an iterative learning procedure.

Fig: Perceptron

The threshold activation is:
F(s) = y = 1 if s > f(z); 0 if -f(z) <= s <= f(z); -1 if s < -f(z)

• Weights are changed only when an error occurs.
• The weights are updated using:
Wi(new) = Wi(old) + α t xi
where t is either +1 or -1, and α is the learning rate.

3. Multi-Layer Feed-Forward Neural Network
• The architecture of this class of network, besides having the input and output layers, also has one or more intermediary layers called hidden layers. The computational units of a hidden layer are known as hidden neurons.
• The hidden layer does intermediate computation before directing the input to the output layer.
• The input layer neurons are linked to the hidden layer neurons; the weights on these links are referred to as input-hidden layer weights.
• The hidden layer neurons are linked to the output layer neurons; the weights on these links are referred to as hidden-output layer weights.
• A multi-layer feed-forward network with ℓ input neurons, m1 neurons in the first hidden layer, m2 neurons in the second hidden layer, and n output neurons in the output layer is written as (ℓ - m1 - m2 - n).
• The figure above illustrates a multilayer feed-forward network with a configuration (ℓ - m - n).

Back propagation in a multilayer neural network
Algorithm
Given: a set of input/output vector pairs.
Compute: a set of weights for a three-layer network that maps inputs onto the corresponding outputs.
Step 1: Initialize the weights in the network. Each weight should be set randomly to a number between -1 and 1.

W1ij = random(-1 to 1) for all i = 0, ..., A; j = 1, ..., B
W2ij = random(-1 to 1) for all i = 0, ..., B; j = 1, ..., C
Step 2: Initialize the activations of the thresholding units. The values of these thresholding units should never change:
x0 = 1.0
h0 = 1.0
Step 3: Choose an input/output pair. Suppose the input vector is xi and the target output is yi. Assign activation levels to the input units.
Step 4: Propagate the activations from the units in the input layer to the units in the hidden layer, using the activation function.
Step 5: Propagate the activations from the units in the hidden layer to the units in the output layer.
Step 6: Compute the errors of the units in the output layer, denoted delta. Errors are based on the network's actual output and the target output:
Error = target value - actual output
Step 7: Adjust the weights between the hidden layer and the output layer. The learning rate is denoted by η.
Step 8: Adjust the weights between the input layer and the hidden layer.
Step 9: Go to step 3 and repeat. When all the input/output pairs have been presented to the network, one epoch has been completed. Repeat steps 3 to 8 for as many epochs as desired.
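The nine steps above can be sketched as a short program. The 2-2-2 layer sizes, the single training pair, and the learning rate here are assumed values chosen for illustration, not a prescribed configuration:

```python
import math
import random

random.seed(1)
A, B, C = 2, 2, 2                      # input, hidden, output layer sizes
eta = 0.5                              # learning rate

# Step 1: initialize weights randomly in [-1, 1] (index 0 is the bias weight,
# whose activation is fixed at 1.0, per Step 2).
w1 = [[random.uniform(-1, 1) for _ in range(A + 1)] for _ in range(B)]
w2 = [[random.uniform(-1, 1) for _ in range(B + 1)] for _ in range(C)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

x, target = [0.05, 0.10], [0.01, 0.99]  # Step 3: one input/output pair

for epoch in range(10000):
    # Steps 4-5: forward propagation through hidden and output layers.
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in w1]
    o = [sigmoid(w[0] + sum(wi * hi for wi, hi in zip(w[1:], h))) for w in w2]
    # Step 6: output-layer deltas, (target - out) scaled by the sigmoid slope.
    d_o = [(t - oi) * oi * (1 - oi) for t, oi in zip(target, o)]
    # Hidden-layer deltas, computed with the ORIGINAL hidden-to-output weights.
    d_h = [h[i] * (1 - h[i]) * sum(d_o[j] * w2[j][i + 1] for j in range(C))
           for i in range(B)]
    # Step 7: adjust hidden-to-output weights.
    for j in range(C):
        w2[j][0] += eta * d_o[j]
        for i in range(B):
            w2[j][i + 1] += eta * d_o[j] * h[i]
    # Step 8: adjust input-to-hidden weights.
    for i in range(B):
        w1[i][0] += eta * d_h[i]
        for k in range(A):
            w1[i][k + 1] += eta * d_h[i] * x[k]

print(o)   # after training, the outputs sit close to the 0.01 / 0.99 targets
```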

Example


Step 1: Feed forward
Here's how we calculate the total net input to h1:
net_h1 = w1*i1 + w2*i2 + b1*1
net_h1 = 0.15*0.05 + 0.2*0.1 + 0.35*1 = 0.3775
We then squash it using the logistic function to get the output of h1:
out_h1 = 1 / (1 + e^(-net_h1))
Carrying out the same process for h2 gives its output. We then repeat this process for the output layer neurons, using the outputs from the hidden layer neurons as inputs, which gives out_o1 and out_o2.

Calculating the Total Error
Each output neuron's error is computed from its target and its actual output; repeating this process for o2 (remembering that its target is 0.99), the total error for the neural network is the sum of these errors:
E_total = Σ ½ (target - output)²

Back propagation
Our goal with back propagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole.

Output Layer
Consider one of the weights between the hidden layer and output neuron o1, say w. We want to know how much a change in w affects the total error, i.e. ∂E_total/∂w. By the chain rule:
∂E_total/∂w = ∂E_total/∂out_o1 × ∂out_o1/∂net_o1 × ∂net_o1/∂w
The partial derivative of the logistic function is the output multiplied by 1 minus the output:
∂out_o1/∂net_o1 = out_o1 × (1 - out_o1)
Finally, the total net input of o1 changes with respect to w by exactly the hidden-layer output on that connection. Putting it all together gives ∂E_total/∂w.


You'll often see this calculation combined in the form of the delta rule:
∂E_total/∂w = δ_o1 × out_h1, where δ_o1 = ∂E_total/∂out_o1 × ∂out_o1/∂net_o1
Some sources extract the negative sign from δ, so the same rule is written with the opposite sign convention.
To decrease the error, we then subtract this value from the current weight, optionally multiplied by some learning rate, eta, which we'll set to 0.5:
w_new = w - η × ∂E_total/∂w
We can repeat this process to get the new values of the remaining hidden-to-output weights.
We perform the actual updates in the neural network only after we have the new weights leading into the hidden layer neurons (i.e., we use the original weights, not the updated weights, when we continue the back propagation algorithm below).

Hidden Layer
We know that the output of a hidden neuron, out_h1, affects both output neurons, and therefore ∂E_total/∂out_h1 needs to take into consideration its effect on both:
∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1
Starting with ∂E_o1/∂out_h1, we can calculate it using values we calculated earlier; following the same process for ∂E_o2/∂out_h1 gives the second term.
Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight. We calculate the partial derivative of the total net input to h1 with respect to a weight the same way as we did for the output neuron.


Putting it all together, we can now update the first input-to-hidden weight, and repeating this for the remaining input-to-hidden weights completes the pass.

Finally, we've updated all of our weights! When we fed forward the 0.05 and 0.1 inputs originally, the error on the network was 0.298371109. After this first round of back propagation, the total error is now down to 0.291027924. It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.0000351085. At this point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs. the 0.01 target) and 0.984065734 (vs. the 0.99 target).

4. Recurrent Networks
• Recurrent neural networks, also known as RNNs, are a class of neural networks that allow previous outputs to be used as inputs while having hidden states.
• A Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step.
• There can also be neurons with self-feedback links, i.e., the output of a neuron is fed back into itself as input.
• RNNs are used in deep learning and in the development of models that simulate the activity of neurons in the human brain.
Applications of RNNs: RNN models are mostly used in the fields of natural language processing and speech recognition.

Recurrent Network Architecture
• Wxh: the weights for the connections from the input layer to the hidden layer.
• W: the weights for the connections from the hidden layer to the hidden layer.
• Why: the weights for the connections from the hidden layer to the output layer.
• a: the activation of the layer.
The recurrent neural network scans through the data from left to right. The parameters it uses for each time step are shared: the parameters Wxh, Why and W are the same for each time step.
When making a prediction at time t, an RNN uses not only the input xt at time t but also information from the previous input at time t-1, through the activation a and the weights W passed from the previous hidden layer to the current hidden layer.


Hopfield Neural Network
• In 1982, John Hopfield introduced an artificial neural network to store and retrieve memory like the human brain.
• It consists of a single layer which contains one or more fully connected recurrent neurons.
• It can be seen as a fully connected single-layer auto-associative network.
• Hopfield networks are constructed from artificial neurons with N inputs. With each input i there is an associated weight wi. They also have an output. The state of the output is maintained until the neuron is updated.

Properties of a Hopfield Neural Network
- A recurrent network with all nodes connected to all other nodes.
- Nodes have binary outputs (either 0/1 or 1/-1).
- Weights between the nodes are symmetric.
- No connection from a node to itself is allowed.
- Nodes are updated asynchronously (the nodes are selected at random).
- The network has no hidden layer.

Hopfield neural networks are of two types:
i. Discrete Hopfield network
ii. Continuous Hopfield network

Discrete Hopfield Network
- The network is a fully interconnected neural network where each unit is connected to every other unit.
- The network has symmetric weights with no self-connections, i.e., Wij = Wji and Wii = 0.

Continuous Hopfield Network
- Used either for associative memory problems or for constrained optimization problems such as the travelling salesman problem.
- The output is Vi = g(ui), where ui denotes the internal activity of a neuron.

Architecture
• This model consists of neurons with one inverting and one non-inverting output.
• The output of each neuron is an input of the other neurons, but not an input of itself.
• Weight/connection strength is represented by wij.
• Connections can be excitatory as well as inhibitory: excitatory if the output of the neuron is the same as the input, otherwise inhibitory.
• Weights are symmetric, i.e., wij = wji.
• The output from Y1 going to Y2, Yi and Yn has the weights w12, w1i and w1n respectively. Similarly, the other arcs have their own weights.

Boltzmann Machine
Boltzmann machines were first invented in 1985 by Geoffrey Hinton, a professor at the University of Toronto. He is a leading figure in the deep learning community and is referred to by some as the "Godfather of Deep Learning".
• The Boltzmann machine is a generative unsupervised model, which involves learning a probability distribution from an original dataset and using it to make inferences about never-before-seen data.
• Boltzmann machines have an input layer (also referred to as the visible layer) and one or several hidden layers (also referred to as the hidden layer).
• Boltzmann machines use neural networks with neurons that are connected not only to neurons in other layers but also to neurons within the same layer.
• Everything is connected to everything: connections are bidirectional, visible neurons are connected to each other, and hidden neurons are also connected to each other.
• A Boltzmann machine doesn't expect input data; it generates data. Neurons generate information regardless of whether they are hidden or visible.
• A Boltzmann machine (also called a stochastic Hopfield network with hidden units) is a type of stochastic recurrent neural network.
• Visible nodes are what we measure and hidden nodes are what we don't measure. When we input data, these nodes learn all the parameters, their patterns, and the correlations between them on their own and form an efficient system; hence the Boltzmann machine is termed an unsupervised deep learning model.
• This model then gets ready to monitor and study abnormal behaviour depending on what it has learnt. This model is also often considered a counterpart of the Hopfield network, which is composed of binary threshold units with recurrent connections between them.

Learning Methods in Neural Networks
The learning methods in neural networks are classified into three basic types:
- Supervised learning
- Unsupervised learning
- Reinforcement learning

i. Supervised Learning
• Learning in which we teach or train the machine using data which is well labeled, meaning each datum is tagged with the correct answer.
• Supervised learning is where you have input variables (x) and an output variable (y), and you use an algorithm to learn the mapping function from the input to the output:
y = f(x)
• It is called supervised because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process.
• The learning process is based on a comparison between the network's computed output and the correct expected output, generating an error.
• The easiest approach can be used if a (large enough) set of test data with known results exists. Then the learning goes like this: process one dataset, compare the output against the known result, adjust the network, and repeat.
• Supervised learning algorithms can be further grouped into:
a) Classification: a classification problem is when the output variable is a category, such as red or blue, or disease or no disease.
b) Regression: a regression problem is when the output variable is a real value, such as a weight or an amount in dollars.

ii. Unsupervised Learning
- Unsupervised learning is where you only have input data (x) and no corresponding output variables.
- The goal is to model the underlying structure or distribution in the data in order to learn more about the data.
- It is called unsupervised because, unlike supervised learning, there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.


- The expected or desired output is not presented to the network.
- The system learns on its own by discovering and adapting to the structural features
in the input patterns.
- Useful if no test data is readily available, and if it is possible to derive some kind of
learning while working on the real data.
- Unsupervised learning is further grouped into:
a. Clustering: A clustering problem is where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behavior.
b. Association: An association problem is where you want to discover rules that
describe large portions of your data, such as people that buy x also tend to buy y.
iii. Reinforced Learning
- A teacher is present but does not present the expected or desired output; it only
indicates whether the computed output is correct or incorrect.
- The information provided helps the network in its learning process.
- A reward is given for a correct computed answer and a penalty for a wrong answer.
- The ‘carrot and stick’ method. Can be used if the neural network generates
continuous action. Follow the carrot in front of your nose! If you go the wrong way: ouch.
Over time, the network learns to prefer the right kind of action and to avoid the
wrong one.
Note: The supervised and unsupervised learning methods are more popular forms of
learning than reinforced learning.
iv. Hebbian Learning
• In 1949, Donald Hebb proposed one of the key ideas in biological learning,
commonly known as Hebb’s law.
• Hebb’s Law states that if neuron i is near enough to excite neuron j and repeatedly
participates in its activation, the synaptic connection between these two neurons is
strengthened and neuron j becomes more sensitive to stimuli from neuron i.
• Hebb’s Law can be represented in the form of two rules:
- If two neurons on either side of a connection are activated synchronously, then the
weight of that connection is increased.
- If two neurons on either side of a connection are activated asynchronously, then the
weight of that connection is decreased.
• Hebbian learning implies that weights can only increase. To resolve this problem, we
might impose a limit on the growth of synaptic weights.
What can neural networks do?
ANNs have been successfully applied to a number of problem domains:
• Classify data by recognizing patterns. Is this a tree in that picture?
• Detect anomalies or novelties, when test data does not match the usual patterns. Is the
truck driver at risk of falling asleep? Are these seismic events showing normal ground
motion or a big earthquake?
• Process signals, for example, by filtering, separating, or compressing.
• Approximate a target function – useful for predictions and forecasting. Will this storm turn …
So let’s look at some real-world applications. Neural networks can:
• identify faces,
• recognize speech,
• read your handwriting (mine perhaps not),
• translate texts,
• play games (typically board games or card games),
• control autonomous vehicles and robots,
• and surely a couple more things!
Applications of Neural Networks:
Neural network applications can be grouped into the following categories:
i. Clustering:
A clustering algorithm explores the similarity between patterns and places similar
patterns in a cluster. Best known applications include data compression and data
mining.
ii. Classification/Pattern recognition:
The task of pattern recognition is to assign an input pattern (like a handwritten symbol)
to one of many classes. This category includes algorithmic implementations such as
associative memory.
iii. Prediction Systems:
The task is to forecast some future values of time-sequenced data. Prediction has a
significant impact on decision support systems. Prediction differs from function
approximation by considering the time factor. The system may be dynamic and may
produce different results for the same input data based on system state (time).

• Time Series Prediction — Neural networks can be used to make predictions. Will
the stock rise or fall tomorrow? Will it rain or be sunny?
• Signal Processing — Cochlear implants and hearing aids need to filter out
unnecessary noise and amplify the important sounds. Neural networks can be
trained to process an audio signal and filter it appropriately.
• Control — You may have read about recent research advances in self-driving cars.
Neural networks are often used to manage steering decisions of physical vehicles
(or simulated ones).
• Soft Sensors — A soft sensor refers to the process of analyzing a collection of
many measurements. A thermometer can tell you the temperature of the air, but
what if you also knew the humidity, barometric pressure, dewpoint, air quality, air
density, etc.? Neural networks can be employed to process the input data from
many individual sensors and evaluate them as a whole.
• Anomaly Detection — Because neural networks are so good at recognizing patterns,
they can also be trained to generate an output when something occurs that doesn’t
fit the pattern. Think of a neural network monitoring your daily routine over a long
period of time. After learning the patterns of your behavior, it could alert you when
something is amiss.

UNIT - 8 Expert system

What is an Expert system?
• An expert system is a computer system that emulates the decision-making ability of a
human expert.
• Expert systems are designed to solve complex problems by reasoning through bodies of
knowledge, represented mainly as if-then rules rather than through conventional
procedural code.
• Expert systems are computer applications developed to solve complex problems in a
particular domain, at the level of extraordinary human intelligence and expertise.
• An expert system is considered the highest level of human intelligence and expertise: a
computer application which solves the most complex issues in a specific domain.
• An expert system is an alternative computer-based decision tool that uses both facts and
heuristics to solve difficult decision-making problems, based on knowledge acquired from
an expert.
• An expert system compared with a traditional program:
Inference engine + Knowledge = Expert system
(Algorithm + Data structure = Program, in traditional computing)
• The first expert system, called DENDRAL, was developed in the 1960s at Stanford
University.
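The equation "Inference engine + knowledge = Expert system" can be illustrated with a minimal sketch: the generic engine below never changes, and only the knowledge base is swapped out. Both rule sets and every fact name are hypothetical, invented purely for illustration.

```python
# A deliberately tiny expert-system sketch: the knowledge is data,
# separate from the engine that applies it. All rules are hypothetical.
ANIMAL_RULES = {"has_feathers": "it is a bird", "gives_milk": "it is a mammal"}
FAULT_RULES = {"no_power": "check the fuse", "overheats": "check the fan"}

def infer(knowledge, observation):
    """The 'inference engine': match an observation against the rules."""
    return knowledge.get(observation, "no conclusion")

# The same engine becomes a different expert system with different knowledge.
print(infer(ANIMAL_RULES, "gives_milk"))  # it is a mammal
print(infer(FAULT_RULES, "no_power"))     # check the fuse
```

In a conventional program the logic would be hard-coded; here, changing the system's expertise means only editing the rule table, which is the point of the equation above.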

Intelligence v/s Expertise:


• Expertise and Intelligence are not the same things (although they are related).
• Expertise requires a long time to learn (e.g. it takes 6 years to become a doctor).
• Expertise is a large amount of knowledge (in some domain).
• Expertise is easily recalled.
• Intelligence allows you to use your expertise (apply the knowledge).
• Expertise enables you to find solutions much faster.


8.1 Introduction
Expert systems are computer applications which embody some non-algorithmic expertise
for solving certain types of problems. For example, expert systems are used in diagnostic
applications. They also play chess, make financial planning decisions, configure
computers, monitor real-time systems, underwrite insurance policies and perform many
services which previously required human expertise.
8.2 Components of expert system:
The components of an expert system (ES) include:
- Knowledge Base
- Inference Engine
- User Interface
Let us explain them one by one briefly:
i. Knowledge Base:
• The knowledge base is a repository of facts. It stores all the knowledge about the problem
domain. It is like a large container of knowledge which is obtained from different experts
of a specific field.
• Expert systems contain both factual and heuristic knowledge.
• Factual knowledge is knowledge of the task domain that is widely shared, typically
found in journals or textbooks.
• Heuristic knowledge is less exhaustive, more experimental, more judgmental knowledge
of performance.
What is knowledge?
Data is a collection of facts. Information is data organized as facts about the task
domain. Data, information and past experience combined together are termed knowledge.
- Factual knowledge: It is the information widely accepted by the knowledge
engineers and scholars in the task domain.
- Heuristic knowledge: It is about practice, accurate judgment, one’s ability of
evaluation and guessing.
What is knowledge acquisition?
Knowledge acquisition is the process of extracting, structuring and organizing knowledge from
a source, usually human experts, so it can be used in software such as an ES.
ii. Inference Engine
The inference engine is the brain of the expert system. The inference engine contains rules to
solve a specific problem. It refers to the knowledge in the Knowledge Base. It selects facts
and rules to apply when trying to answer the user’s query. It provides reasoning about the
information in the knowledge base. It also helps in deducing the problem to find the
solution. This component is also helpful for formulating conclusions.
To recommend a solution, the Inference Engine uses the following strategies:
• Forward Chaining
• Backward Chaining
a. Forward Chaining
- It is a strategy of an expert system to answer the question, “What can happen
next?”
- It starts with the data available and then concludes a desired goal.


- The Inference Engine follows the chain of conditions and derivations and finally deduces
the outcome. It considers all the facts and rules, and sorts them before concluding
a solution.
- This strategy is followed for working out a conclusion, result, or effect.
- For example, prediction of share market status as an effect of changes in interest
rates.
b. Backward Chaining
- With this strategy, an expert system finds out the answer to the question, “Why did
this happen?”
- It starts with a list of goals and works backward if there is data which will allow it to
conclude these goals.
- On the basis of what has already happened, the Inference Engine tries to find out
which conditions could have happened in the past for this result.
- This strategy is followed for finding out a cause or reason.
- For example, diagnosis of blood cancer in humans.
iii. User Interface
The user interface is the most crucial part of the expert system. This component takes the
user’s query in a readable form and passes it to the inference engine. After that, it displays
the results to the user. In other words, it is an interface that helps the user communicate with
the expert system.
It explains how the ES has arrived at a particular recommendation. The explanation may
appear in the following forms:
• Natural language displayed on screen.
• Verbal narrations in natural language.
• Listing of rule numbers displayed on the screen.
The user interface makes it easy to trace the credibility of the deductions.
Characteristics of Expert System
Following are important characteristics of an Expert System:
• The Highest Level of Expertise: The expert system offers the highest level of
expertise. It provides efficiency, accuracy and imaginative problem-solving.
• Right on Time Reaction: An Expert System interacts in a very reasonable period of
time with the user. The total time must be less than the time taken by an expert to get the
most accurate solution for the same problem.
• Good Reliability: The expert system needs to be reliable, and it must not make any
mistake.
• Flexible: It is vital that it remains flexible, as it is possessed by an Expert System.
• Effective Mechanism: An Expert System must have an efficient mechanism to administer
the compilation of the existing knowledge in it.
• Capable of handling challenging decisions & problems: An expert system is capable
of handling challenging decision problems and delivering solutions.
8.3 Design of an Expert System (ES)
The process of building an Expert System includes:
• Determining the characteristics of the problem
• Knowledge engineer and domain expert working in coherence to define the problem
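The two inference strategies described earlier can be sketched as follows: forward chaining fires rules from the available data until nothing new can be derived, while backward chaining starts from a goal and tries to prove each premise in turn. The medical rules and fact names below are hypothetical, invented purely for illustration.

```python
# Each rule is (set of premises, conclusion). Hypothetical rules for illustration.
RULES = [
    ({"fever", "rash"}, "measles_suspected"),
    ({"measles_suspected", "lab_positive"}, "measles_confirmed"),
]

def forward_chain(initial_facts):
    """Data-driven ("what can happen next?"): fire every applicable rule
    until no new fact can be added to working memory."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Goal-driven ("why did this happen?"): a goal holds if it is a known
    fact, or some rule concludes it and all its premises can be proved."""
    if goal in facts:
        return True
    return any(conclusion == goal and
               all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES)

print("measles_confirmed" in forward_chain({"fever", "rash", "lab_positive"}))  # True
print(backward_chain("measles_confirmed", {"fever", "rash"}))                   # False
```

The recursive backward chainer assumes the rule set is acyclic; a real engine would also track visited goals and record the chain of rules fired, which is what lets an ES explain its recommendation.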


• The knowledge engineer translates the knowledge into a computer-understandable
language. He designs an inference engine, a reasoning structure, which can use the
knowledge when needed.
• The knowledge engineer also determines how to integrate the use of uncertain knowledge in
the reasoning process and what type of explanation would be useful.
8.4 Advantages/Benefits of expert system
• Consistent: It provides consistent answers for repetitive decisions, processes and tasks.
• Clarity: It clarifies the logic of decision making.
• No human needed: It does not need humans; it can work continuously.
• Availability: Expert systems are easily available due to mass production of software.
• Multiuser: A multi-user expert system can serve more users at a time.
• Less Production Cost: Production cost is reasonable. This makes them affordable.
• Speed: They offer great speed. They reduce the amount of work an individual puts in.
• Less Error Rate: The error rate is low as compared to human errors.
• Reducing Risk: They can work in environments dangerous to humans.
• Steady response: They work steadily without getting emotional, tense or fatigued.
8.5 Disadvantages
• Sense: It lacks the common sense needed in decision making.
• Creativeness: It cannot respond creatively as a human expert would in
unusual circumstances.
• Errors: Errors may occur in the knowledge base, and these lead to wrong decisions.
• Environments: If the knowledge base is changed, it cannot adapt to changing environments.
8.6 Benefits of expert systems
• It improves the decision quality
• Cuts the expense of consulting experts for problem-solving
• It provides fast and efficient solutions to problems in a narrow area of specialization
• It can gather scarce expertise and use it efficiently
• Offers consistent answers for repetitive problems
• Maintains a significant level of information
• Helps you to get fast and accurate answers
• A proper explanation of decision making
• Ability to solve complex and challenging issues
• Expert Systems can work steadily without getting emotional, tense or fatigued
8.7 Limitations of the expert system
• Unable to make a creative response in an extraordinary situation
• Errors in the knowledge base can lead to wrong decisions
• The maintenance cost of an expert system is too expensive
• Each problem is different; therefore the solution from a human expert can also be
different and more creative
8.8 Applications of expert systems
Some popular applications where expert systems are used:
• Information management
• Hospitals and medical facilities
• Help desk management
• Employee performance evaluation
• Loan analysis
• Virus detection
• Repair and maintenance projects
• Warehouse optimization
• Planning and scheduling
• The configuration of manufactured objects
• Financial decision making
• Knowledge publishing
• Process monitoring and control
• Supervising the operation of a plant and controller
• Stock market trading
• Airline scheduling & cargo schedules


Expert system Examples

i. MYCIN
MYCIN was an early expert system, or artificial intelligence (AI) program, for
treating blood infections. In 1972 work began on MYCIN at Stanford University in
California. MYCIN would attempt to diagnose patients based on reported
symptoms and medical test results. The program could request further information
concerning the patient, as well as suggest additional laboratory tests, to arrive at a
probable diagnosis, after which it would recommend a course of treatment. If
requested, MYCIN would explain the reasoning that led to its diagnosis and
recommendation. Using about 500 production rules, MYCIN operated at roughly
the same level of competence as human specialists in blood infections and rather
better than general practitioners.

ii. DENDRAL
DENDRAL was an early expert system, developed beginning in 1965 by the artificial
intelligence (AI) researcher Edward Feigenbaum and the geneticist Joshua Lederberg,
both of Stanford University in California. Heuristic DENDRAL (later shortened to
DENDRAL) was a chemical-analysis expert system. The substance to be analyzed
might, for example, be a complicated compound of carbon, hydrogen, and nitrogen.
Starting from spectrographic data obtained from the substance, DENDRAL would
hypothesize the substance’s molecular structure. DENDRAL’s performance rivaled
that of chemists expert at this task, and the program was used in industry and in
academia.

iii. PXDES: this ES is used for lung disease and X-ray diagnosis.
iv. CaDet: this ES is used for early cancer detection.
v. DXplain: this ES is used for diagnosis.
vi. RICE-CROP DOCTOR

UNIT – 9 Natural Language Processing (NLP)
What is NLP?
• Natural Language Processing (NLP) is the capacity of a computer to “understand” natural
language text at a level that allows meaningful interaction between the computer and a
person working in a particular application domain.
• Processing of natural language is required when we want an intelligent system like a
robot to perform as per our instructions, when we want to hear a decision from a
dialogue-based expert system, etc.
• Natural Language Processing (NLP) refers to the AI method of communicating with
intelligent systems using a natural language such as English. Natural languages are
spoken by people.
• The field of NLP involves making computers perform useful tasks with the natural
languages humans use. The input and output of an NLP system can be:
- Speech
- Written Text
• A language is a set of symbols and a set of rules (or grammar).
- The symbols are combined to convey new information.
- The rules govern the manipulation of symbols.
• NLP encompasses anything a computer needs to understand natural language (typed or
spoken) and also to generate natural language.
Components of NLP
There are two components of NLP, as given below:

1. Natural Language Understanding (NLU)
• Taking some spoken/typed sentence and working out what it means.
• The NLU task is understanding and reasoning while the input is natural language.
• Mapping the given input in natural language into a useful representation.
• Understanding involves the following tasks:
- Mapping the given input in natural language into useful representations.
- Analyzing different aspects of the language.
• Different levels of analysis are required:
- Morphological analysis
- Syntactic analysis
- Semantic analysis
- Discourse analysis

2. Natural Language Generation (NLG)
• Taking some formal representation of what you want to say and working out a way to
express it in a natural language (e.g. English).
• NLG is a subfield of natural language processing (NLP).
• It is the process of producing meaningful phrases and sentences in the form of natural
language from some internal representation.
• Producing output in natural language from some internal representation.
• Different levels of synthesis are required:
- Deep planning (what to say)
- Syntactic generation
• It involves:
- Text planning − retrieving the relevant content from the knowledge base.
- Sentence planning − choosing the required words, forming meaningful phrases, and setting the
tone of the sentence.
- Text Realization − mapping the sentence plan into sentence structure.
• NLU is harder than NLG.

Steps in NLP:
a. Lexical Analysis:
- The lexicon of a language is its vocabulary, which includes its words and expressions.
- Lexical analysis involves dividing a text into paragraphs, words and sentences.
b. Syntactic Analysis:
- Syntax concerns the proper ordering of words and its effect on meaning.
- This involves analysis of the words in a sentence to depict the grammatical structure
of the sentence.
- The words are transformed into a structure that shows how the words are related to
each other.
- E.g. “the girl the go to the school”. This would definitely be rejected by an English
syntactic analyzer.
c. Semantic Analysis:
- Semantics concerns the (literal) meaning of words, phrases and sentences.
- This abstracts the dictionary meaning or the exact meaning from context.
- The structures which are created by the syntactic analyzer are assigned meaning.
- Example: “colorless blue idea”. This would be rejected by the analyzer, as colorless
and blue do not make sense together.


d. Discourse Integration:
- Sense of the context.
- The meaning of any single sentence depends upon the sentences that precede it and
also invokes the meaning of the sentences that follow it.
- Example: the word “it” in the sentence “she wanted it” depends upon the prior
discourse context.
e. Pragmatic Analysis:
- Pragmatics concerns the overall communicative and social context and its effects on
interpretation.
- It means abstracting and deriving the purposeful use of the language in situations.
- The main focus is on what was said being reinterpreted as what it actually means.
- Example: “close the window?” should be interpreted as a request rather than an
order.
Applications of NLP
• text processing - word processing, e-mail, spelling and grammar checkers
• interfaces to databases - query languages, information retrieval, data mining, text
summarization
• expert systems - explanations, disease diagnosis
• linguistics - machine translation, content analysis, writers’ assistants, language …
• Companies using AI chatbots that give you suggestions to locate the nearest grocery
store, book a movie ticket, order food, etc.
• Sentiment analysis during a political campaign to take informed decisions by
monitoring trending issues on social media
• Analyzing lengthy text reviews by users of products on an e-commerce website
• Call centers using NLP to analyze the general feedback of the callers

Linguistic Organization of NLP
• Grammar and lexicon - the rules for forming well-structured sentences, and the words
that make up those sentences
• Morphology - the formation of words from stems, prefixes, and suffixes.
E.g., eat + s = eats
• Syntax - the set of all well-formed sentences in a language and the rules for forming them
• Semantics - the meanings of all well-formed sentences in a language
• Pragmatics (world knowledge and context) - the influence of what we know about the
real world upon the meaning of a sentence. E.g., “The balloon rose.” allows an inference
to be made that it must be filled with a lighter-than-air substance.
• The influence of discourse context (e.g., speaker-hearer roles in a conversation) on the
meaning of a sentence
• Ambiguity
o lexical - word meaning choices (e.g., flies)
o syntactic - sentence structure choices (e.g., She saw the man on the hill with the
telescope.)
o semantic - sentence meaning choices (e.g., They are flying planes.)

Parse tree representation in natural language
The parse tree breaks down the sentence into structured parts so that the computer can easily
understand and process it. In order for the parsing algorithm to construct this parse tree, a set of
rewrite rules, which describes what tree structures are legal, must be available. These rules say
that a certain symbol may be expanded in the tree by a sequence of other symbols.

Grammars and parsing
Syntactic categories (common denotations) in NLP
• np - noun phrase
• vp - verb phrase
• s - sentence
• det - determiner (article)
• n - noun
• tv - transitive verb (takes an object)
• iv - intransitive verb
• prep - preposition
• pp - prepositional phrase
• adj - adjective
A context-free grammar (CFG) is a list of rules that define the set of all well-formed sentences in
a language. Each rule has a left-hand side, which identifies a syntactic category, and a right-hand


side, which defines its alternative component parts, reading from left to right.
Context-Free Grammars are simply grammars consisting entirely of rewrite rules with a single
symbol on the left-hand side. The obvious advantage of CFGs is that they are simple to
define. Many of the grammars used for NLP systems are CFGs; as such, they have been widely
studied and understood, and highly efficient parsing mechanisms have been developed to
apply them to their input.

However, CFGs also have some severe disadvantages. Consider the rewrite rules below:
since V can be replaced by both "eat" and "eats", sentences like "The cat eat the rice" would be
allowed. Therefore, additional sets of grammar rules would have to be implemented for both singular
and plural sentences. Moreover, completely different sets of rules would also be needed for
passive sentences, e.g. "The rice was eaten by the cat". This means that an extremely large set of
rules would have to be created, which makes the grammar difficult to handle. Many different grammar
formalisms, like unification grammar and categorial grammar, have been developed to
capture the rules of syntax more concisely, but we won't go into them.

Q. Parse tree for “The cat eats the rice”


The rewrite rules for this example are as follows:
S -> NP VP
NP -> DET N | DET ADJ N
VP -> V NP
DET -> the
ADJ -> big | fat
N -> cat | cats | rice
V -> eat | eats | ate
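These rewrite rules can be applied by a small top-down parser. The sketch below (plain Python, with the grammar stored as a dictionary) builds the parse tree for "the cat eats the rice"; and, as discussed above, the same grammar also accepts the non-agreeing "the cat eat the rice", illustrating the CFG limitation.

```python
# A naive top-down parser for the rewrite rules above.
# Grammar: nonterminal -> list of alternative right-hand sides.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP":  [["V", "NP"]],
    "DET": [["the"]],
    "ADJ": [["big"], ["fat"]],
    "N":   [["cat"], ["cats"], ["rice"]],
    "V":   [["eat"], ["eats"], ["ate"]],
}

def expand(symbol, words, pos):
    """Try to derive a prefix of words[pos:] from `symbol`.
    Returns (subtree, next_pos) on success, or None."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if pos < len(words) and words[pos] == symbol:
            return symbol, pos + 1
        return None
    for alternative in GRAMMAR[symbol]:            # try each right-hand side
        children, p = [], pos
        for part in alternative:
            match = expand(part, words, p)
            if match is None:
                break
            subtree, p = match
            children.append(subtree)
        else:                                      # every part matched
            return (symbol, children), p
    return None

def parse(sentence):
    """Return the parse tree for the sentence, or None if it is rejected."""
    words = sentence.lower().split()
    result = expand("S", words, 0)
    if result is not None and result[1] == len(words):
        return result[0]
    return None

print(parse("The cat eats the rice") is not None)  # True: well-formed
print(parse("cat the rice eats"))                  # None: rejected
print(parse("The cat eat the rice") is not None)   # True: CFG misses agreement
```

This naive version tries alternatives in order and is enough for this small grammar; a real parser would need full backtracking or a chart to handle grammars where an early sub-success must later be undone.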


NLP vs PLP (Programming Language Processing)

There are some parallels, and some fundamental distinctions, between the goals and methods of
programming language processing (design and compiler strategies) and natural language
processing. Here is a brief summary:

                           NLP                                  PLP

domain of discourse        broad: what can be expressed         narrow: what can be computed

lexicon                    large/complex                        small/simple

grammatical constructs     many and varied:                     few:
                           declarative, interrogative,          declarative, imperative
                           fragments, etc.

meanings of an expression  many                                 one

tools and techniques       morphological analysis,              lexical analysis,
                           syntactic analysis,                  context-free parsing,
                           semantic analysis,                   code generation/compiling,
                           integration of world knowledge       interpreting
