
AAI Assignment 1

1. What is Artificial Intelligence? State its applications.

Ans: Definitions of AI
There are as many definitions as there are practitioners.

Systems that act like humans: Turing Test

• “The art of creating machines that perform functions that require intelligence when performed by
people.” (Kurzweil)

• “The study of how to make computers do things at which, at the moment, people are better.” (Rich
and Knight)

• You enter a room that has a computer terminal. You have a fixed period of time to type what you
want into the terminal and study the replies. At the other end of the line is either a human being or a
computer system.

• If it is a computer system, and at the end of the period you cannot reliably determine whether it is a
system or a human, then the system is deemed to be intelligent.

Turing test approach

• A human questioner, communicating via teletype (remote communication), cannot tell whether a
computer or a human is answering the questions.
• The computer must behave intelligently.

Turing test

In this test, Turing proposed that a computer can be said to be intelligent if it can mimic human
responses under specific conditions.

The Turing test is based on a party game, the "Imitation game," with some modifications. The game involves
three players: one is a computer, another is a human responder, and the third is
a human interrogator, who is isolated from the other two players and whose job is to determine which of
the two is the machine.

• Interrogator: Are you a computer?

• Player A (Computer): No

• Interrogator: Multiply two large numbers, such as 256896489 * 456725896.

• Player A: Pauses for a long time and gives the wrong answer.

• In this game, if the interrogator cannot identify which player is the machine and which is the human,
then the computer passes the test, and the machine is said to be intelligent and able to
think like a human.

Systems that think like humans: Cognitive modeling

• Humans as observed from ‘inside’

• How do we know how humans think?
• Introspection vs. psychological experiments

• Cognitive Science
• “The exciting new effort to make computers think ... machines with minds in the full and literal
sense” (Haugeland)
• “[The automation of] activities that we associate with human thinking, activities such as
decision-making, problem solving, learning ...” (Bellman)

Systems that think rationally: “Laws of thought”

• Humans are not always ‘rational’

• Rational - defined in terms of logic?

• Logic can’t express everything (e.g. uncertainty)

• The logical approach is often not feasible in terms of computation time (needs ‘guidance’)

• “The study of mental faculties through the use of computational models” (Charniak and McDermott)

• “The study of the computations that make it possible to perceive, reason, and act” (Winston)

Systems that act rationally: “Rational agent”

• Rational behavior: doing the right thing

• The right thing: that which is expected to maximize goal achievement, given the available information

• Giving answers to questions is ‘acting’.

• It does not matter whether a system

1. replicates human thought processes

2. makes the same decisions as humans

3. uses purely logical reasoning

APPLICATIONS OF AI

1. Autonomous Planning & Scheduling:

• Autonomous rovers

• Telescope scheduling

• Analysis of data

2. Medicine:

• Image-guided surgery

• Image analysis and enhancement

3. Transportation:

• Autonomous vehicle control

• Pedestrian detection

4. Games:

• Games

• Robotic toys, etc.

2. Explain the historical developments of Artificial Intelligence.

Ans: AI has roots in a number of scientific disciplines:

o computer science and engineering (hardware and software)
o philosophy (rules of reasoning)
o mathematics (logic, algorithms, optimization)
o cognitive science and psychology (modeling high-level human/animal thinking)
o neural science (modeling low-level human/animal brain activity)
o linguistics
• The birth of AI (1943 – 1956)
o Pitts and McCulloch (1943): a simplified mathematical model of neurons (resting/firing states)
that can realize all propositional logic primitives (can compute all Turing-computable functions)
o Alan Turing: Turing machine (1936) and Turing test (1950)
o Claude Shannon: information theory; possibility of chess-playing computers
o Tracing back to Boole, Aristotle, Euclid (logic, syllogisms)
• Early enthusiasm (1952 – 1969)
o 1956 Dartmouth conference
• John McCarthy (Lisp);
• Marvin Minsky (first neural network machine);
• Allen Newell and Herbert Simon (GPS);
• Emphasis on intelligent general problem solving
• GPS (means-ends analysis);
• Lisp (AI programming language);
• Resolution by John Robinson (basis for automatic theorem proving);
• heuristic search (A*, AO*, game tree search)
• Emphasis on knowledge (1966 – 1974)
o domain-specific knowledge is the key to overcoming existing difficulties
o knowledge representation (KR) paradigms
o declarative vs. procedural representation
• Knowledge-based systems (1969 – 1999)
o DENDRAL: the first knowledge-intensive system (determining 3D structures of complex
chemical compounds)
o MYCIN: first rule-based expert system (containing 450 rules for diagnosing infectious blood
diseases)
• EMYCIN: an ES shell
• PROSPECTOR: first knowledge-based system that made a significant profit (geological ES for mineral
deposits)
• AI became an industry (1980 – 1989)
o wide applications in various domains
o commercially available tools
• Current trends (1990 – present)
o more realistic goals
o more practical (application oriented)
o distributed AI and intelligent software agents
o resurgence of neural networks and emergence of genetic algorithms

3. Explain the foundations of Artificial Intelligence.

Ans:

• Philosophy

o The study of human intelligence began at a time when there was no formal way to express it
o Initiated the idea of the mind as a machine and its internal operations

• Mathematics

o Formalizes the three main areas of AI: computation, logic, and probability
o Computation leads to the analysis of which problems can be computed (complexity theory)
o Probability contributes the “degree of belief” to handle uncertainty in AI
o Decision theory combines probability theory and utility theory (bias)

• Psychology

o How do humans think and act?
o The study of human reasoning and acting
o Provides reasoning models for AI
o Strengthens the idea that humans and other animals can be considered information-processing
machines

• Computer Engineering

o How to build an efficient computer?
o Provides the artifact that makes AI applications possible
o The power of computers makes large and difficult computations easier
o AI has also contributed its own work to computer science, including time-sharing, the linked list
data type, OOP, etc.

• Control theory and Cybernetics

o How can artifacts operate under their own control?
o The artifacts adjust their actions to do better in the environment over time
o Based on an objective function and feedback from the environment
o Not limited to linear systems; also applies to other problems such as language, vision, and planning

• Linguistics

o For understanding natural languages
o Different approaches have been adopted from linguistic work
o Formal languages
o Syntactic and semantic analysis
o Knowledge representation

4. Explain types of algorithms in AI.

Ans: There are two types of search algorithms in AI:

Uninformed Search Strategies

• Breadth-First search

• BFS uses a queue data structure and finds the shortest path in an unweighted graph.
• BFS works on the concept of FIFO (First In, First Out).
• BFS is more suitable for searching vertices closer to the given source.
• In BFS there is no concept of backtracking.

• Depth-First search

• DFS uses a stack data structure.
• DFS works on the concept of LIFO (Last In, First Out).
• It is more suitable when solutions lie far from the source.
• DFS is a recursive algorithm that uses the idea of backtracking.

• Uniform-Cost search

Uniform-cost search is an uninformed search algorithm that uses the lowest cumulative cost to find a
path from the source to the destination.

Nodes are expanded, starting from the root, in order of minimum cumulative cost.

• Depth-First Iterative Deepening search

IDDFS combines depth-first search’s space efficiency with breadth-first search’s fast search (for nodes
closer to the root).

IDDFS calls DFS for increasing depth limits, starting from an initial value.

Informed Search Strategies

• Hill climbing

• Best-first search

• Greedy search

• Beam search

• Algorithm AO*

• Algorithm A*

A* is the most commonly known form of best-first search.

It uses h(n) + g(n).

It combines features of UCS and greedy best-first search.

It finds the shortest path through the search space using the heuristic function.

5. Give the outline of BFS and DFS algorithm.

Ans: BFS

• BFS stands for Breadth-First Search.
• BFS is a traversal approach in which we first visit all nodes on the same level before
moving on to the next level.
• BFS uses a queue data structure and finds the shortest path in an unweighted graph.
• BFS works on the concept of FIFO (First In, First Out).
• BFS is more suitable for searching vertices closer to the given source.
• In BFS there is no concept of backtracking.
• BFS is used in applications such as testing whether a graph is bipartite, finding shortest paths, etc.
• It requires more memory.
• BFS is optimal for finding the shortest path when every edge has the same cost.
Path: [S, A, B, C, D, E, F]

DFS

• DFS stands for Depth-First Search.
• DFS is a traversal approach in which the traversal begins at the root node and proceeds through
the nodes as far as possible, until it reaches a node with no unvisited neighboring nodes.
• DFS uses a stack data structure.
• DFS works on the concept of LIFO (Last In, First Out).
• It is more suitable when solutions lie far from the source.
• DFS is a recursive algorithm that uses the idea of backtracking.
• DFS is used in applications such as detecting cycles in graphs, topological ordering, etc.
• It requires less memory.
• DFS is not optimal for finding the shortest path.
Path: [A, B, D, H, I, E, J, C, F, K]
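
The two outlines above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the graph GRAPH and its node names are hypothetical and do not correspond to the traversal paths listed above, and the function names bfs and dfs are chosen only for this example. The only difference between the two functions is the container: a FIFO queue for BFS and a LIFO stack for DFS.

```python
from collections import deque

# Hypothetical example graph as an adjacency list (not the graph from the notes).
GRAPH = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": ["F"],
    "F": [],
}

def bfs(graph, start):
    """Visit nodes level by level using a FIFO queue."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()              # first in, first out
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
    return visited

def dfs(graph, start):
    """Follow one branch as deep as possible using a LIFO stack."""
    visited = []
    stack = [start]
    while stack:
        node = stack.pop()                  # last in, first out
        if node in visited:
            continue
        visited.append(node)
        # Push neighbours in reverse so the leftmost one is explored first.
        for neighbour in reversed(graph[node]):
            if neighbour not in visited:
                stack.append(neighbour)
    return visited

print(bfs(GRAPH, "A"))   # level-by-level order
print(dfs(GRAPH, "A"))   # branch-by-branch order
```

On this sample graph BFS visits B and C before any deeper node, while DFS reaches D (two levels down) before it ever visits C.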

6. Explain uninformed search algorithms of AI.

Ans.

Uninformed search strategies include:

1. Breadth-First search

Breadth-first search is a graph traversal algorithm that starts at the root
node and explores all of its neighboring nodes. It then moves to each of those nodes in turn and
explores their unexplored neighbors, proceeding level by level.

2. Depth-First search

Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The
algorithm starts at the root node and explores as far as possible along each branch before backtracking. Extra
memory, usually a stack, is needed to keep track of the nodes discovered so far along a specified branch,
which helps in backtracking.

3. Uniform-Cost search

Uniform-cost search is an uninformed search algorithm used for traversing a weighted tree or graph. It
comes into play when edges have different costs. The primary goal of
uniform-cost search is to find a path to the goal node with the lowest cumulative cost. Nodes are
expanded, starting from the root, in order of their cumulative path cost from the root node. It can be
used on any graph or tree where the optimal cost is required. Uniform-cost search is implemented with a
priority queue that gives maximum priority to the lowest cumulative cost. Uniform-cost search is
equivalent to BFS if all edges have the same cost.
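
A minimal sketch of uniform-cost search with a priority queue is given below. The weighted graph GRAPH, its edge costs, and the function name uniform_cost_search are assumptions made for illustration only. Python's heapq module serves as the priority queue, so the entry with the lowest cumulative cost is always expanded first.

```python
import heapq

# Hypothetical weighted graph: node -> list of (neighbour, edge cost).
GRAPH = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 3)],
    "G": [],
}

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of a lowest-cost path, or None if no path exists."""
    frontier = [(0, start, [start])]        # priority queue ordered by cumulative cost
    best_cost = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue                        # stale entry: a cheaper route was already found
        for neighbour, step in graph[node]:
            new_cost = cost + step
            if new_cost < best_cost.get(neighbour, float("inf")):
                best_cost[neighbour] = new_cost
                heapq.heappush(frontier, (new_cost, neighbour, path + [neighbour]))
    return None

print(uniform_cost_search(GRAPH, "S", "G"))
```

In this example the direct-looking routes S-B-G (cost 7) and S-A-G (cost 7) lose to S-A-B-G (cost 6), which shows why the goal test is applied when a node is removed from the queue rather than when it is first generated.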

4. Depth-First Iterative Deepening search

IDDFS combines depth-first search’s space efficiency with breadth-first search’s fast search (for nodes
closer to the root).

IDDFS calls DFS for increasing depth limits, starting from an initial value. In every call, DFS is
restricted from going beyond the given depth. So basically we do DFS in a BFS fashion.

The main idea is to recompute the nodes on the frontier instead of storing
them. Every recomputation is a DFS, so the search uses less space. In this way a breadth-first
search can be turned into an iterative deepening search:

• A depth-limited DFS is used, which searches only up to a given depth limit.
• It first generates all routes of length 1 in DFS fashion. Next, it repeats the search with depth
limits 2, 3, and onwards.
• It can discard all of the preceding computation at the beginning of each iteration and start over.
Hence, if there is a solution in the tree, it will eventually be found at some depth, because
routes are enumerated in order of depth.
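
The iterative deepening idea can be sketched as a depth-limited DFS wrapped in a loop over increasing limits. The tree TREE and the names depth_limited_search and iddfs are hypothetical, chosen only for this example, and the sketch assumes the search space is a tree (no cycles).

```python
# Hypothetical tree as an adjacency list.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": [],
    "F": [],
}

def depth_limited_search(tree, node, goal, limit):
    """DFS that never goes deeper than `limit` edges below `node`."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in tree[node]:
        path = depth_limited_search(tree, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iddfs(tree, start, goal, max_depth):
    """Run depth-limited DFS with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        # Nothing is kept between iterations; each one starts from scratch.
        path = depth_limited_search(tree, start, goal, limit)
        if path is not None:
            return path, limit
    return None

print(iddfs(TREE, "A", "F", max_depth=5))
```

Because the limits are tried in increasing order, the first path found is also a shallowest one, which is the BFS-like property the text describes.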
7. Explain informed search algorithms of AI.

Ans:

Informed Search Strategies

• Hill climbing

• Best-first search

• Greedy search

• Beam search

• Algorithm AO*

• Algorithm A*

• A* search:

A* is the most commonly known form of best-first search. It uses h(n) + g(n) and combines features
of UCS and greedy best-first search. It finds the shortest path through the search space using the
heuristic function. This search algorithm expands a smaller part of the search tree and provides an
optimal result faster, provided the heuristic never overestimates the remaining cost. It uses the
search heuristic h(n) as well as the cost g(n) to reach the node. Hence we can combine both costs as
f(n) = g(n) + h(n), and this sum is called the fitness number.
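
A minimal sketch of A* is shown below. The weighted graph GRAPH, the heuristic table H, and the function name a_star are all hypothetical; the heuristic values were picked so that they never overestimate the true remaining cost. Nodes are taken from the priority queue in order of f(n) = g(n) + h(n).

```python
import heapq

# Hypothetical weighted graph: node -> list of (neighbour, edge cost).
GRAPH = {
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 3)],
    "G": [],
}
# Hypothetical heuristic h(n): estimated cost from n to the goal G.
H = {"S": 5, "A": 4, "B": 3, "G": 0}

def a_star(graph, h, start, goal):
    """Return (cost, path), expanding nodes in order of f = g + h."""
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                             # stale entry: a cheaper route exists
        for neighbour, step in graph[node]:
            new_g = g + step                     # cost so far, g(n)
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                new_f = new_g + h[neighbour]     # fitness number f(n) = g(n) + h(n)
                heapq.heappush(frontier, (new_f, new_g, neighbour, path + [neighbour]))
    return None

print(a_star(GRAPH, H, "S", "G"))
```

Setting every value in H to 0 turns this sketch into uniform-cost search, which is one way to see A* as UCS plus a heuristic.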

8. What is an expert system? Explain with example.

Ans.
• Expert systems are computer applications developed in AI to solve complex problems in a
particular domain at the level of extraordinary human intelligence and expertise. An expert system
emulates the decision-making ability of a human expert.
• Expert systems are designed to solve complex problems by reasoning through bodies of
knowledge, represented mainly as if-then rules rather than through conventional procedural
code.
• Expert systems have knowledge specific to one problem domain, e.g., medicine, science,
engineering, etc. The expert’s knowledge is held in a knowledge base, which contains
accumulated experience that has been loaded and tested in the system.
• Like other artificial intelligence systems, an expert system’s knowledge may be enhanced with
add-ons to the knowledge base or additions to the rules.
• The more experience entered into the expert system, the more the system can improve its
performance.
• Characteristics of an Expert System:
1. Highly responsive
2. Reliable
3. Understandable
4. High performance

Example of an Expert System

• MYCIN was built using the LISP programming language, one of the earliest AI programming languages.
• MYCIN is a goal-directed system that uses a backward chaining reasoning approach.
• MYCIN was an early backward chaining expert system that used artificial intelligence to identify
bacteria causing severe infections, such as bacteremia and meningitis.
• The name is derived from the antibiotics themselves, as many antibiotics have the suffix
"-mycin".
• This system was able to perform as well as some experts and considerably better than junior
doctors. A consultation with MYCIN begins with requests for routine information such as age,
medical history and so on, progressing to more specific questions as required.
• MYCIN helps the physician to prescribe disease-specific drugs. MYCIN informs itself about
particular cases by requesting information from the physician about a patient’s symptoms,
general condition, history, and laboratory-test results.
• At each point, the question MYCIN asks is determined by MYCIN’s current hypothesis and the
answers to all previous questions. When MYCIN is satisfied that it has a reasonably good grasp of
the situation, MYCIN announces its diagnosis.

9. What are the different phases in expert system?

Ans. Building an ES initially requires extracting the relevant knowledge from a human domain expert;
this knowledge is often based on useful rules of thumb and experience rather than absolute certainties.
The developed system should be able to explain its reasoning to its users and answer questions about the
solution process. Moreover, updating the system should just involve adding or deleting localized regions
of knowledge.
The different phases involved in building an ES may be categorized as follows:

• Identification Phase: In this phase, the knowledge engineer determines important features of
the problem with the help of the human domain expert. The parameters that are determined in
this phase include the type and scope of the problem, the kind of resources required, and the
goal and objective of the ES.
• Conceptualization Phase: In this phase, the knowledge engineer and domain expert decide the
concepts, relations, and control mechanism needed to describe the problem-solving method. At
this stage, the issue of granularity is also addressed, which refers to the level of detail required in
the knowledge.
• Formalization Phase: This phase involves expressing the key concepts and relations in some
framework supported by ES building tools. Formalized knowledge consists of data structures,
inference rules, control strategies, and languages required for implementation.
• Implementation Phase: During this phase, formalized knowledge is converted to a working
computer program, initially called the “prototype” of the whole system.
• Testing Phase: This phase involves evaluating the performance and utility of the prototype system
and revising the system, if required. The domain expert evaluates the prototype system and
provides feedback, which helps the knowledge engineer to revise it.

10. What are the different expert system examples?

Ans. An expert system is an interactive and reliable computer-based decision-making system which uses
both facts and heuristics to solve complex decision-making problems. It is considered to operate at the
highest level of human intelligence and expertise. The purpose of an expert system is to solve the most
complex issues in a specific domain.

MYCIN: It was based on backward chaining and could identify various bacteria that could cause acute
infections. It could also recommend drugs based on the patient’s weight. It is one of the best-known
expert systems.

DENDRAL: An expert system used for chemical analysis to predict molecular structure.

PXDES: An expert system used to predict the degree and type of lung cancer.

CaDet: An expert system that can identify cancer at early stages.

11. Write a short note on MYCIN expert system.

Ans: MYCIN was developed using the expertise of the best diagnosticians of
bacterial infections, and its performance was found to be better than average.
MYCIN was created in 1972, when Edward Shortliffe developed the system with a team from
Stanford University.
MYCIN was designed to help identify bacteria that cause blood infections and other severe
infections like meningitis.
The MYCIN system was a computer-based system that physicians used to identify blood infections
and the most appropriate treatments.
The MYCIN expert system used backward chaining to diagnose infections based on
symptoms and medical history and to recommend treatment based on the data received.

MYCIN was named after a typical class of antibiotics in use, many of which end in "-mycin".
Backward chaining is a form of reasoning used in artificial intelligence. In this
context, backward chaining started from the hypothesis that the patient had an infection and worked
back through several steps to determine the type of bacteria and which antibiotics to use.
Its advantages included making it easier to find the causes because the endpoint was known.

12. Explain expert system architecture?

Ans) The knowledge base contains the specific domain knowledge that is used by an expert to derive
conclusions from facts.

In the case of a rule-based expert system, this domain knowledge is expressed in the form of a series of
rules.

The explanation system provides information to the user about how the inference engine arrived at its
conclusions. This can often be essential, particularly if the advice being given is of a critical nature, such
as with a medical diagnosis system.
If the system has used faulty reasoning to arrive at its conclusions, then the user may be able to see this by
examining the data given by the explanation system.

The fact database contains the case-specific data that are to be used in a particular case to derive a
conclusion.

In the case of a medical expert system, this would contain information that had been obtained about the
patient’s condition.

The user of the expert system interfaces with it through a user interface, which provides access to the
inference engine, the explanation system, and the knowledge-base editor.

The inference engine is the part of the system that uses the rules and facts to derive conclusions. The
inference engine will use forward chaining, backward chaining, or a combination of the two to make
inferences from the data that are available to it.

The knowledge-base editor allows the user to edit the information that is contained in the knowledge base.

The knowledge-base editor is not usually made available to the end user of the system but is used by the
knowledge engineer or the expert to provide and update the knowledge that is contained within the system.
13. Write a short note on Forward and Backward chaining.
Ans) Forward chaining

Forward chaining is a method of reasoning in artificial intelligence in which inference rules are applied to
existing data to extract additional data until an endpoint (goal) is achieved.

In this type of chaining, the inference engine starts by evaluating existing facts, derivations, and conditions
before deducing new information. An endpoint (goal) is achieved through the manipulation of knowledge
that exists in the knowledge base.

Backward chaining

Backward chaining is a concept in artificial intelligence that involves backtracking from the endpoint or goal
to the steps that led to the endpoint. This type of chaining starts from the goal and moves backward to
comprehend the steps that were taken to attain this goal.

The backtracking process can also enable a person to establish logical steps that can be used to find other
important solutions.
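
The contrast between the two strategies can be sketched with a toy rule base. Everything here is hypothetical: the rules, the fact names, and the function names forward_chain and backward_chain are made up for illustration and are not taken from any real expert system. Each rule is a pair of (set of premises, conclusion).

```python
# Hypothetical rule base: (set of premises, conclusion).
RULES = [
    ({"croaks", "eats_flies"}, "is_frog"),
    ({"is_frog"}, "is_green"),
    ({"chirps", "sings"}, "is_canary"),
]

def forward_chain(facts, rules):
    """Data-driven: fire rules whose premises hold until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, seen=frozenset()):
    """Goal-driven: prove `goal` by proving the premises of a rule that concludes it."""
    if goal in facts:
        return True
    if goal in seen:                 # guard against circular rules
        return False
    seen = seen | {goal}
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen) for p in premises
        ):
            return True
    return False

facts = {"croaks", "eats_flies"}
print(forward_chain(facts, RULES))               # everything derivable from the facts
print(backward_chain("is_green", facts, RULES))  # is this one goal provable?
```

Forward chaining derives every conclusion the facts support, whether or not anyone asked for it; backward chaining touches only the rules relevant to the goal it is given.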

14. Explain Components of an Expert System.

Ans) There are 5 Components of expert systems:

• Knowledge Base

• Inference Engine

• Knowledge acquisition and learning module

• User Interface

• Explanation module

• Knowledge base: The knowledge base in an expert system represents facts and rules. It
contains knowledge of a specific domain along with rules for solving problems and forming procedures
that are relevant to the domain.

• Inference engine: The most basic function of the inference engine is to acquire relevant data
from the knowledge base, interpret it, and find a solution to the user’s problem. Inference
engines also have explanatory and debugging abilities.

• Knowledge acquisition and learning module: This component allows the expert
system to acquire more data from various sources and store it in the knowledge base.

• User interface: This component is essential for a non-expert user to interact with the expert
system and find solutions.

• Explanation module: As the name suggests, this module helps in providing the user with an
explanation of the achieved conclusion.

15. What are the benefits of expert systems.

Ans.

• Improves decision quality

• Cuts the expense of consulting experts for problem-solving

• Provides fast and efficient solutions to problems in a narrow area of specialization

• Can gather scarce expertise and use it efficiently

• Offers consistent answers for repetitive problems

• Maintains a significant level of information

• Helps you to get fast and accurate answers

• Gives a proper explanation of its decision making

• Is able to solve complex and challenging issues

• Can work steadily without getting emotional, tense, or fatigued

16. What are the limitations of the expert systems?

Ans. Expert systems are computer applications developed in AI to solve complex problems in a
particular domain at the level of extraordinary human intelligence and expertise. An expert system
emulates the decision-making ability of a human expert.

LIMITATIONS

• Unlike a human being, it cannot produce a creative response in an extraordinary situation or for
different scenarios.
• The maintenance and development costs of an expert system are very high.
• Each problem is different, so the solution from a human expert can also be different and
more creative.
• The response of the expert system may be wrong if the knowledge base contains wrong
information.
• Knowledge acquisition for designing the system is very difficult.
• Each domain requires its own specific ES, which is one of the big limitations.
• It cannot learn by itself and hence requires manual updates.

17. Write the differences between Expert Systems and Traditional Systems.

18. What is Rule based Expert Systems?

Ans: If-then rules are one of the most common forms of knowledge representation used in
expert systems. Systems employing such rules as the major representation paradigm are called
rule-based systems. Some people refer to them as production systems. There are some
differences between rule-based systems and production systems, but we will ignore these
differences and use the terms interchangeably. In computer science, a rule-based system is
used to store and manipulate knowledge to interpret information in a useful way. It is often
used in artificial intelligence applications and research. Rule-based systems can also be constructed
using automatic rule inference, as in rule-based machine learning.
A rule-based expert system consists of three important elements:

• Set of Facts: These are assertions or anything relevant to the beginning state of the
system.
• Set of Rules: It contains all actions that should be taken within the scope of a problem
and specifies how to act on the assertion set. Rules are expressed in IF-THEN
form.
• Termination Criteria or Interpreter: Determines whether a solution exists or not, as well
as when to terminate the process.

19. What is a Blackboard System?

Ans: The blackboard is the common data structure shared by the knowledge sources. The blackboard
holds all the states of the given problem space. It usually contains several levels of description of
the problem space. These levels may have several relationships with each other, and they are part of
the same data structure. If more than one data structure is needed, the representation is broken into
panels, and each panel can hold multiple levels.

A knowledge source is a component that contributes to the solution of the problem. It is anything that
reads from the blackboard and suggests changes to parts of the blackboard. Usually, knowledge
sources are independent of one another.

The scheduler controls and decides which knowledge source will get an opportunity to change the
blackboard. In every execution cycle, the scheduler observes the changes made to the blackboard and
activates a knowledge source to make the next change.
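
The interplay of blackboard, knowledge sources, and scheduler can be sketched as below. This is only an illustration under simplifying assumptions: the blackboard is a single dictionary with no levels or panels, the two knowledge sources (tokenizer and counter) solve a deliberately trivial problem, and all class and function names are hypothetical.

```python
class Blackboard:
    """Shared data structure that every knowledge source reads and writes."""
    def __init__(self, text):
        self.data = {"text": text}

def tokenizer(board):
    """Knowledge source: contributes tokens once raw text is available."""
    if "text" in board.data and "tokens" not in board.data:
        board.data["tokens"] = board.data["text"].split()
        return True
    return False

def counter(board):
    """Knowledge source: contributes a word count once tokens exist."""
    if "tokens" in board.data and "count" not in board.data:
        board.data["count"] = len(board.data["tokens"])
        return True
    return False

def scheduler(board, sources):
    """Each cycle, let the first applicable knowledge source change the board."""
    log = []
    progress = True
    while progress:
        progress = False
        for source in sources:
            if source(board):            # the source inspected the board and contributed
                log.append(source.__name__)
                progress = True
                break                    # start a new cycle after every change
    return log

board = Blackboard("expert systems use rules")
print(scheduler(board, [counter, tokenizer]))
print(board.data["count"])
```

The knowledge sources never call each other; they communicate only through the blackboard, so counter runs after tokenizer even though it is listed first.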

20. Write a short note on Truth Maintenance System.

21. What are the different Applications of Expert System?

Ans :- The area of expert systems has been of interest to AI researchers. The major purpose of building
an ES for an organization is to preserve the know-how, experience, and expertise of the experts,
which is a valuable asset to the organization. The purpose of an ES is to provide this knowledge to other
members of the organization for problems in different types of domains. The appropriate
problem-solving technique depends generally on the type of problem and the domain. Applications may be
categorized into the following major classes:

DIAGNOSIS :- The expert systems belonging to this class perform the task of inferring system
malfunctions from observations. Such expert systems use situation descriptions, behaviour characteristics, or
knowledge about component design to determine the probable cause of a system malfunction. These
systems may also be used for diagnosis of faulty modules in large signal switching networks and for
finding faults in computer hardware systems. In the field of medicine, diagnosis can refer to inferring a
possible disease from a given set of symptoms.

PLANNING & SCHEDULING :- The expert systems of this class help in designing actions and plans before
actually solving a given problem. They analyse a set of one or more potentially complex and interacting
goals in order to determine a set of actions that are needed to achieve these goals.

DESIGN & MANUFACTURING:- This is one of the most important areas for ES applications. Here, a
solution to a problem is configured from a given set of objects under a set of constraints. Configuration
applications were pioneered by computer companies to facilitate the manufacturing of semi-custom
minicomputers.

PREDICTION :- The expert systems of this class perform the task of inferring the likely consequences of
a given situation.

INTERPRETATION :- The expert systems of this class perform the task of interpreting and inferring
situation description data of any domain such as geological data, census data, medical data, etc.

FINANCIAL DECISION MAKING :- The financial services industry has been a prominent user of ES
techniques. Such systems assist insurance companies to assess the risk presented by the customer and
to determine a price for the insurance.

PROCESS MONITORING AND CONTROL :- Expert systems belonging to this class analyse real-time data
from physical devices with the goal of comparing observations to expected outcomes, predicting trends,
and controlling for both optimality and failure correction.

INSTRUCTION:- An ES can offer tutorials and instructions to students by incorporating models of a
student’s behaviour and learning capability, and can also evaluate a student’s acquired skills.

DEBUGGING:- The systems of this class prescribe remedies for malfunctioning devices.

22. What are the different Shells and Tools for Expert system?

Ans: Only a few AI methods, such as knowledge representation and
inference strategies, are essentially required in building expert systems, but
there is a wide variation in domain knowledge. Thus, we can develop systems
containing these useful methods without any domain-specific knowledge. Such
systems are known as skeletal systems, shells, or simply AI tools. Most expert
systems are developed using these specialized software tools. A shell is a
complete development environment that may be used for building and
maintaining knowledge-based applications. It provides the knowledge engineer
with a step-by-step methodology, which allows domain experts to be directly
involved in structuring and encoding the knowledge. These shells have a built-in
inference mechanism (backward chaining, forward chaining, or both) and require
knowledge to be entered according to a specified format while developing an ES.
Shells typically come with a number of other features, such as tools for
constructing friendly user interfaces, writing hypertext, etc. For a detailed
coverage of ES shells, interested readers can refer to the ‘ES Shell at Work’ series
by Schmuller (1991, 1992).
Using a shell for building expert systems has certain significant advantages, such as
a reduction in development time and costs. Moreover, the necessary knowledge
regarding an application domain can be entered to perform a unique task using a
shell, thus generating an ES for that particular application in the domain. If the
problem is not very complicated and if an expert has had some training in the use
of a shell, the expert can enter the knowledge himself or herself as well.
A large number of commercial shells are available for all types of systems, such as
PCs, workstations, and large mainframe computers. These shells range in
complexity from simple, forward-chained, rule-based systems, which require a
few days of training, to complex systems that can be used by highly trained
knowledge engineers only. They also range from general-purpose shells to
custom-tailored shells for a class of applications, such as real-time process control or
financial planning. An important point that must be kept in mind regarding shells
is that although they simplify programming, they do not help with knowledge
acquisition.
AAI Assignment 2

1. Explain joint probability with example. (21306A1074-Vaishnavi Pangam)


Ans- •Joint probability is the probability of two events occurring together (in conjunction).
•The joint probability of A and B is written as P(A ^ B) or P(A and B). For two independent
events it is given by
P(A and B) = P(A) * P(B)
•Two events are said to be independent if the occurrence of one event does not affect the
probability of occurrence of the other.
•Consider tossing two fair coins separately. The probability of getting a head H on the first
coin is denoted by P(A) = 0.5, and the probability of getting a head on the second coin is
denoted by P(B) = 0.5. The probability of getting H on both coins is called the joint
probability and is represented as P(A and B). It is calculated as follows.

P(A and B) = P(A)* P(B)

= 0.5 * 0.5 = 0.25

•Similarly, the probability of getting a head H on one or both coins can be calculated. It is
called the union of the events A and B, denoted by P(A U B) and also written as P(A or B). For
the above example, with A and B independent, it is calculated as follows:
P(A or B) = P(A) + P(B) – P(A) * P(B)
= 0.5 + 0.5 – 0.25
= 0.75
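
The two results above can be cross-checked by listing every outcome of the two tosses. The
following is a minimal sketch, assuming fair coins and equally likely outcomes; the helper name
probability is chosen here for illustration only.

```python
from itertools import product
from fractions import Fraction

# Sample space for two fair coins: HH, HT, TH, TT, all equally likely.
outcomes = list(product("HT", repeat=2))

def probability(event):
    """Probability of an event given as a predicate over (first, second)."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

p_a = probability(lambda o: o[0] == "H")  # head on the first coin
p_b = probability(lambda o: o[1] == "H")  # head on the second coin
p_a_and_b = probability(lambda o: o[0] == "H" and o[1] == "H")
p_a_or_b = probability(lambda o: o[0] == "H" or o[1] == "H")

print(float(p_a_and_b))  # 0.25
print(float(p_a_or_b))   # 0.75
```

Counting outcomes directly gives the same values as the product and union formulas, which is
what independence of the two tosses predicts.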

2. Explain conditional probability with example (21306A1023 – Komal Gupta)


Ans:- Conditional probability is the probability of an event occurring given that another event
has already happened.
The probability of A under the condition B can be written as:
P(A|B) = P(A⋀B)/P(B)
where P(A⋀B) = joint probability of A and B, and P(B) = marginal probability of B.
Similarly, the probability of B under the condition A is given as:
P(B|A) = P(A⋀B)/P(A)
Example:
In a class, 70% of the students like English and 40% of the students like both English and
mathematics. What percentage of the students who like English also like mathematics?
Solution:
Let A be the event that a student likes Mathematics and
B be the event that a student likes English.
P(A|B) = P(A⋀B)/P(B)
= 0.4 / 0.7
≈ 0.57 = 57%
Hence, 57% of the students who like English also like Mathematics.
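
The calculation can be written as a small function. This is only a sketch; the function name
conditional_probability is an assumption made for illustration.

```python
def conditional_probability(p_a_and_b, p_b):
    """Return P(A|B) = P(A and B) / P(B); P(B) must be positive."""
    if p_b <= 0:
        raise ValueError("P(B) must be greater than 0")
    return p_a_and_b / p_b

# Classroom example: P(English) = 0.7, P(English and Mathematics) = 0.4
p_maths_given_english = conditional_probability(0.4, 0.7)
print(round(p_maths_given_english, 2))  # 0.57
```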

3. Explain Bayes’s theorem with example..(21306A1011 – Shraddha Kasar)


Ans:- Bayes’ theorem is named after the British mathematician Thomas Bayes; his work on it was
published posthumously in 1763. It provides a mathematical model for reasoning where prior
beliefs are combined with evidence to get estimates of uncertainty. It relies on the concept that
one should incorporate the prior probability of an event into the interpretation of a new situation.
Bayes’ theorem can be written as:
P(H|E) = [P(E|H) * P(H)] / P(E)
Using conditional probability, P(H|E) can be expressed as:
P(H|E) = P(H and E) / P(E)
P(H|E) * P(E) = P(H and E) -------(1)
Similarly, P(E|H) can be expressed as
P(E|H) = P(E and H) / P(H)
P(E|H) * P(H) = P(E and H) -------(2)
Since P(H and E) = P(E and H), from Eqs (1) and (2) we get
P(H|E) * P(E) = P(E|H) * P(H)
Hence, we obtain
P(H|E) = P(E|H) * P(H) / P(E)
The denominator can be expanded, as P(E) = P(E and H) + P(E and ~H).
Using conditional probability, we obtain
P(E and H) = P(E|H) * P(H)
P(E and ~H) = P(E|~H) * P(~H)
Therefore P(E) = P(E|H) * P(H) + P(E|~H) * P(~H), and
P(H|E) = P(E|H) * P(H) / (P(E|H) * P(H) + P(E|~H) * P(~H))
• P(H) is known as the prior probability of H. It is called prior probability because it does
not contain any information regarding E.
• P(H|E) is known as the conditional probability of H, given E. It is also called posterior
probability because it is derived from or depends on the specified value of E.
• P(E|H) is known as the conditional probability of E given H.
• P(E) is the prior probability of E and acts as a normalizing constant.
Example:-
Suppose the probability that Mike has a cold is 0.25, the probability that Mike was observed
sneezing when he had a cold in the past is 0.9, and the probability that Mike was observed
sneezing when he did not have a cold is 0.20. Find the probability of Mike having a cold given
that he sneezes.
Solution:-
H : Mike has a cold, P(H) = 0.25, so P(~H) = 0.75
E : Mike sneezes
P(Mike was observed sneezing | Mike has a cold) = P(E|H) = 0.9
P(Mike was observed sneezing | Mike does not have a cold) = P(E|~H) = 0.2
P(Mike has a cold | Mike was observed sneezing) = P(H|E), to be calculated:
P(H|E) = P(E|H) * P(H) / (P(E|H) * P(H) + P(E|~H) * P(~H))
= (0.9 * 0.25) / (0.9 * 0.25 + 0.2 * 0.75)
= 0.225 / 0.375
= 0.6
Similarly, the probability of Mike having a cold when he is not sneezing is
P(H|~E) = [P(~E|H) * P(H)] / P(~E)
= [(1 - 0.9) * 0.25] / (1 - 0.375)
= 0.025 / 0.625 = 0.04
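
Both figures in the Mike example can be reproduced with a short function. This is a sketch under
the assumption that only H and ~H are possible; the name bayes_posterior and its parameter names
are illustrative, not taken from the text.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return P(H|E) from P(H), P(E|H) and P(E|~H) using Bayes' theorem."""
    # P(E) = P(E|H) * P(H) + P(E|~H) * P(~H)
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Mike sneezes: P(H) = 0.25, P(E|H) = 0.9, P(E|~H) = 0.2
p_cold_given_sneeze = bayes_posterior(0.25, 0.9, 0.2)

# Mike does not sneeze: use P(~E|H) = 1 - 0.9 and P(~E|~H) = 1 - 0.2
p_cold_given_no_sneeze = bayes_posterior(0.25, 1 - 0.9, 1 - 0.2)

print(round(p_cold_given_sneeze, 2))     # 0.6
print(round(p_cold_given_no_sneeze, 2))  # 0.04
```

The same function covers the "not sneezing" case because P(~E) is expanded in exactly the same
way as P(E).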
4. Write a short note on probabilities in rules and facts of rule-based system.
(Gorima(21306A1054))
Ans. The fact ‘the battery in a randomly picked computer is dead 2% of the time’ can be
expressed in Prolog as
battery_dead_computer(0.02).
• The rule ‘The probability of the battery being dead is the same as the probability of the
circuit being faulty’ may be written in Prolog as
battery_dead_computer(P) :- computer_circuit_faulty(P).
To ignore weak evidence, we can write:
battery_dead_computer(P) :- computer_circuit_faulty(P), P > 0.1.
• The rule ‘30% of the time when a computer has a circuit problem, the battery is dead’ can be
written as:
battery_dead_computer(P) :- computer_circuit_faulty(P1), P is P1 * 0.3.

5. What are cumulative probabilities? Madhushree Parab (21306A1026)


SOLUTION:
It is important to combine the probabilities from the facts and successful rules to get a
cumulative probability. The following two situations arise:
1. If the sub goals of a rule are probable, then the probability of the rule succeeding should
take the probable sub goals into account.
2. If rules with the same conclusion have different probabilities, then the overall probability
of the conclusion must be found.
The first situation is resolved by computing the cumulative probability of the conclusion
with the help of and-combination, assuming that all sub goals are independent. In this case, the
probabilities of the sub goals on the right side of the rule are multiplied using the joint
probability formula as shown below:
Prob (A and B and C and…) = Prob(A) * Prob(B) * Prob(C) * ….
The second situation is handled by using or-combination to get the overall probability of the
predicate in the head of the rules. If the events are mutually independent, the following formula
is used to obtain the OR probability.
Prob (A or B or C or…) = 1-[(1-Prob(A)) (1-Prob(B)) (1-Prob(C)) ….]
Prolog Programs for Computing Cumulative Probabilities
To develop Prolog programs for computing probabilities, we must assume that all sub goals and
rules are independent of each other. The Prolog programs for and-combination as well as
or-combination are discussed below:
1. AND-combination: A list of all the probabilities is passed as an argument and the product
of all these probabilities is computed to obtain the and-combination effect using Prolog rules
as follows:
and_combination([P], P).
and_combination([H | T], P) :- and_combination(T, P1), P is P1 * H.
2. OR-combination: We obtain all the probabilities of the same predicate name (defined as the
head of different rules) and compute the or-combination probability using the following
formula:
Prob (A or B or C or…) = 1-[(1-Prob(A)) (1-Prob(B)) (1-Prob(C)) ….]
The Prolog rules for computing or-combination may be written as
or_combination([P], P).
or_combination([H | T], P) :- or_combination(T, P1), P is 1 - ((1 - H) * (1 - P1)).
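
For comparison, the same two combinations can be sketched in Python. This is an illustrative
translation, not part of the Prolog programs above; it assumes independent events, and the input
probabilities are made-up values.

```python
import math

def and_combination(probabilities):
    """P(A and B and ...) for independent events: product of the probabilities."""
    return math.prod(probabilities)

def or_combination(probabilities):
    """P(A or B or ...) for independent events: 1 - product of (1 - p)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)

# Hypothetical probabilities, chosen only for illustration.
print(round(and_combination([0.5, 0.5]), 4))  # 0.25
print(round(or_combination([0.5, 0.5]), 4))   # 0.75
```

With a single probability in the list both functions return that probability, which matches the
base cases of the recursive rules.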

6. What are rule based system and Bayesian method?(Sejal Shingre-21306A1040)


Rule-based system:
Rule-based methods are a popular class of techniques in machine learning and data
mining. They share the goal of finding regularities in data that can be expressed in the
form of an IF-THEN rule. Depending on the type of rule that should be found, we can
discriminate between association rule discovery and predictive rule learning. In the latter
case, one is often also interested in learning a collection of rules that collectively cover the
instance space in the sense that they can make a prediction for every possible instance.
Bayesian Method:
"A Bayesian network is a probabilistic graphical model which represents a set of variables and
their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.
Bayesian networks are probabilistic, because these networks are built from a probability
distribution, and also use probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship between
multiple events, we need a Bayesian network. It can also be used in various tasks including
prediction, anomaly detection, diagnostics, automated insight, reasoning, time series
prediction, and decision making under uncertainty.

7. What are Fuzzy Sets ? 21306a1061 Mitali Jadhav


Ans Fuzzy sets admit gradation, such as all the tones between black and white. A fuzzy set has
a graphical description that expresses how the transition from one state to another takes place.
This graphical description is called a membership function. A fuzzy set is a set whose elements
have degrees of membership between 0 and 1. Fuzzy sets are denoted with a tilde character (~).
For example, the number of cars following traffic signals at a particular time, out of all cars
present, will have a membership value in [0, 1]. Partial membership exists when a member of one
fuzzy set can also be a part of other fuzzy sets in the same universe. The degree of
membership or truth is not the same as probability; fuzzy truth represents membership in vaguely
defined sets.

8. What is the difference between fuzzy sets and classical sets?


(Muskan 21306A1025)
Ans.
Sr.No | Classical Set | Fuzzy Set
1 | Classical set defines the value as either 0 or 1. | Fuzzy set defines the value between 0 and 1, including both 0 and 1.
2 | It is also called a crisp set. | It specifies the degree to which something is true.
3 | It shows full membership. | It shows partial membership.
4 | Eg1. She is 18 years old. Eg2. Rahul is 1.6m tall. | Eg1. She is about 18 years old. Eg2. Rahul is about 1.6m tall.
5 | Classical set application used for digital design. | Fuzzy set used in the fuzzy controller.
6 | It is bi-valued function logic. | It is infinite-valued function logic.
7 | Full membership means totally true/false, yes/no, 0/1. | Partial membership means true to false, yes to no, 0 to 1.

9. What are the different Fuzzy set operations? ( Aaranta vijay waykar21306A1027)
There are three basic operations performed on fuzzy sets: fuzzy complement, fuzzy
intersection, and fuzzy union.
1. Complement of a fuzzy set, Ā(x):
The complement is the opposite of the set. The complement of a fuzzy set A is
denoted by Ā(x) and is defined with respect to the universal set X as follows:
Ā(x) = 1 - A(x) for all x ∈ X

Ex - A is a set which contains:

A = {(1,0.3),(5,0.8),(2,0.5),(4,0.1)}
Then,
Ā = {(1,0.7),(5,0.2),(2,0.5),(4,0.9)}

2. Intersection of fuzzy sets

The intersection shows how much of an element belongs to both sets; the element may have a
different degree of membership in each set.
The membership value at any x is the minimum of the membership values of x in the
two fuzzy sets.
Intersection is analogous to the logical AND operation.

Ex : A is a set which contains:

A = {(1,0.3),(5,0.8),(2,0.5),(4,0.1)}
B is a set which contains:
B = {(1,0.5),(5,0.5),(2,0.4),(4,0.7)}
Then,
A ∩ B = {(1,0.3),(5,0.5),(2,0.4),(4,0.1)}

3. Union of fuzzy sets

The union of two fuzzy sets consists of every element that falls into either set. The
membership value of an element is the larger of its membership values in set A and set B.
Ex : A is a set which contains:
A = {(X1,0.6),(X2,0.2),(X3,1),(X4,0.4)}
B is a set which contains:
B = {(X1,0.1),(X2,0.8),(X3,0),(X4,0.9)}
Then,
A ∪ B = {(X1,0.6),(X2,0.8),(X3,1),(X4,0.9)}
Some additional operations performed on fuzzy sets are equality, inequality,
containment, proper subset, product, power, bold union and bold intersection.
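
The three basic operations can be sketched with dictionaries that map each element to its
membership value. This is a minimal illustration, assuming both sets are defined over the same
elements; the function names are chosen here and the sets A and B are the ones from the
intersection example.

```python
def fuzzy_complement(a):
    """Membership in the complement: 1 - A(x), rounded to hide float noise."""
    return {x: round(1 - mu, 10) for x, mu in a.items()}

def fuzzy_intersection(a, b):
    """Element-wise minimum of the two membership values."""
    return {x: min(a[x], b[x]) for x in a}

def fuzzy_union(a, b):
    """Element-wise maximum of the two membership values."""
    return {x: max(a[x], b[x]) for x in a}

A = {1: 0.3, 5: 0.8, 2: 0.5, 4: 0.1}
B = {1: 0.5, 5: 0.5, 2: 0.4, 4: 0.7}

print(fuzzy_complement(A))       # {1: 0.7, 5: 0.2, 2: 0.5, 4: 0.9}
print(fuzzy_intersection(A, B))  # {1: 0.3, 5: 0.5, 2: 0.4, 4: 0.1}
print(fuzzy_union(A, B))         # {1: 0.5, 5: 0.8, 2: 0.5, 4: 0.7}
```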

10. Explain different types of Member ship Functions.


(Hiral Patel – 21306A1072)
Ans. A fuzzy membership function is used to convert the crisp input provided to the fuzzy
inference system into a degree of membership. Fuzzy logic itself is not fuzzy; rather, it deals
with the fuzziness in the data, and this fuzziness is best described by the fuzzy membership
function.
A fuzzy membership function is the graphical way of visualizing the degree of membership of any
value in a given fuzzy set. In the graph, the X axis represents the universe of discourse and the
Y axis represents the degree of membership in the range [0, 1].
A) Triangular Membership function:
This is one of the most widely accepted and used membership functions (MF) in fuzzy controller
design. The triangle which fuzzifies the input can be defined by three parameters a, b and c,
where a and c define the base and b defines the position of the peak of the triangle.

Triangle(x; a, b, c) = 0,               x <= a
                       (x - a)/(b - a), a <= x <= b
                       (c - x)/(c - b), b <= x <= c
                       0,               c <= x

B) Trapezoidal membership function:

The trapezoidal membership function is defined by four parameters: a, b, c and d. The span b to
c represents the highest membership value that an element can take.
The trapezoidal function reduces to the triangular function when b = c. For example, a trapezoid
may have a = 10, b = 30, c = 70 and d = 90.
Trapezoid(x; a, b, c, d) = 0,               x <= a
                           (x - a)/(b - a), a <= x <= b
                           1,               b <= x <= c
                           (d - x)/(d - c), c <= x <= d
                           0,               d <= x

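C) Gaussian membership function:

A Gaussian distribution curve forms another membership function. The membership function for a
symmetric Gaussian function is commonly defined as follows, where c is the centre and σ is the
width of the curve:
Gaussian(x; c, σ) = exp(-(x - c)^2 / (2σ^2))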

D) Generalized bell membership function:

A generalized bell MF (or bell-shaped function) is specified by three parameters (a, b, c):

Bell(x; a, b, c) = 1 / (1 + |(x - c)/a|^(2b))

where the parameter b is usually positive. (If b is negative, the shape of this MF becomes an
upside-down bell.) Note that this MF is a direct generalization of the Cauchy distribution used in
probability theory, so it is also referred to as the Cauchy MF.
E) Sigmoid membership function:
A sigmoid MF is defined by

sig(x; a, c) = 1 / (1 + exp[-a(x - c)])

where a controls the slope at the crossover point x = c.

Depending on the sign of the parameter a, a sigmoid MF is inherently open right or left and thus
is appropriate for representing concepts such as "very large" or "very negative". Sigmoid
functions of this kind are employed widely as the activation function of artificial neural
networks. Therefore, for a neural network to simulate the behaviour of a fuzzy inference system,
the first problem we face is how to synthesize a closed MF through a sigmoid function.
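
The membership functions described above can be sketched as plain Python functions. This is an
illustrative sketch only: the function names are assumptions, and the Gaussian uses the common
form exp(-(x - c)^2 / (2σ^2)) with centre c and width sigma, which is also an assumption of this
example.

```python
import math

def triangle(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoid(x, a, b, c, d):
    """Trapezoidal MF with feet at a and d and plateau on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def gaussian(x, c, sigma):
    """Symmetric Gaussian MF centred at c (assumed standard form)."""
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

def bell(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def sigmoid(x, a, c):
    """Sigmoid MF with slope parameter a and crossover point c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))

print(trapezoid(20, 10, 30, 70, 90))  # 0.5
print(trapezoid(50, 10, 30, 70, 90))  # 1.0
print(sigmoid(5, 2, 5))               # 0.5
```

The trapezoid calls use the example parameters a = 10, b = 30, c = 70 and d = 90; the sigmoid
returns 0.5 at its crossover point x = c by construction.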

11. Write a short note on Multivalued Logic.


Ans- (HIRKANI KASHID)(21306A1070)
 Many-valued logic (also multi- or multiple-valued logic) refers to a propositional
calculus in which there are more than two truth values. Traditionally, in
Aristotle’s logical calculus, there were only two possible values (i.e., “true” and
“false”) for any proposition. Classical two-valued logic may be extended to n-
valued logic for n greater than 2. Those most popular in the literature are three-
valued (e.g., Łukasiewicz’s and Kleene’s, which accept the values "true", "false",
and “unknown”), four-valued, nine-valued, the finite-valued (finitely-many
valued) with more than three values, and the infinite-valued (infinitely-many-
valued), such as fuzzy logic and probability logic.
Known applications of many-valued logic can be roughly classified into two groups.[14]
The first group uses many-valued logic to solve binary problems more efficiently. For
example, a well-known approach to represent a multiple-output Boolean function is to
treat its output part as a single many-valued variable and convert it to a single-output
characteristic function (specifically, the indicator function). Other applications of many-
valued logic include design of programmable logic arrays (PLAs) with input decoders,
optimization of finite state machines, testing, and verification.

The second group targets the design of electronic circuits that employ more than two
discrete levels of signals, such as many-valued memories, arithmetic circuits, and field
programmable gate arrays (FPGAs). Many-valued circuits have a number of theoretical
advantages over standard binary circuits. For example, the interconnect on and off chip
can be reduced if signals in the circuit assume four or more levels rather than only two. In
memory design, storing two instead of one bit of information per memory cell doubles
the density of the memory in the same die size. Applications using arithmetic circuits
often benefit from using alternatives to binary number systems. For example, residue and
redundant number systems[15] can reduce or eliminate the ripple-through carries that are
involved in normal binary addition or subtraction, resulting in high-speed arithmetic
operations. These number systems have a natural implementation using many-valued
circuits. However, the practicality of these potential advantages heavily depends on the
availability of circuit realizations, which must be compatible or competitive with present-
day standard technologies. In addition to aiding in the design of electronic circuits, many-
valued logic is used extensively to test circuits for faults and defects. Basically all known
automatic test pattern generation (ATG) algorithms used for digital circuit testing require
a simulator that can resolve 5-valued logic (0, 1, x, D, D’).[16] The additional values x,
D, and D’ represent (1) unknown/uninitialized, (2) a 0 instead of a 1, and (3) a 1 instead
of a 0.
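
As a small illustration of a three-valued logic with the values "true", "false" and "unknown",
the sketch below encodes Kleene's strong three-valued connectives numerically. The encoding
(1.0, 0.5, 0.0) and the function names are assumptions made for this example.

```python
# Numeric encoding of the three truth values (an assumption of this sketch).
TRUE, UNKNOWN, FALSE = 1.0, 0.5, 0.0

def k_not(p):
    """Negation: true and false swap, unknown stays unknown."""
    return 1.0 - p

def k_and(p, q):
    """Conjunction: the smaller of the two truth values."""
    return min(p, q)

def k_or(p, q):
    """Disjunction: the larger of the two truth values."""
    return max(p, q)

print(k_and(TRUE, UNKNOWN))  # 0.5
print(k_or(TRUE, UNKNOWN))   # 1.0
print(k_not(UNKNOWN))        # 0.5
```

Restricted to TRUE and FALSE, the three functions behave exactly like classical two-valued
logic, which is what makes this an extension rather than a replacement.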

12. What is Fuzzy Logic? (21306A1051)


•Fuzzy logic is an approach to computing based on "degrees of truth" rather than the usual
"true or false" (1 or 0) Boolean logic on which the modern computer is based.
Applications of Fuzzy Logic -
•It is used in businesses for decision-making support systems.
•It is used in automotive systems for controlling traffic and speed, and for improving the
efficiency of automatic transmissions.
•Automotive systems also use the shift scheduling method for automatic transmissions.
•It is also used in microwave ovens for setting the power and cooking strategy.
•This technique is also used in the area of modern control systems such as expert systems.
•It is also used in vacuum cleaners, and in the timing of washing machines.
•It is also used in heaters, air conditioners, and humidifiers.
13. Explain with examples Linguistic variables and Hedges. ( 21306A1031 )
Answer:- Linguistic Variables : Variables in mathematics normally take numeric values,
although non-numeric linguistic variables are frequently employed in fuzzy logic to make
the expression of rules and facts easier. For instance, the term ‘Age’ can be used to
indicate a linguistic variable with a value such as a child, young, old, and so on.
◼ Linguistic variables are variables whose values are linguistic concepts (also known as
linguistic terms) rather than numbers, such as child, young, and so on. For example, AGE =
{Child, Young, Old}. Each AGE linguistic term has a membership function for a specific age range.
The same age value is mapped to a different membership value in the range of 0 to 1 by each
function. These membership values can then be used to identify whether a person is a child, a
young person, or an elderly person. The following is the membership function with regard to
each fuzzy term:
◼ For AGE = 11, we will get a membership value of 0.75 (roughly) in the Child set, 0.2
(approx) in the Young set, and 0 in the Old set, as shown in the diagram. So, if a person’s age
is 11, it’s safe to assume that he or she is a child, perhaps a little young but certainly not old.
Linguistic hedges
◼ Linguistic hedges can be used to modify linguistic variables, which is an important feature.
Linguistic hedges are primarily employed to aid in the more precise communication of the
degree of correctness and truth in a particular statement.
For example: If the statement "John is young" is associated with the value 0.6, then "very
young" is automatically deduced as having the value 0.6 * 0.6 = 0.36. On the other hand, "not
very young" gets the value (1 – 0.36), i.e., 0.64. In this example, the operator very(X) is
defined as X*X; however, in general, these operators may be uniformly but flexibly defined to fit
the application; this results in great power for the expression of both rules and fuzzy facts.
Linguistic modifiers such as very, more or less, fairly, and extremely are examples of
hedges. They can modify fuzzy predicates and fuzzy truth values. The hedge very is often
interpreted as the unary operator h(a) = a^2, for a ∈ [0,1].
Similarly, the hedge fairly is interpreted as the unary operator h(a) = √a, for a ∈ [0,1].
◼ A strong modifier strengthens a fuzzy predicate to which it is applied and consequently
reduces the truth value of the associated proposition. A weak modifier weakens the predicate
and consequently increases the truth value of the associated proposition.
Example: If Young(25) = 0.8, then very Young(25) = 0.8 × 0.8 = 0.64 and fairly Young(25)
= √0.8 ≈ 0.89
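
The hedge operators can be sketched as functions on a membership value. This is a minimal
illustration using the interpretations given above (very as squaring, fairly as square root);
the function names are chosen here for the example.

```python
import math

def very(mu):
    """Strong modifier: squares the membership value, so it decreases."""
    return mu ** 2

def fairly(mu):
    """Weak modifier: square root of the membership value, so it increases."""
    return math.sqrt(mu)

def not_(mu):
    """Fuzzy negation: 1 minus the membership value."""
    return 1.0 - mu

young_25 = 0.8  # Young(25) from the example
print(round(very(young_25), 2))    # 0.64
print(round(fairly(young_25), 2))  # 0.89
print(round(not_(very(0.6)), 2))   # 0.64
```

The last line reproduces the "not very young" value for John, 1 - 0.6 * 0.6 = 0.64.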

14. What are Fuzzy propositions? Uma Sharma - 21306A1012


Answer:
The main difference between a classical proposition and a fuzzy proposition is in the range of
their truth values. The truth value of a classical proposition is either true or false, but for a
fuzzy proposition the range is not confined to only two values; it can take many values.

For example, speed may be fast, very fast, medium, slow, and very slow. In fuzzy logic the truth
value of a fuzzy proposition also depends on an additional factor known as the degree of truth,
whose value varies between 0 and 1.

For example

• p: Speed is Slow
• T(p) = 0.8, if p is partly true
• T(p) = 1, if p is absolutely true
• T(p) = 0, if p is totally false
• So, we can say that a fuzzy proposition is a statement p which acquires a fuzzy truth value
T(p) in the range 0 to 1.
Different types of Fuzzy Propositions:
1. Unconditional and unqualified propositions
The canonical form of this type of fuzzy proposition is p: V is F

where V is a variable which takes a value v from a universal set U, and F is a fuzzy set on U that
represents a given imprecise predicate such as fast, low, tall, etc.

For example:

• p: Speed (V) is high (F)
• T(p) = 0.8, if p is partly true
• T(p) = 1, if p is absolutely true
• T(p) = 0, if p is totally false
• where T(p) = µF(v); the membership grade function indicates the degree to which v
belongs to F, and its value ranges from 0 to 1.

2. Unconditional and qualified propositions

• The canonical form of this type of fuzzy proposition is p: V is F is S
• where V and F have the same meaning as above and S is a fuzzy truth qualifier
• Example: (Speed is high) is very true

3. Conditional and unqualified propositions

• The canonical form of this type of fuzzy proposition is p: if X is A, then Y is B
• where X, Y are variables in universes U1 and U2, and A, B are fuzzy sets on U1, U2
• Example: p: if speed is High, then risk is Low
4. Conditional and qualified propositions
• The canonical form of this type of fuzzy proposition is
p: (if X is A, then Y is B) is S, where all variables have the same meaning as declared
previously
• Example: p: (if speed is high, then risk is low) is true
15. Write a short note on inference rules for fuzzy propositions. - Shivranjani
(21306A1033)
Answer – Inference rules are the templates for generating valid arguments. Inference rules are
applied to derive proofs in artificial intelligence, and a proof is a sequence of conclusions
that leads to the desired goal.
In inference rules, the implication plays an important role among all the connectives. Following
are some terminologies related to inference rules:

• Implication: It is one of the logical connectives, which can be represented as P → Q. It is
a Boolean expression.
• Converse: The converse of an implication swaps its two sides: the right-hand side proposition
goes to the left-hand side and vice versa. It can be written as Q → P.
• Contrapositive: Negating both sides of the converse gives the contrapositive, which can be
represented as ¬Q → ¬P.
• Inverse: Negating both sides of the implication gives the inverse, which can be represented
as ¬P → ¬Q.
Some of the above compound statements are equivalent to each other, which we can prove using a
truth table:
P | Q | P → Q | Q → P | ¬Q → ¬P | ¬P → ¬Q
T | T |   T   |   T   |    T    |    T
T | F |   F   |   T   |    F    |    T
F | T |   T   |   F   |    T    |    F
F | F |   T   |   T   |    T    |    T
Hence, from the above truth table, we can prove that P → Q is equivalent to ¬Q → ¬P, and Q → P
is equivalent to ¬P → ¬Q.
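
The two equivalences can be checked by enumerating all four truth assignments. This is a minimal
sketch; the helper implies is a name chosen for illustration and encodes P → Q as (not P) or Q.

```python
from itertools import product

def implies(p, q):
    """Material implication: P -> Q is false only when P is true and Q is false."""
    return (not p) or q

rows = list(product([True, False], repeat=2))

# P -> Q versus its contrapositive (not Q) -> (not P)
contrapositive_ok = all(implies(p, q) == implies(not q, not p) for p, q in rows)
# Converse Q -> P versus the inverse (not P) -> (not Q)
inverse_ok = all(implies(q, p) == implies(not p, not q) for p, q in rows)
# An implication is not, in general, equivalent to its converse
converse_ok = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_ok, inverse_ok, converse_ok)  # True True False
```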

16. What are the different applications of fuzzy systems? (21306A1042 – Sushma
Singh)
Ans:- Fuzzy System Applications
“If all motion vectors are almost parallel and their time differential is small, then the hand
jittering is detected and the direction of the hand movement is in the direction of the moving
vectors.” (Image stabilization via fuzzy logic)
1) Aerospace: Altitude control of spacecraft, satellite altitude control, flow and mixture
regulation in aircraft de-
control, shift scheduling method for automatic transmission, intelligent highway systems, traffic
control, improving efficiency of automatic transmissions.
2) Business: Decision-making support systems, personnel evaluation in a large company,
production, a coke oven gas cooling plant.
3) Defense: Underwater target recognition, automatic target recognition of thermal infrared
images, naval decision support aids, control of a hypervelocity interceptor, fuzzy set modelling
of NATO decision making.
4) Electronics: Control of automatic exposure in video cameras, humidity in a clean room, air
conditioning systems, washing machine timing, microwave ovens, vacuum cleaners.
5) Financial: Banknote transfer control, fund management, stock market predictions.
6) Industrial: Cement kiln controls, heat exchanger control, activated sludge wastewater
treatment process control, water purification plant control, quantitative pattern analysis
for industrial quality assurance, control of constraint satisfaction problems in structural
design.
7) Manufacturing: Optimization of cheese production.
8) Marine: Autopilot for ships, optimal route selection, control of autonomous underwater
vehicles, ship steering.
9) Medical: Medical diagnostic support system, control of arterial pressure during anaesthesia,
multivariable control of anaesthesia, modelling of neuropathological findings in Alzheimer's
patients, radiology diagnoses, fuzzy inference diagnosis of diabetes and prostate cancer.
10) Mining and Metal Processing: Sinter plant control, decision making in metal forming.
11) Robotics: Fuzzy control for flexible-link manipulators, robot arm control.
12) Securities: Decision systems for securities trading.
13) Signal Processing and Telecommunications: Adaptive filter for nonlinear channel
equalization, control of broadband noise.
14) Transportation: Automatic underground train operation, train schedule control, railway
acceleration, braking, and stopping.
17. Write a short note on possibility theory and other enhancement to Logic.
(21306A1074- Vaishnavi Pangam)
Ans- possibility theory –
1. Probability theory is incorporated into machine learning, particularly the subset of
artificial intelligence concerned with predicting outcomes and making decisions. In
computer science, softmax functions are used to limit a function's outcome to a value
between 0 and 1.
2. These functions, also known as squashing functions, are useful in an algorithm's
process of assigning outcomes a probability value. The values assigned by these
functions assist the neural network in making better decisions, and applying them is often
the final step in a neural network.
3. Applications of Probability Theory:- Probabilities are a cornerstone of
mathematical understanding and are extremely common in representing outcomes both in
abstract situations, and in real-world scenarios. Below are two examples, one that uses
probability theory in an abstract sense as a way of understanding phenomena, and the
other that is a real-world application of probability theory.
4. The foundation of probability theory is the idea that every possible outcome from
a sample space is assigned a numerical value between 0 and 1. These numbers represent
the likelihood that the event will occur. The sum total of probabilities of a set of events is
known as a probability distribution.
5. The set of all possible outcomes is known as the sample space. Outcomes are often referred
to as the results of an event. Probability theory in general attempts to apply mathematical
abstractions to uncertain, also known as non-deterministic, processes. The tools that are
common in probability theory are discrete and continuous random variables, probability
distributions, and stochastic processes.
• Other enhancement to logic-
6. Computer scientists in general are familiar with the idea that logic provides
techniques for analyzing the inferential properties of languages, and with the distinction
between a high-level logical analysis of a reasoning problem and its implementations.
7. The involvement of logic in AI applications varies from relatively weak uses, in
which the logic informs the implementation process with analytic insights, to strong uses,
in which the implementation algorithm can be shown to be sound and complete.
8. Logical theories in AI are independent from implementations. They can be used
to provide insights into the reasoning problem without directly informing the
implementation.
9. The earliest expert systems, such as MYCIN (a program that reasons about
bacterial infections, see Buchanan & Shortliffe 1984), were based entirely on large
systems of procedural rules.

18. In a class, there are 70% of the students who like English and 40% of the students
who likes English and mathematics, and then what is the percent of students those
who like English also like mathematics? (Gorima(21306A1054))
Ans. Let A be the event that a student likes mathematics and
B be the event that a student likes English.
P(A|B) = P(A^B)/P(B) = 0.4/0.7 ≈ 0.57 = 57%
Hence, 57% of the students who like English also like Mathematics.

19. Two dice are thrown simultaneously, and the sum of the numbers obtained is found
to be 7. What is the probability that the number 3 has appeared at least once?
(21306A1023 – Komal Gupta)
Ans:- The sample space S consists of all the combinations possible with two dice.
Therefore S consists of 6 × 6, i.e. 36 outcomes.
Event A indicates the combinations in which 3 has appeared at least once.
Event B indicates the combinations of numbers which sum up to 7.
A = {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (1, 3), (2, 3), (4, 3), (5, 3), (6, 3)}
B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
P(A) = 11/36
P(B) = 6/36
A ∩ B = {(3, 4), (4, 3)}, so A ∩ B contains 2 outcomes
P(A ∩ B) = 2/36
Applying the conditional probability formula we get,
P(A|B) = P(A∩B)/P(B) = (2/36)/(6/36) = 1/3
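
The counts used in this answer can be confirmed by enumerating the 36 outcomes. The following is
a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative.

```python
from itertools import product
from fractions import Fraction

# All 36 ordered outcomes of throwing two dice.
sample_space = list(product(range(1, 7), repeat=2))

A = {o for o in sample_space if 3 in o}        # 3 appears at least once
B = {o for o in sample_space if sum(o) == 7}   # the sum is 7

p_b = Fraction(len(B), len(sample_space))
p_a_and_b = Fraction(len(A & B), len(sample_space))
p_a_given_b = p_a_and_b / p_b

print(len(A), len(B), p_a_given_b)  # 11 6 1/3
```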

20. In a batch, there are 80% C programmers, and 40% are Java and C programmers.
What is the probability that a C programmer is also Java programmer?
(21306A1011 – Shraddha Kasar)
Ans:- Let A --> Event that a student is Java programmer
B --> Event that a student is C programmer
P(A|B) = P(A ∩ B) / P(B)
= (0.4) / (0.8)
= 0.5
So there is a 50% chance that a student who knows C also knows Java.
21. Write the difference between conditional probability and Bayes Theorem (Sejal
Shingre -21306A1040)
Conditional Probability | Bayes Theorem
Conditional probability is the probability of occurrence of a certain event, say A, based on whether some other event B is true or not. | Bayes theorem includes two conditional probabilities for the events, say A and B.
The equation of conditional probability is: P(A|B) = P(A∩B)/P(B) | The equation of Bayes theorem is: P(A|B) = P(B|A) × P(A)/P(B)
It is used to compute the conditional probability and the events A and B are relatively simple. | It is used in Bayesian inference and in models where we are interested in the distribution up to a normalizing factor P(B).
It is used for relatively simple problems. | It gives a structured formula for solving more complex problems.

22. Suppose we are given the probability that Mike has a cold as 0.25, the probability
that Mike was observed sneezing when he had a cold in the past as 0.9, and the
probability that Mike was observed sneezing when he did not have a cold as 0.20. Find
the probability of Mike having a cold given that he sneezes. Madhushree Parab
(21306A1026)
SOLUTION:
Let H be the event that Mike has a cold and E the event that Mike sneezes.
P(H) = 0.25, so P(~H) = 1 - P(H) = 1 - 0.25 = 0.75

P(E|H) = 0.9

P(E|~H) = 0.20

P(H|E) = ?

P(H|E) = [P(E|H) * P(H)] / [P(E|H) * P(H) + P(E|~H) * P(~H)]

= (0.9 * 0.25) / (0.9 * 0.25 + 0.20 * 0.75)

= 0.225 / (0.225 + 0.15)

= 0.225 / 0.375

= 0.6

= 60%
Hence, we can conclude that the probability of Mike having a cold given that he sneezes is 0.6.

Similarly, we can determine his probability of having a cold if he was not sneezing in the following
manner.

P(H|~E) = [P(~E|H) * P(H)] / P(~E)

= [(1 - 0.9) * 0.25] / (1 - 0.375)

= 0.025 / 0.625

= 0.04

Hence, Mike's probability of having a cold if he was not sneezing is equal to 0.04.
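
The calculation above can be sketched in Python as follows. The function name posterior and its parameter names are illustrative assumptions; the evidence term is expanded with the law of total probability, as in the solution.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(H|E), P(E)) for a hypothesis H and evidence E.

    P(E) is expanded as P(E|H) * P(H) + P(E|~H) * P(~H).
    """
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence, evidence

# Mike sneezes: P(H) = 0.25, P(E|H) = 0.9, P(E|~H) = 0.20
p_cold_sneezing, p_sneeze = posterior(0.25, 0.9, 0.20)

# Mike does not sneeze: use P(~E|H) = 0.1 and P(~E|~H) = 0.8
p_cold_not_sneezing, p_no_sneeze = posterior(0.25, 1 - 0.9, 1 - 0.20)

print(round(p_cold_sneezing, 2), round(p_cold_not_sneezing, 2))
```

The second call reuses the same formula with the complementary likelihoods, which is why P(~E) comes out as 1 - 0.375 = 0.625.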

23. Dangerous fires are rare (1%), but smoke is fairly common (10%) due to barbecues,
and 90% of dangerous fires make smoke. Find the probability of a dangerous fire
when there is smoke. (21306A1027 aaranta waykar )
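
No worked answer is recorded for this question. One way to approach it is Bayes theorem in its direct form, P(Fire|Smoke) = P(Smoke|Fire) × P(Fire) / P(Smoke). The Python sketch below uses exact fractions; the variable names are illustrative assumptions.

```python
from fractions import Fraction

p_fire = Fraction(1, 100)              # P(Fire): dangerous fires are rare (1%)
p_smoke = Fraction(10, 100)            # P(Smoke): smoke is fairly common (10%)
p_smoke_given_fire = Fraction(90, 100) # P(Smoke|Fire): 90% of fires make smoke

# Bayes theorem: P(Fire|Smoke) = P(Smoke|Fire) * P(Fire) / P(Smoke)
p_fire_given_smoke = p_smoke_given_fire * p_fire / p_smoke
print(p_fire_given_smoke)  # 9/100, i.e. 9%
```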

24. What are the advantages of Fuzzy Logic? 21306a1061 Mitali Jadhav
Ans. The methodology of fuzzy logic works similarly to human reasoning. Any user can
easily understand the structure of Fuzzy Logic. It does not need a large memory, because the
algorithms can be described with little data. It is widely used in many fields and
provides effective solutions to problems of high complexity. The concept is
based on the set theory of mathematics, which keeps it simple. It allows users to control
machines and consumer products. The development time of fuzzy logic is short
compared to conventional methods. Due to its flexibility, any user can easily add and delete rules
in the FLS system.

25. What are the disadvantages of Fuzzy Logic? (Gorima(21306A1054))
Let's suppose A is a set which contains the following elements:
A = {(X1, 0.3), (X2, 0.7), (X3, 0.5), (X4, 0.1)}
And B is a set which contains the following elements:
B = {(X1, 0.8), (X2, 0.2), (X3, 0.4), (X4, 0.9)}
Then write A∩B, A∪B, Ā(x) and the complement of B(x).
Ans. Disadvantages of Fuzzy Logic
 The run time of fuzzy logic systems is slow and takes a long time to produce outputs.
 Fuzzy logic systems are easy to understand only when they are simple.
 The possibilities produced by the fuzzy logic system are not always accurate.
 Many researchers give various ways for solving a given statement using this technique,
which leads to ambiguity.
 Fuzzy logic is not suitable for problems that require high accuracy.
 Fuzzy logic systems need a lot of testing for verification and validation.
Set operations (intersection takes the minimum membership, union the maximum, and the complement is 1 minus the membership):
 A∩B : {(X1, 0.3), (X2, 0.2), (X3, 0.4), (X4, 0.1)}
 A∪B : {(X1, 0.8), (X2, 0.7), (X3, 0.5), (X4, 0.9)}
 Ā(x) : {(X1, 0.7), (X2, 0.3), (X3, 0.5), (X4, 0.9)}
 B(x) complement : {(X1, 0.2), (X2, 0.8), (X3, 0.6), (X4, 0.1)}
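
The set operations can be sketched in Python with the standard definitions: minimum for intersection, maximum for union, and 1 minus the membership for complement. The dictionary representation and function names are illustrative assumptions.

```python
# Fuzzy sets as dictionaries mapping element -> membership value.
A = {"X1": 0.3, "X2": 0.7, "X3": 0.5, "X4": 0.1}
B = {"X1": 0.8, "X2": 0.2, "X3": 0.4, "X4": 0.9}

def fuzzy_intersection(a, b):
    """Membership of x in A ∩ B is min(mu_A(x), mu_B(x))."""
    return {x: min(a[x], b[x]) for x in a}

def fuzzy_union(a, b):
    """Membership of x in A ∪ B is max(mu_A(x), mu_B(x))."""
    return {x: max(a[x], b[x]) for x in a}

def fuzzy_complement(a):
    """Membership of x in the complement is 1 - mu_A(x)."""
    # round() hides floating-point noise such as 0.30000000000000004
    return {x: round(1 - m, 10) for x, m in a.items()}

print(fuzzy_intersection(A, B))
print(fuzzy_union(A, B))
print(fuzzy_complement(A))
print(fuzzy_complement(B))
```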

26. Let's suppose A is a set which contains the following elements:
A = {(X1, 0.3), (X2, 0.7), (X3, 0.5), (X4, 0.1)}
And B is a set which contains the following elements:
B = {(X1, 0.8), (X2, 0.2), (X3, 0.4), (X4, 0.9)}
Then write A.B, A + B, A°B (Muskan 21306A1025)
Ans. A = {(X1, 0.3), (X2, 0.7), (X3, 0.5), (X4, 0.1)}
B = {(X1, 0.8), (X2, 0.2), (X3, 0.4), (X4, 0.9)}
A.B = {(X1, 0.24), (X2, 0.14), (X3, 0.20), (X4, 0.09)} (product of the memberships)
A + B = {(X1, 1), (X2, 0.9), (X3, 0.9), (X4, 1)} (sum of the memberships, limited to 1)
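
A minimal Python sketch of the two operations answered above. It assumes A.B is the algebraic product of the memberships and reads A + B as the bounded sum min(1, μA(x) + μB(x)); that reading is an assumption chosen because it reproduces the listed values. The function names are illustrative.

```python
A = {"X1": 0.3, "X2": 0.7, "X3": 0.5, "X4": 0.1}
B = {"X1": 0.8, "X2": 0.2, "X3": 0.4, "X4": 0.9}

def algebraic_product(a, b):
    """A.B: multiply the membership values element by element."""
    return {x: round(a[x] * b[x], 10) for x in a}

def bounded_sum(a, b):
    """A + B (assumed bounded sum): add memberships, capped at 1."""
    return {x: round(min(1.0, a[x] + b[x]), 10) for x in a}

print(algebraic_product(A, B))
print(bounded_sum(A, B))
```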

27. Y = {(5, 1), (10, 0.5), (20, 0.8), (30, 0.4)} Apply CON and DIL operators on Y.

(Hiral Patel - 21306A1072)

Solution:
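
No worked solution is recorded here. The following Python sketch assumes the usual definitions, where CON (concentration) squares each membership value and DIL (dilation) takes its square root, and it assumes the third pair of Y has membership 0.8. The function names are illustrative.

```python
import math

# Fuzzy set Y as element -> membership (third membership assumed to be 0.8).
Y = {5: 1.0, 10: 0.5, 20: 0.8, 30: 0.4}

def concentration(s):
    """CON: square every membership value."""
    return {x: round(m ** 2, 10) for x, m in s.items()}

def dilation(s):
    """DIL: take the square root of every membership value (4 decimals)."""
    return {x: round(math.sqrt(m), 4) for x, m in s.items()}

print(concentration(Y))
print(dilation(Y))
```

Under these assumptions CON(Y) lowers every membership below 1 and DIL(Y) raises it, while a membership of 1 is unchanged by both.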
28. Y = {(5, 1), (10, 0.5), (20, 0.8), (30, 0.4)} Write height, cardinality and norms for Y.

(Hiral Patel – 21306A1072)


Solution:
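
No worked solution is recorded here. The sketch below assumes that height is the largest membership value, that cardinality is the scalar cardinality (the sum of the memberships), and that "norms" refers to normalization, i.e. dividing each membership by the height. The last reading in particular is an assumption, as is the membership 0.8 for the third pair.

```python
# Fuzzy set Y as element -> membership (third membership assumed to be 0.8).
Y = {5: 1.0, 10: 0.5, 20: 0.8, 30: 0.4}

def height(s):
    """Largest membership value in the set."""
    return max(s.values())

def cardinality(s):
    """Scalar cardinality: sum of all membership values."""
    return round(sum(s.values()), 10)

def normalize(s):
    """Divide every membership by the height so the largest becomes 1."""
    h = height(s)
    return {x: m / h for x, m in s.items()}

print(height(Y), cardinality(Y), normalize(Y))
```

Since the height of Y is already 1, Y is a normal fuzzy set and normalization leaves it unchanged.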
AAI Assignment 3
 What is learning? What are the types of learning? (Mandar more)

Ans:- Learning is the process of converting experience into expertise or knowledge.

Types of learning:-

1. Supervised learning

The model used in supervised learning describes the effect of one set of observations (called inputs) on another set of
observations (called outputs). Here both the sets are given and the purpose is to find a function f that transforms a given input x into
the given output y. In this type of learning, inputs are assumed to be provided at the beginning, while outputs are obtained at the end
of the causal chain. In other words, in supervised learning the user tries to find the connection between two given sets of
observations, namely inputs and outputs.

2. Unsupervised Learning

In unsupervised learning, all the input observations are given and no output observations are available. Unsupervised learning
enables users to learn larger and more complex models as compared to supervised learning. Supervised learning cannot be used
to learn models with deep hierarchies, as the difficulty of the learning task increases exponentially between the two sets.
However, in unsupervised learning, the learning can proceed hierarchically from the observations into more abstract levels of
representation.

3. Reinforcement Learning

In reinforcement learning, the decision-making system (also known as the agent) receives rewards or feedback (positive or negative)
for its actions at the end of a sequence of steps. It is required to assign reward to steps while solving the credit assignment
problem; this problem determines which steps should receive credit or blame for the final result. As opposed to supervised
learning, reinforcement learning takes place in an environment where the agent cannot directly compare the results of its action to
a desired result. Instead, it is given positive or negative feedback directly on the basis of its actions. The feedback
may tell a system that it has won or lost a game, or that it has made a good move or a poor one. Therefore, the primary
task of reinforcement learning is to obtain a successful function using these rewards.
 What is Machine learning? What is a need for it? ( Raj Mishra )

Ans. Machine learning can be defined as adaptive changes in a system that enable the system to do the same task (or tasks drawn
from the same population) with greater efficiency whenever the task has to be repeated.

There is an increased need for machine learning, as it helps in understanding and improving the efficiency of human learning. Rapid
advancement in computer technology has enabled users across the world to store and process large amounts of distributed data.
This stored data proves to be of greater use if it can be analysed and transformed into useful information that can further be used for
drawing inferences, making future predictions, helping in intelligent decision making, and other such applications. Because of
this, machine learning has become an area in AI. One of the main goals of AI is to enable the development of computers that can
be taught rather than programmed. This will help in discovering structures that are still unknown to humans. It is not possible to
derive complicated AI systems by hand. A process of dynamic updating should be in place for continuous incorporation of new
information. If a system is capable of learning new characteristics automatically, then there is tremendous potential for expanding its
domain, thereby reducing its brittleness. Simulation requires features such as knowledge acquisition, inference, updating or
refinement of the knowledge base, acquisition of heuristics, application of faster searches, etc. Thus, we can sum
up by saying that learning is an important aspect of intelligence.

The two types of learning methodologies that are generally followed are inductive methodology and deductive methodology.


In inductive machine learning methodology, required rules and patterns are extracted from massive
data sets. Hence, the major focus of machine learning research in this case is to extract information
from data automatically by computational and statistical methods. On the other hand, deductive
machine learning methodology involves deducing new knowledge from already existing knowledge. The
utility of machine learning in day-to-day life can be gauged from the wide range of applications such as
medical diagnosis, detection of monetary frauds (e.g., involving credit cards), classification of DNA
sequences, bioinformatics, brain-machine interfaces, stock market analysis, natural language processing,
syntactic pattern recognition, object recognition, game playing, and so on. Computational analysis of
machine learning algorithms and their performance forms a branch of theoretical computer science
known as computational learning theory.
 Explain Components of Learning System. (Bhupendra Yadav)

Ans. The components of learning system can be listed as follows

i. Learning Components

ii. Performance Elements

iii. Critic

iv. Problem Generator

v. Sensor

vi. Effectors

i. Learning Components

The basic purpose of learning Components is to make changes or improvements to the system depending on its performance.

ii. Performance Elements

In a learning system, the performance element performs the task of choosing the actions that need to be taken.

iii. Critic

The job of the critic is to inform the learning components regarding its performance with respect to fixed standard. Note that
critic could be either a human or an automated component.

iv. Problem Generator

The problem generator is imperative to a learning system since it suggests problems or actions that would lead to the generation of new
examples or experiences, which will aid in further training of the system.

v. Sensors and effectors

Both these components are external to the system. The system receives information or data from sensor, while the output is
transmitted through effectors.
 What are the Basic Learning Methods (mandar sawant )

Ans-
Following are the basic learning methods

 Rote learning
 Learning by taking advice
 Learning by parameter adjustment
 Learning by Macro-Operators
 Learning by Analogy

1. Rote learning

Rote learning basically refers to the process of memorization. Hence, it requires saving knowledge so
that it can be utilized again whenever needed. Rote learning involves one-to-one mapping from inputs
to stored representation and is also known as learning by memorization; it uses association-based
storage and retrieval. Moreover, there is no repeated computation required; only an inference or a
query is necessary. Although memorization is a key requirement in learning and development of an
intelligent program, it can be a complicated subject. In spite of being a basic and simple process, rote
learning highlights some relevant issues pertaining to more complex learning concepts as described
below.

Organization: Knowledge should be stored in such a manner that accessing this stored knowledge is
faster than resorting to re-computation. Organization of knowledge is achieved by employing
techniques such as hashing, indexing, sorting, and so on.

Generalization: Since the number of stored objects can be quite large, we need to generalize some
information to make the problem manageable.

Stability of environment: The method of rote learning does not work very effectively in a rapidly
changing environment. In case there is a change in environment, the change has to be detected and
recorded exactly.
Rote learning should not become a cause of decrease in the efficiency of a system; therefore, we must
be able to decide whether it is worth storing a particular value in the first place.
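
The idea of storing computed values and retrieving them instead of recomputing can be sketched as a simple lookup table. This is only an illustration of rote learning as memorization; make_rote_learner and its arguments are hypothetical names, and a plain dictionary stands in for the hashing or indexing mentioned above.

```python
def make_rote_learner(compute):
    """Wrap an expensive function so that every result is memorized.

    Returns the lookup function and the store, a one-to-one mapping
    from inputs to stored results.
    """
    store = {}

    def lookup(x):
        if x not in store:          # not seen before: compute once and save
            store[x] = compute(x)
        return store[x]             # seen before: retrieve, no recomputation

    return lookup, store
```

A call such as lookup, store = make_rote_learner(some_function) gives a function that computes each distinct input once; later calls with the same input only read the store.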

2. Learning by Taking Advice

Although the idea of learning in AI by taking advice was proposed by John McCarthy in the 1950s, very few
attempts were made to create such systems till the late 1970s (McCarthy, 1959). Expert systems are
examples of this concept. The following are two basic approaches to advice taking:

 Taking high-level and abstract advice and then converting it into rules.
 Developing sophisticated modules
The first approach involves taking abstract advice and converting it into rules that can be used to guide
performance elements of the system. All aspects of advice taking are automated. The following steps
are required in this method:

 Request: This can be a simple question enquiring about either general or
complicated advice by identifying shortcomings in the knowledge base and
asking for a remedy.
 Interpret: In this step, the advice is translated into an internal
representation.
 Operationalize: It is quite possible that translated advice may still not be
usable; so, this stage aims to provide a representation that can be used by the
performance element.
 Integrate: When this knowledge is added to the knowledge base, care needs
to be taken to ensure that negative side-effects, such as redundancy and
contradictions, are avoided.
 Evaluate: The system must evaluate or assess the new knowledge for
errors, contradictions, etc.
The second approach involves developing sophisticated modules such as knowledge-base editors and
debugging tools. These modules enable an expert to translate his expertise into detailed rules. Here, the
expert is an integral part of the learning system. Therefore, such modules are important in the expert
systems area of AI.

3. Learning by Parameter Adjustment


In many programs, static evaluation functions can be used to reduce the search space and make the program
more intelligent and efficient. A slight modification in formulating the evaluation function may give rise to
learning to some extent. This type of learning is called learning by parameter adjustment. In such cases,
an evaluation function is represented as a polynomial of the form ΣWi * Ti, where Ti are the terms
representing values of features, while Wi are the weights of the terms.
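
A minimal sketch of an evaluation function of the form ΣWi * Ti, together with one possible way of adjusting the weights. The update rule shown (moving each weight in proportion to the error and to its feature value) is an assumption added for illustration; the text above does not specify how the weights are adjusted, and the function names are hypothetical.

```python
def evaluate(weights, features):
    """Static evaluation: sum of Wi * Ti over all features."""
    return sum(w * t for w, t in zip(weights, features))

def adjust_weights(weights, features, target, rate=0.1):
    """Nudge each weight so the evaluation moves towards the target value.

    Assumed rule: Wi <- Wi + rate * (target - evaluation) * Ti
    """
    error = target - evaluate(weights, features)
    return [w + rate * error * t for w, t in zip(weights, features)]
```

Repeating adjust_weights over many positions with known outcomes gradually shifts weight towards the features that predict good results.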

4. Learning by Macro-Operators
Although the basic idea in learning by macro-operators is similar to rote learning, here we avoid
expensive re-computation by using macro-operators that are learnt for subsequent use. These operators
consist of a series of stereotyped actions. The STRIPS problem solver employed macro-operators in its
learning phase.

5. Learning by Analogy
Learning by analogy is based on the understanding that if a system can recognize similarities in
information that is already stored in it then it may be able to transfer some knowledge from this
previous information to improve the solution of the task in hand. Analogy involves a complicated
mapping between two dissimilar concepts or correspondence between two different representations. It
is easy for human beings to quickly recognize the abstractions involved and understand the meaning.
There are two types of analogical problem-solving methods studied in AI:

 Transformational analogy - This analogy involves looking for a similar
solution and copying it to the new situation, making suitable substitutions
wherever appropriate.
 Derivational analogy - Transformational analogy does not look at how a
given problem is solved but looks only at the final solution. Derivational
analogy also uses the history of the problem solution and the steps involved,
which are often relevant.

 Write a short note on Supervised & Unsupervised Learnings (Tejas Sawant )

Ans. Supervised learning, also known as supervised machine learning, is a subcategory of
machine learning and artificial intelligence.
•It is defined by its use of labeled datasets to train algorithms to classify data or predict
outcomes accurately. As input data is fed into the model, it adjusts its weights until the model
has been fitted appropriately, which occurs as part of the cross validation process.
•Supervised learning helps organizations solve for a variety of real-world problems at scale, such
as classifying spam in a separate folder from your inbox.
•Supervised learning uses a training set to teach models to yield the desired output. This training
dataset includes inputs and correct outputs, which allow the model to learn over time.
•The algorithm measures its accuracy through the loss function, adjusting until the error has been
sufficiently minimized.
•Supervised learning can be separated into two types of problems when data mining—
classification and regression
Example -
•Supervised learning models can be used to build and advance a number of business applications,
including the following:
•Image- and object-recognition: Supervised learning algorithms can be used to locate, isolate,
and categorize objects out of videos or images, making them useful when applied to various
computer vision techniques and imagery analysis.
•Predictive analytics: A widespread use case for supervised learning models is in creating
predictive analytics systems to provide deep insights into various business data points. This
allows enterprises to anticipate certain results based on a given output variable, helping
business leaders justify decisions or pivot for the benefit of the organization.
•Customer sentiment analysis: Using supervised machine learning algorithms, organizations
can extract and classify important pieces of information from large volumes of data—
including context, emotion, and intent—with very little human intervention. This can be
incredibly useful when gaining a better understanding of customer interactions and can be
used to improve brand engagement efforts.
•Spam detection: Spam detection is another example of a supervised learning model. Using
supervised classification algorithms, organizations can train databases to recognize patterns
or anomalies in new data to organize spam and non-spam-related correspondences
effectively.

Unsupervised Learning -
•Unsupervised learning, also known as unsupervised machine learning, uses machine learning
algorithms to analyze and cluster unlabeled datasets. These algorithms discover hidden
patterns or data groupings without the need for human intervention.
•Its ability to discover similarities and differences in information make it the ideal solution for
exploratory data analysis, cross-selling strategies, customer segmentation, and image
recognition.
•Unsupervised learning refers to the use of artificial intelligence (AI) algorithms to identify
patterns in data sets containing data points that are neither classified nor labeled.
•The algorithms are thus allowed to classify, label and/or group the data points contained within
the data sets without having any external guidance in performing that task.
•Unsupervised learning allows the system to identify patterns within data sets on its own.
•In unsupervised learning, an AI system will group unsorted information according to similarities
and differences even though there are no categories provided.
•Unsupervised learning algorithms can perform more complex processing tasks than supervised
learning systems.
•Unsupervised learning can be more unpredictable than a supervised learning model. While an
unsupervised learning AI system might, for example, figure out on its own how to sort cats
from dogs, it might also add unforeseen and undesired categories to deal with unusual breeds,
creating clutter instead of order.
•Unsupervised learning models are utilized for three main tasks—clustering, association, and
dimensionality reduction. Below we’ll define each learning method and highlight common
algorithms and approaches to conduct them effectively.
•Clustering is a data mining technique which groups unlabeled data based on their similarities or
differences. Clustering algorithms are used to process raw, unclassified data objects into
groups represented by structures or patterns in the information. Clustering algorithms can be
categorized into a few types, specifically exclusive, overlapping, hierarchical, and
probabilistic.
•Exclusive and Overlapping Clustering - Exclusive clustering is a form of grouping that
stipulates a data point can exist only in one cluster. This can also be referred to as “hard”
clustering. The K-means clustering algorithm is an example of exclusive clustering.
•K-means clustering is a common example of an exclusive clustering method where data points
are assigned into K groups, where K represents the number of clusters based on the distance
from each group’s centroid. The data points closest to a given centroid will be clustered
under the same category. A larger K value will be indicative of smaller groupings with more
granularity whereas a smaller K value will have larger groupings and less granularity. K-
means clustering is commonly used in market segmentation, document clustering, image
segmentation, and image compression.
•Overlapping clusters differs from exclusive clustering in that it allows data points to belong to
multiple clusters with separate degrees of membership. “Soft” or fuzzy k-means clustering is
an example of overlapping clustering.
•Euclidean distance is the most common metric used to calculate these distances; however, other
metrics, such as Manhattan distance, are also cited in clustering literature.
•Divisive clustering can be defined as the opposite of agglomerative clustering; instead it takes a
“top-down” approach. In this case, a single data cluster is divided based on the differences
between data points. Divisive clustering is not commonly used, but it is still worth noting in
the context of hierarchical clustering. These clustering processes are usually visualized using
a dendrogram, a tree-like diagram that documents the merging or splitting of data points at
each iteration.
[Figure: diagram of a dendrogram. Reading the chart "bottom-up" demonstrates agglomerative clustering,
while "top-down" is indicative of divisive clustering.]
•Probabilistic clustering - A probabilistic model is an unsupervised technique that helps us solve
density estimation or “soft” clustering problems. In probabilistic clustering, data points are
clustered based on the likelihood that they belong to a particular distribution. The Gaussian
Mixture Model (GMM) is one of the most commonly used probabilistic clustering
methods.
•Gaussian Mixture Models are classified as mixture models, which means that they are made
up of an unspecified number of probability distribution functions. GMMs are primarily
leveraged to determine which Gaussian, or normal, probability distribution a given data point
belongs to. If the mean or variance are known, then we can determine which distribution a
given data point belongs to. However, in GMMs, these variables are not known, so we
assume that a latent, or hidden, variable exists to cluster data points appropriately. While it is
not required to use the Expectation-Maximization (EM) algorithm, it is commonly used to
estimate the assignment probabilities for a given data point to a particular data cluster.

•Autoencoders leverage neural networks to compress data and then recreate a new representation
of the original data's input. The hidden layer
specifically acts as a bottleneck to compress the input layer prior to reconstructing within the
output layer. The stage from the input layer to the hidden layer is referred to as “encoding”
while the stage from the hidden layer to the output layer is known as “decoding.”

 What is Reinforcement Learning? (Prajesh Fondekar)

ANS:- Reinforcement learning (RL) is an area of machine learning concerned with how intelligent
agents ought to take actions in an environment in order to maximize the notion of cumulative
reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised
learning and unsupervised learning.

Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to
be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead, the focus is on
finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge).
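
One common way to balance exploration and exploitation is an epsilon-greedy rule: with a small probability the agent tries a random action, otherwise it takes the action with the best current estimate. The sketch below is an illustration only; the epsilon-greedy rule, the function name choose_action and its parameters are assumptions not taken from the text above.

```python
import random

def choose_action(estimates, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    With probability epsilon: explore (uniformly random action).
    Otherwise: exploit (action with the highest current estimate).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))                       # exploration
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploitation
```

Setting epsilon to 0 gives a purely greedy agent, while epsilon equal to 1 gives purely random behaviour; values in between trade one off against the other.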

Associative reinforcement learning: -


Associative reinforcement learning tasks combine facets of stochastic learning automata tasks and
supervised learning pattern classification tasks. In associative reinforcement learning tasks, the learning
system interacts in a closed loop with its environment.
Deep reinforcement learning: -
This approach extends reinforcement learning by using a deep neural network and without explicitly
designing the state space. The work on learning ATARI games by Google DeepMind increased attention to
deep reinforcement learning or end-to-end reinforcement learning.

Adversarial deep reinforcement learning: -


Adversarial deep reinforcement learning is an active area of research in reinforcement learning focusing on
vulnerabilities of learned policies. In this research area some studies initially showed that reinforcement
learning policies are susceptible to imperceptible adversarial manipulations. While some methods have been
proposed to overcome these susceptibilities, in the most recent studies it has been shown that these
proposed solutions are far from providing an accurate representation of current vulnerabilities of deep
reinforcement learning policies.

Fuzzy reinforcement learning: -


By introducing fuzzy inference in RL, approximating the state-action value function with fuzzy rules in
continuous space becomes possible. The IF - THEN form of fuzzy rules make this approach suitable for
expressing the results in a form close to natural language. Extending FRL with Fuzzy Rule Interpolation allows
the use of reduced size sparse fuzzy rule-bases to emphasize cardinal rules (most important state-action
values).

Inverse reinforcement learning: -


In inverse reinforcement learning (IRL), no reward function is given. Instead, the reward function is inferred
from the observed behaviour of an expert. The idea is to mimic the observed behaviour, which is often
optimal or close to optimal.

Safe reinforcement learning: -


Safe reinforcement learning (SRL) can be defined as the process of learning policies that maximize the
expectation of the return in problems in which it is important to ensure reasonable system performance
and/or respect safety constraints during the learning and/or deployment processes.

 What is Inductive Learning and Inductive Bias? (Anmay Gaonkar)


Ans:-
1. Inductive learning:
-The Inductive Learning Algorithm (ILA) is an iterative, inductive machine learning algorithm used
to generate a set of classification rules of the form “IF-THEN” from a set of
examples. It produces rules at each iteration and appends them to the set of rules.
-For a very large amount of data, domain experts are not very useful or reliable, so we move towards
the machine learning approach for this work. One method is to replicate the experts'
logic in the form of algorithms, but this work is tedious, time-consuming and expensive.
-So we move towards inductive algorithms, which themselves generate the strategy for performing a task and
need not be instructed separately at each step.

-ILA ALGORITHM: General requirements at start of the algorithm:-

list the examples in the form of a table ‘T’ where each row corresponds to an example and each column
contains an attribute value.
create a set of m training examples, each example composed of k attributes and a class attribute with n
possible decisions.
create a rule set, R, having the initial value false.
initially all rows in the table are unmarked.
Steps in the algorithm:-
Step 1: Divide the table ‘T’ containing m examples into n sub-tables (t1, t2, ..., tn), one table for each
possible value of the class attribute. (Repeat steps 2-8 for each sub-table.)
Step 2: Initialize the attribute combination count ‘j’ = 1.
Step 3: For the sub-table under consideration, divide the attribute list into distinct combinations, each
combination with ‘j’ distinct attributes.
Step 4: For each combination of attributes, count the number of occurrences of attribute values that appear
under the same combination of attributes in unmarked rows of the sub-table under consideration and, at the
same time, do not appear under the same combination of attributes in the other sub-tables. Call the first
combination with the maximum number of occurrences the max-combination ‘MAX’.
Step 5: If ‘MAX’ == null, increase ‘j’ by 1 and go to Step 3.
Step 6: Mark all rows of the sub-table under consideration in which the values of ‘MAX’ appear as classified.
Step 7: Add a rule (IF attribute = “XYZ” –> THEN decision is YES/NO) to R whose left-hand side has the
attribute names of ‘MAX’ with their values separated by AND, and whose right-hand side contains the
decision attribute value associated with the sub-table.
Step 8: If all rows are marked as classified, move on to process another sub-table and go to Step 2; else,
go to Step 4. If no sub-tables are available, exit with the set of rules obtained till then.
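
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: examples are dictionaries, rules are returned as (conditions, decision) pairs, and the small EXAMPLES table with attributes size, colour and shape is hypothetical data made up for the demonstration.

```python
from itertools import combinations

def ila(examples, attributes, class_attr):
    """Return a list of (conditions, decision) rules induced from the examples."""
    rules = []
    classes = []
    for ex in examples:                          # class values, first-seen order
        if ex[class_attr] not in classes:
            classes.append(ex[class_attr])
    for cls in classes:                          # Step 1: one sub-table per class
        sub = [ex for ex in examples if ex[class_attr] == cls]
        others = [ex for ex in examples if ex[class_attr] != cls]
        marked = [False] * len(sub)
        j = 1                                    # Step 2
        while not all(marked) and j <= len(attributes):
            best, best_count = None, 0
            for combo in combinations(attributes, j):         # Step 3
                counts = {}
                for ex, done in zip(sub, marked):              # Step 4: unmarked rows
                    if not done:
                        values = tuple(ex[a] for a in combo)
                        counts[values] = counts.get(values, 0) + 1
                for values, count in counts.items():
                    in_others = any(tuple(o[a] for a in combo) == values
                                    for o in others)
                    if not in_others and count > best_count:   # keep first maximum
                        best, best_count = (combo, values), count
            if best is None:                     # Step 5: no usable combination
                j += 1
                continue
            combo, values = best
            for i, ex in enumerate(sub):         # Step 6: mark covered rows
                if tuple(ex[a] for a in combo) == values:
                    marked[i] = True
            rules.append((dict(zip(combo, values)), cls))      # Step 7
    return rules

# Hypothetical training table used only for illustration.
EXAMPLES = [
    {"size": "medium", "colour": "blue",  "shape": "brick",  "decision": "yes"},
    {"size": "small",  "colour": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "small",  "colour": "red",   "shape": "sphere", "decision": "yes"},
    {"size": "large",  "colour": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "large",  "colour": "green", "shape": "pillar", "decision": "yes"},
    {"size": "large",  "colour": "red",   "shape": "pillar", "decision": "no"},
    {"size": "large",  "colour": "green", "shape": "sphere", "decision": "yes"},
]

for conditions, decision in ila(EXAMPLES, ["size", "colour", "shape"], "decision"):
    print("IF", conditions, "THEN decision is", decision)
```

The loop returns to Step 4 with the same ‘j’ after each rule, and only increases ‘j’ when no value combination is unique to the current sub-table.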
2. Inductive Bias:
-The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the
learner uses to predict outputs of given inputs that it has not encountered.
-Approaches to a more formal definition of inductive bias are based on mathematical logic. Here, the
inductive bias is a logical formula that, together with the training data, logically entails the hypothesis
generated by the learner. However, this strict formalism fails in many practical cases, where the inductive
bias can only be given as a rough description (e.g. in the case of artificial neural networks), or not at all.
 What are the Techniques for Selecting Best Attribute? (Shivam Singh)

 What is Inductive Learning and Deductive Learning (Pankaj Manachekar - 21306A1052)

Ans :-
Deductive reasoning is deducing new information from logically related known
information. It is the form of valid reasoning, which means the argument's conclusion must be
true when the premises are true.
Deductive reasoning is a type of propositional logic in AI, and it requires various rules and facts.
It is sometimes referred to as top-down reasoning, and it is the opposite of inductive reasoning.
Deductive reasoning mostly moves from general premises to a specific conclusion, which
can be explained by the example below.
Example:-

Premise-1: All humans eat veggies.
Premise-2: Suresh is human.
Conclusion: Suresh eats veggies.
Inductive reasoning is a form of reasoning to arrive at a conclusion using limited sets of facts by
the process of generalization. It starts with a series of specific facts or data and reaches a
general statement or conclusion.
Inductive reasoning is a type of propositional logic, which is also known as cause-effect
reasoning or bottom-up reasoning.
In inductive reasoning, we use historical data or various premises to generate a generic rule, for
which premises support the conclusion.
Example:-

Premise: All of the pigeons we have seen in the zoo are white.
Conclusion: Therefore, we can expect all the pigeons to be white.

 What is clustering? Write Properties of Clustering Algorithm. (Chaitanya Mane - 21306A1014)

Ans :-
Clustering: no predefined classification is required. The task is to learn a classification from
the data. Clustering algorithms divide a data set into natural groups (clusters). Instances in the
same cluster are similar to each other; they share certain properties.
Clustering algorithms can have different properties:-
Hierarchical or flat: hierarchical algorithms induce a hierarchy of clusters of decreasing
generality; for flat algorithms, all clusters are at the same level.
Iterative: the algorithm starts with initial set of clusters and improves them by reassigning
instances to clusters.
Hard and soft: hard clustering assigns each instance to exactly one cluster. Soft clustering assigns
each instance a probability of belonging to a cluster.
Disjunctive: instances can be part of more than one cluster.

 Explain categories of Clustering Algorithm. ( Nihal Satam)

Ans – Clustering Algorithm categories are as follows :-


1) Exclusive clustering : In exclusive clustering, objects are grouped in an exclusive way, so that if a certain object belongs to a
definite cluster then it cannot be included in any other cluster.

2) Overlapping clustering : In overlapping clustering, fuzzy sets are used to cluster objects, so that each point may belong to
two or more clusters with different degrees of membership. In this case each object is associated with an appropriate membership
value.

3) Hierarchical clustering : This type of clustering is based on the union of the two nearest clusters. The beginning condition
is realized by setting every object as a cluster. After a few iterations it reaches the final clusters.

 Explain K-means, Fuzzy C-means and Hierarchical Clustering. ( Mohd Sameer Khan)

Ans

K-means Clustering :

K-means is one of the simplest unsupervised learning algorithms that can be used to solve the
well-known clustering problem (MacQueen, 1967). The procedure follows a fairly simple approach of
classifying a given object set into a certain number of clusters, say K (fixed a priori). If we assume
that there are n data points, then the K-means algorithm is broadly stated as follows:

• Place K points into the space represented by the objects. These points represent the initial group
centroids.
• Repeat until the centroids do not move:
  • Assign each object to the group that has the closest centroid.
  • When all objects have been assigned, recalculate the positions of the K centroids.

This algorithm separates the objects into groups from which the metric to be minimized can be
calculated. The aim of the algorithm is to minimize an objective function. In this case, the
objective function F is the squared error function, and it may be defined as follows: let x_i^(j) be the ith data
point in cluster j and c_j be the centre of cluster j; then we can write

F = \sum_{j=1}^{K} \sum_{i} \| x_i^{(j)} - c_j \|^2

where the inner sum runs over the data points assigned to cluster j.
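The K-means steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `kmeans` and `squared_error`, the sample points, and the two starting centroids are all assumptions chosen for the example, and only the standard library is used.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain K-means on tuples of floats; `centroids` holds the K initial centres."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # centroids did not move: stop
            break
        centroids = new_centroids
    return centroids, clusters

def squared_error(centroids, clusters):
    """Objective F: sum of squared distances from each point to its cluster centre."""
    return sum(math.dist(p, c) ** 2 for c, cl in zip(centroids, clusters) for p in cl)

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the initial centroids, which is why K is fixed and the starting points are supplied by the caller in this sketch.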
Fuzzy C-means

Fuzzy C-means (FCM) is a method of clustering in which an object is allowed to belong to
two or more clusters. This method was initially developed by Dunn in the year 1973 and is frequently
used in applications such as pattern recognition. Let u_ij represent the degree of
membership of x_i (the ith data point) in cluster j, and c_j be the centre of cluster j. The FCM algorithm is
broadly described as follows, where n is the number of input data points and c is the number of clusters.
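The algorithm itself is not reproduced in these notes. As a hedged sketch, one iteration of the commonly used FCM update (memberships first, then centres) could look as follows. The fuzzifier `m = 2`, the function name `fcm_step`, and the sample data are assumptions made for the example, not values given in the text.

```python
import math

def fcm_step(points, centres, m=2.0):
    """One FCM iteration: update memberships u[i][j], then the cluster centres."""
    u = []
    for x in points:
        d = [math.dist(x, c) for c in centres]
        if 0.0 in d:
            # The point coincides with a centre: it belongs fully to that centre.
            row = [1.0 if dj == 0.0 else 0.0 for dj in d]
            total = sum(row)
            row = [r / total for r in row]
        else:
            # u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
            row = [1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in d) for dj in d]
        u.append(row)
    dim = len(points[0])
    new_centres = []
    for j in range(len(centres)):
        # c_j = sum_i u_ij^m * x_i / sum_i u_ij^m
        w = [u[i][j] ** m for i in range(len(points))]
        new_centres.append(tuple(
            sum(wi * x[k] for wi, x in zip(w, points)) / sum(w) for k in range(dim)
        ))
    return u, new_centres

# Hypothetical 1-D data with two initial centres.
points = [(0.0,), (1.0,), (9.0,), (10.0,)]
u, centres = fcm_step(points, [(2.0,), (8.0,)])
```

Each row of `u` sums to 1, so a point shared between clusters simply gets fractional memberships instead of a hard label.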

Hierarchical Clustering

As already defined earlier, hierarchical clustering is a type of clustering that is based on the union of two
nearest clusters. In this clustering, the starting condition is realized by setting every object as a cluster.
After a few iterations, it reaches the final clusters. Given a set of n objects to be clustered, and an n*n
distance or similarity matrix, the basic process of hierarchical clustering defined by S.C. Johnson
(Johnson, 1967) is as follows:
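The process is not listed in these notes. In its usual formulation it starts with every object in its own cluster and repeatedly merges the closest pair of clusters. The sketch below assumes that formulation; the function names, the stopping rule (stop when `k` clusters remain), and the sample points are illustrative choices. The three `*_link` functions measure cluster-to-cluster distance as the shortest, greatest, and average member-to-member distance respectively.

```python
import math

def single_link(a, b):
    """Shortest distance between any member of a and any member of b."""
    return min(math.dist(p, q) for p in a for q in b)

def complete_link(a, b):
    """Greatest distance between any member of a and any member of b."""
    return max(math.dist(p, q) for p in a for q in b)

def average_link(a, b):
    """Average distance over all member pairs."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def agglomerate(points, k, linkage=single_link):
    """Merge the closest pair of clusters until only k clusters remain."""
    clusters = [[p] for p in points]          # start: every object is a cluster
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters.pop(j)              # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

# Hypothetical points: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agglomerate(points, k=2)
```

Recomputing all pairwise distances on every pass keeps the sketch short; a practical version would update a distance matrix instead.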
 What are the different Methods To Find The Closest Pair Of Clusters? ( Devendra Katpara)

ANS – There are a number of methods that can be used for finding the closest pair of clusters, such as

 Single-linkage clustering
 Complete-linkage clustering
 Average-linkage clustering
These are described briefly as follows:
1. In single-linkage clustering, we consider the distance between one cluster and another
cluster to be the shortest distance from any member of one cluster to any member of the other
cluster. If the data consist of similarities, we consider the similarity between one cluster and
another cluster to be equal to the greatest similarity from any member of one cluster to any
member of the other cluster.
2. In complete-linkage clustering, we consider the distance between one cluster and another
cluster to be the greatest distance from any member of one cluster to any member of the other
cluster.
3. In average-linkage clustering, we consider the distance between one cluster and another
cluster to be the average distance from any member of one cluster to any member of the other
cluster.
 Write Application Of Clustering Algorithm (Omkar Jadhav)

Ans: Clustering algorithms find applications in a large number of fields, such as
marketing (which involves finding groups of customers who show similar buying behaviour
from a given database containing customer information and past buying patterns), biology
(which involves classifying a given database of plants and animals into genera, families, orders,
classes, kingdoms, and so on), insurance (which involves identifying groups of motor insurance policy
holders with a high average claim cost and also identifying frauds), residential development
(which involves classifying groups of houses in terms of their type, value, and location), and so
on.

 Explain Support Vector Machines. (Ameya Gadekar)


Ans.
1. Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and
regression. Although it can handle regression problems as well, it is best suited for classification.
2. The objective of the SVM algorithm is to find a hyperplane in an N-dimensional space that distinctly classifies
the data points. The dimension of the hyperplane depends upon the number of features.
3. If the number of input features is two, then the hyperplane is just a line. If the number of input features is
three, then the hyperplane becomes a 2-D plane. It becomes difficult to imagine when the number of features
exceeds three.
4. Advantages of SVM:

 Effective in high dimensional cases

 It is memory efficient, as it uses only a subset of the training points, called
support vectors, in the decision function

 Different kernel functions can be specified for the decision function, and it is possible to
specify custom kernels

-Linearly separable:

5. Let’s consider two independent variables x1, x2 and one dependent variable which is either a blue circle or a
red circle.

From the figure above it is clear that there are multiple lines (our hyperplane here is a line because we
are considering only two input features x1, x2) that segregate our data points, that is, classify them
into red and blue circles.
Selecting the best hyper-plane:
6. One reasonable choice as the best hyperplane is the one that represents the largest separation or margin
between the two classes.

So we choose the hyperplane whose distance to the nearest data point on
each side is maximized. If such a hyperplane exists, it is known as the
maximum-margin hyperplane, or hard margin. So from the above figure, we choose L2.
7. Let’s consider a scenario like the one shown below.

The blue ball inside the boundary of the red ones is an outlier among the blue balls.
The SVM algorithm is able to ignore the outlier and
find the best hyperplane that maximizes the margin. In this sense SVM is robust to
outliers.

It finds the maximum margin as with the previous data sets, and in addition
it adds a penalty each time a point crosses the margin. The margins
in these types of cases are called soft margins.

-Non linearly separable:


8. SVM solves this by creating a new variable using a kernel. We take a point xi on the
line and create a new variable yi as a function of its distance from the origin.

For the right-hand-side image given above: the new variable y is created as a function of
distance from the origin. A non-linear function that creates a new variable is
referred to as a kernel.

9. The SVM kernel is a function that takes a low-dimensional input space and
transforms it into a higher-dimensional space, i.e. it converts a
non-separable problem into a separable problem.
10. It is mostly useful in non-linear separation problems. Simply put, the
kernel performs some complex data transformations and then finds
a way to separate the data based on the labels or outputs
defined.
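The idea of creating a new variable from the distance to the origin can be illustrated with a small sketch. It is not an SVM solver, and `lift` is an explicit feature map rather than a kernel function in the strict sense; the point values, class names, and the midpoint threshold are assumptions chosen for the example.

```python
def lift(x):
    """Map a 1-D point to 2-D by adding y = x**2, its squared distance from the origin."""
    return (x, x * x)

inner = [-1.0, -0.5, 0.5, 1.0]    # hypothetical class "red": close to the origin
outer = [-3.0, -2.5, 2.5, 3.0]    # hypothetical class "blue": far from the origin

# On the line, no single threshold on x separates the classes, because blue
# points lie on both sides of the red ones. After lifting, the horizontal
# line y = threshold separates them.
threshold = (max(lift(x)[1] for x in inner) + min(lift(x)[1] for x in outer)) / 2

def classify(x):
    """Classify a 1-D point using only the new variable y."""
    return "blue" if lift(x)[1] > threshold else "red"
```

Here the separating line is placed midway between the two classes in the lifted space, which mirrors the maximum-margin choice described earlier for the linearly separable case.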
 What is regression? Explain in short Support Vector Regression. (Harsh Shukla)

 What is Case-Based Reasoning and Learning? (Harshal Kadam)

Ans)
Case-based reasoning (CBR) is a recent and very useful problem-solving and learning
paradigm in AI. It is different from other AI approaches as CBR can utilize specific knowledge from
previously experienced, concrete problem situations. It does not rely solely on general knowledge of a
problem domain or make associations along generalized relationships between problem descriptors and
conclusions.
Learning in CBR is incremental and sustained: a new experience is saved every time a
problem is solved, so that this solution becomes immediately available for future use. The idea behind CBR is
that whenever we are faced with a new problem, our first instinct is to look at solutions that have worked
for similar problems in the past. Thus, the case-based reasoner solves new problems by adapting solutions
that were used to solve old problems or by remembering a previous similar situation and by reusing
information and knowledge of that situation.
For example, suppose that while driving along a particular road we come across a traffic jam. If we have
faced a similar situation in the past and remember a route that we took to get out of the jam,
the same route may be used again. Otherwise, we can try an experimental route, and if we are successful
in our attempt, we may remember this route for similar circumstances in the future.
Another example of case-based reasoning approach is in the case of medical situations where physicians
use their past experiences for diagnosis and treatment of patients.
Case-based reasoning is a sub-field of machine learning since the basic feature of CBR is learning. It
represents a machine learning paradigm that enables continual learning by updating the case base after
obtaining a solution to each problem. Therefore, learning naturally follows problem solving in CBR.
After a particular problem has been solved successfully, the solution is retained for future reference. If a
problem cannot be solved, then the reason for the failure is identified and recorded so that the same
mistake can be avoided in the future.
 What are Steps for Learning in CBR? ( Sanket Patil )
Ans : Case-based reasoning (CBR) is an experience-based approach to solving new problems by adapting previously
successful solutions to similar problems. Addressing memory, learning, planning and problem solving, CBR provides a
foundation for a new technology of intelligent computer systems that can solve problems and adapt to new situations. In
CBR, the “intelligent” reuse of knowledge from already-solved problems, or cases, relies on the premise that the more
similar two problems are, the more similar their solutions will be.

◆ Case-Based Problem Solving


In CBR terminology, a case refers to a problem situation. A previously experienced situation, whose knowledge
has been retained and can be used later, is called a past case. The current situation is known as the new case, whose solution
needs to be determined. Case-based reasoning is defined as a cyclic and integrated process of solving a problem, then
learning from this experience and reusing this experience to solve a new problem, and so on. CBR is not only concerned
with interpreting a problem situation and generating corresponding possible solutions. The term problem-solving is
used in a wide sense and does not necessarily indicate determination of a concrete solution to some application problem.

◆ Learning in Case-based Reasoning

Case-based reasoning is considered to be a sub-field of machine learning since the basic underlying feature of CBR is
learning. Moreover, CBR does not just depict a particular reasoning method. Rather, regardless of how the cases are
acquired, it represents a machine learning paradigm that enables continual learning by updating the case base after
obtaining a solution to each problem. Therefore, learning naturally follows problem solving in CBR. After a particular
problem has been solved successfully, the solution is retained for future reference. If a problem cannot be solved, then
the reason for the failure is identified and recorded so that the same mistake can be avoided in the future. The
following steps need to be followed for effective learning using the CBR approach:

> Extraction of relevant knowledge from an experience
> Integration of a case into the existing knowledge base
> Indexing of a case for future reference in similar situations

 Explain Case Retrieval Methods. (Saad Ansari)

The central tasks that all case-based reasoning methods have to deal with
are to
I. identify the current problem situation
II. find a past case similar to the new one,
III. use that case to suggest a solution to the current problem,
IV. evaluate the proposed solution, and
V. update the system by learning from this experience.

 It is extremely important to retain the knowledge obtained from
previous experience in such a manner that this knowledge can be made
easily available for future cases.
 The CBR paradigm follows a variety of methods to ensure proper
organization, retrieval, utilization, and indexing of the knowledge
recorded from past cases.
 The following points may be noted:

 Cases may be retained as individual cases or as a generalized case of a
set of similar cases.
 Cases may be stored as separate knowledge units or may also be
split into sub units that may then be distributed within the
knowledge structure.
 Cases so stored may be indexed by a prefixed or open vocabulary,
and may be retained within a flat or hierarchical index structure.
 The solution from a previous case may be either directly applied to
the present problem, or modified according to differences between
the two cases.
 The matching of cases, adaptation of solutions, and learning from
an experience may be guided and supported by a model of general
domain knowledge, by more shallow and compiled knowledge, or it
may be based on an apparent, syntactic similarity only.
 Past cases may be retrieved and evaluated sequentially or in
parallel.

 CBR methods and systems have two main models:
 A process model of the CBR cycle
 A task-method structure for case-based reasoning
 The two models mentioned above are complementary to each other and
depict two different views of case-based reasoning. The process
model of the CBR cycle is a dynamic model concerned with the main
sub-processes of a CBR cycle and their interdependencies, whereas the
task-method structure offers a task-oriented view in which task
decomposition and related problem-solving methods are dealt with. In
general, the CBR cycle may be regarded as a cyclical process
comprising the four REs mentioned below:
 RETRIEVE the most similar case(s);
 REUSE the case(s) to attempt to solve the problem;
 REVISE the proposed solution if necessary; and
 RETAIN the new solution as a part of a new case.
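The four REs can be sketched as a tiny nearest-neighbour case base. Everything concrete here is hypothetical and only for illustration: the feature vectors, the stored solutions, the Euclidean similarity measure, and the optional `revise` callback stand in for whatever case representation and adaptation rules a real CBR system would use.

```python
import math

# Hypothetical case base: (problem features, solution) pairs.
case_base = [
    ((30.0, 1.0), "take the highway"),
    ((5.0, 0.0), "walk"),
]

def retrieve(problem):
    """RETRIEVE: the stored case whose features are closest to the new problem."""
    return min(case_base, key=lambda case: math.dist(case[0], problem))

def solve(problem, revise=None):
    """Run one pass of the CBR cycle and return the final solution."""
    _, old_solution = retrieve(problem)                  # RETRIEVE
    proposed = old_solution                              # REUSE the old solution
    final = revise(proposed) if revise else proposed     # REVISE if a repair is given
    case_base.append((problem, final))                   # RETAIN the new case
    return final

answer = solve((28.0, 1.0))
```

Because every solved problem is appended to `case_base`, a later query near `(28.0, 1.0)` retrieves the newly retained case rather than the original one.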

 What is ANN? What are the applications of ANN? (Jerome raj)

 What are Components of the ANN? (Afzal Shaikh)

 Explain SINGLE-LAYER FEED-FORWARD NETWORK. (Rushikesh Pukale)

Ans:

In this type of network, we have only two layers, the input layer and the output layer, but the
input layer does not count because no computation is performed in this layer.
The output layer is formed when different weights are applied to the input nodes and the
cumulative (weighted-sum) effect per node is taken.
The output-layer neurons then collectively compute the output signals.
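A minimal sketch of this computation: each output neuron takes the weighted sum of the inputs plus a bias and passes it through an activation function. The step activation, the bias terms, and the specific weights (chosen so that the two outputs compute logical AND and OR) are assumptions made for the example.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def single_layer(inputs, weights, biases, activation=step):
    """One output per row of `weights`: activation(sum_i w_i * x_i + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Two output neurons over two binary inputs: logical AND and logical OR.
weights = [[1.0, 1.0], [1.0, 1.0]]
biases = [-1.5, -0.5]
outputs = single_layer([1, 0], weights, biases)
```

No computation happens at the input nodes; they only supply the values `x_i`, which is why the input layer is not counted.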
 Explain MULTI-LAYER FEED-FORWARD NETWORK. (Mohd Owais patni)

Ans:
Multilayer Feed-Forward Neural Network (MFFNN) is an interconnected Artificial Neural
Network with multiple layers whose neurons have weights associated with them and
compute the result using activation functions. It is one of the types of Neural Networks in which
the flow of the network is from input to output units and it does not have any loops or feedback:
no signal moves in the backward direction, that is, from the output to the hidden and input layers.
The ANN is a self-learning network that learns from sample data sets and signals. It is based on
the function of the biological nervous system. The type of activation function depends on the
desired output. It is a part of machine learning and AI, which are among the fastest growing fields, and
a lot of research is going on to make it more effective.
The Architecture of the Multilayer Feed-Forward Neural Network:
This Neural Network or Artificial Neural Network has multiple hidden layers that make it a
multilayer neural Network, and it is feed-forward because signals pass through the layers in one
direction only, from input to output. In this network there are the following layers:
Input Layer: It is the starting layer of the network, which has a weight associated with the signals.
Hidden Layer: This layer lies after the input layer and contains multiple neurons that perform
the computations. If there are further hidden layers connected to it, it passes its weighted
output to the connected hidden layer for further processing; otherwise it passes the result to the output units.
Output Layer: It is a layer that contains output units or neurons; it receives processed data
from the last hidden layer and produces the desired result.
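As a hedged sketch of the forward pass through such a network, the example below chains two layers: one hidden layer and one output layer. The step activation and the hand-picked weights, which make the network compute XOR (a function a single layer of threshold units cannot represent), are assumptions for illustration; a trained network would learn its weights from data.

```python
def step(z):
    """Threshold activation used by every neuron in this sketch."""
    return 1 if z >= 0 else 0

def layer(inputs, weights, biases):
    """Outputs of one layer: one neuron per row of `weights`."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def feed_forward(inputs, layers):
    """Pass the signal through the layers in order, input to output, with no loops."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

xor_net = [
    ([[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5]),   # hidden layer: OR neuron, AND neuron
    ([[1.0, -1.0]], [-0.5]),                    # output: OR and not AND
]
table = {(a, b): feed_forward([a, b], xor_net)[0] for a in (0, 1) for b in (0, 1)}
```

Each layer only ever reads the outputs of the layer before it, which is the feed-forward property described above.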

 Explain RECURRENT NETWORK. (Sagar sawant )

Ans:-

A feed-forward network represents an acyclic (with no cycles) network since data can pass from input to the output nodes but not vice versa.
Once the FFNN is trained, its state is fixed and does not change when new data is presented to it, and it has no memory. These shortcomings of
feed-forward networks are resolved by another type of network called the recurrent network. These networks can have connections going back from
output to input nodes and, in fact, can have arbitrary connections between any nodes. In addition, an internal state of recurrent networks can be
modified as new sets of input data are presented. It also possesses a memory, which proves to be useful while solving problems where the
solution depends on all previous inputs and not just on the current inputs. For example, prediction of stock market price, weather forecast
prediction, etc., are all problems that require a network with the features described for a recurrent network. Learning in a recurrent network
involves feeding inputs through the network, which includes feeding data back from outputs to inputs. The process of feeding back is repeated
until the values of the outputs stop changing. This state is called equilibrium or stability. Figure 12.11 shows a recurrent network with a hidden
neuron that models a dynamic system using a unit delay operator d.
Recurrent networks can be trained by using the back-propagation algorithm. In this method, at each
step, the activation of the output is compared with the desired activation and errors are
propagated backward through the network. Once this training process is completed, the network
becomes capable of performing a sequence of actions.
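The role of the fed-back state can be shown with a minimal sketch of a single recurrent hidden neuron. The weights `w_in` and `w_rec`, the `tanh` activation, and the function name are hypothetical choices for the example; training by back-propagation is not shown.

```python
import math

def run_recurrent(inputs, w_in=0.5, w_rec=0.8):
    """Feed a sequence through one hidden neuron whose previous output is fed back."""
    h = 0.0          # internal state, held over one time step by the unit delay
    states = []
    for x in inputs:
        # The new state depends on the current input and on the previous state.
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

after_pulse = run_recurrent([1.0, 0.0])
after_silence = run_recurrent([0.0, 0.0])
```

Both sequences end with the same current input (0.0), yet the final states differ, because the network remembers the earlier input through its state.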

 Write a short note on Radial-Basis Function Networks. (Zuber langde)

 What are the Design Issues of Artificial Neural Networks? (Shlok shivkar)

 Write a short note on Recurrent Networks (Vinay Kumar Gajielli)

Ans: A feed-forward network represents an acyclic (with no cycles) network since data can pass from input to the output nodes but not vice
versa. Once the FFNN is trained, its state is fixed and does not change when new data is presented to it, and it has no memory. These
shortcomings of feed-forward networks are resolved by another type of network called the recurrent network. These networks can have connections
going back from output to input nodes and, in fact, can have arbitrary connections between any nodes. In addition, an internal state of recurrent
networks can be modified as new sets of input data are presented. It also possesses a memory, which proves to be useful while solving problems
where the solution depends on all previous inputs and not just on the current inputs. For example, prediction of stock market price, weather
forecast prediction, etc., are all problems that require a network with the features described for a recurrent network. Learning in a recurrent
network involves feeding inputs through the network, which includes feeding data back from outputs to inputs. The process of feeding back is
repeated until the values of the outputs stop changing. This state is called equilibrium or stability. Figure 12.11 shows a recurrent network with a
hidden neuron that models a dynamic system using a unit delay operator d.

Recurrent networks can be trained by using the back-propagation algorithm. In this method, at each
step, the activation of the output is compared with the desired activation and errors are
propagated backward through the network. Once this training process is completed, the network
becomes capable of performing a sequence of actions.

 What is Hopfield Network? Write Applications of Hopfield. (mayur Sumra)


Unit 4 Assignment

1. What is soft computing? (21306A1074 Vaishnavi Pangam)

Ans:

 Soft computing is defined as a group of computational techniques based
on artificial intelligence (human-like decision making) and natural selection that
provide quick and cost-effective solutions to very complex problems for
which analytical (hard computing) formulations do not exist.

 It takes its examples from nature, such as the evolution of a species,
the human nervous system, and the behaviour of ants, and it learns from
experimental data.

 It does not require any mathematical modeling for solving a given
problem.

 It may give different solutions from time to time when we solve a problem
for the same input.

 It uses biologically inspired methodologies such as genetics, evolution,
particle swarming, the human nervous system, etc.

 It is adaptive in nature.

 There are three types of soft computing techniques, which include the
following:

-Artificial Neural Network: It is a connectionist model and a parallel
distributed network. Neural networks are of two types: ANN (Artificial Neural
Network) and BNN (Biological Neural Network). A single processing element of a
neural network is known as a unit. The components of the unit
are input, weight, processing element, and output. It is similar to our human
neural system. The main advantage is that neural networks solve problems in
parallel. Artificial neural networks use electrical signals to communicate.

-Fuzzy Logic: Fuzzy logic is used to solve models that are based on
logical reasoning which is imprecise and vague.
Fuzzy logic assigns a truth value in the closed interval [0,1],
where 0 = false value and 1 = true value.

-Genetic algorithm: Genetic algorithms are usually used for optimization problems like
maximization and minimization of objective functions. They follow biological processes
like genetics and evolution. Related nature-inspired optimization techniques
include ant colony and particle swarm optimization.


2. Describe various phases of genetic algorithm. (Aaranta Waykar 21306A1027)

Ans : In computing terms, a genetic algorithm implements the model of computation by having arrays of bits or
characters (binary strings) to represent the chromosomes.

A genetic algorithm has the following two requirements :

A chromosome encoding of the solution domain.

A fitness function to evaluate the solution domain.

There are various phases of genetic algorithm :

Selection :
The idea of selection phase is to select the fittest individuals and let them
pass their genes to the next generation.

Pairs of individuals (parents) are selected based on their fitness scores.
Individuals with high fitness have more chance to be selected for reproduction.

Crossover :
Crossover is the most significant phase in a genetic algorithm. For each pair of
parents to be mated, a crossover point is chosen at random from within the
genes. For example, consider the crossover point to be 3 as shown below.
Offspring are created by exchanging the genes of parents among themselves until
the crossover point is reached.

Mutation :
In certain new offspring formed, some of their genes can be subjected to a
mutation with a low random probability. This implies that some of the bits in the
bit string can be flipped. Mutation occurs to maintain diversity within the
population and prevent premature convergence.
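The three phases can be put together in a short sketch on bit strings. The fitness function (count of 1 bits), the population size, the mutation rate, and the fitness-proportionate selection rule are assumptions made for the example, not choices prescribed by the text.

```python
import random

def fitness(bits):
    """Toy fitness (an assumption for illustration): the number of 1 bits."""
    return sum(bits)

def select(population, rng):
    """Selection: pick two parents with probability proportional to fitness."""
    weights = [fitness(ind) + 1 for ind in population]   # +1 keeps every weight positive
    return rng.choices(population, weights=weights, k=2)

def crossover(a, b, point):
    """Crossover: exchange the parents' genes up to the crossover point."""
    return b[:point] + a[point:], a[:point] + b[point:]

def mutate(bits, rate, rng):
    """Mutation: flip each bit with a low probability."""
    return [1 - g if rng.random() < rate else g for g in bits]

def evolve(pop_size=20, length=16, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = []
        while len(next_gen) < pop_size:
            p1, p2 = select(population, rng)
            c1, c2 = crossover(p1, p2, rng.randrange(1, length))
            next_gen += [mutate(c1, rate, rng), mutate(c2, rate, rng)]
        population = next_gen
    return max(population, key=fitness)

best = evolve()
```

With a crossover point of 3, `crossover([1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0], 3)` swaps the first three genes of the two parents and leaves the rest in place.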

3. What do you understand by Evolutionary Computation? Sejal Shingre (21306A1040)

Evolutionary computation is a sub-field of artificial intelligence (AI) and is used
extensively in complex optimization problems and for continuous optimization.
Evolutionary computation is used to solve problems that have too many variables
for traditional algorithms.
Computers performing evolutionary computing run such evolutionary algorithms
as genetic algorithms, evolutionary programming, genetic programming and
swarm intelligence models like ant colony optimization or particle swarm
optimization.
The computational models using evolutionary algorithms apply evolutionary
processes in order to solve complex problems.
These evolutionary processes are inspired by biological evolution theory.
Evolving algorithms use principles such as inheritance from previous successful
generations, and natural selection where the best solutions pass their traits on to
the successive generations.
4. Explain evolutionary programming. [Komal Gupta] [21306A1023]

Ans: Evolutionary programming is a more complex form of genetic programming in which the individuals are structures with
greater degrees of complexity. It constitutes one of the major evolutionary algorithm paradigms. Evolutionary programming
was invented by Dr. Lawrence J. Fogel in the year 1960 to use simulated evolution as a learning process with the aim of
generating intelligent behaviour. Although he did not model the end product of evolution, he did try to model the process of
evolution itself as a mechanism for producing intelligent behaviour. Fogel used finite state machines (FSMs) as predictors and
evolved them. He described this process as evolutionary programming in contrast to heuristic programming.

Evolutionary programming is considered to be a broad evolutionary computing paradigm with no fixed structure or representation. It is
difficult to distinguish between evolutionary programming and evolutionary strategies. Some of its original variants closely
resemble the later genetic programming, with the difference that the program structure is fixed and numerical parameters are
allowed to evolve.

The crucial point that distinguishes evolutionary programming from genetic algorithms is the manner in which new solutions or
offspring are generated. While in GA a new solution is formed as a result of crossover of two solutions, in evolutionary
programming each member of the population generates an offspring by the process of mutation. Evolutionary programming is
better for obtaining the global optimum since it relies on mutation rather than crossover. Because of the inherent flexibility in
the fitness function, the evolutionary programming method leads to the best solution in fewer generations.

The basic steps involved in using an evolutionary programming method to determine a globally optimal solution are outlined in
the following subsection. Similar to other computational procedures discussed so far, such as genetic algorithms, evolutionary
programming is also a methodology and not an algorithm. Many parameters need to be taken care of before using this method
in a particular computer program.

5. What is swarm intelligence? Name two swarm intelligence systems. [Shraddha Kasar][21306A1011]

Ans:- A swarm is defined as a set of (mobile) agents that are capable of communicating directly or indirectly (by acting on their
local environment) with each other. They can carry out distributed problem solving in a collective manner with the help of
extremely simple rules. Therefore, SI can be said to be based on the collective behaviour of self-organized and decentralized
systems. Even though there is no common centralized control structure that defines the behaviour of individual agents,
interactions between such agents lead to the emergence of intelligent global behaviour, which is unknown to the individual
agent.
Swarms are more powerful than single individuals since they can achieve goals that individuals may not be able to achieve.
Swarm intelligence involves four basic ingredients, namely, positive feedback, negative feedback, amplification of fluctuations, and
multiple interactions.

Ant Colony Optimization: - Ant colony optimization (ACO) is a class of optimization algorithms that are modelled on the actions
of the members of an ant colony. Artificial ants or simulation agents locate optimal solutions by moving through a search space
representing all possible solutions. Real ants lay down pheromones
directing each other to food and other resources while exploring their environment. In a similar manner, the artificial ants record
their positions and the quality of the solutions located by them so that other ants can locate better solutions in later simulation
iterations.

Particle Swarm Optimization: - Particle swarm optimization (PSO) is a global optimization algorithm that is used for solving
problems in which the best solution can be represented as a point or surface in an n-dimensional space. All hypotheses are plotted in
this space and provided with an initial velocity along with a communication channel between particles. As the particles move
through this solution space, they are evaluated according to some fitness criterion after each time step. With the passage of
time, particles are seen to accelerate towards those particles within their communication grouping which have better fitness
values.

6. What are the different applications of evolutionary algorithms? (21306A1051)

Evolutionary algorithms (EAs) are widely applied to real-life problems. As with swarm intelligence algorithms [6], a major reason is a growing
demand for smart optimization methods in many business and engineering activities. EAs are suitable mainly for
optimization, scheduling, planning, design, and management problems.

 Audio watermark detection
 Automated design = computer-automated design
 Automated design of mechatronic systems using bond graphs and
genetic programming (NSF).
 Automated design of industrial equipment using catalogs of
exemplar lever patterns.
 Automated design of sophisticated trading systems in the financial
sector.
 Automated design, including research on composite material
design and multi-objective design of automotive components for
crashworthiness, weight savings, and other characteristics.

7. Explain ant colony paradigm. [ MITALI JADHAV ] [SRN 21306A1061]


Ans. The complicated social behaviors displayed by ants have been observed and studied in great depth by scientists; they have
also inspired models that can be used to solve particularly difficult optimization problems. One of the most important aspects
of ant behavior is their ability to find the shortest paths. This has motivated computer scientists to develop algorithms for solving
shortest-path and optimization problems, known as the field of ant colony optimization (ACO). This is the most successful and
widely recognized algorithmic technique based on ant behavior. This paradigm emerged from research on real ant behavior. As mentioned
earlier, ants can find the shortest path from their colony to the source of food by leaving traces of chemical substances called
pheromones as a trail for other ants to follow. From the many such trails left by its predecessors, an ant chooses the trail that has
the maximum amount of pheromone deposit. The ant then traverses the chosen path and leaves its own pheromone as a trail for others
behind it. This is an autocatalytic process which favors paths along which more ants have previously travelled.
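The autocatalytic effect can be sketched with a toy simulation of ants choosing between alternative paths. Several details are assumptions for illustration rather than part of the description above: ants choose a path with probability proportional to its pheromone (a common softening of "pick the maximum"), trails evaporate at a fixed rate, and each ant deposits an amount inversely proportional to the path length.

```python
import random

def update(pheromone, lengths, path, evaporation):
    """Evaporate every trail, then reinforce the path the ant just used."""
    pheromone = [(1 - evaporation) * p for p in pheromone]   # trails fade over time
    pheromone[path] += 1.0 / lengths[path]   # shorter path gets a stronger deposit
    return pheromone

def simulate(lengths, ants=200, evaporation=0.1, seed=0):
    """Send ants one by one; each prefers paths that carry more pheromone."""
    rng = random.Random(seed)
    pheromone = [1.0 for _ in lengths]
    for _ in range(ants):
        path = rng.choices(range(len(lengths)), weights=pheromone)[0]
        pheromone = update(pheromone, lengths, path, evaporation)
    return pheromone

# Hypothetical setting: a short path of length 1 and a long path of length 3.
trails = simulate([1.0, 3.0])
```

Because a path that is chosen more often receives more pheromone and is then chosen even more often, the process tends to reinforce the shorter path over time.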

8. Write a short note on particle swarm optimization. [Madhushree Parab 21306A1026]

SOLUTION:

Particle swarm optimization (PSO) is a global optimization algorithm that is used
for solving problems in which the best solution can be represented as a point or
surface in an n-dimensional space. All hypotheses are plotted in this space and
provided with an initial velocity along with a communication channel between
particles. As the particles move through this solution space, they are evaluated
according to some fitness criterion after each time step. With the passage of time
particles are seen to accelerate towards those particles within their
communication grouping which have better fitness values.
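
A minimal sketch of this idea follows. It assumes the whole swarm is one communication grouping (a global best), uses the sphere function as a stand-in fitness criterion to be minimized, and picks the inertia and acceleration coefficients (w, c1, c2) only for illustration.

```python
import random


def sphere(x):
    """Example fitness criterion (lower is better): sum of squares."""
    return sum(v * v for v in x)


def pso(fitness, dim=2, n_particles=20, steps=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's personal best
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best found by the whole swarm
    history = [g_val]
    for _ in range(steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Accelerate towards the personal best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])               # evaluate after the move
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
        history.append(g_val)
    return g_pos, g_val, history


best, value, history = pso(sphere)
print(best, value)
```

The returned history records the swarm's best fitness after every time step, so it can only stay the same or improve.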
Swarm intelligence (SI) based techniques can be used in a number of applications,
such as solving the travelling salesman problem, job shop scheduling,
sequential ordering, network routing, graph coloring, vehicle routing, flow
manufacturing, and quadratic assignment, and also for business applications such as
modelling clusters of entrepreneurs, space planning, control of unmanned vehicles in
defense, self-assembly, layout of facilities, resource allocation,
determining orbital systems for planetary mapping, and so on. An important use of
SI was in telecommunication networks in the form of ant-based routing. This was
reported in the mid-1990s by Dorigo et al. and Hewlett Packard, who basically used a
probabilistic routing table to reward or reinforce the route successfully traversed
by each ant (a small control packet) flooding the network. Research on
reinforcement of the route in the forward direction, the reverse direction, and both
simultaneously has also been done. While backward reinforcement requires a
symmetric network and couples the two directions together, forward
reinforcement rewards a route before the outcome is known.
Some of the main advantages of SI are as mentioned below:
 Adaptability: SI systems possess self-organizational capabilities.
 Robustness: SI systems can find a new solution if the current solution
becomes invalid.
 Reliability: Agents can be added or removed without disturbing the behavior
of the total system because of its distributed nature.
 Simplicity: These systems are simple given the absence of a central control.

9. What is an Agent? (Sushma - 21306A1042)
Ans :
Artificial intelligence is defined as the study of rational agents. A rational agent
could be anything that makes decisions, such as a person, firm, machine, or software. It
carries out the action with the best expected outcome after considering past and current
percepts (the agent's perceptual inputs at a given instant). An AI system is composed
of an agent and its environment. The agents act in their environment. The
environment may contain other agents.
An agent is anything that can be viewed as :
 perceiving its environment through sensors and
 acting upon that environment through actuators
Examples of Agent:
 A software agent has keystrokes, file contents, and received network packets
which act as sensors, and displays on the screen, files, and sent network packets
acting as actuators.
 A human agent has eyes, ears, and other organs which act as sensors, and
hands, legs, mouth, and other body parts acting as actuators.
 A robotic agent has cameras and infrared range finders which act as
sensors and various motors acting as actuators.

10.Explain Agents vs software programs. (Shivranjani - 21306A1033)


Ans : Traditional software programs lack the ability to assess and react to the
environment and modify their behavior accordingly. They do not follow a
goal-oriented and autonomous approach to problem-solving. These characteristics
distinguish traditional programs from agents, whose key feature is the presence of
autonomy. For instance, a payroll program could probably be said to sense the
world through its input and act on it via its output, but it is not an agent because
its output would not normally be affected if it sensed or encountered unforeseen
situations later. In such a program there is no concept of capturing an environment that is
dynamic in nature and can affect the output of the system at different times of its
invocation.

1] Agents and Objects -


 Wooldridge has depicted the underlying difference between agents and
objects in terms of autonomy and behavior which depends on
characteristics such as reaction, proactiveness and social ability.
 Standard object models do not support the kind of behavior normally
displayed by agents.
 The most basic difference lies in the degree to which agents and objects are
autonomous; the classical definition of objects clearly defines them as
computational entities that encapsulate a certain state, provide
methods on this state, and perform actions.
 The manner in which different objects communicate with each other is
called message passing.
 While instance variables identified as private can only be accessed from
within an object, public methods can be accessed from anywhere.
 In this way, an object may be thought of as exhibiting autonomy
over its state, but it does not exhibit any control over its behavior.
 An agent, on the other hand, may or may not choose to
perform a certain action that is of no interest to itself, even if it is directed
by other agents to perform that particular action.
 Thus, the decision to perform a given action rests with the agent, and we
can state that agents display a stronger sense of autonomy than objects and
can take the important decision of whether or not to perform an action on
the request of another agent.

2] Agents and Expert Systems -


 Expert systems were considered to be the most important AI technology of
the 1980s. An expert system is a system that is considered to be an expert
when it comes to solving problems or giving advice in some knowledge-rich
domain.
 Expert systems are defined as rule-based systems in which a knowledge
engineer takes the knowledge of a certain domain and codes this knowledge
as rules and facts in a special type of database known as a knowledge
base.
 The rules are usually rules of thumb, that is, they are based on the
heuristic knowledge of the domain expert.
 Therefore, expert systems are based on the fact that previous knowledge of
a certain application exists and that we can acquire this knowledge from
samples and then code this gathered knowledge into the knowledge base.
 However, expert systems are not capable of interacting with their
environment and do not display reactive, proactive behavior or social
abilities such as cooperation, coordination and negotiation.

11.What are the types of agents?


Ans:
Types of Agents: Agents can be grouped into five classes based on their degree of
perceived intelligence and capability:
1. Simple Reflex Agents
2. Model-Based Reflex Agents
3. Goal-Based Agents
4. Utility-Based Agents
5. Learning Agent
Simple reflex agents
Simple reflex agents ignore the rest of the percept history and act only based on
the current percept. Percept history is the history of all that an agent has
perceived to date. The agent function is based on the condition-action rule. A
condition-action rule is a rule that maps a state i.e., condition to an action. If the
condition is true, then the action is taken, else not. This agent function only
succeeds when the environment is fully observable. For simple reflex agents
operating in partially observable environments, infinite loops are often
unavoidable. It may be possible to escape from infinite loops if the agent can
randomize its actions.
Problems with Simple reflex agents are:
 Very limited intelligence.
 No knowledge of non-perceptual parts of the state.
 The set of condition-action rules is usually too big to generate and store.
 If there is any change in the environment, then the collection of rules needs
to be updated.
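
A simple reflex agent can be sketched with the two-square vacuum world often used in textbooks. The locations "A" and "B", the percept format, and the action names are assumptions chosen for illustration. The agent looks only at the current percept and applies the first condition-action rule that matches.

```python
def simple_reflex_vacuum_agent(percept):
    """Choose an action from the current percept only (no percept history)."""
    location, status = percept
    # Condition-action rules, checked in order.
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    return "Left"


print(simple_reflex_vacuum_agent(("A", "Dirty")))  # Suck
print(simple_reflex_vacuum_agent(("A", "Clean")))  # Right
```

Because the agent keeps no state, it keeps moving between the two squares even when both are clean, which is the kind of loop mentioned above.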

Model-based reflex agents


A model-based reflex agent works by finding a rule whose condition matches the current
situation. It can handle partially observable environments by using a model
of the world. The agent must keep track of an internal state, which is adjusted
by each percept and depends on the percept history. The current state is
stored inside the agent, which maintains some structure describing the part of the world
that cannot be seen.
Updating the state requires information about:
 how the world evolves independently from the agent, and
 how the agent’s actions affect the world.
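
As a minimal sketch, the two-square vacuum world can be given an internal state. The class name, the percept format, and the world model are assumptions for illustration: a square is assumed to stay clean once cleaned (how the world evolves), and "Suck" is assumed to clean the current square (how the agent's action affects the world).

```python
class ModelBasedVacuumAgent:
    """Reflex agent that remembers what it knows about both squares."""

    def __init__(self):
        # Internal state: last known status of each square (None = never seen).
        self.model = {"A": None, "B": None}

    def act(self, percept):
        location, status = percept
        self.model[location] = status        # update the state from the percept
        if status == "Dirty":
            self.model[location] = "Clean"   # predicted effect of our own action
            return "Suck"
        if all(s == "Clean" for s in self.model.values()):
            return "NoOp"                    # nothing left to do, so stop moving
        return "Right" if location == "A" else "Left"


agent = ModelBasedVacuumAgent()
actions = [agent.act(p) for p in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]]
print(actions)
```

Unlike a stateless reflex agent, this one can stop once its model says both squares are clean, even though it never sees both squares at the same time.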

Goal-based agents
These agents make decisions based on how far they currently are from
their goal (a description of desirable situations). Every action is intended to
reduce the distance from the goal. This gives the agent a way to choose among
multiple possibilities, selecting the one that reaches a goal state. The knowledge
that supports its decisions is represented explicitly and can be modified, which
makes these agents more flexible. They usually require search and planning. The
goal-based agent's behavior can easily be changed.
Utility-based agents
The agents which are developed having their end uses as building blocks are
called utility-based agents. When there are multiple possible alternatives, then to
decide which one is best, utility-based agents are used. They choose actions
based on a preference (utility) for each state. Sometimes achieving the desired
goal is not enough. We may look for a quicker, safer, cheaper trip to reach a
destination. Agent happiness should be taken into consideration. Utility describes
how “happy” the agent is. Because of the uncertainty in the world, a utility agent
chooses the action that maximizes the expected utility. A utility function maps a
state onto a real number which describes the associated degree of happiness.
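
Choosing the action with the highest expected utility can be sketched as follows. The trip example, the outcome probabilities, and the weights inside the utility function are all made-up values for illustration.

```python
def expected_utility(outcomes, utility):
    """outcomes is a list of (state, probability) pairs for one action."""
    return sum(prob * utility(state) for state, prob in outcomes)


def choose_action(actions, utility):
    """Pick the action whose possible outcomes have the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


def trip_utility(state):
    """Map a state (minutes, cost, arrived_safely) onto a real number."""
    minutes, cost, safe = state
    return -1.0 * minutes - 0.5 * cost + (0.0 if safe else -100.0)


# Hypothetical actions with uncertain outcomes.
actions = {
    "highway": [((30, 10, True), 0.9), ((30, 10, False), 0.1)],
    "back_roads": [((45, 5, True), 1.0)],
}

best_action = choose_action(actions, trip_utility)
print(best_action)
```

Both routes reach the destination, so a goal alone cannot separate them; the utility function is what trades speed and cost against the risk of an unsafe trip.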
Learning Agent:
A learning agent in AI is the type of agent that can learn from its past experiences,
or it has learning capabilities. It starts to act with basic knowledge and then can
act and adapt automatically through learning.
A learning agent has mainly four conceptual components, which are:
 Learning element: It is responsible for making improvements by learning
from the environment.
 Critic: The learning element takes feedback from the critic, which describes
how well the agent is doing with respect to a fixed performance standard.
 Performance element: It is responsible for selecting external actions.
 Problem generator: This component is responsible for suggesting actions
that will lead to new and informative experiences.
12.Explain agent environment in detail. (Pooja Sakrulla 21306A1031)
Answer:- Environments for an Agent
Environments can be classified into the following categories:-
Deterministic and non-deterministic: A deterministic environment is one
in which every action has a single guaranteed effect and there is no
uncertainty about the state that will result from performing an action. On the
other hand, in a non-deterministic environment, there may be many effects
corresponding to a single action. For all intents and purposes, the physical world can be
regarded as non-deterministic. These environments present greater problems for
the agent designer than the deterministic ones.
Static and dynamic: A static environment is assumed to remain unchanged except
by the performance of actions by the agent. On the other hand, in a dynamic
environment other processes operate on it, and hence change it beyond the
agent's control. The physical world is a highly dynamic environment.
Discrete and continuous: An environment is said to be discrete if there are a
fixed, finite number of actions and percepts in it. Russell and Norvig gave a chess
game as an example of a discrete environment, and taxi driving as an example of
a continuous one. If an environment is sufficiently complex, then the fact that it is
deterministic is not of much help.
13.Write a short note on the working of an agent. Hiral Patel (21306A1072)
Ans.
1. An agent architecture is generally a map of the agent's internals: its data structures, the
operations which may be performed on these data structures, and the
control flow between these data structures.
2. One of the challenging goals in designing an agent program is to implement the
mapping from percepts to actions.
3. The agent takes sensor input from the environment and produces actions
that affect it as output.
4. The agent starts in some initial internal state, observes its environment
state, and then generates a percept.
5. Based on the percepts, the action is then performed, and the agent enters
another cycle of perceiving the world via perception, updating its state, and
choosing an action to perform.

14.What are single-agent and multiagent systems? Hirkani Kashid (21306A1070)
Ans. A multi-agent system (MAS or “self-organized system”) is a computerized
system composed of multiple interacting intelligent agents. Multi-agent systems
can solve problems that are difficult or impossible for an individual agent or a
monolithic system to solve. Intelligence may include methodic, functional,
procedural approaches, algorithmic search or reinforcement learning. Despite
considerable overlap, a multi-agent system is not always the same as an agent-
based model (ABM). Multi-agent systems consist of agents and their
environment. Typically multi-agent systems research refers to software agents.
However, the agents in a multi-agent system could equally well be robots,
humans or human teams. A multi-agent system may contain combined human-
agent teams.
Although it might seem that single-agent systems should be simpler than multiagent
systems, when dealing with a fixed, complex task the opposite is often the case. Single-agent systems
belong at the end of the progression from simple to complex multiagent systems.
The agent in a single-agent system models itself, the environment, and their
interactions. In a single-agent system, no other such entities are recognized by
the agent.

15.How to do evaluation of agent’s performance? Gorima Sayed (21306A154)
Ans. The performance of an agent can be measured to determine its success.
Obviously, no one fixed measure is suitable for all agents.
• An agent's subjective opinion of its own performance is not a very practical
measure; therefore, we insist on an objective performance measure.
• In other words, there should be some standard for measuring success in an
environment. Generating high-quality behaviour with the sole purpose of
satisfying goals is not enough.
• A more general performance measure might be a utility function.
• For example, one sequence of states is preferred over another if it has higher
utility for the agent. Therefore, utility is a function that maps a state onto a real
number, which, in turn, describes how useful the state is.
• A complete specification of the utility function allows rational decisions
in two types of cases where goals have trouble.
• First, if there are conflicting goals, the utility function specifies the appropriate
trade-off between them.
• Second, if there are several goals and none is achievable with certainty, then the
utility provides a way in which the likelihood of success can be weighed against
the importance of the goals.
16.What are the different types of Intelligent Agent architecture?
[Muskan Padvekar 21306A1025]
SOLUTION:
Architectures of Intelligent Agents
In our discussion so far, we have considered agents only in the abstract form and
not discussed how these agents are implemented. In this section, we will consider
four types of agent architectures:
• Logic-based
• Reactive
• Belief-Desire-Intention
• Layered
Each of these is described below.
Logic-based architecture: Agents in which decision making is realized through
logical deduction.
Reactive architecture: Agents in which decision making is implemented in some
form of direct mapping from situation to action.
Belief-Desire-Intention architecture: Agents in which decision making depends
upon the manipulation of data structures representing the beliefs, desires, and
intentions of the agent.
Layered architecture: Agents in which decision making is realized via various
software layers, each of which is more or less explicitly reasoning about the
environment at a different level of abstraction.
In each of these cases, we are moving away from the abstract view of agents and
beginning to make quite specific commitments about the internal structure and
operation of agents.

 Logic-based Architecture
It was initially believed that intelligent behaviour
can be generated by giving a system a symbolic/logical representation of its
environment and desired behaviour, and manipulating this representation through
logical deduction or theorem proving.
Logic-based approaches for agents are viewed as decision making by deduction. The
decision-making strategy of an agent is encoded as a logical theory, and the process
of selecting an action reduces to a problem of proof in logic. Logic-based
approaches are elegant and have clean semantics.
Let us now discuss how to develop a simple model of logic-based agents, which
we call deliberate agents. In such agents, the internal state is assumed to be a
database of first-order predicate logic formulae.
 Reactive Architecture
Researchers started investigating alternatives to logic-based approaches in
the mid to late 1980s. The subsumption architecture is the best-known
reactive agent architecture and was developed by Brooks, who was one of the strong critics of
the symbolic approach (Brooks R., 1991).
Subsumption has been widely influential in autonomous robotics and in real-time
AI systems.
Subsumption architecture is a way of decomposing complicated intelligent
behaviour into many simple self-contained components, which are in turn
organized into layers. Each layer implements a particular goal of the agent, and
higher layers are increasingly more abstract. The goal of each layer subsumes that
of the underlying layers.
The subsumption architecture has two main characteristics. The first is that
the decision making of an agent is realized through a set of task-accomplishing
behaviours. The second is that these behaviours are arranged in layers, with
higher layers representing more abstract behaviours.
However, there are certain disadvantages of this model as well. Since the goals
might begin interfering with each other, it is difficult to design action
selection through a highly distributed system of inhibition and suppression.
Further, this architecture offers low flexibility at runtime.
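
A minimal sketch of layered, behaviour-based action selection is given below. It follows one common textbook formulation in which behaviours are ordered by priority and the first behaviour whose condition holds fires and inhibits the rest. The sample-collecting robot, its percept fields, and the priority order are assumptions for illustration, not Brooks's original design.

```python
# Each behaviour is a (condition, action) pair. Earlier entries have higher
# priority and inhibit every behaviour that comes after them.
BEHAVIOURS = [
    (lambda p: p["obstacle"], "change_direction"),
    (lambda p: p["carrying_sample"] and p["at_base"], "drop_sample"),
    (lambda p: p["carrying_sample"], "move_toward_base"),
    (lambda p: p["sample_detected"], "pick_up_sample"),
    (lambda p: True, "move_randomly"),  # default behaviour, always applicable
]

DEFAULT_PERCEPT = {"obstacle": False, "carrying_sample": False,
                   "at_base": False, "sample_detected": False}


def make_percept(**overrides):
    """Build a percept dictionary, overriding only the fields that are given."""
    return {**DEFAULT_PERCEPT, **overrides}


def select_action(percept):
    """Return the action of the highest-priority behaviour that can fire."""
    for condition, action in BEHAVIOURS:
        if condition(percept):
            return action


print(select_action(make_percept(obstacle=True, sample_detected=True)))
print(select_action(make_percept(sample_detected=True)))
```

There is no symbolic reasoning here, only a direct mapping from situation to action, which is also why the behaviour is hard to change at runtime.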

 Belief-Desire-Intention Architecture
In belief-desire-intention (BDI) theory, the behaviour of an agent is described in
terms of a processing cycle. The processing cycle is a control mechanism that may
be achieved by a software feedback mechanism for performing functions without
direct external intervention. A feedback mechanism can continuously monitor the
output of the system, compare the result against preset values, and feed the
difference back to adjust the behaviour of the target system in a processing cycle.
BDI architectures have a reasoning process that helps in deciding an appropriate
action to be performed for achieving goals based on beliefs and intentions. These
are practical reasoning architectures, in which the reasoning process resembles
human reasoning. Belief is about understanding the problem and the environment in
that context. The decision process typically involves understanding and
generating the various options available to the agent on the basis of its beliefs,
choosing between them, and committing to some. These chosen options become
intentions, which determine the agent's actions. Intentions are fed back into the
agent's future practical reasoning.
An agent should review and reconsider its intentions from time to time as it might
have to drop certain intentions because of some reasons. For example, some of
the reasons might be that the belief of the agent has changed such that a
particular intention is no longer relevant now, or intention can never be achieved,
or it has already been achieved, etc. But reconsideration increases the cost
associated with it in terms of both time and computational resources.
So, different types of environment require different types of decision strategies.
In static environments, purely pro-active, goal-directed behaviour of an agent is
adequate. On the other hand, in more dynamic environments, the ability of an
agent to react to changes by modifying intentions is more important.
The basic components of BDI architecture are data structures representing
beliefs, desires, and intentions of the agent, and functions that represent its
deliberation for deciding what to do. The main components of a BDI agent are as
follows:
o A set of current beliefs (denoted by B) of the agent, representing information
about the current environment.
o A set of current desires (options or goals, denoted by D) of the agent.
o A set of current intentions (denoted by I), representing the current focus of
the agent.
Generally, beliefs, desires, and intentions are represented as logical formulae. The
sets B, D, and I should be consistent. For example, an intention to achieve X
should be consistent with the belief Y. The state of a BDI agent at any given
moment is represented as a triple (B, D, I). In addition, the following functions
are used:
o An action selection function (asf) determines an action to be performed on the
basis of current intentions, i.e., asf(I) -> Action.
o A belief revision function (brf) takes a percept and the current
beliefs of the agent as input and produces a new set of beliefs, i.e., brf(P, B) -> B.
o A filter function (ff) takes the agent's current beliefs, desires, and intentions
and determines the intentions of the agent, i.e., ff(B, D, I) -> I.
o An option generation function (ogf) determines the options available to
the agent on the basis of its current beliefs about its environment and its
current intentions, i.e., ogf(B, I) -> D.
The deliberation process of a BDI agent is represented in the filter function.
It updates the agent's intentions on the basis of its previously held
intentions and current beliefs and desires. This function should
o drop any intentions that are no longer achievable or that have become too costly to pursue,
o retain intentions that are not yet achieved and that are still expected to have a
positive overall benefit, and
o adopt new intentions, either to achieve existing intentions or to exploit
new opportunities.
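
One pass of the processing cycle can be sketched with the four functions brf, ogf, ff, and asf. Beliefs, desires, and intentions are plain sets of strings here rather than logical formulae, and the facts ("thirsty", "has_water", and so on), the achievability table, and the rule of committing to one intention at a time are all assumptions for illustration.

```python
# Hypothetical domain knowledge: the belief needed for a desire to be achievable.
ACHIEVABLE_IF = {"drink": "has_water", "rest": "has_bed"}


def brf(percept, beliefs):
    """Belief revision: brf(P, B) -> B. The percept maps fact -> True/False."""
    new_beliefs = set(beliefs)
    for fact, holds in percept.items():
        if holds:
            new_beliefs.add(fact)
        else:
            new_beliefs.discard(fact)
    return new_beliefs


def ogf(beliefs, intentions):
    """Option generation: ogf(B, I) -> D. This toy version ignores intentions."""
    desires = set()
    if "thirsty" in beliefs:
        desires.add("drink")
    if "tired" in beliefs:
        desires.add("rest")
    return desires


def ff(beliefs, desires, intentions):
    """Filter: ff(B, D, I) -> I. Drop, retain, or adopt intentions."""
    achievable = {d for d in desires if ACHIEVABLE_IF[d] in beliefs}
    retained = intentions & achievable   # drop what is no longer wanted or achievable
    if retained:
        return retained                  # stay committed to the current intentions
    return set(sorted(achievable)[:1])   # otherwise adopt at most one new intention


def asf(intentions):
    """Action selection: asf(I) -> Action."""
    return "do_" + min(intentions) if intentions else "idle"


def bdi_step(state, percept):
    beliefs, desires, intentions = state
    beliefs = brf(percept, beliefs)
    desires = ogf(beliefs, intentions)
    intentions = ff(beliefs, desires, intentions)
    return (beliefs, desires, intentions), asf(intentions)


state = (set(), set(), set())
percepts = [{"thirsty": True, "has_water": True, "tired": True, "has_bed": True},
            {"thirsty": False}, {"has_bed": False}]
trace = []
for percept in percepts:
    state, action = bdi_step(state, percept)
    trace.append(action)
print(trace)
```

The agent first commits to drinking, switches to resting once it no longer believes it is thirsty, and goes idle when resting stops being achievable, which mirrors the drop, retain, and adopt cases listed above.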

Layered Architecture
In layered architectures, the various sub-systems of an agent are arranged into a
hierarchy of interacting layers. There will be at least two layers, to deal with
reactive and pro-active behaviors of agent, respectively. A useful typology for
such architectures is by the information and control flows within them. Broadly
there are two types of control flow within layered architectures (Wooldridge and
Jennings, 1995) namely horizontal layering and vertical layering.
Horizontal layering: In horizontally layered architectures, each layer is directly
connected to the sensory input and the action output. If there are n layers and each
layer is capable of suggesting m possible actions, then there are up to m^n
interactions between layers to be considered. In order
to ensure that a horizontally layered architecture is consistent, a mediator
is generally included that decides which layer has control of the agent at
any given time. The introduction of a central control or mediator system
introduces a bottleneck into the agent's decision making, and it is also problematic
because the designer must foresee all possible interactions between layers.
Vertical layering: In vertically layered architecture, sensory input and action
output are each dealt with by at most one layer. In this form of architecture, the
problems of the horizontal architecture are partly solved. Vertically layered
architectures can be divided into one-pass architectures and two-pass
architectures. These are shown in Figure 14.6.
o In one-pass architecture, control flows sequentially through each layer
until the final layer generates action output.
o In two-pass architecture, information flows from the percept up through one layer
after another in the first pass, and control flows back down from the last layer to the first
until action output is produced in the second pass.

The complexity of interactions between layers in both one-pass and two-pass
vertically layered architectures is reduced: there are n - 1 interfaces between
n layers, so if each layer can suggest m actions, there are at most m^2(n - 1)
interactions to be considered between layers. This is clearly much simpler than the horizontally
layered case. However, this simplicity comes at the cost of some flexibility. In
vertically layered architecture, control must pass through every layer for a
decision to be made, so a failure in any one layer is likely to have serious
consequences for agent performance.

In the design of agent systems, a communication mechanism must be put in place
through which agents can pass messages to each other.

17.What is agent communication language? (Sushma - 21306A1042)


Ans: Communication is necessary in order to allow collaboration, negotiation,
cooperation, etc. between independent entities. For this purpose, it requires
well-defined, agreed, and commonly understood semantics. Therefore, there
cannot be any interoperability without standards.

Agent communication is based on message passing, where agents communicate
by formulating and sending individual messages to each other. The FIPA ACL
specifies a standard message language by setting out the encoding, semantics and
pragmatics of the messages.
The standard does not set out a specific mechanism for the internal
transportation of messages. Instead, since different agents might run on different
platforms and use different networking technologies, FIPA just specifies how
messages are transported and encoded between different remote platforms.
The syntax of the ACL is very close to that of the KQML communication language.
An agent communication language is thus the language used by agents to exchange
messages; it defines a common syntax for cooperation between heterogeneous agents.
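
To give a feel for what such a message looks like, the sketch below builds a message with a performative and a few of the FIPA ACL message parameters (sender, receiver, content, language, ontology) and prints it in a KQML-like parenthesized form. The layout is simplified and is not the exact FIPA string encoding; the class, the agent names, the content expression, and the ontology name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ACLMessage:
    performative: str   # the communicative act, e.g. inform or request
    sender: str
    receiver: str
    content: str
    language: str = ""  # language in which the content is expressed
    ontology: str = ""  # vocabulary that the content refers to

    def to_string(self):
        """Render the message in a simplified, KQML-like textual form."""
        parts = [f"({self.performative}",
                 f"  :sender {self.sender}",
                 f"  :receiver {self.receiver}",
                 f'  :content "{self.content}"']
        if self.language:
            parts.append(f"  :language {self.language}")
        if self.ontology:
            parts.append(f"  :ontology {self.ontology}")
        return "\n".join(parts) + ")"


# Hypothetical agents and content, for illustration only.
msg = ACLMessage("inform", "seller", "buyer", "price(widget, 10)",
                 language="fipa-sl", ontology="commerce")
print(msg.to_string())
```

Keeping the performative separate from the content is what lets heterogeneous agents agree on the kind of act (informing, requesting) even when they use different content languages.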

18.What are the applications of AI Agent? (Shivranjani - 21306A1033)


Ans : The applications of AI Agent are as follows :
 Manufacturing : In manufacturing, agents can flexibly handle unexpected
events such as machine failure, custom manufacturing of consumer
specified special orders, opportunistic rescheduling, monitoring of
machines, automatic correction of machine or automatic assignment of
failure of machine to the appropriate agency with the details of the
problem.
 Unmanned Aerial Vehicles (UAVs): Here, agents that can autonomously
follow a flight plan can re-plan the flight path in
response to unexpected events. In air traffic management, agents can
manage a large number of aircraft.
 Military Simulation: Here, agents can be simulated as humans.
 Fault Diagnosis: In fault diagnosis, distributed agents monitor and diagnose
individual components in a larger system.
 Prognostic Health Management (PHM): PHM utilizes agents for
identification of incipient faults and integration of sensor information and
maintenance histories.
 Business Processes: Processes use automated, goal-directed agents for
execution of business processes. Automated processes respond to events
and invoke the appropriate course of action.
 Dynamic Trading: Here, agents can monitor different aspects of the
e-market. Agents can be used in auctions and negotiation on the web.
 Intelligent Decision Support: Here, agents with domain knowledge can
provide expert and timely advice to users.
Unit 5
1. What is Conceptual dependency? Explain different Conceptual Primitive actions.

Different types of Conceptual Primitive actions are :-


2. Write a short note on the Conceptual category.
Ans :- CD provides a specific set of building blocks from which representations can be made,
rather than a structure in which knowledge can be represented. The building blocks are the set of
allowable dependencies among the conceptualizations for different events. There are four
primitive conceptual categories from which dependency structures can be built.

ACT Actions {one of the CD primitives representing a verb}

PP Objects {picture producers, or nouns/pronouns}

AA Modifiers of actions {action aiders, or adverbs}

PA Modifiers of PPs {picture aiders, or adjectives}

The relationships between concepts are called dependencies. The main conceptualization of a
clause is a two-way dependency between a PP (the actor) and an action. It is important to note
that actions are broken down into sequences of primitive ACTs.

A set of rules describes the syntax of the conceptual level; these rules specify which types of
concepts can depend on which other types, as well as the different kinds of dependency
relationships between concepts. Which specific concepts depend on which other concepts, based on the
particular meaning of these concepts, is determined by the semantics of the conceptual level.
There exists a dictionary of ACTs which specifies, for each verb, its different meanings with their
conceptual structures.

Example: The CD representation for sentences such as "I took a book from the man", "The man
gave me a book", and "The book was given by the man to me", which have the same intended meaning, is
given as follows:
Here we notice that some special notations and symbols are used. These will be explained as we
move ahead, but the conventions used throughout this chapter are as follows:

• Arrows indicate directions of dependency.

• Double arrow indicates two-way links between actor and action.

• The following characters are used to represent different case relations:

o — object case relation

r — recipient case relation

d — destination

Conceptual relations, at a higher level, indicate dependencies between conceptualizations,
annotated with conceptual tenses such as past, future, and conditional. Other types of
conceptual relations are the time and location of a conceptualization. So the conceptualizations
representing events can be modified in various ways to supply information normally indicated
in languages by the tense, mood, or aspect of a verb form. The set of conceptual tenses
includes

p — Past

f — Future

t — Transition

ts — Start transition

tf — Finished transition

k — Continuing

c — Conditional
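
A conceptualization can also be held in a small data structure. The sketch below is one possible encoding, not a standard one: the field names and the one-line text rendering are assumptions for illustration. It uses ATRANS, the CD primitive for transfer of an abstract relationship such as possession, together with the object (o) and recipient (r) case relations and the past tense marker p.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conceptualization:
    actor: str                       # PP that performs the action
    act: str                         # primitive ACT, e.g. ATRANS
    obj: str                         # PP in the object (o) case
    source: Optional[str] = None     # recipient (r) case: transferred from
    recipient: Optional[str] = None  # recipient (r) case: transferred to
    tense: str = ""                  # conceptual tense, e.g. p for past

    def render(self):
        """One-line text rendering (an ad hoc stand-in for the CD diagram)."""
        return (f"{self.actor} <={self.tense}=> {self.act} <-o- {self.obj} "
                f"<-r- (to: {self.recipient}, from: {self.source})")


# "The man gave me a book" and "The book was given by the man to me"
# differ in surface form but map to the same structure.
gave = Conceptualization("man", "ATRANS", "book", source="man", recipient="I", tense="p")
passive = Conceptualization("man", "ATRANS", "book", source="man", recipient="I", tense="p")

print(gave.render())
print(gave == passive)  # True
```

Comparing the two structures for equality is the point of the representation: sentences with the same meaning reduce to the same conceptualization regardless of wording.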

3. Explain different types of rules for Conceptualization blocks in Conceptual dependency.


Ans: Rules for Conceptualization Blocks in CD
-Dependency structures are themselves conceptualizations and therefore can serve as
components for larger dependency structures.
-The dependencies among conceptualizations correspond to semantic relations among the
underlying concepts.
-Here different types of arrows are used to represent different underlying semantics. Following
is the list of the most important allowable structures or the rules:
Rule 1: Rule representing relationship between an actor and the event caused by him or her is
called a two-way dependency, as neither actor nor event can be considered primary. The letter
p in the dependency link indicates that the event occurred in the past.

Rule 2: Rule representing relationship between an ACT and the PP (object) of the ACT is shown
by the direction of an arrow toward the ACT, since the context of the specific ACT determines
the meaning of the object relation.

Rule 3: This rule shows relationship both ways between two PPs. One PP belongs to the set
defined by the other PP. For example, John belongs to a set of doctors in the sentence "John is
a doctor".

Rule 4: It shows a relationship between two PPs. One of the PPs provides a particular kind of
information about the other PP. The most common types of information to be encoded in this
way are possession (shown as 'poss-by') and location (shown as 'loc'). The direction of the
arrow is, again, toward the concept being described.
Rule 5: It shows a relationship between a PP and a PA that is asserted to describe it. In this case,
PA represents the states of PP such as height, weight, health, etc. on numeric scales and has
both ways arrows.

Rule 6: It shows a relationship between a PP and an attribute that already has been predicated
of it. The direction of the arrow is toward the PP being described.

Rule 7: It shows a relationship between an ACT and its physical source and destination
locations. Here 'd' indicates the source and destination case relation, as this representation is also
used for the recipient case relation later. We can use 'v' for a vehicle, which can be a single object
or a full conceptualization.
Rule 8: It shows a relationship between an ACT and the source and recipient of the ACT. The
letter 'r' indicates the source and recipient case relation.

Rule 9: It shows a relationship between an ACT and the instrument with which it is performed.
In the simplest form, the instrument is represented by a single physical object; otherwise it must
be expanded to a full conceptualization. Here DO represents some act done by the actor using the
instrument. The letter 'i' represents the instrument.

Rule 10: It shows a relationship that describes a change of state: it links a PP to the state in
which it started and the state in which it ended. Here the states of an object are described using
numerical values. For example, for health, the range could be -10 to 10, describing various health
conditions {(dead, -10), (seriously ill, -9), (sick, [-8, -1]), (alright, 0), (fine, 5), (perfect health,
10)}.
Rule 11: It shows a relationship between one conceptualization and another conceptualization that
causes it. Here {x} and {y} represent two full conceptualizations, where {y} is caused by {x}. It is a
conditional representation, read as 'if x then y'. Alternatively, we can say that 'y' is a consequence of
'x'.

Rule 12: It shows a relationship between one conceptualization and another that is happening
at the time of the first. Here event 'y' is happening at the same time as event 'x'.
Rule 13: It shows a relationship between conceptualization and the place at which it happened.

4. Give examples of Conceptual dependency representation.(Aaranta waykar 21306A1027)


Ans : Conceptual dependency (CD) is a theory of deep semantics for concepts used in
natural language understanding, which was developed by Roger Schank at Stanford University
in 1969. It focuses mainly on concept and meaning rather than on syntax or structure.
Conceptual dependency rests on the basic hypothesis that the ACTION is the basis of any
proposition.

Conceptual dependencies are represented as follows :

1. ATRANS : transfer of an abstract relationship such as possession, control, or ownership.
It requires an actor, object, and recipient.

For ex : The verb ‘give’ is represented by an ATRANS act which transfers an object
from an actor to the recipient.
The verb ‘take’ is represented by an ATRANS act which transfers an object from
someone to an actor.

2. PTRANS : transfer of the physical location of an object (or actor). It requires an actor,
object, and direction.

For ex : The verbs ‘go’ and ‘walk’ are types of PTRANS where an actor is moving himself to
some location.
The verb ‘fly’ is represented by a PTRANS where an actor is transferring himself to a
location. It can use another act as its instrument, such as travelling by airplane.

3. PROPEL : application of physical force to an object (eg : push, pull)

For ex : The verb ‘push’ is a PROPEL act in which an actor pushes an object in a direction.
The verb ‘throw’ is represented by a PROPEL act where an object is moved
in a direction by an actor using a MOVE action whose instrument is the hand.

4. MOVE : movement of a body part by its owner. (kick, throw, hit)

For ex : ‘kick’ is represented by a MOVE of the owner’s foot to PTRANS an object in a direction
by the owner of the foot.

5. GRASP : grasping of an object by an actor (catch, clutch)

For ex : ‘catch’ is represented by a GRASP of an object by an actor. The object was
PROPELed before the GRASP.

6. INGEST

7. MTRANS

8. MBUILD

9. EXPEL

10. SPEAK

11. ATTEND
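
The primitive ACTs above can be held in a small data structure. The following is only a rough Python sketch: the class name Conceptualization, its field names, and the helper functions give and take are assumptions made for illustration and are not part of Schank's notation. It shows how 'give' and 'take' both map to the same ATRANS act, differing only in who is the source and who is the recipient.

```python
from dataclasses import dataclass
from typing import Optional

# The eleven primitive ACTs listed above.
PRIMITIVE_ACTS = {
    "ATRANS", "PTRANS", "PROPEL", "MOVE", "GRASP", "INGEST",
    "MTRANS", "MBUILD", "EXPEL", "SPEAK", "ATTEND",
}


@dataclass(frozen=True)
class Conceptualization:
    """One CD event: an actor performs a primitive ACT on an object.

    Field names are illustrative assumptions, not standard CD notation.
    """
    actor: str
    act: str
    obj: str
    source: Optional[str] = None      # from whom or where the object comes
    recipient: Optional[str] = None   # to whom or where the object goes
    tense: Optional[str] = None       # e.g. "p" for past, as in Rule 1

    def __post_init__(self):
        # Reject anything that is not one of the primitive ACTs.
        if self.act not in PRIMITIVE_ACTS:
            raise ValueError(f"unknown primitive ACT: {self.act}")


def give(giver, obj, receiver, tense=None):
    """'give': ATRANS of an object from the actor to the recipient."""
    return Conceptualization(giver, "ATRANS", obj,
                             source=giver, recipient=receiver, tense=tense)


def take(taker, obj, owner, tense=None):
    """'take': ATRANS of an object from someone to the actor."""
    return Conceptualization(taker, "ATRANS", obj,
                             source=owner, recipient=taker, tense=tense)
```

For instance, give("John", "book", "Mary", tense="p") and take("Mary", "book", "John", tense="p") describe the same transfer of possession from two different actors' points of view.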

5. What are the advantages and disadvantages of Conceptual dependency?


6. What is Script structure? What are the main components of a script?
Ans :- Schank and Abelson introduced the concept of scripts in 1977. Scripts were built upon the CD
framework to handle story understanding by organizing episodes in a sequence or chain of
situations which can be anticipated using personal experience rather than semantic
categories.

Scripts are structures representing procedural knowledge by describing a set of stereotyped
events in a particular situation or context, which can be expected to follow one
another. A script consists of a set of slots, which may contain default values along with some
information about the type of values they might contain.

It looks similar to a frame, except that the values of the slots must be ordered and have more
specialized roles. The script structure allows an individual to make the inferences needed for
understanding by filling in missing information.

In real-world situations, events tend to occur in known patterns because of the causal
relationships between them, so scripts are a useful structure in such situations. A
number of computer programs have been developed to demonstrate the theory. Schank applied
his theoretical framework to story telling and the development of intelligent tutors. The classic
example of Schank's theory is the restaurant script.

Each script contains the following main components :-

Entry Conditions : Must be satisfied before events in the script can occur.

Results : Conditions that will be true after events in the script occur.

Props : Slots representing objects involved in the events.

Roles : Persons involved in the events.

Track : Specific variation on a more general pattern in the script. Different tracks may share many
components of the same script but not all.

Scenes : The sequence of events that occur. Events are represented in conceptual dependency
form.

7. Develop a script for a Play in Theater. (Hiral Patel – 21306A1072)


Ans :- Scripts are useful for describing stereotyped situations such as going to a theater
to see a play. To understand the concept better, let us develop a
script for going to a theater to see a play. This might involve the following scenes:

Going to theater
Buying ticket
Going inside hall and sitting on a seat
Watching play
Exiting from theater

Various components of the script for going to theater are described as follows:-

Entry Conditions:
P wants to see a play
P has money

Results:
P saw play
P has less money

Props:
Tickets; Seat; Play
Roles:
Person
Ticket distributor
Ticket checker

Complete Script for Event-Going to Theater :-

Script : Play in theater
Track : Play in theater

Props :
Tickets
Seat
Play

Roles :
Person (who wants to see a play) - P
Ticket distributor - TD
Ticket checker - TC

Entry Conditions:
P wants to see a play
P has money
Results:
P saw a play
P has less money
P is happy (optional if he liked the play)

Various Scenes
Scene 1: Going to theater
P PTRANS P into theater
P ATTEND eyes to ticket counter
Scene 2: Buying ticket
P PTRANS P to ticket counter
P MTRANS (need a ticket) to TD
TD ATRANS ticket to P

Scene 3: Going inside hall of theater and sitting on a seat
P PTRANS P into Hall of theater
TC ATTEND eyes on ticket POSS_by P
TC MTRANS (showed seat) to P
P PTRANS P to seat
P MOVE P to sitting position

Scene 4: Watching a play
P ATTEND eyes on play
P MBUILD (good moments) from play

Scene 5: Exiting
P PTRANS P out of Hall and theater.
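
The theater script above can also be written down as data. The following Python sketch is one possible encoding, not a standard one: the class Script, its field names, and the function fill_in are assumptions made for illustration. It shows how a script lets a program fill in scenes that a story never mentions: if a story says only that P went to the theater and watched the play, the script implies that P also bought a ticket and took a seat.

```python
from dataclasses import dataclass


@dataclass
class Script:
    """A script as a set of slots; scene order matters (illustrative layout)."""
    name: str
    track: str
    props: list
    roles: dict              # abbreviation -> role description
    entry_conditions: list
    results: list
    scenes: dict             # scene name -> ordered list of CD events


theater = Script(
    name="Play in theater",
    track="Play in theater",
    props=["Tickets", "Seat", "Play"],
    roles={"P": "Person", "TD": "Ticket distributor", "TC": "Ticket checker"},
    entry_conditions=["P wants to see a play", "P has money"],
    results=["P saw a play", "P has less money"],
    scenes={
        "Going to theater": [
            "P PTRANS P into theater",
            "P ATTEND eyes to ticket counter",
        ],
        "Buying ticket": [
            "P PTRANS P to ticket counter",
            "P MTRANS (need a ticket) to TD",
            "TD ATRANS ticket to P",
        ],
        "Going inside hall and sitting on a seat": [
            "P PTRANS P into Hall of theater",
            "TC ATTEND eyes on ticket POSS_by P",
            "TC MTRANS (showed seat) to P",
            "P PTRANS P to seat",
            "P MOVE P to sitting position",
        ],
        "Watching a play": [
            "P ATTEND eyes on play",
            "P MBUILD (good moments) from play",
        ],
        "Exiting": ["P PTRANS P out of Hall and theater"],
    },
)


def fill_in(script, mentioned):
    """Return the scenes a story skipped but the script implies happened.

    Every scene that comes before the last mentioned scene is assumed
    to have taken place, because scenes in a script are ordered.
    """
    order = list(script.scenes)          # dicts keep insertion order
    last = max(order.index(name) for name in mentioned)
    return [name for name in order[:last] if name not in mentioned]
```

Calling fill_in(theater, ["Going to theater", "Watching a play"]) returns the two skipped scenes, "Buying ticket" and "Going inside hall and sitting on a seat". A real story-understanding program would match sentences against the CD events inside each scene rather than against scene names.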

8. Develop a script for a person going to the bank to withdraw money.


9. What are the advantages and disadvantages of Script?
10. Write a short note on CYC theory.
11. What is Case Grammar? Explain with examples.

12. Write a short note on Semantic Web.


13. What is XML? Explain XML representation and schema for a book.
Ans :- Extensible Markup Language (XML): It is a general-purpose specification for creating
custom markup languages similar to HTML. XML tags are not predefined, and users may define
their own tags. It is designed to be self-descriptive. It is classified as an extensible language
because it allows its users to define their own elements. Its primary purpose is to facilitate the
sharing of structured data across different information systems, particularly via the Internet,
and it is used both to encode documents and to serialize data. XML was designed to carry data
across different information systems, not to display data as an HTML document does.

For example, if we have to represent "a circular from head to faculty for a meeting" in XML, it
will look like:

The circular is quite self-descriptive. It has sender and receiver information; it also has a
heading and a message body. Still, this XML document does not do anything by itself: it is only
information wrapped in user-defined tags. A piece of software must be written to send,
receive, or display it.
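
To make the point about software concrete, here is a small Python sketch that reads such a circular using the standard library module xml.etree.ElementTree. The tag names (circular, from, to, heading, body) and the text inside them are assumptions chosen to match the description above; they are not taken from the original example.

```python
import xml.etree.ElementTree as ET

# A hypothetical circular; the tag names are user-defined, which is the
# whole point of XML, so any other names would work equally well.
CIRCULAR_XML = """
<circular>
  <from>Head</from>
  <to>Faculty</to>
  <heading>Meeting</heading>
  <body>Placeholder text of the circular.</body>
</circular>
"""


def describe(xml_text):
    """Parse the document and return its child tags and their text."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    for tag, text in describe(CIRCULAR_XML).items():
        print(f"{tag}: {text}")
```

The XML only carries the data; it is the describe function (or any other program) that decides what to do with the sender, receiver, heading, and body.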

XML provides a surface syntax for structured documents, but imposes no semantic constraints
on the meaning of these documents.

XML Schema :- It is a language for describing the structure of XML documents, typically
expressed in terms of constraints on the structure and content of documents. XML Schemas
express shared vocabularies and allow machines to carry out rules made by people. They
provide a means for defining the structure, content, and semantics of XML documents in more
detail. XML Schema was approved as a W3C Recommendation in May 2001, and its second edition
was published in October 2004. The mechanism for associating an XML document with a schema
varies according to the schema language. The association may be achieved via markup within the
XML document itself, or via some external means. The purpose of an XML Schema is to define the
valid building blocks of an XML document, and it specifies the following things:

• elements and attributes that can appear in a document

• elements appearing as child elements, their order, and the number of child elements

• whether an element is empty or can include text

• types for elements and attributes, with default and fixed values for elements and attributes

Let us define a simple XML document which describes a book, and the corresponding XML
schema. Here dotted lines are to be replaced by actual data values such as name, title of the
book, etc.

To write a schema for this document, we could simply follow its structure and define each
element. Every XML schema starts with the schema element, and then each term is defined one by
one. The following XML schema is just to give an idea.

XML schema is basically similar to database schema where we define structure consisting of
attributes with their corresponding types.
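
As a rough illustration of a book document and its schema, the Python sketch below holds both as strings and checks that the children of the document appear in the order the schema declares. The element names (book, title, author, publisher, year) are assumptions, and the dots stand for actual data values. Note that this is not a real schema validator: Python's standard library has no XML Schema validation, so the code only reads the schema as ordinary XML and compares element names.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"   # XML Schema namespace

# Hypothetical book document; dots stand for actual data values.
BOOK_XML = """
<book>
  <title>...</title>
  <author>...</author>
  <publisher>...</publisher>
  <year>...</year>
</book>
"""

# Hypothetical schema: 'book' is a sequence of four string elements.
BOOK_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="publisher" type="xs:string"/>
        <xs:element name="year" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_children(xsd_text, element_name):
    """List the child element names the schema declares, in order."""
    schema = ET.fromstring(xsd_text.strip())
    for element in schema.findall(f"{XS}element"):
        if element.get("name") == element_name:
            sequence = element.find(f"{XS}complexType/{XS}sequence")
            return [c.get("name") for c in sequence.findall(f"{XS}element")]
    raise KeyError(element_name)


def follows_sequence(xml_text, xsd_text):
    """Check only the names and order of the root's children."""
    root = ET.fromstring(xml_text.strip())
    return [c.tag for c in root] == declared_children(xsd_text, root.tag)
```

Here follows_sequence(BOOK_XML, BOOK_XSD) is true, while a document whose author element comes before its title would be rejected. Checking types, default values, and fixed values would need a full schema processor.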

14. Explain Ontology with an example.


Ans :- Ontology is an explicit specification of a conceptualization for the objects, the concepts,
and other entities in a domain of interest, and the relationships that hold among them. Simply, we
can say that a conceptualization is an abstract, simplified view of the world that we wish to
represent for some purpose, and it is somewhat similar to a frame-based system. Ontology defines
the terms used to describe and represent an area of knowledge, and is used by people, databases,
and applications that need to share domain information. A domain is just a specific subject area or
area of knowledge, like educational systems, medicine, financial management, real estate, etc.
Ontology includes computer-usable definitions of basic concepts in the domain and the
relationships among them. It allows sharing of a common understanding of the structure of
information among people or software systems. It enables reuse of domain knowledge that is
separate from the operational knowledge.

Consider an example of designing ontology for “EDUCATIONAL INSTITUTIONS”.

In the graphical representation given in Fig. 15.1, the class nodes are represented by oval-shaped
boxes, whereas the instances are shown as rectangular boxes at the leaf nodes.
Each rectangular box contains the list of all instances of that class. Ontology is represented
using an ontology language such as RDF Schema, as explained earlier, or OWL (Web Ontology
Language). Let us briefly describe OWL to give you a feel of an ontology encoding language.

Consider the ontology example of Educational Institution given earlier, where a class 'Faculty' with a
property 'Instructor of' with value 'Student' is defined. We would add to the RDF document a
specification that
All the properties of class 'Faculty' will be directly inherited by the given resources. For
example, if every 'Faculty' has a Designation, then the object identified by 'Cris' must also
have a designation. Furthermore, suppose that the ontology defines a property called "Student of" as
the inverse of the "Instructor of" property. Then, without any extra effort, a Semantic Web engine
would be able to infer from the inverse property that "Mike is a Student of Cris", even though we
have only stated that "Cris is the Instructor of Mike".
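
The inverse-property inference described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how a Semantic Web engine is actually implemented: facts are stored as (subject, property, object) triples, and the property names instructor_of and student_of, the table INVERSE_OF, and the function infer_inverses are all assumed names.

```python
# Each property is mapped to its declared inverse (assumed names).
INVERSE_OF = {
    "instructor_of": "student_of",
    "student_of": "instructor_of",
}


def infer_inverses(triples):
    """Return the given triples plus every triple implied by an inverse."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        if prop in INVERSE_OF:
            # (x, p, y) together with 'q is the inverse of p' gives (y, q, x).
            inferred.add((obj, INVERSE_OF[prop], subject))
    return inferred


# Only these two facts are stated explicitly.
facts = {
    ("Cris", "type", "Faculty"),
    ("Cris", "instructor_of", "Mike"),
}
```

After calling infer_inverses(facts), the result also contains ("Mike", "student_of", "Cris"), although that triple was never stated.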

There are two ways of developing an ontology: either hand-code the graphical structure of the
ontology in a language such as OWL, or use some standard tool, if available. The PROTÉGÉ
tool helps in creating the OWL format of the ontology structure. We will call such an
ontology an OWL ontology. The sample code generated by PROTÉGÉ tool for Educational
Institution ontology given in Fig. 15.1

15. Write a short note on Natural Language Processing.


Ans :- Natural Language Processing (NLP) is an important field of AI, since both
understanding and generation of natural languages require a lot of intelligence. Understanding
basically refers to the process of mapping from a given input form (text or speech) into a more
immediately useful form. It is represented as a pair (source language, target representation)
along with a mapping between the elements of each.

Target representation varies from one application to other. For example, in the case of
translation, target representation refers to target language; on the other hand, in case of
paragraph comprehension, target representation may refer to some form of semantic
representation using which one can answer various queries regarding the source text.

Basically, NLP involves processing of written text using computer models at the lexical, syntactic,
and semantic levels. It also includes processing of spoken language, which uses all the processing
techniques required for written text plus knowledge about phonology, with added information
to handle ambiguities that arise in speech.

We will concentrate on written language processing techniques. It includes syntactic structures
and various grammars with parsing methodologies, semantic models, definite clause grammar,
case grammar and its application in machine translation. At the end of the chapter, we have
briefly discussed the universal networking language (UNL) which is a language-independent
generic model used for storing target expression.

16. Explain Sentence Analysis Phases in detail.


Ans :- A sentence is usually analyzed through different phases such as morphological (Lexical)
analysis, syntactic analysis, semantic analysis, discourse, and pragmatic analysis. Each of these
phases is briefly explained below:

Morphological Analysis: The morphological analysis process (MAP) is carried out first on the
natural language sentence; it tries to extract the root word from a declined or
inflected form of a word by removing suffixes and prefixes. For example, it gets the root
'push' from declined forms such as pushed, pushing, pushes, etc. In addition to this, it also
assigns appropriate syntactic categories such as noun, verb, adjective, etc., to all words in the
sentence.
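
A toy version of this step can be written in Python. The sketch below is deliberately simple and makes several assumptions: it strips only four common English suffixes, ignores prefixes and irregular forms, and looks words up in a tiny hand-made lexicon. The names SUFFIXES, LEXICON, root_of, and analyse are illustrative.

```python
# Suffixes tried in this order; longer ones come before shorter ones.
SUFFIXES = ("ing", "ed", "es", "s")

# A tiny hand-made lexicon: root word -> syntactic category.
LEXICON = {"push": "verb", "girl": "noun", "cute": "adjective"}


def root_of(word, lexicon=LEXICON):
    """Strip a suffix if what remains is a known root word."""
    word = word.lower()
    if word in lexicon:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if stem in lexicon:
                return stem
    return word   # unknown word: return it unchanged


def analyse(word, lexicon=LEXICON):
    """Return (root, syntactic category) for a single word."""
    root = root_of(word, lexicon)
    return root, lexicon.get(root, "unknown")
```

With this lexicon, pushed, pushing, and pushes all give ("push", "verb"). A practical morphological analyser also needs rules for spelling changes and irregular forms, which this sketch does not attempt.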

Syntactic Analysis: This method of analysis uses the result of MAP to build a structural
description of the sentence based on grammatical rules. This process is called parsing. It uses a
declarative representation (called a grammar) of syntactic facts about the language and a
procedure (called a parser) that compares the grammar against the input sentence to produce parse
structures. Creating a parse tree is the first step towards understanding a sentence.

Semantic Analysis: It creates a semantic structure by ascribing a literal meaning to a sentence
using the parse structure obtained in the syntactic phase. It maps individual words onto corresponding
objects in the knowledge base and combines the words with each other using semantic rules.
Our aim here is to produce meaning in some suitable representation scheme by using any of the KR
methods described in the earlier chapter. The main purpose of semantic processing is the creation of
a target representation of the meaning of the sentence.

If a natural language is used as an interface to another system (a program such as a database),
then the target representation must be chosen so that it is understood by that system. So the design
of the target representation is driven by the program using it.

Pragmatics Analysis: It refers to the intended meaning of a sentence used in different contexts. The
context affects the interpretation of the sentence. For example, the sentence "John saw Mike
in the garden with a cat" has two interpretations. The first is that John had a cat
with him when he saw Mike in the garden. The other is that John saw Mike, who had a
cat, in the garden. Looking at the context, one can settle on a unique interpretation. If we
know that John keeps pets, then the first interpretation is more suitable.

Discourse Analysis: It refers to conversation between two or more individuals, where the
interpretation of a spoken sentence is based on the belief set of the people involved at the
time of the conversation.

A large number of computational models for syntactic and semantic analysis have been
developed, but very little work has been done on pragmatic and discourse analysis
models because of the complexities involved in understanding them. Linguists have done
research and developed theories, but these cannot be used directly for computational
purposes.

17. Write a short note on Grammars and Parsers.


Ans :- In general, the term parsing refers to the process of analyzing an input sequence in order
to determine its structure with respect to a given grammar. In the context of natural language
processing, parsing means analyzing a sentence syntactically to assign syntactic tags (subject,
verb, object, and so on), to provide constituent structure (noun phrase, verb phrase, etc.), or to
characterize the syntactic relations between two words (i.e., dependency representations).
Parsing is a fundamental requirement for many natural language processing applications such
as machine translation, information extraction, natural language interfaces, speech recognition,
etc. Parsing techniques are broadly divided into rule-based parsing and statistical parsing. The
more commonly used technique of the two is rule-based parsing, where the syntactic structure of
the language is provided in the form of linguistic rules, which can be coded as production rules that
are similar to context-free rules.

Production rules are defined using non-terminal symbols (symbols to be further expanded) and
terminal symbols (symbols found directly in the language). Statistical parsing methods require large
corpora, and linguistic knowledge is represented as statistical parameters or probabilities,
which may be used to parse a given sentence (Jurafsky D. & Martin J.H., 2000).

In this chapter, we will concentrate on rule-based parsers. Once the grammar rules are defined,
a sentence is parsed using the grammar and a tree-like structure is built, if the sentence is
syntactically correct. This tree is called a parse tree. Parsing can be done using two methods:
top-down parsing and bottom-up parsing. A parser can use either of these parsing methods.
Each one has its merits and demerits (Allen J., 1994).

Bottom-up parsing: In bottom-up parsing, we start with the words in the sentence and apply
grammar rules in the backward direction until a single tree is produced whose root matches
the start symbol.

Top-down parsing: In top-down parsing, we start with the start symbol and apply grammar
rules in the forward direction until the terminal symbols of the parse tree correspond to the
words in the sentence.

Consider a simple context-free-like grammar for the English language, as given in Table 16.1. The
conventions used in this chapter are as follows:

• The symbol → is used for 'defined as'

• Vertical bar | for alternative definitions

• The symbol S for Sentence,

• NP for Noun phrase, and VP for Verb Phrase.
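
The top-down method can be sketched as a small recursive-descent parser in Python. Table 16.1 is not reproduced in these notes, so the grammar and lexicon below are assumptions made for illustration (S → NP VP, NP → Det Adj N | Det N, VP → V NP | V), as are the function names. The parser starts from the start symbol S, expands rules in the forward direction, and backtracks when an alternative does not match the words.

```python
# Assumed toy grammar: each non-terminal maps to its alternative right sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Adj", "N"], ["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Assumed lexicon: word -> syntactic category.
LEXICON = {
    "the": "Det", "an": "Det", "cute": "Adj",
    "girl": "N", "apple": "N", "ate": "V",
}


def match_sequence(symbols, words, pos):
    """Yield (children, next_pos) for each way to match a list of symbols."""
    if not symbols:
        yield [], pos
        return
    for tree, mid in expand(symbols[0], words, pos):
        for rest, end in match_sequence(symbols[1:], words, mid):
            yield [tree] + rest, end


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can start at `pos`."""
    if symbol not in GRAMMAR:
        # Word category: it must match the category of the next word.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    for right_side in GRAMMAR[symbol]:       # try each alternative in turn
        for children, end in match_sequence(right_side, words, pos):
            yield (symbol,) + tuple(children), end


def parse_top_down(sentence):
    """Return the first parse tree covering the whole sentence, or None."""
    words = sentence.lower().split()
    for tree, end in expand("S", words, 0):
        if end == len(words):
            return tree
    return None
```

For the sentence "The cute girl ate an apple", parse_top_down returns a tree whose root is S with an NP subtree for "the cute girl" and a VP subtree for "ate an apple". A word order the grammar does not allow gives None.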

18. What is Bottom-up parsing? Give an example.


Ans :- Let us consider the sentence "The cute girl ate an apple" and see its parse
structure (Fig. 16.1). The direction of the arrows shows the way in which the parse structure is
built. Here, the syntactic category of each word in the sentence is assigned before starting the
actual parsing. The application of rules is quite clear from the parse structure and needs no
explanation.
A look-ahead parser is basically a bottom-up parser which, when deciding how to
interpret an input word, is allowed to look at the next k input items before choosing
an appropriate rule.

Figure 16.1 Parse Structure (Bottom-Up Parsing)
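
The bottom-up process for "The cute girl ate an apple" can be sketched in Python as a simple shift-reduce loop. The grammar and lexicon are assumptions made for illustration, since Table 16.1 is not reproduced here. The parser is greedy: after each word is shifted onto a stack, it reduces whenever the top of the stack matches the right side of a rule. This works for the small grammar below, but a general grammar needs backtracking or look-ahead to choose between competing rules.

```python
# Assumed toy rules, written as (left side, right side).
RULES = [
    ("NP", ("Det", "Adj", "N")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("S", ("NP", "VP")),
]

# Assumed lexicon: word -> syntactic category, assigned before parsing.
LEXICON = {
    "the": "Det", "an": "Det", "cute": "Adj",
    "girl": "N", "apple": "N", "ate": "V",
}


def parse_bottom_up(sentence):
    """Greedy shift-reduce parse; returns the S tree or None."""
    stack = []   # each item is a tree: (label, child, child, ...)
    for word in sentence.lower().split():
        # Shift: push the word together with its syntactic category.
        stack.append((LEXICON.get(word), word))
        # Reduce: apply rules backwards while the top of the stack matches.
        reduced = True
        while reduced:
            reduced = False
            for left, right in RULES:
                n = len(right)
                top_labels = tuple(tree[0] for tree in stack[-n:])
                if len(stack) >= n and top_labels == right:
                    children = stack[-n:]
                    del stack[-n:]
                    stack.append((left,) + tuple(children))
                    reduced = True
                    break
    if len(stack) == 1 and stack[0][0] == "S":
        return stack[0]
    return None
```

On the example sentence the reductions happen in this order: Det Adj N becomes NP, then Det N becomes NP, then V NP becomes VP, and finally NP VP becomes S, which is the start symbol.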

19. What is Top-down parsing? Give an example.

20. What are the different types of Parsers?


21. What is Semantic Analysis?
22. What is a universal networking language?

23. Write a short note on UNL dictionary.
