Turing test in AI
“A test to check whether a machine can think like a human or not”
- Player A is a computer, Player B is a human, and Player C is an interrogator. The interrogator knows that one of them is a machine, but must identify which one on the basis of questions and their responses.
- The test result does not depend on each answer being correct, but only on how close the responses are to human answers; the conversation between all players takes place via keyboard and screen.
- In this game, if the interrogator is unable to identify the computer, then the computer passes the test successfully and is said to be intelligent enough to think like a human.
- The questions and answers can be like:
Interrogator: Are you a computer?
Player A (Computer): No
Interrogator: Multiply two large numbers such as (256896489*456725896)
Player A: Pauses for a long time and gives a wrong answer.
- Capabilities the computer needs to pass the Turing test are:
o Natural language processing: to communicate successfully in English
o Knowledge representation: to store knowledge about the real world
o Automated reasoning: to use the stored information to answer questions and draw conclusions from it
o Machine learning: to make new predictions by finding patterns
AI Underlying Assumptions
- Newell and Simon presented the Physical Symbol System Hypothesis, which lies at the heart of AI research.
- A physical symbol system consists of a set of entities called symbols, which can occur as parts of another kind of entity called a symbol structure or expression.
- A symbol structure is a collection of symbols connected in some physical way.
- A physical symbol system is a machine that produces an evolving collection of symbol structures.
- The Physical Symbol System Hypothesis states that it is possible to build programs that can perform intelligent tasks currently performed by people.
- Examples of physical symbol systems:
1. A digital computer :
o Symbols are 0’s and 1’s
o Processes are operations of the CPU
2. Chess:
o Symbols are Pieces
o Processes are legal chess moves
o Expressions are positions of all pieces on the chess board
3. Formal Logic
o Symbols are Logical operators
o Processes are rules of logical deduction
o Expressions are statements in formal logic that can be true or false
4. Algebra
o Symbols are +, -, x, y, etc.
o Processes are the rules of algebra
o Expressions are equations
AI Techniques
There are three important AI techniques:
1. Search
- Provides a way of solving problems for which no direct approach is available.
- It also provides a framework into which any direct techniques that are available can be embedded.
2. Use of knowledge
- Provides a way of solving complex problems by exploiting the structure of the objects that are involved
3. Abstraction
- Provides a way of separating important features and variations from many unimportant ones that would
otherwise overwhelm any process
AI Problems
- Humans learn mundane tasks from birth; they learn formal tasks and expert tasks later.
- Much of the early work in AI therefore focused on the formal task domain, such as game playing and theorem proving, and less on the mundane task domain.
- As AI research progressed, techniques for handling large amounts of world knowledge were developed, and the tasks shifted toward perception, natural language understanding, and problem solving in specialized domains.
AI Task Domains
Classification of AI
1. Weak AI: The study and design of machines that perform intelligent tasks.
- Not concerned with how tasks are performed, mostly concerned with performance and efficiency.
Eg: to make a flying machine, use logic and physics; don't mimic a bird.
2. Strong AI: The study and design of machines that simulate the human mind to perform intelligent tasks.
- Borrow many ideas from psychology, neuroscience.
- Goal is to perform tasks the way human might do them.
- Assumes that the physical symbol hypothesis holds true.
3. Evolutionary AI: The study and design of machines that simulate simple creatures
- For example, ants, bees, etc.
ARTIFICIAL INTELLIGENCE | MACHINE LEARNING
AI leads to intelligence or wisdom. | ML leads to knowledge.
The aim is to increase the chance of success, not accuracy. | The aim is to increase accuracy; it does not care about success.
The goal is to simulate natural intelligence to solve complex problems. | The goal is to learn from data to maximize performance on the task.
AI has a very broad variety of applications. | The scope of machine learning is constrained.
AI is a broader family consisting of ML and DL as its components. | ML is a subset of AI.
It involves developing a system that mimics humans to solve problems. | It involves creating self-learning algorithms.
AI will go for finding the optimal solution. | ML will go for a solution whether it is optimal or not.
AI can work with structured, semi-structured, and unstructured data. | ML can work with only structured and semi-structured data.
AI's key uses include: | ML's key uses include:
- Siri, customer service via chatbots | - Facebook's automatic friend suggestions
- Expert systems | - Google's search algorithms
- Machine translation like Google Translate | - Banking fraud analysis
- Intelligent humanoid robots such as Sophia | - Stock price forecasts
Applications of AI
1. Gaming
- AI plays a vital role in strategic games such as chess, poker, tic-tac-toe, etc., where the machine can think of a large number of possible positions based on heuristic knowledge.
2. Natural Language Processing
- It is possible to interact with the computer that understands natural language spoken by humans.
3. Expert Systems
- There are some applications which integrate machine, software, and special information to provide
explanation and advice to the users.
4. Computer Vision Systems
- These systems understand, interpret, and comprehend visual input on the computer.
5. Speech Recognition
- Some intelligent systems are capable of hearing and comprehending language in terms of sentences and their meanings while a human talks to them. They can handle different accents, slang words, background noise, changes in the human's voice, etc.
6. Handwriting Recognition
- The handwriting recognition software reads the text written on paper by a pen or on screen by a stylus. It
can recognize the shapes of the letters and convert it into editable text.
7. Intelligent Robots
- Robots are able to perform the tasks given by a human. They have sensors to detect physical data from the
real world such as light, heat, temperature. They have efficient processors and huge memory, to exhibit
intelligence. In addition, they are capable of learning from their mistakes and they can adapt to the new
environment.
Problem Solving
For solving any type of problem in the real world, one needs a formal description of the problem.
1. What is the explicit goal of the problem
2. What is the Implicit criteria for success
3. What is the Initial Situation
4. Ability to Perform
- Problem solving is a process of generating solutions from the observed data.
- Problem Solving means Searching for a goal state.
Control Strategies
Control strategies help us decide which rule to apply next during the process of searching for a solution to a
problem.
1. Forward search :Search proceeds forward from the initial state towards a solution (goal).
2. Backward search :Search proceeds backward from a goal state toward either a solvable subproblem or the
initial state.
3. Both forward and backward search :Mixture of both forward and backward search.
4. Systematic search (Blind search OR Uninformed search): Has no additional information about states beyond the problem definition. The entire search space is examined for the solution, so blind searches are inefficient in most cases.
5. Heuristic search (Informed search OR Directed search control strategy): Some information about the problem space is used to compute a preference among the various possibilities for expansion, so the search can decide whether one non-goal state is more promising than another. A heuristic search might not always find the best solution, but it is guaranteed to find a good solution in reasonable time. Informed search methods use problem-specific knowledge, so they may be more efficient.
Production System
- Production systems provide appropriate structures for performing and describing search processes.
2. Production rules
The productions are rules of the form C -> A, where the LHS is known as the condition and the RHS is known as the action. These rules are interpreted as: given condition C, take action A.
3. Control system
The control system checks the applicability of a rule. It helps decide which rule should be applied and terminates the process when the system gives the correct output.
7 Problem Characteristics
1. Is the problem decomposable?
2. Can solution steps be ignored or undone?
3. Is the problem universe predictable?
4. Is a good solution absolute or relative?
5. Is the solution a state or a path?
6. What is the role of knowledge?
7. Does the task require human interaction?
Generate and Test (British Museum Search Algorithm)
- Generate and Test is a heuristic search technique based on depth-first search with backtracking. If carried out systematically, it is guaranteed to find a solution whenever one exists, since every possible generated solution is eventually checked.
Algorithm :
1.Generate a possible solution.
2.Test to see if this is the expected solution.
3.If the solution has been found quit else go to step 1.
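The three steps above can be sketched in Python; the sorting task below is just a hypothetical stand-in for a real problem's solution generator and goal test.

```python
import itertools

def generate_and_test(candidates, is_goal):
    """Step 1: generate a possible solution; step 2: test it;
    step 3: if it passes, quit, else generate the next candidate."""
    for candidate in candidates:
        if is_goal(candidate):
            return candidate
    return None  # search space exhausted without a solution

# Toy problem: find an ordering of [3, 1, 2] that is sorted.
solution = generate_and_test(itertools.permutations([3, 1, 2]),
                             lambda p: list(p) == sorted(p))
# solution is (1, 2, 3)
```

Note how this is exhaustive: every permutation may be generated before the test succeeds, which is exactly why the technique becomes impractical for large search spaces.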
Limitations :
- Inefficient for problems with a large search space.
- Acceptable only for simple problems like the 4-Cube problem.
o Characteristics of Depth-Limited Search (DFS-L):
1. Complete – only if the goal node lies above the depth limit
2. Optimal – not optimal in general, even when the goal lies above the depth limit
3. Time Complexity – O(b^l), where l = depth limit, b = branching factor
4. Space Complexity – O(bl)
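A minimal sketch of depth-limited DFS over a hypothetical adjacency-list graph (the graph and node names are made up for illustration):

```python
def depth_limited_search(graph, node, goal, limit):
    """DFS that refuses to descend below the given depth limit."""
    if node == goal:
        return [node]
    if limit == 0:
        return None  # cutoff reached
    for child in graph.get(node, []):
        path = depth_limited_search(graph, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': ['F'], 'F': []}
# Goal 'F' sits at depth 3: a limit of 2 fails, a limit of 3 succeeds.
```

This illustrates the completeness condition above: the search succeeds only when the goal lies within the depth limit.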
IV. Iterative Deepening Search (IDS) OR IDDFS (Iterative Deepening Depth-First Search)
- It is a search algorithm that runs multiple DFS searches with increasing depth limits.
- It is iterative in nature.
- This algorithm performs depth-first search up to a certain "depth limit", and it keeps increasing the
depth limit after each iteration until the goal node is found.
- Useful when the search space is large, and depth of the goal node is unknown.
o Characteristics of IDS :
1. Complete – This algorithm is complete if the branching factor is finite.
2. Optimal – If path cost is a non- decreasing function of the depth of the node.
3. Time Complexity – O(b^d), where d = depth of the shallowest goal, b = branching factor
4. Space Complexity – O(bd)
o Advantages: It combines the benefits of Breadth-first search's fast search and depth-first search's
memory efficiency
o Disadvantages : It repeats all the work of the previous phase.
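The idea can be sketched by wrapping a depth-limited DFS in a loop over increasing limits (graph and node names are illustrative):

```python
def dls(graph, node, goal, limit):
    """Depth-limited DFS returning a path or None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in graph.get(node, []):
        path = dls(graph, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iddfs(graph, start, goal, max_depth=20):
    """Re-run DLS with limits 0, 1, 2, ... until the goal is found."""
    for limit in range(max_depth + 1):
        path = dls(graph, start, goal, limit)
        if path is not None:
            return path
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': [], 'E': []}
```

Each iteration repeats the work of the previous one, but since most nodes of a tree live at the deepest level, the repeated work adds only a constant factor.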
V. Bi-directional search
- The bidirectional search algorithm runs two simultaneous searches, one forward from the initial state (forward search) and the other backward from the goal node (backward search), to find the goal node.
- The search stops when the two frontiers intersect each other.
- Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
- It is useful for problems that have a single start state and a single goal state.
o Characteristics of Bi-directional search:
1. Complete – Bidirectional search is complete if we use BFS in both searches.
2. Optimal – if BFS is used in both directions and all step costs are identical.
3. Time Complexity – O(b^(d/2)), where d = depth, b = branching factor
4. Space Complexity – O(b^(d/2))
o Advantages: Bidirectional search is fast and it requires less memory
o Disadvantages: Implementation of the bidirectional search tree is difficult. One should know the goal state
in advance.
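A sketch of bidirectional BFS, assuming an undirected graph so the backward search can follow edges in reverse (the graph below is hypothetical):

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Grow a BFS frontier from each end; stop when the frontiers meet."""
    if start == goal:
        return [start]
    fwd, bwd = {start: None}, {goal: None}   # node -> parent maps
    qf, qb = deque([start]), deque([goal])
    while qf or qb:
        for q, seen, other in ((qf, fwd, bwd), (qb, bwd, fwd)):
            if not q:
                continue
            node = q.popleft()
            for nb in graph.get(node, []):
                if nb in seen:
                    continue
                seen[nb] = node
                if nb in other:              # the two searches intersect
                    return _stitch(fwd, bwd, nb)
                q.append(nb)
    return None

def _stitch(fwd, bwd, meet):
    """Join the two half-paths at the meeting node."""
    path, n = [], meet
    while n is not None:                     # walk back to the start
        path.append(n)
        n = fwd[n]
    path.reverse()
    n = bwd[meet]
    while n is not None:                     # walk on to the goal
        path.append(n)
        n = bwd[n]
    return path

# Hypothetical undirected graph (edges listed in both directions).
graph = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B', 'D'], 'D': ['C']}
```

The need to know the goal state in advance, and the bookkeeping of stitching the two half-paths together, are exactly the implementation difficulties mentioned above.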
Heuristic Function
- A heuristic is a function used in informed search to find the most promising path.
- The heuristic function might not always give the best solution but it guarantees to find a good solution in
reasonable time.
- The heuristic function estimates how close a state is to the goal, i.e. it estimates the cost of an optimal path between that state and the goal state.
- It is represented by h(n), and its value is always non-negative.
- Admissibility of the heuristic function: h(n) <= h*(n)
- Here h(n) is the estimated heuristic cost and h*(n) is the true cost of an optimal path to the goal. An admissible heuristic never overestimates the true cost.
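For the 8 puzzle, a classic admissible heuristic is the Manhattan distance: the summed grid distance of every tile from its home square. It never overestimates h*, since each move shifts exactly one tile by one square.

```python
def manhattan(state, goal):
    """h(n) for the 8 puzzle: total tile distance from home positions.

    States are 9-tuples read row by row; 0 marks the blank and is skipped."""
    h = 0
    for tile in range(1, 9):
        i, j = state.index(tile), goal.index(tile)
        h += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return h

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
# One move away from the goal: only tile 8 is displaced by one square.
almost = (1, 2, 3, 4, 5, 6, 7, 0, 8)
```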
b. Hill Climbing
- Hill climbing algorithm is a local search algorithm which continuously moves in the direction of increasing
elevation to find the peak of the mountain or best solution to the problem.
- It is also called greedy local search as it only examines its immediate neighboring node that improves
the current state and does not look beyond.
- A node of hill climbing algorithm has two components which are state and value.
◼ Steepest-Ascent hill-climbing:
- The steepest-Ascent algorithm is a variation of simple hill climbing algorithm.
- This algorithm examines all the neighboring nodes of the current state and selects one neighbor node which is closest to
the goal state.
o Advantage: Usually reaches a better solution than simple hill climbing, since it considers all neighbors before moving.
o Disadvantage: More time consuming, as it examines multiple neighbors at each step.
ALGORITHM
o Step 1: Evaluate the initial state, if it is goal state then return success and stop, else make current state as initial state.
o Step 2: Loop until a solution is found or the current state does not change.
1. Let SUCC be a state such that any successor of the current state will be better than it.
2. For each operator that applies to the current state:
I. Apply the new operator and generate a new state.
II. Evaluate the new state.
III. If it is goal state, then return it and quit, else compare it to the SUCC.
IV. If it is better than SUCC, then set new state as SUCC.
V. If the SUCC is better than the current state, then set current state to SUCC.
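The loop above can be sketched for a toy one-dimensional landscape (the quadratic value function and unit-step neighbors are purely illustrative):

```python
def steepest_ascent(start, neighbors, value):
    """Move to the best neighbor until no neighbor beats the current state."""
    current = start
    while True:
        best = max(neighbors(current), key=value)   # the SUCC of the algorithm
        if value(best) <= value(current):
            return current                          # local maximum reached
        current = best

# Toy landscape: climb toward the peak of -(x - 3)^2 in unit steps.
peak = steepest_ascent(0,
                       neighbors=lambda x: [x - 1, x + 1],
                       value=lambda x: -(x - 3) ** 2)
```

On this smooth landscape the local maximum is also the global one; on a landscape with plateaus or ridges the same loop would get stuck, which motivates the variants below.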
◼ Stochastic hill Climbing:
- Stochastic hill climbing does not examine all its neighboring nodes instead this search algorithm selects one neighbor node
at random and decides whether to choose it as a current state or examine another neighbor node.
- Plateau: a flat area of the search space in which a whole set of neighboring states have the same value. To overcome a plateau: randomly select a state far away from the current state, i.e. make a big jump.
- Ridge: a special kind of local maximum. It is an area of the search space that is higher than the surrounding areas, but that itself has a slope and cannot be climbed in a single move. To overcome a ridge: apply two or more rules before doing the test, which implies moving in several directions at once.
◼ Simulated Annealing
- Annealing is a thermal process for obtaining low energy states of a solid in a heat bath.
- The process contains 2 steps:
1. Increase the temperature of the heat bath to a maximum value at which the solid melts.
2. Carefully decrease the temperature of the heat bath until the particles arrange themselves in the ground state of the solid. The ground state is obtained only if the maximum temperature is high enough and the cooling is done slowly.
- The rate at which the system is cooled is called the annealing schedule.
1. If cooling occurs too rapidly, only a local minimum is obtained.
2. If the schedule is slower, the global minimum is reached.
o The probability of accepting a worse state is a function of both the temperature of the system and the change in the cost
function. As the temperature decreases, the probability of accepting worse moves decreases. If t=0, no worse moves are
accepted (i.e. hill climbing).
- The same process is used in simulated annealing, except that the algorithm picks a random move instead of the best move. If the random move improves the state, it is accepted. Otherwise, the algorithm accepts the worse (downhill) move with a probability less than 1, or keeps the current state and tries another move.
- In this algorithm we have valley descending rather than hill climbing.
- Simulated annealing avoids climbing false foothills and it avoids the danger of being caught on plateau or ridge.
ALGORITHM
o Step 1: Evaluate the initial state, if it is goal state then return success and stop, else make current state as initial state.
o Step 2: Initialize BEST-SO-FAR to the current state and initialize T according to the annealing schedule.
o Step 3: Loop until a solution is found or no new operator is left to apply.
1. For each operator that applies to the current state:
I. Apply the new operator and generate a new state.
II. Evaluate the new state. 𝛿𝐸 = (value of current state) – (value of new state)
III. If it is the goal state, then return it and quit.
IV. If it is better than the current state, then make it the current state and set BEST-SO-FAR to the new state.
V. If it is not better than the current state, then make it the current state with probability P' = e^(−𝛿𝐸/T).
VI. Revise T as necessary according to the annealing schedule.
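A compact sketch of the loop, using the standard acceptance probability e^(−𝛿E/T) for worse moves and a geometric annealing schedule (all the numbers and the quadratic landscape are illustrative):

```python
import math
import random

def simulated_annealing(start, neighbor, value, t0=10.0, cooling=0.95,
                        t_min=1e-3, seed=0):
    """Random moves; worse ones accepted with probability e^(-dE/T)."""
    rng = random.Random(seed)
    current = best = start
    t = t0
    while t > t_min:
        nxt = neighbor(current, rng)
        gain = value(nxt) - value(current)    # gain = -dE of the algorithm
        if gain > 0 or rng.random() < math.exp(gain / t):
            current = nxt                     # accept (possibly worse) move
            if value(current) > value(best):
                best = current                # track BEST-SO-FAR
        t *= cooling                          # annealing schedule
    return best

# Toy task: maximize -(x - 7)^2 over the integers (illustrative only).
best = simulated_annealing(0,
                           neighbor=lambda x, r: x + r.choice([-1, 1]),
                           value=lambda x: -(x - 7) ** 2)
```

Early on (large T) the acceptance probability is close to 1, so the search roams freely; as T shrinks, e^(−𝛿E/T) vanishes for worse moves and the behavior degenerates into plain hill climbing.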
o Characteristics of A*:
- Complete: if the branching factor is finite and every action has a fixed (positive) cost.
- Optimal: if the heuristic h(n) is admissible (h(n) never overestimates the cost to reach the goal); for A* graph search, h(n) must also be consistent.
- Time Complexity and Space Complexity: O(b^d), where b is the branching factor and d the depth of the solution.
o Advantages of A*
- It can solve very complex problems. It is optimal and complete.
o Disadvantages of A*
- It does not always produce the shortest path, as it relies on heuristics and approximation.
- It has some complexity issues.
- The main drawback is memory requirement as it keeps all generated nodes in the memory
o Admissibility of A* (Effectivity of A*)
Greedy vs. A*
- Greedy best-first search expands nodes with minimal f(n)=h(n). It is not optimal, but is efficient.
- A* search expands nodes with minimal f(n)=g(n)+h(n). A* is complete and optimal.
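A minimal A* sketch over a hypothetical weighted graph; the h table below is an assumed admissible heuristic, not taken from the notes.

```python
import heapq

def a_star(graph, h, start, goal):
    """Always expand the frontier node with minimal f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]    # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for child, cost in graph.get(node, []):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2                # cheaper route to child
                heapq.heappush(frontier,
                               (g2 + h[child], g2, child, path + [child]))
    return None, float('inf')

# Hypothetical graph: edge lists of (neighbor, cost); h is admissible.
graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 2), ('G', 12)],
         'B': [('G', 5)], 'G': []}
h = {'S': 7, 'A': 6, 'B': 2, 'G': 0}
path, cost = a_star(graph, h, 'S', 'G')
```

Setting h to zero everywhere turns this into uniform-cost search; dropping the g term turns it into the greedy best-first search described above.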
A* vs AO*
A* | AO*
A* is an OR-graph algorithm. | AO* is an AND-OR graph algorithm.
Requires more memory compared to AO*. | Requires less memory.
Can go into an infinite loop. | Doesn't go into an infinite loop.
Stops when it finds the optimal solution. | Stops when it finds any solution.
Less efficient. | More efficient.
Means-Ends Analysis
- Most search strategies reason either forward or backward; often a mixture of the two directions is appropriate for solving complex and large problems.
- Such a mixed strategy makes it possible to solve the major parts of a problem first and then solve the smaller problems that arise when combining the parts together.
- Such a technique is called Means-Ends Analysis.
- The means-ends analysis process centers around finding the difference between the current state and the goal state.
- The means-ends analysis process can be applied recursively to a problem.
- It is a strategy to control search in problem solving.
• Steps :
1. First, evaluate the difference between Initial State and final State.
2. Select the various operators which can be applied for each difference.
3. Apply the operator at each difference, which reduces the difference between the current state and
goal state.
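The three steps can be illustrated on a toy numeric problem where the "difference" is simply goal minus state and the operators are fixed increments (the problem is entirely hypothetical):

```python
def means_ends(state, goal, operators):
    """Repeatedly apply the operator that best reduces the difference."""
    plan = []
    while state != goal:
        diff = goal - state                               # step 1: difference
        op = min(operators, key=lambda o: abs(diff - o))  # step 2: select
        if abs(diff - op) >= abs(diff):
            return None                                   # no operator helps
        state += op                                       # step 3: apply
        plan.append(op)
    return plan

# Reach 8 from 0 using steps of +5, +1 or -1.
plan = means_ends(0, 8, [5, 1, -1])
```

Note the recursive flavor: applying the big +5 operator twice overshoots to 10, leaving a smaller subproblem (get from 10 to 8) that the same loop then solves with -1 steps.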
1. Chess
2. Water jug
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One single solution
Can solution steps be ignored or undone? | Yes |
Is the problem universe predictable? | Yes | The problem universe is predictable: solving it requires only one person, and we can predict what will happen in the next step
Is a good solution absolute or relative? | Absolute | The water jug problem may have a number of solutions, but once one solution is found there is no need to bother about the others, because it does not affect the cost
Is the solution a state or a path? | Path | The path to the solution matters
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution
Does the task require human interaction? | Yes | Additional assistance is required, e.g. to get jugs or a pump
3. 8 puzzle
Problem characteristic | Satisfied | Reason
Is the problem decomposable? | No | One game has a single solution
Can solution steps be ignored or undone? | Yes | We can undo the previous move
Is the problem universe predictable? | Yes | The problem universe is predictable: solving it requires only one person, and we can predict the position of the blocks after the next move
Is a good solution absolute or relative? | Absolute | Once you get one solution, you do not need to bother about other possible solutions
Is the solution a state or a path? | Path | In the 8 puzzle, the winning (goal) state describes a path to that state
What is the role of knowledge? | | A lot of knowledge helps to constrain the search for a solution
Does the task require human interaction? | No | In the 8 puzzle, additional assistance is not required
4. Travelling salesman Problem
o Production rules
- Legal chess moves can be described as a set of rules consisting of two parts: a left side that gives the current position and a right side that describes the change to be made to the board position. Example:
Current position: White pawn at square (5, 2), AND square (5, 3) is empty, AND square (5, 4) is empty.
Change to board position: Move pawn from square (5, 2) to square (5, 4).
2. 8 Puzzle problem
- Given a 3×3 board with 8 tiles (every tile has one number from 1 to 8) and one empty space.
- The program is to change the initial configuration into the goal configuration.
- A solution to the problem is an appropriate sequence of moves, i.e. sliding one of the four adjacent tiles (left, right, above, or below) into the empty space.
- The Knowledge Representation mechanisms are often based on: Logic, Rules, Frames, Semantic Net etc.
- A good knowledge representation system must possess the following properties.
o Representational Adequacy: It is the ability to represent the required knowledge.
o Inferential Adequacy: It is the ability to manipulate the knowledge represented to produce new knowledge
corresponding to that inferred from the original.
o Inferential Efficiency: It is the ability to direct the inferential knowledge mechanism into the most productive
directions by storing appropriate guides.
o Acquisitional Efficiency: It is the ability to acquire new knowledge using automatic methods wherever
possible rather than reliance on human intervention.
Approaches to Knowledge representation
2. Inheritable Knowledge
- In this approach, all data is stored in a hierarchy of classes, and all the classes are arranged in a generalized form.
- This approach contains inheritable knowledge which shows a relation between instance and class.
- In this approach
o We apply inheritance property
o Knowledge Elements inherit values from their parents.
o objects and values are represented in Boxed nodes
o Arrows are used to point from objects to their values.
- This structure is known as a slot and filler structure, semantic network or a collection of frames where Every
individual frame can represent the collection of attributes and its value.
3. Inferential Knowledge
- It represents knowledge in the form of formal logics.
- It can be used to derive more facts and to verify truths of new statements.
- It guarantees correctness.
4. Procedural knowledge
- In this approach, knowledge is encoded in small programs and code which describe how to do specific things and how to proceed.
- It is a representation in which the control information, to use the knowledge, is embedded in the knowledge
itself.
- For example, computer programs, directions, and recipes.
Types of Knowledge
Tacit or Implicit or Informal | Explicit or Formal
Exists within a human being | Exists outside a human being
It is embodied. | It is embedded.
Difficult to articulate formally. | Can be articulated formally.
Difficult to communicate or share. | Can be shared, copied, processed and stored.
Hard to steal or copy. | Easy to steal or copy.
Drawn from experience, action, and subjective insight. | Drawn from an artifact of some type, such as a principle, procedure, process, or concept.
Playing a musical instrument, humour, emotional intelligence, and speaking a certain language are examples of such knowledge. | Encyclopedias and books are classic examples of such knowledge.
Logic
- Logic is the primary vehicle for representing and reasoning about knowledge.
- A logic is a formal language, with precisely defined syntax and semantics
- It provides a way of deriving new knowledge from old using mathematical deduction.
- Using logic we can conclude that a new statement is true by proving that it follows from the statements that are
already known.
- Specifically, we will be dealing with formal logic. The advantage of using formal logic is that it is precise and
definite.
1. Propositional Logic
- A proposition is a simple declarative sentence, e.g. "the book is expensive".
- A proposition can be either true or false, but not both.
- We can use any symbol to represent a proposition, such as A, B, C, P, Q, R.
- Propositions are combined by connectives (logical operators).
There are two types of propositions:
1. Atomic propositions: each consists of a single proposition symbol and must be either true or false.
a) "2 + 2 is 4" is an atomic proposition, and it is a true fact.
b) "The Sun is cold" is also an atomic proposition, and it is a false fact.
2. Compound propositions: constructed by combining simpler propositions using logical connectives.
- In propositional logic, in order to draw conclusions, facts are represented in a more convenient way, e.g.:
o Marcus is a man: man(Marcus)
o All men are mortal: mortal(men)
- But propositional logic fails to capture the relationship between an individual being a man and that individual being mortal.
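Because a proposition is just a truth value, the connectives can be modelled directly with Python booleans; enumerating all rows of the truth table checks whether a sentence is a tautology (the sentence names below are illustrative):

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b is false only when a is true and b false."""
    return (not a) or b

def is_tautology(sentence):
    """True if the sentence holds in every row of its truth table."""
    return all(sentence(p, q) for p, q in product([True, False], repeat=2))

excluded_middle = is_tautology(lambda p, q: p or not p)          # P v ~P
modus_ponens = is_tautology(lambda p, q: implies(p and implies(p, q), q))
contingent = is_tautology(lambda p, q: p and q)                  # P ^ Q
```

This brute-force enumeration is feasible precisely because propositional logic has finitely many truth assignments; predicate logic, introduced next, quantifies over domains and cannot be checked this way in general.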
2. Predicate Logic
- First-order Predicate logic (FOPL) is a formal language in which propositions are expressed in terms of
predicates, variables and quantifiers.
- It should be viewed as an extension to propositional logic.
- A predicate is an expression of one or more variables defined on some specific domain.
(∀x)P(x) means that P holds for all values of x in the domain associated with that variable. Example: (∀x) dolphin(x) → mammal(x)
(∃x)P(x) means that P holds for some value of x in the domain associated with that variable. Example: (∃x) mammal(x) ∧ lays-eggs(x)
A well-formed formula (wff) is a sentence containing no "free" variables, i.e. all variables are "bound" by universal or existential quantifiers.
A Well Formed Formula (wff) is a predicate satisfying any of the following:
1. All propositional constants and propositional variables, the truth values true and false, atomic propositions, and all connectives connecting wffs are wffs.
2. If x is a variable and Y is a wff, then (∀x)Y and (∃x)Y are also wffs.
Resolution
- Resolution produces proofs by refutation.
- To prove a statement, resolution attempts to show that the negation of the statement is unsatisfiable.
- The resolution procedure is a simple iterative process: at each step, two clauses, called the parent clauses, are resolved, yielding a new clause that has been inferred from them.
- The new clause represents the ways in which the two parent clauses interact with each other.
Steps
1. Conversion of facts into first-order logic.
2. Convert FOL into CNF (clause form)
3. Negate the statement which needs to prove (proof by contradiction)
4. Draw resolution graph (unification).
o Eliminate implication
- ∀x ¬graduating(x) ∨ happy(x)
- ∀x ¬happy(x) ∨ smile(x)
- ∃x graduating(x)
- ¬∃x smiling(x)
o Move ¬ inwards
- ∀x ¬graduating(x) ∨ happy(x)
- ∀x ¬happy(x) ∨ smile(x)
- ∃x graduating(x)
- ∀x ¬smiling(x)
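A single resolution step can be sketched on ground (propositional) clauses; the clause set below is a hand-instantiated version of the graduating/happy/smiling example, with the negated goal added:

```python
def resolve(c1, c2):
    """Resolve two clauses, written as frozensets of literals ('P' or '~P')."""
    for lit in c1:
        neg = lit[1:] if lit.startswith('~') else '~' + lit
        if neg in c2:                      # complementary pair found
            return frozenset((c1 - {lit}) | (c2 - {neg}))
    return None                            # the clauses do not resolve

kb = [frozenset({'~graduating', 'happy'}),    # graduating -> happy
      frozenset({'~happy', 'smiling'}),       # happy -> smiling
      frozenset({'graduating'}),              # fact
      frozenset({'~smiling'})]                # negated goal

step1 = resolve(kb[0], kb[2])    # {happy}
step2 = resolve(step1, kb[1])    # {smiling}
step3 = resolve(step2, kb[3])    # empty clause: contradiction, goal proved
```

Deriving the empty clause shows the negated goal is unsatisfiable together with the knowledge base, which is exactly the refutation proof described above. (Full first-order resolution would additionally need unification to bind the variable x.)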
• Backward Chaining (goal-driven reasoning): start with a possible conclusion and try to prove its validity by searching for evidence. (What we are trying to prove is our goal.)
Take the goal and try to match against the consequents of rules from the rule base.
- If a rule is fired, then the antecedents are added to the set of goals, and the consequents are removed.
- Inference engines that use the backward chaining strategy apply the strategy exhaustively, until no more
rules are fired.
Eg: B is the goal or endpoint that is used as the starting point for backward chaining. A is the initial state. A -> B is a fact that must be asserted to arrive at the endpoint B.
- Tom is sweating (B).
- If a person is running, he will sweat (A->B).
- Tom is running (A).
Forward vs Backward Chaining
Forward Chaining | Backward Chaining
It starts from known facts and applies inference rules to extract more data until it reaches the goal. | It starts from the goal and works backward through inference rules to find the facts that support the goal.
It tests all the available rules. | It tests only the few required rules.
It is suitable for planning, monitoring, control, and interpretation applications. | It is suitable for diagnostic, prescription, and debugging applications.
It can generate an infinite number of possible conclusions. | It generates a finite number of possible conclusions.
It aims for any conclusion. | It aims only for the required data.
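A minimal forward-chaining loop over the Tom example (the rule and fact names are illustrative):

```python
def forward_chain(facts, rules):
    """Fire every rule whose antecedents hold until no new fact appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if set(antecedents) <= facts and consequent not in facts:
                facts.add(consequent)      # derived a new fact
                changed = True
    return facts

# "If a person is running, he will sweat" plus the fact "Tom is running".
rules = [(('running',), 'sweating')]
derived = forward_chain({'running'}, rules)
```

The loop fires every applicable rule until the fact set stops growing, which is the data-driven behavior contrasted with backward chaining in the table above.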
Different Methods of Reasoning
Reasoning is the act of deriving a conclusion from certain properties using a given methodology.
1. Deductive Reasoning
- Deductive reasoning is deducing new information from logically related known information.
Example:
Premise-1: All humans eat veggies.
Premise-2: Suresh is a human.
Conclusion: Suresh eats veggies.
The general process of deductive reasoning is given below:
2. Inductive Reasoning
- Inductive reasoning means arriving at a conclusion from a limited set of facts by the process of generalization.
Example:
Premise: All of the pigeons we have seen in the zoo are white.
Conclusion: Therefore, we can expect all the pigeons to be white.
The general process of inductive reasoning is given below:
3. Abductive Reasoning
- Abductive reasoning starts with single or multiple observations then seeks to find the most likely explanation or
conclusion for the observation.
- Abductive reasoning is an extension of deductive reasoning, but in abductive reasoning, the premises do not
guarantee the conclusion.
Example:
Implication: The cricket ground is wet if it is raining.
Axiom: The cricket ground is wet.
Conclusion: It is raining.
5. Monotonic Reasoning
- In monotonic reasoning, once the conclusion is taken, then it will remain the same even if we add some other
information to existing information in our knowledge base.
- Monotonic reasoning is a process that does not change its direction.
- Since it depends only on established knowledge and facts, the set of derived conclusions can only grow; it never shrinks.
- Any theorem proving is an example of monotonic reasoning.
Example:Earth revolves around the Sun.
It is a true fact, and it cannot be changed even if we add other sentences to the knowledge base, such as "The moon revolves around the earth" or "The earth is not round."
Advantages :
o Can be used for theorem proving.
o In monotonic reasoning each old proof will always remain valid.
Disadvantages :
o We can only derive conclusions from the old proofs, so new knowledge from the real world cannot be added.
o It cannot be used for hypothesis knowledge.
6. Non Monotonic reasoning (IMP)
- Non-Monotonic means something which can vary according to the situation or condition.
- Non-monotonic Reasoning is the process that changes its direction or values as the knowledge base increases.
- Non-monotonic Reasoning will increase or decrease based on the condition.
Eg: Consider a bowl of water. If we put it on the stove and turn the flame on, it will boil; when we turn off the flame, it will gradually cool down again.
- Logic will be said as non-monotonic if some conclusions can be invalidated by adding more knowledge into our
knowledge base.
- "Human perception of various things in daily life" is a general example of non-monotonic reasoning, because human reasoning is not monotonic.
Example: Let suppose the knowledge base contains the following knowledge:
Birds can fly
Penguins cannot fly
Pitty is a bird
- So from the above sentences, we can conclude that Pitty can fly.
- However, if we add another sentence to the knowledge base, "Pitty is a penguin", we conclude "Pitty cannot fly", which invalidates the above conclusion.
Advantages :
o It can be used for real-world systems such as Robot navigation.
o In Non-monotonic reasoning, we can choose probabilistic facts or can make assumptions.
Disadvantages :
o In non-monotonic reasoning, the old facts may be invalidated by adding new sentences.
o It cannot be used for theorem proving.
- Probability is the numerical measure of the likelihood that an uncertain event will occur.
o 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
o P(A) = 0 indicates that event A is impossible (it will certainly not occur).
o P(A) = 1 indicates that event A is certain to occur.
- Prior probability: The prior probability of an event is probability computed before observing new information.
- Posterior Probability: The probability calculated after new information has been taken into account. It is a
combination of the prior probability and the new information.
- Marginal Probability: The probability of an event irrespective of the outcomes of other random variables. P(A).
- Joint Probability: Probability of two (or more) simultaneous events, It is symmetrical. P(A, B) = P(A | B) * P(B)
- Conditional Probability: Probability of one (or more) event given the occurrence of another event. It is not
symmetrical. P(A | B) = P(A, B) / P(B)
1. Bayes Theorem
- In statistics and probability theory, the Bayes’ theorem is a mathematical formula used to determine the
conditional probability of events.
- It describes the probability of an event based on prior knowledge of the conditions that might be relevant to the
event.
- It relates the conditional probability and marginal probabilities of two random events.
- The Bayes’ theorem is expressed as: P(A|B) = P(B|A) · P(A) / P(B)
- P(A|B) – the probability of event A occurring, given event B has occurred (Posterior Probability)
- P(B|A) – the probability of event B occurring, given event A has occurred
- P(A) – the probability of event A (Prior Probability)
- P(B) – the probability of event B (Marginal Probability)
- Note that Bayes’ theorem is useful precisely when A and B are dependent; if they were independent, P(A|B) would simply equal P(A).
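A one-line implementation makes the formula concrete. The numbers below are illustrative assumptions (a 90%-sensitive test, 1% prevalence, 10% overall positive rate), not values from the notes:

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# A = has disease, B = test is positive (hypothetical numbers).
posterior = bayes(p_b_given_a=0.9, p_a=0.01, p_b=0.10)
print(round(posterior, 4))  # 0.09 -> P(disease | positive test)
```

Even a 90%-sensitive test yields only a 9% posterior here, because the prior P(A) is small.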
From the formula of the joint distribution, we can write the problem statement in the form of a probability distribution:
P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E)
= 0.75* 0.91* 0.001* 0.998*0.999
= 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using the joint distribution.
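The arithmetic above can be checked directly. This sketch assumes the usual burglar-alarm network, where S and D are the two neighbours calling, A the alarm, B burglary, and E earthquake (the notes do not spell out the variable names):

```python
# Factors of P(S, D, A, ¬B, ¬E) taken from the worked example above.
p_s_given_a   = 0.75    # P(S | A)
p_d_given_a   = 0.91    # P(D | A)
p_a_given_nbe = 0.001   # P(A | ¬B ∧ ¬E)
p_not_b       = 0.998   # P(¬B)
p_not_e       = 0.999   # P(¬E)

joint = p_s_given_a * p_d_given_a * p_a_given_nbe * p_not_b * p_not_e
print(round(joint, 8))  # 0.00068045
```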
- In a rule based system, a rule is an expression of the form "if A then B" where A is an assertion and B can be
either an action or another assertion.
- A problem with rule-based systems is that the connections reflected by the rules are often not absolutely
certain, so in such cases a certainty measure is added to the premises as well as the conclusions.
- The resulting rule expresses how much a change in the certainty of the premise will change the certainty of the conclusion:
- If A (with certainty x) then B (with certainty f(x))
- Each rule has a certainty attached to it.
- Example: In MYCIN, once the identity of the virus/bacteria is found, the system attempts to select a therapy by which
the disease can be treated.
- A certainty factor (CF [h, e]) is defined in terms of two components:
1. MB[h, e] - a measure (between 0 and 1) of belief in hypothesis “h” given the evidence “e”.
MB measures the extent to which the evidence supports the hypothesis.
It is zero if the evidence fails to support the hypothesis.
2. MD[h, e] - a measure (between 0 and 1) of disbelief in hypothesis “h” given the evidence “e”.
MD measures the extent to which the evidence supports the negation of the hypothesis. It is zero if
the evidence supports the hypothesis.
CF[h, e] = MB[h, e] - MD[h, e]
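The definition can be sketched in a few lines. The second function shows the standard MYCIN-style rule for combining two positive CFs for the same hypothesis; the example numbers are hypothetical:

```python
def certainty_factor(mb, md):
    """CF[h, e] = MB[h, e] - MD[h, e]; the result lies in [-1, 1]."""
    return mb - md

def combine_positive(cf1, cf2):
    """MYCIN-style combination of two positive CFs for the same hypothesis."""
    return cf1 + cf2 * (1 - cf1)

# Hypothetical evidence: strong belief (0.8), slight disbelief (0.1).
print(round(certainty_factor(0.8, 0.1), 2))   # 0.7
# Two independent rules supporting the same hypothesis reinforce each other:
print(round(combine_positive(0.6, 0.5), 2))   # 0.8
```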
- In Dempster-Shafer Theory we consider sets of propositions and assign to each of them an interval in which the
degree of belief must lie. [Belief, Plausibility]
- Belief (denoted as Bel) measures the strength of the evidence in favor of a set of propositions. It ranges from 0
(no evidence) to 1 (definite certainty).
- Plausibility (Pl) is defined as Pl(s) = 1 − Bel(¬s). It also ranges from 0 to 1 and measures the extent to which evidence in favor
of ¬s leaves room for belief in s.
- Let’s take an example where we have some mutually exclusive hypotheses: {Allergy, Flu, Cold, Pneumonia}.
- The set is denoted by θ, and we want to attach some measure of belief to elements of θ.
- The key function we use here is a probability density function, denoted by m, which is defined not only for the
elements of θ but for all subsets of it. We must assign m so that the sum of all the m values assigned to subsets of θ is 1.
- The quantity m(p) measures the amount of belief that is currently assigned to exactly the set “p” of
hypotheses.
Let X be the set of subsets of θ for m1 and let Y be the corresponding set for m2.
We define m3, the combination of m1 and m2, to be:
m3(Z) = [ Σ_{X∩Y=Z} m1(X) · m2(Y) ] / [ 1 − Σ_{X∩Y=∅} m1(X) · m2(Y) ]
• Suppose m1 corresponds to our belief after observing fever: m1({F, C, P}) = 0.6, m1(θ) = 0.4
• Suppose m2 corresponds to our belief after observing runny nose: m2({A, F, C}) = 0.8, m2(θ) = 0.2
• Then we can compute their combination m3 using the following table:
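The combination table is not reproduced here, but Dempster's rule can be computed directly. A minimal sketch (sets of hypotheses are represented as frozensets of their initial letters):

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: m3(Z) = sum over X∩Y=Z of m1(X)*m2(Y),
    normalized by 1 minus the mass that falls on the empty set."""
    m3, conflict = {}, 0.0
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        z = x & y
        if z:
            m3[z] = m3.get(z, 0.0) + mx * my
        else:
            conflict += mx * my      # mass assigned to ∅ is redistributed
    norm = 1.0 - conflict
    return {z: v / norm for z, v in m3.items()}

theta = frozenset("AFCP")                  # {Allergy, Flu, Cold, Pneumonia}
m1 = {frozenset("FCP"): 0.6, theta: 0.4}   # belief after observing fever
m2 = {frozenset("AFC"): 0.8, theta: 0.2}   # belief after observing runny nose
m3 = combine(m1, m2)
for s, v in sorted(m3.items(), key=lambda kv: -kv[1]):
    print(set(s), round(v, 2))   # {F,C}=0.48, {A,F,C}=0.32, {F,C,P}=0.12, θ=0.08
```

Here no intersection is empty, so the normalizing denominator is 1 and the four products fill the combined table directly.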
Fuzzy Logic
- The term fuzzy refers to things that are not clear or are vague
- So fuzzy logic provides very valuable flexibility for reasoning.
- In the Boolean system, the truth value 1.0 represents absolute truth and 0.0 represents absolute
falsehood. In a fuzzy system there is no restriction to absolute truth and absolute falsehood; intermediate
values represent what is partially true and partially false.
- Crisp set theory is governed by a logic that uses one of only two values: true or false. This logic cannot represent
vague concepts. In such cases fuzzy set theory comes to the rescue, where an element belongs to a set with a certain
degree of membership.
- The architecture of a Fuzzy Logic system consists of four different components.
1. Rule Base :
- Rule Base is a component used for storing the set of rules and the If-Then conditions given by the experts
- Many functions offer effective methods for designing and tuning fuzzy controllers.
- These refinements reduce the number of fuzzy rules required.
2. Fuzzification :
- Fuzzification is a component for transforming the system inputs, i.e., crisp numbers, into fuzzy sets.
- The crisp numbers are those inputs which are measured by the sensors and then fuzzification passes them into
the control systems for further processing.
- This component divides the input signals into the following five states in any Fuzzy Logic system: Large Positive
(LP), Medium Positive (MP), Small (S), Medium Negative (MN), Large Negative (LN)
3. Inference Engine :
- This component is a main component in any Fuzzy Logic system, where all the information is processed
- It allows users to find the matching degree between the current fuzzy input and the rules.
- According to the matching degree, the system determines which rules to fire for the given input.
- When all fired rules are combined, the control actions are developed.
4. Defuzzification
- Defuzzification is a component, which takes the fuzzy set inputs generated by the Inference Engine, and then
transforms them into a crisp value. It is the last step in the process of a fuzzy logic system.
- The crisp value is a type of value which is acceptable by the user.
- Various techniques are present to do this, but the user has to select the best one for reducing the errors.
Operations On the Fuzzy set
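A minimal sketch of the standard (Zadeh) fuzzy set operations, with membership degrees and element names chosen purely for illustration: union takes the maximum membership, intersection the minimum, and complement is 1 minus the membership.

```python
# Fuzzy sets as dicts mapping element -> membership degree in [0, 1].
A = {"x1": 0.2, "x2": 0.7, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.4, "x3": 0.8}

union        = {x: max(A[x], B[x]) for x in A}        # μ_A∪B = max(μ_A, μ_B)
intersection = {x: min(A[x], B[x]) for x in A}        # μ_A∩B = min(μ_A, μ_B)
complement_A = {x: round(1 - A[x], 2) for x in A}     # μ_¬A  = 1 − μ_A

print(union)         # {'x1': 0.5, 'x2': 0.7, 'x3': 1.0}
print(intersection)  # {'x1': 0.2, 'x2': 0.4, 'x3': 0.8}
print(complement_A)  # {'x1': 0.8, 'x2': 0.3, 'x3': 0.0}
```

Note that, unlike crisp sets, a fuzzy set and its complement can overlap: x2 belongs to both A and ¬A to a nonzero degree.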
Game Theory
- It does not prescribe a way or say how to play a game.
- It is the set of ideas and techniques for analysing conflict situations between two or more players.
Ch-6
MiniMax Search
- Mini-max algorithm is a backtracking algorithm which is used in decision-making and game theory.
- Min-Max algorithm is mostly used for two player games Such as Chess, Checkers, tic-tac-toe, go etc.
- Mini-Max algorithm uses recursion to search through the game-tree.
- The minimax algorithm performs a depth-first search to explore the complete game tree: it proceeds all the way
down to the terminal nodes of the tree, then backtracks up the tree as the recursion unwinds.
- In this algorithm two players play the game; one is called MAX and the other is called MIN, and the two players are
opponents of each other.
- MAX will select the maximized value and MIN will select the minimized value.
- The steps for the min max algorithm in AI can be stated as follows:
1. Create the entire game tree.
2. Evaluate the scores for the leaf nodes based on the evaluation function.
3. Backtrack from the leaf to the root nodes:
For Maximizer, choose the node with the maximum score.
For Minimizer, choose the node with the minimum score.
4. At the root node, choose the node with the maximum value and select the respective move.
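The steps above can be sketched as a short recursive function. The nested-list tree format and the example scores are illustrative assumptions:

```python
def minimax(node, is_max):
    """Depth-first minimax over a game tree given as nested lists;
    leaves are numeric evaluation scores."""
    if not isinstance(node, list):       # terminal node: return its score
        return node
    scores = [minimax(child, not is_max) for child in node]
    return max(scores) if is_max else min(scores)

# Depth-2 tree: MAX moves first, then MIN chooses among the leaves.
tree = [[3, 5], [2, 9]]
print(minimax(tree, is_max=True))  # 3: MAX picks the branch whose MIN value is larger
```

MIN would reply 3 on the left branch and 2 on the right, so MAX backs up max(3, 2) = 3 to the root.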
• Characteristics of mini-Max
- Complete: For finite search tree.
- Optimal- If both opponents are playing optimally.
- Time complexity is O(b^m) and space complexity is O(b·m), where b is the branching factor and m is the maximum depth of the
tree. (Same as DFS, as it performs a depth-first search of the tree.)
• Limitations of MiniMax
- The main drawback is that it gets really slow for complex games such as chess, Go, etc. These games have a huge
branching factor, and the player has many choices to consider.
- This limitation of the minimax algorithm can be mitigated by alpha-beta pruning.
Alpha-Beta Pruning
- It is an optimization technique for the minimax algorithm
- There is a technique by which we can compute the correct minimax decision without checking every node of the game tree;
this technique is called pruning. It involves two threshold parameters, alpha and beta, hence the name
alpha-beta pruning.
- Alpha-beta pruning can be applied at any depth of a tree; sometimes it prunes not only tree leaves but
entire sub-trees.
- The two-parameter can be defined as:
o Alpha: The best (highest-value) choice we have found so far at any point along the path of Maximizer. The initial value
of alpha is -∞.
o Beta: The best (lowest-value) choice we have found so far at any point along the path of Minimizer. The initial value of
beta is +∞.
- The condition required for alpha-beta pruning to cut off a branch is: α ≥ β
o The effectiveness of alpha-beta pruning is highly dependent on the order in which each node is examined.
- Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any leaves of the tree and works exactly
like the minimax algorithm. The time complexity for such an ordering is O(b^m).
- Ideal ordering: The ideal ordering for alpha-beta pruning occurs when lots of pruning happens in the tree and the best moves
occur on the left side of the tree. The time complexity for such an ordering is O(b^(m/2)).
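The cut-off condition can be added to the minimax recursion in a few lines. The tree format and scores below are illustrative:

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning; stops expanding a node once α >= β."""
    if not isinstance(node, list):       # terminal node
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # beta cut-off: MIN will avoid this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break                        # alpha cut-off: MAX will avoid this branch
    return value

tree = [[3, 5], [2, 9]]                  # same tree as the minimax example
print(alphabeta(tree, is_max=True))      # 3, and the leaf 9 is never examined
```

On the right branch, MIN sees the leaf 2, so β drops to 2 while α is already 3 from the left branch; α ≥ β triggers the cut-off and the leaf 9 is pruned.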
Ch-7
What is Planning in AI ?
- It is about the decision-making tasks performed by robots or computer programs to achieve a specific goal.
- Planning refers to the process of computing several steps of a problem-solving procedure before executing any of them.
Predicates
- All the operations performed by the robot arm have certain preconditions that can be described in the form of predicates.
1. ON(A,B): Block A is on Block B.
2. ONTABLE(A): Block A is on the table.
3. CLEAR(A): There is nothing on the top of Block A.
4. HOLDING(A): The arm is holding Block A.
5. ARMEMPTY: The arm is holding nothing.
• The following list of actions can be applied in the various situations of the problem.
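The actions can be written down as precondition / add / delete lists over the predicates above. This is a sketch of the standard blocks-world operators (Rich & Knight-style); the exact lists in the notes' figure may differ in detail:

```python
# Each operator names its preconditions, an add list, and a delete list.
OPERATORS = {
    "STACK(A,B)":   {"pre": {"CLEAR(B)", "HOLDING(A)"},
                     "add": {"ARMEMPTY", "ON(A,B)"},
                     "del": {"CLEAR(B)", "HOLDING(A)"}},
    "UNSTACK(A,B)": {"pre": {"ON(A,B)", "CLEAR(A)", "ARMEMPTY"},
                     "add": {"HOLDING(A)", "CLEAR(B)"},
                     "del": {"ON(A,B)", "ARMEMPTY"}},
    "PICKUP(A)":    {"pre": {"CLEAR(A)", "ONTABLE(A)", "ARMEMPTY"},
                     "add": {"HOLDING(A)"},
                     "del": {"ONTABLE(A)", "ARMEMPTY"}},
    "PUTDOWN(A)":   {"pre": {"HOLDING(A)"},
                     "add": {"ONTABLE(A)", "ARMEMPTY"},
                     "del": {"HOLDING(A)"}},
}

# An action is applicable in a state when all its preconditions hold.
state = {"ONTABLE(A)", "CLEAR(A)", "ARMEMPTY"}
print(OPERATORS["PICKUP(A)"]["pre"] <= state)  # True
```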
Constraint Posting
- The idea of constraint posting is to build up a plan by incrementally hypothesizing operators, partial orderings between
operators, and binding of variables within operators.
- At any given time in the problem-solving process, we may have a set of useful operators but perhaps no clear idea of how
those operators should be ordered with respect to each other.
- A solution is a partially ordered, partially instantiated set of operators; to generate an actual plan, the partial order is
converted into a total order.
Algorithm For Non-linear Planning
1. Choose a goal 'g' from the goalset
2. If 'g' does not match the state, then
• Choose an operator 'o' whose add-list matches goal g
• Push 'o' on the opstack
• Add the preconditions of 'o' to the goalset
3. While all preconditions of operator on top of opstack are met in state
• Pop operator o from top of opstack
• state = apply(o, state)
• plan = [plan; o]
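The pseudocode above can be sketched as a tiny goal-stack interpreter. Everything below (the make-tea domain, the operator names) is an illustrative assumption, and the sketch does no backtracking or goal-conflict handling:

```python
def plan(goalset, state, operators):
    """Naive goal-stack planner following the steps above."""
    opstack, steps = [], []
    while goalset or opstack:
        # Step 3: while the top operator's preconditions hold, apply it.
        while opstack and opstack[-1]["pre"] <= state:
            op = opstack.pop()
            state = (state - op["del"]) | op["add"]
            steps.append(op["name"])
        if not goalset:
            break
        g = goalset.pop()                                 # Step 1: choose a goal
        if g in state:
            continue                                      # goal already satisfied
        op = next(o for o in operators if g in o["add"])  # Step 2: matching add-list
        opstack.append(op)
        goalset |= op["pre"] - state                      # push unmet preconditions
    return steps

ops = [
    {"name": "boil", "pre": {"have_water"},
     "add": {"hot_water"}, "del": {"have_water"}},
    {"name": "brew", "pre": {"hot_water", "have_teabag"},
     "add": {"tea"}, "del": {"hot_water"}},
]
print(plan({"tea"}, {"have_water", "have_teabag"}, ops))  # ['boil', 'brew']
```

The goal "tea" pushes "brew", whose unmet precondition "hot_water" pushes "boil"; the operators then pop and apply in the reverse order, yielding the plan.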
Hierarchical Planning
- In order to solve hard problems, a problem solver may have to generate long plans.
- But it is important to be able to eliminate some of the details of the problem until a solution that addresses the main
issues is found. And then an attempt can be made to fill appropriate details.
- To do this, macro-operators were initially used, but in that approach no details were eliminated from the actual
descriptions of the operators.
- ABSTRIPS is a better approach: it plans in a hierarchy of abstraction spaces, in each of which preconditions at a
lower level of abstraction are ignored.
• The assignment of appropriate criticality values is crucial to the success of this hierarchical planning method;
preconditions that no operator can satisfy are clearly the most critical.
• Example: in solving the problem of moving a robot, for applying an operator PUSH-THROUGH-DOOR, the precondition that there
exists a door big enough for the robot to get through is of high criticality.
Reactive Systems
- The idea of reactive systems is to avoid planning altogether.
- A reactive system is very different from the other kinds of planning systems because it chooses actions one at
a time; it does not anticipate and select an entire action sequence before it does the first thing.
- A reactive system must have an access to a knowledge base of some sort that describes what actions should
be taken under what circumstances.
- The example is a Thermostat. The job of the thermostat is to keep the temperature constant inside a room.
- Simple pair of situation-action rules used by Thermostat:
1. If the temperature in the room is k degrees above the desired temperature, then turn the AC on.
2. If the temperature in the room is k degrees below the desired temperature, then turn the AC off.
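The two situation-action rules can be written as a stimulus-response function with no plan at all. A minimal sketch (the function name and the threshold k=2 are illustrative):

```python
def thermostat(room_temp, desired, k=2):
    """Pure situation-action rules: react to the current reading only."""
    if room_temp >= desired + k:
        return "AC on"        # rule 1: k degrees above the desired temperature
    if room_temp <= desired - k:
        return "AC off"       # rule 2: k degrees below the desired temperature
    return "no change"        # neither rule fires

print(thermostat(28, 24))  # AC on
print(thermostat(20, 24))  # AC off
print(thermostat(24, 24))  # no change
```

There is no knowledge of future states or action sequences; the knowledge base is just the rule set itself.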
- NLU is the process of reading and interpreting language; it produces non-linguistic outputs from natural-language inputs.
- NLG is the process of writing or generating language; it constructs natural-language outputs from non-linguistic inputs.
- Natural Language Processing (NLP) problem can be divided into two tasks:
1.Processing written text, using lexical, syntactic and semantic knowledge of the language as well as the required
real world information.
2.Processing spoken language, using all the information needed above plus additional knowledge about phonology
as well as enough added information to handle the further ambiguities that arise in speech.
Phases of NLP
1. Morphological Analysis:
- Individual words are analyzed into their components and non-word tokens such as punctuation are separated
from the words.
2. Syntactic Analysis:
- Linear sequences of words are transformed into structures that show how the words relate to each other.
- Some word sequences may be rejected if they violate the language’s rules for how words may be combined.
A sentence such as “The school goes to boy” is rejected by an English syntactic analyzer.
3. Semantic Analysis:
- The structures created by the syntactic analyzer are assigned meanings.
- A mapping is made between the syntactic structures and objects in the
task domain.
- Structures for which no such mapping is possible may be rejected. The
semantic analyzer disregards sentences such as “hot ice-cream”.
4. Discourse integration:
- The meaning of an individual sentence may depend on the sentences
that precede it and may influence the meanings of the sentences that
follow it.
5. Pragmatic Analysis:
- The structure representing what was said is reinterpreted to determine
what was actually meant.
1. Morphological Analysis:
- The morphological level involves identifying and analyzing the structure of words.
- Lexicon of a language means the collection of words and phrases in a language. It involves identifying and analyzing the
structure of words.
- Morphological analysis is dividing the whole chunk of text into paragraphs, sentences, and words.
- This process will usually assign syntactic categories to all the words in the sentence.
- Suppose there is a sentence “I want to print Bill’s .init file.”
- Morphological analysis must do the following things:
▪ Pull apart the word “Bill’s” into proper noun “Bill” and the possessive suffix “’s”
▪ Recognize the sequence “.init” as a file extension that is functioning as an adjective in the sentence.
2. Syntactic Analysis:
- Syntactic analysis must exploit the results of morphological analysis to build a structural description of the sentence.
- The goal of this process, called parsing, is to convert flat sentence into a hierarchical structure that corresponds to
meaning units .
- Reference markers (set of entities) are shown in the parenthesis in the parse tree.
- Each one corresponds to some entity that has been mentioned in the sentence.
- These reference markers are useful later since they provide a place in which to accumulate information about the entities
as we get it.
3. Semantic Analysis:
- Semantic analysis must do two important things:
a. It must map individual words into appropriate objects in the knowledge base.
b. It must create the correct structures to correspond to the way the meanings of the individual words combine with each
other.
- It draws the exact meaning or the dictionary meaning from the text. The text is checked for meaningfulness.
- It is done by mapping syntactic structures and objects in the task domain.
4. Discourse Integration:
- The discourse level of linguistic processing deals with the analysis of structure and meaning of text beyond a single
sentence, making connections between words and sentences.
- Discourse integration depends upon the sentences that precede a given sentence and also involves the meaning of the
sentences that follow it.
- At this level, Anaphora Resolution is also achieved by identifying the entity referenced by an anaphor
- Structured documents also benefit from the analysis at the discourse level since sections can be broken down into (1)
title, (2) abstract, (3) introduction, (4) body, (5) results, (6) analysis, (7) conclusion, and (8) references.
5. Pragmatic Analysis:
- During this phase, what was said is re-interpreted to determine what was actually meant.
- It helps you to discover the intended effect by applying a set of rules that characterize cooperative dialogues.For
Example: "Open the door" is interpreted as a request instead of an order.
- It is the process to translate knowledge based representation to a command to be executed by the system.
- The pragmatic level of linguistic processing deals with the use of real-world knowledge and understanding of how this
impacts the meaning of what is being communicated.
Applications of NLP
1. Question Answering
- Question Answering focuses on building systems that automatically answer the questions asked by humans in a
natural language.Eg : Virtual Assistants like Siri, Alexa etc.
2. Text Classification
- Text classification is the process of categorizing the text into a group of words.
- By using NLP, text classification can automatically analyze text and then assign a set of predefined tags
- For e.g., Spam Detection is used to detect unwanted e-mails getting to a user's inbox.
3. Sentiment Analysis
- Sentiment Analysis is also known as opinion mining.
- It is used on the web to analyse the attitude, behaviour, and emotional state of the sender.
- This application is implemented through a combination of NLP and statistics: by assigning values to the text
(positive, negative, or neutral), it identifies the mood of the context (happy, sad, angry, etc.)
4. Machine Translation
- Machine translation is used to translate text or speech from one natural language to another natural
language.Example: Google Translator
5. Spelling correction
- Microsoft Corporation provides word processor software like MS-word, PowerPoint for the spelling correction.
6. Speech Recognition
- Speech recognition is used for converting spoken words into text.
- It is used in applications, such as mobile, home automation, video recovery, dictating to Microsoft Word, voice
biometrics, voice user interface, and so on.
7. Chatbot
- Chatbots are used by many companies to provide chat services to their customers.
8. Information extraction
- Information extraction is one of the most important applications of NLP. It is used for extracting structured
information from unstructured or semi-structured machine-readable documents.
9. Natural Language Understanding (NLU)
- It converts a large set of text into more formal representations such as first-order logic structures that are easier
for the computer programs to manipulate notations of the natural language processing.
Spell Checking
- Spell Check is a process of detecting and sometimes providing suggestions for incorrectly spelled words in a text.
- A basic spell checker carries out the following processes:
o It scans the text and extracts the words contained in it.
o It then compares each word with a known list of correctly spelled words (i.e. a dictionary).
o An additional step is a language-dependent algorithm for handling morphology.
- Error Detection
• Dictionary Lookup Technique checks every word of input text for its presence in dictionary. If that word present
in the dictionary then it is a correct word. Otherwise it is put into the list of error words.
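The dictionary-lookup technique fits in a few lines. The toy word list below is an illustrative assumption, standing in for a real dictionary:

```python
DICTIONARY = {"the", "quick", "brown", "fox", "jumps"}   # toy word list

def spell_check(text):
    """Dictionary-lookup error detection: every word of the input text is
    checked for its presence in the dictionary; absent words are flagged."""
    words = text.lower().split()
    return [w for w in words if w not in DICTIONARY]

print(spell_check("the quikc brown fox jumps"))  # ['quikc']
```

A real checker would also handle punctuation and morphology (plurals, verb endings) before the lookup, as the notes point out.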
How to build NLP Pipeline
1. Sentence Segmentation
- Sentence Segment is the first step for building the NLP pipeline. It breaks the paragraph into separate sentences.
2. Word Tokenization
- Word Tokenizer is used to break the sentence into separate words or tokens.
3. Stemming
- Stemming is used to normalize words into its base form or root form. The big problem with stemming is that
sometimes it produces the root word which may not have any meaning.
- For example, intelligence, intelligent, and intelligently all originate from the single root word
"intelligen." In English, the word "intelligen" does not have any meaning.
4. Lemmatization
- Lemmatization is quite similar to stemming. The main difference between stemming and lemmatization is
that lemmatization produces a root word which has a meaning.
- For example: in lemmatization, the words intelligence, intelligent, and intelligently have the root word intelligent,
which has a meaning.
6. Dependency Parsing
- Dependency parsing is used to find how the words in a sentence are related to each other.
7. POS tags
- POS stands for parts of speech, which include noun, verb, adverb, and adjective. A POS tag indicates how a word
functions, both in meaning and grammatically, within the sentence.
- Example: Google something on the Internet.
Here Google is used as a verb, although it is a proper noun.
9. Chunking
- Chunking is used to collect individual pieces of information and group them into bigger, meaningful phrases.
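The first few pipeline stages can be sketched end to end. This is a deliberately naive illustration (splitting on periods and stripping a few suffixes), not a production tokenizer or stemmer; note how the crude stemmer produces non-words, exactly the problem the notes describe:

```python
def pipeline(paragraph):
    # 1. Sentence segmentation: naive split on '.'.
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    # 2. Word tokenization: split each sentence on whitespace.
    tokens = [s.split() for s in sentences]

    # 3. Crude suffix-stripping stemmer: may yield roots that are not words.
    def stem(word):
        for suffix in ("ly", "ing", "s"):
            if word.endswith(suffix):
                return word[: -len(suffix)]
        return word

    stems = [[stem(w) for w in sent] for sent in tokens]
    return sentences, stems

sentences, stems = pipeline("Dogs bark loudly. Cats sleep.")
print(sentences)  # ['Dogs bark loudly', 'Cats sleep']
print(stems)      # [['Dog', 'bark', 'loud'], ['Cat', 'sleep']]
```

A lemmatizer would instead consult a vocabulary so that every root it returns is a real word.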
Ch-9
Hopfield Network
- The Hopfield neural network was proposed by John Hopfield as a theory of memory: a model of content-addressable memory.
- A Hopfield network is a special kind of neural network whose response is different from other neural networks:
its output is calculated by a converging iterative process.
- A Hopfield network is at first prepared to store various patterns or memories and then it becomes ready to
recognize any of the learned patterns.
- The Hopfield network consists of a set of neurons and corresponding set of unit delays, forming a multiple loop
feedback system.
- Processing units are always in one of two states, active(Black) or inactive(White).
- Units are connected to each other with weighted symmetric connections. A positive weighted connection
indicates that the two units tend to activate each other. A negative weighted connection allows an active unit to
deactivate a neighboring unit.
- Sometimes the network cannot find the global solution because it gets stuck in a local minimum.
- Boltzmann machines combine simulated annealing with the Hopfield network to find the
global minimum.
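The store-then-recognize behaviour can be sketched with Hebbian learning and symmetric weights. A minimal sketch with units as ±1 (active/inactive) and a single stored pattern; the pattern itself is arbitrary:

```python
def train(patterns):
    """Hebbian learning: symmetric weights W[i][j] = sum of p[i]*p[j],
    with a zero diagonal (no self-connections)."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, state, steps=5):
    """Synchronous updates: each unit takes the sign of its weighted input,
    iterating until the network settles on a stored pattern."""
    for _ in range(steps):
        state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                 for row in W]
    return state

stored = [1, -1, 1, -1, 1, -1]          # the memorized pattern
W = train([stored])
noisy = [1, -1, -1, -1, 1, -1]          # one unit flipped
print(recall(W, noisy))                 # [1, -1, 1, -1, 1, -1]: the memory is restored
```

Recall converges to the stored pattern even from a corrupted cue, which is what "content-addressable memory" means.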
- Artificial Neural Networks are computing systems with interconnected nodes that work like the neurons
present in the brain.
- A simple ANN consists of an input layer, an output layer, and hidden layers in between.
• Inputs are fed simultaneously into the units making up the input layer
• They are then weighted and fed simultaneously to a hidden layer
• The weighted outputs of the last hidden layer are input to the output layer
• Output layer emits the network's prediction
In ANN, we can also apply activation functions over the input to get the exact output.
1. Linear Activation Function
- It is also known as the identity function, as it performs no editing of the input:
F(x) = x
1) Supervised Learning
- This type of learning is done under the supervision of a teacher. This learning process is dependent.
- During the training of ANN under supervised learning, the input vector is presented to the network, which will
give an output vector and this output vector is compared with the desired output vector.
- An error signal is generated, if there is a difference between the actual output and the desired output vector.
- On the basis of this error signal, the weights are adjusted until the actual output is matched with the desired
output.
2) Unsupervised Learning
- This type of learning is done without the supervision of a teacher. This learning process is independent.
- During the training of ANN under unsupervised learning, the input vectors of similar type are combined to form
clusters. When a new input pattern is applied, then the neural network gives an output response indicating the
class to which the input pattern belongs.
- There is no feedback from the environment as to what should be the desired output and if it is correct or incorrect
3) Reinforcement learning
- In Reinforcement learning agents learn from their experiences only.
- It lies between supervised and unsupervised learning
- It is also called learning with critic
- Reinforcement learning works on a feedback-based process, in which an AI agent automatically explores its
surroundings by hit and trial, taking actions, learning from experiences, and improving its performance.
- Agent gets rewarded for each good action and get punished for each bad action.The main aim of
reinforcement learning agent is to maximize the rewards.
1. Speech Recognition
- The most useful network for this is the Kohonen self-organizing feature map, which takes short segments of the
speech waveform as input. It maps the same kinds of phonemes to the same extracted features, which helps an
acoustic model recognize the utterance.
2. Character Recognition
- It is an interesting problem which falls under the general area of Pattern Recognition.
- Many neural networks have been developed for automatic recognition of handwritten characters, either letters
or digits. Following ANNs have been used for character recognition −
• Backpropagation neural networks and Neocognitron
- Like back-propagation neural networks, the neocognitron also has several hidden layers; the pattern of connection
from one layer to the next is localized, so training is done layer by layer.
5. Other Applications
- Artificial Neural Networks are used in Oncology to train algorithms that can identify cancerous tissue at the
microscopic level at the same accuracy as trained physicians.
- Object detection models such as YOLO (You Only Look Once) and SSD (Single Shot Object Detectors).
Recurrent networks
- They are feedback networks with closed loops.
Feedback network : It has feedback paths so the signal can flow in both directions using loops.
1. Fully recurrent network − It is the simplest neural network architecture because all nodes are connected to all
other nodes and each node works as both input and output.
2. Jordan network − It is a closed loop network in which the output will go to the input again as feedback .
o RNNs are a class of neural networks that are helpful in modeling sequential data like time series, speech,
financial data, audio, video, weather etc.
o RNNs are among the most promising algorithms in use because they are the only ones with an internal memory.
o Because of their internal memory, RNNs can remember important things about the input they received.
o Recurrent networks can be trained with the Back-propagation algorithm.
o In a feed-forward neural network, the information only moves in one direction and never touches a node
twice so it has no memory of the input they receive. Because it only considers the current input, it has no
notion of order in time.
o In an RNN the information cycles through a loop. When it makes a decision, it considers the current input and
also what it has learned from the inputs it received previously. So RNN has two inputs: the present and the
recent past.
• Feed-forward neural networks map one input to one output but RNNs can map one to many, many to
many and many to one.
Symbolic AI vs Connectionist AI
o Symbolic AI
- Search – state space traversal
- Knowledge representation – predicate logic, semantic frames, scripts
- Learning – macro-operators, explanation learning, version space
o Connectionist AI
- Search – Parallel Relaxation
- Knowledge representation – very large number of real value connection strength
- Learning – back propagation, reinforcement learning, unsupervised learning
1. Representational efficiency
- With distributed representation, n binary output neurons can represent 2^n concepts.
- With localist representation, n neurons can only represent n concepts.
2. Mapping efficiency
- It allows for a more compact overall structure from the input nodes to the output ones, which means fewer
connections and weights to train.
3. Sparse distributed representation
- It is sparse if only a small fraction of the n neurons is used to represent a subset of the concepts.
4. Resiliency
- It is more resistant to damage.
- It is resilient in the sense that degradation of a few elements in the network structure may not disrupt or affect the
overall performance of the structure.
Ch-10
What is an Expert System ?
- The concept of expert systems was first developed by Feigenbaum. He explained that the world was moving
from data processing to knowledge processing, a transition enabled by new processor
technology and computer architectures.
- An expert system is a computer program that is designed to solve complex problems and to provide decision-
making ability like a human expert.
Features:
- Human experts are perishable, but an expert system is permanent.
- One expert system may contain knowledge from more than one human expert, thus making the
solutions more efficient.
- It decreases the cost of consulting an expert .
- Expert systems can solve complex problems by deducing new facts through existing facts of knowledge
- Expert systems were among the first truly successful forms of artificial intelligence (AI) software.
Limitations :
- Do not have human-like decision-making power.
- Cannot produce correct result from less amount of knowledge.
- Requires excessive training
Components of an Expert System
- Knowledge Base
- Inference Engine
- User Interface
1. Knowledge Base
- The knowledge base represents facts and rules.
- It consists of knowledge in a particular domain as well as rules to solve a problem, procedures and intrinsic
data relevant to the domain
- The knowledge base of an ES is a store of both, factual and heuristic knowledge.
a. Factual Knowledge − It is the information widely collected by the Knowledge Engineers in the task domain.
b. Heuristic Knowledge − It is about practice, judgement, one’s ability of evaluation, and guessing.
- An expert system solves the most complex issue as an expert by extracting the knowledge stored in its
knowledge base. This knowledge is extracted from its knowledge base using the reasoning and inference rules
according to the user queries.
- The performance of an expert system is based on the expert's knowledge stored in its knowledge base. The
more knowledge stored in the knowledge base, the better the performance of the expert system.
2. Inference Engine
- The function of the inference engine is to fetch the relevant knowledge from the knowledge base, interpret it
and to find a solution relevant to the user’s problem.
- The inference engine acquires the rules from its knowledge base and applies them to the known facts to infer
new facts.
- Inference engines can also include explanation and debugging abilities.
3. User Interface
- This module makes it possible for a non-expert user to interact with the expert system and find a solution to
the problem.
Representation using Domain Knowledge
- Expert system is built around a knowledge base module.
- It contains a formal representation of the information provided by the domain expert. This information may be
in the form of problem-solving rules, procedures, or data intrinsic to the domain.
- To incorporate this information into the system, it is necessary to make use of one or more knowledge
representation methods. Three common methods of knowledge representation that evolved over the years are IF-
THEN rules, semantic networks, and frames.
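As a rough illustration, each of the three methods can be sketched with plain Python structures; the domain content below is invented for the example, not taken from any real system:

```python
# Illustrative sketches of the three knowledge representation methods
# named above, using plain Python structures (domain content is made up).

# IF-THEN rule: conditions and a conclusion
rule = {"if": ["engine_cranks", "no_spark"], "then": "faulty_ignition"}

# Semantic network: labelled edges between concepts
semantic_net = [("Canary", "is-a", "Bird"), ("Bird", "can", "Fly")]

# Frame: a concept with named slots and fillers
bird_frame = {"name": "Bird", "slots": {"covering": "feathers", "moves_by": "flying"}}
```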
- Transferring knowledge from the human expert to a computer is often the most difficult part of building an
expert system.
- The function of this component is to allow the expert system to acquire more and more knowledge from
various sources and store it in the knowledge base.
- The success of any expert system majorly depends on the quality, completeness, and accuracy of the information
stored in the knowledge base.
- The knowledge base is formed from readings taken from various experts and Knowledge Engineers. The knowledge
engineer is a person who acquires information from subject experts and then categorizes and organizes the
information in a meaningful way.
- The knowledge engineer also monitors the development of the ES.
The Inference Engine generally uses two strategies for acquiring knowledge from the Knowledge Base:
1. Forward Chaining
- Forward Chaining is a strategy used by the Expert System to answer the question: "What will happen next?"
- This strategy is mostly used for tasks such as deriving a conclusion, result, or effect.
- Example: predicting the movement of the share market.
2. Backward Chaining
- Backward Chaining is a strategy used by the Expert System to answer the question: "Why did this happen?"
- This strategy is mostly used to find out the root cause or reason behind it, considering what has already happened.
- Example: diagnosis of stomach pain, blood cancer etc.
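The two chaining strategies can be sketched over a toy IF-THEN rule base; the rules and facts below are invented purely for illustration:

```python
# Minimal sketch of forward vs. backward chaining over IF-THEN rules.
# The rules and facts are made-up illustrations, not from any real system.

rules = [
    ({"fever", "cough"}, "flu"),        # IF fever AND cough THEN flu
    ({"flu"}, "rest_recommended"),      # IF flu THEN rest is recommended
]

def forward_chain(facts):
    """Repeatedly fire rules whose conditions are met, adding conclusions."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_chain(goal, facts):
    """Try to prove a goal by recursively proving a rule's conditions."""
    if goal in facts:
        return True
    return any(all(backward_chain(c, facts) for c in conditions)
               for conditions, conclusion in rules if conclusion == goal)

print(forward_chain({"fever", "cough"}))   # derives flu, then rest_recommended
print(backward_chain("rest_recommended", {"fever", "cough"}))  # True
```

Forward chaining works from the facts toward new conclusions ("what will happen next"); backward chaining starts from a hypothesis and works back to the facts ("why did this happen").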
Applications of Expert System
- Medical Domain: Diagnosis systems to deduce the cause of a disease from observed data; conducting medical
operations on humans.
- Monitoring Systems: Continuously comparing data with an observed system, such as leakage monitoring in a long
petroleum pipeline.
- Finance/Commerce: Detection of possible fraud, stock market trading, airline scheduling, cargo scheduling.
- MYCIN is a well-known medical expert system that was developed at Stanford University.
- MYCIN was designed to assist doctors in prescribing antimicrobial drugs for blood infections. So, indirectly through the ES,
experts in antimicrobial drugs can assist doctors who are less expert in that field.
- By asking the doctor a series of questions, MYCIN is able to recommend a course of treatment for the patient.
- MYCIN is also able to explain to the doctor which rules were fired, and is therefore able to explain why it
produced the diagnosis and recommended the treatment that it did.
- MYCIN has been proven to be able to provide more accurate diagnoses of meningitis in patients than most
doctors.
- MYCIN was developed using LISP(List Processing), and its rules are expressed as LISP expressions.
- Example of the kind of rule used by MYCIN, translated into English:
o IF the infection is primary-bacteremia,
o AND the site of the culture is one of the sterile sites,
o AND the suspected portal of entry is the gastrointestinal tract,
o THEN there is suggestive evidence (0.7) that the infection is bacteroid.
- A common method for building such expert systems is to use a rule-based system with backward chaining.
- Typically, a user enters a set of facts into the system, and the system tries to see if it can prove any of the
possible hypotheses using these facts.
- Typically, backward chaining is used in combination with forward chaining. Whenever a new fact is added to the
database, forward chaining is applied to see if any further facts can be derived and then Backward chaining is
used to prove each possible hypothesis.
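The MYCIN rule above attaches a certainty factor (0.7) to its conclusion. A minimal sketch of the commonly described certainty-factor arithmetic follows; the numeric values are illustrative, not taken from the real MYCIN rule base:

```python
# Sketch of MYCIN-style certainty factors (CFs), as commonly described:
# a rule's conclusion gets CF(rule) * min(CF of its premises), and two
# positive CFs for the same hypothesis combine as cf1 + cf2 * (1 - cf1).
# The values below are illustrative, not from the real MYCIN rule base.

def rule_cf(rule_strength, premise_cfs):
    """Certainty of a conclusion from one rule."""
    return rule_strength * min(premise_cfs)

def combine(cf1, cf2):
    """Combine two positive CFs supporting the same hypothesis."""
    return cf1 + cf2 * (1 - cf1)

cf_a = rule_cf(0.7, [1.0, 0.9])   # one rule suggests the infection: CF 0.63
cf_b = rule_cf(0.5, [0.8])        # a second rule suggests it: CF 0.4
print(round(combine(cf_a, cf_b), 3))   # combined belief: 0.778
```

Note how two independent pieces of suggestive evidence raise the combined belief above either one alone, without ever exceeding 1.0.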
Ch-11
Introduction To Genetic Algorithms
Genetic Algorithms are inspired by Charles Darwin's theory of Evolution.
In Darwin's theory, the three main principles necessary for evolution to happen are:
1) Heredity — There must be a process by which children receive the traits of their parents
2) Variation — There must be a variety of traits present in the population or a means to introduce a variation
3) Selection — There must be a mechanism by which some members of the population can become parents and pass
down their genetic information while others do not (survival of the fittest).
- They simulate the process of natural selection, meaning that the species which can adapt to changes in their
environment are able to survive, reproduce, and pass on to the next generation. In simple words, they simulate
"survival of the fittest".
- Genetic Algorithms(GAs) are adaptive heuristic search algorithms that are part of evolutionary algorithms.
- They are an intelligent exploitation of a random search.
- Although randomized, Genetic Algorithms are by no means random.
- They are commonly used to generate high-quality solutions for optimization problems and search problems.
- Each generation consists of a population of individuals, and each individual represents a point in the search space and
a possible solution. Each individual is represented as a string of characters/integers/floats/bits. This string is
analogous to a chromosome.
• Individual - Any possible solution
• Population - Group of all individuals
• Search Space - All possible solutions to the problem
• Chromosome - Blueprint for an individual
• Trait – Features of an individual
• Genome - Collection of all chromosomes for an individual
Genetic Algorithm
3. Selection
- The two fittest chromosomes are selected for creating the next generation, and the other chromosomes are dropped.
- This pair of chromosomes will act as parents to generate offspring for the next generation.
- Some methods for parent Selection are:
o Tournament selection, Roulette wheel selection, Proportionate selection, Rank selection, Steady state selection, etc.
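One of the listed methods, tournament selection, can be sketched as follows; the bitstring encoding and fitness function (count of 1s) are illustrative assumptions:

```python
import random

# Sketch of tournament selection: pick k random individuals and keep the
# fittest. Fitness here is simply the count of 1s in a bitstring (an
# illustrative choice, not prescribed by the notes).

def fitness(ind):
    return sum(ind)

def tournament_select(population, k=3):
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

population = [[random.randint(0, 1) for _ in range(8)] for _ in range(20)]
parent = tournament_select(population)
```

Larger tournament sizes k apply stronger selection pressure, since weak individuals then rarely win a tournament.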
4. Crossover
- This represents mating between individuals. It is equivalent to two parents having a child.
- Two Chromosomes are selected using selection operator and crossover sites are chosen randomly.
- Then the genes at these crossover sites are exchanged thus creating a completely new individual (offspring).
- One common variant is uniform crossover, where each gene is independently taken from either parent.
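The basic scheme with a single randomly chosen crossover site can be sketched as follows (the bitstring encoding is an assumption for illustration):

```python
import random

# Sketch of single-point crossover between two parent bitstrings:
# one cut point is chosen at random and the tails are swapped.

def single_point_crossover(p1, p2):
    point = random.randint(1, len(p1) - 1)   # cut somewhere inside the string
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

c1, c2 = single_point_crossover([1, 1, 1, 1], [0, 0, 0, 0])
```

Note that crossover only recombines existing genes: taken together, the two offspring carry exactly the same gene values as the two parents.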
5. Mutation
- Mutation is applied to each child individually after the crossover.
- To avoid duplication (crossover generates offspring that are similar to their parents) and to enhance the diversity of the
offspring, one can perform mutation.
- Mutation randomly alters some features in the offspring.
- Common mutation variants include swap mutation, scramble mutation, and inversion mutation.
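The swap, scramble, and inversion variants can be sketched on a list-encoded chromosome; the encoding is an illustrative assumption:

```python
import random

# Sketches of three mutation variants on a list-encoded chromosome.
# Each returns a mutated copy; the input chromosome is left unchanged.

def swap_mutation(chrom):
    """Exchange the genes at two random positions."""
    c = chrom[:]
    i, j = random.sample(range(len(c)), 2)
    c[i], c[j] = c[j], c[i]
    return c

def scramble_mutation(chrom):
    """Shuffle the genes inside a random segment."""
    c = chrom[:]
    i, j = sorted(random.sample(range(len(c)), 2))
    segment = c[i:j + 1]
    random.shuffle(segment)
    c[i:j + 1] = segment
    return c

def inversion_mutation(chrom):
    """Reverse the order of the genes inside a random segment."""
    c = chrom[:]
    i, j = sorted(random.sample(range(len(c)), 2))
    c[i:j + 1] = reversed(c[i:j + 1])
    return c
```

All three variants only rearrange existing genes, so they are popular for permutation-encoded problems where every gene value must appear exactly once.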
The offspring population created by selection, crossover (recombination), and mutation replaces the original parent
population.
Significance of the genetic operators:
1. Selection
- The idea is to give preference to the individuals with good fitness scores and allow them to pass their genes to
successive generations.
2. Crossover
- Crossover ensures that offspring possess characteristics similar to both the parents.
- If no crossover is performed then the offspring would be exact copies of the parents with no improvements or
variations.
- Thus, crossover is an attempt to create better or fitter chromosomes from the existing good ones.
3. Mutation
- Mutation facilitates a sudden change in a gene within a chromosome, generating a solution that is far away from,
or dissimilar to, those in the current pool.
- If the mutant has better fitness then it will be taken up for the next offspring generation.
- If the mutant has a lesser fitness value, then it will gradually fade out, as the selection operator will ensure it is not
used for offspring generation.
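Putting the operators together, the whole loop can be sketched end to end on the toy "one-max" problem (evolve a bitstring of all 1s); every parameter value below is an illustrative choice, not prescribed by the notes:

```python
import random

# End-to-end sketch of the GA loop described above (selection, crossover,
# mutation) on the toy "one-max" problem: evolve a bitstring of all 1s.
# All parameter values are illustrative choices.

random.seed(0)
LENGTH, POP, GENS, MUT_RATE = 16, 30, 60, 0.02

def fitness(ind):
    return sum(ind)                      # number of 1s

def select(pop):                         # tournament selection of size 3
    return max(random.sample(pop, 3), key=fitness)

def crossover(p1, p2):                   # single-point crossover, one child
    cut = random.randint(1, LENGTH - 1)
    return p1[:cut] + p2[cut:]

def mutate(ind):                         # bit-flip mutation, per-gene rate
    return [1 - g if random.random() < MUT_RATE else g for g in ind]

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP)]

best = max(pop, key=fitness)
print(fitness(best))   # typically close to LENGTH (16)
```

Each generation the offspring population fully replaces its parents, exactly as described above; selection pressure steadily increases the average number of 1s until the population converges near the all-ones optimum.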