
Unit-II

Problem Solving: State Space Search and Control Strategies

Introduction:
A reflex agent in AI maps states directly into actions. When such an agent cannot operate in an environment, because the state-to-action mapping is too large to store or too hard to compute, the problem is handed over to a problem-solving agent, which breaks the large problem into smaller sub-problems and resolves them one by one. The final integrated actions yield the desired outcome.
On the basis of the problem and its working domain, different types of problem-solving agents are defined and used. They work at an atomic level, with no internal state visible to the problem-solving algorithm. The problem-solving agent performs precisely by defining the problem and its possible solutions. We can therefore say that problem solving is the part of artificial intelligence that encompasses a number of techniques, such as trees, B-trees, and heuristic algorithms, for solving a problem.
We can also say that a problem-solving agent is a result-driven agent that always focuses on satisfying its goals.
Steps of problem solving in AI: Problems in AI are directly associated with the nature of humans and their activities, so we need a finite number of steps to solve a problem in a way that makes human work easier.
These are the steps required to solve a problem:
 Goal Formulation: This is the first and simplest step in problem solving. It organizes the finite steps needed to formulate a target/goal that requires some actions to achieve it. Today, goal formulation is based on AI agents.
 Problem Formulation: One of the core steps of problem solving, it decides what actions should be taken to achieve the formulated goal. In AI this core part depends on a software agent, which uses the following components to formulate the associated problem.
Components used to formulate the associated problem (a code sketch follows the list):
 Initial State: The state from which the AI agent starts moving towards the specified goal. New methods also initialize the problem domain here through a specific class.
 Action: This stage of problem formulation works with a function of a specific class, taken from the initial state; all possible actions are gathered in this stage.
 Transition: This stage integrates the actual action done by the previous action stage and collects the resulting state, forwarding it to the next stage.
 Goal Test: This stage determines whether the state produced by the transition model is the specified goal. Once the goal is achieved, the actions stop and control moves on to the next stage, which determines the cost of achieving the goal.
 Path Costing: This component assigns a numeric cost to achieving the goal. It accounts for all hardware, software, and human working costs.
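To make these components concrete, here is a minimal sketch of a problem formulation in Python. The class name Problem and the toy route-finding data are illustrative assumptions, not part of any standard library.

    class Problem:
        """Minimal problem formulation: initial state, actions,
        transition model, goal test, and path cost."""

        def __init__(self, initial_state, goal_state, graph):
            self.initial_state = initial_state   # where the agent starts
            self.goal_state = goal_state         # what the agent wants to reach
            self.graph = graph                   # {state: {next_state: cost}}

        def actions(self, state):
            # All moves available in this state.
            return list(self.graph.get(state, {}))

        def result(self, state, action):
            # Transition model: here each action is simply the next state.
            return action

        def goal_test(self, state):
            return state == self.goal_state

        def step_cost(self, state, action):
            return self.graph[state][action]

    # Hypothetical toy map: moving between lettered locations.
    toy_map = {"A": {"B": 1, "C": 4}, "B": {"C": 2}, "C": {}}
    p = Problem("A", "C", toy_map)
    print(p.actions("A"))      # ['B', 'C']
    print(p.goal_test("C"))    # True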

General problem solving:


The General Problem Solver (GPS) was an AI program
proposed by Herbert Simon, J.C. Shaw, and Allen Newell. It was
the first useful computer program that came into existence in the AI
world. The goal was to make it work as a universal problem-solving
machine. Of course there were many software programs that existed
before, but these programs performed specific tasks. GPS was the
first program that was intended to solve any general problem. GPS
was supposed to solve all the problems using the same base
algorithm for every problem.

As you must have realized, this is quite an uphill battle! To program the GPS, the authors created a new language called Information Processing Language (IPL). The basic premise
is to express any problem with a set of well-formed formulas. These
formulas would be a part of a directed graph with multiple sources
and sinks. In a graph, the source refers to the starting node and the
sink refers to the ending node. In the case of GPS, the source refers
to axioms and the sink refers to the conclusions.

Even though GPS was intended to be general purpose, it could only solve well-defined problems, such as proving mathematical theorems
in geometry and logic. It could also solve word puzzles and play
chess. The reason was that these problems could be formalized to a
reasonable extent. But in the real world, this quickly becomes
intractable because of the number of possible paths you can take. If it
tries to brute force a problem by counting the number of walks in a
graph, it becomes computationally infeasible.

Characteristics of a problem:
 Decomposable to smaller or easier problems.
 Solution steps can be ignored or undone.
 Predictable problem universe.
 Good solutions are obvious.
 Uses internally consistent knowledge base.
 Requires lots of knowledge or uses knowledge to constrain
solutions
 Requires periodic interaction between human and computer

Exhaustive searches
In computer science, brute-force search or exhaustive search, also
known as generate and test, is a very general problem-
solving technique and algorithmic paradigm that consists of
systematically enumerating all possible candidates for the solution
and checking whether each candidate satisfies the problem's
statement.
A brute-force algorithm to find the divisors of a natural
number n would enumerate all integers from 1 to n, and check
whether each of them divides n without remainder. A brute-force
approach for the eight queens puzzle would examine all possible
arrangements of 8 pieces on the 64-square chessboard, and, for each
arrangement, check whether each (queen) piece can attack any other.
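As a minimal illustration of generate and test, the following sketch enumerates divisors by brute force; the function name find_divisors is illustrative:

    def find_divisors(n):
        # Brute force: systematically generate every candidate 1..n
        # and test whether it satisfies the problem statement.
        return [d for d in range(1, n + 1) if n % d == 0]

    print(find_divisors(36))   # [1, 2, 3, 4, 6, 9, 12, 18, 36]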
While a brute-force search is simple to implement and will always find a solution if one exists, its cost is proportional to the number of candidate solutions, which in many practical problems tends to grow very quickly as the size of the problem increases (combinatorial explosion).[1] Therefore, brute-force search is typically used when the problem size is limited, or when there are problem-specific heuristics that can be used to reduce the set of candidate solutions to a manageable size. The method is also used when the simplicity of implementation is more important than speed.
This is the case, for example, in critical applications where any errors
in the algorithm would have very serious consequences; or
when using a computer to prove a mathematical theorem. Brute-force
search is also useful as a baseline method when benchmarking other
algorithms or metaheuristics. Indeed, brute-force search can be
viewed as the simplest metaheuristic. Brute force search should not be
confused with backtracking, where large sets of solutions can be
discarded without being explicitly enumerated (as in the textbook
computer solution to the eight queens problem above). The brute-
force method for finding an item in a table – namely, check all entries
of the latter, sequentially – is called linear search.
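For instance, linear search is just brute force applied to lookup; a short sketch with illustrative names:

    def linear_search(table, target):
        # Check every entry of the table sequentially.
        for i, entry in enumerate(table):
            if entry == target:
                return i
        return -1   # not found

    print(linear_search([7, 3, 9, 1], 9))   # 2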
Heuristic search techniques
A heuristic is a technique for solving a problem faster than classic methods, or for finding an approximate solution when classic methods cannot find an exact one. It is a kind of shortcut, as we often trade one of optimality, completeness, accuracy, or precision for speed. A heuristic (or heuristic function) guides search algorithms: at each branching step it evaluates the available information and decides which branch to follow by ranking the alternatives. A heuristic is any device that is often effective but is not guaranteed to work in every case.
Heuristic Search Techniques in Artificial Intelligence

Direct Heuristic Search Techniques in AI


Other names for these are Blind Search, Uninformed Search, and Blind Control Strategy. They are not always feasible, since they can demand a great deal of time or memory. They search the entire state space for a solution and use an arbitrary ordering of operations. Examples are Breadth First Search (BFS) and Depth First Search (DFS).

Weak Heuristic Search Techniques in AI


Other names for these are Informed Search, Heuristic Search, and
Heuristic Control Strategy. These are effective if applied correctly to
the right types of tasks and usually demand domain-specific
information. We need this extra information to compute preference
among child nodes to explore and expand. Each node has a heuristic
function associated with it. Examples are Best First Search (BFS) and
A*.
Before we move on to describe certain techniques, let’s first take a
look at the ones we generally observe. Below, we name a few.

 Best-First Search

 A* Search

 Bidirectional Search

 Tabu Search

 Beam Search

 Simulated Annealing

 Hill Climbing

 Constraint Satisfaction Problems

Hill Climbing in Artificial Intelligence


First, let's talk about Hill Climbing in Artificial Intelligence. This is a heuristic for mathematical optimization problems: we need to choose values of the input that maximize or minimize a real-valued function. It is acceptable if the solution is not the global optimum.

Features of Hill Climbing in AI


Let’s discuss some of the features of this algorithm (Hill Climbing):

 It is a variant of the generate-and-test algorithm

 It makes use of the greedy approach


This means it keeps generating possible solutions until it finds the
expected solution, and moves only in the direction which optimizes
the cost function for it.

Types of Hill Climbing in AI

 Simple Hill Climbing: Examines one neighboring node at a time and selects the first one that improves the current cost as the next node.

 Steepest Ascent Hill Climbing: Examines all neighboring nodes and selects the one closest to the solution state.

 Stochastic Hill Climbing: Selects a neighboring node at random and decides whether to move to it or to examine another.

Let’s take a look at the algorithm for simple hill climbing (a Python sketch follows the steps).

1. Evaluate the initial state. If it is a goal state, stop and return success. Otherwise, make the initial state the current state.

2. Loop until a solution is found or until no new operators are left to apply to the current state:

a. Select a new operator and apply it to the current state, producing a new state.

b. Evaluate the new state:

 If it is a goal state, stop and return success.

 If it is better than the current state, make it the current state and proceed.

 If it is not better than the current state, continue looping until a solution is reached.

3. Exit.
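A minimal Python sketch of this loop, assuming a numeric score function to maximize and a neighbors function supplying candidate moves (both names are illustrative):

    def hill_climb(start, neighbors, score):
        # Simple hill climbing: accept the first neighbor that improves
        # the current score; stop when no neighbor improves it.
        current = start
        while True:
            improved = False
            for candidate in neighbors(current):
                if score(candidate) > score(current):
                    current = candidate
                    improved = True
                    break          # take the first improving move
            if not improved:
                return current     # local maximum (possibly not global)

    # Toy usage: maximize f(x) = -(x - 3)^2 over the integers.
    best = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
    print(best)   # 3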

Problems with Hill Climbing in AI

We usually run into one of three issues:

 Local Maximum: All neighboring states have values worse than the current state. Because the greedy approach never moves to a worse state, the process terminates even though a better solution may exist elsewhere. As a workaround, we use backtracking.

 Plateau: All neighbors have the same value, which makes it impossible to choose a direction. To escape, we randomly make a big jump.

 Ridge: At a ridge, movement in every possible single direction is downward. This makes the ridge look like a peak and terminates the process. To avoid this, we may apply two or more rules (moves) before testing.

Simulated Annealing Heuristic Search

In metallurgy, slow-cooling a metal brings it down to a state of low energy and gives it exemplary strength. We call this annealing. At high temperatures there is much random movement; at low temperatures there is little randomness.

In AI, we take a cue from this to produce simulated annealing. This is an optimization method in which we begin with an essentially random search at a high temperature and reduce the temperature slowly. Eventually, as the temperature approaches zero, the search becomes pure greedy descent. At each step, the process randomly selects a variable and a value, and accepts the new assignment outright when it is an improvement or does not lead to more conflicts. Otherwise, it accepts the worse assignment only with some probability that depends on the current temperature: the higher the temperature, the more likely a worse assignment is to be accepted.
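A minimal sketch of simulated annealing, assuming a numeric cost function to minimize and a neighbor generator; the exponential acceptance rule exp(-delta/T) is the usual choice, and all names here are illustrative:

    import math
    import random

    def simulated_annealing(start, neighbor, cost,
                            t_start=10.0, t_min=1e-3, alpha=0.95):
        current = start
        t = t_start
        while t > t_min:
            candidate = neighbor(current)
            delta = cost(candidate) - cost(current)
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or random.random() < math.exp(-delta / t):
                current = candidate
            t *= alpha          # geometric cooling schedule
        return current

    # Toy usage: minimize f(x) = (x - 3)^2 with random +/-1 steps.
    result = simulated_annealing(0, lambda x: x + random.choice([-1, 1]),
                                 lambda x: (x - 3) ** 2)
    print(result)   # usually close to 3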

Best-First Search (BFS) Heuristic Search

Often dubbed BFS, Best First Search is an informed search that uses an evaluation function to decide which adjacent node is the most promising to explore next. Breadth-first and depth-first searches blindly explore paths without keeping any cost function in mind. Things are different with Best First Search: here we use a priority queue ordered by node cost. Let's understand Best First Search through pseudocode; a runnable sketch follows it.

1. Define a list OPEN containing a single node s, the start node.

2. IF the list is empty, return failure.

3. Remove the node n with the best score from OPEN and move it to a list CLOSED.

4. Expand node n.

5. IF any successor of n is the goal node, return success and trace the path from the goal node back to s to return the solution.

6. FOR each successor node:

 Apply the evaluation function f to the node.

 IF the node is not in either list, add it to OPEN.

7. Loop back to step 2.
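A compact sketch of this loop in Python using heapq as the priority queue; graph, h, start, and goal are illustrative names for the inputs:

    import heapq

    def best_first_search(graph, h, start, goal):
        # OPEN is a priority queue ordered by the evaluation function f = h.
        open_list = [(h(start), start, [start])]
        closed = set()
        while open_list:
            _, node, path = heapq.heappop(open_list)   # best-scored node
            if node == goal:
                return path
            if node in closed:
                continue
            closed.add(node)
            for succ in graph.get(node, []):
                if succ not in closed:
                    heapq.heappush(open_list, (h(succ), succ, path + [succ]))
        return None   # failure: OPEN is empty

    # Hypothetical toy graph and heuristic values.
    g = {"S": ["A", "B"], "A": ["G"], "B": ["G"]}
    h = {"S": 2, "A": 1, "B": 3, "G": 0}
    print(best_first_search(g, h.get, "S", "G"))   # ['S', 'A', 'G']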


Iterative deepening A*
Iterative deepening A* (IDA*) is a graph traversal and path
search algorithm that can find the shortest path between a designated
start node and any member of a set of goal nodes in a weighted graph.
It is a variant of iterative deepening depth-first search that borrows the
idea to use a heuristic function to evaluate the remaining cost to get to
the goal from the A* search algorithm. Since it is a depth-first search
algorithm, its memory usage is lower than in A*, but unlike ordinary
iterative deepening search, it concentrates on exploring the most
promising nodes and thus does not go to the same depth everywhere
in the search tree. Unlike A*, IDA* does not utilize dynamic
programming and therefore often ends up exploring the same nodes
many times.
While the standard iterative deepening depth-first search uses the search depth as the cutoff for each iteration, IDA* uses the more informative f(n) = g(n) + h(n), where g(n) is the cost to travel from the root to node n and h(n) is a problem-specific heuristic estimate of the cost to travel from n to the goal.
The algorithm was first described by Richard Korf in 1985.
Iterative-deepening-A* works as follows: at each iteration, perform a
depth-first search, cutting off a branch when its total cost exceeds a
given threshold.

This threshold starts at the estimate of the cost at the initial state, and
increases for each iteration of the algorithm. At each iteration, the
threshold used for the next iteration is the minimum cost of all values
that exceeded the current threshold.

Pseudocode
path              current search path (acts like a stack)
node              current node (last node in the current path)
g                 the cost to reach the current node
f                 estimated cost of the cheapest path (root..node..goal)
h(node)           estimated cost of the cheapest path (node..goal)
cost(node, succ)  step cost function
is_goal(node)     goal test
successors(node)  node expanding function; expand nodes ordered by g + h(node)
ida_star(root)    returns either NOT_FOUND or a pair with the best path and its cost

procedure ida_star(root)
    bound := h(root)
    path := [root]
    loop
        t := search(path, 0, bound)
        if t = FOUND then return (path, bound)
        if t = ∞ then return NOT_FOUND
        bound := t
    end loop
end procedure

function search(path, g, bound)
    node := path.last
    f := g + h(node)
    if f > bound then return f
    if is_goal(node) then return FOUND
    min := ∞
    for succ in successors(node) do
        if succ not in path then
            path.push(succ)
            t := search(path, g + cost(node, succ), bound)
            if t = FOUND then return FOUND
            if t < min then min := t
            path.pop()
        end if
    end for
    return min
end function
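A runnable Python translation of this pseudocode, under the assumption that the graph is given as adjacency dictionaries with edge costs and h as a dictionary (all names illustrative):

    import math

    def ida_star(root, graph, h, is_goal):
        # graph: {node: {succ: step_cost}}, h: {node: heuristic estimate}
        bound = h[root]
        path = [root]

        def search(g, bound):
            node = path[-1]
            f = g + h[node]
            if f > bound:
                return f
            if is_goal(node):
                return "FOUND"
            minimum = math.inf
            for succ, step in sorted(graph.get(node, {}).items()):
                if succ not in path:
                    path.append(succ)
                    t = search(g + step, bound)
                    if t == "FOUND":
                        return t
                    minimum = min(minimum, t)
                    path.pop()
            return minimum

        while True:
            t = search(0, bound)
            if t == "FOUND":
                return path, bound
            if t == math.inf:
                return None            # NOT_FOUND
            bound = t                  # next threshold: smallest f that exceeded it

    # Toy usage on a hypothetical weighted graph.
    g = {"S": {"A": 1, "B": 4}, "A": {"G": 5}, "B": {"G": 1}}
    h = {"S": 4, "A": 3, "B": 1, "G": 0}
    print(ida_star("S", g, h, lambda n: n == "G"))   # (['S', 'B', 'G'], 5)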

Properties
Like A*, IDA* is guaranteed to find the shortest path leading from the given start node to any goal node in the problem graph, if the heuristic function h is admissible,[2] that is,
h(n) ≤ h*(n)
for all nodes n, where h* is the true cost of the shortest path from n to the nearest goal (the "perfect heuristic").[3]
IDA* is beneficial when the problem is memory constrained. A* search keeps a large queue of unexplored nodes that can quickly fill up memory. By contrast, because IDA* does not remember any nodes except the ones on the current path, it requires an amount of memory that is only linear in the length of the solution that it constructs. Its time complexity is analyzed by Korf et al. under the assumption that the heuristic cost estimate h is consistent, meaning that
h(n) ≤ cost(n, n') + h(n')
for all nodes n and all neighbors n' of n; they conclude that compared to a brute-force tree search over an exponential-sized problem, IDA* achieves a smaller search depth (by a constant factor), but not a smaller branching factor.
Constraint satisfaction
In artificial intelligence and operations research, constraint satisfaction is the process of finding a solution to a set of constraints that impose conditions that the variables must satisfy.[1] A solution is therefore a set of values for the variables that satisfies all constraints, that is, a point in the feasible region.
The techniques used in constraint satisfaction depend on the kind of
constraints being considered. Often used are constraints on a finite
domain, to the point that constraint satisfaction problems are typically
identified with problems based on constraints on a finite domain.
Such problems are usually solved via search, in particular a form of backtracking or local search. Constraint propagation is another method used on such problems; most such methods are incomplete in general, that is, they may solve the problem or prove it unsatisfiable, but not always. Constraint propagation methods are also used in conjunction with search to make a given problem simpler to solve.
Other considered kinds of constraints are on real or rational numbers;
solving problems on these constraints is done via variable
elimination or the simplex algorithm.
Constraint satisfaction originated in the field of artificial intelligence in the 1970s (see for example (Laurière 1978)). During the 1980s and 1990s, embeddings of constraints into programming languages were developed. Languages often used for constraint programming are Prolog and C++.
As originally defined in artificial intelligence, constraints enumerate the possible values a set of variables may take in a given world. A possible world is a total assignment of values to variables representing a way the world (real or imaginary) could be.[2]
Informally, a finite domain is a finite set of arbitrary elements. A constraint satisfaction problem on such a domain contains a set of variables whose values can only be taken from the domain, and a set of constraints, each constraint specifying the allowed values for a group of variables. A solution to this problem is an evaluation of the variables that satisfies all constraints; in other words, a solution is a way of assigning a value to each variable such that all constraints are satisfied by these values.
In some circumstances, there may be additional requirements: one may be interested not only in the solution (and in the fastest or most computationally efficient way to reach it) but in how it was reached; e.g. one may want the "simplest" solution ("simplest" in a logical, non-computational sense that has to be precisely defined). This is often the case in logic games such as Sudoku.
Solving
Constraint satisfaction problems on finite domains are typically
solved using a form of search. The most used techniques are variants
of backtracking, constraint propagation, and local search. These
techniques are used on problems with nonlinear constraints.
Variable elimination and the simplex algorithm are used for
solving linear and polynomial equations and inequalities, and
problems containing variables with infinite domain. These are
typically solved as optimization problems in which the optimized
function is the number of violated constraints.
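A minimal backtracking solver for a finite-domain CSP, sketched in Python; the representation (domains as a dict, constraints as binary predicates) is one illustrative choice among many:

    def backtrack(assignment, domains, constraints):
        # assignment: {var: value}; domains: {var: iterable of values};
        # constraints: list of (var1, var2, predicate) binary constraints.
        if len(assignment) == len(domains):
            return assignment                       # every variable assigned
        var = next(v for v in domains if v not in assignment)
        for value in domains[var]:
            assignment[var] = value
            ok = all(pred(assignment[a], assignment[b])
                     for a, b, pred in constraints
                     if a in assignment and b in assignment)
            if ok:
                result = backtrack(assignment, domains, constraints)
                if result is not None:
                    return result
            del assignment[var]                     # undo and try the next value
        return None                                 # no value works: backtrack

    # Toy usage: color a triangle graph with 3 colors, adjacent nodes differ.
    doms = {"X": "RGB", "Y": "RGB", "Z": "RGB"}
    neq = lambda u, v: u != v
    cons = [("X", "Y", neq), ("Y", "Z", neq), ("X", "Z", neq)]
    print(backtrack({}, doms, cons))   # {'X': 'R', 'Y': 'G', 'Z': 'B'}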
Constraint satisfaction toolkits
 Constraint satisfaction toolkits are software
libraries for imperative programming languages that are used to
encode and solve a constraint satisfaction problem.
 Cassowary constraint solver, an open source project for
constraint satisfaction (accessible from C, Java, Python and
other languages).
 Comet, a commercial programming language and toolkit
 Gecode, an open source portable toolkit written in C++
developed as a production-quality and highly efficient
implementation of a complete theoretical background.
 Gelisp, an open source portable wrapper[4] of Gecode to Lisp (http://gelisp.sourceforge.net/).

 IBM ILOG CP Optimizer: C++, Python, Java, and .NET libraries (proprietary, free for academic use).[5] Successor of ILOG Solver/Scheduler, which was considered the market leader in commercial constraint programming software as of 2006.

Problem reduction and game playing

Problem reduction:
We already know the divide and conquer strategy: a solution to a problem can be obtained by decomposing it into smaller sub-problems. Each of these sub-problems can then be solved to get its sub-solution. These sub-solutions can then be recombined to get a solution to the problem as a whole. This is called problem reduction. The method generates arcs called AND arcs. One AND arc may point to any number of successor nodes, all of which must be solved in order for the arc to point to a solution.
Problem reduction algorithm:
1. Initialize the graph to the starting node.
2. Loop until the starting node is labelled SOLVED or until its cost goes above FUTILITY:
(i) Traverse the graph, starting at the initial node and following the current best path, and accumulate the set of nodes that are on that path and have not yet been expanded.
(ii) Pick one of these unexpanded nodes and expand it. If there are no successors, assign FUTILITY as the value of this node. Otherwise, add its successors to the graph and compute f'(n) for each of them. If f'(n) of any node is 0, mark that node as SOLVED.
(iii) Change the f'(n) estimate of the newly expanded node to reflect the new information provided by its successors. Propagate this change backwards through the graph. If any node contains a successor arc whose descendants are all solved, label the node itself as SOLVED.
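To illustrate the AND-arc idea, here is a small sketch that computes the cost of solving an AND-OR tree: an OR node picks its cheapest alternative, while an AND arc must pay for all of its successors. The tree encoding is an illustrative assumption:

    def solve_cost(node, tree, edge_cost=1):
        # tree maps a node to ("OR", children) or ("AND", children);
        # nodes absent from the tree are leaves and count as solved (cost 0).
        if node not in tree:
            return 0
        kind, children = tree[node]
        costs = [edge_cost + solve_cost(c, tree) for c in children]
        # OR: choose the single cheapest alternative.
        # AND: every successor of the AND arc must be solved, so sum them.
        return min(costs) if kind == "OR" else sum(costs)

    # Hypothetical goal tree: A is solved either via B, or via C AND D.
    tree = {"A": ("OR", ["B", "CD"]),
            "CD": ("AND", ["C", "D"]),
            "B": ("OR", ["E"])}
    print(solve_cost("A", tree))   # 2: via B (A -> B -> E), cheaper than C AND D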

Game playing:
Game playing is an important domain of artificial intelligence. Games don't require much knowledge; the only knowledge we need to provide is the rules, the legal moves, and the conditions of winning or losing the game.
Both players try to win the game, so each tries to make the best possible move at each turn. Searching techniques like BFS (Breadth First Search) are not practical here, as the branching factor is very high, so searching would take a lot of time.
The most common search technique in game playing is the Minimax search procedure. It is a depth-first, depth-limited search procedure, used for games like chess and tic-tac-toe.

Mini-Max Algorithm in Artificial Intelligence


o The mini-max algorithm is a recursive or backtracking algorithm used in decision-making and game theory. It provides an optimal move for the player, assuming that the opponent also plays optimally.
o The mini-max algorithm uses recursion to search through the game tree.
o The min-max algorithm is mostly used for game playing in AI, in games such as chess, checkers, tic-tac-toe, Go, and various other two-player games. The algorithm computes the minimax decision for the current state.
o In this algorithm two players play the game; one is called MAX and the other is called MIN.
o Each player fights so that the opponent gets the minimum benefit while they themselves get the maximum benefit.
o Both players are opponents of each other: MAX will select the maximized value and MIN will select the minimized value.
o The minimax algorithm performs a depth-first search for the exploration of the complete game tree.
o The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the tree as the recursion unwinds.

Pseudo-code for the MinMax Algorithm:

function minimax(node, depth, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then            // for the Maximizer player
        maxEva := -infinity
        for each child of node do
            eva := minimax(child, depth - 1, false)
            maxEva := max(maxEva, eva)  // keep the maximum of the values
        return maxEva

    else                                // for the Minimizer player
        minEva := +infinity
        for each child of node do
            eva := minimax(child, depth - 1, true)
            minEva := min(minEva, eva)  // keep the minimum of the values
        return minEva
Working of the Min-Max Algorithm:
o The working of the minimax algorithm can be easily described using an example. Below we take an example of a game tree representing a two-player game.
o In this example there are two players, one called Maximizer and the other called Minimizer.
o Maximizer will try to get the maximum possible score, and Minimizer will try to get the minimum possible score.
o This algorithm applies DFS, so in this game tree we have to go all the way down to the leaves to reach the terminal nodes.
o At the terminal nodes the terminal values are given, so we compare those values and backtrack up the tree until the initial state is reached. The following are the main steps involved in solving the two-player game tree:

Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to get the utility values for the terminal states. In the tree diagram, let A be the initial state of the tree. Suppose the maximizer takes the first turn, with a worst-case initial value of -infinity, and the minimizer takes the next turn, with a worst-case initial value of +infinity.

Step 2: Now we first find the utility values for the Maximizer. Its initial value is -∞, so we compare each terminal value with this initial value and keep the higher one, finding the maximum among them all.
o For node D: max(-1, -∞) = -1, then max(-1, 4) = 4
o For node E: max(2, -∞) = 2, then max(2, 6) = 6
o For node F: max(-3, -∞) = -3, then max(-3, -5) = -3
o For node G: max(0, -∞) = 0, then max(0, 7) = 7

Step 3: In the next step it is the minimizer's turn, so it compares the node values with +∞ and finds the third-layer node values.

o For node B: min(4, 6) = 4

o For node C: min(-3, 7) = -3

Step 4: Now it's the Maximizer's turn again; it chooses the maximum of all node values and finds the value of the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be more than 4 layers.
o For node A: max(4, -3) = 4

That was the complete workflow of the minimax two player game.
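The worked example can be reproduced with a runnable version of the pseudocode above, encoding the game tree as nested lists (an illustrative representation):

    def minimax(node, depth, maximizing_player):
        # Terminal test: a bare number is a leaf, and its value is
        # the static evaluation.
        if depth == 0 or not isinstance(node, list):
            return node
        if maximizing_player:
            return max(minimax(child, depth - 1, False) for child in node)
        return min(minimax(child, depth - 1, True) for child in node)

    # Tree from the worked example: D=(-1,4), E=(2,6), F=(-3,-5), G=(0,7).
    tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
    print(minimax(tree, 3, True))   # 4, the value computed for root node A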

Properties of the Mini-Max algorithm:

o Complete: The min-max algorithm is complete; it will definitely find a solution (if one exists) in a finite search tree.
o Optimal: The min-max algorithm is optimal if both opponents play optimally.
o Time complexity: As it performs DFS on the game tree, the time complexity of the min-max algorithm is O(b^m), where b is the branching factor of the game tree and m is the maximum depth of the tree.
o Space complexity: The space complexity of the mini-max algorithm is similar to that of DFS, which is O(bm).

Limitation of the minimax Algorithm:


The main drawback of the minimax algorithm is that it gets really slow for complex games such as chess, Go, etc.

Such games have a huge branching factor, and the player has many choices to consider. This limitation of the minimax algorithm can be mitigated by alpha-beta pruning, which we discuss in the next topic.
Alpha-Beta Pruning

o Alpha-beta pruning is a modified version of the minimax algorithm: an optimization technique for the minimax algorithm.
o As we have seen, the number of game states the minimax search algorithm has to examine is exponential in the depth of the tree. We cannot eliminate the exponent, but we can cut it in half. There is a technique by which we can compute the correct minimax decision without checking each node of the game tree, and this technique is called pruning. It involves two threshold parameters, alpha and beta, for future expansion, so it is called alpha-beta pruning. It is also called the Alpha-Beta Algorithm.
o Alpha-beta pruning can be applied at any depth of a tree, and it sometimes prunes not only the tree's leaves but entire sub-trees.
o The two parameters can be defined as:

a. Alpha: The best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
b. Beta: The best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.

o Alpha-beta pruning returns the same move as the standard minimax algorithm, but it removes all the nodes that do not really affect the final decision yet make the algorithm slow. By pruning these nodes, it makes the algorithm fast.

Condition for alpha-beta pruning:

The main condition required for alpha-beta pruning is:

α >= β
Key points about alpha-beta pruning:
o The Max player will only update the value of alpha.
o The Min player will only update the value of beta.
o While backtracking the tree, the node values will be passed to upper
nodes instead of values of alpha and beta.
o We will only pass the alpha, beta values to the child nodes.
Pseudo-code for Alpha-beta Pruning:

function minimax(node, depth, alpha, beta, maximizingPlayer) is
    if depth == 0 or node is a terminal node then
        return static evaluation of node

    if maximizingPlayer then            // for the Maximizer player
        maxEva := -infinity
        for each child of node do
            eva := minimax(child, depth - 1, alpha, beta, false)
            maxEva := max(maxEva, eva)
            alpha := max(alpha, maxEva)
            if beta <= alpha then break // prune the remaining children
        return maxEva

    else                                // for the Minimizer player
        minEva := +infinity
        for each child of node do
            eva := minimax(child, depth - 1, alpha, beta, true)
            minEva := min(minEva, eva)
            beta := min(beta, eva)
            if beta <= alpha then break // prune the remaining children
        return minEva

Working of Alpha-Beta Pruning:

Let's take an example of a two-player search tree to understand the working of alpha-beta pruning.

Step 1: In the first step, the Max player starts the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3, and max(2, 3) = 3 becomes the value of α at node D; the node value will also be 3.

Step 3: The algorithm now backtracks to node B, where the value of β will change, as this is Min's turn. Now β = +∞ is compared with the available successor node value, i.e. min(∞, 3) = 3, hence at node B now α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down as well.

Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α >= β, the right successor of E is pruned, and the algorithm does not traverse it; the value at node E will be 5.

Step 5: In the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha changes: the maximum available value is 3, as max(-∞, 3) = 3, and β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.

Step 6: At node F, the value of α is again compared with the left child, which is 0, and max(3, 0) = 3, and then with the right child, which is 1, and max(3, 1) = 3; α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G, is pruned, and the algorithm does not compute the entire sub-tree G.

Step 8: C now returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows the nodes that were computed and the nodes that were never computed. Hence, the optimal value for the maximizer is 3 in this example.
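The walkthrough can be checked with a runnable version of the pruning pseudocode; the leaf values below match the example, with hypothetical values standing in for the pruned leaves under E and G:

    def alphabeta(node, depth, alpha, beta, maximizing):
        if not isinstance(node, list):           # terminal node
            return node
        if maximizing:
            value = float('-inf')
            for child in node:
                value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
                alpha = max(alpha, value)
                if beta <= alpha:
                    break                        # prune remaining children
            return value
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:
                break                            # prune remaining children
        return value

    # D=(2,3), E=(5,·), F=(0,1), G=(·,·): the pruned leaves are never examined.
    tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
    print(alphabeta(tree, 3, float('-inf'), float('inf'), True))   # 3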
Move Ordering in Alpha-Beta pruning:
The effectiveness of alpha-beta pruning is highly dependent on the
order in which each node is examined. Move order is an important
aspect of alpha-beta pruning.

It can be of two types:

o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of the leaves of the tree and works exactly like the minimax algorithm. It also consumes more time because of the alpha-beta bookkeeping; such an ordering is called worst ordering. In this case, the best move occurs on the right side of the tree. The time complexity for such an order is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree and the best moves occur on the left side of the tree. We apply DFS, so it searches the left of the tree first and goes twice as deep as the minimax algorithm in the same amount of time. The complexity for ideal ordering is O(b^(m/2)).

Rules to find good ordering:

Following are some rules to find good ordering in alpha-beta pruning:

o Make the best move occur from the shallowest node.

o Order the nodes in the tree such that the best nodes are checked first.
o Use domain knowledge when finding the best move; e.g. for chess, try this order: captures first, then threats, then forward moves, then backward moves.
o Bookkeep the states, as there is a possibility that states may repeat.

Two-player perfect-information games

The classic examples of deterministic, two-player, zero-sum, perfect-information games are chess, checkers, tic-tac-toe, and Go. For example, a minimizer choosing among child nodes with scores 3, 12, and 8 picks the lowest: min(3, 12, 8) = 3.
 Perfect information: A game with perfect information is one in which agents can see the complete board. Agents have all the information about the game, and they can see each other's moves as well. Examples are Chess, Checkers, Go, etc.
 Imperfect information: If agents in a game do not have all the information about the game and are not aware of what is going on, such games are called games with imperfect information. Examples are Battleship, blind tic-tac-toe, Bridge, etc.
 Deterministic games: Deterministic games are those which follow a strict pattern and set of rules, with no randomness associated with them. Examples are chess, Checkers, Go, tic-tac-toe, etc.
 Non-deterministic games: Non-deterministic games are those which involve unpredictable events and a factor of chance or luck, introduced by dice or cards. Actions are random, and each action's outcome is not fixed. Such games are also called stochastic games. Examples: Backgammon, Monopoly, Poker, etc.
Zero-Sum Game
o Zero-sum games are adversarial searches that involve pure competition.
o In a zero-sum game, each agent's gain or loss of utility is exactly balanced by the losses or gains of utility of the other agent.
o One player of the game tries to maximize one single value, while the other player tries to minimize it.
o Each move by one player in the game is called a ply.
o Chess and tic-tac-toe are examples of zero-sum games.

Zero-sum game: Embedded thinking


The zero-sum game involves embedded thinking, in which one agent or player is trying to figure out:

o What to do.
o How to decide on the move.
o That he needs to think about his opponent as well.
o That the opponent is also thinking about what to do.

Each player is trying to find out the response of his opponent to their actions. This requires embedded thinking, or backward reasoning, to solve game problems in AI.

Formalization of the problem:


A game can be defined as a type of search in AI, formalized by the following elements (a code skeleton follows the list):

o Initial state: Specifies how the game is set up at the start.
o Player(s): Specifies which player is to move in a state.
o Actions(s): Returns the set of legal moves in a state.
o Result(s, a): The transition model, which specifies the result of a move in the state space.
o Terminal-Test(s): The terminal test is true if the game is over and false otherwise. States where the game has ended are called terminal states.
o Utility(s, p): A utility function gives the final numeric value of a game that ends in terminal state s for player p. It is also called the payoff function. For chess, the outcomes are a win, loss, or draw, with payoff values +1, 0, and ½. For tic-tac-toe, the utility values are +1, -1, and 0.
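These elements map naturally onto a small interface; the sketch below is an illustrative skeleton, not a standard API:

    class Game:
        # Skeleton mirroring the formal elements above.
        def initial_state(self): ...          # how the game is set up
        def player(self, s): ...              # whose turn it is in state s
        def actions(self, s): ...             # legal moves in state s
        def result(self, s, a): ...           # transition model
        def terminal_test(self, s): ...       # True when the game is over
        def utility(self, s, p): ...          # payoff for player p at terminal s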

Game tree:
A game tree is a tree whose nodes are game states and whose edges are the moves made by the players. A game tree involves the initial state, the actions function, and the result function.

Example: Tic-Tac-Toe game tree:

The following figure shows part of the game tree for the tic-tac-toe game. The following are some key points of the game:

o There are two players, MAX and MIN.
o The players take alternate turns, starting with MAX.
o MAX maximizes the result of the game tree.
o MIN minimizes the result.
Example Explanation:

o From the initial state, MAX has 9 possible moves, as he goes first. MAX places x and MIN places o, and the two players move alternately until we reach a leaf node where one player has three in a row or all squares are filled.
o Both players compute, for each node, the minimax value: the best achievable utility against an optimal adversary.
o Suppose both players know tic-tac-toe well and play their best game. Each player does his best to prevent the other from winning; MIN is acting against MAX in the game.
o So in the game tree we have a layer for MAX and a layer for MIN, and each layer is called a ply. MAX places x, then MIN places o to prevent MAX from winning, and the game continues until a terminal node is reached.
o In the end, either MIN wins, MAX wins, or it is a draw. This game tree is the whole search space of possibilities when MIN and MAX play tic-tac-toe, taking turns alternately.
Hence, adversarial search with the minimax procedure works as follows:

o It aims to find the optimal strategy for MAX to win the game.
o It follows the approach of depth-first search.
o In the game tree, the optimal leaf node could appear at any depth of the tree.
o Minimax values are propagated up the tree until the terminal node is discovered.

In a given game tree, the optimal strategy can be determined from the minimax value of each node, written MINIMAX(n). MAX prefers to move to a state of maximum value and MIN prefers to move to a state of minimum value, so:

MINIMAX(s) = UTILITY(s)                                   if TERMINAL-TEST(s)
           = max over actions a of MINIMAX(RESULT(s, a))  if PLAYER(s) = MAX
           = min over actions a of MINIMAX(RESULT(s, a))  if PLAYER(s) = MIN
