
Adversarial Search

Games vs. search problems

• "Unpredictable" opponent ⇒ the solution is a strategy specifying a move for every possible
opponent reply
• Time limits ⇒ unlikely to find the goal, must approximate
Formal definition of game as search problem

• The initial state, which includes the board position and an indication of whose
move it is.
• A set of operators, which define the legal moves that a player can make.
• A terminal test, which determines when the game is over. States where the
game has ended are called terminal states.
• A utility function (also called a payoff function), which gives a numeric value
for the outcome of a game. In chess, the outcome is a win, loss, or draw, which
we can represent by the values +1, -1, or 0.
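The four components above can be sketched for tic-tac-toe as follows. This is only an illustration: the state layout (a tuple of nine cells plus the player to move) and every function name are assumptions of this sketch, not something the slides define.

```python
from typing import List, Optional, Tuple

# A state is (board, player_to_move); the board is a tuple of 9 cells
# holding "X", "O" or None. This layout is an assumption of the sketch.
State = Tuple[Tuple[Optional[str], ...], str]

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def initial_state() -> State:
    """Initial state: empty board, with MAX (playing X) to move."""
    return (None,) * 9, "X"


def legal_moves(state: State) -> List[int]:
    """Operators: the indices of the empty squares."""
    board, _ = state
    return [i for i, cell in enumerate(board) if cell is None]


def result(state: State, move: int) -> State:
    """Apply a move and hand the turn to the other player."""
    board, player = state
    new_board = board[:move] + (player,) + board[move + 1:]
    return new_board, "O" if player == "X" else "X"


def winner(state: State) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    board, _ = state
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(state: State) -> bool:
    """Terminal test: someone has won or the board is full."""
    return winner(state) is not None or not legal_moves(state)


def utility(state: State) -> int:
    """Utility for MAX: +1 if X won, -1 if O won, 0 for a draw."""
    return {"X": 1, "O": -1, None: 0}[winner(state)]
```

For example, after X plays squares 0, 1 and 2 while O plays 3 and 4, `is_terminal` is true and `utility` returns +1.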
Perfect Decisions in two person Games

• We consider the general case of a game with two players, whom we will call MAX
and MIN.

• MAX moves first, and then they take turns moving until the game is over.

• At the end of the game, points are awarded to the winning player.
Perfect Decisions in two person Games

• In a normal search problem, MAX would simply search for a sequence of moves that leads to a
terminal state that is a win, and then make the first move in
the sequence.
• Unfortunately, MIN has something to say about it.
• MAX therefore must find a strategy that will lead to a winning terminal
state regardless of what MIN does, where the strategy includes the correct
move for MAX for each possible move by MIN.
Tic-Tac-Toe Game tree (2-player, deterministic, turns)
• From the initial state, MAX has a choice of nine possible moves.
• Play alternates between MAX placing x's and MIN placing o's
until we reach leaf nodes corresponding to terminal states:
states where one player has three in a row or all the squares are
filled.
• The number on each leaf node indicates the utility value of the
terminal state from the point of view of MAX; high values are
assumed to be good for MAX and bad for MIN.
• It is MAX'S job to use the search tree to determine the best
move.
Generate Game Tree
Searching with an opponent
• Heuristic approximation: define an evaluation function that indicates how close a state is
to a win (or a loss)
• This function includes domain information.
• It does not represent a cost or a distance in steps.
• Conventionally:
• A winning move is represented by the value “+∞”.
• A losing move is represented by the value “-∞”.
• The algorithm searches with limited depth.

• Each new decision implies repeating part of the search.


Evaluation Function
• The game tree is fully developed to a given number of levels.
• An evaluation function is applied to each leaf node. These values represent how
good each node is for the max player.
• These values are then propagated upwards, indicating the next best move for the
max player
• The success of the minimax algorithm depends on how accurate the evaluation
function is.
• MAX plays with “X” and wants to maximize e.
• MIN plays with “O” and wants to minimize e.
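The procedure described above (expand the tree to a fixed depth, apply the evaluation function e at the frontier, back the values up) might look like the sketch below. The function name, its parameters, and the toy game used to exercise it are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")


def depth_limited_minimax(
    state: S,
    depth: int,
    maximizing: bool,
    successors: Callable[[S], Iterable[S]],
    evaluate: Callable[[S], float],
) -> float:
    """Back up evaluation-function values from a fixed search depth."""
    children = list(successors(state))
    # At the depth limit (or when no moves remain) score the node with e.
    if depth == 0 or not children:
        return evaluate(state)
    values = [
        depth_limited_minimax(c, depth - 1, not maximizing, successors, evaluate)
        for c in children
    ]
    # MAX picks the largest backed-up value, MIN the smallest.
    return max(values) if maximizing else min(values)


# Toy game (hypothetical): a state is an integer n whose successors are
# 2n and 2n + 1, and the evaluation function is n % 5.
def toy_successors(n: int) -> List[int]:
    return [2 * n, 2 * n + 1]


def toy_evaluate(n: int) -> int:
    return n % 5


print(depth_limited_minimax(1, 2, True, toy_successors, toy_evaluate))  # prints 1
```

In the toy run, the frontier states 4, 5, 6 and 7 score 4, 0, 1 and 2; MIN backs up 0 and 1, and MAX chooses 1.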
Evaluation Function
• (3X + O) – (3Z + Y)
• X = Number of rows, columns or diagonals with two X’s and no O’s
• O = Number of rows, columns or diagonals with one X and no O’s
• Z = Number of rows, columns or diagonals with two O’s and no X’s
• Y = Number of rows, columns or diagonals with one O and no X’s
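A possible implementation of this evaluation function is sketched below. The board representation (a list of nine cells containing "X", "O" or None) and the helper names are assumptions; the sketch also reads "two X's" and "one X" as exact counts. The counts are given descriptive names because the formula's symbol O is easy to confuse with the mark O.

```python
from typing import List, Optional

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

Board = List[Optional[str]]


def count_lines(board: Board, mark: str, other: str, n: int) -> int:
    """Lines holding exactly n copies of `mark` and none of `other`."""
    return sum(
        1
        for line in LINES
        if sum(board[i] == mark for i in line) == n
        and all(board[i] != other for i in line)
    )


def evaluate(board: Board) -> int:
    two_x = count_lines(board, "X", "O", 2)  # X in the formula
    one_x = count_lines(board, "X", "O", 1)  # O in the formula
    two_o = count_lines(board, "O", "X", 2)  # Z in the formula
    one_o = count_lines(board, "O", "X", 1)  # Y in the formula
    return (3 * two_x + one_x) - (3 * two_o + one_o)


# X on squares 0 and 1 (top row), O in the center.
board = ["X", "X", None,
         None, "O", None,
         None, None, None]
print(evaluate(board))  # prints 2
```

Here the top row counts as a two-X line, the left column as a one-X line, and the middle row and one diagonal as one-O lines, giving (3 + 1) - (0 + 2) = 2.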
Evaluation Function for Tic Tac Toe
• (3X + O) – (3Z + Y)
(Figure: example tic-tac-toe positions evaluated with this function.)
Another Evaluation Function for Tic Tac Toe
• e(n) =(number of rows, columns and diagonals available to Max) -
(number of rows, columns and diagonals available to Min)

(Figure: two example positions, with e(n) = 6 - 4 = 2 and e(n) = 4 - 3 = 1.)
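The "available lines" function can be sketched as follows, taking a line to be available to a player when it contains none of the opponent's marks. The board representation and the function names are assumptions made for this example.

```python
from typing import List, Optional

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

Board = List[Optional[str]]


def open_lines(board: Board, opponent_mark: str) -> int:
    """Lines that contain no mark of the opponent."""
    return sum(
        1 for line in LINES if all(board[i] != opponent_mark for i in line)
    )


def e(board: Board) -> int:
    # Lines still open to MAX (X) minus lines still open to MIN (O).
    return open_lines(board, "O") - open_lines(board, "X")


# X in the center, O in the middle of the top row.
board = [None, "O", None,
         None, "X", None,
         None, None, None]
print(e(board))  # prints 2
```

In this position O blocks two of the eight lines for MAX and X blocks four for MIN, so e(n) = 6 - 4 = 2.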
Example Utility Functions for Chess
• Chess: Assume Max is “White”
• Assume each piece has the following values
• pawn = 1; knight = 3; bishop = 3;
• rook = 5; queen = 9;
• let w = sum of the value of white pieces
• let b = sum of the value of black pieces

• e(n) = (w - b) / (w + b)
• Note that this value ranges between -1 and 1.
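A small sketch of this material-balance function follows. Piece lists are passed as plain lists of piece names, which is an assumption of the example, as is the choice to return 0 when neither side has any valued piece left (the slide does not cover that case).

```python
from typing import Iterable

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def material(pieces: Iterable[str]) -> int:
    """Sum of the values of the given pieces (kings are not counted)."""
    return sum(PIECE_VALUES[p] for p in pieces)


def evaluate_material(white: Iterable[str], black: Iterable[str]) -> float:
    w, b = material(white), material(black)
    if w + b == 0:
        # Assumption of this sketch: bare kings count as balanced.
        return 0.0
    return (w - b) / (w + b)


# White has a queen and a rook (14 points), Black has a rook (5 points).
print(evaluate_material(["queen", "rook"], ["rook"]))  # (14 - 5) / (14 + 5)
```

The result is +1 when Black has no material, -1 when White has none, and 0 when the two sides are level.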
Example Utility Functions for Chess
• The previous evaluation function naively gave the same weight to a piece
regardless of its position on the board...

• Let Xi be the number of squares the ith piece attacks

• e(n) = piece1value * X1 + piece2value * X2 + piece3value * X3 + ...
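As a rough sketch, the weighted sum can be computed as below. Counting the squares each piece attacks requires a move generator, which is outside the scope of this example, so the counts are simply passed in. The input format and function name are assumptions, and the sketch covers the pieces of one side only, since the slide does not say how the two sides are combined.

```python
from typing import Iterable, Tuple

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}


def weighted_activity(pieces: Iterable[Tuple[str, int]]) -> int:
    """Sum of piece value times number of squares the piece attacks.

    `pieces` holds (piece_name, squares_attacked) pairs.
    """
    return sum(PIECE_VALUES[name] * attacked for name, attacked in pieces)


# A rook attacking 14 squares, a knight attacking 2, a pawn attacking 2.
print(weighted_activity([("rook", 14), ("knight", 2), ("pawn", 2)]))  # prints 78
```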


The minimax algorithm
• The minimax algorithm computes the minimax decision from the current
state.
• It uses a simple recursive computation of the minimax values of each
successor state:
• directly implementing the defining equations.
• The recursion proceeds all the way down to the leaves of the tree.
• Then the minimax values are backed up through the tree as the recursion
unwinds.
The minimax algorithm
• The algorithm first recurses down to the bottom-left nodes of the tree
• and uses the utility function on them to discover that their values are 3, 12 and 8.
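The recursion can be illustrated on a small tree written as nested lists, where an integer is a leaf utility. The leaves 3, 12 and 8 are the ones mentioned above; the nested-list encoding and the remaining leaf values are assumptions made for the example.

```python
from typing import List, Union

# A tree is either a leaf utility (int) or a list of subtrees.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, maximizing: bool) -> int:
    """Return the minimax value of `node`."""
    if isinstance(node, int):
        # Leaf: the utility is known directly.
        return node
    # Recurse all the way down, then back the values up.
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# MAX at the root, three MIN nodes below it, leaves at the bottom.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # prints 3
```

The three MIN nodes back up 3, 2 and 2, so MAX's best choice at the root is the first branch, worth 3.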
Tic Tac Toe Example
The minimax algorithm: problems
• For real games the time cost of minimax is totally impractical, but this
algorithm serves as the basis:
• for the mathematical analysis of games and
• for more practical algorithms
• Problem with minimax search:
• The number of game states it has to examine is exponential in the number of moves.
• Unfortunately, the exponent can’t be eliminated, but it can be cut in half.
Alpha-beta pruning
• It is possible to compute the correct minimax decision without looking at
every node in the game tree.
• Alpha-beta pruning allows us to eliminate large parts of the tree from
consideration, without influencing the final decision.
Alpha-beta pruning
• Alpha-beta pruning gets its name from two parameters.
• They describe bounds on the values that appear anywhere along the path under
consideration:
• α = the value of the best (i.e., highest value) choice found so far along the path for MAX
• β = the value of the best (i.e., lowest value) choice found so far along the path for MIN
Alpha-beta pruning
• Alpha-beta search updates the values of α and β as it goes along.
• It prunes the remaining branches at a node (i.e., terminates the recursive
call)
• as soon as the value of the current node is known to be worse than the current α or β
value for MAX or MIN, respectively.
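A minimal sketch of alpha-beta search on a nested-list tree is shown below. The tree encoding, the function name, and the example leaf values are assumptions of the sketch; the loop stops expanding a node's remaining children once α ≥ β.

```python
import math
from typing import List, Union

# A tree is either a leaf utility (int) or a list of subtrees.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool,
              alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Return the minimax value of `node`, pruning where possible."""
    if isinstance(node, int):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)   # best choice so far for MAX
            if alpha >= beta:
                break                   # MIN would never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)         # best choice so far for MIN
        if alpha >= beta:
            break                       # MAX already has something better
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # prints 3
```

On this tree the second MIN node is abandoned after its first leaf (2), because MAX already has a choice worth 3.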
Alpha-beta pruning
• α represents MAX and β represents MIN.
• Pruning condition: α ≥ β
• Initial values at every node: α = -∞, β = +∞
Alpha-beta pruning
• The leaves below B have the values 3, 12 and 8.
• The value of B is exactly 3.
• It can be inferred that the value at the root is at least 3, because MAX has a choice worth 3.
Alpha-beta pruning
• C, which is a MIN node, has a value of at most 2.
• But B is worth 3, so MAX would never choose C.
• Therefore, there is no point in looking at the other successors of C.
Alpha-beta pruning
• D, which is a MIN node, is worth at most 14.
• This is still higher than MAX’s best alternative (i.e., 3), so D’s other successors are explored.
Alpha-beta pruning
• The second successor of D is worth 5, so the exploration continues.
Alpha-beta pruning
• The third successor is worth 2, so now D is worth exactly 2.
• MAX’s decision at the root is to move to B, giving a value of 3.
Effectiveness of Alpha-Beta Search
• Worst-Case
• branches are ordered so that no pruning takes place. In this case alpha-beta gives no
improvement over exhaustive search
• Best-Case
• each player’s best move is the left-most child (i.e., evaluated first)
• in practice, performance is closer to the best case than to the worst case
Alpha-Beta Pruning
• Pruning does not affect final results
• Entire subtrees can be pruned.
• Good move ordering improves effectiveness of pruning
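The effect of move ordering can be checked by counting how many leaves alpha-beta evaluates on the same set of leaf values arranged in different orders. The trees, the nested-list encoding, and the `visited` bookkeeping below are assumptions made for this illustration.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, maximizing: bool, visited: List[int],
              alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Alpha-beta search that records every leaf it evaluates."""
    if isinstance(node, int):
        visited.append(node)
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, not maximizing, visited, alpha, beta)
        if maximizing:
            value = max(value, child_value)
            alpha = max(alpha, value)
        else:
            value = min(value, child_value)
            beta = min(beta, value)
        if alpha >= beta:
            break  # remaining children cannot change the decision
    return value


orderings = {
    "best moves last": [[6, 4, 2], [14, 5, 2], [12, 8, 3]],
    "mixed": [[3, 12, 8], [2, 4, 6], [14, 5, 2]],
    "best moves first": [[3, 12, 8], [2, 4, 6], [2, 5, 14]],
}
for name, tree in orderings.items():
    visited: List[int] = []
    value = alphabeta(tree, True, visited)
    print(f"{name}: value={value}, leaves evaluated={len(visited)}")
```

All three orderings return the same value, 3, which reflects that pruning does not affect the result. The number of leaves evaluated drops from 9 (no pruning) to 7 and then to 5 as the best moves are examined earlier.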
Example
• Which nodes can be pruned?
(Figure: game tree with node values 5 6 and 3 4 1 2 7 8.)
Answer to Example
• Which nodes can be pruned?
• Answer: NONE! Because the most favorable nodes for both players are explored last (i.e., in the
diagram, they are on the right-hand side).
(Figure: game tree with Max, Min and Max levels; node values 5 6 and 3 4 1 2 7 8.)
Second Example
• Which nodes can be pruned?
(Figure: game tree with node values 3 4 and 6 5 8 7 2 1.)
Second Example
• Which nodes can be pruned?
• Answer: LOTS! Because the most favorable nodes for both players are explored first (i.e., in the
diagram, they are on the left-hand side).
(Figure: game tree with Max, Min and Max levels; node values 3 4 and 6 5 8 7 2 1.)
