
Game Playing

Overview
• A game provides a structured task in which it is very easy to
measure success or failure.
• Chess: the average branching factor is around 35, and in an
average game each player makes about 50 moves. So in order to
examine the complete game tree, we would have to examine 35^100
positions.
• Depth-limited search
It uses a plausible-move generator in which only a small number of
promising moves are generated.

• Static evaluation function
In order to choose the best move, the resulting board positions must be
compared to discover which is most advantageous. A static evaluation
function uses information about a board position to estimate how likely
it is to lead to a win.
Why Game Playing?
• What do you think?
• Playing a game clearly requires a form of “intelligence”.
Games capture a pure form of competition between opponents.
• Games are abstract and precisely defined, very easy to
formalize.
• Game playing is one of the oldest sub-areas of AI (ca. 1950).
• The dream of a machine that plays Chess is, indeed, much older
than AI! (von Kempelen’s “Schachtürke” (1769), Torres y
Quevedo’s “El Ajedrecista” (1912))
Game Playing? Which Games?

• Game states are discrete; the number of game states is finite.
Finite number of possible moves.
• The game state is fully observable.
• The outcome of each move is deterministic.
• Two players: Max and Min.
Turn-taking: the players move alternately. Max begins.
• Terminal game states have a utility u. Max tries to maximize u,
Min tries to minimize u.
In that sense, the utility for Min is the exact opposite of the
utility for Max (“zero-sum”).
• There are no infinite runs of the game (no matter what moves are
chosen, a terminal state is reached after a finite number of steps).
An Example Game: Chess

• Game states: positions of the pieces.
• Moves: given by the rules of chess.
• Players: White (Max), Black (Min).
• Terminal states: checkmate and stalemate positions.
• Utility of terminal states, e.g.:
– +100 if Black is checkmated.
– 0 if stalemate.
– −100 if White is checkmated.
Game as Search Problem

• Initial State: board position and player to move
• Successor Function: returns a list of legal
(move, state) pairs
• Terminal Test: determines when the game
is over
• Utility function: Gives a numeric value for
the terminal state
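
These four components map directly onto code. Below is a minimal
Python sketch of such a game interface; the class and method names
are illustrative, not from the slides.

from abc import ABC, abstractmethod

class Game(ABC):
    # Abstract two-player, zero-sum, deterministic game.

    @abstractmethod
    def initial_state(self):
        # Board position and player to move.
        ...

    @abstractmethod
    def successors(self, state):
        # Return a list of legal (move, state) pairs.
        ...

    @abstractmethod
    def is_terminal(self, state):
        # True when the game is over.
        ...

    @abstractmethod
    def utility(self, state):
        # Numeric value of a terminal state, from Max's point of view.
        ...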
Game Trees
• Game trees are used to represent two-player
games.
• Alternate moves in the game are represented
by alternate levels in the tree (plies).
• Nodes in the tree represent positions.
• Edges between nodes represent moves.
• Leaf nodes represent won, lost or drawn
positions.
Overview
• Credit assignment problem
• The problem of deciding which of a series of actions is actually
responsible for a particular outcome.

• Minimax search algorithm
• It is a depth-first, depth-limited search procedure.
• It uses the plausible-move generator to generate the set of
possible successor positions.
• It applies the static evaluation function to those positions and
chooses the best one.
One-Ply Search
Two-Ply Search
Backing Up the Values of a Two-Ply Search
Game Tree (2-Player, Deterministic)
Minimax: Outline
• Depth-first search in the game tree, with Max at the root.
• Apply utility function to terminal positions.
• Bottom-up, for each inner node n in the tree, compute the utility
u(n) of n as follows:
– If it is Max’s turn: set u(n) to the maximum of the utilities of
n’s successor nodes.
– If it is Min’s turn: set u(n) to the minimum of the utilities of
n’s successor nodes.
• Selecting a move for Max at the root: Choose one move that
leads to a successor node with maximal utility.
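
This outline translates directly into a short recursive procedure.
Below is a minimal Python sketch of it (the function names are
illustrative); it assumes an is_terminal test, a utility function
for terminal states, and a successors function returning
(move, state) pairs, as in the game interface sketched earlier.

def minimax_value(state, max_to_move):
    # Apply the utility function to terminal positions.
    if is_terminal(state):
        return utility(state)
    values = [minimax_value(s, not max_to_move)
              for (_, s) in successors(state)]
    # Max maximizes over successor values, Min minimizes.
    return max(values) if max_to_move else min(values)

def minimax_decision(state):
    # Choose a move that leads to a successor with maximal utility.
    move, _ = max(successors(state),
                  key=lambda ms: minimax_value(ms[1], False))
    return move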
Minimax Search

Algorithm: MINIMAX(Position, Depth, Player)


1. If DEEP-ENOUGH(Position, Depth), then return the structure
VALUE = STATIC(Position, Player);
PATH = nil
This indicates that there is no path from this node and that its value is
determined by the static evaluation function.
2. Otherwise, generate one more ply of the tree by calling the function
MOVE-GEN(Position, Player) and setting SUCCESSORS to the list it
returns.
3. If SUCCESSORS is empty, then there are no moves to be made, so
return the same structure that would have been returned if
DEEP-ENOUGH had returned true.
4. If SUCCESSORS is not empty, then examine each element in turn and
keep track of the best one. This is done as follows. Initialize BEST-SCORE
to the minimum value that STATIC can return. It will be updated to reflect
the best score that can be achieved by an element of SUCCESSORS. For
each element SUCC of SUCCESSORS, do the following:
Minimax Search (Cont’d)
(a) Set RESULT-SUCC to
MINIMAX(SUCC, Depth + 1, OPPOSITE(Player))
This recursive call to MINIMAX will actually carry out
the exploration of SUCC.
(b) Set NEW-VALUE to - VALUE(RESULT-SUCC). This will cause it to reflect
the merits of the position from the opposite perspective from that of the next
lower level.
(c) If NEW-VALUE > BEST-SCORE, then we have found a successor that is
better than any that have been examined so far. Record this by doing the
following:
(i) Set BEST-SCORE to NEW-VALUE.
(ii) The best known path is now from CURRENT to SUCC and then on to the
appropriate path down from SUCC as determined by the recursive call to
MINIMAX. So set BEST-PATH to the result of attaching SUCC to the front of
PATH(RESULT-SUCC).
5. Now that all the successors have been examined, we know the value of
Position as well as which path to take from it. So return the structure
VALUE = BEST-SCORE
PATH = BEST-PATH
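
A faithful Python rendering of this procedure is sketched below,
assuming hypothetical helpers deep_enough, static, move_gen, and
opposite that correspond to DEEP-ENOUGH, STATIC, MOVE-GEN, and
OPPOSITE. Note the sign flip in step (b): every level scores
positions from its own perspective (the “negamax” formulation).

def minimax(position, depth, player):
    # Step 1: deep enough -- evaluate statically; no path from here.
    if deep_enough(position, depth):
        return static(position, player), []        # (VALUE, PATH)
    # Step 2: generate one more ply of the tree.
    succs = move_gen(position, player)
    # Step 3: no moves to be made -- treat the node as a leaf.
    if not succs:
        return static(position, player), []
    # Step 4: examine each successor, keeping track of the best one.
    best_score, best_path = float("-inf"), []
    for succ in succs:
        result_value, result_path = minimax(succ, depth + 1,
                                            opposite(player))
        new_value = -result_value                  # step (b): flip sign
        if new_value > best_score:                 # step (c)
            best_score = new_value
            best_path = [succ] + result_path
    # Step 5: return the value of Position and the path to take.
    return best_score, best_path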
Minimax: Example

Blue numbers: Utility function u applied to terminal


positions.
Red numbers: Utilities of inner nodes, as computed by
Minimax.
Minimax: Pseudo-Code
Minimax: Example, Now in Detail

So which action for Max is returned? The leftmost branch. Note:
the maximal possible pay-off is higher for the rightmost branch,
but assuming perfect play by Min, it is better to go left. (Going
right would be “relying on your opponent to do something stupid”.)
Analysis of Minimax
• Optimal play for MAX assumes that MIN also plays optimally.
What if MIN does not play optimally? Then MAX, following the
minimax strategy, will do at least as well, and possibly better.
• A complete depth-first search?
• Yes
• Time complexity?
– O(b^m), where b is the branching factor and m is the maximum
depth of the tree
• Space complexity?
– O(bm) (depth-first exploration)
• For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an
exact solution is completely infeasible (35^100 ≈ 10^154 positions)
Alpha-Beta Pruning
• The number of game states minimax search examines is
exponential in the number of moves
• Is it possible to compute the correct minimax
decision without looking at every node in the
game tree?
• Need to prune away branches that cannot
possibly influence the final decision
Alpha-Beta Pruning
• Depth-first search with a branch-and-bound technique.
• It requires the maintenance of two threshold
values, one representing a lower bound on the
value that a maximizing node may be assigned
(alpha) and another representing an upper
bound on the value that a minimizing node
may be assigned (beta).
Alpha-Beta Pruning
Alpha and Beta Cutoffs
Alpha-Beta Pruning

• Search at a minimizing level can be terminated when a value
less than alpha is discovered.
• Search at a maximizing level can be terminated when a value
greater than beta has been found.
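
Putting these two cutoff rules together gives the classic algorithm.
Below is a minimal Python sketch (the names are illustrative),
reusing the is_terminal, utility, and successors helpers assumed
earlier: alpha carries the best value Max can already guarantee,
beta the best value Min can already guarantee, and search stops as
soon as the window closes.

def alphabeta(state, alpha, beta, max_to_move):
    if is_terminal(state):
        return utility(state)
    if max_to_move:
        value = float("-inf")
        for _, s in successors(state):
            value = max(value, alphabeta(s, alpha, beta, False))
            alpha = max(alpha, value)
            if value >= beta:     # beta cutoff: Min would avoid this node
                break
        return value
    else:
        value = float("inf")
        for _, s in successors(state):
            value = min(value, alphabeta(s, alpha, beta, True))
            beta = min(beta, value)
            if value <= alpha:    # alpha cutoff: Max already has better
                break
        return value

# Initial call: alphabeta(start_state, float("-inf"), float("inf"), True)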
Alpha Pruning: Basic Idea
Alpha Pruning
Alpha-Beta Pruning
Alpha-Beta Search: Example
Alpha-Beta Search: Modified Example
Analysis of α-β
• Pruning does not affect final result
• The effectiveness of alpha-beta pruning is highly dependent on the
order of successors
• It might be worthwhile to try to examine first the successors that are
likely to be best
• With "perfect ordering," time complexity =O(bm/2)
– effective branching factor becomes b1/2
– For chess, 6 instead of 35
• it can look ahead roughly twice as far as minimax in the same
• amount of time
• Ordering in chess: captures, threats, forward moves, and then
backward moves
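
A simple way to approximate good ordering is to sort the successors
by a cheap heuristic evaluation before recursing, so that promising
moves are searched first and cutoffs occur early. A minimal sketch,
assuming a heuristic evaluate(state) scored from Max's point of view
(as in the evaluation-function sketch later in this section):

def ordered_successors(state, max_to_move):
    # Search the most promising successors first; finding good moves
    # early makes alpha-beta cutoffs much more frequent.
    return sorted(successors(state),
                  key=lambda ms: evaluate(ms[1]),
                  reverse=max_to_move)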
Random Ordering?

• If successors are examined in random order rather


than best-first, the complexity will be roughly
O(b3m/4)
• Adding dynamic move-ordering schemes, such as
trying first the moves that were found to be best
last time, brings us close to the theoretical limit
• The best moves are often called killer moves
(killer move heuristic)
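
The killer move heuristic needs only a small table indexed by search
depth. A minimal sketch (the table layout and names are my own, not
from the slides):

killers = {}   # depth -> the two moves that most recently caused a cutoff

def record_killer(depth, move):
    # Remember a move that just produced a cutoff at this depth.
    recent = killers.get(depth, [])
    if move not in recent:
        killers[depth] = ([move] + recent)[:2]

def killers_first(moves, depth):
    # Reorder so remembered killer moves are tried before the rest.
    k = killers.get(depth, [])
    return [m for m in k if m in moves] + [m for m in moves if m not in k]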
Dealing with Repeated States
• In games, repeated states occur frequently because
of transpositions
– different permutations of the move sequence end up in the same
position, e.g., [a1, b1, a2, b2] vs. [a1, b2, a2, b1]
– it is worthwhile to store the evaluation of such a position in a
hash table the first time it is encountered
– similar to the “explored set” in graph search
• Tradeoff
– Transposition table can be too big
– Which to keep and which to discard
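
A minimal sketch of such a transposition table, wrapped around the
alphabeta function above (hash_position and the size bound are
illustrative; Zobrist hashing is the usual choice in chess programs):

transposition = {}          # position hash -> backed-up value

def alphabeta_tt(state, alpha, beta, max_to_move):
    key = hash_position(state)
    if key in transposition:            # position seen before: reuse it
        return transposition[key]
    value = alphabeta(state, alpha, beta, max_to_move)
    if len(transposition) < 10_000_000: # crude cap: the table can get too big
        transposition[key] = value
    return value

Production programs also record whether the stored value is exact or
only an alpha/beta bound; this sketch omits that detail.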
Imperfect, Real-Time Decisions
• Minimax generates the entire game search space
• Alpha-beta prunes a large part of it, but it still needs to
search all the way to terminal states
• However, moves must be made in a reasonable
amount of time
• Standard approach: turning non-terminal nodes into
terminal leaves
– cutoff test: replaces terminal test, e.g., depth
limit
– heuristic evaluation function = estimated
desirability or utility of position
Evaluation Functions
The performance of a game-playing program is
dependent on the quality of its evaluation function

– it should order the terminal states the same way as the
true utility function
– its evaluation of nonterminal states should correlate with
the actual chance of winning
– its computation must not take too long!
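
The classic example for chess is a weighted material count. A
minimal sketch, assuming a state object that exposes piece lists
(the representation is illustrative):

# Conventional material values (the standard textbook weights).
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def evaluate(state):
    # Positive favors White (Max), negative favors Black (Min).
    score = 0
    for piece in state.white_pieces:
        score += PIECE_VALUE.get(piece.kind, 0)
    for piece in state.black_pieces:
        score -= PIECE_VALUE.get(piece.kind, 0)
    return score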
Cutting off Search
• Modify alpha-beta search so that
– Terminal? is replaced by Cutoff?
– Utility is replaced by Eval
– if Cutoff-Test(state, depth) then return Eval(state)
– depth is chosen such that the amount of time used will
not exceed what the rules of the game allow
– Iterative deepening search can be applied
– when time runs out, return the move selected by the
deepest completed search
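
A sketch of the cutoff variant together with an iterative-deepening
driver (time handling and names are illustrative; for simplicity
this driver only checks the clock between completed depths):

import time

def cutoff_test(state, depth, limit):
    return depth >= limit or is_terminal(state)   # Cutoff? replaces Terminal?

def alphabeta_cutoff(state, alpha, beta, max_to_move, depth, limit):
    if cutoff_test(state, depth, limit):
        return evaluate(state)                    # Eval replaces Utility
    if max_to_move:
        value = float("-inf")
        for _, s in successors(state):
            value = max(value, alphabeta_cutoff(s, alpha, beta,
                                                False, depth + 1, limit))
            alpha = max(alpha, value)
            if value >= beta:
                break
        return value
    value = float("inf")
    for _, s in successors(state):
        value = min(value, alphabeta_cutoff(s, alpha, beta,
                                            True, depth + 1, limit))
        beta = min(beta, value)
        if value <= alpha:
            break
    return value

def iterative_deepening(state, seconds):
    # When time runs out, return the move from the deepest completed search.
    deadline, best, limit = time.time() + seconds, None, 1
    while time.time() < deadline:
        best, _ = max(successors(state),
                      key=lambda ms: alphabeta_cutoff(
                          ms[1], float("-inf"), float("inf"),
                          False, 1, limit))
        limit += 1
    return best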
Additional Refinements
Waiting for Quiescence: one of the factors that should sometimes be
considered in determining when to stop going deeper in the search
tree is whether the situation is relatively stable, i.e., whether
the backed-up values are no longer changing drastically from one ply
to the next. To make sure that short-term disturbances (such as an
exchange of pieces in progress) do not unduly influence our choice
of move, we should continue the search until no such drastic change
occurs from one level to the next. This is called waiting for
quiescence.
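
In chess programs this is typically implemented as a small extra
search that follows only "noisy" moves, such as captures, until the
position is quiet. A minimal sketch; the noisy_successors helper
(capture-generating moves only) is assumed, not from the slides:

def quiescence(state, alpha, beta, max_to_move):
    # "Stand pat": the static value is a baseline the side to move
    # can usually keep by playing a quiet move.
    stand_pat = evaluate(state)
    if max_to_move:
        if stand_pat >= beta:
            return stand_pat
        alpha = max(alpha, stand_pat)
    else:
        if stand_pat <= alpha:
            return stand_pat
        beta = min(beta, stand_pat)
    # Extend the search only along unstable moves (e.g., captures).
    for _, s in noisy_successors(state):
        value = quiescence(s, alpha, beta, not max_to_move)
        if max_to_move:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            break
    return alpha if max_to_move else beta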
Additional Refinements
Secondary search
One good way of combating the horizon effect is to double-check
a chosen move to make sure that a hidden pitfall does not exist a
few moves farther away than the original search explored.
Suppose we explore a game tree to an average depth of six ply
and, on the basis of that search, choose a particular move.
Although it would have been too expensive to have searched the
entire tree to a depth of eight, it is not very expensive to search
the single chosen branch an additional two levels to make sure
that it still looks good. This technique is called secondary search.
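
As a sketch, secondary search simply re-searches the single chosen
branch a couple of plies deeper before committing to the move. The
helper below assumes the alphabeta_cutoff and evaluate sketches
above; the comparison margin is illustrative:

def still_looks_good(chosen_successor, primary_value, margin=1):
    # Explore only the chosen branch two extra plies (Min to move in
    # the successor) and compare against the primary search's value.
    deeper_value = alphabeta_cutoff(chosen_successor,
                                    float("-inf"), float("inf"),
                                    False, 0, 2)
    return deeper_value >= primary_value - margin   # no hidden pitfall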
Additional Refinements

Using book moves: to select a move by simply looking


up the current game configuration in a catalogue and
extracting the correct move. Performance of a program can
often be considerably enhanced if it is provided with a list of
moves (called book moves) that should be made.
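
As a sketch, an opening book is just a lookup table from position
keys to moves (position_key and the table contents are hypothetical):

opening_book = {}   # position key -> recommended move (the catalogue)

def book_move(state):
    # Return the catalogued move for this configuration, or None
    # once the game has left the book.
    return opening_book.get(position_key(state))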

Alternatives to MINIMAX
Iterative Deepening
References on Specific Games
• Chess – Shannon [1950], Greenblatt et al. [1967], Newell and
Simon [1972], Berliner and Ebeling [1989], Anantharaman et al. [1990].
• Checkers – Samuel [1963].
• Go – Wilcox [1988].
• Backgammon – Berliner [1980], Tesauro and Sejnowski [1989].
• Othello – Rosenbloom [1982], Lee and Mahajan [1990].
• Others – Levy [1988].
