
CS 4700:

Foundations of Artificial Intelligence

Carla P. Gomes
gomes@cs.cornell.edu

Module:
Adversarial Search
(Reading R&N: Chapter 6)

Carla P. Gomes
CS4700
Outline

Game Playing
Optimal decisions
Minimax
α-β pruning
Imperfect, real-time decisions

Game Playing

Mathematical Game Theory

Branch of economics that views any multi-agent environment as a game,
provided that the impact of each agent on the others is “significant”,
regardless of whether the agents are cooperative or competitive.

Game Playing in AI (typical case) :

– Deterministic
– Turn taking
– 2-player
– Zero-sum game of perfect information (fully observable)

Game Playing vs. Search

Game vs. search problem

"Unpredictable" opponent ⇒ a solution must specify a move for every possible
opponent reply

Time limits ⇒ unlikely to find the goal, must approximate

Game Playing

Formal definition of a game:

– Initial state
– Successor function: returns a list of (move, state) pairs
– Terminal test: determines when the game is over
  (terminal states: states where the game ends)
– Utility function (objective function or payoff function): gives a
  numeric value for terminal states

We will consider games with 2 players (Max and Min); Max moves first.
Game Tree Example:
Tic-Tac-Toe

Tree from
Max’s
perspective
Minimax Algorithm

Minimax algorithm
– Perfect play for deterministic, 2-player games
– Max tries to maximize its score
– Min tries to minimize Max’s score
– Goal: move to the position of highest minimax value
  ⇒ identify the best achievable payoff against best play

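The minimax value can be computed in a few lines. This is a minimal sketch, not the textbook pseudocode verbatim: the game tree is encoded as nested lists whose leaves are payoffs for Max (an illustrative assumption), using the leaf values 3 9 0 7 2 6 from the example slides.

```python
def minimax(node, maximizing):
    """Minimax value of a game tree given as nested lists; leaves are
    payoffs for Max. Max picks the max of its children, Min the min."""
    if not isinstance(node, list):            # terminal state: return its payoff
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Depth-2 example: Max to move at the root, Min at the second level.
tree = [[3, 9], [0, 7], [2, 6]]
root_value = minimax(tree, True)   # Min values are 3, 0, 2 → Max picks 3
```

Here each Min node evaluates to the minimum of its leaves (3, 0, 2), and the Max root takes the best of those, 3.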
Minimax Algorithm

[Figure, shown step by step over several slides: example game tree with
payoffs for Max at the leaves: 3 9 0 7 2 6. Each Min node takes the minimum
of its children (3, 0, 2); the Max root takes the maximum of those, so the
minimax value of the root is 3.]
Minimax Algorithm (cont’d)

Properties of minimax algorithm:

– Complete? Yes (if the tree is finite)
– Optimal? Yes (against an optimal opponent)
– Time complexity? O(b^m)
– Space complexity? O(bm) (depth-first exploration, if it generates all
  successors at once)

m – maximum depth of the tree; b – branching factor (legal moves)

For chess, b ≈ 35, m ≈ 100 for "reasonable" games
⇒ exact solution completely infeasible


Minimax Algorithm

Limitations
– Not always feasible to traverse the entire tree
– Time limitations

Key Improvement
– Use an evaluation function instead of the utility
  • The evaluation function provides an estimate of the utility at a given position

⇒ More soon…

α-β Pruning

Can we improve search by reducing the size of the game tree to be examined?

⇒ Yes!!! Using alpha-beta pruning

Principle
– If a move is determined to be worse than another move already examined,
  then there is no need for further examination of the node.

α-β Pruning Example

[Figure: the pruning steps on an example game tree, shown one step per slide.]
Alpha-Beta Pruning (αβ prune)

Rules of Thumb

– α is the best (highest) value found so far along the path for Max
– β is the best (lowest) value found so far along the path for Min
– Search below a MIN node may be alpha-pruned if its β ≤ the α of some
  MAX ancestor
– Search below a MAX node may be beta-pruned if its α ≥ the β of some
  MIN ancestor.

Alpha-Beta Pruning Example

1. Search below a MIN node may be alpha-pruned if the beta value is ≤ the
alpha value of some MAX ancestor.

2. Search below a MAX node may be beta-pruned if the alpha value is ≥ the
beta value of some MIN ancestor.
Alpha-Beta Pruning Example (cont’d)

[Figure: the same rules applied step by step. Leaves 3 and 5 give the left
MIN node the value 3, which becomes α at the root; the subtrees below the
leaves 0 and 2 are then alpha-pruned, since 0 ≤ 3 and 2 ≤ 3. Root value: 3.]
α-β Search Algorithm

1. If terminal state, compute e(n) and return the result.
2. Otherwise, if the level is a minimizing level:
   • Until no more children, or β ≤ α (alpha pruning):
     – search on a child; if vi < β, set β = vi
   • Return min(vi)
3. Otherwise, the level is a maximizing level:
   • Until no more children, or α ≥ β (beta pruning):
     – search on a child; if vi > α, set α = vi
   • Return max(vi)

See page 170 R&N
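The α-β procedure can be sketched as follows. A minimal sketch, not the R&N code verbatim; it reuses the nested-list tree encoding (leaves are payoffs for Max, an illustrative assumption) so the result can be compared against plain minimax.

```python
def alphabeta(node, alpha, beta, maximizing):
    """α-β search over a game tree given as nested lists; leaves are payoffs.
    alpha: best value found so far for Max; beta: best so far for Min."""
    if not isinstance(node, list):            # terminal state: return its payoff
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                 # beta-prune below a MAX node
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:                 # alpha-prune below a MIN node
                break
        return value

tree = [[3, 9], [0, 7], [2, 6]]
best = alphabeta(tree, float("-inf"), float("inf"), True)  # 3, same as minimax
```

Pruning never changes the value returned at the root; it only skips subtrees that cannot affect it (here, the second children of the 0- and 2-valued Min nodes).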
Another Example

[Figure: a second game tree illustrating the same alpha- and beta-pruning
rules.]
Example

[Figure: α-β pruning applied to a deeper tree with leaf values
5 0 6 1 3 2 4 7; the left subtree evaluates to 5, the middle to 3, and
several branches are pruned. Root value: 5.]
Properties of α-β Prune

Pruning does not affect the final result

Good move ordering improves the effectiveness of pruning (e.g., in chess:
try captures first, then threats, forward moves, then backward moves…)

With "perfect ordering," time complexity = O(b^(m/2))
⇒ doubles the depth of search that alpha-beta pruning can explore

Example of the value of reasoning about which computations are relevant
(a form of metareasoning)

Resource limits

Suppose we have 100 secs, and explore 10^4 nodes/sec
⇒ 10^6 nodes per move

Standard approach:

evaluation function
= estimated desirability of position

cutoff test:
e.g., depth limit (What is the problem with that?)

add quiescence search:
quiescent position: a position where the next move is unlikely to cause a
large change in the players’ positions
Cutoff Search

Suppose we have 100 secs, and explore 10^4 nodes/sec
⇒ 10^6 nodes per move

Does it work in practice?
b^m = 10^6, b = 35 ⇒ m = 4

4-ply lookahead is a hopeless chess player!
– 4-ply ≈ human novice
– 8-ply ≈ typical PC, human master
– 12-ply ≈ Deep Blue, Kasparov

Other improvements…
Evaluation Function

Evaluation function
– Performed at search cutoff point
– Must have same terminal/goal states as utility function
– Tradeoff between accuracy and time → reasonable complexity
– Accurate
• Performance of game-playing system dependent on
accuracy/goodness of evaluation
• Evaluation of nonterminal states strongly correlated with actual
chances of winning

Evaluation functions

For chess, typically linear weighted sum of features


Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)

e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens), etc.

Key challenge – find a good evaluation function:

– Isolated pawns are bad.
– How well protected is your king?
– How much maneuverability do you have?
– Do you control the center of the board?
– Strategies change as the game proceeds
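The linear weighted sum Eval(s) above can be sketched directly. The board representation, the feature functions, and the rook/pawn weights here are illustrative assumptions (only the queen weight w1 = 9 comes from the slide), not Deep Blue's actual features.

```python
# Hypothetical board representation: a dict of piece counts per side.
def f_queens(s): return s["white_queens"] - s["black_queens"]
def f_rooks(s):  return s["white_rooks"]  - s["black_rooks"]
def f_pawns(s):  return s["white_pawns"]  - s["black_pawns"]

WEIGHTS  = [9, 5, 1]                    # classical material values (illustrative)
FEATURES = [f_queens, f_rooks, f_pawns]

def eval_position(s):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f(s) for w, f in zip(WEIGHTS, FEATURES))

s = {"white_queens": 1, "black_queens": 1,
     "white_rooks": 2, "black_rooks": 1,
     "white_pawns": 6, "black_pawns": 8}
score = eval_position(s)   # 9*0 + 5*1 + 1*(-2) = 3 (White slightly ahead)
```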
When Chance is involved:
Backgammon Board

[Figure: backgammon board with the points numbered 1–24.]
Expectiminimax

Generalization of minimax for games with chance nodes

Examples: Backgammon, bridge

Calculates the expected value, where the probability is taken over all
possible dice rolls/chance events
– Max and Min nodes are determined as before
– Chance nodes are evaluated as a weighted average

Game Tree for Backgammon

[Figure: game tree alternating MAX, DICE (chance), MIN, DICE, MAX, … layers
down to TERMINAL; each doubles roll such as 1,1 or 6,6 has probability 1/36,
and each non-doubles roll such as 1,2 or 6,5 has probability 1/18.]
Expectiminimax

Expectiminimax(n) =
– Utility(n), for n a terminal state
– max_{s ∈ Succ(n)} expectiminimax(s), for n a Max node
– min_{s ∈ Succ(n)} expectiminimax(s), for n a Min node
– Σ_{s ∈ Succ(n)} P(s) · expectiminimax(s), for n a chance node

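The four cases of the expectiminimax definition can be sketched directly. A minimal sketch; the tuple-based tree encoding is an illustrative assumption, and the example numbers (leaves 2 3 1 4 with probabilities .9/.1) match the first tree on the example slide.

```python
def expectiminimax(node):
    """node is ('max', children), ('min', children),
    ('chance', [(prob, child), ...]), or a number (terminal utility)."""
    if isinstance(node, (int, float)):               # terminal: Utility(n)
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# One-ply example: Max chooses between two chance nodes (dice with p = .9/.1).
tree = ("max", [("chance", [(0.9, 2), (0.1, 3)]),    # expected value 2.1
                ("chance", [(0.9, 1), (0.1, 4)])])   # expected value 1.3
value = expectiminimax(tree)                          # ≈ 2.1
```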
Expectiminimax

[Figure: two expectiminimax trees comparing moves A1 and A2 with chance
probabilities .9/.1. With leaves 2 3 1 4 the chance nodes have expected
values 2.1 and 1.3, so A1 is preferred; rescaling the leaves to 20 30 1 400
gives expected values 21 and 40.9, reversing the preference. The scale of
the evaluation function matters at chance nodes.]
Chess: Case Study

Combinatorics of Chess

Opening book
Endgame
– database of all 5 piece endgames exists; database of all 6 piece games
being built
Middle game
– Positions evaluated (estimation)
• 1 move by each player = 1,000
• 2 moves by each player = 1,000,000
• 3 moves by each player = 1,000,000,000

Positions with Smart Pruning

Search Depth            Positions
2                       60
4                       2,000
6                       60,000
8                       2,000,000
10 (< 1 second, DB)     60,000,000
12                      2,000,000,000
14 (5 minutes, DB)      60,000,000,000
16                      2,000,000,000,000

How many lines of play does a grand master consider?
Around 5 to 7
Formal Complexity of Chess

How hard is chess?

– Obvious problem: standard complexity theory tells us


nothing about finite games!
– Generalizing chess to NxN board: optimal play is
PSPACE-hard

Game Tree Search

How to search a game tree was independently invented by Shannon (1950) and
Turing (1951).

Technique called: MiniMax search.

Evaluation function combines material & position.


– Pruning "bad" nodes: doesn't work in practice
– Extending "unstable" nodes (e.g. after captures): works well in practice
  (selective extension)

History of Search Innovations

Shannon, Turing Minimax search 1950


Kotok/McCarthy Alpha-beta pruning 1966
MacHack Transposition tables 1967
Chess 3.0+ Iterative-deepening 1975
Belle Special hardware 1978
Cray Blitz Parallel search 1983
Hitech Parallel evaluation 1985
Deep Blue ALL OF THE ABOVE 1997

Evaluation Functions

Primary way knowledge of chess is encoded


– material
– position
• doubled pawns
• how constrained position is
Must execute quickly - constant time
– parallel evaluation: allows more complex functions
• tactics: patterns to recognize weak positions
• arbitrarily complicated domain knowledge

Learning better evaluation functions

– Deep Blue learns by tuning weights in its board evaluation


function

» f(p) = w1f1(p) + w2f2(p) + ... + wnfn(p)

– Tune weights to find best least-squares fit with respect to


moves actually chosen by grandmasters in 1000+ games.
– The key difference between 1996 and 1997 match!
– Note that Kasparov also trained on
“computer chess” play.
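The least-squares tuning idea above can be sketched for two features. The tiny closed-form solver (2×2 normal equations) and the synthetic data are illustrative assumptions, not Deep Blue's actual fitting procedure; in the real system the targets came from grandmaster moves rather than from known true weights.

```python
def fit_weights(features, targets):
    """Least-squares fit of (w1, w2) for f(p) = w1*f1(p) + w2*f2(p),
    via the 2x2 normal equations (X^T X) w = X^T y."""
    a11 = sum(f[0] * f[0] for f in features)
    a12 = sum(f[0] * f[1] for f in features)
    a22 = sum(f[1] * f[1] for f in features)
    b1 = sum(f[0] * y for f, y in zip(features, targets))
    b2 = sum(f[1] * y for f, y in zip(features, targets))
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Synthetic data generated from true weights (9, 1): each feature vector is
# (queen difference, pawn difference) for one position.
feats = [(1, 0), (0, 1), (1, 2), (-1, 3)]
targets = [9 * q + 1 * p for q, p in feats]   # exact targets → exact recovery
w = fit_weights(feats, targets)                # ≈ (9.0, 1.0)
```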

Transposition Tables

Introduced by Greenblat's Mac Hack (1966)


Basic idea: caching
– once a board is evaluated, save it in a hash table to avoid re-evaluating
  it
– called “transposition” tables because different orderings (transpositions)
  of the same set of moves can lead to the same board
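The caching idea can be sketched as a depth-limited negamax with a table keyed on (position, depth). The helper names (`evaluate`, `moves`, `apply_move`) and the toy counting game below are illustrative assumptions, not a real chess engine; the point is that transpositions reached by different move orders hit the same cache entry.

```python
def search_with_tt(board, depth, evaluate, moves, apply_move, table):
    """Depth-limited negamax with a transposition table.
    `board` must be hashable so it can serve as a cache key."""
    key = (board, depth)
    if key in table:                       # position already evaluated: reuse
        return table[key]
    legal = moves(board)
    if depth == 0 or not legal:
        value = evaluate(board)            # score for the player to move
    else:
        value = max(-search_with_tt(apply_move(board, m), depth - 1,
                                    evaluate, moves, apply_move, table)
                    for m in legal)
    table[key] = value                     # save for later transpositions
    return value

# Toy game: the "board" is a counter, each move adds 1 or 2, play stops at 4.
calls = {"n": 0}
def evaluate(b): calls["n"] += 1; return b % 5
def moves(b): return [1, 2] if b < 4 else []
def apply_move(b, m): return b + m

table = {}
v = search_with_tt(0, 4, evaluate, moves, apply_move, table)
```

Re-running the search with the same table does no new evaluations at all: the root entry is found immediately, which is the "don't repeat blunders" effect described below.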

Transposition Tables as Learning

A form of rote learning (memorization).
– positions generalize over sequences of moves
– learning on-the-fly
– don’t repeat blunders: can’t beat the computer twice in a row using the
  same moves!

Deep Blue --- huge transposition tables (100,000,000+) that must be
carefully managed.

Special-Purpose and Parallel Hardware

Belle (Thompson 1978)


Cray Blitz (1993)
Hitech (1985)
Deep Blue (1987-1996)
– Parallel evaluation: allows more complicated evaluation functions
– Hardest part: coordinating parallel search
– Deep Blue never quite plays the same game, because of “noise” in
its hardware!

Deep Blue

Hardware
– 32 general processors
– 220 VLSI chess chips
Overall: 200,000,000 positions per second
– 5 minutes = depth 14
Selective extensions - search deeper at unstable positions
– down to depth 25 !

Tactics into Strategy

As Deep Blue goes deeper and deeper into a position, it displays elements
of strategic understanding. Somewhere out there mere tactics translate
into strategy. This is the closest thing I've ever seen to computer
intelligence. It's a very weird form of intelligence, but you can feel it.
It feels like thinking.
– Frederick Friedel (grandmaster), Newsday, May 9, 1997

Automated reasoning --- the path

[Figure: chart of problem size (number of variables, rules/constraints)
against worst-case complexity for automated-reasoning milestones: car repair
diagnosis (~100 variables, case complexity ~10^30), deep space mission
control (~10K variables, ~10^47, roughly the number of atoms on earth),
chess 20 moves deep and Kriegspiel (~20K variables, ~10^6,020), military
logistics (~100K variables, ~10^150,500), VLSI verification (~0.5M
variables), and multi-agent systems combining reasoning, uncertainty &
learning (~1M variables, ~10^301,020). Also marked: protein folding
(petaflop-year calculation) and the seconds until the heat death of the sun.
$25M DARPA research program, 2004–2009.]
Kriegspiel

Pieces hidden from opponent

Interesting combination of reasoning, game tree search, and uncertainty.

Another chess variant:
– Multiplayer asynchronous chess.
The Danger of Introspection

When people express the opinion that human grandmasters do not


examine 200,000,000 move sequences per second, I ask them, ``How
do you know?'' The answer is usually that human grandmasters are
not aware of searching this number of positions, or are aware of
searching many fewer. But almost everything that goes on in our
minds we are unaware of.
– Drew McDermott

State-of-the-art of other games

Deterministic games in practice

Checkers: Chinook ended the 40-year reign of human world champion Marion
Tinsley in 1994. Used a pre-computed endgame database defining perfect play
for all positions involving 8 or fewer pieces on the board, a total of 444
billion positions.

2007: checkers proved to be a draw! Schaeffer et al. solved checkers for
the “White Doctor” opening (a draw) (and about 50 other openings).

Othello: human champions refuse to compete against computers, who are too
good.

Backgammon: TD-Gammon is competitive with the World Champion (ranked among
the top 3 players in the world). Tesauro's approach (1992) used learning to
come up with a good evaluation function. An exciting application of
learning.
Playing GO

Go: human champions refuse to compete against computers, who are too bad.
In go, b > 300, so most programs use pattern knowledge bases to suggest
plausible moves (R&N).

Not true!

Computer Beats Pro at U.S. Go Congress
http://www.usgo.org/index.php?%23_id=4602
On August 7, 2008, the computer program MoGo running on 25 nodes (800 cores)
beat professional Go player Myungwan Kim (8p) in a handicap game on the
19x19 board. The handicap given to the computer was nine stones.
MoGo uses Monte Carlo based methods combined with upper confidence bounds
applied to trees (UCT).
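MoGo's UCT approach selects which move to explore next by combining Monte Carlo value estimates with the UCB1 exploration bonus. This is a minimal sketch of the selection rule only, with an illustrative exploration constant and made-up statistics, not MoGo's actual implementation.

```python
import math

def ucb1(mean_value, child_visits, parent_visits, c=1.4):
    """UCB1 score used by UCT: exploit moves with high average payoff,
    but add a bonus for moves that have been tried rarely."""
    if child_visits == 0:
        return float("inf")            # untried moves are always tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

# At each step of the tree descent, UCT picks the child maximizing UCB1.
stats = {"a": (0.6, 10), "b": (0.5, 2), "c": (0.0, 0)}  # move -> (mean, visits)
parent_visits = 12
best = max(stats, key=lambda m: ucb1(stats[m][0], stats[m][1], parent_visits))
# best is "c": an unvisited move gets an infinite score.
```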
Summary

Game systems rely heavily on
– Search techniques
– Heuristic functions
– Bounding and pruning techniques
– Knowledge databases on the game

For AI, the abstract nature of games makes them an appealing subject for
study:
– the state of the game is easy to represent;
– agents are usually restricted to a small number of actions whose outcomes
  are defined by precise rules
Summary

Game playing was one of the first tasks undertaken in AI as soon as
computers became programmable (e.g., Turing, Shannon, Wiener tackled chess).

Game playing research has spawned a number of interesting research ideas on
search, data structures, databases, heuristics, evaluation functions and
many areas of computer science.

(Games are fun: teach your computer how to play a game!)
