Professional Documents
Culture Documents
Minmax
Minmax
Carla P. Gomes
gomes@cs.cornell.edu
Module:
Adversarial Search
(Reading R&N: Chapter 6)
Carla P. Gomes
CS4700
Outline
Game Playing
Optimal decisions
Minimax
α-β pruning
Imperfect, real-time decisions
Carla P. Gomes
CS4700
Game Playing
– Deterministic
– Turn taking
– 2-player
– Zero-sum game of perfect information (fully observable)
Carla P. Gomes
CS4700
Game Playing vs. Search
Carla P. Gomes
CS4700
Game Playing
Tree from
Max’s
perspective
Carla P. Gomes
CS4700
Minimax Algorithm
Minimax algorithm
– Perfect play for deterministic, 2-player game
– Max tries to maximize its score
– Min tries to minimize Max’s score (Min)
– Goal: move to position of highest minimax value
Identify best achievable payoff against best play
Carla P. Gomes
CS4700
Minimax Algorithm
Carla P. Gomes
CS4700
Minimax Algorithm (cont’d)
3 9 0 7 2 6
Carla P. Gomes
CS4700
Minimax Algorithm (cont’d)
3 0 2
3 9 0 7 2 6
Carla P. Gomes
CS4700
Minimax Algorithm (cont’d)
3 0 2
3 9 0 7 2 6
Carla P. Gomes
CS4700
Minimax Algorithm (cont’d)
Limitations
– Not always feasible to traverse entire tree
– Time limitations
Key Improvement
– Use evaluation function instead of utility
• Evaluation function provides estimate of utility at given position
More soon…
Carla P. Gomes
CS4700
α-β Pruning
Principle
– If a move is determined worse than another move already
examined, then there is no need for further examination of the
node.
Carla P. Gomes
CS4700
α-β Pruning Example
Carla P. Gomes
CS4700
α-β Pruning Example (cont’d)
Carla P. Gomes
CS4700
α-β Pruning Example (cont’d)
Carla P. Gomes
CS4700
α-β Pruning Example (cont’d)
Carla P. Gomes
CS4700
α-β Pruning Example (cont’d)
Carla P. Gomes
CS4700
Alpha-Beta Pruning (αβ prune)
Rules of Thumb
Carla P. Gomes
CS4700
Alpha-Beta Pruning Example
2. Search below a
MAX node may be
beta-pruned if the
alpha value is >= to
the beta value of
some MIN ancestor.
Carla P. Gomes
CS4700
Alpha-Beta Pruning Example
Carla P. Gomes
CS4700
Alpha-Beta Pruning Example
Carla P. Gomes
CS4700
Alpha-Beta Pruning Example
Carla P. Gomes
CS4700
Alpha-Beta Pruning Example
Carla P. Gomes
CS4700
-β Search Algorithm
2. Search below a
MAX node may be
beta-pruned if the
alpha value is >= to
the beta value of
some MIN ancestor.
Carla P. Gomes
CS4700
Example
1.Search below a
MIN node may
be alpha-pruned
if the beta value
is <= to the alpha
5 value of some
5 3 MAX ancestor.
2. Search below a
MAX node may
be beta-pruned if
5 3 the alpha value is
>= to the beta
value of some
5 6 3 7 MIN ancestor.
α
5 6 3
5 0 6 1 3 2 4 7
β
Carla P. Gomes
CS4700
Properties of α-β Prune
Carla P. Gomes
CS4700
Resource limits
Standard approach:
evaluation function
= estimated desirability of position
cutoff test:
e.g., depth limit What is the problem with that?
Evaluation function
– Performed at search cutoff point
– Must have same terminal/goal states as utility function
– Tradeoff between accuracy and time → reasonable complexity
– Accurate
• Performance of game-playing system dependent on
accuracy/goodness of evaluation
• Evaluation of nonterminal states strongly correlated with actual
chances of winning
Carla P. Gomes
CS4700
Evaluation functions
e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens), etc.
25 24 23 22 21 20 19 18 17 16 15 14 13
38
Expectiminimax
Carla P. Gomes
CS4700
Game Tree for Backgammon
MAX
DICE …
… … … …
1/36 1/18
1,2 6,5 6,6
1,1
MIN …
… … …
DICE C …
… … …
… 1/18
1/36
1,2
MAX
1,1 …6,5 6,6
… … …
TERMINAL
Expectiminimax
Expectiminimax(n) =
Utility(n) for n, a terminal state
Carla P. Gomes
CS4700
Expectiminimax
A1 A2 A1 A2
2.1 1.3
21 40.9
.9 .1 .9 .1 .1
.9 .1 .9
2 3 1 4 20 30 1 40
2 3 1 1 4 4 40 40
23 20 30 1 1
20 30
Carla P. Gomes
CS4700
Chess: Case Study
Carla P. Gomes
CS4700
Combinatorics of Chess
Opening book
Endgame
– database of all 5 piece endgames exists; database of all 6 piece games
being built
Middle game
– Positions evaluated (estimation)
• 1 move by each player = 1,000
• 2 moves by each player = 1,000,000
• 3 moves by each player = 1,000,000,000
Carla P. Gomes
CS4700
Positions with Smart Pruning
2 60
4 2,000
6 60,000
8 2,000,000
10 (<1 second DB) 60,000,000
12 2,000,000,000
14 (5 minutes DB) 60,000,000,000
16 2,000,000,000,000
Carla P. Gomes
CS4700
Game Tree Search
Carla P. Gomes
CS4700
History of Search Innovations
Carla P. Gomes
CS4700
Evaluation Functions
Carla P. Gomes
CS4700
Learning better evaluation functions
Carla P. Gomes
CS4700
Transposition Tables
Carla P. Gomes
CS4700
Transposition Tables as Learning
Carla P. Gomes
CS4700
Special-Purpose and Parallel Hardware
Carla P. Gomes
CS4700
Deep Blue
Hardware
– 32 general processors
– 220 VSLI chess chips
Overall: 200,000,000 positions per second
– 5 minutes = depth 14
Selective extensions - search deeper at unstable positions
– down to depth 25 !
Carla P. Gomes
CS4700
Tactics into Strategy
As Deep Blue goes deeper and deeper into a position, it displays elements
of strategic understanding. Somewhere out there mere tactics translate
into strategy. This is the closet thing I've ever seen to computer
intelligence. It's a very weird form of intelligence, but you can feel it.
It feels like thinking.
– Frederick Friedel (grandmaster), Newsday, May 9, 1997
Carla P. Gomes
CS4700
Automated reasoning --- the path
1M Multi-agent systems
10301,020 5M combining:
reasoning,
Case complexity
uncertainty &
ty
learning
e xi
0.5M VLSI
pl
m
10150,500 1M Verification
Co
l
ne ntia
o
E xp
100K Military Logistics
450K
106020
20K Chess (20 steps deep) & Kriegspiel (!)
100K
Pieces hidden
from opponent
Interesting combination of
reasoning, game tree
search, and uncertainty.
Carla P. Gomes
CS4700
The Danger of Introspection
Carla P. Gomes
CS4700
State-of-the-art of other games
Carla P. Gomes
CS4700
Deterministic games in practice
– Search techniques
– Heuristic functions
– Bounding and pruning technqiues
– Knowledge database on game
Carla P. Gomes
CS4700