•In the search tree, the first layer is a move by MAX, the next layer a move by MIN, and the layers alternate down to the terminal states.
[Figure: partial tic-tac-toe game tree, from the empty board down to terminal states with utilities −1, 0, and +1]
•Each layer in the search is called a ply.
Minimax Algorithm
•Compute the utility of each node bottom up, from the leaves toward the root.
•At each MAX node, pick the move with maximum utility.
•At each MIN node, pick the move with minimum utility
(assumes the opponent always acts correctly to minimize utility).
•Can be performed using a depth-first search.

function MINIMAX-VALUE(state, game) returns a utility value
  if TERMINAL-TEST[game](state) then
    return UTILITY[game](state)
  else if MAX is to move in state then
    return the highest MINIMAX-VALUE of SUCCESSORS(state)
  else
    return the lowest MINIMAX-VALUE of SUCCESSORS(state)

Minimax Computation
[Figure: two-ply game tree with MAX moves A1, A2, A3 and MIN replies A11–A33; leaf utilities 3 12 8, 2 4 6, 14 5 2; the backed-up MIN values are 3, 2, 2, so MAX chooses A1 with value 3]
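The bottom-up computation above can be sketched as a depth-first recursion in Python. The tree encoding here is a hypothetical convenience: internal nodes are lists of successors and leaves are utility values; the example tree is the one from the Minimax Computation figure.

```python
# A minimal sketch of MINIMAX-VALUE as depth-first recursion.
# Internal nodes are lists of successor states; leaves are utilities.

def minimax_value(state, is_max):
    """Return the backed-up minimax utility of `state`."""
    if isinstance(state, (int, float)):      # TERMINAL-TEST: a leaf
        return state                         # UTILITY of the state
    values = [minimax_value(s, not is_max) for s in state]
    return max(values) if is_max else min(values)

# Two-ply tree: MAX to move; each move A1..A3 leads to a MIN node.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, True))   # MIN values are 3, 2, 2; MAX picks 3
```

Because the recursion visits children in order and only keeps a running max/min, it needs memory linear in the depth, as in any depth-first search.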
•Generating the complete game tree for all but the simplest games is intractable.
•Instead, search to a uniform depth (ply) d, and estimate the utility of the states at the cutoff with an evaluation function, typically a weighted linear sum of board features:
Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
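A depth-limited search with a weighted linear evaluation can be sketched as below. The toy game, features, and weights are hypothetical placeholders, only meant to show where the cutoff test replaces the terminal test.

```python
# Sketch of depth-limited minimax applying Eval(s) = w1*f1(s)+...+wn*fn(s)
# at the cutoff. The successor function, features, and weights below are
# made-up stand-ins, not a real game.

def eval_fn(state, weights, features):
    # Weighted linear evaluation: sum of w_i * f_i(state).
    return sum(w * f(state) for w, f in zip(weights, features))

def minimax_cutoff(state, depth, is_max, successors, weights, features):
    children = successors(state)
    if depth == 0 or not children:           # cutoff test replaces terminal test
        return eval_fn(state, weights, features)
    vals = [minimax_cutoff(c, depth - 1, not is_max, successors,
                           weights, features)
            for c in children]
    return max(vals) if is_max else min(vals)

# Toy game: states are integers; each state below 8 has two successors.
succ = lambda s: [2 * s, 2 * s + 1] if s < 8 else []
features = [lambda s: s % 3, lambda s: s]    # two made-up features
weights = [1.0, 0.1]
print(minimax_cutoff(1, 2, True, succ, weights, features))
```

Tuning the weights w1…wn (by hand or by learning, as in Samuel's checkers program) is what makes the cutoff estimate useful.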
Determining Cutoff (cont)
[Figure: chess position, Black to move]

Alpha-Beta Pruning
[Figure: two-ply game tree for MAX moves A1, A2, A3; the first MIN node is valued 3, and after the first leaf under A2 is found to be 2, that MIN node is bounded ≤2 and its remaining successors are pruned; leaves examined: 3 12 8 2 14 5 2]
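The pruning in the figure can be sketched in Python. The tree encoding (lists for internal nodes, numbers for leaves) is a hypothetical convenience; the `visited` list records which leaves are actually evaluated, so the pruning is observable.

```python
# Sketch of alpha-beta pruning on the same two-ply tree as the minimax
# example. `visited` collects evaluated leaves to show what was pruned.

def alphabeta(state, alpha, beta, is_max, visited):
    if isinstance(state, (int, float)):
        visited.append(state)                # record evaluated leaf
        return state
    if is_max:
        v = float("-inf")
        for s in state:
            v = max(v, alphabeta(s, alpha, beta, False, visited))
            alpha = max(alpha, v)
            if alpha >= beta:                # beta cutoff: MIN won't allow this
                break
        return v
    else:
        v = float("inf")
        for s in state:
            v = min(v, alphabeta(s, alpha, beta, True, visited))
            beta = min(beta, v)
            if alpha >= beta:                # alpha cutoff: MAX won't allow this
                break
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, float("-inf"), float("inf"), True, seen))  # 3
print(seen)   # [3, 12, 8, 2, 14, 5, 2]: the 4 and 6 under A2 are pruned
```

After the first MIN node returns 3, MAX has alpha = 3; the moment A2's first leaf yields 2, beta for that MIN node drops to 2 ≤ alpha, so its remaining leaves cannot matter.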
Effectiveness of Alpha-Beta
•Amount of pruning depends on the order in which siblings are explored.
•In the optimal case, where the best options are explored first, time complexity reduces from O(b^d) to O(b^(d/2)), a dramatic improvement. But this entails knowing the best move in advance!
•With successors randomly ordered, the asymptotic bound is O((b/log b)^d), which is not much help, and it is only accurate for b > 1000. A more realistic expectation is something like O(b^(3d/4)).
•A fairly simple ordering heuristic can produce results closer to the optimal case than to the random one (e.g. check captures & threats first).
•Theoretical analysis makes unrealistic assumptions, such as utility values being distributed randomly across leaves, so experimental results are necessary.

State of the Art Game Programs
•Chess: Deep Blue beat the world champion.
-Customized parallel hardware
-Highly-tuned evaluation function developed with expert help
-Comprehensive opening and end-game databases
•Checkers:
-Samuel's learning program eventually beat its developer.
-Chinook is world champion; the previous champion, who had held the title for 40 years, had to withdraw for health reasons.
•Othello: Programs beat the best human players (e.g. Iago).
•Backgammon: Neural-net learning program TD-Gammon is one of the world's top 3 players.
•Go: Branching factor of ~360 kills most search methods. Best programs still mediocre.
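The effect of move ordering on alpha-beta, noted above, can be demonstrated experimentally by counting leaf evaluations on a random tree. The setup below is an illustrative assumption, not a benchmark: a random uniform tree, searched once with siblings in their generated order and once sorted best-first by true minimax value (an oracle that a real program would approximate with heuristics like captures-first).

```python
import random

# Illustrative experiment: count leaf evaluations of alpha-beta on a
# random uniform tree, with siblings in random order vs. best-first.

def build(depth, branching, rng):
    # Random game tree: internal nodes are lists, leaves are ints.
    if depth == 0:
        return rng.randint(0, 99)
    return [build(depth - 1, branching, rng) for _ in range(branching)]

def minimax(node, is_max):
    if isinstance(node, int):
        return node
    vals = [minimax(c, not is_max) for c in node]
    return max(vals) if is_max else min(vals)

def order_best_first(node, is_max):
    # Oracle ordering: put each player's best option first.
    if isinstance(node, int):
        return node
    kids = [order_best_first(c, not is_max) for c in node]
    kids.sort(key=lambda c: minimax(c, not is_max), reverse=is_max)
    return kids

def alphabeta(node, alpha, beta, is_max, counter):
    if isinstance(node, int):
        counter[0] += 1                       # a leaf was evaluated
        return node
    v = float("-inf") if is_max else float("inf")
    for child in node:
        cv = alphabeta(child, alpha, beta, not is_max, counter)
        if is_max:
            v = max(v, cv)
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            beta = min(beta, v)
        if alpha >= beta:                     # cutoff
            break
    return v

rng = random.Random(0)
tree = build(6, 3, rng)                       # 3^6 = 729 leaves
plain, ordered = [0], [0]
alphabeta(tree, float("-inf"), float("inf"), True, plain)
alphabeta(order_best_first(tree, True),
          float("-inf"), float("inf"), True, ordered)
print(plain[0], ordered[0])                   # best-first order prunes far more
```

With good ordering the leaf count approaches the O(b^(d/2)) regime, while the unordered search sits much closer to the full b^d leaves.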