Adversarial Search
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Game Playing State-of-the-Art
▪ Checkers: 1950: First computer player. 1994: First computer champion: Chinook ended the 40-year reign of human champion Marion Tinsley using a complete 8-piece endgame database. 2007: Checkers solved!
▪ Pacman
Pac-Man as an Agent
[Diagram: the agent–environment loop — the environment delivers percepts through sensors, the agent ("?") chooses actions, and actuators act back on the environment]
Pac-Man is a registered trademark of Namco-Bandai Games, used here for educational purposes.
[Demo 1: pacman-l1.mp4 or L1D2]
Reflex Agents

▪ Reflex agents:
  ▪ Choose action based on current percept (and maybe memory)
  ▪ May have memory or a model of the world's current state
  ▪ Do not consider the future consequences of their actions
  ▪ Consider how the world IS
▪ Planning agents:
  ▪ Ask "what if"
  ▪ Decisions based on (hypothesized) consequences of actions
  ▪ Must have a model of how the world evolves in response to actions
  ▪ Must formulate a goal (test)
  ▪ Consider how the world WOULD BE
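The contrast above can be sketched in code. A minimal sketch with hypothetical names (not the CS188 project API): a reflex agent maps the current percept straight to an action, while a planning agent uses a model of successors to hypothesize consequences before committing.

```python
# A minimal sketch with hypothetical names (not the CS188 project API).

def reflex_action(percept, scores):
    """Reflex: act on how the world IS -- pick the best-looking action now."""
    return max(percept, key=lambda action: scores[action])

def plan(state, successors, is_goal, depth):
    """Planning: ask "what if" -- depth-limited search over hypothesized
    consequences, using a model (`successors`) of how the world evolves."""
    if is_goal(state):
        return []                      # empty plan: already at the goal
    if depth == 0:
        return None                    # give up at the depth limit
    for action, next_state in successors(state):
        rest = plan(next_state, successors, is_goal, depth - 1)
        if rest is not None:
            return [action] + rest     # prepend this action to the sub-plan
    return None

# Toy world: states are integers, "inc" adds 1, "dec" subtracts 1.
succ = lambda s: [("inc", s + 1), ("dec", s - 1)]
print(plan(0, succ, lambda s: s == 2, depth=3))   # ['inc', 'inc']
```

The toy world at the bottom shows the planning agent finding a two-step plan by simulating futures it has never actually visited.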
▪ A search problem consists of:
  ▪ A state space
  ▪ A successor function (with actions and costs), e.g. ("E", 1.0)
  ▪ A start state and a goal test
▪ World state:
  ▪ Agent positions: 120
  ▪ Food count: 30
  ▪ Ghost positions: 12
  ▪ Agent facing: NSEW
▪ How many world states? 120 × (2^30) × (12^2) × 4
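A quick check of the multiplication above (the component counts are the slide's example maze):

```python
# World states = agent positions x food configurations x ghost placements x facings
agent_positions = 120        # cells Pacman can occupy (slide's example maze)
food_configs = 2 ** 30       # each of 30 food dots is present or eaten
ghost_placements = 12 ** 2   # two ghosts, 12 cells each
facings = 4                  # N, S, E, W

print(agent_positions * food_configs * ghost_placements * facings)
# 74217034874880 -- roughly 7.4e13 world states
```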
Search Trees
[Diagram: search tree — the root is "this is now" (the start state); edges such as ("N", 1.0) and ("E", 1.0) lead to possible futures]
▪ A search tree:
▪ A “what if” tree of plans and their outcomes
▪ The start state is the root node
▪ Children correspond to successors
▪ Nodes show states, but correspond to PLANS that achieve those states
▪ For most problems, we can never actually build the whole tree
Recall: Search
▪ Depth-first search
▪ Breadth-first search
▪ A* Search
▪ Iterative Deepening
A* Search and Iterative Deepening
Adversarial Games
Types of Games
▪ Many different kinds of games!
▪ Axes:
▪ Deterministic or stochastic?
▪ One, two, or more players?
▪ Zero sum?
▪ Perfect information (can you see the state)?
Value of a State

Value of a state: the best achievable outcome (utility) from that state.
▪ Non-terminal states: V(s) = max over successors s' of V(s')
▪ Terminal states: V(s) is a known utility
[Diagram: a max node over terminal utilities 2, 0, …, 2, 6, …, 4, 6]
Adversarial Game Trees
[Diagram: game tree with terminal-state utilities -8, -5, -10, +8]
Tic-Tac-Toe Game Tree
Adversarial Search (Minimax)
▪ Deterministic, zero-sum games:
  ▪ Tic-tac-toe, chess, checkers
  ▪ One player maximizes the result
  ▪ The other minimizes the result
▪ Minimax search:
  ▪ A state-space search tree
  ▪ Players alternate turns
  ▪ Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
▪ Minimax values are computed recursively from the terminal values, which are part of the game
[Diagram: MAX root with value 5 over MIN nodes with values 2 and 5, over terminal values 8, 2, 5, 6]
Minimax Implementation
[Example tree: leaf utilities 3 12 8 / 2 4 6 / 14 5 2]
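A minimal runnable sketch of the recursion, representing the example tree as nested lists: a leaf is a terminal utility, an internal node is a list of children, and players strictly alternate starting with MAX at the root.

```python
# Minimax over a nested-list game tree: a leaf is a number (terminal
# utility), an internal node is a list of children, and the players
# strictly alternate starting with MAX at the root.

def minimax(node, maximizing=True):
    if not isinstance(node, list):     # terminal state: known utility
        return node
    if maximizing:                     # MAX picks the child best for itself
        return max(minimax(child, False) for child in node)
    return min(minimax(child, True) for child in node)

# The lecture's example tree: three MIN nodes over leaves 3 12 8 / 2 4 6 / 14 5 2
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))   # 3 -- the MIN values are 3, 2, 2, and MAX takes the best
```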
Minimax Efficiency
▪ How efficient is minimax?
  ▪ Just like (exhaustive) DFS
  ▪ Time: O(b^m)
  ▪ Space: O(bm)
▪ Example:
  ▪ Suppose we have 100 seconds, and can explore 10K nodes / sec
  ▪ So we can check 1M nodes per move
  ▪ α-β reaches about depth 8 — a decent chess program
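A back-of-the-envelope check of these numbers, assuming a chess branching factor of roughly 35 (a standard estimate, not stated on the slide): plain minimax with a 1M-node budget reaches only about depth 4, while alpha-beta's best case of O(b^(m/2)) roughly doubles the reachable depth to about 8.

```python
import math

# Minimax explores on the order of b**m nodes. With the slide's budget of
# 1M nodes per move and chess's branching factor of ~35 (an assumed
# standard estimate), solve b**m = budget for the reachable depth m.
budget = 1_000_000
b = 35

plain_depth = round(math.log(budget, b))      # exhaustive minimax
ab_depth = round(2 * math.log(budget, b))     # alpha-beta best case: b**(m/2)

print(plain_depth, ab_depth)   # 4 8
```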
[Demo: thrashing d=2, thrashing d=2 (fixed evaluation function), smart ghosts coordinate (L6D6,7,8,10)]
Why Pacman Starves
Minimax Pruning
[Example: the same tree with pruning — only leaves 3 12 8 / 2 / 14 5 2 are visited; the 4 and 6 branches are never explored]
Alpha-Beta Pruning
▪ General configuration (MIN version)
  ▪ We're computing the MIN-VALUE at some node n
  ▪ We're looping over n's children
  ▪ n's estimate of the children's min is dropping
  ▪ Who cares about n's value? MAX
  ▪ Let a be the best value that MAX can get at any choice point along the current path from the root
  ▪ If n becomes worse than a, MAX will avoid it, so we can stop considering n's other children (it's already bad enough that it won't be played)
▪ Good child ordering improves effectiveness of pruning
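A sketch of alpha-beta following the MIN-version reasoning above, reusing a nested-list tree representation (a leaf is a terminal utility): alpha tracks the best value MAX can already guarantee on the path from the root, beta the best MIN can guarantee, and a MIN node stops expanding children as soon as its value drops to alpha or below.

```python
import math

# Alpha-beta over a nested-list game tree (leaf = terminal utility,
# internal node = list of children, players alternate from a MAX root).
# alpha: best value MAX can already guarantee on the path from the root.
# beta:  best value MIN can already guarantee on that path.

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    if not isinstance(node, list):
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:              # MIN above will never let play reach here
                return v               # prune the remaining children
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta))
        if v <= alpha:                 # MAX above already has something better
            return v                   # prune the remaining children
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))   # 3 -- same value as minimax, fewer leaves visited
```

On this tree, the second MIN node returns as soon as it sees the leaf 2 (its value can only drop below MAX's guaranteed 3), so the 4 and 6 branches are never explored.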
Note: inaccuracy in the evaluation function matters less and less the deeper the search goes!