Professional Documents
Culture Documents
Adversarial Search
INT3401 21
Instructor: Nguyễn Văn Vinh, UET - Hanoi VNU
2020
1 4/19/2020
Warm Up
2 0 … 2 6 … 4 6
5 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Value of a State
Non-Terminal States:
Value of a state: V(s) = max V(s’)
s’ successors(s)
The best achievable
outcome (utility)
from that state
Terminal States:
2 0 … 2 6 … 4 6
V(s) = known
6 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Multi-Agent Applications
(Football)
Simplest idea: each agent plans their own actions separately from others.
2 0 … 2 6 … 4 6
10 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Idea 2: Joint State/Action Spaces
𝑆0 = (𝑆0𝐴 , 𝑆0𝐵 )
𝑆𝐾 = (𝑆𝐾𝐴 , 𝑆𝐾𝐵 )
Search looks through all combinations of all agents’ states and actions
Think of one brain controlling many agents
𝑆𝐾 = (𝑆𝐾𝐴 , 𝑆𝐾𝐵 )
Search looks through all combinations of all agents’ states and actions
Think of one brain controlling many agents
Each agent proposes their actions and computer confirms the joint plan
Example: Autonomous driving through intersections
https://www.youtube.com/watch?v=4pbAI40dK0A
Search one agent’s actions from a state, search the next agent’s actions
from those resulting states , etc…
Agent 2
Agent 1
Search one agent’s actions from a state, search the next agent’s actions
from those resulting states , etc…
(Football)
Deterministic or stochastic?
Perfect information (fully observable)?
One, two, or more players?
Turn-taking or simultaneous?
Zero sum?
Two-Player
Two players involved
Sequential
Players take turns to move (take actions)
Act alternately
Zero-sum
Utility values of the two players at the end of the game have equal absolute value and
opposite sign
Perfect Information
Each player can fully observe what actions other agents have taken
https://en.wikipedia.org/wiki/Tic-tac-toe
23 Nguyen Van Vinh - UET, VNU Hanoi 4/19/2020
Chinese Chess
https://www.chessprogramming.org/Pham_Hong_Nguyen
https://vnexpress.net/giai-co-tuong-may-tinh-lan-dau-tien-o-viet-nam-
1773345.html
https://en.wikipedia.org/wiki/Chess
25 Nguyen Van Vinh - UET, VNU Hanoi 4/19/2020
Chess
Result: Win-loss-draw-draw-draw-loss
(In even-numbered games, Deep Blue played
white)
https://www.youtube.com/watch?v=SUbqykXVx0A
28 Nguyen Van Vinh - UET, VNU Hanoi 4/19/2020
Go
https://deepmind.com/blog/alphago-zero-learning-scratch/
AlphaGo: https://www.nature.com/articles/nature16961.pdf
AlphaZero: www.nature.com/articles/nature24270.pdf
29 Nguyen Van Vinh - UET, VNU Hanoi 4/19/2020
Zero-Sum Games
Search one agent’s actions from a state, search the competitor’s actions
from those resulting states , etc…
This is a zero-sum
game, the best action
for X is the worst
action for O and vice
versa
How do we define
best and worst?
MAX nodes: under Agent’s control MIN nodes: under Opponent’s control
V(s) = max V(s’) V(s) = min V(s’)
s’ successors(s) s’ successors(s)
-8
-8 -10
-8 -5 -10 +8
Terminal States:
V(s) = known
36 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Minimax Algorithm
3 2 2
3 12 8 2 4 6 14 5 2
A) BFS
B) DFS
C) UCS
D) A*
MAX nodes: under Agent’s control MIN nodes: under Opponent’s control
V(s) = max V(s’) V(s) = min V(s’)
s’ successors(s) s’ successors(s)
-8
-8 -10
-8 -5 -10 +8
Terminal States:
V(s) = known
40 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Minimax Efficiency
Example:
Suppose we have 100 seconds, can explore 10K nodes / sec
So can check 1M nodes per move
For chess, b=~35 so reaches about depth 4 – not so good
? ? ? ?
Example:
Suppose we have 100 seconds, can explore 10K nodes / sec
So can check 1M nodes per move
For chess, b=~35 so reaches about depth 4 – not so good ? ? ? ?
3 2 2
3 12 8 2 4 6 14 5 2
α =3 α =3
3
3 12 8 2 14 5 2
We can prune when: min node won’t be The order of generation matters: more pruning
higher than 2, while parent max has seen is possible if good moves come first
something larger in another branch
α = 10
β = 10 ?
β=2
? ?
α= 10 α=100 v= 2
1
? ? ?
77 Nguyen Van Vinh – UET, VNU Hanoi 4/19/2020
Alpha-Beta Pruning Properties
Theorem: This pruning has no effect on minimax value computed for the root!
10 10 0