
Artificial Intelligence:

Adversarial Search

INT3401 21
Instructor: Nguyễn Văn Vinh, UET - Hanoi VNU
2020

Warm Up

How would you search for moves in Tic Tac Toe?



Warm Up

How is Tic Tac Toe different from maze search?



Warm Up

How is Tic Tac Toe different from maze search?

Tic Tac Toe: multi-agent, adversarial, zero-sum
Maze search: single-agent


Single-Agent Trees

[Search tree figure: terminal utilities 2, 0, …, 2, 6, …, 4, 6]
Value of a State

Non-terminal states:
V(s) = max of V(s') over s' ∈ successors(s)
The best achievable outcome (utility) from that state

Terminal states:
V(s) = known

[Search tree figure: terminal utilities 2, 0, …, 2, 6, …, 4, 6]
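As a concrete rendering of this definition, here is a minimal Python sketch. The tree encoding (a state is either a number, i.e., a known terminal utility, or a list of successor states) is an illustrative assumption, not from the slides:

def value(s):
    # V(s): the best achievable utility from state s in a single-agent tree.
    if isinstance(s, (int, float)):              # terminal state: V(s) is known
        return s
    return max(value(child) for child in s)      # V(s) = max over successors

# A tiny tree using some of the terminal utilities shown on the slide:
print(value([[2, 0], [2, 6], [4, 6]]))           # 6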
Multi-Agent Applications

Football: adversarial (between teams), collaborative (within a team)
Collaborative maze solving
Competitions: adversarial


How could we model multi-agent collaborative problems?



How could we model multi-agent problems?

Simplest idea: each agent plans its own actions separately from the others.



Many Single-Agent Trees

Choose the best action for each agent independently.

Non-terminal states:
V(s) = max of V(s') over s' ∈ successors(s)

[Search tree figure: terminal utilities 2, 0, …, 2, 6, …, 4, 6]
Idea 2: Joint State/Action Spaces

Combine the states and actions of the N agents

S_0 = (S_0^A, S_0^B)


Idea 2: Joint State/Action Spaces

Combine the states and actions of the N agents

S_K = (S_K^A, S_K^B)


Idea 2: Joint State/Action Spaces

Search looks through all combinations of all agents’ states and actions
Think of one brain controlling many agents

S_K = (S_K^A, S_K^B)
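As a minimal sketch of this combination, assuming hypothetical per-agent helpers actions(s_i) and result(s_i, a_i), and modeling a joint state as a tuple of per-agent states:

from itertools import product

def joint_successors(joint_state, actions, result):
    # Yield every successor of a joint state: one action chosen per agent.
    per_agent = [actions(s) for s in joint_state]      # each agent's legal actions
    for joint_action in product(*per_agent):           # |A|^N joint actions
        yield tuple(result(s, a) for s, a in zip(joint_state, joint_action))

With N agents this enumerates |A|^N joint actions per state, which is exactly the blow-up probed by the questions on the next slide.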


Idea 2: Joint State/Action Spaces

Search looks through all combinations of all agents’ states and actions
Think of one brain controlling many agents

What is the size of the state space?
What is the size of the action space?
What is the size of the search tree?


Idea 3: Centralized Decision Making

Each agent proposes its actions and a central computer confirms the joint plan.
Example: autonomous driving through intersections.

https://www.youtube.com/watch?v=4pbAI40dK0A


Idea 4: Alternate Searching One Agent at a Time

Search one agent's actions from a state, search the next agent's actions
from those resulting states, etc.

Choose the best cascading combination of actions. Levels of the search tree
alternate between the agents: Agent 1, Agent 2, Agent 1, …

Non-terminal states:
V(s) = max of V(s') over s' ∈ successors(s)


Idea 4: Alternate Searching One Agent at a Time

Search one agent's actions from a state, search the next agent's actions
from those resulting states, etc.

What is the size of the state space?
What is the size of the action space?
What is the size of the search tree?




Games



Why study games?

 Games are a traditional hallmark of intelligence
 Games are easy to formalize
 Games can be a good model of real-world competitive or cooperative activities
 Military confrontations, negotiations, auctions, etc.
 Games have been a key driver of new techniques in CS and AI
Types of Games

 Deterministic or stochastic?
 Perfect information (fully observable)?
 One, two, or more players?
 Turn-taking or simultaneous?
 Zero sum?



A Special Case

 Two-Player Sequential Zero-Sum Perfect-Information Games

 Two-Player
 Two players involved
 Sequential
 Players take turns to move (take actions); they act alternately
 Zero-Sum
 Utility values of the two players at the end of the game have equal absolute value and opposite sign
 Perfect Information
 Each player can fully observe what actions the other player has taken


Tic-Tac-Toe

 Two players, X and O, take turns marking the spaces in a 3×3 grid
 The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game

https://en.wikipedia.org/wiki/Tic-tac-toe
Chinese Chess

 https://www.chessprogramming.org/Pham_Hong_Nguyen
 https://vnexpress.net/giai-co-tuong-may-tinh-lan-dau-tien-o-viet-nam-
1773345.html



Chess

https://en.wikipedia.org/wiki/Chess
Chess

Garry Kasparov vs Deep Blue (1996)

Result (from Deep Blue's perspective): win, loss, draw, draw, loss, loss;
Kasparov won the match 4–2. (Deep Blue played White in the odd-numbered games.)




Go

 DeepMind promotion video before the game with Lee Sedol

https://www.youtube.com/watch?v=SUbqykXVx0A
Go

AlphaGo vs Lee Sedol (3/2016)
Result: win, win, win, loss, win

AlphaGo Zero vs AlphaGo (2017)
Result: 100–0
https://deepmind.com/blog/alphago-zero-learning-scratch/

AlphaGo: https://www.nature.com/articles/nature16961.pdf
AlphaGo Zero: https://www.nature.com/articles/nature24270.pdf
Zero-Sum Games

Zero-Sum Games:
 Agents have opposite utilities
 Pure competition: one maximizes, the other minimizes

General Games:
 Agents have independent utilities
 Cooperation, indifference, competition, shifting alliances, and more are all possible


Game Trees

Search one agent's actions from a state, search the competitor's actions
from those resulting states, etc.


Tic-Tac-Toe Game Tree



Tic-Tac-Toe Game Tree

This is a zero-sum game: the best action for X is the worst action for O, and vice versa.

How do we define best and worst?


Tic-Tac-Toe Game Tree
Instead of taking the max utility at every level, alternate max and min.


Tic-Tac-Toe Minimax

MAX nodes: under Agent's control
V(s) = max of V(s') over s' ∈ successors(s)

MIN nodes: under Opponent's control
V(s) = min of V(s') over s' ∈ successors(s)


Small Pacman Example

MAX nodes (Agent's control): V(s) = max of V(s') over s' ∈ successors(s)
MIN nodes (Opponent's control): V(s) = min of V(s') over s' ∈ successors(s)
Terminal states: V(s) = known

[Tree figure: MAX root −8; MIN children −8 and −10; terminal leaves −8, −5, −10, +8]
Minimax Algorithm

function minimax-decision(s) returns action
  return the action a in Actions(s) with the highest min-value(Result(s,a))

function max-value(s) returns value
  if Terminal-Test(s) then return Utility(s)
  initialize v = -∞
  for each a in Actions(s):
    v = max(v, min-value(Result(s,a)))
  return v

function min-value(s) returns value
  if Terminal-Test(s) then return Utility(s)
  initialize v = +∞
  for each a in Actions(s):
    v = min(v, max-value(Result(s,a)))
  return v
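For concreteness, a runnable Python rendering of this pseudocode, reusing the toy tree encoding from the value-of-a-state sketch earlier (a state is a terminal utility or a list of successor states); the encoding is an assumption for illustration:

def max_value(s):
    # V(s) at a MAX node; a bare number is a terminal with known utility.
    if isinstance(s, (int, float)):
        return s
    return max(min_value(c) for c in s)

def min_value(s):
    # V(s) at a MIN node.
    if isinstance(s, (int, float)):
        return s
    return min(max_value(c) for c in s)

# The example tree on the next slide: a MAX root over three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree))   # 3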
Minimax Example

[Tree figure: MAX root value 3; MIN children 3, 2, 2; leaves (3, 12, 8), (2, 4, 6), (14, 5, 2)]


Poll

What kind of search is Minimax Search?

A) BFS
B) DFS
C) UCS
D) A*



Minimax is Depth-First Search

MAX nodes (Agent's control): V(s) = max of V(s') over s' ∈ successors(s)
MIN nodes (Opponent's control): V(s) = min of V(s') over s' ∈ successors(s)
Terminal states: V(s) = known

[Same Pacman tree as before: MAX root −8; MIN children −8 and −10; leaves −8, −5, −10, +8]
Minimax Efficiency

 How efficient is minimax?
 Just like (exhaustive) DFS
 Time: O(b^m)
 Space: O(bm)

 Example: For chess, b ≈ 35, m ≈ 100
 Exact solution is completely infeasible
 Humans can't do this either, so how do we play chess?


Generalized minimax

 What if the game is not zero-sum, or has multiple players?

 Generalization of minimax:
 Terminals have utility tuples
 Node values are also utility tuples
 Each player maximizes its own component
 Can give rise to cooperation and competition dynamically…

[Three-player tree figure with utility triples: root 8,8,1; children 8,8,1 and 7,7,2;
grandchildren 0,0,7; 8,8,1; 7,7,2; 0,0,8;
leaves 1,1,6; 0,0,7; 9,9,0; 8,8,1; 9,9,0; 7,7,2; 0,0,8; 0,0,7]
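A minimal sketch of this multi-player generalization in Python (the idea is often called max^n); the nested tree encoding, with utility tuples at terminals, and the fixed turn order are assumptions for illustration:

def maxn_value(s, player, num_players):
    # Each player maximizes its own component of the utility tuple.
    if isinstance(s, tuple):                    # terminal: utility tuple is known
        return s
    nxt = (player + 1) % num_players            # assumed fixed turn order
    children = [maxn_value(c, nxt, num_players) for c in s]
    return max(children, key=lambda u: u[player])

# Two leaves from the slide's three-player example:
print(maxn_value([(1, 1, 6), (0, 0, 7)], player=0, num_players=3))   # (1, 1, 6)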


Resource Limits



Resource Limits

 Problem: In realistic games, cannot search to leaves!

 Solution 1: Bounded lookahead
 Search only to a preset depth limit or horizon
 Use an evaluation function for non-terminal positions

 Guarantee of optimal play is gone

 More plies make a BIG difference

 Example:
 Suppose we have 100 seconds, can explore 10K nodes / sec
 So can check 1M nodes per move
 For chess, b ≈ 35, so this reaches about depth 4 – not so good

[Figure: depth-limited tree; MAX value 4, MIN values −2 and 4, evaluated leaves −1, −2, 4, 9; nodes below the depth limit unexplored (?)]
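A minimal sketch of bounded lookahead, extending the earlier minimax sketch with a depth limit and an evaluation function; naive_eval is a toy stand-in, not a serious evaluation function:

def bounded_value(s, depth, is_max, eval_fn):
    # Depth-limited minimax: below the horizon, trust the evaluation function.
    if isinstance(s, (int, float)):       # true terminal: utility is known
        return s
    if depth == 0:                        # horizon reached: estimate, don't search
        return eval_fn(s)
    values = [bounded_value(c, depth - 1, not is_max, eval_fn) for c in s]
    return max(values) if is_max else min(values)

def naive_eval(s):
    # Toy estimate: average of everything below (illustration only).
    if isinstance(s, (int, float)):
        return s
    return sum(naive_eval(c) for c in s) / len(s)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(bounded_value(tree, depth=1, is_max=True, eval_fn=naive_eval))
# ~7.67 here, versus the true minimax value 3: the optimality guarantee is gone.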


Depth Matters

 Evaluation functions are always imperfect
 Deeper search => better play (usually)
 Or, deeper search gives the same quality of play with a less accurate evaluation function
 An important example of the tradeoff between complexity of features and complexity of computation


[Demo: depth limited (L6D4, L6D5)]
Evaluation Functions



Evaluation Functions

 Evaluation functions score non-terminals in depth-limited search

 Ideal function: returns the actual minimax value of the position

 In practice: typically a weighted linear sum of features:
 EVAL(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
 E.g., w1 = 9, f1(s) = (num white queens – num black queens), etc.

 Terminate search only in quiescent positions, i.e., positions where no major changes in feature values are expected
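As a small illustration of such a weighted linear sum, a sketch with made-up material weights and a toy board representation (both are assumptions, not the slides' method):

# Hypothetical material features: f_i(s) = (num white pieces - num black pieces).
WEIGHTS = {"queen": 9, "rook": 5, "bishop": 3, "knight": 3, "pawn": 1}

def evaluate(board):
    # EVAL(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s) over material counts;
    # board is assumed to look like {"white": {"queen": 1, ...}, "black": {...}}.
    return sum(w * (board["white"].get(p, 0) - board["black"].get(p, 0))
               for p, w in WEIGHTS.items())

board = {"white": {"queen": 1, "rook": 2, "pawn": 8},
         "black": {"queen": 1, "rook": 1, "pawn": 7}}
print(evaluate(board))   # +6: White is up a rook and a pawn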
Evaluation for Pacman





Solution 2: Game Tree Pruning



Pruning - Motivation

 Q1. Why would "Queen to G5" be a bad move for Black?
 Q2. How many White "replies" did you need to consider in answering?

Once we have seen one reply scary enough to convince us the move is really bad, we can abandon this move and continue searching elsewhere.


Alpha-beta Pruning

 We can improve on the performance of the minimax algorithm through alpha-beta pruning

 Basic idea: "If you have an idea that is surely bad, don't take the time to see how truly awful it is." -- Pat Winston

 We don't need to compute the value at this node: no matter what it is, it can't affect the value of the root node.


Intuition: prune the branches that can’t be chosen

[Same tree as the minimax example: MIN children 3, 2, 2 over leaves (3, 12, 8), (2, 4, 6), (14, 5, 2)]


Alpha-Beta Pruning Example

α = best option so far from any MAX node on this path

[Tree figure: MAX root with α = 3; left MIN child evaluates to 3 (leaves 3, 12, 8);
middle MIN child is abandoned after its first leaf 2; right MIN child sees leaves 14, 5, 2]

We can prune when a MIN node's value cannot rise above 2 while the parent MAX has already seen something larger in another branch.

The order of generation matters: more pruning is possible if good moves come first.


Why is α-β?

 α is the value of the best choice for the MAX player found so far at any choice point above node n
 We want to compute the MIN-value at n
 As we loop over n's children, the MIN-value decreases
 If it drops below α, MAX will never choose n, so we can ignore n's remaining children
 Pruning nodes
 Analogously, β is the value of the lowest-utility choice found so far for the MIN player
Alpha-Beta Algorithm

α: MAX's best option on path to root
β: MIN's best option on path to root

def max-value(state, α, β):
  initialize v = -∞
  for each successor of state:
    v = max(v, value(successor, α, β))
    if v ≥ β
      return v
    α = max(α, v)
  return v

def min-value(state, α, β):
  initialize v = +∞
  for each successor of state:
    v = min(v, value(successor, α, β))
    if v ≤ α
      return v
    β = min(β, v)
  return v
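A runnable Python rendering of this pseudocode, again on the toy tree encoding used above (the value() dispatch is folded into one recursive function):

import math

def alphabeta(s, alpha, beta, is_max):
    # Alpha-beta value of state s; skips children that cannot affect the root.
    if isinstance(s, (int, float)):           # terminal: utility is known
        return s
    v = -math.inf if is_max else math.inf
    for child in s:
        cv = alphabeta(child, alpha, beta, not is_max)
        if is_max:
            v = max(v, cv)
            if v >= beta:                     # MIN above will never allow this
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            if v <= alpha:                    # MAX above will never choose this
                return v
            beta = min(beta, v)
    return v

# The pruning example tree: same root value as plain minimax (3), but the
# middle MIN node is abandoned right after its first leaf (2).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, -math.inf, math.inf, is_max=True))   # 3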
Quiz: Minimax Example

What is the value of the blue triangle?


A) 10
B) 8
C) 4
D) 50



Quiz: Minimax Example

What is the value of the blue triangle?

A) 10
B) 8
C) 4
D) 50

Answer: B) 8. The two MIN children evaluate to 8 and 4, and MAX takes the larger.


Alpha-Beta Small Example

Step-by-step trace on the quiz tree above (root MAX; left MIN node with leaves 10 and 8; right MIN node whose first leaf is 4), using the max-value / min-value code from the Alpha-Beta Algorithm slide:

1. Call max-value at the root with α = −∞, β = ∞; initialize v = −∞.
2. Call min-value at the left MIN node with α = −∞, β = ∞; initialize v = +∞.
3. The first leaf returns 10: v = 10, then β = 10.
4. The second leaf returns 8: v = 8, then β = 8. No children remain, so return 8.
5. Back at the root: v = 8, then α = 8.
6. Call min-value at the right MIN node with α = 8, β = ∞; initialize v = +∞.
7. Its first leaf returns 4: v = 4. Now v ≤ α (4 ≤ 8), so return 4 immediately; the remaining leaves are pruned.
8. Back at the root: v = max(8, 4) = 8. The value of the root is 8.


Minimax Quiz

What is the value of the top node?


A) 10
B) 100
C) 2
D) 4



Alpha Beta Quiz

Which branches are pruned?


A) e, l
B) g, l
C) g, k, l
D) g, n



Alpha-Beta Quiz 2

[Exercise figure: a partially solved alpha-beta tree; fill in the missing α, β, and v values. Given values include α = 10 at the root, β = 10 and β = 2 at MIN nodes, α = 100, and v = 2.]
Alpha-Beta Pruning Properties

 Theorem: This pruning has no effect on the minimax value computed for the root!

 Good child ordering improves effectiveness of pruning
 Iterative deepening helps with this

 With "perfect ordering":
 Time complexity drops to O(b^(m/2))
 Doubles solvable depth!
 1M nodes/move => depth = 8, respectable

[Figure: MAX node over MIN nodes with leaves 10, 10, 0]

 This is a simple example of metareasoning (computing about what to compute)
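To illustrate why ordering matters, a hedged variant of the alpha-beta sketch above that sorts children before recursing; order_key is a stand-in for a real move-ordering heuristic (e.g., a cheap evaluation):

import math

def ordered_alphabeta(s, alpha, beta, is_max, order_key):
    # Alpha-beta with child ordering, written with the equivalent α ≥ β cutoff.
    if isinstance(s, (int, float)):
        return s
    # Try likely-best children first: best-first for MAX, worst-first for MIN.
    children = sorted(s, key=order_key, reverse=is_max)
    v = -math.inf if is_max else math.inf
    for child in children:
        cv = ordered_alphabeta(child, alpha, beta, not is_max, order_key)
        if is_max:
            v = max(v, cv)
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            beta = min(beta, v)
        if alpha >= beta:                 # window closed: prune remaining children
            return v
    return v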


Advanced Techniques

 Transposition table to store previously expanded states
 Forward pruning to avoid considering all possible moves
 Lookup tables for opening moves and endgames
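A minimal sketch of the transposition-table idea, memoizing plain minimax values on the state itself (real engines also store alpha-beta bounds); tuples replace lists so states are hashable table keys:

from functools import lru_cache

@lru_cache(maxsize=None)
def tt_value(s, is_max):
    # Minimax with a transposition table: repeated states are computed once.
    if isinstance(s, (int, float)):
        return s
    vals = [tt_value(c, not is_max) for c in s]
    return max(vals) if is_max else min(vals)

tree = ((3, 12, 8), (2, 4, 6), (14, 5, 2))
print(tt_value(tree, True))   # 3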


Chess Systems

 Baseline system: 200 million node evaluations per move (3 min), minimax with a decent evaluation function and quiescence search
 5-ply ≈ human novice
 Add alpha-beta pruning
 10-ply ≈ typical PC, experienced player
 Deep Blue: 30 billion evaluations per move, singular extensions, evaluation function with 8000 features, large databases of opening and endgame moves
 14-ply ≈ Garry Kasparov
 More recent state of the art (Hydra, ca. 2006): 36 billion evaluations per second, advanced pruning techniques
 18-ply ≈ better than any human alive?
Summary

 Multi-agent problems can require more space or deeper trees to search
 Games require decisions when optimality is impossible
 Bounded-depth search and approximate evaluation functions
 Games force efficient use of computation
 Alpha-beta pruning
 Game playing has produced important research ideas
 Reinforcement learning (checkers)
 Iterative deepening (chess)
 Rational metareasoning (Othello)
 Monte Carlo tree search (Go)
 Solution methods for partial-information games in economics (poker)
 Video games present much greater challenges – lots to do!
 b = 10^500, |S| = 10^4000, m = 10,000


References

 Slides adapted from CMU and UC Berkeley course materials
 Artificial Intelligence: A Modern Approach, Chapter 6