You are on page 1of 23

www.StudentRockStars.

com

om
Game playing

.c
rs
Unit 4

ta
kS
oc
R
nt
de
tu
.S
w
w
w

www.StudentRockStars.com
www.StudentRockStars.com

Outline
♦ Games
♦ Perfect play

om
– minimax decisions

.c
– α–β pruning

rs
ta
♦ Resource limits and approximate evaluation

kS
oc
♦ Games of chance

R
nt
♦ Games of imperfect information
tu
de
.S
w
w
w

Chapter 6 2

www.StudentRockStars.com
www.StudentRockStars.com

Games vs. search problems


“Unpredictable” opponent ⇒ solution is a strategy
specifying a move for every possible opponent reply

om
Time limits ⇒ unlikely to find goal, must approximate

.c
rs
Plan of attack: www.StudentRockStars.com

ta
kS
• Computer considers possible lines of play (Babbage, 1846)

oc
• Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)

R
nt
• Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948;
de
Shannon, 1950)
tu
.S

• First chess program (Turing, 1951)


w
w

• Machine learning to improve evaluation accuracy (Samuel, 1952–57)


w

• Pruning to allow deeper search (McCarthy, 1956)

Chapter 6 3

www.StudentRockStars.com
www.StudentRockStars.com

Types of games

deterministic chance

perfect information chess, checkers, backgammon

om
go, othello monopoly

.c
rs
imperfect information battleships, bridge, poker, scrabble

ta
blind tictactoe nuclear war

kS
oc
www.StudentRockStars.com

R
nt
de
tu
.S
w
w
w

Chapter 6 4

www.StudentRockStars.com
www.StudentRockStars.com

Game tree (2-player, deterministic, turns)


MAX (X)

om
X X X
MIN (O) X X X

.c
X X X

rs
ta
X O X O X ...

kS
MAX (X) O

oc
R
nt
X O X X O X O ...
MIN (O) X X de
tu
.S

... ... ... ...


w
w

...
w

X O X X O X X O X
TERMINAL O X O O X X
O X X O X O O
Utility −1 0 +1

Chapter 6 5

www.StudentRockStars.com
www.StudentRockStars.com

Minimax
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value

om
= best achievable payoff against best play

.c
E.g., 2-ply game:

rs
ta
MAX 3

kS
A1 A2 A3

oc
R
3 2 2

nt
MIN
de
tu
A 11 A 12 A 13 A 21 A 22 A 23 A 31 A 32 A 33
.S
w
w

3 12 8 2 4 6 14 5 2
w

Chapter 6 6

www.StudentRockStars.com
www.StudentRockStars.com

Minimax algorithm

function Minimax-Decision(state) returns an action


inputs: state, current state in game www.StudentRockStars.com

om
return the a in Actions(state) maximizing Min-Value(Result(a, state))

.c
rs
function Max-Value(state) returns a utility value

ta
if Terminal-Test(state) then return Utility(state)

kS
v ← −∞

oc
for a, s in Successors(state) do v ← Max(v, Min-Value(s))

R
return v

nt
de
function Min-Value(state) returns a utility value
tu
if Terminal-Test(state) then return Utility(state)
.S

v←∞
w

for a, s in Successors(state) do v ← Min(v, Max-Value(s))


w
w

return v

Chapter 6 7

www.StudentRockStars.com
www.StudentRockStars.com

Properties of minimax
Complete??

om
.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w

Chapter 6 8

www.StudentRockStars.com
www.StudentRockStars.com

Properties of minimax
Complete?? Only if tree is finite (chess has specific rules for this).
NB a finite strategy can exist even in an infinite tree!

om
Optimal??

.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w

Chapter 6 9

www.StudentRockStars.com
www.StudentRockStars.com

Properties of minimax
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??

om
Time complexity??

.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w

Chapter 6 10

www.StudentRockStars.com
www.StudentRockStars.com

Properties of minimax
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??

om
Time complexity?? O(bm)

.c
rs
Space complexity??

ta
kS
oc
R
nt
de
tu
.S
w
w
w

Chapter 6 11

www.StudentRockStars.com
www.StudentRockStars.com

Properties of minimax
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??

om
Time complexity?? O(bm)

.c
www.StudentRockStars.com

rs
Space complexity?? O(bm) (depth-first exploration)

ta
kS
For chess, b ≈ 35, m ≈ 100 for “reasonable” games

oc
⇒ exact solution completely infeasible

R
nt
de
But do we need to explore every path?
tu
.S
w
w
w

Chapter 6 12

www.StudentRockStars.com
www.StudentRockStars.com

α –β pruning example

MAX 3

om
.c
rs
MIN 3

ta
kS
oc
R
nt
3 12 8
de
tu
.S
w
w
w

Chapter 6 13

www.StudentRockStars.com
www.StudentRockStars.com

α –β pruning example

MAX 3

om
.c
rs
MIN 3 2

ta
kS
oc
R
X X

nt
3 12 8
de
tu 2
.S
w
w
w

Chapter 6 14

www.StudentRockStars.com
www.StudentRockStars.com

α –β pruning example

MAX 3

om
.c
rs
MIN 3 2 14

ta
kS
oc
R
X X

nt
3 12 8
de
tu 2 14
.S
w
w
w

Chapter 6 15

www.StudentRockStars.com
www.StudentRockStars.com

α –β pruning example

MAX 3

om
.c
2 14 5

rs
MIN 3

ta
kS
oc
R
X X

nt
3 12 8 2 14 5
de
tu
.S
w
w
w

Chapter 6 16

www.StudentRockStars.com
www.StudentRockStars.com

α –β pruning example

MAX 3 3

om
.c
rs
MIN 3 2 14 5 2

ta
kS
oc
R
X X

nt
3 12 tu 8
de 2 14 5 2
.S
w
w
w

Chapter 6 17

www.StudentRockStars.com
www.StudentRockStars.com

Why is it called α–β ?

MAX

om
.c
MIN

rs
ta
..

kS
..
..

oc
MAX

R
nt
MIN
de V
tu
.S

α is the best value (to max) found so far off the current path
w
w
w

If V is worse than α, max will avoid it ⇒ prune that branch


om
Define β similarly for min www.StudentRockStars.c

Chapter 6 18

www.StudentRockStars.com
www.StudentRockStars.com

The α–β algorithm


function Alpha-Beta-Decision(state) returns an action
return the a in Actions(state) maximizing Min-Value(Result(a, state))

om
function Max-Value(state, α, β) returns a utility value
inputs: state, current state in game

.c
rs
α, the value of the best alternative for max along the path to state

ta
β, the value of the best alternative for min along the path to state

kS
if Terminal-Test(state) then return Utility(state)

oc
v ← −∞

R
for a, s in Successors(state) do

nt
v ← Max(v, Min-Value(s, α, β))
de
if v ≥ β then return v
tu

α ← Max(α, v)
.S
w

return v
w
w

function Min-Value(state, α, β) returns a utility value


same as Max-Value but with roles of α, β reversed

Chapter 6 19

www.StudentRockStars.com
www.StudentRockStars.com

Properties of α–β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning

om
With “perfect ordering,” time complexity = O(bm/2)

.c
rs
⇒ doubles solvable depth

ta
kS
A simple example of the value of reasoning about which computations are

oc
relevant (a form of metareasoning)

R
nt
Unfortunately, 3550 is still impossible!
de
tu
.S
w
w
w

Chapter 6 20

www.StudentRockStars.com
www.StudentRockStars.com

Resource limits
Standard approach:

• Use Cutoff-Test instead of Terminal-Test

om
e.g., depth limit (perhaps add quiescence search)

.c
• Use Eval instead of Utility

rs
ta
i.e., evaluation function that estimates desirability of position

kS
oc
Suppose we have 100 seconds, explore 104 nodes/second

R
⇒ 106 nodes per move ≈ 358/2
nt
de
⇒ α–β reaches depth 8 ⇒ pretty good chess program
tu
.S
w
w
w

Chapter 6 21

www.StudentRockStars.com
www.StudentRockStars.com

Evaluation functions

om
.c
rs
ta
kS
oc
R
nt
Black to move de White to move
tu
White slightly better Black winning
.S
w

For chess, typically linear weighted sum of features


w
w

Eval(s) = w1f1(s) + w2f2(s) + . . . + wnfn(s)


e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens), etc.
Chapter 6 22

www.StudentRockStars.com
www.StudentRockStars.com

Summary
Games are fun to work on! (and dangerous)
They illustrate several important points about AI

om
♦ perfection is unattainable ⇒ must approximate

.c
rs
♦ good idea to think about what to think about

ta
kS
♦ uncertainty constrains the assignment of values to states

oc
R
♦ optimal decisions depend on information state, not real state

nt
de
Games are to AI as grand prix racing is to automobile design
tu
.S

www.StudentRockStars.com
w
w
w

Chapter 6 38

www.StudentRockStars.com

You might also like