Unit 4 JWFILES PDF

www.StudentRockStars.
com
om
Game playing
.c
rs
Unit 4
ta
kS
oc
R
nt
de
tu
.S
w
w
w
www.StudentRockStars.com
Outline
♦ Games
♦ Perfect play
om
– minimax decisions
.c
– α–β pruning
rs
ta
♦ Resource limits and approximate evaluation
kS
oc
♦ Games of chance
R
nt
♦ Games of imperfect information
tu
de
.S
w
w
w
Chapter 6 2
Games vs. search problems

“Unpredictable” opponent ⇒ solution is a strategy
specifying a move for every possible opponent reply
om
Time limits ⇒ unlikely to find goal, must approximate
.c
rs
Plan of attack: www.StudentRockStars.com
ta
kS
• Computer considers possible lines of play (Babbage, 1846)
oc
• Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
R
nt
• Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948;
de
Shannon, 1950)
tu
.S
• First chess program (Turing, 1951)

w
w
• Machine learning to improve evaluation accuracy (Samuel, 1952–57)

w
• Pruning to allow deeper search (McCarthy, 1956)
Chapter 6 3
Types of games
deterministic chance
perfect information chess, checkers, backgammon
om
go, othello monopoly
.c
rs
imperfect information battleships, bridge, poker, scrabble
ta
blind tictactoe nuclear war
kS
oc
R
nt
de
tu
.S
w
w
w
Chapter 6 4
Game tree (2-player, deterministic, turns)

MAX (X)
om
X X X
MIN (O) X X X
.c
X X X
rs
ta
X O X O X ...
kS
MAX (X) O
oc
R
nt
X O X X O X O ...
MIN (O) X X de
tu
.S
... ... ... ...

w
w
...
w
X O X X O X X O X
TERMINAL O X O O X X
O X X O X O O
Utility −1 0 +1
Chapter 6 5
Minimax
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value
om
= best achievable payoff against best play
.c
E.g., 2-ply game:
rs
ta
MAX 3
kS
A1 A2 A3
oc
R
3 2 2
nt
MIN
de
tu
A 11 A 12 A 13 A 21 A 22 A 23 A 31 A 32 A 33
.S
w
w
3 12 8 2 4 6 14 5 2
w
Chapter 6 6
Minimax algorithm
function Minimax-Decision(state) returns an action

inputs: state, current state in game www.StudentRockStars.com
om
return the a in Actions(state) maximizing Min-Value(Result(a, state))
.c
rs
function Max-Value(state) returns a utility value
ta
if Terminal-Test(state) then return Utility(state)
kS
v ← −∞
oc
for a, s in Successors(state) do v ← Max(v, Min-Value(s))
R
return v
nt
de
function Min-Value(state) returns a utility value
tu
.S
v←∞
w
for a, s in Successors(state) do v ← Min(v, Max-Value(s))

w
w
return v
Chapter 6 7
Properties of minimax
Complete??
om
.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w
Chapter 6 8
Complete?? Only if tree is finite (chess has specific rules for this).
NB a finite strategy can exist even in an infinite tree!
om
Optimal??
.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w
Chapter 6 9
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
om
Time complexity??
.c
rs
ta
kS
oc
R
nt
de
tu
.S
w
w
w
Chapter 6 10
om
Time complexity?? O(bm)
.c
rs
Space complexity??
ta
kS
oc
R
nt
de
tu
.S
w
w
w
Chapter 6 11
om
Time complexity?? O(bm)
.c
rs
Space complexity?? O(bm) (depth-first exploration)
ta
kS
For chess, b ≈ 35, m ≈ 100 for “reasonable” games
oc
⇒ exact solution completely infeasible
R
nt
de
But do we need to explore every path?
tu
.S
w
w
w
Chapter 6 12
α –β pruning example
MAX 3
om
.c
rs
MIN 3
ta
kS
oc
R
nt
3 12 8
de
tu
.S
w
w
w
Chapter 6 13
MAX 3
om
.c
rs
MIN 3 2
ta
kS
oc
R
X X
nt
3 12 8
de
tu 2
.S
w
w
w
Chapter 6 14
MAX 3
om
.c
rs
MIN 3 2 14
ta
kS
oc
R
X X
nt
3 12 8
de
tu 2 14
.S
w
w
w
Chapter 6 15
MAX 3
om
.c
2 14 5
rs
MIN 3
ta
kS
oc
R
X X
nt
3 12 8 2 14 5
de
tu
.S
w
w
w
Chapter 6 16
MAX 3 3
om
.c
rs
MIN 3 2 14 5 2
ta
kS
oc
R
X X
nt
3 12 tu 8
de 2 14 5 2
.S
w
w
w
Chapter 6 17
Why is it called α–β ?
MAX
om
.c
MIN
rs
ta
..
kS
..
..
oc
MAX
R
nt
MIN
de V
tu
.S
α is the best value (to max) found so far off the current path
w
w
w
If V is worse than α, max will avoid it ⇒ prune that branch

om
Define β similarly for min www.StudentRockStars.c
Chapter 6 18
The α–β algorithm

function Alpha-Beta-Decision(state) returns an action
return the a in Actions(state) maximizing Min-Value(Result(a, state))
om
function Max-Value(state, α, β) returns a utility value
inputs: state, current state in game
.c
rs
α, the value of the best alternative for max along the path to state
ta
β, the value of the best alternative for min along the path to state
kS
oc
v ← −∞
R
for a, s in Successors(state) do
nt
v ← Max(v, Min-Value(s, α, β))
de
if v ≥ β then return v
tu
α ← Max(α, v)
.S
w
return v
w
w
function Min-Value(state, α, β) returns a utility value

same as Max-Value but with roles of α, β reversed
Chapter 6 19
Properties of α–β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning
om
With “perfect ordering,” time complexity = O(bm/2)
.c
rs
⇒ doubles solvable depth
ta
kS
A simple example of the value of reasoning about which computations are
oc
relevant (a form of metareasoning)
R
nt
Unfortunately, 3550 is still impossible!
de
tu
.S
w
w
w
Chapter 6 20
Resource limits
Standard approach:
• Use Cutoff-Test instead of Terminal-Test
om
e.g., depth limit (perhaps add quiescence search)
.c
• Use Eval instead of Utility
rs
ta
i.e., evaluation function that estimates desirability of position
kS
oc
Suppose we have 100 seconds, explore 104 nodes/second
R
⇒ 106 nodes per move ≈ 358/2
nt
de
⇒ α–β reaches depth 8 ⇒ pretty good chess program
tu
.S
w
w
w
Chapter 6 21
Evaluation functions
om
.c
rs
ta
kS
oc
R
nt
Black to move de White to move
tu
White slightly better Black winning
.S
w
For chess, typically linear weighted sum of features

w
w
Eval(s) = w1f1(s) + w2f2(s) + . . . + wnfn(s)

e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens), etc.
Chapter 6 22
Summary
Games are fun to work on! (and dangerous)
They illustrate several important points about AI
om
♦ perfection is unattainable ⇒ must approximate
.c
rs
♦ good idea to think about what to think about
ta
kS
♦ uncertainty constrains the assignment of values to states
oc
R
♦ optimal decisions depend on information state, not real state
nt
de
Games are to AI as grand prix racing is to automobile design
tu
.S
w
w
w
Chapter 6 38

Unit 4 JWFILES PDF

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Unit 4 JWFILES PDF

Uploaded by

Copyright:

Available Formats

www.StudentRockStars.

Games vs. search problems

• First chess program (Turing, 1951)

• Machine learning to improve evaluation accuracy (Samuel, 1952–57)

• Pruning to allow deeper search (McCarthy, 1956)

perfect information chess, checkers, backgammon

Game tree (2-player, deterministic, turns)

... ... ... ...

function Minimax-Decision(state) returns an action

for a, s in Successors(state) do v ← Min(v, Max-Value(s))

Why is it called α–β ?

If V is worse than α, max will avoid it ⇒ prune that branch

The α–β algorithm

function Min-Value(state, α, β) returns a utility value

• Use Cutoff-Test instead of Terminal-Test

For chess, typically linear weighted sum of features

Eval(s) = w1f1(s) + w2f2(s) + . . . + wnfn(s)

You might also like