Random Walks
[Figure: a three-state Markov chain (states W = Work, S = Surf, and E) shown at successive times 9:00 through 9:04; arrows are labeled with transition probabilities.]
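K =
$$\begin{array}{c|ccc} & W & S & E \\ \hline W & .4 & .6 & 0 \\ S & .1 & .6 & .3 \\ E & .5 & 0 & .5 \end{array}$$
Rows sum to 1 (stochastic matrix). $K[i,j]$ is the probability of stepping from state $i$ to state $j$.
$X_t$ denotes the state at time $t$, e.g., $X_3 = W$.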
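$$\Pr[X_2 = W \mid X_0 = S] = \Pr[X_1 = W \mid X_0 = S]\,\Pr[X_2 = W \mid X_1 = W, X_0 = S] + \Pr[X_1 = S \mid X_0 = S]\,\Pr[X_2 = W \mid X_1 = S, X_0 = S] + \Pr[X_1 = E \mid X_0 = S]\,\Pr[X_2 = W \mid X_1 = E, X_0 = S]$$
$$= K[S,W]\,K[W,W] + K[S,S]\,K[S,W] + K[S,E]\,K[E,W] = \sum_i K[S,i]\,K[i,W] = K^2[S,W]$$
With the numbers above: $.1 \cdot .4 + .6 \cdot .1 + .3 \cdot .5 = .25$.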
$X_0 \sim \pi_0$, e.g., $\pi_0 = (\pi_0[W], \pi_0[S], \pi_0[E]) = (.5, .2, .3)$: a distribution vector (nonnegative, adds to 1).
$\Pr[X_1 = W] = .5 \cdot .4 + .2 \cdot .1 + .3 \cdot .5 = .37$
I.e., the distribution vector for $X_1$ is $\pi_1 = \pi_0 K$. And the distribution vector for $X_t$ is $\pi_t = \pi_0 K^t$.
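As a minimal sketch of the update $\pi_t = \pi_0 K^t$, the plain-Python snippet below applies one step of the chain at a time. It assumes K is written with rows and columns ordered W, S, E so that each row sums to 1; the helper names `step` and `distribution_after` are illustrative and not from the slides.

```python
# Transition matrix of the Work/Surf/E chain; rows and columns ordered W, S, E.
# Assumption: row i holds the probabilities of leaving state i (rows sum to 1).
K = [
    [0.4, 0.6, 0.0],  # from W
    [0.1, 0.6, 0.3],  # from S
    [0.5, 0.0, 0.5],  # from E
]


def step(pi, K):
    """One step of the chain: return the row vector pi K."""
    n = len(K)
    return [sum(pi[i] * K[i][j] for i in range(n)) for j in range(n)]


def distribution_after(pi0, K, t):
    """Distribution of X_t, i.e. pi0 K^t, obtained by applying `step` t times."""
    pi = list(pi0)
    for _ in range(t):
        pi = step(pi, K)
    return pi


pi0 = [0.5, 0.2, 0.3]
pi1 = distribution_after(pi0, K, 1)
print(pi1)                              # first entry is Pr[X_1 = W]
print(distribution_after(pi0, K, 200))  # distribution after many steps
```

Repeated vector-matrix products avoid forming $K^t$ explicitly, which is enough when only one starting distribution is of interest.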
Recall: $K^t[i,j] = \Pr[i \to j \text{ in exactly } t \text{ steps}]$.
When $t$ is large, the distribution vector for $X_t$ hardly depends on the initial distribution $\pi_0$.
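An invariant distribution $\pi = (\pi[W], \pi[S], \pi[E])$ satisfies $\pi = \pi K$.
Fundamental Theorem
Given a finite, strongly connected Markov chain with transition matrix $K$, there is a unique invariant distribution $\pi$ satisfying $\pi = \pi K$.
For the chain above, $\pi = \pi K$ reads:
$$\pi[W] = .4\,\pi[W] + .1\,\pi[S] + .5\,\pi[E]$$
$$\pi[S] = .6\,\pi[W] + .6\,\pi[S] + 0\,\pi[E]$$
$$\pi[E] = 0\,\pi[W] + .3\,\pi[S] + .5\,\pi[E]$$
and you can add $\pi[W] + \pi[S] + \pi[E] = 1$.
Solution: $\pi[W] = 10/34$, $\pi[S] = 15/34$, $\pi[E] = 9/34$.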
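The linear system $\pi = \pi K$ plus the normalization $\sum_u \pi[u] = 1$ can be solved mechanically. The sketch below does so with exact fractions and Gauss-Jordan elimination; the function name `invariant_distribution` is hypothetical, and the code assumes the chain is strongly connected so that the solution is unique.

```python
from fractions import Fraction as F

# Rows and columns ordered W, S, E; each row sums to 1.
K = [
    [F(4, 10), F(6, 10), F(0)],
    [F(1, 10), F(6, 10), F(3, 10)],
    [F(5, 10), F(0), F(5, 10)],
]


def invariant_distribution(K):
    """Solve pi = pi K together with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(K)
    # Equation j (for j < n-1): sum_i pi[i] * (K[i][j] - [i == j]) = 0.
    # The n-th balance equation is redundant, so it is replaced by sum(pi) = 1.
    A = [[K[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        # Move a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


pi = invariant_distribution(K)
print(pi)
```

Exact fractions are used so the answer can be compared directly with a hand calculation rather than up to rounding.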
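Fundamental Theorem
$\pi$ is also the limiting row of $K^t$ as $t \to \infty$, unless the chain has some stupid periodicity, e.g., two states 1 and 2 that each move to the other with probability 100%. Then there is no limiting distribution, but $\pi = (1/2, 1/2)$ is still invariant.
If you walked for $N$ steps, you would expect to be at state $u$ about $\pi[u] \cdot N$ times.
The average time between successive visits to $u$ would be about $1/\pi[u]$.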
Interlude: PageRank
1997: Web search was horrible. You search for "CMU"; it finds all the pages containing "CMU" and sorts them by number of occurrences.
$20 billionaires
Nevanlinna Prize
PageRank: compute the invariant distribution $\pi$ of a random walk on the web graph, and rank pages $u$ by highest $\pi[u]$ value!
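The idea can be sketched on a toy example. The code below runs the random-surfer chain on a made-up three-page link graph and ranks pages by their approximate invariant probability. It is only an illustration of the slide's statement: the graph, the name `pagerank`, and the step count are assumptions, and it omits refinements a real search engine would need (for instance handling pages with no outgoing links).

```python
def pagerank(links, steps=100):
    """Approximate the invariant distribution of the random-surfer chain.

    `links[p]` lists the pages that p links to; from p the surfer follows
    one of those links uniformly at random. Assumes every page has at least
    one outgoing link and the chain is strongly connected and aperiodic.
    """
    pages = sorted(links)
    pi = {p: 1.0 / len(pages) for p in pages}  # start from the uniform vector
    for _ in range(steps):
        nxt = {p: 0.0 for p in pages}
        for p in pages:
            share = pi[p] / len(links[p])  # mass sent along each out-link
            for q in links[p]:
                nxt[q] += share
        pi = nxt
    return pi


# Hypothetical link structure, chosen only for illustration.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A", "B"]}
pi = pagerank(links)
ranking = sorted(pi, key=pi.get, reverse=True)
print(ranking, pi)
```

For this toy graph the invariant distribution works out by hand to (4/9, 3/9, 2/9) for (A, B, C), so A is ranked first.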
[Figure: example random walks with transition probabilities marked on the edges (.6, 1/2, 1/3); one example is labeled "symmetric".]
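Higher degree $\Rightarrow$ higher limiting prob? Could $\pi[u]$ just be proportional to degree $d_u$?
Theorem:
In random walk on undirected graph G with $m$ edges, the inv. distribution is $\pi[u] = d_u / 2m$.
Proof:
$\pi$ is a distribution vector because $\sum_i d_i = 2m$. It is invariant because, for every vertex $v$,
$$(\pi K)[v] = \sum_{u \sim v} \pi[u] \cdot \frac{1}{d_u} = \sum_{u \sim v} \frac{1}{2m} = \frac{d_v}{2m} = \pi[v].$$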
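The theorem is easy to check numerically on a small graph. The sketch below builds $\pi[u] = d_u/2m$ for a hypothetical 4-vertex graph (a triangle with one pendant vertex) and confirms that one step of the walk leaves it unchanged; the function names are illustrative.

```python
from fractions import Fraction


def degree_distribution(adj):
    """pi[u] = d_u / 2m, where 2m is the sum of all degrees."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return {u: Fraction(len(nbrs), two_m) for u, nbrs in adj.items()}


def one_step(pi, adj):
    """Apply one step of the walk: each vertex splits its mass evenly
    among its neighbors."""
    nxt = {u: Fraction(0) for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            nxt[v] += pi[u] / len(nbrs)
    return nxt


# Hypothetical example graph: triangle 1-2-3 plus the extra edge 3-4 (m = 4).
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
pi = degree_distribution(adj)
print(pi)
print(one_step(pi, adj) == pi)  # invariance check
```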
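Corollary:
In random walk on undirected (connected) graph G, $M_{vv} = E[\#\text{ steps to return to } v \text{ starting from } v] = 2m/d_v$.
Proof:
Mean first recurrence theorem: $M_{vv} = 1/\pi[v] = 2m/d_v$.
Examples
The path on 3 nodes ($m = 2$):
$\pi$: 1/4, 1/2, 1/4
$M_{vv}$: 4, 2, 4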
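Examples
$P_{n+1}$, the path on $n+1$ nodes ($m = n$):
$\pi$: $\frac{1}{2n}, \frac{1}{n}, \ldots, \frac{1}{n}, \frac{1}{2n}$
$M_{vv}$: $2n, n, \ldots, n, 2n$
The clique on $n$ nodes:
$\pi[v] = 1/n$, so $M_{vv} = n$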
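The return-time formula $M_{vv} = 2m/d_v$ can be compared against simulation. The sketch below estimates the mean return time by running many random walks on a small path and a small clique; the graph builders, trial count, and seed are arbitrary choices for illustration.

```python
import random


def path(n_edges):
    """Path on n_edges + 1 nodes, labeled 0..n_edges."""
    adj = {i: [] for i in range(n_edges + 1)}
    for i in range(n_edges):
        adj[i].append(i + 1)
        adj[i + 1].append(i)
    return adj


def clique(n):
    """Complete graph on nodes 0..n-1."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def predicted_return_time(adj, v):
    """The corollary's value 2m / d_v."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    return two_m / len(adj[v])


def simulated_return_time(adj, v, trials=20000, seed=0):
    """Average number of steps for a walk started at v to come back to v."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        z = rng.choice(adj[v])
        steps = 1
        while z != v:
            z = rng.choice(adj[z])
            steps += 1
        total += steps
    return total / trials


for name, g, v in [("path endpoint", path(4), 0),
                   ("path interior", path(4), 2),
                   ("clique", clique(5), 0)]:
    print(name, predicted_return_time(g, v), simulated_return_time(g, v))
```

The simulated averages should land close to the predicted values, with the usual sampling noise.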
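Examples
The lollipop on $n$ nodes ($n/2$ path, $n/2$ clique):
$M_{vv} \approx n^2/8$ for path vertices, $M_{vv} \approx n/2$ for clique vertices.
Proposition:
Let $(u_0, v_0)$ be an edge in G. Then $M_{u_0 v_0} = E[\#\text{ steps to hit } v_0 \text{ starting from } u_0] \le 2m - 1 \le 2m$.
Proof:
Let $u_0, u_1, \ldots, u_{d-1}$ be the neighbors of $v_0$, where $d = d_{v_0}$. Conditioning on the first step of a walk from $v_0$,
$$M_{v_0 v_0} = 1 + \frac{1}{d} \sum_{i} M_{u_i v_0}.$$
By the corollary, $M_{v_0 v_0} = 2m/d$, so $\sum_i M_{u_i v_0} = 2m - d$. Every term is nonnegative, hence $M_{u_0 v_0} \le 2m - d \le 2m - 1$.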
Theorem:
Let G be a connected graph. Let $u$ and $v$ be any two vertices. Then $M_{uv} = E[\#\text{ steps to hit } v \text{ starting from } u] \le 2mn \le n^3$.
Examples
$P_{n+1}$, the path on $n+1$ nodes:
$E[\#\text{ steps to hit } v \text{ starting from } u] \le 2mn = 2n(n+1) = O(n^2)$
You'll see (hmwk or recitation): it's indeed $\Theta(n^2)$.
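Proof:
Pick a path $u, w_1, w_2, \ldots, w_r, v$. At most $n$ nodes.
$$E[\#\text{ steps to go } u \to v] \le E[\#\text{ steps to go } u \to w_1 \to w_2 \to \cdots \to w_r \to v]$$
$$= E[\# u \to w_1] + E[\# w_1 \to w_2] + \cdots + E[\# w_r \to v] \le 2m + 2m + \cdots + 2m \le 2mn.$$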
Examples
The clique on $n$ nodes:
Thm: $E[\#\text{ steps to hit } v \text{ starting from } u] \le 2mn \approx n^3$
Examples
The lollipop on $n$ nodes:
Thm: $E[\#\text{ steps to hit } v \text{ starting from } u] \le n^3$
Actually: the expectation really is $\Theta(n^3)$!
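For small graphs the hitting times can be computed exactly and compared with the $2mn$ bound. The sketch below solves the first-step equations $h[w] = 1 + \frac{1}{d_w}\sum_{x \sim w} h[x]$ (with $h[v] = 0$ at the target) using exact fractions. The builders `path` and `lollipop` and the choice of start and target vertices are assumptions made for illustration; in the lollipop, the start is placed in the clique and the target at the far end of the path.

```python
from fractions import Fraction


def hitting_times(adj, target):
    """Exact E[# steps to hit `target`] from every vertex (Gauss-Jordan)."""
    nodes = [w for w in adj if w != target]
    idx = {w: i for i, w in enumerate(nodes)}
    k = len(nodes)
    # Row for w encodes: h[w] - (1/d_w) * sum_{x ~ w, x != target} h[x] = 1.
    A = [[Fraction(0)] * k + [Fraction(1)] for _ in range(k)]
    for w in nodes:
        A[idx[w]][idx[w]] += 1
        for x in adj[w]:
            if x != target:
                A[idx[w]][idx[x]] -= Fraction(1, len(adj[w]))
    for c in range(k):
        p = next(r for r in range(c, k) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for r in range(k):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    h = {w: A[idx[w]][k] for w in nodes}
    h[target] = Fraction(0)
    return h


def path(n_edges):
    """Path on n_edges + 1 nodes, labeled 0..n_edges."""
    adj = {i: [] for i in range(n_edges + 1)}
    for i in range(n_edges):
        adj[i].append(i + 1)
        adj[i + 1].append(i)
    return adj


def lollipop(k):
    """Clique on nodes 0..k-1 with a path k-1, k, ..., 2k-1 hanging off it."""
    adj = {i: [j for j in range(k) if j != i] for i in range(k)}
    for i in range(k - 1, 2 * k - 1):
        adj.setdefault(i + 1, [])
        adj[i].append(i + 1)
        adj[i + 1].append(i)
    return adj


def bound(adj):
    """The theorem's bound 2mn."""
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m * len(adj)


for name, g, u, v in [("path", path(5), 0, 5), ("lollipop", lollipop(3), 0, 5)]:
    print(name, hitting_times(g, v)[u], "bound:", bound(g))
```

Growing the sizes passed to `path` and `lollipop` gives a feel for how far the exact values sit below the bound in each family.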
An application
CONN problem:
Given graph G, possibly disconnected, and two vertices $u$ and $v$. YES/NO: are $u$ and $v$ connected?
Easily solved in $O(m)$ time using DFS/BFS. Requires marking nodes, hence $n$ bits of memory need to be allocated.
Difficulty:
Do it without allocating any memory. (Assume input is read-only.) You can only use a constant number of integer variables. You can't even keep track of where you've been!
    z := u
    for t = 1 ... 1000n^3
        z := random-neighbor(z)
        if z = v, return YES
    end for
    return NO
Memory: z is one variable. The counter t takes four variables, as nested loops (for t0 = 1...1000, for t1 = 1...n, for t2 = 1...n, for t3 = 1...n). random-neighbor needs a couple more variables.
True answer is NO: alg. always says NO.
True answer is YES: alg. says YES w/ prob. $\ge$ 99.9%.
Why?
Suppose $u$ and $v$ are indeed in the same connected component. Say we do a random walk from $u$ until we hit $v$. Let $T$ = # steps it takes, a random variable. $E[T] \le n^3$, by our theorem. $\Pr[T > 1000n^3] < 1/1000$ by Markov's Inequality.
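A direct Python rendering of the random-walk test is sketched below. It assumes the graph is given as an adjacency-list dictionary that is only read, never modified; the function name `connected` and the early checks for `u == v` and isolated vertices are additions not spelled out in the pseudocode.

```python
import random


def connected(adj, u, v, rng=random):
    """Random-walk test: True means 'YES, u and v are connected'.

    A True answer is always correct. A False answer is wrong with
    probability below 1/1000 when u and v really are connected.
    """
    n = len(adj)
    z = u
    if z == v:
        return True  # trivial case, not spelled out in the pseudocode
    for _ in range(1000 * n ** 3):
        if not adj[z]:
            return False  # isolated vertex: the walk cannot move
        z = rng.choice(adj[z])  # z := random-neighbor(z)
        if z == v:
            return True
    return False


# Hypothetical input: a path 0-1-2 and a separate edge 3-4.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
rng = random.Random(0)
print(connected(adj, 0, 2, rng))  # same component
print(connected(adj, 0, 4, rng))  # different components
```

Apart from the read-only input, the only state is the current vertex `z` and the loop counter, which mirrors the constant-number-of-variables requirement.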
For 25 years, this was one of the most famous examples of a problem with a known randomized solution, but no known deterministic solution. In 2004, Omer Reingold gave a deterministic solution! You can escape a labyrinth using O(1) memory and no random coins!
Study Guide
Definitions: Markov chains; transition matrix; distribution vectors; invariant distribution
Theorems: fundamental theorem; mean first recurrence; inv. dist. in undir. graphs; $2mn$ bound for $M_{uv}$
Skills: finding inv. distribs; analyzing rand. walks