Extrapolation Methods for Accelerating PageRank Computations
Sepandar D. Kamvar
Taher H. Haveliwala
Christopher D. Manning
Gene H. Golub
Stanford University
Motivation
Problem: Speed up PageRank
Motivation: Personalization, “Freshness”
[Figure: two result lists for the search "Giants". One is headed by "The Official Site of the New York Giants", the other by "The Official Site of the San Francisco Giants".]
Outline
Definition of PageRank
Convergence Properties
Empirical Results
Link Counts
Taher’s Home Page: linked by 2 unimportant pages
Sep’s Home Page: linked by 2 important pages
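Definition of PageRank
The importance of a page is given by the
importance of the pages that link to it.
$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$
Here $x_i$ is the importance of page $i$, $x_j$ is the importance of page $j$, $B_i$ is the set of pages that link to page $i$, and $N_j$ is the number of out-links of page $j$.
[Figure: example link graph with link weights 1/2, 1/2, 1, 1]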
PageRank Diagram
Initialize all nodes to rank $x_i^{(0)} = \frac{1}{n}$
[Figure: three-node graph, each node with rank 0.333]
PageRank Diagram
$x_i^{(1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(0)}$
[Figure: three-node graph with the values 0.167, 0.333, and 0.5 shown on nodes and links]
PageRank Diagram
$x_i^{(2)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(1)}$
[Figure: three-node graph with ranks 0.333, 0.5, 0.167 and link values 0.167, 0.5, 0.167, 0.167]
PageRank Diagram
After a while…
$x_i = \sum_{j \in B_i} \frac{1}{N_j} x_j$
[Figure: three-node graph with ranks 0.4, 0.4, 0.2]
Computing PageRank
Initialize: $x_i^{(0)} = \frac{1}{n}$
Repeat until convergence: $x_i^{(k+1)} = \sum_{j \in B_i} \frac{1}{N_j} x_j^{(k)}$
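The per-page update can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the three-page link graph `links`, the function name `pagerank`, and the stopping tolerance are assumptions made for this example, and the graph was chosen so that the result lands on the 0.4, 0.4, 0.2 ranks shown in the diagram slides. The sketch also assumes every page has at least one out-link.

```python
def pagerank(out_links, tol=1e-12, max_iter=1000):
    """Iterate x_i <- sum over pages j linking to i of x_j / N_j.

    out_links maps each page to the list of pages it links to
    (hypothetical input format; every page needs at least one out-link).
    """
    pages = sorted(out_links)
    n = len(pages)
    x = {p: 1.0 / n for p in pages}          # x_i^(0) = 1/n
    for _ in range(max_iter):
        new_x = {p: 0.0 for p in pages}
        # Pushing x_j / N_j along each out-link of j is equivalent to
        # summing over the in-links B_i of every page i.
        for j, targets in out_links.items():
            share = x[j] / len(targets)
            for i in targets:
                new_x[i] += share
        delta = sum(abs(new_x[p] - x[p]) for p in pages)
        x = new_x
        if delta < tol:                       # L1 change between iterates
            break
    return x


# Hypothetical graph: A links to B and C, B links to A, C links to B.
links = {"A": ["B", "C"], "B": ["A"], "C": ["B"]}
ranks = pagerank(links)
```

On this graph the iteration settles at roughly 0.4, 0.4, 0.2 for pages A, B, and C.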
[Figure: the update written as a matrix-vector product, $x = P^T x$. Entry $i$ of $x$ is the importance of page $i$; it is computed from one row of $P^T$ and the importances of the pages $j$.]
Matrix Notation
Find x that satisfies:
$x = P^T x$
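As a worked check of this fixed-point equation, take a hypothetical three-page graph (an assumption made for illustration, not a graph specified in the slides) with links A→B, A→C, B→A, C→B and pages ordered (A, B, C). Row $j$ of $P$ holds $1/N_j$ for each page that $j$ links to, so:

```latex
P = \begin{pmatrix} 0 & 1/2 & 1/2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
P^T = \begin{pmatrix} 0 & 1 & 0 \\ 1/2 & 0 & 1 \\ 1/2 & 0 & 0 \end{pmatrix},
\qquad
P^T \begin{pmatrix} 0.4 \\ 0.4 \\ 0.2 \end{pmatrix}
= \begin{pmatrix} 0.4 \\ 0.2 + 0.2 \\ 0.2 \end{pmatrix}
= \begin{pmatrix} 0.4 \\ 0.4 \\ 0.2 \end{pmatrix}
```

The vector $x = (0.4, 0.4, 0.2)^T$ is unchanged by $P^T$, so it satisfies $x = P^T x$ for this graph.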
Power Method
Initialize: $x^{(0)} = \left[\frac{1}{n}, \ldots, \frac{1}{n}\right]^T$
Repeat until convergence: $x^{(k+1)} = P^T x^{(k)}$
A side note
PageRank doesn’t actually use $P^T$.
Instead, it uses $A = cP^T + (1-c)E^T$.
Initialize: $x^{(0)} = \left[\frac{1}{n}, \ldots, \frac{1}{n}\right]^T$
Repeat until convergence: $x^{(k+1)} = A x^{(k)}$
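A minimal sketch of this modified iteration follows. The slides do not define $E$, so the code assumes the simplest choice: $E$ is the matrix with every entry equal to $1/n$, which makes $E^T x$ a uniform vector scaled by the sum of the entries of $x$. The names `damped_step` and `power_method`, the matrix `PT`, and the value c = 0.85 are all hypothetical and chosen only for the example.

```python
def damped_step(PT, x, c):
    """One step x <- A x with A = c*P^T + (1 - c)*E^T.

    Assumes E has every entry 1/n, so (E^T x)_i = sum(x) / n.
    """
    n = len(x)
    total = sum(x)
    return [
        c * sum(PT[i][j] * x[j] for j in range(n)) + (1.0 - c) * total / n
        for i in range(n)
    ]


def power_method(PT, c, tol=1e-12, max_iter=10000):
    """Repeat x <- A x from the uniform vector until the L1 change is below tol."""
    n = len(PT)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        new_x = damped_step(PT, x, c)
        if sum(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    return x


# P^T for a hypothetical 3-page graph: A->B, A->C, B->A, C->B.
PT = [[0.0, 1.0, 0.0],
      [0.5, 0.0, 1.0],
      [0.5, 0.0, 0.0]]

x = power_method(PT, c=0.85)
```

Setting `c=1.0` removes the $E^T$ term and reduces this to the plain power method on $P^T$.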
Outline
Definition of PageRank
Convergence Properties
Empirical Results
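Power Method
Express $x^{(0)}$ in terms of eigenvectors of A:
$x^{(0)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \alpha_4 u_4 + \alpha_5 u_5$
$x^{(1)} = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3 + \alpha_4 \lambda_4 u_4 + \alpha_5 \lambda_5 u_5$
$x^{(2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3 + \alpha_4 \lambda_4^2 u_4 + \alpha_5 \lambda_5^2 u_5$
$x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \alpha_3 \lambda_3^k u_3 + \alpha_4 \lambda_4^k u_4 + \alpha_5 \lambda_5^k u_5$
$x^{(\infty)} = u_1$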
Why does it work?
Imagine our n x n matrix A has n distinct eigenvectors $u_i$:
$A u_i = \lambda_i u_i$
[Figure: eigenvectors $u_1, \ldots, u_5$]
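Why does it work?
From the last slide: $x^{(0)} = u_1 + \alpha_2 u_2 + \ldots + \alpha_n u_n$
Since $A u_i = \lambda_i u_i$ and $\lambda_1 = 1$, each multiplication by A scales the $u_i$ component by $\lambda_i$:
$x^{(1)} = A x^{(0)} = u_1 + \alpha_2 \lambda_2 u_2 + \ldots + \alpha_n \lambda_n u_n$
$x^{(2)} = A x^{(1)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \ldots + \alpha_n \lambda_n^2 u_n$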
Convergence
$x^{(k)} = u_1 + \alpha_2 \lambda_2^k u_2 + \ldots + \alpha_n \lambda_n^k u_n$
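The decay of the non-principal components can be observed directly on a tiny example. The sketch below uses a hypothetical 2 x 2 column-stochastic matrix (not from the slides) whose eigenvalues are 1 and 0.7, with $u_1 = (2/3, 1/3)$. The distance between the iterate and $u_1$ should then shrink by a factor of about $\lambda_2 = 0.7$ at every step.

```python
# Hypothetical column-stochastic matrix with eigenvalues 1 and 0.7.
A = [[0.9, 0.2],
     [0.1, 0.8]]
u1 = [2.0 / 3.0, 1.0 / 3.0]   # principal eigenvector, entries sum to 1


def step(A, x):
    """Return the matrix-vector product A x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def l1_error(x):
    """L1 distance between x and the principal eigenvector u1."""
    return sum(abs(a - b) for a, b in zip(x, u1))


x = [0.5, 0.5]                 # uniform starting vector
errors = [l1_error(x)]
for _ in range(10):
    x = step(A, x)
    errors.append(l1_error(x))

# Ratio of successive errors; each should be close to lambda_2 = 0.7.
ratios = [errors[k + 1] / errors[k] for k in range(10)]
```

The closer $\lambda_2$ is to 1, the closer these ratios are to 1 and the more iterations the power method needs.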
Our Approach
Estimate components of current iterate in the directions
of the second and third eigenvectors, and eliminate them.
Why this approach?
For traditional problems:
A is smaller, often dense.
$\lambda_2$ is often close to $\lambda_1$, making the power method slow.
In our problem,
A is huge and sparse.
More importantly, $\lambda_2$ is small [1].
Therefore, the Power Method is actually much
faster than other methods.
Using Successive Iterates
[Figure: successive iterates $x^{(0)}$, $x^{(1)}$, $x^{(2)}$ shown relative to $u_1$; extrapolating from them gives $x' = u_1$.]
How do we do this?
Assume x(k) can be written as a linear
combination of the first three eigenvectors
(u1, u2, u3) of A.
Compute approximation to {u2,u3}, and
subtract it from x(k) to get x(k)’
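Assume
Assume that $x^{(k)}$ can be represented by the first 3 eigenvectors of A:
$x^{(k)} = u_1 + \alpha_2 u_2 + \alpha_3 u_3$
$x^{(k+1)} = A x^{(k)} = u_1 + \alpha_2 \lambda_2 u_2 + \alpha_3 \lambda_3 u_3$
$x^{(k+2)} = u_1 + \alpha_2 \lambda_2^2 u_2 + \alpha_3 \lambda_3^2 u_3$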
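Rearranging Terms
We can rearrange the terms to get:
$\beta_1 x^{(k+1)} + \beta_2 x^{(k+2)} + \beta_3 x^{(k+3)}$
$= (\beta_1 + \beta_2 + \beta_3) u_1$
$+ \alpha_2 (\beta_1 \lambda_2 + \beta_2 \lambda_2^2 + \beta_3 \lambda_2^3) u_2$
$+ \alpha_3 (\beta_1 \lambda_3 + \beta_2 \lambda_3^2 + \beta_3 \lambda_3^3) u_3$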
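To see that a combination of three iterates can isolate $u_1$, the sketch below picks the weights so that the $u_2$ and $u_3$ coefficients vanish and the $u_1$ coefficient equals 1. It is only an illustration of the cancellation, not the Quadratic Extrapolation algorithm itself: it assumes $\lambda_2$ and $\lambda_3$ are known exactly, whereas in practice they are unknown and would have to be estimated from the iterates. Vectors are stored as coordinates in the eigenvector basis $(u_1, u_2, u_3)$, and the values of `lam2`, `lam3`, `alpha2`, `alpha3` are hypothetical.

```python
# Hypothetical eigenvalues and coefficients, assumed known for illustration.
lam2, lam3 = 0.9, 0.8
alpha2, alpha3 = 0.3, -0.2
eigenvalues = [1.0, lam2, lam3]


def apply_A(coords):
    """Multiply by A in eigenvector coordinates: scale component i by lambda_i."""
    return [lam * c for lam, c in zip(eigenvalues, coords)]


x_k = [1.0, alpha2, alpha3]        # x^(k) = u1 + alpha2*u2 + alpha3*u3
x_k1 = apply_A(x_k)                # x^(k+1)
x_k2 = apply_A(x_k1)               # x^(k+2)
x_k3 = apply_A(x_k2)               # x^(k+3)

# Choose beta so that beta1 + beta2*lam + beta3*lam**2 is zero at lam2 and
# lam3 and equals 1 at lam = 1, i.e. (lam - lam2)(lam - lam3) / denom.
denom = (1.0 - lam2) * (1.0 - lam3)
beta1 = lam2 * lam3 / denom
beta2 = -(lam2 + lam3) / denom
beta3 = 1.0 / denom

# beta1*x^(k+1) + beta2*x^(k+2) + beta3*x^(k+3), still in eigen-coordinates.
combo = [beta1 * a + beta2 * b + beta3 * c
         for a, b, c in zip(x_k1, x_k2, x_k3)]
```

Up to rounding, `combo` has coordinates (1, 0, 0), which is $u_1$ alone.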
Outline
Definition of PageRank
Convergence Properties
Empirical Results
Results
Quadratic Extrapolation speeds up convergence.
Extrapolation was only used 5 times!
Results
Extrapolation dramatically speeds up convergence
for high values of c (c = 0.99).
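A rough back-of-the-envelope calculation suggests why high values of c are the hard case for the plain power method. The error shrinks roughly like $|\lambda_2|^k$, and for $A = cP^T + (1-c)E^T$ with the usual rank-one $E$ the second eigenvalue is at most c in magnitude. The sketch below assumes the worst case $|\lambda_2| = c$; that is an assumption for illustration, not a measurement from these experiments, and the function name `iterations_needed` and the target reduction of 1e-8 are hypothetical.

```python
import math


def iterations_needed(c, reduction=1e-8):
    """Smallest k with c**k <= reduction, assuming the error decays like c**k."""
    return math.ceil(math.log(reduction) / math.log(c))


k_085 = iterations_needed(0.85)
k_099 = iterations_needed(0.99)
```

Under this assumption, c = 0.99 needs more than ten times as many iterations as c = 0.85 to reach the same error reduction.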
Take-home message
Speeds up PageRank by a fair amount,
but not by enough for true Personalized
PageRank.
Ideas are useful for further speedup
algorithms.
Quadratic Extrapolation can be used for a
whole class of problems.
The End
Paper available at
http://dbpubs.stanford.edu/pub/2003-16