Data Parallel Prog
Pig Latin
Hatway hriscay Olstonlay oesday otnay antway ouyay otay nowkay: Igpay atinlay islay ustjay estednay elationalray algebrajay, othingnay oremay.
PageRank
Input: weighted Web graph, given as a stochastic matrix M. Perform a random walk that follows edges probabilistically. PR(u): the long-run probability that the walk is at node u. This is the principal eigenvector of the Web graph's matrix, i.e., a fixed point p of the equation M*p = p.
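\[
\begin{pmatrix} 0 & 1 & 1 \\ 0.5 & 0 & 0 \\ 0.5 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
=
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
\]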
PageRank, ctd.
The eigenvector can be computed by starting with a random vector p and repeatedly multiplying it by M. The Web graph is a Markov chain, and some Markov chains have bad properties: if the chain is not ergodic, the iteration is not guaranteed to converge.
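The convergence problem can be seen on the 3-node matrix from the example above. The following is a minimal sketch in plain Python; the helper names `mat_vec` and `power_iterate` are made up for illustration. The example graph is periodic (x links only to y and z, which link only back to x), so repeated multiplication started from an arbitrary vector can oscillate instead of settling on the fixed point.

```python
def mat_vec(M, p):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(m_ij * p_j for m_ij, p_j in zip(row, p)) for row in M]


def power_iterate(M, p, steps):
    """Return the vectors p, M*p, M*M*p, ... (steps + 1 entries)."""
    history = [p]
    for _ in range(steps):
        p = mat_vec(M, p)
        history.append(p)
    return history


# Column-stochastic matrix of the 3-node example (node order: x, y, z).
M = [
    [0.0, 1.0, 1.0],
    [0.5, 0.0, 0.0],
    [0.5, 0.0, 0.0],
]

# Start with all probability mass on x: the walk alternates between
# "at x" and "at y or z" forever and never converges.
history = power_iterate(M, [1.0, 0.0, 0.0], 4)
for step, p in enumerate(history):
    print(step, p)

# The fixed point itself is left unchanged by M.
fixed_point = [0.5, 0.25, 0.25]
print(mat_vec(M, fixed_point))
```

The fixed point (0.5, 0.25, 0.25) exists, but this particular starting vector never reaches it, which is the kind of bad behavior the random-jump trick is meant to remove.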
Trick: make the random surfer sometimes stop surfing and jump to a random node. The Web graph then becomes complete, but we want the matrix to remain sparse:
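PR(u) = (1-lambda) * sum_v PR(v) * w(u,v) + lambda / N
(the sum ranges over the nodes v with a nonzero entry w(u,v) in row u, column v of M; N is the number of nodes)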
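A minimal sketch of this update in plain Python, assuming the sum runs over all v with a nonzero w(u,v) and that every node has outgoing weights summing to 1 (no dangling nodes). The function name `pagerank`, the edge-dictionary layout, and the default lambda = 0.15 are illustrative choices, not taken from the slides. Only the sparse edge list is stored; the random jump is handled by the constant term lambda / N.

```python
def pagerank(edges, nodes, lam=0.15, iters=100):
    """Iterate PR(u) = (1 - lam) * sum_v PR(v) * w(u, v) + lam / N.

    edges maps (u, v) -> w(u, v), the entry of M in row u, column v.
    Assumes each column of M sums to 1 (illustrative assumption).
    """
    n = len(nodes)
    pr = {u: 1.0 / n for u in nodes}  # start from the uniform vector
    for _ in range(iters):
        # Random-jump term: every node receives lam / N.
        nxt = {u: lam / n for u in nodes}
        # Link-following term: touch only the nonzero entries of M.
        for (u, v), w in edges.items():
            nxt[u] += (1.0 - lam) * pr[v] * w
        pr = nxt
    return pr


# The 3-node example: x splits its weight between y and z; y and z link to x.
nodes = ["x", "y", "z"]
edges = {
    ("x", "y"): 1.0,
    ("x", "z"): 1.0,
    ("y", "x"): 0.5,
    ("z", "x"): 0.5,
}

ranks = pagerank(edges, nodes)
print(ranks)
```

Each iteration costs time proportional to the number of edges plus the number of nodes, so the complete graph created by the random jump is never materialized.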
DryadLINQ
Google's Pregel
Bulk-synchronous parallel (BSP) programming model. Supersteps: in each superstep, each node's compute function is called (in parallel). The compute function may send messages to other nodes, which are received in the next superstep. A node's compute function processes the messages received from the previous superstep.
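The superstep discipline can be sketched as a toy single-process simulation in Python. This is not Pregel's actual API: `run_supersteps`, `max_compute`, and the compute-function signature are hypothetical names chosen for illustration, and the "parallel" calls are simply run one after another. The example propagates the maximum value through a small graph, and the run stops once a superstep sends no messages.

```python
def run_supersteps(graph, values, compute, max_supersteps=50):
    """Toy BSP loop.

    graph:  node -> list of out-neighbors
    values: node -> current state
    compute(node, value, messages, neighbors, superstep)
        returns (new_value, outgoing), where outgoing is a list of
        (target, message) pairs delivered in the NEXT superstep.
    """
    values = dict(values)  # do not mutate the caller's dictionary
    inbox = {node: [] for node in graph}
    for superstep in range(max_supersteps):
        next_inbox = {node: [] for node in graph}
        sent_any = False
        for node in graph:  # conceptually executed in parallel
            values[node], outgoing = compute(
                node, values[node], inbox[node], graph[node], superstep
            )
            for target, message in outgoing:
                next_inbox[target].append(message)
                sent_any = True
        inbox = next_inbox  # barrier: messages become visible only now
        if not sent_any:
            break
    return values


def max_compute(node, value, messages, neighbors, superstep):
    """Adopt the largest value seen so far; notify neighbors on change."""
    new_value = max([value] + messages)
    if superstep == 0 or new_value > value:
        return new_value, [(n, new_value) for n in neighbors]
    return new_value, []


# A chain a - b - c - d with edges in both directions.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
initial = {"a": 3, "b": 6, "c": 2, "d": 1}

result = run_supersteps(graph, initial, max_compute)
print(result)
```

Because each compute call reads only its own state and the messages sent in the previous superstep, the order in which nodes are processed within a superstep does not affect the result, which is what lets a real system run them in parallel.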
Pregel