
Mining of Massive Datasets

Leskovec, Rajaraman, and Ullman


Stanford University
[Tong-Faloutsos, ‘06]

[Figure: example graph on nodes A, B, D, E, F, G, H, I, J; every edge has weight 1]

a.k.a.: Relevance, Closeness, ‘Similarity’…


J. Leskovec, A. Rajaraman, J. Ullman (Stanford University) Mining of Massive Datasets 69
Shortest path is not good:

No effect of degree-1 nodes (E, F, G)!


Multi-faceted relationships
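
The first point can be checked on a small example. The sketch below is illustrative only: the edge list is a hypothetical toy graph (not the edge list of the figure), and `shortest_path_length` and `undirected` are helper names introduced here. It compares the hop distance between A and B before and after degree-1 neighbours are attached to the node between them.

```python
from collections import deque


def shortest_path_length(adj, source, target):
    """Breadth-first search hop count between two nodes (None if unreachable)."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


def undirected(edges):
    """Build an adjacency map from a list of undirected edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


# Hypothetical toy edges, chosen only for illustration.
core = [("A", "D"), ("D", "B")]
pendants = [("D", "E"), ("D", "F"), ("D", "G")]  # degree-1 nodes hanging off D

dist_without = shortest_path_length(undirected(core), "A", "B")
dist_with = shortest_path_length(undirected(core + pendants), "A", "B")
print(dist_without, dist_with)
```

Both distances come out equal: the pendant nodes raise the degree of D, yet the shortest-path measure between A and B does not change.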
Network flow is not good:

Does not punish long paths



[Tong-Faloutsos, ‘06]

[Figure: example graph on nodes A, B, D, E, F, G, H, I, J; every edge has weight 1]

Multiple connections
Quality of connection
Direct & indirect connections
Length, degree, weight…
SimRank: Random walks from a fixed node on k-partite graphs
Setting: k-partite graph with k types of nodes
Example: picture nodes and tag nodes
Do a Random Walk with Restarts from node u
i.e., teleport set S = {u}
Resulting scores measure similarity to node u
Problem:
Must be done once for each node u
Suitable for sub-Web-scale applications
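
A minimal sketch of the procedure above, under stated assumptions: the picture/tag graph is a hypothetical toy example, the function name `random_walk_with_restart` is introduced here, and the edge-following probability `beta = 0.8` and the fixed iteration count are assumed values, not taken from the slides. It runs power iteration with teleport set S = {u} and returns one score per node.

```python
def random_walk_with_restart(adj, u, beta=0.8, iterations=100):
    """Power iteration for PageRank with teleport set S = {u}.

    adj maps each node to a list of neighbours (undirected, no isolated nodes).
    beta is the probability of following an edge (assumed value);
    with probability 1 - beta the walker jumps back to u.
    """
    scores = {node: 0.0 for node in adj}
    scores[u] = 1.0  # the walker starts at u
    for _ in range(iterations):
        nxt = {node: 0.0 for node in adj}
        for node, nbrs in adj.items():
            # Split this node's score evenly among its neighbours.
            share = beta * scores[node] / len(nbrs)
            for nbr in nbrs:
                nxt[nbr] += share
        nxt[u] += 1.0 - beta  # restart mass always returns to u
        scores = nxt
    return scores


# Hypothetical bipartite graph: pictures p1..p3 and tags t1, t2.
edges = [("p1", "t1"), ("p2", "t1"), ("p2", "t2"), ("p3", "t2")]
adj = {}
for a, b in edges:
    adj.setdefault(a, []).append(b)
    adj.setdefault(b, []).append(a)

scores = random_walk_with_restart(adj, "p1")
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In this toy graph p2 shares a tag with p1 while p3 does not, so p2 receives a higher score than p3. Finding the nodes most similar to a different node means rerunning the whole iteration with a new `u`, which is the cost the slide points out.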


[Figure: bipartite graph with Conference nodes (IJCAI, KDD, ICDM, SDM, AAAI, NIPS) and Author nodes (Philip S. Yu, Ning Zhong, R. Ramakrishnan, M. Jordan)]

Q: What is the most related conference to ICDM?
A: Personalized PageRank with teleport set S={ICDM}



[Figure: ICDM linked to its most related conferences (PKDD, SDM, PAKDD, KDD, ICML, CIKM, ICDE, ECML, SIGMOD, DMKD), with edge scores ranging from 0.004 to 0.011]
