Traditional Feature-based Methods

Graph Representation: Graphs, A Common Language
[Figure: very different domains, such as actor collaboration networks and protein-protein interaction networks, can all be described by the same abstraction: a set of nodes connected by edges (here a small example graph with |N| = 4).]

Graph Representation: How to Build a Graph
- What are nodes?
- What are edges?
- The choice of the proper network representation of a given domain/problem determines our ability to use networks successfully:
  - In some cases there is a unique, unambiguous representation.
  - In other cases the representation is by no means unique.
  - The way you assign links will determine the nature of the questions you can study.

Graph Properties: Undirected vs. Directed
- Undirected links (symmetrical, reciprocal). Examples: collaborations, friendship on Facebook.
- Directed links (arcs). Examples: phone calls, following on Twitter.

Graph Properties: Node Degrees
- Node degree $k_i$: the number of edges adjacent to node $i$. Example: $k_A = 4$.
- Average degree: $\bar{k} = \langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2E}{N}$.
- In directed networks we define an in-degree and an out-degree; the (total) degree of a node is the sum of its in- and out-degrees, and the average in-degree equals the average out-degree, $\overline{k^{in}} = \overline{k^{out}} = \frac{E}{N}$.
- Source: node with $k^{in} = 0$. Sink: node with $k^{out} = 0$.

Graph Properties: Bipartite Graph
- A bipartite graph is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets.
- Examples: authors-to-papers, actors-to-movies, users-to-movies.

Graph Properties: Bipartite Graph Projection
[Figure: a bipartite graph and its two folded projections. In Projection U, two nodes of U are connected if they share at least one neighbor in V; Projection V is defined analogously.]

Graph Properties: Adjacency Matrix
- $A_{ij} = 1$ if there is a link from node $i$ to node $j$; $A_{ij} = 0$ otherwise.
- [Figure: example 4x4 adjacency matrices for an undirected and a directed graph.] Note that for a directed graph the matrix is not symmetric.

Graph Properties: Adjacency Matrices Are Sparse
- Most real-world networks are sparse: $E \ll E_{max}$ (or $\bar{k} \ll N-1$).
[Table: example real-world networks with their number of nodes N, number of edges, and average degree, illustrating how sparse they are.]

Graph Properties: Edge List (Representation)
- Represent the graph as a list of edges: (2,3), (2,4), (3,2), (3,4), (4,5), (5,2), (5,1).

Graph Properties: Adjacency List (Representation)
- Easier to work with if the network is large and sparse.
- Allows us to quickly retrieve all neighbors of a given node.
  - 1:
  - 2: 3, 4
  - 3: 2, 4
  - 4: 5
  - 5: 1, 2
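To make the three representations concrete, here is a minimal sketch (not part of the lecture) that turns the toy directed edge list above into an adjacency list, an adjacency matrix, and node degrees; NumPy is an assumed dependency and the node labels 1..5 follow the slide example.

```python
# Minimal sketch: edge list -> adjacency list -> adjacency matrix -> degrees,
# for the small directed toy graph used in the slides above.
import numpy as np

edges = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]
nodes = sorted({u for e in edges for u in e})
N = len(nodes)

# Adjacency list: node -> list of out-neighbors
adj_list = {v: [] for v in nodes}
for u, v in edges:
    adj_list[u].append(v)

# Adjacency matrix: A[i, j] = 1 if there is a link from node i to node j
idx = {v: i for i, v in enumerate(nodes)}   # node label -> row/column index
A = np.zeros((N, N), dtype=int)
for u, v in edges:
    A[idx[u], idx[v]] = 1

in_degree = A.sum(axis=0)    # column sums = k^in
out_degree = A.sum(axis=1)   # row sums = k^out

print(adj_list)              # {1: [], 2: [3, 4], 3: [2, 4], 4: [5], 5: [2, 1]}
print(in_degree, out_degree)
```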
Graph Properties: Node and Edge Attributes
Possible attributes include:
- Weight (e.g., frequency of communication)
- Ranking (best friend, second best friend, ...)
- Type (friend, relative, co-worker)
- Sign (friend vs. foe, trust vs. distrust)
- Properties depending on the structure of the rest of the graph (e.g., number of common friends)

Graph Properties: Unweighted vs. Weighted Graphs
- Unweighted (undirected): entries of $A$ are 0 or 1. Examples: friendship, hyperlinks.
- Weighted (undirected): entries of $A$ are edge weights, e.g.
  $$A = \begin{pmatrix} 0 & 2 & 0.5 & 0 \\ 2 & 0 & 1 & 4 \\ 0.5 & 1 & 0 & 0 \\ 0 & 4 & 0 & 0 \end{pmatrix}$$
  Examples: collaboration, Internet, roads.
- In both cases $A_{ii} = 0$ and the number of edges is $E = \frac{1}{2}\sum_{i,j} \mathrm{nonzero}(A_{ij})$.

Graph Properties: Self-loops and Multigraphs
- Self-edges (self-loops), undirected: the diagonal entries $A_{ii}$ can be nonzero. Examples: proteins, hyperlinks.
- Multigraph, undirected: $A_{ij}$ counts the number of edges between nodes $i$ and $j$. Examples: communication, collaboration.

Graph Properties: Connectivity of Undirected Graphs
- A graph is connected if any two nodes can be joined by a path.
- A disconnected graph is made up of two or more connected components.
- The largest component is called the giant component; an isolated node forms a component by itself.
- The adjacency matrix of a network with several components can be written in block-diagonal form, so that nonzero elements are confined to square blocks, with all other elements being zero.

Graph Properties: Connectivity of Directed Graphs
- Strongly connected: there is a path from each node to every other node and vice versa.
- Weakly connected: connected if we disregard the edge directions.
- Example: a graph can be connected but not strongly connected, e.g., when there is no way to get from F to G by following the edge directions.

Machine Learning Tasks on Graphs
- Node-level prediction
- Link-level prediction
- Graph-level prediction

Traditional ML Pipeline
- Design features for nodes/links/graphs.
- Obtain features for all training data (node, link, and graph features are each represented as vectors in $\mathbb{R}^D$).
- Train an ML model (random forest, SVM, neural network, etc.) on feature vectors $x_i$ with labels $y_i$.
- Apply the model: given a new node/link/graph, obtain its features and make a prediction.
(A minimal code sketch of this pipeline appears after the node-feature overview below.)

Feature Design
- Using effective features over graphs is the key to achieving good model performance.
- In this lecture, we overview the traditional features for:
  - Node-level prediction
  - Link-level prediction
  - Graph-level prediction
- For simplicity, we focus on undirected graphs.

Node Features: Node-Level Tasks
- Example task: node classification (predict the labels of the unlabeled nodes given the labeled ones). [Figure: a partially labeled graph.]

Node Features: Goal and Overview
- Goal: characterize the structure and position of a node in the network.
- Features covered: node degree, node centrality, clustering coefficient, graphlets.

Node Features: Node Degree
- The degree $k_v$ of node $v$ is the number of edges (neighboring nodes) the node has.
- It treats all neighboring nodes equally.

Node Features: Node Centrality
- Node degree counts the neighboring nodes without capturing their importance.
- Node centrality $c_v$ takes the importance of a node in the graph into account.
- Different ways to model importance: eigenvector centrality, betweenness centrality, closeness centrality, and many others.
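Before looking at each centrality measure in detail, here is a minimal sketch of the traditional pipeline described above, using only the node degree as the feature. NetworkX, scikit-learn, the built-in karate club graph, and the use of its club membership as labels are illustrative assumptions, not part of the lecture material.

```python
# Minimal sketch of the traditional node-level pipeline:
# hand-designed feature (node degree) -> standard classifier.
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

G = nx.karate_club_graph()                       # small built-in example graph

# 1) Design / obtain features for every node (shape: [num_nodes, num_features]).
X = np.array([[G.degree(v)] for v in G.nodes()], dtype=float)

# 2) Labels y_v for training; here we use each node's known club membership.
y = np.array([0 if G.nodes[v]["club"] == "Mr. Hi" else 1 for v in G.nodes()])

# 3) Train an ML model (random forest / SVM / neural network, ...).
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# 4) Apply the model: featurize a node and predict its label.
print(model.predict([[G.degree(0)]]))
```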
Node Features: Eigenvector Centrality
- A node $v$ is important if it is surrounded by important neighboring nodes $u \in N(v)$.
- We model the centrality of node $v$ as the sum of the centralities of its neighbors:
  $$c_v = \frac{1}{\lambda} \sum_{u \in N(v)} c_u,$$
  where $\lambda$ is a normalization constant (it will turn out to be the largest eigenvalue of $A$).
- Notice that the above equation models centrality in a recursive manner. How do we solve it?
- Rewrite the recursive equation in matrix form: $\lambda c = A c$, where $A$ is the adjacency matrix ($A_{uv} = 1$ if $u \in N(v)$), $c$ is the centrality vector, and $\lambda$ is an eigenvalue.
- We see that the centrality $c$ is an eigenvector of $A$.
- The largest eigenvalue $\lambda_{max}$ is always positive and unique (by the Perron-Frobenius theorem).
- The eigenvector $c_{max}$ corresponding to $\lambda_{max}$ is used for centrality.

Node Features: Betweenness Centrality
- A node is important if it lies on many shortest paths between other nodes:
  $$c_v = \sum_{s \neq v \neq t} \frac{\#(\text{shortest paths between } s \text{ and } t \text{ that contain } v)}{\#(\text{shortest paths between } s \text{ and } t)}$$
- Example: $c_C = 3$ (C lies on the shortest paths A-C-B, A-C-D, A-C-D-E) and $c_D = 3$ (D lies on A-C-D-E, B-D-E, C-D-E).

Node Features: Closeness Centrality
- A node is important if it has small shortest-path lengths to all other nodes:
  $$c_v = \frac{1}{\sum_{u \neq v} \text{shortest-path length between } u \text{ and } v}$$
- Example: $c_A = 1/(2+1+2+3) = 1/8$ (via the paths A-C-B, A-C, A-C-D, A-C-D-E) and $c_D = 1/(2+1+1+1) = 1/5$ (via D-C-A, D-B, D-C, D-E).

Node Features: Clustering Coefficient
- Measures how connected $v$'s neighboring nodes are:
  $$e_v = \frac{\#(\text{edges among neighboring nodes})}{\binom{k_v}{2}} \in [0, 1],$$
  where the denominator is the number of node pairs among the $k_v$ neighboring nodes.
- Examples: $e_v = 1$, $e_v = 0.5$, $e_v = 0$; in these examples the denominator is 6 (4 choose 2).

Node Features: Graphlets
- Observation: the clustering coefficient counts the number of triangles in the ego-network of a node (in the $e_v = 0.5$ example, 3 triangles out of 6 node triplets).
- We can generalize this by counting the number of pre-specified subgraphs, i.e., graphlets.
- Graphlets: rooted, connected, non-isomorphic subgraphs. [Figure: the graphlets $G_0, G_1, \dots$ on 2 to 5 nodes.] There are 73 different graphlets on up to 5 nodes.

Node Features: Graphlet Degree Vector (GDV)
- Degree counts the #(edges) that a node touches.
- The clustering coefficient counts the #(triangles) that a node touches.
- Graphlet Degree Vector (GDV): graphlet-based features for nodes; the GDV counts the #(graphlets) that a node touches.
- [Figure: example with the possible graphlets up to size 3 (rooted positions a, b, c, d), the graphlet instances that node u touches, and the resulting GDV of node u.]
- Considering graphlets on 2 to 5 nodes, we get a vector of 73 coordinates: a signature of a node that describes the topology of its neighborhood and captures its interconnectivities out to a distance of 4 hops.
- The graphlet degree vector therefore provides a measure of a node's local network topology: comparing the vectors of two nodes gives a more detailed measure of local topological similarity than node degrees or the clustering coefficient.
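The degree, centrality, and clustering-coefficient features above can be computed directly with NetworkX (an assumed dependency, not used in the lecture itself); graphlet/GDV counting needs specialized tooling and is omitted. The toy graph below is the five-node example used in the centrality slides, so the numbers can be checked against them. Note that NetworkX normalizes some measures slightly differently from the slide definitions, as flagged in the comments.

```python
# Minimal sketch: node-level features on the five-node example graph.
import networkx as nx

G = nx.Graph([("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")])

degree = dict(G.degree())                                      # k_v
eigenvector = nx.eigenvector_centrality_numpy(G)               # eigenvector of A for lambda_max
betweenness = nx.betweenness_centrality(G, normalized=False)   # raw shortest-path counts, as in the slides
closeness = nx.closeness_centrality(G)                         # NetworkX returns (n-1) / sum of distances,
                                                               # i.e. the slide's 1/sum(...) scaled by (n-1)
clustering = nx.clustering(G)                                  # e_v = #(edges among neighbors) / C(k_v, 2)

for v in G.nodes():
    print(v, degree[v], round(eigenvector[v], 3),
          betweenness[v], round(closeness[v], 3), clustering[v])
```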
Link Features
- The task is to predict new links based on the existing links.
- At test time, node pairs (with no existing links) are ranked, and the top K node pairs are predicted as new links.
- The key is to design features for a pair of nodes.

Link Features: Two Formulations of the Link Prediction Task
- 1) Links missing at random: remove a random set of links and then aim to predict them.
- 2) Links over time: given $G[t_0, t_0']$, a graph defined by edges up to time $t_0'$, output a ranked list L of links (not in $G[t_0, t_0']$) that are predicted to appear in $G[t_1, t_1']$.
- Evaluation: let $n = |E_{new}|$ be the number of new edges that appear during the test period $[t_1, t_1']$; take the top $n$ elements of L and count the correct edges.

Link Features: Methodology
- For each pair of nodes (x, y), compute a score c(x, y); for example, c(x, y) could be the number of common neighbors of x and y.
- Sort the pairs (x, y) by decreasing score c(x, y).
- Predict the top n pairs as new links.
- See which of these links actually appear in $G[t_1, t_1']$.

Link Features: Overview
- Distance-based features
- Local neighborhood overlap
- Global neighborhood overlap

Link Features: Distance-based Features
- Shortest-path distance between two nodes. Example: $S_{BH} = S_{BE} = S_{AB} = 2$ and $S_{BG} = S_{BF} = 3$.
- However, this does not capture the degree of neighborhood overlap: node pair (B, H) has 2 shared neighboring nodes, while pairs (B, E) and (A, B) only have 1 such node.

Link Features: Local Neighborhood Overlap
Captures the number of neighboring nodes shared between two nodes $v_1$ and $v_2$:
- Common neighbors: $|N(v_1) \cap N(v_2)|$. Example: $|N(A) \cap N(B)| = |\{C\}| = 1$.
- Jaccard's coefficient: $\frac{|N(v_1) \cap N(v_2)|}{|N(v_1) \cup N(v_2)|}$. Example: $\frac{|N(A) \cap N(B)|}{|N(A) \cup N(B)|} = \frac{|\{C\}|}{|\{C, D\}|} = \frac{1}{2}$.
- Adamic-Adar index: $\sum_{u \in N(v_1) \cap N(v_2)} \frac{1}{\log(k_u)}$. Example: $\frac{1}{\log(k_C)} = \frac{1}{\log 4}$.

Link Features: Limitation of Local Neighborhood Features
- The metric is always zero if the two nodes do not have any neighbors in common.
- However, the two nodes may still potentially be connected in the future.
- Global neighborhood overlap metrics resolve this limitation by considering the entire graph.

Link Features: Global Neighborhood Overlap
- Katz index: count the number of walks of all lengths between a pair of nodes.
- How do we compute the number of walks between two nodes? Use powers of the adjacency matrix!
  - $A_{uv}$ specifies the number of walks of length 1 (direct neighborhood) between $u$ and $v$.
  - $A^2_{uv}$ specifies the number of walks of length 2 (neighbor of a neighbor) between $u$ and $v$.
  - And $A^l_{uv}$ specifies the number of walks of length $l$.

Link Features: Powers of Adjacency Matrices
- Recall: $A_{uv} = 1$ if $u \in N(v)$.
- Let $P^{(K)}_{uv}$ be the number of walks of length $K$ between $u$ and $v$. We will show $P^{(K)} = A^K$.
- $P^{(1)}_{uv}$ = number of walks of length 1 (direct neighborhood) between $u$ and $v$ = $A_{uv}$.
- How do we compute $P^{(2)}_{uv}$?
  - Step 1: compute the number of walks of length 1 between each of $u$'s neighbors and $v$.
  - Step 2: sum up these walk counts across $u$'s neighbors:
    $$P^{(2)}_{uv} = \sum_i A_{ui} \cdot P^{(1)}_{iv} = \sum_i A_{ui} \cdot A_{iv} = A^2_{uv}$$

Link Features: Katz Index
- The Katz index between $v_1$ and $v_2$ sums the number of walks of each length $l$ over all walk lengths, discounted by a factor $0 < \beta < 1$:
  $$S_{v_1 v_2} = \sum_{l=1}^{\infty} \beta^l A^l_{v_1 v_2}$$
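Here is a minimal sketch of the link-level features above, using NumPy and NetworkX (assumed dependencies). The toy graph is chosen so that N(A) = {C}, N(B) = {C, D}, and k_C = 4, which reproduces the worked local-overlap numbers; the Katz score is computed by truncating the infinite sum at a maximum walk length (the closed form (I - beta*A)^-1 - I gives the exact value when beta is below 1/lambda_max).

```python
# Minimal sketch: local and global neighborhood overlap features for a node pair.
import numpy as np
import networkx as nx

G = nx.Graph([("A", "C"), ("B", "C"), ("B", "D"), ("C", "E"), ("C", "F")])

def local_overlap(G, v1, v2):
    """Common neighbors, Jaccard coefficient, and Adamic-Adar index."""
    n1, n2 = set(G[v1]), set(G[v2])
    common = n1 & n2
    jaccard = len(common) / len(n1 | n2) if n1 | n2 else 0.0
    # Skip degree-1 common neighbors to avoid dividing by log(1) = 0.
    adamic_adar = sum(1.0 / np.log(G.degree(u)) for u in common if G.degree(u) > 1)
    return len(common), jaccard, adamic_adar

def katz_matrix(G, beta=0.1, max_length=10):
    """Truncated Katz index S = sum_{l=1..max_length} beta^l * A^l for all node pairs."""
    A = nx.to_numpy_array(G)
    S, A_power = np.zeros_like(A), np.eye(len(A))
    for l in range(1, max_length + 1):
        A_power = A_power @ A          # A^l: entry (u, v) counts walks of length l
        S += (beta ** l) * A_power
    return S

print(local_overlap(G, "A", "B"))      # (1, 0.5, 1/log(4)), matching the worked example
nodes = list(G.nodes())
print(katz_matrix(G)[nodes.index("A"), nodes.index("B")])
```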
