Social Media & Web Analytics: Manu Kohli BE, MBA (IIFT-2003-05) Data Science Indiana University PHD Candidate Iit-Delhi

Social Media & Web Analytics
Manu Kohli
BE, MBA (IIFT 2003-05)
Data Science, Indiana University
PhD Candidate, IIT Delhi
Social Media & Web Analytics
1. Strong and Weak Ties
2. Preferential Attachment
3. Scale Free Networks
4. Random Networks
5. Centrality Measures
◦ Degree
◦ Betweenness
Strong Ties and Weak Ties
•A network consists of nodes and edges
•Links can play different roles in the network structure: a few edges
span different groups, while most are surrounded by dense patterns of
connections.
•Nodes can likewise play different roles in the network structure.
•Mark Granovetter (born October 20, 1943): an American sociologist and
professor at Stanford University
•Published his work on tie strength, "The Strength of Weak Ties", in 1973
•Tie strength refers to a general sense of closeness with another person
• Strong ties: the stronger links, corresponding to friends, dependable
sources of social or emotional support
•Weak ties: the weaker links, corresponding to acquaintances. The most
notable role of weak ties in social networks is their structural significance
as connectivity-generating factors: they tend to be bridges that connect
distant clusters within social structures.
•Weak ties facilitate the interpersonal spread of novel phenomena,
whether useful information or harmful diseases.
•Strong ties exist between close-knit members with frequent interactions,
such as family and close friends.
•Example: a strong tie is someone you know well. You probably have
their number in your phone and interact with them on social
networking sites. There is good two-way conversation, and even if you don't
know everything about them, you know them pretty well and information
flows freely. You and they tend to know the same information.
Weak ties are typified by distant social relationships and infrequent interactions, which are
commonly observed between acquaintances or strangers (Granovetter, 1973).
A weak tie is a more tenuous relationship. Once a year, you may send them a Christmas
message promising to be in touch more often. If you look up their number, they are
surprised to hear from you. You have different interests and don't interact much. You might
have kept their business card in case it comes in handy one day.
Take LinkedIn, for example:
◦Are all of your relationships "strong ties?"
◦Do you count all of your connections as good friends? Or are they colleagues who you
occasionally interact with?
◦Are they important to you at all? Should they be?
Weak Ties – Do They Matter?
We are weak ties to some of our connections and strong ties to others.
Just like a network multiplexer, our weak ties can carry both types of
signals around our network.
Weak ties are crucial in binding groups of strong ties together. They bring
circles of networks into contact with each other, strengthening
relationships and forming new bonds between existing relationship
circles.
Develop new interest

New Job Opportunity
Dunbar Number
Tie strength: the 5-15-50-150-500 rule
According to [How Many Friends Does One Person Need?: Dunbar’s Number and Other Evolutionary Quirks,
Robin Dunbar, Harvard University Press (November 1, 2010)]:
Most people's social networks share a common pattern, unchanged for thousands of years.
There are clear boundaries based on the number of connections we have; the count starts at five and goes up by a
factor of roughly three at each level.
•Inner circle: 5
•Sympathy group: 12-15
•Semi-regular group: 50
•Stable social group: 150 (the Dunbar number)
•Friends of friends group (weak ties): 500
Triadic closure: a friend of a friend is also a friend
If two people in a social network have a friend in common, then there is
an increased likelihood that they will become friends themselves at some
point in the future.
This principle can explain how networks evolve over time in many
situations.
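As a rough illustration of triadic closure, the sketch below lists pairs who are not yet friends but share at least one friend, ranked by the number of shared friends. The network, the names, and the function `closure_candidates` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def closure_candidates(friends):
    """Return unconnected pairs that share friends, most shared friends first."""
    candidates = {}
    for u, v in combinations(sorted(friends), 2):
        if v in friends[u]:
            continue  # already friends, so there is no triangle left to close
        shared = friends[u] & friends[v]
        if shared:
            candidates[(u, v)] = len(shared)
    return sorted(candidates.items(), key=lambda item: -item[1])


# Hypothetical undirected friendship network stored as adjacency sets.
friends = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat", "Dan"},
    "Cat": {"Ann", "Bob", "Dan"},
    "Dan": {"Bob", "Cat"},
    "Eve": set(),
}

print(closure_candidates(friends))  # [(('Ann', 'Dan'), 2)]
```

Ann and Dan share two friends (Bob and Cat), so under triadic closure they are the pair most likely to connect next; Eve shares no friends with anyone and does not appear.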
Types of Network
Random Network
Scale-Free Network
The Erdős-Rényi (Random Graph) Model
Really a randomized algorithm for generating networks
Begin with N isolated vertices, no edges
Add edges gradually, one at a time
Randomly select two vertices not already neighbors, add edge
So edges are added in a random, unbiased fashion
But what can it already explain?
The Erdős-Rényi (Random Graph) Model
After adding E edges, edge density is $p = \frac{E}{N(N-1)/2}$, the fraction of all possible vertex pairs that are connected
As E increases, p goes from 0 to 1

Q: What are the likely structural properties at density p?
e.g. as p goes from 0 to 1, a small diameter and a single connected component emerge
Define λ = Np as the expected degree of a node.
Most results for ER graphs suppose that λ is held fixed (e.g. λ = 1 or λ = 2)
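The edge-by-edge process can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `grow_random_graph`, the seed, and the sizes N = 50 and E = 100 are assumptions made for the example.

```python
import random


def grow_random_graph(n, num_edges, seed=0):
    """Start with n isolated vertices and add edges one at a time, uniformly at random."""
    rng = random.Random(seed)
    max_edges = n * (n - 1) // 2
    if num_edges > max_edges:
        raise ValueError("more edges requested than vertex pairs available")
    edges = set()
    while len(edges) < num_edges:
        u, v = rng.sample(range(n), 2)      # two distinct vertices
        edges.add((min(u, v), max(u, v)))   # pairs that are already neighbors are ignored
    return edges


n, e = 50, 100                              # illustrative sizes (assumed)
edges = grow_random_graph(n, e)

p = len(edges) / (n * (n - 1) / 2)          # edge density after E edges
mean_degree = 2 * len(edges) / n            # every edge contributes to two degrees

print(f"density p = {p:.4f}")
print(f"mean degree = {mean_degree:.2f}, (N-1)p = {(n - 1) * p:.2f}")
```

The mean degree equals (N-1)p exactly, which is approximately Np for large N; this is the quantity the slides call λ.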
Erdős-Rényi (Random Graph) Model
Simulation
http://www.networkpages.nl/CustomMedia/Animations/RandomGraph/ERRG/AddoneEdgepATime.html
Erdős-Rényi (Random Graph) Model
Erdős-Rényi:
global/background edge density p
all edges appear independently with probability p
no bias towards connecting friends of friends, no high clustering
But in real networks, such biases often exist:
people introduce their friends to each other
people with common friends may share interests (homophily)
So natural to consider a model in which:
the more common neighbors two vertices share, the more likely they are to connect
still some “background” probability of connecting
still selecting edges randomly, but now with a bias towards friends of friends
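One way to picture such a biased model is a connection probability that starts from a background value and rises with the number of common neighbors. The linear form and the parameter values `background` and `boost` below are assumptions made purely for illustration; the slides do not specify a particular formula.

```python
def connect_probability(common_neighbors, background=0.01, boost=0.10):
    """Hypothetical rule: background chance plus a bonus per shared neighbor, capped at 1."""
    return min(1.0, background + boost * common_neighbors)


for shared in (0, 1, 3, 20):
    print(shared, connect_probability(shared))
```

With no common neighbors the pair connects only at the background rate, as in the Erdős-Rényi model; each shared neighbor makes the edge more likely, which is what produces higher clustering.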
Preferential Attachment
Processes in which the more someone has of something, the
more likely they are to get more of it
Examples:
the more friends you have, the easier it is to make more
the more business a firm has, the easier it is to win more
the more people there are at a nightclub, the more who want to go
Such processes will amplify inequality

One simple and general model: if you have amount x of
something, the probability you get more is proportional to x
Nodes appear over time (growth)
Nodes prefer to attach to nodes with many connections (preferential attachment,
cumulative advantage): new nodes prefer well-connected nodes over less
well-connected nodes.
Preferential attachment leads to the generation of scale-free networks
•Start with two vertices connected by an edge

•At each step, add one new vertex v with one edge back to previous vertices
•Probability a previously added vertex u receives the new edge from v is proportional to the
(current) degree of u
• more precisely, probability u gets the edge = (current degree of u)/(sum of all current
degrees)
•Vertices with high degree are likely to get even more links!
•Generates a power law distribution of degrees
•Variation: each new vertex initially gets k edges
•Here’s another demo
http://rocs.hu-berlin.de/interactive/pa/index.html
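The steps above can be sketched as follows for the one-edge-per-vertex case. The function name `preferential_attachment`, the seed, and the graph size of 1000 are assumptions for the example; the selection rule is the one stated above, probability = (current degree of u) / (sum of all current degrees).

```python
import random


def preferential_attachment(num_vertices, seed=0):
    """Grow a graph where each new vertex attaches to one earlier vertex,
    chosen with probability proportional to that vertex's current degree."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}        # start with two vertices joined by one edge
    edges = [(0, 1)]
    for v in range(2, num_vertices):
        existing = list(degree)
        weights = [degree[u] for u in existing]
        # random.choices normalizes the weights, i.e. divides by the sum of degrees
        u = rng.choices(existing, weights=weights, k=1)[0]
        edges.append((v, u))
        degree[u] += 1
        degree[v] = 1
    return degree, edges


degree, edges = preferential_attachment(1000)
ranked = sorted(degree.values(), reverse=True)
print("five largest degrees:", ranked[:5])
print("median degree:", ranked[len(ranked) // 2])
```

Running it typically shows a handful of early vertices with very large degree while the typical vertex keeps only one or two links, which is the "rich get richer" effect described above.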
Scale Free Graphs
Several natural and human-made systems, including the Internet, the World Wide
Web, citation networks, and some social networks, are thought to be approximately
scale-free and certainly contain a few nodes (called hubs) with unusually high degree
compared to the other nodes of the network.
The network begins with an initial connected network of nodes. New nodes are added
to the network one at a time. Each new node is connected to existing nodes with a
probability that is proportional to the number of links that the existing nodes already
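have. Formally, the probability $p_i$ that the new node is connected to node $i$ is [2]
$p_i = \frac{k_i}{\sum_j k_j}$,
where $k_i$ is the degree of node $i$ and the sum runs over all existing nodes $j$.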
Generating Scale-Free Graphs
To start, each of the three vertices has an equal number of edges (2)
The probability of choosing any vertex is 1/3
We add a new vertex, and it will have m edges; here take m = 2
Draw 2 random elements from the array (which lists each vertex once per
edge endpoint); suppose they are 2 and 3
Now the probabilities of selecting 1, 2, 3, or 4 are 1/5,
3/10, 3/10, 1/5
Add a new vertex and draw a vertex from the array for it to connect to
With the addition of node 5, where should the new node be attached?
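The arithmetic in this walk-through can be checked with a small sketch. It assumes the array holds one entry per edge endpoint, so that a uniform draw from the array picks a vertex in proportion to its degree; the starting array `[1, 1, 2, 2, 3, 3]` and the helper `selection_probabilities` are assumptions for the example.

```python
import random
from collections import Counter
from fractions import Fraction


def selection_probabilities(endpoint_array):
    """Probability of drawing each vertex when sampling uniformly from the array."""
    counts = Counter(endpoint_array)
    total = len(endpoint_array)
    return {v: Fraction(c, total) for v, c in sorted(counts.items())}


# Three vertices with two edges each: every vertex appears twice.
array = [1, 1, 2, 2, 3, 3]
print(selection_probabilities(array))   # each vertex has probability 1/3

# Vertex 4 arrives with m = 2 edges and draws vertices 2 and 3.
for target in (2, 3):
    array += [4, target]                # one entry for each endpoint of the new edge
print(selection_probabilities(array))   # 1/5, 3/10, 3/10, 1/5

# Vertex 5 would now draw its targets from the updated array.
rng = random.Random(0)
print("vertex 5 attaches to:", rng.choice(array))
```

Vertex 5 is most likely to attach to vertex 2 or 3, since each holds 3 of the 10 array entries.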
Power-law distribution
Scale-Free vs. Random (Non-Preferential) Growth
Exponential vs. Power-Law
Scale Free Network
•The graph is connected
•Every new vertex is born with a link or several links (depending on whether m = 1 or
m > 1)
•It then connects to an ‘older’ vertex, which itself connected to another vertex when it was
introduced
•The older are richer
•Nodes accumulate links as time goes on, which gives older nodes an advantage since
newer nodes are going to attach preferentially – and older nodes have a higher degree
to tempt them with than some new kid on the block
Scale Free Network
•A network is called scale-free if the characteristics of the network are independent of the
size of the network, i.e. the number of nodes. That means that when the network grows,
the underlying structure remains the same.
http://rocs.hu-berlin.de/interactive/pa/index.html
Scale Free Network
•The underlying structure
•A scale-free network is defined by its degree distribution (the number of edges per node)
following a so-called power-law distribution.
•A crucial difference between the normal and the power-law distribution is that nodes with
very high numbers of edges are far more common under a power law than under a normal
distribution, whereas moderately well-connected nodes are more common under a normal
distribution.
•This means that in networks you will often find a small number of very highly connected
nodes, with a number of connections that would not occur if the distribution were
normal
Centrality
•Real-valued function on the nodes of a graph
•How influential is a person in a social network?
•How well used is a road in a transportation network?
•How important is a web page?
•How important is a room in a building?
Centrality
Different measures of centrality
◦ Degree centrality
◦ Betweenness centrality
◦ Closeness centrality
◦ Eigenvector centrality
Centrality
•Relative importance of a node in the graph
•Which nodes are in the “center” of a graph?
•What do you mean by “center”?
•Definition of “center” varies by context/purpose
•“There is certainly no unanimity on exactly what centrality is or on its conceptual
foundations, and there is little agreement on the proper procedure for its measurement.”
(Freeman, 1979)
Degree Centrality
Most intuitive notion of centrality
Node with the highest degree is most important
Index of exposure to what is flowing through the network
Gossip network: a central actor is more likely to hear a piece of gossip
Normalized degree centrality
Divide by max. possible degree (n-1)
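A minimal sketch of normalized degree centrality, assuming an undirected graph stored as adjacency sets. The gossip network and its node names are hypothetical.

```python
def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree, n - 1."""
    n = len(adjacency)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adjacency.items()}


# Hypothetical gossip network: "hub" talks to everyone, and c and d also talk to each other.
gossip = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub", "d"},
    "d": {"hub", "c"},
}

for node, score in sorted(degree_centrality(gossip).items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.2f}")
```

Here "hub" scores 1.0 because it is connected to all n - 1 = 4 other nodes, so it is the actor most exposed to whatever flows through the network.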
Degree Centrality
Can be deceiving
Why?
Local measure: it counts only a node's immediate neighbors and ignores the rest of the network
Degree Centrality
When to use?
Whom to ask for a favor
People you can talk to
Betweenness Centrality
The betweenness centrality of a node 𝑢 is the fraction of shortest paths between all other
pairs of nodes that pass through 𝑢
It quantifies the control a node has over communication between other nodes
•𝐴 lies between no two other vertices
•𝐵 lies between 𝐴 and 3 other vertices: 𝐶, 𝐷, and 𝐸
•𝐶 lies between 4 pairs of vertices: (𝐴, 𝐷), (𝐴, 𝐸), (𝐵, 𝐷), (𝐵, 𝐸)
The betweenness centrality [4] $C_b(n)$ of a node $n$ is computed as
follows:
$C_b(n) = \sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}}$,
where $s$ and $t$ are nodes in the network different from $n$, $\sigma_{st}$ denotes
the number of shortest paths from $s$ to $t$, and $\sigma_{st}(n)$ is the number of
shortest paths from $s$ to $t$ that $n$ lies on.
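For example, the betweenness centrality of node $b$ in the figure, normalized by the 6 pairs
that can be formed from the other four nodes $a$, $c$, $d$, $e$, is computed as follows:
$C_b(b) = \frac{1}{6}\left(\frac{\sigma_{ac}(b)}{\sigma_{ac}} + \frac{\sigma_{ad}(b)}{\sigma_{ad}} + \frac{\sigma_{ae}(b)}{\sigma_{ae}} + \frac{\sigma_{cd}(b)}{\sigma_{cd}} + \frac{\sigma_{ce}(b)}{\sigma_{ce}} + \frac{\sigma_{de}(b)}{\sigma_{de}}\right)$
$= \frac{1}{6}\left(\frac{1}{1} + \frac{1}{1} + \frac{2}{2} + \frac{1}{2} + 0 + 0\right) = \frac{3.5}{6} \approx 0.583$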
Normalization coefficient = $(n-1)(n-2)/2$ if the node is neither the initial nor the terminal
node, or $n(n-1)/2$ otherwise
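The worked example can be reproduced with a short breadth-first-search sketch. The figure is not available here, so the edge list below is an assumption: it is one five-node graph whose shortest-path counts match the numbers in the example (a-b, b-c, b-d, c-e, d-e). The function names are likewise chosen for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Distance to every reachable node and the number of shortest paths reaching it."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:               # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:      # u is a predecessor of w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj, normalized=True):
    """Sum over pairs (s, t) of the fraction of shortest s-t paths passing through each node."""
    info = {s: bfs_counts(adj, s) for s in adj}
    n = len(adj)
    result = {}
    for v in adj:
        dist_v, sigma_v = info[v]
        total = 0.0
        for s, t in combinations([x for x in adj if x != v], 2):
            dist_s, sigma_s = info[s]
            if t not in dist_s or v not in dist_s or t not in dist_v:
                continue                    # s, t, or v are not mutually reachable
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
        result[v] = total / ((n - 1) * (n - 2) / 2) if normalized else total
    return result


# Assumed graph, consistent with the path counts in the worked example.
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "e"},
    "d": {"b", "e"},
    "e": {"c", "d"},
}

scores = betweenness(graph)
print(round(scores["b"], 3))  # 0.583
```

Node b scores highest because every shortest path from a to the rest of the graph runs through it, while a itself lies on no shortest path between other nodes and scores 0.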
Thank You
