
Big Data Analytics

MS4252
Social Network Analysis II

Source: Jennifer Golbeck (2013). Analyzing the Social Web. Elsevier (chapters 4, 5, 6).

1
Chapter 4

NETWORK VISUALIZATION

2
Information Visualisation
n Humans are wired to find patterns visually.
n We have a natural ability to see anomalies, patterns,
clusters, and changes.
n We recognize many of these things without consciously
looking for them.

3
Information Visualisation
n In visual form, patterns in the data can be recognized that may
otherwise be difficult to see in lists of numbers,
adjacency lists, or other textual representations of data.

n Information visualization deals with the presentation of
data in visual format.
n The data may be numeric, categorical, network data
(like social networks), text, and other types.

n Good information visualization supports users in better
understanding the data they are seeing.

4
Information Visualization
n Take advantage of humans’ natural abilities to see
Ø Patterns
Ø Anomalies
Ø Relationships
Ø Trends
Ø Clusters
n Get an overview of complex data, then explore further from
the visualization.
n Visualizations are a qualitative way to begin
understanding the data.
Ø From there, quantitative experiments or analysis can follow to
explain any insights

5
Graph Layout
n A network is made up of nodes and edges.
n How the network is laid out is critical to what an observer is
able to understand about it.

n There are many types of layout algorithms that position
the nodes and edges in different ways for network
visualization.
Ø Random layout
Ø Circular layout
Ø Grid layout
Ø Force-directed layout
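
As a minimal sketch of what a layout algorithm produces, the two simplest layouts listed above can be computed directly: each one maps every node to an (x, y) position. The function names and the five example nodes are illustrative assumptions, not part of any particular tool.

```python
import math

def circular_layout(nodes, radius=1.0):
    """Place nodes evenly spaced on a circle of the given radius."""
    n = len(nodes)
    return {
        node: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }

def grid_layout(nodes):
    """Place nodes row by row on a roughly square grid."""
    cols = math.ceil(math.sqrt(len(nodes)))
    return {node: (i % cols, i // cols) for i, node in enumerate(nodes)}

nodes = ["A", "B", "C", "D", "E"]  # hypothetical node names
circle = circular_layout(nodes)
grid = grid_layout(nodes)
print(circle["A"])  # (1.0, 0.0)
print(grid["D"])    # (0, 1)
```

Neither layout looks at the edges, which is why they are fast but rarely reveal clusters.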

6
What makes a good visualization?
Criteria from Dunne and Shneiderman (2009)

n Every node is visible.


n For every node you can count its degree.
n For every link you can follow it from source to
destination.
n Clusters and outliers are identifiable.

7
Random layout

8
Circular layout

9
Grid layout

10
Force-directed layout
The layout is dynamic and determined by the connections
between the nodes.
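
One way to picture "determined by the connections" is a simple spring model: every pair of nodes repels, and every edge pulls its two endpoints together. The sketch below is a simplified version in the spirit of spring-embedder algorithms, not the exact method of any specific tool; the force formulas, the shrinking step size, and the example graph are all assumptions chosen for illustration.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=200, max_step=0.1, seed=0):
    """Spring-embedder sketch: all pairs repel, connected pairs attract."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    pos = {v: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for v in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between neighbours

    for it in range(iterations):
        force = {v: [0.0, 0.0] for v in nodes}

        # Repulsion: every pair of nodes pushes apart (stronger when close).
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                force[u][0] += dx / dist * push
                force[u][1] += dy / dist * push
                force[v][0] -= dx / dist * push
                force[v][1] -= dy / dist * push

        # Attraction: each edge pulls its endpoints together (stronger when far).
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            force[u][0] -= dx / dist * pull
            force[u][1] -= dy / dist * pull
            force[v][0] += dx / dist * pull
            force[v][1] += dy / dist * pull

        # Move each node along its net force, capped by a shrinking step size.
        step = max_step * (1.0 - it / iterations)
        for v in nodes:
            length = math.hypot(force[v][0], force[v][1]) or 1e-9
            move = min(length, step)
            pos[v][0] += force[v][0] / length * move
            pos[v][1] += force[v][1] / length * move

    return {v: (p[0], p[1]) for v, p in pos.items()}

# Hypothetical graph: two triangles joined by a single edge C-D.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"), ("C", "D")]
layout = force_directed_layout(nodes, edges)
```

Because positions come from the edges, tightly connected groups tend to end up close together, which is what makes clusters visible.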

11
Visualizing network features
n Labels (node labels and edge labels; it is hard to show all the
labels even for a small network)

n Size, Shape, and color

n Larger graph properties (e.g., clusters)

12
Size, Shape, and color
n Showing other attributes of nodes and edges in graphs
can be easier.
n Categorical or quantitative attributes are
particularly easy to show by adjustments in size,
shape, or colour.

n There are many statistics about the nodes in a
network:
Ø degree, centrality, and so on.
Ø These can be encoded using colour, size, or both.

13
Node size and color
n A graph indicating degree with node color and clustering
coefficient with node size
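
The two statistics in this example can be computed from an edge list before they are mapped to colour and size. The sketch below assumes an undirected graph and uses the standard local clustering coefficient (the fraction of a node's neighbour pairs that are themselves connected); the edge list and the size scaling are hypothetical.

```python
from itertools import combinations

def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    """Number of neighbours of the node."""
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours that are connected to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]  # hypothetical graph
adj = build_adjacency(edges)

# Visual encoding: degree drives the colour value, clustering drives the size.
# The scaling 10 + 40 * coefficient is an arbitrary choice for illustration.
style = {
    node: {"colour_value": degree(adj, node),
           "size": 10 + 40 * clustering_coefficient(adj, node)}
    for node in adj
}
print(style["A"])  # {'colour_value': 2, 'size': 50.0}
```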

14
Edge weight
n Indicate the strength of a relationship, the
frequency of communication, or other factors.

15
Large graph properties (clusters)
n Example: YouTube videos, where nodes represent videos
and edges connect videos that share a common tag.

16
Scale Issues
n Networks with too many nodes (~10,000 or more) or edges are
almost impossible to visualize.

n A dense network may not reveal patterns.

n Filtering for visualization is crucial!

17
Example: Senate Voting Records
n Density can be a problem even when the number of nodes is
small.
Ø Shows a network of members of the U.S. Senate.
Ø Senators voted
Ø There are only 100 nodes but over 4,100 edges
Ø An edge indicates that two senators have voted the same way
at least 40% of the time.
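
To see how dense this network is, compare the number of edges with the number of possible edges in an undirected graph, n(n-1)/2. The short sketch below uses the figures from the slide; since the slide says "over 4,100" edges, the result is a lower bound.

```python
def density(num_nodes, num_edges):
    """Fraction of all possible undirected edges that are present."""
    possible = num_nodes * (num_nodes - 1) / 2
    return num_edges / possible

senate_density = density(100, 4100)
print(round(senate_density, 3))  # 0.828
```

With more than 80% of all possible edges drawn, almost every node is linked to almost every other, so the picture shows little structure.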

18
Filtering for visual patterns
n One way to compensate for this is to filter the
networks when possible.
n Filter out edges based on their weight.
n For the Senate example: keep an edge only if the two senators have voted
together on at least two-thirds of the bills.
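
Filtering by weight amounts to keeping only the edges above a threshold. A minimal sketch, assuming each edge is stored as a tuple of two senators and the fraction of bills on which they voted the same way; the senator labels and fractions are made up for illustration.

```python
def filter_edges(weighted_edges, min_weight):
    """Keep only edges whose weight is at least min_weight."""
    return [(u, v, w) for (u, v, w) in weighted_edges if w >= min_weight]

# Hypothetical data: (senator, senator, fraction of bills voted the same way).
agreement = [
    ("S1", "S2", 0.92),
    ("S1", "S3", 0.45),
    ("S2", "S3", 0.41),
    ("S3", "S4", 0.88),
]
strong_agreement = filter_edges(agreement, 2 / 3)
print(strong_agreement)  # [('S1', 'S2', 0.92), ('S3', 'S4', 0.88)]
```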

19
Visualization Tools
n Gephi

n SAS Enterprise Miner (link analysis)

n R – we will use R!

n Python

20
Chapter 5

TIE STRENGTH

21
Tie Strength
n Social relationships are complicated.

Ø The type of relationship people have will draw on many
things like their history and similarity, each person’s
personal background and preferences, environmental
factors, and more.

n Relationships are also multifaceted, and many
relationship types can be used in social network
analysis.
Ø One of the most useful is the idea of tie strength

22
Tie Strength
n Measure of the strength of a relationship
between people (Mark Granovetter, 1973)

n ‘The strength of a tie is a combination of the
amount of time, the emotional intensity, the
intimacy (mutual confiding), and the reciprocal
services which characterize the tie’

23
Tie Strength
n Two main types:
Ø Strong ties are rare and trusted, and are usually with family
members or very close friends.
u Usually people a person sees frequently, with whom one
shares personal details of one’s life, and for whom the
person will do and expect favours.
Ø Weak ties are much more common and include
acquaintances and more casual friendships.
u Co-workers or people who you know from a class, but you
don’t spend a lot of time with

n Tie strength is a spectrum, and any relationship
may fall anywhere along the scale from weak to strong.

24
The Strength of Weak Ties
n Tie strength is a very important factor to
consider in social network analysis.
n Consider the flow of information through a
network
Ø Weak ties often connect to diverse groups of people
with different perspectives
Ø These ties allow information to move throughout the
network
n E.g. A disease.
Ø Someone is more likely to catch a cold from a weak tie
Ø But because of the high level of close contact, it will
likely spread quickly to one’s strongest connections.

25
The Strength of Weak Ties
n This is NOT to say that tie strength is the only factor
influencing trust, reliability, and closeness in
social networks.

n Weak ties may provide highly trusted
information.

Ø E.g., a physician may be more trusted about medical
information than someone’s family members.

Ø The authority of the physician outweighs tie strength.

26
Replicating Milgram’s ‘six degrees’
n Booklets are sent from the original participants to a
target person they do not know.

n Lin et al. (1978) showed that successful chains
made heavy use of weak ties.

27
The benefit of weak ties
n Connect people to different social circles,
exposing them to more information.
n A person has more weak ties than strong ties.

28
Tie strength and network structure
n To analyse tie strength in social network
analysis, the network must include relationship
information.

29
Network Structure – Forbidden Triad
n What does that tell us about the relationship between
Bob and Chuck?
Ø Cannot draw any absolute conclusions
Ø Some sort of tie exists between B and C, either strong
or weak.
n Counter-example: A is married to B and having
an affair with C?
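
A forbidden triad is usually defined as follows: A has strong ties to both B and C, yet there is no tie at all between B and C. Assuming that definition, and assuming ties are stored as a dictionary from unordered pairs to "strong" or "weak" (a representation chosen here for illustration), such triads can be listed like this:

```python
from itertools import combinations

def forbidden_triads(ties):
    """ties maps frozenset({x, y}) to "strong" or "weak".

    Returns (centre, b, c) triples where centre has strong ties to b and c
    but b and c share no tie of any strength.
    """
    strong_neighbours = {}
    for pair, strength in ties.items():
        if strength == "strong":
            x, y = tuple(pair)
            strong_neighbours.setdefault(x, set()).add(y)
            strong_neighbours.setdefault(y, set()).add(x)

    found = []
    for centre in sorted(strong_neighbours):
        for b, c in combinations(sorted(strong_neighbours[centre]), 2):
            if frozenset((b, c)) not in ties:
                found.append((centre, b, c))
    return found

ties = {
    frozenset(("A", "B")): "strong",
    frozenset(("A", "C")): "strong",
}
print(forbidden_triads(ties))  # [('A', 'B', 'C')]

# Adding even a weak tie between B and C removes the forbidden triad.
ties[frozenset(("B", "C"))] = "weak"
print(forbidden_triads(ties))  # []
```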

30
Network Structure – Bridge
n Many forbidden triads can be found, e.g. PFO,
PFH, and PFN.
n If there is a weak tie P-O, the edge between P
and F would no longer be a bridge.
n Conclusion: no strong tie is a bridge!!!
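
A bridge is an edge whose removal disconnects its two endpoints. The sketch below checks this with a breadth-first search. The edge list is a hypothetical reconstruction that reuses the node names P, F, O, H, and N from the slide (Q is invented); the actual figure may differ.

```python
from collections import deque

def connected(adj, start, goal):
    """Breadth-first search: is there any path from start to goal?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def is_bridge(edges, edge):
    """An edge is a bridge if removing it disconnects its endpoints."""
    u, v = edge
    adj = {}
    for a, b in edges:
        if {a, b} == {u, v}:
            continue  # leave out the edge being tested
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return not connected(adj, u, v)

# Hypothetical network: P-F links P's side (Q) to F's side (O, H, N).
edges = [("P", "F"), ("F", "O"), ("F", "H"), ("F", "N"), ("P", "Q")]
print(is_bridge(edges, ("P", "F")))                 # True
print(is_bridge(edges + [("P", "O")], ("P", "F")))  # False
```

The second call mirrors the argument on the slide: once a tie P-O exists, there is another route between P and F, so P-F is no longer a bridge.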

31
Tie Strength and Propagation
n Tie strength
Ø Strong tie – more trusted
Ø Weak tie – wider spread

n Network propagation
Ø a phenomenon where things spread through a network

u Diseases spreading through a social network,


u Computer viruses on the Internet, or
u Rumours and fads through a social network

32
Chapter 6

TRUST

33
Definition – trust
n Trust is a relationship with which we are all
familiar, but which we rarely define or describe.
Ø Loan money
u We expect the person will pay us back
Ø Ask for a recommendation
u The person's recommendation will match our taste and the
movie or restaurant or hotel will be good
Ø Tell a secret
u The person will keep a secret, not tell others, and not
judge us for it
Ø Ask for a recommendation or reference
u The recommendation will be positive and help us get the
position we are applying for

34
General definition
n Trust is putting oneself in a vulnerable position
based on the belief that another person will act
with our best interest in mind.

n Definition: ‘A person trusts another if she is
willing to take a risk based on her expectation
that the trusted person’s actions will lead to a
positive outcome’

35
Development of trust
n Calculation-based trust:
Ø A rational decision about whether to trust someone,
where the costs and benefits of trusting are factored in.
n Personality-based trust:
Ø A person's propensity to trust, developed over the
course of their life.
n Cognition-based trust:
Ø The instant rapport and trust that can develop between
people who share similar backgrounds, beliefs, and
values
n Institution-based trust:
Ø How trust may form in the presence of guarantees and
protections offered by an institution.
36
Asymmetry
n Trust is not necessarily identical in both directions.

n Extreme example: parents and children
Ø A child must have almost absolute trust in his parent
while the parent may have almost no trust in a child

n Most trust is mutual but at different levels.
Ø E.g. Bosses and employees
Ø Employees tend to trust superiors more

37
Context and Time
n Trust will vary among contexts
Ø I may trust Bob to recommend a restaurant, but not to
repair my car.
n Trust can also transfer between contexts
Ø I may build trust in a co-worker that is entirely in work
context, but later trust that person to recommend a
plumber.

n Trust changes over time
Ø People tend to develop trust over time, but trust may
disappear completely if there is one dramatic failure.

38
Measuring trust
n Measuring trust is important but difficult.

1. A person’s propensity to trust.
Ø This can be measured with a simple experiment
called the Investment Game.

2. One person’s decision about the other person

39
Trust in social media
n Apply these same estimates to people we know
online.

n Ask people explicitly to rate trust in others


Ø Customer rating

n Issue: most people online are strangers
Ø Find some way to leverage information that people have
shared about the trustworthiness of others to infer how
much one person may trust a stranger

n Example: eBay
40
Trust inference
n Infer trust between two unknown people using
network structure.
n If A-B have trust, and B-C have trust, how much
should A trust C?
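[Figure: chain A – B – C, with the A–B edge labelled $t_{AB}$ and the B–C edge labelled $t_{BC}$; the value to infer is $t_{AC}$.]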

41
Trust inference algorithm
n Network-based Inference
Ø Use network structure to infer trust

Ø Example approach
u Find neighbours who are trusted.
u Ask them how much to trust the stranger.
u Average their responses weighted by how much we trust
each neighbour.
u Neighbours repeat this if they do not know the stranger.

Ø Computer scientists have developed many algorithms to do this.
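
The example approach above can be sketched as a recursive weighted average. This is a simplified illustration, not any specific published algorithm. It assumes trust values lie between 0 and 1 and are stored as a nested dictionary; the names, the values, and the depth limit are all hypothetical.

```python
def infer_trust(trust, source, sink, max_depth=3, _visited=None):
    """Estimate how much source should trust sink.

    trust[x][y] is how much x directly trusts y (assumed to be in [0, 1]).
    Returns None when no estimate can be made within max_depth steps.
    """
    visited = (_visited or frozenset()) | {source}
    direct = trust.get(source, {})

    # If the source already knows the sink, use the direct rating.
    if sink in direct:
        return direct[sink]
    if max_depth == 0:
        return None

    # Otherwise ask each trusted neighbour, who repeats the process if needed.
    weighted_sum = 0.0
    total_weight = 0.0
    for neighbour, t in direct.items():
        if neighbour in visited or t <= 0:
            continue
        opinion = infer_trust(trust, neighbour, sink, max_depth - 1, visited)
        if opinion is None:
            continue
        # Weight each answer by how much we trust the neighbour who gave it.
        weighted_sum += t * opinion
        total_weight += t

    return weighted_sum / total_weight if total_weight else None

# Hypothetical ratings: A knows B and D, both of whom know C.
trust = {
    "A": {"B": 0.9, "D": 0.3},
    "B": {"C": 0.8},
    "D": {"C": 0.2},
}
estimate = infer_trust(trust, "A", "C")
print(round(estimate, 2))  # 0.65
```

Here the estimate is (0.9 × 0.8 + 0.3 × 0.2) / (0.9 + 0.3) = 0.65, so the highly trusted neighbour B dominates the answer.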

42
Network based inference
n Inferring over many paths?
n Favor highly trusted connections and short paths over
long ones
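
One simple way to favour short, highly trusted paths, assumed here purely for illustration, is to score each path by the product of the trust values along it: every extra step multiplies by a number no greater than 1, so longer or weaker paths score lower. The network and values below are hypothetical.

```python
def best_path_trust(trust, source, sink, max_len=4):
    """Highest product of trust values over simple paths of at most max_len edges."""
    best = 0.0
    stack = [(source, 1.0, {source})]  # (current node, strength so far, nodes on path)
    while stack:
        node, strength, seen = stack.pop()
        for nxt, t in trust.get(node, {}).items():
            if nxt in seen:
                continue  # do not revisit nodes on the current path
            s = strength * t
            if nxt == sink:
                best = max(best, s)
            elif len(seen) < max_len:
                stack.append((nxt, s, seen | {nxt}))
    return best

# Hypothetical network: a two-step path and a three-step path from A to C.
trust = {
    "A": {"B": 0.9, "D": 0.9},
    "B": {"C": 0.9},
    "D": {"E": 0.9},
    "E": {"C": 0.9},
}
best = best_path_trust(trust, "A", "C")
print(round(best, 3))  # 0.81
```

The two-step path A-B-C scores 0.81, while the three-step path A-D-E-C scores 0.729, so the shorter path wins even though every edge has the same trust value.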

43
Similarity based trust inference
n Research has shown that people who trust one
another tend to be similar (Ziegler and Golbeck,
2007).

Ø A person will trust his friend about movies if they have
similar taste.

Ø A parent will trust a babysitter to watch her child if they
have similar ideas about the appropriate way to care for
the child and respond in an emergency.
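
If similar people tend to trust one another, a similarity score can serve as a rough stand-in for trust when no ratings of people are available. The sketch below compares movie ratings on an assumed 1 to 5 scale; the measure (one minus the normalised mean absolute difference) and the data are illustrative choices, not the measure used in the cited research.

```python
def rating_similarity(ratings_a, ratings_b, max_diff=4.0):
    """Similarity in [0, 1] based on items both people rated.

    max_diff is the largest possible gap between two ratings
    (4.0 for an assumed 1-5 scale). Returns None if nothing is shared.
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return None
    mean_diff = sum(abs(ratings_a[m] - ratings_b[m]) for m in shared) / len(shared)
    return 1.0 - mean_diff / max_diff

# Hypothetical movie ratings.
alice = {"Film1": 5, "Film2": 1, "Film3": 4}
bob = {"Film1": 4, "Film2": 2, "Film3": 4}
carol = {"Film1": 1, "Film2": 5}

print(round(rating_similarity(alice, bob), 3))  # 0.833
print(rating_similarity(alice, carol))          # 0.0
```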

44
Application of Trust
n Once trust is computed, how can we use it?

n Filtering information
Ø e.g. show reviews only from the most trusted people

45
Application of Trust
n Sorting Information
Ø Show Facebook posts from my most trusted friends
first, and least trusted friends last

n Aggregating Information
Ø Give more weight to restaurant ratings from trustworthy
people and less weight to lower-trust people when
computing an average rating.
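
The three applications (filtering, sorting, and aggregating) can each be written in a few lines once a trust value is available for every reviewer. The reviewers, ratings, trust values, and threshold below are hypothetical.

```python
def filter_by_trust(reviews, trust, threshold):
    """Filtering: keep only reviews from people trusted at or above threshold."""
    return [(who, r) for who, r in reviews if trust.get(who, 0.0) >= threshold]

def sort_by_trust(reviews, trust):
    """Sorting: most trusted reviewers first, least trusted last."""
    return sorted(reviews, key=lambda item: trust.get(item[0], 0.0), reverse=True)

def trust_weighted_average(reviews, trust):
    """Aggregating: average rating with each rating weighted by trust."""
    total_weight = sum(trust.get(who, 0.0) for who, _ in reviews)
    if total_weight == 0:
        return None
    return sum(trust.get(who, 0.0) * r for who, r in reviews) / total_weight

# Hypothetical restaurant ratings (1-5) and trust in each reviewer (0-1).
reviews = [("Ben", 2), ("Ann", 5), ("Cat", 4)]
trust = {"Ann": 0.9, "Ben": 0.1, "Cat": 0.5}

print(filter_by_trust(reviews, trust, 0.5))  # [('Ann', 5), ('Cat', 4)]
print(sort_by_trust(reviews, trust))         # [('Ann', 5), ('Cat', 4), ('Ben', 2)]
print(round(trust_weighted_average(reviews, trust), 2))  # 4.47
```

The plain average of these ratings is about 3.67; weighting by trust raises it to about 4.47 because the low rating comes from the least trusted reviewer.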

46
See You!!!

47
