
An introduction to social network analysis

June 11, 2008

David Lazer
Program on Networked Governance
Kennedy School of Government
Harvard University
 Definition:
 Paradigmatic focus on relationships
 Emergent/self-organized interconnected forms
 Units are indeterminate
 Key issues:
 How does the configuration of networks affect how
individuals and systems function?
 How to study networks?
A brief history of the study of networks
 Roots in sociology and anthropology
going back to early days of those fields
 Sociometry in 1930s (Moreno)
 Substantial interest in social
psychology in 1940s-1960s (Festinger,
Milgram, Newcomb)
 Economic sociology 1970s-present
(Granovetter, White, Uzzi)
 Explosion of social capital research in
1990s (Putnam, Burt)
 Invasion of the physicists (Watts,
Barabasi, Newman)
Networks in political science
 Examples go back (at least) to 1938.
 Applications to:
 Legislative processes
 Public opinion
 IR
 Interest groups
 But until very recently, a very thin research tradition:
it does not fit into dominant paradigms
Network analysis
 Overview of foci of current social network research
 Research design
 Some examples of applications
Multiple levels of network analysis
 Systems level: what network structures
function well for what tasks?
 Positional level: how does the individual's
position in the network affect that individual?
 Relational level: what drives the
configuration of the network?
Overview of some social
network “ideas”
 How do networks affect how systems and
individuals function?

 How are networks structured?

How do networks affect
systemic and individual
 Regulation
 Circulation
 Coordination
 Control
Regulation vs Circulation

Coordination and control:
centralized vs decentralized
Network structure
 Small worlds (Milgram; Watts and Strogatz)
 Scale free networks (Barabasi, Stanley)
 Homophily (Merton, Lazarsfeld)
Small world networks
Scale free networks
Homophily: birds of a feather
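One simple way to put a number on homophily is to compare the share of observed ties that join same-group people with the share of same-group pairs among all possible pairs. The sketch below assumes a hypothetical five-person network with a made-up `group` attribute; it is an illustration, not a measure taken from the cited authors.

```python
from itertools import combinations

# Hypothetical toy data: node -> group label, plus undirected ties.
group = {"a": "red", "b": "red", "c": "red", "d": "blue", "e": "blue"}
ties = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("c", "d")]

def same_group_share(pairs, group):
    """Fraction of the given pairs whose two endpoints share a group label."""
    pairs = list(pairs)
    same = sum(1 for u, v in pairs if group[u] == group[v])
    return same / len(pairs)

# Share of actual ties that are within-group.
observed = same_group_share(ties, group)
# Baseline: share of within-group pairs among all possible pairs of nodes.
baseline = same_group_share(combinations(group, 2), group)

print(observed, baseline)
```

An observed share well above the baseline is consistent with homophily, although it does not by itself say whether similarity produced the ties or the ties produced similarity.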
How to do social network analysis
 Types of data
 Research foci
 Design issues
Types of network data
 One mode vs two mode
 Whole network vs egocentric
 Different types of relationships
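To make the whole-network vs egocentric distinction concrete, the sketch below extracts one person's egocentric network (the ego, the ego's direct contacts, and the ties among them) from whole-network data. The names and ties are hypothetical.

```python
# Hypothetical whole-network data: undirected ties among five people.
ties = {("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
        ("cat", "dan"), ("dan", "eve")}

def ego_network(ties, ego):
    """Return (alters, ties among ego and alters) for one focal node."""
    # Alters are everyone directly tied to the ego, in either tie direction.
    alters = {v for u, v in ties if u == ego} | {u for u, v in ties if v == ego}
    members = alters | {ego}
    # Keep only ties whose two endpoints are both the ego or an alter.
    kept = {(u, v) for u, v in ties if u in members and v in members}
    return alters, kept

alters, ego_ties = ego_network(ties, "ann")
print(sorted(alters), sorted(ego_ties))
```

Egocentric data collected by survey usually contains only this neighborhood, so anything that depends on ties beyond the alters (such as dan and eve here) is not observable.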
One mode vs two mode

 One mode: person to person
 Two mode: person to event

[Figure: example diagrams with nodes labeled Jack and Jill]
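Two-mode (person to event) data are often converted to one-mode (person to person) data by tying two people whenever they share an event. The sketch below assumes a hypothetical attendance table; Jack and Jill come from the slide, while Joan and the event names are invented for illustration.

```python
from itertools import combinations

# Hypothetical two-mode data: person -> set of events attended.
attendance = {
    "Jack": {"picnic", "seminar"},
    "Jill": {"seminar", "hike"},
    "Joan": {"hike"},
}

def project_to_one_mode(attendance):
    """Person-to-person ties weighted by the number of shared events."""
    weights = {}
    for p, q in combinations(sorted(attendance), 2):
        shared = len(attendance[p] & attendance[q])
        if shared:  # no shared event means no tie
            weights[(p, q)] = shared
    return weights

one_mode = project_to_one_mode(attendance)
print(one_mode)
```

The projection discards which event produced each tie, so it is usually worth keeping the original two-mode table alongside it.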


Whole network

Network visualization of Members who traveled at least 10 days together (Williams 2006)

From: Assessing the Social and Behavioral Science Base for HIV/AIDS Prevention and Intervention: Workshop Summary (1995)
Types of relationships
 Communication
 Affection
 Advice
 Proximity
 Power
 Multiplexity of relationships
 What kind of data might be appropriate?
 Survey
 Any communication, meeting
 Proximity
 Affiliation
 Behavioral
 Ramifications of missing/noisy data
 Some network measures degrade more than others
The coming revolution in
observational data

Impact of removal of links from 7m-person mobile phone network: weak vs strong ties

Structure and tie strengths in mobile communication networks by J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, PNAS 104, 7332-7336 (2007)
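The idea behind a link-removal experiment can be sketched on a toy network: remove ties one at a time, either weakest first or strongest first, and track the size of the largest connected component. The network below (two strongly tied triangles joined by one weak bridge) and all function names are hypothetical; this is not the data or procedure of the cited paper.

```python
def largest_component(nodes, edges):
    """Size of the largest connected component (iterative depth-first search)."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node] - seen:
                seen.add(nb)
                stack.append(nb)
        best = max(best, size)
    return best

def removal_curve(nodes, weighted_edges, weakest_first):
    """Largest-component size after each successive tie removal."""
    order = sorted(weighted_edges, key=lambda e: e[2], reverse=not weakest_first)
    remaining = [(u, v) for u, v, _ in order]
    curve = []
    while remaining:
        remaining.pop(0)  # drop the next tie in the chosen order
        curve.append(largest_component(nodes, remaining))
    return curve

nodes = list("abcdef")
weighted_edges = [
    ("a", "b", 5), ("b", "c", 5), ("a", "c", 5),  # strong-tie cluster
    ("d", "e", 5), ("e", "f", 5), ("d", "f", 5),  # strong-tie cluster
    ("c", "d", 1),                                # weak bridge
]
weak_first = removal_curve(nodes, weighted_edges, weakest_first=True)
strong_first = removal_curve(nodes, weighted_edges, weakest_first=False)
print(weak_first, strong_first)
```

In this toy case, removing the single weak bridge splits the network in half immediately, while removing strong ties first shrinks the largest component gradually.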
Research foci
 Individual level
 System level
 Network structure
 Micro to macro, using computational models
Analysis I: individual-level
 Impact of being in a particular place in the network
 E.g., impact of centrality (degree, reachability,
betweenness) on outcomes; benefit of brokering
between other actors (Burt)
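As a rough illustration of two of the centrality measures named above, the sketch below computes degree and unnormalized betweenness for a small undirected network, using a Brandes-style accumulation over breadth-first searches. The five-person network is hypothetical; "bob" and "cat" sit on the only paths between otherwise unconnected people.

```python
from collections import deque

def degree(adj):
    """Number of direct ties per node."""
    return {n: len(nbrs) for n, nbrs in adj.items()}

def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    score = {n: 0.0 for n in adj}
    for s in adj:
        dist = {s: 0}
        paths = {n: 0 for n in adj}   # number of shortest paths from s
        paths[s] = 1
        preds = {n: [] for n in adj}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward.
        dep = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    # Each unordered pair was counted once from each endpoint.
    return {n: b / 2 for n, b in score.items()}

# Hypothetical network: ann - bob - cat, with cat also tied to dan and eve,
# and dan tied to eve.
adj = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan", "eve"},
    "dan": {"cat", "eve"},
    "eve": {"cat", "dan"},
}
deg = degree(adj)
btw = betweenness(adj)
print(deg, btw)
```

Here dan and eve have the same degree as bob, but only bob lies between other actors, which is the kind of positional difference a brokerage argument relies on.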
Analysis II: system-level
 Impact of structure of overall network
 E.g., density of connections has an inverse-U
relationship with performance in creative settings
(Leenders, Uzzi)
 Impact of centralization on signal aggregation
 Various research on small groups
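Density, the system-level measure mentioned above, is commonly defined for an undirected network as the share of possible ties that are present, 2m / (n(n-1)) for n nodes and m ties. A minimal sketch, with hypothetical example values:

```python
def density(n_nodes, n_ties):
    """Share of possible undirected ties that are present: 2m / (n(n-1))."""
    if n_nodes < 2:
        return 0.0  # no pairs exist, so treat density as zero
    return 2 * n_ties / (n_nodes * (n_nodes - 1))

sparse = density(5, 4)   # e.g. a chain of five people
full = density(5, 10)    # everyone tied to everyone
print(sparse, full)
```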
Analysis III: network structure
 Dyadic correlates of the configuration of the network
 E.g., homophily, distance (McPherson)
 Mid-level features (e.g., triads, quads, etc)
 Structure “reduction” (Newman, Frank)
 Descriptions of overall structure
 Degrees of separation, clustering (small world)
 Degree distribution (scale free)
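The three descriptions of overall structure listed above can each be computed directly from an adjacency list. The sketch below uses a hypothetical four-person network (a triangle plus one pendant node) and common textbook definitions; other definitions of clustering and path length exist.

```python
from collections import Counter, deque
from itertools import combinations

def degree_distribution(adj):
    """Map degree k -> number of nodes with exactly k ties."""
    return dict(Counter(len(nbrs) for nbrs in adj.values()))

def average_clustering(adj):
    """Mean share of each node's neighbor pairs that are themselves tied.
    Nodes with fewer than two neighbors contribute zero."""
    total = 0.0
    for nbrs in adj.values():
        pairs = list(combinations(nbrs, 2))
        if pairs:
            total += sum(1 for u, v in pairs if v in adj[u]) / len(pairs)
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS)."""
    total = count = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

# Hypothetical network: triangle a-b-c with d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
dist_k = degree_distribution(adj)
clustering = average_clustering(adj)
separation = average_path_length(adj)
print(dist_k, clustering, separation)
```

A small-world description looks for high clustering together with short average path length; a scale-free description looks at the shape of the degree distribution, which needs far more nodes than this toy example to assess.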
Analysis IV: computational
 In emergent processes, a snapshot may not
reflect micro-level processes (Schelling)
 Agent-based modeling: very simple
assumptions about behavior, which
(sometimes) yield surprising systemic
 Ex: my work on the social structure of
exploration and exploitation
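To show what "very simple assumptions about behavior" can look like in code, here is a minimal Schelling-style sketch on a ring of cells. It is not the exploration and exploitation model mentioned above; the neighborhood size, threshold, population counts, and random seed are all assumptions chosen for illustration.

```python
import random

def same_type_share(cells, i):
    """Share of agent i's occupied neighbors (two cells on each side of a
    ring) that have i's type, or None if all four neighbors are empty."""
    n = len(cells)
    occupied = [cells[(i + d) % n] for d in (-2, -1, 1, 2)
                if cells[(i + d) % n] is not None]
    if not occupied:
        return None
    return sum(t == cells[i] for t in occupied) / len(occupied)

def step(cells, threshold, rng):
    """Move one randomly chosen unhappy agent to a random empty cell.
    Returns False when nobody wants to move or no cell is free."""
    movers = []
    for i, t in enumerate(cells):
        if t is None:
            continue
        share = same_type_share(cells, i)
        if share is not None and share < threshold:
            movers.append(i)
    empties = [i for i, t in enumerate(cells) if t is None]
    if not movers or not empties:
        return False
    src, dst = rng.choice(movers), rng.choice(empties)
    cells[dst], cells[src] = cells[src], None
    return True

def mean_similarity(cells):
    """Average same-type neighbor share across agents that have neighbors."""
    shares = [same_type_share(cells, i)
              for i, t in enumerate(cells) if t is not None]
    shares = [s for s in shares if s is not None]
    return sum(shares) / len(shares) if shares else 0.0

rng = random.Random(0)  # fixed seed so the run is repeatable
cells = ["A"] * 20 + ["B"] * 20 + [None] * 10
rng.shuffle(cells)
before = mean_similarity(cells)
for _ in range(2000):
    if not step(cells, threshold=0.5, rng=rng):
        break
after = mean_similarity(cells)
print(before, after)
```

Comparing `before` and `after` across seeds and thresholds is the kind of experiment such models invite; whether and how strongly sorting emerges depends on the parameters.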
Network visualization

When useful?
 To see unanticipated patterns
 More useful for small networks and
egocentric networks
 Tools for pattern recognition in larger networks
 Powerful tools for presenting ideas
 Software: Netdraw, Pajek, Visone
Networks among State Health Officials

1. Grey ties = overlapping ties
2. Red = Talk in general
3. Black = Pandemic preparedness
4. Dark grey ties = Professional

5. Region 1: red
6. Region 2: blue
7. Region 3: black
8. Region 4: grey
9. Region 5: pink (Territories)
Flight patterns movie (Aaron Koblin)
Research design
 Statistical challenges
 Design challenges
Statistical challenges
 Interdependence of observations
 For example, if A talks to B, and B to C, is it
more likely that A talks to C (transitivity)?
 Statistical methods to deal with these
interdependencies (QAP, P*, ERGM)
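The logic of QAP can be sketched in a few lines: correlate two networks measured on the same people, then rebuild the reference distribution by relabeling the nodes of one network (rows and columns permuted together), which preserves its structure while breaking any alignment with the other. The sketch below enumerates every permutation, which is only feasible for very small networks; the "talk" and "advice" matrices are hypothetical, and P* / ERGM are different methods not shown here.

```python
from itertools import permutations

def offdiag(m):
    """Flatten the off-diagonal cells of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def qap_test(x, y):
    """Return (observed correlation, share of node relabelings of y whose
    correlation with x is at least as large as the observed one)."""
    n = len(x)
    observed = pearson(offdiag(x), offdiag(y))
    as_extreme = total = 0
    for perm in permutations(range(n)):
        # Permute rows and columns together so y's structure is preserved.
        permuted = [[y[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        r = pearson(offdiag(x), offdiag(permuted))
        total += 1
        as_extreme += r >= observed - 1e-12
    return observed, as_extreme / total

# Hypothetical networks among the same four people.
talk = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
advice = [[0, 1, 1, 0],
          [1, 0, 1, 0],
          [1, 1, 0, 1],
          [0, 0, 1, 0]]
obs, p = qap_test(talk, advice)
print(obs, p)
```

With realistic network sizes the permutations are sampled at random rather than enumerated, since the number of relabelings grows factorially.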
Design challenges
 The causal nexus–
whither the causal
 Network to node?
 Node to network?
 Omitted variable driving
 The value of control
 The value of longitudinal data
studying social influence
 How to dissect cause and effect of social influence
 Problem of unobserved heterogeneity
 Some roommate studies
 Study of policy school students
Keys to studying social
influence in this study
 Longitudinal data
 Measurement of views at inception of system
 Implausibility of alternative explanation that
network is related to unobserved heterogeneity
The network of influence

triangle=section 1
square/diamond = section 2
circle/octagon = section 3

dark blue = 1 (extremely liberal)

blue = 2
light blue = 3
gray = 4
pink = 5
red = 6 (conservative)

larger = became more conservative
smaller = became more liberal
in-between = no change
Paradigm shortcomings
 Until recently, almost all work was based on
snapshots of small systems.
 Lack of attention to causal nexus
 Lots of attention on flow in networks, but little
data on actual flow
 Relational focus obscures interplay of
nodal-level factors and network