Computational Journalism
Columbia Journalism School
Week 3: Filters as Editors
The Twitter Network
Twitter follower network
We have crawled the entire Twitter site and obtained 41.7 million
user profiles, 1.47 billion social relations, 4,262 trending topics,
and 106 million tweets. In its follower-following topology analysis
we have found a non-power-law follower distribution, a short
effective diameter, and low reciprocity, which all mark a
deviation from known characteristics of human social networks
Kwak et al., What is Twitter, a Social Network or a News Media? (2010)
More followings than followers
Small average distance between nodes
It's a news network - hubs
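Reciprocity and hub structure can be measured directly from a list of follow edges. The sketch below uses one common edge-based definition of reciprocity (the share of follow edges that are followed back), which may differ from the exact measure in Kwak et al. The account names and edges are made-up toy data.

```python
from collections import Counter

def reciprocity(edges):
    """Fraction of directed (follower, followed) edges whose reverse edge also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (a, b) in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)

def follower_counts(edges):
    """Number of distinct followers per account (in-degree)."""
    return Counter(followed for (_, followed) in set(edges))

# Hypothetical toy network: three people follow a news account that follows nobody back.
follows = [
    ("alice", "newsdesk"),
    ("bob", "newsdesk"),
    ("carol", "newsdesk"),
    ("alice", "bob"),
    ("bob", "alice"),
]

print(reciprocity(follows))                     # share of edges that are mutual
print(follower_counts(follows).most_common(1))  # the hub account
```

In this toy graph only the alice/bob pair is mutual, so most edges point one way toward the hub, which is the pattern the slide describes as news-like rather than social.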
Twitter vs. Newswire timings
Different uses.
why?
- Zeynep Tufekci, What Happens to #Ferguson Affects Ferguson:
Net Neutrality, Algorithmic Filtering and Ferguson
Data from SocialReach, which works with many publishers
John McDermott, Why Facebook is for ice buckets, Twitter is for Ferguson
Sunita, Why #Ferguson broke out on Twitter, not Facebook
Information flow on Facebook
Filtering News on Twitter
Reuters News Tracer
Filter -> Cluster into events -> Score veracity & newsworthiness -> Searches and Alerts
Liu et al., Reuters Tracer: A Large Scale System of Detecting &
Verifying Real-Time News Events from Twitter
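A filter, cluster, score pipeline of the general kind the diagram names can be sketched in a few lines. This is not the Reuters Tracer implementation: the stopword list, the Jaccard threshold, and the use of cluster size as a stand-in for a newsworthiness score are all assumptions made for illustration, and veracity scoring is not attempted.

```python
import re

# Hypothetical, tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "in", "on", "at", "of", "is", "to", "and"}

def tokens(text):
    """Lowercased set of content words in a tweet."""
    return {w for w in re.findall(r"[a-z0-9#]+", text.lower()) if w not in STOPWORDS}

def keep(text, min_tokens=3):
    """Filter step: drop tweets with too few content words (assumed noise rule)."""
    return len(tokens(text)) >= min_tokens

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_tweets(texts, threshold=0.3):
    """Cluster step: greedily attach each tweet to the most similar cluster."""
    clusters = []  # each cluster: {"terms": set of words, "tweets": list of texts}
    for text in texts:
        t = tokens(text)
        best = max(clusters, key=lambda c: jaccard(t, c["terms"]), default=None)
        if best is not None and jaccard(t, best["terms"]) >= threshold:
            best["tweets"].append(text)
            best["terms"] |= t
        else:
            clusters.append({"terms": set(t), "tweets": [text]})
    return clusters

def alerts(texts, min_size=2):
    """Score step: rank clusters by size and alert on those with enough tweets."""
    clusters = cluster_tweets([t for t in texts if keep(t)])
    ranked = sorted(clusters, key=lambda c: len(c["tweets"]), reverse=True)
    return [c for c in ranked if len(c["tweets"]) >= min_size]

# Made-up example stream.
stream = [
    "Explosion reported near central station",
    "lol",
    "Huge explosion near central station, police on scene",
    "Good morning everyone",
    "Police confirm explosion at central station",
    "New phone review is up on the blog",
]

for event in alerts(stream):
    print(len(event["tweets"]), "tweets:", event["tweets"][0])
```

A production system would replace each stage with learned models; the point here is only the shape of the pipeline, where most tweets are discarded before anything is scored.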
Problems with Filters
The Echo Chamber
[Echo chambers are] those Internet spaces where like-minded
people listen only to those people who already agree with them.
...
While most of us had assumed that the Internet would increase
the diversity of opinion, the echo chamber meme says the Net
encourages groups to form that increase the homogeneity of
belief. This isn't simply a factual argument about the topography
carved by traffic and links. A "tut, tut" has been appended: See,
you Web idealists have been shown up: humankind's social
nature sucks, just as we always told you!
This creates this kind of a feedback loop in which your media influences
your preferences and your choices; your choices influence your media;
and you really can go down a long and narrow path, rather than
actually seeing the whole set of issues in front of us.
- Eli Pariser,
How do we recreate a front-page ethos for a digital world?
The (Algorithmic) Filter Bubble
- Ethan Zuckerman,
Playing the Internet with PMOG
Information and Disinformation
The five clusters of users who made #TrumpWon trend after the first presidential
debate. Gilad Lotan, 2016
Human-Machine Filters
Different Filtering Systems
Content:
Newsblaster analyzes the topics in the documents.
No concept of users.
Social:
What I see on Twitter determined by who I follow.
Reddit comments filtered by votes as input.
Amazon "people who bought X also bought Y" - no content analysis.
Hybrid:
Recommend based both on content and user behavior.
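The "people who bought X also bought Y" style of social filter needs only co-occurrence counts, with no analysis of what the items are. A minimal sketch, assuming purchase history is available as a list of baskets (the basket data and the `also_bought` helper are hypothetical):

```python
from collections import Counter

def also_bought(baskets, item, top_n=3):
    """Items most often bought together with `item`, using co-occurrence only."""
    counts = Counter()
    for basket in baskets:
        items = set(basket)
        if item in items:
            counts.update(items - {item})
    # Sort by count (descending), then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]

# Made-up purchase history.
baskets = [
    ["camera", "tripod", "sd_card"],
    ["camera", "sd_card"],
    ["camera", "tripod", "lens"],
    ["novel", "bookmark"],
]

print(also_bought(baskets, "camera", top_n=2))
```

The function never looks at item descriptions, which is what separates it from a content filter; a hybrid system would combine a score like this with a text-based similarity.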
TechMeme / MediaGazer
Facebook trending (with editors)
Facebook trending (without editors)
Facebook trending review tool screenshot from leaked documents
Filter Design
Item Content: text analysis, topic modeling, clustering...
My Data: social network structure, who I follow
Other Users' Data: other users' likes
Filter design problem
Formally, given
U = user preferences, history, characteristics
S = current story
{P} = results of function on previous stories
{B} = background world knowledge (other users?)
Define
r(S,U,{P},{B}) in [0...1]
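One way to make the signature concrete is a weighted mix of a content match, a novelty term, and a social signal. Everything in this sketch is an assumption chosen for illustration: stories and users are represented as sets of topic words, {P} as a list of (topics, score) pairs for earlier stories, {B} as a map from topic word to the share of other users engaging with it, and the weights are arbitrary.

```python
def overlap(a, b):
    """Jaccard overlap between two sets of topic words, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def relevance(story, user, previous, background, weights=(0.5, 0.2, 0.3)):
    """Sketch of r(S, U, {P}, {B}) returning a value in [0, 1].

    story      -- S: set of topic words for the current story
    user       -- U: set of topic words the user has shown interest in
    previous   -- {P}: list of (topics, score) pairs for earlier stories
    background -- {B}: dict of topic word -> share of other users engaging, in [0, 1]
    weights    -- hypothetical mixing weights (content, novelty, social), summing to 1
    """
    w_content, w_novelty, w_social = weights
    content = overlap(story, user)
    # Penalize stories that resemble ones already scored highly.
    seen = max((overlap(story, topics) * score for topics, score in previous), default=0.0)
    novelty = 1.0 - seen
    social = max((background.get(t, 0.0) for t in story), default=0.0)
    r = w_content * content + w_novelty * novelty + w_social * social
    return min(1.0, max(0.0, r))

# Made-up inputs.
user = {"ferguson", "police", "protest"}
background = {"icebucket": 0.9, "ferguson": 0.2}
story_a = {"ferguson", "protest"}
story_b = {"celebrity", "icebucket"}

print(relevance(story_a, user, [], background))
print(relevance(story_b, user, [], background))
```

Shifting the weights toward the social term makes the popular but off-interest story win, which shows how a design choice in r changes what the user sees.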