of Computational Journalism
Columbia Journalism School
Week 1: Basics
September 10, 2012

Week 1: Basics
What is computational journalism?
Data in journalism
Aims of the course
Course structure

[Diagram, built up over several slides: Data, Reporting, Filtering and User, with Computer Science (CS) labels attached]

Examples of filters
What an editor puts on the front page
Google News
Reddit's comment system
Twitter
Facebook news feed
Techmeme

Track effects
[Diagram: Data, Reporting, Filtering, User and Effects, with CS labels attached]

Structured data
Unstructured data
More video on YouTube than produced by TV networks during the entire 20th century
400,000,000 tweets per day
AP moves ~15,000 stories per day
390,000 Wikileaks cables
500,000 Enron emails
How many government and corporate documents?
All New York Times articles ever = 0.06 terabytes (13 million stories, assuming 5k per story)

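The New York Times estimate can be checked with a few lines of arithmetic. This is a rough sketch, assuming that "5k" means 5,000 bytes per story and that a terabyte is 10^12 bytes; neither assumption is stated on the slide, and the names below are illustrative only.

```python
# Back-of-the-envelope check of the archive size quoted on the slide.
# Assumptions (not stated in the source): "5k" means 5,000 bytes per story,
# and one terabyte is 10**12 bytes (decimal units).
STORIES = 13_000_000
BYTES_PER_STORY = 5_000
BYTES_PER_TERABYTE = 10**12


def archive_size_terabytes(stories: int, bytes_per_story: int) -> float:
    """Return the total archive size in decimal terabytes."""
    return stories * bytes_per_story / BYTES_PER_TERABYTE


size_tb = archive_size_terabytes(STORIES, BYTES_PER_STORY)
print(f"{size_tb:.3f} TB")
```

Under these assumptions the total is 0.065 TB, or about 65 GB, which matches the slide's 0.06 terabytes once truncated. Using binary units (2^40 bytes per terabyte) instead gives roughly 0.059.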
Design
"[Designers] are guided by the ambition to imagine a desirable state of the world, playing through alternative ways in which it might be accomplished, carefully tracing the consequences of contemplated actions."
- Horst Rittel, The Reasoning of Designers

Design is political
"No plan has ever been beneficial to everybody. Therefore, many persons with varying, often contradictory interests and ideas are or want to be involved in plan-making. The resulting plans are usually compromises resulting from negotiation and the application of power. The designer is party in these processes; he takes sides."
- Horst Rittel, The Reasoning of Designers

Theory
We will learn important guiding principles about:
Filter design
Visualization
Social network analysis
Drawing conclusions from data
Security modeling

Techniques
We will discuss a handful of techniques in great depth.
Distance functions and clustering
Vector space document model
Recommender systems
Proposition extraction
Knowledge representation as linked data
Community detection
Any requests?

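To give a feel for the first two techniques on the list, here is a minimal sketch of a vector space document model combined with a distance function. It assumes raw term counts as vector weights and cosine distance as the metric. The course itself may use a different weighting (such as TF-IDF) or a different distance, and the function names and sample headlines are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Map a document to a sparse vector of raw term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """Return 1 - cosine similarity of two term-count vectors."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here: empty documents match nothing
    return 1.0 - dot / (norm_a * norm_b)


# Made-up headlines, for illustration only.
docs = {
    "budget_a": "City council approves new budget for schools",
    "budget_b": "Council votes on schools budget",
    "sports": "Local team wins championship game",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

for other in ("budget_b", "sports"):
    d = cosine_distance(vectors["budget_a"], vectors[other])
    print(f"budget_a vs {other}: {d:.3f}")
```

Documents that share vocabulary end up closer together than documents that do not. A clustering algorithm can use that property to group stories by topic.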
Course structure
Classes: we'll review the readings (so please read them).
By next week: form groups of 2-3.
Assignments every other week, due in two weeks.
Some will involve coding; all will involve critical analysis.

Your data
You are encouraged to pick a data set and stick with it.
If you want, you can do all assignments, the final research report, etc. with this data.
This is a research course: let's learn something new.

What data?
SEC reports, municipal open gov data, Wikileaks, your favorite archive, social media
Two criteria:
Journalistically interesting
Requires advanced techniques

Final Report
For 3-point students:
A theoretical discussion (10 pages)
For 6-point students, one of:
A theoretical discussion (25 pages)
An implementation of a technique and discussion of results
Analysis of your chosen data
A completed story, plus methodology