
DATA(VIS) JOURNALISM

-Why and how to start

Nakho Kim (nkim3@wisc.edu) Nov 2011

Why Data?
In fact, journalism has always been about DATA.
Collecting facts
Parsing out meaningful patterns from facts
Telling stories with patterns
Opening up for feedback

The change:
We have so much more data on hand, both quantitative and qualitative
Open public data (data.gov), whistleblowers (Wikileaks.org), open collective data (Twitter, FB, Wikipedia)

We can process that data better


Computing power (and labor)

We have better ways to present them

In a nutshell: more diverse, intuitive ways to go in-depth

Why Visualization?
Two main functions (often both)
Interface for thick information
Pattern finding

Information interface
Goal:
To organize or filter some information from the vast whole
Example: Madison Commons neighborhood map (link), timelines

To show the process or mechanism


Example: Dynamic flowcharts (link)

or just to get attention

Pattern Finding
4 major patterns better found with Vis:
Difference
position: Voter heatmap (link), scale: Money chart (link)

Clustering
Social networks (link)

Overlap
Guardian's London riot map (link): pol-econ spaces

Change over time


Rosling's Gapminder: Health & Wealth (link)

The Best Datavis Ever

Napoleon's March by Minard (1869)

The process
4 steps to raise value to the public
Data > Filter > Vis > Story (Lorenz, 2010)

Inverted pyramid process (Bradshaw, 2011)


Compile -> Clean -> Context -> Combine -> Communicate
Visualise -> Narrate -> Socialize -> Humanize -> Personalize -> Utilize

The core: (Plan), Collect, Process, Show-tell

Planning the story:
What do you WANT to find?
DO NOT try to cram in too much info
Afghanistan disaster (link)

Collecting data
Locating
Data.gov, Guardian DataStore, etc

Mining
We Feel Fine (link)

Sourcing
Ushahidi Kenya (link)

Processing data
Excel rules (macros and formulas are helpful)
Good stat tools are useful, too
SPSS, SAS, Matlab, R
Parsing often includes statistical analysis.

Google Docs rising


Easy to connect different tools and gadgets

Export to CSV for versatility
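
Why CSV is versatile: almost any tool can read it, including a few lines of script. A minimal sketch in Python, using only the standard library. The sample rows, the column names (neighborhood, year, reports) and the helper total_by_column are hypothetical, made up for illustration only.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a spreadsheet exported to CSV.
SAMPLE_CSV = """neighborhood,year,reports
Eastside,2010,12
Eastside,2011,18
Westside,2010,7
Westside,2011,9
"""


def total_by_column(csv_text, key_column, value_column):
    """Sum the integer values in value_column for each distinct key_column value."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key_column]] += int(row[value_column])
    return dict(totals)


# Group the rows by neighborhood and add up the reports.
totals = total_by_column(SAMPLE_CSV, "neighborhood", "reports")
for name, count in sorted(totals.items()):
    print(f"{name}: {count}")
```

With a real export, the same function could be fed the contents of the file instead of the sample text.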

Showing the data


Selecting the vis style: little changes, big differences
Network layouts: round vs force-layout (link: NodeXL)
Mapping vs chart (link: NYT 2008 election)

Does it tell a good story?


If yes, is it the story you planned?
Yes: Congrats.
No: Pretend that it was your plan.

If no, select something else

Final check
Proof-check for unintended biases
One code for one data type
Color has misleading connotations
Patterns distorting the vision
Exaggeration by unintended hyperbole, figure size, etc.
The greatest sin of all: overpacking (examples of bad vis: link)
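
One common way figure size exaggerates (an illustration added here, not taken from the linked examples): when values are drawn as circles, scaling the radius by the value makes a 4x value look 16x bigger, because the eye reads area. A minimal sketch in Python of area-proportional scaling. The function names and the max_radius default are hypothetical.

```python
import math


def radius_for_value(value, max_value, max_radius=40.0):
    """Radius that makes the circle's AREA proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


def naive_radius(value, max_value, max_radius=40.0):
    """Radius proportional to value; the drawn area then grows with value squared."""
    return max_radius * value / max_value


def area(radius):
    return math.pi * radius ** 2


# A value that is one quarter of the maximum.
fair = radius_for_value(25, 100)
naive = naive_radius(25, 100)

# Share of the largest circle's area that each version actually draws.
fair_share = area(fair) / area(radius_for_value(100, 100))
naive_share = area(naive) / area(naive_radius(100, 100))
print(f"area-scaled circle covers {fair_share:.4f} of the largest one")
print(f"radius-scaled circle covers {naive_share:.4f} of the largest one")
```

The area-scaled circle keeps the 1:4 ratio in the data; the radius-scaled one shrinks it to 1:16.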

Now, tell (write, talk, draw) your story around it.


- The case of infovis (link): good and bad
- Sometimes, straightforward is the best (link)

Tools
Easy to learn / clunky

Menu-based
Apps (Dipity, Wordle, etc.)
Many Eyes
Fusion Tables
Gephi

Query-coding
R
Processing

Hard-coding
Python, Java, Ruby

Hard to learn / flexible

Don't Be Over-ambitious
Most likely, any journalist will be starting with menu-based online tools
Many Eyes (link)
Great for on-the-fly charts

Fusion Tables (link)


Great for maps and mashups

If you know what you're doing


Start using dedicated tools
E.g. Networks: Gephi (link)

Be ambitious only if
your team can afford a programmer, or
you are willing to learn some foreign language.

Now, let's discuss

What are the stories you want to visualize?
What kind of data do you want to see?
What are some of the good datavis examples you've seen?
How about some really bad ones?
Which kind of tools would you like to learn?

Some Suggestions
Watch:
Journalism in the Age of Data http://datajournalism.stanford.edu

Read:
Many Eyes FAQ (link)
10 tools for data journos (link)

Play:
Draw one network graph, one chart, one map with Many Eyes

Prepare:
Explain 1 or 2 examples of good / bad datavis
One real journalism task to discuss
