From Online Journalism Review, http://www.ojr.org/ojr/stories/070731niles
Annenberg School of Journalism, University of Southern California
A journalist's guide to crowdsourcing
OJR's editor answers basic questions about how news organizations can improve investigative reports by using the Internet to gather
information from their readers.
By Robert Niles
Last week, I had the pleasure of conducting some training sessions for the staff at the Orlando Sentinel in Florida. I spent the morning and lunch sessions talking with Sentinel
reporters and editors about blogging and discussion forums, and the final session of the day was on my favorite online journalism topic: crowdsourcing.
Few journalists, at the Sentinel or elsewhere, know much about this topic, save, perhaps, for the fact that it's become one of the industry's hotter buzzwords. But I believe that
crowdsourcing might, in the end, have more of an effect on all forms of journalism than anything else that's come out of the online journalism revolution.
That's why I decided to put together this introductory Q&A about crowdsourcing for OJR readers.
What is crowdsourcing?
Crowdsourcing, in journalism, is the use of a large group of readers to report a news story. It differs from traditional reporting in that the information is gathered not
manually, by a reporter or team of reporters, but through some automated agent, such as a website.
Stripped to its core, though, it's still just another way of reporting, one that will stand alongside the traditional "big three" of interviews, observation and examining documents.
The core concept is not new in journalism. At its heart, modern crowdsourcing is the descendant of hooking an answering machine to a telephone "tip line," where a news
organization asks readers to phone in suggestions for stories. Or asking readers to send in photos of events in their community.
Such methods require substantial manual labor to sift through submitted material, looking for information that can be used well in a story. Which makes them only marginally more
effective than traditional news reporting.
True crowdsourcing involves online applications that enable the collection, analysis and publication of reader-contributed incident reports, in real time.
What are some examples of crowdsourcing?
My favorite example comes not from a news organization, but the U.S. Geological Survey. Its "Did You Feel It?" feature builds detailed "shake maps" illustrating the intensity of
earthquakes by zip code, through thousands of volunteer reports submitted online by readers.
A simpler example, but very popular this summer, is GasBuddy.com. The site won't win any awards for soothing graphic design, but it allows readers in more than 100 communities
to share real-time reports on gas prices in their area.
I built my first crowdsourcing news feature in 2001, on my theme park website. "Accident Watch" built a reader-written database of injury accidents at U.S. theme parks, in the
absence of federal or significant state incident data. Readers submitted reports of injury accidents that they'd witnessed or read about, with reports from just one reader labeled
"unverified." A second report of the same incident from another reader or link to an official police, court or park report or a news story was required for a report to be labeled
How can I be sure this information isn't bogus?
In a true crowdsourced project, information is not verified manually by a reporter between submission and publication. Which inspires concern among many traditional reporters.
A well-designed crowdsourcing project, like a well-edited newsroom, can discourage bogus submissions while minimizing their influence if accepted. Here are my suggestions to
avoid bogus data in a crowdsourced project:
Request the reader submit personal identification along with the report. On "Accident Watch," readers must be registered with the site, which requires e-mail verification, in order
to submit a report. The earthquake project requires a zip code and requests a reader's name, phone, e-mail and street address. Asking readers to identify themselves sends the
message that you take this project seriously and that you wish them to do the same. Obviously bogus ID allows you to flag bogus records for deletion with ease.
If your project publishes individual reports, provide other readers with an opportunity to dispute or verify each individual report. This empowers your readers to help clean your
data for you.
Even if you are publishing data only in aggregate, be aggressive about encouraging readers who dispute that data to add their report to the database, as more data should help
move the mean toward the true value.
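The point about aggregate data can be illustrated with a little arithmetic. The Python sketch below is only an illustration under simplifying assumptions: the prices are made up, every honest reader is assumed to report the true price exactly, and there is a single bogus submission. The names `TRUE_PRICE`, `BOGUS_PRICE` and `published_mean` are hypothetical and not taken from any real project.

```python
from statistics import mean

TRUE_PRICE = 3.00    # hypothetical true gas price, in dollars per gallon
BOGUS_PRICE = 9.00   # one hypothetical bogus submission


def published_mean(honest_reports: int) -> float:
    """Mean of one bogus report plus `honest_reports` accurate ones."""
    reports = [BOGUS_PRICE] + [TRUE_PRICE] * honest_reports
    return mean(reports)


# The bogus report's pull on the published figure shrinks as honest reports pile up.
for n in (1, 5, 59):
    print(f"{n:>3} honest reports -> mean ${published_mean(n):.2f}")
```

With one honest report the published mean is $6.00; with five it is $4.00; with 59 it is $3.10. The effect depends on bogus reports staying a small minority, which is why the identification and dispute steps above still matter.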
How is crowdsourcing different from polling?
Obviously, you do not have a controlled random sample of the population in a crowdsourced project, as you would with a carefully executed poll. But that does not prohibit you from
collecting accurate and engaging data through crowdsourcing. You just need to be careful in identifying whether a specific project works better with polling or crowdsourcing.
Polling's great for constructing an accurate portrait of a community's demographics, attitudes and behavior. Crowdsourcing's great for incident reports, which might be incomplete if
limited to a small random sample.
Either the incident (the roller coaster crash, the bottles falling from your kitchen shelves, the three-buck gasoline) happened, or it didn't. But the more people you have "on the
ground" as potential sources in your crowd, the more data points you can collect. If you poll only a few hundred people, you'll miss incidents.
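To put a rough number on how easily a poll can miss an incident, here is a minimal Python sketch. It assumes a simple random sample and treats each draw as independent, which is a close approximation when the sample is small relative to the population. The function name and the figures (a community of 200,000, an incident seen by 20 people, a poll of 400) are hypothetical.

```python
def chance_poll_reaches_witness(population: int, witnesses: int, sample_size: int) -> float:
    """Probability that a random sample includes at least one witness.

    Approximates the draws as independent (sampling with replacement).
    """
    p_not_witness = 1 - witnesses / population
    return 1 - p_not_witness ** sample_size


# Hypothetical figures: 20 witnesses in a community of 200,000, poll of 400.
p = chance_poll_reaches_witness(population=200_000, witnesses=20, sample_size=400)
print(f"Chance the poll reaches any witness: {p:.1%}")
```

Under these assumptions the poll has only about a 4 percent chance of reaching even one witness, while an open call to the whole community can, in principle, reach all 20.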
Think of another great crowdsourcing project: missing/safe person lists following a disaster, such as Hurricane Katrina or 9/11. A random sample would get you only the family and
friends of your sample, instead of the many thousands more who want and need information about their loved ones.
At the same time, be careful about drawing broad conclusions about community behavior based on your crowdsourced incident reports. Don't ask people about their income,