Strategies for Crowdsourcing Social Data Analysis
Wesley Willett, Jeffrey Heer, Maneesh Agrawala
Computer Science Division, University of California, Berkeley
{willettw, maneesh}@cs.berkeley.edu
Computer Science Department, Stanford University
jheer@cs.stanford.edu

To appear at CHI 2012.
ABSTRACT
Web-based social data analysis tools that rely on public discussion to produce hypotheses or explanations of patterns and trends in data rarely yield high-quality results in practice. Crowdsourcing offers an alternative approach in which an analyst pays workers to generate such explanations. Yet, asking workers with varying skills, backgrounds and motivations to simply "Explain why a chart is interesting" can result in irrelevant, unclear or speculative explanations of variable quality. To address these problems, we contribute seven strategies for improving the quality and diversity of worker-generated explanations. Our experiments show that using (S1) feature-oriented prompts, providing (S2) good examples, and including (S3) reference gathering, (S4) chart reading, and (S5) annotation subtasks increases the quality of responses by 28% for US workers and 196% for non-US workers. Feature-oriented prompts improve explanation quality by 69% to 236% depending on the prompt. We also show that (S6) pre-annotating charts can focus workers' attention on relevant details, and demonstrate that (S7) generating explanations iteratively increases explanation diversity without increasing worker attrition. We used our techniques to generate 910 explanations for 16 datasets, and found that 63% were of high quality. These results demonstrate that paid crowd workers can reliably generate diverse, high-quality explanations that support the analysis of specific datasets.
Author Keywords
Information Visualization; Social Data Analysis; Crowdsourcing
ACM Classification Keywords
H.5.3 Group & Organization Interfaces: Collaborative computing.
INTRODUCTION
Making sense of large datasets is fundamentally a human process. While automated data mining tools can find recurring patterns, outliers and other anomalies in data, only people can currently provide the explanations, hypotheses, and insights necessary to make sense of the data [22, 24]. Social data analysis tools such as Sense.us [8], Pathfinder [18] and Many Eyes [30] address this problem by allowing groups of web-based volunteers to collaboratively explore visualizations, propose hypotheses, and seek out new insights. Controlled experiments have shown that groups can use these tools to discover new, unexpected findings [8, 29]. However, eliciting high-quality explanations of the data requires seeding the discussion with prompts, examples, and other starting points to encourage contributions [8, 32].

Outside the lab, in real-world web-based deployments, the vast majority of the visualizations in these social data analysis tools yield very little discussion. Even fewer visualizations elicit high-quality analytical explanations that are clear, plausible, and relevant to a particular analysis question. We recently surveyed the Many Eyes website and found that from 2006 to 2010, users published 162,282 datasets but generated only 77,984 visualizations and left just 15,464 comments. We then randomly sampled 100 of the visualizations containing comments and found that just 11% of the comments included a plausible hypothesis or explanation for the data in the chart. The low level of commenting may represent a shortage of viewers or may be due to lurking, a common web phenomenon in which visitors explore and read discussions, but do not contribute to them [31, 20]. When comments do appear, they are often superficial or descriptive rather than explanatory (Figures 2a, 2b). Higher-quality analyses sometimes take place off-site [5] but tend to occur around limited (often single-image) views of the data curated by a single author. Ultimately, marshaling the analytic potential of crowds calls for a more systematic approach to social data analysis; one that explicitly encourages users to generate good hypotheses and explanations.

In this paper we show how paid crowd workers can be used to perform the key sensemaking task of generating explanations of data. We develop an analysis workflow (Figure 1) in which an analyst first selects charts, then uses crowd workers to carry out analysis microtasks and rating microtasks to generate and rate possible explanations of outliers, trends and other features in the data. Our approach makes it possible to quickly generate large numbers of good candidate explanations like the one in Figure 2c, in which a worker gives several specific policy changes as possible explanations for changes in Iran's oil output. Such analytical explanations are extremely rare in existing social data analysis systems.
Figure 1. Our workflow for crowdsourcing data analysis.
 
Figure 2. Comments on social data analysis sites like Many Eyes (a, b) often add little value for analysts. We show that crowd workers can reliably produce high-quality explanations (c) that analysts can build upon as part of their broader analyses. (Emphasis added.)
Yet, simply asking workers with varying skills, backgrounds, and motivations to "Explain why a chart is interesting" can result in irrelevant, unclear, or speculative explanations of variable quality. We present a set of seven strategies that address these problems and improve the quality of worker-generated explanations of data. Our seven strategies are to: (S1) use feature-oriented prompts, (S2) provide good examples, (S3) include reference gathering subtasks, (S4) include chart reading subtasks, (S5) include annotation subtasks, (S6) use pre-annotated charts, and (S7) elicit explanations iteratively. While some of these strategies have precedents in other crowdsourcing systems [17, 21], the main contribution of this work is to demonstrate their impact in the context of collaborative data analysis.

We have applied these strategies to generate 910 explanations from 16 datasets, and found that 63% were of high quality. We also conducted six experiments to test the strategies in depth. We find that together our first five strategies (S1-S5) increase the quality ratings (a combined measure of clarity, plausibility, and relevance) of responses by 28% for US workers and 196% for non-US workers. Feature-oriented prompts (S1) are particularly effective, increasing the number of workers who explain specific chart features by 60%-250% and improving quality by 69%-236% depending on the prompt. Including chart annotation subtasks (S5) or pre-annotating charts (S6) also improves workers' attention to features. Additionally, iterative rounds of explanation generation (S7) can produce 71% new explanations without increasing worker attrition. Finally, we show how workers can help analysts identify the best unique explanations, providing quality ratings that correlate strongly with our own and identifying redundant explanations with 72% accuracy. Our results show that by recruiting paid crowd workers we can reliably generate high-quality hypotheses and explanations, enabling detailed human analyses of large data sets.
RELATED WORK
We build on two main areas of related work: asynchronous social data analysis and applications of paid crowdsourcing.
Asynchronous Social Data Analysis 
Social data analysis systems such as Sense.us [8], Pathfinder [18], Many Eyes [30], and Swivel [27] were built under the assumption that people can parallelize the work required to analyze and make sense of data. Motivated users can visualize, share, and discuss datasets but, as we've noted, few of the visualizations exhibit high-quality analytical discussion. In fact, many of the commercial websites no longer exist. Heer and Agrawala [6] discuss a variety of issues in designing asynchronous social data analysis systems to improve sensemaking. They suggest that these systems should facilitate division, allocation and integration of analysis work, support communication between workers and provide intrinsic and extrinsic incentives. Building on these suggestions, Willett et al.'s CommentSpace [32] demonstrates that dividing social data analysis into concrete subtasks can improve the quality of analysts' contributions. In this work, we further break the task of generating explanations into smaller microtasks in which paid workers explain features of the data and other workers rate those explanations.
Applications of Paid Crowdsourcing 
With the rise of online labor marketplaces such as Amazon's Mechanical Turk (www.mturk.com), researchers have focused on the use of paid crowdsourcing to supplement purely computational approaches to problem solving and user testing [12, 23]. In the context of visualization, recent work has used crowdsourced workers to perform graphical perception experiments on the effectiveness of charts and graphs [7, 14]. We also pay crowd workers to make judgments about charts and graphs and to provide graphical annotations, but we focus on analytical sensemaking tasks.

Other work has examined how to incorporate human computation into larger workflows. Soylent [2] uses paid workers to perform document editing tasks within a word processor, using a Find-Fix-Verify pattern to break editing tasks into smaller subtasks. Similarly, our workflow helps an analyst break down complex data analysis operations into analysis microtasks that many workers can perform in parallel and rating microtasks that help the analyst consolidate the results of the parallel analyses. We also take inspiration from CrowdForge [13], Jabberwocky [1], TurKit [17], and Turkomatic [15], which provide general-purpose programming models for leveraging crowds to perform complex tasks.
 
Figure 3. An example analysis microtask shows a single chart (a) along with chart-reading subtasks (b), an annotation subtask (c), and a feature-oriented explanation prompt designed to encourage workers to focus on the chart (d). A request for outside URLs (e) encourages workers to check their facts and consider outside information.
A WORKFLOW FOR CROWDSOURCING DATA ANALYSIS
Hypothesis (or explanation) generation is a key step of Pirolli and Card's [22] sensemaking model, and it requires human judgment. Developing good hypotheses often involves generating a diverse set of candidate explanations based on understanding many different views of the data. Our techniques allow an analyst to parallelize the sensemaking loop by dividing the work of generating and assessing hypotheses into smaller microtasks and efficiently distributing these microtasks across a large pool of workers.

We propose a four-stage workflow (Figure 1) for crowdsourcing data analysis. An analyst first selects charts relevant to a specific question they have about the data. Crowd workers then examine and explain these charts in analysis microtasks. Optionally, an analyst can ask other workers to review these explanations in rating microtasks. Finally, the analyst can view the results of the process, sorting and filtering the explanations based on workers' ratings. The analyst may also choose to iterate the process and add additional rounds of analysis and rating to improve the quality and diversity of explanations.
Selecting Charts 
Figure 4. An example rating microtask showing a single chart (a) along with explanations (b) from several workers. The task contains a chart-reading subtask (c) to help focus workers' attention on the charts and deter scammers, along with controls for rating individual responses (d), indicating redundant responses (e), and summarizing responses (f).

Given a dataset, an analyst must initially select a set of charts for analysis. The analyst may interactively peruse the data using a visual tool like Tableau [28] to find charts that raise questions or warrant further explanation. Alternatively, the analyst may apply data mining techniques (e.g., [10, 16, 33]) to automatically identify subsets of the data that require further explanation. In general, our workflow can work with any set of charts and is agnostic to their source.

In our experience, analysts often know a priori that they are interested in understanding specific features of the data such as outliers, strong peaks and valleys, or steep slopes. Therefore, our implementation includes R scripts that apply basic data mining techniques to a set of time-series charts in order to identify and rank the series containing the largest outliers, the strongest peaks and valleys and the steepest slopes. The analyst can then review these charts or post them directly to crowd workers to begin eliciting explanations. We leave it to future work to build more sophisticated data mining algorithms for chart selection.
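The paper's chart-selection scripts are written in R and their exact statistics are not given here. As a rough illustration of the same idea only, the Python sketch below ranks time series by the magnitude of their largest single-step change and by their largest z-score; the ranking criteria and the example data are assumptions, not the authors' implementation.

import numpy as np

def rank_by_steepest_slope(series_dict):
    # Rank series by the magnitude of their largest single-step change.
    scores = {name: float(np.max(np.abs(np.diff(values))))
              for name, values in series_dict.items()}
    return sorted(scores, key=scores.get, reverse=True)

def rank_by_largest_outlier(series_dict):
    # Rank series by the largest z-score of any point in the series.
    scores = {}
    for name, values in series_dict.items():
        values = np.asarray(values, dtype=float)
        std = values.std()
        scores[name] = 0.0 if std == 0 else float(np.max(np.abs(values - values.mean())) / std)
    return sorted(scores, key=scores.get, reverse=True)

# Example: pick the top-ranked series to post as analysis microtasks
# (hypothetical values, for illustration only).
oil_output = {
    "Iran":   [3.9, 3.8, 3.7, 2.4, 1.3, 2.2, 2.4],
    "Norway": [3.0, 3.1, 3.2, 3.2, 3.1, 3.0, 2.9],
}
top_charts = rank_by_steepest_slope(oil_output)[:1]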
Generating Explanations 
For each selected chart, our system creates an analysis microtask asking a paid crowd worker to explain the visual features within it. Each microtask contains a single chart and a series of prompts asking the worker to explain and/or annotate aspects of the chart (Figure 3). The analyst can present each microtask to more than one worker to increase the diversity of responses.
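As a hedged sketch of how such a microtask might be assembled, the snippet below pairs a chart with a feature-oriented prompt and a few subtasks, and requests multiple assignments so that several workers see the same chart. The prompt templates, field names, and the example chart are hypothetical; the paper does not prescribe this structure.

# Hypothetical helper that assembles one analysis microtask per selected chart.
PROMPT_TEMPLATES = {
    "outlier": "Why might {series} deviate so sharply from the other series?",
    "peak":    "What might explain the peak in {series} around {period}?",
    "slope":   "What might explain the steep change in {series} during {period}?",
}

def build_analysis_microtask(chart_id, feature, series, period, assignments=5):
    # Combine a feature-oriented prompt (S1) with supporting subtasks (S3-S5).
    return {
        "chart_id": chart_id,
        "prompt": PROMPT_TEMPLATES[feature].format(series=series, period=period),
        "subtasks": ["chart_reading", "annotation", "reference_gathering"],
        "assignments": assignments,  # show the same microtask to several workers
    }

task = build_analysis_microtask("oil-output-by-country", "slope", "Iran", "the late 1970s")

Posting the same microtask to several workers trades extra cost for a more diverse pool of candidate explanations.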
Rating Explanations 
If a large number of workers contribute explanations, the analyst may not have the time to read all of them and may instead wish to focus on just the clearest, most plausible or most unique explanations. In the rating stage the analyst enlists crowd workers to aid in this sorting and filtering process. Each rating microtask (Figure 4) includes a single chart along with a set of explanations authored by other workers. Workers rate explanations by assigning each a binary (0-1) relevance score based on whether it explains the desired feature of the chart. Workers also rate the clarity (how easy it is to interpret) and plausibility (how likely it is to be true) of each response on a numerical (1-5) scale. We combine these ratings into a numerical quality score (0-5) that mea-
