Professional Documents
Culture Documents
2
select(country_of_origin, total_cup_points) %>%
1.3K
3
mutate(country = fct_lump(country_of_origin, 12)) %>%
1.3K
Fig. 2: To invoke Detangler, users can highlight the pipeline
they wish to detangle and select ‘Detangle’ from the RStudio 1.3K
3
Fig. 4: Visualizing numeric
mutate(country data with thetotal_cup_point
= fct_reorder(country, Data Details view.
Addins drop-down menu. Detangler provides descriptive statistics and potential issues
for each column. No change Internal change Visible change Error
Search
4 Ethiopia 89 Ethiopia
The study was conducted remotely, over a video conferencing C. Data Collection and Analysis
tool, where participants shared their screens for the session’s All participant sessions were recorded and transcribed. We
duration. For the programming environment, we used RStudio made field notes during the sessions to record observations
Cloud IDE, a web-based version of the RStudio IDE. Being about which features were used by participants, and in what
a browser-based IDE, participants were able to use Detangler ways, to facilitate exploration and understanding of the data
on their computers within their own choice of browsers and wrangling tasks. We then performed inductive qualitative coding
configurations. Each session lasted 60 minutes, and consisted to analyze initial patterns and draw connections to derive
of a demo session and three debugging tasks. axial themes that describe our participants’ interactions with
1) Debugging Tasks: For the three debugging tasks, the Detangler. We followed the guidelines set by Carlson [28]
participants were tasked with exploring analysis scripts written and performed a single-event member check with our results
in tidyverse R using the dplyr and tidyr packages with the to overcome theoretical sensitivity. Following the completion
goal of fixing the data wrangling code to produce the desired of the usability session, we administered an exit survey to
visualization. The three datasets were based on #TidyTuesday measure the usefulness of Detangler and its specific features.
datasets [26], which included data quality issues inherently, To triangulate our findings, we also analyzed the log events
and we further modified them to include common data smells for patterns that help explain our themes. We instrumented De-
discussed in prior work [27], and common issues mentioned by tangler and RStudio to log all successful user code executions,
participants (F1–F8) in our formative interviews. No participant Detangler invocations (by clicking lines in the code editor),
indicated familiarity with the selected datasets. counts of “focusing” on the views (via mouse hover event),
Each task script was scaffolded with code to import the and Detangler-specific feature usage.
dataset, wrangle the data, and visualize it to explore relation-
ships between variables. Participants were tasked with thinking VI. R ESULTS
aloud while exploring, understanding, and debugging the data In this section, we present themes and log analysis results
wrangling code using Detangler. We did not prevent participants from the user study, describing the behaviors observed, strate-
from writing their own code, in an attempt to investigate at gies used, and feedback from participants. The results of our
which stage(s) participants would reach for Detangler during user study suggest that Detangler addresses the design goals
their data wrangling workflow. We designed the following 3 we formulated in Section III. Participants found that Detangler
tasks including inherent issues with the data as well as varying helped them explore data wrangling code as a consolidated
mistakes in the data wrangling code: view when they felt confused about transformations (D2,
(Task A) Coffee Ratings: Participants examined a script that D4). Detangler enabled participants in further understanding,
visualizes total coffee ratings for countries around the world and more importantly, in catching issues with both—the
using a box plot. The provided data wrangling code involved data wrangling code and the data (D1, D3). Detangler made
filtering out missing values (NAs) or miscoded values like exploration of data wrangling code and output easier as “it
white space or 999. The missing values are not handled in data offers a lot of information in a very compact way.” (P14)
wrangling code which produces an incorrect plot containing Several features of Detangler helped participants interactively
box plots for total cup points by top 12 countries. The inclusion explore the code behavior and the dataframe transformations.
of miscoded NAs makes this process harder because they are We observe that Detangler was used in bursts whenever
no longer detected with explicit checks for NAs. participants were trying to understand code behavior or
(Task B) Chopped: Participants examined a script visualiz- the resulting dataframe from the wrangling. Beginners and
ing the average episode ratings for all seasons of the Chopped experienced participants did not differ much on their frequency
TV Show. The provided data wrangling code included an of using Detangler or executing code in RStudio. Beginners
incorrect order of summary statistics about particular groups of detangled code 124 times and executed code 268 times in
variables before grouping the data by those variables. Grouped RStudio; while experienced participants detangled code 127
calculations is common yet it can be confusing because it times and executed code 294 times.
collapses large tables into a smaller ones by aggregating values
according to groups. The data also contained a problematic A. User-Survey Results
value (-99) that fell outside of the range of expected values Overall, data scientists’ response to Detangler was positive.
for a rating (0–100). On a 5-point Likert scale (Table I), participants positively rated
the usefulness of Detangler overall (µ = 4.5, median = 5). 200
Beginner [2,3.5] Experienced (3.5,4]
# Focus Events
44
the always-on visualization showing dataframe shape (µ = 4.5, 13
Table Type
100 Data Details
median = 5). Participants found code highlighting to invoke 47 19 31
Table
10 17 7 3 11 15 13 9 1 18 4 16 2 8 14 12 5 6
Participant
With a consolidated view presenting users with always-on
visualizations, Detangler gives data scientists the agency to (i) Fig. 6: Number of times participants focus on Table and Data
form mental models of variable distributions, (ii) quickly and Details views.
reliably validate the effect of individual transformations, and
(iii) visually identify data smells and quality issues. Detangler For example, P10, P12, and P13 looked at the histogram of the
successfully mitigates the need to perform manual inspections total_cup_points column in Task A and saw that it was
of data and saves the previously spent effort in writing code heavily left skewed. They examined the expanded details for
to perform data quality checks (supporting D1–D4). the column and noticed there was a 0 as a minimum value,
1) Detangler’s consolidated experience is inviting for data and they were able to filter out that value. P14 was able to
scientists: We observed a mix of beginners and experienced spot the incorrect ordering of the group_by after summarize
participants making use of RStudio features while examining by taking a peek at the histogram for the season column,
data, such as the Environment Pane (P5, P7), the interactive and said, “Yeah so that should definitely be more than 14 and
table using the View() function (P5, P6, P9, P10, P18) that should be more of a uniform distribution.”
holds all variables such as the dataset variables, and the
interactive console (P4, P5, P10). When participants were using C. Using Detangler to debug data wrangling mistakes and
these other views, they eventually decided to use Detangler data quality issues
when they felt the need to view both, the code and the output Detangler helped participants identify and debug data quality
in the same window, supporting D4. Participants then made issues and code mistakes (supporting D1 and D3).
use of the always-on visualizations within Detangler (described 1) Data scientists organically switch between the views:
in Section IV) to help them validate transformation pipelines. Beginners made heavier use of the Table view (488), while
2) Detangler facilitates isolation of transformations to still consulting the Data Details (300). Experienced participants
validate assumptions about code behaviour: P13 favored equally split their time between both views, 436 in Table
the Detangler’s Table view when checking dataframe outputs view and 422 in Data Details view (see Figure 6). Several
because of the ease of exploring them in the tool: “I’m a very beginners (P1, P2, P7, P9, P11) used the Table view and
slow coder, I change things and come back and mess things up clicked through the ‘Next’/‘Previous’ buttons to flip through
each time. [In Detangler], I can see that I’m messing up if my the rows to validate assumptions for a particular column. For
dataset has a strange shape and not go over to the console. I example, P11 flipped through some rows in the Table view
check the console output but it’s not as easy to understand as when they were investigating the ‘-99’ outlier in Task B and
what happened [on the console].” Experienced participants like once that became laborious, they switched to the Data Details
P5, P13, and P14 liked the fact that each line comes with its view which pointed out the value. Although hesitant at first,
own output as well as Data Details, a clear difference from the experienced participants like P4, P5, P8, P12 went directly
traditional approach of writing a single expression pipeline and to the Data Details view for every task to save on time after
only seeing the final result: “What’s cool is that what happens having success with it initially, instead of performing manual
below [on the table] depends where you are on the [code line] checks with the Table view.
so it’s for each line. Usually I only look at beginning data and 2) Participants triangulate potential issues using multiple
final output.” (P5) views: Participants used multiple views to understand potential
3) At-a-glance visualizations help data scientists form in- issues, toggling between (i) transformation steps using the code
stincts: Participants used the Data Details view to get an overlay, (ii) the Data Details view for descriptive statistics and
overall sense of the columns and quickly catch issues. The potential issues with attributes, and (iii) the Table view to look
unique number of elements and ‘missingness’ insights in the for values corresponding to those issues (e.g. by searching for
Data Details view helped participants quickly identify and specific values in the Table view). For example, P1 searched
validate issues corresponding to each transformation. P14 used for the miscoded NA value of “-” in Task A for the country
the unique counts for the season column while trying to column in its expanded details within Data Details. Similarly,
check how the counts dropped in Task B and was able to spot P2 made use of the search feature in Task C to validate whether
the missing values in episode ratings that dropped 2 seasons. the crop column contained values with an “_” in the name
Participants used the in-line histograms to spot extreme values. (cocoa_beans). P16 also made heavy use of the search feature
TABLE I: Post-Study Survey Responses
50% 0% 50%
Data Details view helped identify potential issues with data. 94% 0 0 1 4 13
Clicking lines to view intermediate data was useful. 94% 0 0 1 5 12
Data Details view helped describe transformations. 89% 0 2 0 5 11
The dataframe shape information helped validate changes. 89% 0 1 1 4 12
Detangler was useful overall. 89% 0 0 2 5 11
Invoking Detangler through code highlight and Add-in was useful. 83% 0 0 3 9 6
1 Likert responses: Strongly Disagree (SD), Disagree (D),
Neutral (N), Agree (A), Strongly Agree (SA).
2 Net stacked distribution removes the Neutral option and
shows the skew between positive (more useful) and negative (less useful) responses.
Strongly Disagree, Disagree, Agree; Strongly Agree.
to validate removing the “-” in the country column in Task Beginner [2,3.5] Experienced (3.5,4]
# Lines Clicked
the unique counts for the season column in the last step was Tasks Completed
50 1
43 instead of the original 45. Confused, they checked the line 2
42
before the filter step and saw the missing values in episode 39 3
18
step before. Through these features of Detangler, participants 17
13
8 8 9
were able piece together and triangulate information from each 0
4
6