Professional Documents
Culture Documents
extended use of the drug, the earliest study yielded negative results. However,
the results were not published for several years. In the interim period, Pfizer
funded two studies that produced more favorable results. Eventually, the report
that was made public “bundled” (or aggregated) the three studies, thus diluting
the negative findings and enabling Pfizer to state that extended use of the drug
was effective. Thus, aggregating these studies benefited the drug company
and prevented consumers from making more informed decisions about their
health care.
An example from the airline industry further highlights how aggregating or
disaggregating data can privilege some groups, but silence or disempower
others. An article from the Wall Street Journal (McCartney, 2007) describes
statistical reports about on-time arrivals, flight delays and cancellations, and
baggage handling. In this case, the issue of aggregating or disaggregating data
involves the decisions on how to report data from both the mainline carriers
and their regional affiliates. The U.S. Department of Transportation does not
require that these data be presented as aggregates. In their view, reporting stat-
istics in a disaggregated form (i.e., mainline and regional carriers as separate
entities) gives specific information that benefits consumers.
Not everyone agrees with the DOT policy. A look at different presentations
of the same data helps illustrate why. The February 2007 DOT report gave
the on-time arrival rate of 76% for Delta AirLines, Inc. (The definition of
“on-time” includes flights arriving within 15 minutes of the scheduled time.)
However, two of Delta’s biggest regional partners reported much less favorable
rates: 53.5% for Comair, and 66.5% for Atlantic Southeast. If the data for Delta
and its regional airlines were combined, the overall percentage drops by 13.3%,
to 66.5%, thus casting Delta in a less favorable light. Obviously, most Delta
officials are quite content with the DOT policy, which does not include these
regional airlines in the Delta on-time arrival rate.
However, many of the regional affiliates feel quite differently. They argue
that the data should be presented as an aggregate because it would bring to
light a problem that Delta is trying to hide, namely that their lower on-time
arrival rates are the result of Delta’s power and control over the regional air-
lines’ schedules. For example, in the aftermath of bad weather, larger planes are
often the first to resume flying so that the greatest number of passengers can be
accommodated. As a result the regional planes are delayed, and their on-time
record suffers. In the view of these regional partners, the disaggregated data
exonerates the mainline carriers from taking the responsibility for the decisions
that result in the delays of these affiliate airlines. From this point of view, the
regional partners are unfairly positioned by the practice of presenting the data
in a disaggregated form. Given the industry’s use of the data for marketing
purposes, consumers may suffer as well. Aggregating or disaggregating categor-
ies of data make it possible to tell different stories, promote various agendas,
and privilege or marginalize groups of people.
52 • Definitions and Categories
Figure 3.1 Graph showing data from the school uniform survey disaggregated by age.
Reprinted with permission from Thinking and reasoning with data and chance: sixty-
eighth yearbook, copyright 2006 by the National Council of Teachers of Mathematics. All
rights reserved.
Definitions and Categories • 53
Figure 3.2 Alternate graph showing aggregated data from the school uniform survey.
promote their own point of view. She asked, “How might adults use your data
to promote their point of view?” Jeremy admitted that they could disaggregate
the student responses and only report the adult ones but that “wouldn’t be fair.
They aren’t showing all the information.” By raising this alternative possibility
Stacey helped Jeremy see that people can tell very different stories by aggregat-
ing and disaggregating data in different ways.
A similar issue arose during the fifth-grade school lunch study described
above. The cafeteria director told the children that she was exploring the idea of
offering an alternative to the standard hot menu: an option for sandwich, soup,
and/or salad. The fifth graders agreed that this change would improve the
lunch program. Knowing that they could use data to lobby for the adoption of
this option, they decided to include the question, “Would you like to have a
choice between the regular lunch OR salad, soup, or sandwich?” on a survey
given to third-, fifth-, and eighth-grade classes. They intended to combine the
data from all three grades. However, unanticipated questions arose about this
decision when Brittaney and Maya, the two in charge of this part of the report,
tabulated and graphed the results:
Grade 3 18 yes 6 no
Grade 5 19 yes 1 no
Grade 8 19 yes 2 no
Total 56 yes 9 no
Although the number of “yes” votes was substantial, the girls were surprised
that there were nine negative votes. Proceeding with their plan to graph the
combined results, they entered 56 and 9 on an Excel sheet and created a pie
chart (Figure 3.3). The resulting graph showed 86% “yes” and 14% “no.” Since
they had originally anticipated nearly 100% “yes,” they found this graph to be a
54 • Definitions and Categories
Figure 3.3 Graph showing data from the soup/salad/sandwich survey question with
aggregated responses from grades 3, 5, 8.
Figure 3.4 Graph showing data from the soup/salad/sandwich survey question without
the third grade responses.
gave the girls and their peers opportunities to discuss ethical decisions involved
in data reporting.
The work of the fourth- and fifth-grade students shows interesting parallels
to the published reports in the media. Like the Pfizer study above, Jeremy’s
exploration highlighted how categories can be aggregated to hide less desirable
data. When he hypothetically combined adults and children into one category
of “people,” the negative adult votes were diluted. Pfizer’s “bundling” strategy
similarly hid data that damaged their marketing goals. In the airline study,
mainline carriers favored disaggregating data from their regional affiliates,
while the fifth graders’ argument for a sandwich/soup/salad option would have
been stronger if the third-grade data were not aggregated with data from the
other grades. In the lunch study the girls were aware that they were making
assumptions that may have been unfounded about the third graders’ experi-
ence, and that their alternative graph might unfairly silence this group.
As children gain more opportunities to gather data, and then experiment
with selecting portions of it to argue different sides of an issue, the more they
realize that these alternatives shape the data to fit various purposes. In turn,
children can cast a more skeptical eye over the data reported by others. As
critics, they can learn to question whether the data that are reported are part of
a larger pool of information, and to ask who benefits from, and who is silenced
by, the chosen form of presentation.