Tomas Ariztia
To cite this article: Tomas Ariztia (2018): Consumer databases as practical accomplishments:
the making of digital objects in three movements, Journal of Cultural Economy, DOI:
10.1080/17530350.2018.1435421
Introduction
Number crunching describes the first stage in any work pertaining to data. It consists of all the activi-
ties related to the preparation and cleaning of transactional data-sets for further analysis. This
includes tasks such as collecting ‘raw’ consumer data from different commercial sources, dealing
with data ‘errors’, and preparing data-sets for analytics software. As such, number crunching is a
critical step in the production of a workable consumer data-set. It involves repetitive tasks, such as dealing with very specific errors in data that must be cleaned, or arranging and organizing large amounts of data from different sources into a final coherent database that is useful for clients who want to get to know their consumers. In some cases, number crunching takes up a relatively short period at the start of the data analysis process, moving from automated routines to the cleaning and organization of data-sets. In most cases, however, number crunching is a long and costly process that requires collecting and ordering data from many different sources and reshaping it to produce a workable database.
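The preparation work described above can be sketched in code. What follows is a minimal illustration rather than any company's actual pipeline: the field names, the sample data, and the rule of discarding incomplete rows are all invented for the example.

```python
import csv
import io

# Hypothetical 'raw' transactional data, as it might arrive from a
# commercial source; one record is incomplete.
RAW = """account_id,amount,date
A1,1500,2017-03-01
A2,,2017-03-02
A1,250,2017-03-03
"""

def crunch(raw_text):
    """Collect raw rows, drop records with missing values, and
    normalize types so the result is ready for analytics software."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    cleaned = [r for r in rows if all(r.values())]  # discard incomplete rows
    for r in cleaned:
        r["amount"] = float(r["amount"])  # normalize amounts to numbers
    return cleaned

cleaned = crunch(RAW)
print(len(cleaned))  # 2: the incomplete record is discarded
```

Even this toy version shows the point made in the text: some records are shaped and packed for further analysis, while others are silently dropped along the way.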
Nonetheless, this mundane activity offers a central space for thinking about how data-sets – and
other digital objects – are assembled in business settings. During the process of getting cleaned, orga-
nized, and transformed, some consumer records are shaped and packed for further analytics while
others get discarded, remaining forever silent and invisible. By using the practice of number crunch-
ing as an illustration, this paper aims to reflect on some key issues linked to the production of digital
objects in business settings. In doing so, it problematizes current social science scholarship, which
emphasizes the analysis of digital data and analytics, reinforces the magnitude of its consequences
and ‘data power’, and often neglects the myriad of mundane practices and devices through which
consumer data-sets and analytics are accomplished. In contrast to the current fascination with the
magnitude of the ‘data deluge’ (Cukier and Mayer-Schonberger 2013), we suggest focusing on the
more mundane set of practices and devices through which such objects are enacted in business set-
tings. In doing so, this paper seeks to develop a more deflationary sensibility to the research of digital
objects in business settings (Lynch 2013, Neyland 2015) that focuses on describing the multiple
mundane situations by which these objects are enacted. Taking this view might not only enrich scho-
larly understanding of the specific versions of the social that are effectuated in the enactment of data-
bases but also – and perhaps mainly – might contribute to fostering new ways of making these types
of objects visible and accountable to critical examination. As recently argued by Jasanoff (2016),
looking behind the surfaces of technological objects, that is the different types of judgments and
practices that shape them, is a critical element in making them more governable.
More concretely, and inspired by the Science and Technology Studies (STS) tradition and the
recent pragmatic turn in economic sociology (McFall and Ossandon 2013), this paper proposes mak-
ing three corrective ‘movements’ that might enrich our approaches to how databases and analytics
are assembled in business settings. Each of these movements is illustrated by ethnographic vignettes
from a 9-month ethnographic experiment. This experiment involved participating in the first stages
of the manufacturing and analysis of an online financial retail company’s consumer database.
The first movement deals with the problem of ethnographic access, more concretely, with the fab-
rication of intimacy that made possible the encounter between the researcher and live data practices
(Farías and Wilkie 2015). We propose a shift from an emphasis on ethnographic representation to
ethnographic provocation, which involves taking seriously the issue of fabricating an ethnographic
encounter that ‘slows down expert reasoning’ (Lezaun et al. 2017, p. 18). Specifically, this experimen-
tal approach implies making visible, or exposing, the practical and contextual knowledge from which
these types of digital objects are produced. We illustrate this first movement by presenting empirical
material related to the setup and initial stages of an ethnographic experiment related to the creation
of an artificial consulting firm.
The second movement concerns the problem of visibility of data-making practices and the type of
politics involved in examining databases in the making. Following a long tradition in STS, we suggest
that instead of focusing mainly on the final uses and consequences of the digital object, another fruit-
ful way to approach these businesses is by opening up the bundle of ordinary practices and devices
through which they are assembled. Doing so entails revealing the ontological politics taking place in
data and commercial practices. In this vein, we suggest attending to the myriad of situated operations
and relations by which a particular record becomes ‘consumer data’ while others are discarded and/
or silent. In doing so, we follow an understanding of politics as intrinsic to the ontological consti-
tution of (digital) entities (Woolgar and Lezaun 2013). We illustrate this point by describing
a key operation developed during the production of databases: the cleaning of data records.
Specifically, we analyze how during cleaning records are ‘counted as data’ in relation to different
forms of valuation in which the reality of records is stated. This is related mainly to situating existing
records in relation to other elements and moments that organize the database.
Finally, the third movement concerns the problem of stability, success, and failure in the pro-
duction of databases. In order to grasp digital objects ‘in the making’, an increasing attention to
error, failure, and fragility is needed. We describe how errors present moments of breakdown
where the multiple entities and relations that converge in the creation of a consumer database –
and their affordances – are exposed. Focusing on errors might work not only as a better way to
describe and account for data-sets in the making but also as a methodological trick through
which the enactment of such objects is exposed. As in the previous two movements, we illustrate
the place of error by describing the key role that it – along with failure – plays during the production
of a consumer database. We particularly discuss how errors involve unveiling the different agencies
(platforms, organizations, experts, code languages), which enable and constrain the creation of a
final database.
The paper is organized as follows. Section 2 critically engages with existing scholarship on data
practices and objects in business settings, in particular, the scholarship that has examined the
relation between digital practices and objects and the production of the consumer in contemporary
capitalism. In doing so, we identify what we describe as the ‘data power’ discourse and discuss some
of its limitations. The sections that follow (3, 4 and 5) describe each of the aforementioned move-
ments. The paper concludes by discussing some general implications of taking these movements
as a basis for research on the assembling of digital objects in business settings.
Furthermore, it has been argued that, as part of the increasing use of data and analytics, new
forms of organizing and classifying consumers have also been incorporated, giving rise to a new
type of moral economy through which consumers are sorted and valued (Fourcade and Healy 2016), in addition to reshaping the links between businesses and their publics (Turow et al. 2015).
A key element that traverses all the above-mentioned scholarship is the emphasis on the social
and political consequences that the increasing prevalence of digital data-sets and analytics has on
business and marketing. We could say that these distinct views emphasize the ‘power’ of data and
digital devices in the making of the economy. This power – described by much of this scholarship
– is mainly presented by examining the ‘effects’ or results that different digital business objects
have on shaping and affecting consumers.
Though this view has been central in terms of articulating a critical stance and making visible
power issues in contemporary digital capitalism, it lacks a more nuanced attention to the mundane
production of such digital objects and practices. In this way – and as Neyland has recently argued in
relation to algorithms – this scholarship ‘tends toward the dramatic, seeking out broad societal
trends at the expense of detailed consideration of specific algorithmic examples’ (Neyland 2014,
pp. 121–122). In fact, a cross-examination of the aforementioned scholarship shows that most of
it relies on empirical material related to the final uses and effects of data-sets and algorithms, and
only a small fraction of it relies on an ethnographic description of the process through which
such objects are assembled. Most of this latter work focuses on databases and data practices in scien-
tific work (MacKenzie et al. 2012, Garnett 2016, Nadim 2016).
One unintended consequence of attending primarily to the social and political effects of data and
analytics in marketing and business is that these analyses tend to take for granted and naturalize the
claims of ‘power’ and effectiveness regularly made by actors in the marketing analytic industry, doing
so without exposing and problematizing the digital practices and objects themselves. Thus, they reify
a particular form of technological determinism, where businesses (and social worlds more generally)
seem to be strongly shaped by these new digital commercial objects but where contingency, frictions,
and failure are out of the scope (Morozov 2014).
Against this backdrop, in this article, we concur with scholars who have increasingly called for
more in-depth research on the specificities of digital practices and objects as well as more detailed
descriptions of how different digital devices work empirically (Ruppert et al. 2013). We aim to contribute to this by problematizing the moments and modes of attention required to unpack the practical manufacturing of such objects. In doing so, this article seeks to complement recent research on digital data objects that, while focusing on the increasing impact these objects have on different aspects of social life, has nonetheless neglected the practices of data-making. Furthermore, we attempt to offer a corrective to certain critiques of data practices that have left the moments of data production unattended.
Inspired by the traditional interest that STS has paid to the empirical assembling of techno-scien-
tific objects (Latour 1992) as well as the pragmatic turn in the anthropology of markets (Muniesa
2014), we suggest that, besides focusing on the consequences of the increasing centrality of digital
data and analytics in business settings, more attention needs to be paid to the practical and mundane
operations through which such objects are made in business organizations. Doing so involves under-
standing consumer data-sets, algorithms, and other types of digital objects from within the varied
circumstances of their production, mobilization, and use (Gitelman 2013). Thus, we propose empha-
sizing and concentrating on the different moments, tests, operations, and displacements through
which final consumer data-sets are assembled.
As mentioned above, our argument revolves around three suggested movements by which ‘to
address’ the question of digital business objects ‘in the making’. In doing so, we seek to problematize
both the ontological and epistemological basis for unpacking the assemblage of digital objects in
business. Each movement proposes a displacement in the focus of attention required to hold
these objects visible and accountable for ethnographic consideration. We illustrate each of these
movements by describing different specific ethnographic situations, all of them emerging from
our involvement in the early production stages of a customer database.
this view implies that we do not claim or attempt to unveil a ‘natural’ expert practice. Instead, our
focus is to examine what is being effectuated in its own value by producing an artificial encounter. In
other words, in the experimental situation, there are no attempts to ‘represent’ (as if there were some-
thing in another space that is being represented) but to ‘make’/create a specific type of situation that
allows for an encounter with some type of ethnographic intimacy with data mining expert practices.
We should note that there exists a long tradition of using such tools, including the work of Harold
Garfinkel and ethnomethodology, where the fabrication of situations worked as a key tool for
unpacking processes of meaning-making that were taken for granted (1967). In the case of this research on the production of business databases, an experimental approach involves introducing
a new type of slow temporality where the data experts and ethnographers are involved in making
sense of what is happening. This situation involves creating a ‘new space of thinking’, where what
is taken for granted has to be explained (Lezaun et al. 2017).
In the following paragraphs, we illustrate this movement by describing the initial phase of an
experimental situation, which was organized in order to participate in the first stages of the pro-
duction and analysis of a consumer database. We further describe some key artificial moments of
engagement that involve resisting taken-for-granted data practices and introduce moments of
explanation.
amount, etc.) that would exist in the database. Several aspects were taken into account during this
first stage.
An initial key aspect had to do with the levels of privacy and data disclosure they wanted to grant
us, and what this meant for the type and properties of the data we were receiving in our database. It
was decided that we would receive mostly transactional (and anonymized) records.4 These data were
collected on EasyFinance’s webpage by scraping bank account details and user transactions; the cus-
tomer ID was substituted with a random ID, therefore making it impossible for us to link the data we
had with specific persons.
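The anonymization step just described, substituting the customer ID with a random ID, can be sketched roughly as follows. The field names and the use of random hexadecimal tokens are assumptions for illustration, not EasyFinance's actual procedure.

```python
import secrets

def anonymize(records):
    """Replace each customer ID with a random token. The same customer
    keeps the same token, so records remain linkable within the data-set
    while the link to the real person is severed (the mapping stays
    with the data provider)."""
    mapping = {}
    out = []
    for rec in records:
        original = rec["customer_id"]
        if original not in mapping:
            mapping[original] = secrets.token_hex(8)
        out.append({"customer_id": mapping[original],
                    "amount": rec["amount"]})
    return out

records = [{"customer_id": "C-001", "amount": 120},
           {"customer_id": "C-001", "amount": 75}]
anon = anonymize(records)
print(anon[0]["customer_id"] == anon[1]["customer_id"])  # True
```

The design choice worth noting is that pseudonymization of this kind preserves the analytic structure of the data (per-customer histories) while making re-identification impossible for the receiving party.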
A second key aspect was the data-set architecture. We were told that we were receiving the data as
it ‘lives’ in the company’s servers.5 This meant we would be working with the same data structure
they used during daily operations. After initial negotiations, we received a database consisting of
bank transaction data from users registered on the platform: 376,339 transaction records and approximately 3500 client bank accounts. The company delivered this initial database in the form of 'testing' data (which means we were receiving only a few months' worth of data)
and only later would we receive a complete data-set with all the clients’ transactions. Each trans-
action record listed the amount, date, type of transaction, and a code designed to link the transaction
with the user’s ID.
With these pieces in place, our first task was to make sense of and organize the data we received for further analysis. Both our engineer and our counterpart within the company told us we first had to
go through the process of ‘number crunching’ that we described at the beginning of the paper. In
other words, we had to work with data concerning all the activities related to the preparation and
cleaning of transactional data-sets for further analysis. This included making general sense out of
the data we were receiving, dealing with data ‘errors’, producing new entities that could fit into
analytical models, and preparing data-sets for analytics software. This process took 4 months, during
which time we met regularly, alternating between working with the data and conversing openly
about the different processes, discussing and questioning them in detail.
Once a meeting started, we began by discussing in detail the task that was needed (e.g. coding a
new operation for listing data transactions), and then we moved on to the practical work. During
practical work, we created artificial situations where we asked a participant to unfold what was
happening. For example, every time a relevant action was performed or a question emerged, we asked the engineer to pause the work and further explain what was happening on the screen, what the key aspects involved were, and why it had to be done in that way. We also organized meetings to discuss the general aspects of the work. In those meetings, I reviewed my own notes on the processes and discussed the rationale behind the different moments and the elements that were in play.
As this brief account shows, the fabrication of an artificial situation helps us to produce an
intimate encounter with a data mining practice in a relatively stable situation. Additionally, as we
will describe in the following pages, having the researcher’s active participation in the database
creation process allows for the production of various moments of resistance where procedures,
problems, and solutions have to be explained to the ethnographers. In this sense, the experimen-
tal situation added certain moments of resistance and ‘overspilling’ (Michael 2012) to the hum-
ble process of creating and analyzing the data-set. Because I actively participated in the process
(not being myself a technical expert), the presupposed methods and knowledge required to pro-
duce the database had to be explained to me, thereby foregrounding the internal rules of what
constitutes a good and a bad data-set. What thus emerged from this experimental situation was a
particular type of reality, real though it was provoked (Lezaun et al. 2013, p. 279). Based on this
reflection, it is interesting to see this movement from representation to provocation not so much
as a betrayal of other forms of ethnographic representation but as a move that helps to provoke
and mobilize some of the specific realities that constitute the practice of data mining, thereby
exposing the ‘intimate’ but key moment in the production of such objects.
was organized in three different SQL6 tables that contained all of the customers' information: (a) a table
with the bank account number, bank, and name details; (b) a table with the subaccount details; and
(c) a table with transactions, date of transaction, amount, and a brief description.
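The three-table layout might be sketched roughly as follows, here in SQLite rather than the company's own system. The column names are inferred from the description above and are not the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY,
                       bank TEXT, name TEXT);
CREATE TABLE subaccounts (subaccount_id INTEGER PRIMARY KEY,
                          account_id INTEGER REFERENCES accounts);
CREATE TABLE transactions (tx_id INTEGER PRIMARY KEY,
                           subaccount_id INTEGER REFERENCES subaccounts,
                           tx_date TEXT, amount REAL, description TEXT);
""")
conn.execute("INSERT INTO accounts VALUES (1, 'BankX', 'anon')")
conn.execute("INSERT INTO subaccounts VALUES (10, 1)")
conn.execute("INSERT INTO transactions VALUES "
             "(100, 10, '2017-05-02', -3500.0, 'transfer')")

# Reassembling a customer view requires joining across all three tables.
row = conn.execute("""SELECT a.bank, t.amount FROM transactions t
                      JOIN subaccounts s ON t.subaccount_id = s.subaccount_id
                      JOIN accounts a ON s.account_id = a.account_id""").fetchone()
print(row)  # ('BankX', -3500.0)
```

The sketch makes visible why 'making sense' of such data is work: nothing resembling a consumer exists in any single table; it has to be assembled through joins.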
Our main work was cleaning the records we received from the original data-sets. Broadly speak-
ing, cleaning refers to the practice of revising data records to see if they are pertinent for further
analysis and thus pruning the data that does not fit (Amoore and Piotukh 2015). This action required
dealing with, among other elements, the data’s format and content, ensuring that it could fit into the
necessary analytics software and model. Cleaning, for example, demanded that we identify those records that did not fit into the analysis, either because they were 'wrong' or because they made the use of a particular analytic tool more difficult. The operation of defining what can and cannot be counted 'as data' relied on several practices and cut across distinct moments of the cleaning process.
It can be noted that while the company generates data as a result of its daily operations, this data was not used for any analysis; it simply existed in the system and was used for specific clients' web visualizations. In this sense, it is important to highlight that before we used the data-set as a
source for analysis, it had not been organized nor valued in this regard. It was only when we
moved into the step of analyzing the data as a whole that cleaning was required in order to make
that data useful for our purpose. In other words, for the data to become ‘valuable’ for analysis, it
needed to be transformed into something else by being cleaned. Therefore, cleaning entails making
data ready and valuable for a specific mode of analysis.
One of the first things we did with these data was to create SQL queries that listed all the data and
to explore whether it was ‘real’ data or not. To do this, we examined the list of data by using different
software that supported SQL programming language. We first employed the Google BigQuery
(GBQ) platform to create the list but, after some sessions, we exported the data and started using
Python and Excel as software for cleaning the data. In this case, as in others, we had to define whether
the data was correct or not. In this first moment, ‘strange numbers’ were defined in relation to the
other numbers in the list. If a number was too big or if a number was in an order of magnitude differ-
ent from others, we had to explore it further. This is similar to the cleaning practices described by
Helgesson (2010) in randomized clinical trials data cleaning practices, where comparing records
was a key task in defining which data were wrong. However, unlike Helgesson’s description, during
our work we did not use formal rules for cleaning, but instead we relied on a practical procedure,
which consisted of performing a visual examination of the columns to see if some records were
different from others in the list. For example, we created a list of all customers’ transactions and
found out that several of the transactions had ‘strange’ numbers, meaning either too big or too
small. When a record exhibited something strange, a second moment of valuation involved attribut-
ing the origin of the error. This entailed finding out the organizational reasons explaining a given
record: whether the data were false because of the code we wrote in previous sessions, because it
was an error originated in the client’s original data source, or because it was an error made in pre-
vious moments during the construction of the database. In order to clarify these possibilities, we dis-
cussed when the records were created, by whom, or how they were linked with transactions
registered in the clients’ interface. We also examined these strange records in relation to other
non-similar data that could help to make sense of them; for example the date or the type of account
linked to the record. By examining these different relations (with similar records, with previous
organizational work, and with other data), we worked on developing a plausible explanation of the origin and purpose of these 'strange records'. In other words, we situated records in terms of
their condition of being organizationally account-able (Neyland 2015). This means that the records’
values were stated in relation to their link to preexisting explanations related to previous organiz-
ational practices and procedures.
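The visual comparison of records described above can be approximated as a simple rule: flag values whose magnitude is far from the rest of the column. The threshold and the figures are illustrative; in practice the work relied on eyeballing the lists, not on a formal cutoff.

```python
import statistics

def flag_strange(amounts, factor=10):
    """Flag amounts whose magnitude departs strongly from the
    typical (median) magnitude of the column."""
    typical = statistics.median(abs(a) for a in amounts)
    return [a for a in amounts if abs(a) > factor * typical]

amounts = [120.0, -45.5, 89.9, 200.0, 98000000.0, 150.0]
print(flag_strange(amounts))  # [98000000.0]
```

Note that the rule only flags candidates; as the text stresses, deciding whether a flagged record is a real transaction or an error is a separate moment of valuation that draws on organizational knowledge.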
Our analysis of user transaction data provides a good example of the practice of cleaning. One of
the main SQL outcomes we produced was a list with all transactions for each user. This list was a
CSV file (a ‘comma-separated values’ file, or table whose values are separated by commas) that
included all different transactional records for each customer. Once we had the file, we opened it
to check whether the data looked okay. As we did this, we discovered that some numbers in the file
were too big compared to other records. This led us to discuss whether those numbers were ‘real
transactions’ – explained, for example, by a house purchase – or the results of an error in the database
and therefore ‘nonexistent’. While examining those big numbers, we found that some of those trans-
actions were made on dates the analysts identified as strange because they were before the company
was launched; so we decided that they were probably pieces of data that had been used as ‘tests’ when
the records were created, and therefore they were not real transactions or customers. Based on this,
we deleted those transactions from our database.
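The decision just described can be sketched as a date filter: transactions dated before the company's launch are treated as test records and removed. The launch date and the records here are invented for illustration.

```python
from datetime import date

LAUNCH = date(2014, 1, 1)  # hypothetical launch date

transactions = [
    {"tx_date": date(2012, 6, 3), "amount": 5_000_000},  # pre-launch 'test'
    {"tx_date": date(2016, 2, 9), "amount": 320},
]

# Records predating the launch cannot be real customer transactions,
# so they are pruned from the working database.
real = [t for t in transactions if t["tx_date"] >= LAUNCH]
print(len(real))  # 1
```

The filter encodes an organizational judgment, not a property of the numbers themselves: the very same amount would have counted as 'real' with a later date.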
However, not all error-free records could be counted as final 'data'. In other cases, even though some records were okay, we still had to decide whether they should form part of the analysis. The reason was that some records could complicate the subsequent analytical process without furnishing any gain in terms of the result. This operation of 'blacklisting' data in order
to make future work run smoothly was very important for certain specific tasks in number crunch-
ing. For example, while talking about this ‘blacklisting’ practice, an EasyFinance engineer told me
that he had to blacklist half of a database in order to develop a predictive analytics algorithm for
classifying transactions. He explained to me that, during that process, he had decided that blacklist-
ing all the transactions between different current accounts did not relate to a particular category of
spending and thus did not provide more useful information. Pruning data involves considering the
needs of economizing. As he explained to me:
Some records are previously pruned from our system. They are automatically classified. For example, [such-
and-such a type of] transactions between clients does not tell you anything. [For example], you know that a
transfer between clients is a bank deposit, you know the meaning, you know also the bank account, but it
does not conceal more information to us.
In this way, the analysis often focused exclusively on the data that were useful and left out of the picture the records that did not contribute to it. The emphasis on getting work properly
done instead of taking a representational view of records shows similarities with Lakoff’s ethno-
graphic work on how medical researchers reclassify experimental subjects in drug testing in order
to deal with testing uncertainty and bring a drug into the market (Lakoff 2007). What is critical,
in practical matters, is the ability of records to have relevance to the task at hand and to further stra-
tegic considerations. In a similar way, the process of pruning data and finding the ‘right’ data
depends on whether it is useful and improves the success of further operations.
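The 'blacklisting' practice discussed above might be sketched as follows. The category labels are illustrative, not the engineer's actual classification scheme.

```python
def blacklist(records, blocked_types=frozenset({"inter_account_transfer"})):
    """Split records into those kept for analysis and those pruned
    because their type carries no information for the task at hand."""
    kept, pruned = [], []
    for r in records:
        (pruned if r["type"] in blocked_types else kept).append(r)
    return kept, pruned

records = [{"type": "groceries", "amount": 45},
           {"type": "inter_account_transfer", "amount": 1000},
           {"type": "rent", "amount": 600}]
kept, pruned = blacklist(records)
print(len(kept), len(pruned))  # 2 1
```

The pruned transfer is a perfectly valid record; it is discarded not because it is wrong but because, for this particular classification task, it adds cost without adding information.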
As we have briefly described in previous paragraphs, cleaning practices entail examining and
valuing records from at least two different angles. The first form involves placing records in relation to other records, other data, and other elements and moments in the database. This requires developing a practical theory that gives a plausible reason or justification for their existence. The second form of valuation involves judging their relevance to the ongoing work and its purposes. As can be noticed, these two forms of valuation involve counting records as data insofar as they fit with the different entities, moments, tasks, and purposes of the database. In other words, and similarly to other marketing areas (Ariztia 2015, 2017), the truth of such data results more from a process of internal problem-solving than from a representation of 'real' customers. Real records are the ones that work. These two forms of valuation mean that the relevance or irrelevance of a given record relates to the different built-in theories populating the database elements, concerning among other things the database's purposes, the practical requirements for the task ahead, and the story and plausibility of each piece of data.
achievement that involves dealing with recurrent failures and error. As it has been discussed with
respect to other types of technological objects, consumer databases – and other business digital
objects – conceal an incessant work of maintenance and repair (Graham and Thrift 2007, Denis
and Pontille 2013). Attending to these moments of rupture and fragility, and thus highlighting
the maintenance, repair, and care tasks taking place during the production of databases is a third
corrective movement that might help to unpack the practical production of such digital objects.
In line with previous discussions on data-making practices, error provides a space where practical
work can be exposed and where different operations through which the reality of data is produced
become examinable. Errors can be understood as moments of resistance that unveil the elements and
relations through which these digital entities are constituted. In this sense, as discussed by STS scho-
lars (Star 1999), error is also a key heuristic tool through which the practical organization of such
objects can be unpacked. In looking at error, however, we are not only focusing on ‘spectacular
breakdowns’ as spaces where the invisible work of making a database is exposed, we are concerned
with the multiple recurrent errors and failures happening during the mostly mundane activity of
keeping data workable (Denis and Pontille 2013, Dominguez Rubio 2016).
We illustrate this point in the next paragraphs by describing the place of error during the process
of building a consumer database. As in the previous example, we focus on describing the critical
place of error in the process of number crunching.
Figure 1. Left picture: the GBQ interface with the command console where we wrote the code (left side: the list of tables in our project; middle right: the buttons we pressed, such as run and save; lower right: the view panel showing the outcome). Right picture: a drawing of database links and entities, produced during our first meetings; it proved a key tool for developing the SQL queries in GBQ.
Specifically, there is a very useful feature, called an 'alias', that removes the need to rewrite the full name of a table inside a query. During our first three meetings, we could not make this feature work properly. As a result, we spent much of our time writing queries and getting a significant number of errors (Figure 2).
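What a table alias does can be shown in standard SQL, here run through SQLite; the table name is invented, and the GBQ dialect the team used evidently handled aliases differently, which is what caused the trouble described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_transaction_records (amount REAL)")
conn.execute("INSERT INTO customer_transaction_records VALUES (42.0)")

# With the alias 't', the long table name is written once; every other
# reference in the query uses the short label instead.
row = conn.execute("""
    SELECT t.amount FROM customer_transaction_records AS t
    WHERE t.amount > 0
""").fetchone()
print(row[0])  # 42.0
```

When such a small convenience fails to work as expected, every query touching the table has to be rewritten in full, which is why a broken alias consumed whole meetings.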
During our first meetings, much of our time was spent finding ways to understand and ‘tame’ the
errors that popped up during the analyses we were doing. It is worthwhile to take a moment to describe
the trick we developed to deal with these problems. We examined the failed SQL code in detail and
developed alternative ways of writing the same SQL code with different wording and without using
the problematic piece of code. Once the alternative version was created, we pressed the ‘RUN’ button
again to test if it worked. The whole process involves a repetitive sequence of building an SQL query
and testing it in the framework, then creating a new variant and testing it again. This process of writing
and testing was repeated until we found which elements of the queries were ‘not working’ and fixed
them. Fixing errors involves finding the implicit rules and affordances of the different actors and entities taking part in organizing the database. This work resembles a process of continuous tinkering, such as that described by Mol in her research on health practices (Mol 2008).
Figure 2. List of errors made when creating new consumer records (failed queries are marked with a red exclamation sign; successfully saved queries appear without color).
A key element in our case was the database platform GBQ, which involved modes of work and coding that were unfamiliar to the engineers who participated in the project. For example, it had a very restrictive set of possibilities regarding which operations could or could not be done using SQL, and it also had a very small command box. Another element was the structure of permissions
for dealing with data. This means that in order to deal with some errors that involved, for example,
changing original records’ properties, we had to deal with different actors in order to obtain per-
mission. Doing this implied having to elaborate and explain why a specific solution was better
than the previous forms in which data permissions were structured. For example, during our
work with data, we realized that we needed to change the ‘date’ records in order to use them as num-
bers because we wanted to group transactions into weeks. However, for the creators of the database,
the week was not regarded as a key entity for organizing transaction, and it was unproblematic.
Therefore, dates were structured and registered as strings and not numbers. These types of situations
problematize also how existing relations between different actors involved in the production of the
database affect its final shape. For example, it happens when the distance between the creation time
of the database and the cleaning/programming time is wider since the people in charge are different
actors, as in the case of a formal supplier–client relationship. The space of possibilities for change
becomes more problematic as original decisions have to be taken for granted.
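The date problem can be made concrete: dates registered as strings carry no week, so they must be re-cast before transactions can be grouped weekly. The sketch below is illustrative Python over hypothetical transaction data (the authors' actual work was done in GBQ SQL); it uses the ISO week number as one possible definition of 'week'.

```python
from datetime import date

# Dates were registered as strings, not as dates or numbers, because the
# database's creators never treated the week as a relevant entity.
transactions = [
    ("2015-03-02", 10.0),  # ISO week 10
    ("2015-03-04", 5.0),   # same week
    ("2015-03-09", 7.0),   # following week, ISO week 11
]

weekly = {}
for raw_date, amount in transactions:
    # Re-cast the string into a date so a week number can be derived.
    y, m, d = (int(part) for part in raw_date.split("-"))
    week = date(y, m, d).isocalendar()[1]  # ISO week number
    weekly[week] = weekly.get(week, 0.0) + amount

print(weekly)  # {10: 15.0, 11: 7.0}
```

The re-casting step is exactly the kind of transformation that required negotiating permissions: it changes how an original record is typed, not merely how it is displayed.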
The previous examples highlight the central place of error and repetition in the process of assembling the database. Error operates as a critical space through which different possibilities open and close for the analysts, since it generates a moment of pause and examination in which the different movements, and the existing possibilities, are reexamined. Error also unveils the different entities and relations that constitute a particular version of the database, exposing the guiding principles, forms of delegation, and affordances by which a given record was produced. As such, errors also involve dealing with the attributions made with respect to other moments in the production of data-sets.
Error also worked for us as a key methodological device. Much in line with scholars of infrastructure (Star 1999), who propose focusing on infrastructural breakdown as a methodological trick for examining how infrastructures operate, examining error in data practice could offer a very useful path for researching this space.
Conclusion
In this article, we have examined some of the mundane practices involved in the production of a consumer database. More concretely, we described some situations pertaining to the early process of number crunching data, a relatively humble data practice concerned with organizing, cleaning, and transforming records to make them suitable for further analytics. This article can thus be read as an empirical, ethnographic description of the multiple circumstances proper to the initial stages in the creation of a consumer database.
By arguing that such objects exist as the result of a myriad of mundane practices and devices, this paper attempts to establish a contrast with approaches that have focused on the large-scale societal consequences of the data deluge in business and marketing. This brings us to a final reflection, one that has figured throughout the pages of this article: the place of politics.
Focusing on the ordinary production of objects such as consumer databases situates the discus-
sion about politics in a different place: it concerns the examination of the ontological politics taking
place in the manufacturing of such objects (Mol 1999, Woolgar and Lezaun 2013). This paper has
addressed this place in two related manners.
On the one hand, in terms of the specificities of the data practices analyzed here, we have examined how cleaning involves defining what counts as data. A key element described here is that data do not become final necessarily by virtue of their adequacy to an external referent, but according to how they fit and work with the myriad of other entities, operations, and practical purposes that shape the database. These different elements entail specific theories, as well as forms of agency, about what might or might not count as data. Focusing on mundane practices and errors might thus help to unveil and trace the different elements that make a specific type of data real.
On the other hand, this paper engages with ontological politics in a second manner, which concerns proposing new ways of approaching the manufacturing of digital objects in business settings. Against this backdrop, we have suggested that designing new forms of experimental and critical engagement, focused on unveiling the different mundane operations by which digital objects are enacted, can contribute to making business digital objects more accountable and governable (Jasanoff 2016). Furthermore, it might help social scientists to extend our critical examination of digital capitalism beyond a focus on 'results', thus exposing the contingencies and hinterlands taking part in the production of these objects.
Notes
1. See http://www.forbes.com/sites/gilpress/2014/12/11/6-predictions-for-the-125-billion-big-data-analytics-
market-in-2015/.
2. The name of the company has been anonymized.
3. Quoted from the company’s webpage.
4. As we were told, records in the system had entities corresponding to 'personal details'. Before uploading the data into the system we used, the engineers at EasyFinance deleted personal data and created an artificial ID for each user. In this sense, it is important to note that the data we used required EasyFinance to engineer an artificial situation that produced anonymized data, something that is not necessarily done in natural settings.
5. From this database, the company regularly performs several operations. For instance, it calculates your balance, offers you specific promotions, or organizes your transactions into different categories.
6. SQL (Structured Query Language) is a standard programming language used to communicate with and manipulate data in relational databases. It is one of the oldest and most commonly used data programming languages.
7. GBQ (Google BigQuery) is an online analytics tool, provided by Google, for handling large amounts of data.
Disclosure statement
No potential conflict of interest was reported by the author.
Funding
This work was supported by Fondo de Fomento al Desarrollo Científico y Tecnológico [Grant Number 1140078].
Notes on contributor
Tomas Ariztia is associate professor in the Escuela de Sociología, Universidad Diego Portales, Chile. His research inter-
ests concern the sociology of markets and consumption, science and technology studies, and sustainable transitions.
He is particularly interested in research on how business professionals enact social worlds. Currently he is involved in a
4-year research project that focuses on the domestic, commercial, and regulatory lives of recycling and solar
technologies.
ORCID
Tomas Ariztia http://orcid.org/0000-0001-5806-3328
References
Amoore, L. and Piotukh, V., 2015. Life beyond big data. Governing with little analytics. Economy and Society, 44 (3),
341–366.
Andrejevic, M., Hearn, A., and Kennedy, H., 2015. Cultural studies of data mining. Introduction. European Journal of
Cultural Studies, 18 (4–5), 379–394.
Ariztia, T., 2015. Unpacking insight: how consumers are qualified by advertising agencies. Journal of Consumer
Culture, 15 (2), 143–162.
Ariztia, T., 2017. Manufacturing the consumer’s truth: the uses of consumer research in advertising Inquiry. In: F.
Cochoy, et al., eds. Markets and the arts of attachment. New York: Routledge, 38–54.
Beer, D., 2009. Power through the algorithm? Participatory web cultures and the technological unconscious. New
Media & Society, 11 (6), 985–1002.
Cukier, K. and Mayer-Schonberger, V., 2013. Big data: a revolution that will transform how we live, work and think.
Boston, MA: Mariner Books.
Denis, J. and Pontille, D., 2013. Material ordering and the care of things. CSI working papers series No. 34.
Dominguez Rubio, F., 2016. On the discrepancy between objects and things. An ecological approach. Journal of
Material Culture, 21 (1), 59–86.
Farías, I. and Wilkie, A., eds., 2015. Studio studies: operations, topologies & displacements. London: Routledge.
Fourcade, M. and Healy, K., 2016. Seeing like a market. Socio-Economic Review, 15, 9–29.
Garfinkel, H., 1967. Studies in ethnomethodology. Cambridge: Polity.
Garnett, E., 2016. Developing a feeling for error. Practices of monitoring and modelling air pollution data. Big Data &
Society, 3 (2), 1–25.
Gitelman, L., ed., 2013. “Raw data” is an oxymoron. Cambridge: MIT Press.
Graham, S. and Thrift, N., 2007. Out of order. Understanding repair and maintenance. Theory, Culture & Society, 24
(3), 1–25.
Helgesson, C.-F., 2010. From dirty data to credible scientific evidence: some practices used to clean data in large ran-
domised clinical trials. In: C. Will and T. Moreira, eds. Medical proofs, social experiments: clinical trials in shifting
contexts. Aldershot: Ashgate, 49–64.
Jasanoff, S., 2016. The ethics of invention: technology and the human future. New York: W.W. Norton & Company (The Norton global ethics series).
Knorr-Cetina, K., 1999. Epistemic cultures. How the sciences make knowledge. Cambridge: Harvard University Press.
Knorr-Cetina, K. and Preda, A., eds., 2005. The sociology of financial markets. Oxford: Oxford University Press.
Knox, H., O’Doherty, D., and Vurdubakis, T., 2007. Transformative capacity, information technology, and the making
of business ‘experts’. The Sociological Review, 55 (1), 22–41.
Kushner, S., 2013. The freelance translation machine. Algorithmic culture and the invisible industry. New Media &
Society, 15 (8), 1241–1258.
Lakoff, A., 2007. The right patients for the drug: managing the placebo effect in antidepressant trials. BioSocieties, 2 (1),
57–71.
Latour, B., 1992. Ciencia en acción: Cómo seguir a los científicos e ingenieros a través de la sociedad. Barcelona: Labor.
Lezaun, J., Muniesa, F., and Vikkelso, S., 2013. Provocative containment and the drift of social-scientific realism.
Journal of Cultural Economy, 6 (3), 278–293.
Lezaun, J., Marres, N., and Tironi, M., 2017. Experiments in participation. The Handbook of Science and Technology
Studies, 7, 195–222.
Lupton, D., 2016. The quantified self: A sociology of self-tracking. Cambridge: Polity.
Lynch, M., 2013. Ontography: investigating the production of things, deflating ontology. Social Studies of Science, 43
(3), 444–462.
Mackenzie, A., 2003. These things called systems: collective imaginings and infrastructural software. Social Studies of
Science, 33 (3), 365–387.
Mackenzie, A., 2012. More parts than elements. How databases multiply. Environment & Planning D, 30 (2), 335–350.
Mackenzie, A. and McNally, R., 2013. Living multiples. How large-scale scientific data-mining pursues identity and
differences. Theory, Culture & Society, 30 (4), 72–91.
MacKenzie, D., et al., 2012. Drilling through the Allegheny mountains. Journal of Cultural Economy, 5 (3), 279–296.
Marres, N., 2009. Testing powers of engagement: green living experiments, the ontological turn and the undoability of
involvement. European Journal of Social Theory, 12 (1), 117–133.
Marres, N., 2012. The redistribution of methods: on intervention in digital social research, broadly conceived. The
Sociological Review, 60, 139–165.
McFall, L. and Ossandon, J., 2013. What’s new in the ‘new, new economic sociology’ and should organisation studies
care? In: P. Adler, et al., eds. Oxford handbook of sociology, social theory and organization studies: contemporary
currents. Oxford: Oxford University Press, 510–533.
Michael, M., 2012. What are we busy doing? Engaging the idiot. Science, Technology & Human Values, 37 (5), 528–554.
Mol, A., 1999. Ontological politics. A word and some questions. The Sociological Review, 47 (S1), 74–89.
Mol, A., 2008. The logic of care: health and the problem of patient choice. New York: Routledge.
Moore, P. and Robinson, A., 2015. The quantified self. What counts in the neoliberal workplace. New Media & Society,
18, 1–19.
Morozov, E., 2014. To save everything, click here. The folly of technological solutionism. New York: PublicAffairs.
Muniesa, F., 2014. The provoked economy. Economic reality and the performative turn. London: Routledge.
Nadim, T., 2016. Data labours. How the sequence databases GenBank and EMBL-Bank make data. Science as Culture,
25 (4), 1–24.
Neyland, D., 2014. On organizing algorithms. Theory, Culture & Society, 32 (1), 119–132.
Neyland, D., 2015. Bearing account-able witness to the ethical algorithmic system. Science, Technology & Human
Values, 41 (1), 50–76.
Ruppert, E., 2012. The governmental topologies of database devices. Theory, Culture & Society, 29 (4–5), 116–136.
Ruppert, E. and Savage, M., 2011. Transactional politics. The Sociological Review, 59, 73–92.
Ruppert, E., Law, J., and Savage, M., 2013. Reassembling social science methods. The challenge of digital devices.
Theory, Culture & Society, 30 (4), 22–46.
Star, S.L., 1999. The ethnography of infrastructure. American Behavioral Scientist, 43 (3), 377–391.
Thévenot, L., 1984. Rules and implements: investment in forms. Information, 23 (1), 1–45.
Turow, J., McGuigan, L., and Maris, E.R., 2015. Making data mining a natural part of life. Physical retailing, customer
surveillance and the 21st century social imaginary. European Journal of Cultural Studies, 18 (4–5), 464–478.
Woolgar, S., and Lezaun, J., 2013. The wrong bin bag: a turn to ontology in science and technology studies? Social
Studies of Science, 43 (3), 321–340.
Zwick, D. and Denegri Knott, J., 2009. Manufacturing customers. The database as new means of production. Journal of Consumer Culture, 9 (2), 221–247.