Michael T. Zimmer
and Siva Vaidhyanathan for providing me with valuable guidance, challenges, and
new approaches to this dissertation, and Professors Brett Gary, Ted Magder, and
Powers, and Maryann Tsokantas, who, along with Professor JoEllen Fisherkeller,
helped shape this project from its earliest days. Additional thanks to Melissa
and the rest of the Ph.D. student community for their helpful feedback and
and Tim Weber for their friendship and support throughout these hectic years.
Society Project at Yale Law School, the Philosophy Departments at the Delft
Philosophy and Technology, and the Society for the Social Studies of Science.
Additional thanks to Geoffrey Bowker and Susan Leigh Star at the Center for
Funding for this research was generously provided by the Phyllis and
Gerald LeBoff Doctoral Fellowship, the PORTIA Project, and a National Science
both the Zimmers and the Laydes – for their unending love and support, and
encouragement, love, and proofreading this dissertation simply would not exist.
Finally, I thank Ethan, my amazing son, for being a new source of inspiration in
TABLE OF CONTENTS
ACKNOWLEDGEMENTS iii
LIST OF FIGURES ix
CHAPTER
I INTRODUCTION 1
Prologue 1
Spheres of Mobility 3
The Search for the Perfect Search Engine 5
A Faustian Bargain? 8
Overview of Social Research on Search Engines 10
Privacy in Technology 14
Methodology 17
Chapter Outline 21
Introduction 24
Early Philosophies of Technology 25
Summary 34
Founders of Contemporary Philosophy of Technology 34
Summary 42
Contemporary Philosophies of Technology 43
Media Ecology 43
Social Construction of Technology 49
Politics of Technology 53
Ethics in Technology 64
Summary 71
Chapter Summary: A Faustian Bargain 73
Introduction 79
A Brief History of the Internet and World Wide Web 80
The Internet 80
Hypertext Links 81
The World Wide Web 86
Early Internet and Web Navigation Tools 88
FTP, Gopher, and WAIS 88
Web Directories 90
Web Search Engines 99
How Web Search Engines Work 103
Economics of Web Search Engines 111
The Quest for the “Perfect Search” 114
Perfect Reach 115
Perfect Recall 116
A Faustian Bargain? 117
Introduction 156
Understanding “Contextual Integrity” 160
Example: Vehicle Safety Communication Technology 165
Contextual Integrity in the Perfect Search 169
Introduction 178
Physical Mobility: Freedom on the Roads 180
Exploration, Autonomy, and Escape on the Roads 183
Threats to Freedom on the Roads 189
Intellectual Mobility: Intellectual Freedom and the Library 193
Intellectual Freedom and the Library Bill of Rights 195
Privacy and the Library Bill of Rights 198
Digital Mobility: Autonomy, Privacy, and Digital Rights Management 205
DRM, Digital Mobility, and Privacy 211
Convergence of Mobilities 214
Summary 219
BIBLIOGRAPHY 229
APPENDICES
LIST OF TABLES
LIST OF FIGURES
7 Partial Google Web search results page for “Boston subway” 266
CHAPTER I
INTRODUCTION
Prologue
uphold an online pornography law, the U.S. Department of Justice had asked a
federal judge to compel the Web search engine Google to turn over records on
millions of its users’ search queries (Hafner & Richtel, 2006; Mintz, 2006).
Google resisted, but three of its competitors, America Online (AOL), Microsoft,
and Yahoo!, complied with similar government subpoenas of their search records
(Hafner & Richtel, 2006). Later that year, AOL released over 20 million search
queries from 658,000 of its users to the public in an attempt to support academic
research on search engine query analysis (Hansell, 2006). Despite AOL’s attempts
to anonymize the data, individual users remained identifiable based solely on their
search histories, which included search terms matching users’ names, social
“innumerable number of life stories ranging from the mundane to the illicit and
bizarre” (McCullagh, 2006a). Upon being identified by the New York Times based
solely on her search terms in the AOL database, a Georgia woman exclaimed,
“My goodness, it’s my whole personal life…I had no idea somebody was looking
These two cases revealed how the collection of users’ web search
value considered “fundamental to our free society” (Froomkin, 2000, p. 121). Yet,
while these events brought to light the fact that search engine providers routinely
keep detailed records of users’ searches, and created anxiety among some
information-seeking activities (Barbaro & Zeller Jr, 2006; Hafner, 2006; Levy,
2006; Maney, 2006), by and large, users continued to flock to the growing suite of
paradox has emerged: despite revelations that Web search engine companies
increasingly monitor, store, aggregate – and in some cases, share with third
parties – users’ search histories, users continue to embrace and integrate these
services into their daily lives. A faithful Google user interviewed by the New York
Times puts it best: “I don’t know if I want all my personal information saved on
of an improvement on how life was before, I can’t help it” (Williams, 2006).
Google’s mission “to organize the world's information and make it universally
accessible and useful” (Google, 2005b) is indeed alluring, but the demands on the
1
In the year since the DOJ case emerged, search engine activity has
increased from 5.3 billion searches in February 2006 (Nielsen//NetRatings, 2006)
to 6.4 billion in 2007 (Nielsen//NetRatings, 2007), an increase of over 20%.
Google, for its part, reported $10.6 billion in revenues during fiscal 2006,
compared to only $6.1 billion for 2005, a 73% increase (Google, 1999).
individual to disclose intimate and personal information with the search engine
a company whose motto is “Don’t be evil” (Google, 2005k), have we, like Faust,
Spheres of Mobility
In his thesis on justice and injustice, legal scholar Edmond Cahn has
forms the foundation for a just society. Physical mobility relates to movements of
people in geographical space and their ability to navigate and explore new spaces,
involves the freedom to learn new things, explore new ideas, adapt, and change
individual. A new digital mobility is also emerging, providing the means to move
within and across digital computer networks, offering a novel means of achieving
both physical and intellectual mobilities in an online world. These mobilities are
Within these spheres of mobility, individuals create, discover, and enjoy
spaces for personal growth, exploration, and escape. Central to these mobilities is
the notion that individuals are granted the right to move about and explore new
foster the freedoms envisioned by Cahn, our spheres of mobility must become
what Hakim Bey has described as “temporary autonomous zones,” the moments
and spaces that elude formal structures of control “in which freedom is not only
possible but actual” (1991, p. 131). Within these spheres, individuals have come
cannot gain the sort of understanding of our world and develop the awareness and
ranging from the automobile and related network of roads and highways to provide
physical mobility, to public library systems and their related information services
supporting intellectual mobility, to the Internet and its related protocols that
facilitate mobility across digital computer networks. The latest addition to the
assortment of tools for navigating spheres of mobility is the Web search engine,
communication and interaction, and new means of experiencing the world. The
prominent role of Web search engines in navigating our contemporary spheres of
their everyday lives (see Horrigan & Rainie, 2006), Web search engines have
available on this global network. Consider, for example, the Web search engine
Google. Google has become the prevailing knowledge tool for searching and
research project by Larry Page and Sergey Brin at Stanford University (see Brin
& Page, 1998; Page et al., 1998), Google’s Web search engine now dominates the
market, processing almost 3.6 billion search queries in February 2007, over half
stated quite simply and innocuously, is to “organize the world’s information and
goal, Google has developed dozens of search-related tools and services to help
users organize and use information in multiple contexts, ranging from general
2
At its peak in early 2004, Google handled upwards of 80 percent of all
search requests on the Web through its own website and clients such as Yahoo!,
AOL, and CNN, which relied on Google for their customers’ search engine results.
Google’s share fell to a still-dominant 57% in 2004 when Yahoo! dropped
Google’s search technology for its own (Hansen, 2004).
management, shopping and product research, computer file management, and
search-related services and tools.4 They also use these tools to communicate,
navigate, shop, and organize their lives. By providing a medium for various
large part of people’s lives, both online and off (Williams, 2006). In many ways,
the physical, intellectual, and digital mobilities described above converge in Web
Since the first search engines provided a means of navigating the spheres
of mobility accessible via the Web, a desire emerged to create the “perfect search
providing fast and relevant results. The quest for the perfect search engine has led
to calls for search engines to provide results that suit the “context and intent” of
the search query. The perfect search engine must have “perfect reach” to
deliver any type of online content from all online (and, increasingly, offline)
sources, as well as “perfect recall” to deliver personalized and relevant results that
3
See Chapter IV for detailed discussion of Google’s various products and
services.
4
Yahoo!, and to a lesser extent, Microsoft and AOL, also offer search-
related tools beyond just locating relevant websites. Google, however, remains the
clear market leader at 55.8% of all search activity, with Yahoo! following at
20.7% and Microsoft at 9.6% (Nielsen//NetRatings, 2007). Given its strong
dominance of the overall marketplace and its recognition as the “gold standard” in
search engine practices and innovation (Hellweg, 2002; Clark, 2006), Google will
be the primary focus of this dissertation.
are informed by who the searcher is. Given a search for “Paris Hilton,” the perfect
search engine will know whether to deliver results about the celebrity socialite,
complete with the requisite image and video files, or a place to spend the night in
France, complemented with photos of the property, maps, and even flight
search engine: the company’s very first press release noted that “a perfect search
engine will process and understand all the information in the world…That is
where Google is headed” (Google, 1999). Google co-founder Larry Page later
reiterated the goal of achieving the perfect search: “The perfect search engine
would understand exactly what you mean and give back exactly what you want”
(Google, 2007). Silicon Valley journalist John Battelle summarizes how such an
Imagine the ability to ask any question and get not just an accurate answer,
but your perfect answer – an answer that suits the context and intent of
your question, an answer that is informed by who you are and why you
might be asking. The engine providing this answer is capable of
incorporating all the world’s knowledge to the task at hand – be it
captured in text, video, or audio. It’s capable of discerning between
straightforward requests – who was the third president of the United
States? – and more nuanced ones – under what circumstances did the third
president of the United States foreswear his views on slavery?
This perfect search also has perfect recall – it knows what you’ve
seen, and can discern between a journey of discovery – where you want to
find something new – and recovery – where you want to find something
you’ve seen before. (Battelle, 2004)
When asked what a perfect search engine would be like, Sergey Brin replied quite
A Faustian Bargain?
mobility. Yet, while the quest for the perfect search is spearheaded by a company
critic Neil Postman that the true relationship between a society and its technology
is often not purely benevolent, but instead may require a sacrifice for society to
History has revealed how such a Faustian bargain persists with many technologies
collection systems provide efficiencies for physical mobility along the highways,
collections and circulation, but also facilitates the recording of each patron’s
commerce and enhance digital mobilities, the widespread use of Web cookies has
defined as both “the massive collection and storage of vast quantities of personal
data” (Bennett, 1996, p. 237) and “the systemic use of personal data systems in
often for the purpose of forming detailed “digital dossiers” (Solove, 2004, p. 2) of
nearly every individual within its reach. While often developed for benign
have devolved into Faustian bargains, with concrete effects relating to issues of
The task of this dissertation, then, is to expose the perfect search engine as
dissertation will explore how the quest for the perfect search engine also
surveillance, and DRM – the dissertation will argue that the quest to create the
5
Clarke (1988) discusses the relative benefits and dangers of dataveillance
technologies in more detail.
information flows, restricting the ability to engage in social, cultural, and
information technologies. The impact of search engines on society and culture has
enhancing the underlying Web search engine technology (see, for example, Brin
& Page, 1998; Page et al., 1998; Heydon & Najork, 1999), but also technical
analyses of the extent of coverage achieved by search engine products and how it
relates to information access (see, for example, Lawrence & Giles, 1998, 2000;
Some of the earliest social studies of Web search engines emerged from
engine users through the analysis of transaction log data (Jansen & Pooch, 2001).
These include Hölscher’s (1998) analysis of 16 million queries from the German
search engine Fireball; Jansen, Spink, and Saracevic’s (2000) study of a sample
day’s worth of search activity from the Excite search engine; and Silverstein,
Henzinger, Marais, and Moricz’s (1999) detailed analysis of just under one billion
queries submitted to the Alta Vista search engine over a 42-day period. These
studies of transaction log data provide valuable information about search query
structure and complexity, including insights about common search topics, query
length, Boolean operator usage, search session length, and search results page
studies offer limited insights into the behavior of Web searchers beyond the
search queries submitted. Eszter Hargittai’s (2002; 2004b) use of surveys and in-
providing insights into how people find information online in the context of their
other media use, their general Internet use patterns, and their social support
allowed Hargittai (2004a) to reveal the ways that factors such as age, gender,
education level, and time spent online are relevant predictors of a user’s Web
searching skills. The work of Machill et al. (2004) and Hölscher and Strube
users.
Jansen and Spink recognize that the “overwhelming research focus in the
scientific literature is on the technological aspects of Web search” and that when
studies do venture beyond the technology itself, they are “generally focused on
the individual level of analysis” (2004, p. 181). In response, recent scholarship
has moved beyond the technical and individual focus of the user studies described
above to include research into broader cultural, legal, and social implications of
Web search engines. For example, cultural scholars (Wouters et al., 2004;
Hellsten et al., 2006) have explored the ways in which search engines “re-write
the past” due to the frequent updating of their indices and the corresponding loss
of a historical record of content on the Web. Martely (in press) and Roy and Chi
(2003) examine gendered differences in Web search engine use, suggesting that
well.
Introna and Nissenbaum’s (2000) seminal study, “Shaping the Web: Why
the Politics of Search Engines Matter,” was among the first to analyze search
engines from a political perspective, noting how search engines have been
Search engines, then, act as a powerful source of access and accessibility within
the Web. Introna and Nissenbaum reveal, however, that search engines
“systematically exclude certain sites, and certain types of sites, in favor of others,
169).
Such a critique resembles the stance that political economists take against
McChesney, 1999), a critique that has recently been extended to Web search
engines. For example, Hargittai (2004b) has extended her user studies to include
search engine industry impact the way in which content is organized, presented,
and distributed to users. And Van Couvering (2004) has engaged in extensive
research on the political economy of the search engine industry in terms of its
ownership, its revenues, the products it sells, its geographic spread, and the
politics and regulations that govern it. Drawing comparisons to concerns over
market consolidations in the mass media industry, Van Couvering fears that the
market concentration and business practices of the search engine industry might
limit its ability to serve “the public interest in the information society” (Van
Extending from these various social and cultural critiques, Web search
engines have only recently been scrutinized from a moral or ethical perspective. A
recent panel discussion at the Santa Clara University Markkula Center for
Applied Ethics was one of the first to bring together ethicists, computer scientists,
and social scientists for the express purpose of confronting some of the
search engine bias, censorship, trust, and privacy (Norvig et al., 2006). A special
Engines” (Nagenborg, 2005) brought into focus many of the particular privacy
concerns with search engines. Included in this special issue were discussions of
the use of search engines to acquire information about persons, threatening their
search engines to collect user search histories and how this ability might impinge
the tracking of users’ Web search histories, a weakness that this dissertation will
attempt to overcome. Building from these social, cultural, and ethical explorations
into Web search engines, this dissertation will reveal how the quest for the perfect
search engine not only implicates the privacy of our personal information
but also, if left unfettered, threatens to impinge upon the freedoms enjoyed in our
spheres of mobility.
Privacy in Technology
This dissertation argues that technologies embody values, that their design
diverse community of scholars has recently emerged to understand how the rise of
information technologies bears on moral and ethical values (see, for example,
2005). This research seeks to identify, understand, and address the ethical and
value-laden concerns that arise from the rapid design of information technologies
and their deployment into society. Friedman and Kahn (2002) identify twelve
specific values with moral and ethical import that are often embedded in the
Warren and Brandeis’ (1890) seminal essay “The Right to Privacy,” in which they
quote Judge Cooley’s view of privacy as the right to be left alone. Other popular
(Westin, 1970), or with its complement, the control over the degree of access to
oneself (Gavison, 1980). On another view, privacy means having control over
one’s entire realm of intimate decisions, including decisions about physical access
(Inness, 1992). Privacy has been further described as both an individual and
a social value (Regan, 1995), and in relation to the system of norms that facilitate
6
Friedman and Kahn (2002) argue that these are universal values,
although how such values play out in a particular culture at a particular point in
time can vary considerably.
7
A more detailed discussion of the relationship between privacy and
technology can be found in Agre and Rotenberg (1997).
• Privacy of the person, concerned with the integrity of the individual’s
body.
• Privacy of personal behavior, relating to all aspects of behavior, but
especially to sensitive matters, such as sexual preferences and habits,
political and intellectual activities and religious practices, both in private
and in public places.
• Privacy of personal communications, the interest in being able to
communicate with other individuals, using various media, without routine
monitoring of their communications by other persons or organizations.
• Privacy of personal data, the claim that data about oneself should not be
automatically available to other individuals and organizations, and that,
even where data is possessed by another party, the individual must be able
to exercise a substantial degree of control over that data and its use.
(Clarke, 1997)
culture, both for reasons related to individual dignity and because of the powerful
chilling effect that disclosure of intellectual preferences would produce” (p. 48;
emphasis added). The privacy of intellectual activity has been tied to the right to
read and consume media products anonymously (Cohen, 1996; Froomkin, 1999),
the right to send and receive communications free from government surveillance
(Regan, 1995, 2001), and the right of unrestricted access to information and open
inquiry free from fear that privacy or confidentiality could be compromised (see
those that take place while navigating the sphere of the Internet. Concerns over
the impact of technology on the flow of online activities have emerged in the
Web cookies and tracking bugs (Kang, 1998; Bennett, 2001), the digitization of
library records (Sturges et al.), and the general expansion of the ability to
aggregate personal activity from all such sources (Garfinkel, 2000; Solove, 2004).
“fundamental to our free society” (Froomkin, 2000, p. 121). Arising from these
concerns, scholars and designers alike have attempted to incorporate privacy
into the design stages of technological systems. It is from such efforts that this
Methodology
1993; Raskin, 2000), participatory design (Muller & Kuhn, 1993; Sclove, 1995),
reflective design (Sengers et al., 1990, 2005), and critical technical practice (Agre,
1997a, 1997b; Boehner et al., 2005). Recently, new pragmatic frameworks have
emerged to ensure that particular attention to moral and ethical values becomes an
and systems. These include Design for Values (Camp, n.d.), Values at Play
(Flanagan et al., in press, 2005), and Value Sensitive Design (Friedman, 1999;
al., in press), where attention to three different modes (balls) of investigation must
conceptual mode, the first ball in play, involves an analysis informed by ethics and
question. The second ball in play is the technical investigation of the particular
design specifications and variables that might promote or obscure given values
within the context of the technology being designed. Finally, the empirical mode
of investigation, the third ball in play, has the dual goal of providing measurable
analyses of the values within the particular design context (in support of the
investigation).
The Values at Play (VAP) approach offers a similar tripartite
phases (Flanagan et al., 2005). The goal of the discovery phase is to identify the
those explicit in the aspirations of the technology’s designers, as well as those that
emerge only when the technological design process is underway. The translation
phase of VAP is the activity in which designers translate the value considerations
identified in the discovery phase into the architecture and features of the
technology. The final phase is verification, ensuring that the designers have
human and ethical values. For example, Friedman, Felten, and their colleagues
cookie management tools in support of the values of informed consent and user
privacy (Friedman et al., 2002). Similarly, Camp and her colleagues engaged in
protect Internet users from consumer fraud and identity theft (Camp et al., n.d.;
science while also embodying values such as cooperation, creativity, privacy, and
independence (Flanagan et al., in press, 2005). Howe and Nissenbaum’s (2006)
“TrackMeNot” Web browser extension was designed to help obfuscate one’s Web
the values of privacy and user autonomy. Finally, Zimmer (2005), applied Value-
Huits et al., in progress), these examples reveal the promise of influencing the
This dissertation will follow the lead of these case studies to form the
groundwork for the value-conscious design of the perfect search engine. Two
balls from the value-conscious design methodological toolkit will be put in play:
and cultural framing of the values enjoyed in our spheres of mobility, bringing
conceptual clarity and a normative understanding of the ways in which the quest
for the perfect search engine bears on these values.8 The dissertation will also
uncover how its technological properties and underlying architecture might bear
8
This is in line with the “disclosive computer ethics” called for by Philip
Brey (2000); see discussion in Chapter II.
methodological frameworks are meant to be iterative, the initial steps taken by
verification stages of VAP, as well as the empirical mode of VSD, with the shared
goal of proactively influencing the design of the perfect search engine to account
for the values necessary for the full enjoyment of our spheres of mobility.
Chapter Outline
presents the starting point of this dissertation, that the design of technology bears
This chapter will sketch a brief outline of key humanist, social, and philosophical
systems and artifacts impact society in ethical and value-laden ways, closing with
the realization that society often makes a Faustian bargain with its technology –
Given this concern about the Faustian bargain with technology, Chapter
III, “The Quest for the Perfect Search Engine,” discusses the emergence of a
significant social impact: the Web search engine. This chapter introduces the
Internet and the World Wide Web, and how the Web search engine has become
the “center of gravity” of people’s online intellectual activities. Chapter III closes
with a discussion of the growing quest among Web search engine companies to
achieve the “perfect search,” and the instruments of reach and recall necessary to
Chapter IV, “Google’s Quest for the Perfect Search Engine,” focuses the
previous chapter’s discussion on today’s dominant search engine, Google, and its
particular quest for the perfect search engine. As the design features of their
version of the perfect search engine begin to take hold, certain anxieties emerge
that might threaten the values in our spheres of mobility, especially in terms of
privacy and the flow of personal information. However, despite this rising
anxiety, many users prefer to remain citizens of “Planet Google”; thus, a Faustian
While the previous chapter identifies many of the latent anxieties of the
perfect search, little evidence can be found that these concerns are affecting the
overall popularity and use of Web search engines. Drawing on the concern that
the design of these technologies is being taken “at interface value” (Turkle,
1995, p. 103), Chapter V, “Contextual Integrity and the Perfect Search Engine,”
conceptual clarity to how Google’s quest for the perfect search is disrupting
perfect search engine, but lacking the normative framework to determine its harm
activities free from answerability and oversight. This chapter identifies the values
at play within our spheres of physical, intellectual, and digital mobility, and
illustrates how the shifts in informational norms caused by the quest for the
perfect search engine threaten our ability to realize full freedom of mobility
proposals for how the Faustian bargain between society and the quest for the
for personal privacy and ensure continued freedom within our spheres of mobility.
search engine industry, and pragmatic intervention with search engine companies
these chapters. Appendix A, “Google’s Quest for the Perfect Search: Products and
Data Capture,” outlines twenty-six of Google’s key products and services in each
presents a thought experiment with two ideal-typical information seekers – Libby
and Netty – who differ only in how they navigate their informational spheres.
Comparisons are made between the personal information flows inherent within
CHAPTER II
Introduction
embody values, that their design bears “directly and systematically on the
political values” (Flanagan et al., in press). Identifying the social, ethical, and
technology and its role in nearly all aspects of society. But such an understanding
is rarely transparent. In his book Technopoly, Neil Postman remarked how “we
are surrounded by the wondrous effects of machines and are encouraged to ignore
the ideas embedded in them. Which means we become blind to the ideological
meaning of our technologies” (1992, p. 94). It has been the goal of many
systems and artifacts. This chapter will sketch the history and development of
survey of the robust contributions to the study of technology and society from
these various disciplinary perspectives, if even possible, is beyond the scope of
this chapter.9 The sections that follow aim to provide a reliable map of this rich
and important intellectual territory and identify the waypoints to help establish the
While the study of the nature of technology and its effects has become an
object of great interest for philosophers over the past few centuries, technology
we often view technology as physical tools and machines, the term “technology”
has its roots in the Greek techne, meaning literally “an art or craft” (Williams,
manual work and physical manipulation of the world. This is contrasted with
knowing. Put simply, techne involves tools of the body and earth, while episteme
against episteme, the latter considered by Plato, like Aristotle after him, as a
9
More extensive surveys include Durbin and Rapp (1983), Ihde (1993),
Mitcham (1994), Scharff and Dusek (2003), and Kaplan (2004).
higher form of human knowledge.10 For Plato, techne is merely imitative of
episteme, a lower form of knowledge achieved through manual arts and crafts,
rather than mental and philosophical reasoning. One target of Plato’s critique of
the imitative nature of techne is writing, viewed by Plato as a craft that merely
Plato invokes a meeting between the Egyptian god Theuth and the pharaoh
Theuth, responsible for introducing geometry and astronomy to the world, presents his
Most ingenious Theuth, one man has the ability to beget arts, but the
ability to judge of their usefulness or harmfulness to their users belongs to
another; and now you, who are the father of letters, have been led by your
affection to ascribe to them a power the opposite of that which they really
possess. For this invention will produce forgetfulness in the minds of those
who learn to use it, because they will not practice their memory. Their
trust in writing, produced by external characters which are no part of
themselves, will discourage the use of their own memory within them.
You have invented an elixir not of memory, but of reminding; and you
offer your pupils the appearance of wisdom, not true wisdom, for they will
read many things without instruction and will therefore seem to know
many things, when they are for the most part…not wise, but only appear
wise. (Plato, 1990, pp. 274e-275b)
writing, indeed any of the arts or crafts that make up techne, fails to encourage
what he viewed as true learning, and instead leads to reliance on things other than
10
Jacques Derrida’s (1981) “Plato’s Pharmacy” provides a close reading –
and critique – of Plato’s Phaedrus.
the mind. He fears that memory will become confused with recollection, and that
the true wisdom of episteme will be indistinguishable from the lesser, imitative
Plato’s concern about how the imitative arts invite people to rely on techne
for knowledge rather than the efforts of their minds persists centuries later. Neil
Postman has suggested that Thamus would have reacted to Gutenberg’s printing
press much as he reacted to writing: with the view that the new
invention would create a vast population of readers who “will receive a quantity
wisdom without real wisdom” (qtd. in Postman, 1992, p. 16). Similarly, many
2000), or even how encyclopedias and Web search engines have supplanted the
In short, thousands of years before the printing press, the calculator, or the
Web search engine, Plato’s “judgment of Thamus” provided a warning about the
summarized by Postman:
prescient, whether considering a system of writing introduced during antiquity, or
the development of the World Wide Web in the late twentieth century.
pursuits of knowledge was mirrored centuries later in Karl Marx’s social critique
social community and life’s work, so that in the end they had no ownership over
their own lives or the products of their labor. Marx places the modes of
of social existence:
In the social production of their life, men enter into definite relations that
are indispensable and independent of their will, relations of production
which correspond to a definite stage of development of their material
forces. The sum total of these relations of production constitutes the
economic structure of society, the real foundation, on which rises a legal
and political superstructure and to which correspond definite forms of
social consciousness. The mode of production of material life conditions
the social, political, and intellectual life process in general. (Marx, 1978a,
p. 4; emphasis added)
emphasizes the divisive role played by the rise of industrial capitalism in society.
Marx notes that to examine society properly we must start with its material basis,
“an actual economic fact,” rather than “go back to a fictitious primordial
condition” (p. 71). This economic fact is how “the worker becomes all the poorer
the more wealth he produces, the more his production increases in power and
range. The worker becomes an ever cheaper commodity the more commodities he
creates” (p. 71). The concept of “estranged labor” summed up this enslaved
condition of the worker in a capitalist society: his “loss of reality” through the
objectification of his labor (p. 71), the external aspect of labor in “the fact that
labour is external to the worker…it does not belong to his essential being” (p. 74),
and in the loss of self since labor “operates independently of the self” (p. 74).
Marx argues, “an immediate consequence of the fact that man is estranged from
the product of his labour, from his life-activity, from his species being is the
estrangement of man from man” (p. 77). In an alienated society, the whole mind-
technological and material conditions in which they find themselves and of the
and alienated by the structures that emerged. Just as Thamus warned that reliance
on writing would change the very nature of knowledge and wisdom, Marx
11
This appears again in the “Fragment on Machines” in the Grundrisse,
where Marx argues that capitalist production moves away from dependency on
direct labor and tends toward reliance on the “general state of science and on the
progress of technology” (1973, p. 705). Here, Marx argues that “The general
productive forces of the social brain” have become absorbed, shaped, and
generally subsumed to fit the needs of capital and indeed become constitutive of
(1973, p. 694).
recognized that technological changes in production changed the very nature of
contexts, it was another German thinker, Ernst Kapp, who made technology the
took a more Romantic position, arguing that humankind was the center of history
and culture, and technology provided the means to achieve self-awareness in the
rising industrial age. In this work, Kapp developed the idea of Organprojektion,
[T]he intrinsic relationship that arises between tools and organs, and one
that is to be revealed and emphasized…is that in the tool the human
continually produces itself. Since the organ whose utility and power is to
be increased is the controlling factor, the appropriate form of a tool can be
derived only from that organ.
A wealth of spiritual creations thus springs from hand, arm, and
teeth. The bent finger becomes a hook, the hollow of the hand a bowl; in
the sword, spear, oar, shovel, rake, plow, and spade one observes sundry
positions of arm, hand, and fingers, the adaptation of which to hunting,
fishing, gardening, and the field tools is readily apparent. (qtd. in Mitcham,
1994, pp. 23-24)
12
According to Carl Mitcham, this publication marked the coining of the
phrase “philosophy of technology” (Mitcham, 1994, p. 20).
13
Kapp’s insights predate Marshall McLuhan’s more famous notion that
technologies are the “extensions of man” by nearly 100 years.
This relationship between technologies and the body, Kapp argued, was the
humanity was connected through its technologies, and that mankind is essentially
human faculties and activities, and that the social, the natural and the
and reflection within – its technology was continued by the American historian
technology, the natural environment, the individual, and society, each influencing
Mumford placed technology squarely within the context of the larger ecology of
our society and culture, and hoped to reveal the connection between the human
inventors and scientists but also the cultural sources and moral consequences of
is shaped by his quest to explain the origins and prospects of modern culture. He
During the last thousand years the material basis and the cultural forms of
Western civilization have been profoundly modified by the development
of the machine. How did this come about? Where did it take place? What
were the chief motives that encouraged this radical transformation of the
environment and the routine of life: what were the ends in view: what
were the means and methods: what unexpected values have arisen in the
process? (Mumford, 1934, p. 3)
Technics and Civilization sets modern technics within this larger framework,
correlating the changes taking place in our physical environment with changes
tools, techniques, materials and sources of energy. The eotechnic era, stretching
century, opened the way for the industrial and scientific revolutions, including the
mechanical clock, the telescope, the printing press, and the magnetic compass.
Mumford admired the people, cities, and cultures of this era who strove for a
harmonious balance between the natural senses and the freedom from labor
This phase gave way eventually to the paleotechnic era, roughly from the
science, technology, and capitalism. While the eotechnic society had been
this admirable concern into a determined effort to bring the whole of human
experience under the direction of capitalism and the machine: “there was a sharp
shift in interest from life values to pecuniary values” (p. 153). The technical gains
made during this phase were tremendous: power-propelled vehicles were created,
the use of iron increased, and the mass-production of clothes and mass-
multiplication of machines were all reflections of the new means of and uses of
The neotechnic phase represents the third development in the machine, but
Partly because it has not yet developed its own form and organization,
partly because we are still in the midst of it and cannot see its details in
their ultimate relationships, and partly because it has not displaced the
older [paleotechnic] regime with anything like the speed and decisiveness
that characterized the transformation [to] the eotechnic order. (pp. 212-3)
disruption and chaos have increased” (p. 213). Here, the scientific method, whose
chief advances had previously been in mathematics and the physical sciences,
“took possession of other domains of experience: the living organism and human
society. …Physiology became for the nineteenth century what mechanics had
been for the seventeenth: instead of mechanism forming a pattern for life, living
organisms began to form a pattern for mechanism” (p. 216). In short, the concepts
of science, previously associated largely with the “mechanical,” were now applied
imperatives of technological and mechanical adaptation. Thus, while both
Mumford and Kapp shared the notion that a natural relationship can exist between
humans and technology, Mumford countered Kapp’s optimism, arguing that the
technologies of the industrial age prevented such an organic affinity between man
and machine.
Summary
and Plato’s concern that an increasing reliance on techne will interfere with
achieving wisdom and attaining the good life is mirrored in Mumford’s position
that the modern technology of our neotechnic age threatens to supplant the
organic systems of the paleotechnic past, replacing our close connection to the
technology. And, while optimistic about the role of technology in society, Kapp
unbound could, and clearly did, have devastating effects. In line with Mumford’s
system, and the technological society. Ellul’s concern is not with any particular
a system, worldview, and way of life; the term he uses in this context is la
rate, and encompassing every sector of human society. Ellul outlines this new
it grows according to a process which is causal but not directed to ends; (e) it is
formed by the accumulation of means which have established primacy over ends;
and (f) all its parts are mutually implicated to such a degree that it is impossible to
separate them or to settle any technical problems in isolation (Ellul, 1985, p. 40).
Ellul considers whether this technological society can be “civilized,” and
materialistic: “The technical world is the world of material things; it is put together
out of material things and with respect to them. When technique displays any
interest in man, it does so by converting him into a material object” (Ellul, 1985,
incomparably more effective than anything ever before invented, power which
has as its object only power in the widest sense of the word” (p. 45). And, finally,
society, Ellul contends that the result is far from liberating: technique denies
freedom and asserts its own autonomy over all spheres of life – politics,
“prevails” over the human subject, forcing it to succumb “the king of the slaves of
technologies; it is the search for the most rational method, the most efficient
system, which excludes all others. The effect of this modern technological
spontaneous, i.e., humans and nature. Ellul argues that la technique can be applied
communication, politics, the economy, leisure, sport – indeed, to all areas of life.
The inevitable result of such domination of technique over every aspect of
subject to it, from procreation to how we eat, grow, where we live, and how we
die (Ellul, 1964, p. 128). Aligned with Mumford, Ellul’s final analysis is
decidedly pessimistic: “Today the sharp knife of [la technique] has passed like a
razor into the living flesh. It has cut the umbilical cord which linked men with
and the driving forces of its essence.14 In a powerful passage from the beginning
primary driving force within culture – as an extreme danger both to culture and
14
Heidegger’s historical involvement with Nazism and support of Adolf
Hitler’s policies has shrouded his theories in controversy. Philosophers disagree
on the consequences of this historical responsibility on his philosophy. Some
claim that his philosophy is free of historical and political contingencies, while
others argue that his historical engagement with the Nazi party is inextricably
linked to his philosophical conceptions (see Farías, 1989). This debate cannot be
solved here, and my inclusion of Heidegger’s theories of technology is meant to
help chart the range of thinking on the subject, and not necessarily to endorse his
entire philosophy. (I thank Prof. Terry Moran for helping elucidate this concern
with Heidegger’s theories.)
thought. Technicity is not just a series of technologies or a technological system;
in which the emphasis on the power of the thinking human subject led the subject
to overestimate his ability to transcend time and nature. Such thinking led to the
notion of that subject’s ability to gain control of nature, and the development of
technological tools to attain such control. For Heidegger this delusional thinking –
man’s control over nature through technology – inevitably led to our loss of
The danger of technology lies in the transformation of the human being, by which
human actions and aspirations are fundamentally distorted. Technology enters the
inmost recesses of human existence, transforming the way we know and think.
understanding of being.
For Heidegger, the essence of technology should be something other than
way of “revealing” other aspects of life, not of controlling life and the world just
there is a way that we can keep our technological devices and yet remain true to
ourselves. “We can affirm the unavoidable use of technical devices, and also deny
them the right to dominate us, and so to warp, confuse, and lay waste our nature”
We are released from the burden of seeking efficiency for its own sake,
Dreyfus claims, once we understand that this kind of calculative thinking is only a
historical product and that things could be different. Once we recognize that we
have taken the first steps to escape from that framework; we are then free to
appreciate that there is more to life than efficiency, than technological rationality.
Thinking as “releasement”:
which we can stand and endure in the world of technology without being
imperiled by it. (Heidegger, 1966, p. 55)
This striving for harmony with the technologies around us can make us sensitive
from our compulsion to force all things into one efficient, technological order.
Heidegger’s comments here are deeply insightful. Like Ellul, he is asserting the
totalizing effect of the essence of technology, but he sees hope for releasement
place in Western culture, and, like Ellul, he views technology as the embodiment
argues that humans have been reduced to a one-dimensional being due to their
and technological society created false needs that integrated individuals into the
behavior in which the very ability of critical thinking and oppositional behavior
Marcuse acknowledges that all basic needs, such as food, shelter and
safety, are provided for in the modern technological society. But these are all
“deceptive liberties as free competition at administered prices, a free press which
censors itself, free choice between brands and gadgets” (1964, p. 7). The
systematic deception; our needs have become the needs of the technological
power.
We have pointed to the possible democratization of functions which
technics may promote and which may facilitate complete human
development in all branches of work and administration. Moreover,
mechanization and standardization may one day help to shift the center of
gravity from the necessities of material production to the arena of free
human realization. (2004, p. 77)
because the technological apparatus has incorporated and subsumed all critical
Summary
arrive at similar conclusions about modern technology and technical culture – that
fears that the rational and efficient motivations of the technologization of our
world will transform all aspects of humanity, and emerge as the “only way of
thinking” (1966, p. 56). And Marcuse fears that this technological deception will
become more deeply embedded in our lives with the rise of advanced systems of
production and consumption, where the needs of society are replaced by the needs
of the technological apparatus. For all three thinkers, technology is largely
Marcuse, the study of technology and its social impact has thrived in recent
political science, and media theory. This section will highlight some of the
point is a unique field of study that includes Ellul and Mumford within its diverse
Media Ecology
“the study of media as environments,” explaining that the main concern for media
understanding, feeling and value; and how our interaction with media facilitates
or impedes our chances for survival” (Postman, 1970, p. 161). Postman later
metaphor:
The fundamental goal for media ecology, then, is to understand how the
uncover and understand “how the form and inherent biases of communication
media help create the environment…in which people symbolically construct the
world they come to know and understand, as well as its social, economic,
political, and cultural consequences” (Lum, 2000, p. 3). The media ecological
4. Because their physical form dictates differences in conditions of
attendance, different media have different social biases.
5. Because of the ways in which they organize time and space, different
media have different metaphysical biases.
6. Because of their differences in physical and symbolic form, different
media have different content biases.
7. Because of their differences in physical and symbolic form, and the
resulting differences in their intellectual, emotional, temporal, spatial,
political, social, metaphysical, and content biases, different media have
different epistemological biases. (Personal communication, September
2002)
Man (1964), that what he means by “media” extends well beyond traditional
houses, light bulbs, bicycles, airplanes, games, weapons, and automobiles. The
the nature of media technology and understand how the introduction of new
extensions of the body, that they “amplify and extend ourselves” (McLuhan,
1964, p. 64). Differing from Kapp, however, McLuhan rejects any romantic view
“autoamputation” that comes with technology, often numbing us to the full effects
of technology, and how particular media technologies, even as extensions of the
(McLuhan, 1964, p. 66). To help explain this, McLuhan introduces his most
…that the personal and social consequences of any medium – that is, of
any extension of ourselves – result from the new scale that is introduced
into our affairs by each extension of ourselves, or by any new technology.
(McLuhan, 1964, p. 7)
Simply put, “the medium is the message” means that the media or
technologies that we use play a leading role in how and what we communicate,
how we think, feel and use our senses, and in our social organization, way of life,
the individual, and his concern with the widespread and ecological effects of
most media ecological literature. Notable examples include Harold Innis’ (1951)
biases, Walter Ong’s (1982) documentation of the intellectual, social, and cultural
Elizabeth Eisenstein’s (1979; 1983) in-depth study of the impact of the printing
Innis, Ong, and Eisenstein focus on the introduction of new media
medium is the message,” that the technological form of a medium carries greater
which rests at the center of the media ecology tradition, is often criticized for its
media determinism. The theories presented by Innis, Ong, and Eisenstein, then,
could all be labeled as overly deterministic, arguing that social, cultural, political,
and economic aspects of our lives are solely determined by the form and biases of
causal relationship between print and these cultural events. In fact, the title of the
within society, but not necessarily the first and only cause. She states:
By acknowledging how technology and culture might affect one another,
Eisenstein recognizes that it is often the interaction between a technology and its
users that determines its impact. A close inspection of the media ecology tradition
(2000) notes in his introduction to the intellectual roots of Media Ecology: “[O]ne
of media ecology’s major concerns [is] the complex symbiotic relationship among
the media and…between media and the various forces in society” (p. 1; emphasis
added).
technology and society. Building from the notion that “the medium is the
by Neil Postman:
Confirming the views of Ellul, Heidegger, and Marcuse, media ecologists strive to
its root, maintains a form of soft determinism: that the introduction of a
forces that led to the emergence of the technology in the first place. To fill this
void, we can turn to the theory of the social construction of technology (SCOT),
Technology Studies (see, for example, Pinch & Bijker, 1984; Bijker et al., 1987;
advocates of SCOT focus less on the ways technology might determine human
action, and instead explore how human and social actions shape the development
symbiotic relationship exists “between media and the various forces in society,”
fully rational path – and instead argues that technologies are constructed through a
process of strategic negotiation between different social groups, each pursuing its
own specific interests, resulting in great variance in the ultimate form of the
technology. The key idea underlying SCOT, then, is that social arrangements
create, shape, and determine technologies and how they are used.
between technology and society is Trevor Pinch and Wiebe Bijker’s (1987) “The
Social Construction of Facts and Artifacts: Or How the Sociology of Science and
the Sociology of Technology Might Benefit Each Other,” which outlines four key
that technological design is an open process that can produce different outcomes
are culturally constructed and interpreted…. By this we mean not only that there
is flexibility in how people think of or interpret artifacts but also that there is
flexibility in how artifacts are designed” (p. 40). Interpretative flexibility implies
artifact, negotiate over its design, with different social groups seeing and
constructing quite different objects. Relevant social groups are the embodiments
of particular interpretations: “all members of a certain social group share the same
set of meanings, attached to a specific artifact” (Pinch & Bijker, 1987, p. 30).
until all groups come to a consensus that their common artifact “works.”
The third component of the SCOT framework is the dual presence of
closure and stabilization. When the relevant social groups have formed a
technology that has reached closure is so embedded that people find it difficult to
technology succeeds.
Finally, SCOT places emphasis on the wider context; the broad cultural
sociocultural and political situation of a social group shapes its norms and values,
which in turn influence the meaning given to an artifact” (Pinch & Bijker, 1987,
p. 46). Examining the wider context played a relatively minor role in Pinch and
later work (1995), when he added the notion of the technological frame to his
constructivist theory of technology. This is the shared cognitive frame that defines
goals, key problems, current theories, rules of thumb, testing procedures, and
exemplary artifacts that, tacitly or explicitly, structure group members’ thinking,
problem solving, strategy formation, and design activities (Bijker, 1995, p. 125).
social arrangements create, shape, and determine technologies and how they are
used. As Bijker and Law (1992) state, “Our technologies mirror
our societies. They reproduce and embody the complex interplay of professional,
technical, economic, and political factors” (p. 3). In short, technologies are shaped
by and mirror the complex trade-offs that make up the social sphere from which
they emerge.
society seems at odds with the media ecological approach. While media ecology
views technology as a force that shapes society, SCOT sees society mirrored in its
technologies. But, just as Eisenstein recognized that media ecology must not be
fully deterministic, Bijker and Law (1992) note that even within
the constructivist paradigm, the social world is affected by the technology it has
created:
Bijker and Law’s acknowledgement that technology, while socially constructed,
and the social groups that form them. Finding common ground with the soft
determinism of media ecology, Bijker and Law accept that “society itself is being
built along with objects and artifacts” (Bijker & Law, 1992, p. 19). Just as the
strategies, priorities, and biases of the social groups involved with the
biases can be reflected back upon society when the technology reaches a certain
Politics of Technology
of power within our technological society. Despite invoking issues of power and
often criticized for largely ignoring the political consequences of the technologies
that emerge. Such criticism has been most vocal from Langdon Winner (1993),
who, in his essay “Upon Opening the Black Box and Finding It Empty,” argues
that:
texture of human communities, for qualities of everyday living, and for the
broader distribution of power in society – these are not matters of explicit
concern. (p. 368)
Winner expands his indictment of SCOT beyond simply not looking at the
dimensions of the technologies that permeate society. Leading the way in this
endeavor is, not surprisingly, Winner himself. But the foundations for the political
Civilization, Lewis Mumford later turned his attention to the specific political
Democratic Technics” (1964), penned thirty years after Technics and Civilization,
describes two distinct technological types that frequently exist side by side: “one
powerful, but inherently unstable, the other man-centered, relatively weak, but
resourceful and durable” (1964, p. 2). Democratic technics are built upon human
skill and animal energy, and remain under the active direction of the craftsmen or
laborers who use them as though “gifts from nature” (p. 3). While such
modest demands, and enabled “adaptation and recuperation” (p. 3). Authoritarian
invention, scientific observation, and centralized political control” that gave rise
to what we refer to today as “civilization” (p. 3). Connected to the rise of new
authoritarian technics have persisted: “At the very moment Western nations threw
king, they were restoring this same system in a far more effective form in their
technology” (Mumford, 1964, p. 4). Mumford argues that the rise of political
authoritarian technics will give back as much of it as can be mechanically
graded, quantitatively multiplied, collectively manipulated and magnified.
(1964, p. 6)
consolidates its powers, with the aid of its new forms of mass control, its panoply
conditions of power, authority, freedom, and social justice are often deeply
artifacts have innate dispositions that both create and affect these arrangements of
general) as embodying ideas about the social order, whether well-intentioned or
ill-intentioned: “The issues that divide or unite people in society are settled not only
in the institutions and practices of politics proper, but also, and less obviously, in
tangible arrangements of steel and concrete, wires and semiconductors, nuts and
Winner identifies two levels at which technological artifacts embody
politics. The first is one in which segments of society build into technologies their
own explicit prejudices, biases and ideologies in such a way that they settle
designed to resolve political conflict by supporting the power and authority of one
group over another. Winner provides the example of Robert Moses, the urban
planner responsible for much of the design of modern New York City, who
famously built overpasses over the Long Island Highway only 9 feet high,
allowing only automobiles to navigate the parkway. The city’s poor and minority
classes, largely dependent on taller buses for transportation, could not drive along
the highway and were essentially denied access to the beachfront destinations.
Winner argues that Moses incorporated widespread prejudices among the city’s
upper class into the design of the parkway overpasses, essentially preventing
access by the poor and black population of New York City to the wealthy’s social
suggest certain kinds of social and political arrangements. For example, Winner
15
Bernward Joerges (1999) has provided a critique of Winner’s thesis,
arguing that the Long Island Highway story is apocryphal. Regardless of who is
correct, the episode stands as a suitable parable for how political arrangements
could be embedded in artifacts in order to settle particular social or political
issues.
investments (nuclear power reactors, interstate highways, national communication
systems, and the like) is positively related to the need for highly centralized,
forms of management used to manage such large technical systems are leaking
such cases, the emergence of technologies that are only compatible with a
particular type of governance structure will perpetuate the spread of that kind of
systems and the type of government and bureaucratic structures they support.
Because industrialization involved the large and fast flow of goods, it could not be
well as the usual mechanical devices such as computers, data files systems, and
the like). Moreover, without highly detailed and structured management systems,
the new post-industrial economy simply could not work. This need for large-scale
management and information systems brought about what Beniger calls the
“Control Revolution”:
organization – what Beniger labels the growing “systemness of society” (p. 278).
replace industrial capital as the material base for our modern economy, and, well
before the twentieth century and digital computing, brought about our Information
the telegraph, typewriter, and telephone, extending into the early 1900s with the
telecommunications, and presumably, the Internet, Beniger would argue, are not
century earlier.
Beniger argues that just as the rapid industrialization of the late-nineteenth
and control” (1986, p. 32). The rise of the Information Society has exposed the
Beniger, then, arrives at conclusions similar to those of Mumford and Winner, that
“control to all aspects of human society and social behavior” brought forth by the
individual as information to be parsed and processed. Like Mumford, Deleuze
believes that the form of society is matched by the form of its machines (Deleuze,
1995, p. 180), and similar to Beniger, he views the rise of information processing
argues, made use of simple mechanisms such as levers, pulleys, and clocks, and
and internal combustion engines. The defining technology of the new society of
machines” (Deleuze, 1995, p. 180), where the “digital language of control is made
school, the hospital, the prison – have collapsed under the weight of our
information society, leading, not to the eradication of these power relations, but
rather, to their dispersal and often hidden proliferation throughout society. The
social regulation of space and time has become expropriated from the relatively
control. The control society offers what feels like greater freedom, yet is more
technologies such as tracking systems, electronic tagging, and coded access cards,
Lawrence Lessig (1999) recognizes how a Deleuzian “digital language of
human-made, Lessig argues, little is naturally inherent to the system – all of its
rules, tendencies, affordances, and constraints are the result of human decisions,
actions, and, essentially, code.16 What we can and cannot do there is governed by
the underlying code of all of the programs and protocols that make up the
Internet, which equally permit and restrict human action. Quite simply, “code is
law”:
For Lessig, “how a system is designed will affect the freedoms and control
the system enables” (Lessig, 2001, p. 35); the very architecture of the Internet
cyberspace that constitutes its freedom, and as the architecture is threatened and
changed, this freedom will be erased. While locating the political and ideological
power of the system within its formal structure – code – Lessig recognizes that the
political nature of this new medium does not exist per se, but is the product of
how that power is exercised, and by whom” (1999, p. 59; emphasis added).
16
See Hafner and Lyon (1996) for a study of the human decisions that
informed the creation of the Internet.
Like Deleuze and Lessig, Alex Galloway locates systems of control
particularly utopian vision of the freedoms enabled by the digital networks of the
Galloway (2004) reveals how power relations and control persist within the
formal structure of the Internet, the level of the protocols that make the Internet
work.17 He argues that such protocols facilitate the exercise of control in our
networked society, one in which power is distributed laterally, rather than being
Galloway asserts that the founding principle of the Internet is control, and,
networks is part of a larger shift in social life. The shift includes a movement
away from central bureaucracies and vertical hierarchies toward a broad network
17
Similar work includes Chun’s (2006) Control and Freedom: Power and
Paranoia in the Age of Fiber Optics, where she describes the various controls and
freedoms enabled by the design of our networked technologies and online
environments.
of autonomous social actors” (p. 33). Protological control is embedded in the code
of the system, and reflects, for Galloway, a larger shift towards horizontal,
Ethics in Technology
above are appeals to recognize how technology impacts ethical and human
values. Plato confronts how a commitment to techne might conflict with the
proper achievement of the “good life,” while Marx sees technology as a force of
domination and alienation. Ellul, Heidegger, and Marcuse are each concerned
about the way technology acts as a force that denies individual freedom and
and Galloway all point to issues of freedom and control within the technologies of
These concerns with the way that technology affects levels of freedom,
autonomy, power, and social justice bring us back to our starting point in this
chapter: the position that the design of technology bears “directly and
social, ethical, and political values” (Flanagan et al., in press). The increasing
technologization of our world brings with it important ethical concerns about its
impact on human values at both the social and individual levels. As a result, many
scholars have directed their focus on how emerging digital technologies bear on
ethical and human values. The origin of the contemporary study of ethics in
Wiener.
Norbert Wiener shared the concern of the theorists outlined above that the
War II, Wiener helped to design an anti-aircraft gun capable of shooting down
the creation of a new field of research that Wiener called “cybernetics” – the science
when combined with digital computers under development at that time, led
the “second industrial revolution” (Wiener, 1965, pp. 37-38). He believed that the
integration of new systems for information processing into society will constitute
the remaking of society – “the second industrial revolution” and “the automatic
age” – destined to affect all aspects of society. His solution was “to have a society
based on human values other than buying and selling” (Wiener, 1965, p. 38;
emphasis added). To achieve this, Wiener argued, will take decades of effort and
will radically change the world: Workers will need to adjust to radical changes in
the work place; governments must establish new laws and regulations; industry
and businesses must create new policies and practices; professional organizations
must develop new codes of conduct for their members; sociologists and
phenomena; and philosophers must rethink and redefine old social and ethical
concepts.
with the publication of his monumental book The Human Use of Human Beings
(1988) in 1950. Invoking Plato’s desire for the attainment of episteme, Wiener
outlines the purpose of a “good human life,” one in which “great human values”
are realized and the creative and flexible information-processing potential of “the
human sensorium” enables humans to reach their full promise in variety and
possibility of action (Wiener, 1988, p. 51). Wiener then outlined four key ethical
minimizing the harmful ones (Wiener, 1988). Terrell Ward Bynum (2000)
- The Principle of Freedom – Justice requires “the liberty of each human
being to develop in his freedom the full measure of the human possibilities
embodied in him.”
- The Principle of Equality – Justice requires “the equality by which what is
just for A and B remains just when the positions of A and B are
interchanged.”
- The Principle of Benevolence – Justice requires “a good will between man
and man that knows no limits short of those of humanity itself.”
- The Principle of Minimum Infringement of Freedom – “What compulsion
the very existence of the community and the state may demand must be
exercised in such a way as to produce no unnecessary infringement of
freedom.”
While Wiener did not use the term “computer ethics,” these principles outlined in
The Human Use of Human Beings laid the foundation for future computer ethics
philosophers and ethicists built the discipline of “computer ethics.” At the center
of this effort were Deborah Johnson and James Moor. In her landmark book,
Computer Ethics, Johnson (1985) defined the field as one that studies the way in
which computers “pose new versions of standard moral problems and moral
dilemmas, exacerbating the old problems, and forcing us to apply ordinary moral
norms in uncharted realms” (p. 1). Johnson addresses key categories of ethical
technology. Johnson questions, however, whether such ethical issues are unique
Computer Ethics, Johnson openly contemplates this pivotal question: “What is
where he argues for a much broader definition of computer ethics than Johnson.
For Moor, computer ethics can be considered distinct from existing ethical theories
information technology:
In contrast to Johnson, Moor is calling for new ethical frameworks to address the
And while Moor agrees with Johnson that “not all ethical situations involving
computers are central to computer ethics” (1985, p. 267), he insists that “because
computer technology provides us with new possibilities for acting, new values
emerge” (1985, p. 266). In short, “computer ethics requires us to think anew about
the nature of computer technology and our values” (Moor, 1985, p. 268; emphasis
values points to an important turn in computer ethics – the identification and
analysis of the impacts of information technology upon human values like trust,
justice.
identify the ways in which information technologies bear on human values is what
Most of the time and under most conditions computer operations are
invisible. One may be quite knowledgeable about the inputs and outputs of
a computer and only dimly aware of the internal processing. (Moor, 1985,
p. 272)
programming code (1985, p. 273). This invisibility factor is present within many
with the generally hidden protocols and underlying architecture of the Internet –
and presents the primary dilemma for Moor’s conception of computer ethics:
with his call for computer ethics to become more “disclosive,” to be “centrally
technologies. Brey’s new disclosive computer ethics distinguishes itself from
their moral importance” (2000, p. 12). Brey criticizes mainstream computer ethics
for being limited to the analysis of ethical issues for which there is a pre-existing
and identifiable policy vacuum. What is left out, according to Brey, are
“computer-related practices that are not (yet) morally controversial, but that
Brey describes these practices that have moral import but that are not yet
media coverage, for example). In such cases, a critical function of computer ethics
(Brey, 2000, p. 11). The second way in which moral opacity may arise is when a
practice is familiar in its basic form, but is not recognized as having moral
computing practice often has the appearance of moral neutrality when in fact they
are not morally neutral” (Brey, 2000, p. 11). Disclosive computer ethics, then,
must put the technological artifact itself under moral scrutiny, “independently
from, and prior to, particular ways of using them” (Brey, 2000, p. 11).
To summarize, disclosive computer ethics works to uncover the moral
issues and features in technologies that had not until then gained much
practices in a way that reveals their moral importance” (Brey, 2000, p. 12).
Summary
perception, understanding, feeling and value, and how our interaction with
complicated world. The answer, Postman reminds us, is deceptively simple: “It
understand the other side of the coin, how various social and cultural forces led to
wider context from which technologies emerge, SCOT advocates argue, can we
“Our technologies mirror our societies. They reproduce and embody the complex
The contemporary philosophies of technology outlined thereafter focus
and the exercise of power. Mumford shows concern that certain technologies
politics argues that conditions of power and authority are often deeply embedded
in technical devices and systems, either inherently or through the ways they settle
political issues. Finally, Beniger, Deleuze, Lessig, and Galloway recognize that
(1995, p. 180), can be applied to Beniger’s history of the rise of the information
infrastructures. In terms of politics and power, all agree that a system’s design
will affect the freedoms and control that the system enables.
control, Wiener identified key ethical principles against which any new
Wiener’s efforts led to the establishment of computer ethics, which provided the
philosophical tools for weighing technology against various moral and ethical
principles. In that effort, Brey has called for disclosive computer ethics to uncover
and morally evaluate the values and norms embedded in the design of computer
discussed at the beginning of this chapter, Thamus criticizes the god Theuth’s
instead that the practice of writing will ultimately weaken memory, illustrating an
anxiety that an over-reliance on arts that imitate knowledge (techne) threatens the
ability to achieve true wisdom and knowledge (episteme). Thamus recognized the
The error is not in his claim that writing will damage memory and create
false wisdom. It is demonstrable that writing has had such an effect.
Thamus’ error is in his claim that writing will be a burden to society and
nothing but a burden. For all his wisdom, he fails to imagine what
writing’s benefits might be, which, as we know, have been considerable.
(p. 4)
History has shown us how numerous tools and technologies that imitate
episteme. Even before writing, diverse forms of techne emerged to imitate and
represent knowledge, including cave painting, textile patterns, and the knotted
strings of Incan quipu (see Crowley & Heyer, 2007). Later came clay and
alphabet, parchment and paper, charts and maps, monastic manuscripts, codices
reached new scope with the recent evolution of electronic and digital forms of
techne, such as the telegraph, telephone, radio, television, and digital computer
invention of the printing press, for example, fostered the modern idea of
but also caused a transformation of the public sphere from a space of open and
political and economic interests (Habermas, 1992). The bright lights and moving
images of television might have fostered a new “global village,” but they also
little more than sound bites tailored for the cameras (Postman, 1985). And while
18
Considering that techne is defined as an art or craft that is imitative of
knowledge, all forms of information technology can claim a lineage to
this ancient concept. Just as writing imitates memory by relying on words on a
page, audio recordings imitate sound, photographs imitate what is visible, video
imitates motion, and the language of computers can imitate any form of
information translatable into binary code. All information technologies are
imitative of knowledge through representation, storage, and mediation.
many hope the Internet will recharge a diverse public sphere of deliberative
isolate themselves within groups that share their own views and experiences, and
thus cut themselves off from any information that might challenge their beliefs
(Sunstein, 2001).
It is inescapable, then, that every society must negotiate with its new
this struggle between society and its technology. His speech to a conference of
technologists in 1990 presents this delicate balance between the benefits and
Postman also feared that the Faustian bargain society strikes with its technology
[In] cultures that have a democratic ethos, relatively weak traditions, and a
high receptivity to new technologies, everyone is inclined to be
enthusiastic about technological change, believing that its benefits will
eventually spread evenly among the entire population. Especially in the
United States, where the lust for what is new has no bounds, do we find
this childlike conviction most widely held. (Postman, 1992, p. 11)
the face of this technological enthusiasm: “from whose point of view the
efficiency is warranted or what might be its costs? …Whom will the technology
give greater power and freedom? And whose power and freedom will be reduced
made by technology, ignoring such issues of power, equality, and freedom. While
Faust traded his soul to the devil in exchange for unlimited knowledge, Postman
feared that society is sacrificing its core values to blindly satisfy its desire for
technological progress.
views of Marx and Kapp of the social impact of technologization of our world in
“giveth and taketh away.” Like Mumford’s historical treatment in Technics and
analytic, and efficient thinking – an existence that threatened to subsume all other
ways of living our lives and relating to our world. In their view, most members of
society were losers in the Faustian bargain with technology, caught up in the
blind acceptance of the Faustian bargain appears in Mumford’s later exploration
management – made possible by the allure of the Faustian bargain – results in the
the power relationships enabled by the Faustian bargain are rarely questioned,
Deleuze and Galloway’s warnings about the levels of control implicated by and
system is designed will affect the freedoms and control the system
enables” (Lessig, 2001, p. 35), but, if Postman’s fears are correct, and the Faustian
traditionally enjoy.
Thamus replied: “Most ingenious Theuth, one man has the ability to beget arts,
but the ability to judge of their usefulness or harmfulness to their users belongs to
their own technologies. Instead, moral and ethical philosophers have taken up this
duty. Building upon the work of Wiener, Johnson, and Moor, Brey urges the
practice of disclosive computer ethics to try to make apparent the ethical
implications of technology that remain hidden behind the alluring veil of the
Faustian bargain.
This dissertation builds upon this broad foundation and will expose the
its own, exercising power and control through its vast information processing
explore how the perfect search engine empowers the widespread capture of
personal information flows across the Internet, threatening the ability to engage in
online social, cultural, and intellectual activities free from answerability and
oversight, thereby bearing on the values of privacy, autonomy, and liberty. It will
answer Brey’s call for disclosive computer ethics and attempt to attain
knowledge tools, shedding the “blinders” in which our Faustian bargain has
shrouded us.
CHAPTER III
Introduction
In order to examine how the quest for the perfect search engine empowers
the widespread capture of personal information flows across the Internet, we must
first understand how search engines work, as well as the historical context from
which they emerged. As it turns out, this is no small feat; it requires a rather deep
for the quest for the perfect search engine. This chapter presents a technical
just one dimension of Web search engines – this chapter will develop a basic
creation. Most importantly, this chapter (combined with the next) represents the
technical investigation of the design of the perfect search engine to uncover how
its technological properties and underlying architecture might bear on the ethical
The Internet
computer networks that transmit data by packet switching using a standardized set
which together carry various information and services, such as electronic mail,
online chat, file transfer, and the interlinked Web pages and other documents of
Since its inception in 1969, the Internet has grown from just four host
million global locations by July 2006 (Internet Systems Consortium, 2006), with
over 1 billion users worldwide (Internet World Stats, 2006). During this relatively
short history, the Internet has “revolutionized the computer and communications
interaction between individuals and their computers without regard for geographic
19
Certainly, the Internet is not completely worldwide in its reach, and not
publicly accessible by all populations. Despite these important “digital divide”
concerns, the Internet stands as a unique medium in its expansive reach and
general openness.
20
See below.
distributed across the network. It became widely accessible to the scholarly,
the end of the 1980s with the emergence of the FTP (file transfer protocol) and
Gopher protocols. The FTP protocol was designed to connect two computers over
the Internet for the efficient sharing and transfer of files (Network Working
Group, 1985). While usable directly by a person at a computer terminal, FTP was
the Internet in order to access and transfer needed files automatically. The Gopher
protocol, on the other hand, was designed specifically for use by people.
sports teams are called the Golden Gophers), Gopher presented files in
hierarchical menus for easier and more intuitive navigation (Network Working
a Gopher server can be linked to as a menu item from any other Gopher server.
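
The menu structure described above can be made concrete with a minimal sketch of how a client might parse one Gopher menu listing. It assumes the tab-separated line layout of the published Gopher protocol specification (a one-character item type, a display string, a selector, a host, and a port, with a lone period ending the listing). The host names, selectors, and the `MenuItem` and `parse_menu` names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class MenuItem:
    item_type: str  # '0' = text file, '1' = directory (another menu)
    display: str    # label shown to the person browsing
    selector: str   # string sent back to the server to fetch the item
    host: str       # server that holds the item (may be a different server)
    port: int


def parse_menu(raw: str) -> list[MenuItem]:
    """Parse a raw menu listing into a list of MenuItem records."""
    items = []
    for line in raw.split("\r\n"):
        if line == ".":   # a lone period terminates the listing
            break
        if not line:
            continue
        head, selector, host, port = line.split("\t")[:4]
        items.append(MenuItem(head[0], head[1:], selector, host, int(port)))
    return items


# Hypothetical listing: two local items and one menu on a remote server.
SAMPLE = (
    "0About this server\t/about.txt\tgopher.example.edu\t70\r\n"
    "1Campus libraries\t/libraries\tgopher.example.edu\t70\r\n"
    "1Another university\t\tgopher.example.org\t70\r\n"
    ".\r\n"
)

menu = parse_menu(SAMPLE)
remote = [item for item in menu if item.host != "gopher.example.edu"]
```

Because every menu line carries its own host and port, a menu on one server can point to a menu on any other, which is the cross-server linking noted above.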
Hypertext Links
document that, when selected, automatically delivers the linked information to the
the Internet and World Wide Web, hyperlinks, in the most general sense, predate
community after World War II to develop knowledge tools rather than military
Enlightenment, Bush realized that the amount of scientific data was growing at an
incredible pace in the first half of the twentieth century, and argued that people
needed to find new ways to organize and access information through the use of
new technology:
information organization:
21
Diderot’s 18th-century Encyclopédie, for example, featured the
widespread use of renvois, a system of hyperlink-styled cross references to link
articles with related – both complementary and oppositional – ideas or arguments
(see Brewer & Hayes, 2002).
Here, Bush recognized the limitations of interacting with a system through a rigid
data structure: If the data is stored in classes and subclasses in a database, then
users can only navigate the database as required by its data structure – via those
precise classes and subclasses – rather than by their own interests or personal
method of information organization and retrieval. Bush’s goal, then, was to invent
new knowledge tools to help users locate, organize, coordinate, and navigate
What made a piece of information valuable, Bush suggested, was not the
overarching class or category that it belonged to, but rather its connections to
tool, half microfilm machine and half computer, to support the process of thinking
The memex would aid the process of thinking through a mechanized indexing
analogous to the trail of mental association in the user’s mind: A memex user
builds a “trail of interest through the maze of materials available to him” (Bush,
Bush’s system implied a profound shift in the way we grapple with
information that frees users from the strict, inflexible dictates of systematic or
suggested, was not the overarching class or species that it belonged to, but rather
the connections it had to other data. Documents can be connected for more
elusive, transient, or personal reasons, and each item might have many trails
leading to it. As Steven Johnson relates, “The Memex wouldn’t see the world as
the librarian does, as an endless series of items to be filed away on the proper
shelf. It would see the world the way a poet does: a world teeming with
Bush’s vision for the memex remained unrealized, but he inspired another
pioneer in knowledge tools, Ted Nelson, who wrote twenty years later of a new
knowledge tool that would enable users to publish and access information in a
similar nonlinear and interlinked format. Coined hypertext by Nelson for use in
his Project Xanadu (see Nelson, 1987, 1993), the hypertext link was meant to
Hypertext would allow people to create, annotate, link together, and share
implementation of a “docuverse” where all data was stored once, there were no
deletions, and all information was accessible by a link from anywhere else.
follow links in and out of documents at random, the path being determined by the
linear way from A to B to C – but by free association, starting with A but taking
autonomy, liberating users to navigate and explore information free from the
minor hypertext applications came and went, the full potential of nonlinear and
nonsequential linking of information via hypertext was brought to its fullest
fruition by Tim Berners-Lee with his creation of the World Wide Web.
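
The contrast drawn in this section between filing by class and linking by association can be illustrated with a small sketch. It is not a reconstruction of the memex or of Project Xanadu; the item names, class paths, and helper functions are hypothetical, chosen only to show that an item filed under a single class can still sit on any number of user-built trails.

```python
from collections import defaultdict

# Hierarchical filing: every item lives under exactly one class path.
catalog = {
    "short bow": ("History", "Warfare"),
    "long bow": ("History", "Warfare"),
    "elasticity tables": ("Physics", "Materials"),
}


def browse(class_path):
    """Return the items filed under one class path, in sorted order."""
    return sorted(item for item, path in catalog.items() if path == class_path)


# Associative linking: a trail is an ordered list of items chosen by the user,
# and the same item may appear on any number of trails.
trails = defaultdict(list)


def add_to_trail(trail_name, item):
    """Append an item to the named trail, creating the trail if needed."""
    trails[trail_name].append(item)


def trails_through(item):
    """Return the names of every trail that passes through an item."""
    return sorted(name for name, items in trails.items() if item in items)


add_to_trail("why the short bow won", "short bow")
add_to_trail("why the short bow won", "elasticity tables")
add_to_trail("why the short bow won", "long bow")
add_to_trail("materials survey", "elasticity tables")
```

In this sketch `browse` can only return what the catalog's classes allow, while `trails_through` recovers every personal path that touches an item, regardless of where it is filed.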
Tim Berners-Lee’s development of the World Wide Web was among the
The fundamental principle behind the Web was that once someone
somewhere made available a document, database, graphic, sound, video,
or screen…it should be accessible…by anyone, with any type of
computer, in any country. And it should be possible to make a reference –
a link – to that thing, so others could find it. (Berners-Lee, 2000, p. 37)
of both Bush and Nelson, he understood the human mind’s ability to link random
sharing and updating information among researchers within the facility, and,
systems” (Berners-Lee, 2000, p. 21). Supported by his HyperText Markup
Language (HTML), a simple method for encoding text files with links to other
Berners-Lee announced the debut of the World Wide Web as a publicly available
service in 1991:
concepts to be stored and shared in ways similar to Bush’s call for associative
assembly of ideas.” By releasing the World Wide Web for public use, Berners-
Lee hoped that implementation of his HTML and HTTP protocols would become
widespread. His strategy worked, and from the first Web site created by Berners-
Lee in 1991, the Web has expanded to over 100 million Web sites in late 2006
(Netcraft, 2006), representing over 11.5 billion indexable23 Web pages (Gulli &
Signorini, 2005).
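
HTML's method of encoding links within a text file, mentioned above, can be illustrated with a short sketch that reads a page and lists the documents it points to. The page content and URLs are invented for illustration, and `LinkCollector` is a hypothetical helper built on the `HTMLParser` class from Python's standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target of every <a href="..."> link in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: plain text marked up with two links to other documents.
PAGE = """<html><body>
<h1>Project notes</h1>
<p>See the <a href="http://example.org/report.html">report</a> and the
<a href="figures.html">figures</a>.</p>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
links = collector.links
```

The same operation, following the links a page declares, is what lets both human readers and automated programs move from one document to the next.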
22
Archive of original e-mail to the alt.hypertext message board is
available at http://www.w3.org/People/Berners-Lee/1991/08/art-6484.txt.
23
The indexable Web is that portion of the World Wide Web that is
indexed by conventional search engines, which will be discussed in more detail
below. For various reasons (e.g., the Robots Exclusion Standard, links generated
by JavaScript and Flash, password-protection, dynamic pages temporarily created
in response to a user action) some pages cannot be indexed by traditional search
engines. These “invisible” pages are referred to as the Deep Web. It is estimated
Early Internet and Web Navigation Tools
begin to understand the depth and breadth of information that resides in the
billions of Web pages scattered across the global network. Creating tools to locate
and navigate these information spaces has become a priority as more and more of
the public go online for their information-seeking needs. One of the first systems
for locating information on the Internet was Archie, initially developed in 1989 at
McGill University (Emtage & Deutsch, 1992). Archie – a play on the term
“archive” – was essentially an index to thousands of FTP sites around the Internet.
FTP sites and compile a master index, mirroring the file structure of the servers
indexed. Users could submit word queries through a command line interface or
via e-mail. The queries were processed against the compiled index, and a set of
matching directories or files was returned. In the early 1990s, two other tools,
Veronica and Jughead, were developed to help locate documents across remote
that the Deep Web is several orders of magnitude larger than the indexable Web (Bergman,
2001).
24
Since Archie, coincidentally, was the name of a popular American
comic book character, the Veronica and Jughead systems were named after other
characters from the same comic series.
contributors, 2006c), while Jughead was designed to search only within one
search system was the Wide Area Information Server (WAIS), a commercial
software package that allowed the indexing of large quantities of information, and
then made those indices searchable across the Internet. Examples of information
found on WAIS were the Dow Jones stock listings, United States government
documents, and the Library of Congress archives (Gillies & Cailliau, 2000).
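
The index-then-query approach attributed to Archie above can be sketched in a few lines. This is an illustration under stated assumptions, not Archie's actual implementation: the server names and file paths are invented, the listings are supplied as plain data rather than gathered over FTP, and substring matching on file names stands in for whatever query rules the real system applied.

```python
# Hypothetical listings, as if gathered by periodically visiting each FTP site.
LISTINGS = {
    "ftp.example.edu": ["/pub/gnu/emacs.tar", "/pub/docs/README"],
    "ftp.example.org": ["/mirrors/emacs/emacs.tar", "/images/map.gif"],
}


def build_index(listings):
    """Flatten per-server listings into sorted (filename, server, path) records."""
    index = []
    for server, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]
            index.append((filename.lower(), server, path))
    return sorted(index)


def search(index, query):
    """Return (server, path) pairs whose file name contains the query."""
    q = query.lower()
    return [(server, path) for name, server, path in index if q in name]


index = build_index(LISTINGS)
hits = search(index, "emacs")
```

Note that the sketch matches only names of files, never their contents, which mirrors the limitation of a tool that indexes file listings rather than documents.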
The Gopher and WAIS search tools were barely more than a couple of
years old when they were overwhelmed by the rapid development of the World
Wide Web. Because Berners-Lee’s World Wide Web protocols enabled users to
create simple pages of information on the Web with relatively little effort or
expertise, pages could be added to the Web without any particular organization or
While the diminished barriers to entry and decentralized nature of the Web were
considered among its most significant characteristics, the resulting “rummage sale
of information on the World Wide Web” (Bowker & Star, 1999, p. 7) became
World Wide Web was little more than word of mouth – a colleague telling other
colleagues about an interesting or new Web site. Lists of new and notable Web
links could also be found online, such as the “What’s New” section of the
homepage of Netscape, the default start page of one of the first widely used Web
browsers.25 Paper listings of Web sites were also published, but, as Candy
Schwartz recognized early on, “print publishing has never been a particularly
1998, p. 974).26
By the mid-1990s, as the Web continued to grow exponentially in size and
the user was already aware of the location of the information resource.
Researchers began to look for methods that could add some sense of organization
Web Directories
example, a user seeking sites on Web graphics, the hierarchy might look
25
Archives of Netscape’s “What’s New” pages are viewable at
http://wp.netscape.com/home/whatsnew/.
26
I recall purchasing a printed directory of the World Wide Web in the
late 1990s. The size of a large city’s phone book, it became obsolete in a matter of
months.
The categorization is usually based on the whole website, rather than one page or
a set of keywords, and sites are often limited to inclusion in only one or two
categories. Web directories typically allow site owners to directly submit their site
for inclusion, and have human editors who review submissions for evaluation and
categorization.
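
The hierarchical structure of a Web directory can be sketched as a tree of categories with whole sites attached to its nodes. The category path, the site URL, and the limit of two categories per site are assumptions made for illustration (the limit mirrors the "one or two categories" convention described above); no particular directory's software is being reproduced.

```python
MAX_CATEGORIES_PER_SITE = 2  # assumption: sites appear in at most two categories


class Category:
    def __init__(self, name):
        self.name = name
        self.children = {}  # subcategory name -> Category
        self.sites = []     # whole sites filed directly under this category

    def child(self, name):
        """Return the named subcategory, creating it if it does not exist."""
        return self.children.setdefault(name, Category(name))


class Directory:
    def __init__(self):
        self.root = Category("Top")
        self.placements = {}  # site URL -> number of categories it is filed in

    def add_site(self, url, path):
        """File a site under the category path, as a human editor might."""
        if self.placements.get(url, 0) >= MAX_CATEGORIES_PER_SITE:
            raise ValueError(f"{url} is already filed in the maximum number of categories")
        node = self.root
        for name in path:
            node = node.child(name)
        node.sites.append(url)
        self.placements[url] = self.placements.get(url, 0) + 1

    def browse(self, path):
        """Return (subcategory names, site URLs) found at a category path."""
        node = self.root
        for name in path:
            node = node.children[name]
        return sorted(node.children), sorted(node.sites)


directory = Directory()
directory.add_site("http://graphics.example.com",
                   ["Computers", "Internet", "Web", "Graphics"])
directory.add_site("http://graphics.example.com", ["Arts", "Design"])
```

A user reaches the site only by clicking down through each level in turn, for example by calling `directory.browse(["Computers", "Internet", "Web"])` and then choosing `Graphics`.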
One of the earliest Web directories was the World Wide Web Virtual
Today, the Virtual Library consists of 263 individual categories of sites, each
maintained by its own “librarian,” including experts from academia, industry, and
volunteers.28
Stanford University Ph.D. students. In January 1994, Jerry Yang and David Filo
began compiling a list classifying their favorite Web sites, “Jerry and David’s
Guide to the World Wide Web.” Within a year, their amateur and informal
directory had received over 1 million visitors from across the globe, and its name
was changed to the more memorable (and marketable) “Yahoo!”, chosen both for
27
Archived at http://www.w3.org/History/19921103-
hypertext/hypertext/DataSources/bySubject/Overview.html.
28
Viewable at http://vlib.org/.
its playful meaning and as an acronym for “Yet Another Hierarchical Officious
Web directory. As the site grew and the number of links increased, its method of
Yang and Filo turned to Srinija Srinivasan, hired as Yahoo!’s fifth employee and
“Chief Ontologist,” who took Yahoo!’s extensive lists of Internet sites and
knowledge” (Srinivasan, 2005). Building from the ad hoc categories she inherited
from Yang and Filo, Srinivasan and her team of editors began slowly and
deliberately steering Yahoo!’s ontology toward this “holistic view,” adding and
page in Yahoo!’s directory (see Figure 1). Once the ontological categories
become stable, Jerry Yang argued, “We will have captured the breadth of human
While helping to organize the increasingly chaotic World Wide Web, Web
directories have four major drawbacks, two of a pragmatic nature, and two
thousands of Web pages are created and updated daily, and human-driven Web
directories have difficulty keeping up with the rapid growth of the Web. Yahoo!,
for example, initially employed only 20 editors for its directory, allowing
however, the need to hire another “50 or 60 classifiers” otherwise the “percentage
of sites Yahoo! knows about will continue to shrink” (Steinberg, 1996, p. 111).
Directory Project, one of the largest and most extensive human-edited directories
29
Number of editors listed at http://www.dmoz.org, although it is
estimated that only around 10% are “active” (Wikipedia contributors, 2007c).
30
Archived at
http://web.archive.org/web/19961017235908/http://www2.yahoo.com/.
The second pragmatic drawback of Web directories involves their user-
browsable, that is, users can click on a subject of interest to see pertinent links and
subcategories on the topic. Users, however, are dependent upon the Web editors’
lexicon, and it can prove difficult to discern exactly under which topic heading a
particular item has been classified. For example, a user seeking a website about
the painkiller Tylenol might not find a category of that name, but instead would
Navigating the depths of such a hierarchical subject tree requires both familiarity
with the topic and its general vocabulary, as well as the patience for engaging the
trial and error, as some clicks inevitably will result in dead-ends. To help alleviate
larger directories provide a simple search function (as shown in Figure 1),
allowing users to search for particular subject categories and page listings within
the directory (but typically not within the content of the linked Web pages).
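
The scope of such a directory search function can be illustrated with a brief sketch. The entries, category names, and URLs below are hypothetical; the only point carried over from the text is that the search covers category labels and listing titles, not the content of the linked pages.

```python
# Hypothetical directory entries; page_text stands in for the linked page's
# content, which the directory search below never consults.
ENTRIES = [
    {
        "category": "Health > Pharmacy > Drugs and Medications > Acetaminophen",
        "title": "Pain Reliever Dosage Guide",
        "url": "http://example.com/dosage",
        "page_text": "Dosage guidance for Tylenol and other brands.",
    },
    {
        "category": "Business > Companies > Pharmaceuticals",
        "title": "Tylenol Official Site",
        "url": "http://example.com/tylenol",
        "page_text": "Product information.",
    },
]


def search_directory(entries, query):
    """Match the query against category labels and listing titles only."""
    q = query.lower()
    return [
        entry["url"]
        for entry in entries
        if q in entry["category"].lower() or q in entry["title"].lower()
    ]


by_brand = search_directory(ENTRIES, "tylenol")
by_generic = search_directory(ENTRIES, "acetaminophen")
```

Here the brand-name query finds only the listing whose title happens to contain it; the dosage guide, which mentions the brand only in its page text, is found only by a user who already knows the editors' chosen term.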
Along with these two pragmatic shortcomings, the presence of human editors
ontological concerns over bias and authority in any attempt to, as Yahoo! puts it,
humans to evaluate and place Web sites within the directory places them in a
position of ontological authority over which sites are included (and which are
not), and where in the hierarchy they belong. For example, should a link to
directories, and act as gatekeepers holding “the key to inclusion” for site owners
wishing to have their pages indexed (Introna & Nissenbaum, 2000, p. 171).
the biases and politics of classification itself casts a shadow on the usefulness of
Web directories, and deserves greater attention. The drive for the classification
modern Europe, rooted in the rational and scientific methods gaining dominance
at that time. Carolus Linnaeus’ classification system for separating animals and
pyramid fashions, with overarching general categories and subdivisions, all the
way down to specific topics” (Stockwell, 2001, p. 98). The division of topics into
structured hierarchies is meant to help reduce large sets of knowledge to a logical
schemes of Web directories attempt to mimic such structures, moving from the
human ignorance” (Headrick, 2000, p. 22). Or, as Bates (2002) realizes in his
things does not lend itself to simple orders. All the distinctions between various
are problematic, Bates maintains, because “they reify particular orders and present
them as an objective reality. The individual map defines one version of the world
the mind, which imposes on its subjects an arbitrary pattern that distorts their
encyclopedia which organizes the animal world according to a complex and
foreign system of criteria: “(i) frenzied, (j) innumerable, (k) drawn with a very
fine camelhair brush, …(m) having just broken the water pitcher” (Foucault,
encyclopedia is not the seemingly absurd categories that order the world of
animals so much as one particular category: “(h) those that are included in this
category, and the “monstrous quality of the encyclopedic order is not the oddity of
juxtaposition but the destruction of a common ground for any order” (Bates, 2002,
p. 4). Such encyclopedic “order” represents not an ontological category, but only
Geoffrey Bowker and Susan Leigh Star (1999) continue this criticism of
arbitrary classification systems, arguing that any such systems are inherently
social organization, moral order, and layers of technical integration” (p. 33). They
stress that the “material force of classification systems” impacts our world
makes a similar claim in her argument that those who determine the classificatory
categories, and how such categories can and will be used, impute their own
personal values and ideologies into the system, exerting power over both the user
The systematic organization of knowledge in Web directories, by
categorization, a system that Foucault, Bowker & Star, and Suchman reveal to be
not only arbitrary, but often politically charged. While the systematic organization
of Web directories was meant to improve the ability to find information on the
rapidly expanding Web, their structure – like paper encyclopedias before them –
threatens to impart a dogmatic rigidity to the way Web sites and information are
Domain: Participants:
• No clear edges
(adapted from Shirky, 2005)
Comparing the World Wide Web to these characteristics reveals that relying on
The list of factors making ontology a bad fit is, also, an almost perfect
description of the Web – largest corpus, most naive users, no global
authority, and so on. The more you push in the direction of scale, spread,
fluidity, flexibility, the harder it becomes to handle the expense of starting
a cataloguing system and the hassle of maintaining it, to say nothing of the
amount of force you have to get to exert over users to get them to drop
their own world view in favor of yours. (Shirky, 2005)
Browse says the people making the ontology, the people doing the
categorization, have the responsibility to organize the world in advance.
Given this requirement, the views of the catalogers necessarily override
the user’s needs and the user’s view of the world. If you want something
that hasn’t been categorized in the way you think about it, you’re out of
luck. (Shirky, 2005)
When combined with the growing complexity of the World Wide Web,
these four drawbacks of Web directories – the human labor required, the difficulty
of navigation, the potential for editors to act as gatekeepers, and the general
The search paradigm says the reverse. It says nobody gets to tell you in
advance what it is you need. Search says that, at the moment that you are
looking for it, we will do our best to service it based on this link structure,
because we believe we can build a world where we don’t need the
hierarchy to coexist with the link structure. (Shirky, 2005)
Web search engines reflect the epitome of this new search paradigm, employing a
Web, store their contents in a database, and automatically retrieve and rank results
based on a user’s specific search query. The first Web search engines were
In her review of the web search engine industry, Elizabeth Van Couvering
(forthcoming) has identified three distinct periods in the history of the search
Of the twenty-one search ventures launched in this short period of time, only six
remain as fully independent search engine providers. And while the industry’s
roots were in the academic domain of university research laboratories, the market
Table 2: Early period search engine dates, institutions, and founders.
Figure 2: Search engine mergers and acquisitions in the three periods of search
history (adapted from Van Couvering, forthcoming).
have emerged as the prevailing tool for accessing the vast amount of information
available on the World Wide Web and beyond. They locate, index, and provide
almost immediate access to billions of Web pages and related Internet content.
According to the Pew Internet & American Life Project, 84% of American adult
Internet users have used a search engine to seek information online (Fallows,
2005, p. 1). On any given day, more than 60 million American adults send over
200 million information requests to Web search engines, making Web searches
the second most popular online activity (behind using e-mail) (Rainie, 2005). They
processes, Figure 3 shows the typical architecture of a Web search engine divided
into three key modules: a crawler (left), an indexer (center), and a query and
Figure 3: Typical search engine architecture (adapted from Arasu et al., 2001)
Crawlers (also known as “spiders” or “bots”) are small programs that
“crawl” the Web on the search engine’s behalf, downloading Web pages into a
page repository for later processing in the indexing module (see, for example,
Heydon & Najork, 1999). Usually starting from a predetermined set of URLs,
crawlers progressively access and download Web pages, scan them for outgoing
links, which are themselves accessed and scanned for outgoing links, and so on.31
Due to the enormous size of the Web, search engines often employ multiple
possible for a crawler to visit and download large numbers of Web pages virtually
unattended, the main limitations being the ability to locate pages to be crawled
and the storage capacity of the page repository (Arasu et al., 2001, p. 3).32
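The crawling loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the design of any actual search engine: the `crawl` function, the `LinkExtractor` helper, and the in-memory `TOY_WEB` are all hypothetical names introduced here, and the "Web" is a small dictionary so that the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=1000):
    """Breadth-first crawl starting from a predetermined set of URLs.

    fetch(url) is a caller-supplied function returning the page's HTML,
    or None if the page cannot be retrieved.
    Returns the page repository as a dict mapping URL -> HTML.
    """
    repository = {}
    frontier = deque(seed_urls)   # URLs waiting to be visited
    seen = set(seed_urls)         # URLs already queued, to avoid revisits
    while frontier and len(repository) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        repository[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:   # queue newly discovered outgoing links
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return repository


# Hypothetical three-page "Web" held in memory so the sketch runs offline.
TOY_WEB = {
    "a.example/": '<a href="b.example/">B</a> <a href="c.example/">C</a>',
    "b.example/": '<a href="a.example/">A</a>',
    "c.example/": "<p>No outgoing links.</p>",
}

pages = crawl(["a.example/"], TOY_WEB.get)
print(sorted(pages))
```

The `max_pages` limit stands in for the storage capacity of the page repository, and the `seen` set reflects the other limitation noted above: only pages reachable through discovered links are ever located.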
The indexer module extracts all the words from each page downloaded by
the crawlers and records the location where each word was found, creating a very
large text index that can provide the URLs where any given word occurs on the
Web.33 The text index also typically includes meta-data about the appearance and
location of particular words, such as whether a word appeared in the page’s title,
module might also perform document preprocessing to make the overall indexing
31
Discovered outgoing links might be either immediately visited and
scanned by the same crawler, or put into a queue to be visited and scanned by
another crawler as directed by the crawl control system.
32
Despite the automated abilities of crawlers, some studies show that no
search engine has indexed more than 16% of the Web (Lawrence & Giles, 2000).
While Web crawlers face various challenges in their efforts to index the entire
Web (Arasu et al., 2001, pp. 4-13), discussing these in detail is beyond the scope
of this dissertation.
33
Limited, of course, by the portion of the Web initially crawled.
process more efficient. Preprocessing might include automatically ignoring
common words with little semantic importance (so-called “stop words” such as
“a”, “of”, “the”, or “it”), stemming words down to their root form (for example,
spelling variations, or word case (Arasu et al., 2001, pp. 18-19; Türker, 2004, pp.
16-17).
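To make the indexing and preprocessing steps more concrete, the following sketch builds a small text index that records the pages and word positions where each term occurs. It is an illustration under stated assumptions: `build_text_index` and `naive_stem` are hypothetical helpers, the stop-word list contains only the examples named above, and the suffix stripping is a crude stand-in for a real stemming algorithm.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "of", "the", "it"}   # the examples named in the text


def naive_stem(word):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_text_index(repository):
    """Map each preprocessed term to {url: [word positions]}."""
    index = defaultdict(dict)
    for url, text in repository.items():
        words = re.findall(r"[a-z0-9]+", text.lower())   # ignore word case
        for position, word in enumerate(words):
            if word in STOP_WORDS:
                continue                                  # skip stop words
            term = naive_stem(word)
            index[term].setdefault(url, []).append(position)
    return index


# Hypothetical page repository holding plain text instead of HTML.
repository = {
    "a.example/": "The history of searching the Web",
    "b.example/": "Web searches and the search engine",
}
index = build_text_index(repository)
print(index["search"])
print(index["web"])
```

Because "searching", "searches", and "search" all reduce to the same term, a single lookup returns both pages, which is the efficiency gain that preprocessing is meant to provide.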
record the link structure of the documents crawled, providing information such as
the set of incoming and outgoing links to a page, parent/child page relationships,
adjacent pages, and so on.34 The link index might be used to direct future Web
crawling activity, to help provide “related pages” results for particular queries,
algorithms (Broder et al., 2000). Finally, a utility index might contain results of
preliminary calculations and rankings based on the content and link structure of
indexed pages to speed query processing (Arasu et al., 2001, pp. 18-19).
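A link index of the kind described above might be sketched as two mappings, one for outgoing and one for incoming links. The function name `build_link_index` and the sample link data are hypothetical; the sketch only shows how incoming links can be derived from the outgoing links a crawler records.

```python
from collections import defaultdict


def build_link_index(outgoing_links):
    """Derive incoming links from a {page: [pages it links to]} mapping."""
    outgoing = {page: set(targets) for page, targets in outgoing_links.items()}
    incoming = defaultdict(set)
    for page, targets in outgoing.items():
        for target in targets:
            incoming[target].add(page)
    return outgoing, dict(incoming)


# Hypothetical link data, as a crawler might have recorded it.
crawled_links = {
    "home": ["news", "about"],
    "news": ["home", "about"],
    "about": [],
}
outgoing, incoming = build_link_index(crawled_links)
print(sorted(incoming["about"]))   # pages that link to "about"
print(sorted(outgoing["home"]))    # pages that "home" links to
```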
submitting it through the search engine. The query and ranking modules of a
search engine receive and fulfill these search requests, relying on the indices
prepared by the crawler and indexer modules to find matches to users’ keyword
requests, and utilizing algorithms to rank and sort results to achieve optimal
34
The mapping of the link structure of the web by search engines is
discussed in more detail in the next chapter.
relevancy to the user’s request. The query engine’s interface is typically a text box
for inputting the search terms or phrases desired, sometimes accompanied by
checkboxes to indicate whether the request should focus on Web sites, video files,
images, and so on. Queries can be quite simple, such as a single word, or more complex,
such as a string of words or a phrase within quotations. The use of advanced search
operators, such as the Boolean commands “AND”, “OR”, and “NOT”, allows
users to refine or extend the terms of the search query. Search terms are often
processing. The query engine then scans the text index for the search terms, and
the pages presented to the user. While there may be millions of Web pages that
include the search terms, some pages may be more relevant, popular, or
rank the results, providing the “best” results first. Ranking algorithms and
techniques vary across search engines, and while their exact details are considered
the content of Web documents, their layout and attributes, and their link structure.
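As a rough illustration of how a query engine might resolve the Boolean operators mentioned above against a text index, the operators can be treated as set operations over the pages containing each term. The `evaluate` function, its parameter names, and the sample index are assumptions made for this sketch and do not describe any particular search engine.

```python
def evaluate(index, all_pages, must=(), any_of=(), must_not=()):
    """Return pages containing every term in `must` (AND), at least one
    term in `any_of` (OR, if given), and no term in `must_not` (NOT)."""
    results = set(all_pages)
    for term in must:
        results &= index.get(term, set())
    if any_of:
        results &= set().union(*(index.get(term, set()) for term in any_of))
    for term in must_not:
        results -= index.get(term, set())
    return results


# Hypothetical text index: term -> set of pages containing it.
index = {
    "digital": {"p1", "p2", "p3"},
    "camera": {"p1", "p2"},
    "review": {"p2", "p4"},
    "used": {"p1"},
}
all_pages = {"p1", "p2", "p3", "p4"}

print(sorted(evaluate(index, all_pages, must=["digital", "camera"])))
print(sorted(evaluate(index, all_pages, must=["digital"], must_not=["used"])))
print(sorted(evaluate(index, all_pages, any_of=["camera", "review"])))
```

The result of such a step is only an unordered set of matching pages; deciding which of them to present first is the separate task of the ranking module.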
the relevance of that document to the user’s search. Utilizing the meta-data
collected by the indexer, the ranking engine can also estimate the relative
documents where the search term appears in a title or hyperlink, for example, than
higher within search results, a Web site owner can manipulate how search engines
rank their page by altering the way the page “looks” to the engine, such as adding
hidden or misleading keywords and phrases to fool the engine into ranking it
higher for search terms that are not actually relevant for the page. Conversely,
search engine results. For example, many Web search engines’ own sites do not
actually contain the phrase “search engine,” reducing the chances these pages
would be ranked highly in query results for that phrase (Kleinberg, 1999).35
Finally, as the number of pages on the Web continues to increase, the number of
documents that might include the search terms increases proportionally, and it
Ranking algorithms that take advantage of the Web’s link structure have
emerged in an attempt to improve the quality of search engine results in the face
35
In their academic article introducing Google, Brin and Page note that
“as of November 1997, only one of the top four commercial search engines finds
itself (returns its own search page in response to its name in the top ten results)”
(Brin & Page, 1998)
of these challenges. This approach involves analyzing the hyperlinks between
measurement of authority for that page, in which a page with many incoming
links is considered more authoritative than a page with none. The Hypertext
Induced Topic Selection (HITS) (Kleinberg, 1999) and Google’s PageRank (Brin
& Page, 1998; Page et al., 1998) are the best known of these algorithms.36 To
overcome the threat of linkspamming (the flooding of a page with incoming links
to deceptively increase its ranking), these algorithms calculate and apply page
authorities recursively, that is, instead of just counting the raw number of links to
how much weight to give it. Thus, a page receives more importance if
Microsoft.com itself has many incoming links. The authority of a page both
depends on and influences the authority of other pages (Arasu et al., 2001, p. 28).
There are many variations on this approach, but utilizing the link structure of the
Web to help determine the ranking of search engine results has become an
the drawbacks inherent in the “browse paradigm” of Web directories. First, they
are fully automated, employing multiple crawlers to scour the Web for new pages,
eliminating the burden of a large staff of humans to keep up with the rapidly
expanding Web. Second, search engines are more user-friendly than Web
36
PageRank will be discussed in more detail in the next chapter.
directories, typically featuring a simple text box to enter the search terms. Results
are often provided with the title of the page as well as a brief description of what
the page contains. Users no longer need to understand and discern a specialized
vocabulary, nor must they navigate complex hierarchies in order to find links to
relevant Web sites. Further, since the perfect search engine is designed to provide
specific and relevant results for each individual query, personalized to the
particular searcher, the ontological struggles inherent in fitting Web content into
pages on the Web like books in a library that can be neatly classified into rigid
categories, Web search engines exploit the inherent link structure of the Web,
locating, indexing, and ranking pages based on their relationship to other pages, in
order to “make sense of the vast heterogeneity of the World Wide Web” (Page et
Web search engines, however, cannot alleviate all of the drawbacks of the
Web directory model of organizing and navigating the World Wide Web. One of
the drawbacks identified above was the potential for an individual editor’s bias to
impact the decision whether to include a particular Web site in the directory and
which ontological category it best fits. While Web search engines are often
portrayed as neutral technologies merely selecting and ranking Web sites based
on the “democratic nature of the Web” (Google, 2004c), the technical design of
their algorithms might actually heighten the bias feared in Web directories.
Introna and Nissenbaum’s (2000) seminal study, “Shaping the Web: Why the
Politics of Search Engines Matter,” was among the first to challenge the neutrality
of search engines, revealing how they “systematically exclude certain sites, and
programming of bias within the algorithmic (and invisible) code of Web search
engines makes their bias much more threatening than the personal bias of a
& Lawrence, 2001; Chandler, 2002; Hargittai, 2004b; Vaughan & Thelwall,
2004; Diaz, 2005) have built on Introna and Nissenbaum’s thesis to reveal how
routinely fail to index the entire World Wide Web. A 1999 study claimed that the
top six search engines together indexed only 42% of the Web (Lawrence & Giles,
80%-90% for each of the major engines (Vaughan, 2004). Nevertheless, as more
Web services and applications rely on dynamic Web pages, such as online stores
that only generate a Web page in response to a specific product search, this so-
called “invisible Web” is often left outside Web search engines’ indexes
(Bergman, 2001). Even where specific efforts were made to ensure such pages are
visible to search engines (e.g., the Open Archives Initiative), the best search engine
was able to find only 60% of this content (McCown et al., 2006).
that exists on the Web. A large part of their continued success, and what allowed
them, on the whole, to survive the dot-com bubble that bankrupted countless Web
(Vine, 2004). They earn the vast majority of their revenue through the sale of
advertising space on search results pages: in 2003, the shares of revenue due
to advertising at Yahoo! and Google were 82% and 95%, respectively (Van
Couvering, 2004, p. 7). Search engine advertising takes various forms. Following
the trend of other non-search Web sites, such as the online versions of
newspapers, some search engines include graphical banner ads on their home
page or search results pages, earning revenue each time the ad is viewed, clicked,
or some other action is taken. Search engines can also earn revenue by charging
Web sites for inclusion in the search engine’s index, or for increasing the frequency
with which their sites are crawled. For example, Yahoo!’s paid inclusion program
guarantees that paying clients’ websites will be crawled for updates every two
days, while it may update its index of other sites only once a month (Hansell,
2004b).
crawler’s path, but do not necessarily guarantee a particular spot in the ranking of
search results or position on the search engine results page. Most search engines
have separate paid placement programs, where Web sites pay a fee to have links
placed within the results for a particular search query. For example, a digital
camera manufacturer may pay a search engine to gain a prominent position on the
page resulting from a user’s search for “digital cameras.” Usually, paid listings
are shown above, or to the side of, any standard unpaid search results (also
results or advertising (see Figure 4), although there are search engines that insert
paid results into the organic results with little or no user notification (Wouters,
2005).
Figure 4: Search results page for "digital cameras" showing paid placement of
links to advertisers (circled).
Paid placement advertising has quickly become the primary revenue
source for Web search engines (Reinhardt, 2003). Google’s AdWords37 and
launching its own adCenter39 to tap into this lucrative market (Hansell, 2005).
increasingly are placing similar contextual ads across their diverse offerings, such
as their mapping or e-mail products (Hansell, 2004a; Roush, 2005). Search engine
content providers to include contextual ads on their Web properties (Thaw &
Daurat, 2006; Waters, 2006) and the potential revenues from search-related
(Fabrikant, 2005). Marking the health of the search advertising market, the Search
revenues to be $5.75 billion for 2005 in North America alone, and predicts that
Organization, 2006). Google has capitalized most on the growing search engine
billion, moving it into the top twenty-five largest corporations, with a market
37
http://adwords.google.com/
38
http://searchmarketing.yahoo.com/
39
http://adcenter.microsoft.com/
capitalization larger than that of IBM, AT&T, or Intel.40 Amazingly, one of the world’s
engine providers continually work to improve and expand their services in order
to increase their advertising revenues. To help achieve these financial goals, there
has been an ongoing quest within the search engine industry to create the “perfect
search engine,” one that has indexed all available information and provides the
most relevant and personalized results (see Kushmerick, 1998; Andrews, 1999;
Gussow, 1999; Mostafa, 2005). A perfect search engine would deliver intuitive
results based on users’ past searches and general browsing history (Pitkow et al.,
2002; Teevan et al., 2005), knowing, for example, whether a search for the
keywords “Paris Hilton” is meant to help a user locate the hotel chain in the
French city, or find the latest gossip about the young socialite. Search engine
companies have clear financial incentives for achieving the “perfect search”:
partners as well as improving chances that the user would purchase fee-based
services. Similarly, search engines can charge higher advertising rates when ads
40
Retrieved December 3, 2006 from Yahoo! Stock Screener at
http://screen.yahoo.com/stocks.html.
are accurately placed before the eyes of users with relevant needs and interests
(Hansell, 2005).
Along with these financial incentives for the search engine providers,
journalist John Battelle illustrates the potential benefits of the perfect search engine
for users:
Imagine the ability to ask any question and get not just an accurate answer,
but your perfect answer – an answer that suits the context and intent of
your question, an answer that is informed by who you are and why you
might be asking. The engine providing this answer is capable of
incorporating all the world’s knowledge to the task at hand – be it
captured in text, video, or audio. It’s capable of discerning between
straightforward requests – who was the third president of the United
States? – and more nuanced ones – under what circumstances did the third
president of the United States foreswear his views on slavery?
This perfect search also has perfect recall – it knows what you’ve
seen, and can discern between a journey of discovery – where you want to
find something new – and recovery – where you want to find something
you’ve seen before. (Battelle, 2004)
When asked what a perfect search engine would be like, Google’s Sergey Brin
replied, perhaps jokingly (but perhaps not), “like the mind of God” (quoted in
Ferguson, 2005, p. 40). To attain such an omnipresent and omniscient ideal, the
perfect search engine must have both “perfect reach” in order to provide access to
all available information on the Web and “perfect recall” in order to deliver
personalized and relevant results that are informed by the previous habits of that
particular searcher.
Perfect Reach
To achieve the reach necessary for the perfect search, Web search engines
amass enormous indices of the Web’s content. Expanding beyond just HTML-based
Web pages, search engine providers have indexed a wide variety of media
found on the Web, including images, video files, PDFs, and other computer
documents. For example, Yahoo! claims to have indexed over 20 billion items,
including over 19.2 billion Web documents, 1.6 billion images, and over 50
million audio and video files (Mayer, 2005). Google claims to have an index more
than three times larger than that of any other search engine (Google, 2005n), and
it is estimated that Google has indexed nearly 70% of the total World Wide Web
(Sullivan, 2005). The increasing sophistication and reach of Web crawler and
entire World Wide Web to fuel the quest for the perfect search – So powerful that
century proclamation that “esse est percipi” (to exist is to be perceived) to the
indexed on Google.
Perfect Recall
and understand searchers’ intellectual wants, needs, and desires when they
relevant results. The primary means for personalizing search results is to rely on
users’ search habits and history (see, for example, Speretta, 2000; Pitkow et al.,
2002; Teevan et al., 2005). To gather users’ search histories, most Web search
engines maintain detailed server logs recording each Web search request
processed through their servers, the pages viewed, and the results clicked (see, for
example, Google, 2005i; IAC Search & Media, 2005; Yahoo!, 2006). Search
engines also rely heavily on Web cookies to help differentiate users and track
activity from session to session, and increasingly push the creation of user
accounts to help associate particular users with their online activity.41 The
motivation behind gathering this user information is explained to the user in terms
of improving their search experience. Google, for example, states, “We use this
information to improve the quality of our services and for other business
purposes” (Google, 2005i), while the search engine Ask.com also presents its
economic motivations fueling the need for this perfect recall in pursuit of the
A Faustian Bargain?
The quest for the perfect search engine has led to calls for search engines
to provide results that suit the “context and intent” of the search query. Given a
search for “Paris Hilton,” the perfect search engine will know whether to deliver
results about the celebrity or a place to spend the night in France. To attain such
an omnipotent and omniscient ideal, the perfect search engine will have to have
41
These practices will be discussed in further detail in the following
chapter.
“perfect reach” and be able to deliver any type of online content from all online
results that are informed by who the searcher is. Search engine users are
repeatedly reminded of the benefits of the perfect search engine, ranging from
Google’s bold goal to “organize the world's information and make it universally
and strategy:
visible” (Brey, 2000, p. 13). Given our position that technology bears “directly
of social, ethical, and political values,” we must work to uncover and morally
evaluate the values and norms embedded in the quest for the perfect search
engine. For example, what does it mean to have all available information on the
Web indexable and searchable – essentially at the fingertips of any person with
access to the Internet? Or, what are the consequences of having search engines
offerings?
Herein lies the concern that the perfect search engine is a Faustian bargain:
The perfect search engine promises accuracy, efficiency, and relevancy, but at
what cost? Does the perfect search “giveth” as well as “taketh away?” In the
the promises made by technology, ignoring such issues of power, equality, and
more tempting to simply take the design of such tools “at interface value” (Turkle,
explore the Faustian bargain that we are making in embracing the perfect
search engine. To come to terms with this Faustian bargain, the next chapter will
focus on the search engine that holds the most promise for achieving the perfect
search, and invokes the most anxiety among its critics: Google.
CHAPTER IV
Introduction: Google
The web search engine Google has established itself as the prevailing
interface for searching and accessing virtually all information on the Web.
Originating in 1996 as a Ph.D. research project by Larry Page and Sergey Brin at
Stanford University (see Brin & Page, 1998; Page et al., 1998), Google was a
market share in December 2000 (Sullivan, 2001). Google’s Web search engine
quickly rose to dominate the U.S. market, processing almost 3.6 billion search
Google held an initial public offering in August 2004, and within six months rose
to become one of the 100 largest companies in the world (Datamonitor, 2005).44 It
reported $10.6 billion in revenues during fiscal 2006 (Google, 1999), and has
42
See Table 2 in previous chapter.
43
At its peak in early 2004, Google handled upwards of 80 percent of all
search requests on the Web through its own website and clients like Yahoo!,
AOL, and CNN, which relied on Google for their customers’ search engine results.
Google’s share fell to a still dominant 57% in 2004 when Yahoo! dropped
Google’s search technology for their own (Hansen, 2004).
44
Based on market capitalization as of March 30, 2005.
become one of the top twenty-five largest corporations in the world, with a market
factors: its grassroots origins in academia; its simple, clean interface design and
use of only text-based advertising; its belated and unconventional initial public
offering (Google, 1999); its constant stream of new services and Web
technologies; and the appeal of its informal corporate motto, “Don’t be evil”
trusted service, with a lighthearted corporate philosophy, Google has won the
hearts and minds of millions of users, becoming so popular that it has even
generated its own verb, to google (Harris, 2006), and is regarded as one of the
most reputable companies in the world (Alsp, 2005). Google, in short, is the “gold
standard” against which all other search engine practices and innovations are
measured (Hellweg, 2002; Clark, 2006). The core of Google’s Web search engine
PageRank
stand apart from the competition. In 1998, Brin and Page’s paper, “The Anatomy
proposed a system to more effectively retrieve information from the World Wide
45
Retrieved December 3, 2006 from Yahoo! Stock Screener at
http://screen.yahoo.com/stocks.html.
Web to “improve the quality of search engines” and thus “bring order to the Web”
(Brin & Page, 1998, p. 3). The core of their new Web search engine is PageRank,
a set of algorithms for ranking Web pages, using the immense link structure of the
PageRank relies on the uniquely democratic nature of the Web by using its
vast link structure as an indicator of an individual page's value. In essence,
Google interprets a link from page A to page B as a vote, by page A, for
page B. But, Google looks at more than the sheer volume of votes, or links
a page receives; it also analyzes the page that casts the vote. Votes cast by
pages that are themselves “important” weigh more heavily and help to
make other pages “important.” (Google, 2004c)
meant here by the “link structure of the Web.” Recall from the previous chapter
that the Web is essentially a set of hypertext documents, each of which contains a
number of unidirectional links to other documents. In this light, we can view the
Web as a directed graph (Figure 5), wherein a node represents a specific page (A,
B, etc.), and an “edge” from node A to node B represents a link from page A to
page B. Such a graph describes how each page is interlinked and interrelated,
revealing the topology of the network of Web pages.46 Mapping the link structure
of the Web in this way provides a detailed account of the complex inter-
46
See (Barabási, 2003; Watts, 2003) for introduction to network and graph
theory, and (Broder et al., 2000) for technical details on the graph structure of the
Web.
Figure 5: Hypothetical Web graph. Seven pages contain links as specified by the
table to the right. This link structure is depicted in the directed graph to the left.
(adapted from Diaz, 2005)
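The idea of treating links as weighted votes on a directed graph can be sketched as follows. This is a simplified illustration and not Google’s implementation: the four-page graph is hypothetical (it is not the graph of Figure 5), the function names are introduced here, and the damping factor is an assumption borrowed from common descriptions of the algorithm rather than something discussed in this chapter. The sketch also assumes every page has at least one outgoing link.

```python
def backlink_counts(links):
    """Number of incoming links per page (simple backlink counting)."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a {page: [linked pages]} graph.

    Each page splits its current rank evenly among its outgoing links.
    The damping value is an assumption; every page listed here must
    have at least one outgoing link and appear as a key in `links`.
    """
    n = len(links)
    rank = {page: 1.0 / n for page in links}   # arbitrary starting values
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in links}
        for page, targets in links.items():
            share = rank[page] / len(targets)   # rank passed along each link
            for target in targets:
                new_rank[target] += damping * share
        rank = new_rank
    return rank


# Hypothetical four-page Web; not the graph shown in Figure 5.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
counts = backlink_counts(links)
ranks = pagerank(links)
print(counts)
for page in sorted(ranks):
    print(page, round(ranks[page], 3))
```

In this toy graph, pages A and B each have exactly one incoming link, so simple backlink counting cannot distinguish them. The iterative calculation ranks A above B because A’s single link comes from C, the page with the most incoming links, while B receives only half of A’s rank.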
backlink counting in which the search engine simply calculates the number of
pages that link to a particular page A. If this count is high – many pages refer to A
step further by recognizing that not all links are equal. With simple backlink
ranking as a similar link from the New York Times website. Borrowing from
academic citation analysis, Google’s founders recognized that some links are
more authoritative than others – a link from the Times is (presumably) more
pages. The importance of the Times is, in turn, measured by the importance of all
the pages that refer to it, creating a recursive calculation. Since the New York
Times site is deemed more “important” than my own, the former link goes much
further in elevating the PageRank of my professor’s site, and thus its visibility
among the results. This recursive definition of PageRank differs sharply from
page’s importance without taking into account any textual information. In other
words, the PageRank score of a page is influenced by neither the contents of the
page itself nor the user’s search terms, but is based solely on the aggregate
revealing both its recursive nature and how the relative importance of certain
The rank of a page is divided among its forward links evenly to contribute to the
ranks of the pages they point to.47 For example, page 1 has a PageRank of 0.304,
and links to five other pages (2, 3, 4, 5, and 7), thus sharing with them a
PageRank of 0.061 (0.304 divided by five). Each of those pages combines the
value. Notice that page 1 and page 5 both have four incoming links. If a search
engine relied only on link counting to establish relevancy, these two pages would
that page 1 is nearly twice as important as page 5, since the pages behind page 1’s
four incoming links have higher PageRank values than the pages linking to page 5 (two of page 5’s
links, for example, come from pages with only one incoming link themselves,
Since the importance of any one page influences the importance of any
other, the usefulness and accuracy of PageRank is, in the end, dependent on
attaining a reliable mapping of the link structure of the entire World Wide Web on
which to base the calculations. To map the link structure of the Web, Google
deploys its Web crawler – Googlebot – to traverse and record billions of Web
47
In such a recursive calculation, the starting PageRank is not known.
However, as noted in Brin and Page’s (1998) paper, “PageRank…can be
calculated using a simple iterative algorithm”, meaning that the calculation can
start with any number, and then through repeated iterations, will converge on the
theoretically true PageRank value. An example of such iterative calculations can
be found at (Rogers, 2002). To save time, Google relies on linear algebra rather
than calculating multiple iterations (Langville & Meyer, 2006).
pages and their links.48 Beyond the initial description in Brin and Page’s (1998)
original article, little is known about the current form of Google’s crawler and
supporting architecture for mapping the Web; its details remain a closely guarded
trade secret. From the information made available (Barroso et al., 2003; Google,
2007), we can discern that Google uses a set of distributed crawlers, each on its
the Web’s link structure. At its launch, Brin and Page (1998) claimed Google had
and saving a copy of every Web page it encounters also allows Google to analyze
and utilize the content within those pages. For example, Google relies on
particular search term. At the time of Google’s launch, most search engines relied
heavily on how often a word appeared on a Web page in order to determine its
within the page, Google analyzes the full content of a page and factors in font
size, header levels, and the relative location of each word in order to measure its
importance. In such an analysis, a page with the search term in the title or in large,
48
Recalling the brief introduction in the previous chapter, Web crawlers
are small programs that “crawl” the Web on the search engine’s behalf,
downloading Web pages into a page repository for later processing.
bold font will be considered more relevant than a page on which the word appears
pages, Google can estimate the relative importance of particular words and
phrases.
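The difference between counting words and weighing where they appear can be illustrated with a short sketch. The weights, the region labels, and the function name term_score are all hypothetical; the text says only that titles, headers, and large or bold type count for more, and the values Google actually uses are not public.

```python
# Hypothetical weights, chosen only so that prominent placement outweighs
# plain body text.
WEIGHTS = {"title": 5.0, "header": 3.0, "bold": 2.0, "body": 1.0}


def term_score(term, page):
    """Score a page for a term by where the term appears, not just how often.

    page is a list of (region, text) pairs; region must be a key of WEIGHTS.
    """
    term = term.lower()
    score = 0.0
    for region, text in page:
        occurrences = text.lower().split().count(term)
        score += WEIGHTS[region] * occurrences
    return score


# Two invented pages: one uses the term once in its title, the other
# repeats it four times in ordinary body text.
page_a = [("title", "Vegetarian cooking classes"),
          ("body", "Weekly classes in Berkeley")]
page_b = [("title", "Local news"),
          ("body", "cooking cooking cooking tips and more cooking")]

score_a = term_score("cooking", page_a)
score_b = term_score("cooking", page_b)
```

Under these assumed weights the page with a single title occurrence outscores the page that merely repeats the word, which is the behavior a frequency-only engine would not produce.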
In the process of mapping the link structure of the Web and indexing the
content of every Web page crawled, Google has amassed an incredibly large
dataset of the words and sentences that appear on the Web.49 Google has taken
learning and natural language processing with the goal of improving their search
words appearing in proximity to each other, Google can cluster concepts into
for a certain search term, Google can determine the probability the searcher might
keywords not initially searched for. For example, if someone searches for “Bay
Area cooking class,” Google might determine through clustering that the related
49
Google recently released its corpus of over one trillion words scraped
from public Web pages to the linguistic community. The dataset included almost
100 billion full sentences (Google, 2006a).
50
Google has released how many misspelled queries it had over a three-
month period of users searching for “Britney Spears.” Almost 600 different
spellings were attempted by at least 2 users, all of which were corrected by its
spelling correction system. (http://www.google.com/jobs/britney.html)
51
See (Zamir & Etzioni, 1999; Wen et al., 2001) for technical descriptions
of clustering with Web search engines.
terms “Berkeley courses: vegetarian cuisine” are also a good match, even though they
contain none of the original query’s keywords. Other uses for clustering include
determining how to aggregate and organize related news stories within Google
engine: The company’s very first press release noted that “a perfect search engine
will process and understand all the information in the world…That is where
Google is headed” (Google, 1999). Google co-founder Larry Page later reiterated
the goal of achieving the perfect search: “The perfect search engine would
understand exactly what you mean and give back exactly what you want”
(Google, 2007). From its dominant market position, Google continues to refine its
PageRank algorithm, expand the reach of its crawler, and map the link structure
poised to achieve the perfect reach and perfect recall necessary to fulfill its quest
information-seeking contexts: general information inquiries, academic research,
discussed in more detail in Appendix A), the reach of Google’s crawlers and
index has expanded beyond websites to include other online documents as well,
such as images, news feeds, Usenet archives, and video files. Additionally,
Google has begun digitizing the “material world,” adding the contents of popular
books, university libraries, maps, and satellite images to their growing index.
Users can also search the files on their hard drives, send e-mail and instant
messages, shop online, and even engage in social networking through Google.
They also use these tools to communicate, navigate, shop, and organize their
activities, “Planet Google” has become a large part of people’s lives, both on- and
along the path towards realizing Sergey Brin’s dream of creating “a perfect search
engine [that] will process and understand all the information in the world”
(Google, 1999; emphasis added). While the seemingly perfect reach of Google is
52
These nine contexts are not necessarily mutually exclusive and are not
put forth as airtight metaphysical divisions. They are meant simply to help
compartmentalize the various types of information-seeking activities a person
undertakes in her daily activities for easier discussion.
well known – indeed its ability to access information other search engines appear
to miss is an ingredient of its great success – its attempts to attain perfect recall
are more likely to create anxiety among users, and deserve closer attention.
In order to provide results that suit the “context and intent” of the search
query, a perfect search engine must have “perfect recall” of who the searcher is
and her previous search-related activities. In order to discern the context and
intent of a search for “Paris Hilton,” the perfect search engine would know if the
searcher has shown interest in European travel, or whether she spends time online
searching for sites about celebrity gossip. Attaining such perfect recall requires
possible. To accomplish this, Google, like most Web search engines, relies on
fuel the perfect recall: the maintenance of server logs, the use of persistent Web
Maintained by nearly all websites, server logs help website owners gain an
understanding of who is visiting their site, the path visitors take through the
website’s pages, which elements (links, icons, menu items, etc.) a visitor clicks,
how much time visitors spend on each page, and from what page visitors are
leaving the site. In other words, a website owner aims to collect enough data to
reconstruct the entire “episode” of a user’s visit to the website (Tec-Ed, 1999).
Google maintains detailed server logs recording each of the 100 million search
requests processed each day (Google, 2005j). While the exact contents are not
publicly known, Google has provided an example of a “typical log entry” for a
user by the user’s Internet service provider, 25/Mar/2003 10:15:32 is the date
requested page, which also happens to identify the search query, “cars,” Firefox
1.0.7; Windows NT 5.1 is the browser and operating system being used, and
the first time it visited Google. To help further reconstruct a user’s movements,
advertising links a user clicks (Google, 2005i). Given Google’s wide array of
53
An Internet Protocol (IP) address is a unique address that electronic
devices use in order to identify and communicate with each other on a computer
network. An IP address can be thought of as a rough equivalent of a street address
or a phone number for a computer or other network device on the Internet. Just as
each street address and phone number uniquely identifies a building or telephone,
an IP address can uniquely identify a specific computer or other network device
on a network (Wikipedia contributors, 2007b).
54
A Web cookie is a piece of text generated by a Web server and stored in
the user’s computer, where it waits to be sent back to the server the next time the
browser accesses that particular Web address. By returning a cookie to a Web
server, the browser provides the server a means of associating the current page
view with prior page views in order to “remember” something about the previous
page requests and events (see Clarke, 2001; Kristol, 2001). Google’s use of Web
cookies allows it to identify particular browsers between sessions, even if that
browser’s IP address changes.
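The fields of such a log entry can be separated mechanically, as the following sketch suggests. The sample line is hypothetical: its IP address comes from a range reserved for documentation, the cookie value is invented, and the URL form and the " - " field separator are assumptions. Only the date, the query term “cars,” and the browser string are taken from the text. The function name parse_entry is likewise illustrative.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical log line; see the caveats in the paragraph above.
SAMPLE = (
    "203.0.113.7 - 25/Mar/2003 10:15:32 - "
    "http://www.google.com/search?q=cars - "
    "Firefox 1.0.7; Windows NT 5.1 - 0123456789abcdef"
)


def parse_entry(line):
    """Split one log line into named fields and pull out the search query."""
    ip, timestamp, url, agent, cookie_id = line.split(" - ")
    # The search terms travel inside the requested URL as the q parameter.
    query = parse_qs(urlparse(url).query).get("q", [""])[0]
    return {
        "ip": ip,
        "timestamp": timestamp,
        "url": url,
        "query": query,
        "agent": agent,
        "cookie_id": cookie_id,
    }


entry = parse_entry(SAMPLE)
```

The point of the sketch is that the search query does not need to be logged separately: it can be recovered from the requested page address alone and stored alongside the IP address and cookie ID.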
products and services, their server logs potentially contain much more than simply
a user’s Web search queries. Other searches logged by Google include those for
images, news stories, videos, books, academic research, and blog posts, as well as
links clicked and related usage statistics from within Google’s News, Reader,
Logging this array of information – the user’s IP address, cookie ID, date
and time, search terms, results clicked, and so on – enhances Google’s ability to
attain the “perfect recall” necessary to deliver valuable search results and
the IP address each request sent to the server along with the particular page being
requested and other server log data, it is possible to find out which pages, and in
which sequence, a particular IP address has visited. When asked, “Given a list of
search terms, can Google produce a list of people who searched for that term,
or Google cookie value, can Google produce a list of the terms searched by the
both questions, confirming its ability to track user activity through such logs
browsing and searching activities completely and consistently has its limitations.
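Both kinds of lookup discussed here, from an identifier to its search terms and from a search term to the identifiers that issued it, amount to simple grouping operations over log records. The sketch below assumes a hypothetical list of already parsed records; the field names, identifiers, queries, and function names are invented for illustration and do not describe Google's systems.

```python
from collections import defaultdict

# Hypothetical parsed log records (documentation-range IP addresses).
LOG = [
    {"ip": "203.0.113.7", "cookie": "c-001", "time": "10:15:32", "query": "cars"},
    {"ip": "203.0.113.7", "cookie": "c-001", "time": "10:17:05", "query": "car insurance"},
    {"ip": "198.51.100.4", "cookie": "c-002", "time": "10:20:41", "query": "cars"},
    {"ip": "198.51.100.9", "cookie": "c-001", "time": "18:02:10", "query": "used car dealers"},
]


def history_by(key, log):
    """Group queries, in time order, under an identifier such as 'ip' or 'cookie'."""
    histories = defaultdict(list)
    for record in sorted(log, key=lambda r: r["time"]):
        histories[record[key]].append(record["query"])
    return dict(histories)


def who_searched(term, log, key="cookie"):
    """Return the set of identifiers that issued exactly this query."""
    return {record[key] for record in log if record["query"] == term}


by_cookie = history_by("cookie", LOG)
by_ip = history_by("ip", LOG)
searched_cars = who_searched("cars", LOG)
```

In this toy data the cookie c-001 ties together three queries, while grouping by IP address misses the third because the address changed between sessions. That gap is one of the limitations of IP-based tracking.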
Internet through a university proxy server or through some ISPs (such as AOL)
might share the same IP address. Privacy concerns have also led more savvy
Internet users to disguise their IP address with anonymous routing services such
as Tor (Zetter, 2005b). Similarly, as the privacy concerns of the use of cookies to
features that make it easier to view, delete and block Web cookies received from
the sites they visit (McGann, 2005; Mindlin, 2006). Even in the absence of such
particular Web browser or computer, not necessarily a particular user. Neither the
browser passing the cookie nor the Web server receiving it can know who is
actually using the computer, or whether multiple users are using the same
differentiation between users, limiting the extent of the “perfect recall” necessary
register with the website and login when using the services (Ho, 2005, pp. 660-
661; Tec-Ed, 1999). When a user supplies a unique login identity to a Web server,
that information, along with the current cookie ID, is stored in each log file record
for that user’s subsequent activity at the site. By tying aspects of the site’s
functionality to being logged in, the user is compelled to accept the Web cookie
for that session. Even if the user deletes the cookie or changes her IP address at
the end of the session, by logging in again at the next visit, a consistent record for
the user in the server log can be maintained. Logging in with a unique user name
similarly reduces the variability of multiple or shielded IP addresses. Further, any
as age, gender, zip code, or occupation, can be associated with the user’s account
and server log history, providing a more detailed profile of the user.
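How a login identity stitches together records that cookies alone would leave separate can be shown with a small sketch. Everything in it is hypothetical: the records, the account name, and the function names are invented, and the step that attributes pre-login activity from the same cookie to the account is an assumption made for illustration, not a documented practice.

```python
# Hypothetical log records. The second and third rows carry the same login
# but different cookie IDs, as when a cookie is deleted between visits.
LOG = [
    {"cookie": "c-001", "account": None, "query": "flu symptoms"},
    {"cookie": "c-001", "account": "alice", "query": "pharmacy hours"},
    {"cookie": "c-777", "account": "alice", "query": "sick leave policy"},
    {"cookie": "c-900", "account": None, "query": "weather"},
]


def link_cookies_to_accounts(log):
    """Map each cookie ID to the account seen logging in with it, if any."""
    return {r["cookie"]: r["account"] for r in log if r["account"]}


def profile_for(account, log):
    """Collect every query attributable to an account.

    A record counts if it was made while logged in, or (by assumption)
    if it came from a cookie that was later tied to the account.
    """
    owners = link_cookies_to_accounts(log)
    return [
        r["query"]
        for r in log
        if r["account"] == account or owners.get(r["cookie"]) == account
    ]


alice_profile = profile_for("alice", LOG)
```

The deleted cookie does not break the record: the second login re-establishes the link, so all three of the account holder's queries end up in one profile while the unrelated browser stays separate. The sketch shares the limitation noted in the text, since it cannot tell whether several people use the same browser.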
that required users to register and login, including personalized search results, e-
mail alerts when sites about a particular topic of interest are added to Google’s
index (Kopytoff, 2004). Soon afterward, Google introduced products and services
that required the creation of a Google Account, such as Gmail, Google Calendar,
and the Reader service to organize news feeds. Other Google services can be
partially used without a Google Account, but users are encouraged to create an
include Google Video, with a Google Account required for certain premium
content, and Book Search, in which a Google Account helps control access to
with their own login protocols, migration to Google Accounts is typical, as was the
case with Blogger or Dodgeball (see Weinberg, 2005; Google, 2006b). Internally
developed products that previously utilized unique logins, such as Orkut, have
with its use of persistent Web cookies, provides the necessary architecture for the
creation of detailed server logs of users’ activities across Google’s various
products and services, ranging from the simplest of search queries to minute
details of their personal lives. While the full extent of the data capturable by
The ability to track specific Web search queries is the most discussed –
and perhaps most pernicious – aspect of Google’s ability to monitor its users. Logging
the specific search terms for each of the 100 million Web search queries it
processes daily, along with the particular results clicked, provides Google a
unique insight into the wants and needs of its users. Evidenced by the search
terms revealed in AOL’s release of search history data (Maney, 2006; McCullagh,
2006a), the individual search terms within Google’s logs are a mix of the
mundane and the stimulating, the trivial and the informative. While over half of
searchers say they split their searches among those that are “for fun” and those
that are “important” to them (Fallows, 2005), users are increasingly using the
internet and search engines to help them make important decisions or negotiate
their way through major episodes in their lives (Horrigan & Rainie, 2006). In such
potentially personal and sensitive circumstances, the terms for which users search,
along with the results they decide to click on, are stored in Google’s server logs.
Whether a user searches for teen pop star “Lindsay Lohan” or “Cleveland HIV
treatment center,” or whether a user clicks on a news story about “abortion rights”
55
See Table 4 for a summary of personal information collected across
Google’s products, and Appendix A for a more detailed description of each
product and its method of collecting personal data.
or a blog post on “American Idol,” all such actions across Google’s services
become associated with a user’s IP address and cookie ID within Google’s vast
server logs.
dataveillance also enables the capturing of various other personal and intellectual
interests and activities across its products and services. For example, the modules
(displaying headings from The Advocate, for example). Users curious about their
personal Web presence might create a Google Alert with their name, mailing
“connect some information – your Google Account name – with the books and
pages that you’ve viewed”, restricting the ability to browse and read books
facilitate the collection of keywords that potentially relate to users’ personal lives
augmented when the user creates a Google Account. While all that is needed to
Upon creating a Google Account, users are prompted to edit their profile to
include their full name and zip code. Creation of a Gmail account requires that a
first and last name be provided.56 When using Google Groups, users are
title, industry, website or blog. Blogger users are also encouraged to create a
profile, which includes information such as the user’s full name, photograph,
birthday, location, gender, as well as lists of favorite books, movies, music, and so
on. Any personal information provided for such profiles is associated with the
user’s Google Account. Google’s new Checkout online payment service also
requires the collection of users’ personal transactional data, including a real name,
credit or debit card number, card expiration date, card verification number, billing
Typically linked via a Google Account, these services allow Google to tabulate
users’ calendar events, e-mail contacts, chat buddies, Web bookmarks, and
financial portfolio (see Table 4). Some of Google’s efforts, however, have
received particular scrutiny for their potential impact on user privacy. Gmail, for
example, has been heavily criticized for its practice of scanning the text of
56
Of course, there is no method of verifying whether the user provides her
real name.
appear in the right margin of the Gmail interface. Google scans the text of
incoming e-mail messages in order to target the advertising to the user. For
example, if the user is reading an e-mail that contains the text “Atlantic City,”
Gmail might present the user with ads about hotels, casinos, and other websites
related to that travel destination. While Google maintains that “no human will
read the content of your email in order to target such advertisements or other
Gmail terms of use also note that Google may “monitor, edit or disclose your
order to comply with any valid legal process or governmental request” (Google,
2005d).
policy stating that “residual copies of e-mail may remain on our systems for some
time, even after you have deleted messages from your mailbox or after the
Since electronic communications stored for more than 180 days enjoy less robust
2004b), the prospect of indefinite storage of Gmail e-mails raises concerns over
the privacy of users’ communications. Google insists that this phrasing in the
Gmail privacy policy was simply “poor wording” (Gillmor, 2004), and that while,
like most Web-based e-mail providers, Google keeps multiple backup copies of
users’ emails so that users can recover messages and restore accounts in case of
errors or system failure, deleted e-mails are eventually completely removed from
Google’s servers within 60 days (Gillmor, 2004; Google, 2005c). Even with these
backup systems” (Google, 2005c), and in at least one reported case, a subpoena
was sent to Google for the complete contents of a Gmail account, including
early 2006, Google released a new version of Google Desktop Search with a
information from all of their computers with Google Desktop installed. Once
enabled,57 the file index of each authorized computer is uploaded and stored on
To help protect user privacy, the data is encrypted in transmission and while
stored on Google servers, and Google retains the data for only 30 days. However,
privacy concerns persist, typified by this warning from the Electronic Frontier
Foundation:
57
The Search Across Computers feature is not automatically activated and
must be enabled and authenticated through the Google Desktop preferences. A
Google Account is required to activate and access the service.
If you use the Search Across Computers feature and don't configure
Google Desktop very carefully—and most people won’t—Google will
have copies of your tax returns, love letters, business records, financial
and medical files, and whatever other text-based documents the Desktop
software can index. The government could then demand these personal
files with only a subpoena rather than the search warrant it would need to
seize the same things from your home or business, and in many cases you
wouldn’t even be notified in time to challenge it. Other litigants—your
spouse, your business partners or rivals, whoever—could also try to cut
out the middleman (you) and subpoena Google for your files. (Foundation,
2006)
It remains unknown whether the data stored on Google’s servers are retained on
“offline backup systems” past the 30-day window (similar to Gmail messages), or
whether employees within Google are able to decrypt the files if subpoenaed or
Google Toolbar and the Web Accelerator, also present unique privacy concerns
beyond the traditional logging of user activities across Google’s product suite.
Users running Google Toolbar with certain advanced features enabled (PageRank,
information about their Web browsing activities with Google. By sending Google
the addresses of every website visited by the user, the PageRank feature provides
information on the page (addresses, ZIP codes, ISBN numbers, etc) AutoLink
monitors the words that users type into Web forms in order to correct any spelling
designate by hovering over them with the mouse and provides translations into
various languages. Whenever these features are activated, information about the
Web site being viewed is sent to Google for processing (Google, 2006k).58
page load times for faster Web browsing. While not directly related to Web
Google’s computer infrastructure to make Web pages load faster. The software
downloading only the updates if a Web page has changed slightly since it was last
viewed, prefetching certain pages onto a user’s computer that the user might visit
in the near future, as well as other data management and compression techniques
(Google, 2006m).
When using Web Accelerator, all non-secure Web page requests are
routed through Google’s servers, along with information such as the date and time
of the request, the user’s IP address, and computer and connection information.
Google stores and uses this information to help predict and prefetch additional
relevant Web content. Depending on how particular websites are set up, it is
possible that personally identifiable information embedded in the URL might also
be processed through and stored within Google’s servers. Google might also
58
See Appendix A for additional Google Toolbar features which capture
user information.
temporarily cache other sites’ Web cookies when prefetching certain page
combined with its use of Web cookies and other data collection means, provides
the architecture to monitor and log user activity across the myriad products and
services that make up their larger Web search information infrastructures.59 The
result is a robust infrastructure arming Google with the ability to capture and
aggregate a wide array of personal and intellectual information about its users,
extending beyond just the keywords for which they search, but also including the
news they read, the interests they have, the blogs they follow, the books they
enjoy, the stocks in their portfolio, their schedule for the coming week, and
initial public offering, Brin and Page state that Google is “not a conventional
company” and that they “aspire to make the world a better place” by “improv[ing]
the lives of as many people as possible” (Google, 2004). Elsewhere, Brin and
Page have noted their desire to “have positive social effects” and to make Google
by many who seem ready to concede Brin and Page’s quest to create a perfect
59
See Table 4 for a summary of personal information collected across
Google’s products, and Appendix A for a more detailed description of each
product and its method of collecting personal data.
search engine that will be “like the mind of God,” noting that Google is “poised to
become the perfect, all-seeing, all-knowing, all-powerful force of the 21st century”
(Ayers, 2003; see also, Friedman, 2003; Gorman, 2004). Others embrace “Planet
2006).
with which societies can embrace utopian visions of technological progress and
efficiency:
[In] cultures that have a democratic ethos, relatively weak traditions, and a
high receptivity to new technologies, everyone is inclined to be
enthusiastic about technological change, believing that its benefits will
eventually spread evenly among the entire population. Especially in the
United States, where the lust for what is new has no bounds, do we find
this childlike conviction most widely held. (Postman, 1992, p. 11)
technology and society, the concern that certain questions about the adoption of
whose point of view the efficiency is warranted or what might be its costs?
…Whom will the technology give greater power and freedom? And whose power
and freedom will be reduced by it?” (Postman, 1992, p. 11). Indeed, the expansion
of Google’s reach into so many areas of people’s lives has left some uneasy, such
as one Google user who expressed “feeling a ‘weird tension’ about his love of
Google’s products and his fear about its omnipresence in his life” (Williams,
2006). It appears, then, that certain anxieties have emerged as a result of Google’s
Anxieties of Perfect Reach
Web pages and other online sources to provide the largest possible database of
potential search results. Among the billions of pages indexed by search engines
forum postings, online resumes, minutes of public meetings, property tax records,
and court records. Few people are not affected by the “long arm of Google’s Web
Engaging in a “vanity search” – a Web search for one’s own name – can
example. The notion of “Googling” someone before a blind date has become
common practice (Lobron, 2006). Almost one in four Web users have searched
online for information about co-workers or business contacts (Sharma, 2004), and
employers are Googling prospective employees before making hiring decisions
(Weiss, 2006). In less than an hour, one reporter uncovered a variety of personal
Schmidt doesn’t reveal much about himself on his home page. But
spending 30 minutes on the Google search engine lets one discover that
Schmidt, 50, was worth an estimated $1.5 billion last year. Earlier this
year, he pulled in almost $90 million from sales of Google stock and made
at least another $50 million selling shares in the past two months as the
stock leaped to more than $300 a share.
He and his wife Wendy live in the affluent town of Atherton,
Calif., where, at a $10,000-a-plate political fund-raiser five years ago,
presidential candidate Al Gore and his wife Tipper danced as Elton John
belted out “Bennie and the Jets.”
Schmidt has also roamed the desert at the Burning Man art festival
in Nevada, and is an avid amateur pilot. (Mills, 2005)60
encouraging search engines to expand both the depth and breadth of their Web
indexes, the quest for the perfect search has also reduced users’ “security through
Not surprisingly, anxieties have arisen in the wake of this lack of security
My friend went on a date last week and “Googled” the man when she got
home -- that is, looked him up on the Internet search engine google.com.
She found that he had been involved in many malpractice suits. (He’s a
doctor.) Her “homework” has now resulted in a discounted opinion of this
man. What do you think about using Google to check up on another
person? (Cohen, 2002)
60
Ironically, Google punished CNET for publishing the personal
information about Schmidt – found via their own search engine – with a one-year
boycott against answering any inquiries from the news service. Amid public
criticism, Google ended the boycott two months later.
Randy Cohen – the ethicist – is not overly concerned with this instance of using
Google’s perfect reach in order to find out details about another person. He argues
it “was akin to asking her friends about this fellow – offhand, sociable and
Not everyone agrees with the ease with which Cohen justified the use of
search engines to obtain personal information about another person. Certainly, not
all occurrences are as “sociable and benign” as the example above. Wright and
Kakalik (2000) have pointed out that a certain kind of information about
individuals, which was once difficult to find and even more difficult to cross-
reference, is now readily accessible and collectible through the use of search
engines, with particular consequences for individual privacy. Other scholars have
digital dossiers of individuals (Solove, 2004). Herman Tavani (2005) has written
specifically about the ease with which personal information can be routinely
collected, aggregated, and analyzed by Web search engines. Noting how the
perfect reach of search engines now extends to various mailing list and discussion
into that person’s interests and activities.61 So it would seem to follow that
not all of the personal information currently included on Web sites
accessible to search engines was necessarily either placed there by the
persons themselves or explicitly authorized to be placed there by those
persons. (Tavani, 2005, p. 40)
An individual might not be aware that her name is among those included in one or
more of these databases accessible to search engines, let alone fluent in how
search engines themselves work and their ability to retrieve personal information
from a variety of online sources. John Battelle summarizes this anxiety best:
(Nissenbaum, 1998, 2004), we are forced to recognize that, powered by its perfect
reach, the quest for the perfect search engine “hath taken away” the practical
A key component of the perfect search engine is its perfect recall – the
ability to know the searcher, and what she has searched for in the past to help
tailor both search results and advertising to her interests and needs. While it is
entered, and a flood of possible search results are returned – there is an important
61
A search for my name uncovers (admittedly forgotten) posts to
Usenet discussion forums from the early 1990s on topics ranging from abortion
rights, Catholicism, feminism, and marketing to Lotus 1-2-3 spreadsheet software.
feedback loop. In the quest for the perfect search engine, the interface is actually
Accounts, combined with the use of persistent Web cookies, Google has also
constructed the necessary architecture for the creation of detailed server logs of
users’ online intellectual activities, both on Google properties and beyond (Table
4 at end of chapter). The result, a major step toward achieving the perfect recall
necessary to deliver personalized results and services, also arms Google with the
information about its users, extending beyond just the keywords they search for,
but also including the news they read, the interests they have, the blogs they
follow, the books they enjoy, and perhaps even every website they visit.
As is typical for search engines, Google’s log files record the search terms
used, Web sites visited, Internet Protocol address, and Web cookie for every
single search conducted through its site. This can easily be combined with other
services. For instance, Gmail asks for a user’s name and e-mail address, Google
Maps could store her home address, Dodgeball stores her cellphone number and
locational data, and Google Finance could collect the stocks in her portfolio. If
combined with her Web search history, Alert keywords, and Personalized
Homepage modules, Google would see intimate details about a person’s identity,
political interests, health status, sex life, religion, financial status, and buying
This information represents, in aggregate form, a place holder for the
intentions of humankind - a massive database of desires, needs, wants, and
likes that can be discovered, subpoenaed, archived, tracked, and exploited
to all sorts of ends. Such a beast has never before existed in the history of
culture, but is almost guaranteed to grow exponentially from this day
forward. This artifact can tell us extraordinary things about who we are
and what we want as a culture. (Battelle, 2003)
While many of our day-to-day habits – such as using credit cards, ATMs,
search histories, e-mails, blog posts, and general browsing habits, providing “an
excellent source of insight into what someone is thinking, not just what that
Recognizing these anxieties, a Faustian bargain emerges with the quest for
the perfect search engine: The perfect search engine promises breadth, depth,
efficiency, and relevancy, but threatens any sense of “security through obscurity”
personal and intellectual information in the name of its perfect recall. While many
searchers have acknowledged this anxiety about the presence of such systematic
2006; Hafner, 2006; Levy, 2006; Maney, 2006), there has been little evidence of
widespread changes in user behavior in light of these revelations.62 After the dust
62
In the year since the DOJ case emerged, search engine activity has
increased from 5.3 billion searches in February 2006 (Nielsen//NetRatings, 2006)
settled from these privacy controversies, the allure of “Planet Google” has
maintained its hold on the faithful: “I don’t know if I want all my personal
an improvement on how life was before, I can’t help it” (Williams, 2006).
The fact that the user quoted above “can’t help” embracing Google’s suite
personal information about his online intellectual and social activities, reveals the
potency of the Faustian bargain that society makes with the perfect search engine. In
the promises made by technology, ignoring such issues of power, equality, and
freedom. As the quest for the perfect search engine continues its meteoric rise,
value-related externalities, and more tempting to simply take the design of such
tools “at interface value” (Turkle, 1995, p. 103), a condition that seems all too
Anne Rubin, 20, a New York University junior who uses Google's search,
Gmail and Blogger services, says quality overrides any privacy concerns,
and she doesn't mind that profiles are built on her in order to make the ads
she sees more relevant. “I see it as a tradeoff. They give services for free,”
she said. “I have a vague assumption that things I do (online) aren't
entirely private. It doesn't faze me.” (Associated Press, 2005)
In order to break the hold of this Faustian bargain, we need to gain conceptual
clarity and a normative understanding of the ways in which the quest for the
perfect search engine bears on user privacy. As the following chapter will explain,
provide both a novel and effective framework to reveal how Google’s quest for
the perfect search does alter personal information flows in such a way that
threatens users’ ability to fully utilize and enjoy this important online information
space. Contrary to Ms Rubin’s stance, we will reveal how Google’s quest for the
Table 3: Google Suite of Products and Services

Product | Description | Notes

General Information Inquiries
Web search | Query-based website searches |
Personalized Homepage | Customized Google start page with content-specific modules | Use in conjunction with Google Account is encouraged
Alerts | E-mail alerts of new Google results for specific search terms |
Image Search | Query-based search for website images |
Video | Query-based search for videos hosted by Google | Google Video Player available for download
Book Search | Full text searches of books scanned into Google’s servers | Google Account required in order to limit the number of pages a particular user can view

Academic Research
Scholar | Full text searches of scholarly books and journals |

News and Political Information
News | Full text search of recent news articles | With a Google Account, users can create customized keyword-based news sections
Reader | Web-based news feed reader | Google Account required
Blog Search | Full text search of blog content |

Communication and Social Networking
Gmail | Free Web-based e-mail service with contextual advertising | Creation of Gmail account automatically results in activation of Google Account; logging into Gmail also logs user into their Google Account
Groups | Free Web-based discussion forums | Includes complete Usenet archives dating back to 1981; Google Account required for creation of new Group
Talk | Web-based instant messaging and voice calling service | Google Account and Gmail e-mail address required
Blogger | Web-based blog publishing platform | Google Account required
Orkut | Web-based social networking service | Invitation-only; Google Account required
Dodgeball | Location-based social networking service for cellphones |

Personal Data Management
Calendar | Web-based time-management tool |
(Table continues)
152
Table 3: Google Suite of Products and Services (continued)
Product Description Notes
Financial Data Management
Finance - Portal providing news and - Google Account required for
financial information about posting to discussion board
stocks, mutual funds; Ability to
track one’s financial portfolio
Consumer Activities
Catalog Search - Full text search of scanned
product catalogs
Froogle - Full text search of online retailers - Google Account required for
shipping lists
Local / Maps - Location specific Web searching;
digital mapping
Computer File Management
Desktop Search - Keyword based searching of
computer files
- Ability to search files on remote
computer
Internet Browsing
Bookmarks - Online storage of website - Google Account required
bookmarks
Notebook - Browser tool for saving notes - Google Account required
while visiting websites
Toolbar - Browser tool providing access to - Some features require Google
various Google products without Account
visiting Google websites
Web Accelerator - Software to speed up page load
times for faster Web browsing
Table 4: Personal Information Collected by Google’s Suite of Products
Product | Information Collected | Notes
General Information Inquiries
Web search | Web search queries; Results clicked | Search for own name, address, social security number, etc is common
Personalized Homepage | News preferences; Special interests; Zip code |
Alerts | News preferences; Special interests; E-mail address | Alerts for a user’s own name (vanity search) are common
Image Search | Search queries; Results clicked |
Video | Search queries; Videos watched/downloaded; Credit card information for purchased videos; E-mail details for shared videos | Google Video Player contains additional DRM technology to monitor off-site video usage
Book Search | Search queries; Results clicked; Pages read; Bookseller pages viewed |
Academic Research
Scholar | Search queries; Results clicked; Home library (optional) |
News and Political Information
News | News search queries; Results clicked |
Reader | Feed subscriptions; Usage statistics |
Blog Search | Search queries; Results clicked |
Communication and Social Networking
Gmail | Text of e-mail messages; E-mail searches performed; E-mail address or cellphone number (used for account creation) |
Groups | Search queries; User interests; Usage statistics; Profile information | Users are encouraged to create detailed profiles, including name, location, industry, homepage, etc
Talk | Contact list; Chat messages; Usage statistics |
Blogger | Weblog posts and comments; Profile information; Usage statistics | Users are encouraged to create detailed profiles, including name, location, gender, birthday, etc
Orkut | Profile information; Usage statistics | Users are encouraged to create detailed profiles, including name, location, gender, birthday, etc
Dodgeball | Profile information; E-mail address; Location; Mobile phone information; Text messages sent | User location when messages sent are tracked by Google
Personal Data Management
Calendar | Profile information; Events; Usage statistics |
Financial Data Management
Finance | Financial quotes; Discussion group activity; Portfolio (optional); Profile information | Names and e-mails are displayed with discussion posts
Consumer Activities
Catalog Search | Product search queries; Results clicked |
Froogle | Product search queries; Results clicked; Sites visited; Shopping list |
Local / Maps | Search queries; Results clicked; Home/default location | Search queries might include geographic-specific information
Computer File Management
Desktop Search | Search queries; Computer file index (optional) | Search queries visible to Google under certain circumstances; Desktop file index is stored on Google’s services if using Search Across Computers
Internet Browsing
Bookmarks | Favorite websites; When visited |
Notebook | Notes and clippings; Sites annotated |
Toolbar | Search queries; Websites visited | Use of some advanced features routes all browsing traffic through Google servers
Web Accelerator | Websites visited | All browsing traffic is routed through Google servers
CHAPTER V
Introduction
Postman feared that the Faustian bargain society strikes with its
the ideas embedded in them. Which means we become blind to the ideological
meaning of our technologies” (1992, p. 94). Indeed, search engine users are
constantly reminded that the collection of such information is “to improve the
quality of our services” (Google, 2005i) or to “improve the overall quality of the
online experience” (IAC Search & Media, 2005). The quest for the perfect search
Consider, for example, Sergey Brin and Larry Page’s response when
mail messages are countered by rhetorical claims that Google’s ads are smaller
and more discreet than the competition, that people clicked on the ads during
testing, and that scanning emails to place ads is inherently “helpful,” “useful,” and a
Google’s founders frame the issue as a choice between “big, intrusive ads and our
concerns over user privacy, even though Page admits the ads are “spooky at first.”
“organize the world's information and make it universally accessible and useful”
(Google, 2005b).
Fueling the acquiescence to compromise for the perfect search engine are
2003c),63 that “the privacy concerns are probably overblown” (Mills, 2005), or
that “if you have nothing to hide when you use the internet, you have nothing to
fear” (Griffin, 2006). This final rhetorical device – that if you have nothing to
hide then you should have no concern for your privacy – is particularly
pernicious. Here, the word “hide” presupposes that nobody can have a legitimate
motive for wishing to protect information about his or her life outside of wanting
to hide it from discovery. As Phil Agre has noted, “This is obviously false”:
Agre argues that in a free society, one does not need to have “something to hide”
to keep personal information from prying eyes; it should be the default position.
63
A claim disproven by the ease of identifying users from the
“anonymized” AOL search records data release (see Barbaro & Zeller Jr, 2006).
The quest for the perfect search threatens to change this default, yet its danger
remains clouded by rhetoric of improving services, making one’s life easier,
and the notion that one should not have anything to hide in the first place.
knowingly share the information, and already share similar information with other
people and institutions. For example, addressing concerns that Google is able to
track and collect all of a user’s browsing activity through the Web Accelerator
asserting that Web Accelerator receives much of the same kind of information
that people already share with their Internet service providers when surfing the
Web (Hines, 2005). Or that Google logging book searches is no different than
asking a librarian for help finding particular books. Or that Google’s scanning of
spam filters. Such appeals that the status quo has simply been maintained have
It appears, then, that the Faustian bargain that society must make to reap
the benefits of the perfect search engine has succeeded in obscuring the various
issues related to privacy, freedom, and autonomy inherent in using the Web for
that little threat to privacy actually exists. Recalling Brey’s prescription for
disclosive computer ethics, to “make potentially morally controversial computer
features and practices visible” (Brey, 2000, p. 13), we must take steps to achieve
conceptual clarity of the value and ethical implications of the perfect search
Google’s quest for the perfect search alters personal information flows in a way
that threatens users’ ability to fully utilize and enjoy this important online
information space.
evaluating the flow of personal information between agents to help identify and
its related clean division between public and private information – a key
They are at home with families, they go to work, they seek medical care,
visit friends, consult with psychiatrists, talk with lawyers, go to the bank,
attend religious services, vote, shop, and more. Each of these spheres,
realms, or contexts involves, indeed may even be defined by, a distinct set
of norms, which governs its various aspects such as roles, expectations,
actions, and practices. (Nissenbaum, 2004, p. 137)
Within each of these contexts, norms exist – either implicitly or explicitly – which
both shape and limit our roles, behaviors, and expectations. For example, it might
religious service, but not in the grocery store. A judge might willingly accept
birthday gifts from colleagues, but would hesitate to accept one from a lawyer
to ask me my age, but not for a bank teller. While it is necessary for an airline to
know my destination city, it would be inappropriate for them to ask where I will
In short, norms of behavior vary based on the particular context. The latter
examples above reveal the ways in which norms govern the flow of personal
information flow govern what type and how much personal information is
integrity is built around the notion that there are “no arenas of life not governed
our privacy is invaded when the informational norms are contravened. Within
each context, the relevant agents, the types of information, and transmission
subject, the one who has the information and is distributing it (who may or may
not be the subject), and the one who receives the information. The informational
norms within a particular context dictate the roles of the agents, each associated
with a set of duties and privileges. For example, in the healthcare context, the
personal information shared by the patient (the subject and sender) depends very
much on who the recipient is – the physician, the receptionist, the claims
processor, and so on. In turn, the rules governing the transmission of personal
colleague, the insurance company, and so on. The specification and roles of the
various agents are key variables affecting the maintenance of contextual integrity
the notion that information types fit into a rigid dichotomy of public or private.
The notion of “appropriateness” is a useful way to signal whether the type of
information tends to flow freely. In other contexts, such as the job interview or
classroom, more explicit and restrictive norms of appropriateness prevail, and the
norms of appropriateness apply in all situations: among both strangers and loved
outlined in professional codes of ethics dictate that my physician can share only
symptoms or family history to aid in diagnosis, but not my name. More restrictive
principles have been codified into our legal systems, such as the burden necessary
in particular contexts, and such norms are violated if the principles are not
followed.
identification of the relevant agents, the types of information, and the appropriate
decision heuristic to help explain when privacy objections are likely to be aroused
given context might impact the governing informational norms to see whether and
integrity has been maintained, we must consider how the new technology or
practice affects the agents involved, the appropriateness and type of information,
and the transmission principles that constrain the flow of information from agent
found to conflict with the standing informational norms, a red flag is raised,
indicating that contextual integrity has been violated. The usefulness of contextual
revealed by examining a recent application of the theory to the introduction of
(Zimmer, 2005).
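The decision heuristic described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the informational norms of a context can be written down as sets of acceptable receiving agents, information types, and transmission principles; the names Flow, Norms, and red_flags, as well as the example norms for a "library research" context, are hypothetical and are not drawn from Nissenbaum's work or from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One flow of personal information (hypothetical representation)."""
    context: str
    sender: str
    receiver: str
    info_type: str
    principle: str


@dataclass(frozen=True)
class Norms:
    """Illustrative informational norms for a single context."""
    receivers: frozenset
    info_types: frozenset
    principles: frozenset


def red_flags(flow, norms_by_context):
    """Return the elements of a flow that conflict with the standing norms.

    An empty list means no conflict was found under these assumed norms;
    any entry is a red flag that contextual integrity may be violated.
    """
    norms = norms_by_context[flow.context]
    flags = []
    if flow.receiver not in norms.receivers:
        flags.append("agent")
    if flow.info_type not in norms.info_types:
        flags.append("information type")
    if flow.principle not in norms.principles:
        flags.append("transmission principle")
    return flags


# Assumed norms for one context, for illustration only.
norms = {
    "library research": Norms(
        receivers=frozenset({"librarian"}),
        info_types=frozenset({"research interests"}),
        principles=frozenset({"voluntary and confidential"}),
    )
}

ok_flow = Flow("library research", "patron", "librarian",
               "research interests", "voluntary and confidential")
new_flow = Flow("library research", "patron", "advertiser",
                "research interests", "automatic logging")

print(red_flags(ok_flow, norms))   # []
print(red_flags(new_flow, norms))  # ['agent', 'transmission principle']
```

In practice, norms are rarely this crisp, so the sketch only mirrors the order of the comparison (agents, information types, transmission principles) and does not replace the interpretive judgment the framework requires.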
board vehicle safety applications that share, receive, and process data from the
n.d.; Horrell, 2003; Derene, 2007). Made possible by recent advances in wireless
data communication technology, VSC solutions aim to afford the driver every
roadside infrastructure and with each other. In these networks, both vehicles and
infrastructure collect local data from their immediate surroundings, process this
safety information about the immediate surroundings. Data messages, which are
transmitted 10 times per second, potentially include the vehicle’s location, time
and date stamps, vehicle speed and telemetry data, and some sort of vehicle
VSC systems pose a Faustian bargain of their own: Coupled with the
2005), is a potential rise in the ability to surveil a driver engaging in her everyday
activities on the roads and highways. VSC technologies potentially enable the
collection of information on where drivers go, when they make their trips, and
what routes they use. However, since much of the tracking or surveillance made
possible by VSC technologies occurs in public as the driver travels along the open
concerns with the perfect search engine, many argue that drivers have no
expectation of privacy when traveling on the public roads (Harris, 2005), while
others maintain that VSC systems do not provide any information different than
provides the conceptual framework to reveal how these safety technologies have
the potential to disrupt the informational norms in the context of highway travel,
threatening drivers’ privacy even when driving along public roads (for a more
could be viewed by someone who happens to be at the right place at the right time
to visually observe a car pass by. Mass surveillance was difficult due to particular
64
In a personal conversation, an engineer working on VSC-related
technologies remarked that the information shared with these new systems “is the
same as your license plate.”
natural barriers that influenced the informational norms. The license plate could
be read only if the lighting conditions were right, speed could only be approximated,
and the direction a car was traveling could be monitored only until it was out of visual
range.
increases with the introduction of VSC systems that process and record vehicle
about their identity, location, and status for reception by other vehicles, roadside
infrastructure, or anyone else with the proper receiving equipment. Humans will
all that is needed is a well-placed receiver and information for all passing vehicles
same vehicle over a span of miles. VSC technology has the potential to disrupt the
natural barriers that previously limited the ability to track individual vehicles over
space and time. Rather than a single piece of information being observed by a
person or camera that just happens to be at the right place at the right time, VSC
even further. While existing traffic cameras allow the archival and retrieval of
video surveillance images, the digital nature of the information provided by VSC
applications greatly expands the ability to process, store, and share vast amounts of
physically view hours of camera footage, and increasing exponentially the size
and complexity of data analyses. Data mining can be performed with ease, as can
aggregation with other databases. Additionally, the digital nature of vehicle data
enabled by VSC technology expands the ability and reduces the cost for
“contextual integrity,” we can see how the design of these systems might alter the
flow of personal information in the context of highway travel and threaten the
on drivers’ habits and activities on the roads. They represent a shift from the
drivers’ daily activities. With the potential integration of VSC technologies into
our daily activities on the public roads, we are in danger of violating the existing
informational norms within the context of highway travel. Concerns about the impact
of VSC technology on a driver’s privacy can receive new attention when framed
integrity will prove fruitful when considering the privacy implications of the
agent (a librarian, say) does not mean that it is automatically acceptable to share
the same information with another agent (a search engine provider or advertiser).
integrity works especially well to resolve issues of how new technologies and
information flows can be disrupted in a way that threatens the privacy of the
parties involved. When applied to Google’s quest for the perfect search engine,
contextual integrity will provide the means to understand how these new technical
To determine the potential impact of Google’s quest for the perfect search
featuring two ideal-typical information seekers, Elizabeth “Libby” Doe and
Annette “Netty” Roe. Libby and Netty are nearly identical in their personal,
single, gay South Asian women. Both are Hindu, live in Brooklyn, New York, and
tend to vote for Democrats. Libby and Netty are graduate students at New York
University, studying political science and feminist theory. They enjoy sports and
cooking as hobbies; both are thinking of having a baby, but have concerns due to
being diabetic. They have similar investment portfolios, enjoy keeping in touch
with friends, and like to share photos and stories from their travels.
word-of-mouth, written, and oral correspondence. While not averse to using the
Internet, when Libby needs to find information on a topic, she prefers visiting the
library. Netty, on the other hand, relies heavily on the Internet to manage
information and communicate with others. When Netty needs information about a
topic, she “Googles” it. In fact, Netty relies on Google’s broad array of products
When navigating their respective “spheres of information,” both Libby
and Netty inevitably share bits of personal information with others. Appendix B
describes these flows of personal information within each of the nine distinct
Google” to access and organize information. Building from this narrative of Libby
and Netty’s differing informational practices and flows, we can attempt to apply
Assessing the information practices and flows from our thought experiment with
any degree of certainty is not easy, but we can approximate the particular agents,
information types, and transmission principles that govern those flows
(see Table 5). Within each context there is evidence of
obtain information from multiple engagements with Libby, let alone across
The result of having one single entity act as a receiving agent across the various
The types of information shared by Libby tend to be incomplete, scattered
customer, but in general, the information she divulges is only a fragment of the
entire picture of her activities in each context. Netty, on the other hand, provides
Google much more complete sets of information on nearly all her interactions
with Google products and services. The information is digital, allowing for
simpler storage, processing, and sharing, and its accuracy is difficult to dispute.
decides to interact directly with librarians, booksellers, and so on, while Netty is
compelled to allow Google to track and collect her information browsing and
usage habits as a condition of using its products and services. Further differences
exist in terms of how these agents might share the information with other parties.
transactions might be used for marketing purposes, the librarians she interacts
with are bound by a code of ethics, and the phone and financial companies who
receive information must adhere to strict laws protecting consumer privacy. For
Netty, in nearly all cases, use of the information by Google is dictated by its
We may combine the information you submit under your account with
information from other Google services or third parties in order to provide
you with a better experience and to improve the quality of our services.
(Google, 2005j)
Google further states it will share personal information with third parties when,
represented by Libby – to the growing reliance on Google’s quest for the perfect
is concentrated with one agent (Google), digital and comprehensive, and often
required in order to use the services. Thus, a violation of the contextual integrity
private information is divulged when utilizing the tools that make up the perfect
search engine, or that the information shared is simply the same as that provided
informational norms helps to expose the Faustian bargain implicit within the quest
privacy of personal information in support of broader social, political, and moral
technology might support countervailing values. Such is the case here, for the
very logic of the perfect search is increased relevancy and efficiency of users’
broader examination of the social, political, and moral importance of free and
Table 5: Differences in informational norms within various information-seeking contexts
Informational norm | Libby | Netty
General Information Inquiries
Agent (receiver) | Might interact with various librarians, booksellers, and other information sources | Google
Information type | Might verbally divulge personal interests due to interactions with agents; Booksellers might keep transaction logs | All information queries logged in digital form
Transmission principle | Information divulged to librarian voluntarily; Retailers might require certain information for purchase; Librarian bound by code of ethics to maintain patron privacy; Booksellers might use/sell transaction data | Information divulged to Google automatically; Google privacy policy allows use to “provide a better user experience”; May share with third parties to “comply with legal processes”
Academic Research
Agent (receiver) | Might interact with research librarian | Google
Information type | Might verbally divulge research interests due to interactions with | All information queries logged in digital form
Transmission principle | Information divulged to librarian voluntarily; Librarian bound by code of ethics to maintain | Information divulged to Google automatically; Privacy policy applies
News and Political Information
Agent (receiver) | Subscriptions receive some information | Google
Information type | Sources subscribed to have address and billing information | All information queries logged in digital form
Transmission principle | Some information required for subscriptions; Sources subscribed to might use/sell information for marketing purposes | Information divulged to Google automatically; Privacy policy (above)
Communication and Social Networking
Agent (receiver) | Recipients of messages; E-mail and phone providers | Recipients of messages; Google
Information type | Recipients see contents of messages; E-mail and phone providers track usage, might scan for spam, etc | All message content and interactions logged in digital form by Google; Contacts and friends lists stored in databases at Google
Transmission principle | Information voluntarily divulged to recipients; Recipients might share information, generally bound by norms of friendship | Information divulged to Google automatically; Privacy policy
Personal Data Management
Agent (receiver) | Information not shared with any third party | Google
Information type | N/A | Calendar information and queries logged in digital form; Cellphone number provided for alerts
Transmission principle | N/A | Information divulged to Google automatically; Privacy policy
Financial Data Management
Agent (receiver) | Information not shared with anyone outside of broker | Google
Information type | N/A | Portfolio information; Personal information for forum participation
Transmission principle | N/A | Information divulged to Google automatically; Privacy policy
Consumer Activities
Agent (receiver) | Retailers receive some information for purchases | Google
Information type | Purchased items can be tracked; Browsing at select .com sites logged | All browsing and purchase activity logged
Transmission principle | Retailer might use/sell transaction data | Information divulged to Google automatically; Privacy policy
Computer File Management
Agent (receiver) | No third party has access to files | Google
Information type | N/A | Some search terms could be logged via referrer field; All queries logged with “Search Across Computers” feature; Encrypted file index stored at Google with “Search Across Computers” feature
Transmission principle | N/A | Privacy policy; “Search Across Computers” files protected via encryption
Internet Browsing
Agent (receiver) | Websites visited keep server logs | Google
Information type | Typical information collected by Web sites | Bookmarks, notes, etc; Some Toolbar functions track every Web site visited; Web Accelerator tracks every Web site visited
Transmission principle | Each site’s privacy policy | Privacy policy
CHAPTER VI
Introduction
Viewing the quest for the perfect search engine through the lens of
theory does not provide the tools needed to make the normative decision as to whether
the existing norms are preferable to the promises of the perfect search to provide
relevant and efficient results. This chapter addresses that question, arguing that
activities free from answerability and oversight. In such a case, maintaining the
paramount.
and strengths – of modern society is its social, physical, and intellectual mobility.
fundamental assumption that individuals are granted the right to move about and
explore new physical and intellectual terrain relatively free from answerability or
either. Our work and our play, our cities and our countrysides, our taxes and our
eating habits, our pleasures and our pains, our hopes and our fears are inextricably
tied up with mobility” (1973, p. 93). Without the ability and opportunity to move,
our world or develop the awareness and competencies necessary for effective
the ways in which people live and work, some have stated that: “[In] spite of the
upsurge of concern with mobility in our social lives, current research perspectives
This chapter, however, argues that mobility is not just a matter of physically
traveling from place to place, but is also related to people’s social, cultural, and
Various configurations of social-technical relationships afford various
geographical space and our ability to navigate and explore those spaces.
Intellectual mobility refers to our mental and intellectual ability to learn new
things, explore new ideas, adapt, and change our thoughts and beliefs. The term
digital mobility is introduced to describe the ability to move within and across the
digital networks of cyberspace, the unique ability to navigate both spaces and
for how interference with one’s mobility threatens the values traditionally enjoyed
65
Note, however, that these three different conceptualizations of mobility
share some common characteristics and are highly interdependent, which will be
discussed in more detail below.
bipedalism in vertebrates, physical mobility represents one of the most important
species, mobility was the precondition for survival as their basic needs, such as
the acquisition of food or the search for a potential partner, depended on physical
movement. Beyond these primeval needs, physical mobility has also become a
precondition to satisfy all kinds of personal needs that imply movements, such as
exploration, recreation, connection, growth, and escape. The United States, for
example, is a nation formed and populated largely by the results of mobility, such
shore, they began a movement westward, first to the Mississippi River and then to
1893, the migration westward had been responsible for American economic
American history has been in a large degree the history of the colonization
of the Great West. The existence of an area of free land, its continuous
recession, and the advance of American settlement westward, explain
American development. (Turner, 1921, p. 1)
“The frontier,” Turner claimed, “is the line of most rapid Americanization” (1921,
“that coarseness and strength combined with acuteness and acquisitiveness; that
practical inventive turn of mind, quick to find expedients; that masterful grasp of
material things…that restless, nervous energy; that dominant individualism”
(Turner, 1921, p. 37) – could all be attributed to the influence of the frontier,
All was motion and change. A restlessness was universal. Men moved, in
their single life, from Vermont to New York, from New York to Ohio,
from Ohio to Wisconsin, from Wisconsin to California, and longed for the
Hawaiian Islands. When the bark started from their fence rails, they felt
the call to change. They were conscious of the mobility of their society
and gloried in it. (Turner, 1921, pp. 354-355)
Turner warned, however, that since only isolated pockets of free and unexplored
land remained at the end of the nineteenth century, “the frontier has gone, and
with its going has closed the first period of American history” (Turner, 1921, p.
38). He feared that with the closing of the frontier, American society would lose
its safety valve, that pent-up social tensions might find no ready release.
argued that Turner’s fears were misplaced. For Pierson, as long as we can move
somewhere and somehow, it does not matter if the geographic frontier of the
American West is closed. “To live is to move,” wrote Pierson in his cultural
precondition to action, the breath of social animation, the quite visible yet rarely
noticed act that makes possible most of the performances of man” (1973, p. ix).
Pierson reveals, with wonderful detail, the connections between mobility and
defiance against authority, and as a tool for gaining wisdom and a deeper
understanding of our world. Even if, as Turner believed, the literal frontier was
terminated in the 1890s, Pierson argued that the days of figurative, frontier-style
movement were just beginning with the emergence of the automobile: “Today it is
the automobiles that breathe of romance and adventure, that speak to us of distant
lands” (Pierson, 1973, p. 125). Automobility would take over where Turner’s
Western frontier left off, as Phil Patton observed: “The automobile and its
highways froze the values of the frontier by making movement a permanent state
Americans to be constantly on the go, to explore new places, meet new people,
learn new things, all at their own pace and direction. Nearly anyone with a few
dollars can hit the road and be the rugged individual unrestrained by schedules or
outside control and relish the freedom to choose where and when to stop. With
immediate access to both the landscape and the people, we can commune with
nature, with friends, or with strangers. This physical mobility on the roads has had
a profound hold on American culture for much of the past century (see, for
example, Flink, 1975; Interrante, 1983; Lewis & Goldstein, 1983; Patton, 1986;
With the introduction of the automobile and a growing system of roads
and highways in the early twentieth century, Americans began to explore places
that a mere decade or two before had been beyond the range of the average
predetermined routes of the train. The documentary “To New Horizons” (General
exhibition at the 1939 New York World’s Fair, told of the automobile’s role in
satisfying the urge for exploration of distant horizons and frontiers. In a montage
of roadside images, the film described how the “restless search for new
opportunities, the mystery and promise of distant horizons have always called
men forward…old horizons open the way to new horizons.” Automobility shrank
the size of the continent on whose vastness and inexhaustibility explorers had
commented since the sixteenth century, re-opening Turner’s frontier and making
these new horizons available for new levels of exploration, adventure, and escape.
social arrangements, providing countless new freedoms and opportunities (see, for
example, Flink, 1975; Lewis & Goldstein, 1983; McShane, 1994; Berger, 2001).
The automobile freed rural people from the physical and cultural isolation that
travel to large towns for shopping and selling their wares, children could be
longer limited to a small village church. Urban life also began a transformation as
a result of automobility.66 With cities becoming overcrowded, polluted, and
enclaves. While trains or trolleys often provided transportation between home and
work in the city, automobility freed this new breed of commuters from the rigid
The incorporation of the automobile into daily life also reshaped family
relations. The traditional family dinner and subsequent neighborhood stroll faced
divisive activity that did not always involve all members of the family. The
liberating effect on teenagers was especially profound, since the car offered swift
relative anonymity, escaping the prying eyes of their parents and other adults
within their community (Berger, 2001, p. xxi). The automobile also brought about
teenage couples to get much farther away from front porch swings, parlor sofas,
hovering parents, and pesky siblings than ever before (see Lewis, 1983).
responsible for keeping the home, had at their disposal a form of transportation
with levels of privacy, safety and speed that outweighed previous reliance on
66
The automobile, after all, was a creation of industrial society, and cities
were its initial domain; in 1910 urban residents were four times more likely than
rural residents to own a car (McShane, 1994, p. 105).
The automobile provided a means by which women could escape their
homebound existence without neglecting their traditional domestic
responsibilities. Their range of mobility began to approach that of men,
and the sphere of their activities expanded accordingly. Thus, they were
able to develop and take advantage of new employment opportunities
outside the home, form geographically extensive social clubs for
philanthropic or recreational pursuits, or just get away from the house or
apartment for an hour or two of reflection, shopping or culture. (Berger,
2001, p. xxi)
While such benefits of car ownership were not available to all women,
discovery often took center stage in literary fiction. For example, the main
character in John Updike’s Rabbit, Run (1960) “runs” away from reality by taking
an extended automobile trip throughout the American South. Similarly, both John
from the pressures and perplexities of the external world. As Laird (1983) has
noted:
“The road is life” Jack Kerouac pronounces in On the Road (1955, p. 175),
and other exploits made between 1946 and 1950 with his fellow members of the
unlike World War I’s own Lost Generation; however, the Beats, rather than
Roger Casey, “Central to their experience was the verb go…. A group on the
move, the Beats found the car essential to their circumstances” (Casey, 1997, p.
movement. While the Beats did not necessarily look into the future with utopian
vision, they nonetheless agreed on the road as the ideal site for both adventure and
More than anything else, On the Road is about being on the go. Many
writers before Kerouac (Steinbeck, for one) had already asserted that the
basic impulse of America is to move, to go west, young man. Kerouac
listened to his forbearers, doing just that – moving, again and again. Like
Huck Finn, Sal Paradise (Kerouac) “lit out” for the territory; he then
returned east, then lit out west again, then east again, then south, then to
Mexico, and finally back east – the ultimate restless American. Sal cannot
find Paradise because he finds the American Edenic myth just that, a myth
– there is no Shangri-la. Therefore, since paradise cannot be found in a
place, paradise must become movement itself, and the car thus became the
method of nirvanic transport to the Beats. (Casey, 1997, pp. 108-109)
quality – was also often the subject of film. Indeed, the two technologies rose to
popularity in America simultaneously because, according to flamboyant film
director Cecil B. De Mille, they both reflected “the love of motion and speed, the
restless urge toward improvement and expansion, the kinetic energy of a young,
vigorous nation” (quoted in Hey, 1983, p. 193). Both the automobile experience
and the film experience freed their users from the static normalcy of their
day-to-day lives, permitting the consumer to select desirable settings or themes outside
range from The Grapes of Wrath to Easy Rider to Thelma and Louise to Natural
signifier for freedom, and their continued popularity expresses the cultural
Hopper’s road movie Easy Rider, to Bruce Springsteen’s anthem Born to Run, the
has been expressed in American life is beyond the scope of this chapter,67 but the
adventure, the potential for individual autonomy and liberation from social
67
Interested readers can find more information in Dettelbach (1976), Hey
(1983), Laird (1983), Lewis and Goldstein (1983), Casey (1997), and Berger (2001).
constraints. Cultural critic Stephen Bayley has noted how the automobile is “a
curiously precise tool for calibrating cultural values” (Bayley, 1986, p. 62), and
this revelation best, noting how the automobile “is now the vehicle for our
People will do almost anything rather than give up this outlet for feeling.
They simply people their tensions, their frustrations and unfulfilled
yearnings into the automobile and they’re off. Many people drive (it is
obvious) to satisfy their longings for power; others make it their whole
recreation; still others use it as an almost perfect way of escape. It satisfies
an ancient and fundamental American urge. By simply turning a key we
can now go almost anywhere we please. (Pierson, 1973, pp. 127-128)
With the emergence of the automobile and a growing network of roads and
means of expression, and a new “escape valve” had been discovered for the
journalist and author, summarizes best how automobility “enabled [a] banner of
freedom to be unfurled”:
Throughout its history, the car has been a liberator, an agent of freedom.
Throughout its history, the car has enabled people to break out of their
constraints, to attempt something they could never previously do, to
venture somewhere they could never previously go, to support ideas and
trends they could never previously endorse. (Setright, 2003, p. 186)
“America,” noted writer John Jerome, “is a road epic; we have even
Dettelbach, 1976, p. 4). As the automobile increasingly became a part of culture –
autonomy, and personal freedoms. The ability to move unencumbered within this
The sovereignty of the automobile that historian James Flink relishes is not,
threatening the anonymity and freedom to be gained on the roads. The emergence
to enable a number of key safety and operational services that would take
Numerous networked vehicle systems and technologies have emerged from this
initiative. For example, surveillance cameras are increasingly installed along
highways and intersections to manage traffic flows and enforce traffic laws
provide navigational and safety benefits (Fogarty, 1997; Baig, 2000), and event
technologies pose unique privacy problems because the information
person’s location and movements (see previous chapter and Zimmer, 2005). They
make possible the creation of detailed travel histories of a driver’s location at all
times in the past, as well as her usual travel patterns and habits. Networked
people engaging in their everyday, public activities on the roads (Zimmer, 2005),
and prompt complicated questions as their usage becomes more widespread. Key
concerns include: who owns the information about one’s driving activities, under
what conditions can law enforcement access such information, can information
collected for one purpose be used for another, and what kinds of disclosure and
informed consent are necessary to implement such systems (see, for example,
navigation within the sphere of physical mobility has not gone unnoticed. As
early as 1995, when these technical systems were in their early stages, a group of
scholars gathered to discuss the emerging privacy concerns with what were then
mitigated by ensuring that the data collection activities could not capture enough
data from any one vehicle to make it “singularly identifiable” (Alpert, 1995, p.
116); others called for restrictions on the retention of individually identifiable data
(Halpern, 1995); along with policy initiatives, calls were made to pay strict
ten years since this symposium’s initial warnings, scholars have continued to
sphere of physical mobility (see, for example, Garfinkel, 1995; Clarke, 2000;
use and rising ubiquity of networked vehicle systems may threaten the autonomy
and freedoms inherent to our culture of automobility, a concern condensed in
Jeffrey Reiman’s warning that by blindly embracing such technology, we run the
analysis of the full implications of networked vehicle systems is beyond the scope
of this chapter. Yet, as evidenced by the attention given these emerging technologies
and the growing concerns of how their design threatens the anonymity and
freedoms sought via the roads, protecting the free and unencumbered navigation
form the foundation of many American cultural values. Instead of the freedom to
move along the physical highway, the notion of intellectual mobility centers on
what Marshall McLuhan has called “the highways of the mind” (1964, p. 102).
knowledge, which spurs the desire for more knowledge and more intellectual
mobility: “When information itself is the main traffic, the need for advanced
knowledge presses on the spirits of the most routine-ridden minds” (1964, pp.
102-103). Education, access to information, and the freedom of inquiry are the
central drivers along these new “highways of the mind,” enabling full mobility
In a 1786 letter to a friend, Thomas Jefferson called for “the diffusion of
knowledge among the people. No other sure foundation can be devised for the
preservation of freedom and happiness [than] educating the common people” (qtd.
in Padover, 1952, p. 87). Jefferson was arguing that the fortunes of the
then-young democracy of the United States rested on the ability of its citizens to
understand and use information about the world around them. Jefferson was able
to champion the cause of public education himself with the founding of the
illimitable freedom of the human mind, to explore and to expose every subject
susceptible of its contemplation” (Jefferson, 1820a), and that at the university “we
are not afraid to follow truth wherever it may lead, nor to tolerate any error so
long as reason is left free to combat it” (Jefferson, 1820b). In these words, the
The centerpiece of Jefferson’s design for the University was not a church,
intellectual inquiry in his vision for democracy. In the spirit of Jefferson, libraries
have assumed the social role of institutions of “education for democratic living,”
with intellectual freedom forming their foundation. The American library has
been described as “the Nation’s most basic First Amendment institution,” serving
as a “primary resource for the intellectual freedom required for the preservation of
a free society and a creative culture” (Foerstel, 1991, p. viii). According to library
scholar Charles Busha, librarians believe in “library users’ rights to read, watch,
(Busha, 1977, p. 12). This commitment represents a core stance of the American
Library Association (ALA), which, since its inception in 1876, has become
by a concern for the public’s right to free and unfettered access to information. In
essence, the intellectual freedom enjoyed in the context of the library represents a
answerability.
At its 1939 annual conference in San Francisco, the ALA adopted a formal
The document began with the statement, “Today indications in many parts of the
totalitarian states during that time (American Library Association, 2002a, p. 60).
In response to the changing political and cultural surroundings, the Bill of Rights
outlined three policy statements to ensure free and open access to public library
services. The first stated that library materials should be selected based on their
value and intrinsic interest to the community, not on the race, nationality,
political, or religious views of authors. The second directed that library materials
should “fairly and adequately” represent all sides of social issues. The final
that all community groups would have equal access (American Library
Association, 2002a, pp. 60-61). The ALA’s adoption of the Library’s Bill of
then on, the principle of intellectual freedom defined the library’s role as a forum
for uninhibited intellectual inquiry and debate, and solidified the library as a
Almost immediately after adopting the Bill of Rights, new political and
social pressures began to weigh on the intellectual freedoms outlined within the
ALA’s bold position statement. Shortly after World War II, the House Committee
and pressure to conform came to a head with the rise of McCarthyism; between
1949 and 1953, Wisconsin Senator Joseph R. McCarthy and his supporters
persecuted almost anyone who deviated from the status quo, including
intellectuals, teachers, and librarians. During this period marked by both paranoia
and intolerance, many librarians were fired, some libraries were closed, and
countless books were either labeled un-American or simply destroyed in the name
behind the 1939 Library’s Bill of Rights, and it became even more evident that the
remedies stated therein were necessary to protect the free and open inquiry of
American citizens. In these early moments of the Cold War and McCarthyism, the
ALA updated the newly entitled Library Bill of Rights, highlighting that
Association, 2002a, p. 62). Through this 1948 revision, the ALA reaffirmed its
version that stands today as a strong statement expressing the rights of library
users to intellectual freedom, and the expectations that the Association places on
The American Library Association affirms that all libraries are forums for
information and ideas, and that the following basic policies should guide
their services.
I. Books and other library resources should be provided for the interest,
information, and enlightenment of all people of the community the library
serves. Materials should not be excluded because of the origin,
background, or views of those contributing to their creation.
II. Libraries should provide materials and information presenting all points
of view on current and historical issues. Materials should not be
proscribed or removed because of partisan or doctrinal disapproval.
III. Libraries should challenge censorship in the fulfillment of their
responsibility to provide information and enlightenment.
IV. Libraries should cooperate with all persons and groups concerned with
resisting abridgment of free expression and free access to ideas.
V. A person’s right to use a library should not be denied or abridged
because of origin, age, background, or views.
VI. Libraries which make exhibit spaces and meeting rooms available to
the public they serve should make such facilities available on an equitable
basis, regardless of the beliefs or affiliations of individuals or groups
requesting their use. (American Library Association, 2006c)
Overall, the ALA responded to threats to the library’s social role as an institution
of education and inquiry for democratic living by making intellectual freedom its
defining ideological stance (see, for example, Robbins, 1991). Through the
Library Bill of Rights and related policy and procedural stances, the ALA has
The Library Bill of Rights begins with the premise that everyone is entitled
to freedom of access, freedom to read texts and view images, and freedom of
right to freely read and to receive ideas, information, and points of view – it is a
and individual reading and library use patterns are made known to anyone without
permission. Only when an individual is assured that her choice of reading material
does not subject her to reprisals or punishment can the individual enjoy fully her
freedom to explore ideas, weigh arguments, and decide for herself what she
Such assurances were put to the test when, in 1970, United States Treasury
elsewhere and asked to see circulation records for books on bomb making,
incident, for example, agents of the Bureau of Alcohol, Tobacco, and Firearms of
materials on explosives, but were initially rebuffed by the local librarians. The
agents then returned with a “letter of opinion” from the City Attorney, advising
the library that the circulation records were public records and therefore could not
be withheld from the agents. While the letter had no legal authority, and the
request was never reviewed by a judge, the local library acquiesced (American
Library Association, 2002a, p. 236). Reaction from the ALA, however, was swift,
The ALA recommended that each library adopt a confidentiality policy, advise all
library employees that library records are not to be released except pursuant to a
court order, and resist the issuance or enforcement of such an order until a proper
showing of good cause has been made in court (Foerstel, 1991, p. 6). These
ALA’s position on the privacy of patron records was further solidified in 1980
with the amendment of the Code of Ethics to mandate that librarians “protect each
within libraries arose again in 1987 when it was disclosed that the Federal Bureau
“hostile to the United States, such as the Soviet Union” and to provide the FBI
documents these efforts, which were largely unsuccessful due to the tremendous
outrage and resistance from those in the library profession. Shortly after the
disclosure of the Library Awareness Program, the New York Library Association
Should the citizens of this nation perceive the library and its staff as a
covert agency of government watching to record who is seeking which
bits of information, then the library will cease to be creditable as a
democratic resource for free and open inquiry. Once the people of this
country begin to fear what they read, view or make inquiry about may at
some future time be used against them or made the object of public
knowledge, then this nation will have turned away from the very most
basic principle of freedom from tyranny which inspired this union of
states. (qtd. in Foerstel, 1991, p. 43)
We simply do not wish to have our readers feel that they may be under
surveillance by intelligence agents. Furthermore, we want to assure all
library users of their right to read freely and to explore ideas without
question of their motives. At New York University we believe this type of
invasion into the privacy of the American public is an unwarranted threat
to our civil liberties. (qtd. in Foerstel, 1991, p. 57)
about the FBI Library Awareness Program. This letter of concern was later
Following the creation of this vital policy statement, the ALA’s Intellectual
Freedom Committee kept this privacy problem in the consciousness of the library
confidentiality of records at local libraries (Kennedy, 1989). The confrontation
between the ALA and the FBI over the Library Awareness Program once again
freedom, and the fundamental values that underpin our democratic society. Yet,
the FBI has never publicly abandoned the Library Awareness Program, and the
p. 12).
The simmering tensions between the FBI and the ALA returned to a boil
with the passage of the USA PATRIOT Act in the aftermath of the September 11,
2001 terrorist attacks on New York City and Washington, D.C.68 This
controversial act was signed into law on October 26, 2001, only weeks after
the attacks. The Act broadly expanded law
different statutes for the stated purpose of updating wiretap and surveillance laws
communications (e-mail, voice mail, etc.), and to give law enforcement greater
American law enforcement in order to fight and prevent terrorism in the United
has faced significant criticism since its passage, particularly in relation to its
68
The USA PATRIOT Act (Public Law 107-56, 115 Stat. 272, H.R.
3162) stands for the Uniting and Strengthening America by Providing
Appropriate Tools Required to Intercept and Obstruct Terrorism Act of 2001.
impact on privacy and civil liberties (see, for example, Chang, 2001; Lardner,
2001; Olsen, 2001; Purdy, 2001). One controversial component of the Act was
Section 215, which amends sections of the Foreign Intelligence Surveillance Act
(FISA) to make it easier for a federal agent to obtain a search warrant for “any
tangible things (including books, records, papers, documents, and other items)”
(p. 38). Under the revised provisions, a federal agent need not demonstrate
probable cause to obtain a warrant. Instead, she can merely assert that records
activities, a much lower legal threshold. On its face, the section does not directly
refer to libraries, but rather to business records and other “tangible” items in
general. Its scope, however, has been widely interpreted to include library patron
probable cause. When asked about Section 215 by the House Judiciary
emphasis added).
the privacy and confidentiality of patron records. By April 2002, the ALA Office
the implications of the USA PATRIOT Act and published The USA Patriot Act in
the Library: Analysis of the USA Patriot Act Related to Libraries (American
Library Association Office for Intellectual Freedom, 2004). Two months later, in
further response to the USA PATRIOT Act, the full ALA approved a new policy
statement on Privacy: An Interpretation of the Library Bill of Rights (American
its Resolution on the USA Patriot Act and Related Measures That Infringe on the
resolution called for education within libraries about how to comply with the Act
and also about the inherent dangers to intellectual freedom. It further advised that
libraries “adopt and implement patron privacy and record retention policies” to
collect only information that is necessary for the library’s work. Second, the
resolution bound the ALA to work with other like-minded organizations “to
protect the rights of inquiry and free expression.” Third, it committed the ALA
“to obtain and publicize information about the surveillance of libraries and library
with the ALA’s formal responses and recommendations, individual librarians and
libraries took their own action to protect patron privacy and confidentiality,
use new computer technology to profile the reading habits of patrons and inform
them when works they enjoy are published, destroying Internet access logs on a
daily basis, posting warning signs, and offering patron education on privacy
patron records have persisted for more than 60 years, aggravated by escalating
McCarthyism, the FBI Library Awareness Program, or the war against terror, librarians
have fought to ensure the democratic ideal of intellectual freedom survives such
intellectual freedom, has argued that granting librarians both the responsibility
and the tools to defend the right of readers to freedom of inquiry, the Library Bill
inevitably extends to the library patrons as well, forming what this chapter calls a
sphere of intellectual mobility, where, like the freedoms enjoyed in the physical
sphere of automobility, citizens must be free to read, inquire, and learn free from
shift from the physical sphere of the library to the digitally networked sphere of
the Internet and the World Wide Web, a new sphere of digital mobility has
emerged. And just as the physical and intellectual spheres of mobility described
above are frequently confronted with new technologies and practices that threaten
the freedoms enjoyed within their purview, this new sphere is also confronted
New digital computing and network technologies have led to a surge in the
via the World Wide Web, browse online versions of newspapers and magazines,
purchase and read electronic books, and play music and video files via the
Internet with relative ease. However, this new form of digital access to
informational and cultural works presents a number of challenges for the creators
and distributors of these media products. Although such works are often protected
difficulty of enforcing the rights of the copyright holder, and at the same time has
presented a challenge to the legitimacy of the continuation of those rights (see, for
form, it can potentially be copied and distributed widely without the permission of
the owner and possibly in violation of their legal rights. Thus, a digital dilemma
emerges: the very technologies that open new avenues for the consumption and
distribution of informational goods enable potentially unauthorized use and
content.69 Definitions of DRM vary, but this version from a recent conference of
Following this definition, we can isolate three levels of control that DRM
usage control.
69
Copyright holders have also supported legislative solutions, such as the
Digital Millennium Copyright Act (H.R. 2281), which heightened the
penalties for copyright infringement on the Internet and criminalized the
production and dissemination of technology whose primary purpose is to
circumvent measures taken to protect copyright.
number of times the file was read or played. Julie Cohen describes how the
DRM monitoring and metering systems date as far back as 1995 when IBM
protection software platform in which any attempt to open or use a protected file
was first routed through a centralized clearinghouse for tracking purposes (Evans,
1996). While Cryptolope has since been abandoned, monitoring systems remain
in heavy use by media content providers to help track usage, such as Microsoft’s
access and usage control. Access controls attempt to restrict or limit users’ ability
techniques include encryption algorithms that prohibit people without the required
decryption key from accessing the encrypted content. The key, which is provided
only to users who have paid for the content, is typically found inside software
packages accessible only after they have been purchased and opened. Other
access control methods limit the amount of otherwise proper use of copyright-
protected content, such as e-books that expire after a certain period of time after
purchase or music files that can only be played a finite number of times.
restricting normal access to the content and instead inhibit users’ ability to print,
printing the files. Many DRM systems combine both access and usage controls,
such as the Content Scrambling System (CSS) employed on DVDs. CSS uses a
simple encryption algorithm to scramble the DVD’s content, which only licensed
DVD players can decrypt to enable access. To obtain the necessary decryption
keys, device manufacturers are required to sign a license agreement restricting the
inclusion of certain features in their players, such as a digital output that could be
2007a). With the design of CSS, the existence of an access control system allows
Some DRM systems combine all three levels of control, such as Sony’s
2005. This infamous DRM system limited a user’s ability to access and play the
CDs on their home computers (access control), restricted the ability to copy or
share songs from the CD (usage control), and also secretly communicated with
Sony over the Internet when listeners played the discs, transmitting the name of
the CD being played along with the IP address of the listener’s computer (active
distribution of this DRM scheme and provided users the ability to uninstall the
As is evident from the reaction to Sony’s actions, the design and use of DRM
impact on users, creativity, competition, and law (see, for example, Felten, 2003;
Vaidhyanathan, 2004). A full discussion of this larger debate is beyond the scope
of this chapter, but a critical concern has emerged from the various social,
intellectual activities extend from the physical library into the new digital realm,
the presence of DRM technologies within digital materials poses as much a threat
to individuals’ privacy and intellectual freedom as the presence of the FBI within
70
A rootkit is a set of software tools intended to conceal running
processes, files, or system data from the operating system, and often from the user
herself.
DRM, Digital Mobility, and Privacy
fundamental ways. The first involves what legal scholar Julie Cohen describes as
“direct restrictions on what individuals can do in the privacy of their own homes
with copies of works they’ve paid for” (2003a, p. 47). The primary architecture of
DRM systems constrains user agency and autonomy, limiting the scope for users
to choose how to behave (Cohen, 1996; Burk & Gillespie, 2006). DRM allows
only those actions determined in advance, embedding the rules of access and use
into the very tools with which the information is to be used, rendering prohibited
actions all but impossible. Herein lies a key distinction from traditional
applications of copyright law. Whereas legal prohibition via copyright law leaves
discretion over their behavior in the hands of the users, allowing them to
determine whether to risk activity that might result in legal penalties, DRM
the information producer. DRM technologies restrict the choice of the individual
Besides direct control over user actions, Cohen (1996) argues that even the
apprehensive that their choices are being remotely observed and recorded. Even
when a user might not mind others knowing that they accessed or read certain
content, the user might not want others to know that they had to read it 20 times,
that they highlighted parts of it, that they wrote notes in the margin, that they
copied part of it, that they forwarded certain excerpts to their friends with
comments, and so on. For many users, knowledge that these or similar kinds of
information would be gathered about them would naturally affect the types of
content they choose to access and use, as well as how they go about it. Cohen is
“one of the most personal and private of activities” and that given its monitoring
capabilities, DRM will “create records of behavior within private spaces, spaces
within which one might reasonably expect that one’s behavior is not subject to
al., 2001; Cohen, 2003b; Mulligan et al., 2003; Electronic Privacy Information
vast quantities of data about the use of copyrighted works. This data could reveal
a great deal about the manner in which individuals explore copyrighted works.
DRM thus presents the potential for a level of usage monitoring that is
unprecedented in the use of informational goods. Mulligan and her colleagues
physical sphere of the privacy of one’s home or the protected halls of the library
interests and habits by DRM technologies. The effects of DRM technologies can
autonomy as they relate to the ability to select, use, and benefit from intellectual
built into DRM occur in the context of cultural and informational content basic to
human flourishing:
The broader debate over the legitimacy and efficacy of digital rights management
technologies cannot be resolved in this chapter. But the very existence and
potency of the debate points to the fact that individuals anticipate that the
public libraries will extend into the digital sphere as well. Whether at the library
information-seeking and intellectual activities can occur free from oversight and
control.
Convergence of Mobilities
social, cultural, and intellectual activities free from answerability and oversight.
fundamental values and aspirations that define American culture, such as the
promise of exploration and adventure, the potential for individual autonomy and
illimitable freedom of the human mind” and to provide citizens the ability “to
1820a). And the new sphere of digital mobility frees our intellectual curiosities
from the physical confines of the library, providing new freedom to move within
and across the digital networks of cyberspace to explore places and ideas
physical mobility, has also frequently been cited in its supporting role of fostering
Throughout its history, the car has enabled people to break out of their
constraints, to attempt something they could never previously do, to
venture somewhere they could never previously go, to support ideas and
trends they could never previously endorse. (Setright, 2003, p. 186,
emphasis added)
undiscovered. In this way, the road served not only as an exit from one’s own life,
but also as an entrance into the experiences and interactions of the people, places,
and ideas encountered along its path. City dwellers could learn of rural culture
firsthand, while those residing in the countryside could drive to the cities and be
the beauty and mystery of the country’s diverse natural resources, to travel and
learn about their family’s roots, or to retrace the history of the nation.
The physical mobility enabled by the growing car culture also allowed
easier access to education, including the ability for individuals to leave their
with different sets of norms and values. Further, the intellectual freedoms enjoyed
in the nation’s public libraries were bolstered by the newfound ability to drive to
local libraries, as well as travel to larger and more diverse libraries in neighboring
the mind” and enabling not only physical mobility, but also new levels of
intellectual mobility.
Modern America can be defined by the “mobility of its people and their
mobility are perpetually intertwined. With the growing importance of the Internet
in modern life, digital mobility has emerged as a vital proxy for both physical and
physical and intellectual mobilities into marriage with one another. Indeed, links
between the digital mobility provided by the Internet and the physical and
intellectual mobility attained via the automobile and highway system abound.
First, broadly speaking, both the highways and the Internet act as communication
and that “lines of communication” included not only the telegraph or telephone,
but also “roads, canals and railways” (Williams, 1983, p. 72). The highways, then,
imparting information (be they physical mail or commercial products) across
space, in much the same way that the telegraph imparted messages across the
various information and services, such as electronic mail, file sharing, and the
interlinked pages of the World Wide Web that facilitate modern-day sociability,
Second, both the Internet and the highway systems are methods of
variety of “places” online, providing the ability to escape and explore beyond the
physical limitations of the highway. And a third linkage between the highways
and the Internet is their shared history. Both the interstate and information
superhighways were first conceived during the late 1950s and early 1960s as
responses, in part, to the supposed technological and nuclear threat of the Soviet
Union. A key motivation for the design and construction of the interstate highway
system was to enable military and civil defense operations, including troop
movements and the emergency evacuation of cities in the event of nuclear war. In
response to the Soviet launching of the Sputnik satellites, the Advanced Research
communications network (ARPANET) soon emerged from the collected minds of
ARPA, which eventually became today’s Internet (see Hafner & Lyon, 1996).71
The highway system and the Internet share a fourth similarity: their
result, each node has several possible routes through which to send data: if one
network because it “lacks any centralized hubs and offers direct linkages from
71
The belief persists that a primary motivation for the development of the
distributed ARPANET was so that military information and communication
networks could withstand a nuclear attack. According to some of the original
designers of the network, this is only partially true:
It was from the RAND study that the false rumor started claiming that the
ARPANET was somehow related to building a network resistant to nuclear
war. This was never true of the ARPANET, only the unrelated RAND study
on secure voice considered nuclear war. However, the later work on
Internetting did emphasize robustness and survivability, including the
capability to withstand losses of large portions of the underlying networks.
(Leiner et al., 2003 at note 5)
72
It should be noted that many of the systems used to enable Internet
usage do not follow a purely distributed form. The Domain Name System (DNS),
which translates alphabetic Web addresses (www.michaelzimmer.org) into
numeric network addresses (70.103.189.67), is a decentralized and hierarchical
system through which nearly all Web browsing traffic must flow. As Galloway
(2004) notes, “ironically, then, nearly all Web traffic must submit to a hierarchical
structure (DNS) to gain access to the anarchic and radically horizontal structure of
the Internet” (p. 9).
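The hierarchical lookup described in this note can be illustrated with a minimal sketch. It is a toy model only, not a real DNS client: the nested table, the ROOT name, and the resolve function are hypothetical, and the single entry is taken from the example above. It shows only that a name is resolved by descending a hierarchy from the top-level label down to the host.

```python
# Toy model of hierarchical name resolution; not a real DNS client.
# ROOT is a hypothetical zone table holding only the example from this note.
ROOT = {
    "org": {
        "michaelzimmer": {
            "www": "70.103.189.67",
        },
    },
}


def resolve(name, root=ROOT):
    """Walk the hierarchy from the top-level label down to the host label."""
    node = root
    # Names are read right to left: "org", then "michaelzimmer", then "www".
    for label in reversed(name.lower().rstrip(".").split(".")):
        if not isinstance(node, dict) or label not in node:
            return None  # no such name in this toy hierarchy
        node = node[label]
    # Only a leaf (a string) counts as a numeric address.
    return node if isinstance(node, str) else None


print(resolve("www.michaelzimmer.org"))
```

Every lookup in this sketch must begin at the single root table, which is the structural point Galloway makes about DNS.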
city to city through a variety of highway combinations” (Galloway, 2004, p. 35).73
Given their mesh-like topology, traffic across these two networks – the physical
highway and the digital Internet – is often random and unrepeatable, fostering a
certain level of autonomy and freedom for their users. These various linkages
historian George Pierson’s statement regarding the critical role of mobility in our
lives:
Summary
individuals have historically enjoyed the ability to engage in social, cultural, and
intellectual activities free from answerability and oversight. Within these spheres,
spaces for personal growth, exploration, and escape within these spheres of
new worlds of information, new spaces for communication, and new means of
73
Likewise, while designed upon the principle of a distributed network, in
practice, the highway system does not achieve a fully distributed form.
experiencing the world. In many ways, the physical, intellectual, and digital
The previous chapter demonstrated how the quest for the perfect search
exposed a Faustian bargain implicit within the quest for the perfect search engine,
the theory of contextual integrity did not provide a means of assessing the
through the perfect search outweigh any potential harm. This chapter has argued,
however, that the stakes for violating contextual integrity in the perfect search
engine are much greater than simply allowing Google to collect search queries or
the library, or on the Internet, spheres of mobility provide the means to break
down barriers, expand our horizons, offer new insights, and lead us into new
directions. By violating the information norms within our spheres of mobility, the
quest for the perfect search engine threatens our ability to navigate, to inquire, and
life, and impedes our enjoyment of the freedoms fundamental to our spheres of
mobility.
CHAPTER VII
Shortly after reports emerged that the Web search engine Google had
months’ worth of search queries that Google had received from its users, The
Mr. Tutley,
The DOJ should not be in the business of threatening American
enterprises, particularly one as beloved as Google, the people's search
engine. Especially when taxpayers pay your salaries -- salaries that, at
least in your four-person office, are apparently going largely to subsidize
(according to our tracking cookies) a minimum of four hours daily of eBay
auctions (James Llewellyn and Carol Santana, both GS-14) and
SpongeBob Collapse (Martha Stanhope, GS-13).
David Miller, Google
David Miller,
Twice last month your VW Passat went through the Ninth Street
Toll Plaza -- 11:04 p.m. on 12/26 and then, in the opposite direction, at
4:31 a.m., 12/27 -- with a fetching young passenger whose retinal scan
does not appear to match that of Donna Weinstein-Miller, your wife of
seven years and mother of your twins, Emma and Jedediah.
Turn over the records.
B. Tutley, DOJ
Mr. Tutley,
For a man whose Google searches the past six months have
included the terms personal bankruptcy, bankruptcy lawyer, Chapter 11
and Azerbaijan hotties (btw, the “and” is superfluous), you're in no
position to muscle us, or to demand a peek at the personal information of
others.
D. Miller
(excerpted from Postman, 2006)
tale, networked vehicle information systems and Web search engines – bear on
the ability to move, navigate, inquire, and explore within our spheres of mobility.
This dissertation has argued that the quest for the perfect search engine presents a
Faustian bargain: The perfect search promises new breadth, depth, efficiency, and
promises made by the perfect search, while ignoring the ways that privacy,
Proponents of the perfect search have succeeded in obscuring its value and
exists, as the information shared in the perfect search is not personally identifiable
and often is already shared with other entities in other circumstances. In order for
visible” (Brey, 2000, p. 13), this dissertation utilized the theory of “contextual
of the quest for the perfect search engine, revealing how Google’s goal of creating
for the perfect search is altering personal information flows in such a way that
The emergence of the perfect search engine, then, forces us to examine its
Whether on the highways, in the library, or on the Internet, without the ability and
sort of understanding of our world and develop the awareness and competencies
life. More than just shifting the contextual integrity of informational norms within
particular information-seeking contexts, the quest for the perfect search engine
represents the latest – and perhaps the most potent – threat to the freedoms
about its users as possible. A recent New York Times article profiling a citizen of
As Dan Firger, a law student at New York University, strolls from class to
class during the course of his day or pauses for a breather in Washington
Square Park, his cellphone is routinely buzzing inside his messenger bag.
He can often guess who it is: Google. Six to eight times a day text
messages pop up, courtesy of Google Calendar, a free daily organizer
introduced this year. The program can scan appointments and send
reminders of coming events. Google is everywhere in Mr. Firger’s life. He
scours the Web with its search engine; he chats with friends in Bolivia
using Google Talk; and he receives e-mail messages on a Google Gmail
account. “I find myself getting sucked down the Google wormhole,” Mr.
Firger said with equal parts resentment and admiration. “It’s all part of
Google’s benign dictatorship of your life.”
…Mr. Firger, the law student, acknowledged feeling a “weird
tension” about his love of Google’s products and his fear about its
omnipresence in his life. “I don’t know if I want all my personal
information saved on this massive server in Mountain View, but it is so
much of an improvement on how life was before, I can’t help it,” he said.
(Williams, 2006)
In its quest for the perfect search engine, Google has constructed an
into an infrastructure for the capture of personal information. Greg Elmer warns
technology designers:
interface, where the default settings and arrangement of services make the
of renegotiating our Faustian bargain with the perfect search engine. One avenue
for changing the terms of the Faustian bargain is to enact laws to regulate the
gathering of leading legal scholars and industry lawyers to discuss the possibility
solutions are difficult to conceive, let alone agree upon.74 Alternatively, the search
engine industry could self-regulate, creating strict policies regarding the capture,
aggregation, and use of personal data via their services. But as Chris Hoofnagle
reminds us, “We now have ten years of experience with privacy self-regulation
economic interests in capturing user information for powering the perfect search
technologies around us (1966, p. 55), and Lessig’s assertion that “how a system is
designed will affect the freedoms and control the system enables” (Lessig, 2001,
p. 35), this dissertation argues that technological design is one of the critical
junctures to ensure that the technologies we use support human and ethical values
Design, we can work pragmatically to ensure that the design of Web search
74
See “Regulating Search: A Symposium on Search Engines, Law, and
Public Policy” held in December 2005 at Yale Law School
(http://islandia.law.yale.edu/isp/regulatingsearch.html).
through a harmonious relationship with the technology, i.e., a relationship free
Design of the perfect search engine. Two balls from the methodological toolkit
has attempted to bring clarity and a normative understanding to the ways in which
the quest for the perfect search engine bears on the values enjoyed in our spheres
the particular design features of the perfect search engine, revealing how its
spheres of mobility.
be iterative, the process is non-linear and rarely, if ever, complete. The conceptual
investigation of the perfect search engine will continue beyond the pages of this
Similarly, further technical investigations need to take place: For example, the
Web advertising aspect of the perfect search engine has remained unexplored,
including Google’s AdSense and AdWords programs, its Web Analytics software,
and its recent acquisition of DoubleClick. Additional products and services, such
as Web software for creating documents and spreadsheets, are continually added
new tools in support of the perfect search, the investigation of its technological
Opportunities exist to put the translation ball (from the Values at Play
our understanding of how the value of privacy is implicated in the design of the
perfect search, we can try to translate that value into various design features.
Potential design variables include whether default settings for new products or
process can be turned off. Or the extent to which different products should be
logged in to other services? Ideally, new tools can be developed to give users
access and control over the personal information collected: In the spirit of the
Code of Fair Information Practices, a Google Data Privacy Center should be built
to allow users to view all their personal data collected, make changes and
deletions, restrict how it is used, and so on. Countless more opportunities exist for
technology, this dissertation has exposed the Faustian bargain that has been thrust
upon online information-seekers as a result of the quest for the perfect search
engine. The theory of contextual integrity clarified how the perfect search engine
broader spheres of mobility, the dissertation revealed how the quest for the perfect
potentially inhibiting our ability to develop the awareness and competencies
life. In short, this has been an exercise in disclosive computer ethics, uncovering
the moral issues and features in the perfect search engine that have not, until now,
Conscious Design, there is hope that our Faustian bargain can be renegotiated to
provide a more harmonious relationship with the quest for the perfect search
engine, allowing full enjoyment of the fundamental freedoms within our spheres
of mobility.
BIBLIOGRAPHY
Ackerman, E., & Blitstein, R. (2006, October 9). Google buys YouTube for $1.65
billion. San Jose Mercury News.
Agre, P. (1995). Reasoning about the future: The technology and institutions of
intelligent transportation systems. Santa Clara Computer and High
Technology Law Journal, 11(1), 129-136.
Agre, P., & Rotenberg, M. (Eds.). (1997). Technology and privacy: The new
landscape. Cambridge, MA: MIT Press.
Alpert, S. (1995). Privacy and intelligent highways: Finding the right of way.
Santa Clara Computer and High Technology Law Journal, 11(1), 97-118.
Alsop, R. (2005, December 6). Ranking corporate reputations. Wall Street Journal
Online. Retrieved March 25, 2007, from
http://online.wsj.com/public/article/SB113382708423014553-
qFM4JXwHCQvWsS14_SXjj123W5M_20061206.html
American Library Association Office for Intellectual Freedom. (2004, April). The
USA Patriot Act in the library: Analysis of the USA Patriot Act related to
libraries. Retrieved January 14, 2007, from
http://www.ala.org/ala/oif/ifissues/usapatriotactlibrary.htm
American Library Association. (2003). Resolution on the USA Patriot Act and
related measures that infringe on the rights of library users. Retrieved
January 14, 2007, from
http://www.ala.org/ala/washoff/WOissues/civilliberties/theusapatriotact/alar
esolution.htm
Andrews, P. (1999, February 7). The search for the perfect search engine. The
Seattle Times, p. E1.
Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A., & Raghavan, S. (2001).
Searching the Web. ACM Transactions on Internet Technology, 1(1), 2-43.
Aristotle. (1999). Nicomachean ethics (T. Irwin, Trans. 2nd ed.). Indianapolis:
Hackett Publishing.
Associated Press. (2005, July 17). Google growth yields privacy fear. Wired.com.
Retrieved March 31, 2007, from
http://www.wired.com/politics/security/news/2005/07/68235
Ayers, C. (2003, November 1). Google: Could this be the new God in the
machine? The Times, p. 4.
Baig, E. (2000, May 31). Help from above gets you there Pioneer GPS keeps driver
on straight and narrow. USA Today, p. 3D.
Barbaro, M., & Zeller Jr, T. (2006, August 9). A face is exposed for AOL
searcher no. 4417749. The New York Times, p. A1.
Barroso, L. A., Dean, J., & Hölzle, U. (2003). Web search for a planet: The
Google cluster architecture. IEEE Micro, 23(2), 22-28.
Barth, A., Datta, A., Mitchell, J. C., & Nissenbaum, H. (2006). Privacy and
contextual integrity: Framework and applications. Paper presented at the
IEEE Symposium on Security and Privacy.
Battelle, J. (2004, September 8). Perfect search. Searchblog. Retrieved May 16,
2006, from http://battellemedia.com/archives/000878.php
Battelle, J. (2005). The search: How Google and its rivals rewrote the rules of
business and transformed our culture. New York: Portfolio.
Battelle, J. (2006a, January 30). More on what Google (and probably a lot of
others) know. Searchblog. Retrieved May 16, 2006, from
http://battellemedia.com/archives/002283.php
Battelle, J. (2006b, January 27). What info does Google keep? Searchblog.
Retrieved May 16, 2006, from
http://battellemedia.com/archives/002272.php
Bayley, S. (1986). Sex, drink, and fast cars. New York: Random House.
Becker, E., Buhse, W., Günnewig, D., & Rump, N. (2004). Digital rights
management: Technological, economic, legal and political aspects. New
York: Springer.
Bennett, C. (2001). Cookies, Web bugs, webcams and cue cats: Patterns of
surveillance on the World Wide Web. Ethics and Information Technology,
3(3), 197-210.
Bennett, C., Raab, C., & Regan, P. (2003). People and place: Patterns of
individual identification within intelligent transport systems. In D. Lyon
(Ed.), Surveillance as social sorting: Privacy, risk, and digital
discrimination. (pp. 153-175). London: Routledge.
Berkley, J. (1996). Women at the motor wheel: Gender and car culture in the
U.S.A., 1920-1930. Unpublished Dissertation, Claremont Graduate
University.
Berners-Lee, T. (2000). Weaving the Web: The past, present and future of the
World Wide Web by its inventor. New York: Harper Business.
Bijker, W. E. (1995). Of bicycles, bakelites, and bulbs: Toward a theory of
sociotechnical change. Cambridge, MA: MIT Press.
Bijker, W. E., Hughes, T., & Pinch, T. (Eds.). (1987). The social construction of
technological systems: New directions in the sociology and history of
technology. Cambridge, MA: MIT Press.
Boehner, K., David, S., Kaye, J., & Sengers, P. (2005). Critical technical practice
as a methodology for values in design. Paper presented at the CHI 2005
Workshop: Quality, Value(s) and Choice: Exploring Wider Implications of
HCI Practice, Portland, OR.
Bowker, G. C., & Star, S. L. (1999). Sorting things out: Classification and its
consequences. Cambridge, MA: MIT Press.
Bray, H. (2004, April 26). Gmail controversy highlights new privacy issue. The
Boston Globe, p. C1.
Bray, H. (2005, November 8). Security firm: Sony CDs secretly install spyware.
Boston Globe. Retrieved January 20, 2007, from
http://www.boston.com/business/technology/articles/2005/11/08/security_fir
m_sony_cds_secretly_install_spyware/
Brewer, D., & Hayes, J. C. (Eds.). (2002). Using the Encyclopédie: Ways of
knowing, ways of reading. Oxford: Voltaire Foundation.
Brey, P. (2000). Disclosive computer ethics. Computers and Society, 30(4), 10-
16.
Brin, S. & Page, L. (1998). The anatomy of a large-scale hypertextual Web search
engine. WWW7 / Computer Networks, 30(1-7), 107-117.
Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., et
al. (2000). Graph structure in the Web. Computer Networks, 33(1-6), 309-
320.
Burk, D. & Gillespie, T. (2006). Autonomy and morality in DRM and
anti-circumvention law. tripleC: Cognition, Communication, Cooperation, 4(2),
239-245.
Bush, V. (1945, July). As we may think. The Atlantic Monthly, 176(1), 101-108.
Bynum, T. W. (2000). Norbert Wiener’s foundation of computer ethics. The
Research Center on Computing & Society. Retrieved March 18, 2007, from
http://www.southernct.edu/organizations/rccs/resources/research/introductio
n/bynum_wiener.html
Camp, L. J. (n.d.). Design for values, design for trust. Retrieved September 20,
2006, from http://www.ljean.com/design.html
Camp, L. J., Friedman, A., & Genkina, A. (n.d.). Embedding trust via social
context in virtual spaces. Retrieved September 23, 2006, from
http://www.ljean.com/files/NetTrust.pdf
Chang, N. (2001, November). The USA PATRIOT Act: What's so patriotic about
trampling on the Bill of Rights? The Center for Constitutional Rights.
Retrieved January 14, 2007, from http://www.ccr-
ny.org/v2/reports/docs/USA_PATRIOT_ACT.pdf
Chun, W. H.-K. (2006). Control and freedom: Power and paranoia in the age of
fiber optics. Cambridge, MA: MIT Press.
http://www.forbes.com/leadership/2006/08/23/leadership-innovation-
requiredreading-cx_hc_0823moore.html
Clayton, M. (2000, May 23). Calculators in class: Freedom from scratch paper or
'crutch'? The Christian Science Monitor, p. 20.
Cohan, S., & Hark, I. R. (1997). The road movie book. London: Routledge.
Cohen, J. (2003a). DRM and privacy. Communications of the ACM, 46(4), 46-49.
Cohen, J. (2003b). DRM and privacy. Berkeley Technology Law Journal, 18,
575-617.
Cohen, R. (2002, December 15). Is Googling o.k.? The New York Times
Magazine, p. 50.
Datamonitor. (2005, July 5). Google enters top 100 list. Datamonitor
Computerwire. Retrieved November 30, 2006, from
http://www.computerwire.com/industries/research/?pid=B9775C66-25B1-
4979-9672-0A11732DAD9F
Derene, G. (2007, March 21). Buzzword: Vehicle-to-vehicle communications.
PopularMechanics.com. Retrieved March 28, 2007, from
http://www.popularmechanics.com/blogs/automotive_news/4213544.html
Downs, R. B., & McCoy, R. E. (1984). The first freedom today: Critical issues
relating to censorship and intellectual freedom. Chicago: American Library
Association.
Doyle, C. (2003, February 26). Libraries and the USA PATRIOT Act.
Congressional Research Service Report for Congress.
Doyle, J. (1998, October 10). Electronic tolls for Golden Gate Bridge; district
approves scanner system for use late next year. San Francisco Chronicle, p.
A1.
Dunlap, A. R. (2002). Fixing the Fourth Amendment with trade secret law: A
response to Kyllo v. United States. Georgetown Law Journal, 90(6),
2175-2206.
Durbin, P., & Rapp, F. (Eds.). (1983). Philosophy and technology. Boston:
Kluwer Academic Publishers.
Eisenstein, E. (1983). The printing revolution in early modern Europe. New York:
Cambridge University Press.
Electronic Privacy Information Center. (2004a, March 29). Digital rights
management and privacy. Retrieved January 20, 2007, from
http://www.epic.org/privacy/drm/
Electronic Privacy Information Center. (2004b, August 18). Gmail privacy page.
Retrieved June 17, 2006, from http://www.epic.org/privacy/gmail/faq.html
Ellul, J. (1964). The technological society (J. Wilkinson, Trans.). New York:
Knopf.
Emtage, A. & Deutsch, P. (1992). Archie: An electronic directory service for the
Internet. Proceedings of the Winter 1992 Usenix Conference, 93-110.
Evans, J. (1996, May 10). Copyright comes to the Internet; IBM's 'cryptolope'
technology collects the fees. The Washington Post, p. F1.
Fabrikant, G. (2005, March 21). Ask Jeeves Inc. to be bought for $2 billion. The
New York Times, p. A12.
Fallows, D. (2005). Search engine users: Internet searchers are confident, satisfied
and trusting – but they are also unaware and naïve. Pew Internet &
American Life Project. Retrieved October 15, 2005, from
http://www.pewinternet.org/pdfs/PIP_Searchengine_users.pdf
Feigenbaum, J., Freedman, M. J., Sander, T., & Shostack, A. (2001). Privacy
engineering for digital rights management systems. Proceedings of the ACM
Workshop on Security and Privacy in Digital Rights Management.
Felten, E. (2003). A skeptical view of DRM and fair use. Communications of the
ACM, 46(4), 56-59.
Ferguson, C. (2005). What's next for Google? Technology Review, 108(1), 38-46.
Flanagan, M., Howe, D., & Nissenbaum, H. (2005). Values at play: Design
tradeoffs in socially-oriented game design. Conference on Human Factors
in Computing Systems, 751-760.
Flanagan, M., Howe, D., & Nissenbaum, H. (in press). Values in design: Theory
and practice. In J. van den Hoven, & J. Weckert (Eds.), Information
technology and moral philosophy. Cambridge University Press.
Fogarty, T. (1997, November 20). GM at your service with OnStar aboard, buy a
car, get a concierge. USA Today, p. 1B.
Franzier, D. (2002, May 27). Traffic gets smarter; high-tech solutions can help
manage Denver roads. Rocky Mountain News, p. 26A.
Friedman, B. (1997). Human values and the design of computer technology (CSLI
Lecture Notes, No. 72). New York: Cambridge University Press.
Friedman, B., Howe, D., & Felten, E. (2002). Informed consent in the Mozilla
browser: Implementing value-sensitive design. Proceedings of the 35th
Annual Hawaii International Conference on System Sciences.
Friedman, B., & Kahn, P. (2002). Human values, ethics, and design. In J. Jacko
& A. Sears (Eds.), The human-computer interaction handbook (pp.
1177-1201). Mahwah, NJ: Lawrence Erlbaum.
Friedman, B., Kahn, P., & Borning, A. (2002). Value sensitive design: Theory
and methods. (Technical Report 02-12-01). Seattle, WA.
Friedman, T. (2003, June 29). Is Google God? The New York Times, p. 13.
Galloway, A. (2004). Protocol: How control exists after decentralization.
Cambridge, MA: MIT Press.
Garfinkel, S. (1995, May 3). The road watches you. New York Times, p. A17.
Garfinkel, S. (2000). Database nation: The death of privacy in the 21st century
(1st ed.). Sebastopol, CA: O'Reilly.
Gavison, R. (1980). Privacy and the limits of law. Yale Law Journal, 89(3), 421-
471.
Gillies, J., & Cailliau, R. (2000). How the Web was born: The story of the World
Wide Web. New York: Oxford University Press.
Gillmor, S. (2004, April 23). Google's Brin talks on Gmail future. eWeek.com.
Retrieved June 20, 2006, from
http://www.eweek.com/article2/0,1759,1572683,00.asp
Goldman, J. (2006, August 17). Beta update!. Blogger Buzz. Retrieved August 28,
2006, from http://buzz.blogger.com/2006/08/beta-update.html
Google. (1999, June 7). Google receives $25 million in equity funding [press
release]. Google Press Center. Retrieved August 18, 2006, from
http://www.google.com/press/pressrel/pressrelease1.html
Google. (2003). 20 year archive on Google Groups. Retrieved June 20, 2006,
from http://www.google.com/googlegroups/archive_announce_20.html
Google. (2004a). Google Alerts FAQ. Retrieved May 22, 2006, from
http://www.google.com/alerts/faq.html?hl=en
Google. (2004b, March 17). Google connects searchers with local information
[press release]. Google Press Center. Retrieved May 25, 2006, from
http://www.google.com/press/pressrel/local.html
Google. (2005a, October 15). Blogger privacy notice. Retrieved August 20, 2006,
from http://beta.blogger.com/privacy
Google. (2005c). Gmail privacy policy. Retrieved June 17, 2006, from
http://mail.google.com/mail/help/privacy.html
Google. (2005d). Gmail terms of use. Retrieved June 17, 2006, from
http://mail.google.com/mail/help/terms_of_use.html
Google. (2005e, September 15). Google Blog search [press release]. Google Press
Center. Retrieved May 25, 2006, from
http://www.google.com/press/annc/blog_search.html
Google. (2005f). Google Book search frequently asked questions. Retrieved May
25, 2006, from http://books.google.com/googleprint/help.html
Google. (2005g). Google Image search help. Retrieved May 3, 2006, from
http://www.google.com/help/faq_images.html
Google. (2005h, October 6). Google merges local and maps products [press
release]. Google Press Center. Retrieved May 25, 2006, from
http://www.google.com/press/pressrel/local_merge.html
Google. (2005j, October 14). Google Privacy policy. Retrieved May 3, 2006,
from http://www.google.com/privacypolicy.html
Google. (2005l, October 14). Orkut privacy notice. Retrieved August 20, 2006,
from http://www.orkut.com/Privacy.aspx
Google. (2005m, May 19). Personalize your homepage [press release]. Google
Press Center. Retrieved May 16, 2006, from
http://www.google.com/press/annc/personalize.html
Google. (2006a, September 22). All our n-gram are belong to you. Google
Research Blog. Retrieved April 15, 2007, from
http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-
you.html
Google. (2006b). Blogger help: Will my use of blogger be associated with my use
of other Google services? Retrieved August 20, 2006, from
http://help.blogger.com/bin/answer.py?answer=42601&topic=8939
Google. (2006d). Google Calendar privacy notice. Retrieved May 25, 2006, from
http://www.google.com/googlecalendar/privacy_policy.html
Google. (2006e). Google Desktop features. Retrieved May 25, 2006, from
http://desktop.google.com/features.html
Google. (2006f). Google Finance FAQ. Retrieved May 25, 2006, from
http://www.google.com/googlefinance/faq.html
Google. (2006g, February 9). Google Groups beta privacy policy. Retrieved June
20, 2006, from http://groups.google.com/googlegroups/privacy.html
Google. (2006i, May 8). Google Notebook privacy policy. Retrieved June 20,
2006, from http://www.google.com/googlenotebook/privacy.html
Google. (2006j, February 7). Google Talk privacy notice. Retrieved May 25,
2006, from http://www.google.com/talk/privacy.html
Google. (2006k, January 11). Google Toolbar privacy notice. Retrieved May 3,
2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=32551&topi
c=938
Google. (2006l). Google Video player privacy notice. Retrieved May 25, 2006,
from
http://video.google.com/support/bin/answer.py?answer=32170&topic=1490
Google. (2006m). Google Web accelerator help. Retrieved May 3, 2006, from
http://webaccelerator.google.com/support.html
Google. (2006n). How do I personalize the Google homepage? Retrieved May 22,
2006, from
http://www.google.com/support/bin/answer.py?answer=25551&topic=1592
Google. (2006o). Making purchases with your Google Account. Retrieved May
25, 2006, from
http://www.google.com/support/purchases/bin/answer.py?answer=31754
Google. (2006p). What are Google's privacy practices for the send to features?
Retrieved May 3, 2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=32819&topi
c=938
Google. (2006q). What are the advanced features of the Google Toolbar?
Retrieved May 3, 2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=14293
Google. (2004). Letter from the founders. Retrieved March 29, 2007, from
http://investor.google.com/ipo_letter.html
Gorman, M. (2004, December 17). Google and God's mind. Los Angeles Times, p.
B15.
Graham, J. (2005, November 14). Sony to pull controversial CDs, offer swap. USA
Today. Retrieved January 20, 2007, from
http://www.usatoday.com/money/industries/technology/2005-11-14-sony-
cds_x.htm
Griffin, P. (2006, January 27). Big brother wants to track your cybersteps. New
Zealand Herald.
Gross, J. (1997, March 25). E-Z Pass living up to its name; electronic tolls are
catching on, and commuters are catching up. The New York Times, p. B1.
Gulli, A. & Signorini, A. (2005). The indexable Web is more than 11.5 billion
pages. International World Wide Web Conference, 902-903.
Gussow, D. (1999, October 4). In search of. St. Petersburg Times, p. 13.
Hafner, K. (2005, May 20). Google moves to challenge Web portals. The New
York Times, p. C6.
Hafner, K. (2006, January 25). After subpoenas, internet searches give some
pause. The New York Times, pp. A1, A19.
Hafner, K., & Lyon, M. (1996). Where wizards stay up late: The origins of the
internet. New York: Simon & Schuster.
Hafner, K., & Richtel, M. (2006, January 20). Google resists U.S. subpoena of
search data. The New York Times, pp. A1, C4.
Hall, E. (2000). Internet core protocols: The definitive guide. Cambridge, MA:
O'Reilly.
Halpern, S. (1995). The traffic in souls: Privacy interest and intelligent vehicle
highway systems. Santa Clara Computer and High Technology Law
Journal, 11(1), 45-73.
Hansell, S. (2004a, June 21). The internet ad you are about to see has already read
your e-mail. The New York Times, p. C1.
Hansell, S. (2004b, March 2). Yahoo to charge for guaranteeing a spot on its
index. The New York Times, p. C4.
Hansell, S. (2005, September 26). Microsoft plans to sell search ads of its own.
The New York Times, pp. C1, C8.
Hansell, S. (2006, August 8). AOL removes search data on vast group of Web
users. The New York Times, p. C4.
Hansen, E. (2004, January 14). Yahoo, Google primed for search war. CNET
News.com. Retrieved August 20, 2006, from http://news.com.com/2100-
1024-5141328.html
Hargittai, E. (2002). Beyond logs and surveys: In-depth measures of people's Web
use skills. Journal of the American Society for Information Science and
Technology, 53(14), 1239-1244.
Hargittai, E. (2004a). Informed Web surfing: The social context of user
sophistication. Society online: the Internet in context, Thousand Oaks: Sage
Publications, Inc, 257-274.
Harris, A. (2005, January 13). Car's black box evidence ruled admissible.
Law.com. Retrieved March 31, 2007, from
http://www.law.com/jsp/article.jsp?id=1105364095740
Harris, S. (2006, July 7). Dictionary adds verb: To Google. San Jose Mercury
News.
Heidegger, M. (1977). The question concerning technology, and other essays (W.
Lovitt, Trans.). New York: Harper & Row.
Hellsten, I., Leydesdorff, L., & Wouters, P. (2006). Multiple presents: How
search engines re-write the past. New Media & Society, 8(6), 901-924.
Hellweg, E. (2002, April 22). Google's need for speed. CNN/Money. Retrieved
August 20, 2006, from
http://money.cnn.com/2002/04/22/technology/techinvestor/hellweg/index.ht
m
Hey, K. (1983). Cars and films in American culture, 1929-1959. In D. L. Lewis &
L. Goldstein (Eds.), The automobile and American culture (pp. 193-205).
Ann Arbor, MI: University of Michigan Press.
Hines, M. (2005, May 5). Google tool to speed Web surfing. CNET News.com.
Retrieved March 31, 2007, from
http://news.com.com/Google+tool+to+speed+Web+surfing/2100-1032_3-
5696496.html
Hinman, L. (2005). Esse est indicato in Google: Ethical and political issues in
search engines. International Review of Information Ethics, 3, 19-25.
Hoelscher, C. (1998). How internet experts search for information on the Web.
World Conference of the World Wide Web, Internet, and Intranet, Orlando,
FL.
Hölscher, C. & Strube, G. (2000). Web search behavior of internet experts and
newbies. Computer Networks, 33(1-6), 337-346.
Horrigan, J., & Rainie, L. (2006, April 19). The internet’s growing role in life’s
major moments. Pew Internet & American Life Project. Retrieved May 26,
2006, from http://www.pewinternet.org/PPF/r/181/report_display.asp
Howe, D., & Nissenbaum, H. (2006). TrackMeNot. Retrieved August 27, 2006,
from http://mrl.nyu.edu/~dhowe/TrackMeNot
IAC Search & Media. (2005, July 13). Privacy policy for Ask.com. Retrieved
January 6, 2007, from http://sp.ask.com/en/docs/about/privacy.shtml
Inness, J. (1992). Privacy, intimacy, and isolation. New York, NY: Oxford
University Press.
Internet World Stats. (2006, September 18). World internet usage and population
statistics. Retrieved November 21, 2006, from
http://www.internetworldstats.com/stats.htm
Interrante, J. (1983). The road to autopia: The automobile and the spatial
transformation of American culture. In D. L. Lewis & L. Goldstein (Eds.),
The automobile and American culture (pp. 89-104). Ann Arbor, MI:
University of Michigan Press.
Introna, L. & Nissenbaum, H. (2000). Shaping the Web: Why the politics of
search engines matters. The Information Society, 16(3), 169-185.
Jansen, B. J., Spink, A., & Saracevic, T. (2000). Real life, real users, and real
needs: A study and analysis of user queries on the Web. Information
Processing and Management, 36(2), 207-227.
Johnson, D. (2001). Computer ethics (3rd ed.). Upper Saddle River, NJ: Prentice-
Hall.
Johnson, S. (1997). Interface culture: How new technology transforms the way we
create and communicate. San Francisco: Basic Books.
Jordan, M. (2006, January 7). Electronic eye grows wider in Britain; cars to be
subject to video surveillance. The Washington Post, p. A1.
Kang, J. (1998). Information privacy in cyberspace transactions. Stanford Law
Review, 50(4), 1193-1294.
Kleinberg, J. & Lawrence, S. (2001). The structure of the Web. Science, 294,
1849-1850.
Kopytoff, V. (2003, February 23). After the boom: Dot-coms defy trend, make
money. San Francisco Chronicle, p. I1.
Kopytoff, V. (2004, March 30). Google tests souped-up Web searches. San
Francisco Chronicle, p. C3.
Kushmerick, N. (1998, February 23). The search engineers. The Irish Times, p.
10.
La Monica, P. (2004, April 30). Google sets $2.7 billion IPO. CNNMoney.com.
Retrieved July 29, 2006, from
http://money.cnn.com/2004/04/29/technology/google/
Laird, D. (1983). Versions of Eden: The automobile and the American novel. In D.
L. Lewis & L. Goldstein (Eds.), The automobile and American culture (pp.
244-256). Ann Arbor, MI: University of Michigan Press.
Langville, A., & Meyer, C. (2006). Google's PageRank and beyond: The science
of search engine rankings. Princeton, NJ: Princeton University Press.
Lardner, G. (2001, November 16). On left and right, concern over anti-terrorism
moves; administration actions threaten civil liberties, critics say. The
Washington Post, p. A40.
Lawrence, S. & Giles, C. L. (1998). Searching the World Wide Web. Science,
280(5360), 98-100.
Leiner, B., Cerf, V., Clark, D., Kahn, R., Kleinrock, L., Lynch, D., et al. (2003,
December 10). A brief history of the internet. Internet Society. Retrieved
November 22, 2006, from http://www.isoc.org/internet/history/brief.shtml
Lessig, L. (1999). Code and other laws of cyberspace. New York: Basic Books.
Lessig, L. (2001). The future of ideas: The fate of the commons in a connected
world. New York: Random House.
Lessig, L. (2004). Free culture: How big media uses technology and the law to
lock down culture and control creativity. New York: Penguin Press.
Lewis, D. (1983). Sex and the automobile: From rumble seats to rockin' vans. In
D. L. Lewis & L. Goldstein (Eds.), The automobile and American culture
(pp. 123-133). Ann Arbor, MI: University of Michigan Press.
Lewis, D. L., & Goldstein, L. (Eds.). (1983). The automobile and American
culture. Ann Arbor, MI: University of Michigan Press.
Lobron, A. (2006, February 5). Googling your Friday-night date may or may not
be snooping, but it won't let you peek inside any souls. The Boston Globe
Magazine, p. 42.
Lum, C. (2000). Introduction: The intellectual roots of media ecology. New Jersey
Journal of Communication, 8(1), 1-7.
Machill, M., Neuberger, C., Schweiger, W., & Wirth, W. (2004). Navigating the
internet. European Journal of Communication, 19(3), 321-347.
Manders-Huits, N. & Zimmer, M. (in progress). Values & pragmatic action: The
challenges of engagement with technical design communities.
Maney, K. (2006, August 9). AOL's data sketch sometimes scary picture of
personalities searching net. USA Today, p. 4B.
Marlowe Jr, H. A., Nyhan, R., Arrington, L. W., & Pammer, W. J. (1994). The re-
ing of local government: Understanding and shaping governmental change.
Public Productivity & Management Review, 17(3), 299-311.
Martey, R. M. (in press). Exploring gendered notions: Gender, job hunting and
Web search engines. In A. Spink, & M. Zimmer (Eds.), Web searching:
Interdisciplinary perspectives. Dordrecht, The Netherlands: Springer.
Mayer, T. (2005, August 8). Our blog is growing up – and so has our index.
Yahoo! Search Blog. Retrieved November 25, 2006, from
http://www.ysearchblog.com/archives/000172.html
McChesney, R. (1999). Rich media, poor democracy: Communication politics in
dubious times. Urbana, IL: University of Illinois Press.
McCown, F., Liu, X., Nelson, M. L., & Zubair, M. (2006). Search engine
coverage of the OAI-PMH corpus. IEEE Internet Computing, 10(2), 66-73.
McCullagh, D. (2006a, August 7). AOL's disturbing glimpse into users' lives.
CNET News.com. Retrieved December 3, 2006, from
http://news.com.com/AOLs+disturbing+glimpse+into+users+lives/2100-
1030_3-6103098.html?tag=st.num
McCullagh, D. (2006b, March 17). Police blotter: Judge orders Gmail disclosure.
CNET News.com. Retrieved June 20, 2006, from
http://news.com.com/Police%20blotter%20Judge%20orders%20Gmail%20
disclosure/2100-1047_3-6050295.html
McShane, C. (1994). Down the asphalt path: The automobile and the American
city. New York: Columbia University Press.
Mills, E. (2005, August 3). Google balances privacy, reach. CNET News.com.
Retrieved January 7, 2007, from
http://news.com.com/Google+balances+privacy,+reach/2100-1032_3-
5787483.html
Mindlin, A. (2006, May 15). The case of the disappearing cookies. The New York
Times, p. C5.
Mintz, H. (2006, January 16). Feds after Google data: Records sought in U.S.
quest to revive porn law. San Jose Mercury News. Retrieved January 19,
2006, from http://www.siliconvalley.com/mld/siliconvalley/13657386.htm
Mulligan, D., Han, J., & Burstein, A. (2003). How DRM-based content delivery
systems disrupt expectations of "personal use". Proceedings of the 2003
ACM workshop on Digital Rights Management, 77-89.
Murphy, D. (2003, April 7). Some librarians use shredder to show opposition to
new F.B.I. powers. The New York Times, p. A12.
Netcraft. (2006, November 1). November 2006 Web server survey. Retrieved
November 26, 2006, from
http://news.netcraft.com/archives/2006/11/01/november_2006_web_server_
survey.html
Network Working Group. (1985, October). Request for comments: 959, file
transfer protocol. Retrieved November 21, 2006, from
http://ietf.org/rfc/rfc959.txt
Network Working Group. (1993, March). Request for comments: 1436, the
internet gopher protocol. Retrieved November 21, 2006, from
http://ietf.org/rfc/rfc1436.txt
Nielsen//NetRatings. (2006, March 30). Google accounts for nearly half of all
Web searches, while approximately one third are conducted on Yahoo! and
MSN combined. Retrieved May 26, 2006, from
http://www.nielsen-netratings.com/pr/pr_060330.pdf
North, S. (2006). The road movie. Hackwriters.com. Retrieved January 27, 2007,
from http://www.hackwriters.com/roadone.htm
Norvig, P., Winograd, T., & Bowker, G. (2006, February 27). The ethics and
politics of search engines. Panel at Santa Clara University Markkula Center
for Applied Ethics. Retrieved March 1, 2006, from
http://www.scu.edu/sts/Search-Engine-Event.cfm
Olsen, S. (2001, October 26). Patriot Act draws privacy concerns. CNET
News.com. Retrieved January 14, 2007, from http://news.com.com/2100-
1023-275026.html
Olsen, S. (2006, May 25). Dell embraces Google. CNET News.com. Retrieved
July 24, 2006, from http://news.com.com/Dell+embraces+Google/2100-
1032_3-6077051.html
Ong, W. (1982). Orality and literacy: The technologizing of the word. London:
Routledge.
Padover, S. (1952). Jefferson: A great American's life and ideas. New York:
Harcourt, Brace & World.
Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation
ranking: Bringing order to the Web. Technical report. Stanford University,
Stanford, CA.
Patton, P. (1986). Open road: A celebration of the American highway. New York:
Simon.
Pierson, G. (1942). The frontier and American institutions: A criticism of the Turner
theory. The New England Quarterly, 15(2), 224-255.
Pinch, T., & Bijker, W. E. (1987). The social construction of facts and artifacts:
Or how the sociology of science and the sociology of technology might
benefit each other. In W. E. Bijker, T. Hughes, & T. Pinch (Eds.), The social
construction of technological systems: New directions in the sociology and
history of technology. (pp. 17-50). Cambridge, MA: MIT Press.
Pinch, T. J. & Bijker, W. E. (1984, August). The social construction of facts and
artefacts: Or how the sociology of science and the sociology of technology
might benefit each other. Social Studies of Science, 14(3), 399-441.
Pitkow, J., Schütze, H., Cass, T., Turnbull, D., Edmonds, A., & Adar, E. (2002).
Personalized search. Communications of the ACM, 45(9), 50-55.
Plato. (1990). Phaedrus (H. Fowler, Trans. 17th ed.). Cambridge: Loeb Classical
Library.
Postman, A. (2006, January 29). Dear sir, we see from your files. The Washington
Post, p. B04.
Postman, N., & Weingartner, C. (1971). The soft revolution: A student handbook
for turning schools around. New York: Delacorte Press.
Purdy, M. (2001, November 25). Bush's new rules to fight terror transform the
legal landscape. The New York Times, p. A1.
Rainie, L. (2005, November). Search engine use shoots up in the past year and
edges towards e-mail as the primary internet application. Pew Internet and
American Life Project.
Ramasastry, A. (2005a, May 12). Can we stop ZabaSearch -- and similar personal
information search engines?: When data democratization verges on privacy
invasion. FindLaw. Retrieved June 12, 2006, from
http://writ.news.findlaw.com/ramasastry/20050512.html
Ramasastry, A. (2005b, August 23). Tracking every move you make: Can car
rental companies use technology to monitor our driving? FindLaw.
Retrieved September 15, 2005, from
http://writ.news.findlaw.com/ramasastry/20050823.html
Raskin, J. (2000). The humane interface: New directions for designing interactive
systems. Reading, MA: Addison-Wesley.
http://webservices.rhapsody.com/rwssdk/RhapsodyDNAWhitePaperV1_0.p
df
Reinhardt, A. (2003, May 5). And you thought the Web ad market was dead.
BusinessWeek Online. Retrieved November 20, 2006, from
http://www.businessweek.com/magazine/content/03_18/b3831134_mz034.h
tm
Retson, D. (2006, March 27). Black box tells all: The event data recorder is now
included in the majority of new vehicles recording information, including
speed and seat-belt use, in the event of a crash. The Gazette, p. E1.
Rogers, I. (2002, April). The Google Pagerank algorithm and how it works. IPR
Computing. Retrieved April 14, 2007, from
http://www.iprcom.com/papers/pagerank/
Samuelson, P. (2003). Digital rights management {and, or, vs.} the law.
Communications of the ACM, 46(4), 41-45.
Sanchez, R. (2003, April 10). Librarians make some noise over Patriot Act. The
Washington Post, p. A20.
Santa Clara symposium on privacy and IVHS. (1995). Santa Clara Computer &
High Technology Law Journal, 11.
Scharff, V. (1991). Taking the wheel: Women and the coming of the motor age.
New York: The Free Press.
Schwartz, C. (1998). Web search engines. Journal of the American Society for
Information Science, 49(11), 973-982.
Schwartz, J. (2001, September 4). Giving the Web a memory costs its users
privacy. The New York Times, p. A1.
Schwartz, J. (1998, September 8). Kids and computers; how wired would a
student's world be? The Washington Post, p. Z7.
Selingo, J. (2001, October 25). It’s the cars, not the tires, that squeal. The New
York Times, pp. G1, G8.
Sengers, P., Boehner, K., & David, S. (2005). Reflective design. Proceedings of
the 4th decennial conference on Critical computing: between sense and
sensibility, 49-58.
Sengers, P., Liesendahi, R., Magar, W., Seibert, C., Müller, B., Joachims, T., et al.
(1990). The enigmatics of affect. Proceedings of the conference on
Designing interactive systems: processes, practices, methods, and
techniques, 87-98.
Setright, L. J. K. (2003). Drive on!: A social history of the motor car. London:
Granta.
Shankland, S. (2005, October 4). Sun and Google shake hands. CNET News.com.
Retrieved July 24, 2006, from
http://news.com.com/Sun+and+Google+shake+hands/2100-1014_3-
5888701.html
Sharma, D. (2004, October 21). Is your boss Googling you? CNET News.com.
Retrieved January 6, 2007, from
http://news.com.com/Is+your+boss+Googling+you/2100-1038_3-
5421210.html
Sheff, D. (2004). Playboy interview: Google guys. Playboy, 51(9), 55-60, 142-
145.
Silverstein, C., Henzinger, M. R., Marais, H., & Moricz, M. (1999). Analysis of a
very large Web search engine query log. SIGIR Forum, 33(1), 6-12.
Simon, F. (2003, October 4). Ernst Kapp: An early and romantic philosopher of
technology. Retrieved December 1, 2006, from
http://members.home.nl/fsimon/index.html
Solove, D. (2004). The digital person: Technology and privacy in the information
age (Ex machina). New York: New York University Press.
Spink, A., & Jansen, B. J. (2004). Web search: Public searching of the Web. New
York: Kluwer Academic Publishers.
Standage, T. (1998). The Victorian internet: The remarkable story of the telegraph
and the nineteenth century's on-line pioneers. New York: Walker and Co.
Steinberg, S. (1996, May). Seek and ye shall find (maybe). Wired, pp. 108-114,
172-182.
Stockwell, F. (2001). A history of information storage and retrieval. Jefferson,
NC: McFarland & Company.
Sturges, P., Teng, V., & Iliffe, U. User privacy in the digital library environment:
A matter of concern for information professionals. Library Management,
22(8/9), 364-370.
Sullivan, D. (2003a, July 31). How search engines rank Web pages.
SearchEngineWatch. Retrieved November 25, 2006, from
http://searchenginewatch.com/showPage.html?page=2167961
Sullivan, D. (2003b, April 2). Search privacy at Google & other search engines.
SearchEngineWatch. Retrieved March 31, 2007, from
http://searchenginewatch.com/showPage.html?page=2189531
Sullivan, D. (2005, May 17). New estimate puts Web size at 11.5 billion pages &
compares search engine coverage. SearchEngineWatch. Retrieved January
4, 2007, from http://blog.searchenginewatch.com/blog/050517-075657
Swidey, N. (2003, February 2). A nation of voyeurs: How the internet search
engine Google is changing what we can find out about one another - and
raising questions about whether we should. The Boston Globe Sunday
Magazine, p. 10.
Tancer, B. (2006, May 18). Google Properties - understanding the breakdown.
Hitwise. Retrieved May 22, 2006, from
http://weblogs.hitwise.com/bill-tancer/2006/05/google_properties_understandin.html
Tavani, H. T. (2004). Ethics and technology: Ethical issues in an age of
information and communication technology. Hoboken, NJ: Wiley.
Tec-Ed. (1999, December). Assessing Web site usability from server log files
[white paper]. Retrieved April 3, 2006, from
www.teced.com/PDFs/whitepap.pdf
Teevan, J., Dumais, S. T., & Horvitz, E. (2005). Personalizing search via
automated analysis of interests and activities. Proceedings of the 28th
annual international ACM SIGIR conference on Research and development
in information retrieval, 449-456.
Thaw, J., & Daurat, C. (2006, August 8). Google to provide MySpace search. The
Seattle Times, p. E1.
Türker, D. (2004). The optimal design of a search engine from an agency theory
perspective. Retrieved November 1, 2006, from http://rundfunkoek.uni-
koeln.de/institut/publikationen/arbeitspapiere/ap191.php
Turkle, S. (1995). Life on the screen: Identity in the age of the internet. New
York: Simon & Schuster.
Turner, F. J. (1921). The frontier in American history. New York: Henry Holt and
Company.
United States Department of Justice. (2006). The USA PATRIOT Act: Preserving
life and liberty. Retrieved March 11, 2007, from
http://www.lifeandliberty.gov/highlights.htm
Vaidhyanathan, S. (2001). Copyrights and copywrongs: The rise of intellectual
property and how it threatens creativity. New York: New York University
Press.
Vaidhyanathan, S. (2004). The anarchist in the library: How the clash between
freedom and control is hacking the real world and crashing the system. New
York: Basic Books.
Van Couvering, E. (2004). New media? The political economy of internet search
engines. Annual Conference of the International Association of Media &
Communications Researchers, Porto Alegre, Brazil, 7-14.
Vaughan, L. & Thelwall, M. (2004). Search engine coverage bias: Evidence and
possible causes. Information Processing & Management, 40(4), 693-707.
Vine, R. (2004). The business of search engines. Information Outlook, 8(2), 25-
31.
Warren, S. & Brandeis, L. (1890). The right to privacy. Harvard Law Review, 4,
193-200.
Waters, R. (2006, April 22). Google, Microsoft and Yahoo woo eBay. Financial
Times, p. 21.
Watts, D. J. (2003). Six degrees: The science of a connected age. New York:
Norton.
Weiss, P. (2006, March 19). What a tangled Web we weave: Being googled can
jeopardize your job search. New York Daily News. Retrieved January 7,
2007
Wen, J. R., Nie, J. Y., & Zhang, H. J. (2001). Clustering user queries of a search
engine. Proceedings of the tenth international conference on World Wide
Web, 162-168.
Wiener, N. (1988). Human use of human beings: Cybernetics and society. Boston:
Da Capo Press.
http://en.wikipedia.org/w/index.php?title=Open_Directory_Project&oldid=1
17786100
Williams, A. (2006, October 15). Planet Google wants you. The New York Times,
p. 9.1.
Winner, L. (1986). The whale and the reactor: A search for limits in an age of
high technology. Chicago, IL: University of Chicago Press.
Winner, L. (1993, Summer). Upon opening the black box and finding it empty:
Social constructivism and the philosophy of technology. Science,
Technology, and Human Values, 18(3), 362-378.
Wouters, J. (2005, June 9). Still searching for disclosure:. Consumer Reports
WebWatch. Retrieved September 15, 2005, from
http://www.consumerwebwatch.org/pdfs/search-engine-disclosure.pdf
Wouters, P., Hellsten, I., & Leydesdorff, L. (2004). Internet time and the
reliability of search engines. First Monday. Retrieved December 24, 2006,
from http://www.firstmonday.org/issues/issue9_10/wouters/index.html
Yahoo!. (2005). The history of Yahoo! - how it all started. Retrieved March 25,
2007, from http://docs.yahoo.com/info/misc/history.html
Yahoo!. (2006, November 11). Yahoo! Privacy policy. Retrieved January 6, 2007,
from http://info.yahoo.com/privacy/us/yahoo/details.html
Zetter, K. (2005a, June 21). Driving big brother. Wired News. Retrieved June 28,
2005, from
http://www.wired.com/news/privacy/0,1848,67952,00.html?tw=wn_tophead
_3
Zetter, K. (2005b, May 17). Tor torches online tracking. Wired News. Retrieved
May 28, 2006, from
http://www.wired.com/news/privacy/0,1848,67542,00.html
APPENDIX A
its search services to include not only websites, but other online documents as
well, such as images, news feeds, Usenet archives, and video files. Additionally,
Google has begun digitizing the “material world,” adding the contents of popular
books, university libraries, maps and satellite images to its growing index. Users
can also search the files on their hard drives, send e-mail and instant messages,
shop online, and even engage in social networking through Google. In all, Google
sections will briefly describe Google’s key products in each of these information-
seeking contexts, revealing the circumstances of their use, and providing insights
into how they help Google attain the “perfect reach” and “perfect recall”
75
These nine contexts are not necessarily mutually exclusive and are not
put forth as strict metaphysical divisions. They are meant simply to help
compartmentalize for easier discussion the various types of information-seeking
activities a person undertakes in her daily activities.
Web Search
stand apart from the competition. In 1998, Brin and Page’s paper, “The Anatomy
proposed a system to more effectively retrieve information from the World Wide
Web to “improve the quality of search engines” and thus “bring order to the Web”
(Brin & Page, 1998, p. 3). The core of their new Web search engine was
PageRank, a set of algorithms for ranking Web pages, using the immense link
PageRank relies on the uniquely democratic nature of the Web by using its
vast link structure as an indicator of an individual page's value. In essence,
Google interprets a link from page A to page B as a vote, by page A, for
page B. But, Google looks at more than the sheer volume of votes, or links
a page receives; it also analyzes the page that casts the vote. Votes cast by
pages that are themselves “important” weigh more heavily and help to
make other pages “important.” (Google, 2004c)
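The voting logic described in this passage can be illustrated with a minimal sketch. It is not Google's implementation: the function name, the damping value of 0.85, the fixed iteration count, and the three-page link graph are assumptions chosen for illustration only. The sketch shows the central idea that a link counts as a vote and that votes from highly ranked pages weigh more.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Illustrative PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    `damping` and `iterations` are assumed values, not Google's settings.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own score among the pages it "votes" for,
                # so a vote from a high-ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page Web: A and C both link to B, and B links to A.
example_links = {"A": ["B"], "C": ["B"], "B": ["A"]}
scores = pagerank(example_links)
```

In this hypothetical graph, page C receives no links and therefore ends with the lowest score, while page A ranks above C solely because it receives the single vote of the well-linked page B.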
time of Google’s launch, most search engines relied heavily on how often a word
However, instead of simply scanning for the occurrence of a word within the
page, Google analyzes the full content of a page and factors in font size, header
levels, and the relative location of each word in order to measure its importance.
For example, a page with the search term in the title or in large, bold font will be
considered more relevant than a page where the word appears only in a footnote.
designed to make it easy for users to enter search queries and interpret results.
whether the corresponding Web pages will satisfy their needs. Google has
increasingly incorporated a “One Box” result for select searches, in which the first
Figure 7: Partial Google Web search results page for “Boston subway” showing
“One Box” with additional navigational links (circled).
size and reach of its index. While Google no longer publishes its index size, in
2005, it claimed over 8.1 billion pages, and it is estimated that it has indexed
nearly 70% of the Web (Sullivan, 2005). In addition to html-based Web pages,
Google’s Web search also indexes and provides results for Adobe Portable
As noted above, Google maintains server logs recording each Web search
request processed through its search engine (Google, 2005j). These logs contain,
at a minimum, the IP address, date and time, browser type and operating system,
cookie ID and the specific search terms for each of the 100 million Web search
queries that Google processes daily. The individual search terms within the logs
are a mix of the mundane and stimulating, the trivial and the informative. While
over half of searchers say they split their searches among those that are “for fun”
and those that are “important” to them (Fallows, 2005), users are increasingly
using the Internet and search engines to help them make important decisions or
negotiate their way through major episodes in their lives (Horrigan & Rainie,
2006). In such potentially personal and sensitive circumstances, the terms for
which users search, along with the results they decide to click on, are stored in
Google’s server logs. Whether a user searches for teen pop star “Lindsay Lohan”
IP address, cookie ID, and possibly a user account in Google’s server logs.
clickstream data, including which search results or advertising links a user clicks
(Google, 2005i).
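To make concrete how separate queries become linkable once they share an identifier, the following sketch models a log entry with the minimum fields listed above and groups queries by cookie ID. The class name, field names, and sample values are hypothetical; Google's actual log format and processing are not documented here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchLogEntry:
    """Hypothetical record holding the minimum fields named in the text."""
    ip_address: str
    timestamp: str
    user_agent: str   # browser type and operating system
    cookie_id: str
    query: str


def queries_by_cookie(entries):
    """Collect the search terms logged under each cookie ID, in log order."""
    history = defaultdict(list)
    for entry in entries:
        history[entry.cookie_id].append(entry.query)
    return dict(history)


# Invented sample entries; the IP addresses come from a documentation range.
sample_log = [
    SearchLogEntry("192.0.2.10", "2006-05-03 10:15:00",
                   "Firefox 1.5; Windows XP", "cookie-a", "lindsay lohan"),
    SearchLogEntry("192.0.2.10", "2006-05-03 10:20:00",
                   "Firefox 1.5; Windows XP", "cookie-a", "boston subway"),
    SearchLogEntry("192.0.2.77", "2006-05-03 11:00:00",
                   "Safari 2.0; Mac OS X", "cookie-b", "bill gates"),
]
profiles = queries_by_cookie(sample_log)
```

Even this simple grouping shows why a persistent cookie matters: without it, each query is an isolated event, whereas with it the queries form a running history tied to one browser.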
Personalized Homepage
ability to customize the default Google home page to display their Gmail inbox,
local weather, local cinema times, news headlines, stock quotes and other
and content from across the Web, in ways that are useful to [its] users” (Google,
2005m). At its launch, Google CEO Eric Schmidt predicted that Personalized
Homepage will become “the definition” of Google and that “it will become a
central part of Google…A majority of people will eventually use Google like
A user can click the “Personalize this page” link from the Google
cookie on the user’s browser with an expiration date of 2038 to ensure that the
use of a Google Account, advising users that they can “save this page and take it
76
The Personalized Homepage is accessed by visiting
www.google.com/ig.
information or intellectual interests with Google. For example, to deliver local
weather forecasts or movie theater information, the user’s zip code must be
her religion could be divulged if her Personalized Homepage included one of the
many religiously themed modules, such as Christian Today or The Hindu.
Similarly, the selection of one of the many foreign language news modules
(which include French, Chinese, Korean, Spanish, Russian, and many more)
could help establish the ethnicity or national origin of a user. A user’s hobbies and
ESPN might indicate a sports enthusiast, while a technophile might opt for the
Wired News module, and so on. Numerous financial and stock market modules
are available for tracking a user’s stock holdings, and submitting the same address
to the Mapping and Directions module might reveal a user’s home or work
address. Even a user’s sexual orientation could be inferred if any one of the
incentive for users to log into Personalized Homepage with a Google Account, all
such information can be associated with the user’s Google Account profile.
Google Alerts
Google Alerts are e-mails automatically sent by Google when there are new results
for specific search terms selected by a user (Google, 2004a). An e-mail account is
required to set up a Google Alert, and Google encourages the creation of Google
Accounts to better manage multiple Alerts. Users can set up any search query as
an Alert, and track the query results from Google’s index of Web pages, news
tracking medical advances, [or] getting the latest on a celebrity or sports team”
(Google, 2004a).
personal and intellectual information to Google. For users curious about their
personal Web presence, a common Alert would include their name or even their
mailing address.77 Users can submit much more detailed search queries than the
a more detailed glimpse into their personal and intellectual interests. All of a
user’s Alerts are associated with either a user’s e-mail address or Google
Account.
Image Search
77
Google would have no definitive way of knowing it was indeed the
user’s name, although comparison to the user’s e-mail address (commonly a
variant of the account holder’s name) or other information in their Google
Account might allow identification.
78
http://images.google.com
become Google’s second most popular search-related product (Tancer, 2006),
with billions of images indexed and available for viewing (Google, 2005g). Users
can enter image search queries just like traditional Web searches, and the results
up a framed display with a slightly larger version of the thumbnail in the top half
of the page, and the Web page on which the image was found in the bottom half.
From the top frame, users can click the thumbnail to display the full-size image,
remove the frame to display the entire page, or return to their search results.
Just as with searching the Web, a user’s image search queries are passed to
Google along with her unique Web cookie, allowing Google to maintain records
of the images searched. The unique design of Image Search, however, also allows
Google to track exactly on which thumbnail image a user clicked. For example, if
a user searches for “Bill Gates” in Image Search, the first result is a thumbnail
image of Mr. Gates’ portrait from Microsoft’s website. Hovering the cursor over
http://images.google.com/imgres?imgurl=http://www.microsoft.com/presspass/images/gallery/execs/Web/gates-2.jpg&imgrefurl=http://www.microsoft.com/billgates/bio.asp&h=840&w=600&sz=136&tbnid=QTP5Hbhx7_3soM:&tbnh=143&tbnw=102&hl=en&start=1&prev=/images%3Fq%3Dbill%2Bgates%26svnum%3D10%26hl%3Den%26lr%3D%26safe%
Clicking on the result passes this URL to Google’s servers so they can create the
framed display of the Bill Gates photo along with the actual source page. By
logging this URL, Google is able to track which image the user was interested in,
http://www.microsoft.com/presspass/images/gallery/execs/Web/gates-2.jpg,
the search terms used to find that image (the search terms “bill gates” are
information, along with the user’s Web cookie or Google Account, allows Google
to track a user’s particular image searches and subsequent search result clicks.
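The click URL quoted above can be unpacked mechanically. The following is a minimal sketch, not Google's actual logging code, of how the image address, the source page, and the original search terms could be recovered from such a URL with Python's standard library. The function name and the shortened example URL are illustrative assumptions; only the parameter names (imgurl, imgrefurl, prev, q) come from the URL quoted in the text.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened form of the click URL quoted in the text (illustrative only).
CLICK_URL = (
    "http://images.google.com/imgres"
    "?imgurl=http://www.microsoft.com/presspass/images/gallery/execs/Web/gates-2.jpg"
    "&imgrefurl=http://www.microsoft.com/billgates/bio.asp"
    "&h=840&w=600"
    "&prev=/images%3Fq%3Dbill%2Bgates%26hl%3Den"
)


def describe_image_click(url: str) -> dict:
    """Hypothetical helper: recover what a logged click URL reveals."""
    params = parse_qs(urlsplit(url).query)
    # "prev" carries the URL-encoded address of the results page. parse_qs
    # has already decoded it once, so its own query string can be parsed.
    prev = params.get("prev", [""])[0]
    prev_params = parse_qs(urlsplit(prev).query)
    return {
        "image": params.get("imgurl", [None])[0],
        "source_page": params.get("imgrefurl", [None])[0],
        "search_terms": prev_params.get("q", [None])[0],
    }


click = describe_image_click(CLICK_URL)
```

Paired with the cookie or account identifier sent in the same request, a record of this kind links a particular browser to both the query and the specific image selected.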
Google Video
Google Video allows users to upload, search, and watch videos stored on
Google’s servers, as well as download video files for viewing on their own
as PBS, Fox News, and C-SPAN, Google Video now includes video files from
Video enables users to search across the closed captioning content and other
metadata to locate relevant video files. For example, entering the search query
“how to pick a lock” will return a list of relevant video clips on which the search
metadata. While most videos are free, premium content can be rented or
purchased for a fee. In fall 2006, Google agreed to buy rival video sharing site
YouTube for $1.65 billion in stock. While YouTube will remain a separate
service under its own identity, Google video searches include YouTube results as
Google records the particular search terms, and since the video content is
located on Google servers, it is able to monitor and track on which search results a
particular user clicks, and whether he chooses to download the file. If a user
e-mails a video clip to a friend, Google captures the following information in the
browser command:
docid=759726176973245447&q=how+to+pick+a+lock&from=NettyRoe%40gmail.com&to=friend%40email.com&msg=I%20found%20this%20cool%20video&sendToSender=false
The docid number identifies the specific video, the &q field identifies the original
search terms, &from and &to are the respective e-mail addresses, and &msg is the
content of the e-mail sent by the user. This detailed information is passed to
collects and records information about the transaction, including the file name of
the video purchased, the name of the seller, the transaction amount, and the
payment method used (Google, 2006o). Google also implements digital rights
When the video is played, the Player sends this encrypted information to Google,
including the identity of the video, informing Google that a particular user is
attempting to play the video and allowing it to confirm that the copy is authorized
(Google, 2006l).79
79
If a user downloads free videos or purchases non-copy-protected videos,
no account information is embedded.
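To illustrate how much the e-mail sharing request described earlier in this section discloses, the sketch below decodes the quoted query string into readable fields. It is a minimal example under stated assumptions: the function name is hypothetical, and the code only shows what any recipient of the request could read, not how Google processes it.

```python
from urllib.parse import parse_qs

# Query string of the "e-mail this video" request quoted in the text.
SHARE_QUERY = (
    "docid=759726176973245447&q=how+to+pick+a+lock"
    "&from=NettyRoe%40gmail.com&to=friend%40email.com"
    "&msg=I%20found%20this%20cool%20video&sendToSender=false"
)


def decode_share_request(query: str) -> dict:
    """Hypothetical helper: return each field with URL encoding removed."""
    # parse_qs maps each name to a list of values; every field in this
    # request occurs once, so the first value is kept.
    return {name: values[0] for name, values in parse_qs(query).items()}


fields = decode_share_request(SHARE_QUERY)
```

In decoded form, the single request ties a video identifier and a search query to two e-mail addresses and the text of a personal message.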
Google Book Search
search the full text of books scanned into Google’s database, and, depending on
the book’s copyright status, view either snippets or complete pages. The Google
Book Search service remains in a beta stage but the underlying database continues
to grow, with more than 100,000 titles added by publishers and authors and some
10,000 works in the public domain now indexed and included in search results.
Google also has formed partnerships with several high-profile university and
Texas, and the New York Public Library, to digitize and make available their
Because many of the books in Book Search are still under copyright,
Google limits the extent to which a user can view a particular volume. In order to
enforce these limits, users must use a Google Account to access particular pages,
with the books and pages that you’ve viewed” (Google, 2005f). Google Book
these links first routes the browser through Google’s servers before loading the
bookseller’s page, allowing Google to track which online bookstore the user has
decided to visit.
Academic Research
Google Scholar
indexes the full text of scholarly literature across an array of publishing formats
and scholarly fields. Examining a user’s searches within Google Scholar might
tracks the results clicked by searchers in Google Scholar by redirecting all results
through Google’s servers. For example, clicking on a search result for Sergey
Brin and Larry Page’s article “The Anatomy of a Large-Scale Hypertextual Web
GET http://scholar.google.com/url?sa=U&q=http://www.public.asu.edu/~ychen127/cse591f05/anatomy.pdf
The request for the article is first routed through Google’s server, which then
requests the article download from its home server, allowing Google to track the
results clicked. Google Scholar also allows users to identify their home library
(such as New York University) to determine which journals and papers the library
subscribes to electronically, and then links to articles from those sources when
available. Google records the user’s library selection via the Web cookie.
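The redirect technique described above, in which a result click is routed through the search provider before reaching its destination, can be sketched in a few lines. This is an illustrative sketch only, assuming a plain HTTP redirect: the function name, the in-memory log, and the cookie value are hypothetical stand-ins, and the code is not a description of Google's servers.

```python
from urllib.parse import urlsplit, parse_qs

# Stand-in for a server log; a real service would write to persistent storage.
click_log = []


def handle_result_click(request_url: str, cookie_id: str):
    """Hypothetical handler: record the clicked result, then redirect to it."""
    params = parse_qs(urlsplit(request_url).query)
    target = params["q"][0]  # the destination is carried in the "q" parameter
    click_log.append({"cookie": cookie_id, "target": target})
    # Status 302 with a Location header tells the browser to fetch the
    # destination itself; the user sees only a brief detour.
    return 302, {"Location": target}


status, headers = handle_result_click(
    "http://scholar.google.com/url?sa=U&q=http://www.public.asu.edu"
    "/~ychen127/cse591f05/anatomy.pdf",
    cookie_id="example-cookie-id",  # hypothetical identifier
)
```

The design point is that the intermediate server learns which result was chosen even though the article itself is hosted elsewhere.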
Google News
(Wiggins, 2001). Google’s initial response to this increased demand in news and
current events was to provide Google News Headlines, a page that summarized
top news stories from about 100 different publications. Within a year, this evolved
into Google News, a service that scans over 4,500 different news websites in real
time, determines which news stories are related and then clusters them into related
categories. The top stories are highlighted under common categories such as
“World,” “Science and Technology,” or “Health,” and users can execute search
passed to Google along with her unique Web cookie, allowing Google to maintain
records of all Google News searches alongside Web and image searches. Users
can also customize Google News with news from 22 regional editions and 10
For example, a custom news section can be created for any stories that include the
are managed through a Web cookie, but Google also encourages the use of a
Google Account to ensure users can access their customized news from any
computer. In such cases, Google acknowledges that “if you create personalized
news front page settings as part of your Google Account, the settings will be
Google Reader
sites for updates, users can subscribe to their Web feeds,80 and Google Reader
will monitor the sites for updates and display new content in the reading list. A
search box is provided to help users find and subscribe to specific Web content,
and users can organize feeds by using descriptive labels and stars for particular
favorites. Feed subscriptions are stored on Google’s servers and can also be
use Google Reader, and usage statistics can be recorded in Google’s server logs in
accordance with its privacy policy, including a user’s subscribed feeds, searches
Blog Search
users to search for content published on blogs worldwide. As with most Google
services, any search terms used with Blog Search can be recorded in Google’s
server logs along with the browser’s Web cookie or the user’s Google Account.
and commentary” (Google, 2005e), analyzing a user’s Blog Search history might
80
A Web feed is a coded Web file which contains content from the
originating website, typically summaries of news stories or weblog posts with
Web links to longer versions.
yield search terms with unique affinity to his political, social or cultural beliefs,
platform.81 Searching for blog content from this interface sends Google the
referer field. For example, if a user searches for the phrase “marijuana in
Brooklyn” from the Blog Search header that appears on the U.S. Marijuana
Google along with the search terms and Web cookie, allowing Google to record
that this particular browser made this search from that particular blog.
Gmail
gigabytes of storage and the ability to search within messages via Google’s Web
search algorithms. Gmail’s large storage capacity eliminates the need for the
servers. Gmail allows users to maintain contact lists on Google’s servers, and
offers the ability to send instant messages directly from the Gmail interface. Users
have the option to save their chat histories on Google’s servers along with their
email messages.
81
The use of Blogger as a communication medium is discussed below.
At its launch, Gmail was heavily criticized for its practices of scanning the
related pages appear in the right margin of the Gmail interface. Google scans the
text of incoming e-mail messages in order to target the advertising to the user. For
example, if the user is reading an e-mail that contains the text “Atlantic City,”
Gmail might present the user with ads about hotels, casinos, and other websites
related to that travel destination. While Google maintains that “no human will
read the content of your email in order to target such advertisements or other
Gmail terms of use also note that Google may “monitor, edit or disclose your
order to comply with any valid legal process or governmental request” (Google,
2005d).
policy stating that “residual copies of e-mail may remain on our systems for some
time, even after you have deleted messages from your mailbox or after the
Because electronic communications stored for more than 180 days enjoy lower
2004b), the prospect of indefinite storage of Gmail e-mails raises concerns over
the privacy of users’ communications. Google insists this phrasing in the Gmail
privacy policy was simply “poor wording” (Gillmor, 2004), and that while
Google, like most Web-based e-mail providers, keeps multiple backup copies of
users’ emails so they can recover messages and restore accounts in case of errors
or system failure, deleted e-mails are completely removed from Google’s servers
within 60 days (Gillmor, 2004; Google, 2005c). Even with these commitments, a
user’s deleted Gmail still might remain in Google’s “offline backup systems”
(Google, 2005c), and in at least one case, Google has received a subpoena for the
(McCullagh, 2006b).
Groups
Google Groups is a free service enabling users to create and manage their
own email groups and discussion lists. Users can participate in ongoing
discussions related to specific interests, create new groups, and access the Usenet
archive of newsgroups and discussion forums with over 800 million posts on
thousands of topics dating back to 1981 (Google, 2003). While any user can
search and read Google Group discussions, a Google Account is required to post
new messages or create a new discussion group. A user may also create a public
profile which displays her name, nickname, location, title, industry, website or
blog, as well as the most recent posts she made. All postings submitted to Google
Groups are stored and maintained on Google servers, and Google collects and
maintains information about a user’s account activity, including the groups that she
joins or manages, the messages or topics she tracks, her ratings of particular
messages or groups, and her preferred settings when using Google Groups
(Google, 2006g).
Talk
conjunction with Google’s Gmail e-mail service. A Google Account and a Gmail
address are required to access Google Talk, and users have the option of recording
their Talk chat histories within their Gmail accounts. Google records information
about Talk usage, such as when the service is used, contact list members and
those actually communicated with, and the frequency and size of data transfers
(Google, 2006j). Google notes that it deletes the activity information associated
with a user’s account “on a regular basis” (Google, 2006j). The frequency of such
Blogger
was purchased by Google in 2003 and existed somewhat separately from other
access the service. In August 2006, Google updated the service to allow closer
integration with its other products, including the ability for users to use their
Google Account to login and access the Blogger service (Goldman, 2006).
Google encourages the creation of a Blogger profile, which includes information
such as the user’s full name, photograph, birthday, location, and gender, as well as
lists of favorite books, movies, music and so on. All account information and
copies of weblog posts and comments, including drafts, are stored by Google, and
are associated with the Google Account used to access the service (Google,
2005a, 2006b).
list their personal and professional information, create relationships, and join
maintained unique login accounts, Google now requires the creation of a Google
Account to log in and use the Orkut service (Weinberg, 2005). Google encourages
the creation of an Orkut profile that includes personal information, such as gender,
age, occupation, hobbies, interests and photos. All account information is stored
networking service built specifically for use on mobile devices. Dodgeball users
report their locations through their mobile phones, and the service broadcasts their
location to their network of friends. Users can also send personal messages, check
for addresses and interact with other Dodgeball users through the service. To use
Dodgeball, users must register for a Google Account and provide personal
information including name, email address, home city, gender, and mobile phone
information. Dodgeball logs all text messages sent through the service, including
Google Calendar
stored on Google’s servers, and can be accessed from any computer through a
Google Account. To activate the service, a user must provide a first and last
name, preferred default language, and time zone. Google Calendar is closely
integrated with Gmail: When an e-mail that contains trigger words (such as
“meeting,” or dates and times) arrives, Gmail displays an “add to calendar” button
to encourage use of the service. Users can also share their calendars publicly or
complete removal of the event information from their servers “may not be
media (Google, 2006d). In accordance with its privacy policy, Google also
records usage statistics from Google Calendar, such as when and for how long the
service is used, the frequency and size of data transfers, and the number of events
Google Calendar account (including user interface elements, ads, links, and other
other users regarding events, including email addresses, dates and times of the
events, and any responses from invites. Google permanently deletes usage
statistics associated with a user’s account every ninety days, but retains aggregate
Google Finance
information about stocks, mutual funds, and public and private companies. Along
with normal collection of a user’s search activity within Google Finance, a unique
Web cookie is also utilized to provide a Recent Quotes feature, allowing users to
keep track of the stocks and mutual funds recently searched for and viewed. Using
their Google Accounts, users can create a permanent Google Finance portfolio to
keep track of financial information, including how many shares are owned and at
what price, for up to 200 stocks or mutual funds (Google, 2006f). Google Finance
link to their website or blog. Google employees moderate all posts to the
discussion group, and users’ names and e-mail addresses are displayed with their
Shopping and Product Research
Google’s first entrance into e-commerce was its Catalog Search service,
allowing users to search and browse more than 6,000 mail order catalogs archived
recognition, users can search for a text string in these catalogs in a fashion similar
to how they would search for materials on the general Web, and matching results
are displayed as thumbnails of the catalog’s printed pages. Google later launched
Froogle, giving users the ability to search through a database of online retailers,
find multiple sources for specific products, and deliver details, images and prices
for the items sought. Google logs the search queries and results clicked for both
services in order to load the proper catalog page or product detail with a list of
online retailers.82 When using Froogle, users can connect directly to the retailer’s
website to purchase an item. For example, a search for “sniper rifle” might send
users to a page describing the Tokyo Marui PSG-1 Airsoft Sniper Rifle, with a
link to the Supply Tent online retailer. Clicking on the link sends the following
browser instruction:
GET http://froogle.google.com/froogle_url?q=http://www.supply-
tent.com/
product_info.php%3Fproducts_id%3D248&fr=ANUtWOyKpinjJMKIH-
nQCMyR5l5J8b94V3vCeNWMfGyxAAAAAAAAAAA&ei=D2p3ROy-
DZ_kqwLnkM1y&sig2=qwb6mFSHSQNzewweBrZFnQ
Before loading the product page at Supply Tent, the request is routed
through Google’s server, allowing Google to track which store the user has
82
Froogle searches are also included in a user’s Search History if that
service is activated.
decided to visit for more information. Users can also create shopping lists on
Froogle to save and share product information. Google stores users’ shopping list
a locality parameter (such as street address, city name or ZIP code) and provide
search results based upon websites of businesses that have physical addresses
located within the parameter. In spring 2004, Google launched its Google Local
locality, and relevant local search results at the top of a Web search results page
(if a locality can be deduced from the search terms) (Google, 2004b). A year later,
Google launched Google Maps, a dynamic online mapping feature for viewing
detailed street information and satellite images and the creation of driving
directions. Today, the two products have merged into one service, allowing users
to find local search and mapping information in one place (Google, 2005h).
in a Google Local or Google Maps search request is sent to Google along with a
user’s Web cookie and can be stored alongside all other search query information
in Google’s server logs. Recognizing that many users search for information near
their homes or workplaces, a location can be set as the default starting
point for the next location-specific search. To facilitate this, Google adds a
location-specific parameter to the Web cookie that is passed to the browser. For
example, if a user sets “239 Greene Street, New York, NY 10003” as their default
L=0vSX508Toojyru7zEvby35nyXl_cymLkB
Each location has its own L parameter setting, and by adding location data
to the Web cookie, Google can offer location specific results with traditional Web
search queries.
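The effect of adding a location parameter to a cookie can be shown with a small sketch. The layout assumed here, a cookie value made of name=value fields separated by colons, and both function names are assumptions made for illustration; the text documents only that an L parameter is added. The L value itself is opaque and is treated as an uninterpreted token.

```python
def parse_cookie_fields(value: str, sep: str = ":") -> dict:
    """Split a cookie value into fields (the ':' layout is an assumption)."""
    fields = {}
    for part in value.split(sep):
        name, _, data = part.partition("=")
        fields[name] = data
    return fields


def with_location(value: str, token: str, sep: str = ":") -> str:
    """Hypothetical helper: return the cookie value with L set to token."""
    fields = parse_cookie_fields(value, sep)
    fields["L"] = token  # replaces an earlier default location, if any
    return sep.join(f"{name}={data}" for name, data in fields.items())


# "ID=abc123" is a made-up identifier; the L token is the one quoted above.
updated = with_location("ID=abc123", "0vSX508Toojyru7zEvby35nyXl_cymLkB")
```

Because the browser returns the whole cookie with every later request, the location token travels alongside the unique identifier on ordinary Web searches as well.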
Google Desktop
Search, a free downloadable application for locating personal computer files using
messages (if stored on the user’s computer), her Web history, Microsoft Office
documents, instant messenger chat histories, PDF files, as well as music, video,
and image files. After installation, the software completes a full indexing of these
files on the user’s computer. (After the initial indexing is completed, the software
continues to index files as needed.)83 When performing searches, the user receives
results in the browser on a Desktop search results page much like the results for
hard drive, and searches performed do not connect to Google’s online servers, if a
user performs a traditional Web search from the Desktop Search results page, the
83
Only the first 10,000 words of each file, and only the first 100,000 files
are indexed (http://desktop.google.com/support/bin/answer.py?answer=13754).
original Desktop search terms are passed to Google within the referer field. For
Desktop Search to help locate a spreadsheet file on the user’s computer. After
seeing the results page, the user takes advantage of the Web search interface at the
top of the results page to perform a traditional Web search by clicking the
conveniently placed “Web” link. When the new search is executed, the following
code is sent to Google’s servers along with the new search terms and Web cookie:
Referer: http://127.0.0.1:12758/search?q=donations+to+republican+party&flags=32&s=k1C7ekMgaBHFPEm2aaPHD-8pyEw
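To make concrete what a receiving server could read from a referer value like the one above, the following minimal sketch extracts the local search terms. The function name is hypothetical, and the check for a loopback address is an assumption used here to distinguish a desktop search page from an ordinary website.

```python
from urllib.parse import urlsplit, parse_qs


def local_search_terms(referer: str):
    """Hypothetical helper: return search terms leaked by a local results page."""
    parts = urlsplit(referer)
    # 127.0.0.1 is the loopback address, so the referring page was served
    # by software running on the user's own computer.
    if parts.hostname not in ("127.0.0.1", "localhost"):
        return None
    return parse_qs(parts.query).get("q", [None])[0]


terms = local_search_terms(
    "http://127.0.0.1:12758/search?q=donations+to+republican+party"
    "&flags=32&s=k1C7ekMgaBHFPEm2aaPHD-8pyEw"
)
```
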
Even though the user had searched her desktop files for documents referencing
contributions to the Republican Party, the subsequent Web search provides that
with a “Search Across Computers” function allowing users to search and access
information from all of their computers that have Google Desktop installed. Once
Google explains:
84
This feature is automatically activated in Google Desktop 3.0, but can
be turned off via the Desktop Preferences control panel.
85
The Search Across Computers feature is not automatically activated and
must be enabled and authenticated through the Google Desktop preferences. A
Google Account is required to activate and access the service.
This is necessary, for example, if one of your computers is turned off or
otherwise offline when new or updated items are indexed on another of
your machines. We store this data temporarily on Google Desktop servers
and automatically delete older files, and your data is never accessible by
anyone doing a Google search. (Google, 2006e)
To help protect user privacy, the data is encrypted in transmission and while
stored on Google servers, and is retained for only 30 days. However, privacy
Foundation:
If you use the Search Across Computers feature and don’t configure
Google Desktop very carefully—and most people won’t—Google will
have copies of your tax returns, love letters, business records, financial
and medical files, and whatever other text-based documents the Desktop
software can index. The government could then demand these personal
files with only a subpoena rather than the search warrant it would need to
seize the same things from your home or business, and in many cases you
wouldn’t even be notified in time to challenge it. Other litigants—your
spouse, your business partners or rivals, whoever—could also try to cut
out the middleman (you) and subpoena Google for your files. (Foundation,
2006)
It is unclear whether the data stored on Google’s servers are retained on “offline
backup systems” past the 30-day window (similar to Gmail messages), or whether
Google employees are able to decrypt the files if they are subpoenaed or
Internet Browsing
Bookmarks
Web pages on Google’s servers. Users can create Bookmarks by “starring” a page
from their Search History, through a Bookmark javascript added to their Web
browser, or via the Google Toolbar. Bookmarks can be accessed from any
computer by logging into a Google Account, and can also be added to a user’s
http://www.google.com/bookmarks/url?url=http://www.nytimes.com
&ei=m-
OBROu3DIre4QGE38ygDA&sig2=N_8dXWKvMIvdU6iQUVKujA&zx=JzzAy4Ao9g
M&ct=b
The request to load The New York Times webpage is first routed through Google’s
Notebook
Google Notebook is a browser tool that provides users with the means to
save and organize notes while browsing online. Users can clip text, images, and
links from Web pages, save them to an online “notebook” that is accessible from
any computer, and share them with other users. A Google Account is required to
use Notebook, and all notes and annotations are stored on Google’s servers.
Google Toolbar
integrated into the Google Toolbar. Google Toolbar is a browser plug-in allowing
users to perform Google searches and other functions without visiting the Google
homepage, either using the toolbar’s search box or right-clicking on text within a
Web page. The Google Toolbar has been downloaded by millions of users86, and
Toolbar with Sun’s popular Java Web software87 (Shankland, 2005), as well as
with Dell computers to pre-install the Toolbar in all Dell personal computers
(Olsen, 2006).
to Google’s servers. The advanced features of the Google Toolbar are PageRank,
sends Google the addresses of every website visited by the user. The AutoLink
feature scans the content of a visited webpage. If Google recognizes certain types
of information on the page (addresses, ZIP codes, ISBN numbers, etc.) AutoLink
monitors the words users type into Web forms in order to correct any spelling
mistakes. Finally, WordTranslator sends to Google the English words that users
identify with the mouse and provides translations into Chinese, Japanese, Korean,
86
Download statistics from Google’s webpage are not available, but over
3 million downloads of the Toolbar have been recorded at Download.com alone
(http://www.download.com/Google-Toolbar/3000-2379_4-10056938.html).
87
The Java Runtime Environment is downloaded 20 million times per
month (Shankland, 2005).
88
PageRank is a patented algorithm for calculating the relative importance
that Google assigns to a page.
French, Italian, German, or Spanish. With these various tools activated, Google is
The Toolbar’s Safe Browsing feature alerts users if a Web page appears to
be asking for personal or financial information under false pretences. When used
in Enhanced Protection mode, the Toolbar will send the URLs of all pages visited
and information about the page to Google for evaluation. When the user is warned
about a suspicious site, Google will log that site’s URL and the user’s decision to
accept, reject, or close the warning message. Toolbar also features a pop-up
blocker that monitors all Web pages visited in order to prevent unwanted
Toolbar sends a website’s URL to Google, it is possible that the URL may itself
website might “embed” that personal information into its URL, typically after a
question mark (“?”). When the URL is transmitted to Google, its servers
automatically store the URL, including any personal information that has been
With Toolbar, users can e-mail Web content and URLs to other users with
the “Send to Gmail” button, or automatically create blog posts with the “Send to
Blogger” option. Users can also automatically create Bookmarks from the
89
Specifically, AutoLink and SpellCheck only send snippets of text when
their respective buttons are clicked, and WordTranslator sends words for
translation only when the mouse is hovered over the text to request. When
PageRank is activated, Google automatically receives the URLs for all webpages
visited.
Toolbar. To use these features, users’ Google Account or Blogger account
Toolbar sends with these other Google accounts (Google, 2006k). Users can also
send text messages directly from the Toolbar; when that feature is used, Google
logs the number and carrier to which the message is sent, and in some cases may
Web Accelerator
load times for faster Web browsing. While not directly related to Web searching
only the updates if a Web page has changed slightly since it was last viewed,
When using Web Accelerator, all non-secure Web page requests are
routed through Google’s servers, along with information such as the date and time
of the request, the user’s IP address, and computer and connection information.
Google stores and uses this information to help predict and prefetch additional
relevant Web content. Depending on how particular websites are set up, it is
possible that personally identifiable information embedded in the URL might also
be processed through and stored within Google’s servers. Google might also
temporarily cache other sites’ Web cookies when prefetching certain page
APPENDIX B
Elizabeth “Libby” Doe and Annette “Netty” Roe. Libby and Netty are nearly
Both are 30-year-old, single, gay South Asian women. Both are Hindu, live in
Brooklyn, New York, and tend to vote for Democrats. Libby and Netty are
graduate students at New York University, studying political science and feminist
theory. They enjoy sports and cooking as hobbies; both are thinking of having a
baby, but have concerns due to being diabetic. They have similar investment
portfolios, enjoy keeping in touch with friends, and like to share photos and
word-of-mouth, written, and oral correspondence. While not averse to using the
Internet, when Libby needs to find information on a topic, she prefers visiting the
library. Netty, on the other hand, relies heavily on the Internet to manage
information and communicate with others. When Netty needs information about a
topic, she “Googles” it. In fact, Netty relies on Google’s broad array of products
and Netty inevitably share bits of personal information with others. The following
sections will describe these flows of personal information within each of the nine
Information Practices
Libby’s primary source for information is the library. Visits to the local
branch of the Brooklyn Public Library are almost a daily occurrence; many of the
staff librarians greet her by her first name when she arrives. Libby often browses
the new fiction shelves, flipping through books of interest, occasionally checking
out one or two. Lately, Libby has been searching the library’s computerized card
catalog for resources on pregnancy and childbirth, as well as how diabetes might
become a complication. She has found useful books at the library, reading some
there, checking others out for use at home. Libby often complements her visits to
the library by spending afternoons at a local bookstore, browsing their books and
magazines. Libby does use the Internet and search engines to help find
In all, Libby relies on the library’s judgment, and prefers to find her information
there.
Netty, on the other hand, relies almost exclusively on the Internet for her
information and research needs, and often explores the Web just to see what kind
of new and interesting things she can discover. A dedicated Google search engine
user, Netty has learned to trust its results, and has integrated a variety of its
products and services into her Web searching practice. She likes to take advantage
of Google’s Personalized Search and Search History services to help improve her
search results and recall past searches. She also frequently uses Google’s specialty
search products for images, videos, and books. Recently, Netty has searched for
cooking and her favorite sports. She also takes advantage of Google Alerts, a
service that sends her an e-mail whenever new pages are added to Google’s index
that match certain search queries, such as “diabetes and pregnancy.” Netty enjoys
browsing through bookstores, and occasionally visits the public library, but she
remains dedicated to the wealth of information at her fingertips via the Internet
and Google.
Information Flows
merchants (newsstands, etc.). The type of information Libby shares with these
purchased. It would be extremely rare for a librarian to require a user provide
however, librarians are guided by a strict code of ethics dictating the transmission
principles of patron data. Similarly, while Libby can browse books and magazines
to purchase an item (unless paid for with cash). Using the computers at school do
in these informational norms. Rather than the disparate set of agents with whom
Libby interacts, Netty relies almost exclusively on Google, and any personal
outlined in its privacy policy. This collection is made automatic and constant
through the Web cookies associated with Netty’s Google account, allowing the
Academic Research
Information Practices
for her academic research, with the New York University library system
supplementing the Brooklyn Public Library. Libby makes use of the research
librarians at NYU to help find resources, and she often reads and checks out
books related to her studies in political science and feminist theory. She also
While sharing Libby’s use of the printed political science canon, Netty
also relies heavily on the Internet for her academic research. Like Libby, Netty
utilizes the NYU libraries and the access they provide to online academic
search engine that indexes the full text of scholarly literature across an array of
her home library in Google Scholar, Netty can easily determine which journals
and papers the library subscribes to electronically, and simply click the link
Information Flows
interact with a reference librarian for her academic research. The type of
information Libby shares with a librarian is limited to the titles of the reading
require that a user provide personally identifiable information in order to help guide
her research. Again, the librarian is guided by the ALA Code of Ethics regarding
research agenda. Her affiliation with New York University can also be logged in
Information Practices
magazines and television. She subscribes to the printed version of the New York
Times and reads it daily on the subway. During visits to the library, Libby often
gay news magazine. Libby also likes to keep up to date on political issues and
and often glances at issues of Harper’s and The Nation while visiting a bookstore.
Libby also flips through sports and health-related magazines to stay current on
news related to those aspects of her life. Along with the New York Times and the
New Yorker, Libby frequently picks up a free copy of the Village Voice and other
activities. In addition to these print sources, Libby also watches local television
news on a nightly basis for local news and weather updates. She occasionally
watches cable news providers as well, especially for breaking events, and listens
to National Public Radio every morning. Finally, Libby does get some news from
Internet sources, but tends to view only the websites of her other news sources,
Not surprisingly, Netty stays current on news and political events via the
Internet. Her starting point for news information is Google News, a service that
scans over 4,500 different news websites in real time and organizes them in
particular topics within Google’s index of daily news stories, and takes advantage
of the ability to create custom sections for articles that match certain keywords
Homepage product to have access to her preferred news stories right on the main
Google search page. She has activated news modules delivering the New York
Times headlines, gay and lesbian coverage from The San Francisco Chronicle,
and articles from the online version of The Hindu. Netty also has local New York
sports and weather modules activated on the homepage. To stay up-to-date with
breaking news, Netty has subscribed to receive e-mail Alerts when certain phrases
Along with traditional media outlets, Netty also reads many weblogs for
worldwide, she frequently uses Google’s Blog Search service. Due to the large
number of blogs she likes to follow, Netty also utilizes Google’s Reader service,
which allows her to subscribe to various blog feeds and read them all from
Google’s interface.
Information Flows
mailing address and billing information to fulfill those subscriptions. They do not,
however, have any means of knowing what articles she reads, whether she writes
notes in the margins, copies pages, tears them out to share with others, and so on.
newspapers she reads at the library or at newsstands, as well as the news and
political information she receives from television or radio. When she does visit
news-related websites, tracking cookies from those sites might monitor her
activities.
Netty’s reliance on Google News allows Google to track all of her news-
related search queries, as well as which articles she clicks on in order to read.
Google can also gain insight into her interests based on the keywords she selects for
Alerts, the feeds she subscribes to on the Reader, as well as the commentary she
Information Practices
Like most students, Libby utilizes her NYU e-mail account for academic-
related communication. While she also sends and receives occasional e-mails to
friends and family from her NYU account, Libby is a habitual letter writer,
preferring to send notes and the occasional photos via regular mail. She also likes
mailing postcards from her travels. Libby does not use instant messenger, but
does make frequent cell phone calls to keep in touch with her friends.
Netty uses Google’s Gmail e-mail service for all her e-mail needs; all
messages sent to her NYU account are automatically forwarded to her Gmail
messaging, using both AOL’s Instant Messenger service and Google’s Talk
messaging system. Netty often e-mails photos of her travels to friends, which is
easy for her to do with Google’s Picasa photo management software, and she has
started to experiment with Google’s new Hello instant messaging service for
photos. Netty has embraced some of the latest interactive and self-publishing
theory, Hinduism, and pregnancy, and she uses Google’s Blogger publishing
these topics and other events from her daily life. Netty also has accounts with
various online social networking sites, including Google’s Orkut service, where
she finds people and joins communities who share her hobbies and interests.
Finally, Netty has signed up for Google’s dodgeball service, which allows her to
quickly communicate and coordinate social activities with her friends via her
mobile phone.
90 Netty signed up for Gmail using Google’s mobile phone text-messaging
feature, which also allowed her to associate her mobile phone number with her
Google Account in order to use Google Mobile services as they become available.
Information Flows
While Libby’s e-mail traffic can be tracked by NYU, the majority of her
recipient of her letters or postcards. Similarly, her cell phone provider tracks her
usage for billing purposes, but the content of the conversations remains private.
Netty’s use of Gmail means all of her incoming messages are scanned for
placement of contextual ads, and any clicking of those ads is tracked by Google.
Her list of friends and IM messages are also archived on Google’s servers, as are
the e-mails and messages sent to friends through Picasa or Hello. Google also
retains a record of all of Netty’s activity in her various Groups, including what
messages she clicks on to read, as well as those she posts herself. A log of all her
blog posts is maintained, as is the activity she engages in via Orkut or dodgeball.
Information Practices
Libby relies on a written date book to manage her personal and
school schedules. While she receives many notices of events and activities via e-
mail (especially school-related), Libby transcribes them to her calendar. She also
uses her date book to manage to-do lists and contact information for family,
Netty maintains a calendar for her personal events and activities online
using Google’s Web-based Calendar product, which can send reminders to her
mobile phone. Netty also keeps a contact list of her friends and colleagues in her
Information Flows
Since Libby keeps her personal data offline, they remain almost entirely
private; someone would have to gain physical access to her date and address
Information Practices
Netty uses Google Finance to manage her personal stock portfolio, and has
of her portfolio. She often reads financial news from the Google Finance
holdings.
Information Flows
Other than the brokerage companies, who have strict laws regulating
Information Practices
Libby does the vast majority of her shopping and product research at
traditional retail storefronts. She often uses her frequent shopper card when
making purchases at Barnes & Noble or other retailers. She frequently browses
popular magazines for shopping ideas, and occasionally uses the websites of
However, she prefers to purchase at stores so she can examine the products in
person.
auctions. Froogle allows Netty to find multiple sources for specific products,
examine product information and images, and compare prices for the items
the online seller’s site and complete a purchase. Other times, she performs a
traditional Google search for the item to see whether other websites have
links” that appear in the margins of her Google search results if the link appears to
Information Flows
While Libby can “window shop” anonymously, retailers can track her
purchases for marketing purposes. Her purchasing habits at Barnes & Noble, for
example, can be monitored through her frequent shopper card. Any of her
Since Netty shops online with greater frequency, almost all of her
searches on Froogle (not just purchases). Google also tracks her clicks on
sponsored links.
Information Practices
Libby has a home desktop computer for schoolwork and general Internet
use. She relies on the traditional “folder system” of her Windows operating
system to organize and access her computer files. If she needs to work on files
away from home, she copies them onto a USB flash drive for portability.
her computer, freeing her from having to remember in which directory or folder
she saved them. Using the familiar Google search interface, she can quickly locate
any of her Microsoft Office documents or archived PDF files, as well as music,
video and image files. When performing a desktop search, Netty also often
receives sites from her search history in the results. Because she also frequently
Desktop Search’s “search across computer” function so she can access her home
Information Flows
media, her computer files are not shared with any third party. Netty’s use of
Desktop Search means some of her computer file searches can be logged by
Google. Her entire index of files is also stored (in cryptographic form) on
Google’s servers so she can access files remotely from any computer.
Internet Browsing
Information Practices
While Libby tends to prefer visiting the library, reading the printed
newspaper, and shopping inside retail stores, she is not completely averse to
surfing the Internet. As noted above, she occasionally browses the Web from her
home computer and bookmarks frequently visited websites. She takes advantage
however, Libby is considered a novice Internet user, utilizing it for only the most
basic of tasks.
History service described above, Netty uses Google Bookmarks to save and
organize her bookmarked Web pages on Google’s servers, including easy access
Notebook browser tool that provides her a way to save and organize notes while
browsing online. With Notebook, she can clip text, images, and links from Web
pages while browsing, and save them to her online “notebook” for easy access
Many of the information tasks described above have been integrated into
the Google Toolbar, which Netty has installed in her Web browser to make
using Google’s services easier. From the Toolbar’s search box, she can perform
searches from many of Google’s sites and receive useful suggestions based on
popular Google searches. Netty frequently uses other helpful Toolbar buttons to
quickly bookmark a page, share Web pages via e-mail, send a text message, create
a blog entry, subscribe to a site’s news feed, send an e-mail, and even check the
PageRank of the site she is visiting. Google Toolbar stores Netty’s address and
credit card information, enabling her to fill out Web forms with a single click. It
asking for her personal or financial information under false pretences. Finally,
Netty has installed Google’s Web Accelerator application to help make the Web
Information Flows
As with most Web users, Libby’s Internet activities can be tracked and
logged by the various sites she visits via their Web cookies. The same applies to
Netty’s surfing activities, with the addition of her bookmarks and browsing notes
also being stored on Google’s servers. Further, her use of the Toolbar and Web
Accelerator allows Google to monitor and log every website she visits.