
Sponsoring Committee: Professor Helen Nissenbaum, Chairperson

Professor Alex Galloway


Professor Siva Vaidhyanathan

THE QUEST FOR THE PERFECT SEARCH ENGINE: VALUES,

TECHNICAL DESIGN, AND THE FLOW OF PERSONAL

INFORMATION IN SPHERES OF MOBILITY

Michael T. Zimmer

Program in Media Ecology


Department of Culture and Communication

Submitted in partial fulfillment


of the requirements for the degree of
Doctor of Philosophy in the
Steinhardt School of Culture, Education, and Human Development
New York University
2007
Copyright © 2007 Michael T. Zimmer
ACKNOWLEDGEMENTS

I would like to thank my committee chair and mentor, Professor Helen

Nissenbaum, whose insights, encouragement, friendship and patience have made

this dissertation possible. I also extend my thanks to Professors Alex Galloway

and Siva Vaidhyanathan for providing me with valuable guidance, challenges, and

new approaches to this dissertation, and Professors Brett Gary, Ted Magder, and

Charlton McIlwain for helpful feedback during the proposal process.

A special thank you to my cohorts, Rob Jones, David Parisi, Devon

Powers, and Maryann Tsokantas, who, along with Professor JoEllen Fisherkeller,

helped shape this project from its earliest days. Additional thanks to Melissa

Aronczyk, Sam Howard-Spink, Alice Marwick, Joseph Reagle, Jessie Shimmin

and the rest of the Ph.D. student community for their helpful feedback and

camaraderie throughout this process. I owe particular gratitude to Cheryl Casey

and Tim Weber for their friendship and support throughout these hectic years.

I extend my appreciation to the various institutions and forums that have

allowed me to present this work in various forms, including the Information

Society Project at Yale Law School, the Philosophy Departments at the Delft

University of Technology, the University of Twente, and the Royal Institute of

Technology in Stockholm, the Association of Internet Researchers, the

International Conference of Computer Ethics: Philosophical Enquiry, the Media

Ecology Association, the National Communication Association, the Society for

Philosophy and Technology, and the Society for Social Studies of Science.

Additional thanks to Geoffrey Bowker and Susan Leigh Star at the Center for

Science, Technology, and Society at Santa Clara University.

Funding for this research was generously provided by the Phyllis and

Gerald LeBoff Doctoral Fellowship, the PORTIA Project, and a National Science

Foundation Dissertation Improvement Grant.

Most of all, I would like to extend my profound thanks to my family –

both the Zimmers and the Laydes – for their unending love and support, and

especially to Rebecca, my incredible wife, without whose hard work,

encouragement, love, and proofreading this dissertation simply would not exist.

Finally, I thank Ethan, my amazing son, for being a new source of inspiration in

these final few months.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS iii

LIST OF TABLES viii

LIST OF FIGURES ix

CHAPTER

I INTRODUCTION 1

Prologue 1
Spheres of Mobility 3
The Search for the Perfect Search Engine 5
A Faustian Bargain? 8
Overview of Social Research on Search Engines 10
Privacy in Technology 14
Methodology 17
Chapter Outline 21

II PHILOSOPHIES OF TECHNOLOGY: HISTORY, POLITICS,


AND ETHICS 24

Introduction 24
Early Philosophies of Technology 25
Summary 34
Founders of Contemporary Philosophy of Technology 34
Summary 42
Contemporary Philosophies of Technology 43
Media Ecology 43
Social Construction of Technology 49
Politics of Technology 53
Ethics in Technology 64
Summary 71
Chapter Summary: A Faustian Bargain 73

III THE QUEST FOR THE PERFECT SEARCH ENGINE 79

Introduction 79

A Brief History of the Internet and World Wide Web 80
The Internet 80
Hypertext Links 81
The World Wide Web 86
Early Internet and Web Navigation Tools 88
FTP, Gopher, and WAIS 88
Web Directories 90
Web Search Engines 99
How Web Search Engines Work 103
Economics of Web Search Engines 111
The Quest for the “Perfect Search” 114
Perfect Reach 115
Perfect Recall 116
A Faustian Bargain? 117

IV GOOGLE’S QUEST FOR THE PERFECT SEARCH 120

Introduction: Google 120


PageRank 121
Crawling and Mapping the Web 125
Google and the Perfect Search Engine 128
Google’s Perfect Reach 128
Google’s Perfect Recall 130
Capturing User Information 130
Potential Information Captured 134
Anxieties of Google’s Drive for the Perfect Search 142
Anxieties of Perfect Reach 144
Anxieties of Perfect Recall 147
A Faustian Bargain Emerges 149

V CONTEXTUAL INTEGRITY AND THE PERFECT SEARCH


ENGINE 156

Introduction 156
Understanding “Contextual Integrity” 160
Example: Vehicle Safety Communication Technology 165
Contextual Integrity in the Perfect Search 169

VI VALUES AND SPHERES OF MOBILITY 178

Introduction 178
Physical Mobility: Freedom on the Roads 180
Exploration, Autonomy, and Escape on the Roads 183
Threats to Freedom on the Roads 189
Intellectual Mobility: Intellectual Freedom and the Library 193
Intellectual Freedom and the Library Bill of Rights 195

Privacy and the Library Bill of Rights 198
Digital Mobility: Autonomy, Privacy, and Digital Rights
Management 205
DRM, Digital Mobility, and Privacy 211
Convergence of Mobilities 214
Summary 219

VII CONCLUSION: RENEGOTIATING THE FAUSTIAN BARGAIN 221

BIBLIOGRAPHY 229

APPENDICES

A GOOGLE’S QUEST FOR THE PERFECT SEARCH ENGINE:


PRODUCTS AND DATA CAPTURE 264

General Information Inquiries 265


Academic Research 275
News and Political Information 275
Communication and Social Networking 278
Personal Data Management 283
Financial Data Management 284
Shopping and Product Research 285
Computer File Management 287
Internet Browsing 289

B A THOUGHT EXPERIMENT: LIBBY AND NETTY’S


INFORMATION-SEEKING ACTIVITIES 295

General Information Inquiries 296


Academic Research 298
News and Political Information 300
Communication and Social Networking 302
Personal Data Management 304
Financial Data Management 305
Shopping and Product Research 306
Computer File Management 307
Internet Browsing 308

LIST OF TABLES

1 Characteristics of situations where ontological


classification is not advised. 98

2 Early period search engine dates, institutions, and founders. 101

3 Google Suite of Products and Services 152

4 Personal Information Collected by Google’s Suite of Products 154

5 Differences in informational norms within various information-seeking


contexts 175

LIST OF FIGURES

1 The Yahoo! Web directory as seen on Oct 17, 1996 93

2 Search engine mergers and acquisitions 102

3 Typical search engine architecture 103

4 Search results page for "digital cameras" 112

5 Hypothetical Web graph 123

6 Simplified example of a recursive calculation of PageRank 124

7 Partial Google Web search results page for “Boston subway” 266


CHAPTER I

INTRODUCTION

Prologue

In January 2006, it was revealed that, as part of the government’s effort to

uphold an online pornography law, the U.S. Department of Justice had asked a

federal judge to compel the Web search engine Google to turn over records on

millions of its users’ search queries (Hafner & Richtel, 2006; Mintz, 2006).

Google resisted, but three of its competitors, America Online (AOL), Microsoft,

and Yahoo!, complied with similar government subpoenas of their search records

(Hafner & Richtel, 2006). Later that year, AOL released over 20 million search

queries from 658,000 of its users to the public in an attempt to support academic

research on search engine query analysis (Hansell, 2006). Despite AOL’s attempts

to anonymize the data, individual users remained identifiable based solely on their

search histories, which included search terms matching users’ names, social

security numbers, addresses, phone numbers, and other personally identifiable

information. Simple keyword analyses of the AOL database also revealed an

“innumerable number of life stories ranging from the mundane to the illicit and

bizarre” (McCullagh, 2006a). Upon being identified by the New York Times based

solely on her search terms in the AOL database, a Georgia woman exclaimed,
“My goodness, it’s my whole personal life…I had no idea somebody was looking

over my shoulder” (Barbaro & Zeller Jr, 2006).
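The mechanics of this failure are worth pausing over. The brief sketch below is purely illustrative – the log rows, numeric IDs, and matching patterns are invented for this example and are not drawn from the actual AOL data – but it shows why replacing screen names with numbers offers little protection: every query a user typed remains grouped under a single ID, and the query text itself can carry direct identifiers.

```python
import re
from collections import defaultdict

# Invented, AOL-style log rows: (anonymized_user_id, query_text).
# Swapping screen names for numeric IDs still leaves every query a
# user ever typed grouped together under one ID.
log = [
    ("0000001", "landscapers in my town"),
    ("0000001", "homes sold in my subdivision"),
    ("0000002", "jane q public 555-867-5309"),  # invented phone number
    ("0000002", "ssn 078-05-1120 benefits"),    # invented sample SSN
]

# Crude patterns for direct identifiers inside the query text itself;
# actual re-identification, as performed by the New York Times, also
# drew on background knowledge about the searcher.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")

profiles = defaultdict(list)
for user_id, query in log:
    profiles[user_id].append(query)

for user_id, queries in profiles.items():
    identifiers = [q for q in queries if SSN.search(q) or PHONE.search(q)]
    print(user_id, "-", len(queries), "queries,",
          len(identifiers), "containing direct identifiers")
```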

These two cases revealed how the collection of users’ web search

activities posed a challenge to the privacy of one’s online intellectual activities, a

value considered “fundamental to our free society” (Froomkin, 2000, p. 121). Yet,

while these events brought to light the fact that search engine providers routinely

keep detailed records of users’ searches, and created anxiety among some

searchers about the presence of such systematic monitoring of their online

information-seeking activities (Barbaro & Zeller Jr, 2006; Hafner, 2006; Levy,

2006; Maney, 2006), by and large, users continued to flock to the growing suite of

web search-related products and services at an unprecedented rate.1 Thus, a

paradox has emerged: despite revelations that Web search engine companies

increasingly monitor, store, aggregate – and in some cases, share with third

parties – users’ search histories, users continue to embrace and integrate these

services into their daily lives. A faithful Google user interviewed by the New York

Times puts it best: “I don’t know if I want all my personal information saved on

this massive server in Mountain View [Google’s headquarters], but it is so much

of an improvement on how life was before, I can’t help it” (Williams, 2006).

Google’s mission “to organize the world's information and make it universally

accessible and useful” (Google, 2005b) is indeed alluring, but the demands on the

1 In the year since the DOJ case emerged, search engine activity has increased from 5.3 billion searches in February 2006 (Nielsen//NetRatings, 2006) to 6.4 billion in 2007 (Nielsen//NetRatings, 2007), an increase of over 20%. Google, for its part, reported $10.6 billion in revenues during fiscal 2006, compared to only $6.1 billion for 2005, a 73% increase (Google, 1999).

individual to disclose intimate and personal information to the search engine

giant is reminiscent of Faust’s bargaining away his soul to Mephistopheles to gain

access to unlimited knowledge. While it is easy to be seduced by the promises of

a company whose motto is “Don’t be evil” (Google, 2005k), have we, like Faust,

made a deal with the devil?

Spheres of Mobility

In his thesis on justice and injustice, legal scholar Edmond Cahn has

described freedom as “the ability to move, mobility” (1949). Our spheres of

mobility – be they physical, intellectual, or digital – sustain the freedom that

forms the foundation for a just society. Physical mobility relates to movements of

people in geographical space and their ability to navigate and explore new spaces,

escape their daily lives, or achieve newfound autonomy. Intellectual mobility

involves the freedom to learn new things, explore new ideas, adapt, and change

one’s thoughts and beliefs in order to grow and develop intellectually as an

individual. A new digital mobility is also emerging, providing the means to move

within and across digital computer networks, offering a novel means of achieving

both physical and intellectual mobilities in an online world. These mobilities are

intrinsically linked, as cultural historian George Pierson has suggested:

Without spatial movement, no social improvement, either. Our work and
our play, our cities and our countrysides, our taxes and our eating habits, our
pleasures and our pains, our hopes and our fears are inextricably tied up
with mobility. (1973, p. 93)

Within these spheres of mobility, individuals create, discover, and enjoy

spaces for personal growth, exploration, and escape. Central to these mobilities is

the notion that individuals are granted the right to move about and explore new

physical and intellectual terrain relatively free from answerability or oversight. To

foster the freedoms envisioned by Cahn, our spheres of mobility must become

what Hakim Bey has described as “temporary autonomous zones,” the moments

and spaces that elude formal structures of control “in which freedom is not only

possible but actual” (1991, p. 131). Within these spheres, individuals have come

to expect the enjoyment of liberty and autonomy, of un-answerability, self-

determination, and self-definition. Without the free and unfettered opportunity to

move, to navigate, to inquire, and to explore within spheres of mobility, we

cannot gain the sort of understanding of our world and develop the awareness and

competencies necessary for effective participation in social, economic, cultural,

and political life.

Throughout history, various technologies and socio-technical systems

have been developed to foster successful navigation of these spheres of mobility,

ranging from the automobile and related network of roads and highways to provide

physical mobility, to public library systems and their related information services

supporting intellectual mobility, to the Internet and its related protocols that

facilitate mobility across digital computer networks. The latest addition to the

assortment of tools for navigating spheres of mobility is the Web search engine,

providing an interface to new worlds of information, new spaces for

communication and interaction, and new means of experiencing the world. The

prominent role of Web search engines in navigating our contemporary spheres of

mobility demands critical attention, and is a central focus of this dissertation.

The Search for the Perfect Search Engine

As the Internet has become increasingly important to modern citizens in

their everyday lives (see Horrigan & Rainie, 2006), Web search engines have

emerged as an indispensable tool for accessing the vast amount of information

available on this global network. Consider, for example, the Web search engine

Google. Google has become the prevailing knowledge tool for searching and

accessing virtually all information on the Web. Originating in 1996 as a Ph.D.

research project by Larry Page and Sergey Brin at Stanford University (see Brin

& Page, 1998; Page et al., 1998), Google’s Web search engine now dominates the

market, processing almost 3.6 billion search queries in February 2007, over half

of all Web searches performed (Nielsen//NetRatings, 2007).2 Google’s mission,

stated quite simply and innocuously, is to “organize the world’s information and

make it universally accessible and useful” (Google, 2005b). In pursuit of this

goal, Google has developed dozens of search-related tools and services to help

users organize and use information in multiple contexts, ranging from general

information inquiries to academic research, news and political information,

communication and social networking, personal data management, financial data

2 At its peak in early 2004, Google handled upwards of 80 percent of all search requests on the Web through its own website and clients such as Yahoo!, AOL, and CNN that relied on Google for their customers’ search engine results. Google’s share fell to a still dominant 57% in 2004 when Yahoo! dropped Google’s search technology for its own (Hansen, 2004).

management, shopping and product research, computer file management, and

enhanced Internet browsing.3 Consequently, users increasingly search, find, and

relate to information through Google’s growing information infrastructure of

search-related services and tools.4 They also use these tools to communicate,

navigate, shop, and organize their lives. By providing a medium for various

social, intellectual, and commercial activities, “Planet Google” has become a

large part of people’s lives, both online and off (Williams, 2006). In many ways,

the physical, intellectual, and digital mobilities described above converge in Web

search engines, providing new means of physical escape, intellectual exploration,

and digital freedom.

Since the first search engines provided a means of navigating the spheres

of mobility accessible via the Web, a desire emerged to create the “perfect search

engine” – a search engine capable of indexing all available information and

providing fast and relevant results. The quest for the perfect search engine has led

to calls for search engines to provide results that suit the “context and intent” of

the search query. The perfect search engine will have to have “perfect reach” to

deliver any type of online content from all online (and, increasingly, offline)

sources, as well as “perfect recall” to deliver personalized and relevant results that

3 See Chapter IV for detailed discussion of Google’s various products and services.
4 Yahoo!, and to a lesser extent, Microsoft and AOL, also offer search-related tools beyond just locating relevant websites. Google, however, remains the clear market leader at 55.8% of all search activity, with Yahoo! following at 20.7% and Microsoft at 9.6% (Nielsen//NetRatings, 2007). Given its strong dominance of the overall marketplace, and its recognition as the “gold standard” in search engine practices and innovation (Hellweg, 2002; Clark, 2006), Google will be the primary focus of this dissertation.

are informed by who the searcher is. Given a search for “Paris Hilton,” the perfect

search engine will know whether to deliver results about the celebrity socialite,

complete with the requisite image and video files, or a place to spend the night in

France, complemented with photos of the property, maps, and even flight

information. Google recognized early on the importance of designing a perfect

search engine: the company’s very first press release noted that “a perfect search

engine will process and understand all the information in the world…That is

where Google is headed” (Google, 1999). Google co-founder Larry Page later

reiterated the goal of achieving the perfect search: “The perfect search engine

would understand exactly what you mean and give back exactly what you want”

(Google, 2007). Silicon Valley journalist John Battelle summarizes how such an

omniscient and omnipotent search engine might work:

Imagine the ability to ask any question and get not just an accurate answer,
but your perfect answer – an answer that suits the context and intent of
your question, an answer that is informed by who you are and why you
might be asking. The engine providing this answer is capable of
incorporating all the world’s knowledge to the task at hand – be it
captured in text, video, or audio. It’s capable of discerning between
straightforward requests – who was the third president of the United
States? – and more nuanced ones – under what circumstances did the third
president of the United States foreswear his views on slavery?
This perfect search also has perfect recall – it knows what you’ve
seen, and can discern between a journey of discovery – where you want to
find something new – and recovery – where you want to find something
you’ve seen before. (Battelle, 2004)

When asked what a perfect search engine would be like, Sergey Brin replied quite

simply, “like the mind of God” (quoted in Ferguson, 2005, p. 40).
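Battelle’s “perfect recall” amounts, in computational terms, to conditioning results on a profile of the searcher. The sketch below is a toy illustration of that idea – the query senses, topic labels, and user histories are invented for this example, and no claim is made that any actual search engine works this way: an ambiguous query is resolved toward whichever sense best overlaps the topics inferred from the searcher’s prior activity.

```python
# Invented sense inventory for one ambiguous query; a deployed system
# would induce senses and topics from large-scale data, not a table.
SENSES = {
    "paris hilton": {
        "celebrity": {"gossip", "photos", "video", "music"},
        "hotel in paris": {"travel", "flights", "france", "maps"},
    }
}

def disambiguate(query: str, history_topics: set) -> str:
    """Return the sense of `query` whose topic set best overlaps the
    topics inferred from the searcher's prior activity."""
    senses = SENSES.get(query, {})
    if not senses:
        return "no personalization available"
    return max(senses, key=lambda sense: len(senses[sense] & history_topics))

# A searcher with a travel-heavy history gets the lodging reading...
print(disambiguate("paris hilton", {"travel", "flights", "france"}))
# ...while an entertainment-heavy history yields the socialite.
print(disambiguate("paris hilton", {"gossip", "photos"}))
```

Even this toy version makes the privacy stakes visible: the disambiguation only works because the engine retains a topical record of what the user has previously searched.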

A Faustian Bargain?

The perfect search engine, then, reflects an omniscient and omnipresent

ideal, promising to provide a new means of successfully navigating our spheres of

mobility. Yet, while the quest for the perfect search is spearheaded by a company

whose motto is “Don’t be evil” (Google, 2005k), we are reminded by cultural

critic Neil Postman that the true relationship between a society and its technology

is often not purely benevolent, but instead may require a sacrifice for society to

enjoy its benefits, what Postman recognizes as a Faustian bargain:

[A]nyone who has studied the history of technology knows that


technological change is always a Faustian bargain: Technology giveth and
technology taketh away, and not always in equal measure. A new
technology sometimes creates more than it destroys. Sometimes, it
destroys more than it creates. But it is never one-sided. (Postman, 1990)

History has revealed how such a Faustian bargain persists with many technologies

designed to enhance the navigation of our spheres of mobility, including the

physical, intellectual, and digital spheres identified above. Automated toll

collection systems provide efficiencies for physical mobility along the highways,

but also provide a means of tracking a particular vehicle’s location. The

introduction of computer systems in libraries has improved management of

collections and circulation, but also facilitates the recording of each patron’s

borrowing activities. And while originally developed to facilitate Web-based

commerce and enhance digital mobilities, the widespread use of Web cookies has

also facilitated the tracking of users as they navigate the Web.

These technological developments resemble what privacy and surveillance

scholar Roger Clarke (1988) has referred to as technologies of “dataveillance,”

defined as both “the massive collection and storage of vast quantities of personal

data” (Bennett, 1996, p. 237) and “the systemic use of personal data systems in

the investigation or monitoring of one or more persons” (Clarke, 1988, p. 499),

often for the purpose of forming detailed “digital dossiers” (Solove, 2004, p. 2) of

nearly every individual within its reach. While often developed for benign

purposes,5 information technologies that resemble infrastructures of dataveillance

have devolved into Faustian bargains, with concrete effects relating to issues of

social justice and personal freedoms:

[The] impact of dataveillance is a reduction in the meaningfulness of
individual actions, and hence in self-reliance and self-responsibility.
…In general, mass dataveillance tends to subvert individualism and the
meaningfulness of human decisions and actions. (Clarke, 1988, p. 508)

In short, technologies of dataveillance pose a threat to the freedoms envisioned by

Cahn, Pierson, and Bey within our spheres of mobility.

The task of this dissertation, then, is to expose the perfect search engine as

a technology of dataveillance embroiled in a Faustian bargain of its own. While

designed to foster increased navigation within our spheres of mobility, this

dissertation will explore how the quest for the perfect search engine also

empowers the widespread capture of personal information flows across the

Internet. Drawing from historical examples of infrastructures of dataveillance

from other spheres of mobility – including vehicle tracking systems, library

surveillance, and DRM – the dissertation will argue that the quest to create the

perfect search engine constitutes a violation of the contextual integrity of personal

5 Clarke discusses the relative benefits and dangers of dataveillance technologies in more detail in (Clarke, 1988).

information flows, restricting the ability to engage in social, cultural, and

intellectual activities online free from answerability and oversight, thereby

limiting users’ full realization of the levels of autonomy, self-determination, and

self-definition traditionally afforded within our spheres of mobility.

Overview of Social Research on Search Engines

This dissertation’s study of the perfect search engine in relation to its

impact on the freedoms in spheres of mobility represents a unique and significant

contribution to existing research on the impact of these increasingly indispensable

information technologies. The impact of search engines on society and culture has

already received attention from a variety of disciplines. Not surprisingly, initial

research on Web search engines was technical in nature. Numerous computer

scientists have contributed not only valuable research on improving and

enhancing the underlying Web search engine technology (see, for example, Brin

& Page, 1998; Page et al., 1998; Heydon & Najork, 1999), but also technical

analyses of the extent of coverage achieved by search engine products and how it

relates to information access (see, for example, Lawrence & Giles, 1998, 2000;

Kleinberg & Lawrence, 2001).

Some of the earliest social studies of Web search engines emerged from

information scientists attempting to isolate the habits and characteristics of search

engine users through the analysis of transaction log data (Jansen & Pooch, 2001).

These include Hölscher’s (1998) analysis of 16 million queries from the German

search engine Fireball; Jansen, Spink, and Saracevic’s (2000) study of a sample

day’s worth of search activity from the Excite search engine; and Silverstein,

Henzinger, Marais, and Moricz’s (1999) detailed analysis of just under one billion

queries submitted to the AltaVista search engine over a 42-day period. These

studies of transaction log data provide valuable information about search query

structure and complexity, including insights about common search topics, query

length, Boolean operator usage, search session length, and search results page

viewing (see, for example, Spink & Jansen, 2004).

Notwithstanding the value of transaction log data analysis, these types of

studies offer limited insights into the behavior of Web searchers beyond the

search queries submitted. Eszter Hargittai’s (2002; 2004b) use of surveys and in-

person observation of search engine usage helps alleviate these shortcomings,

providing insights into how people find information online in the context of their

other media use, their general Internet use patterns, and their social support

networks. Broadening the analysis of user behavior beyond transaction logs

allowed Hargittai (2004a) to reveal the ways that factors such as age, gender,

education level, and time spent online are relevant predictors of a user’s Web

searching skills. The work of Machill, et al (2004) and Hölscher and Strube

(2000) also combined surveys, interviews, and transaction log analysis to

characterize a number of information-seeking behaviors of Web search engine

users.

Jansen and Spink recognize that the “overwhelming research focus in the

scientific literature is on the technological aspects of Web search” and that when

studies do venture beyond the technology itself, they are “generally focused on

the individual level of analysis” (2004, p. 181). In response, recent scholarship

has moved beyond the technical and individual focus of the user studies described

above to include research into broader cultural, legal, and social implications of

Web search engines. For example, cultural scholars (Wouters et al., 2004;

Hellsten et al., 2006) have explored the ways in which search engines “re-write

the past” due to the frequent updating of their indices and the corresponding loss

of a historical record of content on the Web. Martey (in press) and Roy and Chi

(2003) examine gendered differences in Web search engine use, suggesting that

males and females demonstrate somewhat different Web navigation patterns as

well.

Introna and Nissenbaum’s (2000) seminal study, “Shaping the Web: Why

the Politics of Search Engines Matter,” was among the first to analyze search

engines from a political perspective, noting how search engines have been

heralded as “a democratizing force” that will

…give voice to diverse social, economic, and cultural groups, to members


of society not frequently heard in the public sphere. It will empower the
traditionally disempowered, giving them access both to typically
unreachable nodes of power and to previously inaccessible troves of
information. (Introna & Nissenbaum, 2000, p. 169)

Search engines, then, act as a powerful source of access and accessibility within

the Web. Introna and Nissenbaum reveal, however, that search engines

“systematically exclude certain sites, and certain types of sites, in favor of others,

systematically giving prominence to some at the expense of others” (2000, p.

169).

Such a critique resembles the stance that political economists take against

the contemporary mass media industry (Habermas, 1992; Castells, 1996;

McChesney, 1999), a critique that has recently been extended to Web search

engines. For example, Hargittai (2004b) has extended her user studies to include

investigations of how financial and organizational considerations within the Web

search engine industry impact the way in which content is organized, presented,

and distributed to users. And Van Couvering (2004) has engaged in extensive

research on the political economy of the search engine industry in terms of its

ownership, its revenues, the products it sells, its geographic spread, and the

politics and regulations that govern it. Drawing comparisons to concerns over

market consolidations in the mass media industry, Van Couvering fears that the

market concentration and business practices of the search engine industry might

limit its ability to serve “the public interest in the information society” (Van

Couvering, 2004, p. 25).

Extending from these various social and cultural critiques, Web search

engines have only recently been scrutinized from a moral or ethical perspective. A

recent panel discussion at the Santa Clara University Markkula Center for

Applied Ethics was one of the first to bring together ethicists, computer scientists,

and social scientists for the express purpose of confronting some of the

“unavoidable ethical questions about search engines,” including concerns of

search engine bias, censorship, trust, and privacy (Norvig et al., 2006). A special

issue of the International Review of Information Ethics on “The Ethics of Search

Engines” (Nagenborg, 2005) brought into focus many of the particular privacy

concerns with search engines. Included in this special issue were discussions of

the use of search engines to acquire information about persons, threatening their

privacy in public (Tavani, 2005), as well as a brief introduction to the ability of

search engines to collect user search histories and how this ability might impinge

on privacy and liberty (Hinman, 2005).

Hinman, unfortunately, dedicates far too few paragraphs to concerns over

the tracking of users’ Web search histories, a weakness that this dissertation will

attempt to overcome. Building from these social, cultural, and ethical explorations

into Web search engines, this dissertation will reveal how the quest for the perfect

search engine implicates more than just the privacy of our personal information,

but also, if left unfettered, threatens to impede upon the freedoms enjoyed in our

spheres of mobility.

Privacy in Technology

This dissertation argues that technologies embody values, that their design

bears “directly and systematically on the realization, or suppression, of particular

configurations of social, ethical, and political values” (Flanagan et al., in press). A

diverse community of scholars has recently emerged to understand how the rise of

information technologies bear on moral and ethical values (see, for example,

Shneiderman, 1991; Friedman, 1997; Nissenbaum, 2001; Tavani, 2004; Mitcham,

2005). This research seeks to identify, understand, and address the ethical and

value-laden concerns that arise from the rapid design of information technologies

and their deployment into society. Friedman and Kahn (2002) identify twelve

specific values with moral and ethical import that are often embedded in the

design of information technologies: human welfare, ownership and property,

privacy, freedom from bias, universal usability, trust, autonomy, informed

consent, accountability, identity, calmness, and environmental sustainability.6

Attaining a basic understanding of the value of privacy, and its relationship to

technology, is central to this dissertation’s investigation of the impact of Web

search engines on the flow of personal information.7

One of the oldest legal conceptualizations of privacy is articulated in

Warren and Brandeis’ (1890) seminal essay “The Right to Privacy,” in which they

quote Judge Cooley’s view of privacy as the right to be left alone. Other popular

conceptions of privacy identify it with the control of information about oneself

(Westin, 1970), or with its complement, the control over the degree of access to

oneself (Gavison, 1980). Another view holds that privacy means having control over

one’s entire realm of intimate decisions, including decisions about physical access

to oneself, cognitive access to one’s thoughts, and one’s intimate behaviors

(Inness, 1992). Privacy has been further described as both an individual as well as

a social value (Regan, 1995), and in relation to the system of norms that facilitate

personal expression within domains of private life (Schoeman, 1992).

Roger Clarke (1997) identifies four key dimensions of the concept of

privacy in an attempt to reconcile this plurality of conceptualizations:

6 Friedman and Kahn (2002) argue that these are universal values, although how such values play out in a particular culture at a particular point in time can vary considerably.
7 A more detailed discussion of the relationship between privacy and technology can be found in (Agre & Rotenberg, 1997).

• Privacy of the person, concerned with the integrity of the individual’s
body.
• Privacy of personal behavior, relating to all aspects of behavior, but
especially to sensitive matters, such as sexual preferences and habits,
political and intellectual activities and religious practices, both in private
and in public places.
• Privacy of personal communications, the interest in being able to
communicate with other individuals, using various media, without routine
monitoring of their communications by other persons or organizations.
• Privacy of personal data, the claim that data about oneself should not be
automatically available to other individuals and organizations, and that,
even where data is possessed by another party, the individual must be able
to exercise a substantial degree of control over that data and its use.
(Clarke, 1997)

Condensing and appropriating Clarke’s taxonomy, privacy can perhaps best be

viewed as the interest that individuals have in sustaining a personal space – a

“sphere of mobility” – free from interference by other people and organizations.

Julie Cohen (2003a) has argued that “information about private

intellectual activity has long been regarded as fundamentally private in our

culture, both for reasons related to individual dignity and because of the powerful

chilling effect that disclosure of intellectual preferences would produce” (p. 48;

emphasis added). The privacy of intellectual activity has been tied to the right to

read and consume media products anonymously (Cohen, 1996; Froomkin, 1999),

the right to send and receive communications free from government surveillance

(Regan, 1995, 2001), and the right of unrestricted access to information and open

inquiry free from fear that privacy or confidentiality could be compromised (see

American Library Association, 2006b).

The design of information and communication technologies often poses

significant challenges to the value of privacy of intellectual activities, especially

those that take place while navigating the sphere of the Internet. Concerns over

the impact of technology on the flow of online activities have emerged in the

context of spyware and digital rights management software (Cohen, 2003a),

workplace monitoring of electronic communications (Froomkin, 2000), online

Web cookies and tracking bugs (Kang, 1998; Bennett, 2001), the digitization of

library records (Sturges et al.), and the general expansion of the ability to

aggregate personal activity from all such sources (Garfinkel, 2000; Solove, 2004).

In each of these cases, the design and employment of information technology

challenges the value of the privacy of intellectual activities, a value considered

“fundamental to our free society” (Froomkin, 2000, p. 121). Arising from these

concerns, both scholars and designers alike have attempted to incorporate privacy

into the design stages of technological systems. It is from such efforts that this

dissertation gets its methodological footing.

Methodology

A variety of interdisciplinary approaches have emerged to identify and

account for values in technological design. Examples of such interdisciplinary

efforts include the fields of human-computer interaction (Norman, 1990; Nielsen,

1993; Raskin, 2000), participatory design (Muller & Kuhn, 1993; Sclove, 1995),

reflective design (Sengers et al., 1990, 2005), and critical technical practice (Agre,

1997a, 1997b; Boehner et al., 2005). Recently, new pragmatic frameworks have

emerged to ensure that particular attention to moral and ethical values becomes an

integral part of the conception, design, and development of technological artifacts

and systems. These include Design for Values (Camp, n.d.), Values at Play

(Flanagan et al., in press, 2005), and Value Sensitive Design (Friedman, 1999;

Friedman et al., 2002). Each of these frameworks – which can be referred to

collectively as Value-Conscious Design – seeks to broaden the criteria for judging

the quality of technological systems to include the advancement of ethical and

human values, and to proactively influence the design of technologies to account

for such values during the conception and design process.

These Value-Conscious Design initiatives share a similar methodological

structure that can be summarized by the metaphor of “balls in play” (Flanagan et

al., in press), where attention to three different modes (balls) of investigation must

be maintained and balanced for successful implementation. The Value-Sensitive

Design approach suggests a three-part methodological framework consisting of

conceptual, technical, and empirical modes (Friedman et al., 2002). The

conceptual mode, the first ball in play, involves an analysis informed by ethics and

moral philosophy of the particular value constructs relevant to the design in

question. The second ball in play is the technical investigation of the particular

design specifications and variables that might promote or obscure given values

within the context of the technology being designed. Finally, the empirical mode

of investigation, the third ball in play, has the dual goal of providing measurable
analyses of the values within the particular design context (in support of the
conceptual investigation), and of assessing the success or failure of the attempt to
embody values within technological design (in support of the technical
investigation).

The Values at Play (VAP) approach offers a similar tripartite

methodological framework consisting of discovery, translation, and verification

phases (Flanagan et al., 2005). The goal of the discovery phase is to identify the

values that might be relevant to the design of a particular technology, including

those explicit in the aspirations of the technology’s designers, as well as those that

emerge only when the technological design process is underway. The translation

phase of VAP is the activity in which designers translate the value considerations

identified in the discovery phase into the architecture and features of the

technology. The final phase is verification, ensuring that the designers have

successfully implemented the values identified throughout the discovery process.

The collaborators who developed these Value-Conscious Design

frameworks have enjoyed various successes in designing technologies to protect

human and ethical values. For example, Friedman, Felten, and their colleagues

relied on the Value Sensitive Design methodology to develop Web browser

cookie management tools in support of the values of informed consent and user

privacy (Friedman et al., 2002). Similarly, Camp and her colleagues engaged in

Value-Conscious Design by embedding the value of trust in Web browser tools to

protect Internet users from consumer fraud and identity theft (Camp et al., n.d.;

Camp, 2006). A multidisciplinary team of researchers at New York University
has employed Value-Conscious Design principles with the RAPUNSEL project,
a computer game environment designed for teaching middle-school girls
programming skills to help counter gender inequity in mathematics and computer

science while also embodying values such as cooperation, creativity, privacy, and

independence (Flanagan et al., in press, 2005). Howe and Nissenbaum’s (2006)

“TrackMeNot” Web browser extension was designed to help obfuscate one’s Web

search history records to prevent profiling by search engine providers, fostering

the values of privacy and user autonomy. Finally, Zimmer (2005) applied Value-

Conscious Design principles in an attempt to influence the design of emerging

vehicle safety communication technologies so that the value of privacy would

become a constitutive part of the design process. Notwithstanding various

challenges inherent in engaging with technical design communities (Manders-

Huits et al., in progress), these examples reveal the promise of influencing the

design of new information and communication technologies in order to account

for ethical and human values.
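The obfuscation strategy behind a tool like TrackMeNot can be conveyed in a brief sketch. The following is a simplified, hypothetical rendering of the general idea rather than TrackMeNot’s actual implementation – the seed vocabulary and interleaving ratio are invented for this example: decoy queries composed from innocuous seed terms are interleaved with the user’s real searches, so that the stream a search engine logs no longer cleanly reflects the user’s interests.

```python
import random

# Invented seed vocabulary; TrackMeNot itself draws and evolves its
# decoy terms from public sources rather than a fixed list like this.
SEED_TERMS = ["weather forecast", "pasta recipes", "baseball scores",
              "train schedule", "movie reviews", "gardening tips"]

def decoy_query() -> str:
    """Compose a plausible-looking decoy from one or two seed terms."""
    return " ".join(random.sample(SEED_TERMS, k=random.randint(1, 2)))

def obfuscated_stream(real_queries, decoys_per_real=3):
    """Interleave each real query with randomized decoys. An observer
    of the combined stream sees mostly noise around the real signal."""
    for query in real_queries:
        for _ in range(decoys_per_real):
            yield ("decoy", decoy_query())
        yield ("real", query)

for kind, q in obfuscated_stream(["contextual integrity nissenbaum"]):
    print(f"{kind:>5}: {q}")
```

Note that on this design the real queries are still sent and logged; the protection sought is statistical, diluting the profile that can be inferred, rather than concealment of any individual search.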

This dissertation will follow the lead of these case studies to form the

groundwork for the value-conscious design of the perfect search engine. Two

balls from the value-conscious design methodological toolkit will be put in play:

This work will engage in a conceptual investigation, providing a philosophical

and cultural framing of the values enjoyed in our spheres of mobility, bringing

conceptual clarity and a normative understanding of the ways in which the quest

for the perfect search engine bears on these values.8 The dissertation will also

engage in a technical investigation of the design of the perfect search engine to

uncover how its technological properties and underlying architecture might bear

on the values identified in the conceptual phase. As the Value-Conscious Design

8 This is in line with the “disclosive computer ethics” called for by Philip Brey (2000); see discussion in Chapter II.

methodological frameworks are meant to be iterative, the initial steps taken by

this dissertation will guide future implementations of the translation and

verification stages of VAP, as well as the empirical mode of VSD, with the shared

goal of proactively influencing the design of the perfect search engine to account

for the values necessary for the full enjoyment of our spheres of mobility.

Chapter Outline

Chapter II, “Philosophies of Technology: History, Politics, and Ethics,”

presents the starting point of this dissertation, that the design of technology bears

“directly and systematically on the realization, or suppression, of particular

configurations of social, ethical, and political values” (Flanagan et al., in press).

This chapter will sketch a brief outline of key humanist, social, and philosophical

scholars of technology to construct a framework of understanding how technical

systems and artifacts impact society in ethical and value-laden ways, closing with

the realization that society often makes a Faustian bargain with its technology –

that “technology giveth and technology taketh away” (Postman, 1990).

Given this concern about the Faustian bargain with technology, Chapter

III, “The Quest for the Perfect Search Engine,” discusses the emergence of a

particular socio-technical system for which such a bargain would have a

significant social impact: the Web search engine. This chapter introduces the

Internet and the World Wide Web, and how the Web search engine has become

the “center of gravity” of people’s online intellectual activities. Chapter III closes

with a discussion of the growing quest among Web search engine companies to

achieve the “perfect search,” and the instruments of reach and recall necessary to

achieve this goal.

Chapter IV, “Google’s Quest for the Perfect Search Engine,” focuses the

previous chapter’s discussion on today’s dominant search engine, Google, and its

particular quest for the perfect search engine. As the design features of its

version of the perfect search engine begin to take hold, certain anxieties emerge

that might threaten the values in our spheres of mobility, especially in terms of

privacy and the flow of personal information. However, despite this rising

anxiety, many users prefer to remain citizens of “Planet Google”; thus, a Faustian

bargain emerges between Google and its community of loyal users.

While the previous chapter identifies many of the latent anxieties of the

perfect search, little evidence can be found that these concerns are affecting the

overall popularity and use of Web search engines. Drawing on the concern that

the design of these technologies is being taken “at interface value” (Turkle,

1995, p. 103), Chapter V, “Contextual Integrity and the Perfect Search Engine,”

introduces the theory of “privacy as contextual integrity” to help provide

conceptual clarity to how Google’s quest for the perfect search is disrupting

existing informational norms within the contexts of online information-seeking.

Recognizing that a violation of contextual integrity is present with the

perfect search engine, but lacking the normative framework to determine its harm

or benefit, Chapter VI, “Values and Spheres of Mobility,” provides a historical

and cultural description of spheres of mobility in which individuals have

traditionally enjoyed the ability to engage in social, cultural, and intellectual

activities free from answerability and oversight. This chapter identifies the values

at play within our spheres of physical, intellectual, and digital mobility, and

illustrates how the shifts in informational norms caused by the quest for the

perfect search engine threaten our ability to realize full freedom of mobility

within these spheres.

Chapter VII, “Renegotiating the Faustian Bargain,” concludes with

proposals for how the Faustian bargain between society and the quest for the

perfect search engine might be renegotiated in order to mitigate the implications

for personal privacy and ensure continued freedom within our spheres of mobility.

Potential avenues include regulation via legal solutions, self-regulation by the

search engine industry, and pragmatic intervention with search engine companies

to foster the value-conscious design of these important technologies.

Two appendices are included at the end of this document in support of

these chapters. Appendix A, “Google’s Quest for the Perfect Search: Products and

Data Capture,” outlines twenty-six of Google’s key products and services across

nine information-seeking contexts, briefly describes their history and the

circumstances of their use, and provides a technical analysis of the personal

information captured through the typical use of each product. Appendix B, “A

Thought Experiment: Libby and Netty’s Information-Seeking Activities,”

presents a thought experiment with two ideal-typical information seekers – Libby

and Netty – who differ only in how they navigate their informational spheres.

Comparisons are made between the personal information flows inherent within

their respective information-seeking activities.


CHAPTER II

PHILOSOPHIES OF TECHNOLOGY: HISTORY, POLITICS, AND ETHICS

Introduction

The starting point of this dissertation is the position that technologies

embody values, that their design bears “directly and systematically on the

realization, or suppression, of particular configurations of social, ethical, and

political values” (Flanagan et al., in press). Identifying the social, ethical, and

political dimensions of technology requires a clear understanding of the nature of

technology and its role in nearly all aspects of society. But such an understanding

is rarely transparent. In his book Technopoly, Neil Postman remarked how “we

are surrounded by the wondrous effects of machines and are encouraged to ignore

the ideas embedded in them. Which means we become blind to the ideological

meaning of our technologies” (1992, p. 94). It has been the goal of many

humanist, social, and philosophical scholars of technology to remove these

“blinders” and critically explore the ideologies embedded within technical

systems and artifacts. This chapter will sketch the history and development of

such philosophical explorations of the relationship between technology and

society, bringing together disparate theories from philosophy, sociology, media

theory, science and technology studies, and information science. A complete

survey of the robust contributions to the study of technology and society from
these various disciplinary perspectives, if even possible, is beyond the scope of

this chapter.9 The sections that follow aim to provide a reliable map of this rich

and important intellectual territory and identify the waypoints to help establish the

intellectual footing for the remainder of the dissertation.

Early Philosophies of Technology

While the study of the nature of technology and its effects has become an

object of great interest for philosophers over the past few centuries, technology

has been subject to philosophical investigation for thousands of years. Although

we often view technology as physical tools and machines, the term “technology”

has its roots in the Greek techne, meaning literally “an art or craft” (Williams,

1983, p. 314). In Book VI of the Nicomachean Ethics, Aristotle defines techne as

the rational method involved in producing an object or accomplishing a goal or

objective (1999, p. 1140a11). Techne takes form through art or craftwork –

manual work and physical manipulation of the world. This is contrasted with

episteme, a form of scientific knowledge that exists in a natural, unchanging state

(Aristotle, 1999, p. 1139b15). Aristotle offers the philosophical distinction of his

period, that techne is making or doing, while episteme relates to understanding or

knowing. Put simply, techne involves tools of the body and earth, while episteme

is centered on the intellect and the mind.

In the Phaedrus, Plato (1990) provides a similar juxtaposition of techne

against episteme, the latter considered by Plato, like Aristotle before him, as a

9 More extensive surveys include (Durbin & Rapp, 1983; Ihde, 1993; Mitcham, 1994; Scharff & Dusek, 2003; Kaplan, 2004).

higher form of human knowledge.10 For Plato, techne is merely imitative of

episteme, a lower form of knowledge achieved through manual arts and crafts,

rather than mental and philosophical reasoning. One target of Plato’s critique of

the imitative nature of techne is writing, viewed by Plato as a craft that merely

imitates true knowledge (episteme) through its representation. In the Phaedrus,

Plato invokes a meeting between the Egyptian god Theuth and the pharaoh

Thamus to illustrate his criticism against writing as merely an imitative art.

Theuth, responsible for introducing geometry and astronomy to the world, presents his

invention of writing to Thamus, announcing it as “an elixir of memory and

wisdom.” Thamus, however, disagrees:

Most ingenious Theuth, one man has the ability to beget arts, but the
ability to judge of their usefulness or harmfulness to their users belongs to
another; and now you, who are the father of letters, have been led by your
affection to ascribe to them a power the opposite of that which they really
possess. For this invention will produce forgetfulness in the minds of those
who learn to use it, because they will not practice their memory. Their
trust in writing, produced by external characters which are no part of
themselves, will discourage the use of their own memory within them.
You have invented an elixir not of memory, but of reminding; and you
offer your pupils the appearance of wisdom, not true wisdom, for they will
read many things without instruction and will therefore seem to know
many things, when they are for the most part…not wise, but only appear
wise. (Plato, 1990, pp. 274e-275b)

This parable outlines Plato’s belief that writing is merely an imitator of

knowledge, and produces mere imitators of knowledge in people. For Plato,

writing, indeed any of the arts or crafts that make up techne, fails to encourage

what he viewed as true learning, and instead leads to reliance of things other than

10 Jacques Derrida’s (1981) “Plato’s Pharmacy” provides a close reading – and critique – of Plato’s Phaedrus.

the mind. He fears that memory will become confused with recollection, and that

the true wisdom of episteme will be indistinguishable from the lesser, imitative

knowledge gained via techne.

Plato’s concern about how the imitative arts invite people to rely on techne

for knowledge rather than the efforts of their minds persists centuries later. Neil

Postman has suggested that Thamus would have reacted to Gutenberg’s printing

press in a similar manner as he did regarding writing: with the view that the new

invention would create a vast population of readers who “will receive a quantity

of information without proper instruction…who will be filled with the conceit of

wisdom without real wisdom” (qtd. in Postman, 1992, p. 16). Similarly, many

decry how calculators have replaced memorized multiplication tables (Clayton,

2000), or even how encyclopedias and Web search engines have supplanted the

need to retain knowledge within one’s memory (Schwartz, 1998).

In short, thousands of years before the printing press, the calculator, or the

Web search engine, Plato’s “judgment of Thamus” provided a warning about the

impact of technology on the pursuit of knowledge and attainment of wisdom. As

summarized by Postman:

[Thamus] meant that new technologies change what we mean by


“knowledge” and “truth”; they alter those deeply embedded habits of
thought which give a culture its sense of what the world is like – a sense of
what is the natural order of things, of what is reasonable, of what is
necessary, of what is inevitable, of what is real. (Postman, 1992, p. 12)

Plato’s warning that the achievement of wisdom – episteme – is increasingly

threatened through the introduction of imitative technologies – techne – remains

prescient, whether considering a system of writing introduced during antiquity, or

the development of the World Wide Web in the late twentieth century.

Indeed, Plato’s concern with the introduction of techne into human

pursuits of knowledge was mirrored centuries later in Karl Marx’s social critique

of capitalism. Marx argued that technological changes in the material conditions

of production subjugated human beings to technology, separating them from their

social community and life’s work, so that in the end they had no ownership over

their own lives or the products of their labor. Marx places the modes of

production – technology – squarely in the center of his philosophical conception

of social existence:

In the social production of their life, men enter into definite relations that
are indispensable and independent of their will, relations of production
which correspond to a definite stage of development of their material
forces. The sum total of these relations of production constitutes the
economic structure of society, the real foundation, on which rises a legal
and political superstructure and to which correspond definite forms of
social consciousness. The mode of production of material life conditions
the social, political, and intellectual life process in general. (Marx, 1978a,
p. 4; emphasis added)

The centrality of technology in Marx’s social theory also manifests itself

in his “Economic and Philosophic Manuscripts of 1844” (1978c) which

emphasizes the divisive role played by the rise of industrial capitalism in society.

Marx notes that to examine society properly we must start with its material basis,

“an actual economic fact,” rather than “go back to a fictitious primordial

condition” (p. 71). This economic fact is how “the worker becomes all the poorer

the more wealth he produces, the more his production increases in power and

range. The worker becomes an ever cheaper commodity the more commodities he

creates” (p. 71). The concept of “estranged labor” summed up this enslaved

condition of the worker in a capitalist society: his “loss of reality” through the

objectification of his labor (p. 71), the external aspect of labor in “the fact that

labour is external to the worker…it does not belong to his essential being” (p. 74),

and in the loss of self since labor “operates independently of the self” (p. 74).

Marx argues, “an immediate consequence of the fact that man is estranged from

the product of his labour, from his life-activity, from this species being is the

estrangement of man from man” (p. 77). In an alienated society, the whole mind-

set of men, their consciousness, becomes to a large extent a reflection of the

technological and material conditions in which they find themselves and of the

position in the process of production in which they are variously placed.11

This alienating effect of technology remained central to Marx’s social and

economic analysis, ranging from his discussion of fetishized commodities in

“Capital” (Marx, 1978b) to the exploitation of the working classes in “The

Manifesto of the Communist Party” (Marx & Engels, 1978). Shifts in

technological modes of production had caused human beings to become subjugated to, and alienated by, the structures that emerged. Just as Thamus warned that reliance

on writing would change the very nature of knowledge and wisdom, Marx

11 This appears again in the "Fragment on Machines" in the Grundrisse, where Marx argues that capitalist production moves away from dependency on direct labor and tends toward reliance on the "general state of science and on the progress of technology" (1973, p. 705). Here, Marx argues that "The general productive forces of the social brain" have become absorbed, shaped, and generally subsumed to fit the needs of capital, and indeed become constitutive of capital itself (1973, p. 694).

recognized that technological changes in production changed the very nature of

humans as social beings.

While Marx focused on the role of technology in social and historical

contexts, it was another German thinker, Ernst Kapp, who made technology the

specific focus of philosophical examination in his 1877 publication of

Grundlinien einer Philosophie der Technik (Foundations of a Philosophy of

Technology).12 Contrary to Marx's concern that industrialization, with its new technological modes of production, was an alienating and destructive force, Kapp

took a more Romantic position, arguing that humankind was the center of history

and culture, and technology provided the means to achieve self-awareness in the

rising industrial age. In this work, Kapp developed the idea of Organprojektion,

or organ projection, where technologies have their analogies in the human

organism in appearance, form, and function:13

[T]he intrinsic relationship that arises between tools and organs, and one
that is to be revealed and emphasized…is that in the tool the human
continually produces itself. Since the organ whose utility and power is to
be increased is the controlling factor, the appropriate form of a tool can be
derived only from that organ.
A wealth of spiritual creations thus springs from hand, arm, and
teeth. The bent finger becomes a hook, the hollow of the hand a bowl; in
the sword, spear, oar, shovel, rake, plow, and spade one observes sundry
positions of arm, hand, and fingers, the adaptation of which to hunting,
fishing, gardening, and the field tools is readily apparent. (qtd. in Mitcham,
1994, pp. 23-24)

12 According to Carl Mitcham, this publication marked the coining of the phrase "philosophy of technology" (Mitcham, 1994, p. 20).
13 Kapp's insights predate Marshall McLuhan's more famous notion that technologies are the "extensions of man" by nearly 100 years.

This relationship between technologies and the body, Kapp argued, was the

primary road to self-awareness. He maintained an optimistic view of humankind’s

relationship to the emerging technologies of industrialization, arguing that all of

humanity was connected through its technologies, and that mankind is essentially

a technical species (see Simon, 2003). Kapp’s philosophy of technology, then,

centered on the view that technology was a complex extension or projection of

human faculties and activities, and that the social, the natural, and the

technological were uniquely interwoven.

Kapp’s philosophical treatment of humankind’s natural connection to –

and reflection within – its technology was continued by the American historian

Lewis Mumford. Mumford sought to understand a complex system involving

technology, the natural environment, the individual, and society, each influencing

the others. In writing his groundbreaking Technics and Civilization (1934),

Mumford placed technology squarely within the context of the larger ecology of

our society and culture, and hoped to reveal the connection between the human

spirit and the character of our technological works.

In Technics and Civilization, Mumford describes not simply the work of

inventors and scientists but also the cultural sources and moral consequences of

the breakthroughs in technology and science. Mumford’s approach to technology

is shaped by his quest to explain the origins and prospects of modern culture. He

defines this quest in the opening sentences of the book:

During the last thousand years the material basis and the cultural forms of
Western civilization have been profoundly modified by the development
of the machine. How did this come about? Where did it take place? What

were the chief motives that encouraged this radical transformation of the
environment and the routine of life: what were the ends in view: what
were the means and methods: what unexpected values have arisen in the
process? (Mumford, 1934, p. 3)

Technics and Civilization sets modern technics within this larger framework,

correlating the changes taking place in our physical environment with changes

that were taking place in the mind, as well as in society.

Mumford emphasizes the evolution of the machine and machine

civilization over the course of “three successive but over-lapping and

interpenetrating phases” (1934, p. 109; emphasis in original). Each phase – the

eotechnic, the paleotechnic and the neotechnic – is defined by its characteristic

tools, techniques, materials and sources of energy. The eotechnic era, stretching

from 1000 AD to the dawn of the Industrial Revolution in the mid-eighteenth

century, opened the way for the industrial and scientific revolutions with inventions such as the mechanical clock, the telescope, the printing press, and the magnetic compass.

Mumford admired the people, cities, and cultures of this era who strove for a

harmonious balance between the natural senses and the freedom from labor

provided by technology: “the goal of eotechnic civilization as a whole…was not

more power alone but a greater intensification of life” (1934, p. 149).

This phase gave way eventually to the paleotechnic era, roughly from the

mid-eighteenth century to 1900. This phase was dominated by regimented

science, technology, and capitalism. While the eotechnic society had been

intellectually interested in science and technics, the paleotechnic era transformed

this admirable concern into a determined effort to bring the whole of human

experience under the direction of capitalism and the machine: “there was a sharp

shift in interest from life values to pecuniary values” (p. 153). The technical gains

made during this phase were tremendous: power-propelled vehicles were created,

the use of iron increased, and the mass production of clothes and mass distribution of food became widespread. Size, speed, quantity, and the multiplication of machines were all reflections of the new means and uses of

power in this phase.

The neotechnic phase represents the third development in the machine, but

is harder to define, as Mumford (1934) explains:

Partly because it has not yet developed its own form and organization,
partly because we are still in the midst of it and cannot see its details in
their ultimate relationships, and partly because it has not displaced the
older [paleotechnic] regime with anything like the speed and decisiveness
that characterized the transformation [to] the eotechnic order. (pp. 212-3)

Nevertheless, due to the emergence of the neotechnic age, "the possibilities of

disruption and chaos have increased” (p. 213). Here, the scientific method, whose

chief advances had previously been in mathematics and the physical sciences,

“took possession of other domains of experience: the living organism and human

society. …Physiology became for the nineteenth century what mechanics had

been for the seventeenth: instead of mechanism forming a pattern for life, living

organisms began to form a pattern for mechanism” (p. 216). In short, the concepts

of science, previously associated largely with the “mechanical,” were now applied

to every phase of human experience and every manifestation of life.

For Mumford, modern technology threatens to replace the organic

environment and to sacrifice the last vestiges of individual autonomy to the

imperatives of technological and mechanical adaptation. Thus, while both

Mumford and Kapp shared the notion that a natural relationship can exist between

humans and technology, Mumford countered Kapp’s optimism, arguing that the

technologies of the industrial age prevented such an organic affinity between man

and machine.

Summary

These early philosophies of technology span millennia, but share a

common outlook on the impact of technology on society and culture. Aristotle

and Plato’s concern that an increasing reliance on techne will interfere with

achieving wisdom and attaining the good life is mirrored in Mumford’s position

that the modern technology of our neotechnic age threatens to supplant the

organic systems of the eotechnic past, replacing our close connection to the world with the technological imperatives of industrial society. These alienating effects of technology are repeated in Marx's materialist history, where subjugation becomes the inevitable consequence of society's reliance on

technology. And, while optimistic about the role of technology in society, Kapp

recognizes that mankind has become essentially a technical species – a position

likely shared by all five early theorists of technology.

Founders of Contemporary Philosophy of Technology

By the mid-twentieth century, humanists and philosophers became more

explicitly interested in technology’s impact on modern society. The onset, and

violent end, of World War II provided a clear example of how technology

unbound could, and clearly did, have devastating effects. In line with Mumford’s

thesis, many philosophers stressed that technology was inherently dangerous to

humanity. Important contributions include those of the philosophers Jacques

Ellul, Herbert Marcuse, and Martin Heidegger.

Jacques Ellul is known as a harsh critic of technology, the technological

system, and the technological society. Ellul’s concern is not with any particular

technological artifact, but rather the broader technological phenomenon within

society as a whole. He focuses on technology at the highest level of abstraction, as

a system, worldview, and way of life; the term he uses in this context is la

technique. La technique is “the totality of methods rationally arrived at and

having absolute efficiency (for a given stage of development) in every field of

human activity” (Ellul, 1964, p. xxv). According to Ellul, la technique is

autonomous and automatic, self-augmenting and expanding at an ever increasing

rate, and encompassing every sector of human society. Ellul outlines this new

technological order – the technological society – with the following

characteristics: (a) it is artificial; (b) it is autonomous with respect to values, ideas

and the state; (c) it is self-determining in a closed circle: a closed organization that permits it to be self-determinative independently of human intervention; (d)

it grows according to a process which is causal but not directed to ends; (e) it is

formed by the accumulation of means which have established primacy over ends;

and (f) all its parts are mutually implicated to such a degree that it is impossible to

separate them or to settle any technical problems in isolation (Ellul, 1985, p. 40).

Ellul considers whether this technological society can be “civilized,” and

his answer is decidedly negative. First, technological society is thoroughly

materialistic: “The technical work is the world of material things; it is put together

out of material things and with respect to them. When technique displays any

interest in man, it does so by converting him into a material object” (Ellul, 1985,

p. 45). Further, all that thrives in a technological society is effective power:

“Technical growth leads to a growth of power in the sense of technical means

incomparably more effective than anything ever before invented, power which

has as its object only power in the widest sense of the word” (p. 45). And, finally,

while technique frees humans from many of the limitations of a pre-technological

society, Ellul contends that the result is far from liberating: technique denies

freedom and asserts its own autonomy over all spheres of life – politics,

economics, morality, and spirituality (1964, p. 134). “Technique can never

engender freedom” (1964, p. 46), Ellul argues, since technological society

“prevails” over the human subject, forcing it to succumb “the king of the slaves of

technique” (1964, p. 138).

La technique is a frame of mind that both precedes and leads to particular

technologies; it is the search for the most rational method, the most efficient

system, which excludes all others. The effect of this modern technological

phenomenon is that it seeks to overcome anything that is unpredictable and

spontaneous, i.e., humans and nature. Ellul argues that la technique can be applied

not only to technological machines, but also to humans, organizations,

communication, politics, the economy, leisure, sport – indeed, to all areas of life.

The inevitable result of such domination of technique over every aspect of

existence, according to Ellul, is that everything necessarily serves it: Everything is

subject to it, from procreation to how we eat and grow, where we live, and how we

die (Ellul, 1964, p. 128). Aligned with Mumford, Ellul’s final analysis is

decidedly pessimistic: “Today the sharp knife of [la technique] has passed like a

razor into the living flesh. It has cut the umbilical cord which linked men with

each other and with nature” (Ellul, 1964, p. 132).

In 1954, Martin Heidegger published The Question Concerning

Technology (1977), in which he considered the problem of modern technology

and the driving forces of its essence.14 In a powerful passage from the beginning

of the essay, he presents the problem:

Everywhere we remain unfree and chained to technology, whether we


passionately affirm or deny it. But we are delivered over to it in the worst
possible way when we regard it as something neutral; for this conception
of it, to which today we particularly like to do homage, makes us utterly
blind to the essence of technology. (Heidegger, 1977, p. 4)

In The Question Concerning Technology, Heidegger portrays the modern

focus on what he calls “technicity” – technology in itself, or technology as the

primary driving force within culture – as an extreme danger both to culture and

14 Heidegger's historical involvement with Nazism and support of Adolf Hitler's policies has shrouded his theories in controversy. Philosophers disagree on the consequences of this historical responsibility for his philosophy. Some claim that his philosophy remains untouched by historical and political contingencies, while others argue that his engagement with the Nazi party is inextricably linked to his philosophical conceptions (see Farías, 1989). This debate cannot be solved here, and my inclusion of Heidegger's theories of technology is meant to help chart the range of thinking on the subject, and not necessarily to endorse his entire philosophy. (I thank Prof. Terry Moran for helping elucidate this concern with Heidegger's theories.)

thought. Technicity is not just a series of technologies or a technological system;

it is also a way of thinking. He examines the rhetoric of the seventeenth century,

in which the emphasis on the power of the thinking human subject led the subject

to overestimate his ability to transcend time and nature. Such thinking led to the

notion of that subject’s ability to gain control of nature, and the development of

technological tools to attain such control. For Heidegger this delusional thinking –

man’s control over nature through technology – inevitably led to our loss of

control of the very technological instruments that are used, paradoxically, in

order to feed our illusion that we can control the world:

The instrumental conception of technology conditions every attempt to


bring man into the right relation to technology. Everything depends on our
manipulating technology in the proper manner as a means. …The will to
mastery becomes all the more urgent the more technology threatens to slip
from human control. (Heidegger, 1977, p. 5)

The danger of technology lies in the transformation of the human being, by which

human actions and aspirations are fundamentally distorted. Technology enters the

inmost recesses of human existence, transforming the way we know and think.

The “greatest danger,” according to Heidegger, is that:

The approaching tide of technological revolution…could so captivate,


bewitch, dazzle, and beguile man that calculative thinking may someday
become so accepted and practiced as the only way of thinking. (1966, p.
56)

Technology becomes, then, the primary mode of human existence, and

Heidegger’s concern is the human distress caused by this technological

understanding of being.

For Heidegger, the essence of technology should be something other than

technological. He states somewhat paradoxically, “the essence of technology is by

no means anything technological” (1977, p. 4). Technology should, instead, be a

way of “revealing” other aspects of life, not of controlling life and the world just

to enable more technological development (1977, p. 12). Heidegger suggests that

there is a way that we can keep our technological devices and yet remain true to

ourselves. “We can affirm the unavoidable use of technical devices, and also deny

them the right to dominate us, and so to warp, confuse, and lay waste our nature”

(quoted in Dreyfus, 2004, p. 57). As Hubert Dreyfus comments:

[The] technological understanding of being is our destiny, [but] it is not


our fate. That is, although a technological understanding of things and
ourselves as resources to be ordered, enhanced, and used efficiently has
been building up since antiquity and dominates our practices, we are not
stuck with it. It is not the way things have to be…. We can break out of
the technological understanding of being whenever we find ourselves
gathered by things rather than controlling them. (Dreyfus, 2004, p. 57)

We are released from the burden of seeking efficiency for its own sake,

Dreyfus claims, once we understand that this kind of calculative thinking is only a

historical product and that things could be different. Once we recognize that we

have positioned ourselves within a self-created technological understanding, we

have taken the first steps to escape from that framework; we are then free to

appreciate that there is more to life than efficiency, than technological rationality.

This new attitude toward technology is what Heidegger describes in Discourse on

Thinking as “releasement”:

Releasement…grants us the possibility of dwelling in the world in a


totally different way. It promises us a new ground and foundation upon

which we can stand and endure in the world of technology without being
imperiled by it. (Heidegger, 1966, p. 55)

This striving for harmony with the technologies around us can make us sensitive

to the technological understanding of being, perhaps a first step toward freeing us

from our compulsion to force all things into one efficient, technological order.

Heidegger’s comments here are deeply insightful. Like Ellul, he is asserting the

totalizing effect of the essence of technology, but he sees hope for releasement

through our very relationship to technology.

Like Heidegger, Herbert Marcuse is critical of technology’s dominant

place in Western culture, and, like Ellul, he views technology as the embodiment

of a totalizing kind of rationality. In One-Dimensional Man (1964), Marcuse

argues that humans have been reduced to one-dimensional beings due to their

immersion in a technological civilization. He argues that our advanced industrial

and technological society created false needs that integrated individuals into the

existing system of production and consumption. Mass media and culture,

advertising, industrial management, and contemporary modes of thought all

reproduced the existing system and attempted to eliminate negativity, critique,

and opposition. The result was a “one-dimensional” universe of thought and

behavior in which the very capacity for critical thinking and oppositional behavior

was withering away.

Marcuse acknowledges that all basic needs, such as food, shelter and

safety, are provided for in the modern technological society. But these are all

controlled within a cooptive system of needs-manipulation and a system of

“deceptive liberties as free competition at administered prices, a free press which

censors itself, free choice between brands and gadgets” (1964, p. 7). The

technological society gains its totality through a certain kind of apparent

satisfaction of needs, manipulated through technical means. We are caught up in a

systematic deception; our needs have become the needs of the technological

apparatus. This completely rationalized, one-dimensional, technological society

closes political possibilities and represents a new form of dominating political

power.

In his essay “Social Implications of Technology” (2004), Marcuse

reaffirms his critique of the technological society. Here he claims that

technological rationality is a political rationality that swallows up all opposition

by homogenizing nature and people into neutral objects of manipulation:

Under the impact of this apparatus, individualistic rationality has been


transformed into technological rationality. It is by no means confined to
the subjects and objects of large scale enterprises but characterizes the
pervasive mode of thought and even the manifold forms of protest and
rebellion. This rationality establishes the standards of judgment and fosters
attitudes which make men ready to accept and even to introcept the
dictates of the apparatus. (Marcuse, 2004, p. 65)

We are thus stripped of our individuality by a technological rationality that makes

conformity seem reasonable and protest unreasonable. The result is Marcuse’s

“one-dimensional” universe of thought and behavior in which our capacity for

critical thinking and practical resistance is disappearing, and where “there is no

room for autonomy” (p. 66).

Marcuse, however, holds out the possibility that technological rationality

might actually be used as an instrument to foster democracy and autonomy:

We have pointed to the possible democratization of functions which
technics may promote and which may facilitate complete human
development in all branches of work and administration. Moreover,
mechanization and standardization may one day help to shift the center of
gravity from the necessities of material production to the arena of free
human realization. (2004, p. 77)

The same objective, impersonal rationality that makes individualism unnecessary

can be harnessed by a society to fully realize (rather than repress) human

capacities. Marcuse is pessimistic about the prospects for such a transformation

because the technological apparatus has incorporated and subsumed all critical

and oppositional thought. Yet despite Marcuse’s pessimism regarding the

practical achievement such a transformation, he maintains that it is, at least in

principle, possible to attain.

Summary

These three “founding fathers” of contemporary philosophy of technology

arrive at similar conclusions about modern technology and technical culture – that

the pervasiveness of technology brings with it increasingly harmful effects. For

Ellul, this is embodied in the notion of la technique, “the totality of methods

rationally arrived at and having absolute efficiency (for a given stage of

development) in every field of human activity” (Ellul, 1964, p. xxv). Heidegger

fears that the rational and efficient motivations of the technologization of our

world will transform all aspects of humanity, and emerge as the “only way of

thinking” (1966, p. 56). And Marcuse fears that this technological deception will

become more deeply embedded in our lives with the rise of advanced systems of

production and consumption, where the needs of society are replaced by the needs

of the technological apparatus. For all three thinkers, technology is largely

associated with calculative, analytic, and efficient thinking, an inevitable

consequence of modernity in which technological rationality threatens to subsume

all other values and modes of existence.

Contemporary Philosophies of Technology

Building from these fundamental concerns of Ellul, Heidegger, and

Marcuse, the study of technology and its social impact has thrived in recent

decades, spreading from the discipline of philosophy into sociology, history,

political science, and media theory. This section will highlight some of the

contemporary philosophical explorations of the nature of technology as well as its

effects on human knowledge, activities, societies, and environments. Our starting

point is a unique field of study that includes Ellul and Mumford within its diverse

canon: media ecology.

Media Ecology

Media ecology is a multi-disciplinary field dedicated to studying the

intersections between media, communication, technology, and culture. Inspired by

the provocative theories of Marshall McLuhan, the idea of media ecology

emerged formally in an address by Neil Postman in 1968 where he described it as

“the study of media as environments,” explaining that the main concern for media

ecologists is “how media of communication affect human perception,

understanding, feeling and value; and how our interaction with media facilitates

or impedes our chances for survival” (Postman, 1970, p. 161). Postman later

offered a more elaborate definition, summarizing the importance of the ecological

metaphor:

The word ecology implies the study of environments – their structure,


content, and impact on people. An environment is, after all, a complex
message system which regulates ways of feeling and behaving. It
structures what we can see and say and, therefore, do. Sometimes, as in
the case of a courtroom, or classroom, or business office, the
specifications of the environment are explicit and formal. In the case of
media environments (e.g., books, radio, film, television, etc.), the
specifications are more often implicit and informal, half-concealed by our
assumption that we are dealing with machines and nothing more. Media
ecology tries to make those specifications explicit. It tries to find out what
roles media force us to play, how media structure what we are seeing, why
media make us feel and act as we do. (Postman & Weingartner, 1971, p.
139)

The fundamental goal for media ecology, then, is to understand how the

form of media and communication technologies impacts our everyday lives, to

uncover and understand “how the form and inherent biases of communication

media help create the environment…in which people symbolically construct the

world they come to know and understand, as well as its social, economic,

political, and cultural consequences” (Lum, 2000, p. 3). The media ecological

perspective on understanding the biases of media technology is perhaps best

illustrated through a series of assertions offered by Christine Nystrom:

1. Because of the different symbolic forms in which they encode


information, different media have different intellectual and emotional
biases.
2. Because of the different physical forms in which they encode, store,
and transmit information, different media have different temporal,
spatial, and sensory biases.
3. Because of the accessibility of the symbolic forms in which they
encode information, different media have different political biases.

4. Because their physical form dictates differences in conditions of
attendance, different media have different social biases.
5. Because of the ways in which they organize time and space, different
media have different metaphysical biases.
6. Because of their differences in physical and symbolic form, different
media have different content biases.
7. Because of their differences in physical and symbolic form, and the
resulting differences in their intellectual, emotional, temporal, spatial,
political, social, metaphysical, and content biases, different media have
different epistemological biases. (Personal communication, September
2002)

Media ecology explores how media and communication technologies construct

the world in which we live, including their social, economic, political,

epistemological, and cultural spheres. The foundation of much media ecological exploration lies in the provocative ideas of Marshall McLuhan.

McLuhan is popularly known as a media and communications theorist, but

he makes it clear in his influential work, Understanding Media: The Extensions of

Man (1964), that what he means by “media” extends well beyond traditional

communication technologies, evidenced by chapters on such “media” as clocks,

houses, light bulbs, bicycles, airplanes, games, weapons, and automobiles. The

central aim of Understanding Media – indeed all of McLuhan’s work – is to probe

the nature of media technology and understand how the introduction of new

technologies affects society.

To this end, McLuhan echoes Rapp’s notion that technologies are

extensions of the body, that they “amplify and extend ourselves” (McLuhan,

1964, p. 64). Differing from Rapp, however, McLuhan rejects any romantic view

of such technological extensions of the body, noting how the act of

“autoamputation” that comes with technology, often numbing us to the full effects

45
of technology, and how particular media technologies, even as extensions of the

body, function as metaphors, languages, and translators of our experiences.

(McLuhan, 1964, p. 66). To help explain this, McLuhan introduces his most

famous aphorism: “the medium is the message,” which means:

…that the personal and social consequences of any medium – that is, of
any extension of ourselves – result from the new scale that is introduced
into our affairs by each extension of ourselves, or by any new technology.
(McLuhan, 1964, p. 7)

Simply put, “the medium is the message” means that the media or

technologies that we use play a leading role in how and what we communicate,

how we think, feel and use our senses, and in our social organization, way of life,

and world view. McLuhan saw changes in the dominant medium of

communication as the main determinant of major changes in society, culture and

the individual, and his concern with the widespread and ecological effects of

introducing a new technology to a given environment is an overarching theme in

most media ecological literature. Notable examples include Harold Innis’ (1951)

examination of how the formal features of a culture’s communication

technologies have distinctive sensory, cognitive, socio-political, and ideological

biases, Walter Ong’s (1982) documentation of the intellectual, social, and cultural

effects of the shift from oral modes of communication to a predominantly scribal

culture (mirroring Plato’s concern with the “Judgment of Thamus”), and

Elizabeth Eisenstein’s (1979; 1983) in-depth study of the impact of the printing

press on science, government, religion, and culture in early modern Europe.

Innis, Ong, and Eisenstein focus on the introduction of new media

technologies and practices to gain a greater awareness of the role of technology in

the shaping of society and human culture. Representative of most media

ecological scholarship, they appear to subscribe to McLuhan’s assertion that “the

medium is the message,” that the technological form of a medium carries greater

force than the particular message it is delivering. This McLuhanesque logic,

which rests at the center of the media ecology tradition, is often criticized for its

media determinism. The theories presented by Innis, Ong, and Eisenstein, then,

could all be labeled as overly deterministic, arguing that social, cultural, political,

and economic aspects of our lives are solely determined by the form and biases of

the prevailing media technology.

Eisenstein is quick, however, to avoid an overly deterministic model of

technology’s impact on society. She positions the technological bias of the

printing press alongside the rise of nationalism, inductive science, capitalism,

individualism, and Protestantism, but is cautious about assigning too strong a

causal relationship between print and these cultural events. In fact, the title of the

unabridged version of her text is “The Printing Press as an Agent of Change”

(1979; emphasis added), recognizing that media technology is an agent of change

within society, but not necessarily the first and only cause. She states:

I want to suggest that printing produced a mutation…. The relationship


between a given technological and a given cultural change will be
approached, not by taking them to coincide…, but by acknowledging that
they came at different times and by investigating how they affected each
other. (Eisenstein, 1983, pp. 114-115)

By acknowledging how technology and culture might affect one another,

Eisenstein recognizes that it is often the interaction between a technology and its

users that determines its impact. A close inspection of the media ecology tradition

reveals a broad commitment to this softer form of determinism, as Casey Lum

(2000) notes in his introduction to the intellectual roots of Media Ecology: “[O]ne

of media ecology’s major concerns [is] the complex symbiotic relationship among

the media and…between media and the various forces in society” (p. 1; emphasis

added).

To summarize, media ecology has surfaced as a philosophical approach

well suited to exploring the symbiotic and ecological relationship between

technology and society. Building from the notion that “the medium is the

message,” media ecologists understand that the introduction of new technologies

has widespread ecological impacts, a position expressed most succinctly, again,

by Neil Postman:

Technological change is neither additive nor subtractive. It is ecological. I


mean “ecological” in the same sense as the word is used by environmental
scientists. One significant change generates total change. If you remove
the caterpillar from a given habitat, you are left not with the same
environment minus caterpillars: you have a new environment, and you
have reconstituted the conditions of survival; the same is true if you add
caterpillars to an environment that had none. This is how the ecology of
media works as well. A new technology does not add or subtract
something. It changes everything. (Postman, 1992, p. 18)

Confirming the views of Ellul, Heidegger, and Marcuse, media ecologists strive to

uncover the totalizing effects of technology on society. While recognizing that a

symbiotic relationship exists between technology and culture, media ecology, at

its root, maintains a form of soft determinism: that the introduction of a

technology into an ecosystem “changes everything.”

Social Construction of Technology

While media ecology provides a framework for understanding how the

introduction of a new technology or practice into an environment changes that

entire environment, it remains relatively silent regarding the social or cultural

forces that led to the emergence of the technology in the first place. To fill this

void, we can turn to the theory of the social construction of technology (SCOT),

developed by sociologists and historians within the discipline of Science and

Technology Studies (see, for example, Pinch & Bijker, 1984; Bijker et al., 1987;

Bijker & Law, 1992). In contrast to more deterministic theories of technology,

advocates of SCOT focus less on the ways technology might determine human

action, and instead explore how human and social actions shape the development

of technology. While media ecology concedes, perhaps reluctantly, that a

symbiotic relationship exists “between media and the various forces in society,”

SCOT embraces such a position as primary. Further, SCOT theorists challenge

the views of Ellul, Heidegger, and Marcuse that technology is self-determinant – that technological development occurs through a pre-determined, logical, and fully rational path – and instead argue that technologies are constructed through a

process of strategic negotiation between different social groups, each pursuing its

own specific interests, resulting in great variance in the ultimate form of the

technology. The key idea underlying SCOT, then, is that social arrangements

create, shape, and determine technologies and how they are used.

A core text in the SCOT approach to understanding the relationship

between technology and society is Trevor Pinch and Wiebe Bijker’s (1987) “The

Social Construction of Facts and Artifacts: Or How the Sociology of Science and

the Sociology of Technology Might Benefit Each Other,” which outlines four key

components of the theory. The first is interpretative flexibility, which suggests

that technological design is an open process that can produce different outcomes

depending on the social circumstances of development: “Technological artifacts

are culturally constructed and interpreted…. By this we mean not only that there

is flexibility in how people think of or interpret artifacts but also that there is

flexibility in how artifacts are designed” (p. 40). Interpretative flexibility implies

that technological artifacts are underdetermined by their original designers,

allowing for a multitude of possible designs to emerge.

The concept of the relevant social group is a second component of the

SCOT approach. For social constructivists, technological development is a

process in which multiple groups, each embodying a specific interpretation of an

artifact, negotiate over its design, with different social groups seeing and

constructing quite different objects. Relevant social groups are the embodiments

of particular interpretations: “all members of a certain social group share the same

set of meanings, attached to a specific artifact” (Pinch & Bijker, 1987, p. 30).

Groups may have different definitions of a technology, so development continues

until all groups come to a consensus that their common artifact “works.”

The third component of the SCOT framework is the dual presence of

closure and stabilization. When the relevant social groups have formed a

consensus regarding the purposes and suitability of a particular technology, a state

of closure is said to have occurred. The technology is no longer subject to

interpretative flexibility; it has become a “black box” to its users, a taken-for-

granted technology that almost seems to be part of the natural environment. A

technology that has reached closure is so embedded that people find it difficult to

think of alternative ways of doing things. Closely related to closure is

stabilization. Technologies reach a point of stabilization because their social roles

and statuses have themselves stabilized, often because an individual or group is in

a position to exert power, and it is in their interest to see to it that a particular

technology succeeds.

Finally, SCOT places emphasis on the wider context – the broad cultural

and political milieu in which artifact development takes place: “…the

sociocultural and political situation of a social group shapes its norms and values,

which in turn influence the meaning given to an artifact” (Pinch & Bijker, 1987,

p. 46). Examining the wider context played a relatively minor role in Pinch and

Bijker’s original conception of SCOT, but took on more importance in Bijker’s

later work (1995), when he added the notion of the technological frame to his

constructivist theory of technology. This is the shared cognitive frame that defines

a relevant social group and constitutes members’ common interpretation of an

artifact. Like a Kuhnian (1962) paradigm, a technological frame can include

goals, key problems, current theories, rules of thumb, testing procedures, and

exemplary artifacts that, tacitly or explicitly, structure group members’ thinking,

problem solving, strategy formation, and design activities (Bijker, 1995, p. 125).

A technological frame may promote certain actions and discourage others:

Within a technological frame not everything is possible anymore (the


structure and tradition aspect), but the remaining possibilities are relatively
clearly and readily available to all members of the relevant social group
(the actor and innovation aspect). (Bijker, 1995, p. 192)

The key idea underlying the social construction of technology, again, is that social arrangements create, shape, and determine technologies and how they are used. As Bijker and Law (1992) state, "Our technologies mirror

our societies. They reproduce and embody the complex interplay of professional,

technical, economic, and political factors” (p. 3). In short, technologies are shaped

by and mirror the complex trade-offs that make up the social sphere from which

they emerge.

This constructivist approach to the relationship between technology and

society seems at odds with the media ecological approach. While media ecology

views technology as a force that shapes society, SCOT sees society mirrored in its

technologies. But, just as Eisenstein recognized that media ecology must not be

fully deterministic, Bijker and Law (Bijker & Law, 1992) note that even within

the constructivist paradigm, the social world is affected by the technology it has

created:

[Technology] is born of the social, the economic, and the technical


relations that are already in place. A product of the existing structure of
opportunities and constraints, it extends, shapes, reworks, or reproduces
that structure in ways that are more or less unpredictable. And, in doing so,
it distributes, or redistributes, opportunities and constraints equally or
unequally, fairly or unfairly. (p. 11; emphasis added)

Bijker and Law’s acknowledgement that technology, while socially constructed,

in turn distributes opportunities and constraints within society is a tacit

recognition that a reciprocal relationship exists between technological artifacts

and the social groups that form them. Finding common ground with the soft

determinism of media ecology, Bijker and Law accept that “society itself is being

built along with objects and artifacts” (Bijker & Law, 1992, p. 19). Just as the

strategies, priorities, and biases of the social groups involved with the

stabilization of a technology might influence its eventual design, these same

biases can be reflected back upon society when the technology reaches a certain

level of stabilization and ubiquity.

Politics of Technology

When noting that the design of technology “distributes, or redistributes,

opportunities and constraints equally or unequally, fairly or unfairly,” Bijker and

Law reveal how the introduction of particular technologies impact arrangements

of power within our technological society. Despite invoking issues of power and

politics through such a statement, social constructivist studies of technology are

often criticized for largely ignoring the political consequences of the technologies

that emerge. Such criticism has come most vocally from Langdon Winner (1993),

who, in his essay “Upon Opening the Black Box and Finding It Empty,” argues

that:

The most obvious lack in social constructionist writing is an almost total


disregard for the social consequences of technical choice. …What the
introduction of new artifacts means for people’s sense of self, for the

texture of human communities, for qualities of everyday living, and for the
broader distribution of power in society – these are not matters of explicit
concern. (p. 368)

Winner expands his indictment of SCOT beyond simply not looking at the

social consequences of technological choice: “Social constructivists,” he states,

“choose to remain agnostic as regards the ultimate good or ill attached to

particular technical accomplishments” (p. 372). A fundamental flaw of social

constructivists, according to Winner, is their lack of an evaluative or political

stance on the development of technologies: “they offer no judgment on what it all

means” (1993, p. 375). In contrast to SCOT’s apparent agnosticism, other

scholars have confronted the politics of technology head-on, building on the

theories espoused by Ellul, Heidegger, and Marcuse to expose the political

dimensions of the technologies that permeate society. Leading the way in this

endeavor is, not surprisingly, Winner himself. But the foundations for the political

exploration of technologies were laid, again, by Lewis Mumford.

In the wake of his historical treatment of technology in Technics and

Civilization, Lewis Mumford later turned his attention to the specific political

dimensions of contemporary technological society. His essay “Authoritarian and

Democratic Technics” (1964), penned thirty years after Technics and Civilization,

describes two distinct technological types that frequently exist side by side: “one

authoritarian, the other democratic, the first system-centered, immensely

powerful, but inherently unstable, the other man-centered, relatively weak, but

resourceful and durable” (1964, p. 2). Democratic technics are built upon human

skill and animal energy, and remain under the active direction of the craftsmen or

laborers who use them as though “gifts from nature” (p. 3). While such

democratic technologies might have “limited horizons of achievement,” they had

modest demands, and enabled “adaptation and recuperation” (p. 3). Authoritarian

technics, on the other hand, emerged from “a new configuration of technical

invention, scientific observation, and centralized political control” that gave rise

to what we refer to today as “civilization” (p. 3). Connected to the rise of new

organizations of political power – kingship and empire – a new kind of

“theological-technological mass organization” emerged, uniting and centralizing

the previously dispersed and scattered democratic technics (p. 3).

While history reveals that authoritarian governing models of kingship and

empire have largely been replaced by democratic modes of governance,

authoritarian technics have persisted: “At the very moment Western nations threw

off the ancient regime of absolute government, operating under a once-divine

king, they were restoring this same system in a far more effective form in their

technology” (Mumford, 1964, p. 4). Mumford argues that the rise of political

democracy has been increasingly nullified by the “successful resurrection of a

centralized authoritarian technics” (p. 4). In place of an all-powerful king, the

center of authority in this new authoritarian technocratic system is the system

itself. The ideological power of technological progress that drives the

authoritarian system is too seductive to refuse. As Mumford describes:

each member of the community may claim every material advantage,


every intellectual and emotional stimulus he may desire, in quantities
hardly available hitherto even for a restricted minority: food, housing,
swift transportation, instantaneous communication, medical care,
entertainment, education. …If one surrenders one’s life at source,

authoritarian technics will give back as much of it as can be mechanically
graded, quantitatively multiplied, collectively manipulated and magnified.
(1964, p. 6)

Mumford maintains, however, that this is not a fair bargain. He demands

recognition of the human disadvantages and costs of our unqualified acceptance

of this authoritarian technocratic system: “Once our authoritarian technics

consolidates its powers, with the aid of its new forms of mass control, its panoply

of tranquillizers and sedatives and aphrodisiacs, could democracy in any form

survive?” (p. 7).

In his seminal essay “Do Artifacts Have Politics?”, Winner (1986)

presents a theory of technological politics similar to Mumford's, arguing that

conditions of power, authority, freedom, and social justice are often deeply

embedded in technical devices and systems. Winner contends that technological

artifacts have innate dispositions that both create and affect these arrangements of

power and authority, with often-widespread political implications. In response to

Postman’s concern about our blindness to the ideologies embedded in technology,

Winner’s theory of technological politics compels us to “pay attention to the

characteristics of technical objects and the meaning of those characteristics”

(1986, p. 22). He urges us to see our technologies (and built environments in

general) as embodying ideas about the social order, whether well or ill

intentioned: “The issues that divide or unite people in society are settled not only

in the institutions and practices of politics proper, but also, and less obviously, in

tangible arrangements of steel and concrete, wires and semiconductors, nuts and

bolts” (Winner, 1986, p. 29).

Winner identifies two levels at which technological artifacts embody

politics. The first is one in which segments of society build into technologies their

own explicit prejudices, biases and ideologies in such a way that they settle

particular issues within a community. At this level, technologies are implicitly

designed to resolve political conflict by supporting the power and authority of one

group over another. Winner provides the example of Robert Moses, the urban

planner responsible for much of the design of modern New York City, who famously built overpasses over the Long Island parkways only nine feet high, allowing only automobiles to navigate them. The city's poor and minority classes, largely dependent on taller buses for transportation, could not travel along the parkways and were essentially denied access to the beachfront destinations.

Winner argues that Moses incorporated widespread prejudices among the city’s

upper class into the design of the parkway overpasses, essentially preventing

access by the poor and black population of New York City to the wealthy’s social

and recreational spaces. Such a technological design decision provided a

(literally) concrete settlement of a particular power struggle between the classes.15

Winner also suggests that some technologies are inherently political.

Inherently political technologies, by their very nature, make necessary or strongly

suggest certain kinds of social and political arrangements. For example, Winner

argues that the level of a society’s dependence on massive technological

15 Bernward Joerges (1999) has provided a critique of Winner's thesis, arguing that the Long Island parkway story is apocryphal. Regardless of who is correct, the episode stands as a suitable parable for how political arrangements could be embedded in artifacts in order to settle particular social or political issues.

investments (nuclear power reactors, interstate highways, national communication

systems, and the like) is positively related to the need for highly centralized,

hierarchical, and authoritative forms of administration to ensure their

maintenance. Mirroring Mumford, Winner is concerned that the authoritarian

forms of management required by such large technical systems are leaking

into society at large:

Americans have long rested content in the belief that arrangements of


power and authority inside industrial corporations, public utilities, and the
like have little bearing on public institutions, practices, and ideas at large.
That ‘democracy stops at the factory gate’ was taken as a fact of life that
had nothing to do with the practices of political freedom. But can the
internal politics of technology and the politics of the whole community be
so easily separated? (Winner, 1986, p. 36)

The increased ubiquity of technological systems that require authoritative

forms of management, Winner argues, will result in the normalization and

naturalization of those kinds of power relationships outside the factory gate. In

such cases, the emergence of technologies that are only compatible with a

particular type of governance structure will perpetuate the spread of that kind of

political arrangement of power throughout society.

Focusing on the rise of large-scale systems of communication and

information processing in the wake of industrialization, James Beniger (1986)

identified a similar correlation between the emergence of particular technological

systems and the type of government and bureaucratic structures they support.

Because industrialization involved the large and fast flow of goods, it could not be

managed without a new breed of information processing technical systems (in

which Beniger includes product standardization, bureaucracy, and advertising, as well as the usual mechanical devices such as computers, data filing systems, and

the like). Moreover, without highly detailed and structured management systems,

the new post-industrial economy simply could not work. This need for large-scale

management and information systems brought about what Beniger calls the

“Control Revolution”:

The Control Revolution developed in response to problems arising out of


advanced industrialization: a mounting crisis of control at the most
aggregate level of national and international systems, levels that had had
little practical relevance before the mass production, distribution, and
consumption of factory goods. (Beniger, 1986, p. 278)

Resolution of the problems created by advanced industrialization demanded new

means of information processing and communication to control an economy

shifting from local segmented markets to increasingly higher levels of

organization – what Beniger labels the growing “systemness of society” (p. 278).

The growing “systemness of society” meant that information began to

replace industrial capital as the material base for our modern economy, and, well

before the twentieth century and digital computing, brought about our Information

Society. According to Beniger, mass industrial processes and technology began to

coalesce in the mid- to late-1800s, beginning with landmark inventions such as

the telegraph, typewriter, and telephone, extending into the early 1900s with the

radio and, eventually, television. More recent developments such as computers,

telecommunications, and presumably, the Internet, Beniger would argue, are not

radical milestones or emblems of a newly emerging “Information Society,” but

examples of the steady continuation of the Control Revolution that began a

century earlier.

Beniger argues that just as the rapid industrialization of the late-nineteenth

century forced theoretical reconstructions in terms of capital, energy, and material

processing, “the Information Society demands similar reanalysis according to the

physical relationships governing information storage, processing, communication,

and control” (1986, p. 32). The rise of the Information Society has exposed the

centrality of information processing, communication, and bureaucratic control to

all aspects of human society and social behavior:

Because the activities of information processing, programming, decision,


and communication are inseparable components of the control function, a
society’s ability to maintain control at all levels – from interpersonal to
international relations – will be directly proportional to the development of
its information technologies. (Beniger, 1986, p. 287)

Beniger, then, arrives at conclusions similar to those of both Mumford and Winner: that

the growing ubiquity and importance of information technology, paralleled by the

growth of an information-based economy and the “centrality of information

processing” (Beniger, 1986, p. 436), inevitably leads to a social and political

environment based on efficiency and bureaucratic control.

The French philosopher Gilles Deleuze focuses Beniger’s critique of the centrality of control “to all aspects of human society and social behavior” brought forth by the informatization of society squarely on issues of power, arguing that we have

shifted from a disciplinary society, organized around the management of physical

space, to a control society, organized around the control of information and,

inevitably, people. Deleuze (1995) paints a bold picture that identifies a

movement away from technologies of discipline, based on physical confinement

of the body, to technologies of control, based on commodification of the

individual as information to be parsed and processed. Like Mumford, Deleuze

believes that the form of society is matched by the form of its machines (Deleuze,

1995, p. 180), and similar to Beniger, he views the rise of information processing

as the source of control in contemporary society. Sovereign societies, Deleuze

argues, made use of simple mechanisms such as levers, pulleys, and clocks, and

disciplinary societies employed “machines involving energy,” presumably steam

and internal combustion engines. The defining technology of the new society of

control is information technology, what he calls the “third generation of

machines” (Deleuze, 1995, p. 180), where the “digital language of control is made

up of codes indicating whether access to some information should be allowed or

denied” (p. 180).

Deleuze argues that the arrangements of power deployed in the spaces of

enclosure which previously defined disciplinary institutions – the factory, the

school, the hospital, the prison – have collapsed under the weight of our

information society, leading, not to the eradication of these power relations, but

rather, to their dispersal and often hidden proliferation throughout society. The

social regulation of space and time has become expropriated from the relatively

closed systems of discipline, moving into open, often contradictory spaces of

control. The control society offers what feels like greater freedom, yet is more

suffocating, and to Deleuze, more sinister: With the emergence of control

technologies such as tracking systems, electronic tagging, and coded access cards,

Deleuze warns of the “widespread progressive introduction of a new system of

domination” (1995, p. 182).

Lawrence Lessig (1999) recognizes how a Deleuzian “digital language of

control” is present in the architecture of cyberspace. Since cyberspace is entirely

human-made, Lessig argues, little is naturally inherent to the system – all of its

rules, tendencies, affordances, and constraints are the result of human decisions,

actions, and, essentially, code.16 What we can and cannot do there is governed by

the underlying code of all of the programs and protocols that make up the

Internet, which equally permit and restrict human action. Quite simply, “code is

law”:

In real space we recognize how laws regulate – through constitutions, statutes, and other legal codes. In cyberspace we must understand how code regulates – how the software and hardware that make cyberspace what it is regulate cyberspace as it is. (1999, p. 6)
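Deleuze’s “digital language of control” and Lessig’s “code is law” can be made concrete with a small sketch. The following Python fragment is purely illustrative – the resources, roles, and rules are invented – but it shows how a few lines of program logic silently determine what a user may or may not do, regulating conduct as surely as a statute would:

    # Illustrative only: a tiny, invented access-control rule set. These few
    # lines regulate conduct the way a statute would, allowing or denying
    # access, and they do so invisibly unless the code itself is inspected.
    PERMISSIONS = {
        "public_page": {"anonymous", "member", "administrator"},
        "member_forum": {"member", "administrator"},
        "audit_logs": {"administrator"},
    }

    def may_access(role: str, resource: str) -> bool:
        """The digital language of control: allow or deny, nothing in between."""
        return role in PERMISSIONS.get(resource, set())

    print(may_access("member", "audit_logs"))    # False: denied by design
    print(may_access("member", "member_forum"))  # True: permitted by design

Nothing in this sketch announces itself to the user as a rule; the denial simply happens, which is precisely the regulatory character of code that Lessig describes.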

For Lessig, “how a system is designed will affect the freedoms and control

the system enables” (Lessig, 2001, p. 35); the very architecture of the Internet

dictates its politics and ideology. He argues that it is the architecture of

cyberspace that constitutes its freedom, and as the architecture is threatened and

changed, this freedom will be erased. While locating the political and ideological

power of the system within its formal structure – code – Lessig recognizes that the

political nature of this new medium does not exist per se, but is the product of

human agents. While the “architecture of the cyberspace is power,” he states,

“How it is could be different. Politics is about how we decide. Politics is about

how that power is exercised, and by whom” (1999, p. 59; emphasis added).

16 See (Hafner & Lyon, 1996) for a study of the human decisions that informed the creation of the Internet.
Like Deleuze and Lessig, Alex Galloway locates systems of control

embedded within the code of contemporary technological systems. While there is

a steady stream of literature, both scholarly and journalistic, which presents a

particularly utopian vision of the freedoms enabled by the digital networks of the

Internet (for example Kelly, 1996; Negroponte, 1996; Rheingold, 2000),

Galloway (2004) reveals how power relations and control persist within the

formal structure of the Internet, the level of the protocols that make the Internet

work.17 He argues that such protocols facilitate the exercise of control in our

networked society, one in which power is distributed laterally, rather than being

hierarchically structured or centralized. Extending Deleuze’s concern that

information systems increasingly disperse power relations in hidden and

seemingly contradictory ways, Galloway argues that:

Control exists after decentralization, that is, in specific places where


decentralization is done and gone and distribution has set in as the
dominant network diagram. Protocol is my answer for that. …protocol not
only installs control into a terrain that on its surface appears actively to
resist it, but in fact goes further to create the most highly controlled mass
media hitherto known. (Galloway, 2004, p. 147)

Galloway asserts that the founding principle of the Internet is control, and,

following Deleuzian logic, it represents a continuation on the path toward a

society of broad, horizontal control structures: “The emergence of distributed

networks is part of a larger shift in social life. The shift includes a movement

away from central bureaucracies and vertical hierarchies toward a broad network

of autonomous social actors” (p. 33). Protological control is embedded in the code

of the system, and reflects, for Galloway, a larger shift towards horizontal,

distributed, and often hidden control mechanisms.

17 Similar work includes Chun’s (2006) Control and Freedom: Power and Paranoia in the Age of Fiber Optics, where she describes the various controls and freedoms enabled by the design of our networked technologies and online environments.

Ethics in Technology

Threaded throughout the various philosophies of technology described

above are appeals to recognize how technology impacts ethical and human

values. Plato confronts how a commitment to techne might conflict with the

proper achievement of the “good life,” while Marx sees technology as a force of

domination and alienation. Ellul, Heidegger, and Marcuse are each concerned

about the way technology acts as a force that denies individual freedom and

autonomy. Mumford and Winner express concern that certain technological

arrangements conflict with democratic principles, and Beniger, Deleuze, Lessig,

and Galloway all point to issues of freedom and control within the technologies of

our new information society.

These concerns with the way that technology affects levels of freedom,

autonomy, power, and social justice bring us back to our starting point in this

chapter: the position that the design of technology bears “directly and

systematically on the realization, or suppression, of particular configurations of

social, ethical, and political values” (Flanagan et al., in press). The increasing

technologization of our world brings with it important ethical concerns about its

impact on human values at both the social and individual levels. As a result, many

scholars have directed their focus on how emerging digital technologies bear on

ethical and human values. The origin of the contemporary study of ethics in

technology is the visionary work of mathematician-turned-philosopher, Norbert

Wiener.

Norbert Wiener shared the concern of the theorists outlined above that the

rise of information processing would become a locus of control. During World

War II, Wiener helped to design an anti-aircraft gun capable of shooting down

fast-moving warplanes. The particular engineering challenge of this project led to

the creation of a new field of research that Wiener called “cybernetics” – the science

of information feedback systems (Wiener, 1965). The concepts of cybernetics,

when combined with digital computers under development at that time, led

Wiener to become concerned about the ethical uses of these advanced

technologies (which later evolved into our contemporary computer and

information technology infrastructures). Wiener foresaw revolutionary moral and

political consequences within these emerging technologies:

It has long been clear to me that the modern ultra-rapid computing


machine was in principle an ideal central nervous system to an apparatus
for automatic control; and that its input and output need not be in the form
of numbers or diagrams. It might very well be, respectively, the readings
of artificial sense organs, such as photoelectric cells or thermometers, and
the performance of motors or solenoids [that] we are already in a position
to construct artificial machines of almost any degree of elaborateness of
performance. Long before Nagasaki and the public awareness of the
atomic bomb, it had occurred to me that we were here in the presence of
another social potentiality of unheard-of importance for good and for evil.
(Wiener, 1965, pp. 27-28)
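The central mechanism of cybernetics – regulation through information fed back from a system’s output – can be illustrated with a toy example. The following sketch is an invented thermostat-style controller, not Wiener’s mathematics: the controller repeatedly measures the gap between a goal and the current state, and that error signal drives the next correction.

    # A toy negative-feedback loop in the spirit of cybernetics. All numbers
    # are invented for illustration: the measured output is compared with a
    # goal, and the resulting error drives each correction.
    goal = 70.0          # desired temperature
    temperature = 50.0   # current temperature
    gain = 0.5           # how aggressively the controller corrects

    for step in range(10):
        error = goal - temperature     # information fed back from the output
        temperature += gain * error    # correction proportional to the error
        print(f"step {step}: temperature = {temperature:.2f}")
    # The temperature converges toward the goal as the error shrinks each cycle.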

Drawing on similar concerns held by Marx, Mumford, and Beniger alike,

Wiener became increasingly apprehensive about the emergence of what he called

the “second industrial revolution” (Wiener, 1965, pp. 37-38). He believed that the

integration of new systems for information processing into society would constitute

the remaking of society – “the second industrial revolution” and “the automatic

age” – destined to affect all aspects of society. His solution was “to have a society

based on human values other than buying and selling” (Wiener, 1965, p. 38;

emphasis added). Achieving this, Wiener argued, would take decades of effort and would radically change the world: Workers would need to adjust to radical changes in

the work place; governments must establish new laws and regulations; industry

and businesses must create new policies and practices; professional organizations

must develop new codes of conduct for their members; sociologists and

psychologists must study and understand new social and psychological

phenomena; and philosophers must rethink and redefine old social and ethical

concepts.

Wiener embarked on this rethinking of existing social and ethical concepts

with the publication of his monumental book The Human Use of Human Beings

(1988) in 1950. Invoking Plato’s desire for the attainment of episteme, Wiener

outlines the purpose of a “good human life,” one in which “great human values”

are realized and the creative and flexible information-processing potential of “the

human sensorium” enables humans to reach their full promise in variety and

possibility of action (Wiener, 1988, p. 51). Wiener then outlined four key ethical

principles against which any new technology or practice should be measured to

advance and facilitate the good consequences of technology while preventing or

minimizing the harmful ones (Wiener, 1988). Terrell Ward Bynum (2000)

provides a convenient summary based on Wiener’s text:

- The Principle of Freedom – Justice requires “the liberty of each human
being to develop in his freedom the full measure of the human possibilities
embodied in him.”
- The Principle of Equality – Justice requires “the equality by which what is
just for A and B remains just when the positions of A and B are
interchanged.”
- The Principle of Benevolence – Justice requires “a good will between man
and man that knows no limits short of those of humanity itself.”
- The Principle of Minimum Infringement of Freedom – “What compulsion
the very existence of the community and the state may demand must be
exercised in such a way as to produce no unnecessary infringement of
freedom.”

While Wiener did not use the term “computer ethics,” these principles outlined in

The Human Use of Human Beings laid the foundation for future computer ethics

research and analysis.

In the decades following Wiener’s expressed concerns, various

philosophers and ethicists built the discipline of “computer ethics.” At the center

of this effort were Deborah Johnson and James Moor. In her landmark book,

Computer Ethics, Johnson (1985) defined the field as one that studies the way in

which computers “pose new versions of standard moral problems and moral

dilemmas, exacerbating the old problems, and forcing us to apply ordinary moral

norms in uncharted realms” (p. 1). Johnson addresses key categories of ethical

issues in the context of computer and information technology including

professional ethics, privacy, property, accountability, and social implications of

technology. Johnson questions, however, whether such ethical issues are unique

to computer technology, or whether new information technology merely brings

long-known and debated ethical dilemmas to light. In the third edition of

Computer Ethics, Johnson openly contemplates this pivotal question: “What is

computer ethics?” (2001, p. vii; emphasis added).

James Moor, however, had already provided a compelling answer to Johnson’s

question in his equally foundational essay “What Is Computer Ethics?” (1985),

where he argues for a much broader definition of computer ethics than Johnson.

For Moor, computer ethics can be considered distinct from existing ethical theories

and applications, spurred by specific issues directly related to the use of

information technology:

A typical problem in computer ethics arises because there is a policy


vacuum about how computer technology should be used. Computers
provide us with new capabilities and these in turn give us new choices for
action. Often, either no policies for conduct in these situations exist or
existing policies seem inadequate. A central task of computer ethics is to
determine what we should do in such cases, that is, formulate policies to
guide our actions…. One difficulty is that along with a policy vacuum
there is often a conceptual vacuum. Although a problem in computer
ethics may seem clear initially, a little reflection reveals a conceptual
muddle. What is needed in such cases is an analysis that provides a
coherent conceptual framework within which to formulate a policy for
action. (Moor, 1985, p. 266)

In contrast to Johnson, Moor is calling for new ethical frameworks to address the

“conceptual muddle” that thinking about computer ethics inevitably produces.

And while Moor agrees with Johnson that “not all ethical situations involving

computers are central to computer ethics” (1985, p. 267), he insists that “because

computer technology provides us with new possibilities for acting, new values

emerge” (1985, p. 266). In short, “computer ethics requires us to think anew about

the nature of computer technology and our values” (Moor, 1985, p. 268; emphasis

added). Moor’s appeal to considerations of computer technology’s relationship to

values points to an important turn in computer ethics – the identification and

analysis of the impacts of information technology upon human values like trust,

accountability, informed consent, freedom from bias, autonomy, privacy, and

justice.

A significant challenge, however, to the application of computer ethics to

identify the ways in which information technologies bear on human values is what

Moor calls the “invisibility factor”:

Most of the time and under most conditions computer operations are
invisible. One may be quite knowledgeable about the inputs and outputs of
a computer and only dimly aware of the internal processing. (Moor, 1985,
p. 272)

These hidden internal operations can be intentionally employed for unethical

purposes – what Moor calls “invisible abuse” – or might be manifested as

“invisible programming values” embedded within the technology’s design or

programming code (1985, p. 273). This invisibility factor is present within many

computer and information technologies – recall Galloway and Lessig’s concern

with the generally hidden protocols and underlying architecture of the Internet –

and presents the primary dilemma for Moor’s conception of computer ethics:

…this invisibility…makes us vulnerable. We are open to invisible abuse


or invisible programming of inappropriate values or invisible
miscalculation. The challenge for computer ethics is to formulate policies
which will help us deal with this dilemma. (1985, p. 275)

Philip Brey, a Dutch philosopher of technology, answers Moor’s challenge

with his call for computer ethics to become more “disclosive,” to be “centrally

concerned with the moral deciphering of computer technology” (Brey, 2000, p.

11) to help eliminate the invisibility factor of many advanced information

technologies. Brey’s new disclosive computer ethics distinguishes itself from

traditional computer ethics by “disclosing and evaluating [the] embedded

normativity in computer systems, applications and practices” and through its

“description of computer technology and related practices in a way that reveals

their moral importance” (2000, p. 12). Brey criticizes mainstream computer ethics

– such as Johnson’s positioning of the discipline, and to some extent, Moor’s –

for being limited to the analysis of ethical issues for which there is a pre-existing

and identifiable policy vacuum. What is left out, according to Brey, are

“computer-related practices that are not (yet) morally controversial, but that

nevertheless have moral import” (2000, p. 11).

Brey describes these practices that have moral import but that are not yet

generally recognized as controversial as “opaque.” This kind of moral opacity can

occur for two reasons. Some computer-related practices might simply be

unfamiliar or unknown to most people (due to lack of widespread experience or

media coverage, for example). In such cases, a critical function of computer ethics

should be to “identify, analyze, morally evaluate and devise policy guidelines”

(Brey, 2000, p. 11). The second way in which moral opacity may arise is when a

practice is familiar in its basic form, but is not recognized as having moral

implications: “the hardware, software, techniques and procedures used in

computing practice often has the appearance of moral neutrality when in fact they

are not morally neutral” (Brey, 2000, p. 11). Disclosive computer ethics, then,

must put the technological artifact itself under moral scrutiny, “independently

from, and prior to, particular ways of using them” (Brey, 2000, p. 11).

To summarize, disclosive computer ethics works to uncover the moral

issues and features in technologies that had not until then gained much

recognition, while focusing on the particular design features of computer

technology. Its major contribution to computer ethics, and the philosophy of

technology overall, is in the “description of computer technology and related

practices in a way that reveals their moral importance” (Brey, 2000, p. 12).

Summary

Media ecology and the social construction of technology emerge as a

unique pairing to understanding the symbiotic relationship between technology

and society. Following the fundamental concerns of Plato through to Mumford,

advocates of media ecology strive to understand how technology affects human

perception, understanding, feeling and value, and how our interaction with

technology facilitates or impedes our chances for survival in this increasingly

complicated world. The answer, Postman reminds us, is deceptively simple: “It

changes everything” (1992, p. 18). Meanwhile, social constructivists seek to

understand the other side of the coin, how various social and cultural forces led to

the emergence of particular technologies in the first place. By understanding the

wider context from which technologies emerge, SCOT advocates argue, can we

better understand the complexities of our society, including technology’s role:

“Our technologies mirror our societies. They reproduce and embody the complex

interplay of professional, technical, economic, and political factors” (p. 3).

The contemporary philosophies of technology outlined thereafter focus

these perspectives of technology’s role in society squarely onto issues of politics

and the exercise of power. Mumford shows concern that certain technologies

might threaten the preservation of democratic principles within society, fostering

a new authoritarian technocratic system. Winner’s entire theory of technological

politics argues that conditions of power and authority are often deeply embedded

in technical devices and systems, either inherently or through the ways they settle

political issues. Finally, Beniger, Deleuze, Lessig, and Galloway recognize that

the growing ubiquity and importance of information technology inevitably leads

to a social and political environment based largely on efficiency and control.

Deleuze’s argument that the “digital language of control is made up of codes

indicating whether access to some information should be allowed or denied”

(1995, p. 180), can be applied to Beniger’s history of the rise of the information

society, as well as to Lessig’s and Galloway’s treatment of modern Internet

infrastructures. In terms of politics and power, all agree that a system’s design

will affect the freedoms and control that the system enables.

Understanding how technologies affect varying levels of freedom and

control, Wiener identified key ethical principles against which any new

technology or practice should be measured to advance and facilitate the good

consequences of technology while preventing or minimizing the harmful ones.

Wiener’s efforts led to the establishment of computer ethics, which provided the

philosophical tools for weighing technology against various moral and ethical

principles. In that effort, Brey has called for disclosive computer ethics to uncover

and morally evaluate the values and norms embedded in the design of computer

and information technology, and to “make potentially morally controversial

computer features and practices visible” (2000, p. 13).

Chapter Summary: A Faustian Bargain

Recounting Plato’s parable of the invention of writing in the Phaedrus

discussed at the beginning of this chapter, Thamus criticizes the god Theuth’s

enthusiasm for the supposed benefits of writing as an elixir of memory, claiming

instead that the practice of writing will ultimately weaken memory, illustrating an

anxiety that an over-reliance on arts that imitate knowledge (techne) threatens the

ability to achieve true wisdom and knowledge (episteme). Thamus recognized the

unintended consequences of writing, the social and cultural biases embedded

within this new techne unrecognizable by its supporters. Thamus’ judgment,

however, is incomplete, as pointed out by Neil Postman (1992):

The error is not in his claim that writing will damage memory and create
false wisdom. It is demonstrable that writing has had such an effect.
Thamus’ error is in his claim that writing will be a burden to society and
nothing but a burden. For all his wisdom, he fails to imagine what
writing’s benefits might be, which, as we know, have been considerable.
(p. 4)

History has shown us how numerous tools and technologies that imitate

knowledge have indeed benefited the acquisition of wisdom, the pursuit of

episteme. Even before writing, diverse forms of techne emerged to imitate and

represent knowledge, including cave painting, textile patterns, and the knotted

strings of Incan quipu (see Crowley & Heyer, 2007). Later came clay and

cuneiform impressions, reeds and hieroglyphics, bamboo and ideograms, the

alphabet, parchment and paper, charts and maps, monastic manuscripts, codices

and encyclopedia, and libraries of knowledge. The imitation of knowledge

reached new scope with the recent evolution of electronic and digital forms of

techne, such as the telegraph, telephone, radio, television, and digital computer

networks. As the techne of antiquity evolved into today’s information

technologies,18 it benefited the pursuit of episteme considerably.

Yet, as with writing, the anxieties of Thamus have not subsided –

unforeseen consequences persist with today’s advanced forms of techne. The

invention of the printing press, for example, fostered the modern idea of

individuality, but it destroyed the medieval sense of community and social

integration. It made modern science possible but transformed religious sensibility

into an exercise in superstition (see, generally, Eisenstein, 1979). Large

commercial newspapers fostered the wide distribution of news and information,

but also caused a transformation of the public sphere from a space of open and

public communication into a commercial space increasingly controlled by

political and economic interests (Habermas, 1992). The bright lights and moving

images of television might have fostered a new “global village,” but they also

have transformed news and political discourse into entertainment, consisting of

little more than sound bites tailored for the cameras (Postman, 1985). And while

many hope the Internet will recharge a diverse public sphere of deliberative

discourse, others fear it may weaken democracy because it allows citizens to

isolate themselves within groups that share their own views and experiences, and

thus cut themselves off from any information that might challenge their beliefs

(Sunstein, 2001).

18 Considering that techne is defined as an art or craft that is imitative of knowledge, all forms of information technology can claim a lineage to this ancient concept. Just as writing imitates memory by relying on words on a page, audio recordings imitate sound, photographs imitate what is visible, video imitates motion, and the language of computers can imitate any form of information translatable into binary code. All information technologies are imitative of knowledge through representation, storage, and mediation.

It is inescapable, then, that every society must negotiate with its new

technologies, balancing benefits against unanticipated consequences. Much of

Neil Postman’s contribution to the philosophical study of technology is to expose

this struggle between society and its technology. His speech to a conference of

technologists in 1990 presents this delicate balance between the benefits and

unanticipated consequences of technology:

[A]nyone who has studied the history of technology knows that


technological change is always a Faustian bargain: Technology giveth and
technology taketh away, and not always in equal measure. A new
technology sometimes creates more than it destroys. Sometimes, it
destroys more than it creates. But it is never one-sided. … Another way of
saying this is that a new technology tends to favor some groups of people
and harms other groups. School teachers, for example, will, in the long
run, probably be made obsolete by television, as blacksmiths were made
obsolete by the automobile, as balladeers were made obsolete by the
printing press. Technological change, in other words, always results in
winners and losers. (Postman, 1990)

Postman also feared that the Faustian bargain society strikes with its technology

often becomes hidden, subsumed in some wider ideology of efficiency and

utopian vision of technological progress:

[In] cultures that have a democratic ethos, relatively weak traditions, and a
high receptivity to new technologies, everyone is inclined to be
enthusiastic about technological change, believing that its benefits will
eventually spread evenly among the entire population. Especially in the

United States, where the lust for what is new has no bounds, do we find
this childlike conviction most widely held. (Postman, 1992, p. 11)

As a result, certain questions about the adoption of technology remain ignored in

the face of this technological enthusiasm: “from whose point of view the

efficiency is warranted or what might be its costs? …Whom will the technology

give greater power and freedom? And whose power and freedom will be reduced

by it?” (Postman, 1992, p. 11). Eventually, society succumbs to the promises

made by technology, ignoring such issues of power, equality, and freedom. While

Faust traded his soul to devil in exchange for unlimited knowledge, Postman

feared that society is sacrificing its core values to blindly satisfy its desire for

technological progress.

This Faustian bargain between society and its technologies permeates the theories of technology presented in this chapter. The contrasting

views of Marx and Kapp on the social impact of the technologization of our world in the aftermath of mass industrialization speak directly to how technologies both

“giveth and taketh away.” Like Mumford’s historical treatment in Technics and

Civilization, Ellul, Marcuse, and Heidegger focus their philosophical critique on

how technologies “taketh away” from society. They viewed technology as an

autonomous force ushering in a new mode of existence based on calculative,

analytic, and efficient thinking – an existence that threatened to subsume all other

ways of living our lives and relating to our world. In their view, most members of

society were losers in the Faustian bargain with technology, caught up in the

rhetoric of efficiency, consumerism, and technological progress. The dangers of

blind acceptance of the Faustian bargain appear in Mumford’s later exploration

of authoritarian technologies, where society’s unqualified acceptance of an

authoritarian technocratic system, complete with its “panoply of tranquillizers and

sedatives and aphrodisiacs” (Mumford, 1964, p. 7) threatens the survival of

democracy. Winner and Beniger present similar arguments: The increased

ubiquity of socio-technological systems that require authoritative forms of

management – made possible by the allure of the Faustian bargain – results in the

normalization and naturalization of authoritative power relationships. And since

the power relationships enabled by the Faustian bargain are rarely questioned,

Deleuze and Galloway’s warnings about the levels of control implicated by and

embedded within our technologies remain ignored. Lessig cautioned, “how a

system is designed will affect the freedoms and control the system

enables” (Lessig, 2001, p. 35), but, if Postman’s fears are correct, and the Faustian

bargain between society and technology becomes increasingly irresistible, society

will lose the ability to recognize technology’s impact on the freedoms we

traditionally enjoy.

Upon listening to Theuth’s praise of writing – Theuth’s own invention –

Thamus replied: “Most ingenious Theuth, one man has the ability to beget arts,

but the ability to judge of their usefulness or harmfulness to their users belongs to

another” (Plato, 1990, p. 274e). Here, Thamus understood the limitations of

designers and inventors in grasping the social, cultural, or ethical implications of

their own technologies. Instead, moral and ethical philosophers have taken up this

duty. Building upon the work of Wiener, Johnson, and Moor, Brey urges the

practice of disclosive computer ethics to try to make apparent the ethical

implications of technology that remain hidden behind the alluring veil of the

Faustian bargain.

This dissertation builds upon this broad foundation and will expose the

perfect search engine as a political technology embroiled in a Faustian bargain of

its own, exercising power and control through its vast information processing

capacities, bringing with it particular value and ethical implications. It will

explore how the perfect search engine empowers the widespread capture of

personal information flows across the Internet, threatening the ability to engage in

online social, cultural, and intellectual activities free from answerability and

oversight, thereby bearing on the values of privacy, autonomy, and liberty. It will

answer Brey’s call for disclosive computer ethics and attempt to attain

Heidegger’s desire for “releasement” and a more harmonious relationship with

our technological world through the value-conscious design of these vital

knowledge tools, shedding the “blinders” in which our Faustian bargain has

shrouded us.


CHAPTER III

THE QUEST FOR THE PERFECT SEARCH ENGINE

Introduction

In order to examine how the quest for the perfect search engine empowers

the widespread capture of personal information flows across the Internet, we must

first understand how search engines work, as well as the historical context from

which they emerged. As it turns out, this is no small feat; it requires a rather deep

technical understanding of the Web, of search engine design, and of motivations

for the quest for the perfect search engine. This chapter presents a technical

overview of these concepts, written with the nontechnical reader in mind.

Although the discussion will be far from comprehensive – indeed, many

engineering and computer science dissertations have been devoted to explaining

just one dimension of Web search engines – this chapter will develop a basic

understanding of the relevant technologies and the motivations behind their

creation. Most importantly, this chapter (combined with the next) represents the

technical investigation of the design of the perfect search engine to uncover how

its technological properties and underlying architecture might bear on ethical

and human values. Such an investigation is necessary to engage in the value-

conscious design of these important information technologies.


A Brief History of the Internet and World Wide Web

The Internet

The Internet is a worldwide, publicly accessible network of interconnected

computer networks that transmit data by packet switching using a standardized set

of TCP/IP protocols (Hall, 2000).19 A “network of networks,” the Internet consists

of thousands of smaller domestic, academic, business, and government networks,

which together carry various information and services, such as electronic mail,

online chat, file transfer, and the interlinked Web pages and other documents of

the World Wide Web.20
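The packet-switching idea named in this definition can be illustrated with a toy sketch. This is not real TCP/IP – the message and packet size are invented – but it captures the principle: a message is cut into small, individually numbered packets that may travel and arrive independently, and sequence numbers let the receiver reassemble the original.

    # A toy illustration of packet switching (not real TCP/IP): cut a message
    # into numbered packets, deliver them out of order, and reassemble.
    import random

    message = "Hello, Internet!"
    PACKET_SIZE = 4

    packets = [(seq, message[i:i + PACKET_SIZE])
               for seq, i in enumerate(range(0, len(message), PACKET_SIZE))]

    random.shuffle(packets)  # packets may arrive in any order

    reassembled = "".join(chunk for _, chunk in sorted(packets))
    assert reassembled == message  # sequence numbers restore the original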

Since its inception in 1969, the Internet has grown from just four host

computers located in American university computer laboratories to more than 400

million global locations by July 2006 (Internet Systems Consortium, 2006), with

over 1 billion users worldwide (Internet World Stats, 2006). During this relatively

short history, the Internet has “revolutionized the computer and communications

world” through its simultaneous use as “a worldwide broadcasting capability, a

mechanism for information dissemination, and a medium for collaboration and

interaction between individuals and their computers without regard for geographic

location” (Leiner et al., 2003).

19 Certainly, the Internet is not completely worldwide in its reach, and not publicly accessible by all populations. Despite these important “digital divide” concerns, the Internet stands as a unique medium in its expansive reach and general openness.

20 See below.

Fueling this phenomenal growth of the Internet was the development of

innovative protocols for accessing the wealth of information increasingly

distributed across the network. It became widely accessible to the scholarly,

business and consumer communities as a research and communication tool near

the end of the 1980s with the emergence of the FTP (file transfer protocol) and

Gopher protocols. The FTP protocol was designed to connect two computers over

the Internet for the efficient sharing and transfer of files (Network Working

Group, 1985). While usable directly by a person at a computer terminal, FTP was

designed mainly for use by programs to remotely connect to other computers on

the Internet in order to access and transfer needed files automatically. The Gopher

protocol, on the other hand, was designed specifically for use by people.

Developed in 1991 by computer scientists at the University of Minnesota (whose

sports teams are called the Golden Gophers), Gopher presented files in

hierarchical menus for easier and more intuitive navigation (Network Working

Group, 1993). Designed to act as a distributed document delivery system, a file on

a Gopher server can be linked to as a menu item from any other Gopher server.

Gopher’s unique cross-server, non-linear, and non-hierarchical linking of files

helped to usher in a new paradigm for navigating distributed information across

remote computers: the hypertext link.
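The mechanics of this cross-server linking are visible in Gopher’s menu format. The sketch below parses two invented menu lines of the kind a Gopher server returns: each tab-separated entry carries an item type, a display string, a selector, and the host and port where the item lives, so a menu on one server can point directly at a document held on another.

    # Hypothetical Gopher menu lines (the hosts and selectors are invented).
    # Each tab-separated line gives an item type ('0' = text file, '1' =
    # submenu), a display string, a selector, a host, and a port -- so any
    # menu item can point at a document on an entirely different server.
    menu = (
        "1Campus Information\t/campus\tgopher.example.edu\t70\r\n"
        "0Library Hours\t/library/hours.txt\tgopher.other.example.org\t70\r\n"
    )

    for line in menu.splitlines():
        item_type, rest = line[0], line[1:]
        display, selector, host, port = rest.split("\t")
        kind = "submenu" if item_type == "1" else "document"
        print(f"{display!r}: a {kind} served by {host}:{port} (selector {selector})")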

Hypertext Links

A hypertext link (or, simply, hyperlink) is a navigational element

embedded in a document that links to another section of the same or different

document that, when selected, automatically delivers the linked information to the

user (Wikipedia contributors, 2006a). While often considered synonymous with

the Internet and World Wide Web, hyperlinks, in the most general sense, predate

these computerized networks, often by centuries.21 In our modern era, a type of

hyperlink was a prominent feature of Vannevar Bush’s (1945) inventive proposal

for the memex. Bush, a science advisor to President Franklin D. Roosevelt,

published “As We May Think” in 1945 as an attempt to mobilize the scientific

community after World War II to develop knowledge tools rather than military

tools. Mirroring the explosion of knowledge experienced during the European

Enlightenment, Bush realized that the amount of scientific data was growing at an

incredible pace in the first half of the twentieth century, and argued that people

needed to find new ways to organize and access information through the use of

new technology:

The summation of human experience is being expanded at a prodigious


rate, and the means we use for threading through the consequent maze to
the momentarily important item is the same as was used in the days of
square-rigged ships. (Bush, 1945, p. 102)

Further, Bush realized the constraints of the dominant systematic method of

information organization:

Our ineptitude in getting at the record is largely caused by the artificiality


of systems of indexing. When data of any sort are placed in storage, they
are filed alphabetically or numerically, and information is found (when it
is) by tracing it down from subclass to subclass. It can be in only one
place, unless duplicates are used; one has to have rules as to which path
will locate it, and the rules are cumbersome. Having found one item,
moreover, one has to emerge from the system and re-enter on a new path.
(Bush, 1945, p. 106)

21 Diderot’s 18th-century Encyclopédie, for example, featured the widespread use of renvois, a system of hyperlink-styled cross references to link articles with related – both complementary and oppositional – ideas or arguments (see Brewer & Hayes, 2002).
Here, Bush recognized the limitations of interacting with a system through a rigid

data structure: If the data is stored in classes and subclasses in a database, then

users can only navigate the database as required by its data structure – via those

precise classes and subclasses – rather than by their own interests or personal

method of information organization and retrieval. Bush’s goal, then, was to invent

new knowledge tools to help users locate, organize, coordinate, and navigate

through ever-increasing amounts of information, and to free them from the

constraints of rigid systems of classification and data organization.

What made a piece of information valuable, Bush suggested, was not the

overarching class or category that it belonged to, but rather its connections to

other data. As a solution, Bush proposed the “memex,” a mechanical knowledge

tool, half microfilm machine and half computer, to support the process of thinking

through “associative indexing” (Bush, 1945, p. 107):

A memex is a device in which an individual stores all his books, records,


and communications, and which is mechanized so that it may be consulted
with exceeding speed and flexibility. It is an enlarged intimate supplement
to his memory. (pp. 106-107)

The memex would aid the process of thinking through a mechanized indexing

system, in which different pieces of information in the indexing system could be

connected together by creating individualized associative trails. A trail was

analogous to the trail of mental association in the user’s mind: A memex user

builds a “trail of interest through the maze of materials available to him” (Bush,

1945, p. 107) as he explores the collection of knowledge presented.

Bush’s system implied a profound shift in the way we grapple with

information. Before the memex, our access to knowledge was constrained by an

imperfect indexing system that used alphabetical and systematic classifications as

means of organizing the “summation of human experience” (Bush, 1945, p. 107).

The memex’s associative trails represent a way of organizing and navigating

information that frees users from the strict, inflexible dictates of systematic or

alphabetic conventions. Documents can be connected for more

elusive, transient, or personal reasons, and each item might have many trails

leading to it. As Steven Johnson relates, “The Memex wouldn’t see the world as

the librarian does, as an endless series of items to be filed away on the proper

shelf. It would see the world the way a poet does: a world teeming with

associations, minglings, continuities” (Johnson, 1997, p. 119).

Bush’s vision for the memex remained unrealized, but he inspired another

pioneer in knowledge tools, Ted Nelson, who wrote twenty years later of a new

knowledge tool that would enable users to publish and access information in a

similar nonlinear and interlinked format. Coined hypertext by Nelson for use in

his Project Xanadu (see Nelson, 1987, 1993), the hypertext link was meant to

foster an open-ended, nonsequential assembly of knowledge:

The ultimate hypertext goal is the global accumulation of knowledge…. It


would be a universal publishing system where every interested person has
direct access to humanity’s accumulated knowledge – in effect, the
ultimate publishing system where each person is both contributor and user.
(qtd in Stockwell, 2001, p. 168)

Hypertext would allow people to create, annotate, link together, and share

information from a variety of sources and media. Nelson’s vision involved

implementation of a “docuverse” where all data was stored once, there were no

deletions, and all information was accessible by a link from anywhere else.

Navigation through the information would be non-linear, depending on each

individual's choice of links. Enabled by hypertext, users would no longer be

constrained to read or navigate information in any particular order, but could

follow links in and out of documents at random, the path being determined by the

needs and interests of each individual reader. Hypertext was designed to work in a fashion

similar to the way people actually think – not by processing information in a

linear way from A to B to C – but by free association, starting with A but taking

you to G or M or F before you continue sequentially (if at all) to B.

Together, Bush’s vision for the memex and Nelson’s development of

hypertext spawned a new generation of innovative information technologies that

provided novel means of access to information while also promoting user

autonomy, liberating users to navigate and explore information free from the

rigidity of predetermined structures. By the late twentieth century, their collective

visions were increasingly being realized in CD-ROM-based multimedia

dictionaries and encyclopedias utilizing hypertext links to connect text, animation,

audio, and video in ways never imagined by Diderot or Webster. Other

applications of hypertext during this pre-Web era included Apple Computer’s

HyperCard application program and XEROX’s NoteCards system. While other

minor hypertext applications came and went, the full potential of nonlinear and

nonsequential linking of information via hypertext was brought to its fullest

fruition by Tim Berners-Lee with his creation of the World Wide Web.

The World Wide Web

Tim Berners-Lee’s development of the World Wide Web was among the

first large-scale networks built around the hypertext link:

The fundamental principle behind the Web was that once someone
somewhere made available a document, database, graphic, sound, video,
or screen…it should be accessible…by anyone, with any type of
computer, in any country. And it should be possible to make a reference –
a link – to that thing, so others could find it. (Berners-Lee, 2000, p. 37)

Berners-Lee recognized that “there was a power in arranging ideas in an

unconstrained, weblike way” (Berners-Lee, 2000, p. 3) and, following the vision

of both Bush and Nelson, he understood the human mind’s ability to link random

bits of data, enabling the creation of an online information-space where anything

could be linked to anything – a Web of information:

Suppose all the information stored on computers everywhere were linked.


…Suppose I could program my computer to create a space in which
anything could be linked to anything. …Once a bit of information in that
space was labeled with an address, I could tell my computer to get it. By
being able to reference anything with equal ease, a computer could
represent associations between things that might seem unrelated but
somehow did, in fact, share a relationship. A Web of information would
form. (Berners-Lee, 2000, p. 4)

In 1980, while an independent contractor at the CERN particle physics laboratory,

Berners-Lee proposed a project based on the concept of hypertext to facilitate

sharing and updating information among researchers within the facility, and,

importantly, to escape from the “straitjacket of hierarchical documentation

systems” (Berners-Lee, 2000, p. 21). Supported by his HyperText Markup

Language (HTML), a simple method for encoding text files with links to other

documents, and the HyperText Transfer Protocol (HTTP), a protocol for

transferring HTML documents upon request to other computers over a network,

Berners-Lee announced the debut of the World Wide Web as a publicly available

service in 1991:

The WorldWideWeb (WWW) project aims to allow links to be made to


any information anywhere. …The WWW project was started to allow high
energy physicists to share data, news, and documentation. We are very
interested in spreading the Web to other areas, and having gateway servers
for other data. Collaborators welcome!22
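Both inventions can be made concrete with a brief sketch. The code below is illustrative rather than Berners-Lee’s own: the HTML fragment shows how a plain-text file embeds a hyperlink through the anchor element, and the Python standard library is used to issue the kind of HTTP request by which such documents are fetched (the host is CERN’s historic server).

    # A minimal sketch of HTML and HTTP (illustrative, not Berners-Lee's code).
    # First, HTML: a plain-text document whose <a href="..."> element embeds
    # a hyperlink to another document anywhere on the network.
    page = """<html>
      <body>
        <p>See the <a href="http://info.cern.ch/">first Web server</a>.</p>
      </body>
    </html>"""

    # Second, HTTP: a simple request/response protocol for fetching documents
    # by address.
    import http.client

    conn = http.client.HTTPConnection("info.cern.ch", timeout=10)
    conn.request("GET", "/")         # "send me the document named /"
    response = conn.getresponse()    # a status line, headers, then the HTML
    print(response.status, response.reason)
    conn.close()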

Utilizing hypertextual linking, the WWW allowed documents, ideas, and

concepts to be stored and shared in ways similar to Bush’s call for associative

trails to guide information navigation, and Nelson’s vision for a “nonsequential

assembly of ideas.” By releasing the World Wide Web for public use, Berners-

Lee hoped that implementation of his HTML and HTTP protocols would become

widespread. His strategy worked, and from the first Web site created by Berners-

Lee in 1991, the Web has expanded to over 100 million Web sites in late 2006

(Netcraft, 2006), representing over 11.5 billion indexable23 Web pages (Gulli &

Signorini, 2005).

22 Archive of original e-mail to the alt.hypertext message board is available at http://www.w3.org/People/Berners-Lee/1991/08/art-6484.txt.
23 The indexable Web is that portion of the World Wide Web that is indexed by conventional search engines, which will be discussed in more detail below. For various reasons (e.g., the Robots Exclusion Standard, links generated by JavaScript and Flash, password-protection, dynamic pages temporarily created in response to a user action) some pages cannot be indexed by traditional search engines. These “invisible” pages are referred to as the Deep Web. It is estimated that the deep Web is several magnitudes larger than the indexable Web (Bergman, 2001).
Early Internet and Web Navigation Tools

FTP, Gopher, and WAIS

The Internet’s growth as a massive, dynamic repository of the world’s information continues at an unprecedented rate. It has become impossible to even begin to understand the depth and breadth of information that resides in the

billions of Web pages scattered across the global network. Creating tools to locate

and navigate these information spaces has become a priority as more and more of

the public go online for their information-seeking needs. One of the first systems

for locating information on the Internet was Archie, initially developed in 1989 at

McGill University (Emtage & Deutsch, 1992). Archie – a play on the term

“archive” – was essentially an index to thousands of FTP sites around the Internet.

Periodically, Archie would fetch directory listings from a list of pre-determined

FTP sites and compile a master index, mirroring the file structure of the servers

indexed. Users could submit word queries through a command line interface or

via e-mail. The queries were processed against the complied index, and a set of

matching directories or files were returned. In the early 1990s, two other tools,

Veronica and Jughead, were developed to help locate documents across remote

Gopher servers.24 Veronica consisted of a periodically updated database of the

names of almost every menu item on thousands of Gopher servers (Wikipedia contributors, 2006c), while Jughead was designed to search only within one

Gopher server at a time (Wikipedia contributors, 2006b). Another early Internet

search system was the Wide Area Information Server (WAIS), a commercial

software package that allowed the indexing of large quantities of information, and

then made those indices searchable across the Internet. Examples of information

found on WAIS were the Dow Jones stock listings, United States government

documents, and the Library of Congress archives (Gillies & Cailliau, 2000).

24 Since Archie, coincidentally, was the name of a popular American comic book character, the Veronica and Jughead systems were named after other characters from the same comic series.
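The index-and-query pattern shared by these early tools can be approximated in a few lines. In the sketch below, the sites and file listings are invented stand-ins for what Archie periodically fetched from FTP servers; the master index simply maps filenames to the places they can be found, and a query is a substring match against those names.

    # An Archie-style index in miniature (sites and listings are invented).
    listings = {
        "ftp.example.edu": ["/pub/gnu/emacs-18.59.tar", "/pub/tex/dvips.tar"],
        "ftp.example.org": ["/mirrors/gnu/emacs-18.59.tar"],
    }

    # Compile the master index: filename -> every (site, path) holding it.
    index = {}
    for site, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[1]
            index.setdefault(filename, []).append((site, path))

    def query(term: str):
        """Return every (site, path) whose filename contains the term."""
        return [hit for name, hits in index.items() if term in name
                for hit in hits]

    print(query("emacs"))  # both sites holding emacs-18.59.tar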

The Gopher and WAIS search tools were barely more than a couple of

years old when they were overwhelmed by the rapid development of the World

Wide Web. Because Berners-Lee’s World Wide Web protocols enabled users to

create simple pages of information on the Web with relatively little effort or

expertise, pages could be added to the Web without any particular organization or

independent evaluation of the quality, location, or nature of the information.

While the lowered barriers to entry and decentralized nature of the Web were

considered among its most significant characteristics, the resulting “rummage sale

of information on the World Wide Web” (Bowker & Star, 1999, p. 7) became

increasingly difficult to navigate. An early process for finding resources on the

World Wide Web was little more than word of mouth – a colleague telling other

colleagues about an interesting or new Web site. Lists of new and notable Web

links could also be found online, such as the “What’s New” section of the

homepage of Netscape, the default start page of one of the first widely used Web

browsers.25 Paper listings of Web sites were also published, but, as Candy

Schwartz recognized early on, “print publishing has never been a particularly

appropriate method for keeping up to date with Internet resources” (Schwartz,

1998, p. 974).26

25 Archives of Netscape’s “What’s New” pages are viewable at http://wp.netscape.com/home/whatsnew/.

26 I recall purchasing a printed directory of the World Wide Web in the late 1990s. The size of a large city’s phone book, it became obsolete in a matter of months.

By the mid 1990s, as the Web continued to grow exponentially in size and

complexity, it became increasingly difficult to locate a particular Web page unless

the user was already aware of the location of the information resource.

Researchers began to look for methods that could add some sense of organization

and navigability to the Web. A solution emerged that borrowed from traditional information organization principles in library and information science: Web directories.

Web Directories

Web directories provide a structured hierarchy of websites, typically

organized by subject in a manner similar to a library classification scheme. For

example, a user seeking sites on Web graphics would drill down from a broad top-level category (such as Computers and Internet) through successively narrower subcategories until reaching the relevant listings.
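A sketch makes this structure concrete. The categories and addresses below are invented, but they show the essential design: every site occupies a fixed position in a hand-built tree, and finding it means correctly guessing the path of categories its editors chose.

    # A Web directory in miniature: a hand-edited tree of categories with
    # sites filed at the leaves. All names and URLs here are invented.
    directory = {
        "Computers and Internet": {
            "Graphics": {
                "Web Graphics": ["http://example.com/web-graphics",
                                 "http://example.org/design"],
            },
        },
        "Health": {},
    }

    def browse(tree, path):
        """Follow category names down the tree, as a user clicks through."""
        for category in path:
            tree = tree[category]  # a wrong guess raises KeyError: a dead end
        return tree

    print(browse(directory, ["Computers and Internet", "Graphics", "Web Graphics"]))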

The categorization is usually based on the whole website, rather than one page or

a set of keywords, and sites are often limited to inclusion in only one or two

categories. Web directories typically allow site owners to directly submit their site

for inclusion, and have human editors who review submissions for evaluation and

categorization.

One of the earliest Web directories was the World Wide Web Virtual

Library, derived from a collection of links originally maintained on Tim Berners-

Lee’s Web page at CERN.27 Berners-Lee describes its creation in 1992:

By now the Web consisted of a small number of servers, with info.cern.ch


the most interconnected with the rest. …When the list [of connected
servers] became larger, it needed to be organized, so I arranged it in two
lists, by geography and by subject matter. As more servers arrived, it was
exciting to see how the subjects filled out. Arthur Secret…set up the lists
into what we called the Virtual Library, with a tree structure that allowed
people to find things. (Berners-Lee, 2000, p. 55)

Today, the Virtual Library consists of 263 individual categories of sites, each

maintained by its own “librarian,” including experts from academia, industry, and

volunteers.28

27 Archived at http://www.w3.org/History/19921103-hypertext/hypertext/DataSources/bySubject/Overview.html.

28 Viewable at http://vlib.org/.

Another popular Web directory originated in the dorm room of two

Stanford University Ph.D. students. In January 1994, Jerry Yang and David Filo

began compiling a list classifying their favorite Web sites, “Jerry and David’s

Guide to the World Wide Web.” Within a year, their amateur and informal

directory had received over 1 million visitors from across the globe, and its name

was changed to the more memorable (and marketable) “Yahoo!”, chosen both for

its playful meaning and as an acronym for “Yet Another Hierarchical Officious

Oracle” (Yahoo!, 2005).

Hierarchy (the “h” in Yahoo!) was indeed a crucial aspect of Yahoo’s

Web directory. As the site grew and the number of links increased, its method of

sorting links to categories and subcategories needed organization and refinement.

Yang and Filo turned to Srinija Srinivasan, hired as Yahoo!’s fifth employee and

“Chief Ontologist,” who took Yahoo!’s extensive lists of Internet sites and

redefined its categorization to provide “a more holistic view of human

knowledge” (Srinivasan, 2005). Building from the ad hoc categories she inherited

from Yang and Filo, Srinivasan and her team of editors began slowly and

deliberately steering Yahoo!’s ontology toward this “holistic view,” adding and

reorganizing categories, as well as evaluating and reclassifying nearly every Web

page in Yahoo!’s directory (See Figure 1). Once the ontological categories

become stable, Jerry Yang argued, “We will have captured the breadth of human

knowledge” (Steinberg, 1996, p. 111).

While helping to organize the increasingly chaotic World Wide Web, Web

directories have four major drawbacks, two of a pragmatic nature, and two

ontological. First, because of the human role in selection and categorization of

sites, maintenance of Web directories is very labor intensive. Hundreds of

thousands of Web pages are created and updated daily, and human-driven Web

directories have difficulty keeping up with the rapid growth of the Web. Yahoo!,

for example, initially employed only 20 editors for its directory, allowing

consistent application of its categorization scheme. They quickly recognized,

however, the need to hire another “50 or 60 classifiers”; otherwise, the “percentage

of sites Yahoo! knows about will continue to shrink” (Steinberg, 1996, p. 111).

By comparison, a community of over 75,000 volunteer editors maintains the Open

Directory Project, one of the largest and most extensive human-edited directories

of Web links.29 As the Web continues to grow in size and complexity,

maintaining exhaustive human-edited directories becomes increasingly difficult.

Figure 1: The Yahoo! Web directory as seen on Oct 17, 1996.30

29. Number of editors listed at http://www.dmoz.org, although it is estimated that only around 10% are “active” (Wikipedia contributors, 2007c).
30. Archived at http://Web.archive.org/Web/19961017235908/http://www2.yahoo.com/.
The second pragmatic drawback of Web directories involves their user-

friendliness. Because of their hierarchical arrangement, most directories are

browsable, that is, users can click on a subject of interest to see pertinent links and

subcategories on the topic. Users, however, are dependent upon the Web editors’

lexicon, and it can prove difficult to discern exactly under which topic heading a

particular item has been classified. For example, a user seeking a website about

the painkiller Tylenol might not find a category of that name, but instead would

need to know to click through to the category “Acetaminophen,” which, in the
Open Directory Project, is buried several levels deep within the directory’s subject hierarchy.

Navigating the depths of such a hierarchical subject tree requires both familiarity

with the topic and its general vocabulary, as well as the patience to engage in
trial and error, as some clicks inevitably will result in dead-ends. To help alleviate
these challenges to successful navigation of Web directory listings, some of the

larger directories provide a simple search function (as shown in Figure 1),

allowing users to search for particular subject categories and page listings within

the directory (but typically not within the content of the linked Web pages).

Along with these two pragmatic shortcomings, the presence of human editors
combined with a relatively rigid and linear categorization schema prompts

ontological concerns over bias and authority in any attempt to, as Yahoo! puts it,

capture “the breadth of human knowledge” (Steinberg, 1996, p. 111). Relying on

humans to evaluate and place Web sites within the directory places them in a

position of ontological authority over which sites are included (and which are

not), and where in the hierarchy they belong. For example, should a link to

Planned Parenthood be categorized under the directory path “Health:

Reproductive Health” or “Social Issues: Family Planning” or “Medical services:

Abortion” – or, depending on one’s moral compass, should it be listed at all? A

directory’s human editors clearly have influence on the contents of Web

directories, and act as gatekeepers holding “the key to inclusion” for site owners

wishing to have their pages indexed (Introna & Nissenbaum, 2000, p. 171).

Beyond the influence of the biases and politics of a single editor’s actions,
the biases and politics of classification itself cast a shadow on the usefulness of
Web directories, and deserve greater attention. The drive for the classification

and categorization of knowledge can be traced back to the encyclopedists of early

modern Europe, rooted in the rational and scientific methods gaining dominance

at that time. Carolus Linnaeus’s classification system for separating animals and

plants into a hierarchical taxonomy prompted many encyclopedists to

“experiment with ways of arranging their subject matter in similar upside-down

pyramid fashions, with overarching general categories and subdivisions, all the

way down to specific topics” (Stockwell, 2001, p. 98). The division of topics into

structured hierarchies is meant to help reduce large sets of knowledge to a logical

and intelligible form. Supporters of systematic organization suggested that

encyclopedias and other reference works should be designed to resemble tree-like

diagrams showing movement from general to more specific propositions by

means of branching dichotomies (Yeo, 2001). The hierarchical classification

schemes of Web directories attempt to mimic such structures, moving from the

broadest categories down to their specific elements.

Yet, such strict hierarchical systems of classification can be problematic.

Linnaeus recognized that his classifications were “cultural constructs reflecting

human ignorance” (Headrick, 2000, p. 22). Or, as Bates (2002) observes in his

epistemological exploration of historical attempts to map knowledge, “any

division and classification must be somewhat arbitrary, because the complexity of

things does not lend itself to simple orders. All the distinctions between various

kinds of human knowledge must be decided, created, distributed…” (pp. 15-16).

Such knowledge maps, common in many systematically organized encyclopedias,

are problematic, Bates maintains, because “they reify particular orders and present

them as an objective reality. The individual map defines one version of the world

at the expense of other perspectives, excluding them with its appearance of

scientific ‘accuracy’” (p. 6).

Indeed, all classificatory nomenclatures can be criticized as constructs of

the mind, which imposes on its subjects an arbitrary pattern that distorts their

underlying reality. The imposition of such an arbitrary classification system

resonates with Michel Foucault’s reaction to Borges’ descriptions of a Chinese

encyclopedia which organizes the animal world according to a complex and

foreign system of criteria: “(i) frenzied, (j) innumerable, (k) drawn with a very

fine camelhair brush, …(m) having just broken the water pitcher” (Foucault,

1971, p. xv). What Foucault found most unsettling in Borges’ Chinese

encyclopedia is not the seemingly absurd categories that order the world of

animals so much as one particular category: “(h) those that are included in this

classification” (p. xv). Systematic order is fractured by this self-reflexive

category, and the “monstrous quality of the encyclopedic order is not the oddity of

juxtaposition but the destruction of a common ground for any order” (Bates, 2002,

p. 4). Such encyclopedic “order” represents not an ontological category, but only

a rhetorical performance, a linguistic act that defines and classifies in order to

exert control (Foucault, 1971, p. xx).

Geoffrey Bowker and Susan Leigh Star (1999) continue this criticism of

arbitrary classification systems, arguing that any such systems are inherently

political: “Systems of classification (and of standardization) form a juncture of

social organization, moral order, and layers of technical integration” (p. 33). They

stress that the “material force of classification systems” impacts our world

“epistemologically, politically, and ethically” (p. 10). Lucy Suchman (1997)

makes a similar claim in her argument that those who determine the classificatory

categories, and how such categories can and will be used, impute their own

personal values and ideologies into the system, exerting power over both the user

and the information itself.

The systematic organization of knowledge in Web directories, by

definition, arranges concepts according to a preconceived and rigid system of

categorization, a system that Foucault, Bowker & Star, and Suchman reveal to be

not only arbitrary, but often politically charged. While the systematic organization

of Web directories was meant to improve the ability to find information on the

rapidly expanding Web, their structure – like paper encyclopedias before them –

threatens to impart a dogmatic rigidity to the way Web sites and information are

presented. In his critical essay on ontological systems (including the Yahoo!

Directory), Clay Shirky (2005) outlines characteristics of situations where such

ontological classification is not advised (Table 1).

Table 1: Characteristics of situations where ontological classification is not advised (adapted from Shirky, 2005).

Domain:
• Large corpus
• No formal categories
• Unstable entities
• Unrestricted entities
• No clear edges

Participants:
• Uncoordinated users
• Amateur users
• Naive catalogers
• No authority

Comparing the World Wide Web to these characteristics reveals that relying on

ontological classification for organizing the Web is inherently problematic:

The list of factors making ontology a bad fit is, also, an almost perfect
description of the Web – largest corpus, most naive users, no global
authority, and so on. The more you push in the direction of scale, spread,

fluidity, flexibility, the harder it becomes to handle the expense of starting
a cataloguing system and the hassle of maintaining it, to say nothing of the
amount of force you have to get to exert over users to get them to drop
their own world view in favor of yours. (Shirky, 2005)

Shirky describes this as a problem with the “browse paradigm” of Web

organization and navigation:

Browse says the people making the ontology, the people doing the
categorization, have the responsibility to organize the world in advance.
Given this requirement, the views of the catalogers necessarily override
the user’s needs and the user’s view of the world. If you want something
that hasn’t been categorized in the way you think about it, you’re out of
luck. (Shirky, 2005)

Set against the growing complexity of the World Wide Web,

these four drawbacks of Web directories – the human labor required, the difficulty

of navigation, the potential for editors to act as gatekeepers, and the general

ontological debate over categorization itself – combine to make the “browse

paradigm” of navigation via Web directories untenable. Thus, as John Battelle

relates, a new paradigm was needed:

A hierarchical approach simply made sense for a public trying to


understand the wild and rather disorganized chaos of the early Web. As
surfers moved from a stance of exploration (“What’s out there?”) to
expectation (“I want to find something that I know is out there”), search as
a navigational metaphor began to make more sense. (Battelle, 2005, p. 61;
emphasis added)

And so, the Web search engine was born.

Web Search Engines

As an alternative to the beleaguered “browse paradigm,” Clay Shirky

offers what he calls the “search paradigm”:

The search paradigm says the reverse. It says nobody gets to tell you in
advance what it is you need. Search says that, at the moment that you are
looking for it, we will do our best to service it based on this link structure,
because we believe we can build a world where we don’t need the
hierarchy to coexist with the link structure. (Shirky, 2005)

Web search engines reflect the epitome of this new search paradigm, employing a

complex information retrieval system to automatically browse sites across the

Web, store their contents in a database, and automatically retrieve and rank results

based on a user’s specific search query. The first Web search engines were

developed in the mid-1990s, often as not-for-profit research projects at university

computer or information science departments (see Table 2).

In her review of the web search engine industry, Elizabeth Van Couvering

(forthcoming) has identified three distinct periods in the history of the search

engine marketplace (see Figure 2):

First, a period of technical entrepreneurship from 1994 to late 1997;


second, a period which was characterised by the development of portals
and vertical integration from late 1997 to the end of 2001, in which major
media companies and network providers attempted to buy their way into
the search arena; and finally a period of consolidation and “virtual”
integration from 2002 to the present day.

Of the twenty-one search ventures launched in this short period of time, only six

remain as fully independent search engine providers. And while the industry’s

roots were in the academic domain of university research laboratories, the market

is now dominated by large, multi-national, for-profit corporations like Google,

Yahoo!, and Microsoft.

Table 2: Early period search engine dates, institutions, and founders (adapted from Van Couvering, forthcoming).

Yahoo! (directory). Went live: February 1994. Institution: Stanford University. Developers: Jerry Yang and David Filo. Position at the time: Computer Science (CS) PhD students.

WebCrawler (engine). Went live: April 1994. Institution: University of Washington. Developer: Brian Pinkerton. Position at the time: PhD student in CS.

Lycos (engine). Went live: July 1994. Institution: Carnegie Mellon University. Developers: Dr Michael Mauldin, postdoctoral research fellow in CS, and Bob Leavitt.

Infoseek (engine). Went live: February 1995. Institution: n/a. Developer: Steve Kirsch, serial technology entrepreneur who founded Frame Technology and Mouse Systems; BA and MS from MIT.

OpenText (engine). Went live: April 1995. Institution: n/a. Developers: uncredited; an early provider of search interfaces to products such as the Oxford English Dictionary.

Magellan (directory). Went live: August 1995. Institution: n/a. Developers: Isabel and Christine Maxwell, daughters of publishing magnate Richard Maxwell, who originally published a print guide to the Web.

Excite (engine). Went live: September 1995. Institution: Stanford University. Developers: Graham Spence, Joe Krausz, Ben Lutch, Ryan McIntyre, Martin Reinfreid, and Mark Van Haren, recent CS graduates (apart from Krausz, who graduated in political science).

AltaVista (engine). Went live: December 1995. Institution: Digital Equipment PARC. Developer: Dr Louis Monier, research fellow.

Inktomi (engine). Went live: May 1996. Institution: University of California at Berkeley. Developers: Dr Eric Brewer, assistant professor of CS, and Paul Gaulthier, graduate student.

LookSmart (directory). Went live: October 1996. Institution: Reader’s Digest. Developers: uncredited (presumably the publishing team acting through ordinary channels).

Figure 2: Search engine mergers and acquisitions in the three periods of search
history (adapted from Van Couvering, forthcoming).

To summarize, after existing for merely a decade, Web search engines

have emerged as the prevailing tool for accessing the vast amount of information

available on the World Wide Web and beyond. They locate, index, and provide

almost immediate access to billions of Web pages and related Internet content.

According to the Pew Internet & American Life Project, 84% of American adult

Internet users have used a search engine to seek information online (Fallows,

2005, p. 1). On any given day, more than 60 million American adults send over

200 million information requests to Web search engines, making Web searching
the second most popular online activity (behind using e-mail) (Rainie, 2005). They

are, in essence, the doorways to the universe of information available online.

How Web Search Engines Work

While particular search engines vary in their technical design and

processes, Figure 3 shows the typical architecture of a Web search engine divided

into three key modules: a crawler (left), an indexer (center), and a query and

ranking module (right).

Figure 3: Typical search engine architecture (adapted from Arasu et al., 2001).
Crawlers (also known as “spiders” or “bots”) are small programs that

“crawl” the Web on the search engine’s behalf, downloading Web pages into a

page repository for later processing in the indexing module (see, for example,

Heydon & Najork, 1999). Usually starting from a predetermined set of URLs,

crawlers progressively access and download Web pages, scan them for outgoing

links, which are themselves accessed and scanned for outgoing links, and so on.31

Due to the enormous size of the Web, search engines often employ multiple

crawlers working in parallel. This automated and recursive process makes it

possible for a crawler to visit and download large numbers of Web pages virtually

unattended, the main limitations being the ability to locate pages to be crawled

and the storage capacity of the page repository (Arasu et al., 2001, p. 3).32
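The crawling cycle described above can be sketched in a few lines of Python. This is a minimal, illustrative sketch of the visit-download-scan loop, not any engine's actual crawler; the seed URL and the crude link-extraction pattern are hypothetical stand-ins, and real crawlers add politeness rules, URL normalization, and distributed crawl control.

import re
import urllib.request
from collections import deque

def crawl(seed_urls, max_pages=100):
    # Breadth-first sketch of the recursive visit-download-scan cycle.
    repository = {}               # page repository: URL -> raw HTML
    queue = deque(seed_urls)      # URLs waiting to be visited
    seen = set(seed_urls)
    while queue and len(repository) < max_pages:
        url = queue.popleft()
        try:
            html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except (OSError, ValueError):
            continue              # skip unreachable or malformed pages
        repository[url] = html
        # Scan the downloaded page for outgoing links and enqueue new ones.
        for link in re.findall(r'href="(http[^"]+)"', html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return repository

pages = crawl(["http://example.com/"])    # hypothetical seed URL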

The indexer module extracts all the words from each page downloaded by

the crawlers and records the location where each word was found, creating a very

large text index that can provide the URLs where any given word occurs on the

Web.33 The text index also typically includes meta-data about the appearance and

location of particular words, such as whether a word appeared in the page’s title,

in a heading, in boldface, or embedded within a hypertext link. The indexer

module might also perform document preprocessing to make the overall indexing

31. Discovered outgoing links might be either immediately visited and scanned by the same crawler, or put into a queue to be visited and scanned by another crawler as directed by the crawl control system.
32. Despite the automated abilities of crawlers, some studies show that no search engine has indexed more than 16% of the Web (Lawrence & Giles, 2000). While Web crawlers face various challenges in their efforts to index the entire Web (Arasu et al., 2001, pp. 4-13), discussing these in detail is beyond the scope of this dissertation.
33. Limited, of course, by the portion of the Web initially crawled.
process more efficient. Preprocessing might include automatically ignoring

common words with little semantic importance (so-called “stop words” such as

“a”, “of”, “the”, or “it”), stemming words down to their root form (for example,

trimming occurrences of “connected”, “connecting”, “connection” and

“connections” to the root form of “connect”), and regularizing abbreviations,

spelling variations, or word case (Arasu et al., 2001, pp. 18-19; Türker, 2004, pp.

16-17).
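A minimal sketch of this preprocessing, assuming a toy stop-word list and a crude suffix-trimming stemmer (production systems use dictionary-informed stemmers such as Porter's algorithm):

STOP_WORDS = {"a", "an", "of", "the", "it", "and", "in", "to"}   # toy list
SUFFIXES = ("ions", "ion", "ings", "ing", "ed", "s")             # crude suffix set

def preprocess(text):
    # Lowercase, drop stop words, and trim each remaining word to a rough root.
    terms = []
    for word in text.lower().split():
        word = word.strip('.,;:"()')
        if not word or word in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[:-len(suffix)]
                break
        terms.append(word)
    return terms

print(preprocess("The connections of the connected page"))   # ['connect', 'connect', 'page']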

The indexer module also typically creates a structure or link index to

record the link structure of the documents crawled, providing information such as

the set of incoming and outgoing links to a page, parent/child page relationships,

adjacent pages, and so on.34 The link index might be used to direct future Web

crawling activity, to help provide “related pages” results for particular queries,

and for calculating a page’s relative “importance” by link-based ranking

algorithms (Broder et al., 2000). Finally, a utility index might contain results of

preliminary calculations and rankings based on the content and link structure of

indexed pages to speed query processing (Arasu et al., 2001, pp. 18-19).

Searching through an index involves a user building a query and

submitting it through the search engine. The query and ranking modules of a

search engine receive and fulfill these search requests, relying on the indices

prepared by the crawler and indexer modules to find matches to users’ keyword

requests, and utilizing algorithms to rank and sort results to achieve optimal

34. The mapping of the link structure of the web by search engines is discussed in more detail in the next chapter.
relevancy to the user’s request. The query engine’s interface is typically a text box

for inputting the search terms or phrases desired, sometimes accompanied by

checkboxes to indicate whether the request should focus on Websites, video files,

images, and so on. Queries can be quite simple, a single word, or more complex,

such as a string of words or a phrase within quotation marks. The use of advanced search
operators, such as the Boolean commands “AND”, “OR”, and “NOT”, allows
users to refine or extend the terms of the search query. Search terms are often

stemmed, parsed, and spell-checked by the query engine in order to regularize

processing. The query engine then scans the text index for the search terms, and

provides a list of matching Web pages.
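To make the lookup concrete, here is a small sketch of answering a query with an implicit Boolean AND against a text index; the index here is a toy word-to-URL mapping, far simpler than any engine's actual index format:

# Toy text index: term -> set of pages on which the term occurs.
text_index = {
    "digital": {"a.example", "b.example", "c.example"},
    "camera":  {"b.example", "c.example", "d.example"},
}

def match(query):
    # Return the pages containing every query term (implicit Boolean AND).
    results = None
    for term in query.lower().split():
        postings = text_index.get(term, set())
        results = postings if results is None else results & postings
    return results or set()

print(match("digital camera"))   # {'b.example', 'c.example'}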

The usefulness of the query engine’s results depends on the relevance of

the pages presented to the user. While there may be millions of Web pages that

include the search terms, some pages may be more relevant, popular, or

authoritative than others. Most search engines employ complicated algorithms to

rank the results, providing the “best” results first. Ranking algorithms and

techniques vary across search engines, and while their exact details are considered

proprietary and confidential, the methods employed typically include analyses of

the content of Web documents, their layout and attributes, and their link structure.

The simplest ranking measure is the counting of keywords, where the

presumption is that as a search term’s frequency in a document increases, so does

the relevance of that document to the user’s search. Utilizing the meta-data

collected by the indexer, the ranking engine can also estimate the relative

importance of a particular word on the page, assigning a higher ranking to

documents where the search term appears in a title or hyperlink, for example, than

a document where it only appears in a footnote. These qualitative measures help

the ranking engine to estimate the search term’s relative importance in a

document, and thus that document’s ranking within the results.
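A toy version of such content-based scoring might weight a term's occurrences by where they appear; the fields and weights below are invented for illustration, not drawn from any actual engine:

# Invented weights reflecting where a term appears within a document.
FIELD_WEIGHTS = {"title": 5.0, "heading": 3.0, "link": 2.0, "body": 1.0, "footnote": 0.5}

def score(document, term):
    # Sum the weighted occurrences of a term across a document's fields.
    total = 0.0
    for field, text in document.items():
        total += FIELD_WEIGHTS.get(field, 1.0) * text.lower().split().count(term)
    return total

doc = {"title": "digital cameras reviewed",
       "body": "the best digital cameras of the year",
       "footnote": "cameras are discussed elsewhere too"}
print(score(doc, "cameras"))   # 6.5: the title hit far outweighs the footnote hit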

Relying on the analysis of a page’s content for ranking the document

within search results is not unproblematic. Recognizing the value of appearing

higher within search results, a Web site owner can manipulate how search engines

rank their page by altering the way the page “looks” to the engine, such as adding

hidden or misleading keywords and phrases to fool the engine into ranking it

higher for search terms that are not actually relevant for the page. Conversely,

some Web pages are insufficiently self-descriptive to ensure proper ranking in

search engine results. For example, many Web search engines’ own sites do not

actually contain the phrase “search engine,” reducing the chances these pages

would be ranked highly in query results for that phrase (Kleinberg, 1999).35

Finally, as the number of pages on the Web continues to increase, the number of

documents that might include the search terms increases proportionally, and it

becomes difficult to sufficiently differentiate between multiple pages that match

the search criteria based on the page’s content alone.

Ranking algorithms that take advantage of the Web’s link structure have

emerged in an attempt to improve the quality of search engine results in the face

35. In their academic article introducing Google, Brin and Page note that “as of November 1997, only one of the top four commercial search engines finds itself (returns its own search page in response to its name in the top ten results)” (Brin & Page, 1998).
of these challenges. This approach involves analyzing the hyperlinks between

Web pages (collected in the link index described above) to establish a

measurement of authority for that page, in which a page with many incoming

links is considered more authoritative than a page with none. The Hypertext

Induced Topic Selection (HITS) (Kleinberg, 1999) and Google’s PageRank (Brin

& Page, 1998; Page et al., 1998) are the best known of these algorithms.36 To

overcome the threat of linkspamming (the flooding of a page with incoming links

to deceptively increase its ranking), these algorithms calculate and apply page

authorities recursively, that is, instead of just counting the raw number of links to

a page, the algorithm factors in the importance of an incoming link to determine

how much weight to give it. Thus, a page receives more importance if

Microsoft.com links to it than if some lesser-known page links to it, since

Microsoft.com itself has many incoming links. The authority of a page both

depends on and influences the authority of other pages (Arasu et al., 2001, p. 28).

There are many variations on this approach, but utilizing the link structure of the

Web to help determine the ranking of search engine results has become an

industry standard (Sullivan, 2003a).

The technical design features of Web search engines overcome many of

the drawbacks inherent in the “browse paradigm” of Web directories. First, they

are fully automated, employing multiple crawlers to scour the Web for new pages,

eliminating the need for a large staff of humans to keep up with the rapidly

expanding Web. Second, search engines are more user-friendly than Web

36. PageRank will be discussed in more detail in the next chapter.
directories, typically featuring a simple text box to enter the search terms. Results

are often provided with the title of the page as well as a brief description of what

the page contains. Users no longer need to understand and discern a specialized

vocabulary, nor must they navigate complex hierarchies in order to find links to

relevant Web sites. Further, since the perfect search engine is designed to provide

specific and relevant results for each individual query, personalized to the

particular searcher, the ontological struggles inherent in fitting Web content into

predetermined and hierarchical categories are eliminated. Rather than treating

pages on the Web like books in a library that can be neatly classified into rigid

categories, Web search engines exploit the inherent link structure of the Web,

locating, indexing, and ranking pages based on their relationship to other pages, in

order to “make sense of the vast heterogeneity of the World Wide Web” (Page et

al., 1998, p. 1).

Web search engines, however, cannot alleviate all of the drawbacks of the

Web directory model of organizing and navigating the World Wide Web. One of

the drawbacks identified above was the potential for an individual editor’s bias to

impact the decision whether to include a particular Web site in the directory and

which ontological category it best fits. While Web search engines are often

portrayed as neutral technologies merely selecting and ranking Web sites based

on the “democratic nature of the Web” (Google, 2004c), the technical design of

their algorithms might actually heighten the bias feared in Web directories.

Introna and Nissenbaum’s (2000) seminal study, “Shaping the Web: Why the

Politics of Search Engines Matter,” was among the first to challenge the neutrality

of search engines, revealing how they “systematically exclude certain sites, and

certain types of sites, in favor of others, systematically giving prominence to some

at the expense of others” (2000, p. 169). The potential for systematic

programming of bias within the algorithmic (and invisible) code of Web search

engines makes their bias much more threatening than the personal bias of a

random individual editor of a Web directory. A host of recent studies (Kleinberg

& Lawrence, 2001; Chandler, 2002; Hargittai, 2004b; Vaughan & Thelwall,

2004; Diaz, 2005) have built on Introna and Nissenbaum’s thesis to reveal how

such systematic bias persists in search engine coverage and results.

Further, despite their efforts to deploy extensive crawlers, search engines

routinely fail to index the entire World Wide Web. An earlier study claimed that the
top six search engines together indexed only 42% of the Web (Lawrence & Giles,
2000), although a more recent study estimated coverage at a more complete

80%-90% for each of the major engines (Vaughan, 2004). Nevertheless, as more

Web services and applications rely on dynamic Web pages, such as online stores

that only generate a Web page in response to a specific product search, this so-

called “invisible Web” is often left outside Web search engines’ indexes

(Bergman, 2001). Even where specific efforts were made to ensure such pages are

visible to search engines (e.g., the Open Archives Initiative), the best search engine

was able to find only 60% of this content (McCown et al., 2006).

Despite these limitations, Web search engines have clearly established

themselves as the primary means of accessing the vast universe of information

that exists on the Web. A large part of their continued success, and what allowed

them, on the whole, to survive the dot-com bubble that bankrupted countless Web

ventures (Kopytoff, 2003), was their innovative “pay-per-click” economic model.

Economics of Web Search Engines

While the primary task of search engines is to provide users access to

information, search engines have become, essentially, advertising companies

(Vine, 2004). They earn the vast majority of their revenue through the sale of

advertising space on search results pages: in 2003, the percentages of revenue due

to advertising for Yahoo! and Google were 82% and 95%, respectively (Van

Couvering, 2004, p. 7). Search engine advertising takes various forms. Following

the trend of other non-search Web sites, such as the online versions of

newspapers, some search engines include graphical banner ads on their home

page or search results pages, earning revenue each time the ad is viewed, clicked,

or some other action is taken. Search engines can also earn revenue by charging

Web sites for inclusion in the search engine’s index, or to increase the frequency

with which their site is crawled. For example, Yahoo!’s paid inclusion program

guarantees that paying clients’ websites will be crawled for updates every two

days, while it may update its index of other sites only once a month (Hansell,

2004b).

These paid inclusion programs remove the randomness of the Web

crawler’s path, but do not necessarily guarantee a particular spot in the ranking of

search results or position on the search engine results page. Most search engines

have separate paid placement programs, where Web sites pay a fee to have links

placed within the results for a particular search query. For example, a digital

camera manufacturer may pay a search engine to gain a prominent position on the

page resulting from a user’s search for “digital cameras.” Usually, paid listings

are shown on top of, or to the side of, any standard unpaid search results (also

called algorithmic or organic results), and are explicitly marked as sponsored

results or advertising (see Figure 4), although there are search engines that insert

paid results into the organic results with little or no user notification (Wouters,

2005).

Figure 4: Search results page for "digital cameras" showing paid placement of
links to advertisers (circled).

Paid placement advertising has quickly become the primary revenue

source for Web search engines (Reinhardt, 2003). Google’s AdWords37 and

Yahoo! Search Marketing38 dominate the marketplace, with Microsoft recently

launching its own adCenter39 to tap into this lucrative market (Hansell, 2005).

Building on the success of these “sponsored links” programs, search engines

increasingly are placing similar contextual ads across their diverse offerings, such

as their mapping or e-mail products (Hansell, 2004a; Roush, 2005). Search engine

providers have struck multi-million dollar advertising partnerships with other

content providers to include contextual ads on their Web properties (Thaw &

Daurat, 2006; Waters, 2006) and the potential revenues from search-related

advertising have prompted multi-billion dollar acquisitions within the industry

(Fabrikant, 2005). Indicating the health of the search advertising market, the Search
Engine Marketing Professional Organization estimates search engine advertising
revenues to be $5.75 billion for 2005 in North America alone, and predicts they
will increase to $11 billion by 2010 (Search Engine Marketing Professional

Organization, 2006). Google has capitalized most on the growing search engine

advertising marketplace: as of December 3, 2006, Google was worth over $147

billion, moving it into the top twenty-five largest corporations, with a market

37. http://adwords.google.com/
38. http://searchmarketing.yahoo.com/
39. http://adcenter.microsoft.com/
capitalization larger than IBM, AT&T, or Intel.40 Amazingly, one of the world’s

largest corporations was built on pennies earned on each advertisement clicked.

The Quest for the “Perfect Search”

Given the richness of the search engine advertising marketplace, search

engine providers continually work to improve and expand their services in order

to increase their advertising revenues. To help achieve these financial goals, there

has been an ongoing quest within the search engine industry to create the “perfect

search engine,” one that has indexed all available information and provides the

most relevant and personalized results (see Kushmerick, 1998; Andrews, 1999;

Gussow, 1999; Mostafa, 2005). A perfect search engine would deliver intuitive

results based on users’ past searches and general browsing history (Pitkow et al.,

2002; Teevan et al., 2005), knowing, for example, whether a search for the

keywords “Paris Hilton” is meant to help a user locate the hotel chain in the

French city, or find the latest gossip about the young socialite. Search engine

companies have clear financial incentives for achieving the “perfect search”:

receiving personalized search results might contribute to a user’s allegiance to a

particular search engine service, increasing exposure to that site’s advertising

partners as well as improving chances that the user would purchase fee-based

services. Similarly, search engines can charge higher advertising rates when ads

40. Retrieved December 3, 2006 from Yahoo! Stock Screener at http://screen.yahoo.com/stocks.html.
are accurately placed before the eyes of users with relevant needs and interests

(Hansell, 2005).

Along with these financial incentives for the search engine providers,

journalist John Battelle illustrates the potential benefits users would enjoy from
the perfect search engine:

Imagine the ability to ask any question and get not just an accurate answer,
but your perfect answer – an answer that suits the context and intent of
your question, an answer that is informed by who you are and why you
might be asking. The engine providing this answer is capable of
incorporating all the world’s knowledge to the task at hand – be it
captured in text, video, or audio. It’s capable of discerning between
straightforward requests – who was the third president of the United
States? – and more nuanced ones – under what circumstances did the third
president of the United States foreswear his views on slavery?
This perfect search also has perfect recall – it knows what you’ve
seen, and can discern between a journey of discovery – where you want to
find something new – and recovery – where you want to find something
you’ve seen before. (Battelle, 2004)

When asked what a perfect search engine would be like, Google’s Sergey Brin

replied, perhaps jokingly (but perhaps not), “like the mind of God” (quoted in

Ferguson, 2005, p. 40). To attain such an omnipresent and omniscient ideal, the

perfect search engine must have both “perfect reach” in order to provide access to

all available information on the Web and “perfect recall” in order to deliver

personalized and relevant results that are informed by the previous habits of that

particular searcher.

Perfect Reach

To achieve the reach necessary for the perfect search, Web search engines

amass enormous indices of the Web’s content. Expanding beyond just HTML-

based Web pages, search engine providers have indexed a wide variety of media

found on the Web, including images, video files, PDFs, and other computer

documents. For example, Yahoo! claims to have indexed over 20 billion items,

including over 19.2 billion Web documents, 1.6 billion images, and over 50

million audio and video files (Mayer, 2005). Google claims to have an index more

than three times larger than that of any other search engine (Google, 2005n), and

it is estimated that Google has indexed nearly 70% of the total World Wide Web

(Sullivan, 2005). The increasing sophistication and reach of Web crawler and

indexing technology provide search engine companies a powerful mapping of the

entire World Wide Web to fuel the quest for the perfect search – so powerful that

philosopher Lawrence Hinman (2005) has updated George Berkeley’s eighteenth-

century proclamation that “esse est percipi” (to exist is to be perceived) to the

twenty-first-century equivalent “esse est indicato in Google”: to exist is to be

indexed on Google.

Perfect Recall

To achieve “perfect recall,” Web search engines must be able to identify

and understand searchers’ intellectual wants, needs, and desires when they

perform information-seeking tasks online in order to deliver personalized and

relevant results. The primary means for personalizing search results is to rely on a

user’s search habits and history (see, for example, Speretta, 2000; Pitkow et al.,

2002; Teevan et al., 2005). To gather users’ search histories, most Web search

engines maintain detailed server logs recording each Web search request

processed through their servers, the pages viewed, and the results clicked (see, for

example, Google, 2005i; IAC Search & Media, 2005; Yahoo!, 2006). Search

engines also rely heavily on Web cookies to help differentiate users and track

activity from session to session, and increasingly push the creation of user

accounts to help associate particular users with their online activity.41 The

motivation behind gathering this user information is explained to the user in terms

of improving their search experience. Google, for example, states, “We use this

information to improve the quality of our services and for other business

purposes” (Google, 2005i), while the search engine Ask.com also presents its

economic motivations fueling the need for this perfect recall in pursuit of the

perfect search: “We collect…anonymous information to improve the overall

quality of the online experience, including product monitoring, product

improvement, targeted advertising, and monetizing commercially oriented search

keywords” (IAC Search & Media, 2005).

A Faustian Bargain?

The quest for the perfect search engine has led to calls for search engines

to provide results that suit the “context and intent” of the search query. Given a

search for “Paris Hilton,” the perfect search engine will know whether to deliver

results about the celebrity or a place to spend the night in France. To attain such

an omnipotent and omniscient ideal, the perfect search engine will have to have

41. These practices will be discussed in further detail in the following chapter.
“perfect reach” and be able to deliver any type of online content from all online

sources, as well as “perfect recall” in order to deliver personalized and relevant

results that are informed by who the searcher is. Search engine users are

repeatedly reminded of the benefits of the perfect search engine, ranging from

Google’s bold goal to “organize the world's information and make it universally

accessible and useful” (Google, 2005b) to Yahoo’s revamped mission statement

and strategy:

Yahoo!’s mission is to connect people to their passions, their


communities, and the world’s knowledge. To ensure this, Yahoo! offers a
broad and deep array of products and services to create unique and
differentiated user experiences and consumer insights by leveraging
connections, data, and user participation. (Yahoo!, 2007)

We are reminded, however, of Brey’s call for disclosive computer ethics,

to “make potentially morally controversial computer features and practices

visible” (Brey, 2000, p. 13). Given our position that technology bears “directly

and systematically on the realization, or suppression, of particular configurations

of social, ethical, and political values,” we must work to uncover and morally

evaluate the values and norms embedded in the quest for the perfect search

engine. For example, what does it mean to have all available information on the

Web indexable and searchable – essentially at the fingertips of any person with

access to the Internet? Or, what are the consequences of having search engines

monitor and record our search behaviors in order to personalize product

offerings?

Herein lies the concern that the perfect search engine is a Faustian bargain:

The perfect search engine promises accuracy, efficiency, and relevancy, but at

what cost? Does the perfect search “giveth” as well as “taketh away”? In the

presence of a Faustian bargain, Postman warned, society eventually succumbs to

the promises made by technology, ignoring such issues of power, equality, and

freedom. As the ubiquity of Web search engines rises, it becomes increasingly

difficult for users to recognize or question their value-related externalities, and

more tempting to simply take the design of such tools “at interface value” (Turkle,

1995, p. 103). In the spirit of disclosive computer ethics, we are compelled to

explore the Faustian bargain that we are making with our embracing of the perfect

search engine. To come to terms with this Faustian bargain, the next chapter will

focus on the search engine that holds the most promise for achieving the perfect

search, and invokes the most anxiety among its critics: Google.


CHAPTER IV

GOOGLE’S QUEST FOR THE PERFECT SEARCH

Introduction: Google

The web search engine Google has established itself as the prevailing

interface for searching and accessing virtually all information on the Web.

Originating in 1996 as a Ph.D. research project by Larry Page and Sergey Brin at

Stanford University (see Brin & Page, 1998; Page et al., 1998), Google was a

relative latecomer to the search engine industry,42 receiving only 5% of the

market share in December 2000 (Sullivan, 2001). Google’s Web search engine

quickly rose to dominate the U.S. market, processing almost 3.6 billion search

queries in February 2007, over half of all Web searches performed

(Nielsen//NetRatings, 2007).43 The company has been successful financially as

well. Already extremely profitable as a private company (La Monica, 2004),

Google held an initial public offering in August 2004, and within six months rose

to become one of the 100 largest companies in the world (Datamonitor, 2005).44 It

reported $10.6 billion in revenues during fiscal 2006 (Google, 1999), and has

42. See Table 2 in the previous chapter.
43. At its peak in early 2004, Google handled upwards of 80 percent of all search requests on the Web through its own website and through clients like Yahoo!, AOL, and CNN, who relied on Google for their customers’ search engine results. Google’s share fell to a still-dominant 57% in 2004 when Yahoo! dropped Google’s search technology for its own (Hansen, 2004).
44. Based on market capitalization as of March 30, 2005.
become one of the top twenty-five largest corporations in the world, with a market

capitalization larger than IBM, AT&T, or Intel.45

Google’s success stems, at least in part, from a combination of unique

factors: its grassroots origins in academia; its simple, clean interface design and

use of only text-based advertising; its belated and unconventional initial public

offering (Google, 1999); its constant stream of new services and Web

technologies; and the appeal of its informal corporate motto, “Don’t be evil”

(Google, 2005k). By creating an ethos of providing an innovative, free, and

trusted service, with a lighthearted corporate philosophy, Google has won the

hearts and minds of millions of users, becoming so popular that it has even

generated its own verb, to google (Harris, 2006), and is regarded as one of the

most reputable companies in the world (Alsop, 2005). Google, in short, is the “gold

standard” against which all other search engine practices and innovations are

measured (Hellweg, 2002; Clark, 2006). The core of Google’s Web search engine

– and its success – is its innovative ranking algorithm, PageRank.

PageRank

Google’s pioneering Web search technology and design has enabled it to

stand apart from the competition. In 1998, Brin and Page’s paper, “The Anatomy

of a Large-Scale Hypertextual Web Search Engine” (Brin & Page, 1998),

proposed a system to more effectively retrieve information from the World Wide

45. Retrieved December 3, 2006 from Yahoo! Stock Screener at http://screen.yahoo.com/stocks.html.
Web to “improve the quality of search engines” and thus “bring order to the Web”

(Brin & Page, 1998, p. 3). The core of their new Web search engine is PageRank,

a set of algorithms for ranking Web pages, using the immense link structure of the

Web as an organizational tool:

PageRank relies on the uniquely democratic nature of the Web by using its
vast link structure as an indicator of an individual page's value. In essence,
Google interprets a link from page A to page B as a vote, by page A, for
page B. But, Google looks at more than the sheer volume of votes, or links
a page receives; it also analyzes the page that casts the vote. Votes cast by
pages that are themselves “important” weigh more heavily and help to
make other pages “important.” (Google, 2004c)

In order to understand how PageRank works, we must first understand what is

meant here by the “link structure of the Web.” Recall from the previous chapter

that the Web is essentially a set of hypertext documents, each of which contains a

number of unidirectional links to other documents. In this light, we can view the

Web as a directed graph (Figure 5), wherein a node represents a specific page (A,

B, etc.), and an “edge” from node A to node B represents a link from page A to

page B. Such a graph describes how each page is interlinked and interrelated,

revealing the topology of the network of Web pages.46 Mapping the link structure

of the Web in this way provides a detailed account of the complex inter-

relationships among the billions of documents available online.

46. See (Barabási, 2003; Watts, 2003) for an introduction to network and graph theory, and (Broder et al., 2000) for technical details on the graph structure of the Web.
Figure 5: Hypothetical Web graph. Seven pages contain links as specified by the
table to the right. This link structure is depicted in the directed graph to the left.
(adapted from Diaz, 2005)

Various ranking heuristics – which vary in complexity from simple link

counting procedures to complex clustering algorithms – can exploit this

information to improve the quality of search results. A straightforward example is

backlink counting in which the search engine simply calculates the number of

pages that link to a particular page A. If this count is high – many pages refer to A

– we might assume that it is a “trusted” or “important” source of information.

Consequently, page A might be placed higher among the search results.
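In the directed-graph terms introduced above, backlink counting reduces to a few lines; the link data here is invented for illustration:

# Hypothetical link graph: page -> pages it links to.
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"], "D": ["A", "C"]}

def backlink_counts(links):
    # Count the incoming links for every page in the graph.
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts

print(backlink_counts(links))   # {'A': 3, 'B': 1, 'C': 2, 'D': 0}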

Google’s patented PageRank algorithm takes the backlink heuristic one

step further by recognizing that not all links are equal. With simple backlink

counting, a link from my page to my professor’s site contributes as much to her

ranking as a similar link from the New York Times website. Borrowing from

academic citation analysis, Google’s founders recognized that some links are

more authoritative than others – a link from the Times is (presumably) more

authoritative than a link from my personal website. PageRank attempts to account

for this asymmetry by putting greater weight on backlinks from “important”

pages. The importance of the Times is, in turn, measured by the importance of all

the pages that refer to it, creating a recursive calculation. Since the New York

Times site is deemed more “important” than my own, the former link goes much

further in elevating the PageRank of my professor’s site, and thus its visibility

among the results. This recursive definition of PageRank differs sharply from

other link-based ranking methods in that it arrives at a global measure of a Web

page’s importance without taking into account any textual information. In other

words, the PageRank score of a page is influenced by neither the contents of the

page itself nor the user’s search terms, but is based solely on the aggregate

measure of the importance implied in each of its backlinks.

Figure 6 provides a simplified example of how PageRank is calculated,

revealing both its recursive nature and how the relative importance of certain

pages influences the importance of other pages.

Figure 6: Simplified example of a recursive calculation of PageRank (adapted from Gnix, 2006).
The rank of a page is divided among its forward links evenly to contribute to the

ranks of the pages they point to.47 For example, page 1 has a PageRank of 0.304,

and links to five other pages (2, 3, 4, 5, and 7), thus sharing with them a

PageRank of 0.061 (0.304 divided by five). Each of those pages combines the

PageRank values of their incoming links to calculate their respective PageRank

value. Notice that page 1 and page 5 both have four incoming links. If a search

engine relied only on link counting to establish relevancy, these two pages would

be considered relatively equal in importance. Using PageRank, however, reveals

that page 1 is nearly twice as important as page 5, since the PageRank of page 1’s

four incoming links are higher than the links pointing to page 5 (two of page 5’s

links, for example, come from pages with only one incoming link themselves,

lessening their own importance).
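The calculation behind a figure like this can be sketched as a simple power iteration. The sketch below follows the simplified rank-splitting scheme just described (each page divides its rank evenly among its outgoing links); the three-page graph is hypothetical, and the damping factor of Brin and Page's published formulation is omitted for clarity.

def pagerank(links, iterations=50):
    # Power-iteration sketch: repeatedly split each page's rank among its outlinks.
    n = len(links)
    rank = {page: 1.0 / n for page in links}    # any starting values will do
    for _ in range(iterations):
        incoming = {page: 0.0 for page in links}
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks)  # rank divided evenly among outlinks
            for target in outlinks:
                incoming[target] += share
        rank = incoming
    return rank

# Hypothetical graph: page 1 links to 2 and 3, page 2 to 3, page 3 back to 1.
print(pagerank({1: [2, 3], 2: [3], 3: [1]}))    # converges to {1: 0.4, 2: 0.2, 3: 0.4}

As footnote 47 notes, the iteration can begin with any starting values and still converges on the same stationary ranks.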

Crawling and Mapping the Web

Since the importance of any one page influences the importance of any

other, the usefulness and accuracy of PageRank is, in the end, dependent on

attaining a reliable mapping of the link structure of the entire World Wide Web on

which to base the calculations. To map the link structure of the Web, Google

deploys its Web crawler – Googlebot – to traverse and record billions of Web

47. In such a recursive calculation, the starting PageRank is not known. However, as noted in Brin and Page’s (1998) paper, “PageRank…can be calculated using a simple iterative algorithm”, meaning that the calculation can start with any number and, through repeated iterations, will converge on the theoretically true PageRank value. An example of such iterative calculations can be found at (Rogers, 2002). To save time, Google relies on linear algebra rather than calculating multiple iterations (Langville & Meyer, 2006).
pages and their links.48 Since its initial description in Brin and Page’s (1998)

original article, little is known about the current form of Google’s crawler and

supporting architecture for mapping the Web; its details remain a closely guarded

trade secret. From the information made available (Barroso et al., 2003; Google,

2007), we can discern that Google uses a set of distributed crawlers, each on its

own physically-independent computer, to open multiple parallel connections to

thousands of Web pages at a time. Googlebot crawls from page to page,

harvesting hyperlinks from every page it encounters in order to produce a map of

the Web’s link structure. At its launch, Brin and Page (1998) claimed Google had

“created maps containing as many as 518 million of these hyperlinks,” a figure

which is certainly larger by an order of magnitude today.

In addition to providing the link structure to power PageRank, crawling

and saving a copy of every Web page it encounters also allows Google to analyze

and utilize the content within those pages. For example, Google relies on

hypertext-matching analysis to help measure the relevance of a page to a

particular search term. At the time of Google’s launch, most search engines relied

heavily on how often a word appeared on a Web page in order to determine its

relevancy. However, instead of simply scanning for the occurrence of a word

within the page, Google analyzes the full content of a page and factors in font

size, header levels, and the relative location of each word in order to measure its

importance. In such an analysis, a page with the search term in the title or in large,

48. Recalling the brief introduction in the previous chapter, Web crawlers are small programs that “crawl” the Web on the search engine’s behalf, downloading Web pages into a page repository for later processing.
bold font will be considered more relevant than a page on which the word appears

only in a footnote. By mapping the distribution of keywords within particular

pages, Google can estimate the relative importance of particular words and

phrases.

In the process of mapping the link structure of the Web and indexing the

content of every Web page crawled, Google has amassed an incredibly large

dataset of the words and sentences that appear on the Web.49 Google has taken

advantage of the content crawled by the Googlebot to pursue research in machine

learning and natural language processing with the goal of improving their search

results through such enhancements as statistical machine translation, spelling

checking,50 word sense disambiguation, and clustering of search queries and

results.51 For example, by analyzing the frequency and probability of certain

words appearing in proximity to each other, Google can cluster concepts into

“reasonably coherent” subclusters that seem related. When a query is processed

for a certain search term, Google can determine the probability the searcher might

also be interested in a related cluster, and provide supplemental results for

keywords not initially searched for. For example, if someone searches for “Bay

Area cooking class,” Google might determine through clustering that the related

49. Google recently released to the linguistics community its corpus of over one trillion words scraped from public Web pages. The dataset included almost 100 billion full sentences (Google, 2006a).
50. Google has published the misspelled queries it received over a three-month period from users searching for “Britney Spears.” Almost 600 different spellings were attempted by at least two users, all of which were corrected by its spelling correction system (http://www.google.com/jobs/britney.html).
51. See (Zamir & Etzioni, 1999; Wen et al., 2001) for technical descriptions of clustering with Web search engines.
terms “Berkeley courses: vegetarian cuisine” is also a good match, even though it

contains none of the original query’s keywords. Other uses for clustering include

determining how to aggregate and organize related news stories within Google

News, or selecting which AdSense contextual advertisements relate best to a

particular search term or page element.
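A toy illustration of the underlying idea (a sketch of simple word co-occurrence counting, not Google's actual clustering method): two terms are treated as related when they frequently appear near one another in crawled text.

from collections import Counter, defaultdict

def cooccurrence(sentences, window=3):
    # For each word, count the words appearing within `window` positions of it.
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            for neighbor in words[max(0, i - window): i + window + 1]:
                if neighbor != word:
                    counts[word][neighbor] += 1
    return counts

# Hypothetical snippets standing in for crawled page text.
corpus = ["cooking class in the bay area",
          "berkeley vegetarian cooking courses",
          "vegetarian cuisine class in berkeley"]
related = cooccurrence(corpus)
print(related["cooking"].most_common(3))   # the words most often seen near "cooking"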

Google and the Perfect Search Engine

Google recognized early on the importance of designing a perfect search

engine: The company’s very first press release noted that “a perfect search engine

will process and understand all the information in the world…That is where

Google is headed” (Google, 1999). Google co-founder Larry Page later reiterated

the goal of achieving the perfect search: “The perfect search engine would

understand exactly what you mean and give back exactly what you want”

(Google, 2007). From its dominant market position, Google continues to refine its

PageRank algorithm, expand the reach of its crawler, and map the link structure

of the Web. Armed with these technological advantages, as well as a seemingly

constant stream of new products and technological advances, Google seems

poised to achieve the perfect reach and perfect recall necessary to fulfill its quest

for the perfect search engine.

Google’s Perfect Reach

In their effort to “organize the world’s information,” Google has amassed

an extensive Web search information infrastructure comprising nine distinct

information-seeking contexts: general information inquiries, academic research,

news and political information, communication and social networking, personal

data management, financial data management, shopping and product research,

computer file management, and Internet browsing.52 As outlined in Table 3 (and

discussed in more detail in Appendix A), the reach of Google’s crawlers and

index has expanded beyond websites to include other online documents as well,

such as images, news feeds, Usenet archives, and video files. Additionally,

Google has begun digitizing the “material world,” adding the contents of popular

books, university libraries, maps, and satellite images to their growing index.

Users can also search the files on their hard drives, send e-mail and instant

messages, shop online, and even engage in social networking through Google.

Consequently, users increasingly search, find, and relate to information through

Google’s growing information infrastructure of search-related services and tools.

They also use these tools to communicate, navigate, shop, and organize their

lives. By providing a medium for various social, intellectual, and commercial

activities, “Planet Google” has become a large part of people’s lives, both on- and

offline (Williams, 2006). Assembling these wide-ranging products pushes Google

along the path towards realizing Sergey Brin’s dream of creating “a perfect search

engine [that] will process and understand all the information in the world”

(Google, 1999; emphasis added). While the seemingly perfect reach of Google is

52 These nine contexts are not necessarily mutually exclusive and are not put forth as airtight metaphysical divisions. They are meant simply to help compartmentalize the various types of information-seeking activities a person undertakes in her daily activities for easier discussion.

well known – indeed its ability to access information other search engines appear

to miss is an ingredient of its great success – its attempts to attain perfect recall

are more likely to create anxiety among users, and deserve closer attention.

Google’s Perfect Recall

Capturing User Information

In order to provide results that suit the “context and intent” of the search

query, a perfect search engine must have “perfect recall” of who the searcher is

and her previous search-related activities. In order to discern the context and

intent of a search for “Paris Hilton,” the perfect search engine would know if the

searcher has shown interest in European travel, or whether she spends time online

searching for sites about celebrity gossip. Attaining such perfect recall requires

search engine providers to collect as much information about their users as

possible. To accomplish this, Google, like most Web search engines, relies on

three technical strategies in order to capture the personal information necessary to

fuel the perfect recall: the maintenance of server logs, the use of persistent Web

cookies, and the encouragement of user registration.

Maintained by nearly all websites, server logs help website owners gain an

understanding of who is visiting their site, the path visitors take through the

website’s pages, which elements (links, icons, menu items, etc.) a visitor clicks,

how much time visitors spend on each page, and from what page visitors are

leaving the site. In other words, a website owner aims to collect enough data to

reconstruct the entire “episode” of a user’s visit to the website (Tec-Ed, 1999).

Google maintains detailed server logs recording each of the 100 million search

requests processed each day (Google, 2005j). While the exact contents are not

publicly known, Google has provided an example of a “typical log entry” for a

user who searched for the term “cars” (Google, 2005i):

123.45.67.89 - 25/Mar/2003 10:15:32 - http://www.google.com/search?q=cars - Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969

In this sample entry, 123.45.67.89 is the IP address53 assigned to the

user by the user’s Internet service provider, 25/Mar/2003 10:15:32 is the date

and time of the query, http://www.google.com/search?q=cars is the

requested page, which also happens to identify the search query, “cars,” Firefox

1.0.7; Windows NT 5.1 is the browser and operating system being used, and

740674ce2123e969 is the unique cookie ID54 assigned to this particular browser

the first time it visited Google. To help further reconstruct a user’s movements,

Google also records clickstream data, including which search results or

advertising links a user clicks (Google, 2005i). Given Google’s wide array of

53 An Internet Protocol (IP) address is a unique address that electronic devices use in order to identify and communicate with each other on a computer network. An IP address can be thought of as a rough equivalent of a street address or a phone number for a computer or other network device on the Internet. Just as each street address and phone number uniquely identifies a building or telephone, an IP address can uniquely identify a specific computer or other network device on a network (Wikipedia contributors, 2007b).
54 A Web cookie is a piece of text generated by a Web server and stored in the user’s computer, where it waits to be sent back to the server the next time the browser accesses that particular Web address. By returning a cookie to a Web server, the browser provides the server a means of associating the current page view with prior page views in order to “remember” something about the previous page requests and events (see Clarke, 2001; Kristol, 2001). Google’s use of Web cookies allows it to identify particular browsers between sessions, even if that browser’s IP address changes.

products and services, their server logs potentially contain much more than simply

a user’s Web search queries. Other searches logged by Google include those for

images, news stories, videos, books, academic research, and blog posts, as well as

links clicked and related usage statistics from within Google’s News, Reader,

Finance, Groups, and other services (see Table 4).
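
The analytic value of one such entry becomes clearer once its fields are separated. The fragment below is a small, hypothetical parser (written in Python; Google’s actual log formats and tooling are not public) that splits the sample entry shown above into the constituent fields just discussed:

# A hypothetical parser for log lines in the sample format shown above.
from urllib.parse import urlparse, parse_qs

def parse_log_entry(line):
    # The sample entry separates its fields with " - ".
    ip, timestamp, url, browser, cookie_id = [f.strip() for f in line.split(" - ")]
    # The requested URL embeds the search query itself as the "q" parameter.
    query = parse_qs(urlparse(url).query).get("q", [""])[0]
    return {"ip": ip, "time": timestamp, "query": query,
            "browser": browser, "cookie": cookie_id}

entry = ("123.45.67.89 - 25/Mar/2003 10:15:32 - "
         "http://www.google.com/search?q=cars - "
         "Firefox 1.0.7; Windows NT 5.1 - 740674ce2123e969")
print(parse_log_entry(entry))
# {'ip': '123.45.67.89', 'time': '25/Mar/2003 10:15:32', 'query': 'cars',
#  'browser': 'Firefox 1.0.7; Windows NT 5.1', 'cookie': '740674ce2123e969'}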

Logging this array of information – the user’s IP address, cookie ID, date

and time, search terms, results clicked, and so on – enhances Google’s ability to

attain the “perfect recall” necessary to deliver valuable search results and

generally improve its search engine services. For example, by cross-referencing

the IP address of each request sent to the server along with the particular page being

requested and other server log data, it is possible to find out which pages, and in

which sequence, a particular IP address has visited. When asked, “Given a list of

search terms, can Google produce a list of people who searched for that term,

identified by IP address and/or Google cookie value?” and “Given an IP address

or Google cookie value, can Google produce a list of the terms searched by the

user of that IP address or cookie value?”, Google responded in the affirmative to

both questions, confirming its ability to track user activity through such logs

(Battelle, 2006a, 2006b).

Sole reliance on IP logging and Web cookies to reconstruct a user’s

browsing and searching activities completely and consistently has its limitations.

Some Internet service providers frequently change the IP address assigned to a

particular user’s network connection. Alternatively, multiple users accessing the

Internet through a university proxy server or through some ISPs (such as AOL)

might share the same IP address. Privacy concerns have also led more savvy

Internet users to disguise their IP address with anonymous routing services such

as Tor (Zetter, 2005b). Similarly, as privacy concerns over the use of cookies to

track users’ online activities increase (Mayer-Schönberger, 1997; Kristol, 2001;

Schwartz, 2001), users increasingly take advantage of software and browser

features that make it easier to view, delete and block Web cookies received from

the sites they visit (McGann, 2005; Mindlin, 2006). Even in the absence of such

privacy-protecting measures, cookies and IP addresses are linked only to a

particular Web browser or computer, not necessarily a particular user. Neither the

browser passing the cookie nor the Web server receiving it can know who is

actually using the computer, or whether multiple users are using the same

machine. Reliance on IP addresses and cookies might not provide necessary

differentiation between users, limiting the extent of the “perfect recall” necessary

for Google to deliver the most relevant results and advertising.

To overcome such limitations, Web site owners frequently urge users to

register with the website and log in when using its services (Ho, 2005, pp. 660-

661; Tec-Ed, 1999). When a user supplies a unique login identity to a Web server,

that information, along with the current cookie ID, is stored in each log file record

for that user’s subsequent activity at the site. By tying aspects of the site’s

functionality to being logged in, the user is compelled to accept the Web cookie

for that session. Even if the user deletes the cookie or changes her IP address at

the end of the session, by logging in again at the next visit, a consistent record for

the user in the server log can be maintained. Logging in with a unique user name

similarly reduces the variability of multiple or shielded IP addresses. Further, any

personally identifiable information provided during the registration process, such

as age, gender, zip code, or occupation, can be associated with the user’s account

and server log history, providing a more detailed profile of the user.
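
The effect can be illustrated schematically. In the sketch below (hypothetical records and field names, written in Python; not Google’s actual schema), the same user appears under three different cookie IDs and two different IP addresses, yet her login identity reassembles the fragments into a single search history:

# A sketch of why a login identity defeats cookie deletion and IP churn.
from collections import defaultdict

records = [  # hypothetical server-log records
    {"ip": "123.45.67.89", "cookie": "740674ce2123e969", "login": "jdoe", "query": "cars"},
    {"ip": "98.76.54.32", "cookie": "5ab1c0ffee42d901", "login": "jdoe", "query": "atlantic city hotels"},
    {"ip": "98.76.54.32", "cookie": "9f3e21ab87cd0456", "login": "jdoe", "query": "divorce lawyer"},
]

# Keyed by cookie or IP address, the history fragments into disconnected
# pieces; keyed by login, it reassembles into one continuous profile.
by_login = defaultdict(list)
for record in records:
    by_login[record["login"]].append(record["query"])

print(by_login["jdoe"])
# ['cars', 'atlantic city hotels', 'divorce lawyer']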

In early 2004, Google started experimenting with products and services

that required users to register and log in, including personalized search results, e-

mail alerts when sites about a particular topic of interest are added to Google’s

index (Kopytoff, 2004). Soon afterward, Google introduced products and services

that required the creation of a Google Account, such as Gmail, Google Calendar,

and the Reader service to organize news feeds. Other Google services can be

partially used without a Google Account, but users are encouraged to create an

account in order to maximize its benefits or access certain features. Examples

include Google Video, with a Google Account required for certain premium

content, and Book Search, in which a Google Account helps control access to

copyright-protected text. When Google acquires external products and services

with their own login protocols, migration to Google Accounts is typical, as was the

case with Blogger or Dodgeball (see Weinberg, 2005; Google, 2006b). Internally

developed products that previously utilized unique logins, such as Orkut, have

also migrated to the universal Google Account (Weinberg, 2005).

Potential Information Captured

Google’s encouragement of the creation of Google Accounts, combined

with its use of persistent Web cookies, provides the necessary architecture for the

creation of detailed server logs of users’ activities across Google’s various

products and services, ranging from the simplest of search queries to minute

details of their personal lives. While the full extent of the data capturable by

Google’s infrastructure is difficult to estimate, we can identify some of the typical

forms of personal information potentially stored within Google’s servers.55

The ability to track specific Web search queries is the most discussed –

and perhaps most pernicious – of Google’s abilities to monitor its users. Logging

the specific search terms for each of the 100 million Web search queries it

processes daily, along with the particular results clicked, provides Google a

unique insight into the wants and needs of its users. As evidenced by the search

terms revealed in AOL’s release of search history data (Maney, 2006; McCullagh,

2006a), the individual search terms within Google’s logs are a mix of the

mundane and the stimulating, the trivial and the informative. While over half of

searchers say they split their searches among those that are “for fun” and those

that are “important” to them (Fallows, 2005), users are increasingly using the

internet and search engines to help them make important decisions or negotiate

their way through major episodes in their lives (Horrigan & Rainie, 2006). In such

potentially personal and sensitive circumstances, the terms for which users search,

along with the results they decide to click on, are stored in Google’s server logs.

Whether a user searches for teen pop star “Lindsay Lohan” or “Cleveland HIV

treatment center,” or whether a user clicks on a news story about “abortion rights”

55 See Table 4 for a summary of personal information collected across Google’s products, and Appendix A for a more detailed description of each product and its method of collecting personal data.

or a blog post on “American Idol,” all such actions across Google’s services

become associated with a user’s IP address and cookie ID within Google’s vast

server logs.

Along with individual Web search queries, Google’s infrastructure of

dataveillance also enables the capturing of various other personal and intellectual

interests and activities across its products and services. For example, the modules

selected for one’s Personal Homepage might divulge personal information

(displaying headlines from The Advocate, for example). Users curious about their

personal Web presence might create a Google Alert with their name, mailing

address, or social security number. Google’s Book Search project is able to

“connect some information – your Google Account name – with the books and

pages that you’ve viewed”, restricting the ability to browse and read books

anonymously (Google, 2005f). Virtually all of Google’s search-related products

facilitate the collection of keywords that potentially relate to users’ personal lives

and intellectual interests.

The ability to collect and cross-reference personal information is greatly

augmented when the user creates a Google Account. While all that is needed to

create a Google Account is a valid e-mail address, users are frequently

encouraged, and sometimes required, to provide additional personal information.

Upon creating a Google Account, users are prompted to edit their profile to

include their full name and zip code. Creation of a Gmail account requires that a

first and last name be provided.56 When using Google Groups, users are

encouraged to create a public profile, including their name, nickname, location,

title, industry, website or blog. Blogger users are also encouraged to create a

profile, which includes information such as the user’s full name, photograph,

birthday, location, gender, as well as lists of favorite books, movies, music, and so

on. Any personal information provided for such profiles is associated with the

user’s Google Account. Google’s new Checkout online payment service also

requires the collection of the user’s personal transactional data, including a real name,

credit or debit card number, card expiration date, card verification number, billing

address, phone number, and email address.

Google’s attempt to “organize the world’s information” has led to the

development of various non-search-related products and services, which also

provide an opportunity to collect and aggregate users’ personal information.

Typically linked via a Google Account, these services allow Google to tabulate

users’ calendar events, e-mail contacts, chat buddies, Web bookmarks, and

financial portfolio (see Table 4). Some of Google’s efforts, however, have

received particular scrutiny for their potential impact on user privacy. Gmail, for

example, has been heavily criticized for its practice of scanning the text of

incoming messages in order to place context-sensitive advertisements (Bray,

2004; Privacy Rights Clearinghouse, 2004; Electronic Privacy Information Center, 2004b). When

viewing e-mail messages in Gmail, advertisements and links to related pages

56 Of course, there is no method of verifying whether the user provides her real name.

appear in the right margin of the Gmail interface. Google scans the text of

incoming e-mail messages in order to target the advertising to the user. For

example, if the user is reading an e-mail that contains the text “Atlantic City,”

Gmail might present the user with ads about hotels, casinos, and other websites

related to that travel destination. While Google maintains that “no human will

read the content of your email in order to target such advertisements or other

information without your consent, and no email content or other personally

identifiable information will be provided to advertisers as part of the Service,” the

Gmail terms of use also note that Google may “monitor, edit or disclose your

personal information, including the content of your emails, if required to do so in

order to comply with any valid legal process or governmental request” (Google,

2005d).
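
The underlying matching technique can be conveyed with a toy example. The sketch below (hypothetical advertisements and trigger keywords, written in Python) selects advertisements by simple keyword overlap with the message text; Google’s actual ad-selection algorithms are far more sophisticated and are not public:

# A toy illustration of contextual ad matching by keyword overlap.
ads = {  # hypothetical ad inventory: ad title -> trigger keywords
    "Atlantic City hotels": {"atlantic", "city", "hotel"},
    "Casino deals": {"casino", "gambling"},
    "Speechwriting help": {"speech", "toast", "wedding"},
}

def match_ads(message):
    # An ad is shown when any of its trigger keywords appears in the text.
    words = set(message.lower().split())
    return [title for title, keywords in ads.items() if words & keywords]

print(match_ads("Meet me in Atlantic City near the casino"))
# ['Atlantic City hotels', 'Casino deals']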

Further criticism of Gmail centered on a clause in its original privacy

policy stating that “residual copies of e-mail may remain on our systems for some

time, even after you have deleted messages from your mailbox or after the

termination of your account” (Electronic Privacy Information Center, 2004b).

Since electronic communications stored for more than 180 days enjoy less robust

protection from law enforcement access (Electronic Privacy Information Center,

2004b), the prospect of indefinite storage of Gmail e-mails raises concerns over

the privacy of users’ communications. Google insists that this phrasing in the

Gmail privacy policy was simply “poor wording” (Gillmor, 2004), and that while,

like most Web-based e-mail providers, Google keeps multiple backup copies of

users’ emails so that users can recover messages and restore accounts in case of

errors or system failure, deleted e-mails are eventually completely removed from

Google’s servers within 60 days (Gillmor, 2004; Google, 2005c). Even with these

commitments, a user’s deleted Gmail still might remain in Google’s “offline

backup systems” (Google, 2005c), and in at least one reported case, a subpoena

was sent to Google for the complete contents of a Gmail account, including

deleted e-mail messages (McCullagh, 2006b).

Similar concerns were raised with a feature of Google’s Desktop Search

product, in which a user’s personal files might be stored on Google’s servers. In

early 2006, Google released a new version of Google Desktop Search with a

“Search Across Computers” function allowing users to search and access

information from all of their computers with Google Desktop installed. Once

enabled,57 the file index of each authorized computer is uploaded and stored on

Google’s servers. As Google explains:

This is necessary, for example, if one of your computers is turned off or


otherwise offline when new or updated items are indexed on another of
your machines. We store this data temporarily on Google Desktop servers
and automatically delete older files, and your data is never accessible by
anyone doing a Google search. (Google, 2006e)

To help protect user privacy, the data is encrypted in transmission and while

stored on Google servers, and Google retains the data for only 30 days. However,

privacy concerns persist, typified by this warning from the Electronic Frontier

Foundation:

57 The Search Across Computers feature is not automatically activated and must be enabled and authenticated through the Google Desktop preferences. A Google Account is required to activate and access the service.

If you use the Search Across Computers feature and don't configure
Google Desktop very carefully—and most people won’t—Google will
have copies of your tax returns, love letters, business records, financial
and medical files, and whatever other text-based documents the Desktop
software can index. The government could then demand these personal
files with only a subpoena rather than the search warrant it would need to
seize the same things from your home or business, and in many cases you
wouldn’t even be notified in time to challenge it. Other litigants—your
spouse, your business partners or rivals, whoever—could also try to cut
out the middleman (you) and subpoena Google for your files. (Electronic Frontier Foundation, 2006)

It remains unknown whether the data stored on Google’s servers are retained on

“offline backup systems” past the 30-day window (similar to Gmail messages), or

whether employees within Google are able to decrypt the files if subpoenaed or

requested for other uses.

Two of Google’s products designed to assist users in Web navigation, the

Google Toolbar and the Web Accelerator, also present unique privacy concerns

beyond the traditional logging of user activities across Google’s product suite.

Users running Google Toolbar with certain advanced features enabled (PageRank,

AutoLink, SpellCheck, and WordTranslator) share a considerable amount of

information about their Web browsing activities with Google. By sending Google

the addresses of every website visited by the user, the PageRank feature provides

a proxy PageRank calculation for a particular website. The AutoLink feature

scans the content of a visited webpage. If Google recognizes certain types of

information on the page (addresses, ZIP codes, ISBN numbers, etc.), AutoLink

automatically adds relevant links to the webpage. The SpellCheck feature

monitors the words that users type into Web forms in order to correct any spelling

mistakes. Finally, WordTranslator sends to Google the English words users

designate by hovering over them with the mouse and provides translations into

various languages. Whenever these features are activated, information about the

Web site being viewed is sent to Google for processing (Google, 2006k).58

Google Web Accelerator is a downloadable application that speeds up

page load times for faster Web browsing. While not directly related to Web

searching or information organization, Web Accelerator takes advantage of

Google’s computer infrastructure to make Web pages load faster. The software

accomplishes this through various means, including sending page requests

through Google servers dedicated to handling Google Web Accelerator traffic,

storing copies of frequently viewed pages to make them quickly accessible,

downloading only the updates if a Web page has changed slightly since it was last

viewed, prefetching certain pages onto a user’s computer that the user might visit

in the near future, as well as other data management and compression techniques

(Google, 2006m).

When using Web Accelerator, all non-secure Web page requests are

routed through Google’s servers, along with information such as the date and time

of the request, the user’s IP address, and computer and connection information.

Google stores and uses this information to help predict and prefetch additional

relevant Web content. Depending on how particular websites are set up, it is

possible that personally identifiable information embedded in the URL might also

be processed through and stored within Google’s servers. Google might also

58 See Appendix A for additional Google Toolbar features which capture user information.

temporarily cache other sites’ Web cookies when prefetching certain page

requests (Google, 2006m).
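
The structural point – that an accelerating intermediary necessarily observes every page request it handles – can be made concrete with a minimal sketch. The fragment below (illustrative Python using only the standard library; not Google’s implementation) caches fetched pages and prefetches a few of their links, and every URL a user visits passes through its fetch function:

# A minimal sketch of the cache-and-prefetch idea behind an accelerating proxy.
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect href targets from a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

cache = {}

def fetch(url):
    # Every request flows through here: an intermediary that accelerates
    # browsing necessarily sees (and here, stores) every URL requested.
    if url not in cache:
        cache[url] = urllib.request.urlopen(url).read()
    return cache[url]

def prefetch(url, limit=3):
    # Fetch a page, then warm the cache with a few of its links before
    # the user clicks them.
    collector = LinkCollector()
    collector.feed(fetch(url).decode("utf-8", errors="ignore"))
    for link in collector.links[:limit]:
        fetch(urljoin(url, link))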

In sum, Google’s encouragement of the creation of Google Accounts,

combined with its use of Web cookies and other data collection means, provides

the architecture to monitor and log user activity across the myriad products and

services that make up its larger Web search information infrastructure.59 The

result is a robust infrastructure arming Google with the ability to capture and

aggregate a wide array of personal and intellectual information about its users,

extending beyond just the keywords for which they search to include the

news they read, the interests they have, the blogs they follow, the books they

enjoy, the stocks in their portfolio, their schedule for the coming week, and

perhaps the URL of every website they visit.

Anxieties of Google’s Drive for the Perfect Search

In their “Letter from the Founders” submitted in anticipation of Google’s

initial public offering, Brin and Page state that Google is “not a conventional

company” and that they “aspire to make the world a better place” by “improv[ing]

the lives of as many people as possible” (Google, 2004). Elsewhere, Brin and

Page have noted their desire to “have positive social effects” and to make Google

a “social good” (Sheff, 2004, p. 59). Google’s apparent benevolence is embraced

by many who seem ready to concede Brin and Page’s quest to create a perfect

59 See Table 4 for a summary of personal information collected across Google’s products, and Appendix A for a more detailed description of each product and its method of collecting personal data.

search engine that will be “like the mind of God,” noting that Google is “poised to

become the perfect, all-seeing, all-knowing, all-powerful force of the 21st century”

(Ayers, 2003; see also, Friedman, 2003; Gorman, 2004). Others embrace “Planet

Google” as “so much of an improvement on how life was before” (Williams,

2006).

Statements such as these remind us of Postman’s warning about the ease

with which societies can embrace utopian visions of technological progress and

efficiency:

[In] cultures that have a democratic ethos, relatively weak traditions, and a
high receptivity to new technologies, everyone is inclined to be
enthusiastic about technological change, believing that its benefits will
eventually spread evenly among the entire population. Especially in the
United States, where the lust for what is new has no bounds, do we find
this childlike conviction most widely held. (Postman, 1992, p. 11)

Postman warned us of the allure of the Faustian bargain struck between

technology and society, the concern that certain questions about the adoption of

technology remain ignored in the face of this technological enthusiasm: “from

whose point of view the efficiency is warranted or what might be its costs?

…Whom will the technology give greater power and freedom? And whose power

and freedom will be reduced by it?” (Postman, 1992, p. 11). Indeed, the expansion

of Google’s reach into so many areas of people’s lives has left some uneasy, such

as one Google user who expressed “feeling a ‘weird tension’ about his love of

Google’s products and his fear about its omnipresence in his life” (Williams,

2006). It appears, then, that certain anxieties have emerged as a result of Google’s

quest for the perfect search engine.

Anxieties of Perfect Reach

Achieving the “perfect reach” requires search engines to index as many

Web pages and other online sources as possible to provide the largest database of

potential search results. Among the billions of pages indexed by search engines

are Web pages containing personal information about individuals, easily

discoverable by the simplest of search queries, such as old homepages, discussion

forum postings, online resumes, minutes of public meetings, property tax records,

and court records. Few people are not affected by the “long arm of Google’s Web

crawler,” explains journalist Neil Swidey, referring to the broad scope of

Google’s Web index:

Maybe it was a stupid fraternity prank or a careless posting to an Internet


newsgroup in college. Perhaps you once went on a rant at a selectmen’s
meeting or signed a petition without stopping to read it. Or maybe you
endured a bitter divorce. You may think those chapters are closed. Google
begs to differ.
While most of your embarrassing baggage was already available to
the public, it was effectively off-limits to everyone but the professionally
intrepid or supremely nosy. Now, in states where court records have gone
online, and thanks to the one-click ease of Google, you can read all the
sordid details of your neighbor’s divorce with no more effort than it takes
to check your e-mail. (Swidey, 2003)

Engaging in a “vanity search” – a Web search for one’s own name – can

reveal a surprising amount of personal information: tax assessments, court

documents, marriage licenses, deeds and voter registration information, for

example. The notion of “Googling” someone before a blind date has become

common practice (Lobron, 2006). Almost one in four Web users have searched

online for information about co-workers or business contacts (Sharma, 2004), and

employers are Googling prospective employees before making hiring decisions

(Weiss, 2006). In less than an hour, one reporter uncovered a variety of personal

details of Google CEO Eric Schmidt’s life:

Schmidt doesn’t reveal much about himself on his home page. But
spending 30 minutes on the Google search engine lets one discover that
Schmidt, 50, was worth an estimated $1.5 billion last year. Earlier this
year, he pulled in almost $90 million from sales of Google stock and made
at least another $50 million selling shares in the past two months as the
stock leaped to more than $300 a share.
He and his wife Wendy live in the affluent town of Atherton,
Calif., where, at a $10,000-a-plate political fund-raiser five years ago,
presidential candidate Al Gore and his wife Tipper danced as Elton John
belted out “Bennie and the Jets.”
Schmidt has also roamed the desert at the Burning Man art festival
in Nevada, and is an avid amateur pilot. (Mills, 2005)60

Besides concerns over these types of vanity or journalistic searches, by

encouraging search engines to expand both the depth and breadth of their Web

indexes, the quest for the perfect search has also reduced users’ “security through

obscurity” (Swidey, 2003; Ramasastry, 2005a).

Not surprisingly, anxieties have arisen in the wake of this lack of security

through obscurity. In 2002, a reader submitted the following dilemma to “The

Ethicist” column in the New York Times Magazine:

My friend went on a date last week and “Googled” the man when she got
home -- that is, looked him up on the Internet search engine google.com.
She found that he had been involved in many malpractice suits. (He’s a
doctor.) Her “homework” has now resulted in a discounted opinion of this
man. What do you think about using Google to check up on another
person? (Cohen, 2002)

60 Ironically, Google punished CNET for publishing the personal information about Schmidt – found via its own search engine – with a one-year boycott against answering any inquiries from the news service. Amid public criticism, Google ended the boycott two months later.

Randy Cohen – the ethicist – is not overly concerned with this instance of using

Google’s perfect reach in order to find out details about another person. He argues

it “was akin to asking her friends about this fellow – offhand, sociable and

benign” (Cohen, 2002).

Not everyone agrees with the ease with which Cohen justified the use of

search engines to obtain personal information about another person. Certainly, not

all occurrences are as “sociable and benign” as the example above. Wright and

Kakalik (2000) have pointed out that a certain kind of information about

individuals, which was once difficult to find and even more difficult to cross-

reference, is now readily accessible and collectible through the use of search

engines, with particular consequences for individual privacy. Other scholars have

focused on particular privacy implications of the perfect reach of search engines,

such as the ability to exploit the increased publication of personal information

online in order to engage in cyberstalking (Tavani & Grodzinsky, 2002), the

usefulness of search engines to aid in aggregation and data mining of personal

information across otherwise disparate databases (Garfinkel, 2000), and to build

digital dossiers of individuals (Solove, 2004). Herman Tavani (2005) has written

specifically about the ease with which personal information can be routinely

collected, aggregated, and analyzed by Web search engines. Noting how the

perfect reach of search engines now extends to various mailing list and discussion

group archives, Tavani argues that

Because the various news groups contain links to information posted by a


person, they can provide search-engine users with considerable insight

into that person’s interests and activities.61 So it would seem to follow that
not all of the personal information currently included on Web sites
accessible to search engines was necessarily either placed there by the
persons themselves or explicitly authorized to be placed there by those
persons. (Tavani, 2005, p. 40)

An individual might not be aware that her name is among those included in one or

more of these databases accessible to search engines, let alone fluent in how

search engines themselves work and their ability to retrieve personal information

from a variety of online sources. John Battelle summarizes this anxiety best:

What do we do when information that we know, by law, should be public,


becomes, well…really public?...What happens when every single thing
that’s ever been publicly known about you – from a mention in your
second-grade newsletter (now online, of course) to the vengeful ravings of
a spurned lover – trails your name forever? (Battelle, 2005, p. 193)

As we are faced with this challenge of protecting “privacy in public”

(Nissenbaum, 1998, 2004), we are forced to recognize that, powered by its perfect

reach, the quest for the perfect search engine “hath taken away” the practical

obscurity of personal information online.

Anxieties of Perfect Recall

A key component of the perfect search engine is its perfect recall – the

ability to know the searcher, and what she has searched for in the past to help

tailor both search results and advertising to her interests and needs. While it is

easy to think of a search engine as a one-way interface – a simple search term is

entered, and a flood of possible search results are returned – there is an important

61 A search for my name uncovers (admittedly forgotten) posts to Usenet discussion forums from the early 1990s on topics including abortion rights, Catholicism, feminism, marketing, and Lotus 1-2-3 spreadsheet software.

feedback loop. In the quest for the perfect search engine, the interface is actually

two-way. As detailed above, through encouragement of the creation of Google

Accounts, combined with the use of persistent Web cookies, Google has also

constructed the necessary architecture for the creation of detailed server logs of

users’ online intellectual activities, both on Google properties and beyond (Table

4 at end of chapter). The result, a major step toward achieving the perfect recall

necessary to deliver personalized results and services, also arms Google with the

ability to collect and aggregate a wide array of personal and intellectual

information about its users, extending beyond just the keywords they search for

to include the news they read, the interests they have, the blogs they

follow, the books they enjoy, and perhaps even every website they visit.

As is typical for search engines, Google’s log files record the search terms

used, Web sites visited, Internet Protocol address, and Web cookie for every

single search conducted through its site. This can easily be combined with other

personally identifiable information collected by Google in order to use its other

services. For instance, Gmail asks for a user’s name and e-mail address, Google

Maps could store her home address, Dodgeball stores her cellphone number and

locational data, and Google Finance could collect the stocks in her portfolio. If

combined with her Web search history, Alert keywords, and Personalized

Homepage modules, Google would see intimate details about a person’s identity,

political interests, health status, sex life, religion, financial status, and buying

preferences. The result is what John Battelle calls a “database of intentions”:

This information represents, in aggregate form, a place holder for the
intentions of humankind - a massive database of desires, needs, wants, and
likes that can be discovered, subpoenaed, archived, tracked, and exploited
to all sorts of ends. Such a beast has never before existed in the history of
culture, but is almost guaranteed to grow exponentially from this day
forward. This artifact can tell us extraordinary things about who we are
and what we want as a culture. (Battelle, 2003)

While many of our day-to-day habits – such as using credit cards, ATMs,

cell phones, or automated toll collection systems – leave countless “virtual

footprints” of our activities, Google’s infrastructure of dataveillance tracks our

search histories, e-mails, blog posts, and general browsing habits, providing “an

excellent source of insight into what someone is thinking, not just what that

person is doing” (Hinman, 2005, p. 23).

A Faustian Bargain Emerges

Recognizing these anxieties, a Faustian bargain emerges with the quest for

the perfect search engine: The perfect search engine promises breadth, depth,

efficiency, and relevancy, but threatens any sense of “security through obscurity”

through its perfect reach, as well as enabling the widespread collection of

personal and intellectual information in the name of its perfect recall. While many

searchers have acknowledged this anxiety about the presence of such systematic

monitoring of their online information-seeking activities (Barbaro & Zeller Jr,

2006; Hafner, 2006; Levy, 2006; Maney, 2006), there has been little evidence of

widespread changes in user behavior in light of these revelations.62 After the dust

62 In the year since the DOJ case emerged, search engine activity has increased from 5.3 billion searches in February 2006 (Nielsen//NetRatings, 2006) to 6.4 billion in 2007 (Nielsen//NetRatings, 2007), an increase of over 20%. Google, for its part, reported $10.6 billion in revenues during fiscal 2006, compared to only $6.1 billion for 2005, a 73% increase (Google, 1999).

settled from these privacy controversies, the allure of “Planet Google” has

maintained its hold on the faithful: “I don’t know if I want all my personal

information saved on this massive server in Mountain View, but it is so much of

an improvement on how life was before, I can’t help it” (Williams, 2006).

The fact that the user quoted above “can’t help” embracing Google’s suite

of products, while acknowledging Google’s ability to collect a vast amount of

personal information about his online intellectual and social activities, reveals the

potency of the Faustian bargain that society makes with the perfect search engine. In

the presence of such a bargain, Postman warned, society eventually succumbs to

the promises made by technology, ignoring such issues of power, equality, and

freedom. As the quest for the perfect search engine continues its meteoric rise,

then, it becomes increasingly difficult for users to recognize or question their

value-related externalities, and more tempting to simply take the design of such

tools “at interface value” (Turkle, 1995, p. 103), a condition that seems all too

apparent in this student’s perspective:

Anne Rubin, 20, a New York University junior who uses Google's search,
Gmail and Blogger services, says quality overrides any privacy concerns,
and she doesn't mind that profiles are built on her in order to make the ads
she sees more relevant. “I see it as a tradeoff. They give services for free,”
she said. “I have a vague assumption that things I do (online) aren't
entirely private. It doesn't faze me.” (Associated Press, 2005)

In order to break the hold of this Faustian bargain, we need to gain conceptual

clarity and a normative understanding of the ways in which the quest for the

perfect search engine bears on user privacy. As the following chapter will explain,

the theory of “privacy as contextual integrity” (Nissenbaum, 1998, 2004) can

provide both a novel and effective framework to reveal how Google’s quest for

the perfect search does alter personal information flows in such a way that

threatens users’ ability to fully utilize and enjoy this important online information

space. Contrary to Ms. Rubin’s stance, we will reveal how Google’s quest for the

perfect search should faze us.

Table 3: Google Suite of Products and Services

Product - Description - Notes

General Information Inquiries
Web search - Query-based website searches
Personalized Homepage - Customized Google start page with content-specific modules - Use in conjunction with Google Account is encouraged
Alerts - E-mail alerts of new Google results for specific search terms
Image Search - Query-based search for website images
Video - Query-based search for videos hosted by Google - Google Video Player available for download
Book Search - Full-text searches of books scanned into Google’s servers - Google Account required in order to limit the number of pages a particular user can view

Academic Research
Scholar - Full-text searches of scholarly books and journals

News and Political Information
News - Full-text search of recent news articles - With a Google Account, users can create customized keyword-based news sections
Reader - Web-based news feed reader - Google Account required
Blog Search - Full-text search of blog content

Communication and Social Networking
Gmail - Free Web-based e-mail service with contextual advertising - Creation of a Gmail account automatically results in activation of a Google Account; logging into Gmail also logs the user into their Google Account
Groups - Free Web-based discussion forums - Includes complete Usenet archives dating back to 1981; Google Account required for creation of a new Group
Talk - Web-based instant messaging and voice calling service - Google Account and Gmail e-mail address required
Blogger - Web-based blog publishing platform - Google Account required
Orkut - Web-based social networking service - Invitation-only; Google Account required
Dodgeball - Location-based social networking service for cellphones

Personal Data Management
Calendar - Web-based time-management tool

Financial Data Management
Finance - Portal providing news and financial information about stocks and mutual funds; ability to track one’s financial portfolio - Google Account required for posting to discussion board

Consumer Activities
Catalog Search - Full-text search of scanned product catalogs
Froogle - Full-text search of online retailers - Google Account required for shopping lists
Local / Maps - Location-specific Web searching; digital mapping

Computer File Management
Desktop Search - Keyword-based searching of computer files; ability to search files on a remote computer

Internet Browsing
Bookmarks - Online storage of website bookmarks - Google Account required
Notebook - Browser tool for saving notes while visiting websites - Google Account required
Toolbar - Browser tool providing access to various Google products without visiting Google websites - Some features require a Google Account
Web Accelerator - Software to speed up page load times for faster Web browsing

Table 4: Personal Information Collected by Google’s Suite of Products

Product - Information Collected - Notes

General Information Inquiries
Web search - Web search queries; results clicked - Searches for one’s own name, address, social security number, etc. are common
Personalized Homepage - News preferences; special interests; zip code
Alerts - News preferences; special interests; e-mail address - Alerts for a user’s own name (vanity searches) are common
Image Search - Search queries; results clicked
Video - Search queries; videos watched/downloaded; credit card information for purchased videos; e-mail details for shared videos - Google Video Player contains additional DRM technology to monitor off-site video usage
Book Search - Search queries; results clicked; pages read; bookseller pages viewed

Academic Research
Scholar - Search queries; results clicked; home library (optional)

News and Political Information
News - News search queries; results clicked
Reader - Feed subscriptions; usage statistics
Blog Search - Search queries; results clicked

Communication and Social Networking
Gmail - Text of e-mail messages; e-mail searches performed; e-mail address or cellphone number (used for account creation)
Groups - Search queries; user interests; usage statistics; profile information - Users are encouraged to create detailed profiles, including name, location, industry, homepage, etc.
Talk - Contact list; chat messages; usage statistics
Blogger - Weblog posts and comments; profile information; usage statistics - Users are encouraged to create detailed profiles, including name, location, gender, birthday, etc.
Orkut - Profile information; usage statistics - Users are encouraged to create detailed profiles, including name, location, gender, birthday, etc.
Dodgeball - Profile information; e-mail address; location; mobile phone information; text messages sent - User location when messages are sent is tracked by Google

Personal Data Management
Calendar - Profile information; events; usage statistics

Financial Data Management
Finance - Financial quotes; discussion group activity; portfolio (optional); profile information - Names and e-mails are displayed with discussion posts

Consumer Activities
Catalog Search - Product search queries; results clicked
Froogle - Product search queries; results clicked; sites visited; shopping list
Local / Maps - Search queries; results clicked; home/default location - Search queries might include geographic-specific information

Computer File Management
Desktop Search - Search queries; computer file index (optional) - Search queries visible to Google under certain circumstances; desktop file index is stored on Google’s servers if using Search Across Computers

Internet Browsing
Bookmarks - Favorite websites; when visited
Notebook - Notes and clippings; sites annotated
Toolbar - Search queries; websites visited - Use of some advanced features routes all browsing traffic through Google servers
Web Accelerator - Websites visited - All browsing traffic is routed through Google servers


CHAPTER V

CONTEXTUAL INTEGRITY AND THE PERFECT SEARCH ENGINE

Introduction

Postman feared that the Faustian bargain society strikes with its

technology is often hidden, subsumed in some wider ideology of efficiency and

utopian vision of technological progress (1992, p. 11). He warned that we tend to

be “surrounded by the wondrous effects of machines and are encouraged to ignore

the ideas embedded in them. Which means we become blind to the ideological

meaning of our technologies” (1992, p. 94). Indeed, search engine users are

constantly reminded that the collection of such information is “to improve the

quality of our services” (Google, 2005i) or to “improve the overall quality of the

online experience” (IAC Search & Media, 2005). The quest for the perfect search

engine threatens to lower expectations of privacy as users increasingly become

bombarded with the rhetoric of efficiency, utility, and relevancy.

Consider, for example, Sergey Brin and Larry Page’s response when

confronted in an interview with Playboy magazine about privacy concerns related

to Gmail’s scanning of incoming messages in order to place advertising:

PLAYBOY: The Electronic Privacy Information Center equates such


monitoring with a telephone operator listening to your conversations and
pitching ads while you talk.
BRIN: That’s what Hotmail and Yahoo do, don’t forget. They have big
ads that interfere with your ability to use your mail. Our ads are more
discreet and off to the side. Yes, the ads are related to what you are
looking at, but that can make them more useful.
PAGE: During Gmail tests, people bought lots of things using the ads.
BRIN: Today I got a message from a friend saying I should prepare a toast
for another friend’s birthday party. Off to the side were two websites I
could go to that help prepare speeches. I like to make up my own
speeches, but it’s a useful link if I want to take advantage of it.
PLAYBOY: Even that sounds ominous. We may not want anyone—or
any machine—knowing we’re giving a speech at a friend’s birthday party.
BRIN: Any Web mail service will scan your e-mail. It scans it in order to
show it to you; it scans it for spam. All I can say is that we are very up-
front about it. That’s an important principle of ours.
PLAYBOY: But do you agree that it raises a privacy issue? If you scan for
keywords that will trigger ads, you could easily scan for political content.
BRIN: All we’re doing is showing ads. It’s automated. No one is looking,
so I don’t think it’s a privacy issue. To me, if it’s a choice between big,
intrusive ads and our smaller ones, it’s a pretty obvious choice. I’ve used
Gmail for a while, and I like having the ads.
PLAYBOY: Do the ads pay for the extra storage space?
BRIN: Yes. Targeted advertising is an important component. We could
have had glaring videos appear before you look at every message. That
could generate revenue too. Our ads aren’t distracting; they’re helpful.
PAGE: I find it works well. And it’s an example of the way we try to do
good. It’s a high-quality product. I like using it. Even if it seems a little
spooky at first, it’s useful, and it’s a good way to support a valuable
service. (Sheff, 2004, pp. 59-60)

Here, concerns over the appropriateness of scanning the content of incoming e-

mail messages are countered by rhetorical claims that Google’s ads are smaller

and more discreet than the competition, that people clicked on the ads during

testing, and scanning emails to place ads is inherently “helpful,” “useful,” and a

“good way to support a service.” Instead of addressing privacy concerns head-on,

Google’s founders frame the issue as a choice between “big, intrusive ads and our

smaller ones.” In their eyes, creating “a valuable service” easily outweighs

concerns over user privacy, even though Page admits the ads are “spooky at first.”

Such reasoning is hard for an average consumer to dispute, especially when

enticed by an ever-increasing array of search-related products and services to help

“organize the world's information and make it universally accessible and useful”

(Google, 2005b).

Fueling the acquiescence to compromise for the perfect search engine are

the persistent claims by pundits, journalists, and search engine companies

themselves, that “no personal information is ever collected” (Sullivan, 2003b,

2003c),63 that “the privacy concerns are probably overblown” (Mills, 2005), or

that “if you have nothing to hide when you use the internet, you have nothing to

fear” (Griffin, 2006). This final rhetorical device – that if you have nothing to

hide then you should have no concern for your privacy – is particularly

pernicious. Here, the word “hide” presupposes that nobody can have a legitimate

motive for wishing to protect information about his or her life outside of wanting

to hide it from discovery. As Phil Agre has noted, “This is obviously false”:

People have a legitimate interest in avoiding disclosure of a wide variety


of personal circumstances that are none of anyone’s business. If you are
raped, will you want the full details published in the newspapers the next
day? I don’t think so. People also have a broader (though obviously not
unbounded) interest in regulating how they are represented in public. If
someone doesn’t like you, they will dredge up all sorts of facts and portray
them in a bad light. If only for this reason, it is reasonable to avoid giving
everyone unlimited access to your life. (Agre, 1996)

Agre argues that in a free society, one does not need to have “something to hide”

to keep personal information from prying eyes; it should be the default position.

63 A claim disproven by the ease of identifying users from the “anonymized” AOL search records data release (see Barbaro & Zeller Jr, 2006).

The quest for the perfect search threatens to change this default, yet its danger

remains clouded by rhetoric of improving services, making one’s life easier,

and the notion that one should not have anything to hide in the first place.

An equally common – and equally problematic – response to concerns

about the information-gathering abilities of Google is the argument that users

knowingly share the information, and already share similar information with other

people and institutions. For example, addressing concerns that Google is able to

track and collect all of a user’s browsing activity through the Web Accelerator

product, a company representative attempted to quell the privacy issue by

asserting that Web Accelerator receives much of the same kind of information

that people already share with their Internet service providers when surfing the

Web (Hines, 2005). Or that Google logging book searches is no different than

asking a librarian for help finding particular books. Or that Google’s scanning of

Gmail messages in order to place contextual advertisements is no different than

spam filters. Such appeals that the status quo has simply been maintained have

clouded many discussions and concerns about user privacy.

It appears, then, that the Faustian bargain that society must make to reap

the benefits of the perfect search engine has succeeded in obscuring the various

issues related to privacy, freedom, and autonomy inherent in using the Web for

information-seeking and other intellectual and social activities. Concerns about

the collection of personal and intellectual information are frequently

overwhelmed by bold claims of efficiency or utility, or countered with arguments

that little threat to privacy actually exists. Recalling Brey’s prescription for

disclosive computer ethics, to “make potentially morally controversial computer

features and practices visible” (Brey, 2000, p. 13), we must take steps to achieve

conceptual clarity of the value and ethical implications of the perfect search

engine. The theory of “privacy as contextual integrity,” developed by Helen

Nissenbaum (1998; 2004), provides an effective framework to reveal how

Google’s quest for the perfect search alters personal information flows in such a

way that threatens users’ ability to fully utilize and enjoy this important online

information space.

Understanding “Contextual Integrity”

Contextual integrity is a benchmark theory of privacy, a conceptual

framework that links the protection of personal information to the norms of

personal information flow within specific contexts. It provides a framework for

evaluating the flow of personal information between agents to help identify and

explain why certain patterns of information flow are viewed as problematic.

Through her development of this new theory of privacy, Nissenbaum (1998;

2004) argues that informational norms – specific to particular contexts – govern

the flow of personal information from one entity to another.

Rejecting the traditional dichotomy of public versus private spaces – and

its related clean division between public and private information – a key

recognition within contextual integrity is that the multitude of information-sharing

activities take place in a “plurality of distinct realms”:

They are at home with families, they go to work, they seek medical care,
visit friends, consult with psychiatrists, talk with lawyers, go to the bank,
attend religious services, vote, shop, and more. Each of these spheres,
realms, or contexts involves, indeed may even be defined by, a distinct set
of norms, which governs its various aspects such as roles, expectations,
actions, and practices. (Nissenbaum, 2004, p. 137)

Within each of these contexts, norms exist – either implicitly or explicitly – which

both shape and limit our roles, behaviors, and expectations. For example, it might

be acceptable for me to approach a stranger and offer her a hug at a moving

religious service, but not in the grocery store. A judge might willingly accept

birthday gifts from colleagues, but would hesitate to accept one from a lawyer

currently arguing a case in her courtroom. It is deemed appropriate for a physician

to ask me my age, but not for a bank teller. While it is necessary for an airline to

know my destination city, it would be inappropriate for them to ask where I will

be staying, with whom I will be meeting, or what will be discussed.

In short, norms of behavior vary based on the particular context. The latter

examples above reveal the ways in which norms govern the flow of personal

information in particular contexts. Whether in discussions with a physician,

purchasing items in a store, or simply walking through a public park, norms of

information flow govern what type and how much personal information is

relevant and appropriate to be shared with others. The theory of contextual

integrity is built around the notion that there are “no arenas of life not governed

by norms of information flow” (Nissenbaum, 2004, p. 137). These norms explain

the boundaries of our underlying entitlements regarding personal information, and

our privacy is invaded when the informational norms are contravened. Within

each context, the relevant agents, the types of information, and transmission

principles combine to shape the governing informational norms (Barth et al.,

2006). Each of these constructs will be described in greater detail below.

Informational norms always include three relevant agents: the information

subject, the one who has the information and is distributing it (who may or may

not be the subject), and the one who receives the information. The informational

norms within a particular context dictate the roles of the agents, each associated

with a set of duties and privileges. For example, in the healthcare context, the

personal information shared by the patient (the subject and sender) depends very

much on who the recipient is – the physician, the receptionist, the claims

processor, and so on. In turn, the rules governing the transmission of personal

information by the physician depend on who the recipient is – the patient, a

colleague, the insurance company, and so on. The specification and roles of the

various agents are key variables affecting the maintenance of contextual integrity

within a particular context of informational norms.

The type of information in question is another defining aspect of

informational norms. Unlike most theories of privacy, contextual integrity rejects

the notion that information types fit into a rigid dichotomy of public or private.

Instead, a potentially indefinite variety of information types could feature in the informational norms of a given context, and their

categorization might shift from one context to another. Again, in a healthcare

setting, different informational norms apply depending on whether the

information is a patient’s medical condition, home address, or account balance.

The notion of “appropriateness” is a useful way to signal whether the type of

information in question conforms to the relevant informational norms. Norms of

appropriateness “circumscribe the type or nature of information about various

individuals that, within a given context, is allowable, expected, or even demanded

to be revealed” (Nissenbaum, 2004, p. 138). In some contexts, norms of

appropriateness are very open, such as in a personal friendship where personal

information tends to flow freely. In other contexts, such as the job interview or

classroom, more explicit and restrictive norms of appropriateness prevail, and the

flow of appropriate personal information is more highly regulated. Nevertheless,

norms of appropriateness apply in all situations: among both strangers and loved

ones, in personal and professional interactions, in private and public.

The notion of a transmission principle may be the most distinctive

component of the informational norms that frame contextual integrity.

Transmission principles place constraints on the flow or distribution of

information from agent to agent within a context. Confidentiality is an example of

a transmission principle in which the agent receiving the information is prohibited

from transmitting the information to other agents. In some contexts, the

information flow is bi-directional, representing the transmission principle of

reciprocity. In others, agents might be compelled to divulge information by a legal

authority; in still others, the transfer of information might be voluntary, or made

only when proper consent is provided. For example, transmission principles

outlined in professional codes of ethics dictate that my physician can share only

some of my personal information with other doctors: she might share my

symptoms or family history to aid in diagnosis, but not my name. More restrictive

principles have been codified into our legal systems, such as the burden necessary

for law enforcement to obtain my detailed phone records. Informational norms

prescribe which transmission principles ought to govern the flow of information

in particular contexts, and such norms are violated if the principles are not

followed.

To summarize, within each context, informational norms are shaped by

identification of the relevant agents, the types of information, and the appropriate

transmission principles. With these components, contextual integrity generates a

decision heuristic to help explain when privacy objections are likely to be aroused

by the introduction of a new technology or practice. Rather than aspiring to

universal prescriptions for what is public versus private information, contextual

integrity works from within the normative bounds of a particular context. It is

designed to consider how the introduction of a new practice or technology into a

given context might impact the governing informational norms to see whether and

in what ways those norms might be breached. In order to determine if contextual

integrity has been maintained, we must consider how the new technology or

practice affects the agents involved, the appropriateness and type of information,

and the transmission principles that constrain the flow of information from agent

to agent. If the introduction of a new technology or practice in a given context is

found to conflict with the standing informational norms, a red flag is raised,

indicating that contextual integrity has been violated. The usefulness of contextual

integrity as a heuristic for identifying potential privacy violations is perhaps best

revealed by examining a recent application of the theory to the introduction of

vehicle safety communication technologies into the context of highway travel

(Zimmer, 2005).
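
Before turning to that example, we can make the structure of this heuristic concrete with a short sketch in code. The fragment below is a minimal illustration only – its class and function names are assumptions of my own, not the formalism of Nissenbaum (2004) or of Barth et al. (2006) – but it shows how agents, information types, and transmission principles combine into a norm, and how a proposed flow that matches no entrenched norm raises a red flag:

    from dataclasses import dataclass
    from enum import Enum, auto

    class TransmissionPrinciple(Enum):
        VOLUNTARY = auto()        # subject chooses to divulge the information
        CONFIDENTIALITY = auto()  # receiver may not pass the information on
        RECIPROCITY = auto()      # information is expected to flow both ways
        COMPELLED = auto()        # divulgence required by a legal authority
        AUTOMATIC = auto()        # captured as a condition of the interaction

    @dataclass(frozen=True)
    class Flow:
        # One transfer of personal information between agents in a context.
        context: str     # e.g., "healthcare"
        sender: str      # the agent distributing the information
        receiver: str    # the agent obtaining the information
        subject: str     # the person the information is about
        info_type: str   # e.g., "medical condition"
        principle: TransmissionPrinciple

    def violates_contextual_integrity(flow, norms):
        # Red flag: the proposed flow matches no entrenched informational norm.
        return flow not in norms

    # The physician example from above: a patient voluntarily divulging her
    # condition to her physician conforms to the norms; the physician
    # automatically forwarding it to an insurer does not.
    norms = {
        Flow("healthcare", "patient", "physician", "patient",
             "medical condition", TransmissionPrinciple.VOLUNTARY),
    }
    proposed = Flow("healthcare", "physician", "insurer", "patient",
                    "medical condition", TransmissionPrinciple.AUTOMATIC)
    print(violates_contextual_integrity(proposed, norms))  # True

Real informational norms are, of course, far richer than set membership – roles, duties, and privileges attach to each agent – but even this toy version captures the heuristic’s central move: comparing a proposed flow against the entrenched norms of its context.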

Example: Vehicle Safety Communication Technology

Vehicle safety communication (VSC) technologies are intelligent, on-

board vehicle safety applications that share, receive, and process data from the

vehicle’s surrounding environment (Vehicle Safety Communications Consortium,

n.d.; Horrell, 2003; Derene, 2007). Made possible by recent advances in wireless

data communication technology, VSC solutions aim to afford the driver every

possible opportunity to avoid an accident by providing real-time information

about the surrounding road conditions as well as nearby vehicles, warning of

hazards, and predicting dangerous scenarios or imminent collisions. VSC systems

rely on the creation of autonomous, self-organizing, point-to-multipoint wireless

communication networks – so-called ad-hoc networks – connecting vehicles with

roadside infrastructure and with each other. In these networks, both vehicles and

infrastructure collect local data from their immediate surroundings, process this

information and exchange it with other networked vehicles to provide real-time

safety information about the immediate surroundings. Data messages, which are

transmitted 10 times per second, potentially include the vehicle’s location, time

and date stamps, vehicle speed and telemetry data, and some sort of vehicle

identification number (Vehicle Safety Communications Consortium, n.d.).
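
As a rough sketch of what one such data message might contain – the field names below are my own assumptions, inferred from this description rather than from any published VSC message specification – consider:

    import time
    from dataclasses import dataclass

    @dataclass
    class VSCMessage:
        # Illustrative contents of one safety broadcast; fields are hypothetical.
        vehicle_id: str   # some sort of vehicle identification number
        timestamp: float  # time and date stamp
        latitude: float   # vehicle location
        longitude: float
        speed: float      # vehicle speed and telemetry data
        heading: float

    def broadcast_loop(read_sensors, send):
        # Messages are transmitted roughly ten times per second.
        while True:
            send(read_sensors())
            time.sleep(0.1)

Even this simplified message makes the privacy stakes visible: identity, location, and telemetry leave the vehicle together, ten times every second.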

VSC systems pose a Faustian bargain of their own: Coupled with the

predicted safety benefits of VSC applications (U.S. Department of Transportation,

2005) is a potential rise in the ability to surveil a driver engaging in her everyday

activities on the roads and highways. VSC technologies potentially enable the

collection of information on where drivers go, when they make their trips, and

what routes they use. However, since much of the tracking or surveillance made

possible by VSC technologies occurs in public as the driver travels along the open

road, traditional conceptualizations of privacy fail to properly accommodate the

concern for one’s privacy in public. Recalling some of the responses to privacy

concerns with the perfect search engine, many argue that drivers have no

expectation of privacy when traveling on the public roads (Harris, 2005), while

others maintain that VSC systems do not provide any information different from what can be observed physically from a passing vehicle.64

Applying the theory of privacy as “contextual integrity,” however,

provides the conceptual framework to reveal how these safety technologies have

the potential to disrupt the informational norms in the context of highway travel,

threatening drivers’ privacy even when driving along public roads (for a more

detailed discussion, see Zimmer, 2005). Prior to the introduction of VSC

technologies, informational norms dictated only the sharing of information that

could be viewed by someone who happened to be at the right place at the right time to visually observe a car pass by. Mass surveillance was difficult due to particular natural barriers that influenced the informational norms. The license plate could be read only if the lighting conditions were right, speed could only be approximated, and the direction a car was traveling could be monitored only until it was out of visual range.

64 In a personal conversation, an engineer working on VSC-related technologies remarked that the information shared with these new systems “is the same as your license plate.”

While existing informational norms in the context of highway travel

anticipate the sharing of some generally observable and nonidentifiable

information, the introduction of VSC technology challenges many of these norms.

The implementation of GPS-enabled VSC technology allows recording of a

vehicle’s precise location over time. The transmission of unique identifiers by

VSC technologies also increases the accuracy of vehicle identification. The

precision of information regarding a driver’s habits and current status also

increases with the introduction of VSC systems that process and record vehicle

telemetry, including speed, acceleration, heading, yaw-rate, brake position,

throttle position and steering wheel angle.

By overcoming some of the natural barriers to mass surveillance of

highway traffic, VSC technologies could also disrupt existing transmission

principles of personal information in the context of highway travel. Vehicles

equipped with VSC technologies will be constantly transmitting information

about their identity, location, and status for reception by other vehicles, roadside

infrastructure, or anyone else with the proper receiving equipment. Humans will

no longer need to be positioned in a particular place to visually observe a vehicle;

all that is needed is a well-placed receiver for the information from all passing vehicles to be recorded. Further, a series of receivers could collect information from the

same vehicle over a span of miles. VSC technology has the potential to disrupt the

natural barriers that previously limited the ability to track individual vehicles over

space and time. Rather than a single piece of information being observed by a

person or camera that just happens to be at the right place at the right time, VSC

technologies could allow information to be gathered and consolidated on a large

scale and across a large area.
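
A few lines of code suffice to show how trivially such broadcasts, once logged by a series of receivers, could be consolidated into per-vehicle travel histories. This is a hypothetical sketch, assuming only that each receiver records a vehicle identifier, a timestamp, and a location for every message it hears:

    from collections import defaultdict

    def consolidate(logged_messages):
        # logged_messages: (vehicle_id, timestamp, location) tuples pooled
        # from many roadside receivers across a large area.
        tracks = defaultdict(list)
        for vehicle_id, timestamp, location in logged_messages:
            tracks[vehicle_id].append((timestamp, location))
        for track in tracks.values():
            track.sort()  # chronological travel history for each vehicle
        return dict(tracks)

The surveillance that once required a person at the right place at the right time reduces to a grouping and sorting operation over pooled receiver logs.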

VSC technologies have the potential to disrupt transmission principles

even further. While existing traffic cameras allow the archival and retrieval of

video surveillance images, the digital nature of the information provided by VSC

applications greatly expands the ability to process, store, and share vast amounts of

personal information about individual vehicles. The processing of digital

information can be automated, alleviating the need for a human to

physically view hours of camera footage, and increasing exponentially the size

and complexity of data analyses. Data mining can be performed with ease, as can

aggregation with other databases. Additionally, the digital nature of vehicle data

enabled by VSC technology expands the ability and reduces the cost for

distributing information to third parties, potentially including insurance

companies, marketers, or other government agencies who might have interest in

detailed driver data.

By viewing the introduction of VSC technologies through the lens of

“contextual integrity,” we can see how the design of these systems might alter the

flow of personal information in the context of highway travel and threaten the

value of privacy in public. VSC technologies enable the collection of information

on drivers’ habits and activities on the roads. They represent a shift from the

sharing of only general and visually observable driver information to the

widespread and constant broadcasting of precise, digital information about

drivers’ daily activities. With the potential integration of VSC technologies into

our daily activities on the public roads, we are in danger of violating the existing

informational norms within the context of highway travel. Concerns about the impact

of VSC technology on a driver’s privacy can receive new attention when framed

through the heuristic of contextual integrity. A similar application of contextual

integrity will prove fruitful when considering the privacy implications of the

perfect search engine.

Contextual Integrity in the Perfect Search

Rejecting strict dichotomies of public versus private information,

contextual integrity frames issues of privacy within particular contexts,

recognizing that information considered public in a given context might be considered private in another, and that just because information is provided to one

agent (a librarian, say) does not mean that it is automatically acceptable to share

the same information with another agent (a search engine provider or advertiser).

As shown in the example with vehicle safety communication systems, contextual

integrity works especially well to illuminate how new technologies and

practices are introduced into particular contexts, revealing how personal

information flows can be disrupted in such a way that threatens the privacy of the

parties involved. When applied to Google’s quest for the perfect search engine,

contextual integrity will provide the means to understand how these new technical

systems violate existing norms of information flows.

To determine the potential impact of Google’s quest for the perfect search

on the informational norms dictating the flow of personal information when

engaging in social and intellectual activities, we can create a thought experiment

featuring two ideal typical information seekers, Elizabeth “Libby” Doe and

Annette “Netty” Roe. Libby and Netty are nearly identical in their personal,

social, political, cultural and economic characteristics. Both are 30-year-old,

single, gay South Asian women. Both are Hindu, live in Brooklyn, New York, and

tend to vote for Democrats. Libby and Netty are graduate students at New York

University, studying political science and feminist theory. They enjoy sports and

cooking as hobbies; both are thinking of having a baby, but have concerns due to

being diabetic. They have similar investment portfolios, enjoy keeping in touch

with friends, and like to share photos and stories from their travels.

The two differ, however, in how they navigate their “informational

spheres.” Libby prefers traditional, “old-fashioned” methods of information-

seeking and communication: reading print newspapers, watching television news,

word of mouth, and written and oral correspondence. While not averse to using the

Internet, when Libby needs to find information on a topic, she prefers visiting the

library. Netty, on the other hand, relies heavily on the Internet to manage

information and communicate with others. When Netty needs information about a

topic, she “Googles” it. In fact, Netty relies on Google’s broad array of products

and services for virtually all of her online activities.

When navigating their respective “spheres of information,” both Libby

and Netty inevitably share bits of personal information with others. Appendix B

describes these flows of personal information within each of the nine distinct

contexts of information-seeking identified in the previous chapter, allowing

comparison of the personal information shared by Libby, who uses traditional information-seeking methods, and by Netty, who relies almost exclusively on “Planet

Google” to access and organize information. Building from this narrative of Libby

and Netty’s differing informational practices and flows, we can attempt to apply

contextual integrity as a benchmark to determine if privacy violations exist.

Assessing the information practices and flows from our thought experiment with

any degree of certainty is not easy, but we can approximate the particular agents,

information types, and transmission principles that govern these flows (see Table 5). Within each context there is evidence of

shifts in each component of the governing informational norms.

Libby’s interactions are scattered among various agents, resulting in a

fragmented dispersal of personal information. Rarely would any single receiver

obtain information from multiple engagements with Libby, let alone across

contexts. Indeed, given Libby’s information practices in some contexts, there is

no agent at all in receipt of her personal information. In contrast, all of Netty’s

information-seeking activities involve Google as an agent receiving personal

information, allowing a level of consolidation not possible in Libby’s scenario.

Having one single entity act as the receiving agent across the various

contexts represents a significant shift in informational norms.

The types of information shared by Libby tend to be incomplete, scattered

verbal requests to librarians or booksellers, and the occasional transactional (but

not window-shopping) data provided to retailers. Some agents have access to

Libby’s home address or financial data because she is a subscriber or repeat

customer, but in general, the information she divulges is only a fragment of the

entire picture of her activities in each context. Netty, on the other hand, provides

Google much more complete sets of information on nearly all her interactions

with Google products and services. The information is digital, allowing for

simpler storage, processing, and sharing, and its accuracy is difficult to dispute.

Finally, the key difference in transmission principles for our two

information seekers is that Libby voluntarily divulges information when she

decides to interact directly with librarians, booksellers, and so on, while Netty is

compelled to allow Google to track and collect her browsing and

usage habits as a condition of using its products and services. Further differences

exist in terms of how these agents might share the information with other parties.

In Libby’s case, while some of the information divulged in commercial

transactions might be used for marketing purposes, the librarians she interacts

with are bound by a code of ethics, and the phone and financial companies who

receive information must adhere to strict laws protecting consumer privacy. For

Netty, in nearly all cases, use of the information by Google is dictated by its

privacy policy, which states, in part:

We may combine the information you submit under your account with
information from other Google services or third parties in order to provide
you with a better experience and to improve the quality of our services.
(Google, 2005j)

Google further states it will share personal information with third parties when,

among other reasons,

We have a good faith belief that access, use, preservation or disclosure of
such information is reasonably necessary to (a) satisfy any applicable law,
regulation, legal process or enforceable governmental request. (Google,
2005j)

To summarize, the shift from traditional information-seeking practices –

represented by Libby – to the growing reliance on Google’s quest for the perfect

search engine – typified by Netty’s information-seeking practices – represents a

potentially significant shift of the existing informational norms. While Libby’s

divulgence of personal information is scattered, informal, and voluntary, Netty’s

is concentrated with one agent (Google), digital and comprehensive, and often

required in order to use the services. Thus, a violation of the contextual integrity

of personal information flows across these various information-seeking

contexts is revealed. It is no longer acceptable to hide behind the rhetoric that no

private information is divulged when utilizing the tools that make up the perfect

search engine, or that the information shared is simply the same as that provided

in other information-seeking scenarios. Revealing this kind of transgression of

informational norms helps to expose the Faustian bargain implicit within the quest

for the perfect search engine.
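
Restating a single row of Table 5 in the terms of the earlier sketch makes this shift explicit. The encoding below is hypothetical, using the general information inquiries context:

    # Hypothetical encoding of one context from Table 5.
    libby = {
        "receiver": "various librarians and booksellers",
        "info_type": "scattered verbal requests and occasional transactions",
        "principle": "voluntary; librarian bound by a code of ethics",
    }
    netty = {
        "receiver": "Google",
        "info_type": "all queries logged in digital form",
        "principle": "automatic; governed by Google's privacy policy",
    }
    shifted = [key for key in libby if libby[key] != netty[key]]
    print(shifted)  # every component of the governing norm has shifted

The same exercise can be repeated for each of the nine contexts in Table 5, with the same result.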

A breach of the contextual integrity of information flow in a particular

context often triggers an assessment in terms of countervailing values. The

preservation of contextual integrity is meant to protect the

privacy of personal information in support of broader social, political, and moral

values. By disrupting the existing contextual integrity, a new practice or

technology might support countervailing values. Such is the case here, for the

very logic of the perfect search is increased relevancy and efficiency of users’

information-seeking activities. The question emerges whether the increased

relevancy and efficiency of users’ information-seeking activities outweigh the

values of privacy and autonomy supported by the existing informational norms.

As a decision heuristic, contextual integrity does not provide this kind of

normative assessment. Instead, it often retains a conservative bias, endorsing

“entrenched informational norms that might be deleterious even in the face of

technological means to make things better” (Nissenbaum, 2004, p. 143). A

broader examination of the social, political, and moral importance of free and

unfettered information-seeking activities is required in order to determine whether

the maintenance of the existing informational norms is truly warranted. Moving

above the contexts described in this chapter, we need to consider these

information-seeking activities as part of a larger sphere of intellectual mobility

deserving of special protection.

Table 5: Differences in informational norms within various information-seeking
contexts

General Information Inquiries
   Agent (receiver)
      Libby: Might interact with various librarians, booksellers, and other
         information sources
      Netty: Google
   Information type
      Libby: Might verbally divulge personal interests due to interactions
         with agents; booksellers might keep transaction logs
      Netty: All information queries logged in digital form
   Transmission principle
      Libby: Information divulged to librarian voluntarily; retailers might
         require certain information for purchase; librarian bound by code of
         ethics to maintain patron privacy; booksellers might use/sell
         transaction data
      Netty: Information divulged to Google automatically; Google privacy
         policy allows use to “provide a better user experience”; may share
         with third parties to “comply with legal processes”

Academic Research
   Agent (receiver)
      Libby: Might interact with research librarian
      Netty: Google
   Information type
      Libby: Might verbally divulge research interests due to interactions
         with agents
      Netty: All information queries logged in digital form
   Transmission principle
      Libby: Information divulged to librarian voluntarily; librarian bound
         by code of ethics to maintain patron privacy
      Netty: Information divulged to Google automatically; privacy policy
         applies

News and Political Information
   Agent (receiver)
      Libby: Subscription sources receive some information
      Netty: Google
   Information type
      Libby: Sources subscribed to have address and billing information
      Netty: All information queries logged in digital form
   Transmission principle
      Libby: Some information required for subscriptions; sources subscribed
         to might use/sell information for marketing purposes
      Netty: Information divulged to Google automatically; privacy policy
         (above)

Communication and Social Networking
   Agent (receiver)
      Libby: Recipients of messages; e-mail and phone providers
      Netty: Recipients of messages; Google
   Information type
      Libby: Recipients see contents of messages; e-mail and phone providers
         track usage, might scan for spam, etc.
      Netty: All message content and interactions logged in digital form by
         Google; contacts and friends lists stored in databases at Google
   Transmission principle
      Libby: Information voluntarily divulged to recipients; recipients might
         share information, but are generally bound by norms of friendship
      Netty: Information divulged to Google automatically; privacy policy

Personal Data Management
   Agent (receiver)
      Libby: Information not shared with any third party
      Netty: Google
   Information type
      Libby: N/A
      Netty: Calendar information and queries logged in digital form;
         cellphone number provided for alerts
   Transmission principle
      Libby: N/A
      Netty: Information divulged to Google automatically; privacy policy

Financial Data Management
   Agent (receiver)
      Libby: Information not shared with anyone outside of broker
      Netty: Google
   Information type
      Libby: N/A
      Netty: Portfolio information; personal information for forum
         participation
   Transmission principle
      Libby: N/A
      Netty: Information divulged to Google automatically; privacy policy

Consumer Activities
   Agent (receiver)
      Libby: Retailers receive some information for purchases
      Netty: Google
   Information type
      Libby: Purchased items can be tracked; browsing at select .com sites
         logged
      Netty: All browsing and purchase activity logged
   Transmission principle
      Libby: Retailer might use/sell transaction data
      Netty: Information divulged to Google automatically; privacy policy

Computer File Management
   Agent (receiver)
      Libby: No third party has access to files
      Netty: Google
   Information type
      Libby: N/A
      Netty: Some search terms could be logged via referrer field; all
         queries logged with “Search Across Computers” feature; encrypted
         file index stored at Google with “Search Across Computers” feature
   Transmission principle
      Libby: N/A
      Netty: Privacy policy; “Search Across Computers” files protected via
         encryption

Internet Browsing
   Agent (receiver)
      Libby: Websites visited keep server logs
      Netty: Google
   Information type
      Libby: Typical information collected by Web sites
      Netty: Bookmarks, notes, etc.; some Toolbar functions track every Web
         site visited; Web Accelerator tracks every Web site visited
   Transmission principle
      Libby: Each site’s privacy policy
      Netty: Privacy policy

CHAPTER VI

VALUES AND SPHERES OF MOBILITY

Introduction

Viewing the quest for the perfect search engine through the lens of

contextual integrity helps us to focus on the privacy and surveillance threats

represented by these emerging Web search information infrastructures. While

contextual integrity illuminates how the introduction of these new technologies

disrupts informational norms across various information-seeking contexts, the

theory does not provide the tools needed to make the normative decision whether

the existing norms are preferred over the promises of the perfect search to provide

relevant and efficient results. This chapter addresses that question, arguing that

these informational norms extend beyond the confines of a particular context of

intellectual activity into broader spheres of mobility where individuals have

historically enjoyed the ability to engage in social, cultural, and intellectual

activities free from answerability and oversight. In such a case, maintaining the

contextual integrity of personal information flows within this broader sphere is

paramount.

This chapter is concerned with mobility. One of the defining features –

and strengths – of modern society is its social, physical, and intellectual mobility.

As freedom is often defined as the ability to move (Marlowe Jr et al., 1994, p. 307), modern societies demand a high level of mobility coupled with the

fundamental assumption that individuals are granted the right to move about and

explore new physical and intellectual terrain relatively free from answerability or

intrusive oversight by governmental or private entities. As cultural historian

George Pierson has noted, “Without spatial movement, no social improvement,

either. Our work and our play, our cities and our countrysides, our taxes and our

eating habits, our pleasures and our pains, our hopes and our fears are inextricably

tied up with mobility” (1973, p. 93). Without the ability and opportunity to move,

to navigate, to inquire, and to explore, we cannot gain the sort of understanding of

our world or develop the awareness and competencies necessary for effective

participation in social, economic, cultural, and political life.

In considering the perceived importance of the concept of mobility in our

contemporary world—visible, for instance, in the pervasiveness of modern

transportation and mobile communication technologies—and how it has reshaped

the ways in which people live and work, some have stated that: “[In] spite of the

upsurge of concern with mobility in our social lives, current research perspectives

define the notion of mobility quite narrowly, exclusively in terms of humans’

independency from geographical constraints” (Kakihara & Sorensen, 2002, p. 1).

This chapter, however, argues that mobility is not just a matter of physically

traveling from place to place, but is also related to people’s social, cultural, and

intellectual movements, explorations, and possibilities. Mobility includes

intellectual inquiry and information accessibility, as much as physical movement.

Various configurations of socio-technical relationships afford various dimensions of mobility to humans interacting in our social and intellectual lives.

Therefore, this chapter suggests expanding the concept of mobility by looking at

three interrelated dimensions of human activity: physical, intellectual, and digital mobility. Physical mobility relates to movements of people in

geographical space and our ability to navigate and explore those spaces.

Intellectual mobility refers to our mental and intellectual ability to learn new

things, explore new ideas, adapt, and change our thoughts and beliefs. The term

digital mobility is introduced to describe the ability to move within and across the

digital networks of cyberspace, the unique ability to navigate both spaces and

ideas on the Internet free of restrictions or control.65 These three dimensions of

mobility have been dramatically enhanced by intensive use of information

technology in our everyday lives. In the following sections, we will isolate

particular examples of mobility in each of these realms, identifying their social,

cultural, and political importance as spheres of mobility, and drawing implications

for how interference with one’s mobility threatens the values traditionally enjoyed

within these spheres.

65 Note, however, that these three different conceptualizations of mobility share some common characteristics and are highly interdependent, which will be discussed in more detail below.

Physical Mobility: Freedom on the Roads

Physical mobility denotes the most immediate and visible aspect of

mobility. From the emergence of flagella in eukaryotes to the evolution of bipedalism in vertebrates, physical mobility represents one of the most important

breakthroughs in biological development. For hunters and collectors of any

species, mobility was the precondition for survival as their basic needs, such as

the acquisition of food or the search for a potential partner, depended on physical

movement. Beyond these primeval needs, physical mobility has also become a

precondition to satisfy all kinds of personal needs that imply movement, such as

exploration, recreation, connection, growth, and escape. The United States, for

example, is a nation formed and populated largely through mobility, such

as the immigration of Europeans, Latin Americans, and Asians, as well as the

forced dislocation of Africans. When European immigrants arrived on the eastern

shore, they began a movement westward, first to the Mississippi River and then to

the Pacific Ocean.

This movement across the frontier became a defining feature of American

history and character. As historian Frederick Jackson Turner famously argued in

1893, the migration westward had been responsible for American economic

growth, democratic institutions, and robust values:

American history has been in a large degree the history of the colonization
of the Great West. The existence of an area of free land, its continuous
recession, and the advance of American settlement westward, explain
American development. (Turner, 1921, p. 1)

“The frontier,” Turner claimed, “is the line of most rapid Americanization” (1921,

pp. 3-4). The predominance of numerous cultural traits in American society –

“that coarseness and strength combined with acuteness and acquisitiveness; that

practical inventive turn of mind, quick to find expedients; that masterful grasp of

material things…that restless, nervous energy; that dominant individualism”

(Turner, 1921, p. 37) – could all be attributed to the influence of the frontier,

fueled by a new-found sense of mobility:

All was motion and change. A restlessness was universal. Men moved, in
their single life, from Vermont to New York, from New York to Ohio,
from Ohio to Wisconsin, from Wisconsin to California, and longed for the
Hawaiian Islands. When the bark started from their fence rails, they felt
the call to change. They were conscious of the mobility of their society
and gloried in it. (Turner, 1921, pp. 354-355)

Turner warned, however, that since only isolated pockets of free and unexplored

land remained at the end of the nineteenth century, “the frontier has gone, and

with its going has closed the first period of American history” (Turner, 1921, p.

38). He feared that with the closing of the frontier, American society would lose

its safety valve, that pent-up social tensions might find no ready release.

George Pierson (1942), a frequent critic of Turner’s “frontier thesis,” has

argued that Turner’s fears were misplaced. For Pierson, as long as we can move

somewhere and somehow, it does not matter if the geographic frontier of the

American West is closed. “To live is to move,” wrote Pierson in his cultural

history of movement in American society. “Movement,” he continued, “is the

precondition to action, the breath of social animation, the quite visible yet rarely

noticed act that makes possible most of the performances of man” (1973, p. ix).

Pierson reveals, with wonderful detail, the connections between mobility and

various elements of what he describes as the “American character,” including

mobility as a means to escape persecution and hardships, to find pleasure,

recreation and adventure, as a social or personal safety valve, as an expression of

defiance against authority, and as a tool for gaining wisdom and a deeper

understanding of our world. Even if, as Turner believed, the literal frontier was

terminated in the 1890s, Pierson argued that the days of figurative, frontier-style movement were just beginning with the emergence of the automobile: “Today it is

the automobiles that breathe of romance and adventure, that speak to us of distant

lands” (Pierson, 1973, p. 125). Automobility would take over where Turner’s

Western frontier left off, as Phil Patton observed: “The automobile and its

highways froze the values of the frontier by making movement a permanent state

of mind” (1986, p. 13).

Exploration, Autonomy, and Escape on the Roads

Today it seems retrospectively inevitable that the automobile would

prosper in the United States as it brings together two fundamental aspects of

American culture: individualism and mobility. Automobility has allowed

Americans to be constantly on the go, to explore new places, meet new people,

learn new things, all at their own pace and direction. Nearly anyone with a few

dollars can hit the road and be the rugged individual unrestrained by schedules or

outside control and relish the freedom to choose where and when to stop. With

immediate access to both the landscape and the people, we can commune with

nature, with friends, or with strangers. This physical mobility on the roads has had

a profound hold on American culture for much of the past century (see, for

example, Flink, 1975; Interrante, 1983; Lewis & Goldstein, 1983; Patton, 1986;

Berger, 2001; Setright, 2003).

With the introduction of the automobile and a growing system of roads

and highways in the early twentieth century, Americans began to explore places

that a mere decade or two before had been beyond the range of the average

traveler previously dependent on the physical limitations of the horse or the

predetermined routes of the train. The documentary “To New Horizons” (General

Motors, 1940), commissioned by General Motors to document its grand

exhibition at the 1939 New York World’s Fair, told of the automobile’s role in

satisfying the urge for exploration of distant horizons and frontiers. In a montage

of roadside images, the film described how the “restless search for new

opportunities, the mystery and promise of distant horizons have always called

men forward…old horizons open the way to new horizons.” Automobility shrank

the size of the continent on whose vastness and inexhaustibility explorers had

commented since the sixteenth century, re-opening Turner’s frontier and making

these new horizons available for new levels of exploration, adventure, and escape.

The opening of Turner’s frontier via automobility reshaped numerous

social arrangements, providing countless new freedoms and opportunities (see, for

example, Flink, 1975; Lewis & Goldstein, 1983; McShane, 1994; Berger, 2001).

The automobile freed rural people from the physical and cultural isolation that

was a characteristic feature of life in the countryside. Automobile owners could

travel to large towns for shopping and selling their wares, children could be

transported to larger schools with broader curricula, and worshipers were no

longer limited to a small village church. Urban life also began a transformation as

a result of automobility.66 With cities becoming overcrowded, polluted, and

crime-ridden, urbanites increasingly sought refuge in rural (and later, suburban)

enclaves. While trains or trolleys often provided transportation between home and

work in the city, automobility freed this new breed of commuters from the rigid

schedule of public transportation and afforded new levels of personal privacy.

The incorporation of the automobile into daily life also reshaped family

relations. The traditional family dinner and subsequent neighborhood stroll faced

severe competition from “going for a spin” in the automobile, a particularly

divisive activity that did not always involve all members of the family. The

liberating effect on teenagers was especially profound, since the car offered swift

transportation to a distant town or city where teens could enjoy themselves in

relative anonymity, escaping the prying eyes of their parents and other adults

within their community (Berger, 2001, p. xxi). The automobile also brought about

a certain sexual liberty, often serving as a “bedroom on wheels.” Cars permitted

teenage couples to get much farther away from front porch swings, parlor sofas,

hovering parents, and pesky siblings than ever before (see Lewis, 1983).

As more households purchased multiple vehicles, women, especially those

responsible for keeping the home, had at their disposal a form of transportation

with levels of privacy, safety, and speed that far surpassed walking or public transportation. As Berger notes:

The automobile provided a means by which women could escape their
homebound existence without neglecting their traditional domestic
responsibilities. Their range of mobility began to approach that of men,
and the sphere of their activities expanded accordingly. Thus, they were
able to develop and take advantage of new employment opportunities
outside the home, form geographically extensive social clubs for
philanthropic or recreational pursuits, or just get away from the house or
apartment for an hour or two of reflection, shopping or culture. (Berger,
2001, p. xxi)

66 The automobile, after all, was a creation of industrial society, and cities were its initial domain; in 1910 urban residents were four times more likely than rural residents to own a car (McShane, 1994, p. 105).

While such benefits of car ownership were not available to all women,

automobility was a strong contributor to women’s increased autonomy and

liberation from the home (Scharff, 1991; Berkley, 1996).

The cultural importance of automobility as a means of escape and self-

discovery often took center stage in literary fiction. For example, the main

character in John Updike’s Rabbit, Run (1960) “runs” away from reality by taking

an extended automobile trip throughout the American South. Similarly, both John

Steinbeck’s Travels with Charley in Search of America (1980) and Jack

Kerouac’s On the Road (1955) use cross-country road trips as a means of

escaping the “mainstream,” providing a fresh glimpse of American culture and a

route towards self-discovery. In these latter examples, automobility also fulfills a

kind of Edenic promise, offering enclosure and containment, protecting occupants

from the pressures and perplexities of the external world. As Laird (1983) has

noted:

A common element in these intimate depictions of fictional drivers and
their vehicles is the notion of an escape into a “poetic” space, a secret
garden, a room of one’s own, set off from the ordinariness of things by
means of motion, memory, or encabined contemplation. (Laird, 1983, p.
248)

“The road is life,” Jack Kerouac pronounces in On the Road (1955, p. 175),

his semi-autobiographical account of numerous cross-country automobile trips

and other exploits made between 1946 and 1950 with his fellow members of the

Beat Generation, including Neal Cassady, Allen Ginsberg, and William

Burroughs. The Beats were a generation of post-World War II Americans not

unlike World War I’s own Lost Generation; however, the Beats, rather than

expatriating themselves to Europe, practiced a form of domestic expatriation,

attempting to become lost in the speed and vastness of America. As explained by

Roger Casey, “Central to their experience was the verb go…. A group on the

move, the Beats found the car essential to their circumstances” (Casey, 1997, p.

108). Indeed, Kerouac’s On the Road vividly captured this post-war

generation’s feelings of restlessness, discontent, juvenility, and, most importantly,

movement. While the Beats did not necessarily look into the future with utopian

vision, they nonetheless agreed on the road as the ideal site for both adventure and

transformation. As Casey explains:

More than anything else, On the Road is about being on the go. Many
writers before Kerouac (Steinbeck, for one) had already asserted that the
basic impulse of America is to move, to go west, young man. Kerouac
listened to his forbearers, doing just that – moving, again and again. Like
Huck Finn, Sal Paradise (Kerouac) “lit out” for the territory; he then
returned east, then lit out west again, then east again, then south, then to
Mexico, and finally back east – the ultimate restless American. Sal cannot
find Paradise because he finds the American Edenic myth just that, a myth
– there is no Shangri-la. Therefore, since paradise cannot be found in a
place, paradise must become movement itself, and the car thus became the
method of nirvanic transport to the Beats. (Casey, 1997, pp. 108-109)

The adventurous and escapist nature of automobility – its “nirvanic”

quality – was also often the subject of film. Indeed, the two technologies rose to

popularity in America simultaneously because, according to flamboyant film

director Cecil B. De Mille, they both reflected “the love of motion and speed, the

restless urge toward improvement and expansion, the kinetic energy of a young,

vigorous nation” (quoted in Hey, 1983, p. 193). Both the automobile experience

and the film experience freed their users from the static normalcy of their day-to-

day lives, permitting the consumer to select desirable settings or themes outside

their normal spheres of existence, offering an ecstatic experience potentially

devoid of depressing connections to reality (Cohan & Hark, 1997). Examples

range from The Grapes of Wrath to Easy Rider to Thelma and Louise to Natural

Born Killers. Through these so-called “road movies,” automobility became a

signifier for freedom, and their continued popularity expresses the cultural

importance of mobility as a means of escape and exploration, and a route to achieve

personal autonomy and, ultimately, liberation (North, 2006).

To summarize, from Jack Kerouac’s On the Road (1955) to Dennis

Hopper’s road movie Easy Rider, to Bruce Springsteen’s anthem Born to Run, the

road and automobility have enjoyed a privileged place in American popular

culture. Recounting the multitude of examples of how freedom of automobility

has been expressed in American life is beyond the scope of this chapter,67 but the

examples outlined above attest to how automobility acts as a crucible for

fundamental values of modern society, such as the promise of exploration and

adventure, the potential for individual autonomy and liberation from social constraints. Cultural critic Stephen Bayley has noted how the automobile is “a

curiously precise tool for calibrating cultural values” (Bayley, 1986, p. 62), and

the cultural expressions of automobility described above validate Bayley’s thesis:

automobility is a central metaphor for America itself. George Pierson summarizes

this revelation best, noting how the automobile “is now the vehicle for our

passions as well as for transportation”:

People will do almost anything rather than give up this outlet for feeling.
They simply pour their tensions, their frustrations and unfulfilled
yearnings into the automobile and they’re off. Many people drive (it is
obvious) to satisfy their longings for power; others make it their whole
recreation; still others use it as an almost perfect way of escape. It satisfies
an ancient and fundamental American urge. By simply turning a key we
can now go almost anywhere we please. (Pierson, 1973, pp. 127-128)

67 Interested readers can find more information in (Dettelbach, 1976; Hey, 1983; Laird, 1983; Lewis & Goldstein, 1983; Casey, 1997; Berger, 2001).

With the emergence of the automobile and a growing network of roads and

highways on which to travel, individualism and personal mobility found a new

means of expression, and a new “escape valve” had been discovered for the

stresses of modern society. L. J. K. Setright, the famous British motoring

journalist and author, summarizes best how automobility “enabled [a] banner of

freedom to be unfurled”:

Throughout its history, the car has been a liberator, an agent of freedom.
Throughout its history, the car has enabled people to break out of their
constraints, to attempt something they could never previously do, to
venture somewhere they could never previously go, to support ideas and
trends they could never previously endorse. (Setright, 2003, p. 186)

Threats to Freedom on the Roads

“America,” noted writer John Jerome, “is a road epic; we have even

developed a body of road art…cutting loose a path to the dream” (qtd in

Dettelbach, 1976, p. 4). As the automobile increasingly became a part of culture –

a key component of Jerome’s “path to the dream” of America – it also became an

instrument of social change, providing opportunities for greater privacy,

autonomy, and personal freedoms. The ability to move unencumbered within this

sphere of physical mobility resonates throughout American cultural products and

experiences. Indeed, automobility has carved itself a niche among other

fundamental human rights:

Although driving is in theory a privilege granted by the state, the necessity
of being able to drive in our automobile culture has in practice made
driving another inalienable right. (Flink, 1975, p. 172)

The sovereignty of the automobile that historian James Flink relishes is not,

however, without challenge. New technologies and practices frequently impinge upon

the free and unencumbered navigation of this sphere of physical mobility,

threatening the anonymity and freedom to be gained on the roads. The emergence

of networked vehicle information systems illustrates this concern best.

In response to an increase in traffic-related accidents and fatalities, the

U.S. Department of Transportation – in partnership with the private sector and

state and local governments – has supported the development of technological

solutions to this problem. A key USDOT initiative is Vehicle Infrastructure

Integration (VII), with the goal of achieving a “nationwide deployment of a

communications infrastructure on the roadways and in all production vehicles and

to enable a number of key safety and operational services that would take

advantage of this capability” (U.S. Department of Transportation, 2005).

Numerous networked vehicle systems and technologies have emerged from this

initiative. For example, surveillance cameras are increasingly installed along

highways and intersections to manage traffic flows and enforce traffic laws

(Franzier, 2002; Jordan, 2006), regional transportation authorities are

implementing electronic toll collection systems at a rapid pace (Gross, 1997;

Doyle, 1998), on-board GPS services are purchased by millions of drivers to

provide navigational and safety benefits (Fogarty, 1997; Baig, 2000), and event

data recorders have become standard features in most automobiles to record

information related to accidents (Retson, 2006). Recent advances in wireless data

communications technologies have led to the active development of a new breed

of vehicle safety applications enabling vehicle-to-vehicle and vehicle-to-

infrastructure communication (Vehicle Safety Communications Consortium, n.d.;

Horrell, 2003; Derene, 2007).

Considered together, these networked vehicle information systems rely on

the transmission, collection, and aggregation of a particular vehicle’s identity,

location, telemetry data, or other vehicle- and location-specific information. These

technologies pose unique privacy problems because the information

collected may enable the accumulation and aggregation of detailed records of a

person’s location and movements (see previous chapter and Zimmer, 2005). They

make possible the creation of detailed travel histories of a driver’s location at all

times in the past, as well as her usual travel patterns and habits. Networked

vehicle systems represent an intensification of the ability to surveil everyday

people engaging in their everyday, public activities on the roads (Zimmer, 2005),

and prompt complicated questions as their usage becomes more widespread. Key

concerns include: who owns the information about one’s driving activities, under

what conditions can law enforcement access such information, can information

collected for one purpose be used for another, and what kinds of disclosure and

informed consent are necessary to implement such systems (see, for example,

Selingo, 2001; McCullagh, 2003; Ramasastry, 2005b; Zetter, 2005a).

The impact of these new technologies on the free and unfettered

navigation within the sphere of physical mobility has not gone unnoticed. As

early as 1995, when these technical systems were in their early stages, a group of

scholars gathered to discuss the emerging privacy concerns with what were then

referred to as “intelligent vehicle highway systems” (see Santa Clara symposium

on privacy and IVHS, 1995). From those discussions, a number of

recommendations emerged: Some suggested that many privacy concerns could be

mitigated by ensuring that the data collection activities could not capture enough

data from any one vehicle to make it “singularly identifiable” (Alpert, 1995, p.

116); others called for restrictions on the retention of individually identifiable data

(Halpern, 1995); along with policy initiatives, calls were made to pay strict

attention to privacy in the technical standards-setting process (Agre, 1995). In the

ten years since this symposium’s initial warnings, scholars have continued to

investigate the implications of networked vehicle information systems on our

sphere of physical mobility (see, for example, Garfinkel, 1995; Clarke, 2000;

Bennett et al., 2003; Thompson & Kerr, 2005; Zimmer, 2005).

Overall, scholars have grown increasingly concerned that the widespread

use and rising ubiquity of networked vehicle systems may threaten the autonomy

and freedoms inherent to our culture of automobility, a concern condensed in

Jeffrey Reiman’s warning that by blindly embracing such technology, we run the

risk of “driving to the panopticon” (Reiman, 1995) of widespread surveillance and

the further erosion of the values important to American society. A complete

analysis of the full implications of networked vehicle systems is beyond the scope

of this chapter. Yet, as evidenced by the attention given to these emerging technologies

and the growing concerns of how their design threatens the anonymity and

freedoms sought via the roads, protecting the free and unencumbered navigation

within this vital sphere of physical mobility is of clear importance.

Intellectual Mobility: Intellectual Freedom and the Library

Physical mobility is only one dimension of the spheres of mobility that

form the foundation of many American cultural values. Instead of the freedom to

move along the physical highway, the notion of intellectual mobility centers on

what Marshall McLuhan has called “the highways of the mind” (1964, p. 102).

Successfully navigating these intellectual highways requires the “fuel” of

knowledge, which spurs the desire for more knowledge and more intellectual

mobility: “When information itself is the main traffic, the need for advanced

knowledge presses on the spirits of the most routine-ridden minds” (1964, pp.

102-103). Education, access to information, and the freedom of inquiry are the

central drivers along these new “highways of the mind,” enabling full mobility

within this vital intellectual sphere of our lives.

In a 1786 letter to a friend, Thomas Jefferson called for “the diffusion of

knowledge among the people. No other sure foundation can be devised for the

preservation of freedom and happiness [than] educating the common people” (qtd.

in Padover, 1952, p. 87). Jefferson was arguing that the fortunes of the then-

young democracy of the United States rested on the ability of its citizens to

understand and use information about the world around them. Jefferson was able

to champion the cause of public education himself with the founding of the

University of Virginia in 1819. The University, he wrote, “will be based on the

illimitable freedom of the human mind, to explore and to expose every subject

susceptible of its contemplation” (Jefferson, 1820a), and that at the university “we

are not afraid to follow truth wherever it may lead, nor to tolerate any error so

long as reason is left free to combat it” (Jefferson, 1820b). In these words, the

author of the Declaration of Independence described the distinctive, irreplaceable

role of the university in a free society: providing a forum for uninhibited

intellectual inquiry and debate.

The centerpiece of Jefferson’s design for the University was not a church,

as was custom, but a massive domed library, representing the centrality of

intellectual inquiry in his vision for democracy. In the spirit of Jefferson, libraries

have assumed the social role of institutions of “education for democratic living,”

with intellectual freedom forming their foundation. The American library has

been described as “the Nation’s most basic First Amendment institution,” serving

as a “primary resource for the intellectual freedom required for the preservation of

a free society and a creative culture” (Foerstel, 1991, p. viii). According to library

scholar Charles Busha, librarians believe in “library users’ rights to read, watch,

or listen to material” of their choice “without supervision or restraint from public

officials, public opinion, institutional repression, private groups, or individuals”

(Busha, 1977, p. 12). This commitment represents a core stance of the American

Library Association (ALA), which, since its inception in 1876, has become

increasingly connected with the ideology of intellectual freedom and galvanized

by a concern for the public’s right to free and unfettered access to information. In

essence, the intellectual freedom enjoyed in the context of the library represents a

sphere of intellectual mobility traditionally shielded from oversight or

answerability.

Intellectual Freedom and the Library Bill of Rights

At its 1939 annual conference in San Francisco, the ALA adopted a formal

policy statement on intellectual freedom known as the Library’s Bill of Rights.

The document began with the statement, “Today indications in many parts of the

world point to growing intolerance, suppression of free speech, and censorship

affecting the rights of minorities and individuals,” a reference to the emergence of

totalitarian states during that time (American Library Association, 2002a, p. 60).

In response to the changing political and cultural surroundings, the Bill of Rights

outlined three policy statements to ensure free and open access to public library

services. The first stated that library materials should be selected based on their

value and intrinsic interest to the community, not on the race or nationality of the authors, nor their political or religious views. The second directed that library materials

should “fairly and adequately” represent all sides of social issues. The final

statement pertained to a democratic open-use policy for library meeting rooms, so

that all community groups would have equal access (American Library

Association, 2002a, pp. 60-61). The ALA’s adoption of the Library’s Bill of

Rights marked a moment of affirmation in the history of American libraries. From

then on, the principle of intellectual freedom defined the library’s role as a forum

for uninhibited intellectual inquiry and debate, and solidified the library as a

sphere of mobility within which Jefferson’s ideal of an educated and critically-

engaged populace could be achieved.

Almost immediately after adopting the Bill of Rights, new political and

social pressures began to weigh on the intellectual freedoms outlined within the

ALA’s bold position statement. Shortly after World War II, the House Committee

on Un-American Activities began efforts to expose real and suspected

communists in government, labor unions, schools, and other social institutions,

including public libraries. In 1947, President Harry S. Truman implemented a

national loyalty program for government workers. Shortly thereafter, state

legislatures also introduced loyalty oaths to prevent the spread of communism

(Busha, 1977, pp. 41-42). This heightened atmosphere of intolerance, suspicion,

and pressure to conform came to a head with the rise of McCarthyism; between

1949 and 1953, Wisconsin Senator Joseph R. McCarthy and his supporters

persecuted almost anyone who deviated from the status quo, including

intellectuals, teachers, and librarians. During this period marked by both paranoia

and intolerance, many librarians were fired, some libraries were closed, and

countless books were either labeled un-American or simply destroyed in the name

of fighting communism (Downs & McCoy, 1984, p. 7).

These political and social developments appeared to justify the motivation

behind the 1939 Library’s Bill of Rights, and it became even more evident that the remedies stated therein were necessary to protect the free and open inquiry of

American citizens. In these early moments of the Cold War and McCarthyism, the

ALA updated the newly retitled Library Bill of Rights, highlighting that intellectual freedom was a specific “responsibility of the library service” and

recognizing the need of libraries to challenge “censorship of books urged or

practiced by volunteer arbiters of morals or political opinion” (American Library

Association, 2002a, p. 62). Through this 1948 revision, the ALA reaffirmed its

commitment to intellectual freedom, democratic values, and the public’s right to

read what it liked free from government oversight or interference.

Other revisions and rewordings of the Library Bill of Rights followed as

libraries faced continued challenges to intellectual freedom throughout the

politically and socially tumultuous years from 1939 to 1969, culminating in the

version that stands today as a strong statement expressing the rights of library

users to intellectual freedom, and the expectations that the Association places on

libraries to support those rights:

The American Library Association affirms that all libraries are forums for
information and ideas, and that the following basic policies should guide
their services.
I. Books and other library resources should be provided for the interest,
information, and enlightenment of all people of the community the library
serves. Materials should not be excluded because of the origin,
background, or views of those contributing to their creation.

II. Libraries should provide materials and information presenting all points
of view on current and historical issues. Materials should not be
proscribed or removed because of partisan or doctrinal disapproval.
III. Libraries should challenge censorship in the fulfillment of their
responsibility to provide information and enlightenment.
IV. Libraries should cooperate with all persons and groups concerned with
resisting abridgment of free expression and free access to ideas.
V. A person’s right to use a library should not be denied or abridged
because of origin, age, background, or views.
VI. Libraries which make exhibit spaces and meeting rooms available to
the public they serve should make such facilities available on an equitable
basis, regardless of the beliefs or affiliations of individuals or groups
requesting their use. (American Library Association, 2006c)

Overall, the ALA responded to threats to the library’s social role as an institution

of education and inquiry for democratic living by making intellectual freedom its

defining ideological stance (see, for example, Robbins, 1991). Through the

Library Bill of Rights and related policy and procedural stances, the ALA has

worked to ensure that citizens could enjoy libraries as a sphere of intellectual

mobility, free from undue answerability or oversight of their intellectual activities.

Privacy and the Library Bill of Rights

The Library Bill of Rights begins with the premise that everyone is entitled

to freedom of access, freedom to read texts and view images, and freedom of

thought and expression. Privacy is the bedrock foundation for an individual’s

right to freely read and to receive ideas, information, and points of view – it is a

necessary ingredient for achieving and protecting intellectual freedom. None of

these freedoms can survive in an atmosphere in which library use is monitored

and individual reading and library use patterns are made known to anyone without

permission. Only when an individual is assured that her choice of reading material

does not subject her to reprisals or punishment can the individual fully enjoy her

freedom to explore ideas, weigh arguments, and decide for herself what she

believes (see, broadly, American Library Association, 2006f).

Such assurances were put to the test when, in 1970, United States Treasury

officials approached public libraries in Atlanta, Cleveland, Milwaukee, and

elsewhere and asked to see circulation records for books on bomb making,

guerrilla warfare, and other subjects considered “dangerous” by government

authorities (American Library Association, 2002a, p. 236). In the Milwaukee

incident, for example, agents of the Bureau of Alcohol, Tobacco, and Firearms of

the Treasury Department demanded access to circulation records of books and

materials on explosives, but were initially rebuffed by the local librarians. The

agents then returned with a “letter of opinion” from the City Attorney, advising

the library that the circulation records were public records and therefore could not

be withheld from the agents. While the letter had no legal authority, and the

request was never reviewed by a judge, the local library acquiesced (American

Library Association, 2002a, p. 236). The ALA’s reaction, however, was swift; it issued an emergency advisory statement claiming that:

The efforts of the federal government to convert library circulation records into “suspect lists” constitute an unconscionable and unconstitutional invasion of the right of privacy of library patrons and, if permitted to continue, will do irreparable damage to the educational and social value of the libraries of this country. (qtd. in Foerstel, 1991, p. 6)

The ALA recommended that each library adopt a confidentiality policy, advise all

library employees that library records are not to be released except pursuant to a

court order, and resist the issuance or enforcement of such an order until a proper

showing of good cause has been made in court (Foerstel, 1991, p. 6). These

guidelines subsequently were formalized in 1971 into a new Policy on

Confidentiality of Library Records (American Library Association, 2006e). The

ALA’s position on the privacy of patron records was further solidified in 1980

with the amendment of the Code of Ethics to mandate that librarians “protect each

library user’s right to privacy and confidentiality with respect to information

sought or received and resources consulted, borrowed, or acquired” (American

Library Association, 2006a).

The question of the privacy and confidentiality of intellectual activities

within libraries arose again in 1987 when it was disclosed that the Federal Bureau

of Investigation (FBI) was engaging in a “Library Awareness Program,” a covert

counter-intelligence program in which FBI agents visited libraries and asked

librarians to be alert to the use of their collections by persons from countries

“hostile to the United States, such as the Soviet Union” and to provide the FBI

with information about these activities (McFadden, 1987). Foerstel (1991)

documents these efforts, which were largely unsuccessful due to the tremendous

outrage and resistance from those in the library profession. Shortly after the

disclosure of the Library Awareness Program, the New York Library Association

issued the following warning:

Should the citizens of this nation perceive the library and its staff as a
covert agency of government watching to record who is seeking which
bits of information, then the library will cease to be creditable as a
democratic resource for free and open inquiry. Once the people of this
country begin to fear what they read, view or make inquiry about may at

some future time be used against them or made the object of public
knowledge, then this nation will have turned away from the very most
basic principle of freedom from tyranny which inspired this union of
states. (qtd. in Foerstel, 1991, p. 43)

Academic librarians at New York University had a similar reaction:

We simply do not wish to have our readers feel that they may be under
surveillance by intelligence agents. Furthermore, we want to assure all
library users of their right to read freely and to explore ideas without
question of their motives. At New York University we believe this type of
invasion into the privacy of the American public is an unwarranted threat
to our civil liberties. (qtd. in Foerstel, 1991, p. 57)

The Intellectual Freedom Committee of the ALA prepared a statement to

be sent to then-FBI director William Sessions, outlining the ALA’s concerns

about the FBI Library Awareness Program. This letter of concern was later

formalized into an official ALA policy concerning Confidentiality of Personally

Identifiable Information about Library Users, which stated, in part:

The First Amendment’s guarantee of freedom of speech and of the press requires that the corresponding rights to hear what is spoken and read what
is written be preserved, free from fear of government intrusion,
intimidation, or reprisal. The American Library Association reaffirms its
opposition to any use of governmental prerogatives that lead to the
intimidation of individuals or groups and discourages them from
exercising the right of free expression as guaranteed by the First
Amendment to the U.S. Constitution and encourages resistance to such
abuse of governmental power. In seeking access or in the pursuit of
information, confidentiality is the primary means of providing the privacy
that will free the individual from fear of intimidation or retaliation.
(American Library Association, 2006d)

Following the creation of this vital policy statement, the ALA’s Intellectual

Freedom Committee kept this privacy problem in the consciousness of the library

profession. It drafted and pressed for adoption of a statement on governmental

intimidation and published tools and procedures for implementing policies on

confidentiality of records at local libraries (Kennedy, 1989). The confrontation

between the ALA and the FBI over the Library Awareness Program once again

brought to light the inextricability of the privacy of patron records, intellectual

freedom, and the fundamental values that underpin our democratic society. Yet,

the FBI has never publicly abandoned the Library Awareness Program, and the

ALA suspects the program may still be in operation (American Library Association, 2002a,

p. 12).

The simmering tensions between the FBI and the ALA returned to a boil

with the passage of the USA PATRIOT Act in the aftermath of the September 11, 2001 terrorist attacks on New York City and Washington, D.C.68 This controversial act was quickly signed into law on October 26, 2001, only weeks after the attacks. The Act broadly expanded law enforcement’s surveillance and investigative powers, amending more than 15 different statutes for the stated purposes of updating wiretap and surveillance laws for the Internet age, addressing both real-time and stored communications (e-mail, voice mail, etc.), and giving law enforcement greater authority to conduct searches of property, all in order to fight and prevent terrorism in the United States and abroad (see United States Department of Justice, 2006).

While almost universally supported in Congress, the USA PATRIOT Act

has faced significant criticism since its passage, particularly in relation to its

68. The USA PATRIOT Act (Public Law 107-56, 115 Stat. 272, H.R. 3162) stands for the Uniting and Strengthening America by Providing Appropriate Tools Required to Intercept and Obstruct Terrorism Act of 2001.

impact on privacy and civil liberties (see, for example, Chang, 2001; Lardner,

2001; Olsen, 2001; Purdy, 2001). One controversial component of the Act was

Section 215, which amends sections of the Foreign Intelligence Surveillance Act

(FISA) to make it easier for a federal agent to obtain a search warrant for “any

tangible things (including books, records, papers, documents, and other items)”

(p. 38). Under the revised provisions, a federal agent need not demonstrate

probable cause to obtain a warrant. Instead, she can merely assert that the records may be relevant to an ongoing terrorism or intelligence investigation, a much lower legal threshold. On its face, the section does not directly

refer to libraries, but rather to business records and other “tangible” items in

general. Its scope, however, has been widely interpreted to include library patron

records among the “tangible things” accessible by law enforcement without

probable cause. When asked about Section 215 by the House Judiciary

Committee, the U.S. Department of Justice acknowledged, “such an order could

conceivably be served on a public library, bookstore, or newspaper” (Doyle, 2003;

emphasis added).

Library professionals were quick to respond to this potential new threat to

the privacy and confidentiality of patron records. By April 2002, the ALA Office

of Intellectual Freedom had conducted an extensive evaluation and assessment of

the implications of the USA PATRIOT Act and published The USA Patriot Act in

the Library: Analysis of the USA Patriot Act Related to Libraries (American

Library Association Office for Intellectual Freedom, 2004). Two months later, in

further response to the USA PATRIOT Act, the full ALA approved a new policy

statement on Privacy: An Interpretation of the Library Bill of Rights (American

Library Association, 2002b), confirming that:

In a library (physical or virtual), the right to privacy is the right to open inquiry without having the subject of one’s interest examined or
scrutinized by others. Confidentiality exists when a library is in possession
of personally identifiable information about users and keeps that
information private on their behalf. (American Library Association,
2002b)

By January 2003, the ALA codified a threefold response to the Act in

its Resolution on the USA Patriot Act and Related Measures That Infringe on the

Rights of Library Users (American Library Association, 2003). First, the

resolution called for education within libraries about how to comply with the Act

and also about the inherent dangers to intellectual freedom. It further advised that

libraries “adopt and implement patron privacy and record retention policies” to

collect only information that is necessary for the library’s work. Second, the

resolution bound the ALA to work with other like-minded organizations “to

protect the rights of inquiry and free expression.” Third, it committed the ALA

“to obtain and publicize information about the surveillance of libraries and library

users by law enforcement agencies” (American Library Association, 2003). Along

with the ALA’s formal responses and recommendations, individual librarians and

libraries took their own action to protect patron privacy and confidentiality,

including destroying records of what patrons had borrowed, scrapping plans to

use new computer technology to profile the reading habits of patrons and inform

them when works they enjoy are published, destroying Internet access logs on a

daily basis, posting warning signs, and offering patron education on privacy

issues (Murphy, 2003; Sanchez, 2003).

To summarize, concern over the privacy and confidentiality of library

patron records has persisted for over 60 years, aggravated by escalating

government attempts to gain access to such records. Whether in the face of

McCarthyism, the FBI Library Awareness Program, or the war against terror, librarians

have fought to ensure the democratic ideal of intellectual freedom survives such

challenges to the privacy and confidentiality of patrons’ information-seeking

activities. Louise Robbins, a historian of ALA policy responses to threats to

intellectual freedom, has argued that, by granting librarians both the responsibility

and the tools to defend the right of readers to freedom of inquiry, the Library Bill

of Rights and related ALA policies established a “zone of autonomy” for

librarians to perform their duties (1991, p. 360). Such a zone of autonomy

inevitably extends to the library patrons as well, forming what this chapter calls a

sphere of intellectual mobility, where, like the freedoms enjoyed in the physical

sphere of automobility, citizens must be free to read, inquire, and learn without

undue answerability and oversight.

Digital Mobility: Autonomy, Privacy, and Digital Rights Management

As discussed above, libraries have fought – and continue to fight – to

provide a sphere of intellectual mobility, a space for individuals to read, inquire,

and learn free from external surveillance, judgment, control, or attribution of

motives. As more and more of our intellectual and information-seeking activities

shift from the physical sphere of the library to the digitally-networked sphere of

the Internet and the World Wide Web, a new sphere of digital mobility has

emerged. And just as the physical and intellectual spheres of mobility described

above are frequently confronted with new technologies and practices that threaten

the freedoms enjoyed within their purview, this new sphere is also confronted

with potential constraints to full freedom of mobility in the digital realm.

New digital computing and network technologies have led to a surge in the

use of the Internet for the creation, distribution, and consumption of a variety

media and information products. Rather than relying on physical libraries,

bookstores, or newsstands, individuals can now perform informational inquiries

via the World Wide Web, browse online versions of newspapers and magazines,

purchase and read electronic books, and play music and video files via the

Internet with relative ease. However, this new form of digital access to

informational and cultural works presents a number of challenges for the creators

and distributors of these media products. Although such works are often protected

under existing copyright laws, technology has dramatically increased the

difficulty of enforcing the rights of the copyright holder, and at the same time has

presented a challenge to the legitimacy of the continuation of those rights (see, for

example, Lessig, 1999; Vaidhyanathan, 2001). Once a work is published in digital

form, it can potentially be copied and distributed widely without the permission of

the owner and possibly in violation of their legal rights. Thus, a digital dilemma

emerges: the very technologies that open new avenues for the consumption and

distribution of informational goods enable potentially unauthorized use and

duplication of copyright-protected content.

In response to this dilemma, owners of copyright-protected works have

deployed an array of so-called digital rights management (DRM) technologies,

which aim to impose technical prohibitions on the unauthorized use and

duplication of digital content by monitoring and/or regulating access to that

content.69 Definitions of DRM vary, but this version from a recent conference of

science and technology scholars provides a good summary:

[DRM] means the chain of hardware and software services and technologies governing the authorized use of digital content and
management of any consequences of that use throughout the entire life
cycle of the content. DRM is an access and copy control system for digital
content, such that the DRM securely conveys and enforces complex usage
rights rather than simple low-level access/copy controls. …DRM
technologies include a range of functions to support the management of
intellectual property for digital resources, such as expression of rights
offers and agreements, description, identification, trading, protection,
monitoring and tracking of digital content. (qtd. in Cameron, 2004, pp.
298-299)

Following this definition, we can isolate three levels of control that DRM

technologies provide copyright owners: active monitoring, access control, and

usage control.

Some of the earliest DRM systems were designed to monitor the

frequency of use of a media file in order to charge consumers based on the

69. Legislative solutions have also been supported by copyright holders, such as the Digital Millennium Copyright Act (H.R. 2281), which heightened the penalties for copyright infringement on the Internet and criminalized the production and dissemination of technology whose primary purpose is to circumvent measures taken to protect copyright.

number of times the file was read or played. Julie Cohen describes how the

monitoring function of a DRM system might operate:

For example, if I purchase a collection of essays online, the copyright owner can charge me for the file containing the essays, generate a record
of my identity and what I purchased, and insert pieces of microcode into
the file that will: (1) notify the copyright owner every time I “open” one
of the essays and specify which one I opened; (2) notify me when I must
remit additional fees to the copyright owner — this much to browse the
essay, this much to print it out, this much to extract an excerpt, and so on;
and (3) prevent me from opening, printing, or excerpting the piece until I
have paid. (Cohen, 1996, pp. 983-984)

DRM monitoring and metering systems date as far back as 1995 when IBM

released Cryptolope (a portmanteau of “cryptographic envelope”), a document

protection software platform in which any attempt to open or use a protected file

was first routed through a centralized clearinghouse for tracking purposes (Evans,

1996). While Cryptolope has since been abandoned, monitoring systems remain

in heavy use by media content providers to help track usage, such as Microsoft’s

Windows Media DRM (Microsoft Corporation, 2005) and RealNetworks’

Rhapsody DNA (RealNetworks, 2006).
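To make this monitoring function concrete, consider the following minimal sketch, written in Python; every name in it is hypothetical and purely illustrative, modeling the general pattern Cohen describes rather than any actual DRM product:

# A hypothetical sketch of a DRM metering wrapper (illustrative only).
class Clearinghouse:
    def report(self, user, title, action):
        # In a deployed system, this would transmit usage data over the
        # network to the copyright owner's tracking service.
        print(f"REPORT: {user} performed '{action}' on '{title}'")

class User:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance

    def pay(self, amount):
        if self.balance < amount:
            return False
        self.balance -= amount
        return True

class MeteredEssay:
    def __init__(self, title, fee, clearinghouse):
        self.title, self.fee, self.clearinghouse = title, fee, clearinghouse

    def open(self, user):
        # Prevent access until the per-view fee is remitted.
        if not user.pay(self.fee):
            raise PermissionError("payment required before viewing")
        # Notify the owner which essay was opened, and by whom.
        self.clearinghouse.report(user.name, self.title, "open")
        return "...decrypted essay text..."

essay = MeteredEssay("Collected Essays", fee=0.25, clearinghouse=Clearinghouse())
print(essay.open(User("reader", balance=1.00)))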

The third element of Cohen’s description of a typical monitoring system –

the prevention of opening, printing, or excerpting content – characterizes the remaining two levels of control typical of most contemporary DRM systems:

access and usage control. Access controls attempt to restrict or limit users’ ability

to view or listen to copyright-protected materials. Examples of access control

techniques include encryption algorithms that prohibit people without the required

decryption key from accessing the encrypted content. The key, which is provided

only to users who have paid for the content, is typically found inside software

packages accessible only after they have been purchased and opened. Other

access control methods limit the extent of otherwise proper use of copyright-protected content, such as e-books that expire a certain period after purchase or music files that can be played only a finite number of times.
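A comparably minimal sketch can illustrate these two access-control techniques, key-gated decryption and time-limited access; the XOR “cipher” below is a deliberately trivial stand-in for a real encryption algorithm, and all names are hypothetical:

from datetime import date

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Trivial stand-in for a real encryption algorithm.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class ProtectedEbook:
    def __init__(self, plaintext: bytes, key: bytes, expires: date):
        self.ciphertext = xor_cipher(plaintext, key)  # stored only encrypted
        self.expires = expires

    def read(self, key: bytes, today: date) -> bytes:
        # Time-limited access: the purchased copy "expires".
        if today > self.expires:
            raise PermissionError("license expired")
        # Key-gated access: without the purchased key, decryption
        # yields only gibberish.
        return xor_cipher(self.ciphertext, key)

book = ProtectedEbook(b"Chapter 1 ...", key=b"purchased-key",
                      expires=date(2007, 12, 31))
print(book.read(b"purchased-key", today=date(2007, 6, 1)))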

DRM technologies that include usage control features go beyond

restricting normal access to the content and instead inhibit users’ ability to print,

copy, download, upload, perform, distribute, modify, or otherwise manipulate

digital materials. Examples include DRM implementations that prevent playback

of audio CDs on home computers in order to deter copying of the content, or

DRM controls embedded in PDF documents to prevent editing, copying, or even

printing the files. Many DRM systems combine both access and usage controls,

such as the Content Scrambling System (CSS) employed on DVDs. CSS uses a

simple encryption algorithm to scramble the DVD’s content, which only licensed

DVD players can decrypt to enable access. To obtain the necessary decryption

keys, device manufacturers are required to sign a license agreement restricting the

inclusion of certain features in their players, such as a digital output that could be

used to extract a high-quality digital copy of the movie (Wikipedia contributors,

2007a). With the design of CSS, the existence of an access control system allows

enforcement of a strict usage control regime.
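Usage control can be sketched in the same spirit as a record of usage rights that a compliant player consults before every operation; this is a hypothetical illustration of the general principle, not a model of the actual CSS mechanism:

# Hypothetical usage-rights record enforced by a compliant player.
RIGHTS = {"play": True, "copy": False, "print": False, "digital_output": False}

def request(action: str) -> None:
    # The player permits only actions the rights record allows in advance;
    # prohibited actions are not merely discouraged but rendered impossible.
    if not RIGHTS.get(action, False):
        raise PermissionError(f"'{action}' is blocked by the content's usage rules")
    print(f"'{action}' permitted")

request("play")          # allowed by the rights record
try:
    request("copy")      # blocked: usage control
except PermissionError as err:
    print(err)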

Some DRM systems combine all three levels of control, such as Sony’s

surreptitious distribution of rootkit70 software on audio compact discs in late

2005. This infamous DRM system limited a user’s ability to access and play the

CDs on their home computers (access control), restricted the ability to copy or

share songs from the CD (usage control), and also secretly communicated with

Sony over the Internet when listeners played the discs, transmitting the name of

the CD being played along with the IP address of the listener’s computer (active

monitoring) (Bray, 2005). In the face of significant criticism, Sony halted

distribution of this DRM scheme and provided users the ability to uninstall the

DRM software from infected computers (Graham, 2005).

As the reaction to Sony’s actions makes evident, the design and use of DRM technologies have sparked considerable criticism and debate regarding DRM’s

impact on users, creativity, competition, and law (see, for example, Felten, 2003;

Samuelson, 2003; Becker et al., 2004; Lessig, 2004; Petrick, 2004;

Vaidhyanathan, 2004). A full discussion of this larger debate is beyond the scope

of this chapter, but a critical concern has emerged from the various social,

cultural, and legal explorations of DRM systems: As information-seeking and

intellectual activities extend from the physical library into the new digital realm,

the presence of DRM technologies within digital materials poses as much of a threat

to individuals’ privacy and intellectual freedom as the presence of the FBI within

the public library.

70. A rootkit is a set of software tools intended to conceal running
processes, files, or system data from the operating system, and often from the user
herself.

DRM, Digital Mobility, and Privacy

DRM technologies constrain an individual’s digital mobility in two

fundamental ways. The first involves what legal scholar Julie Cohen describes as

“direct restrictions on what individuals can do in the privacy of their own homes

with copies of works they’ve paid for” (2003a, p. 47). The primary architecture of

DRM systems constrains user agency and autonomy, limiting the scope for users

to choose how to behave (Cohen, 1996; Burk & Gillespie, 2006). DRM allows

only those actions determined in advance, embedding the rules of access and use

into the very tools with which the information is to be used, rendering prohibited

actions all but impossible. Herein lies a key distinction from traditional

applications of copyright law. Whereas legal prohibition via copyright law leaves

discretion over their behavior in the hands of the users, allowing them to

determine whether to risk activity that might result in legal penalties, DRM

forecloses such discretion, allowing only those actions determined in advance by

the information producer. DRM technologies restrict the choice of the individual

by fundamentally shifting the moment in which the use of information is

regulated; they work on the “principle of preemption” (Burk & Gillespie, 2006, p.

241), with user autonomy as a key casualty.

Besides direct control over user actions, Cohen (1996) argues that even the

simplest monitoring features of copyright management technologies are likely to

“chill” certain types of reading, as consumers will be aware and perhaps

apprehensive that their choices are being remotely observed and recorded. Even

when a user might not mind others knowing that they accessed or read certain

content, the user might not want others to know that they had to read it 20 times,

that they highlighted parts of it, that they wrote notes in the margin, that they

copied part of it, that they forwarded certain excerpts to their friends with

comments, and so on. For many users, knowledge that these or similar kinds of

information would be gathered about them would naturally affect the types of

content they choose to access and use, as well as how they go about it. Cohen is

particularly troubled by this invasion and argues that intellectual exploration is

“one of the most personal and private of activities” and that given its monitoring

capabilities, DRM will “create records of behavior within private spaces, spaces

within which one might reasonably expect that one’s behavior is not subject to

observation” (Cohen, 2003b, pp. 584-585).

In addition to restricting one’s intellectual freedom and autonomy, DRM

technologies impact an individual’s intellectual mobility in a second critical way:

Due to their monitoring capabilities, DRM technologies pose an enormous threat

to the privacy of individual reading, viewing, and listening habits (Feigenbaum et

al., 2001; Cohen, 2003b; Mulligan et al., 2003; Electronic Privacy Information

Center, 2004a). DRM systems, and their associated authentication and

authorization systems, carry the potential of generating, transmitting, and storing

vast quantities of data about the use of copyrighted works. This data could reveal

a great deal about the manner in which individuals explore copyrighted works.

DRM thus presents the potential for a level of usage monitoring that is

unprecedented in the use of informational goods. Mulligan and her colleagues

summarize the privacy threats of typical DRM architectures:

By gathering data from consumers incidental to DRM transactions, businesses interfere with the privacy norms and expectations regarding the
post-purchase use of content and derive benefit that is not reciprocated.
The DRM systems we examined engage in detailed surveillance of content
consumption by consumers within private spaces. In most instances the
systems monitor the content used, the time of use, the frequency of use,
and the location of use. The services both limit what consumers can do in
the confines of their own home, or the equivalent, and create detailed
reports about use of digital works. (Mulligan et al., 2003, p. 11)

To summarize, as the acts of reading books or listening to music in the

physical sphere of the privacy of one’s home or the protected halls of the library

shift to digital spheres of online information access and distribution, these

previously anonymous and autonomous activities increasingly become entangled

in the collection of personal information about an individual’s intellectual

interests and habits by DRM technologies. The effects of DRM technologies can

be characterized as a subtle but pernicious curb on individual choice and

autonomy as they relate to the ability to select, use, and benefit from intellectual

and informational goods. As Cohen summarizes, the technological constraints

built into DRM occur in the context of cultural and informational content basic to

human flourishing:

Technologies that constrain user behavior narrow the zone of freedom traditionally enjoyed for activities in private spaces, and in particular for
activities relating to intellectual consumption within those spaces. In so
doing, they decrease the level of autonomy that users enjoy with respect to
the terms of use and enjoyment of intellectual goods. (Cohen, 2003b, p.
580)

The broader debate over the legitimacy and efficacy of digital rights management

technologies cannot be resolved in this chapter. But the very existence and

potency of the debate point to the fact that individuals anticipate that the

intellectual freedoms historically enjoyed when using content in private homes or

public libraries will extend into the digital sphere as well. Whether at the library

or online, individuals must be able to enjoy a sphere of mobility where

information-seeking and intellectual activities can occur free from oversight and

control.

Convergence of Mobilities

This chapter has presented a broad conceptualization of spheres of

mobility where individuals historically have enjoyed the freedom to engage in

social, cultural, and intellectual activities free from answerability and oversight.

The physical mobility afforded by the automobile represents some of the fundamental values and aspirations that define American culture, such as the promise of exploration and adventure and the potential for individual autonomy and

liberation from social constraints. The intellectual mobility preserved and

protected by free access to libraries fulfills Jefferson’s vision to enhance “the

illimitable freedom of the human mind” and to provide citizens the ability “to

explore and to expose every subject susceptible of its contemplation” (Jefferson,

1820a). And the new sphere of digital mobility frees our intellectual curiosities

from the physical confines of the library, providing new freedom to move within

and across the digital networks of cyberspace to explore places and ideas

previously beyond reach.

Beyond sharing a common theme of mobility, these three spheres are

uniquely interconnected. The automobile, while obviously enhancing individuals’

physical mobility, has also frequently been cited for its supporting role in fostering

new levels of intellectual mobility. Recalling L. J. K. Setright’s reflections on the

automobile as a “liberator” and “agent of freedom”:

Throughout its history, the car has enabled people to break out of their
constraints, to attempt something they could never previously do, to
venture somewhere they could never previously go, to support ideas and
trends they could never previously endorse. (Setright, 2003, p. 186, emphasis added)

An essential power of automobility is its ability to be both a means of escape and

a means to gain knowledge, to “support ideas and trends” previously

undiscovered. In this way, the road served not only as an exit from one’s own life,

but also as an entrance into the experiences and interactions of the people, places,

and ideas encountered along its path. City dwellers could learn of rural culture

firsthand, while those residing in the countryside could drive to the cities and be

exposed to their cosmopolitan character. Automobility enabled all to experience

the beauty and mystery of the country’s diverse natural resources, to travel and

learn about their family’s roots, or to retrace the history of the nation.

The physical mobility enabled by the growing car culture also allowed

easier access to education, including the ability for individuals to leave their

familiar surroundings in order to be educated in a different part of the country,

with different sets of norms and values. Further, the intellectual freedoms enjoyed

in the nation’s public libraries were bolstered by the newfound ability to drive to

local libraries, as well as travel to larger and more diverse libraries in neighboring

areas. Libraries themselves also embraced automobility with the development of

the “traveling library” and “bookmobile,” bringing a new kind of mobility to

intellectual activities. In short, automobility made possible an array of new

experiences, new insights, and new knowledge, fueling McLuhan’s “highways of

the mind” and enabling not only physical mobility, but also new levels of

intellectual mobility.

Modern America can be defined by the “mobility of its people and their

information” (Dunlap, 2002, p. 2187). The spheres of physical and intellectual

mobility are perpetually intertwined. With the growing importance of the Internet

in modern life, digital mobility has emerged as a vital proxy for both physical and

intellectual mobilities. Commonly referred to as an “information superhighway,”

the Internet – a necessary component of digital mobility – weds the notions of physical and intellectual mobility. Indeed, links

between the digital mobility provided by the Internet and the physical and

intellectual mobility attained via the automobile and highway system abound.

First, broadly speaking, both the highways and the Internet act as communication

media for the transmission of information. Raymond Williams notes an early

meaning of the term “communication” was to “make common to many, impart,”

and that “lines of communication” included not only the telegraph or telephone,

but also “roads, canals and railways” (Williams, 1983, p. 72). The highways, then,

represent vital routes of communication, physically connecting people and places,

imparting goods and information (be they physical mail or commercial products) across

space, in much the same way that the telegraph imparted messages across the

wires. The Internet, often considered today’s equivalent of the telegraph

(Standage, 1998), has become a dominant communication medium for imparting

various information and services, such as electronic mail, file sharing, and the

interlinked pages of the World Wide Web that facilitate modern-day sociability,

commerce, and entertainment.

Second, both the Internet and the highway systems are methods of

personal transportation. The highways, as described above, provide a means of

escaping the banality of our day-to-day activities; we can be transported to far-off

places or unexplored regions full of excitement and liberation. Similarly, a

vaunted benefit of the Internet is its ability to transport us to a seemingly infinite

variety of “places” online, providing the ability to escape and explore beyond the

physical limitations of the highway. And a third linkage between the highways

and the Internet is their shared history. Both the interstate and information

superhighways were first conceived during the late 1950s and early 1960s as

responses, in part, to the supposed technological and nuclear threat of the Soviet

Union. A key motivation for the design and construction of the interstate highway

system was to enable military and civil defense operations, including troop

movements and the emergency evacuation of cities in the event of nuclear war. In

response to the Soviet launching of the Sputnik satellites, the Advanced Research

Projects Agency (ARPA) was formed within the Department of Defense to

reestablish an American lead in science and technology. An advanced data

communications network (ARPANET) soon emerged from the collected minds of

ARPA, which eventually became today’s Internet (see Hafner & Lyon, 1996).71

The highway system and the Internet share a fourth similarity: their

distributed form. A distributed network has no centralized hubs; each node is

connected to several of its neighboring nodes in a lattice-like configuration. As a

result, each node has several possible routes through which to send data: if one

node or neighboring route is destroyed, various alternative paths are available.
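This route redundancy is easy to illustrate in code. In the small hypothetical mesh below, destroying a single intermediate node still leaves an alternative path between the endpoints; the sketch models only the general topology, not the actual Internet or highway network:

from collections import deque

# A small mesh: each node links to several neighbors, with no central hub.
mesh = {"A": {"B", "C"}, "B": {"A", "C", "D"},
        "C": {"A", "B", "D"}, "D": {"B", "C"}}

def path_exists(graph, start, goal, removed=frozenset()):
    # Breadth-first search that simply ignores any destroyed nodes.
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph[node] - removed - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

print(path_exists(mesh, "A", "D"))                 # True: e.g., A-B-D
print(path_exists(mesh, "A", "D", removed={"B"}))  # True: A-C-D survives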

The Internet, a worldwide system of interconnected computer networks utilizing

packet-switching protocols for the movement of information, follows the

distributed network model.72 The highway system also represents a distributed

network because it “lacks any centralized hubs and offers direct linkages from

71. The belief persists that a primary motivation for the development of the
distributed ARPANET was so that military information and communication
networks could withstand a nuclear attack. According to some of the original
designers of the network, this is only partially true:
It was from the RAND study that the false rumor started claiming that the
ARPANET was somehow related to building a network resistant to nuclear
war. This was never true of the ARPANET, only the unrelated RAND study
on secure voice considered nuclear war. However, the later work on
Internetting did emphasize robustness and survivability, including the
capability to withstand losses of large portions of the underlying networks.
(Leiner et al., 2003 at note 5)
72. It should be noted that many of the systems used to enable Internet
usage do not follow a purely distributed form. The Domain Name System (DNS),
which translates alphabetic Web addresses (www.michaelzimmer.org) into
numeric network addresses (70.103.189.67), is a decentralized and hierarchical
system through which nearly all Web browsing traffic must flow. As Galloway
(2004) notes, “ironically, then, nearly all Web traffic must submit to a hierarchical
structure (DNS) to gain access to the anarchic and radically horizontal structure of
the Internet” (p. 9).

city to city through a variety of highway combinations” (Galloway, 2004, p. 35).73

Given their mesh-like topology, traffic across these two networks – the physical

highway and the digital Internet – are often random and unrepeatable, fostering a

certain level of autonomy and freedom for their users. These various linkages

between physical, intellectual, and digital mobilities are repeated in cultural

historian George Pierson’s statement regarding the critical role of mobility in our

lives:

Without spatial movement, no social improvement, either. Our work and our play, our cities and our countrysides, our taxes and our eating habits, our pleasures and our pains, our hopes and our fears are inextricably tied up with mobility. (1973, p. 93)

Summary

This chapter has introduced the notion of spheres of mobility, in which

individuals have historically enjoyed the ability to engage in social, cultural, and

intellectual activities free from answerability and oversight. Within these spheres,

individuals enjoy the presumption of liberty and autonomy, of un-answerability,

self-determination, and self-definition. Individuals create, discover, and enjoy

spaces for personal growth, exploration, and escape within these spheres of

mobility, whether experienced on the open roads, in public libraries, or online.

Web search engines, as the center of gravity of information-seeking activities,

represent the latest addition to these spheres of mobility, providing an interface to

new worlds of information, new spaces for communication, and new means of

73. Likewise, while designed upon the principle of a distributed network, the highway system does not, in practice, achieve a fully distributed form.

experiencing the world. In many ways, the physical, intellectual, and digital

mobilities described above converge in Web search engines, providing new

means of physical escape, intellectual exploration, and digital freedom.

The previous chapter demonstrated how the quest for the perfect search

engine results in the transgression of informational norms, violating the

contextual integrity of the privacy of personal information. While such an analysis

exposed a Faustian bargain implicit within the quest for the perfect search engine,

the theory of contextual integrity did not provide a means of assessing the

normative consequences of such a breach – whether the efficiencies gained

through the perfect search outweigh any potential harm. This chapter has argued,

however, that the stakes for violating contextual integrity in the perfect search

engine are much greater than simply allowing Google to collect search queries or

scan e-mail messages to deliver relevant advertising. Whether on the highways, in

the library, or on the Internet, spheres of mobility provide the means to break

down barriers, expand our horizons, offer new insights, and lead us into new

directions. By violating the information norms within our spheres of mobility, the

quest for the perfect search engine threatens our ability to navigate, to inquire, and

to explore. It inhibits our ability to develop the awareness and competencies

necessary for effective participation in social, economic, cultural, and political

life, and impedes our enjoyment of the freedoms fundamental to our spheres of

mobility.


CHAPTER VII

CONCLUSION: RENEGOTIATING THE FAUSTIAN BARGAIN

Shortly after reports emerged that the Web search engine Google had

resisted a U.S. Department of Justice subpoena demanding disclosure of two full

months’ worth of search queries that Google had received from its users, The

Washington Post’s op-ed page published this fictional correspondence between a

DOJ attorney (Mr. Tutley) and an attorney at Google (Mr. Miller):

Mr. Tutley,
The DOJ should not be in the business of threatening American
enterprises, particularly one as beloved as Google, the people's search
engine. Especially when taxpayers pay your salaries -- salaries that, at
least in your four-person office, are apparently going largely to subsidize
(according to our tracking cookies) a minimum of four hours daily of eBay
auctions (James Llewellyn and Carol Santana, both GS-14) and
SpongeBob Collapse (Martha Stanhope, GS-13).
David Miller, Google

David Miller,
Twice last month your VW Passat went through the Ninth Street
Toll Plaza -- 11:04 p.m. on 12/26 and then, in the opposite direction, at
4:31 a.m., 12/27 -- with a fetching young passenger whose retinal scan
does not appear to match that of Donna Weinstein-Miller, your wife of
seven years and mother of your twins, Emma and Jedediah.
Turn over the records.
B. Tutley, DOJ

Mr. Tutley,
For a man whose Google searches the past six months have
included the terms personal bankruptcy, bankruptcy lawyer, Chapter 11
and Azerbaijan hotties (btw, the “and” is superfluous), you're in no
position to muscle us, or to demand a peek at the personal information of
others.
D. Miller
(excerpted from Postman, 2006)

This satirical exchange provides a glimpse of how technologies – in this

tale, networked vehicle information systems and Web search engines – bear on

the ability to move, navigate, inquire, and explore within our spheres of mobility.

This dissertation has argued that the quest for the perfect search engine presents a

Faustian bargain: The perfect search promises new breadth, depth, efficiency, and

relevancy of online information-seeking activities, but also enables the

widespread surveillance and capture of users’ online personal and intellectual

activities. As Neil Postman warned, we are in danger of succumbing to the

promises made by the perfect search, while ignoring the ways that privacy,

freedom, and autonomy are threatened by its existence.

Proponents of the perfect search have succeeded in obscuring its value and

ethical implications by focusing attention on bold claims of newfound efficiency

or utility, and by presenting arguments that no real threat to privacy actually

exists, as the information shared in the perfect search is not personally identifiable

and often is already shared with other entities in other circumstances. In order for

us to “make potentially morally controversial computer features and practices

visible” (Brey, 2000, p. 13), this dissertation utilized the theory of “contextual integrity” (Nissenbaum, 1998, 2004) to provide clarity to the privacy implications of the quest for the perfect search engine, revealing how Google’s goal of creating the perfect search is altering personal information flows in ways that threaten existing informational norms.

The emergence of the perfect search engine, then, forces us to examine its

impact in terms of the freedoms enjoyed in our broader spheres of mobility.

Whether on the highways, in the library, or on the Internet, without the ability and

opportunity to move, to navigate, to inquire, and to explore, we cannot gain the

sort of understanding of our world and develop the awareness and competencies

necessary for effective participation in social, economic, cultural, and political

life. More than just disrupting the contextual integrity of informational norms within

particular information-seeking contexts, the quest for the perfect search engine

represents the latest – and perhaps the most potent – threat to the freedoms

traditionally enjoyed in our spheres of mobility.

Lured by a tantalizing collection of innovative, user-friendly, and indeed

useful tools, consumers increasingly relish becoming citizens of “Planet Google,”

despite a growing awareness of Google’s predilection to capture as much data

about its users as possible. A recent New York Times article profiling a citizen of

this brave new world reveals this paradox:

As Dan Firger, a law student at New York University, strolls from class to
class during the course of his day or pauses for a breather in Washington
Square Park, his cellphone is routinely buzzing inside his messenger bag.
He can often guess who it is: Google. Six to eight times a day text
messages pop up, courtesy of Google Calendar, a free daily organizer
introduced this year. The program can scan appointments and send
reminders of coming events. Google is everywhere in Mr. Firger’s life. He
scours the Web with its search engine; he chats with friends in Bolivia
using Google Talk; and he receives e-mail messages on a Google Gmail

account. “I find myself getting sucked down the Google wormhole,” Mr.
Firger said with equal parts resentment and admiration. “It’s all part of
Google’s benign dictatorship of your life.”
…Mr. Firger, the law student, acknowledged feeling a “weird
tension” about his love of Google’s products and his fear about its
omnipresence in his life. “I don’t know if I want all my personal
information saved on this massive server in Mountain View, but it is so
much of an improvement on how life was before, I can’t help it,” he said.
(Williams, 2006)

In its quest for the perfect search engine, Google has constructed an

alluring information-seeking environment, whereby individuals are also integrated

into an infrastructure for the capture of personal information. Greg Elmer warns

that such an environment, where the collection of personal information is a

prerequisite to participation, inevitably entrenches power in the hands of the

technology designers:

Ultimately, what both requesting and requiring personal information highlight is the centrality of producing, updating, and deploying consumer
profiles – simulations or pictures of consumer likes, dislikes, and
behaviors that are automated within the process of consuming goods,
services, or media and that increasingly anticipate our future needs and
wants based on our aggregated past choices and behaviors. And although
Foucault warns of the self-disciplinary model of punishment in panoptic
surveillance, computer profiling, conversely, oscillates between seemingly
rewarding participation and punishing attempts to elect not to divulge
personal information. (Elmer, 2004, pp. 5-6)

This blurring of punishments and rewards – subtle requests and not-so-subtle

commands for personal information – recurs throughout Google’s information

interface, where the default settings and arrangement of services make the

collection of personal information automatic and difficult to resist.

These constraints on user resistance force us to look for alternative means

of renegotiating our Faustian bargain with the perfect search engine. One avenue

for changing the terms of the Faustian bargain is to enact laws to regulate the

capture and use of personal information by Web search engines. A recent

gathering of leading legal scholars and industry lawyers to discuss the possibility

of regulating search engines revealed, however, that viable and constitutional

solutions are difficult to conceive, let alone agree upon.74 Alternatively, the search

engine industry could self-regulate, creating strict policies regarding the capture,

aggregation, and use of personal data via their services. But as Chris Hoofnagle

reminds us, “We now have ten years of experience with privacy self-regulation

online, and the evidence points to a sustained failure of business to provide

reasonable privacy protections” (2005, p. 1). Given search engine companies’

economic interests in capturing user information for powering the perfect search

engine, relying solely on self-regulation will likely prove unsatisfactory.

A third option is to affect the design of the technology itself. Connecting

Heidegger’s call for “releasement,” guiding us to strive to be in harmony with the

technologies around us (1966, p. 55), and Lessig’s assertion that “how a system is

designed will affect the freedoms and control the system enables” (Lessig, 2001,

p. 35), this dissertation argues that technological design is one of the critical

junctures for ensuring that the technologies we use support the human and ethical values that are vital within our spheres of mobility. By engaging in Value-Conscious

Design, we can work pragmatically to ensure that the design of Web search

engines positively affects freedom of mobility, while providing releasement

74. See “Regulating Search: A Symposium on Search Engines, Law, and
Public Policy” held in December 2005 at Yale Law School
(http://islandia.law.yale.edu/isp/regulatingsearch.html).

through a harmonious relationship with the technology, i.e., a relationship free

from external surveillance, judgment, control, or attribution of motives.

This dissertation has constructed a foundation for the Value-Conscious

Design of the perfect search engine. Two balls from the methodological toolkit

have been put in play: In support of a conceptual investigation, this dissertation

has attempted to bring clarity and a normative understanding to the ways in which

the quest for the perfect search engine bears on the values enjoyed in our spheres

of mobility. The dissertation has further engaged in a technical investigation of

the particular design features of the perfect search engine, revealing how its

technological properties and underlying architecture constrain freedom within our

spheres of mobility.

As the Value-Conscious Design methodological frameworks are meant to

be iterative, the process is non-linear and rarely, if ever, complete. The conceptual

investigation of the perfect search engine will continue beyond the pages of this

dissertation as the theory of contextual integrity is further developed and refined.

Similarly, further technical investigations need to take place: For example, the

Web advertising aspect of the perfect search engine has remained unexplored,

including Google’s AdSense and AdWords programs, its Web Analytics software,

and its recent acquisition of DoubleClick. Additional products and services, such

as Web software for creating documents and spreadsheets, are continually added

to Google’s infrastructure. Given Google’s penchant for building and acquiring

new tools in support of the perfect search, the investigation of its technological

properties and underlying architecture may never end.

Opportunities exist to put the translation ball (from the Values at Play

methodology) into play as well. Having identified and conceptually broadened

our understanding of how the value of privacy is implicated in the design of the

perfect search, we can try to translate that value into various design features.

Potential design variables include whether default settings for new products or

services automatically enroll users in data-collecting processes – or whether the

process can be turned off. Another variable is the extent to which different products should be

interconnected: For example, if a user signs up to use Gmail, should the

Personalized Search automatically be activated? Should the user automatically be

logged in to other services? Ideally, new tools can be developed to give users

access and control over the personal information collected: In the spirit of the

Code of Fair Information Practices, a Google Data Privacy Center should be built

to allow users to view all their personal data collected, make changes and

deletions, restrict how it is used, and so on. Countless more opportunities exist for

translating this dissertation’s conceptual and technical work into pragmatic

interventions in the design of the perfect search engine.
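
To make this translation step concrete, consider a minimal sketch of how such
design variables might be expressed in software. The example below is
illustrative only: the names PrivacySettings and DataPrivacyCenter are
hypothetical constructs of my own, not actual Google products or interfaces. It
simply shows, in Python, how privacy-protective defaults (data collection off
until a user affirmatively opts in) and the Fair Information Practice rights of
access, correction, deletion, and restriction of use could be encoded as explicit
design features.

    from dataclasses import dataclass


    @dataclass
    class PrivacySettings:
        # Privacy-protective defaults: every data-collecting feature is
        # off until the user opts in, and signing up for one service
        # (e.g., Gmail) does not activate any other.
        personalized_search: bool = False
        retain_search_history: bool = False
        link_to_other_services: bool = False


    class DataPrivacyCenter:
        """A hypothetical interface, in the spirit of the Code of Fair
        Information Practices, giving a user access to and control over
        the personal data collected about her."""

        def __init__(self):
            self._records = {}        # record_id -> dict of personal data
            self._restricted = set()  # records barred from secondary use

        def view_all(self):
            # Right of access: the user may inspect everything collected.
            return dict(self._records)

        def correct(self, record_id, field_name, new_value):
            # Right of correction.
            self._records[record_id][field_name] = new_value

        def delete(self, record_id):
            # Right of deletion.
            self._records.pop(record_id, None)
            self._restricted.discard(record_id)

        def restrict_use(self, record_id):
            # Right to limit secondary uses such as profiling or
            # targeted advertising.
            self._restricted.add(record_id)

        def usable_for_profiling(self, record_id):
            return (record_id in self._records
                    and record_id not in self._restricted)

On this sketch, a new Gmail user would find personalized_search set to False
until she enables it herself, and any record she restricts is excluded from
profiling. The point is not this particular code, but that each informational norm
identified in the conceptual investigation becomes an explicit, inspectable design
decision rather than a hidden default.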

To close, building from its broad foundation in the philosophy of

technology, this dissertation has exposed the Faustian bargain that has been thrust

upon online information-seekers as a result of the quest for the perfect search

engine. The theory of contextual integrity clarified how the perfect search engine

disrupts existing informational norms. Finally, by introducing the notion of our

broader spheres of mobility, the dissertation revealed how the quest for the perfect

search engine threatens our ability to navigate, to inquire, and to explore,

potentially inhibiting our ability to develop the awareness and competencies

necessary for effective participation in social, economic, cultural, and political

life. In short, this has been an exercise in disclosive computer ethics, uncovering

the moral issues and features in the perfect search engine that have not, until now,

gained much recognition. By harnessing the pragmatic methodologies of Value-

Conscious Design, there is hope that our Faustian bargain can be renegotiated to

provide a more harmonious relationship with the quest for the perfect search

engine, allowing full enjoyment of the fundamental freedoms within our spheres

of mobility.


BIBLIOGRAPHY

Ackerman, E., & Blitstein, R. (2006, October 9). Google buys YouTube for $1.65
billion. San Jose Mercury News.

Agre, P. (1995). Reasoning about the future: The technology and institutions of
intelligent transportation systems. Santa Clara Computer and High
Technology Law Journal, 11(1), 129-136.

Agre, P. (1996, July). Responding to arguments against privacy. The Network


Observer. Retrieved March 31, 2007, from
http://polaris.gseis.ucla.edu/pagre/tno/july-1996.html

Agre, P. (1997a). Toward a critical technical practice: Lessons learned in trying to
reform AI. In G. Bowker, L. Gasser, S. L. Star, & B. Turner (Eds.), Social
science, technical systems and cooperative work: Beyond the great divide.
Lawrence Erlbaum.

Agre, P. (1997b). Computation and human experience (Learning in doing).


Cambridge ; New York: Cambridge University Press.

Agre, P., & Rotenberg, M. (Eds.). (1997). Technology and privacy: The new
landscape. Cambridge, MA: MIT Press.

Alpert, S. (1995). Privacy and intelligent highways: Finding the right of way.
Santa Clara Computer and High Technology Law Journal, 11(1), 97-118.

Alsop, R. (2005, December 6). Ranking corporate reputations. Wall Street Journal
Online. Retrieved March 25, 2007, from
http://online.wsj.com/public/article/SB113382708423014553-
qFM4JXwHCQvWsS14_SXjj123W5M_20061206.html

American Library Association Office for Intellectual Freedom. (2004, April). The
USA Patriot Act in the library: Analysis of the USA Patriot Act related to
libraries. Retrieved January 14, 2007, from
http://www.ala.org/ala/oif/ifissues/usapatriotactlibrary.htm

American Library Association. (2002a). Intellectual freedom manual (6th ed.).
Chicago: American Library Association.

American Library Association. (2002b). Privacy: An interpretation of the Library
Bill of Rights. Retrieved January 14, 2007, from
http://www.ala.org/ala/oif/statementspols/statementsif/interpretations/privac
y.htm

American Library Association. (2003). Resolution on the USA Patriot Act and
related measures that infringe on the rights of library users. Retrieved
January 14, 2007, from
http://www.ala.org/ala/washoff/WOissues/civilliberties/theusapatriotact/alar
esolution.htm

American Library Association. (2006a). Code of ethics of the American Library
Association. Retrieved January 14, 2007, from
http://www.ala.org/ala/oif/statementspols/codeofethics/codeethics.htm

American Library Association. (2006b). Intellectual freedom manual (7th ed.).


Chicago: American Library Association.

American Library Association. (2006c). Library Bill of Rights. Retrieved January
14, 2007, from http://www.ala.org/work/freedom/lbr.html

American Library Association. (2006d). Policy concerning confidentiality of


personally identifiable information about library users. Retrieved January
14, 2007, from
http://www.ala.org/ala/oif/statementspols/otherpolicies/policyconcerning.ht
m

American Library Association. (2006e). Policy on confidentiality of library


records. Retrieved January 14, 2007, from
http://www.ala.org/ala/oif/statementspols/otherpolicies/policyconfidentiality
.htm

American Library Association. (2006f). Privacy and confidentiality. Retrieved


January 14, 2007, from
http://www.ala.org/ala/oif/ifissues/privacyconfidentiality.htm

Andrews, P. (1999, February 7). The search for the perfect search engine. The
Seattle Times, p. E1.

Arasu, A., Cho, J., Garcia-Molina, H., Paepcke, A., & Raghavan, S. (2001).
Searching the Web. ACM Transactions on Internet Technology, 1(1), 2-43.

Aristotle. (1999). Nicomachean ethics (T. Irwin, Trans. 2nd ed.). Indianapolis:
Hackett Publishing.

Associated Press. (2005, July 17). Google Growth yields privacy fear. Wired.com.
Retrieved March 31, 2007, from
http://www.wired.com/politics/security/news/2005/07/68235

Ayers, C. (2003, November 1). Google: Could this be the new God in the
machine? The Times, p. 4.

Baig, E. (2000, May 31). Help from above gets you there pioneer gps keeps driver
on straight and narrow. USA Today, p. 3D.

Barabási, A.-L. (2003). Linked: How everything is connected to everything else


and what it means for business, science, and everyday life. New York:
Plume.

Barbaro, M., & Zeller Jr, T. (2006, August 9). A face is exposed for AOL
searcher no. 4417749. The New York Times, p. A1.

Barroso, L. A., Dean, J., & Holzle, U. (2003). Web search for a planet: The
Google cluster architecture. IEEE Micro, 23(2), 22-28.

Barth, A., Datta, A., Mitchell, J. C., & Nissenbaum, H. (2006). Privacy and
contextual integrity: Framework and applications. Paper presented at the
IEEE Symposium on Security and Privacy.

Bates, D. (2002). Cartographic aberrations: Epistemology and order in the


encyclopedic map. In D. Brewer, & J. C. Hayes (Eds.), Using the
encyclopédie: Ways of knowing, ways of reading. (pp. 1-20). Oxford:
Voltaire Foundation.

Battelle, J. (2003, November 13). The database of intentions. Searchblog.


Retrieved May 16, 2006, from
http://battellemedia.com/archives/000063.php

Battelle, J. (2004, September 8). Perfect search. Searchblog. Retrieved May 16,
2006, from http://battellemedia.com/archives/000878.php

Battelle, J. (2005). The search: How Google and its rivals rewrote the rules of
business and transformed our culture. New York: Portfolio.

Battelle, J. (2006a, January 30). More on what Google (and probably a lot of
others) know. Searchblog. Retrieved May 16, 2006, from
http://battellemedia.com/archives/002283.php

Battelle, J. (2006b, January 27). What info does Google Keep? Searchblog.
Retrieved May 16, 2006, from
http://battellemedia.com/archives/002272.php

Bayley, S. (1986). Sex, drink, and fast cars. New York: Random House.

Becker, E., Buhse, W., Günnewig, D., & Rump, N. (2004). Digital rights
management: Technological, economic, legal and political aspects. New
York: Springer.

Beniger, J. (1986). The control revolution: Technological and economic origins of


the information society. Cambridge, MA: Harvard University Press.

Bennett, C. (1996). The public surveillance of personal data: A cross-national


analysis. In D. Lyon, & E. Zureik (Eds.), Computers, surveillance, and
privacy. (pp. 237-259). Minneapolis: University of Minnesota Press.

Bennett, C. (2001). Cookies, Web bugs, webcams and cue cats: Patterns of
surveillance on the world wide Web. Ethics and Information Technology,
3(3), 197-210.

Bennett, C., Raab, C., & Regan, P. (2003). People and place: Patterns of
individual identification within intelligent transport systems. In D. Lyon
(Ed.), Surveillance as social sorting: Privacy, risk, and digital
discrimination. (pp. 153-175). London: Routledge.

Berger, M. L. (2001). The automobile in American history and culture: A


reference guide. Westport, CT: Greenwood Press.

Bergman, M. (2001). The deep Web: Surfacing hidden value. Journal of


Electronic Publishing, 7(1).

Berkley, J. (1996). Women at the motor wheel: Gender and car culture in the
U.S.A., 1920-1930. Unpublished Dissertation, Claremont Graduate
University.

Berners-Lee, T. (2000). Weaving the Web: The past, present and future of the
world wide Web by its inventor. New York: Harper Business.

Bey, H. (1991). Taz: The temporary autonomous zone, ontological anarchy,


poetic terrorism. Brooklyn, NY: Autonomedia.

Bijker, W., & Law, J. (Eds.). (1992). Shaping technology/building society:


Studies in sociotechnical change. Cambridge, MA: MIT Press.

Bijker, W. E. (1995). Of bicycles, bakelites, and bulbs: Toward a theory of
sociotechnical change. Cambridge, MA: MIT Press.

Bijker, W. E., Hughes, T., & Pinch, T. (Eds.). (1987). The social construction of
technological systems: New directions in the sociology and history of
technology. Cambridge, MA: MIT Press.

Boehner, K., David, S., Kaye, J., & Sengers, P. (2005). Critical technical practice
as a methodology for values in design. Paper presented at the CHI 2005
Workshop: Quality, Value(s) and Choice: Exploring Wider Implications of
HCI Practice, Portland, OR.

Bowker, G. C., & Star, S. L. (1999). Sorting things out: Classification and its
consequences. Cambridge, MA: MIT Press.

Bray, H. (2004, April 26). Gmail controversy highlights new privacy issue. The
Boston Globe, p. C1.

Bray, H. (2005, November 8). Security firm: Sony CDs secretly install spyware.
Boston Globe. Retrieved January 20, 2007, from
http://www.boston.com/business/technology/articles/2005/11/08/security_fir
m_sony_cds_secretly_install_spyware/

Brewer, D., & Hayes, J. C. (Eds.). (2002). Using the encyclopédie: Ways of
knowing, ways of reading. Oxford: Voltaire Foundation.

Brey, P. (2000). Disclosive computer ethics. Computers and Society, 30(4), 10-
16.

Brin, S. & Page, L. (1998). The anatomy of a large-scale hypertextual Web search
engine. WWW7 / Computer Networks, 30(1-7), 107-117.

Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., et
al. (2000). Graph structure in the Web. Computer Networks, 33(1-6), 309-
320.

Burk, D. & Gillespie, T. (2006). Autonomy and morality in DRM and anti-
circumvention law. tripleC: Cognition, Communication, Cooperation, 4(2),
239-245.

Bush, V. (1945, July). As we may think. The Atlantic Monthly, 176(1), 101-108.

Busha, C. (1977). An intellectual freedom primer. Littleton, CO: Libraries


Unlimited.

Bynum, T. W. (2000). Norbert Wiener’s foundation of computer ethics. The
Research Center on Computing & Society. Retrieved March 18, 2007, from
http://www.southernct.edu/organizations/rccs/resources/research/introductio
n/bynum_wiener.html

Cahn, E. N. (1949). The sense of injustice, an anthropocentric view of law. New


York: New York University Press.

Cameron, A. (2004). Infusing privacy norms in DRM: Incentives and perspectives
from law. In Y. Deswarte, F. Cuppens, S. Jajodia, & L. Wang (Eds.),
Information security management, education and privacy: IFIP 18th World
Computer Congress: TC11 19th International Information Security
Workshops, 22-27 August 2004, Toulouse, France. (pp. 297-312). Norwell,
MA: Kluwer Academic Publishers.

Camp, L. J. (2006, March 23). Reliable, usable signaling to defeat masquerade


attacks. Retrieved September 23, 2006, from
http://www.ljean.com/files/NetTrustEcon.pdf

Camp, L. J. (n.d.). Design for values, design for trust. Retrieved September 20,
2006, from http://www.ljean.com/design.html

Camp, L. J., Friedman, A., & Genkina, A. (n.d.). Embedding trust via social
context in virtual spaces. Retrieved September 23, 2006, from
http://www.ljean.com/files/NetTrust.pdf

Casey, R. (1997). Textual vehicles: The automobile in American literature. New


York: Garland.

Castells, M. (1996-1998). The information age: Economy, society and culture,


vol. I, ii, and iii. Cambridge, MA: Blackwell Publishers.

Chandler, J. (2002). Bias in internet search engines: Free speech implications.


Harvard Law School, Cambridge, MA.

Chang, N. (2001, November). The USA PATRIOT Act: What's so patriotic about
trampling on the Bill of Rights? The Center for Constitutional Rights.
Retrieved January 14, 2007, from http://www.ccr-
ny.org/v2/reports/docs/USA_PATRIOT_ACT.pdf

Chun, W. H.-K. (2006). Control and freedom: Power and paranoia in the age of
fiber optics. Cambridge, MA: MIT Press.

Clark, H. (2006, August 23). Innovation: A waste of money? Forbes.com.
Retrieved August 20, 2006, from
http://www.forbes.com/leadership/2006/08/23/leadership-innovation-
requiredreading-cx_hc_0823moore.html

Clarke, R. (1988). Information technology and dataveillance. Communications of


the ACM, 37(5), 498-512.

Clarke, R. (1997). Introduction to dataveillance and information privacy, and


definitions of terms. Retrieved December 3, 2005, from
http://www.anu.edu.au/people/Roger.Clarke/DV/Intro.html

Clarke, R. (2000). Person-location and person-tracking: Technologies, risks and


policy implications. Retrieved December 3, 2005, from
http://www.anu.edu.au/people/Roger.Clarke/DV/PLT.html

Clarke, R. (2001). Cookies. Retrieved May 5, 2006, from


http://www.anu.edu.au/people/Roger.Clarke/II/Cookies.html

Clayton, M. (2000, May 23). Calculators in class: Freedom from scratch paper or
'crutch'? The Christian Science Monitor, p. 20.

Cohan, S., & Hark, I. R. (1997). The road movie book. London: Routledge.

Cohen, J. (1996). A right to read anonymously: A closer look at ‘copyright


management’ in cyberspace. Connecticut Law Review, 28(4), 981-1039.

Cohen, J. (2003a). DRM and privacy. Communications of the ACM, 46(4), 46-49.

Cohen, J. (2003b). DRM and privacy. Berkeley Technology Law Journal, 18, 575-
617.

Cohen, R. (2002, December 15). Is Googling o.k.? The New York Times
Magazine, p. 50.

Crowley, D., & Heyer, P. (2007). Communication in history: Technology, culture,


society (5th ed.). Boston: Pearson Allyn & Bacon.

Datamonitor. (2005, July 5). Google Enters top 100 list. Datamonitor
Computerwire. Retrieved November 30, 2006, from
http://www.computerwire.com/industries/research/?pid=B9775C66-25B1-
4979-9672-0A11732DAD9F

Deleuze, G. (1995). Postscripts on control societies (M. Joughin, Trans.).


Negotiations, 1972-1990. (pp. 177-182). New York: Columbia University
Press.

Derene, G. (2007, March 21). Buzzword: Vehicle-to-vehicle communications.
PopularMechanics.com. Retrieved March 28, 2007, from
http://www.popularmechanics.com/blogs/automotive_news/4213544.html

Derrida, J. (1981). Plato’s pharmacy (B. Johnson, Trans.). Chicago: University of


Chicago Press.

Dettelbach, C. G. (1976). In the driver's seat: The automobile in American


literature and popular culture. Westport, CT: Greenwood.

Diaz, A. (2005). Through the Google goggles: Sociopolitical bias in search


engine design. Stanford University.

Downs, R. B., & McCoy, R. E. (1984). The first freedom today: Critical issues
relating to censorship and intellectual freedom. Chicago: American Library
Association.

Doyle, C. (2003, February 26). Libraries and the USA PATRIOT Act.
Congressional Research Service Report for Congress.

Doyle, J. (1998, October 10). Electronic tolls for golden gate bridge; district
approves scanner system for use late next year. San Francisco Chronicle, p.
A1.

Dreyfus, H. (2004). Heidegger on gaining a free relation to technology. In D. M.


Kaplan (Ed.), Readings in the philosophy of technology. (pp. 53-62).
Lanham, MD: Rowman & Littlefield.

Dunlap, A. R. (2002). Fixing the Fourth Amendment with trade secret law: A
response to Kyllo v. United States. Georgetown Law Journal, 90(6), 2175-
2206.

Durbin, P., & Rapp, F. (Eds.). (1983). Philosophy and technology. Boston:
Kluwer Academic Publishers.

Eisenstein, E. (1979). The printing press as an agent of change: Communications


and cultural transformations in early-modern Europe. New York:
Cambridge University Press.

Eisenstein, E. (1983). The printing revolution in early modern Europe. New York:
Cambridge University Press.

Electronic Frontier Foundation. (2006, February 9). Google copies your hard
drive - government smiles in anticipation. Retrieved May 25, 2006, from
http://www.eff.org/news/archives/2006_02.php#004400

Electronic Privacy Information Center. (2004a, March 29). Digital rights
management and privacy. Retrieved January 20, 2007, from
http://www.epic.org/privacy/drm/

Electronic Privacy Information Center. (2004b, August 18). Gmail privacy page.
Retrieved June 17, 2006, from http://www.epic.org/privacy/gmail/faq.html

Ellul, J. (1964). The technological society (J. Wilkinson, Trans.). New York:
Knopf.

Ellul, J. (1985). The technological order. In C. Hickman (Ed.), Philosophy,


technology, and human affairs. (pp. 40-54). College Station, TX: Ibis.

Elmer, G. (2004). Profiling machines: Mapping the personal information


economy. Cambridge, MA: MIT Press.

Emtage, A. & Deutsch, P. (1992). Archie: An electronic directory service for the
internet. Proceedings of the Winter 1992 Usenix Conference, 93-110.

Evans, J. (1996, May 10). Copyright comes to the internet; IBM's 'Cryptolope'
technology collects the fees. The Washington Post, p. F1.

Fabrikant, G. (2005, March 21). Ask Jeeves Inc. to be bought for $2 billion. The
New York Times, p. A12.

Fallows, D. (2005). Search engine users: Internet searchers are confident, satisfied
and trusting – but they are also unaware and naïve. Pew Internet &
American Life Project. Retrieved October 15, 2005, from
http://www.pewinternet.org/pdfs/PIP_Searchengine_users.pdf

Farías, V. (1989). Heidegger and Nazism. Philadelphia: Temple University Press.

Feigenbaum, J., Freedman, M. J., Sander, T., & Shostack, A. (2001). Privacy
engineering for digital rights management systems. Proceedings of the ACM
Workshop on Security and Privacy in Digital Rights Management.

Felten, E. (2003). A skeptical view of DRM and fair use. Communications of the
ACM, 46(4), 56-59.

Ferguson, C. (2005). What's next for Google? Technology Review, 108(1), 38-46.

Flanagan, M., Howe, D., & Nissenbaum, H. (2005). Values at play: Design
tradeoffs in socially-oriented game design. Conference on Human Factors
in Computing Systems, 751-760.

Flanagan, M., Howe, D., & Nissenbaum, H. (in press). Values in design: Theory
and practice. In J. van den Hoven, & J. Weckert (Eds.), Information
technology and moral philosophy. Cambridge University Press.

Flink, J. J. (1975). The car culture. Cambridge, MA: MIT Press.

Foerstel, H. N. (1991). Surveillance in the stacks: The FBI's library awareness


program. New York: Greenwood Press.

Fogarty, T. (1997, November 20). GM at your service with OnStar aboard, buy a
car, get a concierge. USA Today, p. 1B.

Foucault, M. (1971). The order of things: An archaeology of the human sciences.


New York: Vintage.

Franzier, D. (2002, May 27). Traffic gets smarter; high-tech solutions can help
manage denver roads. Rocky Mountain News, p. 26A.

Friedman, B. (1997). Human values and the design of computer technology (CSLI
lecture notes. ; no. 72). New York: Cambridge University Press.

Friedman, B. (1999). Value-sensitive design: A research agenda for information


technology. National Science Foundation, Contract No: SBR-9729633).
Arlington, VA.

Friedman, B., Howe, D., & Felten, E. (2002). Informed consent in the mozilla
browser: Implementing value-sensitive design. Proceedings of the 35th
Annual Hawaii International Conference on System Sciences.

Friedman, B., & Kahn, P. (2002). Human values, ethics, and design. In J. Jacko,
& S. A. (Eds.), The human-computer interaction handbook. (pp. 1177-
1201). Mahwah, NJ,: Lawrence Erlbaum.

Friedman, B., Kahn, P., & Borning, A. (2002). Value sensitive design: Theory
and methods. (Technical Report 02-12-01). Seattle, WA.

Friedman, T. (2003, June 29). Is Google God? The New York Times, p. 13.

Froomkin, A. M. (1999). Legal issues in anonymity and pseudonymity. The


Information Society, 15, 113-127.

Froomkin, A. M. (2000). The death of privacy. Stanford Law Review, 52(5),


1461-1543.

Galloway, A. (2004). Protocol: How control exists after decentralization.
Cambridge, MA: MIT Press.

Garfinkel, S. (1995, May 3). The road watches you. New York Times, p. A17.

Garfinkel, S. (2000). Database nation: The death of privacy in the 21st century
(1st ed ed.). Sebastopol, CA: O'Reilly.

Gavison, R. (1980). Privacy and the limits of law. Yale Law Journal, 89(3), 421-
471.

General Motors. (1940). To New Horizons [Film]. Handy Jam Organization.

Gillies, J., & Cailliau, R. (2000). How the Web was born: The story of the world
wide Web. New York: Oxford University Press.

Gillmor, S. (2004, April 23). Google's Brin talks on Gmail future. eWeek.com.
Retrieved June 20, 2006, from
http://www.eweek.com/article2/0,1759,1572683,00.asp

Gnix. (2006, November 2). Linkstruct2.Gif. Wikipedia, The Free Encyclopedia.


Retrieved April 14, 2007, from
http://en.wikipedia.org/w/index.php?title=Image:Linkstruct2.GIF&oldid=12
1320289

Goldman, J. (2006, August 17). Beta update!. Blogger Buzz. Retrieved August 28,
2006, from http://buzz.blogger.com/2006/08/beta-update.html

Google. (1999, June 7). Google Receives $25 million in equity funding [press
release]. Google Press Center. Retrieved August 18, 2006, from
http://www.google.com/press/pressrel/pressrelease1.html

Google. (2003). 20 year archive on Google Groups. Retrieved June 20, 2006,
from http://www.google.com/googlegroups/archive_announce_20.html

Google. (2004a). Google Alerts faq. Retrieved May 22, 2006, from
http://www.google.com/alerts/faq.html?hl=en

Google. (2004b, March 17). Google Connects searchers with local information
[press release]. Google Press Center. Retrieved May 25, 2006, from
http://www.google.com/press/pressrel/local.html

Google. (2004c). Google Technology. Retrieved May 3, 2006, from


http://www.google.com/technology/

Google. (2005a, October 15). Blogger privacy notice. Retrieved August 20, 2006,
from http://beta.blogger.com/privacy

Google. (2005b). Company overview. Retrieved May 3, 2006, from


http://www.google.com/corporate/index.html

Google. (2005c). Gmail privacy policy. Retrieved June 17, 2006, from
http://mail.google.com/mail/help/privacy.html

Google. (2005d). Gmail terms of use. Retrieved June 17, 2006, from
http://mail.google.com/mail/help/terms_of_use.html

Google. (2005e, September 15). Google Blog search [press release]. Google Press
Center. Retrieved May 25, 2006, from
http://www.google.com/press/annc/blog_search.html

Google. (2005f). Google Book search frequently asked questions. Retrieved May
25, 2006, from http://books.google.com/googleprint/help.html

Google. (2005g). Google Image search help. Retrieved May 3, 2006, from
http://www.google.com/help/faq_images.html

Google. (2005h, October 6). Google Merges local and maps products [press
release]. Google Press Center. Retrieved May 25, 2006, from
http://www.google.com/press/pressrel/local_merge.html

Google. (2005i). Google Privacy faq. Retrieved May 3, 2006, from


http://www.google.com/privacy_faq.html

Google. (2005j, October 14). Google Privacy policy. Retrieved May 3, 2006,
from http://www.google.com/privacypolicy.html

Google. (2005k). Investor relations: Google Code of conduct. Retrieved May 3,


2006, from http://investor.google.com/conduct.html

Google. (2005l, October 14). Orkut privacy notice. Retrieved August 20, 2006,
from http://www.orkut.com/Privacy.aspx

Google. (2005m, May 19). Personalize your homepage [press release]. Google
Press Center. Retrieved May 16, 2006, from
http://www.google.com/press/annc/personalize.html

Google. (2005n). Sizing up search engines. Retrieved December 1, 2006, from


http://www.google.com/help/indexsize.html

Google. (2006a, September 22). All our n-gram are belong to you. Google
Research Blog. Retrieved April 15, 2007, from
http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-
you.html

Google. (2006b). Blogger help: Will my use of blogger be associated with my use
of other Google services? Retrieved August 20, 2006, from
http://help.blogger.com/bin/answer.py?answer=42601&topic=8939

Google. (2006c). Dodgeball privacy policy. dodgeball.com. Retrieved August 20,


2006, from http://www.dodgeball.com/privacy

Google. (2006d). Google Calendar privacy notice. Retrieved May 25, 2006, from
http://www.google.com/googlecalendar/privacy_policy.html

Google. (2006e). Google Desktop features. Retrieved May 25, 2006, from
http://desktop.google.com/features.html

Google. (2006f). Google Finance faq. Retrieved May 25, 2006, from
http://www.google.com/googlefinance/faq.html

Google. (2006g, February 9). Google Groups beta privacy policy. Retrieved June
20, 2006, from http://groups.google.com/googlegroups/privacy.html

Google. (2006h). Google News. Retrieved May 22, 2006, from


http://news.google.com/intl/en_us/about_google_news.html#multi_personali
zed

Google. (2006i, May 8). Google Notebook privacy policy. Retrieved June 20,
2006, from http://www.google.com/googlenotebook/privacy.html

Google. (2006j, February 7). Google Talk privacy notice. Retrieved May 25,
2006, from http://www.google.com/talk/privacy.html

Google. (2006k, January 11). Google Toolbar privacy notice. Retrieved May 3,
2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=32551&topi
c=938

Google. (2006l). Google Video player privacy notice. Retrieved May 25, 2006,
from
http://video.google.com/support/bin/answer.py?answer=32170&topic=1490

Google. (2006m). Google Web accelerator help. Retrieved May 3, 2006, from
http://webaccelerator.google.com/support.html

Google. (2006n). How do I personalize the Google homepage? Retrieved May 22,
2006, from
http://www.google.com/support/bin/answer.py?answer=25551&topic=1592

Google. (2006o). Making purchases with your Google Account. Retrieved May
25, 2006, from
http://www.google.com/support/purchases/bin/answer.py?answer=31754

Google. (2006p). What are google's privacy practices for the send to features?
Retrieved May 3, 2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=32819&topi
c=938

Google. (2006q). What are the advanced features of the Google Toolbar?
Retrieved May 3, 2006, from
http://www.google.com/support/toolbar/bin/answer.py?answer=14293

Google. (2007). Googlebot: Google's Web crawler. Google Information for


Webmasters. Retrieved April 14, 2007, from http://www.google.com/intl/pt-
PT/webmasters/bot.html

Google. (2004). Letter from the founders. Retrieved March 29, 2007, from
http://investor.google.com/ipo_letter.html

Google. (2007). Our philosophy. Retrieved March 27, 2007, from


http://www.google.com/intl/en/corporate/tenthings.html

Gorman, M. (2004, December 17). Google and God's mind. Los Angeles Times, p.
B15.

Graham, J. (2005, November 14). Sony to pull controversial cds, offer swap. USA
Today. Retrieved January 20, 2007, from
http://www.usatoday.com/money/industries/technology/2005-11-14-sony-
cds_x.htm

Griffin, P. (2006, January 27). Big brother wants to track your cybersteps. New
Zealand Herald.

Gross, J. (1997, March 25). E-Z Pass living up to its name; electronic tolls are
catching on, and commuters are catching up. The New York Times, p. B1.

Gulli, A. & Signorini, A. (2005). The indexable Web is more than 11.5 billion
pages. International World Wide Web Conference, 902-903.

Gussow, D. (1999, October 4). In search of. St. Petersburg Times, p. 13.

Habermas, J. (1992). The structural transformation of the public sphere: An


inquiry into a category of bourgeois society (T. Burger, & F. Lawrence,
Trans.). Cambridge: Polity.

Hafner, K. (2005, May 20). Google Moves to challenge Web portals. The New
York Times, p. C6.

Hafner, K. (2006, January 25). After subpoenas, internet searches give some
pause. The New York Times, pp. A1, A19.

Hafner, K., & Lyon, M. (1996). Where wizards stay up late: The origins of the
internet. New York: Simon & Schuster.

Hafner, K., & Richtel, M. (2006, January 20). Google resists U.S. subpoena of
search data. The New York Times, pp. A1, C4.

Hall, E. (2000). Internet core protocols: The definitive guide. Cambridge, MA:
O'Reilly.

Halpern, S. (1995). The traffic in souls: Privacy interest and intelligent vehicle
highway systems. Santa Clara Computer and High Technology Law
Journal, 11(1), 45-73.

Hansell, S. (2004a, June 21). The internet ad you are about to see has already read
your e-mail. The New York Times, p. C1.

Hansell, S. (2004b, March 2). Yahoo to charge for guaranteeing a spot on its
index. The New York Times, p. C4.

Hansell, S. (2005, September 26). Microsoft plans to sell search ads of its own.
The New York Times, pp. C1, C8.

Hansell, S. (2006, August 8). AOL removes search data on vast group of Web
users. The New York Times, p. C4.

Hansen, E. (2004, January 14). Yahoo, Google primed for search war. CnET
News.com. Retrieved August 20, 2006, from http://news.com.com/2100-
1024-5141328.html

Hargittai, E. (2002). Beyond logs and surveys: In-depth measures of people's Web
use skills. Journal of the American Society for Information Science and
Technology, 53(14), 1239-1244.

Hargittai, E. (2004a). Informed Web surfing: The social context of user
sophistication. Society online: the Internet in context, Thousand Oaks: Sage
Publications, Inc, 257-274.

Hargittai, E. (2004b). The changing online landscape: From free-for-all to


commercial gatekeeping. Retrieved October 14, 2006, from
http://www.eszter.com/research/c03-onlinelandscape.html

Harris, A. (2005, January 13). Car's black box evidence ruled admissible.
Law.com. Retrieved March 31, 2007, from
http://www.law.com/jsp/article.jsp?id=1105364095740

Harris, S. (2006, July 7). Dictionary adds verb: To Google. San Jose Mercury
News.

Headrick, D. R. (2000). When information came of age: Technologies of


knowledge in the age of reason and revolution, 1700-1850. New York:
Oxford University Press.

Heidegger, M. (1966). Discourse on thinking (J. M. Anderson, & E. H. Freund,
Trans.). New York: Harper & Row.

Heidegger, M. (1977). The question concerning technology, and other essays (W.
Lovitt, Trans.). New York: Harper & Row.

Hellsten, I., Leydesdorff, L., & Wouters, P. (2006). Multiple presents: How
search engines re-write the past. New Media & Society, 8(6), 901-924.

Hellweg, E. (2002, April 22). Google's need for speed. CNN/Money. Retrieved
August 20, 2006, from
http://money.cnn.com/2002/04/22/technology/techinvestor/hellweg/index.ht
m

Hey, K. (1983). Cars and films in American culture, 1929-1959. In D. L. Lewis, &
L. Goldstein (Eds.), The automobile and American culture. (pp. 193-205).
Ann Arbor, MI: University of Michigan Press.

Heydon, A. & Najork, M. (1999). Mercator: A scalable, extensible Web crawler.


World Wide Web, 2(4), 219-229.

Hines, M. (2005, May 5). Google tool to speed Web surfing. CNET News.com.
Retrieved March 31, 2007, from
http://news.com.com/Google+tool+to+speed+Web+surfing/2100-1032_3-
5696496.html

Hinman, L. (2005). Esse est indicato in Google: Ethical and political issues in
search engines. International Review of Information Ethics, 3, 19-25.

Ho, S. Y. (2005). An exploratory study of using a user remote tracker to examine


Web users' personality traits. Proceedings of the 7th international
conference on Electronic commerce, 659-665.

Hoelscher, C. (1998). How internet experts search for information on the Web.
World Conference of the World Wide Web, Internet, and Intranet, Orlando,
FL.

Hölscher, C. & Strube, G. (2000). Web search behavior of internet experts and
newbies. Computer Networks, 33(1-6), 337-346.

Hoofnagle, C. (2005, March 4). Privacy self regulation: A decade of


disappointment. Electronic Privacy Information Center. Retrieved April 18,
2007, from http://www.epic.org/reports/decadedisappoint.html

Horrell, P. (2003, June). Intelligence: Behold the all-seeing, self-parking, safety-
enforcing, networked automobile. Popular Science, pp. 32-46.

Horrigan, J., & Rainie, L. (2006, April 19). The internet’s growing role in life’s
major moments. Pew Internet & American Life Project. Retrieved May 26,
2006, from http://www.pewinternet.org/PPF/r/181/report_display.asp

Howe, D., & Nissenbaum, H. (2006). TrackMeNot. Retrieved August 27, 2006,
from http://mrl.nyu.edu/~dhowe/TrackMeNot

IAC Search & Media. (2005, July 13). Privacy policy for ask.com. Retrieved
January 6, 2007, from http://sp.ask.com/en/docs/about/privacy.shtml

Ihde, D. (1993). Philosophy of technology: An introduction. New York: Paragon


House.

Inness, J. (1992). Privacy, intimacy, and isolation. New York, NY: Oxford
University Press.

Innis, H. (1951). The bias of communication. Toronto: University of Toronto


Press.

Internet Systems Consortium. (2006). Isc domain survey: Number of internet


hosts. Retrieved November 12, 2006, from
http://www.isc.org/index.pl?/ops/ds/host-count-history.php

Internet World Stats. (2006, September 18). World internet usage and population
statistics. Retrieved November 21, 2006, from
http://www.internetworldstats.com/stats.htm

Interrante, J. (1983). The road to autopia: The automobile and the spatial
transformation of American culture. In D. L. Lewis, & L. Goldstein (Eds.),
The automobile and American culture. (pp. 89-104). Ann Arbor, MI:
University of Michigan Press.

Introna, L. & Nissenbaum, H. (2000). Shaping the Web: Why the politics of
search engines matters. The Information Society, 16(3), 169-185.

Jansen, B. J. & Pooch, U. (2001). A review of Web searching studies and a


framework for future research. Journal of the American Society for
Information Science and Technology, 52(3), 235-246.

Jansen, B. J., Spink, A., & Saracevic, T. (2000). Real life, real users, and real
needs: A study and analysis of user queries on the Web. Information
Processing and Management, 36(2), 207-227.

Jefferson, T. (1820a, December 26). Letter to Destutt de Tracy. University of


Virginia: Jefferson Quotations. Retrieved January 10, 2007, from
http://www.monticello.org/reports/quotes/uva.html

Jefferson, T. (1820b, December 27). Letter to William Roscoe. University of


Virginia: Jefferson Quotations. Retrieved January 10, 2007, from
http://www.monticello.org/reports/quotes/uva.html

Joerges, B. (1999). Do politics have artefacts? Social Studies of Science, 29(3),
411-431.

Johnson, D. (1985). Computer ethics. Englewood Cliffs, NJ: Prentice-Hall.

Johnson, D. (2001). Computer ethics (3rd ed.). Upper Saddle River, NJ: Prentice-
Hall.

Johnson, S. (1997). Interface culture: How new technology transforms the way we
create and communicate. San Francisco: Basic Books.

Jordan, M. (2006, January 7). Electronic eye grows wider in Britain; cars to be
subject to video surveillance. The Washington Post, p. A1.

Kakihara, M. & Sorensen, C. (2002). Mobility: An extended perspective.


Proceedings of the 35th Annual Hawaii International Conference on System
Sciences, 1756-1766.

Kang, J. (1998). Information privacy in cyberspace transactions. Stanford Law
Review, 50(4), 1193-1294.

Kaplan, D. M. (Ed.). (2004). Readings in the philosophy of technology. Lanham,


MD: Rowman & Littlefield.

Kelly, K. (1996). The electronic hive: Embrace it. In R. Kling (Ed.),


Computerization and controversy. (2nd ed., pp. 75-78). San Diego:
Academic Press, Inc.

Kennedy, B. (1989). Confidentiality of library records: A survey of problems,


policies and laws. Law Library Journal, 81(4), 733-767.

Kerouac, J. (1955/1991). On the road. New York: Penguin.

Kleinberg, J. & Lawrence, S. (2001). The structure of the Web. Science, 294,
1849-1850.

Kleinberg, J. (1999). Authoritative sources in a hyperlinked environment. Journal


of the ACM (JACM), 46(5), 604-632.

Kopytoff, V. (2003, February 23). After the boom: Dot-coms defy trend, make
money. San Francisco Chronicle, p. I1.

Kopytoff, V. (2004, March 30). Google tests souped-up Web searches. San
Francisco Chronicle, p. C3.

Kristol, D. (2001). HTTP cookies: Standards, privacy, and politics. ACM


Transactions on Internet Technology (TOIT), 1(2), 151-198.

Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of


Chicago Press.

Kushmerick, N. (1998, February 23). The search engineers. The Irish Times, p.
10.

La Monica, P. (2004, April 30). Google sets $2.7 billion IPO. CNNMoney.com.
Retrieved July 29, 2006, from
http://money.cnn.com/2004/04/29/technology/google/

Laird, D. (1983). Versions of Eden: The automobile and the American novel. In D.
L. Lewis, & L. Goldstein (Eds.), The automobile and American culture. (pp.
244-256). Ann Arbor, MI: University of Michigan Press.

Langville, A., & Meyer, C. (2006). Google's PageRank and beyond: The science
of search engine rankings. Princeton, NJ: Princeton University Press.

Lardner, G. (2001, November 16). On left and right, concern over anti-terrorism
moves; administration actions threaten civil liberties, critics say. The
Washington Post, p. A40.

Lawrence, S. & Giles, C. L. (1998). Searching the world wide Web. Science,
280(5360), 98-100.

Lawrence, S. & Giles, L. (2000). Accessibility of information on the Web.


intelligence, 11(1), 32-39.

Leiner, B., Cerf, V., Clark, D., Kahn, R., Kleinrock, L., Lynch, D., et al. (2003,
December 10). A brief history of the internet. Internet Society. Retrieved
November 22, 2006, from http://www.isoc.org/internet/history/brief.shtml

Lessig, L. (1999). Code and other laws of cyberspace. New York: Basic Books.

Lessig, L. (2001). The future of ideas: The fate of the commons in a connected
world. New York: Random House.

Lessig, L. (2004). Free culture: How big media uses technology and the law to
lock down culture and control creativity. New York: Penguin Press.

Levy, S. (2006, January 30). Searching for searches. Newsweek, p. 49.

Lewis, D. (1983). Sex and the automobile: From rumble seats to rockin' vans. In
D. L. Lewis, & L. Goldstein (Eds.), The automobile and American culture.
(pp. 123-133). Ann Arbor, MI: University of Michigan Press.

Lewis, D. L., & Goldstein, L. (Eds.). (1983). The automobile and American
culture. Ann Arbor, MI: University of Michigan Press.

Lobron, A. (2006, February 5). Googling your Friday-night date may or may not
be snooping, but it won't let you peek inside any souls. The Boston Globe
Magazine, p. 42.

Lum, C. (2000). Introduction: The intellectual roots of media ecology. New Jersey
Journal of Communication, 8(1), 1-7.

Machill, M., Neuberger, C., Schweiger, W., & Wirth, W. (2004). Navigating the
internet. European Journal of Communication, 19(3), 321-347.

Manders-Huits, N. & Zimmer, M. (in progress). Values & pragmatic action: The
challenges of engagement with technical design communities.

Maney, K. (2006, August 9). AOL's data sketch sometimes scary picture of
personalities searching net. USA Today, p. 4B.

Marcuse, H. (1964). One dimensional man: Studies in the ideology of advanced


industrial society. Boston: Beacon Press.

Marcuse, H. (2004). Social implications of technology. In D. M. Kaplan (Ed.),


Readings in the philosophy of technology. (pp. 63-80). Lanham, MD:
Rowman & Littlefield.

Marlowe Jr, H. A., Nyhan, R., Arrington, L. W., & Pammer, W. J. (1994). The re-
ing of local government: Understanding and shaping governmental change.
Public Productivity & Management Review, 17(3), 299-311.

Martey, R. M. (in press). Exploring gendered notions: Gender, job hunting and
Web search engines. In A. Spink, & M. Zimmer (Eds.), Web searching:
Interdisciplinary perspectives. Dordrecht, The Netherlands: Springer.

Marx, K. (1973). Grundrisse. Foundations of the critique of political economy


(M. Nicolaus, Trans.). New York: Vintage Books.

Marx, K. (1978a). A contribution to the critique of political economy. In R. C.


Tucker (Ed.), The Marx-Engels reader. (2nd ed., pp. 3-6). New York:
Norton.

Marx, K. (1978b). Capital, volume one. In R. C. Tucker (Ed.), The Marx-Engels


reader. (2nd ed., pp. 469-500). New York: Norton.

Marx, K. (1978c). Economic and philosophic manuscripts of 1844. In R. C.


Tucker (Ed.), The Marx-Engels reader. (2nd ed., pp. 66-125). New York:
Norton.

Marx, K., & Engels, F. (1978). Manifesto of the Communist Party. In R. C.


Tucker (Ed.), The Marx-Engels reader. (2nd ed., pp. 469-500). New York:
Norton.

Mayer-Schönberger, V. (1997). The internet and privacy legislation: Cookies for


a treat? West Virginia Journal of Law and Technology, 1(1).

Mayer, T. (2005, August 8). Our blog is growing up – and so has our index.
Yahoo! Search Blog. Retrieved November 25, 2006, from
http://www.ysearchblog.com/archives/000172.html

McChesney, R. (1999). Rich media, poor democracy: Communication politics in
dubious times. Urbana, IL: University of Illinois Press.

McCown, F., Liu, X., Nelson, M. L., & Zubair, M. (2006). Search engine
coverage of the OAI-PMH corpus. IEEE Internet Computing, 10(2), 66-73.

McCullagh, D. (2003, November 19). Court to FBI: No spying on in-car


computers. CNET News.com. Retrieved December 3, 2003, from
http://news.com.com/2100-1029_3-5109435.html

McCullagh, D. (2006a, August 7). AOL's disturbing glimpse into users' lives.
CNET News.com. Retrieved December 3, 2006, from
http://news.com.com/AOLs+disturbing+glimpse+into+users+lives/2100-
1030_3-6103098.html?tag=st.num

McCullagh, D. (2006b, March 17). Police blotter: Judge orders Gmail disclosure.
News.com. Retrieved June 20, 2006, from
http://news.com.com/Police%20blotter%20Judge%20orders%20Gmail%20
disclosure/2100-1047_3-6050295.html

McFadden, R. (1987, September 18). Libraries are asked by F.B.I. to report on


foreign agents. The New York Times, pp. A1, A22.

McGann, R. (2005, March 14). Study: Consumers delete cookies at surprising


rate. ClickZ News. Retrieved May 15, 2006, from
http://www.clickz.com/news/article.php/3489636

McLuhan, M. (1964/1994). Understanding media: The extensions of man.


Cambridge, MA: MIT Press.

McShane, C. (1994). Down the asphalt path: The automobile and the American
city. New York: Columbia University Press.

Microsoft Corporation. (2005, February). Metering the use of digital media
content with Windows Media DRM 10. Retrieved January 20, 2007, from
http://msdn2.microsoft.com/en-us/library/ms867183.aspx

Mills, E. (2005, August 3). Google balances privacy, reach. CnET News.com.
Retrieved January 7, 2007, from
http://news.com.com/Google+balances+privacy,+reach/2100-1032_3-
5787483.html

Mindlin, A. (2006, May 15). The case of the disappearing cookies. The New York
Times, p. C5.

Mintz, H. (2006, January 16). Feds after Google data: Records sought in U.S.
quest to revive porn law. San Jose Mercury News. Retrieved January 19,
2006, from http://www.siliconvalley.com/mld/siliconvalley/13657386.htm

Mitcham, C. (1994). Thinking through technology: The path between engineering


and philosophy. Chicago, IL: University of Chicago Press.

Mitcham, C. (Ed.). (2005). Encyclopedia of science, technology, and ethics.


Detroit, MI: Macmillan Reference.

Moor, J. (1985). What is computer ethics? Metaphilosophy, 16, 266-275.

Mostafa, J. (2005, January 24). Seeking better Web searches. Scientific


American.com. Retrieved January 30, 2005, from
http://www.sciam.com/print_version.cfm?articleID=0006304A-37F4-11E8-
B7F483414B7F0000

Muller, M. & Kuhn, S. (1993). Special issue on participatory design.


Communications of the ACM, 36(4).

Mulligan, D., Han, J., & Burstein, A. (2003). How DRM-based content delivery
systems disrupt expectations of "personal use". Proceedings of the 2003
ACM workshop on Digital Rights Management, 77-89.

Mumford, L. (1934). Technics and civilization. New York: Harcourt Brace.

Mumford, L. (1964). Authoritarian and democratic technics. Technology and


Culture, 5(1), 1-8.

Murphy, D. (2003, April 7). Some librarians use shredder to show opposition to
new F.B.I. powers. The New York Times, p. A12.

Nagenborg, M. (2005). The ethics of search engines (special issue). International
Review of Information Ethics, 3.

Negroponte, N. (1996). Being digital. New York: Random House.

Nelson, T. (1987). Computer lib/dream machines. Redmond, WA: Microsoft


Press.

Nelson, T. (1993). Literary machines. Mindful Press.

Netcraft. (2006, November 1). November 2006 Web server survey. Retrieved
November 26, 2006, from
http://news.netcraft.com/archives/2006/11/01/november_2006_web_server_
survey.html

Network Working Group. (1985, October). Request for comments: 959, file
transfer protocol. Retrieved November 21, 2006, from
http://ietf.org/rfc/rfc959.txt

Network Working Group. (1993, March). Request for comments: 1436, the
internet gopher protocol. Retrieved November 21, 2006, from
http://ietf.org/rfc/rfc1436.txt

Nielsen, J. (1993). Usability engineering. Boston: Academic Press.

Nielsen//NetRatings. (2006, March 30). Google accounts for nearly half of all
Web searches, while approximately one third are conducted on Yahoo! and
MSN combined. Retrieved May 26, 2006, from http://www.nielsen-
netratings.com/pr/pr_060330.pdf

Nielsen//NetRatings. (2007, March 20). Nielsen//NetRatings announces February
U.S. search share rankings. Retrieved March 27, 2007, from
http://www.nielsen-netratings.com/pr/pr_070320.pdf

Nissenbaum, H. (1998). Protecting privacy in an information age: The problem of


privacy in public. Law and Philosophy, 17(5), 559-596.

Nissenbaum, H. (2001). How computer systems embody values. IEEE Computer,


34(3), 118-120.

Nissenbaum, H. (2004). Privacy as contextual integrity. Washington Law Review,


79(1), 119-157.

Norman, D. A. (1990). The design of everyday things (1st Doubleday/Currency
ed.). New York: Doubleday.

North, S. (2006). The road movie. Hackwriters.com. Retrieved January 27, 2007,
from http://www.hackwriters.com/roadone.htm

Norvig, P., Winograd, T., & Bowker, G. (2006, February 27). The ethics and
politics of search engines. Panel at Santa Clara University Markkula Center
for Applied Ethics. Retrieved March 1, 2006, from
http://www.scu.edu/sts/Search-Engine-Event.cfm

Olsen, S. (2001, October 26). Patriot Act draws privacy concerns. CNET
News.com. Retrieved January 14, 2007, from http://news.com.com/2100-
1023-275026.html

Olsen, S. (2006, May 25). Dell embraces Google. CNET News.com. Retrieved
July 24, 2006, from http://news.com.com/Dell+embraces+Google/2100-
1032_3-6077051.html

Ong, W. (1982). Orality and literacy: The technologizing of the word. London:
Routledge.

Padover, S. (1952). Jefferson: A great American's life and ideas. New York:
Harcourt, Brace & World.

Page, L., Brin, S., Motwani, R., & Winograd, T. (1998). The PageRank citation
ranking: Bringing order to the Web. Technical report. Stanford University,
Stanford, CA.

Patton, P. (1986). Open road: A celebration of the American highway. New York:
Simon.

Petrick, P. (2004, November). Why DRM should be cause for concern: An


economic and legal analysis of the effect of digital technology on the music
industry. Berkman Center for Internet & Society at Harvard Law School
Research Publication No. 2004-09. Retrieved January 20, 2007, from
http://cyber.law.harvard.edu/home/2004-09

Pierson, G. (1942). The frontier and American institutions: A criticism of the
Turner theory. The New England Quarterly, 15(2), 224-255.

Pierson, G. (1973). The moving american. New York: Knopf.

Pinch, T., & Bijker, W. E. (1987). The social construction of facts and artifacts:
Or how the sociology of science and the sociology of technology might
benefit each other. In W. E. Bijker, T. Hughes, & T. Pinch (Eds.), The social
construction of technological systems: New directions in the sociology and
history of technology. (pp. 17-50). Cambridge, MA: MIT Press.

Pinch, T. J. & Bijker, W. E. (1984). The social construction of facts and
artefacts: Or how the sociology of science and the sociology of technology
might benefit each other. Social Studies of Science, 14(3), 399-441.

Pitkow, J., Schütze, H., Cass, T., Turnbull, D., Edmonds, A., & Adar, E. (2002).
Personalized search. Communications of the ACM, 45(9), 50-55.

Plato. (1990). Phaedrus (H. Fowler, Trans. 17th ed.). Cambridge: Loeb Classical
Library.

Postman, A. (2006, January 29). Dear sir, we see from your files. The Washington
Post, p. B04.

Postman, N. (1970). The reformed English curriculum. In A. C. Eurich (Ed.), High
school 1980: The shape of the future in American secondary education. (pp.
160-168). New York: Pitman.

Postman, N. (1985). Amusing ourselves to death: Public discourse in the age of


show business. New York: Viking.

Postman, N. (1990, October 11). Informing ourselves to death. German


Informatics Society.

Postman, N. (1992). Technopoly: The surrender of culture to technology. New


York: Vintage Books.

Postman, N., & Weingartner, C. (1971). The soft revolution: A student handbook
for turning schools around. New York: Delacorte Press.

Privacy Rights Clearinghouse. (2004, April 19). Thirty-one privacy and civil
liberties organizations urge Google to suspend Gmail. Retrieved June 17,
2006, from http://www.privacyrights.org/ar/GmailLetter.htm

Purdy, M. (2001, November 25). Bush's new rules to fight terror transform the
legal landscape. The New York Times, p. A1.

Rainie, L. (2005, November). Search engine use shoots up in the past year and
edges towards e-mail as the primary internet application. Pew Internet and
American Life Project.

Ramasastry, A. (2005a, May 12). Can we stop Zabasearch -- and similar personal
information search engines?: When data democratization verges on privacy
invasion. FindLaw. Retrieved June 12, 2006, from
http://writ.news.findlaw.com/ramasastry/20050512.html

Ramasastry, A. (2005b, August 23). Tracking every move you make: Can car
rental companies use technology to monitor our driving? FindLaw.
Retrieved September 15, 2005, from
http://writ.news.findlaw.com/ramasastry/20050823.html

Raskin, J. (2000). The humane interface: New directions for designing interactive
systems. Reading, MA: Addison-Wesley.

RealNetworks. (2006, October 4). Rhapsody DNA white paper. Retrieved
January 20, 2007, from
http://webservices.rhapsody.com/rwssdk/RhapsodyDNAWhitePaperV1_0.p
df

Regan, P. (1995). Legislating privacy: Technology, social values, and public


policy. Chapel Hill: University of North Carolina Press.

Regan, P. (2001). From clipper to carnivore: Balancing privacy, law enforcement


and industry interests. Paper presented at the American Political Science
Association, San Francisco, CA.

Reiman, J. (1995). Driving to the panopticon: A philosophical exploration of the


risks to privacy posed by the highway technology of the future. Santa Clara
Computer and High Technology Law Journal, 11(1), 27-44.

Reinhardt, A. (2003, May 5). And you thought the Web ad market was dead.
BusinessWeek Online. Retrieved November 20, 2006, from
http://www.businessweek.com/magazine/content/03_18/b3831134_mz034.h
tm

Retson, D. (2006, March 27). Black box tells all: The event data recorder is now
included in the majority of new vehicles recording information, including
speed and seat-belt use, in the event of a crash. The Gazette, p. E1.

Rheingold, H. (2000). The virtual community: Homesteading on the electronic


frontier. Cambridge, MA: MIT Press.

Robbins, L. (1991). Toward ideology and autonomy: The American Library
Association's response to threats to intellectual freedom, 1939-1969.
Unpublished Dissertation, Texas Woman's University.

Rogers, I. (2002, April). The Google PageRank algorithm and how it works. IPR
Computing. Retrieved April 14, 2007, from
http://www.iprcom.com/papers/pagerank/

Roush, W. (2005). Killer maps. Technology Review, 108(10), 54-60.

Roy, M. & Chi, M. T. C. (2003). Gender differences in patterns of searching the


Web. Journal of Educational Computing Research, 29(3), 335-348.

Samuelson, P. (2003). Digital rights management {and, or, vs.} the law.
Communications of the ACM, 46(4), 41-45.

Sanchez, R. (2003, April 10). Librarians make some noise over Patriot Act. The
Washington Post, p. A20.

Santa Clara symposium on privacy and IVHS. (1995). Santa Clara Computer &
High Technology Law Journal, 11.

Scharff, R., & Dusek, V. (Eds.). (2003). Philosophy of technology: The


technological condition: An anthology. Malden, MA: Blackwell Publishers.

Scharff, V. (1991). Taking the wheel: Women and the coming of the motor age.
New York: The Free Press.

Schoeman, F. (1992). Privacy and social freedom (Cambridge studies in


philosophy and public policy). Cambridge: Cambridge University Press.

Schwartz, C. (1998). Web search engines. Journal of the American Society for
Information Science, 49(11), 973-982.

Schwartz, J. (2001, September 4). Giving the Web a memory costs its users
privacy. The New York Times, p. A1.

Schwartz, J. (1998, September 8). Kids and computers; how wired would a
student's world be? The Washington Post, p. Z7.

Sclove, R. (1995). Democracy and technology. New York: Guilford.

Search Engine Marketing Professional Organization. (2006, January 9). Search


engine marketers spent $5.75 billion in 2005. Retrieved November 21, 2006,
from http://www.sempo.org/news/releases/Search_Engine_Marketers

Selingo, J. (2001, October 25). It’s the cars, not the tires, that squeal. The New
York Times, pp. G1, G8.

Sengers, P., Boehner, K., & David, S. (2005). Reflective design. Proceedings of
the 4th decennial conference on Critical computing: between sense and
sensibility, 49-58.

Sengers, P., Liesendahl, R., Magar, W., Seibert, C., Müller, B., Joachims, T., et al.
(2002). The enigmatics of affect. Proceedings of the conference on
Designing interactive systems: processes, practices, methods, and
techniques, 87-98.

Setright, L. J. K. (2003). Drive on!: A social history of the motor car. London:
Granta.

Shankland, S. (2005, October 4). Sun and Google shake hands. CNET News.com.
Retrieved July 24, 2006, from
http://news.com.com/Sun+and+Google+shake+hands/2100-1014_3-
5888701.html

Sharma, D. (2004, October 21). Is your boss Googling you? CNET News.com.
Retrieved January 6, 2007, from
http://news.com.com/Is+your+boss+Googling+you/2100-1038_3-
5421210.html

Sheff, D. (2004). Playboy interview: Google guys. Playboy, 51(9), 55-60, 142-
145.

Shirky, C. (2005). Ontology is overrated: Categories, links, and tags. Clay


Shirky’s Writings About the Internet. Retrieved March 25, 2007, from
http://www.shirky.com/writings/ontology_overrated.html

Shneiderman, B. (1991). Human values and the future of technology: A


declaration of responsibility. ACM SIGCHI Bulletin, 23(1), 11-16.

Silverstein, C., Henzinger, M. R., Marais, H., & Moricz, M. (1999). Analysis of a
very large Web search engine query log. SIGIR Forum, 33(1), 6-12.

Simon, F. (2003, October 4). Ernst Kapp: An early and romantic philosopher of
technology. Retrieved December 1, 2006, from
http://members.home.nl/fsimon/index.html

Solove, D. (2004). The digital person: Technology and privacy in the information
age (Ex machina). New York: New York University Press.

Speretta, M. (2000). Personalizing search based on user search histories.


University of Kansas.

Spink, A., & Jansen, B. J. (2004). Web search: Public searching of the Web. New
York: Kluwer Academic Publishers.

Srinivasan, S. (2005, February 25). Personal communication.

Standage, T. (1998). The Victorian internet: The remarkable story of the telegraph
and the nineteenth century's on-line pioneers. New York: Walker and Co.

Steinbeck, J. (1980). Travels with Charley in search of America. New York:


Penguin.

Steinberg, S. (1996, May). Seek and ye shall find (maybe). Wired, pp. 108-114,
172-182.

Stockwell, F. (2001). A history of information storage and retrieval. Jefferson,
NC: McFarland & Company.

Sturges, P., Teng, V., & Iliffe, U. (2001). User privacy in the digital library
environment: A matter of concern for information professionals. Library
Management, 22(8/9), 364-370.

Suchman, L. (1997). Do categories have politics? The language/action perspective
reconsidered. In B. Friedman (Ed.), Human values and the design of
computer technology (pp. 91-105). Cambridge, UK: Cambridge University
Press.

Sullivan, D. (2001, February 19). Nielsen Netratings search engine ratings,
December 2000. SearchEngineWatch. Retrieved July 17, 2006, from
http://searchenginewatch.com/mhts/9902-0012-netratings.htm

Sullivan, D. (2003a, July 31). How search engines rank Web pages.
SearchEngineWatch. Retrieved November 25, 2006, from
http://searchenginewatch.com/showPage.html?page=2167961

Sullivan, D. (2003b, April 2). Search privacy at Google & other search engines.
SearchEngineWatch. Retrieved March 31, 2007, from
http://searchenginewatch.com/showPage.html?page=2189531

Sullivan, D. (2003c, May 21). Search privacy: An issue? Part 1. ClickZ.
Retrieved March 31, 2007, from
http://www.clickz.com/showPage.html?page=2207951

Sullivan, D. (2005, May 17). New estimate puts Web size at 11.5 billion pages &
compares search engine coverage. SearchEngineWatch. Retrieved January
4, 2007, from http://blog.searchenginewatch.com/blog/050517-075657

Sunstein, C. (2001). Republic.com. Princeton, NJ: Princeton University Press.

Swidey, N. (2003, February 2). A nation of voyeurs: How the internet search
engine Google is changing what we can find out about one another - and
raising questions about whether we should. The Boston Globe Sunday
Magazine, p. 10.

Tancer, B. (2006, May 18). Google Properties - understanding the breakdown.
Hitwise. Retrieved May 22, 2006, from http://weblogs.hitwise.com/bill-
tancer/2006/05/google_properties_understandin.html

Tavani, H. & Grodzinsky, F. (2002). Cyberstalking, personal privacy, and moral
responsibility. Ethics and Information Technology, 4(2), 123-132.

Tavani, H. T. (2004). Ethics and technology: Ethical issues in an age of
information and communication technology. Hoboken, NJ: Wiley.

Tavani, H. T. (2005). Search engines, personal information and the problem of
privacy in public. International Review of Information Ethics, 3, 39-45.

Tec-Ed. (1999, December). Assessing Web site usability from server log files
[white paper]. Retrieved April 3, 2006, from
www.teced.com/PDFs/whitepap.pdf

Teevan, J., Dumais, S. T., & Horvitz, E. (2005). Personalizing search via
automated analysis of interests and activities. Proceedings of the 28th
annual international ACM SIGIR conference on Research and development
in information retrieval, 449-456.

Thaw, J., & Daurat, C. (2006, August 8). Google to provide MySpace search. The
Seattle Times, p. E1.

Thompson, C., & Kerr, I. (2005). Tailgating on spyways: Vanishing anonymity
on electronic toll roads. On the Identity Trail. Retrieved December 3, 2005,
from http://idtrail.org/content/view/126/42/

Türker, D. (2004). The optimal design of a search engine from an agency theory
perspective. Retrieved November 1, 2006, from http://rundfunkoek.uni-
koeln.de/institut/publikationen/arbeitspapiere/ap191.php

Turkle, S. (1995). Life on the screen: Identity in the age of the internet. New
York: Simon & Schuster.

Turner, F. J. (1921). The frontier in American history. New York: Henry Holt and
Company.

U.S. Department of Transportation. (2005). Vehicle infrastructure integration
(VII): Major initiatives. Retrieved December 18, 2005, from
http://www.its.dot.gov/vii/vii_overview.htm

United States Department of Justice. (2006). The USA PATRIOT Act: Preserving
life and liberty. Retrieved March 11, 2007, from
http://www.lifeandliberty.gov/highlights.htm

Updike, J. (1960). Rabbit, run. New York: Knopf.

Vaidhyanathan, S. (2001). Copyrights and copywrongs: The rise of intellectual
property and how it threatens creativity. New York: New York University
Press.

Vaidhyanathan, S. (2004). The anarchist in the library: How the clash between
freedom and control is hacking the real world and crashing the system. New
York: Basic Books.

Van Couvering, E. (2004). New media? The political economy of internet search
engines. Annual Conference of the International Association of Media &
Communications Researchers, Porto Alegre, Brazil, 7-14.

Van Couvering, E. (forthcoming). The history of the internet search engine:
Navigational media and the traffic commodity. In A. Spink, & M. Zimmer
(Eds.), Web searching: Interdisciplinary perspectives. Dordrecht, The
Netherlands: Springer.

Vaughan, L. (2004). New measurements for search engine evaluation proposed
and tested. Information Processing and Management: an International
Journal, 40(4), 677-691.

Vaughan, L. & Thelwall, M. (2004). Search engine coverage bias: Evidence and
possible causes. Information Processing & Management, 40(4), 693-707.

Vehicle Safety Communications Consortium. (n.d.). Vehicle safety
communications project: Task 3 final report: Identify intelligent vehicle
safety applications enabled by DSRC.

Vine, R. (2004). The business of search engines. Information Outlook, 8(2), 25-
31.

Warren, S. & Brandeis, L. (1890). The right to privacy. Harvard Law Review, 4,
193-220.

Waters, R. (2006, April 22). Google, Microsoft and Yahoo woo Ebay. Financial
Times, p. 21.

Watts, D. J. (2003). Six degrees: The science of a connected age. New York:
Norton.

Weinberg, N. (2005, September 11). Google unifying accounts. Inside Google.
Retrieved August 20, 2006, from
http://google.blognewschannel.com/index.php/archives/2005/09/21/google-
unifying-logins/

Weiss, P. (2006, March 19). What a tangled Web we weave: Being googled can
jeopardize your job search. New York Daily News. Retrieved January 7,
2007

Wen, J. R., Nie, J. Y., & Zhang, H. J. (2001). Clustering user queries of a search
engine. Proceedings of the tenth international conference on World Wide
Web, 162-168.

Westin, A. F. (1970). Privacy and freedom. New York: Atheneum.

Wiener, N. (1965). Cybernetics or control and communication in the animal and
the machine. Cambridge, MA: MIT Press.

Wiener, N. (1988). The human use of human beings: Cybernetics and society.
Boston: Da Capo Press.

Wiggins, R. W. (2001, October). The effects of September 11 on the leading
search engine. First Monday. Retrieved May 3, 2006, from
http://firstmonday.org/issues/issue6_10/wiggins/index.html

Wikipedia contributors. (2006a, November 24). Hyperlink. Wikipedia, The Free
Encyclopedia. Retrieved November 26, 2006, from
http://en.wikipedia.org/w/index.php?title=Hyperlink&oldid=89857700

Wikipedia contributors. (2006b, September 18). Jughead (computer). Wikipedia,
The Free Encyclopedia. Retrieved November 26, 2006, from
http://en.wikipedia.org/w/index.php?title=Jughead_%28computer%29&oldid=76447079

Wikipedia contributors. (2006c, November 19). Veronica (computer). Wikipedia,
The Free Encyclopedia. Retrieved November 26, 2006, from
http://en.wikipedia.org/w/index.php?title=Veronica_%28computer%29&oldid=88861161

Wikipedia contributors. (2007a, January 23). Content scramble system.
Wikipedia, The Free Encyclopedia. Retrieved January 23, 2007, from
http://en.wikipedia.org/w/index.php?title=Content_Scramble_System&oldid=102274059

Wikipedia contributors. (2007b, April 19). IP address. Wikipedia, The Free
Encyclopedia. Retrieved April 19, 2007, from
http://en.wikipedia.org/w/index.php?title=IP_address&oldid=123964420

Wikipedia contributors. (2007c, March 25). Open Directory Project. Wikipedia,
The Free Encyclopedia. Retrieved March 26, 2007, from
http://en.wikipedia.org/w/index.php?title=Open_Directory_Project&oldid=117786100

Williams, A. (2006, October 15). Planet Google wants you. The New York Times,
p. 9.1.

Williams, R. (1983). Keywords: A vocabulary of culture and society (Rev. ed.).
New York: Oxford University Press.

Winner, L. (1986). The whale and the reactor: A search for limits in an age of
high technology. Chicago, IL: University of Chicago Press.

Winner, L. (1993). Upon opening the black box and finding it empty: Social
constructivism and the philosophy of technology. Science, Technology, and
Human Values, 18(3), 362-378.

Wouters, J. (2005, June 9). Still searching for disclosure. Consumer Reports
WebWatch. Retrieved September 15, 2005, from
http://www.consumerwebwatch.org/pdfs/search-engine-disclosure.pdf

Wouters, P., Hellsten, I., & Leydesdorff, L. (2004). Internet time and the
reliability of search engines. First Monday. Retrieved December 24, 2006,
from http://www.firstmonday.org/issues/issue9_10/wouters/index.html

Wright, M., & Kakalik, J. (2000). The erosion of privacy. In R. M. Baird, R. M.
Ramsower, & S. E. Rosenbaum (Eds.), Cyberethics: Social and moral
issues in the computer age (pp. 162-170). Amherst, NY: Prometheus
Books.

Yahoo!. (2005). The history of Yahoo! - how it all started. Retrieved March 25,
2007, from http://docs.yahoo.com/info/misc/history.html

Yahoo!. (2006, November 11). Yahoo! Privacy policy. Retrieved January 6, 2007,
from http://info.yahoo.com/privacy/us/yahoo/details.html

Yahoo!. (2007). Company overview. Retrieved March 31, 2007, from
http://yhoo.client.shareholder.com/press/overview.cfm

Yeo, R. R. (2001). Encyclopaedic visions: Scientific dictionaries and
enlightenment culture. Cambridge, UK: Cambridge University Press.

Zamir, O. & Etzioni, O. (1999). Grouper: A dynamic clustering interface to Web
search results. WWW8 / Computer Networks, 31(11-16), 1361-1374.

Zetter, K. (2005a, June 21). Driving big brother. Wired News. Retrieved June 28,
2005, from
http://www.wired.com/news/privacy/0,1848,67952,00.html?tw=wn_tophead
_3

Zetter, K. (2005b, May 17). Tor torches online tracking. Wired News. Retrieved
May 28, 2006, from
http://www.wired.com/news/privacy/0,1848,67542,00.html

Zimmer, M. (2005). Surveillance, privacy and the ethics of vehicle safety
communication technologies. Ethics and Information Technology, 7(4), 201-210.


APPENDIX A

GOOGLE’S QUEST FOR THE PERFECT SEARCH ENGINE:


PRODUCTS AND DATA CAPTURE

In its effort to “organize the world’s information,” Google has expanded

its search services to include not only websites, but other online documents as

well, such as images, news feeds, Usenet archives, and video files. Additionally,

Google has begun digitizing the “material world,” adding the contents of popular

books, university libraries, maps and satellite images to its growing index. Users

can also search the files on their hard drives, send e-mail and instant messages,

shop online, and even engage in social networking through Google. In all, Google

has amassed an extensive Web search information infrastructure comprising nine

distinct information-seeking contexts: general information inquiries, academic

research, news and political information, communication and social networking,

personal data management, financial data management, shopping and product

research, computer file management, and Internet browsing.75 The following

sections will briefly describe Google’s key products in each of these information-

seeking contexts, revealing the circumstances of their use, and providing insights

75
These nine contexts are not necessarily mutually exclusive and are not
put forth as strict metaphysical divisions. They are meant simply to help
compartmentalize, for easier discussion, the various types of information-seeking
activities a person undertakes in her daily life.
into how they help Google attain the “perfect reach” and “perfect recall”

necessary for the perfect search engine.

General Information Inquiries

Web Search

Google’s pioneering Web search technology and design has enabled it to

stand apart from the competition. In 1998, Brin and Page’s paper, “The Anatomy

of a Large-Scale Hypertextual Web Search Engine” (Brin & Page, 1998),

proposed a system to more effectively retrieve information from the World Wide

Web to “improve the quality of search engines” and thus “bring order to the Web”

(Brin & Page, 1998, p. 3). The core of their new Web search engine was

PageRank, a set of algorithms for ranking Web pages, using the immense link

structure of the Web as an organizational tool:

PageRank relies on the uniquely democratic nature of the Web by using its
vast link structure as an indicator of an individual page's value. In essence,
Google interprets a link from page A to page B as a vote, by page A, for
page B. But, Google looks at more than the sheer volume of votes, or links
a page receives; it also analyzes the page that casts the vote. Votes cast by
pages that are themselves “important” weigh more heavily and help to
make other pages “important.” (Google, 2004c)
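
The voting logic described in this passage can be made concrete with a short
sketch. The following Python fragment is a minimal, illustrative implementation
of the power-iteration method commonly used to compute PageRank-style scores;
the toy link graph and the damping factor of 0.85 are assumptions for
illustration, not Google's actual data or parameters.

    # A minimal PageRank sketch: pages "vote" for the pages they link to,
    # and votes from high-scoring pages weigh more. Illustrative only.
    links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # hypothetical link graph
    damping = 0.85                                     # conventional damping factor
    rank = {page: 1.0 / len(links) for page in links}  # begin with equal scores

    for _ in range(50):  # iterate until the scores settle
        rank = {
            page: (1 - damping) / len(links)
            + damping * sum(rank[p] / len(out)
                            for p, out in links.items() if page in out)
            for page in links
        }

    print(sorted(rank.items(), key=lambda kv: -kv[1]))  # most "voted-for" first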

Complementing PageRank is Google’s use of hypertext-matching analysis. At the

time of Google’s launch, most search engines relied heavily on how often a word

appears on a Web page in order to determine its relevancy and importance.

However, instead of simply scanning for the occurrence of a word within the

page, Google analyzes the full content of a page and factors in font size, header

levels, and the relative location of each word in order to measure its importance.

For example, a page with the search term in the title or in large, bold font will be

considered more relevant than a page where the word appears only in a footnote.

Google’s Web search also features a simple, uncluttered interface

designed to make it easy for users to enter search queries and interpret results.

Results are presented with context-sensitive summaries so users can determine

whether the corresponding Web pages will satisfy their needs. Google has

increasingly incorporated a “One Box” result for select searches, in which the first

result presents additional information and links (Figure 7).

Figure 7: Partial Google Web search results page for “Boston subway” showing
“One Box” with additional navigational links (circled).

The usefulness of Google’s Web search technology is dependent on the

size and reach of its index. While Google no longer publishes its index size, in

2005, it claimed over 8.1 billion pages, and it is estimated that it has indexed

nearly 70% of the Web (Sullivan, 2005). In addition to HTML-based Web pages,

Google’s Web search also indexes and provides results for Adobe Portable

Document Format (PDF) files, spreadsheets, slide presentations, text documents,

and even Shockwave Flash animation files.

As noted above, Google maintains server logs recording each Web search

request processed through its search engine (Google, 2005j). These logs contain,

at a minimum, the IP address, date and time, browser type and operating system,

cookie ID and the specific search terms for each of the 100 million Web search

queries that Google processes daily. The individual search terms within the logs

are a mix of the mundane and stimulating, the trivial and the informative. While

over half of searchers say they split their searches among those that are “for fun”

and those that are “important” to them (Fallows, 2005), users are increasingly

using the Internet and search engines to help them make important decisions or

negotiate their way through major episodes in their lives (Horrigan & Rainie,

2006). In such potentially personal and sensitive circumstances, the terms for

which users search, along with the results they decide to click on, are stored in

Google’s server logs. Whether a user searches for teen pop star “Lindsay Lohan”

or “Cleveland HIV treatment center,” these searches become associated with an

IP address, cookie ID, and possibly a user account in Google’s server logs.

To help further reconstruct a user’s movements, Google also records

clickstream data, including which search results or advertising links a user clicks

(Google, 2005i).
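
To make the contents of such a log entry concrete, the sketch below models a
single record as a Python dictionary. The field names and values are
hypothetical, mirroring only the categories of data named above (IP address,
date and time, browser and operating system, cookie ID, search terms, and
clicked results), not Google's actual log format.

    # A hypothetical search-engine log record; field names and values are
    # illustrative only, mirroring the data categories described above.
    log_entry = {
        "ip_address": "192.0.2.44",                     # requesting address
        "timestamp": "2007-03-25 10:15:32",             # date and time
        "user_agent": "Mozilla/5.0 (Windows; ...)",     # browser and OS
        "cookie_id": "740674ce2123e969",                # persistent identifier
        "query": "Cleveland HIV treatment center",      # the search terms
        "clicked_result": "http://example.org/clinic",  # clickstream data
    }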

Personalized Homepage

In May 2005, Google introduced Personalized Homepage, giving users the

ability to customize the default Google home page to display their Gmail inbox,

local weather, local cinema times, news headlines, stock quotes and other

services. 76 The launch of Personalized Homepage represented the first step in a

broader effort at Google, called Fusion, to “bring together Google functionality,

and content from across the Web, in ways that are useful to [its] users” (Google,

2005m). At its launch, Google CEO Eric Schmidt predicted that Personalized

Homepage will become “the definition” of Google and that “it will become a

central part of Google…A majority of people will eventually use Google like

this” (Hafner, 2005).

A user can click the “Personalize this page” link from the Google

homepage to activate Personalized Homepage. Doing so places a unique Web

cookie on the user’s browser with an expiration date of 2038 to ensure that the

personalized settings will remain across browser sessions. To counteract the

possible deletion of cookies, and to provide access to a user’s unique Personalized

Homepage on shared or remote computers, Google encourages the creation and

use of a Google Account, advising users that they can “save this page and take it

with [them]” (Google, 2006n).
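
A sketch of what such a long-lived cookie might look like follows, using
Python's standard http.cookies module. The identifier and the exact date are
invented for illustration, though the far-future expiry mirrors the 2038
expiration described above.

    # Sketch of a long-lived preference cookie; the identifier and exact
    # date are illustrative assumptions, but the far-future expiry mirrors
    # the 2038 expiration described above.
    from http.cookies import SimpleCookie

    cookie = SimpleCookie()
    cookie["PREF"] = "ID=740674ce2123e969"  # hypothetical unique identifier
    cookie["PREF"]["expires"] = "Sun, 17-Jan-2038 19:14:07 GMT"
    cookie["PREF"]["domain"] = ".google.com"
    print(cookie["PREF"].OutputString())
    # e.g. PREF="ID=740674ce2123e969"; Domain=.google.com; expires=Sun, ...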

A number of different “modules” can be customized in a user’s

Personalized Homepage, many of which require the sharing of personal

76
The Personalized Homepage is accessed by visiting
www.google.com/ig.

information or intellectual interests with Google. For example, to deliver local

weather forecasts or movie theater information, the user’s zip code must be

submitted. A user’s political affiliation might be deduced based on whether NPR,

FoxNews or any of the variety of politically-aligned blog portals is selected, and

her religion could be divulged if her Personalized Homepage included one of the

many religiously themed modules, such as the Christian Today or The Hindu.

Similarly, the selection of one of the many foreign language news modules

(which include French, Chinese, Korean, Spanish, Russian, and many more)

could help establish the ethnicity or national origin of a user. A user’s hobbies and

interests might also be identifiable based on his choice of modules: selecting

ESPN might indicate a sports enthusiast, while a technophile might opt for the

Wired News module, and so on. Numerous financial and stock market modules

are available for tracking a user’s stock holdings, and submitting the same address

to the Mapping and Directions module might reveal a user’s home or work

address. Even a user’s sexual orientation could be inferred if any one of the

fifteen different gay-related modules is selected. Given the encouragement and

incentive for users to log into Personalized Homepage with a Google Account, all

such information can be associated with the user’s Google Account profile.

Google Alerts

Delivery of customized content is also available through Google Alerts.

Google Alerts are emails automatically sent by Google when there are new results

for specific search terms selected by a user (Google, 2004a). An e-mail account is

required to set up a Google Alert, and Google encourages the creation of Google

Accounts to better manage multiple Alerts. Users can set up any search query as

an Alert, and track the query results from Google’s index of Web pages, news

articles, or discussion groups. Some of Google’s suggested Alerts include

“monitoring a developing news story, keeping current on a competitor or industry,

tracking medical advances, [or] getting the latest on a celebrity or sports team”

(Google, 2004a).

As with Personalized Homepage, using Google Alerts might reveal

personal and intellectual information to Google. For users curious about their

personal Web presence, a common Alert would include their name or even their

mailing address.77 Users can submit much more detailed search queries than the

general modules provided in Personalized Homepage, such as “alcoholic

anonymous meetings in Milwaukee” or “Internet hacking instructions,” allowing

a more detailed glimpse into their personal and intellectual interests. All of a

user’s Alerts are associated with either a user’s e-mail address or Google

Account.

Image Search

Google’s Image Search service78 allows users to search for images

embedded in Web pages. Launched in mid-2001, Image Search has grown to

77
Google would have no definitive way of knowing it was indeed the
user’s name, although comparison to the user’s email address (commonly a
variant of the account holder’s name) or other information in their Google
Account might allow identification.
78
http://images.google.com

become Google’s second most popular search-related product (Tancer, 2006),

with billions of images indexed and available for viewing (Google, 2005g). Users

can enter image search queries just like traditional Web searches, and the results

are displayed in sets of twenty thumbnail images. Clicking on a thumbnail brings

up a framed display with a slightly larger version of the thumbnail in the top half

of the page, and the Web page on which the image was found in the bottom half.

From the top frame, users can click the thumbnail to display the full-size image,

remove the frame to display the entire page, or return to their search results.

Just as with searching the Web, a user’s image search queries are passed to

Google along with her unique Web cookie, allowing Google to maintain records

of the images searched. The unique design of Image Search, however, also allows

Google to track exactly on which thumbnail image a user clicked. For example, if

a user searches for “Bill Gates” in Image Search, the first result is a thumbnail

image of Mr. Gates’ portrait from Microsoft’s website. Hovering the cursor over

this image result reveals a lengthy URL:

http://images.google.com/imgres?imgurl=http://www.microsoft.com/presspass/
images/gallery/execs/Web/gates-2.jpg&imgrefurl=http://www.microsoft.com/
billgates/bio.asp&h=840&w=600&sz=136&tbnid=QTP5Hbhx7_3soM:&tbnh=143&tbnw=102
&hl=en&start=1&prev=/images%3Fq%3Dbill%2Bgates%26svnum%3D10%26hl%3Den%26lr
%3D%26safe%

Clicking on the result passes this URL to Google’s servers so they can create the

framed display of the Bill Gates photo along with the actual source page. By

logging this URL, Google is able to track both the image the user was interested

in, identified within the URL as

http://www.microsoft.com/presspass/images/gallery/execs/Web/gates-2.jpg,

and the search terms used to find that image (the search terms “bill gates” are

embedded in this part of the URL: =/images%3Fq%3Dbill%2Bgates). Logging this

information, along with the user’s Web cookie or Google Account, allows Google

to track a user’s particular image searches and subsequent search result clicks.
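
The tracking value of this URL format is easy to demonstrate programmatically.
The following sketch uses Python's standard urllib to recover both the clicked
image and the originating query from an imgres-style URL; the URL is
abbreviated from the example above.

    # Recovering the clicked image and the originating query from an
    # imgres-style result URL (abbreviated from the example above).
    from urllib.parse import urlparse, parse_qs

    result_url = (
        "http://images.google.com/imgres"
        "?imgurl=http://www.microsoft.com/presspass/images/gallery/execs/Web/gates-2.jpg"
        "&imgrefurl=http://www.microsoft.com/billgates/bio.asp"
        "&prev=/images%3Fq%3Dbill%2Bgates"
    )

    params = parse_qs(urlparse(result_url).query)
    print(params["imgurl"][0])  # the image the user clicked
    print(params["prev"][0])    # /images?q=bill+gates -- the originating search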

Google Video

Google Video allows users to upload, search, and watch videos stored on

Google’s servers, as well as download video files for viewing on their own

computer. Originally limited to television programs from content providers such

as PBS, Fox News, and C-SPAN, Google Video now includes video files from

hundreds of providers, including content submitted directly by users. Google

Video enables users to search across the closed captioning content and other

metadata to locate relevant video files. For example, entering the search query

“how to pick a lock” will return a list of relevant video clips in which the search

terms were spoken, included in the video’s title, or otherwise indicated in

metadata. While most videos are free, premium content can be rented or

purchased for a fee. In fall 2006, Google agreed to buy rival video sharing site

YouTube for $1.65 billion in stock. While YouTube will remain a separate

service under its own identity, Google video searches include YouTube results as

well (Ackerman & Blitstein, 2006).

Google records the particular search terms, and since the video content is

located on Google servers, it is able to monitor and track on which search results a

particular user clicks, and whether he chooses to download the file. If a user e-

mails a video clip to a friend, Google captures the following information in the

browser command:

docid=759726176973245447&q=how+to+pick+a+lock&from=NettyRoe%40gmail.com
&to=friend%40email.com&msg=I%20found%20this%20cool%20video&sendToSender=false

The docid number identifies the specific video, the &q field identifies the original

search terms, &from and &to are the respective e-mail addresses, and &msg is the

content of the e-mail sent by the user. This detailed information is passed to

Google’s server along with the user’s unique Web cookie.
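
Because these fields are merely percent-encoded rather than obscured, they are
trivially recoverable. The short sketch below decodes the captured query string
from the example above using Python's standard library.

    # Decoding the percent-encoded fields captured when a user shares a
    # video; the query string is adapted from the example above.
    from urllib.parse import parse_qs

    captured = ("docid=759726176973245447&q=how+to+pick+a+lock"
                "&from=NettyRoe%40gmail.com&to=friend%40email.com"
                "&msg=I%20found%20this%20cool%20video")

    fields = parse_qs(captured)
    print(fields["q"][0])     # how to pick a lock
    print(fields["from"][0])  # NettyRoe@gmail.com
    print(fields["msg"][0])   # I found this cool video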

Some content on Google Video requires payment in order to view or

download. To facilitate this, a Google Account is required, and payment

information must be transmitted to Google. When a video is purchased, Google

collects and records information about the transaction, including the file name of

the video purchased, the name of the seller, the transaction amount, and the

payment method used (Google, 2006o). Google also implements digital rights

management measures to monitor copyright-protected content downloaded from

Google Video: The Google Account information of the user downloading

copyright-protected video is embedded, in encrypted form, in the video itself.

When the video is played, the Player sends this encrypted information to Google,

including the identity of the video, informing Google that a particular user is

attempting to play the video and allowing it to confirm that the copy is authorized

(Google, 2006l).79

79
If a user downloads free videos or purchases non-copy-protected videos,
no account information is embedded.

Google Book Search

Google Book Search (originally called Google Print) allows users to

search the full text of books scanned into Google’s database, and, depending on

the book’s copyright status, view either snippets or complete pages. The Google

Book Search service remains in a beta stage but the underlying database continues

to grow, with more than 100,000 titles added by publishers and authors and some

10,000 works in the public domain now indexed and included in search results.

Google also has formed partnerships with several high-profile university and

public libraries, including the University of Michigan, Harvard University,

Stanford University, Oxford University, University of California, University of

Texas, and the New York Public Library, to digitize and make available their

volumes through its Google Book Search.

Because many of the books in Book Search are still under copyright,

Google limits the extent to which a user can view a particular volume. In order to

enforce these limits, users must use a Google Account to access particular pages,

allowing Google to “connect some information – your Google Account name –

with the books and pages that you’ve viewed” (Google, 2005f). Google Book

Search offers links to online booksellers to purchase books viewed; clicking on

these links first routes the browser through Google’s servers before loading the

bookseller’s page, allowing Google to track which online bookstore the user has

decided to visit.

Academic Research

Google Scholar

In November 2004, Google released Google Scholar, a search engine that

indexes the full text of scholarly literature across an array of publishing formats

and scholarly fields. Examining a user’s searches within Google Scholar might

reveal her specific intellectual pursuits, such as “radical feminist theory” or

“strong cryptography.” In addition to tracking specific search terms, Google

tracks the results clicked by searchers in Google Scholar by redirecting all results

through Google’s servers. For example, clicking on a search result for Sergey

Brin and Larry Page’s article “The Anatomy of a Large-Scale Hypertextual Web

Search Engine” sends this command to Google’s Web server:

GET
http://scholar.google.com/url?sa=U&q=http://www.public.asu.edu
/~ychen127/cse591f05/anatomy.pdf

The request for the article is first routed through Google’s server, which then

requests the article download from its home server, allowing Google to track the

results clicked. Google Scholar also allows users to identify their home library

(such as New York University) to determine which journals and papers the library

subscribes to electronically, and then links to articles from those sources when

available. Google records the user’s library selection via the Web cookie.
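
This “log first, then forward” indirection is a simple pattern to sketch. The
hypothetical Python function below illustrates how a /url?q=... redirect
endpoint can record a click before sending the browser on to its real
destination; the function name and log format are assumptions, not Google's
implementation.

    # Hypothetical sketch of a click-tracking redirect endpoint: log which
    # result was clicked, then forward the browser to the real destination.
    from urllib.parse import urlparse, parse_qs

    click_log = []  # stand-in for a server log

    def handle_click(request_url, cookie_id):
        target = parse_qs(urlparse(request_url).query)["q"][0]
        click_log.append({"cookie": cookie_id, "clicked": target})  # record it
        return {"status": "302 Found", "Location": target}          # then redirect

    handle_click("http://scholar.google.com/url?sa=U"
                 "&q=http://www.public.asu.edu/~ychen127/cse591f05/anatomy.pdf",
                 cookie_id="740674ce2123e969")
    print(click_log)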

News and Political Information

Google News

In the weeks following the terrorist attacks on September 11, 2001,

Google reported an increase in news-related searches by a factor of sixty

(Wiggins, 2001). Google’s initial response to this increased demand in news and

current events was to provide Google News Headlines, a page that summarized

top news stories from about 100 different publications. Within a year, this evolved

into Google News, a service that scans over 4,500 different news websites in real

time, determines which news stories are related and then clusters them into related

categories. The top stories are highlighted under common categories such as

“World,” “Science and Technology,” or “Health,” and users can execute search

queries to seek specific news stories.

As standard practice, a user’s search queries within Google News are

passed to Google along with her unique Web cookie, allowing Google to maintain

records of all Google News searches alongside Web and image searches. Users

can also customize Google News with news from 22 regional editions and 10

languages, or include news related to specific keywords of the user’s choosing.

For example, a custom news section can be created for any stories that include the

terms “flag desecration amendment” or “Green Bay Packers.” These preferences

are managed through a Web cookie, but Google also encourages the use of a

Google Account to ensure users can access their customized news from any

computer. In such cases, Google acknowledges that “if you create personalized

news front page settings as part of your Google Account, the settings will be

stored together with your Account information” (Google, 2006h).

Google Reader

Google Reader is a Web-based tool to help users organize and manage

content from regularly viewed websites. Instead of continuously checking favorite

sites for updates, users can subscribe to their Web feeds,80 and Google Reader

will monitor the sites for updates and display new content in the reading list. A

search box is provided to help users find and subscribe to specific Web content,

and users can organize feeds by using descriptive labels and stars for particular

favorites. Feed subscriptions are stored on Google’s servers and can also be

displayed on a user’s Personalized Homepage. A Google Account is required to

use Google Reader, and usage statistics can be recorded in Google’s server logs in

accordance with its privacy policy, including a user’s subscribed feeds, searches

performed, and items read, labeled, and starred.

Blog Search

Based on Google’s Web search technology, Google Blog Search enables

users to search for content published on blogs worldwide. As with most Google

services, any search terms used with Blog Search can be recorded in Google’s

server logs along with the browser’s Web cookie or the user’s Google Account.

Given the role of blogs as “important [media] of self-expression…news, opinion,

and commentary” (Google, 2005e), analyzing a user’s Blog Search history might

80
A Web feed is a coded Web file which contains content from the
originating website, typically summaries of news stories or weblog posts with
Web links to longer versions.

yield search terms with unique affinity to his political, social or cultural beliefs,

such as “arguments against immigration reform” or “media liberal bias.”

Blog Search can be accessed by visiting blogsearch.google.com, but also

appears as an automatic header in many blogs hosted at Google’s Blogger

platform.81 Searching for blog content from this interface sends Google the

referer field. For example, if a user searches for the phrase “marijuana in

Brooklyn” from the Blog Search header that appears on the U.S. Marijuana

Party’s blog, the code Referer: http://usmjparty.blogspot.com/ is sent to

Google along with the search terms and Web cookie, allowing Google to record

that this particular browser made this search from that particular blog.

Communication and Social Networking

Gmail

Gmail is Google’s free Web-based e-mail service, offering users over 2

gigabytes of storage and the ability to search within messages via Google’s Web

search algorithms. Gmail’s large storage capacity eliminates the need for the

typical user to delete e-mails, allowing all messages to be archived on Google’s

servers. Gmail allows users to maintain contact lists on Google’s servers, and

offers the ability to send instant messages directly from the Gmail interface. Users

have the option to save their chat histories on Google’s servers along with their

email messages.

81
The use of Blogger as a communication medium is discussed below.

At its launch, Gmail was heavily criticized for its practices of scanning the

text of incoming messages in order to place context-sensitive advertisements

(Bray, 2004; Clearinghouse, 2004; Electronic Privacy Information Center,

2004b). When viewing e-mail messages in Gmail, advertisements and links to

related pages appear in the right margin of the Gmail interface. Google scans the

text of incoming e-mail messages in order to target the advertising to the user. For

example, if the user is reading an e-mail that contains the text “Atlantic City,”

Gmail might present the user with ads about hotels, casinos, and other websites

related to that travel destination. While Google maintains that “no human will

read the content of your email in order to target such advertisements or other

information without your consent, and no email content or other personally

identifiable information will be provided to advertisers as part of the Service,” the

Gmail terms of use also note that Google may “monitor, edit or disclose your

personal information, including the content of your emails, if required to do so in

order to comply with any valid legal process or governmental request” (Google,

2005d).
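
A deliberately crude approximation of such contextual matching can be sketched
in a few lines. The keyword table and message below are invented for
illustration; Google's actual ad-selection system is proprietary and far more
sophisticated than this sketch suggests.

    # A crude sketch of contextual ad matching: scan message text for
    # known keywords and return the associated ads. All values invented.
    ad_keywords = {
        "atlantic city": ["Atlantic City hotel deals", "Casino packages"],
        "wedding": ["Wedding photographers near you"],
    }

    def match_ads(message_text):
        text = message_text.lower()
        return [ad for keyword, ads in ad_keywords.items()
                if keyword in text
                for ad in ads]

    print(match_ads("Let's plan our trip to Atlantic City next month!"))
    # ['Atlantic City hotel deals', 'Casino packages']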

Further criticism of Gmail centered on a clause in its original privacy

policy stating that “residual copies of e-mail may remain on our systems for some

time, even after you have deleted messages from your mailbox or after the

termination of your account” (Electronic Privacy Information Center, 2004b).

Because electronic communications stored for more than 180 days enjoy lower

protections from law enforcement access (Electronic Privacy Information Center,

2004b), the prospect of indefinite storage of Gmail e-mails raises concerns over

the privacy of users’ communications. Google insists this phrasing in the Gmail

privacy policy was simply “poor wording” (Gillmor, 2004), and that while

Google, like most Web-based e-mail providers, keeps multiple backup copies of

users’ emails so they can recover messages and restore accounts in case of errors

or system failure, deleted e-mails are completely removed from Google’s servers

within 60 days (Gillmor, 2004; Google, 2005c). Even with these commitments, a

user’s deleted Gmail still might remain in Google’s “offline backup systems”

(Google, 2005c), and in at least one case, Google has received a subpoena for the

complete contents of a Gmail account, including deleted e-mail messages

(McCullagh, 2006b).

Groups

Google Groups is a free service enabling users to create and manage their

own email groups and discussion lists. Users can participate in ongoing

discussions related to specific interests, create new groups, and access the Usenet

archive of newsgroups and discussion forums with over 800 million posts on

thousands of topics dating back to 1981 (Google, 2003). While any user can

search and read Google Group discussions, a Google Account is required to post

new messages or create a new discussion group. A user may also create a public

profile which displays her name, nickname, location, title, industry, website or

blog, as well as the most recent posts she made. All postings submitted to Google

Groups are stored and maintained on Google servers, and Google collects and

maintains information about a user’s account activity, including the groups that she

joins or manages, the messages or topics she tracks, her ratings of particular

messages or groups, and her preferred settings when using Google Groups

(Google, 2006g).

Talk

Talk is a Web-based instant messaging and voice calling service offered in

conjunction with Google’s Gmail e-mail service. A Google Account and a Gmail

address are required to access Google Talk, and users have the option of recording

their Talk chat histories within their Gmail accounts. Google records information

about Talk usage, such as when the service is used, contact list members and

those actually communicated with, and the frequency and size of data transfers.

Information displayed or clicked on in the Google Talk interface is also recorded

(Google, 2006j). Google notes that it deletes the activity information associated

with a user’s account “on a regular basis” (Google, 2006j). The frequency of such

deletions remains publicly undefined.

Blogger

Blogger is a Web-based publishing platform providing a simple means of

publishing an online journal or weblog. Originally created by Pyra Labs, Blogger

was purchased by Google in 2003 and existed somewhat separately from other

Google-based services: for example, users of Blogger required a separate login to

access the service. In August 2006, Google updated the service to allow closer

integration with its other products, including the ability for users to use their

Google Account to login and access the Blogger service (Goldman, 2006).

Google encourages the creation of a Blogger profile, which includes information

such as the user’s full name, photograph, birthday, location, and gender, as well as

lists of favorite books, movies, music and so on. All account information and

copies of weblog posts and comments, including drafts, are stored by Google, and

are associated with the Google Account used to access the service (Google,

2005a, 2006b).

Orkut and Dodgeball

Orkut is a social networking service offered by Google, in which users can

list their personal and professional information, create relationships, and join

communities of users with similar interests. While Orkut users originally

maintained unique login accounts, Google now requires the creation of a Google

Account to log in and use the Orkut service (Weinberg, 2005). Google encourages

the creation of an Orkut profile that includes personal information, such as gender,

age, occupation, hobbies, interests and photos. All account information is stored

and maintained by Google, including the e-mail addresses and content of

invitation messages sent by Orkut users (Google, 2005l).

Dodgeball, acquired by Google in 2005, is a location-based social

networking service built specifically for use on mobile devices. Dodgeball users

report their locations through their mobile phones, and the service broadcasts their

location to their network of friends. Users can also send personal messages, check

for addresses and interact with other Dodgeball users through the service. To use

Dodgeball, users must register for a Google Account and provide personal

information including name, email address, home city, gender, and mobile phone

information. Dodgeball logs all text messages sent through the service, including

those indicating a user’s location (Google, 2006c).

Personal Data Management

Google Calendar

Google Calendar is a Web-based time-management tool. User events are

stored on Google’s servers, and can be accessed from any computer through a

Google Account. To activate the service, a user must provide a first and last

name, preferred default language, and time zone. Google Calendar is closely

integrated with Gmail: When an e-mail that contains trigger words (such as

“meeting,” or dates and times) arrives, Gmail displays an “add to calendar” button

to encourage use of the service. Users can also share their calendars publicly or

send invitations to other users regarding specific calendar events.

If a user deletes events from her calendar, Google acknowledges that

complete removal of the event information from their servers “may not be

immediate,” and residual copies of calendar information may remain on backup

media (Google, 2006d). In accordance with its privacy policy, Google also

records usage statistics from Google Calendar, such as when and for how long the

service is used, the frequency and size of data transfers, and the number of events

and calendars created. Information displayed or clicked on while in a user’s

Google Calendar account (including user interface elements, ads, links, and other

information) is also recorded, as is information associated with invitations sent to

other users regarding events, including email addresses, dates and times of the

events, and any responses from invitees. Google permanently deletes usage

statistics associated with a user’s account every ninety days, but retains aggregate

information for an unspecified period (Google, 2006d).

Financial Data Management

Google Finance

In 2006, Google launched Google Finance, offering news and financial

information about stocks, mutual funds, and public and private companies. Along

with normal collection of a user’s search activity within Google Finance, a unique

Web cookie is also utilized to provide a Recent Quotes feature, allowing users

to keep track of the stocks and mutual funds recently searched for and viewed. Using

their Google Accounts, users can create a permanent Google Finance portfolio to

keep track of financial information, including how many shares are owned and at

what price, for up to 200 stocks or mutual funds (Google, 2006f). Google Finance

also features a discussion board, which requires a Google Account. Discussion

forum participants are encouraged to create a Finance profile, which might

include personal information such as a user’s full name, location, industry, or a

link to their website or blog. Google employees moderate all posts to the

discussion group, and users’ names and e-mail addresses are displayed with their

posts (Google, 2006f).

Shopping and Product Research

Catalog Search and Froogle

Google’s first entrance into e-commerce was its Catalog Search service,

allowing users to search and browse more than 6,000 mail order catalogs archived

on Google’s servers as scanned image files. Through the use of character

recognition, users can search for a text string in these catalogs in a fashion similar

to how they would search for materials on the general Web, and matching results

are displayed as thumbnails of the catalog’s printed pages. Google later launched

Froogle, giving users the ability to search through a database of online retailers,

find multiple sources for specific products, and deliver details, images and prices

for the items sought. Google logs the search queries and results clicked for both

services in order to load the proper catalog page or product detail with a list of

online retailers.82 When using Froogle, users can connect directly to the retailer’s

website to purchase an item. For example, a search for “sniper rifle” might send

users to a page describing the Tokyo Marui PSG-1 Airsoft Sniper Rifle, with a

link to the Supply Tent online retailer. Clicking on the link sends the following

browser instruction:

GET http://froogle.google.com/froogle_url?q=http://www.supply-tent.com/
product_info.php%3Fproducts_id%3D248&fr=ANUtWOyKpinjJMKIH-nQCMyR5l5J8b94
V3vCeNWMfGyxAAAAAAAAAAA&ei=D2p3ROy-DZ_kqwLnkM1y&sig2=qwb6mFSHSQNzewweBrZFnQ

Before loading the product page at Supply Tent, the request is routed

through Google’s server, allowing Google to track which store the user has

82
Froogle searches are also included in a user’s Search History if that
service is activated.

decided to visit for more information. Users can also create shopping lists on

Froogle to save and share product information. Google stores users’ shopping list

information on its servers, associated with their Google Accounts.

Google Local & Google Maps

A recent enhancement to search engine technology is the ability to process

a locality parameter (such as street address, city name or ZIP code) and provide

search results based upon websites of businesses that have physical addresses

located within the specified area. In spring 2004, Google launched its Google Local

service, which provided both a dedicated portal to search within a specific

locality, and relevant local search results at the top of a Web search results page

(if a locality can be deduced from the search terms) (Google, 2004b). A year later,

Google launched Google Maps, a dynamic online mapping feature for viewing

detailed street information and satellite images and for creating driving

directions. Today, the two products have merged into one service, allowing users

to find local search and mapping information in one place (Google, 2005h).

As with Web search queries, any location-specific search parameter used

in a Google Local or Google Maps search request is sent to Google along with a

user’s Web cookie and can be stored alongside all other search query information

in Google’s server logs. Recognizing that many users search for information near

their homes or workplaces, a default location can be set as the starting

point for the next location-specific search. To facilitate this, Google adds a

location-specific parameter to the Web cookie that is passed to the browser. For

example, if a user sets “239 Greene Street, New York, NY 10003” as their default

location, the following code is added to the PREF Web cookie:

L=0vSX508Toojyru7zEvby35nyXl_cymLkB

Each location has its own L parameter setting, and by adding location data

to the Web cookie, Google can offer location-specific results with traditional Web

search queries.
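
Assuming the PREF cookie is a colon-delimited list of key=value pairs, as the
example above suggests, the sketch below shows how such a value could be split
into its parameters; the ID and TM values are hypothetical, while the L value
is taken from the example above.

    # Splitting a PREF-style cookie value into its parameters, assuming a
    # colon-delimited key=value layout; ID and TM are hypothetical values.
    pref = ("ID=740674ce2123e969:TM=1146030215"
            ":L=0vSX508Toojyru7zEvby35nyXl_cymLkB")

    params = dict(pair.split("=", 1) for pair in pref.split(":"))
    print(params["L"])  # the opaque default-location setting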

Computer File Management

Google Desktop

In October 2004, Google released the first version of Google Desktop

Search, a free downloadable application for locating personal computer files using

Google’s search technology. By 2006, supported files included a user’s e-mail

messages (if stored on the user’s computer), her Web history, Microsoft Office

documents, instant messenger chat histories, PDF files, as well as music, video,

and image files. After installation, the software completes a full indexing of these

files on the user’s computer. (After the initial indexing is completed, the software

continues to index files as needed.)83 When performing searches, the user receives

results in the browser on a Desktop search results page much like the results for

Google Web searches.

While the Desktop Search software resides on a user’s local computer

hard drive, and searches performed do not connect to Google’s online servers, if a

user performs a traditional Web search from the Desktop Search results page, the

83
Only the first 10,000 words of each file, and only the first 100,000 files
are indexed (http://desktop.google.com/support/bin/answer.py?answer=13754).

original Desktop search terms are passed to Google within the referer field. For

example, suppose a user first searches for “donations to Republican Party” on

Desktop Search to help locate a spreadsheet file on the user’s computer. After

seeing the results page, the user takes advantage of the Web search interface at the

top of the results page to perform a traditional Web search by clicking the

conveniently placed “Web” link. When the new search is executed, the following

code is sent to Google’s servers along with the new search terms and Web cookie:

Referer:
http://127.0.0.1:12758/search?q=donations+to+republican+party
&flags=32&s=k1C7ekMgaBHFPEm2aaPHD-8pyEw

Even though the user had searched her desktop files for documents referencing

contributions to the Republican Party, the subsequent Web search provides that

potentially personal information to Google. Desktop Search also offers spelling

suggestions for searches conducted from the Sidebar, Deskbar, or Floating

Deskbar interfaces. In order to provide these suggestions, all Desktop search

terms are automatically sent to Google’s online servers for processing.84

In early 2006, Google released a new version of Google Desktop Search

with a “Search Across Computers” function allowing users to search and access

information from all of their computers that have Google Desktop installed. Once

enabled,85 each authorized computer’s file index is stored on Google’s servers. As

Google explains:

84
This feature is automatically activated in Google Desktop 3.0, but can
be turned off via the Desktop Preferences control panel.
85
The Search Across Computers feature is not automatically activated and
must be enabled and authenticated through the Google Desktop preferences. A
Google Account is required to activate and access the service.

This is necessary, for example, if one of your computers is turned off or
otherwise offline when new or updated items are indexed on another of
your machines. We store this data temporarily on Google Desktop servers
and automatically delete older files, and your data is never accessible by
anyone doing a Google search. (Google, 2006e)

To help protect user privacy, the data is encrypted in transmission and while

stored on Google servers, and is retained for only 30 days. However, privacy

concerns persist, typified by this warning from the Electronic Frontier

Foundation:

If you use the Search Across Computers feature and don’t configure
Google Desktop very carefully—and most people won’t—Google will
have copies of your tax returns, love letters, business records, financial
and medical files, and whatever other text-based documents the Desktop
software can index. The government could then demand these personal
files with only a subpoena rather than the search warrant it would need to
seize the same things from your home or business, and in many cases you
wouldn’t even be notified in time to challenge it. Other litigants—your
spouse, your business partners or rivals, whoever—could also try to cut
out the middleman (you) and subpoena Google for your files. (Foundation,
2006)

It is unclear whether the data stored on Google’s servers are retained on “offline

backup systems” past the 30-day window (similar to Gmail messages), or whether

Google employees are able to decrypt the files if they are subpoenaed or

requested for other uses.

Internet Browsing

Bookmarks

Google Bookmarks allows users to save and organize their bookmarked

Web pages on Google’s servers. Users can create Bookmarks by “starring” a page

from their Search History, through a Bookmark javascript added to their Web

browser, or via the Google Toolbar. Bookmarks can be accessed from any

computer by logging into a Google Account, and can also be added to a user’s

Personalized Homepage. Clicking on a Google Bookmark for The New York

Times website, for example, sends the following browser command:

http://www.google.com/bookmarks/url?url=http://www.nytimes.com&ei=m-OBROu3
DIre4QGE38ygDA&sig2=N_8dXWKvMIvdU6iQUVKujA&zx=JzzAy4Ao9gM&ct=b

The request to load The New York Times webpage is first routed through Google’s

server, allowing Google to track when a particular bookmark is clicked.

Notebook

Google Notebook is a browser tool that provides users with the means to

save and organize notes while browsing online. Users can clip text, images, and

links from Web pages, save them to an online “notebook” that is accessible from

any computer, and share them with other users. A Google Account is required to

use Notebook, and all notes and annotations are stored on Google’s servers.

Similar to its other services, Google automatically records users’ Google

Notebook account activity, including storage usage, number of log-ins, data

displayed or clicked on, and other log information (Google, 2006i).

Google Toolbar

Many of the information retrieval services described above have been

integrated into the Google Toolbar. Google Toolbar is a browser plug-in allowing

users to perform Google searches and other functions without visiting the Google

homepage, either using the toolbar’s search box or right-clicking on text within a

Web page. The Google Toolbar has been downloaded by millions of users86, and

Google has partnerships with Sun Microsystems to package and distribute

Toolbar with Sun’s popular Java Web software87 (Shankland, 2005), as well as

with Dell computers to pre-install the Toolbar in all Dell personal computers

(Olsen, 2006).

If users are running Google Toolbar with certain advanced features

enabled, a considerable amount of information about all webpages viewed is sent

to Google’s servers. The advanced features of the Google Toolbar are PageRank,

AutoLink, SpellCheck, and WordTranslator (Google, 2006q). The PageRank

feature provides a proxy PageRank calculation for a particular website.88 To do

so, Toolbar sends Google the addresses of every website visited by the user. The AutoLink

feature scans the content of a visited webpage. If Google recognizes certain types

of information on the page (addresses, ZIP codes, ISBN numbers, etc.) AutoLink

automatically adds relevant links to the webpage. The SpellCheck feature

monitors the words users type into Web forms in order to correct any spelling

mistakes. Finally, WordTranslator sends to Google the English words that users

identify with the mouse and provides translations into Chinese, Japanese, Korean,

86
Download statistics from Google’s webpage are not available, but over
3 million downloads of the Toolbar have been recorded at Download.com alone
(http://www.download.com/Google-Toolbar/3000-2379_4-10056938.html).
87
The Java Runtime Environment is downloaded 20 million times per
month (Shankland, 2005).
88
PageRank is a patented algorithm for calculating the relative importance
that Google assigns to a page.

French, Italian, German, or Spanish. With these various tools activated, Google is

able to collect information on virtually every webpage a user visits.89

The Toolbar’s Safe Browsing feature alerts users if a Web page appears to

be asking for personal or financial information under false pretenses. When used

in Enhanced Protection mode, the Toolbar will send the URLs of all pages visited

and information about the page to Google for evaluation. When the user is warned

about a suspicious site, Google will log that site’s URL and the user’s decision to

accept, reject, or close the warning message. Toolbar also features a pop-up

blocker that monitors all Web pages visited in order to prevent unwanted

advertising from appearing in additional browser windows. Additionally, when

Toolbar sends a website’s URL to Google, it is possible that the URL may itself

contain additional personal information. For example, when a user submits

information to a Web page (such as a login ID or registration information), the

website might “embed” that personal information into its URL, typically after a

question mark (“?”). When the URL is transmitted to Google, its servers

automatically store the URL, including any personal information that has been

embedded after the question mark.

With Toolbar, users can e-mail Web content and URLs to other users with

the “Send to Gmail” button, or automatically create blog posts with the “Send to

Blogger” option. Users can also automatically create Bookmarks from the

89
Specifically, AutoLink and SpellCheck only send snippets of text when
their respective buttons are clicked, and WordTranslator sends words for
translation only when the mouse is hovered over the text. When
PageRank is activated, Google automatically receives the URLs for all webpages
visited.

Toolbar. To use these features, users’ Google Account or Blogger account

information is required, and it might be possible to associate the information that

Toolbar sends with these other Google accounts (Google, 2006k). Users can also

send text messages directly from the Toolbar; when that feature is used, Google

logs the number and carrier to which the message is sent, and in some cases may

record the text itself for “debugging purposes” (Google, 2006p).

Web Accelerator

Google Web Accelerator is a downloadable application designed to speed up page

load times for faster Web browsing. While not directly related to Web searching

or information organization, Web Accelerator takes advantage of Google’s

computer infrastructure to make Web pages load faster. The software

accomplishes this through various means: by sending page requests through

Google servers dedicated to handling Google Web Accelerator traffic, storing

copies of frequently viewed pages to make them quickly accessible, downloading

only the updates if a Web page has changed slightly since it was last viewed,

prefetching certain pages onto a user’s computer in advance, and implementing

other data management and compression techniques.
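As a rough sketch of one of these techniques, the fragment below illustrates conditional revalidation, in which a proxy asks the origin server whether a cached page has changed before re-downloading it. The in-memory cache is a toy stand-in, and the code does not represent Google's actual implementation:

    import urllib.error
    import urllib.request

    cache = {}  # url -> (etag, body): a toy stand-in for the proxy's page store

    def fetch(url: str) -> bytes:
        headers = {}
        if url in cache:
            # Ask the origin server whether the page changed since it was cached.
            headers["If-None-Match"] = cache[url][0]
        request = urllib.request.Request(url, headers=headers)
        try:
            with urllib.request.urlopen(request) as response:
                body = response.read()
                cache[url] = (response.headers.get("ETag", ""), body)
                return body
        except urllib.error.HTTPError as err:
            if err.code == 304:  # 304 Not Modified: serve the cached copy
                return cache[url][1]
            raise

The privacy-relevant point is that every such exchange also tells the intermediary exactly which page was requested, and when.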

When using Web Accelerator, all non-secure Web page requests are

routed through Google’s servers, along with information such as the date and time

of the request, the user’s IP address, and computer and connection information.

Google stores and uses this information to help predict and prefetch additional

relevant Web content. Depending on how particular websites are set up, it is

possible that personally identifiable information embedded in the URL might also

be processed through and stored within Google’s servers. Google might also

temporarily cache other sites’ Web cookies when prefetching certain page

requests (Google, 2006m).


APPENDIX B

A THOUGHT EXPERIMENT: LIBBY AND NETTY’S


INFORMATION-SEEKING ACTIVITIES

This thought experiment features two ideal-typical information seekers,

Elizabeth “Libby” Doe and Annette “Netty” Roe. Libby and Netty are nearly

identical in their personal, social, political, cultural, and economic characteristics.

Both are 30-year-old, single, gay South Asian women. Both are Hindu, live in

Brooklyn, New York, and tend to vote for Democrats. Libby and Netty are

graduate students at New York University, studying political science and feminist

theory. They enjoy sports and cooking as hobbies; both are thinking of having a

baby, but have concerns because they are diabetic. They have similar investment

portfolios, enjoy keeping in touch with friends, and like to share photos and

stories from their travels.

The two differ, however, in how they navigate their “informational

spheres.” Libby prefers traditional, “old-fashioned” methods of information-

seeking and communication: reading print newspapers, watching television news,

word-of-mouth, written, and oral correspondence. While not averse to using the

Internet, when Libby needs to find information on a topic, she prefers visiting the

library. Netty, on the other hand, relies heavily on the Internet to manage

information and communicate with others. When Netty needs information about a
topic, she “Googles” it. In fact, Netty relies on Google’s broad array of products

and services for virtually all of her online activities.

When navigating their respective “spheres of information,” both Libby

and Netty inevitably share bits of personal information with others. The following

sections will describe these flows of personal information within each of the nine

distinct contexts of information-seeking identified in the previous chapter,

allowing comparison between the personal information shared by Libby through her traditional information-seeking methods and that shared by Netty, who relies almost exclusively on “planet Google” to access and organize information.

General Information Inquiries

Information Practices

Libby’s primary source for information is the library. Visits to the local

branch of the Brooklyn Public Library are almost a daily occurrence; many of the

staff librarians greet her by her first name when she arrives. Libby often browses

the new fiction shelves, flipping through books of interest, occasionally checking

out one or two. Lately, Libby has been searching the library’s computerized card

catalog for resources on pregnancy and childbirth, as well as how diabetes might

become a complication. She has found useful books at the library, reading some

there, checking others out for use at home. Libby often complements her visits to

the library by spending afternoons at a local bookstore, browsing their books and

magazines. Libby does use the Internet and search engines to help find

information, typically using the computer workstations at the library or at school.

In all, Libby relies on the library’s judgment, and prefers to find her information

there.

Netty, on the other hand, relies almost exclusively on the Internet for her

information and research needs, and often explores the Web just to see what kind

of new and interesting things she can discover. A dedicated Google search engine

user, Netty has learned to trust its results, and has integrated a variety of its

products and services into her Web searching practice. She likes to take advantage

of Google’s Personalized Search and Search History services to help improve her

search results and recall past searches. She also frequently uses Google’s specialty

search products for images, videos, and books. Recently, Netty has searched for

information on pregnancy and childbirth, and has found helpful sites on

cooking and her favorite sports. She also takes advantage of Google Alerts, a

service that sends her an e-mail whenever new pages are added to Google’s index

that match certain search queries, such as “diabetes and pregnancy.” Netty enjoys

browsing through bookstores, and occasionally visits the public library, but she

remains dedicated to the wealth of information at her fingertips via the Internet

and Google.
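The matching step behind such an alerts service can be sketched in a few lines. The following is a naive, hypothetical illustration rather than Google's implementation, and the e-mail address and query are invented:

    # Saved alert queries, keyed by the subscriber's e-mail address.
    saved_queries = {"netty@example.com": ["diabetes and pregnancy"]}

    def addresses_to_notify(new_page_text: str) -> list:
        # When a new page enters the index, notify every subscriber whose
        # saved query appears in it (a simple substring match for brevity).
        text = new_page_text.lower()
        return [email for email, queries in saved_queries.items()
                if any(query in text for query in queries)]

    print(addresses_to_notify("New study of diabetes and pregnancy outcomes"))

Note that operating such a service requires the provider to retain each subscriber's standing queries, which are themselves revealing pieces of personal information.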

Information Flows

Libby might interact with a variety of receivers of information in her

general information-seeking activities, including librarians, booksellers, or other

merchants (newsstands, etc.). The type of information Libby shares with these

agents is generally limited to the titles of the reading material borrowed or

purchased. While it would be extremely rare for a librarian to require that a user provide personally identifiable information in order to browse a book, some personal information must be shared in order to check out materials. In such cases, however, librarians are guided by a strict code of ethics dictating the transmission principles of patron data. Similarly, while Libby can browse books and magazines at the bookstore anonymously, some personal information must be shared in order to purchase an item (unless she pays with cash). Using the computers at school does not require a login, so little identifiable information is shared.

Netty’s general information-seeking activities, however, represent a shift

in these informational norms. Rather than the disparate set of agents with whom

Libby interacts, Netty relies almost exclusively on Google, and any personal

information shared across her information-seeking actions is collected and

aggregated by the search engine company according to the transmission principles

outlined in its privacy policy. This collection is made automatic and constant

through the Web cookies associated with Netty’s Google account, allowing the

creation of a centralized database of Netty’s search behavior.
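The aggregation logic can be sketched in miniature. The cookie value and field names below are invented and do not reflect Google's actual data schema, but they show how a single persistent identifier suffices to join otherwise separate service events into one behavioral profile:

    from collections import defaultdict

    # cookie_id -> chronological list of (service, detail) events
    profiles = defaultdict(list)

    def log_event(cookie_id, service, detail):
        # Events from any Google service join the same profile because the
        # browser presents the same cookie with every request.
        profiles[cookie_id].append((service, detail))

    log_event("cookie-a1b2c3", "web_search", "query: diabetes and pregnancy")
    log_event("cookie-a1b2c3", "news", "clicked: article on gay rights")
    log_event("cookie-a1b2c3", "scholar", "viewed: feminist theory paper")
    print(profiles["cookie-a1b2c3"])  # one centralized record of Netty's behavior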

Academic Research

Information Practices

As with her general information needs, Libby relies heavily on libraries

for her academic research, with the New York University library system

supplementing the Brooklyn Public Library. Libby makes use of the research

librarians at NYU to help find resources, and she often reads and checks out

books related to her studies in political science and feminist theory. She also

frequently uses the university’s computers, both at the library and in her

department, to find the most recently published articles in her field.

While she shares Libby’s use of the printed political science canon, Netty

also relies heavily on the Internet for her academic research. Like Libby, Netty

utilizes the NYU libraries and the access they provide to online academic

journals. However, Netty is also a frequent user of Google Scholar, a specialized

search engine that indexes the full text of scholarly literature across an array of

publishing formats and academic fields. By identifying New York University as

her home library in Google Scholar, Netty can easily determine which journals

and papers the library subscribes to electronically, and simply click the link

provided by Google to view individual articles.

Information Flows

Similar to her practice with general information inquiries, Libby might

interact with a reference librarian for her academic research. The type of

information Libby shares with a librarian is limited to the titles of the reading

material borrowed. It would be extremely rare for a librarian to require that a user provide personally identifiable information in order to help guide

her research. Again, the librarian is guided by the ALA Code of Ethics regarding

any patron information received.

Netty’s frequent use of Google Scholar results in the automatic and

constant collection of the searches and documents viewed in support of her

research agenda. Her affiliation with New York University can also be logged in

Google’s data files.

News and Political Information

Information Practices

Libby keeps up-to-date on news through a combination of newspapers,

magazines and television. She subscribes to the printed version of the New York

Times and reads it daily on the subway. During visits to the library, Libby often

reads India’s English-language newspaper The Hindu, as well as The Advocate, a

gay news magazine. Libby also likes to keep up to date on political issues and

commentary through additional publications. She subscribes to The New Yorker,

and often glances at issues of Harper’s and The Nation while visiting a bookstore.

Libby also flips through sports and health related magazines to stay current on

news related to those aspects of her life. Along with the New York Times and the

New Yorker, Libby frequently picks up a free copy of the Village Voice and other

free neighborhood publications to remain abreast of local news events and

activities. In addition to these print sources, Libby also watches local television

news on a nightly basis for news and weather updates. She occasionally

watches cable news providers as well, especially for breaking events, and listens

to National Public Radio every morning. Finally, Libby does get some news from

Internet sources, but tends to view only the websites of her other news sources,

such as NewYorkTimes.com, CNN.com or ESPN.com.

Not surprisingly, Netty stays current on news and political events via the

Internet. Her starting point for news information is Google News, a service that

scans over 4,500 different news websites in real time and organizes them in

categories such as “World,” “Sports,” or “Health.” Netty often searches for

particular topics within Google’s index of daily news stories, and takes advantage

of the ability to create custom sections for articles that match certain keywords

such as “gay rights” or “Brooklyn.” She also uses Google’s Personalized

Homepage product to have access to her preferred news stories right on the main

Google search page. She has activated news modules delivering the New York

Times headlines, gay and lesbian coverage from The San Francisco Chronicle,

and articles from the online version of The Hindu. Netty also has local New York

sports and weather modules activated on the homepage. To stay up-to-date with

breaking news, Netty has subscribed to receive e-mail Alerts when certain phrases

appear in news stories over the course of the day.

Along with traditional media outlets, Netty also reads many weblogs for

news and political commentary. To find relevant content published on blogs

worldwide, she frequently uses Google’s Blog Search service. Due to the large

number of blogs she likes to follow, Netty also utilizes Google’s Reader service,

which allows her to subscribe to various blog feeds and read them all from

Google’s interface.

Information Flows

The magazines and newspapers to which Libby subscribes possess her

mailing address and billing information to fulfill those subscriptions. They do not,

however, have any means of knowing what articles she reads, whether she writes

notes in the margins, copies pages, tears them out to share with others, and so on.

She remains completely anonymous in terms of the other magazines and

newspapers she reads at the library or at newsstands, as well as the news and

political information she receives from television or radio. When she does visit

news-related websites, tracking cookies from those sites might monitor her

activities.

Netty’s reliance on Google News allows Google to track all of her news-

related search queries, as well as which articles she clicks on to read. Google can also gain insight into her interests based on the keywords she selects for

Alerts, the feeds she subscribes to on the Reader, as well as the commentary she

might search for through its Blog Search service.

Communication and Social Networking

Information Practices

Like most students, Libby utilizes her NYU e-mail account for academic-

related communication. While she also exchanges occasional e-mails with friends and family from her NYU account, Libby is a habitual letter writer,

preferring to send notes and the occasional photos via regular mail. She also likes

mailing postcards from her travels. Libby does not use instant messenger, but

does make frequent cell phone calls to keep in touch with her friends.

Netty uses Google’s Gmail e-mail service for all her e-mail needs; all

messages sent to her NYU account are automatically forwarded to her Gmail

account.90 Netty communicates with many of her friends through instant

messaging, using both AOL’s Instant Messenger service as well as Google’s Talk

messaging system. Netty often e-mails photos of her travels to friends, which is

easy for her to do with Google’s Picasa photo management software, and she has

started to experiment with Google’s new Hello instant messaging service for

photos. Netty has embraced some of the latest interactive and self-publishing

communication tools, including discussion groups and blogs. She is an active

member in various Google Groups, engaging in online discussions about political

theory, Hinduism, and pregnancy, and she uses Google’s Blogger publishing

software to maintain a personal blog, posting daily comments and observations on

these topics and other events from her daily life. Netty also has accounts with

various online social networking sites, including Google’s Orkut service, where

she finds people and joins communities who share her hobbies and interests.

Finally, Netty has signed up for Google’s dodgeball service, which allows her to

quickly communicate and coordinate social activities with her friends via her

mobile phone.

90. Netty signed up for Gmail using Google’s mobile phone text-messaging feature, which also allowed her to associate her mobile phone number with her Google Account in order to use Google Mobile services as they become available.
Information Flows

While Libby’s e-mail traffic can be tracked by NYU, the majority of her

communication and social interactions go unrecorded by third parties, known only to the recipients of her letters or postcards. Similarly, her cell phone provider tracks her

usage for billing purposes, but the content of the conversations remains private.

Netty’s use of Gmail means all of her incoming messages are scanned for

placement of contextual ads, and any clicking of those ads is tracked by Google.

Her list of friends and IM messages are also archived on Google’s servers, as are

the e-mails and messages sent to friends through Picasa or Hello. Google also

retains a record of all of Netty’s activity in her various Groups, including what

messages she clicks on to read, as well as those she posts herself. A log of all her

blog posts is maintained, as is the activity she engages in via Orkut or dodgeball.

Personal Data Management

Information Practices

Libby relies on a written date book to manage her personal and

school schedules. While she receives many notices of events and activities via e-

mail (especially school-related), Libby transcribes them to her calendar. She also

uses her date book to manage to-do lists and contact information for family,

friends and colleagues.

Netty maintains a calendar for her personal events and activities online

using Google’s Web-based Calendar product, which can send reminders to her

mobile phone. Netty also keeps a contact list of her friends and colleagues in her

Gmail account for access from any computer.

Information Flows

Since Libby keeps her personal data offline, they remain almost entirely

private; someone would have to gain physical access to her date and address

books in order to discern any information.

Most of Netty’s contact and calendar information is stored on Google’s servers, as is the cell phone number she registered in order to receive alerts.

Financial Data Management

Information Practices

Libby receives monthly paper statements to help track her small

investment portfolio, and occasionally accesses the brokerage website to perform

routine account maintenance.

Netty uses Google Finance to manage her personal stock portfolio, and has

added a financial module to her Personalized Homepage for convenient viewing

of her portfolio. She often reads financial news from the Google Finance

interface, and occasionally participates in discussion forums related to her

holdings.

Information Flows

Other than the brokerage companies, which are bound by strict customer-privacy laws, no one has access to Libby’s financial information.

Netty’s financial interests are stored on Google’s systems, and her

financial research activities can be tracked as well.

Shopping and Product Research

Information Practices

Libby does the vast majority of her shopping and product research at

traditional retail storefronts. She often uses her frequent shopper card when

making purchases at Barnes & Noble or other retailers. She frequently browses

popular magazines for shopping ideas, and occasionally uses the websites of

popular stores to help compare products, such as Target.com or Amazon.com.

However, she prefers to purchase at stores so she can examine the products in

person.

Netty shops online whenever possible, and frequently relies on Google’s

Froogle shopping engine to search through a database of online retailers and

auctions. Froogle allows Netty to find multiple sources for specific products,

examine product information and images, and compare prices for the items

sought. Sometimes, Netty simply clicks on the link provided by Froogle to go to

the online seller’s site and complete a purchase. Other times, she performs a

traditional Google search for the item to see whether other websites have

information on the product. Netty also occasionally clicks on the “sponsored

links” that appear in the margins of her Google search results if the link appears to

have relevance to her product search.

Information Flows

While Libby can “window shop” anonymously, retailers can track her

purchases for marketing purposes. Her purchasing habits at Barnes & Noble, for

example, can be monitored through her frequent shopper card. Any of her

shopping performed online can be tracked by those websites.

Since Netty shops online with greater frequency, almost all of her

shopping activities can be monitored and tracked, including simple product

searches on Froogle (not just purchases). Google also tracks her clicks on

sponsored links.

Computer File Management

Information Practices

Libby has a home desktop computer for schoolwork and general Internet

use. She relies on the traditional “folder system” of her Windows operating

system to organize and access her computer files. If she needs to work on files

away from home, she copies them onto a USB flash drive for portability.

Understanding the importance of backing up files, Libby also frequently copies

her most important documents to CDs for storage.

Netty uses Google’s Desktop Search to navigate the multitude of files on

her computer, freeing her from having to remember in which directory or folder

she saved them. Using the familiar Google search interface, she can quickly locate

any of her Microsoft Office documents or archived PDF files, as well as music,

video and image files. When performing a desktop search, Netty also often

receives sites from her search history in the results. Because she also frequently

works on files from a computer in an office on campus, Netty takes advantage of

Desktop Search’s “Search Across Computers” function so she can access her home

computer files remotely.

Information Flows

Unless someone gains physical access to Libby’s computer or backup

media, her computer files are not shared with any third party. Netty’s use of

Desktop Search means some of her computer file searches can be logged by

Google. Her entire index of files is also stored (in encrypted form) on

Google’s servers so she can access files remotely from any computer.
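What such an index of files contains can be sketched with a toy inverted index, which maps each word to the files containing it. This is an invented illustration, not Google's actual format, but it conveys what must be replicated server-side for remote search to work:

    from collections import defaultdict

    # word -> set of file paths containing that word (a minimal inverted index)
    index = defaultdict(set)

    def add_document(path, text):
        for word in text.lower().split():
            index[word].add(path)

    add_document("C:/docs/thesis_notes.txt", "feminist theory reading list")
    add_document("C:/docs/health.txt", "diabetes and pregnancy questions")
    print(index["diabetes"])  # {'C:/docs/health.txt'}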

Internet Browsing

Information Practices

While Libby tends to prefer visiting the library, reading the printed

newspaper, and shopping inside retail stores, she is not completely averse to

surfing the Internet. As noted above, she occasionally browses the Web from her

home computer and bookmarks frequently visited websites. She takes advantage

of the anti-spyware software automatically provided by Microsoft, and uses the

Mozilla Firefox browser based on a recommendation from a friend. In general,

however, Libby is considered a novice Internet user, utilizing it for only the most

basic of tasks.

As a heavy Internet user, Netty takes advantage of many browsing-related

services offered by Google. Along with taking advantage of Google’s Search

History service described above, Netty uses Google Bookmarks to save and

organize her bookmarked Web pages on Google’s servers, with easy access

on her Personalized Homepage. Netty is also experimenting with Google’s new

Notebook browser tool that provides her a way to save and organize notes while

browsing online. With Notebook, she can clip text, images, and links from Web

pages while browsing, and save them to her online “notebook” for easy access

from any computer.

Many of the information tasks described above have been integrated into

the Google Toolbar, which Netty has installed in her Web browser to make

using Google’s services easier. From the Toolbar’s search box, she can perform

searches from many of Google’s sites and receive useful suggestions based on

popular Google searches. Netty frequently uses other helpful Toolbar buttons to

quickly bookmark a page, share Web pages via e-mail, send a text message, create a blog entry, subscribe to a site’s news feed, and even check the

PageRank of the site she is visiting. Google Toolbar stores Netty’s address and

credit card information, enabling her to fill out Web forms with a single click. It

also includes a Safe Browsing feature to warn Netty if a website appears to be

asking for her personal or financial information under false pretenses. Finally,

309
Netty has installed Google’s Web Accelerator application to help make the Web

pages she views load faster.

Information Flows

As with most Web users, Libby’s Internet activities can be tracked and

logged by the various sites she visits via their Web cookies. The same applies for

Netty’s surfing activities, with the addition of her bookmarks and browsing notes

also being stored on Google’s servers. Further, her use of the Toolbar and Web

Accelerator allows Google to monitor and log every website she visits.
