Communications of the ACM
cacm.acm.org 01/2011 Vol. 54 No. 01
A Firm Foundation for Private Data Analysis
Is Virtualization a Curse or a Blessing?
Follow the Intellectual Property
An Interview With Fran Allen
ACM’s FY10 Annual Report
Association for Computing Machinery
ACM TechNews Goes Mobile
iPhone & iPad Apps Now Available in the iTunes Store
ACM TechNews, ACM’s popular thrice-weekly news briefing service, is now
available as easy-to-use mobile apps downloadable from the Apple iTunes Store.
These new apps allow nearly 100,000 ACM members to keep current with
news, trends, and timely information impacting the global IT and Computing
communities each day.
The Apps are freely available to download from the Apple iTunes Store, but users must be registered
individual members of ACM with valid Web Accounts to receive regularly updated content.
http://www.apple.com/iphone/apps-for-iphone/ http://www.apple.com/ipad/apps-for-ipad/
acm technews
membership application & digital library order form
Advancing Computing as a Science & Profession
Priority Code: AD10
Special rates for residents of developing countries: http://www.acm.org/membership/L2-3/
Special rates for members of sister societies: http://www.acm.org/membership/dues.html
Please print clearly
Name
Address
City State/Province Postal code/Zip
Country E-mail address
Area code & Daytime phone Fax Member number, if applicable
Purposes of ACM
ACM is dedicated to:
1) advancing the art, science, engineering, and application of information technology
2) fostering the open interchange of information to serve both professionals and the public
3) promoting the highest professional and ethics standards
I agree with the Purposes of ACM:
Signature
ACM Code of Ethics: http://www.acm.org/serving/ethics.html
o ACM Professional Membership plus the ACM Digital Library: $198 USD ($99 dues + $99 DL)
o ACM Digital Library: $99 USD (must be an ACM member)
o ACM Student Membership plus the ACM Digital Library: $42 USD
o ACM Student Membership PLUS Print CACM Magazine: $42 USD
o ACM Student Membership w/Digital Library PLUS Print CACM Magazine: $62 USD
2 communications of the acm | january 2011 | vol. 54 | no. 1
01/2011 Vol. 54 No. 01

86 A Firm Foundation for Private Data Analysis
What does it mean to preserve privacy?
By Cynthia Dwork

Research Highlights

98 Technical Perspective
Sora Promises Lasting Impact
By Dina Katabi

and Paul P. Maglio

54 UX Design and Agile: A Natural Fit?
Talking with Julian Gosper, Jean-Luc Agathos, Richard Rutter, and Terry Coatta.
ACM Case Study

61 Virtualization: Blessing or Curse?
Managing virtualization at a large scale is fraught with hidden challenges.
By Evangelos Kotsovinos

articles’ development led by queue.acm.org

Reinvent Computing for Parallelism
The ICE abstraction may take CS from serial (single-core) computing to effective parallel (many-core) computing.
By Uzi Vishkin

On the Move, Wirelessly Connected to the World
How to experience real-world landmarks through a wave, gaze, location coordinates, or touch, prompting delivery of useful digital information.
By Peter Fröhlich, Antti Oulasvirta, Matthias Baldauf, and Antti Nurminen

Architecture for the Internet
By Damon Wischik

109 Path Selection and Multipath Congestion Control
By Peter Key, Laurent Massoulié, and Don Towsley
context of different social networks.
By Matthias Häsel

vulnerabilities indeed remain. This month’s cover story by Cynthia Dwork (p. 86) spotlights the difficulties involved in protecting statistical databases, where the value of accurate statistics about a set of respondents often compromises the privacy of the individual.
communications of the acm
Trusted insights for computing’s leading professionals.
Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.
T (212) 869-7440; F (212) 869-0481 Jeff Johnson; Wendy E. MacKay Printed in the U.S.A.
editor’s letter
letters to the editor
DOI:10.1145/1866739.1866741
Some of what Constantine Dovrolis said in the Point/Counterpoint “Future Internet Architecture: Clean-Slate Versus Evolutionary Research” (Sept. 2010) concerning an evolutionary approach to developing Internet architecture made sense, and, like Jennifer Rexford on the other side, I applaud and encourage the related “evolutionary” research. But I found his “pragmatic vision” argument neither pragmatic nor visionary. Worse was the impudence of the claim of “practicality.”

Mid-20th century mathematician Morris Kline said it best when referring to the history of mathematics: “The lesson of history is that our firmest convictions are not to be asserted dogmatically; in fact they should be most suspect; they mark not our conquests but our limitations and our bounds.”

For example, it took 2,000 years for geometry to move beyond the “pragmatism” of the parallel postulate, some 200 years for Einstein to overtake Newton, 1,400 years for Copernicus to see beyond Ptolemy, and 10,000 years for industrialization to supplant agriculture as the dominant economic activity. The Internet’s paltry 40–50-year history is negligible compared to these other clean-slate revolutions.

Though such revolutions generally fail, failure is often the wellspring of innovation. Honor and embrace it. Don’t chide it as “impractical.” The only practical thing to do with this or any other research agenda is to open-mindedly test our convictions and assumptions over and over…including any clean-slate options.

I worry about the blind spot in our culture, frequently choosing “practical effort” over bolder investment, to significantly change things. Who takes the 10,000-, 1,000-, or even 100-year view when setting a research agenda? Far too few. Though “newformers” fail more often than the “practical” among us, they are indeed the ones who change the world.
CJ Fearnley, Upper Darby, PA

What Deeper Implications for Offshoring?
As someone who has known offshoring for years, I was drawn to the article “How Offshoring Affects IT Workers” by Prasanna B. Tambe and Lorin M. Hitt (Oct. 2010) but disappointed to find a survey-type analysis that essentially confirmed less than what most of us in the field already know. For example, at least one reason higher-salaried workers are less likely to be offshored is they already appreciate the value of being able to bridge the skill and cultural gap created by employing offshore workers.

I was also disappointed by the article’s U.S.-centric view (implied at the top in the word “offshoring”). What about how offshoring affects IT workers in countries other than the U.S.? In my experience, they are likewise affected; for example, in India IT workers are in the midst of a dramatic cultural upheaval involving a high rate of turnover.

While seeking deeper insight into offshoring, I would like to ask someone to explain the implications of giving the keys to a mission-critical system to someone in another country not subject to U.S. law? Imagine if the relationships between countries would deteriorate, and the other country would seize critical information assets? We have pursued offshoring for years, but I have still not heard substantive answers to these questions.
Mark Wiman, Atlanta, GA

Authors’ Response:
With so little hard data on outsourcing, it is important to first confirm some of the many anecdotes now circulating. The main point of the article was that the vulnerability of occupations to offshoring can be captured by their skill sets and that the skills story is not the only narrative in the outsourcing debate.

The study was U.S.-centric by design. How offshoring affects IT workers in other countries is important, but the effects of offshoring on the U.S. IT labor market merits its own discussion.

Misappropriation of information has been studied in the broader outsourcing context; see, for example, Eric K. Clemons’s and Lorin M. Hitt’s “Poaching and the Misappropriation of Information” in the Journal of Management Information Systems 21, 2 (2004), 87–107.
Prasanna B. Tambe, New York, NY
Lorin M. Hitt, Philadelphia, PA

Interpreting Data 100 Years On
Looking to preserve data for a century or more involves two challenging orthogonal problems, one, how to preserve the bits, addressed by David S.H. Rosenthal in his article “Keeping Bits Safe: How Hard Can It Be?” (Nov. 2010). The other is how to read and interpret them 100 years on when everything might have changed: formats, protocols, architecture, storage system, operating system, and more. Consider the dramatic changes over just the past 20 years. There is also the challenge of how to design, build, and test complete systems, trying to anticipate how they will be used in 100 years. The common, expensive solution is to migrate all the data every time something changes while controlling costs by limiting the amount of data that must be preserved in light of deduplication, legal obsolescence, librarians, archivists, and other factors.

For more on data interpretation see:
1. Lorie, R.A. A methodology and system for preserving digital data. In Proceedings of the Joint Conference on Digital Libraries (Portland, OR, July 2002), 312–319.
2. Lorie, R.A. Long-term preservation of digital information. In Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries (Roanoke, VA, Jan. 2001), 346–352.
3. Rothenberg, J. Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation. Council on Library & Information Resources, 1999.
Robin Williams, San Jose, CA
in the virtual extension
DOI:10.1145/1866739.1866742
The Ephemeral Legion: Producing an Expert Cyber-Security Work Force from Thin Air
Michael E. Locasto, Anup K. Ghosh, Sushil Jajodia, and Angelos Stavrou
Although recent hiring forecasts (some thousands of new cyber-security professionals over the next three years) by both the NSA and DHS show a strong demand for cyber-security skills, such a hiring spree seems ambitious, to say the least. The current rate of production of skilled cyber-security workers satisfies the appetite of neither the public nor private sector, and if a concerted effort to drastically increase this work force is not made, the U.S. will export high-paying information security jobs. In a global economy, such a situation isn’t necessarily a bad outcome, but it poses several challenges to the U.S.’s stated cyber-security plans.

The authors believe the creation of a significant cyber-security work force is not only feasible, but also will help ensure the economic strength of the U.S. Beyond offering immediate economic stimulus, the nature of these jobs demands they remain in the U.S. for the long term, and they would directly support efforts to introduce information technology into the health care and energy systems in a secure and reliable fashion. Without a commitment to educating such a work force, it is impossible to hire such a work force into existence.

From the authors’ point of view, far too few workers are adequately trained mostly because traditional educational mechanisms lack the resources to effectively train large numbers of experienced, knowledgeable cyber-security specialists. Just as importantly, many of the current commercial training programs and certifications focus on teaching skills useful for fighting the last cyberwar, not the current, nor future ones.

On the Move, Wirelessly Connected to the World
Peter Fröhlich, Antti Oulasvirta, Matthias Baldauf, and Antti Nurminen
Is it possible to experience real-world landmarks through a wave, gaze, location coordinates, or touch, prompting delivery of useful digital information? Today’s mobile handheld devices offer opportunities never before possible for interacting with digital information that responds to users’ physical locations. But mobile interfaces have only limited input capabilities, usually just a keyboard and audio, while emerging multimodal interaction paradigms are beginning to take advantage of user movements and gestures through sensors, actuators, and content. For example, tourists asking about an unfamiliar landmark might point at it intuitively and would certainly welcome a handheld computer that responds directly to that interest. When passersby provide directions, the description might include local features, as in, say, “Turn right after the red building and enter through the metal gates.” They, too, would welcome being able to see these features represented in a directly recognizable way on their handhelds. Or when following a route to a remote destination, they would want to know the turns and distances they would need to take through tactile or auditory cues, without having to switch their gaze between the environment and the display.

This article explores the synthesis of several emerging research trends called Mobile Spatial Interaction, or MSI (http://msi.ftw.at), covering new interaction techniques that let users interact with physical, natural, and urban surroundings through today’s sensor-rich mobile devices.

OpenSocial: An Enabler for Social Applications on the Web
Matthias Häsel
Social networking and open interfaces can be seen as representative of two characteristic trends to have emerged in the Web 2.0 era, both of which have evolved in recent years largely independently of each other. A significant portion of our social interaction now takes place on social networks, and URL-addressable APIs have become an integral part of the Web. The arrival of OpenSocial heralds a new standard uniting these two trends by defining a set of programming interfaces for developing social applications that are interoperable on different social network sites.

OpenSocial applications are interoperable within the context of multiple networks and build on standard technologies such as HTML and JavaScript. The advent of OpenSocial increases a developer’s scope and productivity considerably, as it means that applications need only be developed once, and can then be implemented within the context of any given container that supports the standard. Meanwhile, operators of social network sites are presented with the opportunity to expand their own existing functionalities with a host of additional third-party applications, without having to relinquish control over their user data in the process.

Until it was made public in November 2007, the OpenSocial standard was driven primarily by Google. The standard was not suited to productive use at that time, however, as there were several shortcomings with respect to the user interface and security. The specification is now managed by the non-profit OpenSocial Foundation and, with its 0.8 version, a stable state suitable for commercial use has been reached.
acm’s annual report for FY10
ACM’s Annual Report
It has truly been a banner year for ACM. We firmly established ACM hubs in Europe, India, and China after years of exhaustive efforts to expand the Association’s global reach. We moved ACM’s commitment to women in computing to a new level with further development of the ACM Women’s Council and the launch of ACM-W activities in India. And (dare I say, not surprisingly) ACM membership ended the year at another all-time high.

Increasing ACM’s relevance and influence in the global computing community has been a top priority throughout my presidency. By sharing ACM’s array of valued resources and services with a borderless audience, and by discovering, welcoming, and nurturing talent from all corners of the computing arena, ACM can truly be distinguished as the world’s leading computing society. It was therefore a great honor to host the opening days of ACM Europe, ACM India, and ACM China. The global stage has indeed been set for ACM to flourish internationally as never before.

ACM continues to play a leadership role in improving the image and health of the computing discipline. This is particularly evident with the Association’s work in influencing change for women pursuing a career in computing. Through committees and initiatives such as ACM Women’s Council, The Coalition to Diversify Computing, and the Computer Science Teachers Association (CSTA), ACM is helping to build balance, diversity, and opportunity for all who may be interested in technology. It was particularly inspiring to see members of ACM-W on hand at the launching of ACM India earlier this year, encouraging their counterparts to join forces to improve working conditions for women in computing in India. The Association’s commitment to addressing the challenges faced by women in the field today is one that every member should applaud.

The fact that membership has continued to increase for eight consecutive years is testament to the ever-growing awareness of ACM’s commitment to supporting the professional growth of its members. Indeed, by the end of FY10, spanning an acutely challenging year in global economies, the Association’s membership stood at an all-time high, thus cementing ACM’s position as the largest educational and scientific computing society in the world.

The following pages summarize some of the highlights of a busy year in the life of ACM. While much has been accomplished, there is still much to be done. In FY11, the Association will continue to grow initiatives in India, China, and Europe as well as identify other regions of the world where it is feasible for ACM to increase its level of activity. Improving the image and health of our discipline and field requires the concerted commitment of every ACM volunteer, board, chapter, committee, and member. It is through the support of devoted volunteers, members, and industry partners that ACM is able to make a real difference in the future of computing. It has been a pleasure to serve as your president during a time of such great promise.

Wendy Hall, ACM President
ACM, the Association for Computing Machinery, is an international scientific and educational organization dedicated to advancing the arts, sciences, and applications of information technology.

Publications
The centerpiece of the ACM Publication portfolio is the ACM Digital Library. During the past year, 21,000 full-text articles were added to the DL, bringing total holdings to 281,000 articles. ACM’s Guide to Computing Literature is an integral part of the DL, providing an increasingly comprehensive index to the literature of computing. More than 230,000 works were added to the bibliographic database in FY10, bringing the total Guide coverage to over 1.52 million works.

Significant enhancements were made to the Digital Library and Guide this year, including a major reorganization of the core citation pages and to ACM bibliometrics. Along with content reformation, there is now greater ease of navigation and a greater selection of tools and resources.

ACM currently publishes 40 journals and Transactions, 10 magazines, and 23 newsletters. In addition, it provides primary online distribution for 10 periodicals through the Digital Library. During FY10, ACM added 364 conference and related workshop proceedings to the DL, including 45 in ACM’s International Conference Proceedings Series.

Two ACM magazines were relaunched during FY10. Crossroads, the ACM student magazine became XRDS, with a more expansive editorial scope and a more modern look to appeal to the student audience. ACM Inroads was transformed from the SIGCSE Bulletin newsletter to an ACM magazine with a wider variety of content for computer science educators.

Periodicals that were approved by the Publications Board and are now on the launching pad for FY11: ACM Transactions on Management Information Systems; ACM Transactions on Intelligent Systems and Technology; and ACM Transactions on Interactive Intelligent Systems.

Education
ACM continues to work with multiple organizations on important issues related to the image of computing and the health of the discipline and profession. In the second year of an NSF grant to develop a more relevant image for computing, ACM worked in tandem with WGBH-Boston in the creation of a new messaging campaign called “Dot Diva.” The campaign, which rolled out in the U.S. last month, is focused on ways to engage young girls with the potential of computing.

ACM and the Association for Information Systems (AIS) jointly developed new curriculum guidelines for undergraduate degree programs in information systems that for the first time include both core and elective courses suited to specific career tracks. Released in May, IS 2010 is aimed at educating graduates who are prepared to enter the work force equipped with IS-specific as well as foundational knowledge and skills. The report describes the seven core courses that must be covered in every IS program and the curriculum can be adapted for schools of business, public administration, and information science or informatics.

ACM’s Computer Science Teachers Association (CSTA) continues to support and promote the teaching of computer science at the K–12 level as well as providing opportunities and resources for teachers and students to improve their understanding of computing disciplines. CSTA’s mission is to ensure computer science emerges as a viable discipline in high schools and middle schools; it is a key partner in ACM’s effort to see real computer science count at the high school level.

Professional Development
The Professional Development Committee spearheaded the development of a new product for practitioners and managers this year called Tech Packs. These integrated learning packages were created to provide a resource for emerging areas of computing designed around an annotated bibliography of resources selected from ACM’s Digital Library, ACM’s online book and course offerings, and non-ACM resources created by experts’ recommendations on current computing topics. A Tech Pack comprises a set of fundamentally important articles on a subject with new material to provide a context and perspective on the theme. The goal is that communities might be built around Tech Packs with members commenting on selected resources and suggesting new ones.

The Professions Board Case Study program took off this year, with the first of several planned studies available online and in print. The program was designed to take an in-depth look at a company or product or technology from its inception to future plans by interviewing some of the key players involved. The inaugural case study was posted on the ACM Queue site and published in Communications of the ACM. The article was quickly slashdotted, and drew over 50,000 unique visits to the Queue site by the end of the fiscal year.

Traffic to the Queue Web site (http://queue.acm.org/) more than doubled this year over last. By the end of FY10, the site delivered nearly a million page views to nearly half a million readers.

Public Policy
Members of the U.S. Public Policy Council of ACM (USACM) had an active year interacting with policymakers in areas of e-voting, privacy, and security, as well as testifying before Congressional committees and helping develop principles for increasing the usability of government information online. Among the issues tackled this year, USACM joined a task force for the Future of American Innovation urging more funding for basic research and STEM education. Members also expressed concerns with the Cybersecurity Act of 2009, provided constructive comments on a draft of the Internet Privacy bill, and issued a response to e-voting legislation and Internet voting as it relates to military
and overseas voters.

The ACM Committee on Computers and Public Policy aids the Association with respect to a variety of internationally relevant issues pertaining to computers and public policy. The online ACM Forum on Risks to the Public in Computer and Related Systems and the “Inside Risks” column published in Communications of the ACM reflect CCPP’s long-standing dedication to policy issues on a global scale.

ACM played an active role in the National Center for Women and Information Technology (NCWIT) this year, particularly with regard to the K–12 Alliance, a coalition of educational organizations interested in helping young girls develop an interest in computer science and information technology.

The ACM Education Policy Committee (ACM EPC), established to educate policymakers about the appropriate role of computer science in the K–12 system, made major progress in bringing computer science into STEM discussions at all levels of government. Through the work of EPC, computer science is now explicitly recognized in key federal legislation as well as Department of Education regulations and initiatives. Indeed, EPC successfully led an effort that resulted in the U.S. House of Representatives declaring the week of December 7th as National Computer Science Education Week. ACM took a leadership role in steering the first CSEDWeek (held Dec. 6–12, 2009); a role the organization reprised for the second CSEDWeek held last month.

Students
ACM’s renowned International Collegiate Programming Contest (ICPC), sponsored by IBM, drew 22,000 contestants representing 1,931 universities from 82 countries. The finals were held in Harbin, China, where 103 teams competed. The top four teams won gold medals as well as employment or internship offers from IBM.

Last January, ACM Queue’s Web site offered an online programming competition based on the ICPC. The inaugural Queue ICPC Challenge, open to all Queue readers (not just students), was a huge success.

The ACM Student Research Competition (SRC), sponsored by Microsoft Research, provides a unique forum for undergraduate and graduate students to present their original research at well-known ACM-sponsored and co-sponsored conferences before a panel of judges and attendees. This venue draws an increasing number of students each year as it affords an exceptional opportunity for students to showcase their work and develop their skills as researchers.

ACM continues to cultivate its partnerships with leading technology companies, including Microsoft and Computer Associates, to offer valuable tools specifically for ACM student members. Available under the Student Academic Initiative is the Microsoft Developer Academic Alliance now offering student members free and unlimited access to over 100 software packages and the CA Academic Initiative including access to complimentary CA software.

ACM-W’s Scholarship program, which offers stipends to select students to attend research conferences worldwide, was given an extra financial boost this year with new funding from the Bangalore-based global IT services corporation Wipro and Sun Microsystems (prior to the Oracle takeover). The increased funding will allow ACM-W to offer students larger scholarships as well as enable participation by women in both international and local events.

International
ACM Europe and ACM India were launched in FY10. Both organizations operate with councils established around three subcommittees: chapters; conferences; and members, awards, and volunteer leaders with the goal of increasing the presence of and generating interest in these popular ACM services.

The number of ACM Fellows, Distinguished, and Senior members from Europe has increased as has the number of ACM chapters throughout Europe.

Moreover, Microsoft Research Europe provided $50,000 to enhance the ACM Distinguished Speakers Program with a goal of delivering more high-quality, ACM-branded lectures

ACM Council
President
Wendy Hall
Vice President
Alain Chesnais
Secretary/Treasurer
Barbara Ryder
Past President
Stuart I. Feldman
SIG Governing Board Chair
Alexander Wolf
Publications Board Co-Chairs
Ronald Boisvert, Holly Rushmeier
Members-at-Large
Carlo Ghezzi, Anthony Joseph, Mathai Joseph, Kelly Lyons, Bruce Maggs, Mary Lou Soffa, Fei-Yue Wang
SGB Council Representatives
Joseph A. Konstan, Robert A. Walker, Jack Davidson

ACM Headquarters
Executive Director/CEO
John R. White
Deputy Executive Director/COO
Patricia M. Ryan

2009 ACM Award Recipients
A.M. Turing Award
Charles P. Thacker
ACM-Infosys Foundation Award in the Computing Sciences
Eric Brewer
ACM/AAAI Allen Newell Award
Michael I. Jordan
The 2009–2010 ACM-W Athena Lecturer Award
Mary Jane Irwin
Grace Murray Hopper Award
Tim Roughgarden
ACM-IEEE CS 2010 Eckert-Mauchly Award
William J. Dally
Karl V. Karlstrom Outstanding Educator Award
Matthias Felleisen
Outstanding Contribution to ACM Award
Moshe Y. Vardi
Distinguished Service Award
Edward Lazowska
Paris Kanellakis Theory and Practice Award
Mihir Bellare and Phillip Rogaway
Software System Award
VMware Workstation 1.0, Mendel Rosenblum, Edouard Bugnion, Scott Devine, Jeremy Sugerman, Edward Wang
Eugene L. Lawler Award
and Informatics
Gregory D. Abowd
ACM-IEEE Ken Kennedy Award
Francine Berman
Doctoral Dissertation Award
Craig Gentry
ACM Presidential Award
Mathai Joseph, Elaine J. Weyuker
Honorable Mention
Haryadi S. Gunawi, Andre Platzer, Keith Noah Snavely
Statement of Activities: Year ended June 30, 2010 (in Thousands)

                                          Unrestricted   Temporarily Restricted   Total
REVENUE
Membership dues                           $9,201                                  $9,201
Publications                              17,361                                  17,361
Conferences and other meetings            24,933                                  24,933
Interests and dividends                   1,707                                   1,707
Net appreciation of investments           2,442                                   2,442
Contributions and grants                  2,882          $963                     3,845
Other revenue                             348                                     348
Net assets released from restrictions     1,057          (1,057)                  0
Total Revenue                             59,931         (94)                     59,837

EXPENSES
Program:
Membership processing and services        $945                                    $945
Publications                              11,457                                  11,457
Conferences and other meetings            25,035                                  25,035
Program support and other                 7,269                                   7,269
Total                                     44,706                                  44,706
Supporting services:
General administration                    9,009                                   9,009
Marketing                                 1,299                                   1,299
Total expenses                            55,014                                  55,014

Increase (decrease) in net assets         4,917          (94)                     4,823
Net assets at the beginning of the year   47,412         5,815                    53,227
Net assets at the end of the year         $52,329*       $5,721                   $58,050*
* Includes SIG Fund balance of $28,448K

participated in the conference. SIGGRAPH Asia 2009 attracted over 6,500 visitors from more than 50 countries across Asia and globally to Yokohama, Japan where over 500 artists, academics, and industry experts shared their work.

ACM’s SIG Governing Board agreed to sponsor select conferences that come to ACM without a technical tie to one of its SIGs. In FY10, SGB approved sponsorship for two conferences: The ACM International Conference on Bioinformatics and Computational Biology and the First ACM International Health Informatics Symposium.

Attendance for ASSETS 09, sponsored by SIGACCESS, exceeded all projections, drawing a record number of participants to its technical program that addressed key issues such as cognitive accessibility, wayfinding, virtual environments, and accessibility obstacles for the hearing impaired.

SIGOPS’s flagship ACM Symposium on Operating Systems Principles enjoyed record-breaking attendance; the SIG also jointly sponsored (with SIGMOD) the first annual ACM Symposium on Cloud Computing.

KDD 2009 maintained SIGKDD’s position as the leading conference on data mining and knowledge discovery, with a record number of submissions.

Recognition
The ACM Fellows Program, established in 1993 to honor outstanding ACM members for their achievements in computer science and information
ence on such popular social networks DAD). This wireless network, created technology, inducted 47 new fellows
as Facebook, Twitter, and LinkedIn. to bridge the gap between research in FY10, bringing the total number of
SIGUCCS established an online com- and real-world use of wireless net- ACM Fellows to 722.
munity using Ning’s social network- works, has rapidly become one of the ACM also recognized 84 Distin-
ing services and linked its portal to most critical wireless network data re- guished Members for their individual
its new Web site (http://www.siguccs. sources for the global research com- contributions to both the practical
org/) as well as initiated a series of munity. and theoretical aspects of comput-
Webinars to continue on a quarterly ing and information technology. In
basis. SIGSIM’s Modeling and Simu- Conferences addition, 150 Senior Members were
lation Knowledge Repository (http:// SIGGRAPH 2009 welcomed 11,000 recognized for demonstrated perfor-
www.acm-sigsim-mskr.org) has prov- artists, research scientists, gam- mances that set them apart from their
en an innovative program for supply- ing experts, and filmmakers from 69 peers.
ing services to the SIGSIM technical countries to New Orleans. Exhibits at There were 104 new ACM chapters
community. And SIGMOBILE spon- SIGGRAPH experienced the largest chartered in last year. Of the 28 new
sored programs in the mobile com- percentage of international participa- professional chapters, 26 of them
puting research community such as tion in more than 10 years, with a total were internationally based; of the
a Community Resource for Archiving of 140 industry organizations repre- 76 new student chapters, 41 of them
Wireless Data at Dartmouth (CRAW- sented. In addition, over 965 speakers were based internationally.
The Communications Web site, http://cacm.acm.org,
features more than a dozen bloggers in the BLOG@CACM
community. In each issue of Communications, we’ll publish
selected posts or excerpts.
doi:10.1145/1866739.1866743 http://cacm.acm.org/blogs/blog-cacm
Laptops as a
is tersely described as: Do better, look
better, and connect better.
Do Better. “The most reliable way
blog@cacm
“Refuse to let your time get burned up with things that are less important.” A critical lesson Azzarello learned in her career at HP is that the “most successful executives don’t do everything. They do a few things right and hit them out of the park.”
Look Better. The second step of Azzarello’s career plan involves making your work and accomplishments known to your immediate bosses. After all, if you deliver excellent results, but no one above you in the company is aware of them or doesn’t connect the results with your job performance, it’ll be difficult for you to advance in your company.
Azzarello recommends creating an audience list of the people in your company who should know about your achievements at work and a communication plan for how to inform these key players about your work and what you’ve accomplished. The audience list should include the influencers who have a say in your career (your bosses and your bosses’ bosses) and any stakeholders who are dependent on your work. And your communication plan should describe how you will inform these influencers, usually via conversations, reports, and email, about your job and what you’ve accomplished.
For your achievements to be appreciated, Azzarello says it’s vital that they are relevant to your company’s goals. “Your priorities must be relevant to their priorities,” says Azzarello. “Your work must be recognized as matching the business’s goals.”
Connect Better. The third step of Azzarello’s career plan involves connecting with key players at your company, which involves building relationships with mentors and creating a broad network of support. “Successful people get a lot of help from others,” Azzarello says. “You can’t be successful alone.”
Azzarello stresses the importance of mentors (note the plural) at your company and outside of it, and says employees “shouldn’t attempt career advancement without mentors.” Not only can mentors help you understand a company’s culture and goals, but they, and other key players, can help you get a spot on the company’s list of employees who are viewed as up and coming.
All of this is about visibility. The company president or other top executives must know or know about you, or you must have a relationship with mentors or others who are connected to the company president and key executives.
This step involves networking, and many people (and Azzarello admits she’s one of these people) are uncomfortable with meeting new people for the purpose of networking. If you’re one of these people, Azzarello’s advice is to network with the people you already know.
If Azzarello’s career advice sounds like a lot of work, you’re right: it is. Which is why she urges employees to create a yearlong plan for implementing these three stages.
For many employees, Azzarello’s advice is a real challenge. The alternative, however, is rather unsatisfying. After all, who wants a zero raise?

Judy Robertson
“Laptops in the Classroom”
http://cacm.acm.org/blogs/blog-cacm/93398
Do you ever find yourself checking your email during a boring meeting? Do you drift off on a wave of RSS feeds when you should be listening to your colleagues? Do you pretend to be taking studious notes during seminars while actually reading Slashdot? In fact, shouldn’t your full attention be somewhere else right now?
I find it increasingly tempting to do lots of things at once, or at least take microbreaks from activities to check mail or news. I do think it’s rude to do so during meetings so I try to stop myself. My students don’t tend to have such scruples. They use their laptops openly in class, and they’re not all conscientiously following along with my slides, I suspect. In fact, in a recent study, “Assessing Laptop Use in Higher Education Classrooms: The Laptop Effectiveness Scale,” published in the Australasian Journal of Educational Technology, 70% of students spent half their time sending email during class (instant messaging, playing games, and other nonacademic activities were also popular). They also took notes and carried out other learning tasks, but they weren’t exactly dedicated to staying on task. If you’re interested in surveying your own class to find out what they really do behind their screens, the study’s authors provide a reliable, validated questionnaire tool about laptop usage in education.
Of course, there have always been distractions during class, as the margins of my Maths 101 notes demonstrate with their elaborate doodles. It’s just that laptops make it so easy and seductive to drop your attention out of the lecture while still feeling that you are achieving something. (“I simply must update Facebook now. Otherwise people will not know I am in a boring lecture.”)
Unsurprisingly, laptop usage in class has been associated with poorer learning outcomes, poorer self-perception of learning, and students reporting feeling distracted by their own screen as well as their neighbors’ (see Carrie Fried’s “In-class Laptop Use and its Effects on Student Learning”). Many educators get frustrated by this (see Dennis Adams’s “Wireless Laptops in the Classroom [and the Sesame Street Syndrome]”) and there is debate about whether laptops should be banned, or whether the lecturer should have a big red button to switch off wireless (or to electrocute all students) when he or she can’t stand it anymore.
Bear in mind, though, the studies I mention here were conducted in lecture-style classes and the students were not given guidance on how to effectively use their laptops to help them learn rather than arrange their social lives. It is possible to design active classes around laptop use (if you can make sure that students who don’t own a laptop can borrow one) thereby making the technology work in your favor. For example, my students learn to do literature searches in class, try out code snippets, or critique the design of Web pages. And, yes, some of them still get distracted from these activities and wander off to FarmVille. But at least I have given them the opportunity to integrate their technology with their learning in a meaningful way. They are adult learners after all. It’s their decision how best to spend their brain cells in my class and my job is to give them a compelling reason to spend them on computer science rather than solitaire.

Jack Rosenberger is senior editor, news, of Communications. Judy Robertson is a lecturer at Heriot-Watt University.

© 2011 ACM 0001-0782/11/0100 $10.00
cacm online
ACM
Member
News
DOI:10.1145/1866739.1866744 David Roman Jan Camenisch Wins
Sigsac’s outstanding
news
Nonlinear Systems
Made Easy
Pablo Parrilo has discovered a new approach to convex optimization
that creates order out of chaos in complex nonlinear systems.
Imagine you are hiking in a complex and rugged environment. You are surrounded by hills and valleys, mountains, shallow ditches, steep cliffs, and lakes. Nothing about the ground immediately around you, or your current direction, tells you much about where you will end up or what might lie in between.
Now imagine you are walking on the inside surface of a huge, smoothly shaped bowl. You can see the bottom of it, and even a few steps over the surface of the bowl tell you much about its shape and dimensions. There are no surprises.
Pablo Parrilo, a professor of electrical engineering and computer science at Massachusetts Institute of Technology (MIT), has found a way to remake the mathematical landscapes of complex, nonlinear systems into predictable smooth bowls. He has constructed a rare bridge between theoretical math and engineering that extends the frontiers of such diverse disciplines as chip design, robotics, biology, and economics.

New algorithms devised by Pablo Parrilo, an MIT professor of electrical engineering and computer science, have made working with nonlinear systems both easier and more efficient.
PHOTOGRAPH BY PATRICK GILLOOLY

Nonlinear dynamical systems are inherently difficult, especially when they involve many variables. Often they act in a linear fashion over some small region, then change radically in some other region. Water expands linearly as it warms, then explodes in volume at the boiling point. An airplane rises smoothly and ever more steeply, until it stalls. Understanding these systems often requires a great deal of prior knowledge, plus a painstaking combination of trial and error and modeling. Sometimes the models themselves are so complex their behavior can’t be predicted or guaranteed, and running realistic models can be computationally intractable.
Parrilo developed algorithms that take the complex, nonlinear polynomials in models that describe these systems and, without actually solving them, rewrite them as much simpler mathematical expressions represented as sums of squares of other functions. Because squares can never be negative, his expressions are guaranteed never to fall below zero (the bottom of a “bowl”) and are relatively straightforward to analyze via conventional mathematical tools such as optimization techniques.
Parrilo’s transformed equations are “convex,” like bowls. Convexity in mathematics essentially means that a function is free of undulations that form local minima and maxima. “It means that if you know something about two points, then you know what’s going to happen in the middle,” Parrilo says. “The reason that the convexity property is so important is that it allows us to make global statements from local properties,” Parrilo says. “You can sometimes give bounds on the quantity you are trying to find; essentially, you use the convexity of a function as a way of establishing whatever conclusion you want to make.” In other words, one can deduce a great deal about a bowl from visiting just a few points on it.
Once Parrilo has derived the sums-of-squares equations, the equations are solved (a minimum is found) via an optimization technique called semidefinite programming, a relatively recent extension of linear programming that works on matrices representing convex functions. (The algorithms for both steps are contained in a MATLAB toolbox called Sostools, which is available at http://www.mit.edu/~parrilo/sostools/.)
While solving the original nonlinear equations is often NP-hard, Sostools can find useful bounds or even exact solutions from the convex sums-of-squares equations in polynomial time, Parrilo says. In fact, his techniques can improve efficiency so greatly, for researchers as well as their computers, that they enable fundamentally new ways of working and, in some cases, qualitatively better results.

Systems, Functions, Properties
Familiar systems (one describing energy, for example) are often defined by functions in which equilibrium exists at some minimum point, with the systems moving toward that point along smooth trajectories. “But with many systems, like a biological system, it is very different. We don’t quite know what these functions are,” Parrilo says. “So what [my] methods do, in a more or less automatic way, is find a function that has the properties of an energy function.”
Because nonlinear systems are so difficult, system designers and researchers often take the easy but wrong way out, says Elizabeth Bradley, a professor of computer science at the University of Colorado at Boulder. “Linear systems dominate our education as engineers solely because they are easy,” Bradley says. “But that leads to the lamppost problem. People look around the linear lamppost even though the answers aren’t there.” Parrilo’s contribution is important, she says, because it makes dealing with nonlinear systems much easier.
John Harrison, a principal engineer at Intel, knows what it’s like to wrestle with nonlinear systems. He develops formal proofs for the correctness of designs for floating-point arithmetic circuits. The idea is to prevent a recurrence of the floating-point division bug in the Pentium chip that cost Intel nearly $500 million in the mid-1990s. The tools he had used to do that, before discovering Parrilo’s sums of squares and semidefinite programming, were complex, time consuming, and required huge amounts of computer time, he says. Now his formal verifications typically run in “tens of seconds” rather than “many minutes or even hours,” Harrison says.
Harrison doesn’t have to find exact solutions to his equations, but can work with proofs that a polynomial will remain within certain acceptable bounds over a specified range. That’s exactly what Parrilo’s method does, he says. “The key,” he adds, “is the ability to certify the result formally, otherwise various less rigorous methods could be used.”
The broad scope of Parrilo’s concepts may mean that formal methods, which can mathematically prove or verify the correctness of designs, but usually with some difficulty, will propagate more widely, Harrison predicts.

Specifying a Robot’s Bounds
A robot walking slowly can be controlled by a relatively simple system that works linearly, says Russ Tedrake, associate professor of electrical engineering and computer science at MIT. But if the robot walks too fast or encounters some kind of disturbance, nonlinear factors kick in and the robot’s behavior becomes much more difficult to predict and control. Tedrake is using Parrilo’s sums-of-squares and semidefinite programming techniques to rigorously specify the bounds within which the robot won’t fall. He

Figure: Phase plot of a two-dimensional dynamical system, and estimate of the region of attraction of the stable equilibrium at the origin. This estimate was obtained by solving a sum of squares optimization problem.
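The sum-of-squares idea described in this article can be illustrated with a small sketch. It does not reproduce Parrilo's algorithms or the Sostools toolbox, and it does no semidefinite programming; it only checks a decomposition that is already known. The polynomial, the two squared terms, and the function names `p`, `sos`, and `certificate_holds` are illustrative choices, not taken from the article. The point is that once a polynomial is written as a sum of squares, its nonnegativity is evident without solving anything.

```python
from fractions import Fraction


def p(x, y):
    """Example polynomial p(x, y) = 2x^4 + 2x^3*y - x^2*y^2 + 5y^4."""
    return 2 * x**4 + 2 * x**3 * y - x**2 * y**2 + 5 * y**4


def sos(x, y):
    """The same polynomial written as half the sum of two squares."""
    q1 = 2 * x**2 - 3 * y**2 + x * y
    q2 = y**2 + 3 * x * y
    # Each square is >= 0, so the whole expression is >= 0 everywhere.
    return Fraction(q1**2 + q2**2, 2)


def certificate_holds(points):
    """Spot-check that both forms agree and are nonnegative at each point."""
    return all(p(x, y) == sos(x, y) and sos(x, y) >= 0 for x, y in points)


# Exact integer arithmetic on a small grid of sample points
# (integer inputs only, because Fraction is built from integers here).
grid = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
print(certificate_holds(grid))
```

Running the sketch should print `True`. Finding such a decomposition automatically is the hard part, and that is the step the article attributes to semidefinite programming; the check above is only a sanity test of one hand-picked certificate.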
Ever since the first silent-mode cell phones started buzzing in our pockets a few years ago, many of us have unwittingly developed a fumbling familiarity with haptics: technology that invokes our sense of touch. Video games now routinely employ force-feedback joysticks to jolt their players with a sense of impending on-screen doom, while more sophisticated haptic devices have helped doctors conduct surgeries from afar, allowed deskbound soldiers to operate robots in hazardous environments, and equipped musicians with virtual violins.
Despite recent technological advances, haptic interfaces have made only modest inroads into the mass consumer market. Buzzing cell phones and shaking joysticks aside, developers have yet to create a breakthrough product: a device that would do for haptics what the iPhone has done for touch screens. The slow pace of market acceptance stems partly from typical new-technology growing pains: high production costs, the lack of standard application programming interfaces (APIs), and the absence of established user interface conventions. Those issues aside, however, a bigger question looms over this fledgling industry: What are haptics good for, exactly?
Computer scientists have been exploring haptics for more than two decades. Early research focused largely on the problem of sensory substitution, converting imagery or speech information into electric or vibratory stimulation patterns on the skin. As the technology matured, haptics found

moving beyond special-purpose applications to tackle one of the defining challenges of our age: information overload. For many of us, a growing reliance on screen-based computers has long since overtaxed our visual senses. But the human mind comes equipped to process information simultaneously from multiple inputs, including the sense of touch. “People are not biologically equipped to handle the assault of information that all comes through one channel,” says Karon MacLean, a professor of computer science at the University of British Columbia.
Haptic interfaces offer the promise of creating an auxiliary information channel that could offload some of the cognitive load by transmitting data to the human brain through a range of vibrations or other touch-based feedback. “In the real world things happen on the periphery,” says Lynette Jones, a senior research scientist at Massachusetts Institute of Technology. “It seems like haptics might be a good candidate for exploiting that capability because it’s already a background sense.”
As people consume more information on mobile devices, the case for haptics seems to grow stronger. “As screen size has become smaller, there is interest in offloading some information that would have been presented visually to other modalities,” says Jones, who also sees opportunities for haptic interfaces embedded in vehicles as early warning systems and proximity indicators, as well as more advanced applications in surgery, space, undersea exploration, and military scenarios.
While those opportunities may be real, developers will first have to overcome a series of daunting technical obstacles. For starters, there is currently no standard API for the various force feedback devices on the market, although some recent efforts have resulted in commercial as well as open source solutions for developing software for multiple haptic hardware platforms. And as haptic devices grow

PHOTOGRAPHS BY STEVE YOHANAN
changes.
ample, the ordinary consumer still requires some kind of training to associate a haptic stimulation pattern with a particular meaning, such as the urgency of a phone call or the status of a download on a mobile device.
The prospect of convincing consumers to learn a new haptic language might seem daunting at first, but the good news is that most of us have already learned to rely on haptic feedback in our everyday lives, without ever giving it much thought. “We make judgments based on the firmness of a handshake,” says Ed Colgate, a professor of mechanical engineering at Northwestern University. “We enjoy petting a dog and holding a spouse’s hand. We don’t enjoy getting sticky stuff on our fingers.” Colgate believes that advanced haptics could eventually give rise to a set of widely recognized device behaviors that go well beyond the familiar buzz of cell phones. For now, however, the prospect of a universal haptic language seems a distant goal at best.
“Until we have a reasonably mature approach to providing haptic feedback, it’s hard to imagine something as sophisticated as a haptic language arising,” says Colgate, who believes that success in the marketplace will ultimately hinge on better systems integration, along the lines of what Apple has accomplished with the iPhone. “Today, haptics is thought of as an add-on to the user interface,” says Colgate. “It may enhance usability a little bit, but its value pales in comparison to things you can do with graphics and sound. In many cases, the haptics is so poorly implemented that people turn it off pretty quickly. And that’s not to criticize the developers of haptics; it’s just a tough problem.”
Many efforts to date have used haptics as a complementary layer to existing screen-based interfaces. MacLean argues that haptics should do more than just embellish an interaction already taking place on the screen. “A lot of times you’re using haptics to slap it on top of a graphical interaction,” she says. “But there can also be an emotional improvement, a comfort and delight in using the interface.”
Led by Ph.D. candidate Steve Yohanan, MacLean’s team has built the Haptic Creature, a device about the size of a cat that simulates emotional responses. Covered with touch sensors, the Haptic Creature creates different sensations (hot, cold, or stiffening its “ears”) in response to human touch. The team is exploring possible applications such as fostering companionship in older and younger people, or treating children with anxiety disorders.
MacLean’s team has also developed an experimental device capable of buzzing in 84 different ways. After giving users a couple of months to get familiar with the feedback by way of an immersive game, they found that the process of learning to recognize haptic feedback bore a great deal of similarity to the process of learning a language.

on both the Green500 and Top500 lists. Named in honor of one of the UIUC campus’s main thoroughfares, the Green Street supercomputer placed third in the Green500 list of the world’s most energy-efficient supercomputers, with a performance of 938 megaflops per watt. It also placed 403rd in the Top500 list, a ranking of the world’s fastest supercomputers, with a performance of 33.6 teraflops.
The Green Street supercomputer grew out of an independent study course led by Bill Gropp, the Bill and Cynthia Saylor Professor of Computer Science, and Wen-mei Hwu, who holds the AMD Jerry Sanders Chair of Electrical and Computer Engineering. Approximately 15 UIUC undergraduate and graduate students helped build the supercomputer, which boasts a cluster of 128 graphics processing units donated by NVIDIA, and uses unorthodox supercomputer building materials, such as wood and Plexiglas.
The UIUC team hopes to increase the supercomputer’s energy efficiency by 10%–20% with better management of its message passing interface and several other key elements. “You really need to make sure that the various parts of your communications path, in terms of different software layers and hardware drivers and components, are all in tune,” says Hwu. “It’s almost like when you drive a car, you need to make sure that all these things are in tune to get the maximum efficiency.”
The Green Street supercomputer is being used as a teaching and research tool.
Graeme Stemp-Morlock
“The surprising thing is that people are able to quickly learn an awful lot and learn it without conscious attention,” says MacLean. “There’s a lot of potential for people to learn encoded signals that mean something not in a representational way but in an abstract way without conscious attention.”
To date, most low-cost haptic interfaces have relied exclusively on varying modes of vibration, taking advantage of the human skin’s sensitivity to movement. But vibration constitutes the simplest, most brute-force execution of haptic technology. “Unfortunately,” says Colgate, “vibration isn’t all that pleasing a sensation.”
Some of the most interesting research taking place today involves expanding the haptic repertoire beyond the familiar buzz of the vibrating cell phone. At MIT, Jones’ team has conducted extensive research into human body awareness and tactile sensory systems, examining the contribution of receptors in the skin and muscles to human perceptual performance. In one study, Jones demonstrated that users were unable to distinguish between two thermal inputs presented on a single finger pad; instead, they perceived it as a single stimulus, demonstrating the tendency of thermal senses to create “spatial summation” rather than fine-tuned feedback.
Colgate’s research has focused on a fingertip-based interface that provides local contact information using new actuation technologies including shear skin stretch, ultrasonic, and thermal actuators. By varying the friction in correspondence with fingertip motion across a surface, the interface can simulate the feeling of texture or a bump on the surface. Compared with force-feedback technology, vibrotactile stimulators, known as tactors, are much smaller in size and more portable, although high-performance tactors with wide bandwidths, small form factors, and independently controllable vibrational frequency and amplitude are still hard to come by at a reasonable cost.
The Northwestern researchers have figured out how to make transparent force sensors that can capture tactile feedback on a screen, so that they can be combined with a graphical display. “My ideal touch interface is one that can apply arbitrary forces to the finger,” says Colgate, whose team has been approaching the problem by combining friction control with small lateral motions of the screen itself.
By controlling the force on the finger, the system can make parts of the screen feel “magnetic” so that a user’s finger is pulled toward them (up, down, left, right) or let a user feel the outline of a button on the screen where none exists. Colgate’s team is also exploring how to develop devices using multiple fingers, each on a different variable friction interface.
Looking ahead, Colgate believes the evolution of haptic interfaces may follow the trajectory of touch screens: a technology long in development that finally found widespread and relatively sudden acceptance in the marketplace. “The technology has to be sufficiently mature and robust, there has to be an active marketplace that creates competition and drives down costs, and it has to meet a real need.”
As production costs fall and new standards emerge (as they almost certainly will) the marketplace for touch-based devices may yet come into its own. Until that happens, most of the interesting work will likely remain confined to the labs. And the future of the haptics industry seems likely to remain, well, a touchy subject.

Further Reading

Chubb, E.C., Colgate, J.E., and Peshkin, M.A.
ShiverPaD: a glass haptic surface that produces shear force on a bare finger, IEEE Transactions on Haptics 3, 3, July–Sept., 2010.

Ferris, T.K. and Sarter, N.
When content matters: the role of processing code in tactile display design, IEEE Transactions on Haptics 3, 3, July–Sept., 2010.

Jones, L.A. and Ho, H.-N.
Warm or cool, large or small? The challenge of thermal displays, IEEE Transactions on Haptics 1, 1, Jan.–June, 2008.

MacLean, K.E.
Putting haptics into the ambience, IEEE Transactions on Haptics 2, 3, July–Sept., 2009.

Ryu, J., Chun, J., Park, G., Choi, S., and Han, S.H.
Vibrotactile feedback for information delivery in the vehicle, IEEE Transactions on Haptics 3, 2, April–June, 2010.

Alex Wright is a writer and information architect who lives and works in Brooklyn, NY. Hong Z. Tan, Purdue University, contributed to the development of this article.

© 2011 ACM 0001-0782/11/0100 $10.00
Obituary
India’s Elephantine
Effort
An ambitious biometric ID project in the world’s second most
populous nation aims to relieve poverty, but faces many hurdles.
Despite India’s economic boom, more than a third of the country remains impoverished, with 456 million people subsisting on less than $1.25 per day, according to the most recent World Bank figures. Government subsidies on everything from food to fuel have tried to spread the nation’s wealth, but rampant corruption has made the redistribution pipeline woefully inefficient.
The “leakage” happens in part because the benefits aren’t directed at specific individuals, says Salil Prabhakar, a Silicon Valley-based computer scientist who is working as a volunteer for the World Bank as part of the
Unique ID (UID) project, a massive bio- A 95-year-old Indian man has his fingerprints scanned as part of the Unique ID project.
metrics initiative aimed at overhauling
the current system. “If I as the govern- tification, and to link that number fraught with problems, most of which
ment issue $1 of a benefit,” Prabhakar with the owner’s biometric data—all stem from the project’s sheer size, given
explains, “I don’t know where it’s go- 10 fingerprints, an iris scan, and a India’s population of 1.2 billion. “Bio-
ing. I just know that there’s a poor per- headshot (plus four hidden “virtual” metric systems have never operated on
son in some remote village, and I hope digits). Aadhaar’s national enrollment such a massive scale,” says Arun Ross,
it reaches them.” was launched in September, with the an associate professor of computer sci-
The current system relies on a chain goal of issuing 100 million ID numbers ence at West Virginia University.
of middlemen—many of them corrupt by March and 600 million within four One of the biggest challenges is de-
bureaucrats at various levels of govern- years. Like the Social Security number duplication. When a new user tries to
ment—who collectively siphon off 10% in the U.S., the number won’t guaran- enroll, the system must check for du-
or more of what’s due to the poor and tee government aid, but your biomet- plicates by comparing the new user’s
resell the goods and services on the rics will prove the UID is yours, letting data against all the other records in the
black market. For example, according you claim whatever benefits to which UID database. Hundreds of millions of
to Transparency International, officials you’re entitled. In theory, the result records make this a computationally
extracted $212 million in bribes alone should be the end of counterfeit ration demanding process, made all the more
from Indian households below the pov- cards and other fraud, as well as mak- so by the size of each record, which in-
erty line in 2007. ing it easier for hundreds of millions of cludes up to 12 higher-resolution im-
The UID project—helmed by the Indian adults to gain easier access to ages.
much-admired former Infosys Tech- banking services for the first time. And The demands continue each time
nologies CEO Nandan Nilekani and because the system will work nation- there’s an authentication request. “The
operated by Unique Identification Au- wide, Aadhaar should make it possible matching is extremely computation-
PHOTOGRA PH BY SA NJIT DAS/PANO S
thority of India, a government agen- for the poor to move without losing ally intensive,” says Prabhakar. At peak
cy—promises to be the first step in the benefits. The lower-income Indians times, the system must process tens of
solution. Recently renamed Aadhaar love the idea, says Prabhakar, who wit- millions of requests per hour while re-
(meaning “foundation” in Hindi), the nessed what he describes as “almost sponding in real time, requiring mas-
UID project plans to assign a unique a stampede” during a recent proof-of- sive data centers the likes of Google’s.
16-digit number to each citizen above concept enrollment. Achieving acceptable levels of accu-
the age of 18 who wants national iden- The full implementation, though, is racy at this scale is another major diffi-
ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 23
news
culty. Unlike passwords, biometrics nev- out a UID card, how is it voluntary?” The
er produce an exact match, so matching loss of civil liberties is too high a price to
always entails the chance of false ac- “The key issue,” pay for a system that she believes leaves
cepts and false rejects, but as the num- says Nalini Ratha, gaping opportunities for continued
ber of enrollments rises, so do the error corruption. “The guy handing out the
rates, since it becomes more likely that “is have I captured bags of rice could ask for a bribe even
two different individuals will share sim- enough variation so to operate the machine that scans the
ilar biometrics. Using a combination of fingerprints, or he could say that the
biometrics—instead of a single thumb- I don’t reject you, machine isn’t working,” says Jayaram.
print, for example—greatly improves and at the same time “And there’s every chance the machine
accuracy and deters impostors. (In the isn’t working. Or he could say, ‘I don’t
words of Marios Savvides, assistant re- I don’t match against know who you are and I don’t care; just
search professor in the department of everybody else?” pay me 500 rupees and I’ll give you a bag
electrical and computer engineering at of rice.’ All the ways that humans can
Carnegie Mellon University, “It’s hard subvert the system are not helped by
to spoof fingerprints, face, and iris all this scheme.”
at the same time.”) But using multiple Abraham suggests a more effective
biometrics requires extra equipment, way to root out fraud through biomet-
demands information fusion, and adds one type of attack—a fake finger, a fake rics would be to target the much small-
to the data processing load. mask, or something,” says IBM’s Ratha, er number of residents who own most
Other steps to improve accuracy also “but there are probably 10 other attacks of the country’s wealth, much of it ill-
bring their own challenges. “The key is- to a biometric system that can compro- gotten. “The leakage is not happening
sue,” says Nalini Ratha, a researcher at mise the system.” at the bottom of the pyramid,” he says.
the IBM Watson Research Center, “is For starters, when data is stored in “It’s bureaucrats and vendors and poli-
have I captured enough variation so I a centralized database, it becomes an ticians throughout the chain that are
don’t reject you, and at the same time attractive target for hackers. Another corrupt.”
I don’t match against everybody else?” vulnerability is the project’s reliance on Despite all the technical and social
Capturing the optimal amount of varia- a network of public and private “regis- challenges, Nandan Nilekani’s UID
tion requires consistent conditions trars”—such as banks, telecoms, and project is on course to provide 100 mil-
across devices in different settings—no government agencies—to collect bio- lion Indian residents with a Unique ID
easy feat in a country whose environ- metric data and issue UIDs. Though reg- by March. Will Nilekani’s UID scheme
ment varies from deserts to tropics and istrars might ease enrollment, they’re work? Only time will tell. “But if there’s
from urban slums to far-flung rural ar- not necessarily worthy of the govern- anybody in India who’s capable of
eas. “It’s almost like having many dif- ment’s trust. Banks, for example, have pulling it off, it’s him,” says Abraham.
ferent countries in a single country, bio- been helping wealthy depositors evade Meanwhile, the hopes of millions of In-
metrically speaking,” says Ross. taxes by opening fictitious accounts, dia’s poor are invariably tied to the proj-
The challenge isn’t just to reduce er- so entrusting the banks with biometric ect’s success.
rors—under some conditions, a biomet- devices doesn’t make sense, says Sunil
ric reader may not work at all. “If it’s too Abraham, executive director of the Cen- Further Reading
hot, people sweat and you end up with tre for Internet and Society in Banga-
Bolle, R.M., Connell, J.H., Pankanti, S.,
sweaty fingers,” says Prabhakar, “and if lore. “If I’m a bank manager, I can hack Ratha, N.K., Senior, A.W.
it’s too dry, the finger is too dry to make into the biometric device and introduce Guide to Biometrics. Springer, New York, NY,
good contact with the optical surface of a variation in the fingerprint because 2004.
the scanner.” Normalizing across varied the device is in my bank and the bio- Jain, A.K., Flynn, P. and Ross, A. (Eds.)
lighting conditions is essential, since all metric is, once it’s in the computer, Handbook of Biometrics. Springer, New
of the biometric data is optical. just an image sent up the pipe,” he says. York, NY, 2007.
India’s diverse population presents a Though careful monitoring could catch Pato, J.N and Millett, L.I. (Eds.)
whole other set of hurdles. Many of the such hacks, Abraham says that’s not re- Biometric Recognition: Challenges and
poor work with their hands, but manual alistic once you’ve got as many records Opportunities. The National Academies
Press, Washington, D.C., 2010.
labor leads to fingertips so callused or as Aadhaar will have.
dirty they can’t produce usable finger- Registrars may also make UIDs, Ramakumar, R.
prints. And some of the most unfor- which are officially voluntary, a de facto High-cost, high-risk, FRONTLINE 26, 16,
Aug. 1–14, 2009.
tunate residents are missing hands or requirement for services, especially in
eyes altogether. the current absence of a law governing Ross, A. and Jain, A.K.
Information fusion in biometrics, Pattern
how the data can be used. Such “func- Recognition Letters 24, 13, Sept. 2003.
Security Challenges tion creep” troubles privacy advocates
As if these problems weren’t enough, like Malavika Jayaram, a partner in the Marina Krakovsky is a San Francisco area-based
journalist and co-author of Secrets of the Moneylab: How
the UID system poses formidable se- Bangalore-based law firm Jayaram & Behavioral Economics Can Improve Your Business.
curity challenges beyond the threat of Jayaram, who says, “If every utility and
spoofing. “People get carried away by every service I want is denied to me with- © 2011 ACM 0001-0782/11/0100 $10.00
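
The scaling argument in the article (each new enrollee is compared against every existing record, and error rates climb as enrollment grows) can be made concrete with a little arithmetic. The sketch below is only an illustration under simplifying assumptions that are not from the article: every comparison is treated as independent, and the per-comparison false-match rate is a made-up figure. The function names are hypothetical and do not describe how Aadhaar works.

```python
import math


def total_dedup_comparisons(enrolled: int) -> int:
    """Pairwise checks needed if every new enrollee is compared with all earlier records."""
    # The k-th enrollee is checked against k - 1 existing records: 0 + 1 + ... + (n - 1).
    return enrolled * (enrolled - 1) // 2


def prob_any_false_match(false_match_rate: float, database_size: int) -> float:
    """Chance that one search of the database yields at least one false match.

    Assumes (hypothetically) that every comparison is independent and has the
    same false-match rate, so the result is 1 - (1 - rate) ** database_size.
    """
    # log1p/expm1 keep precision when the rate is tiny and the database is huge.
    return -math.expm1(database_size * math.log1p(-false_match_rate))


if __name__ == "__main__":
    ILLUSTRATIVE_RATE = 1e-9  # made-up per-comparison false-match rate
    for size in (1_000_000, 100_000_000, 600_000_000):
        print(size, total_dedup_comparisons(size),
              prob_any_false_match(ILLUSTRATIVE_RATE, size))
```

Under these assumptions the total comparison count grows roughly with the square of enrollment, and the chance of a false match in a single search rises steadily with database size, which is one way to read the article's point that combining several biometrics (lowering the per-comparison rate) matters more as the database grows.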
EMET Prize and Other Awards
Edward Felten, David Harel, Sarit Kraus, and others are honored for their contributions to computer science, technology, and electronic freedom and innovation.

Edward Felten, the U.S. Federal Trade Commission’s first Chief Technologist. Photograph: Princeton University, Office of Communications, Denise Applewhite.

The A.M.N. Foundation, Franklin Institute, Electronic Frontier Foundation, and other organizations recently recognized leading computer scientists and technologists.

EMET Prize
David Harel and Sarit Kraus were honored with EMET Prizes by the A.M.N. Foundation for the Advancement of Science, Art, and Culture in Israel for excellence in the computer sciences. Harel, a professor of computer science at the Weizmann Institute of Science, was recognized for his studies on a wide variety of topics within the discipline, among them logic and computability, software and systems engineering, graphical structures and visual languages, as well as modeling and analysis of biological systems.

Kraus, a professor of computer science at Bar-Ilan University, was recognized for her expertise in the field of artificial intelligence, along with her significant contributions to the field of autonomous agents, and studies in the field of multiagent systems.

[...] four individuals who are extending freedom and innovation in the digital world. The honorees are:

Steven Aftergood, who directs the Federation of American Scientists Project on Government Secrecy, which [...]

[...] which has enabled programmers and engineers to educate lawyers on technology relevant to legal cases of significance to the Free and Open Source community, and which, in turn, has taught technologists about the workings of the legal system; and

Hari Krishna Prasad Vemuru, a security researcher in India, who revealed security flaws in India’s paperless electronic voting machines, and endured jail time and political harassment to protect an anonymous source who enabled him to conduct the first independent security review of India’s e-voting system.

Franklin Institute Laureate
John R. Anderson, R.K. Mellon University Professor of Psychology and Computer Science at Carnegie Mellon University, was named a 2011 Laureate by the Franklin Institute and awarded the Benjamin Franklin Medal in Computer and Cognitive Science “for the development of the first large-scale computational theory of the process by which humans perceive, learn, and reason, and its application to computer tutor- [...]
10 Years of Celebrating Diversity in Computing
2011 Richard Tapia Celebration of Diversity in Computing Conference
April 3-5, 2011
San Francisco, California
http://tapiaconference.org/2011/

Since 2001, the Tapia Celebration of Diversity in Computing has served as a leading forum for bringing together students, professors and professionals to discuss and strengthen their passion and commitment to computing. The 2011 program will include featured speakers who are exemplary leaders and rising stars in academia and industry, such as:

• Irving Wladawsky-Berger, former chair of the IBM Academy of Engineering and the 2001 HENAAC Hispanic Engineer of the Year, will give the Ken Kennedy Memorial Lecture on “The Changing Nature of Research and Innovation in the 21st Century.”
• Deborah Estrin, the Jon Postel Professor of Computer Science at UCLA and a member of the National Academy of Engineering, will talk on “Participatory Sensing: from Ecosystems to Human Systems.”
• Alan Eustace, Senior Vice President of Engineering and Research at Google, will give an after dinner talk entitled “Organizing the World’s Information.”
• Ayanna Howard, Associate Professor in the ECE School at Georgia Tech who Technology Review selected as a 2003 Young Innovator, will give the talk “SnoMotes - Robotic Scientific Explorers for Understanding Climate Change.”

The Tapia Conference 2011 will continue past popular sessions, including the Student Poster Session, Resume and Early Career Advice Workshops, Town Hall Meeting, Banquet, and the Doctoral Consortium, a daylong program designed to help equip students for the grueling challenge of finishing their doctorates. There will also be attendee-proposed BOFs and panels. A new program will connect students with computing professionals from around the San Francisco Bay Area, opening the door to future opportunities. A special outing will take in the sights of San Francisco. Conference program news and registration information can be found at: http://tapiaconference.org/2011/

Tapia Conference 2011 supporters include: Google (Platinum); Intel (Gold); Cisco, Microsoft and NetApp (Silver); Symantec (Bronze); Amazon, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, and the National Center for Atmospheric Research (Supporter). The Tapia Conference 2011 is organized by the Coalition to Diversify Computing and is co-sponsored by the Association for Computing Machinery and the IEEE Computer Society, in cooperation with the Computing Research Association.
Take Advantage of
ACM’s Lifetime Membership Plan!
◆ ACM Professional Members can enjoy the convenience of making a single payment for their
entire tenure as an ACM Member, and also be protected from future price increases by
taking advantage of ACM's Lifetime Membership option.
◆ ACM Lifetime Membership dues may be tax deductible under certain circumstances, so
becoming a Lifetime Member can have additional advantages if you act before the end of
2010. (Please consult with your tax advisor.)
◆ Lifetime Members receive a certificate of recognition suitable for framing, and enjoy all of
the benefits of ACM Professional Membership.
viewpoints

“You want to know how to get my attention?” Jason Kalich asked the audience rhetorically. “First off, don’t bring me a good idea; I’ve already got plenty of good ideas.” Kalich, the general manager of Microsoft’s Relationship Experience Division, was participating in the keynote panel at the Quest Conference in Chicago.a The three industry experts on the panel were addressing the question asked by the moderator Rebecca Staton-Reinstein: “How can I get my manager’s buy in (to software quality and process change initiatives)?” The audience consisted of several hundred software professionals, most of them employed in the areas of software quality, testing, and process management. Kalich had clearly given the topic a lot of thought and he warmed to the theme: “Don’t even bring me cost savings. Cost savings are nice, but they’re not what I’m really interested in.” He paused for emphasis. “Bring me revenue growth and you’ve got my ear. Bring me new value, new products, new customers, new markets: then you’ve got my attention, then you’ve got my support. Don’t bring me a good idea. Not interested.”

Sponsorship
Obtaining sponsorship for software development process changes is essential. The first of Watts Humphrey’s Six Rules of Process Change is “start at the top”: get executive sponsorship for whatever change you are trying to make.1 Without solid and continuing executive commitment to support changes they usually wither on the vine. But just how does a software quality professional get this support? Kalich and the other panelists were adamant that a good idea, even when supported by possible cost savings, just doesn’t cut it in the current economic climate.

What the panel was saying is that good ideas are just that. And cost reduction, while valuable, tends to be quite incompressible: once the first 10%–20% of cost savings are achieved, further savings usually become increasingly difficult to get. Reducing costs is like compressing a spring: it may require more and more energy for less and less movement.

I looked around the audience and, while there were nods of understanding, there were also many blank stares as people tried to figure out: How can I turn my process initiative into a profit center? Making money is not a typical goal of process change as it is usually practiced which, according to the panel, might be why it doesn’t always get the support it might.

But how to actually do it? That afternoon, I attended a presentation that showed how it can be done, and what critical success factors are needed to make it work.

SmartSignal
“Predictive Analytics is a really complex data set,” said George Cerny, “our systems predict the possible failure of commercial aircraft, power stations, and oil rigs sometimes weeks before a failure might actually occur.” Cerny is the quality assurance manager at SmartSignal,b an Illinois-based data analytics company.

To manage predictive analytics, large and complex systems must be instrumented and enormous amounts of complicated data must be collected from many different sources: pumps, power meters, pressure switches, maintenance databases, and other devices. Sometimes data is collected in real time, sometimes it is batched. Simple data is monitored for threshold conditions and complex interactive data is analyzed for combinational conditions. The analysis system must recognize patterns that indicate the future possibility of component, subsystem, or systemic failure and what the probability of that failure might be. And then it needs to report what it finds. Sometimes these reports are large and detailed; sometimes they are urgent and immediate.

“But before all this happens, the analytic system must be set up.” Cerny said. “This setup was manual and data-entry intensive. A single power station might have hundreds of items of equipment that need to be monitored. Each item might have hundreds of measurements that must be taken over short, medium, and long timeframes. Each measurement might be associated with many similar or different measurements on the same device or on other equipment.” The screen flashed with list after list of data items. “So how could we test this? How could we make sure the system works before we put it in?”

Testing a System
Testing is the interaction of several knowledge-containing artifacts, as shown in the accompanying figure. Some of these artifacts must be in executable software form, but many others are often in a paper format and are processed manually.

[Figure: knowledge-containing artifacts in testing. Specification and Design Documents (knowledge: specified/designed behavior and functionality); Expected Test Results (knowledge: expected behavior and functionality).]

Cerny described this: “We realized early we had to test using virtual machines, but how could we test these? And how could we ensure scalability with both the numbers and the complexities of environments and inputs?” To the testing group at SmartSignal test automation was clearly a good idea. But how to get sponsorship for this good idea?

Jim Gagnard, CEO of SmartSignal, put it this way: “We are a software company whose products measure quality and everything is at risk if we aren’t as good as we can be in everything we do. Leaders can help define and reinforce the culture that gets these results, but if it’s not complemented with the right people who truly own the issues, it does not work.”

Dave Bell, vice president of Application Engineering and Stacey Kacek, vice president of Product Development at SmartSignal, concurred. “We always have to be looking to replace

a http://www.qaiquest.org/chicago/index.html
b http://www.smartsignal.com
When the dot-com boom began in the late 1990s, many analysts and observers proclaimed the death of intermediation. Supply chains seemed to become shorter and shorter as new B2C companies emerged in Silicon Valley. These companies could deal with their customers directly over the Internet, rendering distributors, wholesalers, brokers, and agents superfluous.

While some traditional middlemen have indeed become less important as Internet commerce has developed, we have not seen a general death of intermediation. Rather, many new intermediaries have arisen on the digital landscape over the last 15 years. Just think of Amazon, eBay, or Google. If all these companies have been successful, it is not because they have removed all barriers between producers and consumers. They have been successful because they offer innovative services located between producers and consumers along the digital supply chain.

The law often has a difficult time coping with new intermediaries. Should an Internet service provider be held liable for violations of copyright or criminal law committed by its customers? Is Yahoo obliged to prevent [...]istries required to check domain name registrations for trademark violations? Is eBay liable for counterfeit product sales on its site? To what extent should Google be allowed to offer excerpts from copyrighted books in its Google Book service without the consent of the relevant rights owners?

Both in the U.S. and in Europe, such questions have led to countless lawsuits and legislative initiatives over the last 15 years. One of the most debated issues in recent years has been whether Google is violating trademark law by operating its AdWords system. With Google AdWords, advertisers can buy advertising links in the “sponsored links” section of a Google search results page. When a user enters a keyword selected by the advertiser, the advertising link will appear in the upper right-hand corner of the search results page. In principle, the advertiser is free to select any keyword for his advertising link. This becomes a legal issue, however, if the advertiser chooses a

Photograph by Gwenaël Piaser
[...] Google was not using the LV trademark in its AdWords system in a manner covered by European trademark law. The idea behind this is simple. Trademark law does not entitle a trademark owner to prevent all utilization of his trademark by a third party. In the view of the court, Google is merely operating a service that may enable advertisers to engage in trademark violations. Google does not decide which trademarks to use as keywords, but merely provides a keyword selection service. This is not sufficient, in the view of the court, to justify an action for direct trademark infringement.

However, Google might still be liable for what lawyers call secondary infringement. The argument would be that, if advertisers actually infringe trademark law because they create customer confusion in the AdWords system, Google is benefiting financially from these trademark violations. While this argument may sound convincing at first sight, the European E-Commerce Directive of 2000 restricts the liability of “information society service providers” (such as, potentially, Google) for infringing activities by third parties (the advertisers). Therefore, the European Court of Justice had to decide whether the safe harbor provisions of this directive shielded Google from secondary liability. The European Court of Justice held that the answer to this question depends on whether the Google AdWords system is a mere automatic and passive system, as portrayed by Google, or whether Google plays an active role in selecting and ordering advertisements. As in the customer confusion question, the court refrained from giving any definite answer, but rather referred the case back to the French courts.

In the popular press, the European Court of Justice’s decision in the Google AdWords case has often been portrayed as a victory for Google. Does victory really look like this? Well, it depends. The European Court of Justice refrained from providing a final answer as to whether keyword advertising can lead to customer confusion. Nor did it provide a comprehensive answer as to whether Google could be held liable not because of customer confusion, but because other goals of trademark protection had been violated. Finally, the court did not give a definite answer as to whether Google should be protected by safe harbors provisions. For most of these questions, the European Court of Justice provided some general guidelines, but left it to the national courts to rule on details which may be small, but decisive. Therefore, in Europe, it will ultimately be the national courts which will decide on the liability of Google for its AdWords system. We still lack a clear answer on how to design a keyword-backed advertisement system in a way that clearly does not violate European trademark law.

Indecisive Decision
This does not mean that one should feel sorry for Google which still has to operate in an area of somewhat unsettled law. First, Google has some experience in this regard. Just think of the Google Books project. Second, Google has been running its AdWords service in the U.S. for years, and in the U.S. the liability question is still not fully settled. In 2009, the Court of Appeals for the Second Circuit held that Google was using trademarks “in commerce” (as required by the Lanham Act) when operating its AdWords system,c thereby taking a slightly different stance from that of the European Court of Justice. The impact of this decision on Google AdWords in the U.S. remains to be seen. At least, courts in the U.S. will now examine more closely whether unauthorized trademark-backed advertising links in Google AdWords can really cause confusion among consumers. Up to now, most U.S. courts have denied Google’s liability on such grounds.

Third, as a result of the decisions by the European Court of Justice relating to the AdWords system, Google revised its European AdWords trademark policy in September 2010 and limited its support for trademark owners. Under the new policy, advertisers are free to select trademarks when registering advertising links. However, if a trademark owner discovers that an advertiser is using his trademark without proper authorization, Google will remove the advertising link if the trademark is being used in a confusing manner, for example if it falsely implies some affiliation between the advertiser and the trademark owner. By this policy change, Google has mollified at least some trademark owners and provided a mechanism outside the court system that may resolve a substantial proportion of AdWords trademark disputes in Europe. Nevertheless, it is almost certain that national courts in Europe will continue to rule on the details of how the AdWords trademark policy is implemented and enforced.

Conclusion
In the end, the decision by the European Court of Justice may indeed turn out to be a victory for Google. Whether it is a victory for the European trademark system is less clear. While the European Court of Justice provided some general guidelines on Google AdWords, the task of working out the little details has been left to courts in Paris, Vienna, Karlsruhe, The Hague, London and other cities. The danger is that national courts will continue to interpret European trademark law in different ways. French courts, for example, may continue to be more critical of Google AdWords in their decisions than German or U.K. courts. This is not exactly the idea of a trademark system which is supposed to be harmonized across Europe by the institutions of the European Union.

c Rescuecom Corp. v. Google, Inc., 562 F.3d 123 (2009). This decision did not rule on the ultimate question of Google’s liability, as the Court of Appeals remanded the case back to the district court for further proceedings. In March 2010, the parties settled their dispute out of court.

Stefan Bechtold (sbechtold@ethz.ch) is Associate Professor of Intellectual Property at ETH Zurich and a Communications Viewpoints section board member.

Copyright held by author.
V
viewpoints
Technology Strategy
and Management
Reflections on the Toyota Debacle
A look in the rearview mirror reveals system and process blind spots.
V
a rio us e xper t s in indus-
try and academia have long
recognized that Toyota,
founded in 1936, is one of
the finest manufacturing
companies the world has ever seen.a
Over the past 70-plus years, Toyota
has evolved unique capabilities in
manufacturing, quality control, sup-
ply-chain management, and product
engineering, as well as sales and mar-
keting. It began perfecting its famous
Just-in-Time or “Lean” production sys-
tem in 1948. I am a longtime observer
(and customer) of Toyota, and have re-
cently tried to understand how such a
renowned company could experience
the kinds of quality problems that
generated numerous media headlines Example of an unsecured driver-side floor mat trapping the accelerator pedal in a 2007
during 2009–2010.b Lexus ES350.
First, to recount some of the facts:
Between 1999 and 2010, at least 2,262 pedal instead of the brake) appear to be vehicles such as the Prius and Lexus
Toyota vehicles sold in the U.S. experi- the result of sticky brake pedals (eas- hybrids, which were also involved in
enced unintended cases of rapid accel- ily fixed with a metal shim to replace the complaints. Toyota also encoun-
eration and are associated with at least a plastic component) as well as loose tered other quality problems that it
815 accidents and perhaps as many as floor mats that inadvertently held down mostly kept out of the headlines—in
102 deaths. The incidents that were not the gas pedal.c Another possible cause particular, dangerous corrosion in the
due to driver error (stepping on the gas is the software that controls the engine frames of Tacoma and Tundra pickup
and braking functions, particularly in trucks sold in North America between
a See, for example, J. Womack et al., The Machine 1995 and 2000, apparently due to im-
that Changed the World (1990); or J. Liker, The c There are numerous reports on the Toyota proper antirust treatment. Toyota did
Toyota Way (2003). problem in the media and information avail- not recall these trucks, but silently
b My first book, The Japanese Automobile Indus- able from Toyota directly. A particularly de-
try (1985), presented a history of how the Just- tailed early document is Toyota Sudden Un-
bought them back from consumers.d
PHOTOGRA PH BY A P Photo/NH TSA
ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 33
viewpoints
CACM_TACCESS_one-third_page_vertical:Layout 1 6/9/09 1:04 PM Page 1
There have also been some minor complaints about the driving mechanisms in the Corolla and Camry models, and stalling in some Corolla models. Overall, during a 12-month period, Toyota recalled some 10 million vehicles through August 2010, an extraordinary number given that the company sold only approximately seven million vehicles during this same period.2
In the software business, producers and consumers are accustomed to product defects and an occasional
ment. These systems and managerial processes also reflect intangible corporate values such as what kind of commitment the organization has to quality and customer satisfaction.
The Toyota production system does not seem to be the cause of the quality problems experienced over the prior decade. In the past, Toyota has exhibited a significant advantage over its mass-producer competitors in physical and value-added productivity. The competition has improved, but it is unlikely
velopment to make sure their vehicles work properly under all conditions. Whether the component comes from an in-house Toyota factory or a supplier makes no difference. Toyota engineers are responsible.
In terms of general management, such as of the supply chain and overall quality, Toyota clearly failed to live up to its historical standards. Executives within the company admit they overstretched their managerial resources and overseas supply chain in the push to overtake General Motors as the world’s largest automaker, which Toyota finally did in 2009. More specifically, the quality problems appear connected to overly rapid expansion of production and parts procurement outside Japan, particularly given the decision to use a different brake pedal. In the past, Toyota manufactured new models in Japan initially for a couple of years, using carefully tested Japanese parts, and only then did it move production of the best high-volume models to overseas factories. Over the last decade, by contrast, Toyota ramped up overseas production of new and old models with new suppliers much more quickly and, apparently, with inadequate stress testing.
Also at the management level, Toyota executives seem to have paid increasingly less attention to product and process details. It may well be that Toyota managers as well as staff engineers believed their company had already reached such a high level of perfection that there was nothing much to worry about. But automobiles are themselves very complex systems, with lots of hardware and software, and as many as 15,000 discrete components. It is not surprising that some things go wrong and recalls are common in the industry. Other automakers over the past year recalled more than 10 million vehicles, not counting the Toyota recalls.2 In the grand scheme of things, moreover, the number of accidents and even deaths attributed to Toyota are not so large compared to what other companies have experienced. For example, Ford had a massive recall in 2000 of some 13 million faulty tires made by Firestone and fitted on its Explorer SUVs, reportedly resulting in over 250 deaths and 3,000 catastrophic injuries.1,4 Nonetheless, Toyota redefined mass production and built its reputation around quality and reliability by paying attention to details, large and small. The recent slew of recalls definitely indicates something changed for the worse in the company.
What shocked me most was that the quality lapses seemed to take Toyota’s senior managers by such surprise. CEO Akio Toyoda, and other senior executives in the U.S. and Japan, admitted to having little or no information about these quality issues, which first surfaced in Europe. They were unprepared to explain the source or nature of the problems, to themselves or to the global media. Toyota also made its predicament worse by responding much too slowly to customer complaints and allowing bad news to leak out sporadically, while executives continued to deny, at least initially, that there was a real problem.
Companies with true staying power fix their problems and recover from their mistakes. Here, Toyota has not disappointed us. By the fall of 2010, Toyota managers and dealers had gotten their act together and were working hard to rebuild customer confidence. The problems seemed mostly contained to the pedals and floor mats, though Toyota also upgraded some of the software in its hybrid vehicles. Service technicians worked overtime for months to fix recalled vehicles. Sales and profits recovered. And Toyota now recalls any vehicle immediately with even the slightest hint of a problem.
Technology and Management Lapses and Lessons
The Toyota debacle offers many lessons about technology and management. But one observation is that, although we can learn a lot about best practices from looking at exemplar firms and their unique processes, like Just-in-Time production, we also need to have some perspective. An enduring management principle that truly differentiates firms over the long haul must also be separable from the experience of any particular firm, including the originator. This sounds like a contradiction but it is not. Every company, market, and country will experience ups and downs. Even the best firms are likely to decline at least a little as competitors catch up or when managers lose their focus. Moreover, success often brings with it the potential seeds of decline, such as increases in the size, complexity, and global scale of operations, which can be much more difficult to manage. In this case, Toyota’s quality problems in 2009–2010 do not mean the principles of “lean production” or lean management more generally are any less valuable to managers. What managers need to understand are the limitations of any best practice as well as the potential even for great companies to lose their focus and attention to detail, at least temporarily.
The best outcome for Toyota will be for managers, engineers, and other employees to reflect deeply on what happened to them and use these insights to create an even stronger company. They should become better able to handle adversity and change in the future because they now know what failure looks like. The Toyota way used to be that one defect was too many. That is the kind of thinking that Toyota seems to be regaining.
References
1. Ackman, D. Tire trouble: The Ford-Firestone blowout. Forbes.com (June 20, 2001).
2. Bunkley, N. and Vlasic, B. Carmakers initiating more recalls voluntarily. The New York Times (Aug. 24, 2010); http://www.nytimes.com/2010/08/25/automobiles/25recall.html
3. Welch, W. Toyota plunges to 21st in auto-quality survey; Ford makes Top 5, Bloomberg (June 17, 2010); http://www.bloomberg.com/news/2010-06-17/toyota-plunges-to-21st-in-j-d-power-quality-survey-ford-makes-top-five.html.
4. Wikipedia. Firestone and Ford tire controversy. Wikipedia.com.
Michael A. Cusumano (cusumano@mit.edu) is a professor at the MIT Sloan School of Management and School of Engineering and author of Staying Power: Six Enduring Principles for Managing Strategy and Innovation in an Uncertain World (Oxford University Press, 2010).
Copyright held by author.
Viewpoint
Cloud Computing Privacy Concerns on Our Doorstep
Privacy and confidentiality issues in cloud-based conference management systems reflect more universal themes.
Illustration by Gary Neill.
Cloud computing means entrusting data to information systems that are managed by external parties on remote servers “in the cloud.” Webmail and online documents (such as Google Docs) are well-known examples. Cloud computing raises privacy and confidentiality concerns because the service provider necessarily has access to all the data, and could accidentally or deliberately disclose it or use it for unauthorized purposes.
Conference management systems based on cloud computing represent an example of these problems within the academic research community. It is an interesting example, because it is small and specific, making it easier to explore the exact nature of the privacy problem and to think about solutions. This column describes the problem, highlights some of the possible undesirable consequences, and points out directions for addressing it.
Conference Management Systems
Most academic conferences are managed using software that allows the program committee (PC) members to browse papers and contribute reviews and discussion via the Web. In one arrangement, the conference chair downloads and hosts the appropriate server software, say HotCRP or iChair. The benefits of using such software are familiar:
• Distribution of papers to PC members is automated, and can take into account their preferences and conflicts of interest;
• The system organizes the collection and distribution of reviews and discussion, can rank papers according to scores, and send out reminder email, as well as email notifications of acceptance or rejection; and
• It can also produce a range of other reports, such as lists of sub-reviewers, acceptance statistics, and the conference program.
HotCRP and iChair require the conference chair to download and install software, and to host the Web server. Other systems such as EasyChair and EDAS work according to the cloud computing model: instead of installing and hosting the server, the conference chair simply creates the conference account “in the cloud.” In addition to the benefits described previously, this model has extra conveniences:
• The whole business of managing the server (including backups and security) is done by someone else, and gains economy of scale;
• Accounts for authors and PC members exist already, and don’t have to be managed on a per-conference basis;
• Data is stored indefinitely, and reviewers are spared the necessity of keeping copies of their own reviews; and
• The system can help complete forms such as the PC member invitation form and the paper submission form by suggesting likely colleagues based on past collaboration history.
For these reasons, EasyChair and EDAS are an immense contribution to the academic community. According to its Web page, EasyChair hosted over 3,300 conferences in 2010. Because of its optimizations for multiconferences and multitrack conferences, it is mandated for conferences and workshops that participate in the Federated Logic Conference (FLoC), a huge multiconference that attracts approximately 1,000 paper submissions.
Data Privacy Concerns
Accidental or deliberate disclosure. A privacy concern with cloud-computing-based conference management systems such as EDAS and EasyChair arises because the system administrators are custodians of a huge quantity of data about the submission and reviewing behavior of thousands of researchers, aggregated across multiple conferences. This data could be deliberately or accidentally disclosed, with unwelcome consequences.
• Reviewer anonymity could be compromised, as well as the confidentiality of PC discussions;
• The acceptance success records could be identified, for individual researchers and groups, over a period of years; and
• The aggregated reviewing profile (fair/unfair, thorough/scant, harsh/undiscerning, prompt/late, and so forth) of researchers could be disclosed.
The data could be abused by hiring or promotions committees, funding and award committees, and more generally by researchers choosing collaborators and associates. The mere existence of the data makes the system administrators vulnerable to bribery, coercion, and/or cracking attempts. If the administrators are also researchers, the data potentially puts them in situations of conflict of interest.
The problem of data privacy in general is of course well known, but cloud computing magnifies it. Conference data is an example in our backyard. When conference organizers had to install the software from scratch, there was still a risk of breach of confidentiality, but the data was just about one conference. Cloud computing solutions allow data to be aggregated across thousands of conferences over decades, presenting tremendous opportunities for abuse if the data gets into the wrong hands.
Beneficial data mining. In addition to the abuses of conference review data described here, there are some uses that might be considered beneficial. The data could be used to help detect or prevent fraud or other kinds of unwanted behavior, for example, by identifying:
• Researchers who systematically unfairly accept each other’s papers, or rivals who systematically reject each other’s papers, or reviewers who reject a paper and later submit to another conference a paper with similar ideas; and
• Undesirable submission patterns and behaviors by individual researchers (such as parallel or serial submissions of the same paper; repeated paper withdrawals after acceptance; and recurring content changes between submitted version and final version).
The data could also be used to understand and improve the way conferences are administered. ACM, for example, could use the data to construct quality metrics for its conferences, enabling it to profile the kinds of authors who submit, how much “new blood” is entering the community, and how that changes over different editions of the conference. This could help identify conferences that are emerging as dominant, or others that have outlived their usefulness.
The decisions about who is allowed to mine the data, and for what purposes, are difficult. Policies should be decided transparently and by consensus, rather than being left solely to the de facto data custodians.
Ways Forward
Policies and legislation. An obvious first step is to articulate clear policies that circumscribe the ways in which the data is used. For example, a simple policy might be that the data gathered during the administration of a conference should be used only for the management of that particular conference. Adherence to this policy would imply that the data is deleted after the conference, which is not done in the case of EasyChair (I don’t know if it is done for EDAS). Other policies might allow wider uses of the data. Debate within different academic communities can be expected to yield consensus about which practices are to be allowed in a discipline, and which ones not. For example, some communities may welcome plagiarism detection based on previously reviewed submissions, while others may consider it useless for their subject, or simply unnecessary.
Since its inception in 2002 and up to the time of writing, EasyChair has appeared not to have any privacy policy, or any statement about the purposes and possible uses of the data it stores. There is no privacy policy linked from its main page, and a search for “privacy policy” (or similar terms) restricted to the domain “easychair.org” does not yield any results. I have been told that new users are presented with a privacy statement at the time of first signing up to EasyChair. I did not create a new account to test this; regardless, the privacy statement is not linked from anywhere or later findable via search. EDAS does have an easily accessed privacy policy, which (while not watertight) appears to comply with the “use only for this conference” principle.
Another direction would be to try to find alternative custodians for the data: custodians that are not themselves also researchers participating actively in conferences. The ACM or IEEE might be considered suitable, although they contribute to decisions about publications and appointments of staff and fellows. Professional data custodians such as Google might also be considered. It may be difficult to find an ideal custodian, especially if cost factors are taken into account.
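As a minimal sketch of the “use only for this conference” policy described above, and not a description of how EasyChair or EDAS actually behave, the two rules it implies (purpose limitation and deletion after the conference) could be expressed as follows. The record layout, the names `ConferenceRecord`, `may_access`, and `purge_expired`, and the 90-day grace period are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ConferenceRecord:
    """One stored item (a review, a score, a comment). Hypothetical layout."""
    conference_id: str
    kind: str
    conference_end: date


def may_access(record: ConferenceRecord, requesting_conference_id: str) -> bool:
    """Purpose limitation: data is usable only by the conference that gathered it."""
    return record.conference_id == requesting_conference_id


def purge_expired(records, today: date, grace_days: int = 90):
    """Retention limit: keep only records whose conference ended within the grace period."""
    cutoff = today - timedelta(days=grace_days)
    return [r for r in records if r.conference_end >= cutoff]


# Illustrative data only; the conference identifiers are made up.
records = [
    ConferenceRecord("conf-2009", "review", date(2009, 7, 1)),
    ConferenceRecord("conf-2010", "review", date(2010, 11, 15)),
]
kept = purge_expired(records, today=date(2011, 1, 1))
```

A policy that allowed wider uses of the data would replace the equality test in `may_access` with whatever rule the community had agreed on; the point of the sketch is only that such a rule can be stated precisely enough to be checked.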
In most countries, legislation exists to govern the protection of personal data. In the U.K., the Data Protection Act is based on eight principles, including the principle that personal data is obtained only for specified purposes and is not processed in a manner incompatible with the purposes; and the principle that the data is not kept longer than is necessary for the purposes. EasyChair is hosted in the U.K., but the lack of an accessible purpose statement or evidence of registration under the Act means I was unable to determine whether it complies with the legislation. The Data Protection Directive of the European Union embodies similar principles; personal data can only be processed for specified purposes and may not be processed further in a way incompatible with those purposes.
Processing encrypted data in the cloud. Policies are a first step, but alone they are insufficient to prevent cloud service providers from abusing the data entrusted to them. Current research aims to develop technologies that can give users guarantees that the agreed policies are adhered to. The following descriptions of research directions are not exhaustive or complete.
Progress has been made in encryption systems that would allow users to upload encrypted data, and allow the service providers to perform computations and searches on the encrypted data without giving them the possibility of decrypting it. Although such encryption has been shown possible in principle, current techniques are very expensive in both computation and bandwidth, and show little sign of becoming practical. But the research is ongoing, and there are developments all the time.
Hardware-based security initiatives such as the Trusted Platform Module and Intel’s Trusted Execution Technology are designed to allow a remote user
and conference management software
search will be needed before a usable
cations, and might not require a great deal of processing to be performed on the server side. In that case, encrypting the data before sending it to the cloud may be realistic. It would require keys to be managed and shared among users in a practical and efficient way, and the necessary computations to be done in a browser plug-in. It is worthwhile to investigate whether this arrangement could work for conference management software.
Conclusion
Many people with whom I have discussed these issues have argued that the professional honor of data custodians (and PC chairs and PC members) is sufficient to guard against the threats I have described. Indeed, adherence by professionals to ethical behavior is essential to ensure all kinds of confidentiality. In practice, system administrators are able to read all the organization’s email, and medical staff can browse celebrity health records; we trust our colleagues’ sense of honor to ensure these bad things don’t happen.
But my standpoint is that we should still try to minimize the extent to which we rely on people’s sense of good behavior. We are just at the beginning of the digital era, and many of the solutions we currently accept won’t be considered adequate in the long term.
The issues raised about cloud-computing-based conference management systems are replicated in numerous other domains, across all sectors of industry and academia. The problem of accumulations of data on servers is very difficult to solve in any generality. The particular instance considered here is interesting because it may be small enough to be solvable, and it is also within the control of the academic community that will directly benefit (or suffer) according to the solution we adopt.
Thanks to Henning Schulzrinne, administrator of EDAS, for interesting and constructive comments. I also benefited
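The idea raised under “Processing encrypted data in the cloud,” that a provider might search data it cannot read, can be illustrated with a far simpler technique than the general encryption schemes the column refers to: keyed-hash search tokens, sometimes called a blind index. The sketch below is an assumption-laden illustration, not a scheme proposed in the column. The names `keyword_token` and `BlindIndexServer` and the paper identifiers are hypothetical, the stored records themselves would still need to be encrypted separately, and the approach supports only exact keyword matches while revealing which records share a keyword.

```python
import hashlib
import hmac
import secrets


def keyword_token(key: bytes, keyword: str) -> str:
    """Client side: derive an opaque search token from a keyword with HMAC-SHA256."""
    normalized = keyword.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


class BlindIndexServer:
    """Server side: stores only opaque tokens and record IDs, never keywords or keys."""

    def __init__(self):
        self._index = {}  # token -> set of record IDs

    def add(self, record_id: str, tokens):
        for token in tokens:
            self._index.setdefault(token, set()).add(record_id)

    def search(self, token: str):
        return sorted(self._index.get(token, set()))


# Hypothetical usage: the key stays with the program committee, not the provider.
key = secrets.token_bytes(32)
server = BlindIndexServer()
server.add("paper-17", [keyword_token(key, w) for w in ("privacy", "cloud")])
server.add("paper-42", [keyword_token(key, w) for w in ("compilers",)])

matches = server.search(keyword_token(key, "Privacy"))
other_key_matches = server.search(keyword_token(secrets.token_bytes(32), "privacy"))
```

Here `matches` contains only `paper-17`, and a token derived under a different key finds nothing. The key-management question the column raises remains open in this sketch: every PC member who needs to search must somehow hold `key`, and the server never should.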
Interview
An Interview with Frances E. Allen
Frances E. Allen, recipient of the 2006 ACM A.M. Turing Award, reflects on her career.
ACM Fellow Frances E. Allen, recipient of the 2006 ACM A.M. Turing Award and IBM Fellow Emerita, has made fundamental contributions to the theory and practice of program optimization and compiler construction over a 50-year career. Her contributions also greatly extended earlier work in automatic program parallelization, which enables programs to use multiple processors simultaneously in order to obtain faster results. These techniques made it possible to achieve high performance from computers while programming them in languages suitable to applications. She joined IBM in 1957 and worked on a long series of innovative projects that included the IBM 7030 (Stretch) and its code-breaking coprocessor Harvest, the IBM Advanced Computing System, and the PTRAN (Parallel Translation) project. She is an IEEE Fellow, a Fellow of the Computer History Museum, a member of the American Academy of Arts and Sciences, and a member of the U.S.
Fran Allen on CS: “It’s just such an amazing field, and it’s changed the world, and we’re just at the beginning…” (Photograph by Frank Becerra, Jr. / The Journal News.)
Your first compiler work was for IBM
ideas and technologies they put into
the look-ahead unit was a phenomenal piece of hardware.

How many copies of Stretch were built?
Eight or nine. The original was built for Los Alamos and shipped late. Then they discovered its performance was about half of what was intended.

But still, 50 times…
Meanwhile, the underlying technology had changed. T.J. Watson got up at the Spring Joint Computer Conference and announced they would not build any more Stretch machines, and apologized to the world about our failure. But it was recognized later that the technology developed in building Stretch made a huge difference for subsequent machines, particularly the 360. A lot of people went from Stretch to the 360, including Fred Brooks.

What was your connection with Stretch?
My role was on the compiler. When I joined IBM in 1957, I had a master’s degree in mathematics from the University of Michigan, where I had gone to get a teaching certificate to teach high school math. But I had worked on an IBM 650 there, so I was hired by IBM Research as a programmer. My first assignment was to teach FORTRAN, which had come out in the spring of that year.

Did you already know FORTRAN, or were you learning it a week ahead, as professors often do?
Yeah, a week ahead [laughs]. They had to get their scientists and researchers to use it if they were going to convince customers. I had a pretty unhappy class, because they knew they could do better than any high-level language could.

Did you win them over?
Yes, and won myself over. John Backus, who led the FORTRAN project, had set two goals from the beginning: programmer productivity and application performance. I learned all about the compiler as part of teaching this course.

Did you ever work on that compiler yourself?
I was reading the code in order to do the training. It set the way I thought about compilers. It had a parser, then an optimizer, then a register allocator. The optimizer identified loops, and they built control flow graphs.
The Stretch group recognized that the compiler was going to be an essential part of that system. A bunch of us in research were drafted to work on it. The National Security Agency [NSA] had a contract with IBM to build an add-on to Stretch, for code-breaking. Stretch would host the code-breaking component, and there was a large tape device, tractor tape, for holding massive amounts of data.

This was Stretch Harvest?
Yes. There was going to be one compiler for Stretch Harvest that would take FORTRAN, and the language I was working on with NSA for code-breaking, called Alpha, and also Autocoder, which was similar to COBOL.

A single compiler framework to encompass all three languages?
Yes, three parsers going to a high-level intermediate language, then an optimizer, then the register allocator. It was an extraordinarily ambitious compiler for the time, when even hash tables were not yet well understood. One compiler, three source languages, targeted to two machines, Stretch and Harvest. In addition to managing the optimizer group, I was responsible for working with the NSA on designing Alpha. I was the bridge between the NSA team, which knew the problem…

And never wanted to tell you completely what the problem is.
They told me not at all, but it didn’t matter. I was pretty clueless about everything down there, and on purpose. “NSA” was not a term that was known. While I was on the project, two guys went to Moscow, just left, and it hit the New York Times, and that’s when I learned what it was about. It was a very carefully guarded activity. The problem was basically searching for identifiers in vast streams of data and looking for relationships, identifying k-graphs and doing statistical analysis. Any single Harvest instruction could run for days, and be self-modifying.
The most amazing thing about that machine is that it was synchronized. Data flowed from this tape system through memory, through the streaming unit, to the Harvest unit, the streaming unit, back to memory, and back out onto the data repository, and it was synchronized at the clock level. The data was coming from listening stations around the world, during the Cold War. I spent a year at NSA installing the system; during that year, the Bay of Pigs and the Cuban Missile Crisis happened, so it was a very tense period. I assume most of the data was in Cyrillic. But Alpha could deal with any data that had been coded into bytes.
I wrote the final acceptance test for the compiler and the language. I wrote the final report and gave it to them and never saw it again, which I regret.

What did you do next?
John Cocke was enamored with building the fastest machine in the world, and Stretch had been an announced public failure. When I finished with Harvest, Stretch was already done. I could have gone and worked on the 360. I didn’t particularly want to do that; it was a huge project spread around the world. John wanted to take another crack at building the fastest machine in the world, so I joined him on a project called System Y. This time the compiler was built first. Dick Goldberg was the manager and did the parser, I did the optimizer, and Jim Beatty did the register allocator. We had a very nice cycle-level timing simulator. We built what was called the Experimental Compiling System.

What became of System Y?
It changed into ACS [Advanced Computing System], which was eventually canceled [in 1969] by Armonk, by headquarters, which we should have known would happen, because it was not 360. But we developed things that subsequently influenced the company a lot. We did a lot with branch prediction, both hardware and software, and caching, and machine-independent, language-independent optimizers.
John, after being very disappointed about not being able to build the fastest machine in the world, decided he would build the best cost-performance machine. That was where the PowerPC came from: the 801 project.
After ACS, I took an unhappy digression from my work on compilers. I was assigned to work on FS, the famous “Future System” of IBM. It was so bad on performance, I wrote a letter. FS took two round trips to memory to fetch any item of data, because it had a very high-level intermediate form as the architected form for the machine.

Should I be reminded of the Intel 432, the processor designed for Ada? It had a very high-level architecture that turned out to be memory-bound, because it was constantly fetching descriptors from memory.
Yes. We aren’t very good about passing on the lessons we’ve learned, and we don’t write our failures up very well.

It’s harder to get a failure published than a success.
But there are a lot of lessons in them. After fuming about FS for a few months, I wrote a letter to somebody higher up and said, “This isn’t going to work,” and why, and that was the wrong thing to say. So I was kind of put on the shelf for a while. But then I did a lot of work with a PL/I compiler that IBM had subcontracted to Intermetrics.

The compilers you worked on, such as the ACS compiler and the PL/I compiler in the 1970s, what languages were those implemented in?
Some of them were implemented in FORTRAN, some in PL/I, and some were in assembly language.

How about Alpha?
That was in the assembly language for Stretch.

Your 1976 Communications paper with John Cocke contains some actual PL/I code that represents sets as bit vectors, and propagates sets around the program control flow graph. The intersections and unions of the sets were just PL/I & and | operators, which makes the code concise and easy to read. You have said that PL/I was a complicated language to compile, but it seems to have expressive power.
Yes, it was really very useful for writing optimizers and compilers. The data flow work came from early FORTRAN and their use of control flow graphs. On Project Y we built control flow graphs and developed a language about the articulation points on the graph, abstracting away from DO loops into something more general, then optimizing based on a hierarchy of these graphs, making the assumption that they represented parts of the program that were most frequently executed.

When did you first start using that bit-vector representation?
Right at the beginning of the ACS project. “Graph intervals” was a term that John had come up with, but then I wrote the paper and carried the idea further. Then Mike Harrison came, and we were struggling with the problem that we had no way of bounding the computation of the flow of information in such a graph.

In some of your papers, you talked about earlier monotonic relaxation techniques, but they had very large theoretical bounds.
Yes, but I wasn’t much concerned, because I knew that real programs don’t have those, and Mike agreed. Jeff Ullman did some analysis on programs. That did get a better bound, but that analysis didn’t produce a structure against which one could actually make transformations.

The graph interval decomposition improved the theoretical cost bounds of the algorithm by guiding the order, but if I hear you correctly, the interval structure is just as important, perhaps more important, for guiding the transformations than for doing the analysis?
Yes. People who were focusing on the theoretical bounds missed, I think, the importance of leaving a framework in which one could make the transformations. But then something really exciting happened. A student of Knuth’s, [Robert] Tarjan, developed a way to map this problem into a spanning tree.

Nodal graphs could be decomposed into spanning trees plus back edges.
Yes! It was startling. Great things sometimes look simple in retrospect, but that solved that part of structuring the bounds of subsequent algorithms’ analysis and transformation.

So Tarjan’s work played a role in this?
Yes, I don’t think he knew it, but as soon as he published that, it was just obvious that we should abandon graph intervals and go there.

Could you talk about Jack Schwartz?
Jack spent a summer at ACS and had a huge influence. He wrote a number of wonderful papers on optimizing transformations, one being “Strength reduction, or Babbage’s differencing engine in modern dress.” Jack had a list of applications for strength reduction, which we in compilers never took advantage of. He and John wrote a big book, never published but widely circulated, on a lot of this work. I spent a year in the Courant Institute; I taught graduate compilers. And Jack and I were married for a number of years. So it was a good relationship all around.

What did you think about SETL [a programming language developed by Schwartz]?
It wasn’t the right thing for that time, but it may be an interesting language to go back and look at now that we’re mired in over-specifying.

Gregory Chaitin’s classic PLDI paper on “Register Allocation and Spilling via Graph Coloring” contains a substan-
ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 41
viewpoints
tial chunk of SETL code, four and a half Greg immediately recognized that he ture. Mao was still alive, and a lot of
pages, that implements the algorithm. could apply this solution to the register the institutes and universities were
I liked SETL and was amazed that allocator issue. It was a wonderful kind pretty much closed. There was a sci-
they got some good compiling ap- of serendipity. ence institute in Peking and in Shang-
plications out of it. In the context of hai, where we gave talks on compilers,
multicores and all the new challenges Anything else we should know about and we looked at the machines there,
that we’ve got, I like it a lot—it’s one John Cocke? which were really quite primitive. The
instance of specifying the problem at He had a major impact on every- compiler they were running on the
such a high level that there’s a good body. Let me talk about his style of machine in Peking was on paper tape.
possibility of being able to target mul- work. He didn’t write anything, and I recognized, looking at the code, that
tiple machines and to get high perfor- giving a talk was exceedingly rare and it was essentially Ershov’s compiler.
mance from programs that are easy to painful for him. He would walk around So the people in China were really
write. the building, working on multiple quite concerned about being cut out
I have a story about register alloca- things at the same time, and furthered of the advances in computing. This
tion. FORTRAN back in the 1950s had his ideas by talking to people. He nev- is a conjecture I’ve only recently ar-
the beginnings of a theory of register er sat in his office—he lost his tennis rived at, why we in particular in the
allocation, even though there were only racket one time for several months and U.S. were asked to come: it was a con-
three registers on the target machine. eventually found it on his desk. If he nection through the technology that
Quite a bit later, John Backus became came into your office, he would start the three groups shared. We were very
interested in applying graph coloring drawing and pick up the conversation involved with Ershov and his group.
to allocating registers; he worked for exactly where he had left off with you He and his family wanted to leave the
about 10 years on that problem and two weeks ago! Soviet Union, and they lived with us in
just couldn’t solve it. I considered it our home for about a year.
the biggest outstanding problem in So he was very good at co-routining!
optimizing compilers for a long time. Yes, he could look at a person and You actually had two projects called
Optimizing transformations would remember exactly the last thing he said “Experimental Compiling System.”
produce code with symbolic registers; to them. And people used to save his bar What was the second one like?
the issue was then to map symbolic napkins. He spent a lot of time in bars; Its overall goals were to take our
registers to real machine registers, he liked beer. He would draw complex work on analysis and transformation
of which there was a limited set. For designs on napkins, and people would of codes, and embed that knowledge in
high-performance computing, register take the napkins away at the end of the a schema that would advance compil-
allocation often conflicts with instruc- evening. The Stretch look-ahead was ing. I wish we had done it on Pascal or
tion scheduling. There wasn’t a good designed on bar napkins, particularly something like that.
algorithm until the Chaitin algorithm. in the Old Brauhaus in Poughkeepsie.
Chaitin was working on the PL.8 com- PL/I was that difficult a language?
piler for the 801 system. Ashok Chan- You also knew Andrei Ershov. Yes, it was the pointers and the
dra, another student of Knuth’s, joined He did some marvelous work in condition handling—those were the
the department and told about how the Soviet Union. Beta was his com- big problems. This was another bold
he had worked on the graph coloring piler, a really wonderful optimizing project, and my interest was mostly
problem, which Knuth had given out compiler. He had been on the ALGOL in the generalized solution for inter-
in class, and had solved it—not by solv- committee. procedural analysis—but also putting
ing the coloring problem directly, but what we knew into a context that would
in terms of what is the minimal num- He had an earlier project that he called make writing compilers easy and more
ber of colors needed to color the graph. Alpha, not to be confused with the Al- formal, put more structure into the de-
pha language you did for Stretch, right? velopment of compilers. We already
No, it was totally unrelated. But had a lot of great algorithms which we
The Stretch look- later we read his papers. Then in 1972 could package up, but this was to build
he couldn’t travel, because he wasn’t a compiler framework where the meth-
ahead was designed a party member, so he had a work- ods that we already had could be used
on bar napkins, shop in Novosibirsk and invited a large more flexibly.
number of people. It was broader than
particularly in the compilers, but there was a big focus Did lessons learned from this project
Old Brauhaus in on compilers, and we picked up some feed forward into your PTRAN work?
things from his work. The interprocedural work did, abso-
Poughkeepsie. Ershov also worked with people in lutely, and to some extent the work on
China. When the curtain came down binding. It sounds trivial, but constant
between the Soviet Union and China, propagation, getting that right, and
the Chinese group then didn’t have being able to take what you know and
access to Ershov’s work. Jack and I refine the program without having to
were invited to China in 1973 to lec- throw things away and start over.
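To make the constant propagation mentioned in this answer concrete, here is a minimal sketch. It assumes a toy straight-line program of three-address assignments; the tuple format and the name `propagate_constants` are hypothetical illustrations and are not taken from PTRAN or the Experimental Compiling System.

```python
import operator

# Hypothetical toy instruction format: (dest, op, arg1, arg2).
# Arguments are either integer constants or variable names (strings).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def propagate_constants(program):
    """Substitute known constants for variable uses and fold constant expressions."""
    known = {}   # variable name -> constant value known at the current point
    result = []
    for dest, op, a, b in program:
        # Replace each operand by its constant value when one is known.
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "copy":
            if isinstance(a, int):
                known[dest] = a
            else:
                known.pop(dest, None)
            result.append((dest, "copy", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            # Both operands are constants: evaluate at compile time.
            value = OPS[op](a, b)
            known[dest] = value
            result.append((dest, "copy", value, None))
        else:
            # The result depends on an unknown value, so dest is no longer constant.
            known.pop(dest, None)
            result.append((dest, op, a, b))
    return result


example = [
    ("x", "copy", 4, None),
    ("y", "+", "x", 1),
    ("z", "*", "y", "n"),
    ("x", "+", "n", 1),
    ("w", "+", "x", "y"),
]
optimized = propagate_constants(example)
```

In this sketch a single forward pass suffices because there are no branches. The difficulty described in the interview arises in real programs, where a substitution can change which way a branch goes and so invalidate analysis results computed earlier.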
Let’s talk about PTRAN. Two papers came out in 1988: your “Overview of the PTRAN Analysis System” and “IBM Parallel FORTRAN”. It’s important to distinguish these two projects. IBM Parallel FORTRAN was a product, a FORTRAN augmented with constructs such as PARALLEL LOOP and PARALLEL CASE and ORIGINATE TASK. So the FORTRAN product is FORTRAN with extra statements of various kinds, whereas with PTRAN, you were working with raw FORTRAN and doing the analysis to get parallelism.
Right.

What was the relationship between the two projects? The IBM Parallel FORTRAN paper cites your group as having provided some discussion.
The PTRAN group was formed in the early 1980s, to look first at automatic vectorization. IBM was very late in getting into parallelism. The machines had concurrency, but getting into explicit parallelization, the first step was vectorization of programs. I was asked to form a compiler group to do parallel work, and I knew of David Kuck’s work, which started in the late 1960s at the University of Illinois around the ILLIAC project. I visited Kuck and hired some of his students. Kuck and I had a very good arrangement over the years. He set up his own company—KAI.

Kuck and Associates, Inc.
Right. IBM, at one point later on, had them subcontracted to do some of the parallelism. They were very open about their techniques, with one exception, and they were the leaders early on. They had a system called Parafrase, which enabled students to try various kinds of parallelizing code with FORTRAN input and then hooked to a timing simulator back-end. So they could get real results of how effective a particular set of transformations would be. It was marvelous for learning how to do parallelism, what worked and what didn’t work, and a whole set of great students came out of that program. In setting up my group, I mostly hired from Illinois and NYU. The NYU people were involved with the Ultracomputer, and we had a variant of it here, a project called RP3, Research Parallel Processor Prototype, which was an instantiation of their Ultracomputer.

The Ultracomputer was perhaps the first to champion fetch-and-add as a synchronization primitive.
Yes. A little history: The Ultracomputer had 256 processors, with shared distributed memory, accessible through an elaborate switching system. Getting data from memory is costly, so they had a combining switch, one of the big inventions that the NYU people had developed. The fetch-and-add primitive could be done in the switch itself.

Doing fetch-and-add in the switch helped avoid the hot-spot problem of having many processors go for a single shared counter. Very clever idea.
Very, very clever. So IBM and NYU together were partners, and supported by DARPA to build a smaller machine. The number of processors got cut back to 64 and the combining switch was no longer needed, and the project kind of dragged on. But my group supplied the compiler for that. The project eventually got canceled.
So that was the background, in IBM Research and at the Courant Institute. But then the main server line, the 370s, 3090s, were going to have vector processors.

Multiple vector processors as well as multiple scalar processors.
Yes. And the one that we initially worked on was a six-way vector processor. We launched a parallel translation group, PTRAN. Jean Ferrante played a key role. Michael Burke was involved; NYU guy. Ron Cytron was the Illinois guy. Wilson Hsieh was a co-op student. Vivek Sarkar was from Stanford, Dave Shields and Philippe Charles from NYU. All of these people have gone on to have some really wonderful careers. Mark Wegman and Kenny Zadeck were not in the PTRAN group but were doing related work. We focused on taking dusty decks and producing good parallel code for the machines—continuing the theme of language-independent, machine-independent, and do it automatically.

“Dusty decks” refers to old programs punched on decks of Hollerith cards. Nowadays we’ve got students who have never seen a punched card.
We also went a long way with working with product groups. There was a marvelous and very insightful programmer, Randy Scarborough, who worked in our Palo Alto lab at the time. He was able to take the existing FORTRAN compiler and add a little bit or a piece into the optimizer that could do pretty much everything that we could do. It didn’t have the future that we were hoping to achieve in terms of building a base for extending the work and applying it to other situations, but it certainly solved the immediate problem very inexpensively and well at the time. That really helped IBM quickly move into the marketplace with a very parallel system that was familiar to the customers and solved the problem. Disappointing for us, but it was the right thing to have happen.

Did PTRAN survive the introduction of this product?
Yes, it survived. The product just did automatic vectorization. What we were looking at was more parallelism in general.

One particular thing in PTRAN was looking at the data distribution problem, because, as you remarked in your paper, the very data layouts that improve sequential execution can actually harm parallel execution, because you get cache conflicts and things like that.
Yes.

That doesn’t seem to be addressed at all by the “IBM Parallel FORTRAN” paper. What kinds of analysis were you doing in PTRAN? What issues were you studying?
Well, two things about the project.

Another thing I think was a very big step was not only identifying parallelism, but identifying useful parallelism.
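The fetch-and-add primitive discussed above returns the old value of a shared cell and adds an increment to it as one indivisible step. The following is a minimal sketch of those semantics only: the class `FetchAndAddCounter` and the helper `claim_tickets` are hypothetical names, and a software lock stands in for the hardware. It does not model the Ultracomputer's combining switch, which merged concurrent requests inside the network.

```python
import threading


class FetchAndAddCounter:
    """Simulated fetch-and-add cell; a lock stands in for hardware atomicity."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, increment):
        # Return the old value and add the increment as one indivisible step.
        with self._lock:
            old = self._value
            self._value += increment
            return old


def claim_tickets(num_workers, tickets_per_worker):
    """Have several threads draw tickets from one shared counter."""
    counter = FetchAndAddCounter()
    claimed = [[] for _ in range(num_workers)]

    def worker(index):
        for _ in range(tickets_per_worker):
            claimed[index].append(counter.fetch_and_add(1))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Adding zero reads the final value without changing it.
    return claimed, counter.fetch_and_add(0)


claimed, total = claim_tickets(num_workers=8, tickets_per_worker=100)
all_tickets = sorted(ticket for tickets in claimed for ticket in tickets)
```

Because every call hands back a distinct old value, the workers never receive the same ticket, which is what makes the primitive useful for handing out loop iterations or queue slots from a single shared counter.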
should be irrelevant to the writer of the program.

Apparently, just convincing the early programmers of that was one of your early successes. FORTRAN is good
there is a constant that you have recognized could replace the use of the variable. You then know, say, which way a branch is going to go. You’ve built up this infrastructure of analysis and you’re ready to make the transformation—but then the results of the analysis are obsolete, so you have to start again. That was a problem we struggled with early on: How do you avoid redoing the analysis? It got particularly bad with interprocedural activities.

Is there some simple insight or overarching idea that helps you to avoid having to completely redo the computation?
Vivek Sarkar was one of the key people on that, but Dave Kuck—this is at the core of KAI’s work, too. That group described it as “the oracle.” You assign costs to each of the instructions, and you can do it in a hierarchical form, so this block gets this cost, and this block has that cost, and then do a cost analysis. This is the time it’s going to take. Then there’s the overhead cost of having the parallelism.

Earlier, you said that Kuck was very open about everything he was doing, with one exception—
The oracle! “What have you got in that thing?” [laughs] “We’re not going to tell you!” So we built our own variant of it, which was a very powerful technique.

What else should we mention?
We talked about the NSA work that wasn’t published. That was, for me, a mind-changer that led to my feeling very strongly about domain-specific languages.

Are you for them or against them?
For them!

Oh, okay. Let’s be clear! [laughs]
Good! [laughs]

I’m going to try something very foolish: summarize your career in one paragraph, then ask you to critique it.
A major focus of your career has been that, rather than inventing new programming languages or language features and trying to get people to program in them, you focused on taking programs as they are written, or as programmers like to write them, and making them run really effectively and efficiently on target machines. One of the many ways you do this, but a very important one, is to do many kinds of sophisticated analysis and optimization of the code and to find out as much as you can about the characteristics of the program without actually running it. So these tend to be static techniques, and very sophisticated ones. While you have worked with and pioneered quite a number of them, some of the most interesting involve using graphs as a representation medium for the program and using a strategy of propagating information around the graph. Because a program can be represented as a graph in more than one way, there’s more than one way in which to propagate that information. In some of these algorithms in particular, the information that’s being propagated around the graph is in the form of sets—for example, sets of variable names. As a strategy for making some of these algorithms efficient enough to use, you’ve represented sets as bit vectors and decomposed the graphs using interval analysis in order to provide an effective order in which to process the nodes. In doing this, you have built a substantial sequence of working systems; these aren’t just paper designs. You build a great system, and then you go on and build the next one, and so on. These all actually work on code and take real programs that aren’t artificial benchmarks and make them run.
That’s really very good. There’s one thing: the overall goal of all of my work has been the FORTRAN goal, John Backus’ goal: user productivity, application performance.

Now, three goofy questions. What’s your favorite language to compile?
FORTRAN, of course!

What’s your favorite language to program in?
I guess it would have to be FORTRAN.

Okay, now, if you had to build a compiler that would run on a parallel machine, what language would you use to write that compiler?
Probably something like SETL or a functional language. And I’m very intrigued about ZPL. I really liked that language.

Any advice for the future?
Yes, I do have one thing. Students aren’t joining our field, computer science, and I don’t know why. It’s just such an amazing field, and it’s changed the world, and we’re just at the beginning of the change. We have to find a way to get our excitement out to be more publicly visible. It is exciting—in the 50 years that I’ve been involved, the change has been astounding.

Recommended Reading
Buchholz, W., Ed. Planning a Computer System: Project Stretch. McGraw-Hill, 1962; http://ed-thelen.org/comp-hist/IBM-7030-Planning-McJones.pdf
Allen, F.E. and Cocke, J. A catalogue of optimizing transformations. In R. Rustin, Ed., Design and Optimization of Compilers. Prentice-Hall, 1972, 1–30.
Allen, F.E. Interprocedural data flow analysis. In Proceedings of Information Processing 74. IFIP. Elsevier/North-Holland, 1974, 398–402.
Allen, F.E. and Cocke, J. A program data flow analysis procedure. Commun. ACM 19, 3 (Mar. 1976), 137–147; http://doi.acm.org/10.1145/360018.360025
Allen, F.E. et al. The Experimental Compiling System. IBM J. Res. Dev. 24, 6 (Nov. 1980), 695–715.
Allen, F.E. The history of language processor technology at IBM. IBM J. Res. Dev. 25, 5 (Sept. 1981), 535–548.
Allen, F.E. et al. An overview of the PTRAN analysis system for multiprocessing. In Proceedings of the 1st International Conference on Supercomputing (Athens, Greece, 1988), Springer-Verlag, 194–211. Also in J. Par. Dist. Comp. 5 (Academic Press, 1988), 617–640.

Recommended Viewing
Allen, F.E. The Stretch Harvest compiler. Computer History Museum, Nov. 8, 2000. Video, TRT 01:12:17; http://www.computerhistory.org/collections/accession/102621818
The IBM ACS System: A Pioneering Supercomputer Project of the 1960s. Speakers: Russ Robelen, Bill Moone, John Zasio, Fran Allen, Lynn Conway, Brian Randell. Computer History Museum, Feb. 18, 2010; Video, TRT 1:33:35; http://www.youtube.com/watch?v=pod53_F6urQ

Copyright held by author.
practice
doi:10.1145/1866739.1866755
Article development led by queue.acm.org

For sysadmins, solving problems usually involves collaborating with others. How can we make it more effective?

by Eben M. Haber, Eser Kandogan, and Paul P. Maglio

Collaboration in System Administration

George was in trouble. A seemingly simple deployment was taking all morning, and there seemed no end in sight. His manager kept coming in to check on his progress, as the customer was anxious to have the deployment done. He was supposed to be leaving for a goodbye lunch for a departing co-worker, adding to the stress. He had called in all kinds of help, including colleagues, an application architect, technical support, and even one of the system developers. He used email, instant messaging, face-to-face contacts, his phone, and even his office mate’s phone to communicate with everyone. And George was no novice. He had been working as a Web-hosting administrator for three years, and he had a bachelor’s

a system administrator, one of the people who work behind the scenes to configure, operate, maintain, and troubleshoot the computer infrastructure that supports much of modern life. Their work is critical—and expensive. The human part of total system cost-of-ownership has been growing for decades, now dominating the costs of hardware or software.2–4

To understand why, and to try to learn how administration can be better supported, we have been watching system administrators at work in their natural environments. Over the course of several years, and equipped with camcorders, cameras, tapes, computers, and notebooks, we made 16 visits, each as long as a week, across six different sites. We observed administrators managing databases, Web applications, and system security; as well as storage designers, infrastructure architects, and system operators. Whatever their specific titles were, we refer to them all as system administrators, or sysadmins for short.

At the beginning of our studies, we held a stereotypical view of the sysadmin as that guy (and it was always a guy) in the back room of the university computer center who knew everything and could solve all problems by himself. As we ventured into enterprise data centers, we realized the reality was significantly more complex. To describe our findings fully would take a book (which we are currently writing).6 In this short article, we limit ourselves to a few episodes that illustrate the kinds of collaboration we saw in system administration work and where the major problems lie. As we’ll show from real-world stories we collected and our analyses of work patterns, it’s really not just one guy in the back room.

The Story of George
George is a Web administrator in a large

Illustration by Yarek Waszul
Figure 1. George had to add a new front-end Web server to an existing installation.
[Figure 1 diagram labels: Back-end server 1 through Back-end server N, Junction (7234), Firewall, New Web server instance; the remaining port labels appear as ????.]
the other team members often, as work is distributed. They need to coordinate their actions, hand off long-running tasks, and consult each other (especially during troubleshooting). He also interacts with other teams that are in charge of different areas, such as networks, operating systems, and mail servers.

During our week of observation, one of George’s tasks was to set up Web access to email for a customer. This involved creating a new Web-server instance on an existing machine outside the firewall and connecting through a middleware authentication server inside the firewall to a back-end mail server (Figure 1). George had never before installed a second Web server on an existing machine, but he had instructions emailed to him by a colleague as well as access to online documentation. The task involved several people from different teams. Early in the week, George asked the network team to create a new IP address and open ports on the firewall. Throughout the week, we saw him collaborate extensively with Ted, a colleague who was troubleshooting some problems with the authentication server. George’s progress was gated by Ted’s work, so they exchanged IMs all the time and frequently dropped into each other’s offices to work through problems together.

By Friday morning, George had completed all preparations. The final steps should have taken just a few minutes, but this was where the action really began. A mysterious error appeared, and George spent more than two hours troubleshooting the error, mainly in collaboration with others. He had created the new Web-server instance seemingly without incident, and it registered itself with the middleware authentication server. Yet when he issued the command to the middleware server to permit the front-end Web server to talk to the back-end mail server, he got the following message:

Error: Could not connect to server (status: 0x1354a424)

Given that three different servers were involved, the error message gave him insufficient information. The online docs and a Web search on the message provided no additional details, so he reached out for help. (For more on error messages, see “Error Messages: What’s the Problem?” ACM Queue, Nov. 2004.7)

George’s manager suggested calling Adam, the application architect, and George and Adam started troubleshooting together, talking on the phone and exchanging system logs, error messages, configuration files, and sample commands via IMs and email messages (Figure 2). Adam did not have access to the troublesome system, so George acted as his eyes and hands, collecting information and executing commands.

They were not able to find the error, so about an hour in, Adam suggested that George call technical support. He used his office mate’s phone (as his own was still connected to Adam), but quickly transitioned communications with tech support to IM. For the next 20 minutes or so, George continued to troubleshoot with Adam on the phone and tech support via IM, and Ted kept popping into the office to offer suggestions. After a while, George became unhappy with the answers from tech support, so Adam hooked him up with one of the developers of the middleware, and they started discussing the problem over IM. Throughout, George remained the sole person with access to the system—all commands and information requests went through him. He became increasingly stressed out as the problem remained unresolved.

Eventually, Ted went back to his own office and looked into the problem independently. He discovered that George had misunderstood one of the front-end server’s network configuration parameters, described vaguely in the documentation as “internal port.” George thought this parameter (port 7137) specified the port for communication from the front-end to the middleware server, when it went the other way. George, in fact, had made two mistakes: he didn’t realize that every front-end server used port 7135 to talk to the middleware server (which was permitted by the firewall, see Figure 1), and he specified a port for communication from the middleware server to the front-end, 7137, that was blocked by the firewall. Communications worked in one direction, but not the other. The software only tested communications
in one direction, so the error was not reported until the middleware authentication server was configured. Ted found a solution to this complex situation, and tried unsuccessfully to explain it to George over IM:

Ted: We were supposed to use 7236. Unconfigure that instance and...
George: Can’t specify a return port... you only specify one port.
Ted: You did it wrong.
George: No, I didn’t.
Ted: Yes, you did. You need to put in 7236.
George: We just didn’t tell it to go both ways. The other port has nothing to do with this.
Ted: Well, all I know is what I see in the conf file.
George: We thought that was the return port. That is not a return port.
Ted: There currently is no listener on [middleware server] on 7137. So use 7236. DO IT!

Ted wasn’t getting his point across, and George was getting ever-more frustrated. George told his office mate to call Ted (Adam was still on the other phone), and the conversation immediately switched modes. With the nuance of spoken words, Ted started to realize that George fundamentally misunderstood what was going on. Rather than continually telling George what to do (“DO IT!”), Ted explained why. The task had shifted from debugging the system to debugging George, and they tried to establish a common understanding on which network ports were going which direction.

George: What are you talking about? 7236? We thought that it came in on 7137 and went back on 7236, but we were wrong, that 7236 is like an HTTPS listener port or something?
Ted: It will still come in on 7135 to talk to [middleware] server apparently...
George: Right?
Ted: What’s happening is it’s actually trying to make a request back, um, through the 72... well, actually trying to make it back through the 7137 to the instance... and it’s not happening.
George: I know. I know that. But I can’t tell it to...
Ted: Just create it with the 7236. Trust me.
George: Why? That port’s not…, that’s going the wrong…, that’s only one way, too.
Ted: Trust me.
George: It’s only one way. Do you understand what I am saying?
Ted: ’Cause it’s the [middleware] server talking back to the [Web-server] instance.
George: Yeah, but how does [the Web server] talk to the [middleware] server to make some kind of request?
Ted: 7135 is the standard port it uses in all cases. So we had it wrong. Our assumption on how it works was incorrect.
George: All right, all right.
Ted: If it doesn’t work, you can beat me up after.
George: I want to right now. [Laughter on both sides]

How did George get into trouble? Like many failures, there were a number of contributing factors. George misunderstood the meaning of one of the front-end configuration parameters, not realizing that it conflicted with the
Figure 2. George engaged with at least seven different individuals or groups using various means of communication, including instant
message (solid lines), email (dashed lines), phone (dotted-and-dashed lines), and face-to-face (dotted lines). Only George and his colleague
Ted had direct access to the problematic server (double-solid lines).
[Figure 2 diagram labels: George, Office Mate, Tech Support, Network Team, Manager, and Servers, connected through phones, laptops, and monitors.]
firewall rules. The front-end did not test collaboration can work only when cor- “Let me call you” or “Please email me
two-way communication, so that errors in the front-end port configuration were not reported until the middleware server was configured. The error message certainly did not help. Perhaps most important was the fact that for most of the troubleshooting session, George was the only one who had direct access to the system. All the other participants got their information filtered through George.
Examining the videotapes in detail, we discovered several instances in which George misreported or misunderstood what he saw, filtering the information through his own misunderstanding, and reporting back incorrectly. (One example occurred when George misread the results of a network trace, his misunderstanding filtering out a critical clue.) This prevented Adam and tech support from helping him effectively. The problem was found only when Ted looked at the machine state independently—and then he had to debug George, too. George had many tools for sharing information about system state, but none of them gave the whole picture to the others.
What are the lessons? Collaboration is critical, especially when misunderstandings occur (and from what we saw, incorrect or incomplete understanding of highly complex systems is a common source of problems for sysadmins). Yet
rect information is shared, something that is impeded by misunderstandings and the limitations of communication tools. Proper system design can help avoid misunderstandings in the first place, and improved tools for sharing information could help more quickly rectify misunderstandings when they occur.
We analyzed the 2.5 hours of George’s troubleshooting session, coding each 30-second time slice of what George did (see Figure 3). We found 91% of these time slices were spent in collaboration with other people, either via phone, IMing, email, or face-to-face. Only 6% of the time was he actually interacting with the system, whether to discover state or to make changes, as each interaction was followed by lengthy discussions of the implications of what was seen and what to do next.
While not every troubleshooting episode we witnessed had this extreme level of collaboration, we saw people working together to solve problems much more commonly than a single person toiling alone. We also coded for the topic of collaboration, which included expected topics such as configuration details, system state, ongoing strategy, and what commands to execute. Surprisingly, 21% of the communication involved discussing collaboration itself—for example,
that log file.”
Collaboration is especially important in situations where a person’s understanding must be debugged, as we saw in George’s story. Misunderstandings are a fact of life, and here they were compounded by poorly designed error messages and late reporting of misconfiguration. It can take a long time for someone even to realize that his or her understanding is incorrect. An extra pair of eyes can really help to identify and correct misunderstandings, yet misunderstandings affect what a person reports—so getting a second opinion on the problem will help only if the collaborator gets an accurate picture of the system.
Another lesson is that different communications media are good for different things: the nuance and interchange of the telephone and face-to-face contacts help in getting complex ideas across and in assessing what other people know. IMing is excellent for quickly exchanging commands and error messages verbatim, but subtle personal cues are lost. Even for longtime colleagues like George and Ted, building trust over IM was difficult. Email is great for exchanging lengthy items such as log files and instructions or things that need to persist. Different communications media suggest different levels of commitment to the collaboration.
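The time-slice coding of George's session described above can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' analysis: the slice length, session length, and the 91% and 6% figures come from the text, while the category labels, the "other" bucket, the per-category counts (reconstructed from the reported percentages), and the assumption that each slice receives exactly one code are illustrative.

```python
from collections import Counter

SLICE_SECONDS = 30      # slice length reported in the article
SESSION_HOURS = 2.5     # session length reported in the article


def slice_count(hours, slice_seconds=SLICE_SECONDS):
    """Return how many fixed-length slices fit in a session of `hours`."""
    return int(hours * 3600 / slice_seconds)


def category_shares(codes):
    """Map each category label to its share of all slices, in percent."""
    counts = Counter(codes)
    total = len(codes)
    return {label: 100.0 * n / total for label, n in counts.items()}


total_slices = slice_count(SESSION_HOURS)

# Hypothetical coded sequence: the counts are back-computed from the
# reported 91% and 6%; the remaining slices are lumped into an assumed
# "other" category. The real coding scheme may differ.
codes = (["collaboration"] * 273) + (["system"] * 18) + (["other"] * 9)

shares = category_shares(codes)
for label, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label:>13}: {pct:5.1f}% of {total_slices} slices")
```

Under these assumptions a 2.5-hour session yields 300 slices, so the reported 6% of system interaction corresponds to roughly 18 slices, or about nine minutes of hands-on time.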
50 communications of the acm | january 2011 | vol. 54 | no. 1
practice
Given the need for collaboration to help sysadmins share their understanding of systems, it is possible to imagine better tools for sharing system state. These tools should take best advantage of different forms of communication to share more completely what is going on with both system and sysadmin alike.
We now turn to another example of collaboration we observed among system administrators working on a much more complex system exhibiting a problem that required incredible effort to understand.
The Crit-Sit
A critical situation, or crit-sit, is a practice that is invoked when an IT system’s performance becomes unacceptable and the IT provider must devote specific resources to solving the problem as quickly as possible. Several sysadmins—experts on different components—are brought into a room and told to work together until the problem is fixed. Crit-sits occur more often than sysadmins would like (one we interviewed estimated taking part in four crit-sits per year), and they can last days, weeks, or even months.
We observed one crit-sit for a day, just after it had started, and followed its progress over two months until its solution was found. This was exceptionally long for a crit-sit. It involved an intermittent Web application failure resulting from a subtle interaction of a Web application server and back-end database. Other potential problems were found and fixed along the way, but it took more than 80 days for a dedicated team of experts to determine the true root cause.
At a micro level, being in the room during the crit-sit was fascinating. Eight to 10 people were present in the large conference room, either sitting at the two tables or walking around the room talking; an additional four to six people joined in via conference call and chat room (including technical support representatives for the various software products involved). At first, it seemed amazing to us that this many people had been instructed to work together in a single room until the problem was solved. Indeed, one of the people in the room complained via an instant message to a colleague offsite:
“We’re doing lots of PD [problem determination], but nothing that I couldn’t have done from home.”
After watching the people at work, however, we saw real value in having all of them together in one place. The room was alive with different conversations, usually many at once diverging and rejoining, and with different experts exchanging ideas or asking questions. People would use the whiteboard to diagram theories, and could see and supplement what others were writing. When something important occurred, the attention of everybody in the room was instantly focused. A group chat room was also used as a historical record for system status, error messages, and ideas. Chat was also used for private conversations within the room and beyond, and for exchanging technical information. At one point we saw them build a monitoring script collaboratively through talking, looking at each other’s screens, and exchanging code snippets over IM both inside and outside the room.
Not surprisingly, the people in the room appeared much more engaged than the remote participants. Being in the same room signified a level of commitment by the participants. Those on the conference call spoke up only when addressed directly; we assume that they were doing other work and keeping just one ear on the discussions in the room. It is also likely that remote participants could not follow the chaotic, ever-shifting discussions in the room.
At a macro level, following the logs of 11 weeks of troubleshooting was also fascinating. It tells the story of a significant, complicated problem that could not be successfully reproduced on any test system—a problem in which turning on logging would slow the system to the point of unusability at the load levels required to cause the failure. The story shows the crit-sit team interacting with the support teams for a variety of products, escalating to the highest levels, applying patch after patch and experimenting with configuration settings, new hardware, and special versions of the software. The process involved a lot of work by many different teams.
On the whole, the crit-sit was a collaborative effort by a group of experts to understand and repair the behavior of a complex system consisting of many components. They used a wide variety of technical tools: IMing, email, telephone, and screen sharing, yet it seems that
they received the greatest value from interacting face-to-face. By being in the same room, people could quickly shift from conversation to conversation when a critical phrase was heard, with a very low barrier to asking someone a question or suggesting an idea.
Although the crit-sit seems heavyweight and wasteful, we have no other approaches that can replicate the collaborative interaction of a bunch of people stuck in a room searching for a solution to a common problem. It would be a revolutionary advance for system administration if a tool were developed that could permit the same engagement in remote collaborators as we saw in the crit-sit room.
We next describe the sorts of collaborations we observed among security administrators at a U.S. university.
The “ettercap” Incident
When we first met the security administration team for a computer center at a large university,[5] they seemed somewhat paranoid, making such statements as, “I’ll never type my password on a Windows box, because I can’t really tell if it’s secure.” After watching them for two weeks, we realized they had good reason to be cautious. IT systems, as a rule, have no volition and don’t care how they’re configured or whether you apply a patch to them. Security administrators face human antagonists, however, who have been known to get angry when locked out of a system and work extra hard to find new vulnerabilities and do damage to the data of those who locked them out.
The work of these security administrators was centered around monitoring. New attacks came every week or two. Viruses, worms, and malicious intrusions could happen anytime. They had a battery of automatic monitoring software looking for traces of attacks in system logs and network traffic. Automated intrusion-detection systems needed to err on the side of caution, with the sysadmins making the final decision as to whether suspicious activity was really an attack. These sysadmins relied on communications tools to share information and to help them maintain awareness of what was going on in their center, across their campus, and around the world.
The security administration team shared adjacent offices, so back-and-forth chatter about system activity was common. They joked about taking down the wall to make one big workspace. They also used a universitywide MOO (multiuser domain, object oriented), a textual virtual environment where all the system administrators would hang out, with different “rooms” for different topics. The start of an incident would result in high levels of activity in the security room of the MOO, as security admins from different parts of campus would compare what was happening on their own systems. On a day-to-day basis, the MOO might hold conversations on the latest exploits discovered or theories as to how a virus might be getting into the network. The admins described the MOO’s persistence features as really helpful in allowing them to catch up on everything that was going on when they came back after being away, even for a day. They also used a “whisper” feature of the MOO for point-to-point communication (like traditional IM).
An example of MOO use for quick interchange of security status came when we observed a meeting that focused on hacker tools. The security administrators discussed a package called “ettercap.” Being unfamiliar with this tool, one of us began searching the Web for information about it using the wireless network. A few minutes later, one of the administrators in the room informed us that a security administrator working remotely had detected this traffic and asked about it on the MOO:
Remote: Any idea who was looking for ettercap? The DHCP logs say [observer’s machine name] is a NetBIOS name. Nothing in email logs (like POP from that IP address).
Remote: Seemed more like research.
Remote: The SMTP port is open on that host, but it doesn’t respond as SMTP. That could be a hacker defender port.
Local: We were showing how [hacker] downloaded ettercap. One of the visitors started searching for it.
Remote: Ah, OK. Thanks.
In the space of only a few minutes, the sysadmin had detected Web searches for the dangerous ettercap package, identified the name of the machine in question, checked the logs for other activity by that machine, and probed the ports on the machine. He could see
that it was probably someone doing research, but checked the MOO to verify that it was in fact legitimate.
The participants also collaborated on a broader scale. During our visit, the site was dealing with a worldwide security incident targeting military, educational, and government sites across the U.S. and Europe. This was a particularly persistent attack—every time an intrusion was detected and a vulnerability was closed, the attackers would come back using a new exploit. The attackers would hop from institution to institution, compromising a machine in one place, collecting passwords, and then trying those passwords on machines at other institutions (as users often have a single password for accounts at different sites).
This broad-based attack required a broad-based response, so security administrators from affected institutions formed an ad hoc community to monitor and share information about the attacks, with the goal of tracing the attacks back to their source. When a compromised machine was found, they would let it remain compromised so that they could then trace the attackers and see where else they were connecting. This collaboration was like information warfare: it was important to share information about known compromised machines and exploits with trusted colleagues, but the information had to be kept from the attackers. You did not want the attackers to know that you had detected their attack and were monitoring their activities. When we first observed them, the security administrators used conference calls for community meetings. Later they found a special encrypted email listserv to keep their information under wraps—but because this tool was unmaintained, they had to adopt and maintain it themselves.
The world of security administration seems very fluid, with new vulnerabilities and exploits discovered every day. Though secrecy was a greater concern than with other sysadmins we observed, collaboration was the foundation of their work: sharing knowledge of unfolding events and system status, especially when an attack might be starting and time was critical.
Conclusion
One of our motivations for studying sysadmins is the ever-increasing cost of IT management. Part of this can certainly be attributed to the fact that computers get faster and cheaper every year, and people do not. Yet complexity is also a huge issue—a Web site today is built upon a dramatically more complicated infrastructure than one 15 years ago. With complexity comes specialization in IT management. With around-the-clock operations needed for today’s enterprises, coordination is also a must. System administrators need to share knowledge, coordinate their work, communicate system status, develop a common understanding, find and share expertise, and build trust and develop relationships. System administration is inherently collaborative.
At first, it is easy to think that George’s story shows poor debugging practices or, worse, poor skills, but we don’t think that’s the case. The system was complex, the documentation poor, the error messages unenlightening, and no single person was responsible for all of it. Better error messages or better documentation would certainly help, but that misses the point. There will always be cases that go uncovered and complexities that are hidden until it is too late. Modern IT systems are so complex that people will often have an incorrect or incomplete understanding of their operation. That’s the nature of IT. The crit-sit story and the security story also show it. The one constant in these cases—and in almost all the cases we observed—was collaboration.
We observed collaboration at many levels: within a small team, within an organization, and across organizations. We observed several different types of collaboration tools in use. We observed people switching from one tool to the other as needs shifted. We also observed simultaneous use of several collaboration tools for different purposes. Not surprisingly, system administrators use the same collaboration tools as the rest of us, but these are not optimized for sysadmin needs—whether it is team brainstorming and debugging or secure information sharing.
Though specific features can be implemented for system administrators, it is clear to us that because of the diverse needs among system administrators, a single collaboration tool will not work for all. There needs to be a variety of tools, and collaboration needs to be a first-class citizen in the work of system administration itself. Better collaboration support could relieve the burden on individuals of communicating and establishing shared context, and so avoiding missed information and enabling a persistent store for communication. We believe that improved tools for system administrator collaboration have great potential to significantly impact system administration work—perhaps even helping to restrain the ever-growing human portion of IT’s total cost of ownership.
Related articles on queue.acm.org
Error Messages: What’s the Problem? Paul P. Maglio, Eser Kandogan. http://queue.acm.org/detail.cfm?id=1036499
Oops! Coping with Human Error in IT Systems. Aaron B. Brown. http://queue.acm.org/detail.cfm?id=1036497
Building Collaboration into IDEs. Li-Te Cheng, Cleidson R.B. de Souza, Susanne Hupfer, John Patterson, Steven Ros. http://queue.acm.org/detail.cfm?id=966803
References
1. Barrett, R., Kandogan, E., Maglio, P.P., Haber, E.M., Prabaker, M., Takayama, L.A. Field studies of computer system administrators: analysis of system management tools and practices. In Proceedings of the Conference on Computer-Supported Collaborative Work. 2004.
2. Gartner Group/Dataquest. Server Storage and RAID Worldwide (May 1999).
3. Gelb, J.P. System-managed storage. IBM Systems Journal 28, 1 (1989), 77–103.
4. ITCentrix. Storage on Tap: Understanding the Business Value of Storage Service Providers (Mar. 2001).
5. Kandogan, E., Haber, E. M. 2005. Security and Usability: Designing Secure Systems that People Can Use. In Security Administration Tools and Practices. L.F. Cranor and S. Garfinkel, Eds. O’Reilly Media, Sebastopol, 2005, 357–378.
6. Kandogan, E., Maglio, P.P., Haber, E.M., Bailey, J. (forthcoming). Information Technology Management: Studies in Large-Scale System Administration. Oxford University Press.
7. Maglio, P.P., Kandogan, E. 2004. Error messages: What’s the problem? ACM Queue 2, 8 (2004), 50–55.
Eben M. Haber is a research staff member at IBM Research, Almaden, in San Jose, CA. He studies human-computer interaction, working on projects including data mining and visualization, ethnographic studies of IT system administration, and end-user programming tools.
Eser Kandogan is a research staff member at IBM Research, Almaden, San Jose, CA. His interests include human interaction with complex systems, ethnographic studies of system administrators, information visualization, and end-user programming.
Paul P. Maglio is a research scientist and manager at IBM Research, Almaden, San Jose, CA. He is working on a system to compose loosely coupled heterogeneous models and simulations to inform health and health policy decisions. Since joining IBM Research, he has worked on programmable Web intermediaries, attentive user interfaces, multimodal human-computer interaction, human aspects of autonomic computing, and service science.
© 2011 ACM 0001-0782/11/0100 $10.00
practice
doi:10.1145/1866739.1866753
Article development led by queue.acm.org
Talking with Julian Gosper, Jean-Luc Agathos, Richard Rutter, and Terry Coatta.
ACM Case Study
UX Design and Agile: A Natural Fit?
Found at the intersection of many fields—including usability, human-computer interaction (HCI), and interaction design—user experience (UX) design addresses a software user’s entire experience: from logging on to navigating, accessing, modifying, and saving data. Unfortunately, UX design is often overlooked or treated as a “bolt-on,” available only to those projects blessed with the extra time and budget to accommodate it. Careful design of the user experience, however, can be crucial to the success of a product. And it’s not just window dressing: choices made about the user experience can have a significant impact on a software product’s underlying architecture, data structures, and processing algorithms.
To improve our understanding of UX design and how it fits into the software development process, we focus here on a project where UX designers worked closely with software engineers to build BusinessObjects Polestar (currently marketed as SAP BusinessObjects Explorer), a business intelligence (BI) query tool designed for casual business users. In the past, such users did not have their own BI query tools. Instead, they would pass their business queries on to analysts and IT people, who would then use sophisticated BI tools to extract the relevant information from a data warehouse. The Polestar team wanted to leverage a lot of the same back-end processing as the company’s more sophisticated BI query tools, but the new software required a simpler, more user-friendly interface with less arcane terminology. Therefore, good UX design was essential.
To learn about the development process, we spoke with two key members of the Polestar team: software architect Jean-Luc Agathos and senior UX designer Julian Gosper. Agathos joined BusinessObjects’ Paris office in 1999 and stayed with the company through its acquisition by SAP in 2007. Gosper started working with the company five years ago in its Vancouver, B.C., office. The two began collaborating early in the project, right after the creation of a Java prototype incorporating some of Gosper’s initial UX designs. Because the key back-end architecture is one that had been developed earlier by the Paris software engineering team, Gosper joined the team in Paris to collaborate on efforts to implement Polestar on top of that architecture.
To lead our discussion, we enlisted a pair of developers whose skill sets largely mirror those of Agathos and Gosper. Terry Coatta is a veteran software engineer who is the CTO of Vitrium Systems in Vancouver, B.C. He also is a member of ACM Queue’s editorial advisory board. Joining Coatta to offer another perspective on UX design is Richard Rutter, a longtime Web designer and a founder of Clearleft, a UX design firm based in Brighton, England.
Before diving in to see how the collaboration between Agathos and Gosper played out, it’s useful to be familiar with a few of the fundamental disciplines of classic UX design:
Contextual inquiry. Before developing use cases, the team observes users of current tools, noting where the pain points lie. Contextual inquiry is often helpful in identifying problems that users are not aware of themselves. Unfortunately, this often proves expensive and so is not always performed.
Formative testing. Formative testing is used to see how well a UX design addresses a product’s anticipated use cases. It also helps to determine how closely those use cases actually cleave to real-world experience. In preparation, UX designers generally create lightweight prototypes to represent the use cases the product is expected to eventually service. Often these are paper prototypes, which the test-group participants simply flip through and then comment upon. For the Polestar project, the UX team used a working Java prototype to facilitate early formative testing.
Summative testing. In summative tests, users test-drive the finished software according to some script. The feedback from these tests is often used to inform the next round of development since it usually comes too late in the process to allow for significant changes to be incorporated into the current release.
Although the Polestar team did not have the budget to conduct contextual inquiry, it was able to work closely with the software engineer who built the research prototype responsible for spawning the project. This allowed the team to perform early formative testing with the aid of a working UX design, which in turn made it possible to refine the user stories that would be used as the basis for further testing. Working with the software engineer responsible for the initial design also made it possible to evaluate some of the initial UX designs both from a performance and a feasibility perspective, preventing a lot of unwelcome surprises once development was under way in earnest.
[Photos, top to bottom: Jean-Luc Agathos, Terry Coatta, Julian Gosper, and Richard Rutter.]
Terry Coatta: You mentioned you worked early on with a software engineer to iterate on some lightweight paper prototypes. What was the role of the engineer in that process?
Julian Gosper: Adam Binnie, a senior product manager who had conceived of the project, brought me in to elaborate the interaction design of the early Polestar prototype. Davor Cubranic, who had produced that initial proof-of-concept for research purposes, was looking to work some of the user experience ideas I had begun to collaborate on with Adam back into his original design. Davor saw value in creating Java prototypes of some of those new concepts so we would have not only paper prototypes to work with, but also a live prototype that end users could interact with and that development could evaluate from a technical perspective. I really pushed for that since we already had this great development resource available to us. And it didn’t seem as though it was going to take all that long for Davor to hammer out some of the key UX concepts, which of course was going to make the live prototype a much better vehicle for formative testing.
Generally speaking, as an interaction designer you don’t want to invest a lot of time programming something live, since what you really want is to keep iterating on the fundamentals of the design quickly. That’s why working with paper prototypes is so commonplace and effective early in a project. Typically, we’ll use Illustrator or Visio to mock up the key use cases and their associated UI, interactions, and task flows, and then output a PowerPoint deck you can just flip through to get a sense for a typical user session. Then various project stakeholders can mark that up to give you feedback. You also then have a tool for formative testing.
Collaborating closely with development at that stage was appealing in this particular case, however, because some of the directions we were taking with the user interface were likely to have serious back-end implications—
for example, the ability of the application to return and reevaluate facets and visualizations with each click. Having Davor there to help evaluate those proposed design changes right away from a performance perspective through rapid iterations of a lightweight Java-based prototype helped to create a nice set of synergies right from the get-go.
Jean-Luc Agathos: Even though I didn’t get involved in the project until later, it seemed as though Davor had produced a solid proof-of-concept. He also figured out some very important processing steps along the way, in addition to assessing the feasibility of some key algorithms we developed a bit later in the process. I think this was critical for the project, since even though people tend to think about user experience as being just about the way things are displayed, it’s also important to figure out how that stuff needs to be manipulated so it can be processed efficiently.
Gosper: That’s absolutely correct. For this product to succeed, performance was critical—both in terms of processing a large index of data and being able to evaluate which facets to return to a user to support the experience of clicking through a new data analysis at the speed of thought. The capabilities for assessing the relevance of metadata relative to any particular new query was actually Davor’s real focus all along, so he ended up driving that investigation in parallel to the work I was doing to refine the usability of the interface.
I do recall having discussions with Davor where he said, “Well, you know, if you approach ‘X’ in the way you’re suggesting, there is going to be a significant performance hit; but if we approach it this other way, we might be able to get a much better response time.” So we went back and forth like that a lot, which I think ultimately made for a much better user experience design than would have been possible had we taken the typical waterfall approach.
Coatta: Are product engineers typically excited about being involved in a project when the user experience stuff is still in its earliest stages of getting sorted out?
Agathos: Yes, I think developers need to be acquainted with the user stories as early in the process as possible. To me, user experience and development are essentially one and the same. I see it as our job as a group to turn the user stories into deliverables.
Of course, in development we are generally working from architectural diagrams or some other kind of product description that comes out of product management. We’re also looking to the user experience people to figure out what the interaction model is supposed to look like.
The interesting thing about the first version of Polestar is that Julian essentially ended up taking on both roles. He acted as a program manager because he knew what the user stories were and how he wanted each of them to be handled in terms of product functionality. He also had a clear idea of how he wanted all of that to be exposed in the UI and how he wanted end users ultimately to be able to interact with the system. That greatly simplified things from my perspective because I had only one source I had to turn to for direction.
Coatta: You used Agile for this project. At what point in the process can the software developers and UX designers begin to work in parallel?
Gosper: If you have a good set of user stories that have been agreed upon by the executive members of the project and include clear definitions of the associated workflows and use cases, then the Agile iterative process can begin. At that point you are able to concretely understand the functionality and experience the product needs to offer. On the basis of that, both UX interaction designers and the development team should have enough to get going in parallel. That is, the developers can start working on what the product needs to do while the UX guys can work on use-case diagramming, wireframing and scenarios, as well as begin to coordinate the time of end users to supply whatever validation is required.
The important thing is that you have lots of different people involved to help pull those user stories together. Clearly, the UX team needs to be part of that, but the development team should participate as well—along with the business analysts and anybody else who might have some insights regarding what the product requirements ought to be. That’s what I think of when we
talk about starting development on the basis of some common “model.”
For the most part, development of Polestar’s first release went smoothly. Gosper’s collaboration with the prototype developer helped iron out some of the technical challenges early on, as well as refine the main user stories, which helped pave the way for implementing the product with Agathos. The user interface that emerged from that process did face certain challenges, which were verified during summative testing, after development work for release 1 was essentially complete. Contributing to the problem was the general dearth of query tools for casual business users at the time. The blessing there was that Gosper and his UX team had the freedom to innovate with their design. That innovation, however, meant the software would inevitably end up incorporating new design concepts that—even with many iterations of formative testing—faced an uncertain reception from users.
Further challenges arose during the development of subsequent releases of Polestar, once Agathos and Gosper were no longer working together. In response to user feedback, the new UX team decided to make fundamental changes to the UI, which forced the engineering team to rearchitect some parts of the software. On top of these challenges, the new team learned what could happen when there is confusion over the underlying conceptual model, which is something Agathos and Gosper had managed to avoid during the first phase of development.
Of course, that’s not to say the initial phase was entirely free of development challenges, some of which could be ascribed to pressure to prove the worth of the new endeavor to management.
Richard Rutter: What were the most challenging parts of this project?
Agathos: One big challenge was that right after we received the original Java POC (proof of concept), we received word that we needed to produce a Flash version in less than two weeks so it could be shown at Business Objects’s [now SAP BusinessObjects] international user group meeting. Actually, that was only one of many constraints we faced. The POC itself imposed certain constraints in terms of how to approach processing the data and exposing appropriate facets to the user. There already were some UI constraints driven by the user stories. Then we learned we had only a couple of weeks to get something ready to show to customers. And finally, it was strongly suggested that we use Adobe Flex—something we had not even been aware of previously—to get the work done.
So, our first job was to learn that technology. Initially, we were pretty frustrated about that since it wasn’t really as if we had a choice in the matter. Instead of fighting it, however, we quickly realized it was a really good idea. Flex is so powerful that it actually allowed us to come up with a working prototype in time for that user conference.
Gosper: There was a special aspect to this project in that it was earmarked as innovation—a step toward next-generation self-service BI. Prior to this, the company had never productized a lightweight business intelligence tool for business users, so we didn’t really have much in the way of similar products—either within the company or out in the market—that we could refer to. As a result, the design we ended up pushing forward was sort of risky from a user adoption perspective.
The first version of Polestar, in all honesty, tended to throw users for a loop at first. Most of the users we tested needed to spend a few minutes exploring the tool just to get to the point where they really understood the design metaphors and the overall user experience.
Agathos: That was definitely the case.
Gosper: That sparked a fair amount of controversy across the UX group because some of the methodologies around formative testing back then differed from one site to another. The next designers assigned to the project ended up coming to different conclusions about how to refine the interaction design and make it more intuitive. Some questions were raised about whether we were fundamentally taking the right interaction development approach in several different areas of the interface. We employed faceted navigation in a fairly unique way, and the interactions we created around analytics were also pretty novel. [Since users often have only a fuzzy idea about some of the parameters or aspects of the information they’re seeking, faceted navigation is a UI pattern that facilitates queries by allowing users to interactively refine relevant categories (or “facets”) of the search.]
A lot of those initial design directions ended up being re-explored as open topics as the UX team in Paris worked on the second version of the product. They ended up coming to some different conclusions and began to promote some alternatives. That, of course, can be very healthy in the evolution of a product, but at the same time, it can really challenge the development team whenever the new UI choices don’t align entirely with the code you’ve already got running under the hood.
Coatta: Another area where I think there is huge potential for trouble has to do with the feedback process to UX design. I’ve been through that process and have found it to be extremely challenging. As an example, suppose you need to find two business objects that are related to each other. Let’s say we know that one of those objects can be shared across multiple business domains. One possibility is that they share a unique ownership, which would have huge ramifications for the user experience. When we have run across situations like that, we’ve often had trouble communicating the semantic implications back to the folks who are responsible for doing the UI. I wonder if you’ve run into similar situations.
Agathos: Actually, Julian had already worked all of that out before joining us in Paris, so we didn’t have any problems like that during our initial round of development. With later versions, however, we had an issue in the administration module with “infospace,” which is a concept we exposed only to administrators. The idea was that you could create an infospace based on some data source, which could be a BWA (Burrows-Wheeler Alignment) index or maybe an Excel spreadsheet. [BWA is a fast, lightweight tool that aligns relatively short queries to a sequence database. These sequences are usually indexed in the FASTA format.] Before anybody can use the system’s exploration module to investigate one
of these information spaces, that space must be indexed, resulting in a new index entity. That seems straightforward enough, but we spent a lot of time trying to agree on what ought to happen when one person happens to be exploring an infospace while someone else is trying to index the same space. Those discussions proved to be difficult simply because we had not made it explicit that the entity in question was an index. That is, we talked about it at first strictly as an information space. It wasn’t until after a few arguments that it became clear what we were actually talking about was an index of that information space.
Whenever the model can be precisely discussed, you can avoid a lot of unnecessary complexity. When a model is correctly and clearly defined right from the outset of a project, the only kind of feedback that ought to be required—and the only sort of change you should need to add to your sprints in subsequent iterations—has to do with the ways you want to expose things. For example, you might decide to change from a dropdown box to a list so users can be given faster access to something—with one click, rather than two. If you start out with a clear model, you’re probably not going to need to make any changes later that would likely have a significant impact on either the UI or the underlying architecture.
In developing a product for which there was no obvious equivalent in the marketplace, Agathos and Gosper were already sailing into somewhat uncharted waters. And there was yet another area where they would face the unknown: neither had ever used Agile to implement a UX design. Adding to this uncertainty was the Agile development methodology itself, where the tenets include the need to accept changes in requirements, even late in the development process.
Fortunately, the Polestar team soon embraced the iterative development process and all its inherent uncertainties. In fact, both Gosper and Agathos found Agile to be far more effective than the waterfall methodology for implementing UX designs. This doesn’t mean all was smooth sailing, however. Agile requires close collaboration between team members and frequent face-to-face communication, often between stakeholders with vastly different backgrounds and technical vocabularies. It’s therefore no surprise that one of the team’s biggest challenges had to do with making sure communication was as clear and effective as possible.
Coatta: I understand that some UX designers feel Agile is less something to be worked with than something to be worked around. Yet, this was the first time you faced implementing a UX design using Agile, and it appears you absolutely loved it. Why is that?
Gosper: In an ideal world, you would do all your contextual inquiry, paper prototyping, and formative testing before starting to actually write lines of code. In reality, because the turnaround time between product inception and product release continues to grow shorter and shorter, that simply isn’t possible. If the UX design is to be implemented within a waterfall project, then it’s hard to know how to take what you’re learning about your use cases and put that knowledge to work once the coding has begun.
In contrast, if you are embedded with the development team and you’re acquainted, tactically, with what they’re planning to accomplish two or three sprints down the road, you can start to plan how you’re going to test different aspects of the user experience in accordance with that.
Coatta: In other words, you just feel your way along?
Gosper: Yes, and it can be a little scary to dive right into development without knowing all the answers. And by that I don’t just mean that you haven’t had a chance to work through certain areas to make sure things make sense from an engineering perspective; you also can’t be sure about how to go about articulating the design because there hasn’t been time to iterate on that enough to get a good read on what’s likely to work best for the user. Then you have also got to take into account all the layers of development considerations. All of that comes together at the same time, so you’ve got to be very alert to everything that’s happening around you.
There is also a lot of back-and-forth
in the sprint planning that has to happen. For example, Jean-Luc would let me know we really needed to have a certain aspect of the UI sorted out by some particular sprint, which meant I had essentially received my marching orders. Conversely, there also were times when I needed to see some particular functionality in place in time to use it as part of a live build I wanted to incorporate into an upcoming round of formative testing. That, alone, would require a ton of coordination.
The other important influence on sprint planning comes from product management, since they too often want to be able to show off some certain capabilities by some particular date. With all three of these vested interests constantly vying to have certain parts of the product materialize by certain points in time, there’s plenty of negotiating to be done along the way.
Rutter: Yes, but I think you have to accept that some reengineering of bits and pieces along the way is just an inherent part of the process. That is, based on feedback from testing, you can bank on sprints down the line involving the reengineering of at least a few of the things that have already been built. But as long as that’s what you expect…no problem.
Agathos: I think you’re right about that: we have to develop a mind-set that’s accepting of changes along the way. But I actually think the biggest problem is the tooling. The development tools we have today don’t help us with all those iterations because they don’t provide enough separation between the business logic and the UI. Ideally, we should be able to change the UI from top to bottom and revisit absolutely every last bit of it without ever touching the underlying logic. Unfortunately, today that just isn’t the case.
Gosper: That’s why, as a UX interaction designer, you have to be prepared to demonstrate to the product developers that any changes you’re recommending are based on substantive evidence, not just some intuitive or anecdotal sense of the users’ needs. You need to make a strong case and be able to support design changes with as
much quantitative and qualitative end-user validation data as you can get your hands on.
Coatta: So far, we’ve looked at this largely from the UX side of the equation. What are some of the benefits and challenges of using Agile to implement UX design from more of an engineering perspective?
Agathos: The biggest challenge we faced in this particular project—at least after we had completed the first version of Polestar—had to do with changes that were made in the overall architecture just as we were about to finish a release. Even in an Agile environment you still need to think things through pretty thoroughly up front, and then sketch out your design, prototype it, test it, and continue fixing it until it becomes stable. As soon as you’ve achieved something that’s pretty solid for any given layer in the architecture, you need to finish that up and move on to the next higher layer. In that way, it is like creating a building. If you have to go back down to the foundation to start making some fairly significant changes once you’ve already gotten to a very late stage of the process, you’re likely to end up causing damage to everything else you’ve built on top of that.
The nice thing about Agile is that it allows for design input at every sprint along the way—together with some discussion about the reasoning behind that design input. That’s really important. For example, when we were working with Julian, he explained to us that one of the fundamental design goals for Polestar was to minimize the number of clicks the end user would have to perform to accomplish any particular task. He also talked about how that applied to each of the most important user stories. For us as developers, it really helped to understand all that.
I don’t think we have exchanges like that nearly enough—and that doesn’t apply only to the UX guys. It would also be good to have discussions like that with the program managers.
Gosper: In those discussions for the Polestar project, one of the greatest challenges for me had to do with figuring out just how much depth to go into when describing a particular design specification. Sometimes a set of wireframes supporting a particular use case seemed to be good enough, since most of what I wanted to communicate could be inferred from them. But there were other times when it would have been helpful for me to break things down into more granular specifications. It was a bit challenging in the moment to sort that out, because on the one hand you’re trying to manage your time in terms of producing specifications for the next sprint, but on the other hand you want to get them to the appropriate depth.
The other challenge is that each domain has its own technical language, and it can sometimes prove tricky to make sure you’re correctly interpreting what you’re hearing. For example, I remember one of the sprints where I was very concerned with a particular set of functionality having to do with users’ abilities to specify financial periods for the data they might be looking to explore. I therefore became very active in trying to get product management to allocate more resources to that effort because I was certain it would be a major pain point for end users if the functionality was insufficient.
During that time I remember seeing a sprint review presentation that referred to a number of features, a couple of which related to date and financial period and so forth. Next to each of those items was the notation “d-cut.” I didn’t say a word, but I was just flabbergasted. I was thinking to myself, “Wow! So they just decided to cut that. I can’t believe it. And nobody even bothered to tell me.” But of course it turns out “d-cut” stands for “development cut,” which means they had already implemented those items. There are times when you can end up talking past each other just because everyone is using terms specific to his or her own technical domain. Of course, the same is true for product and program management as well.
Coatta: Don’t the different tools used by each respective domain also make contributions to these communication problems?
Agathos: I couldn’t agree more. For example, when you program in Java, there are some things you can express and some you cannot. Similarly, for architects, using a language such as UML constrains them in some ways. They end up having to ask themselves tons of questions prior to telling the system what it is they actually want.
Coatta: Since we’re talking about how engineers and designers live in different worlds and thus employ different tools, different skill sets, and different worldviews—what do you think of having software engineers get directly involved in formative testing? Does that idea intrigue you? Or does it seem dangerous?
Gosper: Both, actually. It all depends on how you act upon that information. At one point there is an internal validation process whereby a product that is just about to be released is opened up to a much wider cross section of people in the company than the original group of stakeholders. And then all those folks are given a script that allows them to walk through the product so they can experience it firsthand.
What that can trigger, naturally, is a wave of feedback on a project that is just about finalized, when we don’t have a lot of time to do anything about it. In a lot of that feedback, people don’t just point out problems; they also offer solutions, such as, “That checkmark can’t be green. Make it gray.” To take all those sorts of comments at face value, of course, would be dangerous. Anyway, my tendency is to think of feedback that comes through development or any other internal channel as something that should provide a good basis for the next user study.
Rutter: UX design should always involve contact with lots of different end users at plenty of different points throughout the process. Still, as Julian says, it’s what you end up doing with all that information that really matters. When it comes to figuring out how to solve the problems that come to light that way, that’s actually what the UX guys get paid to do.
Related articles on queue.acm.org
The Future of Human-Computer Interaction
John Canny
http://queue.acm.org/detail.cfm?id=1147530
Human-KV Interaction
Kode Vicious
http://queue.acm.org/detail.cfm?id=1122682
Other People’s Data
Stephen Petschulat
http://queue.acm.org/detail.cfm?id=1655240
© 2011 ACM 0001-0782/11/0100 $10.00
doi:10.1145/1866739.1866754
Virtualization: Blessing or Curse?
Virtualization is often touted as the solution to many challenging problems, from resource underutilization to data-center optimization and carbon emission reduction. However, the hidden costs of virtualization, largely stemming from the complex and difficult system administration challenges it
ture as a Service) cloud offerings that give users access to computing resources on demand in the form of VMs. This can improve developer productivity and reduce time to market, which is key in today’s fast-moving business environment. Since rolling out an application sooner can provide first-mover advantage, virtualization can help boost the business.
The Practice
Although virtualization is a 50-year-old technology,3 it reached broad popularity only as it became available for the x86 platform from 2001 onward—and most large enterprises have been using the technology for fewer than five years.1,4 As such, it is a relatively new technology, which, unsurprisingly, carries a number of less-well-understood system administration challenges.
Old Assumptions. It is not, strictly speaking, virtualization’s fault, but many systems in an enterprise infrastructure are built on the assumption of running on real, physical hardware. The design of operating systems is often based on the principle that the hard disk is local, and therefore reading from and writing to it is fast and low cost. Thus, they use the disk generously in a number of ways, such as caching, buffering, and logging. This, of course, is perfectly fair in a nonvirtualized world.
With virtualization added to the mix, many such assumptions are turned on their heads. VMs often use shared storage, instead of local disks, to take advantage of high availability and load-balancing solutions—a VM with its data on the local disk is a lot more difficult to migrate, and doomed if the local disk fails. With virtualization, each read and write operation travels to shared storage over the network or Fibre Channel, adding load to the network interface controllers (NICs), switches, and shared storage systems. In addition, as a result of consolidation, the network and storage infrastructure has to cope with a potentially much higher number of systems, compounding this effect. It will take years for the entire ecosystem to adapt fully to virtualization.
System Sprawl. Conventional wisdom has it that the operational workload of managing a virtualized server running multiple VMs is similar to that of managing a physical, nonvirtualized server. Therefore, as dozens of VMs can run on one virtualized server, consolidation can reduce operational workload. Not so: the workload of managing a physical, nonvirtualized server is comparable to that of managing a VM, not the underlying virtualized server. The fruits of common, standardized management—such as centrally held configuration and image-based provisioning—have already been reaped by enterprises, as this is how they manage their physical environments. Therefore, managing 20 VMs that share a virtualized server requires the same amount of work as managing 20 physical servers. Add to that the overhead of managing the hypervisor and associated services, and it is easy to see that operational workload will be higher.
More importantly, there is evidence that virtualization leads to an increase in the number of systems—now running in VMs—instead of simply consolidating existing workloads.2,5 Making it easy to get access to computing capacity in the form of a VM, as IaaS clouds do, has the side effect of leading to a proliferation of barely used VMs, since developers forget to return the VMs they do not use to the pool after the end of a project. As the number of VMs increases, so does the load placed on administrators and on shared infrastructure such as storage, Dynamic Host Configuration Protocol (DHCP), and boot servers.
Most enterprise users of virtualization implement their own VM reclamation systems. Some solutions are straightforward and borderline simplistic: if nobody has logged on for more than three months, then notify and subsequently reclaim if nobody objects. Some solutions are elaborate and carry the distinctive odor of overengineering: analyze resource utilization over a period of time based on heuristics; determine level of usage; and act accordingly. Surprising as it may be, there is a lack of generic and broadly applicable VM reclamation solutions to address sprawl challenges. In addition, services that are common to all VMs sharing a host—such as virus scanning, firewalls, and backups—should become part of the virtualization layer itself. This has already started happening with such services entering the hypervisor, and it has the potential to reduce operational workload substantially.
Scale. Enterprises have spent years improving and streamlining their management tools and processes to handle scale. They have invested in a backbone of configuration management and provisioning systems, operational tools, and monitoring solutions that can handle building and managing tens or even hundreds of thousands of systems. Thanks to this—largely homegrown—tooling, massively parallel operational tasks, such as the build-out of thousands of servers, daily operating system checkouts, and planned data-center power-downs, are routine and straightforward for operational teams.
Enter virtualization: most vendor solutions are not built for the large enterprise when it comes to scale, particularly with respect to their management frameworks. Their scale limitations are orders of magnitude below those of enterprise systems, often because of fundamental design flaws—such as overreliance on central components or data sources. In addition, they often do not scale out; running more instances of the vendor solution will not fully address the scaling issue, as the instances will not talk to each other. This challenge is not unique to virtualization. An enterprise faces similar issues when it introduces a new operating system to its environment. Scaling difficulties, however, are particularly important when it comes to virtualization for two reasons: first, virtualization increases the number of systems that must be managed, as discussed in the section on system sprawl; second, one of the main benefits of virtualization is central management of the infrastructure, which cannot be achieved without a suitably scalable management framework.
As a result, enterprises are left with a choice: either they live with a multitude of frameworks with which to manage the infrastructure, which increases operational complexity; or they must engineer their own solutions that work around those limitations—for example, the now open source Aquilon framework extending the Quattor toolkit (http://www.quattor.org). Another option is for enterprises to wait until the vendor ecosystem catches up with enterprise-scale requirements before they virtualize. The right answer depends on a number of factors, including the enterprise’s size, business requirements, existing backbone of systems and tools, size of virtualized and virtualizable infrastructure, engineering capabilities, and sophistication and size of operational teams.
Interoperability. Many enterprises have achieved a good level of integration between their backbone systems. The addition of a server in the configuration-management system allows it to get an IP address and host name. The tool that executes a power-down draws its data about what to power off seamlessly from the configuration-management system. A change in a server’s configuration will automatically change the checkout logic applied to it. This uniformity and tight integration massively simplifies operational and administrative work.
Virtualization often seems like an awkward guest in this tightly integrated enterprise environment. Each virtualization platform comes with its own APIs, ways of configuring, describing, and provisioning VMs, as well as its own management tooling. The vendor ecosystem is gradually catching up, providing increased integration between backbone services and virtualization management. Solutions are lacking, however, that fulfill all three of the following conditions:
• They can be relatively easily integrated with homegrown systems.
• They can handle multiple virtualization platforms.
• They can manage virtual as well as physical infrastructure.
To be sure, some enterprises are fortunate enough to have a homogeneous environment, managed by a product suite for which solid virtualization extensions already exist. In a heterogeneous infrastructure, however, with more than one virtualization platform, with virtualized and nonvirtualized parts, and with a multitude of tightly integrated homegrown systems, the introduction of virtualization leads to administration islands—parts of the infrastructure that are managed differently from everything else. This breaks the integration and uniformity of the enterprise environment, and increases operational complexity.
Many enterprises will feel like they have been here before—for example, when they engineered their systems to be able to provision and manage multiple operating systems using the same frameworks. Once again, customers face the “build versus suffer” choice. Should they live with the added operational complexity of administration islands until standardization and convergence emerge in the marketplace, or should they invest in substantial engineering and integration work to ensure hypervisor agnosticism and integration with the existing backbone?
Troubleshooting. Contrary to con-
as a solution for many challenging problems. Expectations are running high. Can virtualization deliver?
a physical computer is often a case of glancing at a few log files on the local disk and potentially investigating local hardware issues. The amount of data that needs to be looked at is relatively small, contained, and easily found. Monitoring performance and diagnosing a problem of a virtual desktop, on the other hand, requires trawling through logs and data from a number of sources including the desktop operating system, the hypervisor, the storage system, and the network.
In addition, this large volume of disparate data must be aggregated or linked; the administrator should be able to obtain information easily from all relevant systems for a given time period, or to trace the progress of a specific packet through the storage and network stack. Because of this multisource and multilayer obfuscation, resolution will be significantly slower if administrators have to look at several screens and manually identify bits of data and log files that are related, in terms of either time or causality. New paradigms are needed for storing, retrieving, and linking logs and performance data from multiple sources. Experience from fields such as Web search can be vital in this endeavor.
Silos? What Silos? In a nonvirtualized enterprise environment, responsibilities for running different parts of the infrastructure are neatly divided among operational teams, such as Unix, Windows, network, and storage operations. Each team has a clear scope of responsibility, communication among teams is limited, and apportioning credit, responsibility, and accountability for infrastructure issues is straightforward.
Virtualization bulldozes these silo walls. Operational issues that involve more than one operational team—and, in some cases, all—become far more common than issues that can be resolved entirely within a silo. As such, cross-silo collaboration and communication are of paramount importance,
substantial resources into understanding the impact of changes to different parts of the infrastructure. Change-management processes and policies are well oiled and time tested, ensuring that every change to the environment is assessed and its impact documented.
Once again, virtualization brings fundamental change. Sharing the infrastructure comes with centralization and, therefore, with potential bottlenecks that are not as well understood. Rolling out a new service pack that increases disk utilization by 5 IOPS (input/output operations per second) on each host will have very little impact in a nonvirtualized environment—each host will be using its disk a little more often. In a virtualized environment, an increase of disk usage by 5 IOPS per VM will result in an increase of 10,000 IOPS on a storage system shared by 2,000 VMs, with potentially devastating consequences. It will also place increased load on the shared host, as more packets will have to travel through the hypervisor, as well as the network infrastructure. We have seen antivirus updates and operating-system patches resulting in increases in CPU utilization on the order of 40% across the virtualized plant—changes that would have a negligible effect when applied to physical systems.
Similarly, large-scale reboots can impact shared infrastructure components in ways that are radically different from the nonvirtualized past. Testing and change management processes need to change to account for effects that may be much broader than before.
Contention. Virtualization platforms do a decent job of isolating VMs on a shared physical host and managing resources on that host (such as CPU and memory). In a complex enterprise environment, however, this is only part of the picture. A large number of VMs will be sharing a network switch, and an even larger number of VMs will be sharing a storage system. Contention on those parts of the virtualized stack
64 communications of the acm | january 2011 | vol. 54 | no. 1
practice
can have as much impact as contention on a shared host, or more. Consider the case where a rogue VM overloads shared storage: hundreds or thousands of VMs will be slowed down.

Functionality that allows isolating and managing contention when it comes to networking and storage elements is only now reaching maturity and entering the mainstream virtualization scene. Designing a virtualization technology stack that can take advantage of such features requires engineering work and a good amount of networking and storage expertise on behalf of the enterprise customer. Some do that, combining exotic network adapters that provide the right cocktail of I/O virtualization in hardware with custom rack, storage, and network designs. Some opt for the riskier but easier route of doing nothing special, hoping that system administrators will cope with any contention issues as they arise.

GUIs. Graphical user interfaces work well when managing an email inbox, data folder, or even the desktop of a personal computer. In general, it is well understood in the human-computer interaction research community that GUIs work well for handling a relatively small number of elements. If that number gets large, GUIs can overload the user, which often results in poor decision making.7 Agents and automation have been proposed as solutions to reduce information overload.6

Virtualization solutions tend to come with GUI-based management frameworks. That works well for managing 100 VMs, but it breaks down in an enterprise with 100,000 VMs. What is really needed is more intelligence and automation; if the storage of a virtualized server is disconnected, automatically reconnecting it is a lot more effective than displaying a little yellow triangle with an exclamation mark in a GUI that contains thousands of elements. What is also needed is interoperability with enterprise backbones and other systems, as mentioned previously.

In addition, administrators who are accustomed to the piecemeal systems management of the previrtualization era (managing a server here and a storage element there) will discover they will have to adapt. Virtualization brings unprecedented integration and hard dependencies among components: a storage outage could mean that thousands of users cannot use their desktops. Enterprises need to ensure that their operational teams across all silos are comfortable with managing a massively interconnected large-scale system, rather than a collection of individual and independent components, without GUIs.

Conclusion

Virtualization holds promise as a solution for many challenging problems. It can help reduce infrastructure costs, delay data-center build-outs, improve our ability to respond to fast-moving business needs, allow a massive-scale infrastructure to be managed in a more flexible and automated way, and even help reduce carbon emissions. Expectations are running high.

Can virtualization deliver? It absolutely can, but not out of the box. For virtualization to deliver on its promise, both vendors and enterprises need to adapt in a number of ways. Vendors must place strategic emphasis on enterprise requirements for scale, ensuring that their products can gracefully handle managing hundreds of thousands or even millions of VMs. Public cloud service providers do this very successfully. Standardization, automation, and integration are key; eye-pleasing GUIs are less important. Solutions that help manage resource contention end to end, rather than only on the shared hosts themselves, will significantly simplify the adoption of virtualization. In addition, the industry’s ecosystem needs to consider the fundamental redesign of components that perform suboptimally with virtualization, and it must provide better ways to collect, aggregate, and interpret logs and performance data from disparate sources.

Enterprises that decide to virtualize strategically and at a large scale need to be prepared for the substantial engineering investment that will be required to achieve the desired levels of scalability, interoperability, and operational uniformity. The alternative is increased operational complexity and cost. In addition, enterprises that are serious about virtualization need a way to break the old dividing lines, foster cross-silo collaboration, and instill an end-to-end mentality in their staff. Controls to prevent VM sprawl are key, and new processes and policies for change management are needed, as virtualization multiplies the effect of changes that would previously be of minimal impact.

Virtualization can bring significant benefits to the enterprise, but it can also bite the hand that feeds it. It is no curse, but, like luck, it favors the prepared.

Acknowledgments

Many thanks to Mostafa Afifi, Neil Allen, Rob Dunn, Chris Edmonds, Robbie Eichberger, Anthony Golia, Allison Gorman Nachtigal, and Martin Vazquez for their invaluable feedback and suggestions. I am also grateful to John Stanik and the ACM Queue Editorial Board for their feedback and guidance in completing this article.

Related articles on queue.acm.org

Beyond Server Consolidation
Werner Vogels
http://queue.acm.org/detail.cfm?id=1348590

CTO Roundtable: Virtualization
http://queue.acm.org/detail.cfm?id=1508219

The Cost of Virtualization
Ulrich Drepper
http://queue.acm.org/detail.cfm?id=1348591

References
1. Bailey, M., Eastwood, M., Gillen, A., Gupta, D. Server virtualization market forecast and analysis, 2005–2010. IDC, 2006.
2. Brodkin, J. Virtual server sprawl kills cost savings, experts warn. NetworkWorld. Dec. 5, 2008.
3. Goldberg, R.P. Survey of virtual machine research. IEEE Computer Magazine 7, 6 (1974), 34–45.
4. Humphreys, J. Worldwide virtual machine software 2005 vendor shares. IDC, 2005.
5. IDC. Virtualization market accelerates out of the recession as users adopt “Virtualize First” mentality; 2010.
6. Maes, P. Agents that reduce work and information overload. Commun. ACM 37, 7 (1994), 30–40.
7. Schwartz, B. The Paradox of Choice. HarperCollins, NY, 2005.

Evangelos Kotsovinos is a vice president at Morgan Stanley, where he leads virtualization and cloud-computing engineering. His areas of interest include massive-scale provisioning, predictive monitoring, scalable storage for virtualization, and operational tooling for efficiently managing a global cloud. He also serves as the chief strategy officer at Virtual Trip, an ecosystem of dynamic start-up companies, and is on the Board of Directors of NewCred Ltd. Previously, Kotsovinos was a senior research scientist at T-Labs, where he helped develop a cloud-computing R&D project into a VC-funded Internet start-up. A pioneer in the field of cloud computing, he led the XenoServers project, which produced one of the first cloud-computing blueprints.

© 2011 ACM 0001-0782/11/0100 $10.00
contributed articles
doi:10.1145/1866739.1866756
Follow the Intellectual Property

How companies pay programmers when they move the related IP rights to offshore taxhavens.

by Gio Wiederhold

In the ongoing discussion about offshoring in the computer and data-processing industries, the 2006 ACM report Globalization and Offshoring of Software addressed job shifts due to globalization in the software industry.1 But jobs represent only half of the labor and capital equation in business. In today’s high-tech industries, intellectual property (IP) supplies the other half, the capital complement. Offshoring IP always accompanies offshoring jobs and, while less visible, may be a major driver of job transfer. The underlying economic model (involving ownership of profits, taxation, and compensation of workers from the revenue their products generate) has not been explicated and is largely unknown in the computer science community. This article presents the issue of software income allocation and the role IP plays in offshoring. It also tries to explain why computer experts’ lack of insight into

key insights
Profits from the work of software creators and programmers are based on IP being moved offshore.
Locating IP in primary taxhavens damages both developed and emerging economies and disadvantages small businesses.

available for investment in new projects and jobs.

My intent is to help make computer scientists aware of the relationship of the flow of jobs in computing and the flow of preexisting IP. The ability to create valuable software greatly depends on prior technological prowess. The processes allowing IP to be moved offshore, beyond where the software was created, are formally legal. The resulting accumulation of massive capital in taxhavens[a] has drawn governmental attention and put pressure on officials to change tax regulations.1 However, the changes proposed in these discussions ignore IP’s crucial role in generating such capital and, even if enacted, would be ineffective. Transparency is needed to gain public support for any effective change. In addition to advocating transparency about IP transfer, I also offer a radical suggestion: eliminate corporate taxation as a way to avoid the distortion now driving the outflow of IP and providing much of the motivation for keeping capital and IP offshore.

[a] The notion of a taxhaven is a concept in ordinary discourse and a crucial aspect of this article. Moreover, using a one-word term simplifies the specification and parsing of subsets, as in “primary taxhavens” and “semi-taxhavens.”

I do not address the risk of misappropriation of IP when offshoring, a related but orthogonal issue, covering instead only the processes that are legal. The risk of loss was addressed throughout the 2006 ACM report,1 which also cited tax incentives, a much larger economic factor for businesses than misappropriation of IP. The role of taxhavens was ignored.
Programmers and the computer scientists supporting their work have traditionally focused on producing quality high-performance software on time and at an affordable cost.4 They are rarely concerned with the sales and pricing of software, questioning financial policies only when the company employing them goes broke. There is actually a strong sense in the profession that software should be a free good.12 Implicit in this view is that government, universities, and foundations should pay for software development, rather than the users benefitting from it. In this model, programmers see themselves as artists creating beauty and benefits for all mankind. But consider the size of the software industry. In the U.S. alone, its revenue is $121 billion per year, well over 1% of U.S. GDP.7 An even larger amount is spent in non-software companies for business-specific software development and maintenance. The more than 4.8 million people employed in this and directly related fields earn nearly $333 billion annually.5 It is hence unlikely that universal free software is an achievable or even desirable goal. Appropriately, open-source initiatives focus on software that deserves wide public use (such as editors, compilers, and operating systems) and should be freely available to students and innovators.
comes less critical. To gain financial flexibility, a company might identify and isolate its IP. The rights to identified IP, as trademarks and technology, can be moved to a distinct subcorporation. Separating IP is an initial phase in setting up an offshored operation when significant IP is involved.29 To be productive, the extant technology still must be made available to the creative workers, by having the productive corporate divisions pay license fees to the subcorporation holding the technology IP; see the sidebar “Property Rights” for an illustrative example clarifying the process of splitting rights from the property itself.

Such transfer-of-rights transactions are even simpler when applied to IP. The rights to a company’s IP or to an arbitrary fraction of that IP can be sold to a controlled foreign holding company (CFH) set up in a taxhaven. Once the rights to the IP are in the CFH the flow of income and expenses changes. The rights to the IP are bundled, so no specific patents, trade secrets, or documents are identified. The net income attributable to the fraction of the IP held in the CFH is collected in an account also held in the taxhaven. One way of collecting such income is to charge royalties or license fees for the use of the IP at the sites where the workers create saleable products, both at home and offshore. There is no risk of IP loss at the CFH, because nothing is actually kept there. To reduce the risk of IP loss where the work is performed, new offshore sites are set up as controlled foreign corporations (CFCs), rather than using contractors.29 Since IP is crucial to making non-routine profits, the royalty license fees to be paid to the CFH can be substantial and greatly reduce the profitability at the parent and at the CFCs from worldwide software product sales (see Figure 2).

The consolidated enterprise thus

Figure 1. Components of the economic loops for software. [Diagram labels: Public and Private Investments; Common Knowledge; Know-How of the work force; Intellectual Capital; Intellectual Property (Technology, Trademarks); Integration; Commodity Products; High-value Products; routine profits; non-routine profits; taxes.]

Figure 2. Extracting and selling the rights to derive income from a property. [Diagram labels: Parent corporation; Offshore job sites; Know-How of the workforce; Salaries; Integration; Initial purchase; License Fees; High-value Products.]

switching to cheaper labor.

The actual IP content needed to perform creative work is transferred through multiple paths: documents, code, and personal interaction by staff interchanges among the remote sites and the originating location. Most transfers are mediated by the Internet, allowing rapid interaction and feedback. The CFH does not get involved at all.

Three types of parties are involved in IP creation (as in Figure 2): the parent, the CFH, and the CFCs. Employees work at and create IP at the parent and the CFC locations. Large multinational corporations actually establish dozens of controlled entities to take advantage of different regulations and incentives in various countries.

Valuing Transferred IP

The CFH subcorporation that obtains the rights to the IP, and that will profit
from fees for its use, must initially purchase the IP from the prior owner. For work that is offshored, the new workers do not contribute prior proprietary knowledge, only IP subsequently. But setting a fair price for the initial IP received is difficult and risky. If it is overvalued, the company selling it to the CFH will have gained too much income, on which it must pay taxes. If it is undervalued, excessive profits will accrue to the CFH.

How does the company document the value of its transferred IP? The annual reports to shareholders and the 10-K reports submitted annually to the U.S. Securities and Exchange Commission rarely include estimates of the value of a company’s intangible property. Only when one company acquires another high-tech company are due-diligence assessments made of the IP obtained. Various types of methods are available to help assess the value of the IP transferred from a parent to its CFCs or CFH, including these five (with many variations):

Future income. Predict the future income ceded to the CFH, subtract all expected costs, and reduce the remainder to account for routine profits. Compute the IP’s net present value (NPV) over its lifetime to obtain the amount due to the IP2;

Shareholder expectations. Use shareholder expectations embodied in the company’s total market capitalization, subtract the value of its tangibles, and split the remainder among the CFH and the parent3;

Expected value. Search for similar public transactions where the IP transfer was among truly independent organizations, then adjust for differences and calculate a median value19;

Diminishing maintenance. Aggregate the NPV of the specific incomes expected from the products sold over their lifetimes as their initial IP contribution due to maintenance diminishes31;

Expected R&D margin. Extrapolate the past margin obtained from ongoing R&D investment as it delivers benefits over successive years.15

All valuation methods depend on data. The availability of trustworthy data determines applicability and the trustworthiness of their results. Using more than one method helps increase confidence in the resulting valuation of the IP. If existing trademarks are being transferred or kept after an acquisition, their contribution to income requires adjustments as well. The benefits of marketing expenses tend to be short-lived. Technological IP is a mixture, some created through product improvement that drives revenue with little delay and some resulting from the fundamental R&D that takes a long time to get to market.

While valuing all IP in a company is a challenge, for the purpose of offshoring software IP, simplification of confounding items is possible, making the task easier. The amount of tangible property is relatively small in a high-tech company. The value of the work force can be determined through comparison with public data of acquisitions of similar companies with little IP.

Taxhavens

Offshoring is greatly motivated by being able to avoid or reduce taxes on income by moving rights to the IP into low-tax jurisdictions, or taxhavens, categorized as semi-taxhavens, or countries looking to attract jobs through active external investments, and primary taxhavens. Semi-taxhavens tend to provide temporary tax benefits. Countries intent on growth like Israel and Ireland have offered tax holidays to enterprises setting up activities there, while India provides incentives for companies that export. Many Eastern European countries have set up or are considering similar initiatives. Setting up a subsidiary CFC in a semi-taxhaven requires financial capital and significant corporate IP, helping workers be productive quickly. These resources are best provided via primary taxhavens.

Primary taxhavens are countries with small populations that focus on attracting companies that will not use actual resources there, and with no local personnel hired. Although their role is crucial in offshoring, the jobs issue is not raised, and the services needed for remote holding companies (such as registration with local government, mail forwarding, and arranging boards-of-directors meetings) are offered by branches of global accounting firms. For example, a single well-known five-floor building in the Cayman Islands is the address for 18,000 holding companies, and the entire country, with fewer than 50,000 inhabitants, hosts more than 90,000 registered companies and banks. The income from the $3,000 annual registration fees for that many companies allows the Cayman Islands to not impose any taxes on anybody. Even the beach resorts, available for board of directors meetings, are not taxed.

Defining what makes a country a prime taxhaven varies but always includes negligible or no taxation and lack of transparency. A few dozen jurisdictions actively solicit and lobby for business, citing their taxhaven advantages. Reporting income and assets is often not required. Advantages can be combined; for instance, the rule that Cayman-based corporations must have one local annual meeting can be overcome by having a Cayman company be formally resident in a British Crown Colony like Bermuda. Often, only a single CFH shareholder is fully controlled by another corporation. Cayman companies need not have external directors on their boards, and optional board meetings can be held anywhere convenient. Neither audits nor annual reports are required, but for criminal cases, records are made available. At the extreme end of the taxhaven spectrum are countries identified by the Organisation for Economic Co-operation and Development as uncooperative taxhavens, even sheltering fraud.20

The use of primary taxhavens causes a loss to the U.S. Treasury of more than $100 billion annually, a substantial amount compared to the $370 billion total actually collected as corporate tax.8,32 Only $16 billion in taxes were paid (for the year reviewed) by multinational corporations in the U.S.; smaller businesses pay the greatest share.27 The actual tax rate paid by companies using taxhavens averages 5%, even as they complain about high U.S. corporate taxes. It was estimated at a G-20 meeting that developing nations overall lose annual revenue of $125 billion due to taxhaven use.11

Assets in a Taxhaven

Following an IP transfer to a primary taxhaven, the taxhaven CFH will have two types of assets: more-auditable financial ones, derived from licensing and royalties for use of the IP, and the IP itself. Both grow steadily, as outlined in Figure 3, and are now freely available to initiate and grow projects in any CFC. The IP in the primary taxhaven is made available by charging license fees to projects in the semi-taxhavens, providing immediate income to the CFH. When the projects have generated products for sale, royalties on the sales provide further income to the CFH.

[Figure, apparently Figure 3; caption not recovered. Diagram over time relating the taxing country and the primary taxhaven. Labels: $$ for IP; taxes; IP at the parent corporation; $$ for ongoing IP rights; IP rights; profit share for parent; profit share for CFH; all untaxed; more $ available for new projects; new IP.]

Initially, the income at the CFH is used to reimburse the parent company for the assumed value of the IP that was transferred,19 an amount typically paid over several years. Moving the IP offshore early in the life of a company, when there is little documented IP, increases the leverage of this approach. The income of the CFH is also used to pay for ongoing R&D or for the programmers at the parent company and in any IP-generating offshore location.28 U.S. taxes must be paid on such funds as they are repatriated to the U.S., since they represent taxable income. The funds not needed to support R&D (often more than half after the initial payback) can remain in the CFH. In each yearly cycle yet more funds flow to the holding company in the taxhaven. Additional funds may be repatriated from a CFH when a country (such as the U.S.) offers tax amnesties for capital repatriation or when the parent companies show losses, so the corporate income tax due can be offset.1,6

The payments by the CFH for creative work ensure that all resulting IP belongs to the CFH. While the value of the initial IP purchased diminishes over time, the total IP held in the CFH increases as the product is improved and provides a long-term IP and income stream.
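The "future income" valuation method listed earlier (predict income ceded to the CFH, subtract expected costs, remove the routine-profit portion, and discount over the IP's lifetime) can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function name, the idea of expressing routine profits as a fixed share of net income, the constant discount rate, and all numbers in the usage example are hypothetical and do not come from the article or from any actual valuation.

```python
from typing import Sequence


def ip_net_present_value(
    incomes: Sequence[float],
    costs: Sequence[float],
    routine_profit_share: float,
    discount_rate: float,
) -> float:
    """Discounted value of the income attributed to IP (illustrative only).

    incomes[i] and costs[i] are the predicted amounts for year i + 1.
    routine_profit_share is the assumed fraction of net income treated as
    routine profit and therefore not credited to the IP (hypothetical
    simplification). discount_rate is an assumed constant yearly rate.
    """
    if len(incomes) != len(costs):
        raise ValueError("incomes and costs must cover the same years")
    if not 0.0 <= routine_profit_share <= 1.0:
        raise ValueError("routine_profit_share must be between 0 and 1")
    if discount_rate <= -1.0:
        raise ValueError("discount_rate must be greater than -1")

    npv = 0.0
    for year, (income, cost) in enumerate(zip(incomes, costs), start=1):
        net_income = income - cost                       # subtract expected costs
        ip_income = net_income * (1.0 - routine_profit_share)  # remove routine profit
        npv += ip_income / (1.0 + discount_rate) ** year  # discount to the present
    return npv


if __name__ == "__main__":
    # Hypothetical three-year forecast; the figures carry no real-world meaning.
    value = ip_net_present_value(
        incomes=[100.0, 90.0, 80.0],
        costs=[40.0, 40.0, 40.0],
        routine_profit_share=0.25,
        discount_rate=0.10,
    )
    print(f"Estimated IP value: {value:.2f}")
```

Because the result is sensitive to every input, small changes in the assumed discount rate or routine-profit share move the valuation noticeably, which is consistent with the article's point that setting a fair price for transferred IP is difficult and risky.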
taxhavens are disadvantaged, even though most economists view them as the major drivers of growth. In addition to unequal taxation, they are also less likely to be able to benefit from government tax credits for R&D. Such credits enable mature corporations to offset their U.S. R&D labor costs against any taxes remaining on their profits, while the IP generated accrues to their CFH. Smaller companies establish tax shelters as well. Since most large companies have already established them, tax-consulting firms intent on their own growth now also market taxhavens to midsize businesses.

Lack of Transparency

The creators of the software, even if they care where their paycheck comes from and where the IP they produce goes, cannot follow the tortuous path from sales to salary.18 Many intermediate corporate entities are involved, so tracing the sources of programmer income becomes well nigh impossible. Even corporate directors, despite having ultimate responsibility, are not aware of specifics, other than having agreed to a tax-reduction scheme operated by their accountants. Investors and shareholders will not find in consolidated annual reports or 10-K filings any direct evidence of taxhaven use, since regulations devised to reduce paperwork hide amounts held and internal transactions within controlled corporations. Funds transferred for R&D and dividends from taxhavens are first deposited in corporate income accounts, then taxed, but may remain eligible for government tax credits for corporate research. The taxpayers in these countries are not aware of benefits beyond salaries; that is, income from profitable IP will not accrue to the country providing the research credits.23

Tax-avoidance processes have been explored in many publications but not applied to corporate IP transfer16; the adventures of movie and sports stars make more interesting reading. Promoters of corporate tax reduction, seeking to, perhaps, gain more business, provide general documentation, and even address the risks of misvaluation of IP and of faulty royalty rates.19 The complexity of this arrangement makes it easy to cross the boundaries of legality. Misvaluations can greatly reduce the magnitude of IP exports and consequent tax benefits. The firms that provide advice for setting up tax shelters have the required broad competencies not available in the critical constituencies of the computing community.1 Staff of firms providing such advice often function invisibly as directors of their customers’ CFHs. Most advising organizations protect themselves from legal liability by formally splitting themselves into distinct companies for each country in which they operate. These companies then rejoin by becoming members of a “club” (Vereinsgesetz) set up under Swiss laws. The member companies of such clubs do not assume responsibility for one another’s work and advice. But the club can share resources, information, and income among member companies, allowing them to function as a unit.

U.S. government officials are restricted in how they share corporate information. Rules established to protect corporate privacy prohibit the sharing of information among Internal Revenue staff regarding arrangements used by specific taxpayers to avoid taxes. Even a 2008 U.S. government report14 had to rely on survey data and could not use corporate filings. A thorough study of IP and capital flow would require changes in the restricting regulations.

Incremental Suggestions

No matter what conclusions you draw from this article, any follow-up will require increased transparency. U.S. Senate bill S.506 introduced in March 2009 by Senator Carl Levin (Dem., Michigan) “To restrict the use of taxhavens…” includes measures to increase access to corporate data of companies that set up taxhavens and to the information their advisers provide. Its primary goal is to tax CFHs as if they were domestic corporations. It is unclear if the bill will become law, since confounding arguments can be raised about its effects. The role of IP and jobs is not addressed in the bill, and unless the public is well-informed, meaningful reforms will have difficulty gaining traction.

Without transparency one cannot quantitatively assess the relationships between IP offshoring and jobs offshoring. While it is clear that there is initial dependency, long-term effects are only imagined. Tax schemes clearly create an imbalance of actual tax rates being paid by small- versus large-business innovators.

With more information in hand, scientists and researchers in industry might try to influence potentially worrisome corporate policies. While employees have few (if any) legal rights to determine corporate directions, they may well have expectations about their employers’ behavior. A corporation may listen, since the motivation of its work force is a valuable asset. Corporate leaders might not have considered the long-term effect of schemes they themselves set in place to minimize taxes. However, these leaders are also under pressure to compete, nationally and internationally.1 It has been suggested that international initiatives are needed to level the corporate-taxation playing field.

Change the Flow

A radical solution to problems created by tax-avoidance schemes is to do away with corporate taxation altogether and compensate the U.S. government for the loss of tax income by fully taxing dividends and capital gains, that is, by imposing taxes only when corporate profits flow to the individuals consuming the benefits. The net effect on total tax revenues in the U.S. might be modest, since, in light of current tax avoidance strategies, corporations contributed as little as 8% to total U.S. tax revenue in 2004.6 Such a radical change would reduce the motivation for many distortions now seen in corporate behavior. Small businesses unable to pay the fees and manage the complexity of taxhavens would no longer be disadvantaged.

Getting effective international agreement seems futile, and no single government can adequately regulate multinational enterprises. Unilateral alternatives to deal with countries that shelter tax-shy corporations are infeasible as well, even without consideration of the role of IP and malfeasance.9 An underlying problem is that the law equates a corporation with a person, allowing confusing arguments, even though people have morals, motivations, and obligations that differ greatly from the obligations of corporations. This equivalence is seen as a philosophical mistake by some.13 For instance, humans cannot, without creating corporate entities, split themselves into multiple clones that take advantage of differing taxation regimes. In practice, not taxing corporations is such a radical change, affecting so many other aspects of the economy and public perception, that it is as unlikely as many other proposed tax reforms.8

Why Care?

The knowledge-based society brought forth a revolution of human productivity in the past 50 years, moving well

$7.6 billion, a 168% increase from the same period in 2009.23 The eight largest companies have $300 billion available in taxhavens. Cisco Systems alone reported it had $30 billion available in its tax shelters and expects to keep spending on foreign acquisitions. Such investment will create jobs all over the world, primarily in semi-taxhavens.

More support for CS education was a major emphasis of the ACM report, but where will the funding come from? The taxes on Cisco’s available funds, were they to be used for investment in the U.S., exceed total funding for the National Science Foundation and Defense Advanced Research Projects Agency. The IP, if it remains offshore, would quickly refill Cisco’s coffers there. Discussions concerning

Washington D.C., May 2007; http://www.bls.gov/oes/
6. Clausing, K. The American Jobs Creation Act of 2004, Creating Jobs for Accountants and Lawyers. Urban-Brookings Tax Policy Center, Report 311122, Washington, D.C., Dec. 2004.
7. Compustat. Financial Results of Companies in SIC Code 7372 for 1999 to 2002; http://www.compustat.com
8. Cray, C. Obama’s tax haven reform: Chump change. CorpWatch (June 15, 2009); http://www.corpwatch.org/article.php?id=15386.
9. Dagan, T. The tax treaties myth. NYU Journal of International Law and Politics 32 article 379181 (Oct. 2000), 939–996.
10. Damodaran, A. Dealing with Intangibles: Valuing Brand Names, Flexibility and Patents Working Paper. Stern School of Business Reports, New York University, New York, Jan. 2006; http://pages.stern.nyu.edu/~adamodar/
11. Deutsche Stiftung für Entwicklungsländer. Bitter losses. English version (Aug. 7, 2010); http://www.inwent.org/ez/articles/178169/index.en.shtml
12. Gay, J. Free Software, Free Society: In Selected Essays of Richard M. Stallman. GNU Press, Boston, MA, 2002.
13. Gore, A. The Assault on Reason. Bloomsbury Publications, London, 2008.
14. Government Accountability Office. International Taxation: Large U.S. Corporations and Federal Contractors with Subsidiaries in Jurisdictions Listed as Tax Havens or Financial Privacy Jurisdictions. U.S.
Government Accountability Office Report GAO-09-
beyond the industrial revolution that future education, leading to growth of 157, Washington, D.C., Dec. 2008; http://www.gao.gov/
started more than a century earlier, knowledge-based industries, job cre- new.items/d09157.pdf
15. Grossman, G.M. and Helpman, E. Innovation and
and globalization is a means to dis- ation, protection of retirement ben- Growth in the Global Economy, Seventh Edition. MIT
tribute its benefits. But the growth efits, and the required infrastructure press, Cambridge, MA, 2001.
16. Johnston, D.C. Perfectly Legal. Portfolio Publishers,
of assets in taxhavens deprives work- for growing businesses are futile if the New York, 2003.
17. Kiesewetter-Koebinger, S. Programmers’ capital. IEEE
ers worldwide of reasonably expected creators of the required intellectual Computer 43, 2 (Feb. 2010), 106–108.
benefits. These hidden assets have resources are uninformed about the 18. Lev, B. Intangibles, Management, Measurement and
Reporting. Brookings Institution Press, Washington,
grown to be a multiple of annual in- interaction of IP and capital alloca- D.C., 2001.
dustry revenue, exceeding the assets tion. Initiating effective action is more 19. Levey, M.M., Wrappe, S.C., and Chung, K. Transfer
Pricing Rules and Compliance Handbook. CCH Wolters
held in the countries where the IP is difficult still. Kluwer Publications, Chicago, IL, 2006.
being created. The presence of signifi- 20. Makhlouf, G. List of Uncooperative Tax Havens.
OECD’s Committee on Fiscal Affairs. Organisation
cant IP rights in taxhavens provides Acknowledgments for Economic Co-operation and Development, Paris,
global corporations great flexibility to This exposition was motivated by the France, Apr. 19, 2002.
21. Parr, R. Royalty Rates for Licensing Intellectual
invest capital anywhere, avoiding in- Rebooting Computing meeting in Sili- Property. John Wiley & Sons, Inc., New York, 2007.
come due to IP from being taxed any- con Valley in January 2009 (http://www. 22. Rahn, R.W. In defense of tax havens. The Wall Street
Journal (Mar. 17, 2009).
where. The combination of reduced rebootingcomputing.org/content/sum- 23. Rashkin, M. Practical Guide to Research and
support for education, government mit) and benefited from discussions Development Tax Incentives: Federal, State, and
Foreign, Second Edition. CCH Wolters Kluwer
research funding, and physical infra- on the topic with Peter J. Denning Publications, Chicago, IL, 2007.
structure, along with the increased (the organizer), Joaquin Miller, Erich 24. Saitto, S. U.S. tech firms shop abroad to avoid taxes.
Bloomberg Businessweek (Sept. 6, 2010), 31–32.
motivation to start new initiatives in Neuhold, Claudia Newbold, Shaibal 25. Smith, G. and Parr, R. Intellectual Property, Valuation,
Exploitation, and Infringement Damages, John Wiley
semi-taxhavens and the imbalance of Roy, Stephen Smoliar, Shirley Tessler, & Sons, Inc., New York, 2005.
small businesses versus global corpo- Andy van Dam, Moshe Y. Vardi, and 26. Tambe, P.B. and Hitt, L.M. How offshoring affects IT
workers. Commun. ACM 53, 10 (Oct. 2010), 62–70.
rations, is bound to affect the future of unknown, patient Communications 27. Tichon, N. Tax Shell Game: The Taxpayer Cost of
enterprises in countries that initiated reviewers. Any remaining lack of clar- Offshore Corporate Havens. U.S. Public Interest
Research Group, Apr. 2009; http://www.uspirg.org/
high-tech industries, though the rate ity is due to my failure in presenting a news-releases/tax-and-budget/tax-and-budget-news/
and final magnitude is unpredictable novel topic adequately to a technical washington-d.c.-taxpayers-footing-a-100-billion-bill-
for-tax-dodgers
today. Better-educated scientists will audience. Any errors and opinions are 28. Weissler, R. Advanced Pricing Agreement Program
be less affected and feel the effects also solely my responsibility. Training on Cost Sharing Buy-In Payments. Transfer
Pricing Report 533. IRS, Washington D.C., Feb. 2002.
more slowly.26 But any industry re- 29. Wiederhold, G., Tessler, S., Gupta, A., and Smith,
quires a mix of related competencies. D.B. The valuation of technology-based intellectual
References property in offshoring decisions. Commun. Assoc.
It took 50 years for the U.S. car indus- 1. Aspray, W., Mayadas, F., and Vardi, M.Y., Eds. Infor. Syst. 24, 31 (June 2009), 523–544.
try to be reduced to its current state. Globalization and Offshoring of Software. A Report of 30. Wiederhold, G. Determining software investment lag.
the ACM Job Migration Task Force. ACM, New York, Journal of Universal Computer Science 14, 22 (2008).
The velocity of change when intangi- 2006. 31. Wiederhold, G. What is your software worth?
bles, instead of tangible capabilities, 2. Babcock, H. Appraisal Principles and Procedures. Commun. ACM 49, 9 (Sept. 2006), 65–75.
American Society of Appraisers, Herndon, VA, 1994; 32. Wilson, S. Is this the end for treasure islands?
are involved may well be greater. first edition 1968. MoneyWeek (Mar. 13, 2009).
The large amount of capital ac- 3. Becker, B. Cost sharing buy-ins. Chapter in Transfer
Pricing Handbook, Third Edition, R. Feinschreiber, Ed.
cumulated in taxhavens encourages John Wiley & Sons, Inc., New York, 2002, A3–A16. Gio Wiederhold (gio@cs.stanford.edu) is Professor
4. Boehm, B. Software Engineering Economics. Prentice- (Emeritus) of Computer Science, Medicine, and Electrical
ever-greater investment in foreign Hall, Upper Saddle River, NJ, 1981. Engineering at Stanford University, Stanford, CA; http://
companies. As of August 2010, such 5. Bureau of Labor Statistics. National Employment infolab.stanford.edu/people/gio.html
and Wage Data Survey. Bureau of Labor Statistics,
investment was reported to amount to © 2011 ACM 0001-0782/11/0100 $10.00
doi:10.1145/1866739.1866757
Using Simple Abstraction to Reinvent Computing for Parallelism
The recent dramatic shift from single-processor computer systems to many-processor parallel ones requires reinventing much of computer science to build and program the new systems. CS urgently requires convergence to a robust parallel general-purpose platform providing good performance
abstraction, where unit-time instructions execute one at a time.
The rudimentary parallel abstraction I propose here is that indefinitely many instructions available for concurrent execution execute immediately, dubbing the abstraction Immediate Concurrent Execution. A consequence of ICE is a step-by-step (inductive) explication of the instructions available next for concurrent execution. The number of instructions in each step is independent of the number of processors, which are not even mentioned. The explication falls back on the serial abstraction in the event of one instruction per step. The right side of Figure 1 outlines parallel execution as implied by the ICE abstraction. At each time unit, any number of unit-time instructions that can execute concurrently do so, followed by yet another time unit in which the same execution pattern repeats, and so on, as long as the program is running.

Figure 1. Serial execution based on the serial ISE abstraction vs. parallel execution based on the parallel ICE abstraction. [Two plots of number of operations against time. Left: serial doctrine (immediate serial execution), time = number of operations. Right: natural (parallel) algorithm (immediate concurrent execution), time << number of operations.]

How might parallelism be advantageous for performance? The PRAM answer is that in a serial program the number of time units, or “depth,” is the same as the algorithm’s total number of operations, or “work,” while in the parallel program the number of time units can be much lower. For a parallel program, the objective is that its work does not much exceed that of its serial counterpart for the same problem, and its depth is much lower than its work. (Later in the article, I note the straightforward connection between ICE and the rich PRAM algorithmic theory and that ICE is nothing more than a subset of the work-depth model.) But how would a system designer go about building a computer system that realizes the promise of ease of programming and strong performance?
Outlining a comprehensive solution, I discuss basic tension between the PRAM abstraction and hardware implementation and a workflow that goes through ICE and PRAM-related abstractions for programming the XMT computer architecture.
Some many-core architectures are likely to become mainstream, meaning they must be easy enough to program by every CS major and graduate. I am not aware of other many-core architectures with PRAM-like abstraction. Allowing programmers to view a computer operation as a PRAM would make it easy to program [10], hence this article should interest all such majors and graduates.
Until 2004, standard (desktop) computers comprised a single processor core. Since 2005 when multi-core computers became the standard, CS has appeared to be on track with a prediction [5] of 100+-core computers by the mid-2010s. Transition from serial (single-core) computing to parallel (many-core) computing mandates the reinvention of the very heart of CS, as these highly parallel computers must be built and programmed differently from the single-core machines that dominated standard computer systems since the inception of the field almost 70 years ago. By 2003, the clock rate of a high-end desktop processor had reached 4GHz, but processor clock rates have improved only barely, if at all, since then; the industry simply did not find a way to continue improving clock rates within an acceptable power budget [5]. Fortunately, silicon technology improvements (such as miniaturization) allow the amount of logic a computer chip can contain to keep growing, doubling every 18 to 24 months per Gordon Moore’s 1965 prediction. Computers with an increasing number of cores are now expected but without significant improvement in clock rates. Exploiting the cores in parallel for faster completion of a computing task is today the only way to improve performance of individual tasks from one generation of computers to the next.
Unfortunately, chipmakers are designing multi-core processors most programmers can’t handle [19], a problem of broad interest. Software production has become a key component of the manufacturing sector of the economy. Mainstream machines most programmers can’t handle cause significant decline in productivity of manufacturing, a concern for the overall economy. Andy Grove, former Chairman of the Board of Intel Corp., said in the 1990s that the software spiral (the cyclic process of hardware improvements leading to software improvements leading back to hardware improvements) was an engine of sustained growth for IT for decades to come. A stable application-software base that could be reused and enhanced from one hardware generation to the next was available for exploitation. Better performance was assured with each new generation, if only the hardware could run serial code faster. Alas, the software spiral today is broken [21]. No broad parallel-computing application software base exists for which hardware vendors are committed to improving performance. And no agreed-upon parallel architecture allows application programmers to build such a base for the foreseeable future. Instating a new software spiral could indeed be a killer app for general-purpose many-core computing; application software developers would put it to good use for specific applications, and more consumers worldwide would want to buy new machines.
This robust market for many-core-based machines and applications leads to the following case for government support: Foremost among today’s challenges is many-core convergence, seeking timely convergence to
a robust many-core platform coupled with a new many-core software spiral to serve the world of computing for years to come. A software spiral is basically an infrastructure for the economy. Since advancing infrastructures generally depends on government funding, designating software-spiral rebirth a killer app also motivates funding agencies and major vendors to support the work. The impact on manufacturing productivity could further motivate them.

Programmer Workflow
ICE requires the lowest level of cognition from the programmer relative to all current parallel programming models. Other approaches require additional steps (such as decomposition [10]). In CS theory, the speedup provided by parallelism is measured as work divided by depth; reducing the advantage of ICE/PRAM to practice is a different matter.
The reduction to practice I have led relies on the programmer’s workflow, as outlined in the right side of Figure 2. Later, I briefly cover the parallel-algorithms stage. The step-by-step PRAM explication, or “data-parallel” instructions, represents a traditional tightly synchronous outlook on parallelism. Unfortunately, tight step-by-step synchrony is not a good match with technology, including its power constraints.
To appreciate the difficulty of implementing step-by-step synchrony in hardware, consider two examples: Memories based on long tightly synchronous pipelines of the type seen in Cray vector machines have long been out of favor among architects of high-performance computing; and processing memory requests takes from one to 400 clock cycles. Hardware must be made as flexible as possible to advance without unnecessary waiting for concurrent memory requests.
To underscore the importance of the bridge the XMT approach builds from the tightly synchronous PRAM to relaxed synchrony implementation, note three known limitations with power consumption of multi-core architectures: high power consumption of the wide communication buses needed to implement cache coherence; basic nm complexity of cache-coherence traffic (given n cores and m invalidations) and implied toll on inter-core bandwidth; and high power consumption needed for a tightly synchronous implementation in silicon in these designs. The XMT approach addresses all three by avoiding hardware-supported cache-coherence altogether and by significantly relaxing synchrony.
Workflow is important, as it guides the human-to-machine process of programming; see Figure 2 for two workflows. The non-XMT hardware implementation on the left side of the figure may require revisiting and changing the algorithm to fit bandwidth constraints among threads of the computation, a programming process that doesn’t always yield an acceptable outcome. However, the XMT hardware allows a workflow (right side of the figure) that requires tuning only for performance; revisiting and possibly changing the algorithm is generally not needed. An optimizing compiler should be able to do its own tuning without programmer intervention, as in serial computing.

Figure 2. Right column is a workflow from an ICE abstraction of an algorithm to implementation; left column may never terminate. [Left column: ICE, parallel algorithm, parallel program, then the test “Insufficient inter-thread bandwidth?”; yes leads to “Rethink algorithm: take better advantage of cache,” no leads to hardware. Right column: ICE, parallel algorithm, XMT program, tune, XMT hardware.]

Most of the programming effort in traditional parallel programming (domain partitioning, load balancing) is generally of lesser importance for exploiting on-chip parallelism, where parallelism overhead can be kept low and processor-to-memory bandwidth high. This observation drove development of the XMT programming model and its implementation by my research team. XMT is intended to provide a simpler parallel programming model that efficiently exploits on-chip parallelism through multiple design elements.
The XMT architecture uses a high-bandwidth low-latency on-chip interconnection network to provide more uniform memory-access latencies. Other specialized XMT hardware primitives allow concurrent instantiation of as many threads as the number of available processors, a count that can reach into the thousands. Specifically, XMT can perform two main operations: forward (instantly) program instructions to all processors in the time required to forward the instructions (for one thread) to just one processor; and reallocate any number of processors that complete their jobs at the same time to new jobs (along with their instructions) in the time required to reallocate one processor. The high-bandwidth, low-latency interconnection network and low-overhead creation of many threads allow efficient support for the fine-grain parallelism used to hide memory latencies and a programming model for which locality is less an issue than in designs with less bandwidth. These mechanisms support dynamic load balancing, relieving programmers from having to directly assign work to processors. The programming model is simplified further by letting threads run to completion without synchronization (no busy-waits) and synchronizing access to shared data with prefix-sum (fetch-and-add type) instructions. These features result in a flexible programming style that accommodates the ICE abstraction and encourages program development for a range of applications.
The reinvention of computing for parallelism also requires pulling together a number of technical communities. My 2009 paper [26] sought to build a bridge to other architectures by casting the abstraction-centric vision of this article as a possible module in them, identifying a limited number of capabilities the module provides and suggesting a preferred embodiment of these capabilities using concrete “hardware hooks.” If it is possible to augment a computer architecture through them (with hardware hooks or other means), the ICE abstraction and the programmer’s workflow, in line with this article, can be supported. The only significant obstacle in today’s multi-core architectures is their large cache-coherent local caches. Their limited scalability with respect to power gives vendors more reasons beyond an easier programming model to let go of this obstacle.
PRAM parallel algorithmic approach. The parallel random-access machine/model (PRAM) virtual model of computation is a generalization of the random-access machine (RAM) model [9]. RAM, the basic serial model behind standard programming languages, assumes any memory access or any operation (logic or arithmetic) takes unit-time (serial abstraction). The formal PRAM model assumes a certain number, say, p of processors, each able to concurrently access any location of a shared memory in the same time as a single access. PRAM has several submodels that differ by assumed outcome of concurrent access to the same memory location for either read or write purposes. Here, I note only one of them, the Arbitrary Concurrent-Read Concurrent-Write (CRCW) PRAM, which allows concurrent accesses to the same memory location for reads or writes; reads

Figure 3. Serial and parallel execution modes. [Timeline: serial mode, spawn, parallel mode, join, serial mode, spawn, parallel mode, join, serial mode.]

as a sequence of rounds and, for each round, up to p processors execute concurrently. The performance objective is to minimize the number of rounds. The PRAM parallel-algorithmic approach is well-known and has never been seriously challenged by any other parallel-algorithmic approach in terms of ease of thinking or wealth of knowledgebase. However, PRAM is also a strict formal model. A PRAM algorithm must therefore prescribe for each and every one of its p processors the instruction the processor executes at each time unit in a detailed computer-program-like fashion that can be quite demanding. The PRAM-algorithms theory mitigates this instruction-allocation scheme through the work-depth (WD) methodology. This methodology (due to Shiloach and Vishkin [20]) suggests a simpler way to allocate instructions: A parallel algorithm can be prescribed as a sequence of rounds, and for each round, any number of operations can be executed concurrently, assuming unlimited hardware. The total number of operations is called “work,” and the number of rounds is called “depth,” as in the ICE abstraction. The first performance objective is to reduce work, and the immediate second one is to reduce depth. The methodology of restricting attention only to work and depth has been used as the main framework for the presentation of PRAM algorithms [16, 17] and is in my class notes on the XMT home page http://www.umiacs.umd.edu/users/vishkin/XMT/. Deriving a full PRAM description from a WD description is easy. For concreteness, I demonstrate WD descriptions on two examples, the first concerning parallelism, the second concerning the WD methodology (see the sidebar “Parallel Algorithms”).
The programmer’s workflow starts with the easy-to-understand ICE abstraction and ends with the XMT system, providing a practical implementation of the vast PRAM algorithmic knowledge base.
XMT programming model. The programming model behind the XMT framework is an arbitrary concurrent read, concurrent write single program multiple data, or CRCW SPMD, programming model with two executing modes: serial and parallel. The two instructions, spawn and join, specify the beginning and end, respectively, of a parallel section (see Figure 3). An arbitrary number of virtual threads, initiated by a spawn and terminated by a join, share the same code. The workflow relies on the spawn command to extend the ICE abstraction from the WD methodology to XMT programming. As with the respective PRAM model, the arbitrary CRCW aspect dictates that concurrent writes to the same memory location result in an arbitrary write committing. No assumption needs to be made by the programmer beforehand about which one will succeed. An algorithm designed with this property in mind permits each thread to progress at its own speed, from initiating spawn to terminating join, without waiting for other threads; no thread “busy-waits” for another thread. The implied “independence of order semantics” allows XMT to have a shared memory with a relatively weak coherence model. An advantage of this easier-to-implement SPMD model is that it is PRAM-like. It also incorporates the prefix-sum statement operating on a base variable, B, and an increment variable, R. The result of a prefix-sum is that B gets the value B + R, while R gets the initial value of B, a result called “atomic” that’s similar to fetch-and-increment in Gottlieb et al. [12].
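The prefix-sum semantics described for the XMT programming model (B gets B + R, R gets the initial value of B, atomically) can be illustrated with a small software emulation. This is a minimal sketch in Python, not XMTC and not a description of the XMT hardware mechanism: the class `PrefixSumBase`, its method `ps`, and the helper `run_threads` are hypothetical names introduced here, and a lock stands in for whatever makes the operation atomic on real hardware.

```python
import threading


class PrefixSumBase:
    """Hypothetical stand-in for the base variable B of a prefix-sum."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()  # emulates atomicity in software (assumption)

    def ps(self, r):
        """Atomically set B to B + r and return the initial value of B."""
        with self._lock:
            initial = self.value
            self.value = initial + r
            return initial


def run_threads(num_threads, base):
    """Each thread performs one prefix-sum with increment 1 on the same base."""
    returned = [None] * num_threads

    def body(tid):
        returned[tid] = base.ps(1)

    threads = [threading.Thread(target=body, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return returned


base = PrefixSumBase(0)
returned = run_threads(8, base)
```

Whatever order the threads happen to run in, the eight returned values are the integers 0 through 7 in some arbitrary assignment, and the base ends at 8. No thread waits for a particular other thread; the only ordering is the one the prefix-sum itself imposes.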
The primitive is especially useful when several threads perform a prefix-sum simultaneously against a common base, because multiple prefix-sum operations can be combined by the hardware to form a very fast multi-operand prefix-sum operation. Because each prefix-sum is atomic, each thread returns a different R value. This way, the parallel prefix-sum command can be used to implement efficient and scalable inter-thread synchronization by arbitrating an ordering among the threads.
The XMTC high-level language implements the programming model. XMTC is an extension of standard C, augmenting C with a small number of commands (such as spawn, join, and prefix-sum). Each parallel region is delineated by spawn and join statements, and synchronization is achieved through the prefix-sum and join commands. Every thread executing the parallel code is assigned a unique thread ID, designated $. The spawn statement takes as arguments the lowest ID and highest ID of the threads to be spawned. For the hardware implementation (discussed later), XMTC threads can be as short as eight to 10 machine instructions that are not difficult to get from PRAM algorithms. Programmers from high school to graduate school are pleasantly surprised by the flexibility of translating PRAM algorithms to XMTC multi-threaded programs. The ability to code the whole merging algorithm using a single spawn-join pair is one such surprise (see the sidebar “Merging with a Single Spawn-Join”).
To demonstrate simple code, consider two code examples:
The first is a small XMTC program for the parallel exchange algorithm discussed in the “Parallel Algorithms” sidebar:

    spawn(0, n-1) {
        var x
        x := A($);
        A($) := B($);
        B($) := x
    }

The program spawns a concurrent thread for each of the depth-3 serial-exchange iterations using a local variable x. Note that the join command is implied by the right brace at the end of the program.
The second assumes an array of n integers A. The programmer wishes to “compact” the array by copying all non-zero values to another array, B, in an arbitrary order. The XMTC code is:

    psBaseReg x = 0;
    spawn(0, n-1) {
        int e;
        e = 1;
        if (A[$] != 0) {
            ps(e, x);      /* atomically: e gets the old x, x gets x + 1 */
            B[e] = A[$];
        }
    }

It declares a variable x as the base value to be used in a prefix-sum command (ps in XMTC), initializing it to 0. It then spawns a thread for each of the n elements in A. A local thread variable e is initialized to 1. If the element of the thread is non-zero, the thread performs a prefix-sum to get a unique index into B where it can place its value.
Other XMTC commands. Prefix-sum-to-memory (psm) is another prefix-sum command, the base of which is any location in memory. While the increment of ps must be 0 or 1, the increment of psm is not limited, though its implementation is less efficient. Single Spawn (sspawn) is a command that can spawn an extra thread and be nested. A nested spawn command in XMTC code must be replaced (by programmer or compiler) by sspawn commands. The XMTC commands are described in the programmer’s manual included in the software release on the XMT Web pages.
Tuning XMT programs for performance. My discussion here of performance tuning would be incomplete without a description of salient features of the XMT architecture and hardware. The XMT on-chip general-purpose computer architecture is aimed at the classic goal of reducing single-task completion time. The WD methodology gives algorithm designers the ability to express all the parallelism they observe. XMTC programming further permits expressing this virtual parallelism by letting programmers express as many concurrent threads as they wish. The XMT processor must now provide an effective way to map this virtual parallelism onto the hardware. The XMT architecture provides dynamic allocation of the XMTC threads onto the hardware for better load balancing. Since XMTC threads can be short, the XMT hardware must directly manage XMT threads to keep overhead low. In particular, an XMT program looks like a single thread to the operating system (see the sidebar “The XMT Processor” for an overview of XMT hardware).
The main thing performance programmers must know in order to tune the performance of their XMT programs is that a ready-to-run version of an XMT program depends on several parameters: the length of the (longest) sequence of roundtrips to memory (LSRTM); queuing delay to the same shared memory location (known as queue-read queue-write, or QRQW [11]); and work and depth. Their optimization is a responsibility shared subtly by the architecture, the compiler, and the programmer/algorithm designer. See Vishkin et al. [27] for a demonstration of tuning XMTC code for performance by accounting for LSRTM. As an example, it improves XMT hardware performance on the problem of summing n numbers.
Execution can differ from the literal XMTC code in order to keep the size of working space under control or otherwise improve performance. For example, compiler and runtime methods could perform this modification by clustering virtual threads offline or online and prioritize execution of nested spawns using known heuristics based on a mix of depth-first and breadth-first searches.
Commitments to silicon of XMT by my research team at the University of Maryland include a 64-processor, 75MHz computer based on field-programmable gate array (FPGA) technology developed by Wen [28] and 64-processor ASIC 10mm X 10mm chip using IBM’s 90nm technology developed together by Balkan, Horak, Keceli, and Wen (see Figure 4). Tzannes and Caragea (guided by Barua and me) have also developed a basic yet stable compiler, and Keceli has developed a cycle-accurate simulator of XMT. Both are available through the XMT software release on the XMT Web pages.
Easy to build. An individual graduate student with no prior design experience completed the XMT hardware description (in Verilog) in just over two years (2005–2007). XMT is also silicon-efficient. The ASIC design by the XMT research team at the University of Maryland shows that a 64-processor XMT needs the same silicon area as a (single) current commodity core. The XMT approach goes after any type of application parallelism regardless of how much parallelism the application requires, the regularity of this parallelism, or the parallelism’s grain size, and is amenable to standard multiprogramming where the hardware supports several concurrent operating-system threads.
The XMT team has demonstrated good XMT performance, independent software engineers have demonstrated XMT programmability (see Hochstein et al. [14]), and independent education professionals have demonstrated
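The tuning discussion above lists work and depth among the parameters of an XMT program and names summing n numbers as a test problem. As a rough illustration of how the two quantities are counted in a work-depth description, here is a minimal Python sketch of a balanced-tree summation. It is an assumption-laden toy, not the tuned XMTC code of Vishkin et al. [27]: the function `tree_sum` is a hypothetical name, the rounds are executed serially, and LSRTM and queuing effects are ignored entirely.

```python
def tree_sum(values):
    """Sum values in rounds; additions within one round are mutually independent."""
    level = list(values)
    work = 0   # total number of additions performed
    depth = 0  # number of rounds
    while len(level) > 1:
        # Pair up neighbours; on a parallel machine each pair could be one thread.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        work += len(nxt)
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # unpaired last element carries over unchanged
        level = nxt
        depth += 1
    total = level[0] if level else 0
    return total, work, depth


total, work, depth = tree_sum(range(1, 9))  # the eight numbers 1..8
```

For eight inputs the sketch performs seven additions in three rounds, so work matches the serial algorithm while depth is far smaller than work for large n, which is the objective stated earlier for a parallel program.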
january 2011 | vol. 54 | no. 1 | communications of the acm 81
contributed articles
[Figure: XMT block diagram. Labels: Master TCU (Functional Units and Register File, Private L1 D-Cache, Private L1 I-Cache); Instruction Broadcast; TCU 0; FU interconnection network; Shared Functional Units FU 0, FU 1, ..., FU p; PS Unit; LSU with Hashing Function; Cluster-Memory Interconnection Network; Shared Memory Modules MM 0, MM 1, ..., MM M, each with L1 Cache and L2 Cache.]

executes XMT instructions (such as spawn and join). Typical program execution flow, as in Figure 3, can also be extended through nesting of sspawn commands. The MTCU uses the following XMT extension to the standard von Neumann apparatus of the program counters and stored program: Upon encountering a spawn command, the MTCU broadcasts the instructions in the parallel section starting with that spawn command and ending with a join command on a bus connecting to all TCU clusters. The largest ID number of a thread the current spawn command must execute, Y, is also broadcast to all TCUs. The ID (index) of the largest executing thread is stored in a global register X. In parallel mode, a TCU executes one thread at a time.
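As an illustration only, the array-compaction example built on spawn and the prefix-sum command ps can be emulated sequentially in Python. This is a minimal sketch under stated assumptions, not XMTC and not part of the XMT toolchain: the function name compact and the shuffled loop standing in for concurrently running threads are hypothetical, and ps(e, x) is assumed to act as an atomic fetch-and-add (e receives the old value of x, and x grows by the old value of e).

```python
import random


def compact(a, seed=0):
    """Copy the non-zero values of a into a new list, in arbitrary order."""
    n = len(a)
    b = [0] * n
    x = 0  # stands in for the prefix-sum base register (psBaseReg x)

    # Threads 0..n-1 may complete in any order; a shuffle emulates that.
    order = list(range(n))
    random.Random(seed).shuffle(order)

    for tid in order:  # tid plays the role of $ in the XMTC code
        if a[tid] != 0:
            e = 1
            # Assumed ps(e, x) semantics: e gets the old x, x grows by old e.
            e, x = x, x + e
            b[e] = a[tid]

    return b[:x]  # x ends up equal to the number of non-zero values


result = compact([5, 0, 7, 0, 0, 3])
```

Because each thread receives a distinct index from the shared counter, the output positions never collide, which is why the order of the compacted values is arbitrary while their multiset is not.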
an application-driven approach to reinventing computing for parallelism.

Conclusion

The vertical integration of a parallel-processing system, compiler, programming, and algorithms proposed here through the XMT framework with the ICE/PRAM abstraction as its front-end is notable for its relative simplicity. ICE is a newly defined feature that has not appeared in prior research, including my own, and is more rudimentary than prior parallel computing concepts. Rudimentary concepts are the basis for the fundamental development of any field. ICE can be viewed as an axiom that builds on mathematical induction, one of the more rudimentary concepts in mathematics. The suggestion here of using a simple abstraction as the guiding principle for reinventing computing for parallelism also appears to be new. Considerable evidence suggests it can be done (see the sidebar “Eye-of-a-Needle Aphorism”).

The following comparison with a chapter on multithreading algorithms in the 2009 textbook Introduction to Algorithms by Cormen et al.9 helps clarify some of the article’s contributions. The 1990 first edition of Cormen et al.9 included a chapter on PRAM algorithms emphasizing the role of work-depth design and analysis; the 2009 chapter9 likewise emphasized work-depth analysis. However, to match current commercial hardware, the 2009 chapter turned to a variant of dynamic multithreading (in lieu of work-depth design) in which the main primitive was similar to the XMT sspawn command (discussed here). One thread was able to generate only one more thread at a time; these two threads would then generate one more thread each, and so on, instead of freeing the programmer to directly design for the work-depth analysis that follows (per the same 2009 chapter).

Cormen et al.’s9 dynamic multithreading should encourage hardware

general-purpose computer architecture is aimed at the classic goal of reducing single-task completion time.

over, the ICE abstraction incorporates work-depth early in the design workflow, similar to Cormen et al.’s 1990 first edition.9

The O(log n) depth parallel merging algorithm versus the O(log² n) depth one in Cormen et al.9 demonstrated an XMT advantage over current hardware, as XMT allows a parallel algorithm for the same problem that is both faster and simpler. The XMT hardware scheduling brought the hardware performance model much closer to work-depth and allowed the XMT workflow to streamline the design with the analysis from the start.

Several features of the serial paradigm made it a success, including a simple abstraction at the heart of the “contract” between programmers and builders, the software spiral, ease of programming, ease of teaching, and backward compatibility on serial code and application programming. The only feature that XMT, as in other multi-core approaches, does not provide is speedups for serial code. The ICE/PRAM/XMT workflow and architecture provide a viable option for the many-core era. My XMT solution should challenge and inspire others to come up with competing abstraction proposals or alternative architectures for ICE/PRAM. Consensus around an abstraction will move CS closer to convergence toward a many-core platform and putting the software spiral back on track.

The XMT workflow also gives programmers a productivity advantage. For example, I have traced several errors in student-developed XMTC programs to shortcuts the students took around the ICE algorithms. Overall, improved understanding of programmer productivity, a traditionally difficult issue in parallel computing, must be a top priority for architecture research. To the extent possible, evaluation of productivity should be on par with performance and power. For starters, productivity benchmarks must be developed.
Ease of programming, or programmability, is a necessary condition for the success of any many-core platform, and teachability is a necessary condition for programmability and in turn for productivity. The teachability of the XMT approach has been demonstrated extensively; for example, since 2007 more than 100 students in grades K–12 have learned to program XMT, including in two magnet programs: Montgomery Blair High School, Silver Spring, MD, and Thomas Jefferson High School for Science and Technology, Alexandria, VA.22 Others are Baltimore Polytechnic High School, where 70% of the students are African American, and a summer workshop for middle-school students from underrepresented groups in Montgomery County, MD, public schools.

In the fall of 2010, I jointly conducted another experiment, this one via video teleconferencing with Professor David Padua of the University of Illinois, Urbana-Champaign using OpenMP and XMTC, with XMTC programming assignments run on the XMT 64-processor FPGA machine. Our hope was to produce a meaningful comparison of programming development time from the 30 participating Illinois students. The topics and problems covered in the PRAM/XMT part of the course were significantly more advanced than OpenMP alone. Having sought to demonstrate the importance of teachability from middle school on up, I strongly recommend that it become a standard benchmark for evaluating many-core hardware platforms.

Blake et al.4 reported that after analyzing current desktop/laptop applications for which the goal was better performance, the applications tend to comprise many threads, though few of them are used concurrently; consequently, the applications fail to translate the increasing thread-level parallelism in hardware to performance gains. This problem is not surprising given that most programmers can’t handle multi-core microprocessors. In contrast, guided by the simple ICE abstraction and by the rich PRAM knowledgebase to find parallelism, XMT programmers are able to represent that parallelism using a type of threading the XMT hardware is engineered to exploit for performance.

For more, see the XMT home page at the University of Maryland http://www.umiacs.umd.edu/users/vishkin/XMT/. The XMT software environment release is available by free download there and from sourceforge.net at http://sourceforge.net/projects/xmtc/, along with extensive documentation. A 2010 release of the XMTC compiler and cycle-accurate simulator of XMT can also be downloaded to any standard desktop computing platform. Teaching materials covering a University of Maryland class-tested programming methodology in which even college freshmen and high school students are taught only parallel algorithms are also available from the XMT Web pages.

Acknowledgment

This work is supported by the National Science Foundation under grant 0325393.

References

1. Adve, S. et al. Parallel Computing Research at Illinois: The UPCRC Agenda. White Paper. University of Illinois, Champaign-Urbana, IL, 2008; http://www.upcrc.illinois.edu/UPCRC_Whitepaper.pdf
2. Asanovic, K. et al. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report UCB/EECS-2006-183. University of California, Berkeley, 2006; http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
3. Balkan, A., Horak, M., Qu, G., and Vishkin, U. Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing. In Proceedings of the 15th Annual IEEE Symposium on High Performance Interconnects (Stanford, CA, Aug. 22–24). IEEE Press, Los Alamitos, CA, 2007.
4. Blake, G., Dreslinski, R., Flautner, K., and Mudge, T. Evolution of thread-level parallelism in desktop applications. In Proceedings of the 37th Annual International Symposium on Computer Architecture (Saint-Malo, France, June 19–23). ACM Press, New York, 2010, 302–313.
5. Borkar, S. et al. Platform 2015: Intel Processor and Platform Evolution for the Next Decade. White Paper. Intel, Santa Clara, CA, 2005; http://epic.hpi.uni-potsdam.de/pub/Home/TrendsAndConceptsII2010/HW_Trends_borkar_2015.pdf
6. Caragea, G., Tzannes, A., Keceli, F., Barua, R., and Vishkin, U. Resource-aware compiler prefetching for many-cores. In Proceedings of the Ninth International Symposium on Parallel and Distributed Computing (Istanbul, Turkey, July 7–9). IEEE Press, Los Alamitos, CA, 2010, 133–140.
7. Caragea, G., Keceli, F., Tzannes, A., and Vishkin, U. General-purpose vs. GPU: Comparison of many-cores on irregular workloads. In Proceedings of the Second Usenix Workshop on Hot Topics in Parallelism (University of California, Berkeley, June 14–15). Usenix, Berkeley, CA, 2010.
8. Caragea, G., Saybasili, B., Wen, X., and Vishkin, U. Performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor. In Proceedings of the 21st ACM SPAA Symposium on Parallelism in Algorithms and Architectures (Calgary, Canada, Aug. 11–13). ACM Press, New York, 2009, 163–165.
9. Cormen, T., Leiserson, C., Rivest, R., and Stein, C. Introduction to Algorithms, Third Edition. MIT Press, Cambridge, MA, 2009.
10. Culler, D. and Singh, J. Parallel Computer Architecture: A Hardware/Software Approach. Morgan-Kaufmann, San Francisco, CA, 1999.
11. Gibbons, P., Matias, Y., and Ramachandran, V. The queue-read queue-write asynchronous PRAM model. Theoretical Computer Science 196, 1–2 (Apr. 1998), 3–29.
12. Gottlieb, A. et al. The NYU ultracomputer designing an MIMD shared-memory parallel computer. IEEE Transactions on Computers 32, 2 (Feb. 1983), 175–189.
13. Gu, P. and Vishkin, U. Case study of gate-level logic simulation on an extremely fine-grained chip multiprocessor. Journal of Embedded Computing 2, 2 (Apr. 2006), 181–190.
14. Hochstein, L., Basili, V., Vishkin, U., and Gilbert, J. A pilot study to compare programming effort for two parallel programming models. Journal of Systems and Software 81, 11 (Nov. 2008), 1920–1930.
15. Horak, M., Nowick, S., Carlberg, M., and Vishkin, U. A low-overhead asynchronous interconnection network for GALS chip multiprocessor. In Proceedings of the Fourth ACM/IEEE International Symposium on Networks-on-Chip (Grenoble, France, May 3–6). IEEE Computer Society, Washington D.C., 2010, 43–50.
16. JaJa, J. An Introduction to Parallel Algorithms. Addison-Wesley Publishing Company, Reading, MA, 1992.
17. Keller, J., Kessler, C., and Traeff, J. Practical PRAM Programming. Wiley-Interscience, New York, 2001.
18. Nuzman, J. and Vishkin, U. Circuit Architecture for Reduced-Synchrony-On-Chip Interconnect. U.S. Patent 6,768,336, 2004; http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6768336.PN.&OS=PN/6768336&RS=PN/6768336
19. Patterson, D. The trouble with multi-core: Chipmakers are busy designing microprocessors that most programmers can’t handle. IEEE Spectrum (July 2010).
20. Shiloach, Y. and Vishkin, U. An O(n² log n) parallel max-flow algorithm. Journal of Algorithms 3, 2 (Feb. 1982), 128–146.
21. Sutter, H. The free lunch is over: A fundamental shift towards concurrency in software. Dr. Dobb’s Journal 30, 3 (Mar. 2005).
22. Torbert, S., Vishkin, U., Tzur, R., and Ellison, D. Is teaching parallel algorithmic thinking to high school students possible? One teacher’s experience. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education (Milwaukee, WI, Mar. 10–13). ACM Press, New York, 2010, 290–294.
23. Tzannes, A., Caragea, G., Barua, R., and Vishkin, U. Lazy binary splitting: A run-time adaptive dynamic works-stealing scheduler. In Proceedings of the 15th ACM Symposium on Principles and Practice of Parallel Programming (Bangalore, India, Jan. 9–14). ACM Press, New York, 2010, 179–189.
24. Valiant, L. A bridging model for multi-core computing. In Proceedings of the European Symposium on Algorithms (Karlsruhe, Germany, Sept. 15–17). Lecture Notes in Computer Science 5193. Springer, Berlin, 2008, 13–28.
25. Vishkin, U. U.S. Patents 6,463,527; 6,542,918; 7,505,822; 7,523,293; 7,707,388, 2002–2010; http://patft.uspto.gov/
26. Vishkin, U. Algorithmic approach to designing an easy-to-program system: Can it lead to a hardware-enhanced programmer’s workflow add-on? In Proceedings of the 27th International Conference on Computer Design (Lake Tahoe, CA, Oct. 4–7). IEEE Computer Society, Washington D.C., 2009, 60–63.
27. Vishkin, U., Caragea, G., and Lee, B. Models for advancing PRAM and other algorithms into parallel programs for a PRAM-on-chip platform. In Handbook on Parallel Computing, S. Rajasekaran and J. Reif, Eds. Chapman and Hall/CRC Press, Boca Raton, FL, 2008, 5.1-60.
28. Wen, X. and Vishkin, U. FPGA-based prototype of a PRAM-on-chip processor. In Proceedings of the Fifth ACM Conference on Computing Frontiers (Ischia, Italy, May 5–7). ACM Press, New York, 2008, 55–66.

Uzi Vishkin (vishkin@umd.edu) is a professor in the University of Maryland Institute for Advanced Computer Studies (http://www.umiacs.umd.edu/~vishkin) and Electrical and Computer Engineering Department, College Park, MD.

© 2011 ACM 0001-0782/11/0100 $10.00
review articles
doi:10.1145/1866739.1866758

A Firm Foundation for Private Data Analysis

What does it mean to preserve privacy?

by Cynthia Dwork

This long history is a testament to the importance of the problem. Statistical databases can be of enormous social value; they are used for apportioning resources, evaluating medical therapies, understanding the spread of disease, improving economic utility, and informing us about ourselves as a species.

The data may be obtained in diverse ways. Some data, such as census, tax, and other sorts of official data, is compelled; other data is collected opportunistically, for example, from traffic on the Internet, transactions on Amazon, and search engine query logs; other data is provided altruistically, by respondents who hope that sharing their information will help others to avoid a specific misfortune, or more generally, to increase the public good. Altruistic data donors are typically promised their individual data will be kept confidential; in short, they are promised “privacy.” Similarly, medical data and legally compelled data, such as census data and tax return data, have legal privacy
mandates. In my view, ethics demand that opportunistically obtained data should be treated no differently, especially when there is no reasonable alternative to engaging in the actions that generate the data in question.

The problems remain: even if data encryption, key management, access control, and the motives of the data curator are all unimpeachable, what does it mean to preserve privacy, and how can it be accomplished?

“How” Is Hard

Let us consider a few common suggestions and some of the difficulties they can encounter.

Large Query Sets. One frequent suggestion is to disallow queries about a specific individual or small set of individuals. A well-known differencing argument demonstrates the inadequacy of the suggestion. Suppose it is known that Mr. X is in a certain medical database. Taken together, the answers to the two large queries “How many people in the database have the sickle cell trait?” and “How many people, not named X, in the database have the sickle cell trait?” yield the sickle cell status of Mr. X. The example also shows that encrypting the data, another frequent suggestion (oddly), would be of no help at all. The privacy compromise arises from correct operation of the database.

In query auditing, each query to the database is evaluated in the context of the query history to determine if a response would be disclosive; if so, then the query is refused. For example, query auditing might be used to interdict the pair of queries about sickle cell trait just described. This approach is problematic for several reasons, among them that query monitoring is computationally infeasible16 and that the refusal to respond to a query may itself be disclosive.15

We think of a database as a collection of rows, with each row containing the data of a different respondent. In subsampling a subset of the rows is chosen at random and released. Statistics can then be computed on the subsample and, if the subsample is sufficiently large, these may be representative of the dataset as a whole. If the size of the subsample is very small compared to the size of the dataset, this approach has the property that every respondent is unlikely to appear in the subsample. However, this is clearly insufficient: Suppose appearing in a subsample has terrible consequences. Then every time subsampling occurs some individual suffers horribly.

Image by Brian Greenberg
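The differencing argument about Mr. X can be made concrete with a small sketch. The Row type, the count helper, and the four-row database below are hypothetical, chosen only to show that two large counting queries, each seemingly harmless on its own, differ by exactly Mr. X's bit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    name: str
    has_trait: bool


def count(db, predicate):
    """Answer a counting query exactly, as a correctly operating database would."""
    return sum(1 for row in db if predicate(row))


# Hypothetical database; only the curator sees the rows.
db = [Row("X", True), Row("A", False), Row("B", True), Row("C", False)]

# Query 1: how many people in the database have the trait?
everyone = count(db, lambda r: r.has_trait)

# Query 2: how many people, not named X, have the trait?
all_but_x = count(db, lambda r: r.has_trait and r.name != "X")

# The difference of the two answers is Mr. X's status.
x_has_trait = (everyone - all_but_x) == 1
```

Neither query singles out an individual, and nothing here depends on how the rows are stored or encrypted, which is the point of the argument: the leak comes from correct answers.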
In input perturbation, either the data or the queries are modified before a response is generated. This broad category encompasses a generalization of subsampling, in which the curator first chooses, based on a secret, random function of the query, a subsample from the database, and then returns the result obtained by applying the query to the subsample.4 A nice feature of this approach is that repeating the same query yields the same answer, while semantically equivalent but syntactically different queries are made on essentially unrelated subsamples. However, an outlier may only be protected by the unlikelihood of being in the subsample.

In what is traditionally called randomized response, the data itself is randomized once and for all and statistics are computed from the noisy responses, taking into account the distribution on the perturbation.23 The term “randomized response” comes from the practice of having the respondents to a survey flip a coin and, based on the outcome, answering an invasive yes/no question or answering a more emotionally neutral one. In the computer science literature the choice governed by the coin flip is usually between honestly reporting one’s value and responding randomly, typically by flipping a second coin and reporting the outcome. Randomized response was devised for the setting in which the individuals do not trust the curator, so we can think of the randomized responses as simply being published. Privacy comes from the uncertainty of how to interpret a reported value. The approach becomes untenable for complex data.

Adding random noise to the output has promise, and we will return to it later; here we point out that if done naïvely this approach will fail. To see this, suppose the noise has mean zero and that fresh randomness is used in generating every response. In this case, if the same query is asked repeatedly, then the responses can be averaged, and the true answer will eventually emerge. This is disastrous: an adversarial analyst could exploit this to carry out the difference attack described above. The approach cannot be “fixed” by recording each query and providing the same response each time a query is re-issued. There are several reasons for this. For example, syntactically different queries may be semantically equivalent, and if the query language is sufficiently rich, then the equivalence problem itself is undecidable, so the curator cannot even test for this.

Problems with noise addition arise even when successive queries are completely unrelated to previous queries.5 Let us assume for simplicity that the database consists of a single (but very sensitive) bit per person, so we can think of the database as an n-bit Boolean vector $d = (d_1, \ldots, d_n)$. This is an abstraction of a setting in which the database rows are quite complex, for example, they may be medical records, but the attacker is interested in one specific field, such as HIV status. The abstracted attack consists of issuing a string of queries, each described by a subset S of the database rows. The query is asking how many 1’s are in the selected rows. Representing the query as the n-bit characteristic vector $\mathbf{S}$ of the set S, with 1’s in all the positions corresponding to rows in S and 0’s everywhere else, the true answer to the query is the inner product $A(S) = \sum_{i=1}^{n} d_i \mathbf{S}_i$. Suppose the privacy mechanism responds with A(S) + random noise. How much noise is needed in order to preserve privacy?

Since we have not yet defined privacy, let us consider the easier problem of avoiding blatant “non-privacy,” defined as follows: the system is blatantly non-private if an adversary can construct a candidate database that agrees with the real database d in, say, 99% of the entries. An easy consequence of the following theorem is that a privacy mechanism adding noise with magnitude always bounded by, say, n/401 is blatantly non-private against an adversary that can ask all $2^n$ possible queries.5 There is nothing special about 401; any number exceeding 400 would work.

Theorem 1. Let M be a mechanism that adds noise bounded by E. Then there exists an adversary that can reconstruct the database to within 4E positions.5

Blatant non-privacy with E = n/401 follows immediately from the theorem, as the reconstruction will be accurate in all but at most $4E = 4n/401 < n/100$ positions.

Proof. Let d be the true database. The adversary can attack in two phases:

1. Estimate the number of 1’s in all possible sets: Query M on all subsets $S \subseteq [n]$.
2. Rule out “distant” databases: For every candidate database $c \in \{0, 1\}^n$, if, for any $S \subseteq [n]$, $|\sum_{i \in S} c_i - M(S)| > E$, then rule out c. If c is not ruled out, then output c and halt.

Since M(S) never errs by more than E, the real database will not be ruled out, so this simple (but inefficient!) algorithm will output some database; let us call it c. We will argue that the number of positions in which c and d differ is at most 4E.

Let $I_0$ be the indices in which $d_i = 0$, that is, $I_0 = \{i \mid d_i = 0\}$. Similarly, define $I_1 = \{i \mid d_i = 1\}$. Since c was not ruled out, $|M(I_0) - \sum_{i \in I_0} c_i| \le E$. However, by assumption $|M(I_0) - \sum_{i \in I_0} d_i| \le E$. It follows from the triangle inequality that c and d differ in at most 2E positions in $I_0$; the same argument shows that they differ in at most 2E positions in $I_1$. Thus, c and d agree on all but at most 4E positions.

What if we consider more realistic bounds on the number of queries? We think of $\sqrt{n}$ as an interesting threshold on noise, for the following reason: If the database contains n people drawn uniformly at random from a population of size $N \gg n$, and the fraction of the population satisfying a given condition is p, then we expect the number of rows in the database satisfying p to be roughly $np \pm \Theta(\sqrt{n})$ by the properties of the hypergeometric distribution. That is, the sampling error is on the order of $\sqrt{n}$. We would like the noise introduced for privacy to be smaller than the sampling error, ideally $o(\sqrt{n})$. Unfortunately, noise of magnitude $o(\sqrt{n})$ is blatantly non-private against a series of $n \log^2 n$ randomly generated queries,5 no matter the distribution on the noise. Several strengthenings of this pioneering result are now known. For example, if the entries in $\mathbf{S}$ are chosen independently according to a standard normal distribution, then blatant non-privacy continues to hold even against an adversary asking only $\Theta(n)$ questions, and even if more than a fifth of the responses have arbitrarily
wild noise magnitudes, provided the other responses have noise magnitude $o(\sqrt{n})$.8

These are not just interesting mathematical exercises. We have been focusing on interactive privacy mechanisms, distinguished by the involvement of the curator in answering each query. In the noninteractive setting the curator publishes some information of arbitrary form, and the data is not used further. Research statisticians like to “look at the data,” and we have frequently been asked for a method of generating a “noisy table” that will permit highly accurate answers to be derived for computations that are not specified at the outset. The noise bounds say this is impossible: No such table can safely provide very accurate answers to too many weighted subset sum questions; otherwise the table could be used in a simulation of the interactive mechanism, and an attack could be mounted against the table. Thus, even if the analyst only requires the responses to a small number of unspecified queries, the fact that the table can be exploited to gain answers to other queries is problematic.

In the case of “Internet scale” data sets, obtaining responses to, say, $n \ge 10^8$ queries is infeasible. What happens if the curator permits only a sublinear number of questions? This inquiry led to the first algorithmic results in differential privacy, in which it was shown how to maintain privacy against a sublinear number of counting queries, that is, queries of the form “How many rows in the database satisfy property P?” by adding noise of order $o(\sqrt{n})$, less than the sampling error, to each answer.12 The cumbersome privacy guarantee, which focused on the question of what an adversary can learn about a row in the database, is now known to imply a natural and still very powerful relaxation of differential privacy, defined here.

“What” Is Hard

Newspaper horror stories about “anonymized” and “de-identified” data typically refer to noninteractive approaches in which certain kinds of information in each data record have been suppressed or altered. A famous example is AOL’s release of a set of “anonymized” search query logs. People search for many “obviously” disclosive things, such as their full names (vanity searches), their own social security numbers (to see if their numbers are publicly available on the Web, possibly with a goal of assessing the threat of identity theft), and even the combination of mother’s maiden name and social security number. AOL carefully redacted such obviously disclosive “personally identifiable information,” and each user id was replaced by a random string. However, search histories can be very idiosyncratic, and a New York Times reporter correctly traced such an “anonymized” search history to a specific resident of Georgia.

In a linkage attack, released data are linked to other databases or other sources of information. We use the term auxiliary information to capture information about the respondents other than that which is obtained through the (interactive or noninteractive) statistical database. Any priors, beliefs, or information from newspapers, labor statistics, and so on, all fall into this category.

In a notable demonstration of the power of auxiliary information, medical records of the governor of Massachusetts were identified by linking voter registration records to “anonymized” Massachusetts Group Insurance Commission (GIC) medical encounter data, which retained the birthdate, sex, and zip code of the patient.22

Despite this exemplary work, it has taken several years to fully appreciate the importance of taking auxiliary information into account in privacy-preserving data release. Sources and uses of auxiliary information are endlessly varied. As a final example, it has been proposed to modify search query logs by mapping all terms, not just the user ids, to random strings. In token-based hashing each query is tokenized, and then an uninvertible hash function is applied to each token. The intuition is that the hashes completely obscure the terms in the query. However, using a statistical analysis of the hashed log and any (unhashed) query log, for example, the released AOL log discussed above, the anonymization can be severely compromised, showing that token-based hashing is unsuitable for anonymization.17
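A linkage attack of the kind described for the GIC data can be sketched as a join on shared fields. The records, the names, and the link function below are invented for illustration; the only detail taken from the text is the choice of birthdate, sex, and zip code as the linking fields.

```python
def link(medical_rows, voter_rows):
    """Match 'anonymized' medical rows to names via birthdate, sex, and zip."""
    def key(row):
        return (row["birthdate"], row["sex"], row["zip"])

    # Index the auxiliary (voter) data by the shared fields.
    names_by_key = {}
    for voter in voter_rows:
        names_by_key.setdefault(key(voter), []).append(voter["name"])

    # A medical row is re-identified when exactly one voter shares its key.
    matches = {}
    for index, record in enumerate(medical_rows):
        candidates = names_by_key.get(key(record), [])
        if len(candidates) == 1:
            matches[index] = candidates[0]
    return matches


# Hypothetical released data: names removed, linking fields retained.
medical = [
    {"birthdate": "1950-01-02", "sex": "M", "zip": "00001", "diagnosis": "D1"},
    {"birthdate": "1971-03-04", "sex": "F", "zip": "00002", "diagnosis": "D2"},
]

# Hypothetical auxiliary data available to the attacker.
voters = [
    {"name": "Pat Example", "birthdate": "1950-01-02", "sex": "M", "zip": "00001"},
    {"name": "Lee Sample", "birthdate": "1971-03-04", "sex": "F", "zip": "00002"},
    {"name": "Kim Sample", "birthdate": "1971-03-04", "sex": "F", "zip": "00002"},
]

matches = link(medical, voters)
```

In this toy data the first medical row is unique on the three fields and is re-identified, while the second is shared by two voters and stays ambiguous. The attack needs nothing from the released table beyond the fields it deliberately retained.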
As we will see next, there are deep reasons for the fact that auxiliary information plays such a prominent role in these examples.

consumer of the information provided by the statistical database (not to mention the fact that he or she may also be a respondent in the database).
statistical databases are designed to teach can, sometimes indirectly, cause damage to an individual, even if this individual is not in the database.

In practice, statistical databases are (typically) created to provide some anticipated social gain; they teach us something we could not (easily) learn without the database. Together with the attack against Turing, and the fact that he did not have to be a member of the database for the attack to work, this suggests a new privacy goal: Minimize the increased risk to an individual incurred by joining (or leaving) the database. That is, we move from comparing an adversary’s prior and posterior views of an individual to comparing the risk to an individual when included in, versus when not included in, the database. This makes sense. A privacy guarantee that limits risk incurred by joining encourages participation in the dataset, increasing social utility. This is the starting point on our path to differential privacy.

Differential Privacy

Differential privacy will ensure that the ability of an adversary to inflict harm (or good, for that matter), of any sort, to any set of people, should be essentially the same, independent of whether any individual opts in to, or opts out of, the dataset. We will do this indirectly, simultaneously addressing all possible forms of harm and good, by focusing on the probability of any given output of a privacy mechanism

The multiplicative nature of the guarantee implies that an output whose probability is zero on a given database must also have probability zero on any neighboring database, and hence, by repeated application of the definition, on any other database. Thus, Definition 1 trivially rules out the subsample-and-release paradigm discussed: For an individual x not in the dataset, the probability that x’s data is sampled and released is obviously zero; the multiplicative nature of the guarantee ensures that the same is true for an individual whose data is in the dataset.

Any mechanism satisfying this definition addresses all concerns that any participant might have about the leakage of his or her personal information, regardless of any auxiliary information known to an adversary: Even if the participant removed his or her data from the dataset, no outputs (and thus consequences of outputs) would become significantly more or less likely. For example, if the database were to be consulted by an insurance provider before deciding whether or not to insure a given individual, then the presence or absence of any individual’s data in the database will not significantly affect his or her chance of receiving coverage.

Definition 1 extends naturally to group privacy. Repeated application of the definition bounds the ratios of probabilities of outputs when a collection C of participants opts in or opts out, by a factor of $e^{|C|\varepsilon}$. Of course, the

Returning to randomized response, we see that it yields $\varepsilon$-differential privacy for a value of $\varepsilon$ that depends on the universe from which the rows are chosen and the probability with which a random, rather than non-random, value is contributed by the respondent. As an example, suppose each row consists of a single bit, and that the respondent’s instructions are to first flip an unbiased coin to determine whether he or she will answer randomly or truthfully. If heads (respond randomly), then the respondent is to flip a second unbiased coin and report the outcome; if tails, the respondent answers truthfully. Fix $b \in \{0, 1\}$. If the true value of the input is b, then b is output with probability 3/4. On the other hand, if the true value of the input is 1 − b, then b is output with probability 1/4. The ratio is 3, yielding (ln 3)-differential privacy.

Suppose n respondents each employ randomized response independently, but using coins of known, fixed, bias. Then, given the randomized data, by the properties of the binomial distribution the analyst can approximate the true answer to the question “How many respondents have value b?” to within an expected error on the order of $\Theta(\sqrt{n})$. As we will see, it is possible to do much better, obtaining constant expected error, independent of n.

Generalizing in a different direction, suppose each row now has two bits, each one randomized independently, as described earlier. While each
and how this probability can change point of the statistical database is to bit remains (ln 3)-differentially private,
with the addition or deletion of any disclose aggregate information about their logical-AND enjoys less privacy.
row. Thus, we will concentrate on pairs large groups (while simultaneously That is, consider a privacy mechanism
of databases (D, D¢) differing only in protecting individuals), so we should in which each bit is protected by this
one row, meaning one is a subset of expect privacy bounds to disintegrate exact method of randomized response,
the other and the larger database con- with increasing group size. and consider the query: “What is the
tains just one additional row. Finally, The parameter e is public, and its logical-AND of the bits in the row of
to handle worst-case pairs of data- selection is a social question. We tend respondent i (after randomization)?”
bases, our probabilities will be over the to think of e as, say, 0.01, 0.1, or in If we consider the two extremes, one
random choices made by the privacy some cases, ln 2 or ln 3. in which respondent i has data 11
mechanism. Sometimes, for example, in the cen- and the other in which respondent
sus, an individual’s participation is i has data 00, we see that in the first
Definition 1. A randomized function K known, so hiding presence or absence case the probability of output 1 is 9/16,
gives e-differential privacy if for all data- makes no sense; instead we wish to while in the second case the probabil-
sets D and D¢ differing on at most one row, hide the values in an individual’s row. ity is 1/16. Thus, this mechanism is at
and all S ⊆ Range(K), Thus, we can (and sometimes do) best (ln 9)-differentially private, not
extend “differing in at most one row” ln 3. Again, it is possible to do much
to mean having symmetric difference better, even while releasing the entire
(1) at most 1 to capture both possibilities. 4-element histogram, also known as a
where the probability space in each case However, we will continue to use the contingency table, with only constant
is over the coin flips of K. original definition. expected error in each cell.
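
The randomized response scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the simulated survey of 10,000 respondents, and the fixed seed are hypothetical and do not come from the article. The exact fractions reproduce the 3/4 versus 1/4 and 9/16 versus 1/16 probabilities from the text, and the estimator inverts the relation E[sum of reports] = n/4 + (true count)/2.

```python
import math
import random
from fractions import Fraction


def randomized_response(true_bit: int, rng: random.Random) -> int:
    """Report one bit using the two-coin protocol described above."""
    if rng.random() < 0.5:       # first coin is heads: respond randomly
        return rng.randrange(2)  # second coin decides the reported bit
    return true_bit              # first coin is tails: answer truthfully


def prob_report_one(true_bit: int) -> Fraction:
    """Exact probability that the reported bit is 1."""
    return Fraction(3, 4) if true_bit == 1 else Fraction(1, 4)


def estimate_count_of_ones(reports: list) -> float:
    """Unbiased estimate of the number of respondents whose true bit is 1.

    E[sum(reports)] = n/4 + (true count)/2, so invert that relation.
    """
    n = len(reports)
    return 2.0 * (sum(reports) - n / 4.0)


# Privacy loss for a single bit: ln((3/4) / (1/4)) = ln 3.
single_bit_loss = math.log(float(prob_report_one(1) / prob_report_one(0)))

# Logical-AND of two independently randomized bits: compare rows 11 and 00.
p_and_11 = prob_report_one(1) * prob_report_one(1)  # 9/16
p_and_00 = prob_report_one(0) * prob_report_one(0)  # 1/16
and_loss = math.log(float(p_and_11 / p_and_00))     # ln 9

# Simulated survey (hypothetical data): 3,000 of 10,000 respondents hold a 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [randomized_response(b, rng) for b in true_bits]
estimate = estimate_count_of_ones(reports)
```

The estimate fluctuates around the true count with a standard deviation that grows like √n, which is the error the text contrasts with the constant error achievable by adding noise to the true answer.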

Achieving Differential Privacy

Achieving differential privacy revolves around hiding the presence or absence of a single
individual. Consider the query “How many rows in the database satisfy property P?” The presence
or absence of a single row can affect the answer by at most 1. Thus, a differentially private
mechanism for a query of this type can be designed by first computing the true answer and then
adding random noise according to a distribution with the following property:

$$\forall z, z' \text{ such that } |z - z'| = 1:\quad \Pr[z] \le e^{\varepsilon} \Pr[z'] \tag{2}$$

To see why this is desirable, consider any feasible response r. For any m, if m is the true
answer and the response is r then the random noise must have value r − m; similarly, if m − 1 is
the true answer and the response is r, then the random noise must have value r − m + 1. In order
for the response r to be generated in a differentially private fashion, it suffices for

$$e^{-\varepsilon} \le \frac{\Pr[\text{noise} = r - m]}{\Pr[\text{noise} = r - m + 1]} \le e^{\varepsilon}.$$

In general we are interested in vector-valued queries; for example, the data may be points in
R^d and we wish to carry out an analysis that clusters the points and reports the location of
the largest cluster.

Definition 2. For f : D → R^d, the L1 sensitivity of f is [7]

$$\Delta f = \max_{D, D'} \| f(D) - f(D') \|_1 \tag{3}$$

for all D, D′ differing in at most one row.

In particular, when d = 1 the sensitivity of f is the maximum difference in the values that the
function f may take on a pair of databases that differ in only one row. This is the difference
our noise must be designed to hide. For now, let us focus on the case d = 1.

The Laplace distribution with parameter b, denoted Lap(b), has density function

$$P(z \mid b) = \frac{1}{2b} \exp(-|z| / b);$$

its variance is 2b². Taking b = 1/ε we have that the density at z is proportional to e^{−ε|z|}.
This distribution has highest density at 0 (good for accuracy), and for any z, z′ such that
|z − z′| ≤ 1 the density at z is at most e^ε times the density at z′, satisfying the condition
in Equation 2. It is also symmetric about 0, and this is important. We cannot, for example, have
a distribution that only yields non-negative noise. Otherwise the only databases on which a
counting query could return a response of 0 would be databases in which no row satisfies the
query. Letting D be such a database, and letting D′ = D ∪ {r} for some row r satisfying the
query, the pair D, D′ would violate ε-differential privacy. Finally, the distribution gets
flatter as ε decreases. This is correct: smaller ε means better privacy, so the noise density
should be less “peaked” at 0 and change more gradually as the magnitude of the noise increases.

There is nothing special about the cases d = 1, Δf = 1:

Theorem 2. For f : D → R^d, the mechanism K that adds independently generated noise with
distribution Lap(Δf/ε) to each of the d output terms enjoys ε-differential privacy. [7]

Before proving the theorem, we illustrate the situation for the case of a counting query
(Δf = 1) when ε = ln 2 and the true answer to the query is 100. The distribution on the outputs
(in gray) is centered at 100. The distribution on outputs when the true answer is 101 is shown
in orange.

[Figure: the two Laplace output distributions, plotted over responses … 97 98 99 100 101 102 103 …]

Proof. (Theorem 2) The proof is a simple generalization of the reasoning we used to illustrate
the case of a single counting query.

Consider any subset S ⊆ Range(K), and let D, D′ be any pair of databases differing in at most
one row. When the database is D, the probability density at any r ∈ S is proportional to
exp(−||f(D) − r||_1 (ε/Δf)). Similarly, when the database is D′, the probability density at any
r ∈ Range(K) is proportional to exp(−||f(D′) − r||_1 (ε/Δf)). We have

$$\frac{\exp(-\| f(D) - r \|_1 (\varepsilon / \Delta f))}{\exp(-\| f(D') - r \|_1 (\varepsilon / \Delta f))} \le \exp(\| f(D') - f(D) \|_1 (\varepsilon / \Delta f)),$$

where the inequality follows from the triangle inequality. By definition of sensitivity,
||f(D′) − f(D)||_1 ≤ Δf, and so the ratio is bounded by exp(ε). Integrating over S yields
ε-differential privacy.

Given any query sequence f_1, ..., f_m, ε-differential privacy can be achieved by running K with
noise distribution Lap(Σ_i Δf_i/ε) on each query, even if the queries are chosen adaptively,
with each successive query depending on the answers to the previous queries. In other words, by
allowing the quality of each answer to deteriorate in a controlled way with the sum of the
sensitivities of the queries, we can maintain ε-differential privacy.

With this in mind, let us return to some of the suggestions we considered earlier. Recall that
using the specific randomized response strategy described above, for a single Boolean attribute,
yielded error on the order of √n on databases of size n and (ln 3)-differential privacy. In
contrast, using Theorem 2 with the same value of ε, noting that Δf = 1 yields a variance of
2(1/ln 3)², or an expected error of √2/ln 3. More generally, to obtain ε-differential privacy we
get an expected error of √2/ε. Thus, our expected error magnitude is constant, independent of n.

What about two queries? The sensitivity of a sequence of two counting queries is 2. Applying the
theorem with Δf/ε = 2/ε, adding independently generated noise distributed as Lap(2/ε) to each
true answer yields ε-differential privacy. The variance is 2(2/ε)², or standard deviation 2√2/ε.
Thus, for any desired ε we can achieve ε-differential privacy by increasing the expected
magnitude of the errors as a function of the total sensitivity of the two-query sequence. This
holds equally for:

• Two instances of the same query, addressing the repeated query problem
• One count for each of two different bit positions, for example, when each row consists of two
  bits
• A pair of queries of the form: “How many rows satisfy property P?” and “How many rows satisfy
  property Q?” (where possibly P = Q)
• An arbitrary pair of queries
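
The Laplace mechanism of Theorem 2 can be sketched as follows for counting queries. This is a minimal illustration, not the article's code: the helper names, the example true answers (100 and 40), the response value 98.3, and the seed are assumptions made for the example. Noise is drawn as the difference of two exponential variates, a standard way to sample Lap(b) with the Python standard library.

```python
import math
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Lap(scale): the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_density(z: float, scale: float) -> float:
    """Density of Lap(scale) at z: exp(-|z| / scale) / (2 * scale)."""
    return math.exp(-abs(z) / scale) / (2.0 * scale)


def private_answers(true_answers, total_sensitivity, epsilon, rng):
    """Add independent Lap(total_sensitivity / epsilon) noise to each true answer."""
    scale = total_sensitivity / epsilon
    return [a + laplace_noise(scale, rng) for a in true_answers]


epsilon = math.log(2)
rng = random.Random(1)

# One counting query (sensitivity 1) whose true answer is 100.
noisy_count = private_answers([100], 1, epsilon, rng)[0]

# Two counting queries answered together: total sensitivity 2, so Lap(2/epsilon) each.
noisy_pair = private_answers([100, 40], 2, epsilon, rng)

# Density ratio at a response r when the true answer is 100 versus 101.
r = 98.3
ratio = laplace_density(r - 100, 1 / epsilon) / laplace_density(r - 101, 1 / epsilon)
```

With ε = ln 2 the density ratio between the two neighboring true answers never leaves the interval [1/2, 2], which is the bound Definition 1 asks for.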

However, the theorem also shows we can sometimes do better. The logical-AND count we discussed
earlier, even though it involves two different bits in each row, still only has sensitivity 1:
The number of 2-bit rows whose entries are both 1 can change by at most 1 with the addition or
deletion of a single row. Thus, this more complicated query can be answered in an
ε-differentially private fashion using noise distributed as Lap(1/ε); we do not need to use the
distribution Lap(2/ε).

Histogram Queries. The power of Theorem 2 really becomes clear when considering histogram
queries, defined as follows. If we think of the rows of the database as elements in a universe
X, then a histogram query is a partitioning of X into an arbitrary number of disjoint regions
X_1, X_2, ..., X_d. The implicit question posed by the query is: “For i = 1, 2, ..., d, how many
points in the database are contained in X_i?” For example, the database may contain the annual
income for each respondent, and the query is a partitioning of incomes into ranges: {[0, 50K),
[50K, 100K), ..., ≥ 500K}. In this case d = 11, and the question is asking, for each of the d
ranges, how many respondents in the database have annual income in the given range. This looks
like d separate counting queries, but the entire query actually has sensitivity Δf = 1. To see
this, note that if we remove one row from the database, then only one cell in the histogram
changes, and that cell only changes by 1; similarly for adding a single row. So Theorem 2 says
that ε-differential privacy can be maintained by perturbing each cell with an independent random
draw from Lap(1/ε). Returning to our example of 2-bit rows, we can pose the 4-ary histogram
query requesting, for each pair of literals v1v2, the number of rows with value v1v2, adding
noise of order 1/ε to each of the four cells.

When Noise Makes No Sense. There are times when the addition of noise for achieving privacy
makes no sense. For example, the function f might map databases to strings, strategies, or
trees, or it might be choosing the “best” among some specific, not necessarily continuous, set
of real-valued objects. The problem of optimizing the output of such a function while preserving
ε-differential privacy requires additional technology.

Assume the curator holds a database D and the goal is to produce an object y. The exponential
mechanism [19] works as follows. We assume the existence of a utility function u(D, y) that
measures the quality of an output y, given that the database is D. For example, the data may be
a set of labeled points in R^d and the output y might be a d-ary vector describing a
(d − 1)-dimensional hyperplane that attempts to classify the points, so that those labeled with
+1 have non-negative inner product with y and those labeled with −1 have negative inner product.
In this case the utility would be the number of points correctly classified, so that higher
utility corresponds to a better classifier. The exponential mechanism outputs y with probability
proportional to exp(u(D, y)ε/Δu) and ensures ε-differential privacy. Here Δu is the sensitivity
of the utility function bounding, for all databases (D, D′) differing in only one row and
potential outputs y, the difference |u(D, y) − u(D′, y)|. In our example, Δu = 1. The mechanism
assigns most mass to the best classifier, and the mass assigned to any other drops off
exponentially in the decline in its utility for the current dataset; hence the name “exponential
mechanism.”

When Sensitivity Is Hard to Analyze. The Laplace and exponential mechanisms provide a
differentially private interface through which the analyst can access the data. Such an
interface can be useful even when it is difficult to determine the sensitivity of the desired
function or query sequence; it can also be used to run an iterative algorithm, composed of
easily analyzed steps, for as many iterations as a given privacy budget permits. This is a
powerful observation; for example, using only noisy sum queries, it is possible to carry out
many standard data mining tasks, such as singular value decompositions, finding an ID3 decision
tree, clustering, learning association rules, and learning anything learnable in the statistical
queries learning model, frequently with good accuracy, in a privacy-preserving fashion. [2] This
approach has been generalized to yield a publicly available codebase for
writing programs that ensure differential privacy. [18]

k-Means Clustering. As an example of “private programming,” [2] consider k-means clustering,
described first in its usual, non-private form. The input consists of points p_1, ..., p_n in
the d-dimensional unit cube [0, 1]^d. Initial candidate means m_1, ..., m_k are chosen randomly
from the cube and updated as follows:

1. Partition the samples {p_i} into k sets S_1, ..., S_k, associating each p_i with the nearest
   m_j.
2. For 1 ≤ j ≤ k, set m′_j = (Σ_{i ∈ S_j} p_i) / |S_j|, the mean of the samples associated with
   m_j.

This update rule is typically iterated until some convergence criterion has been reached, or a
fixed number of iterations have been applied.

Although computing the nearest mean of any one sample (Step 1) would breach privacy, we observe
that to compute an average among an unknown set of points it is enough to compute their sum and
divide by their number. Thus, the computation only needs to expose the approximate cardinalities
of the S_j, not the sets themselves. Happily, the k candidate means implicitly define a
histogram query, since they partition the space [0, 1]^d according to their Voronoi cells, and
so the vector (|S_1|, ..., |S_k|) can be released with very low noise in each coordinate. This
gives us a differentially private approximation to the denominators in Step 2. As for the
numerators, the sum of a subset of the p_i has sensitivity at most d, since the points come from
the bounded region [0, 1]^d. Even better, the sensitivity of the d-ary function that returns,
for each of the k Voronoi cells, the d-ary sum of the points in the cell is at most d: Adding or
deleting a single d-ary point can affect at most one sum, and that sum can change by at most 1
in each of the d dimensions. Thus, using a query sequence with total sensitivity at most d + 1,
the analyst can compute a new set of candidate means by dividing, for each m_j, the approximate
sum of the points in S_j by the approximation to the cardinality |S_j|.

If we run the algorithm for a fixed number N of iterations we can use the noise distribution
Lap((d + 1)N/ε) to obtain ε-differential privacy. If we do not know the number of iterations in
advance we can increase the noise parameter as the computation proceeds. There are many ways to
do this. For example, we can answer in the first iteration with parameter (d + 1)/(ε/2), in the
next with parameter (d + 1)/(ε/4), and so on, each time using up half of the remaining “privacy
budget.”

Generating Synthetic Data

The idea of creating a synthetic dataset whose statistics closely mirror those of the original
dataset, but which preserves privacy of individuals, was proposed in the statistics community no
later than 1993. [21] The lower bounds on noise discussed at the end of the section on “How Is
Hard” imply that no such dataset can safely provide very accurate answers to too many weighted
subset sum questions, motivating the interactive approach to private data analysis discussed
herein. Intuitively, the advantage of the interactive approach is that only the questions
actually asked receive responses.

Against this backdrop, the non-interactive case was revisited from a learning theory
perspective, challenging the interpretation of the noise lower bounds as a limit on the number
of queries that can be answered privately. [3] This work, described next, has excited interest
in interactive and non-interactive solutions yielding noise in the range .

Let X be a universe of data items and let C be a concept class consisting of functions
c : X → {0, 1}. We say x ∈ X satisfies a concept c ∈ C if and only if c(x) = 1. A concept class
can be extremely general; for example, it might consist of all rectangles in the plane, or all
Boolean circuits containing a given number of gates.

Given a sufficiently large database D ∈ X^n, it is possible to privately generate a synthetic
database that maintains approximately correct fractional counts for all concepts in C (there may
be infinitely many!). That is, letting S denote the synthetic database produced, with high
probability over the choices made by the privacy mechanism, for every concept c ∈ C, the
fraction of elements in S that satisfy c is approximately the same as the fraction of elements
in D that satisfy c.

The minimal size of the input database depends on the quality of the approximation, the
logarithm of the cardinality of the universe X, the privacy parameter ε, and the
Vapnik-Chervonenkis dimension of the concept class C (for finite |C| this is at most log2 |C|).
The synthetic dataset, chosen by the exponential mechanism, will be a set of
m = O(VCdim(C)/γ²) elements in X (γ governs the maximum permissible inaccuracy in the fractional
count). Letting D denote the input dataset and D̂ a candidate synthetic dataset, the utility
function for the exponential mechanism is given by

Pan-Privacy

Data collected by a curator for a given purpose may be subject to “mission creep” and legal
compulsion, such as a subpoena. Of course, we could analyze data and then throw it away, but can
we do something even stronger, never storing the data in the first place? Can we strengthen our
notion of privacy to capture the “never store” requirement?

These questions suggest an investigation of differentially private streaming algorithms with
small state, much too small to store the data. However, nothing in the definition of a streaming
algorithm, even one with very small state, precludes storing a few individual data points.
Indeed, popular techniques from the streaming literature, such as Count-Min Sketch and
subsampling, do precisely this. In such a situation, a subpoena or other intrusion into the
local state will breach privacy.

A pan-private algorithm is private “inside and out,” remaining differentially private even if
its internal state becomes visible to an adversary. [10] To understand the pan-privacy
guarantee, consider click stream data. This data is generated by individuals, and an individual
may appear many times in the stream. Pan-privacy requires that any two streams differing only in
the information of a single individual should produce very similar distributions on the internal
states of the algorithm and on its outputs, even though the data of an individual are
interleaved arbitrarily with other data in the stream.

As an example, consider the problem of density estimation. Assuming, for simplicity, that the
data stream is just a sequence of IP addresses in a certain range, we wish to know what fraction
of the set of IP addresses in the range actually appears in the stream. A solution inspired by
randomized response can be designed using the following technique. [10]

Define two probability distributions, D_0 and D_1, on the set {0, 1}. D_0 assigns equal mass to
zero and to one. D_1 has a slight bias toward 1; specifically, 1 has mass 1/2 + ε/4, while 0 has
mass 1/2 − ε/4.

Let X denote the set of all possible IP addresses in the range of interest. The algorithm
creates a table, with a 1-bit entry b_x for each x ∈ X, initialized to an independent random
draw from D_0. So initially the table is roughly half zeroes and half ones.

In an atomic step, the algorithm receives an element from the stream, changes state, and
discards the element. When processing x ∈ X, the algorithm makes a fresh random draw from D_1,
and stores the result in b_x. This is done no matter how many times x may have appeared in the
past. Thus, for any x appearing at least once, b_x will be distributed according to D_1.
However, if x never appears, then the entry for x is the bit drawn according to D_0 during the
initialization of the table.

As with randomized response, the density in X of the items in the stream can be approximated
from the number of 1’s in the table, taking into account the expected fraction of “false
positives” from the initialization phase and the “false negatives” when sampling from D_1.
Letting q denote the fraction of entries in the table with value 1, the output is
4(q − 1/2)/ε + Lap(1/(ε|X|)).

Intuitively, the internal state is differentially private because, for each
privacy for the output is ensured by the addition of Laplacian noise. Over

expanding rapidly, and there is insufficient space here to list all the interesting directions
currently under investigation by the community. We identify a few of these.

The Geometry of Differential Privacy. Sharper upper and lower bounds on noise required for
achieving differential privacy against a sequence of linear queries can be obtained by
understanding the geometry of the query sequence. [14] In some cases dependencies among the
queries can be exploited by the curator to markedly improve the accuracy of the responses.
Generalizing this investigation to the nonlinear and interactive cases would be of significant
interest.

Algorithmic Complexity. We have so far ignored questions of computational complexity. Many, but
not all, of the techniques described here have efficient implementations. For example, there are
instances of the synthetic data generation problem that, under standard cryptographic
assumptions, have no polynomial time implementation. [11] It follows that there are cases in
which the exponential mechanism has no efficient implementation. When can this powerful tool be
implemented efficiently, and how?

An Alternative to Differential Privacy? Is there an alternative, “ad omnia,” guarantee that
composes automatically, and permits even better accuracy than differential privacy? Can
cryptography be helpful in this regard? [20]

The work described herein has, for the first time, placed private data analysis on a strong
mathematical foundation. The literature connects differential privacy to decision theory,
economics, robust statistics, geometry, additive combinatorics, cryptography, complexity theory,
learning theory, and machine learning. Differential privacy thrives because it is natural, it is
not domain-specific, and it enjoys fruitful interplay with other fields. This flexibility gives
hope

References

1. Adam, N.R., Wortmann, J. Security-control methods for statistical databases: A comparative study. ACM Comput. Surv. 21 (1989), 515–556.
2. Blum, A., Dwork, C., McSherry, F., Nissim, K. Practical privacy: The SuLQ framework. In Proceedings of the 24th ACM Symposium on Principles of Database Systems (2005), 128–138.
3. Blum, A., Ligett, K., Roth, A. A learning theory approach to non-interactive database privacy. In Proceedings of the 40th ACM Symposium on Theory of Computing (2008), 609–618.
4. Denning, D.E. Secure statistical databases with random sample queries. ACM Trans. Database Syst. 5 (1980), 291–315.
5. Dinur, I., Nissim, K. Revealing information while preserving privacy. In Proceedings of the 22nd ACM Symposium on Principles of Database Systems (2003), 202–210.
6. Dwork, C. Differential privacy. In Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP) (2) (2006), 1–12.
7. Dwork, C., McSherry, F., Nissim, K., Smith, A. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference (2006), 265–284.
8. Dwork, C., McSherry, F., Talwar, K. The price of privacy and the limits of lp decoding. In Proceedings of the 39th ACM Symposium on Theory of Computing (2007), 85–94.
9. Dwork, C., Naor, M. On the difficulties of disclosure prevention in statistical databases or the case for differential privacy. J. Privacy Confidentiality 2 (2010). Available at: http://repository.cmu.edu/jpc/vol2/iss1/8.
10. Dwork, C., Naor, M., Pitassi, T., Rothblum, G., Yekhanin, S. Pan-private streaming algorithms. In Proceedings of the 1st Symposium on Innovations in Computer Science (2010).
11. Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S. When and how can privacy-preserving data release be done efficiently? In Proceedings of the 41st ACM Symposium on Theory of Computing (2009), 381–390.
12. Dwork, C., Nissim, K. Privacy-preserving datamining on vertically partitioned databases. In Advances in Cryptology - CRYPTO’04 (2004), 528–544.
13. Goldwasser, S., Micali, S. Probabilistic encryption. JCSS 28 (1984), 270–299.
14. Hardt, M., Talwar, K. On the geometry of differential privacy (2009). In Proceedings of the 42nd ACM Symposium on Theory of Computing (2010), 705–714.
15. Kenthapadi, K., Mishra, N., Nissim, K. Simulatable auditing. In Proceedings of the 24th ACM Symposium on Principles of Database Systems (2005), 118–127.
16. Kleinberg, J., Papadimitriou, C., Raghavan, P. Auditing boolean attributes. In Proceedings of the 19th ACM Symposium on Principles of Database Systems (2000), 86–91.
17. Kumar, R., Novak, J., Pang, B., Tomkins, A. On anonymizing query logs via token-based hashing. In Proceedings of the WWW 2007 (2007), 629–638.
18. McSherry, F. Privacy integrated queries (codebase). Available on Microsoft Research downloads website. See also Proceedings of SIGMOD (2009), 19–30.
19. McSherry, F., Talwar, K. Mechanism design via differential privacy. In Proceedings of the 48th Annual Symposium on Foundations of Computer Science (2007).
20. Mironov, I., Pandey, O., Reingold, O., Vadhan, S. Computational differential privacy. In Advances in Cryptology - CRYPTO’09 (2009), 126–142.
21. Rubin, D. Discussion: Statistical disclosure limitation. J. Official Statist. 9 (1993), 462–468.
22. Sweeney, L. Weaving technology and policy together to maintain confidentiality. J. Law Med. Ethics 25 (1997), 98–110.
23. Warner, S. Randomized response: a survey technique for eliminating evasive answer bias. JASA (1965), 63–69.
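
The pan-private density estimator described in the Pan-Privacy section can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the class name, the integer universe standing in for IP addresses, the example stream, and the seed are hypothetical, and the noise term is read as Lap(1/(ε|X|)). The table is initialized from D_0, every arrival overwrites its entry with a fresh draw from D_1, and the estimate is 4(q − 1/2)/ε plus Laplace noise.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw from Lap(scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class PanPrivateDensityEstimator:
    """One bit of state per universe element; the stream itself is never stored."""

    def __init__(self, universe, epsilon: float, rng: random.Random):
        self.epsilon = epsilon
        self.rng = rng
        # D_0: every entry starts as an unbiased coin flip.
        self.table = {x: rng.randrange(2) for x in universe}

    def process(self, x) -> None:
        """Handle one stream element: overwrite its bit with a fresh draw from D_1."""
        p_one = 0.5 + self.epsilon / 4.0
        self.table[x] = 1 if self.rng.random() < p_one else 0

    def estimate(self) -> float:
        """Return 4(q - 1/2)/epsilon + Lap(1/(epsilon * |X|))."""
        size = len(self.table)
        q = sum(self.table.values()) / size
        noise = laplace_noise(1.0 / (self.epsilon * size), self.rng)
        return 4.0 * (q - 0.5) / self.epsilon + noise


# Hypothetical stream: 12,500 of 50,000 possible addresses appear, some of them twice.
rng = random.Random(3)
estimator = PanPrivateDensityEstimator(range(50000), epsilon=0.5, rng=rng)
for address in list(range(12500)) + list(range(0, 12500, 3)):
    estimator.process(address)
density_estimate = estimator.estimate()
```

Because the expected fraction of ones is 1/2 + (density)(ε/4), the rescaling by 4/ε recovers the density on average; repeated arrivals of the same address leave the distribution of its bit unchanged.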
cacm online
[continued from p. 16] is not solely about communication; it is also about maintaining a
historical record, so that future generations of scientists can learn and build on the work of
the past. Whether new forms of scholarly communication pass this second test is far from
certain.

A common misconception about the “dead tree” model of scholarly communication is that it is
antithetical to speed. This is only true to a certain extent, but almost certainly not to the
extent that most believe.

Scott Delman, ACM’s Director of Group Publishing, commented that “The current system of peer
review is the single largest bottleneck in the scholarly publication process, but this does not
mean the established system can simply be thrown out in favor of new models just because new
technology enables dramatic improvements in speed.” Establishing a new model for scholarly
communication will involve experimentation, trial and error, and most likely evolution instead
of revolution. Proclamations of the death of scholarly publishers and scholarly publishing as a
result of the rise of the Internet are no longer taken seriously by those working in the
publishing industry. What we have seen is a slow but steady evolution of print to online
publication and distribution models instead of an overnight upheaval.

Delman adds, “I believe strongly that there is a need for a new model,” but then goes on to
refute the notion that digital-only publishing, and the elimination of print, would quicken the
publication of scholarly articles. “The most substantial component in the time delay related to
the publication of articles in scholarly journals is the peer-review process,” and a
digital-only model won’t change that, he says. Nor will it reduce article backlogs or remove
page limitations. “Eliminating print will not have the dramatic impact that most assume will
occur if print publications go away,” he says. Importantly, ACM readers and subscribers “look
for high-quality content delivered in multiple formats, and they still want print.”

Adding to the complexity of the challenge is the fact that while science is global, scientific
publication models are often socially or geographically influenced, so there is no single
solution that can be identified to improve the speed or efficiency of scholarly communication.
Ed Chi, of the Palo Alto Research Center, described some of the difficulties of modernizing
peer-review publishing in the Blog@CACM at (http://cacm.acm.org/blogs/blog-cacm/100284). “In
many non-U.S. research evaluations, only the ISI Science Citation Index actually counts for
publication. Already this doesn’t fit with many real-world metrics for reputation,” Chi said via
email. Some well-known ACM conference publications are excluded from the SCI
(http://bit.ly/iaobEa), “even when their real-world reputation is much higher than other
‘dead-tree’ journals.” Technology may provide opportunities to facilitate and accelerate the
discourse, but there is no guarantee the academic establishment around the world will move as
quickly in accepting new media and ways of communicating.

Paperless publishing will happen gradually, but “only if there are ways to manage the
publication process,” Chi says. “Open source journal publication management systems will enable
journals to go somewhat independent of traditional paper publishers, but we will also need
national scientific institutions to establish digital archives.” Other challenges he notes
include handling an increased number of submissions and managing potentially larger editorial
boards.

As an organization with the stated mission to advance computing as a science and profession, ACM
could “lead the charge” in experimenting with new digital publishing models for computing
scholarship, says Chi. “This might include creating usable software, digital libraries, or
archival standards.” A particularly important area of research would examine how to make
socially derived metrics a part of reputation systems, so that the number of downloads, online
mentions, citations, and blog discussions can be measured for influence. Then, according to Chi,
ACM “should work with national libraries to actively change the publication models of other
professions and fields.” This will not be a revolution. ACM can help to drive the change in a
positive way for the scientific community.
Communications of the ACM | January 2011 | Vol. 54 | No. 1
research highlights
Technical Perspective: Sora Promises Lasting Impact
By Dina Katabi

Sora: High-Performance Software Radio Using General-Purpose Multi-Core Processors
By Kun Tan, He Liu, Jiansong Zhang, Yongguang Zhang, Ji Fang, and Geoffrey M. Voelker

Technical Perspective: Multipath: A New Control Architecture for the Internet
By Damon Wischik

Path Selection and Multipath Congestion Control
By Peter Key, Laurent Massoulié, and Don Towsley
doi:10.1145/1866739.1866759
Technical Perspective
Sora Promises Lasting Impact
By Dina Katabi
The term software defined radio (SDR) first appeared in 1992 and referred to a radio transceiver where the basic signal processing components (for example, filtering, frame detection, synchronization, and demodulation) are all done in a general-purpose processor. The goal of an SDR was to enable a single radio to support multiple wireless technologies (for example, AM, VHF, FM) and be easily upgradable with a software patch.

While the concept of SDR has been around for decades, only recently have SDRs become common in academic wireless research. However, research projects typically employ SDR as a development platform, that is, they use software radios to develop new physical layer designs with the understanding that if these designs make it to a product they will be built in ASICs. The reason why SDR has become a development platform rather than a fully functional software radio is that building high-performance SDRs has turned out to be very challenging.

Sora has revived the original SDR vision. The objective of Sora is to build an SDR that combines the performance and fidelity of hardware platforms with the programmability and flexibility of general-purpose processors. To do so, Sora must overcome the following challenge: How can a radio deliver high throughput and support real-time protocols when all signal processing is done in software on a PC?

Sora’s approach uses various features common in today’s multicore architectures. For example, transferring the digital waveform samples from the radio board to the PC requires very high bus throughput. While alternative SDR technologies employ USB 2.0 or Gigabit Ethernet, Sora opts for PCI-Express. This design decision enables Sora to achieve significantly higher transfer rates, which are important for high bandwidth multi-antenna designs. The choice of PCI-express also enables Sora to reduce the transfer latency to sub-microseconds, which is necessary for wireless protocols with timing constraints (for example, MAC protocols). Further, to accelerate wireless processing, Sora replaces computation with memory lookups, exploits single instruction multiple data (SIMD), and dedicates certain cores exclusively to real-time signal processing.

There are many reasons why the following paper about Sora stands out as one of the most significant wireless papers in the past few years. First, it presents the first SDR platform that fully implements IEEE 802.11b/g on standard PCs. Second, the design choices it makes (for example, the use of PCIe, SIMD, trading computation for memory lookups, and core dedication) are highly important if software radios are ever to meet their original goal of one-radio-for-all-wireless-technologies. Third, the paper is a beautiful and impressive piece of engineering that spans signal processing, hardware design, multicore programming, kernel optimization, and so on. For all these reasons, this paper will have a lasting impact on wireless research.

The Sora platform has been used in multiple research projects and real-time demos. It has enabled demanding designs, such as LTE and AP virtualization, to be built fully in software.

However, currently most SDR-based research uses the GNU Radio/USRP platform. Despite the limitations of this platform, previous attempts at replacing it with more capable platforms did not experience significant success. In fact, history shows that wide adoption is not necessarily correlated with the more capable design. One of the classic papers we teach our undergraduate students is “The Rise of Worse is Better” by Richard Gabriel that explains why the Lisp language lost to C and Unix. Gabriel argues that for wide adoption, a system must be good enough and as simple as possible. Such a design (termed worse is better) tends to appear first because the implementer did not spend an excessive amount of time over-optimizing. Therefore, if good enough, it will be adopted by developers because of its simplicity. Once adopted, the system will gradually improve until it is almost the right design. One may argue that the history of the GNU Radio/USRP SDR is fairly similar; the platform originally provided just enough for people to start experimenting with the wireless physical layer. As a result, it was simple and cheap, which caused it to spread. Once it was accepted, it kept improving just enough to enable the next step in research.

The Sora team has recently started a program that awards Sora kits to academic institutions to enable them to experiment with this new platform. It will be interesting to see whether Sora with its higher performance can eventually replace the GNU Radio/USRP platform. If this happens, it will be a major success for Sora.

Dina Katabi (dina@csail.mit.edu) is an associate professor in the Electrical Engineering and Computer Science Department at Massachusetts Institute of Technology, Cambridge, MA.

© 2011 ACM 0001-0782/11/0100 $10.00
doi:10.1145/1866739.1866760
motivated specialized hardware approaches in the past [14, 16]. Lastly, wireless PHY and media access control (MAC) protocols have low-latency real-time deadlines that must be met for correct operation. For example, the 802.11 MAC protocol requires precise timing control and ACK response latency on the order of tens of microseconds. Existing software architectures on the PC cannot consistently meet this timing requirement.

Sora addresses these challenges with novel hardware and software designs. First, we have developed a new, inexpensive radio control board (RCB) with a radio front-end for transmission and reception. The RCB bridges an RF front-end with PC memory over the high-speed and low-latency PCIe bus. With this bus standard, the RCB can support 16.7 Gbps (×8 mode) throughput with sub-microsecond latency, which together satisfy the throughput and timing requirements of modern wireless protocols while performing all digital signal processing on host CPU and memory.

Second, to meet PHY processing requirements, Sora makes full use of various features of widely adopted multi-core architectures in existing GPPs. The Sora software architecture explicitly supports streamlined processing that enables components of the signal processing pipeline to efficiently span multiple cores. Further, we change the conventional implementation of PHY components to extensively take advantage of lookup tables (LUTs), trading off computation for memory. These LUTs substantially reduce the computational requirements of PHY processing, while at the same time taking advantage of the large, low-latency caches on modern GPPs. Finally, Sora uses the Single Instruction Multiple Data (SIMD) extensions in existing processors to further accelerate PHY processing.

Lastly, to meet the real-time requirements of high-speed wireless protocols, Sora provides a new kernel service, core dedication, which allocates processor cores exclusively for real-time SDR tasks. We demonstrate that it is a simple yet crucial abstraction that guarantees the computational resources and precise timing control necessary for SDR on a multi-core GPP.

We have developed a few demonstration wireless systems based on the Sora platform, including: (1) SoftWiFi, an 802.11a/b/g implementation that supports a full suite of modulation rates (up to 54 Mbps) and seamlessly interoperates with commercial 802.11 NICs, and (2) SoftLTE, a 3GPP LTE uplink PHY implementation that supports up to 43.8 Mbps data rate.

The rest of the paper is organized as follows. Section 2 provides background on wireless communication systems. We then present the Sora architecture in Section 3, and we discuss our approach for addressing the challenges of building an SDR platform on a GPP system in Section 4. We then describe the implementation of the Sora platform in Section 5. Section 6 provides a quantitative evaluation of the radio systems based on Sora. Finally, Section 7 describes related work and Section 8 concludes.

2. BACKGROUND AND REQUIREMENTS
In this section, we briefly review the PHY and MAC components of typical wireless communication systems. Although different wireless technologies may have subtle differences among one another, they generally follow similar designs and share many common algorithms. In this section, we use the IEEE 802.11a/b/g standards to exemplify characteristics of wireless PHY and MAC components as well as the challenges of implementing them in software.

2.1. Wireless PHY
The role of the PHY layer is to convert information bits into a radio waveform, or vice versa. At the transmitter side, the wireless PHY component first modulates the message (i.e., a MAC frame) into a time sequence of digital baseband signals. Digital baseband signals are then passed to the radio front-end, where they are converted to an analog waveform, multiplied by a high-frequency carrier, and transmitted into the wireless channel. At the receiver side, the radio front-end receives radio signals in the channel and extracts the baseband waveform by removing the high-frequency carrier. The extracted baseband waveform is digitized and converted back into digital signals. Then, the digital baseband signals are fed into the receiver’s PHY layer to be demodulated into the original message.

The PHY layer directly operates on the digital baseband signals after modulation on the transmitter side and before demodulation on the receiver side. Therefore, high-throughput interfaces are needed to connect the PHY layer and the radio front-end. The required throughput scales linearly with the bandwidth of the baseband signal as well as the number of antennas in a MIMO system. For example, the channel width is 20 MHz in 802.11a. It requires a data rate of at least 20M complex samples per second to represent the waveform. These complex samples normally require 16-bit quantization for both in-phase and quadrature (I/Q) components to provide sufficient fidelity, translating into 32 bits per sample, or 640 Mbps for the full 20 MHz channel. Oversampling, a technique widely used for better performance [11], doubles the requirement to 1.28 Gbps. With 4 × 4 MIMO and a 40 MHz channel, as specified in 802.11n, the requirement grows another eightfold (four antennas and twice the channel width), to roughly 10 Gbps to move data between the RF front-end and PHY for one channel.

Advanced communication systems (e.g., IEEE 802.11a/b/g, as shown in Figure 1) contain multiple functional blocks in their PHY components. These functional blocks are pipelined with one another. Data are streamed through these blocks sequentially, but with different data types and sizes. As illustrated in Figure 1, different blocks may consume or produce different types of data at different rates, arranged in small data blocks. For example, in 802.11b, the scrambler may consume and produce one bit, while DQPSK modulation maps each two-bit data block onto a complex symbol, whose real and imaginary components represent I and Q, respectively.

Each PHY block performs a fixed amount of computation on every transmitted or received bit. When the data rate is high, e.g., 11 Mbps for 802.11b and 54 Mbps for 802.11a/g, PHY processing blocks consume a significant amount of computational power. Based on the model in Neel et al. [16], we estimate that a direct implementation of 802.11b may require 10 GOPS while 802.11a/g needs at least 40 GOPS.
These requirements are very demanding for software processing in GPPs.

2.2. Wireless MAC
The wireless channel is a resource shared by all transceivers operating on the same spectrum. As simultaneously transmitting neighbors may interfere with each other, various MAC protocols have been developed to coordinate their transmissions in wireless networks to avoid collisions.

Most modern MAC protocols, such as 802.11, require timely responses to critical events. For example, 802.11 adopts a carrier sense multiple access (CSMA) MAC protocol to coordinate transmissions. Transmitters are required to sense the channel before starting their transmission, and channel access is only allowed when no energy is sensed, i.e., the channel is free. The latency between sense and access should be as small as possible. Otherwise, the sensing result could be outdated and inaccurate. Another example is the link-layer retransmission mechanisms in wireless protocols, which may require an immediate acknowledgement (ACK) to be returned in a limited time window.

Commercial standards like IEEE 802.11 mandate a response latency within 16 µs, which is challenging to achieve in software on a general-purpose PC with a general-purpose OS.

2.3. Software radio requirements
Given the above discussion, we summarize the requirements for implementing a software radio system on a general PC platform:

High system throughput. The interfaces between the radio front-end and PHY as well as between some PHY processing blocks must possess sufficiently high throughput to transfer high-fidelity digital waveforms.

Intensive computation. High-speed wireless protocols require substantial computational power for their PHY processing. Such computational requirements also increase proportionally with communication speed. Unfortunately, techniques used in conventional PHY hardware or embedded DSPs do not directly carry over to GPP architectures. Thus, we require new software techniques to accelerate high-speed signal processing on GPPs. With the advent of many-core GPP architectures, it is now reasonable to aggregate the computational power of multiple CPU cores for signal processing. But it is still challenging to build a software architecture that efficiently exploits the full capability of multiple cores.

Real-time enforcement. Wireless protocols have multiple real-time deadlines that need to be met. Consequently, not only is processing throughput a critical requirement, but the processing latency also needs to meet response deadlines. Some MAC protocols also require precise timing control at the granularity of microseconds to ensure certain actions occur at exactly pre-scheduled time points. Meeting such real-time deadlines on a general PC architecture is a non-trivial challenge: time-sharing operating systems may not respond to an event in a timely manner, and bus interfaces, such as Gigabit Ethernet, could introduce unbounded delays of far more than a few microseconds. Therefore, meeting these real-time requirements requires new mechanisms on GPPs.

3. ARCHITECTURE
We have developed a high-performance software radio platform called Sora that addresses these challenges. It is based on a commodity general-purpose PC architecture. For flexibility and programmability, we push as much communication functionality as possible into software, while keeping hardware additions as simple and generic as possible. Figure 2 illustrates the overall system architecture.
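
The interface throughput figures quoted in Section 2.1 follow from simple arithmetic, which the minimal Python sketch below reproduces. The helper name `iq_throughput_bps` and its parameters are illustrative assumptions, not part of Sora; the only inputs taken from the text are the 16-bit I and Q quantization, the 2× oversampling, and the 20 MHz and 40 MHz channel widths.

```python
def iq_throughput_bps(bandwidth_hz, bits_per_component=16, oversampling=1, antennas=1):
    """Raw I/Q bit rate between the radio front-end and the PHY (illustrative helper)."""
    # Complex samples per second for one antenna: one sample per Hz of channel width,
    # multiplied by the oversampling factor.
    samples_per_second = bandwidth_hz * oversampling
    # Each complex sample carries one I value and one Q value.
    bits_per_sample = 2 * bits_per_component
    return samples_per_second * bits_per_sample * antennas


# 802.11a, 20 MHz channel, single antenna.
base_bps = iq_throughput_bps(20_000_000)
# Same channel with 2x oversampling.
oversampled_bps = iq_throughput_bps(20_000_000, oversampling=2)
# 802.11n-style case: 40 MHz channel, 2x oversampling, four antennas.
mimo_bps = iq_throughput_bps(40_000_000, oversampling=2, antennas=4)

for label, bps in [("20 MHz", base_bps),
                   ("20 MHz, 2x oversampled", oversampled_bps),
                   ("40 MHz, 2x oversampled, 4 antennas", mimo_bps)]:
    print(f"{label}: {bps / 1e9:.2f} Gbps")
```

The three cases evaluate to 0.64, 1.28, and 10.24 Gbps. Four antennas and a doubled channel width together multiply the oversampled single-antenna rate by eight, which is where the figure of roughly 10 Gbps comes from.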
Figure 2. Sora system architecture. All PHY and MAC execute in software on a commodity multi-core CPU.

Figure 3. Software architecture of Sora soft-radio stack.
in a DMA burst (128B). When transmitting, the large RCB memory enables Sora software to first write the generated samples onto the RCB, and then trigger transmission with another command to the RCB. This functionality provides flexibility to the Sora software for precalculating and storing several waveforms before actually transmitting them, while allowing precise control of the timing of the waveform transmission.

While implementing Sora, we encountered a consistency issue in the interaction between DMA operations and the CPU cache system. When a DMA operation modifies a memory location that has been cached in the L2 cache, it does not invalidate the corresponding cache entry. When the CPU reads that location, it can therefore read an incorrect value from the cache.

We solve this problem with a smart-fetch strategy, enabling Sora to maintain cache coherency with DMA memory without the drastic throughput loss that disabling cached accesses would cause. First, Sora organizes DMA memory into small slots, whose size is a multiple of a cache line. Each slot begins with a descriptor that contains a flag. The RCB sets the flag after it writes a full slot of data, and clears it after the CPU processes all data in the slot. When the CPU moves to a new slot, it first reads its descriptor, causing a whole cache line to be filled. If the flag is set, the data just fetched is valid and the CPU can continue processing the data. Otherwise, the RCB has not updated this slot with new data. Then, the CPU explicitly flushes the cache line and repeats reading the same location. This next read refills the cache line, loading the most recent data from memory.

Table 1 summarizes the RCB throughput results, which agree with the hardware specifications. To precisely measure PCIe latency, we instruct the RCB to read a memory address in host memory, and measure the time interval between issuing the request and receiving the response in hardware. Since each read involves a round trip operation, we use half of the measured time to estimate the one-way delay. This one-way delay is 360 ns with a worst case variation of 4 ns.

Table 1. DMA throughput performance of the RCB.

Mode       Rx (Gbps)   Tx (Gbps)
PCIe-x4    6.71        6.55
PCIe-x8    12.8        12.3

5.2. Software
The Sora software is written in C, with some assembly for performance-critical processing. The entire Sora software is implemented on Windows XP as a network device driver and it exposes a virtual Ethernet interface to the upper TCP/IP stack. Since any software radio implemented on Sora can appear as a normal network device, all existing network applications can run unmodified on it.

PHY Processing Library: In the Sora PHY processing library, we extensively exploit the use of look-up tables (LUTs) and SIMD instructions to optimize the performance of PHY algorithms. We have been able to rewrite more than half of the PHY algorithms with LUTs. Some LUTs are straightforward precalculations; others require more sophisticated implementations to keep the LUT size small. For the soft-demapper example mentioned earlier, we can greatly reduce the LUT size (e.g., 1.5KB for the 802.11a/g 54 Mbps modulation) by exploiting the symmetry of the algorithm. In our SoftWiFi implementation described below, the overall size of the LUTs is around 200KB for 802.11a/g and 310KB for 802.11b, both of which fit comfortably within the L2 caches of commodity CPUs.

We also heavily use SIMD instructions in coding Sora software. We currently use the SSE2 instruction set designed for Intel CPUs. Since the SSE registers are 128 bits wide while most PHY algorithms require only 8-bit or 16-bit fixed-point operations, one SSE instruction can perform 16 or 8 simultaneous calculations, respectively. SSE also has rich instruction support for flexible data permutations, and most PHY algorithms, e.g., FFT, FIR Filter and Viterbi, can fit naturally into this SIMD model. For example, the Sora Viterbi decoder uses only 40 cycles to compute the branch metric and select the shortest path for each input. As a result, our Viterbi implementation can handle 802.11a/g at the 54 Mbps modulation with only one 2.66 GHz CPU core, whereas previous implementations relied on hardware implementations. Note that other GPP architectures, like AMD and PowerPC, have very similar SIMD models and instruction sets, and we expect that our optimization techniques will directly apply to these other GPP architectures as well.

Table 2 summarizes some key PHY processing algorithms we have implemented in Sora, together with the optimization techniques we have applied. The table also compares the performance of a conventional software implementation (e.g., a direct translation from a hardware implementation) and the Sora implementation with the LUT and SIMD optimizations.

Lightweight, Synchronized FIFOs: Sora allows different PHY processing blocks to streamline across multiple cores, and we have implemented a lightweight, synchronized FIFO to connect these blocks with low contention overhead. The idea is to augment each data slot in the FIFO with a header that indicates whether the slot is empty or not. We pad each data slot to be a multiple of a cache line. Thus, the consumer is always chasing the producer in the circular buffer for filled slots. If the speed of the producer and consumer is the same and the two pointers are separated by a particular offset (e.g., two cache lines in the Intel architecture), no cache miss will occur during synchronized streaming since the local cache will prefetch the following slots before the actual access. If the producer and the consumer have different processing speeds, e.g., the reader is faster than the writer, then eventually the consumer will wait for the producer to release a slot. In this case, each time the producer writes to a slot, the write will cause a cache miss at the consumer. But the producer will not suffer a miss since the next free slot will be prefetched into its local cache. Fortunately, such cache misses experienced by the consumer will not cause significant impact on the overall performance of the streamline processing since the consumer
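
The slot-header idea behind the synchronized FIFO described in this section can be sketched in a few lines of Python. This is only an illustration of the protocol under stated assumptions: the class `SlotFifo` and its method names are hypothetical, and Python cannot express the cache-line padding, prefetching behavior, or memory ordering that the C implementation depends on. What the sketch does show is that the producer and the consumer each keep a private index and synchronize only through a per-slot full/empty flag.

```python
class SlotFifo:
    """Single-producer, single-consumer ring buffer with a full/empty flag per slot.

    Hypothetical sketch: names and layout are illustrative, not Sora's API.
    """

    def __init__(self, num_slots):
        self.full = [False] * num_slots  # per-slot header: True when the slot holds data
        self.data = [None] * num_slots   # per-slot payload
        self.head = 0                    # next slot to write; used only by the producer
        self.tail = 0                    # next slot to read; used only by the consumer

    def try_put(self, item):
        """Producer side: write into the next slot if its header says it is empty."""
        slot = self.head
        if self.full[slot]:
            return False                 # consumer has not released this slot yet
        self.data[slot] = item           # write the payload first
        self.full[slot] = True           # then mark the slot as filled
        self.head = (slot + 1) % len(self.full)
        return True

    def try_get(self):
        """Consumer side: read the next slot if its header says it is filled."""
        slot = self.tail
        if not self.full[slot]:
            return False, None           # producer has not filled this slot yet
        item = self.data[slot]
        self.full[slot] = False          # release the slot back to the producer
        self.tail = (slot + 1) % len(self.full)
        return True, item


fifo = SlotFifo(num_slots=4)
accepted = [fifo.try_put(n) for n in range(5)]  # the fifth put finds slot 0 still full
first = fifo.try_get()                          # releases slot 0
retry = fifo.try_put(4)                         # now succeeds

drained = []
while True:
    ok, item = fifo.try_get()
    if not ok:
        break
    drained.append(item)

print(accepted, first, retry, drained)
```

Because neither side ever reads the other's index, the only shared state is the flag stored alongside each slot, which is the property the text relies on to keep contention low.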
doi:10.1145/1866739.1866762
(e.g. TCP Reno) maps to a particular (user) utility function [9], and users selfishly seek to choose paths in such a way as to maximize their net utility. Within this optimization framework, a coordinated controller is modeled by a single utility function per user, whose argument is the aggregate rate summed over paths, whereas an uncoordinated controller has a utility function per path and the aggregation is over all of the utility functions.

Key to the usefulness of multipath rate control is its ability, in the hands of users operating independently of each other, to balance the load throughout the network. We illustrate this for a particular scenario, where the paths chosen are fixed and static, but chosen at random from a set of size $N$. We focus on the worst-case allocation, which is a measure of the fairness of the scheme. In the uncoordinated case, the worst-case allocation scales as $\log(\log N)/\log N$, independent of the number $b$ of paths chosen. In contrast, in the coordinated case where each user can balance its load across the $b$ paths available to it, provided there are two or more, the worst-case allocation is bounded away from zero. This demonstrates that

1. coordinated control balances loads significantly better than uncoordinated when paths are fixed.
2. coordinated improves on greedy least-loaded resource selection, as in Mitzenmacher [16], where the least-loaded selection of $b$ resources scales as $1/\log(\log N)$ for $b > 1$.

Effectively, coordinated control is able to shift the load among the resources and, with each user independently balancing loads over no more than two paths, is able to utilize the resources as if global load balancing was being performed.

This raises the question of how users should be assigned a set of paths to use. One natural path selection mechanism is to allow users to make their own choices. We study this as a game between users and consider a natural notion of a Nash equilibrium in this context, where users seek to selfishly maximize their own net utilities. We find that when users use coordinated controllers, the Nash equilibria coincide with welfare-maximizing social optima. When we consider uncoordinated controllers, then the results depend on whether the controllers exhibit RTT bias (like TCP) or not. When they do not exhibit RTT bias, the Nash equilibria also coincide with welfare-maximizing social optima. Otherwise they need not.

We show that increasing the number of paths available to a source destination pair is desirable from a performance perspective. However, the simultaneous use of a large number of paths may not be possible. We show that this does not pose a problem, as simple path selection policies that combine random path resampling with moving to paths with higher net benefit lead to welfare maximizing equilibria and also increase the throughput capacity of the network. In fact such a policy does as well as if each user uses all of the available paths simultaneously.

In summary, we shall provide some partial answers to our initial questions.

• In a large system, provided users re-select randomly from the sets of paths and shift to paths with higher net benefit, they can rely on a small number of paths and do as well as if they were fully using all available paths.
• Coordinated control has better fairness properties than uncoordinated in the static case. When combined with path reselection, uncoordinated control only does as well as a coordinated control if there is no RTT bias in the controllers.

We conclude the paper with some thoughts on how multipath rate control might be deployed.

2. THE MULTIPATH FRAMEWORK
The standard model of the network is as a capacitated graph $G = (V, E, C)$ where $V$ represents a set of end-hosts or routers, $E$ is a set of communication links and each link has a capacity, say in bits per second, $C_l$, $l \in E$. In addition a large population of sessions perform data transfers over the network. These sessions are partitioned into a set of session classes $S$ with $N_s$ sessions in class $s \in S$. Associated with class $s$ is a source $s_s$, a destination $d_s$, and a set of one or more, possibly overlapping paths, $R(s)$, between the source and destination that is made available to all class $s$ sessions. Finally, we associate an increasing, concave function with each session class, $U_s(x)$, which is the utility that a class $s$ session receives when it sends data at rate $x > 0$ from source to destination. Now, exactly how this utility is used and the meaning of $x$ depends on whether we are concerned with coordinated or uncoordinated control. We will shortly describe each of these in turn.

A discussion of how utility functions can be used to model standard TCP Reno is given in Kunniyur and Srikant [15]. The so-called weighted alpha-fair utility functions given by

$$U_r(x) = \begin{cases} w_r \dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \neq 1, \\ w_r \log x, & \alpha = 1, \end{cases}$$

were introduced in Mo and Walrand [17], and are linked to different notions of fairness. For example, $\alpha = 1$ corresponds to (weighted) proportional fairness [8], and $\alpha \to \infty$ to max–min fairness. TCP’s behavior is well approximated by taking $\alpha = 2$ and $w_r = 1/T_r^2$, where $T_r$ is the round trip time for path $r$, in the following sense: TCP achieves the maximum aggregate utility, for given paths and link capacities, for the corresponding utility functions $U_r$.

The set of paths available to a class $s$ session can potentially be very large. Hence a session will likely use only a small subset of these paths. We assume for now that every class $s$ session uses exactly $b_s$ paths. Let $c$ denote a subset of $R(s)$ that contains $b_s$ paths and $C(s)$ the set of all such subsets of paths, $C(s) = \{c : c \subseteq R(s) \wedge |c| = b_s\}$. Let $N_c$ denote the number of class $s$ sessions that use the set of paths $c \in C(s)$, $s \in S$, and hence $N_s = \sum_{c \in C(s)} N_c$. Last, let $N_r$ denote the number of sessions that use path $r \in R(s)$, $N_r = \sum_{c \in C(s)} 1(r \in c)\, N_c$, where $1(x)$ is the indicator function taking the value 1 when $x$ is true.

Associated with each class $s$ session is a congestion controller (rate controller) that determines the rates at which to send data over each of the $b_s$ paths available to it. We
now distinguish between coordinated and uncoordinated control.

Coordinated control. Given a set of paths $c$, a coordinated controller actively balances loads over all paths in $c$, taking into account the states of the paths. Our understanding of and ability to design such controllers relies on a significant advance made by Kelly et al. [8], which maps this problem into one of utility optimization. In the case of coordinated congestion control, the objective is to maximize the “social welfare,” that is to

$$\text{maximize} \quad \sum_{s \in S} \sum_{c \in C(s)} N_c\, U_s\Big(\sum_{r \in c} \lambda_{cr}\Big) \tag{2}$$

over $(\lambda_{cr} \ge 0)$ subject to the capacity constraints

$$\sum_{r:\, l \in r} \Lambda_r \le C_l, \quad \forall l \in E, \tag{3}$$

where $\lambda_{cr}$ is the sending rate of a class $s$ session that is using path $r$ in $c \in C(s)$. We will find it useful to represent the total rate contributed by class $s$ sessions that use path $r \in R(s)$ as $\Lambda_r = \sum_{c \in C(s):\, r \in c} N_c \lambda_{cr}$, and the aggregate rate achieved by a single class $s$ session over all paths in $c$ as $\lambda_c = \sum_{r \in c} \lambda_{cr}$.

Note that in the absence of restrictions on the number of paths used, $C(s) = \{R(s)\}$, and the optimization can be written

$$\text{maximize} \quad \sum_{s \in S} N_s\, U_s\Big(\sum_{r \in R(s)} \lambda_{sr}\Big) \tag{4}$$

subject to the capacity constraints. We shall see later in Section 5 that by using random path reselection the solution to (2) actually solves (4), and hence give conditions for when the restriction to using a subset of paths of limited size imposes no performance penalties.

More generally, we can replace the hard capacity constraints by a convex nondecreasing penalty function $\Gamma$. In the context of TCP, this penalty function can be thought of as capturing the signaling conveyed by packet losses or packet marking (ECN [19]) by the network to the sessions when link capacities are violated. Under this extension, the coordinated control problem transforms to

$$\text{maximize} \quad \sum_{s \in S} \sum_{c \in C(s)} N_c\, U_s(\lambda_c) - \Gamma(\Lambda). \tag{5}$$

There are many ways to approach the problem of designing controllers that solve these problems, but a very natural one is suggested by the TCP congestion control, which solves this variation of the above problem when each session is restricted to a single path (see Key et al. [11]).

uncoordinated control can be implemented motivates our study of it. In Kelly’s optimization formulation this corresponds to solving the following problem:

$$\text{maximize} \quad \sum_{s \in S} \sum_{c \in C(s)} N_c \sum_{r \in c} U_r(\lambda_{cr})$$

over nonnegative $\lambda_{cr}$ subject to the capacity constraints (3). As above, by analogy with (5) the constraints can be generalized to reflect the signaling used by a controller such as TCP. Note the difference between this formulation and that for coordinated control. In the case of the latter, the utility is applied to the aggregate sending rate whereas in the case of the former, the utility is evaluated on each path and then summed over all paths. Note also that we have written $U_r$ instead of $U_s$ for the uncoordinated controller, to reflect the fact that the congestion control may differ across different paths (as is the case with TCP, whose allocation depends on the RTT of the path).

The functions above are strictly concave and are being optimized over a convex feasible region. Hence the problems admit unique solutions in terms of aggregate per-class rates, even though distinct solutions may exist.

3. LOAD BALANCING PROPERTIES OF MULTIPATH
Multipath has been put forward as a mechanism that when used by all sessions can balance traffic loads in the Internet. It is impossible to determine whether this is universally true. However, we present in this section a simple scenario where this issue can be definitively resolved. We consider a simple scenario where there are $N$ resources with unit capacity ($C_l \equiv 1$).

To provide a concrete interpretation, the resources can be interpreted as servers, or as relay or access nodes (see Figure 2). There are $aN$ users. Each user selects $b$ resources at random from the $N$ available, where $b$ is an integer larger than one (the same resource may be sampled several times). We shall look at the worst-case rate allocation of users in two scenarios. In the first scenario, users implement uncoordinated multipath congestion control where there is no coordination between the $b$ distinct connections of each user. Thus, a connection sharing a resource handling $X$ connections overall achieves a rate allocation of exactly $1/X$. In the second scenario, each user implements coordinated multipath congestion control.

We take the worst-case user rate allocation (or throughput) as the load balance metric. One can show [13] that the more “unfair” the allocation, the greater the expected time to download a unit of data.

Figure 2. Load balancing example: there are N servers, aN users and
Uncoordinated control. As mentioned earlier, uncoor- each selects b > 1 servers at random.
dinated control corresponds to a session with path set c
executing independent rate controllers over each path in c. A B C
This is easily done in the current Internet by establishing
separate TCP connections over each path. The ease in which
b
The hard constraints in (3) can be written as the sum of penalty func-
tions, each of which is a step function Gl(x), with Gl(x) = 0 if x ≤ Cl and ∞
otherwise
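The two scenarios of Section 3 can be compared on a toy instance. The following is a minimal sketch, not the authors' code: the function names and the example instance are hypothetical. The uncoordinated rate follows the 1/X rule stated above. For the coordinated case the sketch assumes, as Section 3.2 argues, that the allocation is max–min fair, and it uses a Hall-type characterization (the worst-case rate is the minimum, over nonempty user sets, of the number of resources they reach divided by the number of users), which is evaluated here by brute force.

```python
from collections import Counter
from itertools import combinations


def uncoordinated_rates(choices):
    """Total rate per user when each connection gets 1/X of its resource.

    choices[i] lists the b resources picked by user i (repeats allowed).
    """
    load = Counter(r for picks in choices for r in picks)  # X for each resource
    return [sum(1.0 / load[r] for r in picks) for picks in choices]


def coordinated_min_rate(choices):
    """Worst-case rate under max-min fair sharing of unit-capacity resources.

    Assumed characterization: min over nonempty user sets S of
    (number of resources reachable from S) / |S|.
    Brute force over subsets, so only practical for a handful of users.
    """
    best = float("inf")
    for k in range(1, len(choices) + 1):
        for subset in combinations(range(len(choices)), k):
            reachable = set()
            for i in subset:
                reachable.update(choices[i])
            best = min(best, len(reachable) / k)
    return best


# Hypothetical instance: 3 users, 3 resources, b = 2.
example = [[0, 1], [0, 1], [0, 2]]
worst_uncoordinated = min(uncoordinated_rates(example))  # 1/3 + 1/2 for users 0 and 1
worst_coordinated = coordinated_min_rate(example)        # users 0 and 1 share two resources
```

On this instance the worst-off user gets 5/6 without coordination and 1 with it, because a coordinated user 2 can shift its traffic to resource 2 and leave resource 0 to the others.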
3.1. Uncoordinated congestion control

Denote by λi the total rate that user i obtains from all its connections. In the case of uncoordinated congestion control, we can show that the worst-case rate allocation, min λi, decreases like b² log(log N)/log N as N increases. This is to be compared with the worst-case rate allocation that one gets when b = 1, that is, when a single path is used: from classical balls and bins models,16 this also decreases as log(log N)/log N as N increases. It should come as no surprise that using more than two paths exhibits the same asymptotic performance as using only one path; there is no potential for balancing load within the network when all connections operate independently of each other. A formal statement and proof of this result can be found in Key et al.11

3.2. Coordinated congestion control

Here we assume as before that there are aN users, each selecting b resources at random from a collection of N available resources. Denote by λij the rate that user i obtains from resource j, and let R(i) denote the set of resources that user i accesses. In contrast with the previous situation, we now assume that the rates λij are chosen to maximize:

[Equation]

for some concave utility function U.

An interesting property of this problem is that the set of {λ*ij} that solves the above optimization is insensitive to the choice of utility function U so long as it is concave and increasing. Moreover, this insensitivity implies that the optimal aggregate user rates (λ*i) correspond to the max–min fair rate allocations (see Bertsekas and Gallager,3 Section 6.5.2). Simply stated, a rate allocation (λ*i) is said to be max–min fair if and only if an increase of any rate λ*i must result in the decrease of some already smaller rate. Formally, for any other feasible allocation (xi), if xi > λ*i then there must exist some j such that λ*j < λ*i and xj < λ*j. The above statements are easily verified by checking that the max–min fair allocation satisfies the Karush–Kuhn–Tucker conditions associated with the above optimization problem.

This leads to the following result.

Theorem 1. Assume there are N resources, and aN users each connecting to b resources selected at random. Denote by {λ*i} the optimal allocations that result. Then there exists x > 0, that depends only on a and b, such that:

[Equation]

The style of the proof has wide applicability and we outline it here: first, an application of Hall’s celebrated marriage theorem shows that the minimum allocation will be at least x provided that any set of users (of size n say) connect to at least x times as many servers (nx servers). If this condition is satisfied, the allocation (λ*i) will exceed x; hence it is sufficient to ensure that Hall’s condition is met with high probability for all nonempty subsets of 1, . . . , aN. Using the binomial theorem and the union bound yields an upper bound on the probability the condition fails to hold, and then Stirling’s approximation is used to approximate this bound.

This result says that the worst-case rate allocation is bounded away from zero as N tends to infinity, i.e., it is O(1) in the number of resources N. Thus coordinated control exhibits significantly better load balancing properties than does uncoordinated control. It is also interesting to compare this result to the result quoted by Mitzenmacher et al.,16 which says that if users arrive in some random order, and choose among their b candidate resources one with the lowest load, then the worst-case rate scales like 1/log(log N), which, unlike the allocation under coordinated control, goes to zero as N increases. The difference between the two schemes is that in Mitzenmacher’s scheme a choice has to be made immediately at arrival, which cannot be changed afterward, whereas a coordinated controller actively and adaptively balances load over the b paths, reacting to changes that may occur to the loads on the resources.

4. A PATH SELECTION GAME

In this section we address the following question. Suppose that each session is restricted to using exactly b paths, taken from a much larger set of possible paths: what is the effect of allowing each user to choose its b paths so as to maximize the benefit that it receives? To answer this question, we study a path selection game. Here each session is a player that greedily searches for throughput-optimal paths. We characterize the equilibrium allocations that ensue. We show that the same equilibria arise with coordinated congestion control and uncoordinated congestion control provided that the latter does not introduce RTT biases on the different paths. Moreover, these equilibria correspond to the optimal set of rates that solve problems (2) and (6), i.e., achieve welfare maximization. We shall use the models and notation of Section 2.

We shall restrict attention to when Ns is large, so that a change of paths by an individual player (session) does not significantly change the network performance. In game theory terms we are only considering non-atomic games.

4.1. Coordinated congestion control

For coordinated control, we use the model of Section 2, where the number of sessions Ns is fixed for all s, and introduce the following notion of a Nash equilibrium.

Definition 1. The nonnegative variables Nc, c ∈ C(s), s ∈ S, are a Nash equilibrium for the coordinated congestion control allocation if they satisfy the constraints ∑c Nc = Ns, and, moreover, for all s ∈ S, all c ∈ C(s), if Nc > 0, then the corresponding coordinated rate allocations satisfy

[Equation]

In other words, for each session (player), weight is only given to sets c that maximize the throughput for s. ◊
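The verbal reading of Definition 1 (weight is only given to path sets that maximize the throughput of the class) can be checked mechanically. The sketch below is illustrative only: the function name, the dictionary layout, and the tolerance are assumptions, and the equilibrium condition is taken to be "every path set with Nc > 0 achieves the maximum throughput available to its class", which is one reading of the sentence above rather than the exact displayed condition.

```python
def is_nash_equilibrium(weights, throughput, totals=None, tol=1e-9):
    """Check the assumed reading of Definition 1.

    weights[s][c]    -- N_c, the mass of class-s sessions using path set c
    throughput[s][c] -- coordinated rate a class-s session obtains on path set c
    totals[s]        -- optional N_s; if given, sum_c N_c must equal it
    """
    for s, n_by_set in weights.items():
        if any(n_c < 0 for n_c in n_by_set.values()):
            return False  # the N_c must be nonnegative
        if totals is not None and abs(sum(n_by_set.values()) - totals[s]) > tol:
            return False  # constraint sum_c N_c = N_s violated
        best = max(throughput[s][c] for c in n_by_set)
        for c, n_c in n_by_set.items():
            # A path set carrying weight must be throughput-optimal for class s.
            if n_c > 0 and throughput[s][c] < best - tol:
                return False
    return True


# Hypothetical single-class example with two candidate path sets.
rates = {"s": {("r1", "r2"): 2.0, ("r1", "r3"): 1.5}}
at_equilibrium = is_nash_equilibrium({"s": {("r1", "r2"): 3.0, ("r1", "r3"): 0.0}}, rates)
not_equilibrium = is_nash_equilibrium({"s": {("r1", "r2"): 1.0, ("r1", "r3"): 2.0}}, rates)
```

In a real system the throughputs themselves depend on the weights through congestion, so a full equilibrium computation would recompute `throughput` after every change of `weights`; the check here takes them as given.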
We then have the following.

Theorem 2. At a Nash equilibrium as in Definition 1, the path allocations λr solve the welfare maximization problem (4).

The proof follows since at a Nash equilibrium, type s players only use minimum “cost” paths, which can be shown to coincide with the Kuhn–Tucker conditions of (4). This result says that a selfish choice of path sets by end-users results in a solution that is socially optimal.

4.2. Uncoordinated control

We introduce the following notion of Nash equilibrium.

Definition 2. The collection of per path connection numbers Nr is a Nash equilibrium for selfish throughput maximization if it satisfies ∑r Nr = Ns, and furthermore, the allocations (6) are such that for all s ∈ S, all r ∈ R(s), if Nr > 0, then

[Equation]

... them. In this section we focus on two questions. The first regards how many paths, bs, to allow each class s user so as to enhance its performance and that of the system. We establish a monotonicity result for coordinated control in order to address this question. The second question regards how to manage the overhead that may ensue due to the need for a user to balance load actively over a large number of paths. Possibly surprisingly, we will show that it suffices for a user to maintain a small set of paths, say two (b = 2), provided that it repeatedly selects new paths at random and replaces the old paths with these paths when the latter provide higher throughput. It is interesting to point out that BitTorrent uses a strategy much like this where it “unchokes” a peer (tries out a new peer) and replaces the lowest performing of its existing four connections with this new connection if the latter exhibits higher throughput.

We will examine the above questions for both coordinated control and uncoordinated control. We begin with coordinated control.
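The keep-the-better-paths policy described above can be illustrated with a toy loop. This is a sketch under strong simplifying assumptions, not the controller analyzed in the paper: each path is given a fixed, hypothetical measured throughput (in a network the values would change with load), candidates are drawn uniformly at random, and the function and variable names are invented for the example.

```python
import random


def resample_paths(throughput, b=2, rounds=1000, seed=0):
    """Keep b paths; repeatedly try a random path and swap out the worst one.

    throughput maps a path name to its (assumed static) measured throughput.
    """
    rng = random.Random(seed)
    paths = list(throughput)
    current = rng.sample(paths, b)          # arbitrary initial path set
    for _ in range(rounds):
        candidate = rng.choice(paths)       # random path reselection
        if candidate in current:
            continue
        worst = min(current, key=throughput.get)
        # Replace the worst current path only if the new one performs better.
        if throughput[candidate] > throughput[worst]:
            current[current.index(worst)] = candidate
    return sorted(current)


# Hypothetical measurements for four candidate paths.
measured = {"p1": 1.0, "p2": 3.0, "p3": 0.5, "p4": 2.0}
kept = resample_paths(measured)
```

With static throughputs the session ends up holding the b best paths while tracking only b of them at any time, which is the intuition behind keeping the working set small.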
currently using, Nc(t) denoting the number of class s users actively using paths in c ⊂ R(s) at time t. Class s users actively using the set c of paths consider replacing their path set c with path set c′ according to a Poisson process with intensity Acc′. We shall assume that |c| = |c′| = b, i.e., the number of paths in an active set is fixed at b. Finally, assume that for each class s, any r ∈ R(s), any given set c ∈ C(s), there is some c′ such that r ∈ c′ and Acc′ is positive (recall that C(s) is defined as the collection of size b subsets of R(s)). This assumption states that all paths available to a class s session should be tried no matter what set of initial paths is given to that session.

We also have to concern ourselves with the sending rates of the different users as path reselection proceeds over time. Let λc(t) denote the data transfer rate for a user actively using path set c, λc(t) = ∑r∈c λc,r(t), where λc,r(t) is the sending rate along path r at time t. We have described in Key et al.11 a dynamic process where the vectors {Nc(t), λc,r(t)} change over time. This process is stochastic in nature and consequently difficult to model. However, if we assume that the population of users in each class is large, which is reasonable for the Internet, then we can model this process over time by a set of ordinary differential equations, representing the path reselection and rate adaptation dynamics of users over their active path sets. Under the condition that the utility functions and penalty functions are well behaved, we can show that Nc(t) converges to a limit Nc and λcr(t) converges to λcr as t tends to infinity. Remarkably, we can show that these limits are the maximizers of

[Equation (10)]

subject to ∑c∈C(s) Nc = Ns. In other words, this resampling process allows the system to converge to a state where the proportion of class s sessions using active path set c ∈ C(s) and the aggregate rates at which they use these paths maximize the aggregate sum of utilities. This is more precisely stated in the following theorem.

Theorem 4. Assume that the utility functions Us and the penalty function Γ are continuously differentiable on their domain, that the former are strictly concave increasing, and the latter convex increasing. Assume further that U′s(x) → 0 as x → ∞. Then (Nc, λc,r) converges to the set of maximizers of the welfare function (10) under the constraints ∑c∈C(s) Nc = Ns. The corresponding equilibrium rates (λr) are solutions of the coordinated welfare maximization problem (2).

The proof proceeds by showing that trajectories of the limiting ordinary differential equation are bounded, that welfare increases over time, and then using LaSalle’s invariance theorem to prove that the limiting points of these dynamics coincide with equilibrium points of the ordinary differential equation; showing that the equilibrium points coincide with the maximum of (10) completes the proof.

What makes this result especially useful is that benefit maximization on the part of a user conforms exactly to the user trying to maximize its rate through the path reselection process. Thus, this path reselection policy is easy to implement: at random times the session initiates data transfer using the coordinated rate controller over a new set of paths and measures the achieved throughput, dropping either the old path set or the new path set depending on which achieves lower throughput. This equivalence is a consequence of the assumption that the utility U is strictly concave and continuously differentiable.

5.2. Uncoordinated congestion control

As one might expect by now, the story is not as clean in the case of uncoordinated control, and no monotonicity result exists. Indeed, for a symmetric triangle network described in Key et al.,11 with three source-destination session types, allowing each session to use the two-link path as well as the direct path decreases throughput. However, random resampling is still beneficial provided that the uncoordinated control exhibits no RTT bias. If a session is given a set of paths to draw from, then the random resampling strategy described earlier maximizes welfare without the need to use all paths. Moreover, it suffices for sessions to use a greedy rate optimization strategy to determine which set of paths to keep in order to ensure welfare maximization. The reader is referred to Key et al.11 for further details.

6. DISCUSSION AND DEPLOYMENT

Till now, we have focused on networks supporting workloads consisting of persistent or infinite backlog flows. Moreover, the emphasis has been on the effect that multipath has on aggregate utility. In this section we consider workloads consisting of finite length flows that arrive randomly to the network. Our metric will be the capacity of the network to handle such flows. We will observe that several results from previous sections have their counterparts when we focus on finite flows.

As before, we represent a network as a capacitated undirected graph G = (V, E, C) supporting a finite set of flow classes, S, with attendant sets of paths {R(s)}. We assume that class s sessions arrive at rate αs according to a Poisson process and that they introduce independent and identical exponentially distributed workloads with a mean number of bits 1/μs. We introduce the notion of a capacity region for this network, namely the sets of {αs} and {μs} for which there exists some rate allocation over the paths available to the sessions such that the time required for sessions to complete their downloads is finite.

In the case of coordinated control, it is possible to derive the following monotonicity result with respect to the capacity region of the network. Consider a network G that supports a set S of flow classes with arrival rates {αs} and loads {μs}. Let {R(s)} and {R′(s)} be two collections of paths for these classes that satisfy R(s) ⊂ R′(s) for each s ∈ S and suppose that each session applies coordinated rate and path control over these paths. Then if {αs}, {μs} lie within the capacity region of the network with path sets {R(s)}, they lie in the capacity region of the network with path sets {R′(s)} as well.
Remark 2. It is easy to find examples where the capacity region strictly increases with the addition of more paths.

Remark 3. Although this result is stated for the case of exponentially distributed workloads, it is straightforward to show that it holds for any workload whose distribution is characterized by a decreasing failure rate. This includes heavy-tailed distributions such as Pareto.

It is interesting to ask the same question about the capacity region when uncoordinated control is used by all flows. Unfortunately, similar to the infinite session workload case, no such monotonicity property exists.

It is also interesting to ask the question as to which controller yields the larger capacity region. As in the case for finite flows, we can show that for a given network configuration (G, S, and R fixed), if {αs : s ∈ S}, {μs : s ∈ S} lie within the capacity region of the network when operating with an uncoordinated control, then they lie within the capacity region of the network when operating under coordinated control as well.

Remark 4. It is easy to construct cases where the converse is not true. For instance, the symmetric triangle with single and two-link routing mentioned for fixed flows is such an example (see Key and Massoulié12).

We conclude from this monotonicity property for coordinated control that more is better. However, improved capacity comes at the cost of increased complexity at the end-host, namely maintenance of state for each path and executing rate controllers over each path. Fortunately, as in the case of infinite backlogged sessions, this is not necessary. It suffices for a session to maintain a small set of paths, say two paths, and continually try out random paths from the set of paths available to it, and drop the path which provides it with the poorest performance, say throughput. Note the similarity of this process to that of BitTorrent, which periodically drops the connection providing the lowest throughput and replaces it with a random new connection. Interestingly enough, this multipath algorithm coupled with random resampling achieves the same capacity region as one that requires flows to utilize all paths. Indeed, we can prove the analogue of the result of Section 5.1.

Theorem 5. Assume that class s sessions use all paths from R(s). Assume the set of loads {αs} and {μs} lies within the network capacity region. Consider an approach where a class s session uses a subset of paths from R(s), randomly samples a new path set according to a Poisson process with rate γs and drops the worst of the two path sets. Then {αs} and {μs} also lie within the capacity region when flows use this resampling approach in the limit as γs → ∞.

Figure 3 illustrates and summarizes our capacity results. As before, it is also interesting to ask about the effect of uncoordinated control coupled with random sampling on capacity. Surprisingly enough, uncoordinated control on a small set of paths coupled with random resampling can often increase capacity over uncoordinated control over the entire set of paths.

Figure 3. Capacity region under multipath with and without resampling. [Figure labels: maximum capacity; multipath; two paths + resampling; two paths; single path.]

6.1. Deployment

To effectively deploy multipath, key ingredients are first, diversity, which is achieved through a combination of multihoming and random path sampling, and second, path selection and multipath streaming using a congestion controller that actively streams along the best paths from a working set. Although home-users are currently often limited in their choice of Internet Service Provider (ISP) and hence cannot multihome, in contrast campus or corporate nodes often have diverse connections, via different ISPs or through 3G wireless and wired connectivity. Moreover, the growth of wireless hotspots, wireless mesh and broadband wireless in certain parts of the globe means that even home-users may become multihomed in the future. Recent figures1 suggest that 60% of stub-ASes (those which do not transit traffic) are multihomed, and de Launois5 claims that with IPv6 type multihoming there are at least two disjoint paths between such stub-ASes.

The multipath controllers we have outlined need to be put into practice. Some high-level algorithm designs are considered in Kelly and Voice10 and Han et al.,7 and practical questions are addressed in Raiciu et al.18 Translating from algorithms derived from fluid models to practical packet-based implementations does require care; however, we believe this to be perfectly feasible in practice. Indeed, the IETF has a current Multipath TCP working group, which is looking into adding multipath into TCP.

7. SUMMARY

There are potentially significant gains from combining multipath routing with congestion control. Two different flavors of control are possible: one which coordinates transfers across the multiple paths; and another, uncoordinated control, which sets up parallel connections. The uncoordinated approach is simpler to implement; however, it can suffer from poorer performance, while coordinated control is better performing and intrinsically “fairer.” We have contrasted the two types of control, and shown that with fixed path choices uncoordinated control can yield inferior
performance, halving throughput in one example.

If path-choices are allowed to be chosen optimally or “selfishly” by the end-system, then coordinated control reaches the best systemwide optimum; as indeed does uncoordinated control, but only if the control objective is the same for all paths (unlike current TCP), and also only if all users agree to use the same number of parallel paths (connections). This optimum can also be reached by limiting each session to a small number of path choices (e.g., 2) but allowing paths to be resampled and better paths to replace existing ones.

This suggests that good design choices for multipath controllers are coordinated controllers or uncoordinated controllers with the RTT bias removed.

Acknowledgment

This work was supported in part by the NSF under award CNS-0519922.

References

1. Agarwal, S., Chuah, C.-N., Katz, R. OPCA: Robust interdomain policy routing and traffic control. In Proceedings of the IEEE Openarch (April 2003).
2. Andersen, D., Balakrishnan, H., Kaashoek, F., Rao, R. Improving Web availability for clients with MONET. In Proceedings of the NSDI 2005 (July 2005).
3. Bertsekas, D., Gallager, R. Data Networks. Longman Higher Education, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1992.
4. Cohen, B. Incentives build robustness in BitTorrent. In Proceeding of P2P Economics workshop (June 2003).
5. de Launois, C., Quoitin, B., Bonaventure, O. Leveraging network performance with IPv6 multihoming and multiple provider-dependent aggregatable prefixes. Comput. Netw. 50, 8 (2006), 1145–1157.
6. Gummadi, K., Madhyastha, H., Gribble, S., Levy, H., Wetherall, D. Improving the reliability of Internet paths with one-hop source routing. In Proceedings of the 6th OSDI (Dec. 2004).
7. Han, H., Shakkottai, S., Hollot, C., Srikant, R., Towsley, D. Multi-path TCP: a joint congestion control and routing scheme to exploit path diversity in the Internet. IEEE/ACM Trans. Netw. 14, 6 (Dec. 2006), 1260–1271.
8. Kelly, F., Maulloo, A., Tan, D. Communication networks: shadow prices, proportional fairness and stability. J. Oper. Res. Soc. 49 (1998), 237–252.
9. Kelly, F.P. Mathematical modelling of the Internet. Mathematics Unlimited – 2001 and Beyond. B. Engquist and W. Schmid, eds. Springer-Verlag, New York, 2001, 685–702.
10. Kelly, F.P., Voice, T. Stability of end-to-end algorithms for joint routing and rate control. ACM SIGCOMM Comput. Comm. Rev. 35, 2 (2005), 5–12.
11. Key, P., Massoulié, L., Towsley, D. Path selection and multipath congestion control. In INFOCOM07 (May 2007).
12. Key, P., Massoulié, L. Fluid models of integrated traffic and multipath routing. Queueing Syst. 53, 1 (June 2006), 85–98.
13. Key, P., Massoulié, L., Towsley, D. Multipath routing, congestion control and load balancing. In ICASSP 2007 (Apr. 2007).
14. Kodialam, M., Lakshman, T., Sengupta, S. Efficient and robust routing of highly variable traffic. In HotNets (2004).
15. Kunniyur, S., Srikant, R. End-to-end congestion control schemes: utility functions, random losses and ECN marks. In INFOCOM 2000 (2000).
16. Mitzenmacher, M., Richa, A., Sitaraman, R. The power of two random choices: a survey of the techniques and results. Handbook of Randomized Computing. P. Pardalos, S. Rajasekaran, and J. Rolim, eds. Kluwer Academic Publishers, Dordrecht, 2001, 255–312.
17. Mo, J., Walrand, J. Fair end-to-end window based congestion control. In SPIE 98, International Symposium on Voice, Video and Data Communications (1998).
18. Raiciu, C., Wischik, D., Handley, M. Practical congestion control for multipath transport protocols. UCL Technical Report (2010).
19. Ramakrishnan, K., Floyd, S., Black, D. The addition of explicit congestion notification (ECN) to IP. Technical Report RFC3168, IETF (Sept. 2001).
20. Srikant, R. The Mathematics of Internet Congestion Control. Birkhauser, Boston, 2003.
21. Zhang-Shen, R., McKeown, N. Designing a predictable Internet backbone network with Valiant load-balancing. In IWQoS (June 2005).

Peter Key (peter.key@microsoft.com), Microsoft Research, Cambridge, UK.
Laurent Massoulié (laurent.massoulie@technicolor.com), Thomson Technology Paris Laboratory, 1, Issy-les-Moulineaux-Moulineau, France.
Don Towsley (towsley@cs.umass.edu), Department of Computer Science, University of Massachusetts, Amherst, MA.

© 2011 ACM 0001-0782/11/0100 $10.00
careers
Air Force Institute of Technology (AFIT) Possess, or complete by September 2011, a Ph.D. Associate Professors at Columbia are aca-
Dayton, Ohio in Computer Science or closely related area. Dem- demic officers holding the doctorate or its pro-
Department of Electrical and Computer onstrate strong English communication skills, a fessional equivalent who have demonstrated
Engineering commitment to actively engage in the teaching, scholarly and teaching ability and show great
Graduate School of Engineering and research and curricular development activities promise of attaining distinction in their fields of
Management of the department at both undergraduate and specialization.
Faculty Positions in Computer Science or graduate levels, and ability to work with a diverse Professors at Columbia are academic officers
Computer Engineering student body and multicultural constituencies. holding the doctorate or its professional equiva-
Ability to teach a broad range of courses, and to lent who are widely recognized for their distinc-
The Department of Electrical and Computer En- articulate complex subject matter to students at tion. Candidates for senior-level appointment
gineering is seeking applicants for tenure track all educational levels. First consideration will be must have a distinguished record of achieve-
positions in computer science or computer engi- given to completed applications received no later ment and evidenced by leadership in their field
neering. The department is particularly interest- than December 15, 2010. Contact: Faculty Search of expertise, publications, professional recogni-
ed in receiving applications from individuals with Committee, Computer Science Department, Cal tion, as well as a commitment to excellence in
strong backgrounds in formal methods (with em- Poly Pomona, Pomona, CA 91768. Email: cs@ teaching.
phasis on cryptography), software engineering, csupomona.edu. Cal Poly Pomona is an Equal Candidates must have a Ph.D. degree, DES, or
bioinformatics, computer architecture/VLSI sys- Opportunity, Affirmative Action Employer. Posi- equivalent degree by the starting date of the ap-
tems, and computer networks and security. The tion announcement available at: http://academic. pointment and are expected to establish a strong
positions are at the assistant professor level, al- csupomona.edu/faculty/positions.aspx. Lawful research program and excel in teaching both un-
though qualified candidates will be considered at authorization to work in US required for hiring. dergraduate and graduate courses.
all levels. Applicants must have an earned doctor- Our department of 36 tenure-track faculty and
ate in computer science or computer engineering 1 lecturer attracts excellent Ph.D. students, virtu-
or closely related field and must be U.S. citizens. Carnegie Mellon University ally all of whom are fully supported by research
These positions require teaching at the gradu- School of Design grants. The department has active ties with ma-
ate level as well as establishing and sustaining a IxD Faculty Position jor industry partners including Adobe, Autodesk,
strong research program. Canon, Disney, Dreamworks, Microsoft, Nvidia,
AFIT is the premier institution for defense-re- School of Design at Carnegie Mellon Google, Sony, Weta, Yahoo! and also to the nearby
lated graduate education in science, engineering, University research laboratories of AT&T, Google, IBM (T.J.
advanced technology, and management for the U.S. IxD Faculty Position Watson), NEC, Siemens, Telcordia Technologies
Air Force and the Department of Defense (DoD). Application deadline December 3, 2010 and Verizon. Columbia University is one of the
Full details on these positions, the department, Submit application to johnz@cs.cmu.edu. leading research universities in the United States,
and application procedures can be found at: http:// View complete job description at http://bit. and New York City is one of the cultural, finan-
www.afit.edu/en/eng/employment_faculty.cfm ly/cGrgeC. cial, and communications capitals of the world.
Review of applications will begin immediately Columbia’s tree-lined campus is located in Morn-
and will continue until the positions are filled. ingside Heights on the Upper West Side.
The United States Air Force is an equal opportu- Columbia University Applicants should apply online at:
nity, affirmative action employer. Department of Computer Science academicjobs.columbia.edu/applicants/
Tenured or Tenure-Track Faculty Positions Central?quickFind=54003
California State University, Fullerton The Department of Computer Science at Colum- and should submit electronically the following:
Assistant Professor bia University in New York City invites applica- curriculum-vitae including a publication list,
tions for tenured or tenure-track faculty positions. a statement of research interests and plans, a
The Department of Computer Science invites ap- The search committee is especially interested in statement of teaching interests, names with con-
plications for a tenure-track position at the Assis- candidates who through their research, teaching, tact information of three references, and up to
tant Professor level starting fall 2011. For a com- and/or service will contribute to the diversity and four pre/reprints. Applicants can consult www.
plete description of the department, the position, excellence of the academic community. Appoint- cs.columbia.edu for more information about the
desired specialization and other qualifications, ments at all levels, including assistant professor, department.
please visit http://diversity.fullerton.edu/. associate professor and full professor, will be The position will close no sooner than Decem-
considered. Priority themes for the department ber 31, 2010, and will remain open until filled.
include Computer Systems, Software, Artificial Columbia University is an Equal Opportunity/Af-
Cal Poly Pomona Intelligence, Theory, and Computational Biology. firmative Action employer
Assistant Professor Candidates who work in specific technical areas
including, but not limited to, Computer Graph-
The Computer Science Department invites ap- ics, Human-Computer Interaction, Simulation, DePaul University
plications for a tenure-track position at the rank and Animation, with research programs that can Assistant/Associate Professor
of Assistant Professor to begin Fall 2011. We are significantly impact the above priority themes are
particularly interested in candidates with spe- particularly welcome to apply. Candidates doing The School of Computing at DePaul University
cialization in Software Engineering, although research at the interface of computer sciences invites applications for a tenure-track position
candidates in all areas of Computer Science will and the life sciences and the physical sciences are in distributed systems. We seek candidates with
be considered, and are encouraged to apply. Cal also encouraged to apply. a research interest in data-intensive distributed
Poly Pomona is 30 miles east of L.A. and is one of Assistant Professors at Columbia are aca- systems, cloud computing, distributed databas-
23 campuses in the California State University. demic officers holding the doctorate or its profes- es, or closely related areas. For more information,
The department offers an ABET-accredited B.S. sional equivalent who are beginning a career of see https://facultyopportunities.depaul.edu/ap-
program and an M.S. program. Qualifications: independent scholarly research and teaching. plicants/Central?quickFind=50738.
Eastern New Mexico University strong teaching credentials. Women and minori- and systems and novel storage and computing
Instructor of Computer Science ties are encouraged to apply. architectures. Ideal candidate would work both
Application review will begin January 18, 2011 independently and as a part of a storage architec-
For more information visit www.enmu.edu/ser- and continue until the position is filled. ture team in conceiving, prototyping and guiding
vices/hr or call (575) 562-2115. All employees Interested applicants must apply online: development of new storage related projects and
must pass a pre-employment background check. http://goucher.interviewexchange.com/can- ideas, as well as writing invention disclosures, ac-
AA/EO/Title IX Employer dapply.jsp?JOBID=21846 ademic papers and publications and participating
in scientific societies and industry associations.
Please submit the following application ma- Postdoctoral candidates are welcome to apply and
Eastern Washington University terials online: propose more specific research programs.
Tenure-track Position • Curriculum Vitae
• Cover letter Job Requirements
The Computer Science Department at Eastern • A personal statement describing your interest PhD in Computer Science with a proven publica-
Washington University invites applications for a in teaching at a small liberal arts college tion track record and implementation experience
tenure-track position starting Sept 2011. Please in system architecture, operating systems, file
visit: http://access.ewu.edu/HRRR/Jobs.xml for Three letters of recommendation and official systems, and embedded systems.
complete information. For questions contact graduate transcripts should be forwarded separate- Send applications (a full Curriculum Vitae
Margo Stanzak (509) 359-4734 ly to: Human Resources, Goucher College, 1021 Du- and short description of research interests) to
laney Valley Road, Baltimore, MD 21204-2794. Zvonimir.Bandic@hitachigst.com
Goucher College is an
Goucher College Equal Opportunity Employer.
Visiting Assistant Professor, Computer Science Illinois Institute of Technology
Department of Computer Science
Applications are invited for a three-year visiting Hitachi Research
assistant professor position beginning August Research Scientist Applications are invited for a tenure-track assistant
2011. This is a one-year appointment, renewable Storage Architecture professor position in Computer Science beginning
for up to two additional years. A Ph.D. in comput- Fall 2011. Excellence in research, teaching and ob-
er science, or a closely related field, is preferred Hitachi San Jose Research Center is a premier taining external funding is expected. While strong
(non-Ph.D. applicants must be ABD). research center with more than 100 scientists candidates from all areas of computer science will
Applicants should have experience teaching a working in many exciting fields including stor- be considered, applicants from general data areas
wide range of courses at all levels of the computer age architecture, consumer electronics, storage such as database, data mining, information secu-
science curriculum. Preference will be given to technology and nanotechnology. The job opening rity, information retrieval, and data understanding
applicants with a systems background, but appli- is in the area of research of storage architecture and processing are especially encouraged.
cants from all areas of computer science will be and systems, more specifically operating systems, The Department offers B.S., M.S., and Ph.D.
considered. Applicants are expected to present novel file systems, reliability of storage devices degrees in Computer Science and has research
strengths in distributed systems, information re- International Computer Applications should include a resume, select-
trieval, computer networking, intelligent informa- Science Institute ed publications, and names of three references.
tion systems and algorithms. The Illinois Institute Director Review begins February 1, 2011; candidates are
of Technology, located within 10 minutes of down- urged to apply by that date.
town Chicago, is a dynamic and innovative institu- The International Computer Science Institute To learn more about ICSI, go to http://www.
tion. The Department has strong connections to (ICSI), an independent non-profit laboratory icsi.berkeley.edu.
Fermi and Argonne National Laboratories, and to closely affiliated with the EECS Department, Uni- To apply for this position, send the above ma-
local industry, and is on a successful and aggressive versity of California, Berkeley (UCB), invites ap- terial to apply@icsi.berkeley.edu. Recommend-
recruitment plan. IIT is an equal opportunity/affir- plications for the position of Director, beginning ers should send letters directly to apply@icsi.
mative action employer. Women and Underrepre- Fall 2011. berkeley.edu by 2/1/2011. ICSI is an Affirmative
sented Minorities are strongly encouraged to apply. The ICSI Director’s primary responsibilities Action/Equal Opportunity Employer. Applica-
Evaluation of applications will start on De- are to: oversee and expand ICSI’s research agen- tions from women and minorities are especially
cember 1, 2010 and will continue until the posi- da; act as a high-level external evangelist for ICSI encouraged.
tion is filled. Applicants should submit a detailed research; identify and pursue strategic funding
curriculum vita, a statement of research and opportunities; and strengthen ICSI’s relationship
teaching interests, and the names and email ad- with UCB. The Director reports directly to ICSI’s Kansas State University
dresses of at least four references to: Board of Trustees. Department of Computing and
Computer Science Faculty Search Committee ICSI is recognized for world-class research Information Sciences
Department of Computer Science activities in networking, speech, language and Associate/Full Professor
Illinois Institute of Technology vision processing, as well as computational biol-
10 W. 31st Street ogy and computer architecture. Several of ICSI’s The department of Computing and Information
Chicago, IL 60616 research staff have joint UCB appointments, and Sciences at Kansas State University invites appli-
Phone: 312-567-5152 many UCB graduate students perform their re- cations for a position beginning in Fall 2011 at
Email: search@cs.iit.edu search at ICSI. In addition, ICSI places significant the level of Associate or Full Professor from can-
http://www.iit.edu/csl/cs emphasis on international partnerships and visit- didates working in the areas of high assurance
ing scholar programs. computing, program specification and verifica-
ICSI is seeking a Director with sufficient tion, and formal methods.
Ingram Content Group breadth, interest, and professional connections Kansas State University is committed to the
Development Technical Lead to promote and augment ICSI’s ongoing research growth and excellence of the CIS department. The
efforts. Applicants should have recognized re- department offers a stimulating environment
Working with a 400-core processing grid & tera- search leadership, as well as a strong record in for research and teaching, and has several ongo-
scale computing. Leadership of internally developed research management and demonstrated suc- ing collaborative projects involving researchers
systems &/or business applications. Key duties are cess at government and industrial fundraising. in different areas of computer science as well as
programming & debugging. View full job descrip- Experience with international collaboration and other engineering and science departments. The
tion & apply online at www.ingramcontent.com. fundraising is a plus. department has a faculty of nineteen, more than
100 graduate students, and 250 undergraduate puter science to establish a record of scholarly, of national origin or citizenship. The working
students and offers BS, MS, MSE, and PhD de- peer-reviewed work, and providing service to language is English; knowledge of the German
grees. Computing facilities include a large net- the department and the University. The position language is not required for a successful career at
work of servers, workstations and PCs with more comes with dedicated funding for course releas- the institute.
than 300 machines and a Beowulf cluster with es as well as a professional development fund. The institute is located in Kaiserslautern and
1000+ processors. The department building has The course releases will mean a reduced load Saarbruecken, in the tri-border area of Germany,
a wireless network and state-of-the-art media- in the 1st-3rd and 5th years, and a guaranteed France and Luxembourg. The area offers a high
equipped classrooms. The department hosts sev- early research sabbatical in the 4th year after standard of living, beautiful surroundings and
eral laboratories for embedded systems, software successful midterm review. More information is easy access to major metropolitan areas in the
analysis, robotics, computational engineering available at www.loyola.edu/cs and www.loyola. center of Europe, as well as a stimulating, com-
and science, and data-mining. Details of the CIS edu. Applicants must submit the following on- petitive and collaborative work environment. In
Department can be found at the URL http://www. line (http://careers.loyola.edu): a letter of appli- immediate proximity are the MPI for Informatics,
cis.ksu.edu/. cation listing teaching and research interests, Saarland University, the Technical University of
Applicants must be committed to both teach- a curriculum vitae, and contact information for Kaiserslautern, the German Center for Artificial
ing and research, and have an excellent research three references. For full consideration applica- Intelligence (DFKI), and the Fraunhofer Insti-
and teaching track record. Applicants should have tions must be received by January 31, 2011. Ap- tutes for Experimental Software Engineering and
a PhD degree in computer science or related dis- ply URL: https://careers.loyola.edu/applicants/ for Industrial Mathematics.
ciplines; salary will be commensurate with quali- Central?quickFind=52395 Qualified candidates should apply online at
fications. Applications must include descriptions http://www.mpi-sws.org/application. The review
of teaching and research interests along with cop- of applications will begin on January 3, 2011, and
ies of representative publications. Max Planck Institute for Software applicants are strongly encouraged to apply by
Preference will be given to candidates who Systems (MPI-SWS) that date; however, applications will continue to
will complement the existing areas of strength Tenure-track openings be accepted through January 2011.
of the department which include high assurance The institute is committed to increasing the
systems, tools for developing, testing and verify- Applications are invited for tenure-track and representation of minorities, women and individ-
ing software systems, static analysis, model-driv- tenured faculty positions in all areas related uals with physical disabilities in Computer Sci-
en computing, programming languages, security, to the study, design, and engineering of soft- ence. We particularly encourage such individuals
and medical device software. ware systems. These areas include, but are not to apply.
Please send applications to Chair of the Re- limited to, data and information management,
cruiting Committee, Department of Computing programming systems, software verification,
and Information Sciences, 234 Nichols Hall, Kan- parallel, distributed and networked systems, Mississippi State University
sas State University, Manhattan, KS 66506 (email: and embedded systems, as well as cross-cutting Head
Recruiting@cis.ksu.edu). Review of applications areas like security, machine learning, usabil- Department of Computer Science and
will commence January 3rd, 2011 and continue ity, and social aspects of software systems. A Engineering
until the position is filled. doctoral degree in computer science or related
Kansas State University is an Equal Opportu- areas and an outstanding research record are Applications and nominations are being sought
nity Employer and actively seeks diversity among required. Successful candidates are expected to for the Head of the Department of Computer Sci-
its employees. Background checks are required. build a team and pursue a highly visible research ence and Engineering (www.cse.msstate.edu) at
agenda, both independently and in collabora- Mississippi State University. This is a 12-month
tion with other groups. Senior candidates must tenure-track position.
Lingnan University have demonstrated leadership abilities and rec- The successful Head will provide:
Chair Professor / Professor ognized international stature. • Vision and leadership for nationally recognized
MPI-SWS, founded in 2005, is part of a net- computing education and research programs
The Department of Computing and Decision Sci- work of eighty Max Planck Institutes, Germany’s • Exceptional academic and administrative
ences at the Lingnan University is seeking a Chair premier basic research facilities. MPIs have an skills
Professor/Professor with outstanding teaching established record of world-class, foundational • A strong commitment to faculty recruitment
and research experience in one or more of the fol- research in the fields of medicine, biology, chem- and development
lowing areas: Information Systems, Operations istry, physics, technology and humanities. Since
Management, Management Science and Statis- 1948, MPI researchers have won 17 Nobel prizes. Applicants must have a Ph.D. in computer
tics. Please visit http://www.ln.edu.hk/job-vacan- MPI-SWS aspires to meet the highest standards of science, software engineering, computer en-
cies/acad/10-170 for details and quote post ref: excellence and international recognition with its gineering, or a closely related field. The suc-
10/170/CACM in Form R1. research in software systems. cessful candidate must have earned national
To this end, the institute offers a unique envi- recognition by a distinguished record of accom-
ronment that combines the best aspects of a uni- plishments in computer science education and
Loyola University Maryland versity department and a research laboratory: research. Demonstrated administrative experi-
Assistant Professor, Computer Science a) Faculty receive generous base funding to build ence is desired, as is teaching experience at both
and lead a team of graduate students and post- the undergraduate and graduate levels. The suc-
Loyola University Maryland invites applications docs. They have full academic freedom and pub- cessful candidate must qualify for the rank of
for the position of Clare Boothe Luce Professor lish their research results freely. professor.
in the Department of Computer Science, with b) Faculty supervise doctoral theses, and have the Please provide a letter of application outlin-
an expected start date of fall 2011 at the level of opportunity to teach graduate and undergraduate ing your experience and vision for this position, a
Assistant Professor. We are seeking an enthusi- courses. curriculum vita, and names and contact informa-
astic individual committed to excellent teaching c) Faculty are provided with outstanding techni- tion of at least three professional references. Ap-
and a continuing, productive research program. cal and administrative support facilities as well plication materials should be submitted online at
A Ph.D. in Computer Science, Computer En- as internationally competitive compensation http://www.jobs.msstate.edu/.
gineering, or a closely related field is required. packages. Screening of candidates will begin February
Candidates in all areas of specialization will be MPI-SWS currently has 8 tenured and tenure- 15, 2011 and will continue until the position is
considered. The position is restricted by the track faculty, and is funded to support 17 faculty filled. Mississippi State University is an AA/EOE
Clare Boothe Luce bequest to the Henry Luce and about 100 doctoral and post-doctoral posi- institution. Qualified minorities, women, and
Foundation to women who are U.S. citizens. tions. Additional growth through outside funding people with disabilities are encouraged to apply.
Duties of the position include teaching under- is possible. We maintain an open, international Please direct any questions to Dr. Nicolas
graduate and professional graduate computer and diverse work environment and seek applica- Younan, Search Committee Chair (662-325-3912
science courses, conducting research in com- tions from outstanding researchers regardless or younan@ece.msstate.edu).
National Taiwan University tems. The group is seeking candidates with an in- Candidates will be considered from all major
Professor-Associate Professor-Assistant terest in file and storage systems, cloud comput- disciplines in computer and information science.
Professor ing, or related areas. Applicants must have a PhD We particularly welcome candidates who can
in CS and a strong publication record in the above contribute to our strong research groups in soft-
The Department of Computer Science and Infor- topics. Required skills are: ware reliability (formal methods, programming
mation Engineering has faculty openings at all • proactive and assume leadership in proposing languages, software engineering) and systems
ranks beginning in August 2011. Highly qualified and executing innovative research projects and networks.
candidates in all areas of computer science/engi- • develop advanced prototypes leading to dem- The College maintains a strong research pro-
neering are invited to apply. A Ph.D. or its equiva- onstration in industrial environment gram with significant funding from the major
lent is required. Applicants are expected to con- • initiate and maintain collaborations with aca- federal research agencies and private industry.
duct outstanding research and be committed to demic and industrial research communities The College has a diverse full-time faculty of 30.
teaching. Candidates should send a curriculum Four faculty members have joint appointments
vitae, three letters of reference, and supporting Postdoctoral Researchers with other disciplines, specifically, electrical and
materials before February 28, 2011, to Prof Kun- The Machine Learning group conducts research computer engineering, health sciences, physics
Mao Chao, Department of Computer Science on various aspects of machine intelligence, from and political science, and contribute to interdis-
and Information Engineering, National Taiwan the exploration of new algorithms to applications ciplinary initiatives in information assurance,
University, No 1, Sec 4, Roosevelt Rd., Taipei 106, in data mining and semantic comprehension. network science and health informatics. The Col-
Taiwan. Ongoing projects focus on text and video analy- lege has approximately 520 undergraduates, 350
sis, digital pathology, and bioinformatics. The Masters, and 65 Ph.D. students.
group is seeking postdoctoral researchers with Northeastern University has made major in-
NEC Laboratories America, Inc PhD in CS and experience in bioinformatics (em- vestments over the course of the last several years
Research Staff Positions phasis on genomics or proteomics a plus) or text in the broad areas of Health, Security and Sustain-
analysis and/or text mining. Required skills and ability. The College has been a major participant
NEC Laboratories America, Inc. is a vibrant in- experience are: in the recruitment of faculty who can contribute
dustrial research center, conducting research • Strong publication record in top machine to these themes and will continue to do so this
in support of NEC’s U.S. and global businesses. learning, data mining or related conferences and year as well with an additional three interdisci-
Our research program covers many areas, reflect- journals plinary searches ongoing in Health Informatics,
ing the breadth of NEC business, and maintains • Solid knowledge in math, optimization, and Information Assurance and Game Design and In-
a balanced mix of fundamental and applied statistical inference teractive Media.
research. We have openings in the following re- • Hands-on experiences in implementing large- Northeastern University is located on the Av-
search areas: scale learning algorithms and systems enue of the Arts in Boston’s historic Back Bay. The
• Good problem solving skills, with strong soft- College occupies a state of the art building oppo-
Research Staff Members ware knowledge site Boston’s Museum of Fine Arts.
The Large-Scale Distributed Systems group con- Additional information and instructions for
ducts advanced research in the area of design, Associate Research Staff Members submitting application materials may be found at
analysis, modeling and evaluation of distributed Candidates for Associate Research Staff Member the following web site: http://www.ccs.neu.edu/.
systems. Our current focus is to create innovative in the Computing Systems Architecture depart- Screening of applications begins immediately
technologies to build next generation large-scale ment must have an MS in CS/CE or EE with strong and will continue until the search is completed.
computing platforms, and to simplify and au- motivation and skill set to prototype/transfer in- Northeastern University is an Equal Opportu-
tomate the management of complex IT systems novate research results into industry practice. Ex- nity/Affirmative Action Employer. We strongly en-
and services. The group is seeking research staff pertise in at least one of the above parallel com- courage applications from women and minorities.
members in the area of distributed systems and puting areas is desirable.
networks. The candidates must have a PhD in CS/ The Storage Systems department is seeking
CE with strong publication records on the follow- applicants for an Associate Research Staff Mem- Northeastern University
ing topics: ber. The successful candidate will have an MS in Open Rank - Interdisciplinary
• distributed systems and networks CS or equivalent and the following skills: Northeastern University is seeking a faculty
• operating systems and middleware • Solid understanding of operating systems member at an open rank for an interdisciplinary
• performance, reliability, dependability and • Experience in systems programming under appointment in the College of Computer and In-
security Linux/Unix formation Science and the College of Arts, Media
• data centers and cloud computing • Experience with performance evaluation and and Design to start in the Fall of 2011.
• virtualization and system management tuning The successful candidate will contribute to
• system modeling and statistical analysis • Strong algorithms, data structures and multi- shaping the research, academic, and develop-
threaded programming experience ment goals of the cross-disciplinary areas of
The Computing Systems Architecture depart- • Good knowledge of C++ and OOD/OOP Game Design and Interactive Media at both the
ment seeks to innovate, design, evaluate, and • Proactive with can-do attitude and work well in undergraduate and the graduate levels.
deliver parallel systems for high-performance, small teams It is expected that the candidate for this po-
energy-efficient enterprise computing. The group sition will possess an excellent track record in
is seeking senior and junior level research staff For more information about NEC Labs and these research/scholarship, publication, grant acqui-
as follows. Candidates for Research Staff Mem- openings, access http://www.nec-labs.com and sition, and teaching. A terminal degree, either
ber must have a PhD in CS/CE or EE with strong submit your CV and research statement through PhD or MFA depending on the candidate’s field,
research record and excellent credentials in the our career center. is required.
international research community. Applicants EOE/AA/MFDV Contact: Terrence Masson - Email: t.masson@
must demonstrate competency in at least one of neu.edu
the following areas:
• heterogeneous cluster architectures Northeastern University
• parallel programming models and runtimes Boston, Massachusetts Northeastern University, Boston,
• key technologies to accelerate performance College of Computer and Information Science Massachusetts
and low power consumption of enterprise appli- Full or Associate Professor - Health
cations on heterogeneous clusters We Invite applications for tenure-track faculty posi- Informatics and Interfaces
tions in computer science and information science,
The Storage Systems department engages in beginning in Fall 2011. Applicants at all ranks will The College of Computer and Information Sci-
research in all aspects of storage systems with an be considered. A PhD in computer science, infor- ence and the Bouvé College of Health Sciences
emphasis on large scale reliable distributed sys- mation science or a related field is required. invite applications for a faculty position in Health
Informatics. A Ph.D. level degree in Health or Please direct inquiries to Professor Stephen for the tenure-track position of Assistant Profes-
Medical Informatics, Computer Science, Infor- Intille (S.Intille@neu.edu). sor of Computer Science effective Fall Semester
mation Science, or a health-related discipline, to- 2011. The successful candidate will have experi-
gether with a proven ability to secure grant fund- ence and research interest in software engineer-
ing for research using advanced technology in the Oregon State University ing/software design, compilers, or programming
health domain, is required. School of Electrical Engineering and Computer languages. Candidates will be evaluated on teach-
Building upon our successful joint Master Science ing and research potential. Ph.D. in Computer
of Science degree program in Health Informat- Two tenure-track Professorial positions in Science is required. Faculty are expected to teach
ics, and our many graduate and undergraduate Computer Science courses for the B.S. and M.S. degrees in Computer
degree programs in health sciences, nursing, Science, pursue scholarly research and publica-
pharmacy, computer science, and information The School of Electrical Engineering and Com- tions, contribute to curriculum development,
science, we are interested in growing our faculty puter Science at Oregon State University invites participate in University and professional ser-
in the general area of health interfaces, which applications for two tenure-track professorial vice activities, advise undergraduate and gradu-
includes technologies that patients interact positions in Computer Science. Exceptionally ate students, and serve on graduate level degree
with directly, health informatics, and technol- strong candidates in all areas of Computer Sci- committees. For information on Penn State Har-
ogy design for health and wellness systems. ence are encouraged to apply. We are building risburg, please visit our websites at www.hbg.psu.
The candidate would play a key role in launch- research and teaching strengths in the areas edu and www.cs.hbg.psu.edu.
ing a new interdisciplinary Ph.D.-level degree of open source software, internet and social Applicants are invited to submit current
program in this area. Faculty in our colleges are computing, and cyber security, so our primary curriculum vitae, a list of three references with
currently working on multiple NIH-funded re- need is for candidates specializing in software one reference addressing candidate’s teaching
search projects in consumer informatics, clini- engineering, database systems, web/distributed effectiveness, a personal statement of research
cal informatics, behavioral informatics, and systems, programming languages, and HCI. and teaching objectives that includes a list of
assistive technologies, and we are particularly Applicants should demonstrate a strong com- preferred courses to teach. Please submit cre-
interested in faculty candidates who can expand mitment to collaboration with other research dentials to: Chair, Computer Science Search
or complement our work in these areas. Topics groups in the School of EECS, with other de- Committee, c/o Mrs. Dorothy J. Guy, Director of
of interest include the use of mobile technolo- partments at Oregon State University, and with Human Resources, Penn State Harrisburg, Box:
gies to monitor and manage health, the use of other universities. ACM-33389, 777 West Harrisburg Pike, Middle-
virtual agents for physical exercise and health The School of EECS supports a culture of en- town, PA 17057-4898.
management, the development of assistive com- ergetic collaboration and faculty are committed Application review will begin immediately
munication aids, the use of artificial intelligence to quality in both education and research. With and continue until the position is filled. Penn
to study mental and physical health behavior, 40 tenure/tenure-track faculty, we enroll 160 PhD, State is committed to affirmative action, equal
and the development and evaluation of other 120 MS and 1200 undergraduate students. OSU opportunity, and the diversity of its workforce.
novel technologies to study health behavior and is the only Oregon institution recognized for its
improve health outcomes. We are interested in “very high research activity” (RU/VH) by the Car-
candidates who create new tools and candidates negie Foundation for the Advancement of Teach- Princeton University
who specialize in evaluation of new technolo- ing. The School of EECS is housed in the Kelley Computer Science
gies in field research. Northeastern University is Engineering Center, a green building designed Assistant Professor
making a major investment in interdisciplinary to support collaboration among faculty and stu- Tenure-Track Positions
health research, with several recent hires and dents across campus. Oregon State University is
additional open interdisciplinary faculty search- located in Corvallis, a college town renowned for The Department of Computer Science at Princ-
es in Health Systems, Health Policy, Urban Envi- its high quality of life. eton University invites applications for faculty
ronment and Health and Administration. For more information, including full position positions at the Assistant Professor level. We are
announcement and instructions for application, accepting applications in all areas of Computer
Additional Information visit: http://eecs.oregonstate.edu/faculty/openings. Science.
Recognizing the importance of multidisciplinary php. Applicants must demonstrate superior re-
approaches to solving complex problems facing OSU is an AAEOE. search and scholarship potential as well as teach-
society, Northeastern is hiring faculty in several ing ability. A PhD in Computer Science or a relat-
areas related to this search. Searches are current- ed area is required.
ly underway in health care policy/ management, Pacific Lutheran University Successful candidates are expected to pursue
health systems engineering, health law, and ur- Assistant Professor an active research program and to contribute
ban health. We will consider hiring a multidisci- significantly to the teaching programs of the de-
plinary group as a ‘cluster hire’. Candidates may Assistant Professor in the Computer Science and partment. Applicants should include a resume
choose to form a team and propose an innovative Computer Engineering Department beginning contact information for at least three people who
and translational research and educational direc- September 2011. Review of applications will be- can comment on the applicant’s professional
tion and apply to more than one of the position gin February 14, 2011, and continue until the po- qualifications.
announcements. Information on these positions sition is filled. There is no deadline, but review of applica-
and on cluster applications can be obtained from A master’s degree is required and a Doctorate tions will start in December 2010; the review of
the http://www.northeastern.edu/hrm/ web site. is required for tenure. Preferred candidates will applicants in the field of theoretical computer
have a Ph.D. in Computer Engineering, Comput- science will begin as early as October 2010.
Equal Employment Opportunity er Science, or a related field; promise of teaching Princeton University is an equal opportunity
Northeastern University is an Equal Opportunity, excellence is essential. employer and complies with applicable EEO and
Affirmative Action Educational Institution and Application details and further information affirmative action regulations. You may apply on-
Employer, Title IX University. Northeastern Uni- about PLU and the CSCE department can be line at:
versity strongly encourages applications from mi- found at www.plu.edu and www.cs.plu.edu. In- http://www.cs.princeton.edu/jobs Requisition
norities, women and persons with disabilities. quiries may be sent by e-mail to csce@plu.edu. Number: 1000520
AA/EOE
How To Apply
Applicants should submit a letter of interest, cur- Princeton University
riculum vitae, and the contact information of at Penn State Harrisburg Computer Science Department
least five references. Submission is online via Assistant Professor, Computer Science Postdoc Research Associate
http://www.ccs.neu.edu/. Screening of applica-
tions begins November 30, 2010 and will contin- Penn State Harrisburg, School of Science, En- The Department of Computer Science at Princ-
ue until the position is filled. gineering and Technology, invites applications eton University is seeking applications for post-
Swarthmore College Room 224 New Engineering Building uate programs in computer science and computer
Visiting Assistant Professor Baltimore, MD 21218-2682 engineering. The university is committed to grow-
Phone: 410-516-8775 ing the faculty ranks over the next several years
Swarthmore College invites applications for a Fax: 410-516-6134 and promoting interdisciplinary research toward
three-year faculty position in Computer Science, fsearch@cs.jhu.edu cyber-enabled discovery and design.
at the rank of Visiting Assistant Professor, be- http://www.cs.jhu.edu/apply Penn State is a major research university and
ginning September 2011. Specialization is open. is ranked 3rd in the nation in industry-sponsored
Review of applications will begin January 1, 2011, research. Computer science is ranked 6th in the
and continue until the position is filled. For infor- The Ohio State University nation in research expenditures. U.S. News and
mation, see http://www.cs.swarthmore.edu/job. Department of Computer Science and World Report consistently ranks Penn State’s Col-
Swarthmore College has a strong commit- Engineering (CSE) lege of Engineering undergraduate and graduate
ment to excellence through diversity in educa- Assistant Professor programs in the top 15 of the nation. As reported
tion and employment and welcomes applications in the Chronicle of Higher Education, computer
from candidates with exceptional qualifications, The Department of Computer Science and En- science is ranked 3rd and computer engineering
particularly those with demonstrable commit- gineering (CSE), at The Ohio State University, is ranked 8th in the nation, respectively.
ments to a more inclusive society and world. anticipates significant growth in the next few The university is located in the beautiful col-
years. This year, CSE invites applications for four lege town of State College in the center of Penn-
tenure-track positions at the Assistant Professor sylvania. State College has 40,000 inhabitants
Texas A&M University level. Priority consideration will be given to candi- and offers a variety of cultural and outdoor recre-
Department of Visualization dates in database systems, graphics & animation, ational activities nearby. The university offers out-
Assistant Professor machine learning, and networking. Outstanding standing events from collegiate sporting events
applicants in all CSE areas (including software to fine arts productions. Many major population
Tenure-track faculty in the area of interactive me- engineering & programming languages, systems, centers on the east coast (New York, Philadelphia,
dia. Responsibilities include research/creative and theory) will also be considered. Pittsburgh, Washington D.C., Baltimore) are only
work, advising graduate/undergraduate levels, The department is committed to enhancing a few hours drive away and convenient air services
service to dept, university & field, teaching inc. in- faculty diversity; women, minorities, and individ- to several major hubs are operated by four major
tro courses in game design & development. uals with disabilities are especially encouraged to airlines out of State College.
Candidates must demonstrate collaborative apply. Applicants should hold a Ph.D. in Computer
efforts across disciplinary lines. Graduate degree Applicants should hold or be completing Science, Computer Engineering, or a closely relat-
related to game design & development, mobile a Ph.D. in CSE or a closely related field, have a ed field and should be committed to excellence in
media, interactive graphics, interactive art, mul- commitment to and demonstrated record of ex- both research and teaching. Support will be pro-
timedia or simulation is required. Apply URL: cellence in research, and a commitment to excel- vided to the successful applicants for establish-
http://www.viz.tamu.edu lence in teaching. ing their research programs. We encourage dual
To apply, please submit your application via career couples to apply. Applications should be
the online database. The link can be found at: received by January 31, 2011 to receive full consid-
The Johns Hopkins University http://www.cse.ohio-state.edu/department/posi- eration. To apply by electronic mail, send your re-
Tenure-track Faculty Positions tions.shtml sume (including curriculum vitae and the names
Review of applications will begin in November and addresses of at least three references) as a pdf
The Department of Computer Science at The and will continue until the positions are filled. file to recruiting@cse.psu.edu.
Johns Hopkins University is seeking applications The Ohio State University is an Equal Oppor- For more information about the Department
for tenure-track faculty positions. The search tunity/Affirmative Action Employer. of CSE at PSU, see http://www.cse.psu.edu. Click
is open to all areas of Computer Science, with a here to fill out an Affirmative Action Applicant
particular emphasis on candidates with research Data Card. Our search number is 015-87. You
interests in machine learning, theoretical com- The Pennsylvania State University MUST include this search number in order to
puter science, computational biology, computa- Tenure-track faculty submit this form.
tional aspects of biomedical informatics, or other Penn State is committed to affirmative action,
data-intensive or health-related applications. The Department of Computer Science and Engi- equal opportunity and the diversity of its work-
All applicants must have a Ph.D. in Computer neering (CSE) invites applications for tenure-track force.
Science or a related field and are expected to show faculty positions at all ranks. We seek outstand-
evidence of an ability to establish a strong, inde- ing candidates who can contribute to the core
pendent, multidisciplinary, internationally rec- of computer science and engineering through a The University of Alabama at
ognized research program. Commitment to qual- strong program of interdisciplinary research in Birmingham
ity teaching at the undergraduate and graduate areas such as high performance computing appli- Assistant/Associate Professor
levels will be required of all candidates. Prefer- cations and computational modeling for energy,
ence will be given to applications at the assistant life sciences, environmental sustainability, etc. The Department of Computer & Information Sci-
professor level, but other levels of appointment The department has 32 tenure-track faculty ences at the University of Alabama at Birmingham
will be considered based on area and qualifica- representing major areas of computer science and (UAB) is seeking candidates for a tenure-track/
tions. The Department is committed to building engineering. Eleven members of our faculty are tenure-earning faculty position at the Assistant
a diverse educational environment; women and recipients of the NSF Career Award. Two faculty or Associate Professor level beginning August 15,
minorities are especially encouraged to apply. members have received the prestigious NSF PE- 2011.
A more extensive description of our search can CASE Award. In recent years, our faculty received Candidates with leading expertise in Infor-
be found at http://www.cs.jhu.edu/Search2011. seven NSF ITR Grants, a $35M Network Science mation Assurance, particularly Computer Foren-
More information on the department is available Center Award, over $4.5M in computing and re- sics and/or Computer and Network Security are
at http://www.cs.jhu.edu. search infrastructure and instrumentation grants sought. The successful candidate must be able
Applicants should apply using the online ap- from NSF, eleven NSF Cyber Trust and Networking to participate effectively in multidisciplinary
plication which can be accessed from http://www. awards, and several awards from DARPA, DOE and research with scientists in Computer and Infor-
cs.jhu.edu/apply. Applications should be received DoD. There are state-of-the-art research labs for mation Sciences and Justice Sciences for advanc-
by Dec 1, 2010 for full consideration. Questions computer systems, computer vision and robotics, ing Information Assurance Research at UAB,
should be directed to fsearch@cs.jhu.edu. The Microsystems design and VLSI, networking and including joint scientific studies, co-advising of
Johns Hopkins University is an EEO/AA employer. security, high performance computing, bioinfor- students, and funding. Allied expertise in Artifi-
Faculty Search matics and virtual environments. The department cial Intelligence, Knowledge Discovery and Data
Johns Hopkins University offers a graduate program with over 40 Masters Mining, Software Engineering, and/or High Per-
Department of Computer Science students and 153 Ph.D. students, and undergrad- formance Computing is highly desirable. UAB
has made significant commitment to this area of Computer vision dedicated to work-life balance through an array
research and teaching. Candidates must conse- Computational biology of family-friendly policies, and is the recipient of
quently have strong teaching credentials as well Scientific computing an NSF Advance Award for gender equity.
as research credentials.
For additional information about the depart- Positions are available at all ranks, and we
ment please visit http://www.cis.uab.edu. have a large number of limited term positions University of California, Riverside
Applicants should have demonstrated the po- currently available. Tenure-Track Faculty Positions
tential to excel in one of these areas and in teach- For all positions we require a Ph.D. Degree or
ing at all levels of instruction. They should also be Ph.D. candidacy, with the degree conferred prior The Department of Computer Science and Engi-
committed to professional service including de- to date of hire. Submit your application electroni- neering, University of California, Riverside invites
partmental service. A Ph.D. in Computer Science cally at: applications for tenure-track faculty positions
or closely related field is required. http://ttic.uchicago.edu/facapp/ beginning in the 2011/2012 academic year with
Applications should include a complete cur- research interests in (a) Operating and Distrib-
riculum vita with a publication list, a statement Toyota Technological Institute at Chicago is an uted Systems (b) Data Mining and, (c) Computer
of future research plans, a statement on teaching Equal Opportunity Employer Graphics. Exceptional candidates in all areas will
experience and philosophy, and minimally two be considered. A Ph.D. in Computer Science (or
letters of reference with at least one letter ad- in a closely related field) is required at the time
dressing teaching experience and ability. University of California, Irvine of employment. Junior candidates must show
Applications and all other materials may be Computer Science Department outstanding research, teaching and graduate stu-
submitted via email to facapp.ia@cis.uab.edu or Tenure-Track Position in Operating Systems / dent mentorship potential. Exceptional senior
via regular mail to: Programming Languages candidates may be considered. Salary level will
Search Committee be competitive and commensurate with qualifi-
Department of Computer and Information Department of Computer Science at the Univer- cations and experience. Details and application
Sciences sity of California, Irvine (UCI) invites applications materials can be found at www.engr.ucr.edu/
115A Campbell Hall for a tenure-track Assistant Professor position in facultysearch. Full consideration will be given
1300 University Blvd the general area of Systems. We are particularly to applications received by February 1, 2011. Ap-
Birmingham, AL 35294-1170 interested in applicants who specialize in Oper- plications will continue to be received until the
ating Systems, Programming Languages or Dis- positions are filled. For inquiries and questions,
Interviewing for the position will begin as tributed Systems. Exceptionally qualified more please contact us at search@cs.ucr.edu. EEO/AA
soon as qualified candidates are identified, and senior candidates may also be considered. employer.
will continue until the position is filled. Department of Computer Science is the larg-
The department and university are commit- est department in the Bren School of Information
ted to building a culturally diverse workforce and and Computer Sciences, one of only a few such University of Houston – Clear Lake
strongly encourage applications from women schools in the nation and the only one in the UC Assistant or Associate Professor of Computer
and individuals from underrepresented groups. System. The department has over 45 faculty mem- Science/Computer Information Systems
UAB has a Spouse Relocation Program to assist bers and over 200 PhD students. Faculty research
in the needs of dual career couples. UAB is an Af- is very vibrant and broad, spanning prominent The Computer Science and Computer Informa-
firmative Action/Equal Employment Opportunity areas such as: distributed systems, software, tion Systems programs of the School of Science
employer. networking, databases, embedded systems, and Computer Engineering at the University of
theory, security, graphics, multimedia, machine Houston-Clear Lake invite applications for ten-
learning, AI, and bioinformatics. Prospective ap- ure-track Assistant or Associate Professor of CS
Theophilus, Inc. plicants are encouraged to visit our web page at: or CIS to begin August 2011. Ph.D. in CS, CIS/IS,
Recommendation Engine / Java Developer http://cs.www.uci.edu/ or a closely related field is required. Applications
One of the youngest UC campuses, UCI is are accepted only online at https://jobs.uhcl.edu.
Funded startup with virtual office and flexible ranked 10th among the nation’s best public uni- See http://sce.uhcl.edu/cs and http://sce.uhcl.
working hours seeking experienced part-time versities by US News & World Report. It has re- edu/cis for additional information about CS/CIS
Java Developer with recommendation engine ceived three Nobel prizes in the past 15 years. Sal- programs. AA/EOE.
experience (implementation and experimental ary and other compensation (including priority
evaluation), to work on next generation recom- access to on-campus for-sale faculty housing) are
mendation product integrating advanced seman- competitive with the nation’s finest universities. University of Houston-Downtown
tic analysis, search, social networks, and smart- UCI is located 3 miles from the Pacific Ocean in Assistant Professor, Computer Sciences
phones (iPhone). Developer can work remotely. Southern California (50 miles South of Los Ange-
Email for full description: david.kim@theo- les) with a very pleasant year-round climate. The The Department of Computer and Mathematical
philus-inc.com area offers numerous recreational and cultural Sciences invites applications for a tenure-track
opportunities. Also, the Irvine public school sys- Assistant Professor position in Computer Science
tem is one of the highest-ranked in the nation. starting Fall 2011. Successful candidates will have
Toyota Technological Institute at Screening will begin immediately upon re- a PhD in Computer Science or a closely related
Chicago ceipt of a completed application. Applications field in hand by the time of appointment, a prom-
Computer Science Faculty Positions at All will be accepted until the position is filled, al- ising research profile, and a commitment to ex-
Levels though maximum consideration will be given cellence in teaching. Review of applications will
to applications received by December 15, 2010. begin immediately and continue until the posi-
Toyota Technological Institute at Chicago (TTIC) Each application must contain: a cover letter, CV, tion is filled. Only online applications submitted
is a philanthropically endowed degree-granting sample publications (up to 3) and 3-5 letters of through http://jobs.uhd.edu will be considered.
institute for computer science located on the recommendation. All these materials must be up-
University of Chicago campus. The Institute is loaded on-line. Please refer to the following web
expected to reach a steady-state of 12 traditional site for instructions: University of Mississippi
faculty (tenure and tenure track), and 12 limited https://recruit.ap.uci.edu/ Chair, Department of Computer and
term faculty. Applications are being accepted in Information Science
all areas, but we are particularly interested in: UCI is an equal opportunity employer com-
Theoretical computer science mitted to excellence through diversity and en- The Department of Computer and Information
Speech processing courages applications from women, minorities, Science at the University of Mississippi (Ole Miss)
Machine learning and other under-represented groups. UCI is re- invites applications for the position of Chair. The
Computational linguistics sponsive to the needs of dual career couples, is Chair provides leadership and overall strategic
direction for the instructional and research pro- position in Computer Engineering (F10/11-24). appointment. The position requires demonstrat-
grams. Requirements include a PhD or equiva- All candidates must have a potential/proven re- ed research success, a significant potential for
lent in computer science or a closely related field, cord in teaching and active research. The Assis- attracting external research funding, excellence
evidence of excellence in teaching and research tant Professor position in Computer Engineering in teaching both undergraduate and graduate
in one or more major areas of computer and in- requires a Ph.D. in computer science or com- courses, the ability to supervise student research,
formation science, and administrative experi- puter engineering. Highest priority will be given and excellent communication skills.
ence relevant to the management of an academic to candidates with research expertise in areas of USU offers competitive salaries and outstand-
computer science department. The Department Computer Networks, Cybersecurity/Forensics, ing medical, retirement, and professional ben-
has an ABET/CAC-accredited undergraduate pro- Data Management, Scientific Workflows, and/or efits (see http://www.usu.edu/hr/ for details). The
gram and MS and PhD programs. See the website Semantic Web. The program in Computer Engi- dergraduate majors, 80 MS students and 27 PhD
http://www.cs.olemiss.edu for more information neering leading to BS degree in Computer Engi- dergraduate majors, 80 MS students and 27 PhD
about the Department and its programs. neering is administered jointly by the Computer students. There are 17 full time faculty. The BS
The University is located in the historic town Science Department and the Electrical Engineer- degree is ABET accredited. Utah State University
of Oxford in the wooded hills of north Missis- ing Department. The Computer Science Depart- is a Carnegie Research Doctoral extensive Univer-
sippi, an hour drive from Memphis. Oxford has ment (http://www.cs.panam.edu) also offers the sity of over 23,000 students, nestled in a moun-
a wonderful small-town atmosphere with afford- BSCS (ABET/CAC Accredited) and BS undergradu- tain valley 80 miles north of Salt Lake City, Utah.
able housing and excellent schools. ate degrees, MS in Computer Science and MS in Opportunities for a wide range of outdoor activi-
Requirements include a PhD or equivalent in Information Technology. ties are plentiful. Housing costs are at or below
computer science or a closely related field, evi- UTPA is situated in the lower Rio Grande valley national averages, and the area provides a sup-
dence of excellence in teaching and research in of south Texas, a strategic location at the center of portive environment for families and a balanced
one or more major areas of computer and infor- social and economic change. With a population personal and professional life. Women, minority,
mation science, and administrative experience of over one million, the Rio Grande Valley is one veteran and candidates with disabilities are en-
relevant to the management of an academic com- of the fastest growing regions in the country. The couraged to apply. USU is sensitive to the needs
puter science department. region has a very affordable cost-of-living. UTPA is of dual-career couples. Utah State University is an
Individuals may apply online at http://jobs. a leading educator of Hispanic/Latino students, affirmative action/equal opportunity employer,
olemiss.edu. Applicants will be asked to upload a with enrollment of over 18,500. with a National Science Foundation ADVANCE
cover letter, curriculum vitae, names and contact The position starts in Fall 2011. The salary is Gender Equity program, committed to increasing
information for five references, and a statement competitive. A complete application should in- diversity among students, faculty, and all partici-
of department administrative philosophy, objec- clude: (1) a cover letter, specifically stating an pants in university life.
tives, and vision. Review of applications will begin interest in the Assistant Professor in Computer Applications must be submitted using USU’s
immediately and will continue until the position Engineering position, noting your specialization, online job-opportunity system. To access this job
is filled or an adequate applicant pool is reached. (2) vita, (3) statements of teaching and research opportunity directly and begin the application
The University of Mississippi is an EEO/AA/Ti- interests, and (4) names and contact information process, visit https://jobs.usu.edu/applicants/
tle VI/Title IX/Section 504/ADA/ADEA employer. of at least three references. Applications can be Central?quickFind=54615.
mailed to Dean’s Office, Computer Engineering The review of the applications will begin on
Search, College of Engineering and Computer January 15, 2011 and continue until the position
University of North Texas Science, The University of Texas-Pan American, is filled. The salary will be competitive and de-
Department of Computer Science and 1201 W. University Drive, Edinburg, Texas 78539 pend on qualifications.
Engineering or emailed to coec@utpa.edu. Review of materi-
Department Chair als will begin on November 1, 2010 and continue
until the position is filled. Wichita State University
Applications are invited for the Chair position in NOTE: UTPA is an Equal Opportunity/Affirma- Assistant Professor
the Department of Computer Science and Engi- tive Action employer. Women, racial/ethnic mi-
neering at the University of North Texas. UNT is norities and persons with disabilities are encour- The Department of Electrical Engineering and
one of seven universities designated by the state aged to apply. This position is security-sensitive as Computer Science at Wichita State University has
as an “Emerging Research University.” Candi- defined by the Texas Education Code §51.215(c) multiple open tenure-eligible faculty positions
dates must have an earned doctorate in Comput- and Texas Government Code §411.094(a)(2). Tex- at the assistant professor level in Electric Energy
er Science and Engineering or a closely related as law requires faculty members whose primary Systems, Information Security, and Software En-
field with a record of significant and sustained language is not English to demonstrate profi- gineering. Duties and responsibilities of all posi-
research funding and scholarly output that quali- ciency in English as determined by a satisfactory tions include teaching undergraduate and gradu-
fies them to the rank of full professor. Preferred: grade on the International Test of English as a ate courses, advising undergraduate students,
Administrative experience as a department chair Foreign Language (TOEFL). supervising MS and PhD students in their theses
or director of personnel working in computer and dissertations, obtaining research funding,
science and engineering; experience in curricu- conducting an active research program, publish-
lum development; and demonstrated experience University of Wisconsin-Platteville ing the results of research, actively participating
mentoring junior faculty. The committee will be- Assistant Professor in professional societies, and service to the de-
gin its review of the applications on December 1, partment, college and university. Complete infor-
2010 and the position will close on April 4, 2011. The University of Wisconsin-Platteville Computer mation can be found on our website, www.eecs.
All applicants must apply online to: https://facul- Science and Software Engineering Department wichita.edu.
tyjobs.unt.edu. Nominations and any questions has two tenure track positions to be filled in Fall To ensure full consideration, the complete
regarding the position may be directed to Dr Bill 2011. One is in Software Engineering and one is application package must be submitted online
Buckles (bbuckles@cse.unt.edu). Additional in- an anticipated opening in Computer Science. For at jobs.wichita.edu by January 15, 2011. Appli-
formation about the department is available more information and to apply electronically: cations will be continuously reviewed after that
at www.cse.unt.edu. UNT is an AA/ADA/EOE. http://www.uwplatt.edu/csse/positions. date until determinations are made with regard
to filling the positions. Offers of employment
are contingent upon completion of a satisfactory
University of Texas-Pan American Utah State University criminal background check as required by Board
Computer Science Department Assistant Professor of Regents policy.
Assistant Professor Faculty Position Questions only (not applications) can be di-
Applications are invited for a faculty position at rected to the search chair, Ward Jewell, wardj@
The Department of Computer Science at the Uni- the Assistant Professor level, for employment ieee.org.
versity of Texas-Pan American (UTPA) seeks ap- beginning Fall 2011. Applicants must have com- Wichita State University is an equal opportu-
plications for a tenure-track Assistant Professor pleted a PhD in computer science by the time of nity and affirmative action employer.
last byte
Q&A
A Journey of Discovery
Ed Lazowska discusses his heady undergraduate days
at Brown University, teaching, eScience, and being chair
of the Computing Community Consortium.
As an undergraduate student at
Brown University, Ed Lazowska hardly
seemed destined to become a leader in
computer science. Actually, he wasn’t
sure what he wanted to do. He started
as an engineering student, switched to
physics, and briefly considered chem-
istry. Essentially, he was “adrift.” (His
description, not ours.)
It wasn’t until he fell under the
tutelage of computer science profes-
sor Andy van Dam that he discovered
what really excited him: the process of
discovery.
“We had access to an IBM 360
mainframe that occupied an entire
building,” Lazowska recalls. “Despite
its size, it had only a couple hundred
megabytes of disk storage and 512 ki-
lobytes of memory. Today, your typical
smartphone will have 1,000 times the
processing power and storage of this
machine. During the day, it supported
the entire campus. But between mid-
night and 8 a.m., we were allowed to Lazowska holds the Bill & Melinda search assistants were graduate stu-
use it as a personal computer. We were Gates Chair in Computer Science & dents, but Brown had few computer
building a ‘what-you-see-is-what-you- Engineering at the University of Wash- science graduate students at the time.
get’ hypertext editor—Microsoft Word ington, where he served as department Andy was asking us to join him in dis-
plus the Web, minus networking. It chair from 1993 to 2001. He also di- covery—to figure out how to do things
was revolutionary.” rects the university’s eScience Institute that no one had done before. Up to
Four decades later, Lazowska is ded- and chairs the Computing Community that time, including my freshman year
icated to making the same transforma- Consortium, a National Science Foun- at Brown, I had been learning things
tional impact on countless computer dation initiative that seeks to inspire that people already knew. It blew my
science students. After graduating computer scientists to tackle the soci- mind that Andy was asking 19- and
from Brown in 1972, he received his etal challenges of the 21st century. 20-year-olds to find answers to ques-
PHOTOGRAPH BY BRIAN SMALE
Ph.D. from the University of Toronto tions that he himself didn’t know.
in 1977, and joined the University of How invigorating were those early days People rise to the expectations and
Washington faculty, focusing on the at Brown under van Dam? challenges that are set for them. Andy
design, implementation, and analy- It was an amazing time. He had a understood this.
sis of high-performance computing crew of 20 undergraduates who were Back then, most people thought
and communication systems. Today, his research assistants. Typically, re- of computers [c on tinued o n p. 1 2 7 ]