
COMMUNICATIONS OF THE ACM

cacm.acm.org 01/2011 Vol. 54 No. 01

A Firm Foundation for Private Data Analysis
Is Virtualization a Curse or a Blessing?
Follow the Intellectual Property
An Interview with Fran Allen
ACM’s FY10 Annual Report

Association for
Computing Machinery
ACM TechNews Goes Mobile
iPhone & iPad Apps Now Available in the iTunes Store
ACM TechNews, ACM’s popular thrice-weekly news briefing service, is now
available as easy-to-use mobile apps downloadable from the Apple iTunes Store.
These new apps allow nearly 100,000 ACM members to keep current with
news, trends, and timely information impacting the global IT and Computing
communities each day.

TechNews mobile app users will enjoy:


• Latest News: Concise summaries of the most
relevant news impacting the computing world
• Original Sources: Links to the full-length
articles published in over 3,000 news sources
• Archive access: Access to the complete
archive of TechNews issues dating back to
the first issue published in December 1999
• Article Sharing: The ability to share news
with friends and colleagues via email, text
messaging, and popular social networking sites
• Touch Screen Navigation: Find news
articles quickly and easily with a
streamlined, fingertip scroll bar
• Search: Simply search the entire TechNews
archive by keyword, author, or title
• Save: One-click saving of latest news or archived
summaries in a personal binder for easy access
• Automatic Updates: By entering and saving
your ACM Web Account login information,
the apps will automatically update with
the latest issues of TechNews published
every Monday, Wednesday, and Friday

The Apps are freely available to download from the Apple iTunes Store, but users must be registered
individual members of ACM with valid Web Accounts to receive regularly updated content.
http://www.apple.com/iphone/apps-for-iphone/ http://www.apple.com/ipad/apps-for-ipad/

acm technews
Membership Application & Digital Library Order Form
Advancing Computing as a Science & Profession
Priority Code: AD10

You can join ACM in several easy ways:

Online: http://www.acm.org/join
Phone: +1-800-342-6626 (US & Canada); +1-212-626-0500 (Global)
Fax: +1-212-944-1318
Or, complete this application and return with payment via postal mail

Special rates for residents of developing countries: http://www.acm.org/membership/L2-3/
Special rates for members of sister societies: http://www.acm.org/membership/dues.html

Please print clearly

Name
Address
City, State/Province, Postal code/Zip
Country, E-mail address
Area code & Daytime phone, Fax, Member number (if applicable)

Purposes of ACM
ACM is dedicated to:
1) advancing the art, science, engineering, and application of information technology
2) fostering the open interchange of information to serve both professionals and the public
3) promoting the highest professional and ethics standards
I agree with the Purposes of ACM:
Signature
ACM Code of Ethics: http://www.acm.org/serving/ethics.html

Choose one membership option:

PROFESSIONAL MEMBERSHIP:
o ACM Professional Membership: $99 USD
o ACM Professional Membership plus the ACM Digital Library: $198 USD ($99 dues + $99 DL)
o ACM Digital Library: $99 USD (must be an ACM member)

STUDENT MEMBERSHIP:
o ACM Student Membership: $19 USD
o ACM Student Membership plus the ACM Digital Library: $42 USD
o ACM Student Membership PLUS Print CACM Magazine: $42 USD
o ACM Student Membership w/Digital Library PLUS Print CACM Magazine: $62 USD

All new ACM members will receive an ACM membership card.
For more information, please visit us at www.acm.org

Professional membership dues include $40 toward a subscription to Communications of the ACM. Member dues, subscriptions, and optional contributions are tax-deductible under certain circumstances. Please consult with your tax advisor.

Payment:
Payment must accompany application. If paying by check or money order, make payable to ACM, Inc. in US dollars or foreign currency at current exchange rate.

o Visa/MasterCard o American Express o Check/money order

o Professional Member Dues ($99 or $198) $ ______________________
o ACM Digital Library ($99) $ ______________________
o Student Member Dues ($19, $42, or $62) $ ______________________
Total Amount Due $ ______________________

Card # / Expiration date
Signature

RETURN COMPLETED APPLICATION TO:
Association for Computing Machinery, Inc.
General Post Office
P.O. Box 30777
New York, NY 10087-0777

Questions? E-mail us at acmhelp@acm.org
Or call +1-800-342-6626 to speak to a live representative

Satisfaction Guaranteed!

Departments

5 Editor’s Letter
Where Have All the Workshops Gone?
By Moshe Y. Vardi

6 Letters To The Editor
To Change the World, Take a Chance

8 In the Virtual Extension

9 ACM’s FY10 Annual Report

14 BLOG@CACM
Smart Career Advice; Laptops as a Classroom Distraction
Jack Rosenberger shares Patty Azzarello’s life lessons about advancing in the workplace. Judy Robertson discusses students’ in-class usage of laptops.

16 CACM Online
Scholarly Publishing Model Needs an Update
By David Roman

29 Calendar

117 Careers

Last Byte

128 Q&A
A Journey of Discovery
Ed Lazowska discusses his heady undergraduate days at Brown University, teaching, eScience, and being chair of the Computing Community Consortium.
By Dennis McCafferty

News

17 Nonlinear Systems Made Easy
Pablo Parrilo has discovered a new approach to convex optimization that creates order out of chaos in complex nonlinear systems.
By Gary Anthes

20 The Touchy Subject of Haptics
After more than 20 years of research and development, are haptic interfaces finally getting ready to enter the computing mainstream?
By Alex Wright

23 India’s Elephantine Effort
An ambitious biometric ID project in the world’s second most populous nation aims to relieve poverty, but faces many hurdles.
By Marina Krakovsky

25 EMET Prize and Other Awards
Edward Felten, David Harel, Sarit Kraus, and others are honored for their contributions to computer science, technology, and electronic freedom and innovation.
By Jack Rosenberger

Viewpoints

27 The Business of Software
Don’t Bring Me a Good Idea
How to sell process changes.
By Phillip G. Armour

30 Law and Technology
Google AdWords and European Trademark Law
Is Google violating trademark law by operating its AdWords system?
By Stefan Bechtold

33 Technology Strategy and Management
Reflections on the Toyota Debacle
A look in the rearview mirror reveals system and process blind spots.
By Michael A. Cusumano

36 Viewpoint
Cloud Computing Privacy Concerns on Our Doorstep
Privacy and confidentiality issues in cloud-based conference management systems reflect more universal themes.
By Mark D. Ryan

39 Interview
An Interview with Frances E. Allen
Frances E. Allen, recipient of the 2006 ACM A.M. Turing Award, reflects on her career.
By Guy L. Steele Jr.

The Ephemeral Legion: Producing an Expert Cyber-Security Work Force from Thin Air
Seeking to improve the educational mechanisms for efficiently training large numbers of information security workers.
By Michael E. Locasto, Anup K. Ghosh, Sushil Jajodia, and Angelos Stavrou

PHOTOGRAPH BY SANJIT DAS/PANOS

2 communications of the acm | january 2011 | vol. 54 | no. 1
Practice

46 Collaboration in System Administration
For sysadmins, solving problems usually involves collaborating with others. How can we make it more effective?
By Eben M. Haber, Eser Kandogan, and Paul P. Maglio

54 UX Design and Agile: A Natural Fit?
Talking with Julian Gosper, Jean-Luc Agathos, Richard Rutter, and Terry Coatta.
ACM Case Study

61 Virtualization: Blessing or Curse?
Managing virtualization at a large scale is fraught with hidden challenges.
By Evangelos Kotsovinos

Articles’ development led by queue.acm.org

Contributed Articles

66 Follow the Intellectual Property
How companies pay programmers when they move the related IP rights to offshore tax havens.
By Gio Wiederhold

75 Using Simple Abstraction to Reinvent Computing for Parallelism
The ICE abstraction may take CS from serial (single-core) computing to effective parallel (many-core) computing.
By Uzi Vishkin

On the Move, Wirelessly Connected to the World
How to experience real-world landmarks through a wave, gaze, location coordinates, or touch, prompting delivery of useful digital information.
By Peter Fröhlich, Antti Oulasvirta, Matthias Baldauf, and Antti Nurminen

OpenSocial: An Enabler for Social Applications on the Web
Building on the openSocial API suite, developers can create applications that are interoperable within the context of different social networks.
By Matthias Häsel

Review Articles

86 A Firm Foundation for Private Data Analysis
What does it mean to preserve privacy?
By Cynthia Dwork

Research Highlights

98 Technical Perspective
Sora Promises Lasting Impact
By Dina Katabi

99 Sora: High-Performance Software Radio Using General-Purpose Multi-Core Processors
By Kun Tan, He Liu, Jiansong Zhang, Yongguang Zhang, Ji Fang, and Geoffrey M. Voelker

108 Technical Perspective
Multipath: A New Control Architecture for the Internet
By Damon Wischik

109 Path Selection and Multipath Congestion Control
By Peter Key, Laurent Massoulié, and Don Towsley

About the cover:
Preserving privacy in an online world remains one of the industry’s most exhaustive challenges. While great progress has been made, vulnerabilities indeed remain. This month’s cover story by Cynthia Dwork (p. 86) spotlights the difficulties involved in protecting statistical databases, where the value of accurate statistics about a set of respondents often compromises the privacy of the individual.

ILLUSTRATION BY PETER GRUNDY
communications of the acm
Trusted insights for computing’s leading professionals.

Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.

ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO
John White
Deputy Executive Director and COO
Patricia Ryan
Director, Office of Information Systems
Wayne Graves
Director, Office of Financial Services
Russell Harris
Director, Office of Membership
Lillian Israel
Director, Office of SIG Services
Donna Cappo
Director, Office of Publications
Bernard Rous
Director, Office of Group Publishing
Scott Delman

ACM Council
President
Alain Chesnais
Vice-President
Barbara G. Ryder
Secretary/Treasurer
Alexander L. Wolf
Past President
Wendy Hall
Chair, SGB Board
Vicki Hanson
Co-Chairs, Publications Board
Ronald Boisvert and Jack Davidson
Members-at-Large
Vinton G. Cerf; Carlo Ghezzi; Anthony Joseph; Mathai Joseph; Kelly Lyons; Mary Lou Soffa; Salil Vadhan
SGB Council Representatives
Joseph A. Konstan; G. Scott Owens; Douglas Terry

Publications Board
Co-Chairs
Ronald F. Boisvert; Jack Davidson
Board Members
Nikil Dutt; Carol Hutchins; Joseph A. Konstan; Ee-Peng Lim; Catherine McGeoch; M. Tamer Ozsu; Holly Rushmeier; Vincent Shen; Mary Lou Soffa

ACM U.S. Public Policy Office
Cameron Wilson, Director
1828 L Street, N.W., Suite 800
Washington, DC 20036 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Chris Stephenson
Executive Director
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (800) 401-1799; F (541) 687-1840

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481

STAFF

Director of Group Publishing
Scott E. Delman
publisher@cacm.acm.org
Executive Editor
Diane Crawford
Managing Editor
Thomas E. Lambert
Senior Editor
Andrew Rosenbloom
Senior Editor/News
Jack Rosenberger
Web Editor
David Roman
Editorial Assistant
Zarina Strakhan
Rights and Permissions
Deborah Cotton

Art Director
Andrij Borys
Associate Art Director
Alicia Kubista
Assistant Art Directors
Mia Angelica Balaquiot
Brian Greenberg
Production Manager
Lynn D’Addesio
Director of Media Sales
Jennifer Ruzicka
Public Relations Coordinator
Virgina Gold
Publications Assistant
Emily Eng

Columnists
Alok Aggarwal; Phillip G. Armour; Martin Campbell-Kelly; Michael Cusumano; Peter J. Denning; Shane Greenstein; Mark Guzdial; Peter Harsha; Leah Hoffmann; Mari Sako; Pamela Samuelson; Gene Spafford; Cameron Wilson

Contact Points
Copyright permission
permissions@cacm.acm.org
Calendar items
calendar@cacm.acm.org
Change of address
acmcoa@cacm.acm.org
Letters to the Editor
letters@cacm.acm.org

Web Site
http://cacm.acm.org

Author Guidelines
http://cacm.acm.org/guidelines

Advertising
ACM Advertising Department
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 869-7440
F (212) 869-0481
Director of Media Sales
Jennifer Ruzicka
jen.ruzicka@hq.acm.org
Media Kit acmmediasales@acm.org

Editorial Board

Editor-in-Chief
Moshe Y. Vardi
eic@cacm.acm.org

News
Co-chairs
Marc Najork and Prabhakar Raghavan
Board Members
Brian Bershad; Hsiao-Wuen Hon; Mei Kobayashi; Rajeev Rastogi; Jeannette Wing

Viewpoints
Co-chairs
Susanne E. Hambrusch; John Leslie King; J Strother Moore
Board Members
P. Anandan; William Aspray; Stefan Bechtold; Judith Bishop; Stuart I. Feldman; Peter Freeman; Seymour Goodman; Shane Greenstein; Mark Guzdial; Richard Heeks; Rachelle Hollander; Richard Ladner; Susan Landau; Carlos Jose Pereira de Lucena; Beng Chin Ooi; Loren Terveen

Practice
Chair
Stephen Bourne
Board Members
Eric Allman; Charles Beeler; David J. Brown; Bryan Cantrill; Terry Coatta; Mark Compton; Stuart Feldman; Benjamin Fried; Pat Hanrahan; Marshall Kirk McKusick; George Neville-Neil; Theo Schlossnagle; Jim Waldo
The Practice section of the CACM Editorial Board also serves as the Editorial Board of .

Contributed Articles
Co-chairs
Al Aho and Georg Gottlob
Board Members
Yannis Bakos; Elisa Bertino; Gilles Brassard; Alan Bundy; Peter Buneman; Andrew Chien; Anja Feldmann; Blake Ives; James Larus; Igor Markov; Gail C. Murphy; Shree Nayar; Lionel M. Ni; Sriram Rajamani; Jennifer Rexford; Marie-Christine Rousset; Avi Rubin; Fred B. Schneider; Abigail Sellen; Ron Shamir; Marc Snir; Larry Snyder; Manuela Veloso; Michael Vitale; Wolfgang Wahlster; Andy Chi-Chih Yao; Willy Zwaenepoel

Research Highlights
Co-chairs
David A. Patterson and Stuart J. Russell
Board Members
Martin Abadi; Stuart K. Card; Jon Crowcroft; Deborah Estrin; Shafi Goldwasser; Monika Henzinger; Maurice Herlihy; Dan Huttenlocher; Norm Jouppi; Andrew B. Kahng; Gregory Morrisett; Michael Reiter; Mendel Rosenblum; Ronitt Rubinfeld; David Salesin; Lawrence K. Saul; Guy Steele, Jr.; Madhu Sudan; Gerhard Weikum; Alexander L. Wolf; Margaret H. Wright

Web
Co-chairs
James Landay and Greg Linden
Board Members
Gene Golovchinsky; Jason I. Hong; Jeff Johnson; Wendy E. MacKay

ACM Copyright Notice
Copyright © 2011 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
An annual subscription cost is included in ACM member dues of $99 ($40 of which is allocated to a subscription to Communications); for students, cost is included in $42 dues ($20 of which is allocated to a Communications subscription). A nonmember annual subscription is $100.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current Advertising Rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0654.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact acmhelp@acm.org.

Communications of the ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to
Communications of the ACM
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

Printed in the U.S.A.

PLEASE RECYCLE THIS MAGAZINE
editor’s letter

DOI:10.1145/1866739.1866740 Moshe Y. Vardi

Where Have All the Workshops Gone?

My initiation into the computing-research community was a workshop on “Logic and Databases” in 1979. I was the only graduate student attending that workshop; my graduate advisor was invited, and he got permission from the organizers to bring me along. In spite of the informality of the event I was quite in awe of the senior researchers who attended the workshop. In fact, I was quite in shock when one of them, an author of a well-respected logic textbook, proved to be far from an expert in the subject matter of his book.

Throughout the 1980s, workshops continued to be informal gatherings of researchers mixing networking with work-in-progress presentations and intellectually stimulating discussions. A workshop was typically a rather intimate gathering of specialists; an opportunity to invite one’s scientific friends to get together. While conferences were the place to present polished technical results, workshops were a place to see if your colleagues were as impressed with your new results or directions as you were. The pace was leisurely, many presentations were done on blackboards, and it was perfectly acceptable to ask questions during presentations. Organizers may have posted an occasional “call for abstracts,” but never a “call for papers.” In fact, workshops typically had no formal proceedings.

Such informal workshops are almost extinct today. As selective conferences become our dominant way of publishing, workshops have gradually become mini-conferences. Today’s workshops have typically large program committees, calls for papers, deadlines, and all the other accoutrements of computing-research conferences. What they usually lack is the prestige of major conferences. Furthermore, most workshops today do publish proceedings, before or after the meeting, which means a workshop paper cannot be resubmitted to a conference. As a result, today’s workshops do not attract papers of the same quality as those submitted to major conferences.

Workshops have become, I am afraid to say, simply second-rate conferences. Yes, I am sure there are exceptions to this, but I believe my description does apply to the vast majority of today’s computing-research workshops. It is not uncommon to see workshops where the size of the program committee exceeds the number of papers submitted to the workshop. It is not uncommon to see deadlines extended in the hope of attracting a few more submissions.

I miss the old workshops. Regardless of what one thinks of computing-research conferences (our community is now engaged in serious discussions on the advantages and disadvantages of these meetings), informal workshops played an important role in the computing-research ecosystem. Many preliminary results improved significantly as a result of feedback received from discussions carried out during these gatherings. The disappearance of such workshops is, in my opinion, a loss to our community.

I am a big fan of Schloss Dagstuhl, a workshop facility near the small town of Wadern in Germany. Schloss Dagstuhl was built as a manor house of a German prince in 1760. It was converted into the International Conference and Research Center for Computer Science in 1989, now called Leibniz Center for Informatics. The first week-long seminar (Dagstuhl workshops are called seminars) took place in August 1990. Since then, Dagstuhl has hosted close to 800 seminars, drawing about 30,000 participants. In addition to week-long seminars, Dagstuhl hosts perspectives workshops, summer schools, retreat stays of research guests, and the like. If you receive an invitation to a Dagstuhl seminar, accept it! The facility offers a good library and an outstanding wine cellar. The rural location facilitates both group and one-on-one interactions. In a nutshell, Dagstuhl is the place to experience the tradition of workshops as informal scientific gatherings. Its contributions to computing research over the past 20 years are incalculable. It is no wonder that the National Institute of Informatics in Japan recently created a similar center in Shonan, near Tokyo.

This brings me to a question that has been bothering me for years. Call it “Dagstuhl Envy,” but why don’t we have a North American “Dagstuhl”? There are several facilities in North America to host mathematics workshops, for example, the Banff International Research Station, and these are often used for workshops on topics in theoretical computer science. There is, however, no facility dedicated for general computing-research workshops. It would probably take about $10 million to build such a facility and approximately $2 million–$3 million annually to cover operating costs. These are modest sums in the context of the size of the North American computing-research portfolio and the size of the North American information-technology industry. Can we make it happen?

Moshe Y. Vardi, editor-in-chief
letters to the editor

DOI:10.1145/1866739.1866741

To Change the World, Take a Chance

Some of what Constantine Dovrolis said in the Point/Counterpoint “Future Internet Architecture: Clean-Slate Versus Evolutionary Research” (Sept. 2010) concerning an evolutionary approach to developing Internet architecture made sense, and, like Jennifer Rexford on the other side, I applaud and encourage the related “evolutionary” research. But I found his “pragmatic vision” argument neither pragmatic nor visionary. Worse was the impudence of the claim of “practicality.”

Mid-20th century mathematician Morris Kline said it best when referring to the history of mathematics: “The lesson of history is that our firmest convictions are not to be asserted dogmatically; in fact they should be most suspect; they mark not our conquests but our limitations and our bounds.”

For example, it took 2,000 years for geometry to move beyond the “pragmatism” of the parallel postulate, some 200 years for Einstein to overtake Newton, 1,400 years for Copernicus to see beyond Ptolemy, and 10,000 years for industrialization to supplant agriculture as the dominant economic activity. The Internet’s paltry 40–50-year history is negligible compared to these other clean-slate revolutions.

Though such revolutions generally fail, failure is often the wellspring of innovation. Honor and embrace it. Don’t chide it as “impractical.” The only practical thing to do with this or any other research agenda is to open-mindedly test our convictions and assumptions over and over…including any clean-slate options.

I worry about the blind spot in our culture, frequently choosing “practical effort” over bolder investment, to significantly change things. Who takes the 10,000-, 1,000-, or even 100-year view when setting a research agenda? Far too few. Though “newformers” fail more often than the “practical” among us, they are indeed the ones who change the world.
CJ Fearnley, Upper Darby, PA

What Deeper Implications for Offshoring?

As someone who has known offshoring for years, I was drawn to the article “How Offshoring Affects IT Workers” by Prasanna B. Tambe and Lorin M. Hitt (Oct. 2010) but disappointed to find a survey-type analysis that essentially confirmed less than what most of us in the field already know. For example, at least one reason higher-salaried workers are less likely to be offshored is they already appreciate the value of being able to bridge the skill and cultural gap created by employing offshore workers.

I was also disappointed by the article’s U.S.-centric view (implied at the top in the word “offshoring”). What about how offshoring affects IT workers in countries other than the U.S.? In my experience, they are likewise affected; for example, in India IT workers are in the midst of a dramatic cultural upheaval involving a high rate of turnover.

While seeking deeper insight into offshoring, I would like to ask someone to explain the implications of giving the keys to a mission-critical system to someone in another country not subject to U.S. law? Imagine if the relationships between countries would deteriorate, and the other country would seize critical information assets? We have pursued offshoring for years, but I have still not heard substantive answers to these questions.
Mark Wiman, Atlanta, GA

Authors’ Response:
With so little hard data on outsourcing, it is important to first confirm some of the many anecdotes now circulating. The main point of the article was that the vulnerability of occupations to offshoring can be captured by their skill sets and that the skills story is not the only narrative in the outsourcing debate.

The study was U.S.-centric by design. How offshoring affects IT workers in other countries is important, but the effects of offshoring on the U.S. IT labor market merits its own discussion.

Misappropriation of information has been studied in the broader outsourcing context; see, for example, Eric K. Clemons’s and Lorin M. Hitt’s “Poaching and the Misappropriation of Information” in the Journal of Management Information Systems 21, 2 (2004), 87–107.
Prasanna B. Tambe, New York, NY
Lorin M. Hitt, Philadelphia, PA

Interpreting Data 100 Years On

Looking to preserve data for a century or more involves two challenging orthogonal problems, one (how to preserve the bits) addressed by David S.H. Rosenthal in his article “Keeping Bits Safe: How Hard Can It Be?” (Nov. 2010). The other is how to read and interpret them 100 years on when everything might have changed: formats, protocols, architecture, storage system, operating system, and more. Consider the dramatic changes over just the past 20 years. There is also the challenge of how to design, build, and test complete systems, trying to anticipate how they will be used in 100 years. The common, expensive solution is to migrate all the data every time something changes while controlling costs by limiting the amount of data that must be preserved in light of deduplication, legal obsolescence, librarians, archivists, and other factors.

For more on data interpretation see:
1. Lorie, R.A. A methodology and system for preserving digital data. In Proceedings of the Joint Conference on Digital Libraries (Portland, OR, July 2002), 312–319.
2. Lorie, R.A. Long-term preservation of digital information. In Proceedings of the First ACM/IEEE-CS Joint Conference on Digital Libraries (Roanoke, VA, Jan. 2001), 346–352.
3. Rothenberg, J. Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation. Council on Library & Information Resources, 1999.
Robin Williams, San Jose, CA
Author’s Response:
As Williams says, the topic of my article was not interpreting preserved bits. Jeff Rothenberg drew attention to the threat of format obsolescence in Scientific American (Jan. 1995), a focus that has dominated digital preservation ever since. Rothenberg was writing before the Gutenberg-like impact of the Web transformed digital formats from private, to applications, to publishing medium, leading him and others to greatly overestimate the obsolescence risk of current formats.

I am unable to identify a single format widely used in 1995 that has since become obsolete but would welcome an example. At the 2010 iPres conference (http://www.ifs.tuwien.ac.at/dp/ipres2010/) I asked the audience whether any of them had ever been forced to migrate the format of preserved content to maintain renderability, and no one had.

Format obsolescence is clearly a threat that must be considered, but compared to the technical and economic threats facing the bits we generate, it is insignificant and the resources devoted to it are disproportionate.
David S.H. Rosenthal, Palo Alto, CA

Objects Always! Well, Almost Always

Unlike Mordechai Ben-Ari’s Viewpoint “Objects Never, Well, Hardly Ever!” (Sept. 2010), for me learning OOP was exciting when I was an undergraduate almost 30 years ago. I realized that programming is really a modeling exercise and the best models reduce the communication gap between computer and customer. OOP provides more tools and techniques for building good models than any other programming paradigm.

Viewing OOP from a modeling perspective makes me question Ben-Ari’s choice of examples. Why would anyone

    def quicksort(v)
      return v if v.nil? or v.length <= 1
      less, more = v[1..-1].partition { |i| i < v[0] }
      quicksort(less) + [v[0]] + quicksort(more)
    end

This concise implementation shows quicksort’s intent beautifully. Can a nicer solution be developed in a non-OOP language? Perhaps, but only in a functional one. Also interesting is to compare this solution with those in 30+ other languages at http://en.wikibooks.org/wiki/Algorithm_implementation/Sorting/Quicksort, especially the Java versions. OO languages are not all created equal.

But is OOP dominant? I disagree with Ben-Ari’s assertion that “…the extensive use of languages that support OOP proves nothing.” Without OOP in our toolbox, our models would not be as beautiful as they could be. Consider again Ruby quicksort, with no obvious classes or inheritance, yet the objects themselves (arrays, iterators, and integers) are all class-based and have inheritance. Even if OOP is needed only occasionally, the fact that it is needed at all and subsumes other popular paradigms (such as structured programming) supports the idea that OOP is dominant.

I recognize how students taught flowcharts first (as I was) would have difficulty switching to an OO paradigm. But what if they were taught modeling first? Would OOP come more naturally, as it did for me? Moreover, do students encounter difficulties due to the choice of language in their first-year CS courses? I’m much more comfortable with Ruby than with Java and suspect it would be a better introductory CS language. As it did in the example, Ruby provides better support for the modeling process.

ing trap that often permeates this debate. OO offers a higher level of encapsulation than non-OO languages and allows programmers to view software realistically from a domain-oriented perspective, as opposed to a solution/machine-oriented perspective.

The notion of higher levels of encapsulation has indeed permeated many aspects of programmer thinking; for example, mobile-device and Web-application-development frameworks leverage these ideas, and the core tenets of OO were envisioned to solve problems involving software development prevalent at that time.

Helping my students become competent, proficient software developers, I find the ones in my introductory class move more easily from OOP-centric view to procedural view than in the opposite direction, but both types of experience are necessary, along with others (such as scripting). So, for me, how to start them off and what to emphasize are important questions. I like objects-first, domain-realistic software models, moving as needed into the nitty-gritty (such as embedded network protocols and bus signals). Today’s OO languages may indeed reflect deficiencies, but returning to an environment with less encapsulation would mean throwing out the baby with the bathwater.
James B. Fenwick Jr., Boone, NC

The bells rang out as I read Mordechai Ben-Ari’s Viewpoint (Sept. 2010), the rare, good kind, signaling I might be reading something of lasting importance. In particular, his example of an interpreter being “nicer” as a case/switch statement; some software is simply action-oriented and does not fit the object paradigm.

His secondary conclusion, that Eastern societies place greater emphasis on “balance” than their Western counterparts, to the detriment of the
expect the example of a car to be appli- Henry Baragar, Toronto West—is equally important in soft-
cable to a real-time control system in ware. Objects certainly have their place
a car? The same applies to the “inter- I respect Mordechai Ben-Ari’s View- but should not be advocated to excess.
face” problem in supplying brake sys- point (Sept. 2010), agreeing there is Alex Simonelis, Montreal
tems to two different customers. There neither a “most successful” way of
would then be no need to change the structuring software nor even a “domi- Communications welcomes your opinion. To submit a
Letter to the Editor, please limit your comments to 500
“interface” to the internal control sys- nant” way. I also agree that research words or less and send to letters@cacm.acm.org.
tems, contrary to Ben-Ari’s position. into success and failure would inform
Consider, too, quicksort as imple- the argument. However, he seemed to
mented in Ruby: have fallen into the same all-or-noth- © 2011 ACM 0001-0782/11/0100 $10.00

January 2011 | Vol. 54 | No. 1 | Communications of the ACM 7
in the virtual extension

DOI:10.1145/1866739.1866742

In the Virtual Extension


To ensure the timely publication of articles, Communications created the Virtual Extension (VE)
to expand the page limitations of the print edition by bringing readers the same high-quality
articles in an online-only format. VE articles undergo the same rigorous review process as those
in the print edition and are accepted for publication on merit. The following synopses are from
articles now available in their entirety to ACM members via the Digital Library.

viewpoint
DOI: 10.1145/1866739.1866764
The Ephemeral Legion: Producing an Expert Cyber-Security Work Force from Thin Air
Michael E. Locasto, Anup K. Ghosh, Sushil Jajodia, and Angelos Stavrou
Although recent hiring forecasts (some thousands of new cyber-security professionals over the next three years) by both the NSA and DHS show a strong demand for cyber-security skills, such a hiring spree seems ambitious, to say the least. The current rate of production of skilled cyber-security workers satisfies the appetite of neither the public nor private sector, and if a concerted effort to drastically increase this work force is not made the U.S. will export high-paying information security jobs. In a global economy, such a situation isn’t necessarily a bad outcome, but it poses several challenges to the U.S.’s stated cyber-security plans.
The authors believe the creation of a significant cyber-security work force is not only feasible, but also will help ensure the economic strength of the U.S. Beyond offering immediate economic stimulus, the nature of these jobs demands they remain in the U.S. for the long term, and they would directly support efforts to introduce information technology into the health care and energy systems in a secure and reliable fashion. Without a commitment to educating such a work force, it is impossible to hire such a work force into existence.
From the authors’ point of view, far too few workers are adequately trained mostly because traditional educational mechanisms lack the resources to effectively train large numbers of experienced, knowledgeable cyber-security specialists. Just as importantly, many of the current commercial training programs and certifications focus on teaching skills useful for fighting the last cyberwar, not the current, nor future ones.

contributed article
DOI: 10.1145/1866739.1866766
On the Move, Wirelessly Connected to the World
Peter Fröhlich, Antti Oulasvirta, Matthias Baldauf, and Antti Nurminen
Is it possible to experience real-world landmarks through a wave, gaze, location coordinates, or touch, prompting delivery of useful digital information? Today’s mobile handheld devices offer opportunities never before possible for interacting with digital information that responds to users’ physical locations. But mobile interfaces have only limited input capabilities, usually just a keyboard and audio, while emerging multimodal interaction paradigms are beginning to take advantage of user movements and gestures through sensors, actuators, and content. For example, tourists asking about an unfamiliar landmark might point at it intuitively and would certainly welcome a handheld computer that responds directly to that interest. When passersby provide directions, the description might include local features, as in, say, “Turn right after the red building and enter through the metal gates.” They, too, would welcome being able to see these features represented in a directly recognizable way on their handhelds. Or when following a route to a remote destination, they would want to know the turns and distances they would need to take through tactile or auditory cues, without having to switch their gaze between the environment and the display.
This article explores the synthesis of several emerging research trends called Mobile Spatial Interaction, or MSI (http://msi.ftw.at), covering new interaction techniques that let users interact with physical, natural, and urban surroundings through today’s sensor-rich mobile devices.

contributed article
DOI: 10.1145/1866739.1866765
OpenSocial: An Enabler for Social Applications on the Web
Matthias Häsel
Social networking and open interfaces can be seen as representative of two characteristic trends to have emerged in the Web 2.0 era, both of which have evolved in recent years largely independently of each other. A significant portion of our social interaction now takes place on social networks, and URL-addressable APIs have become an integral part of the Web. The arrival of OpenSocial heralds a new standard uniting these two trends by defining a set of programming interfaces for developing social applications that are interoperable on different social network sites.
OpenSocial applications are interoperable within the context of multiple networks and build on standard technologies such as HTML and JavaScript. The advent of OpenSocial increases a developer’s scope and productivity considerably, as it means that applications need only be developed once, and can then be implemented within the context of any given container that supports the standard. Meanwhile, operators of social network sites are presented with the opportunity to expand on their own existing functionalities with a host of additional third-party applications, without having to relinquish control over their user data in the process.
Until it was made public in November 2007, the OpenSocial standard was driven primarily by Google. The standard was not suited to productive use at that time however, as there were several shortcomings with respect to the user interface and security. The specification is now managed by the non-profit OpenSocial Foundation and, with its 0.8 version, a stable state suitable for commercial use has been reached.

acm’s annual report for FY10

DOI:10.1145/1866739.1866768 Wendy Hall

ACM’s Annual Report

It has truly been a banner year for ACM. We firmly established ACM hubs in Europe, India, and China after years of exhaustive efforts to expand the Association’s global reach. We moved ACM’s commitment to women in computing to a new level with further development of the ACM Women’s Council and the launch of ACM-W activities in India. And, (dare I say, not surprisingly), ACM membership ended the year at another all-time high.
Increasing ACM’s relevance and influence in the global computing community has been a top priority throughout my presidency. By sharing ACM’s array of valued resources and services with a borderless audience, and by discovering, welcoming, and nurturing talent from all corners of the computing arena, ACM can truly be distinguished as the world’s leading computing society. It was therefore a great honor to host the opening days of ACM Europe, ACM India, and ACM China. The global stage has indeed been set for ACM to flourish internationally as never before.
ACM continues to play a leadership role in improving the image and health of the computing discipline. This is particularly evident with the Association’s work in influencing change for women pursuing a career in computing. Through committees and initiatives such as ACM Women’s Council, The Coalition to Diversify Computing, and the Computer Science Teachers Association (CSTA), ACM is helping to build balance, diversity, and opportunity for all who may be interested in technology. It was particularly inspiring to see members of ACM-W on hand at the launching of ACM India earlier this year, encouraging their counterparts to join forces to improve working conditions for women in computing in India. The Association’s commitment to addressing the challenges faced by women in the field today is one that every member should applaud.
The fact that membership has continued to increase for eight consecutive years is testament to the ever-growing awareness of ACM’s commitment to supporting the professional growth of its members. Indeed, by the end of FY10, spanning an acutely challenging year in global economies, the Association’s membership stood at an all-time high, thus cementing ACM’s position as the largest educational and scientific computing society in the world.
The following pages summarize some of the highlights of a busy year in the life of ACM. While much has been accomplished, there is still much to be done. In FY11, the Association will continue to grow initiatives in India, China, and Europe as well as identify other regions of the world where it is feasible for ACM to increase its level of activity. Improving the image and health of our discipline and field requires the concerted commitment of every ACM volunteer, board, chapter, committee, and member. It is through the support of devoted volunteers, members, and industry partners that ACM is able to make a real difference in the future of computing. It has been a pleasure to serve as your president during a time of such great promise.

Wendy Hall, ACM President


ACM’s Annual Report for FY10

ACM, the Association for Computing Machinery, is an international scientific and educational organization dedicated to advancing the arts, sciences, and applications of information technology.

Publications
The centerpiece of the ACM Publication portfolio is the ACM Digital Library. During the past year, 21,000 full-text articles were added to the DL, bringing total holdings to 281,000 articles. ACM’s Guide to Computing Literature is an integral part of the DL, providing an increasingly comprehensive index to the literature of computing. More than 230,000 works were added to the bibliographic database in FY10, bringing the total Guide coverage to over 1.52 million works.
Significant enhancements were made to the Digital Library and Guide this year, including a major reorganization of the core citation pages and to ACM bibliometrics. Along with content reformation, there is now greater ease of navigation and a greater selection of tools and resources.
ACM currently publishes 40 journals and Transactions, 10 magazines, and 23 newsletters. In addition, it provides primary online distribution for 10 periodicals through the Digital Library. During FY10, ACM added 364 conference and related workshop proceedings to the DL, including 45 in ACM’s International Conference Proceedings Series.
Two ACM magazines were relaunched during FY10. Crossroads, the ACM student magazine became XRDS, with a more expansive editorial scope and a more modern look to appeal to the student audience. ACM Inroads was transformed from the SIGCSE Bulletin newsletter to an ACM magazine with a wider variety of content for computer science educators.
Periodicals that were approved by the Publications Board and are now on the launching pad for FY11: ACM Transactions on Management Information Systems; ACM Transactions on Intelligent Systems and Technology; and ACM Transactions on Interactive Intelligent Systems.

Education
ACM continues to work with multiple organizations on important issues related to the image of computing and the health of the discipline and profession. In the second year of an NSF grant to develop a more relevant image for computing, ACM worked in tandem with WGBH-Boston in the creation of a new messaging campaign called “Dot Diva.” The campaign, which rolled out in the U.S. last month, is focused on ways to engage young girls with the potential of computing.
ACM and the Association for Information Systems (AIS) jointly developed new curriculum guidelines for undergraduate degree programs in information systems that for the first time include both core and elective courses suited to specific career tracks. Released in May, IS 2010 is aimed at educating graduates who are prepared to enter the work force equipped with IS-specific as well as foundational knowledge and skills. The report describes the seven core courses that must be covered in every IS program and the curriculum can be adapted for schools of business, public administration, and information science or informatics.
ACM’s Computer Science Teachers Association (CSTA) continues to support and promote the teaching of computer science at the K–12 level as well as providing opportunities and resources for teachers and students to improve their understanding of computing disciplines. CSTA’s mission is to ensure computer science emerges as a viable discipline in high schools and middle schools; it is a key partner in ACM’s effort to see real computer science count at the high school level.

Professional Development
The Professional Development Committee spearheaded the development of a new product for practitioners and managers this year called Tech Packs. These integrated learning packages were created to provide a resource for emerging areas of computing designed around an annotated bibliography of resources selected from ACM’s Digital Library, ACM’s online book and course offerings, and non-ACM resources created by experts’ recommendations on current computing topics. A Tech Pack comprises a set of fundamentally important articles on a subject with new material to provide a context and perspective on the theme. The goal is that communities might be built around Tech Packs with members commenting on selected resources and suggesting new ones.
The Professions Board Case Study program took off this year, with the first of several planned studies available online and in print. The program was designed to take an in-depth look at a company or product or technology from its inception to future plans by interviewing some of the key players involved. The inaugural case study was posted on the ACM Queue site and published in Communications of the ACM. The article was quickly slashdotted, and drew over 50,000 unique visits to the Queue site by the end of the fiscal year.
Traffic to the Queue Web site (http://queue.acm.org/) more than doubled this year over last. By the end of FY10, the site delivered nearly a million page views to nearly half a million readers.

Public Policy
Members of the U.S. Public Policy Council of ACM (USACM) had an active year interacting with policymakers in areas of e-voting, privacy, and security, as well as testifying before Congressional committees and helping develop principles for increasing the usability of government information online. Among the issues tackled this year, USACM joined a task force for the Future of American Innovation urging more funding for basic research and STEM education. Members also expressed concerns with the Cybersecurity Act of 2009, provided constructive comments on a draft of the Internet Privacy bill, and issued a response to e-voting legislation and Internet voting as it relates to military


and overseas voters.
The ACM Committee on Computers and Public Policy aids the Association with respect to a variety of internationally relevant issues pertaining to computers and public policy. The online ACM Forum on Risks to the Public in Computer and Related Systems and the “Inside Risks” column published in Communications of the ACM reflect CCPP’s long-standing dedication to policy issues on a global scale.
ACM played an active role in the National Center for Women and Information Technology (NCWIT) this year, particularly with regard to the K–12 Alliance, a coalition of educational organizations interested in helping young girls develop an interest in computer science and information technology.
The ACM Education Policy Committee (ACM EPC), established to educate policymakers about the appropriate role of computer science in the K–12 system, made major progress in bringing computer science into STEM discussions at all levels of government. Through the work of EPC, computer science is now explicitly recognized in key federal legislation as well as Department of Education regulations and initiatives. Indeed, EPC successfully led an effort that resulted in the U.S. House of Representatives declaring the week of December 7th as National Computer Science Education Week. ACM took a leadership role in steering the first CSEDWeek (held Dec. 6–12, 2009), a role the organization reprised for the second CSEDWeek held last month.

Students
ACM’s renowned International Collegiate Programming Contest (ICPC), sponsored by IBM, drew 22,000 contestants representing 1,931 universities from 82 countries. The finals were held in Harbin, China, where 103 teams competed. The top four teams won gold medals as well as employment or internship offers from IBM.
Last January, ACM Queue’s Web site offered an online programming competition based on the ICPC. The inaugural Queue ICPC Challenge, open to all Queue readers (not just students), was a huge success.
The ACM Student Research Competition (SRC), sponsored by Microsoft Research, provides a unique forum for undergraduate and graduate students to present their original research at well-known ACM-sponsored and co-sponsored conferences before a panel of judges and attendees. This venue draws an increasing number of students each year as it affords an exceptional opportunity for students to showcase their work and develop their skills as researchers.
ACM continues to cultivate its partnerships with leading technology companies, including Microsoft and Computer Associates, to offer valuable tools specifically for ACM student members. Available under the Student Academic Initiative is the Microsoft Developer Academic Alliance now offering student members free and unlimited access to over 100 software packages and the CA Academic Initiative including access to complimentary CA software.
ACM-W’s Scholarship program, which offers stipends to select students to attend research conferences worldwide, was given an extra financial boost this year with new funding from the Bangalore-based global IT services corporation Wipro and Sun Microsystems (prior to the Oracle takeover). The increased funding will allow ACM-W to offer students larger scholarships as well as enable participation by women in both international and local events.

ACM Council
President: Wendy Hall
Vice President: Alain Chesnais
Secretary/Treasurer: Barbara Ryder
Past President: Stuart I. Feldman
SIG Governing Board Chair: Alexander Wolf
Publications Board Co-Chairs: Ronald Boisvert, Holly Rushmeier
Members-at-Large: Carlo Ghezzi, Anthony Joseph, Mathai Joseph, Kelly Lyons, Bruce Maggs, Mary Lou Soffa, Fei-Yue Wang
SGB Council Representatives: Joseph A. Konstan, Robert A. Walker, Jack Davidson

ACM Headquarters
Executive Director/CEO: John R. White
Deputy Executive Director/COO: Patricia M. Ryan

2009 ACM Award Recipients
A.M. Turing Award: Charles P. Thacker
ACM-Infosys Foundation Award in the Computing Sciences: Eric Brewer
ACM/AAAI Allen Newell Award: Michael I. Jordan
The 2009–2010 ACM-W Athena Lecturer Award: Mary Jane Irwin
Grace Murray Hopper Award: Tim Roughgarden
ACM-IEEE CS 2010 Eckert-Mauchly Award: William J. Dally
Karl V. Karlstrom Outstanding Educator Award: Matthias Felleisen
Outstanding Contribution to ACM Award: Moshe Y. Vardi
Distinguished Service Award: Edward Lazowska
Paris Kanellakis Theory and Practice Award: Mihir Bellare and Phillip Rogaway
Software System Award: VMware Workstation 1.0, Mendel Rosenblum, Edouard Bugnion, Scott Devine, Jeremy Sugerman, Edward Wang
Eugene L. Lawler Award for Humanitarian Contributions within Computer Science and Informatics: Gregory D. Abowd
ACM-IEEE Ken Kennedy Award: Francine Berman
Doctoral Dissertation Award: Craig Gentry
ACM Presidential Award: Mathai Joseph, Elaine J. Weyuker
Honorable Mention: Haryadi S. Gunawi, Andre Platzer, Keith Noah Snavely

International
ACM Europe and ACM India were launched in FY10. Both organizations operate with councils established around three subcommittees: chapters; conferences; and members, awards, and volunteer leaders with the goal of increasing the presence of and generating interest in these popular ACM services.
The number of ACM Fellows, Distinguished, and Senior members from Europe has increased as has the number of ACM chapters throughout Europe.
Moreover, Microsoft Research Europe provided $50,000 to enhance the ACM Distinguished Speakers Program with a goal of delivering more high-quality, ACM-branded lectures


in Europe.
Through the efforts of ACM India, launched last January in Bangalore, the number of chapters in India has more than doubled over the last 12 months and professional membership is up over 50%.
ACM-W hosted a “Women in Computing” event at the ACM India festivities to encourage India’s women in computing to network and organize to form a community that works toward improving working and learning environments for all women in computing in India.
A new ACM China Council was officially launched in Beijing in June. Established to recognize and support ACM members and activities in China, the new council comprises a cross section of the computer science and information technology community committed to increasing the visibility and relevance of ACM in China. The group is also exploring several cooperative efforts with the China Computer Federation (CCF).
The Publications Board is playing a significant role in ACM’s China Initiative. The Board approved three specific steps aimed at improving the exposure of Western audiences to Chinese research and to further improve ties between ACM and the CCF by including Chinese translations of articles from Communications of the ACM in the Digital Library; hosting two CCF journals in the Digital Library; and developing a co-branded ACM/CCF journal.
The first two SIGSPATIAL chapters in China and Australia were chartered this year.

Balance Sheet: June 30, 2010 (in Thousands)

Assets
  Cash and cash equivalents                                          $26,463
  Investments                                                         51,720
  Accounts receivable and other assets                                 6,793
  Deferred conference expenses                                         5,151
  Fixed assets, net of accumulated depreciation and amortization         781
  Total Assets                                                       $90,908

Liabilities and Net Assets
  Liabilities:
    Accounts payable, accrued expenses, and other liabilities        $10,100
    Unearned conference, membership, and subscription revenue         22,758
    Total liabilities                                                $32,858
  Net assets:
    Unrestricted                                                      52,329
    Temporarily restricted                                             5,721
    Total net assets                                                  58,050
  Total liabilities and net assets                                   $90,908

Optional contributions fund – program expense ($000)
  Education board accreditation                                          $95
  USACM Committee                                                         20
  Total expenses                                                        $115

Electronic Community
The Communications of the ACM Web site (http://cacm.acm.org/) garnered the top award for Best New Web site by Media Business. The magazine’s special report on “Ten Great Media Web Sites” recognized Communications’ powerful search and browse functionality, deep integration with ACM’s sizable archive of computing literature, and clean, fresh look.
ACM launched a new Multimedia Center last fall that offers members free access to select videos from various areas of interest in computing as well as from some of the organization’s most popular activities and events. The Multimedia homepage (http://myacm.acm.org/dashboard.cfm?svc=mmc) features a collection of 10 videos at all times, with a new video replacing an existing one each week.
ACM is now providing its institutional library customers advanced electronic archiving services to preserve their valuable electronic resources. These services, provided by Portico and CLOCKSS, address the scholarly community’s critical need for long-term solutions that assure reliable, secure, deliverable access to their digital collection of scholarly work. ACM is offering these services to protect the vast online collection of resources in its Digital Library used by over one million computing professionals worldwide.
ACM-W unveiled a new Web site this year that offers myriad ways to celebrate, inform, and support women in computing. The redesigned site (http://women.acm.org/) has many new features, including “Women of Distinction,” highlighting women leaders; international activities of ACM-W Ambassadors and Regional Councils; and ways to get involved in attracting more young women to the computing profession.
It was a year that saw ACM, as well as most of its SIGs, establish a presence on such popular social networks as Facebook, Twitter, and LinkedIn. SIGUCCS established an online community using Ning’s social networking services and linked its portal to its new Web site (http://www.siguccs.org/) as well as initiated a series of Webinars to continue on a quarterly basis. SIGSIM’s Modeling and Simulation Knowledge Repository (http://www.acm-sigsim-mskr.org) has proven an innovative program for supplying services to the SIGSIM technical community. And SIGMOBILE sponsored programs in the mobile computing research community such as a Community Resource for Archiving Wireless Data at Dartmouth (CRAWDAD). This wireless network, created to bridge the gap between research and real-world use of wireless networks, has rapidly become one of the most critical wireless network data resources for the global research community.

Conferences
SIGGRAPH 2009 welcomed 11,000 artists, research scientists, gaming experts, and filmmakers from 69 countries to New Orleans. Exhibits at SIGGRAPH experienced the largest percentage of international participation in more than 10 years, with a total of 140 industry organizations represented. In addition, over 965 speakers participated in the conference. SIGGRAPH Asia 2009 attracted over 6,500 visitors from more than 50 countries across Asia and globally to Yokohama, Japan where over 500 artists, academics, and industry experts shared their work.
ACM’s SIG Governing Board agreed to sponsor select conferences that come to ACM without a technical tie to one of its SIGs. In FY10, SGB approved sponsorship for two conferences: The ACM International Conference on Bioinformatics and Computational Biology and the First ACM International Health Informatics Symposium.
Attendance for ASSETS 09, sponsored by SIGACCESS, exceeded all projections, drawing a record number of participants to its technical program that addressed key issues such as cognitive accessibility, wayfinding, virtual environments, and accessibility obstacles for the hearing impaired.
SIGOPS’s flagship ACM Symposium on Operating Systems Principles enjoyed record-breaking attendance; the SIG also jointly sponsored (with SIGMOD) the first annual ACM Symposium on Cloud Computing.
KDD 2009 maintained SIGKDD’s position as the leading conference on data mining and knowledge discovery, with a record number of submissions.

Recognition
The ACM Fellows Program, established in 1993 to honor outstanding ACM members for their achievements in computer science and information technology, inducted 47 new fellows in FY10, bringing the total number of ACM Fellows to 722.
ACM also recognized 84 Distinguished Members for their individual contributions to both the practical and theoretical aspects of computing and information technology. In addition, 150 Senior Members were recognized for demonstrated performances that set them apart from their peers.
There were 104 new ACM chapters chartered last year. Of the 28 new professional chapters, 26 of them were internationally based; of the 76 new student chapters, 41 of them were based internationally.

Statement of Activities: Year ended June 30, 2010 (in Thousands)

                                                         Temporarily
Revenue                                   Unrestricted    Restricted       Total
  Membership dues                               $9,201                    $9,201
  Publications                                  17,361                    17,361
  Conferences and other meetings                24,933                    24,933
  Interests and dividends                        1,707                     1,707
  Net appreciation of investments                2,442                     2,442
  Contributions and grants                       2,882          $963       3,845
  Other revenue                                    348                       348
  Net assets released from restrictions          1,057       (1,057)           0
  Total Revenue                                 59,931          (94)      59,837

Expenses
  Program:
    Membership processing and services            $945                      $945
    Publications                                11,457                    11,457
    Conferences and other meetings              25,035                    25,035
    Program support and other                    7,269                     7,269
    Total                                       44,706                    44,706
  Supporting services:
    General administration                       9,009                     9,009
    Marketing                                    1,299                     1,299
  Total expenses                                55,014                    55,014

Increase (decrease) in net assets                4,917          (94)       4,823
Net assets at the beginning of the year         47,412         5,815      53,227
Net assets at the end of the year             $52,329*        $5,721    $58,050*
* Includes SIG Fund balance of $28,448K

January 2011 | Vol. 54 | No. 1 | Communications of the ACM 13
The Communications Web site, http://cacm.acm.org,
features more than a dozen bloggers in the BLOG@CACM
community. In each issue of Communications, we’ll publish
selected posts or excerpts.

Follow us on Twitter at http://twitter.com/blogCACM

doi:10.1145/1866739.1866743 http://cacm.acm.org/blogs/blog-cacm

Smart Career Advice; Laptops as a Classroom Distraction

Jack Rosenberger shares Patty Azzarello’s life lessons about advancing in the workplace. Judy Robertson discusses students’ in-class usage of laptops.

Jack Rosenberger
“Are You Invisible?”
http://cacm.acm.org/blogs/blog-cacm/94307

Early in her career as a manager at Hewlett-Packard (HP), Patty Azzarello was put in charge of a software development team whose product life cycle took two years. It was “a ridiculously long period of time” to develop software, Azzarello says, and the length of the development cycle left both HP’s sales team and its customers unhappy and frustrated. Azzarello revamped the team, reinvented its operating mode, and reduced the software cycle to nine months. The successful delivery of the new software life cycle’s first product coincided with Azzarello’s annual performance review, and she expected to receive a healthy raise as the economy was strong, HP was performing well, and Azzarello herself was awarding significant raises to her top employees.

Azzarello’s own raise, however, was zero.

When Azzarello asked her boss why she wasn’t receiving a raise, he replied, “Because nobody knows you.”

Azzarello told this anecdote during her keynote speech at a DAC 2010 career workshop titled “More Than Core Competence…What it Takes for Your Career to Survive, and Thrive!” She learned from the experience, and proceeded to flourish at HP, becoming the youngest HP general manager at the age of 33, running a $1 billion software business at 35, and becoming a CEO at 39, she says. Today, Azzarello runs her own business management company, Azzarello Group, and gives career advice, which brought her to DAC.

Like Azzarello early in her career at HP, many employees believe that if they do their job and work hard, they will be recognized and justly rewarded. Not so, says Azzarello.

Azzarello’s career advice for employees is to make sure the work that you do is aligned with the company’s goals, bring your accomplishments to the attention of your superiors, and create a network of mentors who will guide you and ensure you win a spot on the company’s list of employees who are the leaders of tomorrow. Her advice is tersely described as: Do better, look better, and connect better.

Do Better. “The most reliable way to advance your career is to add more value to your business,” says Azzarello. You need to understand what is most important to your company in terms of your job, whether it’s cutting costs, increasing Web traffic, or improving the delivery of software products, and focus on that. Everything else is less important.

However, too many employees, like the younger Azzarello, fall into the trap of being “workhorses.” They do everything they’re asked to do, and often more, but the end result is “you’re being valued as a workhorse, not as a leader,” Azzarello says.

To get ahead in the workplace, Azzarello says it’s important to figure out how to deliver your work but also how to create free time at work during which you can manage and advance your career. After all, if you spend all of your time working, you won’t have the time or energy to understand the company and its goals (which can change), promote yourself and your accomplishments, and build relationships with mentors and fellow employees.

The most successful employees learn how to be the master of their work and not let it control them. They understand which aspects of their job are most important to their company, and focus on them. “It’s essential to be ruthless with your priorities,” Azzarello says.


“Refuse to let your time get burned up with things that are less important.”

A critical lesson Azzarello learned in her career at HP is that the “most successful executives don’t do everything. They do a few things right and hit them out of the park.”

Look Better. The second step of Azzarello’s career plan involves making your work and accomplishments known to your immediate bosses. After all, if you deliver excellent results, but no one above you in the company is aware of them or doesn’t connect the results with your job performance, it’ll be difficult for you to advance in your company.

Azzarello recommends creating an audience list of the people in your company who should know about your achievements at work and a communication plan for how to inform these key players about your work and what you’ve accomplished. The audience list should include the influencers who have a say in your career (your bosses and your bosses’ bosses) and any stakeholders who are dependent on your work. And your communication plan should describe how you will inform these influencers (usually via conversations, reports, and email) about your job and what you’ve accomplished.

For your achievements to be appreciated, Azzarello says it’s vital that they are relevant to your company’s goals. “Your priorities must be relevant to their priorities,” says Azzarello. “Your work must be recognized as matching the business’s goals.”

Connect Better. The third step of Azzarello’s career plan involves connecting with key players at your company, which involves building relationships with mentors and creating a broad network of support. “Successful people get a lot of help from others,” Azzarello says. “You can’t be successful alone.”

Azzarello stresses the importance of mentors (note the plural) at your company and outside of it, and says employees “shouldn’t attempt career advancement without mentors.” Not only can mentors help you understand a company’s culture and goals, but they, and other key players, can help you get a spot on the company’s list of employees who are viewed as up and coming.

All of this is about visibility. The company president or other top executives must know or know about you, or you must have a relationship with mentors or others who are connected to the company president and key executives. This step involves networking, and many people (and Azzarello admits she’s one of these people) are uncomfortable with meeting new people for the purpose of networking. If you’re one of these people, Azzarello’s advice is to network with the people you already know.

If Azzarello’s career advice sounds like a lot of work, you’re right: it is. Which is why she urges employees to create a yearlong plan for implementing these three stages.

For many employees, Azzarello’s advice is a real challenge. The alternative, however, is rather unsatisfying. After all, who wants a zero raise?

Judy Robertson
“Laptops in the Classroom”
http://cacm.acm.org/blogs/blog-cacm/93398

Do you ever find yourself checking your email during a boring meeting? Do you drift off on a wave of RSS feeds when you should be listening to your colleagues? Do you pretend to be taking studious notes during seminars while actually reading Slashdot? In fact, shouldn’t your full attention be somewhere else right now?

I find it increasingly tempting to do lots of things at once, or at least take microbreaks from activities to check mail or news. I do think it’s rude to do so during meetings so I try to stop myself. My students don’t tend to have such scruples. They use their laptops openly in class, and they’re not all conscientiously following along with my slides, I suspect. In fact, in a recent study, “Assessing Laptop Use in Higher Education Classrooms: The Laptop Effectiveness Scale,” published in the Australasian Journal of Educational Technology, 70% of students spent half their time sending email during class (instant messaging, playing games, and other nonacademic activities were also popular). They did also take notes and other learning tasks, but they weren’t exactly dedicated to staying on task. If you’re interested in surveying your own class to find out what they really do behind their screens, the study’s authors provide a reliable, validated questionnaire tool about laptop usage in education.

Of course, there have always been distractions during class, as the margins of my Maths 101 notes demonstrate with their elaborate doodles. It’s just that laptops make it so easy and seductive to drop your attention out of the lecture while still feeling that you are achieving something. (“I simply must update Facebook now. Otherwise people will not know I am in a boring lecture.”)

Unsurprisingly, laptop usage in class has been associated with poorer learning outcomes, poorer self-perception of learning, and students reporting feeling distracted by their own screen as well as their neighbors’ (see Carrie Fried’s “In-class Laptop Use and its Effects on Student Learning”). Many educators get frustrated by this (see Dennis Adam’s “Wireless Laptops in the Classroom [and the Sesame Street Syndrome]”) and there is debate about whether laptops should be banned, or whether the lecturer should have a big red button to switch off wireless (or to electrocute all students) when he or she can’t stand it anymore.

Bear in mind, though, the studies I mention here were conducted in lecture-style classes and the students were not given guidance on how to effectively use their laptops to help them learn rather than arrange their social lives. It is possible to design active classes around laptop use (if you can make sure that students who don’t own a laptop can borrow one) thereby making the technology work in your favor. For example, my students learn to do literature searches in class, try out code snippets, or critique the design of Web pages. And, yes, some of them still get distracted from these activities and wander off to FarmVille. But at least I have given them the opportunity to integrate their technology with their learning in a meaningful way. They are adult learners after all. It’s their decision how best to spend their brain cells in my class and my job is to give them a compelling reason to spend them on computer science rather than solitaire.

Jack Rosenberger is senior editor, news, of Communications. Judy Robertson is a lecturer at Heriot-Watt University.

© 2011 ACM 0001-0782/11/0100 $10.00

cacm online

DOI:10.1145/1866739.1866744 David Roman

Scholarly Publishing Model Needs an Update

Science demands an overhaul of the well-established system of peer review in scholarly communication. The current system is outmoded, inefficient, and slow. The only question is how!

The speed of scientific discovery is accelerating, especially in the field of computing, with an increasing number of ways to communicate results to global research communities, and to facilitate the exchange of ideas, critiques, and information through blogs, social networks, virtual meetings, and other electronic media in real time. These changes represent an enormous opportunity for scientific publishing.

Technology facilitated this acceleration, but technology alone will not provide the solution. Scientific discovery will not reduce or replace the need for good judgment and expertise, and quality should always take priority over speed. At times, these values are at odds with the speed of digital communication, and this is never more apparent than when spending a few spare moments reading general Twitter or Facebook posts in response to serious scholarly articles published online in established publications. The combination of social networking and scientific peer review is not a de facto home run.

Nevertheless, if implemented well, technology can help to serve as a springboard for positive changes to the scholarly communication process. But it’s not clear how to measure the import or impact of these activities, or their ability to truly change the current system, which is still heavily dependent on a long-established system of “publish or perish” in scholarly journals or conference proceedings. Many of the ways in which we communicate scientific discovery or conduct discourse are simply not counted in professional assessments, and this provides a negative incentive to changing the present system. The existing model of peer review is part of the problem, but the social system of rewarding only the long-established scholarly media (print/online journals and conference proceedings in the case of computer science) is also a major hurdle. The publication media that are accepted by the academic establishment happen to be those that take the most time to reach their intended readership. It is also worth noting that these media have stood the test of time. Science [continued on p. 96]

ACM Member News

Jan Camenisch Wins SIGSAC’s Outstanding Innovation Award

Jan Camenisch, a research staff member and project leader at IBM Research-Zurich, is the recipient of ACM’s Special Interest Group on Security, Audit, and Control’s (SIGSAC’s) Outstanding Innovation Award. Camenisch was recognized for outstanding theoretical work on privacy-enhancing cryptographic protocols, which led to IBM’s Identity Mixer, a system that authenticates a person’s identity while preserving their privacy.

With colleagues, Camenisch has addressed the problem of preserving privacy in distributed systems, which often require a user to disclose more personal information than is necessary to gain access to online resources.

For instance, Camenisch has developed cryptographic tools that allow a person to create a pseudonym for a subscription-based Web site. The cryptographically secure pseudonym proves the person has a subscription, but doesn’t reveal any information about the individual’s identity. It can be used in nearly all situations that require authentication, such as transacting business online with a smart card.

As technical leader of the European Union-funded Privacy and Identity Management for Europe project, which aims to give users more awareness of and control over their personal information, Camenisch is in the process of developing an entire identity management system.

He says many users of Facebook and other social media sites do not realize the extent of the footprints they are leaving on the Web, and that system designers don’t put enough emphasis on identity protection.

His advice for prospective security experts? “Get fascinated by the cryptography and believe that you can solve seemingly paradoxical problems,” Camenisch says. “And, of course, come work for IBM.”
- Neil Savage

Image courtesy of IBM Research - Zurich

news

Science | doi:10.1145/1866739.1866745 Gary Anthes

Nonlinear Systems Made Easy

Pablo Parrilo has discovered a new approach to convex optimization that creates order out of chaos in complex nonlinear systems.

Imagine you are hiking in a complex and rugged environment. You are surrounded by hills and valleys, mountains, shallow ditches, steep cliffs, and lakes. Nothing about the ground immediately around you, or your current direction, tells you much about where you will end up or what might lie in between.

Now imagine you are walking on the inside surface of a huge, smoothly shaped bowl. You can see the bottom of it, and even a few steps over the surface of the bowl tell you much about its shape and dimensions. There are no surprises.

Pablo Parrilo, a professor of electrical engineering and computer science at Massachusetts Institute of Technology (MIT), has found a way to remake the mathematical landscapes of complex, nonlinear systems into predictable smooth bowls. He has constructed a rare bridge between theoretical math and engineering that extends the frontiers of such diverse disciplines as chip design, robotics, biology, and economics.

[Photo: New algorithms devised by Pablo Parrilo, an MIT professor of electrical engineering and computer science, have made working with nonlinear systems both easier and more efficient. Photograph by Patrick Gillooly.]

Nonlinear dynamical systems are inherently difficult, especially when they involve many variables. Often they act in a linear fashion over some small region, then change radically in some other region. Water expands linearly as it warms, then explodes in volume at the boiling point. An airplane rises smoothly and ever more steeply, until it stalls. Understanding these systems often requires a great deal of prior knowledge, plus a painstaking combination of trial and error and modeling. Sometimes the models themselves are so complex their behavior can’t be predicted or guaranteed, and running realistic models can be computationally intractable.

Parrilo developed algorithms that take the complex, nonlinear polynomials in models that describe these systems and, without actually solving them, rewrite them as much simpler mathematical expressions represented as sums of squares of other functions. Because squares can never be negative, his expressions are guaranteed never to fall below zero (the bottom of a “bowl”) and are relatively straightforward to analyze

via conventional mathematical tools such as optimization techniques.

Parrilo’s transformed equations are “convex,” like bowls. Convexity in mathematics essentially means that a function is free of undulations that form local minima and maxima. “It means that if you know something about two points, then you know what’s going to happen in the middle,” Parrilo says.

“The reason that the convexity property is so important is that it allows us to make global statements from local properties,” Parrilo says. “You can sometimes give bounds on the quantity you are trying to find; essentially, you use the convexity of a function as a way of establishing whatever conclusion you want to make.” In other words, one can deduce a great deal about a bowl from visiting just a few points on it.

Once Parrilo has derived the sums-of-squares equations, the equations are solved (a minimum is found) via an optimization technique called semidefinite programming, a relatively recent extension of linear programming that works on matrices representing convex functions. (The algorithms for both steps are contained in a MATLAB toolbox called Sostools, which is available at http://www.mit.edu/~parrilo/sostools/.) While solving the original nonlinear equations is often NP-hard, Sostools can find useful bounds or even exact solutions from the convex sums-of-squares equations in polynomial time, Parrilo says. In fact, his techniques can improve efficiency so greatly (for researchers as well as their computers) that they enable fundamentally new ways of working and, in some cases, qualitatively better results.

Systems, Functions, Properties
Familiar systems (one describing energy, for example) are often defined by functions in which equilibrium exists at some minimum point, with the systems moving toward that point along smooth trajectories. “But with many systems, like a biological system, it is very different. We don’t quite know what these functions are,” Parrilo says. “So what [my] methods do, in a more or less automatic way, is find a function that has the properties of an energy function.”

Because nonlinear systems are so difficult, system designers and researchers often take the easy but wrong way out, says Elizabeth Bradley, a professor of computer science at the University of Colorado at Boulder. “Linear systems dominate our education as engineers solely because they are easy,” Bradley says. “But that leads to the lamppost problem. People look around the linear lamppost even though the answers aren’t there.” Parrilo’s contribution is important, she says, because it makes dealing with nonlinear systems much easier.

John Harrison, a principal engineer at Intel, knows what it’s like to wrestle with nonlinear systems. He develops formal proofs for the correctness of designs for floating-point arithmetic circuits. The idea is to prevent a recurrence of the floating-point division bug in the Pentium chip that cost Intel nearly $500 million in the mid-1990s. The tools he had used to do that, before discovering Parrilo’s sums of squares and semidefinite programming, were complex, time consuming, and required huge amounts of computer time, he says. Now his formal verifications typically run in “tens of seconds” rather than “many minutes or even hours,” Harrison says.

Harrison doesn’t have to find exact solutions to his equations, but can work with proofs that a polynomial will remain within certain acceptable bounds over a specified range. That’s exactly what Parrilo’s method does, he says. “The key,” he adds, “is the ability to certify the result formally, otherwise various less rigorous methods could be used.”

The broad scope of Parrilo’s concepts may mean that formal methods, which can mathematically prove or verify the correctness of designs, but usually with some difficulty, will propagate more widely, Harrison predicts.

[Figure: Phase plot of a two-dimensional dynamical system (axes x and y), and estimate of the region of attraction of the stable equilibrium at the origin. This estimate was obtained by solving a sum of squares optimization problem.]

Specifying a Robot’s Bounds
A robot walking slowly can be controlled by a relatively simple system that works linearly, says Russ Tedrake, associate professor of electrical engineering and computer science at MIT. But if the robot walks too fast or encounters some kind of disturbance, nonlinear factors kick in and the robot’s behavior becomes much more difficult to predict and control. Tedrake is using Parrilo’s sums-of-squares and semidefinite programming techniques to rigorously specify the bounds within which the robot won’t fall. He has done the same to specify when a linear control system for a flying robot will become unable to keep the machine on its desired flight path.

Tedrake builds models of his robots’ flight consisting of very complex differential equations. From those, there are well-established tools for defining workable flight paths, and given those trajectories, there are good linear systems for controlling the robot even when it deviates from the desired trajectory. But if it veers too far off, nonlinear terms can take over and the robot may crash. But now, using Parrilo’s tools, Tedrake can generate “proofs of stability” for these nonlinear terms that in essence define an envelope within which the robot will converge back to its nominal trajectory. “Proving stability in this way is important for the techniques to be accepted in many applications,” he says.

“This opens up a rigorous way of thinking of nonlinear systems in ways not possible a few years ago,” Tedrake says. “It used to be you had to be very innovative and creative to come up with a proof of stability for nonlinear systems.” Now, he says, his results are more reliable and rigorous, and he can more easily devise and evaluate alternate flight paths and control systems.

Like Harrison, Tedrake hails an efficiency breakthrough in Parrilo’s methods. Instead of laboriously verifying the stability of thousands of isolated trajectories, he can now work with just a few “regions of stability,” he says.

Parrilo’s work is noteworthy for the specific techniques and algorithms he has developed, but, at a higher level, it is impressive for its reach, says John Doyle, a professor of control and dynamical systems at the California Institute of Technology. Doyle points out that researchers have labored for decades to understand and control complex nonlinear systems, such as those that run computers, manage networks, and guide airplanes. Many useful tools (such as formal verification of software and hardware) and disciplines (such as robust control) have emerged from this work. But, Doyle says, there has not been a “unified approach” to the problem of anticipating and preventing unintended consequences in systems that are “dynamic, nonlinear, distributed and complex.”

Now, Parrilo has made a giant step toward developing just such a unified approach, Doyle says. “What Pablo said was, ‘Here is a systematic way to pursue these problems and, oh, by the way, a lot of these tricks that you guys have come up with over 30 years, in a whole bunch of fields that we didn’t see as related, are all special cases of this strategy.’ What he did is connect the dots.”

Parrilo enables one to “automate the search for proofs,” Doyle says. He explains it this way: “You have a set of bad behaviors that you don’t want the system to have, and you have a model of the system. You want to prove that the two sets don’t intersect. Pablo recognized that there was a sort of universal way to attack this set non-intersection problem.”

As for the breadth of Parrilo’s thinking, Doyle says, “The mathematicians think of Pablo as one of their own, and so do the engineers.”

Further Reading

Boyd, S. and Vandenberghe, L.
Convex Optimization, Cambridge University Press, Cambridge, U.K., 2004.

Dekker, S.
Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization (Ph.D. dissertation), California Institute of Technology, Pasadena, CA, May 2000.

Parrilo, P.A.
Semidefinite programming relaxations for semialgebraic problems, Mathematical Programming Ser. B 96, 2, 2003.

Parrilo, P.A.
Sum of squares optimization in the analysis and synthesis of control systems, 2006 American Control Conference, Minneapolis, MN, June 14–16, 2006.

Parrilo, P.A. and Sturmfels, B.
Minimizing polynomial functions, Cornell University Library, March 26, 2001, http://arxiv.org/abs/math.OC/0103170.

Gary Anthes is a technology writer and editor based in Arlington, VA.

© 2011 ACM 0001-0782/11/0100 $10.00

Education

Hispanics and STEM

Federal and state government agencies have taken steps to help more Hispanic students be trained in the science, technology, engineering, and mathematics (STEM) fields in the U.S., but Hispanic students remain severely underrepresented among STEM master’s and doctoral degree recipients. Now, a report from the Center for Urban Education, Tapping HSI-STEM Funds to Improve Latina and Latino Access to STEM Professions, offers suggestions about improving Hispanics’ STEM participation, and notes that financial issues often play an important role.

“It is clear that every computer scientist, scientist, and engineer has a role to play in diversifying the STEM fields and that the time to do it is now,” says Alicia Dowd, associate professor of higher education at the University of Southern California and codirector of the Center for Urban Education.

The report alerts STEM administrators and faculty at Hispanic-Serving Institutions (HSIs) that substantial funds will be available over the next decade to support Hispanic STEM students, and provides recommendations for how to best use those funds.

HSIs can improve the number of Hispanic students earning STEM degrees by helping them balance their reliance on loans and earnings with grants and scholarships, the report notes. It recommends that when applying for HSI-STEM funds, HSIs should increase support for intensive junior- and senior-year STEM research experiences and propose programs that incorporate research opportunities into the core curriculum rather than into special programs that may not be accessible to working adults.

The report suggests that colleges, particularly those with large Hispanic populations, should inform students about their full range of financial-aid options. It also urges colleges to recognize that many Hispanic undergraduates are supporting themselves and are more likely to work than their peers.
- Bob Violino
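To make the sums-of-squares idea from the article on Parrilo’s work concrete, here is a minimal Python sketch that checks a sum-of-squares certificate for one small polynomial. The polynomial p, the matrix Q, and the factor M are illustrative choices that do not appear in the article, and the sketch only verifies a certificate that is already given. It does not perform the semidefinite-programming search that the article attributes to Sostools.

```python
from fractions import Fraction
import random

# Illustrative polynomial (an assumption, not taken from the article):
# p(x, y) = 2x^4 + 2x^3*y - x^2*y^2 + 5y^4
def p(x, y):
    return 2 * x**4 + 2 * x**3 * y - x**2 * y**2 + 5 * y**4

# Candidate symmetric matrix Q such that p = z^T Q z
# for the monomial vector z = [x^2, y^2, x*y].
Q = [[2, -3, 1],
     [-3, 5, 0],
     [1, 0, 5]]

# Candidate factor M with M^T M = 2Q. If this identity holds, Q is
# positive semidefinite and p = (1/2) * (sum of squares of the rows of M z).
M = [[2, -3, 1],
     [0, 1, 3]]

def monomials(x, y):
    return [x * x, y * y, x * y]

def gram_form(x, y):
    """Evaluate z^T Q z at the point (x, y)."""
    z = monomials(x, y)
    return sum(Q[i][j] * z[i] * z[j] for i in range(3) for j in range(3))

def sum_of_squares(x, y):
    """Evaluate (1/2) * ((2x^2 - 3y^2 + xy)^2 + (y^2 + 3xy)^2)."""
    z = monomials(x, y)
    rows = [sum(m * zi for m, zi in zip(row, z)) for row in M]
    return Fraction(1, 2) * sum(r * r for r in rows)

def factor_matches():
    """Exact integer check that M^T M equals 2Q, entry by entry."""
    return all(
        sum(M[k][i] * M[k][j] for k in range(len(M))) == 2 * Q[i][j]
        for i in range(3) for j in range(3)
    )

def spot_check(trials=200, seed=0):
    """Compare the three forms at random rational points (exact arithmetic).

    This is a sanity check at sample points, not a proof of the identity.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = Fraction(rng.randint(-50, 50), rng.randint(1, 9))
        y = Fraction(rng.randint(-50, 50), rng.randint(1, 9))
        value = p(x, y)
        if value != gram_form(x, y) or value != sum_of_squares(x, y):
            return False
        if value < 0:
            return False
    return True

if __name__ == "__main__":
    print("M^T M == 2Q:", factor_matches())
    print("forms agree and p >= 0 at sampled points:", spot_check())
```

Because Q factors as (1/2) MᵀM, it is positive semidefinite, so this particular p can never be negative anywhere. Finding such a Q automatically, rather than checking one supplied by hand, is the step that the article describes as semidefinite programming.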


Technology | doi:10.1145/1866739.1866746 Alex Wright

The Touchy Subject


of Haptics
After more than 20 years of research and development, are haptic
interfaces finally getting ready to enter the computing mainstream?

E
ve r since the first silent- moving beyond special-purpose ap- for exploiting that capability because
mode cell phones started plications to tackle one of the defin- it’s already a background sense.”
buzzing in our pockets a few ing challenges of our age: information As people consume more informa-
years ago, many of us have overload. For many of us, a growing re- tion on mobile devices, the case for hap-
unwittingly developed a liance on screen-based computers has tics seems to grow stronger. “As screen
fumbling familiarity with haptics: tech- long since overtaxed our visual senses. size has become smaller, there is inter-
nology that invokes our sense of touch. But the human mind comes equipped est in offloading some information that
Video games now routinely employ to process information simultaneously would have been presented visually to
force-feedback joysticks to jolt their from multiple inputs—including the other modalities,” says Jones, who also
players with a sense of impending on- sense of touch. “People are not biologi- sees opportunities for haptic interfaces
screen doom, while more sophisticated cally equipped to handle the assault embedded in vehicles as early warning
haptic devices have helped doctors con- of information that all comes through systems and proximity indicators, as
duct surgeries from afar, allowed desk- one channel,” says Karon MacLean, a well as more advanced applications in
bound soldiers to operate robots in haz- professor of computer science at the surgery, space, undersea exploration,
ardous environments, and equipped University of British Columbia. and military scenarios.
musicians with virtual violins. Haptic interfaces offer the promise While those opportunities may be
Despite recent technological ad- of creating an auxiliary information real, developers will first have to over-
vances, haptic interfaces have made channel that could offload some of the come a series of daunting technical
only modest inroads into the mass cognitive load by transmitting data to obstacles. For starters, there is cur-
consumer market. Buzzing cell phones the human brain through a range of rently no standard API for the various
and shaking joysticks aside, develop- vibrations or other touch-based feed- force feedback devices on the market,
ers have yet to create a breakthrough back. “In the real world things happen although some recent efforts have re-
product—a device that would do for on the periphery,” says Lynette Jones, a sulted in commercial as well as open
haptics what the iPhone has done for senior research scientist at Massachu- source solutions for developing soft-
touch screens. The slow pace of market setts Institute of Technology. “It seems ware for multiple haptic hardware
acceptance stems partly from typical like haptics might be a good candidate platforms. And as haptic devices grow
new-technology growing pains: high
production costs, the lack of standard
application programming interfaces
(APIs), and the absence of established
user interface conventions. Those is-
sues aside, however, a bigger question
looms over this fledgling industry:
What are haptics good for, exactly?
Computer scientists have been ex-
ploring haptics for more than two de-
cades. Early research focused largely
on the problem of sensory substitu-
tion, converting imagery or speech
information into electric or vibratory
stimulation patterns on the skin. As
the technology matured, haptics found
new applications in teleoperator systems
and virtual environments, useful
for robotics and flight simulator applications.
Today, some researchers think the
big promise of haptics may involve

About the size of a cat, the Haptic Creature produces different sensations in response to human touch. Insert: The Haptic Creature with furry skin. (Photographs by Steve Yohanan)


more complex, engineers will have to optimize for a much more diverse set of sensory receptors in the human body that respond to pressure, movement, and temperature changes.

As the range of possible touch-based interfaces expands, developers face a further hurdle in helping users make sense of all the possible permutations of haptic feedback. This lack of a standard “haptic language” may prove one of the most vexing barriers to widespread market acceptance. Whereas most people have by now formed reliable mental models of how certain software interfaces should work (keyboards and mice, touchpads, and touch screens, for example), the ordinary consumer still requires some kind of training to associate a haptic stimulation pattern with a particular meaning, such as the urgency of a phone call or the status of a download on a mobile device.

The prospect of convincing consumers to learn a new haptic language might seem daunting at first, but the good news is that most of us have already learned to rely on haptic feedback in our everyday lives, without ever giving it much thought. “We make judgments based on the firmness of a handshake,” says Ed Colgate, a professor of mechanical engineering at Northwestern University. “We enjoy petting a dog and holding a spouse’s hand. We don’t enjoy getting sticky stuff on our fingers.” Colgate believes that advanced haptics could eventually give rise to a set of widely recognized device behaviors that go well beyond the familiar buzz of cell phones. For now, however, the prospect of a universal haptic language seems a distant goal at best.

“Until we have a reasonably mature approach to providing haptic feedback, it’s hard to imagine something as sophisticated as a haptic language arising,” says Colgate, who believes that success in the marketplace will ultimately hinge on better systems integration, along the lines of what Apple has accomplished with the iPhone. “Today, haptics is thought of as an add-on to the user interface,” says Colgate. “It may enhance usability a little bit, but its value pales in comparison to things you can do with graphics and sound. In many cases, the haptics is so poorly implemented that people turn it off pretty quickly. And that’s not to criticize the developers of haptics; it’s just a tough problem.”

Many efforts to date have used haptics as a complementary layer to existing screen-based interfaces. MacLean argues that haptics should do more than just embellish an interaction already taking place on the screen. “A lot of times you’re using haptics to slap it on top of a graphical interaction,” she says. “But there can also be an emotional improvement, a comfort and delight in using the interface.”

Led by Ph.D. candidate Steve Yohanan, MacLean’s team has built the Haptic Creature, a device about the size of a cat that simulates emotional responses. Covered with touch sensors, the Haptic Creature creates different sensations (hot, cold, or stiffening its “ears”) in response to human touch. The team is exploring possible applications such as fostering companionship in older and younger people, or treating children with anxiety disorders.

MacLean’s team has also developed an experimental device capable of buzzing in 84 different ways. After giving users a couple of months to get familiar with the feedback by way of an immersive game, they found that the process of learning to recognize haptic feedback bore a great deal of similarity to the process of learning a language.

HPC

Students Build Green500 Supercomputer

A team of students at the University of Illinois at Urbana-Champaign (UIUC) has built an energy-efficient supercomputer that appeared on both the Green500 and Top500 lists. Named in honor of one of the UIUC campus’s main thoroughfares, the Green Street supercomputer placed third in the Green500 list of the world’s most energy-efficient supercomputers, with a performance of 938 megaflops per watt. It also placed 403rd in the Top500 list, a ranking of the world’s fastest supercomputers, with a performance of 33.6 teraflops.

The Green Street supercomputer grew out of an independent study course led by Bill Gropp, the Bill and Cynthia Saylor Professor of Computer Science, and Wen-mei Hwu, who holds the AMD Jerry Sanders Chair of Electrical and Computer Engineering. Approximately 15 UIUC undergraduate and graduate students helped build the supercomputer, which boasts a cluster of 128 graphics processing units donated by NVIDIA, and uses unorthodox supercomputer building materials, such as wood and Plexiglas.

The UIUC team hopes to increase the supercomputer’s energy efficiency by 10%–20% with better management of its message passing interface and several other key elements. “You really need to make sure that the various parts of your communications path, in terms of different software layers and hardware drivers and components, are all in tune,” says Hwu. “It’s almost like when you drive a car, you need to make sure that all these things are in tune to get the maximum efficiency.”

The Green Street supercomputer is being used as a teaching and research tool.
Graeme Stemp-Morlock
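The two figures quoted in the sidebar (938 megaflops per watt and 33.6 teraflops) imply a rough power budget for the machine. The sketch below works through that arithmetic in Python. It assumes, and this is only an assumption, that both figures describe the same benchmark run, and that the hoped-for 10%–20% efficiency gain would be realized at unchanged performance; the function names are illustrative.

```python
# Figures quoted in the sidebar.
EFFICIENCY_MFLOPS_PER_WATT = 938.0   # Green500 efficiency
PERFORMANCE_TFLOPS = 33.6            # Top500 performance


def implied_power_kw(perf_tflops: float, mflops_per_watt: float) -> float:
    """Power implied by performance / efficiency (assumes the same run)."""
    perf_mflops = perf_tflops * 1e6          # 1 teraflop = 1e6 megaflops
    watts = perf_mflops / mflops_per_watt
    return watts / 1000.0


def power_after_gain_kw(baseline_kw: float, gain: float) -> float:
    """Power at the same performance if efficiency improves by `gain`."""
    return baseline_kw / (1.0 + gain)


baseline_kw = implied_power_kw(PERFORMANCE_TFLOPS, EFFICIENCY_MFLOPS_PER_WATT)
print(f"implied power: {baseline_kw:.1f} kW")
for gain in (0.10, 0.20):
    # Hypothetical scenario: efficiency rises, performance stays fixed.
    print(f"+{gain:.0%} efficiency: {power_after_gain_kw(baseline_kw, gain):.1f} kW")
```

Under those assumptions the implied draw is on the order of a few tens of kilowatts, which gives a sense of why tuning the communications path is worth the effort.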


“The surprising thing is that people are able to quickly learn an awful lot and learn it without conscious attention,” says MacLean. “There’s a lot of potential for people to learn encoded signals that mean something not in a representational way but in an abstract way without conscious attention.”

To date, most low-cost haptic interfaces have relied exclusively on varying modes of vibration, taking advantage of the human skin’s sensitivity to movement. But vibration constitutes the simplest, most brute-force execution of haptic technology. “Unfortunately,” says Colgate, “vibration isn’t all that pleasing a sensation.”

Some of the most interesting research taking place today involves expanding the haptic repertoire beyond the familiar buzz of the vibrating cell phone. At MIT, Jones’ team has conducted extensive research into human body awareness and tactile sensory systems, examining the contribution of receptors in the skin and muscles to human perceptual performance. In one study, Jones demonstrated that users were unable to distinguish between two thermal inputs presented on a single finger pad; instead, they perceived it as a single stimulus, demonstrating the tendency of thermal senses to create “spatial summation” rather than fine-tuned feedback.

Colgate’s research has focused on a fingertip-based interface that provides local contact information using new actuation technologies including shear skin stretch, ultrasonic, and thermal actuators. By varying the friction in correspondence with fingertip motion across a surface, the interface can simulate the feeling of texture or a bump on the surface. Compared with force-feedback technology, vibrotactile stimulators, known as tactors, are much smaller in size and more portable, although high-performance tactors with wide bandwidths, small form factors, and independently controllable vibrational frequency and amplitude are still hard to come by at a reasonable cost.

The Northwestern researchers have figured out how to make transparent force sensors that can capture tactile feedback on a screen, so that they can be combined with a graphical display. “My ideal touch interface is one that can apply arbitrary forces to the finger,” says Colgate, whose team has been approaching the problem by combining friction control with small lateral motions of the screen itself.

By controlling the force on the finger, the system can make parts of the screen feel “magnetic” so that a user’s finger is pulled toward them (up, down, left, right) or let a user feel the outline of a button on the screen where none exists. Colgate’s team is also exploring how to develop devices using multiple fingers, each on a different variable friction interface.

Looking ahead, Colgate believes the evolution of haptic interfaces may follow the trajectory of touch screens: a technology long in development that finally found widespread and relatively sudden acceptance in the marketplace. “The technology has to be sufficiently mature and robust, there has to be an active marketplace that creates competition and drives down costs, and it has to meet a real need.”

As production costs fall and new standards emerge, as they almost certainly will, the marketplace for touch-based devices may yet come into its own. Until that happens, most of the interesting work will likely remain confined to the labs. And the future of the haptics industry seems likely to remain, well, a touchy subject.

Further Reading

Chubb, E.C., Colgate, J.E., and Peshkin, M.A.
ShiverPaD: a glass haptic surface that produces shear force on a bare finger, IEEE Transactions on Haptics 3, 3, July–Sept., 2010.

Ferris, T.K. and Sarter, N.
When content matters: the role of processing code in tactile display design, IEEE Transactions on Haptics 3, 3, July–Sept., 2010.

Jones, L.A. and Ho, H.-N.
Warm or cool, large or small? The challenge of thermal displays, IEEE Transactions on Haptics 1, 1, Jan.–June, 2008.

MacLean, K.E.
Putting haptics into the ambience, IEEE Transactions on Haptics 2, 3, July–Sept., 2009.

Ryu, J., Chun, J., Park, G., Choi, S., and Han, S.H.
Vibrotactile feedback for information delivery in the vehicle, IEEE Transactions on Haptics 3, 2, April–June, 2010.

Alex Wright is a writer and information architect who lives and works in Brooklyn, NY. Hong Z. Tan, Purdue University, contributed to the development of this article.

© 2011 ACM 0001-0782/11/0100 $10.00

Obituary

Watts Humphrey, Software Engineer: 1927–2010


Watts Humphrey, who distinguished himself as the “father of software quality engineering,” died on October 28 at age 83 at his home in Sarasota, FL. Humphrey combined business practices with software development, and brought discipline and innovation to the process of designing, developing, testing, and releasing software.

“Watts had a profound impact on the field,” says Anita Carleton, director of the Software Engineering Process Management Program at the Carnegie Mellon Software Engineering Institute (SEI). “He was a visionary, a wonderful leader, and a wonderful man.”

After receiving B.S. and M.S. degrees in physics from the University of Chicago and the Illinois Institute of Technology, respectively, and an MBA from the University of Chicago, Humphrey went to work at IBM. There, he headed a team that introduced software licenses in the 1960s. Humphrey focused on how disciplined and experienced professionals, working as teams, could produce high-quality, reliable software within committed cost and schedule constraints.

In 1986, after a 27-year career as a manager and executive at IBM, Humphrey joined SEI and founded the school’s Software Process Program. He led the development of the Software Capability Maturity Model and eventually the Capability Maturity Model Integration (CMMI), a framework of software engineering best practices now used by thousands of organizations globally.

Humphrey was also the author of 11 books, including Managing the Software Process. An ACM and SEI Fellow, he was awarded the National Medal of Technology in 2005.
Samuel Greengard


Society | doi:10.1145/1866739.1866747 Marina Krakovsky

India’s Elephantine
Effort
An ambitious biometric ID project in the world’s second most
populous nation aims to relieve poverty, but faces many hurdles.

Despite India’s economic boom, more than a third of the country remains impoverished, with 456 million people subsisting on less than $1.25 per day, according to the most recent World Bank figures. Government subsidies on everything from food to fuel have tried to spread the nation’s wealth, but rampant corruption has made the redistribution pipeline woefully inefficient.

The “leakage” happens in part because the benefits aren’t directed at specific individuals, says Salil Prabhakar, a Silicon Valley-based computer scientist who is working as a volunteer for the World Bank as part of the Unique ID (UID) project, a massive biometrics initiative aimed at overhauling the current system. “If I as the government issue $1 of a benefit,” Prabhakar explains, “I don’t know where it’s going. I just know that there’s a poor person in some remote village, and I hope it reaches them.”

A 95-year-old Indian man has his fingerprints scanned as part of the Unique ID project. (Photograph by Sanjit Das/Panos)

The current system relies on a chain of middlemen, many of them corrupt bureaucrats at various levels of government, who collectively siphon off 10% or more of what’s due to the poor and resell the goods and services on the black market. For example, according to Transparency International, officials extracted $212 million in bribes alone from Indian households below the poverty line in 2007.

The UID project, helmed by the much-admired former Infosys Technologies CEO Nandan Nilekani and operated by the Unique Identification Authority of India, a government agency, promises to be the first step in the solution. Recently renamed Aadhaar (meaning “foundation” in Hindi), the UID project plans to assign a unique 16-digit number to each citizen above the age of 18 who wants national identification, and to link that number with the owner’s biometric data: all 10 fingerprints, an iris scan, and a headshot (plus four hidden “virtual” digits). Aadhaar’s national enrollment was launched in September, with the goal of issuing 100 million ID numbers by March and 600 million within four years. Like the Social Security number in the U.S., the number won’t guarantee government aid, but your biometrics will prove the UID is yours, letting you claim whatever benefits you’re entitled to. In theory, the result should be the end of counterfeit ration cards and other fraud, as well as making it easier for hundreds of millions of Indian adults to gain access to banking services for the first time. And because the system will work nationwide, Aadhaar should make it possible for the poor to move without losing benefits. The lower-income Indians love the idea, says Prabhakar, who witnessed what he describes as “almost a stampede” during a recent proof-of-concept enrollment.

The full implementation, though, is fraught with problems, most of which stem from the project’s sheer size, given India’s population of 1.2 billion. “Biometric systems have never operated on such a massive scale,” says Arun Ross, an associate professor of computer science at West Virginia University.

One of the biggest challenges is de-duplication. When a new user tries to enroll, the system must check for duplicates by comparing the new user’s data against all the other records in the UID database. Hundreds of millions of records make this a computationally demanding process, made all the more so by the size of each record, which includes up to 12 higher-resolution images.

The demands continue each time there’s an authentication request. “The matching is extremely computationally intensive,” says Prabhakar. At peak times, the system must process tens of millions of requests per hour while responding in real time, requiring massive data centers the likes of Google’s.

Achieving acceptable levels of accuracy at this scale is another major difficulty. Unlike passwords, biometrics never produce an exact match, so matching always entails the chance of false accepts and false rejects, but as the number of enrollments rises, so do the error rates, since it becomes more likely that two different individuals will share similar biometrics. Using a combination of biometrics rather than, say, a single thumbprint greatly improves accuracy and deters impostors. (In the words of Marios Savvides, assistant research professor in the department of electrical and computer engineering at Carnegie Mellon University, “It’s hard to spoof fingerprints, face, and iris all at the same time.”) But using multiple biometrics requires extra equipment, demands information fusion, and adds to the data processing load.

Other steps to improve accuracy also bring their own challenges. “The key issue,” says Nalini Ratha, a researcher at the IBM Watson Research Center, “is have I captured enough variation so I don’t reject you, and at the same time I don’t match against everybody else?” Capturing the optimal amount of variation requires consistent conditions across devices in different settings, no easy feat in a country whose environment varies from deserts to tropics and from urban slums to far-flung rural areas. “It’s almost like having many different countries in a single country, biometrically speaking,” says Ross.

The challenge isn’t just to reduce errors; under some conditions, a biometric reader may not work at all. “If it’s too hot, people sweat and you end up with sweaty fingers,” says Prabhakar, “and if it’s too dry, the finger is too dry to make good contact with the optical surface of the scanner.” Normalizing across varied lighting conditions is essential, since all of the biometric data is optical.

India’s diverse population presents a whole other set of hurdles. Many of the poor work with their hands, but manual labor leads to fingertips so callused or dirty they can’t produce usable fingerprints. And some of the most unfortunate residents are missing hands or eyes altogether.

Security Challenges

As if these problems weren’t enough, the UID system poses formidable security challenges beyond the threat of spoofing. “People get carried away by one type of attack, a fake finger, a fake mask, or something,” says IBM’s Ratha, “but there are probably 10 other attacks to a biometric system that can compromise the system.”

For starters, when data is stored in a centralized database, it becomes an attractive target for hackers. Another vulnerability is the project’s reliance on a network of public and private “registrars” (such as banks, telecoms, and government agencies) to collect biometric data and issue UIDs. Though registrars might ease enrollment, they’re not necessarily worthy of the government’s trust. Banks, for example, have been helping wealthy depositors evade taxes by opening fictitious accounts, so entrusting the banks with biometric devices doesn’t make sense, says Sunil Abraham, executive director of the Centre for Internet and Society in Bangalore. “If I’m a bank manager, I can hack into the biometric device and introduce a variation in the fingerprint because the device is in my bank and the biometric is, once it’s in the computer, just an image sent up the pipe,” he says. Though careful monitoring could catch such hacks, Abraham says that’s not realistic once you’ve got as many records as Aadhaar will have.

Registrars may also make UIDs, which are officially voluntary, a de facto requirement for services, especially in the current absence of a law governing how the data can be used. Such “function creep” troubles privacy advocates like Malavika Jayaram, a partner in the Bangalore-based law firm Jayaram & Jayaram, who says, “If every utility and every service I want is denied to me without a UID card, how is it voluntary?” The loss of civil liberties is too high a price to pay for a system that she believes leaves gaping opportunities for continued corruption. “The guy handing out the bags of rice could ask for a bribe even to operate the machine that scans the fingerprints, or he could say that the machine isn’t working,” says Jayaram. “And there’s every chance the machine isn’t working. Or he could say, ‘I don’t know who you are and I don’t care; just pay me 500 rupees and I’ll give you a bag of rice.’ All the ways that humans can subvert the system are not helped by this scheme.”

Abraham suggests a more effective way to root out fraud through biometrics would be to target the much smaller number of residents who own most of the country’s wealth, much of it ill-gotten. “The leakage is not happening at the bottom of the pyramid,” he says. “It’s bureaucrats and vendors and politicians throughout the chain that are corrupt.”

Despite all the technical and social challenges, Nandan Nilekani’s UID project is on course to provide 100 million Indian residents with a Unique ID by March. Will Nilekani’s UID scheme work? Only time will tell. “But if there’s anybody in India who’s capable of pulling it off, it’s him,” says Abraham. Meanwhile, the hopes of millions of India’s poor are invariably tied to the project’s success.

Further Reading

Bolle, R.M., Connell, J.H., Pankanti, S., Ratha, N.K., Senior, A.W.
Guide to Biometrics. Springer, New York, NY, 2004.

Jain, A.K., Flynn, P. and Ross, A. (Eds.)
Handbook of Biometrics. Springer, New York, NY, 2007.

Pato, J.N. and Millett, L.I. (Eds.)
Biometric Recognition: Challenges and Opportunities. The National Academies Press, Washington, D.C., 2010.

Ramakumar, R.
High-cost, high-risk, FRONTLINE 26, 16, Aug. 1–14, 2009.

Ross, A. and Jain, A.K.
Information fusion in biometrics, Pattern Recognition Letters 24, 13, Sept. 2003.

Marina Krakovsky is a San Francisco area-based journalist and co-author of Secrets of the Moneylab: How Behavioral Economics Can Improve Your Business.

© 2011 ACM 0001-0782/11/0100 $10.00
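The article's two scaling concerns, de-duplication against every stored record and error rates that climb with enrollment, can be made concrete with a little arithmetic. The Python sketch below is only an illustration, not a description of how the UID system works. It assumes each new enrollee is compared once against every record already stored, and that comparisons are statistically independent with a single per-comparison false match rate. The rate used here (one in a billion) is a hypothetical value chosen for illustration; the enrollment targets of 100 million and 600 million come from the article.

```python
import math


def dedup_comparisons(total_enrollments: int) -> int:
    """Total 1:N comparisons if the k-th enrollee is checked against
    the k-1 records already in the database: n * (n - 1) / 2."""
    n = total_enrollments
    return n * (n - 1) // 2


def false_match_probability(fmr: float, gallery_size: int) -> float:
    """Chance that one new enrollee falsely matches at least one stored
    record: 1 - (1 - fmr) ** gallery_size, assuming independent
    comparisons. Uses log1p/expm1 to stay accurate for tiny fmr."""
    return -math.expm1(gallery_size * math.log1p(-fmr))


HYPOTHETICAL_FMR = 1e-9  # assumption for illustration, not a measured rate

for enrolled in (100_000_000, 600_000_000):  # targets quoted in the article
    comparisons = dedup_comparisons(enrolled)
    p_false = false_match_probability(HYPOTHETICAL_FMR, enrolled)
    print(f"{enrolled:>11,} enrolled: {comparisons:.2e} comparisons, "
          f"P(false match for next enrollee) = {p_false:.3f}")
```

Even with a very small per-comparison error rate, the chance that a newcomer collides with someone already enrolled grows steadily with the size of the database, which is one reason fusing several biometrics is attractive despite the extra processing load.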


Milestones | doi:10.1145/1866739.1866767 Jack Rosenberger

EMET Prize
and Other Awards
Edward Felten, David Harel, Sarit Kraus, and others
are honored for their contributions to computer science,
technology, and electronic freedom and innovation.

The A.M.N. Foundation, Franklin Institute, Electronic Frontier Foundation, and other organizations recently recognized leading computer scientists and technologists.

EMET Prize
David Harel and Sarit Kraus were honored with EMET Prizes by the A.M.N. Foundation for the Advancement of Science, Art, and Culture in Israel for excellence in the computer sciences. Harel, a professor of computer science at the Weizmann Institute of Science, was recognized for his studies on a wide variety of topics within the discipline, among them logic and computability, software and systems engineering, graphical structures and visual languages, as well as modeling and analysis of biological systems.

Kraus, a professor of computer science at Bar-Ilan University, was recognized for her expertise in the field of artificial intelligence, along with her significant contributions to the field of autonomous agents, and studies in the field of multiagent systems.

FTC Chief Technologist
Edward Felten, a professor of computer science and public affairs at Princeton University, was named the U.S. Federal Trade Commission’s first Chief Technologist. Felten, a vice-chair of ACM’s U.S. Public Policy Council, will advise the federal agency on evolving technology-related issues of consumer protection, such as online privacy and cybersecurity, and antitrust matters, including tech industry mergers and anticompetitive behavior.

Edward Felten, the U.S. Federal Trade Commission’s first Chief Technologist. (Photograph: Princeton University, Office of Communications, Denise Applewhite)

EFF Pioneer Awards
The Electronic Frontier Foundation presented its 2010 Pioneer awards to four individuals who are extending freedom and innovation in the digital world. The honorees are:

Steven Aftergood, who directs the Federation of American Scientists Project on Government Secrecy, which works to reduce the scope of official secrecy and to promote public access to government information;

James Boyle, William Neal Reynolds Professor of Law and cofounder of the Center for the Study of the Public Domain at Duke Law School, who was recognized for his scholarship on the “second enclosure movement” (the worldwide expansion of intellectual property rights) and its threat to the public domain of cultural and scientific materials that the Internet might otherwise make available;

Pamela Jones, a blogger, and her Web site Groklaw, were honored for the creation of a new style of participatory journalism and distributed discovery, which has enabled programmers and engineers to educate lawyers on technology relevant to legal cases of significance to the Free and Open Source community, and which, in turn, has taught technologists about the workings of the legal system; and

Hari Krishna Prasad Vemuru, a security researcher in India, who revealed security flaws in India’s paperless electronic voting machines, and endured jail time and political harassment to protect an anonymous source who enabled him to conduct the first independent security review of India’s e-voting system.

Franklin Institute Laureate
John R. Anderson, R.K. Mellon University Professor of Psychology and Computer Science at Carnegie Mellon University, was named a 2011 Laureate by the Franklin Institute and awarded the Benjamin Franklin Medal in Computer and Cognitive Science “for the development of the first large-scale computational theory of the process by which humans perceive, learn, and reason, and its application to computer tutoring systems.”

Prince Philip Designers Prize
Bill Moggridge, director of the Smithsonian Cooper-Hewitt National Design Museum, was awarded the 2010 Prince Philip Designers Prize, an annual award by the Duke of Edinburgh that recognizes a lifetime contribution to design, for his contributions to the GRiD Compass laptop. Released in 1982, the GRiD Compass is widely credited as being the forerunner of today’s modern laptop.

Jack Rosenberger is senior editor, news, of Communications.

© 2011 ACM 0001-0782/11/0100 $10.00

10 Years of Celebrating Diversity in Computing

2011 Richard Tapia Celebration of Diversity in Computing Conference
April 3-5, 2011
San Francisco, California
http://tapiaconference.org/2011/

Since 2001, the Tapia Celebration of Diversity in Computing has served as a leading forum for bringing together students, professors and professionals to discuss and strengthen their passion and commitment to computing. The 2011 program will include featured speakers who are exemplary leaders and rising stars in academia and industry, such as:

• Irving Wladawsky-Berger, former chair of the IBM Academy of Engineering and the 2001 HENAAC Hispanic Engineer of the Year, will give the Ken Kennedy Memorial Lecture on “The Changing Nature of Research and Innovation in the 21st Century.”
• Deborah Estrin, the Jon Postel Professor of Computer Science at UCLA and a member of the National Academy of Engineering, will talk on “Participatory Sensing: from Ecosystems to Human Systems.”
• Alan Eustace, Senior Vice President of Engineering and Research at Google, will give an after dinner talk entitled “Organizing the World’s Information.”
• Ayanna Howard, Associate Professor in the ECE School at Georgia Tech who Technology Review selected as a 2003 Young Innovator, will give the talk “SnoMotes - Robotic Scientific Explorers for Understanding Climate Change.”

The Tapia Conference 2011 will continue past popular sessions, including the Student Poster Session, Resume and Early Career Advice Workshops, Town Hall Meeting, Banquet, and the Doctoral Consortium, a daylong program designed to help equip students for the grueling challenge of finishing their doctorates. There will also be attendee-proposed BOFs and panels. A new program will connect students with computing professionals from around the San Francisco Bay Area, opening the door to future opportunities. A special outing will take in the sights of San Francisco. Conference program news and registration information can be found at: http://tapiaconference.org/2011/

Tapia Conference 2011 supporters include: Google (Platinum); Intel (Gold); Cisco, Microsoft and NetApp (Silver); Symantec (Bronze); Amazon, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, and the National Center for Atmospheric Research (Supporter). The Tapia Conference 2011 is organized by the Coalition to Diversify Computing and is co-sponsored by the Association for Computing Machinery and the IEEE Computer Society, in cooperation with the Computing Research Association.

Take Advantage of
ACM’s Lifetime Membership Plan!
◆ ACM Professional Members can enjoy the convenience of making a single payment for their
entire tenure as an ACM Member, and also be protected from future price increases by
taking advantage of ACM's Lifetime Membership option.
◆ ACM Lifetime Membership dues may be tax deductible under certain circumstances, so
becoming a Lifetime Member can have additional advantages if you act before the end of
2010. (Please consult with your tax advisor.)
◆ Lifetime Members receive a certificate of recognition suitable for framing, and enjoy all of
the benefits of ACM Professional Membership.

Learn more and apply at:


http://www.acm.org/life

V
viewpoints

doi:10.1145/1866739.1866748 Phillip G. Armour

The Business of Software


Don’t Bring Me a Good Idea
How to sell process changes.

“You want to know how to get my attention?” Jason Kalich asked the audience rhetorically. “First off, don’t bring me a good idea; I’ve already got plenty of good ideas.” Kalich, the general manager of Microsoft’s Relationship Experience Division, was participating in the keynote panel at the Quest Conference in Chicago.a The three industry experts on the panel were addressing the question asked by the moderator Rebecca Staton-Reinstein: “How can I get my manager’s buy-in (to software quality and process change initiatives)?” The audience consisted of several hundred software professionals, most of them employed in the areas of software quality, testing, and process management. Kalich had clearly given the topic a lot of thought and he warmed to the theme: “Don’t even bring me cost savings. Cost savings are nice, but they’re not what I’m really interested in.” He paused for emphasis. “Bring me revenue growth and you’ve got my ear. Bring me new value, new products, new customers, new markets: then you’ve got my attention, then you’ve got my support. Don’t bring me a good idea. Not interested.”

a http://www.qaiquest.org/chicago/index.html

Sponsorship
Obtaining sponsorship for software development process changes is essential. The first of Watts Humphrey’s Six Rules of Process Change is “start at the top”: get executive sponsorship for whatever change you are trying to make.1 Without solid and continuing executive commitment to support changes they usually wither on the vine. But just how does a software quality professional get this support? Kalich and the other panelists were adamant that a good idea, even when supported by possible cost savings, just doesn’t cut it in the current economic climate.

What the panel was saying is that good ideas are just that. And cost reduction, while valuable, tends to be quite incompressible: once the first 10%–20% of cost savings are achieved, further savings usually become increasingly difficult to get. Reducing costs is like compressing a spring; it may require more and more energy for less and less movement.

I looked around the audience and, while there were nods of understanding, there were also many blank stares as people tried to figure out: How can I turn my process initiative into a profit center? Making money is not a typical goal of process change as it is usually practiced which, according to the panel, might be why it doesn’t always get the support it might.

But how to actually do it? That afternoon, I attended a presentation that showed how it can be done, and what critical success factors are needed to make it work.

SmartSignal
“Predictive Analytics is a really complex data set,” said George Cerny, “our


Figure: Knowledge-containing artifacts involved in testing. The figure shows the following artifacts and the knowledge each contains:
Specification and Design Documents: specified/designed behavior and functionality.
Expected Test Results: expected behavior and functionality.
System Under Test: as constructed (executable) behavior and functionality.
Expected Test Setup: initial system, environment, data, and input states.
Expected Environment: expected characterization and behavior of runtime/test time environment.
Test Execution System: executable behavior and functionality of test system.
Actual Test Results: actual exhibited/executed behavior and functionality.
Test Results Comparison: difference between expected and observed behavior.
Legend: manual process (paper medium); automated process (software medium).

systems predict the possible failure of commercial aircraft, power stations, and oil rigs sometimes weeks before a failure might actually occur.” Cerny is the quality assurance manager at SmartSignal,[b] an Illinois-based data analytics company.

[b] http://www.smartsignal.com

To manage predictive analytics, large and complex systems must be instrumented and enormous amounts of complicated data must be collected from many different sources: pumps, power meters, pressure switches, maintenance databases, and other devices. Sometimes data is collected in real time, sometimes it is batched. Simple data is monitored for threshold conditions and complex interactive data is analyzed for combinational conditions. The analysis system must recognize patterns that indicate the future possibility of component, subsystem, or systemic failure and what the probability of that failure might be. And then it needs to report what it finds. Sometimes these reports are large and detailed; sometimes they are urgent and immediate.

“But before all this happens, the analytic system must be set up,” Cerny said. “This setup was manual and data-entry intensive. A single power station might have hundreds of items of equipment that need to be monitored. Each item might have hundreds of measurements that must be taken over short, medium, and long timeframes. Each measurement might be associated with many similar or different measurements on the same device or on other equipment.” The screen flashed with list after list of data items. “So how could we test this? How could we make sure the system works before we put it in?”

Testing a System
Testing is the interaction of several knowledge-containing artifacts, as shown in the accompanying figure. Some of these artifacts must be in executable software form, but many others are often in a paper format and are processed manually.

Cerny described this: “We realized early we had to test using virtual machines, but how could we test these? And how could we ensure scalability with both the numbers and the complexities of environments and inputs?” To the testing group at SmartSignal, test automation was clearly a good idea. But how to get sponsorship for this good idea?

Jim Gagnard, CEO of SmartSignal, put it this way: “We are a software company whose products measure quality and everything is at risk if we aren’t as good as we can be in everything we do. Leaders can help define and reinforce the culture that gets these results, but if it’s not complemented with the right people who truly own the issues, it does not work.”

Dave Bell, vice president of Application Engineering, and Stacey Kacek, vice president of Product Development at SmartSignal, concurred. “We always have to be looking to replace


what people do manually with the automated version of the same,” said Kacek, “…once it works.” Bell added: “While the management team understood the advantages of test automation and we all have an engineering background, we had to keep asking: how to get buy-in?” “Our driver was to find creative ways for our customers to make decisions, what we call ‘speed to value,’” Kacek asserted.

Some Steps
Cerny described a few of the steps they took at SmartSignal to build their automated test system:
˲˲ Build to virtual machines and virtualization to isolate device dependence;
˲˲ Start simply using comma delimited scripts and hierarchical tree data views;
˲˲ Build up a name directory of functions;
˲˲ Separate global (run in any environment) from local variables;
˲˲ Keep object recognition out of scripts, use both static and dynamic binding; and
˲˲ Initially automate within the development team to prove the concept before moving to production.

Test Knowledge is System Knowledge
These steps are typical engineering design actions anyone might take in automating testing or, indeed, in automating any process or any system. But in this case there was a difference.

“Asset configuration is a big issue in the power industry,” Cerny said in his presentation at Quest. “Imagine setting up a power station: what equipment should go where? Which pumps are used and connected to which other equipment? Where are sensors to be placed? What is the ‘best’ configuration of equipment that will most likely reduce the overall failure rate of the plant?”

The analytics test system is designed to prove the analytical system itself works. To do this, the test system must be set up (automatically, of course) to the appropriate target system configuration. The normal test function is meant to show that the analytical system will work as built for that particular target system. But what if we turn this around? What if we use our testing capability to find out what configuration would show the lowest likely failure rate? Doing this allows field engineers and power plant designers to model different configurations of systems for least likelihood of failure before they actually build and install them.

The knowledge in the test system is the same as the knowledge in the target system. The knowledge of how to set up the test system is also the knowledge of how to set up the target production system. Automating this knowledge allows simulation of a system before it is built.

A Really Good Idea
This is what Jason Kalich and the panel at Quest were looking for. Automating the test system at SmartSignal ended up being not simply about speeding things up a bit, making the testers’ lives easier, or saving a few dollars. It was not just about cranking through a few more tests in limited time or reducing test setup, analysis, and reporting time. It became something different: it became a configuration simulator, and that’s a new product.

If we automate knowledge in the right way, even internal software process knowledge, it can be used in many different ways and it can even be used to create new functionality and new products that our customers will pay for.

Now that’s a good idea.

Reference
1. Humphrey, W. Managing the Software Process. Prentice Hall, New York, 1989, 54.

Phillip G. Armour (armour@corvusintl.com) is a senior consultant at Corvus International Inc., Deer Park, IL.

Copyright held by author.

Calendar of Events

January 17–20
The Thirteenth Australasian Computing Education Conference, Perth QLD Australia, Contact: Michael de Raadt, Email: deraadt@usq.edu.au

January 22–26
Fifth International Conference on Tangible, Embedded, and Embodied Interaction, Funchal, Portugal, Contact: Mark D. Gross, Email: mdgross@cmu.edu

January 23–29
The 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Austin, TX, Sponsored: SIGPLAN, Contact: Thomas J. Ball, Email: tball@microsoft.com

January 25–28
16th Asia and South Pacific Design Automation Conference, Yokohama, Japan, Sponsored: SIGDA, Contact: Kunihiro Asada, Email: asada@silicon.u-tokyo.ac.jp

January 28–30
Global Game Jam, Multiple Locations (TBA), Contact: Bellamy Gordon, Email: gordon@igda.org

February 2–5
Fourth ACM International Conference on Web Search and Data Mining, Kowloon, Hong Kong, Sponsored: SIGKDD, SIGMOD, SIGWEB, SIGIR, Contact: Irwin K. King, Email: king@cse.cukhk.edu.hk

February 9–10
International Symposium on Engineering Secure Software and Systems, Madrid, Spain, Contact: Ulfar Erlingsson, Email: ulfar@yahoo.com

February 9–12
Fourth ACM International Conference on Web Search and Data Mining, Kowloon, Hong Kong, Sponsored: SIGWEB, SIGIR, SIGKDD, and SIGMOD, Contact: Irwin K. King, Email: king@cse.cukhk.edu.hk

viewpoints

doi:10.1145/1866739.1866749 Stefan Bechtold

Law and Technology


Google AdWords and
European Trademark Law
Is Google violating trademark law by operating its AdWords system?

When the dot-com boom began in the late 1990s, many analysts and observers proclaimed the death of intermediation. Supply chains seemed to become shorter and shorter as new B2C companies emerged in Silicon Valley. These companies could deal with their customers directly over the Internet, rendering distributors, wholesalers, brokers, and agents superfluous.

While some traditional middlemen have indeed become less important as Internet commerce has developed, we have not seen a general death of intermediation. Rather, many new intermediaries have arisen on the digital landscape over the last 15 years. Just think of Amazon, eBay, or Google. If all these companies have been successful, it is not because they have removed all barriers between producers and consumers. They have been successful because they offer innovative services located between producers and consumers along the digital supply chain.

The law often has a difficult time coping with new intermediaries. Should an Internet service provider be held liable for violations of copyright or criminal law committed by its customers? Is Yahoo obliged to prevent French consumers from accessing a site where Nazi memorabilia is sold? Can copyright holders compel peer-to-peer file sharing systems to remove copyrighted material or to screen for such material? Are domain name registries required to check domain name registrations for trademark violations? Is eBay liable for counterfeit product sales on its site? To what extent should Google be allowed to offer excerpts from copyrighted books in its Google Book service without the consent of the relevant rights owners?

[Photograph: The European Court of Justice in Luxembourg. Photograph by Gwenaël Piaser.]

Both in the U.S. and in Europe, such questions have led to countless lawsuits and legislative initiatives over the last 15 years. One of the most debated issues in recent years has been whether Google is violating trademark law by operating its AdWords system. With Google AdWords, advertisers can buy advertising links in the “sponsored links” section of a Google search results page. When a user enters a keyword selected by the advertiser, the advertising link will appear in the upper right-hand corner of the search results page. In principle, the advertiser is free to select any keyword for his advertising link. This becomes a legal issue, however, if the advertiser chooses a


keyword that has been registered as a trademark by another company.

In 2003, the French fashion house Louis Vuitton discovered that, when French users entered “Louis Vuitton” into Google, they were shown an advertising link pointing to fake LV products. While LV could have sued the product imitator, it decided to sue Google. From LV’s perspective, Google was a very attractive target: If Google was found liable, LV would not need to sue numerous individual product imitators. With one lawsuit against Google, LV could stop all keyword-related trademark violations at a stroke. Google, on the other hand, has a vital interest in avoiding being held liable in such lawsuits. Google’s business model relies extensively on the advertisement auctioning mechanisms underlying the AdWords system. Of Google’s $23.6 billion gross revenues in 2009, about $22.9 billion came from advertising (see http://investor.google.com/financial/tables.html). A major part of this advertising revenue is believed to come from Google AdWords.

Case Studies
Cases such as LV’s have popped up like mushrooms over the last few years in many countries. From a trademark law perspective, they are not easy to resolve. On the one hand, it seems unfair that, by choosing third-party trademarks for keyword registrations without proper authorization, firms can benefit from the goodwill of such marks. It also seems problematic that Google may benefit, at least indirectly, from such behavior. On the other hand, trademark law does not protect trademark owners against each and every use of their registered marks by others. Where the Google AdWords system lies along this continuum is unclear.

In the French lawsuit of Louis Vuitton vs. Google, a Paris regional court found Google guilty of infringing LV’s trademark in February 2005. After an appeals court in Paris had upheld this decision, Google appealed to the Cour de Cassation, which is the highest French court in this area of the law. The court had to decide whether or not Google AdWords was in compliance with French trademark law. At this point in the story the European Union kicks in. France has had a comprehensive trademark system since 1857. However, in 1989, the European Union required its member states to amend their national trademark systems in order to make them compliant with the European Trademark Directive enacted that year. This directive did not create a unitary Europewide trademark system. Rather, it harmonized national trademark systems across countries.[a] Today, if there is some disagreement about how a particular provision of national trademark law should be interpreted and whether this provision is affected by the European Trademark Directive, it is the European Court of Justice that has the last word. This was the case with the French LV litigation. As the highest court in France could not itself decide the case, in 2008, this court referred it to the European Court of Justice, which is located in Luxembourg.

[a] As a separate measure, the European Trademark Regulation of 1994 created a European-wide trademark system that is administered by the European trademark office (officially named the “Office for Harmonization in the Internal Market”) in Alicante, Spain. As a result, two trademark systems now exist in Europe: the national trademark systems that are administered by national trademark offices and enforced by national courts, and the European trademark system that is administered by the European trademark office and is also enforced by national courts.

The intellectual property community eagerly awaited the European Court of Justice’s decision in this case. It was of particular importance because courts in various European countries had reached wildly different conclusions as to whether Google’s AdWords system violates trademark law. Courts in France and Belgium, and some courts in Germany, had ruled that the AdWords system violates trademark law or unfair competition law, on the grounds that Google is using trademarks, confusing consumers, and free-riding on the goodwill of trademark owners. Courts in the U.K. and other courts in Germany have ruled the opposite, while decisions in Austria and the Netherlands have come out somewhere between these opposing viewpoints. Ultimately, in addition to the French Cour de Cassation, the highest courts in Austria, Germany, the Netherlands, and the U.K. have referred AdWords-related lawsuits to the European Court of Justice.

In March 2010, the European Court of Justice decided the French LV case.[b] The court held that a producer of fake LV products violates trademark law if his keyword-backed advertising link creates the impression that his products are actually produced, or at least authorized, by LV. This holding by the court was not surprising. More surprising was the court’s holding that the fake product producer would violate trademark law even if he kept his advertisement so vague that ordinary consumers would be unable to determine whether or not there was some affiliation between the producer and LV. What this means in practice is unclear. While the European Court of Justice settled the relevant points of law, it did not provide a final answer as to whether the fake product producer was actually infringing trademark law. This depends on whether French consumers were really confused by the advertising link in question. As such matters of fact are not for the European Court of Justice to decide, the court referred the case back to the French courts in this regard.

[b] This decision covered not only the lawsuit between LV and Google, but also two other related lawsuits in France, which will not be considered here. In addition, as of November 2010, the European Court of Justice has also ruled on AdWords-related cases from Austria, Germany, and the Netherlands. No decision on the U.K. case (Interflora) had been issued at the time of writing.

The court then turned to the liability of Google itself. The court held that


Google was not using the LV trademark in its AdWords system in a manner covered by European trademark law. The idea behind this is simple. Trademark law does not entitle a trademark owner to prevent all utilization of his trademark by a third party. In the view of the court, Google is merely operating a service that may enable advertisers to engage in trademark violations. Google does not decide which trademarks to use as keywords, but merely provides a keyword selection service. This is not sufficient, in the view of the court, to justify an action for direct trademark infringement.

However, Google might still be liable for what lawyers call secondary infringement. The argument would be that, if advertisers actually infringe trademark law because they create customer confusion in the AdWords system, Google is benefiting financially from these trademark violations. While this argument may sound convincing at first sight, the European E-Commerce Directive of 2000 restricts the liability of “information society service providers” (such as, potentially, Google) for infringing activities by third parties (the advertisers). Therefore, the European Court of Justice had to decide whether the safe harbor provisions of this directive shielded Google from secondary liability. The European Court of Justice held that the answer to this question depends on whether the Google AdWords system is a mere automatic and passive system, as portrayed by Google, or whether Google plays an active role in selecting and ordering advertisements. As in the customer confusion question, the court refrained from giving any definite answer, but rather referred the case back to the French courts.

In the popular press, the European Court of Justice’s decision in the Google AdWords case has often been portrayed as a victory for Google. Does victory really look like this? Well, it depends. The European Court of Justice refrained from providing a final answer as to whether keyword advertising can lead to customer confusion. Nor did it provide a comprehensive answer as to whether Google could be held liable not because of customer confusion, but because other goals of trademark protection had been violated. Finally, the court did not give a definite answer as to whether Google should be protected by safe harbor provisions. For most of these questions, the European Court of Justice provided some general guidelines, but left it to the national courts to rule on details which may be small, but decisive. Therefore, in Europe, it will ultimately be the national courts which will decide on the liability of Google for its AdWords system. We still lack a clear answer on how to design a keyword-backed advertisement system in a way that clearly does not violate European trademark law.

Indecisive Decision
This does not mean that one should feel sorry for Google, which still has to operate in an area of somewhat unsettled law. First, Google has some experience in this regard. Just think of the Google Books project. Second, Google has been running its AdWords service in the U.S. for years, and in the U.S. the liability question is still not fully settled. In 2009, the Court of Appeals for the Second Circuit held that Google was using trademarks “in commerce” (as required by the Lanham Act) when operating its AdWords system,[c] thereby taking a slightly different stance from that of the European Court of Justice. The impact of this decision on Google AdWords in the U.S. remains to be seen. At least, courts in the U.S. will now examine more closely whether unauthorized trademark-backed advertising links in Google AdWords can really cause confusion among consumers. Up to now, most U.S. courts have denied Google’s liability on such grounds.

[c] Rescuecom Corp. v. Google, Inc., 562 F.3d 123 (2009). This decision did not rule on the ultimate question of Google’s liability, as the Court of Appeals remanded the case back to the district court for further proceedings. In March 2010, the parties settled their dispute out of court.

Third, as a result of the decisions by the European Court of Justice relating to the AdWords system, Google revised its European AdWords trademark policy in September 2010 and limited its support for trademark owners. Under the new policy, advertisers are free to select trademarks when registering advertising links. However, if a trademark owner discovers that an advertiser is using his trademark without proper authorization, Google will remove the advertising link if the trademark is being used in a confusing manner, for example if it falsely implies some affiliation between the advertiser and the trademark owner. By this policy change, Google has mollified at least some trademark owners and provided a mechanism outside the court system that may resolve a substantial proportion of AdWords trademark disputes in Europe. Nevertheless, it is almost certain that national courts in Europe will continue to rule on the details of how the AdWords trademark policy is implemented and enforced.

Conclusion
In the end, the decision by the European Court of Justice may indeed turn out to be a victory for Google. Whether it is a victory for the European trademark system is less clear. While the European Court of Justice provided some general guidelines on Google AdWords, the task of working out the little details has been left to courts in Paris, Vienna, Karlsruhe, The Hague, London and other cities. The danger is that national courts will continue to interpret European trademark law in different ways. French courts, for example, may continue to be more critical of Google AdWords in their decisions than German or U.K. courts. This is not exactly the idea of a trademark system which is supposed to be harmonized across Europe by the institutions of the European Union.

Stefan Bechtold (sbechtold@ethz.ch) is Associate Professor of Intellectual Property at ETH Zurich and a Communications Viewpoints section board member.

Copyright held by author.

viewpoints

doi:10.1145/1866739.1866750 Michael A. Cusumano

Technology Strategy
and Management
Reflections on the Toyota Debacle
A look in the rearview mirror reveals system and process blind spots.

Various experts in industry and academia have long recognized that Toyota, founded in 1936, is one of the finest manufacturing companies the world has ever seen.[a] Over the past 70-plus years, Toyota has evolved unique capabilities in manufacturing, quality control, supply-chain management, and product engineering, as well as sales and marketing. It began perfecting its famous Just-in-Time or “Lean” production system in 1948. I am a longtime observer (and customer) of Toyota, and have recently tried to understand how such a renowned company could experience the kinds of quality problems that generated numerous media headlines during 2009–2010.[b]

[a] See, for example, J. Womack et al., The Machine that Changed the World (1990); or J. Liker, The Toyota Way (2003).

[b] My first book, The Japanese Automobile Industry (1985), presented a history of how the Just-in-Time system was developed at Toyota. A later book, Thinking Beyond Lean (1998), examined Toyota’s product development system. My most recent book, Staying Power (2010), looks back at Toyota’s manufacturing, product development, and learning capabilities as well as how it ended up with these quality problems in 2009–2010.

[Photograph: Example of an unsecured driver-side floor mat trapping the accelerator pedal in a 2007 Lexus ES350. Photograph by AP Photo/NHTSA.]

First, to recount some of the facts: Between 1999 and 2010, at least 2,262 Toyota vehicles sold in the U.S. experienced unintended cases of rapid acceleration and are associated with at least 815 accidents and perhaps as many as 102 deaths. The incidents that were not due to driver error (stepping on the gas pedal instead of the brake) appear to be the result of sticky brake pedals (easily fixed with a metal shim to replace a plastic component) as well as loose floor mats that inadvertently held down the gas pedal.[c] Another possible cause is the software that controls the engine and braking functions, particularly in vehicles such as the Prius and Lexus hybrids, which were also involved in the complaints. Toyota also encountered other quality problems that it mostly kept out of the headlines, in particular dangerous corrosion in the frames of Tacoma and Tundra pickup trucks sold in North America between 1995 and 2000, apparently due to improper antirust treatment. Toyota did not recall these trucks, but silently bought them back from consumers.[d]

[c] There are numerous reports on the Toyota problem in the media and information available from Toyota directly. A particularly detailed early document is Toyota Sudden Unintended Acceleration; www.safetyresearch.net. Also see “U.S. Safety Agency Reviewing More Crashes,” The Wall Street Journal (Feb. 15, 2010); http://online.wsj.com; and “Toyota’s Sudden Acceleration Blamed for More Deaths,” Los Angeles Times (Mar. 26, 2010); http://articles.latimes.com/2010/mar/26/business/la-fi-toyota-deaths26-2010mar26.

[d] Toyota’s buyback program covered Tacoma pickup trucks made between 1995 and 2000. See “Toyota Announces Tacoma Buyback Program for Severe Rust Corrosion,” The Consumerist (Apr. 15, 2008); http://consumerist.com/379734/toyota-announces-tacoma-buyback-program-for-severe-rust-corrosion.

ACM Transactions on Accessible Computing

This quarterly journal publishes refereed articles addressing issues of computing as it impacts the lives of people with disabilities. The journal will be of particular interest to SIGACCESS members and delegates to its affiliated conference (i.e., ASSETS), as well as other international accessibility conferences.

www.acm.org/taccess
www.acm.org/subscribe

There have also been some minor complaints about the driving mechanisms in the Corolla and Camry models, and stalling in some Corolla models. Overall, during a 12-month period, Toyota recalled some 10 million vehicles through August 2010, an extraordinary number given that the company sold only approximately seven million vehicles during this same period.[2]

In the software business, producers and consumers are accustomed to product defects and an occasional recall as well as lots of “patches” or product fixes (see my earlier Communications column, “Who is Liable for Bugs and Security Flaws in Software?” March 2004, p. 25). Compared to automobiles, though, software product technology is relatively new, and the design and engineering processes are highly complex, especially for large systems with many interdependent components. But to what can we attribute so many quality problems in an industry as mature as automobiles and in a company so renowned for quality and manufacturing? Moreover, when even the mighty Toyota can falter, what does it say about “staying power”: the ability of a firm to sustain a competitive advantage and keep renewing or expanding its capabilities?

Systems and Managerial Process Problems
One way to think about the Toyota debacle is to divide the problem into categories: the production system, the product development system, and, for lack of a better term, general management. These systems and managerial processes also reflect intangible corporate values such as what kind of commitment the organization has to quality and customer satisfaction.

The Toyota production system does not seem to be the cause of the quality problems experienced over the prior decade. In the past, Toyota has exhibited a significant advantage over its mass-producer competitors in physical and value-added productivity. The competition has improved, but it is unlikely that any firm has actually passed Toyota in manufacturing prowess. Data related to manufacturing or assembly quality, such as the number of defects reported by customers in newly purchased vehicles, generally has placed Toyota at the top of the auto industry or at least among the leaders. This past year was different due to the recalls: Toyota fell from sixth to 21st in the annual J.D. Power survey of initial quality.[3] However, the recent quality problems expose the limits of Toyota’s production system. Making components or receiving supplier deliveries “just-in-time” as the assembly lines need the components minimizes inventory and operating costs, and exposes quality problems visible to assembly workers. But it does not detect design flaws that surface during usage of a product.

In terms of product development, including design and testing processes, Toyota has slipped a notch. The company seems to have tried too hard to reduce costs due to rising competition from low-cost but high-quality competitors such as Hyundai in Korea or new entrants in China. It is clearly a lapse in design and testing when accelerator pedals get stuck on loose floor mats, or when new types of plastic pedal materials become sticky after being exposed to moisture and friction. It is also a problem of design and testing when drivers feel that braking software or on-board computer controls and sensor devices seem to malfunction or operate crudely. Toyota’s engineers and U.S. government safety investigators have not been able to replicate the conditions that caused some customers to complain about these software-related problems. But the kinds of problems we saw in 2009–2010 indicate Toyota engineers need to do a better job in product de-

34 communications of th e ac m | ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1
viewpoints

velopment to make sure their vehicles ment. But one observation is that, al-
work properly under all conditions. though we can learn a lot about best
Whether the component comes from Even the best firms practices from looking at exemplar
an in-house Toyota factory or a sup- are likely to decline firms and their unique processes,
plier makes no difference. Toyota en- like Just-in-Time production, we also
gineers are responsible. at least a little as need to have some perspective. An
In terms of general management, competitors catch up enduring management principle that
such as of the supply chain and over- truly differentiates firms over the long
all quality, Toyota clearly failed to live or when managers haul must also be separable from the
up to its historical standards. Execu- lose their focus. experience of any particular firm, in-
tives within the company admit they cluding the originator. This sounds
overstretched their managerial re- like a contradiction but it is not. Ev-
sources and overseas supply chain in ery company, market, and country
the push to overtake General Motors as will experience ups and downs. Even
the world’s largest automaker, which the best firms are likely to decline at
Toyota finally did in 2009. More spe- Toyota redefined mass production least a little as competitors catch up
cifically, the quality problems appear and built its reputation around qual- or when managers lose their focus.
connected to overly rapid expansion ity and reliability by paying attention Moreover, success often brings with it
of production and parts procurement to details, large and small. The recent the potential seeds of decline—such
outside Japan, particularly given the slew of recalls definitely indicates as increases in the size, complexity,
decision to use a different brake pedal. something changed for the worse in and global scale of operations, which
In the past, Toyota manufactured new the company. can be much more difficult to man-
models in Japan initially for a couple of What shocked me most was that the age. In this case, Toyota’s quality
years, using carefully tested Japanese quality lapses seemed to take Toyota’s problems in 2009–2010 do not mean
parts, and only then did it move pro- senior managers by such surprise. the principles of “lean production” or
duction of the best high-volume mod- CEO Akio Toyoda, and other senior ex- lean management more generally are
els to overseas factories. Over the last ecutives in the U.S. and Japan, admit- any less valuable to managers. What
decade, by contrast, Toyota ramped ted to having little or no information managers need to understand are the
up overseas production of new and old about these quality issues, which first limitations of any best practice as well
models with new suppliers much more surfaced in Europe. They were unpre- as the potential even for great compa-
quickly and, apparently, with inade- pared to explain the source or nature of nies to lose their focus and attention
quate stress testing. the problems—to themselves or to the to detail—at least temporarily.
Also at the management level, Toyota global media. Toyota also made its pre- The best outcome for Toyota will
executives seem to have paid increas- dicament worse by responding much be for managers, engineers, and other
ingly less attention to product and pro- too slowly to customer complaints and employees to reflect deeply on what
cess details. It may well be that Toyota allowing bad news to leak out sporadi- happened to them and use these in-
managers as well as staff engineers cally, while executives continued to sights to create an even stronger com-
believed their company had already deny—at least initially—that there was pany. They should become better able
reached such a high level of perfec- a real problem. to handle adversity and change in the
tion that there was nothing much Companies with true staying power future because they now know what
to worry about. But automobiles are fix their problems and recover from failure looks like. The Toyota way used
themselves very complex systems, with their mistakes. Here, Toyota has not to be that one defect was too many.
lots of hardware and software, and as disappointed us. By the fall of 2010, That is the kind of thinking that Toyota
many as 15,000 discrete components. Toyota managers and dealers had got- seems to be regaining.
It is not surprising that some things ten their act together and were work-
go wrong and recalls are common in ing hard to rebuild customer confi- References
1. Ackman, D. Tire trouble: The Ford-Firestone blowout.
the industry. Other automakers over dence. The problems seemed mostly Forbes.com (June 20, 2001).
the past year recalled more than 10 contained to the pedals and floor mats, 2. Bunkley, N. and Vlasic, B. Carmakers initiating
more recalls voluntarily. The New York Times (Aug.
million vehicles, not counting the though Toyota also upgraded some of 24, 2010); http://www.nytimes.com/2010/08/25/
Toyota recalls.2 In the grand scheme the software in its hybrid vehicles. Ser- automobiles/25recall.html
3. Welch, W. Toyota plunges to 21st in auto-quality
of things, moreover, the number of vice technicians worked overtime for survey; Ford makes Top 5, Bloomberg (June 17, 2010);
accidents and even deaths attributed months to fix recalled vehicles. Sales http://www.bloomberg.com/news/2010-06-17/toyota-
plunges-to-21st-in-j-d-power-quality-survey-ford-
to Toyota are not so large compared and profits recovered. And Toyota now makes-top-five.html.
4. Wikipedia. Firestone and Ford tire controversy.
to what other companies have experi- recalls any vehicle immediately with Wikipedia.com.
enced. For example, Ford had a mas- even the slightest hint of a problem.
sive recall in 2000 of some 13 million Michael A. Cusumano (cusumano@mit.edu) is a
professor at the MIT Sloan School of Management and
faulty tires made by Firestone and fit- Technology and Management School of Engineering and author of Staying Power: Six
ted on its Explorer SUVs, reportedly Lapses and Lessons Enduring Principles for Managing Strategy and Innovation
in an Uncertain World (Oxford University Press, 2010).
resulting in over 250 deaths and 3,000 The Toyota debacle offers many les-
catastrophic injuries.1,4 Nonetheless, sons about technology and manage- Copyright held by author.


doi:10.1145/1866739.1866751 Mark D. Ryan

Viewpoint
Cloud Computing Privacy Concerns on Our Doorstep
Privacy and confidentiality issues in cloud-based conference management systems reflect more universal themes.

Cloud computing means entrusting data to information systems that are managed by external parties on remote servers “in the cloud.” Webmail and online documents (such as Google Docs) are well-known examples. Cloud computing raises privacy and confidentiality concerns because the service provider necessarily has access to all the data, and could accidentally or deliberately disclose it or use it for unauthorized purposes.

Conference management systems based on cloud computing represent an example of these problems within the academic research community. It is an interesting example, because it is small and specific, making it easier to explore the exact nature of the privacy problem and to think about solutions. This column describes the problem, highlights some of the possible undesirable consequences, and points out directions for addressing it.

[Illustration by Gary Neill]

Conference Management Systems
Most academic conferences are managed using software that allows the program committee (PC) members to browse papers and contribute reviews and discussion via the Web. In one arrangement, the conference chair downloads and hosts the appropriate server software, say HotCRP or iChair. The benefits of using such software are familiar:
• Distribution of papers to PC members is automated, and can take into account their preferences and conflicts of interest;
• The system organizes the collection and distribution of reviews and discussion, can rank papers according to scores, and send out reminder email, as well as email notifications of acceptance or rejection; and
• It can also produce a range of other reports, such as lists of sub-reviewers, acceptance statistics, and the conference program.

HotCRP and iChair require the conference chair to download and install software, and to host the Web server. Other systems such as EasyChair and EDAS work according to the cloud computing model: instead of installing and hosting the server, the conference chair simply creates the conference account “in the cloud.” In addition to the benefits described previously, this model has extra conveniences:
• The whole business of managing the server (including backups and security) is done by someone else, and gains economy of scale;
• Accounts for authors and PC members exist already, and don’t have to be managed on a per-conference basis;
• Data is stored indefinitely, and reviewers are spared the necessity of keeping copies of their own reviews; and


• The system can help complete forms such as the PC member invitation form and the paper submission form by suggesting likely colleagues based on past collaboration history.

For these reasons, EasyChair and EDAS are an immense contribution to the academic community. According to its Web page, EasyChair hosted over 3,300 conferences in 2010. Because of its optimizations for multiconferences and multitrack conferences, it is mandated for conferences and workshops that participate in the Federated Logic Conference (FLoC), a huge multiconference that attracts approximately 1,000 paper submissions.

Data Privacy Concerns
Accidental or deliberate disclosure. A privacy concern with cloud-computing-based conference management systems such as EDAS and EasyChair arises because the system administrators are custodians of a huge quantity of data about the submission and reviewing behavior of thousands of researchers, aggregated across multiple conferences. This data could be deliberately or accidentally disclosed, with unwelcome consequences.
• Reviewer anonymity could be compromised, as well as the confidentiality of PC discussions;
• The acceptance success records could be identified, for individual researchers and groups, over a period of years; and
• The aggregated reviewing profile (fair/unfair, thorough/scant, harsh/undiscerning, prompt/late, and so forth) of researchers could be disclosed.

The data could be abused by hiring or promotions committees, funding and award committees, and more generally by researchers choosing collaborators and associates. The mere existence of the data makes the system administrators vulnerable to bribery, coercion, and/or cracking attempts. If the administrators are also researchers, the data potentially puts them in situations of conflict of interest.

The problem of data privacy in general is of course well known, but cloud computing magnifies it. Conference data is an example in our backyard. When conference organizers had to install the software from scratch, there was still a risk of breach of confidentiality, but the data was just about one conference. Cloud computing solutions allow data to be aggregated across thousands of conferences over decades, presenting tremendous opportunities for abuse if the data gets into the wrong hands.

Beneficial data mining. In addition to the abuses of conference review data described here, there are some uses that might be considered beneficial. The data could be used to help detect or prevent fraud or other kinds of unwanted behavior, for example, by identifying:
• Researchers who systematically unfairly accept each other’s papers, or rivals who systematically reject each other’s papers, or reviewers who reject a paper and later submit to another conference a paper with similar ideas; and
• Undesirable submission patterns and behaviors by individual researchers (such as parallel or serial submissions of the same paper; repeated paper withdrawals after acceptance; and recurring content changes between submitted version and final version).

The data could also be used to understand and improve the way conferences are administered. ACM, for example, could use the data to construct quality metrics for its conferences, enabling it to profile the kinds of authors who submit, how much “new blood” is entering the community, and how that changes over different editions of the conference. This could help identify conferences that are emerging as dominant, or others that have outlived their usefulness.

The decisions about who is allowed to mine the data, and for what purposes, are difficult. Policies should be decided transparently and by consensus, rather than being left solely to the de facto data custodians.

Ways Forward
Policies and legislation. An obvious first step is to articulate clear policies that circumscribe the ways in which the data is used. For example, a simple policy might be that the data gathered during the administration of a conference should be used only for the management of that particular conference. Adherence to this policy would imply that the data is deleted after the conference, which is not done in the case of EasyChair (I don’t know if it is done for EDAS). Other policies might allow wider uses of the data. Debate within different academic communities can be expected to yield consensus about which practices are to be allowed in a discipline, and which ones not. For example, some communities may welcome plagiarism detection based on previously reviewed submissions, while others may consider it useless for their subject, or simply unnecessary.

Since its inception in 2002 and up to the time of writing, EasyChair has appeared not to have any privacy policy, or any statement about the purposes and possible uses of the data it stores. There is no privacy policy linked from its main page, and a search for “privacy policy” (or similar terms) restricted to the domain “easychair.org” does not yield any results. I have been told that new users are presented with a privacy statement at the time of first signing up to EasyChair. I did not create a new account to test this; regardless, the privacy statement is not linked from anywhere or later findable via search. EDAS does have an easily accessed privacy policy, which (while not watertight) appears to comply with the “use only for this conference” principle.

Another direction would be to try to find alternative custodians for the data—custodians that are not themselves also researchers participating actively in conferences. The ACM or IEEE might be considered suitable, although they contribute to decisions about publications and appointments of staff and fellows. Professional data custodians such as Google might also be considered. It may be difficult to find an ideal custodian, especially if cost factors are taken into account.


In most countries, legislation exists to govern the protection of personal data. In the U.K., the Data Protection Act is based on eight principles, including the principle that personal data is obtained only for specified purposes and is not processed in a manner incompatible with the purposes; and the principle that the data is not kept longer than is necessary for the purposes. EasyChair is hosted in the U.K., but the lack of an accessible purpose statement or evidence of registration under the Act means I was unable to determine whether it complies with the legislation. The Data Protection Directive of the European Union embodies similar principles; personal data can only be processed for specified purposes and may not be processed further in a way incompatible with those purposes.

Processing encrypted data in the cloud. Policies are a first step, but alone they are insufficient to prevent cloud service providers from abusing the data entrusted to them. Current research aims to develop technologies that can give users guarantees that the agreed policies are adhered to. The following descriptions of research directions are not exhaustive or complete.

Progress has been made in encryption systems that would allow users to upload encrypted data, and allow the service providers to perform computations and searches on the encrypted data without giving them the possibility of decrypting it. Although such encryption has been shown possible in principle, current techniques are very expensive in both computation and bandwidth, and show little sign of becoming practical. But the research is ongoing, and there are developments all the time.

Hardware-based security initiatives such as the Trusted Platform Module and Intel’s Trusted Execution Technology are designed to allow a remote user to have confidence that data submitted to a platform is processed according to an agreed policy. These technologies could be leveraged to give privacy guarantees in cloud computing in general, and conference management software in particular. However, significant research will be needed before a usable system could be developed.

Certain cloud computing applications may be primarily storage applications, and might not require a great deal of processing to be performed on the server side. In that case, encrypting the data before sending it to the cloud may be realistic. It would require keys to be managed and shared among users in a practical and efficient way, and the necessary computations to be done in a browser plug-in. It is worthwhile to investigate whether this arrangement could work for conference management software.

Conclusion
Many people with whom I have discussed these issues have argued that the professional honor of data custodians (and PC chairs and PC members) is sufficient to guard against the threats I have described. Indeed, adherence by professionals to ethical behavior is essential to ensure all kinds of confidentiality. In practice, system administrators are able to read all the organization’s email, and medical staff can browse celebrity health records; we trust our colleagues’ sense of honor to ensure these bad things don’t happen. But my standpoint is that we should still try to minimize the extent to which we rely on people’s sense of good behavior. We are just at the beginning of the digital era, and many of the solutions we currently accept won’t be considered adequate in the long term.

The issues raised about cloud-computing-based conference management systems are replicated in numerous other domains, across all sectors of industry and academia. The problem of accumulations of data on servers is very difficult to solve in any generality. The particular instance considered here is interesting because it may be small enough to be solvable, and it is also within the control of the academic community that will directly benefit—or suffer—according to the solution we adopt.

Mark D. Ryan (M.D.Ryan@cs.bham.ac.uk) is Professor in Computer Security and EPSRC Leadership Fellow in the School of Computer Science at the University of Birmingham, U.K.

Many thanks to the Communications reviewers for interesting and constructive comments. I also benefited from discussions with many colleagues at Birmingham, and also in the wider academic research community. Thanks to Henning Schulzrinne, administrator of EDAS, for comments and clarifications. Drafts of this Viewpoint were sent to Andrei Voronkov, the EasyChair administrator, but he did not respond.

Copyright held by author.
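As a rough illustration of the “encrypt before sending it to the cloud” arrangement raised under “Processing encrypted data in the cloud,” the following Python sketch shows a review being encrypted on the reviewer’s machine so that the hosting service stores only ciphertext. Everything in it is an assumption made for illustration: the function names, the dictionary standing in for the provider’s storage, and above all the cipher, which is a toy keystream built from HMAC-SHA256 because Python’s standard library ships no authenticated cipher. A real system would use a vetted encryption library, and the key-sharing problem the column identifies is not addressed here at all.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # bytes of random nonce stored in front of each ciphertext
TAG_LEN = 32    # length of an HMAC-SHA256 tag


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter) blocks, concatenated."""
    blocks = []
    counter = 0
    while TAG_LEN * len(blocks) < length:
        block_input = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, block_input, hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]


def encrypt_review(enc_key: bytes, mac_key: bytes, plaintext: str) -> bytes:
    """Encrypt on the client; the result is what would be uploaded."""
    data = plaintext.encode("utf-8")
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, stream))
    # Encrypt-then-MAC: the tag covers the nonce and the ciphertext.
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt_review(enc_key: bytes, mac_key: bytes, blob: bytes) -> str:
    """Verify the tag, then decrypt; raises ValueError on any mismatch."""
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified or the wrong key was used")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


# Hypothetical usage: the dictionary stands in for the provider's storage.
# The keys never leave the program committee's own machines.
pc_enc_key = secrets.token_bytes(32)
pc_mac_key = secrets.token_bytes(32)
cloud_store = {}
cloud_store["paper-17/review-2"] = encrypt_review(
    pc_enc_key, pc_mac_key, "Weak accept: the proofs are sound."
)
review_text = decrypt_review(pc_enc_key, pc_mac_key, cloud_store["paper-17/review-2"])
```

In this sketch the provider can still store, back up, and return reviews, but it cannot read them or alter them undetected. It also cannot rank papers or send score-based notifications, which is the trade-off the column points to when it says the necessary computations would have to move to the client.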


doi:10.1145/1866739.1866752 Guy L. Steele Jr.

Interview
An Interview with Frances E. Allen
Frances E. Allen, recipient of the 2006 ACM A.M. Turing Award, reflects on her career.

ACM Fellow Frances E. Allen, recipient of the 2006 ACM A.M. Turing Award and IBM Fellow Emerita, has made fundamental contributions to the theory and practice of program optimization and compiler construction over a 50-year career. Her contributions also greatly extended earlier work in automatic program parallelization, which enables programs to use multiple processors simultaneously in order to obtain faster results. These techniques made it possible to achieve high performance from computers while programming them in languages suitable to applications. She joined IBM in 1957 and worked on a long series of innovative projects that included the IBM 7030 (Stretch) and its code-breaking coprocessor Harvest, the IBM Advanced Computing System, and the PTRAN (Parallel Translation) project. She is an IEEE Fellow, a Fellow of the Computer History Museum, a member of the American Academy of Arts and Sciences, and a member of the U.S. National Academy of Engineering.

ACM Fellow Guy L. Steele Jr. visited Allen at her office in the IBM T.J. Watson Research Center in 2008 for an extended oral interview. The complete transcript of this interview is available in the ACM Digital Library; presented here is a condensed version that highlights Allen’s technical accomplishments and provides some anecdotes about her colleagues.

[Photograph: Fran Allen on CS: “It’s just such an amazing field, and it’s changed the world, and we’re just at the beginning…” Photograph by Frank Becerra, Jr. / The Journal News.]

Your first compiler work was for IBM Stretch.a

a See the December 2010 Communications Historical Perspectives column “IBM’s Single-Processor Supercomputer Efforts” for more discussion of the IBM Stretch supercomputer.

Yes. In 1955, IBM recognized that to be 100 times faster than any machine existing or planned at the time, the major performance problem to overcome was latency to memory. The advanced ideas and technologies they put into Stretch were to address that problem. Six instructions could be in flight at the same time. The internal memory was interleaved, and data would arrive out of order—data and instructions were both stored in this memory. They built a very complex buffering system and look-ahead. John Cocke, when he came in 1956, was put in charge of the look-ahead for instructions. It was also architected to have precise interrupts. So the look-ahead unit was a phenomenal piece of hardware.

How many copies of Stretch were built?

Eight or nine. The original was built for Los Alamos and shipped late. Then they discovered its performance was about half of what was intended.

But still, 50 times…

Meanwhile, the underlying technology had changed. T.J. Watson got up at the Spring Joint Computer Conference and announced they would not build any more Stretch machines, and apologized to the world about our failure. But it was recognized later that the technology developed in building Stretch made a huge difference for subsequent machines, particularly the 360. A lot of people went from Stretch to the 360, including Fred Brooks.

What was your connection with Stretch?

My role was on the compiler. When I joined IBM in 1957, I had a master’s degree in mathematics from the University of Michigan, where I had gone to get a teaching certificate to teach high school math. But I had worked on an IBM 650 there, so I was hired by IBM Research as a programmer. My first assignment was to teach FORTRAN, which had come out in the spring of that year.

Did you already know FORTRAN, or were you learning it a week ahead, as professors often do?

Yeah, a week ahead [laughs]. They had to get their scientists and researchers to use it if they were going to convince customers. I had a pretty unhappy class, because they knew they could do better than any high-level language could.

Did you win them over?

Yes—and won myself over. John Backus, who led the FORTRAN project, had set two goals from the beginning: programmer productivity and application performance. I learned all about the compiler as part of teaching this course.

Did you ever work on that compiler yourself?

I was reading the code in order to do the training. It set the way I thought about compilers. It had a parser, then an optimizer, then a register allocator. The optimizer identified loops, and they built control flow graphs.

The Stretch group recognized that the compiler was going to be an essential part of that system. A bunch of us in research were drafted to work on it. The National Security Agency [NSA] had a contract with IBM to build an add-on to Stretch, for code-breaking. Stretch would host the code-breaking component, and there was a large tape device, tractor tape, for holding massive amounts of data.

This was Stretch Harvest?

Yes. There was going to be one compiler for Stretch Harvest that would take FORTRAN, and the language I was working on with NSA for code-breaking, called Alpha, and also Autocoder, which was similar to COBOL.

A single compiler framework to encompass all three languages?

Yes, three parsers going to a high-level intermediate language, then an optimizer, then the register allocator. It was an extraordinarily ambitious compiler for the time, when even hash tables were not yet well understood. One compiler, three source languages, targeted to two machines, Stretch and Harvest. In addition to managing the optimizer group, I was responsible for working with the NSA on designing Alpha. I was the bridge between the NSA team, which knew the problem…

And never wanted to tell you completely what the problem is.

They told me not at all, but it didn’t matter. I was pretty clueless about everything down there, and on purpose. “NSA” was not a term that was known. While I was on the project, two guys went to Moscow, just left, and it hit the New York Times, and that’s when I learned what it was about. It was a very carefully guarded activity. The problem was basically searching for identifiers in vast streams of data and looking for relationships, identifying k-graphs and doing statistical analysis. Any single Harvest instruction could run for days, and be self-modifying.

The most amazing thing about that machine is that it was synchronized. Data flowed from this tape system through memory, through the streaming unit, to the Harvest unit, the streaming unit, back to memory, and back out onto the data repository, and it was synchronized at the clock level.

The data was coming from listening stations around the world, during the Cold War. I spent a year at NSA installing the system; during that year, the Bay of Pigs and the Cuban Missile Crisis happened, so it was a very tense period. I assume most of the data was in Cyrillic. But Alpha could deal with any data that had been coded into bytes.

I wrote the final acceptance test for the compiler and the language. I wrote the final report and gave it to them and never saw it again, which I regret.

What did you do next?

John Cocke was enamored with building the fastest machine in the world, and Stretch had been an announced public failure. When I finished with Harvest, Stretch was already done. I could have gone and worked on the 360. I didn’t particularly want to do that; it was a huge project spread around the world. John wanted to take another crack at building the fastest machine in the world, so I joined him on a project called System Y. This time the compiler was built first. Dick Goldberg was the manager and did the parser, I did the optimizer, and Jim Beatty did the register allocator. We had a very nice cycle-level timing simulator. We built what was called the Experimental Compiling System.

What became of System Y?

It changed into ACS [Advanced Computing System], which was even-


tually canceled [in 1969] by Armonk, The graph interval decomposition im-
by headquarters, which we should have proved the theoretical cost bounds of
known would happen, because it was Any single Harvest the algorithm by guiding the order—
not 360. But we developed things that instruction could but if I hear you correctly, the interval
subsequently influenced the company structure is just as important, per-
a lot. We did a lot with branch predic- run for days, and haps more important, for guiding the
tion, both hardware and software, and be self-modifying. transformations than for doing the
caching, and machine-independent, analysis?
language-independent optimizers. Yes. People who were focusing on
John, after being very disappointed the theoretical bounds missed, I think,
about not being able to build the fast- the importance of leaving a framework
est machine in the world, decided he in which one could make the transfor-
would build the best cost-performance John Cocke contains some actual PL/I mations. But then something really ex-
machine. That was where the PowerPC code that represents sets as bit vectors, citing happened. A student of Knuth’s,
came from—the 801 project. and propagates sets around the pro- [Robert] Tarjan, developed a way to
After ACS, I took an unhappy di- gram control flow graph. The intersec- map this problem into a spanning tree.
gression from my work on compilers. tions and unions of the sets were just
I was assigned to work on FS, the fa- PL/I & and | operators, which makes Nodal graphs could be decomposed
mous “Future System” of IBM. It was the code concise and easy to read. You into spanning trees plus back edges.
so bad on performance, I wrote a letter. have said that PL/I was a complicated Yes! It was startling. Great things
FS took two round trips to memory to language to compile, but it seems to sometimes look simple in retrospect,
fetch any item of data, because it had have expressive power. but that solved that part of structuring
a very high-level intermediate form as Yes, it was really very useful for writ- the bounds of subsequent algorithms’
the architected form for the machine. ing optimizers and compilers. The analysis and transformation.
data flow work came from early FOR-
Should I be reminded of the Intel TRAN and their use of control flow So Tarjan’s work played a role in this?
432, the processor designed for Ada? graphs. On Project Y we built control Yes, I don’t think he knew it, but as
It had a very high-level architecture flow graphs and developed a language soon as he published that, it was just
that turned out to be memory-bound, about the articulation points on the obvious that we should abandon graph
because it was constantly fetching de- graph, abstracting away from DO loops intervals and go there.
scriptors from memory. into something more general, then op-
Yes. We aren’t very good about pass- timizing based on a hierarchy of these Could you talk about Jack Schwartz?
ing on the lessons we’ve learned, and graphs, making the assumption that Jack spent a summer at ACS and
we don’t write our failures up very well. they represented parts of the program had a huge influence. He wrote a
that were most frequently executed. number of wonderful papers on op-
It’s harder to get a failure published timizing transformations, one being
than a success. When did you first start using that bit- “Strength reduction, or Babbage’s
But there are a lot of lessons in vector representation? differencing engine in modern
them. After fuming about FS for a few Right at the beginning of the ACS dress.” Jack had a list of applications
months, I wrote a letter to somebody project. “Graph intervals” was a term for strength reduction, which we in
higher up and said, “This isn’t going to that John had come up with, but then compilers never took advantage of.
work,” and why, and that was the wrong I wrote the paper and carried the idea He and John wrote a big book, never
thing to say. So I was kind of put on the further. Then Mike Harrison came, published but widely circulated, on a
shelf for a while. But then I did a lot of and we were struggling with the prob- lot of this work. I spent a year in the
work with a PL/I compiler that IBM had lem that we had no way of bounding Courant Institute—I taught graduate
subcontracted to Intermetrics. the computation of the flow of infor- compilers. And Jack and I were mar-
mation in such a graph. ried for a number of years. So it was a
The compilers you worked on—such as good relationship all around.
the ACS compiler and the PL/I compil- In some of your papers, you talked
er in the 1970s—what languages were about earlier monotonic relaxation What did you think about SETL [a
those implemented in? techniques, but they had very large the- programming language developed by
Some of them were implemented oretical bounds. Schwartz]?
in FORTRAN, some in PL/I, and some Yes, but I wasn’t much concerned, It wasn’t the right thing for that
were in assembly language. because I knew that real programs time, but it may be an interesting lan-
don’t have those, and Mike agreed. guage to go back and look at now that
How about Alpha? Jeff Ullman did some analysis on pro- we’re mired in over-specifying.
That was in the assembly language grams. That did get a better bound, but
for Stretch. that analysis didn’t produce a structure Gregory Chaitin’s classic PLDI paper
against which one could actually make on “Register Allocation and Spilling via
Your 1976 Communications paper with transformations. Graph Coloring” contains a substan-

ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 41
viewpoints

tial chunk of SETL code, four and a half Greg immediately recognized that he ture. Mao was still alive, and a lot of
pages, that implements the algorithm. could apply this solution to the register the institutes and universities were
I liked SETL and was amazed that allocator issue. It was a wonderful kind pretty much closed. There was a sci-
they got some good compiling ap- of serendipity. ence institute in Peking and in Shang-
plications out of it. In the context of hai, where we gave talks on compilers,
multicores and all the new challenges Anything else we should know about and we looked at the machines there,
that we’ve got, I like it a lot—it’s one John Cocke? which were really quite primitive. The
instance of specifying the problem at He had a major impact on every- compiler they were running on the
such a high level that there’s a good body. Let me talk about his style of machine in Peking was on paper tape.
possibility of being able to target mul- work. He didn’t write anything, and I recognized, looking at the code, that
tiple machines and to get high perfor- giving a talk was exceedingly rare and it was essentially Ershov’s compiler.
mance from programs that are easy to painful for him. He would walk around So the people in China were really
write. the building, working on multiple quite concerned about being cut out
I have a story about register alloca- things at the same time, and furthered of the advances in computing. This
tion. FORTRAN back in the 1950s had his ideas by talking to people. He nev- is a conjecture I’ve only recently ar-
the beginnings of a theory of register er sat in his office—he lost his tennis rived at, why we in particular in the
allocation, even though there were only racket one time for several months and U.S. were asked to come: it was a con-
three registers on the target machine. eventually found it on his desk. If he nection through the technology that
Quite a bit later, John Backus became came into your office, he would start the three groups shared. We were very
interested in applying graph coloring drawing and pick up the conversation involved with Ershov and his group.
to allocating registers; he worked for exactly where he had left off with you He and his family wanted to leave the
about 10 years on that problem and two weeks ago! Soviet Union, and they lived with us in
just couldn’t solve it. I considered it our home for about a year.
the biggest outstanding problem in So he was very good at co-routining!
optimizing compilers for a long time. Yes, he could look at a person and You actually had two projects called
Optimizing transformations would remember exactly the last thing he said “Experimental Compiling System.”
produce code with symbolic registers; to them. And people used to save his bar What was the second one like?
the issue was then to map symbolic napkins. He spent a lot of time in bars; Its overall goals were to take our
registers to real machine registers, he liked beer. He would draw complex work on analysis and transformation
of which there was a limited set. For designs on napkins, and people would of codes, and embed that knowledge in
high-performance computing, register take the napkins away at the end of the a schema that would advance compil-
allocation often conflicts with instruc- evening. The Stretch look-ahead was ing. I wish we had done it on Pascal or
tion scheduling. There wasn’t a good designed on bar napkins, particularly something like that.
algorithm until the Chaitin algorithm. in the Old Brauhaus in Poughkeepsie.
Chaitin was working on the PL.8 com- PL/I was that difficult a language?
piler for the 801 system. Ashok Chan- You also knew Andrei Ershov. Yes, it was the pointers and the
dra, another student of Knuth’s, joined He did some marvelous work in condition handling—those were the
the department and told about how the Soviet Union. Beta was his com- big problems. This was another bold
he had worked on the graph coloring piler, a really wonderful optimizing project, and my interest was mostly
problem, which Knuth had given out compiler. He had been on the ALGOL in the generalized solution for inter-
in class, and had solved it—not by solv- committee. procedural analysis—but also putting
ing the coloring problem directly, but what we knew into a context that would
in terms of what is the minimal num- He had an earlier project that he called make writing compilers easy and more
ber of colors needed to color the graph. Alpha, not to be confused with the Al- formal, put more structure into the de-
pha language you did for Stretch, right? velopment of compilers. We already
No, it was totally unrelated. But had a lot of great algorithms which we
The Stretch look- later we read his papers. Then in 1972 could package up, but this was to build
he couldn’t travel, because he wasn’t a compiler framework where the meth-
ahead was designed a party member, so he had a work- ods that we already had could be used
on bar napkins, shop in Novosibirsk and invited a large more flexibly.
number of people. It was broader than
particularly in the compilers, but there was a big focus Did lessons learned from this project
Old Brauhaus in on compilers, and we picked up some feed forward into your PTRAN work?
things from his work. The interprocedural work did, abso-
Poughkeepsie. Ershov also worked with people in lutely, and to some extent the work on
China. When the curtain came down binding. It sounds trivial, but constant
between the Soviet Union and China, propagation, getting that right, and
the Chinese group then didn’t have being able to take what you know and
access to Ershov’s work. Jack and I refine the program without having to
were invited to China in 1973 to lec- throw things away and start over.
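
As an illustration of the bit-vector representation discussed earlier in this interview, where sets are propagated around the control flow graph and unions and intersections reduce to | and & operations, here is a minimal sketch in Python rather than PL/I. It is not the code from the 1976 paper. The four-block graph, the variable names, the helper names, and the choice of live-variable analysis as the example problem are all assumptions made for illustration.

```python
# Illustrative sketch only: live-variable analysis with sets stored as
# integer bit vectors. The graph, variables, and helpers are hypothetical.

VARS = ["a", "b", "c"]                      # bit i stands for VARS[i]
BIT = {name: 1 << i for i, name in enumerate(VARS)}


def bits(*var_names):
    """Pack variable names into one integer bit vector."""
    vector = 0
    for name in var_names:
        vector |= BIT[name]
    return vector


def members(vector):
    """Unpack a bit vector into a sorted list of variable names."""
    return sorted(name for name in VARS if vector & BIT[name])


# Hypothetical control flow graph: block -> successor blocks.
SUCC = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}

# Per-block USE (read before any write) and DEF (written) sets.
USE = {"B1": bits(), "B2": bits("a"), "B3": bits("a", "b"), "B4": bits("c")}
DEF = {"B1": bits("a", "b"), "B2": bits(), "B3": bits("a"), "B4": bits()}


def live_variables(succ, use, define):
    """Iterate live_in/live_out to a fixed point using only | and &."""
    live_in = {block: 0 for block in succ}
    live_out = {block: 0 for block in succ}
    changed = True
    while changed:
        changed = False
        for block in succ:
            out = 0
            for successor in succ[block]:
                out |= live_in[successor]                  # set union
            new_in = use[block] | (out & ~define[block])   # union, difference
            if out != live_out[block] or new_in != live_in[block]:
                live_out[block], live_in[block] = out, new_in
                changed = True
    return live_in, live_out
```

On this toy graph, `live_variables(SUCC, USE, DEF)` finds that only `c` is live on entry to `B1`, since `a` and `b` are defined there before any use. The simple round-robin loop stands in for the ordering that interval analysis or a spanning tree would supply in a production analyzer.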

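The register-allocation problem described above, mapping symbolic registers onto a limited set of machine registers, can be sketched as graph coloring. The following Python sketch follows the simplify-then-select outline commonly associated with Chaitin's allocator, but it is not Chaitin's SETL code. The function name, the input format, and the spill heuristic (pick the node with the most neighbors) are assumptions, and a real allocator would insert spill code and rebuild the graph rather than simply setting a register aside.

```python
# Illustrative sketch of simplify-and-select graph coloring for register
# allocation. Names and the spill heuristic are assumptions.

def color_registers(interference, k):
    """Map symbolic registers to colors 0..k-1; return (colors, spilled).

    interference: dict mapping each symbolic register to the set of
    registers live at the same time. The relation must be symmetric and
    every register mentioned must appear as a key.
    """
    graph = {node: set(adj) for node, adj in interference.items()}
    stack, spilled = [], []

    # Simplify: a node with fewer than k neighbors can always be colored
    # later, so remove it and remember the order of removal.
    while graph:
        node = next((n for n in sorted(graph) if len(graph[n]) < k), None)
        if node is None:
            # Blocked: every remaining node has k or more neighbors.
            # Hypothetical heuristic: spill the most constrained node.
            node = max(sorted(graph), key=lambda n: len(graph[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for neighbor in graph.pop(node):
            graph[neighbor].discard(node)

    # Select: reinsert in reverse order, taking the lowest free color.
    colors = {}
    for node in reversed(stack):
        used = {colors[n] for n in interference[node] if n in colors}
        colors[node] = next(c for c in range(k) if c not in used)
    return colors, spilled
```

Each color stands for one machine register. A node removed during simplification had fewer than `k` neighbors still in the graph at that moment, which is why the select phase always finds a free color for it.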

Let’s talk about PTRAN. Two papers came out in 1988: your “Overview of the PTRAN Analysis System” and “IBM Parallel FORTRAN”. It’s important to distinguish these two projects. IBM Parallel FORTRAN was a product, a FORTRAN augmented with constructs such as PARALLEL LOOP and PARALLEL CASE and ORIGINATE TASK. So the FORTRAN product is FORTRAN with extra statements of various kinds, whereas with PTRAN, you were working with raw FORTRAN and doing the analysis to get parallelism.
Right.

What was the relationship between the two projects? The IBM Parallel FORTRAN paper cites your group as having provided some discussion.
The PTRAN group was formed in the early 1980s, to look first at automatic vectorization. IBM was very late in getting into parallelism. The machines had concurrency, but getting into explicit parallelization, the first step was vectorization of programs. I was asked to form a compiler group to do parallel work, and I knew of David Kuck’s work, which started in the late 1960s at the University of Illinois around the ILLIAC project. I visited Kuck and hired some of his students. Kuck and I had a very good arrangement over the years. He set up his own company—KAI.

Kuck and Associates, Inc.
Right. IBM, at one point later on, had them subcontracted to do some of the parallelism. They were very open about their techniques, with one exception, and they were the leaders early on. They had a system called Parafrase, which enabled students to try various kinds of parallelizing code with FORTRAN input and then hooked to a timing simulator back-end. So they could get real results of how effective a particular set of transformations would be. It was marvelous for learning how to do parallelism, what worked and what didn’t work, and a whole set of great students came out of that program.
In setting up my group, I mostly hired from Illinois and NYU. The NYU people were involved with the Ultracomputer, and we had a variant of it here, a project called RP3, Research Parallel Processor Prototype, which was an instantiation of their Ultracomputer.

The Ultracomputer was perhaps the first to champion fetch-and-add as a synchronization primitive.
Yes. A little history: The Ultracomputer had 256 processors, with shared distributed memory, accessible through an elaborate switching system. Getting data from memory is costly, so they had a combining switch, one of the big inventions that the NYU people had developed. The fetch-and-add primitive could be done in the switch itself.

Doing fetch-and-add in the switch helped avoid the hot-spot problem of having many processors go for a single shared counter. Very clever idea.
Very, very clever. So IBM and NYU together were partners, and supported by DARPA to build a smaller machine. The number of processors got cut back to 64 and the combining switch was no longer needed, and the project kind of dragged on. But my group supplied the compiler for that. The project eventually got canceled.
So that was the background, in IBM Research and at the Courant Institute. But then the main server line, the 370s, 3090s, were going to have vector processors.

Multiple vector processors as well as multiple scalar processors.
Yes. And the one that we initially worked on was a six-way vector processor. We launched a parallel translation group, PTRAN. Jean Ferrante played a key role. Michael Burke was involved; NYU guy. Ron Cytron was the Illinois guy. Wilson Hsieh was a co-op student. Vivek Sarkar was from Stanford, Dave Shields and Philippe Charles from NYU. All of these people have gone on to have some really wonderful careers. Mark Wegman and Kenny Zadeck were not in the PTRAN group but were doing related work. We focused on taking dusty decks and producing good parallel code for the machines—continuing the theme of language-independent, machine-independent, and do it automatically.

“Dusty decks” refers to old programs punched on decks of Hollerith cards. Nowadays we’ve got students who have never seen a punched card.
We also went a long way with working with product groups. There was a marvelous and very insightful programmer, Randy Scarborough, who worked in our Palo Alto lab at the time. He was able to take the existing FORTRAN compiler and add a little bit or a piece into the optimizer that could do pretty much everything that we could do. It didn’t have the future that we were hoping to achieve in terms of building a base for extending the work and applying it to other situations, but it certainly solved the immediate problem very inexpensively and well at the time. That really helped IBM quickly move into the marketplace with a very parallel system that was familiar to the customers and solved the problem. Disappointing for us, but it was the right thing to have happen.

Did PTRAN survive the introduction of this product?
Yes, it survived. The product just did automatic vectorization. What we were looking at was more parallelism in general.

One particular thing in PTRAN was looking at the data distribution problem, because, as you remarked in your paper, the very data layouts that improve sequential execution can actually harm parallel execution, because you get cache conflicts and things like that.
Yes.

That doesn’t seem to be addressed at all by the “IBM Parallel FORTRAN” paper. What kinds of analysis were you doing in PTRAN? What issues were you studying?
Well, two things about the project.


We worked on both the theory and abstraction, building up methods that were analyzable and could be reasoned about, and implementing them. I insisted in this project that, if someone on the systems side showed me a piece of code, I would say, “Can you describe this in a paper? How would other people know about this?” If they were on the theoretical side, I would say, “Can you implement it? Show me the implementation.”
The idea of trying to change a program into a functional program was something that I was trying to push. We could do much better analysis even for just plain straight optimization if we could name the values but not burden it with the location, apply a functional paradigm to it.

We could trace that idea back to your early work on strength reduction, when you were making hash names for intermediate values.
Yes. The value contributes to the answer, but where that value resides should be irrelevant to the writer of the program.

Apparently, just convincing the early programmers of that was one of your early successes. FORTRAN is good enough; you don’t need to keep track of every single machine register yourself.
That’s right. So I had that challenge out there. We needed to try and recast the program as close as we could to functional.
Another thing I think was a very big step was not only identifying parallelism, but identifying useful parallelism. Another problem: say that one of the optimizations is constant propagation. For some variable deep in the code,


there is a constant that you have recognized could replace the use of the variable. You then know, say, which way a branch is going to go. You’ve built up this infrastructure of analysis and you’re ready to make the transformation—but then the results of the analysis are obsolete, so you have to start again. That was a problem we struggled with early on: How do you avoid redoing the analysis? It got particularly bad with interprocedural activities.

Is there some simple insight or overarching idea that helps you to avoid having to completely redo the computation?
Vivek Sarkar was one of the key people on that, but Dave Kuck—this is at the core of KAI’s work, too. That group described it as “the oracle.” You assign costs to each of the instructions, and you can do it in a hierarchical form, so this block gets this cost, and this block has that cost, and then do a cost analysis. This is the time it’s going to take. Then there’s the overhead cost of having the parallelism.

Earlier, you said that Kuck was very open about everything he was doing, with one exception—
The oracle! “What have you got in that thing?” [laughs] “We’re not going to tell you!” So we built our own variant of it, which was a very powerful technique.

What else should we mention?
We talked about the NSA work that wasn’t published. That was, for me, a mind-changer that led to my feeling very strongly about domain-specific languages.

Are you for them or against them?
For them!

Oh, okay. Let’s be clear! [laughs]
Good! [laughs]

I’m going to try something very foolish: summarize your career in one paragraph, then ask you to critique it.
A major focus of your career has been that, rather than inventing new programming languages or language features and trying to get people to program in them, you focused on taking programs as they are written, or as programmers like to write them, and making them run really effectively and efficiently on target machines. One of the many ways you do this, but a very important one, is to do many kinds of sophisticated analysis and optimization of the code and to find out as much as you can about the characteristics of the program without actually running it. So these tend to be static techniques, and very sophisticated ones. While you have worked with and pioneered quite a number of them, some of the most interesting involve using graphs as a representation medium for the program and using a strategy of propagating information around the graph. Because a program can be represented as a graph in more than one way, there’s more than one way in which to propagate that information. In some of these algorithms in particular, the information that’s being propagated around the graph is in the form of sets—for example, sets of variable names. As a strategy for making some of these algorithms efficient enough to use, you’ve represented sets as bit vectors and decomposed the graphs using interval analysis in order to provide an effective order in which to process the nodes. In doing this, you have built a substantial sequence of working systems; these aren’t just paper designs. You build a great system, and then you go on and build the next one, and so on. These all actually work on code and take real programs that aren’t artificial benchmarks and make them run.
That’s really very good. There’s one thing: the overall goal of all of my work has been the FORTRAN goal, John Backus’ goal: user productivity, application performance.

Now, three goofy questions. What’s your favorite language to compile?
FORTRAN, of course!

What’s your favorite language to program in?
I guess it would have to be FORTRAN.

Okay, now, if you had to build a compiler that would run on a parallel machine, what language would you use to write that compiler?
Probably something like SETL or a functional language. And I’m very intrigued about ZPL. I really liked that language.

Any advice for the future?
Yes, I do have one thing. Students aren’t joining our field, computer science, and I don’t know why. It’s just such an amazing field, and it’s changed the world, and we’re just at the beginning of the change. We have to find a way to get our excitement out to be more publicly visible. It is exciting—in the 50 years that I’ve been involved, the change has been astounding.

Recommended Reading

Buchholz, W., Ed.
Planning a Computer System: Project Stretch. McGraw-Hill, 1962; http://ed-thelen.org/comp-hist/IBM-7030-Planning-McJones.pdf

Allen, F.E. and Cocke, J.
A catalogue of optimizing transformations. In R. Rustin, Ed., Design and Optimization of Compilers. Prentice-Hall, 1972, 1–30.

Allen, F.E.
Interprocedural data flow analysis. In Proceedings of Information Processing 74. IFIP. Elsevier/North-Holland, 1974, 398–402.

Allen, F.E. and Cocke, J.
A program data flow analysis procedure. Commun. ACM 19, 3 (Mar. 1976), 137–147; http://doi.acm.org/10.1145/360018.360025

Allen, F.E. et al.
The Experimental Compiling System. IBM J. Res. Dev. 24, 6 (Nov. 1980), 695–715.

Allen, F.E.
The history of language processor technology at IBM. IBM J. Res. Dev. 25, 5 (Sept. 1981), 535–548.

Allen, F.E. et al.
An overview of the PTRAN analysis system for multiprocessing. In Proceedings of the 1st International Conference on Supercomputing (Athens, Greece, 1988), Springer-Verlag, 194–211. Also in J. Par. Dist. Comp. 5 (Academic Press, 1988), 617–640.

Recommended Viewing

Allen, F.E.
The Stretch Harvest compiler. Computer History Museum, Nov. 8, 2000. Video, TRT 01:12:17; http://www.computerhistory.org/collections/accession/102621818

The IBM ACS System: A Pioneering Supercomputer Project of the 1960s. Speakers: Russ Robelen, Bill Moone, John Zasio, Fran Allen, Lynn Conway, Brian Randell. Computer History Museum, Feb. 18, 2010; Video, TRT 1:33:35; http://www.youtube.com/watch?v=pod53_F6urQ

Copyright held by author.

practice
doi:10.1145/1866739.1866755
Article development led by queue.acm.org

Collaboration in System Administration

For sysadmins, solving problems usually involves collaborating with others. How can we make it more effective?

by Eben M. Haber, Eser Kandogan, and Paul P. Maglio
Illustration by Yarek Waszul

George was in trouble. A seemingly simple deployment was taking all morning, and there seemed no end in sight. His manager kept coming in to check on his progress, as the customer was anxious to have the deployment done. He was supposed to be leaving for a goodbye lunch for a departing co-worker, adding to the stress. He had called in all kinds of help, including colleagues, an application architect, technical support, and even one of the system developers. He used email, instant messaging, face-to-face contacts, his phone, and even his office mate’s phone to communicate with everyone. And George was no novice. He had been working as a Web-hosting administrator for three years, and he had a bachelor’s degree in computer science. But it seemed that all the expertise being brought to bear was simply not enough. Why was George in trouble? We’ll find out. But first, why were we watching George? George is a system administrator, one of the people who work behind the scenes to configure, operate, maintain, and troubleshoot the computer infrastructure that supports much of modern life. Their work is critical—and expensive. The human part of total system cost-of-ownership has been growing for decades, now dominating the costs of hardware or software.2–4
To understand why, and to try to learn how administration can be better supported, we have been watching system administrators at work in their natural environments. Over the course of several years, and equipped with camcorders, cameras, tapes, computers, and notebooks, we made 16 visits, each as long as a week, across six different sites. We observed administrators managing databases, Web applications, and system security; as well as storage designers, infrastructure architects, and system operators. Whatever their specific titles were, we refer to them all as system administrators, or sysadmins for short.
At the beginning of our studies, we held a stereotypical view of the sysadmin as that guy (and it was always a guy) in the back room of the university computer center who knew everything and could solve all problems by himself. As we ventured into enterprise data centers, we realized the reality was significantly more complex. To describe our findings fully would take a book (which we are currently writing).6 In this short article, we limit ourselves to a few episodes that illustrate the kinds of collaboration we saw in system administration work and where the major problems lie. As we’ll show from real-world stories we collected and our analyses of work patterns, it’s really not just one guy in the back room.

The Story of George
George is a Web administrator in a large IT service delivery center. We observed him over a week as he engaged in various planning, deployment, maintenance, and troubleshooting tasks for different customers.1 George is part of a team of Web administrators; he interacts with the other team members often, as work is distributed. They need to coordinate their actions, hand off long-running tasks, and consult each other (especially during troubleshooting). He also interacts with other teams that are in charge of different areas, such as networks, operating systems, and mail servers.
During our week of observation, one of George’s tasks was to set up Web access to email for a customer. This involved creating a new Web-server instance on an existing machine outside the firewall and connecting through a middleware authentication server inside the firewall to a back-end mail server (Figure 1). George had never before installed a second Web server on an existing machine, but he had instructions emailed to him by a colleague as well as access to online documentation. The task involved several people from different teams. Early in the week, George asked the network team to create a new IP address and open ports on the firewall. Throughout the week, we saw him collaborate extensively with Ted, a colleague who was troubleshooting some problems with the authentication server. George’s progress was gated by Ted’s work, so they exchanged IMs all the time and frequently dropped into each other’s offices to work through problems together.

Figure 1. George had to add a new front-end Web server to an existing installation.
[Diagram labels: Back-end server 1, Back-end server 2, Back-end server N; Authentication Server; Firewall; Front-end Web server; Junction; ports 7234 and 7135; New Web server instance, with its ports shown as “????”.]

By Friday morning, George had completed all preparations. The final steps should have taken just a few minutes, but this was where the action really began. A mysterious error appeared, and George spent more than two hours troubleshooting the error, mainly in collaboration with others. He had created the new Web-server instance seemingly without incident, and it registered itself with the middleware authentication server. Yet when he issued the command to the middleware server to permit the front-end Web server to talk to the back-end mail server, he got the following message:

Error: Could not connect to server (status: 0x1354a424)

Given that three different servers were involved, the error message gave him insufficient information. The online docs and a Web search on the message provided no additional details, so he reached out for help. (For more on error messages, see “Error Messages: What’s the Problem?” ACM Queue, Nov. 2004.7)
George’s manager suggested calling Adam, the application architect, and George and Adam started troubleshooting together, talking on the phone and exchanging system logs, error messages, configuration files, and sample commands via IMs and email messages (Figure 2). Adam did not have access to the troublesome system, so George acted as his eyes and hands, collecting information and executing commands.
They were not able to find the error, so about an hour in, Adam suggested that George call technical support. He used his office mate’s phone (as his own was still connected to Adam), but quickly transitioned communications with tech support to IM. For the next 20 minutes or so, George continued to troubleshoot with Adam on the phone and tech support via IM, and Ted kept popping into the office to offer suggestions. After a while, George became unhappy with the answers from tech support, so Adam hooked him up with one of the developers of the middleware, and they started discussing the problem over IM. Throughout, George remained the sole person with access to the system—all commands and information requests went through him. He became increasingly stressed out as the problem remained unresolved.
Eventually, Ted went back to his own office and looked into the problem independently. He discovered that George had misunderstood one of the front-end server’s network configuration parameters, described vaguely in the documentation as “internal port.” George thought this parameter (port 7137) specified the port for communication from the front-end to the middleware server, when it went the other way. George, in fact, had made two mistakes: he didn’t realize that every front-end server used port 7135 to talk to the middleware server (which was permitted by the firewall, see Figure 1), and he specified a port for communication from the middleware server to the front-end, 7137, that was blocked by the firewall. Communications worked in one direction, but not the other. The software only tested communications


in one direction, so the error was not reported until the middleware authentication server was configured. Ted found a solution to this complex situation, and tried unsuccessfully to explain it to George over IM:

Ted: We were supposed to use 7236. Unconfigure that instance and...
George: Can’t specify a return port... you only specify one port.
Ted: You did it wrong.
George: No, I didn’t.
Ted: Yes, you did. You need to put in 7236.
George: We just didn’t tell it to go both ways. The other port has nothing to do with this.
Ted: Well, all I know is what I see in the conf file.
George: We thought that was the return port. That is not a return port.
Ted: There currently is no listener on [middleware server] on 7137. So use 7236. DO IT!

Ted wasn’t getting his point across, and George was getting ever-more frustrated. George told his office mate to call Ted (Adam was still on the other phone), and the conversation immediately switched modes. With the nuance of spoken words, Ted started to realize that George fundamentally misunderstood what was going on. Rather than continually telling George what to do (“DO IT!”), Ted explained why. The task had shifted from debugging the system to debugging George, and they tried to establish a common understanding on which network ports were going which direction.

George: What are you talking about? 7236? We thought that it came in on 7137 and went back on 7236, but we were wrong, that 7236 is like an HTTPS listener port or something?
Ted: It will still come in on 7135 to talk to [middleware] server apparently...
George: Right?
Ted: What’s happening is it’s actually trying to make a request back, um, through the 72... well, actually trying to make it back through the 7137 to the instance... and it’s not happening.
George: I know. I know that. But I can’t tell it to...
Ted: Just create it with the 7236. Trust me.
George: Why? That port’s not…, that’s going the wrong…, that’s only one way, too.
Ted: Trust me.
George: It’s only one way. Do you understand what I am saying?
Ted: ’Cause it’s the [middleware] server talking back to the [Web-server] instance.
George: Yeah, but how does [the Web server] talk to the [middleware] server to make some kind of request?
Ted: 7135 is the standard port it uses in all cases. So we had it wrong. Our assumption on how it works was incorrect.
George: All right, all right.
Ted: If it doesn’t work, you can beat me up after.
George: I want to right now. [Laughter on both sides]

How did George get into trouble? Like many failures, there were a number of contributing factors. George misunderstood the meaning of one of the front-end configuration parameters, not realizing that it conflicted with the

Figure 2. George engaged with at least seven different individuals or groups using various means of communication, including instant
message (solid lines), email (dashed lines), phone (dotted-and-dashed lines), and face-to-face (dotted lines). Only George and his colleague
Ted had direct access to the problematic server (double-solid lines).

[Figure 2 diagram. Node labels: Adam, Ted, Developer, Office Mate, George, Tech Support, Network Team, Manager, and Servers; device labels: Laptop, Monitor, Phone.]


firewall rules. The front-end did not test two-way communication, so that errors in the front-end port configuration were not reported until the middleware server was configured. The error message certainly did not help. Perhaps most important was the fact that for most of the troubleshooting session, George was the only one who had direct access to the system. All the other participants got their information filtered through George.

Examining the videotapes in detail, we discovered several instances in which George misreported or misunderstood what he saw, filtering the information through his own misunderstanding, and reporting back incorrectly. (One example occurred when George misread the results of a network trace, his misunderstanding filtering out a critical clue.) This prevented Adam and tech support from helping him effectively. The problem was found only when Ted looked at the machine state independently—and then he had to debug George, too. George had many tools for sharing information about system state, but none of them gave the whole picture to the others.

What are the lessons? Collaboration is critical, especially when misunderstandings occur (and from what we saw, incorrect or incomplete understanding of highly complex systems is a common source of problems for sysadmins). Yet collaboration can work only when correct information is shared, something that is impeded by misunderstandings and the limitations of communication tools. Proper system design can help avoid misunderstandings in the first place, and improved tools for sharing information could help more quickly rectify misunderstandings when they occur.

We analyzed the 2.5 hours of George’s troubleshooting session, coding each 30-second time slice of what George did (see Figure 3). We found 91% of these time slices were spent in collaboration with other people, either via phone, IMing, email, or face-to-face. Only 6% of the time was he actually interacting with the system, whether to discover state or to make changes, as each interaction was followed by lengthy discussions of the implications of what was seen and what to do next.

While not every troubleshooting episode we witnessed had this extreme level of collaboration, we saw people working together to solve problems much more commonly than a single person toiling alone. We also coded for the topic of collaboration, which included expected topics such as configuration details, system state, ongoing strategy, and what commands to execute. Surprisingly, 21% of the communication involved discussing collaboration itself—for example, “Let me call you” or “Please email me that log file.”

Collaboration is especially important in situations where a person’s understanding must be debugged, as we saw in George’s story. Misunderstandings are a fact of life, and here it was compounded by poorly designed error messages and late reporting of misconfiguration. It can take a long time for someone even to realize that his or her understanding is incorrect. An extra pair of eyes can really help to identify and correct misunderstandings, yet misunderstandings affect what a person reports—so getting a second opinion on the problem will help only if the collaborator gets an accurate picture of the system.

Another lesson is that different communications media are good for different things: the nuance and interchange of the telephone and face-to-face contacts help in getting complex ideas across and in assessing what other people know. IMing is excellent for quickly exchanging commands and error messages verbatim, but subtle personal cues are lost. Even for longtime colleagues like George and Ted, building trust over IM was difficult. Email is great for exchanging lengthy items such as log files and instructions or things that need to persist. Different communications media suggest different levels of commitment to the collaboration.
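
The port mix-up behind George's failure can be made concrete with a small sketch that checks every required flow, in both directions, against the firewall rules before anything is deployed. This is only an illustration under stated assumptions: the `Flow` type, the allow-list, and the tier names are hypothetical. The story itself supplies just two facts: front-end-to-middleware traffic on port 7135 was permitted, and the return port George chose, 7137, was blocked. That 7236 is permitted in the return direction is an assumption inferred from Ted's fix.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One direction of traffic: source tier, destination tier, destination port."""
    src: str
    dst: str
    port: int


# Hypothetical firewall allow-list. Port 7135 (front-end -> middleware) is
# permitted per the story; allowing 7236 back is an assumption from Ted's fix.
ALLOWED = {
    Flow("front-end", "middleware", 7135),
    Flow("middleware", "front-end", 7236),
}


def required_flows(return_port):
    """Both directions a front-end instance needs, given its configured return port."""
    return [
        Flow("front-end", "middleware", 7135),
        Flow("middleware", "front-end", return_port),
    ]


def blocked_flows(required, allowed=ALLOWED):
    """Return every required flow the firewall would drop, in either direction."""
    return [flow for flow in required if flow not in allowed]


if __name__ == "__main__":
    for port in (7137, 7236):
        problems = blocked_flows(required_flows(port))
        print(port, "OK" if not problems else f"blocked: {problems}")
```

A check of this kind, run when the front-end is configured, would report the blocked return path right away instead of leaving it to surface when the middleware server is configured later.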

Figure 3. Accounting of time spent during George’s troubleshooting session.

[Figure 3 charts. Panel titles: Tool Usage; Topics Discussed. Legend labels: Instant Messenger, Web, Command Line, Face-to-face, Email, Phone, Tools, Log, Command, State, Error, Collaboration, Strategy, Documentation, Administrative, Personal, Configuration.]
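
The accounting behind Figure 3 can be approximated with a short tally. A 2.5-hour session cut into 30-second slices gives 300 slices, so the reported 91% and 6% correspond to roughly 273 and 18 slices. The sketch below assumes each slice carries exactly one code, which the article does not state, and the code labels (including the "other" bucket) are hypothetical.

```python
from collections import Counter

SLICE_SECONDS = 30    # slice length reported in the article
SESSION_HOURS = 2.5   # session length reported in the article


def slice_count(hours=SESSION_HOURS, slice_seconds=SLICE_SECONDS):
    """Number of fixed-length time slices in a session."""
    return int(hours * 3600 // slice_seconds)


def shares(codes):
    """Percentage of slices per code, rounded to the nearest whole percent."""
    counts = Counter(codes)
    total = len(codes)
    return {code: round(100 * n / total) for code, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical coded session: one label per 30-second slice.
    coded = ["collaboration"] * 273 + ["system"] * 18 + ["other"] * 9
    print(slice_count(), shares(coded))
```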


Given the need for collaboration to help sysadmins share their understanding of systems, it is possible to imagine better tools for sharing system state. These tools should take best advantage of different forms of communication to share more completely what is going on with both system and sysadmin alike.

We now turn to another example of collaboration we observed among system administrators working on a much more complex system exhibiting a problem that required incredible effort to understand.

The Crit-Sit
A critical situation, or crit-sit, is a practice that is invoked when an IT system’s performance becomes unacceptable and the IT provider must devote specific resources to solving the problem as quickly as possible. Several sysadmins—experts on different components—are brought into a room and told to work together until the problem is fixed. Crit-sits occur more often than sysadmins would like (one we interviewed estimated taking part in four crit-sits per year), and they can last days, weeks, or even months.

We observed one crit-sit for a day, just after it had started, and followed its progress over two months until its solution was found. This was exceptionally long for a crit-sit. It involved an intermittent Web application failure resulting from a subtle interaction of a Web application server and back-end database. Other potential problems were found and fixed along the way, but it took more than 80 days for a dedicated team of experts to determine the true root cause.

At a micro level, being in the room during the crit-sit was fascinating. Eight to 10 people were present in the large conference room, either sitting at the two tables or walking around the room talking; an additional four to six people joined in via conference call and chat room (including technical support representatives for the various software products involved). At first, it seemed amazing to us that this many people had been instructed to work together in a single room until the problem was solved. Indeed, one of the people in the room complained via an instant message to a colleague offsite:

“We’re doing lots of PD [problem determination], but nothing that I couldn’t have done from home.”

After watching the people at work, however, we saw real value in having all of them together in one place. The room was alive with different conversations, usually many at once diverging and rejoining, and with different experts exchanging ideas or asking questions. People would use the whiteboard to diagram theories, and could see and supplement what others were writing. When something important occurred, the attention of everybody in the room was instantly focused. A group chat room was also used as a historical record for system status, error messages, and ideas. Chat was also used for private conversations within the room and beyond, and for exchanging technical information. At one point we saw them build a monitoring script collaboratively through talking, looking at each other’s screens, and exchanging code snippets over IM both inside and outside the room.

Not surprisingly, the people in the room appeared much more engaged than the remote participants. Being in the same room signified a level of commitment by the participants. Those on the conference call spoke up only when addressed directly; we assume that they were doing other work and keeping just one ear on the discussions in the room. It is also likely that remote participants could not follow the chaotic, ever-shifting discussions in the room.

At a macro level, following the logs of 11 weeks of troubleshooting was also fascinating. It tells the story of a significant, complicated problem that could not be successfully reproduced on any test system—a problem in which turning on logging would slow the system to the point of unusability at the load levels required to cause the failure. The story shows the crit-sit team interacting with the support teams for a variety of products, escalating to the highest levels, applying patch after patch and experimenting with configuration settings, new hardware, and special versions of the software. The process involved a lot of work by many different teams.

On the whole, the crit-sit was a collaborative effort by a group of experts to understand and repair the behavior of a complex system consisting of many components. They used a wide variety of technical tools: IMing, email, telephone, and screen sharing, yet it seems that


they received the greatest value from interacting face-to-face. By being in the same room, people could quickly shift from conversation to conversation when a critical phrase was heard, with a very low barrier to asking someone a question or suggesting an idea.

Although the crit-sit seems heavyweight and wasteful, we have no other approaches that can replicate the collaborative interaction of a bunch of people stuck in a room searching for a solution to a common problem. It would be a revolutionary advance for system administration if a tool were developed that could permit the same engagement in remote collaborators as we saw in the crit-sit room.

We next describe the sorts of collaborations we observed among security administrators at a U.S. university.

The “ettercap” Incident
When we first met the security administration team for a computer center at a large university,5 they seemed somewhat paranoid, making such statements as, “I’ll never type my password on a Windows box, because I can’t really tell if it’s secure.” After watching them for two weeks, we realized they had good reason to be cautious. IT systems, as a rule, have no volition and don’t care how they’re configured or whether you apply a patch to them. Security administrators face human antagonists, however, who have been known to get angry when locked out of a system and work extra hard to find new vulnerabilities and do damage to the data of those who locked them out.

The work of these security administrators was centered around monitoring. New attacks came every week or two. Viruses, worms, and malicious intrusions could happen anytime. They had a battery of automatic monitoring software looking for traces of attacks in system logs and network traffic. Automated intrusion-detection systems needed to err on the side of caution, with the sysadmins making the final decision as to whether suspicious activity was really an attack. These sysadmins relied on communications tools to share information and to help them maintain awareness of what was going on in their center, across their campus, and around the world.

The security administration team shared adjacent offices, so back-and-forth chatter about system activity was common. They joked about taking down the wall to make one big workspace. They also used a universitywide MOO (multiuser domain, object oriented), a textual virtual environment where all the system administrators would hang out, with different “rooms” for different topics. The start of an incident would result in high levels of activity in the security room of the MOO, as security admins from different parts of campus would compare what was happening on their own systems. On a day-to-day basis, the MOO might hold conversations on the latest exploits discovered or theories as to how a virus might be getting into the network. The admins described the MOO’s persistence features as really helpful in allowing them to catch up on everything that was going on when they came back after being away, even for a day. They also used a “whisper” feature of the MOO for point-to-point communication (like traditional IM).

An example of MOO use for quick interchange of security status came when we observed a meeting that focused on hacker tools. The security administrators discussed a package called “ettercap.” Being unfamiliar with this tool, one of us began searching the Web for information about it using the wireless network. A few minutes later, one of the administrators in the room informed us that a security administrator working remotely had detected this traffic and asked about it on the MOO:

Remote: Any idea who was looking for ettercap? The DHCP logs say [observer’s machine name] is a NetBIOS name. Nothing in email logs (like POP from that IP address).
Remote: Seemed more like research.
Remote: The SMTP port is open on that host, but it doesn’t respond as SMTP. That could be a hacker defender port.
Local: We were showing how [hacker] downloaded ettercap. One of the visitors started searching for it.
Remote: Ah, OK. Thanks.

In the space of only a few minutes, the sysadmin had detected Web searches for the dangerous ettercap package, identified the name of the machine in question, checked the logs for other activity by that machine, and probed the ports on the machine. He could see


that it was probably someone doing research, but checked the MOO to verify that it was in fact legitimate.

The participants also collaborated on a broader scale. During our visit, the site was dealing with a worldwide security incident targeting military, educational, and government sites across the U.S. and Europe. This was a particularly persistent attack—every time an intrusion was detected and a vulnerability was closed, the attackers would come back using a new exploit. The attackers would hop from institution to institution, compromising a machine in one place, collecting passwords, and then trying those passwords on machines at other institutions (as users often have a single password for accounts at different sites).

This broad-based attack required a broad-based response, so security administrators from affected institutions formed an ad hoc community to monitor and share information about the attacks, with the goal of tracing the attacks back to their source. When a compromised machine was found, they would let it remain compromised so that they could then trace the attackers and see where else they were connecting. This collaboration was like information warfare: it was important to share information about known compromised machines and exploits with trusted colleagues, but the information had to be kept from the attackers. You did not want the attackers to know that you had detected their attack and were monitoring their activities. When we first observed them, the security administrators used conference calls for community meetings. Later they found a special encrypted email listserv to keep their information under wraps—but because this tool was unmaintained, they had to adopt and maintain it themselves.

The world of security administration seems very fluid, with new vulnerabilities and exploits discovered every day. Though secrecy was a greater concern than with other sysadmins we observed, collaboration was the foundation of their work: sharing knowledge of unfolding events and system status, especially when an attack might be starting and time was critical.

Conclusion
One of our motivations for studying sysadmins is the ever-increasing cost of IT management. Part of this can certainly be attributed to the fact that computers get faster and cheaper every year, and people do not. Yet complexity is also a huge issue—a Web site today is built upon a dramatically more complicated infrastructure than one 15 years ago. With complexity comes specialization in IT management. With around-the-clock operations needed for today’s enterprises, coordination is also a must. System administrators need to share knowledge, coordinate their work, communicate system status, develop a common understanding, find and share expertise, and build trust and develop relationships. System administration is inherently collaborative.

At first, it is easy to think that George’s story shows poor debugging practices or worse, poor skills, but we don’t think that’s the case. The system was complex, the documentation poor, the error messages unenlightening, and no single person was responsible for all of it. Better error messages or better documentation would certainly help, but that misses the point. There will always be cases that go uncovered and complexities that are hidden until it is too late. Modern IT systems are so complex that people will often have an incorrect or incomplete understanding of their operation. That’s the nature of IT. The crit-sit story and the security story also show it. The one constant in these cases—and in almost all the cases we observed—was collaboration.

We observed collaboration at many levels: within a small team, within an organization, and across organizations. We observed several different types of collaboration tools in use. We observed people switching from one tool to the other as needs shifted. We also observed simultaneous use of several collaboration tools for different purposes. Not surprisingly, system administrators use the same collaboration tools as the rest of us, but these are not optimized for sysadmin needs—whether it is team brainstorming and debugging or secure information sharing.

Though specific features can be implemented for system administrators, it is clear to us that because of the diverse needs among system administrators, a single collaboration tool will not work for all. There needs to be a variety of tools, and collaboration needs to be a first-class citizen in the work of system administration itself. Better collaboration support could relieve the burden on individuals of communicating and establishing shared context, and so avoiding missed information and enabling a persistent store for communication. We believe that improved tools for system administrator collaboration have great potential to significantly impact system administration work—perhaps even helping to restrain the ever-growing human portion of IT’s total cost of ownership.

Related articles on queue.acm.org

Error Messages: What’s the Problem?
Paul P. Maglio, Eser Kandogan
http://queue.acm.org/detail.cfm?id=1036499

Oops! Coping with Human Error in IT Systems
Aaron B. Brown
http://queue.acm.org/detail.cfm?id=1036497

Building Collaboration into IDEs
Li-Te Cheng, Cleidson R.B. de Souza, Susanne Hupfer, John Patterson, Steven Ros
http://queue.acm.org/detail.cfm?id=966803

References
1. Barrett, R., Kandogan, E., Maglio, P.P., Haber, E.M., Prabaker, M., Takayama, L.A. Field studies of computer system administrators: analysis of system management tools and practices. In Proceedings of the Conference on Computer-Supported Collaborative Work. 2004.
2. Gartner Group/Dataquest. Server Storage and RAID Worldwide (May 1999).
3. Gelb, J.P. System-managed storage. IBM Systems Journal 28, 1 (1989), 77–103.
4. ITCentrix. Storage on Tap: Understanding the Business Value of Storage Service Providers (Mar. 2001).
5. Kandogan, E., Haber, E.M. Security administration tools and practices. In Security and Usability: Designing Secure Systems that People Can Use. L.F. Cranor and S. Garfinkel, Eds. O’Reilly Media, Sebastopol, 2005, 357–378.
6. Kandogan, E., Maglio, P.P., Haber, E.M., Bailey, J. (forthcoming). Information Technology Management: Studies in Large-Scale System Administration. Oxford University Press.
7. Maglio, P.P., Kandogan, E. Error messages: What’s the problem? ACM Queue 2, 8 (2004), 50–55.

Eben M. Haber is a research staff member at IBM Research, Almaden, in San Jose, CA. He studies human-computer interaction, working on projects including data mining and visualization, ethnographic studies of IT system administration, and end-user programming tools.

Eser Kandogan is a research staff member at IBM Research, Almaden, San Jose, CA. His interests include human interaction with complex systems, ethnographic studies of system administrators, information visualization, and end-user programming.

Paul P. Maglio is a research scientist and manager at IBM Research, Almaden, San Jose, CA. He is working on a system to compose loosely coupled heterogeneous models and simulations to inform health and health policy decisions. Since joining IBM Research, he has worked on programmable Web intermediaries, attentive user interfaces, multimodal human-computer interaction, human aspects of autonomic computing, and service science.

© 2011 ACM 0001-0782/11/0100 $10.00

doi:10.1145/1866739.1866753
Article development led by queue.acm.org

Talking with Julian Gosper, Jean-Luc Agathos, Richard Rutter, and Terry Coatta.

ACM Case Study

UX Design and Agile: A Natural Fit?

Found at the intersection of many fields—including usability, human-computer interaction (HCI), and interaction design—user experience (UX) design addresses a software user’s entire experience: from logging on to navigating, accessing, modifying, and saving data. Unfortunately, UX design is often overlooked or treated as a “bolt-on,” available only to those projects blessed with the extra time and budget to accommodate it. Careful design of the user experience, however, can be crucial to the success of a product. And it’s not just window dressing: choices made about the user experience can have a significant impact on a software product’s underlying architecture, data structures, and processing algorithms.

To improve our understanding of UX design and how it fits into the software development process, we focus here on a project where UX designers worked closely with software engineers to build BusinessObjects Polestar (currently marketed as SAP BusinessObjects Explorer), a business intelligence (BI) query tool designed for casual business users. In the past, such users did not have their own BI query tools. Instead, they would pass their business queries on to analysts and IT people, who would then use sophisticated BI tools to extract the relevant information from a data warehouse. The Polestar team wanted to leverage a lot of the same back-end processing as the company’s more sophisticated BI query tools, but the new software required a simpler, more user-friendly interface with less arcane terminology. Therefore, good UX design was essential.

To learn about the development process, we spoke with two key members of the Polestar team: software architect Jean-Luc Agathos and senior UX designer Julian Gosper. Agathos joined BusinessObjects’ Paris office in 1999 and stayed with the company through its acquisition by SAP in 2007. Gosper started working with the company five years ago in its Vancouver, B.C., office. The two began collaborating early in the project, right after the creation of a Java prototype incorporating some of Gosper’s initial UX designs. Because the key back-end architecture is one that had been developed earlier by the Paris software engineering team, Gosper joined the team in Paris to collaborate on efforts to implement Polestar on top of that architecture.

To lead our discussion, we enlisted a pair of developers whose skill sets largely mirror those of Agathos and Gosper. Terry Coatta is a veteran software engineer who is the CTO of Vitrium Systems in Vancouver, B.C. He also is a member of ACM Queue’s editorial advisory board. Joining Coatta to offer another perspective on UX design is Richard Rutter, a longtime Web designer and a founder of Clearleft, a UX design firm based in Brighton, England.

Before diving in to see how the collaboration between Agathos and Gosper played out, it’s useful to be familiar with a few of the fundamental disciplines of classic UX design:

Contextual inquiry. Before developing use cases, the team observes users of current tools, noting where the pain points lie. Contextual inquiry is often helpful in identifying problems that users are not aware of themselves. Unfortunately, this often proves expensive and so is not always performed.

Formative testing. Formative testing is used to see how well a UX design addresses a product’s anticipated use cases. It also helps to determine how closely those use cases actually cleave to real-world experience. In preparation, UX designers generally create lightweight prototypes to represent the use cases the product is expected to eventually service. Often these are paper prototypes, which the test-group participants simply flip through and then comment upon. For the Polestar project, the UX team used a working Java prototype to facilitate early formative testing.

Summative testing. In summative tests, users test-drive the finished software according to some script. The feedback from these tests is often used to inform the next round of development since it usually comes too late in the process to allow for significant changes to be incorporated into the current release.

Although the Polestar team did not have the budget to conduct contextual inquiry, it was able to work closely with the software engineer who built the research prototype responsible for spawning the project. This allowed the team to perform early formative testing with the aid of a working UX design, which in turn made it possible to refine the user stories that would be used as the basis for further testing. Working with the software engineer responsible for the initial design also made it possible to evaluate some of the initial UX designs both from a performance and a feasibility perspective, preventing a lot of unwelcome surprises once development was under way in earnest.

[Photos] Top to Bottom: Jean-Luc Agathos, Terry Coatta, Julian Gosper, and Richard Rutter.

Terry Coatta: You mentioned you worked early on with a software engineer to iterate on some lightweight paper prototypes. What was the role of the engineer in that process?

Julian Gosper: Adam Binnie, a senior product manager who had conceived of the project, brought me in to elaborate the interaction design of the early Polestar prototype. Davor Cubranic, who had produced that initial proof-of-concept for research purposes, was looking to work some of the user experience ideas I had begun to collaborate on with Adam back into his original design. Davor saw value in creating Java prototypes of some of those new concepts so we would have not only paper prototypes to work with, but also a live prototype that end users could interact with and that development could evaluate from a technical perspective. I really pushed for that since we already had this great development resource available to us. And it didn’t seem as though it was going to take all that long for Davor to hammer out some of the key UX concepts, which of course was going to make the live prototype a much better vehicle for formative testing.

Generally speaking, as an interaction designer you don’t want to invest a lot of time programming something live, since what you really want is to keep iterating on the fundamentals of the design quickly. That’s why working with paper prototypes is so commonplace and effective early in a project. Typically, we’ll use Illustrator or Visio to mock up the key use cases and their associated UI, interactions, and task flows, and then output a PowerPoint deck you can just flip through to get a sense for a typical user session. Then various project stakeholders can mark that up to give you feedback. You also then have a tool for formative testing.

Collaborating closely with development at that stage was appealing in this particular case, however, because some of the directions we were taking with the user interface were likely to have serious back-end implications—


for example, the ability of the application to return and reevaluate facets and visualizations with each click. Having Davor there to help evaluate those proposed design changes right away from a performance perspective through rapid iterations of a lightweight Java-based prototype helped to create a nice set of synergies right from the get-go.

Jean-Luc Agathos: Even though I didn’t get involved in the project until later, it seemed as though Davor had produced a solid proof-of-concept. He also figured out some very important processing steps along the way, in addition to assessing the feasibility of some key algorithms we developed a bit later in the process. I think this was critical for the project, since even though people tend to think about user experience as being just about the way things are displayed, it’s also important to figure out how that stuff needs to be manipulated so it can be processed efficiently.

Gosper: That’s absolutely correct. For this product to succeed, performance was critical, both in terms of processing a large index of data and being able to evaluate which facets to return to a user to support the experience of clicking through a new data analysis at the speed of thought. The capabilities for assessing the relevance of metadata relative to any particular new query was actually Davor’s real focus all along, so he ended up driving that investigation in parallel to the work I was doing to refine the usability of the interface.

I do recall having discussions with Davor where he said, “Well, you know, if you approach ‘X’ in the way you’re suggesting, there is going to be a significant performance hit; but if we approach it this other way, we might be able to get a much better response time.” So we went back and forth like that a lot, which I think ultimately made for a much better user experience design than would have been possible had we taken the typical waterfall approach.

Coatta: Are product engineers typically excited about being involved in a project when the user experience stuff is still in its earliest stages of getting sorted out?

Agathos: Yes, I think developers need to be acquainted with the user stories as early in the process as possible. To me, user experience and development are essentially one and the same. I see it as our job as a group to turn the user stories into deliverables. Of course, in development we are generally working from architectural diagrams or some other kind of product description that comes out of product management. We’re also looking to the user experience people to figure out what the interaction model is supposed to look like.

The interesting thing about the first version of Polestar is that Julian essentially ended up taking on both roles. He acted as a program manager because he knew what the user stories were and how he wanted each of them to be handled in terms of product functionality. He also had a clear idea of how he wanted all of that to be exposed in the UI and how he wanted end users ultimately to be able to interact with the system. That greatly simplified things from my perspective because I had only one source I had to turn to for direction.

Coatta: You used Agile for this project. At what point in the process can the software developers and UX designers begin to work in parallel?

Gosper: If you have a good set of user stories that have been agreed upon by the executive members of the project and include clear definitions of the associated workflows and use cases, then the Agile iterative process can begin. At that point you are able to concretely understand the functionality and experience the product needs to offer. On the basis of that, both UX interaction designers and the development team should have enough to get going in parallel. That is, the developers can start working on what the product needs to do while the UX guys can work on use-case diagramming, wireframing and scenarios, as well as begin to coordinate the time of end users to supply whatever validation is required.

The important thing is that you have lots of different people involved to help pull those user stories together. Clearly, the UX team needs to be part of that, but the development team should participate as well, along with the business analysts and anybody else who might have some insights regarding what the product requirements ought to be. That’s what I think of when we
talk about starting development on the basis of some common “model.”

For the most part, development of Polestar’s first release went smoothly. Gosper’s collaboration with the prototype developer helped iron out some of the technical challenges early on, as well as refine the main user stories, which helped pave the way for implementing the product with Agathos. The user interface that emerged from that process did face certain challenges, which were verified during summative testing, after development work for release 1 was essentially complete. Contributing to the problem was the general dearth of query tools for casual business users at the time. The blessing there was that Gosper and his UX team had the freedom to innovate with their design. That innovation, however, meant the software would inevitably end up incorporating new design concepts that, even with many iterations of formative testing, faced an uncertain reception from users.

Further challenges arose during the development of subsequent releases of Polestar, once Agathos and Gosper were no longer working together. In response to user feedback, the new UX team decided to make fundamental changes to the UI, which forced the engineering team to rearchitect some parts of the software. On top of these challenges, the new team learned what could happen when there is confusion over the underlying conceptual model, which is something Agathos and Gosper had managed to avoid during the first phase of development.

Of course, that’s not to say the initial phase was entirely free of development challenges, some of which could be ascribed to pressure to prove the worth of the new endeavor to management.

Richard Rutter: What were the most challenging parts of this project?

Agathos: One big challenge was that right after we received the original Java POC (proof of concept), we received word that we needed to produce a Flash version in less than two weeks so it could be shown at Business Objects’s [now SAP BusinessObjects] international user group meeting. Actually, that was only one of many constraints we faced. The POC itself imposed certain constraints in terms of how to approach processing the data and exposing appropriate facets to the user. There already were some UI constraints driven by the user stories. Then we learned we had only a couple of weeks to get something ready to show to customers. And finally, it was strongly suggested that we use Adobe Flex (something we had not even been aware of previously) to get the work done.

So, our first job was to learn that technology. Initially, we were pretty frustrated about that since it wasn’t really as if we had a choice in the matter. Instead of fighting it, however, we quickly realized it was a really good idea. Flex is so powerful that it actually allowed us to come up with a working prototype in time for that user conference.

Gosper: There was a special aspect to this project in that it was earmarked as innovation, a step toward next-generation self-service BI. Prior to this, the company had never productized a lightweight business intelligence tool for business users, so we didn’t really have much in the way of similar products, either within the company or out in the market, that we could refer to. As a result, the design we ended up pushing forward was sort of risky from a user adoption perspective.

The first version of Polestar, in all honesty, tended to throw users for a loop at first. Most of the users we tested needed to spend a few minutes exploring the tool just to get to the point where they really understood the design metaphors and the overall user experience.

Agathos: That was definitely the case.

Gosper: That sparked a fair amount of controversy across the UX group because some of the methodologies around formative testing back then differed from one site to another. The next designers assigned to the project ended up coming to different conclusions about how to refine the interaction design and make it more intuitive. Some questions were raised about whether we were fundamentally taking the right interaction development approach in several different areas of the interface. We employed faceted navigation in a fairly unique way, and the interactions we created around analytics were also pretty novel. [Since users often have only a fuzzy idea about some of the parameters or aspects of the information they’re seeking, faceted navigation is a UI pattern that facilitates queries by allowing users to interactively refine relevant categories (or “facets”) of the search.]

A lot of those initial design directions ended up being re-explored as open topics as the UX team in Paris worked on the second version of the product. They ended up coming to some different conclusions and began to promote some alternatives. That, of course, can be very healthy in the evolution of a product, but at the same time, it can really challenge the development team whenever the new UI choices don’t align entirely with the code you’ve already got running under the hood.

Coatta: Another area where I think there is huge potential for trouble has to do with the feedback process to UX design. I’ve been through that process and have found it to be extremely challenging. As an example, suppose you need to find two business objects that are related to each other. Let’s say we know that one of those objects can be shared across multiple business domains. One possibility is that they share a unique ownership, which would have huge ramifications for the user experience. When we have run across situations like that, we’ve often had trouble communicating the semantic implications back to the folks who are responsible for doing the UI. I wonder if you’ve run into similar situations.

Agathos: Actually, Julian had already worked all of that out before joining us in Paris, so we didn’t have any problems like that during our initial round of development. With later versions, however, we had an issue in the administration module with “infospace,” which is a concept we exposed only to administrators. The idea was that you could create an infospace based on some data source, which could be a BWA (Burrows-Wheeler Alignment) index or maybe an Excel spreadsheet. [BWA is a fast, lightweight tool that aligns relatively short queries to a sequence database. These sequences are usually indexed in the FASTA format.] Before anybody can use the system’s exploration module to investigate one
of these information spaces, that space must be indexed, resulting in a new index entity. That seems straightforward enough, but we spent a lot of time trying to agree on what ought to happen when one person happens to be exploring an infospace while someone else is trying to index the same space. Those discussions proved to be difficult simply because we had not made it explicit that the entity in question was an index. That is, we talked about it at first strictly as an information space. It wasn’t until after a few arguments that it became clear what we were actually talking about was an index of that information space.

Whenever the model can be precisely discussed, you can avoid a lot of unnecessary complexity. When a model is correctly and clearly defined right from the outset of a project, the only kind of feedback that ought to be required (and the only sort of change you should need to add to your sprints in subsequent iterations) has to do with the ways you want to expose things. For example, you might decide to change from a dropdown box to a list so users can be given faster access to something: with one click, rather than two. If you start out with a clear model, you’re probably not going to need to make any changes later that would likely have a significant impact on either the UI or the underlying architecture.

In developing a product for which there was no obvious equivalent in the marketplace, Agathos and Gosper were already sailing into somewhat uncharted waters. And there was yet another area where they would face the unknown: neither had ever used Agile to implement a UX design. Adding to this uncertainty was the Agile development methodology itself, where the tenets include the need to accept changes in requirements, even late in the development process.

Fortunately, the Polestar team soon embraced the iterative development process and all its inherent uncertainties. In fact, both Gosper and Agathos found Agile to be far more effective than the waterfall methodology for implementing UX designs. This doesn’t mean all was smooth sailing, however. Agile requires close collaboration between team members and frequent face-to-face communication, often between stakeholders with vastly different backgrounds and technical vocabularies. It’s therefore no surprise that one of the team’s biggest challenges had to do with making sure communication was as clear and effective as possible.

Coatta: I understand that some UX designers feel Agile is less something to be worked with than something to be worked around. Yet, this was the first time you faced implementing a UX design using Agile, and it appears you absolutely loved it. Why is that?

Gosper: In an ideal world, you would do all your contextual inquiry, paper prototyping, and formative testing before starting to actually write lines of code. In reality, because the turnaround time between product inception and product release continues to grow shorter and shorter, that simply isn’t possible. If the UX design is to be implemented within a waterfall project, then it’s hard to know how to take what you’re learning about your use cases and put that knowledge to work once the coding has begun.

In contrast, if you are embedded with the development team and you’re acquainted, tactically, with what they’re planning to accomplish two or three sprints down the road, you can start to plan how you’re going to test different aspects of the user experience in accordance with that.

Coatta: In other words, you just feel your way along?

Gosper: Yes, and it can be a little scary to dive right into development without knowing all the answers. And by that I don’t just mean that you haven’t had a chance to work through certain areas to make sure things make sense from an engineering perspective; you also can’t be sure about how to go about articulating the design because there hasn’t been time to iterate on that enough to get a good read on what’s likely to work best for the user. Then you have also got to take into account all the layers of development considerations. All of that comes together at the same time, so you’ve got to be very alert to everything that’s happening around you.

There is also a lot of back-and-forth
in the sprint planning that has to happen. For example, Jean-Luc would let me know we really needed to have a certain aspect of the UI sorted out by some particular sprint, which meant I had essentially received my marching orders. Conversely, there also were times when I needed to see some particular functionality in place in time to use it as part of a live build I wanted to incorporate into an upcoming round of formative testing. That, alone, would require a ton of coordination.

The other important influence on sprint planning comes from product management, since they too often want to be able to show off some certain capabilities by some particular date. With all three of these vested interests constantly vying to have certain parts of the product materialize by certain points in time, there’s plenty of negotiating to be done along the way.

Rutter: Yes, but I think you have to accept that some reengineering of bits and pieces along the way is just an inherent part of the process. That is, based on feedback from testing, you can bank on sprints down the line involving the reengineering of at least a few of the things that have already been built. But as long as that’s what you expect…no problem.

Agathos: I think you’re right about that: we have to develop a mind-set that’s accepting of changes along the way. But I actually think the biggest problem is the tooling. The development tools we have today don’t help us with all those iterations because they don’t provide enough separation between the business logic and the UI. Ideally, we should be able to change the UI from top to bottom and revisit absolutely every last bit of it without ever touching the underlying logic. Unfortunately, today that just isn’t the case.

[Figure: Polestar Faceted Navigation. An annotated mockup of the “Select facets to refine exploration…” panel (showing 8–9 of 34 facets), with a Measures list (Sales Revenue, Quantity sold, Count), a Time facet (Sales Date), and State, Line, SKU number, and Cities facets. Callouts label the delete facet button, facet name label, facet value label, facet value measure label, facet container (border), Measures label, Measures list, the “Count” measure, and the “more…” facet values link. Annotations: the delete button removes a facet from the search path, after which the facet is again available from the Suggested facets (Measures and Time facets do not have this control); the “more…” link opens a dialogue listing all available values for that facet.]

Gosper: That’s why, as a UX interaction designer, you have to be prepared to demonstrate to the product developers that any changes you’re recommending are based on substantive evidence, not just some intuitive or anecdotal sense of the users’ needs. You need to make a strong case and be able to support design changes with as
much quantitative and qualitative end-user validation data as you can get your hands on.

Coatta: So far, we’ve looked at this largely from the UX side of the equation. What are some of the benefits and challenges of using Agile to implement UX design from more of an engineering perspective?

Agathos: The biggest challenge we faced in this particular project (at least after we had completed the first version of Polestar) had to do with changes that were made in the overall architecture just as we were about to finish a release. Even in an Agile environment you still need to think things through pretty thoroughly up front, and then sketch out your design, prototype it, test it, and continue fixing it until it becomes stable. As soon as you’ve achieved something that’s pretty solid for any given layer in the architecture, you need to finish that up and move on to the next higher layer. In that way, it is like creating a building. If you have to go back down to the foundation to start making some fairly significant changes once you’ve already gotten to a very late stage of the process, you’re likely to end up causing damage to everything else you’ve built on top of that.

The nice thing about Agile is that it allows for design input at every sprint along the way, together with some discussion about the reasoning behind that design input. That’s really important. For example, when we were working with Julian, he explained to us that one of the fundamental design goals for Polestar was to minimize the number of clicks the end user would have to perform to accomplish any particular task. He also talked about how that applied to each of the most important user stories. For us as developers, it really helped to understand all that.

I don’t think we have exchanges like that nearly enough, and that doesn’t apply only to the UX guys. It would also be good to have discussions like that with the program managers.

Gosper: In those discussions for the Polestar project, one of the greatest challenges for me had to do with figuring out just how much depth to go into when describing a particular design specification. Sometimes a set of wireframes supporting a particular use case seemed to be good enough, since most of what I wanted to communicate could be inferred from them. But there were other times when it would have been helpful for me to break things down into more granular specifications. It was a bit challenging in the moment to sort that out, because on the one hand you’re trying to manage your time in terms of producing specifications for the next sprint, but on the other hand you want to get them to the appropriate depth.

The other challenge is that each domain has its own technical language, and it can sometimes prove tricky to make sure you’re correctly interpreting what you’re hearing. For example, I remember one of the sprints where I was very concerned with a particular set of functionality having to do with users’ abilities to specify financial periods for the data they might be looking to explore. I therefore became very active in trying to get product management to allocate more resources to that effort because I was certain it would be a major pain point for end users if the functionality was insufficient.

During that time I remember seeing a sprint review presentation that referred to a number of features, a couple of which related to date and financial period and so forth. Next to each of those items was the notation “d-cut.” I didn’t say a word, but I was just flabbergasted. I was thinking to myself, “Wow! So they just decided to cut that. I can’t believe it. And nobody even bothered to tell me.” But of course it turns out “d-cut” stands for “development cut,” which means they had already implemented those items. There are times when you can end up talking past each other just because everyone is using terms specific to his or her own technical domain. Of course, the same is true for product and program management as well.

Coatta: Don’t the different tools used by each respective domain also make contributions to these communication problems?

Agathos: I couldn’t agree more. For example, when you program in Java, there are some things you can express and some you cannot. Similarly, for architects, using a language such as UML constrains them in some ways. They end up having to ask themselves tons of questions prior to telling the system what it is they actually want.

Coatta: Since we’re talking about how engineers and designers live in different worlds and thus employ different tools, different skill sets, and different worldviews, what do you think of having software engineers get directly involved in formative testing? Does that idea intrigue you? Or does it seem dangerous?

Gosper: Both, actually. It all depends on how you act upon that information. At one point there is an internal validation process whereby a product that is just about to be released is opened up to a much wider cross section of people in the company than the original group of stakeholders. And then all those folks are given a script that allows them to walk through the product so they can experience it firsthand.

What that can trigger, naturally, is a wave of feedback on a project that is just about finalized, when we don’t have a lot of time to do anything about it. In a lot of that feedback, people don’t just point out problems; they also offer solutions, such as, “That checkmark can’t be green. Make it gray.” To take all those sorts of comments at face value, of course, would be dangerous. Anyway, my tendency is to think of feedback that comes through development or any other internal channel as something that should provide a good basis for the next user study.

Rutter: UX design should always involve contact with lots of different end users at plenty of different points throughout the process. Still, as Julian says, it’s what you end up doing with all that information that really matters. When it comes to figuring out how to solve the problems that come to light that way, that’s actually what the UX guys get paid to do.

Related articles on queue.acm.org

The Future of Human-Computer Interaction
John Canny
http://queue.acm.org/detail.cfm?id=1147530

Human-KV Interaction
Kode Vicious
http://queue.acm.org/detail.cfm?id=1122682

Other People’s Data
Stephen Petschulat
http://queue.acm.org/detail.cfm?id=1655240

© 2011 ACM 0001-0782/11/0100 $10.00
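
As an illustration of the faceted navigation pattern discussed in the interview, here is a minimal Python sketch. It is not Polestar's implementation. The record fields (`state`, `line`, `revenue`) and the function names `refine` and `facet_counts` are hypothetical, chosen only to echo the labels in the figure. The sketch shows the basic loop the interview describes: each click narrows the result set, and the remaining facets are re-evaluated against that narrower set.

```python
from collections import defaultdict

# Hypothetical sales records; the field names are illustrative only.
RECORDS = [
    {"state": "Florida", "line": "Sweaters", "revenue": 120},
    {"state": "Florida", "line": "Dresses", "revenue": 80},
    {"state": "Texas", "line": "Sweaters", "revenue": 50},
    {"state": "Texas", "line": "Jackets", "revenue": 70},
]


def refine(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r[facet] == value for facet, value in selections.items())
    ]


def facet_counts(records, facets, measure="revenue"):
    """Total the measure per value of each facet for the current result set."""
    totals = {facet: defaultdict(int) for facet in facets}
    for r in records:
        for facet in facets:
            totals[facet][r[facet]] += r[measure]
    return {facet: dict(values) for facet, values in totals.items()}


# One "click": the user selects State = Florida, and the Line facet
# is re-evaluated against the narrowed result set.
selected = refine(RECORDS, {"state": "Florida"})
print(facet_counts(selected, ["line"]))
```

In a real product the re-evaluation would run against an index rather than an in-memory list, which is where the performance concerns raised by Gosper and Agathos come in; this sketch deliberately ignores that.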

doi:10.1145/1866739.1866754

Article development led by queue.acm.org

Virtualization: Blessing or Curse?

Managing virtualization at a large scale is fraught with hidden challenges.

by Evangelos Kotsovinos

Virtualization is often touted as the solution to many challenging problems, from resource underutilization to data-center optimization and carbon emission reduction. However, the hidden costs of virtualization, largely stemming from the complex and difficult system administration challenges it poses, are often overlooked. Reaping the fruits of virtualization requires the enterprise to navigate scalability limitations, revamp traditional operational practices, manage performance, and achieve unprecedented cross-silo collaboration. Virtualization is not a curse: it can bring material benefits, but only to the prepared.

Al Goodman once said, “The perfect computer has been invented. You just feed in your problems and they never come out again.” This is how virtualization has come to be perceived in recent years: as a panacea for a host of IT problems. Bringing virtualization into the enterprise is often about reducing costs without compromising quality of service. Running the same workloads as virtual machines (VMs) on fewer servers can improve server utilization and, perhaps more importantly, allow the deferral of data-center build-outs: the same data-center space can now last longer.

Virtualization is also meant to enhance the manageability of the enterprise infrastructure. As virtual servers and desktops can be live-migrated with no downtime, coordinating hardware upgrades with users or negotiating work windows is no longer necessary; upgrades can happen at any time with no user impact. In addition, high availability and dynamic load-balancing solutions provided by virtualization product families can monitor and optimize the virtualized environment with little manual involvement. Supporting the same capabilities in a nonvirtualized world would require a large amount of operational effort.

Furthermore, enterprises use virtualization to provide IaaS (Infrastruc-
ture as a Service) cloud offerings that give users access to computing resources on demand in the form of VMs. This can improve developer productivity and reduce time to market, which is key in today’s fast-moving business environment. Since rolling out an application sooner can provide first-mover advantage, virtualization can help boost the business.

The Practice

Although virtualization is a 50-year-old technology,3 it reached broad popularity only as it became available for the x86 platform from 2001 onward, and most large enterprises have been using the technology for fewer than five years.1,4 As such, it is a relatively new technology, which, unsurprisingly, carries a number of less-well-understood system administration challenges.

Old Assumptions. It is not, strictly speaking, virtualization’s fault, but many systems in an enterprise infrastructure are built on the assumption of running on real, physical hardware. The design of operating systems is often based on the principle that the hard disk is local, and therefore reading from and writing to it is fast and low cost. Thus, they use the disk generously in a number of ways, such as caching, buffering, and logging. This, of course, is perfectly fair in a nonvirtualized world.

With virtualization added to the mix, many such assumptions are turned on their heads. VMs often use shared storage, instead of local disks, to take advantage of high availability and load-balancing solutions: a VM with its data on the local disk is a lot more difficult to migrate, and doomed if the local disk fails. With virtualization, each read and write operation travels to shared storage over the network or Fibre Channel, adding load to the network interface controllers (NICs), switches, and shared storage systems. In addition, as a result of consolidation, the network and storage infrastructure has to cope with a potentially much higher number of systems, compounding this effect. It will take

running multiple VMs is similar to that of managing a physical, nonvirtualized server. Therefore, as dozens of VMs can run on one virtualized server, consolidation can reduce operational workload. Not so: the workload of managing a physical, nonvirtualized server is comparable to that of managing a VM, not the underlying virtualized server. The fruits of common, standardized management (such as centrally held configuration and image-based provisioning) have already been reaped by enterprises, as this is how they manage their physical environments. Therefore, managing 20 VMs that share a virtualized server requires the same amount of work as managing 20 physical servers. Add to that the overhead of managing the hypervisor and associated services, and it is easy to see that operational workload will be higher.

More importantly, there is evidence that virtualization leads to an increase in the number of systems (now running in VMs) instead of simply consolidating existing workloads.2,5 Making it easy to get access to computing capacity in the form of a VM, as IaaS clouds do, has the side effect of leading to a proliferation of barely used VMs, since developers forget to return the VMs they do not use to the pool after the end of a project. As the number of VMs increases, so does the load placed on administrators and on shared infrastructure such as storage, Dynamic Host Configuration Protocol (DHCP), and boot servers.

Most enterprise users of virtualization implement their own VM reclamation systems. Some solutions are straightforward and borderline simplistic: if nobody has logged on for more than three months, then notify and subsequently reclaim if nobody objects. Some solutions are elaborate and carry the distinctive odor of overengineering: analyze resource utilization over a period of time based on heuristics; determine level of usage; and act accordingly. Surprising as it may be, there is a lack of generic and broadly applicable VM reclamation solutions to address sprawl challeng-

started happening with such services entering the hypervisor, and it has the potential to reduce operational workload substantially.

Scale. Enterprises have spent years improving and streamlining their management tools and processes to handle scale. They have invested in a backbone of configuration management and provisioning systems, operational tools, and monitoring solutions that can handle building and managing tens or even hundreds of thousands of systems. Thanks to this (largely homegrown) tooling, massively parallel operational tasks, such as the build-out of thousands of servers, daily operating system checkouts, and planned data-center power-downs, are routine and straightforward for operational teams.

Enter virtualization: most vendor solutions are not built for the large enterprise when it comes to scale, particularly with respect to their management frameworks. Their scale limitations are orders of magnitude below those of enterprise systems, often because of fundamental design flaws, such as overreliance on central components or
years for the entire ecosystem to adapt es. In addition, services that are com- data sources. In addition, they often do
fully to virtualization. mon to all VMs sharing a host—such not scale out; running more instances
System Sprawl. Conventional wis- as virus scanning, firewalls, and back- of the vendor solution will not fully ad-
dom has it that the operational work- ups—should become part of the virtu- dress the scaling issue, as the instances
load of managing a virtualized server alization layer itself. This has already will not talk to each other. This chal-

62 communications of th e ac m | ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1
practice

lenge is not unique to virtualization. An tion and size of operational teams. physical infrastructure.
enterprise faces similar issues when it Interoperability. Many enterprises To be sure, some enterprises are for-
introduces a new operating system to its have achieved a good level of integra- tunate enough to have a homogeneous
environment. Scaling difficulties, how- tion between their backbone systems. environment, managed by a product
ever, are particularly important when it The addition of a server in the config- suite for which solid virtualization ex-
comes to virtualization for two reasons: uration-management system allows tensions already exist. In a heteroge-
first, virtualization increases the num- it to get an IP address and host name. neous infrastructure, however, with
ber of systems that must be managed, The tool that executes a power-down more than one virtualization platform,
as discussed in the section on system draws its data about what to power with virtualized and nonvirtualized
sprawl; second, one of the main benefits off seamlessly from the configuration- parts, and with a multitude of tightly
of virtualization is central management management system. A change in a integrated homegrown systems, the
of the infrastructure, which cannot be server’s configuration will automati- introduction of virtualization leads to
achieved without a suitably scalable cally change the checkout logic applied administration islands—parts of the
management framework. to it. This uniformity and tight integra- infrastructure that are managed differ-

As a result, enterprises are left with tion massively simplifies operational ently from everything else. This breaks
a choice: either they live with a mul- and administrative work. the integration and uniformity of the
titude of frameworks with which to Virtualization often seems like an enterprise environment, and increases
manage the infrastructure, which in- awkward guest in this tightly integrat- operational complexity.
creases operational complexity; or they ed enterprise environment. Each virtu- Many enterprises will feel like they
must engineer their own solutions that alization platform comes with its own have been here before—for example,
work around those limitations—for ex- APIs, ways of configuring, describing, when they engineered their systems to
ample, the now open source Aquilon and provisioning VMs, as well as its be able to provision and manage mul-
framework extending the Quattor tool- own management tooling. The ven- tiple operating systems using the same
kit (http://www.quattor.org). Another dor ecosystem is gradually catching frameworks. Once again, customers
option is for enterprises to wait until up, providing increased integration face the “build versus suffer” choice.
the vendor ecosystem catches up with between backbone services and virtu- Should they live with the added opera-
enterprise-scale requirements before alization management. Solutions are tional complexity of administration
they virtualize. The right answer de- lacking, however, that fulfill all three islands until standardization and con-
pends on a number of factors, includ- of the following conditions: vergence emerge in the marketplace,
ing the enterprise’s size, business ˲˲ They can be relatively easily inte- or should they invest in substantial
requirements, existing backbone of grated with homegrown systems. engineering and integration work to
systems and tools, size of virtualized ˲˲ They can handle multiple virtual- ensure hypervisor agnosticism and in-
and virtualizable infrastructure, engi- ization platforms. tegration with the existing backbone?
neering capabilities, and sophistica- ˲˲ They can manage virtual as well as Troubleshooting. Contrary to con-

ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 63
practice

ventional wisdom, virtualized environ- requiring a true mentality shift in the


ments do not really consolidate three way enterprise infrastructure organiza-
physical machines into one physical tions operate—as well as, potentially,
machine; they consolidate three physi- organizational changes to adapt to this
cal machines onto several physical sub-
systems, including the shared server, Virtualization requirement.
Impact of Changes. Enterprises
the storage system, and the network.
Finding the cause of slowness in
holds promise have spent a long time and invested

as a solution for
substantial resources into understand-
a physical computer is often a case of ing the impact of changes to different
glancing at a few log files on the local
disk and potentially investigating local
many challenging parts of the infrastructure. Change-
management processes and policies
hardware issues. The amount of data problems. are well oiled and time tested, ensuring
that needs to be looked at is relatively
small, contained, and easily found.
Expectations that every change to the environment is
assessed and its impact documented.
Monitoring performance and diag- are running high. Once again, virtualization brings
nosing a problem of a virtual desktop,
on the other hand, requires trawling Can virtualization fundamental change. Sharing the in-
frastructure comes with centralization
through logs and data from a number deliver? and, therefore, with potential bottle-
of sources including the desktop oper- necks that are not as well understood.
ating system, the hypervisor, the stor- Rolling out a new service pack that in-
age system, and the network. creases disk utilization by 5IOPS (in-
In addition, this large volume of put/output operations per second) on
disparate data must be aggregated or each host will have very little impact in
linked; the administrator should be a nonvirtualized environment—each
able to obtain information easily from host will be using its disk a little more
all relevant systems for a given time pe- often. In a virtualized environment, an
riod, or to trace the progress of a spe- increase of disk usage by 5IOPS per VM
cific packet through the storage and will result in an increase of 10,000IOPS
network stack. Because of this mul- on a storage system shared by 2,000
tisource and multilayer obfuscation, VMs, with potentially devastating con-
resolution will be significantly slower sequences. It will also place increased
if administrators have to look at sev- load on the shared host, as more
eral screens and manually identify bits packets will have to travel through the
of data and log files that are related, in hypervisor, as well as the network in-
terms of either time or causality. New frastructure. We have seen antivirus
paradigms are needed for storing, re- updates and operating-system patches
trieving, and linking logs and perfor- resulting in increases in CPU utiliza-
mance data from multiple sources. tion on the order of 40% across the
Experience from fields such as Web virtualized plant—changes that would
search can be vital in this endeavor. have a negligible effect when applied to
Silos? What Silos? In a nonvirtual- physical systems.
ized enterprise environment, respon- Similarly, large-scale reboots can
sibilities for running different parts of impact shared infrastructure compo-
the infrastructure are neatly divided nents in ways that are radically dif-
among operational teams, such as ferent from the nonvirtualized past.
Unix, Windows, network, and stor- Testing and change management pro-
age operations. Each team has a clear cesses need to change to account for
scope of responsibility, communica- effects that may be much broader than
tion among teams is limited, and ap- before.
portioning credit, responsibility, and Contention. Virtualization plat-
accountability for infrastructure issues forms do a decent job of isolating VMs
is straightforward. on a shared physical host and manag-
Virtualization bulldozes these silo ing resources on that host (such as CPU
walls. Operational issues that involve and memory). In a complex enterprise
more than one operational team—and, environment, however, this is only part
in some cases, all—become far more of the picture. A large number of VMs
common than issues that can be re- will be sharing a network switch, and
solved entirely within a silo. As such, an even larger number of VMs will be
cross-silo collaboration and commu- sharing a storage system. Contention
nication are of paramount importance, on those parts of the virtualized stack

64 communications of th e ac m | ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1
practice

can have as much impact as contention tion brings unprecedented integra- ter cross-silo collaboration, and instill
on a shared host, or more. Consider tion and hard dependencies among an end-to-end mentality in their staff.
the case where a rogue VM overloads components—a storage outage could Controls to prevent VM sprawl are key,
shared storage: hundreds or thousands mean that thousands of users cannot and new processes and policies for
of VMs will be slowed down. use their desktops. Enterprises need change management are needed, as
Functionality that allows isolat- to ensure that their operational teams virtualization multiplies the effect of
ing and managing contention when it across all silos are comfortable with changes that would previously be of
comes to networking and storage ele- managing a massively interconnected minimal impact.
ments is only now reaching maturity large-scale system, rather than a col- Virtualization can bring significant
and entering the mainstream virtual- lection of individual and independent benefits to the enterprise, but it can
ization scene. Designing a virtualiza- components, without GUIs. also bite the hand that feeds it. It is no
tion technology stack that can take curse, but, like luck, it favors the pre-
advantage of such features requires Conclusion pared.
engineering work and a good amount Virtualization holds promise as a solu-
of networking and storage expertise tion for many challenging problems. It Acknowledgments
on behalf of the enterprise customer. can help reduce infrastructure costs, Many thanks to Mostafa Afifi, Neil Al-
Some do that, combining exotic net- delay data-center build-outs, improve len, Rob Dunn, Chris Edmonds, Rob-
work adapters that provide the right our ability to respond to fast-moving bie Eichberger, Anthony Golia, Alli-
cocktail of I/O virtualization in hard- business needs, allow a massive-scale son Gorman Nachtigal, and Martin
ware with custom rack, storage, and infrastructure to be managed in a more Vazquez for their invaluable feedback
network designs. Some opt for the flexible and automated way, and even and suggestions. I am also grateful to
riskier but easier route of doing noth- help reduce carbon emissions. Expec- John Stanik and the ACM Queue Edito-
ing special, hoping that system admin- tations are running high. rial Board for their feedback and guid-
istrators will cope with any contention Can virtualization deliver? It abso- ance in completing this article.
issues as they arise. lutely can, but not out of the box. For
GUIs. Graphical user interfaces virtualization to deliver on its promise,
work well when managing an email both vendors and enterprises need to Related articles
on queue.acm.org
inbox, data folder, or even the desktop adapt in a number of ways. Vendors
of a personal computer. In general, it must place strategic emphasis on en- Beyond Server Consolidation
is well understood in the human-com- terprise requirements for scale, en- Werner Vogels
http://queue.acm.org/detail.cfm?id=1348590
puter interaction research community suring that their products can grace-
that GUIs work well for handling a rela- fully handle managing hundreds of CTO Roundtable: Virtualization
http://queue.acm.org/detail.cfm?id=1508219
tively small number of elements. If that thousands or even millions of VMs.
number gets large, GUIs can overload Public cloud service providers do this The Cost of Virtualization
Ulrich Drepper
the user, which often results in poor very successfully. Standardization,
http://queue.acm.org/detail.cfm?id=1348591
decision making.7 Agents and automa- automation, and integration are key;
tion have been proposed as solutions eye-pleasing GUIs are less important.
References
to reduce information overload.6 Solutions that help manage resource 1. Bailey, M., Eastwood, M., Gillen, A., Gupta, D.
Virtualization solutions tend to contention end to end, rather than only Server virtualization market forecast and analysis,
2005–2010. IDC, 2006.
come with GUI-based management on the shared hosts themselves, will 2. Brodkin, J. Virtual server sprawl kills cost savings,
frameworks. That works well for man- significantly simplify the adoption of experts warn. NetworkWorld. Dec. 5, 2008.
3. Goldberg, R.P. Survey of virtual machine research.
aging 100 VMs, but it breaks down in virtualization. In addition, the indus- IEEE Computer Magazine 7, 6 (1974), 34–45.
an enterprise with 100,000 VMs. What try’s ecosystem needs to consider the 4. Humphreys, J. Worldwide virtual machine software
2005 vendor shares. IDC, 2005.
is really needed is more intelligence fundamental redesign of components 5. IDC. Virtualization market accelerates out of the
and automation; if the storage of a vir- that perform suboptimally with virtual- recession as users adopt “Virtualize First” mentality;
2010.
tualized server is disconnected, auto- ization, and it must provide better ways 6. Maes, P. Agents that reduce work and information
matically reconnecting it is a lot more to collect, aggregate, and interpret logs overload. Commun. ACM 37, 7 (1994), 30–40.
7. Schwartz, B. The Paradox of Choice. HarperCollins, NY,
effective than displaying a little yellow and performance data from disparate 2005.
triangle with an exclamation mark in sources.
a GUI that contains thousands of ele- Enterprises that decide to virtual- Evangelos Kotsovinos is a vice president at Morgan
Stanley, where he leads virtualization and cloud-
ments. What is also needed is interop- ize strategically and at a large scale computing engineering. His areas of interest include
erability with enterprise backbones need to be prepared for the substantial massive-scale provisioning, predictive monitoring,
scalable storage for virtualization, and operational tooling
and other systems, as mentioned pre- engineering investment that will be for efficiently managing a global cloud. He also serves
viously. required to achieve the desired levels as the chief strategy officer at Virtual Trip, an ecosystem
of dynamic start-up companies, and is on the Board
In addition, administrators who are of scalability, interoperability, and op- of Directors of NewCred Ltd. Previously, Kotsovinos
accustomed to the piecemeal systems erational uniformity. The alternative was a senior research scientist at T-Labs, where he
helped develop a cloud-computing R&D project into a
management of the previrtualization is increased operational complexity VC-funded Internet start-up. A pioneer in the field of
era—managing a server here and a and cost. In addition, enterprises that cloud computing, he led the XenoServers project, which
produced one of the first cloud-computing blueprints.
storage element there—will discover are serious about virtualization need a
they will have to adapt. Virtualiza- way to break the old dividing lines, fos- © 2011 ACM 0001-0782/11/0100 $10.00
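As an illustration of the simpler kind of VM reclamation rule mentioned under System Sprawl (notify after three idle months, then reclaim if nobody objects), the following is a minimal sketch rather than any enterprise's actual tooling. The `VM` record, the `Action` names, the 90-day reading of "three months," and the 14-day grace period are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP = "keep"        # VM is in use, or its owner objected to reclamation
    NOTIFY = "notify"    # idle too long: warn the owner
    WAIT = "wait"        # owner warned, grace period still running
    RECLAIM = "reclaim"  # owner warned, nobody objected in time


@dataclass
class VM:
    # Hypothetical inventory record; field names are illustrative only.
    name: str
    last_login: datetime
    notified_at: Optional[datetime] = None
    owner_objected: bool = False


IDLE_LIMIT = timedelta(days=90)    # assumed reading of "more than three months"
GRACE_PERIOD = timedelta(days=14)  # assumed time allowed for objections


def reclamation_action(vm: VM, now: datetime) -> Action:
    """Decide what to do with one VM under the notify-then-reclaim rule."""
    if now - vm.last_login <= IDLE_LIMIT:
        return Action.KEEP
    if vm.owner_objected:
        return Action.KEEP
    # A warning sent before the most recent login is stale and is ignored.
    if vm.notified_at is None or vm.notified_at < vm.last_login:
        return Action.NOTIFY
    if now - vm.notified_at < GRACE_PERIOD:
        return Action.WAIT
    return Action.RECLAIM


if __name__ == "__main__":
    now = datetime(2011, 1, 31)
    fleet = [
        VM("build-01", last_login=now - timedelta(days=10)),
        VM("demo-07", last_login=now - timedelta(days=120)),
        VM("test-42", last_login=now - timedelta(days=200),
           notified_at=now - timedelta(days=30)),
    ]
    for vm in fleet:
        print(vm.name, reclamation_action(vm, now).value)
```

The function only decides; sending the notification and returning capacity to the pool are left to the caller, which keeps the policy easy to test on its own. The more elaborate, utilization-based approaches described in the article would replace the single `last_login` check with measured usage.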

contributed articles
doi:10.1145/1866739.1866756

Follow the Intellectual Property

How companies pay programmers when they move the related IP rights to offshore taxhavens.

by Gio Wiederhold

key insights
• Profits from the work of software creators and programmers are based on IP being moved offshore.
• Locating IP in primary taxhavens damages both developed and emerging economies and disadvantages small businesses.
• The capabilities of multinational corporations exceed the capabilities of governments.

In the ongoing discussion about offshoring in the computer and data-processing industries, the 2006 ACM report Globalization and Offshoring of Software addressed job shifts due to globalization in the software industry.1 But jobs represent only half of the labor and capital equation in business. In today's high-tech industries, intellectual property (IP) supplies the other half, the capital complement. Offshoring IP always accompanies offshoring jobs and, while less visible, may be a major driver of job transfer. The underlying economic model, involving ownership of profits, taxation, and compensation of workers from the revenue their products generate, has not been explicated and is largely unknown in the computer science community. This article presents the issue of software income allocation and the role IP plays in offshoring. It also tries to explain why computer experts lack insight into the economics of software, from investments made, to profits accumulated, to capital becoming available for investment in new projects and jobs.

My intent is to help make computer scientists aware of the relationship of the flow of jobs in computing and the flow of preexisting IP. The ability to create valuable software greatly depends on prior technological prowess. The processes allowing IP to be moved offshore, beyond where the software was created, are formally legal. The resulting accumulation of massive capital in taxhavens [a] has drawn governmental attention and put pressure on officials to change tax regulations.1 However, the changes proposed in these discussions ignore IP's crucial role in generating such capital and, even if enacted, would be ineffective. Transparency is needed to gain public support for any effective change. In addition to advocating transparency about IP transfer, I also offer a radical suggestion: eliminate corporate taxation as a way to avoid the distortion now driving the outflow of IP and providing much of the motivation for keeping capital and IP offshore.

I do not address the risk of misappropriation of IP when offshoring, a related but orthogonal issue, covering instead only the processes that are legal. The risk of loss was addressed throughout the 2006 ACM report,1 which also cited tax incentives, a much larger economic factor for businesses than misappropriation of IP. The role of taxhavens was ignored.

[a] The notion of a taxhaven is a concept in ordinary discourse and a crucial aspect of this article. Moreover, using a one-word term simplifies the specification and parsing of subsets, as in "primary taxhavens" and "semi-taxhavens."

Programmers and the computer scientists supporting their work have traditionally focused on producing quality high-performance software on time and at an affordable cost.4 They are rarely concerned with the sales and pricing of software, questioning financial policies only when the company employing them goes broke. There is actually a strong sense in the profession that software should be a free good.12 Implicit in this view is that government, universities, and foundations should pay for software development, rather than the users benefitting from it. In this model, programmers see themselves as artists creating beauty and benefits for all mankind. But consider the size of the software industry. In the U.S. alone, its revenue is $121 billion per year, well over 1% of U.S. GDP.7 An even larger amount is spent in non-software companies for business-specific software development and maintenance. The more than 4.8 million people employed in this and directly related fields earn nearly $333 billion annually.5 It is hence unlikely that universal free software is an achievable or even desirable goal. Appropriately, open-source initiatives focus on software that deserves wide public use (such as editors, compilers, and operating systems) and should be freely available to students and innovators.
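
To make the scale of these figures concrete, the short sketch below works through the arithmetic they imply. It uses only the numbers quoted above; the constant and function names are introduced here for illustration. Because the headcount is given as a lower bound ("more than 4.8 million") and the earnings as an upper bound ("nearly $333 billion"), the average should be read as a rough upper estimate, and the workforce figure covers "directly related fields," so the comparison with industry revenue is loose.

```python
# Annual U.S. figures as quoted in the text (rounded bounds, not exact values).
SOFTWARE_INDUSTRY_REVENUE_USD = 121_000_000_000
WORKFORCE_HEADCOUNT = 4_800_000            # "more than 4.8 million"
WORKFORCE_EARNINGS_USD = 333_000_000_000   # "nearly $333 billion"


def average_earnings(total_earnings_usd: float, headcount: int) -> float:
    """Average annual earnings per person implied by the two totals."""
    return total_earnings_usd / headcount


def earnings_to_revenue_ratio(total_earnings_usd: float, revenue_usd: float) -> float:
    """How many times the workforce's earnings exceed software-industry revenue."""
    return total_earnings_usd / revenue_usd


if __name__ == "__main__":
    avg = average_earnings(WORKFORCE_EARNINGS_USD, WORKFORCE_HEADCOUNT)
    ratio = earnings_to_revenue_ratio(
        WORKFORCE_EARNINGS_USD, SOFTWARE_INDUSTRY_REVENUE_USD
    )
    print(f"Implied average earnings: ${avg:,.0f} per year")
    print(f"Earnings are {ratio:.2f} times software-industry revenue")
```

The implied average is roughly $69,000 per person per year, and total earnings come to about 2.75 times the software industry's own revenue. That gap is consistent with the statement that an even larger amount is spent on software work inside non-software companies.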

[Illustration by Peter Grundy]

Since economics is the focus of this article, and the economic model of open-source software is not well understood, it is limited to the flow of money related to commercial software, or software written to make a profit, either by selling it or by making enterprises more efficient. Part of the income generated by commercial software is used to pay programmers' salaries. Other portions go to grow the business, to investors, and to taxes that are due and that support the needed infrastructure. Figure 1 sketches the two loops: on top, the flow without IP, and, on bottom, the gains due to having and using IP.

Figure 1. Components of the economic loops for software. [Diagram labels: taxes; routine profits; Commodity Products; Common Knowledge; Public and Private Investments; Know-How of the work force; Integration; Intellectual Capital; Intellectual Property; Technology; Trademarks; High-value Products; non-routine profits; taxes.]

Definitions. Since the revenue aspect of software economics has been ignored in CS curricula, this article introduces several concepts from the literature of business economics and IP generation and exploitation.25 Within the context of software, many general definitions can be simplified; for instance, manufacturing costs can be ignored, since software products are easy to copy. The cost of equipment is minor when developing and producing software. For tangible products (such as computers), material costs are significant, but for software the cost of tangible media is negligible. The value of software is hence assignable solely to the intellectual effort of its designers, implementers, and marketers. Even the content of a tangible master file and the content in the memory in a cellphone is an intangible. Everything inside the dashed lines of Figure 1 is intangible; only the money surrounding it is real.

If an owner of software protects that ownership, the software is considered IP. To protect its IP an enterprise would forbid purchasers of copies from making further copies that might be sold. The means of protection vary, including asserting copyrights, registering trademarks, making copying difficult, only releasing binary images of the code, and threatening prosecution. The IP held within the owning enterprise is primarily protected by keeping the source code secret.

Employees and contractors in the software industry are routinely required to sign nondisclosure agreements in order to protect trade secrets. Trade secrets cover the majority of the IP owned by companies developing software. Patents can protect visible processes (such as one-click ordering). But patenting internal processes that contribute to the creation of quality software would require revealing the methods, records, and documents employed. Such things are best protected as trade secrets.10 For companies that market software, trademarks represent a complementary component of corporate IP. Trademarks are visible and registered and their use defended. The value of trademarks derives from a combination of having excellent products in the market, marketing methods to grow sales, and advertising to spread the word. For products that benefit from ongoing sales, customer lists are also of value and protected as a trade secret. Employees are motivated to keep trade secrets by the contribution to their collective job security provided by these constraints.

Without protected IP, a company's income would be at the routine level provided by commodity products, with margins after production and distribution of, say, 10%, insufficient to invest in innovation. Having IP without a knowledgeable staff to exploit it is equally futile.17 When a high-tech company is acquired, senior staff are typically required to remain until its processes are solidly embedded within the purchaser. Even a startup, with no identifiable IP, will have some specific ideas and concepts in the minds of its founders that form the seeds of growth. Time and money are required for such seeds to mature into IP and then into salable products, a delay referred to as "economic gestation time" or "lag."30

Protected IP and staff knowledge and expertise within a company are the intangibles that together represent the enterprise's IP. The employees who know how to exploit it complement the IP; such integration is essential for success. The combination of labor and IP leads to non-routine profits. The margins are then typically greater than 50%, even after spending on R&D, allowing further investment and growth.

IP and Jobs
All subsequent developers on a software project benefit from the work that has gone before, that is, from the IP already in place. That IP complements the knowledge due to education and prior experience new employees bring to the job.

The importance of IP to employee productivity becomes clear when companies grow to a size at which offshore outsourcing of jobs is considered. The new offshore workers, whether testers providing quality assurance, maintenance programmers, sales staff, or call-center employees, receive material representing IP that exists in the parent company at the time. Offshore researchers also build on requirements for innovation and the experience collected by the parent company.

Splitting Intellectual Capital
Intellectual capital is the know-how of the work force and the IP it has generated. As a company matures its IP grows and becomes its major asset. Risk of IP loss due to employee turnover becomes less critical. To gain financial flexibility, a company might identify and isolate its IP. The rights to identified IP, such as trademarks and technology, can be moved to a distinct subcorporation. Separating IP is an initial phase in setting up an offshored operation when significant IP is involved.29 To be productive, the extant technology still must be made available to the creative workers, by having the productive corporate divisions pay license fees to the subcorporation holding the technology IP; see the sidebar "Property Rights" for an illustrative example clarifying the process of splitting rights from the property itself.

Sidebar: Property Rights
A company called USco may sell its headquarters building to a real-estate enterprise REco, with a provision that REco will lease the building back to USco (see the figure here). If USco receives a fair value for the building, USco's total tangibles remain unchanged until it starts spending the money it received for the sale. REco may offer an attractive lease because it is located in a taxhaven. An additional strategy by USco is to set up REco so it remains under the control of USco, also its tenant. Nobody moves, and few employees would notice any change. There may be a new brass plaque on the building and a sign saying "REco" on the door to the rooms housing the people who maintain the building. USco's consolidated annual report, delivered to its shareholders and the IRS, needs to list only the name and location of the controlled subcorporation REco; the assets of both are combined. Since the lease receipts and payments cancel out, the more complex financial flow is externally invisible.
Sidebar figure. Rights to IP sold to an offshore subcorporation. [Diagram labels: price; Convert Property to cash; minus rents; rents; USco, owner; USco, tenant; REco, new owner.]

Such transfer-of-rights transactions are even simpler when applied to IP. The rights to a company's IP or to an arbitrary fraction of that IP can be sold to a controlled foreign holding company (CFH) set up in a taxhaven. Once the rights to the IP are in the CFH the flow of income and expenses changes. The rights to the IP are bundled, so no specific patents, trade secrets, or documents are identified. The net income attributable to the fraction of the IP held in the CFH is collected in an account also held in the taxhaven. One way of collecting such income is to charge royalties or license fees for the use of the IP at the sites where the workers create saleable products, both at home and offshore. There is no risk of IP loss at the CFH, because nothing is actually kept there. To reduce the risk of IP loss where the work is performed, new offshore sites are set up as controlled foreign corporations (CFCs), rather than using contractors.29 Since IP is crucial to making non-routine profits, the royalty license fees to be paid to the CFH can be substantial and greatly reduce the profitability at the parent and at the CFCs from worldwide software product sales (see Figure 2).

Figure 2. Extracting and selling the rights to derive income from a property. [Diagram labels: Parent corporation; Offshore job sites; Know-How of the work force; Salaries; Integration; Initial purchase; License Fees; High-value Products; IP Documentation; Subcorporation "CFH"; Rights to the Intellectual Property; purchased rights to intellectual property; non-routine profits.]

The consolidated enterprise thus gains much strategic business flexibility. Work can be shifted wherever it appears to be effective, perhaps where new incentives are provided, and the needed IP can be made available there, as long as the license fees are paid to the CFH.22 Paying these fees as royalties on profits is preferred, since profits reflect the ever-changing profit margins due to sales variability and to switching to cheaper labor.

The actual IP content needed to perform creative work is transferred through multiple paths: documents, code, and personal interaction by staff interchanges among the remote sites and the originating location. Most transfers are mediated by the Internet, allowing rapid interaction and feedback. The CFH does not get involved at all.

Three types of parties are involved in IP creation (as in Figure 2): the parent, the CFH, and the CFCs. Employees work at and create IP at the parent and the CFC locations. Large multinational corporations actually establish dozens of controlled entities to take advantage of different regulations and incentives in various countries.

Valuing Transferred IP
The CFH subcorporation that obtains the rights to the IP, and that will profit

January 2011 | Vol. 54 | No. 1 | Communications of the ACM 69
contributed articles

from fees for its use, must initially purchase the IP from the prior owner. For work that is offshored, the new workers do not contribute prior proprietary knowledge, only IP subsequently. But setting a fair price for the initial IP received is difficult and risky. If it is overvalued, the company selling it to the CFH will have gained too much income, on which it must pay taxes. If it is undervalued, excessive profits will accrue to the CFH.

How does the company document the value of its transferred IP? The annual reports to shareholders and the 10-K reports submitted annually to the U.S. Securities and Exchange Commission rarely include estimates of the value of a company’s intangible property. Only when one company acquires another high-tech company are due-diligence assessments made of the IP obtained. Various types of methods that help assess the value of the IP transferred from a parent to its CFCs or CFH are available, including these five (with many variations):

Future income. Predict the future income ceded to the CFH, subtract all expected costs, and reduce the remainder to account for routine profits. Compute the IP’s net present value (NPV) over its lifetime to obtain the amount due to the IP2;

Shareholder expectations. Use shareholder expectations embodied in the company’s total market capitalization, subtract the value of its tangibles, and split the remainder among the CFH and the parent3;

Expected value. Search for similar public transactions where the IP transfer was among truly independent organizations, then adjust for differences and calculate a median value19;

Diminishing maintenance. Aggregate the NPV of the specific incomes expected from the products sold over their lifetimes as their initial IP contribution due to maintenance diminishes31;

Expected R&D margin. Extrapolate the past margin obtained from ongoing R&D investment as it delivers benefits over successive years.15

All valuation methods depend on data. The availability of trustworthy data determines applicability and the trustworthiness of their results. Using more than one method helps increase confidence in the resulting valuation of the IP. If existing trademarks are being transferred or kept after an acquisition, their contribution to income requires adjustments as well. The benefits of marketing expenses tend to be short-lived. Technological IP is a mixture, some created through product improvement that drives revenue with little delay and some resulting from the fundamental R&D that takes a long time to get to market.

While valuing all IP in a company is a challenge, for the purpose of offshoring software IP, simplification of confounding items is possible, making the task easier. The amount of tangible property is relatively small in a high-tech company. The value of the work force can be determined through comparison with public data of acquisitions of similar companies with little IP.

Taxhavens
Offshoring is greatly motivated by being able to avoid or reduce taxes on income by moving rights to the IP into low-tax jurisdictions, or taxhavens, categorized as semi-taxhavens, or countries looking to attract jobs through active external investments, and primary taxhavens. Semi-taxhavens tend to provide temporary tax benefits. Countries intent on growth like Israel and Ireland have offered tax holidays to enterprises setting up activities there, while India provides incentives for companies that export. Many Eastern European countries have set up or are considering similar initiatives. Setting up a subsidiary CFC in a semi-taxhaven requires financial capital and significant corporate IP, helping workers be productive quickly. These resources are best provided via primary taxhavens.

Primary taxhavens are countries with small populations that focus on attracting companies that will not use actual resources there, and with no local personnel hired. Although their role is crucial in offshoring, the jobs issue is not raised, and the services needed for remote holding companies (such as registration with local government, mail forwarding, and arranging boards-of-directors meetings) are offered by branches of global accounting firms. For example, a single well-known five-floor building in the Cayman Islands is the address for 18,000 holding companies, and the entire country, with fewer than 50,000 inhabitants, hosts more than 90,000 registered companies and banks. The income from the $3,000 annual registration fees for that many companies allows the Cayman Islands to not impose any taxes on anybody. Even the beach resorts, available for board of directors meetings, are not taxed.

Defining what makes a country a prime taxhaven varies but always includes negligible or no taxation and lack of transparency. A few dozen jurisdictions actively solicit and lobby for business, citing their taxhaven advantages. Reporting income and assets is often not required. Advantages can be combined; for instance, the rule that Cayman-based corporations must have one local annual meeting can be overcome by having a Cayman company be formally resident in a British Crown Colony like Bermuda. Often, only a single CFH shareholder is fully controlled by another corporation. Cayman companies need not have external directors on their boards, and optional board meetings can be held anywhere convenient. Neither audits nor annual reports are required, but for criminal cases, records are made available. At the extreme end of the taxhaven spectrum are countries identified by the Organisation for Economic Co-operation and Development as uncooperative taxhavens, even sheltering fraud.20

The use of primary taxhavens causes a loss to the U.S. Treasury of more than $100 billion annually, a substantial amount compared to the $370 billion total actually collected as corporate tax.8,32 Only $16 billion in taxes were paid (for the year reviewed) by multinational corporations in the U.S.; smaller businesses pay the greatest share.27 The actual tax rate paid by companies using taxhavens averages 5%, even as they complain about high U.S. corporate taxes. It was estimated at a G-20 meeting that developing nations overall lose annual revenue of $125 billion due to taxhaven use.11

Assets in a Taxhaven
Following an IP transfer to a primary taxhaven, the taxhaven CFH will have two types of assets: more-auditable financial ones, derived from licensing and royalties for use of the IP, and the IP itself. Both grow steadily, as outlined in Figure 3, and are now freely available to initiate and grow projects in any CFC. The IP in the primary taxhaven is made available by charging license fees to projects in the semi-taxhavens, providing immediate income to the CFH. When the projects have generated products for sale, royalties on the sales provide further income to the CFH.

Figure 3. The changing investment scene, as taxhaven resources become available.

Initially, the income at the CFH is used to reimburse the parent company for the assumed value of the IP that was transferred,19 an amount typically paid over several years. Moving the IP offshore early in the life of a company, when there is little documented IP, increases the leverage of this approach. The income of the CFH is also used to pay for ongoing R&D or for the programmers at the parent company and in any IP-generating offshore location.28 U.S. taxes must be paid on such funds as they are repatriated to the U.S., since they represent taxable income. The funds not needed to support R&D (often more than half after the initial payback) can remain in the CFH. In each yearly cycle yet more funds flow to the holding company in the taxhaven. Additional funds may be repatriated from a CFH when a country (such as the U.S.) offers tax amnesties for capital repatriation or when the parent companies show losses, so the corporate income tax due can be offset.1,6

The payments by the CFH for creative work ensure that all resulting IP belongs to the CFH. While the value of the initial IP purchased diminishes over time, the total IP held in the CFH increases as the product is improved and provides a long-term IP and income stream.
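
The “Future income” valuation method listed under Valuing Transferred IP can be illustrated with a minimal Python sketch. It discounts each year's projected net income, after setting aside a share for routine profits, back to a present value. The function name, the fixed routine-profit share, the single discount rate, and every figure in the usage example are hypothetical assumptions chosen for illustration; none of them come from the article or from any actual valuation.

```python
def ip_value_from_future_income(incomes, costs, routine_profit_share, discount_rate):
    """Estimate IP value as the net present value (NPV) of non-routine income.

    incomes, costs: projected yearly amounts, year 1 first, equal length (assumed inputs).
    routine_profit_share: fraction of yearly net income treated as routine profit (assumption).
    discount_rate: yearly discount rate, e.g. 0.10 for 10% (assumption).
    """
    if len(incomes) != len(costs):
        raise ValueError("incomes and costs must cover the same years")
    npv = 0.0
    for year, (income, cost) in enumerate(zip(incomes, costs), start=1):
        net = income - cost                               # subtract expected costs
        non_routine = net * (1.0 - routine_profit_share)  # set aside routine profits
        npv += non_routine / (1.0 + discount_rate) ** year  # discount to the present
    return npv


if __name__ == "__main__":
    # Purely illustrative figures (e.g., in millions of dollars).
    value = ip_value_from_future_income(
        incomes=[100.0, 120.0, 90.0],
        costs=[60.0, 70.0, 60.0],
        routine_profit_share=0.25,
        discount_rate=0.10,
    )
    print(f"Estimated IP value: {value:.2f}")
```

A real valuation would also have to model how the contribution of the initial IP diminishes over time, which this sketch deliberately leaves out.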


Starting an Offshore Project
Money and IP accumulated in a primary taxhaven should be deployed for generating yet more income and to avoid showing excess capital on the consolidated books. Money will be needed to pay for workers on new projects, and IP is needed to make them effective and bring future income to non-routine levels. The period covered by Figure 3 may extend 10 years or more. The value of the IP needed for a new project is based on the expectation of the income it will generate and is very high for a promising project. The export of IP, just like any property export, should generate income to the provider. Such exported income, moved via a primary taxhaven, avoids any payment of taxes. Note that only an appropriate fraction of the rights to the IP is shipped out of the taxhaven. The actual documents are provided by the originators, wherever they might work, but the documents are kept by the CFH in the taxhaven, which formally owns the IP.

Since the value of IP is not reported anywhere, nothing is visible to employees or shareholders, except a few financial experts in the company, or more typically, their financial advisors.

If new projects are initiated fully by the primary taxhaven and not located with the parent, both the IP exports and resulting non-routine income enabled by IP transfers escape all taxation. In most jurisdictions no regulatory authority will check if the IP valuations and related royalties are fair.21 Funding a similar new project at a taxing locale that requires visibility, as in the U.S., would be costly and awkward; for instance, profit margins would be out of line and raise suspicion. Investing in low-cost countries that tax profits still provides tax benefits, since high license fees paid to a CFH greatly reduce the taxable profit in those countries.

Over time, the share of profits directly available to the parent decreases, and dividends may have to be paid out of CFH funds. These payments are taxed twice, first as part of corporate taxes and then, at a greatly reduced rate, as shareholder income. Paying few dividends out of CFH funds and starting new projects instead is an attractive alternative.

Eliminate corporate taxation as a way to avoid the distortion now driving the outflow of IP and providing much of the motivation for keeping capital and IP offshore.

Financial and IP Assets
Several issues should be of concern to computer professionals, even though their effects are indirect. For example, three effects of moving IP rights to a taxhaven are instability of work opportunities, imbalance of large-versus-small companies, and loss of infrastructure support.

Having funds in a primary taxhaven gives multinational corporations enough flexibility to exploit global opportunities. Whenever and wherever business opportunities and incentives are available, funds can be deployed quickly. Of course, moving work to semi-taxhavens is more advantageous than supporting work in countries that tax at typical rates. The flexibility for high-tech industries is notably great compared to industries that rely on tangible assets. When semi-taxhaven countries attract investment in tangibles (such as a car factory), benefits are retained after the tax holiday, but IP investments can be redeployed quickly. Only a few senior people may have to move (physically). When the semi-taxhaven also has a low-wage structure, benefits for the consolidated corporation multiply.

Countries seeking jobs for growing populations are pleased about any investment that creates jobs, even if structured to minimize local corporate profits and taxes. Governments often create semi-taxhavens to encourage new projects but rarely realize how rapidly corporations can move facilities that depend primarily on IP. Temporary tax incentives then fail to provide the long-term benefits these countries expected in return for the tax losses.

The tax avoidance enabled by accumulating IP and funds in any taxhaven reduces the ability of governments of the countries where the actual work is performed to support the infrastructure they need for a healthy economy. That infrastructure includes public roads and transportation, health services, and education for future generations. Scarcities can be seen in Silicon Valley, California, Silicon Glen, Scotland, and Electronics City, Bangalore,1 but tracing cause and effect is complex.

Smaller companies that have not had the opportunity to employ


taxhavens are disadvantaged, even though most economists view them as the major drivers of growth. In addition to unequal taxation, they are also less likely to be able to benefit from government tax credits for R&D. Such credits enable mature corporations to offset their U.S. R&D labor costs against any taxes remaining on their profits, while the IP generated accrues to their CFH. Smaller companies establish tax shelters as well. Since most large companies have already established them, tax-consulting firms intent on their own growth now also market taxhavens to mid-size businesses.

Lack of Transparency
The creators of the software, even if they care where their paycheck comes from and where the IP they produce goes, cannot follow the tortuous path from sales to salary.18 Many intermediate corporate entities are involved, so tracing the sources of programmer income becomes well nigh impossible. Even corporate directors, despite having ultimate responsibility, are not aware of specifics, other than having agreed to a tax-reduction scheme operated by their accountants. Investors and shareholders will not find in consolidated annual reports or 10-K filings any direct evidence of taxhaven use, since regulations devised to reduce paperwork hide amounts held and internal transactions within controlled corporations. Funds transferred for R&D and dividends from taxhavens are first deposited in corporate income accounts, then taxed, but may remain eligible for government tax credits for corporate research. The taxpayers in these countries are not aware of benefits beyond salaries; that is, income from profitable IP will not accrue to the country providing the research credits.23

Tax-avoidance processes have been explored in many publications but not applied to corporate IP transfer16; the adventures of movie and sports stars make more interesting reading. Promoters of corporate tax reduction, seeking to, perhaps, gain more business, provide general documentation, and even address the risks of misvaluation of IP and of faulty royalty rates.19 The complexity of this arrangement makes it easy to cross the boundaries of legality. Misvaluations can greatly reduce the magnitude of IP exports and consequent tax benefits. The firms that provide advice for setting up tax shelters have the required broad competencies not available in the critical constituencies of the computing community.1 Staff of firms providing such advice often function invisibly as directors of their customers’ CFHs. Most advising organizations protect themselves from legal liability by formally splitting themselves into distinct companies for each country in which they operate. These companies then rejoin by becoming members of a “club” (Vereinsgesetz) set up under Swiss laws. The member companies of such clubs do not assume responsibility for one another’s work and advice. But the club can share resources, information, and income among member companies, allowing them to function as a unit.

U.S. government officials are restricted in how they share corporate information. Rules established to protect corporate privacy prohibit the sharing of information among Internal Revenue staff regarding arrangements used by specific taxpayers to avoid taxes. Even a 2008 U.S. government report14 had to rely on survey data and could not use corporate filings. A thorough study of IP and capital flow would require changes in the restricting regulations.

Incremental Suggestions
No matter what conclusions you draw from this article, any follow-up will require increased transparency. U.S. Senate bill S.506 introduced in March 2009 by Senator Carl Levin (Dem., Michigan) “To restrict the use of taxhavens…” includes measures to increase access to corporate data of companies that set up taxhavens and to the information their advisers provide. Its primary goal is to tax CFHs as if they were domestic corporations. It is unclear if the bill will become law, since confounding arguments can be raised about its effects. The role of IP and jobs is not addressed in the bill, and unless the public is well-informed, meaningful reforms will have difficulty gaining traction.

Without transparency one cannot quantitatively assess the relationships between IP offshoring and jobs offshoring. While it is clear that there is initial dependency, long-term effects are only imagined. Tax schemes clearly create an imbalance of actual tax-rates being paid by small- versus large-business innovators.

With more information in hand, scientists and researchers in industry might try to influence potentially worrisome corporate policies. While employees have few (if any) legal rights to determine corporate directions, they may well have expectations about their employers’ behavior. A corporation may listen, since the motivation of its work force is a valuable asset. Corporate leaders might not have considered the long-term effect of schemes they themselves set in place to minimize taxes. However, these leaders are also under pressure to compete, nationally and internationally.1 It has been suggested that international initiatives are needed to level the corporate-taxation playing field.

Change the Flow
A radical solution to problems created by tax-avoidance schemes is to do away with corporate taxation altogether and compensate the U.S. government for the loss of tax income by fully taxing dividends and capital gains, that is, by imposing taxes only when corporate profits flow to the individuals consuming the benefits. The net effect on total tax revenues in the U.S. might be modest, since, in light of current tax avoidance strategies, corporations contributed as little as 8% to total U.S. tax revenue in 2004.6 Such a radical change would reduce the motivation for many distortions now seen in corporate behavior. Small businesses unable to pay the fees and manage the complexity of taxhavens would no longer be disadvantaged.

Getting effective international agreement seems futile, and no single government can adequately regulate multinational enterprises. Unilateral alternatives to deal with countries that shelter tax-shy corporations are infeasible as well, even without consideration of the role of IP and malfeasance.9 An underlying problem is that the law equates a corporation with a person, allowing confusing arguments, even though people have morals, motivations, and obligations that differ greatly from the obligations of corporations. This equivalence is seen as a philosophical mistake by some.13 For instance, humans cannot, without creating corporate entities, split themselves into multiple clones that take advantage of differing taxation regimes. In practice, not taxing corporations is such a radical change, affecting so many other aspects of the economy and public perception, that it is as unlikely as many other proposed tax reforms.8

Why Care?
The knowledge-based society brought forth a revolution of human productivity in the past 50 years, moving well beyond the industrial revolution that started more than a century earlier, and globalization is a means to distribute its benefits. But the growth of assets in taxhavens deprives workers worldwide of reasonably expected benefits. These hidden assets have grown to be a multiple of annual industry revenue, exceeding the assets held in the countries where the IP is being created. The presence of significant IP rights in taxhavens provides global corporations great flexibility to invest capital anywhere, avoiding income due to IP from being taxed anywhere. The combination of reduced support for education, government research funding, and physical infrastructure, along with the increased motivation to start new initiatives in semi-taxhavens and the imbalance of small businesses versus global corporations, is bound to affect the future of enterprises in countries that initiated high-tech industries, though the rate and final magnitude is unpredictable today. Better-educated scientists will be less affected and feel the effects more slowly.26 But any industry requires a mix of related competencies. It took 50 years for the U.S. car industry to be reduced to its current state. The velocity of change when intangibles, instead of tangible capabilities, are involved may well be greater.

The large amount of capital accumulated in taxhavens encourages ever-greater investment in foreign companies. As of August 2010, such investment was reported to amount to $7.6 billion, a 168% increase from the same period in 2009.23 The eight largest companies have $300 billion available in taxhavens. Cisco Systems alone reported it had $30 billion available in its tax shelters and expects to keep spending on foreign acquisitions. Such investment will create jobs all over the world, primarily in semi-taxhavens.

More support for CS education was a major emphasis of the ACM report, but where will the funding come from? The taxes on Cisco’s available funds, were they to be used for investment in the U.S., exceed total funding for the National Science Foundation and Defense Advanced Research Projects Agency. The IP, if it remains offshore, would quickly refill Cisco’s coffers there. Discussions concerning future education, leading to growth of knowledge-based industries, job creation, protection of retirement benefits, and the required infrastructure for growing businesses are futile if the creators of the required intellectual resources are uninformed about the interaction of IP and capital allocation. Initiating effective action is more difficult still.

Acknowledgments
This exposition was motivated by the Rebooting Computing meeting in Silicon Valley in January 2009 (http://www.rebootingcomputing.org/content/summit) and benefited from discussions on the topic with Peter J. Denning (the organizer), Joaquin Miller, Erich Neuhold, Claudia Newbold, Shaibal Roy, Stephen Smoliar, Shirley Tessler, Andy van Dam, Moshe Y. Vardi, and unknown, patient Communications reviewers. Any remaining lack of clarity is due to my failure in presenting a novel topic adequately to a technical audience. Any errors and opinions are also solely my responsibility.

References
1. Aspray, W., Mayadas, F., and Vardi, M.Y., Eds. Globalization and Offshoring of Software. A Report of the ACM Job Migration Task Force. ACM, New York, 2006.
2. Babcock, H. Appraisal Principles and Procedures. American Society of Appraisers, Herndon, VA, 1994; first edition 1968.
3. Becker, B. Cost sharing buy-ins. Chapter in Transfer Pricing Handbook, Third Edition, R. Feinschreiber, Ed. John Wiley & Sons, Inc., New York, 2002, A3–A16.
4. Boehm, B. Software Engineering Economics. Prentice-Hall, Upper Saddle River, NJ, 1981.
5. Bureau of Labor Statistics. National Employment and Wage Data Survey. Bureau of Labor Statistics, Washington D.C., May 2007; http://www.bls.gov/oes/
6. Clausing, K. The American Jobs Creation Act of 2004, Creating Jobs for Accountants and Lawyers. Urban-Brookings Tax Policy Center, Report 311122, Washington, D.C., Dec. 2004.
7. Compustat. Financial Results of Companies in SIC Code 7372 for 1999 to 2002; http://www.compustat.com
8. Cray, C. Obama’s tax haven reform: Chump change. CorpWatch (June 15, 2009); http://www.corpwatch.org/article.php?id=15386.
9. Dagan, T. The tax treaties myth. NYU Journal of International Law and Politics 32 article 379181 (Oct. 2000), 939–996.
10. Damodaran, A. Dealing with Intangibles: Valuing Brand Names, Flexibility and Patents Working Paper. Stern School of Business Reports, New York University, New York, Jan. 2006; http://pages.stern.nyu.edu/~adamodar/
11. Deutsche Stiftung für Entwicklungsländer. Bitter losses. English version (Aug. 7, 2010); http://www.inwent.org/ez/articles/178169/index.en.shtml
12. Gay, J. Free Software, Free Society: In Selected Essays of Richard M. Stallman. GNU Press, Boston, MA, 2002.
13. Gore, A. The Assault on Reason. Bloomsbury Publications, London, 2008.
14. Government Accountability Office. International Taxation: Large U.S. Corporations and Federal Contractors with Subsidiaries in Jurisdictions Listed as Tax Havens or Financial Privacy Jurisdictions. U.S. Government Accountability Office Report GAO-09-157, Washington, D.C., Dec. 2008; http://www.gao.gov/new.items/d09157.pdf
15. Grossman, G.M. and Helpman, E. Innovation and Growth in the Global Economy, Seventh Edition. MIT Press, Cambridge, MA, 2001.
16. Johnston, D.C. Perfectly Legal. Portfolio Publishers, New York, 2003.
17. Kiesewetter-Koebinger, S. Programmers’ capital. IEEE Computer 43, 2 (Feb. 2010), 106–108.
18. Lev, B. Intangibles, Management, Measurement and Reporting. Brookings Institution Press, Washington, D.C., 2001.
19. Levey, M.M., Wrappe, S.C., and Chung, K. Transfer Pricing Rules and Compliance Handbook. CCH Wolters Kluwer Publications, Chicago, IL, 2006.
20. Makhlouf, G. List of Uncooperative Tax Havens. OECD’s Committee on Fiscal Affairs. Organisation for Economic Co-operation and Development, Paris, France, Apr. 19, 2002.
21. Parr, R. Royalty Rates for Licensing Intellectual Property. John Wiley & Sons, Inc., New York, 2007.
22. Rahn, R.W. In defense of tax havens. The Wall Street Journal (Mar. 17, 2009).
23. Rashkin, M. Practical Guide to Research and Development Tax Incentives: Federal, State, and Foreign, Second Edition. CCH Wolters Kluwer Publications, Chicago, IL, 2007.
24. Saitto, S. U.S. tech firms shop abroad to avoid taxes. Bloomberg Businessweek (Sept. 6, 2010), 31–32.
25. Smith, G. and Parr, R. Intellectual Property, Valuation, Exploitation, and Infringement Damages, John Wiley & Sons, Inc., New York, 2005.
26. Tambe, P.B. and Hitt, L.M. How offshoring affects IT workers. Commun. ACM 53, 10 (Oct. 2010), 62–70.
27. Tichon, N. Tax Shell Game: The Taxpayer Cost of Offshore Corporate Havens. U.S. Public Interest Research Group, Apr. 2009; http://www.uspirg.org/news-releases/tax-and-budget/tax-and-budget-news/washington-d.c.-taxpayers-footing-a-100-billion-bill-for-tax-dodgers
28. Weissler, R. Advanced Pricing Agreement Program Training on Cost Sharing Buy-In Payments. Transfer Pricing Report 533. IRS, Washington D.C., Feb. 2002.
29. Wiederhold, G., Tessler, S., Gupta, A., and Smith, D.B. The valuation of technology-based intellectual property in offshoring decisions. Commun. Assoc. Infor. Syst. 24, 31 (June 2009), 523–544.
30. Wiederhold, G. Determining software investment lag. Journal of Universal Computer Science 14, 22 (2008).
31. Wiederhold, G. What is your software worth? Commun. ACM 49, 9 (Sept. 2006), 65–75.
32. Wilson, S. Is this the end for treasure islands? MoneyWeek (Mar. 13, 2009).

Gio Wiederhold (gio@cs.stanford.edu) is Professor (Emeritus) of Computer Science, Medicine, and Electrical Engineering at Stanford University, Stanford, CA; http://infolab.stanford.edu/people/gio.html

© 2011 ACM 0001-0782/11/0100 $10.00

doi:10.1145/1866739.1866757

Using Simple Abstraction to Reinvent Computing for Parallelism

The ICE abstraction may take CS from serial (single-core) computing to effective parallel (many-core) computing.

by Uzi Vishkin

key insights
• Computing can be reinvented for parallelism, from parallel algorithms through programming to hardware, preempting the technical barriers inhibiting use of parallel machines.
• Moving beyond the serial von Neumann computer (the only successful general-purpose platform to date), computer science will again be able to augment mathematical induction with a simple one-line computing abstraction.
• Being able to think algorithmically in parallel is a significant advantage for systems developers and programmers building and programming multi-core machines.

The recent dramatic shift from single-processor computer systems to many-processor parallel ones requires reinventing much of computer science to build and program the new systems. CS urgently requires convergence to a robust parallel general-purpose platform providing good performance and programming easy enough for all CS students and graduates. Unfortunately, ease-of-programming objectives have eluded parallel-computing research over at least the past four decades. The idea of starting with an established easy-to-apply parallel programming model and building an architecture for it has been treated as radical by hardware and software vendors alike. Here, I advocate an even more radical parallel programming and architecture idea: Start with a simple abstraction encapsulating the desired interface between programmers and system builders.


I begin by proposing the Immediate Concurrent Execution (ICE) abstraction, followed by two contributions supporting this abstraction I have led:
XMT. A general-purpose many-core explicit multi-threaded (XMT) computer architecture designed from the ground up to capitalize on the on-chip resources becoming available to support the formidable body of knowledge, known as parallel random-access machine (model), or PRAM, algorithmics, and the latent, though not widespread, familiarity with it; and
Workflow. A programmer’s workflow links ICE, PRAM algorithmics, and XMT programming. The ICE abstraction of an algorithm is followed by a description of the algorithm for the synchronous PRAM, allowing ease of reasoning about correctness and

Parallel Algorithms
The following two examples explore how these algorithms look and the opportunities and benefits they provide to systems developers and programmers.
Example 1. Given are two variables A and B, each containing some value. The exchange problem involves exchanging their values; for example, if the input to the exchange problem is A=2 and B=5, then the output is A=5 and B=2. The standard algorithm for this problem uses an auxiliary variable X and works in three steps:

X:=A
A:=B
B:=X

In order not to overwrite A and lose its content, the content of A is first stored in X, B is then copied to A, and finally the original content of A is copied from X to B. The work in this algorithm is three operations, the depth is three time units, and the space requirement (beyond input and output) is one word.
Given two arrays A[0..n-1] and B[0..n-1], each of size n, the array-exchange problem involves exchanging their content, so A(i) exchanges its content with B(i) for every i=0..n-1. The array exchange serial algorithm serially iterates the standard exchange algorithm n times. Here’s the pseudo-code:

For i=0 to n−1 do
    X:=A(i); A(i):=B(i); B(i):=X
The work is 3n, depth is 3n, and space is 2 (for X and i). A parallel array-exchange complexity, which is followed by mul-
algorithm uses an auxiliary array X[0..n-1] of size n, the parallel algorithm applies tithreaded programming that relaxes
concurrently the iterations of the serial algorithm, each exchanging A(i) with B(i) for a
different value of i. Note the new pardo command in the following pseudo-code: this synchrony for the sake of imple-
mentation. Directly reasoning about
For i =0 to n−1 pardo soundness and performance of mul-
X( i ):=A( i ) ; A( i ):=B( i ) ; B( i ):=X( i ) tithreaded code is generally known
This parallel algorithm requires 3n work, as in the serial algorithm. Its depth has to be error-prone. To circumvent the
improved from 3n to 3. If the size of the array n is 1,000 words, it would constitute likelihood of errors, the workflow in-
speedup by a factor of 1,000 relative to the serial algorithm. The increase in space to 2n corporates multiple levels of abstrac-
(for array X and n concurrent values of i) demonstrates a cost of parallelism.
Example 2. Given is the directed graph with nodes representing all commercial
tion; the programmer must establish
airports in the world. An edge connects node u to node v if there is a nonstop flight from only that multithreaded program
airport u to airport v, and s is one of these airports. The problem is to find the smallest behavior matches the synchronous
number of nonstop flights from s to any other airport. The WD algorithm works as PRAM-like algorithm it implements,
follows: Suppose the first i steps compute the fewest number of nonstop flights from s
to all airports that can be reached from s in at most i flights, while all other airports are a much simpler task. Current XMT
marked “unvisited.” hardware and software prototypes and
Step i+1 concurrently finds the destination of every outgoing flight from any airport demonstrated ease-of-programming
to which the fewest number of flights from s is exactly i, and for every such destination
and strong speedups suggest that CS
marked “unvisited” requires i+1 flights from s. Note that some “unvisited” nodes may
have more than one incoming edge. In such a case the arbitrary CRCW convention may be much better prepared for the
implies that one of the attempting writes succeeds. While we don’t know which one, we challenges ahead than many of our
do know all writes would enter the number i+1; in general, however, arbitrary CRCW colleagues realize.
also allows different values.
The standard serial algorithm for this problem9 is called breadth-first search, A notable rudimentary abstrac-
and the parallel algorithm described earlier is basically breadth-first search with one tion—that any single instruction avail-
difference: Step i+1 described earlier allows concurrent-writes. In the serial version, able for execution in a serial program
breadth-first search also operates by marking all nodes whose shortest path from s executes immediately—made serial
requires i+1 edges after all nodes whose shortest path from s requires i edges. The
serial version then proceeds to impose a serial order. Each newly visited node is placed computing simple. Abstracting away
in a first-in-first-out queue data structure. a hierarchy of memories, each with
Three lessons are drawn from this example: First, the serial order obstructs greater capacity but slower access
the parallelism in breadth-first search; freedom to process in any-order nodes for
which the shortest path from s has the same length is lost. Second, programmers
time than its predecessor, along with
trained to incorporate such serial data structures into their programs acquire bad different execution time for different
serial habits difficult to uproot; it may be better to preempt the problem by teaching operations, this Immediate Serial Exe-
parallel programming and parallel algorithms early. And third, to demonstrate the cution (ISE) abstraction has been used
performance advantage of the parallel algorithm over the serial algorithm, assume
that the number of edges in the graph is 600,000 (the number of nonstop flight links), by programmers for years to concep-
and the smallest number of flights from airport s to any other airport is no more than tualize serial computing and ensure
five. While the serial algorithm requires 600,000 basic steps, the parallel algorithm support by hardware and compilers. A
requires only six. Meanwhile, each of the six steps may require longer wall clock time
than each of the 600,000 steps, but the factor 600,000/6 provides leeway for speedups
program provides the instruction to be
by a proper architecture. executed next at each step (inductive-
ly). The left side of Figure 1 outlines
serial execution as implied by this ISE

76 communications of th e ac m | ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1
contributed articles

abstraction, where unit-time instruc- tion. Allowing programmers to view a programmers can’t handle,19 a prob-
tions execute one at a time. computer operation as a PRAM would lem of broad interest. Software pro-
The rudimentary parallel abstrac- make it easy to program,10 hence this duction has become a key compo-
tion I propose here is that indefinitely article should interest all such majors nent of the manufacturing sector of
many instructions available for con- and graduates. the economy. Mainstream machines
current execution execute immediate- Until 2004, standard (desktop) most programmers can’t handle cause
ly, dubbing the abstraction Immediate computers comprised a single proces- significant decline in productivity
Concurrent Execution. A consequence sor core. Since 2005 when multi-core of manufacturing, a concern for the
of ICE is a step-by-step (inductive) ex- computers became the standard, CS overall economy. Andy Grove, former
plication of the instructions available has appeared to be on track with a Chairman of the Board of Intel Corp.,
next for concurrent execution. The prediction5 of 100+-core computers said in the 1990s that the software spi-
number of instructions in each step is by the mid-2010s. Transition from se- ral—the cyclic process of hardware
independent of the number of proces- rial (single-core) computing to parallel improvements leading to software
sors, which are not even mentioned. (many-core) computing mandates the improvements leading back to hard-
The explication falls back on the serial reinvention of the very heart of CS, as ware improvements—was an engine
abstraction in the event of one instruc- these highly parallel computers must of sustained growth for IT for decades
tion per step. The right side of Figure 1 be built and programmed differently to come. A stable application-software
outlines parallel execution as implied from the single-core machines that base that could be reused and en-
by the ICE abstraction. At each time dominated standard computer sys- hanced from one hardware generation
unit, any number of unit-time instruc- tems since the inception of the field to the next was available for exploita-
tions that can execute concurrently do almost 70 years ago. By 2003, the clock tion. Better performance was assured
so, followed by yet another time unit rate of a high-end desktop proces- with each new generation, if only the
in which the same execution pattern sor had reached 4GHz, but processor hardware could run serial code faster.
repeats, and so on, as long as the pro- clock rates have improved only barely, Alas, the software spiral today is bro-
gram is running. if at all, since then; the industry simply ken.21 No broad parallel-computing
How might parallelism be advan- did not find a way to continue improv- application software base exists for
tageous for performance? The PRAM ing clock rates within an acceptable which hardware vendors are commit-
answer is that in a serial program the power budget.5 Fortunately, silicon ted to improving performance. And
number of time units, or “depth,” is technology improvements (such as no agreed-upon parallel architecture
the same as the algorithm’s total num- miniaturization) allow the amount of allows application programmers to
ber of operations, or “work,” while in logic a computer chip can contain to build such a base for the foreseeable
the parallel program the number of keep growing, doubling every 18 to 24 future. Instating a new software spiral
time units can be much lower. For a months per Gordon Moore’s 1965 pre- could indeed be a killer app for gener-
parallel program, the objective is that diction. Computers with an increas- al-purpose many-core computing; ap-
its work does not much exceed that ing number of cores are now expected plication software developers would
of its serial counterpart for the same but without significant improvement put it to good use for specific applica-
problem, and its depth is much lower in clock rates. Exploiting the cores tions, and more consumers worldwide
than its work. (Later in the article, I in parallel for faster completion of a would want to buy new machines.
note the straightforward connection computing task is today the only way This robust market for many-core-
between ICE and the rich PRAM algo- to improve performance of individual based machines and applications
rithmic theory and that ICE is nothing tasks from one generation of comput- leads to the following case for govern-
more than a subset of the work-depth ers to the next. ment support: Foremost among to-
model.) But how would a system de- Unfortunately, chipmakers are de- day’s challenges is many-core conver-
signer go about building a computer signing multi-core processors most gence, seeking timely convergence to
system that realizes the promise of
Figure 1. Serial execution based on the serial ISE abstraction vs. parallel execution based
ease of programming and strong per-
on the parallel ICE abstraction.
formance?
Outlining a comprehensive solu-
tion, I discuss basic tension between Serial doctrine Natural (parallel) algorithm
the PRAM abstraction and hardware (Immediate serial execution) (Immediate concurent execution)
implementation and a workflow that
goes through ICE and PRAM-related ..
abstractions for programming the ..
Operations

Operations
Number of

Number of

..
XMT computer architecture.
Some many-core architectures are
likely to become mainstream, mean- .. .. .. ..
ing they must be easy enough to pro- Time Time
gram by every CS major and graduate. Time = Number of Operations Time << Number of Operations
I am not aware of other many-core
architectures with PRAM-like abstrac-

ja n ua ry 2 0 1 1 | vo l . 5 4 | n o. 1 | c o m m u n i c at i o n s o f t he acm 77
contributed articles

a robust many-core platform coupled with a new many-core software spiral to serve the world of computing for years to come. A software spiral is basically an infrastructure for the economy. Since advancing infrastructures generally depends on government funding, designating software-spiral rebirth a killer app also motivates funding agencies and major vendors to support the work. The impact on manufacturing productivity could further motivate them.

Programmer Workflow

ICE requires the lowest level of cognition from the programmer relative to all current parallel programming models. Other approaches require additional steps (such as decomposition [10]). In CS theory, the speedup provided by parallelism is measured as work divided by depth; reducing the advantage of ICE/PRAM to practice is a different matter.

The reduction to practice I have led relies on the programmer's workflow, as outlined in the right side of Figure 2. Later, I briefly cover the parallel-algorithms stage. The step-by-step PRAM explication, or "data-parallel" instructions, represents a traditional tightly synchronous outlook on parallelism. Unfortunately, tight step-by-step synchrony is not a good match with technology, including its power constraints.

To appreciate the difficulty of implementing step-by-step synchrony in hardware, consider two examples: Memories based on long tightly synchronous pipelines of the type seen in Cray vector machines have long been out of favor among architects of high-performance computing; and processing memory requests takes from one to 400 clock cycles. Hardware must be made as flexible as possible to advance without unnecessary waiting for concurrent memory requests.

To underscore the importance of the bridge the XMT approach builds from the tightly synchronous PRAM to relaxed synchrony implementation, note three known limitations with power consumption of multi-core architectures: high power consumption of the wide communication buses needed to implement cache coherence; basic nm complexity of cache-coherence traffic (given n cores and m invalidations) and implied toll on inter-core bandwidth; and high power consumption needed for a tightly synchronous implementation in silicon in these designs. The XMT approach addresses all three by avoiding hardware-supported cache coherence altogether and by significantly relaxing synchrony.

Workflow is important, as it guides the human-to-machine process of programming; see Figure 2 for two workflows. The non-XMT hardware implementation on the left side of the figure may require revisiting and changing the algorithm to fit bandwidth constraints among threads of the computation, a programming process that doesn't always yield an acceptable outcome. However, the XMT hardware allows a workflow (right side of the figure) that requires tuning only for performance; revisiting and possibly changing the algorithm is generally not needed. An optimizing compiler should be able to do its own tuning without programmer intervention, as in serial computing.

Figure 2. Right column is a workflow from an ICE abstraction of an algorithm to implementation; left column may never terminate. [Left column: ICE, Parallel algorithm, Parallel program, then the test "Insufficient inter-thread bandwidth?"; "yes" leads to "Rethink algorithm: Take better advantage of cache" and back, "no" leads to Hardware. Right column: ICE, Parallel algorithm, XMT Program, Tune, XMT hardware.]

Most of the programming effort in traditional parallel programming (domain partitioning, load balancing) is generally of lesser importance for exploiting on-chip parallelism, where parallelism overhead can be kept low and processor-to-memory bandwidth high. This observation drove development of the XMT programming model and its implementation by my research team. XMT is intended to provide a simpler parallel programming model that efficiently exploits on-chip parallelism through multiple design elements.

The XMT architecture uses a high-bandwidth low-latency on-chip interconnection network to provide more uniform memory-access latencies. Other specialized XMT hardware primitives allow concurrent instantiation of as many threads as the number of available processors, a count that can reach into the thousands. Specifically, XMT can perform two main operations: forward (instantly) program instructions to all processors in the time required to forward the instructions (for one thread) to just one processor; and reallocate any number of processors that complete their jobs at the same time to new jobs (along with their instructions) in the time required to reallocate one processor. The high-bandwidth, low-latency interconnection network and low-overhead creation of many threads allow efficient support for the fine-grain parallelism used to hide memory latencies and a programming model for which locality is less an issue than in designs with less bandwidth. These mechanisms support dynamic load balancing, relieving programmers from having to directly assign work to processors. The programming model is simplified further by letting threads run to completion without synchronization (no busy-waits) and synchronizing access to shared data with prefix-sum (fetch-and-add type) instructions. These features result in a flexible programming style that accommodates the ICE abstraction and encourages program development for a range of applications.

The reinvention of computing for parallelism also requires pulling together a number of technical communities. My 2009 paper [26] sought to build a bridge to other architectures by casting the abstraction-centric vision of this article as a possible module in them, identifying a limited number of capabilities the module provides and suggesting a preferred embodiment of these capabilities using concrete "hardware hooks." If it is possible to augment a computer architecture through them (with hardware hooks or other means), the ICE abstraction and the programmer's workflow, in line with this article, can be supported. The only significant obstacle in today's multi-core architectures is their large cache-coherent local caches. Their limited scalability with respect to power gives vendors more reasons beyond an easier programming model to let go of this obstacle.

PRAM parallel algorithmic approach. The parallel random-access machine/model (PRAM) virtual model of computation is a generalization of the random-access machine (RAM) model [9]. RAM, the basic serial model behind standard programming languages, assumes any memory access or any operation (logic or arithmetic) takes unit time (serial abstraction). The formal PRAM model assumes a certain number, say, p of processors, each able to concurrently access any location of a shared memory in the same time as a single access. PRAM has several submodels that differ by assumed outcome of concurrent access to the same memory location for either read or write purposes. Here, I note only one of them, the Arbitrary Concurrent-Read Concurrent-Write (CRCW) PRAM, which allows concurrent accesses to the same memory location for reads or writes; reads complete before writes, and an arbitrary write (to the same location, unknown in advance) succeeds. PRAM algorithms are essentially prescribed as a sequence of rounds and, for each round, up to p processors execute concurrently. The performance objective is to minimize the number of rounds. The PRAM parallel-algorithmic approach is well-known and has never been seriously challenged by any other parallel-algorithmic approach in terms of ease of thinking or wealth of knowledge base. However, PRAM is also a strict formal model. A PRAM algorithm must therefore prescribe for each and every one of its p processors the instruction the processor executes at each time unit in a detailed computer-program-like fashion that can be quite demanding. The PRAM-algorithms theory mitigates this instruction-allocation scheme through the work-depth (WD) methodology.

This methodology (due to Shiloach and Vishkin [20]) suggests a simpler way to allocate instructions: A parallel algorithm can be prescribed as a sequence of rounds, and for each round, any number of operations can be executed concurrently, assuming unlimited hardware. The total number of operations is called "work," and the number of rounds is called "depth," as in the ICE abstraction. The first performance objective is to reduce work, and the immediate second one is to reduce depth. The methodology of restricting attention only to work and depth has been used as the main framework for the presentation of PRAM algorithms [16, 17] and is in my class notes on the XMT home page http://www.umiacs.umd.edu/users/vishkin/XMT/. Deriving a full PRAM description from a WD description is easy. For concreteness, I demonstrate WD descriptions on two examples, the first concerning parallelism, the second concerning the WD methodology (see the sidebar "Parallel Algorithms").

The programmer's workflow starts with the easy-to-understand ICE abstraction and ends with the XMT system, providing a practical implementation of the vast PRAM algorithmic knowledge base.

XMT programming model. The programming model behind the XMT framework is an arbitrary concurrent read, concurrent write single program multiple data, or CRCW SPMD, programming model with two executing modes: serial and parallel. The two instructions, spawn and join, specify the beginning and end, respectively, of a parallel section (see Figure 3). An arbitrary number of virtual threads, initiated by a spawn and terminated by a join, share the same code. The workflow relies on the spawn command to extend the ICE abstraction from the WD methodology to XMT programming. As with the respective PRAM model, the arbitrary CRCW aspect dictates that concurrent writes to the same memory location result in an arbitrary write committing. No assumption needs to be made by the programmer beforehand about which one will succeed. An algorithm designed with this property in mind permits each thread to progress at its own speed, from initiating spawn to terminating join, without waiting for other threads; no thread "busy-waits" for another thread. The implied "independence of order semantics" allows XMT to have a shared memory with a relatively weak coherence model. An advantage of this easier-to-implement SPMD model is that it is PRAM-like.

Figure 3. Serial and parallel execution modes. [Execution alternates: serial mode, Spawn, parallel mode, Join, serial mode, Spawn, parallel mode, Join, serial mode.]

The model also incorporates the prefix-sum statement operating on a base variable, B, and an increment variable, R. The result of a prefix-sum is that B gets the value B + R, while R gets the initial value of B. The operation is atomic, similar to fetch-and-increment in Gottlieb et al. [12]
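
The prefix-sum semantics just described can be illustrated in ordinary software. The following is a minimal Python sketch, not XMT code: the class name `PrefixSumBase` and the helper `run_threads` are hypothetical, and a lock stands in for the atomicity that XMT is described as providing in hardware.

```python
import threading


class PrefixSumBase:
    """Software stand-in for a prefix-sum base variable B (hypothetical name)."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()  # emulates atomicity; an assumption of this sketch

    def ps(self, increment):
        """Atomically set B to B + R and return the initial value of B."""
        with self._lock:
            old = self.value
            self.value = old + increment
            return old


def run_threads(n):
    """Let n threads each perform a prefix-sum with increment 1 on a common base."""
    base = PrefixSumBase(0)
    results = [None] * n

    def worker(tid):
        # Each thread receives a distinct value, whatever order the threads run in.
        results[tid] = base.ps(1)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return base.value, results


if __name__ == "__main__":
    final_value, returned = run_threads(8)
    print(final_value, sorted(returned))
```

Which thread receives which value is arbitrary, but the returned values are always distinct. That property is what lets threads claim unique slots in shared data without waiting on one another.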


Sidebar: Merging with a Single Spawn-Join

The merging problem takes as input two sorted arrays A = A[1 . . . n] and B = B[1 . . . n]. Each of these 2n elements must then be mapped into an array C = C[1 . . . 2n] that is also sorted. I first review the Shiloach-Vishkin two-step PRAM algorithm for merging, then discuss its related XMTC programming:

Step 1. Partitioning. This step selects some number x of elements from A at equal distances. In the example in the figure here, suppose the x = 4 elements 4, 16, 20, and 27 are selected and ranked relative to array B using x concurrent binary searches. Similarly, x elements from B at equal distances, say, elements 1, 7, 13, and 24, are also selected, then ranked relative to array A using x = 4 concurrent binary searches. The step takes O(log n) time. These ranked elements partition the merging job that must be completed into 2x = 8 "strips"; in the figure, step 2 includes eight such strips.

Step 2. Actual work. For each strip the remaining job is to merge a subarray of A with a subarray of B, mapping their elements into a subarray of C. Since these 2x merging jobs are mutually independent, each is able to concurrently apply the standard linear-time serial merging algorithm.

Figure. Main steps of the ranking/merging algorithm. [Two panels, "Step 1: Partitioning" and "Step 2: Actual Work," each showing the arrays A = 4, 6, 8, 9, 16, 17, 18, 19, 20, 21, 23, 25, 27, 29, 31, 32 and B = 1, 2, 3, 5, 7, 10, 11, 12, 13, 14, 15, 22, 24, 26, 28, 30.]

Consider the following complexity analysis of this algorithm: Since each strip has at most n/x elements from A and n/x elements from B, the depth (or parallel time) of the second step is O(n/x). If x ≤ n/log n, the first step and the algorithm as a whole do O(n) work. In the PRAM model, this algorithm requires O(n/x + log n) time. A simplistic XMTC program requires as many spawn (and respective join) commands as the number of PRAM steps. The reasons I include this example here are that it involves a way to use only a single spawn (and a single join) command to represent the whole merging algorithm and, as I explain in the Conclusion, to demonstrate an XMT advantage over current hardware by comparing it with the parallel merging algorithm in Cormen et al. [9]

Merging in XMTC. An XMTC program spawns 2x concurrent threads, one for each of the selected elements in array A or B. Using binary search, each thread first ranks its array element relative to the other array, then proceeds directly (without a join operation) to merge the elements in its strip, terminating just before setting the merging result of another selected element because the merging result is computed by another thread.

To demonstrate the operation of a thread, consider the thread of element 20. Starting with binary search on array B the thread finds that 20 ranks as 11 in B; 11 is the index of 15 in B. Since the index of 20 in A is 9, element 20 ranks 20 in C. The thread then compares 21 to 22 and ranks element 21 (as 21), then compares 23 to 22 to rank 22, 23 to 24 to rank 23, and 24 to 25 but terminates since the thread of 24 ranks 24, concluding the example.

Our experience is that, with little effort, XMT-type threading requires fewer synchronizations than implied by the original PRAM algorithm. The current merging example demonstrates that synchronization reduction is sometimes significant.
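
The two steps of this sidebar can be sketched in Python. This is an illustrative sequential simulation under stated assumptions, not the XMTC program: the function names `partition_points`, `merge_strip`, and `merge_with_strips` are hypothetical, indices are 0-based (the sidebar uses 1-based), ties are resolved by taking the element of A first, and the strips are processed in a loop where XMT would run one thread per strip.

```python
from bisect import bisect_left, bisect_right


def partition_points(a, b, x):
    """Step 1: rank about x equally spaced elements of each array in the other.

    Returns start points (i, j), each meaning "i elements of a and j elements
    of b precede this position in the merged output," ordered by i + j.
    """
    starts = {(0, 0)}
    for i in range(0, len(a), max(1, len(a) // x)):
        starts.add((i, bisect_left(b, a[i])))   # elements of b strictly below a[i]
    for j in range(0, len(b), max(1, len(b) // x)):
        starts.add((bisect_right(a, b[j]), j))  # elements of a at or below b[j]
    return sorted(starts, key=lambda p: p[0] + p[1])


def merge_strip(a, b, c, start, stop):
    """Step 2: serial merge from start until reaching the next strip's start."""
    i, j = start
    while (i, j) != stop:
        if j >= len(b) or (i < len(a) and a[i] <= b[j]):
            c[i + j] = a[i]
            i += 1
        else:
            c[i + j] = b[j]
            j += 1


def merge_with_strips(a, b, x):
    """Merge sorted lists a and b; the strips are independent of one another."""
    points = partition_points(a, b, x)
    c = [None] * (len(a) + len(b))
    for k, start in enumerate(points):
        stop = points[k + 1] if k + 1 < len(points) else (len(a), len(b))
        merge_strip(a, b, c, start, stop)  # could run concurrently per strip
    return c


if __name__ == "__main__":
    A = [4, 6, 8, 9, 16, 17, 18, 19, 20, 21, 23, 25, 27, 29, 31, 32]
    B = [1, 2, 3, 5, 7, 10, 11, 12, 13, 14, 15, 22, 24, 26, 28, 30]
    print(merge_with_strips(A, B, 4))
```

Each strip writes a disjoint range of `c` and stops exactly where the next strip begins, so no strip needs to wait for another. With the arrays from the figure, the start point for element 20 is `(8, 11)`, which corresponds to the sidebar's statement that 20 ranks 20 in C once converted to 1-based indexing.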

The primitive is especially useful when several threads perform a prefix-sum simultaneously against a common base, because multiple prefix-sum operations can be combined by the hardware to form a very fast multi-operand prefix-sum operation. Because each prefix-sum is atomic, each thread returns a different R value. This way, the parallel prefix-sum command can be used to implement efficient and scalable inter-thread synchronization by arbitrating an ordering among the threads.

The XMTC high-level language implements the programming model. XMTC is an extension of standard C, augmenting C with a small number of commands (such as spawn, join, and prefix-sum). Each parallel region is delineated by spawn and join statements, and synchronization is achieved through the prefix-sum and join commands. Every thread executing the parallel code is assigned a unique thread ID, designated $. The spawn statement takes as arguments the lowest ID and highest ID of the threads to be spawned. For the hardware implementation (discussed later), XMTC threads can be as short as eight to 10 machine instructions that are not difficult to get from PRAM algorithms. Programmers from high school to graduate school are pleasantly surprised by the flexibility of translating PRAM algorithms to XMTC multi-threaded programs. The ability to code the whole merging algorithm using a single spawn-join pair is one such surprise (see the sidebar "Merging with a Single Spawn-Join").

To demonstrate simple code, consider two code examples:

The first is a small XMTC program for the parallel exchange algorithm discussed in the "Parallel Algorithms" sidebar:

    spawn(0, n-1) {
        int x;
        x = A[$];
        A[$] = B[$];
        B[$] = x;
    }

The program spawns a concurrent thread for each of the depth-3 serial-exchange iterations using a local variable x. Note that the join command is implied by the closing brace at the end of the program.

The second assumes an array of n integers A. The programmer wishes to "compact" the array by copying all non-zero values to another array, B, in an arbitrary order. The XMTC code is:

    psBaseReg x = 0;
    spawn(0, n-1) {
        int e;
        e = 1;
        if (A[$] != 0) {
            ps(e, x);
            B[e] = A[$];
        }
    }

It declares a variable x as the base value to be used in a prefix-sum command (ps in XMTC), initializing it to 0. It then spawns a thread for each of the n elements in A. A local thread variable e is initialized to 1. If the element of the thread is non-zero, the thread performs a prefix-sum to get a unique index into B where it can place its value.

Other XMTC commands. Prefix-sum-to-memory (psm) is another prefix-sum command, the base of which is any location in memory. While the increment of ps must be 0 or 1, the increment of psm is not limited, though its implementation is less efficient. Single Spawn (sspawn) is a command that can spawn an extra thread and be nested. A nested spawn command in XMTC code must be replaced (by programmer or compiler) by sspawn commands. The XMTC commands are described in the programmer's manual included in the software release on the XMT Web pages.

Tuning XMT programs for performance. My discussion here of performance tuning would be incomplete without a description of salient features of the XMT architecture and hardware. The XMT on-chip general-purpose computer architecture is aimed at the classic goal of reducing single-task completion time. The WD methodology gives algorithm designers the ability to express all the parallelism they observe. XMTC programming further permits expressing this virtual parallelism by letting programmers express as many concurrent threads as they wish. The XMT processor must now provide an effective way to map this virtual parallelism onto the hardware. The XMT architecture provides dynamic allocation of the XMTC threads onto the hardware for better load balancing. Since XMTC threads can be short, the XMT hardware must directly manage XMT threads to keep overhead low. In particular, an XMT program looks like a single thread to the operating system (see the sidebar "The XMT Processor" for an overview of XMT hardware).

The main thing performance programmers must know in order to tune the performance of their XMT programs is that a ready-to-run version of an XMT program depends on several parameters: the length of the (longest) sequence of roundtrips to memory (LSRTM); queuing delay to the same shared memory location (known as queue-read queue-write, or QRQW [11]); and work and depth. Their optimization is a responsibility shared subtly by the architecture, the compiler, and the programmer/algorithm designer. See Vishkin et al. [27] for a demonstration of tuning XMTC code for performance by accounting for LSRTM. As an example, it improves XMT hardware performance on the problem of summing n numbers.

Execution can differ from the literal XMTC code in order to keep the size of working space under control or otherwise improve performance. For example, compiler and runtime methods could perform this modification by clustering virtual threads offline or online and prioritizing execution of nested spawns using known heuristics based on a mix of depth-first and breadth-first searches.

Commitments to silicon of XMT by my research team at the University of Maryland include a 64-processor, 75MHz computer based on field-programmable gate array (FPGA) technology developed by Wen [28] and a 64-processor ASIC 10mm × 10mm chip using IBM's 90nm technology developed together by Balkan, Horak, Keceli, and Wen (see Figure 4). Tzannes and Caragea (guided by Barua and me) have also developed a basic yet stable compiler, and Keceli has developed a cycle-accurate simulator of XMT. Both are available through the XMT software release on the XMT Web pages.

Easy to build. An individual graduate student with no prior design experience completed the XMT hardware description (in Verilog) in just over two years (2005–2007). XMT is also silicon-efficient. The ASIC design by the XMT research team at the University of Maryland shows that a 64-processor XMT needs the same silicon area as a (single) current commodity core. The XMT approach goes after any type of application parallelism regardless of how much parallelism the application requires, the regularity of this parallelism, or the parallelism's grain size, and is amenable to standard multiprogramming where the hardware supports several concurrent operating-system threads.

The XMT team has demonstrated good XMT performance, independent software engineers have demonstrated XMT programmability (see Hochstein et al. [14]), and independent education professionals have demonstrated

January 2011 | Vol. 54 | No. 1 | Communications of the ACM 81
contributed articles

XMT teachability (see Torbert et al. [22]). Highlights include evidence of 100X speedups on general-purpose applications on a simulator of 1,000 on-chip processors [13] and speedups ranging from 15X to 22X for irregular problems (such as Quicksort, breadth-first search on graphs, finding the longest path in a directed acyclic graph), and speedups of 35X–45X for regular programs (such as matrix multiplication and convolution) on the 64-processor XMT prototype versus the best serial code on XMT [28].

In 2009, Caragea et al. [8] demonstrated nearly 10X average performance improvement potential relative to Intel Core 2 Duo for a 64-processor XMT chip using the same silicon area as a single core. In 2010, Caragea et al. [7] demonstrated that, using the same silicon area as a modern graphics processing unit (GPU), the XMT design achieves an average speedup of 6X relative to the GPU for irregular applications and falls only slightly behind on regular ones. All GPU code was written and optimized by researchers and programmers unrelated to the XMT project.

With few exceptions, parallel programming approaches that dominated parallel computing prior to many-cores are still favored by vendors, as well as high-performance users. The steps they require include decomposition, assignments, orchestration, and mapping [10]. Indeed, parallel programming difficulties have failed all general-purpose parallel systems to date by limiting their use. In contrast, XMT frees its programmers from doing all the steps, in line with the ICE/PRAM abstraction.

The XMT software environment release (a 2010 release of the XMTC compiler and cycle-accurate simulator of XMT) is available by free download from the XMT home page and from sourceforge.net, along with extensive documentation, and can be downloaded to any standard desktop computing platform. Teaching materials covering a class-tested programming methodology in which students are taught only parallel algorithms are also available from the XMT Web pages.

Most CS programs today graduate students to a job market certain to be dominated by parallelism but without the preparation they need. The level of awareness of parallelism required by the ICE/PRAM abstraction is so basic it is necessary for all other current approaches. As XMT is also buildable, the XMT approach is sufficient for programming a real machine. I therefore propose basing the introduction of the new generation of CS students to parallelism on the workflow presented here, at least until CS generally converges on a many-core platform.

Related efforts. Related efforts toward parallelism come in several flavors; for example, Valiant’s Multi-BSP bridging model for multi-core computing [24] appears closest to the XMT focus on abstraction. The main difference, however, is that XMT aims to preempt known shortcomings in existing machines by showing how to build machines differently, while the modeling in Valiant [24] aims to improve understanding of existing machines.

These prescriptive versus descriptive objectives are not the only difference. Valiant [24] modeled relatively low-level parameters of certain multi-core architectures, making them closer to Vishkin et al. [27] than to this article. Unlike both sources, simplicity drives the “one-liner” ICE abstraction. Parallel languages (such as CUDA, MPI, and OpenMP) tend to be different from computational models, as they often do not involve performance modeling. They require a level of detail that distances them farther from simple

Figure 4. Left side. FPGA board (size of a car license plate) with three FPGA chips (generously donated by Xilinx): A, B: Virtex-4LX200; C: Virtex-4FX100. Right side. 10mm X 10mm chip using IBM Flip-Chip technology. (Labels on the board photo: A, B, C, DDR2, PCI bus.)

Eye-of-a-Needle Aphorism

Introduced at a time of hardware scarcity almost 70 years ago, the von Neumann apparatus of stored program and program counter forced the threading of instructions through a metaphoric eye of a needle. Coupling of mathematical induction and (serial) ISE abstraction was engineered to provide this threading, as discussed throughout the article. See especially the description of how variable X is used in the pseudo-code of the serial iterative algorithm in the exchange problem; also the first-in-first-out queue data structure in the serial breadth-first search; and the serial merging algorithm in which two elements are compared at a time, one from each of two sorted input arrays. As eye-of-a-needle threading is already second nature for many programmers, it has come to be associated with ease of programming.

Threading through the eye of a needle is an aphorism for extreme difficulty, even impossibility, in the broader culture, including in the texts of three major religions. The XMT extension to the von Neumann apparatus (noted in “The XMT Processor” sidebar) exploits today’s relative abundance of hardware resources to free computing from the constraint of threading through the original apparatus. Coupling mathematical induction and the ICE abstraction explored here is engineered to capitalize on this freedom for ease of parallel programming and improved machine and application performance.
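The array-compaction XMTC program shown earlier in the article can be mimicked on a conventional machine to see what the prefix-sum command contributes. The following is only a minimal Python sketch, not XMTC and not the XMT toolchain: the function name compact_nonzero is hypothetical, ordinary OS threads stand in for XMTC virtual threads, and a lock stands in for the hardware PS unit. It assumes that ps(e, x) atomically adds e to the base x and hands the old value of x back in e.

```python
import threading


def compact_nonzero(a):
    """Copy the non-zero values of `a` into a new list, in arbitrary order."""
    n = len(a)
    b = [None] * n            # output array B; only the first x slots get used
    x = 0                     # plays the role of the psBaseReg x
    lock = threading.Lock()   # stands in for the atomicity of the hardware ps

    def ps(e):
        """Assumed ps semantics: atomically add e to x, return the old x."""
        nonlocal x
        with lock:
            old = x
            x += e
            return old

    def thread_body(i):       # i plays the role of the thread ID `$`
        if a[i] != 0:
            e = ps(1)         # e is now a slot in B that no other thread holds
            b[e] = a[i]

    threads = [threading.Thread(target=thread_body, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:         # the join implied at the end of the spawn block
        t.join()
    return b[:x]


if __name__ == "__main__":
    print(sorted(compact_nonzero([0, 7, 0, 3, 0, 0, 9])))
```

Because each thread receives a distinct old value of x, no two threads write the same slot of B, which is why the output order is arbitrary but the set of copied values is not.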


The XMT Processor


The XMT processor (see the figure here) includes a master thread control unit (MTCU), processing clusters, each comprising several thread-control units (TCUs), a high-bandwidth low-latency interconnection network [3] and its extension to a globally asynchronous locally synchronous, GALS-style, design incorporating asynchronous logic [15, 18], memory modules (MM), each comprising on-chip cache and off-chip memory, prefix-sum (PS) unit(s), and global registers. The shared-memory-modules block (bottom left of the figure) suppresses the sharing of a memory controller by several MMs. The processor alternates between serial mode (in which only the MTCU is active) and parallel mode. The MTCU has a standard private data cache (used in serial mode) and a standard instruction cache. The TCUs, which lack a write data cache, share the MMs with the MTCU.

The overall XMT design is guided by a general design ideal I call no-busy-wait finite-state-machines, or NBW FSM, meaning the FSMs, including processors, memories, functional units, and interconnection networks comprising the parallel machine, never cause one another to busy-wait. It is ideal because no parallel machine can operate that way. Nontrivial parallel processing demands the exchange of results among FSMs. The NBW FSM ideal represents my aspiration to minimize busy-waits among the various FSMs comprising a machine.

Here, I cite the example of how the MTCU orchestrates the TCUs to demonstrate the NBW FSM ideal. The MTCU is an advanced serial microprocessor that also executes XMT instructions (such as spawn and join). Typical program execution flow, as in Figure 3, can also be extended through nesting of sspawn commands. The MTCU uses the following XMT extension to the standard von Neumann apparatus of the program counters and stored program: Upon encountering a spawn command, the MTCU broadcasts the instructions in the parallel section starting with that spawn command and ending with a join command on a bus connecting to all TCU clusters.

The largest ID number of a thread the current spawn command must execute Y is also broadcast to all TCUs. The ID (index) of the largest executing threads is stored in a global register X. In parallel mode, a TCU executes one thread at a time. Executing a thread to completion (upon reaching a join command), the TCU does a prefix-sum using the PS unit to increment global register X. In response, the TCU gets the ID of the thread it could execute next; if the ID is ≤Y, the TCU executes a thread with this ID. Otherwise, the TCU reports to the MTCU that it finished executing. When all TCUs report they’ve finished, the MTCU continues in serial mode.

The broadcast operation is essential to the XMT ability to start all TCUs at once in the same time it takes to start one TCU. The PS unit allows allocation of new threads to the TCUs that just became available within the same time it takes to allocate one thread to one TCU. This dynamic allocation provides runtime load-balancing of threads coming from an XMTC program.

We are now ready to connect with the NBW FSM ideal. Consider an XMT program derived from the workflow. From the moment the MTCU starts executing a spawn command until each TCU terminates the threads allocated to it, no TCU can cause any other TCU to busy-wait for it. An unavoidable busy-wait ultimately occurs when a TCU terminates and begins waiting for the next spawn command.

TCUs, with their own local registers, are simple in-order pipelines, including fetch, decode, execute/memory-access, and write-back stages. The FPGA computer has 64 TCUs in four clusters of 16 TCUs each. XMT designers and evangelists aspire to develop a machine with 1,024 TCUs in 64 clusters. A cluster includes functional units shared by several TCUs and one load/store port to the interconnection network shared by all its TCUs.

The global memory address space is evenly partitioned into the MMs through a form of hashing. The XMT design eliminates the cache-coherence problem, a challenge in terms of bandwidth and scalability. In principle, there are no local caches at the TCUs. Within each MM, the order of operations to the same memory location is preserved.

For performance enhancements (such as data prefetch) incorporated into the XMT hardware, along with more on the architecture, see Wen and Vishkin [28]; for more on compiler and runtime scheduling methods for nested parallelism, see Tzannes et al. [23], and for prefetching methods, see Caragea et al. [6]

Patents supporting the XMT hardware were granted from 2002 to 2010, appearing in Nuzman and Vishkin [18] and Vishkin [25].

Figure. Block diagram of the XMT architecture. Labels in the diagram: cluster 0 through cluster n; TCU 0 through TCU t; TCU I-Cache; Read Buffers; Register File; PS Network; FU interconnection network; Shared Functional Units (FU 0 through FU p); LSU with Hashing Function; PS Unit (and global register); Instruction Broadcast; Cluster-Memory Interconnection Network; Shared Memory Modules (MM 0 through MM M, each with L1 Cache and L2 Cache); Master TCU with Functional Units and Register File, Private L1 D-Cache, and Private L1 I-Cache.
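The thread-allocation protocol described in this sidebar (a TCU that finishes a thread increments global register X through the PS unit and runs the returned ID if it does not exceed Y) can be illustrated with a small event-driven simulation. This is a rough Python sketch under stated assumptions, not a model of the actual XMT hardware: the function name simulate_spawn, the per-thread durations, and the rule that TCU k initially starts with thread k are all hypothetical choices made for illustration.

```python
import heapq


def simulate_spawn(durations, num_tcus):
    """Simulate dynamic allocation of threads 0..Y onto TCUs.

    durations[i] is a made-up running time for thread i.
    Returns (finish_time, assignment), where assignment[i] is the TCU
    that executed thread i.
    """
    y = len(durations) - 1          # largest thread ID, broadcast with the spawn
    x = min(num_tcus, y + 1) - 1    # global register X: largest ID handed out so far
    assignment = {}
    busy = []                       # heap of (time the TCU becomes free, TCU id)
    for tcu in range(x + 1):        # assumption: TCU k starts with thread k
        assignment[tcu] = tcu
        heapq.heappush(busy, (durations[tcu], tcu))
    finish = 0
    while busy:
        now, tcu = heapq.heappop(busy)
        finish = max(finish, now)
        x += 1                      # prefix-sum on X: the TCU receives the next ID
        if x <= y:
            assignment[x] = tcu
            heapq.heappush(busy, (now + durations[x], tcu))
        # otherwise the TCU reports to the MTCU that it has finished
    return finish, assignment


if __name__ == "__main__":
    # One long thread and five short ones on two TCUs.
    print(simulate_spawn([5, 1, 1, 1, 1, 1], num_tcus=2))
```

In this toy run the TCU that drew the long thread keeps it, while the other TCU picks up all remaining short threads, which is the load-balancing effect the sidebar attributes to dynamic allocation.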


abstractions.

Several research centers [1, 2] are actively exploring the general problems discussed here. The University of California, Berkeley, Parallel Computing Lab and Stanford University’s Pervasive Parallelism Laboratory advocate an application-driven approach to reinventing computing for parallelism.

Conclusion

The vertical integration of a parallel-processing system, compiler, programming, and algorithms proposed here through the XMT framework with the ICE/PRAM abstraction as its front-end is notable for its relative simplicity. ICE is a newly defined feature that has not appeared in prior research, including my own, and is more rudimentary than prior parallel computing concepts. Rudimentary concepts are the basis for the fundamental development of any field. ICE can be viewed as an axiom that builds on mathematical induction, one of the more rudimentary concepts in mathematics. The suggestion here of using a simple abstraction as the guiding principle for reinventing computing for parallelism also appears to be new. Considerable evidence suggests it can be done (see the sidebar “Eye-of-a-Needle Aphorism”).

The following comparison with a chapter on multithreading algorithms in the 2009 textbook Introduction to Algorithms by Cormen et al. [9] helps clarify some of the article’s contributions. The 1990 first edition of Cormen et al. [9] included a chapter on PRAM algorithms emphasizing the role of work-depth design and analysis; the 2009 chapter [9] likewise emphasized work-depth analysis. However, to match current commercial hardware, the 2009 chapter turned to a variant of dynamic multithreading (in lieu of work-depth design) in which the main primitive was similar to the XMT sspawn command (discussed here). One thread was able to generate only one more thread at a time; these two threads would then generate one more thread each, and so on, instead of freeing the programmer to directly design for the work-depth analysis that follows (per the same 2009 chapter).

Cormen et al.’s [9] dynamic multithreading should encourage hardware enhancement to allow simultaneously starting many threads in the same time required to start one thread. A step ahead of available hardware, XMT includes a spawn command that spawns any number of threads upon transition to parallel mode. Moreover, the ICE abstraction incorporates work-depth early in the design workflow, similar to Cormen et al.’s 1990 first edition [9].

The O(log n) depth parallel merging algorithm versus the O(log^2 n) depth one in Cormen et al. [9] demonstrated an XMT advantage over current hardware, as XMT allows a parallel algorithm for the same problem that is both faster and simpler. The XMT hardware scheduling brought the hardware performance model much closer to work-depth and allowed the XMT workflow to streamline the design with the analysis from the start.

Several features of the serial paradigm made it a success, including a simple abstraction at the heart of the “contract” between programmers and builders, the software spiral, ease of programming, ease of teaching, and backward compatibility on serial code and application programming. The only feature that XMT, as in other multi-core approaches, does not provide is speedups for serial code. The ICE/PRAM/XMT workflow and architecture provide a viable option for the many-core era. My XMT solution should challenge and inspire others to come up with competing abstraction proposals or alternative architectures for ICE/PRAM. Consensus around an abstraction will move CS closer to convergence toward a many-core platform and put the software spiral back on track.

The XMT workflow also gives programmers a productivity advantage. For example, I have traced several errors in student-developed XMTC programs to shortcuts the students took around the ICE algorithms. Overall, improved understanding of programmer productivity, a traditionally difficult issue in parallel computing, must be a top priority for architecture research. To the extent possible, evaluation of productivity should be on par with performance and power. For starters, productivity benchmarks must be developed.
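The contrast drawn above between a primitive in which each thread can create only one more thread at a time and a spawn command that starts any number of threads at once can be made concrete with a small counting model. The sketch below is an idealized illustration in Python, not a measurement of any machine: the function names are hypothetical, a "round" is an abstract unit of time, and scheduling and memory costs are ignored.

```python
def forking_rounds(n):
    """Rounds needed to reach n live threads when every live thread can
    create one more thread per round (1 -> 2 -> 4 -> ...)."""
    rounds, threads = 0, 1
    while threads < n:
        threads *= 2
        rounds += 1
    return rounds


def spawn_rounds(n):
    """Idealized XMT-style spawn: all n threads start in a single round."""
    return 1 if n > 0 else 0


if __name__ == "__main__":
    for n in (1, 64, 1000, 1024, 1025):
        print(n, forking_rounds(n), spawn_rounds(n))
```

Under this model the doubling scheme needs about log2(n) rounds before all n threads exist, an overhead that is added to the depth of whatever algorithm the threads then execute, whereas the spawn model contributes a single round regardless of n.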


Ease of programming, or programmability, is a necessary condition for the success of any many-core platform, and teachability is a necessary condition for programmability and in turn for productivity. The teachability of the XMT approach has been demonstrated extensively; for example, since 2007 more than 100 students in grades K–12 have learned to program XMT, including in two magnet programs: Montgomery Blair High School, Silver Spring, MD, and Thomas Jefferson High School for Science and Technology, Alexandria, VA [22]. Others are Baltimore Polytechnic High School, where 70% of the students are African American, and a summer workshop for middle-school students from underrepresented groups in Montgomery County, MD, public schools.

In the fall of 2010, I jointly conducted another experiment, this one via video teleconferencing with Professor David Padua of the University of Illinois, Urbana-Champaign using OpenMP and XMTC, with XMTC programming assignments run on the XMT 64-processor FPGA machine. Our hope was to produce a meaningful comparison of programming development time from the 30 participating Illinois students. The topics and problems covered in the PRAM/XMT part of the course were significantly more advanced than OpenMP alone. Having sought to demonstrate the importance of teachability from middle school on up, I strongly recommend that it become a standard benchmark for evaluating many-core hardware platforms.

Blake et al. [4] reported that after analyzing current desktop/laptop applications for which the goal was better performance, the applications tend to comprise many threads, though few of them are used concurrently; consequently, the applications fail to translate the increasing thread-level parallelism in hardware to performance gains. This problem is not surprising given that most programmers can’t handle multi-core microprocessors. In contrast, guided by the simple ICE abstraction and by the rich PRAM knowledge base to find parallelism, XMT programmers are able to represent that parallelism using a type of threading the XMT hardware is engineered to exploit for performance.

For more, see the XMT home page at the University of Maryland http://www.umiacs.umd.edu/users/vishkin/XMT/. The XMT software environment release is available by free download there and from sourceforge.net at http://sourceforge.net/projects/xmtc/, along with extensive documentation. A 2010 release of the XMTC compiler and cycle-accurate simulator of XMT can also be downloaded to any standard desktop computing platform. Teaching materials covering a University of Maryland class-tested programming methodology in which even college freshmen and high school students are taught only parallel algorithms are also available from the XMT Web pages.

Acknowledgment

This work is supported by the National Science Foundation under grant 0325393.

References

1. Adve, S. et al. Parallel Computing Research at Illinois: The UPCRC Agenda. White Paper. University of Illinois, Champaign-Urbana, IL, 2008; http://www.upcrc.illinois.edu/UPCRC_Whitepaper.pdf
2. Asanovic, K. et al. The Landscape of Parallel Computing Research: A View from Berkeley. Technical Report UCB/EECS-2006-183. University of California, Berkeley, 2006; http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf
3. Balkan, A., Horak, M., Qu, G., and Vishkin, U. Layout-accurate design and implementation of a high-throughput interconnection network for single-chip parallel processing. In Proceedings of the 15th Annual IEEE Symposium on High Performance Interconnects (Stanford, CA, Aug. 22–24). IEEE Press, Los Alamitos, CA, 2007.
4. Blake, G., Dreslinski, R., Flautner, K., and Mudge, T. Evolution of thread-level parallelism in desktop applications. In Proceedings of the 37th Annual International Symposium on Computer Architecture (Saint-Malo, France, June 19–23). ACM Press, New York, 2010, 302–313.
5. Borkar, S. et al. Platform 2015: Intel Processor and Platform Evolution for the Next Decade. White Paper. Intel, Santa Clara, CA, 2005; http://epic.hpi.uni-potsdam.de/pub/Home/TrendsAndConceptsII2010/HW_Trends_borkar_2015.pdf
6. Caragea, G., Tzannes, A., Keceli, F., Barua, R., and Vishkin, U. Resource-aware compiler prefetching for many-cores. In Proceedings of the Ninth International Symposium on Parallel and Distributed Computing (Istanbul, Turkey, July 7–9). IEEE Press, Los Alamitos, CA, 2010, 133–140.
7. Caragea, G., Keceli, F., Tzannes, A., and Vishkin, U. General-purpose vs. GPU: Comparison of many-cores on irregular workloads. In Proceedings of the Second Usenix Workshop on Hot Topics in Parallelism (University of California, Berkeley, June 14–15). Usenix, Berkeley, CA, 2010.
8. Caragea, G., Saybasili, B., Wen, X., and Vishkin, U. Performance potential of an easy-to-program PRAM-on-chip prototype versus state-of-the-art processor. In Proceedings of the 21st ACM SPAA Symposium on Parallelism in Algorithms and Architectures (Calgary, Canada, Aug. 11–13). ACM Press, New York, 2009, 163–165.
9. Cormen, T., Leiserson, C., Rivest, R., and Stein, C. Introduction to Algorithms, Third Edition. MIT Press, Cambridge, MA, 2009.
10. Culler, D. and Singh, J. Parallel Computer Architecture: A Hardware/Software Approach. Morgan-Kaufmann, San Francisco, CA, 1999.
11. Gibbons, P., Matias, Y., and Ramachandran, V. The queue-read queue-write asynchronous PRAM model. Theoretical Computer Science 196, 1–2 (Apr. 1998), 3–29.
12. Gottlieb, A. et al. The NYU ultracomputer designing an MIMD shared-memory parallel computer. IEEE Transactions on Computers 32, 2 (Feb. 1983), 175–189.
13. Gu, P. and Vishkin, U. Case study of gate-level logic simulation on an extremely fine-grained chip multiprocessor. Journal of Embedded Computing 2, 2 (Apr. 2006), 181–190.
14. Hochstein, L., Basili, V., Vishkin, U., and Gilbert, J. A pilot study to compare programming effort for two parallel programming models. Journal of Systems and Software 81, 11 (Nov. 2008), 1920–1930.
15. Horak, M., Nowick, S., Carlberg, M., and Vishkin, U. A low-overhead asynchronous interconnection network for GALS chip multiprocessor. In Proceedings of the Fourth ACM/IEEE International Symposium on Networks-on-Chip (Grenoble, France, May 3–6). IEEE Computer Society, Washington D.C., 2010, 43–50.
16. JaJa, J. An Introduction to Parallel Algorithms. Addison-Wesley Publishing Company, Reading, MA, 1992.
17. Keller, J., Kessler, C., and Traeff, J. Practical PRAM Programming. Wiley-Interscience, New York, 2001.
18. Nuzman, J. and Vishkin, U. Circuit Architecture for Reduced-Synchrony-On-Chip Interconnect. U.S. Patent 6,768,336, 2004; http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6768336.PN.&OS=PN/6768336&RS=PN/6768336
19. Patterson, D. The trouble with multi-core: Chipmakers are busy designing microprocessors that most programmers can’t handle. IEEE Spectrum (July 2010).
20. Shiloach, Y. and Vishkin, U. An O(n^2 log n) parallel max-flow algorithm. Journal of Algorithms 3, 2 (Feb. 1982), 128–146.
21. Sutter, H. The free lunch is over: A fundamental shift towards concurrency in software. Dr. Dobb’s Journal 30, 3 (Mar. 2005).
22. Torbert, S., Vishkin, U., Tzur, R., and Ellison, D. Is teaching parallel algorithmic thinking to high school students possible? One teacher’s experience. In Proceedings of the 41st ACM Technical Symposium on Computer Science Education (Milwaukee, WI, Mar. 10–13). ACM Press, New York, 2010, 290–294.
23. Tzannes, A., Caragea, G., Barua, R., and Vishkin, U. Lazy binary splitting: A run-time adaptive dynamic work-stealing scheduler. In Proceedings of the 15th ACM Symposium on Principles and Practice of Parallel Programming (Bangalore, India, Jan. 9–14). ACM Press, New York, 2010, 179–189.
24. Valiant, L. A bridging model for multi-core computing. In Proceedings of the European Symposium on Algorithms (Karlsruhe, Germany, Sept. 15–17). Lecture Notes in Computer Science 5193. Springer, Berlin, 2008, 13–28.
25. Vishkin, U. U.S. Patents 6,463,527; 6,542,918; 7,505,822; 7,523,293; 7,707,388, 2002–2010; http://patft.uspto.gov/
26. Vishkin, U. Algorithmic approach to designing an easy-to-program system: Can it lead to a hardware-enhanced programmer’s workflow add-on? In Proceedings of the 27th International Conference on Computer Design (Lake Tahoe, CA, Oct. 4–7). IEEE Computer Society, Washington D.C., 2009, 60–63.
27. Vishkin, U., Caragea, G., and Lee, B. Models for advancing PRAM and other algorithms into parallel programs for a PRAM-on-chip platform. In Handbook on Parallel Computing, S. Rajasekaran and J. Reif, Eds. Chapman and Hall/CRC Press, Boca Raton, FL, 2008, 5.1-60.
28. Wen, X. and Vishkin, U. FPGA-based prototype of a PRAM-on-chip processor. In Proceedings of the Fifth ACM Conference on Computing Frontiers (Ischia, Italy, May 5–7). ACM Press, New York, 2008, 55–66.

Uzi Vishkin (vishkin@umd.edu) is a professor in the University of Maryland Institute for Advanced Computer Studies (http://www.umiacs.umd.edu/~vishkin) and Electrical and Computer Engineering Department, College Park, MD.

© 2011 ACM 0001-0782/11/0100 $10.00

review articles
doi:10.1145/1866739.1866758
A Firm Foundation for Private Data Analysis

What does it mean to preserve privacy?

by Cynthia Dwork

Key Insights

• In analyzing private data, only by focusing on rigorous privacy guarantees can we convert the cycle of “propose-break-propose again” into a path of progress.
• A natural approach to defining privacy is to require that accessing the database teaches the analyst nothing about any individual. But this is problematic: the whole point of a statistical database is to teach general truths, for example, that smoking causes cancer. Learning this fact teaches the data analyst something about the likelihood with which certain individuals, not necessarily in the database, will develop cancer. We therefore need a definition that separates the utility of the database (learning that smoking causes cancer) from the increased risk of harm due to joining the database. This is the intuition behind differential privacy.
• This can be achieved, often with low distortion. The key idea is to randomize responses so as to effectively hide the presence or absence of the data of any individual over the course of the lifetime of the database.

In the information realm, loss of privacy is usually associated with failure to control access to information, to control the flow of information, or to control the purposes for which information is employed. Differential privacy arose in a context in which ensuring privacy is a challenge even if all these control problems are solved: privacy-preserving statistical analysis of data.

The problem of statistical disclosure control, revealing accurate statistics about a set of respondents while preserving the privacy of individuals, has a venerable history, with an extensive literature spanning statistics, theoretical computer science, security, databases, and cryptography (see, for example, the excellent survey of Adam and Wortmann [1], the discussion of related work in Blum et al. [2], and the Journal of Official Statistics dedicated to confidentiality and disclosure control).

This long history is a testament to the importance of the problem. Statistical databases can be of enormous social value; they are used for apportioning resources, evaluating medical therapies, understanding the spread of disease, improving economic utility, and informing us about ourselves as a species.

The data may be obtained in diverse ways. Some data, such as census, tax, and other sorts of official data, is compelled; other data is collected opportunistically, for example, from traffic on the Internet, transactions on Amazon, and search engine query logs; other data is provided altruistically, by respondents who hope that sharing their information will help others to avoid a specific misfortune, or more generally, to increase the public good. Altruistic data donors are typically promised their individual data will be kept confidential; in short, they are promised “privacy.” Similarly, medical data and legally compelled data, such as census data and tax return data, have legal privacy

mandates. In my view, ethics demand that opportunistically obtained data should be treated no differently, especially when there is no reasonable alternative to engaging in the actions that generate the data in question.

The problems remain: even if data encryption, key management, access control, and the motives of the data curator are all unimpeachable, what does it mean to preserve privacy, and how can it be accomplished?

“How” Is Hard

Let us consider a few common suggestions and some of the difficulties they can encounter.

Image by Brian Greenberg

Large Query Sets. One frequent suggestion is to disallow queries about a specific individual or small set of individuals. A well-known differencing argument demonstrates the inadequacy of the suggestion. Suppose it is known that Mr. X is in a certain medical database. Taken together, the answers to the two large queries “How many people in the database have the sickle cell trait?” and “How many people, not named X, in the database have the sickle cell trait?” yield the sickle cell status of Mr. X. The example also shows that encrypting the data, another frequent suggestion (oddly), would be of no help at all. The privacy compromise arises from correct operation of the database.

In query auditing, each query to the database is evaluated in the context of the query history to determine if a response would be disclosive; if so, then the query is refused. For example, query auditing might be used to interdict the pair of queries about sickle cell trait just described. This approach is problematic for several reasons, among them that query monitoring is computationally infeasible [16] and that the refusal to respond to a query may itself be disclosive [15].

We think of a database as a collection of rows, with each row containing the data of a different respondent. In subsampling, a subset of the rows is chosen at random and released. Statistics can then be computed on the subsample and, if the subsample is sufficiently large, these may be representative of the dataset as a whole. If the size of the subsample is very small compared to the size of the dataset, this approach has the property that every respondent is unlikely to appear in the subsample. However, this is clearly insufficient: Suppose appearing in a subsample has terrible consequences. Then every time subsampling occurs some individual suffers horribly.
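The differencing argument from the discussion of large query sets can be written out in a few lines. The following Python sketch is purely illustrative: the toy database, the field names name and sickle_cell, and the helper functions count and differencing_attack are all hypothetical, and the curator is assumed to answer both counting queries exactly.

```python
def count(db, predicate):
    """Exact counting query: number of rows satisfying the predicate."""
    return sum(1 for row in db if predicate(row))


def differencing_attack(db, target):
    """Infer the target's sickle cell status from two large counting queries."""
    # "How many people in the database have the sickle cell trait?"
    everyone = count(db, lambda r: r["sickle_cell"])
    # "How many people, not named <target>, have the sickle cell trait?"
    all_but_target = count(
        db, lambda r: r["name"] != target and r["sickle_cell"]
    )
    return everyone - all_but_target == 1


if __name__ == "__main__":
    toy_db = [
        {"name": "X", "sickle_cell": True},
        {"name": "A", "sickle_cell": False},
        {"name": "B", "sickle_cell": True},
        {"name": "C", "sickle_cell": False},
    ]
    print(differencing_attack(toy_db, "X"))
```

Neither query singles out an individual, yet their difference does, and nothing in the computation depends on how the rows are stored, which is why encrypting the data would not help.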


In input perturbation, either the data or the queries are modified before a response is generated. This broad category encompasses a generalization of subsampling, in which the curator first chooses, based on a secret, random function of the query, a subsample from the database, and then returns the result obtained by applying the query to the subsample.4 A nice feature of this approach is that repeating the same query yields the same answer, while semantically equivalent but syntactically different queries are made on essentially unrelated subsamples. However, an outlier may only be protected by the unlikelihood of being in the subsample.

In what is traditionally called randomized response, the data itself is randomized once and for all and statistics are computed from the noisy responses, taking into account the distribution on the perturbation.23 The term “randomized response” comes from the practice of having the respondents to a survey flip a coin and, based on the outcome, answering an invasive yes/no question or answering a more emotionally neutral one. In the computer science literature the choice governed by the coin flip is usually between honestly reporting one’s value and responding randomly, typically by flipping a second coin and reporting the outcome. Randomized response was devised for the setting in which the individuals do not trust the curator, so we can think of the randomized responses as simply being published. Privacy comes from the uncertainty of how to interpret a reported value. The approach becomes untenable for complex data.

Adding random noise to the output has promise, and we will return to it later; here we point out that if done naïvely this approach will fail. To see this, suppose the noise has mean zero and that fresh randomness is used in generating every response. In this case, if the same query is asked repeatedly, then the responses can be averaged, and the true answer will eventually emerge. This is disastrous: an adversarial analyst could exploit this to carry out the difference attack described above. The approach cannot be “fixed” by recording each query and providing the same response each time a query is re-issued. There are several reasons for this. For example, syntactically different queries may be semantically equivalent, and if the query language is sufficiently rich, then the equivalence problem itself is undecidable, so the curator cannot even test for this.

Problems with noise addition arise even when successive queries are completely unrelated to previous queries.5 Let us assume for simplicity that the database consists of a single (but very sensitive) bit per person, so we can think of the database as an n-bit Boolean vector $d = (d_1, \ldots, d_n)$. This is an abstraction of a setting in which the database rows are quite complex, for example, they may be medical records, but the attacker is interested in one specific field, such as HIV status. The abstracted attack consists of issuing a string of queries, each described by a subset S of the database rows. The query is asking how many 1’s are in the selected rows. Representing the query as the n-bit characteristic vector S of the set S, with 1’s in all the positions corresponding to rows in S and 0’s everywhere else, the true answer to the query is the inner product $A(S) = \sum_{i=1}^{n} d_i S_i$. Suppose the privacy mechanism responds with A(S) + random noise. How much noise is needed in order to preserve privacy?

Since we have not yet defined privacy, let us consider the easier problem of avoiding blatant “non-privacy,” defined as follows: the system is blatantly non-private if an adversary can construct a candidate database that agrees with the real database d in, say, 99% of the entries. An easy consequence of the following theorem is that a privacy mechanism adding noise with magnitude always bounded by, say, n/401 is blatantly non-private against an adversary that can ask all $2^n$ possible queries.5 There is nothing special about 401; any number exceeding 400 would work.

Theorem 1. Let M be a mechanism that adds noise bounded by E. Then there exists an adversary that can reconstruct the database to within 4E positions.5

Blatant non-privacy with E = n/401 follows immediately from the theorem, as the reconstruction will be accurate in all but at most $4E = 4n/401 < n/100$ positions.

Proof. Let d be the true database. The adversary can attack in two phases:

1. Estimate the number of 1’s in all possible sets: Query M on all subsets $S \subseteq [n]$.
2. Rule out “distant” databases: For every candidate database $c \in \{0, 1\}^n$, if, for any $S \subseteq [n]$, $\left|\sum_{i \in S} c_i - M(S)\right| > E$, then rule out c. If c is not ruled out, then output c and halt.

Since M(S) never errs by more than E, the real database will not be ruled out, so this simple (but inefficient!) algorithm will output some database; let us call it c. We will argue that the number of positions in which c and d differ is at most 4E.

Let $I_0$ be the indices in which $d_i = 0$, that is, $I_0 = \{i \mid d_i = 0\}$. Similarly, define $I_1 = \{i \mid d_i = 1\}$. Since c was not ruled out, $\left|M(I_0) - \sum_{i \in I_0} c_i\right| \le E$. However, by assumption $\left|M(I_0) - \sum_{i \in I_0} d_i\right| \le E$. It follows from the triangle inequality that c and d differ in at most 2E positions in $I_0$; the same argument shows that they differ in at most 2E positions in $I_1$. Thus, c and d agree on all but at most 4E positions.

What if we consider more realistic bounds on the number of queries? We think of $\sqrt{n}$ as an interesting threshold on noise, for the following reason: If the database contains n people drawn uniformly at random from a population of size $N \gg n$, and the fraction of the population satisfying a given condition is p, then we expect the number of rows in the database satisfying p to be roughly $np \pm \Theta(\sqrt{n})$ by the properties of the hypergeometric distribution. That is, the sampling error is on the order of $\sqrt{n}$. We would like the noise introduced for privacy to be smaller than the sampling error, ideally $o(\sqrt{n})$. Unfortunately, noise of magnitude $o(\sqrt{n})$ is blatantly non-private against a series of $n \log^2 n$ randomly generated queries,5 no matter the distribution on the noise. Several strengthenings of this pioneering result are now known. For example, if the entries in S are chosen independently according to a standard normal distribution, then blatant non-privacy continues to hold even against an adversary asking only $\Theta(n)$ questions, and even if more than a fifth of the responses have arbitrarily wild noise magnitudes, provided the other responses have noise magnitude $o(\sqrt{n})$.8

These are not just interesting mathematical exercises. We have been focusing on interactive privacy mechanisms, distinguished by the involvement of the curator in answering each query. In the noninteractive setting the curator publishes some information of arbitrary form, and the data is not used further. Research statisticians like to “look at the data,” and we have frequently been asked for a method of generating a “noisy table” that will permit highly accurate answers to be derived for computations that are not specified at the outset. The noise bounds say this is impossible: No such table can safely provide very accurate answers to too many weighted subset sum questions; otherwise the table could be used in a simulation of the interactive mechanism, and an attack could be mounted against the table. Thus, even if the analyst only requires the responses to a small number of unspecified queries, the fact that the table can be exploited to gain answers to other queries is problematic.

In the case of “Internet scale” datasets, obtaining responses to, say, $n \ge 10^8$ queries is infeasible. What happens if the curator permits only a sublinear number of questions? This inquiry led to the first algorithmic results in differential privacy, in which it was shown how to maintain privacy against a sublinear number of counting queries, that is, queries of the form “How many rows in the database satisfy property P?” by adding noise of order $o(\sqrt{n})$, less than the sampling error, to each answer.12 The cumbersome privacy guarantee, which focused on the question of what an adversary can learn about a row in the database, is now known to imply a natural and still very powerful relaxation of differential privacy, defined here.

“What” Is Hard

Newspaper horror stories about “anonymized” and “de-identified” data typically refer to noninteractive approaches in which certain kinds of information in each data record have been suppressed or altered. A famous example is AOL’s release of a set of “anonymized” search query logs. People search for many “obviously” disclosive things, such as their full names (vanity searches), their own social security numbers (to see if their numbers are publicly available on the Web, possibly with a goal of assessing the threat of identity theft), and even the combination of mother’s maiden name and social security number. AOL carefully redacted such obviously disclosive “personally identifiable information,” and each user id was replaced by a random string. However, search histories can be very idiosyncratic, and a New York Times reporter correctly traced such an “anonymized” search history to a specific resident of Georgia.

In a linkage attack, released data are linked to other databases or other sources of information. We use the term auxiliary information to capture information about the respondents other than that which is obtained through the (interactive or noninteractive) statistical database. Any priors, beliefs, or information from newspapers, labor statistics, and so on, all fall into this category.

In a notable demonstration of the power of auxiliary information, medical records of the governor of Massachusetts were identified by linking voter registration records to “anonymized” Massachusetts Group Insurance Commission (GIC) medical encounter data, which retained the birthdate, sex, and zip code of the patient.22

Despite this exemplary work, it has taken several years to fully appreciate the importance of taking auxiliary information into account in privacy-preserving data release. Sources and uses of auxiliary information are endlessly varied. As a final example, it has been proposed to modify search query logs by mapping all terms, not just the user ids, to random strings. In token-based hashing each query is tokenized, and then an uninvertible hash function is applied to each token. The intuition is that the hashes completely obscure the terms in the query. However, using a statistical analysis of the hashed log and any (unhashed) query log, for example, the released AOL log discussed above, the anonymization can be severely compromised, showing that token-based hashing is unsuitable for anonymization.17
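The averaging weakness of naive noise addition, raised earlier under “How” Is Hard, is easy to check numerically. The sketch below is only an illustration and not part of the original discussion: the function names, the uniform noise distribution, the seed, and the numbers are all assumptions made for the example.

```python
import random
import statistics


def noisy_answer(true_answer, noise_bound, rng):
    """Return the true answer plus fresh zero-mean noise.

    Uniform noise on [-noise_bound, noise_bound] is an assumption of this
    sketch; other zero-mean distributions with finite variance behave alike.
    """
    return true_answer + rng.uniform(-noise_bound, noise_bound)


def averaging_attack(true_answer, noise_bound, repetitions, seed=0):
    """Repeat one query many times and average the independent responses."""
    rng = random.Random(seed)
    responses = [noisy_answer(true_answer, noise_bound, rng)
                 for _ in range(repetitions)]
    return statistics.fmean(responses)


# Hypothetical numbers: the true count is 42, each response is off by up to 10.
single_response = noisy_answer(42, 10.0, random.Random(1))
averaged = averaging_attack(true_answer=42, noise_bound=10.0, repetitions=20000)
print(f"one response: {single_response:.2f}, average of 20000: {averaged:.2f}")
```

A single response may be off by as much as the noise bound, while the average concentrates around the true count as the number of repetitions grows, which is why fresh zero-mean noise on repeated queries offers no lasting protection.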

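The two-phase adversary from the proof of Theorem 1 can also be written out directly for a toy database. This is a minimal sketch under stated assumptions: the helper names, the 8-bit example database, the seed, and the choice of integer noise drawn uniformly from [-E, E] are all hypothetical; the theorem itself only requires that the noise be bounded by E.

```python
import itertools
import random


def subset_count(bits, subset):
    """Number of 1's among the rows indexed by `subset`."""
    return sum(bits[i] for i in subset)


def all_subsets(n):
    """Yield every subset of {0, ..., n-1} as a tuple of indices."""
    for size in range(n + 1):
        yield from itertools.combinations(range(n), size)


def answer_all_queries(database, noise_bound, seed=0):
    """Phase 1: the mechanism M answers every subset query with bounded noise."""
    rng = random.Random(seed)
    return {s: subset_count(database, s) + rng.randint(-noise_bound, noise_bound)
            for s in all_subsets(len(database))}


def reconstruct(n, answers, noise_bound):
    """Phase 2: output the first candidate that no query rules out."""
    for candidate in itertools.product((0, 1), repeat=n):
        if all(abs(subset_count(candidate, s) - m_s) <= noise_bound
               for s, m_s in answers.items()):
            return candidate
    return None  # not reached when the noise really is bounded by noise_bound


def hamming_distance(a, b):
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


E = 1                                   # noise bound
d = (1, 0, 1, 1, 0, 0, 1, 0)            # hypothetical 8-bit secret database
c = reconstruct(len(d), answer_all_queries(d, E), E)
print("true database: ", d)
print("reconstruction:", c, "differs in", hamming_distance(c, d), "positions")
```

The search is exponential in n in both phases, matching the remark in the proof that the algorithm is simple but inefficient; it is meant to make the argument concrete, not to be practical.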

As we will see next, there are deep reasons for the fact that auxiliary information plays such a prominent role in these examples.

Dalenius’s Desideratum

In 1977 the statistician Tore Dalenius articulated an “ad omnia” (as opposed to ad hoc) privacy goal for statistical databases: Anything that can be learned about a respondent from the statistical database should be learnable without access to the database. Although informal, this feels like the “right” direction. The breadth of the goal captures all the common intuitions for privacy. In addition, the definition only holds the database accountable for whatever “extra” is learned about an individual, beyond that which can be learned from other sources. In particular, an extrovert who posts personal information on the Web may destroy his or her own privacy, and the database should not be held accountable.

Formalized, Dalenius’ goal is strikingly similar to the gold standard for security of a cryptosystem against a passive eavesdropper, defined five years later. Semantic security captures the intuition that the encryption of a message reveals no information about the message. This is formalized by comparing the ability of a computationally efficient adversary, having access to both the ciphertext and any auxiliary information, to output (anything about) the plaintext, to the ability of a computationally efficient party having access only to the auxiliary information (and not the ciphertext), to achieve the same goal.13 Abilities are measured by probabilities of success, where the probability space is over the random choices made in choosing the encryption keys, the ciphertexts, and by the adversaries. Clearly, if this difference is very, very tiny, then in a rigorous sense the ciphertext leaks (almost) no information about the plaintext.

The formal definition of semantic security is a pillar of modern cryptography. It is therefore natural to ask whether a similar property, such as Dalenius’ goal, can be achieved for statistical databases. But there is an essential difference in the two problems. Unlike the eavesdropper on a conversation, the statistical database attacker is also a user, that is, a legitimate consumer of the information provided by the statistical database (not to mention the fact that he or she may also be a respondent in the database).

Many papers in the literature attempt to formalize Dalenius’ goal (in some cases unknowingly) by requiring that the adversary’s prior and posterior beliefs about an individual (that is, before and after having access to the statistical database) should not be “too different,” or that access to the statistical database should not change the adversary’s views about any individual “too much.” The difficulty with this approach is that if the statistical database teaches us anything at all, then it should change our beliefs about individuals. For example, suppose the adversary’s (incorrect) prior view is that everyone has two left feet. Access to the statistical database teaches that almost everyone has one left foot and one right foot. The adversary now has a very different view of whether or not any given respondent has two left feet. But has privacy been compromised?

The last hopes for Dalenius’ goal evaporate in light of the following parable, which again involves auxiliary information. Suppose we have a statistical database that teaches average heights of population subgroups, and suppose further that it is infeasible to learn this information (perhaps for financial reasons) in any other way (say, by conducting a new study). Finally, suppose that one’s true height is considered sensitive. Given the auxiliary information “Turing is two inches taller than the average Lithuanian woman,” access to the statistical database teaches Turing’s height. In contrast, anyone without access to the database, knowing only the auxiliary information, learns much less about Turing’s height.

A rigorous impossibility result generalizes this argument, extending to essentially any notion of privacy compromise, assuming the statistical database is useful. The heart of the attack uses extracted randomness from the statistical database as a one-time pad for conveying the privacy compromise to the adversary/user.6, 9

Turing did not have to be a member of the database for the attack described earlier to be prosecuted against him. More generally, the things that statistical databases are designed to teach can, sometimes indirectly, cause damage to an individual, even if this individual is not in the database.

In practice, statistical databases are (typically) created to provide some anticipated social gain; they teach us something we could not (easily) learn without the database. Together with the attack against Turing, and the fact that he did not have to be a member of the database for the attack to work, this suggests a new privacy goal: Minimize the increased risk to an individual incurred by joining (or leaving) the database. That is, we move from comparing an adversary’s prior and posterior views of an individual to comparing the risk to an individual when included in, versus when not included in, the database. This makes sense. A privacy guarantee that limits risk incurred by joining encourages participation in the dataset, increasing social utility. This is the starting point on our path to differential privacy.

Differential Privacy

Differential privacy will ensure that the ability of an adversary to inflict harm (or good, for that matter), of any sort, to any set of people, should be essentially the same, independent of whether any individual opts in to, or opts out of, the dataset. We will do this indirectly, simultaneously addressing all possible forms of harm and good, by focusing on the probability of any given output of a privacy mechanism and how this probability can change with the addition or deletion of any row. Thus, we will concentrate on pairs of databases (D, D′) differing only in one row, meaning one is a subset of the other and the larger database contains just one additional row. Finally, to handle worst-case pairs of databases, our probabilities will be over the random choices made by the privacy mechanism.

Definition 1. A randomized function K gives ε-differential privacy if for all datasets D and D′ differing on at most one row, and all S ⊆ Range(K),

$$\Pr[K(D) \in S] \le \exp(\varepsilon) \times \Pr[K(D') \in S], \qquad (1)$$

where the probability space in each case is over the coin flips of K.

The multiplicative nature of the guarantee implies that an output whose probability is zero on a given database must also have probability zero on any neighboring database, and hence, by repeated application of the definition, on any other database. Thus, Definition 1 trivially rules out the subsample-and-release paradigm discussed: For an individual x not in the dataset, the probability that x’s data is sampled and released is obviously zero; the multiplicative nature of the guarantee ensures that the same is true for an individual whose data is in the dataset.

Any mechanism satisfying this definition addresses all concerns that any participant might have about the leakage of his or her personal information, regardless of any auxiliary information known to an adversary: Even if the participant removed his or her data from the dataset, no outputs (and thus consequences of outputs) would become significantly more or less likely. For example, if the database were to be consulted by an insurance provider before deciding whether or not to insure a given individual, then the presence or absence of any individual’s data in the database will not significantly affect his or her chance of receiving coverage.

Definition 1 extends naturally to group privacy. Repeated application of the definition bounds the ratios of probabilities of outputs when a collection C of participants opts in or opts out, by a factor of $e^{|C|\varepsilon}$. Of course, the point of the statistical database is to disclose aggregate information about large groups (while simultaneously protecting individuals), so we should expect privacy bounds to disintegrate with increasing group size.

The parameter ε is public, and its selection is a social question. We tend to think of ε as, say, 0.01, 0.1, or in some cases, ln 2 or ln 3.

Sometimes, for example, in the census, an individual’s participation is known, so hiding presence or absence makes no sense; instead we wish to hide the values in an individual’s row. Thus, we can (and sometimes do) extend “differing in at most one row” to mean having symmetric difference at most 1 to capture both possibilities. However, we will continue to use the original definition.

Returning to randomized response, we see that it yields ε-differential privacy for a value of ε that depends on the universe from which the rows are chosen and the probability with which a random, rather than non-random, value is contributed by the respondent. As an example, suppose each row consists of a single bit, and that the respondent’s instructions are to first flip an unbiased coin to determine whether he or she will answer randomly or truthfully. If heads (respond randomly), then the respondent is to flip a second unbiased coin and report the outcome; if tails, the respondent answers truthfully. Fix b ∈ {0, 1}. If the true value of the input is b, then b is output with probability 3/4. On the other hand, if the true value of the input is 1 − b, then b is output with probability 1/4. The ratio is 3, yielding (ln 3)-differential privacy.

Suppose n respondents each employ randomized response independently, but using coins of known, fixed, bias. Then, given the randomized data, by the properties of the binomial distribution the analyst can approximate the true answer to the question “How many respondents have value b?” to within an expected error on the order of $\Theta(\sqrt{n})$. As we will see, it is possible to do much better, obtaining constant expected error, independent of n.

Generalizing in a different direction, suppose each row now has two bits, each one randomized independently, as described earlier. While each bit remains (ln 3)-differentially private, their logical-AND enjoys less privacy. That is, consider a privacy mechanism in which each bit is protected by this exact method of randomized response, and consider the query: “What is the logical-AND of the bits in the row of respondent i (after randomization)?” If we consider the two extremes, one in which respondent i has data 11 and the other in which respondent i has data 00, we see that in the first case the probability of output 1 is 9/16, while in the second case the probability is 1/16. Thus, this mechanism is at best (ln 9)-differentially private, not ln 3. Again, it is possible to do much better, even while releasing the entire 4-element histogram, also known as a contingency table, with only constant expected error in each cell.
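The probabilities quoted for randomized response (3/4 versus 1/4 for a single bit, 9/16 versus 1/16 for the logical-AND of two bits) can be verified with exact arithmetic. The following is a small sketch; the function names are hypothetical and the only assumption is the two-fair-coin protocol described in the text.

```python
from fractions import Fraction
from math import log

HALF = Fraction(1, 2)


def report_distribution(true_bit):
    """Exact output distribution of two-coin randomized response for one bit."""
    # Tails on the first coin: tell the truth. Heads: report a second fair coin.
    p_truthful_output = HALF + HALF * HALF          # = 3/4
    return {true_bit: p_truthful_output, 1 - true_bit: 1 - p_truthful_output}


def and_distribution(bit1, bit2):
    """Distribution of the logical AND of two independently randomized bits."""
    p_and_is_one = report_distribution(bit1)[1] * report_distribution(bit2)[1]
    return {1: p_and_is_one, 0: 1 - p_and_is_one}


def worst_case_ratio(dist_a, dist_b):
    """Largest ratio of output probabilities, taken in both directions."""
    return max(max(dist_a[o] / dist_b[o], dist_b[o] / dist_a[o]) for o in dist_a)


single_bit_ratio = worst_case_ratio(report_distribution(1), report_distribution(0))
and_ratio = worst_case_ratio(and_distribution(1, 1), and_distribution(0, 0))
print(single_bit_ratio, log(single_bit_ratio))   # ratio 3, so epsilon = ln 3
print(and_ratio, log(and_ratio))                 # ratio 9, so epsilon = ln 9
```

Taking the logarithm of the worst-case ratio gives the smallest ε for which the inequality in Definition 1 holds on the pair of inputs being compared.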

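The claim that an analyst can recover the count of respondents with value 1 from randomized responses, up to an error on the order of the square root of n, can be illustrated with a short simulation. This is a sketch under assumptions: the function names, the seed, the population size, and the 30% base rate are all made up for the example.

```python
import random


def randomized_response(bit, rng):
    """Two-coin randomized response for a single private bit."""
    if rng.random() < 0.5:                    # first coin heads: answer randomly
        return 1 if rng.random() < 0.5 else 0
    return bit                                # first coin tails: answer truthfully


def estimate_ones(reports):
    """Unbiased estimate of how many respondents truly hold a 1.

    Each report equals 1 with probability 1/4 + bit/2, so the expected sum of
    reports is n/4 + (true count)/2; solving for the count gives the estimate.
    """
    n = len(reports)
    return 2 * (sum(reports) - n / 4)


rng = random.Random(2011)                     # fixed seed, chosen arbitrarily
n = 10_000
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(n)]   # hypothetical data
reports = [randomized_response(b, rng) for b in true_bits]
true_count = sum(true_bits)
estimate = estimate_ones(reports)
print(f"true count {true_count}, estimate {estimate:.0f}, sqrt(n) = {n ** 0.5:.0f}")
```

Under this protocol the estimator has standard deviation sqrt(3n)/2, so the error grows with the square root of the database size, in contrast with the constant error the text promises for later mechanisms.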

Achieving Differential Privacy

Achieving differential privacy revolves around hiding the presence or absence of a single individual. Consider the query “How many rows in the database satisfy property P?” The presence or absence of a single row can affect the answer by at most 1. Thus, a differentially private mechanism for a query of this type can be designed by first computing the true answer and then adding random noise according to a distribution with the following property:

$$\forall z, z' \ \text{s.t.}\ |z - z'| = 1: \quad \Pr[z] \le e^{\varepsilon} \Pr[z']. \qquad (2)$$

To see why this is desirable, consider any feasible response r. For any m, if m is the true answer and the response is r then the random noise must have value r − m; similarly, if m − 1 is the true answer and the response is r, then the random noise must have value r − m + 1. In order for the response r to be generated in a differentially private fashion, it suffices for

$$e^{-\varepsilon} \le \frac{\Pr[\text{noise} = r - m]}{\Pr[\text{noise} = r - m + 1]} \le e^{\varepsilon}.$$

In general we are interested in vector-valued queries; for example, the data may be points in $\mathbb{R}^d$ and we wish to carry out an analysis that clusters the points and reports the location of the largest cluster.

Definition 2. For $f : \mathcal{D} \to \mathbb{R}^d$, the L1 sensitivity of f is7

$$\Delta f = \max_{D, D'} \|f(D) - f(D')\|_1 \qquad (3)$$

for all D, D′ differing in at most one row.

In particular, when d = 1 the sensitivity of f is the maximum difference in the values that the function f may take on a pair of databases that differ in only one row. This is the difference our noise must be designed to hide. For now, let us focus on the case d = 1.

The Laplace distribution with parameter b, denoted Lap(b), has density function $P(z \mid b) = \frac{1}{2b} \exp(-|z|/b)$; its variance is $2b^2$. Taking b = 1/ε we have that the density at z is proportional to $e^{-\varepsilon |z|}$. This distribution has highest density at 0 (good for accuracy), and for any z, z′ such that |z − z′| ≤ 1 the density at z is at most $e^{\varepsilon}$ times the density at z′, satisfying the condition in Equation 2. It is also symmetric about 0, and this is important. We cannot, for example, have a distribution that only yields non-negative noise. Otherwise the only databases on which a counting query could return a response of 0 would be databases in which no row satisfies the query. Letting D be such a database, and letting D′ = D ∪ {r} for some row r satisfying the query, the pair D, D′ would violate ε-differential privacy. Finally, the distribution gets flatter as ε decreases. This is correct: smaller ε means better privacy, so the noise density should be less “peaked” at 0 and change more gradually as the magnitude of the noise increases.

There is nothing special about the cases d = 1, Δf = 1:

Theorem 2. For $f : \mathcal{D} \to \mathbb{R}^d$, the mechanism K that adds independently generated noise with distribution Lap(Δf/ε) to each of the d output terms enjoys ε-differential privacy.7

Before proving the theorem, we illustrate the situation for the case of a counting query (Δf = 1) when ε = ln 2 and the true answer to the query is 100. The distribution on the outputs (in gray) is centered at 100. The distribution on outputs when the true answer is 101 is shown in orange.

[Figure: output densities when the true answer is 100 (gray) and 101 (orange); horizontal axis marked … 97 98 99 100 101 102 103 …]

Proof. (Theorem 2) The proof is a simple generalization of the reasoning we used to illustrate the case of a single counting query.

Consider any subset S ⊆ Range(K), and let D, D′ be any pair of databases differing in at most one row. When the database is D, the probability density at any r ∈ S is proportional to $\exp(-\|f(D) - r\|_1 (\varepsilon/\Delta f))$. Similarly, when the database is D′, the probability density at any r ∈ Range(K) is proportional to $\exp(-\|f(D') - r\|_1 (\varepsilon/\Delta f))$.

We have

$$\frac{\exp(-\|f(D) - r\|_1 (\varepsilon/\Delta f))}{\exp(-\|f(D') - r\|_1 (\varepsilon/\Delta f))} = \exp\bigl((\|f(D') - r\|_1 - \|f(D) - r\|_1)(\varepsilon/\Delta f)\bigr) \le \exp\bigl(\|f(D') - f(D)\|_1 (\varepsilon/\Delta f)\bigr),$$

where the inequality follows from the triangle inequality. By definition of sensitivity, $\|f(D') - f(D)\|_1 \le \Delta f$, and so the ratio is bounded by exp(ε). Integrating over S yields ε-differential privacy.

Given any query sequence $f_1, \ldots, f_m$, ε-differential privacy can be achieved by running K with noise distribution $\mathrm{Lap}\bigl(\sum_{i=1}^{m} \Delta f_i / \varepsilon\bigr)$ on each query, even if the queries are chosen adaptively, with each successive query depending on the answers to the previous queries. In other words, by allowing the quality of each answer to deteriorate in a controlled way with the sum of the sensitivities of the queries, we can maintain ε-differential privacy.

With this in mind, let us return to some of the suggestions we considered earlier. Recall that using the specific randomized response strategy described above, for a single Boolean attribute, yielded error $\Theta(\sqrt{n})$ on databases of size n and (ln 3)-differential privacy. In contrast, using Theorem 2 with the same value of ε, noting that Δf = 1 yields a variance of $2(1/\ln 3)^2$, or an expected error of $\sqrt{2}/\ln 3$. More generally, to obtain ε-differential privacy we get an expected error of $\sqrt{2}/\varepsilon$. Thus, our expected error magnitude is constant, independent of n.

What about two queries? The sensitivity of a sequence of two counting queries is 2. Applying the theorem with Δf/ε = 2/ε, adding independently generated noise distributed as Lap(2/ε) to each true answer yields ε-differential privacy. The variance is $2(2/\varepsilon)^2$, or standard deviation $2\sqrt{2}/\varepsilon$. Thus, for any desired ε we can achieve ε-differential privacy by increasing the expected magnitude of the errors as a function of the total sensitivity of the two-query sequence. This holds equally for:

• Two instances of the same query, addressing the repeated query problem
• One count for each of two different bit positions, for example, when each row consists of two bits
• A pair of queries of the form: “How many rows satisfy property P?” and “How many rows satisfy property Q?” (where possibly P = Q)
• An arbitrary pair of queries
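The mechanism of Theorem 2, applied first to a single counting query and then to a two-query sequence, might be sketched as follows. Everything concrete here is an assumption for illustration: the function names, the toy list of ages, the seed, and the way Laplace noise is sampled (as a difference of two exponential draws) are not taken from the article.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Lap(scale), which has variance 2 * scale**2.

    The difference of two independent exponential variables with mean `scale`
    is Laplace distributed; this sampling trick is this sketch's choice.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def laplace_mechanism(true_answers, sensitivity, epsilon, rng):
    """Add independent Lap(sensitivity / epsilon) noise to each output term."""
    scale = sensitivity / epsilon
    return [a + laplace_noise(scale, rng) for a in true_answers]


def count_rows(database, predicate):
    """Counting query: how many rows satisfy the predicate (sensitivity 1)."""
    return sum(1 for row in database if predicate(row))


rng = random.Random(7)                                  # arbitrary fixed seed
ages = [23, 35, 41, 29, 62, 57, 38, 44, 19, 71]         # hypothetical rows
epsilon = math.log(2)

# One counting query: sensitivity 1, so the noise is Lap(1/epsilon).
over_40 = count_rows(ages, lambda age: age > 40)
[noisy_over_40] = laplace_mechanism([over_40], sensitivity=1, epsilon=epsilon, rng=rng)

# A sequence of two counting queries has total sensitivity 2: Lap(2/epsilon) each.
under_30 = count_rows(ages, lambda age: age < 30)
noisy_pair = laplace_mechanism([over_40, under_30], sensitivity=2, epsilon=epsilon, rng=rng)
print(over_40, round(noisy_over_40, 2), [round(x, 2) for x in noisy_pair])
```

The noise scale depends only on the sensitivity and on ε, never on the number of rows, which is the sense in which the expected error is constant and independent of n.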

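For readers who want the intermediate step, the claim that Lap(1/ε) meets the condition in Equation 2 can be spelled out in one line. The derivation below is a worked restatement, with p denoting the Lap(1/ε) density and z, z′ any two noise values with |z − z′| ≤ 1; it relies only on the reverse triangle inequality |z′| − |z| ≤ |z − z′|.

```latex
\begin{align*}
\frac{p(z)}{p(z')}
  &= \frac{\tfrac{\varepsilon}{2}\exp(-\varepsilon |z|)}
          {\tfrac{\varepsilon}{2}\exp(-\varepsilon |z'|)}
   = \exp\bigl(\varepsilon\,(|z'| - |z|)\bigr) \\
  &\le \exp\bigl(\varepsilon\,|z - z'|\bigr)
   \le e^{\varepsilon}
\end{align*}
```

Swapping the roles of z and z′ gives the matching lower bound, so the ratio of densities stays between $e^{-\varepsilon}$ and $e^{\varepsilon}$.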

However, the theorem also shows we can sometimes do better. The logical-AND count we discussed earlier, even though it involves two different bits in each row, still only has sensitivity 1: The number of 2-bit rows whose entries are both 1 can change by at most 1 with the addition or deletion of a single row. Thus, this more complicated query can be answered in an ε-differentially private fashion using noise distributed as Lap(1/ε); we do not need to use the distribution Lap(2/ε).

Histogram Queries. The power of Theorem 2 really becomes clear when considering histogram queries, defined as follows. If we think of the rows of the database as elements in a universe X, then a histogram query is a partitioning of X into an arbitrary number of disjoint regions $X_1, X_2, \ldots, X_d$. The implicit question posed by the query is: “For i = 1, 2, . . . , d, how many points in the database are contained in $X_i$?” For example, the database may contain the annual income for each respondent, and the query is a partitioning of incomes into ranges: {[0, 50K), [50K, 100K), . . . , ≥ 500K}. In this case d = 11, and the question is asking, for each of the d ranges, how many respondents in the database have annual income in the given range. This looks like d separate counting queries, but the entire query actually has sensitivity Δf = 1. To see this, note that if we remove one row from the database, then only one cell in the histogram changes, and that cell only changes by 1; similarly for adding a single row. So Theorem 2 says that ε-differential privacy can be maintained by perturbing each cell with an independent random draw from Lap(1/ε). Returning to our example of 2-bit rows, we can pose the 4-ary histogram query requesting, for each pair of literals $v_1 v_2$, the number of rows with value $v_1 v_2$, adding noise of order 1/ε to each of the four cells.

When Noise Makes No Sense. There are times when the addition of noise for achieving privacy makes no sense. For example, the function f might map databases to strings, strategies, or trees, or it might be choosing the “best” among some specific, not necessarily continuous, set of real-valued objects. The problem of optimizing the output of such a function while preserving ε-differential privacy requires additional technology.

Assume the curator holds a database D and the goal is to produce an object y. The exponential mechanism19 works as follows. We assume the existence of a utility function u(D, y) that measures the quality of an output y, given that the database is D. For example, the data may be a set of labeled points in $\mathbb{R}^d$ and the output y might be a d-ary vector describing a (d − 1)-dimensional hyperplane that attempts to classify the points, so that those labeled with +1 have non-negative inner product with y and those labeled with −1 have negative inner product. In this case the utility would be the number of points correctly classified, so that higher utility corresponds to a better classifier. The exponential mechanism outputs y with probability proportional to $\exp(u(D, y)\,\varepsilon/\Delta u)$ and ensures ε-differential privacy. Here Δu is the sensitivity of the utility function bounding, for all databases (D, D′) differing in only one row and potential outputs y, the difference |u(D, y) − u(D′, y)|. In our example, Δu = 1. The mechanism assigns most mass to the best classifier, and the mass assigned to any other drops off exponentially in the decline in its utility for the current dataset, hence the name “exponential mechanism.”

When Sensitivity Is Hard to Analyze. The Laplace and exponential mechanisms provide a differentially private interface through which the analyst can access the data. Such an interface can be useful even when it is difficult to determine the sensitivity of the desired function or query sequence; it can also be used to run an iterative algorithm, composed of easily analyzed steps, for as many iterations as a given privacy budget permits. This is a powerful observation; for example, using only noisy sum queries, it is possible to carry out many standard data mining tasks, such as singular value decompositions, finding an ID3 decision tree, clustering, learning association rules, and learning anything learnable in the statistical queries learning model, frequently with good accuracy, in a privacy-preserving fashion.2 This approach has been generalized to yield a publicly available codebase for


writing programs that ensure differential privacy.18

k-Means Clustering. As an example of “private programming,”2 consider k-means clustering, described first in its usual, non-private form. The input consists of points p_1, . . . , p_n in the d-dimensional unit cube [0, 1]^d. Initial candidate means m_1, . . . , m_k are chosen randomly from the cube and updated as follows:

1. Partition the samples {p_i} into k sets S_1, . . . , S_k, associating each p_i with the nearest m_j.
2. For 1 ≤ j ≤ k, set $m_j = \sum_{p_i \in S_j} p_i / |S_j|$, the mean of the samples associated with m_j.

This update rule is typically iterated until some convergence criterion has been reached, or a fixed number of iterations have been applied.

Although computing the nearest mean of any one sample (Step 1) would breach privacy, we observe that to compute an average among an unknown set of points it is enough to compute their sum and divide by their number. Thus, the computation only needs to expose the approximate cardinalities of the S_j, not the sets themselves. Happily, the k candidate means implicitly define a histogram query, since they partition the space [0, 1]^d according to their Voronoi cells, and so the vector (|S_1|, . . . , |S_k|) can be released with very low noise in each coordinate. This gives us a differentially private approximation to the denominators in Step 2. As for the numerators, the sum of a subset of the p_i has sensitivity at most d, since the points come from the bounded region [0, 1]^d. Even better, the sensitivity of the d-ary function that returns, for each of the k Voronoi cells, the d-ary sum of the points in the cell is at most d: adding or deleting a single d-ary point can affect at most one sum, and that sum can change by at most 1 in each of the d dimensions. Thus, using a query sequence with total sensitivity at most d + 1, the analyst can compute a new set of candidate means by dividing, for each m_j, the approximate sum of the points in S_j by the approximation to the cardinality |S_j|.

If we run the algorithm for a fixed number N of iterations we can use the noise distribution Lap((d + 1)N/ε) to obtain ε-differential privacy. If we do not know the number of iterations in advance we can increase the noise parameter as the computation proceeds. There are many ways to do this. For example, we can answer the first iteration with noise Lap((d + 1)/(ε/2)), the next with noise Lap((d + 1)/(ε/4)), and so on, each time using up half of the remaining “privacy budget.”

Generating Synthetic Data
The idea of creating a synthetic dataset whose statistics closely mirror those of the original dataset, but which preserves privacy of individuals, was proposed in the statistics community no later than 1993.21 The lower bounds on noise discussed at the end of the section “How Is Hard” imply that no such dataset can safely provide very accurate answers to too many weighted subset sum questions, motivating the interactive approach to private data analysis discussed herein. Intuitively, the advantage of the interactive approach is that only the questions actually asked receive responses.

Against this backdrop, the non-interactive case was revisited from a learning theory perspective, challenging the interpretation of the noise lower bounds as a limit on the number of queries that can be answered privately.3 This work, described next, has excited interest in interactive and non-interactive solutions yielding noise in the range $[\omega(\sqrt{n}), o(n)]$.

Let X be a universe of data items and let C be a concept class consisting of functions c : X → {0, 1}. We say x ∈ X satisfies a concept c ∈ C if and only if c(x) = 1. A concept class can be extremely general; for example, it might consist of all rectangles in the plane, or all Boolean circuits containing a given number of gates.

Given a sufficiently large database D ∈ X^n, it is possible to privately generate a synthetic database that maintains approximately correct fractional counts for all concepts in C (there may be infinitely many!). That is, letting S denote the synthetic database produced, with high probability over the choices made by the privacy mechanism, for every concept c ∈ C, the fraction of elements in S that satisfy c is approximately the same as the fraction of elements in D that satisfy c.

The minimal size of the input database depends on the quality of the approximation, the logarithm of the cardinality of the universe X, the privacy parameter ε, and the Vapnik–Chervonenkis dimension of the concept class C (for finite |C| this is at most log_2 |C|). The synthetic dataset, chosen by the exponential mechanism, will be a set of m = O(VCdim(C)/γ^2) elements in X (γ governs the maximum permissible inaccuracy in the fractional count). Letting D denote the input dataset and D̂ a candidate synthetic dataset, the utility function for the exponential mechanism is given by

$$u(D, \hat{D}) = -\max_{h \in C} \left| h(D) - \frac{n}{m}\, h(\hat{D}) \right|.$$

Pan-Privacy
Data collected by a curator for a given purpose may be subject to “mission creep” and legal compulsion, such as a subpoena. Of course, we could analyze data and then throw it away, but can we do something even stronger, never storing the data in the first place? Can we strengthen our notion of privacy to capture the “never store” requirement?

These questions suggest an investigation of differentially private streaming algorithms with small state, much too small to store the data. However, nothing in the definition of a streaming algorithm, even one with very small state, precludes storing a few individual data points. Indeed, popular techniques from the streaming literature, such as Count-Min Sketch and subsampling, do precisely this. In such a situation, a subpoena or other intrusion into the local state will breach privacy.

A pan-private algorithm is private “inside and out,” remaining differentially private even if its internal state becomes visible to an adversary.10 To understand the pan-privacy guarantee, consider click stream data. This data is generated by individuals, and an individual may appear many times in the stream. Pan-privacy requires that any two streams differing only in the information of a single individual should produce very similar distributions on the internal states of the algorithm and on its outputs, even though the data of an individual are


interleaved arbitrarily with other data in the stream.

As an example, consider the problem of density estimation. Assuming, for simplicity, that the data stream is just a sequence of IP addresses in a certain range, we wish to know what fraction of the set of IP addresses in the range actually appears in the stream. A solution inspired by randomized response can be designed using the following technique.10

Define two probability distributions, D_0 and D_1, on the set {0, 1}. D_0 assigns equal mass to zero and to one. D_1 has a slight bias toward 1; specifically, 1 has mass 1/2 + ε/4, while 0 has mass 1/2 − ε/4.

Let X denote the set of all possible IP addresses in the range of interest. The algorithm creates a table, with a 1-bit entry b_x for each x ∈ X, initialized to an independent random draw from D_0. So initially the table is roughly half zeroes and half ones.

In an atomic step, the algorithm receives an element from the stream, changes state, and discards the element. When processing x ∈ X, the algorithm makes a fresh random draw from D_1, and stores the result in b_x. This is done no matter how many times x may have appeared in the past. Thus, for any x appearing at least once, b_x will be distributed according to D_1. However, if x never appears, then the entry for x is the bit drawn according to D_0 during the initialization of the table.

As with randomized response, the density in X of the items in the stream can be approximated from the number of 1’s in the table, taking into account the expected fraction of “false positives” from the initialization phase and the “false negatives” when sampling from D_1. Letting q denote the fraction of entries in the table with value 1, the output is 4(q − 1/2)/ε + Lap(1/ε|X|).

Intuitively, the internal state is differentially private because, for each b ∈ {0, 1}, $e^{-\varepsilon} \le \Pr_{D_1}[b] / \Pr_{D_0}[b] \le e^{\varepsilon}$; privacy for the output is ensured by the addition of Laplacian noise. Overall, the algorithm is 2ε-differentially pan-private.

Conclusion
The differential privacy frontier is expanding rapidly, and there is insufficient space here to list all the interesting directions currently under investigation by the community. We identify a few of these.

The Geometry of Differential Privacy. Sharper upper and lower bounds on noise required for achieving differential privacy against a sequence of linear queries can be obtained by understanding the geometry of the query sequence.14 In some cases dependencies among the queries can be exploited by the curator to markedly improve the accuracy of the responses. Generalizing this investigation to the nonlinear and interactive cases would be of significant interest.

Algorithmic Complexity. We have so far ignored questions of computational complexity. Many, but not all, of the techniques described here have efficient implementations. For example, there are instances of the synthetic data generation problem that, under standard cryptographic assumptions, have no polynomial time implementation.11 It follows that there are cases in which the exponential mechanism has no efficient implementation. When can this powerful tool be implemented efficiently, and how?

An Alternative to Differential Privacy? Is there an alternative, “ad omnia,” guarantee that composes automatically, and permits even better accuracy than differential privacy? Can cryptography be helpful in this regard?20

The work described herein has, for the first time, placed private data analysis on a strong mathematical foundation. The literature connects differential privacy to decision theory, economics, robust statistics, geometry, additive combinatorics, cryptography, complexity theory, learning theory, and machine learning. Differential privacy thrives because it is natural, it is not domain-specific, and it enjoys fruitful interplay with other fields. This flexibility gives hope for a principled approach to privacy in cases, like private data analysis, where traditional notions of cryptographic security are inappropriate or impracticable.

References
1. Adam, N.R., Wortmann, J. Security-control methods for statistical databases: A comparative study. ACM Comput. Surv. 21 (1989), 515–556.
2. Blum, A., Dwork, C., McSherry, F., Nissim, K. Practical privacy: The SuLQ framework. In Proceedings of the 24th ACM Symposium on Principles of Database Systems (2005), 128–138.
3. Blum, A., Ligett, K., Roth, A. A learning theory approach to non-interactive database privacy. In Proceedings of the 40th ACM Symposium on Theory of Computing (2008), 609–618.
4. Denning, D.E. Secure statistical databases with random sample queries. ACM Trans. Database Syst. 5 (1980), 291–315.
5. Dinur, I., Nissim, K. Revealing information while preserving privacy. In Proceedings of the 22nd ACM Symposium on Principles of Database Systems (2003), 202–210.
6. Dwork, C. Differential privacy. In Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP) (2) (2006), 1–12.
7. Dwork, C., McSherry, F., Nissim, K., Smith, A. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference (2006), 265–284.
8. Dwork, C., McSherry, F., Talwar, K. The price of privacy and the limits of LP decoding. In Proceedings of the 39th ACM Symposium on Theory of Computing (2007), 85–94.
9. Dwork, C., Naor, M. On the difficulties of disclosure prevention in statistical databases or the case for differential privacy. J. Privacy Confidentiality 2 (2010). Available at: http://repository.cmu.edu/jpc/vol2/iss1/8.
10. Dwork, C., Naor, M., Pitassi, T., Rothblum, G., Yekhanin, S. Pan-private streaming algorithms. In Proceedings of the 1st Symposium on Innovations in Computer Science (2010).
11. Dwork, C., Naor, M., Reingold, O., Rothblum, G., Vadhan, S. When and how can privacy-preserving data release be done efficiently? In Proceedings of the 41st ACM Symposium on Theory of Computing (2009), 381–390.
12. Dwork, C., Nissim, K. Privacy-preserving datamining on vertically partitioned databases. In Advances in Cryptology–CRYPTO’04 (2004), 528–544.
13. Goldwasser, S., Micali, S. Probabilistic encryption. JCSS 28 (1984), 270–299.
14. Hardt, M., Talwar, K. On the geometry of differential privacy. In Proceedings of the 42nd ACM Symposium on Theory of Computing (2010), 705–714.
15. Kenthapadi, K., Mishra, N., Nissim, K. Simulatable auditing. In Proceedings of the 24th ACM Symposium on Principles of Database Systems (2005), 118–127.
16. Kleinberg, J., Papadimitriou, C., Raghavan, P. Auditing boolean attributes. In Proceedings of the 19th ACM Symposium on Principles of Database Systems (2000), 86–91.
17. Kumar, R., Novak, J., Pang, B., Tomkins, A. On anonymizing query logs via token-based hashing. In Proceedings of the WWW 2007 (2007), 629–638.
18. McSherry, F. Privacy integrated queries (codebase). Available on Microsoft Research downloads website. See also Proceedings of SIGMOD (2009), 19–30.
19. McSherry, F., Talwar, K. Mechanism design via differential privacy. In Proceedings of the 48th Annual Symposium on Foundations of Computer Science (2007).
20. Mironov, I., Pandey, O., Reingold, O., Vadhan, S. Computational differential privacy. In Advances in Cryptology–CRYPTO’09 (2009), 126–142.
21. Rubin, D. Discussion: Statistical disclosure limitation. J. Official Statist. 9 (1993), 462–468.
22. Sweeney, L. Weaving technology and policy together to maintain confidentiality. J. Law Med. Ethics 25 (1997), 98–110.
23. Warner, S. Randomized response: a survey technique for eliminating evasive answer bias. JASA (1965), 63–69.

Cynthia Dwork (dwork@microsoft.com) is a principal researcher at Microsoft Research, Silicon Valley Campus, Mountain View, CA.

© 2011 ACM 0001-0782/11/0100 $10.00
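
The noisy k-means iteration described under k-Means Clustering in the article above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SuLQ or PINQ code the article cites: the function names (`laplace`, `halving_budgets`, `noisy_kmeans_step`, `private_kmeans`) are hypothetical, the fallback for an apparently empty cell and the clamping into the unit cube are choices made here, and Python's `random` module is not a hardened noise source. Each iteration releases k noisy counts and k noisy d-dimensional sums, each coordinate perturbed with Lap((d + 1)/ε_step), matching the total sensitivity of d + 1 argued in the text.

```python
import random


def laplace(scale, rng):
    """Draw from Lap(scale), i.e. density proportional to exp(-|x| / scale)."""
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def halving_budgets(epsilon, iterations):
    """Per-iteration budgets eps/2, eps/4, ...; their sum stays below epsilon."""
    return [epsilon / 2 ** (t + 1) for t in range(iterations)]


def noisy_kmeans_step(points, means, eps_step, rng):
    """One update of the candidate means from noisy counts and noisy sums."""
    k, d = len(means), len(means[0])
    scale = (d + 1) / eps_step  # total sensitivity d + 1, as argued in the text
    counts = [0] * k
    sums = [[0.0] * d for _ in range(k)]
    for p in points:
        # Step 1: assign p to its nearest mean (never released directly).
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += p[t]
    new_means = []
    for j in range(k):
        noisy_count = counts[j] + laplace(scale, rng)
        noisy_sum = [s + laplace(scale, rng) for s in sums[j]]
        if noisy_count < 1.0:
            # Assumption: keep the old mean when the noisy cell looks empty.
            new_means.append(list(means[j]))
        else:
            # Assumption: clamp back into [0, 1]^d (pure post-processing).
            new_means.append([min(1.0, max(0.0, s / noisy_count))
                              for s in noisy_sum])
    return new_means


def private_kmeans(points, k, budgets, seed=None):
    """Run len(budgets) noisy iterations; budgets[t] is the epsilon spent at t."""
    rng = random.Random(seed)
    d = len(points[0])
    means = [[rng.random() for _ in range(d)] for _ in range(k)]
    for eps_step in budgets:
        means = noisy_kmeans_step(points, means, eps_step, rng)
    return means


if __name__ == "__main__":
    data = [[0.1, 0.2]] * 200 + [[0.8, 0.9]] * 200
    n_iter, eps = 5, 1.0
    # Fixed N: each step gets eps / N, so the noise is Lap((d + 1) N / eps).
    print(private_kmeans(data, k=2, budgets=[eps / n_iter] * n_iter, seed=1))
    # Unknown N: spend half of the remaining budget at every step.
    print(private_kmeans(data, k=2, budgets=halving_budgets(eps, n_iter), seed=1))
```

Passing the per-iteration budgets as a list keeps the two schedules from the text (a fixed number N of iterations, or halving the remaining budget) behind the same interface; with halving, later iterations are noisier, so in practice only the first few steps carry much signal.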

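
The pan-private density estimator from the Pan-Privacy section of the article above can likewise be sketched in Python. Again this is a hedged illustration rather than the authors' implementation: the class name `PanPrivateDensity` and its methods are hypothetical, the noise term Lap(1/ε|X|) is read here as scale 1/(ε·|X|), and the pseudorandom generator is kept in the object only for convenience (a real pan-private implementation would need randomness that cannot be replayed from the stored state).

```python
import random


def laplace(scale, rng):
    """Draw from Lap(scale) as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class PanPrivateDensity:
    """One bit per universe element; no stream element is ever stored."""

    def __init__(self, universe, epsilon, seed=None):
        if not 0.0 < epsilon <= 2.0:
            raise ValueError("need 0 < epsilon <= 2 so that 1/2 + epsilon/4 <= 1")
        self.epsilon = epsilon
        self.rng = random.Random(seed)  # assumption: illustrative PRNG only
        items = list(dict.fromkeys(universe))  # drop duplicates, keep order
        self.slot = {x: i for i, x in enumerate(items)}
        # Initialization: every bit is an independent draw from D0 (fair coin).
        self.bits = [int(self.rng.random() < 0.5) for _ in items]

    def process(self, x):
        """Atomic step: overwrite b_x with a fresh draw from D1, then forget x."""
        p_one = 0.5 + self.epsilon / 4.0
        self.bits[self.slot[x]] = int(self.rng.random() < p_one)

    def estimate(self):
        """Return 4(q - 1/2)/epsilon plus Laplace noise on the output."""
        q = sum(self.bits) / len(self.bits)
        # Assumption: Lap(1/eps|X|) in the text means scale 1 / (eps * |X|).
        noise = laplace(1.0 / (self.epsilon * len(self.bits)), self.rng)
        return 4.0 * (q - 0.5) / self.epsilon + noise


if __name__ == "__main__":
    est = PanPrivateDensity(range(10000), epsilon=1.0, seed=3)
    for address in range(2500):      # a quarter of the universe appears
        est.process(address)
    print(est.estimate())            # a noisy estimate of the density 0.25
```

Because each processed element overwrites its bit with a fresh draw from D1, repeated appearances of the same x leave the distribution of b_x unchanged, which is the property the text relies on.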
cacm online

[continued from p. 16] is not solely about communication; it is also about maintaining a historical record, so that future generations of scientists can learn and build on the work of the past. Whether new forms of scholarly communication pass this second test is far from certain.

A common misconception about the “dead tree” model of scholarly communication is that it is antithetical to speed. This is only true to a certain extent, but almost certainly not to the extent that most believe.

Scott Delman, ACM’s Director of Group Publishing, commented that “The current system of peer review is the single largest bottleneck in the scholarly publication process, but this does not mean the established system can simply be thrown out in favor of new models just because new technology enables dramatic improvements in speed.” Establishing a new model for scholarly communication will involve experimentation, trial and error, and most likely evolution instead of revolution. Proclamations of the death of scholarly publishers and scholarly publishing as a result of the rise of the Internet are no longer taken seriously by those working in the publishing industry. What we have seen is a slow but steady evolution of print to online publication and distribution models instead of an overnight upheaval.

Delman adds, “I believe strongly that there is a need for a new model,” but then goes on to refute the notion that digital-only publishing, and the elimination of print, would quicken the publication of scholarly articles. “The most substantial component in the time delay related to the publication of articles in scholarly journals is the peer-review process,” and a digital-only model won’t change that, he says. Nor will it reduce article backlogs or remove page limitations. “Eliminating print will not have the dramatic impact that most assume will occur if print publications go away,” he says. Importantly, ACM readers and subscribers “look for high-quality content delivered in multiple formats, and they still want print.”

Adding to the complexity of the challenge is the fact that while science is global, scientific publication models are often socially or geographically influenced, so there is no single solution that can be identified to improve the speed or efficiency of scholarly communication. Ed Chi, of the Palo Alto Research Center, described some of the difficulties of modernizing peer-review publishing in the Blog@CACM at (http://cacm.acm.org/blogs/blog-cacm/100284). “In many non-U.S. research evaluations, only the ISI Science Citation Index actually counts for publication. Already this doesn’t fit with many real-world metrics for reputation,” Chi said via email. Some well-known ACM conference publications are excluded from the SCI (http://bit.ly/iaobEa), “even when their real-world reputation is much higher than other ‘dead-tree’ journals.” Technology may provide opportunities to facilitate and accelerate the discourse, but there is no guarantee the academic establishment around the world will move as quickly in accepting new media and ways of communicating.

Paperless publishing will happen gradually, but “only if there are ways to manage the publication process,” Chi says. “Open source journal publication management systems will enable journals to go somewhat independent of traditional paper publishers, but we will also need national scientific institutions to establish digital archives.” Other challenges he notes include handling an increased number of submissions and managing potentially larger editorial boards.

As an organization with the stated mission to advance computing as a science and profession, ACM could “lead the charge” in experimenting with new digital publishing models for computing scholarship, says Chi. “This might include creating usable software, digital libraries, or archival standards.” A particularly important area of research would examine how to make socially derived metrics a part of reputation systems, so that the number of downloads, online mentions, citations, and blog discussions can be measured for influence. Then, according to Chi, ACM “should work with national libraries to actively change the publication models of other professions and fields.” This will not be a revolution. ACM can help to drive the change in a positive way for the scientific community.

research highlights
p. 98 Technical Perspective: Sora Promises Lasting Impact
By Dina Katabi

p. 99 Sora: High-Performance Software Radio Using General-Purpose Multi-Core Processors
By Kun Tan, He Liu, Jiansong Zhang, Yongguang Zhang, Ji Fang, and Geoffrey M. Voelker

p. 108 Technical Perspective: Multipath: A New Control Architecture for the Internet
By Damon Wischik

p. 109 Path Selection and Multipath Congestion Control
By Peter Key, Laurent Massoulié, and Don Towsley

doi:10.1145/1866739.1866759

Technical Perspective
Sora Promises Lasting Impact
By Dina Katabi

The term software defined radio (SDR) first appeared in 1992 and referred to a radio transceiver where the basic signal processing components (for example, filtering, frame detection, synchronization, and demodulation) are all done in a general-purpose processor. The goal of an SDR was to enable a single radio to support multiple wireless technologies (for example, AM, VHF, FM) and be easily upgradable with a software patch.

While the concept of SDR has been around for decades, only recently have SDRs become common in academic wireless research. However, research projects typically employ SDR as a development platform, that is, they use software radios to develop new physical layer designs with the understanding that if these designs make it to a product they will be built in ASICs. The reason why SDR has become a development platform rather than a fully functional software radio is that building high-performance SDRs has turned out to be very challenging.

Sora has revived the original SDR vision. The objective of Sora is to build an SDR that combines the performance and fidelity of hardware platforms with the programmability and flexibility of general-purpose processors. To do so, Sora must overcome the following challenge: How can a radio deliver high throughput and support real-time protocols when all signal processing is done in software on a PC?

Sora’s approach uses various features common in today’s multicore architectures. For example, transferring the digital waveform samples from the radio board to the PC requires very high bus throughput. While alternative SDR technologies employ USB 2.0 or Gigabit Ethernet, Sora opts for PCI-Express. This design decision enables Sora to achieve significantly higher transfer rates, which are important for high bandwidth multi-antenna designs. The choice of PCI-express also enables Sora to reduce the transfer latency to sub-microseconds, which is necessary for wireless protocols with timing constraints (for example, MAC protocols). Further, to accelerate wireless processing, Sora replaces computation with memory lookups, exploits single instruction multiple data (SIMD), and dedicates certain cores exclusively to real-time signal processing.

There are many reasons why the following paper about Sora stands out as one of the most significant wireless papers in the past few years. First, it presents the first SDR platform that fully implements IEEE 802.11b/g on standard PCs. Second, the design choices it makes (for example, the use of PCIe, SIMD, trading computation for memory lookups, and core dedication) are highly important if software radios are ever to meet their original goal of one-radio-for-all-wireless-technologies. Third, the paper is a beautiful and impressive piece of engineering that spans signal processing, hardware design, multicore programming, kernel optimization, and so on. For all these reasons, this paper will have a lasting impact on wireless research.

The Sora platform has been used in multiple research projects and real-time demos. It has enabled demanding designs, such as LTE and AP virtualization, to be built fully in software. However, currently most SDR-based research uses the GNU Radio/USRP platform. Despite the limitations of this platform, previous attempts at replacing it with more capable platforms did not experience significant success. In fact, history shows that wide adoption is not necessarily correlated with the more capable design. One of the classic papers we teach our undergraduate students is “The Rise of Worse is Better” by Richard Gabriel that explains why the Lisp language lost to C and Unix. Gabriel argues that for wide adoption, a system must be good enough and as simple as possible. Such a design (termed worse is better) tends to appear first because the implementer did not spend an excessive amount of time over-optimizing. Therefore, if good enough, it will be adopted by developers because of its simplicity. Once adopted, the system will gradually improve until it is almost the right design. One may argue that the history of the GNU Radio/USRP SDR is fairly similar; the platform originally provided just enough for people to start experimenting with the wireless physical layer. As a result, it was simple and cheap, which caused it to spread. Once it was accepted, it kept improving just enough to enable the next step in research.

The Sora team has recently started a program that awards Sora kits to academic institutions to enable them to experiment with this new platform. It will be interesting to see whether Sora with its higher performance can eventually replace the GNU Radio/USRP platform. If this happens, it will be a major success for Sora.

Dina Katabi (dina@csail.mit.edu) is an associate professor in the Electrical Engineering and Computer Science Department at Massachusetts Institute of Technology, Cambridge, MA.

© 2011 ACM 0001-0782/11/0100 $10.00
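
The bus-throughput argument in the perspective above can be checked with quick arithmetic, using figures quoted in the Sora paper that follows (20M complex samples per second for a 20 MHz channel, 16 bits each for the I and Q components, and 16.7Gbps for the PCIe ×8 link). The sketch below is only a back-of-the-envelope aid: the function name `baseband_rate_mbps` is hypothetical, and the USB 2.0 and Gigabit Ethernet entries are nominal signaling rates assumed here (480 Mbps and 1,000 Mbps), not measurements from the paper.

```python
def baseband_rate_mbps(bandwidth_mhz, bits_per_component=16, antennas=1):
    """Raw I/Q sample rate in Mbps for a baseband signal of the given width."""
    complex_samples_per_sec = bandwidth_mhz * 1e6   # one complex sample per Hz
    bits_per_sample = 2 * bits_per_component        # in-phase plus quadrature
    return complex_samples_per_sec * bits_per_sample * antennas / 1e6


# Nominal link rates in Mbps. The first two are assumptions of this sketch;
# the PCIe x8 figure is the 16.7Gbps quoted in the Sora paper.
NOMINAL_LINK_MBPS = {
    "USB 2.0": 480.0,
    "Gigabit Ethernet": 1000.0,
    "PCIe x8": 16700.0,
}

if __name__ == "__main__":
    for antennas in (1, 2, 4):
        need = baseband_rate_mbps(20, antennas=antennas)
        fits = [name for name, cap in NOMINAL_LINK_MBPS.items() if cap >= need]
        print(f"{antennas} antenna(s): {need:.0f} Mbps needed; fits on {fits}")
```

The paper's introduction quotes a higher requirement of 1.2Gbps for 802.11a/b/g than the 640 Mbps computed here; this sketch does not model whatever accounts for that difference, so it should be read as a lower bound on the required bus rate.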

doi:10.1145/1866739.1866760

Sora: High-Performance Software Radio Using General-Purpose Multi-Core Processors
By Kun Tan, He Liu, Jiansong Zhang, Yongguang Zhang, Ji Fang, and Geoffrey M. Voelker

Abstract
This paper presents Sora, a fully programmable software radio platform on commodity PC architectures. Sora combines the performance and fidelity of hardware software-defined radio (SDR) platforms with the programmability and flexibility of general-purpose processor (GPP) SDR platforms. Sora uses both hardware and software techniques to address the challenges of using PC architectures for high-speed SDR. The Sora hardware components consist of a radio front-end for reception and transmission, and a radio control board for high-throughput, low-latency data transfer between radio and host memories. Sora makes extensive use of features of contemporary processor architectures to accelerate wireless protocol processing and satisfy protocol timing requirements, including using dedicated CPU cores, large low-latency caches to store lookup tables, and SIMD processor extensions for highly efficient physical layer processing on GPPs. Using the Sora platform, we have developed a few demonstration wireless systems, including SoftWiFi, an 802.11a/b/g implementation that seamlessly interoperates with commercial 802.11 NICs at all modulation rates, and SoftLTE, a 3GPP LTE uplink PHY implementation that supports up to 43.8Mbps data rate.

The original version of this paper was published in Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI’09). This work was performed when Ji Fang and He Liu were visiting students and Geoffrey M. Voelker was a visiting researcher at Microsoft Research Asia.

1. INTRODUCTION
Software-defined radio (SDR) holds the promise of fully programmable wireless communication systems, effectively supplanting current technologies which have the lowest communication layers implemented primarily in fixed, custom hardware circuits. Realizing the promise of SDR in practice, however, has presented developers with a dilemma.

Many current SDR platforms are based on either programmable hardware such as field programmable gate arrays (FPGAs)8, 10 or embedded digital signal processors (DSPs).6, 12 Such hardware platforms can meet the processing and timing requirements of modern high-speed wireless protocols, but programming FPGAs and specialized DSPs are difficult tasks. Developers have to learn how to program to each particular embedded architecture, often without the support of a rich development environment of programming and debugging tools. Such hardware platforms can also be expensive.

In contrast, SDR platforms based on general-purpose processor (GPP) architectures, such as commodity PCs, have the opposite set of trade-offs. Developers program to a familiar architecture and environment using sophisticated tools, and radio front-end boards for interfacing with a PC are relatively inexpensive. However, since PC hardware and software have not been designed for wireless signal processing, existing GPP-based SDR platforms can achieve only limited performance.1, 7 For example, the popular USRP/GNU Radio platform is reported to achieve only 100kbps throughput on an 8-MHz channel,18 whereas modern high-speed wireless protocols like 802.11 support multiple Mbps data rates on a much wider 20-MHz channel. These constraints prevent developers from using such platforms to achieve the full fidelity of state-of-the-art wireless protocols while using standard operating systems and applications in a real environment.

In this paper we present Sora, a fully programmable software radio platform that provides the benefits of both SDR approaches, thereby resolving the SDR platform dilemma for developers. With Sora, developers can implement and experiment with high-speed wireless protocol stacks, e.g., IEEE 802.11a/b/g and 3GPP LTE, using commodity general-purpose PCs. Developers program in familiar programming environments with powerful tools on standard operating systems. Software radios implemented on Sora appear like any other network device, and users can run unmodified applications on their software radios with the same performance as commodity hardware wireless devices.

An implementation of high-speed wireless protocols on general-purpose PC architectures must overcome a number of challenges that stem from existing hardware interfaces and software architectures. First, transferring high-fidelity digital waveform samples into PC memory for processing requires very high bus throughput. For example, existing 802.11a/b/g requires 1.2Gbps system throughput to transfer digital signals for a single 20-MHz channel, while the latest 802.11n standard needs near 10Gbps as it uses even wider band and multiple-input–multiple-output (MIMO) technology. Second, physical layer (PHY) signal processing requires high computation for generating information bits from the large amount of digital samples, and vice versa, particularly at high modulation rates; indeed, back-of-the-envelope calculations for processing requirements on GPPs have instead


motivated specialized hardware approaches in the past.14, 16 Lastly, wireless PHY and media access control (MAC) protocols have low-latency real-time deadlines that must be met for correct operation. For example, the 802.11 MAC protocol requires precise timing control and ACK response latency on the order of tens of microseconds. Existing software architectures on the PC cannot consistently meet this timing requirement.

Sora addresses these challenges with novel hardware and software designs. First, we have developed a new, inexpensive radio control board (RCB) with a radio front-end for transmission and reception. The RCB bridges an RF front-end with PC memory over the high-speed and low-latency PCIe bus. With this bus standard, the RCB can support 16.7Gbps (×8 mode) throughput with sub-microsecond latency, which together satisfies the throughput and timing requirements of modern wireless protocols while performing all digital signal processing on host CPU and memory.

Second, to meet PHY processing requirements, Sora makes full use of various features of widely adopted multi-core architectures in existing GPPs. The Sora software architecture explicitly supports streamlined processing that enables components of the signal processing pipeline to efficiently span multiple cores. Further, we change the conventional implementation of PHY components to extensively take advantage of lookup tables (LUTs), trading off computation for memory. These LUTs substantially reduce the computational requirements of PHY processing, while at the same time taking advantage of the large, low-latency caches on modern GPPs. Finally, Sora uses the Single Instruction Multiple Data (SIMD) extensions in existing processors to further accelerate PHY processing.

Lastly, to meet the real-time requirements of high-speed wireless protocols, Sora provides a new kernel service, core dedication, which allocates processor cores exclusively for real-time SDR tasks. We demonstrate that it is a simple yet crucial abstraction that guarantees the computational resources and precise timing control necessary for SDR on a multi-core GPP.

We have developed a few demonstration wireless systems based on the Sora platform, including: (1) SoftWiFi, an 802.11a/b/g implementation that supports a full suite of modulation rates (up to 54Mbps) and seamlessly interoperates with commercial 802.11 NICs, and (2) SoftLTE, a 3GPP LTE uplink PHY implementation that supports up to 43.8Mbps data rate.

The rest of the paper is organized as follows. Section 2 provides background on wireless communication systems. We then present the Sora architecture in Section 3, and we discuss our approach for addressing the challenges of building an SDR platform on a GPP system in Section 4. We then describe the implementation of the Sora platform in Section 5. Section 6 provides a quantitative evaluation of the radio systems based on Sora. Finally, Section 7 describes related work and Section 8 concludes.

2. BACKGROUND AND REQUIREMENTS
In this section, we briefly review the PHY and MAC components of typical wireless communication systems. Although different wireless technologies may have subtle differences among one another, they generally follow similar designs and share many common algorithms. In this section, we use the IEEE 802.11a/b/g standards to exemplify characteristics of wireless PHY and MAC components as well as the challenges of implementing them in software.

2.1. Wireless PHY
The role of the PHY layer is to convert information bits into a radio waveform, or vice versa. At the transmitter side, the wireless PHY component first modulates the message (i.e., a MAC frame) into a time sequence of digital baseband signals. Digital baseband signals are then passed to the radio front-end, where they are converted to an analog waveform, multiplied by a high-frequency carrier, and transmitted into the wireless channel. At the receiver side, the radio front-end receives radio signals in the channel and extracts the baseband waveform by removing the high-frequency carrier. The extracted baseband waveform is digitized and converted back into digital signals. Then, the digital baseband signals are fed into the receiver's PHY layer to be demodulated into the original message.

The PHY layer directly operates on the digital baseband signals after modulation on the transmitter side and before demodulation on the receiver side. Therefore, high-throughput interfaces are needed to connect the PHY layer and the radio front-end. The required throughput scales linearly with the bandwidth of the baseband signal as well as the number of antennas in a MIMO system. For example, the channel width is 20MHz in 802.11a. It requires a data rate of at least 20M complex samples per second to represent the waveform. These complex samples normally require 16-bit quantization for both in-phase and quadrature (I/Q) components to provide sufficient fidelity, translating into 32 bits per sample, or 640Mbps for the full 20MHz channel. Oversampling, a technique widely used for better performance,11 doubles the requirement to 1.28Gbps. With 4 × 4 MIMO and a 40-MHz channel, as specified in 802.11n, the requirement grows a further eightfold (twice the bandwidth, four antennas) to about 10Gbps to move data between the RF front-end and PHY for one channel.

Advanced communication systems (e.g., IEEE 802.11a/b/g, as shown in Figure 1) contain multiple functional blocks in their PHY components. These functional blocks are pipelined with one another. Data are streamed through these blocks sequentially, but with different data types and sizes. As illustrated in Figure 1, different blocks may consume or produce different types of data at different rates, arranged in small data blocks. For example, in 802.11b, the scrambler may consume and produce one bit, while DQPSK modulation maps each two-bit data block onto a complex symbol, whose real and imaginary components represent I and Q, respectively.

Each PHY block performs a fixed amount of computation on every transmitted or received bit. When the data rate is high, e.g., 11Mbps for 802.11b and 54Mbps for 802.11a/g, PHY processing blocks consume a significant amount of computational power. Based on the model in Neel et al.,16 we estimate that a direct implementation of 802.11b may require 10GOPS while 802.11a/g needs at least 40GOPS.
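The interface rates quoted in Section 2.1 follow from a short calculation. The sketch below reproduces it; the function name and parameters are illustrative only and are not part of Sora. It assumes one complex sample per hertz of channel bandwidth and 16-bit I and Q components, as the text states.

```python
def interface_rate_bps(bandwidth_hz, oversampling=1, antennas=1, bits_per_component=16):
    """Raw sample-stream rate between RF front-end and PHY, in bits per second."""
    # One complex sample per Hz of bandwidth, multiplied by the oversampling factor.
    complex_samples_per_s = bandwidth_hz * oversampling
    # Each complex sample carries an I and a Q component.
    bits_per_sample = 2 * bits_per_component
    return complex_samples_per_s * bits_per_sample * antennas


MHZ = 1_000_000
print(interface_rate_bps(20 * MHZ))                              # 802.11a, no oversampling
print(interface_rate_bps(20 * MHZ, oversampling=2))              # 802.11a, 2x oversampling
print(interface_rate_bps(40 * MHZ, oversampling=2, antennas=4))  # 802.11n, 40 MHz, 4 antennas
```

The three calls give 640 Mbps, 1.28 Gbps, and 10.24 Gbps. The last figure is eight times the second, because both the bandwidth (×2) and the antenna count (×4) scale the rate.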



Figure 1. PHY operations of IEEE 802.11a/b/g transceiver.

(a) IEEE 802.11b 2Mbps
Transmitter: From MAC → Bits @2Mbps → Scramble → Bits @2Mbps → DQPSK Mod → Samples @32Mbps → Direct Sequence Spread Spectrum → Samples @352Mbps → Symbol Wave Shaping → Samples @1.4Gbps → To RF
Receiver: From RF → Samples @1.4Gbps → Decimation → Samples @352Mbps → Despreading → Samples @32Mbps → DQPSK Demod → Bits @2Mbps → Descramble → Bits @2Mbps → To MAC

(b) IEEE 802.11a/g 24Mbps
Transmitter: From MAC → Bits @24Mbps → Scramble → Bits @24Mbps → Convolutional encoder → Bits @48Mbps → Interleaving → Bits @48Mbps → QAM Mod → Samples @384Mbps → IFFT → Samples @512Mbps → GI Addition → Samples @640Mbps → Symbol Wave Shaping → Samples @1.28Gbps → To RF
Receiver: From RF → Samples @1.28Gbps → Decimation → Samples @640Mbps → Remove GI → Samples @512Mbps → FFT → Samples @384Mbps → Demod + Interleaving → Bits @48Mbps → Viterbi decoding → Bits @24Mbps → Descramble → Bits @24Mbps → To MAC
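The rates annotated in Figure 1 can be rebuilt stage by stage. The sketch below does this for both transmit chains. The expansion factors are assumptions chosen to match the figure: 32 bits per complex sample, an 11-chip spreading code for 802.11b, 48 data subcarriers mapped into a 64-point IFFT and a 16-sample guard interval for 802.11a, and oversampling factors of 4 and 2 in the final shaping stage. The helper `chain` is hypothetical.

```python
from fractions import Fraction


def chain(input_rate_mbps, stages):
    """Return the data rate (Mbps) at the input and after each (name, factor) stage."""
    rates = [input_rate_mbps]
    for _name, factor in stages:
        rates.append(rates[-1] * factor)
    return rates


# 802.11b at 2 Mbps: DQPSK maps 2 bits onto one 32-bit complex sample.
STAGES_11B = [
    ("Scramble", 1),
    ("DQPSK Mod", Fraction(32, 2)),
    ("Direct Sequence Spread Spectrum", 11),  # assumed 11 chips per symbol
    ("Symbol Wave Shaping", 4),               # assumed 4x oversampling
]

# 802.11a/g at 24 Mbps: rate-1/2 coding, 16-QAM (4 bits per 32-bit sample).
STAGES_11A = [
    ("Scramble", 1),
    ("Convolutional encoder", 2),
    ("Interleaving", 1),
    ("QAM Mod", Fraction(32, 4)),
    ("IFFT", Fraction(64, 48)),         # 48 data subcarriers in a 64-point IFFT
    ("GI Addition", Fraction(80, 64)),  # 16-sample guard interval per 64 samples
    ("Symbol Wave Shaping", 2),         # assumed 2x oversampling
]

RATES_11B = chain(2, STAGES_11B)
RATES_11A = chain(24, STAGES_11A)
print([float(r) for r in RATES_11B])
print([float(r) for r in RATES_11A])
```

Under these assumptions the 802.11b chain ends at 1408 Mbps (shown as 1.4Gbps in the figure) and the 802.11a/g chain at 1280 Mbps (1.28Gbps).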

These requirements are very demanding for software processing in GPPs.

2.2. Wireless MAC
The wireless channel is a resource shared by all transceivers operating on the same spectrum. As simultaneously transmitting neighbors may interfere with each other, various MAC protocols have been developed to coordinate their transmissions in wireless networks to avoid collisions.

Most modern MAC protocols, such as 802.11, require timely responses to critical events. For example, 802.11 adopts a carrier sense multiple access (CSMA) MAC protocol to coordinate transmissions. Transmitters are required to sense the channel before starting their transmission, and channel access is only allowed when no energy is sensed, i.e., the channel is free. The latency between sense and access should be as small as possible. Otherwise, the sensing result could be outdated and inaccurate. Another example is the link-layer retransmission mechanisms in wireless protocols, which may require an immediate acknowledgement (ACK) to be returned in a limited time window.

Commercial standards like IEEE 802.11 mandate a response latency within 16 μs, which is challenging to achieve in software on a general-purpose PC with a general-purpose OS.

2.3. Software radio requirements
Given the above discussion, we summarize the requirements for implementing a software radio system on a general PC platform:

High system throughput. The interfaces between the radio front-end and PHY as well as between some PHY processing blocks must possess sufficiently high throughput to transfer high-fidelity digital waveforms.

Intensive computation. High-speed wireless protocols require substantial computational power for their PHY processing. Such computational requirements also increase proportionally with communication speed. Unfortunately, techniques used in conventional PHY hardware or embedded DSPs do not directly carry over to GPP architectures. Thus, we require new software techniques to accelerate high-speed signal processing on GPPs. With the advent of many-core GPP architectures, it is now reasonable to aggregate computational power of multiple CPU cores for signal processing. But it is still challenging to build a software architecture to efficiently exploit the full capability of multiple cores.

Real-time enforcement. Wireless protocols have multiple real-time deadlines that need to be met. Consequently, not only is processing throughput a critical requirement, but the processing latency needs to meet response deadlines. Some MAC protocols also require precise timing control at the granularity of microseconds to ensure certain actions occur at exactly pre-scheduled time points. Meeting such real-time deadlines on a general PC architecture is a non-trivial challenge: time-sharing operating systems may not respond to an event in a timely manner, and bus interfaces, such as Gigabit Ethernet, could introduce indefinite delays of far more than a few microseconds. Therefore, meeting these real-time requirements requires new mechanisms on GPPs.

3. ARCHITECTURE
We have developed a high-performance software radio platform called Sora that addresses these challenges. It is based on a commodity general-purpose PC architecture. For flexibility and programmability, we push as much communication functionality as possible into software, while keeping hardware additions as simple and generic as possible. Figure 2 illustrates the overall system architecture.


Figure 2. Sora system architecture. All PHY and MAC execute in software on a commodity multi-core CPU.
[Diagram: applications and the Sora soft-radio stack run on a multi-core CPU with memory; the RCB attaches over a high-throughput, low-latency PCIe bus and exchanges digital samples at multiple Gbps with the RF front-end (A/D, D/A).]

Figure 3. Software architecture of Sora soft-radio stack.
[Diagram, top to bottom: Applications (user mode); Network Layer (TCP/IP) (kernel mode); Sora soft radio stack (Wireless MAC, Wireless PHY); Sora supporting lib (Sora PHY Lib, Streamline Processing Support, Real-time Support (Core dedication), RCB Manager, DMA Memory); PC Bus; RCB.]

3.1. Hardware components
The hardware components in the Sora architecture are a new RCB with an interchangeable radio front-end (RF front-end). The radio front-end is a hardware module that receives and/or transmits radio signals through an antenna. In the Sora architecture, the RF front-end represents the well-defined interface between the digital and analog domains. It contains analog-to-digital (A/D) and digital-to-analog (D/A) converters, and necessary circuitry for radio transmission. Since all signal processing is done in software, the RF front-end design can be rather generic. It can be implemented in a self-contained module with a standard interface to the RCB. Multiple wireless technologies defined on the same frequency band can use the same RF front-end hardware, and the RCB can connect to different RF front-ends designed for different frequency bands.

The RCB is a new PC interface board for establishing a high-throughput, low-latency path for transferring high-fidelity digital signals between the RF front-end and PC memory. To achieve the required system throughput discussed in Section 2.1, the RCB uses a high-speed, low-latency bus such as PCIe. With a maximum throughput of 64Gbps (PCIe ×32) and sub-microsecond latency, it is well suited for supporting multiple gigabit data rates for wireless signals over a very wide band or over many MIMO channels. Further, the PCIe interface is now common in contemporary commodity PCs.

Another important role of the RCB is to bridge the synchronous data transmission at the RF front-end and the asynchronous processing on the host CPU. The RCB uses various buffers and queues, together with a large onboard memory, to convert between synchronous and asynchronous streams and to smooth out bursty transfers between the RCB and host memory. The large onboard memory further allows caching precomputed waveforms, adding flexibility for software radio processing.

Finally, the RCB provides a low-latency control path for software to control the RF front-end hardware and to ensure it is properly synchronized with the host CPU. Section 5.1 describes our implementation of the RCB in more detail.

3.2. Sora software
Figure 3 illustrates Sora's software architecture. The software components in Sora provide necessary system services and programming support for implementing various wireless PHY and MAC protocols in a general-purpose operating system. In addition to facilitating the interaction with the RCB, Sora provides a set of techniques to greatly improve the performance of PHY and MAC processing on GPPs. To meet the processing and real-time requirements, these techniques make full use of various common features in existing multi-core CPU architectures, including the extensive use of LUTs, substantial data-parallelism with CPU SIMD extensions, the efficient partitioning of streamlined processing over multiple cores, and exclusive dedication of cores for software radio tasks. We describe these software techniques in detail in the next section.

4. HIGH-PERFORMANCE SDR SOFTWARE

4.1. Efficient PHY processing
In a memory-for-computation trade-off, Sora relies upon the large-capacity, high-speed cache memory in GPPs to accelerate PHY processing with precalculated LUTs. Contemporary CPU architectures usually have megabytes of L2 cache with a low (10–20 cycles) access latency. If we precalculate LUTs for a large portion of PHY algorithms, we can greatly reduce the computational requirement for online processing.

For example, the soft demapper algorithm used in demodulation needs to calculate the confidence level of each bit contained in an incoming symbol. This task involves rather complex computation proportional to the modulation density. More precisely, it conducts an extensive search for all modulation points in a constellation graph and calculates a ratio between the minimum of Euclidean distances to all points representing one and the minimum of distances to all points representing zero. In this case, we can precalculate the confidence levels for all possible incoming symbols based on their I and Q values, and build LUTs to directly map the input symbol to confidence level. Such LUTs are not large. For example, in 802.11a/g with a 54Mbps modulation rate (64-QAM), the size of the LUT for the soft demapper is only 1.5KB.
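The soft demapper LUT described in Section 4.1 can be illustrated with a small sketch. This is not Sora's table layout; it makes several simplifying assumptions that are named here and in the comments. It uses 16-QAM rather than 64-QAM, treats the I and Q axes independently (which a square Gray-mapped constellation permits), quantizes each axis to 8 bits with a hypothetical scale factor, and uses the difference of minimum squared distances as the confidence value instead of the ratio mentioned in the text, to avoid division by zero.

```python
# Assumed Gray mapping of two bits onto one axis of 16-QAM.
LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def soft_bits_direct(x):
    """Search all axis levels; return one confidence value per bit (> 0 favors 1)."""
    out = []
    for k in range(2):
        d0 = min((x - a) ** 2 for bits, a in LEVELS.items() if bits[k] == 0)
        d1 = min((x - a) ** 2 for bits, a in LEVELS.items() if bits[k] == 1)
        out.append(d0 - d1)
    return tuple(out)


SCALE = 16  # hypothetical quantization: 16 steps per constellation unit

# Precompute the confidence values for every possible signed 8-bit axis value.
LUT = [soft_bits_direct(q / SCALE) for q in range(-128, 128)]


def quantize(x):
    """Round an axis amplitude to the signed 8-bit grid, with clipping."""
    return max(-128, min(127, int(round(x * SCALE))))


def soft_bits_lut(x):
    """Table lookup replacing the per-symbol distance search."""
    return LUT[quantize(x) + 128]


def demap_symbol(i, q):
    """Confidence values for the four bits of one received 16-QAM symbol."""
    return soft_bits_lut(i) + soft_bits_lut(q)


print(demap_symbol(-3.0, 1.0))
```

Each lookup replaces eight distance computations per axis with a single indexed read, and the table has only 256 entries per axis, which is the kind of trade the section describes.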



Figure 4. Sora radio control board.

Figure 5. Hardware architecture of RCB and RF.
[Diagram: on the RCB, an FPGA contains the PCIe controller, DMA controller, registers, SDRAM controller, FIFOs, and RF controller, with DDR SDRAM attached; the RCB connects to the host over the PCIe bus and to the RF front-end, which contains the A/D and D/A converters, the RF circuit, and the antenna.]

As we detail later in Section 5.2.1, more than half of the common PHY algorithms can indeed be rewritten with LUTs, each with a speedup from 1.5× to 50×. Since the size of each LUT is sufficiently small, the sum of all LUTs in a processing path can easily fit in the L2 caches of contemporary GPP cores. With core dedication (Section 4.3), the possibility of cache collisions is very small. As a result, these LUTs are almost always in caches during PHY processing.

To accelerate PHY processing with data-level parallelism, Sora heavily uses the SIMD extensions in modern GPPs, such as SSE, 3DNow! and AltiVec. Although these extensions were designed for multimedia and graphics applications, they also match the needs of wireless signal processing very well because many PHY algorithms have fixed computation structures that can easily map to large vector operations.

4.2. Multi-core streamline processing
Even with the above optimizations, a single CPU core may not have sufficient capacity to meet the processing requirements of high-speed wireless communication technologies. As a result, Sora must be able to use more than one core in a multi-core CPU for PHY processing. This multi-core technique should also be scalable because the signal processing algorithms may become increasingly complex as wireless technologies progress.

As discussed in Section 2, PHY processing typically contains several functional blocks in a pipeline. These blocks differ in processing speed and in input/output data rates and units. A block is only ready to execute when it has sufficient input data from the previous block. Therefore, a key issue is how to schedule a functional block on multiple cores when it is ready.

Sora chooses a static scheduling scheme. This decision is based on the observation that the schedule of each block in a PHY processing pipeline is actually static: the processing pattern of previous blocks can determine whether a subsequent block is ready or not. Sora can thus partition the whole PHY processing pipeline into several sub-pipelines and statically assign them to different cores. Within one sub-pipeline, when a block has accumulated enough data for the next block to be ready, it explicitly schedules the next block. Adjacent sub-pipelines are still connected with a synchronized FIFO (SFIFO), but the number of SFIFOs and their overhead are greatly reduced.

4.3. Real-time support
SDR processing is a time-critical task that requires strict guarantees of computational resources and hard real-time deadlines. As an alternative to relying upon the full generality of real-time operating systems, we can achieve real-time guarantees by simply dedicating cores to SDR processing in a multi-core system. Thus, sufficient computational resources can be guaranteed without being affected by other concurrent tasks in the system.

This approach is particularly plausible for SDR. First, wireless communication often requires its PHY to constantly monitor the channel for incoming signals. Therefore, the PHY processing may need to be active all the time. It is much better to always schedule this task on the same core to minimize overhead like cache misses or TLB flushes. Second, previous work on multi-core OSes also suggests that isolating applications into different cores may have better performance compared to symmetric scheduling, since an effective use of cache resources and a reduction in locks can outweigh the cost of dedicating cores.9 Moreover, a core dedication mechanism is much easier to implement than a real-time scheduler, sometimes even without modifying an OS kernel. For example, we can simply raise the priority of a kernel thread so that it is pinned on a core and it exclusively runs until termination (Section 5.2.3).

5. IMPLEMENTATION

5.1. Hardware
We have designed and implemented the Sora RCB as shown in Figure 4. It contains a Virtex-5 FPGA, a PCIe-×8 interface, and 256MB of DDR2 SDRAM. The RCB can connect to various RF front-ends. In our experimental prototype, we use a third-party RF front-end that is capable of transmitting and receiving a 20 MHz channel at 2.4 or 5 GHz.

Figure 5 illustrates the logical components of the Sora hardware platform. The DMA and PCIe controllers interface with the host and transfer digital samples between the RCB and PC memory. Sora software sends commands and reads RCB states through RCB registers. The RCB uses its onboard SDRAM as well as small FIFOs on the FPGA chip to bridge data streams between the CPU and RF front-end. When receiving, digital signal samples are buffered in on-chip FIFOs and delivered into PC memory when they fit


in a DMA burst (128B). When transmitting, the large RCB memory enables Sora software to first write the generated samples onto the RCB, and then trigger transmission with another command to the RCB. This functionality provides flexibility to the Sora software for precalculating and storing several waveforms before actually transmitting them, while allowing precise control of the timing of the waveform transmission.

While implementing Sora, we encountered a consistency issue in the interaction between DMA operations and the CPU cache system. When a DMA operation modifies a memory location that has been cached in the L2 cache, it does not invalidate the corresponding cache entry. When the CPU reads that location, it can therefore read an incorrect value from the cache.

We solve this problem with a smart-fetch strategy, enabling Sora to maintain cache coherency with DMA memory without the drastic loss of throughput that disabling cached accesses would cause. First, Sora organizes DMA memory into small slots, whose size is a multiple of a cache line. Each slot begins with a descriptor that contains a flag. The RCB sets the flag after it writes a full slot of data, and clears it after the CPU processes all data in the slot. When the CPU moves to a new slot, it first reads its descriptor, causing a whole cache line to be filled. If the flag is set, the data just fetched is valid and the CPU can continue processing the data. Otherwise, the RCB has not updated this slot with new data. Then, the CPU explicitly flushes the cache line and repeats reading the same location. This next read refills the cache line, loading the most recent data from memory.

Table 1 summarizes the RCB throughput results, which agree with the hardware specifications. To precisely measure PCIe latency, we instruct the RCB to read a memory address in host memory, and measure the time interval between issuing the request and receiving the response in hardware. Since each read involves a round trip operation, we use half of the measured time to estimate the one-way delay. This one-way delay is 360 ns with a worst case variation of 4 ns.

Table 1. DMA throughput performance of the RCB.

Mode    | Rx (Gbps) | Tx (Gbps)
PCIe-x4 | 6.71      | 6.55
PCIe-x8 | 12.8      | 12.3

5.2. Software
The Sora software is written in C, with some assembly for performance-critical processing. The entire Sora software is implemented on Windows XP as a network device driver and it exposes a virtual Ethernet interface to the upper TCP/IP stack. Since any software radio implemented on Sora can appear as a normal network device, all existing network applications can run unmodified on it.

PHY Processing Library: In the Sora PHY processing library, we extensively exploit the use of look-up tables (LUTs) and SIMD instructions to optimize the performance of PHY algorithms. We have been able to rewrite more than half of the PHY algorithms with LUTs. Some LUTs are straightforward precalculations; others require more sophisticated implementations to keep the LUT size small. For the soft-demapper example mentioned earlier, we can greatly reduce the LUT size (e.g., 1.5KB for the 802.11a/g 54Mbps modulation) by exploiting the symmetry of the algorithm. In our SoftWiFi implementation described below, the overall size of the LUTs is around 200KB for 802.11a/g and 310KB for 802.11b, both of which fit comfortably within the L2 caches of commodity CPUs.

We also heavily use SIMD instructions in coding Sora software. We currently use the SSE2 instruction set designed for Intel CPUs. Since the SSE registers are 128 bits wide while most PHY algorithms require only 8-bit or 16-bit fixed-point operations, one SSE instruction can perform 16 or 8 simultaneous calculations, respectively. SSE also has rich instruction support for flexible data permutations, and most PHY algorithms, e.g., FFT, FIR Filter and Viterbi, can fit naturally into this SIMD model. For example, the Sora Viterbi decoder uses only 40 cycles to compute the branch metric and select the shortest path for each input. As a result, our Viterbi implementation can handle 802.11a/g at the 54Mbps modulation with only one 2.66 GHz CPU core, whereas previous implementations relied on hardware. Note that other GPP architectures, like AMD and PowerPC, have very similar SIMD models and instruction sets, and we expect that our optimization techniques will directly apply to these other GPP architectures as well.

Table 2 summarizes some key PHY processing algorithms we have implemented in Sora, together with the optimization techniques we have applied. The table also compares the performance of a conventional software implementation (e.g., a direct translation from a hardware implementation) and the Sora implementation with the LUT and SIMD optimizations.

Lightweight, Synchronized FIFOs: Sora allows different PHY processing blocks to streamline across multiple cores, and we have implemented a lightweight, synchronized FIFO to connect these blocks with low contention overhead. The idea is to augment each data slot in the FIFO with a header that indicates whether the slot is empty or not. We pad each data slot to be a multiple of a cache line. Thus, the consumer is always chasing the producer in the circular buffer for filled slots. If the speed of the producer and consumer is the same and the two pointers are separated by a particular offset (e.g., two cache lines in the Intel architecture), no cache miss will occur during synchronized streaming since the local cache will prefetch the following slots before the actual access. If the producer and the consumer have different processing speeds, e.g., the reader is faster than the writer, then eventually the consumer will wait for the producer to release a slot. In this case, each time the producer writes to a slot, the write will cause a cache miss at the consumer. But the producer will not suffer a miss since the next free slot will be prefetched into its local cache. Fortunately, such cache misses experienced by the consumer will not cause significant impact on the overall performance of the streamline processing since the consumer



Table 2. Key algorithms in IEEE 802.11b/a and their performance with conventional and Sora implementations. I/O sizes are in bits; computation required is in Mcycles/s.

Algorithm               | Configuration               | Input size     | Output size | Optimization method | Conventional implementation | Sora implementation | Speedup
IEEE 802.11b
Scramble                | 11Mbps                      | 8              | 8           | LUT                 | 96.54                       | 10.82               | 8.9×
Descramble              | 11Mbps                      | 8              | 8           | LUT                 | 95.23                       | 5.91                | 16.1×
Mapping and spreading   | 2Mbps, DQPSK                | 8              | 44 × 16 × 2 | LUT                 | 128.59                      | 73.92               | 1.7×
CCK modulator           | 5Mbps, CCK                  | 8              | 8 × 16 × 2  | LUT                 | 124.93                      | 81.29               | 1.5×
CCK modulator           | 11Mbps, CCK                 | 8              | 8 × 16 × 2  | LUT                 | 203.96                      | 110.88              | 1.8×
FIR filter              | 16-bit I/Q, 37 taps, 22MSps | 16 × 2 × 4     | 16 × 2 × 4  | SIMD                | 5,780.34                    | 616.41              | 9.4×
Decimation              | 16-bit I/Q, 4× Oversample   | 16 × 2 × 4 × 4 | 16 × 2 × 4  | SIMD                | 422.45                      | 198.72              | 2.1×
IEEE 802.11a
FFT/IFFT                | 64 points                   | 64 × 16 × 2    | 64 × 16 × 2 | SIMD                | 754.11                      | 459.52              | 1.6×
Conv. encoder           | 24Mbps, 1/2 rate            | 8              | 16          | LUT                 | 406.08                      | 18.15               | 22.4×
Conv. encoder           | 48Mbps, 2/3 rate            | 16             | 24          | LUT                 | 688.55                      | 37.21               | 18.5×
Conv. encoder           | 54Mbps, 3/4 rate            | 24             | 32          | LUT                 | 712.10                      | 56.23               | 12.7×
Viterbi                 | 24Mbps, 1/2 rate            | 8 × 16         | 8           | SIMD+LUT            | 68,553.57                   | 1,408.93            | 48.7×
Viterbi                 | 48Mbps, 2/3 rate            | 8 × 24         | 16          | SIMD+LUT            | 117,199.6                   | 2,422.04            | 48.4×
Viterbi                 | 54Mbps, 3/4 rate            | 8 × 32         | 24          | SIMD+LUT            | 131,017.9                   | 2,573.85            | 50.9×
Soft demapper           | 24Mbps, QAM 16              | 16 × 2         | 8 × 4       | LUT                 | 115.05                      | 46.55               | 2.5×
Soft demapper           | 54Mbps, QAM 64              | 16 × 2         | 8 × 6       | LUT                 | 255.86                      | 98.75               | 2.4×
Scramble and descramble | 54Mbps                      | 8              | 8           | LUT                 | 547.86                      | 40.29               | 13.6×
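Several rows of Table 2 take 8 bits in and produce 8 bits out with a LUT. The sketch below shows how such a byte-wide table can replace bit-by-bit processing, using the frame-synchronous scrambler with generator polynomial x^7 + x^4 + 1 as the example. The table layout and function names are assumptions made for illustration, not Sora's implementation, and the all-ones seed is an arbitrary choice.

```python
def lfsr_step(state):
    """Advance the 7-bit shift register by one bit; return (output_bit, new_state)."""
    out = ((state >> 6) ^ (state >> 3)) & 1   # taps at x^7 and x^4
    return out, ((state << 1) | out) & 0x7F


def build_lut():
    """For each of the 128 states, precompute 8 scrambling bits and the state after them."""
    mask, nxt = [0] * 128, [0] * 128
    for s in range(128):
        state, m = s, 0
        for i in range(8):
            bit, state = lfsr_step(state)
            m |= bit << i                     # first generated bit goes to the LSB
        mask[s], nxt[s] = m, state
    return mask, nxt


MASK, NEXT = build_lut()


def scramble_lut(data, state=0x7F):
    """Scramble one byte per iteration with two table lookups and one XOR."""
    out = bytearray()
    for byte in data:
        out.append(byte ^ MASK[state])
        state = NEXT[state]
    return bytes(out)


def scramble_bitwise(data, state=0x7F):
    """Reference version: one shift-register step per bit."""
    out = bytearray()
    for byte in data:
        y = 0
        for i in range(8):
            s, state = lfsr_step(state)
            y |= (((byte >> i) & 1) ^ s) << i
        out.append(y)
    return bytes(out)


frame = b"Sora SDR"
print(scramble_lut(frame) == scramble_bitwise(frame))
```

The two tables hold 128 one-byte entries each, so the whole scrambler state machine occupies 256 bytes. Because this scrambler XORs the data with a fixed sequence, applying `scramble_lut` twice with the same seed returns the original bytes, so the same tables serve for descrambling.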

is not the bottleneck element.

Real-Time Support: Sora uses exclusive threads (or ethreads) to dedicate cores for real-time SDR tasks. Sora implements ethreads without any modification to the kernel code. An ethread is implemented as a kernel-mode thread, and it exploits the processor affinity that is commonly supported in commodity OSes to control on which core it runs. Once the OS has scheduled the ethread on a specified physical core, it will raise its IRQL (interrupt request level) to a level as high as the kernel scheduler, e.g., dispatch_level in Windows. Thus, the ethread takes control of the core and prevents itself from being preempted by other threads.

Running at such an IRQL, however, does not prevent the core from responding to hardware interrupts. Therefore, we also constrain the interrupt affinities of all devices attached to the host. If an ethread is running on one core, all interrupt handlers for installed devices are removed from the core, thus preventing the core from being interrupted by hardware. To ensure the correct operation of the system, Sora always ensures core zero is able to respond to all hardware interrupts. Consequently, Sora only allows ethreads to run on cores whose ID is greater than zero.

6. EXPERIENCE
To demonstrate the use of Sora, we have developed two wireless systems fully in software in a multi-core PC, namely SoftWiFi and SoftLTE. The performance we report for SoftWiFi is measured on an Intel Core 2 Duo (2.67 GHz), and the performance reported for SoftLTE is measured on an Intel Core i7-920 (2.67 GHz).

6.1. SoftWiFi
SoftWiFi implements the basic access mode of 802.11. The MAC state machine (SM) is implemented as an ethread. Since 802.11 is a simplex radio, the demodulation components can run directly within a MAC SM thread. If a single core is insufficient for all PHY processing (e.g., 802.11a/g), the PHY processing can be partitioned across two ethreads. These two ethreads are connected through a synchronized FIFO. Two additional auxiliary threads modulate the outgoing frames in the background and transfer the demodulated frames to upper layers, respectively.

In idle state, the SM continuously measures the average energy to determine whether the channel is clean or there is an incoming frame. If it detects a high energy, SoftWiFi starts to demodulate a frame. After successfully receiving a frame, the 802.11 MAC standard requires a station to transmit an ACK frame in a timely manner (10 μs for 802.11b and 16 μs for 802.11a). This ACK requirement is quite difficult for an SDR implementation in software on a PC. Both generating and transferring the waveform across the PC bus will cause a latency of several microseconds, and the sum is usually larger than mandated by the standard.

Fortunately, an ACK frame generally has a fixed pattern with only a few dynamic fields (i.e., sender address). Thus, we can precalculate most of an ACK frame (19B), and update only the address (10B) on the fly. We can further do it immediately after demodulating the MAC header, and without waiting for the end of a frame. We then prestore the waveform in the memory of the RCB. Thus, the time for ACK


Figure 6. Throughput of Sora when communicating with a commercial WiFi card. Sora–Commercial presents the transmission throughput when a Sora node sends data. Commercial–Sora presents the throughput when a Sora node receives data. Commercial–Commercial presents the throughput when a commercial NIC communicates with another commercial NIC.
[Plot: throughput (Mbps, 0 to 25) versus modulation mode (802.11b: 1M, 2M, 5.5M, 11M; 802.11a/g: 6M, 24M, 54M) for the Sora–Commercial, Commercial–Sora, and Commercial–Commercial configurations.]

limit is due to the overhead of headers at different layers as well as the MAC overhead to coordinate channel access (i.e., carrier sense, ACKs, and backoff), and is a well-known property of 802.11 performance.

6.2. SoftLTE
We have also implemented the 3GPP LTE Physical Uplink Shared Channel (PUSCH) on the Sora platform.13 LTE is the next generation cellular standard. It is more complex than 802.11 since it uses a higher-order FFT (1024-point) and advanced coding/decoding algorithms (e.g., Turbo coding). Our SoftLTE implementation on Sora provides a peak data rate of 43.8Mbps with a 20-MHz channel, 16QAM modulation, and 3/4 Turbo coding. The most computationally intensive component of an LTE PHY is the Turbo decoder. Our current implementation can achieve 35Mbps throughput using one hardware thread of an Intel Core i7-920 core (2.66 GHz). Since Core i7 supports hyper-threading, though, we can execute the Turbo decoder in parallel on two threads, achieving an aggregated throughput of 54.8Mbps. We can achieve this performance because Turbo decoding is relatively balanced in the number of arithmetic instructions and memory accesses. Therefore, the two threads can overlap these two kinds of operations
generation and transferring can overlap with the demodu- well and yield a 56% performance gain even though they
lation of the data frame. After the entire frame is demodu- share the same execution units of a single core. Thus, the
lated and validated, SoftWiFi instructs the RCB to transmit whole SoftLTE implementation can run in real time with
the ACK which has already been stored in the RCB. Thus, the two Intel Core i7 cores.
latency for ACK transmission is very small.
Figure 6 shows the transmitting and receiving through- 7. RELATED WORK
put of a Sora SoftWiFi node when it communicates with Traditionally, device drivers have been the primary software
a commercial WiFi NIC. In the “Sora–Commercial” con- mechanism for changing wireless functionality on general-
figuration, the Sora node acts as a sender and gener- purpose computing systems. For example, the MadWiFi
ates 1400-byte UDP frames and unicast transmits them drivers for cards with Atheros chipsets,3 HostAP drivers for
to a laptop equipped with a commercial NIC. In the Prism chipsets,2 and the rtx200 drivers for RaLink chipsets5
“Commercial–Sora” configuration, the Sora node acts as are popular driver suites for experimenting with 802.11.
a receiver, and the laptop ­generates the same workload. These drivers typically allow software to control a wide
The “Commercial–Commercial” configuration shows range of 802.11 management tasks and non-time-critical
the throughput when both sender and receiver are com- aspects of the MAC protocol, and allow software to access
mercial NICs. In all configurations, the hosts were at the some device hardware state and exercise limited control
same distance from each other and experienced very little over device operation (e.g., transmission rate or power).
packet loss. Figure  6 shows the throughput achieved for However, they do not allow changes to fundamental aspects
all configurations with the various modulation modes in of 802.11 like the MAC packet format or any aspects of PHY.
11a/b/g. We show only three selective rates in 11a/g for SoftMAC goes one step further to provide a platform
­conciseness. The results are averaged over five runs (the for implementing customized MAC protocols using inex-
variance was very small). pensive commodity 802.11 cards.17 Based on the MadWiFi
We make a number of observations from these results. drivers and associated open-source hardware abstraction
First, the Sora SoftWiFi implementation operates seam- layers, SoftMAC takes advantage of features of the Atheros
lessly with commercial devices, showing that Sora SoftWiFi chipsets to control and disable default low-level MAC
is protocol compatible. Second, Sora SoftWiFi can achieve behavior. SoftMAC enables greater flexibility in implement-
similar performance as commercial devices. The through- ing nonstandard MAC features, but does not provide a full
puts for both configurations are essentially equivalent, dem­ platform for SDR. With the separation of functionality
onstrating that SoftWiFi (1) has the processing capability to between driver software and hardware firmware on com-
demodulate all incoming frames at full modulation rates, modity devices, time critical tasks and PHY processing
and (2) it can meet the 802.11 timing constraints for return- remain unchangeable.
ing ACKs within the delay window required by the standard. GNU Radio is a popular software toolkit for building
We note that the maximal achievable application through- ­software radios using general-purpose computing plat­
put for 802.11 is less than 80% of the PHY data rate, and the forms.1  GNU Radio consists of a software library and a
percentage decreases as the PHY data rate increases. This hardware platform. Developers implement software radios



by composing modular precompiled components into processing graphs using Python scripts. The default GNU Radio platform is the Universal Software Radio Peripheral (USRP), a configurable FPGA radio board that connects to the host. As with Sora, GNU Radio performs much of the SDR processing on the host itself. The current USRP supports USB2.0, and a new version, USRP 2.0, upgrades to Gigabit Ethernet. Such interfaces, though, are not sufficient for high-speed wireless protocols in wide bandwidth channels. Existing USRP/GNU Radio platforms can only sustain low-speed wireless communication due to both the hardware constraints and software processing.18 As a consequence, users must sacrifice radio performance for flexibility.

The WARP hardware platform provides a high-performance SDR platform.8 Based on Xilinx FPGAs and PowerPC cores, WARP allows full control over the PHY and MAC layers and supports customized modulations up to 36Mbps. A variety of projects have used WARP to experiment with new PHY and MAC features, demonstrating the impact a high-performance SDR platform can provide.

KUAR is another SDR development platform.15 Similar to WARP, KUAR mainly uses Xilinx FPGAs and PowerPC cores for signal processing. But it also contains an embedded PC as the control processor host (CPH), enabling some communication systems to be implemented completely in software on the CPH. Sora provides the same flexibility and performance as hardware-based platforms, like WARP, but it also provides a familiar and powerful programming environment with software portability at a lower cost.

The SODA architecture represents another point in the SDR design space.14 SODA is an application domain-specific multiprocessor for SDR. It is fully programmable and targets a range of radio platforms; four such processors can meet the computational requirements of 802.11a and W-CDMA. Compared to WARP and Sora, as a single-chip implementation it is more appropriate for embedded scenarios. As with WARP, developers must program to a custom architecture to implement SDR functionality.

8. CONCLUSION
This paper presented Sora, a fully programmable software radio platform on commodity PC architectures. Sora combines the performance and fidelity of hardware SDR platforms with the programmability of GPP-based SDR platforms. Using the Sora platform, we also present the design and implementation of SoftWiFi, a software implementation of the 802.11a/b/g protocols, and SoftLTE, a software implementation of the LTE uplink PHY.

The flexibility provided by Sora makes it a convenient platform for experimenting with novel wireless protocols. In our research group, we have extensively used Sora to implement and evaluate various ideas in our wireless research projects. For example, we have built a spatial multiplexing system with 802.11b.19 In this work, we implemented not only a complex PHY algorithm with successive interference cancellation, but also a sophisticated carrier-counting multi-access (CCMA) MAC, implementations that would not have been possible with previous PC-based software radio platforms.

Sora is now available for academic use as the MSR Software Radio Kit.4 The Sora hardware can be ordered from a vendor company in Beijing and all software can be downloaded for free from the Microsoft Research website. Our hope is that Sora can substantially contribute to the adoption of SDR for wireless networking experimentation and innovation.

Acknowledgments
The authors would like to thank Xiongfei Cai, Ningyi Xu, and Zenlin Xia in the Hardware Computing group at MSRA for their essential assistance in the hardware design of the RCB. We also thank Fan Yang and Chunyi Peng in the Wireless Networking (WN) Group at MSRA; in particular we have learned much from their early study on accelerating 802.11a using GPUs. We would also like to thank all members in the WN Group and Zheng Zhang for their support and feedback. The authors also want to thank Songwu Lu, Frans Kaashoek, and MSR colleagues (Victor Bahl, Ranveer Chandra, etc.) for their comments on earlier drafts of this paper.

References
1. GNU Radio. http://www.gnu.org/software/gnuradio/.
2. HostAP. http://hostap.epitest.fi/.
3. MadWifi. http://sourceforge.net/projects/madwifi.
4. Microsoft Research Software Radio Platform. http://research.microsoft.com/enus/projects/sora/academickit.aspx.
5. Rt2x00. http://rt2x00.serialmonkey.com.
6. Small Form Factor SDR Development Platform. http://www.xilinx.com/products/devkits/SFF-SDR-DP.htm.
7. Universal Software Radio Peripheral. http://www.ettus.com/.
8. WARP: Wireless Open Access Research Platform. http://warp.rice.edu/trac.
9. Boyd-Wickizer, S., Chen, H., Chen, R., Mao, Y., Kaashoek, F., Morris, R., Pesterev, A., Stein, L., Wu, M., Dai, Y., Zhang, Y., Zhang, Z. Corey: an operating system for many cores. In OSDI 2008.
10. Cummings, M., Haruyama, S. FPGA in the Software Radio. IEEE Commun. Mag. 1999.
11. de Vegte, J.V. Fundamental of Digital Signal Processing. Cambridge University Press, 2005.
12. Glossner, J., Hokenek, E., Moudgill, M. The Sandbridge Sandblaster Communications Processor. In 3rd Workshop on Application Specific Processors (2004).
13. Li, Y., Fang, J., Tan, K., Zhang, J., Cui, Q., Tao, X. Soft-LTE: a software radio implementation of 3GPP long term evolution based on Sora platform. In ACM MobiCom 2009 (Demonstration) (Beijing, 2009).
14. Lin, Y., Lee, H., Woh, M., Harel, Y., Mahlke, S., Mudge, T. SODA: a low-power architecture for software radio. In ISCA ’06: Proceedings of the 33rd International Symposium on Computer Architecture (2006).
15. Minden, G.J., Evans, J.B., Searl, L., DePardo, D., Patty, V.R., Rajbanshi, R., Newman, T., Chen, Q., Weidling, F., Guffey, J., Datla, D., Barker, B., Peck, M., Cordill, B., Wyglinski, A.M., Agah, A. KUAR: a flexible software-defined radio development platform. In DySpan (2007).
16. Neel, J., Robert, P., Reed, J. A formal methodology for estimating the feasible processor solution space for a software radio. In SDR’05: Proceedings of the SDR Technical Conference and Product Exposition (2005).
17. Neufeld, M., Fifield, J., Doerr, C., Sheth, A., Grunwald, D. SoftMAC – flexible wireless research platform. In HotNets’05 (2005).
18. Schmid, T., Sekkat, O., Srivastava, M.B. An experimental study of network performance impact of increased latency in software defined radios. In WiNTECH’07 (2007).
19. Tan, K., Liu, H., Fang, J., Wang, W., Zhang, J., Chen, M., Voelker, G.M. SAM: enabling practical spatial multiple access in wireless LAN. In MobiCom’09: Proceedings of the 15th Annual International Conference on Mobile Computing and Networking (New York, NY, USA, 2009), ACM, 49–60.

Kun Tan (kuntan@microsoft.com), Microsoft Research Asia, Beijing, China.
Yongguang Zhang (ygz@microsoft.com), Microsoft Research Asia, Beijing, China.
He Liu (h8liu@ucsd.edu), University of California, San Diego, La Jolla, CA.
Ji Fang (v-fangji@microsoft.com), Microsoft Research Asia and Beijing Jiaotong University, Beijing, China.
Jiansong Zhang (kuntan@microsoft.com), Microsoft Research Asia, Beijing, China.
Geoffrey M. Voelker (voelker@cs.ucsd.edu), University of California, San Diego, La Jolla, CA.

© 2011 ACM 0001-0782/11/0100 $10.00



research highlights
doi:10.1145/1866739.1866761

Technical Perspective
Multipath: A New Control Architecture for the Internet
By Damon Wischik

Multipath transmission for the Internet (that is, allowing users to send some of their packets along one path and others along different paths) is an elegant solution still looking for the right problem.

The most obvious benefit of multipath transmission is greater reliability. For example, I’d like my phone to use WiFi when it can, but seamlessly switch to cellular when needed, without disrupting my flow of data. In general, the only way to create a reliable network out of unreliable components is through redundancy, and multipath transmission is an obvious solution.

The second benefit of multipath transmission is that it gives an extra degree of flexibility in sharing the network. Just as packet switching removed the artificial constraints imposed by splitting links into circuits, so too multipath removes the artificial constraints imposed by ‘splitting’ the network’s total capacity into separate links (see the accompanying figure).

Figure. If flows have access to multiple paths, then spikes in traffic on one link can make use of spare capacity on other links.

Flexibility comes with dangers. By building the Internet with packet switching, we no longer had the control over congestion that circuit switching provides (crude though it may be), and this led in 1988 to Internet congestion collapse. Van Jacobson3 realized there needed to be a new system for controlling congestion, and he had the remarkable insight that it could be achieved by end systems on their own. The Internet has been using his transmission control protocol (TCP) largely unchanged until recently.

The flexibility offered by multipath transport also brings dangers. The claim of the following paper is that, once we do away with the crude control of “each flow may use only one path,” there should be some new control put in place and that, in fact, the proper control can be achieved by end systems on their own. That is to say, if multipath is packet switching 2.0, then it needs TCP 2.0.

Internet congestion collapse in 1988 was a powerful motivator for the quick deployment of Jacobson’s TCP. There is not yet a killer problem for which multipath congestion control is the only good solution. Perhaps we will be unlucky enough to find one. (It has been shown1 that simple greedy route choice by end users, combined with intelligent routing by network operators, can in theory lead to arbitrarily inefficient outcomes, but this has not been seen in practice.)

Lacking a killer problem, the authors present four vignettes that illustrate the inefficiency and unfairness of a naïve approach to multipath, and that showcase the benefit of clever multipath congestion control.

The niggling problems of naïve approaches to multipath could probably all be mitigated by special-case fixes such as “only use paths whose round trip times are within a factor of two of each other” or “no flow may use more than four paths at a time,” perhaps enforced by deep packet inspection. So, in effect, the authors present a choice between a single clean control architecture for multipath transmission, and a series of special-case fixes.

The naïve approach to multipath, as studied in this paper, is to simply run separate TCP congestion control on each path. The clever alternative is to couple the congestion control on different paths, with the overall effect of shifting traffic away from more-congested paths onto less-congested paths; two research groups2,4 have independently devised an appropriate form of coupling. This is the approach under exploration in the mptcp working group at the IETF, although with some concessions to graceful coexistence with existing TCP.

The differences between the two sorts of congestion control show up both in the overall throughput of the network, and also in how the network’s capacity is allocated. The authors use the framework of social welfare utility maximization to address both metrics in a unified way. This framework has been mainstream in theoretical research on congestion control for the past decade. But it is not mainstream in systems work, where more intuitive metrics such as average throughput and Jain’s fairness index hold sway, along with views like “Congestion is only a problem at access links, and if I’ve paid for two connections then I ought to be able to use two TCP flows.” These differences in language and culture have meant that the paper’s conclusions have not become systems orthodoxy.

Now that multipath transport protocols are a hot topic in the network systems community, it is a good time to highlight this work, and to translate its conclusions into practical answers about systems such as data centers and multihomed mobile devices. The authors only address congestion control and path selection for an idealized model of moderately long-lived flows. There are still important questions to answer, such as: When is a flow long enough to make it worth opening a new path? When is a path so bad it should be closed?

References
1. Acemoglu, D., Johari, R. and Ozdaglar, A.E. Partially optimal routing. IEEE Journal of Selected Areas in Communication 25 (2007), 1148–1160.
2. Han, H., Shakkottai, S., Hollot, C.V., Srikant, R. and Towsley, D.F. Multi-path TCP: A joint congestion control and routing scheme to exploit path diversity in the Internet. IEEE/ACM Transactions on Networking 16 (2006), 1260–1271.
3. Jacobson, V. Congestion avoidance and control. In Proceedings of SIGCOMM 1988 Conference.
4. Kelly, F.P. and Voice, T. Stability of end-to-end algorithms for joint routing and rate control. ACM/SIGCOMM Computer Communication Review 35 (2005), 5–12.

Damon Wischik (d.wischik@cs.ucl.ac.uk) is a Royal Society university research fellow in the Networks Research Group in the Department of Computer Science at University College London.

© 2011 ACM 0001-0782/11/0100 $10.00

doi:10.1145/1866739.1866762

Path Selection and Multipath Congestion Control
By Peter Key, Laurent Massoulié, and Don Towsley

Abstract
In this paper, we investigate the benefits that accrue from the use of multiple paths by a session coupled with rate control over those paths. In particular, we study data transfers under two classes of multipath control: coordinated control, where the rates over the paths are determined as a function of all paths, and uncoordinated control, where the rates are determined independently over each path. We show that coordinated control exhibits desirable load balancing properties; for a homogeneous static random paths scenario, we show that the worst-case throughput performance of uncoordinated control behaves as if each user has but a single path (scaling like log(log(N))/log(N) where N is the system size, measured in number of resources), whereas coordinated control yields a worst-case throughput allocation bounded away from zero. We then allow users to change their set of paths and introduce the notion of a Nash equilibrium. We show that both coordinated and uncoordinated control lead to Nash equilibria corresponding to desirable welfare maximizing states, provided in the latter case, the rate controllers over each path do not exhibit any round-trip time (RTT) bias (unlike TCP Reno). Finally, we show in the case of coordinated control that more paths are better, leading to greater welfare states and throughput capacity, and that simple path reselection policies that shift to paths with higher net benefit can achieve these states.

The original version of this paper was published in the Proceedings of IEEE Infocom 2007, May 2007 by IEEE.

1. INTRODUCTION
Multipath routing has received attention recently.2, 5, 6, 14, 21 Furthermore, combining multipath routing with rate control is implicitly used by several peer-to-peer (P2P) applications. Most relevant to us is BitTorrent,4 which maintains a number of, typically four, active connections to other peers with an additional path periodically chosen at random together with a mechanism that retains the best paths (as measured by throughput).

The basic setting of multipath coupled with rate control is as follows. A source and destination pair in a network is given a set of possibly overlapping paths connecting them. The pair then chooses a subset to use and the rates at which to transfer data over those paths. This scenario is illustrated in Figure 1a. Note that the P2P example described above falls into this formulation once one includes a fictitious source feeding data through peers to the intended receiver, as shown in Figure 1b. Some natural questions arise:

• How many paths are required? And does it suffice, as with the above P2P application, to use a subset of the paths? Opening multitudinous TCP connections has negative systems performance implications, hence there are incentives to keep the number of connections small.
• P2P applications use independent uncoordinated TCP rate control mechanisms over each active path as this is straightforward to implement and requires no change to the network. However, starting from first principles, mechanism design produces a coordinated control mechanism where the rates over each path are determined as a function of all of the paths. How does an uncoordinated control mechanism perform relative to a coordinated control mechanism? This is important because the latter requires a revised transport layer protocol or a careful application layer solution whereas the former is easily implemented using TCP.

Figure 1. (a) A canonical multipath example. (b) A BitTorrent example where a receiving peer receives data from four peers. A virtual sender has been included to show the relationship to canonical multipath. [Diagram labels: (a) Sender, Receiver; (b) Virtual Sender, File replicas, Receiver.]

The motivating application scenario is of data transfers over a network, where the transfers are long enough to allow performance benefits for multipath routing, although our results apply more generally to situations where there are alternative resources that can help service a demand, and where the demand is serviced using some form of rate control. We assume that demand is fixed, and each user (we use the term “user” as a convenient shorthand for a user, or the software or algorithm a user or end-system employs) attempts to optimize its performance by choosing appropriate paths (resources), where the rate control algorithm is fixed. More precisely, we assume that the rate control is implicitly characterized by a utility maximization problem,20 where a particular rate control algorithm


(e.g. TCP Reno) maps to a particular (user) utility function,9 and users selfishly seek to choose paths in such a way as to maximize their net utility. Within this optimization framework, a coordinated controller is modeled by a single utility function per user, whose argument is the aggregate rate summed over paths, whereas an uncoordinated controller has a utility function per path and the aggregation is over all of the utility functions.

Key to the usefulness of multipath rate control is its ability, in the hands of users operating independently of each other, to balance the load throughout the network. We illustrate this for a particular scenario, where the paths chosen are fixed and static, but chosen at random from a set of size N. We focus on the worst-case allocation, which is a measure of the fairness of the scheme. In the uncoordinated case, the worst-case allocation scales as log(log(N))/log(N) independent of the number b of paths chosen. In contrast, in the coordinated case where each user can balance its load across the b paths available to it, provided there are two or more, the worst-case allocation is bounded away from zero. This demonstrates that

1. coordinated control balances loads significantly better than uncoordinated when paths are fixed.
2. coordinated improves on greedy least-loaded resource selection, as in Mitzenmacher,16 where the least-loaded selection of b resources scales as 1/log(log(N)) for b > 1.

Effectively, coordinated control is able to shift the load among the resources, and with each user independently balancing loads over no more than two paths, is able to utilize the resources as if global load balancing was being performed.

This raises the question of how users should be assigned a set of paths to use. One natural path selection mechanism is to allow users to make their own choices. We study this as a game between users and consider a natural notion of a Nash equilibrium in this context, where users seek to selfishly maximize their own net utilities. We find that when users use coordinated controllers, the Nash equilibria coincide with welfare-maximizing social optima. When we consider uncoordinated controllers, then the results depend on whether the controllers exhibit RTT bias (like TCP) or not. When they do not exhibit RTT bias, the Nash equilibria also coincide with welfare-maximizing social optima. Otherwise they need not.

We show that increasing the number of paths available to a source destination pair is desirable from a performance perspective. However, the simultaneous use of a large number of paths may not be possible. We show that this does not pose a problem as simple path selection policies that combine random path resampling with moving to paths with higher net benefit lead to welfare maximizing equilibria and also increase the throughput capacity of the network. In fact such a policy does as well as if each user uses all of the available paths simultaneously.

In summary, we shall provide some partial answers to our initial questions.

• In a large system, provided users re-select randomly from the sets of paths and shift to paths with higher net benefit, they can rely on a small number of paths and do as well as if they were fully using all available paths.
• Coordinated control has better fairness properties than uncoordinated in the static case. When combined with path reselection, uncoordinated control only does as well as a coordinated control if there is no RTT bias in the controllers.

We conclude the paper with some thoughts on how multipath rate control might be deployed.

2. THE MULTIPATH FRAMEWORK
The standard model of the network is as a capacitated graph G = (V, E, C) where V represents a set of end-hosts or routers, E is a set of communication links and each link has a capacity, say in bits per second, C_l, l ∈ E. In addition a large population of sessions perform data transfers over the network. These sessions are partitioned into a set of session classes S with N_s sessions in class s ∈ S. Associated with class s is a source s_s, a destination d_s, and a set of one or more, possibly overlapping paths, R(s) between the source and destination that is made available to all class s sessions. Finally, we associate an increasing, concave function with each session class, U_s(x), which is the utility that a class s session receives when it sends data at rate x > 0 from source to destination. Now, exactly how this utility is used and the meaning of x depends on whether we are concerned with coordinated or uncoordinated control. We will shortly describe each of these in turn.

A discussion of how utility functions can be used to model standard TCP Reno is given in Kunniyur and Srikant.15 The so-called weighted alpha-fair utility functions given by

$$U_r(x) = \begin{cases} w_r \dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \neq 1, \\ w_r \log x, & \alpha = 1, \end{cases} \tag{1}$$

were introduced in Mo and Walrand,17 and are linked to different notions of fairness. For example, α = 1 corresponds to (weighted) proportional fairness,8 and α → ∞ to max–min fairness. TCP’s behavior is well approximated by taking α = 2 and w_r = 1/T_r², where T_r is the round trip time for path r, in the following sense: TCP achieves the maximum aggregate utility, for given paths and link capacities, for the corresponding utility functions U_r.

The set of paths available to a class s session can potentially be very large. Hence a session will likely use only a small subset of these paths. We assume for now that every class s session uses exactly b_s paths. Let c denote a subset of R(s) that contains b_s paths and C(s) the set of all such subsets of paths, C(s) = {c : c ⊆ R(s) ∧ |c| = b_s}. Let N_c denote the number of class s sessions that use the set of paths c ∈ C(s), s ∈ S, and hence N_s = ∑_{c ∈ C(s)} N_c. Last, let N_r denote the number of sessions that use path r ∈ R(s), N_r = ∑_{c ∈ C(s)} 1(r ∈ c) N_c, where 1(x) is the indicator function taking the value 1 when x is true.

Associated with each class s session is a congestion controller (rate controller) that determines the rates at which to send data over each of the b_s paths available to it. We

now distinguish between coordinated and uncoordinated uncoordinated control can be implemented motivates our
control. study  of  it. In  Kelly’s optimization formulation this corre-
Coordinated control. Given a set of paths c, a coordinated sponds to ­solving the following problem:
controller actively balances loads over all paths in c, taking
into account the states of the paths. Our understanding of
and ability to design such controllers relies on a significant
advance made by Kelly et al.,8 which maps this problem into
one of utility optimization. In the case of coordinated con- over nonnegative λcr subject to the capacity constraints (3). As
gestion control, the objective is to maximize the “social wel- above, by analogy with (5) the constraints can be generalized
fare,” that is to to reflect the signaling used by a controller such as TCP. Note
the difference between this formulation and that for coordi-
nated control. In the case of the latter, the utility is applied to
the aggregate sending rate whereas in the case of the former,
the utility is evaluated on each path and then summed over all
over (λcr ≥ 0) subject to the capacity constraints paths. Note also that really we have written Ur instead of Us for
the uncoordinated controller, to reflect the fact that the con-
gestion control may differ across different paths (as is the case
with TCP whose allocation depends on the RTT of the path).
where λcr is the sending rate of a class s session that is using The functions above are strictly concave and are being
path r in c ∈ C(s). We will find it useful to represent the total optimized over a convex feasible region. Hence the prob-
rate contributed by class s sessions that use path r ∈ R(s) as lems admit to unique solutions in terms of aggregate per
Lr = Nc ∑c  r λcr, and the aggregate rate achieved by a single s class rates, even though distinct solutions may exist.
session over all paths in c as λc = ∑r ∈ c λcr.
Note that in the absence of restrictions on the number of 3. LOAD BALANCING PROPERTIES OF MULTIPATH
paths used, C(s) = R(s), and the optimization can be written Multipath has been put forward as a mechanism that when
subject to the capacity constraints. We shall see later in Section 5 that by using random path reselection the solution to (2) actually solves (4), and hence give conditions for when the restriction to using a subset of paths of limited size imposes no performance penalties.

More generally, we can replace the hard capacity constraints(b) by a convex nondecreasing penalty function Γ. In the context of TCP, this penalty function can be thought of as capturing the signaling conveyed by packet losses or packet marking (ECN19) by the network to the sessions when link capacities are violated. Under this extension, the coordinated control problem transforms to

There are many ways to approach the problem of designing controllers that solve these problems, but a very natural one is suggested by the TCP congestion control, which solves this variation of the above problem when each session is restricted to a single path (see Key et al.11).

Uncoordinated control. As mentioned earlier, uncoordinated control corresponds to a session with path set c executing independent rate controllers over each path in c. This is easily done in the current Internet by establishing separate TCP connections over each path. The ease in which

(b) The hard constraints in (3) can be written as the sum of penalty functions, each of which is a step function Γl(x), with Γl(x) = 0 if x ≤ Cl and ∞ otherwise.

used by all sessions can balance traffic loads in the Internet. It is impossible to determine whether this is universally true. However, we present in this section a simple scenario where this issue can be definitively resolved. We consider a simple scenario where there are N resources with unit capacity (Cl ≡ 1).

To provide a concrete interpretation, the resources can be interpreted as servers, or as relay or access nodes (see Figure 2). There are aN users. Each user selects b resources at random from the N available, where b is an integer larger than one (the same resource may be sampled several times).

We shall look at the worst-case rate allocation of users in two scenarios. In the first scenario, users implement uncoordinated multipath congestion control where there is no coordination between the b distinct connections of each user. Thus, a connection sharing a resource handling X connections overall achieves a rate allocation of exactly 1/X. In the second scenario, each user implements coordinated multipath congestion control.

We take the worst-case user rate allocation (or throughput) as the load balance metric. One can show13 that the more "unfair" the allocation, the greater the expected time to download a unit of data.

Figure 2. Load balancing example: there are N servers, aN users and each selects b > 1 servers at random.
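As a rough illustration of the uncoordinated scenario described above, the following Python sketch draws one random instance and returns the worst-case user rate. It assumes, as the text states, that a connection at a resource carrying X connections receives exactly 1/X. The function name, the seed parameter, and the rounding of aN down to an integer are choices made for this sketch and are not part of the original analysis.

```python
import random
from collections import Counter


def worst_case_rate_uncoordinated(n_resources, a, b, seed=0):
    """Return min over users of the total rate under uncoordinated control.

    Each of the int(a * n_resources) users opens b connections to resources
    sampled uniformly at random (with replacement). Every resource has unit
    capacity, shared equally among the connections it carries.
    """
    rng = random.Random(seed)
    n_users = int(a * n_resources)
    if n_users < 1:
        raise ValueError("need at least one user")

    # picks[i] lists the b resources chosen by user i (repeats allowed).
    picks = [[rng.randrange(n_resources) for _ in range(b)]
             for _ in range(n_users)]

    # load[j] = number of connections X handled by resource j.
    load = Counter(j for user_picks in picks for j in user_picks)

    # A user's rate is the sum of 1/X over its b connections.
    return min(sum(1.0 / load[j] for j in user_picks)
               for user_picks in picks)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, worst_case_rate_uncoordinated(n, a=1, b=2, seed=1))
```

Averaging the result over several seeds for growing N gives an empirical feel for how the worst-case rate behaves; the sketch makes no attempt to reproduce asymptotic constants.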


3.1. Uncoordinated congestion control
Denote by λi the total rate that user i obtains from all its connections. In the case of uncoordinated congestion control, we can show that the worst-case rate allocation, min λi, decreases like b² log(log N)/log N as N increases. This is to be compared with the worst-case rate allocation that one gets when b = 1, that is when a single path is used: from classical balls and bins models,16 this also decreases as log(log N)/log N as N increases. It should come as no surprise that using more than two paths exhibits the same asymptotic performance as using only one path; there is no potential for balancing load within the network when all connections operate independently of each other. A formal statement and proof of this result can be found in Key et al.11

3.2. Coordinated congestion control
Here we assume as before that there are aN users, each selecting b resources at random from a collection of N available resources. Denote by λij the rate that user i obtains from resource j, and let R(i) denote the set of resources that user i accesses. In contrast with the previous situation, we now assume that the rates λij are chosen to maximize:

\[ \sum_{i} U\Big(\sum_{j \in R(i)} \lambda_{ij}\Big) \quad \text{subject to} \quad \sum_{i:\, j \in R(i)} \lambda_{ij} \le 1 \ \text{for each resource } j, \]

for some concave utility function U.

An interesting property of this problem is that the set of {λ*ij} that solves the above optimization is insensitive to the choice of utility function U so long as it is concave and increasing. Moreover, this insensitivity implies that the optimal aggregate user rates (λ*i) correspond to the max–min fair rate allocations (see Bertsekas and Gallager,3 Section 6.5.2). Simply stated, a rate allocation (λ*i) is said to be max–min fair if and only if an increase of any rate λ*i0 must result in the decrease of some already smaller rate. Formally, for any other feasible allocation (xi), if xi > λ*i then there must exist some j such that λ*j < λ*i and xj < λ*j. The above statements are easily verified by checking that the max–min fair allocation satisfies the Karush–Kuhn–Tucker conditions associated with the above optimization problem.

This leads to the following result.

Theorem 1. Assume there are N resources, and aN users each connecting to b resources selected at random. Denote by {λ*i} the optimal allocations that result. Then there exists x > 0, that depends only on a and b, such that:

\[ \lim_{N \to \infty} P\Big(\min_{i} \lambda^{*}_{i} \ge x\Big) = 1. \]

The style of the proof has wide applicability and we outline it here: first, an application of Hall's celebrated marriage theorem shows that the minimum allocation will be at least x provided that any set of users (of size n say) connects to at least x times as many servers (nx servers). If this condition is satisfied, the allocation (λ*i) will exceed x; hence it is sufficient to ensure that Hall's condition is met with high probability for all nonempty subsets of 1, . . . , aN. Using the binomial theorem and the union bound yields an upper bound on the probability the condition fails to hold, and then Stirling's approximation is used to approximate this bound.

This result says that the worst-case rate allocation is bounded away from zero as N tends to infinity, i.e., it is O(1) in the number of resources N. Thus coordinated control exhibits significantly better load balancing properties than does uncoordinated control. It is also interesting to compare this result to the result quoted by Mitzenmacher et al.,16 which says that if users arrive in some random order, and choose among their b candidate resources one with the lowest load, then the worst-case rate scales like 1/log(log(N)), which, unlike the allocation under coordinated control, goes to zero as N increases. The difference between the two schemes is that in Mitzenmacher's scheme a choice has to be made immediately at arrival, which cannot be changed afterward, whereas a coordinated controller actively and adaptively balances load over the b paths, reacting to changes that may occur to the loads on the resources.

4. A PATH SELECTION GAME
In this section we address the following question. Suppose that each session is restricted to using exactly b paths, taken from a much larger set of possible paths: what is the effect of allowing each user to choose its b paths so as to maximize the benefit that it receives? To answer this question, we study a path selection game. Here each session is a player that greedily searches for throughput-optimal paths. We characterize the equilibrium allocations that ensue. We show that the same equilibria arise with coordinated congestion control and uncoordinated congestion control provided that the latter does not introduce RTT biases on the different paths. Moreover, these equilibria correspond to the optimal set of rates that solve problems (2) and (6), i.e., achieve welfare maximization. We shall use the models and notation of Section 2.

We shall restrict attention to when Ns is large, so that a change of paths by an individual player (session) does not significantly change the network performance. In game theory terms we are only considering non-atomic games.

4.1. Coordinated congestion control
For coordinated control, we use the model of Section 2, where the number of sessions Ns is fixed for all s, and introduce the following notion of a Nash equilibrium.

Definition 1. The nonnegative variables Nc, c ∈ C(s), s ∈ S, are a Nash equilibrium for the coordinated congestion control allocation if they satisfy the constraints ∑c Nc = Ns, and, moreover, for all s ∈ S, all c ∈ C(s), if Nc > 0, then the corresponding coordinated rate allocations satisfy

In other words, for each session (player), weight is only given to sets c that maximize the throughput for s. ◊
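The verbal form of Definition 1 (weight is given only to path sets that maximize throughput) can be checked mechanically for a single class s. The Python sketch below is only an illustration under stated assumptions: the function name and its arguments are hypothetical, and the per-path-set throughputs are supplied as inputs, whereas in the model they result from the coordinated rate allocation induced by the counts Nc themselves.

```python
def is_nash_equilibrium(counts, throughput, n_sessions, tol=1e-9):
    """Check the condition of Definition 1 for one class s.

    counts:      dict mapping each path set c in C(s) to N_c
    throughput:  dict mapping each path set c to the aggregate rate a
                 class-s session obtains on c under the current allocation
                 (assumed given; not computed here)
    n_sessions:  N_s, the total number (mass) of class-s sessions
    """
    if any(n < 0 for n in counts.values()):
        return False  # the N_c must be nonnegative
    if abs(sum(counts.values()) - n_sessions) > tol:
        return False  # the N_c must sum to N_s
    best = max(throughput[c] for c in counts)
    # Every path set that carries weight must achieve the best throughput.
    return all(throughput[c] >= best - tol
               for c, n in counts.items() if n > 0)


if __name__ == "__main__":
    rates = {("r1", "r2"): 2.0, ("r1", "r3"): 1.5}
    print(is_nash_equilibrium({("r1", "r2"): 3.0, ("r1", "r3"): 0.0}, rates, 3.0))
    print(is_nash_equilibrium({("r1", "r2"): 1.0, ("r1", "r3"): 2.0}, rates, 3.0))
```

In the second call some sessions sit on a path set with lower throughput than an available alternative, so the check fails.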

We then have the following.

Theorem 2. At a Nash equilibrium as in Definition 1, the path allocations λr solve the welfare maximization problem (4).

The proof follows since at a Nash equilibrium, type s players only use minimum "cost" paths, which can be shown to coincide with the Kuhn–Tucker conditions of (4). This result says that a selfish choice of path sets by end-users results in a solution that is socially optimal.

4.2. Uncoordinated control
We introduce the following notion of Nash equilibrium.

Definition 2. The collection of per path connection numbers Nr is a Nash equilibrium for selfish throughput maximization if it satisfies ∑r Nr = Ns, and furthermore, the allocations (6) are such that for all s ∈ S, all r ∈ R(s), if Nr > 0, then

◊

The intuition for this definition is as follows: any class s session maintains a connection along path r only if it cannot find an alternative path r′ along which the default congestion control mechanism would allocate a larger rate.

We then have the following result, whose proof is similar to that of Theorem 2.

Theorem 3. Assume that for each s ∈ S, there is a class utility function Us such that Ur(x) ≡ Us(x/b) for all r ∈ R(s). Then for a Nash equilibrium (Nr), the corresponding rate allocations (λr) solve the general optimization problem (4).

To summarize: if (i) the utility functions associated with the congestion control mechanism are path-independent, (ii) users agree to concurrently use a fixed number b of paths, and (iii) they manage to find throughput-optimal paths, that is they achieve a Nash equilibrium, then at the macroscopic level, the per-class allocations solve the coordinated optimization problem (4).

It is well known that the bandwidth shares achieved by TCP Reno are affected by the path round trip times. Thus the underlying utility functions are necessarily path dependent and the above result does not apply, as (i) fails to hold. As a consequence "bad" Nash equilibria can exist. Indeed, a specific example is given in Key et al.11 where the preference of TCP for "short-thin links" over "fat-long links" gives rise to a Nash equilibrium where the throughput is half of what could be achieved. If (ii) is relaxed, different users have different "market power," where those with larger bs gain a large share, thus also creating "bad" Nash equilibria in general.

5. MULTIPATH ROUTING WITH RANDOM PATH RESELECTION
In the previous section we explored the effect of allowing users to greedily select their set of paths (b in number) out of the set of all possible paths that are available to them. In this section we focus on two questions. The first regards how many paths, bs, to allow each class s user so as to enhance its performance and that of the system. We establish a monotonicity result for coordinated control in order to address this question. The second question regards how to manage the overhead that may ensue due to the need for a user to balance load actively over a large number of paths. Possibly surprisingly, we will show that it suffices for a user to maintain a small set of paths, say two (b = 2), provided that it repeatedly selects new paths at random and replaces the old paths with these paths when the latter provide higher throughput. It is interesting to point out that BitTorrent uses a strategy much like this where it "unchokes" a peer (tries out a new peer) and replaces the lowest performing of its existing four connections with this new connection if the latter exhibits higher throughput.

We will examine the above questions for both coordinated control and uncoordinated control. We begin with coordinated control.

5.1. Coordinated control
We begin by addressing the first question, namely how many paths are needed. Consider a network G that supports a set of flow classes S with populations {Ns}, and utility functions {Us}. Let {R(s)} and {R′(s)} be two collections of paths for S that satisfy R(s) ⊆ R′(s) for s ∈ S and suppose that each session applies coordinated control over these paths. Then for the problem (4)

and hence performance increases when the path sets increase, with performance measured by the optimal welfare under the capacity constraints. This follows from the fact that a solution honoring the constraints on path-sets {R} remains a feasible solution when the set of paths increases.

Remark 1. Although we have not shown strict inequality, it is easy to construct examples where aggregate utility strictly increases as more and more paths are provided.

The above result suggests that we would like to provide each user with as large a set of paths as possible to perform active load balancing on. However, this comes with the overhead of having to maintain these paths. We examine this issue next by considering a simple policy where a session is given a set of possible paths to draw from, say R(s) for a class s session, and the policy allows the session to actively use a small subset of these paths, say two of them, while at the same time constantly trying out new paths and replacing poorly performing paths in the old set with better performing paths in the new set. More specifically we consider the following path selection mechanism. Assume a user is using path set c. This user is offered a new path set c′ at some fixed rate Acc′. This new path set is accepted under the condition that the user receives a higher aggregate rate than it was receiving under c. This process then repeats.

We use the model of Section 2, where the class s users, Ns in number, are divided according to the set of paths they are currently using, Nc(t) denoting the number of class s users actively using paths in c ⊆ R(s) at time t. Class s users actively using the set c of paths consider replacing their path set c with path set c′ according to a Poisson process with intensity Acc′. We shall assume that |c| = |c′| = b, i.e., the number of paths in an active set is fixed at b. Finally, assume that for each class s, any r ∈ R(s), any given set c ∈ C(s), there is some c′ such that r ∈ c′ and Acc′ is positive (recall that C(s) is defined as the collection of size b subsets of R(s)). This assumption states that all paths available to a class s session should be tried no matter what set of initial paths is given to that session.

We also have to concern ourselves with the sending rates of the different users as path reselection proceeds over time. Let λc(t) denote the data transfer rate for a user actively using path set c, λc(t) = ∑r∈c λc,r(t) where λc,r(t) is the sending rate along path r at time t. We have described in Key et al.11 a dynamic process where the vectors {Nc(t), λc,r(t)} change over time. This process is stochastic in nature and consequently difficult to model. However, if we assume that the population of users in each class is large, which is reasonable for the Internet, then we can model this process over time by a set of ordinary differential equations, representing the path reselection and rate adaptation dynamics of users over their active path sets. Under the condition that the utility functions and penalty functions are well behaved, we can show that Nc(t) converges to a limit Nc and λc,r(t) converges to λc,r as t tends to infinity. Remarkably, we can show that these limits are the maximizers of

subject to ∑c∈C(s) Nc = Ns. In other words, this resampling process allows the system to converge to a state where the proportion of class s sessions using active path set c ∈ C(s) and the aggregate rates at which they use these paths maximize the aggregate sum of utilities. This is more precisely stated in the following theorem.

Theorem 4. Assume that the utility functions Us and the penalty function Γ are continuously differentiable on their domain, that the former are strictly concave increasing, and the latter convex increasing. Assume further that U′s(x) → 0 as x → ∞. Then (Nc, λc,r) converges to the set of maximizers of the welfare function (10) under the constraints ∑c∈C(s) Nc = Ns. The corresponding equilibrium rates (λr) are solutions of the coordinated welfare maximization problem (2).

The proof proceeds by showing that trajectories of the limiting ordinary differential equation are bounded, that welfare increases over time, and then using LaSalle's invariance theorem to prove that the limiting points of these dynamics coincide with equilibrium points of the ordinary differential equation; showing that the equilibrium points coincide with the maximum of (10) completes the proof.

What makes this result especially useful is that benefit maximization on the part of a user conforms exactly to the user trying to maximize its rate through the path reselection process. Thus, this path reselection policy is easy to implement: at random times the session initiates data transfer using the coordinated rate controller over a new set of paths and measures the achieved throughput, dropping either the old path set or new path set depending on which achieves lower throughput. This equivalence is a consequence of the assumption that the utility U is strictly concave and continuously differentiable.

5.2. Uncoordinated congestion control
As one might expect by now, the story is not as clean in the case of uncoordinated control, and no monotonicity result exists. Indeed, for a symmetric triangle network described in Key et al.,11 with three source-destination session types, allowing each session to use the two-link path as well as the direct path decreases throughput. However, random resampling is still beneficial provided that the uncoordinated control exhibits no RTT bias. If a session is given a set of paths to draw from, then the random resampling strategy described earlier maximizes welfare without the need to use all paths. Moreover, it suffices for sessions to use a greedy rate optimization strategy to determine which set of paths to keep in order to ensure welfare maximization. The reader is referred to Key et al.11 for further details.

6. DISCUSSION AND DEPLOYMENT
Till now, we have focused on networks supporting workloads consisting of persistent or infinite backlog flows. Moreover, the emphasis has been on the effect that multipath has on aggregate utility. In this section we consider workloads consisting of finite length flows that arrive randomly to the network. Our metric will be the capacity of the network to handle such flows. We will observe that several results from previous sections have their counterparts when we focus on finite flows.

As before, we represent a network as a capacitated undirected graph G = (V, E, C) supporting a finite set of flow classes S, with attendant sets of paths {R(s)}. We assume that class s sessions arrive at rate as according to a Poisson process and that they introduce independent and identically distributed exponential workloads with a mean number of bits 1/ms. We introduce the notion of a capacity region for this network, namely the sets of {as} and {ms} for which there exists some rate allocation over the paths available to the sessions such that the time required for sessions to complete their downloads is finite.

In the case of coordinated control, it is possible to derive the following monotonicity result with respect to the capacity region of the network. Consider a network G that supports a set S of flow classes with arrival rates {as} and loads {ms}. Let {R(s)} and {R′(s)} be two collections of paths for these classes that satisfy R(s) ⊆ R′(s) for each s ∈ S and suppose that each session applies coordinated rate and path control over these paths. Then if {as}, {ms} lie within the capacity region of the network with path sets {R(s)}, they lie in the capacity region of the network with path sets {R′(s)} as well.
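The random path reselection policy of Section 5.1, which is revisited for finite flows in this section, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' implementation: the Poisson reselection times are abstracted into a fixed number of rounds, the candidate set c′ is drawn uniformly at random from the available paths, and `measure` is a hypothetical callback standing in for the throughput a session would observe on a path set under its rate controller.

```python
import random


def reselect_paths(available, b, measure, rounds, seed=0):
    """Keep b active paths, repeatedly trying a random alternative set.

    available: list of path identifiers, playing the role of R(s)
    b:         number of paths kept active at any time
    measure:   hypothetical callback returning the aggregate throughput
               observed on a given tuple of paths
    rounds:    number of reselection attempts (stands in for the
               Poisson-distributed reselection times)
    """
    rng = random.Random(seed)
    current = tuple(rng.sample(available, b))
    for _ in range(rounds):
        candidate = tuple(rng.sample(available, b))
        # Accept the new set only if it yields strictly higher throughput;
        # otherwise the candidate is dropped and the old set is kept.
        if measure(candidate) > measure(current):
            current = candidate
    return current


if __name__ == "__main__":
    # Static per-path rates, purely for illustration.
    rates = {"p1": 1.0, "p2": 4.0, "p3": 2.0, "p4": 3.0}
    best = reselect_paths(list(rates), 2,
                          lambda c: sum(rates[p] for p in c), rounds=2000)
    print(sorted(best))
```

With static per-path rates the session settles on the best pair once it has been sampled; in the model of the text the measured rates also change as other sessions reselect, which this sketch does not capture.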

Remark 2. It is easy to find examples where the capacity region strictly increases with the addition of more paths.

Remark 3. Although this result is stated for the case of exponentially distributed workloads, it is straightforward to show that it holds for any workload whose distribution is characterized by a decreasing failure rate. This includes heavy-tailed distributions such as Pareto.

It is interesting to ask the same question about the capacity region when uncoordinated control is used by all flows. Unfortunately, similar to the infinite session workload case, no such monotonicity property exists.

It is also interesting to ask the question as to which controller yields the larger capacity region. As in the case for finite flows, we can show that for a given network configuration (G, S, and R fixed), if {as : s ∈ S}, {ms : s ∈ S} lie within the capacity region of the network when operating with an uncoordinated control, then they lie within the capacity region of the network when operating under coordinated control as well.

Remark 4. It is easy to construct cases where the converse is not true. For instance, the symmetric triangle with single and two-link routing mentioned for fixed flows is such an example (see Key and Massoulié12).

We conclude from this monotonicity property for coordinated control that more is better. However, improved capacity comes at the cost of increased complexity at the end-host, namely maintenance of state for each path and executing rate controllers over each path. Fortunately, as in the case of infinite backlogged sessions, this is not necessary. It suffices for a session to maintain a small set of paths, say two paths, and continually try out random paths from the set of paths available to it, and drop the path which provides it with the poorest performance, say throughput. Note the similarity of this process to that of BitTorrent, which periodically drops the connection providing the lowest throughput and replaces it with a random new connection. Interestingly enough, this multipath algorithm coupled with random resampling achieves the same capacity region as one that requires flows to utilize all paths. Indeed, we can prove the analogue of the result of Section 5.1.

Theorem 5. Assume that class s sessions use all paths from R(s). Assume the set of loads {as} and {ms} lies within the network capacity region. Consider an approach where a class s session uses a subset of paths from R(s), randomly samples a new path set according to a Poisson process with rate γs and drops the worst of the two path sets. Then {as} and {ms} also lie within the capacity region when flows use this resampling approach in the limit as γs → ∞.

Figure 3 illustrates and summarizes our capacity results. As before it is also interesting to ask about the effect of uncoordinated control coupled with random sampling on capacity. Surprisingly enough, uncoordinated control on a small set of paths coupled with random resampling can often increase capacity over uncoordinated control over the entire set of paths.

Figure 3. Capacity region under multipath with and without resampling. (Labeled regions: maximum capacity; multi path; two paths + resampling; two paths; single path.)

6.1. Deployment
To effectively deploy multipath, key ingredients are first, diversity, which is achieved through a combination of multihoming and random path sampling, and second, path selection and multipath streaming using a congestion controller that actively streams along the best paths from a working set. Although home-users are currently often limited in their choice of Internet Service Provider (ISP) and hence cannot multihome, campus or corporate nodes often have diverse connections, via different ISPs or through 3G wireless and wired connectivity. Moreover, the growth of wireless hotspots, wireless mesh and broadband wireless in certain parts of the globe means that even home-users may become multihomed in the future. Recent figures1 suggest that 60% of stub-ASes (those which do not transit traffic) are multihomed, and de Launois5 claims that with IPv6 type multihoming there are at least two disjoint paths between such stub-ASes.

The multipath controllers we have outlined need to be put into practice. Some high-level algorithm designs are considered in Kelly and Voice10 and Han et al.,7 and practical questions are addressed in Raiciu et al.18 Translating from algorithms derived from fluid models to practical packet-based implementations does require care; however, we believe this to be perfectly feasible in practice. Indeed, the IETF has a current Multipath TCP working group, which is looking into adding multipath into TCP.

7. SUMMARY
There are potentially significant gains from combining multipath routing with congestion control. Two different flavors of control are possible: one which coordinates transfers across the multiple paths; and another, uncoordinated control, which sets up parallel connections. The uncoordinated approach is simpler to implement; however, it can suffer from poorer performance while coordinated control is better performing and intrinsically "fairer." We have contrasted the two types of control, and shown that with fixed path choices uncoordinated control can yield inferior performance, halving throughput in one example. If path-choices are allowed to be chosen optimally or "selfishly" by the end-system, then coordinated control reaches the best systemwide optimum; as indeed does uncoordinated control, but only if the control objective is the same for all paths (unlike current TCP), and also only if all users agree to use the same number of parallel paths (connections). This optimum can also be reached by limiting each session to a small number of path choices (e.g., 2) but allowing paths to be resampled and better paths to replace existing ones.

This suggests that good design choices for multipath controllers are coordinated controllers or uncoordinated controllers with the RTT bias removed.

Acknowledgment
This work was supported in part by the NSF under award CNS-0519922.

References
1. Agarwal, S., Chuah, C.-N., Katz, R. OPCA: Robust interdomain policy routing and traffic control. In Proceedings of the IEEE Openarch (April 2003).
2. Andersen, D., Balakrishnan, H., Kaashoek, F., Rao, R. Improving Web availability for clients with MONET. In Proceedings of the NSDI 2005 (July 2005).
3. Bertsekas, D., Gallager, R. Data Networks. Longman Higher Education, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1992.
4. Cohen, B. Incentives build robustness in BitTorrent. In Proceeding of P2P Economics workshop (June 2003).
5. de Launois, C., Quoitin, B., Bonaventure, O. Leveraging network performance with IPv6 multihoming and multiple provider-dependent aggregatable prefixes. Comput. Netw. 50, 8 (2006), 1145–1157.
6. Gummadi, K., Madhyastha, H., Gribble, S., Levy, H., Wetherall, D. Improving the reliability of Internet paths with one-hop source routing. In Proceedings of the 6th OSDI (Dec. 2004).
7. Han, H., Shakkottai, S., Hollot, C., Srikant, R., Towsley, D. Multi-path TCP: a joint congestion control and routing scheme to exploit path diversity in the Internet. IEEE/ACM Trans. Netw. 14, 6 (Dec. 2006), 1260–1271.
8. Kelly, F., Maulloo, A., Tan, D. Communication networks: shadow prices, proportional fairness and stability. J. Oper. Res. Soc. 49 (1998), 237–252.
9. Kelly, F.P. Mathematical modelling of the Internet. Mathematics Unlimited – 2001 and Beyond. B. Engquist and W. Schmid, eds. Springer-Verlag, New York, 2001, 685–702.
10. Kelly, F.P., Voice, T. Stability of end-to-end algorithms for joint routing and rate control. ACM SIGCOMM Comput. Comm. Rev. 35, 2 (2005), 5–12.
11. Key, P., Massoulié, L., Towsley, D. Path selection and multipath congestion control. In INFOCOM07 (May 2007).
12. Key, P., Massoulié, L. Fluid models of integrated traffic and multipath routing. Queueing Syst. 53, 1 (June 2006), 85–98.
13. Key, P., Massoulié, L., Towsley, D. Multipath routing, congestion control and load balancing. In ICASSP 2007 (Apr. 2007).
14. Kodialam, M., Lakshman, T., Sengupta, S. Efficient and robust routing of highly variable traffic. In HotNets (2004).
15. Kunniyur, S., Srikant, R. End-to-end congestion control schemes: utility functions, random losses and ECN marks. In INFOCOM 2000 (2000).
16. Mitzenmacher, M., Richa, A., Sitaraman, R. The power of two random choices: a survey of the techniques and results. Handbook of Randomized Computing. P. Pardalos, S. Rajasekaran, and J. Rolim, eds. Kluwer Academic Publishers, Dordrecht, 2001, 255–312.
17. Mo, J., Walrand, J. Fair end-to-end window based congestion control. In SPIE 98, International Symposium on Voice, Video and Data Communications (1998).
18. Raiciu, C., Wischik, D., Handley, M. Practical congestion control for multipath transport protocols. UCL Technical Report (2010).
19. Ramakrishnan, K., Floyd, S., Black, D. The addition of explicit congestion notification (ECN) to IP. Technical Report RFC3168, IETF (Sept. 2001).
20. Srikant, R. The Mathematics of Internet Congestion Control. Birkhauser, Boston, 2003.
21. Zhang-Shen, R., McKeown, N. Designing a predictable Internet backbone network with Valiant load-balancing. In IWQoS (June 2005).

Peter Key (peter.key@microsoft.com), Microsoft Research, Cambridge, UK.
Laurent Massoulié (laurent.massoulie@technicolor.com), Thomson Technology Paris Laboratory, Issy-les-Moulineaux, France.
Don Towsley (towsley@cs.umass.edu), Department of Computer Science, University of Massachusetts, Amherst, MA.

© 2011 ACM 0001-0782/11/0100 $10.00

careers

Air Force Institute of Technology (AFIT)
Dayton, Ohio
Department of Electrical and Computer Engineering
Graduate School of Engineering and Management
Faculty Positions in Computer Science or Computer Engineering

The Department of Electrical and Computer Engineering is seeking applicants for tenure track positions in computer science or computer engineering. The department is particularly interested in receiving applications from individuals with strong backgrounds in formal methods (with emphasis on cryptography), software engineering, bioinformatics, computer architecture/VLSI systems, and computer networks and security. The positions are at the assistant professor level, although qualified candidates will be considered at all levels. Applicants must have an earned doctorate in computer science or computer engineering or closely related field and must be U.S. citizens. These positions require teaching at the graduate level as well as establishing and sustaining a strong research program.
AFIT is the premier institution for defense-related graduate education in science, engineering, advanced technology, and management for the U.S. Air Force and the Department of Defense (DoD). Full details on these positions, the department, and application procedures can be found at: http://www.afit.edu/en/eng/employment_faculty.cfm
Review of applications will begin immediately and will continue until the positions are filled. The United States Air Force is an equal opportunity, affirmative action employer.

Possess, or complete by September 2011, a Ph.D. in Computer Science or closely related area. Demonstrate strong English communication skills, a commitment to actively engage in the teaching, research and curricular development activities of the department at both undergraduate and graduate levels, and ability to work with a diverse student body and multicultural constituencies. Ability to teach a broad range of courses, and to articulate complex subject matter to students at all educational levels. First consideration will be given to completed applications received no later than December 15, 2010. Contact: Faculty Search Committee, Computer Science Department, Cal Poly Pomona, Pomona, CA 91768. Email: cs@csupomona.edu. Cal Poly Pomona is an Equal Opportunity, Affirmative Action Employer. Position announcement available at: http://academic.csupomona.edu/faculty/positions.aspx. Lawful authorization to work in US required for hiring.

Carnegie Mellon University
School of Design
IxD Faculty Position

School of Design at Carnegie Mellon University
IxD Faculty Position
Application deadline December 3, 2010
Submit application to johnz@cs.cmu.edu.
View complete job description at http://bit.ly/cGrgeC.

Columbia University
Department of Computer Science

Associate Professors at Columbia are academic officers holding the doctorate or its professional equivalent who have demonstrated scholarly and teaching ability and show great promise of attaining distinction in their fields of specialization.
Professors at Columbia are academic officers holding the doctorate or its professional equivalent who are widely recognized for their distinction. Candidates for senior-level appointment must have a distinguished record of achievement and evidenced by leadership in their field of expertise, publications, professional recognition, as well as a commitment to excellence in teaching.
Candidates must have a Ph.D. degree, DES, or equivalent degree by the starting date of the appointment and are expected to establish a strong research program and excel in teaching both undergraduate and graduate courses.
Our department of 36 tenure-track faculty and 1 lecturer attracts excellent Ph.D. students, virtually all of whom are fully supported by research grants. The department has active ties with major industry partners including Adobe, Autodesk, Canon, Disney, Dreamworks, Microsoft, Nvidia, Google, Sony, Weta, Yahoo! and also to the nearby research laboratories of AT&T, Google, IBM (T.J. Watson), NEC, Siemens, Telcordia Technologies and Verizon. Columbia University is one of the leading research universities in the United States, and New York City is one of the cultural, financial, and communications capitals of the world. Columbia's tree-lined campus is located in Morningside Heights on the Upper West Side.
Applicants should apply online at: academicjobs.columbia.edu/applicants/
Tenured or Tenure-Track Faculty Positions Central?quickFind=54003

California State University, Fullerton The Department of Computer Science at Colum- and should submit electronically the following:
Assistant Professor bia University in New York City invites applica- curriculum-vitae including a publication list,
tions for tenured or tenure-track faculty positions. a statement of research interests and plans, a
The Department of Computer Science invites ap- The search committee is especially interested in statement of teaching interests, names with con-
plications for a tenure-track position at the Assis- candidates who through their research, teaching, tact information of three references, and up to
tant Professor level starting fall 2011. For a com- and/or service will contribute to the diversity and four pre/reprints. Applicants can consult www.
plete description of the department, the position, excellence of the academic community. Appoint- cs.columbia.edu for more information about the
desired specialization and other qualifications, ments at all levels, including assistant professor, department.
please visit http://diversity.fullerton.edu/. associate professor and full professor, will be The position will close no sooner than Decem-
considered. Priority themes for the department ber 31, 2010, and will remain open until filled.
include Computer Systems, Software, Artificial Columbia University is an Equal Opportunity/Af-
Cal Poly Pomona Intelligence, Theory, and Computational Biology. firmative Action employer
Assistant Professor Candidates who work in specific technical areas
including, but not limited to, Computer Graph-
The Computer Science Department invites ap- ics, Human-Computer Interaction, Simulation, DePaul University
plications for a tenure-track position at the rank and Animation, with research programs that can Assistant/Associate Professor
of Assistant Professor to begin Fall 2011. We are significantly impact the above priority themes are
particularly interested in candidates with spe- particularly welcome to apply. Candidates doing The School of Computing at DePaul University
cialization in Software Engineering, although research at the interface of computer sciences invites applications for a tenure-track position
candidates in all areas of Computer Science will and the life sciences and the physical sciences are in distributed systems. We seek candidates with
be considered, and are encouraged to apply. Cal also encouraged to apply. a research interest in data-intensive distributed
Poly Pomona is 30 miles east of L.A. and is one of Assistant Professors at Columbia are aca- systems, cloud computing, distributed databas-
23 campuses in the California State University. demic officers holding the doctorate or its profes- es, or closely related areas. For more information,
The department offers an ABET-accredited B.S. sional equivalent who are beginning a career of see https://facultyopportunities.depaul.edu/ap-
program and an M.S. program. Qualifications: independent scholarly research and teaching. plicants/Central?quickFind=50738.


Eastern New Mexico University
Instructor of Computer Science

For more information visit www.enmu.edu/services/hr or call (575) 562-2115. All employees must pass a pre-employment background check. AA/EO/Title IX Employer


Eastern Washington University
Tenure-track Position

The Computer Science Department at Eastern Washington University invites applications for a tenure-track position starting Sept 2011. Please visit: http://access.ewu.edu/HRRR/Jobs.xml for complete information. For questions contact Margo Stanzak (509) 359-4734


Goucher College
Visiting Assistant Professor, Computer Science

Applications are invited for a three-year visiting assistant professor position beginning August 2011. This is a one-year appointment, renewable for up to two additional years. A Ph.D. in computer science, or a closely related field, is preferred (non-Ph.D. applicants must be ABD).
Applicants should have experience teaching a wide range of courses at all levels of the computer science curriculum. Preference will be given to applicants with a systems background, but applicants from all areas of computer science will be considered. Applicants are expected to present strong teaching credentials. Women and minorities are encouraged to apply.
Application review will begin January 18, 2011 and continue until the position is filled.
Interested applicants must apply online:
http://goucher.interviewexchange.com/candapply.jsp?JOBID=21846

Please submit the following application materials online:
• Curriculum Vitae
• Cover letter
• A personal statement describing your interest in teaching at a small liberal arts college

Three letters of recommendation and official graduate transcripts should be forwarded separately to: Human Resources, Goucher College, 1021 Dulaney Valley Road, Baltimore, MD 21204-2794.
Goucher College is an Equal Opportunity Employer.


Hitachi Research
Research Scientist
Storage Architecture

Hitachi San Jose Research Center is a premier research center with more than 100 scientists working in many exciting fields including storage architecture, consumer electronics, storage technology and nanotechnology. The job opening is in the area of research of storage architecture and systems, more specifically operating systems, novel file systems, reliability of storage devices and systems and novel storage and computing architectures. The ideal candidate would work both independently and as a part of a storage architecture team in conceiving, prototyping and guiding development of new storage-related projects and ideas, as well as writing invention disclosures, academic papers and publications and participating in scientific societies and industry associations. Postdoctoral candidates are welcome to apply and propose more specific research programs.

Job Requirements
PhD in Computer Science with a proven publication track record and implementation experience in system architecture, operating systems, file systems, and embedded systems.
Send applications (a full Curriculum Vitae and short description of research interests) to Zvonimir.Bandic@hitachigst.com


Illinois Institute of Technology
Department of Computer Science

Applications are invited for a tenure-track assistant professor position in Computer Science beginning Fall 2011. Excellence in research, teaching and obtaining external funding is expected. While strong candidates from all areas of computer science will be considered, applicants from general data areas such as database, data mining, information security, information retrieval, and data understanding and processing are especially encouraged.
The Department offers B.S., M.S., and Ph.D. degrees in Computer Science and has research

Faculty Positions in Computer Science
at Ecole polytechnique fédérale de Lausanne (EPFL)

The School of Computer and Communication Sciences at EPFL invites applications for faculty positions in computer science. We are primarily seeking candidates for tenure-track assistant professor positions, but suitably qualified candidates for senior positions will also be considered.

Successful candidates will develop an independent and creative research program, participate in both undergraduate and graduate teaching, and supervise PhD students.

Candidates from all areas of computer science will be considered, but preference will be given to candidates with interests in algorithms, bio-informatics and verification.

EPFL offers internationally competitive salaries, significant start-up resources, and outstanding research infrastructure.

To apply, please follow the application procedure at http://icrecruiting.epfl.ch. The following documents are requested in PDF format: curriculum vitae, including publication list, brief statements of research and teaching interests, names and addresses (including e-mail) of 3 references for junior positions, and 6 for senior positions. Screening will start on January 1, 2011.

Further questions can be addressed to:

Professor Willy Zwaenepoel, Dean
School of Computer and Communication Sciences, EPFL
CH-1015 Lausanne, Switzerland
recruiting.ic@epfl.ch

For additional information on EPFL, please consult: http://www.epfl.ch or http://ic.epfl.ch

EPFL is an equal opportunity employer.


California State University, Fresno

Computational Computer Scientist
Assistant Professor of Computer Science (#11583)
Position will teach computational science and computer science, help organize, utilize, and apply high-performance computing clusters with a proposed Computational Science Center. This position is part of a University-wide cohort of new faculty with a broad emphasis in the area of Urban and Regional Transformation. An earned doctorate (Ph.D) in Computer Science or closely related field from an accredited institution is required for appointment to a tenure-track position. Preference will be given to candidates with post-doctoral research or industrial experience. Visit jobs.csufresno.edu for full announcement and application. Application materials should be submitted online by 1/12/2011. California State University, Fresno is an affirmative action/equal opportunity institution.

Assistant Professor of Computer Science (#11589)
Position will teach undergraduate and graduate computer science to majors and non-majors. An earned doctorate (Ph.D) in Computer Science or closely related field is required for appointment to a tenure-track position. Preference will be given to candidates with post-doctoral research or industrial experience, and whose specialties are in software engineering, computer/operating systems, database/information systems, graphics/visual computing, intelligent systems, or algorithms. Visit jobs.csufresno.edu for full announcement and application. Application materials should be submitted online by 1/12/2011. California State University, Fresno is an affirmative action/equal opportunity institution.

strengths in distributed systems, information retrieval, computer networking, intelligent information systems and algorithms. The Illinois Institute of Technology, located within 10 minutes of downtown Chicago, is a dynamic and innovative institution. The Department has strong connections to Fermi and Argonne National Laboratories, and to local industry, and is on a successful and aggressive recruitment plan. IIT is an equal opportunity/affirmative action employer. Women and Underrepresented Minorities are strongly encouraged to apply.
Evaluation of applications will start on December 1, 2010 and will continue until the position is filled. Applicants should submit a detailed curriculum vitae, a statement of research and teaching interests, and the names and email addresses of at least four references to:
Computer Science Faculty Search Committee
Department of Computer Science
Illinois Institute of Technology
10 W. 31st Street
Chicago, IL 60616
Phone: 312-567-5152
Email: search@cs.iit.edu
http://www.iit.edu/csl/cs


Ingram Content Group
Development Technical Lead

Working with a 400-core processing grid & terascale computing. Leadership of internally developed systems &/or business applications. Key duties are programming & debugging. View full job description & apply online at www.ingramcontent.com.


International Computer Science Institute
Director

The International Computer Science Institute (ICSI), an independent non-profit laboratory closely affiliated with the EECS Department, University of California, Berkeley (UCB), invites applications for the position of Director, beginning Fall 2011.
The ICSI Director’s primary responsibilities are to: oversee and expand ICSI’s research agenda; act as a high-level external evangelist for ICSI research; identify and pursue strategic funding opportunities; and strengthen ICSI’s relationship with UCB. The Director reports directly to ICSI’s Board of Trustees.
ICSI is recognized for world-class research activities in networking, speech, language and vision processing, as well as computational biology and computer architecture. Several of ICSI’s research staff have joint UCB appointments, and many UCB graduate students perform their research at ICSI. In addition, ICSI places significant emphasis on international partnerships and visiting scholar programs.
ICSI is seeking a Director with sufficient breadth, interest, and professional connections to promote and augment ICSI’s ongoing research efforts. Applicants should have recognized research leadership, as well as a strong record in research management and demonstrated success at government and industrial fundraising. Experience with international collaboration and fundraising is a plus.
Applications should include a resume, selected publications, and names of three references. Review begins February 1, 2011; candidates are urged to apply by that date.
To learn more about ICSI, go to http://www.icsi.berkeley.edu.
To apply for this position, send the above material to apply@icsi.berkeley.edu. Recommenders should send letters directly to apply@icsi.berkeley.edu by 2/1/2011. ICSI is an Affirmative Action/Equal Opportunity Employer. Applications from women and minorities are especially encouraged.


Kansas State University
Department of Computing and Information Sciences
Associate/Full Professor

The department of Computing and Information Sciences at Kansas State University invites applications for a position beginning in Fall 2011 at the level of Associate or Full Professor from candidates working in the areas of high assurance computing, program specification and verification, and formal methods.
Kansas State University is committed to the growth and excellence of the CIS department. The department offers a stimulating environment for research and teaching, and has several ongoing collaborative projects involving researchers in different areas of computer science as well as other engineering and science departments. The department has a faculty of nineteen, more than

Assistant Professorships (Tenure Track) in Computer Science


The Department of Computer Science (www.inf.ethz.ch) at ETH Zurich invites applications for assistant professorships
(Tenure Track) in the areas of:
– Computer Systems
– Computational Science
– Human Computer Interaction
– Software Engineering
The department offers a stimulating and well-supported research and teaching environment. Collaboration in research and
teaching is expected both within the department and with other groups of ETH Zurich and related institutions.
Applicants should have internationally recognized expertise in their field and pursue research at the forefront of Computer
Science. Successful candidates should establish and lead a strong research program. They will be expected to supervise Ph.D.
students and teach both undergraduate level courses (in German or English) and graduate level courses (in English).
Assistant professorships have been established to promote the careers of younger scientists. The initial appointment is for
four years with the possibility of renewal for an additional two-year period and promotion to a permanent position.
Please address your application together with a curriculum vitae, a list of publications, a statement of research and teaching
interests and the names of at least three referees to the President of ETH Zurich, Prof. Dr. Ralph Eichler, no later than
February 15, 2011. For further information, candidates may contact the Head of the Department, Prof. F. Mattern (mattern@ethz.ch).
With a view towards increasing the number of female professors, ETH Zurich specifically encourages qualified female candidates
to apply. In order to apply for one of these positions, please visit: www.facultyaffairs.ethz.ch


100 graduate students, and 250 undergraduate students and offers BS, MS, MSE, and PhD degrees. Computing facilities include a large network of servers, workstations and PCs with more than 300 machines and a Beowulf cluster with 1000+ processors. The department building has a wireless network and state-of-the-art media-equipped classrooms. The department hosts several laboratories for embedded systems, software analysis, robotics, computational engineering and science, and data mining. Details of the CIS Department can be found at the URL http://www.cis.ksu.edu/.
Applicants must be committed to both teaching and research, and have an excellent research and teaching track record. Applicants should have a PhD degree in computer science or related disciplines; salary will be commensurate with qualifications. Applications must include descriptions of teaching and research interests along with copies of representative publications.
Preference will be given to candidates who will complement the existing areas of strength of the department, which include high assurance systems, tools for developing, testing and verifying software systems, static analysis, model-driven computing, programming languages, security, and medical device software.
Please send applications to Chair of the Recruiting Committee, Department of Computing and Information Sciences, 234 Nichols Hall, Kansas State University, Manhattan, KS 66506 (email: Recruiting@cis.ksu.edu). Review of applications will commence January 3rd, 2011 and continue until the position is filled.
Kansas State University is an Equal Opportunity Employer and actively seeks diversity among its employees. Background checks are required.


Lingnan University
Chair Professor / Professor

The Department of Computing and Decision Sciences at Lingnan University is seeking a Chair Professor/Professor with outstanding teaching and research experience in one or more of the following areas: Information Systems, Operations Management, Management Science and Statistics. Please visit http://www.ln.edu.hk/job-vacancies/acad/10-170 for details and quote post ref: 10/170/CACM in Form R1.


Loyola University Maryland
Assistant Professor, Computer Science

Loyola University Maryland invites applications for the position of Clare Boothe Luce Professor in the Department of Computer Science, with an expected start date of fall 2011 at the level of Assistant Professor. We are seeking an enthusiastic individual committed to excellent teaching and a continuing, productive research program. A Ph.D. in Computer Science, Computer Engineering, or a closely related field is required. Candidates in all areas of specialization will be considered. The position is restricted by the Clare Boothe Luce bequest to the Henry Luce Foundation to women who are U.S. citizens. Duties of the position include teaching undergraduate and professional graduate computer science courses, conducting research in computer science to establish a record of scholarly, peer-reviewed work, and providing service to the department and the University. The position comes with dedicated funding for course releases as well as a professional development fund. The course releases will mean a reduced load in the 1st-3rd and 5th years, and a guaranteed early research sabbatical in the 4th year after successful midterm review. More information is available at www.loyola.edu/cs and www.loyola.edu. Applicants must submit the following online (http://careers.loyola.edu): a letter of application listing teaching and research interests, a curriculum vitae, and contact information for three references. For full consideration applications must be received by January 31, 2011. Apply URL: https://careers.loyola.edu/applicants/Central?quickFind=52395


Max Planck Institute for Software Systems (MPI-SWS)
Tenure-track openings

Applications are invited for tenure-track and tenured faculty positions in all areas related to the study, design, and engineering of software systems. These areas include, but are not limited to, data and information management, programming systems, software verification, parallel, distributed and networked systems, and embedded systems, as well as cross-cutting areas like security, machine learning, usability, and social aspects of software systems. A doctoral degree in computer science or related areas and an outstanding research record are required. Successful candidates are expected to build a team and pursue a highly visible research agenda, both independently and in collaboration with other groups. Senior candidates must have demonstrated leadership abilities and recognized international stature.
MPI-SWS, founded in 2005, is part of a network of eighty Max Planck Institutes, Germany’s premier basic research facilities. MPIs have an established record of world-class, foundational research in the fields of medicine, biology, chemistry, physics, technology and humanities. Since 1948, MPI researchers have won 17 Nobel prizes. MPI-SWS aspires to meet the highest standards of excellence and international recognition with its research in software systems.
To this end, the institute offers a unique environment that combines the best aspects of a university department and a research laboratory:
a) Faculty receive generous base funding to build and lead a team of graduate students and post-docs. They have full academic freedom and publish their research results freely.
b) Faculty supervise doctoral theses, and have the opportunity to teach graduate and undergraduate courses.
c) Faculty are provided with outstanding technical and administrative support facilities as well as internationally competitive compensation packages.
MPI-SWS currently has 8 tenured and tenure-track faculty, and is funded to support 17 faculty and about 100 doctoral and post-doctoral positions. Additional growth through outside funding is possible. We maintain an open, international and diverse work environment and seek applications from outstanding researchers regardless of national origin or citizenship. The working language is English; knowledge of the German language is not required for a successful career at the institute.
The institute is located in Kaiserslautern and Saarbruecken, in the tri-border area of Germany, France and Luxembourg. The area offers a high standard of living, beautiful surroundings and easy access to major metropolitan areas in the center of Europe, as well as a stimulating, competitive and collaborative work environment. In immediate proximity are the MPI for Informatics, Saarland University, the Technical University of Kaiserslautern, the German Center for Artificial Intelligence (DFKI), and the Fraunhofer Institutes for Experimental Software Engineering and for Industrial Mathematics.
Qualified candidates should apply online at http://www.mpi-sws.org/application. The review of applications will begin on January 3, 2011, and applicants are strongly encouraged to apply by that date; however, applications will continue to be accepted through January 2011.
The institute is committed to increasing the representation of minorities, women and individuals with physical disabilities in Computer Science. We particularly encourage such individuals to apply.


Mississippi State University
Head
Department of Computer Science and Engineering

Applications and nominations are being sought for the Head of the Department of Computer Science and Engineering (www.cse.msstate.edu) at Mississippi State University. This is a 12-month tenure-track position.
The successful Head will provide:
• Vision and leadership for nationally recognized computing education and research programs
• Exceptional academic and administrative skills
• A strong commitment to faculty recruitment and development

Applicants must have a Ph.D. in computer science, software engineering, computer engineering, or a closely related field. The successful candidate must have earned national recognition by a distinguished record of accomplishments in computer science education and research. Demonstrated administrative experience is desired, as is teaching experience at both the undergraduate and graduate levels. The successful candidate must qualify for the rank of professor.
Please provide a letter of application outlining your experience and vision for this position, a curriculum vitae, and names and contact information of at least three professional references. Application materials should be submitted online at http://www.jobs.msstate.edu/.
Screening of candidates will begin February 15, 2011 and will continue until the position is filled. Mississippi State University is an AA/EOE institution. Qualified minorities, women, and people with disabilities are encouraged to apply.
Please direct any questions to Dr. Nicolas Younan, Search Committee Chair (662-325-3912 or younan@ece.msstate.edu).

National Taiwan University tems. The group is seeking candidates with an in- Candidates will be considered from all major
Professor-Associate Professor-Assistant terest in file and storage systems, cloud comput- disciplines in computer and information science.
Professor ing, or related areas. Applicants must have a PhD We particularly welcome candidates who can
in CS and a strong publication record in the above contribute to our strong research groups in soft-
The Department of Computer Science and Infor- topics. Required skills are: ware reliability (formal methods, programming
mation Engineering has faculty openings at all • proactive and assume leadership in proposing languages, software engineering) and systems
ranks beginning in August 2011. Highly qualified and executing innovative research projects and networks.
candidates in all areas of computer science/engi- • develop advanced prototypes leading to dem- The College maintains a strong research pro-
neering are invited to apply. A Ph.D. or its equiva- onstration in industrial environment gram with significant funding from the major
lent is required. Applicants are expected to con- • initiate and maintain collaborations with aca- federal research agencies and private industry.
duct outstanding research and be committed to demic and industrial research communities The College has a diverse full-time faculty of 30.
teaching. Candidates should send a curriculum Four faculty members have joint appointments
vitae, three letters of reference, and supporting Postdoctoral Researchers with other disciplines, specifically, electrical and
materials before February 28, 2011, to Prof Kun- The Machine Learning group conducts research computer engineering, health sciences, physics
Mao Chao, Department of Computer Science on various aspects of machine intelligence, from and political science, and contribute to interdis-
and Information Engineering, National Taiwan the exploration of new algorithms to applications ciplinary initiatives in information assurance,
University, No 1, Sec 4, Roosevelt Rd., Taipei 106, in data mining and semantic comprehension. network science and health informatics. The Col-
Taiwan. Ongoing projects focus on text and video analy- lege has approximately 520 undergraduates, 350
sis, digital pathology, and bioinformatics. The Masters, and 65 Ph.D. students.
group is seeking postdoctoral researchers with Northeastern University has made major in-
NEC Laboratories America, Inc PhD in CS and experience in bioinformatics (em- vestments over the course of the last several years
Research Staff Positions phasis on genomics or proteomics a plus) or text in the broad areas of Health, Security and Sustain-
analysis and/or text mining. Required skills and ability. The College has been a major participant
NEC Laboratories America, Inc. is a vibrant in- experience are: in the recruitment of faculty who can contribute
dustrial research center, conducting research • Strong publication record in top machine to these themes and will continue to do so this
in support of NEC’s U.S. and global businesses. learning, data mining or related conferences and year as well with an additional three interdisci-
Our research program covers many areas, reflect- journals plinary searches ongoing in Health Informatics,
ing the breadth of NEC business, and maintains • Solid knowledge in math, optimization, and Information Assurance and Game Design and In-
a balanced mix of fundamental and applied statistical inference teractive Media.
research. We have openings in the following re- • Hands-on experiences in implementing large- Northeastern University is located on the Av-
search areas: scale learning algorithms and systems enue of the Arts in Boston’s historic Back Bay. The
• Good problem solving skills, with strong soft- College occupies a state of the art building oppo-
Research Staff Members ware knowledge site Boston’s Museum of Fine Arts.
The Large-Scale Distributed Systems group con- Additional information and instructions for
ducts advanced research in the area of design, Associate Research Staff Members submitting application materials may be found at
analysis, modeling and evaluation of distributed Candidates for Associate Research Staff Member the following web site: http://www.ccs.neu.edu/.
systems. Our current focus is to create innovative in the Computing Systems Architecture depart- Screening of applications begins immediately
technologies to build next generation large-scale ment must have an MS in CS/CE or EE with strong and will continue until the search is completed.
computing platforms, and to simplify and au- motivation and skill set to prototype/transfer in- Northeastern University is an Equal Opportu-
tomate the management of complex IT systems novate research results into industry practice. Ex- nity/Affirmative Action Employer. We strongly en-
and services. The group is seeking research staff pertise in at least one of the above parallel com- courage applications from women and minorities.
members in the area of distributed systems and puting areas is desirable.
networks. The candidates must have a PhD in CS/ The Storage Systems department is seeking
CE with strong publication records on the follow- applicants for an Associate Research Staff Mem- Northeastern University
ing topics: ber. The successful candidate will have an MS in Open Rank - Interdisciplinary
˲˲ distributed systems and networks CS or equivalent and the following skills: Northeastern University is seeking a faculty
˲˲ operating systems and middleware ˲˲ Solid understanding of operating systems member at an open rank for an interdisciplinary
˲˲ performance, reliability, dependability and ˲˲ Experience in systems programming under appointment in the College of Computer and In-
security Linux/Unix formation Science and the College of Arts, Media
˲˲ data centers and cloud computing ˲˲ Experience with performance evaluation and and Design to start in the Fall of 2011.
˲˲ virtualization and system management tuning The successful candidate will contribute to
˲˲ system modeling and statistical analysis ˲˲ Strong algorithms, data structures and multi- shaping the research, academic, and develop-
threaded programming experience ment goals of the cross-disciplinary areas of
The Computing Systems Architecture depart- ˲˲ Good knowledge of C++ and OOD/OOP Game Design and Interactive Media at both the
ment seeks to innovate, design, evaluate, and ˲˲ Proactive with can-do attitude and work well in undergraduate and the graduate levels.
deliver parallel systems for high-performance, small teams It is expected that the candidate for this po-
energy-efficient enterprise computing. The group sition will possess an excellent track record in
is seeking senior and junior level research staff For more information about NEC Labs and these research/scholarship, publication, grant acqui-
as follows. Candidates for Research Staff Mem- openings, access http://www.nec-labs.com and sition, and teaching. A terminal degree, either
ber must have a PhD in CS/CE or EE with strong submit your CV and research statement through PhD or MFA depending on the candidate’s field,
research record and excellent credentials in the our career center. is required.
international research community. Applicants EOE/AA/MFDV Contact: Terrence Masson - Email:
must demonstrate competency in at least one of t.masson@neu.edu
the following areas:
˲˲ heterogeneous cluster architectures Northeastern University
˲˲ parallel programming models and runtimes Boston, Massachusetts Northeastern University, Boston,
˲˲ key technologies to accelerate performance College of Computer and Information Science Massachusetts
and low power consumption of enterprise appli- Full or Associate Professor - Health
cations on heterogeneous clusters We invite applications for tenure-track faculty posi- Informatics and Interfaces
tions in computer science and information science,
The Storage Systems department engages in beginning in Fall 2011. Applicants at all ranks will The College of Computer and Information Sci-
research in all aspects of storage systems with an be considered. A PhD in computer science, infor- ence and the Bouvé College of Health Sciences
emphasis on large scale reliable distributed sys- mation science or a related field is required. invite applications for a faculty position in Health


Informatics. A Ph.D. level degree in Health or Please direct inquiries to Professor Stephen for the tenure-track position of Assistant Profes-
Medical Informatics, Computer Science, Infor- Intille (S.Intille@neu.edu). sor of Computer Science effective Fall Semester
mation Science, or a health-related discipline, to- 2011. The successful candidate will have experi-
gether with a proven ability to secure grant fund- ence and research interest in software engineer-
ing for research using advanced technology in the Oregon State University ing/software design, compilers, or programming
health domain, is required. School of Electrical Engineering and Computer languages. Candidates will be evaluated on teach-
Building upon our successful joint Master Science ing and research potential. Ph.D. in Computer
of Science degree program in Health Informat- Two tenure-track Professorial positions in Science is required. Faculty are expected to teach
ics, and our many graduate and undergraduate Computer Science courses for the B.S. and M.S. degrees in Computer
degree programs in health sciences, nursing, Science, pursue scholarly research and publica-
pharmacy, computer science, and information The School of Electrical Engineering and Com- tions, contribute to curriculum development,
science, we are interested in growing our faculty puter Science at Oregon State University invites participate in University and professional ser-
in the general area of health interfaces, which applications for two tenure-track professorial vice activities, advise undergraduate and gradu-
includes technologies that patients interact positions in Computer Science. Exceptionally ate students, and serve on graduate level degree
with directly, health informatics, and technol- strong candidates in all areas of Computer Sci- committees. For information on Penn State Har-
ogy design for health and wellness systems. ence are encouraged to apply. We are building risburg, please visit our websites at
The candidate would play a key role in launch- research and teaching strengths in the areas www.hbg.psu.edu and www.cs.hbg.psu.edu.
ing a new interdisciplinary Ph.D.-level degree of open source software, internet and social Applicants are invited to submit current
program in this area. Faculty in our colleges are computing, and cyber security, so our primary curriculum vitae, a list of three references with
currently working on multiple NIH-funded re- need is for candidates specializing in software one reference addressing candidate’s teaching
search projects in consumer informatics, clini- engineering, database systems, web/distributed effectiveness, a personal statement of research
cal informatics, behavioral informatics, and systems, programming languages, and HCI. and teaching objectives that includes a list of
assistive technologies, and we are particularly Applicants should demonstrate a strong com- preferred courses to teach. Please submit cre-
interested in faculty candidates who can expand mitment to collaboration with other research dentials to: Chair, Computer Science Search
or complement our work in these areas. Topics groups in the School of EECS, with other de- Committee, c/o Mrs. Dorothy J. Guy, Director of
of interest include the use of mobile technolo- partments at Oregon State University, and with Human Resources, Penn State Harrisburg, Box:
gies to monitor and manage health, the use of other universities. ACM-33389, 777 West Harrisburg Pike, Middle-
virtual agents for physical exercise and health The School of EECS supports a culture of en- town, PA 17057-4898.
management, the development of assistive com- ergetic collaboration and faculty are committed Application review will begin immediately
munication aids, the use of artificial intelligence to quality in both education and research. With and continue until the position is filled. Penn
to study mental and physical health behavior, 40 tenure/tenure-track faculty, we enroll 160 PhD, State is committed to affirmative action, equal
and the development and evaluation of other 120 MS and 1200 undergraduate students. OSU opportunity, and the diversity of its workforce.
novel technologies to study health behavior and is the only Oregon institution recognized for its
improve health outcomes. We are interested in “very high research activity” (RU/VH) by the Car-
candidates who create new tools and candidates negie Foundation for the Advancement of Teach- Princeton University
who specialize in evaluation of new technolo- ing. The School of EECS is housed in the Kelley Computer Science
gies in field research. Northeastern University is Engineering Center, a green building designed Assistant Professor
making a major investment in interdisciplinary to support collaboration among faculty and stu- Tenure-Track Positions
health research, with several recent hires and dents across campus. Oregon State University is
additional open interdisciplinary faculty search- located in Corvallis, a college town renowned for The Department of Computer Science at Princ-
es in Health Systems, Health Policy, Urban Envi- its high quality of life. eton University invites applications for faculty
ronment and Health and Administration. For more information, including full position positions at the Assistant Professor level. We are
announcement and instructions for application, accepting applications in all areas of Computer
Additional Information visit: http://eecs.oregonstate.edu/faculty/openings.php. Science.
Recognizing the importance of multidisciplinary Applicants must demonstrate superior re-
approaches to solving complex problems facing OSU is an AAEOE. search and scholarship potential as well as teach-
society, Northeastern is hiring faculty in several ing ability. A PhD in Computer Science or a relat-
areas related to this search. Searches are current- ed area is required.
ly underway in health care policy/management, Pacific Lutheran University Successful candidates are expected to pursue
health systems engineering, health law, and ur- Assistant Professor an active research program and to contribute
ban health. We will consider hiring a multidisci- significantly to the teaching programs of the de-
plinary group as a ‘cluster hire’. Candidates may Assistant Professor in the Computer Science and partment. Applicants should include a resume and
choose to form a team and propose an innovative Computer Engineering Department beginning contact information for at least three people who
and translational research and educational direc- September 2011. Review of applications will be- can comment on the applicant’s professional
tion and apply to more than one of the position gin February 14, 2011, and continue until the po- qualifications.
announcements. Information on these positions sition is filled. There is no deadline, but review of applica-
and on cluster applications can be obtained from A master’s degree is required and a Doctorate tions will start in December 2010; the review of
the http://www.northeastern.edu/hrm/ web site. is required for tenure. Preferred candidates will applicants in the field of theoretical computer
have a Ph.D. in Computer Engineering, Comput- science will begin as early as October 2010.
Equal Employment Opportunity er Science, or a related field; promise of teaching Princeton University is an equal opportunity
Northeastern University is an Equal Opportunity, excellence is essential. employer and complies with applicable EEO and
Affirmative Action Educational Institution and Application details and further information affirmative action regulations. You may apply on-
Employer, Title IX University. Northeastern Uni- about PLU and the CSCE department can be line at:
versity strongly encourages applications from mi- found at www.plu.edu and www.cs.plu.edu. In- http://www.cs.princeton.edu/jobs Requisition
norities, women and persons with disabilities. quiries may be sent by e-mail to csce@plu.edu. Number: 1000520
AA/EOE
How To Apply
Applicants should submit a letter of interest, cur- Princeton University
riculum vitae, and the contact information of at Penn State Harrisburg Computer Science Department
least five references. Submission is online via Assistant Professor, Computer Science Postdoc Research Associate
http://www.ccs.neu.edu/. Screening of applica-
tions begins November 30, 2010 and will contin- Penn State Harrisburg, School of Science, En- The Department of Computer Science at Princ-
ue until the position is filled. gineering and Technology, invites applications eton University is seeking applications for post-



doctoral or more senior research positions in are highly competitive. Applicants are strongly 94 Rockafeller Road
theoretical computer science. Candidates will be encouraged to apply online at https://hiring.sci- Piscataway, NJ 08854-8054
affiliated with the Center for Computational In- ence.purdue.edu. Hard copy applications can
tractability (CCI) or the Princeton Center for The- be sent to: Faculty Search Chair, Department of
oretical Computer Science. Candidates should Computer Science, 305 N. University Street, Pur- State University of New York at
have a PhD in Computer Science or a related field, due University, West Lafayette, IN 47907. Review Binghamton
or be on track to finish by August 2011. Candidates of applications will begin on November 10, 2010, Department of Computer Science
affiliated with the CCI will have visiting privileges and will continue until the positions are filled. Assistant Professor
at partner institutions NYU, Rutgers University, Purdue University is an Equal Opportunity/Equal
and The Institute for Advanced Study. Review of Access/Affirmative Action employer fully commit- We have an opening for a tenure-track Assistant
candidates will begin Jan 1, 2011, and will con- ted to achieving a diverse workforce. Professor starting Fall 2011. Our preferred spe-
tinue until positions are filled. Applicants should cializations are embedded systems, energy-aware
submit a CV, research statement, and contact in- computing and systems development. We have
formation for three references. Purdue University well-established BS, MS and PhD programs, with
Princeton University is an equal opportunity Computer and Information Technology well over 50 full-time PhD students. We offer a
employer and complies with applicable EEO and Department Head significantly reduced teaching load for junior fac-
affirmative action regulations. ulty for at least the first three years. Please submit
Apply to: http://jobs.princeton.edu/ Computer and Information Technology, Purdue a resume and the names of three references at:
requisition# 1000829 University, Department Head; The College of http://binghamton.interviewexchange.com
Technology invites nominations and applica- First consideration will be given to applica-
tions for the position of Department Head, Com- tions that are received by January 15, 2011. We are
Princeton University puter and Information Technology (CIT). The an EE/AA employer.
Visiting Fellows department head reports to the Dean of the Col-
lege of Technology and is responsible for the stra-
The Center for Information Technology Policy tegic leadership of the department in academic, Stevens Institute of Technology
(CITP) seeks candidates for positions as visiting administrative, budgetary, and personnel deci- Three Tenure Track Faculty Positions:
faculty members or researchers, or postdoctoral sions. Candidates must hold a terminal degree Social Networks, Decision and Cognitive
research associates for one-year appointments in sions. Candidates must hold a terminal degree Sciences, and Social Computing
the 2011-2012 academic year. Please see our web- appropriate for appointment at the rank of full
site for additional information and requirements professor with tenure at Purdue University. Can- The Howe School of Technology Management at
at http://citp.princeton.edu/call-for-visitors/. didates will share the vision of the University and Stevens Institute of Technology announces three
If you are interested, please submit a CV and the department, and have demonstrated strategic tenure-track faculty positions, at either the as-
cover letter, stating background, intended re- and transformative leadership with a proven re- sistant or associate professor level, for the 2011-
search, and salary requirements, to cord of scholarship, external funding, and teach- 2012 academic year.
jobs.princeton.edu/applicants/Central?quickFind=60250. ing. A full description of the position and appli- Social Networks. Candidates should have
Princeton University is an equal opportunity cation process is available online at a background in Information Systems or other
employer and complies with applicable EEO and www.tech.purdue.edu/cit/aboutus/positions.cfm. Purdue fields pertinent to the position. Expertise in social
affirmative action regulations. University is an equal opportunity, equal access, network analysis is expected, and candidates may
affirmative action employer. also have an interest in its applications to finan-
cial market behavior.
Purdue University Decision and Cognitive Sciences. Candidates
Department of Computer Science Rutgers, should have a background in decision analysis and
Assistant Professor The State University of New Jersey be knowledgeable about cognitive science and its
Department of Management Science and methods, from work in the disciplines of marketing,
The Department of Computer Science at Purdue Information Systems management, cognitive science, or psychology. An
University invites applications for tenure-track interest in the applications of decision making is also
positions at the assistant professor level begin- The Department of Management Science and important, in, for example, the areas of consumer be-
ning August 2011. Outstanding candidates in all Information Systems (MSIS) has a tenure-track havior, negotiation and risk and uncertainty.
areas of Computer Science will be considered. opening starting Fall 2011 at either the Assistant Social Computing. Candidates should have
Specific needs that have been identified include or Associate Professor level. Candidates should demonstrated expertise in designing, building
theory and software engineering. Candidates have a Ph.D. in information technology or a relat- and analyzing systems that combine aspects of
with a multi-disciplinary focus are encouraged to ed area. A candidate must be an active researcher human and machine intelligence to solve large-
apply. and have a strong record of scholarly excellence. scale problems, e.g., organizational, systems,
The Department of Computer Science offers Special consideration will be given to candidates financial. Candidates should also have a back-
a stimulating and nurturing academic environ- with expertise in data mining, security, data man- ground in Information Systems or a related dis-
ment. Forty-four faculty members direct research agement and other analytical methods related to cipline and be cognizant of recent research in
programs in analysis of algorithms, bioinformat- business operations. Teaching and curriculum crowdsourcing and the cloud computing archi-
ics, databases, distributed and parallel comput- development at the undergraduate, MBA, and tectures that underlie it.
ing, graphics and visualization, information Ph.D. levels will be expected. Candidates for all positions are expected to
security, machine learning, networking, program- Rutgers University is an affirmative action have demonstrated the capacity to perform high-
ming languages and compilers, scientific com- equal opportunity employer. Applications re- impact research. Classroom experience is also
puting, and software engineering. Information ceived by February 1, 2011 are guaranteed full expected. Applicants must apply online at
about the department and a detailed description consideration. All applicants should have com- http://www.apply2jobs.com/Stevens where they will be
of the open position are available at pleted a Ph.D. degree in a relevant subject area by asked to create an applicant profile and to for-
http://www.cs.purdue.edu. the Fall-2011 Semester. Applicants should send mally apply for the position. Use job requisition
All applicants should hold a PhD in Comput- curriculum vitae, cover letter, and the names of number MGMT2077 for Social Networks, job
er Science, or a closely related discipline, be com- three references to: requisition number MGMT2078 for Decision and
mitted to excellence in teaching, and have dem- Ms. Carol Gibson Cognitive Sciences and job requisition number
onstrated potential for excellence in research. (CGibson@rci.rutgers.edu, pdf files only) MGMT2079 for Social Computing.
The successful candidate will be expected to teach Department of Management Science and In addition, please send a curriculum vitae,
courses in computer science, conduct research in Information Systems statement of interest in the position, statement of
field of expertise and participate in other depart- Rutgers Business School research and teaching interests, three references
ment and university activities. Salary and benefits Rutgers, The State University of New Jersey and a sample of published or other research.


Swarthmore College Room 224 New Engineering Building uate programs in computer science and computer
Visiting Assistant Professor Baltimore, MD 21218-2682 engineering. The university is committed to grow-
Phone: 410-516-8775 ing the faculty ranks over the next several years
Swarthmore College invites applications for a Fax: 410-516-6134 and promoting interdisciplinary research toward
three-year faculty position in Computer Science, fsearch@cs.jhu.edu cyber-enabled discovery and design.
at the rank of Visiting Assistant Professor, be- http://www.cs.jhu.edu/apply Penn State is a major research university and
ginning September 2011. Specialization is open. is ranked 3rd in the nation in industry-sponsored
Review of applications will begin January 1, 2011, research. Computer science is ranked 6th in the
and continue until the position is filled. For infor- The Ohio State University nation in research expenditures. U.S. News and
mation, see http://www.cs.swarthmore.edu/job. Department of Computer Science and World Report consistently ranks Penn State’s Col-
Swarthmore College has a strong commit- Engineering (CSE) lege of Engineering undergraduate and graduate
ment to excellence through diversity in educa- Assistant Professor programs in the top 15 of the nation. As reported
tion and employment and welcomes applications in the Chronicle of Higher Education, computer
from candidates with exceptional qualifications, The Department of Computer Science and En- science is ranked 3rd and computer engineering
particularly those with demonstrable commit- gineering (CSE), at The Ohio State University, is ranked 8th in the nation, respectively.
ments to a more inclusive society and world. anticipates significant growth in the next few The university is located in the beautiful col-
years. This year, CSE invites applications for four lege town of State College in the center of Penn-
tenure-track positions at the Assistant Professor sylvania. State College has 40,000 inhabitants
Texas A&M University level. Priority consideration will be given to candi- and offers a variety of cultural and outdoor recre-
Department of Visualization dates in database systems, graphics & animation, ational activities nearby. The university offers out-
Assistant Professor machine learning, and networking. Outstanding standing events from collegiate sporting events
applicants in all CSE areas (including software to fine arts productions. Many major population
Tenure-track faculty in the area of interactive me- engineering & programming languages, systems, centers on the east coast (New York, Philadelphia,
dia. Responsibilities include research/creative and theory) will also be considered. Pittsburgh, Washington D.C., Baltimore) are only
work, advising graduate/undergraduate levels, The department is committed to enhancing a few hours drive away and convenient air services
service to dept, university & field, teaching inc. in- faculty diversity; women, minorities, and individ- to several major hubs are operated by four major
tro courses in game design & development. uals with disabilities are especially encouraged to airlines out of State College.
Candidates must demonstrate collaborative apply. Applicants should hold a Ph.D. in Computer
efforts across disciplinary lines. Graduate degree Applicants should hold or be completing Science, Computer Engineering, or a closely relat-
related to game design & development, mobile a Ph.D. in CSE or a closely related field, have a ed field and should be committed to excellence in
media, interactive graphics, interactive art, mul- commitment to and demonstrated record of ex- both research and teaching. Support will be pro-
timedia or simulation is required. Apply URL: cellence in research, and a commitment to excel- vided to the successful applicants for establish-
http://www.viz.tamu.edu lence in teaching. ing their research programs. We encourage dual
To apply, please submit your application via career couples to apply. Applications should be
the online database. The link can be found at: received by January 31, 2011 to receive full consid-
The Johns Hopkins University http://www.cse.ohio-state.edu/department/positions.shtml eration. To apply by electronic mail, send your re-
Tenure-track Faculty Positions sume (including curriculum vitae and the names
Review of applications will begin in November and addresses of at least three references) as a pdf
The Department of Computer Science at The and will continue until the positions are filled. file to recruiting@cse.psu.edu.
Johns Hopkins University is seeking applications The Ohio State University is an Equal Oppor- For more information about the Department
for tenure-track faculty positions. The search tunity/Affirmative Action Employer. of CSE at PSU, see http://www.cse.psu.edu. Click
is open to all areas of Computer Science, with a here to fill out an Affirmative Action Applicant
particular emphasis on candidates with research Data Card. Our search number is 015-87. You
interests in machine learning, theoretical com- The Pennsylvania State University MUST include this search number in order to
puter science, computational biology, computa- Tenure-track faculty submit this form.
tional aspects of biomedical informatics, or other Penn State is committed to affirmative action,
data-intensive or health-related applications. The Department of Computer Science and Engi- equal opportunity and the diversity of its work-
All applicants must have a Ph.D. in Computer neering (CSE) invites applications for tenure-track force.
Science or a related field and are expected to show faculty positions at all ranks. We seek outstand-
evidence of an ability to establish a strong, inde- ing candidates who can contribute to the core
pendent, multidisciplinary, internationally rec- of computer science and engineering through a The University of Alabama at
ognized research program. Commitment to qual- strong program of interdisciplinary research in Birmingham
ity teaching at the undergraduate and graduate areas such as high performance computing appli- Assistant/Associate Professor
levels will be required of all candidates. Prefer- cations and computational modeling for energy,
ence will be given to applications at the assistant life sciences, environmental sustainability, etc. The Department of Computer & Information Sci-
professor level, but other levels of appointment The department has 32 tenure-track faculty ences at the University of Alabama at Birmingham
will be considered based on area and qualifica- representing major areas of computer science and (UAB) is seeking candidates for a tenure-track/
tions. The Department is committed to building engineering. Eleven members of our faculty are tenure-earning faculty position at the Assistant
a diverse educational environment; women and recipients of the NSF Career Award. Two faculty or Associate Professor level beginning August 15,
minorities are especially encouraged to apply. members have received the prestigious NSF PE- 2011.
A more extensive description of our search can CASE Award. In recent years, our faculty received Candidates with leading expertise in Infor-
be found at http://www.cs.jhu.edu/Search2011. seven NSF ITR Grants, a $35M Network Science mation Assurance, particularly Computer Foren-
More information on the department is available Center Award, over $4.5M in computing and re- sics and/or Computer and Network Security are
at http://www.cs.jhu.edu. search infrastructure and instrumentation grants sought. The successful candidate must be able
Applicants should apply using the online ap- from NSF, eleven NSF Cyber Trust and Networking to participate effectively in multidisciplinary
plication which can be accessed from awards, and several awards from DARPA, DOE and research with scientists in Computer and Infor-
http://www.cs.jhu.edu/apply. Applications should be received DoD. There are state-of-the-art research labs for mation Sciences and Justice Sciences for advanc-
by Dec 1, 2010 for full consideration. Questions computer systems, computer vision and robotics, ing Information Assurance Research at UAB,
should be directed to fsearch@cs.jhu.edu. The Microsystems design and VLSI, networking and including joint scientific studies, co-advising of
Johns Hopkins University is an EEO/AA employer. security, high performance computing, bioinfor- students, and funding. Allied expertise in Artifi-
Faculty Search matics and virtual environments. The department cial Intelligence, Knowledge Discovery and Data
Johns Hopkins University offers a graduate program with over 40 Masters Mining, Software Engineering, and/or High Per-
Department of Computer Science students and 153 Ph.D. students, and undergrad- formance Computing is highly desirable. UAB

has made significant commitment to this area of research and teaching. Candidates must consequently have strong teaching credentials as well as research credentials.
For additional information about the department please visit http://www.cis.uab.edu.
Applicants should have demonstrated the potential to excel in one of these areas and in teaching at all levels of instruction. They should also be committed to professional service including departmental service. A Ph.D. in Computer Science or closely related field is required.
Applications should include a complete curriculum vita with a publication list, a statement of future research plans, a statement on teaching experience and philosophy, and minimally two letters of reference with at least one letter addressing teaching experience and ability.
Applications and all other materials may be submitted via email to facapp.ia@cis.uab.edu or via regular mail to:
Search Committee
Department of Computer and Information Sciences
115A Campbell Hall
1300 University Blvd
Birmingham, AL 35294-1170

Interviewing for the position will begin as soon as qualified candidates are identified, and will continue until the position is filled.
The department and university are committed to building a culturally diverse workforce and strongly encourage applications from women and individuals from underrepresented groups. UAB has a Spouse Relocation Program to assist in the needs of dual career couples. UAB is an Affirmative Action/Equal Employment Opportunity employer.


Theophilus, Inc.
Recommendation Engine / Java Developer

Funded startup with virtual office and flexible working hours seeking experienced part-time Java Developer with recommendation engine experience (implementation and experimental evaluation), to work on next generation recommendation product integrating advanced semantic analysis, search, social networks, and smartphones (iPhone). Developer can work remotely. Email for full description: david.kim@theophilus-inc.com


Toyota Technological Institute at Chicago
Computer Science Faculty Positions at All Levels

Toyota Technological Institute at Chicago (TTIC) is a philanthropically endowed degree-granting institute for computer science located on the University of Chicago campus. The Institute is expected to reach a steady-state of 12 traditional faculty (tenure and tenure track), and 12 limited term faculty. Applications are being accepted in all areas, but we are particularly interested in:
Theoretical computer science
Speech processing
Machine learning
Computational linguistics
Computer vision
Computational biology
Scientific computing

Positions are available at all ranks, and we have a large number of limited term positions currently available.
For all positions we require a Ph.D. Degree or Ph.D. candidacy, with the degree conferred prior to date of hire. Submit your application electronically at:
http://ttic.uchicago.edu/facapp/

Toyota Technological Institute at Chicago is an Equal Opportunity Employer


University of California, Irvine
Computer Science Department
Tenure-Track Position in Operating Systems / Programming Languages

Department of Computer Science at the University of California, Irvine (UCI) invites applications for a tenure-track Assistant Professor position in the general area of Systems. We are particularly interested in applicants who specialize in Operating Systems, Programming Languages or Distributed Systems. Exceptionally qualified more senior candidates may also be considered.
Department of Computer Science is the largest department in the Bren School of Information and Computer Sciences, one of only a few such schools in the nation and the only one in the UC System. The department has over 45 faculty members and over 200 PhD students. Faculty research is very vibrant and broad, spanning prominent areas such as: distributed systems, software, networking, databases, embedded systems, theory, security, graphics, multimedia, machine learning, AI, and bioinformatics. Prospective applicants are encouraged to visit our web page at:
http://cs.www.uci.edu/
One of the youngest UC campuses, UCI is ranked 10th among the nation’s best public universities by US News & World Report. It has received three Nobel prizes in the past 15 years. Salary and other compensation (including priority access to on-campus for-sale faculty housing) are competitive with the nation’s finest universities. UCI is located 3 miles from the Pacific Ocean in Southern California (50 miles South of Los Angeles) with a very pleasant year-round climate. The area offers numerous recreational and cultural opportunities. Also, the Irvine public school system is one of the highest-ranked in the nation.
Screening will begin immediately upon receipt of a completed application. Applications will be accepted until the position is filled, although maximum consideration will be given to applications received by December 15, 2010. Each application must contain: a cover letter, CV, sample publications (up to 3) and 3-5 letters of recommendation. All these materials must be uploaded on-line. Please refer to the following web site for instructions:
https://recruit.ap.uci.edu/

UCI is an equal opportunity employer committed to excellence through diversity and encourages applications from women, minorities, and other under-represented groups. UCI is responsive to the needs of dual career couples, is dedicated to work-life balance through an array of family-friendly policies, and is the recipient of an NSF Advance Award for gender equity.


University of California, Riverside
Tenure-Track Faculty Positions

The Department of Computer Science and Engineering, University of California, Riverside invites applications for tenure-track faculty positions beginning in the 2011/2012 academic year with research interests in (a) Operating and Distributed Systems, (b) Data Mining, and (c) Computer Graphics. Exceptional candidates in all areas will be considered. A Ph.D. in Computer Science (or in a closely related field) is required at the time of employment. Junior candidates must show outstanding research, teaching and graduate student mentorship potential. Exceptional senior candidates may be considered. Salary level will be competitive and commensurate with qualifications and experience. Details and application materials can be found at www.engr.ucr.edu/facultysearch. Full consideration will be given to applications received by February 1, 2011. Applications will continue to be received until the positions are filled. For inquiries and questions, please contact us at search@cs.ucr.edu. EEO/AA employer.


University of Houston – Clear Lake
Assistant or Associate Professor of Computer Science/Computer Information Systems

The Computer Science and Computer Information Systems programs of the School of Science and Computer Engineering at the University of Houston-Clear Lake invite applications for tenure-track Assistant or Associate Professor of CS or CIS to begin August 2011. Ph.D. in CS, CIS/IS, or a closely related field is required. Applications are accepted only online at https://jobs.uhcl.edu. See http://sce.uhcl.edu/cs and http://sce.uhcl.edu/cis for additional information about CS/CIS programs. AA/EOE.


University of Houston-Downtown
Assistant Professor, Computer Sciences

The Department of Computer and Mathematical Sciences invites applications for a tenure-track Assistant Professor position in Computer Science starting Fall 2011. Successful candidates will have a PhD in Computer Science or a closely related field in hand by the time of appointment, a promising research profile, and a commitment to excellence in teaching. Review of applications will begin immediately and continue until the position is filled. Only online applications submitted through http://jobs.uhd.edu will be considered.


University of Mississippi
Chair, Department of Computer and Information Science

The Department of Computer and Information Science at the University of Mississippi (Ole Miss) invites applications for the position of Chair. The Chair provides leadership and overall strategic direction for the instructional and research programs. Requirements include a PhD or equivalent in computer science or a closely related field, evidence of excellence in teaching and research in one or more major areas of computer and information science, and administrative experience relevant to the management of an academic computer science department. The Department has an ABET/CAC-accredited undergraduate program and MS and PhD programs. See the website http://www.cs.olemiss.edu for more information about the Department and its programs.
The University is located in the historic town of Oxford in the wooded hills of north Mississippi, an hour drive from Memphis. Oxford has a wonderful small-town atmosphere with affordable housing and excellent schools.
Individuals may apply online at http://jobs.olemiss.edu. Applicants will be asked to upload a cover letter, curriculum vitae, names and contact information for five references, and a statement of department administrative philosophy, objectives, and vision. Review of applications will begin immediately and will continue until the position is filled or an adequate applicant pool is reached.
The University of Mississippi is an EEO/AA/Title VI/Title IX/Section 504/ADA/ADEA employer.


University of North Texas
Department of Computer Science and Engineering
Department Chair

Applications are invited for the Chair position in the Department of Computer Science and Engineering at the University of North Texas. UNT is one of seven universities designated by the state as an “Emerging Research University.” Candidates must have an earned doctorate in Computer Science and Engineering or a closely related field with a record of significant and sustained research funding and scholarly output that qualifies them for the rank of full professor. Preferred: Administrative experience as a department chair or director of personnel working in computer science and engineering; experience in curriculum development; and demonstrated experience mentoring junior faculty. The committee will begin its review of the applications on December 1, 2010 and the position will close on April 4, 2011. All applicants must apply online to: https://facultyjobs.unt.edu. Nominations and any questions regarding the position may be directed to Dr Bill Buckles (bbuckles@cse.unt.edu). Additional information about the department is available at www.cse.unt.edu. UNT is an AA/ADA/EOE.


University of Texas-Pan American
Computer Science Department
Assistant Professor Faculty Position

The Department of Computer Science at the University of Texas-Pan American (UTPA) seeks applications for a tenure-track Assistant Professor position in Computer Engineering (F10/11-24). All candidates must have a potential/proven record in teaching and active research. The Assistant Professor position in Computer Engineering requires a Ph.D. in computer science or computer engineering. Highest priority will be given to candidates with research expertise in areas of Computer Networks, Cybersecurity/Forensics, Data Management, Scientific Workflows, and/or Semantic Web. The program in Computer Engineering leading to the BS degree in Computer Engineering is administered jointly by the Computer Science Department and the Electrical Engineering Department. The Computer Science Department (http://www.cs.panam.edu) also offers the BSCS (ABET/CAC Accredited) and BS undergraduate degrees, MS in Computer Science and MS in Information Technology.
UTPA is situated in the lower Rio Grande valley of south Texas, a strategic location at the center of social and economic change. With a population of over one million, the Rio Grande Valley is one of the fastest growing regions in the country. The region has a very affordable cost-of-living. UTPA is a leading educator of Hispanic/Latino students, with enrollment of over 18,500.
The position starts in Fall 2011. The salary is competitive. A complete application should include: (1) a cover letter, specifically stating an interest in the Assistant Professor in Computer Engineering position, noting your specialization, (2) vita, (3) statements of teaching and research interests, and (4) names and contact information of at least three references. Applications can be mailed to Dean’s Office, Computer Engineering Search, College of Engineering and Computer Science, The University of Texas-Pan American, 1201 W. University Drive, Edinburg, Texas 78539 or emailed to coec@utpa.edu. Review of materials will begin on November 1, 2010 and continue until the position is filled.
NOTE: UTPA is an Equal Opportunity/Affirmative Action employer. Women, racial/ethnic minorities and persons with disabilities are encouraged to apply. This position is security-sensitive as defined by the Texas Education Code §51.215(c) and Texas Government Code §411.094(a)(2). Texas law requires faculty members whose primary language is not English to demonstrate proficiency in English as determined by a satisfactory grade on the International Test of English as a Foreign Language (TOEFL).


University of Wisconsin-Platteville
Assistant Professor

The University of Wisconsin-Platteville Computer Science and Software Engineering Department has two tenure track positions to be filled in Fall 2011. One is in Software Engineering and one is an anticipated opening in Computer Science. For more information and to apply electronically: http://www.uwplatt.edu/csse/positions.


Utah State University
Assistant Professor

Applications are invited for a faculty position at the Assistant Professor level, for employment beginning Fall 2011. Applicants must have completed a PhD in computer science by the time of appointment. The position requires demonstrated research success, a significant potential for attracting external research funding, excellence in teaching both undergraduate and graduate courses, the ability to supervise student research, and excellent communication skills.
USU offers competitive salaries and outstanding medical, retirement, and professional benefits (see http://www.usu.edu/hr/ for details). The department currently has approximately 280 undergraduate majors, 80 MS students and 27 PhD students. There are 17 full time faculty. The BS degree is ABET accredited. Utah State University is a Carnegie Research Doctoral extensive University of over 23,000 students, nestled in a mountain valley 80 miles north of Salt Lake City, Utah. Opportunities for a wide range of outdoor activities are plentiful. Housing costs are at or below national averages, and the area provides a supportive environment for families and a balanced personal and professional life. Women, minority, veteran and candidates with disabilities are encouraged to apply. USU is sensitive to the needs of dual-career couples. Utah State University is an affirmative action/equal opportunity employer, with a National Science Foundation ADVANCE Gender Equity program, committed to increasing diversity among students, faculty, and all participants in university life.
Applications must be submitted using USU’s online job-opportunity system. To access this job opportunity directly and begin the application process, visit https://jobs.usu.edu/applicants/Central?quickFind=54615.
The review of the applications will begin on January 15, 2011 and continue until the position is filled. The salary will be competitive and depend on qualifications.


Wichita State University
Assistant Professor

The Department of Electrical Engineering and Computer Science at Wichita State University has multiple open tenure-eligible faculty positions at the assistant professor level in Electric Energy Systems, Information Security, and Software Engineering. Duties and responsibilities of all positions include teaching undergraduate and graduate courses, advising undergraduate students, supervising MS and PhD students in their theses and dissertations, obtaining research funding, conducting an active research program, publishing the results of research, actively participating in professional societies, and service to the department, college and university. Complete information can be found on our website, www.eecs.wichita.edu.
To ensure full consideration, the complete application package must be submitted online at jobs.wichita.edu by January 15, 2011. Applications will be continuously reviewed after that date until determinations are made with regard to filling the positions. Offers of employment are contingent upon completion of a satisfactory criminal background check as required by Board of Regents policy.
Questions only (not applications) can be directed to the search chair, Ward Jewell, wardj@ieee.org.
Wichita State University is an equal opportunity and affirmative action employer.



last byte

DOI:10.1145/1866739.1866763 Dennis McCafferty

Q&A
A Journey of Discovery
Ed Lazowska discusses his heady undergraduate days at Brown University, teaching, eScience, and being chair of the Computing Community Consortium.

As an undergraduate student at Brown University, Ed Lazowska hardly seemed destined to become a leader in computer science. Actually, he wasn’t sure what he wanted to do. He started as an engineering student, switched to physics, and briefly considered chemistry. Essentially, he was “adrift.” (His description, not ours.)
It wasn’t until he fell under the tutelage of computer science professor Andy van Dam that he discovered what really excited him: the process of discovery.
“We had access to an IBM 360 mainframe that occupied an entire building,” Lazowska recalls. “Despite its size, it had only a couple hundred megabytes of disk storage and 512 kilobytes of memory. Today, your typical smartphone will have 1,000 times the processing power and storage of this machine. During the day, it supported the entire campus. But between midnight and 8 a.m., we were allowed to use it as a personal computer. We were building a ‘what-you-see-is-what-you-get’ hypertext editor: Microsoft Word plus the Web, minus networking. It was revolutionary.”
Four decades later, Lazowska is dedicated to making the same transformational impact on countless computer science students. After graduating from Brown in 1972, he received his Ph.D. from the University of Toronto in 1977, and joined the University of Washington faculty, focusing on the design, implementation, and analysis of high-performance computing and communication systems. Today, Lazowska holds the Bill & Melinda Gates Chair in Computer Science & Engineering at the University of Washington, where he served as department chair from 1993 to 2001. He also directs the university’s eScience Institute and chairs the Computing Community Consortium, a National Science Foundation initiative that seeks to inspire computer scientists to tackle the societal challenges of the 21st century.

How invigorating were those early days at Brown under van Dam?
It was an amazing time. He had a crew of 20 undergraduates who were his research assistants. Typically, research assistants were graduate students, but Brown had few computer science graduate students at the time. Andy was asking us to join him in discovery, to figure out how to do things that no one had done before. Up to that time, including my freshman year at Brown, I had been learning things that people already knew. It blew my mind that Andy was asking 19- and 20-year-olds to find answers to questions that he himself didn’t know. People rise to the expectations and challenges that are set for them. Andy understood this.
Back then, most people thought of computers as calculators for business and scientific applications. We were using it, however, as a personal augmenter. The hypertext system we built was used in the early 1970s for a poetry class, where someone would enter a poem and invite commentary about it. This was like creating a wiki, only 35 years earlier.

How different is it teaching computer science now than it was five, 10, or 20 years ago, given how fast-shifting the landscape is?
Actually, much has stayed the same. We’re still teaching students the process of discovery. Children have an innate interest in discovering everything around them. But, tragically, by the time they’ve gotten to college, this has usually been beaten out of them. So we teach them how to be children again: how to think like they did when they were four years old.
Undergraduates go to a research university to engage in discovery. We constantly convey to them that we all got involved in computer science because we wanted to solve real problems and change the world. To do this in the modern era just makes it more exciting. The students in my classroom now will go on to computer science grad school, to law school, to medical school, to Microsoft, Google, Amazon, and startups. Then, they change the world.

What was your philosophy when it came to chairing the department?
First, it’s about leadership, not administration. Be a leader; hire an administrator. Second, it’s about people and making them productive. You provide a shield that allows the people working for you to focus on their research and teaching. Third, it’s about students: they’re the multiplicative factor. If you’re at a university and you’re not engaged with students, you’re in the wrong line of work; go to an industrial research lab.

With respect to your role in the eScience Institute, how are you working to take research about disruptive technologies and make a real-world impact?
We are undergoing a revolution in science. The growth of data sensors has been phenomenal. They’re everywhere: in cell phones, in telescopes, in gene sequencers, in highways and buildings and bridges, on the sea floor, in forest canopies, on the Web. In the old days, if you wanted to study the creation of a social clique, for example, you’d pay some undergraduates $6 to participate in a focus group during their lunch break. Now, you have millions of such cliques that can be researched through social media. The challenge of research before was finding enough data. Now, we’re drowning in it. This is a new turn of the crank in the field of computational science: it’s about the data, not the cycles.
With the eScience Institute, we’re focusing on finding better ways to pursue the collection and exploration of all of this data. We want to apply this knowledge first to the needs of the research scientists here at the university, the biologists and astrophysicists who need to resolve big-data exploration problems to do their jobs. Then, we can create solutions for people around the world.

You’re also chair of the Computing Community Consortium (CCC). What are the primary efforts the CCC is most involved with now? And what do you hope to accomplish with these efforts?
Let’s face it, as computer scientists, we can come across as a bit geeky. Our neighbors don’t understand what we do. But what we do greatly impacts the issues that affect our neighbors, like the improvement of our transportation systems, energy efficiency, health care, education, open government, cybersecurity, and discovery in all fields. As computer scientists, we have a rich intellectual agenda with the capability to have an enormous breadth of impact upon daily lives. We need to do a better job of telling that story, and a better job of garnering the resources to do the research that will enable further breakthroughs. That’s what the CCC seeks to do.

You also seek to increase the role of computer science in K–12, within the Science, Technology, Engineering, and Mathematics (STEM) Education Coalition learning framework. Kids spend plenty of time with computers these days, but it seems more voyeuristically focused, with lots of social networking and watching YouTube clips. How do you get them to take a step beyond?
There’s a positive aspect to what they’re doing. It’s better than in the days when the only computer in the house was a video game console. When you have young people using their home computers or iPhones for communication and social networking, it’s good because it introduces them to the broader power of computer science. Hopefully, many of these young people will say, “I want to create something like this.” But to empower these kids, there needs to be a revolution in STEM education.
All available data tells us that our K–12 students are falling behind the rest of the developed world. To address this, we’re exploring how to better teach STEM disciplines. We need to revive the concept that science is about discovery, and not the memorization of facts. And computer science needs to be a big part of this revolution; it needs to be viewed as an essential STEM discipline.

You clearly have a lot on your plate. How do you serve so many needs without shortchanging some?
I’m the wrong person to ask! As responsibilities increase, the hours of the day don’t. If I’m writing a report, I always want to have another month. If I’m preparing for a class, I always want to have another hour. I wish I had the power to say “no,” but I don’t. I’m surrounded by people who can be counted on to produce. I try to be one of those people, too.

Dennis McCafferty is a Washington, D.C.-based technology writer.

© 2011 ACM 0001-0782/11/0100 $10.00

Photograph by Brian Smale



IEEE 7th World Congress on Services
(SERVICES 2011)
July 5-10, 2011, Washington DC, USA, http://www.servicescongress.org/2011
Modernization of all vertical services industries including finance, government, media, communication, healthcare, insurance, energy and ...

IEEE 8th International Conference on Services Computing (SCC 2011)


In the modern services and software industry, Services Computing has become a cross-discipline that covers
the science and technology of bridging the gap between Business Services and IT Services. The scope of Ser-
vices Computing covers the whole lifecycle of services innovation research that includes business componenti-
zation, services modeling, services creation, services realization, services annotation, services deployment, ser-
vices discovery, services composition, services delivery, service-to-service collaboration, services monitoring, services optimiza-
tion, as well as services management. The goal of Services Computing is to enable IT services and computing technology to per-
form business services more efficiently and effectively. Visit http://conferences.computer.org/scc.

IEEE 9th International Conference on Web Services (ICWS 2011)

As a major implementation technology for modernizing software and services industry, Web services are Internet-based application components published using standard interface description languages and universally available via uniform communication protocols. The program of ICWS 2011 will continue to feature research papers with a wide range of topics focusing on various aspects of implementation and infrastructure of Web-based services. ICWS has been a prime international forum for both researchers and industry practitioners to exchange the latest fundamental advances in the state of the art on Web services. Visit icws.org.

IEEE 4th International Conference on Cloud Computing (CLOUD 2011)

Cloud Computing is becoming a scalable services delivery and consumption platform in the field of Services Computing. The technical foundations of Cloud Computing include Service-Oriented Architecture (SOA) and Virtualizations of hardware and software. The goal of Cloud Computing is to share resources among the cloud service consumers, cloud partners, and cloud vendors in the cloud value chain. Major topics cover Infrastructure Cloud, Software Cloud, Application Cloud, and Business Cloud. Visit http://thecloudcomputing.org.

Sponsored by IEEE Technical Committee on Services Computing (TC-SVC, tab.computer.org/tcsc)


Submission Deadlines

ICWS 2011: 1/31/2011


CLOUD 2011: 1/31/2011
SCC 2011: 2/14/2011
SERVICES 2011: 2/14/2011

Contact: Liang-Jie Zhang (LJ) at


zhanglj@ieee.org
(Steering Committee Chair)
PATINA
Personal Architectonics Through INteractions with Artefacts
