COMMUNICATIONS OF THE ACM
CACM.ACM.ORG 04/2015 VOL.58 NO.04
Sketch-Thru-Plan:
A Multimodal Interface
for Command and Control
Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.
ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO: John White
Deputy Executive Director and COO: Patricia Ryan
Director, Office of Information Systems: Wayne Graves
Director, Office of Financial Services: Darren Ramdin
Director, Office of SIG Services: Donna Cappo
Director, Office of Publications: Bernard Rous
Director, Office of Group Publishing: Scott E. Delman

ACM COUNCIL
President: Alexander L. Wolf
Vice-President: Vicki L. Hanson
Secretary/Treasurer: Erik Altman
Past President: Vinton G. Cerf
Chair, SGB Board: Patrick Madden
Co-Chairs, Publications Board: Jack Davidson and Joseph Konstan
Members-at-Large: Eric Allman; Ricardo Baeza-Yates; Cherri Pancake; Radia Perlman; Mary Lou Soffa; Eugene Spafford; Per Stenström
SGB Council Representatives: Paul Beame; Barbara Boucher Owens; Andrew Sears

BOARD CHAIRS
Education Board: Mehran Sahami and Jane Chu Prey
Practitioners Board: George Neville-Neil

REGIONAL COUNCIL CHAIRS
ACM Europe Council: Fabrizio Gagliardi
ACM India Council: Srinivas Padmanabhuni
ACM China Council: Jiaguang Sun

PUBLICATIONS BOARD
Co-Chairs: Jack Davidson; Joseph Konstan
Board Members: Ronald F. Boisvert; Nikil Dutt; Roch Guerrin; Carol Hutchins; Yannis Ioannidis; Catherine McGeoch; M. Tamer Ozsu; Mary Lou Soffa

ACM U.S. Public Policy Office
Renee Dopplick, Director
1828 L Street, N.W., Suite 800
Washington, DC 20036 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Lissa Clayborn, Acting Executive Director

STAFF
Director of Group Publishing: Scott E. Delman, cacm-publisher@cacm.acm.org
Executive Editor: Diane Crawford
Managing Editor: Thomas E. Lambert
Senior Editor: Andrew Rosenbloom
Senior Editor/News: Larry Fisher
Web Editor: David Roman
Rights and Permissions: Deborah Cotton
Art Director: Andrij Borys
Associate Art Director: Margaret Gray
Assistant Art Director: Mia Angelica Balaquiot
Designer: Iwona Usakiewicz
Production Manager: Lynn D’Addesio
Director of Media Sales: Jennifer Ruzicka
Public Relations Coordinator: Virginia Gold
Publications Assistant: Juliet Chance
Columnists: David Anderson; Phillip G. Armour; Michael Cusumano; Peter J. Denning; Mark Guzdial; Thomas Haigh; Leah Hoffmann; Mari Sako; Pamela Samuelson; Marshall Van Alstyne

CONTACT POINTS
Copyright permission: permissions@cacm.acm.org
Calendar items: calendar@cacm.acm.org
Change of address: acmhelp@acm.org
Letters to the Editor: letters@cacm.acm.org

WEBSITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/

ACM ADVERTISING DEPARTMENT
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 626-0686
F (212) 869-0481
Director of Media Sales: Jennifer Ruzicka, jen.ruzicka@hq.acm.org
Media Kit: acmmediasales@acm.org

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481

EDITORIAL BOARD
Editor-in-Chief: Moshe Y. Vardi, eic@cacm.acm.org

NEWS
Co-Chairs: William Pulleyblank and Marc Snir
Board Members: Mei Kobayashi; Kurt Mehlhorn; Michael Mitzenmacher; Rajeev Rastogi

VIEWPOINTS
Co-Chairs: Tim Finin; Susanne E. Hambrusch; John Leslie King
Board Members: William Aspray; Stefan Bechtold; Michael L. Best; Judith Bishop; Stuart I. Feldman; Peter Freeman; Mark Guzdial; Rachelle Hollander; Richard Ladner; Carl Landwehr; Carlos Jose Pereira de Lucena; Beng Chin Ooi; Loren Terveen; Marshall Van Alstyne; Jeannette Wing

PRACTICE
Co-Chairs: Stephen Bourne
Board Members: Eric Allman; Charles Beeler; Bryan Cantrill; Terry Coatta; Stuart Feldman; Benjamin Fried; Pat Hanrahan; Tom Limoncelli; Kate Matsudaira; Marshall Kirk McKusick; Erik Meijer; George Neville-Neil; Theo Schlossnagle; Jim Waldo
The Practice section of the CACM Editorial Board also serves as the Editorial Board of .

CONTRIBUTED ARTICLES
Co-Chairs: Al Aho and Andrew Chien
Board Members: William Aiello; Robert Austin; Elisa Bertino; Gilles Brassard; Kim Bruce; Alan Bundy; Peter Buneman; Peter Druschel; Carlo Ghezzi; Carl Gutwin; Gal A. Kaminka; James Larus; Igor Markov; Gail C. Murphy; Bernhard Nebel; Lionel M. Ni; Kenton O’Hara; Sriram Rajamani; Marie-Christine Rousset; Avi Rubin; Krishan Sabnani; Ron Shamir; Yoav Shoham; Larry Snyder; Michael Vitale; Wolfgang Wahlster; Hannes Werthner; Reinhard Wilhelm

RESEARCH HIGHLIGHTS
Co-Chairs: Azer Bestovros and Gregory Morrisett
Board Members: Martin Abadi; Amr El Abbadi; Sanjeev Arora; Dan Boneh; Andrei Broder; Doug Burger; Stuart K. Card; Jeff Chase; Jon Crowcroft; Sandhya Dwaekadas; Matt Dwyer; Alon Halevy; Maurice Herlihy; Norm Jouppi; Andrew B. Kahng; Henry Kautz; Xavier Leroy; Kobbi Nissim; Mendel Rosenblum; David Salesin; Steve Seitz; Guy Steele, Jr.; David Wagner; Margaret H. Wright

WEB
Chair: James Landay
Board Members: Marti Hearst; Jason I. Hong; Jeff Johnson; Wendy E. MacKay

ACM Copyright Notice
Copyright © 2015 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
An annual subscription cost is included in ACM member dues of $99 ($40 of which is allocated to a subscription to Communications); for students, cost is included in $42 dues ($20 of which is allocated to a Communications subscription). A nonmember annual subscription is $100.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current Advertising Rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0686.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact acmhelp@acm.org.

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to
Communications of the ACM
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

Printed in the U.S.A.
We’re more than computational theorists, database managers, UX mavens, coders and developers.
We’re on a mission to solve tomorrow. ACM gives us the resources, the access and the tools to invent the future.
Join ACM today and receive 25% off your first year of membership.
ACM.org/KeepInventing
cerf’s up
DOI:10.1145/2740243
Human or Machine?
We wish to clarify an account of the 2014 Turing Test experiment we conducted at the Royal Society London, U.K., as outlined by Moshe Y. Vardi in his Editor’s Letter “Would Turing Have Passed the Turing Test?” (Sept. 2014). Vardi was referring to a New Yorker blog by Gary Marcus, rather than to our experiment directly. But Marcus had no firsthand experience with our 2014 experiment nor has he seen any of our Turing Test conversations.

Our experiment involved 30 human judges, 30 hidden humans, and five machines (Cleverbot, Elbot, Eugene Goostman, JFred, and Ultra Hal); for background and details see http://turingtestsin2014.blogspot.co.uk/2014/06/eugene-goostman-machine-convinced-3333.html. We used social media to recruit judges and a variety of hidden humans, including males, females, adults, teenagers, experts in computer science and robotics, and non-experts, including journalists, lecturers, students, and interested members of the public.

Prior to the tests, the judges were unaware of the nature of the pairs of hidden entities they would be interrogating; we told them only that they would simultaneously interrogate one human and one machine for five minutes and that the human could be a male or female, child or adult, native English speaker, or non-native English speaker. We asked the hidden humans to be themselves, that is, to be human.

The 30 judges, each given an anonymous experiment identity (labeled J1–J30), interrogated five pairs of hidden entities. Likewise each human and machine was given a unique identity (E1–E35). We ran 150 “simultaneous comparison” Turing Tests in which we instructed the judges that their task was to determine which was human and which was machine in the pair, a decision to be made based solely on the responses the hidden entities posted in reply to what a judge said.

Eugene Goostman was not correctly identified as the machine in the pair in 10 of its 30 tests; that is, 10 judges did not recognize it was a machine. Eugene Goostman’s personality is that of a 13-year-old boy from Odessa, Ukraine, a character we do not consider contrary to Alan M. Turing’s vision for building a machine to think. In 1950, Turing said, “Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s?”

The figure here includes one simultaneous conversation from the experiment, showing one of Judge J19’s tests after that judge simultaneously interacted with two hidden entities, in this case E20 and E24. In this test, E20’s responses to the judge were relayed to a message box displayed on the left of the judge’s screen; E24’s answers were relayed on the right. Timings and text are exactly as they were in the test.

So, could you “pass the test” and be able to say which of the two entities, E20 and E24, is the human and which the machine?
Huma Shah, London, U.K., and Kevin Warwick, Reading, U.K.

Author’s Response:
The details of this 2014 Turing Test experiment only reinforce my judgment that the Turing Test says little about machine intelligence. The ability to generate a human-like dialogue is at best an extremely narrow slice of intelligence.
Moshe Y. Vardi, Editor-in-Chief

Simultaneous comparison by judge J19 in session four, round one of hidden entities E20 and E24 in a Turing Test.

Left
[11:59:14] J19: Hello
[11:59:18] E20: I was hoping to see you Judge!
[11:59:25] J19: good!
[11:59:29] E20: That's good to hear.
[11:59:38] J19: how are you
[11:59:41] E20: Everything's fine, Caused any scandals lately?
[12:00:09] J19: no not really!
[12:00:22] E20: Can you use better grammar and rephrase your question? I'd like to help, but I can't even make it through the jumbled mess. Use capitals and punctuation.
[12:00:34] J19: sorry
[12:00:38] E20: You're excused.
[12:00:49] J19: 't type as good as I talk
[12:00:55] E20: Talk to me whenever you can.
[12:01:49] J19: Are you interested in artificial intelligence?
[12:01:55] E20: I believe machines should do many to assist the lives of man.
[12:02:12] J19: I agree
[12:02:17] E20: You are being well behaved today.
[12:02:49] J19: as well behaved as I can be!
[12:03:02] E20: Well then our peanut distracted me from the movie. Bad baby. [I look at your stomach as I say it, laughing] I don't mind if -.
[12:03:19] J19: you're funny!
[12:03:23] E20: Do you really think so? I'll always believe you're brilliant!
[12:04:00] J19: thankyou remote19 left
[12:04:03] E20: Don't think anything of it!

Right
[11:59:57] J19: How are you doing?
[12:00:07] E24: I'm doing fine, you?
[12:00:17] J19: great
[12:00:27] E24: That's good. :)
[12:01:07] J19: are you enjoying this process?
[12:01:17] E24: So far seems good.
[12:02:02] J19: sorry if I am slow
[12:02:12] E24: That's ok.
[12:02:31] J19: do you have a favorite tv show?
[12:02:47] E24: I have quite a few.
[12:03:01] J19: me too
[12:03:16] E24: What type ot TV do you like?
[12:03:34] J19: Game of thrones
[12:03:50] E24: I prefer documentories.
Bahrain Revokes Masaud Jahromi’s Citizenship
I have written several letters to the editor (June 2012, Jan. 2012, and Nov. 2011) about a citizen of Bahrain, professor and chair Masaud Jahromi of the Telecommunications Engineering Department at the Ahlia University in Bahrain, whose human rights had been violated by his own government. Jahromi was arrested and imprisoned in April 2011 for nearly six months for attending a rally on behalf of freedom. He was eventually tried, convicted, and sentenced by a court to five months in prison and a fine of approximately $1,400. As he had already served five months, the court simultaneously suspended the four months. Following this January 19, 2012 ruling, he was dismissed from his position as professor and chair at Ahlia University only to be reinstated as professor February 20, 2012 and then as chair in March or April 2012.

The Bahrain Ministry of Interior has now revoked Jahromi’s citizenship through a decree issued January 31, 2015. Jahromi was one of 72 Bahrainis, including journalists, activists, and doctors, to be stripped of their citizenship pursuant to a revision of the 1963 Bahrain Citizenship Act.

The Ministry of Interior announced its decree without court process or opportunity to respond, saying it was revoking the citizenship of the named individuals for “terrorist activities,” including “advocating regime change through illegal means.” There is no evidence Jahromi has ever participated in terrorism in any form. His sentence in 2011 was for participating in “unauthorized rallies” during protests. There are no additional allegations or evidence that he violated any law since then.

The summary revocation of citizenship appears to be a result of nonviolent expressive activity that has already been punished and not recurred. International instruments, including the Universal Declaration of Human Rights and the International Convention on Civil and Political Rights, to which Bahrain is a signatory, explicitly protect both the right of individuals to be free from arbitrary deprivation of their nationality and their right to engage in nonviolent expressive activity. Indeed, Article 15 of the Universal Declaration of Human Rights specifically prohibits arbitrary deprivation of anyone’s nationality.

Denial of citizenship without explanation or apparent basis imposes severe damage on an individual who consequently becomes stateless. Moreover, just including Jahromi’s name on a list with the names of obvious terrorists serving with ISIS abroad damages Jahromi’s reputation as an academic.

Jahromi believes wide publicity of his plight through Communications and support from the ACM membership was a positive factor in addressing his previous legal problem. Those who wish to help may write to the following address to request immediate restoration of Jahromi’s Bahraini citizenship.

His Majesty Shaikh Hamad bin Issa Al Khalifa
King of Bahrain
Office of His Majesty the King
P.O. Box 555, Rifa’a Palace
Isa Town Central,
Kingdom of Bahrain

Jack Minker, College Park, MD

The Case of the Missing Skillset
In their article “Verifying Computations without Reexecuting Them” (Feb. 2015), Michael Walfish and Andrew J. Blumberg reviewed the state of the art in program verification while questioning whether to trust anything stored or computed on third-party servers, as in cloud computing, where companies and consumers alike access remote resources, including data, processing power, and memory, on a rental basis. Walfish and Blumberg proposed the formalism of probabilistically checkable proofs to allow, at least theoretically, a verifier to verify remotely performed computations. Not discussed, however, was an important aspect of verifiable software for the cloud: the general lack of the skillsets needed to write efficient and verifiable parallel programs.

As anyone in the software industry or related academic institutions can attest, training software engineers to use new tools and paradigms can involve an arduous learning curve. Even if able to program in a new language, a developer might still need further training to become a productive member of a team. It takes time and resources to graduate from “Hello World” to programs fulfilling client use cases. Such an investment is perhaps what motivates organizations to outsource their software projects, a practice that risks even failure due to poor-quality software.2

Though developing verifiable software is not typically high on an organization’s must-teach list, hosting an application in a cloud adds further requirements due to the code’s remote service-driven execution architecture. As long as the cloud delivers the service at the promised quality of service, clients are unlikely to be interested in the details of its implementation. For example, Jeremy Avigad and John Harrison concluded in their article “Formally Verified Mathematics” (Apr. 2014), “There is a steep learning curve to the use of formal methods, and verifying even straightforward and intuitively clear inferences can be time consuming and difficult.”1

Yet another aspect of the problem is the absence of the skillset needed to write efficient cloud-based code, as not all code is readily convertible for parallel and cloud-friendly environments. While ample processing power may be available in a rental cloud, it may be difficult to find and train software developers to produce related high-quality cloud-based code.

Formalisms (such as those discussed by Walfish and Blumberg) may be exciting, at least theoretically, but realizing efficient and verifiable cloud code requires bridging the gap between the software industry and the community of formal-methods practitioners.
Muaz A. Niazi, Islamabad, Pakistan

References
1. Avigad, J. and Harrison, J. Formally verified mathematics. Commun. ACM 57, 4 (Apr. 2014), 66–75.
2. Moe, N.B., Šmite, D., Hanssen, G.K., and Barney, H. From offshore outsourcing to insourcing and partnerships: Four failed outsourcing attempts. Empirical Software Engineering 19, 5 (Aug. 2014), 1225–1258.

Communications welcomes your opinion. To submit a Letter to the Editor, please limit yourself to 500 words or less, and send to letters@cacm.acm.org.

© 2015 ACM 0001-0782/15/04 $15.00
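
For readers unfamiliar with the idea of checking a remote computation without redoing it, which the Niazi letter refers to, a much simpler classical example may help. The sketch below is not the probabilistically checkable proof machinery discussed by Walfish and Blumberg; it is Freivalds' algorithm, which checks a claimed matrix product C = AB using only matrix-vector products. The function names and the pure-Python list-of-lists representation are illustrative assumptions.

```python
import random


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def freivalds_check(a, b, c, rounds=20, rng=None):
    """Probabilistically check whether a @ b == c for square integer matrices.

    Each round costs three matrix-vector products, O(n^2), instead of the
    full matrix product. If a @ b != c, a single round detects the error
    with probability at least 1/2, so a wrong c is accepted with
    probability at most 2 ** -rounds.
    """
    rng = rng or random.Random()
    n = len(a)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]   # random 0/1 vector
        # Compare A(Br) with Cr without ever forming the product AB.
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False    # definitely wrong
    return True             # correct with high probability


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
claimed = [[19, 22], [43, 50]]    # the true product of a and b
tampered = [[19, 22], [43, 51]]   # one entry altered

print(freivalds_check(a, b, claimed))
print(freivalds_check(a, b, tampered, rounds=40, rng=random.Random(0)))
```

A correct product is always accepted; a tampered one is rejected except with probability at most 2^-rounds. The verifier does far less work than the party that computed the product, which is the property the letter's "verify remotely performed computations" alludes to.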
Dear Colleague,
For over 50 years, ACM has helped computing professionals to be their most creative,
connect to peers, and see what’s next. We are creating a climate in which fresh ideas are
generated and put into play.
Enhance your professional career with these exclusive ACM Member benefits:
We’re more than computational theorists, database engineers, UX mavens, coders and
developers. Be a part of the dynamic changes that are transforming our world. Join
ACM and dare to be the best computing professional you can be. Help us shape the
future of computing.
Sincerely,
Alexander Wolf
President
Association for Computing Machinery
q Join ACM-W: ACM-W supports, celebrates, and advocates internationally for the full engagement of women in
all aspects of the computing field. Available at no additional cost.
1-800-342-6626 (US & Canada) Hours: 8:30AM - 4:30PM (US EST) acmhelp@acm.org
1-212-626-0500 (Global) Fax: 212-944-1318 acm.org/join/CAPP
The Communications Web site, http://cacm.acm.org,
features more than a dozen bloggers in the BLOG@CACM
community. In each issue of Communications, we’ll publish
selected posts or excerpts.
DOI:10.1145/2732417 http://cacm.acm.org/blogs/blog-cacm
many smart program chairs that tried hard. Why is it not better? There are strong nonvisible constraints on the reviewers’ time and attention.

What does it mean? In the end, I think it means two things of real importance.

1. The result of the process is mostly arbitrary. As an author, I found rejects of good papers hard to swallow, especially when reviews were nonsensical. Learning to accept the process has a strong element of arbitrariness helped me deal with that. Now there is proof, so new authors need not be so discouraged.

2. The Conference Management Toolkit (http://bit.ly/16n3WCL) has a tool to measure arbitrariness that can be used by other conferences. Joelle Pineau and I changed ICML 2012 (http://bit.ly/1wZiZaW) in various ways. Many of these appeared beneficial and some stuck, but others did not. In the long run, it is things that stick that matter. Being able to measure the review process in a more powerful way might be beneficial in getting good practices to stick.

You can see related commentary by Lance Fortnow (http://bit.ly/1HNfPm7), Bert Huang (http://bit.ly/1DpGf6L), and Yisong Yue (http://bit.ly/1zNvoqb).

Mark Guzdial
“Rising Enrollment Might Capsize Retention and Diversity Efforts”
http://bit.ly/1J3lsto
January 19, 2015

Computing educators have been working hard at figuring out how to make sure students succeed in computer science classes, with measurable success. The best paper award for the 2013 SIGCSE Symposium went to a paper on how a combination of pair programming, peer instruction, and curriculum change led to dramatic improvements in retention (http://bit.ly/1EB9mIe). The chairs award for the 2013 ICER Conference went to a paper describing how Media Computation positively impacted retention in multiple institutions over a 10-year period (http://bit.ly/1AkpH2x). The best paper award at ITICSE 2014 was a meta-analysis of papers exploring approaches to lower failure rates in CS undergraduate classes (http://bit.ly/1zNrvBH).

How things have changed! Few CS departments in the U.S. are worried about retention right now; instead, they are looking for ways to manage rising enrollment threatening to undermine efforts to increase diversity in CS education.

Enrollments in computer science are skyrocketing. Ed Lazowska and Eric Roberts sounded the alarm at the NCWIT summit last May, showing rising enrollments at several institutions (see http://tcrn.ch/1zxUho2 and charts at right). Indiana University’s undergraduate computing and informatics enrollment has tripled in the last seven years (http://bit.ly/1EBaX0K). At the Georgia Institute of Technology (Georgia Tech), our previous maximum number of undergraduates in computing was 1,539 set in 2001. As of fall 2014, we have 1,665 undergraduates.

[Charts: enrollment by academic year, 2006–07 through 2013–14, at Stanford (total, CS106A, CS101), University of Pennsylvania (total, CS110, CS120), MIT (total, 6.01, 6.00), and Harvard (CS50). Charts courtesy Ed Lazowska, Bill & Melinda Gates Chair in Computer Science & Engineering, University of Washington.]

What do we do? One thing we might do is hire more faculty, and some schools are doing that. There were over 250 job ads for CS faculty in a recent month. I do not know if there are enough CS Ph.D.’s looking for jobs to meet this kind of growth in demand for our courses.

Many schools are putting the brakes on enrollment. Georgia Tech is limiting transfers into the CS major and minor. The University of Massachusetts at Amherst is implementing caps. The University of California at Berkeley has a minimum GPA requirement to transfer into computer science.

We have been down this road before. In the 1980s when enrollment spiked, a variety of mechanisms were put into place to limit enrollment (http://bit.ly/1KkZ9hB). If there were too few seats available in our classes, we wanted to save those for the “best and brightest.” From that perspective, a minimum GPA requirement made sense. From a diversity perspective, it did not.

Even today, white and Asian males are more likely to have access to Advanced Placement Computer Science and other pre-college computing opportunities. Who is going to do better in the intro course: the female student trying out programming for the first time, or the male student who has already programmed in Java? Our efforts to increase the diversity of computing education are likely to be undermined by efforts to manage rising enrollment. Students who get shut out by such limits are most often in demographic groups underrepresented in computer science.

When swamped by overwhelming numbers, retention is not your first priority. When you start allocating seats, students with prior experience look most deserving of the opportunity. Google is trying to help; it started a pilot program to offer grants to schools with innovative ideas to manage booming enrollment without sacrificing diversity (http://bit.ly/16b292F).

One reason to put more computer science into schools is the need for computing workers (http://bit.ly/1DxYbN3). What happens if kids get engaged by activities like the Hour of Code, then cannot get into undergraduate CS classes? The linkage between computing in schools and a well-prepared workforce will be broken. It is ironic our efforts to diversify computing may be getting broken by too many kids being sold on the value of computing!

It is great students see the value of computing. Now, we have to figure out how to meet the demand without sacrificing our efforts to increase diversity.

John Langford is a principal researcher at Microsoft Research New York. Mark Guzdial is a professor at the Georgia Institute of Technology.

© 2015 ACM 0001-0782/15/04 $15.00
APRIL 2015 | VOL. 58 | NO. 4 | COMMUNICATIONS OF THE ACM 13
Distinguished Speakers Program
talks by and with technology leaders and innovators
A great speaker can make the difference between a good event and a WOW event!
The Association for Computing Machinery (ACM), the world’s largest educational and scientific computing society, now provides colleges and
universities, corporations, event and conference planners, and agencies – in addition to ACM local Chapters – with direct access to top technology leaders
and innovators from nearly every sector of the computing industry.
Book the speaker for your next event through the ACM Distinguished Speakers Program (DSP) and deliver compelling and insightful content to your
audience. ACM will cover the cost of transportation for the speaker to travel to your event. Our program features renowned thought leaders
in academia, industry and government speaking about the most important topics in the computing and IT world today. Our booking process is simple and
convenient. Please visit us at: www.dsp.acm.org. If you have questions, please send them to acmdsp@acm.org.
Colleges and Universities Expand the knowledge base of your students ACM Local Chapters Boost attendance at your meetings with live talks
with exciting lectures and the chance to engage with a computing by DSP speakers and keep your chapter members informed of the latest
professional in their desired field of expertise. industry findings.
The DSP is sponsored in part by Microsoft Europe
Association for Computing Machinery: Advancing Computing as a Science & Profession
news
Molecular Moonshots
Synthetic biologists may be closing in on potentially
world-changing breakthroughs, but they are often
hamstrung by a shortage of software tools.
When a team of researchers at Bar-Ilan University in Israel recently announced they had successfully implanted DNA-based nanorobots inside living cockroaches, possibly paving the way for a revolution in cancer treatment, it marked the latest in a series of promising innovations to emerge from the synthetic biology community over the past decade.

In recent years, biotechnologists have started to come tantalizingly close to engineering next-generation drugs and vaccines, DNA-based computational systems, and even brand-new synthetic life forms. Amid all these advances, however, the development of synthetic biology software has largely failed to keep up with the pace of innovation in the field.

With only a handful of commercial software tools at their disposal, most synthetic biologists have had no choice but to build their own bespoke systems to support the intense data modeling needs of molecular engineering.
“Computer science is critically im-
effort to create open source biological parts for molecular engineering. Yet surprisingly few computer scientists have chosen to enter the field.

That situation may be slowly changing, as a handful of developers have started to delve into the challenges of molecular engineering. One of the most promising initiatives to date has come from Autodesk, which is developing a software platform designed for synthetic biology, 3D bioprinting, 4D printing, and DNA nanotechnology, code-named Project Cyborg.

“DNA is the universal programming language,” says Andrew Hessel of Autodesk’s Bio/Nano/Programmable Matter group, who sees enormous potential in applying the principles of computer science to biological applications. “The architectures of biology, from proteins to metabolism to cells to tissues to organisms and ecosystems, it’s all just layered systems on systems. But you need the foundation of silicon computing to support that work. Otherwise, DNA is just as unreadable as ones and zeros.”

Project Cyborg offers researchers and designers a set of Web-based computer-aided design (CAD) tools that allow the engineering of nano-scale objects in a visual design framework. The system allows users to store and manipulate that data in the cloud. Designs can be fabricated by a number of DNA “print shops” around the world (such as DNA2.0 in Menlo Park or Gen9 in Boston).

Thinking about the design of living things, Hessel has discovered irresistible parallels between synthetic biology and computing. “Cells are just living, squishy parallel processors,” says Hessel. “They are fully programmable.”

One of Project Cyborg’s best-known beta testers is Shawn Douglas at the University of California San Francisco, who is using the Autodesk tools to design cancer-fighting nanorobots that have already proven effective at isolating and attacking cancer cells in a lab environment.

The nanorobots work by twisting into a precise configuration that allows them to attach to targeted cells, looking for signals from antigens on the cells’ surface that flag them as cancerous. The nanorobots then lock onto the cell, breaking it open and engaging the cell receptors to trigger a response that destroys the cancer cell.

Transforming DNA strands into predetermined shapes requires a complex and delicate process of manipulating nucleotides to assemble in a preordained sequencing, precisely ordering the genomic components of adenine, cytosine, guanine, and thymine. It is painstaking work, involving thousands of detailed specifications.

Douglas initially used the open source drawing tool Inkblot (and in some cases, pen and paper). Frustrated with the limitations of the available software, he drew on his own background in computer science to develop an open-source software tool
Secure-System Designers Strive to Stem Data Leaks
Attackers using side-channel analysis require little knowledge of how an implementation operates.

Chip and system designers are engaged in a cat-and-mouse battle against hackers to try to prevent information leaking from their circuits that can be used to reveal supposedly secure cryptographic keys and other secrets.

Traditionally, such side-channel attacks have relied on expensive benchtop instruments such as digital-storage oscilloscopes, but the development of an open-source platform dubbed ChipWhisperer based on affordable programmable integrated circuits (ICs) has widened the potential user base, as well as making it easier for designers to assess the vulnerability of their own designs. The latest addition to the ChipWhisperer platform developed by Colin O’Flynn, a doctoral student at Dalhousie University in Nova Scotia, Canada, and colleague Zhizhang Chen is based on a $90 board that is able to recover secret keys from simple microcontroller-based targets in a matter of minutes, although it is by no means an automated process.

[Image caption: A board built on the ChipWhisperer platform can recover secret keys from microcontroller-based targets in minutes. IMAGE COURTESY OF COLIN O’FLYNN AND ZHIZHANG CHEN, DALHOUSIE UNIVERSITY]

“Because the way in which the attack works, it’s very important to understand the theory behind it,” O’Flynn points out. However, an important feature of this class of attack is that it focuses on the core algorithm, rather than on the idiosyncrasies of a particular implementation.

To Patrick Schaumont, associate professor of electrical and computer engineering and the director of the Center for Embedded Systems for Critical Applications at Virginia Polytechnic Institute and State University (Virginia Tech), “What makes side-channel analysis so impressive and so scary is that people who have access to your implementation need to make very few assumptions on what is actually happening inside. They are able to work toward success based purely on statistics.”

The key to side-channel analysis is that changes in data as they are processed by algorithms running on a microprocessor or dedicated hardware unit yield a detectable fingerprint, which may be picked up as changes in the power consumed by the target or as heat or electromagnetic emissions.

Daniel Mayer, senior applications security consultant at New York City-based security research firm Matasano Security, explained at the recent Black Hat USA conference: “They are all related to the computation that the system does at a given time. Using that information, picked up outside the application, you can infer something about the secret it contains.”

Even the sounds from the windings of a transformer in a power supply have been used to collect information about the operation of a circuit in research by leading cryptologist Adi Shamir, based at the Weizmann Institute of Science and working with colleagues from Tel Aviv University. Changes in the amount of current passing through the transformer cause oscillations that can be heard as subtle changes in sound.

Timing-based attacks provide the basis for some of the simplest forms of side-channel analysis. O’Flynn cites a now-discontinued hard drive that used a PIN code entered on its panel to provide access to users. An attacker could measure how long the drive’s firmware took to analyze different codes by simply iterating through integers at each position in the six-digit PIN. “As soon as the while-loop to check the PIN fails, it just exits. There should be a million combinations of this password. But even in the worst case for this drive, it takes just 60 tries to guess the PIN.”
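
The PIN example can be made concrete with a small simulation. The sketch below is not the drive's firmware, which the article does not show; it is a toy model that assumes the check compares digits left to right and stops at the first mismatch, and it uses a loop-iteration count as a stand-in for the time an attacker would measure. The names `check_pin`, `recover_pin`, and `SECRET_PIN` are hypothetical.

```python
from typing import Callable

# Hypothetical secret used only to drive the simulation.
SECRET_PIN = "493817"


def check_pin(guess: str, secret: str = SECRET_PIN) -> tuple[bool, int]:
    """Early-exit comparison: stop at the first wrong digit.

    Returns (matched, steps). The step count models elapsed time:
    the more leading digits are right, the longer the check runs.
    """
    steps = 0
    for g, s in zip(guess, secret):
        steps += 1
        if g != s:
            return False, steps  # the early exit is the leak
    return len(guess) == len(secret), steps


def recover_pin(
    oracle: Callable[[str], tuple[bool, int]], length: int = 6
) -> tuple[str, int]:
    """Recover the PIN digit by digit from the step count alone."""
    known = ""
    tries = 0
    for position in range(length):
        for digit in "0123456789":
            # Pad the unknown tail with zeros to form a full-length guess.
            guess = (known + digit).ljust(length, "0")
            tries += 1
            matched, steps = oracle(guess)
            # A correct digit lets the loop run past this position
            # (or, at the last position, makes the whole PIN match).
            if matched or steps > position + 1:
                known += digit
                break
    return known, tries


if __name__ == "__main__":
    pin, tries = recover_pin(check_pin)
    print(f"recovered {pin} in {tries} tries")
```

Each position needs at most 10 guesses, so a six-digit PIN falls in at most 6 × 10 = 60 tries instead of up to 10^6. A comparison that always examines every digit, so that its running time does not depend on where the first mismatch occurs, removes this particular signal.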
It is fairly common for travelers to begin their searches for the ideal itinerary online, only to find the price of a plane ticket or hotel room has changed dramatically in just a few hours. Checking back on one of the many travel comparison websites a few days later will only result in more exasperation, as a hotel room price may have gone up, while the cost of the airline ticket has gone down.

These industries rely on pricing models that try to eke the most profit out of a plane ticket or hotel room. The constantly changing selling price is by design: travel providers have decades of experience in using dynamic pricing models based on supply and demand, which help them balance prices with seat and room availability. Yet there is far more to dynamic pricing than basic economics, and a number of industries, including some unexpected ones, are adopting this concept to capitalize on this growing trend of price optimization.

What is really enabling the growth of dynamic pricing is the availability of petabytes of data stemming from the billions of transactions that consumers have conducted with businesses on a daily basis. Coupled with advances in software, inexpensive storage solutions, and computers capable of analyzing thousands of variables, this vast amount of data can be crunched almost instantly to help companies optimize pricing on everything from a pair of shoes to a luxury automobile.

PHOTO FROM SHUTTERSTOCK.COM

There are dozens of software firms offering intelligent pricing software, many designed with specific vertical markets in mind. At the center of each vendor’s offering is their own proprietary algorithm used to help the deployer maximize profits. The algorithms

desired outcome. Online retailers that want to ensure pricing for “hot” items is in line with competitors may choose to adjust prices more frequently than on slower-moving products.

Airlines in particular have become adept at using virtual warehouses of data to help set pricing at the seat level. The result: passengers sitting in the same row on a plane, just inches from one another, may have paid drastically different prices for their seats. Supply and demand are certainly key factors that determine the cost of a ticket, but competitors’ prices, seasonality, the cost of fuel, and a host of other variables go into the computation of prices, all based on complex algorithms.

While airlines were the pioneers in the use of dynamic pricing science, the retail industry has emerged as a leader in innovation and implementation of price optimization systems. According to a November 2013 Retail Info Systems News survey, 22.6% of online retailers are utilizing pricing intelligence software to assist in price optimization. An additional 35.6% of survey respondents said they expected to implement such systems within the next year. If these retailers followed through with their implementation plans, that means more than half of online retailers are using pricing intelligence systems today.

Arnoud Victor den Boer of the University of Twente’s Stochastic Operations Research group says, “The emergence of the Internet as a sales channel has made it very easy for companies to
experiment with selling prices.” In the past, companies had to replace price tags on shelves or print new catalogs when prices changed, but online retailers have much more flexibility, and can instantly change prices based on market dynamics. “This flexibility in pricing is one of the main drivers for dynamic pricing,” says den Boer.

The ‘Amazon Effect’
What might be called an ‘Amazon effect’ is prompting retailers across many vertical markets to take a closer look at price optimization strategies. Pricing intelligence software can scan competitors’ websites as often as every 10 minutes for their pricing and, using pre-set rules, automatically raise or lower prices on products to stay competitive.

Den Boer says online retailers are leading the pack in the use of dynamic pricing, “but a growing number of companies in ‘offline’ settings are starting to acknowledge the possible advantages of dynamic pricing. The idea that your customer has a ‘personal’ willingness to pay which you would like to learn/exploit is very appealing.”

Dynamic pricing is being used in areas that might, at least on the surface, seem to be incompatible with this technology. Such models are being used to determine the cost of the cars we drive, the amount we pay for parking for that car and, perhaps one day, how much we spend on fueling the vehicle.

One of the most complex uses of dynamic pricing may be in the retail automobile market. A car is generally one of the largest purchases a consumer will make in their lifetime, second only to the purchase of a home. Santa Monica, CA-based TrueCar Inc. has developed a proprietary methodology designed to help the three primary constituents in the car-buying process (the manufacturer, the dealer, and the consumer) complete a deal that is equitable for all parties.

John Williams, senior vice president of technology at TrueCar, says, “Dynamic pricing within automotive retail has been an ever-present market force; however, it is in many ways more complex and more foundational to the process than in other industries. It is often the case that two different shoppers can walk into the same store, on the same day, buy the exact same car, and pay completely different prices, sometimes as much as a 30% difference.” The same pricing phenomenon may also be observed in other markets, but is amplified in a car purchase, where the price tag is in the tens of thousands of dollars.

According to Williams, “What’s especially fascinating about this inherent price variability in automotive retail is that it generally is inefficient for both the buyer and seller.” TrueCar surveyed car buyers, asking how much profit they thought a dealer makes; consumers on average believed the average margin on a car is 19.7%. When asked what a fair profit margin would be, car buyers said 13.2%; in reality, the average dealer profit margin is closer to 3%–4%. “This shows that there is a huge opportunity to produce better outcomes for both buyers and sellers within the marketplace,” says Williams.

The volume of data TrueCar must crunch is huge, requiring parallel computing and an enormous amount of storage. The company compares buyers’ needs with dealers’ existing inventories, adding in variables for what other buyers have paid, vehicle condition, demand for specific cars, and hundreds of other data elements. The result is a complex algorithm sitting on top of a massive data warehouse. TrueCar’s Williams said, “We built a multi-petabyte Hadoop cluster for $0.29/GB, a truly game-changing price point. Storing all the ambient data within a marketplace provides a powerful historical record to use for many dynamic pricing applications, including techniques like machine learning. Historical data is used to train machine learning models; the more history you can feed it, the smarter the algorithms are, and that yields a marketplace advantage.”
customer is willing to pay the price. for example, was the first company to © 2015 ACM 0001-0782/15/04 $15.00
viewpoints

Last year, the National Institute of Standards and Technology (NIST) added 7,937 vulnerabilities to the National Vulnerability Database (NVD), up from 5,174 in 2013. That is approximately 22 per day, or almost one every hour. Of these, 1,912 (24%) were labeled “high severity” and 7,243 (91%) “high” or “medium.”7 Simply put, they cannot be ignored.

As I read reports of new vulnerabilities and the risks they enable, I wonder whether it will ever end. Will our software products ever be sufficiently secure that reports such as these are few and far between? Or, will they only become more prevalent as more and more software enters the market, and more dangerous as software increasingly controls network-enabled medical devices and other products that perform life-critical functions?

In this column, I will explore two proposals aimed at reducing software flaws. The first, which involves the U.S. government cornering the vulnerability market, I believe, could make the problem worse. However, the second, which involves holding companies liable for faulty software, is an idea whose time has come.

Cornering the Vulnerability Market
Many software companies, including Microsoft, Google, and Mozilla, operate bug bounty programs, paying security researchers who bring new security flaws to their attention. Other companies serve as brokers, buying vulnerabilities and exploits from security researchers, and then selling or donating them to product vendors and other customers. To the extent the end consumers in this growing market are the companies whose products are flawed, the market serves to strengthen software security. But when end consumers are intelligence agencies and criminals who use the information to exploit target systems, the vulnerability market exposes us all to greater risk.

To further reduce software vulnerabilities beyond what the market has achieved so far, we could look for ways that encourage the pursuit of vulnerabilities with the goal of getting them fixed, but discourage their sale and use for target exploitation. One possibility, suggested by Dan Geer, is for the U.S. government (USG) to openly corner the vulnerability market.6 In particular, the USG would buy all vulnerabilities and share them with vendors and the public, offering to pay say 10 times as much as any competitor. Geer argues this strategy will enlarge the talent pool of vulnerability finders, while also devaluing the vulnerabilities themselves. Assuming the supply of vulnerabilities is relatively sparse, the approach could eventually lead to a situation where most vulnerabilities have been exposed and fixed, rendering any cyber weapons that exploited them useless. In addition, since researchers finding new zero-days will maximize their earnings by selling them to the USG, fewer zero-days should end up in the hands of adversaries.

The cost of Geer’s proposal seems reasonable. Current prices for vulnerabilities range from a few hundred to several hundred thousand dollars. If we consider the approximately 8,000 vulnerabilities added to the NVD in 2014 and assume an average price of $1,000, then the cost of purchasing these vulnerabilities would be $8 million, a drop in the bucket for the USG. Even if the average price rose to $100,000, the annual cost would still be reasonable at $800 million. However, the costs could become much higher and the problems worse if the program perversely incentivized the creation of bugs (for example, an inside developer colluding with an outside bounty collector).1 Costs could also rise from outrageous monetary demands or the effects of more people looking for bugs in more products.

I especially worry that by shifting the cost from the private sector to the USG, companies would lose an economic incentive to produce more secure software in the first place. As it is, an empirical study by UC Berkeley researchers of the bug bounty programs offered by Google and Mozilla for their respective browsers, Chrome and Firefox, found the programs were economically efficient and cost effective compared to hiring full-time security researchers to hunt down the flaws.5 Would it not be better to shift the incentives so it was more economical to invest in secure software development

Software Liability
A better approach to reducing vulnerabilities would be to hold software companies liable for damages incurred by cyber-attacks that exploit security flaws in their code. Right now, companies are not liable, protected by their licensing agreements. No other industry enjoys such dispensation. The manufacturers of automobiles, appliances, and other products can be sued for faulty products that lead to death and injury.

In Geekonomics, David Rice makes a strong case that industry incentives to produce secure software are inadequate under current market forces, and that one way of shifting this would be to hold companies accountable for faulty products.8 Geer proposes that software companies be liable for damages caused by commercial software when used normally, but that developers could avoid liability by delivering their software with “complete and buildable source code and a license that allows disabling any functionality or code the licensee decides.”6 The escape clause, which would cover free and open source software, would allow users to inspect and cut out any software they did not trust.

My main concern with Geer’s proposal relates to absolving any code offered commercially from liability. As a practical matter, very few users are in a position to inspect source code. Even those that are can miss significant flaws, as seen with Heartbleed, a flaw in OpenSSL that gives attackers access to sensitive information, and also ShellShock. In addition, exemption does nothing to incentivize the production of more secure open source code. At the same time, penalizing a large, volunteer community for flaws in their code would be both difficult and distasteful.

A better way around this dilemma might be to exempt the immediate developers of open source code, but hold accountable any company that embeds it in their products or that offers services for open source products. Under such a provision, if an individual or group of volunteers offers a free, open source App, they would not be accountable, though any company offering it through their App store would be.

This compromise would incentivize software companies to pay greater attention to security in all of the code they offer through their products and services, regardless of whether the code is developed in house, by a contractor, or within the open source community; and regardless of whether the product is released with source code or the capability to disable certain functions. Moreover, given that many companies contribute to open source development, they would be incentivized to promote secure coding practices in the open source community as well as within their own development teams.

Importantly, as with other forms of liability, software liability should be tied to standards and best practices, as well as the damage and harm that result from security flaws. The objective is not to penalize companies who invest considerable resources in software security but find their code vulnerable to a new exploit that nobody had anticipated. Rather, it is to bring all software up to a higher level of security by punishing those who are negligent in this domain, for example, by putting out code that fails to check inputs. Standards and best practices for secure coding have advanced considerably, and readers interested in learning more might start with CERT’s Secure Coding Web portal.4

The argument is often made that software liability will inhibit innovation, but we should inhibit the introduction of faulty software. Moreover, assigning liability will likely stimulate innovation relating to secure software development. Another argument against software liability is that it will raise the price of software. While this may be true, it should lower the costs we all pay from cyber-attacks that exploit software vulnerabilities, costs that have been rising over the years and fall on the backs of users as well as software companies.

Conclusion
Bug bounties emerged under current market forces and are likely here to stay. I oppose a program that would attempt to have the U.S. government corner and fund this market, in part because it would reduce the incentive for software companies to produce more secure code and could make the problem worse.

A better strategy is one that increases the incentives for the development of secure software but decreases those for putting out sloppy code. One way of achieving this is by holding companies responsible for all the code they sell and service, including both proprietary and open source. Under this strategy, companies could be sued for damages caused by cyber-attacks that exploited flaws in their code, and penalties would be inflicted according to whether the code was developed under standards and best practices for secure coding.

Because software licenses and the Uniform Commercial Code severely limit vendors from liability for security flaws in their code, companies today cannot be effectively sued or punished when they are negligent and the flaws are exploited to cause economic harm.9 Legislation or regulation is needed to change this and remove the ability of companies to exempt themselves through licensing agreements. Developing a suitable liability regime will be a challenge, however, as the system must address the concerns of both software developers and users. Perhaps a good start would be for ACM to sponsor a workshop that brings together a diverse community of stakeholders and domain experts to recommend a course of action.

Of course, holding software companies accountable will not solve all our security woes. Many cyber-attacks exploit weaknesses unrelated to faulty software, such as weak and default passwords and failure to encrypt sensitive information. But companies are liable when their systems are attacked, and they can be successfully sued for not following security standards and best practices. The time has come to make software vendors liable as well.

References
1. Baker, H. Re: Zero-day bounties. The Risks Digest 28.25 (Sept. 9, 2014).
2. Bilge, L. and Dumitras, T. An empirical study of zero-day attacks in the real world. CCS’12 (Oct. 16–18, Raleigh, N.C.); http://users.ece.cmu.edu/~tdumitra/public_documents/bilge12_zero_day.pdf.
3. Carman, A. Shellshock used to amass botnet and execute phishing campaign. SC Magazine (Oct. 15, 2014); http://bit.ly/1Df7Slg.
4. CERT Coding Standards; http://www.cert.org/secure-coding/index.cfm.
5. Finifter, M., Akhawe, D., and Wagner, D. An empirical study of vulnerability rewards programs. USENIX Security 13; https://www.eecs.berkeley.edu/~daw/papers/vrp-use13.pdf.
6. Geer, D. Cybersecurity as realpolitik. Blackhat 2014; http://geer.tinho.net/geer.blackhat.6viii14.txt.
7. National Vulnerability Database, National Institute of Standards and Technology; https://web.nvd.nist.gov.
8. Rice, D. Geekonomics: The Real Cost of Insecure Software, Addison Wesley, 2008.
9. Scott, M.D. Tort liability for vendors of insecure software: Has the time finally come? Maryland Law Review 67, 2 (2008); http://digitalcommons.law.umaryland.edu/cgi/viewcontent.cgi?article=3320&context=mlr.
10. Zetter, K. Obama: NSA must reveal bugs like Heartbleed, unless they help the NSA. Wired (Apr. 15, 2014); http://www.wired.com/2014/04/obama-zero-day/.

Dorothy E. Denning (dedennin@nps.edu) is Distinguished Professor of Defense Analysis at the Naval Postgraduate School in Monterey, CA.

The views expressed here are those of the author and do not reflect the official policy or position of the U.S. Department of Defense or the U.S. federal government.

Copyright held by author.
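
The figures quoted in the column on software vulnerabilities (the NVD counts and the cost estimate for Geer's proposal) are simple arithmetic and can be checked in a few lines. The sketch below uses only numbers stated in the column; the function and constant names are illustrative, and the per-day figure assumes a 365-day year.

```python
# Counts reported in the column for vulnerabilities added to the NVD in 2014.
NVD_ADDED_2014 = 7_937
HIGH_SEVERITY = 1_912
HIGH_OR_MEDIUM = 7_243


def per_day(count: int, days: int = 365) -> float:
    """Average number of vulnerabilities added per day."""
    return count / days


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


def purchase_cost(count: int, average_price: int) -> int:
    """Total cost in dollars of buying `count` vulnerabilities."""
    return count * average_price


if __name__ == "__main__":
    print(f"{per_day(NVD_ADDED_2014):.1f} per day")
    print(f"{share(HIGH_SEVERITY, NVD_ADDED_2014):.0f}% high severity")
    print(f"{share(HIGH_OR_MEDIUM, NVD_ADDED_2014):.0f}% high or medium")
    # The column rounds 7,937 up to roughly 8,000 for the cost estimate.
    print(f"${purchase_cost(8_000, 1_000):,} at $1,000 each")
    print(f"${purchase_cost(8_000, 100_000):,} at $100,000 each")
```

The two cost scenarios differ only in the assumed average price, which is why the estimate scales linearly from $8 million to $800 million.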
Technology Strategy and Management
Competing in Emerging Markets
Considering the many different paths and unprecedented opportunities for companies exploring emerging markets.

The Berlin Wall fell in November 1989. This was a world-shaking event that triggered the disintegration of the Soviet Union, transforming our view of competition between nations. It also altered how we think about innovation-based competition in emerging markets. Now, 25 years later, we can take stock of these emerging markets.

The label “emerging markets” was coined by Antoine van Agtmael in the early 1980s. It remains alluring for investors, innovators, and governments. Yet it remains difficult to characterize them. Our view of emerging markets has changed over time. Precise understanding might help innovators and business managers compete more effectively. This column explores these claims.

COLLAGE BY ANDRIJ BORYS ASSOCIATES/SHUTTERSTOCK

Excitement about Emerging Markets
The interest in emerging markets is in part because they grow fast and have potential for further growth. Growth rates have slowed somewhat since Goldman Sachs started “dreaming with BRICs,” referring to the emerging economies of Brazil, Russia, India, and China.2 But as developed economies see little growth in the age of austerity, emerging markets remain important destinations for global corporations to sell goods and services.

Fast growth comes because emerging markets provide low-cost locations for sourcing inputs, processing products, and delivering services. Firms led by giants such as Foxconn in electronics and Pou Chen Group in footwear generate wealth and jobs as suppliers to global brand owners (see my July 2011 Communications column).5 These firms are part of global value chains and have been catching up with firms from advanced economies. They have accumulated capabilities in production but also in research. By 2010, U.S. Fortune 500 companies had 98 R&D facilities in China and 63 in India. General Electric’s health-care arm had multiple facilities.

Emerging markets also have innovative firms making new products and services. Sometimes these are dramatically cheaper than their Western equivalents: $3,000 cars by Tata Motors, $300 computers by Lenovo, and $30 mobile
handsets by HTC. These firms do not merely imitate U.S. and Japanese firms, but innovate to meet the needs of low-income consumers at the bottom of the pyramid and the rising middle classes. When prices are one-quarter to one-third of developed country prices, economies of scale and incremental process innovation are not enough. Profound understanding of local consumers’ needs enables innovators to strip down unnecessary functionalities to reduce costs drastically. “Frugal innovation” focuses on product redesign and the invention of new models for production and distribution, not on technological breakthroughs.7 Healthcare is a hotbed for frugal innovation with General Electric competing with emerging market firms to develop low-cost portable medical equipment such as electrocardiogram and ultrasound machines.

“Local dynamos”1 in emerging markets are suppliers and/or competitors of developed economy multinationals. Already in 2010, 17% of the Fortune Global 500 companies (ranked by revenue size) were headquartered in emerging markets. McKinsey projects this will rise to 45% by 2025.3 Japanese companies with innovation in lean manufacturing posed threats to Western companies in the 1980s. Emerging market multinationals from China, India, Brazil, Mexico, Turkey, and many other countries pose similar threats, but with a material difference leading to greater uncertainty in the risk-reward equation.

Defining “Emerging Markets”: Combining Economics and Politics
The New York Times called van Agtmael a marketing genius when reviewing his book, The Emerging Market Century. The label “emerging market” broke from seeing the Third World as underdeveloped or developing. Economic development had been about helping these countries with international aid and technology transfer. Developing countries protected domestic industries by erecting international trade and investment barriers. Some countries were firmly entrenched in the planned economy with the state owning the means of production. By contrast, emerging markets have free international trade and foreign direct investment (FDI) and thriving private sectors as engines of growth.

The fall of the Berlin Wall in 1989 created “transition economies” in Central and Eastern Europe, moving from planned to capitalist models. There was a strong belief that capitalist economies would thrive only if entrenched in multi-party democracies with free elections. The Washington Consensus of the 1990s saw the International Monetary Fund (IMF) advance financing only if countries would move toward open, liberalized, and privatized economies. However, China, Russia, and Brazil endorsed options other than liberal democracy, giving rise to the label of “state capitalism”6 in which state-owned enterprises are an alternative to private-sector firms. Emerging market leaders believe that state capitalism is a sustainable alternative to liberal market capitalism. Economics and politics in emerging markets come in different colors and stripes.

Defining emerging markets today is more difficult than when the Berlin Wall fell. Low and middle-income countries with high growth potential remain part of the definition, but the expectation of 25 years ago that they are all moving toward liberal capitalist economies has changed. Some governments of emerging markets control resources and political freedom in ways liberal democracies do not.

Emerging Markets: Are They Behind or Ahead?
It is useful to examine the notion of innovation catch-up by emerging market companies. Some branches of science and medicine might give Western firms an edge, but a broader meaning of in-
viewpoints
Kode Vicious
Raw Networking
Relevance and repeatability.
Image by Alicia Kubista/Andrij Borys Associates

Dear KV,
The company I work for has decided to use a wireless network link to reduce latency, at least when the weather between the stations is good. It seems to me that for transmission over lossy wireless links we will want our own transport protocol that sits directly on top of whatever the radio provides, instead of wasting bits on IP and TCP or UDP headers, which, for a point-to-point network, are not really useful.
Raw Networking

Dear Raw,
I completely agree that the best way to roll out a new networking service is to ignore 30 years of research in the area. Good luck.
Second only to operating system developers—all of whom want to rewrite the scheduler (see “Bugs and Bragging Rights,” second letter, at http://bit.ly/1yGXHV9)—are the networking engineers and developers who want to write their own protocol. “If only we could go at it with a clean sheet of paper, we could do so much better than the ARPANET, since that was designed for old, crappy hardware, and ours is shiny and new.” That statement is both true and false, and you had better be damned sure about which side of the Boolean logic your idea lies before you write a single line of new code.
The Internet protocols are not the be all and end all of networking, but they have had more research and testing time applied to them than any other network protocols currently in existence. You say you are building a wireless network with—I am sure—the highest-quality gear you can buy. Wireless networks are notoriously lossy, at least in comparison to wired networks. And it turns out there has been a lot of research done on TCP in lossy environments. So although you will pay an extra 40 bytes per packet to transport data over TCP, you might get some benefit from the work done—to tune the bandwidth and round-trip time estimators—that will exist in the nodes sending and receiving the data.
Your network is point-to-point, which means you do not think you care about routing. But unless all the work is always going to be carried out at one or the other end of this link, you are eventually going to have to worry about addressing and routing. It turns out that someone thought
about those problems, and they implemented their ideas in, yes, the Internet protocols.
The TCP/IP protocols are not just a set of standard headers; they are an entire system of addressing, routing, congestion control, and error detection that has been built upon for 30 years and improved so users can access the network from the poorest and most remote corners of the network, where bandwidth is still measured in kilobits and latencies exceed half a second. Unless you are building a system that will never grow and never be connected to anything else, you had better consider whether or not you need the features of TCP/IP.
I am all for clean-sheet research into networking protocols; there are many things that have not been tried and some that have been, but did not work at the time. Your letter implied not so much research, but rollout, and unless you have done your homework, this type of rollout will flatten you and your project.
KV

Dear KV,
You write about the importance of testing, but I have not seen anything in your columns on how to test. It is fine to tell everyone testing is good, but some specifics would be helpful.
How Not Why

Dear How,
The weasel’s way out of this response would be to say there are too many ways to test software to give an answer in a column. After all, many books have been written about software testing. Most of those books are dreadful, and for the most part, also theoretical. Anyone who disagrees can send me email with their favorite book on software testing and I will consider publishing the list or trashing the recommendation. What I will do here is describe how I have set up various test labs for my specific type of testing, and maybe this will be of some use.
There are two requirements for any testing regimen: relevance and repeatability. Test-driven development is a fine idea, but writing tests for the sake of writing tests is the same as measuring a software engineer’s productivity in KLOC. To write tests that matter, test developers have to be familiar enough with the software domain to come up with tests that will both confirm that the software works and attempt to break the software. Much has been written about this topic, so I am going to switch gears to talk about repeatability.
Tests are considered repeatable when two different tests executed on the same system do not interfere with one another. A concrete example from my own work is the population of various software caches—such as routing and ARP tables—that might speed up the second test in a series of tests of packet forwarding. To achieve repeatability, the system or person running the test must have complete control over the environment in which the test runs. If the system being tested is completely encapsulated by a single program with no side effects, then running the program repeatedly on the same inputs is a sufficient level of control. But most systems are not so simple.
Working from the concrete example of testing a firewall: To test any piece of networking equipment that passes packets from one network to another, you need at least three systems: a source, a sink, and the device under test (DUT, in test parlance). As I pointed out earlier, repeatability of tests requires a level of control over the systems being tested. In our network testing scenario, that means each system requires at least two interfaces and the DUT requires three. The source and sink need both a control interface and the interface on which packets will be either sent to or received from the DUT. “Why can’t we just use the control interfaces to source and sink the packets?” I hear you cry, “Wiring all that stuff is complicated and we have three computers on the same switch, we can just test this now.” The way it works is the control and test interfaces must be distinct on all the systems to prevent interference during the test. No matter what you are testing, you must ensure you reduce the amount of outside interference unless that is what you are intending to test. If you want to know how a system reacts with interference, then set up the test to introduce the interference, but do not let interference show up out of nowhere. In our specific networking case, we want to retain control over all three nodes, no matter what happens when we blast packets across the firewall. Retaining control of a system under stress is non-trivial.
Another way to maintain control over the systems is to have access to a serial or video console. This requires even more specialized wiring than just a bunch more network ports, but it is well worth it. Often, bad things happen, and the only way to regain control over the systems is via a console login.
The ultimate fallback for control is the ability to remotely power-cycle the system being tested. Modern servers have an out-of-band management system, such as IPMI, that allows someone with a user name and password to remotely power-cycle a machine as well as do other low-level system management tasks including connecting to the console. Whenever someone wants me to test networked systems in the way I am describing, I require them to have either out-of-band power management via a network-connected power controller or IPMI on the systems in question. There is nothing more frustrating during testing than having a system wedge itself and having to either walk down to the data center to reset it or, worse, have your remote hands do it for you. The amount of time I have wasted in testing because someone was too cheap to get IPMI on their servers or put in a proper power controller could have been far better spent killing the brain cells that had absorbed the same company’s poorly
written code. It seems that inattention to detail is pervasive, and when I see a poor testing setup I should be prepared to see poor code as well.
At this point, we now know that we have to retain control over the systems—and we have several ways to do that via separate control interfaces—and ultimately, we have to have control over the system’s power. The next place that most test labs fall down is in access to necessary files.
Once upon a time a workstation company figured out they could sell lots of cheap workstations if they could concentrate file storage on a single, larger, and admittedly more expensive, server. Thus was born the Network File System, the much maligned, but still relevant, way of sharing files among a set of systems. If your tests can in any way destroy a system, or if upgrading a system with new software removes old files, then you need to be using some form of networked file system. Of late I have seen people try to handle this problem with distributed version control systems such as Git, where the test code and configurations are checked out onto the systems in the test group. That might work if everyone were diligent about checking in and pushing changes from the test system. But in my experience, people are never that diligent, and inevitably someone upgrades a system that had crucial test results or configuration changes on it. Use a networked file system and it will save whatever hair you have left on your head. (I should have learned this lesson sooner.) Ensure the networked file system traffic goes across the control interfaces and not the test interfaces. That should go without saying, but in test lab construction, much of what I think could go without saying needs to be said.
At this point we have fulfilled the most basic requirements of a networking test system: We have control over all the systems, and we have a way to ensure all the systems can see the same configuration data without undue risk of data loss. From here it is time to write the automation that controls these systems. For most testing scenarios, I tend to just reboot all the systems on every test run, which clears all caches. That is not the right answer for all testing, but it definitely reduces interference from previous runs.
KV

Related articles on queue.acm.org
Orchestrating an Automated Test Lab
Michael Donat
http://queue.acm.org/detail.cfm?id=1046946
The Deliberate Revolution
Mike Burner
http://queue.acm.org/detail.cfm?id=637960
Kode Vicious Unleashed
George Neville-Neil
http://queue.acm.org/detail.cfm?id=1046939

George V. Neville-Neil (kv@acm.org) is the proprietor of Neville-Neil Consulting and co-chair of the ACM Queue editorial board. He works on networking and operating systems code for fun and profit, teaches courses on various programming-related subjects, and encourages your comments, quips, and code snips pertaining to his Communications column.

Copyright held by author.
Interview
An Interview with
Juris Hartmanis
A pioneer in the field of computational complexity theory
reflects on his career.
ACM Fellow Juris Hartmanis, recipient of the 1993 A.M. Turing Award with Richard E. Stearns, has made fundamental contributions to theoretical computer science—particularly in the area of computational complexity—for over 50 years. After earning a Ph.D. in Mathematics from Caltech, he developed the foundations of this new field first at General Electric Research Laboratory and then at Cornell University. He says “Computational complexity, the study of the quantitative laws that govern computation, is an essential part of the science base needed to guide, harness, and exploit the explosively growing computer technology.”
Noted historian and Communications Viewpoints board member William Aspray conducted an extended oral interview of Hartmanis in his Cornell University office in July 2009. The complete transcript of this interview is available in the ACM Digital Library; presented here is a condensed and highly edited version designed to whet your appetite.
—Len Shustek

Juris Hartmanis. Image courtesy of Cornell University.

An Unusual Early Life
I was born on July 5, 1928 in Riga, the capital of Latvia, and was very fortunate to have been born into a prominent Latvian family. My father was the Chief of Staff of the Latvian army, and my early childhood was secure, pleasant, and interesting. I attended the excellent French Lycee there, expecting to follow my father into a military career. I am surprised how much motivation and insight they gave besides teaching the basic subject matters.
That good life unfortunately changed when I was about 12 years old. The Soviets occupied Latvia, and in the winter of 1940 my father was arrested. We did not know for years what happened to him. Only after the Soviet Union collapsed in the 1990s and their archives were opened did we find out that my father had been taken to Moscow, tried, convicted, and executed. The Soviet occupation really was very, very brutal.
When Riga fell to the Soviets in 1944, we left by ship and moved to the university town of Marburg an der Lahn in Germany, and I enrolled in a German high school. But this was 1944 and there wasn’t much of a school year;
every time allied bombers approached there were air raid alerts, and everybody had to proceed to air raid shelters. By 1945 everybody knew the war was lost, and German soldiers surrendered en masse in a disciplined manner.
When the war was over I attended Latvian high schools, first in the English occupation zone, then in the American zone. Travel was very difficult. There were bombed-out bridges where trains had passed over valleys; trains now stopped on one side and you car-

During the summer I went to the University of Kansas City. They gave me a credit of 128 units for my five semesters in Marburg, and decided that I had the equivalent of a bachelor’s degree. They also gave me a scholarship and admitted me as a graduate student, which surprised me since I had studied in Marburg for only two-and-a-half years.
I was delighted to be accepted as a graduate student, but there was a problem: there was no graduate program in physics. There was a graduate program
it off really well. When he finished his Ph.D. and joined the research lab as a research mathematician, we worked day in and day out together, sitting and staring at each other in his office or mine, shouting and hollering about the other’s ignorance for not understanding the subtlest points of computability. We did a lot of work on finite automata, particularly in decomposing finite automata into smaller ones from which they could be realized. In 1966 we published a book summarizing our work, Algebraic Structure Theory of Sequential Machines.
When one looks at the early years of theoretical work in computer science, there was a lot of switching theory, which really was how to design circuits. We worked on code assignment, optimal naming of finite automata states, and related problems. But soon the research moved away from finite devices and started exploring formal languages, push-down automata, Turing machines, design and analysis of algorithms, computational complexity, and so forth. This was a turn to more realistic models and problems about real computation.

Starting Computational Complexity
In the very early 1960s, when we had really well understood Turing’s work, we read a paper by Yamada on real-time computation. Yamada was interested in computing when the Turing machine had to print the digits at a steady pace and could not slow down. He showed quite easily that there were recursive sequences that could not be so computed.
Yamada studied just this one class with a time bound. But we thought there should be a theory about all complexity classes, and that every computation should fall in some class. We started doing generalized computational complexity. Among the key ideas was the realization that every solution to a class of problems should have a computation bound as the problem grows in size. We simply defined computational complexity classes by a function, like say n^3, that encompassed all problems that can be solved in n^3 steps—which we referred to as time—for problem size n. We proved a general hierarchy theorem that showed that for nice bounds there are computations that can be done exactly in the given bound and not within a smaller bound. We also showed that for any given computable bound there were problems not computable in the given bound. Today this approach is known as asymptotic complexity.
In April 1963 we finished the paper that later earned us the Turing Award, on the computational complexity of algorithms. We originally submitted it to the Journal of the ACM, but there was lots of mathematics and we worried that not too many people would care to study it. So we published it in the Transactions of the American Mathematical Society instead. The first public presentation was in 1964 at the IEEE Annual Symposium on Switching Theory and Logical Design, and it was very well received. We were confident that we had opened a new and important research area, and we were totally committed to exploring computational complexity.

Expanding the Field
We started exploring other complexity measures besides time. We studied, along with Phil Lewis, tape and memory bounded computational complexity classes, now also known as space-bounded complexity classes. We proposed a new Turing machine model that had a read-only two-way input tape and a separate working tape for the computation.
Many computer scientists joined this research area. For example, our context-free language recognition (log n)^2 algorithm led Savitch to generalize it and prove a beautiful relation between deterministic and nondeterministic tape bounded computations. The work in complexity theory started spreading.
As the group of people working in this area grew, we felt a need for a conference and publication dedicated to computational complexity. In 1986 we organized the first conference, “Structure in Complexity Theory”—“structure” because we were interested in how all these complexity classes relate to each other.
It is quite amazing that in spite of the 45 years of impressive progress understanding the complexity of computations, we still cannot prove that P is different from NP and NP is different from PSPACE, the problems that can be solved with a polynomial-length tape. Intuitively, all these classes look to be different, but no proof has been found. Until we settle the separation problem for these classes, particularly Cook’s notorious P versus NP problem, we have not fully understood complexity of computations. But I do believe that they are provable, and will be solved.

What Happened at GE
When I joined, GE had all the things in place to become a dominant computer designer and manufacturer. We hoped that our work would eventually be the foundation for, or help with, the computer business. But GE failed to exploit these early successes and their computer effort was fumbled away. GE had the philosophy that a good manager can manage anything, and that was proved absolutely wrong in the computer field. It was a great failure and a great disillusionment for my colleagues and me.
I had really enjoyed Cornell before, so in 1965 I accepted their offer to become a full professor with tenure, and to be chair of their new graduate department in computer science. There were a number of other computer science departments that were being formed or planned, and the excitement in computer science almost perceptibly had shifted toward education.

Creating the Cornell Department of Computer Science
Our greatest problem was finding faculty. The most important part is to get the best possible people you can. Quality, quality, quality.
We decided to start with a very light teaching load: one course per semester plus participation in a seminar. Not only that, our Sloan Foundation support allowed us to pay better salaries than were paid for the same ranks in the math department and some other computer science departments. We created a very strong department, although we were really overloaded in theory.
We worked very hard on the intellectual environment, the cohesiveness of the department. I decided not to have regular faculty meetings. We all met at the Statler Club, and after a quick lunch, we gathered in a corner niche of the Cornell Faculty Club for coffee and discussions. All departmental issues were discussed in these meetings. I, as chair, listened and argued and listened some more. If I felt a consensus emerging, I said so, and then I implemented the decisions. From the very beginning we had the tradition that everybody is to be heard.

Other Jobs: Pondering the Future of Computer Science
What is computer science and engineering? What should the field be doing? What does the field need in order to prosper? That was the mandate for “Computing the Future,” a 1992 study by the Computer Science and Telecommunications Board of the National Academies.
We did argue. It is amazing when you put a bunch of bright, successful, good scientists in one room and ask them, “Tell me what computer science is.” A very lovely, lively discussion.
We made recommendations: sustain the core effort in computer science, improve undergraduate education, and broaden the scope of computer science. This meant broadening interdisciplinary work as well as expanding in new applications; we wanted to see every computer science Ph.D. program require a minor in some other discipline.
I think it is a good report, but there was some controversy. John McCarthy from Stanford University got very upset; he felt that AI did not have the prominent role it should have. The controversy died down, but I think it did bring some more publicity to the report. A number of people have said they were delighted that it fought for a broader agenda. At the end the report was well received by the CS community.

A Tour of Duty at the National Science Foundation
In 1996 I went to NSF as Assistant Director of the Directorate for Computer and Information Science and Engineering, CISE. Absolutely loved Washington, and loved the job. It was a kind of twofold job. One, run CISE: worry about the distribution of funds, and what areas to support. I felt strongly that the program managers just did not exercise enough of their power in making their own decisions about what will or will not get funded. They really have a fair amount of power to do that. Complaints can be leveled that the program managers do not do enough risky research support and enough interdisciplinary funding. But, for example, I closed down an engineering program that was supporting university chip design, because by that time there were other venues.
The other job is to represent computer science and represent CISE within NSF, where you compete with your fellow assistant directors for funding. You spend a lot of time telling, in short and long paragraphs, about computer science. You basically are a spokesman for computer science.
At CISE I did a thing that had not been done before ever: I reviewed every bloody program. I sat down individually with every program manager and they explained what the program was doing, what research it was supporting. I told all my program and division managers, “I need nuggets. I need crisply explainable research results.”
CISE ratings were lower than a number of other directorates on the average. I think computer scientists probably still do not have as uniform an understanding of each other as the physicists do, for example. Physicists will argue about different projects, but I think they have a firmer way of judging each other’s research proposals. Computer science is new. That’s no real excuse, but it’s growing fast, it’s changing.
I stayed a little bit over two years. Within NSF it’s a very delicate relationship, how hard you try to convince the director that you should get the biggest slice. CISE did quite well, but when CISE started it was very small and its budget had to be built up to meet the expanding CS needs. That was not an easy process, to argue for the recognition of computer science in a world dominated by physicists.
We were concerned, and we still should be concerned, that we are not attracting enough women. We are losing out on talent. We were puzzled, and we put extra money, while I was there, in fellowships for women. I am still surprised that there are more women in mathematics, percentage wise, than in computer science.

Summing Up
I was once asked, “Which of your two research achievements (not my words, his words) do you think is more important: the structure of finite automata, or complexity theory?” Without a second’s thought I said, “Complexity.” Finite automata were fun. With Dick Stearns, some parts were just like playing a big, interesting game. There were some very unintuitive results. A number of people joined in that kind of fun, but I think that is more of parlor-type mathematics.
But almost all computer science problems involve algorithms—and therefore computational complexity problems. For example, computational complexity theory has put cryptography on a different base. I think every computer scientist should have some intuitive grasp of computational complexity.
Recently there have been very interesting results about the complexity of approximate results to NP and other problems that we believe not to be feasibly computable. Many of these problems have easily computable approximations, but many others do not. In design of algorithms it has helped tremendously in knowing what can and cannot be done. In some cases we do not have that insight yet, but in other cases it helped, and better algorithms have emerged.
These are deep results and reveal very interesting connections between different aspects of computation, although there are important unsolved problems that have been around for quite a long time.

Len Shustek (shustek@computerhistory.org) is the chairman of the Computer History Museum.

Copyright held by author.
Viewpoint
Who Builds a House
without Drawing
Blueprints?
Finding a better solution by thinking about the problem
and its solution, rather than just thinking about the code.
Image by Sakonboon Sansri

I began writing programs in 1957. For the past four decades I have been a computer science researcher, doing only a small amount of programming. I am the creator of the TLA+ specification language. What I have to say is based on my experience programming and helping engineers write specifications. None of it is new; but sensible old ideas need to be repeated or silly new ones will get all the attention. I do not write safety-critical programs, and I expect that those who do will learn little from this.
Architects draw detailed plans before a brick is laid or a nail is hammered. But few programmers write even a rough sketch of what their programs will do before they start coding. We can learn from architects.
A blueprint for a program is called a specification. An architect’s blueprint is a useful metaphor for a software specification. For example, it reveals the fallacy in the argument that specifications are useless because you cannot generate code from them. Architects find blueprints to be useful even though buildings cannot be automatically generated from them. However, metaphors can be misleading, and I do not claim that we should write specifications just because architects draw blueprints.
The need for specifications follows from two observations. The first is that it is a good idea to think about what we are going to do before doing it, and as the cartoonist Guindon wrote: “Writing is nature’s way of letting you know how sloppy your thinking is.”
We think in order to understand what we are doing. If we understand something, we can explain it clearly in writing. If we have not explained it in writing, then we do not know if we really understand it.
The second observation is that to write a good program, we need to think above the code level. Programmers
practice
DOI:10.1145/2717517
Article development led by queue.acm.org

Go Static or Go Home
In the end, dynamic systems are simply less secure.
BY PAUL VIXIE
Illustration by Michael Glenwood

Most current and historic problems in computer and network security boil down to a single observation: letting other people control our devices is bad for us. At another time, I will explain what I mean by “other people” and “bad.” For the purpose of this article, I will focus entirely on what I mean by control. One way we lose control of our devices is to external distributed denial-of-service (DDoS) attacks, which fill a network

attack requests the server cannot answer any useful customer requests. Either way, DDoS means outsiders are controlling our devices, and that is bad for us.
Surveillance, exfiltration, and other forms of privacy loss often take the form of malicious software or hardware (so, “malware”) that somehow gets into your devices, adding features like reading your address book or monitoring your keystrokes and reporting that information to outsiders. Malware providers often know more about our devices than we as users (or makers) do, especially if they have poisoned our supply chain. This means we sometimes use devices we do not consider to be programmable, but which actually are programmable by an outsider who knows of some vulnerability or secret handshake. Surveillance and exfiltration are merely examples of a device doing things its owner does not know about, would not like, and cannot control.
Because the Internet is a distributed system, it involves sending messages between devices such as computers and smartphones, each containing some hardware and some software. By far the most common way that malware is injected into these devices is by sending a message that is malformed in some deliberate way to exploit a bug or vulnerability in the receiving device’s hardware or software, such that something we thought of as data becomes code. Most defense mechanisms in devices that can receive messages from other devices prevent the promotion of data that is expected to contain text or graphics or maybe a spreadsheet to code, meaning instructions to the device telling it how to behave (or defining its fea-
kind of attack, what is needed is to audit every scrap of software used to program the DCMS, including the computer language interpreter; all code libraries, especially OpenSSL; the operating system including its kernel, utilities, and compilers; the Web server software; and any third-party apps that have been installed alongside the DCMS. (Hint: this is ridiculous.)

Distributed Denial of Service
Let’s rewind from remote code execution vulnerability (the promotion of outsider data into executable code) back to DDoS for a moment. Even if your DCMS is completely non-interactive, such that it never offers its users a chance to enter any input, the input data path for URLs and request environment variables has been carefully audited, and there is nothing like Bash installed on the same computer as the Web server, a DCMS is still a “kick me” sign for DDoS attacks. This is because every DCMS page view involves running a few tiny bits of software on your Web server, rather than just returning the contents of some files that were generated earlier. Executing code is quite fast on modern computers, but still far slower than returning the contents of pre-generated files. If someone is attacking a Web service with LOIC or any similar tool, they will need 1,000 times fewer attackers to exhaust a DCMS than to exhaust a static or file-based service.

Astute readers will note that my personal website is a DCMS. Instead of some lame defense like “the cobbler’s children go shoeless,” I will point out that the attractions of a DCMS are so obvious that even I can see them: I do not like working on raw HTML using UNIX text editors when I do not have to, and my personal Web server is not a revenue source and contains no sensitive data. I do get DDoS’d from time to time, and I have to go in periodically and delete a lot of comment spam. The total cost of ownership is pretty low, and if your enterprise website is as unimportant as my personal website, then you should feel free to run a DCMS like I do. (Hint: wearing a “kick me” sign on your enterprise website may be bad for business.)

At work, our public-facing website is completely static. There is a Content Management System (CMS), but it is extremely technical: it requires the use of UNIX text editors, a version control utility called Git, and knowledge of a language called Markdown. This frustrates our non-technical employees, including some members of our business team, but it means our Web server runs no code to render a Web object; it just returns files that were pre-generated using the Ikiwiki CMS. Bricolage is another example of a non-dynamic CMS but is friendlier to non-technical WYSIWYG users than something like Ikiwiki. Please note that nobody is DDoS-proof, no matter what their marketing literature or their annual report may say. We all live on an Internet that lacks any kind of admission control, so most low-investment attackers can trivially take out most high-investment defenders. However, we do have a choice about whether our website wears a “kick me” sign.

There is a hybrid model, which I will call mostly static, where all the style sheets, graphics, menus, and other objects that do not change between views and can be shared by many viewers are pre-generated and are served as files. The Web server executes no code on behalf of a viewer until that viewer has logged in, and even after that, most of the objects returned on each page view are static (from files). This is a little bit less safe than a completely static website, but it is a realistic compromise for many Web service operators. I say “less safe,” because an attacker can register some accounts within the service in order to make their later attacks more effective. Mass account creation is a common task for botnets, and so most Web service operators who allow online registration try to protect their service using CAPTCHAs.

The mostly static model also works with Content Distribution Networks (CDNs) where the actual front end server that your viewers’ Web browsers are connecting to is out in the cloud somewhere, operated by experts, and massively overprovisioned to cope with all but the highest-grade DDoS attacks. To make this possible, a website has to signal that static objects such as graphics, style sheets, and JavaScript files are cacheable. This tells the CDN provider that it can distribute those files across its network and return them many times to many viewers (and, in case of a DDoS, many times to many attackers). Of course, once a user logs into the site, there will be some dynamic content, which is when the CDN will pass requests to the real Web server, and the DCMS will be exposed to outsider data again. This must never cease to be a cause for concern, vigilance, caution, and contingency planning.

As a hybrid almost-CDN model, a mostly static DCMS might be put behind a front-end Web proxy such as Squid or the mod_proxy feature of Apache. This will not protect your network against DDoS attacks as well as outsourcing to a CDN would do, but it can protect your DCMS’s resources from exhaustion. Just note that any mostly static model (CDN or no CDN) will still fail to protect your DCMS code from exposure to outsider data. What this means for most of us in the security industry is that static is better than mostly static if the business purpose of the Web service can be met using a static publication model.

So if you are serious about running a Web-based service, don’t put a “kick me” sign on it. Go static, or go home.

Related articles on queue.acm.org

Finding More Than One Worm in the Apple
Mike Bland
http://queue.acm.org/detail.cfm?id=2620662

Internal Access Controls
Geetanjali Sampemane
http://queue.acm.org/detail.cfm?id=2697395

DNS Complexity
Paul Vixie
http://queue.acm.org/detail.cfm?id=1242499

Paul Vixie is the CEO of Farsight Security. He previously served as president, chairman, and founder of ISC (Internet Systems Consortium); president of MAPS, PAIX, and MIBH; and CTO of Abovenet/MFN.

Copyright held by author. Publication rights licensed to ACM. $15.00.
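One way a site can signal that static objects are cacheable is through the HTTP Cache-Control response header. The following is a minimal sketch, not the author's setup: the extension list, the one-day lifetime, and the function name `cache_control_for` are all assumptions chosen for illustration.

```python
import posixpath

# Assumed set of file types that do not change between views.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".svg"}


def cache_control_for(path, max_age_seconds=86400):
    """Return a Cache-Control header value for the requested path."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Shared caches (such as a CDN) may store and replay this response.
        return f"public, max-age={max_age_seconds}"
    # Everything else is treated as per-viewer content that must not be cached.
    return "private, no-store"


print(cache_control_for("/styles/site.css"))
print(cache_control_for("/account/settings"))
```

Anything not recognized as static is deliberately marked uncacheable here, so requests for it still reach the origin server, which is exactly the exposure the mostly static model accepts.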
APRIL 2015 | VOL. 58 | NO. 4 | COMMUNICATIONS OF THE ACM 45
practice
DOI:10.1145/2719919
Article development led by queue.acm.org

Hadoop Superlinear Scalability
The perpetual motion of parallel performance.

BY NEIL J. GUNTHER, PAUL PUGLIA, AND KRISTOFER TOMASETTE

“WE OFTEN SEE more than 100% speedup efficiency!” came the rejoinder to the innocent reminder that you cannot have more than 100% of anything. This was just the first volley from software engineers during a presentation on how to quantify computer-system scalability in terms of the speedup metric. In different venues, on subsequent occasions, that retort seemed to grow into a veritable chorus that not only was superlinear speedup commonly observed, but also the model used to quantify scalability for the past 20 years, the Universal Scalability Law (USL), failed when applied to superlinear speedup data.

Indeed, superlinear speedup is a bona fide phenomenon that can be expected to appear more frequently in practice as new applications are deployed onto distributed architectures. As demonstrated here using Hadoop MapReduce, however, the USL is not only capable of accommodating superlinear speedup in a surprisingly simple way, it also reveals that superlinearity, although alluring, is as illusory as perpetual motion.

To elaborate, Figure 1 shows conceptually that linear speedup (the dashed line) is the best you can ordinarily expect to achieve when scaling an application. Linear means you get equal bang for your capacity buck because the available capacity is being consumed at 100% efficiency. More commonly, however, some of that capacity is consumed by various forms of overhead (red area). That corresponds to a growing loss of available capacity for the application, so it scales in a sublinear fashion (red curve). Superlinear speedup (blue curve), on the other hand, seems to arise from some kind of hidden capacity boost (green area).

As we will demonstrate, superlinearity is a genuinely measurable effect,4,12,14,19,20,21–23 so it is important to understand exactly what it represents in order to address it when sizing distributed systems for scalability. As far as we are aware, this has not been done before.

Measurability notwithstanding, superlinearity is reminiscent of perpetuum mobile claims. What makes a perpetual motion machine attractive is its supposed ability to produce more work or energy than it consumes. In the case of computer performance, superlinearity is tantamount to speedup that exceeds the computer capacity available to support it. More importantly for this discussion, when it comes to perpetual motion machines, the difficult part is not deciding if the claim violates the conservation of energy law; the difficult part is debugging the machine to find the flaw in the logic. Sometimes that endeavor can even prove fatal.5

If, prima facie, superlinearity is akin to perpetual motion, why would some software engineers be proclaiming its ubiquity rather than debugging it? That kind of exuberance comes from an overabundance of trust in performance data. To be fair, that misplaced trust likely derives from the way performance data is typically presented without any indication of measurement error. No open source or commercial performance tools of which we are aware display measurement errors,
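To make the "more than 100% speedup efficiency" claim concrete, the sketch below computes speedup and efficiency from the mean TeraSort runtimes reported later in the article (Table 2). It assumes the usual definitions, speedup as T1/Tp and efficiency as speedup divided by p; the function names are illustrative.

```python
# Mean TeraSort runtimes in seconds, keyed by node count p (values from Table 2).
mean_runtime_s = {1: 13057, 2: 6347, 3: 4444, 5: 2065, 10: 893}


def speedup(p, runtimes):
    """Measured speedup relative to the single-node runtime: S_p = T_1 / T_p."""
    return runtimes[1] / runtimes[p]


def efficiency(p, runtimes):
    """Speedup efficiency S_p / p; a value of 1.0 corresponds to linear scaling."""
    return speedup(p, runtimes) / p


for p in sorted(mean_runtime_s):
    print(f"p={p:2d}  S_p={speedup(p, mean_runtime_s):6.2f}  "
          f"efficiency={efficiency(p, mean_runtime_s):.0%}")
```

An efficiency above 100% is what the engineers were reporting; whether it is meaningful depends on the measurement error attached to each runtime.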
USL. In our usual notation,7,8 we write the theoretical speedup in equation 2 as:

$$S_p = \frac{p}{1 + \sigma\,(p - 1) + \kappa\,p\,(p - 1)} \qquad (2)$$

where the coefficient σ represents the degree of contention in the system, and the coefficient κ represents the incoherency in distributed data.

The contention term in equation 2 grows linearly with the number of cluster nodes, p, since it represents the cost of waiting for a shared resource such as message queuing. The coherency term grows quadratically with p because it represents the cost of making distributed data consistent (or coherent) via a pairwise exchange between distributed resources (for example, processor caches).

Interpreting the coefficients. If σ = 0 and κ = 0, then the speedup simply reduces to $S_p = p$, which corresponds to Figure 2a. If σ > 0, the speedup starts to fall away from linear (Figure 2b), even when the node configuration is relatively small. As the number of nodes continues to grow, the speedup approaches the ceiling, $S_{\text{ceiling}} = \sigma^{-1}$, indicated by the horizontal dashed line in Figure 2c. The two triangles in Figure 2c indicate this is a region of diminishing returns, since both triangles have the same width but the right triangle has less vertical gain than the left triangle.

If κ > 0, the speedup will eventually degrade like $p^{-1}$. The continuous scalability curve must therefore pass through a maximum or peak value, as in Figure 2d. Although both triangles are congruent, the one on the right side of the peak is reversed, indicating the slope has become negative: a region of negative returns.

From a mathematical perspective, the USL is a parametric model based on rational functions,7 and one could imagine continuing to add successive polynomial terms in p to the denominator of equation 2, each with its attendant coefficient. For κ > 0, however, a maximum exists, and there is usually little virtue in describing analytically how scalability degrades beyond that point. The preferred goal is to remove the peak altogether, if possible; hence the name universal.

The central idea is to match the measured speedup in equation 1 with the USL defined in equation 2. For a given node configuration p, this can be achieved only by adjusting the σ and κ coefficients. In practice, this is accomplished using nonlinear statistical regression.8,17 (The scalability of Varnish, Memcached, and Zookeeper applications is discussed in the ACM Queue version of this article.)

Hadoop TeraSort in the Cloud
To explore superlinearity in a controlled environment, we used a well-known workload, the TeraSort benchmark,15,16 running on the Hadoop MapReduce framework.3,24 Instead of using a physical cluster, however, we installed it on Amazon Web Services (AWS) to provide the flexibility of reconfiguring a sufficiently large number of nodes, as well as the ability to run multiple experiments in parallel at a fraction of the cost of the corresponding physical system.

Hadoop framework overview. This discussion of superlinear speedup in TeraSort requires some familiarity with the Hadoop framework and its terminology.24 In particular, this section provides a high-level overview with the primary focus on just those Hadoop components that pertain to the later performance analysis.

Figure 1. Qualitative comparison of sublinear, linear, and superlinear speedup scalability. (Axes: Speedup versus Processors; curves labeled Linear, Superlinear, and Sublinear.)
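Returning to the USL of equation 2, the roles of σ and κ can be checked numerically. The sketch below is illustrative only. It writes the contention term as σ(p − 1), the form the article's later discussion of equation 2 relies on, and uses the fitted values the article reports for Figure 5 (σ = −0.0288, κ = 0.000447). The peak formula sqrt((1 − σ)/κ) is not stated in the article; it follows from setting the derivative of equation 2 to zero. All function names are assumptions.

```python
import math


def usl_speedup(p, sigma, kappa):
    """USL speedup: S_p = p / (1 + sigma*(p - 1) + kappa*p*(p - 1))."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


def peak_nodes(sigma, kappa):
    """Node count at which S_p is largest (needs kappa > 0)."""
    return math.sqrt((1.0 - sigma) / kappa)


def crossover_nodes(sigma, kappa):
    """Node count at which a superlinear curve (sigma < 0) returns to S_p = p."""
    return abs(sigma) / kappa


# sigma = 0 and kappa = 0: ideal linear scaling.
print(usl_speedup(32, 0.0, 0.0))

# sigma > 0, kappa = 0: speedup saturates just below the ceiling 1/sigma.
print(usl_speedup(10_000, 0.05, 0.0), 1 / 0.05)

# Fitted coefficients reported for Figure 5: negative sigma, positive kappa.
sigma, kappa = -0.0288, 0.000447
print(usl_speedup(10, sigma, kappa))      # above 10, that is, superlinear
print(peak_nodes(sigma, kappa))           # location of the maximum speedup
print(usl_speedup(48, sigma, kappa))      # speedup near the maximum
print(crossover_nodes(sigma, kappa))      # entry into the payback region
```

With these rounded coefficients the peak lands at about 48 nodes with a speedup of roughly 73, close to the values the article quotes, and the crossover falls near 64 nodes.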
Figure 2. Panels (horizontal axis: Processors): (a) Ideal linearity, σ = 0, κ = 0; (b) Diminishing returns, σ > 0, κ = 0; (c) Bottleneck saturation, σ > 0, κ = 0, with ceiling 1/σ; (d) Data exchange, σ > 0, κ > 0, with 1/p decay.

The Hadoop framework is designed to facilitate writing large-scale, data-intensive, distributed applications that can run on a multinode cluster of commodity hardware in a reliable, fault-tolerant fashion. This is achieved by providing application developers with two programming libraries:
• MapReduce, a distributed processing library that enables applications to be written for easy adaptation to parallel execution by decomposing the entire job into a set of independent tasks.
• Hadoop Distributed File System (HDFS), which allows data to be stored on any node but remain accessible by any task in the Hadoop cluster.

An application written using the MapReduce library is organized as a set of independent tasks that can be executed in parallel. These tasks fall into two classes:
• Map task. The function of the Map task is to take a slice of the entire input dataset and transform it into key-value pairs, commonly denoted by <key, value> in the context of MapReduce. See the detailed Map-task dataflow in Node 1 of Figure 3, where the Map task is represented schematically as a procedure Map(k, v). Besides performing this transform, the Map task also sorts the data by key and stores the sorted <k, v> objects so they can easily be exchanged with a Reduce task.
• Reduce task. The function of the Reduce task is to collect all the <k, v> objects for a specific key and transform them into a new <k, [v1, v2,...]> object, whose key is that specific key and whose value is the list [v1, v2,...] of all the values associated with that key across the entire input data set. Node 1 of Figure 3 shows the detailed Reduce-task dataflow.

A MapReduce application processes its input dataset using the following workflow. On startup, the application creates and schedules one Map task per slice of the input dataset, as well as creating a user-defined number of Reduce tasks. These Map tasks then work in parallel on each slice of the input data, effectively sorting and partitioning it into a set of files where all the <k, v> objects that have equal key values are grouped. Once all the Map tasks have completed, the Reduce tasks are signaled to start reading the partitions to transform and combine these intermediate data into new <k, [v1, v2,...]> objects. This process is referred to as shuffle exchange, shown schematically in Figure 3 as arrows spanning physical nodes 1, 2,..., p.

To facilitate running the application in a distributed fashion, the MapReduce library provides a distributed execution server composed of a central service called the JobTracker and a number of slave services called TaskTrackers.24 The JobTracker is responsible for scheduling and transferring tasks to the TaskTrackers residing on each cluster node. The JobTracker can also detect and restart tasks that might fail. It provides a level of fault toler-
Figure 3. Hadoop MapReduce dataflow with Node 1 expanded to show tasks detail. (Diagram labels: Map tasks, Input, DataNode, Map(k,v), Sort, Partition, Shuffle Exchange, Reduce tasks, Input, Merge, Reduce(k,[v]), Output, Write to HDFS.)
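The Map, sort, partition, shuffle, and Reduce stages described above can be sketched in a few lines of single-process Python. This is a toy illustration of the dataflow, not Hadoop's API: the function names, the hash-based partitioner, and the word-count job are all assumptions made for the example.

```python
from collections import defaultdict


def map_task(input_slice, map_fn, num_reducers):
    """Run map_fn over one input slice, then sort and partition the <k, v> pairs."""
    pairs = sorted(kv for record in input_slice for kv in map_fn(record))
    partitions = defaultdict(list)
    for key, value in pairs:
        # Equal keys always land in the same partition (assumed hash partitioner).
        partitions[hash(key) % num_reducers].append((key, value))
    return partitions


def reduce_task(reducer_id, map_outputs, reduce_fn):
    """Shuffle: fetch this reducer's partition from every Map output, merge, reduce."""
    merged = defaultdict(list)
    for partitions in map_outputs:
        for key, value in partitions.get(reducer_id, []):
            merged[key].append(value)  # builds <k, [v1, v2, ...]>
    return {key: reduce_fn(key, values) for key, values in sorted(merged.items())}


def run_job(input_slices, map_fn, reduce_fn, num_reducers=3):
    """One Map task per input slice, then a fixed number of Reduce tasks."""
    map_outputs = [map_task(s, map_fn, num_reducers) for s in input_slices]
    result = {}
    for reducer_id in range(num_reducers):
        result.update(reduce_task(reducer_id, map_outputs, reduce_fn))
    return result


def count_words_map(line):
    return [(word, 1) for word in line.split()]


def count_words_reduce(word, counts):
    return sum(counts)


slices = [["sort the data", "sort again"], ["the data is sorted"]]
word_counts = run_job(slices, count_words_map, count_words_reduce)
print(word_counts)
```

In a real cluster the Map outputs are files on each node and the shuffle moves them across the network, which is where the coherency cost discussed earlier comes from.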
case MapReduce workload, similar coherency phases will occur with different magnitudes in different Hadoop applications. The actual magnitude of the physical coherency effect is reflected in the value of the κ coefficient that results from USL analysis of Hadoop performance data.

Running TeraSort on AWS. TeraSort is a synthetic workload that has been used recently to benchmark the performance of Hadoop MapReduce15,16 by measuring the time taken to sort 1TB of randomly generated data. A separate program called TeraGen generates the input data, consisting of 100-byte records with the first 10 bytes used as a key.

The scripts for setting TeraSort up on a Hadoop cluster are readily available. The performance goal here was to use TeraSort to examine the phenomenon of superlinear scalability, not to tune the cluster to produce the shortest runtimes as demanded by competitive benchmarking.15,16

Amazon’s Elastic Compute Cloud (EC2) provides rapid and cheap pro-

Table 1. Amazon EC2 instance configurations.

         Instance Type  Optimized for  Processor Arch.  vCPU number  Memory (GiB)  Instance Storage (GB)  Network Performance
BigMem   m2.2xlarge     Memory         64-bit           4            34.2          1 x 850                Moderate
BigDisk  c1.xlarge      Compute        64-bit           8            7             4 x 420                High

Figure 5. USL analysis of superlinear speedup for p ≤ 50 BigMem nodes. (Vertical axis: Speedup.)

Figure 4. Bash script used to record TeraSort elapsed times.

    # Timestamp in milliseconds since the epoch (GNU date) before the sort starts
    BEFORE_SORT=`date +%s%3N`
    hadoop jar $HADOOP_MAPRED_HOME/hadoop-examples.jar terasort /user/hduser/terasort-input /user/hduser/terasort-output
    # Timestamp after the sort completes
    AFTER_SORT=`date +%s%3N`
    # Elapsed time in milliseconds, appended with the cluster size to sort_time
    SORT_TIME=`expr $AFTER_SORT - $BEFORE_SORT`
    echo "$CLUSTER_SIZE, $SORT_TIME" >> sort_time
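Elapsed times recorded this way can be summarized as a mean with a standard error and a relative error, which is how the article later reports runtimes in Table 2. The sketch below is illustrative: it takes the twelve single-node BigMem job runtimes listed in Table 3 (in milliseconds) as stand-ins for twelve repeated runs, and the function name `runtime_summary` is an assumption.

```python
import math
import statistics

# Job runtimes in milliseconds for the twelve single-node BigMem runs (Table 3).
single_node_ms = [
    9608794, 12167730, 10635819, 11991345, 11225591, 12907706,
    12779129, 13800002, 14645896, 15741466, 14536452, 16645014,
]


def runtime_summary(runtimes_ms):
    """Return (mean, standard error, relative error), with times in seconds."""
    seconds = [t / 1000.0 for t in runtimes_ms]
    mean = statistics.mean(seconds)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    std_err = statistics.stdev(seconds) / math.sqrt(len(seconds))
    return mean, std_err, std_err / mean


mean, std_err, rel_err = runtime_summary(single_node_ms)
print(f"T1 = {mean:.0f} ± {std_err:.0f} seconds (r.e. {rel_err:.0%})")
```

The result agrees with the T1 entry of Table 2, which suggests that entry was derived from these same twelve runs.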
Linux CentOS 5.4 with the Cloudera CDH 4.7.0 distribution of Hadoop 1.0 installed.3 Included in that distribution is the Hadoop-examples.jar file that contains the code for both the TeraGen and TeraSort MapReduce jobs. Whirr can read the desired configuration from a properties file, as well as receiving properties passed from the command line. This allowed permanent storage of the parameters that did not change (for example, the operating system version and Amazon credentials).

Three sets of performance metrics were gathered:
• The elapsed time for the TeraSort job (excluding the TeraGen job).
• Hadoop-generated job data files.
• Linux performance metrics.

Of these, the most important metric was the elapsed time, which was recorded using the POSIX time stamp in milliseconds (since EC2 hardware supports it) via the shell command illustrated in Figure 4.

Runtime performance metrics (for example, memory usage, disk IO rates, and processor utilization) were captured from each EC2 node using the resident Linux performance tools uptime, vmstat, and iostat. The performance data was parsed and appended to a file every two seconds.

A sign of perpetual motion. Figure 5 shows the TeraSort speedup data (dots) together with the fitted USL scalability curve (blue). The linear bound (dashed line) is included for reference. That the speedup data lies on or above the linear bound provides immediate visual evidence that scalability is indeed superlinear. Rather than a linear fit,21 the USL regression curve exhibits a convex trend near the origin that is consistent with the generic superlinear profile in Figure 1.

The entirely unexpected outcome is that the USL contention coefficient develops a negative value: σ = −0.0288.

Table 2.
T1 = 13057 ± 606 seconds (r.e. 5%)
T2 = 6347 ± 541 seconds (r.e. 9%)
T3 = 4444 ± 396 seconds (r.e. 9%)
T5 = 2065 ± 147 seconds (r.e. 7%)
T10 = 893 ± 27 seconds (r.e. 3%)

for physical consistency, the likely source of the criticism that the USL failed with superlinear speedup data.

As explained earlier, a positive value of σ is associated with contention for shared resources. For example, the same processor that executes user-level tasks may also need to accommodate operating-system tasks such as IO requests. The processor capacity is consumed by work other than the application itself. Therefore, the application throughput is less than the expected linear bang for the capacity buck.

Capacity consumption (σ > 0) accounts for the sublinear scalability component in Figure 2b. Conversely, σ < 0 can be identified with some kind of capacity boost. This interpretation will be explained shortly.

Additionally, the (positive) coherency coefficient κ = 0.000447 means there must be a peak value in the speedup, which the USL predicts as Smax = 73.48, occurring at p = 48 nodes. More significantly, it also means the USL curve must cross the linear bound and enter the payback region shown in Figure 6.

The USL model predicts this crossover from the superlinear region to the payback region must occur for the following reason. Although the magnitude of σ is small, it is also multiplied by (p − 1) in equation 2. Therefore, as the number of nodes increases, the difference, 1 − |σ| (p − 1), in the denominator of equation 2 becomes progressively smaller such that Sp is eventually dominated by the coherency term, κ p(p − 1).

Figure 6. Superlinearity and its associated payback region (see Figure 1). (Axes: Speedup versus Processors; regions labeled Superlinear and Payback.)

Figure 7. USL analysis of p ≤ 150 BigMem nodes (solid blue curve) with Figure 5 (dashed blue curve) inset for comparison. (Axes: Speedup versus EC2 Nodes.)

Figure 7 includes additional speedup measurements (squares). The fitted USL coefficients are now significantly smaller than those in Figure 5. The maximum speedup, Smax, therefore is about 30% higher than predicted with the data
in Figure 5 and occurs at p = 95 nodes. The measured values of the speedup differ from the original USL prediction, not because the USL is wrong but because there is now more information available than previously. Moreover, this confirms the key USL prediction that superlinear speedup reaches a maximum value and then rapidly declines into the payback region.

Based on USL analysis, the scalability curve is expected to cross the linear bound at p× nodes given by equation 3:

$$p_\times = \frac{|\sigma|}{\kappa} \qquad (3)$$

For the dashed curve in Figure 7, the crossover occurs at p× = 65 nodes, whereas for the solid curve it occurs at p× = 99 nodes. Like predicting Smax, the difference in the two p× predictions comes from the difference in the amount of information contained in the two sets of measurements.

Hunting the Superlinear Snark
After the TeraSort data was validated against the USL model, a deeper performance analysis was needed to determine the cause of superlinearity. Let’s start with a closer examination of the actual runtime measurements for each EC2 cluster configuration.

Runtime data analysis. To make a statistical determination of the error in the runtime measurements, we performed some runs with a dozen repetitions per node configuration. From that sample size a reasonable estimate of the uncertainty can be calculated based on the standard error, or the relative error, which is more intuitive.

For each of the runtimes in Table 2, the number before the ± sign is the sample mean, while the error term following the ± sign is derived from the sample variance. The relative error (r.e.) is the ratio of the standard error to the mean value reported as a percentage.

What is immediately evident from this numerical analysis is the significant variation in the relative errors with a range from 3%, which is nominal, to 9%, which likely warrants further attention. This variation in the measurement error does not mean the measurement technique is unreliable; rather, it means there is a higher degree of dispersion in the runtime data for reasons that cannot be discerned at this level of analysis.

Nor is this variation in runtimes peculiar to our EC2 measurements. The Yahoo TeraSort benchmark team also noted significant variations in their execution times, although they did not quantify them. “Although I had the 910 nodes mostly to myself, the network core was shared with another active 2000 node cluster, so the times varied a lot depending on the other activity.”15

Some of the Yahoo team’s sources of variability may differ from ours (for example, the 10x larger cluster size is likely responsible for some of the Yahoo variation). “Note that in any large cluster and distributed application, there are a lot of moving pieces and thus we have seen a wide variation in execution times.”16

A surprising hypothesis. The physical cluster configuration used by the Yahoo benchmark team comprised nodes with two quad-core Xeon processors (that is, a total of eight cores per node) and four SATA disks.16 This is very similar to the BigDisk EC2 configuration in Table 1. We therefore repeated the TeraSort scalability measurements on the BigDisk cluster. The results for p = 2, 3, 5, and 10 clusters are compared in Figure 8.

Figure 8. Hadoop TeraSort superlinearity is eliminated by increasing nodal disk IO bandwidth. (Two panels, (a) and (b); axes: Speedup versus EC2 Nodes.)

Consistent with Figure 5, BigMem speedup values in Figure 8a are superlinear, whereas the BigDisk nodes in Figure 8b unexpectedly exhibit speedup values that are either linear or sublinear. The superlinear effect has essentially been eliminated by increasing the number of local spindles from one to four per cluster node. In other words, increasing nodal IO bandwidth leads to the counterintuitive result that scalability is degraded from superlinear to sublinear.

In an attempt to explain why the superlinear effect has diminished, we formed a working hypothesis by identifying the key performance differences between BigMem and BigDisk.

BigMem has the larger memory configuration, which possibly provides more CentOS buffer caching for the TeraSort data, and that could be thought of as being the source of the capacity boost associated with the negative USL contention coefficient. Incremental memory growth in proportion to cluster size is a common explanation for superlinear speedup.4,14

Increasing memory size, however, is probably not the source of the capacity boost in Hadoop TeraSort. If the buffer cache fills to the point where it needs to be written to disk, it will take longer because there is only a single local disk per node on BigMem. A single-disk DataNode in Figure 3 implies all disk IO is serialized. In this sense, when disk writes (including replications) occur, TeraSort is IO bound, most particularly in the single-node case. As the cluster configuration gets larger, this latent IO constraint becomes less severe since the amount of data per node that must be written to disk is reduced in proportion to the number of nodes. Successive cluster sizes therefore exhibit runtimes that are shorter than the single-node case, and that results in the superlinear speedup values shown in Figure 8a.

Conversely, although BigDisk has a smaller amount of physical memory per node, it has quad disks per
DataNode, which means each node has greater disk bandwidth to accommodate more concurrent IOs. TeraSort is therefore far less likely to become IO bound. Since there is no latent single-node IO constraint, there is no capacity boost at play. As a result, the speedup values are more orthodox and fall into the sublinear region of Figure 8b.

Note that since the Yahoo benchmark team used a cluster configuration with four SATA disks per node, they probably did not observe any superlinear effects. Moreover, they were focused on measuring elapsed times, not speedup, for the benchmark competition, so superlinearity would have been observable only as execution times Tp falling faster than $p^{-1}$.

Console stack traces. The next step was to validate the IO bottleneck hypothesis in terms of Hadoop metrics collected during each run. While TeraSort was running on BigMem, task failures were observed in the Hadoop JobClient console that communicates with the Hadoop JobTracker. The following is an abbreviated form of a failed task status message with the salient identifiers shown in bold in Figure 9.

Figure 9. Failed Reduce task as seen in the Hadoop job-client console.

    14/10/01 21:53:41 INFO mapred.JobClient: Task Id :
    attempt_201410011835_0002_r_000000_0,
    Status : FAILED java.io.IOException: All datanodes 10.16.132.16:50010 are bad.
    Aborting . . .
    . . .
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)

Since the TeraSort job continued and all tasks ultimately completed successfully, we initially discounted these failure reports. Later, with the IO bottleneck hypothesis in mind, we realized these failures seemed to occur only during the Reduce phase. Simultaneously, the Reduce task %Complete value decreased immediately when a failure appeared in the console. In other words, progress of that Reduce task became retrograde. Moreover, given that the failure in the stack trace involved the Java class DFSOutputStream, we surmised the error was occurring while attempting to write to HDFS. This suggested examining the server-side Hadoop logs to establish the reason why the Reduce failures are associated with HDFS writes.

Hadoop log analysis. Searching the Hadoop cluster logs for the same failed TASK_ATTEMPT_ID, initially seen in the JobClient logs, revealed the corresponding record as shown in Figure 10.

Figure 10. Hadoop cluster log message corresponding to Figure 9.

    ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201410011835_0002_r_000000"
    TASK_ATTEMPT_ID="attempt_201410011835_0002_r_000000_0" TASK_STATUS="FAILED"
    FINISH_TIME="1412214818818" HOSTNAME="ip-10-16-132-16.ec2.internal"
    ERROR="java.io.IOException: All datanodes 10.16.132.16:50010 are bad. Aborting
    . . .
    . . .
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)

This record indicates the Reduce task actually failed on the Hadoop cluster, as opposed to the JobClient. Since the failure occurred during the invocation of DFSOutputStream, it also suggests there was an issue while physically writing data to HDFS.

Furthermore, a subsequent record in the log with the same task ID, as shown in Figure 11, had a newer TASK_ATTEMPT_ID (namely, a trailing 1 instead of a trailing 0) that was successful.

Figure 11. Successful retry of failed Reduce task.

    ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_201410011835_0002_r_000000"
    TASK_ATTEMPT_ID="attempt_201410011835_0002_r_000000_1" TASK_STATUS="SUCCESS"

Figure 12. Origin of Reduce task failure as seen in the Hadoop log.

    ReduceAttempt TASK_TYPE="REDUCE" ... TASK_STATUS="FAILED" . . .
    ERROR="java.io.IOException: All datanodes are bad. Aborting . . .
    . . .
    .setupPipelineForAppendOrRecovery(DFSOutputStream.java:1000)

Table 3. Single-node BigMem metrics extracted from Hadoop log.

Job ID  Finished Maps  Failed Maps  Finished Reduces  Failed Reduces  Job Runtime
1       840            0            3                 0               9608794
2       840            0            3                 2               12167730
3       840            0            3                 0               10635819
4       840            0            3                 1               11991345
5       840            0            3                 0               11225591
6       840            0            3                 2               12907706
7       840            0            3                 2               12779129
8       840            0            3                 3               13800002
9       840            0            3                 3               14645896
10      840            0            3                 4               15741466
11      840            0            3                 2               14536452
12      840            0            3                 3               16645014

This log analysis suggests if a Reduce task fails to complete its current write operation to disk, it must start over by rewriting that same data until it is successful. In fact, there may be multiple failures and retries (see Table
3). The potential difference in runtime resulting from Reduce retries is obscured by the aforementioned variation in runtime measurements, which is also on the order of 10%.
Table 3 shows 12 rows corresponding to 12 parallel TeraSort jobs, each running on its own BigMem single-node cluster. A set of metrics indicating how each of the runs executed is stored in the Hadoop job-history log and extracted using Hadoop log tools.13
The 840 Map tasks are determined by the TeraSort job partitioning 100 (binary) GB of data into 128 (decimal) MB HDFS blocks. No Map failures occurred. The number of Reduce tasks was set to three per cluster node. The number of failed Reduce tasks varied randomly between none and four. In comparison, there were no Reduce failures for the corresponding BigDisk case.
The average runtime for Hadoop jobs was 13057078.67 ms, shown as T1 in Table 2. Additional statistical analysis reveals a strong correlation between the number of Reduce task retries and longer runtimes. Recalling the definition of speedup, if the mean single-node runtime, T1, is longer than successive values of pTp, then the speedup will be superlinear.
Whence reduce fails? The number of failed Reduces in Table 3 indicates that a write failure in the Reduce task causes it to retry the write operation, possibly multiple times. In addition, failed Reduce tasks tend to incur longer runtimes as a consequence of those additional retries. The only outstanding question is, what causes the writes to fail in the first place? We already know that write operations are involved during a failure, and that suggests examining the HDFS interface.
Returning to the earlier failed Reduce stack trace, closer scrutiny reveals the following lines, with important key words shown in bold in Figure 12.
The “All datanodes are bad” Java IOException means the HDFS DataNode pipeline in Figure 13 has reached a state where the setupPipelineForAppendOrRecovery method, on the DFSOutputStream Java class, cannot recover the write operation, and the Reduce task fails to complete.
When the pipeline is unhindered, a Reduce task makes a call into the HDFSClient, which then initiates the creation of a HDFS DataNode pipeline. The HDFSClient opens a DFSOutputStream and readies it for writing (1. Write in Figure 13) by allocating a HDFS data block on a DataNode. The DFSOutputStream then breaks the data stream into smaller packets of data. Before it transmits each data packet to be written by a DataNode (2. Write packet), it pushes a copy of that packet onto a queue. The DFSOutputStream keeps that packet in the queue until it receives an acknowledgment (3. ACK packet) from each DataNode that the write operation completed successfully.

Figure 13. HDFS DataNode pipeline showing single replication (blue) and default triple replication (blue and gray). [Diagram labels: Reduce task; HDFS Client; 1. Write; DFSOutputStream; 2. Write packet; 3. ACK packet; triple replication pipeline of three HDFS DataNodes linked by 2. Write and 3. ACK.]

When an exception is thrown (for example, in the stack trace) the DFSOutputStream attempts to remedy the situation by reprocessing the packets to complete the HDFS write. The DFSOutputStream can make additional remediation attempts up to one less than the replication factor. In the case of TeraSort, however, since the replication factor is set to one, the lack of a single HDFS packet acknowledgment will cause the entire DFSOutputStream write operation to fail.
The DFSOutputStream endeavors to process its data in an unfettered way, assuming the DataNodes will be able to keep up and respond with acknowledgments. If, however, the underlying IO subsystem on a DataNode cannot keep up with this demand, an outstanding packet can go unacknowledged for too long. Since there is only a single replication in the case of TeraSort, no remediation is undertaken. Instead, the DFSOutputStream immediately regards the outstanding write packet to be AWOL and throws an exception that propagates back up to the Reduce task in Figure 13.
Since the Reduce task does not know how to handle this IO exception, it completes with a TASK_STATUS="FAILED". The MapReduce framework will eventually retry the entire Reduce task, possibly more than once (see Table 3), and that will be reflected in a stretched T1 value that is ultimately responsible for the observed superlinear speedup.
This operational insight into Reduce failures can be used to construct a list of simple tactics to avoid runtime stretching.
1. Resize the buffer cache.
2. Tune kernel parameters to increase IO throughput.
3. Reconfigure Hadoop default timeouts.
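The reasoning behind these tactics can be illustrated with a toy model of the write path described in the article: a write attempt fails if its acknowledgment does not arrive in time, and additional attempts are allowed up to one less than the replication factor. This is a simplified sketch only. It is not Hadoop code, and the function name, the millisecond delays, and the single `timeout_ms` parameter are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def write_outcome(ack_delays_ms: Sequence[float],
                  timeout_ms: float,
                  replication: int = 1) -> Tuple[bool, int]:
    """Toy model of one HDFS write (hypothetical; not the DFSOutputStream API).

    ack_delays_ms[i] is the acknowledgment delay observed on attempt i.
    Returns (succeeded, attempts_made).
    """
    # One initial attempt plus up to (replication - 1) remediation attempts.
    max_attempts = 1 + (replication - 1)
    attempts = 0
    for delay in ack_delays_ms[:max_attempts]:
        attempts += 1
        if delay <= timeout_ms:      # acknowledgment arrived in time
            return True, attempts
    return False, attempts           # no timely acknowledgment: write fails


# Replication factor 1 (as in TeraSort): a single slow ACK fails the write.
slow_single = write_outcome([1200.0], timeout_ms=1000.0, replication=1)
# A longer timeout (tactic 3) tolerates the same slow ACK.
longer_timeout = write_outcome([1200.0], timeout_ms=2000.0, replication=1)
# Better IO throughput (tactics 1 and 2) shortens the ACK delay instead.
faster_io = write_outcome([800.0], timeout_ms=1000.0, replication=1)
# With triple replication there is room for remediation attempts.
triple = write_outcome([1200.0, 300.0], timeout_ms=1000.0, replication=3)

print(slow_single, longer_timeout, faster_io, triple)
```

In this model the only way to rescue a replication-factor-one write is to shorten the acknowledgment delay or lengthen the timeout, which is the effect the three tactics aim for.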
If maintaining a BigMem-type clus-
ter is dictated by nonengineering re-
quirements (for example, budgetary
constraints), then any of these steps could be helpful in ameliorating superlinear effects.

Conclusion
The large number of controlled measurements performed by running Hadoop TeraSort on Amazon EC2 exposed the underlying cause of superlinearity that would otherwise be difficult to resolve in the field. Fitting our speedup data to the USL performance model produced a negative contention coefficient as a telltale sign of superlinearity on BigMem clusters.
The subtractive effect of negative σ introduces a point of inflection in the convex superlinear curve that causes it ultimately to become concave, thus crossing over the linear bound at p× in equation 3. At that point, Hadoop TeraSort superlinear scalability returns to being sublinear in the payback region. The cluster size p× provides an estimate of the minimal node capacity needed to ameliorate superlinear speedup on BigMem clusters.
Although superlinearity is a bona fide phenomenon, just like perpetual motion it is ultimately a performance illusion. For TeraSort on BigMem, the apparent capacity boost can be traced to successively relaxing the latent IO bandwidth constraint per node as the cluster size grows. This IO bottleneck induces stochastic failures of the HDFS pipeline in the Reduce task. That causes the Hadoop framework to restart the Reduce task file-write, which stretches the measured runtimes. If runtime stretching is greatest for T1, then successive speedup measurements will be superlinear.
Increasing the IO bandwidth per node, as we did with BigDisk clusters, diminishes or eliminates superlinear speedup by reducing T1 stretching.
This USL analysis suggests superlinear scalability is not peculiar to TeraSort on Hadoop but may arise with any MapReduce application. Superlinear speedup has also been observed in relational database systems.2,12 For high-performance computing applications, however, superlinear speedup may arise differently from the explanation presented here.4,14,20
Superlinearity aside, the more important takeaway for many readers may be the following. Unlike most software-engineering projects, Hadoop applications require only a fixed development effort. Once an application is demonstrated to work on a small cluster, the Hadoop framework facilitates scaling it out to an arbitrarily large number of nodes with no additional effort. For many MapReduce applications, scale-out may be driven more by the need for disk storage than compute power as the growth in data volume necessitates more Maps. The unfortunate term flat scalability has been used to describe this effect.25
Although flat scalability may be a reasonable assumption for the initial development process, it does not guarantee that performance goals (such as batch windows, traffic capacity, or service-level objectives) will be met without significant additional effort. The unstated assumption behind the flat-scalability precept is that Hadoop applications scale linearly (Figure 2a) or near-linearly (Figure 2b). Any shuffle-exchange processing, however, will induce a peak somewhere in the scalability profile (Figure 2d). The Hadoop cluster size at which the peak occurs can be predicted by applying the USL to small-cluster measurements. The performance-engineering effort needed to temper that peak will typically far exceed the flat-scalability assumption. As this article has endeavored to show, the USL provides a valuable tool for the software engineer to analyze Hadoop scalability.

Acknowledgments
We thank Comcast Corporation for supporting the acquisition of Hadoop data used in this work.

Related articles on queue.acm.org
Hazy: Making it Easier to Build and Maintain Big-Data Analytics
Arun Kumar, Feng Niu, and Christopher Ré
http://queue.acm.org/detail.cfm?id=2431055
Data-Parallel Computing
Chas. Boyd
http://queue.acm.org/detail.cfm?id=1365499
Condos and Clouds
Pat Helland
http://queue.acm.org/detail.cfm?id=2398392

References
1. Apache Whirr; https://whirr.apache.org.
2. Calvert, C. and Kulkarni D. Essential LINQ. Pearson Education, Boston, MA, 2009.
3. Cloudera Hadoop; http://www.cloudera.com/content/cloudera/en/downloads/cdh/cdh-4-7-0.html/.
4. Eijkhout, V. Introduction to High Performance Scientific Computing. Lulu.com, 2014.
5. Feynman, R.P. Papp perpetual motion engine; http://hoaxes.org/comments/papparticle2.html.
6. Gunther, N.J. A simple capacity model of massively parallel transaction systems. In Proceedings of International Computer Measurement Group Conference, (1993).
7. Gunther, N.J. A general theory of computational scalability based on rational functions, 2008; http://arxiv.org/abs/0808.1431.
8. Gunther, N.J. Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services. Springer, New York, NY, 2007.
9. Gunther, N.J. Performance and scalability models for a hypergrowth e-commerce Web site. Performance Engineering. R.R. Dumke, C. Rautenstrauch, A. Schmietendorf, and A. Scholz, eds. A. Lecture Notes in Computer Science 2047 (2001). Springer-Verlag 267-282.
10. Gunther, N.J. PostgreSQL scalability analysis deconstructed. The Pith of Performance, 2012; http://perfdynamics.blogspot.com/2012/04/postgresql-scalability-analysis.html.
11. Gunther, N.J., Subramanyam, S. and Parvu, S. Hidden scalability gotchas in memcached and friends. VELOCITY Web Performance and Operations Conference, (2010).
12. Haas, R. Scalability, in graphical form, analyzed, 2011; http://rhaas.blogspot.com/2011/09/scalability-in-graphical-form-analyzed.html.
13. Hadoop Log Tools; https://github.com/melrief/Hadoop-Log-Tools.
14. Hennessy, J.L. and Patterson, D.A. Computer Architecture: A Quantitative Approach. Second edition. Morgan Kaufmann, Waltham, MA, 1996.
15. O’Malley, O. TeraByte sort on Apache Hadoop, 2008; http://sortbenchmark.org/YahooHadoop.pdf.
16. O’Malley, O., Murthy, A. C. 2009. Winning a 60-second dash with a yellow elephant; http://sortbenchmark.org/Yahoo2009.pdf.
17. Performance Dynamics Company. How to quantify scalability, 2014; http://www.perfdynamics.com/Manifesto/USLscalability.html.
18. Schwartz, B. Is VoltDB really as scalable as they claim? Percona MySQL Performance Blog; http://www.percona.com/blog/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/.
19. sFlow. SDN analytics and control using sFlow standard - Superlinear; http://blog.sflow.com/2010/09/superlinear.html.
20. Stackoverflow. Where does superlinear speedup come from?; http://stackoverflow.com/questions/4332967/where-does-super-linear-speedup-come-from
21. Sun Fire X2270 M2 superlinear scaling of Hadoop TeraSort and CloudBurst benchmarks, 2010; https://blogs.oracle.com/BestPerf/entry/20090920_x2270m2_hadoop.
22. Sutter, H. Going superlinear. Dr. Dobb’s J. 33, 3 (2008); http://www.drdobbs.com/cpp/going-superlinear/206100542.
23. Sutter, H. Super linearity and the bigger machine. Dr. Dobb’s J. 33, 4 (2008); http://www.drdobbs.com/parallel/super-linearity-and-the-bigger-machine/206903306.
24. White, T. Hadoop: The Definitive Guide, third edition. O’Reilly Media, 2012.
25. Yahoo! Hadoop Tutorial; https://developer.yahoo.com/hadoop/tutorial/module1.html#scalability.

Neil J. Gunther (http://perfdynamics.blogspot.com; tweets as @DrOz) is a researcher and teacher at Performance Dynamics where he developed the USL and the PDQ open source performance analyzer.
Paul Puglia (pjpuglia@gmail.com) has been working in IT for more than 20 years doing Python programming, system administration, and performance testing. He has authored an R package, SATK, for fitting performance data to the USL, and contributed to the PDQ open source performance analyzer.
Kristofer Tomasette (ktomasette@gmail.com) is a senior software engineer on the Platforms & APIs team at Comcast Corporation. He has built software systems involving warehouse management, online banking, telecom, and most recently cable TV.

Copyright held by authors.
Publication rights licensed to ACM. $15.00.
contributed articles
DOI:10.1145/2735589

Speaking military jargon, users can create labels and draw symbols to position objects on digitized maps.

BY PHILIP R. COHEN, EDWARD C. KAISER, M. CECELIA BUCHANAN, SCOTT LIND, MICHAEL J. CORRIGAN, AND R. MATTHEWS WESSON

Sketch-Thru-Plan: A Multimodal Interface for Command and Control

In 2000, Oviatt and Cohen25 predicted multimodal user interfaces would “supplement, and eventually replace, the standard GUIs of today’s computers for many applications,” focusing on mobile interfaces with alternative modes of input, including speech, touch, and handwriting, as well as map-based interfaces designed to process and fuse multiple simultaneous modes. In the intervening years, basic multimodal interfaces employing alternative input modalities have indeed become the dominant interface for mobile devices. Here, we describe an advanced fusion-based multimodal map system called Sketch-Thru-Plan, or

key insights
• Multimodal interfaces allow users to concentrate on the task at hand, not on the tool.
• Multimodal speech+sketch interfaces employing standardized symbol names

ILLUSTRATION BY JUSTIN METZ

the challenges posed by ground operations for C2 systems and their user interfaces. We discuss how C2 GUIs have led to inefficient operation and high training costs. And to address them, we cover STP’s multimodal interface and evaluations. Finally, we discuss deployment of the system by the U.S. Army and U.S. Marine Corps. This case study involves the user-centered design-and-development process required for promising basic research to scale reliably and be incorporated into mission-critical products in large organizations.

an Army division or brigade) and their own dedicated staff to relatively inexperienced commanders of smaller units.1 Across this range, there is great need for a planning tool that is easy to learn and use for both actual and simulated operations while being functional in field and mobile settings with varying digital infrastructure and computing devices. No military C2 system currently meets all these requirements, due in part to GUI limitations.
Prior to the introduction of digital systems, C2 functions were performed on paper maps with transparent plastic overlays and grease pencils. Users would collaboratively develop plans by speaking to one another while drawing on a map overlay. Such an interface had the benefit of requiring no interface training and fail-safe operation. However, obvious drawbacks included the need to copy data into digital systems and lack of remote collaboration. Addressing them, C2 systems today are based on GUI technologies. The most widely used Army C2 system, called the Command Post of the Future, or CPOF,11 is a three-screen system that relies on a drag-and-drop method of manipulating information. It supports co-located and remote collaboration through human-to-human dialogue and collaborative sketching. CPOF was a major advance over prior C2 systemsa and the primary Army C2 system during Operation Iraqi Freedom, starting 2003.
Table 1 outlines how a CPOF user would send a unit to patrol a specified route from time 0800 to 1600. These 11 high-level steps can take an experienced user one minute to perform, with many more functions necessary to properly specify a plan. In comparison, with a version of the QuickSet multimodal interface that was tightly integrated with CPOF in 2006, a user could say “Charlie company, patrol this route <draw route> from oh eight hundred to sixteen hundred.” All attribute values are filled in with one simple utterance processed in six seconds on a tablet PC computer.

Table 1. Steps to send a unit on patrol through Command Post of the Future.

Step  CPOF
1.    Right click on the display background (not the map) to bring up a menu of CPOF objects that can be created;
2.    Select “Task,” opening a small window;
3.    Type the task label “Patrol”;
4.    Type the name of the unit in the “Performer” slot (such as C/1-62/3-BCT);
5.    Use the mouse to sketch the route on a map in digital ink;
6.    Holding down the control key, select the digital ink just drawn (with left mouse button) and drag it into the container box in the task window;
7.    Ctrl-drag the task window to the map near the link to position a symbol for the Task on the map;
8.    Locate and expose the “Schedule” window;
9.    Ctrl-drag the task to the Schedule window, positioning the Task in the line associated with the name of the unit in the Performer slot;
10.   Move the task graphic to the day in question; and
11.   Slide the left and right interval boundaries to line up with 0800 and 1600, respectively.

Soldiers must learn where the many functions are located within the menu system, how to link information by “ctrl-dragging”b a rendering of it, and how to navigate among various screens and windows. With CPOF requiring so many atomic GUI steps to accomplish a standard function, SRI International and General Dynamics Corp. built a learning-by-demonstration system an experienced user could use to describe a higher-level procedure.21 Expert users were trained within their Army units to create such procedures, and the existence of the procedures would be communicated to the rest of the unit as part of the “lore” of operating the system. However, if the interface had supported easier expression of user intent, there would have been less need for the system to learn higher-level procedures. Thousands of soldiers are trained at great expense in Army schoolhouses and in deployed locations each year to operate this complex system.

a http://www.army.mil/article/16774/Command_Post_of_the_Future_Wins_Outstanding_US_Government_Program_Award/
b “Ctrl-dragging” refers to holding down the CTRL key while also holding the left mouse button on a map symbol, then dragging the symbol elsewhere in the user interface; a “clone” of the symbol appears at the destination location, such that if the original one is changed, the clone is changed as well.

One essential C2 planning task is to position resources, as represented by symbols on a map of the terrain. Symbols are used to represent military units, individual pieces of equipment, routes, tactical boundaries, events, and tasks. The symbol names and shapes are part of military “doctrine,” or standardized procedures, symbols, and language, enabling people to share meaning relatively unambiguously. Soldiers spend considerable time learning doctrine, and anything that reinforces doctrine is viewed as highly beneficial.

Figure 1. Compositional military-unit symbols and example tactical graphic. [Panel labels: Affiliation: Friendly; Echelon: Platoon; Role: Mechanized Infantry; combined symbol: Friendly Mechanized Infantry Platoon; Tactical Graphic: FLOT (Forward Line of Own Troops).]

Each unit symbol has a frame and color (indicating friendly, hostile, neutral, and coalition), an echelon marking (such as a platoon) on top, a label or “designator” on the side(s), and a “role” (such as armored, medical, engineering, and fixed-wing aviation) in the middle, as well as numerous other markings (see Figure 1). This is a compositional language through which one can generate thousands of symbol configurations. In order to cope with the large vocabulary using GUI technology, C2 systems often use large dropdown menus for the types of entities that can be positioned on the map. Common symbols may be arrayed on a palette a user can select from. However, these palettes can become quite large, taking up valuable screen space better used for displaying maps, plans, and schedules.
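To illustrate why a compositional symbol language yields so many configurations, the sketch below enumerates combinations of a few symbol attributes. The attribute values are taken from examples in the text, but the lists are deliberately incomplete and the `UnitSymbol` type is hypothetical; it does not reflect STP's data model or any military symbology standard.

```python
from dataclasses import dataclass
from itertools import product

# Small, incomplete vocabularies drawn from examples in the text.
AFFILIATIONS = ["friendly", "hostile", "neutral", "coalition"]
ECHELONS = ["platoon", "company", "brigade", "division"]
ROLES = ["armored", "medical", "engineering", "fixed-wing aviation",
         "mechanized infantry"]


@dataclass(frozen=True)
class UnitSymbol:
    """Hypothetical record for one composed unit symbol."""
    affiliation: str
    echelon: str
    role: str
    designator: str = ""   # free-text label drawn beside the frame

    def spoken_name(self) -> str:
        # Attribute order follows the example phrase
        # "Friendly Mechanized Infantry Platoon".
        return f"{self.affiliation} {self.role} {self.echelon}".title()


# Every combination of the three closed attributes is a distinct symbol.
symbols = [UnitSymbol(a, e, r)
           for a, e, r in product(AFFILIATIONS, ECHELONS, ROLES)]

example = UnitSymbol("friendly", "platoon", "mechanized infantry")
print(len(symbols))
print(example.spoken_name())
```

Even these toy lists give 4 × 4 × 5 = 80 distinct symbols before designators or other markings are considered, which suggests why menus and palettes covering a realistic vocabulary grow unwieldy.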
Another method used in GUIs to identify a military unit involves specifying its compositional pieces in terms of the attributes and values for unit name, role, echelon, and strength. Each is displayed with multiple smaller menus from which the user chooses a value. The user may type into a search field that finds possible units through a string match. The user must still select the desired entity and set any attribute values via menus. When a symbol is created or found, it is then positioned through a drag-and-drop operation onto the map. Due to these constraints (and many more) on system design, users told STP developers C2 system interfaces based on such classical GUI technologies are difficult to learn and use. We have found speech-and-sketch interfaces employing doctrinal language, or standardized symbol names and shapes, to be a much more efficient means for creating and positioning symbols.

Multimodal Map-Based Systems
Many projects have investigated multimodal map-based interaction with pen and voice3,5,6,14,20,23 and with gesture and voice.2,7,16,19 Some such systems represent the research foundation for the present work, though none to our knowledge is deployed for C2. Apart from smartphones, the most widely deployed multimodal system is Microsoft’s Kinect, which tracks the user’s body movements and allows voice commands, primarily for gaming and entertainment applications; enterprise and health applications are also beginning to appear;22 and other commercial multimodal systems have been developed for warehousing and are emerging in automobiles.26 Adapx’s STP work is most related to the QuickSet system developed at the Oregon Graduate Institute in the late 1990s.c QuickSet5,6,14,23–25 was a prototype multimodal speech/sketch/handwriting interface used for map-based interaction. Because speech processing needs no screen space, its multimodal interface was easily deployed on tablets, PDAs, and wearables, as well as on wall-size displays. Offering distributed, collaborative operation, it was used to position entities on a map by speaking and/or drawing, as well as create tasks for them that could be simulated through the Modular Semi-Automated Forces, or ModSAF,4,8 simulator. QuickSet was also used to control 3D visualizations,7 various networked devices (such as TV monitors and augmented-reality systems16) through hand gestures tracked with acoustic, magnetic, and camera-based methods.

c Adapx Inc. was a corporate spin-off of the Oregon Graduate Institute’s parent institution, the Oregon Health and Science University, Portland, OR.

Based on extensive user-centered-design research, the Oregon Graduate Institute team showed users prefer to interact multimodally when manipulating a map. They are also able to select the best mode or combination of modes to suit their situation and task.23,25 User sketching typically provides spatial information (such as shape and location), while speech provides information about identity and other attributes. This user interface emulates the military’s non-digital practices using paper maps6 and leads to reduced cognitive load for the user.23
QuickSet’s total vocabulary was approximately 250 unit symbols and approximately 180 “tactical graphics,” as in Figure 1. Speech recognition was based on an early IBM recognizer, and sketch recognition involved a lightly trained neural network and hidden Markov-model recognizer. The major research effort was devoted to establishing innovative methods for multimodal fusion processing.14 QuickSet’s unification-based fusion of multimodal inputs14 supported mutual disambiguation, or MD, of modalities16,24 in which processing of information conveyed in one mode compensated for errors and ambiguities in others, leading to relative error rate reduction of 15%–67%; for example, a sketch with three objects could disambiguate that the user said “boats” and not “boat.” MD increased system robustness to recognition errors, critical in high-noise environments, where users are heavily accented, or when sketches are created while moving or when the user’s arm is tired.18 QuickSet demonstrated a multimodal interface could function robustly under such real-world conditions, a necessary precondition of field deployment.
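The “boats” versus “boat” example can be sketched as a simple joint rescoring of the two recognizers' n-best lists. This is only an illustration of the mutual-disambiguation idea; QuickSet used unification-based fusion, which this sketch does not implement, and the hypothesis fields, scores, and compatibility rule below are all invented for the example.

```python
from typing import List, Optional, Tuple

# Speech hypothesis: (word, is_plural, recognizer score).
SpeechHyp = Tuple[str, bool, float]
# Sketch hypothesis: (description, number of objects drawn, recognizer score).
SketchHyp = Tuple[str, int, float]


def compatible(speech: SpeechHyp, sketch: SketchHyp) -> bool:
    """Toy constraint: a plural word needs several objects, a singular word one."""
    _, is_plural, _ = speech
    _, object_count, _ = sketch
    return is_plural == (object_count > 1)


def fuse(speech_nbest: List[SpeechHyp],
         sketch_nbest: List[SketchHyp]) -> Optional[Tuple[str, str, float]]:
    """Return the best-scoring compatible (speech, sketch) pair, if any."""
    best: Optional[Tuple[str, str, float]] = None
    for sp in speech_nbest:
        for sk in sketch_nbest:
            if not compatible(sp, sk):
                continue
            joint = sp[2] * sk[2]   # scores treated as independent (assumption)
            if best is None or joint > best[2]:
                best = (sp[0], sk[0], joint)
    return best


# Speech alone slightly prefers the wrong word "boat" ...
speech = [("boat", False, 0.55), ("boats", True, 0.45)]
# ... but the sketch recognizer is confident three objects were drawn.
sketch = [("three points", 3, 0.90), ("one point", 1, 0.10)]

result = fuse(speech, sketch)
print(result)
```

Here the lower-ranked speech hypothesis wins because it is the only one consistent with the high-confidence sketch interpretation, which is the sense in which one mode compensates for errors in the other.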
Figure 2. Speech and Sketch (left) processed by STP into the digital objects on the right. [Input types labeled in the figure: line + speech (boundary); area + speech (objective); point + speech (unit); line + speech (phase line); sketch-only (FIX). Spoken labels shown: “Friendly Company Boundary Alpha north, Bravo south”; “Objective Black”; “Bravo Company”; “Hostile Mechanized Infantry Platoon”; “Friendly Company Boundary Bravo north, Charlie south”; “Phase line Moe.”]
[Figure residue; caption not recovered. Diagram labels: individual input strokes (numbered 1 to 10); strokes to segmenter; temporal distance; spatial distance; strokes segmented into glyphs (stroke groups); glyphs to interpreters; example scored interpretations: 0.8 [tg_task: follow...] and 0.8 [echelon: platoon, role engineer, capability: air_assault, affiliation: friendly...].]
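The spatiotemporal segmentation step that groups strokes into glyphs can be sketched as follows. The article states the rule only in words: a new stroke joins the current glyph when its minimum distance to that glyph is below a threshold proportional to the glyph's size and its start time is within a user-settable threshold of the prior stroke's end time. The `Stroke` type, the use of the bounding-box diagonal as the glyph size, and the default threshold values are assumptions for illustration, not STP's implementation.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    points: List[Point]   # sampled ink positions
    start: float          # start time in seconds
    end: float            # end time in seconds


def glyph_size(glyph: List[Stroke]) -> float:
    """Assumed size measure: diagonal of the glyph's bounding box."""
    xs = [x for s in glyph for x, _ in s.points]
    ys = [y for s in glyph for _, y in s.points]
    return hypot(max(xs) - min(xs), max(ys) - min(ys))


def min_distance(stroke: Stroke, glyph: List[Stroke]) -> float:
    """Smallest point-to-point distance between a stroke and a glyph."""
    return min(hypot(px - qx, py - qy)
               for px, py in stroke.points
               for g in glyph
               for qx, qy in g.points)


def segment(strokes: List[Stroke],
            space_factor: float = 0.5,   # hypothetical proportionality constant
            max_gap_s: float = 1.0       # hypothetical user-settable time gap
            ) -> List[List[Stroke]]:
    """Group time-ordered strokes into glyphs (stroke groups)."""
    glyphs: List[List[Stroke]] = []
    for stroke in strokes:
        if glyphs:
            current = glyphs[-1]
            near = min_distance(stroke, current) < space_factor * glyph_size(current)
            soon = stroke.start - current[-1].end <= max_gap_s
            if near and soon:
                current.append(stroke)   # same glyph
                continue
        glyphs.append([stroke])          # start a new glyph
    return glyphs


strokes = [
    Stroke([(0, 0), (10, 0)], start=0.0, end=0.4),
    Stroke([(10, 0), (10, 10)], start=0.6, end=1.0),       # adjacent, soon after
    Stroke([(100, 100), (110, 100)], start=1.2, end=1.5),  # far away
]
sizes = [len(g) for g in segment(strokes)]
print(sizes)
```

In this example the first two strokes touch and are drawn close together in time, so they form one glyph, while the distant third stroke starts a new one. A production segmenter would need to handle degenerate cases such as a single-point glyph, whose size under this measure is zero.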
interface state. Contextual knowledge restricts potential speech and language, thus increasing accuracy and speed; for example, an “attribute-value” grammar context is invoked when a stroke is drawn over an object on the map. As the context-setting actions may themselves be ambiguous, STP is designed to compare the results of multiple simultaneous recognizers embodying different restrictions.
In the future, it may be helpful to use spoken dictation, as in Google Voice, Nuance Communications’s Dragon Dictate, Apple’s Siri, and the speech-to-speech translation systems,13 that require development of large-scale statistical-language models. However, because the spoken military data needed to build such language models is likely classified, this approach to creating a language model could be problematic. Since STP can take advantage of users’ knowledge of military jargon and a structured planning process,g grammar-based speech recognition has thus far been successful.

g Soldiers are taught to use the structured Military Decision-Making Process.28

The interface takes advantage of and reinforces skills soldiers already have, as they are trained in standardized language, symbols, and military decision-making process.

Sketch recognition. STP’s sketch recognizer is based on algorithms from computer vision, namely Hausdorff matching,17 using an array of ink interpreters to process sketched symbols and tactical graphics (see Figure 3). For unit symbols, the recognizer’s algorithm uses templates of line segments, matching the sketched digital ink against them and applying a modified Hausdorff metric based on stroke distance and stroke angles to compute similarity. For tactical graphics, the recognizer creates graphs of symbol pieces and matches them against the input. Fundamental to them all is a spatiotemporal ink segmenter. Regarding spatial segmentation, if the minimum distance of a given stroke from the currently segmented group of strokes, or “glyph,” is below a threshold proportional to the already existing glyph size and its start time is within a user-settable threshold from the end time of the prior stroke, then the new stroke is added to the existing glyph.
For template-based unit-icon interpretation, the affiliation “frame” is first located, after which the glyph is broken into its constituent parts, including affiliation, role, and echelon, that have canonical locations relative to the frame. Though the roles may themselves have compositional structure, they are matched holistically. Where linguistic content that annotates the icon is conventionally expected, handwriting is processed by Microsoft’s recognizer. These parts are then compared to a library of template images, with the results combined to form recognition outputs. If a symbol “frame” is not found, the sketch recognizer attempts to use the tactical graphics interpreter. For tactical graphics, whose shapes can be elongated or contorted, the algorithm uses a graph-matching approach that first partitions the glyph into a graph of line segments and nodes. This graph is then matched against piecewise graph templates that allow for elongation or bending. The pieces are recombined based on sketch rules that define the relations between the pieces and anchor points from which a complete symbol can be composed; for example, such rules define a “forward line of own troops,” or FLOT, symbol, as in Figure 1, as a “linear array” of semicircles (a “primitive”), with a barbed-wire fence composed of two approximately parallel lines, plus a parallel linear array of circles.
An advantage of the template-based approach to unit-icon recognition is easy expansion by adding new templates to the library; for example, new unit roles can be added in the form of scalable vector graphics that would then be located within the affiliation border by the compositional unit symbol recognizer.
Explicit and implicit task creation. Aside from creating and positioning symbols on a map, users can state tasks explicitly or rely on the system to implicitly build up an incremental interpretation of the set of tasks that use those symbols (see Figure 4). STP does the latter inference by matching the symbols on the map against the argument types of possible domain tasks (such as combat service units perform “supply” and medical units perform “evacuate casualties”), as in Figure 4, subject to spatiotemporal
constraints. STP presents the planner with a running visualization of matching tasks in the evolving plan under creation. The planner can readily inspect the potential tasks, accepting or correcting them as needed. Here, STP has inferred that the Combat Service Support unit and Main Supply Route A can be combined into the task Resupply along Main Supply Route A. If that is correct, the planner can select the checkbox that then updates the task matrix and schedule. As the planner adds more symbols to the map, the system’s interpretations of matching tasks are likewise updated. Task start and end times can be spoken or adjusted graphically in a standard Gantt chart task-synchronization matrix. Note STP is not trying to do automatic planning or plan recognition but rather assist during the planning process; for instance, STP can generate a templated “operations order” from the tasks and graphics, a required output of the planning process. Much more planning assistance can, in principle, be provided, though not clear is what a planner would prefer.

Figure 4. Implicit task creation. The map depicts a combat-service-support unit and main supply route, medical unit, and casualty-collection point; the task inference process finds tasks (such as resupply) that can include various entities along main supply route A.

Because the system is database driven, the multimodal interface and system technology have many potential commercial uses, including other types of operations planning (such as “wildland” firefighting, as firefighters say), as well as geographical information management, computer-aided design, and construction management.

Figure 5. COA sketch used in controlled study, January 2013.

Evaluations
Four types of evaluations of STP have been conducted by the U.S. military: component evaluations, user juries, controlled study, and exercise planning tests.
Component evaluations. In recognition tests of 172 symbols by a DARPA-selected third-party evaluator during the Deep Green Program in 2008, the STP sketch-recognition algorithm had an accuracy of 73% for recognizing the correct value at the top of the list of potential symbol-recognition hypotheses. The next-best Deep Green sketch recognizer built for the same symbols and tested at the same time with the same data had a 57% recognition accuracy for the top-scoring hypothesis.12 Rather than use sketch alone, most users prefer to interact multimodally, speaking labels while drawing a point, line, or area. STP’s multimodal recognition in 2008, as reported by an externally contracted evaluator, was a considerably higher 96%. If STP’s interpretation is incorrect, users are generally able to re-enter the multimodal input, select among symbols on a list of alternative symbol hypotheses, or invoke the multimodal help system that presents the system’s coverage and can be used for symbol creation.
STP has also been tested using head-mounted noise-canceling microphones in high-noise Army vehicles. Two users (one male, one female) issued a combined total of 221 multimodal commands while riding in each of two types of moving vehicles in the field, with mean noise 76.2 dBA and spikes to 93.3 dBA. They issued the same 220 multimodal commands to STP with the recorded vehicle noise played at maximum volume in the laboratory, with mean 91.4 dBA and spikes to 104.2 dBA. These tests
A P R I L 2 0 1 5 | VO L. 58 | N O. 4 | C OM M U N IC AT ION S OF T HE ACM 63
contributed articles
resulted in 94.5% and 93.3% multimodal recognition accuracy, respectively. We conjecture that, in addition to the multimodal architecture, the noise-canceling microphones may have compensated for the loud but relatively constant vehicle noise. Further research by the STP team will look to tease apart the contributions of these factors in a larger study.

User juries. One way the Army tests software is to have soldiers just returning from overseas deployment engage in a “user jury” to try a potential product and provide opinions as to whether it would have been useful in their recent activities. To get soldier feedback on STP, from 2011 to 2013 the Army’s Training and Doctrine Command invited 126 soldiers from four Army divisions experienced with the vehicle C2 system and/or CPOF to compare them with STP. For privacy, this article has changed the names of those divisions to simply Divisions 1, 2, 3, and 4. STP developers trained soldiers for 30 minutes on STP, then gave them a COA sketch to enter using STP. They later filled out a five-point Likert-style questionnaire. In all areas, STP was judged more usable and preferred to the soldiers’ prior C2 systems; Table 2 summarizes their comparative ratings of STP versus their prior C2 systems.

Table 2. Questionnaire results for STP vs. prior C2 system(s) used by the subjects.

Organization | System Compared to STP | Number of Users | STP Easier to Use | STP Faster | STP Better | Prefer Speech/Sketch
Division 1 | Vehicle C2 system | 41 | 83% | 88% | 81% | 90%
Division 2 | Vehicle C2 system | 44 | 97% | 97% | 100% | 87%
Division 3 | Vehicle C2 system | 37 | 78% | 89% | 84% | 87%
Overall | - | 122 | 87% | 92% | 89% | 88%
Division 2 | CPOF | 16 | 76% | 94% | 85% | 88%
Division 3 | CPOF | 12 | 88% | 79% | 84% | 100%
Division 4 | CPOF | 5 | 100% | 100% | 100% | 100%
Overall | - | 33 | 84% | 89% | 87% | 94%

Controlled user study. Contractors have difficulty running controlled studies with active-duty soldiers. However, during the STP user jury in January 2013, 12 experienced CPOF users from Division 3 evaluated STP vs. CPOF in a controlled experiment. Using a within-subject design with order-of-system-use counterbalanced, experienced CPOF users were trained for 30 minutes by the STP development team on STP, then given the COA sketch in Figure 5 to enter through both STP and CPOF. Results showed these experienced CPOF users created and positioned units and tactical graphics on the map using STP’s multimodal interface 235% faster than with CPOF (see footnote h); these subjects’ questionnaire remarks are included in Table 2. Note “symbol laydown” is only one step in the planning process, which also included tasking of units and creating a full COA and an operations order. Experts have reported the STP time savings for these other planning functions are considerably greater.

Footnote h. F-test two-sample for variances test: F(19) = 4.05, p < 0.02

Exercise planning test. STP was recently used by a team of expert planners charged with developing exercise COAs that would ultimately appear in CPOF. The STP team worked alongside another expert planning team using Microsoft PowerPoint in preference to CPOF to develop its plans. Many attempts to develop exercise COAs have used various planning tools, including CPOF itself, but PowerPoint continues to be used in spite of its many limitations (such as lack of geospatial fidelity) because it is known by all. When the exercise was over, the team using PowerPoint asked for STP for planning future exercises.

Transition and Deployment
Although the U.S. military is extremely conservative in its adoption of computing technology, there is today a growing appreciation that operational efficiency and training are being hampered by systems with different user interfaces and operational difficulty. Still, such a realization takes time to pervade such a large organization with so many military and civilian stakeholders, including operational users and the defense-acquisition community. In addition to technology development, it took the STP development team years of presentations, demonstrations, tests, and related activities to achieve the visibility needed to begin to influence organizational adoption. Over that time, although the STP prototypes had been demonstrated, commercial availability of speech recognition was necessary to enable conservative decision makers to decide that the risk from incorporating speech technology into mission-critical systems had been sufficiently reduced. Moreover, the decision makers had independently become aware of the effects of interface complexity on their organizations’ training and operations. Still, the process is by no means complete, with organizational changes and thus customer education always a potential problem. Currently, STP has been transitioned to the Army’s Intelligence Experimentation Analysis Element, the Army Simulation and Training Technology Center, and the Marine Corps Warfighting Laboratory’s Experiments Division where it is used for creating plans for exercises and integrating with simulators. We have also seen considerable interest from the Army’s training facilities, where too much time is spent training students to use C2 systems, in relation to time spent on the subject matter. Moreover, beyond STP’s use as a planning tool, there has been great interest in its multimodal technology for rapid data entry, in both vehicle-based computers and handhelds.

Regarding full deployment of STP, the congressionally mandated “program of record” acquisition process specifies program budgets many years into the future; new technologies have a difficult time being incorporated into such programs, as they must become an officially required capability and selected to displace already budgeted items in a competitive feature triage process. In spite of these hurdles, STP and multimodal
interface technology are now being evaluated for integration into C2 systems by the Army’s Program Executive Office responsible for command-and-control technologies.

Conclusion
We have shown how the STP multimodal interface can address user-interface problems challenging current C2 GUIs. STP is quick and easy to learn and use and supports many different form factors, including handheld, tablet, vehicle-based, workstation, and ultra-mobile digital paper and pen. The interface takes advantage of and reinforces skills soldiers already have, as they are trained in the standardized language, symbols, and military decision-making process. In virtue of this common “doctrinal” language, STP users can quickly create a course of action or enter data multimodally for operations, C2, and simulation systems without extensive training on a complicated user interface. The result is a highly usable interface that can be integrated with existing C2 systems, thus increasing user efficiency while decreasing cost.

Acknowledgments
STP development was supported by Small Business Innovation Research Phase III contracts, including HR0011-11-C-0152 from DARPA, a subcontract from SAIC under prime contract W15P7T-08-C-M011, a subcontract from BAE Systems under prime contract W15P7T-08-C-M002, and contract W91CRB-10-C-0210 from the Army Research, Development, and Engineering Command/Simulation and Training Technology Center. This article is approved for public release, distribution unlimited. The results of this research and the opinions expressed herein are those of the authors, and not those of the U.S. Government. We thank Todd Hughes, Colonels (ret.) Joseph Moore, Pete Corpac, and James Zanol and the ROTC student testers. We are grateful to Paulo Barthelmess, Sumithra Bhakthavatsalam, John Dowding, Arden Gudger, David McGee, Moiz Nizamuddin, Michael Robin, Melissa Trapp-Petty, and Jack Wozniak for their contributions to developing and testing of STP. Thanks also to Sharon Oviatt, General (ret.) Peter Chiarelli, and the anonymous reviewers.

References
1. Alberts, D.S. and Hayes, R.E. Understanding Command and Control. DoD Command and Control Research Program Publication Series, Washington, D.C., 2006.
2. Bolt, R.A. Voice and gesture at the graphics interface. ACM Computer Graphics 14, 3 (1980), 262–270.
3. Cheyer, A. and Julia, L. Multimodal maps: An agent-based approach. In Proceedings of the International Conference on Cooperative Multimodal Communication (Eindhoven, the Netherlands, May). Springer, 1995, 103–113.
4. Clarkson, J.D. and Yi, J. LeatherNet: A synthetic forces tactical training system for the USMC commander. In Proceedings of the Sixth Conference on Computer Generated Forces and Behavioral Representation Technical Report IST-TR-96-18, University of Central Florida, Institute for Simulation and Training, Orlando, FL, 1996, 275–281.
5. Cohen, P.R., Johnston, M., McGee, D., Oviatt, S., Pittman, J., Smith, I., Chen, L., and Clow, J. QuickSet: Multimodal interaction for distributed applications. In Proceedings of the Fifth ACM International Conference on Multimedia (Seattle, WA, Nov. 9–13). ACM Press, New York, 1997, 31–40.
6. Cohen, P.R. and McGee, D.R. Tangible multimodal interfaces for safety-critical applications. Commun. ACM 47, 1 (Jan. 2004), 41–46.
7. Cohen, P.R., McGee, D., Oviatt, S., Wu, L., Clow, J., King, R., Julier, S., and Rosenblum, L. Multimodal interaction for 2D and 3D environments. IEEE Computer Graphics and Applications 19, 4 (Apr. 1999), 10–13.
8. Courtemanche, A.J. and Ceranowicz, A. ModSAF development status. In Proceedings of the Fifth Conference on Computer Generated Forces and Behavioral Representation, University of Central Florida, Institute for Simulation and Training, Orlando, FL, 1995, 3–13.
9. Dowding, J., Gawron, J.M., Appelt, D., Bear, J., Cherny, L., Moore, R., and Moran, D. Gemini: A natural language system for spoken-language understanding. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (Ohio State University, Columbus, OH, June 22–26). Association for Computational Linguistics, Stroudsburg, PA, 1993, 54–61.
10. Dowding, J., Frank, J., Hockey, B.A., Jonsson, A., Aist, G., and Hieronymus, J. A spoken-dialogue interface to the EUROPA planner. In Proceedings of the Third International NASA Workshop on Planning and Scheduling for Space (Washington, D.C.). NASA, 2002.
11. Greene, H., Stotts, L., Patterson, R., and Greenburg, J. Command Post of the Future: Successful Transition of a Science and Technology Initiative to a Program of Record. Defense Acquisition University, Fort Belvoir, VA, Jan. 2010; http://www.dau.mil
12. Hammond, T., Logsdon, D., Peschel, J., Johnston, J., Taele, P., Wolin, A., and Paulson, B. A sketch-recognition interface that recognizes hundreds of shapes in course-of-action diagrams. In Proceedings of ACM CHI Conference on Human Factors in Computing Systems (Atlanta, Apr. 10–15). ACM Press, New York, 2010, 4213–4218.
13. Hyman, P. Speech-to-speech translations stutter, but researchers see mellifluous future. Commun. ACM 57, 4 (Apr. 2014), 16–19.
14. Johnston, M., Cohen, P.R., McGee, D., Oviatt, S.L., Pittman, J.A., and Smith, I. Unification-based multimodal integration. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Annual Meeting of the European ACL (Madrid, Spain, July 7–12). Association for Computational Linguistics, Stroudsburg, PA, 1997, 281–288.
15. Johnston, M., Bangalore, S., Varireddy, G., Stent, A., Ehlen, P., Walker, M., Whittaker, S., and Maloor, P. MATCH: An architecture for multimodal dialogue systems. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Philadelphia, PA, July). Association for Computational Linguistics, Stroudsburg, PA, 2002, 376–383.
16. Kaiser, E., Olwal, A., McGee, D., Benko, H., Corradini, A., Li, X., Cohen, P.R., and Feiner, S. Mutual disambiguation of 3D multimodal interaction in augmented and virtual reality. In Proceedings of the Seventh International Conference on Multimodal Interfaces (Trento, Italy, Oct. 4–6). ACM Press, New York, 2005, 12–19.
17. Kara, L.B. and Stahovich, T.F. An image-based, trainable symbol recognizer for hand-drawn sketches. Computers and Graphics 29, 4 (2005), 501–517.
18. Kumar, S., Cohen, P.R., and Coulston, R. Multimodal interaction under exerted conditions in a natural field setting. In Proceedings of the Sixth International Conference on Multimodal Interfaces (State College, PA, Oct. 13–15). ACM Press, New York, 2004, 227–234.
19. MacEachren, A.M., Cai, G., Brewer, I., and Chen, J. Supporting map-based geo-collaboration through natural interfaces to large-screen display. Cartographic Perspectives 54 (Spring 2006), 16–34.
20. Moran, D.B., Cheyer, A.J., Julia, L.E., Martin, D.L., and Park, S. Multimodal user interfaces in the Open Agent Architecture. In Proceedings of the Second International Conference on Intelligent User Interfaces (Orlando, FL, Jan. 6–9). ACM Press, New York, 1997, 61–68.
21. Myers, K., Kolojejchick, J., Angiolillo, C., Cummings, T., Garvey, T., Gervasio, M., Haines, W., Jones, C., Knittel, J., Morley, D., Ommert, W., and Potter, S. Learning by demonstration for military planning and decision making: A deployment story. In Proceedings of the 23rd Innovative Applications of Artificial Intelligence Conference (San Francisco, CA, Aug. 6–10). AAAI Press, Menlo Park, CA, 2011, 1597–1604.
22. O’Hara, K., Gonzalez, G., Sellen, A., Penney, G., Varnavas, A., Mentis, H., Criminisi, A., Corish, R., Rouncefield, M., Dastur, N., and Carrell, T. Touchless interaction in surgery. Commun. ACM 57, 1 (Jan. 2014), 70–77.
23. Oviatt, S.L. Multimodal interfaces. The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, Revised Third Edition, J. Jacko, Ed. Lawrence Erlbaum Associates, Mahwah, NJ, 2012, 405–430.
24. Oviatt, S.L. Taming recognition errors with a multimodal architecture. Commun. ACM 43, 9 (Sept. 2000), 45–51.
25. Oviatt, S.L. and Cohen, P.R. Perceptual user interfaces: Multimodal interfaces that process what comes naturally. Commun. ACM 43, 3 (Mar. 2000), 45–53.
26. Oviatt, S.L. and Cohen, P.R. The Paradigm Shift to Multimodality in Contemporary Computer Interfaces. Morgan & Claypool Publishers, San Francisco, CA, 2015.
27. Stilman, B., Yakhnis, V., and Umanskiy, O. Strategies in large-scale problems. In Adversarial Reasoning: Computational Approaches to Reading the Opponent’s Mind, A. Kott and W. McEneaney, Eds. Chapman & Hall/CRC, London, U.K., 2007, 251–285.
28. U.S. Army. U.S. Army Field Manual 101–5-1, Chapter 5, 1997; http://armypubs.army.mil/doctrine/dr_pubs/dr_a/pdf/fm1_02c1.pdf

Philip R. Cohen (philcohen86@gmail.com) is a co-founder of Adapx, a fellow of the Association for the Advancement of Artificial Intelligence, and a past-president of the Association for Computational Linguistics.

Edward C. Kaiser (ekaiser@sensoryinc.com) is a senior application engineer at Sensory Inc., Portland, OR, and was co-PI/PI (2008–2009/2010) for STP at Adapx Inc., Seattle, WA.

M. Cecelia Buchanan (mcbuchanan@gmail.com) is a consultant at Tuatara Consulting, Seattle, WA, and was a research scientist at Adapx, Seattle, WA, when this article was written.

Scott Lind (scott.lind@adapx.com) is vice president for Department of Defense and federal solutions at Adapx, Seattle, WA.

Michael J. Corrigan (michael.corrigan@adapx.com) is a research software engineer at Adapx, Seattle, WA.

R. Matthews Wesson (matt.wesson@adapx.com) is a senior research programmer at Adapx, Seattle, WA.

© 2015 ACM 0001-0782/15/04 $15.00
contributed articles
DOI:10.1145/2699417

How Amazon Web Services Uses Formal Methods

Engineers use TLA+ to prevent serious but subtle bugs from reaching production.

BY CHRIS NEWCOMBE, TIM RATH, FAN ZHANG, BOGDAN MUNTEANU, MARC BROOKER, AND MICHAEL DEARDEUFF

key insights
• Formal methods find bugs in system designs that cannot be found through any other technique we know of.
• Formal methods are surprisingly feasible for mainstream software development and give good return on investment.
• At Amazon, formal methods are routinely applied to the design of complex real-world software, including public cloud services.

Since 2011, engineers at Amazon Web Services (AWS) have used formal specification and model checking to help solve difficult design problems in critical systems. Here, we describe our motivation and experience, what has worked well in our problem domain, and what has not. When discussing personal experience we refer to the authors by their initials.

At AWS we strive to build services that are simple for customers to use. External simplicity is built on a hidden substrate of complex distributed systems. Such complex internals are required to achieve high availability while running on cost-efficient infrastructure and cope with relentless business growth. As an example of this growth, in 2006, AWS launched S3, its Simple Storage Service. In the following six years, S3 grew to store one trillion objects.3 Less than a year later it had grown to two trillion objects and was regularly handling 1.1 million requests per second.4

S3 is just one of many AWS services that store and process data our customers have entrusted to us. To safeguard that data, the core of each service relies on fault-tolerant distributed algorithms for replication, consistency, concurrency control, auto-scaling, load balancing, and other coordination tasks. There are many such algorithms in the literature, but combining them into a cohesive system is a challenge, as the algorithms must usually be modified to interact properly in a real-world system. In addition, we have found it necessary to invent algorithms of our own. We work hard to avoid unnecessary complexity, but the essential complexity of the task remains high.

Complexity increases the probability of human error in design, code, and operations. Errors in the core of the system could cause loss or corruption of data, or violate other interface contracts on which our customers depend. So, before launching a service, we need to reach extremely high confidence that the core of the system is correct. We have found the standard verification techniques in industry are necessary but not sufficient. We routinely use deep design reviews, code reviews, static code analysis, stress testing, and fault-injection testing but still find that subtle bugs can hide in complex concurrent fault-tolerant systems. One reason they do is that human intuition is poor at estimating the true probability of supposedly “extremely rare” combinations of events in systems operating at a scale of millions of requests per second.
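To make the point about “extremely rare” events concrete, the sketch below works through the arithmetic at the request rate quoted above (1.1 million requests per second). The per-request probabilities are assumptions chosen purely for illustration, not figures reported by AWS, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic: how often does a "rare" event occur at scale?
# The request rate is the S3 figure quoted in the article; the per-request
# probabilities used below are assumed values for illustration only.

REQUESTS_PER_SECOND = 1_100_000
SECONDS_PER_DAY = 24 * 60 * 60


def expected_occurrences_per_day(p_per_request: float,
                                 requests_per_second: float = REQUESTS_PER_SECOND) -> float:
    """Expected number of occurrences per day of an event with the given per-request probability."""
    return p_per_request * requests_per_second * SECONDS_PER_DAY


def mean_seconds_between_occurrences(p_per_request: float,
                                     requests_per_second: float = REQUESTS_PER_SECOND) -> float:
    """Average gap, in seconds, between occurrences of the event."""
    return 1.0 / (p_per_request * requests_per_second)


if __name__ == "__main__":
    for p in (1e-6, 1e-9, 1e-12):  # hypothetical per-request probabilities
        per_day = expected_occurrences_per_day(p)
        gap = mean_seconds_between_occurrences(p)
        print(f"p = {p:.0e}: about {per_day:,.2f} per day, one every {gap:,.0f} s on average")
```

Under these assumptions, a one-in-a-billion event is expected roughly 95 times a day, or about once every 15 minutes, and even a one-in-a-trillion event would be expected about once every 10 to 11 days.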
think more clearly, helping eliminate return on investment. straction also helps designers manage
“plausible hand waving,” and tools We found what we were looking for the complexity of real-world systems;
can be applied to check for errors in in TLA+,11 a formal specification lan- designers may choose to describe
the design, even while it is being writ- guage based on simple discrete math, the system at several “middle” levels
ten. In contrast, conventional design or basic set theory and predicates, of abstraction, with each lower level
documents consist of prose, static dia- with which all engineers are familiar. serving a different purpose (such as to
grams, and perhaps pseudo-code in A TLA+ specification describes the set understand the consequences of fin-
system receives a request, it must eventually respond to that request.

After defining correctness properties, we then precisely describe an abstract version of the design, along with an abstract version of its operating environment. We express “what must go right” by explicitly specifying all properties of the environment on which the system relies. Examples of such properties might be “If a communication channel has not failed, then messages will be propagated along it,” and “If a process has not restarted, then it retains its local state, modulo any intentional modifications.” Next, with the goal of confirming our design correctly handles all dynamic events in the environment, we specify the effects of each of those possible events: network errors and repairs, disk errors, process crashes and restarts, data-center failures and repairs, and actions by human operators. We then use the model checker to verify that the specification of the system in its environment implements the chosen correctness properties, despite any combination or interleaving of events in the operating environment. We find this rigorous “what needs to go right” approach to be significantly less error prone than the ad hoc “what might go wrong” approach.

More Side Benefits
We also find that writing a formal specification pays dividends over the lifetime of the system. All production services at Amazon are under constant development, even those released years ago; we add new features customers have requested, we redesign components to handle massive increases in scale, and we improve performance by removing bottlenecks. Many of these changes are complex and must be made to the running system with no downtime. Our first priority is always to avoid causing bugs in a production system, so we often have to answer “Is this change safe?” We find a major benefit of having a precise, testable model of the core system is that we can quickly verify that even deep changes are safe or learn they are unsafe without doing harm. In several cases, we have prevented subtle but serious bugs from reaching production. In other cases we have been able to make innovative performance optimizations (such as removing or narrowing locks or weakening constraints on message ordering) we would not have dared to do without having model-checked those changes. A precise, testable description of a system becomes a what-if tool for designs, analogous to how spreadsheets are a what-if tool for financial models. We find that using such a tool to explore the behavior of the system can improve the designer’s understanding of the system.

In addition, a precise, testable, well-commented description of a design is an excellent form of documentation, which is important, as AWS systems have unbounded lifetimes. Over time, teams grow as the business grows, so we regularly have to bring new people up to speed on systems. This education must be effective. To avoid creating subtle bugs, we need all engineers to have the same mental model of the system and for that shared model to be accurate, precise, and complete. Engineers form mental models in various ways: talking to each other, reading design documents, reading code, and implementing bug fixes or small features. But talk and design documents can be ambiguous or incomplete, and the executable code is much too large to absorb quickly and might not precisely reflect the intended design. In contrast, a formal specification is precise, short, and can be explored and experimented on with tools.

Applying TLA+ to some of Amazon’s more complex systems.

System | Components | Line Count (Excluding Comments) | Benefit
S3 | Fault-tolerant, low-level network algorithm | 804 PlusCal | Found two bugs, then others in proposed optimizations
S3 | Background redistribution of data | 645 PlusCal | Found one bug, then another in the first proposed fix
DynamoDB | Replication and group-membership system | 939 TLA+ | Found three bugs requiring traces of up to 35 steps
EBS | Volume management | 102 PlusCal | Found three bugs
Internal distributed lock manager | Lock-free data structure | 223 PlusCal | Improved confidence though failed to find a liveness bug, as liveness not checked
Internal distributed lock manager | Fault-tolerant replication-and-reconfiguration algorithm | 318 TLA+ | Found one bug and verified an aggressive optimization

What Formal Specification Is Not Good For
We are concerned with two major classes of problems with large distributed systems: bugs and operator errors that cause a departure from the system’s logical intent; and surprising “sustained emergent performance degradation” of complex systems that inevitably contain feedback loops. We know how to use formal specification to find problems in the first class. However, problems in the second class can cripple a system even though no logic bug is involved. A common example is when a momentary slowdown in a server (due, perhaps, to Java garbage collection) causes timeouts to be breached on clients, causing the clients to retry requests, thus adding load to the server, and further slowdown. In such scenarios the system eventually makes progress; it is not stuck in a logical deadlock, livelock, or other cycle. But from the customer’s perspective it is effectively unavailable due to sustained unacceptable response times. TLA+ can be used to specify an upper bound on response time, as a real-time safety property. However, AWS systems are built on infrastructure (disks, operating systems, network) that does not support hard real-time scheduling or guarantees, so real-time safety properties would not be realistic. We build soft real-time systems in which very short periods of slow responses are not considered errors. However, prolonged
severe slowdowns are considered errors. We do not yet know of a feasible way to model a real system that would enable tools to predict such emergent behavior. We use other techniques to mitigate these risks.

First Steps to Formal Methods
With hindsight, Amazon’s path to formal methods seems straightforward; we had an engineering problem and found a solution. Reality was somewhat different. The effort began with author C.N.’s dissatisfaction with the quality of several distributed systems he had designed and reviewed, and with the development process and tools that had been used to construct those systems. The systems were considered successful, yet bugs and operational problems persisted. To mitigate the problems, the systems used well-proven methods (pervasive contract assertions enabled in production) to detect symptoms of bugs, and mechanisms (such as “recovery-oriented computing”20) to attempt to minimize the impact when bugs are triggered. However, reactive mechanisms cannot recover from the class of bugs that cause permanent damage to customer data; we must instead prevent such bugs from being created.

When looking for techniques to prevent bugs, C.N. did not initially consider formal methods, due to the pervasive view that they are suitable for only tiny problems and give very low return on investment. Overcoming the bias against formal methods required evidence they work on real-world systems. This evidence was provided by Zave,22 who used a language called Alloy to find serious bugs in the membership protocol of a distributed system called Chord. Chord was designed by an expert group at MIT and is successful, having won a “10-year test of time” award at the SIGCOMM 2011 conference and influenced several systems in industry. Zave’s success motivated C.N. to perform an evaluation of Alloy by writing and model checking a moderately large Alloy specification of a non-trivial concurrent algorithm.18 We liked many characteristics of the Alloy language, including its emphasis on “execution traces” of abstract system states composed of sets and relations. However, we also found that Alloy is not expressive enough for many use cases at AWS; for instance, we could not find a practical way in Alloy to represent rich data structures (such as dynamic sequences containing nested records with multiple fields).

Alloy’s limited expressivity appears to be a consequence of the particular approach to analysis taken by the Alloy Analyzer tool. The limitations do not seem to be caused by Alloy’s conceptual model (“execution traces” over system states). This hypothesis motivated C.N. to look for a language with a similar conceptual model but with richer constructs for describing system states. C.N. eventually stumbled on a language with those properties when he found a TLA+ specification in the appendix of a paper on a canonical algorithm in our problem domain, the Paxos consensus algorithm.12

The fact that TLA+ was created by the designer of such a widely used algorithm gave us some confidence that TLA+ would work for real-world systems. We became more confident when we learned a team of engineers at DEC/Compaq had used TLA+ to specify and verify some intricate cache-coherency protocols for the Alpha series of multicore CPUs.5,16 We read one of the specifications13 and found they were sophisticated distributed algorithms involving rich message passing, fine-grain concurrency, and complex correctness properties. That left only the question of whether TLA+ could handle real-world failure modes. (The Alpha cache-coherency algorithm does not consider failure.) We knew from Lamport’s Fast Paxos paper12 that TLA+ could model fault tolerance at a high level of abstraction and were further convinced when we found other papers showing TLA+ could model lower-level failures.15

C.N. evaluated TLA+ by writing a specification of the same non-trivial concurrent algorithm he had written in Alloy.18 Both Alloy and TLA+ were able to handle the problem, but the comparison revealed that TLA+ is much more expressive than Alloy. This difference is important in practice; several of the real-world specifications we have written in TLA+ would have been infeasible in Alloy. We initially had the opposite concern about TLA+; it is so expressive that no model checker can hope to evaluate everything that can be expressed in the language. But so far we have always been able to find a way to express our intent in a way that is clear, direct, and can be model checked.

After evaluating Alloy and TLA+, C.N. tried to persuade colleagues at Amazon to adopt TLA+. However, engineers have almost no spare time for such things, unless compelled by need. Fortunately, a need was about to arise.

First Big Success at Amazon
In January 2012, Amazon launched DynamoDB, a scalable high-performance “no SQL” data store that replicates customer data across multiple data centers while promising strong consistency.2 This combination of requirements leads to a large, complex system.

The replication and fault-tolerance mechanisms in DynamoDB were created by author T.R. To verify correctness of the production code, T.R. performed extensive fault-injection testing using a simulated network layer to control message loss, duplication, and reordering. The system was also stress tested for long periods on real hardware under many different workloads. We know such testing is absolutely necessary but can still fail to uncover subtle flaws in design. To verify the design of DynamoDB, T.R. wrote detailed informal proofs of correctness that did indeed find several bugs in early versions of the design. However, we have also learned that conventional informal proofs can miss very subtle problems.14 To achieve the highest level of confidence in the design, T.R. chose TLA+.

T.R. learned TLA+ and wrote a detailed specification of these components in a couple of weeks. To model-check the specification, we used the distributed version of the TLC model checker running on a cluster of 10 cc1.4xlarge EC2 instances, each with eight cores plus hyperthreads and 23GB of RAM. The model checker verified that a small, complicated part of the algorithm worked as expected for a sufficiently large instance of the system to give high confidence it is correct. T.R. then checked the broader fault-tolerant algorithm. This time the model checker found a bug that could lead to losing data if a particular sequence of failures and recovery steps would be interleaved with other processing. This was a very subtle bug; the
shortest error trace exhibiting the bug included 35 high-level steps. The improbability of such compound events is not a defense against such bugs; historically, AWS engineers have observed many combinations of events at least as complicated as those that could trigger this bug. The bug had passed unnoticed through extensive design reviews, code reviews, and testing, and T.R. is convinced we would not have found it by doing more work in those conventional areas. The model checker later found two bugs in other algorithms, both serious and subtle. T.R. fixed all these bugs, and the model checker verified the resulting algorithms to a very high degree of confidence.

T.R. says that, had he known about TLA+ before starting work on DynamoDB he would have used it from the start. He believes the investment he made in writing and checking the formal TLA+ specifications was more reliable and less time consuming than the work he put into writing and checking his informal proofs. Using TLA+ in place of traditional proof writing would thus likely have improved time to market, in addition to achieving greater confidence in the system’s correctness.

After DynamoDB was launched, T.R. worked on a new feature to allow data to be migrated between data centers. As he already had the specification for the existing replication algorithm, T.R. was able to quickly incorporate this new feature into the specification. The model checker found the initial design would have introduced a subtle bug, but it was easy to fix, and the model checker verified the resulting algorithm to the necessary level of confidence. T.R. continues to use TLA+ and model checking to verify changes to the design for both optimizations and new features.

Persuading More Engineers
Success with DynamoDB gave us enough evidence to present TLA+ to the broader engineering community at Amazon. This raised a challenge: how to convey the purpose and benefits of formal methods to an audience of software engineers. Engineers think in terms of debugging rather than “verification,” so we called the presentation “Debugging Designs.”18 Continuing the metaphor, we have found that software engineers more readily grasp the concept and practical value of TLA+ if we dub it “exhaustively testable pseudo-code.” We initially avoid the words “formal,” “verification,” and “proof” due to the widespread view that formal methods are impractical. We also initially avoid mentioning what TLA stands for, as doing so would give an incorrect impression of complexity.

Immediately after seeing the presentation, a team working on S3 asked for help using TLA+ to verify a new fault-tolerant network algorithm. The documentation for the algorithm consisted of many large, complicated state-machine diagrams. To check the state machine, the team had been considering writing a Java program to brute-force explore possible executions: essentially a hard-wired form of model checking. They were able to avoid the effort by using TLA+ instead. Author F.Z. wrote two versions of the spec over a couple of weeks. For this particular problem, F.Z. found that she was more productive in PlusCal than TLA+, and we have observed that engineers often find it easier to begin with PlusCal.

Model checking revealed two subtle bugs in the algorithm and allowed F.Z. to verify fixes for both. F.Z. then used the spec to experiment with the design, adding new features and optimizations. The model checker quickly revealed that some of these changes would have introduced bugs.

This success led AWS management to advocate TLA+ to other teams working on S3. Engineers from those teams wrote specs for two additional critical algorithms and for one new feature. F.Z. helped teach them how to write their first specs. We find it encouraging that TLA+ can be taught by engineers who are still new to it themselves; this is important for quickly scaling adoption in an organization as large as Amazon.

Author B.M. was one such engineer. His first spec was for an algorithm known to contain a subtle bug. The bug had passed unnoticed through multiple design reviews and code reviews and had surfaced only after months of testing. B.M. spent two weeks learning TLA+ and writing the spec. Using it, the TLC model checker found the bug in seconds. The team had already designed and reviewed a fix for the bug,
APRIL 2015 | VOL. 58 | NO. 4 | COMMUNICATIONS OF THE ACM 71
contributed articles
so B.M. changed the spec to include the proposed fix. The model checker found the problem still occurred in a different execution trace. A stronger fix was proposed, and the model checker verified the second fix. B.M. later wrote another spec for a different algorithm. That spec did not uncover any bugs but did uncover several important ambiguities in the documentation for the algorithm, which the spec helped resolve.

Somewhat independently, after seeing internal presentations about TLA+, authors M.B. and M.D. taught themselves PlusCal and TLA+ and started using them on their respective projects without further persuasion or assistance. M.B. used PlusCal to find three bugs and wrote a public blog about his personal experiments with TLA+ outside of Amazon.7 M.D. used PlusCal to check a lock-free concurrent algorithm and then used TLA+ to find a critical bug in one of AWS’s most important new distributed algorithms. M.D. also developed a fix for the bug and verified the fix. Independently, C.N. wrote a spec for the same algorithm that was quite different in style from the spec written by M.D., but both found the same bug in the algorithm. This suggests the benefits of using TLA+ are quite robust to variations among engineers. Both specs were later used to verify that a crucial optimization to the algorithm did not introduce any bugs.

Engineers at Amazon continue to use TLA+, adopting the practice of first writing a conventional prose-design document, then incrementally refining parts of it into PlusCal or TLA+. This method often yields important insight about the design, even without going as far as full specification or model checking. In one case, C.N. refined a prose design of a fault-tolerant replication system that had been designed by another Amazon engineer. C.N. wrote and model checked specifications at two levels of concurrency; these specifications helped him understand the design well enough to propose a major protocol optimization that radically reduced write-latency in the system. We have also discovered that TLA+ is an excellent tool for data modeling, as when designing the schema for a relational or “no SQL” database. We used TLA+ to design a non-trivial schema with semantic invariants over the data that were much richer than standard multiplicity constraints and foreign key constraints. We then added high-level specifications of some of the main operations on the data that helped us correct and refine the schema. This result suggests a data model can be viewed as just another level of abstraction of the entire system. It also suggests TLA+ may help designers improve a system’s scalability. In order to remove scalability bottlenecks, designers often break atomic transactions into finer-grain operations chained together through asynchronous workflows; TLA+ can help explore the consequences of such changes with respect to isolation and consistency.

Most Frequently Asked Question
On learning about TLA+, engineers usually ask, “How do we know that the executable code correctly implements the verified design?” The answer is we do not know. Despite this, formal methods still help in multiple ways:

Get design right. Formal methods help engineers get the design right, which is a necessary first step toward getting the code right. If the design is broken, then the code is almost certainly broken, as mistakes during coding are extremely unlikely to compensate for mistakes in design. Worse, engineers are likely to be deceived into believing the code is “correct” because it appears to correctly implement the (broken) design. Engineers are unlikely to realize the design is incorrect while focused on coding;

Gain better understanding. Formal methods help engineers gain a better understanding of the design. Improved understanding can only increase the chances they will get the code right; and

Write better code. Formal methods can help engineers write better “self-diagnosing code” in the form of assertions. Independent evidence10 and our own experience suggest pervasive use of assertions is a good way to reduce errors in code. An assertion checks a small, local part of an overall system invariant. A good system invariant captures the fundamental reason the system works; the system will not do anything wrong that could violate a safety property as long as it continuously maintains the system invariant.
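The idea of checking an invariant over every possible execution can be illustrated with a minimal sketch. This is not TLA+ or TLC, and it is not any algorithm from the article; it is a hypothetical toy system (two processes that each increment a shared counter with a non-atomic read-then-write) explored by a small breadth-first search written in Python. The names `INIT`, `next_states`, `invariant`, and `check` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy system (an illustration, not from the article):
# two processes each increment a shared counter with a non-atomic
# read-then-write. A state is (counter, pcs, tmps), where for each
# process pc 0 = about to read, 1 = about to write, 2 = done.
INIT = (0, (0, 0), (0, 0))

def next_states(state):
    """Yield (label, successor) for every step any process can take."""
    counter, pcs, tmps = state
    for p in range(2):
        if pcs[p] == 0:
            # Read the shared counter into this process's local copy.
            new_tmps = tmps[:p] + (counter,) + tmps[p + 1:]
            new_pcs = pcs[:p] + (1,) + pcs[p + 1:]
            yield f"p{p} reads {counter}", (counter, new_pcs, new_tmps)
        elif pcs[p] == 1:
            # Write back the local copy plus one.
            new_pcs = pcs[:p] + (2,) + pcs[p + 1:]
            yield f"p{p} writes {tmps[p] + 1}", (tmps[p] + 1, new_pcs, tmps)

def invariant(state):
    """Safety property: once both processes are done, no increment is lost."""
    counter, pcs, _ = state
    return pcs != (2, 2) or counter == 2

def check(init, next_states, invariant):
    """Explore all reachable states breadth-first.

    Returns None if the invariant holds in every reachable state,
    otherwise a shortest list of step labels leading to a violation.
    """
    parent = {init: None}          # state -> (previous state, step label)
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parent[state] is not None:
                prev, label = parent[state]
                trace.append(label)
                state = prev
            return trace[::-1]
        for label, succ in next_states(state):
            if succ not in parent:
                parent[succ] = (state, label)
                queue.append(succ)
    return None

if __name__ == "__main__":
    print(check(INIT, next_states, invariant))
```

Because the search is breadth-first, any trace it returns is a shortest one, which loosely mirrors the "shortest error trace" a model checker reports. Real specifications have far larger state spaces, which is one reason to use a dedicated tool such as TLC rather than a hand-written explorer like this one.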
The challenge is to find a good system invariant, one strong enough to ensure no safety properties are violated. Formal methods help engineers find strong invariants, so formal methods can help improve assertions that help improve the quality of code.

While we would like to verify that executable code correctly implements the high-level specification or even generate the code from the specification, we are not aware of any such tools that can handle distributed systems as large and complex as those being built at Amazon. We do routinely use conventional static analysis tools, but they are largely limited to finding “local” issues in the code, and are unable to verify compliance with a high-level specification.

We have seen research on using the TLC model checker to find “edge cases” in the design on which to test the code,21 an approach that seems promising. However, Tasiran et al.21 covered hardware design, and we have not yet tried to apply the method to software.

Alternatives to TLA+
There are many formal specification methods. We evaluated several and published our findings in Newcombe,19 listing the requirements we think are important for a formal method to be successful in our industry segment. When we found TLA+ met those requirements, we stopped evaluating methods, as our goal was always practical engineering rather than an exhaustive survey.

Related Work
We find relatively little published literature on using high-level formal specification for verifying the design of complex distributed systems in industry. The Farsite project6 is complex but somewhat different from the types of systems we describe here and apparently never launched commercially. Abrial1 cited applications in commercial safety-critical control systems, but they seem less complex than our problem domain. Lu et al.17 described post-facto verification of a well-known algorithm for a fault-tolerant distributed hash table, and Zave22 described another such algorithm, but we do not know if these algorithms have been used in commercial products.

Conclusion
Formal methods are a big success at AWS, helping us prevent subtle but serious bugs from reaching production, bugs we would not have found through any other technique. They have helped us devise aggressive optimizations to complex algorithms without sacrificing quality. At the time of this writing, seven Amazon teams have used TLA+, all finding value in doing so, and more Amazon teams are starting to use it. Using TLA+ will improve both time-to-market and quality of our systems. Executive management actively encourages teams to write TLA+ specs for new features and other significant design changes. In annual planning, managers now allocate engineering time to TLA+.

While our results are encouraging, some important caveats remain. Formal methods deal with models of systems, not the systems themselves, so the adage “All models are wrong, some are useful” applies. The designer must ensure the model captures the significant aspects of the real system. Achieving it is a special skill, the acquisition of which requires thoughtful practice. Also, we were solely concerned with obtaining practical benefits in our particular problem domain and have not attempted a comprehensive survey. Therefore, mileage may vary with other tools or in other problem domains.

References
1. Abrial, J. Formal methods in industry: Achievements, problems, future. In Proceedings of the 28th International Conference on Software Engineering (Shanghai, China, 2006), 761–768.
2. Amazon.com. Supported Operations in DynamoDB: Strongly Consistent Reads. System documentation; http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/APISummary.html
3. Barr, J. Amazon S3: The first trillion objects. Amazon Web Services Blog, June 2012; http://aws.typepad.com/aws/2012/06/amazon-s3-the-first-trillion-objects.html
4. Barr, J. Amazon S3: Two trillion objects, 1.1 million requests per second. Amazon Web Services Blog, Mar. 2013; http://aws.typepad.com/aws/2013/04/amazon-s3-two-trillion-objects-11-million-requests-second.html
5. Batson, B. and Lamport, L. High-level specifications: Lessons from industry. In Formal Methods for Components and Objects, Lecture Notes in Computer Science Number 2852, F.S. de Boer, M. Bonsangue, S. Graf, and W.-P. de Roever, Eds. Springer, 2003, 242–262.
6. Bolosky, W., Douceur, J., and Howell, J. The Farsite Project: A retrospective. ACM SIGOPS Operating Systems Review: Systems Work at Microsoft Research 41, 2 (Apr. 2007), 17–26.
7. Brooker, M. Exploring TLA+ with two-phase commit. Personal blog, Jan. 2013; http://brooker.co.za/blog/2013/01/20/two-phase.html
8. Holloway, C. Michael. Why you should read accident reports. Presented at the Software and Complex Electronic Hardware Standardization Conference (Norfolk, VA, July 2005); http://klabs.org/richcontent/conferences/faa_nasa_2005/presentations/cmh-why-read-accident-reports.pdf
9. Joshi, R., Lamport, L. et al. Checking cache-coherence protocols with TLA+. Formal Methods in System Design 22, 2 (Mar. 2003), 125–131.
10. Kudrjavets, G., Nagappan, N., and Ball, T. Assessing the relationship between software assertions and code quality: An empirical investigation. In Proceedings of the 17th International Symposium on Software Reliability Engineering (Raleigh, NC, Nov. 2006), 204–212.
11. Lamport, L. The TLA Home Page; http://research.microsoft.com/en-us/um/people/lamport/tla/tla.html
12. Lamport, L. Fast Paxos. Distributed Computing 19, 2 (Oct. 2006), 79–103.
13. Lamport, L. The Wildfire Challenge Problem; http://research.microsoft.com/en-us/um/people/lamport/tla/wildfire-challenge.html
14. Lamport, L. Checking a multithreaded algorithm with +CAL. In Distributed Computing: 20th International Conference, S. Dolev, Ed. Springer-Verlag, 2006, 11–163.
15. Lamport, L. and Merz, S. Specifying and verifying fault-tolerant systems. In Formal Techniques in Real-Time and Fault-Tolerant Systems, Lecture Notes in Computer Science, Number 863, H. Langmaack, W.-P. de Roever, and J. Vytopil, Eds. Springer-Verlag, Sept. 1994, 41–76.
16. Lamport, L., Sharma, M., Tuttle, M., and Yu, Y. The Wildfire Challenge Problem. Jan. 2001; http://research.microsoft.com/en-us/um/people/lamport/pubs/wildfire-challenge.pdf
17. Lu, T., Merz, S., and Weidenbach, C. Towards verification of the Pastry Protocol using TLA+. In Proceedings of Joint 13th IFIP WG 6.1 International Conference and 30th IFIP WG 6.1 International Conference Lecture Notes in Computer Science Volume 6722 (Reykjavik, Iceland, June 6–9). Springer-Verlag, 2011, 244–258.
18. Newcombe, C. Debugging Designs. Presented at the 14th International Workshop on High-Performance Transaction Systems (Monterey, CA, Oct. 2011); http://hpts.ws/papers/2011/sessions_2011/Debugging.pdf and associated specifications http://hpts.ws/papers/2011/sessions_2011/amazonbundle.tar.gz
19. Newcombe, C. Why Amazon chose TLA+. In Proceedings of the Fourth International Conference Lecture Notes in Computer Science Volume 8477, Y.A. Ameur and K.-D. Schewe, Eds. (Toulouse, France, June 2–6). Springer, 2014, 25–39.
20. Patterson, D., Fox, A. et al. The Berkeley/Stanford Recovery-Oriented Computing Project. University of California, Berkeley; http://roc.cs.berkeley.edu/
21. Tasiran, S., Yu, Y., Batson, B., and Kreider, S. Using formal specifications to monitor and guide simulation: Verifying the cache coherence engine of the Alpha 21364 microprocessor. In Proceedings of the Third IEEE International Workshop on Microprocessor Test and Verification (Austin, TX, June). IEEE Computer Society, 2002.
22. Zave, P. Using lightweight modeling to understand Chord. ACM SIGCOMM Computer Communication Review 42, 2 (Apr. 2012), 49–57.

Chris Newcombe (chris.newcombe@gmail.com) is an architect at Oracle, Seattle, WA, and was a principal engineer in the AWS database services group at Amazon.com, Seattle, WA, when this article was written.

Tim Rath (rath@amazon.com) is a principal engineer in the AWS database services group at Amazon.com, Seattle, WA.

Fan Zhang (fanxhang58@gmail.com) is a software engineer and technical product and program manager at Cyanogen, Seattle, WA, and was a software engineer for AWS S3 at Amazon.com, Seattle, WA, when this article was written.

Bogdan Munteanu (bogdanmunte@gmail.com) is a software engineer at Dropbox, and was a software engineer in the AWS S3 Engines group at Amazon.com, Seattle, WA, when this article was written.

Marc Brooker (mbrooker@amazon.com) is a principal engineer for AWS EC2 at Amazon.com, Seattle, WA.

Michael Deardeuff (mdearde@amazon.com) is a software engineer in the AWS database services group at Amazon.com, Seattle, WA.

Copyright held by Owners/Authors. Publication rights licensed to ACM. $15.00
review articles
DOI:10.1145/2667218

Security Challenges for Medical Devices

Implantable devices, often dependent on software, save countless lives. But how secure are they?

BY JOHANNES SAMETINGER, JERZY ROZENBLIT, ROMAN LYSECKY, AND PETER OTT

SECURITY AND SAFETY issues in the medical domain take many different forms. Examples range from purposely contaminated medicine to recalls of vascular stents, and health data breaches. Risks

of hundreds of medical devices that have been infected by malware.”34 Even though deaths and injuries have not yet been reported from such intrusions, it is not difficult to imagine that someday they will. There is no doubt that health care will increasingly be digitized in the future. Medical devices will increasingly become smarter and more interconnected. The risk of computer viruses in hospitals and clinics is one side effect of this trend. Without suitable countermeasures, more data breaches and even malicious attacks threatening the lives of patients may result.

Security is about protecting information and information systems from unauthorized access and use. As mentioned, medical devices have more and more embedded software with communication mechanisms that now qualify them as information systems. Confidentiality, integrity, and availability of information are core design and operational goals. Secure software is supposed to continue to function correctly under a malicious attack.25 In this sense, medical device security is the idea of engineering these devices so they continue to function correctly even if under a malicious attack. This includes internal hardware and software aspects as well as intentional and unintentional external threats.

Medical devices comprise a broad range of instruments and implements.

ever remote and unlikely this scenario might sound, it is not completely implausible. Securing medical devices

credit card information, or website availability problems. The loss, theft, or exposure of personally identifiable

events has shown that both the number of recalls and adverse events have increased over the years.
The major reason for device recalls involves malfunctions. Computer-related recalls account for about 20% to 25%, and counting. The numbers show that computer-related recalls are caused mainly by software.1 More than 90% of device recalls mentioned the word ‘software’ as the reason for the corrective action. Less than 3% mentioned an upgrade would be available online.23 Kramer et al. also tested the FDA’s adverse event reporting by notifying a device’s vulnerability, only to find out that it took several months before the event showed up in the corresponding database. This time span is definitely much too long to respond to software-related malfunctions.

Successful hacking of medical devices has been demonstrated on several occasions. For example, commands have been sent wirelessly to an insulin pump (raise or lower the levels of insulin, disable it). This could be done within a distance of up to 150 feet.20 The FDA’s safety communication has issued a warning to device makers and health care providers to put safeguards in place to prevent cyber-attacks.9 Deaths or injuries are not yet known, but the hypothetical ramifications are obvious. The non-medical IT landscape can also pose a threat to medical operations. For example, when computers around the world came to a halt after an antivirus program identified a normal system file as a virus, hospitals had to postpone elective surgeries and to stop treating patients.11

Medical Devices
Medical devices include everything from simple wooden tongue depressors and stethoscopes to highly sophisticated computerized medical equipment.37 According to the World Health Organization (WHO), a medical device is “an instrument, apparatus, implement, machine, contrivance, implant, in vitro reagent, or other similar or related article” intended for use in the diagnosis, prevention, monitoring, and treatment of disease or other conditions.37 The FDA uses a similar definition.7 Classes of medical devices have been defined differently in, for example, the U.S., Canada, Europe, or Australia. The FDA has established classifications for approximately 1,700 different generic types of devices. These devices are grouped into medical specialties, called panels. Examples for the FDA’s specialty panels include cardiovascular devices, dental, orthopedic, as well as ear, nose, and throat devices. Active devices may or may not involve software, hardware, and interfaces, which are important when considering security issues. These devices can do some processing, receive inputs from outside the device (sensors), output values to the outer world (actuators), and communicate with other devices.

Device safety. Each of the FDA’s generic device types is assigned to one of three regulatory classes: I, II, and III. The classes are based on the level of control necessary to ensure the safety and effectiveness of a device; the higher the risk, the higher the class.8 For example, class III devices have to be approved by a premarket approval process. This class contains devices that are permanently implanted into human bodies and may be necessary to sustain life, for example, artificial hearts or an automated external defibrillator. The classification is based on the risk that a device poses to the patient or the user. Class I includes devices with the lowest risk, class III those with the greatest risk.

According to the WHO, optimum safety and performance of medical devices requires risk management with the cooperation among all involved in the device’s life span, that is, the government, the manufacturer, the importer/vendor, the user, and the public.37 The international standard ISO 14971: 2007 provides a framework for medical device manufacturers including risk analysis, risk evaluation, and risk control for risk management in a device’s design, development, manufacturing, and after-sale monitoring of a device’s safety and performance.18

Device security. We consider a medical device to be security-critical if it does some form of processing and communicating, typically by running some form of software on specialized hardware, and often, employing a range of sensors.7 Sensing devices constitute a security threat because wrong sensor values may later induce therapeutically wrong decisions by doctors or devices. Safety-critical information has an influence on the safety of a person or her environment. Examples include parameter settings or commands for devices such as implanted defibrillators or x-ray machines. Both malicious and unintentional modification of such information may lead to safety-critical situations. Sensitive information includes anything that is about a patient, for example, medical records as well as values from sensing devices that report information about a person’s or her device’s state, for example, glucose level, ID, or parameter settings of a pacemaker. It is interesting to note that all medical devices as defined by the WHO or by the FDA have aspects that are inherently safety related. Some have a higher risk, some a lower one (see FDA’s classes I, II, and III). However, not all of these devices are relevant from a security point of view; recall the aforementioned artificial joint. Typically, security is an issue as soon as software is involved. But there are also security-relevant devices that are not considered to be medical devices by the WHO or the FDA. Examples include smartphones that run medical apps handling sensitive information, or regular PCs in a hospital for processing medical records.

The difference between safety and security is not always obvious because security can clearly have an effect on safety. Generally speaking, safety is about the protection of a device’s environment, that is, mainly the patient, from the device itself. The manufacturer must ensure the device does not harm the patient, for example, by not using toxic substances in implants or by careful development of an insulin pump’s software. Security is about the protection of the device from its environment, that is, just the opposite of safety. As long as a device is operating in a stand-alone mode, this is not an issue. But if a device communicates with its environment or is connected to the Internet or other systems, then someone may get access to data on the device or even gain control over it. A security issue becomes a safety issue when a malicious attacker gains control of a device and harms the patient.

Non-communicating but processing devices can be critical to security
when attackers have managed to im- tors can be especially critical for the have arisen. At this time these remote
plant malicious hardware or software patient’s health and welfare. These follow-up systems are in read-only
before the device gets installed. Ex- devices are implanted in hundreds mode. However, device programming
amples include hardware or software of thousands of patients every year; through remote follow-up systems is
Trojans that might be installed in heart many of these patients would not be being investigated. Incorrect program-
pacemakers to be activated upon a spe- able to live without a fully functional ming either by error, technical failure,
cific event. Precautions must be taken device. Patients with these types of or malicious intent could have poten-
at the design and development pro- implantable devices are typically seen tially life-threatening implications for
cesses in order to avoid such attacks. in a follow-up on a regular basis, in the patient.
Communicating devices, of course, an outpatient clinic or hospital set- Risk assessment. In our pacemaker
provide a broader attack “surface.” ting, where the device is interrogated scenario, we distinguish different risks
We suggest a security classifica- and adjustments are made as needed. according to the CIA triad, confidenti-
tion of medical devices depending on Trained staff or physicians perform ality, integrity, and availability. First—
whether they process or communicate these functions using a vendor-specif- confidentiality—sensitive data about
sensitive information and on whether ic programming system, which com- the patient and her pacemaker may be
they process or communicate safety- municates with the device by means disclosed. Second—integrity—data on
critical information. The accompany- of a wand or wireless technology. In a device may be altered, resulting in a
ing table summarizes our proposed addition, over the last several years es- range of slightly to highly severe im-
levels for devices that are security-rel- sentially all device vendors have estab- pacts on the patient. Third—availabil-
evant. Note this set is an initial classi- lished a home-based device follow-up ity—may render a device inoperable.
fication. While not yet fully elaborat- system. For this purpose, a data mod- An architectural overview of the pace-
ed, it is a first step toward developing ule is located at the patient’s home, maker environment is given in the ac-
a more comprehensive taxonomy of typically at the bedside. Once the pa- companying figure on page 79. While
security levels. tient is in proximity to the data mod- the pacemaker itself is communicat-
Health care professionals increas- ule, wireless contact is established ing wirelessly, other communication is
ingly improve and facilitate patient and the data module interrogates the done via the Internet, a phone line, and
care with mobile medical applica- device. This information is sent (typi- sometimes by means of a USB stick.
tions. An increasing number of pa- cally through a telephone landline) to Even if programming devices may not
tients manage their health and well- an Internet-based repository. Autho- yet have a direct connection to the clin-
ness with such applications. Such rized health care professionals can ic, sooner or later, they will.
apps may promote healthy living and view this information. Information disclosure and tamper-
provide access to useful health infor- Implantable cardiac pacemakers ing may happen on any connection be-
mation. Mobile medical apps can be and defibrillators are highly reliable. tween devices. On the Internet, a man-
used for a plethora of uses. They can Nevertheless, failure of device compo- in-the-middle attack can occur, unless
extend medical devices by connecting nents has occurred and highlighted appropriate measures such as encryp-
to them for the purpose of displaying, the potential medical and legal im- tion mechanisms have been used.
storing, analyzing, or transmitting pa- plications. These failures have largely Wireless communication additionally
tient-specific data. Not every mobile been due to problems with manufac- allows attackers to listen to the traffic
medical application necessarily poses turing processes and/or materials and with a separate device, that is, another
a security risk. However, as soon as it have typically been limited to certain programming device, another home
processes or transmits sensitive in- device batches. Almost always, how- monitor, or a different device specifi-
formation or even controls the medi- ever, such device failures require sur- cally for an attack. Such devices can
cal device, security precautions must gical device replacement. With the be used not only for listening but also
be taken. increasing prevalence of Web-based to pretend being an authorized com-
wireless remote device follow-up sys- munication partner. Denial-of-service
Pacemaker Scenario tems, concerns about device security attacks may occur as well. In our sce-
We will illustrate security issues through an example of pacemakers, that is, medical devices that are implanted in patients to regulate the patient’s heart rate. The purpose of such a device is to maintain an adequate heart rate of a patient whose heart would not be able to do so otherwise. Pacemakers are classified as Class III, the highest safety category.
Clinical perspective. Implantable medical devices are prevalent in many medical specialties. The implantable cardiac pacemakers and defibrilla-

Security levels of medical devices.
Security level   Description                                      Device examples
Low              Neither sensitive nor safety-critical activity   PC in hospital used for administrative work; heart rate watch
Medium           Sensitive activity                               PC processing electronic health records (EHRs); smartphone communicating glucose levels
High             Safety-critical activity                         Device controlling insulin pump or sending parameters to pacemaker
Very High        Safety-critical activity, input from elsewhere   Pacemaker receiving external parameters
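The four levels in the table "Security levels of medical devices" can be read as a small decision rule. The following is a minimal sketch of that reading, assuming three yes/no properties per device; the names `SecurityLevel` and `classify` and the three flags are hypothetical and are not part of the proposed classification itself.

```python
from enum import IntEnum


class SecurityLevel(IntEnum):
    """Levels from the table, ordered from least to most critical."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    VERY_HIGH = 4


def classify(sensitive: bool, safety_critical: bool,
             external_input: bool = False) -> SecurityLevel:
    """Map device properties to a level (hypothetical helper).

    sensitive:        the device processes or transmits sensitive data
    safety_critical:  the device's activity can directly affect the patient
    external_input:   a safety-critical device accepts input from elsewhere
    """
    if safety_critical:
        return SecurityLevel.VERY_HIGH if external_input else SecurityLevel.HIGH
    if sensitive:
        return SecurityLevel.MEDIUM
    return SecurityLevel.LOW


# Device examples taken from the table.
admin_pc = classify(sensitive=False, safety_critical=False)
ehr_pc = classify(sensitive=True, safety_critical=False)
pump_controller = classify(sensitive=True, safety_critical=True)
pacemaker = classify(sensitive=True, safety_critical=True, external_input=True)
```

Using an ordered enumeration makes comparisons such as `pacemaker > ehr_pc` possible, which is convenient if rules of action are later attached to a minimum level.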
nario, the biggest threat stems from the pacemaker’s interoperability. The purpose of an assessment of a device’s risks is a determination of risks, their degree of harm as well as the likelihood of harm occurring.27 Based on this information, countermeasures must be identified and selected.
Medical device security is the idea of engineering these devices so they continue to function correctly even if under a malicious attack.
Software. Vulnerabilities in software are bugs or flaws in software that can directly be used by attackers to gain access to a system or network. Software for pacemakers is confidential and proprietary. A system specification is available for academic purposes.2 It demonstrates the complexity of these seemingly simple devices. There are many programmable parameters, for example, lower and upper rate limit, as well as various delays and periods. Functionality includes device monitoring, lead support, pulse pacing, various operation modes and states as well as extensive diagnostic features. Software is not only needed on the pacemaker itself, but also on the programming device and on the home monitor. Software on the programming device is needed to non-invasively reprogram a pacemaker, for example, to modify the pacemaker rate, to monitor specific functions, and to process data obtained from the pacemaker. Such software can work with one or a few models of devices, typically from the same manufacturer. Software on the home monitor has to communicate with the pacemaker and to mainly upload important information to a specific server, where personnel from the clinic can later access it. Installing updates may be necessary on both the programming device and the home monitor, but also on the pacemaker itself. A compromised pacemaker can directly do harm to its patient. A compromised programming device can do so indirectly. It may just send other parameters to the device than the ones the cardiologist has chosen. A compromised home monitor also poses a serious threat. If it uploads incorrect values to the server, then these values may lead the cardiologist to wrong conclusions and eventually to wrong device settings that may harm the patient. Last but not least, a compromised server that stores all these values poses a similar threat.
Hardware. Hidden malicious circuits provide attackers with stealthy attack vectors.21 Various potential attacks like privilege escalation, login backdoor, and password stealing have been demonstrated. The hardware of pacemakers is, like its software, confidential and proprietary. A hardware reference platform is available at the University of Minnesota. It is based upon an 8-bit microcontroller.26 Hardware for programming devices and home monitors is less constrained. These devices have no space and power constraints and are comparable to regular PCs. Similarly to software, malicious hardware circuits can be placed on the medical device itself, but also on other devices it communicates with, such as the programming device and the home monitor in our pacemaker scenario. Malicious hardware on the Web server, where pacemaker data is stored, also poses a threat by either revealing sensitive medical data or by even modifying this data and, thus, misleading the treating physician.
Interoperability. Security issues of pacemakers have also been raised due to their capability of wireless communication. Concerns include unauthorized access to patient data on the device as well as unauthorized modifications of the device’s parameters.
Needless to say, modified settings may harm patients’ well-being, cause severe damage to their hearts, and even cause their deaths. Device integrity is at stake when its wireless communication is attacked. The crucial question is whether it is possible for unauthorized third parties to change device settings, to change or disable therapies, or even to deliver command shocks. Halperin et al. have partially reverse engineered a pacemaker’s communications protocol with an oscilloscope and a software radio and have then implemented several attacks able to compromise the safety and privacy of patients.15
Even if hardware and software of all devices in our pacemaker scenario are free of malware, an attacker may still pose a threat by communicating with either one of these devices, such as the home monitor, the programming device, the service provider’s Web server, or the pacemaker itself. Interoperability requires protocols that define sequences of operations between the two communicating parties. These sequences must ensure the protection of data. Network protocols have often suffered from vulnerabilities, thus allowing attackers to pretend to be someone else. Attackers may use a modified programming device with stronger antennae that allow them to communicate with a pacemaker from a longer distance. They may then pretend to be the authorized cardiologist and modify settings of the device. Similarly, they may act as the home monitor and read out sensitive data, or communicate with the home monitor, pretending to be the pacemaker, and relay wrong values.
Challenges
Critical assets deserving strong protection in health care include medical records, a plethora of medical sensors and devices, and last but not least, human health and life. The security of medical devices is different and more challenging vis-à-vis regular IT security for several reasons, not just because of the fact that human life is at stake. Clearly, nonmedical devices like automobiles can also endanger human life if their safety is compromised through a security breach. One can imagine a scenario where malware is implanted into a dynamic stability control system to intentionally cause an accident. But many medical devices impact the patients’ physiology and, thus, pose a permanent threat. Resource constraints are present not for all, but for many, most notably implanted medical devices. Little memory, processing power, physical size limitations and battery life limit the options that are available for security countermeasures. Emergency situations provide an additional challenge that is not present in other domains. Medical devices must prevent unauthorized access, yet may need to allow for quick and simple access in emergency situations. Another problem is reproducibility. Security researchers often lack access to proprietary devices and are, thus, limited in their ability to study attacks and defenses.
Several countermeasures to vulnerabilities in medical devices have been described.4,14 They can be protective, corrective, or detective. Examples are auditing, notification, trusted external or internal devices, and cryptographic protections.16 Here, we enumerate various challenges and postulate a means of tackling them.
Software security. Besides the functionality, software developers of medical devices must take measures to ensure the safety as well as the security of their code. Both secure development and secure update mechanisms are needed. Risks of medical device software have also been described in Fu and Blum.12
Secure development. Security is a volatile property. A system is never 100% secure. As long as vulnerabilities are unknown, this is not a problem. When attackers know a specific vulnerability, the target system is at risk. The engineering of secure medical software is not radically different from the development of other types of software. It is a common misconception that only bad programmers write insecure code. Besides the underlying complexity of writing code, it takes detailed knowledge, extra training, and additional development activities in order to write secure code.17 Thus, economic and sometimes social factors often play against security quality.
In medical device software we must ensure both safety and security have top priority and there is a defined process to report and fix vulnerabilities. The challenge for medical devices includes the fact that additional code for security must not interfere with real-time constraints and other resource constraints like limited battery power.
Update mechanisms. When manufacturers of a system know about vulnerabilities, they will address and correct the problems. A fix must then be distributed to the systems with that vulnerability. The update mechanism itself may be misused for an attack. Updates and patches are (still) much less frequent for medical devices than they are for personal computers and smartphones. However, sometimes they will be necessary.
We need user-friendly update processes for medical devices and must take precautions such that malware is not involved in the update process itself. In addition, the update must not break the device or halt its proper functioning.
Off-the-shelf software often “powers” medical technology. On medical devices, software patches or updates are often delayed or are even missing altogether. Missing patches may also be an organizational problem. Delays may result from the fact that device manufacturers must approve upgrades to software as well as any security installations.36 The problem with old software versions is they often contain known vulnerabilities. Old software in medical devices was not an issue as long as these devices operated stand-alone. Increasing interconnection makes these devices vulnerable even to old malware.12
For medical devices, it is important that the production life cycles of embedded software match the devices’ production life cycles. Manufacturers must ensure software is not used on medical devices after its support has expired.
Figure: Pacemaker environment. Components: Home Monitor, Programming Device, Pacemaker, Service Provider, Clinic. Link labels: Wireless, Phone line, Internet, Manual.
Hardware security. Safety issues are more prevalent in hardware than security concerns. An example includes the electromagnetic interference of non-medical devices with pacemakers. Hardware Trojans on medical devices seem unrealistic today, but precautions must be taken to reduce attack vectors wherever possible. Backdoors in military chips have already been documented, where attackers could extract configuration data from the chip, reprogram crypto and access keys, modify low-level silicon features, and also permanently damage the device.30 An approach for automatic embedding of customizable hardware Trojan horses into arbitrary finite state machines has been demonstrated. These Trojan horses are undetectable and unprovable.35 Radio pathways have been embedded into computers, where computers could be remotely controlled and provided with malware even when they were not connected to the Internet.29
We must keep in mind that hardware Trojans can be an attack vector for medical devices too. It is important to ensure such malware is not installed in the manufacturing process. Given the reliance on computer-aided design tools, it is further necessary to ensure hardware Trojans are not inserted in the design by these tools. Verification methods utilized in designing hardware should ensure the resulting output designs match the inputs and do not contain additional circuitry. Outside of using trusted manufacturers for each stage of design, ensuring Trojan-free hardware is not practical. Thus, detection and mitigation capabilities will still be needed. Once malicious hardware is detected and its behavior is understood, research on how to mitigate the effects of the malicious hardware to ensure safety of medical devices will be of critical importance.
Interoperability. Increasingly, medical devices rely on wireless connectivity, be it for remote monitoring, or for remote updates of settings or even for an update of the software itself. Interoperability challenges include secure protocols, authentication, authorization, encryption, and key management. Interoperability of medical devices is especially tricky due to medical emergency situations. In case of an emergency, health personnel may need to access not only medical records, but also medical devices of a person in need, perhaps in a life-threatening situation. Authentication and authorization mechanisms must have a bypass or shortcut for such circumstances. However, these bypasses and shortcuts should not provide a means that enables attackers to gain access to the device.
Initiatives to secure the interoperability of medical devices include externally worn devices,3 for example, a trustworthy wrist-worn amulet,31 and software radio shields.13 Researchers have also created a prototype firewall to block hackers from interfering with wireless medical devices32 and to authenticate via physical contact and the comparison of ECG readings.28
Organizational. Security is most effective when designed into the system from the very initial development cycle. It is important to develop and maintain threat models and to assess risks during device development. A systematic plan for the provision of software updates and patches is needed. Last but not least, a security response team has to permanently identify, monitor, and resolve security incidents and security vulnerabilities. For that purpose, user facilities such as hospitals and clinics should be incentivized to report security occurrences. These reports can provide valuable insights into security problems of medical devices. In addition, we propose the definition of security and threat levels for medical devices with defined rules of action and an audit guideline for all involved stakeholders. The levels defined in the table here are a small first step in that direction. We imagine simple scores for medical devices that summarize their sensitivity, their impact as well as their exposure and their current threat level. Rule-based actions could then trigger needed actions to react to security-related incidents.
Regulations. It is important to know at any time the level of danger and to take appropriate countermeasures. Design and distribution of medical devices is tightly regulated. In the U.S., the FDA has the authority over medical device distribution. A device manufacturer has the responsibility for the approved configuration of the device. Device users, such as hospitals and patients, do not have access to a device’s software environment and cannot install additional security measures. Any upgrade or update (either added functionality or security measures) typically needs to be approved by the manufacturer. Thus, deployment of security-relevant upgrades typically gets delayed.36 Manufacturers, importers, and device user facilities are required to report specific device-related adverse events and product problems. Surveillance strategies must be reconsidered in order to effectively and efficiently collect data on security and privacy problems in medical devices.23 Some regulation aspects as well as the role of standards bodies, manufacturers, and clinical facilities have been discussed in Fu and Blum.12 We see a demand for action to adjust the increasing need for software updates for medical devices with the need to redo clinical trials after major changes.
Malware detection. Vulnerabilities are often unknown until malware exploiting those vulnerabilities is detected. We need methods to detect the presence of malware. Malware detection techniques include control-flow integrity verification, call stack monitoring, dataflow analysis, and multisource hash-based verification. Although software-based malware detection methods are suitable for traditional computing systems, the performance overhead may be prohibitive for medical devices with strict time constraints. Hardware-based detection methods can reduce or eliminate the performance overhead, but power consumption remains a challenge.
For medical devices, we need malware detection methods that are non-intrusive with very low power consumption, as power is a precious resource, especially in implantable devices. In order to provide resilience to zero-day exploits, anomaly-based malware detection methods will be needed. These methods rely on accurate models of normal system behavior, which will require both formal methods for modeling this behavior and tight integration with system design tasks. The importance of timing requirements in medical devices may provide a unique system feature that can be exploited to better detect malware.
Malware reaction. Detecting malware only addresses half of the problems. Once malware is detected, how should the medical device respond? Notification is a straightforward option, but it allows the malware to re-
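The anomaly-based detection idea described under "Malware detection" relies on a model of normal behavior, and timing is named as a feature that might be exploited. The sketch below is one simple way to picture that, assuming the model is just a tolerance band learned from malware-free execution times of a periodic task. The function names, the mean ± k standard deviations rule, and the sample timings are all assumptions made for illustration; they are not taken from the article or from any cited work.

```python
from statistics import mean, stdev


def fit_timing_model(samples_ms, k=3.0):
    """Learn a (low, high) band from timings recorded on a known-clean device.

    samples_ms: execution times of one periodic task, in milliseconds
    k:          width of the band in standard deviations (assumed value)
    """
    mu = mean(samples_ms)
    sigma = stdev(samples_ms)
    return (mu - k * sigma, mu + k * sigma)


def is_anomalous(observed_ms, model):
    """Flag an observation that falls outside the learned band."""
    low, high = model
    return not (low <= observed_ms <= high)


# Hypothetical timings of a pacing-control task on a clean device.
normal_runs = [4.0, 4.1, 3.9, 4.05, 3.95, 4.0]
model = fit_timing_model(normal_runs)

within_band = is_anomalous(4.1, model)   # a typical run
outside_band = is_anomalous(5.2, model)  # extra code on the path adds delay
```

A check of this kind costs only two comparisons per observation, which matters under the power constraints mentioned above, but it can only be as good as the model of normal behavior it is given.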
unauthorized people. Technically viable systems may nonetheless be undesirable to patients.
The general population is increasingly concerned about the misuse of the Internet in many aspects of their daily life, for example, banking fraud or identity theft. As a cardiologist and electro-physiologist, one of the authors (P. Ott, M.D.) has observed an increase in awareness of security issues among patients, who question the safety of implanted devices in the digital realm. We expect such concerns will become even more pressing. A small study has shown that perceived security, safety, freedom from unwanted cultural and historical associations, and self-image must be taken into account when designing countermeasures for medical devices.5
We need more information about how concerned patients are about the security of the devices they are using. A user study could reveal what specific, additional steps patients are willing to take in order to increase security. This will give manufacturers valuable information. We will need to increase security awareness of all stakeholders, that is, manufacturers, patients, doctors, and medical institutions. Additionally, the devices’ security states must be more visible, understandable, and accessible for all stakeholders.
IT infrastructure. In order to protect medical devices, the surrounding IT environment must be secured as well. Focusing on medical devices, we will refrain from enumerating regular countermeasures found in IT security. These are appropriate for health care security or medical device security as well, for example, erasing hard disks before disposing of them, backing up data, or BYOD (bring your own device) policies. Off-the-shelf devices like smartphones or tablets also increasingly store, process, and transmit sensitive medical data. This data must be protected from malware on these devices.
IT infrastructure must guarantee privacy of medical data according to the Health Insurance Portability and Accountability Act (HIPAA). However, safety is at stake as well. For medical devices, it is important to keep in mind that regular IT devices also pose a threat to medical devices when they interoperate directly or indirectly. Most importantly, medical devices should always assume their surroundings might have been compromised.
Conclusion
Securing medical devices means protecting human life, human health, and human well-being. It is also about protecting and securing the privacy of sensitive health information. We see an increase in the use of mobile medical applications as well as an increase in medical devices that use wireless communication and utilize Internet connections. New sensing technology provides opportunities for telemedicine with the promise to make health care more cost effective. Unless appropriate countermeasures are taken, the doors stand wide open for the misuse of sensitive medical data and even for malware and attacks that put human life in danger.
References
1. Alemzadeh, H., Iyer, R.K. and Kalbarczyk, Z. Analysis of safety-critical computer failures in medical devices. IEEE Security & Privacy 11, 4 (July-Aug. 2013), 14–26.
2. Boston Scientific. PACEMAKER System Specification. 2007.
3. Denning, T., Fu, K. and Kohno, T. Absence makes the heart grow fonder: New directions for implantable medical device security. In Proceedings of USENIX Workshop on Hot Topics in Security, July 2008.
4. Denning, T., Matsuoka, Y. and Kohno, T. Neurosecurity: Security and privacy for neural devices. Neurosurgical Focus 27, 1 (July 2009).
5. Denning, T. et al. Patients, pacemakers, and implantable defibrillators: Human values and security for wireless implantable medical devices. In Proceedings of the 28th International Conference on Human Factors in Computing Systems, 2010.
6. Food and Drug Administration. MAUDE - Manufacturer and User Facility Device Experience; http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfMAUDE/search.CFM
7. Food and Drug Administration. Is The Product A Medical Device? http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/ucm051512.htm
8. Food and Drug Administration. Medical Devices – Classify Your Medical Device; http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/default.htm
9. Food and Drug Administration Safety Communication: Cybersecurity for Medical Devices and Hospital Networks; June 2013. http://www.fda.gov/MedicalDevices/Safety/AlertsandNotices/ucm356423.htm
10. Food and Drug Administration. Content of premarket submissions for management of cybersecurity in medical devices - Draft guidance for industry and Food and Drug Administration staff, June 14, 2013; http://www.fda.gov/medicalDevices/Deviceregulationandguidance/guidanceDocuments/ucm356186.htm
11. Fox News. Antivirus Program Goes Berserk, Freezes PCs. Apr. 22, 2010.
12. Fu, K. and Blum, J. Controlling for cybersecurity risks of medical device software. Commun. ACM 56, 10 (Oct. 2013), 35–37.
13. Gollakota, S. et al. They can hear your heartbeats: Non-invasive security for implantable medical devices. In Proceedings from SIGCOMM’11 (Toronto, Ontario, Canada, Aug. 15–19, 2011).
14. Halperin, D. et al. Security and privacy for implantable medical devices. IEEE Pervasive Computing, Special Issue on Implantable Electronics (Jan. 2008).
15. Halperin, D. et al. Pacemakers and implantable cardiac defibrillators: Software radio attacks and zero-power defenses. In Proceedings of the IEEE Symposium on Security and Privacy, May 2008.
16. Hansen, J.A. and Hansen, N.M. A taxonomy of vulnerabilities in implantable medical devices. In Proceedings of SPIMACS’10 (Chicago, IL, Oct. 8, 2010).
17. Howard, M. and Lipner, S. The Security Development Lifecycle. Microsoft Press, 2006.
18. International Standards Organization. Medical devices - Application of risk management to medical devices. ISO 14971:2007.
19. Jee, E. et al. A safety-assured development approach for real-time software. Proc. IEEE Int. Conf. Embed. Real-time Comput. Syst. Appl. (Aug. 2010), 133–142.
20. Kaplan, D. Black Hat: Insulin pumps can be hacked. SC Magazine (Aug. 04, 2011).
21. King, S.T. et al. Designing and implementing malicious hardware. In Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats. Fabian Monrose, ed. USENIX Association, Berkeley, CA.
22. Kolata, G. Of fact, fiction and Cheney’s defibrillator. New York Times (Oct. 27, 2013).
23. Kramer, D.B. et al. Security and privacy qualities of medical devices: An analysis of FDA postmarket surveillance. PLoS ONE 7, 7 (2012), e40200; doi:10.1371/journal.pone.0040200
24. Li, C., Raghunathan, A. and Jha, N.K. Improving the trustworthiness of medical device software with formal verification methods. IEEE Embedded Systems Letters 5, 3 (Sept. 2013), 50–53.
25. McGraw, G. Software security. IEEE Security & Privacy 2, 2 (Mar-Apr 2004), 80–83.
26. Nixon, C. et al. Academic Dual Chamber Pacemaker. University of Minnesota, 2008.
27. Ross, R.S. Guide for Conducting Risk Assessments. NIST Special Publication 800-30 Rev. 1, Sept. 2012.
28. Rostami, M., Juels, A. and Koushanfar F. Heart-to-Heart (H2H): Authentication for implanted medical devices. In Proceedings for ACM SIGSAC Conference on Computer & Communications Security. ACM, New York, NY, 1099–1112.
29. Sanger, D.E. and Shanker, T. N.S.A. devises radio pathway into computers. New York Times (Jan. 14, 2014).
30. Skorobogatov, S. and Woods, C. Breakthrough silicon scanning discovers backdoor in military chip, cryptographic hardware and embedded systems. Lecture Notes in Computer Science 7428 (2012), 23–40.
31. Sorber, J. et al. An amulet for trustworthy wearable mHealth. In Proceedings of the 12th Workshop on Mobile Computing Systems & Applications. ACM, New York, NY.
32. Venere, E. New firewall to safeguard against medical-device hacking. Purdue University News Service, Apr. 12, 2012.
33. Vockley, M. Safe and Secure? Healthcare in the cyberworld. AAMI (Advancing Safety in Medical Technology) BI&T – Biomedical Instrumentation & Technology, May/June 2012.
34. Weaver, C. Patients put at risk by computer viruses. Wall Street Journal (June 13, 2013).
35. Wei, S., Potkonjak, M. The undetectable and unprovable hardware Trojan horse. In Proceedings of the ACM Design Automation Conference (Austin, TX, May 29–June 07, 2013).
36. Wirth, A. Cybercrimes pose growing threat to medical devices. Biomed Instrum Technol. 45, 1 (Jan/Feb 2011), 26–34.
37. World Health Organization. Medical device regulations: Global overview and guiding principles. 2003.
Johannes Sametinger (johannes.sametinger@jku.at) is an associate professor in the Department of Information Systems at the Johannes Kepler University Linz, Austria.
Jerzy Rozenblit (jr@ece.arizona.edu) is Distinguished Professor in the Department of Electrical and Computer Engineering/Dept. of Surgery at the University of Arizona, Tucson, AZ.
Roman Lysecky (rlysecky@ece.arizona.edu) is an associate professor in the Department of Electrical and Computer Engineering at the University of Arizona, Tucson, AZ.
Peter Ott (ottp@email.arizona.edu) is an associate professor in the College of Medicine, Sarver Heart Center at the University of Arizona, Tucson, AZ.
© 2015 ACM 0001-0782/15/04 $15.00
research highlights
DOI:10.1145/2735839
Technical Perspective
To view the accompanying paper, visit doi.acm.org/10.1145/2735841
Convolution Engine:
Balancing Efficiency and Flexibility
in Specialized Computing
By Wajahat Qadeer, Rehan Hameed, Ofer Shacham, Preethi Venkatesan, Christos Kozyrakis, and Mark Horowitz
data-paths and storage structures allowing hundreds of These two constraints seem contradictory as performing
low-energy operations to be performed for each instruc- hundreds of operations per cycle would generally necessitate
tion and data fetched. Processors can enjoy similar energy reading large amounts of data from the memory. These condi-
gains if they target computational motifs and data-flow tions could be reconciled, however, for algorithms where most
patterns common to a wide-range of kernels in a domain. instructions either operate on intermediate results produced
Our CE implements a generalized map-reduce abstraction, by previous instructions, or reuse most of the input data used
which describes a surprisingly large number of operations by the previous instructions. If an adequate storage structure
in image processing domain. The resulting design achieves is in place to retain this “past data” in the processor data-path,
up to two orders of magnitude lower energy consumption then large number of operations can be performed per instruc-
compared to a general-purpose processor and comes within 2–3× of dedicated hardware accelerators.
The next section provides an overview of why general processors consume so much power and the limitations of existing optimization strategies. Section 3 then introduces the convolution abstraction and the five application kernels we target in this study. Section 4 describes the CE architecture focusing primarily on features that improve energy efficiency and/or allow for flexibility and reuse. We then compare this CE to both general-purpose cores with SIMD extensions and to highly customized solutions for individual kernels in terms of energy and area efficiency. Section 5 shows that the CE is within a factor of 2–3× of custom units and almost 10× better than the SIMD solution for most applications.
2. BACKGROUND
The low efficiency of general-purpose processors is explained in Figure 1, which compares the energy dissipation of various arithmetic operations with the overall instruction energy for an extremely simple RISC processor. The energy dissipation of arithmetic operations that perform the useful work in a computation remains much lower than the energy wasted in the instruction overheads such as instruction fetch, decode, pipeline management, program sequencing, etc. The overhead is even worse for media processing applications, which typically operate on short data requiring just 0.2–0.5 pJ (90 nm) of energy per operation, with the result that over 99% of the energy goes into overheads.
These overheads must be drastically reduced to increase energy efficiency. That places two constraints on the processor design: (i) the processor must execute hundreds of operations per instruction to sufficiently amortize instruction cost and (ii) the processor must also fetch little data, since even a cache hit costs 25 pJ (90 nm) per memory fetch, compared to 0.2–0.5 pJ for the arithmetic operations.
tion without needing frequent trips to the memory. Fortunately, compute-bound applications including most image processing and video processing algorithms are a good match for these constraints, providing large data-parallelism and data-reuse.
Most high-performance processors today already include SIMD units, which are widely regarded as the most efficient general-purpose optimization for compute-intensive applications. SIMD units typically achieve an order of magnitude energy reduction by simultaneously operating on many data operands in a single cycle (typically 8–16). However, as explained in Hameed et al.7 the resulting efficiency remains two orders of magnitude less than specialized hardware accelerators, as the SIMD model does not scale well to larger degrees of parallelism.
To better understand the architectural limitations of traditional SIMD units, consider the two-dimensional sum of absolute difference operation (SAD) operating on a 16-bit 8 × 8 block as shown in Listing 1.5 The 2D SAD operator is widely employed in multimedia applications such as the H.264 video encoder to find the closest match for a 2D image sub-block in a reference image or video frame. Listing 1 carries out this search for every location in a srchWinHeight × srchWinWidth search window in the reference frame, resulting in four nested loops. All four loops are independent and can be simultaneously parallelized. At the same time, each SAD output substantially reuses the input data used to compute previous outputs, both in vertical and horizontal directions. However, a typical SIMD unit with a register row size of 128 bits is only able to operate on elements that fit in one register row, limiting the parallelism to the innermost loop. Trying to scale up the SIMD width to gain more parallelism requires either simultaneously reading multiple image rows from the register file (to parallelize across the second-innermost loop), or simultaneously reading multiple overlapping rows of image data (to parallelize across multiple horizontal outputs). Neither support exists in the SIMD model.
Figure 1. Comparison of functional unit energy with that of a typical RISC instruction in 90 nm. Strategy for amortizing processor overheads includes executing hundreds of low-power operations per instruction.
[Figure 1 values: RISC instruction (overhead + ALU): 125 pJ; load/store (D-$ + overhead + ALU): 150 pJ; SP floating point: 15–20 pJ; 32-bit addition: 7 pJ; 8-bit addition: 0.2–0.5 pJ. Annotation: to get more than two orders of magnitude gain in efficiency, one instruction's overhead is shared by hundreds of operations.]
Listing 1. 2D 8 × 8 sum of absolute difference operation (SAD), commonly employed in H.264 motion estimation.
for (sWinY = 0; sWinY < srchWinHeight; sWinY++) {
    for (sWinX = 0; sWinX < srchWinWidth; sWinX++) {
        sad = 0;
        for (y = 0; y < 8; y++) {
            for (x = 0; x < 8; x++) {
                cY = y + sWinY; cX = x + sWinX;
                sad += abs(ref[cY][cX] - cur[y][x]);
            }
        }
        outSad[sWinY][sWinX] = sad;
    }
}
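Listing 1 omits declarations and array shapes. The following is a minimal self-contained sketch of the same four-loop search, written under stated assumptions: the names sad_search and best_match are hypothetical, the images are stored as flat row-major arrays of 16-bit samples, and the reference region is assumed to hold win_h + 7 rows of at least win_w + 7 samples so that every 8 × 8 window stays in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BLK = 8 };   /* block edge length used in Listing 1 */

/* Computes the SAD between the BLK x BLK block `cur` and every BLK x BLK
 * window of `ref` inside a win_h x win_w search window.
 * ref_stride is the number of samples per reference row. */
void sad_search(const uint16_t *ref, size_t ref_stride,
                const uint16_t *cur,
                size_t win_h, size_t win_w,
                uint32_t *out_sad /* win_h * win_w entries */)
{
    assert(ref != NULL && cur != NULL && out_sad != NULL);
    assert(ref_stride >= win_w + BLK - 1);

    for (size_t wy = 0; wy < win_h; wy++) {
        for (size_t wx = 0; wx < win_w; wx++) {
            uint32_t sad = 0;
            for (size_t y = 0; y < BLK; y++) {
                for (size_t x = 0; x < BLK; x++) {
                    int32_t d = (int32_t)ref[(wy + y) * ref_stride + wx + x]
                              - (int32_t)cur[y * BLK + x];
                    sad += (uint32_t)(d < 0 ? -d : d);
                }
            }
            out_sad[wy * win_w + wx] = sad;
        }
    }
}

/* Returns the index of the smallest SAD (first one on ties). */
size_t best_match(const uint32_t *out_sad, size_t count)
{
    assert(out_sad != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (out_sad[i] < out_sad[best]) {
            best = i;
        }
    }
    return best;
}
```

Neighbouring window positions read almost the same reference samples, which is the data reuse the text describes; a plain scalar loop like this one reloads them for every output.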
A P R I L 2 0 1 5 | VO L. 58 | N O. 4 | C OM M U N IC AT ION S OF T HE ACM 87
research highlights
operator previously described in Section 2. Note how SAD fits quite naturally to a CE abstraction: the map function is absolute difference and the reduce function is summation.
Fractional Motion Estimation (FME): FME refines the initial match obtained at the IME step to a quarter-pixel resolution. It first up-samples the block selected by IME, and then performs a slightly modified variant of the SAD operation. Up-sampling also fits nicely to the convolution abstraction and actually includes two convolution operations: first, the image block is up-sampled by two using a six-tap separable 2D filter. This part is purely convolution. Second, the resulting image is up-sampled by another factor of two by interpolating adjacent pixels, which can be defined as a map operator (to generate the new pixels) with no reduce.
3.2. SIFT
Scale Invariant Feature Transform (SIFT) looks for distinctive features in an image.10 To ensure scale invariance, Gaussian blurring and down-sampling are performed on the image to create a pyramid of images at coarser and coarser scales. A Difference-of-Gaussian (DoG) pyramid is then created by computing the difference between every two adjacent image scales. Features of interest are then found by looking at the scale-space extrema in the DoG pyramid.10
Gaussian blurring and down-sampling are naturally 2D convolution operations. Finding scale-space extrema is a 3D stencil computation, but we can convert it into a 2D stencil operation by interleaving rows from different images into a single buffer. The extrema operation is mapped to convolution using compare as a map operator and logical AND as the reduce operator.
3.3. Demosaic
Camera sensor output is typically a red, green, and blue (RGB) color mosaic laid out in a Bayer pattern.3 At each location, the two missing color values are then interpolated using the luminance and color values in surrounding cells. Because the color information is undersampled, the interpolation is tricky; any linear approach yields color fringes. We use an implementation of Demosaic that is based upon adaptive color plane interpolation (ACPI),8 which computes image gradients and then uses a three-tap filter in the direction of smallest gradient. While this fits the generalized convolution flow, it requires a complex “reduction” tree to implement the gradient-based selection. The data access pattern is also nontrivial since individual color values from the mosaic must be separated before performing interpolation.
4. CONVOLUTION ENGINE
Convolution operators are highly compute-intensive, particularly for large stencil sizes, and being data-parallel they lend themselves to vector processing. However, as explained earlier, existing SIMD units are limited in the extent to which they can exploit the inherent parallelism and locality of convolution due to the organization of their register files. The CE overcomes these limitations with the help of shift register structures. As shown in Figure 3 for the 2D convolution case, when such a storage structure is augmented with an ability to generate multiple shifted versions of the input data, it can fill 128 ALUs from just a small 16 × 8 2D register with low access energy as well as area. Similar gains are possible for 1D horizontal and 1D vertical convolutions. As we will see shortly, the CE facilitates further reductions in energy overheads by creating the fused super-instructions introduced in Section 3.
The CE is developed as a domain-specific hardware extension to Tensilica’s extensible RISC cores.6 The extension hardware is developed using Tensilica’s TIE language.14 The next sections discuss the key blocks in the CE extension hardware, depicted in Figure 4.
4.1. Register files
The 2D shift register is used for vertical and 2D convolution flows and supports vertical row shift: one new row of pixel data is shifted in as the 2D stencil moves vertically down into the image. The 2D shift register provides simultaneous access to all of its elements, enabling the interface unit to feed any data element to the ALUs. The 1D shift register is used to supply data for the horizontal convolution flow. New image pixels are shifted horizontally into the 1D register as the 1D stencil moves over an image row.
The 2D Coefficient Register stores data that does not change as the stencil moves across the image. This can be filter coefficients, current image pixels in IME for performing SAD, or pixels at the center of Windowed Min/Max stencils. The results of convolution operations are either written back to the 2D Shift Register or the Output Register. A later
arbitrary maps.
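The map and reduce pairing described for SAD, filtering, and extrema detection can be sketched in software as a stencil loop parameterized by two small functions. This is only an illustrative model, not the CE hardware or its programming interface: ce_model_1d and the map and reduce helpers are hypothetical names, the stencil is 1D for brevity, and the kernel is applied without flipping (the correlation form that the listings in this article also use).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*map_fn)(int32_t data, int32_t coeff);
typedef int32_t (*reduce_fn)(int32_t acc, int32_t value);

/* Map operators: applied to each (data, coefficient) pair. */
int32_t map_mul(int32_t d, int32_t c)     { return d * c; }
int32_t map_absdiff(int32_t d, int32_t c) { int32_t t = d - c; return t < 0 ? -t : t; }
int32_t map_greater(int32_t d, int32_t c) { return c > d; }   /* stored value exceeds data */

/* Reduce operators: combine the mapped values of one stencil position. */
int32_t reduce_add(int32_t a, int32_t v)  { return a + v; }
int32_t reduce_and(int32_t a, int32_t v)  { return a && v; }

/* Slides a `taps`-wide stencil over `in` and writes n_in - taps + 1 outputs:
 * out[i] = reduce over k of map(in[i + k], coeff[k]). Returns the output count. */
size_t ce_model_1d(const int32_t *in, size_t n_in,
                   const int32_t *coeff, size_t taps,
                   map_fn map, reduce_fn reduce, int32_t *out)
{
    assert(in != NULL && coeff != NULL && out != NULL);
    assert(taps > 0 && n_in >= taps);

    size_t n_out = n_in - taps + 1;
    for (size_t i = 0; i < n_out; i++) {
        int32_t acc = map(in[i], coeff[0]);
        for (size_t k = 1; k < taps; k++) {
            acc = reduce(acc, map(in[i + k], coeff[k]));
        }
        out[i] = acc;
    }
    return n_out;
}
```

Calling ce_model_1d with map_mul and reduce_add gives a filter, map_absdiff and reduce_add gives SAD, and map_greater with reduce_and gives a simplified "larger than every neighbour" test in the spirit of the extrema mapping.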
[Figure residue, labels only: 2D shift register elements R0,0 to R7,15 and coefficient elements; generate “2D” shifted versions; multiplexer; broadcast; 128 multipliers/ALUs; reduction; Out0, Out1.]
Functional units. Since all data rearrangement is handled by the interface unit, the functional units are just an array of short fixed-point two-input arithmetic ALUs. In addition to multipliers, we support absolute difference to facilitate SAD and other typical arithmetic operations such as addition, subtraction, and comparison. The output of the ALU is fed to the Reduce stage.
Reduce unit. The reduce part of the map-reduce operation is handled by a programmable reduce stage. Based upon the needs of our applications, we currently support arithmetic and logical reduction stages. The degree of reduction is dependent on the kernel size; for example, a 4 × 4 2D kernel requires a 16-to-1 reduction, whereas an 8-to-1 reduction is needed for an 8-tap 1D kernel. Thus, the reduction stage is implemented as a combining tree and outputs can be tapped out from multiple stages of the tree.
To enable the creation of “super instructions” described in Section 3, we augment the combining tree to handle noncommutative operations by adding support for diverse arithmetic operations at different levels of the tree. This fusion increases the computational efficiency by reducing the number of required instructions and by eliminating temporary storage of intermediate data in register files. Because this more complex data combination need not be commutative, the right data (output of the map operation) must be placed on each input to the combining network. Thus, a “Data Shuffle Stage” is also added to the CE in the form of a very flexible swizzle network that provides permutations of the input data.
Figure 4. Block diagram of convolution engine. The interface units (IF) connect the register files to the functional units and provide shifted broadcast to facilitate convolution. The data shuffle (DS) stage combined with the instruction graph fusion (IGF) stage creates the generalized reduction unit, called the complex graph fusion unit.
[Figure 4 labels: Load/Store IF; 2D Shift Register; 1D Shift Register; 2D Coefficient Register; Output Register file; Data Shuffle Stage; Row Select; Horizontal IF; Column IF; 2D IF; 1D IF; ALU Input Port 1; ALU Input Port 2; SIMD ALUs (MAP).]
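The idea of a combining tree whose outputs are tapped at different levels can be illustrated with a small software model. This is a sketch under stated assumptions, not the CE reduction unit: tree_reduce is a hypothetical name, the only combining operation modelled is addition, and the lane count and group size are assumed to be powers of two of at most 128.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_LANES = 128 };   /* matches the 128-ALU array size in the text */

static int is_pow2(size_t v) { return v != 0 && (v & (v - 1)) == 0; }

/* Sums each consecutive `group` of lanes with a pairwise combining tree and
 * taps the tree at the level where every node covers `group` inputs.
 * Writes n / group results to `out` and returns that count. */
size_t tree_reduce(const int32_t *in, size_t n, size_t group, int32_t *out)
{
    assert(in != NULL && out != NULL);
    assert(n <= MAX_LANES && is_pow2(n) && is_pow2(group) && group <= n);

    int32_t level[MAX_LANES];
    memcpy(level, in, n * sizeof *in);

    size_t width = n;
    for (size_t span = 1; span < group; span *= 2) {
        width /= 2;
        /* One tree level: node i combines children 2i and 2i + 1. */
        for (size_t i = 0; i < width; i++) {
            level[i] = level[2 * i] + level[2 * i + 1];
        }
    }
    memcpy(out, level, width * sizeof *out);
    return width;
}
```

With 128 mapped values, a group of 16 taps the tree after four levels and yields eight 16-to-1 results (the 4 × 4 kernel case), while a group of 64 yields two 64-to-1 results (the 8 × 8 kernel case).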
unit for such computation, the CE allows the fixed function block access to its Output Register File. This model is similar to a GPU, where custom blocks are employed for rasterization and similar tasks and work alongside the shader cores. For these applications, we created three custom functional blocks to compute motion vector costs in IME and FME and the Hadamard Transform in FME.
4.4. Resource sizing
Energy efficiency considerations and resource requirements of target applications drive the sizes of various resources within the CE. As shown in Hameed et al.,7 amortizing the instruction cost requires performing hundreds of ALU operations per instruction for media processing applications based on short 8-bit addition/subtraction operations. Many convolution flow applications are, however, based on higher energy multiplication operations. Our analysis shows that for multiplication-based algorithms, 50–100 operations per instruction are enough to provide sufficient amortization. Increasing the number of ALUs much further than that gives diminishing returns and increases the size of storage required to keep these units busy, thus increasing storage area and data-access energy. Thus for this study we choose an ALU array size of 128 ALUs, and size the rest of the resources accordingly to keep these ALUs busy. To provide further flexibility we allow powering off half of the ALU array and compute structures. The size and capability of each resource is presented in Table 2. These resources support filter sizes of 4, 8, and 16 for 1D filtering and 4 × 4, 8 × 8, and 16 × 16 for 2D filtering. Notice that the register file sizes deviate from powers of 2 to efficiently handle boundary conditions common in convolution operations.
of values from 2D and coefficient registers, performs the convolution and writes the result into row 0 of the 2D output register.
The code example in Listing 2 brings it all together and implements a 2D 8 × 8 filter. First the CE is set to perform multiplication at the MAP stage and addition at the reduce stage, which are the required settings for filtering. Then the convolution size is set, which controls the pattern in which data is fed from the registers to the ALUs. Filter tap coefficients
Table 3. Major instructions added to processor ISA.
Instruction        Description
SET_CE_OPS         Set arithmetic functions for MAP and REDUCE steps
SET_CE_OPSIZE      Set convolution size
LD_COEFF_REG_n     Load n bits to specified row of 2D coeff register
LD_1D_REG_n        Load n bits to 1D shift register. Optional shift left
LD_2D_REG_n        Load n bits to top row of 2D shift register. Optional shift row down
ST_OUT_REG_n       Store top row of 2D output register to memory
CONVOLVE_1D_HOR    1D convolution step: input from 1D shift register
CONVOLVE_1D_VER    1D convolution step: column access to 2D shift register
CONVOLVE_2D        2D convolution step with 2D access to 2D shift register
Listing 2. Example C code implements 8 × 8 2D filter for a vertical image stripe and adds 2 to each output.
// Set MAP function = MULT, Reduce function = ADD
SET_CE_OPS(CE_MULT, CE_ADD);
// Set convolution size 8
SET_CE_OPSIZE(8);
another output.
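Since only the first lines of Listing 2 survive here, the following plain C model may help make the vertical-stripe flow concrete. It is a sketch under stated assumptions and does not use the CE instructions of Table 3: filter_stripe is a hypothetical name, the row window of pointers merely stands in for the 2D shift register, and the extra offset mentioned in the Listing 2 caption is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { K = 8 };   /* filter size, matching the 8 x 8 case in the text */

/* Applies a K x K filter to a rows x cols image stripe (row-major).
 * Each step shifts one new image row into a K-row window, as the stencil
 * moves down the stripe, and then produces one row of outputs.
 * `out` receives (rows - K + 1) * (cols - K + 1) values. */
void filter_stripe(const int16_t *img, size_t rows, size_t cols,
                   const int16_t *coeff /* K * K taps */, int32_t *out)
{
    assert(img != NULL && coeff != NULL && out != NULL);
    assert(rows >= K && cols >= K);

    const size_t out_w = cols - K + 1;
    const int16_t *window[K];

    /* Preload the first K - 1 rows into slots 1..K-1. */
    for (size_t r = 0; r + 1 < K; r++) {
        window[r + 1] = img + r * cols;
    }

    for (size_t r = K - 1; r < rows; r++) {
        /* Row shift: drop the oldest row and append the new one. */
        for (size_t i = 0; i + 1 < K; i++) {
            window[i] = window[i + 1];
        }
        window[K - 1] = img + r * cols;

        size_t oy = r - (K - 1);
        for (size_t ox = 0; ox < out_w; ox++) {
            int32_t acc = 0;                       /* reduce: addition */
            for (size_t ky = 0; ky < K; ky++) {
                for (size_t kx = 0; kx < K; kx++) {
                    /* map: multiplication of pixel and coefficient */
                    acc += (int32_t)window[ky][ox + kx] * coeff[ky * K + kx];
                }
            }
            out[oy * out_w + ox] = acc;
        }
    }
}
```

Only one new row is brought in per output row; the other seven rows are reused from the window, which mirrors the reuse the shift registers are meant to provide.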
5. EVALUATION
To evaluate the efficiency of the CE, we map each target application described in Section 3 on a chip multiprocessor (CMP) comprised of two CEs. To quantify the performance and energy cost of such a programmable unit, we also built
Figure 5. Executing an 8 × 8 2D filter on CE. The grayed out boxes represent units not used in the example.
[Figure 5 labels: Load/Store IF with 256-bit ports; 8 pixel rows; 8 × 8 coefficient block; filtered data; 1D Shift Register; 2D Shift Register; 2D Coefficient Register; Output Register file; Data Shuffle Stage; Row Select; Horizontal IF; Column IF; 2D IF; 1D IF; ALU input ports 1 and 2 (128 × 10-bit); SIMD ALUs performing 128 × 10-bit multiplies (MAP); Instruction Graph Fusion / multi-level reduction tree, up to 128:1, 2 × 64:1 reduction (REDUCE).]
Figure 7. Ops/mm2 normalized to custom implementation: number of image blocks each core processes in 1 second, divided by the area of the core. For H.264 an image block is a 16 × 16 macroblock and for SIFT and Demosaic it is a 64 × 64 image block.
[Figure 7 plot: y-axis Ops/mm2 normalized to custom (higher is better); x-axis SIFT-DoG, SIFT-Extrema, H.264-FME, H.264-IME, Demosaic; series Custom, Convolution Engine, SIMD.]
interconnection energy was then added to energy estimates from Tensilica tools. The simulation results employ 90 nm technology at 1.1 V operating voltage with a target frequency of 450 MHz. All units are pipelined appropriately to achieve the frequency target.
Figure 8. Change in energy consumption as programmability is incrementally added to the core.
CAREERS
TENURE-TRACK AND TENURED POSITIONS IN ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
The newly launched ShanghaiTech University invites highly qualified candidates to fill multiple tenure-track/tenured faculty positions as its core team in the School of Information Science and Technology (SIST). Candidates should have exceptional academic records or demonstrate strong potential in cutting-edge research areas of information science and technology. They must be fluent in English. Overseas academic connection or background is highly desired.
ShanghaiTech is built as a world-class research university for training future generations of scientists, entrepreneurs, and technological leaders. Located in Zhangjiang High-Tech Park in the cosmopolitan Shanghai, ShanghaiTech is ready to trail-blaze a new education system in China. Besides establishing and maintaining a world-class research profile, faculty candidates are also expected to contribute substantially to graduate and undergraduate education within the school.
Academic Disciplines: We seek candidates in all cutting edge areas of information science and technology. Our recruitment focus includes, but is not limited to: computer architecture and technologies, nano-scale electronics, high speed and RF circuits, intelligent and integrated signal processing systems, computational foundations, big data, data mining, visualization, computer vision, bio-computing, smart energy/power devices and systems, next-generation networking, as well as inter-disciplinary areas involving information science and technology.
Compensation and Benefits: Salary and startup funds are highly competitive, commensurate with experience and academic accomplishment. We also offer a comprehensive benefit package to employees and eligible dependents, including housing benefits. All regular ShanghaiTech faculty members will be within its new tenure-track system commensurate with international practice for performance evaluation and promotion.
Qualifications:
• A detailed research plan and demonstrated record/potentials;
• Ph.D. (Electrical Engineering, Computer Engineering, Computer Science, or related field)
• A minimum relevant research experience of 4 years.
Applications: Submit (in English, PDF version) a cover letter, a 2-page research plan, a CV plus copies of 3 most significant publications, and names of three referees to: sist@shanghaitech.edu.cn (until positions are filled). For more information, visit http://www.shanghaitech.edu.cn.
Deadline: April 30, 2015
Rutgers, The State University of New Jersey
Assistant Professor
The Department of Management Science and Information Systems of Rutgers Business School-Newark and New Brunswick invites applications for a tenure-track position at the Assistant Professor rank to start in September 2015.
This position is focused in the area of information systems and the candidate must be an active researcher and have strong record of scholarly excellence. Special consideration will be given to candidates with knowledge in any of the following areas: data mining, machine learning, security, data management and analytical methods related to business operations.
A letter of application articulating the candidate’s fit (in terms of research and teaching) with the position description, a curriculum vitae, and the names and contact information of three persons that can provide references should be sent electronically to Luz Kosar at: kosar@business.rutgers.edu
Luz Kosar,
MSIS
Rutgers Business School – Newark and New Brunswick
1 Washington Park #1068
Newark, New Jersey 07102-1895
University of Central Missouri
Assistant Professor of Computer Science
The Department of Mathematics and Computer Science at the University of Central Missouri is accepting applications for four tenure-track and several non-tenure track positions in Computer Science beginning August 2015 at the rank of Assistant Professor. The UCM Computer Science program has 18 full time faculty and offers undergraduate and master programs in Computer Science. The department is expecting to launch a cybersecurity program in Fall 2015. We are looking for faculty excited by the prospect of shaping our department’s future and contributing to its sustained excellence.
Tenure Track Positions (#997516 and #997517): Ph.D. in Computer Science by August 2015 is required. All areas in computer science will be considered with preference given to candidates with expertise in Big Data Analytics, Cybersecurity, Machine Learning or Software Engineering.
Non-Tenure Track Positions (#997495): Ph.D. in Computer Science or a closely related area is preferred. ABD will be considered. Previous college/university teaching experience is highly desirable.
To apply online, go to https://jobs.ucmo.edu. Apply to positions #997516, #997517 or #997495. Initial screening of applications begins March 1, 2015, and continues until positions are filled. Contact: Dr. Songlin Tian. Email: tian@ucmo.edu. Phone: 660-543-4930. Fax: 660-543-8013
Faculty Positions in the Institute for Advanced Computational Science
Applications are invited for four tenure-track faculty positions of any rank (including endowed chairs), in applied mathematics and computer science in the Institute for Advanced Computational Science (IACS) at Stony Brook University. Candidates wishing to apply should have a doctoral degree in Applied Mathematics or Computer Science, though a degree in related fields may be considered. Ten years of faculty or professional experience is required for a senior position along with a demonstrated record of publications and research funding. A demonstrated record of publications and a demonstrated potential for research funding is required for any junior faculty. The selected candidate is expected to participate in interdisciplinary program development within the Institute and to establish a research program with a solid funding base through both internal and external collaborations. Of specific interest is research in, for example, programming models, algorithms, or numerical representations that advance scientific productivity or broaden the benefit and impact of high-performance computing. The selected candidates will have access to world-class facilities including those at nearby Brookhaven National Laboratory.
The Institute for Advanced Computational Science (http://iacs.stonybrook.edu/) was established in 2012 with an endowment of $20M, including $10M from the Simons Foundation. The current ten faculty members will double in number over the next few years to span all aspects of computation with the intent of creating a vibrant multi-disciplinary program. IACS seeks to make sustained advances in the fundamental techniques of computation and in high-impact applications including engineering and the physical, life, and social sciences. Our integrated, multidisciplinary team of faculty, students, and staff overcome the limitations at the very core of how we compute, collectively take on challenges of otherwise overwhelming complexity and scale, and individually and jointly define new frontiers and opportunities for discovery through computation. In coordination with the Center for Scientific Computing at Brookhaven National Laboratory, our dynamic and diverse institute serves as an ideal training and proving ground for new generations of students and researchers, and provides computational leadership and resources across the SBU campus and State of New York.
The search will remain open until suitable candidates are found with the first round of applications due May 15, 2015. All candidates must submit the required documentation online through the link provided below. Please input a cover letter, your curriculum vitae, a research plan (max. 2 pages) which should also describe how graduate and undergraduate students participate, a one-page statement of your teaching philosophy, a publication list, your funding record, and three reference letters to: https://iacs-hiring.cs.stonybrook.edu.
Stony Brook University/SUNY is an equal opportunity, affirmative action employer.
last byte
Future Tense
The Wealth of Planets
Launch swarms of self-replicating robots
to exploit the most lucrative of resources.
Our lunar complexes were built by commercially available robot fabricator cascades. You’ve all seen their work, powered by free solar energy, robots building robots building robots until the work force is large enough
tels like this or for space tourism … Yet. But that makes it all the more attractive for this reason: A robot army can mine the planet without restrictions like the lunar zoning regulations that just passed here.
even radioactive potassium, and obtain fuel for the greatest growth investment ever. There’s a total of 10^16 kg of U and Th in the planet. Nuclear breeder reactions with that fuel would yield more than [CONTINUED ON P. 95]