COMMUNICATIONS OF THE ACM
CACM.ACM.ORG · 09/2017 · VOL. 60 NO. 09
Moving Beyond the Turing Test with the Allen AI Science Challenge
Association for Computing Machinery

Previous A.M. Turing Award Recipients
By Lawrence M. Fisher
42 The Calculus of Service Availability
You're only as available as the sum of your dependencies.
By Ben Treynor, Mike Dahlin, Vivek Rau, and Betsy Beyer

48 Data Sketching
The approximate approach is often faster and more efficient.
By Graham Cormode

56 10 Ways to Be a Better Interviewer
Plan ahead to make the interview a successful one.
By Kate Matsudaira

60 Moving Beyond the Turing Test with the Allen AI Science Challenge
Answering questions correctly from standardized eighth-grade science tests is itself a test of machine intelligence.
By Carissa Schoenick, Peter Clark, Oyvind Tafjord, Peter Turney, and Oren Etzioni
Watch the authors discuss their work in this exclusive Communications video: https://cacm.acm.org/videos/moving-beyond-the-turing-test

65 Trust and Distrust in Online Fact-Checking Services
Even when checked by

72 Security in High-Performance Computing Environments
Exploring the many distinctive elements that make securing HPC systems much different than securing traditional systems.
By Sean Peisert

Research Highlights

82 Technical Perspective: A Gloomy Look at the Integrity of Hardware
By Charles (Chuck) Thacker

83 Exploiting the Analog Properties of Digital Circuits for Malicious Hardware
By Kaiyuan Yang, Matthew Hicks, Qing Dong, Todd Austin, and Dennis Sylvester

92 Technical Perspective: Humans and Computers Working Together on Hard Tasks
By Ed H. Chi

93 Scribe: Deep Integration of Human and Machine Intelligence to Caption

Articles' development led by queue.acm.org

Photo by Taffpixture; robot illustration by Peter Crowther Associates
Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.
ACM, the world's largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field's premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO: Bobby Schnabel
Deputy Executive Director and COO: Patricia Ryan
Director, Office of Information Systems: Wayne Graves
Director, Office of Financial Services: Darren Ramdin
Director, Office of SIG Services: Donna Cappo
Director, Office of Publications: Scott E. Delman

ACM COUNCIL
President: Vicki L. Hanson
Vice-President: Cherri M. Pancake
Secretary/Treasurer: Elizabeth Churchill
Past President: Alexander L. Wolf
Chair, SGB Board: Jeanna Matthews
Co-Chairs, Publications Board: Jack Davidson and Joseph Konstan
Members-at-Large: Gabriele Anderst-Kotis; Susan Dumais; Elizabeth D. Mynatt; Pamela Samuelson; Eugene H. Spafford
SGB Council Representatives: Paul Beame; Jenna Neefe Matthews; Barbara Boucher Owens

BOARD CHAIRS
Education Board: Mehran Sahami and Jane Chu Prey
Practitioners Board: Terry Coatta and Stephen Ibaraki

REGIONAL COUNCIL CHAIRS
ACM Europe Council: Dame Professor Wendy Hall
ACM India Council: Srinivas Padmanabhuni
ACM China Council: Jiaguang Sun

PUBLICATIONS BOARD
Co-Chairs: Jack Davidson; Joseph Konstan
Board Members: Karin K. Breitman; Terry J. Coatta; Anne Condon; Nikil Dutt; Roch Guerrin; Chris Hankin; Carol Hutchins; Yannis Ioannidis; M. Tamer Ozsu; Eugene H. Spafford; Stephen N. Spencer; Alex Wade; Keith Webster

ACM U.S. Public Policy Office
1701 Pennsylvania Ave NW, Suite 300, Washington, DC 20006 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Mark R. Nelson, Executive Director

STAFF
Director of Publications: Scott E. Delman, cacm-publisher@cacm.acm.org
Executive Editor: Diane Crawford
Managing Editor: Thomas E. Lambert
Senior Editor: Andrew Rosenbloom
Senior Editor/News: Lawrence M. Fisher
Web Editor: David Roman
Rights and Permissions: Deborah Cotton
Editorial Assistant: Jade Morris
Art Director: Andrij Borys
Associate Art Director: Margaret Gray
Assistant Art Director: Mia Angelica Balaquiot
Production Manager: Bernadette Shade
Advertising Sales Account Manager: Ilia Rodriguez

Columnists: David Anderson; Phillip G. Armour; Michael Cusumano; Peter J. Denning; Mark Guzdial; Thomas Haigh; Leah Hoffmann; Mari Sako; Pamela Samuelson; Marshall Van Alstyne

CONTACT POINTS
Copyright permission: permissions@hq.acm.org
Calendar items: calendar@cacm.acm.org
Change of address: acmhelp@acm.org
Letters to the Editor: letters@cacm.acm.org

WEBSITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/about-communications/author-center

ACM ADVERTISING DEPARTMENT
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 626-0686; F (212) 869-0481
Advertising Sales Account Manager: Ilia Rodriguez, ilia.rodriguez@hq.acm.org
Media Kit: acmmediasales@acm.org

EDITORIAL BOARD
Editor-in-Chief: Andrew A. Chien, eic@cacm.acm.org
Senior Editor: Moshe Y. Vardi

NEWS
Co-Chairs: William Pulleyblank and Marc Snir
Board Members: Mei Kobayashi; Michael Mitzenmacher; Rajeev Rastogi; François Sillion

VIEWPOINTS
Co-Chairs: Tim Finin; Susanne E. Hambrusch; John Leslie King; Paul Rosenbloom
Board Members: William Aspray; Stefan Bechtold; Michael L. Best; Judith Bishop; Stuart I. Feldman; Peter Freeman; Mark Guzdial; Rachelle Hollander; Richard Ladner; Carl Landwehr; Carlos Jose Pereira de Lucena; Beng Chin Ooi; Loren Terveen; Marshall Van Alstyne; Jeannette Wing

PRACTICE
Co-Chairs: Stephen Bourne and Theo Schlossnagle
Board Members: Eric Allman; Samy Bahra; Peter Bailis; Terry Coatta; Stuart Feldman; Nicole Forsgren; Camille Fournier; Benjamin Fried; Pat Hanrahan; Tom Killalea; Tom Limoncelli; Kate Matsudaira; Marshall Kirk McKusick; Erik Meijer; George Neville-Neil; Jim Waldo; Meredith Whittaker

CONTRIBUTED ARTICLES
Co-Chairs: James Larus and Gail Murphy
Board Members: William Aiello; Robert Austin; Elisa Bertino; Gilles Brassard; Kim Bruce; Alan Bundy; Peter Buneman; Carl Gutwin; Yannis Ioannidis; Gal A. Kaminka; Karl Levitt; Igor Markov; Gail C. Murphy; Bernhard Nebel; Lionel M. Ni; Adrian Perrig; Sriram Rajamani; Marie-Christine Rousset; Krishan Sabnani; Ron Shamir; Yoav Shoham; Josep Torrellas; Michael Vitale; Hannes Werthner; Reinhard Wilhelm

RESEARCH HIGHLIGHTS
Co-Chairs: Azer Bestavros and Gregory Morrisett
Board Members: Martin Abadi; Amr El Abbadi; Sanjeev Arora; Michael Backes; Maria-Florina Balcan; Andrei Broder; Doug Burger; Stuart K. Card; Jeff Chase; Jon Crowcroft; Alexei Efros; Alon Halevy; Sven Koenig; Steve Marschner; Tim Roughgarden; Guy Steele, Jr.; Margaret H. Wright; Nicholai Zeldovich; Andreas Zeller

WEB
Chair: James Landay
Board Members: Marti Hearst; Jason I. Hong; Jeff Johnson; Wendy E. MacKay

ACM Copyright Notice
Copyright © 2017 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@hq.acm.org or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
An annual subscription cost is included in ACM member dues of $99 ($40 of which is allocated to a subscription to Communications); for students, cost is included in $42 dues ($20 of which is allocated to a Communications subscription). A nonmember annual subscription is $269.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current advertising rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0686.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact acmhelp@acm.org.

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to Communications of the ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA

Printed in the U.S.A.

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481
DOI:10.1145/3125780

Simson Garfinkel, Jeanna Matthews, Stuart S. Shapiro, and Jonathan M. Smith

Toward Algorithmic Transparency and Accountability

ALGORITHMS ARE REPLACING or augmenting human decision making in crucial ways. People have become accustomed to algorithms making all manner of recommendations, from products to buy, to songs to listen to, to social network connections. However, algorithms are not just recommending, they are also being used to make big decisions about people's lives, such as who gets loans, whose résumés are reviewed by humans for possible employment, and the length of prison terms. While algorithmic decision making can offer benefits in terms of speed, efficiency, and even fairness, there is a common misconception that algorithms automatically result in unbiased decisions. In reality, inscrutable algorithms can also unfairly limit opportunities, restrict services, and even improperly curtail liberty.

Information and communication technologies invariably raise these kinds of important public policy issues. How should self-driving cars be required to act? How private is information stored on a cellphone? Can electronic voting machines be trusted? How will the increasing uses of automation in the workplace impact workers? Since its founding, ACM's members have played a leading role in discussing these issues within the computing profession and with policymakers.

The ACM U.S. Public Policy Council (USACM) was established in the early 1990s as a focal point for ACM's interactions with U.S. government organizations, the computing community, and the public in all matters of U.S. public policy related to information technology. USACM came to prominence during the debates over cryptography and key escrow technology. Today, USACM continues to make public policy recommendations that are based on scientific evidence, follow recognized best practices in computing, and are grounded in the ACM Code of Ethics. It has established a reputation as a non-partisan, principled, and independent source of scientific and technical expertise, free from the influence of product vendors or other vested interests. More recently, the ACM Europe Council Policy Committee (EUACM) has been doing the same in Europe. USACM and EUACM, both separately and jointly, provide information and analysis to policymakers and the public regarding important societal issues involving IT, including algorithmic transparency and accountability.

USACM and EUACM have identified and codified a set of principles intended to ensure fairness in this evolving policy and technology ecosystem.(a) These are: (1) awareness; (2) access and redress; (3) accountability; (4) explanation; (5) data provenance; (6) auditability; and (7) validation and testing.

Awareness speaks to educating the public regarding the degree to which decision making is automated. Access and redress means there is a way to investigate and correct erroneous decisions. Accountability rejects the common deflection of blame to an automated system by ensuring those who deploy an algorithm cannot eschew responsibility for its actions. Explanation means the logic of the algorithm, no matter how complex, must be communicable in human terms. As many modern techniques are based on statistical analyses of large pools of collected data, decisions will be influenced by the choice of datasets for training, and thus knowing the data sources and their trustworthiness (that is, their provenance) is essential. Auditability for a decision-making system requires logging and record keeping, for example, for dispute resolution or regulatory compliance. Finally, validation and testing on an ongoing basis means that techniques such as regression tests, vetting of corner cases, or red-teaming strategies used in computer security should be employed to increase confidence in automated systems.

As organizations deploy complex algorithms for automated decision making, system designers should build these principles into their systems. In some cases, doing so will require additional research. For example, how to design and deploy large-scale neural networks while ensuring compliance with laws prohibiting discrimination against legally protected groups? This is especially crucial given the ability to infer characteristics such as gender, race, or disability status even if the computer system is not provided with that data directly. How should information on automated decisions be logged to ensure auditability? How can the operation of these networks be explained to technologists and non-technical policymakers alike?

One model for moving forward may be self-regulation by industry. Our experience, however, is that self-regulation is only possible when there is a consensus on a set of relevant standards. We hope our principles can serve as input to such an effort. If policymakers determine regulation is necessary, our principles are available, potentially in the way that the Code of Fair Information Practices provided a basis for decades of privacy regulation around the world.

USACM and EUACM seek input and involvement from ACM's members in providing technical expertise to decision makers on the often difficult policy questions relating to algorithmic transparency and accountability, as well as those relating to security, privacy, accessibility, intellectual property, big data, voting, and other technical areas. For more information, visit www.acm.org/public-policy/usacm or www.acm.org/euacm.

The authors are members of the ACM U.S. Public Policy Council, for which Stuart S. Shapiro (s_shapiro@acm.org) serves as chair.

(a) https://www.acm.org/binaries/content/assets/public-policy/2017_usacm_statement_algorithms.pdf

Copyright held by authors.
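The auditability and explanation principles lend themselves to a small concrete sketch: an append-only log that records each automated decision with enough context for later dispute resolution. The record fields, file name, and model name below are illustrative assumptions, not part of the USACM/EUACM statement.

```python
import hashlib
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class DecisionRecord:
    model_version: str   # which algorithm/model produced the decision
    inputs: dict         # the features the system actually used
    decision: str        # the outcome communicated to the person affected
    explanation: str     # human-readable rationale (principle 4: explanation)
    timestamp: float

def append_decision(log_path: str, rec: DecisionRecord) -> str:
    """Append one record as a JSON line; return a digest for later reference."""
    line = json.dumps(asdict(rec), sort_keys=True)
    digest = hashlib.sha256(line.encode()).hexdigest()
    with open(log_path, "a") as f:
        f.write(line + "\n")
    return digest

# Hypothetical usage: a loan-scoring system logging one declined application.
digest = append_decision("decisions.log", DecisionRecord(
    model_version="loan-scorer-1.4",
    inputs={"income": 52000, "credit_history_years": 7},
    decision="declined",
    explanation="debt-to-income ratio above threshold",
    timestamp=time.time(),
))
```

Keeping the log append-only and hashing each record gives a reviewer or regulator a stable reference for any single decision, which is the record-keeping the auditability principle calls for.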
DIVINATION IS THE practice of an occultic ritual as an aid in decision making. It has old historical roots. According to the biblical book of Samuel I, in the 11th century BCE, the Hebrew King Saul sought wisdom from the Witch of Endor, who summoned the dead prophet Samuel, before his impending battle with the Philistines. Alexander the Great, after conquering Egypt in 332 BCE, visited the Oracle of Amun at the Siwa Oasis to learn about his future prospects. Divination can be practiced in many ways, including sortilege (casting of lots), reading tea leaves or animal entrails, random querying of texts, and more. Divination has been dismissed as superstition since antiquity; the Greek scholar Lucian derided divination already in the 2nd century CE. Yet the practice persists.

Developments in mathematics and in computer science in the 20th century shed new light on the power of divination. Unless we believe that divination truly allows us to consult the divine, we can view it simply as a form of randomization, which is recognized as a powerful construct in game theory and algorithm design. The classical game-theoretic example is the game of Rock-Scissors-Paper, in which there is no Nash equilibrium of pure strategies, but there is a Nash equilibrium in which both players choose their actions uniformly at random. The classical Dining Philosophers Problem has no symmetric distributed deterministic solution but, as shown by Michael Rabin, has such a solution if we allow randomization. The essential insight is that randomization is a powerful way to deal with incomplete information. Thus, as realized by the anthropologist Michael Dove in the 1970s, when the Kantu people of Borneo use birdwatching to decide which sites to farm and which sites to leave fallow, they are simply randomizing in the face of uncertainty about rain, pests, and more, but this randomization comes with a belief in the divine source of the decision. (See the essay by Michael Schulson at https://goo.gl/RYb264.)

But what does this have to do with program committees? In 2014, the Neural Information Processing Systems (NIPS) Conference split the program committee into two independent committees, and then subjected 10% of the submissions, 166 papers, to decision making by both committees. The two committees disagreed on 43 papers. Given the NIPS paper acceptance rate of 25%, this means that close to 60% of the papers accepted by the first committee were rejected by the second one, and vice versa. (See the analysis by Eric Price at https://goo.gl/fy5jLR.) This high level of randomness came as a surprise to many people, but I found it quite expected. My own experience is that in a typical program-committee meeting there is broad agreement to accept about the top 10% of the papers, as well as broad agreement to reject about the bottom 25%. For the other 65% of the submissions, there is no agreement and the final accept/reject decision is fairly random. This is particularly true when the decision pivots on issues such as significance and interestingness, which can be quite subjective. Yet we seem to pretend that this random decision reflects the deep wisdom of the program committee.

I believe the NIPS experiment should not only teach us some humility, but should also suggest that we may want to reconsider the basic modus operandi of program committees. The standard approach in such committees can be viewed as "guilty until proven innocent." We expect only 25%–35% of the papers to be accepted, so the default decision is to reject unless there is strong agreement to accept. But the reality is that a different committee may have reached a different decision on the majority of accepted papers. Is it wise to reject papers based essentially on the whim of the program committee? If we switched mode to "innocent until proven guilty," we would reject only papers on which there is strong agreement to reject, and accept all other papers.

Beyond the increased fairness of "innocent until proven guilty," this approach would also increase the efficiency of the conference-publication system. A high rejection rate means that papers are submitted, resubmitted, and re-resubmitted, resulting in a very high reviewing burden on the community. It also results in the proliferation of conferences, which fragments research communities. As I argued in an earlier editorial (https://goo.gl/dUMkwZ), I believe the proper way to adapt to the growth of computing research is to grow our conferences rather than proliferate them.

NIPS should be lauded for applying the scientific method to the publication process. It is up to the computing-research community to draw the conclusions and act accordingly!

Follow me on Facebook, Google+, and Twitter.

Moshe Y. Vardi (vardi@cs.rice.edu) is the Karen Ostrum George Distinguished Service Professor in Computational Engineering and Director of the Ken Kennedy Institute for Information Technology at Rice University, Houston, TX. He is the former Editor-in-Chief of Communications.

Copyright held by author.
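The committee model described above, broad agreement on the top 10% and bottom 25%, near-random decisions on the middle 65%, can be checked against the observed 43 disagreements. The bands and rates come from the text; the assumption that the two committees decide the middle band independently is mine.

```python
# Rough check: if committees agree on the top 10% (accept) and bottom 25%
# (reject), and decide the middle 65% independently at random, how many of
# 166 papers should two committees disagree on?
n_papers = 166                  # submissions reviewed by both committees
accept_rate = 0.25              # overall NIPS acceptance rate
top, bottom = 0.10, 0.25        # clear accepts / clear rejects
middle = 1.0 - top - bottom     # the contested 65%

# Acceptance probability in the middle band, calibrated so the overall
# acceptance rate still comes out to 25%.
p_mid = (accept_rate - top) / middle

# Independent committees can disagree only on middle-band papers, and a
# disagreement occurs when one accepts and the other rejects.
p_disagree = middle * 2 * p_mid * (1 - p_mid)
print(f"expected disagreements: {p_disagree * n_papers:.0f} of {n_papers}")
```

Under these assumptions the model predicts roughly 38 disagreements among the 166 doubly reviewed papers, close to the 43 the two NIPS committees actually produced, so the editorial's informal model is quantitatively plausible.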
DOI:10.1145/3128899

Computational Thinking Is Not Necessarily Computational

I APPLAUD PETER J. DENNING'S Viewpoint "Remaining Trouble Spots with Computational Thinking" (June 2017), especially for pointing out the subject itself is often characterized by "vague definitions and unsubstantiated claims"; "computational thinking primarily benefits people who design computations and . . . claims of benefit to nondesigners are not substantiated"; and "I am now wary of believing that what looks good to me as a computer scientist is good for everyone." Moreover, the accompanying table outlined various historic definitions of "computational thinking," including a comparison of what Denning called the "new" and the "traditional" view of the subject. However, my own interest in computational thinking differs somewhat from Denning's. First, I question the legitimacy of the term "computational" itself. Why say it, when the very subject is "computers" and the chief academic approach to their study is "computer science"? If one looks at how computers are actually used, it may come as a surprise to learn that few such uses actually involve computing. For example, applications that deal with scientific and engineering problems are of course heavily computing-focused, but, last I heard, they constitute only approximately 20% of all applications being developed worldwide. The most predominant applications—those for business—involve little computation beyond arithmetic. And systems programs like operating systems and compilers, the focus of much computer science study, historically at least, involve little or no computation and primarily concern manipulating information rather than numbers.

The problem is that computational-thinking enthusiasts, as Denning wrote, are driven to spread the subject across all academic majors. I certainly believe in the importance of programming and using computers for the variety of applications for which they provide benefit and that educational systems worldwide should provide the knowledge and skills that would help students move into the field, should that be their preference. But should computational thinking also be taught to artists, writers, poets, physicians, and lawyers? Not as I see it . . .

The faulty thinking behind the "computer science for all" approach to pedagogy is best seen in Denning's table, labeled "Traditional versus New Computational Thinking." Its entry on "domain knowledge" suggested traditionalists see domain knowledge as vitally important to the person doing the computational thinking, while "new" thinking says the importance of computational thinking is domain-independent. As a practicing programmer who has dabbled in many different application domains over a long professional career, I find it beyond understanding how anyone could fail to see the importance of deeply knowing a domain to being able to solve problems in that domain.

Robert L. Glass, Toowong, Australia

Author Responds:
Computational thinking is the habits of mind developed from designing computations. The meaning of computation has evolved from the 1960s "sequence of states of a computer executing a program" to today's "evolution of an information process." This changed meaning reflects the ever-expanding reach of computing into all sectors of work and life. Many of today's most popular apps feature computations well beyond arithmetic, as in, say, facial recognition, speech transcription, driverless cars, and industrial robots. The computational thinking developed by those who worked on these achievements is much more powerful than the handful of programming concepts offered as the definition of "new CT."
Peter J. Denning, Monterey, CA

Time to Retire 'Computational Thinking'?

Peter J. Denning asked, "What is computational thinking?" in his Viewpoint (June 2017), then quoted the following definition by Al Aho: "Abstractions called computational models are at the heart of computation and computational thinking. Computation is a process that is defined in terms of an underlying model of computation, and computational thinking is the thought processes involved in formulating problems so their solutions can be represented as computational steps and algorithms." But as Aho's definition is highly circular, it reveals very little.

All disciplines rely on models. The only specifically computational word here is "algorithms." If we replaced it with similar words, like "procedures" or "sequences," we would arrive at such vacuous "definitions" as, say, "Medicine is a process that is defined in terms of an underlying model of medicine, and medical thinking is the thought processes involved in formulating problems so their solutions can be represented as medical steps and procedures." And "Drama is a process that is defined in terms of an underlying model of drama, and dramatic thinking is the thought processes involved in formulating problems so their solutions can be represented as dramatic steps and sequences." One could analogously "define" musical thinking, artistic thinking, chemical thinking, and so forth.

Unless somebody can come up with a more insightful definition, it is indeed time to retire "computational thinking."
Lawrence C. Paulson, Cambridge, England

Toward a True Measure of Patent Intensity

In their article "How Important Is IT?" (July 2017), Pantelis Koutroumpis et al. described a methodology for assessing the importance of information and communications technologies (ICTs) compared to non-ICT technologies, using PatStat, a dataset from the European Patent Office of 90 million patents awarded from 1900 to 2014. Controlling for variables (such as patent office, year of grant, and patent family), they concluded ICT patents are more influential than non-ICT patents because they receive significantly more citations and a considerably higher PageRank.

When one publication (not just those involving patents) is cited more often than some other publication, the more-cited one is thus more influential. However, patent publications are unique because they not only describe novel systems and methods but also hold commercial value and represent licensable assets for their holders. A patent may be cited hundreds of times yet still have relatively low financial value; on the other hand, a patent may be cited only rarely yet reflect enormous valuation.

Consider that in 2013, Kodak, the company that invented the digital camera, sold its portfolio of 1,100 digital photography-related patents to multiple licensees for $525 million (or $477.3K per patent). Earlier, Google bought Motorola Mobility and its 17,000 patents for $12.5 billion (or $735.3K per patent), and Microsoft acquired 800 patents from AOL for $1.06 billion (or $1.33M per patent). Snap paid the exceptional price of $7.7 million for Mobli's Geofilters patent, believed by TechCrunch to be the highest amount ever paid for a patent from an Israeli tech company. However, the valuations of most patents are unknown until they are indeed auctioned or sold off. For instance, ICT-related patents (such as those involving Google's and Microsoft's methods for faster Internet browsing)1 may have impressive valuations, but those valuations are difficult to predict before actually being auctioned or sold off.

Considering non-ICT patents, the revenue streams of several pharmaceutical companies depend on patents and their corresponding expiration dates, and one patent could be worth billions over the course of its licensing period. Notable patented medications include Pfizer's Lipitor (for lowering fatty acids known as lipids), Bristol-Myers Squibb's Plavix (for preventing heart attacks and strokes), and Teva's Copaxone (for treating multiple sclerosis). Other non-ICT patents that have significantly and directly improved people's lives are cited only rarely, including those related to agriculture, transportation, and creation of new materials.

In the most recent U.S. Patent and Trademark Office economy update,2 the non-ICT "basic chemicals" category ranked first, with $64.5 billion in merchandise exports of selected intellectual-property-intensive industries, while "semiconductors and electronic components" was second at $54.8 billion. Most industries involve non-ICT technology. As for "patent intensity," or the ratio of patents to employees measured as patents/thousand jobs, "computer and peripheral equipment" and "communications equipment" topped the list, though this was due directly to the relatively high number of patents issued in the industry versus the industry's relatively low number of employees. Conclusions regarding level of influence of ICT technologies versus other types of technologies should thus be reported with care when a comparison is based solely on number of inventions and citations.

If such influence is indeed the basis for a comparison, then additional covariates should be controlled for, including the mean estimated valuation per patent, number of employees in the industry, and additional financial and industry-specific characteristics.

References
1. Kartoun, U. A user, an interface, or none. Interactions 24, 1 (Jan.–Feb. 2017), 20–21.
2. U.S. Patent and Trademark Office. Intellectual Property and the US Economy: 2016 Update. U.S. Patent and Trademark Office, Washington, D.C., 2016; https://www.uspto.gov/sites/default/files/documents/IPandtheUSEconomySept2016.pdf

Uri Kartoun, Cambridge, MA

Authors Respond:
Although there may be some correlation between patent price and technological influence, the relationship is neither clear nor systematic. Patent prices are more likely driven by how incremental/radical/breakthrough a patent is, whether its value is standalone or as part of a bundle, projected commercialization timescale, cost versus risk, bidder's experience, patent age, rate of technological change, and substitution and reverse-engineering risk, to say nothing of broader economic factors. Perhaps our technological-influence measure could thus be used to help understand patent pricing.
Pantelis Koutroumpis, London, U.K., Aija Leiponen, Ithaca, NY, and Llewellyn D.W. Thomas, London, U.K.

Communications welcomes your opinion. To submit a Letter to the Editor, please limit yourself to 500 words or less, and send to letters@cacm.acm.org.

©2017 ACM 0001-0782/17/09

Call for Nominations for ACM General Election

The ACM Nominating Committee is preparing to nominate candidates for the officers of ACM: President, Vice-President, Secretary/Treasurer; and two Members at Large.

Suggestions for candidates are solicited. Names should be sent by November 5, 2017 to the Nominating Committee Chair, c/o Pat Ryan, Chief Operating Officer, ACM, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, USA.

With each recommendation, please include background information and names of individuals the Nominating Committee can contact for additional information if necessary.

Alexander L. Wolf is the Chair of the Nominating Committee, and the members are Karin Breitman, Judith Gal-Ezer, Rashmi Mohan, and Satoshi Matsuoka.
DOI:10.1145/3121430 http://cacm.acm.org/blogs/blog-cacm
Assuring Software Quality By Preventing Neglect

Robin K. Hill suggests software neglect is a failure of the coder to pay enough attention and take enough trouble to ensure software quality.

Robin K. Hill
The Ethical Problem of Software Neglect
http://bit.ly/2roEDf1
May 31, 2017

Ethical concern about technology enjoys booming popularity, evident in worry over artificial intelligence, threats to privacy, the digital divide, reliability of research results, and vulnerability of software. Concern over software shows in cybersecurity efforts and professional codes.1 The black hats are hackers who deploy software as a weapon with malicious intent, and the white hats are the organizations that set safeguards against defective products. But we have a gray-hat problem—neglect.

My impression is that the criteria under which I used to assess student programs—rigorous thought, design, and testing, clean nested conditions, meaningful variable names, complete case coverage, careful modularization—have been abandoned or weakened. I have been surprised to find, at prestigious institutions working on open-source projects, that developers produce no documentation at all, as a matter of course, and that furthermore, during maintenance cycles, they do not correct the old source code comments, seeing such edits as risky and presumptuous. All of these people are fine coders, and fine people. Their practices seem oddly reasonable in the circumstances, under the pressure of haste, even while those practices degrade the understandability of the program. Couple that with the complexity of modern programs, and we conclude that, in some cases, programmers simply don't know what their code does.

Examples of software quality shortcomings readily come to mind—out-of-bounds values unchecked, complex conditions that identify the wrong cases, initializations to the wrong constant. Picture a clever and conscientious coder finishing up a calendar module before an important meeting. She knows that the test for leap years from the numeric yyyy value, if (yyyy mod 4 = 0) and (yyyy mod 100 != 0), must be refined by some other rules to correct for what happens at longer periods, but this code is a prototype ... She retains the simple test, meaning to look up the specifics ... but her boss commits her code. No harm is foreseeable ... except that it turns out to interface with another module where the leap-year calculation incorporates the complete set of conditions, which is discovered to drive execution down the wrong path in some calculations. The program is designated for fixing but it continues to run, those in the know compensating for it somehow ...

What sort of violation is neglect? It doesn't attack security because it occurs behind the firewall. It doesn't attack ideals of quality because no one officially disputes those ideals. It is a failure of degree, a failure to pay enough attention and take enough trouble. Can philosophy help clarify what's wrong? An emerging theory called the ethics of care displaces the classical agent-centered morality of duty and justice, endorsing instead patient-centered morality as manifest real-time in relationships.2,4
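For the record, the complete set of conditions the anecdote's coder meant to look up is short. A minimal Python version, offered as a generic illustration rather than the code from the story:

```python
def is_leap_year(yyyy: int) -> bool:
    """Full Gregorian rule: divisible by 4, except century years,
    which are leap years only when also divisible by 400."""
    return yyyy % 4 == 0 and (yyyy % 100 != 0 or yyyy % 400 == 0)

# The prototype's simplified test gets century years wrong:
# 2000 satisfies (2000 mod 4 = 0) but fails (2000 mod 100 != 0),
# yet 2000 was a leap year; 1900, by contrast, was not.
```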
The theory offers a contextual perspective rather than the cut-and-dried directives of more traditional views. While care can be construed as a virtue (relating to my prior post in this space3) or as a goal like justice, the promoters of care ethics resist a universal mandate. They may also reject this attempt to apply it to software, of all things; the heart of the matter for care ethics is the work of delivering care to a person in need.

Yet software neglect seems exactly the type of transgression addressed by the ethics of care, if we allow its reinterpretation outside of human relationships. Appeal to the theory allows us to identify the opposite of care, that is, neglect, as the quality to condemn. This yields our account of software quality as an ethical issue, especially piquant in its application of tools from the feminist foundry to the code warrior culture. But little credit is due! We are not solving the problem, only embedding it in the terms of a philosophical platform. This account raises issues in the ethics of engineering, such as individual versus corporate responsibility (and whether corporate responsibility can be rendered coherent and enforceable short of the law). For a concise summary, see Section 3.3.2, on Responsibility, in the Stanford Encyclopedia of Philosophy entry on the Philosophy of Technology.5

The quality that has corrected for neglect in the past is professionalism, by which I mean that the expert does what's best for the client even at a cost to personal time, energy, money, or prestige—within reason! Certainly these judgments are subjective, and viable when the professional is autonomous, when that single person exercises control over the product and its quality. Counterforces in the current tech business world are (1) employment, under which most programmers are not consultants, but rather given orders by a company; and (2) collaboration, under which most software is the product of committees, in effect. Professionalism also depends on strong personal identification with disciplinary peers and pride in the group's traditions.

In the face of knotty difficulties enforcing or fostering ideals of quality, one possible resolution, odd as it may seem, is simply to acknowledge the situation, to admit to the public that software is not always reliable, or mature, or even understood. Given its familiarity with bug fixes, the public may not be unduly shocked. If we prefer to reject that fatalistic move, the pressing question is, are there some public standards that developers can and will actually follow? The collective response will determine whether software engineering is a profession. I urge all coders who wish to take pride in their jobs to read the draft professional standards,1 which mention code quality in Section 2.1.

We see that ethical issues appear not only in the external social context, but in the heart of software, the coding practice itself, a gray-hat problem, if you will. We hope that the ethics of care can somehow help to alleviate those issues.

References
1. Association for Computing Machinery. Code 2018 Project. https://ethics.acm.org/.
2. Burton, B.K., and Dunn, C.P. Ethics of Care. Encyclopædia Britannica. https://www.britannica.com/topic/ethics-of-care.
3. Hill, R.K. Ethical Theories Spotted in Silicon Valley. Blog@CACM, March 16, 2017. https://cacm.acm.org/blogs/blog-cacm/214615-ethical-theories-spotted-in-silicon-valley/fulltext.
4. Sander-Staudt, M. Care Ethics. The Internet Encyclopedia of Philosophy, 2017. http://www.iep.utm.edu/care-eth/.
5. Franssen, M., Lokhorst, G., and van de Poel, I. Philosophy of Technology. The Stanford Encyclopedia of Philosophy (Fall 2015 Edition), Edward N. Zalta (ed.). https://plato.stanford.edu/archives/fall2015/entries/technology/.

Note: While the Web encyclopedias, as cited, provide good surveys of current philosophical views, pursuit of any ideas in depth will require reading original research.

Comments
This is possibly the most important paragraph of the article, outlining the exact problem in the industry:

"The quality that has corrected for neglect in the past is professionalism, by which I mean that the expert does what's best for the client even at a cost to personal time, energy, money, or prestige—within reason! Certainly these judgments are subjective, and viable when the professional is autonomous, when that single person exercises control over the product and its quality. Counterforces in the current tech business world are (1) employment, under which most programmers are not consultants, but rather given orders by a company; and (2) collaboration, under which most software is the product of committees, in effect. Professionalism also depends on strong personal identification with disciplinary peers and pride in the group's traditions."

It sounds like, short of working for enlightened organizations, software developers should be leaning toward more autonomy and self-ownership. I recently read Developer Hegemony (a very bold title!), http://amzn.to/2pA18wB, and it addresses that side of the issue by encouraging more professionalism and autonomy. There's already a strong movement in favor of Software Craftsmanship, and the free software and open source movements both seem to care more about quality than most companies (though they do neglect documentation sometimes). For example, we already prefer software written by recognizably smart/professional developers. Here's hoping to more autonomy in the future and the allowance of our professionalism to counteract the neglect of software.
—Rudolf Olah

Robin K. Hill is an adjunct professor in the Department of Philosophy at the University of Wyoming.

© 2017 ACM 0001-0782/17/09 $15.00
SEPTEMBER 2017 | VOL. 60 | NO. 9 | COMMUNICATIONS OF THE ACM 11
Introducing ACM Transactions on Human-Robot Interaction

In January 2018, the Journal of Human-Robot Interaction (JHRI) will become an ACM publication and be rebranded as the ACM Transactions on Human-Robot Interaction (THRI). To submit, go to https://mc.manuscriptcentral.com/thri
news

DISCOVERING THE SECRETS of the universe is not a task for the timid and the impatient; there's a need to peer into the deepest reaches of outer space and try to make sense of distant galaxies, stars, gas clouds, quasars, halos, and black holes. "Understanding how these objects behave and how they interact gives us answers to how the universe was formed and how it works," says Kevin Schawinski, an astrophysicist and assistant professor in the Institute for Astronomy at ETH Zurich, the Swiss Federal Institute of Technology.

The problem is that traditional tools such as telescopes can see only so far, even with radical advances in optics and the placement of observatories in space, where they are free of the light and dust of Earth. For instance, the Hubble Telescope changed the way astrophysicists and astronomers viewed deep space by delivering far clearer images than previously possible. Of course, in this context, distance and time are inextricably linked. "But the images still do not allow us to see as far back in time as we would like," Schawinski says. "The farther we can see, the more we can understand about the origins of the universe and how it has evolved."

Enter computer image recognition, artificial neural networks, and data science; together, they are changing the equation. As huge volumes of data stream in, scientists are able to find answers to previously unfathomable questions. In recent years, they have begun to train neural nets to analyze data from images captured by cameras in telescopes located on Earth and in space. In many cases, the resulting machine-based algorithms can sharpen blurs and identify distant objects better than humans can.

"Data science and big data are revolutionizing many areas of astrophysics," says François Lanusse, a postdoctoral researcher in the McWilliams Center for Cosmology at Carnegie Mellon University.

Indeed, the combination of more data, advances in data science, and new methods that allow researchers to easily and cheaply train neural networks is allowing scientists to boldly see where they have never seen before. No less important, these advances are not limited to astrophysics and astronomy; they have touched an array of other fields and have advanced autonomous vehicles, robots, drones, smartphones, and more. They're also being used to better understand everything ...
Broadband to Mars
Scientists are demonstrating that lasers could be the future of space communication.
IN MARCH, the U.S. National Aeronautics and Space Administration (NASA) announced that its planned Orion spacecraft, which could one day carry astronauts to the Moon and Mars, will include a new kind of communication system. Typically, manned and unmanned vehicles and probes use radio waves to send and receive information. For decades, though, scientists have been pushing toward using laser-based communications in space. Lasers are no faster, but they can deliver far more information than radio waves in the same amount of time. NASA's Apollo missions to the Moon were capable of transmitting 51kb worth of data per second, for example, but Orion's planned Laser-Enhanced Mission and Navigation Operational Services (LEMNOS) system could send back more than 80 megabytes each second from the lunar surface.

Artist's conception of how a NASA spacecraft would use lasers to communicate with Earth.

That stream could be packed with rich scientific data, or it could include ultra-high-resolution video of distant worlds. Scaled-up versions of this system could dispatch movies of dust devils, storms, or even astronauts walking on the surface of Mars. During the six-month-long trip to the Red Planet, space travelers could potentially trade videos with family members back on Earth, and mitigate the psychological toll of the long journey.

The LEMNOS project is just one of many planned or existing laser-based communications systems in orbit and beyond.

These recent and anticipated advances cannot be attributed to a single, revolutionary breakthrough, according to experts. Instead, this new age of laser-based broadband in space has resulted from steady improvements in detectors, actuators, control systems, and more.

Broadband in Orbit
The idea of laser communications in space has been around nearly since the invention of the laser itself, notes Abi Biswas, supervisor of the Optical Communications Systems group at NASA's Jet Propulsion Laboratory in Pasadena, CA. From a basic physics standpoint, Biswas says the advantage is clear: lasers occupy the higher-frequency end of the electromagnetic spectrum, relative to radio waves. That means the beam itself is much narrower. If you were to aim a beam of radio waves back at Earth from Mars, the beam would spread out so much that the footprint would be much larger than the size of our planet. "If you did the same thing with a laser," Biswas says, "the beam footprint would be about the size of California."

When those beams are sent with the same amount of power, the laser ends up concentrating more power on the receiver. "You can send many more bits of information for the same amount of power," Biswas explains. Relative to radio, laser or optical communications can transmit anywhere from 10 to 100 times as much data.
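Biswas' California-versus-planet comparison follows from simple diffraction: a beam's divergence angle scales roughly as wavelength over transmitter aperture, so the footprint grows linearly with distance. The short Python sketch below uses illustrative values I have assumed (a 1,064 nm laser from a 30 cm telescope versus an ~8.4 GHz radio signal from a 3 m dish, at a nominal Earth-Mars distance), not figures from the article:

```python
# Back-of-the-envelope only; all numbers here are illustrative assumptions.

def footprint_km(wavelength_m: float, aperture_m: float, distance_km: float) -> float:
    """Beam footprint diameter at the receiver, in km: the
    diffraction-limited divergence angle (~wavelength/aperture,
    in radians) times the distance traveled."""
    return (wavelength_m / aperture_m) * distance_km

MARS_KM = 2.25e8  # a nominal Earth-Mars distance (assumed)

laser = footprint_km(1064e-9, 0.30, MARS_KM)  # near-IR laser, 30-cm telescope
radio = footprint_km(0.036, 3.0, MARS_KM)     # ~8.4-GHz X band, 3-m dish

print(f"laser footprint ~{laser:,.0f} km")  # hundreds of km: roughly California-sized
print(f"radio footprint ~{radio:,.0f} km")  # millions of km: far wider than Earth
```

With these assumed apertures the optical footprint comes out in the hundreds of kilometers while the radio footprint spans millions, which is the qualitative gap Biswas describes.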
The advantages are not limited to solar system exploration; they can be applied within Earth's orbit as well. The European Space Agency (ESA) and Airbus recently put lasers to work as a broadband data transfer technology, the European Data Relay System (EDRS).

Normally, a satellite flying in low Earth orbit transmits data only when it is within view of a ground station. As a result, it may take 90 minutes for the ground station to receive data after it has been collected.

In the EDRS system, lasers are used both to send more data and to accelerate its transfer. A geostationary satellite locks onto the low-orbiting satellite via laser the moment it passes over the horizon, then remains connected as the craft soars over the hemisphere below. The observing satellite begins transmitting data via laser once the link is established. The satellite can transfer far more data this way, but it also gets that data to the ground faster. Instead of waiting for the observing satellite to fly within view of the ground station, the laser transfer begins once the craft establishes line of sight with the geostationary craft, which then transmits data to the ground via radio. "You cut down the time of delivery of the data to the end user on the ground from hours to 10 to 20 minutes," says Michael Witting, program manager for ESA EDRS.

This speed, combined with the ability to transmit more high-resolution satellite images, will allow organizations to track the movement of ice in polar regions to help ships navigate the Arctic crossing. Officials could monitor oil spills, earthquakes, floods, and other instances in which information needs to travel quickly to disaster response teams.

The EDRS is already in use, and ESA is scheduled to launch a second satellite in 2018.

NASA has a similar project in the works, and while the link does not extend all the way to the Moon or Mars, Witting says the technical challenges were significant. The system operates over approximately 45,000 kilometers (about 28,000 miles), and each laser terminal must locate and remain locked on the other throughout flight. "It's like taking a torch from Europe and hitting a coin in New York," Witting says—all while the coin is racing at about 17,000 miles per hour.

Lasers from the Moon
As you move out to larger distances, such as the Moon or Mars, the challenge increases. Biswas compares the effort involved with hitting a target on Earth from the Moon or Mars to trying to look at a small object through a one-meter-long straw; holding that straw steady enough to keep it focused is a tremendous challenge. If not held steady and aimed accurately, the California-sized footprint of a laser beam traveling from Mars could actually miss its target on Earth, and fail to transmit the data.

Experts say the success of NASA's 2013 test of such a system, the Lunar Laser Communication Demonstration (LLCD), can be attributed to a number of advances, including improvements in the actuators that make micro-adjustments to the position of the beam, ensuring it remains on target, and advances in the control systems that determine exactly where it needs to aim. When the laser struck Earth, the beam was six kilometers wide, but the receiver was less than one meter wide. One way around this would be to build a larger receiving antenna, but the goal of the LLCD was to show that an optical communication system could work without a massive—and massively expensive—dish on the ground. "You have to figure out, how can I catch this dancing signal onto a very sensitive detector and then add very little noise?" asks Don Boroson, a research fellow in the Massachusetts Institute of Technology (MIT) Lincoln Laboratory's Communication Systems Division, and a major contributor to the LLCD.

For the Moon demonstration, Boroson says the group used an old idea known as error correction coding, which intelligently bundles in redundant bits, so you can still decipher an entire message even if you only catch part of the beam. So, if they were trying to send a message that was 10,000 bits, they'd add in another 10,000 carefully chosen redundant bits, and send 20,000 in all. Then, even if only half of that message was received, the original 10,000-bit code could still be deciphered. This approach was critical, Boroson explains; "it allowed us to have as small as possible a receiver on the ground and still do these very high data rates and make no errors. We did the lunar link with half a watt and a four-inch telescope in space, and we still did 622 megabits per second to the ground."

Making Every Photon Count
Pushing beyond satellite or lunar communication increases the technical difficulty, because the laser beam loses energy at a rate proportional to the square of the distance between transmitter and receiver. Scaling up the power used to generate the laser is not an option, Biswas explains, because the laser systems would become too large and expensive. "As you get farther and farther away, you have to improve the efficiency of your system," says Biswas. "You have to make every photon count."

Despite the challenges of larger distances, physicist Philip Lubin of the University of California, Santa Barbara, argues that lasers would still be a preferred means of communication for missions to the edges of our solar system and beyond. Lubin has been working on a project to propel miniature spacecraft beyond the solar system using a phased array of either ground- or space-based lasers. The spacecraft would have a modest laser to send back data, and Lubin says the array used to propel the craft could also be engineered to receive its messages. "If we're setting up to blast something out with lasers, then why not use that system to send something back?" he asks. That something probably will not include video, but the lasers could dispatch images and other information.

Back on Earth, larger receiving telescopes would help pick up signals from the Moon, Mars, or beyond. Currently, NASA scientists are demonstrating how laser communications systems work with small receivers, but with ground telescopes measuring 10 to 15 meters across, it would be possible to catch far more light and information. Boroson doesn't expect those receivers to be built anytime soon, but he does anticipate laser communications will be used more and more.

"It's going to happen slowly," says Boroson. "First we'll see lots of systems around the Earth, then a few systems further out in space, and then more and more. But it's all coming, it's definitely coming."

Further Reading

Biswas, A., Piazzolla, S., Moision, B., and Lisman, D.
Evaluation of deep-space laser communication under different mission scenarios, Proceedings of SPIE, 2012.

Boroson, D.M. and Robinson, B.S.
The Lunar Laser Communication Demonstration: NASA's First Step Toward Very High Data Rate Support of Science and Exploration Missions, Space Science Reviews, Volume 185, 2014.

Lubin, P.
A Roadmap to Interstellar Flight, Journal of the British Interplanetary Society, vol. 69, 2016.

Space Data Without Delay
http://bit.ly/2pcIlt2

Hemmati, H.
Deep Space Optical Communications, John Wiley & Sons, 2006.

Gregory Mone is a Boston-based science writer and the author, with Bill Nye, of Jack and the Geniuses: At the Bottom of the World.

© 2017 ACM 0001-0782/17/09 $15.00
WHEN THE CREW of an $80-million superyacht in the Ionian Sea checked its computer, they realized they were drifting slightly off course, likely as a result of strong currents buffeting their ship. The crew made adjustments and went back to work—without realizing they were now taking directions from a hacker.

In the bowels of the ship, Todd Humphreys, an associate professor in the Department of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin, worked with his team to feed the superyacht's crew false navigation data using a few thousand dollars worth of hardware and software.

Part of an animation showing how a radio navigation research team from The University of Texas at Austin was able to successfully spoof the GPS system of an $80-million private yacht.

The crew was completely unaware they were now piloting in a direction of Humphreys' choosing. Thankfully, it was all an experiment that took place with the yacht owner's blessing. If it had been real, Humphreys could have sent the superyacht 1,000 miles off-course into the hands of a rogue government, terrorist group, or professional criminal organization—and the crew would not have realized it until it was far too late.

Welcome to the very real dangers posed by Global Positioning System (GPS) spoofing, or the dark art of convincing computers you are somewhere that you're not. It is surprisingly easy—and shockingly dangerous, because we're not prepared for it at all.

GPS Is Easy to Spoof
The U.S. Global Positioning System consists of 24 satellites that orbit Earth. GPS devices receive signals from the nearest satellites that allow them to determine their precise location, whether you're looking for creatures in the wildly popular Pokémon Go app, or going to war in a billion-dollar battleship. A range of GPS devices and networks are used for everything from military applications to commercial needs—and all the use cases in between.

Yet all of these systems rely on the data from the network of GPS satellites. If you can corrupt the data coming from those satellites, you can create a world of headaches for systems that rely on this data.

GPS spoofing can be performed with relatively low-cost tech, which is an expensive problem for the people, companies, and governments that trust the system implicitly. In the case of Humphreys' superyacht hacking, he and his team used about $2,000 worth of tech. Even in more advanced spoofing scenarios, the technology is still straightforward, says Dinesh Manandhar, an associate professor and GPS expert at the University of Tokyo.

"A device that can generate GPS signals is necessary. Such devices are available from GPS signal simulator device manufacturers," Manandhar explains. These devices are used to test GPS receivers in factories. As such, they can be programmed to transmit a signal that makes receivers behave any way you like.

"So far as I know, no commercial GPS receivers offer any strong defense against spoofing or even any reliable spoofing detection capability," says Humphreys.

Stealing an $80-Million Superyacht
In 2013, Humphreys, then a researcher in the Department of Aerospace Engineering and Engineering Mechanics at the Cockrell School of Engineering, was invited, along with a team of students, aboard an $80-million yacht in the Ionian Sea to test their GPS spoofing technology. Using his hardware and software rig, Humphreys managed to falsify GPS data used by the ship, effectively giving him control over the vessel.

Humphreys explained GPS receivers calculate their distance from several satellites at the same time. Each satellite has a code—called a pseudorandom noise (PRN) code—that identifies which satellite in the GPS network is broadcasting. Humphreys' spoofing equipment slowly replaced the real GPS signals with fake ones, working delicately so the ship's system did not detect an abrupt change in signal.
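The position fix Humphreys describes—distances to several satellites combined into one location—can be illustrated with a toy two-dimensional trilateration, which also shows why biased ranges silently move the fix. This is a simplified sketch with made-up coordinates, not real GPS processing (actual receivers work in three dimensions and also solve for receiver clock bias):

```python
import math

def trilaterate(anchors, d):
    """Solve for (x, y) from ranges to three known anchor points:
    subtracting the circle equations leaves a 2x2 linear system,
    solved here by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d[0]**2 - d[1]**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d[0]**2 - d[2]**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det

# Receiver truly at (3, 4); ranges measured to three known anchors.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [math.hypot(3.0 - ax, 4.0 - ay) for ax, ay in anchors]
x, y = trilaterate(anchors, ranges)       # recovers approximately (3.0, 4.0)

# A spoofer who can bias the measured ranges shifts the solved position:
spoofed = [r + 1.0 for r in ranges]
sx, sy = trilaterate(anchors, spoofed)    # no longer (3.0, 4.0)
```

This is the lever a spoofer pulls: counterfeit signals shift the apparent arrival times of the PRN-coded transmissions, which shifts the measured ranges, and the receiver dutifully solves its way to the wrong position.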
The spoofed GPS reported the yacht Department of Homeland Security’s re-
was three degrees off-course. The crew, cent document on anti-spoofing, ”Im-
unaware when the experiment would Cargo shipments proving the Operation and Develop-
take place, adjusted the ship’s course are at risk from ment of Global Positioning System
based on the spoofed GPS. The crew as- (GPS) Equipment Used by Critical Infra-
sumed it was due to natural forces such GPS spoofing, structure,” as a sign that the right par-
as water currents and crosswinds.” as are geofences — ties are taking GPS spoofing seriously.
GPS spoofing can be used for all Manandhar has developed anti-
sorts of nefarious purposes. As seen digitally proscribed spoofing methodologies for Japanese
with the yacht, cargo shipments are at boundaries used satellites that may be used in the next
risk, especially dangerous or high-val- generation to be sent into orbit, he
ue ones that are required to follow des- by many corporations says. He recommends that major navi-
ignated GPS routes. Geofences—or to protect gation data provider countries like the
digitally proscribed boundaries—are U.S., Japan, the European Union, Chi-
used to protect sensitive data in many sensitive data. na, and India conduct official joint dis-
corporations; GPS spoofing could be cussions on the security of their sys-
used to access that data well out of the tems at the International Committee
bounds intended. on Global Navigation Satellite Sys-
Once you add emerging technolo- tems, an organization under the um-
gies, like self-driving cars, to the mix, it the data right away; only when the sig- brella of the United Nations.
gets even scarier. Autonomous vehicles nature was verified would the client The dangers, however, are not going
use GPS data at regular intervals not use the GPS data it had received. away. Humphreys worries particularly
only to understand where they are, but “Using cryptography makes it hard that spoofing the GPS-sourced timing
also to decide where to drive passen- to forge a signature, such that even an used to regulate financial databases
gers and cargo. adversary that can feed the client with could create havoc. Industries like fi-
Humphreys’ yacht spoofing was the false data cannot forge a signature, nancial services, he says, “have back-
first time commercial tech had been thus the client does not use forged ups in place, but on close inspection
used in such an effective—and power- data,” Ashur says. one realizes that the backups them-
ful—demonstration. This would prevent, say, spoofing the selves are either short-term or eventu-
Now, said Manandhar, it is even eas- signal to hijack a self-driving car or re- ally trace their source to GPS.”
ier to acquire spoofing technology. “Re- route a drone that relied upon the data. “A coordinated attack that under-
cently, software-based low-cost devices However, the Galileo system, which stood the finance world’s dependency
have become available that cost less than $1,000."

A Problem for Governments, People
It is not just yacht owners who need to be concerned; the problem is especially acute for national governments and international bodies, which are waking up to the dangers posed by GPS spoofing.

Incredibly, Europe's Galileo global navigation satellite system—the European Union's version of GPS—operated beginning in December 2016 "with no way to protect civilian users from hacking attempts," reported ZDNet.

University of Leuven researchers Ashur and Rijmen say they have developed an authentication protocol to deter the forging of Galileo's navigation data. The protocol, called the TESLA signature, is designed to complement location data with a cryptographic "signature," so Galileo's satellites would send both navigation data and the cryptographic signature to the receiving client. The client would not trust …

Galileo, which comes fully online in 2020, presented a unique obstacle: low bandwidth. Galileo has relatively low-bandwidth signals that make a typical approach to the problem, using public-key cryptography, impossible. "The uniqueness of our solution is that it uses symmetric cryptography and can thus fit into the bandwidth constraints," says Ashur. The protocol is scheduled to go into effect in 2018, according to ZDNet. Until all 24 of Galileo's satellites are deployed and operational in 2020, however, the protocol will "operate in test mode."

In the meantime, manufacturers are starting to pay attention to the problem, says Humphreys. Some, like u-blox, a Swiss company that creates wireless semiconductors and modules for consumer, automotive, and industrial markets, offer anti-spoofing measures such as the capability to detect fake global navigation satellite system (GNSS) signals, as well as a message integrity protection system to prevent "man in the middle" attacks.

Humphreys also points to the U.S. … "… on GPS would be hard to detect and even harder to defeat," he cautions.

Further Reading

Psiaki, M., and Humphreys, T.
Protecting GPS from Spoofers Is Critical to the Future of Navigation, IEEE Spectrum, Jul 29, 2016, http://spectrum.ieee.org/telecom/security/protecting-gps-from-spoofers-is-critical-to-the-future-of-navigation

Amirtha, T.
Satnav spoofing attacks: Why these researchers think they have the answer, ZDNet, Mar 27, 2017, http://www.zdnet.com/article/satnav-spoofing-attacks-why-these-researchers-think-they-have-the-answer/

U.S. Department of Homeland Security, National Cybersecurity & Communications Integration Center, National Coordinating Center for Communications
Improving the Operation and Development of Global Positioning System (GPS) Equipment Used by Critical Infrastructure, http://bit.ly/2oZewfz

Logan Kugler is a freelance technology writer based in Tampa, FL. He has written for over 60 major publications.

© 2017 ACM 0001-0782/17/09 $15.00
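The article names the scheme a "TESLA signature"; in generic TESLA-style broadcast authentication, the sender MACs each message with a symmetric key it has not yet revealed, discloses that key one interval later, and receivers authenticate each disclosed key against a one-way hash-chain commitment. A minimal sketch of that delayed-disclosure idea, assuming the generic scheme rather than the actual Galileo protocol (all names, parameters, and the message format are illustrative):

```python
# A generic TESLA-style delayed-key-disclosure scheme: MAC now, reveal
# the symmetric key one interval later, authenticate keys via a hash chain.
import hashlib
import hmac

def make_key_chain(seed: bytes, n: int) -> list:
    """Build a one-way hash chain; keys are disclosed in reverse order."""
    chain = [seed]
    for _ in range(n):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain[::-1]  # chain[0] is the public commitment

# Sender: publish the commitment, MAC interval-1 data with the still-secret K_1.
chain = make_key_chain(b"illustrative-seed", 4)
commitment = chain[0]
msg = b"nav-data: illustrative payload, interval 1"
k1 = chain[1]
tag = hmac.new(k1, msg, hashlib.sha256).digest()

# Receiver, after K_1 is disclosed in interval 2:
# 1) one hash step must reach the commitment, so K_1 is authentic;
# 2) the MAC must verify, so the earlier broadcast was not forged.
assert hashlib.sha256(k1).digest() == commitment
assert hmac.compare_digest(hmac.new(k1, msg, hashlib.sha256).digest(), tag)
```

Because a spoofer learns each key only after its broadcast interval has ended, it cannot forge tags in time, and verifying a key costs one hash rather than a public-key operation, which is how a symmetric scheme like this can fit a low-bandwidth channel.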
SEPTEMBER 2017 | VOL. 60 | NO. 9 | COMMUNICATIONS OF THE ACM 19
news
Turing Laureates Celebrate Award's 50th Anniversary
ACM RECENTLY HELD a conference in celebration of the first 50 years of the ACM A.M. Turing Award. "Just over 50 years ago, ACM awarded its first A.M. Turing Award to Alan Perlis for his work on advanced programming techniques and compiler construction," said ACM president Vicki L. Hanson. "In total, 64 people from around the world have received the Turing Award, recognizing work that laid the foundations of modern computing."

The award was presented to its 65th recipient, Sir Tim Berners-Lee, at the event in June.

The conference included more than 20 Turing Laureates speaking on topics related to their fields of study.

After welcomes from Hanson, program chair Craig Partridge, and master of ceremonies (and past ACM president) Dame Wendy Hall, 2008 Turing Laureate Barbara Liskov (who received the award "for contributions to practical and theoretical foundations of programming language and system design, especially related to data abstraction, fault tolerance, and distributed computing") offered a presentation on the "Impact of Turing Recipients' Work" focusing on the impact of early Turing recipients, which she described as "tremendous."

A session on "Advances in Deep Neural Networks" featured 2011 Turing Laureate Judea Pearl ("for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning"), who spoke about an evolutionary advance 40,000 years ago that allowed Homo sapiens to advance past competitor species Homo erectus and the Neanderthals. "The ability to imagine things that do not physically exist … the ability to model one's environment, imagine other worlds, served to accelerate evolution in favor of Homo sapiens," he said.

The session on "Restoring Personal Privacy Without Compromising National Security" featured 2015 Turing Laureate Whitfield Diffie (co-recipient of the award with Martin Hellman "for inventing and promulgating both …

Among the 22 Turing Laureates in attendance at the conference were: Front row, from left: Whitfield Diffie (2015), Martin Hellman (2015), Robert Tarjan (1986), Barbara Liskov (2008). Second row, from left: Vinton Cerf (2004), Richard Karp (1985), Richard Stearns (1993), Dana Scott (1976). Third row, from left: Ivan Sutherland (1988), Leslie Valiant (2010), Robert Kahn (2004). Fourth row, from left: Frederick Brooks (1999), Raj Reddy (1994), William (Velvel) Kahan (1989), Donald Knuth (1974).
Unfortunately, that's trumped by laws and government."

"Accountability is what we want from all systems," Reddy said. "The role of philosophers/ethicists is to convince the government," because "if it is not written into the law, nothing will change. Unless we find mechanisms to get it into the legal system, we can have all kinds of discussions and nothing will happen."

Opening the second day of the conference, 1974 Turing Laureate Donald Knuth ("for his major contributions to the analysis of algorithms and the design of programming languages, and in particular for his contributions to the 'art of computer programming' through his well-known books in a continuous series by this title") addressed "Computer Science as a Major Body of Accumulated Knowledge." Computer science, he said, shares with mathematics "the great privilege that we can invent the problems to work on." Basically, he said, computer science and mathematics "are two parallel disciplines with a lot in common, but a distinct difference."

Knuth said he was both "optimistic and pessimistic" about artificial intelligence, and that he is "more pessimistic when it is based on the notion humans make rational decisions."

The 79-year-old Knuth said he considers "computer programming is art, in the sense that it's not from nature, as well as being beautiful."

As a member of the panel discussing "Quantum Computing: Far Away? Around the Corner? Or Maybe Both at the Same Time?" 2000 Turing Laureate Andrew Chi-Chih Yao ("in recognition of his fundamental contributions to the theory of computation, including the complexity-based theory of pseudorandom number generation, cryptography, and communication complexity") said, "I am a believer in quantum computing," adding, "it seems clear that the technology of quantum computing is going to have a big practical impact." Yao described quantum computing as "a great experiment, and we're all waiting to see what can come of it." He also called it "a great paradigm for interdisciplinary computing."

The session on "Augmented Reality: From Gaming to Cognitive Aids and Beyond" was the only session to feature two Turing Laureates: 1988's Ivan Sutherland ("for his pioneering and visionary contributions to computer graphics, starting with Sketchpad, and continuing after"), and 1999's Frederick P. Brooks, Jr. ("for landmark contributions to computer architecture, operating systems, and software engineering").

Brooks said he has a vision of using augmented reality (AR) for the purpose of training emergency teams. He asked the panel about "the state of actual use of augmented reality today? Who is using it as a tool to earn their living?" Sutherland responded that the pilot of a jumbo jet, who trains in a simulator, is taking advantage of "some of the best VR (virtual reality) in use today," while Yvonne Rogers of University College London pointed out that head-up displays "are a reality for navigation." Peter Lee, of Microsoft AI and Research, said there is "a lot of belief, interest, and a growing amount of experimentation in AR," such as the ability to "teleport" (virtually visit other locations); he added, "If we can teleport, there really isn't a need for so many airplanes."

Sutherland added that the "greatest value of AR/VR is to show people things in a way that makes the underlying physics, the meaning, clear."

The full conference sessions are available at https://www.facebook.com/pg/AssociationForComputingMachinery/videos/.

—Lawrence M. Fisher

A young conference attendee takes a selfie with Ivan Sutherland (1988).
Panel discussions during the conference drew a packed house.

© 2017 ACM 0001-0782/17/09 $15.00
Charles W. Bachman: 1924–2017
An engineer best known for his work in database management systems, and in techniques of layered architecture that include Bachman diagrams.
CHARLES WILLIAM "CHARLIE" Bachman, the "father of databases" who received the ACM A.M. Turing Award for 1973 for creating the first database management system, died June 13 at the age of 92.

Who inspired Bachman? "The inventors, the developers of new concepts, the solvers of previously unsolved problems."

Born in Manhattan, KS, in 1924, Bachman earned his B.S. in mechanical engineering in 1948, as well as an M.S. in mechanical engineering from the University of Pennsylvania.

He went to work for Dow Chemical in 1950, using mechanical punched-card computing devices to solve networks of simultaneous equations representing data from Dow plants. In 1957, Bachman became head of Dow's Data Processing Department, through which he became a member of Share Inc., and a founding member of the Share Data Processing Committee.

In 1960, Bachman joined the General Electric (GE) Production Control Services Group in New York City, using a factory in Philadelphia to test designs for a system to automate factory planning, scheduling, operational control, and inventory control. The resulting MIACS was based on the Integrated Data Store (IDS), Bachman's concept of an "information inventory," and was first to adopt the "network data model" in which the system would support and enforce relationships between records.

Bachman moved to GE's Computer Department in 1964, where he helped build another management information system, the Weyerhauser Comprehensive Operating Supervisor (WEYCOS 2).

Bachman was awarded the ACM A.M. Turing Award for 1973 for his contributions to database technology. As biographer Thomas Haigh observed, "Bachman was the first Turing Award winner without a Ph.D., the first to be trained in engineering rather than science, the first to win for the application of computers to business administration, the first to win for a specific piece of software, and the first who would spend his whole career in industry."

The British Computer Society named Bachman a Distinguished Fellow in 1977 for his work in database systems.

Bachman received the U.S. National Medal of Technology and Innovation (NMTI) for 2012. The award was presented to Bachman in 2014 by President Barack Obama. He was nominated for the NMTI by U.S. Senator Edward J. Markey (D-MA), who said, "The United States would not be the worldwide hub for technological innovation had it not been for the achievements of Charles Bachman."

Data scientist Gary Rector said Bachman was "humble, kind, generous, and a gentle soul; his entire family reflects that humanity. Charlie loved flowers and had a smile that embraced everyone. His heart connected to people more meaningfully than any database could ever do merely with data. To connect to people in this way is the greatest lesson he gave me."

George Colliat, a colleague from GE, said, "I have learned from his ability to look for solutions that transcend the problems at hand and thereby multiply the value of the solutions." He added, "Charlie's human values have influenced me as much as his creative genius. His respect for his colleagues, always looking for their positive contribution, his patience in explaining ideas to people who were not always at his level, his humility and open mind in always listening to others as an opportunity to learn something new, characterize him as a gentleman in this industry."

Haigh last saw Bachman when he was "close to 90 but still sharp and enjoying life; talking about the article he was working on and his chats with E.O. Wilson in the retirement community they shared. He never stopped trying to understand how things worked, or trying to make them work better. I feel honored to have known him."

In 2014, Bachman was named a Fellow of the ACM for his contributions to database technology. Bachman was named a Fellow of the Computer History Museum in 2015, for his work on database management systems. Also that year, Michigan State University awarded Bachman an honorary doctorate of engineering for being "at the forefront of computer science for more than 65 years."

Bachman's son, Jon, said his father's vision of the Integrated Data Store resulted in "a high-performance direct access storage model (that) allows developers to build large efficient databases of any type of business or operational data. In fact, the first versions were so successful that they became established as the most important system software on mainframe computers of that era."

In an interview in 2008, Bachman was asked who in the IT industry "inspired you or was a role model for you?" He replied, "The inventors, the developers of new concepts, the solvers of previously unsolved problems, the assemblers of new and interesting combinations of old technologies. Take Sir Maurice Wilkes, Edsger Dijkstra, Sir Tim Berners-Lee."
DIGITAL TECHNOLOGIES HAVE unleashed profound forces changing and reshaping rule making in the democracies of the information society. Today, we are witnessing a transformative period for law and governance in the digital age. Elected representative government and democratically chosen rules vie for authority with new players who have emerged from the network environment. At the same time, network technologies have unraveled basic foundational prerequisites for the rule of law in democracy like privacy, freedom of association, and government oversight. The digital age, thus, calls for the emergence of a Digitocracy—a new set of more complex governance mechanisms assuring public accountability for online power held by state and nonstate actors through the creation of new checks and balances among a more diverse group of players than democracy's traditional grouping of a representative legislature, executive branch, and judiciary.

Where Google and Facebook know more than most spy agencies about the lives of millions of citizens as well as the inner workings of companies and governments, information powerhouses and platforms can establish their own rules for citizens' interactions online. Where public-sector surveillance and private-sector tracking are so pervasive, citizens lose the ability to control the disclosure of their thoughts, friends, and activities, and no longer have privacy. Where lone coders wreak massive havoc for private gain or for opposition to governmental policies, they can use their information resources to reject majority rule. Where technology can protect the anonymity of wrongdoers, rule-breakers can escape accountability. In short, the modern information society destroys one of the most fundamental truths of any democracy, that "the power to make the laws rests with those chosen by the people."a

a King v. Burwell, 135 S. Ct. 2480, 2496 (2015).

The Internet's Promise
Without a doubt, the Internet revolutionized the dissemination of information and the ability of individuals to engage with each other. The euphoria surrounding the early days of the Internet's expansion into the public sphere predicted that technology would expand democracy and empower citizens around the world. The conventional wisdom thought citizen participation would multiply online with e-government, and the public would have better oversight of the state thanks to new capabilities for monitoring administrative and executive actions. The power of the Internet to disseminate information from one to millions and the power of the Internet to foster conversations seemed an unstoppable force for democratic discourse. Popular movements like the Arab Spring, the Occupy Movement, and the Bernie Sanders U.S. presidential campaign illustrated that information technologies could indeed significantly enhance and enable political organizing on a new, unprecedented scale. Many expected that mechanisms like open electronic proceedings for rule making and open data for government transparency would herald better representative government and decision making.
The Internet's technical infrastructure turns out to challenge the promise of the political empowerment of citizens. Just as network technologies offered organizational tools for political empowerment, the technologies themselves provided the means to reverse the hope that the Internet would be a one-way pro-democracy force. Network infrastructure proved that it could be used to frustrate empowerment dreams. Egypt, for example, pulled the plug on the Internet for several days during the Arab Spring uprisings to block political organizing; Brazil shut down WhatsApp for 48 hours; local police in the U.S. used stealth Stingray technology to engage in large-scale geo-surveillance of citizens. And, at the same time, Twitter bots flooded social media in order to shut down political dialog or to falsify support for candidates, while hate and bullying flourish online. In short, the Internet has embedded the means to block political empowerment and discourse.

Undermining Democracy
In the intervening years since the early euphoria over the Internet's political potential, the embedding of the Internet in our daily lives has effectively demonstrated new vulnerabilities. The Internet's infrastructure has already displaced three key areas essential to the rule of law in democracy: sovereignty, government accountability, and respect for law. Internet technologies restructure a state's ability to prescribe and assure the enforcement of law. Governments forfeit sovereignty to networks when services like cloud computing transcend borders and enable organizations to choose rules in the blink of an eye. Network architecture enables technology developers and service providers to embed rules for online activities through infrastructure choices. For example, cloud service providers like Dropbox make determinations every day on the security of users' data. These encryption decisions determine the very capability of states to examine user data in lawful investigations.

Network infrastructure undermines the oversight and accountability of government. While open government technologies enable greater transparency of public institutions, electronic tools also empower governments to circumvent traditional political checks and balances, and the public's oversight of government suffers irreparably. For example, in Oakland, CA, the police engaged in a mass-scale surveillance program to geo-locate thousands of mobile phones using stingray devices without any judicial approval, and in New York City, the police program to record drivers through traffic cams and smart city sensors also escapes judicial oversight. At the same time, technologically enabled leaks and wide dissemination of non-public activities of government through sites like WikiLeaks may jeopardize legitimate functions of government such as international relations and active law enforcement investigations. Snowden's leaks, for example, are reported to have endangered the lives of British MI6 agents in Russia and China.

Laws lose their authority when governments can no longer control the use of power to enforce rules and hackers have control over weapons of mass disruption. Network infrastructure removes the state's monopoly on the use of coercive, police power to enforce rules and protect its citizens.
viewpoints
Technology allows lone-wolf actors unchecked by states to create and deploy weapons of mass disruption, whether through malware, ransomware, or botnets. For example, hospitals across the U.S. in the spring of 2016 faced a wave of ransomware attacks that left some in a "state of emergency." ISIS uses crowdsourcing to sow terror in the U.S. and Europe. Simultaneously, the infrastructure empowers private actors to engage in vigilante actions. The underground group Anonymous recently illustrated such actions when it threatened an electronic attack against ISIS following the Paris massacres in November 2015. In essence, individuals and associations now have tools—outside the ability of state control—to enforce their choices and rules online in ways that are independent of the state. To be sure, when a Texas college discovered in 2015 that Facebook provided better real-time information for an on-campus police emergency than 911, it becomes clear the state has even lost control over basic information it needs to protect its citizens.

Beyond undermining key aspects of the rule of law, the Internet's infrastructure has toppled critical, substantive legal pillars of democracy. Freedom of thought and association as well as public safety are essential elements of democracy, and privacy is a prerequisite. Yet the network infrastructure contradicts the basic tenets of freedom of association and privacy. Network functionality works thanks to ubiquitous data surveillance. The resulting transparency of citizens to those in the network undermines both the state's and citizens' respect for the rule of law. States lose important checks and balances against omnipotent acquisition of information, and citizens' freedom of thought and association are undercut. Counterintuitively, public safety and security are also destabilized by the transparency when stalkers, social engineering hackers, and cyberwarriors find the informational keys to success readily accessible online.

Freedom of expression is another cornerstone of democracy. Yet democracies have a capability problem dealing with socially destructive content like hate, threats, and cyberbullying that jeopardizes public order and individual safety. Technology allows rapid and widespread dissemination of harmful content, while wrongdoers can shield their activities from accountability through encryption and anonymity tools. At the same time, freedom of expression limits the authority of states to ban nefarious online content. In the U.S., for example, there is no public recourse for the rapid growth of anti-Semitic Twitter accounts. Users must appeal to the social media firms who, in turn, then decide what to suppress or censor. By contrast, in Europe, platforms bear more legal responsibility for content, but firms are often left in the same position as an all-powerful censor. In effect, government is unable to suppress the vile and corrosive online material that threatens citizens without resorting to oppressive, anti-democratic controls.

The Opportunity of Digitocracy
The information society lacks a model of governance suited to the digital age. Going forward, the digital age will need a new system of checks and balances for its political decision making—a "Digitocracy"—offering the opportunity to develop new governing principles that articulate who regulates what to preserve public accountability online.

Our challenge is how to construct the appropriate checks and balances. Digitocracy's dynamic will be much more complex than the analog world's. Online private rule making, like Twitter's decisions regarding censorship, Adobe's technical protections on digital content, and Facebook's settings for privacy, has become more powerful in people's lives than rules from the democratic constitutional framework.

Business organizations are likely to serve as counterweights to government power. Google's Transparency Report, Apple's defiance of an FBI request for encryption keys, and Microsoft's challenge to U.S. government access to foreign-based servers each reflect a check on the state's intrusiveness. And individuals like Snowden may serve as counterweights to states and businesses. Individuals and associations of individuals have direct authority when they coalesce with online tools ranging from social media to hacktivism as they perceive the need to interject and amplify their end goals online, all while national government provides checks on overreaching private actors. Where each actor from a state to an individual can assure mass disruption online, fair governance will require co-existence among the rule-making actors.

At the core, the assurance of public accountability online is the key objective of Digitocracy. The mechanisms for states, private actors, and citizens to co-exist as rule-makers in the networked society are likely to be defined in unexpected ways incorporating notions of federalism, multistakeholder governance, and subsidiarity. These tools will draw the boundaries of rule-making authority among the state actors, platform operators, corporate organizations, and empowered users. Each actor, whether state or non-state, has an important role to prevent overreaching by the other actors. In essence, Digitocracy constructs a more multifaceted set of interwoven checks and balances to establish limits on the powers of both state and non-state actors and a reliance on both to protect the public good. For our future, now is the time to begin the robust public discussion on our means of governance in the digital age.

Joel R. Reidenberg (jreidenberg@law.fordham.edu) is the Stanley D. and Nikki Waxberg Chair and Professor of Law, Fordham University; Director, Fordham Center on Law and Information Policy; and Visiting Research Affiliate, Center for Information Technology Policy, Princeton University.

The author is preparing a book on this topic to be published by Yale University Press.

Copyright held by author.
Computing Ethics
Is That Social Bot Behaving Unethically?
A procedure for reflection and discourse on the behavior of bots in the context of law, deception, and societal norms.

ATTEMPTING TO ANSWER the question posed by the title of this column requires us to reflect on moral goods and moral evils—on laws, duties, and norms, on actions and their consequences. In this Viewpoint, we draw on information systems ethics6,7 to present Bot Ethics, a procedure the general social media community can use to decide whether the actions of social bots are unethical. We conclude with a consideration of culpability.

Social bots are computer algorithms in online social networks.8 They can share messages, upload pictures, and connect with many users on social media. Social bots are more common than people often think.a Twitter has approximately 23 million of them, accounting for 8.5% of total users; Facebook has an estimated 140 million social bots, between 1.2% and 5.5% of total users.b,c Almost 27 million Instagram users (8.2%) are estimated to be social bots.d LinkedIn and Tumblr also have significant social bot activity.e,f Sometimes their activity on these networks can be in … service by disseminating information about earthquakes, as they happen, in the San Francisco Bay area. However, in other situations, social bots can behave quite unethically.

Items purchased by Random Darknet Shopper, an automated computer program designed as an online shopping system that would make random purchases on the deep Web. The robot would have its purchases delivered to a group of artists who then put the items in an exhibition in Switzerland; the robot was 'arrested' by Swiss police after it bought illegal drugs.

Social Bots Behaving Unethically
Social bots have been reported to behave badly in a variety of ways across various contexts—everything from disseminating spami and fake newsj to limiting free speech.k But it is not always clear whether their undesirable actions are unethical;
there are shades of gray that are difficult to judge. For example, Tay,l a social bot created by Microsoft to conduct research on conversational understanding, went from "humans are super cool" to "Hitler was right I hate the Jews" in less than 24 hours on Twitter due to malicious humans interacting with the social bot.m In another case, a social bot tweeted "I seriously want to kill people" from randomly generated sentences during a fashion convention in Amsterdam.n Clearly such inadvertent comments violate our sensibilities and are distasteful, but are they unethical? Perhaps, but by what standard do we judge? Some social bots do more than just comment—clearly those that steal information and other misdeeds are engaging in unethical activity, but, again, it is not always so clear. For instance, the Random Darknet Shopper—a social bot coded to explore the dark Web in the name of art—inadvertently purchased 10 Ecstasy pills (an illegal narcotic) and a counterfeit passport.o So a law was broken, but was this unethical behavior? We developed a procedure, which we describe next, to help answer such questions.

Bot Ethics: How to determine whether social bot actions are unethical. (Figure: a social bot action is evaluated against three questions in sequence: 1. Break law? 2. Involve deception? 3. Violate strong norm? A "yes" at any step triggers a justification test—appeal to majority, higher duty, or evil less than good. If the act is justifiable it is not unethical; otherwise it is unethical.)

Bot Ethics: A Procedure to Evaluate the Ethics of Social Bot Activity
Ethics in philosophy dates back thousands of years, and this Viewpoint column cannot do justice to the entire field. However, because of the increasing prominence of social bots and their potential for malicious activity, ethical judgment about their activity is necessary. The best way to guide ethical conduct in a community is to provide a procedure for reflection and discourse.5 The procedure we created is called "Bot Ethics" (see the figure here) and it focuses on the behavior of social bots with respect to law, deception, and norms.

Break Law?
Many laws are developed from ethical principles.6 Even when a law may be flawed, it is typically the ethical course of action to follow that law.9 Therefore a natural first question is: "Does the action of the social bot break the law?" The objective is to assess straightforward ethical questions, such as whether algorithms plant viruses in someone else's device. This is clearly illegal and unethical. There are cases where a social bot might ethically violate the law, such as civil disobedience for a cause the creator considers just. However, civil disobedience is only ethical in very rare cases in constitutional democracies where legal recourse for unjust laws pervades.6 Cases where a law may be broken that are not unethical require justification—compelling arguments that appeal to moral standards of the majority.6 Only in such rare cases may illegal acts be seen as moral and therefore ethical.6 Thus we ask "Is the illegal act justifiable?" Acts that are not suitably justifiable (that is, do not appeal to the morality of the majority) are unethical. Swiss authorities did not file charges against the Random Darknet Shopper developers.p They argued that social bots can buy illegal narcotics over the Internet for the purpose of artq and that "ecstasy in this presentation was safe." The behavior was not unethical because it was justified according to the pervading morality of the community.

Involve Deception?
If a social bot's behavior does not break any laws, next evaluate for truthfulness: "Is any deception involved?" Social bots may act deceitfully. For example, they can misrepresent themselves as human beings2 or spread untruthful information (such as fake news). Deceiving acts communicate false or erroneous assertions, violating the prima facie duty of fidelity. Social bots should always act truthfully.3 However, deceitful acts can be justifiable if the duty of fidelity is superseded by a higher-order duty, such as beneficence.r Deceptive, satirical actions may not be unethical since they elicit pleasure, improving the life of others. Consider Big Data Batmans as an illustration.

l https://twitter.com/TayandYou
m http://bit.ly/14bDiuN
n http://bit.ly/2ttN5Ox
o http://bit.ly/2vFGdu9
p By "developer" we are referring to either the organization or management of the organization or the software developer involved in the creation of the social bot.
q http://bit.ly/2ud2cZC
r Beneficence is the duty to bring virtue, knowledge, or pleasure to others; other duties, according to Ross 1930, include non-maleficence, self-improvement, justice, gratitude, and reparation (see Mason et al.,7 p. 132–133).
s http://bit.ly/2ttNUH7
SE PT E MB E R 2 0 1 7 | VO L. 6 0 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 31
viewpoints
The Profession of IT
Multitasking
Without Thrashing
Lessons from operating systems teach
how to do multitasking without thrashing.
OUR INDIVIDUAL ABILITY to be productive has been hard stressed by the sheer load of task requests we receive via the Internet. In 2001, David Allen published Getting Things Done,1 a best-selling book about a system for managing all our tasks to eliminate stress and increase productivity. Allen claims that a considerable amount of stress comes our way when we have too many incomplete tasks. He views tasks as loops connecting someone making a request and you as the performer who must deliver the requested results. Getting systematic about completing loops dramatically reduces stress.

Allen says that operating systems are designed to get tasks done efficiently on computers. Why not export key ideas about task management into a personal operating system? He calls his operating system GTD, for Getting Things Done. The GTD system supports you in tracking open loops and moving them toward completion. It routes incoming requests to one of these destinations in your filing system:
• Trash
• Tasks that might one day turn out to be worth doing
• Tasks that serve as potential future reference points
• Tasks delegated to someone else, awaiting their response
• Tasks that can be completed immediately in under two minutes
• Tasks accepted for processing

The first four destinations basically remove incoming tasks from your workspace, the fifth closes quick loops, and the sixth holds your incomplete loops. GTD helps you keep track of these unfinished loops.

The idea of tasks being closed loops of a conversation between a requester and a performer was first proposed in 1979 by Fernando Flores.5 The "conditions of satisfaction" that are produced by the performer define loop completion and allow tracking the movement of the conversation toward completion. Incomplete loops have many negative consequences including accumulations of dissatisfaction, stress, and distrust.

Many people have found the GTD operating system to be very helpful at completing their loops, maintaining satisfaction with work, and reducing stress. It is a fine example of us taking lessons from technology to improve our lives.

Multitasking
Unfortunately, GTD does not eliminate another source of stress that was much less of a problem in 2001 than today. This is the problem of thrashing when you have too many tasks in progress at the same time.2

The term multitasking is used in operating systems to mean executing multiple computational processes simultaneously. The very first operating system to do this was the Atlas supervisor, running at the University of Manchester, U.K., in 1959. IBM brought the idea to the commercial world with its OS 360 in 1965. Operating systems implement multitasking by cycling a CPU through a list of all incomplete tasks, giving each one a time slice on the CPU. If the task does not complete by the end of its time slice, the OS interrupts it and puts it on the end of the list. To switch the CPU context, the OS saves all the CPU registers of the current task and loads the registers of the new task. The designers set the time slice length long enough to keep the total context switch time insignificant. However, if the time slice is too short, the system can significantly slow down due to rapidly accumulating context-switching time.

When main memory was small, multitasking was implemented by loading only one task at a time. Thus, each context switch forced a memory swap: the pages of the running task were saved to disk, and then the pages of the new task loaded. Page swapping is extremely expensive. The 1965-era OSs eliminated this problem by combining multitasking with multiprogramming: the pages of all active tasks stay loaded in main memory and context switching involves no swapping. However, if too many tasks were activated, their allocations would be too small and they would page excessively, causing system throughput to collapse. Engineers called this thrashing, a shorthand for "paging to death."

Eventually researchers discovered the root cause of thrashing and built control systems to eliminate it—I will return to this shortly.
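The cycling just described is round-robin scheduling. As a minimal illustration only (the task names and slice counts here are invented, and real schedulers track much more state), it can be sketched in a few lines of Python:

```python
from collections import deque

def round_robin(tasks, slice_cost=1, switch_cost=0):
    """Cycle through tasks, giving each one time slice per turn.

    tasks: dict mapping task name -> remaining work, in slices.
    Returns (finish order, total elapsed time including
    context-switch overhead).
    """
    queue = deque(tasks.items())
    finished, elapsed = [], 0
    while queue:
        name, remaining = queue.popleft()
        elapsed += slice_cost + switch_cost   # run one slice, pay the switch
        remaining -= 1
        if remaining == 0:
            finished.append(name)             # task completed its loop
        else:
            queue.append((name, remaining))   # interrupted: back to end of list
    return finished, elapsed

# Hypothetical workload: three tasks needing 1, 2, and 3 slices of work.
done, t = round_robin({"A": 1, "B": 2, "C": 3})
print(done, t)  # ['A', 'B', 'C'] 6
```

Raising `switch_cost` relative to `slice_cost` models the slowdown the column mentions: with a very short time slice, a growing share of the elapsed time is pure context-switching overhead rather than useful work.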
Figure 1. In this memory map of a Firefox browser in Linux, the colored pixels indicate that a page (vertical axis) is used during a fixed-size execution interval (horizontal axis). The locality sets (pages used) are small compared to the whole address space and their use persists over extended intervals.
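The working set behind Figure 1 is directly measurable: it is the set of distinct pages a task referenced in a backward-looking window of T memory references. A minimal sketch, with an invented page-reference trace for illustration:

```python
def working_set(trace, t, window):
    """Distinct pages referenced in the backward-looking window
    of `window` references ending just before time t."""
    start = max(0, t - window)
    return set(trace[start:t])

# Hypothetical reference trace: the task loops over pages 1-3,
# then its locality set shifts to pages 7-9.
trace = [1, 2, 3, 1, 2, 3, 1, 2, 3, 7, 8, 9, 7, 8, 9]

print(working_set(trace, 9, 6))   # {1, 2, 3}: first locality set
print(working_set(trace, 15, 6))  # {7, 8, 9}: after the phase transition
```

With the window T matched to the sampling interval, the working set tracks the locality set closely, which is why the used pages in Figure 1 cluster into persistent bands.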
Human Multitasking
Humans multitask too by juggling several incomplete tasks at once. Cognitive scientists and psychologists have studied human multitasking for almost two decades. Their main finding is that humans do not switch tasks well. Psychologist Nancy Napier illustrates with a simple do-it-yourself test.7 Write "I am a great multitasker" on line 1 and the series of numbers 1, 2, 3, …, 20 on line 2. Time how long it takes to do this. Now do it again, alternating one letter from line 1 and one numeral from line 2. Time how long it takes. For most people, the fine-grained multitasking in the second run takes over twice as long as the one-task-at-a-time first run. Moreover, you are likely to make more errors while multitasking. This test reveals just how slow our brains are at context switching. You can try the test a third time using time-slicing, for example writing five letters and then switching to write five numerals. With fewer context switches, time-slicing is faster than fine-grained multitasking but still slower than one-at-a-time processing.

Human context switching is more complicated than computer context switching. Whereas the computer context switch replaces a fixed number of bytes in a few CPU registers, the human has to recall what was "on the mind" at the time of the switch and, if the human was interrupted with no opportunity to choose a "clean break," the human has to reconstruct lost short-term memory.

Context switching is not the only problem. Whereas a computer picks the next task from the head of a queue, your brain has to consider all the tasks and select one, such as the most urgent or the most important. The time to choose a next task goes up faster than linearly with the number of tasks. Moreover, if you have several urgent important tasks, your brain can get stuck in a decision process that can take quite a long time to decide—a situation known as the choice uncertainty problem.4

A third factor that slows human multitasking is gathering the resources necessary to continue with a task. Some resources are physical such as books, equipment, and tools. Some are digital such as files, images, sounds, Web pages, and remote databases. And some are mental, things you have to remember about where you were in the task and what approach you were taking to perform it. All these resources must be close at hand so that you can access them quickly.

These three problems plague multitaskers of all age groups. Many studies report considerable evidence of negative effects—multitasking seems to reduce productivity, increase errors, increase stress, and exhaust us. Some researchers report that multitaskers are less likely to develop expertise in a topic because they do not get enough intensive focused practice with it. Some fret that if we do not learn to manage our multitasking well, we may wind up becoming a world of dilettantes with few experts to keep our technology running.

Thrashing happens to human multitaskers when they have too many incomplete tasks. They fall into a mood of "overwhelm" in which they experience considerable stress, cannot choose a next task to work on, and cannot stay focused on the chosen task. It can be a difficult state to recover from.

Let us now take a look at what OSs do to avoid thrashing and see what lessons we can take to avoid it ourselves.

Locality, Working Sets, and Thrashing
The OS seeks to allocate memory among multiple tasks so as to maximize system throughput—the number of completed tasks per second.3

The accompanying Figure 1 is strong graphical evidence of the principle of locality—computations concentrate their memory accesses to relatively small locality sets over extended intervals. Locality should be no surprise—it reflects the way human designers approach tasks. (Figure 1 courtesy of Adrian McMenamin.)

We use the term working set for the OS's estimate of a task's locality set. The formal definition is that the working set is the pages used in a backward-looking window of a fixed size of T memory references. In Figure 1, T is the length of the sampling interval and the working set equals the locality set 97% of the time.

Each task needs a workspace—its own area of memory in which to load its pages. There are at least two ways to divide the total memory among the active tasks. In fixed partitioning, the OS gives each task a fixed workspace. In working-set partitioning, the OS gives each task a variable workspace that tracks its locality sets. Fixed partitioning is susceptible to thrashing as the number of tasks sharing memory increases because each gets a smaller workspace and, when the workspaces are smaller than the working sets, every task is quickly interrupted by a page fault.

Under working-set partitioning the OS sizes the workspaces to hold each task's measured working set. As shown in Figure 2, it loads tasks into memory until the unused free space is too small to hold the next task's working set; the remaining tasks are held aside in a queue until there is room for their working sets. When a task has a page fault, the new page is added to its workspace by taking a free page; when any page has not been used for T memory references, it is evicted from the task's workspace and placed in the free space. Thus, the OS divides the memory among the active tasks such that each task's workspace tracks its locality sets. Page faults do not steal pages from other working sets. This strategy automatically adjusts the load (number of active tasks) to keep throughput near its maximum and to avoid thrashing.

Figure 2. OS control system to maximize throughput with variable partition of main memory determined by task working sets. [Diagram: incoming requests join a queue of tasks awaiting activation; a valve opens to admit the first waiting task when its working set (WS) fits into the free space; main memory holds the active tasks' working sets plus the free space; accepted tasks either complete or are put aside by the OS.]

Context switching is not the cause of thrashing. The cause of thrashing is the failure to give every active task enough space for its working set, thereby causing excessive movement of pages between secondary and main memory.

Translation to Human Multitasking
Although the analogy with OSs is not perfect, there are some lessons:
• Recognize that each task needs a variable working set of resources (physical, digital, and mental), which must be easily accessible in your workspace. Analog: the working set of pages.
• Your capacity to deal with a task is the resources and time needed to get it done. Analog: the memory and CPU time needed for a task.
• Some tasks need to be held aside in an inactive status until you have the capacity to deal with them. Analog: the waiting tasks queue.
• When a task's working set is in your workspace, protect it from being unloaded as long as the task is active. Analog: protect working sets of active tasks and do not steal from other tasks.
• You will thrash if you activate too many tasks so that the total demand is beyond your capacity. Analog: insufficient CPU and memory for active tasks.
• If you are able to choose moments of context switch, select a moment of "clean break" that requires little mental reacquisition time when you return to the task. If you cannot defer an interruption to such a moment, you will need more reacquisition time because you will have to reconstruct short-term memory lost at the interruption. Analog: ill-timed interrupts can cause loss of part of a working set.

You are likely to find that you cannot accommodate more than a few active tasks at once without thrashing. However, with the precautions described here, thrashing is unlikely. If it does occur you will feel overwhelmed and your processing efficiency will be badly impaired. To exit the thrashing state, you need to reduce demand or increase your capacity. You can do this by reaching out to other people—making requests for help, renegotiating deadlines, acquiring more resources, and in some cases canceling less important tasks.

References
1. Allen, D. Getting Things Done. Penguin, 2001.
2. Christian, B. and Griffiths, T. Algorithms to Live By: The Computer Science of Human Decisions. Henry Holt and Company, 2016.
3. Denning, P. Working sets past and present. IEEE Trans. Software Engineering SE-6, 1 (Jan. 1980), 64–84.
4. Denning, P. and Martell, C. Great Principles of Computing. MIT Press, 2015.
5. Flores, F. Conversations for Action and Collected Essays. CreateSpace Independent Publishing Platform, 2012.
6. McMenamin, A. Applying working set heuristics to the Linux kernel. Master's thesis, Birkbeck College, University of London, 2011; http://bit.ly/2vFSgY8
7. Napier, N. The myth of multitasking, 2014; http://bit.ly/1vuBGcC

Peter J. Denning (pjd@nps.edu) is Distinguished Professor of Computer Science and Director of the Cebrowski Institute for information innovation at the Naval Postgraduate School in Monterey, CA, is Editor of ACM Ubiquity, and is a past president of ACM.

The author's views expressed here are not necessarily those of his employer or the U.S. federal government.

Copyright held by author.
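The load control in Denning's Figure 2, activating a waiting task only when its measured working set fits in free memory, can be sketched as follows. This is an illustration only; the task names and working-set sizes are invented:

```python
from collections import deque

def admit(waiting, memory_pages):
    """Working-set admission control: activate waiting tasks in FIFO
    order while each task's measured working set fits in free memory;
    hold the rest aside to prevent thrashing.

    waiting: list of (task_name, working_set_pages) pairs.
    Returns (active task names, tasks still waiting, free pages).
    """
    queue = deque(waiting)
    active, free = [], memory_pages
    while queue and queue[0][1] <= free:    # valve: does the first waiting task fit?
        name, ws = queue.popleft()
        active.append(name)
        free -= ws                          # reserve its whole working set
    return active, list(queue), free

# Hypothetical: 100 pages of memory, four tasks with measured working sets.
active, held, free = admit([("A", 40), ("B", 30), ("C", 50), ("D", 10)], 100)
print(active, held, free)  # ['A', 'B'] [('C', 50), ('D', 10)] 30
```

Note the strictly FIFO valve: task D is held back even though it would fit, because the queue admits only the first waiting task, matching the "open valve when first waiting task's WS fits into free" rule in the figure. No admitted task ever runs with less memory than its working set, which is exactly the condition that prevents thrashing.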
Viewpoint
Why Agile Teams Fail
Without UX Research
Failures to involve end users or to collect comprehensive data representing
user needs are described and solutions to avoid such failures are proposed.
LESSONS LEARNED BY two researchers in the software industry point to recurrent failures to incorporate user experience (UX) research or design research. This leads agile teams to miss the mark with their products because they neglect or mischaracterize the target users' needs and environment. While the reported examples focus on software, the lessons apply equally well to the development of services or tangible products.

Why It Matters to the ACM Community
Over the past 15 years, agile and lean product development practices have increasingly become the norm in the IT industry.3 At the same time, two synergistic trends have also emerged.
• End users' demand for good user experience has increased significantly, with wide adoption of mobile devices. Any new application needs to do something useful or fun, plus it needs to do it well and fast enough. In 2013, technology analysts found that only 16% of people tried a new mobile app more than twice, suggesting that users have low tolerance for poor user experience (UX) (where UX is the totality of the user's interactions supported by the app to accomplish a goal).9
• With growing emphasis on good UX design, UX professionals, both designers and researchers, are gradually being incorporated as required roles in software development, alongside product managers and software developers. A 2014 Forrester survey of 112 companies found that organizations in which there was systematic investment in UX design process and user research self-evaluated as having greater impact than those with more limited scope of investment.

These trends describe a new context that often finds agile teams unprepared for two main reasons. First, while the agile process formally values the principle of collaboration with customers to define the product vision, we and our colleagues in industry too often observe this principle not being put into practice: teams do not validate requirements systematically in the settings of use. Second, even when customers are involved, sometimes the teams may still fail to involve actual end users. As Rosenberg puts it, when user requirements are not validated but are still called "user stories," it creates "the illusion of user requirements" that fools the team and the executives, who are then mystified when the product fails in the marketplace.10

In this Viewpoint, we illustrate five classic examples of failures to involve actual end users or to gather sufficiently comprehensive data to represent their needs. Then we propose how these failures can be avoided.

Five Cases of Neglect or Mischaracterizations of the User
We identified five classic cases of failures to involve actual end users.

The Wild West case. The first and most obvious case occurs when the team does not do regular testing with the users along the development process. Thus the team fails to evaluate how well the software built fits target users, their tasks, and their environments. A real-life example of this failure is the development and deployment of Healthcare.org, where the team, admittedly, did not fully test the online health insurance marketplace until two weeks before it opened to the public on October 1, 2013. Then the site ran into major failures.8

Chooser ≠ target user. The second case is neither new nor unique to agile. The term "customer" conflates the chooser with the user. Let's unpack these words:
• A customer is often an organization (the target buyer of enterprise software, that is, the product chooser) as represented by the purchasing officer, an executive or committee that makes a buying decision.
• A customer is the target user only for consumer-facing products. For enterprise software, target users may be far from the process of choosing a product, and have no input about products the organization selects.

Agile terminology adds to the confusion: product teams write user stories from the perspective of the person who uses the software, not the one who chooses it. Then a customer demo (or stakeholder review) at the end of an iteration confirms that each user story is satisfied. Here is when the terms customer and user are conflated. For enterprise software and large systems, practice teaches us that often the "end-of-iteration customer" is someone representing the product chooser rather than the end user.

So the end-of-iteration demo cannot be the sole form of feedback to predict user adoption and satisfaction. In addition, the software development team should also leverage user research to answer questions such as:
• What are the classes of users (personas)?
• Have we validated that the intended users have the needs specified in the user stories?
• What are the current user practices before the introduction of the product and the impact afterward?
• How would we extend the tool to support new personas or future use cases?

Internal proxies ≠ target user. The third case is about bias. Some teams work with their in-house professional services or sales support staff (that is, experts thought to represent large groups of customers) as proxies for end users. While we appreciate the expertise and knowledge these resources bring, we are wary of two common types of misrepresentation in these situations.

First, internal proxies are unrepresentative as end users because they have multiple unfair advantages: they know the software inside out, including the work-arounds; have access to internal tools unavailable to external customers; and do not need to use the product within the target users' time constraints or digital environment.

Second, the evidence internal proxies bring to the team is also biased. Professional sales and support staff are more likely to channel the needs of the largest or most strategic existing customers in the marketplace. They are more likely to focus on pain points of existing customers and less on what works well. Also, they may ignore new requirements that are not yet addressed by the current tool or market. Therefore internal staff cannot be the sole representative of "users"—as shown in the "Dilbert" comic strip at the beginning of this column. (Dilbert © 2012 Scott Adams. Used by permission of Andrews McMeel Syndication. All rights reserved.) User research welcomes their comments about competitive analysis, current insights about information architecture or other issues, which complement customer support data, UX research, and other sources of user feedback.

Executives liking sales demos ≠ target users adopting product. Enterprise software companies, during their annual customer conferences, use a sales demo to portray features and functions intended to excite the audience of buyers, investors, and market analysts about the company strategy. However, positive responses to the sales demos should not be taken as equivalent to assertions about a product's user requirements. Instead, these requirements need confirmation via a careful validation cycle. Let sales demos open a door toward users with the help of choosers and influencers.

Similarly, Customer Advisory Boards (which draw from customers who have large installations, or who represent a specific or important segment of the market) stand in for all customers and offer additional opportunities to showcase future features or strategy. However, a basic law for success in the software industry is "Build Once, Sell Many."7 This principle creates an inherent tension between satisfying current customers and attracting new ones. Therefore, a software company needs to constantly rethink its tiered offerings to include new market segments or customer classes as these emerge, and avoid one-off development efforts.
Confusing business leaders with users or the sales demo with the product prototype leads companies to build products based on what sales and product managers believe is awesome (for example, see Loranger6). Instead, we advocate validating the designs with actual end users during the product development.

Big data (What? When?) < The full picture (... How? Why?). Collecting and analyzing big data about digital product use is popular among product managers and even software developers, who can now learn what features get traction with users. We support the use of big data techniques as part of user research and user-centered design, but not as a substitute for qualitative user research. Let's review two familiar ways to use big data on usage: user data analytics and A/B testing.

User data analytics can quickly answer questions about current usage: quantity and most frequent patterns, such as How many? How often? When? Where? Once a product team has worked out most of the design (interaction patterns, page layouts, and more), A/B testing compares design alternatives, such as "which image on a page produces more click-throughs?" In vivo experiments with sufficient traffic can generate large amounts of useful data. Thus, A/B testing is very helpful for small incremental adjustments.

Every software company is in the business of finding and keeping new customers. Suppose the logs show the subscribers of an online dating application are not renewing. Should the company rejoice or despair? If people are getting good matches, and thus are satisfied, non-renewal implies success. If they are hopelessly disappointed by not getting dates, non-renewal implies failure. Big data won't tell you which, but observing and listening to even a handful of non-renewing individuals will.

In brief, quantitative data is useful but has two limitations: First, it will not tell the team why the current features are or are not used.5 Different classes of users can have different reasons. Second, it will not identify what additional or alternative features appeal to a new class of users unfamiliar with the product. To answer these questions the team needs to rely on qualitative research with existing and proposed classes of users.

Market Research ≠ User Research
Finally, we point to the growing and worrisome tendency in industry to mix up user research with market research. Market research groups make great partners for user research. While user research and market research have a few techniques in common (for example, surveys and focus groups), the goals and variables they focus on are different.
• Market research seeks to understand attitudes toward products, categories, or brands, and tries to predict the likelihood of purchase, engagement, or subscription.
• User research aims at improving the user experience by understanding the relation between actual usage behaviors and the properties of the design. To this end, it measures the behavior and attitudes of users, thereby learning whether the product (or service) is usable, useful, and delightful, including after the decision to purchase.

We urge organizations to act strategically and connect market research, user research, and customer success functions. This requires aligning goals and sharing data among Marketing, Sales, Customer Success, and the UX Team (typically in Product or R&D).1,4

The Way Forward: Educate Managers and Agile Development Teams
We have shown five different ways that agile teams without user research are prone to building the wrong product. To avoid such failures, we invite software managers and product teams to assess and fill the current gap in a team's competencies. The closing table gives short-term and longer-term action items to address the gaps.

Table. Actions to address gaps in UX competencies.

Short term
1. Analyze the current skills of the team and flag the gap. A functional product team needs several key skill sets or UX competencies: UX research, UX design, UI software development and prototyping.11 These might be filled by training the current team members or hiring UX professionals full-time or part-time.
2. Support product managers (or product owners) with investment in UX. Too often, product managers find their role is a sort of "kitchen sink" for any task that is not software development. We encourage product managers to find additional resources in the UX competencies, to benefit both the product and their workload.

Longer term
3. Integrate UX competencies.
a. Teams need UX research competencies as well as UX design skills (interaction, visual). Other related skill sets include content development and documentation; accessibility; globalization and localization.
4. Collect and prioritize findings from user research.
a. Seek user feedback early and often.
b. Create channels to learn from end users and appropriate surrogates.
c. Prioritize UX issues during backlog grooming; remove friction and measure delight.
d. Build new features only after steps 4.a.–c. are done for each key version of the product.

References
1. Buley, L. The modern UX organization. Forrester Report (2016); https://vimeo.com/121037431
2. Grudin, J. From Tool to Partner: The Evolution of Human-Computer Interaction. Morgan & Claypool, 2017.
3. HP report. Agile Is the New Normal: Adopting Agile Project Management. 4AA5-7619ENW, May 2015.
4. Kell, E. Interview by Steve Portigal. Portigal blog. Podcast and transcript (Mar. 1, 2016); http://www.portigal.com/podcast/10-elizabeth-kell-of-comcast/
5. Klein, L. UX for Lean Startups: Faster, Smarter User Experience Research and Design. O'Reilly, 2013.
6. Loranger, H. UX without user research is not UX. Nielsen Norman Group blog (Aug. 10, 2014); http://www.nngroup.com/articles/ux-without-user-research/
7. Mironov, R. Four laws of software economics. Part 2: Law of build once, sell many (Sept. 14, 2015); http://www.mironov.com/4law2/
8. Pear, R. Contractors describe limited testing of insurance web site. New York Times (Oct. 24, 2013); http://nyti.ms/292NryG
9. Perez, S. Users have low tolerance for buggy apps. TechCrunch (Mar. 12, 2013); http://tcrn.ch/Y80ctA
10. Rosenberg, D. Introducing the business of UX. Interactions 21, 1 (Jan.–Feb. 2014).
11. Spool, J.M. Assessing your team's UX skills. UIE (Dec. 10, 2007); https://www.uie.com/articles/assessing_ux_teams/

Gregorio Convertino (gconvertino@informatica.com) is a UX manager and principal user researcher at Informatica LLC.

Nancy Frishberg (nancyf@acm.org) is a UX researcher and strategist, in private practice, and a 25+-year member of the local SIGCHI Chapter BayCHI.org.
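An A/B comparison of the kind discussed in this Viewpoint ("which image on a page produces more click-throughs?") usually reduces to comparing two proportions. A minimal sketch of a two-sided two-proportion z-test; the traffic and click counts are hypothetical:

```python
from math import sqrt, erf

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Two-sided two-proportion z-test for click-through rates.
    Returns (z statistic, p-value)."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)       # pooled click rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # pooled standard error
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)), with Phi built from erf.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical experiment: image A got 120 clicks in 2,000 views,
# image B got 90 clicks in 2,000 views.
z, p = two_proportion_z(120, 2000, 90, 2000)
print(round(z, 2), round(p, 3))  # z ≈ 2.13, p ≈ 0.033
```

A significant result tells the team which image wins, but, as the authors stress, not why users clicked; that question still requires qualitative research with the users themselves.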
Viewpoint
When Does Law Enforcement's Demand to Read Your Data Become a Demand to Read Your Mind?
On cryptographic backdoors and prosthetic intelligence.
THE RECENT DISPUTE between the FBI and Apple has raised a potent set of questions about companies' right to design strong cryptographic protections for their customers' data. The real stakes in these questions are not just whether the security of our devices should be weakened to facilitate FBI investigations, but ultimately, the ability of law enforcement and intelligence agencies to read our minds and most intimate private thoughts.

In the U.S. and other countries, there have been many legal cases in recent years pitting the demands of law enforcement against the concerns of technology companies and privacy advocates over access to new, technologically generated, information about people. The disputed topics have included spy agencies' bulk collection …
Where is this heading? Consider a future technological innovation—a brain reader. It is a little device that you attach to your skull that lets someone read your thoughts. This could be a great boon to law enforcement. Trials could be conducted more accurately by reading the thoughts of the defendant. Even better, everyone could be required to attend a daily mind reading to make sure they are not plotting any criminal acts. This would significantly cut down on premeditated crime, making our lives safer. Then we can concentrate on unpremeditated crime. Possibly there are some thoughts that people who are likely to commit unpremeditated crimes might think. We can proscribe those thoughts, and then preemptively arrest people for thought crime. While we are at it, the morality police can put in laws against thinking racist, sexist, extremist, sacrilegious, offensive, or fattening thoughts.

While such an extreme society may have a low crime rate, some people (including us) may think this police state would not actually be a better society to live in. Even ignoring the horrors that would result from imperfect readings, who doesn't feel guilty about something? As attributed to Cardinal Richelieu, "If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him." Such devices do not exist yet, although the demand has been strong enough that polygraphs, notorious for unreliability, are widely used in the U.S. Other technologies like fMRI are already being used and may turn out to be slightly more accurate than polygraphs, but we are still some distance from having to worry about the societal effects of active mind-reading machines.

What we have instead is a society moving toward prosthetic brains that can be monitored at all times by the state, without the inconvenience of having everyone check in each day at the police station. It may feel less invasive to have one's eye movements recorded by your augmented reality glasses when an attractive member of the opposite sex walks past than to pay a daily visit to the mind reader. The former is certainly more convenient than the latter. But practically speaking, the effects are the same.

The available information is not complete, and there will be gaps. But you can infer a great deal from limited data. Think about how well you know your friends, and how you can often predict what decisions they will make, with only the small view of their world that you get from your interactions with them. With access to a vast store of reference information, massive deductions can be made.

Conversely, the possibility of faulty deductions is itself a threat to individuals. You would not want to have performed Internet searches for pressure cookers and backpacks just before the Boston marathon bombings.

Dedicated, well-meaning people in law enforcement naturally want to be able to do their jobs better and make the world a safer, and thus better, place. They see the new data as a boon, and law enforcement agencies select extremely unphotogenic criminals and terrorists as the test cases that will set the rules for millions of other people. Unfortunately, while this surveillance apparatus may occasionally be useful, it also poses a structural threat to democracy.

Even beyond the threat of police states in the Western world and elsewhere, there is a fundamental issue with cryptography: mathematics works the same regardless of whether you are naughty or nice. So if the state can break cryptography, then so can other actors. There are obvious direct applications to crime—knowing when someone is away from home; knowing who is worth kidnapping and what their movements are; identity theft, bank fraud, and so forth. But ineffective cryptography also strengthens the black market for industrial espionage—many people would pay to know the thoughts of their competitors, people they are negotiating with, or even people they are considering going on a date with.

Of course, the state is not the only institution that wants to read your mind. There is great value to corporations in knowing about you. They collect this data from phone apps and operating systems, credit cards, and web browsers; they use it to help design their products, but also for targeted advertising, differential pricing, and other debatable purposes. People joke, semi-seriously, that Google knows you better than you know yourself. As well as being a threat in their own right, corporations provide an additional target of attack for an intrusive state: as Snowden's leaks revealed, the NSA didn't try to track the location of every cellphone on the planet directly: they let advertisements and tracking code in apps collect the data for them.

Ultimately, the question of what to do about the data accumulated by technology companies is different from the question of what to do about the FBI, but it should also be understood that we have largely given these companies the power to read our minds, and we might want to find alternatives to that arrangement.

We fear we are slowly moving toward the era of universal mind monitoring without having recognized and considered it in those terms. And those are the terms in which we should understand battles about the right to use effective cryptography. That wonderful gadget in your pocket is not a phone. It is a prosthetic part of your mind—which happens to also be able to make telephone calls. We need to think of it as such, and ask again which parts of our thoughts should be categorically shielded against prying by the state.

Andrew Conway (andrewed@greatcactus.org) is an engineer and mostly retired entrepreneur. He founded and ran Silicon Genetics.

Peter Eckersley (pde@eff.org) is Chief Computer Scientist for the Electronic Frontier Foundation, San Francisco, CA.

Copyright held by authors.
The Calculus of Service Availability

AS DETAILED IN Site Reliability Engineering: How Google Runs Production Systems1 (hereafter referred to as the SRE book), Google products and services seek high-velocity feature development while maintaining aggressive service-level objectives (SLOs) for availability and responsiveness. An SLO says that the service should almost always be up, and the service should almost always be fast; SLOs also provide precise numbers to define what "almost always" means for a particular service. SLOs are based on the following observation:

The vast majority of software services and systems should aim for almost-perfect reliability rather than perfect reliability—that is, 99.999% or 99.99% rather than 100%—because users cannot tell the difference between a service being 100% available and less than "perfectly" available. There are many other systems in the path between user and service (laptop, home WiFi, ISP, the power grid ...), and those systems collectively are far less than 99.999% available. Notable exceptions include safety-critical systems such as antilock brakes and pacemakers!

For a detailed discussion of how SLOs relate to SLIs (service-level indicators) and SLAs (service-level agreements), see the "Service Level Objectives" chapter in the SRE book. That chapter also details how to choose metrics that are meaningful for a particular service or system, which in turn drives the choice of an appropriate SLO for that service.

This article expands upon the topic of SLOs to focus on service dependencies. Specifically, we look at how the availability of critical dependencies informs the availability of a service, and how to design a service to mitigate and minimize critical dependencies.

Most services offered by Google aim to offer 99.99% (sometimes referred to as the "four 9s") availability to users. Some services contractually commit to a lower figure externally but set a 99.99% target internally. This more stringent target accounts for situations in which users become unhappy with service performance well before a contract violation occurs, as the number one aim of an SRE team is to keep users happy. For many services, a 99.99% internal target represents the sweet spot that balances cost, complexity, and availability. For some services, notably global cloud services, the internal target is 99.999%.

99.99% Availability: Observations and Implications

Let's examine a few key observations about and implications of designing and operating a 99.99% service and then move to a practical application.

Observation 1. Sources of outages. Outages originate from two main sources: problems with the service itself and problems with the service's critical dependencies. …
…ing appropriate units.

Implication 1. Rule of the extra 9. A service cannot be more available than the intersection of all its critical dependencies. If your service aims to offer 99.99% availability, then all of your critical dependencies must be significantly more available than 99.99%.

…face of errors, and so on.)

Implication 2. The math vis-à-vis frequency, detection time, and recovery time. A service's unavailability is bounded below by its incident frequency multiplied by its average detection-plus-recovery time.

…is, the published availability) is also an option, and often it is the correct choice: make it clear to the dependent service that it should either reengineer its system to compensate for your service's availability or reduce its own target.
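Implication 2 can be made concrete with a small, illustrative calculation. The function and the example numbers below are ours, not the article's, though the inputs mirror the worked example later in the article:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def implied_availability(outages_per_year, detect_minutes, recover_minutes):
    """Availability implied by outage frequency and the per-outage
    detection plus recovery time (Implication 2)."""
    downtime = outages_per_year * (detect_minutes + recover_minutes)
    return 1 - downtime / MINUTES_PER_YEAR

# Four outages a year, each taking 5 minutes to detect and 10 to mitigate:
a = implied_availability(4, 5, 10)  # ~0.999886, just short of four 9s
```

Even modest detection and recovery times consume most of a four-9s budget, which is why the article treats frequency, detection, and recovery as the three levers on availability.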
practice
Key Definitions

Some of the terms and concepts used throughout this article may not be familiar to readers who don't specialize in operations.

Capacity cache: A cache that serves precomputed results for API calls or queries to a service, generating cost savings in terms of compute/IO resource needs by reducing the volume of client traffic hitting the underlying service. Unlike the more typical performance/latency cache, a capacity cache is considered critical to service operation. A drop in the cache hit rate or cache ratio below the SLO is considered a capacity loss. Some capacity caches may even sacrifice performance (for example, redirecting to remote sites) or freshness (for example, CDNs) in order to meet hit rate SLOs.

Customer isolation: Isolating customers from each other may be advantageous so that the behavior of one customer doesn't impact other customers. For example, you might isolate customers from one another based on their global traffic. When a given customer sends a surge of traffic beyond what they're provisioned for, you can start throttling or rejecting this excess traffic without impacting traffic from other customers.

Failing safe/failing open/failing closed: Strategies for gracefully tolerating the failure of a dependency. The "safe" strategy depends on context: failing open may be the safe strategy in some scenarios, while failing closed may be the safe strategy in others.

Failing open: When the trigger normally required to authorize an action fails, failing open means to let the action happen rather than blocking it. For example, a building exit door that normally requires badge verification "fails open" to let you exit without verification during a power failure.

Failing closed is the opposite of failing open. For example, a bank vault door denies all attempts to unlock it if its badge reader cannot contact the access-control database.

Failing safe means whatever behavior is required to prevent the system from falling into an unsafe mode when expected functionality suddenly doesn't work. For example, a given system might be able to fail open for a while by serving cached data, but then fail closed when that data becomes stale (perhaps because past a certain point, the data is no longer useful).

Failover: A strategy that handles failure of a system component or service instance by automatically routing incoming requests to a different instance. For example, you might route database queries to a replica database, or route service requests to a replicated server pool in another datacenter.

Fallback: A mechanism that allows a tool or system to use an alternative source for serving results when a given component is unavailable. For example, a system might fall back to using an in-memory cache of previous results. While the results may be slightly stale, this behavior is better than outright failure. This type of fallback is an example of graceful degradation.

Geographic isolation: You can build additional reliability into your service by isolating particular geographic zones to have no dependencies on each other. For example, if you separate North America and Australia into separate serving zones, an outage that occurs in Australia because of a traffic overload won't also take out your service in North America. Note that geographic isolation does come at increased cost: isolating these geographic zones also means that Australia cannot borrow spare capacity in North America.

Graceful degradation: A service should be "elastic" and not fail catastrophically under overload conditions and spikes—that is, you should make your applications do something reasonable even if not all is right. It is better to give users limited functionality than an error page.

Integration testing: The phase in software testing in which individual software modules are combined and tested as a group to verify that they function correctly together. These "parts" may be code modules, individual applications, or client and server applications on a network, among others. Integration testing is usually performed after unit testing and before final validation testing.

Operational readiness practice: Exercises designed to ensure the team supporting a service knows how to respond effectively when an issue arises, and that the service is resilient to disruption. For example, Google performs disaster-recovery test drills continuously to make sure that its services deliver continuous uptime even if a large-scale disaster occurs.

Rollout policy: A set of principles applied during a service rollout (a deployment of any sort of software component or configuration) to reduce the scope of an outage in the early stages of the rollout. For example, a rollout policy might specify that rollouts occur progressively, on a 5%/20%/100% timeline, so that a rollout proceeds to a larger portion of customers only when it passes the first milestone without problems. Most problems will manifest when the service is exposed to a small number of customers, allowing you to minimize the scope of the damage. Note that for a rollout policy to be effective in minimizing damage, you must have a mechanism in place for rapid rollback.

Rollback: The ability to revert a set of changes that have been previously rolled out (fully or not) to a given service or system. For example, you can revert configuration changes or run a previous version of a binary that's known to be good.

Sharding: Splitting a data structure or service into shards is a management strategy based on the principle that systems built for a single machine's worth of resources don't scale. Therefore, you can distribute resources such as CPU, memory, disk, file handles, and so on across multiple machines to create smaller, faster, more easily managed parts of a larger whole.

Tail latency: When setting a target for the latency (response time) of a service, it is tempting to measure the average latency. The problem with this approach is that an average that looks acceptable can hide a "long tail" of very large outliers, where some users may experience terrible response times. Therefore, the SRE best practice is to measure and set targets for 95th- and/or 99th-percentile latency, with the goal of reducing this tail latency, not just average latency.
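The tail-latency point is easy to see numerically. The following illustrative snippet (ours, not the article's; it uses a simple nearest-rank percentile) shows an average that looks fine while the 99th percentile reveals the tail:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value covering p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# 1,000 requests: most take 10 ms, but 15 take a full second.
latencies_ms = [10] * 985 + [1000] * 15
avg = sum(latencies_ms) / len(latencies_ms)  # 24.85 ms -- looks acceptable
p99 = percentile(latencies_ms, 99)           # 1000 ms -- reveals the long tail
```

Here 1.5% of users wait a full second, which the average entirely conceals and the 99th percentile exposes.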
If you do not correct or address the discrepancy, an outage will inevitably force the need to correct it.

Practical Application

Let's consider an example service with a target availability of 99.99% and work through the requirements for both its dependencies and its outage responses.

The numbers. Suppose your 99.99% available service has the following characteristics:
˲ One major outage and three minor outages of its own per year. Note that these numbers sound high, but a 99.99% availability target implies a 20- to 30-minute widespread outage and several short partial outages per year. (The math makes two assumptions: that a failure of a single shard is not considered a failure of the entire system from an SLO perspective, and that the overall availability is computed with a weighted sum of regional/shard availability.)
˲ Five critical dependencies on other, independent 99.999% services.
˲ Five independent shards, which cannot fail over to one another.
˲ All changes are rolled out progressively, one shard at a time.

The availability math plays out as follows.

Dependency requirements.
˲ The total budget for outages for the year is 0.01% of 525,600 minutes/year, or 53 minutes (based on a 365-day year, which is the worst-case scenario).
˲ The budget allocated to outages of critical dependencies is five independent critical dependencies, with a budget of 0.001% each = 0.005%; 0.005% of 525,600 minutes/year, or 26 minutes.
˲ The remaining budget for outages caused by your service, accounting for outages of critical dependencies, is 53 - 26 = 27 minutes.
˲ Time allotted for an on-call responder to start investigating an alert: five minutes. (On-call means that a technical person is carrying a pager that receives an alert when the service is having an outage, based on a monitoring system that tracks and reports SLO violations. Many Google services are supported by an SRE on-call rotation that fields urgent issues.)
˲ Remaining time for an effective mitigation: 10 minutes.

Implication. Levers to make a service more available. It's worth looking closely at the numbers just presented because they highlight a fundamental point: there are three main levers to make a service more reliable.
˲ Reduce the frequency of outages—via rollout policy, testing, design reviews, and other tactics.
˲ Reduce the scope of the average outage—via sharding, geographic isolation, graceful degradation, or customer isolation.
˲ Reduce the time to recover—via monitoring, one-button safe actions (for example, rollback or adding emergency capacity), operational readiness practice, and so on.

You can trade among these three levers to make implementation easier. For example, if a 17-minute MTTR is difficult to achieve, instead focus your efforts on reducing the scope of the average outage. Strategies for minimizing and mitigating critical dependencies are discussed in more depth later in this article.

Clarifying the "Rule of the Extra 9" for Nested Dependencies

A casual reader might infer that each additional link in a dependency chain calls for an additional 9, such that second-order dependencies need two extra 9s, third-order dependencies need three extra 9s, and so on.

This inference is incorrect. It is based on a naive model of a dependency hierarchy as a tree with constant fan-out at each level. In such a model, as shown in Figure 1, there are 10 unique first-order dependencies, 100 unique second-order dependencies, 1,000 unique third-order dependencies, and so on, leading to a total of 1,111 unique services even if the architecture is limited to four layers. A highly available service ecosystem with that many independent critical dependencies is clearly unrealistic.

Figure 1. Dependency hierarchy: Incorrect model.

A critical dependency can by itself cause a failure of the entire service (or service shard) no matter where it appears in the dependency tree. Therefore, if a given component X appears as a dependency of several first-order dependencies of a service, X should be counted only once, because its failure will ultimately cause the service to fail no matter how many intervening services are also affected.

The correct rule is as follows:
˲ If a service has N unique critical dependencies, then each one contributes 1/N to the dependency-induced unavailability of the top-level service, regardless of its depth in the dependency hierarchy.
˲ Each dependency should be counted only once, even if it appears multiple times in the dependency hierarchy (in other words, count only unique dependencies). For example, when counting dependencies of Service A in Figure 2, count Service B only once toward the total N.
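The budget arithmetic above is easy to verify with a quick back-of-the-envelope script. This is our illustration, not the article's; note the article rounds to whole minutes (53 and 26), so its "remaining" figure is 53 - 26 = 27:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def outage_budget_minutes(slo):
    """Yearly outage budget, in minutes, implied by an SLO."""
    return (1 - slo) * MINUTES_PER_YEAR

total = outage_budget_minutes(0.9999)      # ~52.6 -> the article's "53 minutes"
deps = 5 * outage_budget_minutes(0.99999)  # ~26.3 -> the article's "26 minutes"
own = total - deps                         # ~26.3; the article rounds 53 - 26 = 27
```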
For example, consider a hypothetical Service A, which has an error budget of 0.01%. The service owners are willing to spend half that budget on their own bugs and losses, and half on critical dependencies. If the service has N such dependencies, each dependency receives 1/Nth of the remaining error budget. Typical services often have about five to 10 critical dependencies, and therefore each one can fail only one-tenth or one-twentieth as much as Service A. Hence, as a rule of thumb, a service's critical dependencies must have one extra 9 of availability.

Figure 2. Multiple dependencies in the dependency hierarchy.
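The 1/N split for Service A can be sketched in a few lines. This is an illustrative helper of ours, not code from the article; it hard-codes the owners' 50/50 split between the service's own failures and its dependencies:

```python
def per_dependency_budget(error_budget, n_deps):
    """Error budget available to each unique critical dependency,
    assuming (as Service A's owners do) that half the total budget
    is reserved for the service's own bugs and losses."""
    return (error_budget / 2) / n_deps

# Service A: 0.01% budget and 5 critical dependencies -> 0.001% each,
# one-tenth of the top-level budget, i.e., one extra 9 of availability.
share = per_dependency_budget(0.0001, 5)
```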
Error Budgets

The concept of error budgets is covered quite thoroughly in the SRE book,1 but bears mentioning here. Google SRE uses error budgets to balance reliability and the pace of innovation. This budget defines the acceptable level of failure for a service over some period of time (often a month). An error budget is simply 1 minus a service's SLO, so the previously discussed 99.99% available service has a 0.01% "budget" for unavailability. As long as the service hasn't spent its error budget for the month, the development team is free (within reason) to launch new features, updates, and so on.

If the error budget is spent, the service freezes changes (except for urgent security fixes and changes addressing what caused the violation in the first place) until either the service earns back room in the budget, or the month resets. Many services at Google use sliding windows for SLOs, so the error budget grows back gradually. For mature services with an SLO greater than 99.99%, a quarterly rather than monthly budget reset is appropriate, because the amount of allowable downtime is small.

Error budgets eliminate the structural tension that might otherwise develop between SRE and product development teams by giving them a common, data-driven mechanism for assessing launch risk. They also give both SRE and product development teams a common goal of developing practices and technology that allow faster innovation and more launches without "blowing the budget."

Strategies for Minimizing and Mitigating Critical Dependencies

Thus far, this article has established what might be called the "Golden Rule of Component Reliability." This simply means that any critical component must be 10 times as reliable as the overall system's target, so that its contribution to system unreliability is noise. It follows that in an ideal world, the aim is to make as many components as possible noncritical. Doing so means the components can adhere to a lower reliability standard, gaining freedom to innovate and take risks.

The most basic and obvious strategy to reduce critical dependencies is to eliminate single points of failure (SPOFs) whenever possible. The larger system should be able to operate acceptably without any given component that's not a critical dependency or SPOF.

In reality, you likely cannot get rid of all critical dependencies, but you can follow some best practices around system design to optimize reliability. While doing so isn't always possible, it is easier and more effective to achieve system reliability if you plan for reliability during the design and planning phases, rather than after the system is live and impacting actual users.

Conduct architecture/design reviews. When you are contemplating a new system or service, or refactoring or improving an existing system or service, an architecture or design review can identify shared infrastructure and internal vs. external dependencies.

Shared infrastructure. If your service is using shared infrastructure—for example, an underlying database service used by multiple user-visible products—think about whether or not that infrastructure is being used correctly. Be explicit in identifying the owners of shared infrastructure as additional stakeholders. Also, beware of overloading your dependencies—coordinate launches carefully with the owners of these dependencies.

Internal vs. external dependencies. Sometimes a product or service depends on factors beyond company control—for example, code libraries, or services or data provided by third parties. Identifying these factors allows you to mitigate the unpredictability they entail.

Engage in thoughtful system planning and design. Design your system with the following principles in mind.

Redundancy and isolation. You can seek to mitigate your reliance upon a critical dependency by designing that dependency to have multiple independent instances. For example, if storing data in one instance provides 99.9% availability for that data, then storing three copies in three widely distributed instances provides a theoretical availability level of 1 - (0.001)³, or nine 9s, if instance failures are independent with zero correlation.

In the real world, the correlation is never zero (consider network backbone failures that affect many cells concurrently), so the actual availability will be nowhere close to nine 9s but is much higher than three 9s. Also note that if a system or service is "widely distributed," geographic separation is not always a good proxy for uncorrelated failures. You may be better off using more than one system in nearby locations than the same system in distant locations.

Similarly, sending an RPC (remote procedure call) to one pool of servers in one cluster may provide 99.9% availability for results, but sending three concurrent RPCs to three different server pools and accepting the first response that arrives helps increase availability to well over three 9s (as noted earlier). This strategy can also reduce tail latency if the server pools are approximately equidistant from the RPC sender. (Since there is a high cost to sending three RPCs concurrently, Google often stages the timing of these calls strategically: most of our systems wait a fraction of the allotted time before sending the second RPC, and a bit more time before sending the third RPC.)
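The same 1 - (1 - a)ⁿ arithmetic underlies both the three-copy storage example and the three-way RPC above. Here is an illustrative helper of ours (not from the article) that makes the independence assumption explicit:

```python
def replicated_availability(a, n):
    """Theoretical availability of n replicas, each with availability a.

    The system is down only if all replicas are down simultaneously,
    so unavailability is (1 - a) ** n. This assumes zero correlation
    between failures, which, as noted above, never holds in practice.
    """
    return 1 - (1 - a) ** n

# Three copies of 99.9%-available storage -> nine 9s (theoretical):
nine_nines = replicated_availability(0.999, 3)  # ~0.999999999
```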
Failover and fallback. Pursue software rollouts and migrations that fail safe and are automatically isolated should a problem arise. The basic principle at work here is that by the time you bring a human online to trigger a failover, you have likely already exceeded your error budget.

Where concurrency/voting is not possible, automate failover and fallback. Again, if the issue needs a human to check what the problem is, the chances of meeting your SLO are slim.

Asynchronicity. Design dependencies to be asynchronous rather than synchronous where possible so that they don't accidentally become critical. If a service waits for an RPC response from one of its noncritical dependencies and this dependency has a spike in latency, the spike will unnecessarily hurt the latency of the parent service. By making the RPC call to a noncritical dependency asynchronous, you can decouple the latency of the parent service from the latency of the dependency. While asynchronicity may complicate code and infrastructure, this trade-off will be worthwhile.

Capacity planning. Make sure that every dependency is correctly provisioned. When in doubt, overprovision if the cost is acceptable.

Configuration. When possible, standardize configuration of your dependencies to limit inconsistencies among subsystems and avoid one-off failure/error modes.

Detection and troubleshooting. Make detecting, troubleshooting, and diagnosing issues as simple as possible. Effective monitoring is a crucial component of being able to detect issues in a timely fashion. Diagnosing a system with deeply nested dependencies is difficult. Always have an answer for mitigating failures that doesn't require an operator to investigate deeply.

Fast and reliable rollback. Introducing humans into a mitigation plan substantially increases the risk of missing a tight SLO. Build systems that are easy, fast, and reliable to roll back. As your system matures and you gain confidence in your monitoring to detect problems, you can lower MTTR by engineering the system to automatically trigger safe rollbacks.

Systematically examine all possible failure modes. Examine each component and dependency and identify the impact of its failure. Ask yourself the following questions:
˲ Can the service continue serving in degraded mode if one of its dependencies fails? In other words, design for graceful degradation.
˲ How do you deal with unavailability of a dependency in different scenarios? Upon startup of the service? During runtime?

Conduct thorough testing. Design and implement a robust testing environment that ensures each dependency has its own test coverage, with tests that specifically address use cases that other parts of the environment expect. Here are a few recommended strategies for such testing:
˲ Use integration testing to perform fault injection—verify that your system can survive failure of any of its dependencies.
˲ Conduct disaster testing to identify weaknesses or hidden/unexpected dependencies. Document follow-up actions to rectify the flaws you uncover.
˲ Don't just load test. Deliberately overload your system to see how it degrades. One way or another, your system's response to overload will be tested; better to perform these tests yourself than to leave load testing to your users.

Plan for the future. Expect changes that come with scale: a service that begins as a relatively simple binary on a single machine may grow to have many obvious and nonobvious dependencies when deployed at a larger scale. Every order of magnitude in scale will reveal new bottlenecks—not just for your service, but for your dependencies as well. Consider what happens if your dependencies cannot scale as fast as you need them to.

Also be aware that system dependencies evolve over time and that your list of dependencies may very well grow over time. When it comes to infrastructure, Google's typical design guideline is to build a system that will scale to 10 times the initial target load without significant design changes.

Conclusion

While readers are likely familiar with some or many of the concepts this article has covered, assembling this information and putting it into concrete terms may make the concepts easier to understand and teach. Its recommendations are uncomfortable but not unattainable. A number of Google services have consistently delivered better than four 9s of availability, not by superhuman effort or intelligence, but by thorough application of principles and best practices collected and refined over the years (see SRE's Appendix B: A Collection of Best Practices for Production Services).

Acknowledgments

Thank you to Ben Lutch, Dave Rensin, Miki Habryn, Randall Bosetti, and Patrick Bernier for their input.

Related articles on queue.acm.org

There's Just No Getting Around It: You're Building a Distributed System
Mark Cavage
http://queue.acm.org/detail.cfm?id=2482856

Eventual Consistency Today: Limitations, Extensions, and Beyond
Peter Bailis and Ali Ghodsi
http://queue.acm.org/detail.cfm?id=2462076

A Conversation with Wayne Rosing
David J. Brown
http://queue.acm.org/detail.cfm?id=945162

Reference

1. Beyer, B., Jones, C., Petoff, J., Murphy, N.R. Site Reliability Engineering: How Google Runs Production Systems. O'Reilly Media, 2016; https://landing.google.com/sre/book.html.

Ben Treynor started programming at age six and joined Oracle as a software engineer at age 17. He has also worked in engineering management at E.piphany, SEVEN, and Google (2003-present). His current team of approximately 4,200 at Google is responsible for Site Reliability Engineering, networking, and datacenters worldwide.

Mike Dahlin is a distinguished engineer at Google, where he has worked on Google's Cloud Platform since 2013. Prior to joining Google, he was a professor of computer science at the University of Texas at Austin.

Vivek Rau is an SRE manager at Google and a founding member of the Launch Coordination Engineering sub-team of SRE. Prior to joining Google, he worked at Citicorp Software, Versant, and E.piphany. He currently manages various SRE teams tasked with tracking and improving the reliability of Google's Cloud Platform.

Betsy Beyer is a technical writer for Google, specializing in Site Reliability Engineering. She has previously written documentation for Google's Data Center and Hardware Operations Teams. She was formerly a lecturer on technical writing at Stanford University.

Copyright held by owner/authors. Publication rights licensed to ACM. $15.00.
DOI:10.1145/3080008
Article development led by queue.acm.org

Data Sketching

The approximate approach is often faster and more efficient.

BY GRAHAM CORMODE

DO YOU EVER feel overwhelmed by an unending stream of information? It can seem like a barrage of new email and text messages demands constant attention, and there are also phone calls to pick up, articles to read, and knocks on the door to answer. Putting these pieces together to keep track of what is important can be a real challenge.

The same information overload is a concern in many computational settings. Telecommunications companies, for example, want to keep track of the activity on their networks, to identify overall network health and spot anomalies or changes in behavior. Yet, the scale of events occurring is huge: many millions of network events per hour, per network element. While new technologies allow the scale and granularity of events being monitored to increase by orders of magnitude, the capacity of computing elements (processors, memory, and disks) to make sense of these is barely increasing. Even on a small scale, the amount of information may be too large to store in an impoverished setting (say, an embedded device) or to keep conveniently in fast storage.

In response to this challenge, the model of streaming data processing has grown in popularity. The aim is no longer to capture, store, and index every minute event, but rather to process each observation quickly in order to create a summary of the current state. Following its processing, an event is dropped and is no longer accessible. The summary that is retained is often referred to as a sketch of the data.

Coping with the vast scale of information means making compromises: the description of the world is approximate rather than exact; the nature of queries to be answered must be decided in advance rather than after the fact; and some questions are now insoluble. The ability to process vast quantities of data at blinding speeds with modest resources, however, can more than make up for these limitations.

As a consequence, streaming methods have been adopted in a number of domains, starting with telecommunications but spreading to search engines, social networks, finance, and time-series analysis. These ideas are also finding application in areas using traditional approaches, but where the rough-and-ready sketching approach is more cost effective. Successful applications of sketching involve a mixture of algorithmic tricks, systems know-how, and mathematical insight, and have led to new research contributions in each of these areas.

This article introduces the ideas behind sketching, with a focus on algorithmic innovations. It describes some algorithmic developments in the abstract, followed by the steps needed to put them into practice, with examples. The article also looks at four novel algorithmic ideas and discusses some emerging areas.

Simply Sampling
When faced with a large amount of information to process, there may be
city or have bought a certain product.

The method. To flesh this out, let's fill in a few gaps. First, how big should the sample be to supply good answers?

poll to 0.3% would require contacting 100,000 voters.

Second, how should the sample be drawn? Simply taking the first s re-

1, and if it is smaller than p, put the record in the sample. The problem with this approach is that you do not know in advance what p should be. In the previous analysis a fixed sample size s was desired, and using a fixed sampling rate p means there are too few elements initially, but then too many as more records arrive.

Presented this way, the question has the appearance of an algorithmic puzzle, and indeed this was a common question in technical interviews for many years. One can come up with clever solutions that incrementally adjust p as new records arrive. A simple and elegant way to maintain a sample is to adapt the idea of random tags. Attach to each record a random tag, and define the sample to be the s records with the smallest tag values. As new records arrive, the tag values decide whether to add the new record to the sample (and to remove an old item to keep the sample size fixed at s).

Discussion and applications. Sampling methods are so ubiquitous that there are many examples to consider. One simple case is within database systems. It is common for the database management system to keep a sample of large relations for the purpose of query planning. When determining how to execute a query, evaluating different strategies provides an estimate of how much data reduction may occur at each step, with some uncertainty of course. Another example comes from the area of data integration and linkage, in which a subproblem is to test whether two columns from separate tables can relate to the same set of entities. Comparing the columns in full can be time consuming, especially when you want to test all pairs of columns for compatibility. Comparing a small sample is often sufficient to determine whether the columns have any chance of relating to the same entities.

Entire books have been written on the theory and practice of sampling, particularly around schemes that try to sample the more important elements preferentially, to reduce the error in estimating from the sample. For a good survey with a computational perspective, see Synopses for Massive Data: Samples, Histograms, Wavelets and Sketches.11

Given the simplicity and generality of sampling, why would any other method be needed to summarize data? It turns out that sampling is not well suited for some questions. Any question that requires detailed knowledge of individual records in the data cannot be answered by sampling. For example, if you want to know whether one specific individual is among your customers, then a sample will leave you uncertain. If the customer is not in the sample, you do not know whether this is because that person is not in the data or because he or she did not happen to be sampled. A question like this ultimately needs all the presence information to be recorded and is answered by highly compact encodings such as the Bloom filter (described later).

A more complex example is when the question involves determining the cardinality of quantities. In a dataset that has many different values, how many distinct values of a certain type are there? For example, how many distinct surnames are in a particular customer dataset? Using a sample does not reveal this information. Let's say in a sample size of 1,000 out of one million records, 900 surnames occur just once among the sampled names. What can you conclude about the popularity of these names in the rest of the dataset? It might be that almost every other name in the full dataset is also unique. Or it might be that each of the unique names in the sample reoccurs tens or hundreds of times in the remainder of the data. With the sampled information there is no way to distinguish between these two cases, which leads to huge confidence intervals on these kinds of statistics. Tracking information about cardinalities, and omitting duplicates, is addressed by techniques such as HyperLogLog, addressed later.

Finally, there are quantities that only samples can estimate, but for which better special-purpose sketches exist. Recall that the standard error of a sample of size s is 1/√s. For problems such as estimating the frequency of a particular attribute (such as city of residence), you can build a sketch of size s so the error it guarantees is proportional to 1/s. This is considerably stronger than the sampling guarantee and only improves as we devote more space s to the sketch. The Count-Min sketch described later in this article has this property. One limitation is that the attribute of interest must be specified in advance of setting up the sketch, while a sample allows you to evaluate a query for any recorded attribute of the sampled items.

Because of its flexibility, sampling is a powerful and natural way of building a sketch of a large dataset. There are many different approaches to sampling that aim to get the most out of the sample or to target different types of queries that the sample may be used to answer.11 Here, more information is presented about less flexible methods that address some of these limitations of sampling.

Summarizing Sets with Bloom Filters
The Bloom filter is a compact data structure that summarizes a set of items. Any computer science data-structures class is littered with examples of "dictionary" data structures, such as arrays, linked lists, hash tables, and many esoteric variants of balanced tree structures. The common feature of these structures is that they can all answer "membership questions" of the form: Is a certain item stored in the structure or not? The Bloom filter can also respond to such membership questions. The answers given by the structure, however, are either "the item has definitely not been stored" or "the item has probably been stored." This introduction of uncertainty over the state of an item (it might be thought of as introducing potential false positives) allows the filter to use an amount of space that is much smaller than its exact relatives. The filter also does not allow listing the items that have been placed into it. Instead, you can pose membership questions only for specific items.

The method. To understand the filter, it is helpful to think of a simple exact solution to the membership problem. Suppose you want to keep track of which of a million possible items you have seen, and each one is helpfully labeled with its ID number (an integer between one and a million). Then you can keep an array of one million bits, initialized to all 0s. Every time you see an item i, you just set the ith bit in the array to 1. A lookup query for item j is correspondingly straightforward: just see whether bit j is a 1 or a 0. The structure is very compact: 125KB will suffice if you pack the bits into memory.
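This exact bit-array scheme is short enough to sketch directly (a minimal Python illustration; the class and method names are ours, and a production version would use a dedicated bitset library):

```python
class BitArrayMembership:
    """Exact membership over IDs 0..universe_size-1, one bit per possible ID."""

    def __init__(self, universe_size):
        # Pack eight bits per byte: one million IDs fit in ~125KB.
        self.bits = bytearray((universe_size + 7) // 8)

    def add(self, i):
        # Set the ith bit to 1.
        self.bits[i // 8] |= 1 << (i % 8)

    def contains(self, j):
        # Look up whether bit j is a 1 or a 0.
        return bool(self.bits[j // 8] & (1 << (j % 8)))

seen = BitArrayMembership(1_000_000)
seen.add(42)
print(seen.contains(42))  # True
print(seen.contains(43))  # False
```

Packing eight bits into each byte is what brings the structure down to the 125KB mentioned above.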
Real data, however, is rarely this nicely structured. In general, you might have a much larger set of possible inputs—think again of the names of customers, where the number of possible name strings is huge. You can nevertheless adapt your bit-array approach by borrowing from a different dictionary structure. Imagine the bit array is a hash table: you will use a hash function h to map from the space of inputs onto the range of indices for your table. That is, given input i, you now set bit h(i) to 1. Of course, now you have to worry about hash collisions in which multiple entries might map onto the same bit. A traditional hash table can handle this, as you can keep information about the entries in the table. If you stick to your guns and keep the bits only in the bit array, however, false positives will result: if you look up item i, it may be that entry h(i) is set to 1, but i has not been seen; instead, there is some item j that was seen, where h(i) = h(j).

Can you fix this while sticking to a bit array? Not entirely, but you can make it less likely. Rather than just hashing each item i once, with a single hash function, use a collection of k hash functions h1, h2, . . . hk, and map i with each of them in turn. All the bits corresponding to h1(i), h2(i) . . . hk(i) are set to 1. Now to test membership of j, check all the entries it is hashed to, and say no if any of them are 0.

There's clearly a trade-off here: initially, adding extra hash functions reduces the chances of a false positive as more things need to "go wrong" for an incorrect answer to be given. As more and more hash functions are added, however, the bit array gets fuller and fuller of 1 values, and therefore collisions are more likely. This trade-off can be analyzed mathematically, and the sweet spot found that minimizes the chance of a false positive. The analysis works by assuming that the hash functions look completely random (which is a reasonable assumption in practice), and by looking at the chance that an arbitrary element not in the set is reported as present.

If n distinct items are being stored in a Bloom filter of size m, and k hash functions are used, then the chance of a membership query that should receive a negative answer yielding a false positive is approximately exp(k ln(1 − exp(−kn/m))).4 While extensive study of this expression may not be rewarding in the short term, some simple analysis shows that this rate is minimized by picking k = (m/n) ln 2. This corresponds to the case when about half the bits in the filter are 1 and half are 0. For this to work, the number of bits in the filter should be some multiple of the number of items that you expect to store in it. A common setting is m = 10n and k = 7, which means a false positive rate below 1%. Note that there is no magic here that can compress data beyond information-theoretical limits: under these parameters, the Bloom filter uses about 10 bits per item and must use space proportional to the number of different items stored. This is a modest savings when representing integer values but is a considerable benefit when the items stored have large descriptions—say, arbitrary strings such as URLs. Storing these in a traditional structure such as a hash table or balanced search tree would consume tens or hundreds of bytes per item. A simple example is shown in Figure 1, where an item i is mapped by k = 3 hash functions to a filter of size m = 12, and these entries are set to 1.

Figure 1. Bloom filter with k = 3, m = 12.
i
0 1 1 0 0 0 1 1 0 0 0 1

Discussion and applications. The possibility of false positives needs to be handled carefully. Bloom filters are at their most attractive when the consequence of a false positive is not the introduction of an error in a computation, but rather when it causes some additional work that does not adversely impact the overall performance of the system. A good example comes in the context of browsing the Web. It is now common for Web browsers to warn users if they are attempting to visit a site that is known to host malware. Checking the URL against a database of "bad" URLs does this. The database is large enough, and URLs are long enough, that keeping the full database as part of the browser would be unwieldy, especially on mobile devices.

Instead, a Bloom filter encoding of the database can be included with the browser, and each URL visited can be checked against it. The consequence of a false positive is that the browser may believe that an innocent site is on the bad list. To handle this, the browser can contact the database authority and check whether the full URL is on the list. Hence, false positives are removed at the cost of a remote database lookup.

Notice the effect of the Bloom filter: it gives the all clear to most URLs and incurs a slight delay for a small fraction (or when a bad URL is visited). This is preferable both to the solution of keeping a copy of the database with the browser and to doing a remote lookup for every URL visited. Browsers such as Chrome and Firefox have adopted this concept. Current versions of Chrome use a variation of the Bloom filter based on more directly encoding a list of hashed URLs, since the local copy does not have to be updated dynamically and more space can be saved this way.

The Bloom filter was introduced in 1970 as a compact way of storing a dictionary, when space was really at a premium.3 As computer memory grew, it seemed that the filter was no longer needed. With the rapid growth of the Web, however, a host of applications for the filter have been devised since around the turn of the century.4 Many of these applications have the flavor of the preceding example: the filter gives a fast answer to lookup queries, and positive answers may be double-checked in an authoritative reference.

Bloom filters have been widely used to avoid storing unpopular items in caches. This enforces the rule that an item is added to the cache only if it has
been seen before. The Bloom filter is used to compactly represent the set of items that have been seen. The consequence of a false positive is that a small fraction of rare items might also be stored in the cache, contradicting the letter of the rule. Many large distributed databases (Google's Bigtable, Apache's Cassandra and HBase) use Bloom filters as indexes on distributed chunks of data. They use the filter to keep track of which rows or columns of the database are stored on disk, thus avoiding a (costly) disk access for nonexistent attributes.

Counting with Count-Min Sketch
Perhaps the canonical data summarization problem is the most trivial: to count the number of items of a certain type that have been observed, you do not need to retain each item. Instead, a simple counter suffices, incremented with each observation. The counter has to be of sufficient bit depth in order to cope with the magnitude of events observed. When the number of events gets truly huge, ideas such as Robert Morris's approximate counter can be used to provide such a counter in fewer bits12 (another example of a sketch).

When there are different types of items, and you want to count each type, the natural approach is to allocate a counter for each item. When the number of item types grows huge, however, you encounter difficulties. It may not be practical to allocate a counter for each item type. Even if it is, when the number of counters exceeds the capacity of fast memory, the time cost of incrementing the relevant counter may become too high. For example, a social network such as Twitter may wish to track how often a tweet is viewed when displayed via an external website. There are billions of Web pages, each of which could in principle link to one or more tweets, so allocating counters for each is infeasible and unnecessary. Instead, it is natural to look for a more compact way to encode counts of items, possibly with some tolerable loss of fidelity.

The Count-Min sketch is a data structure that allows this trade-off to be made. It encodes a potentially massive number of item types in a small array. The guarantee is that large counts will be preserved fairly accurately, while small counts may incur greater (relative) error. This means it is good for applications where you are interested in the head of a distribution and less so in its tail.

The method. At first glance, the sketch looks quite like a Bloom filter, as it involves the use of an array and a set of hash functions. There are significant differences in the details, however. The sketch is formed by an array of counters and a set of hash functions that map items into the array. More precisely, the array is treated as a sequence of rows, and each item is mapped by the first hash function into the first row, by the second hash function into the second row, and so on (note that this is in contrast to the Bloom filter, which allows the hash functions to map onto overlapping ranges). An item is processed by mapping it to each row in turn via the corresponding hash function and incrementing the counters to which it is mapped.

Given an item, the sketch allows its count to be estimated. This follows a similar outline to processing an update: inspect the counter in the first row where the item was mapped by the first hash function, and the counter in the second row where it was mapped by the second hash, and so on. Each row has a counter that has been incremented by every occurrence of the item. The counter was also potentially incremented by occurrences of other items that were mapped to the same location, however, since collisions are expected. Given the collection of counters containing the desired count, plus noise, the best guess at the true count of the desired item is to take the smallest of these counters as your estimate.

Figure 2 shows the update process: an item i is mapped to one entry in each row j by the hash function hj, and the update of c is added to each entry. It can also be seen as modeling the query process: a query for the same item i will result in the same set of locations being probed, and the smallest value returned as the answer.

Figure 2. Count-Min sketch data structure with four rows, nine columns.

Discussion and applications. As with the Bloom filter, the sketch achieves a compact representation of the input, with a trade-off in accuracy. Both provide some probability of an unsatisfactory answer. With a Bloom filter, the answers are binary, so there is some chance of a false positive response; with a Count-Min sketch, the answers are frequencies, so there is some chance of an inflated answer.

What may be surprising at first is that the obtained estimate is very good. Mathematically, it can be shown that there is a good chance that the returned estimate is close to the correct value. The quality of the estimate depends on the number of rows in the sketch (each additional row halves the probability of a bad estimate) and on the number of columns (doubling the number of columns halves the scale of the noise in the estimate). These guarantees follow from the random selection of hash functions and do not rely on any structure or pattern in the data distribution that is being summarized. For a sketch of size s, the error is proportional to 1/s. This is an improvement over the case for sampling where, as noted earlier, the corresponding behavior is proportional to 1/√s.
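The update and query processes just described can be sketched in a few lines of Python (a minimal illustration, not a tuned implementation; deriving one hash function per row from a salted digest is our own choice of hash family):

```python
import hashlib

class CountMinSketch:
    """Count-Min sketch: depth rows of width counters, one hash per row."""

    def __init__(self, width, depth):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _bucket(self, item, row):
        # Derive an independent-looking hash for each row by salting a digest
        # with the row index (an illustrative choice, not the only one).
        digest = hashlib.blake2b(item.encode(),
                                 salt=row.to_bytes(8, "little")).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def update(self, item, count=1):
        # Increment one counter per row.
        for r in range(self.depth):
            self.rows[r][self._bucket(item, r)] += count

    def estimate(self, item):
        # The smallest of the probed counters is the least-contaminated guess.
        return min(self.rows[r][self._bucket(item, r)]
                   for r in range(self.depth))

cms = CountMinSketch(width=9, depth=4)  # four rows, nine columns, as in Figure 2
for word in ["cat"] * 100 + ["dog"] * 10:
    cms.update(word)
print(cms.estimate("cat"))  # at least 100; collisions can only inflate it
```

Because collisions only ever add to a counter, the estimate can overstate a count but never understate it, which is why taking the minimum across rows is safe.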
Just as Bloom filters are best suited for the cases where false positives can be tolerated and mitigated, Count-Min sketches are best suited for handling a slight inflation of frequency. This means, in particular, they do not apply to cases where a Bloom filter might be used: if it matters a lot whether an item has been seen or not, then the uncertainty that the Count-Min sketch
introduces will obscure this level of precision. The sketches are very good for tracking which items exceed a given popularity threshold, however. In particular, while the size of a Bloom filter must remain proportional to the size of the input it is representing, a Count-Min sketch can be much more compressive: its size can be considered to be independent of the input size, depending instead on the desired accuracy guarantee only (that is, to achieve a target accuracy of ε, fix a sketch size of s proportional to 1/ε that does not vary over the course of processing data).

The Twitter scenario mentioned previously is a good example. Tracking the number of views that a tweet receives across each occurrence in different websites creates a large enough volume of data to be difficult to manage. Moreover, the existence of some uncertainty in this application seems acceptable: the consequences of inflating the popularity of one website for one tweet are minimal. Using a sketch for each tweet consumes only moderately more space than the tweet and associated metadata, and allows tracking which venues attract the most attention for the tweet. Hence, a kilobyte or so of space is sufficient to track the percentage of views from different locations, with an error of less than one percentage point, say.

Since their introduction over a decade ago,7 Count-Min sketches have found applications in systems that track frequency statistics, such as popularity of content within different groups—say, online videos among different sets of users, or which destinations are popular for nodes within a communications network. Sketches are used in telecommunications networks where the volume of data passing along links is immense and is never stored. Summarizing network traffic distribution allows hotspots to be detected, informing network-planning decisions and allowing configuration errors and floods to be detected and debugged.6 Since the sketch compactly encodes a frequency distribution, it can also be used to detect when a shift in popularities occurs, as a simple example of anomaly detection.

Counting Distinct Items with HyperLogLog
Another basic problem is keeping track of how many different items have been seen out of a large set of possibilities. For example, a Web publisher might want to track how many different people have been exposed to a particular advertisement. In this case, you would not want to count the same viewer more than once. When the number of possible items is not too large, keeping a list, or a binary array, is a natural solution. As the number of possible items becomes very large, the space needed by these methods grows proportional to the number of items tracked. Switching to an approximate method such as a Bloom filter means the space remains proportional to the number of distinct items, although the constants are improved.

Could you hope to do better? If you just counted the total number of items, without removing duplicates, then a simple counter would suffice, using a number of bits that is proportional to the logarithm of the number of items encountered. If only there were a way to know which items were new, and count only those, then you could achieve this cost.

The HyperLogLog (HLL) algorithm promises something even stronger: the cost needs to depend only on the logarithm of the logarithm of the quantity computed. Of course, there are some scaling constants that mean the space needed is not quite so tiny as this might suggest, but the net result is that quantities can be estimated with high precision (say, up to a 1%–2% error) with a couple of kilobytes of space.

The method. The essence of this method is to use hash functions applied to item identifiers to determine how to update counters so that duplicate items are treated identically. A Bloom filter has a similar property: attempting to insert an item already represented within a Bloom filter means setting a number of bits to 1 that are already recording 1 values. One approach is to keep a Bloom filter and look at the final density of 1s and 0s to estimate the number of distinct items represented (taking into account collisions under hash functions). This still requires space proportional to the number of items and is the basis of early approaches to this problem.15

To break this linearity, a different approach to building a binary counter is needed. Instead of adding 1 to
the counter for each item, you could add 1 with a probability of one-half, 2 with a probability of one-fourth, 4 with a probability of 1/8th, and so on. This use of randomness decreases the reliability of the counter, but you can check that the expected count corresponds to the true number of items encountered. This makes more sense when using hash functions. Apply a hash function g to each item i, with the same distribution: g maps items to j with probability 2^(−j) (say, by taking the number of leading zero bits in the binary expansion of a uniform hash value). You can then keep a set of bits indicating which j values have been seen so far. This is the essence of the early Flajolet-Martin approach to tracking the number of distinct items.8 Here a logarithmic number of bits is needed, as there are only this many distinct j values expected.

The HLL method reduces the number of bits further by retaining only the highest j value that has been seen when applying the hash function. This might be expected to be correlated to the cardinality, although with high variation; for example, there might be only a single item seen, which happens to hash to a large value. To reduce this variation, the items are partitioned into groups using a second hash function (so the same item is always placed in the same group), and information about the largest hash in each group is retained. Each group yields an estimate of the local cardinality; these are all combined to obtain an estimate of the total cardinality.

A first effort would be to take the mean of the estimates, but this still allows one large estimate to skew the result; instead, the harmonic mean is used to reduce this effect. By hashing to s separate groups, the standard error is proportional to 1/√s. A small example is shown in Figure 3. The figure shows a small example HLL sketch with s = 3 groups. Consider five distinct items a, b, c, d, e with their related hash values. From this, the following array is obtained: 3 2 1.

Figure 3. Example of HyperLogLog in action.
x     a    b    c    d    e
h(x)  1    2    3    1    3
g(x)  0001 0011 1010 1101 0101

The estimate is obtained by taking 2 to the power of each of the array entries and computing the sum of the reciprocals of these values, obtaining 1/8 + 1/4 + 1/2 = 7/8 in this case. The final estimate is made by multiplying αs·s² by the reciprocal of this sum. Here, αs is a scaling constant that depends on s. α3 = 0.5305, so 5.46 is obtained as the estimate—close to the true value of 5.

The analysis of the algorithm is rather technical, but the proof is in the deployment: the algorithm has been widely adopted and applied in practice.

Discussion and applications. One example of HLL's use is in tracking the viewership of online advertising. Across many websites and different advertisements, trillions of view events may occur every day. Advertisers are interested in the number of "uniques": how many different people (or rather, browsing devices) have been exposed to the content. Collecting and marshaling this data is not infeasible, but rather unwieldy, especially if it is desired to do more advanced queries (say, to count how many uniques saw both of two particular advertisements). Use of HLL sketches allows this kind of query to be answered directly by combining the two sketches rather than trawling through the full data. Sketches have been put to use for this purpose, where the small amount of uncertainty from the use of randomness is comparable to other sources of error, such as dropped data or measurement failure.

Approximate distinct counting is also widely used behind the scenes in Web-scale systems. For example, Google's Sawzall system provides a variety of sketches, including count distinct, as primitives for log data analysis.13 Google engineers have described some of the implementation modifications made to ensure high accuracy of the HLL across the whole range of possible cardinalities.10

A last interesting application of distinct counting is in the context of social network analysis. In 2016, Facebook set out to test the "six degrees of separation" claim within its social network. The Facebook friendship graph is sufficiently large (more than a billion nodes and hundreds of billions of edges) that maintaining detailed information about the distribution of long-range connections for each user would be infeasible. Essentially, the problem is to count, for each user, how many friends they have at distance 1, 2, 3, and so on. This would be a simple graph exploration problem, except that some friends at distance 2 are reachable by multiple paths (via different mutual friends). Hence, distinct counting is used to generate accurate statistics on reachability without double counting and to provide accurate distance distributions (the estimated number of degrees of separation in the Facebook graph is 3.57).2

Advanced Sketching
Roughly speaking, the four examples of sketching described in this article cover most of the current practical applications of this model of data summarization. Yet, unsurprisingly, there is a large body of research into new applications and variations of these ideas. Just around the corner are a host of new techniques for data summarization that are on the cusp of practicality. This section mentions a few of the directions that seem most promising.

Sketching for dimensionality reduction. When dealing with large high-dimensional numerical data, it is common to seek to reduce the dimensionality while preserving fidelity of the data. Assume the hard work of data wrangling and modeling is done and the data can be modeled as a massive matrix, where each row is one example point, and each column encodes an attribute of the data. A common technique is to apply PCA (principal components analysis) to extract a small number of "directions" from the data. Projecting each row of data along each of these directions yields a different representation of the data that captures most of the variation of the dataset.

One limitation of PCA is that finding the direction entails a substantial amount of work. It requires finding eigenvectors of the covariance matrix,
which rapidly becomes unsustainable the number of rows. Instead, applying that solving a problem a certain way
for large matrices. The competing ap- sketching to matrix A solves the prob- is the only option. Often, fast approxi-
proach of random projections argues lem in the lower-dimensional sketch mate sketch-based techniques can pro-
that rather than finding “the best” di- space.5 David Woodruff provides a vide a different trade-off.
rections, it suffices to use (a slightly comprehensive mathematical survey
larger number of) random vectors. of the state of the art in this area.16
Related articles
Picking a moderate number of ran- Rich data: Graphs and geometry. The on queue.acm.org
dom directions captures a comparable applications of sketching so far can be
It Probably Works
amount of variation, while requiring seen as summarizing data that might Tyler McMullen
much less computation. be thought of as a high-dimensional http://queue.acm.org/detail.cfm?id=2855183
The random projection of each row vector, or matrix. These mathematical
Statistics for Engineers
of the data matrix can be seen as an ex- abstractions capture a large number of Heinrich Hartmann
ample of a sketch of the data. More di- situations, but, increasingly, a richer http://queue.acm.org/detail.cfm?id=2903468
rectly, close connections exist between model of data is desired—say, to model
random projections and the sketches links in a social network (best thought of References
1. Ahn, K.J., Guha, S., McGregor, A. Analyzing graph
described earlier. The Count-Min sketch as a graph) or to measure movement pat- structure via linear measurements. In Proceedings of the
can be viewed as a random projection of terns of mobile users (best thought of as ACM-SIAM Symposium on Discrete Algorithms, (2012).
2. Bhagat, S., Burke, M., Diuk, C., Filiz, I.O., Edunov, S.
sorts; moreover, the best constructions points in the plane or in 3D). Sketching Three-and-a-half degrees of separation. Facebook
of random projections for dimension- ideas have been applied here also. Research, 2016; https://research.fb.com/three-and-a-
half-degrees-of-separation/.
ality reduction look a lot like Count- For graphs, there are techniques 3. Bloom, B. Space/time trade-offs in hash coding with
Min sketches with some twists (such as to summarize the adjacency informa- allowable errors. Commun. ACM 13, 7 (July 1970),
422–426.
randomly multiplying each column of tion of each node, so that connectivity 4. Broder, M., Mitzenmacher, A. Network applications
the matrix by either -1 or 1). This is the and spanning tree information can be of Bloom filters: a survey. Internet Mathematics 1, 4
(2005), 485–509.
basis of methods for speeding up high- extracted.1 These methods provide a 5. Clarkson, K.L., Woodruff, D.P. Low rank approximation
dimensional machine learning, such as surprising mathematical insight that and regression in input sparsity time. In Proceedings
of the ACM Symposium on Theory of Computing,
the Hash Kernels approach.14 much edge data can be compressed (2013), 81–90.
6. Cormode, G., Korn, F., Muthukrishnan, S., Johnson, T.,
Randomized numerical linear al- while preserving fundamental informa- Spatscheck, O., Srivastava, D. 2004. Holistic UDAFs
gebra. A grand objective for sketching tion about the graph structure. These at streaming speeds. In Proceedings of the ACM
SIGMOD International Conference on Management of
is to allow arbitrary complex mathe- techniques have not found significant Data, (2004), 35–46.
matical operations over large volumes use in practice yet, perhaps because of 7. Cormode, G., Muthukrishnan, S. An improved data
stream summary: the Count-Min sketch and its
of data to be answered approximately high overheads in the encoding size. applications. J. Algorithms 55, 1 (2005), 58–75.
and quickly via sketches. While this For geometric data, there has been 8. Flajolet, P., Martin, G.N. 1985. Probabilistic counting.
In Proceedings of the IEEE Conference on
objective appears quite a long way off, much interest in solving problems such Foundations of Computer Science, 1985, 76–82. Also
and perhaps infeasible because of some as clustering.9 The key idea here is that in J. Computer and System Sciences 31, 182–209.
9. Guha, S., Mishra, N., Motwani, R., O’Callaghan, L.
impossibility results, a number of core clustering part of the input can capture Clustering data streams. In Proceedings of the IEEE
mathematical operations can be solved a lot of the overall structural informa- Conference on Foundations of Computer Science, 2000.
10. Heule, S., Nunkesser, M., Hall, A. HyperLogLog in
using sketching ideas, which leads tion, and by merging clusters together practice: Algorithmic engineering of a state of the art
to the notion of randomized numeri- (clustering clusters) you can retain a cardinality estimation algorithm. In Proceedings of
the International Conference on Extending Database
cal linear algebra. A simple example is good picture of the overall point density Technology, 2013.
matrix multiplication: given two large distribution. 11. Jermaine, C. Sampling techniques for massive data.
Synopses for massive data: samples, histograms,
matrices A and B, you want to find their wavelets and sketches. Foundations and Trends in
Databases 4, 1–3 (2012). G. Cormode, M. Garofalakis,
product AB. An approach using sketch- Why Should You Care? P. Haas, and C. Jermaine, Eds. NOW Publishers.
ing is to build a dimensionality-reduc- The aim of this article has been to 12. Morris, R. Counting large numbers of events in small
registers. Commun. ACM 21, 10 (Oct. 1977), 840–842.
ing sketch of each row of A and each col- introduce a selection of recent tech- 13. Pike, R., Dorward, S., Griesemer, R., Quinlan, S.
umn of B. Combining each pair of these niques that provide approximate an- Interpreting the data: Parallel analysis with Sawzall.
Dynamic Grids and Worldwide Computing 13, 4 (2005),
provides an estimate for each entry of swers to some general questions that 277–298.
the product. Similar to other examples, often occur in data analysis and manip- 14. Weinberger, K.Q., Dasgupta, A., Langford, J., Smola,
A.J., Attenberg, J. Feature hashing for large-scale
small answers are not well preserved, ulation. In all cases, simple alternative multitask learning. In Proceedings of the International
but large entries are accurately found. approaches can provide exact answers, Conference on Machine Learning, 2009.
15. Whang, K.Y., Vander-Zanden, B.T., Taylor, H.M. A linear-
Other problems that have been tack- at the expense of keeping complete time probabilistic counting algorithm for database
led in this space include regression. information. The examples shown applications. ACM Trans. Database Systems 15, 2
(1990, 208.
Here the input is a high-dimensional here have illustrated, however, that in 16. Woodruff, D. Sketching as a tool for numerical linear
dataset modeled as matrix A and col- many cases the approximate approach algebra. Foundations and Trends in Theoretical
Computer Science 10, 1–2 (2014), 1–157.
umn vector b: each row of A is a data can be faster and more space efficient.
point, with the corresponding entry of The use of these methods is growing. Graham Cormode is a professor of computer science
b the value associated with the row. The Bloom filters are sometimes said to at the University of Warwick, U.K. Previously, he was a
researcher at Bell Labs and AT&T on algorithms for data
goal is to find regression coefficients x be one of the core technologies that management. He received the 2017 Adams Prize for his
that minimize ||Ax-b||2. An exact so- “big data experts” must know. At the work on data analysis.
lution to this problem is possible but very least, it is important to be aware Copyright held by owner/author.
costly in terms of time as a function of of sketching techniques to test claims Publication rights licensed to ACM. $15.00.
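The grouped-estimate scheme described in the article can be illustrated with a short, self-contained sketch. This is a simplified toy version of HyperLogLog for illustration only: real implementations add register encodings and small-range corrections, and the `alpha` value below is the standard large-s approximation, not tuned per implementation.

```python
import hashlib

def hll_estimate(items, s=64):
    """Toy HyperLogLog-style distinct-count estimate.

    Each item is hashed; the hash chooses one of s groups, and each
    group remembers the largest "rank" seen, where rank is the position
    of the lowest set bit in the rest of the hash. Per-group estimates
    are combined with a harmonic mean so that one unusually large
    group cannot skew the total.
    """
    maxima = [0] * s
    for item in items:
        h = int(hashlib.sha1(str(item).encode()).hexdigest(), 16)
        group = h % s          # the same item always lands in the same group
        rest = h // s
        rank = 1               # position of the lowest set bit, capped at 64
        while rest % 2 == 0 and rank < 64:
            rank += 1
            rest //= 2
        maxima[group] = max(maxima[group], rank)
    # harmonic-mean combination of the per-group estimates 2**rank
    alpha = 0.7213 / (1 + 1.079 / s)   # large-s bias-correction constant
    return alpha * s * s / sum(2.0 ** -m for m in maxima)
```

Because duplicates always hash to the same group and rank, repeating the same items leaves the estimate unchanged, and, as noted above, the standard error shrinks as 1/√s as the number of groups grows.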
SEPTEMBER 2017 | VOL. 60 | NO. 9 | COMMUNICATIONS OF THE ACM
practice

DOI:10.1145/3106631
Article development led by queue.acm.org

10 Ways to Be a Better Interviewer

Plan ahead to make the interview a successful one.

BY KATE MATSUDAIRA

IN MANY WAYS interviewing is an art. You have one hour (more if you count the cumulative interview time) to determine if the candidate has the desired skills and, more importantly, if you would enjoy working with this person. That is a lot of ground to cover.

As if finding out all that information isn't a daunting enough task, you also need to make sure the candidate has a positive experience while visiting your company (after all, people talk, and you want them to be saying good things—since this candidate may not be your next hire, but someone he or she meets may be).

As an interviewer, the key to your success is preparation. Planning will help ensure the success of the interview, both in terms of getting the information you need and giving the candidate a good impression. The following list is advice to consider prior to stepping into that room with two chairs and a whiteboard.

1. Review the Candidate's Résumé
Read every line of every résumé (and this goes for the really long ones that go on for four pages). Where have these candidates worked? How long did they stay in a role, and did their positions change? These questions make for interesting conversation topics. Hopefully there will be something in a candidate's background that piques your interest and can be great fodder for starting the interview with some common ground. This can put candidates at ease, giving them their greatest chance of success.

2. Previous Interviews
Most software companies have a longer interview process that can start with phone screens or homework problems and evolve from there. If the candidate has done homework problems, or your teammates have taken the time to type up feedback, do your due diligence and read it. These can also be a great source of material for questions; more importantly, it is unprofessional to ask the same questions that have already been posed to the candidate. This is partly because you will not learn as much from repeated questions, but also because the candidate will be bored or unimpressed going over the same ground. Great candidates want to be challenged, and an interview team where people are asking the same questions makes the candidate think the team is disorganized or unimaginative.

3. Use Calibrated Questions
Interviews are not the time to try something new. Take the time to do new problems on your own or test them on your peers. Come to the interview with questions that you were given in your interview (since you certainly will know how well you did) or that you have already given to others. Testing new material can really hurt a candidate's chances for success or, worse, give him or her a bad impression of the company when you …

If you do have a new question you want to give a dry run, have someone ask you to answer it. Where do you get hung up? How long does it take you? If the problem is too familiar …

5. Create a Timeline for the Interview
You should walk into every interview with a schedule: what questions you …

• Conversation about the candidate's background (or common interest): 5–10 minutes
• Problem-solving question that involves coding of some sort: 10–20 minutes
contributed articles

DOI:10.1145/3122814

Moving Beyond the Turing Test with the Allen AI Science Challenge

…probe the state of the art while sowing the seeds for possible future breakthroughs.

Challenge problems have historically played an important role in motivating and driving progress in research. For a field striving to endow machines with intelligent behavior (such as language understanding and reasoning), challenge problems that test such skills are essential.

In 1950, Alan Turing proposed the now well-known Turing Test as a possible test of machine intelligence: If a system can exhibit conversational behavior that is indistinguishable from that of a human during a conversation, that system could be considered intelligent. …

…as a challenge task for several reasons. First, in its details, it is not well defined (such as, Who is the person giving the test?). A computer scientist would likely know good distinguishing questions to ask, while a random member of the general public may not. What constraints are there on the interaction? What guidelines are provided to the judges? Second, recent Turing Test competitions have shown that, in certain formulations, the test itself is gameable; that is, people can be fooled by systems that simply retrieve sentences and make no claim of being intelligent.2,3 John Markoff of The New York Times wrote that the … Finally, the test as originally conceived is pass/fail rather than scored, thus providing no measure of progress toward a goal, something essential for any challenge problem.a,b

a Turing himself did not conceive of the Turing Test as a challenge problem to drive the field forward but rather as a thought experiment to explore a useful alternative to the question Can machines think?
b Although one can imagine metrics that quantify performance on the Turing Test, the imprecision of the task definition and human variability make it difficult to define metrics that are reliably reproducible.

Machine intelligence today is viewed less as a binary pass/fail attribute and
more as a diverse collection of capabilities associated with intelligent behavior. Rather than a single test, cognitive scientist Gary Marcus of New York University and others have proposed the notion of a series of tests—a Turing Olympics of sorts—that could assess the full gamut of AI, from robotics to natural language processing.9,12

Our goal with the Allen AI Science Challenge was to operationalize one such test—answering science-exam questions. Clearly, the Science Challenge is not a full test of machine intelligence but does explore several capabilities strongly associated with intelligence—capabilities our machines need if they are to reliably perform the smart activities we desire of them in the future, including language understanding, reasoning, and use of commonsense knowledge. Doing well on the challenge appears to require significant advances in AI technology, making it a potentially powerful way to advance the field. Moreover, from a practical point of view, exams are accessible, measurable, understandable, and compelling.

One of the most interesting and appealing aspects of science exams is their graduated and multifaceted nature; different questions explore different types of knowledge and vary substantially in difficulty, especially for a computer. There are questions that are easily addressed with a simple fact lookup, like this:

How many chromosomes does the human body cell contain?
(A) 23
(B) 32
(C) 46
(D) 64

Then there are questions requiring extensive understanding of the world, like this:

City administrators can encourage energy conservation by
(A) lowering parking fees
(B) building larger parking lots
(C) decreasing the cost of gasoline
(D) lowering the cost of bus and subway fares

This question requires the knowledge that certain activities and incentives result in human behaviors that in turn result in more or less energy being consumed. Understanding the question also requires the system to recognize that "energy" in this context refers to resource consumption for the purposes of transportation, as opposed to other forms of energy one might find in a science exam (such as electrical and kinetic/potential).

AI vs. Eighth Grade
To put this approach to the test, AI2 designed and hosted the Allen AI Science Challenge, a four-month-long competition in partnership with Kaggle (https://www.kaggle.com/) that began in October 2015 and concluded in February 2016.7 Researchers worldwide were invited to build AI software that could answer standard eighth-grade multiple-choice science questions. The competition aimed to assess the state of the art in AI systems utilizing natural language understanding and knowledge-based reasoning; how accurately the participants' models could answer the exam questions would serve as an indicator of how far the field has come in these areas.

Participants. A total of 780 teams participated during the model-building phase, with 170 of them eventually submitting a final model. Participants were required to make the code for their models available to AI2 at the close of the competition to validate model performance and confirm they followed contest rules. At the conclusion of the competition, the winners were also expected to make their code open source. The three teams achieving the highest scores on the challenge's test set received prizes of $50,000, $20,000, and $10,000, respectively.

Data. AI2 licensed a total of 5,083 eighth-grade multiple-choice science questions from providing partners for the purposes of the competition. All questions were in standard multiple-choice format, with four answer options, as in the earlier examples. From this collection, we provided participants with a set of 2,500 training questions to train their models. We used a validation set of 8,132 questions during the course of the competition for confirming model performance. Only 800 of the validation questions were legitimate; we artificially generated the rest to disguise the real questions in order to prevent cheating via manual question answering or unfair advantage from additional training examples. A week before the end of the competition, we provided the final test set of 21,298 questions (including the validation set) to participants to use to produce a final score for their models, of which 2,583 were legitimate. We licensed the data for the competition from private assessment-content providers that did not wish to allow the use of their data beyond the constraints of the competition, though AI2 made some subsets of the questions available on its website: http://allenai.org/data.

Baselines and scores. As these questions were all four-way multiple choice, a standard baseline score using random guessing was 25%. AI2 also generated a baseline score using a Lucene search over the Wikipedia corpus, producing scores of 40.2% on the training set and 40.7% on the final test set. The final results of the competition were quite close, with the top three teams achieving scores with a spread of only 1.05%. The highest score was 59.31%.

First Place
Top prize went to Chaim Linhart of Hod HaSharon, Israel (username Cardal on the Kaggle data science website, https://www.kaggle.com). His model achieved a final score of 59.31% correct on the test question set of 2,583 questions using a combination of 15 gradient-boosting models, each with a different subset of features. Unlike the other winners' models, Linhart's model predicted the correctness of each answer option individually. Linhart used two general categories of features to make these predictions. The first consisted of information-retrieval-based features, applied by searching over corpora he compiled from various sources (such as study-guide or quiz-building websites, open source textbooks, and Wikipedia); his searches used various weightings and stemmed words to optimize performance. The other flavor of features used in his ensemble of 15 models was based on properties of the questions themselves (such as length of question and answer, form of answer like numeric answer options, answers containing referential clauses like "none of the above" as an option, and relationships among answer options).
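Linhart's per-option design—scoring each answer choice independently, mixing retrieval evidence with question-form features—can be sketched roughly as follows. This is an illustrative reconstruction, not his code: the feature set and weights are invented for the example, and the word-overlap `ir_score` is a crude stand-in for a real Lucene/Elasticsearch retrieval score.

```python
def option_features(question, option, corpus):
    """Features for one (question, option) pair.

    The "ir" feature fakes an information-retrieval score with simple
    word overlap between question+option and a reference corpus; the
    rest are question-form features of the kind described above.
    """
    q_words = set(question.lower().split())
    o_words = set(option.lower().split())
    doc_words = set(corpus.lower().split())
    ir_score = len((q_words | o_words) & doc_words) / (len(q_words | o_words) or 1)
    return {
        "ir": ir_score,
        "opt_len": len(o_words),                       # length of the answer
        "is_numeric": float(option.strip().isdigit()), # numeric answer option
        "none_of_above": float("none of the above" in option.lower()),
    }

def score_option(features, weights):
    """Weighted combination standing in for one learned model."""
    return sum(weights[k] * v for k, v in features.items())

def answer(question, options, corpus, weights):
    """Score each option independently and pick the highest."""
    scores = {o: score_option(option_features(question, o, corpus), weights)
              for o in options}
    return max(scores, key=scores.get)
```

On the chromosome example above, with a one-sentence reference corpus containing the fact, the overlap feature alone is enough to rank option "46" first; the real system's searches over large corpora and 15-model ensemble play the same role at scale.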
…several small models required that the learning algorithm use features it would otherwise ignore—an advantage, given the relatively limited training data available in the competition.

The information-retrieval-based features alone could achieve scores as high as 55%, by Linhart's estimation. His question-form features filled in some remaining gaps to bring the system up to approximately 60% correct. He combined his 15 models using a simple weighted average to yield the final score for each choice. He credited careful corpus selection as one of the primary elements driving the success of his model.

Second Place
The second-place team, with a score of 58.34%, was from Talkwalker (https://www.talkwalker.com), a social-media-analytics company based in Luxembourg, led by Benedikt Wilbertz (Kaggle username poweredByTalkwalker).

The Talkwalker team built a relatively large corpus compared to the other winning models, using 180GB of disk space after indexing with Lucene. Feature types included information-retrieval-based features; vector-based features (scoring question-answer similarity by comparing vectors from word2vec, a two-layer neural net that processes text, and GloVe, an unsupervised learning algorithm for obtaining vector representations of words); pointwise mutual information features (measured between the question and target answer, calculated on the team's large corpus); and string-hashing features, in which term-definition pairs were hashed and a supervised learner was then trained to classify pairs as correct or incorrect. A final model used them to learn pairwise ranking between the answer options using the XGBoost library, an implementation of gradient-boosted decision trees.

Wilbertz's use of string-hashing features was unique, not tried by either of the other two winners nor currently used in AI2's Project Aristo. His team used a corpus of terms and definitions …

Third Place
The third-place winner was Alejandro Mosquera from Reading, U.K. (Kaggle username Alejandro Mosquera), with a score of 58.26%. Mosquera approached the challenge as a three-way classification problem over pairs of answer options. He transformed answer choices A, B, C, and D into all 12 possible ordered pairs (A,B), (A,C), ..., (D,C), which he labeled with three classes: the left element of the pair is correct; the right is correct; or neither is correct. He then classified the pairs using logistic regression. This three-way classification is easier for supervised learning algorithms than the more natural two-way (correct vs. incorrect) classification with four choices, because the two-way classification requires an absolute decision about a choice, whereas the three-way classification requires only a relative ranking of the choices. Mosquera made use of three types of features: information-retrieval-based features based on scores from Elasticsearch using Lucene over a corpus; vector-based features that measured question-answer similarity by comparing vectors from word2vec; and question-form features that considered such aspects of the data as the structure of a question, the length of a question, and the answer choices. Mosquera also noted that careful corpus selection was crucial to his model's success.

Lessons
In the end, each of the winning models gained from information-retrieval-based methods, indicative of the state of AI technology in this area of research. AI researchers intent on creating a machine with human-like intelligence are unable to ace an eighth-grade science exam because they do not currently have AI systems able to go beyond surface text to a deeper understanding of the meaning underlying each question, then use reasoning to find the appropriate answer. All three winners said it was clear that applying a deeper, semantic level of reasoning with scientific knowledge to the questions and answers would be the
key to achieving scores of 80% and high- reasoning required to successfully an- 4. Berant, J., Chou, A., Frostig, R., and Liang, P. Semantic
parsing on Freebase from question-answer pairs. In
er and demonstrating what might be swer these example questions. Ques- Proceedings of the 2013 Conference on Empirical
considered true artificial intelligence. tion-answering systems developed for Methods in Natural Language Processing (Seattle, WA,
Oct. 18–21). Association for Computational Linguistics,
A few other example questions each the message-understanding conferenc- Stroudsburg, PA, 2013, 6.
of the top three models got wrong high- es6 and text-retrieval conferences13 have 5. Fader, A., Zettlemoyer, L., and Etzioni, O. Open question
answering over curated and extracted knowledge
light the more interesting, complex nu- historically focused on retrieving an- bases. In Proceedings of the 20th ACM SIGKDD
ances of language and chains of reason- swers from text, the former from news- International Conference on Knowledge Discovery and
Data Mining (New York, Aug. 24–27). ACM Press, New
ing an AI system must be able to handle wire articles, the latter from various York, 2014.
in order to answer the following ques- large corpora (such as the Web, micro- 6. Grishman, R. and Sundheim, B. Message understanding
Conference-6: A brief history. In Proceedings of the 16th
tions correctly and for which informa- blogs, and clinical data). More recent Conference on Computational Linguistics (Copenhagen,
Denmark, Aug. 5–9). Association for Computational
tion-retrieval methods are not sufficient: work has focused on answer retrieval Linguistics, Stroudsburg, PA, 1996, 466–471.
from structured data (such as “In which 7. Kaggle. The Allen AI Science Challenge; https://www.
kaggle.com/c/the-allen-ai-science-challenge
What do earthquakes tell scientists city was Bill Clinton born?” from Free- 8. Katz, B., Borchardt, G., and Felshin, S. Natural language
about the history of the planet? Base, a large publicly available collab- annotations for question answering. In Proceedings
of the 19th International Florida Artificial Intelligence
(A) Earth’s climate is constantly orative knowledgebase).4,5,15 However, Research Society Conference (Melbourne Beach, FL,
changing. these systems rely on the information May 11–13). AAAI Press, Menlo Park, CA, 2006.
9. Marcus, G., Rossi, F., and Veloso, M., Eds. Beyond the
(B) The continents of Earth are con- being stated explicitly in the underly- Turing Test. AI Magazine (Special Edition) 37, 1 (Spring
tinually moving. ing data and are unable to perform the 2016).
10. Simmons, J. True Knowledge: The natural language
(C) Dinosaurs became extinct about reasoning steps that would be required question answering Wikipedia for facts. Semantic Focus
65 million years ago. to conclude this information from indi- (Feb. 26, 2008); http://www.semanticfocus.com/blog/
entry/title/true-knowledge-the-natural-language-
(D) The oceans are much deeper to- rect supporting evidence. question-answering-wikipedia-for-facts/
day than millions of years ago. A few systems attempt some form 11. Turing, A.M. Computing machinery and intelligence.
Mind 59, 236 (Oct. 1950), 433–460.
of reasoning: Wolfram Alpha14 answers 12. Turk, V. The plan to replace the Turing Test with a
This involves the causes behind mathematical questions, providing they ‘Turing Olympics.’ Motherboard (Jan. 28, 2015); https://
motherboard.vice.com/en_us/article/the-plan-to-
earthquakes and the larger geographic are stated either as equations or with replace-the-turing-test-with-a-turing-olympics
13. Voorhees, E. and Ellis, A., Eds. In Proceedings of the
phenomena of plate tectonics and is not relatively simple English; Evi10 is able to 24th Text REtrieval Conference (Gaithersburg, MD, Nov.
easily solved by looking up a single fact. combine facts to answer simple ques- 17–20). Publication SP 500-319, National Institute of
Standards and Technology, Gaithersburg, MD, 2015.
Additionally, other true facts appear in tions (such as “Who is older: Barack or 14. Wolfram, S. Making the world’s data computable.
the answer options (“Dinosaurs became Michelle Obama?”); and START,8 which Stephen Wolfram Blog (Sept. 24, 2010); http://blog.
stephenwolfram.com/2010/09/making-the-worlds-
extinct about 65 million years ago.”) but likewise is able to answer simple infer- data-computable/
must be intentionally identified and ence questions (such as “What South 15. Yao, X. and Van Durme, B. Information extraction over
structured data: Question answering with Freebase.
discounted as incorrect in the context American country has the largest popu- In Proceedings of the 52nd Annual Meeting of the
of the question. lation?”) using Web-based databases. Association for Computational Linguistics (Baltimore,
MD, June 22–27). Association for Computational
However, none of them attempts the Linguistics, Stroudsburg, PA, 2014, 956–966.
Which statement correctly describes a relationship between the distance from Earth and a characteristic of a star?
(A) As the distance from Earth to the star decreases, its size increases.
(B) As the distance from Earth to the star increases, its size decreases.
(C) As the distance from Earth to the star decreases, its apparent brightness increases.
(D) As the distance from Earth to the star increases, its apparent brightness increases.

This requires general commonsense-type knowledge of the physics of distance and perception, as well as the semantic ability to relate one statement to another within each answer option to find the right directional relationship.

Other Attempts
While numerous question-answering systems have emerged from the AI community, none has addressed the challenges of scientific and commonsense reasoning at the level of complex question processing and reasoning that is indeed required to successfully answer many of the science questions in the Allen AI Challenge.

Looking Forward
As the 2015 Allen AI Science Challenge demonstrated, achieving a high score on a science exam requires a system that can do more than sophisticated information retrieval. Project Aristo at AI2 is focused on the problem of successfully demonstrating artificial intelligence using standardized science exams, developing an assortment of approaches to address the challenge. AI2 plans to release additional datasets and software for the wider AI research community in this effort.1

References
1. Allen Institute for Artificial Intelligence. Datasets; http://allenai.org/data
2. Aron, J. Software tricks people into thinking it is human. New Scientist 2829 (Sept. 6, 2011).
3. BBC News. Computer AI passes Turing Test in 'world first.' BBC News (June 9, 2014); http://www.bbc.com/news/technology-27762088

Carissa Schoenick (carissas@allenai.org) is the senior program manager for Project Aristo at the Allen Institute for Artificial Intelligence in Seattle, WA.

Peter Clark (peterc@allenai.org) is the senior research manager for Project Aristo at the Allen Institute for Artificial Intelligence in Seattle, WA.

Oyvind Tafjord (oyvindt@allenai.org) is a senior research scientist and engineer at the Allen Institute for Artificial Intelligence in Seattle, WA.

Peter Turney (peter.turney@gmail.com) was a senior research scientist for Project Aristo at the Allen Institute for Artificial Intelligence in Seattle, WA, and is now retired.

Oren Etzioni (orene@allenai.org) is the Chief Executive Officer of the Allen Institute for Artificial Intelligence in Seattle, WA, and a professor in the Allen School for Computer Science at the University of Washington in Seattle, WA.

Copyright held by the authors. Publication rights licensed to ACM. $15.00

Watch the authors discuss their work in this exclusive Communications video: https://cacm.acm.org/videos/moving-beyond-the-turing-test
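The limits of purely lexical approaches are easy to see on the star question. The sketch below is a deliberately naive word-overlap "retrieval" scorer (not AI2's system, and not any entrant's method): because bag-of-words overlap ignores which quantity increases and which decreases, the correct option (C) outscores the directionally wrong option (D) by only a single shared word, and the size-based distractors tie.

```python
# Naive word-overlap scorer for a multiple-choice science question.
# Illustrative only: the support text and scoring are invented for this sketch.

def overlap_score(support: str, option: str) -> int:
    """Count distinct option words that also appear in the support text."""
    support_words = set(support.lower().split())
    return len(set(option.lower().split()) & support_words)

support = ("as the distance to a star increases its apparent brightness "
           "decreases and as the distance decreases its apparent brightness "
           "increases")

options = {
    "A": "as the distance from earth to the star decreases its size increases",
    "B": "as the distance from earth to the star increases its size decreases",
    "C": "as the distance from earth to the star decreases "
         "its apparent brightness increases",
    "D": "as the distance from earth to the star increases "
         "its apparent brightness increases",
}

scores = {key: overlap_score(support, text) for key, text in options.items()}
print(scores)  # → {'A': 8, 'B': 8, 'C': 10, 'D': 9}
```

The one-word margin between C and D shows why answering such questions reliably demands the directional, semantic reasoning described above rather than lexical matching alone.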
Trust and Distrust in Online Fact-Checking Services

key insights
• Though fact-checking services play an important role countering online disinformation, little is known about whether users actually trust or distrust them.
• The data we collected from social media discussions—on Facebook, Twitter, blogs, forums, and discussion threads in online newspapers—reflects users' opinions about fact-checking services.
• To strengthen trust, fact-checking services should strive to increase transparency in their processes, as well as in their organizations and funding sources.

WHILE THE INTERNET has the potential to give people ready access to relevant and factual information, social media sites like Facebook and Twitter have made filtering and assessing online content increasingly difficult due to its rapid flow and enormous volume. In fact, 49% of social media users in the U.S. in 2012 received false breaking news through social media.8 Likewise, a survey by Silverman11 suggested in 2015 that false rumors and misinformation disseminated further and faster than ever before due to social media. Political analysts continue to discuss misinformation and fake news in social media and its effect on the 2016 U.S. presidential election.

Such misinformation challenges the credibility of the Internet as a venue for authentic public information and debate. In response, over the past five years, a proliferation of outlets has provided fact checking and debunking of online content. Fact-checking services, say Kriplean et al.,6 provide "… evaluation of verifiable claims made in public statements through investigation of primary and secondary sources." An international …

SEPTEMBER 2017 | VOL. 60 | NO. 9 | COMMUNICATIONS OF THE ACM 65
contributed articles

Figure 1. Categorization of fact-checking services based on areas of concern. (The three areas of concern are: online rumors and hoaxes; political and public claims; and specific topics or controversies.)

Figure 2. Example of Snopes debunking a social media rumor on Twitter (March 6, 2016); https://twitter.com/snopes/status/706545708233396225

Figure 3. Outline of our research approach; posts collected October 2014 to March 2015.
…as https://wordpress.com/), discussion forums (such as https://offtopic.com/), and online newspapers (such as https://www.washingtonpost.com/) requested by Meltwater customers, thus representing a large, though convenient, sample. It collects various amounts of data from each platform; for example, it crawls all posts on Twitter but only the Facebook pages with 3,500 likes or groups of more than 500 members. This limitation in Facebook data partly explains why the overall number of posts we collected—1,741—was not more than it was.

To collect opinions about social media user perceptions of Snopes and FactCheck.org, we applied the search term "[service name] is," as in "Snopes is," "FactCheck.org is," and "FactCheck is." We intended it to reflect how people start a sentence when formulating their opinions. StopFake is a relatively less-known service. We thus selected a broader search string—"StopFake"—to be able to collect enough relevant opinions. The searches returned a data corpus of 1,741 posts over six months—October 2014 to March 2015—as in Figure 3. By "posts," we mean written contributions by individual users. To create a sufficient dataset for analysis, we removed all duplicates, including a small number of non-relevant posts lacking personal opinions about fact checkers. This filtering process resulted in a dataset of 595 posts.

We then performed content analysis, coding all posts to identify and investigate patterns within the data1 and reveal the perceptions users express in social media about the three fact-checking services we investigated. We analyzed their perceptions of the usefulness of fact-checking services through a usefulness construct similar to the one used by Tsakonas et al.14 "Usefulness" concerns the extent the service is perceived as beneficial when doing a specific fact-checking task, often illustrated by positive recommendations and characterizations (such as the service is "good" or "great"). Following Mayer et al.'s theoretical framework,7 we categorized trustworthiness according to the perceived ability, benevolence, and integrity of the services. "Ability" concerns the extent a service is perceived as having available the needed skills and expertise, as well as being reputable and well regarded. "Benevolence" refers to the extent a service is perceived as intending to do good, beyond what would be expected from an egocentric motive. "Integrity" targets the extent a service is generally viewed as adhering to an acceptable set of principles, in particular being independent, unbiased, and fair.

Since we found posts typically reflect rather polarized perceptions of the studied services, we also grouped the codes manually according to sentiment, positive or negative. Some posts described the services in a plain and objective manner. We thus coded them using a positive sentiment (see Table 1) because they refer to the service as a source for fact checking, and users are likely to reference fact-checking sites because they see them as useful.

For reliability, both researchers in the study did the coding. One coded all the posts, and the second then went through all the assigned codes, a process repeated twice. Finally, both researchers went through all comments for which an alternative code had been suggested to decide on the final coding, a process that recommended an alternative coding for 153 posts (or 26%).

A post could include more than one of the analytical themes, so 30% of the posts were thus coded as addressing two or more themes.

Results
Despite the potential benefits of fact-checking services, Figure 4 reports that the majority of the posts on the two U.S.-based services expressed negative sentiment, with Snopes at 68% and FactCheck.org at 58%. Most posts on the Ukraine-based StopFake (78%) reflected positive sentiment.

Figure 4. Positive and negative posts related to trustworthiness and usefulness per fact-checking service (in %); "other" refers to posts not relevant for the research categories (N = 595 posts). (Bar chart comparing Snopes (n = 385), FactCheck.org (n = 80), and StopFake (n = 130) across positive and negative totals, the positive and negative usefulness, ability, benevolence, and integrity themes, and "other," on a 0%–100% scale.)

The stated reasons for negative sentiment typically concerned one or more of the trustworthiness themes rather than usefulness. For example, for Snopes and FactCheck.org, the negative posts often expressed concern over lack of integrity due to perceived bias toward the political left. Negative sentiment pertaining to the ability and benevolence of the services was also common. The few critical comments on usefulness were typically aimed at discrediting a service, by, say, characterizing it as "satirical" or as "a joke."

Positive posts were more often related to usefulness. For example, the stated reasons for positive sentiment toward StopFake typically concerned the service's usefulness in countering pro-Russian propaganda and trolling and in the information war associated with the ongoing Ukraine conflict.

In line with a general notion of an increasing need to interpret and act on information and misinformation in social media,6,11 some users included in the study discussed fact-checking sites as important elements of an information war.

Snopes. The examples in Table 2 reflect how negative sentiment in the posts we analyzed on Snopes was rooted in issues pertaining to trustworthiness. Integrity issues typically involved a perceived "left-leaning" political bias in the people behind the service. Pertaining to benevolence, users in the study said Snopes is part of a larger left-leaning or "liberal" conspiracy often claimed to be funded by George Soros, whereas comments on ability typically targeted lack of expertise in the people running the service. Some negative comments on trustworthiness may be seen as a rhetorical means of discrediting a service. Posts expressing positive sentiment mainly argue for the usefulness of the service, claiming that Snopes is, say, a useful resource for checking up on the veracity of Internet rumors.

Table 2. Snopes and themes we analyzed (n = 385).
Usefulness / Positive (21%): "Snopes is a wonderful Website for verifying things seen online; it is at least a starting point for research."
Usefulness / Negative (10%): "Snopes is a joke. Look at its Boston bombing debunking failing to debunk the worst hoax ever ..."
Ability / Positive (6%): "[…] Snopes is a respectable source for debunking wives' tales, urban legends, even medical myths ..."
Ability / Negative (24%): "Heh ... Snopes is a man and a woman with no investigative background or credentials who form their opinions solely on Internet research; they don't interview anyone. […]"
Benevolence / Positive (0%): No posts
Benevolence / Negative (21%): "You show your Ignorance by using Snopes … Snopes is a NWO Disinformation System designed to fool the Masses ... SORRY. I Believe NOTHING from Snopes. Snopes is a Disinformation vehicle of the Elitist NWO Globalists. Believe NOTHING from them ... […]"
Integrity / Positive (2%): "Snopes is a standard, rather dull fact-checking site, nailing right and left equally. […]"
Integrity / Negative (44%): "Snopes is a leftist outlet supported with money from George Soros. Whatever Snopes says I take with a grain of salt ..."

Table 3. FactCheck.org and themes we analyzed (n = 80).
Usefulness / Positive (25%): "[…] You obviously haven't listened to what they say. Also, I hate liars. FactCheck is a great tool."
Usefulness / Negative (3%): "Anyway, 'FactCheck' is a joke […]"
Ability / Positive (6%): "The media sources I use must pass a high credibility bar. FactCheck.org is just one of the resources I use to validate what I read ..."
Ability / Negative (16%): "[…] FactCheck is NOT a confidence builder; see its rider and sources, Huffpo articles … REALLY?"
Benevolence / Positive (0%): No posts
Benevolence / Negative (25%): "FactCheck studies the factual correctness of what major players in U.S. politics say in TV commercials, debates, talks, interviews, and news presentations, then tries to present the best possible fictional and propaganda-like version for its target […]"
Integrity / Positive (19%): "When you don't like the message, blame the messenger. FactCheck is nonpartisan. It's just that conservatives either lie or are mistaken more ..."
Integrity / Negative (39%): "FactCheck is left-leaning opinion. It doesn't check facts ..."

Table 4. StopFake and themes we analyzed (n = 130); note * also coded as integrity/positive.
Usefulness / Positive (72%): "Don't forget a strategic weapon of the Kremlin is the 'web of lies' spread by its propaganda machine; see antidote http://www.stopfake.org/en/news"
Usefulness / Negative (2%): "[…] StopFake! HaHaHa. You won, I give up. Next time I will quote 'Saturday Night Live'; there is more truth:)) ..."
Ability / Positive (2%): "[…] by the way, the website StopFake.org is a very objective and accurate source exposing Russian propaganda and disinformation techniques. […]*"
Ability / Negative (2%): "[…] Ha Ha … a flow of lies is constantly sent out from the Kremlin. Really. If so, StopFake needs updates every hour, but the best way it can do that is to find low-grade blog content and make it appear as if it was produced by Russian media […]"
Benevolence / Positive (4%): "[…] StopFake is devoted to exposing Russian propaganda against the Ukraine. […]"
Benevolence / Negative (14%): "So now you acknowledge StopFake is part of Kiev's propaganda. I guess that answers my question […]"
Integrity / Positive (2%): "[…] by the way, the website StopFake.org is a very objective and accurate source exposing Russian propaganda and disinformation techniques. […]"
Integrity / Negative (11%): "[…] Why should I give any credence to StopFake.org? Does it ever criticize the Kiev regime, in favor of the Donbass position? […]"

FactCheck.org. The patterns in the posts we analyzed for FactCheck.org resemble those for Snopes. As in Table 3, the most frequently mentioned trustworthiness concerns related to service integrity; as for Snopes, users said the service is politically biased toward the left. Posts concerning benevolence and ability were also relatively frequent, reflecting user
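The collection-and-coding pipeline described in the study (search on "[service name] is," remove duplicates and posts lacking personal opinions, then code each remaining post by theme and sentiment) can be sketched in code. Everything below is illustrative: the helper names and sample posts are invented, and the study's actual coding was done manually by two researchers, not by a program.

```python
# Sketch of the post-filtering and tallying steps (illustrative only;
# the study's theme/sentiment coding was manual, not automated).

THEMES = ("usefulness", "ability", "benevolence", "integrity")

def matches_search(post: str, service: str) -> bool:
    """Mimic the '[service name] is' search string."""
    return f"{service.lower()} is" in post.lower()

def dedupe(posts):
    """Drop exact duplicates while preserving order."""
    seen, unique = set(), []
    for post in posts:
        if post not in seen:
            seen.add(post)
            unique.append(post)
    return unique

def tally(codes):
    """Count (theme, sentiment) pairs, as summarized in Tables 2-4."""
    counts = {}
    for theme, sentiment in codes:
        counts[(theme, sentiment)] = counts.get((theme, sentiment), 0) + 1
    return counts

raw = [
    "Snopes is a great place to start.",   # opinion, matches search string
    "Snopes is a great place to start.",   # exact duplicate -> removed
    "Snopes is a leftist outlet.",         # opinion, matches search string
    "I read about snopes yesterday.",      # no 'Snopes is' match -> excluded
]

corpus = dedupe(p for p in raw if matches_search(p, "Snopes"))
# Manual coding step, represented here as hand-assigned labels:
codes = [("usefulness", "positive"), ("integrity", "negative")]
print(len(corpus), tally(codes))
```

On the study's real corpus this kind of filtering reduced 1,741 collected posts to the 595 analyzed; here the toy sample of four posts reduces to two.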
concern regarding the service as a contributor to propaganda or doubts about its fact-checking practices.

StopFake. As in Table 4, the results for StopFake show more posts expressing positive sentiment than we found for Snopes and FactCheck.org. In particular, the posts included in the study pointed out that StopFake helps debunk rumors seen as Russian propaganda in the Ukraine conflict.

Nevertheless, the general pattern in the reasons users gave us for positive and negative sentiment for Snopes and FactCheck.org also held for StopFake. The positive posts were typically motivated by usefulness, whereas the negative posts reflected the sentiment that StopFake is politically biased ("integrity"), a "fraud," a "hoax," or part of the machinery of Ukraine propaganda ("benevolence").

Discussion
We found users with positive perceptions typically extolled the usefulness of fact-checking services, whereas users with negative opinions cited concerns over trustworthiness. This pattern emerged across all three services. In the following sections, we discuss how these findings provide new insight into trustworthiness as a key challenge when countering online rumors and misinformation2,9 and why ill-founded beliefs may have such online reach, even though the beliefs are corrected by prominent fact checkers, including Snopes, FactCheck.org, and StopFake.

Usefulness. Users in our sample with a positive view of the services mainly pointed to their usefulness. While everyone should exercise caution when comparing the various services, topic-specific StopFake is perceived as more useful than Snopes and FactCheck.org. One reason might be that a service targeting a specific topic faces less criticism because it attracts a particular audience that seeks facts supporting its own view. For example, StopFake users target anti-Russian, pro-Ukrainian readers. Another, more general, reason might be that positive perceptions are motivated by user needs pertaining to a perceived high load of misinformation, as in the case of the Ukraine conflict, where media reports and social media are seen as overflowing with propaganda. Others highlighted the general ease with which information may be filtered or separated from misinformation through sites like Snopes and FactCheck.org, as expressed like this: "As you pointed out, it doesn't take that much effort to see if something on the Internet is legit, and Snopes is a great place to start. So why not take that few seconds of extra effort to do that, rather than creating and sharing misleading items."

This finding suggests there is increasing demand for fact-checking services,6 while at the same time a substantial proportion of social media users who would benefit from such services do not use them sufficiently. The services should thus be even more active on social media sites like Facebook and Twitter, as well as in online discussion forums, where greater access to fact checking is needed.

Trustworthiness. Negative perceptions and opinions about fact-checking services seem to be motivated by basic distrust rather than rational argument. For some users in our sample, lack of trust extends beyond a particular service to encompass the entire social and political system. Users with negative perceptions thus seem trapped in a perpetual state of informational disbelief.

While one's initial response to statements reflecting a state of informational disbelief may be to dismiss them as the uninformed paranoia of a minority of the public, the statements should instead be viewed as a source of user insight. The reason the services are often unsuccessful in reducing ill-founded perceptions9 and people tend to disregard fact checking that goes against their preexisting beliefs2,13 may be a lack of basic trust rather than a lack of fact-based arguments provided by the services.

We found such distrust is often highly emotional. In line with Silverman,11 fact-checking sites must be able to recognize how debunking and fact checking evoke emotion in their users. Hence, they may benefit from rethinking the way they design and present themselves to strengthen trust among users in a general state of informational disbelief. Moreover, online fact-checking sites should compensate for the lack of physical evidence online by being, say, demonstrably independent, impartial, and able to clearly distinguish fact from opinion. Rogerson10 wrote that fact-checking sites exhibit varying levels of rigor and effectiveness. The fact-checking process and even what are considered "facts" may in some cases involve subjective interpretation, especially when actors with partial ties aim to provide the service. For example, in the 2016 U.S. presidential campaign, the organization "Donald J. Trump for President" invited Trump's supporters to join a fact-check initiative, similar to the category "topics or controversies," urging "fact checking" the presidential debates on social media. However, the initiative was criticized as mainly promoting Trump's views and candidacy.5

Users of fact-checking sites ask: Who actually does the fact checking and how do they do it? What organizations are behind the process? And how does the nature of the organization …

Table 5. Challenges and our related recommendations for fact-checking services.
Usefulness. Challenge: Unrealized potential in public use of fact-checking services. Recommendation: Increase presence in social media and discussion forums.
Trustworthiness (ability). Challenge: Critique of expertise and reputation. Recommendation: Provide nuanced but simple overview of the fact-checking process where relevant sources are included.
Trustworthiness (benevolence). Challenge: Suspicion of conspiracy and propaganda. Recommendation: Establish open policy on fact checking and open spaces for collaboration on fact checking.
Trustworthiness (integrity). Challenge: Perception of bias and partiality. Recommendation: Ensure transparency on organization and funding, and demonstrable impartiality in fact-checking process.
…benevolent and unbiased just because they check facts. Rather, they must strive for transparency in their working process, as well as in their origins, organization, and funding sources.

To increase transparency in its processes, a service might try to take a more horizontal, collaborative approach than is typically seen in the current generation of services. Following Hermida's recommendation4 to social media journalists, fact checkers could be set up as a platform for collaborative verification and genuine fact checking, relying less on centralized expertise. Forming an interactive relationship with users might also help build trust.6,7

Conclusion
We identified a lack of perceived trustworthiness and a state of informational disbelief as potential obstacles to fact-checking services reaching the social media users most critical of such services. Table 5 summarizes our overall findings and discussions, outlining related key challenges and our recommendations for how to address them.

Given the exploratory nature of this study, we cannot conclude our findings are valid for all services. In addition, more research is needed to be able to make definite claims on systematic differences among the various fact checkers based on their "areas of concern." Nevertheless, the consistent pattern in opinions we found across three prominent services suggests challenges and recommendations that can provide useful guidance for future development in this important area.

Acknowledgments

References
2. […] psychological advantage of unfalsifiability: The appeal of untestable religious and political ideologies. Journal of Personality and Social Psychology 108, 3 (Nov. 2014), 515–529.
3. Gingras, R. Labeling fact-check articles in Google News. Journalism & News (Oct. 13, 2016); https://blog.google/topics/journalism-news/labeling-fact-check-articles-google-news/
4. Hermida, A. Tweets and truth: Journalism as a discipline of collaborative verification. Journalism Practice 6, 5-6 (Mar. 2012), 659–668.
5. Jamieson, A. 'Big League Truth Team' pushes Trump's talking points on social media. The Guardian (Oct. 10, 2016); https://www.theguardian.com/us-news/2016/oct/10/donald-trump-big-league-truth-team-social-media-debate
6. Kriplean, T., Bonnar, C., Borning, A., Kinney, B., and Gill, B. Integrating on-demand fact-checking with public dialogue. In Proceedings of the 17th ACM Conference on Computer-Supported Cooperative Work & Social Computing (Baltimore, MD, Feb. 15–19). ACM Press, New York, 2014, 1188–1199.
7. Mayer, R.C., Davis, J.H., and Schoorman, F.D. An integrative model of organizational trust. Academy of Management Review 20, 3 (1995), 709–734.
8. Morejon, R. How social media is replacing traditional journalism as a news source. Social Media Today Report (June 28, 2012); http://www.socialmediatoday.com/content/how-social-media-replacing-traditional-journalism-news-source-infographic
9. Nyhan, B. and Reifler, J. When corrections fail: The persistence of political misperceptions. Political Behavior 32, 2 (June 2010), 303–330.
10. Rogerson, K.S. Fact checking the fact checkers: Verification Web sites, partisanship and sourcing. In Proceedings of the American Political Science Association (Chicago, IL, Aug. 29–Sept. 1). American Political Science Association, Washington, D.C., 2013.
11. Silverman, C. Lies, Damn Lies, and Viral Content. How News Websites Spread (and Debunk) Online Rumors, Unverified Claims, and Misinformation. Tow Center for Digital Journalism, Columbia Journalism School, New York, 2015; http://towcenter.org/wp-content/uploads/2015/02/LiesDamnLies_Silverman_TowCenter.pdf
12. Stencel, M. International fact checking gains ground, Duke census finds. Duke Reporters' Lab, Duke University, Durham, NC, Feb. 28, 2017; https://reporterslab.org/international-fact-checking-gains-ground/
13. Stroud, N.J. Media use and political predispositions: Revisiting the concept of selective exposure. Political Behavior 30, 3 (Sept. 2008), 341–366.
14. Tsakonas, G. and Papatheodorou, C. Exploring usefulness and usability in the evaluation of open-access digital libraries. Information Processing & Management 44, 3 (May 2008), 1234–1250.
15. Van Mol, C. Improving web survey efficiency: The impact of an extra reminder and reminder content on Web survey response. International Journal of Social Research Methodology 20, 4 (May 2017), 317–327.
16. Xu, C., Yu, Y., and Hoi, C.K. Hidden in-game intelligence in NBA players' tweets. Commun. ACM 58, 11 (Nov. 2015), 80–89.

Petter Bae Brandtzaeg (pbb@sintef.no) is a senior research scientist at SINTEF in Oslo, Norway.

Asbjørn Følstad (asf@sintef.no) is a senior research scientist at SINTEF in Oslo, Norway.
review articles

DOI:10.1145/3096742

Security in High-Performance Computing Environments

key insights
• High-performance computing systems have some similarities and some differences with traditional IT computing systems, which present both challenges and opportunities.
• One challenge is that HPC systems are "high-performance" by definition, and so many traditional security techniques are not effective because they cannot keep up with the system or reduce performance.
• Many opportunities also exist: HPC systems tend to be used for very distinctive purposes, have much more regular and predictable activity, and contain highly custom hardware/software stacks. Each of these elements can provide a toehold for leveraging some aspect of the HPC platform to improve security.

HOW IS COMPUTER security different in a high-performance computing (HPC) context from a typical IT context? On the surface, a tongue-in-cheek answer might be, "just the same, only faster." After all, HPC facilities are connected to networks the same way any other computer is, often run the same, typically Linux-based, operating systems as are many other common computers, and have long been subject to many of the same styles of attacks, be they compromised credentials, system misconfiguration, or software flaws. Such attacks have ranged from the "wily hacker" who broke into U.S. Department of Energy (DOE) and U.S. Department of Defense (DOD) computing systems in the mid-1980s,42 to the "Stakkato" attacks against NCAR, DOE, and NSF-funded supercomputing centers in …

…means that aside from all of the normal reasons that any network-connected computer might be attacked, HPC computers have their own distinct systems, resources, and assets that an attacker might target, as well as their …

…a mechanism designed to enforce a particular policy considered essential for security by one site might be considered a denial of service to legitimate users of another site, or how a smartphone is protected is distinct …

…does not get in the way of performance or usability. While laudable, this article argues that this assessment of HPC's distinctiveness is incomplete.

This article focuses on four key themes surrounding this issue:
review articles
The first theme is that HPC systems regular and predictable mode of opera- even in open science, data leakage is
are optimized for high performance tion, which changes the way security certainly an issue and a threat, this ar-
by definition. Further, they tend to be can be enforced. ticle focuses more on integrity related
used for very distinctive purposes, no- As a final aside, many, but by no threats,31,32 including alteration of code
tably mathematical computations. means all HPC systems are often ex- or data, or misuse of computing cycles,
The second theme is that HPC tremely open systems from a security and availability related threats, in-
systems tend to have very distinctive standpoint, and may be used by scien- cluding disruption or denial of service
modes of operation. For example, com- tists worldwide whose identities have against HPC systems or networks that
pute nodes in an HPC system may be never been validated. Increasingly, we connect them.
accessed exclusively through some are also starting to see HPC systems in Computations that are incorrect
kind of scheduling system on a login which computation and visualization for non-malicious reasons, including
node in which it is typical for a single are more tightly coupled and, a human flaws in application code, such as gen-
program or common set of programs manipulates the inputs to the computa- eral logic errors, round-off errors, non-
to run in sequence. And, even on that tion itself in near-real time. determinism in parallel algorithms,
login node, from which the computa- This distinctiveness presents both unit conversion errors,20 as well as in-
tion is submitted to the scheduler, it opportunities and challenges. This correct assumptions by users about the
may be the case that an extremely nar- article discusses the basis for these hardware they are running on, are vital
row range of programs exist compared themes and the conclusions for secu- issues, but beyond the scope of this ar-
to those commonly found on general- rity for these systems. ticle, due to length and the fact those is-
use computing systems. Scope and threat model. I have spent sues are well-covered elsewhere.4,5,6,8,36
The third theme is that while some most of my career in or near “open sci-
HPC systems use standard operating ence:” National Science Foundation High-Performance
systems, some use highly exotic stacks. and Department of Energy Office of Sci- Computing Environments
And even the ones that use standard op- ence-funded high-performance com- Distinctive purposes. The first theme
erating systems, very often have custom puting centers, and so the lens through of the distinctiveness of security for
aspects to their software stacks, particu- which this article is discussed tends to HPC systems is that these systems
larly at the I/O and network driver levels, focus on such environments. The chal- are high-performance by definition,
and also at the application layer. And, lenges in “closed” environments, such and are made that way for a reason.
of course, while the systems may use as those used by the National Security They are typically used for automated
commodity CPUs, the CPUs and other Agency (NSA), Department of Defense computation of some kind, typically
hardware system components are often (DoD), or National Nuclear Security performing some set of mathemati-
integrated in HPC systems in a way (for example, by Cray or IBM) that may well exist nowhere else in the world.

The fourth theme, which follows from the first three themes, is that HPC systems tend to have a much more […]

[…] Administration (NNSA) National Labs, or commercial industry, shares some, but not all, of the attributes discussed in this article. As a result, although I discuss confidentiality, a typical component of the "C-I-A" triad, […]

[…]cal operations. Historically, this has often been for the purpose of modeling and simulation, and increasingly today, for data analysis as well. Given that the primary purpose of HPC systems is high performance, and given that such systems themselves are both few in number, and therefore also that computing time on such systems is quite valuable, there is a reluctance by the major stakeholders—the funding agencies that support HPC systems as well as the users who run computations on them—to agree to any solution that might impose overhead on the system. Those stakeholders might well regard such a solution as a waste of cycles at worst, and an unacceptable delay of scientific results at best. This is an important detail, because it frames the types of security solutions that at least historically might have been considered acceptable to use.

Distinctive modes of operation. The second theme of the distinctiveness of security for HPC systems is that these systems tend to have distinctive modes of operation. The typical mode of operation for using a scientific high-performance […]

Figure 1. Three typical high-level workflow diagrams of scientific computing. The diagram at top shows a typical workflow for data analysis in HPC; the middle diagram shows a typical workflow for modeling and simulation; and the bottom diagram shows a coupled, interactive compute-visualization workflow.
- Data analysis: connect to login node → transfer data in via DTN → edit config files, compile, submit batch job → wait → transfer data out via DTNs.
- Simulation: connect to login node → edit config files, compile, submit batch job → wait → (maybe) transfer data out via DTNs.
- Simulation with coupled computation/visualization: connect to login node → edit config files, compile, submit batch job → wait → job starts → visualize output / adjust inputs.
SEPTEMBER 2017 | VOL. 60 | NO. 9 | COMMUNICATIONS OF THE ACM 75
review articles
support neither multi-tasking nor virtual memory27 (CNK has no relationship with CNL). The I/O system runs the GPFS file system client.

Aurora,c the system scheduled to be installed at ALCF in 2019, will be constructed by a partnership between Cray and Intel and will run third-generation Intel Xeon Phi processors with second-generation Intel Omni-Path photonic interconnects and a variety of flash memory and NVRAM components to accelerate I/O, including 3D XPoint and 3D NAND in multiple locations, all user accessible. Aurora will run Cray Linux10—a full Linux stack on its login nodes and I/O nodes (though the I/O nodes do not allow general user access)—and mOS46 on its compute nodes. mOS supports both a lightweight kernel (LWK) and a full Linux operating system, to enable users to choose between avoiding unexpected operating system overhead and the flexibility of a full Linux stack.

Summit,d the system scheduled to be installed at OLCF in 2018, will be based on both IBM POWER9 CPUs and NVIDIA Volta GPUs, with NVIDIA NVLink on-node networks and dual-rail Mellanox interconnects.

In short, there is certainly some variation in exactly what operating systems are run. In all cases, login nodes run "full" operating systems. In some cases, full operating systems are also used for compute nodes; in other cases, lighter-weight but Linux API-compatible versions of operating systems are used; and in still other cases, entirely custom operating systems are used that are single-user only and contain no virtual memory or multitasking capabilities.

At least for the full operating systems, it is reasonable to assume the operating systems contain similar or identical capabilities and bugs as standard desktop and server versions of Linux, and are just as vulnerable to attack via the various pieces of software (libraries, runtimes, and applications) running on the system.

Custom hardware and software components may have both positives and negatives. On one hand, they may receive less assurance than more common stacks. On the other hand, some custom stacks may be smaller, more easily verified, and less complex.

Openness. Our final theme is the relative "openness" of at least some HPC systems. That is, scientists from all over the world whose identities have never been validated may use them. For example, many such systems, such as those used by NSF or DOE ASCR, have no traditional firewalls between the data transfer nodes and the Internet, let alone the ability to "air gap" the HPC system (that is, ensure no physical connection to the regular Internet is possible) as some communities are able to do.

c http://aurora.alcf.anl.gov
d https://www.olcf.ornl.gov/summit/

Security Mechanisms and Solutions that Overcome the Constraints of HPC Environments
Traditional IT security solutions, including network and host-based intrusion detection, access controls, and software verification, work about as well in HPC as in traditional IT (often not very well), or worse, due to constraints in HPC environments.

For example, traditional host-based security mechanisms, such as those leveraging system call data via auditd, as well as certain types of network security mechanisms, like network firewalls and firewalls doing deep packet inspection, may be antithetical to the needs of the system being protected. For example, it has been shown that even 0.0046% packet loss (1 out of 22,000 packets) can cause a loss in throughput of network data transfers of approximately 90%.13 Given that stateful and/or deep-packet-inspecting firewalls can cause delays that might lead to such loss, a firewall, as traditionally defined, is inappropriate for use in environments with high network data throughput requirements.
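The scale of this effect is consistent with the standard steady-state TCP throughput model of Mathis et al., rate ≈ (MSS/RTT) · C/√p: achievable bandwidth falls with the square root of the loss rate p. A back-of-the-envelope sketch (the frame size, RTT, and loss rates below are illustrative assumptions, not measurements from the cited study):

```python
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Steady-state TCP throughput bound (Mathis model):
    rate <= (MSS / RTT) * C / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)

# A long-haul science transfer: 9000-byte jumbo frames, 80 ms RTT.
clean = mathis_throughput_bps(9000, 0.08, 1e-6)       # near-lossless path
lossy = mathis_throughput_bps(9000, 0.08, 1 / 22000)  # 0.0046% loss
print(f"{clean / 1e9:.2f} Gbps -> {lossy / 1e9:.2f} Gbps "
      f"({100 * (1 - lossy / clean):.0f}% reduction)")
```

Under these assumed parameters the model predicts roughly an 85% collapse, the same order as the measured ~90% drop cited above.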
Thus, alternative approaches must be applied. Some solutions exist that can help compensate for these constraints. The Science DMZ13 security framework defines a set of security policies, procedures, and mechanisms to address the distinct needs of scientific environments with high network throughput needs (HPC security theme #1). While the needs of high-throughput networks do not eliminate options for security monitoring or mitigation, those requirements do change what is possible.

In particular, in the Science DMZ framework, the scientific computing systems are moved to their own enclave, away from other types of computing systems that might have their own distinctive security needs and perhaps even distinct regulations—for example, financial, human resources, and other business computing systems. In addition, it directs transfers through a single network ingress and egress point that can be monitored and restricted.

However, the Science DMZ does not use "deep packet inspecting" or stateful firewalls. It does leverage packet-filtering firewalls, that is, firewalls that examine only attributes of packet headers and not packet payloads. And, separately, it also performs deep packet inspection and stateful intrusion detection, such as might be done with the Bro Network Security Monitor.28 However, the two processes are not directly coupled: unlike a firewall, the IDS is not used in-line with the network traffic, and as a result, delays are not imposed on transmission of the traffic due to inspection, and thus congestion that might lead to packet loss and retransmission is also not created.

Thus, by moving the traffic to its own enclave that can be centrally monitored at a single point, the framework seeks to maintain a level of security similar to that of traditional organizations, which typically have a single ingress/egress point, rather than simply removing network monitoring without replacing it with an alternative. However, the Science DMZ does so in a very specific way that accommodates the type and volume of network traffic used in scientific and high-performance computing environments. More specifically, it achieves throughput by reducing complexity, which is a theme that we will return to in this article.

The Science DMZ framework has been implemented widely in university and National Lab environments around the world as a result of funding from NSF, DOE ASCR, and other, international funding organizations, to support computing and networking infrastructure for open science. It goes without saying that both the Science DMZ framework and the Bro IDS must also continue to be adapted to more types of HPC environments, such as those requiring greater data confidentiality guarantees, including medical, defense, and intelligence environments. Steps have been made toward the medical context as well. The Medical Science DMZ29 applies the Science DMZ framework to computing environments requiring compliance with the HIPAA Security Rule. Key architectural aspects include the notion that all traffic from outside the compute/storage infrastructure passes through heavily monitored head nodes, that storage and compute nodes themselves are not connected directly to the Internet, and that traffic containing sensitive or controlled-access data is encrypted. However, further work in medical environments, as well as other environments, is required.

Leveraging the Distinctiveness of HPC as an Opportunity
The Science DMZ helps compensate for HPC's limitations—we need more such solutions. As indicated by the four themes enumerated in this article, we also need solutions that can leverage HPC distinctiveness as a strength.

Sommer and Paxson41 point out that anomaly-based detection typically is not used in traditional IT environments due to the high-level fact that "finding attacks is fundamentally different from … other applications" (such as credit card fraud detection, for example). Among other key issues, they note that network traffic is often much more diverse than one might expect. They point out that semantic understanding is a vital component of overcoming this limitation to enable machine-learning approaches to security to be more effective.

On the other hand, as mentioned earlier, HPC systems tend to be used for very distinctive purposes, notably mathematical computations (theme #1). The specific application of HPC systems varies by the organization that uses them (for example, DOE National Lab, DOD lab), but each individual system typically has a very specific use. This is a key point because the result may be that both specification-based and anomaly-based intrusion detection are more useful in HPC environments than in traditional IT environments. Specifically, given the hypothesis that patterns of behavior in HPC are likely more regular than in typical computing systems, one might expect to reduce the error rates of anomaly-based intrusion detection, and possibly even to make specifications possible to construct for specification-based intrusion detection. Thus, such security mechanisms might even fare better in HPC environments than in traditional IT environments (theme #4), though demonstrating the degree to which the increased regularity of HPC environments may be helpful for security analysis is an open research question.

Analyzing system behavior with machine learning. A second, and related, key point about HPC systems being used primarily for mathematical computation is that if we can do better analysis of system behavior, the insight that most HPC machines are used for computation focuses our attention on which security risks to care about (for example, users running "illicit computations," as defined by the owners of the HPC system) and might give us better ability to understand what type of computation is taking place.

An example of a successful approach to addressing this question involved research that I was involved with at Berkeley Lab between 2009 and 2013.14,30,47,48 In this project, we asked the questions: What are people running on HPC systems? Are they running what they usually run? Are they running what they requested cycle allocations to run, or mining Bitcoins? Are they running something illegal (for example, classified)? In that work, we developed a technique for answering these questions by fingerprinting communication on HPC systems.

Specifically, we collected Message Passing Interface (MPI) function calls via the Integrated Performance Monitoring (IPM)43 tool, which showed patterns of communication between cores in an HPC system, as shown in Figure 2. Using 1,681 logs for 29 scientific applications from NERSC HPC systems, we applied Bayesian machine learning techniques for classification of scientific computations, as well as a graph-theoretic approach using "approximate" graph matching techniques (subgraph isomorphism and edit distance). A hybrid machine learning and graph theory approach identified test HPC codes with 95%–99% accuracy.
Our work analyzing distributed-memory parallel computation patterns on HPC compute nodes is by no means conclusive that anomaly detection is an unqualified success for intrusion detection on HPC systems. For one thing, the experiments were not conducted in an adversarial environment, and so the difficulty of an attacker intentionally evading detection by attempting to make one program look like another was not explored. In addition, in our "fingerprinting HPC computation" project, we had what we deemed to be a reasonable, though not exhaustive, corpus of data representative of typical computations on NERSC facilities to examine. In addition, in examining the data, we focused on a specific set of activity defined within the NERSC Acceptable Use Policy as falling outside of "acceptable use." Other sites will have a different baseline of "typical computation," and are also likely to have somewhat different policies that define what is or is not "illicit use."

However, regardless, we do believe the approach is an example of the type of technique that could possibly have success in HPC environments, and possibly even greater success than in many non-HPC environments. For example, consider the possibility of a skilled attacker attempting to evade detection, something that any security mechanism relying on machine learning is vulnerable to. Not only do there appear to be more regular use patterns in HPC environments, but there also exist certain distinctive security policies in HPC environments that might help improve the usefulness of application-level use monitoring. There are at least two reasons for this.

First, given that the organizations responsible for security of HPC systems are likely to care more about misuse of cycles if very large numbers of cycles are used, this suggests focusing on the users that use cycles for many hours per day for days at a time. This is a very different practical scenario than network security monitoring, where a decision about security might require a response in a fraction of a second in order to prevent compromise. Given the longer time scale, therefore, a human security analyst can be involved rather than requiring the application monitoring, on the level that we have done it, to be conclusive. Rather, that application monitoring might simply serve to focus an analyst's attention, and to lead to a manual source code analysis, or even an actual conversation with the user whose account was used to run the code.

A second reason why an attacker evading detection on HPC might be harder is that users are often given "cycle allocations" to run code. As a result, the more a program running on an HPC system is modified to mask illicit use, the more likely it is that additional cycles must be used to do additional tasks to make it look like the program is doing something different than it actually is. Thus, the faster a stolen allocation will be used up, and/or the longer it will take the HPC system to accomplish whatever illicit use the attacker is attempting.

Collecting better audit and provenance data. It is important to note the success of the work mentioned in the previous section is dependent on the availability of useful security monitoring data. It is our observation that the current trend in many scientific environments of collecting provenance data for scientific reproducibility purposes, such as in the Tigres workflow system38 and the DOE Biology Knowledgebase (KBase),21 may help to provide better data that can be used for security monitoring, as might DARPA's "Transparent Computing" program,11 which seeks to "make currently opaque computing systems transparent by providing high-fidelity visibility into component interactions during system operation across all layers of software abstraction, while imposing minimal performance overhead."

In line with this, as noted earlier, HPC systems have a lot in common with traditional systems, but also contain a lot of highly custom OS-level, network-level, and application-level software. A key point here is that such exotic hardware and low-level software stacks may also provide opportunities for monitoring data going forward. The performance counters used in many of today's HPC machines are an example of this.
Figure 2. "Adjacency matrices" for individual runs of a performance benchmark, an atmospheric dynamics simulator, and the linear equation solver SUPERLU. The number of bytes sent between ranks is linearly mapped from dark blue (lowest) to red (highest), with white indicating an absence of communication.47,48
Post-exascale systems, as well as more architectures that are still in their early phases of practical implementation, such as neuromorphic computing, quantum computing, and photonic computing, may all provide additional challenges and opportunities. For example, though neural networks were previously thought by many to be inscrutable,16 new research suggests interpreting them may actually be possible at some point.12,49 If successful, this might give rise to the ability to interpret networks learned by neuromorphic chips.

[…] to HPC, rather than full-blown UNIX command-line interfaces, may provide a reduction of complexity that superfacility would otherwise introduce. While science gateways still represent vulnerability vectors from arbitrary code, even when it is submitted via Web front-ends, since security tends to benefit from more constrained opera- […]
P. 92 Technical Perspective: Humans and Computers Working Together on Hard Tasks
By Ed H. Chi

P. 93 Scribe: Deep Integration of Human and Machine Intelligence to Caption Speech in Real Time
By Walter S. Lasecki, Christopher D. Miller, Iftekhar Naim, Raja Kushalnagar, Adam Sadilek, Daniel Gildea, and Jeffrey P. Bigham
research highlights
DOI:10.1145/3068774

Technical Perspective
A Gloomy Look at the Integrity of Hardware
By Charles (Chuck) Thacker

To view the accompanying paper, visit doi.acm.org/10.1145/3068776

SINCE THE INVENTION of the integrated circuit, the complexity of the devices and the cost of the facilities used to build them have increased dramatically. The first fabrication facility with which I was associated was built at Xerox PARC in the mid-1970s at a cost of approximately $15M ($75M today). Today, the cost of a modern fab is approximately $15B. This cost is justified by the fact that today's chips are much more complex than in earlier times. The number of layers involved has grown to over 100, and the tolerances involved are approaching atomic dimensions.

The high cost of a fab means that in order to be cost-effective, it must be fully loaded. This has led to "silicon foundries," which build chips for a variety of "fabless" semiconductor companies based on a set of physical design libraries supplied by the foundry. Carver Mead and Lynn Conway in their seminal 1980 "Introduction to VLSI Systems" initially proposed this concept, but the Taiwan Semiconductor Manufacturing Company (TSMC), founded in 1987, changed what had been an academic exercise into an industrial norm. Today, a few large fabs throughout the world dominate this business.

Over the last two decades, integrated circuit design has diverged into two specialties: (1) architectural and logical design and device layout, done by a design house, and (2) mask generation and device fabrication, done by a foundry. To ensure the foundry has done its job correctly, the design house relies on extensive testing to verify that devices meet their specifications.

The following paper assumes the foundry (or other parties involved in the low levels of fabrication) is malicious, and can modify the design they receive to produce a device that can later be used for malice. Their attack employs a very small Trojan circuit included in an otherwise correct design. The Trojan awaits the chip's deployment, and it may then be triggered by an external software attack. When triggered, the chip's normal function is subverted by the attacker. In the A2 implementation, the trigger is used to elevate the privilege of a user-mode program. The authors argue that the simplicity of the Trojan and its use of analog circuitry make it difficult to detect, even with enhanced levels of testing. They go to considerable lengths to verify their approach, including extensive simulation and actual fabrication of a processor in a modern silicon process. On the actual hardware, the Trojan operated as expected.

Is this realistic? Certainly no foundry wants to compromise its business model by being identified as untrustworthy.

As I was preparing this Technical Perspective, the Dyn/Mirai DDoS attack occurred. Apparently, the attack used a large number of IoT devices (DVRs and webcams) as a botnet, which targeted a major DNS server. This is approximately what the authors of the following paper describe, although the attack was done by exploiting the lack of security in the target's software, rather than by adding hardware. The reports seem to indicate the bot devices were easily compromised, using default passwords that could not be changed, and the devices were not designed to be updated in the field. While the security provided by IoT devices will surely improve, the authors argue that the introduction of small Trojans by untrusted fabrication facilities will remain a problem for which technical solutions appear elusive.

As technologists, technical solutions to problems are what we do best. In the case of the attack proposed by the authors, a technical defense seems problematic. We do, however, have examples from other fields that might be promising. The A2 Trojan assumes an untrusted fabrication facility. While it might not be possible to do all future fabrication in trusted facilities, using a third party trusted by both the fab and its customers to monitor the behavior of the fab seems plausible. The job of the third party is to certify the proper behavior of the fab. Trusted third parties are widely used in areas ranging from financial contracts to nuclear treaty compliance. "Trust but verify" was used during the Cold War to describe this relationship.

The authors have a lot of experience with attacks on digital logic, and do a good job of explaining previous work in the area. The paper is definitely worth reading carefully, as it covers an area that will likely become much more important in an increasingly technology-dependent world.

Charles Thacker, computing pioneer and recipient of the 2009 ACM A.M. Turing Award, passed away in June 2017, soon after this Technical Perspective was written.

Copyright held by author.
This paper presents three contributions. (1) We design and implement the first fabrication-time processor attack that mimics the triggered attacks often added during design time. As a part of our implementation, we are the first to show how a fabrication-time attacker can leverage the empty space common in chip layouts to implement malicious circuits. (2) We show how an analog attack can be much smaller and more stealthy than its digital counterpart. Our attack diverts charge from unlikely signal transitions to implement its trigger, so it is invisible to all known side-channel defenses. Additionally, as an analog circuit, our attack is under the digital layer and is missed by functional verification performed on the hardware description language. (3) We fabricate an openly malicious processor and then evaluate the behavior of our fabricated attacks across many chips and changes in environmental conditions. We compare these results to Simulation Program with Integrated Circuit Emphasis (SPICE) simulation models.

2. BACKGROUND AND THREAT MODEL
The typical design and fabrication process of integrated circuits is shown in Figure 1 (see Rostami16). This process often involves collaboration between different parties all over the world, and each step is likely done by a different team even if the teams are in the same company. Therefore, designs are vulnerable to malicious attacks by rogue engineers involved in any of the above steps.

The design house implements the specification for the chip's behavior in some Hardware Description Language (HDL). Once the specification is implemented in an HDL and that implementation has been verified, the design is passed to a back-end house, which places and routes the circuit.

Conventional digital Trojans can only be inserted in the design phase and are easier to detect with design-phase verification. Fabrication-time attacks inserted in the back-end and fabrication phases can evade these defenses. Since it is strictly more challenging to implement attacks at the fabrication phase than at the back-end phase, due to limited information and limited ability to modify the design, we focus on that threat model for our attack.

The attacker starts with a Graphic Database System II (GDSII) file, a polygon representation of the completely laid-out and routed circuit. Our threat model assumes that the delivered GDSII file represents a perfect implementation—at the digital level of abstraction—of the chip's specification. This is very restrictive, as it means that the attacker can only modify existing circuits or—as we are the first to show in this paper—add attack circuits to open spaces in the laid-out design. The attacker cannot increase the dimensions of the chip or move existing components around. This restrictive threat model also means that the attacker must perform some reverse engineering to select viable victim flip-flops and wires to tap. After the untrusted fabrication house completes fabrication, it sends the fabricated chips off to a trusted party for post-fabrication testing. Our threat model assumes that the attacker has no knowledge of the test cases used for post-fabrication testing. Such a model dictates the attack's use of a sophisticated trigger to hide the attack.

Figure 1. Typical IC design process with commonly-researched threat vectors highlighted in red. The blue text and brackets highlight the party in control of the stage(s).
with Metal Oxide Semiconductor (MOS) caps. M0 and M1 are the two switches, as shown in Figure 3. A detector is used to compare the cap voltage with a threshold voltage, and can be implemented with inverters or Schmitt triggers. An inverter has a switching voltage that depends on its sizing: when the capacitor voltage is higher than the switching voltage, the output is 0; otherwise, the output is 1. A Schmitt trigger is an inverter with hysteresis: it has a large threshold when the input goes from low to high and a small threshold when the input goes from high to low. The hysteresis is beneficial for our attack because it extends both trigger time and retention time. To balance the leakage current through M0 and M1, an additional leakage path to ground (NMOS M2, as shown in Figure 4) is added to the design.

A SPICE simulation waveform illustrating the operation of our analog trigger circuit after optimization is shown in Figure 5. The operation is the same as that of the behavioral model we proposed (shown in Figure 2), allowing us to use the behavioral model for system-level attack design.

Figure 3. Design concepts of analog trigger circuit based on capacitor charge sharing.

Figure 4. Transistor-level schematic of analog trigger circuit.

Figure 5. SPICE simulation waveform of analog trigger circuit (trigger time: 240ns; retention time: 0.8µs).
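The qualitative behavior described above, charge added on each victim-wire toggle racing against continuous leakage through M2, can be sketched with a simple discrete-time behavioral model. The normalized constants below are invented for illustration; they are not the authors' SPICE or silicon values:

```python
def charge_trigger_fires(toggles_per_us, duration_us, dt_us=0.01,
                         share=0.05, leak_per_us=1.0, threshold=0.8):
    """Behavioral sketch of the charge-sharing trigger: each toggle of the
    victim wire shares a fraction of charge onto the main cap (toward
    VDD = 1.0), while the leakage path (M2) continuously drains it. The
    detector fires once the cap voltage crosses `threshold`."""
    v = vmax = 0.0
    period = 1.0 / toggles_per_us
    next_toggle = t = 0.0
    while t < duration_us:
        if t >= next_toggle:                       # victim wire toggles
            v += share * (1.0 - v)                 # charge shared toward VDD
            next_toggle += period
        vmax = max(vmax, v)
        v = max(0.0, v - leak_per_us * v * dt_us)  # leakage decay
        t += dt_us
    return vmax >= threshold

print(charge_trigger_fires(100, 10))  # fast toggling: charging wins -> True
print(charge_trigger_fires(10, 10))   # slow toggling: leakage wins -> False
```

This captures why only deliberate, rapid toggling of the victim wire arms the trigger, while the occasional toggles of normal operation never accumulate enough charge.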
3.2. Multi-stage trigger circuit
The one-stage trigger circuit described in the previous section takes only one victim wire as an input. Using only one trigger input limits the attacker in two ways: (1) because fast toggling of one signal for tens of cycles triggers the single-stage attack, there is still a chance that normal operations or certain benchmarks can expose the attack, and (2) certain instructions are required to create fast toggling of a single trigger input, and there is not much room for a flexible and stealthy attack program.

We note that an attacker can make a logical combination of two or more single-stage trigger outputs to create a variety of more flexible multi-stage analog triggers. Basic operations to combine two triggers include AND and OR. When analyzing the behavior of logic operations on single-stage trigger outputs, it should be noted that the single-stage trigger outputs 0 when triggered. Thus, for the AND operation, the final trigger is activated when either the A or B trigger fires. For the OR operation, the final trigger is activated when both the A and B triggers fire. It is possible for an attacker to combine these simple AND- and OR-connected triggers into an arbitrarily complex multi-level multi-stage trigger.
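Because the single-stage output is active-low (0 once triggered), composing stages inverts the usual intuition: an AND gate asserts on either input, and an OR gate only when both fire. A few lines of Python, with the 0-means-fired encoding as a stated assumption, make this concrete:

```python
def stage(fired: bool) -> int:
    """A single-stage trigger output is active-low: 0 once triggered."""
    return 0 if fired else 1

def and_gate(a: int, b: int) -> int:
    return a & b

def or_gate(a: int, b: int) -> int:
    return a | b

# With active-low outputs, AND asserts (goes to 0) when EITHER stage fires,
# while OR asserts only when BOTH fire - De Morgan's laws in action.
assert and_gate(stage(True), stage(False)) == 0  # either fired -> asserted
assert or_gate(stage(True), stage(False)) == 1   # still dormant
assert or_gate(stage(True), stage(True)) == 0    # both fired -> asserted
```

Chaining these two primitives over more stages yields the arbitrarily complex multi-level triggers described above.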
3.3. Triggering the attack
For A2, the payload design is independent of the trigger mechanism, so our proposed analog trigger is suitable for various payloads to achieve different attacks. Since the goal of this work is to achieve a Trojan that is nearly invisible while providing a powerful foothold for a software-level attacker, we couple our analog triggers to a privilege escalation attack,9 which provides maximum capabilities to an attacker. We propose a simple design to overwrite security-critical registers directly by adding one AND/OR gate to the asynchronous set or reset pins of the registers. These reset/set pins are specified in original designs for processor reset. These reset signals are asynchronous, with no timing constraints, so adding one gate into the reset signal of one register does not affect the functionality or timing constraints of the design. Because there are no timing constraints on asynchronous inputs, the payload circuit can be inserted manually after final placement and routing in a manner consistent with our threat model.
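At the digital level, the payload amounts to one extra gate spliced into a register's asynchronous reset path. A minimal sketch, assuming active-low reset and trigger signals; the gate choice and polarity are illustrative, not the exact fabricated netlist:

```python
def effective_reset_n(design_reset_n: int, trigger_n: int) -> int:
    """One AND gate merged into the async reset path: with both signals
    active-low, the register now resets when the design's own reset
    asserts OR when the Trojan trigger fires."""
    return design_reset_n & trigger_n

# Normal operation: neither asserted -> the register keeps its value.
assert effective_reset_n(1, 1) == 1
# Trigger fires -> a security-critical register clears with no CPU reset.
assert effective_reset_n(1, 0) == 0
```

Since the added gate sits on a path with no timing constraints, it perturbs neither the functional behavior nor the timing closure of the surrounding design.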
3.4. Selecting victims
It is important that the attacker validate their choice of victim signal. This requires verifying that the victim wire has low baseline activity and that its activity level is controllable given the expected level of access of the attacker. To validate that the victim wire used in A2 has a low background activity, we use benchmarks from the MiBench embedded systems benchmark suite. For cases where the attacker does not have access to such software or the attacked processor will see a wide range of use, the attacker can follow A2's example and use a multi-stage trigger with wires that toggle in a mutually exclusive fashion and require inputs that are unlikely to […]
of the design, so locating the desired signal is trivial. But an attack inserted at the back-end stage can still be discovered by SPICE simulation and layout checks, though the chance is extremely low if no knowledge about the attack exists. In contrast, fabrication-time attacks can only be discovered by post-silicon testing, which is believed to be very expensive, and it is difficult for such testing to find small Trojans. To insert an attack during chip fabrication, some insights about the design are needed, which can be extracted from the layout through physical verification tools and digital simulations, or from a co-conspirator involved in the design phase.
The next step is to find empty space around the victim wire and insert the analog trigger circuit. Unused space is usually automatically filled with filler cells or capacitor cells by placement and routing tools. Removing these cells will not affect the functionality or timing.
To insert the attack payload circuit, the reset wire needs to be cut as discussed in Section 3.3. It has been shown that the timing of the reset signal is flexible, so the AND or OR gate only needs to be placed somewhere close to the reset signal. Because the added gates can be minimum-strength cells, their area is small and finding space for them is trivial.
The last step is to manually route from the trigger input wires to the analog trigger circuit and then to the payload circuits. There is no timing requirement on this path, so the routing can go around existing wires at the same metal layer (jogging) or jump over existing wires by going to another metal layer (jumping). If long and high metal wires become a concern for the attacker due to potentially easier detection, repeaters (buffers) can be added to break a long wire into small sections. Furthermore, it is possible that the attacker can choose different trigger input wires and/or payload according to the existing layout of the target design.
In our OR1200 implementation, inserting the attack following the steps above is trivial, even with the design's 80% area utilization. Routing techniques including jogging and jumping are used, but such routing approaches are very common for automatic routing tools, so the information leaked by such wires is limited.
Side-channel information. For the attack to be stealthy and defeat existing protections, the area, power, and timing overhead of the analog trigger circuit should be minimized. High-accuracy SPICE simulation is used to characterize the power and timing overhead of the implemented trigger circuits. Comparisons with several variants of NAND2 and D flip-flop standard cells from commercial libraries are summarized in Table 1. The area of the trigger circuit not using IO devices is similar to that of an X4-strength D flip-flop. Using an IO device increases the trigger circuit size significantly, but the area is still similar to the area of two standard cells, which ensures it can be inserted into empty space in the final design layout. AC power is the total energy consumed by the circuits when an input changes; the power numbers are simulated with SPICE on a netlist including extracted parasitics. Standby power is the power consumption of the circuits when inputs are static, which comes from leakage currents of CMOS devices.
After inserting A2, post-layout simulation with extracted parasitics shows that the extra delay of victim wires is 1.2ps on average, which is only 0.33% of the 4ns clock period and well below the process variation and noise range. In practice, such a delay difference is nearly impossible to measure unless a high-resolution time-to-digital converter is included on chip, which is impractical due to its large area and power overhead.
Comparison to digital-only attacks. If we look at a previously proposed, digital-only, smallest implementation of a privilege escalation attack,5 it requires 25 gates and 80um2, while our analog attack requires as little as one gate for the same effect. Our attack is also much more stealthy, as it requires dozens of consecutive rare events, whereas the other attack only requires two. We also implement a digital-only, counter-based attack that aims to mimic A2. The digital version of A2 requires 91 cells and 382um2, almost two orders of magnitude more than the analog counterpart. These results demonstrate how analog attacks can provide attackers the same power and control as existing digital attacks while being much more difficult to catch.

5. EVALUATION
We perform all experiments with our fabricated 2.1mm2 malicious OR1200 processor as shown in Figure 6. Figure 6 also marks the locations of A2 attacks, with two levels of zoom to aid in understanding the challenges of identifying A2 in a sea of non-malicious logic. In fact, A2 occupies less than 0.08% of the chip's area. Our fabricated chip contains two sets of attacks: the first set of attacks are one- and two-stage triggers baked into the processor that we use to assess the end-to-end impact of A2. The second set of attacks exist
Table 1. Comparison of area and power between our implemented analog trigger circuits and commercial standard cells in 65nm GP CMOS
technology.
[Figure 6: annotated die photo (OR1200 core, I$, CLK, scan chain, testing structure, digital IO, A2 trigger; 2mm die width); (b) distribution of analog trigger circuit using only core devices.]
confirmed that the trigger circuit can be activated when the victim wire toggles between 0.46MHz and 120MHz, the supply voltage varies between 0.8V and 1.2V, and the ambient temperature varies between −25°C and 100°C.
As expected, different conditions yield different minimum toggling rates to activate the trigger. Temperature has a stronger impact than voltage on the trigger condition because of leakage current's exponential dependence on temperature. At higher temperatures, more cycles and higher switching activity are required to trigger, because leakage from the capacitor is larger.

5.2. Is the attack triggered by non-malicious benchmarks?
Another important property for any hardware Trojan is not exposing itself under normal operations. Because A2's trigger circuit is connected only to the trigger input signal, digital simulation of the design is enough to acquire the activity of the signals. However, since we make use of analog characteristics to attack, analog effects should also be considered as potential causes of accidental triggering. We use MiBench4 as a test bench because it targets the class of processor that best fits the OR1200 and it consists of a set of well-understood applications that are popular benchmarks in both academia and industry. To validate that A2's trigger avoids spurious activations from a wide variety of software, we select five benchmark applications from MiBench, each from a different class. This ensures that we thoroughly test all subsystems of the processor, exposing likely activity rates for the wires in the processor. Again, in all programs, the victim registers are initialized to the opposite of the states that A2 puts them in when its attack is deployed. The processor runs all five programs at six different temperatures from −25°C to 100°C. Results show that neither the one-stage nor the two-stage trigger circuit is exposed when running these benchmarks across such a wide temperature range.

5.3. Existing protections
Existing protections against fabrication-time attacks are mostly based on side-channel information, for example, power, temperature, and delay. In A2, we only add one gate in the trigger, thus minimizing the power and temperature perturbations caused by the attack.
Table 2 summarizes the average power consumption measured when the processor runs our five benchmark programs at the nominal condition (1V supply voltage and 25°C). Direct measurement of trigger circuit power is infeasible in our setup, so simulation is used as an estimate. Simulated trigger power consumption in Table 1 translates to 5.3nW and 0.5mW for trigger circuits constructed with and without IO devices. These numbers are based on the assumption that trigger inputs keep toggling at 1/4 of the clock frequency of 240MHz, which is the maximum switching activity that our attack program can achieve. In the common case of non-attacking software, the switching activity is much lower (approaching zero) and only lasts a few cycles, so the extra power due to our trigger circuit is even smaller. In our experiments, the power of the attack circuit is orders of magnitude less than the normal power fluctuations that occur in a processor while it executes different instructions. Further discussion of possible defenses, such as split manufacturing and runtime verification, is presented in our original A2 paper.21

Table 2. Power consumption of our test chip running a variety of benchmark programs.

Program               Power (mW)
Standby                6.210
Basic math            23.703
Dijkstra              16.550
FFT                   18.120
SHA                   18.032
Search                21.960
Single-stage attack   19.505
Two-stage attack      22.575
Unsigned division     23.206

6. CONCLUSION
Experimental results with our fabricated malicious processor show that a new style of fabrication-time attack is possible; it applies to a wide range of hardware, spans the digital and analog domains, and affords control to a remote attacker. Experimental results also show that A2 is effective at reducing the security of existing software, enabling unprivileged software full control over the processor. Finally, the experimental results demonstrate the elusive nature of A2: (1) A2 is as small as a single gate, two orders of magnitude smaller than a digital-only equivalent; (2) attackers can add A2 to an existing circuit layout without perturbing the rest of the circuit; (3) a diverse set of benchmarks fail to activate A2; and (4) A2 has little impact on circuit power, frequency, or delay.
Our results expose two weaknesses in current malicious hardware defenses. First, existing defenses analyze the digital behavior of a circuit using functional simulation or the analog behavior of a circuit using circuit simulation. Functional simulation is unable to capture the analog properties of an attack, while it is impractical to simulate an entire processor for thousands of clock cycles in a circuit simulator; this is why we had to fabricate A2 to verify that it worked. Second, the minimal impact on the run-time properties of a circuit (e.g., power, temperature, and delay) due to A2 suggests that it is an extremely challenging task for side-channel analysis techniques to detect this new class of attacks. We believe that our results motivate a different type of defense, where trusted circuits monitor the execution of untrusted circuits, looking for out-of-specification behavior in the digital domain.

Acknowledgments
This work was supported in part by C-FAR, one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. This work was also partially funded by the National Science Foundation. Any opinions, findings, conclusions, and recommendations expressed in this paper are solely those of the authors.
DOI:10.1145/3068614

Technical Perspective
Humans and Computers Working Together on Hard Tasks
By Ed H. Chi

To view the accompanying paper, visit doi.acm.org/10.1145/3068663
THE FIELD OF crowdsourcing and human computation has evolved considerably from its early days. At first, crowdsourcing was mainly conceived as a way to obtain ground truth labels for datasets, particularly image datasets, in the mid-2000s. Soon after, researchers began to utilize crowdsourcing for performing large-scale user studies of systems.a,b As our understanding of crowdsourcing continued to evolve, researchers realized the workers can be reserved ahead of time to perform real-time tasks.c Utilizing this idea, the system described in the following paper demonstrates how a crowd of workers can caption speech nearly as well as a professional captionist. Importantly, this paper was one of the first in a recent set of crowdsourcing papers that demonstrated how human workers can collaborate in concert with computing systems to accomplish a real-time task that is difficult for either one to do by itself. This is notable for many reasons, but let me first summarize the significance of this work.
First, the system demonstrated that significant innovation is needed to get human workers to productively perform the captioning task. For example, the Scribe system slows down the continuous speech for a brief period of time with the right volume changes to emphasize what passage to transcribe for the worker. The volume variations help with audio saliency. This technique is interesting to human-computer interaction (HCI) researchers, since it utilizes our intuition about how we can direct human attention, helping to transform individual untrained workers into better captionists.
Second, the system uses a MapReduce programming paradigm to divide and conquer the various pieces of the captioning tasks and coordinates the workers and their tasks through this organization paradigm. First introduced by Kittur et al.,d this is a clever application of the MapReduce paradigm, but instead of applying it to computing tasks, the system applies the concept to organizing human tasks.
Third, impressively, to combine the partial contributions from individual workers, the system utilizes a sequence alignment algorithm to combine the streams of input from various workers. This is novel because most crowdsourcing systems use a simple majority voting approach to combine the worker inputs. The use of a sophisticated algorithm here is necessary to fit the captioning problem, and it points to the possibility of other combiner functions in other problems in future research. A natural extension of the alignment algorithm here would be to utilize a task-specific language model trained using deep learning.
From a historical perspective, augmenting humans has been at the very center of much personal computing and HCI research. There has been much talk about the degree to which machine learning (ML) will replace human labor (HL) in the future, but I think that is misguided. Instead, what we see in this research is a good example in which humans and machines work in concert on a very hard task that is currently still too difficult to do by either alone. Interestingly, this aligns well with a historical recounting of the code-breaking work by Turing and colleagues at Bletchley Park in a recent issue of Communications: "Another myth is that code-breaking machines eliminated human labor and code-breaking skill ... Technology transcended, rather than supplemented, human labor and bureaucracy."e The article points out the real challenge of the whole effort was a combination of the management of a (mostly female!) human operator force along with the Enigma machines. From my perspective, intelligent augmentation of our abilities is the real research frontier.
While we continue to explore the boundary of what is possible for machine intelligence, we should also be exploring the boundary of how humans will interact with machine intelligence. For example, how can we have an intelligent conversation with computing systems? Can I talk to a restaurant recommendation system while I drive home to get ready for a dinner date? How should my television respond if I say I wanted an exciting action film tonight that takes into account the tastes of other family members? If it doesn't have enough information on everyone in the room, will it (he/she?) ask intelligent questions while naturally conversing with my guests? Can I give feedback both via hand gestures as well as voice dialog?
Since an important application of machine intelligence is to augment humans in their desires, goals, and tasks, what we should do is to ask important research questions about human interactions with ML systems. In other words, we should have much better research of ML+HL, ML+HCI, and ML+Human Interaction, and this research is a shining example that points the way.

a Kittur, A., Chi, E.H., Suh, B. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the ACM Conference on Human Factors in Computing Systems, ACM Press (Florence, Italy, 2008), 453–456.
b Egelman, S., Chi, E.H., Dow, S. Crowdsourcing in HCI research. Ways of Knowing in HCI. J.S. Olson and W.A. Kellogg, eds. Springer, NY, 2014, 267–289.
c Bernstein, M., Brandt, J., Miller, R., and Karger, D. Crowds in two seconds: Enabling real-time crowd-powered interfaces. UIST 2011.
d Kittur, A., Smus, B., Khamkar, S., and Kraut, R.E. CrowdForge: Crowdsourcing complex work. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology (2011), 43–52; http://dx.doi.org/10.1145/2047196.2047202
e Haigh, T. Colossal genius: Tutte, flowers, and a bad imitation of Turing. Commun. ACM 60, 1 (Jan. 2017), 29–35; https://doi.org/10.1145/3018994

Ed H. Chi is Research Lead Manager and Sr. Staff Research Scientist at Google Inc., Mountain View, CA.

Copyright held by author.
workers on Mechanical Turk can be recruited within a few seconds,1,2,11 and engaged in continuous tasks.21,24,25,28 Recruiting from a broader pool allows workers to be selectively chosen for their expertise, not in captioning, but in the technical areas covered in a lecture. While professional stenographers are able to type faster and more accurately than most crowd workers, they are not necessarily experts in the field they are captioning, which can lead to mistakes that distort the meaning of transcripts of technical talks.30 Scribe allows student workers to serve as non-expert captionists for $8–$12 per hour (a typical work-study pay rate). Therefore, we could hire several students for much less than the cost of one professional captionist.
Scribe makes it possible for non-experts to collaboratively caption speech in real time by providing automated assistance in two ways. First, it assists captionists by making the task easier for each individual. It directs each worker to type only part of the audio stream, it slows down the portion they are asked to type so they can more easily keep up, and it adaptively determines the segment length based on each individual's typing speed. Second, it solves the coordination problem for workers by automatically merging the partial input of multiple workers into a single transcript using a custom version of multiple-sequence alignment.
Because captions are dynamic, readers spend far more mental effort reading real-time captions compared to static text. Also, regardless of method, captions require users to absorb information that is otherwise consumed via two senses (vision and hearing) via only one (vision). In classroom settings, this can be particularly common, with content appearing on the board and being referenced in speech. The effort required to track both the captions and the material they pertain to simultaneously is one possible reason why deaf students often lag behind their hearing peers, even with the best accommodations.26 To address these issues, we also explore how captions can best be presented to users,16 and show that controlling bookmarks in caption playback can even increase comprehension.22
This paper outlines the following contributions:

• Scribe, an end-to-end system that has advantages over current state-of-the-art solutions in terms of availability, cost, and accuracy.
• Evidence that non-experts can collectively cover speech at rates similar to or above that of a professional.
• Methods for quickly merging multiple partial captions to create a single, accurate stream of final results.
• Evidence that Scribe can produce transcripts that both cover more of the input signal and are more accurate than either ASR or any single constituent worker.
• The idea of automatically combining the real-time efforts of dynamic groups of workers to outperform individuals on human performance tasks.

2. CURRENT APPROACHES
We first overview current approaches for real-time captioning, introduce our data set, and define the evaluation metrics used in this paper. Methods for producing real-time captioning services come in three main varieties:
Computer-Aided Real-time Transcription (CART): CART is the most reliable real-time captioning service, but is also the most expensive. Trained stenographers type in shorthand on a "steno" keyboard that maps multiple key presses to phonemes that are expanded to verbatim text. Stenography requires 2–3 years of training to consistently keep up with natural speaking rates that average 141 WPM and can reach 231 WPM.13
Non-Verbatim Captioning: In response to the cost of CART, computer-based macro expansion services like C-Print were introduced.30 C-Print captionists need less training, and generally charge around $60 an hour. However, they normally cannot type as fast as the average speaker's pace, and cannot produce a verbatim transcript. Scribe employs captionists with no training and compensates for slower typing speeds and lower accuracy by combining the efforts of multiple parallel captionists.
Automated Speech Recognition: ASR works well in ideal situations with high-quality audio equipment, but degrades quickly in real-world settings. ASR has difficulty recognizing domain-specific jargon, and adapts poorly to changes, such as when the speaker has a cold.6 ASR systems can require substantial computing power and special audio equipment to work well, which lowers availability. In our experiments, we used Dragon NaturallySpeaking 11.5 for Windows.
Re-speaking: In settings where trained typists are not common (such as in the U.K.), alternatives have arisen. In re-speaking, a person listens to the speech and enunciates clearly into a high-quality microphone, often in a special environment, so that ASR can produce captions with high accuracy. This approach is generally accurate, but cannot produce punctuation, and has considerable delay. Additionally, re-speaking still requires extensive training, since simultaneous speaking and listening is challenging.

3. LEGION: SCRIBE
Scribe gives users on-demand access to real-time captioning from groups of non-experts via their laptop or mobile devices (Figure 1). When a user starts Scribe, it immediately begins recruiting workers to the task from Mechanical Turk, or a pool of volunteer workers, using LegionTools.11,20 When users want to begin captioning audio, they press the start button, which forwards audio to Flash Media Server (FMS) and signals the Scribe server to begin captioning.
Workers are presented with a text input interface designed to encourage real-time answers and increase global coverage (Figure 2). A display shows workers their rewards for contributing in the form of both money and points. In our experiments, we paid workers $0.005 for every word the system thought was correct. As workers type, their input is forwarded to an input combiner on the Scribe server. The input combiner is modular to accommodate different implementations without needing to modify Scribe. The combiner and interface are discussed in more detail later in this article.
[Figure 1. Scribe system overview: the caption stream flows to the merging server, is combined with crowd corrections, and is returned as merged captions.]
In the following sections, we detail the co-evolution of the worker interface and the algorithm for merging partial captions in order to form a final transcript.

4. COORDINATING CAPTIONISTS
Scribe's non-expert captioning interface allows contributors to hear an audio stream of the speaker(s) and provide captions with a simple user interface (UI) (Figure 2). Captionists are instructed to type as much as they can, but are under no pressure to type everything they hear. If they are able, workers are asked to separate contiguous sequences of words by pressing enter. Knowing which word sequences are likely to be contiguous can help later when recombining the partial captions from multiple captionists.
To encourage real-time entry of captions, the interface "locks in" words a short time after they are typed (500ms). New words are identified when the captionist types a space after the word, and are sent to the server. The delay is added to allow workers to correct their input while adding as little additional latency as possible. When the captionist presses enter (or following a 2s timeout during which they have not typed anything), the line is confirmed and animates upward. During the 10–15s trip to the top of the display (depending on settings), words that Scribe determines were entered correctly (based on either spell-checking or overlap with another worker) are colored green. When the line reaches the top, a point score is calculated for each word based on its length and whether it has been determined to be correct.
To recover the true speech, non-expert captions must cover all of the words spoken. A primary reason why the partial transcriptions may not fully cover the true signal relates to saliency, which is defined in a linguistic context as "that quality which determines how semantic material is distributed within a sentence or discourse, in terms of the relative emphasis which is placed on its various parts".7 Numerous factors influence what is salient, and so it is likely to be difficult to detect automatically. Instead, we inject artificial saliency adjustments by systematically varying the volume of the audio signal that captionists hear. Scribe's captionist interface is able to vary the volume over a given period with an assigned offset. It also displays visual reminders of the period to further reinforce this notion.
Initially, we tried dividing the audio signal into segments that we gave to individual workers. We found several problems with this approach. First, workers tended to take longer to provide their transcriptions, as it took them some time to get into the flow of the audio. A continuous stream avoids this problem. Second, the interface seemed to encourage workers to favor quality over speed, whereas streaming content reminds workers of the real-time nature of the task. The continuous interface was designed in an iterative process involving tests with 57 remote and local users with a range of backgrounds and typing abilities. These tests showed that workers tended to provide chains of words rather than disjoint words, and needed to be informed of the motivations behind aspects of the interface to use them properly.
A non-obvious question is what the period of the volume changes should be. In our experiments, we chose to play the audio at regular volume for 4s and then at a lower volume for 6s. This seems to work well in practice, but it is likely that it is not ideal for everyone (discussed below). Our experience suggests that keeping the in period short is preferable even when a particular worker was able to type more than the period, because the latency of a worker's input tended to go up as they typed more consecutive words.

5. IMPROVING HUMAN PERFORMANCE
Even when workers are directed to small, specific portions of the audio, the resulting partial captions are not perfect. This is due to several factors, including bursts of increased speaking rates being common, and workers mis-hearing some content due to a particular accent or audio disruption. To make the task easier for workers, we created TimeWarp,23 which allows each worker to type what they hear in clips with a lower playback rate, while still keeping up with real time and maintaining context from content they are not responsible for.

5.1. Warping time
TimeWarp manages this by balancing the play speed during in periods, where workers are expected to caption the audio and the playback speed is reduced, and out periods, where workers listen to the audio and the playback speed is increased. A cycle is one in period followed by an out period. At the beginning of each cycle, the worker's position in the audio is aligned with the real-time stream. To do this, we first need to select the number of different sets of workers N that will be used in order to partition the stream. We call the length of the in period Pi, the length of the out period Po, and the play speed reduction factor r. Therefore, the playback rate during in periods is 1/r. The amount of the real-time stream that gets buffered while playing at the reduced speed is compensated for by an increased playback speed of (N − 1)/(N − r) during out periods. The result is that the cycle time of the modified stream equals the cycle time of the unmodified stream.
To set the length of Pi for our experiments, we conducted preliminary studies with 17 workers drawn from Mechanical Turk. We found that their mean typing speed was 42.8 WPM on a similar real-time captioning task. We also found that a worker could type at most 8 words in a row on average before the per-word latency exceeded 8s (our upper bound on acceptable latency). Since the mean speaking rate is around 150 WPM,13 workers will hear 8 words in roughly 3.2s, with an entry time of roughly 8s from the last word spoken. We used this to set Pi = 3.25s, Po = 9.75s, and N = 4. We chose r = 2 in our tests, so that the playback speed during in periods is 1/2 = 0.5 times, and the play speed for out periods is (N − 1)/(N − r) = 3/2 = 1.5 times.
To speed up and slow down the play speed of content being provided to workers without changing the pitch (which would make the content more difficult to understand for the worker), we use the Waveform Similarity Based Overlap and Add (WSOLA) algorithm.4 WSOLA works by dividing the signal into small segments, then either skipping (to increase play speed) or adding (to decrease play speed) content, and finally stitching these segments back together. To reduce the number of sound artifacts, WSOLA finds overlap points with similar waveforms, then gradually transitions between sequences during these overlap periods.
Figure 4. Optimal coverage reaches nearly 80% when combining the input of four workers, and nearly 95% with all 10 workers, showing
captioning audio in real time with non-experts is feasible.
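The shape of this curve is roughly what a simple independence model predicts. Assuming (our assumption, not the paper's model) that each worker independently covers a fixed fraction p of the spoken words, k workers jointly cover 1 − (1 − p)^k:

```python
# Back-of-the-envelope model for the coverage curve: if each worker
# independently covered a fixed fraction p of the spoken words, k workers
# would jointly cover 1 - (1 - p)**k. The independence assumption is ours,
# not the paper's; real workers' coverage overlaps systematically.

def joint_coverage(p, k):
    """Expected fraction of words covered by k independent workers."""
    return 1 - (1 - p) ** k

# With p = 0.33 per worker, the model passes ~80% at 4 workers, broadly
# matching the shape of the measured optimal-coverage curve.
curve = [round(joint_coverage(0.33, k), 2) for k in range(1, 11)]
```

The measured curve saturates somewhat below this idealized one, as expected when workers' attention is drawn to the same salient words.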
[Figure 4 plot: coverage (%) vs. number of workers (1–10) for Optimal, CART, Scribe, ASR, and a single worker; c=10s, threshold=2.]

[Plot: word error rate of caption-merging approaches: A*-15-t (c=15s, threshold=2), A*-15 (c=15s, no threshold), Graph-based, and MUSCLE.]

tem that deeply integrates human and machine intelligence in order to provide a service that is still beyond what computers can do alone. We believe it may serve as a model for interactive systems that solve other problems of this type.

Acknowledgments
This work was supported by the National Science Foundation under awards #IIS-1149709 and #IIS-1218209, the University of Michigan, Google, an Alfred P. Sloan Foundation Fellowship, and a Microsoft Research Ph.D. Fellowship.
Figure 7. Relative improvement from no warp to warp conditions in terms of mean and median values of coverage, precision, and latency. We expected coverage and precision to improve. Shorter latency was unexpected, but resulted from workers being able to consistently type along with the audio instead of having to remember and go back as the speech outpaced their typing.

References
1. Bernstein, M.S., Brandt, J.R., Miller, R.C., Karger, D.R. Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, UIST '11 (New York, NY, USA, 2011).
5. […] research 32, 5 (2004), 1792–1797.
6. Elliot, L.B., Stinson, M.S., Easton, D., Bourgeois, J. College students learning with C-Print's education software and automatic speech recognition. In American Educational Research Association Annual Meeting (New York, NY, 2008), AERA.
Walter S. Lasecki (wlasecki@umich.edu), Computer Science & Engineering, University of Michigan.

Christopher D. Miller (c.miller@rochester.edu), Iftekhar Naim, Adam Sadilek, and Daniel Gildea ({inaim,sadilek,gildea}@cs.rochester.edu), Computer Science Department, University of Rochester.

Raja Kushalnagar (raja.kushalnagar@gallaudet.edu), Information Technology Program, Gallaudet University.

Jeffrey P. Bigham (jbigham@cmu.edu), HCI and LT Institutes, Carnegie Mellon University.

© 2017 ACM 0001-0782/17/09 $15.00
Brigham Young University
Faculty Position

The Department of Electrical and Computer Engineering at Brigham Young University announces an opening for a professorial continuing-faculty-status (tenure) track position. While our preference is in the area of Computer Engineering, applicants in all areas of Electrical and Computer Engineering will be considered.

Areas of interest include but are not limited to: Computer Systems (including architecture, IoT and embedded/real-time systems, networking, security, software, compilers, O/S, parallel systems, etc.), Robotics and Autonomous Systems, Computer Vision, Machine Learning, Data Science, Distributed Systems, and Digital Systems Design (FPGA and/or VLSI).

The department has state-of-the-art facilities in computing and supercomputing, autonomous vehicles and computer vision, control systems, optics, and microelectronic fabrication. Excellent research programs exist in the department in the areas of FPGA-based computing, high-performance embedded systems, autonomous vehicles and control, robotics and computer vision, high-speed low-power electronics, digital communications systems, signal processing, biomedical imaging, optics, and microfluidics. Successful candidates will be expected to strengthen undergraduate and graduate education and to develop an outstanding research program to complement existing research or develop new research areas.

The ACT score for the average BYU entering freshman is above the 90th percentile nationally. BYU is also fifth on the NSF's list of U.S. baccalaureate-origin institutions for engineering doctorate recipients. We expect our faculty to challenge these outstanding students to reach their potential.

Successful candidates will be hired at the assistant, associate, or full professor level depending on experience. Requirements include a doctorate in computer engineering, computer science, electrical engineering, or closely related field and a willingness to fully support and participate in the ideals and mission of BYU.

An on-line application for this position can be found at: https://yjobs.byu.edu, job posting #64783.

Questions regarding the position can be directed to:
Dr. Aaron Hawkins, Faculty Committee Chair
Dept of ECE, Brigham Young University
459 CB
Provo UT 84602
ahawkins@byu.edu

*The position will remain open until filled.
**Brigham Young University is an equal opportunity employer. All faculty are required to abide by the university's Honor Code and Dress & Grooming Standards. Strong preference will be given to qualified candidates who are members in good standing of the affiliated Church, The Church of Jesus Christ of Latter-day Saints. Successful candidates are expected to support and contribute to the academic and religious missions of the university within the context of the principles and doctrine of the affiliated Church. Equal Opportunity Employer: m/f/Vets/Disability

Macalester College
Two Tenure-Track Assistant Professors of Computer Science

Macalester invites applications for two tenure-track positions at the assistant professor level to begin Fall 2018. Candidates must have or be completing a PhD in Computer Science and have a strong commitment to both teaching and research in an undergraduate liberal arts environment. We are especially interested in candidates who are enthusiastic to teach a broad range of undergraduate courses. This person will contribute to the teaching of our introductory, core, and advanced courses, and mentor undergraduate research.

Macalester offers majors in Computer Science, Mathematics, and Applied Mathematics and Statistics, and minors in Computer Science, Mathematics, and Statistics, as well as a new minor in Data Science. Typical class sizes range from 15 to 32 students. We encourage innovative pedagogy and curriculum and emphasize computer science's interdisciplinary connections. We have close relationships with several disciplines both within and beyond the sciences, and we are interested in candidates whose work spans disciplinary boundaries. Areas of highest priority include computer and data security and privacy, mobile and ubiquitous computing, and computer networks and systems. For more information about our programs, see: http://macalester.edu/mscs

About Macalester
Macalester College is a highly selective, private liberal arts college in the vibrant Minneapolis-Saint Paul metropolitan area. The Twin Cities have a population of approximately three million, a rich arts community, strong local industries, an award-winning parks system, and are home to many colleges and universities, including the University of Minnesota. Macalester's diverse student body comprises over 2000 undergraduates from 40 states and the District of Columbia and over 90 nations. The College maintains a longstanding commitment to academic excellence with a special emphasis on internationalism, multiculturalism, and service to society. We are especially interested in applicants dedicated to excellence in teaching and research/creative activity within a liberal arts college community. As an Equal Opportunity employer supportive of affirmative efforts to achieve diversity among its faculty, Macalester College strongly encourages applications from women and members of underrepresented minority groups.

Applying
To apply via Academic Jobs Online submit (1) curriculum vitae, (2) graduate transcripts, (3) three letters of recommendation (at least one of which discusses your potential as a teacher), (4) a cover letter that addresses why you are interested in Macalester, (5) a statement of teaching philosophy, and (6) a research statement. Please contact Shilad Sen at ssen@macalester.edu with any questions about the position. Evaluation of applications will begin October 15, 2017 and continue until the position is filled.
Apply now: https://www.macalester.edu/academics/mscs/compscitenure-trackjob.html

National University of Singapore
Senior and Junior Tenure-Track Faculty Positions in Artificial Intelligence

The Department of Computer Science at the National University of Singapore (NUS) invites applications for one Distinguished Professorship and several tenure-track faculty positions in artificial intelligence, machine learning, computational neuroscience, and related areas of robotics. The Department enjoys ample research funding, moderate teaching loads, excellent facilities, and extensive international collaborations. We have a full range of faculty covering all major research areas in computer science and a thriving PhD program that attracts the brightest students from the region and beyond. More information is available at www.comp.nus.edu.sg/careers.

NUS offers highly competitive salaries and is situated in Singapore, an English-speaking cosmopolitan city that is a melting pot of many cultures, both the east and the west. Singapore offers a safe and family-friendly environment with high-quality education and healthcare at all levels, as well as very low tax rates. Singapore has also recently launched a S$150 million national initiative, AI.SG, to expand research, development, and adoption of AI technologies. AI.SG will be hosted at NUS.

Candidates for the Distinguished Professor position should have an established record of outstanding research achievements, thought leadership, and international stature in artificial intelligence.

Candidates for Assistant Professor positions should demonstrate excellent research potential in AI and a strong commitment to teaching. Truly outstanding Assistant Professor applicants will be considered for the endowed Sung Kah Kay Assistant Professorship.

Application Details:
Submit the following documents (in a single PDF) online via: https://faces.comp.nus.edu.sg
1. A cover letter that indicates the position applied for and the main research interests
2. Curriculum Vitae
3. A teaching statement
4. A research statement
• Provide the contact information of 3 referees when submitting your online application, or arrange for at least 3 references to be sent directly to csrec@comp.nus.edu.sg.
• Application reviews will commence immediately and continue until positions are filled.
• Please submit your application by 1 December 2017.
If you have further enquiries, please contact the Search Committee Chair, Weng-Fai Wong, at csrec@comp.nus.edu.sg

University of Central Florida
Assistant or Associate Professor in Faculty Cluster for Cyber Security and Privacy

The University of Central Florida (UCF) is recruiting a tenure-track assistant or associate professor for its cyber security and privacy cluster. This position has a start date of August 8, 2018.

This will be an interdisciplinary position that will be expected to strengthen both the cluster and a chosen tenure home department, as well as a possible combination of joint appointments. The candidate can choose a combination of units from the cluster for their appointment (see http://www.ucf.edu/faculty/cluster/cyber-security-and-privacy/).

The ideal junior candidates will have a strong background in cyber security and privacy, and be on an upward leadership trajectory in these areas. They will have research impact, as reflected in high-quality publications and the ability to build a well-funded research program. All relevant technical areas will be considered. We are looking for a team player who can help bring together current campus efforts in cyber security or privacy. In particular, we are looking for someone who will work at the intersection of several areas, such as: (a) hardware and IoT security, (b) explaining and predicting human behavior, creating policies, studying ethics, and ensuring privacy, (c) cryptography and theory of security or privacy, or (d) tools, methods, training, and evaluation of human behavior.

Minimum qualifications include a Ph.D., terminal degree, or foreign degree equivalent from an accredited institution in an area appropriate to the cluster, and a record of high-impact research related to cyber security and privacy, demonstrated by a strong scholarly and/or funding record. A history of working with teams, especially teams that span multiple disciplines, is a strongly preferred qualification. The position will carry a rank commensurate with the candidate's prior experience and record.

Candidates must apply online at https://www.jobswithucf.com/postings/50404 and attach the following materials: a cover letter, curriculum vitae, teaching statement, research statement, and contact information for three professional references. In the cover letter, candidates must address their background in cyber security and privacy, and identify the department or departments for their potential tenure home and the joint appointments they would desire. When applying, have all documents ready so they can be attached at that time, as the system does not allow resubmittal to update applications.

As an equal opportunity/affirmative action employer, UCF encourages all qualified applicants to apply, including women, veterans, individuals with disabilities, and members of traditionally underrepresented populations.

For questions, please contact the Cluster's Search Committee Chair, Gary T. Leavens, at Leavens@ucf.edu.

University of Central Florida
Cluster Lead, Cyber Security and Privacy Cluster

The University of Central Florida (UCF) is recruiting a lead for its cluster on cyber security and privacy. This position has a start date of August 8, 2018. The position will carry a rank of associate or full professor, commensurate with the candidate's prior experience and record. The lead is expected to have credentials and qualifications like those expected of a tenured associate or full professor. To obtain tenure, the selected candidate must have a demonstrated record of teaching, research, and service commensurate with rank.

This will be an interdisciplinary position that will be expected to strengthen both the cluster and a chosen tenure home department, as well as a possible combination of joint appointments. The candidate can choose a combination of units from the cluster for their appointment. (See http://www.ucf.edu/faculty/cluster/cyber-security-and-privacy/.) Both individual and interdisciplinary infrastructure and startup support will be provided.

The ideal candidate will have a strong background in cyber security and privacy and outstanding research credentials and research impact, as reflected in a sustained record of high-quality publications and external funding. All relevant technical areas will be considered, including: network security, cryptography, blockchains, hardware security, trusted computing bases, cloud computing, human factors, anomaly detection, forensics, privacy, and software security, as well as applications of security and privacy to areas such as IoT, cyber-physical systems, finance, and insider threats. A history of working with teams, especially teams that span multiple disciplines, is a strongly preferred qualification. A record of demonstrated leadership is highly desired, as we are looking for a leader to bring together all the current campus efforts in cyber security and privacy. This includes three cluster members already hired, as well as a pending hire for the 2017-18 academic year.

Minimum qualifications include a Ph.D. from an accredited institution in an appropriate area, and a record of high-impact research related to cyber security and privacy demonstrated by a strong scholarly publication record and a significant amount of sustained funding.

Candidates must apply online at http://www.jobswithucf.com/postings/50044 and upload the following materials: cover letter, CV, teaching and research statements, and contact information for 3 professional references. In the cover letter, candidates should address their background, and identify the department for their potential tenure home and any desired joint appointments.

An equal opportunity/affirmative action employer, UCF encourages all qualified applicants to apply, including women, veterans, individuals with disabilities, and members of traditionally underrepresented populations.

Questions can be directed to the search committee chair, Gary T. Leavens, at Leavens@ucf.edu.

ShanghaiTech University
TENURE-TRACK AND TENURED POSITIONS

ShanghaiTech University invites highly qualified candidates to fill multiple tenure-track/tenured faculty positions as its core founding team in the School of Information Science and Technology (SIST). We seek candidates with exceptional academic records or demonstrated strong potential in all cutting-edge research areas of information science and technology. They must be fluent in English. English-based overseas academic training or background is highly desired.

ShanghaiTech is founded as a world-class research university for training future generations of scientists, entrepreneurs, and technical leaders. Boasting a new modern campus in Zhangjiang Hightech Park of cosmopolitan Shanghai, ShanghaiTech shall trail-blaze a new education system in China. Besides establishing and maintaining a world-class research profile, faculty candidates are also expected to contribute substantially to both graduate and undergraduate education.

Academic Disciplines: Candidates in all areas of information science and technology shall be considered. Our recruitment focus includes, but is not limited to: computer architecture, software engineering, database, computer security, VLSI, solid state and nano electronics, RF electronics, information and signal processing, networking, security, computational foundations, big data analytics, data mining, visualization, computer vision, bio-inspired computing systems, power electronics, power systems, machine and motor drive, power management IC, as well as inter-disciplinary areas involving information science and technology.

Compensation and Benefits: Salary and startup funds are highly competitive, commensurate with experience and academic accomplishment. We also offer a comprehensive benefit package to employees and eligible dependents, including on-campus housing. All regular ShanghaiTech faculty members will join its new tenure-track system in accordance with international practice for progress evaluation and promotion.

Qualifications:
• Strong research productivity and demonstrated potential;
• Ph.D. (Electrical Engineering, Computer Engineering, Computer Science, Statistics, Applied Math, or related field);
• A minimum relevant (including PhD) research experience of 4 years.

Applications: Submit (in English, PDF version) a cover letter, a 2-page research plan, a CV plus copies of 3 most significant publications, and names of three referees to: sist@shanghaitech.edu.cn. For more information, visit http://sist.shanghaitech.edu.cn/NewsDetail.asp?id=373

Deadline: The positions will be open until they are filled by appropriate candidates.
Q&A
All The Pretty Pictures
Alexei Efros, recipient of the 2016 ACM Prize in Computing,
works to harness the power of visual complexity.
DESPITE the fact that he does not see very well, Alexei Efros, recipient of the 2016 ACM Prize in Computing and a professor at the University of California at Berkeley, has spent most of his career trying to understand, model, and recreate the visual world. Drawing on the massive collection of images on the Internet, he has used machine learning algorithms to manipulate objects in photographs, translate black-and-white images into color, and identify architecturally revealing details about cities. Here, he talks about harnessing the power of visual complexity.

PHOTO BY NOAH BERGER, COURTESY OF UC BERKELEY

You were born in St. Petersburg (Russia), and were 14 when you came to the U.S. What drew you to computer science?
I was interested in computers from an early age. I remember reading a book about PDP-11 assembly language programming when I was 12 and dreaming about how one day, I might actually have a computer of my own to try this out in practice. Then, in high school, I did some research with a professor at the University of Utah. It sounds kind of brazen, but I went to the CS department and was like, "Bring me to your chairman." Tom Henderson was the chair at that time and, you know, he actually saw me. I told him that I wanted to do computer science and asked him for a problem. And he basically said, "Ok, weird Russian kid. I have a robot running around; do you want to help with that?" It was wonderful.

You did your undergraduate work at the University of Utah, as well.
Interestingly enough, I was actually considering whether I should go into computer science (CS) or theater. In fact, I applied to Carnegie Mellon University because it's one of the top departments in CS, but also one of the top universities for theater. Then I showed my father the tuition, and, well, we were immigrants. So I went to the University of Utah, where CS was much stronger than theater, and I think I got a very good education. But I'm still practicing my stagecraft twice a week in my classes.

I've seen your talks. You're a very engaging speaker.
There is this whole dichotomy between the geeks and the artsy people: either you are good with numbers, or with arts and humanities. I think it's misplaced. CS is hot right now. A lot of smart kids go into CS, and many look down at all of these humanities people with disdain. In my classes, I try to remind them that computer scientists are hot now, but physicists were hot in the Sixties, and chemists were hot in the Thirties, and they're not super-hot now. Shakespeare is going to be around much longer than Python.

How did you get involved with computer vision, graphics, and machine learning?
Even in high school, my goal was to solve AI. But then I reasoned it out: AI is too hard, and you don't know when you're succeeding. With language, you kind of know when you're succeeding, but that's also very high-level. Meanwhile, almost all animals have vision. Vision seems like the most basic thing, so it's got to be easy, right?

Of course.
Basically, I think I've just had one idea throughout my whole career, and I've been milking it since undergrad, and the idea is not even that profound. It's that we fetishize intellectual contributions: algorithms, data structures, and so on. And we often forget that a lot of the complexity in the world is actually due to the data. My favorite example is in computer graphics. We know how light behaves, and we can simulate everything we want. But the reason current animated movies don't look like the real thing is the data. There is a lot of entropy in the world and it's just too hard to capture. The algorithms are fine. It's the data that is miss- [CONTINUED ON P. 103]