The Complexities of Privacy and Anonymity
Why You Should Care About Cryptocurrencies
Understanding the Public vs. Private Debate
Learning @ ScaLe
ATLANTA, GEORGIA MARCH 4–5 2014
for a 1-4 hour discussion on a relevant tool, technology, or methodology related to learning at
scale; or a work-in-progress poster or demo.
➤ Papers must tackle topics “at scale.”
Be more than a word on paper.
XRDS
Crossroads
The ACM Magazine for Students
FALL 2013 vol. 20 • no. 1
begin
05 letter from the editors
08 INBOX
09 INIT
Keeping Your Little Back Shop
By Maire Byrne-Evans
and Christine Task
10 benefit
XRDS Mobilizes
By Debarka Sengupta
10 advice
Managing Your Time
By Vaggelis Giannikas
11 updates
Revitalizing ACM Student Chapters
By Michael Zuba
12 BLOGS
Algorithms Fit for Compilation?
By Olivia Simpson
Habits: Our cognitive shortcut
By Gidi Nave
Coordination When Information
is Scarce: How privacy can help
By Aaron Roth
The New Firefox Cookie Policy
By Jonathan Mayer
features
18 feature
What is Public and Private Anyway? A Pragmatic Take on Privacy and Democracy
By Andreas Birkbak
22 FEATURE
Something Bad Might Happen: Lawyers, anonymization, and risk
By Marion Oswald
27 feature
Personal, Pseudonymous, and Anonymous Data: The problem of identification
By Iain Bourne
40 FEATURE
What is Bitcoin?
By Dominic Hobson
45 feature
The Tor Project: An inside view
By Kelley Misata
48 feature
It’s Not About Winning, It’s About Sending a Message: Hiding information in games
By Philip C. Ritchey
53 feature
An Illustrated Primer in Differential Privacy
By Christine Task
58 INTERVIEW
Cynthia Dwork on
end
62 labz
CyLab Usable Privacy and Security Laboratory
By Rich Shay
64 back
WLAN Security
By Finn Kuusisto
65 hello world
Zero-Knowledge Proofs
By Marinka Zitnik
68 eventS
70 acronyms
Image for page 32 by Vyacheslav Pokrovskiy. Image for page 69 by S. Borisov.
Computing Reviews is on the move
Our new URL is ComputingReviews.com
A daily snapshot of what is new and hot in computing.

EDITORIAL BOARD
Editors-in-Chief
Peter Kinnaird, Carnegie Mellon University, USA
Inbal Talgam-Cohen, Stanford University, USA
Departments Chief
Vaggelis Giannikas, University of Cambridge, UK
Issue Editors
Maire Byrne-Evans, University of Southampton, UK
Christine Task, Purdue University, USA
Issue Feature Editor
Richard Gomer, University of Southampton, UK
Feature Editors
Jed Brubaker, University of California Irvine, USA
Erin Claire Carson, University of California Berkeley, USA
Ryan Kelly, University of Bath, UK
Hannah Pileggi, Georgia Institute of Technology, USA
Michael Zuba, University of Connecticut, USA
Department Editors
Arka Bhattacharya, National Institute of Technology, India
Luigi De Russis, Politecnico di Torino, Italy
Rohit Goyal, West Chester East High School, USA
John Kloosterman, University of Michigan, USA
Finn Kuusisto, University of Wisconsin-Madison, USA
Ashok Rao, University of Pennsylvania, USA
Debarka Sengupta, Indian Statistical Institute, India
Adrian Scoică, University of Cambridge, UK
Marinka Zitnik, University of Ljubljana, Slovenia
Marketing Editor
Casey Fiesler, Georgia Institute of Technology, USA
Web Editor
Shelby Solomon Darnell, Clemson University, USA

Advisory Board
Mark Allman, International Computer Science Institute
Bernard Chazelle, Princeton University
Laurie Faith Cranor, Carnegie Mellon
Alan Dix, Lancaster University
David Harel, Weizmann Institute of Science
Panagiotis Takis Metaxas, Wellesley College
Noam Nisan, Hebrew University Jerusalem
Bill Stevenson, Apple, Inc.
Andrew Tuson, City University London
Jeffrey D. Ullman, InfoLab, Stanford University
Moshe Y. Vardi, Rice University

EDITORIAL STAFF
Director, Group Publishing
Scott E. Delman
XRDS Managing Editor & Senior Editor at ACM HQ
Denise Doig
Production Manager
Lynn D’Addessio
Art Direction
Andrij Borys Associates, Andrij Borys, Mia Balaquiot
Director of Media Sales
Jennifer Ruzicka
jen.ruzicka@acm.org
Copyright Permissions
Deborah Cotton
permissions@acm.org
Public Relations Coordinator
Virginia Gold

ACM
Association for Computing Machinery
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
+1 212-869-7440

CONTACT
General feedback: xrds@acm.org
For submission guidelines, please see http://xrds.acm.org/authorguidelines.cfm

PUBLICATIONS BOARD
Co-Chairs
Ronald F. Boisvert and Jack Davidson
Board Members
Nikil Dutt, Carol Hutchins, Joseph A. Konstan, Ee-Peng Lim, Catherine C. McGeoch, M. Tamer Ozsu, Vincent Y. Shen, Mary Lou Soffa

SUBSCRIBE
Subscriptions ($19 per year includes XRDS electronic subscription) are available by becoming an ACM Student Member: www.acm.org/membership/student
Non-member subscriptions: $80 per year, http://store.acm.org/acmstore

ACM Member Services
To renew your ACM membership or XRDS subscription, please send a letter with your name, address, member number and payment to:
ACM General Post Office
P.O. Box 30777
New York, NY 10087-0777 USA

Postal Information
XRDS (ISSN# 1528-4981) is published quarterly in spring, winter, summer and fall by Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121. Application to mail at Periodical Postage rates is paid at New York, NY and additional mailing offices.
POSTMASTER: Send address changes to: XRDS: Crossroads, Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121.
Offering# XRDS0171
ISSN# 1528-4972 (print)
ISSN# 1528-4980 (electronic)

Copyright ©2013 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page or initial screen of the document. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, republish, post on servers, or redistribute requires prior specific permission and a fee. Permissions requests: permissions@acm.org.

JOIN THE INNOVATION.
Qatar Computing Research Institute seeks talented scientists and software engineers to join our team and conduct world-class applied research focused on tackling large-scale computing challenges.
We offer unique opportunities for a strong career spanning academic and applied research in the areas of Arabic language technologies including natural language processing, information retrieval and machine translation, distributed systems, data analytics, cyber security, social computing and computational science and engineering.
Scientist applicants must hold (or will hold at the time of hiring) a PhD degree, and should have a compelling track record of accomplishments and publications, strong academic excellence, effective communication and collaboration skills.
Software engineer applicants must hold a degree in computer science, computer engineering or related field; MSc or PhD degree is a plus.
We also welcome applications for post-doctoral researcher positions.
As a national research institute and proud member of Qatar Foundation, our research program offers a collaborative, multidisciplinary team environment endowed with a comprehensive support infrastructure.
Successful candidates will be offered a highly competitive compensation package including an attractive tax-free salary and additional benefits such as furnished accommodation, excellent medical insurance, generous annual paid leave, and more.
For full details about our vacancies and how to apply online please visit http://www.qcri.qa/join-us/
For queries, please email QFJobs@qf.org.qa
www.qcri.qa
Upcoming Issues
Winter 2013 [December issue]
Wearable Computing
Article deadline: September 15, 2013
Spring 2014 [March issue]
Off the (Academic) Grid Computing
Article deadline: December 2, 2013

ACM Conference Proceedings Now Available via Print-on-Demand!
Did you know that you can now order many popular ACM conference proceedings via print-on-demand? Institutions, libraries and individuals can choose from more than 100 titles on a continually updated list through Amazon, Barnes & Noble, Baker & Taylor, Ingram and NACSCORP: CHI, KDD, Multimedia, SIGIR, SIGCOMM, SIGCSE, SIGMOD/PODS, and many more.
For available titles and ordering info, visit: librarians.acm.org/pod

Equip Yourself for Creativity

A privacy issue of XRDS couldn’t be better timed. Given the rapid and continuing revelations about the NSA from Edward Snowden, anything we write here might be out of date by the time it reaches you. One piece that we feel will remain relevant (unless, and until, a paradigm shift occurs in the mathematics behind cryptographic technologies and/or in the international culture around privacy and human rights) appeared recently in ACM Queue. In his column titled “More Encryption Is Not the Solution,” Poul-Henning Kamp makes a rigorous argument that is well worth a read. Instead of focusing directly on the issue at hand, we wish to start a discussion we hope will equip readers to come up with out-of-the-box solutions to the privacy problem, or any other for that matter.

The foremost things that the Human-Computer Interaction Ph.D. Program at Carnegie Mellon seeks in applicants are past academic achievement and creativity (not incidentally the subject of the previous issue of XRDS). They tell prospective students CMU won’t be able to supply more creativity and that these two factors are the most important in predicting a successful research career.

While a Ph.D. program is about digging deep, research supports the idea that having a breadth of knowledge enhances creativity. This idea isn’t new; it’s the subject of numerous conference keynote addresses and graduation ceremony speeches. It’s easy to suggest to someone they should try to learn about lots of different things. But actually getting someone (yourself) to do that is somewhat harder.

To get some motivation, we asked a couple of the most creative and successful graduate students we know what was the single most valuable course they took in undergrad. Topping the list were philosophy courses covering classics like Plato, Socrates, Descartes, Sartre, and Kierkegaard. These philosophers speculated on the nature of reality, how we trust that others aren’t mere figments of our imagination, and more. The value of such courses often does not come only from the specific ideas they cover, but from the way they are described and discussed.

In our unofficial poll we found a surprising number of computer science majors enjoy philosophy courses. Both fields focus on conjuring rigorous logic from the thoughts of the writer. Computer scientists write code that either compiles and runs correctly, or does not. The creative challenge is in sorting out how to go from nil to a functional program. In philosophy, it’s up to the reader to figure out if there are any “bugs in the code”; that is, does the argument make sense? What are the assumptions it relies on (the “operating system,” if you will)? It’s not a coincidence that one of the most profound philosophical results of the 20th century came from logician Kurt Gödel, who had a thing or two to say that informed modern computer science theoretical work as well.

Training your mind to root out logical fallacies and recognize assumptions behind arguments is an enormously powerful exercise that will help you to think critically about all knowledge you encounter. Philosophy (and no doubt other fields) will give you the toolbox you need to think critically about ideas from just about anywhere else.1

1 For those interested in startups and business, this is likely similar to the notion of Latticework upheld by Warren Buffett’s partner Charles Munger.

Students, and in particular graduate students, are accustomed to hearing about lots of creative projects coming out of academia. We’re continually amazed at the ones we hear about that come from off the academic grid.

Paola Santana is a cofounder of Matternet, which will maintain a fleet of drones (autonomous aircraft) to deliver medicine to villages in the developing world where there are no physical roads leading to them.

Marc Roth is starting a business that’s outfitting shipping containers (like the ones that are used on trains and cargo ships) with computers and 3-D printers. He’s training homeless people to operate them and sell their products for a source of income.

Aereo was sued for taking over-the-air television and allowing customers to stream it online. Broadcasters argued in court that this wasn’t the intended use of over-the-air TV, because they expect each viewer to have their own antenna. Aereo promptly installed millions of tiny antennas on their server.

Blueseed.co is considering a novel way to avoid the mess of paperwork associated with immigrating to the U.S. They plan to park a cruise ship off the coast of San Francisco and helicopter international executives between their Silicon Valley jobs and international waters every day, getting them to and from work. (We’re planning to cover off-the-grid ideas like these in a future issue, so feel free to send us any cool pointers.)

It’s definitely not clear that any of these are good ideas. What is clear is that they are truly out-of-the-box answers to difficult problems. Pulling them off requires a great deal
has winged its way to Australia, lks like another great read :)
- Patrick Sunter, Public-spirited urban researcher and computer geek, Twitter (@PatSunter)

vision-impaired users. Plus many people these days prefer to read reflowable text on their e-reader of choice.
- Pat Kujawa, Facebook

blast from the past
Dear XRDSians,
You are going beyond my expectations. The Scientific Computing issue is really fabulous. I enjoyed all the
Ex-XRDSian (2009-12)

How to contact XRDS: Send a letter to the editors or other feedback by email (xrds@acm.org), on Facebook by posting on our group page (http://tinyurl.com/XRDS-Facebook), via Twitter by using #xrds in any message, or by post to ACM Attn: XRDS, 2 Penn Plaza, Suite 701, New York, New York 10121, U.S.
768 bits
The largest RSA key known to have been brute-forced.

30,000
The approximate number of accounts Google released information about to the U.S. government in 2012.
benefit
XRDS Mobilizes

To make content more accessible to the student readers, XRDS has recently launched its new mobile app. The app will enable Android and iOS users to browse the magazine on the go from their Android or iOS enabled smart phones or tablets.

Each issue of XRDS features a theme, such as “The Role of Academia in the Startup World” or “Big Data,” with coverage of research trends and interviews. Students can also read timely updates on upcoming major conferences, grants, fellowships, and contests. Even better, students can now get all this in a more convenient way: on their hand-held devices.

The application is powered by Godengo+Texterity. With these newly launched apps, XRDS wishes to extend its reach among the student community. So don’t delay; get your favorite magazine on your smart phone and recommend that your friends do the same.

The links to the app are:
˲˲ iTunes: http://bit.ly/130Lnyk
˲˲ Google Play:

advice
Managing Your Time

Let me ask you a very simple and short question. Do you feel you manage your time well? If the answer is yes, then you should be the one writing this article; not me. I mean it! Personally, I have issues when it comes to time management and until today I haven’t met anyone who doesn’t. But if you truly don’t, please drop me an email and I’ll publish your advice on time management in one of our future issues. In the meantime, even if I am not an expert in the area, here are some tips that might help you improve.

1. Time is not manageable. I am not a physicist but time used to be, still is, and seems like it is going to be perfectly managed. Each hour has 60 minutes and each minute has 60 seconds and there is nothing you can do to control it (at least in practice). My point is the first step is to realize that you don’t have to manage time. You have to manage yourself (your own activities) in order to make sure you can accomplish them in the available time.

2. To-do lists and prioritization. OK, a list might be a bit too obvious, but it is one of the fundamental things you should have a system for. As people who enjoy computing, I am sure you can find the right software that can help you write down what you have to do and when. Find a proper system and make sure it can help you at least capture your tasks and schedule. Old-style notepads and diaries can do the job too.

3. Don’t be afraid of drafts. If you think about it, drafts are amazing. It is like giving yourself the opportunity to do something of ambiguous quality without having anyone judge you. Drafts for students are like rehearsals for actors or practice sessions for athletes. You need to use them in your favor. Even if you have to submit your draft to somebody such as your supervisor, in the worst-case scenario they will give you some harsh feedback that will help you improve your draft. No evaluation, no marks, nothing to be afraid of.

4. Consider the possibility of saying no. Yes, this is a real possibility, one you can use and one that can be acceptable to the other party. Even if it might not always sound nice, the answer to “could you please...” can be negative; perhaps in a more kind and friendly way, but still negative. Having said that, you need to be careful when and how to deny your help/service. Make sure to consider “no” as one of your available options before making your final decision.

5. Just do it. Do you know how many “time management” workshops, seminars, books, and tutorials you can find out there? A simple Google search will give you more than a billion results with websites full of tips on how to effectively manage your time. Do you know what’s the only way to make them work? Do them! Choose some, try them, adopt them, and use them in your everyday life. Believe me, I have spent a lot of my time trying to improve my time

Photograph by IR Stone
updates

Running an ACM student chapter is a demanding task. In a world where computing has touched nearly every aspect of our daily lives, it still seems difficult to bring students together. Further, given the overstimulating environment in which students are immersed while pursuing their studies, it appears there is a natural occurrence of “roller coaster” based involvement and interest in ACM student initiatives. As the torch is passed down every few years to the next generation of students, the newly appointed ACM student chapter leaders are faced with the task of trying to appeal to the constantly changing interests and directions of their peers and professional community. In this issue of XRDS we solicited student chapters who were working hard to revitalize their student involvement and bring a new face to their student chapter.

The ACM student chapter at the University of Massachusetts Lowell (UML) has been around since the 1990s, but has suffered from several years of inactivity. This past spring, dedicated students working with faculty advisor Dr. Jesse Heines refreshed their student chapter in an effort to bring together a diverse group of students to form an active computing community. In the first semester of the chapter’s reformation, the student leaders, with help from the UML Alumni Association, have managed to provide strong initiatives for their student community. These events ranged from a speaker series to tutorial sessions such as an Android Development night, during which students learned to create Android applications. The chapter hosted an impressive list of speakers including Google’s Rich Miner, start-up guru Abby Fichtner, cyber-security expert Gary Miliefsky, and interactive fiction author Andrew Plotkin.

Newly appointed President of the ACM chapter at UML, Shawna O’Neal, explained some of the issues their ACM chapter faces: “UML is a unique campus where students are separated by location and major. The technology, science, and business domains are on the north side of the Merrimack River, while the humanities, social sciences, and arts are on the south side. We also have a heavy commuter base on our campus. What this means for us is that our sense of community can suffer.” However, given these challenges, Ms. O’Neal explains that an ACM chapter can help to overcome this divide: “We believe that the ACM chapter is a place for students to connect with common interests. We are not just a computer science club; we have 43 active members from various fields, including math, English, engineering, and business backgrounds. Going forward, we would love to see more disciplines and backgrounds come together. There is serious potential to rebuild our campus community.”

Moving forward, the ACM student chapter at UML plans to continue its speaker series after the positive student feedback from the previous events. Additionally, there is interest in starting a “Game Development Night,” which the chapter hopes will showcase student-created gaming applications that can be used by the UML community for leisurely fun and to solicit development advice from other students. Feedback from the chapter’s previous development night has provided valuable insight into the common interests of the student community.

When asked for advice on how to build an ACM community, Ms. O’Neal offered the following words of wisdom: “Never give up. What we have learned over the past six months is that there will be events in which attendance is high and feedback is great, and conversely, events in which only two people attend or technical difficulties interfere with the event itself. However, as difficult as it may be to pull students out of their dorms, or to convince commuter students to stay on campus late, it is absolutely possible to create a successful chapter. Keep trying.”

If you would like to read more about ACM student initiatives at UML, you can visit their website at the following link: http://umlacm.org.
Michael Zuba

Rich Miner’s visit to UML as part of ACM’s speaker series.
BLOGS
try to shop at the same chain, where I know the products and their location on the shelves.

One morning, the parking lot of the usual store was packed. I decided to explore and went to the supermarket across the street for the first time. My experience at the new place was mildly traumatizing: Confronted by all sorts of new packages calling from the shelves, I felt helpless. How should I know what I would like?

In my first blog post, I wrote about our mind’s limited computational capacity. When we shop, examining all of the products, predicting how much we will enjoy them, and judging whether the price is reasonable (compared with the alternative) is very costly in terms of cognitive effort. Luckily, our brains have figured out a shortcut that saves a lot of time and mental effort, making it easy to navigate through the local convenience store and fill our carts with the usual products.

Faced with a novel situation, we usually don’t know what to do. Sooner or later, after a short exploration period of trial and error, we learn what works for us. Animals are not much different. In 1898 American psychologist Edward L. Thorndike invented the “puzzle boxes”: cages that the animal (say, a cat) could exit (and get a food reward) only by using a specific response (like pulling a lever, for example). Thorndike found the time it took the animals to escape and get their food reward decreased with their experience.

This is not news for anyone who ever had a pet: Animals quickly figure out what action would get them the reward. The process of associating a state of the world with an action and its outcome in the animal’s brain, called “instrumental conditioning,” can be used to teach animals, like raccoons, to perform complicated tasks, like playing basketball.

Do animals learn to associate the sensory stimulus and their responsive actions with the reward, or maybe they are just reinforced to perform the action, regardless of the reward value? To answer this question, Dickinson and colleagues trained mice to push a lever in order to get food.1 After 120 successful trials, the mice were fed to satiety. Lacking the motivation to get food, the mice stopped pressing the lever, providing evidence that they had learnt the consequence of their actions, and weren’t just blindly responding to the lever in their cage.

Dickinson repeated his experiment with a new group of mice, but this time he extended the training period to 360 (rather than 120) trials. The effect of over-training on the mice’s behavior was dramatic: Even after they were fed to satiety and no longer wanted the food, they kept pressing the lever. The mice picked up a lever-pressing habit.

Although we all have natural, automatic responses to events in our environment (like the urge to step out of the elevator even though it has stopped on the wrong floor), we humans have a sense of control. We believe in our ability to suppress habitual tendencies more easily than other animals. In 2009, Elizabeth Tricomi and colleagues recruited Rutgers college students, making sure that none of the subjects were dieting. Participants fasted for six hours, and then performed a simple task. In each trial, they were instructed to push one of four buttons in order to get a food reward. In some trials, the reward was an M&M; in others it was corn chips (the researchers verified that all of the subjects liked both foods).

Like in Dickinson’s mice studies, subjects were divided into two groups: The first performed the task only for a single day, in two eight-minute training sessions; the second performed four sessions each day for three days, six times as much training. Now came the fun part. After the last training session, participants were given one of the food types, and were instructed to eat until it was no longer pleasant to them. With one of the foods “devalued,” subjects went back to perform the task for three more minutes.

What do you think happened? When the rewarded food was no longer desirable, the first group stopped pushing buttons, as expected. Surprisingly (or not) the over-trained subjects became habitual, and kept pressing buttons even when the food reward had become unpleasant.

Habit formation imposes a trade-off. Most of the time, the shortcut works well: We easily navigate our cars and carts in familiar avenues without having to pay too much attention. Faced with a new situation, however, we are prone to make mistakes, or, even worse, pick up unhealthy habits. Obesity, alcoholism, and other addictions are often a result of a long reinforcement history (of alcohol, drugs or fried chicken) that ended up turning into out-of-control habits.

Accepting the habitual system as an inseparable part of our minds, and understanding its limitations and the way it works, may help us to achieve our long-term goals. Forcing ourselves to start working out and eat healthy is effortful, but the investment will pay off. After a period of reinforcement (by endorphin rush, feeling lighter and getting compliments), our brains will pick up the habit. Going to the gym or having kale salad may become effortless or even enjoyable.

Gidi Nave is a Computation and Neural Systems Ph.D. student in Colin Camerer’s lab at Caltech. His research is in the field of neuroeconomics, the intersection between neuroscience, psychology and economics. He uses a medley of theoretical and experimental methods for reverse-engineering the processes underlying human decision-making. By understanding how emotions and cognition generate judgments and decisions under uncertainty, he seeks to contribute models that take into account the biological origins of the decision process in the brain. For more information visit www.gidinave.com.

1 A. Dickinson, et al. Motivational control after extended instrumental training. Animal Learning & Behavior 23, 2 (1995), 197-206.
PCI DSS
The Payment Card Industry Data Security Standard is
a mandatory data protection standard for businesses
that handle credit card information.
value. It’s not hard to see that in this case, in equilibrium only O(√n) goods get allocated.

So if we find ourselves in a setting of incomplete information, we might nevertheless prefer to implement an equilibrium of the game of complete information defined by the realized types of the players. How can we do that, especially if we have limited power to modify the game?

One tempting solution is just to augment the game to allow players to publicly announce their types (and thus make the game one of complete information). Of course equilibria of the complete information game might not be unique, so to help solve the coordination problem, we could introduce a weak “proxy” that players can choose whether or not to use. Players can “opt in” to the proxy, which means that they report their type to the proxy, which then recommends an action for them to play. At this point they are free to follow the recommendation, or not. Alternately, they can “opt out” of the proxy, which means they never report their type, and then just choose an action on their own. It would be nice if we could design a proxy such that:
1. Simultaneously for every prior on agent types, it is a Bayes-Nash equilibrium for agents to opt in to the proxy, and then follow its suggested action, and
2. Given that all players behave as in (1), the resulting play forms an equilibrium of the complete information game induced by the actual, realized types of the players.

It’s tempting to think that to design such a proxy, it is sufficient to have it compute a Nash equilibrium of the complete information game defined by the types of the players who opt in, and then suggest that each player play her part of this Nash equilibrium. After all, if everyone is opting in and playing their part of a Nash equilibrium, how can you do better than to do the same? By definition, in a Nash equilibrium, your suggested action is a best response given what all other players are playing. And in fact, in the toy allocation example we discussed above, this works, since there is an equilibrium of the complete information game that is simultaneously optimal for all players.

In general, however, this approach fails. The flaw in our reasoning is that by opting out, you change what the proxy is computing: it is now computing a different equilibrium, for a different game, and so by opting out, you are not merely making a unilateral deviation from a Nash equilibrium (which cannot be beneficial), you are potentially

have one of two types: (M)ountain, or (B)each. Each player has two actions: they can go to the beach, or go to the mountain. Players prefer the activity that corresponds to their own type, but they also like company. So if a p fraction of people go to the mountain, an M type gets utility 10 ∙ p if he goes to the mountain, and 5 ∙ (1 – p) if he goes to the beach. A B type gets utility 5 ∙ p if she goes to the mountain, and 10 ∙ (1 – p) if she goes to the beach. A proxy that suggests that all players go to the beach if type M is in the majority, and otherwise suggests that all agents go to the mountain, always computes a Nash equilibrium of the complete information game defined by the reported types. Nevertheless, any player that is pivotal has an incentive to opt out of the proxy, since this causes the proxy to send all other players to her preferred location.

But what if it were possible to compute an equilibrium in such a way that whether or not any player i opted in had very little effect on the distribution over actions suggested by the proxy to all other players j ≠ i? In this case, the problem would be solved: any unilateral deviation from the all-opt-in-and-follow-the-proxy’s-suggestion solution would not (substantially) affect the play of one’s opponents, and so they would all continue playing their part of an equilibrium of the complete information game, defined on all players’ realized types. The equilibrium condition would now directly guarantee that such a deviation could not be beneficial.

So to design a good proxy, it’s not enough to just have an algorithm that computes an equilibrium of the game defined by the reported types, but it is enough if that algorithm also satisfies a strong enough stability condition. It turns out that a sufficient “stability” condition is a variant on differential privacy: informally, that any unilateral deviation by a single player i should not change (by more than a (1 ± ε) factor) the probability that any particular profile of n – 1 actions is suggested to the other n – 1 players.

And in fact it is possible to implement this plan, at least for certain classes of games. We study the class of “large games,” which, informally, are n-player games in which the effect of any player i’s action on the utility of any player j ≠ i is bounded by some quantity which is diminishing in n. The Beach/Mountain example is of this type, as are many large population games. Our main technical result is an algorithm satisfying the required differential privacy
dramatically changing what actions all other players are stability condition, which computes a correlated equilib-
playing. The Nash equilibrium condition does not pro- rium of any large game. The upshot is that in large games,
tect against deviations like that, and in fact it’s not hard it is possible to design a very weak proxy (that doesn’t have
to construct an example in which this is a real problem. the power to make payments, force players to opt in, or
Consider, e.g. toy example #2: There are n players who each enforce actions once they do opt in) that implements an
η-approximate correlated equilibrium of the complete information game, as an η-approximate Bayes Nash equilibrium of the partial information game, no matter what the prior distribution on agent types is. Here η is some approximation parameter that is tending to zero in the size of the game, i.e., as the game grows large, the equilibria become exact.
To emphasize the purely game theoretic aspects of this problem, I have ignored the fact that differential privacy of course also provides a good guarantee of “privacy.” Aside from straightforward incentive issues, there are other reasons why players might be reluctant to announce their types and participate in a complete information game: perhaps their types are valuable trade secrets, or would be embarrassing admissions. Because our solution also happens to provide differential privacy, it is able to implement an equilibrium of the complete information game, while maintaining the privacy properties inherent in a game of incomplete information.
To conclude, differential privacy thus far has been primarily a problem-driven field that has borrowed techniques from many other areas to solve problems in private data analysis. But in the process, it has also built up a strong conceptual and technical tool kit for thinking about the stability of randomized algorithms to small changes in their inputs. I hope this and future posts serve to convince you that there is something to be gained by borrowing the tools of differential privacy and applying them to solve problems in seemingly unrelated fields.
Aaron Roth is the Raj and Neera Singh assistant professor of Computer and Information Sciences at the University of Pennsylvania. Prior to this, he was a postdoctoral researcher at Microsoft Research, New England, and earned his PhD at Carnegie Mellon University. He is the recipient of a Yahoo! Academic Career Enhancement Award, and an NSF CAREER award. His research focuses on the algorithmic foundations of data privacy, game theory and mechanism design, and the intersection of the two topics.
The New Firefox Cookie Policy
Originally posted on WebPolicy.org
By Jonathan Mayer
Editor’s Note: Earlier this year Stanford grad student Jonathan Mayer discussed cookies, Web tracking, and changes to Mozilla’s cookie policy on his personal blog.
The default Firefox cookie policy will, beginning with release 22, more closely reflect user privacy preferences. This mini-FAQ addresses some of the questions that I’ve received from Mozillans, Web developers, and users.
How does the new Firefox cookie policy work?
Roughly: Only websites that you actually visit can use cookies to track you across the Web.
More precisely: If content has a first-party origin, nothing changes [1]. Content from a third-party origin only has cookie permissions if its origin already has at least one cookie set.
How does Firefox’s new policy compare to the other major browsers?
• Chrome: Allows all cookies.
• Internet Explorer: Cookie permissions vary by P3P compact policy. In practice, almost all third-party tracking cookies are allowed [2].
• Safari: First-party content has cookie permissions. Third-party content only has cookie permissions if the content already has at least one cookie set.
In short, the new Firefox policy is a slightly relaxed version of the Safari policy [3].
Will the new Firefox policy break websites?
Collateral impact should be limited. Safari’s cookie policy has been in place for over a decade, and it is included in both the desktop and iOS versions of the browser. A few websites may require a tiny code change to accommodate Firefox in the same way as Safari.
Just to be sure, the Mozilla privacy team is closely monitoring the policy before final release. The patch will spend about six weeks each in the pre-alpha, alpha, and beta builds. If you spot any oddities, please report them to Mozilla support!
IS IT NECESSARY?
Consumers neither expect nor approve of Web tracking [4]. Mozilla has been a frequent advocate for its users, advancing technologies that signal preferences (Do Not Track), lend transparency (Collusion), and facilitate privacy-friendly Web services (Persona and Social API). Last fall, the Mozilla community began a concerted effort in a new direction: technical countermeasures against tracking [5]. One of our first projects has been a revision of the Firefox cookie policy [6].
Cookie policies are inherently imprecise. Some unwanted tracking cookies might slip through, compromising user privacy (“underblocking”). And some non-tracking cookies might get blocked, breaking the Web experience (“overblocking”). The challenge in designing a cookie policy is calibrating the tradeoff between underblocking and overblocking [7].
The patch that I developed is an intentionally cautious
that we know we need to address with future improvements.
• Old cookies. The revised policy does not limit preexisting tracking cookies. Firefox users who update to the revised policy will not fully benefit until they clear their cookies.
• Temporary visits. Sometimes a user temporarily visits a tracking website, such as after clicking an advertisement (intentionally or inadvertently). The revised policy indefinitely allows tracking cookies from a website after just one temporary visit.
• Dual-use domains. Several popular websites use the same domain for both consumer services and tracking. Yahoo, for example, operates both its homepage and advertisement tracking from yahoo.com. If a user visits the Yahoo homepage, the company will be able to track the user across other websites. Google, on the other hand, largely hosts search on google.com but advertising tracking on doubleclick.net. If a user runs a query with Google, they
[6] Apple and Microsoft have both automatically limited tracking cookies for a decade. There was an effort to block tracking cookies by default in Firefox three years ago, but it was withdrawn under contested circumstances.
[7] Other considerations could include types of underblocking and overblocking, as well as possible reactions to the policy. Future posts might address these topics, depending on reader interest.
[8] In the interest of precision, the revised Firefox cookie policy is slightly more permissive than the Safari policy owing to implementation specifics.
[9] The sites are dayonecenter.com (Alexa rank > 1M) and western.org (Alexa rank ≈ 200K).
[10] As I understand our release conditions, the patch will move forward unless there’s confirmed breakage, the breakage is so substantial as to outweigh longstanding user demand for privacy, and the breakage cannot be ameliorated through outreach, mitigation measures, or rapid iteration. Under present circumstances, the patch plainly satisfies these release conditions.
[11] We may wish to relatedly take steps to accommodate websites (if any) that have a third-party domain, do not compromise consumer privacy, do not break the consumer Web experience without cookies, cannot deploy an accommodation for the revised cookie policy, require cookies for functionality, and have lost that functionality on account of the revised policy.
Jonathan Mayer is a grad student at Stanford University in computer science and law. But he doesn’t live in the ivory tower. His research homes are the Security Lab (advised by John Mitchell), CISAC, and CIS. Wherever information technology, public policy, and law intersect, Mayer is interested.
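The decision rule stated in the FAQ above (first-party content is unaffected; third-party content has cookie permissions only if its origin already has at least one cookie set) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Firefox's implementation: the function names, the dictionary standing in for the browser's cookie store, and the use of exact hostname comparison to decide what counts as first-party are all hypothetical choices made for the example.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase hostname of a URL (empty string if absent)."""
    return (urlsplit(url).hostname or "").lower()


def may_use_cookies(page_url, content_url, cookie_store):
    """Hypothetical sketch of the revised policy.

    cookie_store maps a hostname to the set of cookie names already
    stored for it (an assumed stand-in for the browser's cookie jar).
    """
    content_host = host_of(content_url)
    # First-party content: nothing changes, cookies are permitted.
    # Assumption: "first-party" is approximated by an exact hostname match.
    if content_host == host_of(page_url):
        return True
    # Third-party content: permitted only if its origin already has
    # at least one cookie set, e.g. from an earlier direct visit.
    return bool(cookie_store.get(content_host))


store = {}
page = "https://news.example/story"
tracker = "https://tracker.example/pixel.gif"

print(may_use_cookies(page, page, store))     # first-party content
print(may_use_cookies(page, tracker, store))  # third-party, never visited

# A single direct visit to the tracker sets a cookie ...
store["tracker.example"] = {"uid"}
# ... after which its third-party content keeps cookie permissions.
print(may_use_cookies(page, tracker, store))
```

The last call mirrors the "temporary visits" limitation listed in the post: once a site has set any cookie during a direct visit, the rule treats its third-party content as permitted from then on.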
By Andreas Birkbak
DOI: 10.1145/2508969
Illustration by JNT Visual
The Web should work in the most democratic way viable, while doing as much as possible to protect the privacy of its users. These are statements that most people seem to agree with, to an extent where they have become common sense. The widespread uptake of social media use, however, suggests the conventional distinction between something private and something public does not always hold in practice. For example, prominent Web scholars like Nancy Baym and danah boyd note how we might understand social media better if we see that “even the most private of selves are formed in relation to diverse others.”
This insight is taken from pragmatist philosophy, and I would like to suggest that this line of thinking is fruitful for challenging our tendency to think about the Web in terms of a strong public/private dichotomy. My source of inspiration is the classic American pragmatist John Dewey [1].
In a deceptively small book from 1927, The Public and its Problems, Dewey offers an alternative take on the public/private distinction. Like any other pragmatist, his vantage point is practice, that is, human actions. What Dewey is interested in is how we might come to better understand our actions and their consequences. More specifically, the problems of the public that Dewey is pointing to arise from the observation that as our societies grow increasingly technological, our actions also tend to have an increasing number of unforeseen and indirect consequences. In order for people to live democratically, by which Dewey simply means to be able to direct one’s own life in a meaningful way, it is imperative to come to grips with all these indirect consequences. This is where the public enters the picture.
Dewey distinguishes between direct and indirect consequences of actions. If consequences are direct, it means they are contained in the situation where the act is taking place, and they can be dealt with privately in that situation. If consequences are indirect, however, they spread beyond the boundaries of the immediate situation in which the act takes place. The result is that a public needs to be formed in order to take care of the indirect consequences. One might take pollution as a simple example: If a father is burning garden waste, and the direct consequence is that the smoke prevents his children from playing outside, it is a private matter to put out the fire or postpone any playing outdoors. However, if the father is also burning material that contains toxic chemicals, his actions might have the indirect consequence of polluting the air in the whole neighborhood. In this situation, a public is needed to a) sort out the consequences, e.g. by putting up air quality measurement instruments and b)
legislate so that the father will hesitate to repeatedly release toxics into the air.
The important thing to notice here is that the apparently private act of burning waste needs to be qualified as having public consequences in order to be controlled. A process of inquiry has to take place, the result of which is that it becomes clear that the burning of waste is in fact polluting. The act of burning waste is now seen in an entirely new light: It is no longer a mundane everyday event, but an unacceptable practice.
In my work on social media groups, I have noticed how infrastructures like Facebook groups sometimes come to serve this purpose of qualifying acts as having indirect consequences. One case I have studied is the use of open Facebook groups during a severe snowstorm on the Danish island of Bornholm [2]. People in the more rural parts of the island were snowbound for up to a week, something few of them were prepared for. In dealing with the situation, some inhabitants not only searched for information and tried to help each other, but also questioned whether the authorities on Bornholm had done enough to remove the snow from their roads. This concern formed the rationale behind the Facebook group that became an important meeting point for a couple of hundred snowbound islanders.
With Dewey we might come to understand this Facebook group on Bornholm as contributing to democracy in the pragmatist sense insofar as the members collectively qualified their snowstorm troubles as not only a result of forces of nature, but also of the (in)actions of the local authorities. The snowbound Facebook users came to understand their situation as also an indirect consequence of the authorities’ act of not removing more snow. Importantly, the members of the Facebook group did not ask the authorities to do the impossible, but rather to understand the situation of people in the rural parts of Bornholm. The rural dwellers felt overlooked and misunderstood. In order to compensate for that, they used Facebook to share updates about the snowstorm and its consequences from their perspectives.
This is where the story becomes interesting for a discussion of privacy. The Facebook users on Bornholm had to share private stories, photos, and videos in order to qualify their situation as a public issue of uncontrolled indirect consequences (of the inaction of the authorities). In other words, the sharing of personal accounts formed the basis of public engagement. Revealing private content, even unintentionally, can have productive consequences for how other people understand the issues they struggle with. Powerful “we-identities” can be built around deeply personal stories that people can relate to.
Such we-identities can then provide the self-confidence needed to take action on issues, as when the snowbound Facebook users on Bornholm worked together to attract national media attention to their situation and help each other in other, more mundane ways. What is more, democracy does not necessarily hinge on rational deliberation in an abstract public sphere. Rather, public engagement happens when people work to qualify their personal problems as public issues. What I have observed is that social media platforms like Facebook provide interesting ways of making such moves.
However, we are not used to thinking about the Web in these pragmatist terms. Rather, our ideas about how the Web should work are informed by other kinds of philosophies. Let me highlight two widespread concerns with how the contemporary Internet works.
First, some fear that too much is revealed too easily on the Web. The notion of having privacy only makes sense in relation to its counterpart: the notion of exposing something private in public, that is, a breach of privacy. Here, public simply means “visible” and private simply means “hidden.” Or to put it in terms of control, public is taken to mean “deliberately revealed” while private means “only shown to a select audience.” Such an understanding of the public/private distinction is highly intuitive and indeed captures many of the concerns with user privacy in the age of social media.
Second, some fear that the Web is harmful to democracy. These concerns have to do with the notion of public space as something fundamentally important in a democracy. The concern here is not that we are too exposed by our social media content, but quite the opposite: What happens on social media tends to be “too private” to qualify as public deliberation. Here there is a tendency to talk about public and private in binary terms. A popular way to describe this is to use Cass Sunstein’s term “echo chambers,” which describes how the groups we participate in on social media tend to confirm our own (read: biased) beliefs rather than challenge them. This is not conventionally seen as a good dynamic, since democracy is understood as dependent on the clash of divergent opinions in an open public space.
The two sets of concerns, privacy and democracy, are key to the way scholars in Web science and related disciplines try to capture the current Web as a domain of “networked publics.” The concerns stem from understanding the relationship between the private and the public domains as fundamentally problematic. Concerning privacy, the thinking goes, it is a good thing that people can share content from their everyday lives with each other through the Web, but people need to be able to control where the line between public and private is drawn. Concerning democracy, it is a good thing that people get new ways of engaging through social media, but if people are not entering into a dialogue with opposing viewpoints, the contribution to democracy is far from clear. In the worst case, social media might
communal interests. This kind of logic is arguably more widespread in Europe than in the U.S., but it certainly also exists in both places, offering another influential argument why the private and the public need to be kept apart.
While these two powerful logics seem mutually exclusive, they co-exist in practice. How is this possible? Apart from the fact that humans seem to
thinking about the Web in terms of publics, which I find most fruitful, is that of pragmatism. One general insight pragmatist thought has to offer is that in practice, we never exist as isolated individuals. Even in our most private moments, we always draw on the habits, skills, and ideas passed over to us from others, and we constantly imagine the thoughts and re-
[2] Birkbak, A. Crystallizations in the Blizzard: Contrasting informal emergency collaboration in Facebook groups. In Proceedings of NordiCHI ’12 (Copenhagen, Denmark, Oct. 14-17). ACM Press, New York, 2012.
Biography
Andreas Birkbak is a Ph.D. fellow in the Techno-Anthropology Research Group at Aalborg University, Copenhagen. His work is on social media and technological democracy.
Copyright held by Owner/Author(s).
Publication rights licensed to ACM $15.00
Something Bad Might Happen: Lawyers, anonymization, and risk
The line between personal and anonymous information is often unclear. Increasingly it falls to lawyers to understand and manage the risks associated with the sharing of “anonymized” data sets.
By Marion Oswald
DOI: 10.1145/2508970
If you wanted to predict the future, who would you call upon? An economist, a statistician, Nate Silver? A lawyer might not be high on your list. Yet when faced with questions of individual privacy and data anonymization, this is what lawyers are being asked to do. This article aims to illustrate how this is the case and consequently why lawyers need help from computer scientists.
LEGAL BACKGROUND
Anonymization presents lawyers with somewhat of a challenge. Take the European Data Protection Directive for instance. It applies to personal data, that is any information relating to an identified or identifiable natural person. An identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his or her physical, physiological, mental, economic, cultural or social identity.
If data is personal, the organization that decides how and why the data is processed (the data controller) becomes subject to a number of (sometimes complex) duties and responsibilities designed to safeguard the data. If personal data can be converted into anonymized form in such a way that a living individual can no longer be identified from it (taking into account all the means likely reasonably to be used by anyone receiving the data), disclosure of information in this anonymized form will not be disclosure of personal data, and therefore those duties and responsibilities will not apply to the disclosed data.
IDENTIFIABILITY
Where personal data has been anonymized, a crucial question the courts often have to decide is whether this supposedly anonymized dataset in fact falls within the definition of personal data. Why then does this involve predicting the future? Because it involves an assessment of risk, or as David Spiegelhalter has put it, “the possibility that something bad might happen” [1]. This is usually deconstructed into “the likelihood of something happening and the impact if it actually does.” Then some attempt is made to quantify the magnitude of these two dimensions.
In its Code of Practice “Anonymization: Managing data protection risk,”
the UK’s Information Commissioner’s Office advised that the Data Protection Act does not require the process of anonymization to be completely risk free; data controllers must instead mitigate the risk of re-identification until the risk is “remote.”
So when lawyers are presented with an anonymized dataset and asked “is it personal data?”, they have to assess the possibility of something bad happening; i.e., the likelihood of someone being able to re-identify an individual, and the harm or impact if that re-identification occurred. And tending to be conservative creatures, lawyers may be tempted to respond: Yes, it could happen, therefore there is a risk, therefore the data is personal data.
THE POSSIBILITY OF SOMETHING BAD HAPPENING
There are situations where the risk of re-identification is undoubtedly high: the likelihood of re-identification high, the potential for harm high, and the certainty factor high. Take the request under the UK’s Freedom of Information Act for details of disciplinary action taken against employees of the Magherafelt District Council in Northern Ireland. Should the Council disclose a schedule containing the penalty issued and the reason for the action, but which excluded the date of the action, and the gender, job title, and department of the employee?
No, said the Upper Tribunal Administrative Appeals Chamber [2]. The information was personal data, and it would be unfair to disclose it. But the information had been anonymized, so why was it personal data? The issue was not whether the information was personal data in the hands of the Council, but whether it was personal data in the hands of the general public. A crucial question was whether the public could identify the individuals to whom the summarized schedule related.
The Tribunal considered evidence that the Council was a small authority with only 150 employees, all known to each other, in a district with a population of 39,500. The Council was likened to a family, with a high level of knowledge of each other’s affairs.
The Tribunal made use of the “motivated intruder” test; a motivated intruder being someone who has access to the Internet and public documents and would use investigatory techniques such as making enquiries of people likely to have additional knowledge. The requestor was an investigative journalist, and so might have been highly motivated to identify individuals using other information available and common investigative steps. The Tribunal concluded that an investigative journalist “would have little difficulty in making the necessary enquiries which could lead to the identification of individuals subject to disciplinary proceedings,” particularly as the community was small and close-knit, and that identification would be all the more likely when the sanction was suspension or dismissal.
BIG AND OPEN DATA
The Magherafelt decision dealt with a relatively small set of data where prior or personal knowledge about a particular individual may already have existed or could have been obtained. The UK Government’s Open Data agenda is particularly concerned with regional or national datasets where the likelihood of personal knowledge having an impact on re-identification risk is minimized. How should we assess the risk of re-identification in relation to such datasets?
Paul Ohm, in his paper “Broken Promises of Privacy,” argued “re-identification science exposes the promise made by [privacy/data protection] laws – that anonymization protects privacy – as an empty one” [3]. Ohm highlighted “release-and-forget” anonymization, with generalized rather than suppressed identifiers, as of particular concern. He also pointed out that other “data fingerprints” such as search queries or social media postings can be combined with anonymized data to attempt re-identification. However, others disagree with Ohm’s view that re-identification can be achieved with “astonishing ease.” In her new guidance, “Looking Forward: De-identification Developments – New Tools, New Challenges,” Ann Cavoukian, the Information and Privacy Commissioner of Ontario, Canada, restated her opinion that re-identification “is not easy” and that the most significant privacy risks arise from ineffectively de-identified data. Commenting on big data, she said: “As masses of information are linked across multiple sources it becomes more difficult to ensure the anonymity of the information” [4]. On the other hand, big data could make de-identification easier to achieve; “smaller datasets are more challenging to de-identify as it is easier to be unique in a small dataset,” as seen in the Magherafelt case discussed previously.
So who should we believe?
TRUST, RISK, AND ANONYMIZATION STUDIES
“…[O]ur views of the facts about big risks are often prompted by our politics and behaviour, even as we insist that the rock on which we build our beliefs is scientific and objective, not the least bit personal” [5].
In Nick Pidgeon’s view, emotional responses are very important in the assessment of risk. “If you do not trust the parties who manage the risk, you are not likely to have confidence that the risk is being safely managed” [1]. Kieron O’Hara has said “trust is an important risk and complexity management tool…The stronger X’s trust, the higher the degree, and the greater the risk he is willing to take” [6].
A recent Ipsos MORI study of Public Understanding of Statistics examined how much trust the participants had
of time before a re-identification risk is created by the open and uncoordinated release by separate public bodies of two similar or identical datasets, one anonymized effectively, the other not. And what about re-identification risks that may be created by personal data disclosed by a breach?
Courts faced with a dispute over a proposed or existing release of an anonymized dataset will increasingly be called upon to assess the robustness of the risk assessment. But what will be judged an acceptable risk? 10 percent, 2 percent, 0.001 percent? How do percentages equate to “remote” or “minimal” risks? The question of whether data is, or is not, personal data is ultimately a legal one; lawyers need context in order to tackle it, and it is not something that lawyers can do alone.
As per Kieron O’Hara, “…data protection is not sufficient for preserving privacy, or public trust, or indeed the usability of data, and the right discus-
[8] Barth-Jones, D. Public Policy Considerations for Recent Re-Identification Demonstration Attacks on Genomic Data Sets: Part 1 (Re-Identification Symposium). Bill of Health (blog), Harvard Law School, May 29, 2013.
[9] Sweeney, L., Abu, A., and Winn, J. Identifying Participants in the Personal Genome Project by Name. Harvard University. Data Privacy Lab. White Paper 1021-1. April 24, 2013. (PDF)
[10] Barth-Jones, D. The ‘Re-Identification’ of Governor William Weld’s Medical Information: A Critical Re-Examination of Health Data Identification Risks and Privacy Protections, Then and Now (June 4, 2012). Available at SSRN: http://ssrn.com/abstract=2076397
[11] Ambrose, M. L. It’s About Time: Privacy, information, life cycles, and the right to be forgotten. Stanford Technology Law Review 16, 2 (2013). (PDF)
Biography
Marion Oswald is a practicing solicitor and Head of the Centre for Information Rights at the University of Winchester. Before joining the University, Oswald worked in legal management roles within private practice, international technology companies and UK central government, including the Ministry of Defence, and specializes in data protection, freedom of information and information technology.
Personal,
Pseudonymous, and
Anonymous Data:
The problem
of identification
Why defining what counts as personal data is important
for data protection and information sharing.
By Iain Bourne
DOI: 10.1145/2508972
D
ata protection (DP) law has been around in one form or another for around 40 years.
In the United Kingdom, DP law came into effect back in 1998, based on a directive
published in 1996 that had been in gestation since the early ‘90s. A lot has happened
since then. Think of the information technology and resources that were around
almost 20 years ago, and compare them to the ones you have now. Back then, you probably
kept an address book of friends and acquaintances on a slow, non-networked home PC; had
physical files relating to your own commercial affairs; held records relating to the family
members’ health checks, school reports, and employment; and had a basic mobile phone containing individuals’ contact details and maybe a ping-pong game, if you were lucky.

Nowadays, you can have a host of email addresses and Twitter accounts under a variety of different names; maintain a social networking account to make contact with people all over the world; store your pictures and videos in the cloud; sell your unwanted birthday presents on an e-commerce site; and use your mobile phone to share geo-location data with friends so it’s possible for you all to meet up in a local café. In amongst this activity lies a wealth of personal data, and we need to ensure such data is used and managed safely. This is where DP law comes in.

DP law does good things for citizens and should make sense for organizations. In short, it means your personal data has to be used fairly and lawfully, and organizations must store it securely, have to be open about what they do with it, and have to give you access to it when you ask. That’s about it really; so far so good. However, it’s not quite that simple and there is currently a great deal of debate within the DP community about what sort of information is “personal data” and, therefore, about the scope of DP law. This is currently coming to a head as work on our next iteration of DP law (the proposed DP Regulation) continues in Europe. This is not solely an academic matter; it has great real-world consequences for the rights and protections we enjoy as individuals and for what organizations have to do to comply with the law. The problem is that we have data protection law, and will continue to have data protection law, but it is (arguably) becoming less clear what sort of information the law applies to. And, of course, the law doesn’t work very well unless it is clear what it applies to.

THE PERSONAL DATA PROBLEM
Put simply, personal data is information that identifies someone or can allow someone to be identified when combined with other information. On the face of it, that’s a fairly straightforward definition that should make it easy to say whether or not a particular piece (or set) of information counts as personal data. There is no problem saying information related to your health record, tax contributions, or bank account is your personal data. The information identifies you and relates to you, so that’s easy. However, there are two reasons as to why the situation around the peripheries of personal data can be far less clear. First, there are now lots of different ways of identifying people; this is no longer just about real-world or civic identification. Second, the question is not only about whether information does identify someone, but also whether it could identify someone.

To look at the first point, what does it mean to “identify” someone? Imagine you go online and use, say, a major search engine anonymously, i.e. without being signed in through your email account. It is clear the service provider has no “real-world” information about you, such as your name, address, or phone number. However, it is also clear that you are being identified, but in a different way. The personalization of your browsing experience and the behavioral advertisements you receive are evidence of this. This all works through the IP address that allows the website to recognize you (or more precisely the device you use to go online), coupled with cookies and other data linked to that IP address. I would argue this operation involves the processing of personal data and should be covered by DP law, albeit in a modified way. This is just one illustration of an alternative or “non-obvious” form of identification, a phenomenon that was not anticipated when current DP law was drafted.

I think the example above is fairly straightforward. However, let’s think
about identification in the context of, say, a photograph of a crowd of people. It could be a shot entitled “New Year’s Eve revelers in Trafalgar Square” published in a newspaper. The publisher has no interest in identifying any of the people in the crowd and has no means of doing so. However, it is certainly possible to single out one person from another and there is no doubt the people in the picture would be able to identify themselves, as would people who know them. A current stream of thought argues the information is personal data if one individual can be singled out from another. This is a view that holds considerable sway in some European Union member states’ DP authorities. However, there is currently much debate about this within DP circles.

Recall the second part of the personal data problem is whether information could identify someone. Clearly, this can be very difficult to assess with any real objectivity. Let’s return to our New Year’s Eve revelers example. The publisher of the newspaper may not wish to identify (and may not even be able to identify) anyone in the crowd, but imagine if there was a terrorist incident in Trafalgar Square shortly after the photograph was taken. The publisher later hands over all of its photographer’s shots to the police for investigation. If the police review pictures of known suspects and use a televised appeal to the public for assistance, it becomes much more likely that some, or all, of the individuals in the original picture will be identified in the old-fashioned, real-world sense of being named. This brings the photograph much more firmly within the scope of personal data. This also illustrates the current debate about whether the test for information being personal data is identification, the reasonable likelihood of identification (the formulation in current DP law), or identification and the possibility of identification. Some argue the last formulation would provide better protection for individuals and would “catch” significant information around the peripheries of the current definition of personal data. Others argue it would widen the scope of DP law unacceptably and would bring personally inconsequential information within its scope.

Pull quote: “An overly wide definition of personal data and an overly restrictive application of the law could take our society backwards in terms of access to information.”
PSEUDONYMOUS DATA
The English word “pseudonym” derives from the Greek word pseudōnumon, via the French pseudonyme. In Greek it means “false name,” a meaning that still corresponds extremely closely to its ordinary English one. In one respect, a pseudonym is a privacy-enhancing construct intended to prevent individual identification, rather than to facilitate it. However, as is often the case in DP law, things aren’t that straightforward. A pseudonym may indeed be a “false” name, but it is still a name and can still be used to single one person out from another.

Let’s consider the use of an alias, such as a nom de guerre. When Carlos the Jackal committed his crimes he used his alias a) to hide his real identity (Ilich Ramírez Sánchez) from the authorities, and b) to let the public know he was responsible for the various actions they were reading about in their newspapers. So, a pseudonym can be both a way of hiding identity and, at the same time, a means of revealing it in a different way. This lies at the heart of the pseudonymous data problem, which has translated into a complex current debate about whether a pseudonym is (potentially) personally identifiable (and is therefore personal data) or can be anonymous (and is not, therefore, personal data). The Information Commissioner’s Office, the UK regulator of DP law, would argue it can be either, depending on how the pseudonym is produced and the context in which it is used. However, others reject this view and argue if a pseudonym allows an individual to be singled out from another individual, then the information is personal data. And that’s that. However, let’s look at an example to see how this plays out in practice.

Table 1 shows a redacted version of a set of personal data that is being used by researchers to examine the relationship between the receipt of a state benefit and a person’s weight. The original dataset, which also included research subjects’ names, addresses, and dates of birth, has been irreversibly deleted. The “research cohort reference number” was generated from individuals’ names using a one-way irreversible encryption algorithm. This is the sort of dataset commonly used to conduct longitudinal studies into health and other research or analytics, where the objective is to study information about particular individuals without identifying any of them. In this example, it is clearly possible to single out one individual from another, but is any individual really identified? Does this table contain any individual’s personal data? I would argue it does not, because although each row relates to a particular individual, it does not identify any of them, and nor could any individual ever be identified because the original source data no longer exists. (In reality this type of dataset is normally given additional protection through value-swapping, perturbation, blurring, and other techniques, meaning the information remains valuable to researchers but is not, and cannot be, used to identify anyone.) However, others would disagree with this analysis and say the dataset contains the personal data of the five “singled out” individuals and, in fact, data like this can always be linked back to an identified individual and is therefore always personal data. This raises the question of how the research organization would grant subject access (the statutory right of access to your personal information) to an individual, or how it would tell individuals that it has their personal data, as it could be required to do under DP law. The practical problems of an overly wide definition could be very great.

Table 1: A fictional example of redacted personal data.

1. Name, address,   2. Period of Special    3. Body Mass   4. Age    5. Research Cohort
   date of birth       Assistance Benefit      Index          Range     reference number
   [redacted]          < 2 years               21             40-45     QA5FRD4
   [redacted]          > 5 years               19             50-55     2B48HFG
   [redacted]          < 2 years               20             40-45     RC3URPQ
   [redacted]          > 5 years               23             45-50     SD289K9
   [redacted]          < 2 years               20             45-50     5E1FL7Q

PSEUDONYMOUS DATA AND THE NEW DP REGULATION
There is a lot of discussion at the moment about the possibility of introducing a new class of pseudonymous data into the proposed DP Regulation. The argument seems to be that introducing a new sub-class of something akin to potentially identifiable personal data would extend the regulation’s coverage, offering better protection to individuals while also providing more “lite” regulatory coverage for organizations processing this form of data. However, as we have seen, the issue of what a pseudonym is, and of whether pseudonymous data is a form of personal data or is anonymous, is far from straightforward. There seem to be three basic meanings in circulation:

1. Pseudonymous data is data where a “real” identifier, such as somebody’s name or National Insurance number, is replaced by a “false” identifier such as a hashed code number. This is a privacy-enhancing technique used in contexts such as medical research or online analytics. It allows individuals to be tracked longitudinally without their identities being revealed. However, the link between the data and the individual could be permanently and irreversibly broken, although there is much debate about the possibility of this. This corresponds most closely with the ordinary English meaning of pseudonymous.

2. Pseudonymous data is an alternative form of personal identification, such as when an online services company uses an IP address and associated cookie logs to target content at a particular device user.
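To make the first meaning concrete, here is a minimal sketch of how a reference number like those in Table 1 might be derived from a name, and of what “singling out” looks like on the resulting rows. It is an illustration under stated assumptions, not the method behind the article’s fictional table: the article speaks only of a “one-way irreversible encryption algorithm”, and this sketch substitutes a keyed hash (HMAC-SHA256) truncated to seven characters. The names `cohort_reference`, `singled_out`, and `SECRET_KEY`, and the three subjects, are all invented for the example.

```python
import base64
import hashlib
import hmac

# Assumption: a secret key held only by the data controller (value is illustrative).
SECRET_KEY = b"illustrative-key-only"


def cohort_reference(name, key=SECRET_KEY, length=7):
    """Derive a short, repeatable reference code from a name via a keyed one-way hash."""
    normalized = " ".join(name.lower().split())  # ignore case and stray whitespace
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).digest()
    # Base32 yields upper-case letters and the digits 2-7; truncating keeps the
    # code short but makes collisions possible in a large cohort.
    return base64.b32encode(digest).decode("ascii")[:length]


def singled_out(rows):
    """Return the rows whose combination of values occurs exactly once."""
    counts = {}
    for row in rows:
        counts[row] = counts.get(row, 0) + 1
    return [row for row in rows if counts[row] == 1]


# Invented subjects: (name, benefit period, body mass index, age range)
subjects = [
    ("Alice Example", "< 2 years", 21, "40-45"),
    ("Bob Sample", "> 5 years", 19, "50-55"),
    ("Carol Placeholder", "< 2 years", 20, "40-45"),
]

# Redacted view: the name is dropped and only the derived reference is kept.
redacted = [
    (benefit, bmi, age, cohort_reference(name))
    for name, benefit, bmi, age in subjects
]

for row in redacted:
    print(row)
print(len(singled_out(redacted)), "of", len(redacted), "rows can be singled out")
```

Every redacted row here remains distinguishable from the others even though no name survives, which is the situation the Table 1 discussion turns on. Whether that distinguishability amounts to personal data is the legal question the article debates; the code does not settle it.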
Talking ’Bout Your Reputation
People think they want anonymity, but actually desire privacy.
But how do we reframe the debate surrounding privacy and security?
Perhaps technology is the answer.
By David Birch
DOI: 10.1145/2517998
There are sound rationales both for and against anonymity [1] and these are
brought into sharp relief by the combination of social networking, mobile phones,
online business, and government. Real issues ranging from online bullying to
activism under repressive regimes mean we as a society need to think about what
we want from the emerging infrastructure and how anonymity should work, or even exist,
within that infrastructure.
Photograph by Vyacheslav Pokrovskiy

I imagine if you were to walk down a typical street with a clipboard and ask members of the general public whether they think online anonymity is important, most would say “yes.” They would say it is important that they can pay for something in cash without it being tracked and traced by the government. They would say it is important that they can vote in a secret ballot without their choices being observed by party activists (or spouses). They would say it is important that they can log in to a website about trade unionism, or diseases, or pornography, or anything else they might not want other people to know they have been looking at. I take a guilty pleasure in occasionally visiting the Daily Mail website to read the readers’ comments on the major news stories of the day, but I certainly don’t want friends or work colleagues to know about that!

However, if you set out clipboard in hand and asked people whether online anonymity should be stopped because it allows child pornographers, terrorists, and drug dealers to congregate in virtual space with impunity, they would similarly say “yes.” As I noted, sound rationales for and against. Perhaps I might generalize and say people want anonymity for themselves, but not for other people. So why do they want anonymity for themselves?

In the UK, there are some people who don’t want to register their Oys-
similar constructs. Therefore although you can turn up at my door claiming to be an employee of the electricity company, I can use my phone to read your phone, which lets me know whether that’s true or not. I do not need to know who you are, but I do need to know what you are. If you subsequently batter me over the head and steal my life savings, then the police can go back to the electricity company to find out who you are,

needs to know I have a valid driving license. Barclays does not know this, but they know an AP who does (i.e. in the UK, the Driver and Vehicle Licensing Agency) so the bank can obtain the relevant attribute (with my permission) and then present it to the service.

1 Please note that Consult Hyperion provide paid consultancy services in connection with one of these Alpha Projects.

Biography
David G.W. Birch is a director of Consult Hyperion, an IT management consultancy that specializes in electronic transactions. Here he provides specialist consultancy support to clients around the world, including all of the leading payment brands, major telecommunications providers, government bodies, and international organizations including the OECD. Before helping to found Consult Hyperion in 1986, he spent several years working as a consultant in Europe, the Far East, and North America. He graduated from the University of Southampton with a B.Sc (Hons.) in physics.
The data environment is a new concept in the field of data confidentiality. Although
there have been references to its various aspects, manifestations, and impacts,
it is only now that it has become a focus of inquiry in its own right. It is a focus,
we would argue, that is long overdue and rather urgent given the manner and
pace in which the data landscape is evolving. The huge amounts of data being generated,
combined with the economic drivers and political will to share it more widely, mean
concerns about data privacy and anonymity are ever more well founded. Here, we explain why
we need to understand the data environment in order to minimize threats to data privacy and anonymity.

When we talk about protecting data privacy and maintaining anonymity in the data confidentiality field, we are in essence talking about ensuring anonymized data remains anonymous once it is shared, disseminated, and released in the data environment. So what does this actually mean in practice? To answer this, we will first discuss data and anonymization, as this will set the scene for what we really want to discuss, the data environment.

DATA AND ANONYMIZATION
All organizations will collect some information from their customers/clients/service users as part and parcel of their organizational activities. Almost always, this will include classic identifiers such as clients’ names, addresses, and contact details. However, the information that is collected is often more extensive than this; for example, schools also collect data on their pupils’ exam scores, special educational needs, and health; law enforcement also collects data on crime and anti-social behavior; and retailers also collect data on shopping and leisure habits, finance, employment status, and occupation. This information will in all likelihood be stored in databases that hold very many individual-level records of information.

This data is termed personal data, which, as described by the UK Data Protection Act (DPA, 1998), is “data that relates to living individuals who are or can be identified from the data.” Organizations that want or need to share and disseminate their data for secondary use are obliged under the DPA (1998) to process the data in such a way as to render it anonymous and therefore no longer personal. The transforming of data from personal to anonymous requires that identifiers be removed, obscured, aggregated, and/or altered in some way. There are two types of identifiers that organizations need to think about when processing data: formal identifiers and complex identifiers. Formal identifiers are relatively easy to spot and deal with and include data such as a subject’s name, address, and unique reference numbers (e.g. their social security number or National Health Service number). Complex identifiers are less easy to spot and deal with. They could in principle include any piece of information (or combination of pieces of information). For example, take age and marital status. Considered in the abstract, they are not immediately obvious identifiers. But, if we consider the case of an 18-year-old widow, our implicit demographic knowledge tells us this is a rare combination (at least in peacetime). This means such an individual could potentially be re-identified by, for example, someone spontaneously
recognizing that this record corresponded to their friend/neighbor/colleague/family member.

Just this example alone presents a data complexity problem, which demonstrates anonymizing data is not straightforward. To complicate matters further, organizations preparing data for dissemination don’t just have to think about sufficiently anonymizing their data, but also about retaining data utility. After all, there is little point in sharing and disseminating data that doesn’t represent whatever it is that it is meant to represent (because it has been altered during the anonymization process).

Because anonymization is difficult and has to be balanced against data utility, the risk that a re-identification will happen will never be zero. In other words, there will be a risk (although extremely small) of de-anonymization present in all useful anonymized data. The only way to remove this risk entirely is not to share any data at all, which is obviously undesirable if we are to exploit the undoubtedly huge social and economic value locked up in the data.

Statistical Disclosure
For researchers in the data confidentiality field, the first step to determining how best organizations can minimize the risk of de-anonymization and optimize the trade-off they must make between anonymization and data utility is to assess how the process of de-anonymization might actually occur. The term commonly used in the field to denote the process of de-anonymization (and one that we will use from here on in) is “statistical disclosure.” A statistical disclosure, we should point out, incorporates not just the idea of de-anonymization (or re-identification), but also captures the idea that confidential information is revealed (or disclosed). See Duncan et al. or Hundepool et al. for recent reviews of the statistical disclosure control field [1, 2].

Formally, we describe a statistical disclosure as a form of data confidentiality breach that occurs when, through statistical matching, an individual population unit is identified within an anonymized dataset and/or confidential information about them is revealed. However, determining how a statistical disclosure might actually occur and then play out is not straightforward. This is the crux of the problem. As it stands, we know little about the factors, conditions, and mechanisms involved in a statistical disclosure largely because we know little about the data environment. We will give a technical description of this term shortly; for now, consider it as the context for any piece of data, without which the data has no meaning.

You may wonder why it is only now that attention is being directed toward the data environment. After all, it would seem like an obvious point of focus given the task in hand. The explanation for this lies with: (i) the particular perspectives that have underpinned and informed data confidentiality work, and (ii) the intractability of understanding and gathering data from the data environment.

The traditional perspective was one where statistical disclosure risk was seen as originating from, and therefore largely contained within, the data to be disseminated, released, or shared. It meant data researchers and practitioners rarely looked beyond the statistical properties of the data in question. More precisely, it meant they did not concern themselves with issues such as how or why a data intruder might make a disclosure attempt, or with what skills, knowledge, or access to other data they would require to ensure their attempt was a success. As a consequence, the statistical models they built to assess disclosure risk, while statistically sophisticated, were based on very crude assumptions about the context of the risk they were trying to model. To address these failings, there has been a broadening of perspective in the last 20 years, which has seen attempts to incorporate some context beyond the data itself. This has usually taken the form of intruder scenario analysis, which has shifted attention away from the traditional position of asking “how risky is the data for release” to a more critical position of asking “how a statistical disclosure might actually occur.” Some inroads in addressing this latter question have been made, most notably: (i) the development of a framework for identifying plausible intrusion scenarios and (ii) the identification of sets of key variables, i.e., information that can be used for statistically matching one dataset with another [3]. But, for all intents and purposes, this is where the work has stalled, not least because much of it is theoretical. It is certainly true that we lack a real-world view of statistical disclosure and have relatively little direct data on it. This may be because an act of statistical disclosure is a rare event and/or is one in which the key protagonists (i.e., the data intruder and the organization releasing data) are both incentivized to conceal (albeit for differing reasons). It is difficult to speculate productively on this and we do not do so here. The important point we wish to make is that while there is little direct data in the form of cases of disclosure, it does not mean there isn’t any (key) data; the data environment can potentially tell us all we need to know about how a statistical disclosure might actually happen.

Pull quote: “We can only effectively guard against the threat to data privacy and anonymity when we have a clear idea of what it is we are guarding against.”

THE DATA ENVIRONMENT
The data environment is made up of a small number of components: data, agents, and infrastructure. It is these components that we need to look at in order to ascertain how a statistical disclosure might occur and play out.

Data. What (other) data exists in the data environment? This is what we need to know in order to identify what data (key variables) are risky, i.e., can be used for statistically matching one dataset with another, thereby providing (some of the) conditions for statistical disclosure. This is still a developing area, which, at Manches-
It provides the context to data and agents, so, for example, it will influence what data is shared, to whom it is given, and how that process takes place. It will also influence key agents, such as National Statistical Institutes, any organization releasing data, data users, specialist interest groups, the general public, and the media in terms of their possible actions, interactions, and counter responses. Infrastructure includes storage systems, information systems, data security systems, governance structures, and national and international legislation.

Thus far, we have talked about the data environment in the definite singular; in other words, we are referring

ments that define what can and cannot be done with the data comprise the environment. Such an environment cannot be as tightly controlled as the secure data center environment, but it does allow for some control of the data environment not (currently) present when, for example, data is published on the Internet.

All data environments will contain the features outlined above but are likely to differ in form depending on how they are made up and how they are operationalized. A local data environment may in turn contain sub-environments. For example, an organization may have multiple servers with differential access. Indi-

Biographies
Elaine Mackey is a well-established researcher into the broader aspects of statistical confidentiality where the statistical, data management, and social policy meet. Her Ph.D. demonstrated the value of using game theory to map disclosure attack scenarios. She has recently worked as part of the Data Environment Analysis Service mapping the data that an attacker might feasibly use to identify individuals in anonymized datasets.

Mark Elliot has an international reputation in the field of data privacy. He has led numerous interdisciplinary projects in the field and his special unique methods are at the center of the SUDA system for anonymization decision support developed at the University of Manchester and used in statistical agencies across the world. He has a long track record of relevant stakeholder engagement most recently through his work on the Administrative Data Liaison Service (www.adls.ac.uk) and as lead for the UK Anonymisation Network (www.ukanon.net).
What is Bitcoin?
Strengths and weaknesses of the leader in a new generation of emerging cryptocurrencies.
By Dominic Hobson
DOI: 10.1145/2510124
Control over your personal data is an important part of privacy. The Web enables
personal data to be gathered, shared, and traded with unimaginable ease, and keeping
a firm grasp on personal data is becoming more and more challenging.
One motivation for the mass movement of personal data around the Web is money.
Before the advent of the Internet, sending money from one side of the world to the other was
not as straightforward a task as it is today. But, just like personal information, the transfer
of money enabled by the Web has big implications for privacy. As we have moved from cash,
to checks, to credit cards (away from physical gold standards), our money has gradually become just numbers on a computer. Inevitably, that computer belongs to someone else.

Privacy, that is, the ability to reveal personal information through choice, is near impossible to attain when a third party holds all your money, personal information, as well as every electronic transaction you’ve ever made. Part of the motivation for creating the Internet was resilience to attack through distribution with as few single points of failure as possible. Despite having such a system, users have still flocked en masse to centralized services such as PayPal. With respect to privacy, we have given all our control; our personal data regarding our transactions; our balances; and who we pay, why, and when over to a few large centralized services.

For some, this shift of monetary control has its benefits. In theory, your money is safer in a large organization’s virtual vault than under your bed. For others, this handover of monetary control has been more costly: Searching for the phrase “PayPal took my money” brings back literally millions of results.

As well as possession of money, centralized services also get possession of personal data relating to purchases. Supermarket chains can (and do) infer age, gender, household salary bracket, and more from the items you purchase in their stores to aid marketing and advertising efforts. In 2011, Visa announced a system called “Real Time Messaging,” which sent offers and discounts direct to phones based on information deduced from card use, such as location.

These are just a few examples of how large companies are using personal data related to your payments and transactions to target advertising and sell more efficiently. Many people may be comfortable with this. After all, if somebody is going to try and sell something to you, wouldn’t you rather it be something relevant to you, which you might actually want? However, there is virtually no alternative to holding money in a bank or other centralized services, which in turn buy, sell, and profit from your personal data.

Enter Bitcoin, a pseudo-anonymous, peer-to-peer currency protocol created and released, quite fittingly, under a mysterious pseudonym, “Satoshi Nakamoto,” who has since disappeared. Bitcoin is the leader in a new generation of emerging currencies known as “cryptocurrencies,” which aim to, among other things, facilitate the movement of money electronically while still maintaining a sense of privacy. Bitcoin disrupts this move to centralized money services, putting the Internet to the use for which it was originally intended: fully decentralized services.

It’s been suggested that by printing our names and details on our debit and credit cards we undermine thousands of pounds of smart chip, which is a privacy-enabling technology. Unfortunately the biggest group of people helped isn’t ourselves, but those with nefarious intent. It raises the question: in a system where everything and everyone is represented as numbers on a computer, do we even need a name? It is this approach that gives the Bitcoin protocol one of its many strengths.
Breaking Down Bitcoin
The Bitcoin protocol itself stores no personal data. Bitcoin offers its privacy by design through novel use of cryptography. Nothing personally identifiable is recorded. Instead, users have many wallet addresses, which are hashes of public keys. Users can, and are encouraged to, have as many different wallet addresses as required (ideally one per transaction). The corresponding private keys, required to authorize a transaction, are stored locally in the user’s wallet file.
Users maintain full control and possession of their local wallet file. But “with great power comes great responsibility.” Should a user accidentally delete or lose their wallet file, they also lose any associated bitcoins. Although the bitcoins are still technically stored on a peer-to-peer network, the private keys required to authorize a new transaction are lost, effectively making the coins unspendable.
Behind the scenes, Bitcoin doesn’t store each coin and who owns it. Instead, it uses a distributed ledger book system (called a “blockchain”) based on the logic that if you know every transaction an address has made, then you know if it has money to spend. This may appear initially quite contradictory: A privacy-protecting money system that lets the entire world see every transaction ever made. However, from a privacy perspective, it doesn’t matter if everyone can see every transaction, if the only identifying information in a transaction is a seemingly random number of which everyone can have many.
Transactions are verified through a process known as “mining.” The mining process also serves as the mechanism by which bitcoins are initially produced and distributed. Mining is effectively the act of adding transactions to the blockchain so everyone can agree on the same set of transactions. A node that chooses to mine runs mining software, which repeats the following:
1. Gather up all unverified transactions into a block (ensuring they’re all valid transactions) along with the hash of the last block added to the blockchain and a random number called a “nonce.”
2. Hash this block. Look at the newly produced hash, specifically how many zeros it starts with: (a) If the number of leading zeros is less than a predefined number (known as “difficulty”) then start again from step 1, incrementing the nonce to ensure a different hash is reached. (b) If there are at least as many leading zeros as the required difficulty, then proceed to step 3.
3. The miner has successfully mined a block, adding it to the blockchain. They then broadcast their hash, along with the transactions in it and the nonce, to others. The successful miner also receives newly created bitcoins as a reward in a special coinbase transaction; this is how bitcoins are initially produced.
4. Other miners receive the new block and its contents. They check that all transactions in the block are valid and not double spends, and check that when hashed, they give the right result. If everything is valid, they use the new block hash and start to mine the next block with new transactions.
The mining process is effectively trial and error. As more people try mining, they would in theory be able to mine blocks more quickly. For this reason, after every 2,016 blocks that are found (which happens approximately every two weeks), the predefined number of leading zeros that must be in a hash for it to be successful (the difficulty) is adjusted. This is based on the average time it has taken to mine a block. If this time is more than 10 minutes, the difficulty is decreased; if it is less, the difficulty is increased. This effectively restricts the mining process to a block every 10 minutes.
Bitcoin disrupts this move to centralized money services, putting the Internet to the use for which it was originally intended: fully decentralized services.
The reward a miner receives for finding a block is the sum of the transaction fees of all the transactions in that block, as well as a block reward that is currently 25 BTC. This reward halves every four years, so no more than 21 million bitcoins will ever be produced. The aim of this is supposedly to mimic a finite resource such as gold. With 3,600 bitcoins being produced a day and each bitcoin worth around $100, mining has become an industry and profession itself.
Miners typically invest thousands into their mining rigs in order to hash just a little bit faster in hope of finding a block before another miner does. People are even going as far as creating Application Specific Integrated Circuits (ASICs) that can cost tens of thousands of dollars each for the sole purpose of mining. For Bitcoin users, the more people mining and hashing, the more secure and resistant the blockchain is to attack.
To attack the network, a malicious actor would have to create or modify a transaction in a block and mine it faster than the rest of the entire network can mine a block. Mining faster than the rest of the network on average requires the same or more hashing power than the entire network combined, which is currently hovering at around 1500 petaFLOPS (floating point operations per second). To put that in perspective, Tianhe-2, the world’s fastest supercomputer, has managed to muster a measly 31 petaFLOPS and theoretically maxes out at just 54.9 petaFLOPS.
Even if an attacker could acquire such power, malicious transactions must still be accepted as valid by other miners. With more than 50 percent of the network power, a malicious actor can only prevent transactions, and reverse or double spend transactions. Should a malicious party have more than 50 percent of the hashing power, they would earn more by legitimately mining than they could with fraudulent transactions.
Ultimately, providing you look after your wallet, your bitcoins are safe, unless somebody manages to break the military-grade cryptographic algorithm ECDSA (Elliptic Curve Digital Signature Algorithm). Even if this
[Figure 1. Bitcoin price chart, January to April 2013.]
were to happen, rolling out a new client and switching over to a new blockchain (called a “hard fork”) would solve the problem. Even a full break in ECDSA would have little to no implication for privacy of Bitcoin as there is still no personal data stored within the protocol.
Weaknesses of Bitcoin
So with such secure algorithms, what makes Bitcoin only pseudo-anonymous? One reason is that Bitcoin can’t guarantee that users will not somehow accidentally or intentionally link themselves to a wallet address. For example, let’s assume Alice published a wallet address of hers on Twitter for donations and received 20 BTC. By looking through the blockchain, anyone can find the addresses that Alice sent her money to, and it’s more than likely Alice knows the people she sends money to. Alice may wish to support a controversial group and anonymously donate. She naively sends money to an address the group posted on their Twitter feed. Anyone looking up Alice’s donation address in the blockchain would only have to Google the addresses she sent bitcoins to in order to link her with the controversial group.
All the above could have been avoided if either party had used a script to automatically generate a new Bitcoin address for each individual. Unfortunately, the vast majority of the population probably doesn’t know how to do that. A potential flaw in Bitcoin, as it stands, is that it is so incredibly novel and ingenious; it’s not yet intuitive or easily understandable for most of the population. How can someone be expected to trust something new that they don’t understand? Bitcoin was originally created and used by people with a technical disposition, so its lack of ease of use is most likely a feature of being a first-generation cryptocurrency.
There are already clones of Bitcoin being created using the Bitcoin source code as the base. Although none of these improve the usability, they do offer variations in block production time and rate. Some of these “altcoin” clones offer mining algorithms that are considered fairer by hashing blocks with the Scrypt algorithm. Producing a hash using the Scrypt algorithm is more memory intensive than SHA256 and as a result doesn’t benefit as much from the mass parallelization provided by more expensive hardware such as ASICs. Some altcoins use a variation of Bitcoin’s proof of work algorithm, making them in theory more resistant to a 51 percent attack.
However, these clones still share some of the same weaknesses as Bitcoin. In order to cash out bitcoins into a fiat currency such as £ or $, typically one must go through an exchange. The price of a bitcoin varies. In the first four months of 2013, the price of a bitcoin went from $20 to more than $250, down to $60, and back up to $160 (see Figure 1). Such volatility is unheard of in fiat currencies and has brought Bitcoin’s value as a standalone currency (i.e., not pegged to a fiat currency) into question.
Such volatility can lead to practical issues. For example, let’s assume a mining hardware company accepted preorders in bitcoins on equipment worth $30,000. After missing shipping dates, some customers requested refunds. However, when some customers paid in bitcoins, the value of a bitcoin was as low as $20, whereas now they’re worth more than five times that amount. If the company has to pay back customers with the amount of bitcoins the customer paid, then they will be paying the customers more than five times the dollar equivalent they originally paid. However, paying out the dollar value to the customer in bitcoins is also not fair, as the user will end up with considerably fewer bitcoins than they had before they bought the product.
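The mining loop described under “Breaking Down Bitcoin” can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Bitcoin client: the block layout (fields joined with "|"), the function names, and the single SHA-256 pass are choices made here for brevity, and difficulty is expressed as a required number of leading zero bits.

```python
import hashlib


def leading_zero_bits(digest):
    """Count the zero bits at the start of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def block_hash(prev_hash, transactions, nonce):
    """Hash a toy block: previous hash, transactions, and nonce (assumed layout)."""
    payload = "|".join([prev_hash, *transactions, str(nonce)])
    return hashlib.sha256(payload.encode("utf-8")).digest()


def mine(prev_hash, transactions, difficulty_bits):
    """Steps 1 to 3: try nonces until the hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = block_hash(prev_hash, transactions, nonce)
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce, digest.hex()
        nonce += 1  # a different nonce gives a different hash


def verify(prev_hash, transactions, nonce, difficulty_bits):
    """Step 4: other miners re-hash the block once and check the result."""
    digest = block_hash(prev_hash, transactions, nonce)
    return leading_zero_bits(digest) >= difficulty_bits


if __name__ == "__main__":
    nonce, digest = mine("00" * 32, ["alice->bob:1"], difficulty_bits=12)
    print(nonce, digest)
```

Finding the nonce takes many hash attempts on average, while checking it takes one; that asymmetry is what lets other miners validate a block cheaply. Each extra bit of difficulty roughly doubles the expected number of attempts.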
By Kelley Misata
DOI:10.1145/2510125
Ten years have flown by since The Tor Project released the first version of the Tor
software in 2002. Since those early days Tor’s technology, research, company, and
mission have grown to meet the needs of a constantly changing global landscape.
Fueled by a passionate team of more than 30 core employees and contractors,
including myself, we along with more than 3,000 dedicated volunteers and a community
of sponsors share in Tor’s mission of bringing a voice to global debates around privacy,
anonymity, and censorship circumvention.
Today, with more than 2.4 billion people online, Tor continues to be on the front lines helping people across scientific, charitable, civic, government, and educational sectors stay safe and communicate freely.
Protecting the rights of privacy and anonymity for all isn’t always the most popular or easy place to stand when the public is faced with reports of national and international privacy breaches. However, Tor maintains a consistent focus on technology and the passion to help people stay safe. We empower NGOs, law enforcement, and survivors of crime through our technology. We give ordinary citizens a fighting chance against criminals who steal identities and bandwidth to commit crime. Without Tor in the world, the bad actors will find another tool to achieve their objectives. With Tor in the world, the good actors will continue to have a tool and experts committed to providing safe online channels of communication.
The Tor Project’s place in the arena of privacy and anonymity is complex and ever changing. Therefore, in the interest of space, this article will explore a few key areas that illustrate Tor’s broad global impact. Much more about us, our mission, and our projects can be found on www.torproject.org or, even better, join the ongoing conversations with Tor developers on IRC at #tor-dev.
HOW TOR WORKS
Relay operators are really the heart of the Tor network. We could not exist without the loyalty and passion of these 3,000-plus volunteers. By downloading Tor’s software and setting up as a relay, operators manage a constant flow of online traffic through the Tor network every day. Each relay has a direct impact on the Tor network’s ability to run better and faster for all users. A Tor relay can run on almost any operating system, but currently runs best on Windows, Mac, Linux, and on Amazon cloud services. Figure 1 illustrates the steady growth of relays and Tor bridges from March 16 through June 14, 2013.
ANONYMITY LOVES COMPANY
Ongoing trends in law, policy, and technology threaten anonymity as never before, undermining our ability to speak and read freely online. These trends also undermine national security and critical infrastructure by making communication among individuals, organizations, corporations, and governments more vulnerable to analysis. Each new user provides additional diversity, enhancing Tor’s ability to put control over your security and privacy back into your hands. (For more on how anonymity loves company I would direct readers to “On the Economics of Anonymity,” written by Roger Dingledine, Paul Syverson, and Alessandro Acquisti; http://freehaven.net/doc/fc03/econymics.pdf.)
The Tor community is often portrayed in the media as being comprised only of immoral, unjust, and malicious
CONCLUSION
Recent events remind us all how important and complex privacy and anonymity are in today’s digital environment.
decisions. Our technical team will con-
organizations that all share a common vision: Privacy is
place in daily life.
progress, but we need your help. Please consider running a relay, volunteering, as a developer, attending an event, or contacting us for training or education.
© 2013 ACM 1529-4972/13/09 $15.00
By Philip C. Ritchey
DOI: 10.1145/2510126
Alice and Bob have been imprisoned under the guard of the watchful warden Wendy.
The two of them want to collaborate on an escape plan, but the only means of
communication they have is a public channel that is vigilantly monitored by Wendy.
On top of that, Wendy will not allow them to use cryptography to obscure the contents
of their messages. If she sees a message that she cannot read, she will throw both Alice and
Bob into solitary confinement and they will not be allowed any further communication at all.
What can they do to communicate covertly over a public channel?
Alice and Bob’s situation can be resolved through steganography: the art and science of sending messages in such a way that only the sender and the intended recipient are aware of the existence of the message. Whereas cryptography is concerned with keeping the contents of a message secret, steganography is concerned with keeping the existence of a message secret. In the past, steganography has been discounted as being security-by-obscurity, which is to say: No security at all. However, modern steganography does not rely on methodological obscurity, and equivalent notions of security to those used in cryptography can now be proven for steganographic techniques. Despite having once been regarded as a defective form of cryptography, steganography in the digital age is almost as indispensable as cryptography. From watermarks for use in digital rights management and intellectual property protection, to protecting anonymity online, to censorship resistant technologies, to fingerprinting digital objects for traitor tracing and authentication, steganography is everywhere.
The standard example of digital steganography is hiding secret data in images by replacing the least significant bit of each pixel with a bit of secret data. The resulting change in the image is imperceptible to a human observer, and an 8-megapixel grayscale image can hold one megabyte of secret data without noticeably reducing image quality. The spacing between words or lines in a text can be used to hide information as well. Many network protocols and file formats have unused (or misused) fields that can hold secret data. The possibilities are really only limited by one’s imagination.
HIDING INFORMATION IN GAMES
Much to the dismay of Chess-by-mail players during World War II, the imagination of the censorship offices of the
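The least-significant-bit technique mentioned above as the standard example can be sketched as follows. This is a minimal illustration that treats an 8-bit grayscale image as a flat sequence of bytes; the function names are hypothetical, and no image file format or encryption step is handled here.

```python
def embed_lsb(pixels, secret):
    """Hide `secret` in the least significant bit of successive pixels."""
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for the secret")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(stego)


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` of hidden data back out of the pixel low bits."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for pixel in pixels[8 * i: 8 * i + 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


def capacity_bytes(n_pixels):
    """One hidden bit per pixel means one hidden byte per eight pixels."""
    return n_pixels // 8


if __name__ == "__main__":
    cover = bytes(range(256)) * 4          # stand-in for real pixel data
    stego = embed_lsb(cover, b"escape at dawn")
    print(extract_lsb(stego, 14))
    print(capacity_bytes(8_000_000))
```

Each pixel value changes by at most 1 out of 255, which is why the change is imperceptible, and the capacity arithmetic reproduces the figure quoted in the text: 8,000,000 pixels at one bit each is 1,000,000 bytes.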
thing you write because you’re not allowed to encrypt your messages, you will probably end up playing Tic-Tac-Toe and Wendy will think nothing of it. The key to making Tic-Tac-Toe, or any other game, work as a covert communication channel comes from
Let’s assume Alice is playing X, so she goes first. With her first move, she must choose which space to mark. With her first choice, Alice will send some bits of secret information. The number of bits she can send depends on the number of available options
sends the correct bits. Let’s assume Alice is trying to send data that begins with bits 11001. We can label each available move with a C-bit string that represents the meaning of the move, i.e. what bits will be sent to Bob by selecting that move. The easiest way to
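One way to label moves with C-bit strings can be sketched as below. The labelling rule is an assumption made for illustration, since the article’s own rule is cut off above: with n free squares, C is taken as the largest whole number of bits that n options can carry, and the first 2^C free squares in board order receive the labels 0 to 2^C - 1 in binary. Function names are hypothetical.

```python
def bits_per_move(n_available):
    """Largest C with 2**C <= n_available (0 when only one move is left)."""
    return n_available.bit_length() - 1


def choose_move(available, bits):
    """Sender: pick the square whose label equals the next C secret bits.

    Returns the chosen square and the bits that remain to be sent. If the
    secret runs out mid-label it is padded with zeros, so the receiver is
    assumed to know the message length.
    """
    c = bits_per_move(len(available))
    chunk = bits[:c].ljust(c, "0")
    index = int(chunk, 2) if c else 0
    return available[index], bits[c:]


def read_move(available, move):
    """Receiver: recover the C bits carried by an observed move."""
    c = bits_per_move(len(available))
    index = available.index(move)
    if index >= 2 ** c:
        raise ValueError("move carries no label under this scheme")
    return format(index, "b").zfill(c) if c else ""


if __name__ == "__main__":
    board = list(range(9))                 # squares 0..8, all free
    move, rest = choose_move(board, "11001")
    print(move, rest, read_move(board, move))
```

With nine free squares the first move carries three bits, so a secret beginning 11001 selects the square labelled 110 and leaves 01 for a later move. Always picking moves this way ignores how a person would actually play, which is exactly the weakness the article goes on to discuss.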
quirement and be left with an average of 11.815 bits of total capacity per game. And so, yet again, Wendy finds herself suspecting Alice and Bob are using Tic-Tac-Toe to share secrets, yet she is still unable to determine when they are doing so. I’ll tell you now that she may be down, but she’s far from out.
Right now, there should be a screaming objection in your head: Why doesn’t Wendy just decode what Alice sends? If she knows the method, she can extract the bits just as easily as Bob can, and if she sees anything meaningful she can prove they were using the game as a covert channel. That is exactly why security-by-obscurity is no security at all. Therefore, Alice and Bob need to introduce an extra step and a secret key into their system so that only someone with the secret key will be able to correctly decode the gameplay. Simply put, Alice and Bob should encrypt their messages before hiding them in the game. That way, when Wendy extracts the bits from the gameplay, she will get something that appears to be random. To determine if the covert channel is being used, Wendy must be able to distinguish between encrypted data and random data when analyzing the extracted bits. While this is, in theory, possible, it requires Wendy to wait until she has a very large number of bits before she can make a decision. Alice and Bob may finish their communication before Wendy even has enough bits to be able to test them for randomness.
What else can Alice and Bob do to make Wendy’s job even more difficult? Imagine Alice and Bob each have a coin and whenever one coin is flipped, the other is flipped at the same time and the result is the same for both. If the coin lands on heads, then the next move is used to hide secret bits; otherwise the move is made normally and does not hide any secret bits. If the coin is fair (50 percent chance of heads), then Alice and Bob will see their capacity cut in half since only half of the moves will be hiding data. But, Wendy’s detection task now becomes harder since the covert channel usage is diluted and dirty moves are in amongst clean moves. Her accuracy will drop from above 95 percent to below 75 percent when Alice and Bob, playing random, switch from hiding data in every move to hiding it in only half of the moves. If they use a coin that lands on heads 33 percent of the time, so that a third of all moves are hiding data, Wendy’s accuracy drops to 50 percent. This is equivalent to guessing, meaning Wendy cannot distinguish between dirty and clean gameplay. Meanwhile, Alice and Bob realize an average total capacity of 5.249 bits per game. This illustrates the tradeoff between capacity and security. By giving up some capacity, Alice and Bob can significantly increase their security against Wendy, from having almost no security to having nearly perfect security. Note, however, this assumes Wendy does not realize Alice and Bob are not hiding data in every move. If Wendy figures out how often their coin lands on heads, she can adjust her detector to take that into account and reclaim some of her advantage. For this reason, the secret key must include not only the seed for the random number generator, but also the parameter which controls how often the coin lands on heads. Note, also, that all of this doesn’t even matter since it assumes Alice and Bob are playing stupidly again. If Wendy is checking for stupid
a bit of secret data.
ARTIFICIAL VERSUS AUTHENTIC PLAY
Wendy’s task all along has been to distinguish between gameplay that is, and is not, used to hide secret data. An equivalent statement is Wendy is attempting to distinguish between gameplay generated by a computer (artificial gameplay) and gameplay generated by a human (authentic gameplay). Authentic gameplay is clean by definition, since, in order to hide data in a move, the move is not selected by the human, but rather by the secret data and the stego-system. It is impossible for you, a human, to make the moves you want to make while at the same time sending the secret data you need to send. Therefore, if a move is used to transmit data, then we know a computer generated the move. Logically, artificial moves are necessary, but not sufficient, for hiding data in games. However, because Wendy is the warden and she makes the rules, Wendy can decide the only gameplay allowed is authentic gameplay. That is, because it is impossible for Wendy to tell the difference between artificial and dirty gameplay, she must consider artificial gameplay dirty even if it is not being used to send secrets. If Wendy can accurately classify moves as authentic or artificial, then she can also accurately classify moves as clean or dirty.
In artificial intelligence, there is a name for the task of distinguishing between an artificial source and an authentic source: the Turing test. Turing originally formulated the test using natural language, since humans are very good at using and understanding language and computers are, so far, not that good. Natural language processing is a hard AI problem and the Turing test is effectively the true test of whether we’ve solved it. What about
a Turing test for gameplaying? Can you, as a human, play chess against an online opponent and determine whether your opponent is a human or a computer? Probably not with accuracy anywhere close to the accuracy with which you could distinguish between a human and a computer conversational partner. This is likely due to the fact that you use language every day but you do not play chess every day. But does a Turing test for gameplay even make sense? It must, because that is precisely what Wendy needs. She needs to be able to look at a conversation between two players who use the language of gameplay and decide whether they are humans or computers. To do this, she needs a better understanding of how humans play games and what human-generated gameplay looks like.
Figure 3. With focused research and development, computer players could defeat the best human players at most games, but there are games at which they may never be able to beat top humans. While a computer may never become a champion at Mao or Calvinball, creating a computer that has the ability to play such games, without boring their human opponent, will be a considerable achievement. There is more to playing games than simply winning.
After collecting some human gameplay data for Tic-Tac-Toe, we used it to build a classifier for Tic-Tac-Toe gameplay. Our classifier achieves 95 percent accuracy when shown 10 games for all six computer strategies that we have tested so far. So, whether Alice and Bob’s stego-system is playing randomly, or greedy, or optimal, or a hybrid of these, Wendy can observe just 10 games and accurately determine whether they were played by humans or computers. Trying to send a secret in fewer than 10 games, which still gives Wendy accuracy better than guessing, means Alice can only send a few words to Bob. There’s no hope for Alice to pack more bits into each game, so, if she needs to send more information, she must increase the number of games Wendy needs to observe in order to make a decision.
The next step for Alice and Bob is
to design a strategy that more closely
mimics the actual gameplaying be-
havior of humans, putting them in
the same boat as Wendy. Both sides
want to have the best, most accurate,
model of human gameplaying behav-
ior so that they can generate realistic
gameplay or detect the slight varia-
tions that betray artificially generated
gameplay. In fact, this type of work
has been going on for a long time in
video games, where developers are
continually striving to provide more
realistic non-player characters and
opponents for human players to in-
teract with and play against. But, no
matter what the game, from simple
pen-and-paper games to MMORPGs,
where there is choice, there is infor-
mation; any game can be used to send
secret messages.
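The synchronized coin described earlier, where the secret key holds both a seed and the probability of heads, can be sketched with a seeded pseudorandom generator. The function name and key layout here are hypothetical, and Python’s `random` module is not cryptographically secure; it is used only to show how both parties can derive the same schedule of data-carrying moves, and a real system would need a keyed cryptographic generator.

```python
import random


def dirty_move_schedule(seed, p_heads, n_moves):
    """Return one flag per move: True if that move should hide secret bits.

    Alice and Bob each call this with the same shared key (seed, p_heads)
    and therefore obtain identical schedules without communicating.
    """
    rng = random.Random(seed)
    return [rng.random() < p_heads for _ in range(n_moves)]


if __name__ == "__main__":
    key_seed, key_p = 2013, 1 / 3          # the shared secret key (assumed values)
    alice = dirty_move_schedule(key_seed, key_p, 9)
    bob = dirty_move_schedule(key_seed, key_p, 9)
    print(alice == bob, alice)
```

Lowering `p_heads` dilutes the channel: fewer moves carry data, so capacity falls, but the data-carrying moves are harder for Wendy to pick out, which is the capacity and security tradeoff the article describes.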
Acknowledgment
This research has been supported by
the following grants: NSF CNS-0716398
and NSF CCF-0939370.
Biography
Philip C. Ritchey is a graduate student in the Department
of Computer Science at Purdue University. He is a member
of the Center for Education and Research in Information
Assurance and Security and the Center for the Science of
Information. He received his B.S. in computer engineering
from Texas A&M University in 2008. His research interests
include information assurance and security, interactive
artificial intelligence and machine learning, and
computational models of human problem solving.
An Illustrated Primer in Differential Privacy
The vast amounts of data that are now available provide new opportunities to
social science researchers, but also raise huge privacy concerns for data subjects.
Differential privacy offers a way to balance the needs of both parties. But how?
By Christine Task
DOI: 10.1145/2510127
Good grief there is data everywhere now, just everywhere! Our governments, our
doctors, our schools, our Web browsers, our social networks, our cameras, our
phones, our cars, our shoes, and now even our glasses can collect reams and reams
of electronic data about us. Storage is cheap, data-ownership is often poorly defined:
There is so much data about you already scattered around the world. You’re aware of this.
If you’ve made it this far in this issue of XRDS, I expect you’re rather alarmed. But, really,
you should be a little excited too. In the hands of social science researchers this data gives us
a real chance at building a world that
mation about itself all over, it is now and forever more in “verbose” mode, and if we listen, we can learn something useful. We can begin to really understand our world in an objective, quantitative way, and we can start to make it a better place. I drew you a picture to give you an idea.
[Illustration labels: WITH DATA! Medical Records! Track The Spread of Disease! Stop Epidemics! Sensitive Data! Better Understand Struggling Families and Communities! Diagnose Problems With Failing Students and Schools! Location Data! Improve Public Transportation! Address Traffic Congestion!]
If Bob decides to fill out a survey, he will answer “yes” to two of the questions (he
had some wild days in his youth), and so he changes two of Alice’s counts by one
each. One plus one is two, and thus we say Bob’s total impact on Alice’s results is
two. If you look at it, you’ll see that in this survey it’s impossible for any person to
affect more than two of the counts. Bob is making the largest possible difference
on the results, so if we can protect Bob, we’ll also protect everyone else in the
data set. We call the largest difference any one person can make on the analysis
results, the “global sensitivity” of the analysis. This is what we need to cover up
with random noise. But how should we go about adding that noise? Alice is going to
sample a random value from the Laplace distribution. (It’s OK. It’s not much more
complicated than rolling dice to get a random number.)
[Figure: two panels, each titled “What’s the probability distribution look like?”, with probability (0.0 to 0.3) on the vertical axis and values from -7 to 6 on the horizontal axis. The first panel shows a die roll, the second the Laplace distribution.]
If we add this noise to our true result? (die roll) If our real answer were 55, we’d get:
56, with probability (1/6)
57, with probability (1/6)
58, with probability (1/6)
59, with probability (1/6)
60, with probability (1/6)
61, with probability (1/6)
If we add this noise to our true result? (Laplace) If our real answer were 55, we’d get:
A value between 50 and 60, with 92% probability
A value greater than 60, with 4% probability
A value less than 50, with 4% probability
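Alice’s noise-adding step can be sketched in Python as follows. This is a minimal illustration, not a production mechanism: the function names are hypothetical, and the privacy parameter `epsilon` is an assumption, since the passage above does not give a value. Setting the noise scale to sensitivity divided by epsilon, with the survey’s global sensitivity of 2 and epsilon equal to 1, gives a scale of 2, which is consistent with the 92 percent figure in the panel above.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centered at zero.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_counts(true_counts, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to every count."""
    scale = sensitivity / epsilon
    return [count + laplace_noise(scale, rng) for count in true_counts]


def prob_within(distance, scale):
    """Probability that Laplace(0, scale) noise has magnitude below `distance`."""
    return 1 - math.exp(-distance / scale)


if __name__ == "__main__":
    rng = random.Random()
    print(noisy_counts([55, 12, 30], sensitivity=2, epsilon=1.0, rng=rng))
    print(round(prob_within(5, 2), 3))
```

Unlike the die, whose noise is always between 1 and 6 and so reveals a lot about the true answer, Laplace noise is centered on zero and can occasionally be large, which is what hides any one person’s contribution.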
Cynthia Dwork
on Differential Privacy
Distinguished Scientist at Microsoft Research, Dr. Cynthia Dwork,
provides a first-hand look at the basics of differential privacy.
By Michael Zuba
DOI: 10.1145/2510128
Large-scale statistical databases, specifically those that contain aggregate information
about a population, are becoming an ever-important resource in our world. These
databases are valuable assets to researchers, businesses, and governments. Researchers
can use them to try and discover commonalities in a population for diseases, businesses
can use them to understand how to effectively market their products and services, and
governments are provided with knowledge about their citizens. Differential privacy techniques
are applied to these databases in order to minimize the risk of a person or group being able
to associate information in these databases with a specific person. Differential privacy is essentially a “definition” of privacy for statistical databases. In this interview, Dr. Cynthia Dwork shares with us a first-hand look at this emerging topic of differential privacy.
Michael Zuba: For those who are not in the domain, how would you explain differential privacy?
Cynthia Dwork: Differential privacy is a guarantee, made by a data curator to a data owner: No additional harm (or benefit) will come to you as a result of permitting your data to be used in a statistical study. In a little more detail, differential privacy is a mathematical statement about a data analysis algorithm, which in English says that the outcome of any analysis is essentially equally likely to be observed, independent of whether any individual or small group of individuals opts into or opts out of the data set. The probability in “equally likely” is over random choices made by the algorithm.
MZ: What inspired the idea of differential privacy?
CD: Differential privacy was inspired by two negative theoretical results. The first showed that, roughly speaking, “overly accurate” answers to “too many” questions is completely NON-private. In this particular case, “overly accurate” meant something like “accurate to within smaller than the sampling error” and “too many” was roughly the number of people in the data set. But if the data set is very large (Internet scale), then asking this many questions is probably impractical. This suggested an investigation of what can be achieved if the number of questions is curtailed, which quickly led to a precursor of differential privacy.
The second negative result concerned mathematical definitions of privacy. One way to try to define privacy is to say that an adversary, interacting with a privacy-preserving database, learns nothing about an individual that the adversary could have learned without interacting with the database. But this leads to problems whenever the database is actually useful: If the adversary is from Mars and believes that all humans have two left feet, and the adversary then learns from the statistical database that almost all humans have one left foot and one right foot, then the adversary has learned something about me (and about most humans) not learnable by a Martian without access to the database. Should this be viewed as a violation of my privacy? The whole point of statistical databases is to learn about the population as a whole, meaning that learning facts about the population, such as “smoking causes cancer in humans,” is actually the goal. So instead we ask that the adversary learn no MORE about me than the adversary would have learned were I not in the database. And that’s exactly differential privacy.
MZ: Can you give some examples of where differential privacy could be used?
and machine learning. My hope is that differential privacy can democratize research, giving to members of the public who are not “credentialed researchers” the ability to learn about the population
are formalized (if at all) and the assumptions about the information and computational power to which the adversary has access. Are the assumptions reasonable? See what the
Michael Zuba is a Ph.D. candidate at the University of Connecticut. His research is on underwater acoustic communication and networking. He is the recipient of an NSF EAPSI fellowship and a Department of Education GAANN fellowship in advanced computing.
© 2013 ACM 1529-4972/13/09 $15.00
Privacy Research

DOI: 10.1145/2517256

As we witness questions regarding the secrecy of our emails and voice calls break loose into the realm of politics, little attention is paid to the researchers and engineers who bear the burden of keeping our digital estates safe from prying eyes. We sat down with Jessica Staddon, privacy research manager at Google, who offers rare insight into what it takes to become a privacy scientist for one of the world’s best-known software companies.

Staddon’s research career path started out in security. In the years preceding the dot-com bubble, she was a Ph.D. candidate in Berkeley’s Mathematics Department, working on the management of encryption keys. “I was drawn to mathematics because I was really interested in problems that are easy to describe; you don’t need a lot of terminology or background to state the problems, but they are very hard to solve.”

After being awarded her Ph.D., she sought a broader work scope. Staddon joined RSA Labs, then moved on to Bell Labs, and later to Xerox PARC before finally joining Google in 2010. Each time, she said, her interdisciplinary outlook on computer science expanded, and she was exposed to increasingly dynamic and exciting work environments.

It was during the mid 2000s that her prior work in a progressively interdisciplinary applied research environment allowed her to recognize the emerging field of privacy and consequently shift her research interests from security. She recalls “people started realizing that although privacy and security are related, they are not the same thing.” Once she started focusing on privacy, she witnessed the establishment of the field, as we know it today, with the introduction of numerous DARPA and IARPA programs as significant historical milestones. “A lot of the work was in data mining and pattern detection, maybe identifying potential terrorists, but there was also this increasing awareness that this needed to be paired with we needed to protect the privacy of the users in some way, in addition to finding those aggregate patterns.” It was through this quest for safeguarding user privacy that inference detection was developed, which she proudly acknowledged as the most evolved of her scientific contributions.

Staddon describes Google as “the epitome” of applied research environments. “Google is a fantastic place to work on privacy. It really values the work tremendously, it is really concerned about privacy as a problem area, and... from personal experience, it has sort of a broader sense of what it takes to tackle privacy.” With Google under constant public pressure to maintain impeccable privacy standards, I wanted to know what she thought were the most rewarding aspects of her job.

“First, I think I would put the impact out there,” she replied. “I never feel that I’m going to come to work and not make any difference. A lot of places have these sort of active research labs, where

issue at Google. Here, research is very embedded, and technology transfer happens as part of your job.” Secondly, she expressed a deep appreciation toward her colleagues, who create an inspiring and engaging work environment within the company culture.

Despite not having remained in academia after her Ph.D., her constant stream of publications throughout the years serves to dispel the myth that working in the industry rarely gets you published. This fear of lack of visibility intimidates countless young researchers each year. “It is hard to imagine while you are in grad school, that there are other ways to have impact on the world which are not publication-oriented,” commented Staddon, adding she strongly encourages grad students to find other ways to have impact on the world. “Here at Google, you develop something and it can go on to a product that millions of users experience all the time. So the research is more oriented towards user impact and product impact, with publications being an offshoot of that, as opposed to the main goal. However, publishing is understood to be an important part of the job,” she reassured me.

Looking back on her career, the importance of a broad scientific outlook easily stands out as valuable advice for emerging privacy researchers. “Certainly in privacy, it is often the case that ideas which are really established and second nature in other disciplines can really have an impact.” It is no wonder that in the light of the recent NSA scandal, she believes intensified media attention will prove healthy to research and help invigorate progress. “I do think that for research, overall that’s a good thing, with different corners of the world all thinking about this issue. I would like to see the discussion even broader than it is, actually. In the sense that these conversations cause more people to think about these things, I do think that it’s valuable.”

© 2013 ACM 1529-4972/13/09 $15.00
The recent spying disclosures from Edward Snowden are just the latest front of a long-standing debate within our society. As we struggle to balance computer security, privacy, and the public good, a research lab at Carnegie Mellon University remains dedicated to addressing the broad array of challenges collectively called “usable privacy and security.”

The CyLab Usable Privacy and Security Laboratory (CUPS) brings together a diverse team of researchers including computer scientists, engineers, public policy experts, and economists at Carnegie Mellon University. The director of CUPS is Professor Lorrie Faith Cranor, who is a faculty member at both the Institute for Software Research in the School of Computer Science and the Engineering and Public Policy department of the College of Engineering.

Researchers work on issues at the convergence of security, privacy, and usability. One example of the research conducted at CUPS is an ongoing passwords project. Passwords are becoming increasingly ubiquitous as we entrust them with more and more of our data. They are often the only barrier keeping out a potential attacker. Researchers at CUPS are investigating how to provide policies and guidance for users creating passwords. The goal is for organizations to be able to give their users guidelines that lead to secure yet memorable passwords. CUPS password research has ranged from in-person surveys to Mechanical Turk studies with 12,000 participants. Researchers at CUPS have also explored how to measure and compare the strength of passwords in a meaningful way, and this work has included the development of a technique to determine when a given password-cracking algorithm would crack a given password in much less time than actually running the algorithm.

Another arm of CUPS research involves understanding how users make privacy decisions, specifically looking at whether website privacy policies can be presented in a more human-readable form, such as using a design inspired by food nutrition labels. Recent work has examined what leads

search to gain a better understanding of how the practice of online behavioral advertising affects user privacy. In online behavioral advertising, companies track users online to show them targeted ads. What options do advertising companies provide for users not to be tracked? Can users manage to use the available options to protect their privacy? Are online advertisers following their own disclosure rules? These are among the questions CUPS lab members have addressed through this research.

CUPS researchers have delved into many other areas as well. Work has examined how users react to different computer warnings, such as the warnings browsers display when a user attempts to navigate to a website with an invalid certificate. Other areas of interest include looking at how users manage the data and files on their devices in the home environment. Still other work has investigated how to train users to avoid phishing attacks. The researchers working on that project developed an online training game and other anti-phishing tools. They ended up starting a company called Wombat Security to commercialize these tools and sell them to companies around the world to train their employees.

Students contribute meaningfully to all phases of CUPS research. With faculty guidance, students take part in creating and developing study instruments and software, gathering data, performing data analysis, and writing up the results for publication. In fact, many research projects within CUPS have their roots in student ideas and class projects.

The latest news from CUPS is that many of its faculty are collaborating to offer a unique master’s degree for privacy engineers. Called the Master of Science in Information Technology-Privacy Engineering, this degree is offered jointly through Carnegie Mellon University’s School of Computer Science and College of Engineering. The one-year graduate program is intended to prepare its students to create and develop systems to protect user privacy through a combination of classes on privacy and security and a practical hands-on capstone project. The first class in this program began Fall 2013.

As you can see, the CUPS lab is engaged in many different strands of research. These diverse projects, however, all share a common objective: CUPS seeks to navigate the increasingly complex space of computer security and privacy in order to help users be more secure and better able to make privacy decisions. CUPS research also influences public policy makers. CUPS faculty members have been invited to testify at Congressional hearings and are regularly called upon to advise U.S. federal agencies on privacy issues. Further, CUPS research is published at the top security and human-computer interaction conferences, as well as at SOUPS, the Symposium On Usable Privacy and Security, which was founded by CUPS director Lorrie Faith Cranor.

[Photo: Researchers from the CyLab Usable Privacy and Security Laboratory at CMU.]

Biography
Rich Shay is a Ph.D. student at Carnegie Mellon University in the School of Computer Science. He conducts research on usable privacy and security. His current research focuses on password-composition policies, as well as online privacy and online behavioral advertising.
back

WLAN Security

It is entirely likely that you, reader, are within 10 feet of a device that can communicate on a wireless local area network (WLAN). Perhaps there is even one in your hand or pocket right now. It’s no big surprise that much of the world is becoming absolutely inundated with such devices, and it’s been a long time coming. With such a drastic increase in the number of these devices, however, comes a drastic increase in the amount of private information that is broadcast on radio waves for anyone who is willing to listen. Fortunately, we have ways of protecting this information, but it hasn’t always been so secure.

The first wireless network was developed by Dr. Norman Abramson at the University of Hawaii. It was called ALOHAnet and included seven computers across four islands. ALOHAnet pioneered a lot of interesting technology and some ideas that are in use today, but most modern WLANs are implemented using the IEEE 802.11 standards. The first of these was released in 1997 and included a clause describing Wired Equivalent Privacy (WEP). WEP used the RC4 stream cipher for encryption and a 24-bit initialization vector. The way WEP uses RC4 and the short initialization vector, however, was shown to be exploitable in 2001 by Scott Fluhrer, Itsik Mantin, and Adi Shamir. This fact, along with further demonstrated exploits, led to the development of Wi-Fi Protected Access (WPA) and Wi-Fi Protected Access II (WPA2). WPA was meant to be a temporary replacement for WEP until WPA2 became available in 2004 as part of the 802.11i amendment. WPA implemented much of the 802.11i amendment and adopted the Temporal Key Integrity Protocol (TKIP), which generates a new key for each packet. This made WPA secure from the types of attacks that had previously plagued WEP. WPA2 went further and introduced the Counter Cipher Mode with Block Chaining Message Authentication Code Protocol (CCMP) based on the Advanced Encryption Standard (AES) block cipher. The ratification of the 802.11i draft standard, which marked the release of WPA2, also officially deprecated WEP. A security flaw was revealed in 2011 for routers that had the Wi-Fi Protected Setup (WPS) feature enabled, but WPA2 is still far more secure than other options
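To make the weakness of a 24-bit initialization vector concrete, the Python sketch below implements RC4 and a simplified WEP-style per-packet key (the IV prepended to the shared secret). It is an illustration under stated assumptions, not the Fluhrer, Mantin, and Shamir attack: it leaves out WEP's integrity check and frame format, the key and messages are made up, and the names rc4, wep_like_encrypt, and xor are hypothetical. It shows a simpler consequence of the short IV: once an IV repeats, two packets share a keystream, and XORing the ciphertexts cancels it.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def wep_like_encrypt(iv: bytes, secret: bytes, plaintext: bytes) -> bytes:
    """Simplified WEP: RC4 keyed with the 3-byte IV followed by the secret."""
    assert len(iv) == 3            # 24-bit initialization vector
    return rc4(iv + secret, plaintext)


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


if __name__ == "__main__":
    secret = b"\x01\x02\x03\x04\x05"          # made-up 40-bit key
    iv = b"\x00\x00\x2a"                      # the same IV used twice
    p1, p2 = b"attack at dawn!", b"meet me at noon"
    c1 = wep_like_encrypt(iv, secret, p1)
    c2 = wep_like_encrypt(iv, secret, p2)
    # The keystream cancels: an eavesdropper learns p1 XOR p2 without the key.
    assert xor(c1, c2) == xor(p1, p2)
    # Knowing one plaintext then reveals the other.
    assert xor(xor(c1, c2), p1) == p2
```

With only 2^24 (about 16.7 million) possible IVs, repeats are unavoidable on a busy network, which is one reason per-packet keying was redesigned in TKIP.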
Zero-Knowledge Proofs
By Marinka Zitnik

A zero-knowledge proof allows one person to convince another person of some statement without revealing any information about the proof other than the fact that the statement is indeed true. Zero-knowledge proofs are of practical and theoretical interest in cryptography and mathematics. They achieve a seemingly contradictory goal of proving a statement without revealing it. We will describe interactive proof systems and some implications that zero-knowledge proofs have for complexity theory. We will then conclude with an application of zero-knowledge proofs in cryptography, the Fiat-Shamir identification protocol, which is the basis of current zero-knowledge entity authentication schemes.

Fiat-Shamir Identification Protocol

Zero-knowledge proofs in cryptography have natural applications for
probability 1. If Peggy (or an impostor) does not possess the secret s, then she can provide only a random guess of y = r or y = rs. Honest Victor will reject with probability ½ in every iteration. That implies an overall probability of 2^(-t) that cheating Peggy will not be caught and as a result the Fiat-Shamir protocol is sound. The Fiat-Shamir scheme also upholds the property of zero-knowledge. The only information revealed in each round is the x and y. Such pairs (x, y) could be simulated by choosing y randomly and then computing the corresponding x. These pairs are computationally indistinguishable from pairs generated by the protocol.

Definition 2: Initialization and identification phases of Fiat-Shamir identification protocol in Python.

    # TrustedCenter, Prover, and Verifier are the classes from Definition 1.

    def fiat_shamir_initialization(n, s):
        tc = TrustedCenter(n)      # trusted center holding the modulus n
        peggy = Prover(tc, s)      # Peggy holds the secret s
        return tc, peggy

    def fiat_shamir_identification(t, tc, peggy):
        victor = Verifier(tc)
        for _ in range(t):         # t rounds of the protocol
            victor.set_up(peggy.generate())
            c = victor.verify(peggy.response(victor.challenge()))
            if not c:              # one failed round is enough to reject
                print('Reject')
                return
        print('Accept')
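Definition 1 is not reproduced in this section, so the self-contained Python 3 sketch below shows one way the classes used in Definition 2 might look. Everything beyond the class names TrustedCenter, Prover, and Verifier and the method names generate, challenge, response, set_up, and verify is an assumption. The round structure (commitment x = r^2 mod n, challenge bit e, response y = r * s^e mod n, check y^2 = x * v^e mod n) follows the standard Fiat-Shamir identification scheme, and the Impostor class and run_identification helper are hypothetical additions that illustrate the soundness argument above. The module-level random generator is used only to keep the demo short; a real deployment would need a cryptographically secure source and a large modulus.

```python
import math
import random


class TrustedCenter:
    """Publishes the modulus n and stores each prover's public value v."""
    def __init__(self, n):
        self.n = n
        self.public = {}


class Prover:
    """Honest Peggy: knows s with v = s^2 mod n."""
    def __init__(self, tc, s, name="peggy"):
        if math.gcd(s, tc.n) != 1:
            raise ValueError("s must be coprime to n")
        self.n, self.s = tc.n, s
        tc.public[name] = pow(s, 2, tc.n)

    def generate(self):
        self.r = random.randrange(1, self.n)      # fresh secret per round
        return pow(self.r, 2, self.n)             # commitment x = r^2 mod n

    def response(self, e):
        return self.r * pow(self.s, e, self.n) % self.n   # y = r * s^e mod n


class Impostor:
    """Knows only v; guesses the challenge bit before committing."""
    def __init__(self, tc, name="peggy"):
        self.n, self.v = tc.n, tc.public[name]

    def generate(self):
        guess = random.randrange(2)
        z = random.randrange(1, self.n)
        while math.gcd(z, self.n) != 1:
            z = random.randrange(1, self.n)
        # Pick (x, y) so that y^2 = x * v^guess (mod n) without knowing s.
        self.y = z * pow(self.v, guess, self.n) % self.n
        return pow(z, 2, self.n) * pow(self.v, guess, self.n) % self.n

    def response(self, e):
        return self.y            # only passes if e equals the guess


class Verifier:
    def __init__(self, tc, name="peggy"):
        self.n, self.v = tc.n, tc.public[name]

    def set_up(self, x):
        self.x = x

    def challenge(self):
        self.e = random.randrange(2)
        return self.e

    def verify(self, y):
        # Accept if y^2 = x * v^e (mod n); y = 0 is rejected outright.
        expected = self.x * pow(self.v, self.e, self.n) % self.n
        return y != 0 and pow(y, 2, self.n) == expected


def run_identification(t, tc, prover):
    """Return True if the prover passes all t rounds."""
    victor = Verifier(tc)
    for _ in range(t):
        victor.set_up(prover.generate())
        if not victor.verify(prover.response(victor.challenge())):
            return False
    return True


if __name__ == "__main__":
    tc = TrustedCenter(35)                  # n = 7 * 5
    peggy = Prover(tc, 16)                  # v = 16^2 mod 35 = 11
    print(run_identification(10, tc, peggy))        # honest prover passes
    trials = 10000
    fooled = sum(run_identification(1, tc, Impostor(tc)) for _ in range(trials))
    print(fooled / trials)                  # close to 1/2 per round
```

The Impostor also doubles as the simulator from the zero-knowledge argument: it produces valid-looking (x, y) pairs by fixing y first and deriving x, which works only because it chooses the challenge bit itself.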
Definition 3: An example run of the Fiat-Shamir identification protocol. Suppose the trusted center selects an RSA-like modulus n=35, Peggy secretly chooses s=16, and Victor requires t=10 successful iterations of the protocol.

    >>> tc, peggy = fiat_shamir_initialization(35, 16)
    >>> fiat_shamir_identification(10, tc, peggy)
    Accept

Definition 1 is a straightforward implementation of the described Fiat-Shamir identification protocol. Let us see an example with p = 7 and q = 5. Then n = 35 and n is published to a trusted center. Let us assume Peggy secretly chooses s = 16, which is coprime to 35. She publishes v = s^2 mod n = 11 to the trusted center. Victor requires 10 successful
rounds of the protocol in order for him to accept (see Definition 3).

Zero-Knowledge Proofs and NP Complexity Class

Zero-knowledge proofs exist for decision problems, such as graph isomorphism, 3-colorability, quadratic residuosity, and non-residuosity. Readers may now ask for which problems we can design zero-knowledge proofs. A powerful and general result exists [4] that informally says that any language for which membership can be efficiently verified can be proved in zero-knowledge. Zero-knowledge proofs exist for all problems in NP, provided that one-way functions exist. That result is utilized for the design of cryptographic protocols, because it forces parties to behave according to predetermined standards.

Conclusion

Zero-knowledge proofs have some fascinating applications. We might use them to enforce honest behavior. For instance, parties in an interactive game could prove they are not cheating. At the outset of the game, parties commit to the secret inputs and random coins of the prescribed tools they are supposed to use. They then carry out the game procedures and with each output message they prove to each other in zero-knowledge that the message was honestly obtained under the committed inputs and random coins. Properties of zero-knowledge systems guarantee us that participants have to act honestly in order to be able to provide a valid proof (i.e., soundness) and the proofs cannot compromise the privacy of their secret inputs (i.e., zero-knowledge).

Zero-knowledge systems are useful to assure deniability and prevent unwanted transfer of information. Suppose Alice wants to prove to her classmate Bob that she did her essay homework. One way to do this is for Alice to show her homework to Bob. However, what if Bob is ignorant and wants to cheat by copying Alice’s essay? The problem is that an essay identifying Alice as an author is transferable. Instead, Alice should prove to Bob using zero-knowledge that she did the work. This is an NP-statement since the work is a valid witness, which Alice has in her possession. Bob will believe the proof, but he will not be able to convincingly transfer the transcript of that proof to anybody else. For all we know, Bob could have created the encoded transcript of the homework on his own by running a simulator. In other words, Alice’s proof is deniable, in that she can plausibly claim she was not responsible for producing it.

References
[1] Stinson, D. R. Cryptography: Theory and Practice. Chapman & Hall, Boca Raton, FL, 2005.
[2] Goldreich, O. and Oren, Y. Definitions and properties of zero-knowledge proof systems. Journal of Cryptology 7, 1 (1994), 1-32.
[3] Fiat, A. and Shamir, A. How to prove yourself: practical solutions to identification and signature problems. Advances in Cryptology-Crypto’86 (1987), 186-194.
[4] Goldreich, O., Micali, S., and Wigderson, A. Proofs that yield nothing but their validity or all languages in NP have zero-knowledge proof systems. Journal of the ACM 38, 3 (1991), 690-728.

© 2013 ACM 1529-4972/13/09 $15.00
BEMUSEMENT
Happy
Birthday
When asked about his birthday, a man said: “The day before yesterday I was only 25 and next year I will turn 28.” This is true only one day in
http://xkcd.com/538/