
Security and Privacy
Machine Learning
Autonomous
Vehicles
Software

APRIL 2021 www.computer.org


IEEE COMPUTER SOCIETY computer.org

STAFF
Editor: Cathy Martin
Publications Portfolio Managers: Carrie Clark, Kimberly Sperka
Publications Operations Project Specialist: Christine Anthony
Publisher: Robin Baldwin
Production & Design Artist: Carmen Flores-Garvey
Senior Advertising Coordinator: Debbie Sims

Circulation: ComputingEdge (ISSN 2469-7087) is published monthly by the IEEE Computer Society. IEEE Headquarters, Three Park Avenue, 17th
Floor, New York, NY 10016-5997; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; voice +1 714 821 8380;
fax +1 714 821 4010; IEEE Computer Society Headquarters, 2001 L Street NW, Suite 700, Washington, DC 20036.
Postmaster: Send address changes to ComputingEdge-IEEE Membership Processing Dept., 445 Hoes Lane, Piscataway, NJ 08855. Periodicals Postage
Paid at New York, New York, and at additional mailing offices. Printed in USA.
Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in
ComputingEdge does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style,
clarity, and space.
Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for
profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-
party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own Web
servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy.
An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-
editing, proofreading, and formatting added by IEEE. For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html.
Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new
collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane,
Piscataway, NJ 08854-4141 or pubs-permissions@ieee.org. Copyright © 2021 IEEE. All rights reserved.
Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons,
provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive,
Danvers, MA 01923.
Unsubscribe: If you no longer wish to receive this ComputingEdge mailing, please email IEEE Computer Society Customer Service at help@computer.org
and type “unsubscribe ComputingEdge” in your subject line.
IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

IEEE Computer Society Magazine Editors in Chief

Computer: Jeff Voas, NIST
Computing in Science & Engineering: Lorena A. Barba, George Washington University
IEEE Annals of the History of Computing: Gerardo Con Diaz, University of California, Davis
IEEE Computer Graphics and Applications: Torsten Möller, Universität Wien
IEEE Intelligent Systems: V.S. Subrahmanian, Dartmouth College
IEEE Internet Computing: George Pallis, University of Cyprus
IEEE Micro: Lizy Kurian John, University of Texas at Austin
IEEE MultiMedia: Shu-Ching Chen, Florida International University
IEEE Pervasive Computing: Marc Langheinrich, Università della Svizzera italiana
IEEE Security & Privacy: Sean Peisert, Lawrence Berkeley National Laboratory and University of California, Davis
IEEE Software: Ipek Ozkaya, Software Engineering Institute
IT Professional: Irena Bojanova, NIST

2469-7087/21 © 2021 IEEE Published by the IEEE Computer Society April 2021 1
APRIL 2021 • VOLUME 7 • NUMBER 4

Security and Privacy
Buying Your Genetic Self Online: Pitfalls and Potential Reforms in DNA Testing
ANDELKA M. PHILLIPS

Policies on Privacy
STEVEN M. BELLOVIN

Machine Learning
Knowledge Graph Semantic Enhancement of Input Data for Improving AI
SHREYANSH BHATT, AMIT SHETH, VALERIE SHALIN, AND JINJIN ZHAO

Towards Explainability in Machine Learning: The Formal Methods Way
FREDERIK GOSSEN, TIZIANA MARGARIA, AND BERNHARD STEFFEN

Autonomous Vehicles
Disruptive Innovations and Disruptive Assurance: Assuring Machine Learning and Autonomy
ROBIN BLOOMFIELD, HEIDY KHLAAF, PHILIPPA RYAN CONMY, AND GARETH FLETCHER

Validation of Autonomous Systems
CHRISTOF EBERT AND MICHAEL WEYRICH

Software
Queens of Code
EILEEN BUCKHOLTZ

Mom, Where Are the Girls?
IPEK OZKAYA

Departments
Magazine Roundup
Editor’s Note: Who’s Doing What with Your Data?
Conference Calendar

Subscribe to ComputingEdge for free at www.computer.org/computingedge.
Magazine Roundup

The IEEE Computer Society’s lineup of 12 peer-reviewed technical magazines covers cutting-edge topics ranging from software design and computer graphics to Internet computing and security, from scientific applications and machine intelligence to visualization and microchip design. Here are highlights from recent issues.

Dynamic Assurance Cases: A Pathway to Trusted Autonomy

The authors of this article from the December 2020 issue of Computer propose a system architecture that facilitates dynamic assurance of autonomous systems embedding machine learning-based components. They also introduce dynamic assurance cases as a generic framework to provide justified confidence in these systems.

Optimal Kernel Design for Finite-Element Numerical Integration on GPUs

This article from the November/December 2020 issue of Computing in Science & Engineering presents the design and optimization of the GPU kernels for numerical integration, as it is applied in the standard form in finite-element codes. The optimization process employs auto-tuning, with the main emphasis on the placement of variables in the shared memory or registers. OpenCL and the first-order finite-element method (FEM) approximation are selected for code design, but the techniques are also applicable to the CUDA programming model and other types of finite-element discretizations (including discontinuous Galerkin and isogeometric). The auto-tuning optimization is performed for four example graphics processors and the obtained results are discussed.

Olivetti ELEA Sign System: Interfaces Before the Advent of HCI

This article from the October–December 2020 issue of IEEE Annals of the History of Computing uses the case of ELEA 9000, the first Olivetti computer series, to demonstrate the close relationship between industrial design, semiotics, ergonomics, and the history of computing. A focus on the Olivetti ELEA series invites scholars to reconsider the history of computer interface design well before the emergence of HCI as a widely recognized field of research. The console and racks of the mainframe computer were designed by Ettore Sottsass Jr. (1917–2007) along with Andries van Onck (1928–2018) at the end of the 1950s. Aiming to launch the ELEA computer on the international market, Olivetti developed the idea of a visual language for human–computer interaction that could be learned by any operator, regardless of their native language. The task of designing this sign system was assigned to Tomás Maldonado (1922–2018). Together with Gui Bonsiepe (b. 1934), Maldonado designed a visual language that incorporated grammatical and syntactic reasoning. Later discarded, the sign system for ELEA prefigured the contemporary use of icons in computer interfaces.

Move&Find: The Value of Kinaesthetic Experience in a Casual Data Representation

The value of a data representation is traditionally judged based on aspects like effectiveness and efficiency that are important in utilitarian or work-related contexts. Most multisensory data representations, however, are employed in casual contexts where creative, affective, physical, intellectual, and social engagement might be of greater value. The authors of this article from the November/December 2020 issue of IEEE Computer Graphics and Applications introduce Move&Find, a multisensory data representation in which people pedaled on a bicycle to exert the energy required to power a search query on Google’s servers. To evaluate Move&Find, they operationalized a framework suitable to evaluate the value of data representations in casual contexts and experimentally compared Move&Find to a corresponding visualization. With Move&Find, participants achieved a higher understanding of the data.

Parallel Urban Rail Transit Stations for Passenger Emergency Management

In this article from the November/December 2020 issue of IEEE Intelligent Systems, a parallel urban rail transit station (URTS) system for passenger emergency management is presented based on the artificial systems, computational experiments, and parallel execution (ACP) approach. The agent-based modeling technology is applied to build the artificial URTS system, which contains the models of people, trains, facilities, events, environments, and center control and decision units. The computational experiments are performed on the artificial system to analyze and evaluate emergency management strategies. The mechanism of parallel execution between the actual system and artificial system is presented to manage and optimize the emergency strategy, which is capable of guiding the actual URTS system through real-time online supervision and adjustment and of providing an active optimization of passenger emergency management.

Signing Blockchain Transactions Using Qualified Certificates

Blockchain technology is increasingly being considered among both private enterprises and public services. However, it poses a challenge with regard to aligning its identity management scheme with the public key infrastructure and the qualified digital certificates issued by qualified trust service providers. To solve this challenge, the authors of this article from the November/December 2020 issue of IEEE Internet Computing present an architecture reference model that enables enterprises and public services to leverage blockchain technology by integrating qualified electronic signatures with blockchain transactions. The evaluation of the architecture reference model is provided through the design of a blockchain-based trusted public service and a use-case scenario example. The proposed architecture reference model is based on the CEF building blocks EBSI, eSignature, and eID compliant with eIDAS.

History of IBM Z Mainframe Processors

IBM Z is both the oldest and among the most modern of computing platforms. Launched as S/360 in 1964, the mainframe became synonymous with large-scale computing for business and remains the workhorse of enterprise computing for businesses worldwide. Most of the world’s largest banks, insurers, retailers, airlines, and enterprises from many other industries have IBM Z at the center of their IT infrastructure. This article from the November/December 2020 issue of IEEE Micro presents an overview of the evolution of the IBM Z microprocessors over the past six generations. It discusses some of the underlying workload characteristics and how these have influenced the microarchitecture enhancements driving the performance and capacity improvements. The article then describes how the focus shifted over time from speeds and feeds to new features, functions, and accelerators.

WarpClothingOut: A Stepwise Framework for Clothes Translation From the Human Body to Tiled Images

With the increasing popularity of online shopping, searching for products with images for item retrieval has gradually become an effective approach. This trend is especially evident in the fashion industry. In common media, clothing items are usually worn on the human body. They can be straightforwardly segmented from the source media by utilizing detection or parsing algorithms. However, this may be deleterious to retrieval performance due to distortion, occlusion, and different backgrounds. In this article from the October–December 2020 issue of IEEE MultiMedia, a stepwise translation framework using a generative adversarial network and thin plate spline is developed to transfer human body images to tiled clothing images, which can be directly used for clothing retrieval. Experimental results demonstrate the effectiveness of the resultant tiled images produced from the framework compared to other methods.

Edge Computing for Legacy Applications

Edge computing was motivated by the vision of new edge-native applications that are compute-intensive, bandwidth-hungry, and latency-sensitive. The authors of this article from the October–December 2020 issue of IEEE Pervasive Computing show how infrastructure deployed for such futuristic applications can also benefit virtual machine (VM)-encapsulated Windows or Linux closed-source legacy applications. They present a new capability for legacy applications called edge-based virtual desktop infrastructure (EdgeVDI) and discuss example use cases that it enables.

End-to-End Verifiable E-Voting Trial for Polling Station Voting

On 2 May 2019, during the United Kingdom’s local elections, an e-voting trial was conducted in Gateshead using a touchscreen, end-to-end verifiable system. This was the first test of its kind in the United Kingdom, and it presented a case study to envisage the future of e-voting. Read more in this article from the November/December 2020 issue of IEEE Security & Privacy.

Information Needs: Lessons for Programming Tools

Why is programming sometimes so frustrating and annoying and other times so fast and painless? This article from the November/December 2020 issue of IEEE Software surveys a few of the important lessons emerging from studies of programming and the new programming tools they motivate.

The IT Challenges in Disaster Relief: What We Learned From Hurricane Harvey

This article from the November/December 2020 issue of IT Professional explores the information systems involved in disaster relief supply chain for Hurricane Harvey survivors. The authors interviewed three organizations (the United Way, the BakerRipley, and the American Red Cross) on how information systems were used in this concerted effort of long-term recovery. They found that data sharing is the major challenge, and it is further constrained and complicated by legal concerns. They also observed that organizations used ad hoc technology solutions to accommodate different relief project needs; an integrated open-source system would not only save cost but also improve overall productivity.


Editor’s Note

Who’s Doing What with Your Data?

The Internet and Internet-connected devices enhance our lives in myriad ways, but they can also expose us to surveillance and data collection that put our privacy at risk. While some governments have enacted privacy-protecting regulations, most notably the European Union’s General Data Protection Regulation, policymakers can do more to safeguard privacy in the digital age. Two articles from IEEE Security & Privacy take on contemporary privacy concerns and recommend related policies.

The first article, “Buying Your Genetic Self Online: Pitfalls and Potential Reforms in DNA Testing,” discusses privacy issues related to direct-to-consumer genetic testing, a popular service provided by companies such as Ancestry and 23andMe. The article proposes regulations and standardization that could help mitigate the risks of companies obtaining genetic information. The second article, “Policies on Privacy,” invites readers to consider the ethical implications of various types of data collection and use. The author asserts that societies should create privacy regulations to reflect their values.

Another ethical issue in technology today is explainability in machine learning (ML). IEEE Internet Computing’s “Knowledge Graph Semantic Enhancement of Input Data for Improving AI” discusses an iterative-optimization approach to knowledge graphs that helps improve ML explainability. IT Professional’s “Towards Explainability in Machine Learning: The Formal Methods Way” argues for using formal methods in ML to discover precise reasons behind algorithmic decisions and actions.

ML-based systems are important parts of autonomous vehicles. Computer’s “Disruptive Innovations and Disruptive Assurance: Assuring Machine Learning and Autonomy” presents a framework for better dependability in autonomous systems such as self-driving cars. The authors of IEEE Software’s “Validation of Autonomous Systems” argue that employing intelligent validation and testing will help build public trust in autonomous vehicles.

The final two articles in this ComputingEdge issue celebrate women in software engineering and encourage gender diversity in the field. “Queens of Code,” from IEEE Annals of the History of Computing, highlights 12 female programmers who worked for the US National Security Agency in the 1950s through the 1980s. In “Mom, Where Are the Girls?,” from IEEE Software, the author describes how she came to recognize the gender diversity problem in software engineering and encourages individuals and organizations in the software community to advocate for diversity.
EDITORS: Khaled El Emam, kelemam@cheo.on.ca
Katrine Evans, k.evans@haymanlawyers.co.nz

This article originally appeared in IEEE Security & Privacy, vol. 17, no. 3, 2019.

DEPARTMENT: PRIVACY INTERESTS

Buying Your Genetic Self Online: Pitfalls and Potential Reforms in DNA Testing
Andelka M. Phillips, University of Waikato

Digital Object Identifier 10.1109/MSEC.2019.2904128
Date of publication: 14 May 2019

Today’s world is one of constant monitoring and tracking, sometimes driven by us, sometimes driven by others. Developments in the field of health and identity are no exception. New technologies, such as wearable devices, and other technologies in consumer-centered health care allow us to track our fitness and health data, and they connect us with others.

Similarly, the rise in direct-to-consumer (DTC) genetic testing services, sometimes known as personal genomics or commercial genomics, can be viewed both as an example of emerging technology and also as disruptive innovation. These services have created a commercial market for genetic tests, allowing people to buy their own DNA tests online without a medical intermediary.

However, as with wearable health devices, DTC potentially affords opportunities for other entities to access and compile those data and subject us to profiling. Consumers, therefore, need to understand what’s involved when we buy our so-called genetic self online.

This article provides a brief introduction to the world of DTC and its potential traps for the unwary. It discusses some short- and longer-term regulatory measures that may help to iron out the most serious risks to consumer privacy. In particular, it concludes that the industry needs more oversight and consumers need more control of their genetic data and personal data in the DTC context.

THE GROWTH OF DTC GENETIC TESTING

The market for DTC has experienced significant growth in the last couple of years with some prominent DTC companies having databases with several million consumers’ samples.

Ancestry testing is particularly popular, but the industry varies widely with a broad spectrum of available services. The best-known ancestry and health tests are provided by prominent companies, such as 23andMe, AncestryDNA, Orig3n, MyHeritage, and FamilyTreeDNA. However, there are also companies offering lesser-known tests that are often more dubious, including assessing child talent, peace-of-mind paternity, and infidelity (often dubbed surreptitious testing). Several of these tests raise privacy and ethical concerns.

The proliferation and variety of services offered are increasingly attracting attention from researchers. My own research (due to be published as a book later this year) included a review of the online contracts of 71 DTC companies providing tests for health purposes. It found that a number of terms commonly included in these contracts were problematic from a consumer protection standpoint. Some companies, such as Soccer Genomics, have also raised concern from research scientists, with Stephen Montgomery at Stanford University launching a parody Yes or No Genomics website in response. Another parody website, DNA Friend, is a useful resource to highlight the sensitive nature of these services. However, these parodies do, to some extent, assume a level of knowledge about genetics, and we really need more efforts to assist the public in understanding the risks here.

While there is increasing public awareness of ancestry and health tests, what is less well understood is that these tests are generally not standardized and that any entity collecting genetic data could potentially use that data for secondary research or share it
with third parties, such as law enforcement. This article explores the problems that can arise as a result. It also discusses the existing and potential mechanisms that might help to resolve those problems.

A LACK OF STANDARDIZATION

In relation to DTC tests for health purposes, many tests for common complex diseases are not harmonized, and the validity of their findings is open to dispute. In particular, DTC companies often do not provide whole genome scans and instead focus on portions of an individual’s genome. Also, they can focus on different genetic variants and also frame their populations differently. As a result, it is possible to get contradictory disease-risk estimates from different companies.

The more common ancestry tests have also not been standardized, and it is similarly possible to obtain contradictory ethnicity estimates from different companies. There have even been instances of DTC companies providing DNA test reports on canine samples without distinguishing them from human samples. For example, in their article “Heredity or Hoax?” Barrera and Fox [1] discussed an example where a man had sent a dog DNA sample to a company (under a human name) and received an estimate of 20% First Nations ancestry.

This means that consumers need to be cautious about these services. At the very least, the public needs to be provided with more information about the limitations of testing because the utility of the service being sold may be less than expected.

SECONDARY USE OF GENETIC DATA

The potential for genetic data to be used in ongoing research is high. A number of the most prominent DTC companies have begun to partner with the pharmaceutical industry, and we have also begun to see investment by the insurance industry from these companies. One challenge here is that it is not possible to truly anonymize genetic data. (See, for example, the works by Erlich and Narayanan [2] and Gymrek et al. [3].) If something goes wrong, we cannot change our stored genetic data in the same way that we could change our bank password. So, it is particularly important that where DTC companies engage in such research, they implement strong security practices and infrastructure.

It is important for consumers to understand the potential for secondary use here. The source of profit for DTC companies will often be partnerships and mergers with other entities, and there is a significant level of uncertainty here in relation to the variety of ways in which genetic data could be used in the future.

Use for law enforcement is also attracting increasing attention. In the last year, there was much media coverage of the genetic genealogy database GEDmatch’s involvement in the investigation of the Golden State Killer case, where law enforcement accessed its database to find a potential suspect, through the process of familial DNA matching [4]. Since this revelation, it has emerged that more than 100 other DNA profiles from cold cases have been uploaded to GEDmatch [5]. In early 2019, it also emerged that the DTC company FamilyTreeDNA has been working with the U.S. Federal Bureau of Investigation to investigate violent crime (see, for instance, the work by Haag [6]).
GENETIC DATA ARE SENSITIVE IN NATURE

Genetic data are generally viewed as sensitive and can do real harm in the wrong hands. They are also much more than a method of identification in criminal proceedings. Genetic data have certain characteristics, which means that they can pose long-term privacy risks for individuals and their relatives.

Once you have a genetic test, your genetic code is digitized and that digital data can be stored potentially indefinitely and used for purposes beyond the primary purpose for which you gave it. It can also serve as a unique identifier for you, and since you share much of your DNA with your genetic relatives, it can also be used to trace those relatives. The impact of a data leak may be substantial, and it does not decrease over time.

The industry also operates internationally. Typically, consumers can purchase a test through a website, and then they will receive a sample collection kit in the mail. This is normally used for the collection of a saliva sample or a cheek swab, which is then sent back to the company for processing. Although services vary, companies will generally provide results through a web interface.

From a regulatory perspective, the international nature of the industry creates complexity. The physical sample may be sent overseas and processed and stored by a company in a different country from where the consumer resides. The sequenced genetic data generated from this physical sample may or may not be stored in that same country. Also, DTC companies may collect other forms of personal data from their consumers through surveys and other research activities. Where these data are stored may also vary, and again, it may be different from where the consumer resides.

These features, among others, affect how we need to think about regulation of businesses that handle genetic data.

THE IMPACT OF THE GENERAL DATA PROTECTION REGULATION ON DTC

Europe’s data protection law, the General Data Protection Regulation (GDPR), is supposed to put users back in control of their data. It has direct relevance to the DTC industry: any company that sells or provides services directly to consumers based in the European Union (EU) needs to ensure that it complies with the GDPR.

Genetic data are included in the prohibition on processing of special categories of data in article 9 of the GDPR. Consequently, to comply with the GDPR, companies should be obtaining explicit and informed consent from their consumers for a DNA test. A more traditional notice-and-choice model is insufficient. In my research to date on the regulation of DTC, it seems likely that many businesses will need to alter their consent mechanisms to meet this higher standard.

Part of the problem is that e-commerce-based services have relied on their online information (including contracts and privacy policies) to govern relationships with consumers. However, providing clear online information about complex subjects can be a challenge. Also, we have all grown accustomed to ignoring terms and conditions and privacy policies on websites. This is due to a number of factors. One of the most significant problems is that people often lack the time to read these documents, and even where they do take the time, they may struggle to understand the contents. Many businesses have created longer contracts and privacy policies that are heavily skewed in favor of their interests, rather than those of their consumers. There has also been a lack of oversight of these documents. Consumers are deterred from reading them and may believe that they are not capable of challenging or changing the use of their information in any case.

However, under the GDPR, a high standard of consent is required for data processing, and it is not going to be acceptable to bury consent in a lengthy contract or to only make company policies accessible after a consumer has registered for a service. Under both the GDPR and EU consumer protection legislation, there are requirements for these documents to be in

plain and intelligible language. Because contracts and privacy policies are often linked together, problematic terms in contracts, which could be challenged on consumer protection grounds, may also be found to be problematic from a data protection perspective as well. EU consumer protection legislation also restricts the inclusion of terms that may be deemed to be unfair and limits their enforceability.

As the GDPR beds in, consumers are also starting to realize that they have genuine mechanisms to challenge what companies are doing with their data. The recurring and self-serving rhetoric expressed by some key players in big tech who say “privacy is dead” is changing. We are starting to see a shift with wide-reaching laws, such as the GDPR, together with growth in mega data breaches, resulting in calls for further regulation. Privacy is not only still alive; it is kicking. For example, the most recent annual report released by the Irish Data Protection Commissioner [7] (which is the first line of regulation for many tech companies in Europe) demonstrates that people do care about their privacy and that complaints lodged under the GDPR are likely to increase.

Many countries outside the EU are also reforming their privacy and data protection laws to cater for new developments. Simply stopping marketing DTC services to EU consumers, to avoid coverage by the GDPR, is therefore unlikely to be a viable solution. DTC companies will increasingly need to meet similar legal requirements for consumers based outside of the EU.

SUGGESTIONS FOR REFORM
The DTC industry has grown in the last two decades with relatively little oversight, during which time the potential of the technology has grown considerably. A number of policy documents have been released by diverse bodies, which could be drawn upon in improving industry governance. For example, the Science and Technology Committee of the United Kingdom has recently begun an inquiry into Commercial Genomics and is seeking public submissions. There is hope that this inquiry will lead to improved oversight of the DTC industry in the United Kingdom and may provide useful guidance for other countries considering how to regulate the industry. The disbanded Human Genetics Commission from the United Kingdom also previously developed a Common Framework of Principles, which could be drawn upon in developing new legislation or industry codes of conduct.

More suggestions for both short- and long-term strategies are provided next. There is no perfect solution, but a number of steps could lead to significant improvements for consumers and for improving standards across the industry.

Short-Term Strategies
›› The public needs more independent informational resources to assist them in making informed decisions about whether or not to utilize DTC services. Data protection authorities and privacy regulators as well as consumer regulators could release statements in relation to the industry. The Office of the Canadian Privacy Commissioner has already begun to take steps in this direction. It has released a number of documents in relation to DTC, including recommendations for questions that consumers could ask DTC companies and questions that they should ask themselves when considering purchasing a test. This example could provide a useful model for other regulators exploring these issues.
›› Existing regulators should also consider developing industry codes of conduct and model privacy policies and consumer contracts. One potential foundation for such a code is the Future of Privacy Forum’s paper [8], which was developed in collaboration with some prominent DTC companies. This document makes a number of positive commitments in relation to privacy, but it is voluntary. It remains to be seen how businesses will adhere to this. Unlike the Future of Privacy Forum paper, though, any code should make it clear that American companies selling genetic tests to consumers based in the EU should still be complying with the GDPR.


›› Another model is to make codes of conduct mandatory for the industry to follow. There may be reasonable support for such a move: DTC companies that wish to engage in health research and maintain consumer trust have an interest in showing that they comply with the law and support improvement of industry standards. They will wish to distance themselves from more dubious types of tests.
›› Businesses should rethink their drafting of contracts and privacy policies. In relation to contracts, clauses that significantly limit consumers’ rights should be avoided. For example, if businesses wish to be compliant with the GDPR and applicable consumer protection legislation, then they should not include clauses that allow them to change their terms at any time without notice to the consumer.
›› Businesses should also think about their interface design. Given the sensitive nature of genetic data and the complex nature of some health test results, consumers should not be rushed into making a purchase. Putting speed bumps into the process, which encourage reflection and allow consumers to change their minds, could help to achieve compliance with the GDPR. It would be beneficial for businesses to allow for a cooling-off period as well in between purchase and processing of the sample.
›› Businesses should also improve their practices in relation to deletion and destruction of physical samples and data. It should be possible for any company performing a genetic test to provide their consumers with the option of deleting the data and destroying the sample after sending the consumer their test results. Guardiome is an interesting example here because they offer consumers their whole genome sequence on a device, and their approach seems to be more privacy centric.
›› Businesses should also keep in mind the GDPR’s principles in relation to data processing. In the context of DTC, adhering to the data minimization principle could be particularly beneficial.
›› At the national level, privacy and data protection regulators as well as consumer protection regulators should play a role in improving industry governance. Compliance reviews of privacy policies, contracts, and personal data practices, particularly in relation to security practice, would all be beneficial for improving industry governance.

Longer-Term Strategies
›› We need more specific oversight of the industry to improve standards and ensure the protection of privacy and consumer rights more generally. One possibility is the creation of new regulatory bodies with a mandate to regulate all businesses that handle genetic data. This could draw upon existing models of data protection authorities and financial services regulators, and in some countries, this could be a new body that was under the oversight of the data protection authority.
›› Tests of more dubious validity, such as surreptitious tests and child talent tests, should be banned, and regulators should help to alert the public about the most problematic services. In the United Kingdom, the Human Tissue Act makes it an offense to analyze DNA without appropriate consent, and it is likely that any company offering surreptitious tests to U.K. consumers will be in breach of this.
›› New legislation is needed that deals more specifically with individual rights in genetic data. The recent Canadian Genetic Non-Discrimination Act could provide a useful model for other countries considering how to strengthen the rights of citizens in their genetic data.
›› New industry-specific legislation should also be introduced at a national level, and international collaboration to develop more universal standards that could be followed globally could also help consumers given the international nature of these services.

This article has provided an introduction to the world of DTC and the challenges the industry poses for privacy. It is vital to understand that there is also a lot of uncertain risk in this context. We do not know all of the ways that our genetic data could be used in the future, but reform is needed given that we cannot change our genetic data and that it can always


potentially be linked back to us, can be used for many different purposes, and can also be used to trace our family members. People do need protection of their rights in this space and businesses should also view this as an opportunity to do things differently.

REFERENCES
1. J. Barrera and T. Fox, “Heredity or hoax?” CBC News, June 13, 2018. [Online]. Available: https://newsinteractives.cbc.ca/longform/dna-ancestry-test
2. Y. Erlich and A. Narayanan, “Routes for breaching and protecting genetic privacy,” Nature Rev. Genetics, vol. 15, pp. 407–421, 2014. [Online]. Available: https://www.nature.com/articles/nrg3423
3. M. Gymrek, A. L. McGuire, D. Golan, E. Halperin, and Y. Erlich, “Identifying personal genomes by surname inference,” Science, vol. 339, no. 6117, pp. 321–324, 2013.
4. R. Becker, “Golden State Killer suspect was tracked down through genealogy website GEDmatch,” The Verge, Apr. 26 2018. [Online]. Available: https://www.theverge.com/2018/4/26/17288532/golden-state-killer-east-area-rapistgenealogy-websites-dna-genetic-investigation
5. M. Molteni, “The key to cracking cold cases might be genealogy sites,” Wired, June 1, 2018. [Online]. Available: https://www.wired.com/story/police-will-crack-a-lot-more-cold-cases-with-dna/
6. M. Haag, “FamilyTreeDNA admits to sharing genetic data with F.B.I.,” NY Times, Feb. 4, 2019. [Online]. Available: https://nyti.ms/2DVnK3x
7. Data Protection Commission. (2018). Annual report: 25 May–31 December. Data Protection Commission. Dublin, Ireland. [Online]. Available: https://www.dataprotection.ie/sites/default/files/uploads/2019-03/DPC%20Annual%20Report%2025%20May%20-%2031%20December%202018.pdf
8. FPF. (2018, July 31). Privacy best practices for consumer genetic testing services. Future of Privacy Forum. Washington, D.C. [Online]. Available: https://fpf.org/wp-content/uploads/2018/07/Privacy-Best-Practices-for-Consumer-Genetic-Testing-Services-FINAL.pdf

ANDELKA M. PHILLIPS is a senior lecturer at Te Piringa Faculty of Law, the University of Waikato, New Zealand, and a research associate at the Centre for Health, Law, and Emerging Technologies (HeLEX), University of Oxford, United Kingdom. Contact her at andelka.phillips@waikato.ac.nz

COLUMN: LAST WORD
This article originally appeared in vol. 18, no. 2, 2020

Policies on Privacy
Steven M. Bellovin, Columbia University

Digital Object Identifier 10.1109/MSEC.2020.2966900
Date of current version: 19 March 2020

Privacy is a hotly debated topic. But there isn’t just one question to answer (“Should we have more privacy?”). Rather, there are many, and until we reach consensus on the answers, and their consequences, we cannot agree on what regulation, if any, is appropriate. Bear in mind that the answers can be different for governments and for the private sector, and that this question in particular will be answered very differently by different people or different cultures.

The first set of questions concerns what sort of information can be used. Can an entity obtain information from others about a subject, or is it restricted to information it directly collects? Note that this interacts very directly with the issue of use controls: should collected data be used only for the specified purposes, or can it be repurposed? Secondary use of data (using data for something other than the reason it was originally collected) is one of the biggest sources of privacy problems. This is especially true if multiple datasets are combined.

What, though, constitutes direct collection? If I tag an online picture with someone else’s name, is the site entitled to make the association between that person and the picture? Between me and the person I tagged? Between that person and me?

Direct collection is even murkier when it comes to web advertising. Is an on-page advertiser a direct collector? Is it the site hosting the page? Both?

If we want use restrictions, how do we define the categories of uses? What if someone changes his or her mind? Do we want exceptions for, e.g., medical research if identities are protected by contracts?

These questions are common. Two less common issues are the existence of dossiers and the existence, in essence, of time machines.

A dossier is a large compilation of data about a particular individual, similar to what is compiled by credit bureaus and data brokers. These dossiers can be very powerful, but they’re what Paul Ohm has referred to as databases of ruin. Note, too, that these databases need not contain personally identifiable information to be dangerous; a pseudonymous TiVo account can be just as violative to privacy as one with a real name, since the viewing history can often be deanonymized and linked to a real person.

Dossiers can enable time machines, the ability to see what someone did in the past, before they were of interest to someone else. Governments, of course, love that, but so do marketers. Should such dossiers be allowed to exist? Who should be allowed to query them? Should the information in them “expire” after a while? After how long?

Perhaps, for dossiers, we need revocable anonymity, so that law enforcement can get at the information, but not marketers. That, too, involves a policy decision, albeit a more legalistic one: what are the constraints on police?

It is important for society, not marketers, to answer the questions. For most answers, there are privacy-preserving cryptographic techniques that can at least approximate today’s abilities where needed, but without endangering privacy or creating databases of ruin. There are already schemes for things like privacy-preserving targeted ads, verifiable income reporting with anonymous accounts and payment schemes, age verification credentials that don’t show a name but are demonstrably valid, and more. I strongly suspect that most other necessary functions can be handled the same way, as soon as the requirements are agreed upon.

14 April 2021 Published by the IEEE Computer Society  2469-7087/21 © 2021 IEEE

There are certainly other important components to privacy, such as a requirement for clear and precise privacy policies by businesses: no more weasel words like sometimes, may, and business partners. But the important thing is to start by making explicit choices about the many different aspects of privacy.

STEVEN M. BELLOVIN is a professor of computer science and affiliate law faculty at Columbia University. Contact him via https://www.cs.columbia.edu/~smb.

EDITOR: Amit Sheth, amit@sc.edu
DEPARTMENT: KNOWLEDGE GRAPHS
This article originally appeared in vol. 24, no. 2, 2020

Knowledge Graph Semantic Enhancement of Input Data for Improving AI
Shreyansh Bhatt, Amazon*
Amit Sheth, University of South Carolina
Valerie Shalin, Wright State University
Jinjin Zhao, Amazon

Intelligent systems designed using machine learning algorithms require a large amount
of labeled data. Background knowledge provides complementary, real-world factual
information that can augment the limited labeled data to train a machine learning
algorithm. The term Knowledge Graph (KG) is in vogue because, for many practical
applications, it is convenient and useful to organize this background knowledge in the form of a graph.
Recent academic research and implemented industrial intelligent systems have shown
promising performance for machine learning algorithms that combine training data
with a knowledge graph. In this article, we discuss the use of relevant KGs to enhance
the input data for two applications that use machine learning—recommendation
and community detection. The KG improves both accuracy and explainability.

* Work done prior to joining Amazon
Digital Object Identifier 10.1109/MIC.2020.2979620
Date of current version 29 April 2020.

Machine learning algorithms trained with large labeled datasets have shown promising performance in solving problems from various domains [1]. One of the most challenging aspects associated with using such algorithms is the availability of training data. On the other hand, symbolic knowledge representation has been a key area of Artificial Intelligence research since the mid 1970s, yielding a number of vetted background knowledge bases. The AI community started to use the term “Ontology” in the 1980s to refer to such background knowledge [2]. A patent filed in 2000 described the use of background knowledge to power commercial faceted and semantic search, semantic browsing, semantic personalization, and semantic advertisement [3]. Subsequently, background knowledge has played a key role in various tasks ranging from search and classification to personalized recommendations. In this decade, several researchers have explored the role of background knowledge to enhance natural language processing and machine learning [17].

In this article, we first describe the history of KGs and their application in research and industry, and introduce the problem of augmenting training data with the contents of a KG. We then review the most common approaches to augmenting data with knowledge, contrasting simple, explicit association of graph content with input data and approaches that depend on deep learning to combine separate KG and input content. We list the challenges associated with these and provide an overview of an alternative joint optimization-based approach for KG enhanced machine learning. We report four case studies in different domains that used this approach.
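As a rough illustration of the "simple, explicit association of graph content with input data" mentioned above, the sketch below adds one extra feature per KG concept that matches a token in the input text. The lookup table `TOY_KG`, the function name `augment_with_kg`, and the `KG:` feature prefix are all hypothetical stand-ins chosen for this example; a real system would query a KG such as DBpedia rather than a hard-coded dictionary.

```python
from collections import Counter

# Hypothetical toy lookup standing in for a KG query (not real DBpedia output).
TOY_KG = {
    "iphone": ["Apple product"],
    "ipad": ["Apple product"],
    "galaxy": ["Samsung product"],
}


def augment_with_kg(tokens, kg=TOY_KG):
    """Return bag-of-words counts plus one 'KG:<concept>' feature per matched token."""
    features = Counter(tok.lower() for tok in tokens)
    for tok in tokens:
        # Each matched entity contributes its KG concepts as additional features.
        for concept in kg.get(tok.lower(), []):
            features["KG:" + concept] += 1
    return features


features = augment_with_kg("my new iPhone and iPad are great".split())
```

The augmented feature dictionary can then be passed to any conventional classifier; the classifier, not this step, decides how much weight the added concepts receive.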

16 April 2021 Published by the IEEE Computer Society  2469-7087/21 © 2021 IEEE

HISTORY
Google introduced its “Google Knowledge Graph” in 2012, acknowledging its central role in their entity search¹. Although a number of knowledge representation alternatives have been used, the graph has been one of the most popular formats to represent domain knowledge². Apart from Google Search, a number of commercial products, such as Apple Siri and Amazon Alexa, are powered by a Knowledge Graph (KG).

A KG is a collection of facts where entities (nodes) are connected with typed relationships. The scope of the knowledge captured by a KG can vary with broad-based coverage that may involve many generic domains (e.g., DBpedia and Yago), a specific domain (e.g., Bio2RDF and UMLS for aspects of biomedical or medical domains), an industry or an enterprise [18]. To extract a KG for a domain of interest [4], domain-specific KG creation approaches start from one or more entities (concepts) in the KG. The graph of interest is then created by traversing from those initial entities. However, the method does not generalize, as even a 2-3 hop traversal may end up including more than 50% of the source content [4]. To control traversal for generic sources, traversing is restricted through relevant relationships (edges) that can be identified by computing a specificity score of relationship to the domain. Research in creating domain-specific sub-KGs has shown promising applications [4].

KGs have potential applications in augmenting training data for machine learning algorithms. Training data augmented by a KG has been shown to enhance performance of applications that have limited training data, such as sentiment analysis, named entity recognition, recommendation, question answering, and object detection [5]–[7], [16]. However, the training data may not be available in the same form as the KG, hindering a facile augmentation of training data.

SIMPLE DATA ELABORATION USING A KG
KGs provide auxiliary factual information about the entities that are present in the training data. A simple approach to augmenting training data is to enhance the training data with auxiliary information extracted from the KG [18]. Consider a sentiment classification task on Tweets. If a Tweet mentions the president of the USA, a KG such as DBpedia has the information that Donald Trump is the current president of the USA. The input data augmentation approach enhances the Tweet representation with concepts associated with the president of the USA in DBpedia. One of the early approaches for sentiment analysis used this strategy and reported an F1-score improvement with Tweets. Specifically, for each extracted entity (e.g., iPhone) from Tweets, this approach adds its semantic annotation (e.g., “Apple product”) as an additional feature, and measures the correlation of the added concept with negative/positive sentiment [5]. Treating the concepts obtained from the KG as one of the Tweet features results in a 6.5% increase in the F1-score for sentiment classification.

KGs provide rich information that not only includes a type associated with the concept but also other related concepts. Training data augmentation with the KG content requires relative weighting of the following.

1. Concepts present in the training data.
2. KG concepts that map to the concepts in the training data, e.g., a Tweet about Donald Trump maps to the dbpedia:Donald Trump in the DBpedia KG.
3. KG concepts that are associated with different types of relationships with the concept in the training data. E.g., dbpedia:President of the United States is associated with Donald Trump in DBpedia with dbpedia:type relationship.

AUGMENTED DEEP LEARNING WITH KGS
Separate training data, knowledge concepts, and related concepts from KGs can be the input to the neural network, which finds the appropriate importance of these different modalities. Artificial neuron patterns or neural network architectures that compound training data, KG concepts, and related concepts can differ depending on the nature of the training data (text or user-item interaction and time series or static). Most of the approaches use the first input layer of the deep neural network architecture as the layer that augments training data with the KG. The remaining layers are application and task specific with loss computed at the last layer of the deep neural network. The


end-to-end training of such a network results in learning the relative weighting between the training data and different concepts from the KG to solve the application or task, such as sentiment analysis, machine reading, recommendation, etc. A key advantage of using the neural network to augment the training data with the KG is that a neural network can handle the nonlinearity involved in merging the training data and the KG. A set of artificial neurons, a so-called neural network layer, considers the input from different modalities of data. Each neuron on this layer combines these different inputs and applies a (nonlinear) activation function. Hence, a deep network with multiple layers can learn the appropriate nonlinear combination of the training data and the KG. This approach has shown promising results for various applications, such as sentiment analysis, recommendation, machine reading, and collaborative filtering.

Augmented deep learning for sentiment analysis: Kumar et al. proposed an approach to sentiment analysis that augments an input text word with concepts from WordNet KG [6]. The proposed model takes a sentence as input to a BiLSTM that computes a hidden representation of words in the sentence. The hidden representation is then passed through an attention layer along with the KG concepts related to the word. The attention layer then computes a weighted vector of the hidden representation of the input word and KG concepts. Weighted vectors of a sentence are passed through another attention layer, the output of which predicts sentiment. The whole network is trained to predict the correct sentiment of a given sentence in the training data.

Personalized news recommendation using KG: Wang et al. reported that a KG plays a key role in personalized news recommendation [7]. During the inference, the model predicts a click through rate for a given user and a given news story. The model generates user representations from their prior history of news clicks. The user representation is then concatenated with a given news story's representation to generate a user context vector. The resulting user context vector predicts the click through rate. For training, the authors represented each news story with the entities found in the news story and context (neighborhood of the entity in the KG) of each entity. A multichannel representation is used for each word of a news story that corresponds to an entity in the KG. For example, one channel corresponds to the KG's representation for the entity and another channel corresponds to a representation of related entities. A convolutional neural network is then applied on such representation that combines word level information with the KG's information. Such a representation of each news story is then combined using an attention network to generate a user representation, which in turn is combined with the candidate news story representation to predict the click through rate.

Machine reading using KG: One of the challenges in using concepts from a KG to augment text data is a relative weighting of the actual word in the sentence and the word's representation from the KG. Yang et al. reported that while using background knowledge for machine reading, it is crucial to have both the relative weighting of the word in the text and its KG-based representation, as in some cases the text context properly overrides the context-independent background knowledge available in the KG [8]. They use a sentinel vector that combines the word and its related concepts in the KG. Their model uses a BiLSTM where the hidden representation from each BiLSTM unit is combined with related concepts from the KG corresponding to that word using the sentinel vector. The resulting vectored representation is used as the BiLSTM cell's hidden representation. Training this network results in learning the weights for the sentinel vector.

KG for recommendation: Zhang et al. showed that a KG can be used to solve the data sparsity issue arising from collaborative filtering for item recommendation [9]. They use an item's representation computed from the KG in the user-item feedback matrix. This item's representation is computed by concatenating the visual and textual item's representation available in a KG. They showed that such a combined item's representation in the user-item matrix can improve collaborative filtering for recommendation. Other work on augmenting training data with a KG in recommendation, such as the Personalized Entity Recommendation [10] and Factorization Machine with Group lasso [11], treat the KG as a heterogeneous information network, and extract meta-path/metagraph-based latent features to represent the connectivity between users and items along different types of relation paths/graphs.
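The attention step described for sentiment analysis above, where a word's hidden representation is blended with the representations of its related KG concepts, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model of Kumar et al.: the function names are hypothetical, the vectors are plain Python lists, and the scoring rule (a dot product with the hidden vector) is an assumed stand-in for the learned scoring parameters a trained network would use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(hidden, concept_vectors):
    """Blend a word's hidden vector with its KG concept vectors.

    Candidates are the hidden vector itself plus each concept vector. Each
    candidate is scored by its dot product with the hidden vector (an assumed
    scoring rule), and the scores are normalized with a softmax.
    """
    candidates = [hidden] + list(concept_vectors)
    weights = softmax([dot(hidden, c) for c in candidates])
    blended = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(hidden))
    ]
    return blended, weights


hidden = [1.0, 0.0]                      # hidden representation of one word
concepts = [[1.0, 0.0], [0.0, 1.0]]      # two related KG concept vectors
blended, weights = attend(hidden, concepts)
```

In this toy input the first concept points in the same direction as the word and therefore receives the same weight as the word itself, while the orthogonal concept is down-weighted. A word with no related concepts is returned unchanged.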


FIGURE 1. Iterative optimization for knowledge enhanced machine learning. Training data is linked/augmented with
the KG. The learning algorithm is applied on the augmented data to find the results. The results then drive the anno-
tated KG-based learning to identify the updated KG. The updated KG is then used in the training data augmentation.
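The loop in the Figure 1 caption can be written as a short skeleton: augment the data with the current KG, run the learner, let the results update the KG, and repeat until the KG stops changing or an epoch limit is reached. The sketch below is illustrative only. All function names are hypothetical, the KG is reduced to a set of concept labels, and the update rule (keep a concept only if more than half of the examples that contain it are positive) is a toy assumption, not the objective used in the cited approaches.

```python
def iterative_kg_learning(data, kg, augment, learn, update_kg, max_epochs=10):
    """Alternate between learning on KG-augmented data and updating the KG."""
    results = None
    for epoch in range(1, max_epochs + 1):
        augmented = augment(data, kg)     # link/augment training data with the KG
        results = learn(augmented)        # task-specific learning step
        new_kg = update_kg(kg, results)   # results drive the KG update
        if new_kg == kg:                  # converged: the KG no longer changes
            return results, kg, epoch
        kg = new_kg
    return results, kg, max_epochs


def augment(data, kg):
    """Keep only the concepts of each example that are present in the KG."""
    return [(concepts & kg, label) for concepts, label in data]


def learn(augmented):
    """Toy learner: fraction of positive examples among those containing a concept."""
    seen, positive = {}, {}
    for concepts, label in augmented:
        for c in concepts:
            seen[c] = seen.get(c, 0) + 1
            positive[c] = positive.get(c, 0) + label
    return {c: positive[c] / seen[c] for c in seen}


def update_kg(kg, results):
    """Toy update: drop concepts that are not predominantly positive."""
    return frozenset(c for c in kg if results.get(c, 0.0) > 0.5)


data = [({"a", "b"}, 1), ({"a", "c"}, 1), ({"c"}, 0)]
results, final_kg, epochs = iterative_kg_learning(
    data, frozenset({"a", "b", "c"}), augment, learn, update_kg
)
```

Passing the three steps in as functions keeps the loop itself independent of the task, which mirrors the figure: only the learner and the KG update need to change between applications.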

CHALLENGES FOR KG AUGMENTED MACHINE LEARNING
As noted above, the data format of the input data and KG are often different from each other and require different processing algorithms and architectures. For example, most NLP tasks have sentences as the input data while background knowledge bases are available in graph form. Augmenting the input data with a KG requires either converting the knowledge into the format of input data or representing input data in the KG.

Differences in processing algorithm and architectures mean that augmenting the input data with a KG and converting into a common format may not lead to the best representation. KGs represent different kinds of information about a concept indicated by different types of relationships. For example, in DBpedia the concept dbr:Ohio is connected with the concept dbr:USA by a hierarchical relationship dbo:country, whereas dbr:Ohio is connected to Columbus with the dbo:Capital relationship. The algorithms exploiting the KG must be cognizant of the different types of relationships. Moreover, these algorithms should find weights appropriate for the different relationships for the given task instead of using a generic relationship weighting learned for a KG completion task.

Moreover, as the knowledge is fused with the training data before applying a machine learning algorithm with nonlinearity, the KG may not facilitate explainability.

To address these challenges, recent approaches propose specialized algorithms or neural network architectures for the input data and for the KG. We review these approaches as iterative optimization for enhanced machine learning using a KG.

ITERATIVE OPTIMIZATION FOR KNOWLEDGE ENHANCED MACHINE LEARNING
In order to augment input data effectively with a KG and to preserve the explainability potential of the KG, recent approaches iteratively optimize the task specific objective for the input data and for the KG representation of the data [12], [13]. As shown in Figure 1, these approaches start with an


initial KG representation. The input data are aug- with each other as they live in the same county. Read-
mented with the initial KG representation. Opti- ily apparent user attributes, such as a city name may
mizing the application-specific objective on such not inform the county characterization for the group.
input data provides application-specific results. The The Crowdsourced KGs, such as DBPedia, have the
knowledge graph is updated based on the results information that connects cities to a county.
and the application-specific objective optimization is then run for a KG, which leads to an improved (application-specific) representation of the KG. The whole process is repeated until convergence or a predefined number of epochs. The initial generic KG representation augments the input data and iteratively updates the KG representation. Next, we review four approaches designed based on this concept.

KG enhanced recommendation: Wang et al. proposed a multitask feature learning approach for the KG enhanced recommendation.12 For a given user-item interaction matrix, the item's representation is initially computed from the KG. A collaborative filtering technique is then applied on such a user-item interaction matrix. As a second step, the user-item interaction is represented in the KG with a concept specific to the item in KG and the other related item based on user-item matrix. The KG-based recommendation is then solved for predicting item similarity. Iterative optimization of these two objectives leads to learning an application-specific KG representation that can be used for explanation and also to enhance performance for collaborative filtering. Wang et al. proposed to use “cross and compress” units for combining a KG's representation to item and user-item representation to KG.

KG enhanced community detection and characterization: KGs can also help improve our understanding of networked structured data (graph structured data), such as social networks. An attributed graph consists of nodes, attributes associated with nodes, and relationships (links) that connect nodes. For example, in a social network, users are nodes, location or user posts are attributes, and users are connected with friendship relationships. A group of users forms a community when the number of relationships within the group exceeds the number of relationships across the group. It is often hard to divide a graph into communities. Node attributes can help explain certain communities. However, communities are often formed because a group of users share a generic concept. For example, a group of users may be friends

Bhatt et al. proposed the KG enhanced community detection.13 They used iterative optimization over an attributed social network graph and a hierarchical KG to detect and characterize communities. The hierarchical KG can represent real-world communities or clusters. For example, a hierarchical KG for the geolocation domain represents the United States of America as a root concept with California and Ohio as subsumed concepts. Attributes on the nodes of the social network graph are mapped to the KG. Hence, each node is represented in the KG. Initially, each node is considered to be in its own community with a relationship weight computed according to the distance between nodes in the hierarchical KG. In each iteration, first the community detection objective is applied on the graph. Depending on the communities identified in the graph, the hierarchical KG is broken into multiple hierarchical KGs, one representing each community. Hence, the communities identified in the graph inform the KG representation of nodes, and the input graph is modified based on the distance of nodes in the modified hierarchical KGs.

Unknown relevant domain: The initial mapping of the input data to the KG may represent multiple or unknown domains. However, the objective optimization for the input training data may only depend on a certain domain or an intersection of multiple domains. Hence, if we augment the KG with the training data prior to optimizing the application-specific objective, we may miss the appropriate background knowledge domain.

Social media-based wisdom of crowd analysis is one of the application domains where the objective optimization on the input data depends on the appropriate domain in the KG. Recent research shows that diverse crowds bring diverse perspectives in decision making.14 Such a decision results in a more accurate forecast than a decision made by a randomly selected or homogeneous crowd. As users share their opinion on social media, we can use social media data to infer diverse crowds. Diverse crowd selection can be solved as subset selection, maximizing diversity within the


subset. As we want to measure diversity in perspectives in the given domain of interest, we can use data on individual users' attributes and augment it with the KG. However, it is hard to identify the domain of interest from the KG in the context of selecting a diverse crowd. Hence, we can find the diverse crowd by starting with a generic KG-based user attribute augmentation, and then find the appropriate domain of interest for the given set of diverse crowds. For example, a crowd may be diverse in the domain of politics, whereas it may not be diverse in the domain of sports. This results in the new domain of interest for diverse crowd selection and helps identify the appropriate diverse crowd.

Generative applications: Domain-specific short-text generation suffers from limited training data. Short and diverse text generation can benefit from the domain knowledge. Recent research shows that domain knowledge captured in the form of word2vec vectors improves text generation quality.15 Here, the specific training data can be limited, whereas the data to generate word2vec vectors can consist of signals related to generic text generation, such as grammar and sentence structure. The use of domain-specific KGs can further improve diverse text generation quality as it captures words and rules associated with the domain. However, it is challenging to identify the appropriate domain in the KG that helps the particular text generation. Iterative optimization can help such diverse text generation.

CONCLUSION
KGs play a key role in machine learning. Crowdsourced KGs can complement the available training data for machine learning algorithms and improve performance for a number of applications. Iterative optimization can further improve accuracy and also help explain the data in the context of the application. This approach is particularly useful when the entities present in the input data are associated with a concept in the knowledge graph that is present in multiple domains. An iterative approach can identify the appropriate domain in the context of the application.

ACKNOWLEDGMENTS
The authors would like to thank Manas Gaur and Ruwan Wickramarachchi for their review and suggestions. Dr. Sheth and Dr. Shalin’s work was funded in part by NSF under award #1513721 titled “TWC SBE: Medium: Context-Aware Harassment Detection on Social Media.” Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of the National Science Foundation.

REFERENCES
1. A. Halevy, P. Norvig, and F. Pereira, “The unreasonable effectiveness of data,” IEEE Intell. Syst., vol. 24, no. 2, pp. 8–12, Mar./Apr. 2009.
2. T. R. Gruber, “Toward principles for the design of ontologies used for knowledge sharing?” Int. J. Human-Comput. Stud., vol. 43, no. 5/6, pp. 907–928, 1995.
3. A. Sheth, et al., “System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising,” U.S. Patent No. 6,311,194. 30 Oct. 2001.
4. S. Lalithsena, “Domain-specific knowledge extraction from the web of data,” 2018.
5. H. Saif, et al., “Semantic sentiment analysis of twitter,” in Proc. Int. Semantic Web Conf., 2012, pp. 508–524.
6. A. Kumar, et al., “Knowledge-enriched two-layered attention network for sentiment analysis,” in Proc. Conf. North Amer. Chapter Assoc. Comput. Linguistics: Human Lang. Technol., 2018, vol. 2, pp. 253–258.
7. H. Wang, et al., “DKN: Deep knowledge-aware network for news recommendation,” in Proc. World Wide Web Conf., 2018, pp. 1835–1844.
8. B. Yang and T. Mitchell, “Leveraging knowledge bases in LSTMS for improving machine reading,” in Proc. 55th Annu. Meeting Assoc. Comput. Linguistics, 2019, pp. 1436–1446.
9. F. Zhang, et al., “Collaborative knowledge base embedding for recommender systems,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2016, pp. 353–362.
10. X. Yu, et al., “Personalized entity recommendation: A heterogeneous information network approach,” in Proc. 7th ACM Int. Conf. Web Search Data Mining, 2014, pp. 283–292.
11. H. Zhao, et al., “Meta-graph based recommendation fusion over heterogeneous information networks,” in Proc. 23rd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2017, pp. 635–644.


12. H. Wang, et al., “Multi-task feature learning for knowledge graph enhanced recommendation,” in Proc. World Wide Web Conf., 2019, pp. 2000–2010.
13. S. Bhatt, et al., “Knowledge graph enhanced community detection and characterization,” in Proc. 12th ACM Int. Conf. Web Search Data Mining, 2019, pp. 51–59.
14. S. Bhatt, et al., “Who should be the captain this week? Leveraging inferred diversity-enhanced crowd wisdom for a fantasy premier league captain prediction,” in Proc. Int. AAAI Conf. Web Social Media, 2019, vol. 13, no. 01, pp. 103–113.
15. A. Nalamothu, “Abusive and hate speech tweets detection with text generation,” 2019.
16. Y. Fang, et al., “Object detection meets knowledge graphs,” 2017.
17. A. Sheth, et al., “Knowledge will propel machine understanding of content: Extrapolating from current examples,” in Proc. Int. Conf. Web Intell., 2017, pp. 1–9.
18. A. Sheth, M. Gaur, U. Kursuncu, and R. Wickramarachchi, “Shades of knowledge-infused learning for enhancing deep learning,” IEEE Internet Comput., vol. 23, no. 6, pp. 54–63, Nov./Dec. 2019.

SHREYASH BHATT is a machine learning scientist with Amazon.com, Seattle, WA, USA. Contact him at bhattshr@amazon.com.

AMIT SHETH (Fellow, IEEE) is the director of Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA (http://ai.sc.edu). He is a fellow of AAAI and AAAS. Contact him at amit@sc.edu.

VALERIE SHALIN is a cognitive scientist and a professor of Psychology with Wright State University, Dayton, OH, USA. Contact her at valerie.shalin@wright.edu.

JINJIN ZHAO is a machine learning scientist with Amazon.com, Seattle, WA, USA. Contact Jinjin at jinjzhao@amazon.com.



COLUMN: FORMAL METHODS IN INDUSTRY
This article originally appeared in vol. 22, no. 4, 2020

Towards Explainability in Machine Learning: The Formal Methods Way

Frederik Gossen, Tiziana Margaria, and Bernhard Steffen

Classification is a central discipline of machine learning (ML) and classifiers have become increasingly popular to support or replace human decisions. We encounter them as email spam detectors, as decision support systems, for example in healthcare, as aid in interpreting X-rays in breast cancer detection, or in the financial and insurance sector, for financial and risk analysis. For example, Facebook uses classifiers to predict the likelihood that users will navigate or click in a certain way, at scale, for millions and millions of users every day [9]. They also play a significant role in various areas of computer vision, where traffic signals and other objects need to be identified in order to “read” a situation during assisted or autonomous driving. Because we rely on classifiers not only for ease and comfort but also in business or safety critical systems, they need to be precise and reliable.

Classifiers build on a wide variety of techniques: neural networks, statistical learning like Bayesian networks, instance-based learning like in K-Nearest Neighbor, separability of classes in a vector space like in support vector machines, or logics, like in decision trees, random forests, and rule-based classifiers. ML classifiers were traditionally judged mostly in terms of precision, ease of training and fast response. In many cases, however, small differences in the sample led to spectacularly wrong decisions. Meanwhile, AI failure stories populate various sites1 including fails by popular AI platforms like IBM's Watson.

When something goes wrong, it is good to know why. In cases where legal action follows a misclassification, as in the recent CervicalCheck cancer scandal* that rocked Ireland's Health Service,2 it is important to be able to find out exactly why a certain classification verdict was issued. Ease of explanation is also particularly important when the proposed classification is correct, but apparently counter-intuitive. This is why Explainability is now a new hot topic in ML, and this is where formal methods can play an essential role. Let us show the power of the formal methods way in combination with random forests.

Random Forests are one of the most popular logic-based classifiers in ML. The larger they are, the more precise the outcome of their predictions. Figure 1 shows a random forest with 100 tree elements that was learned from the Iris Classification3 problem of the popular UCI dataset.4 The dataset lists dimensions of Iris flowers’ sepals and petals for three different species of flowers: Iris setosa, Iris virginica, and Iris versicolor. These are our classes. Random Forests are a collection of many decision trees, each learned from a random sample of the training dataset. All trees have different structure, represent different decision functions, and can produce different decisions for the same input data. The training method is easy to understand and to implement, and at the same time achieves impressive classification accuracies in many applications.

Once we have the random forest, to classify previously unseen input data every decision tree is evaluated separately, potentially in parallel. The overall decision of the random forest is then typically derived as the most frequently chosen class, an aggregation commonly referred to as majority vote. A key advantage of this approach is the reduced variance compared to single decision trees. But can we explain how and why this decision was taken?

EXPLAINABILITY
Neural networks and random forests are considered

Digital Object Identifier 10.1109/MITP.2020.3005640


* The CervicalCheck cancer misdiagnosis was human, and
not due to machine learning. Date of current version 17 July 2020.

2469-7087/21 © 2021 IEEE Published by the IEEE Computer Society April 2021 23
FORMAL METHODS IN INDUSTRY

FIGURE 1. Excerpt from the considered Random Forest, which contains 100 trees with a total of 1312 nodes. Evaluation means majority vote-based aggregation of the evaluation results of the individual trees. Even trying to understand the reason for a certain classification on this basis is considered hard (outcome explanation problem).

black-box models because of their highly parallel nature: following the execution of neural networks means following sequences of parallel execution steps that result from a complex interplay of the value of all neurons (or nodes). The execution of a random forest is simpler, but it still requires aggregating the results of each of its often many hundreds of trees after having executed all of them individually. The results of such black-box executions are hard to explain to a human user even for very small examples.

In contrast, decision trees are considered white-box models because of their sequential evaluation nature. Even if a tree is large in size, a human can easily follow its computation step by step by evaluating (simple) decisions at each node from the root to a leaf. Indeed, the set of decisions along such an execution path precisely explains why a certain choice has been taken.

Popular methods towards explainability try to establish some user intuition. For example, they may hint at the most influential input data, like highlighting or framing the area of a picture where a face has been identified. Such information is very helpful, and it helps in particular to reveal some of the “popular” drastic mismatches incurred by neural networks: if the framed area of the image does not contain the “tagged” object, the identification is clearly incorrect. However, even in a correct classification, the tag by itself gives no reason why the identification is indeed correct.

More ambitious are methods that try to turn black-box models into white-box models, ideally preserving the semantics of the classification function. For random forests this has been achieved for the first time using algebraic transformations.5 In fact, the proposed method is based on Algebraic Decision Diagrams (ADDs)6 and Binary Decision Diagrams (BDDs). An ADD is essentially a decision tree where redundant subparts are merged. A Binary Decision Diagram is an ADD over the algebra of Boolean values, i.e., the leaves are Boolean (true/false, yes/no). The method solves the following three explainability problems with absolute precision.

The Model Explanation Problem is solved in terms of an ADD that specifies precisely the same classification function as the original random forest.

The Class Characterization Problem is solved in terms of a BDD that precisely characterizes all samples that the original random forest will classify as the considered class.

The Outcome Explanation Problem is solved in terms of a minimal conjunction of (negated) decisions that are sufficient to guide the sample into the considered class.


FIGURE 2. Model Explanation. This graph with its 1077 nodes is considered a white-box model for the Random Forest as individual classifications can be explained simply by looking at the corresponding classification path whose length, in this case, never exceeds 20. Note that there are individual trees in the original Random Forest with paths of length 10.

We will now illustrate this approach and the three forms of explainability starting from the random forest with one hundred trees for the Iris classification shown in Figure 1.

The black box character of this forest is obvious: given a sample, how can a human follow the 100 individual tree evaluations, grasp their essence and then understand the impact of the following majority vote-based classification? In the next section, we will see that this is an inherently hard task, because even the canonical white box model that we are able to construct, with its more than one thousand nodes, is still quite hard to understand.

MODEL EXPLANATION PROBLEM
The canonical white box model corresponding to the random forest of Figure 1 can be constructed compositionally by taking the individual trees of the random forest and successively “adding” their corresponding ADDs. This solves the Model Explanation Problem.

Figure 2 sketches the result of this construction: A canonical white box model with 1077 nodes. Admittedly, this model is still frightening, but given a sample, it allows one to easily follow the corresponding classification process, and in this case it may require at most twenty individual decisions based on the petal and sepal characteristics. This decision set is our set of predicates. The conjunction of these predicates is a solution to the Outcome Explanation Problem. However, more concise explanations are derived from the class characterization BDD discussed in the “Class Characterization Problem” section.

This construction exploits algebraic properties: intuitively, we “add” the entire decision trees. This is technically possible because ADDs inherit the algebraic structure of their leaf set. In this case, the algebra of the leaf set is the set of vectors that have one component for each class that counts how often this particular class has been chosen under the conditions represented by the paths to this very leaf. To add two ADDs of this set, we use the component-wise addition of the underlying vector structure.5

CLASS CHARACTERIZATION PROBLEM
The class characterization problem is particularly interesting because it allows one to “reverse” the classification process. While the direct problem is “given a sample, provide its classification,” the reverse problem reads “given a class, what are the


FIGURE 3. Class characterization. This Class Characterization Model has only 53 nodes, a size that can be considered compre-
hensible by humans. Its maximal path length is 18. The path for our sample is 14, and the corresponding reduction leads to an
outcome explanation with only 4 predicates. Note that with four classification parameters the outcome explanation can never
have more than 4 predicates.

characteristics of all the samples belonging to this class?” The BDD shown in Figure 3 is a minimal characterization of the set of all the samples that are guaranteed to be classified as Iris Setosa.

Being able to reverse a learned classification function is of major practical importance. Think, e.g., of a marketing research scenario where data have been collected with the aim to propose best-fitting product offers to customers according to their user profile. This scenario can be considered as a classification problem where the offered product plays the role of the class. Now, being able to reverse the customer → product classification function provides the marketing team with a tailored product → customer promotion process: for a given product, it addresses all customers considered to favor this very product as in the corresponding patent.7

OUTCOME EXPLANATION PROBLEM
The path highlighted in Figure 3 defines an outcome classification formula for the sample
petallength = 2.49
sepalwidth = 2.45
sepallength = 7.15
as the conjunction of the following 13 predicates:
NOT petallength < 2.45
petalwidth < 1.65
petalwidth < 1.45
NOT sepallength < 7.05
sepalwidth < 2.65
petalwidth < 1.35
petalwidth < 0.8
petalwidth < 0.7
NOT sepalwidth < 2.25


petallength < 5.0
petallength < 2.7
petallength < 2.6
petallength < 2.5

The classification formula expresses the collection of “conditions” that this sample satisfies, and it therefore provides a precise justification why it is classified in this class.

Despite the fact that the class characterization BDD is canonical, it is easy to see that there are some redundancies in the formula. For example, a petallength below 2.5 is necessarily also below 2.6 and 2.7; therefore, for this specific sample those two predicates are redundant. This is the result of the imposed predicate ordering in BDDs: all the BDD predicates are listed, and they are listed in a fixed order. After eliminating these redundancies, we are left with the following precise minimal outcome explanation: this sample is recognized as belonging to the class Iris Setosa because it has the properties
2.45 ≤ petallength < 2.5
petalwidth < 0.7
7.05 ≤ sepallength
2.25 ≤ sepalwidth < 2.65

CONCLUSIONS AND PERSPECTIVES
Explainable AI is a new direction aiming at the maturation of a field that has experienced a boost in particular because of its fancy heuristics and corresponding breakthroughs in specific applications like the AlphaGo program for the game Go. In this context, the typical concept of “explanation” is still comparatively weak. For example, highlighting the most important pixel for a certain image classification is not really a comprehensive explanation, but rather a hint, an indication that helps pinpoint situations where things went drastically wrong. In contrast we take a formal methods-based path, originally established in STTT,5 where the concept of “explanation” is interpreted as a precise characterization of the considered phenomenon. Our illustration on how much information about the how and why can be extracted with exact methods from a random forest consisting of 100 trees indicates that such characterization may indeed turn out to be practical.

The concise class characterization has a particularly high application potential, e.g., when reversing the learned classification function for tailored product presentations in order to obtain an optimized customer list for a product campaign. Moreover, the size and therefore the comprehensibility of class characterization seem to hardly explode. In our example with only three classes, the model characterization ADD had more than 1000 nodes, while all the class characterization ADDs have fewer than 60 nodes, a size still within the range of a visual investigation.

Of course, these are first steps in a very ambitious new direction and it has to be seen how far the approach carries. Scalability will probably require decomposition methods, perhaps in a similar fashion as illustrated by the difference between model explanation and the considerably smaller class characterization. More work is needed also on techniques that aim at limiting the number of involved predicates.

Promising results reported in sttt2 lift the approach we illustrated from random forests to binary neural networks. They indicate that true explainability may well be in reach even for neural networks, on the formal methods way.

REFERENCES
1. [Online]. Available: https://www.lexalytics.com/lexablog/stories-ai-failure-avoid-ai-fails-2020
2. Wikipedia. [Online]. Available: https://en.wikipedia.org/wiki/CervicalCheck_cancer_scandal
3. R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Ann. Eugenics, vol. 7, no. 2, pp. 179–188, 1936.
4. “UCI machine learning repository: Iris data set,” Retrieved 2017-12-01. [Online]. Available: archive.ics.uci.edu
5. F. Gossen and B. Steffen, “Algebraic aggregation random forests: Towards explainability and rapid evaluation,” to be published.
6. R. I. Bahar, et al., “Algebraic decision diagrams and their applications,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Des. IEEE Comput. Soc. Press, 1993.
7. H. Hungar, B. Steffen, and T. Margaria, “Methods for generating selection structures, for making selections according to selection structures and for creating selection descriptions,” USPTO Patent number: 9141708, Granted Sep. 22, 2015. [Online]. Available: https://patents.justia.com/patent/9141708

DEPARTMENT: CYBERTRUST
EDITOR: Jeffrey Voas, IEEE Fellow, j.voas@ieee.org
This article originally appeared in vol. 52, no. 9, 2019

Disruptive Innovations and Disruptive Assurance: Assuring Machine Learning and Autonomy

Robin Bloomfield, Adelard LLP and City University of London
Heidy Khlaaf, Philippa Ryan Conmy, and Gareth Fletcher, Adelard LLP

Autonomous and machine learning-based systems are disruptive innovations and thus require a corresponding disruptive assurance strategy. We offer an overview of a framework based on claims, arguments, and evidence aimed at addressing these systems and use it to identify specific gaps, challenges, and potential solutions.

Digital Object Identifier 10.1109/MC.2019.2914775
Date of publication: 27 August 2019

The advancement and adoption of machine-learning (ML) algorithms constitute a crucial innovative disruption. However, to benefit from these innovations within security and safety-critical domains, we need to be able to evaluate the risks and benefits of the technologies used; in particular, we need to assure ML-based and autonomous systems.

The assurance of complex software-based systems often relies on a standards-based justification. But in the case of autonomous systems, it is difficult to rely solely on this approach, given the lack of validated standards, policies, and guidance for such novel technologies. Other strategies, such as “driving to safety,” that use evidence developed from trials and experience to support claims of safety in deployment are unlikely to be successful by themselves,1,2 especially if the impact of security threats is taken into account. This reinforces the need for innovation in assurance and the development of an assurance methodology for autonomous systems.

Although forthcoming standards and guidelines will eventually have an important, yet indirect, role in helping us justify behaviors, we need further development of assurance frameworks that enable us to exploit disruptive technologies. In this article, we focus on directly investigating the desired behavior (e.g., the safety property or reliability) of a system through an argument- or outcome-based approach that integrates disparate sources of evidence, whether from compliance, experience, or product analysis. We argue that building trust and trustworthiness through argument-based mechanisms, specifically the claims, arguments, and evidence (CAE) framework (see “The Assurance Framework”), allows for the accelerated exploration of novel mechanisms that would lead to the quality advancement and assurance of disruptive technologies (see Figures S1 and S2 in the “The Assurance Framework” sidebar).

The key advantage of a claim-based approach is that there is considerable flexibility in how the claims are demonstrated since different types of arguments and evidence can be used as appropriate. Such a flexible approach is necessary when identifying gaps and challenges in uncharted territory, such as the assurance of ML-based systems. Indeed, CAE is commonly used in safety-critical industries (such as defense, nuclear, and medical) to assure a wide range of systems and devices and support innovation in assurance.

We are developing a particular set of CAE structures that is generically applicable and helps identify how to construct trustworthy ML-based systems by explicitly considering evidence of sources of doubt, vulnerabilities, and mitigations addressing the behavior of the system. In doing this, we not only assure and determine challenges and gaps in behavioral properties but also self-identify gaps within the assurance

28 April 2021 Published by the IEEE Computer Society  2469-7087/21 © 2021 IEEE
framework itself. In the remainder of this article, we describe our systematic approach to identifying a range of gaps and challenges regarding ML-based systems and their assurance.

IDENTIFYING ASSURANCE CHALLENGES
The decision to trust an engineering system resides in engineering argumentation that addresses the evaluation and risk assessment of the system and the role of the different subsystems and components in achieving trustworthiness. Although previous abstractions, models, and relationships have been constructed in CAE for the assurance of traditional software systems, it is not clear if the said existing blocks are sufficient to provide compositional argumentation enabling trustworthiness in ML-based systems. For example, domain-specific abstractions and arguments may need to be developed in CAE to specifically target ML subcomponents.

To develop a detailed understanding of such assurance challenges, we use CAE to create an outline of an overall assurance case, proceeding from top-level claims, concerning an experimental autonomous vehicle and its social context, down to claims regarding the evaluation of subsystems, such as the ML model (Figure 1). The case study autonomous vehicle, as is typical with similar state-of-the-art vehicles, contains a heterogeneous mixture of commercial off-the-shelf (COTS) components, including image recognition, lidar, and other items. Apportioning the trustworthiness, dependability, and requirements of each component to consider the real-time and safety-related nature of the system is challenging. In traditional safety-critical

comparison and assessment of diverse subsystems’ contribution to defense in depth. This, in turn, can also inform future architectures of autonomous systems.

Beyond the study of the applicability of CAE to assure ML-based systems, the lens of the assurance case is used to identify gaps and challenges regarding techniques and evidence aimed at justifying desired system behaviors. This is further informed by a review of literature, a case study-based assessment of the experimental vehicle, and an investigation of our industry partners’ development processes to assess the current state of the vehicle and the short- to medium-term future vision of its use case (approximately two years). To see how and whether security is addressed in the product lifecycle, we used the new U.K. Code of Practice PAS 11281, Connected Automotive Ecosystems: Impact of Security on Safety.4

In the subsequent sections, we discuss some of the gaps identified regarding technical capabilities that may enable trust of system behaviors. We highlight three areas: requirements, security, and verification and validation (V&V). There are also issues of ethics, advanced safety analysis techniques, defense in depth, and diversity modeling that we do not address.

GAPS AND CHALLENGES

Innovation, trust, and requirements
There is a need to address the realities of the innovation lifecycle and progressively develop requirements, including those for trustworthiness and assurance. In this innovation approach, the vehicle is gradually developed from a platform to trial technologies to the final product (Figure 2). There is an assurance
engineering, there would be diversity and defense gap in that, when analyzing how much the technolo-
in depth to reduce the trust needed in specific ML gies need to be trusted, there must be an articulated
components; yet we do not know whether this is vision of what they will be used for. If the vision of how
practicable for ML-based systems. Argumentation something will be used is not clearly formulated, we
blocks may need to be further developed within CAE cannot assess how much we need to trust it or what
to determine how experimental data can allow for the the risks are.

CYBERTRUST

THE ASSURANCE FRAMEWORK

The claims, arguments, and evidence (CAE) framework supports the structured argumentation for complex engineering systems. It is based on an explicit claim-based approach to justification and relates back to earlier philosophical work by WigmoreS6 and ToulminS7 as well as drawing on theory and empirical research in recent years in the safety and assurance cases areas (see John Rushby’s analysis S4 for a rigorous review of the field).

At the heart of the CAE framework are three key elements (Figure S1). Claims are assertions put forward for general acceptance. They are typically statements about a property of the system or some subsystem. Claims asserted as true without justification are assumptions, and claims supporting an argument are subclaims. Arguments link evidence to a claim, which can be deterministic, probabilistic, or qualitative. They consist of “statements indicating the general ways of arguing being applied in a particular case and implicitly relied on and whose trustworthiness is well established” (see ToulminS7), together with validation of any scientific laws used. In an engineering context, arguments should be explicit. Evidence serves as the basis for justification of a claim. Sources of evidence can include the design, the development process, prior experience, testing (including statistical testing), or formal analysis.

[Figure S1 diagram. Node labels: Claim; Argument; Subclaim 1, Subclaim 2; Evidence 1, Evidence 2.]
FIGURE S1. The CAE notation.

In addition to the basic CAE concepts, the framework consists of CAE blocks that provide a restrictive set of common argument fragments and a mechanism for separating inductive and deductive aspects of the argumentation (Figure S2). These were identified by empirical analysis of actual safety cases. S5 The blocks are as follows:

»» Decomposition: There is partition of some aspect of the claim, or divide and conquer.
»» Substitution: A claim about an object is refined into a claim about an equivalent object.
»» Evidence incorporation: Evidence supports the claim, with an emphasis on direct support.
»» Concretion: Some aspect of the claim is given a more precise definition.
»» Calculation or proof: Some value of the claim can be computed or proved.

[Figure S2 diagram. Node labels: Top-Level Claim; Concretion; Claim (X); Decomposition, Substitution, or Calculation; Application of Argument Justified; Claim (A); Claim (B); Side Claims; Evidence Incorporation; Results R; Results R Directly Support Claim (A). Annotations: "The top-level claim is made precise with a concretion argument." "The claim cannot be directly shown by evidence, so one of the CAE blocks is selected to define subclaims." "Claim (A) is now precise enough to be directly supported/rebutted by evidence." "Validate the Argument."]
FIGURE S2. An example of CAE block use.

The framework also defines connection rules to restrict the topology of CAE graphical structures. The use of blocks and associated narrative can capture challenges, doubts, and rebuttals and illustrates how confidence can be considered as an integral part of the justification.

The basic concepts of CAE are supported by an international standard, S1 IAEA guidance, S3 and industry guidance. S2 To support CAE, a graphical notation can be used to describe the interrelationship of evidence, arguments, and claims. S3,S5 In practice, top desirable claims, such as “the system is adequately secure,” are too vague or are not directly supported or refuted by evidence. Therefore, it is necessary to create subclaim nodes until the final nodes of the assessment can be directly supported or refuted by evidence.

REFERENCES
S1. Systems and Software Engineering—Systems and Software Assurance, Part 2: Assurance Case, ISO/IEC 15026-2:2011, 2011.
S2. P. G. Bishop and R. E. Bloomfield, “A methodology for safety case development,” in Industrial Perspectives of Safety-Critical Systems: Proceedings of the Sixth Safety-Critical Systems Symposium, Birmingham 1998, F. Redmill and T. Anderson, Eds. London: Springer-Verlag, 1998, pp. 194–203.
S3. International Atomic Energy Agency, “Dependability assessment of software for safety instrumentation and control systems at nuclear power plants,” IAEA Nuclear Energy Series NP-T-3.27, 2018. [Online]. Available: https://www-pub.iaea.org/books/IAEABooks/12232/Dependability-Assessment-of-Software-for-Safety-Instrumentation-and-Control-Systems-at-Nuclear-Power-Plants
S4. J. Rushby, “The interpretation and evaluation of assurance cases,” SRI Int., Menlo Park CA, Tech. Rep. SRI-CSL-15-01, July 2015.
S5. R. Bloomfield and K. Netkachova, “Building blocks for assurance cases,” in Proc. IEEE Int. Symp. Software Reliability Engineering Workshops (ISSREW), Nov. 2014, pp. 186–191. doi: 10.1109/ISSREW.2014.72.
S6. J. H. Wigmore, “The science of judicial proof,” Virginia Law Rev., vol. 25, no. 1, pp. 120–127, Nov. 1938. doi: 10.2307/1068138.
S7. S. E. Toulmin, The Uses of Argument. Cambridge Univ. Press, United Kingdom. 1958.
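
To make the sidebar's structure concrete, the following is a minimal sketch of how claims, arguments, and evidence might be held in a small data structure, together with a check for leaf claims that have not yet been refined to the point where evidence supports them. The class names, field names, and the example claims are hypothetical illustrations; they are not part of the CAE framework, the cited standard, or any CAE tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    """A piece of evidence, e.g. a test report (hypothetical type)."""
    description: str


@dataclass
class Claim:
    """A claim node; `argument` names the CAE block used to refine it."""
    text: str
    argument: str = ""  # e.g. "decomposition", "concretion"
    subclaims: List["Claim"] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)


def unsupported_claims(claim: Claim) -> List[str]:
    """Return the text of every leaf claim with no evidence attached."""
    if not claim.subclaims:
        return [] if claim.evidence else [claim.text]
    gaps: List[str] = []
    for sub in claim.subclaims:
        gaps.extend(unsupported_claims(sub))
    return gaps


# Hypothetical example: a vague top-level claim refined by decomposition.
top = Claim(
    "The system is adequately secure",
    argument="decomposition",
    subclaims=[
        Claim("Component A is adequately secure",
              evidence=[Evidence("penetration test report")]),
        Claim("Component B is adequately secure"),  # no evidence yet
    ],
)

print(unsupported_claims(top))
```

A traversal of this kind only reports where evidence is missing; it says nothing about whether the attached evidence is sufficient, which remains a matter for the argument itself.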

This is particularly important for security and systemic risks, where the scale and nature of the deployment (such as a key part of an urban transport system) will lead to more onerous requirements that have to be reflected in the earlier technology trials and evaluations. Alternatively, more agile approaches would be to progressively identify these trust requirements as the innovation proceeds. But this might lead to solutions that do not scale and, in the extreme, could not be deployed. We believe that the innovation lifecycle subsequently presented is typical for many players in the industry and will be increasingly adopted as the ML components become more productized.

Security
Security is a fundamental and integral attribute of the technical themes of the project, in the requirements, V&V, and assurance research. While the requirements of the new PAS 11281 Code of Practice may be met in a mature implementation of the vehicle being studied, on the whole, the security will be challenging for industry, and advice must be provided on partial and project-specific implementation of the PAS that allows for maturity growth.

The security aspects need to be integrated into the entire lifecycle: systems are not safe if they are not secure. This applies to the vehicle as a whole as well as to the ML subsystems; most ML systems have not been designed with a systematic attention to security.10 The PAS clauses address the following areas and are equally applicable to the vehicle and its components:
1. security policy, organization, and culture
2. security-aware development process
3. maintaining effective defenses
4. incident management
5. secure and safe design
6. contributing to a safe and secure world.

[Figure 1 diagram. Node labels: Deployed Vehicle Adheres to Safety Requirements; Supply Chain; Safety Requirements; Functional Decomposition; Sensors Meet Requirements; Vision Meet Requirements; Sensor Fusion Meet Requirements; Localization Meet Requirements; Route Planning Meet Requirements; Concretion; Robo Vision Meet Requirements.]
FIGURE 1. A high-level example of an assurance subcase in CAE.

As we noted previously, the deployment of autonomous technologies may follow an innovation lifecycle that first focuses on functionality and seeks to progressively add additional assurance and security. This will make the development of the assurance and safety cases and associated security and safety risk assessments particularly challenging. From our experience, we recommend the following:

1. Explicitly define the innovation cycle and assess the impact and feasibility of adding assurance and security.
2. Address the approach to security-informed safety at all stages of the innovation cycle. If safety, security, and resilience requirements are largely undefined at the start of the innovation cycle, the feasibility of progressively identifying them during the cycle should be assessed, together with the issues involved in evolving the architecture and increasing the assurance evidence.
3. Apply PAS 11281 to systematically identify the issues. Use a CAE assurance case framework and map PAS clauses to this to provide a systematic approach to applying the PAS.
4. Consider a partial and project-specific implementation of the PAS to meet the innovation cycle.
5. Collect experience in developing a security-informed safety case and integrating security issues into the safety analyses needed to implement the PAS.

[Figure 2 diagram. Stage labels: Lab Components, Tests; Constrained Environment and Test Pilots; Richer Environment and Test Pilots; Road Vehicle in Public Use (Small Scale); Widespread Use in Current Infrastructure; Widespread Use in Heterogeneous Smart Cities. Axis labels: Attack Surface and Impact, Increasing; Greater Speed, Kinetics.]
FIGURE 2. The typical stages of development from innovation to products.

V&V
We use the assurance case in CAE top-down to identify the claims we wish to support and bottom-up to evaluate the evidence that could be provided by them and, hence, systematically assess gaps, challenges, and solutions. This is shown schematically in Figure 3. As part of this analysis, we assessed state-of-the-art formal methods for autonomous systems and observed that their maturity and applicability are lacking for sufficiently justifying behavioral and vulnerability claims.

Consider the issue of adversarial attacks and perturbations,5,6 which has been particularly challenging with regard to the robustness of ML algorithms. Verification researchers have focused on the property of pointwise robustness, in which a classifier function f′ is not robust at point x if there exists a point y within the neighborhood of x such that the classification of y is not the same as the classification of x. That is, for some point x from the input, the classification label remains constant within the neighborhood of x, even when small-value deltas (i.e., perturbations) are applied to x. A point x would not be robust if it were at a decision boundary, and adding a perturbation would cause it to be categorized in the next class. Generally speaking, the idea is that a neighborhood should be reasonably classified as the given class.

However, proposed pointwise robustness verification methods8–10 suffer from the same set of limitations.

›› There is a lack of clarity on how to define meaningful regions and manipulations.
    ○ The neighborhoods surrounding a point x that are currently used are arbitrary and conservative.
›› We cannot enumerate all x points near which the classifier should be approximately constant; that is, we cannot predict all future inputs.

Furthermore, researchers have been unable to find compelling threat models that required perturbation indistinguishability,12 and it has been demonstrated that the lp norm, which defines the neighborhood region, is a poor proxy for measuring what humans actually see.13 Finally, adversarial perturbations can be achieved by much simpler attacks that do not require ML algorithms (e.g., covering a stop sign). Thus, the extent to which these techniques can provide us with any level of confidence is not very high.

Other verification techniques7,9 aim to verify more general behaviors regarding ML algorithms, instead of just pointwise robustness. Such techniques require functional specifications, written as constraints, to be fed into a specialized linear-programming solver to be verified against a piecewise linear constraint model of the ML algorithm. However, the generalization of these algorithms is challenging, given the requirement of well-defined and bounded traditional system specifications, devoid of specifications regarding the
behavior of the ML algorithm itself. These techniques are thus applicable to well-specified deterministic ML algorithms and cannot be applied to perception algorithms, which are notoriously difficult to specify, let alone verify.

[Figure 3 diagram. Node labels: Vision System Is Safe; Attribute Decomposition; Security Is Adequate; Functionality Is Adequate; Accuracy Is Adequate; Robustness Is Adequate; Operability Is Adequate; Reliability Is Adequate; Failure Integrity Is Adequate; Decomposition by Aspects of Robustness; Pointwise Robustness Is Adequate; Decomposition by Sources of Unreliability; Absence of Runtime Errors; three nodes marked Gap.]
FIGURE 3. The use of CAE to assess V&V gaps.

Apart from the ML algorithm, the assurance of the non-ML supporting components of an autonomous system is challenging, given that the use of COTS or open source components leads to uncertain provenance. Errors within non-ML components can propagate and affect the functionality of the ML model.14 It is, therefore, important to explore how traditional V&V methods, in particular static analysis of C code, can provide assurance for the larger ML system, offering confidence beyond the component level. In the following, we provide a preliminary list of results from analyzing YOLO, a commonly used open source ML vision software, and a number of different run-time errors that were identified:

›› a number of memory leaks, such as files opened and not closed, and temporarily allocated data not freed, leading to unpredictable behavior, crashes, and corrupted data
›› a large number of calls to free where the validity of the returned data is not checked [this could lead to incorrect (but potentially plausible) weights being loaded to the network]
›› potential “divide by zeros” in the training code (this could lead to crashes during online training, if the system were to be used in such a way)
›› potential floating-point divide by zeros, some of
which were located in the network cost calculation function (as noted above, this could be an issue during online training).

These errors would be applicable only to languages such as C and C++. Not all errors would be relevant to a language such as Python, used in the implementation of numerous ML libraries and frameworks, as the semantics and implementation of the language itself do not enable overflow/underflow errors, defined by Hutchison et al.14 However, Python is a dynamically typed language, bringing about a different set of program errors not exhibited by statically typed languages (such as type errors). Unfortunately, no static analysis techniques or tools exist to allow for the analysis of Python code. Furthermore, it is unclear how potential faults arising from dynamic languages could affect the functionality of an ML model itself. This is a large gap within the formal methods field that needs to be addressed immediately, given the deployment of autonomous vehicles utilizing Python.

There is a need for disruptive innovation in the assurance of autonomous and ML-based systems. We provided a summary of the outcome-focused, CAE-based framework we are evolving to address these systems and used it to identify specific gaps and challenges; we also discussed some solutions. We demonstrated the feasibility of deploying the best of existing work (e.g., advanced static analysis techniques) and identified the need for new approaches. Overall, there is a need for stronger evidence and techniques to assure the dependability of ML components and for autonomous systems as a whole. Indeed, there is common good in sharing techniques and strategies regarding development lifecycles, diversity, security, and V&V algorithms in sufficient detail for independent analysis and research. We hope to play our part in this by sharing our generic developed assurance case and providing, in the public domain, the more detailed report this article is based on. If we can achieve our goal of disruptive assurance, this can have a positive impact on innovation in a wide range of industries and technologies, not just ML-based ones.

ACKNOWLEDGMENTS
This article discusses work undertaken within the Towards Identifying and closing Gaps in Assurance of autonomous Road vehicleS (TIGARS) project. The project is a collaboration between Adelard, Witz, the City University of London, the University of Nagoya, and Kanagawa University. This work is partially supported by the Assuring Autonomy International Programme, a partnership between Lloyd’s Register Foundation and the University of York. We acknowledge the additional support of the U.K. Department for Transport.

REFERENCES
1. N. Kalra and S. Paddock, Driving to Safety: How Many Miles of Driving Would It Take to Demonstrate Autonomous Vehicle Reliability? Santa Monica, CA: RAND Corporation, 2016. [Online]. Available: https://www.rand.org/pubs/research_reports/RR1478.html
2. P. Koopman and M. Wagner, “Challenges in autonomous vehicle testing and validation,” SAE Int. J. Transp. Safety, vol. 4, no. 1, pp. 15–24, 2016.
3. R. Bloomfield, P. Bishop, E. Butler, and R. Stroud, “Security-informed safety: Supporting stakeholders with codes of practice [Cybertrust],” Computer, vol. 51, no. 8, pp. 60–65, Aug. 2018.
4. Connected Automotive Ecosystems—Impact of Security on Safety, British Standards Institution, PAS 11281, 2018.
5. C. Szegedy et al., Intriguing properties of neural networks. 2013. [Online]. Available: https://arxiv.org/abs/1312.6199
6. I. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in Proc. Int. Conf. Learning Representations—Computational and Biological Learning Society, 2015. [Online]. Available: https://arxiv.org/abs/1412.6572v3
7. L. Pulina and A. Tacchella, “An abstraction-refinement approach to verification of artificial neural networks,” Computer Aided Verification, CAV 2010, Lecture Notes in Computer Science, vol 6174, T. Touili, B. Cook, and P. Jackson, Eds. Berlin: Springer, pp. 243–257.
8. X. Huang, M. Kwiatkowska, S. Wang, and M. Wu, Safety verification of deep neural networks. 2016. [Online]. Available: https://arxiv.org/abs/1610.06940
9. G. Katz, C. Barrett, D. Dill, K. Julian, and M. Kochenderfer, Reluplex: An efficient SMT solver for verifying deep neural networks. 2017. [Online]. Available: https://arxiv.org/abs/1702.01135
10. N. Papernot, P. McDaniel, A. Sinha, and M. Wellman, Towards the science of security and privacy in machine learning. 2016. [Online]. Available: https://arxiv.org/abs/1611.03814
11. W. Ruan, X. Huan, and M. Z. Kwiatkowska, Reachability analysis of deep neural networks with provable guarantees. 2018. [Online]. Available: http://arxiv.org/abs/1805.02242
12. J. Gilmer, R. Adams, I. Goodfellow, D. Andersen, and G. Dahl, Motivating the rules of the game for adversarial example research. 2018. [Online]. Available: https://arxiv.org/abs/1807.06732
13. Z. Wang and A. C. Bovik, “Mean squared error: Love it or leave it? A new look at signal fidelity measures,” IEEE Signal Process. Mag., vol. 26, no. 1, pp. 98–117, 2009.
14. C. Hutchison et al., “Robustness testing of autonomy software,” in Proc. IEEE/ACM 40th Int. Conf. Software Engineering: Software Engineering in Practice Track, Gothenburg, Sweden, May 27–June 3, 2018, pp. 276–285.

ROBIN BLOOMFIELD is with Adelard LLP and the City University of London. Contact him at reb@adelard.com or reb@csr.city.ac.uk.

HEIDY KHLAAF is with Adelard LLP. Contact her at hak@adelard.com.

PHILIPPA RYAN CONMY is with Adelard LLP. Contact her at pmrc@addelard.com.

GARETH FLETCHER is with Adelard LLP. Contact him at gtf@adelard.com.



EDITOR: Christof Ebert, Vector Consulting Services, christof.ebert@vector.com
COLUMN: SOFTWARE TECHNOLOGY
This article originally appeared in vol. 36, no. 5, 2019

Validation of
Autonomous Systems
Christof Ebert and Michael Weyrich

FROM THE EDITOR


Autonomous systems are widely used. Yet, for lack of transparency, we are increasingly suspicious of
their decision making. Traditional validation, such as functional testing and brute force, won’t help, due
to complexity and cost. To achieve dependability and trust we need dedicated, intelligent validating tech-
niques that cover, for instance, dynamic changes and learning. Michael Weyrich and I provide industry
insights into validating autonomous systems. I look forward to hearing from both readers and prospec-
tive authors about this article and the technologies you want to know more about. —Christof Ebert

Digital Object Identifier 10.1109/MS.2019.2921037
Date of publication: 20 August 2019

Society today depends on autonomous systems, such as intelligent service systems, self-driving trains, and remote surgeries.1 The ultimate validation of the Turing test is that we often do not recognize autonomous systems. This growing usage poses many challenges, such as how to provide transparency, which rules or learning patterns are applied in a complex situation, and if these rules are the right ones. Validation is the key challenge, of which we will provide an overview in this article.

With machine learning and continuous over-the-air upgrades and updates, a core tenet of any quality strategy is continuous verification and validation. Corrections and changes must be deployed in a fluid and continuous scheme, reliably over the air. We will face future scenarios where software-driven systems, and maybe whole infrastructures, must not be started if they do not include all of the latest software upgrades. Automobiles and manufacturing processes that are safety critical fall into that category. Even more demanding are medical devices, which must provide a hierarchical software assurance because there is no room for failure.

Autonomous systems have multiple complex interactions with the real world. They perceive and act in the environment, based upon the reflections of an intelligent control system, and they have an increasing impact on our lives as they implement and execute high-level tasks without detailed programming or direct human control. Unlike automated systems, which execute a carefully engineered sequence of actions, they are self-governing their course of action to independently achieve their goals.

Figure 1 indicates the five steps from automation to autonomy as we know them from human learning, where we advance from novice to expert. Those steps exemplify the progress of a simple and “assisted behavior” from low-level sensing and control toward “full cognitive systems” with a very high degree of autonomy. Automated systems are gradually enhanced to develop a skilled behavior along with enhanced mission planning and control and execution capabilities that will eventually lead to the full cognitive actions of an autonomous system. It is expected that an intelligent behavior can be identified by acquiring knowledge and understanding, which entails system functionalities such as perception, reflection, and action in terms of a cognition.

[Figure 1 diagram. Stages: 1 Assisted, 2 Partially Automated, 3 Conditionally Automated, 4 Highly Automated, 5 Fully Automated, spanning Automated Systems to Autonomous Systems. Behavior labels: Low-Level Sensing and Control; Reactive Behavior; Skilled Behavior; Mission Planning, Control, and Execution; Cognitive Behavior. Aspect labels: Perception, Reflection, Action, Cognition.]
FIGURE 1. The five steps from automation to autonomy.

A completely autonomous car at level 5 is supposed to drive with no human intervention, even in dire situations. This implies that the car must have intelligence on par with or better than humans to handle not just regular traffic scenarios but unexpected ones. Although several players, such as Google and Uber, are granted permission to operate their self-driving services, deadly incidents put our faith in these cars to a test.2 It is quite apparent that existing validation measures aren’t enough.3 We need new test methods that can envision fatal traffic situations that humans haven’t encountered yet. In addition, testing cannot simply be isolated to the final development stages. It must be part of every phase in the product lifecycle. A sensible engineering process must be adopted in the development of autonomous cars that lays enough emphasis on testing and validation.

Unlike an automated system, which cannot reflect on the consequences of its actions and cannot change a predefined sequence of activities, an autonomous system is meant to understand and decide how to execute tasks based on its goals, skills, and a learning experience. While contemplating the deficiencies of autonomous systems, we should acknowledge that humans have natural limits, in terms of processing speed, repeatability of tasks, handling complexity, and so forth. In fact, in aerospace, we already trust autonomous flying, and for automotive applications, automation is forecast to reduce deadly accidents by 90%.4 Autonomous systems can become an aid in the future, in areas such as automated and autonomous driving, flying, and production robotics.

VALIDATION OF AUTONOMOUS SYSTEMS

Autonomous systems provide efficiency and safety as they relieve human operators from tedious manual activities. For instance, the widespread use of self-driving cars could eliminate as much as 50% of a person’s daily commuting time.4 As exciting as this may sound, the question “Can we trust the autonomous systems?” will grow for years to come. Public confidence in autonomous systems depends heavily on algorithmic transparency and continuous validation. Recently, we have seen several dramatic accidents, such as an automated car misinterpreting a white truck as a white cloud, and another one overlooking pedestrians on a road, thus killing people. One spectacular accident happened when an automated vehicle continued along while its driver had a heart attack and could not supervise it. Within a few seconds, the automated vehicle killed a mother and child as it tried to avoid colliding with a tree. Hitting the tree might have killed the driver, but innocent people in the surrounding environment would have been safe.

There are many open questions about the validation of autonomous systems: How do we define reliability? How do we trace back decision making and judge it after the fact? How do we supervise these systems? How do we define liability in the event of failure?

38 ComputingEdge April 2021


SOFTWARE TECHNOLOGY

Simulation Environments
Figure 2 provides an overview
With MIL/SIL
of validation technologies for Brute-Force Usage in the
autonomous systems. We distin- Simulation Environments Real World, While Running
Automatic Realistic Scenarios
guish, horizontally, the transpar- With MIL, HIL, and SIL

ency of the validation. Black box means that we have no insight into the method and coverage, while white box denotes transparency. The vertical axis classifies the degree to which we can automate validation techniques and, for instance, facilitate regression strategies through software updates and upgrades.
Let us look at traditional testing techniques (see Figures 1 and 2) and evaluate their behaviors. Table 1 provides the complete evaluation of static and dynamic validation technologies for autonomous systems. Negative requirements (such as safety and cybersecurity) are typically implied and not explicitly stated in the system specifications.5 The following sections explain how these methods are applied to validate autonomous cars.

FIGURE 2. The validation technologies for autonomous systems. FMEA: failure mode and effects analysis; FTA: fault tree analysis; AI: artificial intelligence; MIL: model in the loop; HIL: hardware in the loop; SIL: software in the loop. (Axes: validation handling, vertical; validation strategy from white box to black box, horizontal.)

Fault Injection
Fault injection techniques make use of external equipment to insert faults into a target system’s hardware, with or without direct contact. By having direct contact, faults, such as forced current addition, forced voltage variations, and so forth, can be injected to observe the behavior of the system. Faults can be introduced without making physical contact by using methods such as heavy-ion radiation, exposure to electromagnetic fields, and so on. Such fault injections can cause bit flips, hardware failure, and similar events that are not tolerated in safety-critical systems.

Functionality-Based Testing
Functionality-based test methods categorize the intelligence of a system into three classes: 1) sensing, 2) decision, and 3) action functionalities. The idea behind such methods is that an autonomous vehicle should be able to retrieve various functionalities for a given task analogous to human beings. For example, a vehicle should be able to recognize other cars and trucks, pedestrians, and so forth for vision-based functionality. Combinations of these recognized objects can act as inputs to decision functionality, and several decisions can lead to actions. Functionality-based testing breaks down the scenarios into various operational components that can be tested individually.

Hardware in the Loop
Although simulation tries to encapsulate the real world as closely as possible, inherent limitations invariably create a void between the two. Hardware in the loop (HIL) closes this gap a little by using physical components for certain aspects of simulation. For example, a camera model in a simulation technique can be replaced by an actual camera. The input to the camera can be fed by means of a computer screen where videos of various real-time traffic conditions are played to validate the behavior of the car. A more advanced technique has been proposed for autonomous systems that are tested by robots, for instance, vehicle HIL, where the simulated vehicles in traffic have been replaced by moving robots. This has the advantage that, in addition to the camera, radar and lidar hardware can be tested using HIL.
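
The bit flips mentioned in the Fault Injection section can also be emulated purely in software. That is a different and much cheaper technique than the equipment-based injection the article describes, so the following is only a minimal sketch under that assumption. The names `flip_bit`, `checked_speed`, and `run_campaign`, as well as the 300 km/h plausibility limit, are hypothetical and chosen for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its IEEE 754 double encoding inverted."""
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in 0..63")
    # Reinterpret the double as a 64-bit unsigned integer, flip, convert back.
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (faulty,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return faulty


def checked_speed(speed_kmh: float, limit_kmh: float = 300.0) -> float:
    """Hypothetical plausibility check: accept only readings in 0..limit."""
    if not (0.0 <= speed_kmh <= limit_kmh):
        raise ValueError(f"implausible speed reading: {speed_kmh!r}")
    return speed_kmh


def run_campaign(nominal: float) -> dict:
    """Flip each of the 64 bits once; count detected versus silent faults."""
    outcome = {"detected": 0, "silent": 0}
    for bit in range(64):
        try:
            checked_speed(flip_bit(nominal, bit))
            outcome["silent"] += 1      # corrupted value passed the check
        except ValueError:
            outcome["detected"] += 1    # the check caught the corruption
    return outcome
```

Calling `run_campaign(50.0)` shows how many single-bit corruptions of a 50 km/h reading slip past a simple range check. The silent cases are the ones a safety-critical system would need additional redundancy to catch.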

TABLE 1. The evaluation of validation technologies for autonomous systems.

Vehicle in the Loop
Human interaction can have a drastic influence on the behavior of partially automated cars. The methods specified earlier fail to account for this reality. In vehicle-in-the-loop simulations, real cars are used, though in a safe environment. A driver is shown simulated feeds of the external environment to capture the driver’s interaction with the car. The car travels across a ground devoid of obstacles, simulating inertial effects and simultaneously responding to the external feed. The greatest advantage to this method is safety: Since there are no real obstructions involved, no harm will be incurred by the test drivers, even if they encounter dangerous situations.

Simulators
Simulators are closed, indoor cubicles that act as substitutes for physical systems. They can replicate the behavior of any system by using hardware and a software model. The behavior of a driver can be captured by immersing the driver in a replicated external environment. Since simulators employ hydraulic actuators and electric motors, the inertial effects they generate feel nearly the same as the real-life version. They are used for robots in industrial automation, surgery planning in medicine, and railway and automotive applications.

Brute Force
Nothing can come closer to the real world than the real world itself. This is perhaps the final validation phase, where a completely ready system is physically driven onto roads with actual traffic. The sensor data are recorded and logged to capture behavior in critical situations. They are analyzed to accommodate and fine-tune the system according to everyday scenarios. The challenge in this stage, however, lies in the sheer amount of test data that are generated. A stereo video camera, alone, generates 100 GB of data for every kilometer driven. In such situations, big data analysis becomes extremely important.

Intelligent Validation Techniques
Intelligent validation techniques tend to automate the complete testing process or certain aspects of testing. This eliminates the potential errors associated with manual derivations of test cases, since humans may fail to recognize and think about certain scenarios. It also eradicates the enormous amount of time that needs to be invested to obtain the test cases. The “Intelligent Testing” section summarizes some approaches that attempt to derive such validation techniques.
Truly transparent validation methods and processes assume the utmost relevance and will be challenged by the progress of technology through the five steps toward autonomous behavior that are sketched in Figure 1. Although they are still relevant, traditional validation methods aren’t enough to completely test the growing complexity of autonomous cars. Machine learning, with situational adaptations and software updates and upgrades, demands novel regression strategies. Figure 2 provides a map of the different testing techniques.

INTELLIGENT TESTING
With AI and machine learning, we need to satisfy algorithmic transparency. For instance, what are the rules, in a neural network that is obviously no longer algorithmically tangible, to determine who gets a credit or how an autonomous vehicle might react to several hazards at the same time? Classic traceability and regression testing will certainly not work. Future verification and validation methods and tools will include more intelligence based on big data exploits, business information, and the processes’ ability to learn about and improve software quality in a dynamic way.4
A key question concerns in which way AI can support the process of validation. Obviously, there are many AI approaches, ranging from rule-based systems, fuzzy logic,6 and Bayesian nets to the multiple neural network approaches to deep learning. However, the process of validating an autonomous system is multilayered and rich in detail. Various levels of validation testing can be distinguished, such as the systems level, the components, and the modules.
The potential for intelligent testing is manifold. On a system level, there are questions about which test cases must be executed and to what extent. This means that intelligent validation is required to help with the selection and even the creation of test cases. A first step in that direction would be an assistance functionality that helped to identify priorities in an existing set of cases. As a result, a validation expert


COGNITIVE TESTING FOR AUTONOMOUS SYSTEMS

In our industrial projects, we often face the challenge of how systems can be validated, and safety assured, when they undergo a change during operation. Updates over the air are commonly used for functional modifications of software-based automated systems. Be they in manufacturing, automotive applications, or intelligent building, automated systems are mostly component based; they consist of multiple control units that are distributed. Each unit is in a certain location and has a specific functionality that it provides to the overall system.
Unwanted behavior and basic functional errors might occur somewhere in a distributed system because of an alteration elsewhere. How can such a system be safeguarded when changes in its components occur during runtime? How can safety and security certifications be maintained after a software modification happens within a single module?
A test certification requires an understanding of the effect of a change that is triggered somewhere in a software module and has impacts elsewhere. How can this interaction be deduced and the consequences for all modules be verified without testing the whole system again from scratch? The method presented here applies an artificial intelligence (AI) that can ascertain the consequences of an individual change in all the control units.
From our industry experience, we recommend a three-step approach to assess the impacts of software updates and upgrades (see Figure 3). First, the alteration in the system needs to be identified in terms of its origin in a module and its localization in the network. Second, a logical model of the overall system is composed to understand the impact on other modules. However, this model is distributed and needs to be automatically processed from the multiple submodules of the components that are available.
Third, a process of functional verification is required to check how the change is propagated and what it means with respect to potential malfunctions in the distributed system. This AI can be used to test and safeguard following a stepwise procedure for testing. It only requires the specification of the control models and their intended interaction with the other modules, upon which the overall functionality can be deduced and test certificates can be obtained on request.
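
The three steps of the sidebar (locate the change, compose a dependency model, check how the change propagates) can be pictured as a graph traversal. The sidebar does not describe its AI method at this level of detail, so the following is only an illustrative stand-in: a plain breadth-first search over a hand-written dependency map. The unit names, the `DEPENDENTS` map, and the `TESTS` catalog are all hypothetical.

```python
from collections import deque

# Hypothetical dependency model: each key is a control unit, each value lists
# the units that consume its outputs (an assumption made for illustration).
DEPENDENTS = {
    "camera": ["object_detection"],
    "radar": ["object_detection"],
    "object_detection": ["path_planning"],
    "path_planning": ["brake_ctrl", "steering_ctrl"],
    "brake_ctrl": [],
    "steering_ctrl": [],
    "infotainment": [],
}

# Hypothetical test catalog: test name -> units that the test exercises.
TESTS = {
    "test_pedestrian_stop": {"camera", "object_detection", "brake_ctrl"},
    "test_lane_keeping": {"camera", "path_planning", "steering_ctrl"},
    "test_radio_presets": {"infotainment"},
}


def impacted_units(changed: str, dependents: dict) -> set:
    """Breadth-first search for every unit reachable from the changed one."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        unit = queue.popleft()
        for nxt in dependents.get(unit, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def select_tests(changed: str, dependents: dict, tests: dict) -> list:
    """Keep only the tests that touch at least one impacted unit."""
    impacted = impacted_units(changed, dependents)
    return sorted(name for name, units in tests.items() if units & impacted)
```

With this toy model, a change to `radar` selects the two driving tests and skips the infotainment test. That is the kind of reduced regression set the sidebar aims for, although a real dependency model would be derived from the component specifications rather than written by hand.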

would be able to test faster and with a better coverage of situationally relevant scenarios. On the level of a component or module,7 it is also required to identify relevant cases. This can range from a simple support mechanism for how to feed a system with adequate inputs and checks on the outputs, to complex algorithms that automatically create test cases based on code or a user interface. Figure 3 provides an overview of intelligent testing as we ramp up for autonomous systems. Unlike brute force, intelligent testing considers the white-box and black-box dependencies and, thus, balances efficiency and effectiveness. See “Cognitive Testing for Autonomous Systems” for a concrete case study.

FIGURE 3. Intelligent testing for autonomous systems. SOA: service-oriented architecture; P: process; PS: production sensor; WP: work package. (Panels: Autonomous System, with component specifications, SOA networking, and a functional change; Intelligent Testing, with a model database and dependency model; AI-Based Testing, with six steps: 1. Develop a component model and dependency model. 2. Develop a dynamic test strategy. 3. Identify changes. 4. Compose relevant submodels for regression. 5. Automatically analyze change impacts. 6. Automatically select test cases for minimum effort and necessary coverage.)

PERSPECTIVES
Verification and validation depend on many factors. Every organization implements its own methodology and development environment, based on a combination of several of the tools presented in this article. It is important not only to deploy tools but to build the necessary verification and validation competences. Too often we see solid tool chains but no tangible test strategies. To mitigate these purely human risks, software must increasingly be capable of detecting its own defects and failure points. Various intelligent methods and tools will evolve that can assist with smart validation of autonomous systems. However, even with the support of the smartest intelligent algorithms, the question remains how to build the public’s trust that autonomous systems can be validated while considering ethical dilemmas, such as the accident when the mother and child were killed.
With the growing concern of users and policy makers about the impact of autonomous systems on our lives and society, software engineers must ensure that autonomy acts better than humans. Clearly, we are not talking about a few percentage points. To build trust, we need a level of quality at least one order of magnitude higher than human-operated systems. It is, above all, a question of validation to achieve trust. Alan Turing, who was one of the first to consider AI in real life, remarked wisely, “We can only see a short distance ahead, but we can see plenty there that needs to be done.” This remains true for a rather long transition period, and intelligent validation will play a pivotal role.

REFERENCES
1. M. Weyrich and C. Ebert, “Reference architectures for the Internet of Things,” IEEE Softw., vol. 33, no. 1, pp. 112–116, Jan.–Feb. 2016.
2. M. Santori and D. A. Hall. (2016). Tackling the test challenge of next generation ADAS vehicle architecture. National Instruments. Austin, TX. [Online]. Available: http://download.ni.com/evaluation/automotive/Next_Generation_ADAS_Vehicle_Architectures.pdf
3. M. Rodriguez, M. Piattini, and C. Ebert, “Software verification and validation technologies and tools,” IEEE Softw., vol. 36, no. 2, pp. 13–24, Mar. 2019.
4. P. Gao, H.-W. Kaas, D. Mohr, and D. Wee. (2016, Jan.). Automotive revolution: Perspective towards 2030. McKinsey & Co., New York. [Online]. Available: https://www.mckinsey.com/~/media/mckinsey/industries/high%20tech/our%20insights/disruptive%20trends%20that%20will%20transform%20the%20auto%20industry/auto%202030%20report%20jan%202016.ashx
5. Road vehicles—Safety of the intended functionality, International Organization for Standardization, 21448, 2019.
6. C. Ebert, “Rule-based fuzzy classification for software quality control,” Fuzzy Sets Syst., vol. 63, no. 3, pp. 349–358, May 1994. doi: 10.1016/0165-0114(94)90221-6.
7. A. Zeller and M. Weyrich, “Composition of modular models for verification of distributed automation systems,” in Proc. 28th Int. Conf. Flexible Automation and Intelligent Manufacturing (FAIM2018), Columbus, OH, 2018, pp. 870–877.

CHRISTOF EBERT is the managing director of Vector Consulting Services. He is on the IEEE Software editorial board and teaches at the University of Stuttgart, Germany, and the Sorbonne in Paris. Contact him at christof.ebert@vector.com.

MICHAEL WEYRICH is the director of the Institute of Industrial Automation and Software Engineering at the University of Stuttgart, Germany. Contact him at michael.weyrich@ias.uni-stuttgart.de.

DEPARTMENT: ANECDOTES
This article originally
appeared in

vol. 42, no. 2, 2020

Queens of Code
Eileen Buckholtz, Director, Queens of Code Project

INTRODUCTION TO THE QUEENS OF CODE
Queens of Code is a women's technology history project—a collection of stories, experiences, and insights from women who worked in information technology at the National Security Agency (NSA) in the 1960s, 1970s, and 1980s. NSA's computing women programmed and managed the most sophisticated systems of their day and I was one of them. I started this project in 2018 to collect the stories of the agency's women technology pioneers and recognize their contributions because I believed that if we did not document these stories now while many of us are still living, our history would never be told. The National Cryptologic Museum and NSA's historians offered encouragement. I reached out to women I had worked with, and dozens signed up. Participants were asked to complete a detailed questionnaire and write their stories. All material had to be approved through NSA's prepublication review. We have been networking online for almost two years and have more than 75 women in the group. The goals for the project are recognition of the Queens of Code in the history of computing, expanding the understanding of how women worked in early computing, and inspiring more young women to pursue STEM careers. We are sharing our stories in presentations, articles, and interviews.
Because these NSA women's jobs were often top secret and they worked on the most sensitive national security programs, they could not discuss what they did, even with their families. In many cases, they could not even confirm they worked for NSA. They and their computing activities have been, for practical purposes, a secret for more than 50 years.
Women have always been in the workforce—although their contributions to science have often gone unrecognized. In the 20th century, women worked for the U.S. government and military, not just in clerical, nursing, and other “women's” positions, but in specialized technical fields such as cryptology, mathematics, and computing. The U.S. military during World War II actively recruited educated and talented women, including those from some of the best colleges, to fill critical vacancies and to “free a man to fight.” These women often found themselves doing tedious work, but gained a foothold in the technical workplace.
According to Liza Mundy's Code Girls, over 10,000 women were a critical part of the cryptologic mission, some working with the early computing machines.1 In the U.S. many women who had technical skills were sent home after the war to free the jobs for men returning from war. More generally, women's place in computer history has not been publicized because it has largely been HIStory, focusing on hardware and the male inventors,2 as I saw on my visit to the Computer History Museum in Mountain View, California, in July 2018.
Fortunately, modern cryptology, in particular, was welcoming to women from the start. Elizebeth Friedman and Agnes Driscoll led the way in the 1920s and 1930s.3 The work of the “Code Girls” during World War II was critical for winning the war. Like their contemporaries at NASA, whose story was told in the bestselling book and hit movie Hidden Figures, the women at NSA, walking in the footsteps of their World War II sisters, have broken ground from the 1960s on as they contributed to advances in computing in the world of cryptology.4

Digital Object Identifier 10.1109/MAHC.2020.2982751
Date of current version 29 May 2020.

46 April 2021 Published by the IEEE Computer Society  2469-7087/21 © 2021 IEEE
Many of the Queens of Code were recruited by NSA right after college and worked in computing technology for 30, 40, and even 50-year careers. I was one of those queens, hired in 1970, with one of the first undergraduate degrees in computer science in the country. Starting from data systems interns and rising to senior leaders and computer science experts, we were on the forefront of computer technology development. In the 1960s, 1970s, and 1980s, our agency had the most sophisticated computers in the world as well as the most challenging information processing requirements. By 1968, NSA had more than 100 computers spread over five acres of computer rooms.5 The inventory grew rapidly over the next decades as we and our male colleagues worked with many vendors to drive new system development to meet our big data processing needs.
Our stories may also provide some insight to companies today that struggle to recruit and retain women in tech. In contrast to corporations and institutions in various other sectors, NSA did a lot right over a 50-to-60-year period to recruit, develop, and retain their computing women. They had learned from previous experience with the Code Girls during World War II that women were a valuable asset to their mission. They invested in us through training, intern programs, and advanced degrees, paid equal starting salaries for men and women, gave women responsibility and credit, promoted many women to senior management and technical positions, and provided a good work/life balance. Fortunately for us, most of the men we worked with were supportive as well. Of course, there were some struggles along the way, including a class-action lawsuit over fair promotion in the 1970s,6 but we prevailed. The Queens of Code made a daring leap into a new career field of computer science and found innovative, exciting, and rewarding careers that contributed to the high-tech world we live in today.
The rest of this article highlights some of the experiences I, and many other women, shared working on our first computers.

FIRST ENCOUNTERS OF A BINARY KIND
If you grew up in the 1980s or 1990s as part of the millennial generation, your first experience on a computer might have been with a personal computer at home or at school. You might have learned to program in BASIC using my Micro Adventure books7 on an Apple II, Radio Shack TRS-80, Atari, or IBM PC. If you are part of Generation Z, you probably played games on your first computer tablet or smart phone maybe as early as a toddler. E-books, apps, and online shopping and learning are things you took for granted.
That was not the case when the Queens of Code were young. The ARPANET (the early version of the Internet that had just begun to come online in 1969) only connected some dozens of government agencies, universities, and other research organizations, and the World Wide Web had not been invented. Back in the early 1970s, one of our offices did have a terminal that we could use with a modem to dial into the National Bureau of Standards’ ARPANET. From NBS, we could connect to the Stanford Research Institute—and it took dozens of steps to send a line of text along with manually calculated checksums (a digit that was the sum of the other digits in a piece of data used to detect errors).
When we were growing up, there was little digital computing technology in the schools we attended before college. Pocket-size calculators made their debut in the 1970s. Before then, in high school or college, we used a slide rule (the manual device invented in 1620) for math, chemistry, or physics courses.
Many of us were 18-to-22-year-olds when we met our first computer, perhaps an IBM 1620, 1401, or even 360 (after its release in 1964) at our college or university. Often the Queens of Code's first computing experiences were on their initial assignment at NSA or at college. These computer installations could be huge and expensive, especially those in NSA's extensive basement, sometimes taking up spaces as big as a couple of basketball courts, cooled by water under the floors to make the rooms so cold that you had to wear a heavy sweater or jacket when working there.
Some of our first computing experiences were on computers with limited capacity and programming done in assembly language or even octal, and that was not easy. We had to be crafty to make the programs work within the constraints. FORTRAN, the first commercially available computer language to use a compiler, was released in 1957. A compiler meant that the code could be written with higher level and easier to manage commands that would automatically

generate the assembly language or machine code needed for a specific machine. FORTRAN was designed to provide a language for the scientific community, and NSA certainly fit in that box. As computer technology advanced and memory size increased and became less expensive, programmers could write code with fewer computer-specific restrictions.
Dottie Blum, a legendary computing woman at our agency, was using FORTRAN as early as 1954 even as it was being developed by John Backus and team at IBM. At first, people wondered if using a compiler would produce code as efficient as writing in assembly language. But over time, computer speed and memory size increased and convenience won out. Another benefit was that programs could be ported (moved over) to run on other machines that had a FORTRAN compiler. It was a big improvement over having to rewrite programs in another assembly language every time a new computer came into our collection.8
Programming was a little like cooking. You had input (like your ingredients) and then steps that processed the ingredients. If all worked well, you had something to eat for supper. Fortunately, I was a better programmer than a cook.
Our first programs were either assignments at school or “toy” programs we were assigned to learn how the computer worked. On the earliest computers from the 1950s, like the special purpose ones built in house, there was only the basic documentation, so you had to figure things out for yourself. Sarah, one of our first programmers, used to say that when she started at NSA the computers took up a whole room and you were lucky to find a small notebook with instructions. By the time she retired, the computers were small enough to fit on your desk, and you had a bookcase of manuals and online documentation.
In our environment, the programming process worked as follows. The first step was to define the problem. In our case for application programs, this meant talking to the analyst to understand the problem that needed solving. The problem was often to automate a time-intensive manual process such as an attack on a cryptographic code we had collected by analyzing signals or language translation. NSA processed tons of data to produce intelligence reports for government decision makers including the President and the military. NSA was doing “big data,” long before the phrase was coined in 2005. Programs were written to support requirements at the time.
The programmer would then break the process down into small steps that would provide a solution. Programmers often use flowcharts to block out the steps that need to be taken. We used plastic templates back then to draw the flow charts.9 Now there are many software tools and applications to help with program design.
Next, we had to write the code in a programming language like FORTRAN, PL1, or C or in assembly language in the earlier days. Then, we had to debug it, resolving all the problems that we could find. After that, we tested with our real-world users; and, when all was working, officially declared the program live. Of course, there would always be more bugs that popped up, and we had to fix those in a timely manner.
At the agency, system programmers who worked on the operating systems and networking were in the C (for Computer) Organization (which was later reorganized and renamed T (for Telecommunications and Computer) Organization). It seemed that every three or four years we had a major reorganization, sometimes corresponding to a new Director's arrival. Some application programmers started out as part of C, but later moved out to sit with the users in the production organizations. All the reorganizations and reassignments were confusing. One of my bosses had a sign in his office that read, “Perfect reorganization is only achieved by groups on the verge of collapse.”

LEGACY QUEEN'S FIRST ENCOUNTER WITH A COMPUTER

Dottie Toplitzky Blum, 1950
Dottie had worked with the Electronic Adding Machines (EAM) equipment and the Army's version of the BOMBE, an electromechanical device developed by Joe Desch of National Cash Register during WWII to decode Enigma messages. Another of Dottie Blum's earliest binary encounters was with the Standards Eastern Automatic Computer (SEAC), which was built in Washington, DC, USA, for the National Bureau of Standards. The SEAC was one of the first U.S. stored program computers. Dottie then worked for AFSA, the Armed Forces Security Agency, NSA's predecessor. AFSA did not have their own computer

but the support organization did manage a number of calculating and cipher machines including the Navy's and Army's BOMBEs that were used in the war effort. These earlier devices were not actually computers since they lacked memory or ability to do anything outside of their limited computational functions such as compiling and comparing text, searching for cribs (plain text), or calculating statistics.
It was 1950 and Dottie and Sam Snyder, one of her coworkers whose computer history writings documented this story, got an urgent request from the Navy's Communications Security Division. The job required the verification of a few hundred involutory 4 x 4 matrices10 that were used in the Navy callsign system. The SEAC's memory was only 512 (45-bit) words, which was pretty limiting. Back then, they had to negotiate time on the SEAC to debug the program, and the only time NBS would allow them to purchase (at $24 an hour) was after midnight or on Sunday afternoons.
The program was written but they needed test data. Dottie, who was working for the Machine Production Organization as an IBM specialist, produced thousands of random numbers on punched tape to be used to test the application. The SEAC took between 8 and 15 seconds to process each matrix and then printed out the ones that met the “useful” criteria. With a lot of work and some late nights and weekends, Dottie and Sam got the information to the Navy in a timely manner to help solve the problem. Sam said, “Those who participated in this task found the experience ‘frustrating, exhilarating sense of accomplishment and participation in making history.”11

MORE FIRST-HAND ENCOUNTERS FROM OUR QUEENS OF CODE

Carol McWilliams, 1967–1970
My first programming experience was assembly language on a CP818 (UNIVAC 1224) for field installation. We “wrote” our programs on a Kleinschmidt—something like a typewriter, but it produced punched paper tape with one instruction per line (e.g., “clear register”). You could fix an error by wrapping Scotch tape over the holes in the line and repunching the line! Fortunately, the readers were not sensitive to the opacity of the tape, just the holes. The resulting paper tape was wrapped butterfly style in a figure eight with a paper clip in the center and stored until you had time on the computer.12
The first programs I wrote were standalone processes. I had data input from magnetic tape, ran the program, and produced data output. I scheduled computer time and, when it was my time, I took my paper tape and mag data tape to the computer room. After loading the paper tape into memory, I put my magnetic tape on the spool and initiated the program. When I was done, I took my program tape, mag tape, and results off the computer, cleaned the heads for the next programmer, and took everything back to my desk to assess the results and debug my program. Very much a hands-on process!

Eileen Buckholtz, 1968–1969
I had transferred to Ohio State University (OSU) as a math major and was taking a fourth course in calculus while struggling with the theoretical proofs in the class. I remember the professor covering a big blackboard with the proof of the Heine–Borel Theorem. He got near the end, realized that he had made a mistake and started to erase half his scribblings. My eyes were glazing over. What was I doing here?
Later that afternoon, I heard that OSU was opening their computer science department and they were looking for students. My boyfriend Howard was in engineering and he heard the same thing. Turns out they were offering degrees in both the Arts and Sciences Department, where I was enrolled, as well as an engineering computer science degree. We both signed up, became OSU's first computer science graduates, and have been computing together for over 50 years.
There were only several dozen students in the first computer science classes. It was love at first byte for me when I took my first programming course. The initial assignment was a simple sort. The next was to use a random number generator to simulate shuffling a deck of cards and a matrix for holding all the hands. The idea that you could learn a language like FORTRAN and make an enormous computer do your bidding with structured commands was just fun. As we got into more advanced programs, it became challenging as well.
An IBM 360 installation including CPU, tape drives, IO controllers, and other peripherals was housed in


a big computer room that took up much of an upper floor of the engineering building. We could see into the computer room through big windows but we were not allowed to go in. After class, we would design our programs and then punch up Hollerith cards (one line of code per card) on keypunch machines, submit them over the counter, and then wait 5 or 6 hours for them to run and get our output back. If we made an error, we had to correct that and submit again. No wonder the four women in the computer science program were all dating guys also in the program. Who else would want to spend Friday and Saturday nights debugging programs?

Elaine Mills, 1965
As part of a work-study program at Towson State College (known today as Towson University), I was studying to become an elementary school teacher. I was privileged to be assigned to a special project to “computerize” all the records for the five Maryland state colleges. Using Hollerith cards and becoming proficient in writing FORTRAN programs in the mid-1960s was a “blast” that actually proved to be a tremendous personal boost a few years later at NSA.

Kathleen Jackson, 1967
My orientation started with a tour of the “basement,” a huge area where the computer I would be using, the UNIVAC 1108, was located. The system was so large that it nearly filled the whole computer room, since it had several printers and other pieces of peripheral equipment attached to it. My job was to remove computer printouts from one of the printers, review the data, look for data “anomalies,” and adjust the FORTRAN software as needed to fix them. It seemed challenging and interesting at first but, as the days and weeks rolled by, reviewing rows and rows of 1s and 0s became a little tedious to put it mildly. However, I persisted. After completing my tour, I looked forward to my next assignment. Over the years, I sometimes thought about that initial assignment, and how different it was from all the other work I had done at the Agency.
Several years later, I came across a report that contained Agency historical information. It included information about the data that was processed on that UNIVAC 1108 computer I was supporting during my first assignment in 1967. This report identified how critical those data were to national security at that time. Suddenly, I became acutely aware that the many hours I had spent reviewing those 1's and 0's were definitely worth the challenge of the task. In the end, I determined that this work was probably some of the most significant work I did during my entire career at the Agency. I took pride in knowing that this work was very important to the security of our nation.

Kathleen Reading, 1982
“Oh great, another girl!” Imagine hearing those words upon meeting your supervisor for the first time. I was 21 years old and just beginning my 34-year career in the Information Assurance Directorate (IAD), in the Agency's print shop. I was taken aback by my boss's comment, but did not say anything as I was just starting a new job and did not know what to expect. I do remember thinking I was going to do everything in my power to change my boss's mind about what “girls” could do.
My job title was “Reproduction Worker,” and I was one of three women working in the shop. I found that job title pretty funny. I first started working in the bindery, and then also ran a printing press, large Xerox machines and printers, and eventually worked in the Electronic Printing and Publishing (EPP) branch. In the EPP branch, for the first time, documents to be printed were sent electronically via computer by Agency customers; and documents also were sent electronically to the printers for printing. One of the documents printed on the night shift was a daily report that was couriered each day to the White House.
As it turns out, I did prove to my boss that women are good workers. I was promoted several times, and was also one of the first women selected to participate in the Agency's first ever production trade program.

Mary Clulow, 1977
“I will rule these machines; they will not rule me.” Quietly determined, I spoke these words late one evening at work while trying to complete a typing task. I was using an IBM Magnetic Card Selectric (aka Mag-Card) typewriter. This was quite a bold statement for an entry level Clerical Assistant at the National Security Agency. However, I had been challenged by this machine more than once since learning to use it two years prior, in 1977.


For those who may not be familiar with the Mag-Card, it was quite state of the art for its time and was the upgrade to its predecessor, the Magnetic Tape typewriter. Basically, after typing on bond paper, it recorded one page at a time, provided you inserted a magnetic card (much like an IBM card, but Mylar and magnetic) and pressed Record before turning the machine off; otherwise, the page was not saved. Once recorded, the file could be edited by picking the card for the desired page from the labeled envelope, inserting it into the card reader, and then pressing Read. The file then could be played back one page, one line, one word, or one character at a time. The playback was rather quick, making it easy to miss the mark. Sometimes, the paper ripped during a return motion. This was one of those moments; blame it on user error, but I finally thought, enough was enough, and enrolled at the local community college.
This was the beginning of my journey into advanced learning, leading to a BS degree in information management, but definitely not the end of using other machines that would enter NSA's workspaces. They provided more word processing technologies, office automation, and then advanced further into end user computing.

Maureen McHugh, 1969
I graduated from Marywood College in May 1969. My experience with computers was limited. I was a Math major in a college whose curriculum focused mainly on training teachers. I did not want to be a teacher. It was obvious in 1969 that computer work would be an exciting field and I could get in on the ground floor. As a senior in college, I took a FORTRAN II class.
The teacher was a business professional who taught a few night classes at the college. He had a customized van in which he had a card punch machine, a printer, and sorter. It was only accessible a few hours a week including class time. Writing and debugging our “toy” programs was difficult, to say the least. We submitted our punch card deck to the teacher, and he would return it the following week after running it on a computer back in his office. One turnaround per week! A single typo could set you back two weeks. I think I got a B in the course, but certainly did not feel as though I had mastered FORTRAN.

Peggy Strader, 1969
During my intern tour, I was introduced to the UNIVAC 494 and the SPRYE assembly language. This was an octal-based system so I learned and became proficient in reading dumps in octal. On this system I honed my skills in SPRYE, FORTRAN, and ALGOL. I believe this group was responsible for the first Information Storage and Retrieval System named TIPS (Technical Information Processing System) and its retrieval language named TIPS Interrogation Language. On this system, we were able to put our queries on model 35 teletypes and it would search magnetic tapes of data or magnetic drums for the information requested. I became the user/customer interface for these systems, often teaching the Boolean logic and constructs necessary to retrieve the information needed.

Lois Gutman, 1970
I had no real computer or programming experience other than creating small card decks for overnight runs on a cardpunch machine in a summer job at Johns Hopkins University. NSA's high-level programming language at the time was IMP,13 running on the operating system FOLKLORE, NSA's homegrown time-sharing system, developed by the Institute for Defense Analyses in Princeton, NJ, USA. Everyone in my office used CDC-6600 computers and sat in a large open tube room, a room full of cathode-ray tube (CRT) terminals connected to a mainframe where programmers could work on their code in the NSA Headquarters building basement. Operators hung large magnetic tape reels for users. We stacked the reels on our desks (under sheets of black cloth for security) and made hanging decorations from colored plastic write rings.

Toby Merriken, 1970
Fresh out of college, I went to work for NSA in 1966. I started out in the Cryptanalysis Intern Program and became certified as a professional in that field. Shortly after that, in 1970, I joined a newly created branch dedicated to using computer science for the first time for cryptanalytic applications. With no computer training or experience, I wrote programs in FORTRAN and learned a lot on the job.
I wrote each program in longhand and took it to a staff of key punch operators to transfer to keypunch


cards. I then took the cards to the remote job entry (RJE) room housing a printer and a card reader, into which I manually fed the cards. The input went to the computer mainframe and was printed out on the printer in the RJE room. The computers were high tech for the time but not interactive. A great deal of time transpired between the writing of a program and the implementation of it. I left this branch in 1974 to return to cryptanalysis and eventually became a linguist.

Marie Rowland, 1970
When I started at NSA, I can honestly say I knew nothing about computers. I had gone to an all-women's college and majored in math. The only exposure to computers we experienced came one afternoon when a guest speaker explained to us that we would all have to learn a language called FORTRAN. So, when I landed at the Agency with my group of new mathematicians, management decided that with my background I should start at the beginning.
I was assigned to the Research organization, where they handed me a small box and explained they were doing research on how small computers might one day be used to schedule jobs for large computers. My task was to teach the small box to tell time. Now, I was naïve enough to believe this and did not realize that it was actually a good exercise for me to learn about how to program computers and really understand them from the inside out.
I started reading the manuals and some library books. Each morning I plugged in my little box and got it going with a paper tape from a teletype, starting a heartbeat interrupt, a periodic signal that the hardware generated to indicate that it was working or to synchronize other parts of the system. I could count these beats and get up to a second, then a minute and so forth, and thus tell time. The little machine only had a few instructions such as load, compare, and store so it seemed much easier than the FORTRAN description. The biggest headache was working with the teletype and paper tape. An “all thumbs” affliction was to plague me through card decks and keyboards, through all my years of writing code.
The library books said I could name my variables anything I wanted. I took this to heart and called them names from the book I was reading, The Hobbit. Thus, “BILBO” became the second counter. Eventually, the person guiding me looked at my work and gently mentioned that it was traditional to name the variables after the function they performed so other people could follow the program. DUH! Later, as I was finishing the project, I asked if he thought I should put in a routine to handle Y2K, something I had discovered in my library research. I do not know how he kept a straight face when he replied that it probably was not necessary for the purpose of this project. I often remembered this Y2K-innocence when it struck with a vengeance years later.
As it turned out, teaching this little machine to tell time was a very good introduction to the world of computers. Programming it illustrated the “edge” of the hardware and software divide and left me completely fearless to wade into all kinds of hardware–software issues. I realized what a leap the early computer greats made in the 1940s and 1950s when their research allowed them to move from purpose-built machines to building machines that kept both the instructions and the data that the instructions worked on in the same form. I went on to write many programs and eventually received an MS in Computer Science from Johns Hopkins, but I was always able to view the complexity of tasks through the lens of my first project.

REFLECTION
We all had memorable encounters with our first computers and went on to have rewarding careers in technology. Our Queens of Code are good examples of how women were working with early computer technology. NSA gave us opportunities to excel in this exciting new career field.
Over the past 50 years, women have continued to bring their talents and skills to the technology revolution. We hope our project will encourage other women computer pioneers in both the public and private sectors to step forward and tell their stories. While much has been written on the low percentage of women graduating with computer science degrees14 and problems with retaining female technical employees, it is critical for the future of tech that women's ideas and points of view be part of future developments. We hope our stories will inspire more women to pursue STEM careers.


Look for more stories from the Queens of Code and follow our journey on Facebook: https://www.facebook.com/QueensofCode/.
Our website: https://Queensofcode.com.
On Twitter: @QueensofCode

REFERENCES
1. Liza Mundy, Code Girls: The Untold Story of the American Women Code Breakers of World War II. New York, NY, USA: Hachette, 2017.
2. J. Abbate, “Women and gender in the history of computing,” IEEE Annu. History Comput., vol. 25, no. 4, pp. 4–8, Oct.–Dec. 2003.
3. Friedman is the subject of two recent books: G. S. Smith’s, A Life in Code: Pioneer Cryptanalyst Elizebeth Smith Friedman. Jefferson, NC, USA: McFarland & Company, Inc., Publishers, 2017 and J. Fagone’s, The Woman who Smashed Codes. New York, NY, USA: Dey Street, an imprint of William Morrow, 2017. Agnes Driscoll’s mostly unknown life is the subject of K. W. Johnson’s, The Neglected Giant: Agnes Meyer Driscoll. Ft. George G. Meade, Center for Cryptologic History, 2015. Another little-known early female cryptologist, Genevieve Young Hitt, is the subject of B. R. Smoot’s, “An accidental cryptologist,” Cryptologia, vol. 35, no. 2, pp. 164–175, 2011.
4. Hidden Figures: The American Dream and the Untold Story of the Black Women Mathematicians Who Helped Win the Space Race. New York, NY, USA: William Morrow, 2016.
5. NSA’s Key Role in Major Developments in Computer Science, Part Two, Accessed: 2019. [Online]. Available: https://www.nsa.gov/Portals/70/documents/news-features/declassified-documents/nsa-early-computer-history/6586785-nsa-key-role-in-major-developments-in-computer-science.pdf
6. R. Predmore-Lynch. Accessed: 2019. [Online]. Available: https://www.nsa.gov/About-Us/Current-Leadership/Article-View/Article/1620951/renetta-predmore-lynch/
7. I was a co-creator of the popular Micro Adventure series (Scholastic 1984, with Ruth Glick). [Online]. Available: http://www.microadventure.net/Home/About/. We had a talented team of mostly women writers and programmers and some teens to create the series. 1984–1986.
8. Dottie T. Blum, Hall of Honor Inductee. 2004. [Online]. Available: https://www.nsa.gov/About-Us/Current-Leadership/Article-View/Article/1622398/, Accessed: 2019.
9. Read more about flowcharting templates in Peggy Aldrich Kidwell’s article in the IEEE Annals of the History of Computing (vol. 41, no. 1). Accessed: 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8667955
10. Involutory matrices can be used in visual cryptography to transform the data. A more detailed example of the Hill cipher algorithm can be found online. Accessed: 2020. [Online]. Available: https://pdfs.semanticscholar.org/9626/7d1194c8beffc8abcf8b142f9870051bdb7c.pdf
11. S. Snyder, Earliest Application of the Computer at NSA (Snyder), 1973.
12. Details on the history of punched paper tape. Accessed: 2020. [Online]. Available: https://en.wikipedia.org/wiki/Punched_tape
13. Read more about IMP. Accessed: 2020. [Online]. Available: http://www.liquisearch.com/imp_programming_language and https://www.seas.harvard.edu/courses/cs152/2016sp/lectures/lec05-imp.pdf
14. Only 18% of computer science degrees were earned by women in 2016. Accessed: 2020. [Online]. Available: https://www.computerscience.org/resources/women-in-computer-science/

EILEEN BUCKHOLTZ is a Maryland-based computer scientist and author of 40 books. She received an M.S. degree in computer science from the University of Maryland, a B.S. in computer and information science from Ohio State University, and a Chief Information Officer certificate from the National Defense University. For over 30 years, she had an exciting and distinguished career working on cutting-edge technology for the National Security Agency.

EDITOR IN CHIEF: Ipek Ozkaya, Carnegie Mellon Software Engineering Institute, ipek.ozkaya@computer.org

DEPARTMENT: FROM THE EDITOR

This article originally appeared in IEEE Software, vol. 38, no. 2, 2021

Mom, Where Are the Girls?


Ipek Ozkaya

During the fall semester of 2005, I was working hastily on the finishing touches of my Ph.D. dissertation at Carnegie Mellon University. That semester, I also was the teaching assistant for the Methods of Software Development graduate course taught by Dr. Mary Shaw and Dr. Jim Herbsleb. It was a busy time, with the challenges of finishing graduate school; getting ready for a new job; fulfilling responsibilities such as grading and helping students; and parenting my then three-and-a-half-year-old daughter. Methods of Software Development was a demanding course with a lot of reading and reflection assignments. Students took abundant advantage of the office hours. Those meetings always went better if I remembered the students’ names, but with all that was going on, my brain did not always comply, so I had a hack. I had printed all of their photos and hung them right above my desk.
As with all graduate students, my weekends consisted mostly of work. I often took my daughter into the office with me over the weekends to give her a glimpse of my work life and sneak in some tasks. It was during one of our “let’s play at Mommy’s office” visits when I first became aware of the diversity issue in software engineering. After staring at the photos of the Methods of Software Development fall class members of 2005 for a couple minutes, my daughter asked who they were. I explained that they were the students with whom I was working. She continued to study the photos, and I started to concentrate on my work. After a couple more minutes, I heard, “Mom, where are the girls?” I did not understand the question at first and asked her to explain. Her three-and-a-half-year-old observant mind was trying to categorize the students, as at that age she was just starting to recognize gender.
It is my response in all of its irony that demonstrates one of the reasons that we have the diversity, equity, and inclusion issues. Once I finally understood the question, I, without any doubt, said, “Oh, they are all there, let’s find them.” I put her on my lap, and we started studying the photos. There were two female students out of a total number of 27 people. At that age, my daughter was content with finding the two girls and moved on to exploring something else. She did ask “more” once or twice as we studied the photos, not in search of more girls, but simply because “more” was her favorite phrase due to her daycare routines.
I do recall, however, being confused for the first time. I was panicking, admittedly not because there were only two female students in the 2005 Masters of Software Engineering class at Carnegie Mellon University, but because I felt like a young, tired, and inexperienced mom showing her daughter a bad example in my very own office. I was supposed to be teaching her that she could become anything she wanted, that she had the power, and that others had paved the way for her. The example I was supposed to set was not that we had to look closely to find the two female students in the field of the future at the very university leading the way. I, as a female who consistently had been an underrepresented minority in her field, was not aware of the issue until that very exchange.

Digital Object Identifier 10.1109/MS.2020.3044410
Date of current version: 11 February 2021

54 April 2021 Published by the IEEE Computer Society  2469-7087/21 © 2021 IEEE

BECOME AWARE
On the one hand, this memory reflects my luck. Clearly, I have been fortunate enough to not have felt as underrepresented in a field where I consistently have been. I had worked within teams that made me feel welcome. I had colleagues, supervisors, mentors, and advisors who advocated for me, gave me timely and concrete feedback, motivated me, provided me with opportunities, and appreciated my contributions. On the other hand, it also reflects the unfortunate reality: we cannot assume that people are aware. Even those who themselves are members of underrepresented groups may not be aware of the extent of the issues at hand.
We cannot assume that institutions are doing their part, even the best of the best. Carnegie Mellon University has come a long way since 2005; admission committees at that time clearly were not aware or, in any event, not aware enough. Today, Carnegie Mellon boasts the ability to achieve close to 50% admission rates of qualified female students across most of its degree programs at the undergraduate level, including in engineering and computer science. Graduate admissions numbers are also definitely way better than two out of 27. In addition, there are several university-wide initiatives to address other diversity, equity, and inclusion challenges. The road ahead to achieve a truly diverse, inclusive, and equitable software engineering community is still long and not smooth, as we all are aware.
Many global and national initiatives such as Girls Who Code, Girl Develop It, Black Girls Code, and countless others have also taken it upon themselves to empower, motivate, and educate females to enter careers in software engineering. Bringing more women into software engineering is not a solved challenge despite the significant amount of attention it has received. Overcoming challenges of retention, salary equity, and growth opportunities is still in the works.
But achieving diversity with enough female representation is just the tip of the iceberg. We are finally learning that gender is not a binary identity; those who identify as lesbian, gay, bisexual, transgender, questioning (LGBTQ+) face a number of very different challenges and biases as software engineers.
Gender is one of several aspects of diversity. When teams are diverse, with representation from cultural, ethnic, economic, religious, political, and technical backgrounds, they are more productive and stronger. History has shown us again and again that there are many underrepresented groups that, when empowered, will help move a field forward. But this starts with awareness. Each of these diversity groups may demonstrate similarities; however, each also has its unique challenges. Becoming truly aware takes patience, an open mind, and learning to act the right way. I was fortunate enough that my awakening moment did not involve me being frustrated by an unfair situation, feeling left out, or not finding people like myself with whom I could identify. This is an exception, not the norm.

BEING AN ADVOCATE AND TAKING ACTION
Being an advocate for diversity, equity, and inclusion takes every single one of us to drive meaningful change, starting at the personal level all the way up to organizational levels. Advocacy is any action that speaks in favor of, recommends, argues for, supports, defends, or pleads on behalf of a cause or for others. And the most impactful advocacy is achieved when our allies are diverse and show up with concrete actions in support. The CEO of Girls Who Code recently published a public thank-you note to Jack Dorsey, CEO of Twitter, for his advocacy and financial support for Girls Who Code.1 She emphasized how the biggest ally of Girls Who Code has, in fact, been a man passionate about empowering women and girls to enter software engineering. We need examples demonstrating allies working together to improve diversity, equity, and inclusion in software engineering. Organizations that hire software engineers need to do their part as well. Large, global software engineering organizations like Google2 and Microsoft3 have started publishing yearly diversity, equity, and inclusion reports to share their data. Objectively understanding the state of the situation is one step toward improving diversity,


equity, and inclusion in software engineering. Software engineering research only recently has started to look closer at studying the implications of diversity, equity, and inclusion on software engineering work. In the guest editorial in this issue of the magazine, Albusays and colleagues summarize concepts and related current research work.4
IEEE Software has always kept diversity, equity, and inclusion as part of its core values. But we, too, need to do more. Despite our many efforts, the gender diversity of our magazine can improve. We have a long way to go to improve representation from people of color on our boards. While we have always strived to achieve global diversity, we struggle with including enough readers and authors from the Far East. Diversity, equity, and inclusion are our board’s collective responsibilities. However, to bring targeted focus and identify actionable steps, we are also launching an initiative dedicated to this cause. We are in the process of establishing a workforce team and will strive to share our data and the steps that we take. This is one way we are taking action, and we will do more, including featuring case studies, experiences from industry, and empirical research results regarding improving diversity, equity, and inclusion in software engineering. We trust that the software engineering community will keep us accountable.

REFERENCES
1. R. Saujani. “Thank you, Jack Dorsey.” Medium. https://medium.com/@GirlsWhoCode/thank-you-jack-dorsey-cc9a210184b6 (accessed Dec. 2020).
2. “Google diversity annual report,” Google, Mountain View, CA. Accessed: Dec. 2020. [Online]. Available: https://diversity.google
3. “Global diversity and inclusion report,” Microsoft, Redmond, WA, 2020. Accessed: Dec. 2020. [Online]. Available: https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE4H2f8
4. K. Albusays et al., “The diversity crisis in software development,” IEEE Software, vol. 38, no. 2, pp. 19–25, Mar.–Apr. 2021. doi: 10.1109/MS.2020.3045817.



Conference Calendar

IEEE Computer Society conferences are valuable forums for learning on broad and dynamically shifting topics from within the computing profession. With over 200 conferences featuring leading experts and thought leaders, we have an event that is right for you. Questions? Contact conferences@computer.org.

MAY
10 May
• CCGrid (IEEE/ACM Int’l Symposium on Cluster, Cloud and Internet Computing), virtual
• ICFEC (IEEE Int’l Conf. on Fog and Edge Computing), virtual
15 May
• BigDataSecurity (IEEE Int’l Conf. on Big Data Security on Cloud), virtual
17 May
• IPDPS (IEEE Int’l Parallel and Distributed Processing Symposium), virtual
18 May
• RTAS (IEEE Real-Time and Embedded Technology and Applications Symposium), virtual
19 May
• TechDebt (IEEE/ACM Int’l Conf. on Technical Debt), virtual
20 May
• CHASE (Int’l Workshop on Cooperative and Human Aspects of Software Eng.), virtual
23 May
• ICCP (IEEE Int’l Conf. on Computational Photography), Haifa, Israel
• ICSE (IEEE/ACM Int’l Conf. on Software Eng.), virtual
• SP (IEEE Symposium on Security and Privacy), virtual
24 May
• ETS (IEEE European Test Symposium), virtual
• SEENG (Int’l Workshop on Software Eng. Education for the Next Generation), virtual
25 May
• ISMVL (IEEE Int’l Symposium on Multiple-Valued Logic), virtual
29 May
• SoHeal (Int’l Workshop on Software Health in Projects, Ecosystems and Communities), virtual
30 May
• GI (Int’l Workshop on Genetic Improvement), virtual
31 May
• SEmotion (Int’l Workshop on Emotion Awareness in Software Eng.), virtual

JUNE
1 June
• ISORC (IEEE Int’l Symposium on Real-Time Distributed Computing), Daegu, South Korea
• Q-SE (Int’l Workshop on Quantum Software Eng.), virtual
2 June
• ICIS (IEEE/ACIS Int’l Summer Conf. on Computer and Information Science), Shanghai, China
7 June
• BCD (IEEE/ACIS Int’l Conf. on Big Data, Cloud Computing, and Data Science Eng.), Macao
• CBMS (IEEE Int’l Symposium on Computer-Based Medical Systems), virtual
• WoWMoM (IEEE Int’l Symposium on a World of Wireless, Mobile and Multimedia Networks), Pisa, Italy
14 June
• ARITH (IEEE Int’l Symposium on Computer Arithmetic), virtual
• ISCA (ACM/IEEE Int’l Symposium on Computer Architecture), virtual
20 June
• SERA (IEEE/ACIS Int’l Conf. on Software Eng. Research, Management and Applications), Kanazawa, Japan
21 June
• CSF (IEEE Computer Security Foundations Symposium), Dubrovnik, Croatia
• DSN (IEEE/IFIP Int’l Conf. on Dependable Systems and Networks), Taipei, Taiwan
26 June
• CSCloud (IEEE Int’l Conf. on Cyber Security and Cloud Computing), Washington, DC, USA

JULY
5 July
• ICME (IEEE Int’l Conf. on Multimedia and Expo), Shenzhen, China
7 July
• ICDCS (IEEE Int’l Conf. on Distributed Computing Systems), Washington, DC, USA
• SNPD (IEEE/ACIS Int’l Conf. on Software Eng., Artificial Intelligence, Networking and Parallel/Distributed Computing), Taichung, Taiwan
12 July
• COMPSAC (IEEE Computers, Software, and Applications Conf.), Madrid, Spain
• ICALT (IEEE Int’l Conf. on Advanced Learning Technologies), virtual
27 July
• SMC-IT (IEEE Int’l Conf. on Space Mission Challenges for Information Technology), virtual

AUGUST
9 August
• ICKG (IEEE Int’l Conf. on Knowledge Graph), Hong Kong
11 August
• IRI (IEEE Int’l Conf. on Information Reuse and Integration for Data Science), virtual
16 August
• ACSOS (IEEE Int’l Conf. on Autonomic Computing and Self-Organizing Systems), Washington, DC, USA
23 August
• SCC (IEEE Space Computing Conf.), virtual
• SMARTCOMP (IEEE Int’l Conf. on Smart Computing), Irvine, USA

SEPTEMBER
5 September
• SERVICES (IEEE World Congress on Services), Chicago, USA
7 September
• CLUSTER (IEEE Int’l Conf. on Cluster Computing), Portland, Oregon, USA
• EuroS&P (IEEE European Symposium on Security and Privacy), Vienna, Austria
8 September
• MIPR (IEEE Int’l Conf. on Multimedia Information Processing and Retrieval), Tokyo, Japan
20 September
• eScience (IEEE Int’l Conf. on eScience), Innsbruck, Austria

OCTOBER
1 October
• ISPA (IEEE Int’l Symposium on Parallel and Distributed Processing with Applications), New York, USA
4 October
• IC2E (IEEE Int’l Conf. on Cloud Eng.), San Francisco, USA
• LCN (IEEE Conf. on Local Computer Networks), Edmonton, Canada
10 October
• MODELS (ACM/IEEE Int’l Conf. on Model Driven Eng. Languages and Systems), Fukuoka, Japan
13 October
• FIE (IEEE Frontiers in Education Conf.), Lincoln, Nebraska, USA

NOVEMBER
15 November
• ASE (IEEE/ACM Int’l Conf. on Automated Software Eng.), Melbourne, Australia

DECEMBER
20 December
• MCSoC (IEEE Int’l Symposium on Embedded Multicore/Many-Core Systems-on-Chip), Singapore

Learn more about IEEE Computer Society conferences: computer.org/conferences
Get Published in the IEEE Open Journal of the Computer Society

Submit a paper today to the premier open access journal in computing and information technology.

Your research will benefit from the IEEE marketing launch and 5 million unique monthly users of the IEEE Xplore® Digital Library. Plus, this journal is fully open and compliant with funder mandates, including Plan S.

Submit your paper today!

Visit www.computer.org/oj to learn more.
