How to Build AI That People Trust (Whitepaper, September 2021)
Introduction
Many assume that AI success hinges on getting the model to accurately interpret data. This is important, but accuracy is one element of a more nuanced challenge: trust.

More and more organizations are entering the world of AI. This raises very difficult questions when it comes to trust. Lots of companies have implemented AI but have issues trusting the results.
Why is trust a unique issue for AI?
Most people are familiar with software in their everyday life and workplace, which operates based on rules. They may be complex rules, but they are (bugs aside) predictable. They have been explicitly programmed to follow a set of instructions which turn inputs into outputs.

AI works differently. It ingests data and learns how to interpret it by establishing connections between different datasets. So, an English-Spanish translation AI is not explicitly told word-by-word that perro means dog or gato means cat, alongside fixed grammar rules. It is fed texts that have been translated, and is told to learn what pattern links one to the other, with guidance from language and data experts.

This allows it to learn complex tasks such as translation or image recognition quickly. Many tasks performed with AI would not be possible with traditional software, or would take decades of programming.

However, this approach brings unpredictability, because the input data is complex and imperfect, not a set of binary options. To learn a language, an AI needs huge amounts of text, and there is not enough time to manually check it all. Some translations may be poor, contain mistakes, or deliberately misuse language. Even correct ones contain nuance, where experts disagree on the precise translation. A phrase can be translated in several ways, depending on the context. Anyone who has used a translation app will know they are good, but not perfect.

If we ask an AI to predict which machine parts need replacing, or predict drug formulations, we need to be very confident that it has reached the right answer before we can trust it.

Added to this complexity is that AI conclusions may be confusing, but still correct. NASA used AI to design an antenna against a defined set of criteria. The result would never have occurred to a human, but it was better aligned to their needs than anything a human came up with. What does one do when an AI recommends something completely counter-intuitive? It could be a breakthrough (as in NASA's case), or it could be a spectacular oversight in the AI design or training. How do we know?

All of this raises questions of trust. If we know it is not 100% accurate, we need to reach a decision about how much we trust its recommendation. This comes down to multiple factors, including how accurate we are told it is, how much we believe that claim, how much control we had over the inputs, how well we understand its decision-making, what supplementary information it has provided to back up its recommendations, its past record, the consequences of it being wrong, and the user's own knowledge of the problem.
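The pattern-learning idea can be sketched in miniature. This is a toy illustration only, not a real translation system, and the sentence pairs are invented: instead of hard-coding that perro means dog, the snippet counts which English word lines up most exclusively with each Spanish word across a handful of translated sentences.

```python
from collections import defaultdict

# Invented parallel corpus, purely for illustration.
parallel_corpus = [
    ("el perro come", "the dog eats"),
    ("el gato come", "the cat eats"),
    ("el perro duerme", "the dog sleeps"),
    ("el gato duerme", "the cat sleeps"),
]

# Count how often each Spanish word appears alongside each English word,
# and how often each English word appears overall.
cooccurrence = defaultdict(lambda: defaultdict(int))
english_totals = defaultdict(int)
for spanish, english in parallel_corpus:
    for e_word in english.split():
        english_totals[e_word] += 1
        for s_word in spanish.split():
            cooccurrence[s_word][e_word] += 1

def likely_translation(spanish_word):
    """Pick the English word whose appearances line up most exclusively
    with the Spanish word: a learned guess, not a programmed rule."""
    counts = cooccurrence[spanish_word]
    return max(counts, key=lambda e: counts[e] / english_totals[e])

print(likely_translation("perro"))  # -> dog
print(likely_translation("gato"))   # -> cat
```

Real systems learn far richer statistical patterns, but the principle is the same: the behavior comes from the data, not from enumerated rules, so the quality of the data decides the quality of the answers.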
How much trust do I need?
The level of confidence in an AI output needed before the user will trust it depends on the seriousness of the consequences of failure. Users will trust a useful low-risk AI even if it is far from perfect, but high-risk AI decisions need much greater levels of confidence in order to create trust.
Example AI applications and the consequences of their failure, ordered from higher to lower risk:

• Predictive maintenance: unnecessary downtime
• Mortgage recommendation: harm to customer, legal challenges
• Targeted adverts: missed sales opportunities
• Translation: miscommunication
• Film recommendation: occasionally frustrated customers
• Photo tagging: wrong photo tagged
• AI-created artwork: probably none
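The relationship above can be expressed as a simple acceptance policy: demand more confidence before acting automatically as the consequences of failure grow. The risk categories and threshold numbers below are invented for illustration, not taken from any standard.

```python
# Illustrative only: confidence thresholds that scale with the severity
# of a wrong answer. The categories and numbers are invented.
THRESHOLDS = {
    "low": 0.70,     # e.g. film recommendations, AI-created artwork
    "medium": 0.90,  # e.g. translation, targeted adverts
    "high": 0.99,    # e.g. mortgage decisions, predictive maintenance
}

def act_automatically(confidence, risk):
    """Accept the AI's output without review only when its confidence
    clears the risk-dependent bar; otherwise refer it to a human."""
    return confidence >= THRESHOLDS[risk]

print(act_automatically(0.95, "low"))   # True: good enough for a film pick
print(act_automatically(0.95, "high"))  # False: not enough for a mortgage
```

In practice the thresholds themselves must be justified, which is exactly the trust question the rest of this paper addresses.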
The points in the AI life cycle where trust can be undermined
There are various stages where trust can be undermined in the AI development and deployment process. In this section, we discuss the main risk factors.

Bias in training data

Unconscious gender or racial bias has often hit the headlines, usually created by applying AI to process automation without understanding the limits of the data. Amazon's Rekognition, for example, misidentified women and people of color, likely due to the use of smaller training datasets for these groups. Such stories undermine the credibility of commercially available technology.

The AI doesn't learn incorrectly; it learns to reflect bias in its training data, which reflects bias in the real world. Prejudice is the nasty face of this, but bias in data can also extend to misplaced assumptions by scientists, doctors recording incorrect diagnoses, and even people's choice of written or recorded language.

Badly curated data

Data can also be mislabeled, or poorly curated, such that the AI struggles to make sense of it. If data is not appropriately selected, then the model will not learn how to reach the right conclusions. And if conclusions seem suspect, people won't trust it (or worse, they will trust it and take bad decisions as a result).

User interface and explainability

Trust is not just about how good the model is, but how easy it is to use and interact with, and how clearly the answers are presented to the user. If the user does not feel they can input the information they want, they are likely to be suspicious of the result. If the interface is overly complex, or the results are presented in a confusing way, or with no explanation as to how they were reached (even if they are correct), it will quickly be abandoned. Even something as simple as a film recommendation is much more trustworthy if you can see what aspects of viewing history led to it.

Bias in the real world

Many AIs continue to learn post deployment, but they are not necessarily well prepared for the complexities of real-world data. Famously, Microsoft's Tay, an artificially intelligent chatbot, was designed to learn from interactions it had with real people on Twitter. Some users decided to feed it offensive information, which it had not been designed to deal with appropriately. Within 24 hours, Tay had to be deactivated and withdrawn for spreading deeply upsetting opinions.

Malicious attacks

AI is susceptible to new malicious attacks in ways that are poorly understood by users. AIs that appear to take human decisions can be fooled in ways that humans cannot.

In a test case, an AI was trained to recognize images. By changing just one pixel in an image, researchers fooled the AI, causing it to wrongly label what it saw, sometimes very wide of the mark (one thought a stealth bomber was a dog). Tesla's self-driving image recognition systems have been tricked by placing stickers on roads and signs, causing them to suddenly accelerate or change lanes.
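The one-pixel attack described above can be sketched with a deliberately fragile toy classifier. Everything here is invented for illustration: a 1-nearest-neighbour "model" over 3x3 black-and-white images, where flipping a single pixel is enough to change the predicted label.

```python
def distance(a, b):
    """Count the pixels on which two flattened images disagree."""
    return sum(x != y for x, y in zip(a, b))

# Two invented reference "images" (flattened 3x3 grids of 0/1 pixels).
REFERENCES = {
    "cat": (1, 1, 1,
            1, 0, 1,
            1, 1, 1),
    "dog": (1, 1, 1,
            1, 0, 0,
            1, 0, 0),
}

def classify(image):
    """Label the image after its nearest reference image: a stand-in
    for a real model, fragile in a similar way."""
    return min(REFERENCES, key=lambda label: distance(REFERENCES[label], image))

original = (1, 1, 1,
            1, 0, 1,
            1, 1, 0)
attacked = original[:5] + (0,) + original[6:]  # flip a single pixel

print(classify(original))  # -> cat
print(classify(attacked))  # -> dog
```

Real vision models are vastly more capable, but the underlying point holds: decision boundaries learned from data can sit surprisingly close to individual inputs, and an attacker who knows that can exploit it.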
Lack of transparency

Sitting above all these issues is a fear fed by AI's lack of transparency. Not only do the end-users not understand how AIs make their decisions; in many cases, nor do their makers.

Apple's credit card, backed by Goldman Sachs, was investigated by regulators after customers complained that the card's lending algorithms discriminated against women. No one from Apple or Goldman was able to justify the output or describe how the algorithm worked. The apparent correlation between gender and credit doesn't necessarily mean one is causing the other, but it creates suspicion that bias has crept in. Without transparency, it's impossible to know, and that makes it hard to trust.
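One way to defuse this kind of suspicion is to surface how much each input contributed to a decision. A minimal sketch, assuming an invented linear scoring model whose weights and applicant figures are made up for illustration (real credit models are far more complex):

```python
# Invented linear scoring model: the weights and applicant values are
# illustrative only. Because the model is linear, each input's
# contribution to the final score can be reported exactly.
WEIGHTS = {"income": 0.4, "existing_debt": -0.6, "years_at_address": 0.2}

applicant = {"income": 0.52, "existing_debt": 0.30, "years_at_address": 0.40}

contributions = {k: WEIGHTS[k] * applicant[k] for k in WEIGHTS}
score = sum(contributions.values())

# The single biggest influence on the decision, reportable in plain terms.
biggest_factor = max(contributions, key=lambda k: abs(contributions[k]))
print(biggest_factor)  # -> income
```

A lender using even this crude kind of attribution could answer the regulator's question; with an opaque model, as in the Apple Card case, no such answer was available.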
A framework for building and deploying trusted AI
Despite the risks discussed in this paper, AI delivers huge value when done well. And aside from the negative headlines, it is often done very well.

The problems usually come when poor process and lack of experience lead to poor choices: such as wrong algorithms, bad data, inadequate verification, poor interfaces, or the lack of post-deployment support. These errors are often baked in from the outset with fundamental mistakes in initial scoping and design, caused by a lack of understanding about AI and the real-world problem it solves. These all undermine trust.

As AI plays an increasingly important role in our lives, we need to design it to be trusted. This goes beyond data scientists designing an algorithm that learns about correlations and works on test data. AI must be designed as a whole product, with a set of support services around it that allow the user to trust its outputs. Doing so needs a rigorous approach to AI development.

In this final section, we outline five key parameters for creating trusted AI.

1. Assured

A data-driven decision is only as trusted as the data that underpins it.

The most obvious aspect of trusted AI is ensuring it does what it is supposed to. Because AI learns from data, that data must be reliable. You can train an AI to recognize cats and dogs by feeding it lots of labeled images of each. But if some cats are labeled as dogs, some are not labeled, or some show a completely different animal, the AI will learn incorrectly and make incorrect decisions. If all images of dogs are in the snow, the AI may learn to detect snow rather than dogs.

As soon as it makes mistakes, users will stop trusting it. This may not matter too much when classifying cats and dogs, but it matters a lot when classifying images of healthy vs. precancerous cells.

Trusted AIs must use a well-designed model, and be trained and tested on data that is proven to be accurate, complete, from trusted sources, and free from bias. Capturing that data requires rigorous processes around data collection and curation.

Those designing an assured AI model should examine AI inputs and ask:

• Does this data accurately represent the system to be modeled?

• Does the data contain confounding or irrelevant information?

• How will I know that the data is of sufficient quality?

• Is the underlying data biased – and how would I tell?

• Are my assumptions about data collection biased?

2. Explainable

A functioning model is not enough; users need to understand how it works. The AI earns trust by backing up its recommendations with transparent explanations and further details.

If a bank turned you down for a mortgage, you'd expect to know why. Was it past or existing debt? Was it an error? Did they confuse you with someone else? Knowing the reason allows
you to move forward in the most constructive manner. For the bank, it allows them to spot faults, retain customers, and improve processes.

It's the same for AI. A recommendation is much more useful if you understand how and why it was made. Explainability allows the user to see if the AI supports their own intuition (e.g., about a disease diagnosis or the best way to make a new material), or helps them question it. And it allows developers to spot errors in the AI's learning and remedy them.

A responsibly designed AI will have tools to analyze what data was used, its provenance, and how the model weighted different inputs, then report on that conclusion in clear language appropriate to the user's expertise.

Explainability may also involve some trade-off between raw predictive power and transparency of interpretation. If AI cannot fully explain its outcome, trust may still be built in some cases through rigorous validation to show it has a high success rate, and by ensuring the user has the information they need to understand that validation.

Those designing AI to be explainable should ask:

• What could be known in principle about the working of this AI?

• Does the model need to be fully interpretable, or are post-hoc explanations sufficient?

• Can the AI rationalize why it decided to offer the user this piece of clarifying information, rather than another?

• How consistent is the given answer with previous examples?

• Does too much transparency make the AI vulnerable to attack?

• Does the information on offer meet the accountability needs of different human users?

3. Human

A trusted AI is intuitive to use. Netflix would be less successful if users had to enter a complex set of parameters to get film recommendations. Instead, it automatically presents films you may like, based on your history of search terms, in an easy-to-navigate interface, and sometimes even tells you why it recommended them (the Because You Watched feature).

An intuitive interface, consistently good recommendations, and easy-to-understand decisions all help the user come to trust it over time.

Intuitive doesn't always mean simple. A simple smartphone app may use very intuitive, guided decision-making. A drug property prediction platform can expect advanced chemistry knowledge from its user, and display complex information in a manner appropriate for an expert to understand and interact with.

The complexity of these guided decisions must be matched to the user's knowledge. Equally, the time it takes the user to fully trust the AI will be relative to the complexity and risk of failure.

Making AI usable for humans means understanding the end-user and how they interact and learn over time. Those designing AIs should ask:

• What does each user need to understand about why the AI did what it did?

• How should we communicate with users and collect feedback?

• Why would users not trust this AI?

• What reassurances are they likely to need?

• What training and support is needed for different users?

• Should the system allow users to ask for more details?
• How can we retain confidence if the AI gets it wrong?

• How do we make human users feel the AI is accountable?

4. Ethical

An AI may conclude that certain groups are more likely to reoffend or default on loans. While this may be true at a group level (for broader socio-economic reasons), it does not mean an individual from that group is more likely to do so. An AI using this as a decision-making factor creates undue prejudice and opens its user up to legal challenges.

Those designing AI to be ethical should ask:

• Why are we building this AI at all?

• Are we aligned with prevailing ethical standards?

• Is it fair and impartial?

• Is it proportionate in its decisions?

5. Performant

Many AIs fail in service, either because they cannot cope with the complexities of real-world data or because they have not been designed to integrate into the users' working life (either technically or practically). Users will quickly lose trust in an AI they see making less and less reliable decisions.

Those designing AI to perform in the real world should examine AI outputs, and ask:

• Does this AI actually solve the intended business problem?

• Do we understand the required levels of throughput and robustness?

• What safety barriers are needed if the AI makes a mistake?

• How robust is my testing, validation, and verification policy?

• Is the in-service AI protected against adversarial attacks?
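In-service performance of the kind this section describes can be watched with a simple drift check, comparing live accuracy against the level demonstrated at validation time. The tolerance and figures below are invented for illustration:

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that matched what actually happened."""
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)

def drift_alert(live_accuracy, validated_accuracy, tolerance=0.05):
    """True when in-service performance has fallen more than `tolerance`
    below what was demonstrated during validation."""
    return live_accuracy < validated_accuracy - tolerance

# Invented in-service outcomes for a predictive-maintenance model.
live = accuracy(["ok", "fault", "ok", "ok"], ["ok", "ok", "ok", "fault"])
print(live)                     # 0.5
print(drift_alert(live, 0.95))  # True: investigate before users lose trust
```

Flagging the drop early lets the team retrain or withdraw the model before its unreliable decisions erode the trust built up during deployment.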
David Hughes
Analytics solutions lead
Sam Genway
Analytics consultant
John Godfree
Head of consulting
About Capgemini Engineering
Capgemini Engineering combines, under one brand, a unique set of strengths from across
the Capgemini Group: the world leading engineering and R&D services of Altran – acquired
by Capgemini in 2020 – and Capgemini’s digital manufacturing expertise. With broad industry
knowledge and cutting-edge technologies in digital and software, Capgemini Engineering
supports the convergence of the physical and digital worlds. Combined with the capabilities of
the rest of the Group, it helps clients to accelerate their journey towards Intelligent Industry.
Capgemini Engineering has more than 52,000 engineer and scientist team members in over 30
countries across sectors including aeronautics, automotive, railways, communications, energy,
life sciences, semiconductors, software & internet, space & defence, and consumer products.
Write to us at:
engineering@capgemini.com