
The False Comfort of Human Oversight as an Antidote to A.I. Harm

By Ben Green and Amba Kak
June 15, 2021 • 5:45 AM

What does “human oversight of A.I.” really mean?


Photo illustration by Slate. Photo by iLexx/iStock/Getty Images Plus and Anastasiia Makarevich/iStock/Getty Images Plus.

In April, the European Commission released a wide-ranging proposed regulation to govern
the design, development, and deployment of A.I. systems. The regulation stipulates that
“high-risk A.I. systems” (such as facial recognition and algorithms that determine eligibility
for public benefits) should be designed to allow for oversight by humans who will be tasked
with preventing or minimizing risks. Often expressed as the “human-in-the-loop” solution,
this approach of human oversight over A.I. is rapidly becoming a staple in A.I. policy
proposals globally. And although placing humans back in the “loop” of A.I. seems reassuring,
this approach is instead “loopy” in a different sense: It rests on circular logic that offers false
comfort and distracts from inherently harmful uses of automated systems.


A.I. is celebrated for its superior accuracy, efficiency, and objectivity in comparison to
humans. Yet as increasing evidence demonstrates the dangers of A.I., policymakers and
developers are turning back to humans to mitigate harm. In other words, humans are being
tasked with overseeing algorithms that were put in place with the promise of augmenting
human deficiencies. The 2020 Washington State facial recognition law, for example,
includes requirements for “meaningful human review,” defined in terms of “review or
oversight by one or more individuals … who have the authority to alter the decision under
review.” Several hiring A.I. companies advertise human intervention in the candidate
screening process as a way to prevent the errors and discriminatory outcomes associated
with these tools. And the backlash to lives lost in accidents involving self-driving cars has
prompted growing calls for reasserting human control.

But human oversight falls short as a solution for the risks of algorithmic decision-making
for three key reasons. First, calling for human oversight alone creates shallow protections
that companies and governments can easily sidestep through superficial compliance. The European
General Data Protection Regulation, for instance, mandates that people “shall have the
right not to be subject to a decision based solely on automated processing.” Although at
first glance this would seem to prevent the harms of high-stakes decisions being made by
opaque machines, setting up the binary of “solely” automated decisions versus those made
by humans obscures the reality that most A.I. systems lie on some continuum between the
two. Although headlines often emphasize the injustice of decisions being made by
machines, in practice it is uncommon for algorithms—particularly in high-stakes settings
such as criminal justice and child welfare—to operate without human involvement and
without a human making the final decision.

Furthermore, the mere presence of a human operator provides little protection against
forms of automated decision-making that are intrusive, opaque, or faulty—and instead may
serve only to legitimize them. At least by the letter of laws that prevent “solely” automated
decisions, any nominal form of human involvement is sufficient to avoid restrictions and
protections. Provisions like the GDPR thus may create an incentive to introduce superficial
human oversight of automated decisions (e.g., “rubber stamping” automated decisions) as
a way to bypass scrutiny.

Second, even calls for more “meaningful” forms of human oversight—which are gaining
traction as a way to address the first issue just described—are incredibly difficult to
accomplish in practice. A significant challenge is that this principle suffers from “inherent
imprecision”: While a human operator rubber stamping algorithmic decisions is clearly not
meaningful, there is no clear definition regarding what actually constitutes “meaningful”
oversight. Furthermore, mounting research demonstrates that even when humans are
granted “meaningful” discretion regarding how to use A.I., they are either unwilling or
unable to intervene to appropriately balance human and algorithmic insights. People
presented with the advice of automated tools are prone to “automation bias” (through
which they defer to the automated system without proper scrutiny), struggle to evaluate
the quality of algorithmic advice, often discount accurate algorithmic recommendations,
and exhibit racial biases in their responses to algorithms.

These effects mean, for instance, that police in London “overwhelmingly overestimated the
credibility” of a live facial recognition system, often deferring to incorrect computer-
generated matches despite the algorithm’s low rate of accuracy. As another example,
studies have found that the implementation of pretrial risk assessments exacerbated
rather than diminished racial disparities in pretrial detention, in part because judges tend to
make more punitive decisions regarding Black defendants than white defendants with the
same risk score.

Third, presenting human oversight as a key remedy for A.I. harms can lead to a blurring of
responsibility, where frontline human operators of A.I. systems are blamed for broader
system failures over which they have little or no control. This allows developers and
companies to have it both ways: They can promote how their A.I. has capabilities that vastly
exceed those of humans, but when concerns get raised, they can point to human oversight
as the proper corrective. In this way, powerful institutional actors like companies and
governments are able to shift accountability (and liability) to individuals operating these
systems, typically workers who themselves have severely limited bargaining power and
control over how these systems are designed or used.

For instance, the developers of controversial algorithms such as the Allegheny Family
Screening Tool (which predicts the likelihood of child abuse or neglect), COMPAS (a “risk
assessment” that predicts the likelihood of recidivism), and hiring software attempt to
reassure critics by asserting that human decision-makers retain full discretion over
decisions. Another notable instance of this convenient finger-pointing occurred in 2018
when a self-driving Uber vehicle struck and killed a woman in Arizona. Even as Uber boasted
about its autonomous vehicle program (since sold off to another company), blame for
the crash fell primarily on the human operator tasked with monitoring the vehicle—even
though investigations found that the vehicle failed to stop because Uber engineers had
tuned it to be less responsive to unidentified objects.

Policymakers and companies eager to find a “regulatory fix” to harmful uses of technology
must acknowledge and engage with the limits of human oversight rather than presenting
human involvement—even “meaningful” human involvement—as an antidote to algorithmic
harms. This requires moving away from abstract understandings of both the machine and
the human in isolation, and instead considering the precise nature of human-algorithm
interactions. Who is the specific human engaging with the algorithm? What misaligned
incentives or gaps in knowledge and power could limit their ability to assess and anticipate
concerns? To what extent might the algorithm curb human discretion that is essential to
the decision? Who are the other human actors responsible for shaping the system?

We also need to subject human oversight to greater research and scrutiny, further studying
what human oversight does and does not accomplish and how to structure human-
algorithm interactions to facilitate better collaborations. This requires preliminary testing
of human oversight mechanisms before they are enshrined in policy, and monitoring human
oversight behaviors as a standard feature of algorithmic impact assessments and A.I.
audits, which are becoming popular policy mechanisms to evaluate A.I. systems.

Yet the limits of human oversight of A.I. do not simply require facilitating better human-
algorithm collaborations—instead, they expose fundamental tensions around whether
algorithms should be involved in certain decisions at all. In many contexts, algorithms are
introduced as a mechanism to improve upon the cognitive limits and biases of humans—yet
now those same humans are presented in policy as the essential backstop overseeing
algorithmic limits and biases. This circular logic exposes our recognition that A.I. often
cannot be trusted to adjudicate high-stakes decisions, despite common proclamations
about its benefits.

Discriminatory outcomes at the hands of A.I. systems are not problems confined to
technical code, biased datasets, or flawed human oversight. From facial recognition to
predictive policing, welfare benefit automation to worker surveillance, A.I. systems often
work to disguise historical discrimination, amplify power imbalances, and obscure political
decisions under the veneer of technical neutrality. For these systems, harmful outcomes
might be a feature, not a bug. Rather than prompt a superficial “human-in-the-loop” policy
fix, the material harms caused by A.I. must prompt a re-evaluation of whether many of these
systems should be used at all, along with greater accountability for the real human (and
institutional) decision-makers behind these harms.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.
