A.I. Shows Promise Assisting Physicians

Doctors competed against A.I. computers to recognize illnesses on magnetic resonance images of a human brain
during a competition in Beijing last year. The human doctors lost. Credit Mark Schiefelbein/Associated Press
By Cade Metz, The New York Times, Feb. 11, 2019

Each year, millions of Americans walk out of a doctor’s office with a misdiagnosis. Physicians
try to be systematic when identifying illness and disease, but bias creeps in. Alternatives are
overlooked.

Now a group of researchers in the United States and China has tested a potential remedy for all-
too-human frailties: artificial intelligence.

In a paper published on Monday in Nature Medicine, the scientists reported that they had built a
system that automatically diagnoses common childhood conditions — from influenza to
meningitis — after processing the patient’s symptoms, history, lab results and other clinical data.

The system was highly accurate, the researchers said, and one day may assist doctors in
diagnosing complex or rare conditions.

The vast collection of data used to train the new system, drawn from the records of nearly
600,000 Chinese patients who visited a pediatric hospital over an 18-month period, highlights an
advantage for China in the worldwide race toward artificial intelligence.

Because its population is so large — and because its privacy norms put fewer restrictions on the
sharing of digital data — it may be easier for Chinese companies and researchers to build and
train the “deep learning” systems that are rapidly changing the trajectory of health care.

On Monday, President Trump signed an executive order meant to spur the development of A.I.
across government, academia and industry in the United States. As part of this “American A.I.
Initiative,” the administration will encourage federal agencies and universities to share data that
can drive the development of automated systems.

Pooling health care data is a particularly difficult endeavor. Whereas researchers went to a single
Chinese hospital for all the data they needed to develop their artificial-intelligence system,
gathering such data from American facilities is rarely so straightforward.

“You have to go to multiple places,” said Dr. George Shih, associate professor of clinical radiology
at Weill Cornell Medical Center and co-founder of MD.ai, a company that helps researchers
label data for A.I. services. “The equipment is never the same. You have to make sure the data is
anonymized. Even if you get permission, it is a massive amount of work.”

After reshaping internet services, consumer devices and driverless cars in the early part of the
decade, deep learning is moving rapidly into myriad areas of health care. Many
organizations, including Google, are developing and testing systems that analyze electronic
health records in an effort to flag medical conditions such as osteoporosis, diabetes, hypertension
and heart failure.

Similar technologies are being built to automatically detect signs of illness and disease in X-rays,
M.R.I.s and eye scans.

The new system relies on a neural network, a breed of artificial intelligence that is accelerating
the development of everything from health care to driverless cars to military applications. A
neural network can learn tasks largely on its own by analyzing vast amounts of data.

Using the technology, Dr. Kang Zhang, chief of ophthalmic genetics at the University of
California, San Diego, has built systems that can analyze eye scans for hemorrhages, lesions and
other signs of diabetic blindness. Ideally, such systems would serve as a first line of defense,
screening patients and pinpointing those who need further attention.

Now Dr. Zhang and his colleagues have created a system that can diagnose an even wider range
of conditions by recognizing patterns in text, not just in medical images. This may augment what
doctors can do on their own, he said.

“In some situations, physicians cannot consider all the possibilities,” he said. “This system can
spot-check and make sure the physician didn’t miss anything.”

The experimental system analyzed the electronic medical records of nearly 600,000 patients at
the Guangzhou Women and Children’s Medical Center in southern China, learning to associate
common medical conditions with specific patient information gathered by doctors, nurses and
other technicians.

First, a group of trained physicians annotated the hospital records, adding labels that identified
information related to certain medical conditions. The system then analyzed the labeled data.

Then the neural network was given new information, including a patient’s symptoms as
determined during a physical examination. Soon it was able to make connections on its own
between written records and observed symptoms.
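
To make that pipeline concrete, here is a minimal sketch in Python of that kind of supervised setup. The records, labels and classifier are illustrative placeholders, far simpler than the deep-learning model the researchers actually built:

```python
# A toy stand-in for the study's diagnostic system: learn to associate
# free-text clinical notes with diagnosis labels supplied by physicians.
# The records, labels and classifier here are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Physician-annotated training data: (record text, diagnosis label).
records = [
    "fever cough wheezing shortness of breath",
    "fever headache stiff neck sensitivity to light",
    "vomiting diarrhea abdominal pain",
]
labels = ["asthma", "meningitis", "gastrointestinal disease"]

# Turn each record into word-frequency features, then fit a classifier
# that links those features to the labeled conditions.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(records, labels)

# Given a new patient's recorded symptoms, suggest the most likely diagnosis.
print(model.predict(["persistent cough and wheezing, short of breath"]))
# e.g. ['asthma']
```

The real system learned from hundreds of thousands of records rather than a handful, but the shape of the task is the same: labeled examples in, a diagnostic suggestion out.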

When tested on unlabeled data, the software could rival the performance of experienced
physicians. It was more than 90 percent accurate at diagnosing asthma; the accuracy of
physicians in the study ranged from 80 to 94 percent.

In diagnosing gastrointestinal disease, the system was 87 percent accurate, compared with the
physicians’ accuracy of 82 to 90 percent.

Able to recognize patterns in data that humans could never identify on their own, neural
networks can be enormously powerful in the right situation. But even experts have difficulty
understanding why such networks make particular decisions and how they teach themselves.

As a result, extensive testing is needed to reassure both doctors and patients that these systems
are reliable.

Experts said extensive clinical trials are now needed for Dr. Zhang’s system, given the difficulty
of interpreting decisions made by neural networks.

“Medicine is a slow-moving field,” said Ben Shickel, a researcher at the University of Florida
who specializes in the use of deep learning for health care. “No one is just going to deploy one of
these techniques without rigorous testing that shows exactly what is going on.”

It could be years before deep-learning systems are deployed in emergency rooms and clinics. But
some are closer to real-world use: Google is now running clinical trials of its eye-scan system at
two hospitals in southern India.

Deep-learning diagnostic tools are more likely to flourish in countries outside the United States,
Dr. Zhang said. Automated screening systems may be particularly useful in places where doctors
are scarce, including in India and China.

The system built by Dr. Zhang and his colleagues benefited from the large scale of the data set
gathered from the hospital in Guangzhou. Similar data sets from American hospitals are typically
smaller, both because the average hospital is smaller and because regulations make it difficult to
pool data from multiple facilities.

Dr. Zhang said he and his colleagues were careful to protect patients’ privacy in the new study.
But he acknowledged that researchers in China may have an advantage when it comes to
collecting and analyzing this kind of data.

“The sheer size of the population — the sheer size of the data — is a big difference,” he said.

https://www.nytimes.com/2019/02/11/health/artificial-intelligence-medical-diagnosis.html

Training A Computer To Read Mammograms As Well As A Doctor

By Richard Harris, Morning Edition, NPR, April 1, 2019

"I was really surprised how primitive information technology is in the hospitals," says Regina Barzilay, a
professor at the Massachusetts Institute of Technology who is working on improving mammography with
artificial intelligence. Kayana Szymczak for NPR

Regina Barzilay teaches one of the most popular computer science classes at the Massachusetts
Institute of Technology.

And in her research — at least until five years ago — she looked at how a computer could use
machine learning to read and decipher obscure ancient texts.

"This is clearly of no practical use," she says with a laugh. "But it was really cool, and I was
really obsessed about this topic, how machines could do it."

But in 2014, Barzilay was diagnosed with breast cancer. And that not only disrupted her life, but
it led her to rethink her research career. She has landed at the vanguard of a rapidly growing
effort to revolutionize mammography and breast cancer management with the use of computer
algorithms.

She started down that path after her disease put her into the deep end of the American medical
system. She found it baffling.

"I was really surprised how primitive information technology is in the hospitals," she says. "It
almost felt that we were in a different century."

Questions that seemed answerable were hopelessly out of reach, even though the hospital had
plenty of data to work from.

"At every point of my treatment, there would be some point of uncertainty, and I would say,
'Gosh, I wish we had the technology to solve it,' " she says. "So when I was done with the
treatment, I started my long journey toward this goal."

Getting started wasn't so easy. Barzilay found that the National Cancer Institute wasn't interested
in funding her research on using artificial intelligence to improve breast cancer treatment.
Likewise, she says she couldn't get money out of the National Science Foundation, which funds
computer studies. But private foundations ultimately stepped up to get the work rolling.

Barzilay struck up a collaboration with Connie Lehman, a Harvard University radiologist who is
chief of breast imaging at Massachusetts General Hospital. We meet in a dim, hushed room
where she shows me the progress that she and her colleagues have made in bringing artificial
intelligence to one of the most common medical exams in the United States. More than 39
million mammograms are performed annually, according to data from the Food and Drug
Administration.

Step one in reading a mammogram is to determine breast density. Lehman's first collaboration
with Barzilay was to develop what's called a deep-learning algorithm to perform this essential
task.

"We're excited about this because we find there's a lot of human variation in assessing breast
density," Lehman says, "and so we've trained our deep-learning model to assess the density in a
much more consistent way."

Lehman reads a mammogram and assesses the density; then she pushes a button to see what the
algorithm concluded. The assessments match.
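
As an illustration only, and not the model Lehman and Barzilay actually built, a density classifier of this sort can be framed as a small convolutional network that maps a mammogram to one of the four standard BI-RADS density categories. The architecture, image size and numbers below are assumptions:

```python
# Illustrative sketch: a tiny convolutional network that assigns a
# mammogram to one of the four BI-RADS density categories
# (a: fatty, b: scattered, c: heterogeneously dense, d: extremely dense).
# Real clinical models are far larger; shapes here are assumptions.
import torch
import torch.nn as nn

class DensityNet(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse to one value per channel
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        return self.classifier(h)     # raw scores for the 4 categories

# One grayscale mammogram, downsampled to 256x256 for the sketch.
scan = torch.randn(1, 1, 256, 256)
probs = DensityNet()(scan).softmax(dim=1)
print(probs)  # one probability per density category
```

In practice a network like this would be trained on many thousands of exams labeled by radiologists before its assessments could be compared with a human reader's, which is what makes the consistency Lehman describes possible.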

Next, she toggles back and forth between new breast images and those taken at the patient's
previous appointment. Doing this job is the next task she hopes computer models will take over.

"The optimist in me says in three years we can train this tool to read mammograms as well as an average
radiologist," says Connie Lehman, chief of breast imaging at Massachusetts General Hospital in Boston.
Kayana Szymczak for NPR

"These are the sorts of things that we can also teach a model, but more importantly we allow the
model to teach itself," she says. That's the power of artificial intelligence — it's not simply
automating rules that the researchers provide but also creating its own rules.

"The optimist in me says in three years we can train this tool to read mammograms as well as an
average radiologist," she says. "So we'll see. That's what we're working on."

This is an area that's evolving rapidly. For example, researchers at Radboud University Medical
Center in the Netherlands spun off a company, ScreenPoint Medical, whose software can already
read mammograms as well as the average radiologist, says Ioannis Sechopoulos, a radiologist at
the university who ran a study to evaluate the software.

"A very good breast radiologist is still better than the computer," Sechopoulos says, but "there's
no theoretical reason for [the software] not to become as good as the best breast radiologists in
the world."

At least initially, Sechopoulos suggests, computers could identify mammograms that are clearly
normal. "So we can get rid of the human reading a significant portion of normal mammograms,"
he says. That could free up radiologists to perform more demanding tasks and could potentially
save money.
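
One simple way to implement that kind of rule-out triage, sketched here with invented scores and a threshold chosen purely for illustration, is to auto-clear only the exams a model rates as almost certainly normal and route everything else to a radiologist:

```python
# Illustrative triage rule: auto-clear exams only when the model is very
# confident they are normal; send everything else to a radiologist.
# The scores and threshold are made-up values for the sketch. In a real
# deployment, the threshold would be chosen on a validation set so that
# virtually no cancers slip through the auto-cleared group.
def triage(suspicion_scores: list[float], threshold: float = 0.02) -> dict:
    auto_cleared = [s for s in suspicion_scores if s < threshold]
    to_radiologist = [s for s in suspicion_scores if s >= threshold]
    return {"auto_cleared": len(auto_cleared),
            "to_radiologist": len(to_radiologist)}

# Hypothetical model outputs: probability that each mammogram is abnormal.
scores = [0.001, 0.005, 0.41, 0.013, 0.87, 0.009]
print(triage(scores))  # {'auto_cleared': 4, 'to_radiologist': 2}
```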

Sechopoulos says the biggest challenge now isn't technology but ethics. When the algorithm
makes a mistake, "then who's responsible, and who do we sue?" he asks. "That medical-legal
aspect has to be solved first."

Lehman sees other challenges. One question she's starting to explore is whether women will be
comfortable having this potentially life-or-death task turned over to a computer algorithm.

"I know a lot of people say ... 'I'm intrigued by [artificial intelligence], but I'm not sure I'm ready
to get in the back of the car and let the computer drive me around, unless there's a human being
there to take the wheel when necessary,' " Lehman says.

She asks a patient, Susan Biener Bergman, a 62-year-old physician from a nearby suburb, how
she feels about it.

Bergman agrees that giving that much control to a computer is "creepy," but she also sees the
value in automation. "Computers remember facts better than humans do," she says. And as long
as a trustworthy human being is still in the loop, she's OK with empowering an algorithm to read
her mammogram.

Lehman is happy to hear that. But she's also mindful that trusted technologies haven't always
been trustworthy. Twenty years ago, radiologists adopted a technology called CAD, short for
computer-aided detection, which was supposed to help them find tumors on mammograms.

"The CAD story is a pretty uncomfortable one for us in mammography," Lehman says.

The technology became ubiquitous due to the efforts of its commercial developers. "They
lobbied to have CAD paid for," she says, "and they convinced Congress this is better for women
— and if you want your women constituents to know that you support women, you should
support this."

Once Medicare agreed to pay for it, CAD became widely adopted, despite misgivings among
many radiologists.

A few years ago, Lehman and her colleagues decided to see if CAD was actually beneficial.
They compared doctors at centers that used the software with doctors at those that didn't to see
who was more adept at finding suspicious spots.

Radiologists "actually did better at centers without CAD," Lehman and her colleagues concluded
in a study. Doctors may have been distracted by so many false indications that popped up on the
mammograms, or perhaps they became complacent, figuring the computer was doing a perfect
job.

Whatever the reason, Lehman says, "we want to make sure as we're developing and evaluating
and implementing artificial intelligence and deep learning, we don't repeat the mistakes of our
past."

That's certainly on the mind of Joshua Fenton, a family practice doctor at the University of
California, Davis' Center for Healthcare Policy and Research. He has written about the evidence
that led the FDA to let companies market CAD technology.

"It was, quote, 'promising' data, but definitely not blockbuster data — definitely not large
population studies or randomized trials," Fenton says.

The agency didn't foresee how doctors would change their behavior — evidently not for the
better — when using computers equipped with the software.

"We can't always anticipate how a technology will be used in practice," Fenton says, so he would
like the FDA to monitor software like this after it has been on the market to see if its use is
actually improving medical care.

Those challenges will grow as algorithms take on ever more tasks. And that's on the not-so-
distant horizon.

Lehman and Barzilay are already thinking beyond the initial reading of mammograms and are
looking for algorithms to pick up tasks that humans currently can't perform well or at all.

One project is an algorithm that can examine a high-risk spot on a mammogram and provide
advice about whether a biopsy is necessary. Reducing the number of unnecessary biopsies would
reduce costs and help women avoid the procedure.

In 2017, Barzilay, Lehman and colleagues reported that their algorithm could reduce biopsies by
about 30 percent.

They have also developed a computer program that analyzes a lot of information about a patient
to predict future risk of breast cancer.

The first time you go and do your screening, Barzilay says, the algorithm doesn't just look for
cancer on your mammogram — "the model tells you what is the likelihood that you develop
cancer within two years, three years, 10 years."

That projection can help women and doctors decide how frequently to screen for breast cancer.
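
As an invented example of how such output might be used in a clinic (none of these numbers, field names or rules come from the published model), a per-horizon risk estimate could be mapped to a screening schedule like this:

```python
# Invented example of multi-horizon risk output guiding screening frequency.
# The probabilities and the interval policy below are illustrative
# assumptions, not the published model's numbers.
risk = {"2_years": 0.015, "3_years": 0.024, "10_years": 0.081}

def screening_interval(ten_year_risk: float) -> str:
    # Hypothetical policy: screen higher-risk patients more often.
    if ten_year_risk >= 0.20:
        return "every 6 months plus MRI"
    if ten_year_risk >= 0.05:
        return "annual mammogram"
    return "mammogram every 2 years"

print(screening_interval(risk["10_years"]))  # annual mammogram
```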

"We're so excited about it because it is a stronger predictor than anything else that we've found
out there," Lehman says. Unlike other tools like this, which were developed by examining
predominantly white European women, it works well among women of all races and ages, she
says.

Lehman is mindful that an algorithm developed at one hospital or among one demographic might
fail when tried elsewhere, so her research addresses that issue. But potential pitfalls aren't what
keep her up at night.

"What keeps me up at night is 500,000 women [worldwide] die every year of breast cancer," she
says. She would like to find ways to accelerate progress so that innovations can help people
sooner.

And that imperative calls for more than new technology, she says — it calls for a new
philosophy.

"We're too fearful of change," she says. "We're too fearful of losing our jobs. We're too fearful of
things not staying the way they've always been. We're going to have to think differently."

https://www.npr.org/sections/health-shots/2019/04/01/707675965/training-a-computer-to-read-mammograms-as-well-as-a-doctor
