
Life Sciences Practice

AI in biopharma research: A time to focus and scale
By focusing on specific scientific and operational pain points and
fully integrating AI into research workflows, biopharma companies
can deliver greater patient impact and significant value.

This article is a collaborative effort by Alex Devereson, Erwin Idoux, Matej Macak, Navraj Nagra,
and Erika Stanzl, representing views from McKinsey’s Life Sciences Practice.


October 2022
Despite recent advancements,¹ biopharma research in drug R&D remains expensive and time-consuming, although there are numerous opportunities to build capabilities that enhance productivity and provide probability-of-success gains. In this time of rapid growth of AI in biopharma, attention today is on how to make the most of the opportunity to deliver value at scale by fully integrating AI approaches into scientific process changes. In this article, we outline how biopharma companies can harness AI-driven discovery to deliver patient benefit, and why now is the time for a shift from pursuing select marquee partnerships and self-contained capability builds to focusing on coordinated investment in research AI with impact to show for it.

The goal of the research phase in drug R&D is to generate as many quality drug candidates as possible, as quickly as possible, with the highest probability of successful transition to clinical development. The discovery process has historically been a convergent, stepped, pass–fail funnel process with attrition at every step, a process that is highly inefficient given the number of compounds initially tested.² Ideally, this process should only promote compounds for testing that are relevant for targets that would lead to effective drugs for patients. AI can help identify the most promising compounds and targets at every step of the value chain so that fewer, more successful experiments are conducted in the lab to achieve the same number of leads.

The AI-driven drug discovery industry: Jury still out on impact
The AI-driven drug discovery industry has grown significantly over the past decade, fueled by new entrants in the market, significant capital investment, and technology maturation. These AI-driven companies fall broadly into two categories: providers of AI enablement for biopharma as a service only, including software as a service (SaaS); and providers of AI enablement that have, in parallel with their services, their own AI-enabled drug development pipeline (see sidebar “Why now is the time for AI-enabled drug discovery”).

Our research has identified nearly 270 companies working in the AI-driven drug discovery industry, with more than 50 percent of the companies based in the United States, though key hubs are emerging in Western Europe and Southeast Asia.³ The number of AI-driven companies with their own pipeline is still relatively small today

¹ Fabio Pammolli et al., “The endless frontier? The recent increase of R&D productivity in pharmaceuticals,” Journal of Translational Medicine, April 9, 2020.
² Less than 0.1 percent of candidate molecules pass from screening to Phase I, with approximately 30 percent of new molecular entity costs being spent in discovery (in the context of $1.1 billion to $2.8 billion spent to bring each drug to market). Oliver J. Wouters et al., “Estimated research and development investment needed to bring a new medicine to market, 2009–2018,” JAMA, March 3, 2020.

Why now is the time for AI-enabled drug discovery

Technology advances and regulatory openness to innovation have now combined to make AI-enabled drug discovery a practicable proposition.

In Europe, regulatory openness to in silico and synthetic-derived insights has been facilitated by EU regulation ICH M7 EU, which enables quantitative structure-activity relationship (QSAR) assessment of toxicity instead of traditional assay-based approaches. At the same time, the US Food and Drug Administration (FDA) recognizes single-arm trials in rare diseases with control groups incorporating real-world evidence (RWE).

In parallel, technology is advancing on two fronts with (1) standardized approaches to industrialization and scaling of machine learning (ML), for example, MLOps (ML operations) and DataOps (data operations) alongside customized services and platforms; and (2) development of deep-learning approaches for designing new molecules and computer vision, which are increasingly accessible through public code repositories and academic literature.



(approximately 15 percent have an asset in preclinical development). Those with new molecular entities (NMEs) in clinical development (Phase I and II) have predominantly in-licensed assets or have developed assets using traditional techniques.⁴

The growth in the AI-driven drug discovery space has caught the attention of established biopharma companies, and there has been a rapid rise in partnerships between traditional biopharma companies and AI-driven drug companies (Exhibit 1). However, there is a significant concentration in partnership activity and funding toward a small number of AI-driven players with high valuations, multiple deals, and significant capital raised (62 percent CAGR in investment over the past decade). Over half the capital invested in the space is concentrated in only ten companies (all based in the United States or United Kingdom). This is partly because of the difficulty biopharma companies and investors have in evaluating the long tail of AI-driven players. We have seen biopharma companies that are deeply interested in this space struggle to determine what emerging players do, where they operate along the value chain, the distinctiveness of their technology, and which technologies have demonstrable impact.

Exhibit 1
Investment in emerging AI-driven discovery players increased for a decade before the recent public-market downturn.

[Chart: Companies founded and pharma partnerships¹ by year, 2012 to 2020.]
[Chart: Funding and capital investment in AI-driven drug discovery companies. Funding by year, $ billion, 2011 to 2020, by type: pre-seed and seed, early-stage VC,² late-stage VC, private equity, IPO/secondary offering, corporate/M&A, debt, other.]
[Chart: Share of funding for top 10 companies with >50% of funding vs others, %: top 10 companies, 51; others, 49.]

¹Includes companies founded from previous years.
²Venture capital.
Source: IQVIA Pharma Deals; PitchBook (data has not been reviewed by PitchBook or IQVIA Pharma Deals analysts)

³ PitchBook data, 2022; IQVIA Pharma Deals, 2022.
⁴ Pharmaprojects/Informa, 2022.



Two potential obstacles need to be overcome to unlock impact from AI enablement in partnerships among biopharma companies and AI-driven discovery players. First, AI-enabled discovery approaches (including via partnerships) are often kept at arm’s length from internal day-to-day R&D; they proceed as an experiment and are not anchored in a biopharma company’s scientific and operational processes to achieve impact at scale. Second, investment in digitized drug discovery capabilities and data sets within internal R&D teams is all too frequently used to leverage partner platforms and enrich their IP, rather than to build the biopharma’s end-to-end tech stack and capabilities.

When hurdles are overcome, partnerships can come to fruition, and examples exist across the discovery value chain. AstraZeneca’s long-standing collaboration with BenevolentAI resulted in the identification of multiple new targets in idiopathic pulmonary fibrosis, with subsequent broadening of the scope to other therapeutic areas (TAs).⁵ Sumitomo Dainippon Pharma worked with Exscientia to identify DSP-1181 for obsessive compulsive disorder in less than a quarter of the time typically taken for drug discovery processes (under 12 months versus four and a half years), with ambitions to enter the molecule into Phase I trials.⁶

Similarly, building AI-enablement capabilities in-house within biopharma companies is difficult, assembling the cross-functional teams required to drive the transformation is challenging, and it has been observed that AI enablement is often implemented in a relatively isolated way. AI-enabled approaches are often undertaken separately from day-to-day science, with AI-based tools not fully integrated into routine research activities.

Biopharma companies, therefore, need to strike a balance between internal capability building and partnerships with AI-enabled drug discovery companies. Successful biopharma partnerships in the AI space should have some core benefits: biopharma companies gain access to technology (AI platforms, algorithms, and infrastructure), data (such as curated labeled cell images, screening, ADMET⁷ data), talent (a ready supply of data scientists and data engineers to build AI pipelines while training biopharma talent), and assurances of data protection in relation to a highly specific strategic intent to maximize patient impact (for example, to co-develop a certain molecule class in a specific TA).

Substantial impact from building enterprise capabilities in-house
When biopharma companies successfully integrate AI processes in day-to-day science and assemble cross-functional teams with the right skill sets (data science, engineering, software development, epidemiology, discovery sciences, clinical, and design), we have observed significant impact along the value chain (Exhibit 2):

- Hypothesis generation capabilities: simplified hypothesis generation tasks in experimental biology fields from several weeks of researcher time to curated lists in minutes by combining real-world data (RWD), genomics data, and scientific literature through a knowledge graph for target identification

- Large-molecule-structure inference: 100 times acceleration in time to generation of protein structures (for example, for peptide or mRNA-vaccine-antigen generation) for target identification

- Computer vision technology: up to ten times acceleration achieved for screening-plate-image analysis, with higher accuracy than classical approaches, harnessing deep-learning approaches (for instance, convolutional neural networks) for target validation and hit identification

⁵ “BenevolentAI announces 3-year collaboration expansion with AstraZeneca focused on systemic lupus erythematosus and heart failure,” BenevolentAI press release, January 13, 2022.
⁶ “Sumitomo Dainippon Pharma and Exscientia joint development new drug candidate created using artificial intelligence (AI) begins clinical trial,” Exscientia press release, January 30, 2020.
⁷ ADMET refers to chemical absorption, distribution, metabolism, excretion, and toxicity.



Exhibit 2
AI has already delivered value across the research value chain.
Examples of AI-driven innovation in biotech and biopharma and observed impact across the research value chain

Target identification
Examples of AI-driven acceleration: Insights from data sources (internal and from vendors) to generate novel target hypothesis; gene network, biochemical pathway, and cellular-response data integration in target identification
Example from industry and observed impact: Biopharma unlocked all-inclusive view of complex indication by attributing disease causality through linkages between genomic data and patient electronic medical records (EMRs)

Target validation
Examples of AI-driven acceleration: In silico, phenotypic, cellular models validate targets/identify biomarkers; disease causality determined within patient groups with significant unmet medical need
Example from industry and observed impact: Biopharma internalized AlphaFold2 and ColabFold to generate 3-D models of almost any known, synthesized protein and protein–protein interactions, reducing access to 3-D structures from 6 months to a few hours

Hit identification
Examples of AI-driven acceleration: Automated image analysis for cellular assays through computer vision technology; molecular property prediction (virtual screening)
Example from industry and observed impact: Biotech saw significant acceleration of high-throughput screening (HTS) phase (time to 75% hits detected reduced by 50%) with platform-based “compound prioritization” algorithm

Lead generation/optimization
Examples of AI-driven acceleration: Molecular structure and property prediction (eg, protein binding, logP, toxicity) for novel target proteins; rapid design iteration, across small and large molecules, using eg, Generative Adversarial Networks
Example from industry and observed impact: Biopharma leveraged generative machine learning model to expand library/optimize promising compounds and predict compound efficacy, significantly increasing efficiency of library expansion, with >60,000 new compounds generated

Preclinical
Examples of AI-driven acceleration: Safety issue and DMPK¹ prediction using internal and public data; hypothesis-driven dosages for adaptive trials and targeted populations
Example from industry and observed impact: Biopharma utilized predictive algorithms to maximize probability of successful PK² predictions with 83% of drug development projects progressing to clinic with no PK issues

¹Drug metabolism and pharmacokinetics.
²Pharmacokinetic.

- In silico medicinal chemistry: 30 to 50 percent acceleration in small molecule, high-throughput screening, using approaches such as molecular property prediction in an iterative screening loop (versus the existing approach of randomized selection of compounds) for hit identification

- In silico chemi-informatics: more than two times improvement over baseline on the key metric of “efficacy observed,” over 100 times the number of in silico experiments possible compared with previous screening, and faster time for design of compounds for optimization of drug delivery efficacy for lead optimization



- Knowledge-graph-based hypothesis generation and drug repurposing: rapid identification of novel indications for existing investigational new drugs (INDs) or marketed drugs via genomic information and pathways associated with specific disease phenotypes, accelerating time to new treatments for patients, as part of the preclinical phase of R&D

- Indication finding leveraging genomics: prioritizing indications to pursue for novel mechanisms of action (MoAs), finding new greenfield indications for life cycle management, prioritizing or deprioritizing ongoing programs within clinical plan by stopping low probability of success programs early and reducing patient burden in clinical trials; informing diligence of molecules for licensing with an independent view of biological potential, as part of the preclinical phase of R&D

Biopharma companies that maximize the impact of AI enablement can move beyond minimum viable product (MVP) individual use cases and build research systems (Exhibit 3).

Exhibit 3
Research systems harness synergies when AI technologies are embedded into ‘wet lab’ scientific processes.
Parts of a high-throughput screening (HTS) process embedded with AI technology

[Diagram: a six-step cycle linking in silico/on-the-chip simulations with in vitro/‘wet lab’ experiments.]

1. High-throughput screen commenced with diverse compound sets
Scientist selects diverse compound sets (a set of chemical compounds with a wide range of chemical structures) as first high-throughput screen

2. Automated compound selection and transfer
Using HTS machinery, individual compounds are transferred to individual wells of cells under experimental conditions

3. Computer-vision-based hit selection
Cell response to each compound is measured using microscope analysis (eg, through computer vision techniques); promising compounds are labeled “hits”

4. Automated machine learning (ML) model training from screen outcomes
Information from HTS for first few plates is automatically transferred into an ML pipeline, which “learns” how cells respond to each kind of chemical structure

5. Compound library inferencing and prioritization
ML algorithm then scans the remainder of the library compounds and predicts which plates should be prioritized to identify the highest number of hits in the next screen

6. Automated compound selection based on ML recommendations
ML recommendations are automatically queued and used in the next round of HTS. The cycle continues, with the algorithm continuously learning from “real world” outputs. Recommendations trigger scientists to explore new chemical space and begin downstream screening processes more quickly. These recommendations feed into the selection of chemical compounds in step 1
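
The six-step cycle in Exhibit 3 can be made concrete with a toy simulation. The sketch below is illustrative only and rests on several assumptions that are not in the article: the compound library is synthetic, `run_assay` is a hypothetical stand-in for the wet-lab screen and image analysis, the `SIGNAL` rule that defines a hit is invented, a random first batch stands in for the scientist-selected diverse set, and the naive-Bayes-style scorer is just one simple modelling choice.

```python
import math
import random

N_FLAGS = 10          # assumed number of binary substructure flags per compound
SIGNAL = (0, 3, 7)    # invented rule: a compound is active only if these flags are all set


def make_library(n, rng):
    """Build a synthetic library; each compound is a tuple of 0/1 substructure flags."""
    return [tuple(rng.randint(0, 1) for _ in range(N_FLAGS)) for _ in range(n)]


def run_assay(compound):
    """Hypothetical stand-in for steps 2-3 (plate transfer and image-based hit calling)."""
    return all(compound[i] == 1 for i in SIGNAL)


def train(results):
    """Step 4: fit per-flag log-likelihood ratios from (compound, is_hit) pairs."""
    hits = [c for c, is_hit in results if is_hit]
    misses = [c for c, is_hit in results if not is_hit]
    weights = []
    for i in range(N_FLAGS):
        # Laplace smoothing keeps both probabilities strictly between 0 and 1.
        p_hit = (sum(c[i] for c in hits) + 1) / (len(hits) + 2)
        p_miss = (sum(c[i] for c in misses) + 1) / (len(misses) + 2)
        weights.append((math.log(p_hit / p_miss),
                        math.log((1 - p_hit) / (1 - p_miss))))
    return weights


def score(weights, compound):
    """Step 5: a higher score means the model rates the compound as a likelier hit."""
    return sum(w_set if flag else w_unset
               for (w_set, w_unset), flag in zip(weights, compound))


def iterative_screen(library, batch_size, rounds, rng):
    """Steps 1-6: a random first batch, then batches chosen by the retrained model."""
    remaining = list(range(len(library)))
    rng.shuffle(remaining)                      # step 1: random stand-in for a diverse set
    results, screened = [], []
    for round_no in range(rounds):
        if round_no > 0:                        # steps 4-5: retrain, then re-rank the rest
            weights = train(results)
            remaining.sort(key=lambda i: score(weights, library[i]), reverse=True)
        batch, remaining = remaining[:batch_size], remaining[batch_size:]   # step 6
        results.extend((library[i], run_assay(library[i])) for i in batch)  # steps 2-3
        screened.extend(batch)
    return screened, sum(is_hit for _, is_hit in results)


def random_screen(library, n, rng):
    """Baseline: number of hits found when n compounds are screened at random."""
    return sum(run_assay(library[i]) for i in rng.sample(range(len(library)), n))


if __name__ == "__main__":
    rng = random.Random(0)
    library = make_library(2000, rng)
    _, guided = iterative_screen(library, batch_size=100, rounds=5, rng=rng)
    print("hits found, ML-guided:", guided)
    print("hits found, random:   ", random_screen(library, 500, rng))
```

For the same screening budget, the model-prioritized loop should recover far more of the synthetic hits than random selection, which is the effect iterative screening aims for. Real assays are noisier and real structure-activity signals much weaker, so the size of the gain here says nothing about what a production system would achieve.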



Research systems harness synergies created by putting AI at the center of the research engine to enhance the outcome of experiments, instead of simply being a preparatory step for real-world experiments in isolation. They act as feedback loops to refine the predictive capability and stability of AI algorithms and inform experimental design (for more key definitions, see sidebar “Glossary of key pharma AI R&D terms”). An example is “iterative screening”: results of an initial round of high-throughput screening⁸ are used to train a machine learning (ML) algorithm. The ML algorithm can learn which underlying compound structures are most effective against a target and suggest other molecules in the library to prioritize for testing. As the ML algorithm gathers more data, its predictions rapidly become more accurate, and a disproportionately large number of “hits” are identified for the relative amount of the library screened. These research systems reduce overall costs, have higher probability of success, accelerate R&D processes (and therefore time to patient impact), and are fully integrated for specific use cases.

What does it take to successfully implement AI in biopharma research?
By implementing digital and data science tools and concepts, biopharma can capture the full value of current portfolios and develop core technologies, competences, and IP to drive future research (such as AI-enabled large-molecule and antibody design). Current AI-driven drug discovery companies are already developing their own, significantly

⁸ High-throughput screening involves thousands to millions of candidate molecules in a compound library being tested against a specific assay and determined to be effective against a target, or a “hit.”

Glossary of key pharma AI R&D terms

Artificial intelligence/machine learning
Often used interchangeably, AI and ML have subtly different meanings. A subset of AI, machine learning refers to algorithms that improve by design through the addition of new data. AI describes, more broadly, a computer system that can solve problems based on data available in its environment, in the context of achieving specific goals.

Deep learning
A subset of machine learning, deep learning utilizes an artificial neural-network structure, typically involving a very large number of parameters to be calibrated, to identify patterns in large data sets and subsequently to make predictions based on this calibrated neural network.

Molecular property prediction/iterative screening
An approach at the intersection of machine learning and drug discovery that utilizes machine learning approaches to determine patterns in candidate molecule structures that predict their likelihood of being a successful drug candidate. Typically, in an “iterative screening” setup, an ML algorithm will be trained based on historic screening experiments and will predict which of the remaining compounds in the screening library are likely to be hits; as further batches of experiments are conducted, the algorithm increases its precision and predictive power. The algorithms learn from both successful and unsuccessful predictions.

Knowledge graphs
A type of database that integrates data from multiple sources and creates links across them, typically in the same domain (for example, drug discovery). Data is organized in accordance with an ontology (a web of concepts, including the codified relationships between concepts) and its controlled vocabulary. Knowledge graphs can be used to infer connections across multiple data sources, including link prediction and insight generation.

DataOps/MLOps
DataOps (data operations) is an automated, process-oriented methodology used by analytics and data teams to improve quality and reduce the cycle time of advanced analytics. MLOps (ML operations) refers to software engineering practices applied to IT operations (for example, packaging and deploying production software) in the context of machine learning and artificial intelligence.
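
To make the knowledge-graph entry more tangible, the following minimal sketch stores facts as (subject, relation, object) triples and proposes new gene-to-disease links by following two-step paths through shared pathways. Everything in it is an assumption for illustration: the entity names are placeholders rather than real biology, the three relation names are a made-up mini-ontology, and rule-based path counting is a deliberately simple stand-in for the learned link-prediction methods a production system would more likely use.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object); all names are placeholders.
TRIPLES = [
    ("GENE_A", "participates_in", "PATHWAY_X"),
    ("GENE_B", "participates_in", "PATHWAY_X"),
    ("GENE_B", "participates_in", "PATHWAY_Y"),
    ("GENE_C", "participates_in", "PATHWAY_Z"),
    ("PATHWAY_X", "associated_with", "DISEASE_1"),
    ("PATHWAY_Y", "associated_with", "DISEASE_1"),
    ("PATHWAY_Z", "associated_with", "DISEASE_2"),
    ("GENE_A", "known_target_for", "DISEASE_1"),
]


def build_index(triples):
    """Index objects by (subject, relation) so that paths can be followed quickly."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
    return index


def propose_targets(triples):
    """Infer candidate gene-disease links that are not already recorded as known.

    A candidate is supported by every path gene -> pathway -> disease; candidates
    with more supporting pathways are ranked first.
    """
    index = build_index(triples)
    known = {(s, o) for s, r, o in triples if r == "known_target_for"}
    support = defaultdict(list)
    for gene, relation, pathway in triples:
        if relation != "participates_in":
            continue
        for disease in sorted(index[(pathway, "associated_with")]):
            if (gene, disease) not in known:
                support[(gene, disease)].append(pathway)
    return sorted(support.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for (gene, disease), pathways in propose_targets(TRIPLES):
        print(f"{gene} -> {disease} (supported by {', '.join(pathways)})")
```

In this toy graph, GENE_B is proposed for DISEASE_1 with two supporting pathways and GENE_C for DISEASE_2 with one, while GENE_A is skipped because its link to DISEASE_1 is already known.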




more cost-efficient drug discovery pipelines, so it would be beneficial for established players to identify how they, too, can fully integrate novel technologies into standard research processes. While partnering is one option (where it provides access to data, technology, and talent, and the risk of partners exploiting a company’s IP to become a future competitor in the medium to long term is low), marquee partnerships cannot be the only way to develop in-house drug discovery capabilities. As such, it is critical for biopharma companies to work out how to shift from investing in nonintegrated, lighthouse use cases or partnerships to making AI an integral part of everyday research. With this in mind, here are four areas to consider:

1. Strategy and design-backed road-mapping. Biopharma companies can develop a top-down, C-level strategy, setting out the ways in which AI-enabled discovery will be a critical enabler of future performance. A significant aspect is to understand where the current organizational pain points lie, what the potential gains could be, and where the organization wants to lead the industry (versus only being competitive) in the context of how the space/competitors are expected to move in the future. This strategy needs to be specific, time-bound, linked to value at stake, and have strong alignment among (and sponsorship from) senior leaders, including the heads of R&D, research, and data science. Underpinning this strategy is the need for sufficient resources (balanced across talent, data, and infrastructure investment) to support the capability building and talent acquisition required to make it a reality, or recognition of the trade-offs on IP and capability building if only pursuing external partnerships. Alignment between R&D and digital functions is paramount to ensure balanced co-investment (financial and management time) and for the impact generated from initiatives to be shared appropriately. In addition, it is important to carefully consider which elements of the AI-enabled drug discovery approach will be supported by partnerships versus built in-house.

We recommend a design thinking approach to determine which parts of discovery research to tackle, and in which order. This involves studying, end-to-end, common research processes, where there may be two to three steps that are bottlenecks for researchers, and which could be significantly unlocked via AI, for example, automated image analysis for critical cell assays or lead optimization. Design thinking could help companies determine which areas could benefit most from AI, the implementation road map, and the success indicators to track progress and impact (for example, time from target identification to candidate selection, costs associated with target identification).

For R&D and data science leaders, the focus should not be solely on advanced-analytics use cases: there is significant value in cracking established problems, with applications such as basic



automation using data transformation pipelines (such as dose-response curve fitting), digital operational dashboards, or building data platforms and infrastructure (such as knowledge graphs). For example, building a single data platform for all preclinical data generated can prevent experimental duplication and enhance data sharing across the organization; our experience shows this can reduce months of hypothesis generation time to a few days. The impact includes dramatically increased speed, freeing up people for more productive tasks, and increasing quality of analyses.

2. Relentless value delivery focused on quarterly value releases (QVRs). It is critical that R&D, data science, and data engineering collaborate closely and iterate on delivery of use cases in an agile way. The research process frequently includes specific constraints and ways of working (such as steps and handoffs in the experimental methodology) that need to be accounted for to ensure uptake of the tools and systems that are built (in addition to updating scientific processes and standard operating procedures and introducing financial and performance-based incentives). To consider AI-enablement delivery holistically, leaders can line up key building blocks, as in this specific example focused on “high-throughput screening”:

- Blueprinting. Develop a list of use cases across the value chain, prioritizing according to impact, complexity, and business value; then select the highest-need use cases.

- Digital and analytics solutions. Build and automate screening algorithms that link molecular descriptors (for example, molecule structure in the form of a SMILES string⁹) with desired output, or a hit.

- Data continuum. Collect experimental data in a reusable way (for instance, with FAIR-data principles¹⁰); build master tables from equipment and existing libraries.

- Tech capabilities. Design and build technical infrastructure and data architecture for data extraction and automated gathering.

- Talent and agile operating model. Coach data science, data engineering, and translator/product owners on tools and delivery methodologies, iteratively testing and learning to deliver products via a collaborative environment.

- Adoption and scaling (including change management). Design new screening protocols and experimental strategy, incorporating ML-based algorithms. Ensure the whole research organization (from leaders to lab technicians) understands what the company is trying to achieve and how daily activities need to change.

Once key AI-enabled use cases are aligned, delivery must be highly organized so as to demonstrate ongoing impact; core requirements, potential synergies, and gaps in ongoing cross-cutting road maps must be identified. This means departing from long-term road maps delivering impact in multiyear cycles to focus on QVRs (which produce measurable value after each quarterly sprint, such as AI-enablement of a scientific process) while continuously reprioritizing based on organizational needs. This approach enables AI use-case development to be built more efficiently (by dynamically front-loading priority data ingestion and team capacity), with mission-critical assets deployed as required (Exhibit 4).

All core digital processes in research can be delivered with incremental quarterly delivery; however, the nature of “value” delivery may vary. Moonshot programs (in tech, this could be the advent of AlphaFold¹¹) require long-term road maps and typically a dedicated ML research group to deliver potentially groundbreaking discoveries with impact in biopharma. Such programs may not deliver an AI product every quarter, unlike other digital

⁹ SMILES refers to simplified molecular-input line-entry system.
¹⁰ FAIR refers to findable, accessible, interoperable, and reusable.
¹¹ AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3-D structure from its amino-acid sequence.
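
As an illustration of the “digital and analytics solutions” building block, which links a SMILES-based molecular descriptor to a hit label, the sketch below ranks unscreened compounds by their similarity to compounds already labeled as hits. It is a simplified example under stated assumptions: character n-grams of the SMILES string are a crude stand-in for the chemical fingerprints a cheminformatics toolkit would compute, the hit labels in the usage example are invented, and nearest-neighbor similarity is only one of many ways to relate structure to outcome.

```python
def ngrams(smiles, n=2):
    """Set of overlapping character n-grams; a crude proxy for a structural fingerprint."""
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(known_hits, candidates, n=2):
    """Score each candidate by its highest similarity to any known hit, best first.

    `known_hits` must contain at least one SMILES string.
    """
    hit_features = [ngrams(s, n) for s in known_hits]
    scored = [
        (max(tanimoto(ngrams(c, n), h) for h in hit_features), c)
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Invented labels: phenol and toluene are treated as "hits" purely for illustration.
    known_hits = ["c1ccccc1O", "Cc1ccccc1"]
    # Aniline, ethanol, and butane as unscreened candidates.
    candidates = ["Nc1ccccc1", "CCO", "CCCC"]
    for similarity, smiles in rank_candidates(known_hits, candidates):
        print(f"{smiles}: {similarity:.2f}")
```

Here the aromatic candidate is ranked first because it shares most of its n-grams with the aromatic hits, while the two aliphatic candidates share none. A string-level comparison like this ignores chemistry (the same molecule can be written as different SMILES strings), which is why it should be read as a sketch of the idea rather than a usable screening model.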



Exhibit 4
Moving from parallel processes to quarterly value releases touching all six ‘building blocks’ of an analytics build with links to impact.
Illustration of holistic AI-enablement value delivery focused on quarterly value releases (QVRs)

Nonoptimized AI-enablement delivery: Parallel programs of foundational efforts, with minimal linkages between a subset of 6 “building blocks” of analytics builds, where not all blocks may be considered; no explicit link to business/scientific flows and impact. Value release happens mostly at the end of each horizontal.

Optimized digital delivery: Program links all horizontals via releases, touching all 6 “building blocks” of an analytics build; links from QVR to impact through scientific/business flows. Value release happens quarterly (QVR1, QVR2, QVR3).

Building blocks (horizontals) shown in both views: blueprint; digital/analytics; data; technical capability; capability building/agile; adoption and scaling.

initiatives, but an insight, report, or decision should still be delivered on a regular basis.

3. IP, capability building, and developing translation expertise through partnerships. While there is certainly evidence for the benefits of partnership in specific areas, including to access unique technologies, data, or solution types, managing these partnerships exclusively at arm’s length and keeping novel methods or solutions separate from day-to-day research mean that necessary future capabilities for a transformation in drug discovery may not be built.

Biopharma companies should be selective and specific about the capabilities to be delivered by partnerships versus those built in-house. Similarly, a balanced approach to in-house and external talent (notably, the data scientists and data engineers needed to work with researchers in developing the algorithms and technology backbones to support prioritized areas) is vital. Often overlooked but mission critical for AI enablement are “translators” or “product owners” with deep business, clinical, scientific, and AI/ML and systems architecture understanding. These profiles have a product ownership mindset and understand and dynamically evaluate all elements of the analytics team to maintain focus on value and impact delivery, thereby assuring successful project delivery.

4. Industrialization of AI with MLOps and reusable analytical assets. For the capabilities a biopharma company builds in-house, it is essential to have the right enablers in place to support scaling across research activities: the right technology



infrastructure and methodologies, especially DataOps and MLOps and an appropriate data architecture (for example, graph databases or Data Vault 2.0 technology). DataOps (data operations) enables companies to gain more value from their data by accelerating the process of building models. MLOps involves ensuring the right platforms, tools, services, and roles with the right team operating model and standards for delivering AI reliably and at scale. Technical-architecture enablers to support processing compute-intensive workflows such as AlphaFold, molecular-dynamics simulations, optimization models, and image-recognition workflows are a core requirement. Furthermore, enabling concepts such as Data Vault 2.0 techniques and graph databases are table stakes as AI capabilities scale.

To successfully deploy research systems, development teams must build multiple interrelated components (data connectors and pipelines, models, APIs, and visual interfaces) that work seamlessly to drive adoption among end users. Fragmentation of code bases and components, and reduced productivity due to integration challenges, are natural risks that arise when multiple tools are deployed across different domains and teams.

Ensuring coding standards in development and harmonization of coding approaches across teams increases long-term productivity and solution robustness. Additionally, harmonization enables sharing of reusable components (data connectors, feature libraries, model-based embeddings) across projects: for example, using graph neural-network molecular embeddings for hit prediction and lead optimization for toxicity reduction. As the emerging research platform grows in complexity, “assetization” of reusable components becomes an increasingly important source of development productivity (with twice the productivity for teams that embrace it) and an important in-house capability that requires a dedicated team with a product-centered mindset.¹²

The question today is whether biopharma companies will move analytics investments beyond a focus on individual projects and marquee partnerships to transforming research at scale. A shift to focusing on specific scientific and operational pain points and building AI into fully integrated research systems, with a road map to scale, will enable biopharma companies to capture real business and patient impact from using AI in research.

Alex Devereson is a partner in McKinsey’s London office, where Matej Macak is an associate partner and Navraj Nagra is a
consultant; Erwin Idoux is a consultant in the Paris office; and Erika Stanzl is an associate partner in the Zurich office.

The authors wish to thank Jeffrey Algazy, Christoforos Anagnostopoulos, Joachim Bleys, David Champagne, Thomas
Devenyns, Bryan Lee, Rachel Moss, Michael Steinmann, and Lieven Van der Veken for their contributions to this article.

Designed by McKinsey Global Publishing


Copyright © 2022 McKinsey & Company. All rights reserved.

¹² Industrialization of AI with MLOps is one of 14 foremost trends identified by the McKinsey Technology Council. For more information, see Michael Chui, Roger Roberts, and Lareina Yee, “McKinsey Technology Trends Outlook 2022,” August 24, 2022.
