Trends in Biotechnology

Review

Towards a Digital Bioprocess Replica: Computational Approaches in Biopharmaceutical Development and Manufacturing
Jens Smiatek,1,2,4,* Alexander Jung,1 and Erich Bluhmki1,3

1 Boehringer Ingelheim Pharma GmbH & Co. KG, Digitalization Development Biologicals CMC, Birkendorfer Strasse 65,
D-88397 Biberach a. d. Riss, Germany
2 Institute for Computational Physics, University of Stuttgart, Allmandring 3, D-70569 Stuttgart, Germany
3 University of Applied Sciences Biberach, Karlsstrasse 6–11, D-88400 Biberach a. d. Riss, Germany
4 https://www2.icp.uni-stuttgart.de/~icp/Jens_Smiatek

*Correspondence: jens.smiatek@boehringer-ingelheim.com (J. Smiatek).

Quantitative unit operation models for the optimization and refinement of modern late-stage biopharmaceutical
drug manufacturing processes have recently attracted increasing attention. The supplementary benefits of these
models include increased process robustness and control in combination with a more stringent design of the
bioprocess due to a reduced number of exploratory experiments. In addition to unit operations, further efforts
also focus on digital bioprocess replicas, which are straightforward combinations of unit operation and process
models from inoculum to the fill and finish phase. In this review, we shed more light on digital bioprocess
replicas in addition to standard unit operation models and discuss their strengths and weaknesses. We comment
on the current usage of these approaches for late-stage processes and outline the associated benefits,
challenges and limitations.

Highlights
Modeling and statistics approaches give a detailed overview of individual unit operations in biopharmaceutical
processes.
A fully digital bioprocess replica has significant benefits but also challenges.
Modeling approaches play an important role in quality by design principles.
Machine learning and artificial intelligence approaches might also be integrated into the future design of
bioprocesses.
Molecular models and molecular understanding are important for refined biopharmaceutical manufacturing
processes and root-cause analysis.

Computational Approaches for Biopharmaceutical Processes

Over the last few years, biopharmaceutical research and development have been confronted with new challenges
as well as promising opportunities. Some of the most important new technologies are various digital approaches,
which have found their way into automated laboratories, artificial intelligence for advanced data analysis, and
computational models to study molecular or process-relevant behavior [1]. While research on new active
pharmaceutical ingredients (APIs, see Glossary) has relied on computational models for a long time [2–4],
quantitative approaches for the detailed study of pharmaceutical manufacturing processes have become more
significant only recently [5]. Notably, biopharmaceutical development and manufacturing processes include a
plethora of distinct unit operations whose underlying molecular mechanisms and influence on the API are not yet
fully understood.

In view of this remark, computational approaches are now regarded as promising alternatives to experimental
studies to optimize and control the process behavior more systematically. For clarity, a standard bioprocess [6]
for the production and purification of biological APIs with exemplary unit operations in combination with the
final drug product phase is depicted in Figure 1.

A large number of computational approaches are now used to identify optimal conditions and to achieve full
control over the bioprocess. In addition to statistical methods like design of experiments (DoE) or multivariate
data analysis [7–9], prominent examples include mechanistic approaches [10–15], hybrid models [16–18],
computational fluid dynamics, and predictive algorithms for stable API structures [19–23]. Among the
less-detailed numerical methods for upstream processes (USPs) and downstream processes (DSPs), numerous
molecular models
have proven their benefits as useful tools for the drug product phase because these approaches
provide a deeper knowledge of molecular interactions and thus significantly lower the influence of
unwanted adverse effects [24–26].

Figure 1. A Standard Biological Active Pharmaceutical Ingredient (API) Manufacturing Process with Unit
Operations from the Upstream Process (USP), Downstream Process (DSP) and Drug Product Phase. After
suitable choices for the media, cell lines, and the seed train in the inoculum phase, the USP formally starts with the cell
production of APIs in bioreactors. Hereafter, the solution is filtered and centrifuged in harvest. The downstream process
focuses on the purification of the API in terms of various chromatographic and capture steps in combination with virus
inactivation (VI), virus filtration (VF), ultrafiltration (UF) and diafiltration (DF) processes. The final drug product stage
concentrates on the optimal therapeutic drug product, meaning the design of tailor-made formulations that ensure long
shelf-lives and prolonged stability of the API in the buffer solution. Abbreviation: NBE, new biological entity.

Glossary
Active pharmaceutical ingredient: here, a monoclonal antibody.
Computational fluid dynamics: numerical solver for the Navier–Stokes equation.
Critical process parameter: parameter that influences the CQA.
Design of experiment: statistical methods for optimal process design.
Hierarchical Bayes model: combination of various parameter probabilities for a posteriori probability.
Hybrid model: combination of mechanistic model with machine-learning method.
Mechanistic model: mathematical approach including continuum equations as well as balance and
conservation conditions.
Molecular dynamics: simulation approach with discrete integration scheme.
Monte Carlo simulation: computational method with random number sampling.
Multivariate data analysis: mathematical methods for the analysis of high-dimensional data sets.
Partial least square regression: mathematical approach for higher-order regression of high-dimensional
data sets.

Trends in Biotechnology, Month 2020, Vol. xx, No. xx. https://doi.org/10.1016/j.tibtech.2020.05.008
© 2020 Elsevier Ltd. All rights reserved.

Besides the standard goals of optimization, each biopharmaceutical manufacturing process has
to obey the paradigm of quality by design (QbD) which ensures a high-quality target product
profile (QTPP) of the final drug product [27–29]. In more detail, QbD principles introduce certain
critical quality attributes (CQAs) including physical, chemical, biological, or microbiological prop-
erties or characteristics that should be within an appropriate limit, range or distribution [30–32].
Furthermore, each CQA is affected by various critical process parameters (CPPs), whose
variability as well as impact on the CQA must be monitored and controlled (Box 1). Not each
process parameter (PP) is a CPP, and not each process outcome is a CQA, so identifying
CPPs and CQAs with less time-consuming process simulations or statistical analysis is beneficial
to reduce financial costs and experimental efforts [7,29,33–39]. Regarding the already-achieved
high levels of model maturity, recent efforts concentrated on developing partially coupled or fully
holistic process models [40–42]. A holistic process model can be regarded as a digital replica of
the bioprocess, including all unit operations and process steps from inoculum to the fill and finish
phase, which means that the individual models are integrated into one simulation framework with
appropriate conditions for data input and output flow. Besides the fact that costly experimental
efforts for the exploratory search of optimal process design can be reduced, further benefits of a
holistic process model include: the process-wide evaluation of CPPs and CQAs in a high-
dimensional design space with proven acceptable ranges (PARs) (Box 1); an increase in continuous
improvement efforts; a clearer and more quantitative assessment of manufacturing nonconfor-
mances or deviations; a sound basis for a cost-of-goods model; and potential compatibility with
process analytical technology (PAT) [40].

Holistic process models are often also described as digital twins of the bioprocess [39,43]. In its
general meaning, a digital twin can be regarded as a virtual counterpart of a physical system or a
process [43]. We believe that this definition is often misleading in terms of several ambiguous
descriptions that have been published over the past few years. Moreover, due to the large number
of complex interactions, understanding of the underlying molecular mechanisms is still limited.
Therefore, the designation of a holistic process model as a digital process replica is a more
precise description.

Box 1. QbD Principles, Process Ranges, and Pain Points

All unit operations introduce certain CPPs that change the corresponding CQA values [14,41,96]. Determining these
parameters and their influence on final CQA distributions is of paramount importance to meet all relevant QbD guidelines
for acceptable QTPPs [27,30–32]. These correlations are complex, and understanding the CPP and CQA parameter
distributions over all process steps, as well as their impact on the final drug product quality in terms of acceptable process
ranges, is a challenging task. If unit operations are considered only in isolation, important correlations between CPPs and
CQAs for neighboring unit operations are ignored, which implies imprecise process limits [30–32,41].

The corresponding proven acceptable ranges (PARs) and normal operating ranges (NORs) are schematically depicted in
Figure I. Only the combined consideration of unit operations in terms of a holistic process model generates process
knowledge that provides a reliable definition of reasonable PARs in order to achieve the required product quality. With this
process knowledge and with regard to the NORs, it is possible to define the PAR as narrow as necessary but as wide as
possible.

Figure I. Schematic Visualization of Parameter Ranges for Individual Unit Operations in a High Dimensional
Design Space. The normal operating ranges for the individual unit operations are denoted as blue bars and the green
broken lines denote the proven acceptable ranges (PARs). Pain points highlight values that are close to the PARs and
imply potential problems through correlations between neighboring unit operations.
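The range logic described in Box 1 (NORs nested inside PARs, with pain points close to a PAR limit) can be sketched in a few lines. This is a minimal illustration and not a method from the cited works: the class `ParameterRange`, the function `classify`, the `margin` threshold, and the pH numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ParameterRange:
    """Ranges of one process parameter (all values are illustrative)."""
    name: str
    nor: Tuple[float, float]  # normal operating range (low, high)
    par: Tuple[float, float]  # proven acceptable range (low, high)


def classify(value: float, rng: ParameterRange, margin: float = 0.1) -> str:
    """Classify a parameter value relative to its NOR and PAR.

    `margin` is a hypothetical choice: the fraction of the PAR width
    within which a value counts as "close to the PAR limit".
    """
    par_lo, par_hi = rng.par
    nor_lo, nor_hi = rng.nor
    if not (par_lo <= value <= par_hi):
        return "out of PAR"
    width = par_hi - par_lo
    if value - par_lo < margin * width or par_hi - value < margin * width:
        return "pain point"
    if nor_lo <= value <= nor_hi:
        return "within NOR"
    return "within PAR"


# Hypothetical pH ranges for a single unit operation.
ph = ParameterRange("pH", nor=(6.9, 7.1), par=(6.7, 7.3))
for value in (7.0, 7.15, 7.28, 7.4):
    print(value, classify(value, ph))
```

A holistic model would go further than this per-parameter check, because the PAR of one unit operation may depend on the values reached in the neighboring ones.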

In this review, we discuss the theoretical background of standard computational models for late-
stage USPs, DSPs, and drug product phase in terms of their benefits, weaknesses, and current
application. We point out the importance of such approaches for root cause analysis, process
improvement, robustness, and control. Furthermore, we outline the benefits, challenges, and limita-
tions associated with the development of a holistic process model. In contrast to other review articles
[44–48], we focus on recent modeling concepts for all stages of the bioprocess. Thus, we do not
focus explicitly on specific unit operations or specific process stages, but rather try to give a broad
overview on general bioprocess simulations and their meaning for digital bioprocess replicas.

Details of Standard Unit Operation Models


In this section, we briefly introduce the most relevant unit operation models for USPs, DSPs, and
the drug product stage. For the sake of clarity, we separate this section in terms of the individual
process stages despite the fact that certain models are used in various contexts. In addition to
unit operation models, statistical analysis plays an important role for modern process design
[7–9,44,49–52]. Such approaches include a broad plethora of regression and analysis methods
in order to identify the underlying relations, clustering effects, and principal components, as well
as the influence of parameter variations on process behavior. A more detailed overview on these
methods in the context of bioprocess design is available in [52].

Unit Operation Models for USPs


Computational models in USPs mainly focus on improved titers of biological APIs in bioreactors.
Thus, the optimal design of bioreactors in addition to reliable metabolic models for the detailed
study of cell densities and metabolite concentrations are of specific importance. Bioreactor
geometries and mixing properties are commonly studied by computational fluid dynamics
(CFD) simulations [53]. In more detail, the fundamental Navier–Stokes equation is numerically
solved in CFD simulations for certain geometries and boundary conditions such that the flow
and mixing dynamics can be investigated in detail [53–55]. In addition to other parameters of
interest, the mixing time and the mass transfer can be reliably estimated. However, the
corresponding simulations are time consuming, and the explicit consideration of objects like
clone cells is a challenging task. Lattice Boltzmann (LB) simulations have recently become
popular to overcome these limitations and drawbacks [22,23,56,57]. In contrast to CFD and
the corresponding finite-volume solvers for the Navier–Stokes equation, the fluid in LB
approaches is discretized on well-defined lattice nodes whose population is characterized by
numerical values of a momentum vector in combination with a probability density. During the
collision and relaxation steps, the lattice nodes exchange momentum in accordance with the
Boltzmann equation [22,56]. Discrete particles can be added as well by using standard molecular
dynamics integration schemes [56].
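For orientation, the collision and relaxation step described above can be written in the widely used single-relaxation-time (BGK) form. This form is an assumption made for illustration; the text does not state which collision operator the cited LB studies use. Here $f_i$ is the population moving along the discrete lattice velocity $\mathbf{c}_i$, $f_i^{\mathrm{eq}}$ is its local equilibrium value, and $\tau$ is the relaxation time.

```latex
f_i(\mathbf{x} + \mathbf{c}_i \Delta t,\, t + \Delta t)
  = f_i(\mathbf{x}, t)
  - \frac{\Delta t}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t) \right],
\qquad
\rho = \sum_i f_i,
\qquad
\rho\,\mathbf{u} = \sum_i \mathbf{c}_i f_i
```

The fluid density $\rho$ and velocity $\mathbf{u}$ are recovered as moments of the populations at each lattice node, which is why the update stays local and parallelizes well.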

Box 2. Mechanistic and Hybrid Models

Mechanistic and hybrid models can be used for several process steps [5]. Here, we discuss the main principles for kinetic
mechanistic and hybrid models which are mainly used in USPs. Such models are often considered for optimal process
design as well as the improvement of titer and media [16–18,43,65,66]. Kinetic mechanistic and hybrid models rely on
chemical reactions and often differ in their interpretation of rate constants (fixed rate constants in standard mechanistic
approaches, time-dependent rates in hybrid models). With regard to general principles, the concentration cα(t) of a
metabolite or species α can be written as

$$\frac{\mathrm{d}c_\alpha(t)}{\mathrm{d}t} = \mu_\alpha(t)\, c_\alpha(t) \qquad \text{[I]}$$

with a time-dependent rate μα(t) which reveals a nonlinear functional form. However, each calculated rate μα(t) for certain
time points t = t0, t1, ..., tn as well as for various metabolites is used for the prediction of the concentrations at the subsequent
time point. The corresponding procedure can be either implemented for individual time steps with regard to standard
neural networks or for more sophisticated recurrent neural network approaches [97]. In addition to these concepts, it is
also possible to introduce complex coupled differential equations for the individual metabolites instead of single functional
forms [62,63].

In addition to the broad interest in reactor geometries, improved feeding strategies as well as the
optimal time points for the harvest or end of fermentation in fed-batch or continuous perfusion
processes are the central points of each upstream process. To optimize yields, it is of utmost
importance to predict the time-dependent concentration of the metabolites and the APIs in
combination with the number of dead and viable cells. In this context, mechanistic models (Box 2)
are often used to study the time-dependent concentration profiles of cells, metabolites as well as
the product in standard cell cultures [10,11,58–61]. The mathematical framework is given by
coupled ordinary differential kinetic equations that describe the time-dependent concentration
of species and the produced biomass in the system. The kinetic equations can be coupled
with flow rates as well as pH values to provide a more detailed representation of experimental
conditions. Therefore, specialists usually estimate or determine the corresponding rate
constants for consumption and production reactions of the involved species, as well as the
birth and death rates from experimental data, in order to model the number of viable and dead
cells. Although a reasonable agreement between experimental and computational results is
often observed, certain deviations can be rationalized by an incomplete knowledge of metabolite
reactions and oversimplified considerations in terms of pseudo first-order and Monod reaction
kinetics [18].
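The kind of coupled kinetic model described above can be sketched as follows. This is a minimal illustration under stated assumptions and not a model from the cited works: it uses Monod growth on a single substrate, a constant death rate, a fixed yield coefficient and explicit Euler integration. The function names and all parameter values are hypothetical.

```python
def monod_rate(substrate, mu_max, k_s):
    """Monod growth rate: saturates at mu_max for high substrate levels."""
    return mu_max * substrate / (k_s + substrate)


def simulate_batch(x_viable=0.2, x_dead=0.0, substrate=30.0, product=0.0,
                   mu_max=0.04, k_s=1.0, k_death=0.005, yield_xs=0.1,
                   q_p=0.02, dt=0.1, t_end=200.0):
    """Integrate the coupled ODEs with an explicit Euler scheme.

    Illustrative units: cells in 1e6 cells/mL, substrate in mM, time in h.
    Returns a list of (time, viable, dead, substrate, product) tuples.
    """
    n_steps = int(round(t_end / dt))
    trajectory = [(0.0, x_viable, x_dead, substrate, product)]
    for step in range(1, n_steps + 1):
        mu = monod_rate(substrate, mu_max, k_s)
        d_viable = (mu - k_death) * x_viable    # growth minus death
        d_dead = k_death * x_viable             # dead cells accumulate
        d_substrate = -mu * x_viable / yield_xs  # consumption by growth
        d_product = q_p * x_viable              # growth-independent production
        x_viable += dt * d_viable
        x_dead += dt * d_dead
        # Clamp at zero so a coarse time step cannot yield a negative value.
        substrate = max(substrate + dt * d_substrate, 0.0)
        product += dt * d_product
        trajectory.append((step * dt, x_viable, x_dead, substrate, product))
    return trajectory


trajectory = simulate_batch()
peak_viable = max(point[1] for point in trajectory)
print("peak viable cell density:", round(peak_viable, 3))
print("final product concentration:", round(trajectory[-1][4], 3))
```

In practice the rate constants would be estimated from experimental data, and a stiff or adaptive ODE solver would usually replace the fixed-step Euler scheme.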

Recently, so-called hybrid models were introduced to correct for these deviations
[5,16–18,43,62,63]. As one of the most common representations of hybrid models, the results
of experimental measurements are used to train a neural network to adjust time-dependent
rates, which are evaluated in terms of standard mechanistic model equations [16–18]. Conse-
quently, a highly accurate and adaptive concentration profile for the species can be achieved
(Box 2). Despite the large amount of experimental data that is required and its main
consideration for USPs, hybrid models also are applicable for further unit operations in DSPs [5].
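The hybrid idea from Box 2 (learn a time-dependent rate from data, then insert it into the mechanistic equation dc/dt = μ(t)c) can be illustrated with a deliberately simplified stand-in. In this sketch a straight-line least-squares fit replaces the neural network, and the "measurements" are synthetic. The function names and the values of `a_true`, `b_true` and `c0` are hypothetical.

```python
import math


def estimate_rates(times, conc):
    """Estimate mu = d ln(c)/dt on each interval and assign it to the midpoint."""
    mids, rates = [], []
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        mids.append(times[k] + 0.5 * dt)
        rates.append(math.log(conc[k + 1] / conc[k]) / dt)
    return mids, rates


def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(c0, times, rate_fn):
    """Step dc/dt = mu(t) * c forward using the rate at each interval midpoint."""
    conc = [c0]
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        mu = rate_fn(times[k] + 0.5 * dt)
        conc.append(conc[-1] * math.exp(mu * dt))
    return conc


# Synthetic "measurements" generated from mu(t) = a - b*t (hypothetical values).
a_true, b_true, c0 = 0.05, 0.0004, 1.0
t_meas = [10.0 * k for k in range(11)]
c_meas = [c0 * math.exp(a_true * t - 0.5 * b_true * t * t) for t in t_meas]

# Data-driven part: learn the rate. Mechanistic part: integrate the ODE.
mids, rates = estimate_rates(t_meas, c_meas)
intercept, slope = fit_linear(mids, rates)
t_fine = [5.0 * k for k in range(25)]  # 0 h to 120 h, beyond the measured range
predicted = predict(c0, t_fine, lambda t: intercept + slope * t)
exact = c0 * math.exp(a_true * 120.0 - 0.5 * b_true * 120.0 ** 2)
print(predicted[-1], exact)
```

With real data the rate would depend on several inputs (feeds, pH, metabolite levels), which is where a neural network becomes useful and where the data demand mentioned in the text arises.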

To account for the influence of various media and feeding strategies, metabolic flux pathway analysis
(MFPA) is a widely used approach for studying biochemical networks with a special focus on
genome-scale metabolic network reconstruction [64]. Because biological APIs like monoclonal
antibodies are produced by various individual cell lines, it is of specific interest to reconstruct
the chemical synthesis network to identify potential leverage for improvement [58]. With regard
to the knowledge of consumption and production rates in combination with possible modifica-
tions, MFPA can be regarded as a viable tool to improve API titers from a cell line perspective.

Box 3. Statistical and Machine-Learning Approaches

Thanks to an enormous increase of computational power, various machine learning approaches have emerged and
increased in importance in recent years [5,9]. Such approaches significantly broaden the toolkit of standard statistical
approaches like principal component analysis, multiple linear or partial least squares regression, and others
[7,8,51]. The main benefit of unsupervised methods is that they are well suited to extracting information or patterns
without any prior knowledge of the considered process data. In contrast to unsupervised methods, supervised learning
approaches like artificial neural networks rely on a previous identification of target values, which are correlated with
certain input parameter values via high-dimensional fitting functions [9].

Although these methods demonstrate fascinating properties, they are not free of drawbacks. In particular, neural networks
require large amounts of data for high accuracy and are prone to overfitting problems. Moreover, due to the high
dimensionality and the complexity of the correlations, it is often hard to understand the corresponding outcomes. Despite
these crucial limitations, the predictive capability for process conditions and parameter variations that are not directly
studied by experiments is often more accurate when compared with lower order regression approaches
[16–18,65,66,73]. Furthermore, machine learning methods are extremely beneficial for pattern recognition in highly
complex and multivariate data distributions, whereas their single use as main modeling approaches remains
questionable. Thus, it is of main importance to rationalize the outcomes in terms of reliable mechanistic, hybrid or
molecular models.

In addition to these as well as various novel machine learning approaches (Box 3)
[16–18,43,65,66], standard statistical methods in the context of DoEs or multivariate data analysis
have also proven useful by providing important insight into parameter correlations, which are
often used to optimize process conditions [7,8,44,49–52]. Although the corresponding methods
mainly rely on first-order or quadratic regression schemes, they are useful to identify the most
important correlations. USP modeling thus is still an area of active research with regard to
ongoing efforts in several modeling directions (Box 3) [65,66].
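The quadratic regression schemes mentioned for DoE analysis can be illustrated with a one-factor response-surface fit. This is a minimal sketch under stated assumptions: the factor levels are coded between -1 and 1, the responses are synthetic values generated from a known quadratic, and the helper names are hypothetical. A real DoE evaluation would involve several factors, interaction terms and replicate noise.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    powers = [sum(x ** p for x in xs) for p in range(5)]
    a = [[powers[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear_system(a, b)


# Coded factor levels (e.g., a scaled process parameter) and synthetic responses.
levels = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [5.0 + 1.2 * x - 2.0 * x * x for x in levels]

b0, b1, b2 = fit_quadratic(levels, responses)
optimum = -b1 / (2.0 * b2)  # stationary point of the fitted parabola
print("coefficients:", b0, b1, b2)
print("estimated optimum (coded units):", optimum)
```

The stationary point is a maximum only when the fitted quadratic coefficient is negative, so the sign of `b2` should be checked before reading `optimum` as an optimal setting.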

Unit Operation Models for DSPs


Unit operations in DSPs mainly focus on capture and purification procedures. Important de-
vices include various chromatographic columns to capture the API and to purify the solution
from host cell proteins and other unwanted cosolutes. In addition, DSPs include several filtra-
tion and virus-inactivation unit operations, which have a strong influence on the API aggrega-
tion behavior due to affected glycosylation profiles [67,68], as well as certain capture steps,
that modify the specific clearance and the concentration of the API as one of the most impor-
tant CQAs [41]. The importance of such operations motivated interest in reliable unit operation
models for DSPs, and mechanistic kinetic-dispersive models are by far the most popular ap-
proaches to study chromatographic behavior [69]. In contrast to USPs, where mechanistic
models are mainly used to study the concentration-dependent behavior of metabolic reactions,
mechanistic chromatographic process modeling concentrates on physical principles, espe-
cially adsorption–diffusion continuum equations. Although the binding of species to the resin
has its origin in complicated molecular mechanisms, continuum equations are often applicable
down to the nanoscale, which rationalizes the high accuracy of recent diffusion–adsorption
models [57]. A crucial point in chromatographic models is calibrating the corresponding bind-
ing isotherms, which often relies on various approximations [13–15,69–72]. Recent publica-
tions also introduced the first hybrid models for chromatographic modeling, such that the
mechanistic model is coupled with artificial neural networks in order to minimize the deviations
to experimental results. These models are beneficial to estimate binding parameters and
provide a detailed root cause analysis in terms of deviations to experimental values [73]. In
addition to chromatographic steps, various membrane-based purification mechanisms also
have a crucial impact on concentration profiles in DSPs [45]. Thus, recent effort also has
been spent on developing reliable thermodynamic models for ultrafiltration processes
[74,75]. The main driving mechanisms are the pressure along the membrane and the concen-
tration profile change due to osmotic effects, which are modeled via a combination of mass
transfer, concentration polarization, and diffusive motion [74,76]. Further numerical
approaches for other unit operations in DSPs have been published, such as hydrodynamic
considerations for viral filtration or cell clarification procedures [77,78]. The authors of these
studies introduced hydrodynamic concepts for the study of fluid flow in confined geometries
or in the presence of obstacles to rationalize the observed experimental findings. The
corresponding results demonstrated a significant influence on the process behavior, which
motivates further research in this direction.
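The isotherm calibration step mentioned above for chromatographic models can be illustrated with a simple example. The text does not name a specific isotherm, so this sketch assumes a single-component Langmuir isotherm, q = q_max K c / (1 + K c), and calibrates it through the linearized form c/q = c/q_max + 1/(K q_max). The data are synthetic and the function names and parameter values are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def langmuir(conc, q_max, k_eq):
    """Bound amount q for a liquid-phase concentration c (assumed isotherm)."""
    return q_max * k_eq * conc / (1.0 + k_eq * conc)


def fit_langmuir(conc, load):
    """Calibrate q_max and K from the linearization c/q = c/q_max + 1/(K*q_max)."""
    ratios = [c / q for c, q in zip(conc, load)]
    intercept, slope = fit_linear(conc, ratios)
    q_max = 1.0 / slope
    k_eq = slope / intercept
    return q_max, k_eq


# Synthetic batch-uptake data generated from hypothetical parameters.
concentrations = [0.1, 0.5, 1.0, 2.0, 5.0]          # e.g., mg/mL in solution
loads = [langmuir(c, 60.0, 2.5) for c in concentrations]  # e.g., mg/mL resin

q_max_fit, k_fit = fit_langmuir(concentrations, loads)
print("q_max:", q_max_fit, "K:", k_fit)
```

The linearization weights measurement errors unevenly, so with noisy data a nonlinear least-squares fit of the isotherm itself is usually the safer choice.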

Besides individually modeling the chromatographic and ultrafiltration unit operations, initial
approaches focused on coupling individual process steps with regard to integrated process
models (IPMs) [39,41,79,80]. In more detail, each IPM focuses on effectively determining mean-
ingful process parameters for final CQA values after certain process steps. A specific example
[41] combined the results of regression approaches and Monte Carlo simulation for three chro-
matographic steps to evaluate the outcomes for the specific clearance. The results of this study
imply that coupling the capture steps significantly influences the final outcome. The main concept
of an IPM and the corresponding influence on the final CQA distribution is schematically depicted
in Figure 2. The first unit operation yields an initial distribution of CQA values as imposed by certain
process parameters. These CQA values, taking the titer as a specific example, are often used as
further input parameters for the following process steps. With regard to this point, the second and
third process steps further modify the corresponding CQA distribution such that the final mean
CQA value often differs significantly from the outcomes for the individual unit operations.

Figure 2. Schematic Illustration of Coupled Unit Operations for the Final Critical Quality Attribute (CQA)
Distribution after Three Process Steps. Process step 1 with certain values for the process parameters (PPs) leads to
the initial left CQA distribution. Process step 2 induces a shift of the distribution to the right side by the introduction of
novel PPs. The final process step 3 provides the final distribution which satisfies all relevant quality by design principles.
Reproduced, with permission, from [41].

Moreover, out-of-spec values as induced by distinct CPP variations can be minimized by modi-
fying initial conditions [41]. Besides the use in DSPs, recent studies have focused on comparable
IPM approaches to study the importance of process parameters in terms of risk assessment and
control strategies [79,80]. These publications showed that the main concept of IPMs (i.e., the
consideration of transferred and modified CQA distributions by a few coupled unit operations)
is essential for a process-wide evaluation of parameter ranges and optimal control strategies. Fol-
lowing these conclusions, an IPM can be regarded as a smaller implementation of a digital
bioprocess replica for certain unit operations with comparable benefits. We come back to this
point in the next sections.
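The IPM concept described above (a CQA distribution that is passed on and modified by a few coupled unit operations) can be sketched as a small Monte Carlo simulation. This is an illustration and not the model of [41]: each of the three steps uses a hypothetical linear regression model with Gaussian noise, the CQA is an impurity level on a log10 scale, and all numbers, including the specification limit, are invented for the example.

```python
import random
import statistics

# Hypothetical regression models for three purification steps:
# log10 clearance = base + slope * pp + noise, with pp a coded process parameter.
STEPS = [(1.5, 0.2), (1.0, 0.2), (0.8, 0.2)]
NOISE_SD = 0.05
SPEC_LIMIT = 2.0  # hypothetical upper limit for the final log10 impurity level


def step_clearance(pp, base, slope, rng):
    """Regression prediction for one step plus residual noise."""
    return base + slope * pp + rng.gauss(0.0, NOISE_SD)


def simulate_final_cqa(n_samples=20000, seed=1):
    """Propagate the CQA distribution through all coupled steps."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_samples):
        cqa = rng.gauss(5.0, 0.1)  # CQA value entering the first step
        for base, slope in STEPS:
            pp = rng.uniform(-1.0, 1.0)  # process parameter within its range
            cqa -= step_clearance(pp, base, slope, rng)  # output feeds next step
        finals.append(cqa)
    return finals


finals = simulate_final_cqa()
mean_final = statistics.mean(finals)
frac_out_of_spec = sum(value > SPEC_LIMIT for value in finals) / len(finals)
print("mean final CQA:", round(mean_final, 3))
print("fraction out of spec:", round(frac_out_of_spec, 3))
```

Narrowing the sampled range of a process parameter in one step and rerunning the simulation shows how an upstream change shifts the out-of-spec fraction of the final distribution.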

Molecular Models for Drug Product Development


As can be concluded from the previous subsections, most of the unit operations shown in Figure 1
correspond to various mechanistic, hybrid, empirical, or data-driven approaches. In any
bioprocess, however, there are process operations that are not directly related to physical devices,
like bioreactors or chromatographic columns. These non-device-based process operations mainly
focus on modifications of the drug product, including media formulation, API solubilization
or refolding.

Most properties of common biological APIs, like monoclonal antibodies in solution, are dominated
by molecular interactions and thermodynamic principles. Therefore, molecular effects like solubi-
lization, aggregation or refolding of APIs can be best studied by detailed simulation approaches
with a high level of molecular resolution. Standard methods are atomistic or coarse-grained
molecular dynamics (MD) simulations, which provide insight into the molecular behavior and
the dynamics of the system [81,82]. Recent atomistic MD simulations provided deeper insight
into molecular interactions of protein folding, unfolding, and aggregation behavior as well as
molecular binding and unbinding mechanisms for protein chromatography capture steps
[25,26,71,83–87]. Despite their obvious benefits, MD simulations are time consuming and
often computationally challenging.

Besides dynamic simulations, further static molecular mechanics approaches are thus used to
study the stability of protein structures and their tendency for aggregation [88]. There exists a
broad variety of molecular prediction methods that help to estimate the tendency for such
unwanted effects. A good overview on these models is provided in [2,4,89]. Most of the corre-
sponding algorithms focus on chemical properties as well as statistical mechanics concepts
such that the prediction of clustering tendencies for aggregation-prone regions is in agreement
with experimental observations [4].

As can be assumed, molecular mechanics approaches require detailed information about the
spatial structure of the considered proteins. One experimental challenge is that most protein
structures are not known, so high-precision guesses have to be considered. To fill this knowledge
gap, various homology models map primary sequences of unknown protein structures to closely
related protein sequences for which the structures have been experimentally determined [81].
These procedures are called structural and sequence alignment processes between target and
template structures. There exists a broad plethora of algorithms that have been successfully
used for various protein classes [20,90].

In addition to chemical and spatial structures, it is also valuable to estimate the thermodynamic
properties of the APIs. Thus, quantitative structure–activity relationship (QSAR) models identify
molecular affinities or certain process-relevant properties of species from fundamental molecular
and thermodynamic principles in comparison with experimental outcomes [21,91,92]. In more
detail, QSAR models can be classified as regression or classification approaches in which
molecular properties are mapped to physicochemical parameters or biological affinities, to
name a few examples. As the main benefit in combination with the obtained regression curves,
it is possible to estimate the properties of novel APIs at a general level without explicit experimen-
tal measurements.

Benefits and Challenges of Unit Operation and Holistic Process Models


These examples show that the vast majority of simulation and modeling approaches focus on
individual unit operations. In addition, most of the corresponding models already reveal sufficient
maturity, which supports their use in modern biopharmaceutical process design. With regard to
their broad applicability, benefits and challenges of simulations and holistic process models may
be questioned in comparison with traditional experimental efforts. Despite recent claims, the re-
duction of experimental effort is only a minor benefit because a sufficient number of validation
and calibration experiments are an integral part of each model parameterization procedure. How-
ever, the search for optimal process parameter settings, bioreactor geometries, feeding strate-
gies, and formulation compositions, to give a few examples, can be significantly improved by
using computational models. Moreover, the reliable and straightforward study of underlying cor-
relations between parameters, and identifying CPPs and their influence on CQAs in the process

design space, should be regarded as one of the main benefits. Such useful insights and improve-
ments are important for QbD principles to increase the process knowledge and make the pro-
cess more robust and more efficient. As a side effect, the development time required for novel
APIs and efforts in trouble shooting and root cause analysis can also be reduced by modeling ini-
tiatives. With regard to these points, there are profound reasons to establish simulations as a third
pillar alongside traditional experimental and statistical approaches in modern process design.

In recent years, biopharmaceutical technology and modeling have faced new challenges and
opportunities. Some of the most recent manufacturing and scientific developments include online
monitoring of process data by PAT approaches and modern sensor technologies [93], continu-
ous processing [94,95], and more detailed insight into cell metabolic pathways. Despite the
interest in improved bioprocesses, implementing novel processes and technologies is often a
challenging task. With regard to these challenges, a deeper scientific understanding of the
physical, chemical and biological mechanisms is of fundamental importance. Well-calibrated
models are useful tools to study the underlying mechanisms and to rationalize the outcomes.
This also means that correlations between process parameters from the individual unit operations
have to be considered.

As already discussed, IPMs provide valuable insight into the correlations among a few coupled
unit operations. For a digital bioprocess replica, a comparable approach that includes the
complete set of unit operations leads to a holistic process model. In contrast to IPMs, which focus
on single CQAs and a few unit operations, a holistic process model combines all relevant PPs,

Box 4. Hierarchical Bayesian Models


Standard Bayesian statistics focus on conditional probabilities, which rely on the distribution of parameter x for a given
parameter y. Bayes' theorem states that the conditional probability p(x|y) of x given y can be calculated via

\[ p(x \mid y) = \frac{p(y \mid x)}{p(y)}\, p(x) \tag{I} \]

with the normalized likelihood function p(y|x)/p(y) and the prior probability p(x) [9]. In this way, the influence of further
process parameters as well as their correlations can also be taken into account. A hierarchy of influences in terms of
process steps can be used to define a hierarchical Bayes approach [98] for an arbitrarily chosen three-stage process model
with parameters x, y, w, z according to

\[ p(x, w, z \mid y) = \frac{p(y \mid w)\, p(w \mid z)\, p(z \mid x)}{p(y)}\, p(x) = L(x, y, w, z)\, p(x) \tag{II} \]

where the likelihood function L(x,y,w,z) is usually evaluated in terms of Markov Chain Monte Carlo approaches [9,99]. Most
often, the likelihood functions are computed for independent and identically distributed data in terms of multivariate
Gaussian distribution functions in accordance with

\[ L(x \mid \mu, \sigma^2) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2) \tag{III} \]

for N parameter dimensions with

\[ \mathcal{N}(x_n \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(x_n - \mu)^2}{2\sigma^2} \right) \tag{IV} \]

for the corresponding parameter values x_n and individual mean values μ and variances σ². Without further restriction,
hierarchical Bayes models can be used for an arbitrarily chosen number of dimensions and thus establish a stringent
combination of neighboring unit operations.
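
To make Equations I, III, and IV more concrete, the following minimal Python sketch evaluates the Gaussian log-likelihood for
independent and identically distributed data and samples the posterior of the mean μ with a random-walk Metropolis scheme,
a simple Markov chain Monte Carlo variant. It is an illustration under stated assumptions and not the procedure of the cited
works: the synthetic data, the broad Gaussian prior, the known variance σ², and all function names are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(x, mu, sigma2):
    """Logarithm of Eq. III: sum over n of log N(x_n | mu, sigma2) from Eq. IV."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma2)
                        - (x - mu) ** 2 / (2.0 * sigma2)))

def log_prior(mu, prior_mean=0.0, prior_var=100.0):
    """Hypothetical broad Gaussian prior p(mu), chosen only for illustration."""
    return float(-0.5 * np.log(2.0 * np.pi * prior_var)
                 - (mu - prior_mean) ** 2 / (2.0 * prior_var))

def metropolis_sample_mu(x, sigma2, n_steps=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampling of p(mu | x) ~ likelihood * prior (Eq. I)."""
    rng = np.random.default_rng(seed)
    mu = float(np.mean(x))  # start the chain at the sample mean
    log_post = gaussian_log_likelihood(x, mu, sigma2) + log_prior(mu)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = mu + step * rng.normal()
        log_post_new = gaussian_log_likelihood(x, proposal, sigma2) + log_prior(proposal)
        # Accept with probability min(1, posterior ratio); the evidence p(y) cancels.
        if np.log(rng.uniform()) < log_post_new - log_post:
            mu, log_post = proposal, log_post_new
        samples[i] = mu
    return samples[n_steps // 4:]  # discard the first quarter as burn-in

# Synthetic data standing in for a measured process parameter (assumption).
data = np.random.default_rng(1).normal(loc=5.0, scale=1.0, size=50)
samples = metropolis_sample_mu(data, sigma2=1.0)
print(f"posterior mean of mu: {samples.mean():.3f}")
print(f"posterior standard deviation of mu: {samples.std():.3f}")
```

Working with logarithms avoids numerical underflow of the product in Equation III, and the normalization p(y) never has
to be computed because only posterior ratios enter the acceptance step.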

CPPs, and CQAs from process operations at the USP, DSP, and drug product stages, which
significantly broadens the dimensions of the parameter space. In consequence, statistical regression
approaches as introduced for IPMs with a focus on single CQA behavior [41] may not be
sufficient and should be replaced by the aforementioned unit operation models or machine
learning methods. A further challenging task is developing accurate high-dimensional transformation
functions that are required for a reasonable connection between the unit operation models
and the corresponding data transfer in terms of PP and CQA distributions. Promising options
include refined hierarchical Bayes models (Box 4).

Outstanding Questions

How do we define reasonable transformation functions between the unit operations?
How do we define good modeling practices in terms of reliable validation and calibration protocols?
How crucial are hidden complexities of the design space in terms of CQA distributions?
How can we control and estimate model artifacts?
Do we need a detailed molecular understanding of all process steps?

Despite these challenges, unit operation models and holistic process models are important tools
for efficiently studying the process-wide parameter space and for the future design of bioprocesses.
Important correlations can be identified, and the PARs for coupled unit operations can be
evaluated straightforwardly. Such knowledge is exclusive to integrated and holistic process models
and rationalizes the recent interest in digital bioprocess replicas.
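
The transfer of PP and CQA distributions between coupled unit operations can be illustrated with a simple
Monte Carlo sketch in Python. All models, parameter names, numerical values, and the specification limit
below are hypothetical placeholders chosen for illustration; in practice, the transfer functions would be
replaced by calibrated unit operation models.

```python
import numpy as np

rng = np.random.default_rng(42)
n_runs = 10_000  # number of simulated batches

# Hypothetical process parameters (PPs), sampled within assumed operating ranges.
ph = rng.normal(7.0, 0.05, n_runs)     # USP: bioreactor pH
load = rng.normal(30.0, 2.0, n_runs)   # DSP: column load in g per L resin

def upstream(ph_values):
    """Hypothetical USP model: aggregate level (%) at harvest as a function of pH."""
    noise = rng.normal(0.0, 0.2, ph_values.shape)
    return 5.0 + 8.0 * (ph_values - 7.0) + noise

def downstream(aggregate_in, load_values):
    """Hypothetical DSP transfer function: aggregate clearance decreases with load."""
    clearance = np.clip(0.8 - 0.01 * (load_values - 30.0), 0.0, 1.0)
    return aggregate_in * (1.0 - clearance)

cqa_harvest = upstream(ph)                  # CQA distribution after the USP step
cqa_final = downstream(cqa_harvest, load)   # CQA distribution after the DSP step

spec_limit = 1.2  # hypothetical acceptance limit for the aggregate level in %
print(f"mean final aggregate level: {cqa_final.mean():.2f} %")
print(f"fraction of runs above the limit: {(cqa_final > spec_limit).mean():.3f}")
print(f"correlation of pH with final CQA: {np.corrcoef(ph, cqa_final)[0, 1]:.2f}")
```

Sampling the PPs and passing each simulated batch through the chained functions yields the CQA distribution
at the end of the process, from which out-of-specification probabilities and correlations across unit
operations can be estimated.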

Concluding Remarks
In this article, we shed more light on the current state of the art of modeling approaches in
biopharmaceutical process design. Motivated by recent regulatory recommendations in combination
with QbD principles [29], statistical as well as modeling approaches are now important
pillars in drug approval procedures. In contrast to standard experimental efforts, using simulations
and advanced statistics is less costly and less time consuming. Moreover, a significantly higher
level of process knowledge and control can be achieved. Previous efforts focused on developing
a holistic process model that could be regarded as a digital bioprocess replica from inoculum to
the fill and finish stage [40,41]. Such a model combines the individual unit operation and process
models and thus sheds more light on the influence of complex correlations between CPPs and
CQAs with regard to coupled unit operations.

In principle, there exists a plethora of individual unit operation models in USPs and DSPs as
well as molecular models in drug product development. Generally, most unit operation
models are well suited to reproduce experimental behavior but often do not provide deeper
insight into the underlying molecular mechanisms. Although computationally challenging,
the use of molecular models is important for root cause analysis, refining unit operation
models, and the deeper understanding of molecular interactions between the APIs and the
components of the solution. Comparable conclusions can also be drawn with regard to the
use of machine-learning approaches for data analysis and process design. Although these
methods have already been successful in other contexts, their sole use as process modeling
approaches may be questionable. Often, the amount of experimental data is low, and the underlying
molecular mechanisms are not fully understood. Such an understanding can only be
achieved by more refined models or statistical approaches in combination with detailed theoretical
descriptions.

Despite the ongoing success, all unit operation models and statistical approaches entail certain
benefits as well as drawbacks. For instance, statistical and machine-learning approaches are
free from any a priori assumptions but depend strongly on experimental data, such that
extrapolations towards unconsidered conditions are questionable. Moreover, it is important
to distinguish between the quality of first- and higher-order regression schemes. In contrast,
mechanistic models rely on continuum equations among other balance and conservation
conditions. Due to the validity of the underlying relations for the macroscale and the mesoscale,
the predictive capability and accuracy of such approaches are high. Despite their broad applicability
over several length and time scales, continuum descriptions are also known to show certain
deviations when applied to molecular processes. As a further challenge, most models lack a
meaningful validation and calibration protocol [14,96]. In consequence, well-defined selection
criteria and validation procedures are needed to categorize the quality and the reliability of
the models. In addition to these points, there are still some other questions to be answered
(see Outstanding Questions).
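
The limited reliability of extrapolations can be illustrated with a small Python sketch that fits first-
and second-order polynomial regressions to synthetic data. The saturating response function, the noise
level, and the calibrated range are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_response(x):
    """Hypothetical saturating response (e.g., titer versus feed rate)."""
    return 10.0 * x / (1.0 + x)

# Calibration data cover only the range 0 <= x <= 2.
x_train = np.linspace(0.0, 2.0, 15)
y_train = true_response(x_train) + rng.normal(0.0, 0.1, x_train.size)

# First- and second-order polynomial regression schemes.
fit1 = np.polynomial.Polynomial.fit(x_train, y_train, deg=1)
fit2 = np.polynomial.Polynomial.fit(x_train, y_train, deg=2)

def rmse(model, x):
    """Root-mean-square error of a fitted model against the noise-free response."""
    return float(np.sqrt(np.mean((model(x) - true_response(x)) ** 2)))

x_inside = np.linspace(0.0, 2.0, 50)    # interpolation within the calibrated range
x_outside = np.linspace(4.0, 6.0, 50)   # extrapolation to unconsidered conditions
for name, model in (("first order", fit1), ("second order", fit2)):
    print(f"{name}: RMSE inside = {rmse(model, x_inside):.2f}, "
          f"outside = {rmse(model, x_outside):.2f}")
```

In such a setting, the higher-order fit may describe the calibrated range more closely, while both fits
can deviate strongly once they are evaluated outside that range.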

In consequence, although the level of accuracy, predictive capability, or maturity may differ for the
considered process and unit operation models, the time is ripe for initial work on a holistic process
model. In combination with a steady improvement of existing unit operation models, as well as the
development of novel approaches, the potential benefits of a digital bioprocess replica outweigh the
drawbacks and challenges and will increase our knowledge of the bioprocess. This specifically
includes the process-wide study of the design space as well as the analysis of CQA distribution
changes for coupled unit operations. With regard to these promising options, holistic process
models may revolutionize biotechnological engineering and pave the way towards novel APIs in
tailor-made formulations.

Acknowledgments
We thank Ali Abusnina, Heiko Babel, Joachim Bär, Joschka Bauer, Alireza Ehsani, Patrick Garidel, Ogsen Gabrielyan, Ernst
Broberg Hansen, Volker C. Hass, Simon Kluters, Bettina Knapp, Marco Kunzelmann, Liliana Montano Herrera, Albert Paul,
Lisa Pieper, Beate Presser, Eugen Probst, Federico Rischawy, David Saleh, Eduard Salzmann, Jochen Schaub, Jan C.
Schöning, Hermann Schuchnigg, Michael Sokolov, Fabian Stiefel, Joey Studts, Gang Wang, Thomas Wucherpfennig,
Johannes Wutz, Christina Yassouridis, and Samet Yildirim for valuable discussions and useful hints.

References
1. Markarian, J. (2018) Modernizing pharma manufacturing. Pharm. Tech. 42, 20–25
2. Kumar, S. et al. (2018) Biopharmaceutical informatics: supporting biologic drug development via molecular modelling and informatics. J. Pharm. Pharmacol. 70, 595–608
3. Tomar, D.S. et al. (2018) In silico prediction of diffusion interaction parameter (kD), a key indicator of antibody solution behaviors. Pharm. Res. 35, 193
4. Norman, R.A. et al. (2019) Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief. Bioinf. Published online October 18, 2019. https://doi.org/10.1093/bib/bbz095
5. Narayanan, H. et al. (2020) Bioprocessing in the digital age: the role of process models. Biotechnol. J. 15, 1900172
6. Gronemeyer, P. et al. (2014) Trends in upstream and downstream process development for antibody manufacturing. Bioengineering 1, 188–212
7. Politis, S.N. et al. (2017) Design of experiments (DoE) in pharmaceutical development. Drug Develop. Indust. Pharm. 43, 889–901
8. von Stosch, M. and Willis, M.J. (2017) Intensified design of experiments for upstream bioreactors. Eng. Life Sci. 17, 1173–1184
9. Bishop, C.M. (2006) Pattern Recognition and Machine Learning, Springer
10. Frahm, B. et al. (2002) Adaptive, model-based control by the open-loop-feedback optimal (OLFO) controller for the effective fed-batch cultivation of hybridoma cells. Biotechnol. Prog. 18, 1095–1103
11. Möller, J. et al. (2019) Model-assisted Design of Experiments as a concept for knowledge based bioprocess development. Bioprocess Biosyst. Eng. 42, 867–882
12. Hahn, T. et al. (2015) Simulating and optimizing preparative protein chromatography with ChromX. J. Chem. Educ. 92, 1497–1502
13. Wang, G. et al. (2016) Water on hydrophobic surfaces: mechanistic modeling of hydrophobic interaction chromatography. J. Chromatogr. A 1465, 71–78
14. Rischawy, F. et al. (2019) Good modeling practice for industrial chromatography: mechanistic modeling of ion exchange chromatography of a bispecific antibody. Comput. Chem. Eng. 130, 106532
15. Briskot, T. et al. (2019) Prediction uncertainty assessment of chromatography models using Bayesian inference. J. Chromatogr. A 1587, 101–110
16. von Stosch, M. et al. (2014) Hybrid modeling for quality by design and PAT-benefits and challenges of applications in biopharmaceutical industry. Biotechnol. J. 9, 719–726
17. von Stosch, M. et al. (2016) Hybrid modeling as a QbD/PAT tool in process development: an industrial E. coli case study. Bioprocess Biosyst. Eng. 39, 773–784
18. Narayanan, H. et al. (2019) A new generation of predictive models–the added value of hybrid models for manufacturing processes of therapeutic proteins. Biotechnol. Bioeng. 116, 2540–2549
19. Sharma, C. et al. (2011) Review of computational fluid dynamics applications in biotechnology processes. Biotechnol. Prog. 27, 1497–1510
20. Bishop, A. et al. (2008) Protein homology modelling and its use in South Africa. South African J. Sci. 104, 2–6
21. Cherkasov, A. et al. (2014) QSAR modeling: where have you been? Where are you going to? J. Med. Chem. 57, 4977–5010
22. Succi, S. (2001) The lattice Boltzmann equation: for fluid dynamics and beyond, Oxford University Press
23. Hickey, O.A. et al. (2014) Lattice-Boltzmann simulations of the electrophoretic stretching of polyelectrolytes: the importance of hydrodynamic interactions. J. Chem. Phys. 140, 164904
24. Roberts, C.J. (2014) Therapeutic protein aggregation: mechanisms, design, and control. Trends Biotechnol. 32, 372–380
25. Calero-Rubio, C. et al. (2016) Coarse-grained antibody models for “weak” protein–protein interactions from low to high concentrations. J. Phys. Chem. B 120, 6592–6605
26. Calero-Rubio, C. et al. (2016) Predicting unfolding thermodynamics and stable intermediates for alanine-rich helical peptides with the aid of coarse-grained molecular simulation. Biophys. Chem. 217, 8–19
27. US Department of Health and Human Services (2011) Guidance for Industry. Process validation: general principles and practices, US Food and Drug Administration, pp. 3–15
28. Reason, A.J. et al. (2015) Defining critical quality attributes for monoclonal antibody therapeutic products. BioPharm. Int. 27, 34–43
29. Yu, L.X. et al. (2019) FDA’s new pharmaceutical quality initiative: knowledge aided assessment & structured applications. Int. J. Pharmaceut. 1, 1–4
30. Mitchell, M. (2013) Determining criticality-process parameters and quality attributes part I: criticality as a continuum. BioPharm. Int. 26, 38–40
31. Mitchell, M. (2014) Determining criticality-process parameters and quality attributes part II: design of experiments and data-driven criticality. BioPharm. Int. 27, 32–40
32. Mitchell, M. (2014) Determining criticality-process parameters and quality attributes part III: process control strategies-criticality throughout the lifecycle. BioPharm. Int. 27, 26–35
33. Steinwandter, V. et al. (2019) Data science tools and applications on the way to Pharma 4.0. Drug Discov. Today 24, 1795–1805
34. Schubert, J. et al. (1994) Bioprocess optimization and control: application of hybrid modelling. J. Biotechnol. 35, 51–68
35. Lübbert, A. and Simutis, R. (1994) Using measurement data in bioprocess modelling and control. Trends Biotechnol. 12, 304–311
36. Brass, J. et al. (1997) Application of modelling techniques for the improvement of industrial bioprocesses. J. Biotechnol. 59, 63–72
37. Schügerl, K. (2001) Progress in monitoring, modeling and control of bioprocesses during the last 20 years. J. Biotechnol. 85, 149–173
38. Bellgardt, K.-H. (2000) Bioprocess models. In Bioreaction Engineering, pp. 44–105, Springer
39. Zobel-Roos, S. et al. (2019) Accelerating biologics manufacturing by modeling or: is approval under the QbD and PAT approaches demanded by authorities acceptable without a digital-twin? Processes 7, 94
40. Velayudhan, A. (2014) Overview of integrated models for bioprocess engineering. Curr. Opin. Chem. Eng. 6, 83–89
41. Zahel, T. et al. (2017) Integrated process modeling - a process validation life cycle companion. Bioengineering 4, 86
42. Sommeregger, W. et al. (2017) Quality by control: towards model predictive control of mammalian cell culture bioprocesses. Biotechnol. J. 12, 1600546
43. Nargund, S. et al. (2019) The move toward Biopharma 4.0: in silico biotechnology develops “smart” processes that benefit biomanufacturing through digital twins. Genet. Eng. Biotechnol. 39, 53–55
44. Abt, V. et al. (2018) Model-based tools for optimal experiments in bioprocess engineering. Curr. Opin. Chem. Eng. 22, 244–252
45. Fröhlich, H. et al. (2012) Membrane technology in bioprocess science. Chem. Ing. Technik 84, 905–917
46. Rathore, A.S. et al. (2018) Recent developments in chromatographic purification of biopharmaceuticals. Biotechnol. Lett. 40, 895–905
47. Wang, G. et al. (2020) Developing a computational framework to advance bioprocess scale-up. Trends Biotechnol. Published online February 25, 2020. https://doi.org/10.1016/j.tibtech.2020.01.009
48. Baumann, P. and Hubbuch, J. (2017) Downstream process development strategies for effective bioprocesses: trends, progress, and combinatorial approaches. Eng. Life Sci. 17, 1142–1158
49. Kumar, V. et al. (2014) Design of experiments applications in bioprocessing: concepts and approach. Biotechnol. Prog. 30, 86–99
50. Géron, A. (2017) Hands-on Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, O’Reilly Media
51. Bayer, B. et al. (2020) Comparison of modeling methods for DoE-based holistic upstream process characterization. Biotechnol. J. 15, 1900551-1–1900551-10
52. Mandenius, C.-F. and Brundin, A. (2008) Bioprocess optimization using design-of-experiments methodology. Biotechnol. Prog. 24, 1191–1203
53. Hutmacher, D.W. and Singh, H. (2008) Computational fluid dynamics for improved bioreactor design and 3D culture. Trends Biotechnol. 26, 166–172
54. Wutz, J. et al. (2016) Predictability of kLa in stirred tank reactors under multiple operating conditions using an Euler–Lagrange approach. Eng. Life Sci. 16, 633–642
55. Wutz, J. et al. (2018) Establishment of a CFD-based kLa model in microtiter plates to support CHO cell culture scale-up during clone selection. Biotechnol. Prog. 34, 1120–1128
56. Bernaschi, M. et al. (2019) Mesoscopic simulations at the physics-chemistry-biology interface. Rev. Mod. Phys. 91, 025004
57. Smiatek, J. et al. (2009) Mesoscopic simulations of the counterion-induced electro-osmotic flow: a comparative study. J. Chem. Phys. 130, 244702
58. Schaub, J. et al. (2011) Advancing biopharmaceutical process development by system-level data analysis and integration of omics data. In Genomics and Systems Biology of Mammalian Cell Culture, pp. 133–163, Springer
59. Ehsani, A. et al. (2017) How to use mechanistic metabolic modeling to ensure high quality glycoprotein production. Comput. Aided Chem. Eng. 40, 2839–2844
60. Ehsani, A. et al. (2019) Towards model-based optimization for quality by design in biotherapeutics production. Comput. Aided Chem. Eng. 25–30
61. Kornecki, M. and Strube, J. (2019) Accelerating biologics manufacturing by upstream process modelling. Processes 7, 166
62. Zhang, D. et al. (2019) Hybrid physics-based and data-driven modeling for bioprocess online simulation and optimization. Biotechnol. Bioeng. 116, 2919–2930
63. Simutis, R. and Lübbert, A. (2017) Hybrid approach to state estimation for bioprocess control. Bioengineering 4, 21
64. Orth, J.D. et al. (2010) What is flux balance analysis? Nat. Biotechnol. 28, 245
65. Sokolov, M. et al. (2015) Fingerprint detection and process prediction by multivariate analysis of fed-batch monoclonal antibody cell culture data. Biotechnol. Prog. 31, 1633–1644
66. Takahashi, M.B. et al. (2015) Artificial neural network associated to UV/Vis spectroscopy for monitoring bioreactions in biopharmaceutical processes. Bioprocess Biosyst. Eng. 38, 1045–1054
67. Kayser, V. et al. (2011) Glycosylation influences on the aggregation propensity of therapeutic monoclonal antibodies. Biotechnol. J. 6, 38–44
68. Mazzer, A.R. et al. (2015) Protein A chromatography increases monoclonal antibody aggregation rate during subsequent low pH virus inactivation hold. J. Chromatogr. A 1415, 83–90
69. Guiochon, G. et al. (2006) Fundamentals of preparative and nonlinear chromatography, Elsevier
70. Brooks, C.A. and Cramer, S.M. (1992) Steric mass-action ion exchange: displacement profiles and induced salt gradients. AIChE J. 38, 1969–1978
71. Banerjee, S. et al. (2017) A molecular modeling based method to predict elution behavior and binding patches of proteins in multimodal chromatography. J. Chromatogr. A 1511, 45–58
72. Großhans, S. et al. (2018) An integrated precipitation and ion-exchange chromatography process for antibody manufacturing: process development strategy and continuous chromatography exploration. J. Chromatogr. A 1533, 66–76
73. Wang, G. et al. (2017) Estimation of adsorption isotherm and mass transfer parameters in protein chromatography using artificial neural networks. J. Chromatogr. A 1487, 211–217
74. Huter, M.J. and Strube, J. (2019) Model-based design and process optimization of continuous single pass tangential flow filtration focusing on continuous bioprocessing. Processes 7, 317
75. Grote, F. et al. (2012) Integration of reverse-osmosis unit operations in biotechnology process design. Chem. Eng. Technol. 35, 191–197
76. Thiess, H. et al. (2017) Module design for ultrafiltration in biotechnology: hydraulic analysis and statistical modeling. J. Membr. Sci. 540, 440–453
77. Hadpe, S.R. et al. (2017) ATF for cell culture harvest clarification: mechanistic modelling and comparison with TFF. J. Chem. Technol. Biotechnol. 92, 732–740
78. Rathore, A.S. et al. (2014) Mechanistic modeling of viral filtration. J. Membr. Sci. 458, 96–103
79. Yang, P.Y. et al. (2019) Accurate definition of control strategies using cross validated stepwise regression and Monte Carlo simulation. J. Biotechnol. X 2, 100006
80. Borchert, D. et al. (2019) Quantitative CPP evaluation from risk assessment using integrated process modeling. Bioengineering 6, 114
81. Leach, A.R. (2001) Molecular Modelling: Principles and Applications, Pearson Education Press
82. Frenkel, D. and Smit, B. (2001) Understanding Molecular Simulation: from Algorithms to Applications, Elsevier
83. Kukol, A. (2008) Molecular Modeling of Proteins, Volume 443. Springer
84. Smiatek, J. et al. (2012) Properties of compatible solutes in aqueous solution. Biophys. Chem. 160, 62–68
85. Diddens, D. et al. (2017) Aqueous ionic liquids and their influence on peptide conformations: denaturation and dehydration mechanisms. Phys. Chem. Chem. Phys. 19, 20430–20440
86. Smiatek, J. (2017) Aqueous ionic liquids and their effects on protein structures: an overview on recent theoretical and experimental results. J. Phys. Condens. Matter 29, 233001
87. Oprzeska-Zingrebe, E.A. and Smiatek, J. (2018) Aqueous ionic liquids in comparison with standard co-solutes. Biophys. Rev. 10, 809–824
88. van der Kant, R. et al. (2017) Prediction and reduction of the aggregation of monoclonal antibodies. J. Mol. Biol. 429, 1244–1261
89. Meric, G. et al. (2017) Driving forces for nonnative protein aggregation and approaches to predict aggregation-prone regions. Annu. Rev. Chem. Bio. Eng. 8, 139–159
90. Schmidt, T. et al. (2014) Modelling three-dimensional protein structures for applications in drug design. Drug Discov. Today 19, 890–897
91. Idakwo, G. et al. (2019) A review of feature reduction methods for QSAR-based toxicity prediction. In Advances in Computational Toxicology (Hong, H., ed.), pp. 119–139, Springer
92. Verma, J. et al. (2010) 3D-QSAR in drug design-a review. Curr. Top. Med. Chem. 10, 95–115
93. Biechele, P. et al. (2015) Sensor systems for bioprocess monitoring. Eng. Life Sci. 15, 469–488
94. Rathore, A.S. et al. (2018) Process integration and control in continuous bioprocessing. Curr. Opin. Chem. Eng. 22, 18–25
95. Stepper, L. et al. (2020) Pre-stage perfusion and ultra-high seeding cell density in CHO fed-batch culture: a case study for process intensification guided by systems biotechnology. Bioproc. Biosys. Eng. Published online April 7, 2020. http://dx.doi.org/10.1007/s00449-020-02337-1
96. Saleh, D. et al. (2020) Straightforward method for calibration of mechanistic cation exchange chromatography models for industrial applications. Biotechnol. Prog. Published online February 22, 2020. https://doi.org/10.1002/btpr.2984
97. Connor, J.T. et al. (1994) Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. 5, 240–254
98. Box, G.E. and Tiao, G.C. (1965) Multiparameter problems from a Bayesian point of view. Ann. Math. Stat. 36, 1468–1482
99. Ballnus, B. et al. (2017) Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems. BMC Syst. Biol. 11, 63