
DevOps for AI – Challenges in Development of AI-enabled Applications
Lucy Ellen Lwakatare, Ivica Crnkovic, Jan Bosch
Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
llucy@chalmers.se, ivica.crnkovic@chalmers.se, jan.bosch@chalmers.se

Abstract—When developing software systems that contain Machine Learning (ML) based components, the development process becomes significantly more complex. The central part of the ML process is training iterations to find the best possible prediction model. Modern software development processes, such as DevOps, have been widely adopted and typically emphasise frequent development iterations and continuous delivery of software changes. Despite the ability of modern approaches to solve some of the problems faced when building ML-based software systems, there are today no established procedures on how to combine them with the processes in an ML workflow. This paper points out the challenges in the development of complex systems that include ML components, and discusses possible solutions driven by the combination of DevOps and ML workflow processes. Industrial cases are presented to illustrate these challenges and the possible solutions.

Index Terms—AI, Machine Learning, Agile software development

I. INTRODUCTION

ML is a rapidly developing and the most promising technology of the last decade, adopted in literally all business domains and research areas. In many domains (e.g. medical diagnosis), ML has demonstrated superior results in comparison to traditional software applications, and has outperformed humans in activities that require human intelligence. These trends have induced a huge need for data and AI scientists, and also for software developers. Early studies [1] have shown that building algorithms (i.e. developing the ML model) is only a fraction of all the effort required to successfully develop and put into operation an AI-based software system¹. ML software implementations differ from traditional software in that their logic is not explicitly programmed, but rather automatically created by learning from data. As such, the development process of ML-based systems involves different activities, including data collection, data preparation, defining the ML model (such as a deep network model), and performing the training process to quantify the model parameters and obtain the expected result. This process is called the ML workflow. The ML workflow requires sophisticated support from a set of tools and activities, and obtaining an efficient process is a challenge in itself.

However, the ML workflow does not cover the entire software development process. It does not address the question of efficient software development, and that raises the question of how the ML workflow and a software development process are related, or, as a more general question: what is the development process of software systems that contain AI components?

This paper discusses new requirements for a software development process that includes the development of ML components. It summarizes some new challenges that arise from the requirements posed by ML components, and a possible software development process model that can be used to successfully develop, operate and evolve ML-based software systems. Modern software development processes apply agile principles, and one of the most commonly used agile processes is DevOps, in which system development and operation are integrated in a common process that gives the developers fast feedback about the application's performance and usage. Specifically, in this paper we discuss the integration of the ML workflow with DevOps.

The rest of this paper is organised as follows. Section III gives an overview of the ML workflow process with its main characteristics. Section IV describes a DevOps process model with its main characteristics. Section V discusses the problems and challenges that occur in the development of AI-based software, requiring the integration of the ML workflow and DevOps. In Section VI we describe possible ways of integrating the ML workflow and DevOps to successfully implement the system life cycle. Finally, Section VII concludes the paper.

II. RESEARCH METHODS

Our research is based on empirical evidence from companies that are members of Software Center² and the Chalmers AI Research Centre (CHAIR)³, including Ericsson, Siemens, Volvo Cars and CEVT, which are in a phase of extensive adoption of ML in their development processes, as well as AI operation and development companies such as Peltarion⁴.

The research in this paper has been supported by Software Center, the Chalmers Artificial Intelligence Research Centre (CHAIR) and Vinnova.
¹ We define AI-based systems as software systems that include components developed using AI modelling (e.g. ML components).
² https://www.software-center.se/
³ https://www.chalmers.se/chair/
⁴ https://peltarion.com/
Authorized licensed use limited to: International Institute of Information Technology Bangalore. Downloaded on May 05,2021 at 10:48:20 UTC from IEEE Xplore. Restrictions apply.
Fig. 1. ML development workflow stages

Complementary research includes a synthesis of multivocal 'grey' literature, such as blog posts, and our prior research [2], [3].

The AI-based systems we have explored in previous research were largely in early stages. We refer to two cases in Section VI.

• Case A: Out-Of-Office (OOO) Reply Detection. The OOO reply detection system is an AI component within a web-based sales engagement platform that is used to optimize communication between sales representatives and potential prospects. The AI component extracts information, e.g., contact and date, from OOO email replies and automatically prompts the sales representative to take a relevant action, such as adding contact information, or pausing the automated sequence of sales actions and resuming it at the return date.

• Case B: Fault Classification System in Returned Telecommunication Hardware. The AI component classifies problems (faults) in returned hardware based on log data from radio networks. The screening operator, as an end-user, connects the returned hardware to the AI component and gets a score that tells whether there is a software-related fault, a hardware-related fault or no fault. If the hardware is good (has no fault), it is sent back to the customer; if it is bad, it is sent for repair.

III. MACHINE LEARNING WORKFLOW

The machine learning workflow describes the overall development stages and activities that are typically performed when developing ML-based software systems. At a high level, an ML workflow includes a set of distinguishable stages (see Figure 1): model requirements, data collection, data cleaning, data labelling, feature engineering, model training, model evaluation, model deployment and model monitoring [4]⁵.

These stages can be grouped into separate processes: (i) Data Management, (ii) ML Modeling, and (iii) Model Operation. The Data Management process includes the collection of data and its preparation for seamless usage. The Data Management process can be performed separately from ML modeling; very often data already exists in enterprise repositories or as open repositories, and can be used for different ML applications.

The ML modeling process corresponds to the development of a software component, or possibly a simple application, i.e. the creation of the ML model itself, adjusting data, training, and finally evaluating the model. This process is highly iterative and driven by the results, such as the accuracy of the model, its precision, or other quality attributes used during ML model training. The deployment and monitoring of the trained ML model belong to the operation process: the integration of the ML model in the software system and its performance. Monitoring of the ML component is seen as an integral part of the ML workflow that is useful in providing feedback to ML modelling, which may lead to a request for model re-training with new data sets, or to a redesign of the model itself with new features and new model architectures.

In practice, the development of ML systems is challenging for a variety of reasons [1], [5], [6]. Data that is used to build ML models has numerous data quality problems and requires huge efforts to prepare before training the models, because often it cannot be readily fed into ML algorithms [7]. Furthermore, manual and ad-hoc procedures in the ML modelling stage make it difficult to reproduce model experimentation results and reduce the rate of experimentation, including the tuning that can be done to improve the model [8], [9].

Despite its development challenges, the ML model constitutes a small part of the entire ML system [1], [6]. Once data scientists select the final ML model, it needs to be deployed and integrated into the end-user application. For complex systems, the integration and deployment of ML models need to take into consideration numerous requirements related to the software system. The ML-based application also needs continuous monitoring in order to detect errors in predictions and data (e.g., training-serving skew), requiring that the ML model be retrained due to staleness.

⁵ We define an ML model as an ML function, or an ML component, that will be developed and deployed in a software system.

IV. DEVOPS PROCESSES

DevOps is a software development approach that emphasises collaboration between software development and operations in order to operate software systems and accelerate the delivery of software changes [10]. The approach emerged to be important in the context of continuous deployment (CD), to help create a repeatable, reliable process for releasing software changes frequently in a production environment [10], [11]. Building on agile software development methods, an important tenet in DevOps is the automation of build, testing, deployment and operations processes (Figure 2), while also ensuring that all software artifacts are version controlled [10], [11]. As such, DevOps practices in the deployment pipeline involve automation of the software deployment process, including automatic provisioning of the environments, aimed at minimising manual hand-overs of changes from the software development team to the operations team. A deployment pipeline, in software development practice, is a technical manifestation of the entire software process, comprising all stages of getting software changes from version control until they are visible to end-users [11]. Automation in the deployment pipeline, often done by infrastructure teams, has the advantage of abstracting the inner workings of the deployment procedure from software developers, who often experience disconnections between development and production environments.

Fig. 2. DevOps workflow

While the DevOps approach also has non-technical practices for effective collaboration between the persons involved in software development, much of the focus has been on the technical practices. In online applications, the deployment mechanism is incorporated into the continuous integration (CI) system with defined triggers to facilitate the automatic deployment of changes to acceptance test or production environments. As part of the deployment procedure, configuration management tools are used to automatically provision, configure and upgrade existing environments with the new changes while applying a selected deployment strategy, e.g., blue-green deployment or rolling upgrade [10], [11]. In other domains, such as cyber-physical or embedded systems, the DevOps process includes additional activities, as the operational environment is different from the development environment. Simulations, when possible, are extensively used to make the development cycles more efficient.

V. ML WORKFLOW AND SOFTWARE SYSTEM LIFECYCLE INTEGRATION CHALLENGES

The ML workflow (Figure 1) defines the end-to-end process of building ML models, but it neglects the software development process. A deployed ML model is always a part of an application or a software system. For example, in self-driving vehicle systems, parts of the driving tasks, such as environment perception, steering wheel control and motion planning, can be carried out by deep learning based approaches [12], but this is only a fraction of the entire driving control system. The development of software systems has another logic and a different process (left part of Figure 2). In modern development processes, operations are tightly connected with fast feedback loops to the development process (right side of Figure 2), but the connections to the ML workflow (data management and ML modeling processes) are not well defined. That leads to a number of challenges in performing both the ML workflow and DevOps. This section presents important challenges motivating the integration of the ML workflow and DevOps. The presented challenges are based on our synthesis of the scientific literature [13] as well as on experiences gathered from practitioners when conducting empirical research and during reporting meetings of Software Center and the CHAIR centre (see Section II).

1) Multiple environments: ML training assumes the existence of a storage with data that will be used in building ML models. Most of the activities related to data processing can be decoupled from other activities in the ML workflow. During training, ML models are computationally demanding. Large ML models require training across distributed high-performance compute resources (e.g., graphics processing units (GPUs)) in order to speed up the training time. As such, the development environment requires interactive software development tools to schedule training on local or distributed computational units, typically a combination of CPUs and GPUs. On the other hand, the operational environment can either be similar to that of traditional software components (e.g. in web-based applications) or completely different (e.g. in self-driving vehicle systems), because limited computational and storage resources can require offloading to the cloud.

Some of the major challenges of ML components, compared to traditional software components, concern how to deal with versioned data and ML models, and not just versioned code. Tools used for tracking and managing code changes, such as Git in DevOps, are inadequate to ensure efficient tracking and management of the versioned dependencies of ML artifacts (code, data, models). Similarly, Continuous Integration (CI) tools are not just about testing code but also about testing and validating the data that will be used to train models [14]. While practitioners have been observed to orchestrate ML model development using CI tools [12], these are inadequate to ensure efficient running of model training jobs over specialized hardware, e.g., GPUs, in the context of distributed training. In addition to different tools, the processes require different expertise: data scientists, AI experts, and software engineers. These differences in the environments, tools, processes and expertise required pose new challenges: a) How to ensure successful management of several development environments? b) How to ensure seamless intercommunication and exchange of artifacts between them? c) How to ensure multidisciplinary expertise and enable efficient collaboration between the domain experts?

2) Complementary requirements: Modern software development processes combine traditional requirements management with outcome/data feedback from the stakeholders, and recently the use of AI to enhance the feedback from system operation [2]. Software system features related to user acceptance are typically evaluated through A/B testing. Non-functional requirements related to different types of quality attributes are in many domains the main concern of the development process. Examples of these are attributes related to dependability (safety, security, reliability, robustness, etc.), resources (real-time requirements, performance, energy consumption, computational and storage constraints), or the user interface (usability). The requirements related to the ML workflow are different: data quality, data management effectiveness (data preparation, labeling), effectiveness of the training process (CPU training time), and the resources needed in the training process. In relation to the quality of ML models, there is a set of metrics that show the quality of the prediction: accuracy, precision and recall are some of them⁶. In feature engineering, the requirements are related to the quality and effectiveness of feature selection and processing (normalisation, transformations, etc.) in relation to the prediction quality.

In an integrated process, the main challenges lie in understanding the relations between software system requirements and ML requirements, for example the relations between system features and dataset features. Obviously these are not the same features, but they might be related: when specifying a feature as a system requirement, which features should be included in the ML models? Another example: to achieve a certain system reliability, do we need to ensure a certain level of accuracy of an ML model? There are similar problems for the performance requirements: what is the impact of ML model performance on the system performance, for example on real-time requirements? These examples can be summarised as a common challenge: a) How to relate ML requirements to software system requirements? b) How to derive ML requirements from the system requirements?

3) System Verification vs. ML Evaluation: The goal of ML evaluation is to increase the prediction quality: first to find the minimum of the specified cost function, then to provide evaluation results in the form of accuracy and other ML metrics. These metrics are unknown in the beginning, and an optimal value is searched for by changing the ML model and input datasets in a number of iterations. This process has neither predefined goals nor estimations of effort (in time and used resources). Verification of a software system is somewhat different: it is a measure of the correctness of the system in relation to a specification [15]. The correctness can be of a binary type (correct/failure) or a form of distribution related to a non-functional attribute such as reliability. The verification process is typically not an iterative process, as during verification the system is not supposed to be changed or adapted. Such verification is typically realised by testing, or by analysis (static or dynamic, formal or empirical). Here again there is a question of how these processes are related. Can we use the ML evaluation metrics to analyse the correctness of the system? This question is particularly relevant if a system has different AI models integrated in the system. Or, the opposite: can we, from the system verification, conclude that an ML model is not sufficiently accurate or precise? The challenge: How is software system verification related to ML evaluation?

4) System evolution vs. ML model evolution: The continuous change of software has led to development processes like DevOps, which explicitly support continuous integration and continuous deployment, automatic verification and specific types of testing in production, such as A/B testing. Software evolution includes changes of new and improved functionality and internal infrastructural changes. These changes have an impact on requests for changes of ML modules, including both functional and non-functional properties. The ML workflow focuses on other types of evolution: a) evolution because of re-training needs caused by changes of datasets, since new data originating from new contexts may require re-modeling by re-training; b) evolution related to changes or optimisation of the model itself, trained on the same datasets. While continuous software evolution can be mapped to the evolution of ML models in a controlled way (though that requires established procedures), a change of context, i.e. a change of the data input for ML models, can happen in an uncontrolled and unexpected manner, which might have a direct impact on ML model accuracy and performance. The context change may require ML model retraining with updated data sets. However, a retrained model can work correctly in the new context while its performance degrades for the old context. The challenges that arise are the following: a) How does software evolution influence ML model evolution? b) How to control the operational context change and its influence on ML model performance and the need for ML model evolution? c) How to control the evolution of ML models to achieve the same level of control as for software system evolution?

5) Operations and Feedback loops: The ML workflow and DevOps are designed as iterative processes themselves, but they also have mutual dependencies. Data management includes a data collection activity, and data must be collected and made available for training in the initial phase of development. However, in many cases new data is collected during system operation, in particular if new data is needed to cover new operational contexts, and the ML model must then be retrained. An ML model must be deployed in the development environment to be used in the development process, and deployed together with the executable system in operation. In operation, the system must provide information about the system performance and user acceptance back to the development environment in order to provide feedback for further system evolution. The system also needs to monitor the deployed ML models in order to check their performance in the current context. Information of different types must be continuously exchanged for a successful and efficient end-to-end process. The challenges that appear here are: a) Which information should be exchanged between the processes? b) What are the triggering mechanisms for information exchange, and at which frequencies should the information be exchanged?

⁶ For definitions see https://en.wikipedia.org/wiki/Confusion_matrix
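The artifact versioning problem raised in challenge 1 (versioned data and models, not just versioned code) can be made concrete with a small sketch. This is our own minimal illustration, not a tool used by the studied companies: the `ArtifactRegistry` class and its `register`/`lookup` calls are hypothetical, and a content hash stands in for an immutable artifact identifier.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Content-addressed ID: identical bytes always yield the same ID,
    which is what makes a registered artifact reproducible."""
    return hashlib.sha256(data).hexdigest()

class ArtifactRegistry:
    """Hypothetical dependency-tracking store for ML artifacts
    (code, data, models), keyed by name and version."""

    def __init__(self):
        self._entries = {}

    def register(self, name: str, version: str, data: bytes, depends_on=()):
        """Record an artifact together with references to what it was
        built from, e.g. a dataset hash and a Git commit."""
        artifact_id = content_hash(data)
        self._entries[(name, version)] = {
            "id": artifact_id,
            "depends_on": list(depends_on),
        }
        return artifact_id

    def lookup(self, name: str, version: str) -> dict:
        return self._entries[(name, version)]

# Usage: register a dataset, then a model that depends on it.
registry = ArtifactRegistry()
dataset_id = registry.register("ooo-replies", "v1", b"email,label\n...")
model_id = registry.register(
    "ooo-classifier", "v3", b"<serialized model bytes>",
    depends_on=[dataset_id, "git:<commit hash>"],  # placeholder references
)
print(registry.lookup("ooo-classifier", "v3")["depends_on"][0] == dataset_id)
```

A real dependency tracking system would persist these entries in a database and expose them to other systems, such as the build system, via an API; the point of the sketch is only that data and models need the same immutable, dependency-aware versioning that Git already gives code.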

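Challenges 4 and 5 both hinge on detecting an operational context change that degrades a deployed model. One possible trigger, which the paper does not prescribe and which we sketch only as an assumption, is a mean-shift check on a monitored input feature; real deployments would use proper statistical drift tests on many features.

```python
from statistics import mean, pstdev

def needs_retraining(train_values, live_values, z_threshold=3.0):
    """Flag a context change: has the live feature mean drifted more
    than z_threshold training standard deviations from the training
    mean? Returns True when retraining should be triggered."""
    mu, sigma = mean(train_values), pstdev(train_values)
    if sigma == 0:
        return mean(live_values) != mu
    z = abs(mean(live_values) - mu) / sigma
    return z > z_threshold

train = [10.0, 11.0, 9.5, 10.5, 10.0]   # feature values seen in training
stable = [10.2, 9.9, 10.4]              # live data, same context
shifted = [25.0, 26.5, 24.8]            # live data, new context
print(needs_retraining(train, stable))   # same context: no retraining
print(needs_retraining(train, shifted))  # drifted context: retrain
```

The boolean returned here is exactly the kind of triggering mechanism asked for in challenge 5 b): it converts monitored operational data into a request back to the ML workflow.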
Fig. 3. ML workflow and DevOps process integration

VI. ML WORKFLOW AND DEVOPS PROCESS INTEGRATION

Ultimately, the integration of the ML workflow and DevOps processes should enable fast, iterative and continuous development, deployment and operation of AI-based software systems in production. The integrated process advocates automation within and across all the process steps of AI-based system implementation, including building, testing, deployment and infrastructure management. All artifacts (code, data and models) important to construct the AI-based system are reproducible and can be adapted and released reliably in small increments. The integrated process is especially applicable in scenarios wherein deployed ML models are retrained based on newly streamed data or on the optimization of ML model parameters, such as changing hyperparameters. The entire AI-based system life cycle integrates the ML workflow and DevOps into four distinct processes: Data Management (DM), ML Modeling (Mod), Software Development (Dev), and System Operation (Ops) (see Figure 3).

DM. The data management stage establishes activities and systems to collect data, select data, augment labelling and curate features that will be used for training ML models. Standardized activities and systems are put in place to help extract and store data in optimized formats that are easy and ready to be consumed by the ML modelling stage, while also enforcing suitable security controls for accessing the data. The resulting data set building code creates connectors to raw data storage in order to select and extract samples of data. In Case A, all email communication data between sales representatives and prospects is stored in a communication database, from which data samples are selected and labelled. However, the extraction involves a series of steps to handle different encodings in order to get the email text, extract entities and construct relationships around the entities. In Case B, a data ingestion service is implemented to build the data set by extracting data from a large data collection storage containing log files and log analyses (performed by an external tool).

Mod. Once a new data set is made available for training at the ML modelling stage, a training environment is provisioned to facilitate model training and evaluation. ML model experiments can be carried out on an ML platform, developed internally or acquired as an open-source tool, that provides the capability of analysing data sets and training ML models. At this stage, information about each ML model experiment run, i.e., the datasets, the trained model (including model configurations) and the evaluation results, is stored for easy comparison against a historical baseline, e.g., the model that is currently in production. To facilitate tracking of artifact dependencies, after a successful completion of an ML model experiment, developers explicitly specify and register the artifacts and their dependencies (data set, model and code) in a dependency tracking system that stores the experiment run history in a database. For each artifact, the developers specify its metadata (name, version number, registration date) and dependency information, which contains references to other elements that the artifact depends on. For example, data set dependency information can be (i) the data source, e.g., log IDs, (ii) the Git hash of the code used to extract the data set from raw data, (iii) the versioned set of data labels, and (iv) the entry point to the extraction scripts, etc. [12]. Important to storing artifacts and their dependency information is that it should ensure immutability, i.e., it should be possible to recreate/reproduce the exact same artifact every time. The versioned artifacts and their dependency information are made accessible externally to other systems, e.g., to the build system through simple API calls. By the end of the model experiment stage, the final trained model is selected and verified to work correctly, although in isolation from other ML components. Case A uses an externally developed AI platform, Databricks, to build and deploy models. With the tool, Case A stores the whole pipeline of ML model experiment steps as a single archive file and the resulting models in a version-controlled repository. In Case B, the ML modelling stage involves manual steps of data exploration, model training and evaluation. Developers implement and version the model training code after successfully training a model. The model, along with metadata information (name, version, classifier, features), is also stored in a model registry.
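The experiment bookkeeping described for the Mod stage can be sketched as follows. The exact schema used in the studied cases is not public, so the record fields and the `record_run`/`beats_baseline` helpers below are our own assumptions, chosen only to show how stored run metadata enables comparison against a production baseline.

```python
from datetime import date

def record_run(history: list, model_name: str, version: str,
               dataset_ref: str, config: dict, metrics: dict) -> dict:
    """Append one experiment run (dataset, configuration, metrics)
    to the run history so later runs can be compared against it."""
    entry = {
        "model": model_name,
        "version": version,
        "dataset": dataset_ref,   # e.g. a versioned data set identifier
        "config": config,         # e.g. hyperparameters
        "metrics": metrics,       # e.g. accuracy, precision, recall
        "registered": date.today().isoformat(),
    }
    history.append(entry)
    return entry

def beats_baseline(entry: dict, baseline: dict, metric: str = "accuracy") -> bool:
    """Promote a candidate only if it improves on the production model."""
    return entry["metrics"][metric] > baseline["metrics"][metric]

# Usage: record a baseline run and a candidate run, then compare them.
history = []
baseline = record_run(history, "fault-classifier", "v1", "logs-2020-01",
                      {"lr": 0.01}, {"accuracy": 0.81})
candidate = record_run(history, "fault-classifier", "v2", "logs-2020-03",
                       {"lr": 0.003}, {"accuracy": 0.86})
print(beats_baseline(candidate, baseline))
```

In a real setup the history would live in the dependency tracking database described above rather than in a list, and the promotion decision would be made by the build system during the Dev stage.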

Dev. When deploying the trained model, it is first verified that it operates correctly with the rest of the system, i.e., the other components. The build system performs system-wide integration and verification over a large test set to give a holistic view of how all parts of the system perform between the different components. Additionally, since the artifact dependency information and capability are accessible to build systems, such as the versioning and storing of artifacts using code (data set generation code, model training code, evaluation metrics generation code), the build system can be implemented in such a way that it has the functionality to orchestrate and run the entire model training job on a regular basis. In the latter case, the retraining of models is automated, and the process is designated as failed, aborted or successful by the build. A successful build with the best model (the new model performs better than the production model) is registered in the model registry. In case of failed and aborted builds, developers can opt to rebuild the model with an optimized set of parameters. Historical builds executed by the CI system can also be used to identify and analyse bugs that occur in other components as a result of model updates. At the end, the verified latest registered version of a trained model is released/promoted to the subsequent quality assurance acceptance environment or the production environment, only after a successful build, including passing regular integration and system tests. The regular tests include testing the AI component with different inputs to ensure an appropriate response. In Case A, one can reproduce the trained models by re-running the single archive file containing the whole model pipeline generated by the Databricks tools. At the time of our study, both Case A and Case B were in the process of implementing tests. Case B was also planning to incorporate a CI system in order to help deal with the technical debt that was accumulating due to time pressure requiring the team to deliver the system fast to the stakeholder. For example, unit tests were being implemented to help catch issues caused by the data ingestion service.

Ops. Once the ML system component, including the trained model, is operational in the production environment, the system is continuously monitored to detect issues like performance degradation. Deployment to the production environment can be a manual, semi-automated or automated process that is performed after successful execution of tests in pre-production environments in the development step. The Ops stage also ensures live testing of the AI component by providing predictions whilst adhering to stringent latency requirements. At this step, developers also collect information on how the model is performing based on live data. This information can be used to trigger retraining of models. Both Cases A and B have the capability to monitor how the model is performing in production.

VII. CONCLUSION

In this paper we have illustrated the integration of the ML workflow and the DevOps process as an approach to help build AI-based systems systematically. The approach also helps to tackle some of the identified challenges when developing ML-based software systems, in particular the manual steps during ML model experimentation and deployment, including addressing the problems of version control and dependency management for ML artifacts. This paper tries to energize and motivate further discussion on addressing the many challenges in the development of AI-based systems. In future work we will elaborate more on ML workflow and DevOps process integration with additional cases.

REFERENCES

[1] D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, and D. Dennison, "Hidden technical debt in machine learning systems," in Advances in Neural Information Processing Systems (NIPS) 28. Curran Associates, Inc., 2015, pp. 2503–2511.
[2] J. Bosch, H. H. Olsson, and I. Crnkovic, "It takes three to tango: requirement, outcome/data, and AI driven development," in International Workshop on Software-intensive Business: Start-ups, Ecosystems and Platforms. http://ceur-ws.org/, 2018, pp. 177–192.
[3] L. E. Lwakatare, A. Raj, J. Bosch, H. H. Olsson, and I. Crnkovic, "A taxonomy of software engineering challenges for machine learning systems: An empirical investigation," in Agile Processes in Software Engineering and Extreme Programming, P. Kruchten, S. Fraser, and F. Coallier, Eds. Springer International Publishing, 2019, pp. 227–243.
[4] S. Amershi, A. Begel, C. Bird, R. DeLine, H. Gall, E. Kamar, N. Nagappan, B. Nushi, and T. Zimmermann, "Software engineering for machine learning: A case study," in 41st International Conference on Software Engineering: Software Engineering in Practice. IEEE, 2019, pp. 291–300.
[5] M. Kim, T. Zimmermann, R. DeLine, and A. Begel, "Data scientists in software teams: State of the art and challenges," IEEE Transactions on Software Engineering, vol. 44, no. 11, pp. 1024–1038, 2018.
[6] A. Arpteg, B. Brinne, L. Crnkovic-Friis, and J. Bosch, "Software engineering challenges of deep learning," in 44th Euromicro Conference on Software Engineering and Advanced Applications, 2018, pp. 50–59.
[7] N. Polyzotis, S. Roy, S. E. Whang, and M. Zinkevich, "Data management challenges in production machine learning," in Proceedings of the 2017 ACM International Conference on Management of Data, ser. SIGMOD '17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 1723–1726. [Online]. Available: https://doi.org/10.1145/3035918.3054782
[8] C. Hill, R. Bellamy, T. Erickson, and M. Burnett, "Trials and tribulations of developers of intelligent systems: A field study," in Symposium on Visual Languages and Human-Centric Computing. IEEE, 2016, pp. 162–170.
[9] S. Schelter, J.-H. Böse, J. Kirschnick, T. Klein, and S. Seufert, "Automatically tracking metadata and provenance of machine learning experiments," in NIPS Workshop on Machine Learning Systems, 2017.
[10] L. E. Lwakatare, T. Kilamo, T. Karvonen, T. Sauvola, V. Heikkilä, J. Itkonen, P. Kuvaja, T. Mikkonen, M. Oivo, and C. Lassenius, "DevOps in practice: A multiple case study of five companies," Information and Software Technology, vol. 114, pp. 217–230, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0950584917302793
[11] J. Humble and D. Farley, Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, 1st ed. Boston: Addison-Wesley Professional, 2010.
[12] Y. Guo, K. Ashmawy, E. Huang, and W. Zeng, "Under the hood of Uber ATG's machine learning infrastructure and versioning control platform for self-driving vehicles," Mar 2020. [Online]. Available: https://eng.uber.com/machine-learning-model-life-cycle-version-control/
[13] L. E. Lwakatare, A. Raj, I. Crnkovic, J. Bosch, and H. H. Olsson, "Large-scale machine learning systems in real-world industrial settings: A review of challenges and solutions," Information and Software Technology, vol. 127, 2020.
[14] Google, "MLOps: Continuous delivery and automation pipelines in machine learning," April 2020. [Online]. Available: https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
[15] C. Murphy, G. E. Kaiser, and M. Arias, "An approach to software testing of machine learning applications," in SEKE, vol. 167, 2007, pp. 52–57.

