MLOps project — part 4b: Machine Learning Model Monitoring
by Isaac Kargar | Jan 2023 | Published in DevOps.dev
I like to emphasize the importance of the parts of MLOps beyond modeling using the following figure from a paper published by Google:

[Figure: the ML code is only a small box inside a much larger system of supporting infrastructure (source)]
As you see, the machine learning code and model development are just a small part
of having a machine learning-based product.
Machine learning models are at the heart of many modern applications, from
recommending products to customers to identifying objects in images. However,
even the best models can degrade over time, leading to poor performance and
inaccurate predictions. To ensure that your machine learning models continue to
perform well, it’s important to monitor them and identify when they need to be
retrained or replaced.
Let’s start with Vertex AI and see how it can help us with the Model Monitoring task.
Vertex AI
One way to do model monitoring is by using Vertex AI, a fully managed platform
for developing, deploying, and monitoring machine learning models on Google
Cloud. It provides a range of tools and services for building, training, and evaluating
machine learning models, as well as monitoring their performance over time. With
Vertex AI, you can easily create and deploy machine learning models using a variety
of popular frameworks, such as TensorFlow and PyTorch. The platform also
provides tools for optimizing and improving the performance of your models,
including the ability to automatically tune hyperparameters and understand how
your models are making predictions.
In addition to these features, Vertex AI also provides tools for monitoring the
performance of your models over time. This includes data quality checks, confusion
matrices, and feature importance plots, which can help you understand the
accuracy of your models and identify areas for improvement.
Another useful feature of Vertex AI is its AutoML service, which automatically selects and tunes a model for your dataset, including its hyperparameters. This can save time and improve model performance, as it lets you quickly test different configurations and choose the one that performs best.
Also, Vertex AI integrates with other Google Cloud services, such as BigQuery and
Cloud Storage, making it easy to store and access your data, and use it to train and
evaluate your models.
To get started with Vertex AI, you’ll first need to create a project on Google Cloud.
From there, you can use the Vertex AI dashboard to create and deploy your machine
learning models. The dashboard also provides tools and metrics for monitoring the performance of your models over time.
Let’s see how Vertex AI can help us with the model monitoring task in practice.

Model Monitoring checks the input data used for predictions for skew and drift in features, so you can keep your model performing at its best. Skew and drift detection is supported for both categorical and numerical features.
Model Monitoring will send you an email when the skew or drift for a model’s
feature goes above the threshold you choose. Additionally, you may check the
distributions of each feature over time to see whether you need to retrain your
model.
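To make this concrete, here is a minimal sketch of creating such a monitoring job with the google-cloud-aiplatform Python SDK. This is not the exact setup of this project; the project ID, endpoint, feature names, and thresholds below are all placeholders:

from google.cloud import aiplatform
from google.cloud.aiplatform import model_monitoring

# Placeholder project and location -- replace with your own.
aiplatform.init(project="my-project", location="us-central1")

# Email me when a feature's skew or drift crosses its threshold.
alert_config = model_monitoring.EmailAlertConfig(user_emails=["me@example.com"])

# Skew: compare serving traffic against the training data distribution.
skew_config = model_monitoring.SkewDetectionConfig(
    data_source="bq://my-project.my_dataset.training_table",  # placeholder table
    target_field="label",
    skew_thresholds={"age": 0.3, "country": 0.3},
)

# Drift: compare recent serving traffic against earlier serving traffic.
drift_config = model_monitoring.DriftDetectionConfig(
    drift_thresholds={"age": 0.3, "country": 0.3},
)

objective_config = model_monitoring.ObjectiveConfig(skew_config, drift_config)

job = aiplatform.ModelDeploymentMonitoringJob.create(
    display_name="my-model-monitoring-job",
    endpoint="my-endpoint-id",  # placeholder endpoint
    objective_configs=objective_config,
    logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=0.8),
    schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),  # hours
    alert_config=alert_config,
)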
TensorBoard
TensorBoard (TB) is a Google open source project for machine learning experiment visualization. Vertex AI TensorBoard is an enterprise-ready, managed version of TensorBoard.
With Vertex AI TensorBoard, you can track, visualize, and compare ML experiments and
share them with your team.
[Screenshot: Vertex AI TensorBoard experiment tracking (source)]
Using TensorBoard in your code is easy and straightforward. Here is an example (source):
import tensorflow as tf
import datetime

# Load and normalize the MNIST dataset.
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

model = create_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Write logs to a timestamped directory and attach the TensorBoard callback.
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

model.fit(x=x_train,
          y=y_train,
          epochs=5,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback])
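After training, you can launch TensorBoard locally and point it at the log directory to explore the run:

tensorboard --logdir logs/fit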
You can check this link to see more on how to use it on Vertex AI. Also, check here
for the API and how to use TensorBoard in your code.
At the time of writing, Vertex AI TensorBoard is priced at $300 per user per month, which is very expensive in my opinion. If you want to use something cheaper on GCP, you can check out my blog post on how to set up MLflow on GCP.
Resources
- Model Monitoring (Google Cloud documentation)
- vertex-ai-samples/model_monitoring.ipynb, GoogleCloudPlatform/vertex-ai-samples (github.com): sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud
- mlops-with-vertex-ai/02-experimentation.ipynb, GoogleCloudPlatform/mlops-with-vertex-ai (github.com)
Amazon SageMaker
Machine learning models are typically trained and evaluated using historical data, but real-world data may not look like the training data, especially as the model ages and the data distribution changes. For example, a feature’s units could change from Fahrenheit to Celsius, or your application could start sending your model null values, either of which has a big effect on the quality of your model. Or perhaps, in a real-world retail scenario, consumers’ purchasing preferences evolve over time. As we’ve discussed before, this gradual divergence between the model and the real world is called “model drift” or “data drift,” and it can have a big effect on prediction quality. Similarly, the model’s performance may decline over time, and this degradation of accuracy affects business outcomes. Continuous monitoring of the model’s performance is vital for proactively addressing this issue.
With this constant monitoring, you can figure out when and how often to retrain your machine learning model. Frequent retraining can be too expensive, but retraining too rarely means the model may not make the best predictions, so acting at the right time is what matters here.
When the rules and thresholds you’ve set are violated, model monitoring sends metrics to Amazon CloudWatch, which lets you set up alarms to audit and retrain models. Data drift and accuracy drift metrics are also kept in S3 buckets, and SageMaker Studio can be used to visualize them.

Let’s quickly get familiar with two AWS services that can be used to log and visualize metrics: SageMaker Studio and CloudWatch.
If you’re running any kind of application on Amazon Web Services (AWS), Amazon
CloudWatch can keep an eye on everything in real time. Metrics, which are
variables for measuring resources and applications, can be gathered and monitored
with CloudWatch.
Metrics for all of your active AWS services are automatically updated on the
CloudWatch homepage. You may also curate your own set of metrics to display on a
dashboard dedicated to your bespoke applications.
If a predetermined threshold is reached, you can set up alarms that will either notify you or trigger immediate action on the resources under observation. You can, for instance, monitor the CPU utilization and disk I/O operations per second of your Amazon EC2 instances to determine whether more instances need to be launched to accommodate an influx of new work. The same information can also be used to reduce costs by terminating underused instances.
CloudWatch lets you monitor the overall health of your system, including its
resource use, application performance, and operational status.
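To make the alarm mechanism concrete, here is a minimal boto3 sketch for the EC2 CPU example above; the instance ID and SNS topic ARN are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average CPU over two consecutive 5-minute periods exceeds 80%.
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu-utilization",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:my-alerts"],  # placeholder topic
)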
[Figure: overview of the SageMaker model monitoring flow (source)]

The flow starts with putting the trained model into use and ends with taking corrective action when drift is found. Here is the end-to-end architecture that corresponds to that flow.
[Figure: end-to-end model monitoring architecture (source)]
Let’s analyze this diagram in more detail. The first step is to deploy the trained
model.
[Figure: step 1, deploying the trained model (source)]
This step starts with the ground truth training data and a training job on SageMaker, which generates a model artifact. The trained model is then made available to consumers via a deployed SageMaker endpoint.

[Figure: the consuming application calling the SageMaker endpoint (source)]

Now, with the endpoint deployed, a consuming application can start sending requests and getting back predictions from the model. The request and response data are then captured and stored in Amazon S3.
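Capturing the endpoint traffic is typically configured at deployment time. Here is a minimal sketch with the SageMaker Python SDK, assuming a sagemaker Model object named model already exists; the bucket and endpoint names are placeholders:

from sagemaker.model_monitor import DataCaptureConfig

# Capture all requests and responses to S3 (placeholder bucket and paths).
data_capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,
    destination_s3_uri="s3://my-bucket/data-capture",
)

# "model" is assumed to be a sagemaker.model.Model created from the training job.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="my-endpoint",
    data_capture_config=data_capture_config,
)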
In order to identify if there’s any kind of data drift, we need to have some baseline
data.
[Figure: running a baselining job on the training data (source)]
In the next step, we run a baselining job that generates statistics and constraints from the training data.
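A minimal sketch of such a baselining job with the SageMaker Python SDK; the IAM role and S3 paths are placeholders:

from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/my-sagemaker-role",  # placeholder role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

# Profile the training data and suggest baseline statistics and constraints.
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/train/train.csv",  # placeholder dataset
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline",  # placeholder output path
    wait=True,
)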
[Figure: comparing captured inference data against the baseline (source)]
So, now that we have both the baseline details and the captured inference requests, we can compare them to identify any kind of drift.
[Figure: scheduled data drift monitoring job (source)]
Then we can create a data drift monitoring job that SageMaker runs periodically on a schedule we set. The job compares the inference requests against the statistics and constraints of the baseline. Each run of the monitoring job produces a violation report, saved in Amazon S3, a statistics report on the data collected during the run, and summary metrics and statistics that are sent to Amazon CloudWatch.
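Continuing the sketch from the baselining step, the scheduled job could be created like this; the schedule name, endpoint, and output path are placeholders:

from sagemaker.model_monitor import CronExpressionGenerator

# "monitor" is the DefaultModelMonitor from the baselining step above.
monitor.create_monitoring_schedule(
    monitor_schedule_name="my-data-drift-schedule",  # placeholder name
    endpoint_input="my-endpoint",  # the endpoint with data capture enabled
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    output_s3_uri="s3://my-bucket/monitoring-reports",  # placeholder
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)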
[Figure: outputs of the data drift monitoring job (source)]
Now we’re able to detect data quality drift. But what happens if the quality of the model itself changes, for example, if the accuracy of the model decreases? Let’s see how to use the Model Monitor capability to detect accuracy drift.
[Figure: model quality (accuracy drift) monitoring (source)]
The process for detecting accuracy drift is almost identical to the data drift
detection process.
[Figure: capturing predictions and ground truth labels (source)]
We collect the predictions made and the ground truth for those predictions, and compare the two. But what does “ground truth” for a prediction mean? That depends on the predictions your model makes and the business use case. Consider that you are observing a movie recommendation model. In this case, a possible ground truth label is whether or not the user actually watched the recommended movie, or whether they simply clicked on it without watching it.
With both the predictions captured and the ground truth provided by the model-consuming application, SageMaker executes a merge job to combine them. Again, the merge job is a recurring task that runs on a set schedule. Once the data is merged, it’s time to monitor the accuracy.
[Figure: scheduled model quality monitoring job (source)]
In this step, we create a model quality monitoring job that is executed periodically according to a schedule. The model quality job generates statistics, violations, and CloudWatch metrics. SageMaker Studio can also be used to view the metrics produced by the two monitoring jobs. For the model quality monitoring job, the generated metrics include accuracy, precision, and recall. SageMaker Model Monitor provides built-in classification and regression metrics, and we can also define our own custom metrics.
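A hedged sketch of a model quality schedule with the SageMaker Python SDK; it mirrors the data drift schedule but also points at the ground truth data, and all names and paths are placeholders:

from sagemaker.model_monitor import (
    CronExpressionGenerator,
    EndpointInput,
    ModelQualityMonitor,
)

quality_monitor = ModelQualityMonitor(
    role="arn:aws:iam::123456789012:role/my-sagemaker-role",  # placeholder role
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

quality_monitor.create_monitoring_schedule(
    monitor_schedule_name="my-model-quality-schedule",  # placeholder name
    endpoint_input=EndpointInput(
        endpoint_name="my-endpoint",  # placeholder endpoint
        destination="/opt/ml/processing/input_data",
        inference_attribute="prediction",  # field holding the model output
    ),
    ground_truth_input="s3://my-bucket/ground-truth",  # labels from the application
    problem_type="BinaryClassification",
    output_s3_uri="s3://my-bucket/model-quality-reports",  # placeholder
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)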
After detecting both data and model drifts, it’s time to take action on them.
[Figure: CloudWatch alerts driving corrective action (source)]
Both the data drift and the model quality monitoring jobs emit CloudWatch metrics. We can create CloudWatch alarms on these metrics with threshold values; if a threshold is violated, an alert is raised and we can decide what action to take, such as updating the training data or retraining and redeploying the model.
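For instance, an alarm on a per-feature drift metric could look like the boto3 sketch below, using the same put_metric_alarm call shown earlier. The namespace and metric name follow the documented pattern for Model Monitor data quality metrics, but you should verify the exact names your jobs emit; everything else is a placeholder:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when the hourly drift distance for the "age" feature exceeds 0.1.
cloudwatch.put_metric_alarm(
    AlarmName="feature-age-baseline-drift",
    Namespace="aws/sagemaker/Endpoints/data-metrics",  # verify in your console
    MetricName="feature_baseline_drift_age",  # pattern: feature_baseline_drift_<name>
    Dimensions=[
        {"Name": "Endpoint", "Value": "my-endpoint"},  # placeholder
        {"Name": "MonitoringSchedule", "Value": "my-data-drift-schedule"},
    ],
    Statistic="Average",
    Period=3600,
    EvaluationPeriods=1,
    Threshold=0.1,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:my-alerts"],  # placeholder topic
)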
[Figure: closing the loop, retraining from the ground truth training data (source)]
Now, if we decide to retrain the model, we complete the loop: we go back to the ground truth training data and train the model once more.
To learn more, please check the documentation. You can find some resources in the
following section.
Thank you for taking the time to read my post. If you found it helpful or enjoyable, please
consider giving it a like and sharing it with your friends. Your support means the world to
me and helps me to continue creating valuable content for you.
Resources