ML Process in Azure Cloud
INTERNAL & PARTNERS
ML Process in Microsoft Azure
Table of contents
I. Definition:
1. ML Life cycle and resources in Azure ML
2. ML assets in Azure ML
II. Machine Learning Studio:
III. Machine Learning Service:
1. Prepare Data
2. Experiments
3. Deployment
IV. Data Workflow using Azure services (Data Pipeline):
1. Connectivity:
2. Data Ingestion and processing:
A. Batch mode (or static dataset or cold path):
B. Streaming Mode
3. Data store:
4. Post processing (Analytics) & Exposition using ADX service
5. Machine Learning model training:
6. Evaluation
7. Model Deployment in container instance (ACI)
8. Production deployment using AKS:
9. Exposure of the model via API as endpoint to third apps
10. Machine learning model Inferencing:
V. MLOps (for Python):
1. Definition
2. Architecture:
VI. Data quality:
I. Definition:
1. ML Life cycle and resources in Azure ML
Azure Machine Learning includes several resources and assets to enable you to perform your
machine learning tasks. These resources and assets are needed to run any job.
3) Datastore
Azure Blob storage
Azure Data Lake
Azure SQL Database
Databricks File System
Azure Blob Container
2. ML assets in Azure ML
Assets: created using Azure Machine Learning commands or as part of a training/scoring
run. Assets are versioned and can be registered in the Azure ML workspace. They include:
1) Model: binary file(s) that represent a machine learning model and any
corresponding metadata. Models can be created from a local or remote file or
directory.
2) Environment
3) Data
4) Component:
A component is analogous to a function - it has a name, parameters, expects
input, and returns output. Components can do tasks such as data processing,
model training, model scoring, and so on.
Azure offers two Machine Learning solutions with different capabilities and advantages:
1. Prepare Data
Datastore: A datastore stores the connection information for an Azure storage
service. It is attached to the workspace and can be referred to by name. Some
examples of supported Azure storage services that can be registered as datastores
are:
o Azure Blob Storage
o Azure Data Lake
o Azure SQL Database
o Databricks File System
o Azure Blob Container
Datasets: A Dataset is a reference to data in a datastore or behind public web URLs,
together with a copy of its metadata. Azure ML supports two types of datasets:
the File dataset and the Tabular dataset.
2. Experiments
Build, Train & Evaluate the model
Model: A model is a piece of code that takes inputs and produces the corresponding
outputs. Developing a machine learning model requires selecting an algorithm,
providing data, and tuning hyperparameters.
Once the algorithm has been run on a dataset, it becomes a model.
Training is an iterative process that produces a trained model, which retains what it
learned during training.
Evaluation
Model evaluation uses metrics to analyze the performance of the model. Common
metrics and related concepts include:
Accuracy: a metric that measures the percentage of correct predictions
made by the model
Precision: a metric that measures the percentage of true positives out of all
predicted positives
Recall (or sensitivity): a metric that measures the percentage of true
positives out of all actual positives
F1 score: a metric that combines precision and recall into a single score
Area Under the Curve (AUC): a metric that measures the performance of a
binary classification model by calculating the area under the Receiver
Operating Characteristic (ROC) curve
Overfitting: a common problem in model evaluation where the model is too
complex and performs well on the training data but poorly on new, unseen
data
Underfitting: a common problem in model evaluation where the model is too
simple and performs poorly on both the training data and new, unseen data
Cross-validation: a technique used to assess the performance of a model by
splitting the data into multiple subsets, then repeatedly training the model on
all but one subset and evaluating it on the held-out subset
Regularization: a technique used to prevent overfitting by adding a penalty
term to the loss function that discourages the model from being too
complex.
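As a minimal sketch of the first four metrics above, the following pure-Python function computes accuracy, precision, recall and F1 (the harmonic mean of precision and recall) from the confusion-matrix counts. The function name and the sample labels are illustrative assumptions, not part of Azure ML.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels (hypothetical data): 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice a library such as scikit-learn would normally be used for these calculations; the sketch only makes the definitions explicit.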
3. Deployment
Once the model is trained and tested, it is stored in the container image registry and
then deployed as a web service or to IoT modules.
Image: An image contains a model, application, training script and the dependencies
required by the model or script. The images are stored in the image registry. There
are two types of images:
o Docker image: used to deploy to compute targets such as Azure Kubernetes
Service (AKS) or Azure Container Instances (ACI).
o FPGA image: used when deploying to a field-programmable gate array in Azure ML.
Deployment:
There are 3 ways to deploy your trained model:
as a Web service endpoint: The registered model is deployed as a
service endpoint.
o Deploy as a Real-time endpoint
o Deploy as a Batch endpoint
An endpoint can run on an Azure Kubernetes Service (AKS) cluster or on Azure Container Instances (ACI).
IV. Data Workflow using Azure services (Data Pipeline):
1. Connectivity:
i. IoT Edge:
OPC Publisher is a module that runs on Azure IoT Edge. It connects to OPC UA server
systems and publishes telemetry data to Azure IoT Hub in various formats.
Batch processing refers to the processing of blocks of data that have already been
stored over a period of time. This process involves using specific connectors for each
data source and target destination.
NB: usually, the platform receiving (ingesting) the data is responsible for
providing the connector used to extract it. For example, the Foundry platform
provides connectors to extract data from Azure cloud sources.
Azure Data Factory (ADF) is an ETL service that helps to prepare and transform
data:
B. Streaming Mode
Data streaming is a mechanism that was designed to allow you to transfer, process
and analyze continuous streams of data in real time. It differs from traditional
databases where data is stored before being processed.
i. Data ingestion process:
devices, which is not possible with Event Hubs, e.g. sending software updates to
the sensors.
IoT Hub supports the MQTT protocol and can act as an MQTT subscriber that
forwards the topics published by the edge device to cloud services.
3. Data store:
ADLS Gen2: The data storage proposed for all types of raw, processed, and
transformed data is Azure Data Lake Storage Gen2.
Storing in Batch mode
Azure Synapse Analytics is an analytics service that brings together enterprise data
warehousing and Big Data analytics. Synapse is also a processing engine for massive
data volumes, with query capability, that can store batch-processed data. Once
processed data is available and stored in Azure Synapse, various analytics clients can
consume it for business applications.
Storing in stream mode
Most often, data is stored temporarily to be accessible for a short period of
time. This allows for re-examination or additional analysis to be performed on the
data if necessary. We may perform further steps such as:
Step-1: Broadcast or real-time action
Processing results can be streamed in real-time for downstream applications,
such as real-time dashboards, alerts, automated actions, etc.
Step-2: Archiving or long-term storage
After real-time exploitation, the data can be archived in long-term storage
systems, such as databases or data warehouses, for later analysis.
Azure Data Explorer (ADX) is a fast and highly scalable data exploration and
analytics service that is designed to ingest, analyze, and visualize large volumes
of structured and unstructured data in real-time.
It is offered as a Platform as a Service (PaaS) as part of the Microsoft Azure platform.
ADX validates the initial data and converts its formats if necessary.
Data manipulation includes schema matching, organization, indexing, coding,
and data compression
Azure Data Explorer offers queuing (batching) and streaming ingestion
Ingestion properties : Properties that affect how data is inserted, such as tagging,
mapping, and creation time.
Data Ingestion methods are pipelines and connectors to common services like:
o Azure Event Grid
o Azure Event Hub
o programmatic ingestion using SDKs (software development kits)
Data visualization can be achieved using their native dashboard offering, or with
tools like Power BI or Grafana.
o Holdout Validation:
The holdout validation approach creates a training set and a holdout set
(also referred to as the test or validation set). The training data is used to train the
model, while the unseen test data is used to validate the model performance. A
common split ratio is 70:30, which means 70% of the data is used for building the
model, while the remaining 30% is used for testing the model performance.
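A 70:30 holdout split can be sketched in a few lines of standard-library Python. The helper name `holdout_split`, the fixed seed and the sample data are assumptions made for illustration; shuffling before splitting is a common precaution so that the holdout set is not biased by the original row order.

```python
import random


def holdout_split(rows, test_ratio=0.3, seed=42):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset of 100 rows, here just the integers 0..99.
data = list(range(100))
train, test = holdout_split(data)
print(len(train), len(test))
```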
o K-fold Cross-Validation:
The data is divided into k folds. The model is trained on k-1 folds with one fold held
back for testing. For example, if k is set to ten, then the data will be divided into ten
equal parts. After that, the model will be built on the first nine parts, while the
evaluation will be done on the tenth part or fold.
This process gets repeated to ensure each fold of the dataset gets the chance to be
the held-back set. Once the process is complete, we summarize the evaluation
metrics such as:
Mean
standard deviation
mean absolute error
root mean square error
coefficient of determination: often referred to as R², represents the
predictive power of the model, usually as a value between 0 and 1. Zero means the
model explains nothing; 1 means there is a perfect fit
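The k-fold procedure and the summary metrics above can be sketched as follows. This is an illustration under stated assumptions: the data are synthetic, the "model" is a toy least-squares line standing in for any real estimator, and all function names are hypothetical. Note that on held-out data R² can fall below 0 when a model predicts worse than the mean of the targets.

```python
import math
import statistics


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx); every fold is the held-back set once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for one fold."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (toy stand-in for a real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx


# Synthetic data (assumption): y = 2x + 1 with a small alternating offset.
xs = list(range(20))
ys = [2 * x + 1 + (0.5 if x % 2 else -0.5) for x in xs]

fold_rmse, fold_r2, tested = [], [], []
for train_idx, test_idx in k_fold_indices(len(xs), k=5):
    a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    y_test = [ys[i] for i in test_idx]
    y_hat = [a * xs[i] + b for i in test_idx]
    _, rmse, r2 = regression_metrics(y_test, y_hat)
    fold_rmse.append(rmse)
    fold_r2.append(r2)
    tested.extend(test_idx)

# Summarize the per-fold metrics with their mean and standard deviation.
summary = {
    "rmse_mean": statistics.mean(fold_rmse),
    "rmse_std": statistics.stdev(fold_rmse),
    "r2_mean": statistics.mean(fold_r2),
}
print(summary)
```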
6. Evaluation
1) Calculate metrics (or performance data) using test & validation methods like Holdout
validation or Cross-validation, or any other (see evaluation techniques above)
2) Evaluate the models based on these metrics
3) Select the model with the optimal parameters
4) Test the selected model with a separate test dataset to provide an unbiased
evaluation of the performance of the selected model.
The Docker client is the local development environment where we code the ML
model, and the Docker host is located on the shop-floor server.
Workflow:
1) The kubectl client first translates your CLI command into one or more REST API
calls and sends them to kube-apiserver.
2) After validating these REST API calls, kube-apiserver understands the task and calls
the kube-scheduler process to select one node from the available ones to execute the
job. This is the scheduling procedure.
3) Once kube-scheduler returns the target node, kube-apiserver dispatches the
task with all of the details describing it.
4) The kubelet process on the target node receives the task and talks to the container
engine (e.g. Docker engine) to spawn a container with all provided parameters.
5) This job and its specification are recorded in etcd, a centralized database whose
job is to preserve and provide access to all data in the cluster.
When an ML model is running in production, it is often then described as artificial
intelligence (AI) since it is performing functions similar to human thinking and analysis.
a) Scoring:
Scoring is also called prediction and is the process of generating values based on a
trained machine learning model, given some new input data. The values, or scores,
that are created can represent predictions of future values, but they might also
represent a likely category or outcome. The scoring process can generate many
different types of values:
A list of recommended items and a similarity score
Numeric values, for time series models and regression models
A probability value, indicating the likelihood that a new input belongs to
some existing category.
The name of a category or cluster to which a new item is most similar.
A predicted class or outcome, for classification models
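To make two of these output types concrete, the sketch below scores new inputs with a toy logistic-regression model and returns both a probability value and a predicted class. The weights, bias, threshold and class names are hypothetical values chosen for illustration; a deployed model would load its learned parameters from the registered model file instead.

```python
import math

# Hypothetical parameters of an already-trained logistic regression model.
WEIGHTS = [0.8, -0.4]
BIAS = -0.2


def score(features, threshold=0.5):
    """Return the probability and predicted class for one input row."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    probability = 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function
    label = "positive" if probability >= threshold else "negative"
    return {"probability": probability, "predicted_class": label}


# New, unseen input rows (illustrative values).
results = [score(row) for row in [[2.0, 1.0], [0.0, 3.0]]]
for result in results:
    print(result)
```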
b) Analytical workload:
The output of the model is stored in analytics systems like Azure Blob storage, Azure
Synapse Analytics, Azure Data Lake, or Azure SQL Database, where the input data is
also collected and stored. This stage facilitates the availability of the prediction
results for customer consumption, model monitoring, and retraining of models with
new data to improve their accuracy.
c) Monitoring:
Monitoring in staging & test and in production aims to collect metrics on the model,
data, and infrastructure, and then to act on them to improve performance.
Dev: Plan (requirements backlog) -> Create code for ML model (input
parameters, targets, features, etc.) -> ML (Experiments): Prepare data ->
Build -> Train -> Evaluate -> Dev: Test selected model -> Create a package ->
Ops: Deploy -> Release -> Configure -> Monitor -> Dev: Plan
2. Architecture :
This architecture uses the Azure Machine Learning SDK for Python to create
a workspace (space for an experiment), compute resources, and datastore resources.
3- Consistency:
Ensure that information stored in one place matches relevant data stored in another.
Ensure that received data values match the expected values
4- Accuracy
Data should correctly reflect the real world
Ensure that data are correct and free of mistakes due to misspelling or data sourcing
5- Timeliness:
Ensure that up to date information is available
Ensure that the refresh rate is consistent
6- Uniqueness: only one instance:
Ensure only one instance and no data duplication
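Three of the dimensions above (consistency, timeliness and uniqueness) lend themselves to simple automated checks. The sketch below uses only the Python standard library; the record layout, field names, one-hour freshness limit and sample values are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def find_duplicates(records, key):
    """Uniqueness: return key values that appear more than once."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, c in counts.items() if c > 1)


def find_stale(records, now, max_age):
    """Timeliness: return ids of records older than `max_age`."""
    return [r["id"] for r in records if now - r["updated_at"] > max_age]


def find_mismatches(source, target, key, field):
    """Consistency: return keys whose `field` differs between two stores."""
    target_by_key = {r[key]: r for r in target}
    return [r[key] for r in source
            if r[key] in target_by_key
            and target_by_key[r[key]][field] != r[field]]


# Hypothetical sensor readings held in two places.
now = datetime(2024, 1, 2, 12, 0)
lake = [
    {"id": "s1", "value": 20.5, "updated_at": datetime(2024, 1, 2, 11, 30)},
    {"id": "s2", "value": 19.0, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"id": "s1", "value": 20.5, "updated_at": datetime(2024, 1, 2, 11, 30)},
]
warehouse = [{"id": "s1", "value": 20.5}, {"id": "s2", "value": 18.0}]

duplicates = find_duplicates(lake, "id")
stale = find_stale(lake, now, max_age=timedelta(hours=1))
mismatches = find_mismatches(lake, warehouse, "id", "value")
print(duplicates, stale, mismatches)
```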