Machine Learning in Mechanical and Plant Engineering: Quick Guide
© 2018
VDMA Software and Digitalization
Lyoner Straße 18
60528 Frankfurt am Main, Germany
sud.vdma.org
All rights reserved, in particular those of duplication, distribution and translation. No part of this work
may be reproduced in any form (print, photocopy, microfilm or other process), nor may it be stored,
edited, duplicated or distributed by means of electronic systems, without written permission from the
VDMA.
Foreword
Everybody is talking about machine learning and artificial intelligence. These technologies and their tools are constantly being improved, expanding rapidly from the consumer into the industrial sector and thus to the mechanical and plant engineering industry.

For years, or even decades, this field was the preserve of academics and only gained a foothold in select sectors. Over the past ten years, falling prices for computing power and storage have led to rapid advances in cloud and big data technologies. These two factors quickly led to the development of various software technologies that had an impact on artificial intelligence.

Machine learning is an important field of computer science and a subdiscipline of artificial intelligence. Computer programs based on machine learning can use algorithms to autonomously find solutions for new and unknown problems. Artificial systems recognize patterns and regularities in the training data they are supplied with. Tools that are already established on the market assist in finding the algorithms. New frameworks and platforms are aiding the widespread application of this previously rather academic subject matter in work on everyday projects. For mechanical

With the aid of machine learning, software and information technology are becoming increasingly important drivers of innovation in the mechanical engineering industry.

This Quick Guide was written by the machine learning expert group at the Software and Digitalization Association. Its primary audience is managers at mechanical engineering companies who are interested in evaluating machine learning’s potential for their businesses. It provides information on opportunities, challenges and possible solutions. But above all, this Quick Guide is intended to aid readers in approaching the subject with the right questions and drawing their own conclusions. Given the pace of change in the field, this Quick Guide cannot address the topic exhaustively and in depth. The members of the expert group at VDMA are available to provide advice about machine learning and will continue to follow developments in this demanding field.

Guido Reimann
VDMA
Software and Digitalization
4 MACHINE LEARNING — QUICK GUIDE
Machine learning (ML) and artificial intelligence (AI) are omnipresent: in the media, in the workplace, and in private life. Digitalization is dramatically changing all aspects of society, including the manufacturing industry. The German mechanical engineering industry in particular will need to address the resulting challenges if it is to maintain and expand its current global product leadership role in many sectors.

But just what is ML? It is literally that: machines that learn. Machines, in this case computers, gain the ability to learn autonomously. ML deals with

probability. Such algorithms do not follow rigid programs and rules laid down by humans; they make data-based predictions by generating knowledge based on examples, in other words, by learning.

As early as 1959, the pioneering American computer scientist Arthur Samuel defined ML as a field of study that “gives computers the ability to learn without being explicitly programmed.” Like data mining, it comes from statistics. The difference is that statistics explains what has happened, data mining explains why something has happened, and ML determines what will happen and specifies how certain situations can be improved or avoided.
Figure 1: Boundaries of AI, ML, NN, DL (source: Softing GmbH)
amounts of heterogeneous data, and their speeds and data types both with or without ML.

This document does not address AI or big data but concentrates on ML.

What role can ML play in product leadership for the mechanical engineering industry? Maintaining and expanding it in many areas as a prerequisite.

There are many possibilities, and the majority of the applications are not yet foreseeable. However, there are already many obvious practical applications and solutions that enable businesses in the mechanical engineering industry to get a quick start in ML and also to expand their capabilities. There are possibilities in process optimization as well as in maintaining and expanding leadership in product innovation.

However, some key issues are cause for uncertainty, such as the necessary knowledge about how to select, develop and configure the relevant algorithms, how to acquire and prepare the data, and not least the indispensable further experience with these factors. Unclear legal aspects and implications also deter businesses from making meaningful investments in ML. Many machinery manufacturers find it difficult even to investigate and define a specialized application or project.

technologies. As a result, they often narrow their focus to the technology and lose sight of the value creation benefits for their business.

ML is already used to support critical decision-making in medical diagnosis, financial markets and the energy sector, or even to make decisions automatically. Our everyday life is also shaped by ML; one need only think of all the product recommendations we receive daily. They are based on our previous purchases, our search behavior, and the terms we enter into the devices we use. But ML is increasingly finding its way into the mechanical and plant engineering industry and enabling new applications, which will be discussed in chapter 3.

Why is ML becoming so important just now? The answer is as simple as it is complex. What was still impossible yesterday is now routine. Computing power on a scale inconceivable ten years ago is now affordable. When this computing power for large amounts of data is combined with the ability to learn, algorithms can be continuously improved. All of these are important contributors to rapid progress.

This Quick Guide is for decision makers who want to learn more about ML. First we explain what is meant by ML and the opportunities and risks this technology brings to the mechanical engineering industry. Then we present a few typical applications.
1.1 Learning from data

ML enables technical systems to do something previously only possible for living beings: learn from experience. ML algorithms learn patterns and structures in sample data prepared by humans and then apply this new knowledge to new and unknown cases. For example, from a large number of sample images, ML algorithms learn how a good part looks in a camera image. The system then uses these self-defined views to detect and separate out bad parts. In this way, the machine learns to distinguish good parts from bad ones.

But how does learning from data and then applying the learned knowledge work in practice? Let us begin by explaining a few terms. A fundamental concept for learning is the model, which contains the learned knowledge and is used to make predictions. As a rule, models are only designed for a single task. For example, using sensor data (input), the probability of a malfunction is predicted (output). Another important concept is model training, in which the model is taught through data input. Models are normally trained once and then used for predictions. ML algorithms can be differentiated by the way they learn from data or how their models are trained, with three categories used:
Supervised learning
Unsupervised learning
Reinforcement learning
Figure 2: ML types (supervised learning: classification, regression; unsupervised learning: cluster assignment, categorization; reinforcement learning: reward criteria)
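The terms model, training and prediction from section 1.1 can be made concrete with a small supervised learning sketch. It is only an illustration under stated assumptions: the vibration readings and labels are invented, and logistic regression fitted by plain gradient descent is one simple choice of technique, not one prescribed by this guide.

```python
import math

def train_logistic(samples, labels, lr=0.1, epochs=5000):
    """Model training: fit p(malfunction) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b  # the "model": the learned knowledge as two numbers

def predict_probability(model, x):
    """Prediction: sensor value in (input), malfunction probability out (output)."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

# Hypothetical labeled examples: vibration level and whether a malfunction
# followed (1) or not (0). Real projects would use recorded machine data.
vibration = [1.0, 1.5, 2.0, 2.5, 3.0, 6.0, 6.5, 7.0, 7.5, 8.0]
malfunction = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

model = train_logistic(vibration, malfunction)  # trained once
print(predict_probability(model, 2.0))          # low vibration
print(predict_probability(model, 7.0))          # high vibration
```

The model is trained once and then reused for every new reading, which mirrors the "trained once and then used for predictions" pattern described above.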
An example for improving a company’s own products with ML is predictive maintenance. Using the large amount of information gathered while operating a machine, problems can be recognized early and their maintenance can be planned before a customer’s production has to be unexpectedly interrupted at a usually inopportune time. The machine manufacturer’s customer can therefore plan and integrate the machine’s maintenance efficiently in its internal processes. If the sensors for predictive maintenance are used in combination with software, this would enable fine-tuning of the machine’s efficiency for additional optimization.

Another field for ML: simplifying machine operation with expert systems. This results in reduced familiarization time, training costs and set-up times, while simultaneously increasing efficiency. ML can enable both the machinery manufacturer and its customers to optimize their processes.

“Reduce training costs and set-up times”

To promote the differentiation and leadership of its products, a company should perform detailed evaluations of additional potential customer benefits from a use case and develop them in cooperation with the customer. Suitable ML algorithms for the value-added services would not be used at the customer’s site but at the mechanical engineering manufacturer. New customer services could be developed from these algorithms, as could new pricing models for the use of the machinery. Ultimately, ML will lead to zero-defect quality and maximum adherence to schedules. And the acquired data can provide efficient support for any required documentation.

“Avoid starting with projects that are too large or too complex”

But how is a successful start in this field possible? Studies show that many companies make their first projects too large and complex while failing to successfully address many relevant aspects. The following questions should be answered before starting a project: What application scenario is available as an entry project for my company? How can we gradually build up the required knowledge in our company?

Which ML technologies and algorithms are relevant for me? How can the results of ML algorithms avoid business risks, or at least mitigate them early? The availability of the required data is also crucial and plays an important role.

There is one more important question to answer: Who is responsible when people delegate decisions to machines, for example if a model is formally correct but delivers incorrect or negative results because it learned from incorrect or unsuitable data?

If a successful ML project is to be set up, experts must first examine these questions in a structured manner and then answer them in detail, particularly with regard to relevance, risks and the need for investment for the company. At the same time, one must avoid being so discouraged by the scope of the issues that one misses the starting gun. As always, it is important to simply get started.
In this chapter, we present a few typical use cases for ML in mechanical engineering. Each of the use cases is described briefly and a possible implementation strategy explained. Then we discuss the benefits, the required capabilities, and the costs, opportunities and risks. We will go into the technical implementation of these use cases in more detail in chapter 6.

3.1 Human-like machine vision

Judging surface textures is a task in which traditional image processing systems reach their limits, while the human eye can recognize textures, patterns, objects and structures and can reliably judge and classify them visually after only a brief training period. With only a few examples, humans can learn to distinguish permissible variations from defects, even in natural objects of which no two are identical.

All kinds of sensors can be used for the imaging process used with human-like machine vision, including 2D, 3D, ultrasound, X-ray and shape from shading. The ML application works from a training phase with good parts; in contrast, extensive defect catalogs must be used with traditional image processing applications. With ML, the desired result, and not the deviation from it, is thus the standard.

The process-reliable solution for such tasks consists of ML-based image processing systems that are specially developed and optimized for the industrial analysis of images.

The use of such systems based on ML opens up further potential applications for reliable automated inspection with very high detection levels. Where traditional vision systems reach their limits and human judgment is the best solution in spite of its risks and limitations, human-like machine vision based on ML algorithms currently offers a state-of-the-art solution. New products can be learned without great effort and even new and unknown characteristics can be detected without extensive defect libraries, resulting in considerably shorter development and product launch periods.
Figure: example parts: strainer (metal, 2D); adhesive bead (seals, adhesive beads, 3D)
Despite all of the possibilities, ML has limitations and is not the tool of choice for every case.

Important limiting factors can include:

At the same time, other parameters such as consumables, physical factors and the condition of the machine also influence the printed products in an unknown way.
The dead time can be bridged with model-based adaptive control. The relationship between sensor data and the quality of the solid density is learned from historical data and used as a feedback variable in the control system. This enables adjustment of the process parameters even before measured values for the density are available. Use of predictive control increased productivity and resource efficiency significantly. On average, waste during start-up was reduced by 37% and the required time by 39%. This kind of control is called predictive control.

Developing an adaptive control system requires an in-depth understanding of the process. It also calls for knowledge about the use of machine learning processes for time series regression. The process of model-based predictive control can also be applied to other problems. An approach involving machine learning processes is always promising when many measurable variables influence the process in an unknown way. As in the other use cases, sufficient data is an absolute requirement for adaptive control.

This variety of machine configurations is also a major challenge for the tendering process. For very complex machines, the tendering process with different machine versions and their pricing can drag on for weeks, often leading to delays in preparing bids and possibly endangering sales. A smart tendering system can automate parts of the tendering process, making it faster and reducing costs.

Information and comprehensive data about previous bids, machine configurations and prices can be used for the semi-automated preparation of future bids. Assuming that a similarly configured product would result in a similar cost structure, ML algorithms are used to train a model that learns the correlations between machine configurations and costs. Then this model is used to estimate the costs for a machine configuration and prepare an initial bid (see figure 6). Rapid preparation of bids can increase the likelihood of sales and thus increase revenues. Other advantages include simplification of the tendering process and reduced scope for errors.
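The smart tendering assumption that similarly configured machines have similar costs can be sketched with a nearest-neighbor estimate. This is a minimal illustration, not the method used in the guide's example: the bid history, the option names and the choice of k-nearest-neighbor averaging are all assumptions made here for clarity.

```python
def estimate_cost(history, config, k=3):
    """Average the costs of the k most similar past configurations."""
    def distance(a, b):
        # Count the options that differ between two configurations.
        return sum(1 for key in a if a[key] != b.get(key))

    ranked = sorted(history, key=lambda bid: distance(config, bid["config"]))
    nearest = ranked[:k]
    return sum(bid["cost"] for bid in nearest) / len(nearest)

# Hypothetical bid history; a real system would read this from a database.
history = [
    {"config": {"axes": 3, "cooling": "air", "automation": "none"}, "cost": 100_000},
    {"config": {"axes": 3, "cooling": "water", "automation": "none"}, "cost": 110_000},
    {"config": {"axes": 5, "cooling": "water", "automation": "robot"}, "cost": 180_000},
    {"config": {"axes": 5, "cooling": "air", "automation": "robot"}, "cost": 170_000},
    {"config": {"axes": 5, "cooling": "water", "automation": "none"}, "cost": 150_000},
]

new_machine = {"axes": 5, "cooling": "water", "automation": "robot"}
initial_bid = estimate_cost(history, new_machine, k=3)
print(f"Approximate cost for the initial bid: {initial_bid:.0f}")
```

The estimate only serves as a starting point for the initial bid; the exact price would still be calculated by an expert.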
With the traditional sequence of algorithms -> data -> decisions, overall system effectiveness cannot be better than the person who programmed it. In contrast, ML algorithms which are applied to large amounts of production data recognize causalities that had previously been hidden from machinery manufacturers and system operators, but that improve overall system effectiveness. This means that quality and availability can be improved reactively, while the performance of future machines can also be improved proactively.

The current industrial analytics solution can only suggest possible correlations in the supplied data. The domain expert of a machinery manufacturer or system operator assesses the correlations recognized by the ML algorithms as chance or as actual causality. Based on these correlations which domain experts recognized as causalities in the data, decisions about production optimization are now made. In the future, a domain expert will initiate data analysis in the field of guided analytics, but the process will otherwise continue automatically; in autonomous analytics the entire data analysis process is automated.
Data are becoming the most important currency of the 21st century and are the foundation of ML. More data were generated in the last two years than in all of human history. Data are increasingly becoming a factor in production along with land, capital and labor. They enable cost reductions and new business models. The increasing importance of data is causing a change in the sequence from algorithms -> data -> decisions to data -> algorithms -> decisions (figure 7), a factor that represents the revolution that is currently taking place.
Figure 8: Shift from rule-based to data-driven decision making. (source: Softing GmbH)
Since the first programmable chip, the Intel 4004, was launched in 1971, software development has always followed the same pattern: First the problem is defined, then objectives and work steps are specified, and finally the application is programmed as a sequence of algorithms. In practice, these algorithms are supplied with data, and users reach decisions based on the results. This approach is currently undergoing a structural change; the data are now gathered in advance and analyzed with generally valid algorithms in the second step. This results in causalities, upon which decisions are made, for example to optimize production, and these decisions are increasingly often being made autonomously.

System data were often collected in the past but not processed. Today, they are used in production optimization to increase overall system effectiveness, called overall equipment effectiveness (OEE). Data from a normally functioning machine or system are used to train a model with limit values. As soon as the model receives deviating data, it sounds the alarm because the deviating data represent defective components, machines or processes.

If ML algorithms are the engine for future development, data are the fuel.
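The idea of training a model with limit values on data from a normally functioning machine can be sketched as follows. This is an illustration under assumptions: the temperature readings are invented, and using the mean plus or minus three standard deviations as limits is one common simple choice, not a rule stated in this guide.

```python
import statistics

def learn_limits(normal_values, k=3.0):
    """Derive lower and upper limit values from data of a healthy machine."""
    mean = statistics.fmean(normal_values)
    std = statistics.stdev(normal_values)
    return mean - k * std, mean + k * std

def is_anomaly(value, limits):
    """A reading outside the learned limits counts as deviating data."""
    low, high = limits
    return value < low or value > high

# Hypothetical bearing temperatures recorded during normal operation.
normal_temperatures = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 70.3, 69.7]
limits = learn_limits(normal_temperatures)

for reading in [70.1, 69.9, 74.5]:
    if is_anomaly(reading, limits):
        print(f"ALARM: reading {reading} is outside the learned limits")
```

Whether an alarm really points to a defective component still has to be confirmed by someone who knows the machine.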
ML does not find its solution with rule-based software code written by humans, thus differing from traditional applications. It is important to know the nature of machine learning: There is a pattern in the form of data that we cannot grasp using “IF THIS THEN THAT” lines of program code. But algorithms find this pattern in the data. In production it is typically a causality between component status and physical measurements.

With the traditional sequence of algorithms -> data -> decisions, the OEE cannot be better than the understanding of the person who programmed the OEE influencing factors. In contrast, ML algorithms applied to large amounts of production data can find causalities that can improve the OEE, but have always been hidden from system operators.

Which data do we need for ML? And in what quantity? What role does big data play? Big data is a collective term for large amounts of data with different data types from different sources that are provided at different rates.

The amount, the rate and the variety of current production data exceed the abilities of the operating staff and call for new, data-based approaches. But access to high-quality data is a prerequisite for even being able to use ML.

This includes data from control systems, data from sensors, actuators and databases, and production flow data or weather data from additional sources. But how can we tell whether the amount of data from automation components and field devices is enough? The answer is simple: Once algorithms detect patterns, they have been fed enough.

This means it is impossible to predict which data are needed for the required model quality and precision. More data do not necessarily lead to more patterns, at best to more correlations, but in any case not necessarily to more causalities, as these can only be confirmed by a domain expert.

At the beginning of an ML project, the optimization potential for a system or machine is determined, and along with it, the commercial value of the data as part of the general business understanding process. In the collection phase, data are gathered and processed. Outliers are deleted, erroneous entries eliminated, time stamps compared, metadata added, and the cleaned data formatted. In the analysis phase, the data are processed, modeled and analyzed. In the implementation phase, a data-based production optimization is installed, for example a model used to detect anomalies.

Data are everything for machine learning. Without data there is no ML.
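The cleaning steps of the collection phase (deleting outliers, eliminating erroneous entries, comparing time stamps, adding metadata, formatting) can be sketched in a few lines. The record layout, the plausible value range and the machine name below are hypothetical, and "comparing time stamps" is interpreted here as sorting by time and dropping duplicates, which is an assumption.

```python
from datetime import datetime

def clean_records(raw_records, valid_range, machine_id):
    """Drop bad rows, order by time stamp, and tag each row with metadata."""
    low, high = valid_range
    cleaned = []
    for record in raw_records:
        try:
            timestamp = datetime.fromisoformat(record["time"])
            value = float(record["value"])
        except (KeyError, TypeError, ValueError):
            continue  # erroneous entry: missing or unparsable field
        if not low <= value <= high:
            continue  # outlier: outside the physically plausible range
        cleaned.append({"time": timestamp, "value": value, "machine": machine_id})

    cleaned.sort(key=lambda row: row["time"])  # compare time stamps
    deduplicated = []
    for row in cleaned:
        if deduplicated and deduplicated[-1]["time"] == row["time"]:
            continue  # same time stamp recorded twice
        deduplicated.append(row)
    return deduplicated

# Hypothetical raw sensor export.
raw = [
    {"time": "2018-03-01T08:00:00", "value": "70.2"},
    {"time": "2018-03-01T08:00:10", "value": "n/a"},    # erroneous entry
    {"time": "2018-03-01T08:00:20", "value": "999.0"},  # outlier
    {"time": "2018-03-01T08:00:05", "value": "70.4"},   # out of order
    {"time": "2018-03-01T08:00:05", "value": "70.4"},   # duplicate
]

result = clean_records(raw, valid_range=(0.0, 200.0), machine_id="press-01")
for row in result:
    print(row["time"].isoformat(), row["value"], row["machine"])
```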
5. Data-driven modeling
No refinery without crude oil. No cosmetics and no gasoline without refineries. Once the prerequisite has been created and data are available, the next question swiftly follows: How can useful information be gleaned from them? American high-tech companies active in the B2C sector were able to provide an entire sector with a one-size-fits-all solution for applications such as online retail and web searches. In doing so, they followed a top-down approach. However, these solutions do not work for the highly diversified B2B sector in the mechanical and plant engineering industry. Processes and procedures differ from one company to another, as do the use cases, the business cases and the types of sensors used. All of this calls for a solution that has to be tailored to individual companies and their business objectives. A bottom-up approach must therefore be selected on the way to data-driven value creation. A model that has proven itself in practice in many projects offers ways to simplify the sequence of an ML process.

The Cross-Industry Standard Process for Data Mining, commonly known by its acronym CRISP-DM, is well suited for this purpose. This data mining process model describes common approaches used by analytics experts to tackle problems, a process which can be easily applied to ML projects.

CRISP-DM divides the ML analytics process into six phases. In a structured approach, this process can be repeated without limitations, making it suitable for agile development based on the scrum method since it can be divided into sprints.

The sequence for the phases is not strict; what is needed is constant switching between the various phases. The arrows in the process diagram show the most important and most common dependencies between the phases. The outer circle in the diagram symbolizes the cyclical character of the ML approach itself. After a cycle, the insights gained can trigger new and often focused business questions; subsequent ML processes benefit from the experiences of previous processes.

The individual phases are as follows:

Business understanding
This first phase involves understanding the project objectives and requirements from the business perspective. This knowledge is used to define the data mining problem and draw up a preliminary plan for achieving the objectives.

Data understanding
This phase begins with initial data collection, followed by familiarization with the data, identification of data quality problems, and the first insights into the data to discover interesting subsets. This subsequently enables the development of hypotheses on hidden information.
Data preparation
This phase includes all activities for creating the final set of data, in particular data fed into the modeling tools from the original raw data. The tasks include selecting tables, data sets and attributes, and cleaning and transforming data for modeling tools. This is the largest task of the entire analytics project, with 50 to 75 percent of the effort.

Modeling
The modeling techniques are selected and applied in this phase and their parameters are calibrated to optimum values. There are typically several techniques for the same type of problem. Some techniques have special requirements for the form of the data, meaning it is often necessary to return to the data preparation phase. Once the data have been correctly prepared, several hundred thousand proposals or a wide range of models can be generated in a short time. That is why only about 10% of the total effort of the entire CRISP-DM-based analysis phase originates in this step. However, if the data preparation was only mediocre, the results of modeling will also be mediocre and it will not be possible to generate any more valid results after the next phase.
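The modeling phase, in which several techniques are tried on the same problem and the best one is kept, can be sketched as a comparison on held-out data. The two candidate techniques (a constant mean and a straight-line fit), the data points and the train/validation split are assumptions chosen only to keep the illustration short.

```python
def fit_mean(xs, ys):
    """Candidate 1: always predict the average of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate 2: least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical prepared data, split into a training and a validation part.
train_x, train_y = [1, 2, 3, 4, 5, 6], [3.1, 4.9, 7.2, 9.0, 10.8, 13.1]
valid_x, valid_y = [7, 8], [15.0, 17.1]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {
    name: mean_squared_error(fit(train_x, train_y), valid_x, valid_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

Because the candidates are judged on data they were not trained on, the comparison also gives a first impression of what the later evaluation phase will examine more carefully.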
Evaluation
By this phase of the project, one or more models have been created that appear to have high quality from the data analysis perspective. Before final deployment of the model can proceed, the model has to be carefully evaluated and the steps in its construction must be examined. This is the only way to ensure that it satisfies the company objectives. One main objective is to determine whether an important business problem was insufficiently accounted for. Decisions about whether and how to use the analytics results are made at the end of this phase.

Deployment
A finished model is usually not the end of the project. Even if the purpose of the model is to increase knowledge about the data, the knowledge gained needs to be organized and presented to benefit the customer. Depending on requirements, the implementation phase can be as simple as preparing a report or as complex as implementing repeated data scoring, for example a segment assignment. In many cases, the customer and not the data analyst performs the deployment steps. Even if the analyst uses the model, the customer needs to understand the actions to be performed in order to be able to use the created models.
In this chapter, we describe the technical implementation of the use cases from chapter 3.

6.1 Human-like machine vision

Traditional image processing systems often reach their limits with tasks that can be performed easily by humans. Human-like machine vision was developed with the aim of reproducing the strengths of humans in a technology for industrial image processing. This technology was developed by experts in image processing, data processing and neural networks. The basis was formed from insights into the workings of the human brain that were gained by neuroscientists. Human-like machine vision was developed and optimized specifically for industrial image analysis and operates on a self-learning basis. It is based on multistage neural networks that can be configured with only a few parameters. Human-like machine vision includes three industrial image processing tools that were developed and optimized for different tasks:

Discovery of quality anomalies for quality inspections
Localizing and identifying single or multiple features
Classification of objects or entire scenes

Traditional vision systems are trained with images of defects, objects or scenes. The system can detect and classify exactly these defects, and only these, in the objects or scenes; it does not recognize deviations as defects. In human-like machine vision, the algorithms are trained with images of typical good parts and correct objects or scenes. The system learns, similarly to a human, how a good part or an object or a scene can appear, with all allowed variations and deviations. Anything that deviates from these expected images will be recognized by both a human and the system as an anomaly. Conversely, anything that corresponds to the expected images of good parts and typical objects or scenes will be classified by both a human and the system as meeting expectations.

6.2 Adaptive control for process optimization

In adaptive control, an ML model is trained with historical data to recognize the relationship between process-influencing factors and the process quality resulting from them. This estimated process quality is then controlled via process parameters without actually being measured. Supervised learning algorithms are used to train the model.
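The adaptive control idea of section 6.2, where a model estimates the process quality and that estimate is used for control before a measurement exists, can be sketched in a strongly simplified form. Everything concrete here is an assumption: a single influencing factor (humidity) instead of the many used in practice, a straight-line model instead of a nonlinear one, invented historical values, and a simple proportional correction with a hypothetical gain.

```python
def fit_line(xs, ys):
    """Supervised learning step: least-squares fit of quality = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def control_step(model, sensor_value, target, setting, gain=0.5):
    """Adjust the process setting using the predicted, not the measured, quality."""
    slope, intercept = model
    predicted_quality = slope * sensor_value + intercept
    error = target - predicted_quality
    return setting + gain * error, predicted_quality

# Hypothetical historical data: humidity (%) and the resulting solid density.
humidity = [40, 45, 50, 55, 60]
density = [1.50, 1.45, 1.40, 1.35, 1.30]
model = fit_line(humidity, density)

# A new humidity reading arrives; the density itself has not been measured yet.
new_setting, predicted = control_step(model, sensor_value=58, target=1.40, setting=10.0)
print(f"predicted density {predicted:.2f}, new process setting {new_setting:.2f}")
```

Because the correction is based on the prediction, the setting can be changed during the dead time in which no measured density is available.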
In the example of controlling solid density, over two dozen parameters such as humidity, temperature and color properties served as input variables. The output variable of the model is the estimated solid density, which is controlled via process parameters even before actually being measured. Measured values for the influencing parameters and the corresponding quality results from several months were used for training. Suitable preliminary data processing eliminates inconsistent or ambiguous data sets and generates suitable characteristics. A condition is that the data sets include only examples for which the solid density was set satisfactorily for the influencing parameters.

The model is then able to correlate the complex and nonlinear relationships between the numerous influencing parameters and the solid density. Using the model enables suitable settings to be predicted successfully in many cases, even under previously unknown influences or an unknown combination of influencing parameters. Not only can the model reproduce knowledge, it can also impart some of its knowledge to new cases.

6.3 Smart tendering

In smart tendering, an ML model is trained with historical bid data to learn the relationship between machine configurations and costs. These estimated costs are then used for the bid. Supervised learning algorithms are used to train the model. The training is performed at regular intervals to ensure that as many current data as possible are taken into account.

Input variables for the model are all cost-relevant configuration options for the machine. From this, the model learns the sometimes very complex relationships between the options and their effects on the resulting costs, enabling it to estimate an approximate price for a machine configuration. After this initial approximate bid, an expert of course calculates the exact costs for the official price. This approach allows more than just estimates for the costs of machine configurations; with the same approach, historical data can be used to estimate expected project durations or a system’s output.

To train the model, the bid data have to be available in a structured, machine-readable form. A collection of bids in PDF format is normally insufficient. What is needed is a database of machine configurations and bid prices that takes changes and corresponding versions of machines and machine options into account. The price structure for machine options can change over time; this must also be taken into account. For smart tendering, along with model training there is also a focus on data deployment and data cleanup.

6.4 Data-driven innovation

Many decision makers are uneasy with the thought of storing their production data in the cloud. The alternative is an “edge” solution in which the data are stored on a standard IPC where they originate in the system or machine. This minimizes the danger of data theft or data loss.
Some cloud providers now have a “cloud on-premises” or edge analytics solution in their portfolio. This involves preliminary analysis of data on a gateway in the system; the data are then compared with several systems and locations in the cloud. In the edge-based industrial analytics approach, data are collected, analyzed offline and fed into a model. The model is then brought back to the system to evaluate the incoming data streams there. Analytics at the edge, on the other hand, processes the data in real time (figure 10).

“Analytics at the edge” takes place inside the firewall in the system without additional security measures. Only the results of the locally performed real-time analytics can be gathered in a public or a local cloud. For training the models, the cloud ideally keeps a larger selection of powerful machine learning algorithms available than the system on-site or the edge analytics provider. And production parameters for multiple systems can be compared in the cloud.
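The split between offline training and evaluation at the edge can be sketched roughly as follows. The "model" is deliberately trivial (a normal operating band derived from historical readings); the class and function names and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class ThresholdModel:
    """A deliberately simple model: the normal operating band of one signal."""
    lower: float
    upper: float

    def is_anomaly(self, value: float) -> bool:
        return not (self.lower <= value <= self.upper)


def train_offline(samples: list[float], n_sigma: float = 3.0) -> ThresholdModel:
    """Offline step: derive the band from collected historical readings."""
    centre, spread = mean(samples), stdev(samples)
    return ThresholdModel(centre - n_sigma * spread, centre + n_sigma * spread)


def evaluate_at_edge(model: ThresholdModel, stream) -> dict:
    """Edge step: score each incoming reading locally and keep only a summary."""
    seen = 0
    anomalies = []
    for value in stream:
        seen += 1
        if model.is_anomaly(value):
            anomalies.append(value)
    # Only this result would be passed on; the raw readings stay on-site.
    return {"readings": seen, "anomalies": anomalies}


# Made-up vibration readings.
model = train_offline([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])
summary = evaluate_at_edge(model, iter([1.0, 1.02, 5.0, 0.98]))
```

The point of the design is that `train_offline` can run wherever the historical data and computing power are available, while `evaluate_at_edge` needs only the finished model and the local data stream.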
7. Build or buy
Analyzing the data on-site is simpler than gathering them from data sources scattered around the world and then analyzing them together.

There are products on offer for these challenges which aid in gathering the data and executing the algorithms, such as devices for collecting machine data or cloud platforms with hardware and software solutions for ML. Integration in the in-house processes has to be performed by IT experts and software developers and has to be customized to meet the company’s needs. There are also many other things to consider, such as the legal aspects regarding how data are handled. Here too, there is usually no off-the-shelf solution. Every company will need its own special solutions.

7.2 Available hardware and software modules

The following table lists examples of software and hardware modules for ML. With these proven modules, in-house ML projects can be implemented quickly and economically. They are categorized as follows: hardware, programming languages, libraries, frameworks, services, cloud platforms and products. The categories are briefly explained below.

Programming languages are used to write ML algorithms. Speed of execution and ease of implementation of the algorithms are important criteria for programming languages. Software libraries and software frameworks include algorithms as finished modules. The available algorithms differ in scope, execution speed, and the programming language used (e.g. C++ or Python).

Services are finished software services to be integrated into in-house applications by a software developer. An example of a service is the detection of objects in an image. A special feature of such services is that often no data scientist is needed in order to use their functionality. The services in the table show the numerous ML possibilities that can be implemented without in-depth ML expertise.

Platforms offer additional functionalities such as automatic scaling of the required hardware for a large number of users, worldwide online availability of services, or easy integration of existing software products into in-house applications.

Products are specially tailored solutions for a specific problem and only need minimal adaptation.
Amazon Machine Learning | Amazon Machine Learning is a service with which developers can implement the technology for machine learning. | Amazon AWS | Platform | Data scientist | Cloud | https://aws.amazon.com/de/aml/
Cognitive Services | Services for image analysis, speech recognition, speech input, translation etc. | Microsoft | REST service API | Developer | Cloud | https://azure.microsoft.com/de-de/services/cognitive-services/
Pandas | Library (BSD license) that provides high-performance, easy-to-use data structures and data analysis for Python | Open source | Library | Data scientist | | https://pandas.pydata.org/
TensorFlow | Framework for AI and ML and deep neural networks in particular, developed mainly for Python users. | Open source | Framework | Data scientist | On-premise and cloud | https://www.tensorflow.org/
Watson Cloud | Services for image analysis, speech recognition, speech input, translation etc. | IBM | REST service API | Developer | Cloud | https://www.ibm.com/watson/developer/
8. Requirements
ML is a powerful tool, but not a cure-all. At the beginning of an initial project, it is necessary to consider the opportunities and risks and to gauge and quantify the costs and benefits, always with a clearly defined objective. If ML is being rolled out or used for the first time, there will most likely be a learning curve and some lean times to get through.

Everybody is talking about ML nowadays, but experience and expertise are still largely absent. New processes need to be developed, verified and validated, and experience has to be gained. Nearly every algorithm is a well-protected black box; as a result, the outcomes are not always easily understood or explained, which in turn can lead to longer rollout times. During rollout and initial use of an ML project, reliable backing from management is extremely important, and so is demonstrable success. Both keep up the motivation of the team and the leadership and weaken the arguments of doubters.

To develop ML and/or successfully roll it out, expertise in three areas is indispensable for the participating staff:

In the field of ML algorithms, we mostly find only “human experiences and impressions.” They are very difficult or impossible to measure or can only be made measurable with great effort, which is why it is so important to assess experiences and impressions. Possible assessment criteria are good/bad, valid/invalid or sufficient/insufficient. The results of such assessments play a central role in preparing and implementing ML systems; after all, the system can only learn what it is taught. Conversely, this means that if a system is based on incorrect information, it can or will learn incorrect things.

The question now is: for which problems does it make sense to consider introducing ML? Currently, the answer is generally those that call for detailed knowledge of the application, that is, domain expertise. This knowledge is often limited to only a few experts; they need to be challenged and supported. Expertise and a willingness on the part of experts to accept responsibility are the foundation of a successful ML rollout.

Another requirement for a successful ML rollout is data in sufficient quantity and quality, and access to it through a suitable network. The data need to be prepared so that incorrect data are corrected or deleted and missing data are added. A consistent and precise time stamp mechanism is also required so that chronologically correct conclusions can be drawn subsequently. Before a project begins, all sources should be known, as should the levels on which ML will be at work. A data map can aid in identifying the required data, as well as the types of data and the places where data are generated. If a map like this does not yet exist, it must be created at the beginning.

Along with data formats and the required connectivity with uniform standards, data content and the issue of its links to individuals are important criteria for further processing and handling. Personal data must be defined, along with the resulting obligations related to collecting, transferring, processing, storing and deleting them. This issue plays a role of increasing importance, especially in data analytics and thus also in ML, since statutory requirements can strongly affect business models.
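The data preparation requirement in this section (correcting or deleting incorrect data, adding missing data, and using consistent time stamps) can be sketched as follows. This is only one possible approach under stated assumptions: time stamps arrive as ISO 8601 strings with a UTC offset, implausible values are treated as missing, and gaps are filled by linear interpolation between the neighbouring readings. The function names and the sample readings are hypothetical.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse an ISO 8601 time stamp with UTC offset and convert it to UTC."""
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        raise ValueError(f"time stamp without offset is ambiguous: {stamp}")
    return parsed.astimezone(timezone.utc)


def prepare(raw, valid_range):
    """Return chronologically sorted (utc_time, value) pairs without gaps."""
    low, high = valid_range
    rows = sorted(((to_utc(s), v) for s, v in raw), key=lambda row: row[0])
    # Treat implausible values as missing so the model cannot learn from them.
    rows = [(t, v if v is not None and low <= v <= high else None) for t, v in rows]
    known = [i for i, (_, v) in enumerate(rows) if v is not None]
    cleaned = []
    for i, (t, v) in enumerate(rows):
        if v is None:
            before = [k for k in known if k < i]
            after = [k for k in known if k > i]
            if not before or not after:
                continue  # no neighbour on one side: drop instead of guessing
            (t0, v0), (t1, v1) = rows[before[-1]], rows[after[0]]
            fraction = (t - t0) / (t1 - t0)
            v = v0 + fraction * (v1 - v0)
        cleaned.append((t, v))
    return cleaned


# Made-up temperature readings from two sources with different UTC offsets.
raw = [
    ("2018-05-01T10:02:00+02:00", 21.0),
    ("2018-05-01T08:00:00+00:00", 20.0),
    ("2018-05-01T10:01:00+02:00", None),   # missing value
    ("2018-05-01T10:03:00+02:00", 999.0),  # implausible value
]
cleaned = prepare(raw, valid_range=(-40.0, 150.0))
```

Converting everything to UTC before sorting is what makes the chronological order reliable when sources report in different time zones.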
9. Outlook
It is only now becoming clear how some machinery manufacturers view digital value creation and what path they have taken toward a data-driven future. That also explains why they have been able to build out their lead over the competition. Valuable expertise develops with remarkable speed, building up its own momentum after a rather short initial phase. With this momentum, their lead over hesitant competitors increases exponentially, exceeded only by the increase in the amount of data available for analysis.

“Uncharted territory is where nobody has any experience because nobody has been there yet.”

Even for fast followers, it is not too late. These businesses are also beginning to regard data as a refined product of their own value creation chains and not mere “crude oil.” In its current technological phase, ML is reaching its first significant breakthrough with solutions that can be applied on a large scale; the ability to economically process data in parallel provided the foundation. In the next phase, ML will expand into all segments of the mechanical engineering industry, for example in self-optimizing process control systems.

This is why the new discipline of self-service or guided analytics has the defined objective of minimizing both the need for and the dependence on data scientists. In this way, the major gap between the sea of data and data analytics tools can be closed. Today the data science approach is largely manual and exploratory, which is why most of the tasks performed by data scientists need to be automated. In the case of guided analytics, for example, this would mean that data analysis is merely initiated by domain experts but otherwise proceeds automatically, enabling the domain experts to find the added value in “their” data themselves. In a more advanced phase, the entire data analysis process will be automated by “autonomous analytics,” from data entry to presentation of the results.

Viewed from the current state of the art, the path there is a long one, full of both challenges and opportunities. Those who fail to set out on this path now, who do not engage with ML and begin the transition to a data-driven business, will be even farther behind in a few years when the next stages arrive.
Members of the expert group who made significant contributions to this Quick Guide. We thank them for their creative ideas and fruitful discussions.
Figure 1: Boundaries of AI, ML, NN, DL (source: Softing GmbH)
Figure 2: ML types (source: VDMA machine learning working group)
Figure 3: Laboratory (source: i-mation GmbH)
Figure 4: Adjustment process (source: Fraunhofer IGCV)
Figure 5: Start-up (controlled and uncontrolled) (source: Fraunhofer IGCV)
Figure 6: Use of machine learning in bid preparation (source: VDMA machine learning working group)
Figure 7: Malfunction detected. (source: Softing GmbH)
Figure 8: Shift from rule-based to data-driven decision making. (source: Softing GmbH)
Figure 9: The six phases in the ML analytics process. (source: Wikipedia, https://www.crisp-research.com/publication/machine-learning-im-unternehmenseinsatz-kunstliche-intelligenz-als-grundlage-digitaler-transformationsprozesse/)
Figure 10: Real-time data processing (source: Softing GmbH)
VDMA
Software and Digitalization
Lyoner Str. 18
60528 Frankfurt am Main
Germany
Contact
Guido Reimann
Phone + 49 69 6603-1258
E-Mail guido.reimann@vdma.org
Cover image: © Shutterstock
sud.vdma.org