AI & Machine Learning
MODULE 4
Supervised and Unsupervised Learning in Machine Learning
The basic design issues and approaches to machine learning are illustrated by designing a program to learn to play checkers, with the goal of entering it in the world checkers tournament. The design involves the following choices:
1. Choosing the Training Experience
2. Choosing the Target Function
3. Choosing a Representation for the Target Function
4. Choosing a Function Approximation Algorithm
1. Estimating training values
2. Adjusting the weights
5. The Final Design
• The first design choice is to choose the type of training experience from which the
system will learn.
• The type of training experience available can have a significant impact on success or
failure of the learner.
There are three attributes which impact the success or failure of the learner:
1. Whether the training experience provides direct or indirect feedback regarding the
choices made by the performance system.
Indirect training examples consist of the move sequences and final outcomes of various games played. The information about the correctness of specific moves early in the game must be inferred indirectly from the fact that the game was eventually won or lost.
Here the learner faces an additional problem of credit assignment, or determining the degree to which each move in the sequence deserves credit or blame for the final outcome. Credit assignment can be a particularly difficult problem because the game can be lost even when early moves are optimal, if these are followed later by poor moves.
Hence, learning from direct training feedback is typically easier than learning from
indirect feedback.
2. The degree to which the learner controls the sequence of training examples
The learner might rely on the teacher to select informative board states and to provide the correct move for each. Alternatively, the learner might itself propose board states that it finds particularly confusing and ask the teacher for the correct move.
The learner may have complete control over both the board states and (indirect) training
classifications, as it does when it learns by playing against itself with no teacher present.
3. How well it represents the distribution of examples over which the final system
performance P must be measured
If its training experience E consists only of games played against itself, there is a danger
that this training experience might not be fully representative of the distribution of
situations over which it will later be tested.
In practice, it is often necessary to learn from a distribution of examples that is somewhat different from the one on which the final system will ultimately be evaluated.
The next design choice is to determine exactly what type of knowledge will be learned and
how this will be used by the performance program.
Let’s consider a checkers-playing program that can generate the legal moves from any board
state.
The program needs only to learn how to choose the best move from among these legal moves. Since we must learn to choose among the legal moves, the most obvious choice for the type of information to be learned is a program, or function, that chooses the best move for any given board state:
ChooseMove : B → M
which indicates that this function accepts as input any board from the set of legal board states B and produces as output some move from the set of legal moves M.
ChooseMove is a possible choice for the target function in the checkers example, but this function will turn out to be very difficult to learn given the kind of indirect training experience available to our system. An alternative target function is an evaluation function that assigns a numerical score to any given board state:
V : B → ℝ
which denotes that V maps any legal board state from the set B to some real value.
We intend for this target function V to assign higher scores to better board states. If the system can successfully learn such a target function V, then it can easily use it to select the best move from any current board position.
Let us define the target value V(b) for an arbitrary board state b in B, as follows:
• If b is a final board state that is won, then V(b) = 100
• If b is a final board state that is lost, then V(b) = -100
• If b is a final board state that is drawn, then V(b) = 0
• If b is not a final state in the game, then V(b) = V(b′),
where b′ is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
Let’s choose a simple representation: for any given board state, the function V̂ will be calculated as a linear combination of the following board features:
• x1: the number of black pieces on the board
• x2: the number of red pieces on the board
• x3: the number of black kings on the board
• x4: the number of red kings on the board
• x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
• x6: the number of red pieces threatened by black
Thus, the learning program will represent V̂(b) as a linear function of the form
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
Where,
• w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm.
• Learned values for the weights w1 through w6 determine the relative importance of the various board features in determining the value of the board.
• The weight w0 provides an additive constant to the board value.
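As a quick sketch, this linear evaluation function can be written directly in Python; the weight values below are hypothetical, not learned:

```python
# Sketch of the linear checkers evaluation function:
# V_hat(b) = w0 + w1*x1 + ... + w6*x6

def v_hat(weights, features):
    """weights = [w0, w1, ..., w6]; features = [x1, ..., x6]."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

# Hypothetical weights, and a board where black has 3 pieces (one a king).
weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
features = [3, 0, 1, 0, 0, 0]
print(v_hat(weights, features))  # 5.0
```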
In order to learn the target function V̂ we require a set of training examples, each describing a specific board state b and the training value Vtrain(b) for b.
For instance, the following training example describes a board state b in which black has won the game (note x2 = 0 indicates that red has no remaining pieces) and for which the target function value Vtrain(b) is therefore +100:
⟨⟨x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0⟩, +100⟩
The function approximation procedure requires two steps:
1. Derive training examples from the indirect training experience available to the learner
2. Adjust the weights wi to best fit these training examples
A simple approach for estimating training values for intermediate board states is to assign the training value Vtrain(b) for any intermediate board state b to be V̂(Successor(b)).
Where,
• V̂ is the learner's current approximation to V
• Successor(b) denotes the next board state following b for which it is again the program's turn to move
Vtrain(b) ← V̂(Successor(b))
The best set of weights is the one that minimizes the squared error E between the training values and the values predicted by the current hypothesis:
E ≡ Σ over training examples (b, Vtrain(b)) of (Vtrain(b) − V̂(b))²
Several algorithms are known for finding weights of a linear function that minimize E. One such algorithm is called the least mean squares, or LMS training rule. For each observed training example it adjusts the weights a small amount in the direction that reduces the error on this training example.
LMS weight update rule: For each training example (b, Vtrain(b))
• Use the current weights to calculate V̂(b)
• For each weight wi, update it as
wi ← wi + η (Vtrain(b) − V̂(b)) xi
Here η is a small constant (e.g., 0.1) that moderates the size of the weight update.
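A single LMS step can be sketched as follows; starting from all-zero weights is an assumption for illustration, and the feature vector is the winning board described earlier (Vtrain = +100):

```python
# One step of the LMS rule: w_i <- w_i + eta * (V_train - V_hat(b)) * x_i
ETA = 0.1  # the learning-rate constant eta

def v_hat(weights, features):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train):
    error = v_train - v_hat(weights, features)
    updated = [weights[0] + ETA * error]  # the constant weight w0 (x0 = 1)
    updated += [w + ETA * error * x for w, x in zip(weights[1:], features)]
    return updated

weights = lms_update([0.0] * 7, [3, 0, 1, 0, 0, 0], 100)
print(weights)  # [10.0, 30.0, 0.0, 10.0, 0.0, 0.0, 0.0]
```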
1. The Performance System is the module that must solve the given performance task by
using the learned target function(s). It takes an instance of a new problem (new game)
as input and produces a trace of its solution (game history) as output.
2. The Critic takes as input the history or trace of the game and produces as output a set of training examples of the target function.
3. The Generalizer takes as input the training examples and produces an output
hypothesis that is its estimate of the target function. It generalizes from the specific
training examples, hypothesizing a general function that covers these examples and
other cases beyond the training examples.
4. The Experiment Generator takes as input the current hypothesis and outputs a new
problem (i.e., initial board state) for the Performance System to explore. Its role is to
pick new practice problems that will maximize the learning rate of the overall system.
The sequence of design choices made for the checkers program is summarized in the figure below.
CONCEPT LEARNING
• Learning involves acquiring general concepts from specific training examples. Example:
People continually learn general concepts or categories such as "bird," "car," "situations in
which I should study more in order to pass the exam," etc.
• Each such concept can be viewed as describing some subset of objects or events defined
over a larger set
• Alternatively, each concept can be thought of as a Boolean-valued function defined over this
larger set. (Example: A function defined over all animals, whose value is true for birds and
false for other animals).
Consider the example task of learning the target concept "Days on which Aldo enjoys his favorite water sport".
Table: Positive and negative training examples for the target concept EnjoySport.
The task is to learn to predict the value of EnjoySport for an arbitrary day, based on the values of its other attributes.
Each hypothesis is a vector of six constraints, specifying the values of the six attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast. For each attribute, the constraint may be "?" (any value is acceptable), "Φ" (no value is acceptable), or a specific value (e.g., Warm).
If some instance x satisfies all the constraints of hypothesis h, then h classifies x as a positive example (h(x) = 1).
The hypothesis that Aldo enjoys his favorite sport only on cold days with high humidity is represented by the expression
(?, Cold, High, ?, ?, ?)
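A hypothesis in this representation can be evaluated against an instance with a short check; the attribute ordering follows the EnjoySport representation:

```python
# Evaluating a hypothesis: "?" accepts any value, a specific value must
# match exactly, and "Φ" (equal to no attribute value) rejects everything.

def satisfies(hypothesis, instance):
    """h(x) = 1 iff every constraint accepts the corresponding value."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

h = ("?", "Cold", "High", "?", "?", "?")
x1 = ("Sunny", "Cold", "High", "Strong", "Warm", "Same")
x2 = ("Sunny", "Warm", "High", "Strong", "Warm", "Same")
print(satisfies(h, x1))  # True
print(satisfies(h, x2))  # False (AirTemp is Warm, not Cold)
```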
Notation
• The set of items over which the concept is defined is called the set of instances, which is
denoted by X.
Example: X is the set of all possible days, each represented by the attributes: Sky, AirTemp,
Humidity, Wind, Water, and Forecast
• The concept or function to be learned is called the target concept, which is denoted by c. c can be any Boolean-valued function defined over the instances X:
c : X → {0, 1}
Example: The target concept corresponds to the value of the attribute EnjoySport
(i.e., c(x) = 1 if EnjoySport = Yes, and c(x) = 0 if EnjoySport = No).
• Instances for which c(x) = 1 are called positive examples, or members of the target concept.
• Instances for which c(x) = 0 are called negative examples, or non-members of the target
concept.
• The ordered pair (x, c(x)) is used to describe the training example consisting of the instance x and its target concept value c(x).
• D is used to denote the set of available training examples.
• The symbol H denotes the set of all possible hypotheses that the learner may consider regarding the identity of the target concept. Each hypothesis h in H represents a Boolean-valued function defined over X:
h : X → {0, 1}
The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X.
• Given:
• Instances X: Possible days, each described by the attributes
• Sky (with possible values Sunny, Cloudy, and Rainy),
• AirTemp (with values Warm and Cold),
• Humidity (with values Normal and High),
• Wind (with values Strong and Weak),
• Water (with values Warm and Cool),
• Forecast (with values Same and Change).
• Determine:
• A hypothesis h in H such that h(x) = c(x) for all x in X.
The inductive learning hypothesis: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
• Concept learning can be viewed as the task of searching through a large space of
hypotheses implicitly defined by the hypothesis representation.
• The goal of this search is to find the hypothesis that best fits the training examples.
Example:
Consider the instances X and hypotheses H in the EnjoySport learning task. The attribute Sky has three possible values, and AirTemp, Humidity, Wind, Water, and Forecast each have two possible values, so the instance space X contains exactly
3 · 2 · 2 · 2 · 2 · 2 = 96 distinct instances.
Each hypothesis constraint may additionally take the values "?" and "Φ", so there are
5 · 4 · 4 · 4 · 4 · 4 = 5120 syntactically distinct hypotheses within H.
Every hypothesis containing one or more "Φ" symbols represents the empty set of instances; that is, it classifies every instance as negative. Counting all such hypotheses as one, and allowing "?" or a specific value for each attribute, gives
1 + (4 · 3 · 3 · 3 · 3 · 3) = 973 semantically distinct hypotheses.
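These counts can be verified in a couple of lines:

```python
# Verifying the EnjoySport instance- and hypothesis-space sizes.
from math import prod

values = [3, 2, 2, 2, 2, 2]  # Sky has 3 values; the other attributes have 2

instances = prod(values)                    # 3*2*2*2*2*2
syntactic = prod(v + 2 for v in values)     # each attribute also allows ? and Φ
semantic = 1 + prod(v + 1 for v in values)  # one empty concept, plus ? per attribute

print(instances, syntactic, semantic)  # 96 5120 973
```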
• Consider the sets of instances that are classified positive by h1 and by h2.
• Because h2 imposes fewer constraints on the instance, it classifies more instances as positive. So, any instance classified positive by h1 will also be classified positive by h2. Therefore, h2 is more general than h1.
Given hypotheses hj and hk, hj is more-general-than-or-equal-to hk if and only if any instance that satisfies hk also satisfies hj.
Definition: Let hj and hk be Boolean-valued functions defined over X. Then hj is more-general-than-or-equal-to hk (written hj ≥ hk) if and only if
(∀x ∈ X) [(hk(x) = 1) → (hj(x) = 1)]
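For a space as small as EnjoySport, this relation can even be checked by brute force over all 96 instances; the sketch below is only practical for tiny instance spaces:

```python
# Brute-force check of the more-general-than-or-equal-to relation.
from itertools import product

ATTR_VALUES = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"),
               ("Normal", "High"), ("Strong", "Weak"),
               ("Warm", "Cool"), ("Same", "Change")]

def satisfies(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(hj, hk):
    """hj >= hk iff every instance satisfying hk also satisfies hj."""
    return all(satisfies(hj, x)
               for x in product(*ATTR_VALUES) if satisfies(hk, x))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 drops the Wind constraint
print(more_general_or_equal(h1, h2))  # False
```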
• In the figure, the box on the left represents the set X of all instances, and the box on the right the set H of all hypotheses.
• Each hypothesis corresponds to some subset of X: the subset of instances that it classifies positive.
• The arrows connecting hypotheses represent the more-general-than relation, with the arrow pointing toward the less general hypothesis.
• Note that the subset of instances characterized by h2 subsumes the subset characterized by h1; hence h2 is more-general-than h1.
FIND-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
For each attribute constraint ai in h
If the constraint ai is satisfied by x
Then do nothing
Else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
To illustrate this algorithm, assume the learner is given the sequence of training examples from the EnjoySport task, and that h is initialized to the most specific hypothesis in H:
h0 = ⟨Ø, Ø, Ø, Ø, Ø, Ø⟩
Observing the first training example, it is clear that this hypothesis is too specific. None of the "Ø" constraints in h are satisfied by this example, so each is replaced by the next more general constraint that fits the example:
h1 = ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩
The second training example forces the algorithm to further generalize h, this time substituting a "?" in place of any attribute value in h that is not satisfied by the new example:
h2 = ⟨Sunny, Warm, ?, Strong, Warm, Same⟩
Upon encountering the third training example, the algorithm makes no change to h, because the FIND-S algorithm simply ignores every negative example:
h3 = ⟨Sunny, Warm, ?, Strong, Warm, Same⟩
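The trace above can be reproduced with a short sketch of FIND-S; the three training rows follow the standard EnjoySport data set (the original training table is not reproduced in this document):

```python
# FIND-S: generalize the most specific hypothesis just enough to cover
# each positive example; negative examples are ignored.

def find_s(examples):
    h = None  # stands for the all-Ø (most specific) hypothesis
    for instance, label in examples:
        if label != "Yes":
            continue  # FIND-S simply ignores every negative example
        if h is None:
            h = list(instance)  # first positive: Ø -> the instance itself
        else:
            h = [hi if hi == xi else "?" for hi, xi in zip(h, instance)]
    return tuple(h)

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
]
print(find_s(D))  # ('Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same')
```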
Unanswered by FIND-S
• Has the learner converged to the correct target concept?
• Why prefer the most specific hypothesis?
• Are the training examples consistent?
• What if there are several maximally specific consistent hypotheses?
Classification:
Classification is a process of finding a function which helps in dividing the dataset into classes based on different parameters. In Classification, a computer program is trained on the training dataset and, based on that training, it categorizes the data into different classes.
The task of the classification algorithm is to find the mapping function to map the input(x) to the discrete output(y).
Example: The best example to understand the Classification problem is Email Spam Detection. The model is trained on the basis of millions of emails on different parameters, and
whenever it receives a new email, it identifies whether the email is spam or not. If the email is spam, then it is moved to the Spam folder.
o Logistic Regression
o K-Nearest Neighbours
o Support Vector Machines
o Kernel SVM
o Naïve Bayes
o Decision Tree Classification
o Random Forest Classification
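As a toy illustration of one algorithm from this list (Naïve Bayes) applied to the spam example, here is a from-scratch sketch; the four-message corpus and the word features are invented for illustration:

```python
# A toy Naive Bayes spam classifier with add-one (Laplace) smoothing.
from collections import Counter
import math

train = [("win money now", "spam"), ("limited offer win", "spam"),
         ("meeting at noon", "ham"), ("project meeting notes", "ham")]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(text.split())

vocab = set(w for c in word_counts.values() for w in c)

def classify(text):
    """Pick the class maximizing log P(class) + sum log P(word|class)."""
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) /
                              (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("win a free offer"))    # spam
print(classify("notes from meeting"))  # ham
```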
Decision Tree
Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree.
DECISION TREE REPRESENTATION
FIGURE: A decision tree for the concept PlayTennis. An example is classified by sorting it through the tree to the
appropriate leaf node, then returning the classification associated with this leaf
• Decision trees classify instances by sorting them down the tree from the root to
some leaf node, which provides the classification of the instance.
• Each node in the tree specifies a test of some attribute of the instance, and each
branch descending from that node corresponds to one of the possible values for
this attribute.
• An instance is classified by starting at the root node of the tree, testing the
attribute specified by this node, then moving down the tree branch corresponding
to the value of the attribute in the given example. This process is then repeated
for the subtree rooted at the new node.
APPROPRIATE PROBLEMS FOR DECISION TREE LEARNING
Decision tree learning is generally best suited to problems with the following characteristics:
1. Instances are represented by attribute-value pairs – Instances are described by a fixed set of attributes and their values.
2. The target function has discrete output values – The decision tree assigns a Boolean or multi-valued classification to each example.
3. Disjunctive descriptions may be required – Decision trees naturally represent disjunctive expressions.
4. The training data may contain errors – Decision tree learning methods are
robust to errors, both errors in classifications of the training examples and errors
in the attribute values that describe these examples.
5. The training data may contain missing attribute values – Decision tree
methods can be used even when some training examples have unknown values
• Decision tree learning has been applied to problems such as learning to classify
medical patients by their disease, equipment malfunctions by their cause, and
loan applicants by their likelihood of defaulting on payments.
• Such problems, in which the task is to classify examples into one of a discrete set
of possible categories, are often referred to as classification problems.
THE BASIC DECISION TREE LEARNING ALGORITHM
• Most algorithms that have been developed for learning decision trees are
variations on a core algorithm that employs a top-down, greedy search through the
space of possible decision trees. This approach is exemplified by the ID3
algorithm and its successor C4.5
What is the ID3 algorithm?
ID3 algorithm
ID3(Examples, Target_attribute, Attributes)
Examples are the training examples. Target_attribute is the attribute whose value is to be predicted by the tree. Attributes is a list of other attributes that may be tested by the learned decision tree. Returns a decision tree that correctly classifies the given Examples.
• Create a Root node for the tree
• If all Examples are positive, return the single-node tree Root with label = +
• If all Examples are negative, return the single-node tree Root with label = −
• If Attributes is empty, return the single-node tree Root with label = most common value of Target_attribute in Examples
• Otherwise Begin
• A ← the attribute from Attributes that best* classifies Examples
• The decision attribute for Root ← A
• For each possible value, vi, of A,
• Add a new tree branch below Root, corresponding to the test A = vi
• Let Examples_vi be the subset of Examples that have value vi for A
• If Examples_vi is empty
• Then below this new branch add a leaf node with label = most common value of Target_attribute in Examples
• Else below this new branch add the subtree
ID3(Examples_vi, Target_attribute, Attributes – {A})
• End
• Return Root
• The central choice in the ID3 algorithm is selecting which attribute to test at each
node in the tree.
• A statistical property called information gain measures how well a given attribute separates the training examples according to their target classification.
• ID3 uses information gain measure to select among the candidate attributes at
each step while growing the tree.
Gain(S, A) ≡ Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) Entropy(Sv)
Entropy(S) ≡ −p⊕ log2 p⊕ − p⊖ log2 p⊖
Where,
• Values(A) is the set of all possible values for attribute A
• Sv is the subset of S for which attribute A has value v
• p⊕ and p⊖ are the proportions of positive and negative examples in S
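Entropy and information gain can be sketched directly; the four-row Wind/PlayTennis table below is invented to show that a perfectly predictive attribute recovers all of the entropy:

```python
# Entropy and information gain as used by ID3.
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -sum p_i log2 p_i over the class proportions in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Gain(S, A) = Entropy(S) - sum |Sv|/|S| * Entropy(Sv)."""
    labels = [r[target] for r in rows]
    gain = entropy(labels)
    for v in set(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == v]
        gain -= (len(subset) / len(rows)) * entropy(subset)
    return gain

rows = [{"Wind": "Weak", "Play": "Yes"}, {"Wind": "Weak", "Play": "Yes"},
        {"Wind": "Strong", "Play": "No"}, {"Wind": "Strong", "Play": "No"}]
print(entropy([r["Play"] for r in rows]))  # 1.0
print(info_gain(rows, "Wind", "Play"))     # 1.0
```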
An Illustrative Example
• To illustrate the operation of ID3, consider the learning task represented by the
training examples of below table.
• Here the target attribute PlayTennis, which can have values yes or no for
different days.
• Consider the first step through the algorithm, in which the topmost node of the
decision tree is created.
• According to the information gain measure, the Outlook attribute provides the
best prediction of the target attribute, PlayTennis, over the training examples.
Therefore, Outlook is selected as the decision attribute for the root node, and
branches are created below the root for each of its possible values i.e., Sunny,
Overcast, and Rain.
Regression:
Regression is a process of finding the correlations between dependent and independent
variables. It helps in predicting the continuous variables such as prediction of Market Trends,
prediction of House prices, etc.
The task of the Regression algorithm is to find the mapping function to map the input
variable(x) to the continuous output variable(y).
Example: Suppose we want to do weather forecasting, so for this, we will use the Regression
algorithm. In weather prediction, the model is trained on the past data, and once the training is
completed, it can easily predict the weather for future days.
Linear regression
• Linear regression performs the task of predicting a dependent variable value (y) based on a given independent variable (x).
• So, this regression technique finds a linear relationship between x (input) and y (output). Hence the name Linear Regression.
• In the figure above, X (input) is the work experience and Y (output) is the salary of a person. The regression line is the best-fit line for our model.
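A minimal closed-form least-squares sketch; the experience/salary numbers are made up and happen to lie exactly on a line:

```python
# Simple linear regression: fit y = a*x + b by least squares.

def fit_line(xs, ys):
    """Closed-form least squares for one input variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

xs = [1, 2, 3, 4]      # years of experience (hypothetical)
ys = [30, 40, 50, 60]  # salary in thousands (hypothetical)
a, b = fit_line(xs, ys)
print(a, b)            # 10.0 20.0
print(a * 5 + b)       # predicted salary for 5 years: 70.0
```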
• In Regression, the output variable must be of continuous nature or a real value. In Classification, the output variable must be a discrete value.
• The task of the regression algorithm is to map the input value (x) to a continuous output variable (y). The task of the classification algorithm is to map the input value (x) to a discrete output variable (y).
• Regression algorithms are used with continuous data. Classification algorithms are used with discrete data.
• In Regression, we try to find the best-fit line, which can predict the output more accurately. In Classification, we try to find the decision boundary, which can divide the dataset into different classes.
• Regression algorithms can be used to solve regression problems such as weather prediction, house price prediction, etc. Classification algorithms can be used to solve classification problems such as identification of spam emails, speech recognition, identification of cancer cells, etc.
• Regression algorithms can be further divided into Linear and Non-linear Regression. Classification algorithms can be divided into Binary Classifiers and Multi-class Classifiers.
Support Vector Machine (SVM)
The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes so that we can easily put a new data point in the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support Vector Machine. Consider the below diagram in which there are two different categories that are classified using a decision boundary or hyperplane:
Example: SVM can be understood with the example that we have used in the KNN classifier. Suppose we see a strange cat that also has some features of dogs; if we want a model that can accurately identify whether it is a cat or a dog, such a model can be created by using the SVM algorithm. We will first train our model with lots of images of cats and dogs so that it can learn about their different features, and then we test it with this strange creature. The support vector machine creates a decision boundary between these two classes (cat and dog) and chooses the extreme cases (support vectors) of cat and dog. On the basis of the support vectors, it will classify the new example as a cat. Consider the below diagram:
SVM algorithm can be used for Face detection, image classification, text categorization, etc.
Types of SVM
SVM can be of two types:
o Linear SVM: Linear SVM is used for linearly separable data, which means if a dataset can be
classified into two classes by using a single straight line, then such data is termed as linearly
separable data, and classifier is used called as Linear SVM classifier.
o Non-linear SVM: Non-Linear SVM is used for non-linearly separated data, which means if a
dataset cannot be classified by using a straight line, then such data is termed as non-linear data
and classifier used is called as Non-linear SVM classifier.
The dimensions of the hyperplane depend on the features present in the dataset, which means
if there are 2 features (as shown in image), then hyperplane will be a straight line. And if there
are 3 features, then hyperplane will be a 2-dimension plane.
We always create a hyperplane that has a maximum margin, which means the maximum
distance between the data points.
Support Vectors:
The data points or vectors that are closest to the hyperplane and which affect the position of the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are called support vectors.
The working of the SVM algorithm can be understood by using an example. Suppose we have a
dataset that has two tags (green and blue), and the dataset has two features x1 and x2. We
want a classifier that can classify the pair(x1, x2) of coordinates in either green or blue. Consider
the below image:
As it is a 2-d space, just by using a straight line we can easily separate these two classes. But there can be multiple lines that can separate these classes. Consider the below image:
Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary or region is called a hyperplane. The SVM algorithm finds the closest points of the lines from both classes. These points are called support vectors. The distance between the vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin.
The hyperplane with maximum margin is called the optimal hyperplane.
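The margin idea can be illustrated by brute force: given a few candidate separating lines over hypothetical 2-D data, pick the one whose smallest point-to-line distance is largest. A real SVM solves this as a constrained optimization rather than comparing fixed candidates:

```python
# Comparing candidate separators by their margin (min point-to-line distance).
from math import hypot

blue = [(1, 2), (2, 1)]   # class -1 (hypothetical points)
green = [(4, 4), (3, 5)]  # class +1

def margin(w, b, points):
    """min |w.x + b| / ||w|| over all points."""
    norm = hypot(*w)
    return min(abs(w[0] * x + w[1] * y + b) / norm for x, y in points)

def separates(w, b):
    return (all(w[0] * x + w[1] * y + b < 0 for x, y in blue) and
            all(w[0] * x + w[1] * y + b > 0 for x, y in green))

# Two candidate lines: x + y = 5 (diagonal) and y = 3 (horizontal).
candidates = [((1, 1), -5), ((0, 1), -3)]
best = max((c for c in candidates if separates(*c)),
           key=lambda c: margin(*c, blue + green))
print(best)  # ((1, 1), -5): the diagonal leaves the larger margin
```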
Non-Linear SVM:
If data is linearly arranged, then we can separate it by using a straight line, but for non-linear
data, we cannot draw a single straight line. Consider the below image:
So to separate these data points, we need to add one more dimension. For linear data, we have
used two dimensions x and y, so for non-linear data, we will add a third dimension z. It can be
calculated as:
z = x² + y²
By adding the third dimension, the sample space will become as below image:
So now, SVM will divide the datasets into classes in the following way. Consider the below
image:
Since we are in 3-d space, it looks like a plane parallel to the x-axis. If we convert it to 2-d space with z = 1, it will become as below:
K-Nearest Neighbour (KNN)
o The KNN algorithm at the training phase just stores the dataset, and when it gets new data, it classifies that data into the category that is most similar to the new data.
o Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, but we want to know whether it is a cat or a dog. For this identification, we can use the KNN algorithm, as it works on a similarity measure. Our KNN model will find the features of the new data set most similar to those of the cat and dog images, and based on the most similar features it will put it in either the cat or the dog category.
The K-NN working can be explained on the basis of the below algorithm:
Suppose we have a new data point and we need to put it in the required category. Consider the
below image:
o Firstly, we will choose the number of neighbors, so we will choose the k=5.
o Next, we will calculate the Euclidean distance between the data points. The Euclidean distance is the distance between two points, which we have already studied in geometry. It can be calculated as:
d = √((x2 − x1)² + (y2 − y1)²)
o By calculating the Euclidean distance we got the nearest neighbors, as three nearest neighbors in
category A and two nearest neighbors in category B. Consider the below image:
o As we can see the 3 nearest neighbors are from category A, hence this new data point must
belong to category A.
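The steps above can be sketched from scratch; the 2-D points and the category labels are invented:

```python
# K-NN classification with Euclidean distance and majority vote.
from math import dist
from collections import Counter

data = [((1, 1), "A"), ((2, 1), "A"), ((1, 2), "A"),
        ((6, 6), "B"), ((7, 6), "B")]

def knn_classify(point, k=5):
    """Vote among the k training points nearest to `point`."""
    neighbors = sorted(data, key=lambda p: dist(point, p[0]))[:k]
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

print(knn_classify((2, 2), k=5))  # 'A': 3 of the 5 neighbors are category A
```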
Below are some points to remember while selecting the value of K in the K-NN algorithm:
o There is no particular way to determine the best value for "K", so we need to try some values to
find the best out of them. The most preferred value for K is 5.
o A very low value for K, such as K=1 or K=2, can be noisy and lead to the effects of outliers in the model.
o Large values for K are generally good, but the model may face some difficulties, such as higher computation cost.
Predictive Maintenance (PdM)
PdM extracts insights from the data produced by the equipment on the shop floor and acts on these insights. The idea of PdM goes back to the early 1990s and augments regularly scheduled, preventive maintenance. PdM requires the equipment to provide data from sensors monitoring the equipment as well as other operational data. Humans act based on the results of the analysis. Simply speaking, it is a technique to determine (predict) the failure of a machine component in the near future so that the component can be replaced, based on the maintenance plan, before it fails and stops the production process. PdM can improve the production process and increase productivity. By successfully applying PdM we are able to achieve the following goals:
• Reduce the operational risk of mission-critical equipment.
• Control the cost of maintenance by enabling just-in-time maintenance operations.
• Discover patterns connected to various maintenance problems.
• Provide Key Performance Indicators.
Usually PdM uses a descriptive, statistical or probabilistic approach to drive analysis and prediction. There are also several approaches which use Machine Learning (ML). Throughout the literature, the following types of maintenance in production can be found: reactive, periodic, proactive and predictive.
Case Study
In order to handle and use this technique we need various data from the machines in production. In this case study we used freely available data from a data source generated as a test data set for PdM, containing information about: telemetry, errors, failures and machine properties. The data can be found at Azure blob storage and is maintained by an Azure Gallery Article. Once the data is downloaded from the blob storage, local copies are used for further observations in this contribution.
Methodology
Usually, every PdM technique should proceed by the following three main steps:
• Collect Data – collect all possible descriptions, historical and real-time data, usually by using IoT (Internet of Things) devices, various loggers, technical documentation, etc.
• Predict Failures – the collected data can be transformed into ML-ready data sets, which are used to build a ML model that predicts the failures of the components in the set of machines in the production.
• React – by obtaining the information about which components will fail in the near future, we can activate the replacement process so that the component is replaced before it fails and the production process is not interrupted.
Data Preparation
In order to predict failures in the production process, a set of data transformations, cleaning, feature engineering, and selection must be performed to prepare the data for building a ML model. The data preparation part plays a crucial role in the model-building process because the quality of the data and its preparation directly influences the model's accuracy and reliability. The data used for this PdM use case can be classified as:
• Telemetry – historical data about machine behavior (voltage, vibration, etc.).
• Errors – data about warnings and errors in the machines.
• Maint – data about replacement and maintenance for the machines.
• Machines – descriptive information about the machines.
• Failures – data about when a certain machine is stopped due to component failure.
Errors data represents some of the most important information in every PdM system. The errors are non-breaking recorded events while the machine is still operational. In the experimental data set the error dates and times are rounded to the closest hour, since the telemetry data is collected at an hourly rate.
Failures data represents the replacements of components due to the failure of the machines. Once a failure happens, the machine is stopped. This is a crucial difference between errors and failures.
Maintenance data tells us about scheduled and unscheduled maintenance. The data set contains records which correspond both to regular inspection of components and to failures. To add a record to the maintenance table, a component must be replaced during a scheduled inspection or replaced due to a breakdown. Records created due to breakdowns are called failures. Maintenance contains the data from the years 2014 and 2015.
Machine data includes information about 100 machines which are the subject of the PdM analysis. The information includes model type and machine age. A figure shows the distribution of machine age categorized by model across the production process.
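As one simplified sketch of a typical labeling step in this preparation, telemetry records can be marked positive when a failure occurs within a fixed horizon after them; the timestamps and the 24-hour horizon below are illustrative assumptions, not part of the original data set description:

```python
# Label telemetry records that fall within 24 hours before a failure.
from datetime import datetime, timedelta

telemetry = [datetime(2015, 1, 1, h) for h in range(0, 24, 6)]  # hourly-ish samples
failures = [datetime(2015, 1, 1, 13)]                           # one recorded failure

def label(ts, failures, horizon=timedelta(hours=24)):
    """1 if a failure occurs within `horizon` after ts, else 0."""
    return int(any(timedelta(0) <= f - ts <= horizon for f in failures))

labels = [label(ts, failures) for ts in telemetry]
print(labels)  # [1, 1, 1, 0]: the 18:00 sample is after the failure
```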
Industry 4.0
The introduction of Industry 4.0 gives rise to what is called the "Smart Factory". In these, cyber-physical systems are used for process monitoring. Thanks to IoT, different systems can communicate with each other, allowing cooperation with other systems and human operators in real time. Cloud computing allows the data recorded by cyber-physical systems to be stored centrally. Finally, machine learning is becoming one of the main catalysts of innovation in certain industrial sectors, allowing factories to obtain multiple advantages, such as product quality improvement and greater flexibility in the production process.
Below are some of the main applications of machine learning in Industry 4.0.
1. Process optimization
With the introduction of machine learning methods in production processes, these processes can be understood and improved intelligently. This is achieved with the evaluation of the data collected during production. Through this evaluation, new processes are obtained that can adapt to production changes continuously. Thus the different individual processes are not only better linked but can also be optimized.
2. Autonomous vehicles and machines
Automatic and autonomous vehicles and machines are a clear example of the applications of artificial intelligence and machine learning. In the industry, their use allows replacing operators in monotonous tasks or tasks that entail a physical risk.
3. Quality control
Traditionally, product quality was only evaluated at the end of the production process. The use of sensors,
together with machine learning, allows continuous evaluation of the quality in each of the production
phases.
4. Predictive maintenance
Miniaturization and cheaper sensors in recent years have allowed them to proliferate for measuring the state of machines, providing valuable information about the status of each of them. By placing measurement points in the machines, a real-time view of the state of health of the whole system can be obtained. These data sets can later be used to train machine learning algorithms with which the occurrence of failures can be predicted.
5. Demand prediction
One of the common problems in the industry is to adapt production to demand, especially when production is perishable and cannot be stored for a time of greater demand. An example of this is the production of electricity: it is not possible to save surpluses for when production conditions are favorable. In addition, the introduction of renewable energy sources, in which production fluctuates, makes management increasingly complex.
In this market, the introduction of machine learning allows us to meet the energy demand optimally. From the historical patterns of energy consumption, the expected demand can be derived.
6. Chatbots
Chatbots are computer systems with which users can have conversations through text or even voice. They can be used to obtain information or perform tasks, reducing the barriers to accessing technological solutions. They are mostly used for customer service, as an accompaniment to customers during the purchase process or as filters to route consultations to specialized human advisors. They not only reduce costs and provide a 24/7 service but can also be used to learn from the most common problems.