Article
Deep Learning to Directly Predict Compensation Values of
Thermally Induced Volumetric Errors
Huy Vu Ngoc * , J. R. R. Mayer and Elie Bitar-Nehme

Mechanical Engineering Department, École Polytechnique de Montréal, P.O. Box 6079, Station Downtown,
Montréal, QC H3C 3A7, Canada
* Correspondence: huy.vu-ngoc@polymtl.ca

Abstract: The activities of the rotary axes of a five-axis machine tool generate heat, causing temperature changes within the machine that contribute to tool center point (TCP) deviations. Real-time prediction of these thermally induced volumetric errors (TVEs) at different positions within the workspace may be used for their compensation. A Stacked Long Short-Term Memories (SLSTMs) model is proposed to find the relationship between the TVEs for different axis command positions and the power consumptions of the rotary axes and the machine's linear and rotary axis positions. In addition, a Stacked Gated Recurrent Units (SGRUs) model is applied to the cases where the SLSTMs give their best and worst predictions, to compare the two models' abilities. Training data come from a long motion activity experiment lasting 132 h (528 measuring cycles). The adaptive moment with decoupled weight decay (AdamW) optimizer is used to strengthen the models and increase the quality of prediction. Multistep-ahead prediction in the testing phase is applied to seven positions not used for training in the long activity sequence and to 31 positions in a different short activity sequence of the rotary axes lasting a total of 40 h (160 cycles) to test the ability of the trained model. The testing phase with SLSTMs yields fittings between the predicted values and the measured data (without using the measured values as targets) from 69.2% to 98.8%. SGRUs show performance similar to SLSTMs with no clear winner.

Keywords: volumetric errors; deep learning; machine tool; thermal errors

Citation: Ngoc, H.V.; Mayer, J.R.R.; Bitar-Nehme, E. Deep Learning to Directly Predict Compensation Values of Thermally Induced Volumetric Errors. Machines 2023, 11, 496. https://doi.org/10.3390/machines11040496

Academic Editor: Gianni Campatelli

Received: 14 March 2023; Revised: 12 April 2023; Accepted: 18 April 2023; Published: 20 April 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Machine tool accuracy may degrade significantly due to thermal effects resulting partly from heat generated by the drives and the motions of rotary axes [1]. One way to reduce these errors is real-time compensation to cancel the TCP deviation at an arbitrary point in the workspace by adjusting its command position [2]. Mayr et al. stated that precise predictions of thermal errors through models are fundamental for effective real-time compensation [3]. The higher the predictability of the thermal error models, the more accurate the compensation will be.

Brecher et al. developed first and second order transfer functions between the thermal volumetric errors (TVEs) of a three-axis machine tool as outputs and the internal control data (rotational speeds and motor currents) and the environment temperature as inputs [4]. Horejs et al. compensated thermal displacements in the y and z directions of the TCP through thermal transfer functions obtained using Matlab's System Identification Toolbox. The inputs are the temperatures from inside the main spindle, ram and base of the horizontal milling machine, the ambient temperature and the spindle speed, and the outputs are the thermal displacements [5]. Liu et al. modelled thermally induced volumetric errors based on the 21 geometric error components and nine thermal drift errors for a milling and boring machine using rigid body kinematics. Only the volumetric error in the z direction, identified as the main error, was compensated with a Programmable Logic Controller and the machine's computer numerical controller (CNC) [6]. Mayr et al. found compensation


values for individual thermal errors of the rotary axis and main spindle from a gray box model using cooling power as input [7]. Yu et al. investigated the dependence of volumetric errors on spindle speed and temperature at pre-defined points through a polynomial regression model; the model with the closest predictability to the measured outputs is chosen [8]. Bitar-Nehme and Mayer developed first order transfer functions with delay terms to predict individual machine geometric errors as outputs from power consumptions as inputs; the machine table thermal expansion was also predicted [1]. Baum et al. modeled the thermally induced volumetric error from the 21 geometric error components for a three-axis machine tool through rigid body kinematics; the error components were measured with integral deformation sensors in the training process and with an R-test based procedure in the testing process [9]. Liu et al. established a mechanism-based error model to identify three thermally induced error terms of the linear axes X, Y, and Z as functions of time and position. For the spindle system, they modelled the relationship between two angular thermal errors (yaw and pitch), two translational thermal errors in the x and y directions, and the spindle elongation, and the temperatures of critical key locations by multivariate linear regression analysis [10].
The above models show their ability to predict the TVEs (data sequences) of machine tools. However, as the volume and variety of data grow, the model has more clues to predict outputs better, but the calculation cost and the number of internal parameters increase, in some cases beyond the capability of manual analysis [11,12]. Elaborate training of the model is expected to guarantee reliable results in the long term [13]. Deep learning techniques enable learning complex relations and are a new approach to these problems. LSTM and GRU networks have attracted attention in recent times, and Liang et al. showed that they have a better quality than the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) for processing sequence data and avoiding the overfitting problem [14]. Ngoc et al. [11] applied LSTM to predict multi-step-ahead data of thermally induced geometric errors, which define the errors between and within individual machine axes, from activity cycles involving the two rotary axes B and C of a five-axis machine tool. Others [15–18] presented the efficiency of LSTM networks for modelling the thermal behaviors of a machine tool. Refs. [15,16] modeled the thermal elongation of the spindle as a function of rotational speed. Ref. [17] used the thermal elongation of the spindle in the z direction as output and temperatures at key locations as inputs. Ref. [18] found the relationship between the thermal errors of ball screws and the temperature of the feed drive system. While in [11] thermally induced geometric errors (TGEs) were predicted using SLSTM, the current paper applies SLSTM and SGRU to directly predict the TVEs without the use of a rigid body kinematic model. This avoids having to develop the machine kinematic model and may be able to model error sources that a kinematic model may not consider. Calculating the TVEs from predicted TGEs does not capture errors that cannot be explained by the estimated geometric errors [1,19]. TVEs directly relate to the compensation values of the thermal errors of the paths used to move the tool to target positions in the working space of the machine tool. These predictions are more difficult than predicting TGEs because the predictors need to capture not only the change of the thermal behavior of the machine, but also the positions to which the tool is to be moved. Therefore, the input data include not only the power consumptions of axes B and C, but also the machine linear and rotary axis positions. Increasing the number of inputs also increases the calculation cost. With a multivariate and multi-layer network, training may be trapped within multiple local minima; Sagheer and Kotb suggest having a suitable optimizer to facilitate the training of deep learning models [20]. Stochastic optimization is successfully applied in deep learning [21], and Adam [22] is a popular optimizer. The adaptive moment with decoupled weight decay (AdamW) optimizer [23], a modification of Adam with a flexible learning rate that decouples the weight decay from the gradient update, is expected to make the networks converge faster and improve accuracy. The proposed scheme is presented in six sections: Section 1 reviews previous works. Section 2 describes the experimental process. Section 3 presents the neural network configurations of the SLSTMs and SGRUs models. Section 4 shows the training and predicting processes. The results and discussion are in Section 5 and the conclusion follows in Section 6.
2. Experiment Process

Motion sequences for the machine axes were designed to warm up the machine tool without any machining [1,24], while measuring the TVEs, or TCP displacements, between a contactless R-test device (sensor nest) [25] and ball artefacts on a Mitsui-Seiki HU40T horizontal machining center with wCBXfZY(S)t topology as described in [1]. Figure 1 illustrates the experimental setup. Four separate artefact balls are fixed on stems with different lengths and a scale bar made of Invar with two balls separated by a calibrated distance on the workpiece side, while the sensor nest is mounted in the spindle on the tool side.

Figure 1. Experimental setup.

Heating (rotary axes' motions at different speeds) and cooling (machine stopped) cycles are shown in Figure 2 (on the left). Figure 2 (on the right) illustrates the ambient temperature, controlled between 22.1 and 23.5 degrees Celsius, during the 132 h (528 cycles) training process.

Figure 2. Heating and cooling processes and ambient temperature.
The activity sequences are implemented using machine tool G-code. TCP positions are measured at 15 min intervals for both the training and testing processes. The TVEs, estimated by the Scale And Master Ball Artefact (SAMBA) method [26] in both processes in the x, y, and z directions, are the subtractions of the initial value from all subsequent values, to avoid the effect of the machine's quasi-static geometric errors, as given by Equation (1):

TVE = VE − VE0 (1)
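Equation (1) amounts to subtracting the first (cold-state) measurement from every subsequent measurement at a given position. A minimal sketch, with hypothetical values and array shapes:

```python
import numpy as np

# ve: measured volumetric errors at one artefact-ball position,
# shape (cycles, 3) for the x, y, z components (hypothetical values).
ve = np.array([[0.0, 1.0, 2.0],
               [0.5, 1.5, 2.5],
               [1.0, 2.0, 3.0]])

# Equation (1): TVE = VE - VE0. Subtracting the initial measurement
# removes the quasi-static geometric error common to every cycle.
tve = ve - ve[0]

print(tve[0])  # first cycle is zero by construction
```

The first row of `tve` is always zero, so only the thermally induced drift relative to the initial state remains.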
Figure 3 on the left shows the long process of activity with different measured powers of the B and C rotary axes lasting 132 h. In this process, each axis is exercised separately to capture its individual effect on the behavior of the machine tool. Figure 3 on the right shows the short process lasting 40 h with individual and simultaneous motions of the rotary axes, which was run some days later with a similar starting condition to the long process. The short process is different from the long process and includes individual axis motions as well as combined axis motions as would be likely during a machining process.

Figure 3. Long (left) and short (right) cycling processes [11].
3. Stacked LSTMs and GRUs

An LSTM unit has three gates: an input gate (it), an output gate (ot) and a forget gate (ft), as shown in Figure 4 (on the left).

Figure 4. LSTM and GRU unit.

The function of the LSTM unit is to remove or add information to the cell state Ct, which is carefully regulated by these three gates. Ĉt represents new information that can be applied to the cell state Ct. Ht is the hidden state; σ and tanh are the sigmoid and hyperbolic tangent activation functions, respectively.
The gated recurrent unit (GRU) [27] also works by a gating mechanism. A GRU unit contains a reset gate rt and an update gate zt. The GRU does not have an output gate Ot or a long-term memory unit Ĉt as the LSTM unit does. Both of them have the ability to eliminate the vanishing gradient problem by keeping the relevant information and passing it to the next time steps of the network.
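The standard LSTM gate equations described above can be sketched for a single time step in NumPy (an illustration with random weights and arbitrary sizes, not the authors' trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b each stack four blocks: input gate (i),
    forget gate (f), candidate state (Ĉ) and output gate (o)."""
    z = W @ x + U @ h_prev + b           # pre-activations, shape (4*hidden,)
    hid = h_prev.size
    i = sigmoid(z[0*hid:1*hid])          # input gate
    f = sigmoid(z[1*hid:2*hid])          # forget gate
    g = np.tanh(z[2*hid:3*hid])          # candidate cell state Ĉt
    o = sigmoid(z[3*hid:4*hid])          # output gate
    c = f * c_prev + i * g               # gates remove/add information to Ct
    h = o * np.tanh(c)                   # hidden state Ht
    return h, c

rng = np.random.default_rng(0)
hidden, n_inputs = 4, 3
W = rng.normal(size=(4 * hidden, n_inputs))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h, c = lstm_cell_step(rng.normal(size=n_inputs),
                      np.zeros(hidden), np.zeros(hidden), W, U, b)
print(h.shape, c.shape)  # (4,) (4,)
```

A GRU step would use only the reset and update gates and merge the cell state into the hidden state, which is why it has fewer parameters per unit.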
The SLSTMs/SGRUs in Figure 5 consist of stacked layers of LSTMs/GRUs and an output layer that makes a prediction [20]. The SLSTMs/SGRUs run when the first LSTM/GRU layer takes the input sequence, and every LSTM/GRU layer feeds hidden states to the next LSTM/GRU in the stack. The SLSTMs/SGRUs calculate only one error signal at the final output layer, and then back-propagate it through all previous layers [20]. This structure is programmed with the PyTorch library in Python using torch.nn.LSTM/GRU.
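A stack of this kind can be sketched with torch.nn.LSTM's num_layers argument plus a linear output layer (layer count and sizes here are illustrative assumptions, not the authors' configuration):

```python
import torch
import torch.nn as nn

class StackedLSTM(nn.Module):
    """Stacked LSTM layers followed by a linear output layer;
    swapping nn.LSTM for nn.GRU gives the SGRU variant."""
    def __init__(self, n_inputs=8, hidden=32, n_layers=3, n_outputs=1):
        super().__init__()
        # num_layers > 1: each layer feeds its hidden states to the next
        self.lstm = nn.LSTM(n_inputs, hidden,
                            num_layers=n_layers, batch_first=True)
        self.out = nn.Linear(hidden, n_outputs)

    def forward(self, x):        # x: (batch, time, n_inputs)
        h, _ = self.lstm(x)      # hidden states of the top layer
        return self.out(h)       # one prediction per time step

model = StackedLSTM()
y = model(torch.zeros(2, 10, 8))
print(y.shape)  # torch.Size([2, 10, 1])
```

Only one loss is computed at the output layer, and backpropagation through time updates all stacked layers at once, matching the single error signal described above.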

Figure 5. Stacked LSTMs and GRUs structure.


4. Training and Testing Processes

The machine schematic in Figure 6 shows the TVEs measured at four artefact balls. The workpiece and tool branches are illustrated in green and black colors, respectively.

Figure 6. Machine schematic with thermally induced volumetric errors.

The TVEs have three positioning error components, as described in Equation (2). These error components may change with the activities of the B- and C-axis, positions, and indexations.

TVEi = [TVEix TVEiy TVEiz]T with i = 1:N (2)

where N is the number of measured positions.
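Equation (2) simply groups the three error components per measured position; with hypothetical array shapes:

```python
import numpy as np

N = 24          # measured positions used for training
cycles = 528    # measurement cycles in the long process

# Hypothetical per-component arrays, shape (cycles, N).
tve_x = np.zeros((cycles, N))
tve_y = np.zeros((cycles, N))
tve_z = np.zeros((cycles, N))

# Equation (2): TVE_i = [TVEix TVEiy TVEiz]^T for i = 1:N,
# stacked into one (cycles, N, 3) array of error vectors.
tve = np.stack([tve_x, tve_y, tve_z], axis=-1)
print(tve.shape)  # (528, 24, 3)
```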
In the training phase, the SLSTMs/SGRUs models are built using the long process data, as illustrated in Figure 7, and the TVEs at the 24 positions of artefact balls 1, 2, and 3, shown in Table 1, are used as targets. The power consumptions of the axes B and C, the 24 linear axis positions used to move the sensor nest to the ball artefacts, and the nine (B, C) indexations, or rotary axis positions, are used as inputs.

Figure 7. Flow chart of the training process.

The testing phase will consider two cases: long test data consisting of TVEs at seven positions of artefact ball 4, as well as short test data of all 31 positions at all four artefact balls.

Table 1. 31 positions with nine different indexations of balls 1, 2, 3, and 4.

Positions of Artefact Balls            Indexations A, B, C (Degrees)
P1,1,1   P1,2,2   P1,3,3               0   −90   135
P2,4,1   P2,5,4   P2,6,3               0   −90   −45
P3,7,3   P3,8,4   P3,9,1   P3,10,2     0   −30   −60
P4,11,1  P4,12,2  P4,13,3  P4,14,4     0   0     180
P5,15,2  P5,16,3  P5,17,4  P5,18,1     0   0     0
P6,19,3  P6,20,4  P6,21,1  P6,22,2     0   20    15
P7,23,2  P7,24,3  P7,25,4              0   90    45
P8,26,3  P8,27,2  P8,28,1              0   90    −45
P9,29,4  P9,30,1  P9,31,2              0   90    −135

where Pq,i,k with q = indexation (1 to 9); i = linear axis position (1 to 31); k = artefact (1 to 4).

The 31 linear axis positions are shown in Figure 8. These positions are pre-determined by calculations based on the indexations of the B and C axes and the artefact's positions in the machine frame. They are used for the G-code to move the sensor nest to the four artefact balls.

SLSTMs/SGRUs [28] are capable of capturing longer patterns in sequential data. The AdamW optimizer is used to generate the optimized weights and biases based on the outputs of the SLSTMs/SGRUs and the measured TVEs at 24 different positions. In Figure 7, Inputj = [PowerBj, PowerCj, xj, yj, zj, bj, cj]T with j = 1:n; p is the number of inputs in one sample = 8; n is the number of training data = 528 × 24 = 12,672, corresponding to 24 position measurements or observations at each of the 528 measurement cycles; Targetj is the desired output yj of the SLSTMs/SGRUs, equal to the measured TVEj; [yp, …, yn] = [trained TVEp, …, TVEn]; U, W and b are the weights and biases that need to be updated after each batch (number of training samples) in the training process, as given by Equations (3)–(5) [29]:

Wnew = (1 − λw)Wold − ηw m̂wt/(√v̂wt + ε) (3)

Unew = (1 − λu)Uold − ηu m̂ut/(√v̂ut + ε) (4)

bnew = (1 − λb)bold − ηb m̂bt/(√v̂bt + ε) (5)

where η is the learning rate, ε = 10−8, m̂ and v̂ are the bias-corrected first and second moment estimates, and λ is the rate of the weight decay per step.

The diagram in Figure 9 shows the testing procedure applying multistep predictions with the trained SLSTMs/SGRUs and inputs, without using the measured TVEs as targets. In this phase, to match the initial conditions, eight first states with values of zeros are added to the data sequence, as Ngoc et al. did in [11].

Figure 8. 31 linear axis positions at four artefact balls in the machine frame.

Figure 9. Prediction process of thermally induced volumetric errors.

In Figure 9: Inputt = [PowerBt, PowerCt, xt, yt, zt, bt, ct]T, t = 1 to m + 8, p = 8, m = 528 and 160 cycles in the long and short processes, respectively; and [yp … ym+8] = [Predicted TVEp … TVEm+8].
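The AdamW update of Equations (3)–(5) can be sketched for a single parameter array in NumPy (an illustration following the paper's notation, where λ is the per-step decay factor; in practice torch.optim.AdamW performs the equivalent update internally):

```python
import numpy as np

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW step: decay the weight directly (decoupled from the
    gradient), then apply the bias-corrected Adam update."""
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias corrections
    v_hat = v / (1 - beta2 ** t)
    # Eqs. (3)-(5): w_new = (1 - lambda) w_old - eta * m̂ / (sqrt(v̂) + eps)
    w = (1 - weight_decay) * w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.ones(3)
m = np.zeros(3)
v = np.zeros(3)
w, m, v = adamw_step(w, np.array([0.1, -0.2, 0.3]), m, v, t=1)
print(w)
```

Because the decay term multiplies the weight directly rather than being added to the gradient, large-gradient weights are not decayed less than small-gradient ones, which is the key difference from L2-regularized Adam.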
5. Results and Discussions
In this section, the prediction ability of the trained SLSTMs for all TVEs are shown
5. Results and Discussions
before some the best and worst cases are chosen to compare with the SGRUs’. In the
In this section, the prediction ability of the trained SLSTMs for all TVEs are shown
training phase, three Stacked LSTMs are used to build the models for TVEx, TVEy and
before some the best and worst cases are chosen to compare with the SGRUs’. In the train-
ing phase, three Stacked LSTMs are used to build the models for TVEx, TVEy and TVEz
separately. Data of the thermal behaviors of the machine in 24 positions (31 positions of
Table 1 less positions 5, 8, 14, 17, 20, 25, and 29 of ball 4) were put in series. Figure 10 shows
Machines 2023, 11, 496 8 of 16

TVEz separately. Data of the thermal behaviors of the machine in 24 positions (31 po-
sitions of Table 1 less positions 5, 8, 14, 17, 20, 25, and 29 of ball 4) were put in series.
Figure 10 shows the measured and predicted thermal volumetric errors (TVEs) for each
of the 24 balls’ positions where for each position, 528 measuring and prediction cycles are
chines 2023, 11, x FOR PEER REVIEW conducted. The results for each position are concatenated horizontally, 8soofthe
16 timeline of

the abscissa repeats for every position (Pi) for compactness of presentation.

Figure 10. Training TVEx, TVEy, and TVEz from 24 positions in the long training process. For ex-
Figure
ample, between 10. Training
P2 and P3 are theTVEx, TVEy,
528 cycle and TVEz
measured andfrom 24 positions
predicted thermalinvolumetric
the long training
errors atprocess. For
example, between
position P3 (1 cycle = 15 min). P2 and P3 are the 528 cycle measured and predicted thermal volumetric errors at
position P3 (1 cycle = 15 min).
The B axis has a greater impact on volumetric errors (TVEs) than the C axis. In fact,
The B axis has a greater impact on volumetric errors (TVEs) than the C axis. In fact, B
B axis activity dominates almost all of the TVEx. However, the influence of C axis activity
axis activity dominates almost all of the TVEx. However, the influence of C axis activity
relative to the B axis increases for TVEy and TVEz.
relative to the B axis increases for TVEy and TVEz.
In the testing phase, three sets of volumetric errors were considered to test the ability of the trained model: data of seven positions at ball 4 in the long process (different positions in the workspace but the same process as the training data), and data of 31 positions in the short process, which is divided into two sets: the first set with 24 positions at balls 1, 2, and 3 (same positions but different process) and the second set with seven positions of ball 4 (different positions and process). Due to space limitations, only some of the best and worst predictions of the TVEs in each set are presented. Multistep predictions (528 steps (cycles)) for positions P14 and P20 in the long process are computed and illustrated in Figures 11–13. In the short process, 160 prediction steps are shown in Figures 14–19 for positions P5, P6, P9, and P25.
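The multistep-ahead scheme can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: the callable `model`, the feature layout, and the feedback of the previous TVE estimate are assumptions consistent with the description that measured values are never used as targets during testing.

```python
import numpy as np

def multistep_predict(model, exogenous, init_tve, n_steps):
    """Recursive multistep-ahead prediction: at every cycle the model receives the
    known exogenous inputs (axis powers, positions, indexations) together with its
    own previous TVE estimate; measured TVEs are never fed back during testing."""
    preds = []
    prev = np.asarray(init_tve, dtype=float)      # last (TVEx, TVEy, TVEz) estimate
    for k in range(n_steps):
        features = np.concatenate([exogenous[k], prev])
        prev = np.asarray(model(features), dtype=float)
        preds.append(prev)
    return np.stack(preds)                        # shape: (n_steps, 3)
```

With a trained network wrapped as `model`, 528 steps would reproduce the long-process test and 160 steps the short-process test described above.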
Figure 11. Prediction of TVEx14 and TVEy14 in the long process (LP) (1 cycle = 15 min).

Figure 12. Prediction of TVEz14 and TVEx20 in the long process (LP) (1 cycle = 15 min).

Figure 13. Prediction of TVEy20 and TVEz20 in the long process (LP) (1 cycle = 15 min).

The trained model appears to have difficulty predicting TVEz20 from cycles 300 to 450, as its parameters rely heavily on the patterns observed in the training data and do not adjust well to the changes in TVEz20 during the testing process. As a result, the model's predictions for TVEz20 during this period are likely to be inaccurate. Conversely, TVEz14 tends to be better predicted during this time; the model is able to adapt more easily to changes in TVEz14.

Figure 14. Prediction of TVEx6 and TVEy6 in the short process (SP) (1 cycle = 15 min).

Figure 15. Prediction of TVEz6 and TVEx9 in the short process (SP) (1 cycle = 15 min).

Figure 16. Prediction of TVEy9 and TVEz9 in the short process (SP) (1 cycle = 15 min).

Figure 17. Prediction of TVEx5 and TVEy5 in the short process (SP) (1 cycle = 15 min).

Figure 18. Prediction of TVEz5 and TVEx25 in the short process (SP) (1 cycle = 15 min).

Figure 19. Prediction of TVEy25 and TVEz25 in the short process (SP) (1 cycle = 15 min).
The Root Mean Square Errors (RMSEs) and fittings of the predictions compared with the measured values are calculated by Equations (6) and (7) and are shown in Figures 20–24.

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{j=1}^{n}\left(\mathrm{Prediction\;value}_j - \mathrm{Measured\;value}_j\right)^2}{n}} \tag{6} \]

\[ \mathrm{fitting}\,(\%) = \left(1 - \frac{\mathrm{RMSE}}{\mathrm{Measured\;value}_{\max} - \mathrm{Measured\;value}_{\min}}\right) \times 100 \tag{7} \]
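Equations (6) and (7) can be transcribed directly in NumPy, assuming each series is a 1-D array of one TVE component over the prediction cycles:

```python
import numpy as np

def rmse(predicted, measured):
    """Root mean square error between predicted and measured TVE series (Eq. 6)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return np.sqrt(np.mean((predicted - measured) ** 2))

def fitting_percent(predicted, measured):
    """Fitting (%) of a prediction, normalising the RMSE by the measured range (Eq. 7)."""
    measured = np.asarray(measured, dtype=float)
    return (1.0 - rmse(predicted, measured) / (measured.max() - measured.min())) * 100.0
```

A perfect prediction gives a fitting of 100%, and the fitting decreases as the RMSE grows relative to the range of the measured series.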

Figure 20. RMSEs and fittings of seven positions at ball 4 in the long process (LP).

Figure 21. RMSEs and fittings of eight positions at ball 1 in the short process (SP).

Figure 22. RMSEs and fittings of eight positions at ball 2 in the short process (SP).

Figure 23. RMSEs and fittings of eight positions at ball 3 in the short process (SP).

Figure 24. RMSEs and fittings of seven positions at ball 4 in the short process (SP).

The changes in the TVEs depend on the inputs: the powers of the rotary axes, the durations of activities, the positions, and the indexations. The trained models are capable of predicting the TVEs, as illustrated in Figures 11–24. Figure 25 presents a set of TVEs at 31 linear axis positions, with the directional components TVEx, TVEy, and TVEz, at one cycle in the testing process. Predictions of the TVEs at the seven positions of ball 4 in the long process can reach a very high fitting, with RMSE = 2.2 µm (fitting = 98.8%) for TVEy14, while the lowest fitting in this process is for TVEz20, with RMSE = 5.6 µm (fitting = 80.3%).

Figure 25. 31 linear axis positions and an example set of TVEs.
The trained models gave lower quality predictions for the TVEs at the seven positions of ball 4 in the short testing process, as shown by RMSEs from 3.3 µm (86.7% fitting) for TVEx5 to 4.8 µm (75.27% fitting) for TVEz5. The predictions are good for the positions at balls 1, 2, and 3 in the short testing process, with the highest fitting of 91.6% (1.8 µm) for TVEx9 and the worst case in this process of 69.2% (7.5 µm) for TVEz6. The results also indicate that the model gives the highest prediction capability for TVEy and the worst for TVEz. Similar situations occurred in [11] for the geometric errors relating to the y and z components. There are some over-responses of the trained model at the beginning of the short process for all TVEs, but the predictions gradually improve to closely track the measured values for the rest of the process. Figures 20–24 reveal an interesting finding: for the positions located at artifact ball four, TVEy tends to have a higher fitting than TVEx and TVEz. On the other hand, for the positions located at artifact balls one, two, and three, TVEx tends to dominate the others.
The SLSTMs model showed its ability to predict the thermally induced volumetric errors of the machine tool based on the motor powers of the rotary axes. The GRU unit was invented after the LSTM, with a simpler structure, for time series applications. The question is: which one is better for these data sequences? To answer this question, this work continues by taking some cases, including the best and the worst predictions of the SLSTMs, and uses SGRUs to train a model and predict these thermally induced volumetric errors with a similar structure, learning rate, and weight decay.

SLSTMs predict TVEy14 and TVEx4 better than SGRUs, as shown in Figures 26–29, with RMSEs of 2.2 µm and 2 µm (SLSTMs) versus 2.4 µm and 4.4 µm (SGRUs), respectively. For TVEz5 and TVEz6, SGRUs perform better, with RMSEs of 3.2 µm and 6.6 µm, while SLSTMs have RMSEs of 4.8 µm and 7.5 µm. Both the SLSTMs and SGRUs proved that they have the potential to predict the sequence data. The results in this section partly prove the ability of SLSTMs and SGRUs to predict TVEs over the 40 h process with different activities and inputs.
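As a rough sketch of this shared setup, assuming a PyTorch implementation (the layer sizes, window length, and hyperparameters below are illustrative, not the values used in the study), the two variants differ only in the recurrent cell:

```python
import torch
import torch.nn as nn

class StackedRNN(nn.Module):
    """Stacked LSTM (SLSTM) or stacked GRU (SGRU) regressor mapping a window of
    input cycles (axis powers, positions, indexations) to the three TVE components."""
    def __init__(self, cell="lstm", n_features=8, hidden=64, layers=2, n_outputs=3):
        super().__init__()
        rnn_cls = nn.LSTM if cell == "lstm" else nn.GRU
        self.rnn = rnn_cls(n_features, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, x):                  # x: (batch, time, n_features)
        out, _ = self.rnn(x)
        return self.head(out[:, -1, :])    # TVE estimate at the last cycle of the window

# Both variants share the same training setup with decoupled weight decay (AdamW).
model = StackedRNN(cell="gru")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
loss_fn = nn.MSELoss()

x = torch.randn(16, 20, 8)                 # dummy batch: 16 windows of 20 cycles
y = torch.randn(16, 3)                     # dummy (TVEx, TVEy, TVEz) targets
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```

Swapping `cell="gru"` for `cell="lstm"` while keeping the optimizer and sizes fixed mirrors the like-for-like comparison made here.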
better with RMSEs =This 3.2 study
µm and has 6.6 µm,limitations
several while SLSTMs have be
that should RMSEs
taken turnoff 4.8 µm
into account. and the heat
Firstly,
7.5 µm. Both the sources
SLSTMs are limited
and SGRUs to the rotary
proved axes,
thatwhich
theymayhaveaffect the quality
potential of predictions
to predict the se-of TVEz.
quence data. The To results
improveinthe accuracy
this section of partly
the predictions,
prove the it may be beneficial
ability of SLSTMs to include
and SGRUsmore information
to
on the activities of the linear axes and
predict TVEs at the 40 h process with different activities and inputs. temperatures as inputs. Additionally, the ambient
temperature should not be overlooked as it can directly affect the change in temperature of
the axes following a sequence of activities.

Figure 26. Prediction of TVEy14 by SLSTMs and SGRUs in the long process (LP) (1 cycle = 15 min).

Figure 27. Prediction of TVEz5 by SLSTMs and SGRUs in the short process (SP) (1 cycle = 15 min).

Figure 28. Prediction of TVEz6 by SLSTMs and SGRUs in the short process (SP) (1 cycle = 15 min).

Figure 29. Prediction of TVEx4 by SLSTMs and SGRUs in the short process (SP) (1 cycle = 15 min).

This study has several limitations that should be taken into account. Firstly, the heat sources are limited to the rotary axes, which may affect the quality of the predictions of TVEz. To improve the accuracy of the predictions, it may be beneficial to include more information on the activities of the linear axes and temperatures as inputs. Additionally, the ambient temperature should not be overlooked, as it can directly affect the change in temperature of the axes following a sequence of activities.

6. Conclusions
In this study, an approach using SLSTM/SGRU for directly predicting thermally
induced volumetric errors without the use of a kinematic model with geometric error
parameters was proposed. SLSTMs/SGRUs with AdamW are used for modelling and predicting the sequential data. The training and testing data sets used different positions and
different B- and C-axis exercise cycles. The SLSTMs model’s best prediction in the long
process reached RMSE of 2.2 µm (98.8% fitting to the experimental measurement), while in
a short testing process, it achieved 1.8 µm (91.6%). The worst case predicted by the model
across the processes is 7.5 µm (69.2% fitting). This work gives a potential solution in practice to
predict the thermal volumetric errors at different positions in the workspace directly over
an extended period of 40 h without the need for remeasuring the thermal errors, which
allows sufficient time for the machining of most workpieces.
A comparison between two popular deep learning models for sequential data, SLSTMs
and SGRUs, is carried out for some outstanding TVEs. The purpose is to determine which
one is better for this application. Their capabilities were similar.
Future work can focus on exploring the heat sources associated with the activities of
five-axis machine tools, as well as examining the robustness of deep learning models when
faced with variations in ambient temperature.

Author Contributions: Conceptualization, H.V.N., J.R.R.M. and E.B.-N.; methodology, H.V.N. and
J.R.R.M.; software, H.V.N.; validation, H.V.N., J.R.R.M. and E.B.-N.; formal analysis, H.V.N., J.R.R.M.
and E.B.-N.; investigation, H.V.N.; resources, H.V.N., J.R.R.M. and E.B.-N.; data curation, H.V.N.,
J.R.R.M. and E.B.-N.; writing—original draft preparation, H.V.N.; writing—review and editing,
H.V.N., J.R.R.M. and E.B.-N.; visualization, H.V.N.; supervision, J.R.R.M. and E.B.-N.; project admin-
istration, J.R.R.M.; funding acquisition, J.R.R.M. All authors have read and agreed to the published
version of the manuscript.
Funding: This research was supported by the Natural Sciences and Engineering Research Council of
Canada (NSERC) Discovery, Grant number RGPIN-2016-06418.
Institutional Review Board Statement: I declare on behalf of my co-authors that this work is orig-
inal and has not been published elsewhere, nor is it currently under consideration for publication
elsewhere.
Informed Consent Statement: Not applicable.
Data Availability Statement: Available from the corresponding author on reasonable request.
Acknowledgments: Elie Bitar-Nehme, technicians Guy Gironne and Vincent Mayer are acknowl-
edged for experimental data.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Bitar-Nehme, E.; Mayer, J.R.R. Modelling and compensation of dominant thermally induced geometric errors using rotary axes’
power consumption. CIRP Ann. 2018, 67, 547–550. [CrossRef]
2. Ibaraki, S.; Knapp, W. Indirect measurement of volumetric accuracy for three-axis and five-axis machine tools: A review. Int. J.
Autom. Technol. 2012, 6, 110–124. [CrossRef]
3. Mayr, J.; Jedrzejewski, J.; Uhlmann, E.; Donmez, M.A.; Knapp, W.; Härtig, F.; Wendt, K.; Moriwaki, T.; Shore, P.; Schmitt, R.
Thermal issues in machine tools. CIRP Ann. 2012, 61, 771–791. [CrossRef]
4. Brecher, C.; Hirsch, P.; Weck, M. Compensation of thermo-elastic machine tool deformation based on control internal data. CIRP
Ann. 2004, 53, 299–304. [CrossRef]
5. Horejs, O.; Mares, M.; Kohut, P.; Barta, P.; Hornych, J. Compensation of machine tool thermal errors based on transfer functions.
MM Sci. J. 2010, 3, 162–165. [CrossRef]
6. Liu, Y.; Lu, Y.; Gao, D.; Hao, Z. Thermally induced volumetric error modeling based on thermal drift and its compensation in
Z-axis. Int. J. Adv. Manuf. Technol. 2013, 69, 2735–2745. [CrossRef]
7. Mayr, J.; Egeter, M.; Weikert, S.; Wegener, K. Thermal error compensation of rotary axes and main spindles using cooling power
as input parameter. J. Manuf. Syst. 2015, 37, 542–549. [CrossRef]
8. Yu, B.-F.; Liu, K.; Li, K.-Y. Application of Multiple Regressions to Thermal Error Compensation Technology—Experiment on
Workpiece Spindle of Lathe. Int. J. Autom. Smart Technol. 2016, 6, 103–110.
Machines 2023, 11, 496 16 of 16

9. Baum, C.; Brecher, C.; Klatte, M.; Lee, T.H.; Tzanetos, F. Thermally induced volumetric error compensation by means of integral
deformation sensors. Procedia CIRP 2018, 72, 1148–1153. [CrossRef]
10. Liu, J.; Ma, C.; Wang, S. Data-driven thermally-induced error compensation method of high-speed and precision five-axis machine
tools. Mech. Syst. Signal Process. 2020, 138, 106538. [CrossRef]
11. Ngoc, H.V.; Mayer, J.R.R.; Bitar-Nehme, E. Deep learning LSTM for predicting thermally induced geometric errors using rotary
axes’ powers as input parameters. CIRP J. Manuf. Sci. Technol. 2022, 37, 70–80. [CrossRef]
12. Provost, F.; Fawcett, T. Data science and its relationship to big data and data-driven decision making. Big Data 2013, 1, 51–59.
[CrossRef]
13. Dahlem, P.; Emonts, D.; Sanders, M.P.; Schmitt, R.H. A review on enabling technologies for resilient and traceable on-machine
measurements. J. Mach. Eng. 2020, 20, 5–17. [CrossRef]
14. Liang, Y.; Li, W.; Lou, P.; Hu, J. Thermal error prediction for heavy-duty CNC machines enabled by long short-term memory
networks and fog-cloud architecture. J. Manuf. Syst. 2020, 62, 950–963. [CrossRef]
15. Liu, J.; Ma, C.; Gui, H.; Wang, S. Thermally-induced error compensation of spindle system based on long short term memory
neural networks. Appl. Soft Comput. 2021, 102, 107094. [CrossRef]
16. Liu, P.-L.; Du, Z.-C.; Li, H.-M.; Deng, M.; Feng, X.-B.; Yang, J.-G. Thermal error modeling based on BiLSTM deep learning for
CNC machine tool. Adv. Manuf. 2021, 9, 235–249. [CrossRef]
17. Yu-Chi, L.; Kun-Ying, L.; Yao-Cheng, T. Spindle Thermal Error Prediction Based on LSTM Deep Learning for a CNC Machine
Tool. Appl. Sci. 2021, 11, 5444.
18. Gao, X.; Guo, Y.; Hanson, D.A.; Liu, Z.; Wang, M.; Zan, T. Thermal Error Prediction of Ball Screws Based on PSO-LSTM. Int. J.
Adv. Manuf. Technol. 2021, 116, 1721–1735. [CrossRef]
19. Mchichi, N.A.; Mayer, J. Axis location errors and error motions calibration for a five-axis machine tool using the SAMBA method.
Procedia CIRP 2014, 14, 305–310. [CrossRef]
20. Sagheer, A.; Kotb, M. Unsupervised pre-training of a deep LSTM-based stacked autoencoder for multivariate time series
forecasting problems. Sci. Rep. 2019, 9, 19308. [CrossRef]
21. Curtis, F.E.; Scheinberg, K. Adaptive Stochastic Optimization: A Framework for Analyzing Stochastic Optimization Algorithms.
IEEE Signal Process. Mag. 2020, 37, 32–42. [CrossRef]
22. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
23. Gugger, S.; Howard, J. AdamW and Super-Convergence Is Now the Fastest Way to Train Neural Nets. Available online:
https://www.fast.ai/2018/07/02 (accessed on 19 July 2018).
24. Bitar-Nehme, E.; Mayer, J.R.R. Thermal volumetric effects under axes cycling using an invar R-test device and reference length.
Int. J. Mach. Tools Manuf. 2016, 105, 14–22. [CrossRef]
25. Weikert, S. R-test, a new device for accuracy measurements on five axis machine tools. CIRP Ann. Manuf. Technol. 2004, 53,
429–432. [CrossRef]
26. Mayer, J.R.R. Five-axis machine tool calibration by probing a scale enriched reconfigurable uncalibrated master balls artefact.
CIRP Ann. 2012, 61, 515–518. [CrossRef]
27. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations
using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
28. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef] [PubMed]
29. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. arXiv 2017, arXiv:1711.05101.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
