P. Díaz - ME 2021

Minerals Engineering 163 (2021) 106760
Random forest model predictive control for paste thickening☆

Pablo Diaz a, Juan C. Salas b, Aldo Cipriano a, Felipe Núñez a,*
a Department of Electrical Engineering, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, Santiago 7820436, Chile
b Department of Mining Engineering, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, Santiago 7820436, Chile
Keywords: Paste thickening; Model predictive control; Random forest; Machine learning

Abstract

As processes involved in mineral processing operations increase their complexity, automation and control become critical to ensure an economically viable and environmentally sustainable operation. In the context of modern mineral processing, paste thickening stands out as a relatively new method for producing high density slurries that has proven challenging for standard control algorithms. In this setting, the use of machine-learning-based models within a predictive control strategy arises as an appealing alternative. This work presents a Random Forest Model Predictive Control scheme for paste thickening based on a purely data-driven approach for modeling and evolutionary strategies for solving the associated optimization problem. Results show that the proposed strategy outperforms conventional predictive control both qualitatively and quantitatively.
1. Introduction

In mineral processing operations, after the flotation process, a considerable amount of waste rich in water and reagents is produced as a side product. This material is commonly referred to as tailings (Jewell and Fourie, 2015). Proper management of tailings, including disposal in dams, is a key element for modern operations aiming at fulfilling high environmental standards (Cacciuttolo and Holgado, 2016). Thickening is the primary method for recovering water from tailings by feeding the tailings slurry along with a sedimentation-promoting polymer known as flocculant, which increases the sedimentation rate of the material, to a large thickener tank with a slow turning raking system (Concha, 2014). Water is recovered as overflow while a thickened material is discharged as underflow for disposal in tailing dams (Stromberg, 2016).

Driven by environmental concerns as well as the increasing water scarcity, new thickening methods have been developed recently (Cacciuttolo and Holgado, 2016). Among the new trends, paste thickening, which is conducted in a taller type of thickener known as paste thickener, stands out as a process where the discharged material contains a higher solids concentration and behaves as a non-Newtonian fluid (Betancourt et al., 2014). Paste thickening facilitates tailing disposal and allows a higher water recovery rate (Jewell and Fourie, 2015), yet requires precise control of the solids content at the discharge, since an excess of solids overstresses the pumps used to transport the thickened material, and a low solids content compromises the stability of the tailings dam.

Therefore, the main control objective in paste thickening is to stabilize the solids concentration at the discharge, while reducing flocculant consumption (Xu et al., 2015). One of the key elements in thickening is the proper use of the flocculant for producing a discharge with a stable solids content (Owen et al., 2009). Several attempts at controlling the paste thickening process using classical control approaches have been made recently: PID and fuzzy-expert controllers (Segovia et al., 2011) and master-slave PI strategies based on physical models (Xu et al., 2015) have been explored in simulated environments, while real implementations include expert (Ojeda et al., 2014) and fuzzy-expert (Chai et al., 2016) schemes.

The success of model-based predictive control (MPC) techniques in industrial applications (Qin and Badgwell, 2003), particularly in mining, has motivated efforts in thickening. Such an approach is presented in Tan et al. (2015), where an MPC scheme was developed based on a model derived using sedimentation-consolidation theory and validated with industrial data using an extended Kalman filter for parameter estimation. A step forward was presented in Tan et al. (2017), where a rake torque constraint was added to the formulation in Tan et al. (2015). Both studies highlight the difficulties in obtaining a first-principles-based model, primarily because not all the phenomena involved in thickening are fully understood and hence modeling unavoidably includes simplifications. These simplifications limit the applicability of a first-principles-based approach; in particular, Tan et al. (2015) mention the use of a steady-state condition and assumptions on a constant mineral type (constant parameters), which make the results highly sensitive to the feeding and the mineralogy.

☆ This work was supported in part by ANID under grant ANID PIA ACT192013
* Corresponding author.
E-mail addresses: pdiaz2@uc.cl (P. Diaz), jcsalasm@ing.puc.cl (J.C. Salas), aciprian@ing.puc.cl (A. Cipriano), fenunez@ing.puc.cl (F. Núñez).
https://doi.org/10.1016/j.mineng.2020.106760
Received 26 July 2020; Received in revised form 29 November 2020; Accepted 21 December 2020
Available online 19 January 2021
0892-6875/© 2021 Elsevier Ltd. All rights reserved.
Given the difficulties in modeling the paste thickening process, data-driven models and controllers arise as an appealing option, following the trend of using artificial intelligence in mineral processing (McCoy and Auret, 2019) and the ongoing data revolution bolstered by industrial Internet technologies (Langarica et al., 2020). Data-driven models exploit patterns and correlations among variables to explain the behavior of the system and are particularly accurate when the training dataset includes a wide operational range; however, in unusual operational scenarios they are likely to be outperformed by first-principles-based models. A first effort on thickening control using data-driven models is presented in Núñez et al. (2020), where an MPC scheme based on a deep neural network was developed and tested in a real facility. Despite its good results, the neural model is highly complex, which limits the updating rate at the controller; hence, an opportunity exists in exploring the use of other machine learning techniques within a predictive control scheme.

Among the new trends in machine learning, random forests (Breiman, 2001) have gained significant acclaim in the scientific community due to their promising results in classification and regression (Fernandez-Delgado et al., 2014). Random forests follow a supervised ensemble learning procedure built upon non-parametric models known as regression trees (Breiman et al., 1984), which makes them appealing for modeling uncertain dynamical systems. To date, the use of random forests in predictive control is limited to a few applications. In Jain et al. (2017) the predictor is formed by a mixture of linear and nonlinear models, similar to a Takagi-Sugeno fuzzy description (Takagi and Sugeno, 1985). A similar approach based on local linear models is taken in Wang et al. (2019) and applied to a simulated electrochemical process. The scheme for climate control in buildings proposed in Smarra et al. (2018) and experimentally validated in Bünning et al. (2020) considers an affine model at each leaf and delivered excellent results when compared to traditional hysteresis controllers; however, limitations are documented for long prediction horizons. In all these works complex MIMO systems are controlled with promising results, providing a starting point for this research.

This article proposes a Random Forest Model Predictive Controller (RF-MPC) for paste thickening, purely based on operational data and packed as a general-purpose toolbox in the Matlab® Simulink language for its later use by the scientific community. A pseudo-real dataset, consisting of real operational inputs and simulated outputs, is used for system identification with the help of a paste thickening simulator designed for control studies (Langlois and Cipriano, 2019). Set-point tracking experiments and comparisons with standard linear data-driven MPC are presented to illustrate the potential of the proposed RF-MPC.

The rest of this work is organized as follows. Section 2 provides an operational overview of the paste thickening process. Section 3 presents basics on random forests, as well as their use in time series prediction and forecasting. Section 4 describes the proposed RF-MPC implementation with special emphasis on the resolution of the optimization problem. In Section 5 experimental results are analyzed, including the predictive accuracy of the random forest model, control results for set-point tracking, and comparisons against a classical MPC controller. Section 6 concludes with a summary of the main contributions.

1.1. Notation and basic definitions

In this work, $\mathbb{R}$ denotes the real numbers, $\mathbb{Z}_{\geq 0}$ the nonnegative integers, and $\mathbb{R}^n$ the Euclidean space of dimension $n$. For $a, b \in \mathbb{Z}_{\geq 0}$ we use $[a; b]$ to denote their closed interval in $\mathbb{Z}$. For a vector $v \in \mathbb{R}^n$, $v^i$ denotes its $i$th component. For an $n$-dimensional real-valued sequence $\alpha : \mathbb{Z}_{\geq 0} \to \mathbb{R}^n$, $\alpha(t)$ denotes its $t$th element, and $\alpha_{[a;b]}$ denotes its restriction to the interval $[a; b]$, i.e., a sub-sequence. For an $n$-dimensional sequence $\alpha$, we define its $T$-depth window at $t$ as the $nT$-dimensional vector resulting from constructing the sub-sequence $\alpha_{[t-T+1;t]}$ and concatenating its elements from left to right. The same notation applies for an $n$-dimensional finite-length sequence of length $T+1$, $\alpha : [0; T] \to \mathbb{R}^n$, with the understanding that for a sub-sequence $\alpha_{[a;b]}$, $[a; b] \subseteq [0; T]$ must hold.

2. Paste thickening

2.1. Process description

Paste thickening is governed by a series of complex physico-chemical phenomena that occur inside the thickener. Thickening is regarded as a highly nonlinear system with slow dynamics (Betancourt et al., 2014) and subject to strong disturbances. Hence, obtaining a model from first principles is a complex task. Nonetheless, several works have taken this path (Betancourt et al., 2014; Langlois and Cipriano, 2019) and obtained models that, although difficult to apply in real settings due to the associated parameter estimation problem, are useful to simulate the process in a computational environment. In this work, the Simulink® simulator developed in Langlois and Cipriano (2019), which was validated using real data, will be used to test the proposed control strategy.

Fig. 1. Paste thickener diagram. Taken from Langlois and Cipriano (2019).

From a systemic point of view, thickening can be described as follows (Langlois and Cipriano, 2019). The tailings slurry enters the thickener through the feedwell with feeding flowrate $Q_f$, which varies with time, and associated solids concentration $\phi_f$. Additionally, flocculant is added to the slurry at the feedwell, with dosage $F$, to promote sedimentation inside the thickener (Betancourt et al., 2014). Thickened material is discharged as an underflow at rate $Q_u$ with solids concentration $\phi_u$, while water is recovered as effluent with flowrate $Q_e$ and (low) solids content $\phi_e$. Denoting by $\phi(z)$ the solids concentration inside the thickener at height $z \in [-H, M]$ (see Fig. 1), three regions can be identified inside the thickener based on the solids profile $\phi$: an effluent region $\mathcal{E}$, above the feedwell level, where water is recovered, i.e., $\mathcal{E} := [-H, 0)$, a compaction region $\mathcal{C}$, where $\phi$ is larger than or equal to a critical concentration $\phi_c$, i.e., $\mathcal{C} := \{z \in [0, M] : \phi(z) \geq \phi_c\}$, and a settling region $\mathcal{S}$, where $\phi$ is lower than $\phi_c$, i.e., $\mathcal{S} := \{z \in [0, M] : \phi(z) < \phi_c\}$. The level $h$ at which the compaction and settling regions divide, i.e., $\phi(h) = \phi_c$, is called the interface level and plays a key role in the associated control
problem. Fig. 1 illustrates a paste thickener with the three regions highlighted.

2.2. Control problem

The two products of a paste thickener are the water recovered and the thickened material discharged. The quality of the water, measured by $\phi_e$, and the quality of the paste, measured by $\phi_u$, are the main indicators of a proper operation. Consequently, the control problem under study is to regulate the underflow solids concentration $\phi_u$ and the interface level $h$, which is directly related to how "clean" the recovered water is, to a given reference by manipulating the flocculant dosage $F$ and the underflow rate $Q_u$.

It is believed that underflow solids concentration is primarily controlled through $Q_u$, while the effect of $F$ on all variables is currently an active area of research (Owen et al., 2009; Shahrivar et al., 2013). Disturbances impact thickener operation constantly. For control purposes, we consider $Q_f$ and $\phi_f$ as disturbances to be rejected. Abrupt changes in these variables are common due to abnormalities in upstream processes (Betancourt et al., 2014).

3. Random forests

3.1. Background

Random forests are supervised ensemble learners based on regression trees, a non-parametric model that uses recursive partitioning to learn interactions between variables (Breiman et al., 1984). Random forests excel at identifying relationships in high-dimensional nonlinear problems (Benali et al., 2019); however, their main focus is classification or regression. Their use as time-series predictors has only been exploited recently (Lahouar and Slama, 2017; Qiu et al., 2017; Zhang et al., 2018). The seminal algorithm for training regression trees is CART (Breiman et al., 1984), which uses the following formulation.

Given an input space $X \subseteq \mathbb{R}^R$ and an output space $Y \subseteq \mathbb{R}^M$, consider an $(R+M)$-dimensional observation sequence $z$ of finite length $N$, where $z(t) := (x(t), y(t))$ with $x(t) \in X$ and $y(t) \in Y$. The sequence $z$ is referred to as the training dataset. A regression tree is obtained by using $z$ to divide the input space $X$ into $K$ regions $\mathcal{M}_k$ and assign an output value $y_{\mathcal{M}_k}$ to each region so the prediction error on the output space is minimized. If the prediction error is minimized using the sum of squared errors, the optimal output predictor $\hat{f}$ for a new input observation $x(t)$ is

$$\hat{y}(t) = \hat{f}(x(t)) := \frac{1}{K} \sum_{i=1}^{K} y_{\mathcal{M}_i} \, I(x(t) \in \mathcal{M}_i), \tag{1}$$

where $I$ is the indicator function (Breiman et al., 1984).

Random forests operate with the concept of bagging, which is an acronym for bootstrap aggregating (Efron, 1979). To construct a forest predictor $\hat{\mathcal{F}}$, $B$ sub-datasets $z_i$ of finite length $N_i < N$ are drawn randomly from the training dataset $z$, with replacement. A regression tree predictor $\hat{f}_i$ is then generated from each bootstrap sample $z_i$. Bagging averages the prediction over the collection of bootstrap replicas, thereby reducing its variance (Breiman, 2001). Consequently, the random forest predictor is the bagged estimate of the individual regression trees

$$\hat{\mathcal{F}}(x) := \hat{f}_{bag}(x) := \frac{1}{B} \sum_{i=1}^{B} \hat{f}_i(x). \tag{2}$$

Bootstrapping with replacement generates identically distributed predictors (Hastie et al., 2009). Therefore, the bias of the bagged ensemble of trees is the same as that of each individual tree, and a correlation coefficient $\rho$ exists between them. Random forests deal with this precise issue (Breiman, 2001). By randomly selecting the input variables at each partition, the correlation coefficient $\rho$ is decreased, mimicking independent and identical distributions, without increasing each learner's variance as much.

3.2. Prediction in dynamical systems

Let a discrete time multi-input single-output system produce a sequence $y$ as output, with $y(t) \in \mathbb{R}$, and receive as inputs a sequence $u$, with $u(t) \in \mathbb{R}^m$, and a sequence $d$, with $d(t) \in \mathbb{R}^p$. The objective is to generate a one-step ahead predictor for $y$, from a given dataset $z$, of the form

$$\hat{y}(t+1) = \hat{\mathcal{F}}(x(t)), \tag{3}$$

where $x(t) \in X \subseteq \mathbb{R}^{n_a+n_b+n_p}$ is an element of the $(n_a+n_b+n_p)$-dimensional predictor sequence $x$, formed by input-output observations. Specifically, an element $x(t)$ is the concatenation of the $n_a$-depth window of $y$ at $t$, which corresponds to the IIR part of the predictor, the $n_b$-depth window of $u$ at $t$ and the $n_p$-depth window of $d$ at $t$. This input space structure is similar to that of conventional linear statistical methods for time-series forecasting, such as ARX, ARIMA or ARIMAX predictors.

For a multi-input multi-output (MIMO) system with $y(t) \in \mathbb{R}^n$, $n$ input spaces $X_i$ are formed. Denoting by $m^i(t, n_a, n_b, n_p) \in X_i$ the predictor vector of $y^i$ at time $t$, the random forest one-step ahead prediction is given by

$$\hat{y}^i(t+1) = \hat{\mathcal{F}}_i\left(m^i(t, n_a, n_b, n_p)\right). \tag{4}$$

Eq. (4) shows that the MIMO predictor is based upon $n$ distinct and uncoupled random forests $\hat{\mathcal{F}}_i$. The algorithm implemented operates as such; however, this definition can be extended to account for output coupling.

For multi-step forecasting, a typical methodology is to use predictors $\hat{\mathcal{F}}_i$ in a recursive manner (Bontempi et al., 2013). At each prediction step $t+j$, $j \in [2; H]$, $m^i(t+j-1, n_a, n_b, n_p)$ is formed using $\hat{y}(t+j-1)$ as an estimator of $y(t+j-1)$. Since the final objective of this strategy is to use it in a predictive control design, disturbances must be forecasted for the prediction horizon. To do so, a simple persistence model is used: $\hat{d}(t+j) = d(t)$, $\forall j > 0$.

4. Random forest model predictive control for paste thickening

Among the different MPC alternatives, the receding horizon MPC strategy is recognized as one of the most flexible since no requirement on the structure of the model is imposed (Grune and Pannek, 2017). The receding horizon MPC computes at each time step $j$ the manipulated variable sub-sequence $u_{[j;j+N_u-1]}$, where $N_u$ is the control horizon, through an online optimization problem. Then, only the first value, $u_j$, is sent to the actuators.

Based on the previous discussion on thickening, the variables considered for the MPC are the following:

• Controlled variables, $y := (\phi_u, h)$. The main controlled variable is $\phi_u$; however, as recovered water quality is also of interest, $h$ is considered a controlled variable as well.
• Manipulated variables, $u_j := (Q_u, F)$. The discharge flow rate and flocculant addition rate are used to control the thickener.
• Disturbances, $d := (Q_f, \phi_f)$. Measured disturbances are the feeding rate and the solids concentration at the feeding.

As with all MPC strategies, the proposed RF-MPC is comprised of a predictive model, an objective function and system constraints.

4.1. Elements of the controller

Naturally, the predictive models used are random forests:
$$\hat{y}^i(t+1) = \hat{\mathcal{F}}_i\left(m^i(t, n_a, n_b, n_p)\right), \quad i = 1, 2. \tag{5}$$

As for the optimization problem solved to calculate the sequence $u_{[j;j+N_u-1]}$, in this work a generic quadratic objective function is used, which considers $n = 2$ controlled variables and $m = 2$ manipulated variables. The optimization problem solved at time instant $j$ is then given by

$$\min_{\Delta u} V(y, u) = \sum_{i=1}^{n} V_i(y^i) + \sum_{k=1}^{m} \sum_{i=0}^{N_u-1} \frac{R_k}{s_k} \left(\Delta u^k(j+i)\right)^2, \tag{6}$$

where each $V_i(y^i)$, one per controlled variable, is equal to

$$V_i(y^i) = \sum_{t=1}^{N_y-1} \frac{Q_i}{l_i} \left(\hat{e}^i(j+t)\right)^2 + \Lambda_i \sum_{t=1}^{N_y} \frac{\epsilon^i_{j+t}}{l_i} + \beta_i \left(\hat{y}^i(j+N_y) - y^i_{ss}\right)^2, \tag{7}$$

and

• $N_y \in \mathbb{Z}_{\geq 0}$ and $N_u \in \mathbb{Z}_{\geq 0}$ are the prediction and control horizons, respectively.
• $\hat{e}^i(t+j) := \hat{y}^i(t+j) - w^i(t+j) \in \mathbb{R}$ is the predicted error for the controlled variable $i$, with respect to the reference sequence $w^i_{[j;j+N_y-1]}$.
• $y^i_{ss} \in \mathbb{R}$ is the steady-state target for the controlled variable $i$, and hence $y^i_{ss} = w^i(t+N_y)$ holds.
• $\epsilon^i_{j+t} \in \{0, 1\}$ is a binary on-off variable that takes into account the violation of a constraint for the controlled variable $i$ at time $j+t$.
• $Q_i, R_k, \beta_i, \Lambda_i \in \mathbb{R}_{\geq 0}$ are nonnegative real weights.
• $l_i, s_k \in \mathbb{R}_{\geq 0}$ are normalization coefficients.

Expression (6) considers incremental MV values $\Delta u^k(j)$, as usual in most predictive control applications.

Finally, extra constraints, aside from the prediction model constraints, are included involving both controlled and manipulated variables.

• Controlled variable constraints: outputs must remain bounded by process limits:

$$\underline{Y}^i \leq y^i(j) \leq \overline{Y}^i, \quad \forall j, \forall i. \tag{8}$$

These constraints will be softened through their inclusion in the objective function as expressed in (7) for appropriate values of $\Lambda_i \in \mathbb{R}_{\geq 0}$.

• Manipulated variable constraints: actuator limits were established. Also, rate constraints were considered for the MVs.

$$\underline{U}^k \leq u^k(j) \leq \overline{U}^k, \quad \forall j, \forall k, \tag{9}$$

$$\left|\Delta u^k(j)\right| \leq \delta u^k, \quad \forall j, \forall k. \tag{10}$$

4.2. Online optimization problem

Even though bagging is a linear method, regression trees are not. Moreover, the derivative of a regression tree is not defined. As the objective function becomes nonlinear and nondifferentiable, evolutionary algorithms (Bansal, 2019) can be used for the optimization problem. Even though these methods do not guarantee convergence and are local optimization methods, research generally supports their use and outcomes (Hock and Schittkowski, 1983; Chiandussi et al., 2012).

In this work, a modified version of the Particle Swarm Optimization (PSO) algorithm (Mezura-Montes and Coello, 2011; Pedersen and Chipperfield, 2010) is used. Since (6) uses $\Delta u^k_{j+i}$ in its formulation, the sequence $\Delta u^k_{[j;j+N_u-1]}$ is unrolled into a particle $X^k$ of size $N_u$.

PSO termination depends on various terminal conditions. Typically, these consist of exceeding a specified number of iterations, objective function stall or timeouts. In an MPC problem, the cost of the objective function itself is of less importance than the sequence of decision variables. A particle distance stall criterion is therefore added: the search terminates if the best particle lies within a distance $\delta_s$ of the previous one for $I_s$ iterations.

4.3. Implementation

Fig. 2 shows the different components of the RF-MPC implemented and their interaction.

Fig. 2. Random forest model predictive controller algorithm and implementation.

Each particle component is constrained to $\delta u^k$ accordingly in the particle generation block. Since predictors $m^i(t+(j-1))$ contain lagged values of $u(t+(j-1))$, the decision variables are decoded accordingly and saturated to $(\underline{U}^k, \overline{U}^k)$ if required.

The Random Forest Prediction block executes the recursive predictive strategy as introduced in Section 3.

All error terms from the objective function (7) are computed in the subsequent block. In the case of the on-off variables $\epsilon^i_j$, the predictions are checked to see if limits are violated. Finally, these results are fed to the objective function block for fitness value computation.

The last step in Fig. 2 consists of the implementation of the PSO algorithm. After checking for terminal conditions, particle positions are updated and the whole process is restarted every $\tau_C$ time units.
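As a rough illustration of how the prediction and error blocks described above could be chained for one candidate sequence, the following Python sketch rolls a one-step predictor forward recursively, using a persistence model for the disturbance, and then evaluates the terms of (6) and (7). It is not the toolbox code: the function names (`recursive_forecast`, `output_cost`, `move_cost`), the scalar input and disturbance, and the assumption that every history is at least as long as its window are simplifications introduced here for brevity.

```python
def recursive_forecast(predict, y_hist, u_hist, d_hist, u_future, na, nb, nd):
    """Roll a one-step-ahead predictor forward over len(u_future) steps.

    y_hist and d_hist hold values up to the current time, u_hist up to the
    previous time. predict(x) is any callable mapping the predictor vector
    (windows of y, u and d concatenated) to the next output.
    """
    y, u, d = list(y_hist), list(u_hist), list(d_hist)
    forecast = []
    for u_now in u_future:
        u.append(u_now)                       # candidate move for this step
        x = y[-na:] + u[-nb:] + d[-nd:]       # windows of y, u and d
        y_next = predict(x)
        forecast.append(y_next)
        y.append(y_next)                      # prediction stands in for the measurement
        d.append(d[-1])                       # persistence: d_hat(t+j) = d(t)
    return forecast


def output_cost(y_hat, w, y_ss, y_min, y_max, Q, Lam, beta, l):
    """V_i of Eq. (7) for one controlled variable; y_hat[t-1] predicts time j+t."""
    n_y = len(y_hat)
    tracking = sum(Q / l * (y_hat[t] - w[t]) ** 2 for t in range(n_y - 1))
    violations = sum(1 for y in y_hat if y < y_min or y > y_max)  # on-off terms
    terminal = beta * (y_hat[-1] - y_ss) ** 2
    return tracking + Lam * violations / l + terminal


def move_cost(du, R, s):
    """Move-suppression term of Eq. (6) for one manipulated variable."""
    return sum(R / s * step ** 2 for step in du)
```

With a toy predictor that returns the last output plus the last input, `recursive_forecast(lambda x: x[1] + x[3], [0.0, 1.0], [0.0], [5.0], [1.0, 2.0, 3.0], 2, 2, 1)` yields `[2.0, 4.0, 7.0]`. In an RF-MPC setting the callable would wrap a trained forest, and the total fitness of a particle would be the sum of `output_cost` over the controlled variables plus `move_cost` over the manipulated variables.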
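The particle distance stall criterion of Section 4.2 can be sketched with a generic global-best PSO. The paper does not detail its modified PSO, so the version below is a textbook variant and only an assumption about how such a criterion might be wired in: the inertia and acceleration constants (0.7, 1.5, 1.5), the swarm size, and the names `pso_minimize`, `delta_s` and `i_s` are illustrative choices, not values taken from the toolbox.

```python
import math
import random


def pso_minimize(cost, lower, upper, n_particles=40, max_iter=500,
                 delta_s=1e-8, i_s=30, seed=0):
    """Global-best PSO with box bounds and a best-particle distance stall test."""
    rng = random.Random(seed)
    dim = len(lower)
    pos = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=p_cost.__getitem__)
    g_best, g_cost = p_best[g][:], p_cost[g]
    stalled = 0
    for _ in range(max_iter):
        previous_best = g_best[:]
        for n in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[n][d] = (0.7 * vel[n][d]
                             + 1.5 * r1 * (p_best[n][d] - pos[n][d])
                             + 1.5 * r2 * (g_best[d] - pos[n][d]))
                # Saturate each component to its bounds.
                pos[n][d] = min(max(pos[n][d] + vel[n][d], lower[d]), upper[d])
            c = cost(pos[n])
            if c < p_cost[n]:
                p_best[n], p_cost[n] = pos[n][:], c
                if c < g_cost:
                    g_best, g_cost = pos[n][:], c
        # Stall test: the best particle moved at most delta_s for i_s iterations in a row.
        moved = math.dist(g_best, previous_best)
        stalled = stalled + 1 if moved <= delta_s else 0
        if stalled >= i_s:
            return g_best, g_cost, "distance_stall"
    return g_best, g_cost, "max_iter"
```

For an RF-MPC particle, `lower` and `upper` would plausibly be the rate limits $-\delta u^k$ and $+\delta u^k$ repeated over the control horizon, and `cost` would decode the particle, run the recursive forest prediction and return the fitness value. A call such as `pso_minimize(lambda x: sum(v * v for v in x), [-5.0] * 3, [5.0] * 3)` exercises the routine on a simple quadratic.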
5. Experimental results

5.1. RF-MPC implementation as a toolbox

The RF-MPC algorithm implemented is fully vectorized. The elements that go through each of the blocks in Fig. 2 are matrices that contain transformations of the particles. For example, the prediction block transforms the particle matrix into a prediction matrix for each separate prediction step and for each separate output.

It was observed in preliminary tests that built-in Matlab functions used for prediction amounted to up to 30% of the execution time of each RF-MPC iteration. To overcome this issue, prediction functions, which load trees, were converted to Matlab executable files (.mex). This approach significantly reduces computation times thanks to efficient dynamic memory usage. When applied to paste thickening, the average prediction time obtained was 3 ms.

The algorithms developed were packaged as a general purpose toolbox that can be adapted to control any system. The toolbox, which is freely available as a repository in GitHub (Díaz, 2019), is comprised of three main libraries:

5.1.1. Machine learning library

Contains all training, validating and testing functions and utilities to generate the Random Forests. It also implements other predictors, such as ARIMAX models.

5.1.2. C-MEX generation library

Creates a compact version of the forests identified for storage and translates the code to the .mex files for fast prediction and use in the Simulink environment.

5.1.3. RF-MPC library

Contains the implementations and algorithms of all the blocks in Fig. 2 as well as other utilities.

The final implementation of the RF-MPC is in the Simulink framework through the custom function block of Fig. 3. Four inputs are considered: the disturbance sequence d, the output sequence y, the reference sequence w, and the system clock clk. As outputs, the block delivers the control action u(t) and additional information on the controller operation: exitflag and fval contain information about PSO termination and final cost, while yHat and controlMoves provide the predicted trajectory and the complete u sequence.

Fig. 3. RF-MPC Simulink implementation.

5.2. System identification results

To obtain a model of the thickener under study, operational data from a real operation was used, which was obtained from the Distributed Control System (DCS) governing the process. The thickener under analysis is controlled by a Siemens S7-400 PLC connected to a Siemens PCS 7 DCS, which concentrates measurements from: two Berthold LB-491 density sensors, two Siemens sitrans P DSIII pressure sensors, three Siemens sitrans FM MAG5100W flow sensors, and the PLA SmartDiver sensor. During data acquisition, the thickener was operated by manually writing the process values into the actuators via the DCS and the PLC. Fig. 4 presents the process and instrumentation diagram of the thickener. Input data is collected at the rate the DCS operates and then downsampled to the predictive model period $\tau_R$. The training dataset with sampling period $\tau_R$ contains 6912 points (about one month of operation) and is shown in Fig. 5. An 85/15 % train-test ratio was used. Additionally, considering the control action period, Table 1 shows the different time scales involved in the problem. As operational data was acquired under manual supervision and control, inputs and outputs are correlated; hence, identification is an even more complex task.

Fig. 4. Process and instrumentation diagram of the industrial thickener used in this work.

5.2.1. Predictive modeling

Performance of random forests as predictors will be benchmarked against an ARIMAX model

$$A(z^{-1})\, y(t) = z^{-L} B(z^{-1})\, u(t-1) + \frac{C(z^{-1})}{1 - z^{-1}}\, e(t). \tag{11}$$

For each output of the ARIMAX model the tuning parameters are $(n_a, n_b = n_p, n_c)$. Additionally, different input delays $L$ can be considered for each transfer function. Random forests possess two additional parameters for performance tuning: the number of ensembles to be learned $B$ and the minimum leaf size $l_{min}$. It is important to note that random forests as expressed in (4) do not account for unmeasured disturbances, that is, $n_c = 0$ for all forests. For identification, certain physical limits of the process itself limit the search space, such as the thickener residence time, whose upper bound is estimated at six hours (Langlois and Cipriano, 2019).

Parameters are selected by a greedy heuristic based on the mean squared one-step ahead prediction error (MSE),

$$J^i_{MSE} = \frac{1}{N} \sum_{k=1}^{N} \left(y^i(k) - \hat{y}^i(k)\right)^2, \tag{12}$$

where $y^i(k)$ represents the $i$th target at time step $k$, while $\hat{y}^i(k)$ represents the predicted value. An exhaustive search was performed by varying $n_a$, $n_b$, and $n_p$ from 0 to 100 in increments of 2, $n_c$ and $L$ from 0 to 10 in increments of 2, $B$ from 0 to 200 in increments of 10 and $l_{min}$ from 0 to 20 in increments of 5. Table 2 summarizes the parameters that delivered the best results in terms of $J^i_{MSE}$. It should be noted that bootstrapping contributes to considering different initializations in the identification process, which increases robustness of the forest. In the case of the ARIMAX model, a consistent convergence of the parameters is observed.

An important conclusion obtained from Table 2 is that both
approaches identify large auto-regressive components. Similarly, the order for the system inputs $n_b$, $n_p$ is also large. Both models take the entire residence time into account for prediction.

Fig. 5. Part of the dataset used for system identification, blue data (6912 datapoints) was used for training and red data (1220 datapoints) for validation. (a): $\phi_f$ (disturbance), (b): $h$ (controlled variable), (c): $\phi_u$ (controlled variable), (d): $Q_u$ (manipulated variable). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1
Sampling periods used in the implementation.
Parameter                     Value   Unit
DCS $T_s$                     1       second
Predictive model $\tau_R$     5       minutes
Control action $\tau_C$       5       minutes

Table 2
Best hyperparameter combination for thickener model identification.
                           Random Forest      ARIMAX
$(n_a, n_b, n_p, n_c)$     (72, 72, 72, 0)    (60, 30, 30, 6)
Delay $L$                  –                  0
$(B, l_{min})$             (100, 10)          –

The best delay identified in the linear predictor is $L = 0$. For this particular system, better representations are obtained by increasing the input filter orders $n_b$, $n_p$. Predictive accuracy is improved by considering a larger historic window because accurate delay identification becomes difficult without input signal design.

Out-of-bag (OOB) samples can be used to estimate predictor importance. Samples not used for training are passed down each tree $T_b$ when grown, and the prediction accuracy is recorded. Values for a variable $k$ are randomly permuted in the OOB samples and accuracy is recomputed. This Permuted Delta Error (PDE) is averaged over all trees, measuring the importance of variable $k$ (Hastie et al., 2009). Fig. 6 shows the logarithm of this importance measure for the $\phi_u(t)$ forest. It can be seen that recent outputs are considered relevant for the prediction. However, recent samples of $Q_u$ score significantly high as well. This property is extremely useful for the generation of a controllable model. Also, predictor importance is statistically significant for the oldest sample for all variables, which explains the need for high-order models.

Fig. 6. PDE for $\phi_u(t)$ forest. The vertical purple lines group the lagged terms of each variable in the predictor.

5.2.2. Model performance evaluation

To evaluate and compare prediction accuracy, the best fit rate (BFR) criterion is used

$$J_{BFR} = \left(1 - \sqrt{\frac{J_{MSE}}{\mathrm{Var}\{y(k)\}}}\right) \cdot 100. \tag{13}$$

Table 3 summarizes the scores of both predictors for the validation data set, under different prediction horizons.

Table 3
BFR of random forests and ARIMAX models for different prediction horizons.
                     1 step   18 steps   24 steps   48 steps
$\phi_u$ ARIMAX      68.72    62.19      60.19      50.26
$\phi_u$ Forest      69.10    57.84      52.89      35.46
$h$ ARIMAX           96.95    86.88      82.81      66.19
$h$ Forest           96.78    85.99      81.92      65.99

Both models fail to characterize $\phi_u$ accurately for large prediction horizons. However, for an 18-step ahead prediction horizon both predictors have acceptable performance. As the horizon gets larger, the forest performance degrades at a larger rate than the ARIMAX. No significant performance difference can be claimed from Table 3.

Fig. 7 contrasts both predictors with the validation data for the 18-step ahead horizon. For the $\phi_u$ prediction, both predictors exhibit good results. While the linear predictor shows larger variance (noisier output), the forest appears to have problems predicting steep upwards trends. Fig. 7b illustrates this clearly for $h$. Nevertheless, the ARIMAX predictor exhibits worse predictive performance for an extensive part of the $h$ validation set. From hours 10 to 60 approximately, the ARIMAX predictor oscillates over the true value. This is explained by two main reasons. Firstly, that portion corresponds to a large deviation from the operating point and hence linear approximations become questionable. Most importantly, however, coefficients in polynomial $B$, which explain the input-output effect, are of order $10^{-2}$ or smaller, while the auto-regressive coefficients are larger. Hence, the ARIMAX model is somewhat insensitive to external inputs.

The models identified show a conflict between accuracy and
6
Fig. 7. Comparison of 18-step ahead prediction of random forest and ARIMAX strategies for (a) ϕu (t) and (b) h(t).
Table 4
Horizons and control period used in the experiments.
Controller action τC 5 min
Prediction Horizon Ny 90 min
Control Horizon Nu 15 min
Table 5
Process limits considered in the experiments.
ϕu h Qu F
Higher X 75.97 6.00 125.00 31.00

Lower X 71.55 1.90 70.00 18.00
Rate ΔX – – 15.00 5.00
controllability. On the one hand, the forest predictor deems both auto-
regressive and exogenous coefficients important, as depicted in Fig. 6.
However, results in Fig. 7 show that the forest forecasting accuracy is Fig. 8. Evolution of ϕf (t) during the ϕu (t) set-point tracking experiment.
biased towards recent samples of the output. The ARIMAX predictor,
however, seems to misinterpret the effect of the inputs on the outputs.
Since in any MPC scheme models are instrumental to determine the 5.3.1. Control specifications
control sequence, the real potential of the models will be clarified in the Table 4 lists the parameters used for control purposes in both stra
following predictive control experiments. tegies, to provide a fair comparison. The prediction horizon corresponds
to the 18-steps ahead established previously. As τC = τR , move blocking
5.3. Control results is not used. Constraints were imposed on process variables based on the
literature regarding thickener operation (Langlois and Cipriano, 2019).
For benchmarking purposes, the proposed RF-MPC is compared to a These are listed in Table 5 and illustrate the tight operating window for
conventional MPC strategy based on the identified ARIMAX model. The ϕu . The operating point specified for control and disturbance variables is
system was driven to steady-state using data from 200 h of real opera ϕu (0) = 73.76%, h(0) = 4.12m, and ϕf (0) = 28.31%.
tion, subject to the process disturbance. The tuning of each controller For this work, and in compliance with thickener control objectives,
was based on trial and error by testing tuples of weighting factors and emphasis was given to setpoint tracking and constraint satisfaction by
recording the MSE as defined in (14) (see Section 5.4). Those tuples with picking large Qi and Λi in (7). Also, flocculant use was penalized with a
the minimum MSE were selected as nominal parameters for each higher cost since it is a resource consumed during operation rather than
controller, in order to compare the best effort for each case. the aperture of a valve. The exact values of all parameters can be found
in Table A.7 and A.8 in the Appendix.
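As a rough illustration of how the settings in Tables 4 and 5 could be used, the sketch below converts the horizons to controller steps, projects a candidate input sequence onto the magnitude and rate limits of Qu and F, and measures how far predicted outputs leave the ϕu and h window (the quantity a soft constraint would penalize). This is not the authors' implementation: the names `project_moves` and `limit_violation` are hypothetical, and it is assumed here that the rate limit ΔX applies per control move and that the previous input lies within its limits.

```python
import numpy as np

# Values copied from Tables 4 and 5; all names are illustrative assumptions.
TAU_C_MIN = 5.0                    # controller period, min
N_Y = round(90.0 / TAU_C_MIN)      # prediction horizon in controller steps
N_U = round(15.0 / TAU_C_MIN)      # control horizon in controller steps

U_MIN = np.array([70.0, 18.0])     # lower limits for (Qu, F)
U_MAX = np.array([125.0, 31.0])    # upper limits for (Qu, F)
DU_MAX = np.array([15.0, 5.0])     # assumed maximum change per move for (Qu, F)

Y_MIN = np.array([71.55, 1.90])    # lower limits for (phi_u, h)
Y_MAX = np.array([75.97, 6.00])    # upper limits for (phi_u, h)


def project_moves(u_prev, u_candidate):
    """Clip a candidate input sequence to the rate and magnitude limits.

    u_prev: shape (2,), inputs applied at the previous step (within limits).
    u_candidate: shape (n_moves, 2), proposed inputs over the control horizon.
    """
    last = np.asarray(u_prev, dtype=float)
    u_candidate = np.asarray(u_candidate, dtype=float)
    feasible = np.empty_like(u_candidate)
    for k, u in enumerate(u_candidate):
        step = np.clip(u - last, -DU_MAX, DU_MAX)   # rate limit
        last = np.clip(last + step, U_MIN, U_MAX)   # magnitude limit
        feasible[k] = last
    return feasible


def limit_violation(y_pred):
    """Distance of each predicted output from its window (zero when inside)."""
    y_pred = np.asarray(y_pred, dtype=float)
    return np.maximum(y_pred - Y_MAX, 0.0) + np.maximum(Y_MIN - y_pred, 0.0)


# Example: a request far outside the limits is reduced move by move.
moves = project_moves([120.0, 30.0], [[200.0, 10.0]] * N_U)
```

In an evolutionary solver, a projection of this kind can keep every candidate feasible for the manipulated variables, while the output violation would typically enter the cost through a large weight such as Λi.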
As the proposed RF-MPC depends on PSO to solve the online optimization problem, tuning is necessary for its correct execution. Internal PSO parameters (velocity and particle update control) were left at their default values (Pedersen and Chipperfield, 2010). Termination conditions, on the other hand, were fine-tuned for both performance and computation speed. These values are summarized in Table A.9 in the Appendix.

5.4. Setpoint tracking

The primary objective in thickener control is to maintain a high ϕu throughout operation while rejecting solid intake disturbances (Jewell and Fourie, 2015). To test controller performance, an abrupt change of ±1% was applied to the ϕu setpoint over the initial operating point for a time window of 100 h.

Fig. 8 shows the evolution of ϕf (real operational data) during the simulation horizon. As ϕf is collected manually on an hourly basis, it is subject to a slower rate of change.

Simulation results for the setpoint tracking experiment are shown in Fig. 9. It can be seen that RF-MPC performs better in these tests for both controlled variables.

Fig. 9. Performance comparison of MPC and RF-MPC for setpoint tracking tests. (a) Controlled variables ϕu(t) (top) and h(t) (bottom). (b) Controller outputs Qu(t) (top) and F(t) (bottom).

A closer look at Fig. 9b elucidates this behaviour. For the first 40 h of the simulation, both controllers push Qu to levels near its upper limit. In fact, conventional MPC saturates its output while the RF-MPC outputs subtle variations.

However, as shown in Fig. 9a, conventional MPC stabilizes ϕu with no steady-state error, while the proposed strategy is unable to do so. The reason is that the interface level h rises towards its limit. Hence, the RF-MPC reduces Qu beforehand to track both setpoints simultaneously. In fact, conventional MPC violates the h limit in Table 5.

After this time period, the setpoint is changed back to its original state. This forces both controllers to abruptly reduce Qu levels. However, as h drops rapidly, a situation similar to the results depicted in Fig. 7b occurs. The poorly identified influence of exogenous variables on the outputs of the ARIMAX model produces a destabilizing response in the conventional linear MPC strategy. The linear predictor estimates that lower saturation of the inputs is optimal. The RF-MPC strategy, however, offers a radically different solution, attaining stabilization of ϕu with no steady-state error between hours 40 and 70.

The last portion of the simulation shows that the RF-MPC fails to drive ϕu to its setpoint. The decrease of ϕf in the last 20 h of the horizon explains this. To cope with this, the RF-MPC increases the average level of flocculant added to the system to maintain both ϕu and h as close as possible to their setpoints.

Quantitative comparison of the controller performance can be made in terms of the mean squared error (MSE) and maximum absolute error (MAE),

$$\mathrm{MSE} = \frac{\sum_{k=1}^{N} \left(w(k) - y(k)\right)^{2}}{N}, \qquad \mathrm{MAE} = \max_{k \in [1;N]} \left| w(k) - y(k) \right|. \tag{14}$$

Scores obtained by each controller are summarized in Table 6. Numerical results show that the RF-MPC performs between 2.5 and almost 10 times better for both outputs. As ϕu setpoint tracking is relevant for paste production, a low MSE for ϕu ensures correct thickening operation.

Table 6
Performance indicators of RF-MPC and conventional MPC for set-point tracking.
          MSE ϕu(t)   MAE ϕu(t)   MSE h(t)   MAE h(t)
RF-MPC    0.109       0.960       0.544      1.31
MPC       0.949       1.770       4.943      4

Interface level h does not need, in general, to follow a specific setpoint but rather to be contained within the limits of Table 5. This guarantees that the effluent or clear water presents a low solid content. As explained, the RF-MPC succeeds in keeping h contained in this operating window. This is reflected in the MAE obtained for this variable; for conventional MPC, which reaches both the high and low limits, the MAE is more than twice as large.

6. Conclusions

This article proposes a novel model predictive control strategy based on machine learning techniques. The main contribution consists in generating a purely data-driven controller in the form of a toolbox. As random forests have recently gained scientific acclaim for nonlinear model identification, this technique was chosen for the predictive model.

The use of random forests for multiple-step-ahead prediction results in a non-convex and nonlinear online optimization problem for the controller. To overcome this issue, another machine learning approach,
particle swarm optimization, was applied. As a result, a general purpose data-driven controller was implemented for simulation studies, which can be ported to other processes.

The RF-MPC designed was evaluated in the context of paste tailing production, a nonlinear process subject to important disturbances. Both system identification and control tests were done on a pseudo-real dataset, comprised of real operational inputs and simulated outputs.

Finally, the performance of the RF-MPC was compared to a conventional MPC benchmark strategy. Results show that the RF-MPC performs better both qualitatively and quantitatively.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Pablo Diaz: Methodology, Software, Investigation, Formal analysis, Writing - original draft. Juan C. Salas: Conceptualization, Methodology, Investigation, Writing - original draft. Aldo Cipriano: Conceptualization, Writing - original draft, Writing - review & editing. Felipe Núñez: Conceptualization, Formal analysis, Writing - original draft, Writing - review & editing, Supervision, Resources, Funding acquisition.

References

Bansal, J., 2019. Particle swarm optimization. Stud. Comput. Intell. 779, 11–23.
Benali, L., Notton, G., Fouilloy, A., Voyant, C., Dizene, R., 2019. Solar radiation forecasting using artificial neural network and random forest methods: Application to normal beam, horizontal diffuse and global components. Renew. Energy 132, 871–884.
Betancourt, F., Burger, R., Diehl, S., Faras, S., 2014. Modeling and controlling clarifier-thickeners fed by suspensions with time-dependent properties. Miner. Eng. 62, 91–101.
Bontempi, G., Ben Taieb, S., Le Borgne, Y., 2013. Machine Learning Strategies for Time Series Forecasting. Springer, Berlin, Heidelberg, pp. 62–77.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Breiman, L., Friedman, J., Olshen, Noack, R., Stone, R., 1984. Classification and Regression Trees. Chapman-Hall-CRC.
Bünning, F., Huber, B., Heer, P., Aboudonia, A., Lygeros, J., 2020. Experimental demonstration of data predictive control for energy optimization and thermal comfort in buildings. Energy Build. 211, 109792.
Cacciuttolo, C., Holgado, A., 2016. Management of paste tailings in Chile: A review of practical experience and environmental acceptance. In: Proceedings of the 19th International Seminar on Paste and Thickened Tailings, pp. 121–136.
Chai, T., Jia, Y., Li, H., Wang, H., 2016. An intelligent switching control for a mixed separation thickener process. Control Eng. Pract. 57, 61–71.
Chiandussi, G., Codegone, M., Ferrero, S., Varesio, F., 2012. Comparison of multi-objective optimization methodologies for engineering applications. Comput. Math. Appl. 63, 912–942.
Concha, F., 2014. Solid-liquid separation in the mining industry. Springer.
Díaz, P., 2019. Random forest model predictive control. URL https://github.com/pdiaz2/Espesador_Matlab.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Stat. 7, 1–26.
Fernandez-Delgado, M., Cernadas, E., Barro, S., Amorim, D., 2014. Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 15, 3133–3181.
Grune, L., Pannek, J., 2017. Nonlinear Model Predictive Control: Theory and Algorithms, second ed. Springer.
Hastie, T., Tibshirani, R., Friedman, J., 2009. Elements of Statistical Learning, second ed. Springer.
Hock, W., Schittkowski, K., 1983. A comparative performance evaluation of 27 nonlinear programming codes. Computing 30, 335.
Jain, A., Smarra, F., Mangharam, R., 2017. Data predictive control using regression trees and ensemble learning. In: 56th IEEE Conference on Decision and Control (CDC), pp. 4446–4451.
Jewell, R., Fourie, A., 2015. Paste and thickened tailings: a guide, third ed. Australian Centre for Geomechanics, The University of Western Australia.
Lahouar, A., Slama, J.B.H., 2017. Hour-ahead wind power forecast based on random forests. Renew. Energy 109, 529–541.
Langarica, S., Rüffelmacher, C., Núñez, F., 2020. An industrial internet application for real-time fault diagnosis in industrial motors. IEEE Trans. Autom. Sci. Eng. 17, 284–295.
Langlois, J.I., Cipriano, A., 2019. Dynamic modeling and simulation of tailing thickener units for the development of control strategies. Miner. Eng. 131, 131–139.
McCoy, J., Auret, L., 2019. Machine learning applications in minerals processing: A review. Miner. Eng. 132, 95–109.
Mezura-Montes, E., Coello, C.A., 2011. Constraint-handling in nature-inspired numerical optimization: Past, present and future. Swarm Evol. Comput. 1, 173–194.
Núñez, F., Langarica, S., Díaz, P., Torres, M., Salas, J.C., 2020. Neural network-based model predictive control of a paste thickener over an industrial internet platform. IEEE Trans. Industr. Inf. 16, 2859–2867.
Ojeda, P., Bergh, L., Torres, L., 2014. Intelligent Control of an Industrial Thickener. In: 13th International Conference on Control Automation Robotics and Vision.
Owen, A., Nguyen, T., Fawell, P., 2009. The effect of flocculant solution transport and addition conditions on feedwell performance in gravity thickeners. Int. J. Miner. Process. 93, 115–127.
Pedersen, M., Chipperfield, A., 2010. Simplifying particle swarm optimization. Appl. Soft Comput. 10, 618–628.
Qin, S., Badgwell, T.A., 2003. A survey of industrial model predictive control technology. Control Eng. Pract. 11, 733–764.
Qiu, X., Zhang, L., Suganthan, P.N., Amaratunga, G.A., 2017. Oblique random forest ensemble via least square estimation for time series forecasting. Inf. Sci. 420, 249–262.
Segovia, J., Concha, F., Sbarbaro, D., 2011. On the Control of Sludge Level and Underflow Concentration in Industrial Thickeners. In: Proceedings of the 18th World Congress of The International Federation of Automatic Control.
Shahrivar, A., Goharrizi, S., Ebrahimzadeh, M., Sarafi, A., Mohammad, R., Hadi, A., 2013. Application of response surface methodology and central composite rotatable design for modeling the influence of some operating variables of the lab scale thickener performance. Int. J. Min. Sci. Technol. 23, 717–724.
Smarra, F., Jain, A., de Rubeis, T., Ambrosini, D., D’Innocenzo, A., Mangharam, R., 2018. Data-driven model predictive control using random forests for building energy optimization and climate control. Appl. Energy 226, 1252–1272.
Stromberg, K., 2016. Thickened tailings management, a dynamic process: Understanding and optimizing the thickener operation. In: Proceedings of the 19th International Seminar on Paste and Thickened Tailings, Gecamin Chile, 2016, pp. 31–43.
Takagi, T., Sugeno, M., 1985. Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybernet. SMC-15, 116–132.
Tan, C.K., Setiawan, R., Bao, J., Bickert, G., 2015. Studies on parameter estimation and model predictive control of paste thickeners. J. Process Control 28, 1–8.
Tan, C.K., Bao, J., Bickert, G., 2017. A study on model predictive control in paste thickeners with rake torque constraint. Miner. Eng. 105, 52–62.
Wang, R., Bao, J., Yao, Y., 2019. A data-centric predictive control approach for nonlinear chemical processes. Chem. Eng. Res. Des. 142, 154–164.
Xu, N., Wang, X., Zhou, J., Wang, Q., Fang, W., Peng, X., 2015. An intelligent control strategy for thickening process. Int. J. Miner. Process. 142, 56–62.
Zhang, W., Quan, H., Srinivasan, D., 2018. Parallel and reliable probabilistic load forecasting via quantile regression forest and quantile determination. Energy 160, 810–819.

Appendix A. Controller parameters

Tables A.7–A.9

Table A.7
RF-MPC parameters for controlled variables.
                                ϕu      h
Tracking Weights Qi             100     100
Terminal Weights βi             100     100
Constraint Weights Λi           10000   10000
Normalization Coefficient li    3       5

Table A.8
RF-MPC parameters for manipulated variables.
                                Qu      F
Control Effort Weights Rj       0.05    0.50
Normalization Coefficient skj   δu1     δu2

Table A.9
Particle Swarm Optimization parameters.
Parameter                           Symbol   Value
Swarm Size                          K        100
Maximum Iterations Number           I        30
Cost Function Stall Tolerance       δs       5 × 10⁻³
Maximum Stall Iterations            Is       21
Objective Function Tolerance        ∊        1 × 10⁻³
Particle Distance Tolerance         ∊x       1 × 10⁻³
Maximum Particle Stall Iterations   Σx       21
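To make the role of the Table A.9 settings more concrete, the following is a minimal sketch of a particle swarm loop that uses the swarm size, the iteration cap and the cost stall criterion as termination controls. It is a generic textbook-style PSO under stated assumptions, not the solver used in this work: the function name `pso_minimize` is hypothetical, the inertia and acceleration coefficients are illustrative values rather than the defaults referred to in Section 5.3.1, and the particle-distance criteria of Table A.9 are omitted for brevity.

```python
import numpy as np


def pso_minimize(cost, lower, upper, swarm_size=100, max_iter=30,
                 stall_tol=5e-3, max_stall_iter=21, seed=0):
    """Minimize cost(x) over the box [lower, upper] with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Illustrative coefficients (assumption), not the paper's defaults.
    inertia, c_personal, c_global = 0.7, 1.5, 1.5

    x = rng.uniform(lower, upper, size=(swarm_size, lower.size))
    v = np.zeros_like(x)
    p_best = x.copy()                          # best position of each particle
    p_cost = np.array([cost(p) for p in x])    # cost at each personal best
    best = int(np.argmin(p_cost))
    g_best, g_cost = p_best[best].copy(), float(p_cost[best])

    stall, iterations = 0, 0
    for _ in range(max_iter):
        iterations += 1
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = (inertia * v
             + c_personal * r1 * (p_best - x)
             + c_global * r2 * (g_best - x))
        x = np.clip(x + v, lower, upper)       # keep particles inside the box

        c = np.array([cost(p) for p in x])
        improved = c < p_cost
        p_best[improved] = x[improved]
        p_cost[improved] = c[improved]

        best = int(np.argmin(p_cost))
        gain = g_cost - float(p_cost[best])    # improvement of the global best
        if gain > 0.0:
            g_best, g_cost = p_best[best].copy(), float(p_cost[best])

        # Count consecutive iterations whose improvement is below tolerance.
        stall = 0 if gain > stall_tol else stall + 1
        if stall >= max_stall_iter:
            break
    return g_best, g_cost, iterations


# Example on a simple quadratic cost with two decision variables.
x_opt, f_opt, n_iter = pso_minimize(lambda z: float(np.sum(z ** 2)),
                                    [-5.0, -5.0], [5.0, 5.0])
```

With a cap of 30 iterations and a stall window of 21, the stall test can only trigger in the last third of the run, so in a sketch like this the iteration cap mostly bounds the computation time per control step.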
