
Proceedings, 10th IFAC International Symposium on Advanced Control of Chemical Processes
Shenyang, Liaoning, China, July 25-27, 2018

Available online at www.sciencedirect.com
ScienceDirect
IFAC PapersOnLine 51-18 (2018) 132–137
Automated System Identification in Mineral Processing Industries: A Case Study using the Zinc Flotation Cell

Yuri A.W. Shardt*, Kevin Brooks†

*Technical University of Ilmenau, Ilmenau, Canada (e-mail: yuri.shardt@tu-ilmenau.de)
† BluESP, 53 Platina St, Randburg, South Africa, +27 11 251 5900 (e-mail: kevin.brooks@bluesp.co.za)

Abstract: In many industries, including the mineral processing industry, process modelling can be improved by mining the data historian. However, the data in the historian is often contaminated with missing values, unknown operating conditions, and other imperfections. Furthermore, manual segmentation of the data is difficult due to the large number of data points and variables. Thus, there is a need to develop and implement methods that can automatically segment the data set into viable components for identification purposes. One approach uses Laguerre models to segment the data set. However, when used in a multivariate situation, such as in the zinc flotation cell, various issues, such as collinearity, arise. Therefore, the data segmentation algorithm needs to take this into consideration when examining a data set. Using the zinc flotation cell, it is shown that for the multivariate case preselecting the data variables to consider improves the data segmentation.

© 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

Keywords: system identification, data mining, zinc flotation cell

In principle, all of these issues could be addressed by
cell
1.
1. INTRODUCTION
INTRODUCTION havingIn
In principle,
a large period
principle, all
all of
of ofthese
data issues
these available.
issues could
could The be
be addressed
challenge
addressed by
then
by
In process industries, 1.
1. INTRODUCTION
1. INTRODUCTIONwhen implementing control having
INTRODUCTION In
In
In principle,
principle,
principle, all
all
all of
of
of these
these
these issues
issues
issues could
could
could be
be
be addressed
addressed
addressed by
by
by
becomes
having a
a large
large period
investigating
period of
of data
this
data available.
data for
available. The
periods
The challenge
that
challenge can then
and
then
In process
strategies,
In process models industries,
of
process industries, varying when
accuracy implementing
are
when implementing required.
implementing control control
This is
control becomes having
having
having a
a
a large
large
large period
period
period of
of
of data
data
data available.
available.
available. The
The
The challenge
challenge
challenge then
then
then
In
In process industries,
industries, when
when implementing control cannot
becomes investigating
beinvestigating this
used for identification.
this data
data for
forFor periods that
a largethat
periods can
dataset
can and
this
and
strategies,
especially
strategies, models
the case
models of
of varying
with
varying model accuracy
predictive
accuracy are
are required.
required. This
control (MPC).
This is
is becomes
becomes
becomes investigating
investigating
investigating this
this data
this data
data for for
forFor periods
periods
periods that
that
that can
can and
and
can this
and
strategies,
strategies, models
models of
of varying
varying accuracy
accuracy are
are required.
required. This
This is
is cannot
cannot be
be
be used
done
used for
manually.
for identification.
identification. For aa large
large dataset
dataset this
especially
MPC has
especially the
the case with
become
case withthe model
standard
model predictive
in
predictive the control
refining
control (MPC).
(MPC).and cannot be
cannot be used
used
be done for
for
usedmanually. identification.
identification. For
For aa large
large
for identification. For a large dataset this dataset
dataset this
this
especially
especially the
the case
case with
with model
model predictive control (MPC). cannot
cannot be
be done manually.
MPC
MPC
MPC
has
petrochemical
has
has
become
become
become industries the
the
the (Qin predictive
standard
standard
standard &in
in
in
the
the
the
control
Badgwell, refining
refining
refining
(MPC).and
2003).
and
and
cannot
cannot
cannot be
Therefore,
be done
be done manually.
done there
manually.
manually. is a need to investigate the use of
MPC
MPC has
has
petrochemical become
become industries the
the standard
standard
(Qin & in
in the
the
Badgwell, refining
refining and
and
2003). Therefore, there is
Furthermore,
petrochemical
petrochemical
petrochemical
this technology
industries
industries
industries (Qin
(Qin
(Qin
is seeing &
&
&
some
Badgwell,
Badgwell,
Badgwell,
application 2003).
2003).
2003).
in algorithms
Therefore,
Therefore,
Therefore,
that can
there
there
there is aaaa need
calculate
is
is
need
periods
need
need
to
to of
to
to
investigate
data for which
investigate
investigate
investigate
the
the use
the
the
use
model
use
use
of
of
of
of
petrochemical
Furthermore,
the mining, this
metals industries
technology
and minerals (Qinis seeing
area & Badgwell,
some
(Olivier & application
Craig, 2003).
2017). in Therefore,
algorithms
identification thatis there
can
likely is
calculate
to a need
periods
succeed. to investigate
of data for the
which use
model of
Furthermore,
Furthermore, this
this technology is seeing some application in algorithms
algorithms that
that can
can calculate
calculate periods
periods of
of data
data for
for which
which model
model
Furthermore,
the mining, this technology
metals technology
and minerals
is seeing
isarea
seeing some
some&application
(Olivier application
Craig, 2017).
in
in algorithms
algorithms that
identification thatis can
can
likelycalculate
calculate
to periods
periods of
succeed. of data
data forfor which
which model model
the Commercial
the mining, metals
metals and and minerals
minerals areamakes
(Olivieruse & Craig,
Craig, 2017). identification isbased
likely on to succeed.
succeed.
the mining,
mining, metalsMPC technology
and minerals area
area (Olivier
(Olivier & of linear
& Craig, 2017).
2017). identification
or identification
Recently, is
identification is likely
is likely
likely to to
to succeed.
the previous work of detecting
succeed.
Commercial
nonlinear models
Commercial MPC
that
MPC technology
are obtained
technology makes
from
makes use
performing
use of
of linear
planned
linear or
or Recently,
transients
Recently, (Horch,based
based on
2000),
on the
data previous
the impact analysis
previous work
work of detecting
(Carrette,
of detecting et
Commercial
Commercial MPC
MPC technology
technology makes
makes use
use of
of linear
linear or
or Recently, based
Recently,
Recently, based on
based on the
on the
the previous
previous
previous work of
work
work of
of detecting
detecting
detecting
nonlinear
experiments
nonlinear models
on
models the that
plant.
that are
are obtained
Since
obtained this from
step
from performing
testing
performing is planned
expensive
planned transients
al., 1996),
transients (Horch,
and
(Horch, 2000),
segmentation
2000), data
data impact
for
impact analysis
inferential
analysis (Carrette,
controllers
(Carrette, et
et
nonlinear
nonlinear models
models that
that are
are obtained
obtained from
from performing
performing planned
planned transients (Horch,
transients
transients (Horch, 2000),
(Horch, 2000), data
2000), data impact
data impact analysis
impact analysis (Carrette,
analysis (Carrette, et
(Carrette, et
et
experiments
from an
experiments on
on the
engineering
the plant.
plant. hoursSince
Since this
perspective,
this step
step testing
it
testing has is
is expensive
led to
expensive the al., 1996),
(Amirthalingam,
al., 1996), and
and segmentation
et al.,
segmentation 2000), for
for inferential
two approaches
inferential controllers
controllers for
experiments
experiments on
on the plant. Since this step testing is expensive al.,
al., 1996),
1996), and
and segmentation
segmentation for
for inferential
inferential controllers
controllers
from
from an
development
from an bythe
an engineering
engineering
engineering
plant.
various hours
hours
hours
Since
companies this step
perspective,
perspective,
perspective,
testing
it
it has
of automated
it has
is expensive
has led to
to the
ledstepping
al., 1996), the
(Amirthalingam,
determining and segmentation
(Amirthalingam,suitability
the (Amirthalingam,
(Amirthalingam, et
et al., 2000),
al., of 2000),
al., a given
2000),
for datainferential
two
twosegment
two approaches
approaches
controllers
for control for
for
from
tools an
development engineering
(Kalafatis, by various
et al., hours perspective,
companies
2006; Darby of &automatedhas led
itNikolaou, to
to the
ledstepping the purposes,
2014), (Amirthalingam,
determining the
especially
et al.,
et
et
suitability al., of
2000),
2000),
a
identification, given two
two
data
have
approaches
approaches
approaches
segment
been for
developed.
for
for
for
control
The
development
development by
by various
various companies
companies of
of automated
automated stepping
stepping determining
determining
determining the
the
the suitability
suitability
suitability of
of
of a
a
a given
given
given data
data
data segment
segment
segment for
for
for control
control
control
development
tools (Kalafatis, by various
et al., companies of automated stepping determining the suitability of a given data segment for control
which
tools
tools require bootstrapping
(Kalafatis, et al., 2006;
2006; Darby
through
Darby &
the
& Nikolaou,
generation2014),
Nikolaou, 2014), purposes,
of a purposes,
first
purposes,method especially
developed
especially identification,
by Peretzki
identification, have been
have been
been developed.
(2011) uses
et al.developed.
developed. The
The
tools (Kalafatis,
which
“seed” (Kalafatis,
require
matrix. This
et
et isal.,
bootstrapping
2006;
al.,a response
2006;through Darby
Darby
matrix
&
&for
the
Nikolaou,
Nikolaou,
generation
the system
2014),
2014),
ofthata purposes,
first
Laguerre method
especially
especially
models developed
as
identification,
identification,
the basis by for
have
have been
Peretzki
extracting et al.
the developed.
(2011)
desired
The
The
uses
model
which require
which require bootstrapping
bootstrapping through through the the generation
generation of of aa first first method
method developed
developed by by Peretzki
Peretzki et et al. (2011) uses
al. (2011) uses
which
“seed” require
matrix. bootstrapping
This is through the generation of a first method
Laguerre modelsmodels developed
as the
the by
basis for Peretzki
for ofextracting et al. (2011)
the desired
desired uses
model
expresses
“seed”
“seed” the key
matrix.
matrix. This
This is
is aa response
responsewhile
relationships, matrix
matrix fornecessarily
notfor
for the system
the systembeing
system that Laguerre
that conditions. The key as advantage
basis this approach
for extracting the is thatmodelthe
“seed”
expresses
extremelymatrix.
the key
precise. is aa response
Thisrelationships,
This response
matrix
matrix
matrix
while
is not
normally
the
fornecessarily
thegenerated
systembeing that
that
by
Laguerre
Laguerre
conditions.
process
models
models
time The delay
as
as the
key the basis for
basis
advantage
is not
extracting
extracting
of
required. this
the desired
the
approach
However,
desired
thisis
model
model
that
methodthe
expresses
expresses the
the key
key relationships,
relationships, while
while not
not necessarily
necessarily being
being conditions.
conditions. The
conditions. The
The key key advantage
key advantage
advantage of of this
of this approach
this approach
approach is
is
is that
that
that the
the
the
expresses
extremely
performing the
somekey
precise. relationships,
This
manual matrix
steps, while
is
which not
normally
runs necessarily
countergenerated
to thebeingby
aim conditions.
process
only time
works The delay
with key
data advantage
is not of
required.
obtained underthis approach
However,
open-loop thisis
or that
methodthe
closed-
extremely
extremely precise.
precise. This
This matrix
matrix is
is normally
normally generated
generated by
by process
process time
process time
time delaydelay
delay is is not
is not required.
not required. However,
required. However,
However, this this method
this method
method
extremely
performing
of reducingsomeprecise.
someor manual This
manual
eliminating matrix
steps, is
which
the need normally
runs countergenerated
to the
for engineering by
aim process
only
loop time
works
conditions delay
with data is not required.
obtained under However,
open-loop thisor method
closed-
performing
performing
performing
some
some
manual
manual
steps, which
steps, which
steps, whichneed
runs counter
runs counter to the
to the aim
to the aim
runs counterengineering aim
aim only
only
only
works
only works
works
works with where
with
with
with
data
data
data
the reference
obtained
data obtained
obtained
obtained
under signal changes.
open-loop
under open-loop
under
under open-loop
open-loop
or
or closed-
or
or
The
closed-
closed-
closed-
of
of reducing
supervision
reducingduring or eliminating
eliminating the
or testing. the need need for for engineering
engineering loop loop
second
loop conditions
approach
conditions where
developed
where the
the byreference
Shardt
reference signal
and Huang
signal changes.
(2013)
changes. The
uses
The
of
of reducing
reducingduring or eliminating
or testing. the
eliminating the need for engineering asecond for loop conditions
loop conditions where
conditions where the reference
the
the reference signal
reference signal changes.
signal changes. The
changes. The
supervision
supervision during testing. second approach
condition
approachnumber developed
based
developed by
on
by Shardt
fitting
Shardt an and Huang
autoregressive
and Huang (2013)
(2013) uses
model
uses
supervision
supervision during
Since theduringmajority testing.
testing. of MPCs are installed after the plant second second
second approach
approach
approach developed
developed by
by Shardt
Shardt and
and Huang
Huang (2013)
(2013)
(2013) uses
uses
uses
awith
condition
exogenous number based
input (ARX) on fitting
to theanan autoregressive
andata to determine modelthe
Since
has Since
Since the majority
been operational
majorityforof
the majority
the
of MPCs
ofsome
MPCs
MPCs
are
time,
are the
are
installed
question
installed
installed
after
after
after
the
arises as to aaawith
plant
the plant
the plant
condition
condition
condition
exogenous
number
number
number
based
based
based
input (ARX)
on fitting
on fitting
onoffitting
to the
autoregressive
autoregressive
andata
autoregressive
to
model
model
model
model
Since the majority of MPCs are installed quality. The
with exogenous key
exogenous input advantage
input (ARX)
(ARX) to this
to the approach
the data
data to is determine
that
to determine it
determine the can the
be
the
has
has
been
whether
has been
been
operational
historical
operational
operational
for
data,
for
for
some
collected
some
some
time,
time,
time,
the
possibly
the
the overafter
question
question
question years the
arises
arises
arises
plant
as
of plant to
to with
as to
as with
quality.exogenous
The key input
advantage (ARX) of to the
this data
approach tois determine
determine
that it can
the
the
has been operational
has beenhistorical
operational for
for some
some time, the question arises applied
quality. to
The any
key operating
advantage conditions,
of this
as to quality. The key advantage of this approach is that it can be including
approach is thatclosed-loop
it can be
be
whether
operation,
whether
whether could be
historical
historical
data,
data,
data, used to time,
collected
collected
collected generate the question
possibly
possibly
possibly
over
these
over
over
years
seed
years
years
of
of
of
plant
models.
plant
plant quality.
applied The
to keyoperating
any advantage ofreference
this approach
conditions, including isbut
that it can be
closed-loop be
whether historical data, collected possibly over years of plant without
applied
applied to any
to
to any excitations
any operating
any operating in the
conditions,
operating conditions, signal,
including
conditions, including excitations
closed-loop
including closed-loop
closed-loop
operation,
Experience
operation,
operation, could
could
could be
in attempting
be
be used
used
used to
this
to
to generate
has led to these
generate
generate these seed
the conclusion
these seed
seed models.
models.
models.that appliedapplied
without to
any any operating
excitations in conditions,
the reference including
signal, but closed-loop
excitations
operation, could be be used to has generate these seed models. in the disturbance
without
without any excitations
any excitations
excitations signal, in that
in the is, it can signal,
reference
the reference use routine
signal, but operating
excitations
but excitations
excitations
Experience
historical
Experience
Experience in
data
in
in attempting
can
attempting
attempting this
used,
this
this but led
has
has thatto
led
led to
to the
there
the
the conclusion
are practical
conclusion
conclusion that
that
that without
without
in the any
any excitations
disturbance signal, in
in the
the
that reference
is, it can signal,
use but
but
routine excitations
operating
Experience in attempting this has led to the conclusion that data.
in the
in the On the
the disturbance other
disturbance
disturbance signal, hand,
signal, it
signal, that does
that
that is, require
is, it
is, it can
it can knowledge
use
can use routine
use routine of both the
operating
routine operating
operating
historical
difficulties
historical
historical data
data
data can
in doing
can
can be
so.
be
be used,
These
used,
used, but
include:
but
but that
that
that there
periods
there
there are
of
are
aredatapractical
where in
practical
practical in the
data. Ondisturbance
the other signal,
hand, it that
does is, it can use routine
knowledge ofoperating
both the
historical
difficulties data can
in doing
doing be
so.system used, but
These isinclude:
include: that there
periods are practical
of required
data where
where process
data. On
data. On
On theorders
the other
the other and
other hand,hand, time
hand, it it delay
does
it does require
does require in
require
require order
knowledgeto
knowledge of estimate
of both
of both
both thethe
the
the base level
difficulties
difficulties in
in control
doing so.
so. These
These not in periods
include: the mode
periods of
of data
data where to data. data.
processOn the
ordersother andhand, timeit does require knowledge
knowledge of both the
difficulties
the base in
level doing
control so. These
system include:
is not in periods
the mode of data
requiredwhere to condition
process
process number
orders
orders and
andof the
time
time data delay
matrix.
delay
delay in
in
in order
Recent
order
order to
work
to
to estimate
has
estimate
estimate shownthe
the
the
identify
the base the
levelmodels;
control saturation
system is of
not PIDin loops;
the mode correlation
required of
to process
process
condition orders
orders and
and time
time delay
delay in
in order
order to
to estimate
estimate the
the
the
the
the base
base
base level
level
level control
control
control system
system
system is not
is not in
in the
the mode
mode required to
required to
to that, sincenumber
condition the Laguerre
number of the
of the data
theapproach matrix.
data matrix.
matrix. Recent
doesRecent work
not require
Recent has
work knowledge shown
has shown
shown
identify
inputs
identify
identify
the
leading
the
the
models;
models;
models; models;ispoor
to poorsaturation
saturation
saturation
notPID
of
of
of PID
PID
in the
loops;
excitation
loops;
loops;
mode of required
correlation
the inputs;
correlation
correlation
of
of
of
condition
condition
that, since
number
number
the Laguerre
of
ofitthe data
data
approachmatrix.does Recent
not
work has
has
work knowledge
require has shown
shown
identify
identify
inputs the
the
leading models;
models;
to poor saturation
saturation
models; of
of
poor PID
PID loops;
loops;
excitation correlation
correlation
of the inputs;of
of of
that,the
that, time
since
since the
thedelay,
Laguerre
Laguerre can be
approach
approach usefuldoes
does in extracting
not
not require
require data
knowledge
knowledgefrom
and dad
inputs
inputs data,
leading
leading which
to
to poor
poor is extremely
models;
models; common
poor
poor for
excitation
excitation analysers.
of
of the
the inputs;
inputs; that,
of thesince
time the Laguerre
delay, it canapproach
be usefuldoes in not require
extracting knowledge
data from
inputs
inputs
and leading
dadleading to poor
to poor
data, which models;
models; poor
is extremely poor
common excitation
excitation
for industrial
of the time historians
delay, (Shardt
it can be
of the inputs; of the time delay, it can be useful in extracting data from & Shah,
useful 2014;
in Bittencourt,
extracting data et al.,
from
and
and dad
dad data,
data, which
which is is extremely
extremely common common for for analysers.
analysers. of the time
industrial delay,
historians it(Shardt
can of be&these
useful
Shah, in extracting
2014; Bittencourt, data etfromal.,
and dad data, which is extremely common for analysers. analysers. 2015).
industrial The
industrial historians
industrial
application
historians
historians (Shardt(Shardt
(Shardt & & Shah,
& Shah, methods
Shah, 2014;2014; to
Bittencourt,
2014; Bittencourt, open-loop,
Bittencourt, et et al.,
et al.,
al.,
industrial
2015). The historians
application (Shardt of & Shah,
these 2014;
methods Bittencourt,
to et
open-loop, al.,
2015).
2015).
2015). The
The
The application
application
application of
of
of these
these
these methods
methods
methods to
to
to open-loop,
open-loop,
open-loop,
Copyright © 2018 IFAC 132
2015). The application of these methods to open-loop,
2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Copyright
Peer review© 2018 IFAC 132
Copyright
Copyright
Copyright ©under
©
© 2018 responsibility
2018 IFAC
IFAC of International Federation of Automatic
132Control.
132
Copyright © 2018
2018 IFAC
IFAC
10.1016/j.ifacol.2018.09.288
132
132
2018 IFAC ADCHEM
Shenyang, Liaoning, China, July 25-27, 2018

Yuri Shardt et al. / IFAC PapersOnLine 51-18 (2018) 132–137 133

multivariate processes has recently been considered (Patel, variances of the signals and the condition number of
2016). However, since many processes are already running in the information matrix.
closed-loop operation, it is necessary to extend the results to c. Compare the variances, the condition number of the
such cases. regressor matrix, and the significance of the
parameters against the thresholds.
Therefore, this paper proposes to analyse the Laguerre-
i. Failure: If any of the thresholds fail to be met go to
based approach for application in a multivariate industrial
the next data point, that is, k = k + 1, and go to Step
system to determine the challenges of using this approach for
3.b.
identifying data for system identification with a view of
ii. Success: Otherwise, set k = k + 1, and go to Step 3.c.
generating the seed matrices for MPC identification. A case
The “good” data region is then [kinit, k].
study using the zinc flotation cell will be presented to show
4) Termination: The procedure stops once k equals N, the
some of the key results.
total number of data points in the given operating region.
2. DATA SEGMENTATION FOR SYSTEM 5) Simplification: It may be desirable to compare adjacent
IDENTIFICATION regions and determine if they could be considered to
When processing historical data with an eye to extracting regions that can be used for identification, it is necessary to consider not only the theoretical foundations, but also the impact of the various tuning parameters on the system.

In data segmentation, the Laguerre polynomial is often used, since it eliminates the need for knowing the process time delay. For this reason, it makes sense to use this approach when extracting historical data for which the time delays are not accurately known.

Tuning parameters in any method primarily affect how critically the algorithm scrutinises each of the regions to determine its suitability for identification purposes. In general, the tighter the bounds, the greater the scrutiny and the fewer the regions found suitable for identification. Conversely, looser bounds will allow for a greater number of suitable regions.

A final consideration is the handling of multivariate data. This involves selecting which subset of the available parameters should be considered for identification purposes. This problem is not necessarily trivial and could easily require substantial consideration.

2.1 Data Segmentation Algorithm

The general data segmentation algorithm can be described as (Peretzki, et al., 2011):
1) Preprocessing: Load and preprocess the data set. Most often, this will involve scaling and centring the data set.
2) Mode Changes: In order to simplify the detection of suitable regions, it is important to separate the data set into the different modes that are present. Modes can be defined as changes in operating points, faults, controller settings, or other similar known changes. Removing the known changes will improve the ability of the algorithm to detect the changes.
3) Segmentation: For each mode, perform the following steps:
   a. Initialisation: Set the mode counter to the current data point, $k_{init} = k$.
   b. Computation: Compute the required values for the given algorithm. In most cases, this will include the

come from a single model. Often the segmentation algorithm will be a bit too strict and provide too many segments (Shardt & Shah, 2014).

2.2 Laguerre-Based Segmentation

The Laguerre-based data segmentation uses orthogonal Laguerre polynomials to model the system. This orthogonality allows for easy removal of unnecessary model components without affecting the rest of the parameters. The $i$th order Laguerre model is given as

$$L_i(z^{-1}, \alpha) = \frac{\sqrt{1-\alpha^2}}{z^{-1}-\alpha}\left(\frac{1-\alpha z^{-1}}{z^{-1}-\alpha}\right)^{i-1} \tag{1}$$

where $L_i$ is the $i$th order Laguerre basis function, $\alpha$ is a time constant, and $z^{-1}$ is the backshift operator. The resulting model can then be written as

$$y(t) = \sum_{i=1}^{N_g} \theta_i L_i(z^{-1}, \alpha)\, u(t) + e(t) \tag{2}$$

where $y(t)$ is the output signal, $u(t)$ is the input signal, $e(t)$ is the error, $\theta_i$ is the to-be-determined coefficient, and $N_g$ is the Laguerre order of the process. The parameters for the model given by Equation (2) can be obtained using standard regression analysis.

In this approach, a recursive method is used to compute the required variances, that is, the following update rule is used:

$$\begin{aligned}
m_{y_t} &= \lambda_{m_y} y_t + \left(1-\lambda_{m_y}\right) m_{y_{t-1}} \\
\sigma^2_{y_t} &= \frac{2-\lambda_{m_y}}{2}\,\lambda_{\sigma_y}\left(y_t - m_{y_t}\right)^2 + \left(1-\lambda_{\sigma_y}\right)\sigma^2_{y_{t-1}}
\end{aligned} \tag{3}$$

where $\lambda$ is the forgetting factor and $\sigma^2$ is the variance of the given signal. It can be noted that two forgetting factors are present, $\lambda_{m_y}$ and $\lambda_{\sigma_y}$, which need to be tuned. The variance is updated using the above formulae for three different signals: the inputs, the outputs, and the regression matrix. Based on previous experience, the forgetting factors will all be set to 0.99.
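The recursive update of Equation (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the zero initial values for the mean and variance are hypothetical choices, not taken from the paper, and the updated mean is assumed to be the one used inside the variance update.

```python
def recursive_mean_variance(signal, lam_mean, lam_var, mean0=0.0, var0=0.0):
    """Running mean and variance of `signal` following the form of Equation (3).

    lam_mean and lam_var play the roles of the two forgetting factors.
    mean0 and var0 are assumed initial values (not specified in the text).
    """
    means, variances = [], []
    mean, var = mean0, var0
    scale = (2.0 - lam_mean) / 2.0  # the (2 - lambda_m) / 2 factor
    for y in signal:
        # Mean update: weight lam_mean on the new sample.
        mean = lam_mean * y + (1.0 - lam_mean) * mean
        # Variance update, using the freshly updated mean (an assumption).
        var = scale * lam_var * (y - mean) ** 2 + (1.0 - lam_var) * var
        means.append(mean)
        variances.append(var)
    return means, variances


# Hypothetical usage on a short made-up signal.
means, variances = recursive_mean_variance([1.0, 1.2, 0.9, 1.1], 0.99, 0.99)
```

With the settings reported in the text, both forgetting factors would be passed as 0.99. The resulting variance sequence is the quantity that would then be compared against the variance thresholds.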

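As a rough illustration of how the regressors in Equation (2) might be generated, the sketch below filters an input sequence through a bank of discrete Laguerre filters. It assumes the conventional causal realisation, in which the first filter is a delayed first-order low-pass filter with pole α and each higher order adds one all-pass section; this is an interpretation of Equation (1), not a transcription of the authors' implementation. The function name and the zero initial conditions are hypothetical.

```python
import math


def laguerre_filter_bank(u, alpha, order):
    """Return `order` sequences, the i-th being u filtered by the i-th Laguerre filter.

    Assumed causal realisation (zero initial conditions):
      first filter:     x1[k] = alpha * x1[k-1] + sqrt(1 - alpha^2) * u[k-1]
      all-pass section: xi[k] = alpha * xi[k-1] + x(i-1)[k-1] - alpha * x(i-1)[k]
    """
    n = len(u)
    gain = math.sqrt(1.0 - alpha ** 2)
    first = [0.0] * n
    for k in range(1, n):
        first[k] = alpha * first[k - 1] + gain * u[k - 1]
    outputs = [first]
    for _ in range(1, order):
        prev = outputs[-1]
        cur = [0.0] * n
        if n > 0:
            cur[0] = -alpha * prev[0]
        for k in range(1, n):
            cur[k] = alpha * cur[k - 1] + prev[k - 1] - alpha * prev[k]
        outputs.append(cur)
    return outputs


# Hypothetical usage: regressors for alpha = 0.80 and six Laguerre filters.
regressors = laguerre_filter_bank([0.0, 1.0, 1.0, 1.0, 1.0], 0.80, 6)
```

Stacking these sequences as columns gives a regression matrix from which the coefficients θ_i could be estimated by ordinary least squares. Because the impulse responses of the filters are orthonormal, removing one column leaves the remaining regressors unchanged.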

The Laguerre model parameters, $\alpha$ and $N_g$, are the other two model parameters whose values need to be set. According to Peretzki (2010),

$$N_g \ge -\frac{\theta \log(\alpha)}{2\tau_s} + 1 \tag{4}$$

where $\theta$ is the continuous time delay and $\tau_s$ is the sampling time. Previous investigations have shown that $\alpha$ should be set between 0.80 and 0.95. For the purposes of this investigation, $\alpha$ will be selected as 0.80, while the value of $N_g$ will be set to 6, since the actual values of the time delay are not known. However, it is known that it is not greater than about 100 minutes. The sampling time is fixed to 1 minute. These constraints support the selected value of $N_g$: taking the logarithm in Equation (4) to base 10 (an assumption, since the base is not stated), $\theta = 100$ min, $\tau_s = 1$ min and $\alpha = 0.80$ give $N_g \ge 5.85$.

Selecting the thresholds can be difficult, especially without considering some of the properties of the signals themselves. For the input signal, in order to be generous and allow for more regions to be identified, the variance threshold was set to $10^{-7}$. For the output signal, the variance threshold was also set to $10^{-7}$. The regression variance threshold was set to $10^{-3}$. The condition number threshold was set to the standard value of 1,000 (Shardt, 2012).

2.3 Multivariate Analysis

Since most of the previous approaches have only considered univariate input variables, this paper will also examine the implications of multiple inputs and their impact on finding suitable regions. Different combinations of variables will be taken to determine if it is possible to segment a given data set without necessarily using all the required variables. Clearly, the more variables that are present, the larger the matrices involved, and the greater the computational power required. Since the quality of the model is only one item to consider, it is important to consider the trade-off between the speed of segmentation and the accuracy of the results.

As well, when dealing with multivariate data, it may happen that some of the parameters are irrelevant for identification. In such cases, it will be interesting to examine the impact that irrelevant variables have on the ability of the method to determine the identification regions.

3. PROCESS DESCRIPTION

Before considering the actual implementation of the data segmentation system, it will be useful to briefly examine the actual system considered.

The data used in this study have been obtained from a section of the lead-zinc concentrator at the Mount Isa Mines in Queensland, Australia. The concentrator is a complex operation, recovering both lead and zinc from a feed sourced from three different mines. The ore is milled and is then fed to a lead removal circuit. The lead is recovered in the form of a concentrate. The reject stream from this unit, termed the tailings, is fed to a zinc flotation unit. In this circuit, a number of banks of flotation cells are used to recover the zinc. As shown in Figure 1, these banks are named the roughers, scavengers and recleaners.

The section of the circuit covered here is the zinc roughers. The rougher tails from the upstream lead circuit are the feed to the zinc roughers. As shown in Figure 2, this bank consists of four cells (FC23, FC24, FC25, FC26). Their aim is to do a rough separation of zinc from the waste material. Copper sulphate (activator) and naphthalene sulphate (depressant) are added upstream. Ethyl xanthate, a collector, is added to cells FC23 and FC25. The tails of the rougher (unfloated material) report downstream to the scavengers, where the majority of the remaining zinc is floated. The concentrate (floated material) from the roughers reports to the recleaners.

Figure 1: Zinc rougher, scavenger and recleaner circuit.

In the rougher bank, levels are controlled per pair of cells. The flowrate of air can be varied on a per-cell basis. Composition measurement by X-ray fluorescence (XRF) is used on all concentrate and tails streams. In Figure 2, LC1 and LC2 are level PID controllers on pairs of cells, FC1 to FC4 are flow PID controllers on air flowrates and FC5 to FC8 are reagent flow PID controllers. FI1 is the volumetric feed flowrate. Analysers AI1 to AI3 measure zinc percentages in the feed, concentrate, and tails, respectively.

Figure 2: Rougher Bank Showing Control Loops and Analysers

4. TEST DATA

The data collected for this investigation consist of thirty-one days of plant operation. These were collected from the plant historian at a frequency of one minute. The historian's interpolation routine is used to ensure the data are aligned. No special care was used to ensure that the data had any
particular characteristics, other than that the plant was running. There is a period of one day in the data where the feed falls away.

Forty-three variables were collected: for each of the PID controllers, setpoint, process value and output (SV/PV/MV) were recorded. The three analysers provide measurements of iron, lead and zinc percentages. The variables collected are listed in Table 1. The process was assumed to be running under control throughout the period of investigation.

Table 1: Test Variables

Tag   Attributes   Description
FI1   PV           Feed rate
AI1   Fe/Pb/Zn     Feed Compositions
FC5   SV/PV/MV     CuSO4 (reagent) to FC22
FC6   SV/PV/MV     EX (reagent) to FC23
FC7   SV/PV/MV     EX (reagent) to FC25
FC8   SV/PV/MV     NS (reagent) to FC3
FC1   SV/PV/MV     Air flow to FC23
FC2   SV/PV/MV     Air flow to FC24
FC3   SV/PV/MV     Air flow to FC25
FC4   SV/PV/MV     Air flow to FC26
LC1   SV/PV/MV     FC24 Level
LC2   SV/PV/MV     FC26 Level
AI2   Fe/Pb/Zn     Primary Rougher Concentrate Compositions
AI3   Fe/Pb/Zn     Primary Rougher Tailings Compositions

A design for an MPC on this unit has been derived. The manipulated variables (MVs) are the air flows, levels and the flows of the reagents. Feed-forward (FF) variables are expected to be the feed flow and feed composition or compositions. The outputs or controlled variables (CVs) are the zinc percentages in the concentrate and tailing streams.

5. RESULTS AND DISCUSSION

Based on the analysis of the data set, five different cases will be considered:
1) Case 1: All data will be used for segmentation of the data set.
2) Case 2: Using three variables to segment the data set. The selected variables are LC1, LC2, and AI1Pb.
3) Case 3: Using three variables to segment the data set. The selected variables are FC1, FC2, and FC3.
4) Case 4: Using two variables to segment the data set. The selected variables are FC1 and FC3.
5) Case 5: Using expert knowledge to select the variables based on what variables should impact the model. The selected variables are FI1, FC5, FC6, FC7, FC8, LC1, and LC2.

For each case, the data set was segmented using the programme and a model using the "good data set" was developed using Aspentech® DMC Model to derive linear step response models. For the purposes of this study, only subspace methods were used. The variables were not conditioned before modelling. As well, a constant settling time of 90 minutes was selected for all the models. Furthermore, it can be noted that during data segmentation, whenever the output sensor failed, it was assumed that the mode had changed and that component was separated out of the model.

For Case 1, where all the available measurements were used, it was quickly determined that no useful information could be extracted, since some of the variables are correlated with each other, leading to strongly ill-conditioned matrices. This suggests that it is important to properly select the appropriate variables to consider.

For Case 2, the segmentation results are shown in Figure 3. It should be noted that a constant segment number represents a region where the data are assumed to belong to the same model. A segment value of −1 corresponds to those regions where the sensor failed. The segment number increases every time a data point fails to be good for identification. After every new segment, there will be a short transient region corresponding to the time it takes to have sufficient data for identification (40 data points are considered the minimum for identification).

Figure 3: Segmentation Results for Case 2 (upper panel: zinc concentration; lower panel: partition; horizontal axis: time, t (min), ×10^4)

For Case 3, the segmentation results are shown in Figure 4. The same definitions have been used as for Case 2. It can be seen that the number of segments is quite different, even though 3 variables have been used.

For Case 4, the segmentation results are shown in Figure 5. The same definitions have been used as for Case 2. Here it can be seen that decreasing the number of variables has led to an increase in the regions that are not sufficiently good for identification. However, this could easily be a function of the variables selected. Nevertheless, selecting an appropriate subset of 2 variables could be difficult as it would involve a large search.

Finally, for Case 5, the segmentation results are shown in Figure 6. Here it can be seen that there are large areas of
constant value located between the sensor faults. It would be possible to determine if the adjacent segments are actually similar and warrant being combined. Doing this would provide additional data for model building.

Figure 4: Segmentation Results for Case 3 (upper panel: zinc concentration; lower panel: partition; horizontal axis: time, t (min), ×10^4)

Figure 5: Segmentation Results for Case 4 (upper panel: zinc concentration; lower panel: partition; horizontal axis: time, t (min), ×10^4)

The resulting models for all the cases are shown in Figure 7. The step responses for each of the variables of interest and the resulting models have been provided for both inputs. It can be noted that in practice the air flow rates are combined into a single variable. The same is done for the EX reagent. In general, it can be seen that the quality of the resulting model strongly depends on the segmentation results. It can be seen that Cases 2, 4, and 5 present similar results, while Case 3 (denoted by the black line) often gives models that deviate strongly from the consensus. Since the purpose of this modelling exercise is to develop a "seed model" for use as the initial values for the MPC model creation software, the overall accuracy of the model is not all that important, provided that it gives the correct overall picture.

Table 2 shows the root mean square error (RMSE) and $R^2$ for the fit of the zinc concentration models for the first output. It can be seen that in general the fit for all the cases is relatively low. However, of the considered cases, Case 4 has the best fit. This suggests that the segmentation method can accurately determine which regions should be used for modelling and which ones should not. Furthermore, since the data were extracted from a data historian without any prior data conditioning, there is no guarantee that the data set itself can provide decent models.

Figure 6: Segmentation Results for Case 5 (upper panel: zinc concentration; lower panel: partition; horizontal axis: time, t (min), ×10^4)

Table 2: Summary Statistics for AI2.ZN

        Case 2   Case 3   Case 4   Case 5
RMSE    1.53     1.95     1.54     2.07
R^2     0.18     0.13     0.38     0.13

6. CONCLUSIONS

This paper examined the application of a data segmentation algorithm to the zinc flotation cell. In this case, the Laguerre approach to data segmentation was used, since it did not require knowledge of the time delays. Furthermore, since multiple inputs were available, different sets of variables were tested in order to determine which of the variables could be used for quickly segmenting the data set. The larger the number of variables, the longer it takes to properly segment the data set. As well, variables which do not have an influence on the model should be removed when data segmentation is performed.

The above observations were validated using data extracted from a historian for a zinc flotation cell. The best segmentation, both in terms of the number of segments and their accuracy, was obtained using all the relevant variables. Furthermore, the resulting models were sufficiently accurate to be used as the initial seed for model predictive controllers.

Therefore, when dealing with multiple inputs, it is important to select the appropriate set of variables to consider for segmentation purposes. Too large or too small a number of variables can have an impact on the final quality of the models.

Future work will focus on determining if a subset of variables can be used to obtain better segmentation results.
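For readers who wish to reproduce statistics of the kind reported in Table 2, the sketch below computes the root mean square error and $R^2$ for a pair of measured and predicted sequences. The paper does not state which definition of $R^2$ was used, so the usual coefficient of determination (one minus the ratio of residual to total sum of squares) is assumed here. The function names and the sample numbers are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical usage with made-up zinc percentages.
measured = [50.1, 51.3, 49.8, 52.0]
predicted = [50.4, 50.9, 50.2, 51.5]
fit_rmse = rmse(measured, predicted)
fit_r2 = r_squared(measured, predicted)
```

Under this definition, a model that only predicts the mean of the measured data scores an $R^2$ of zero, which gives some context for the low values in Table 2.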


Figure 7: Unit Step Response Models (Case 2: black, Case 3: blue, Case 4: pink, and Case 5: green)

REFERENCES

Amirthalingam, R., Sung, S. W. & Lee, J. H., 2000. Two-step procedure for data-based modeling for inferential control applications. AIChE Journal, 46(10), pp. 1974-1988.

Bittencourt, A. C., Isaksson, A. J., Peretzki, D. & Forsman, K., 2015. An Algorithm for Finding Process Identification Intervals from Normal Operating Data. Processes, 3(2), pp. 357-383.

Carrette, P., Bastin, G., Genin, Y. Y. & Gevers, M., 1996. Discarding Data May Help in System Identification. IEEE Transactions on Signal Processing, November, 44(9), pp. 2300-2310.

Darby, M. L. & Nikolaou, M., 2014. Identification for multivariable model-based control: An industrial perspective. Control Engineering Practice, 22(1), pp. 165-180.

Horch, A., 2000. Condition Monitoring of Control Loops (Doctoral Thesis), Stockholm, Sweden: KTH.

Kalafatis, A. et al., 2006. Multivariate step testing for MPC projects reduce crude unit testing time. Hydrocarbon Processing, pp. 93-400.

Olivier, L. E. & Craig, I. K., 2017. Should I Shut down my Processing Plant? - An Analysis in the Presence of Faults. Journal of Process Control, Volume 56, pp. 35-47.

Patel, A., 2016. Data Mining of Process Data in Multivariable Systems, Stockholm, Sweden: Royal Institute of Technology.

Peretzki, D., 2010. Data mining for process identification (Diploma Thesis), Cassel, Germany: University of Cassel.

Peretzki, D., Isaksson, A. J., Bittencourt, A. C. & Forsman, K., 2011. Data Mining of Historic Data for Process Identification. Minneapolis, Minnesota, United States of America, AIChE.

Qin, S. J. & Badgwell, T. A., 2003. A survey of industrial model predictive control technology. Control Engineering Practice, 11(7), pp. 733-764.

Shardt, Y. A. W., 2012. Data Quality Assessment for Closed-Loop System Identification and Forecasting with Application to Soft Sensors (Doctoral Thesis), Edmonton, Alberta, Canada: University of Alberta.

Shardt, Y. A. W. & Huang, B., 2013. Data quality assessment of routine operating data for process. Computer and Chemical Engineering, Volume 55, pp. 19-27.

Shardt, Y. A. W. & Huang, B., 2013. Statistical properties of signal entropy for use. Journal of Chemometrics, November, 27(11), pp. 394-405.

Shardt, Y. A. W. & Shah, S. L., 2014. Segmentation Methods for Model Identification from Historical Process Data. Cape Town, South Africa, Elsevier.
