
NARMAX model identification of a palm oil biodiesel engine using multi-objective optimization differential evolution


Zakwan Mansor, Mohd Zakimi Zakaria, Azuwir Mohd Nor, Mohd Sazli Saad, Robiah Ahmad, and Hishamuddin
Jamaluddin

Citation: AIP Conference Proceedings 1885, 020142 (2017);


View online: https://doi.org/10.1063/1.5002336
View Table of Contents: http://aip.scitation.org/toc/apc/1885/1
Published by the American Institute of Physics
Zakwan Mansor¹,a), Mohd Zakimi Zakaria¹,b), Azuwir Mohd Nor¹,c), Mohd Sazli Saad¹, Robiah Ahmad² and
Hishamuddin Jamaluddin³

¹ School of Manufacturing Engineering, Universiti Malaysia Perlis, Pauh Putra Main Campus, 02600 Arau, Perlis, Malaysia
² UTM Razak School of Engineering and Advanced Technology, Universiti Teknologi Malaysia, Jalan Sultan Yahya
Petra, 51400 W. P. Kuala Lumpur, Malaysia
³ Department of Electrical & Electronic Engineering, Faculty of Engineering & IT, Southern University
College, 81300 Skudai, Johor, Malaysia

a) Corresponding author: zakwanmansor8@gmail.com
b) zakimizakaria@unimap.edu.my
c) azuwir@unimap.edu.my

Abstract. This paper presents the black-box modelling of a palm oil biodiesel (POB) engine using the multi-objective
optimization differential evolution (MOODE) algorithm. The algorithm optimizes two objective functions:
minimizing the number of terms in the model structure and minimizing the mean square error
between the actual and predicted outputs. The mathematical model used in this study to represent the POB system is
the nonlinear auto-regressive moving average with exogenous input (NARMAX) model. Finally, model validity tests
are applied to validate the candidate models obtained from the MOODE algorithm and to select an
optimal model.

INTRODUCTION

System identification is a technique for developing a mathematical model of a system from measured input-
output data. A mathematical model serves two main purposes: predicting the behavior of the system and
designing a controller. Predicting the behavior of the system is important for improving its efficiency, and
an adequate mathematical model also helps to improve the effectiveness of the system. System identification
involves four procedures: data acquisition, choice of model representation, parameter estimation and
model validation [1].
In system identification, many models have been proposed to represent dynamic systems. Examples include
the nonlinear auto-regressive with exogenous input (NARX) model, the nonlinear auto-regressive moving
average with exogenous input (NARMAX) model, neural network models, state space models and fuzzy logic
models. The NARMAX model was introduced by Leontaritis & Billings [2] as a representation for a wide class of
nonlinear systems. Recently, numerous works have used this model for nonlinear system identification in areas
such as power generation [3], [4], robotics [5], [6] and chemical processes [7]–[9]. The
essence of the NARMAX model is that past outputs are included in the expansions. This makes the model
identification easier since fewer terms are required to represent a system, but it also means that noise in the output
must be taken into account when estimating the model coefficients [10].

3rd Electronic and Green Materials International Conference 2017 (EGM 2017)
AIP Conf. Proc. 1885, 020142-1–020142-9; doi: 10.1063/1.5002336
Published by AIP Publishing. 978-0-7354-1565-2/$30.00

The most challenging problem in system identification is to represent a dynamic system with an adequate and
parsimonious model. Recently, various techniques and methods have been proposed to solve the model structure
selection problem for the NARMAX model [11]–[13]. Multi-objective optimization using differential
evolution (MOODE), proposed by Zakaria et al. [14], is considered an efficient algorithm for finding
the optimum model structure. The algorithm is based on multi-objective optimization techniques with two
objective functions: minimizing the error between the proposed model and the measured output of the process,
and minimizing the complexity of the model. The parameter estimation method applied in this algorithm is
the least square estimation (LSE) algorithm. MOODE has been employed to detect the structure of nonlinear
auto-regressive with exogenous inputs (NARX) models [14], [15] for representing dynamic systems. In this study,
the MOODE algorithm is extended to determine the optimal model structure of a NARMAX model for
representing dynamic systems.
The palm oil biodiesel engine, referred to as the POB system, is considered in this study. Palm waste was converted
to palm oil biodiesel and fed to the POB system as an alternative energy source. Designing a successful engine
controller requires an adequate model of the system. To this end, MOODE is applied to obtain
a set of candidate models. Model validity tests are then applied to select a final model that is adequate to
represent the POB system.

MODEL STRUCTURE REPRESENTATION

One of the most important steps in the identification process is to decide upon a model to represent a dynamic
system from acquired input-output data [1]. Most nonlinear systems are modeled and identified by using
mathematical and signal models, block diagram models, and simulation models. In this study, the mathematical
model considered is the NARMAX model, which can be described as [2]:

$$
y(t) = F^{l}\big(C, y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u), e(t-1), \ldots, e(t-n_e)\big) + e(t) \tag{1}
$$

where ny, nu and ne are the maximum lags for the output, input, and noise terms, respectively, while C, y(t), u(t) and
e(t) are the constant, output, input and noise signals, respectively. F^l(·) is a polynomial nonlinear function with
degree of nonlinearity l. The NARMAX model can be transformed into a linear regression model that can be
expressed as

$$
y(t) = \sum_{i=1}^{M} \theta_i \phi_i(t) + e(t), \qquad n_y \le t \le N \tag{2}
$$

where θi and φi(t) are the unknown coefficients (parameters) and the regressors, respectively, M is the maximum
number of regressor terms and N is the size of the data. The model is linear in the parameters, so an algorithm from
the least square (LS) family can be used to estimate them. If the noise e(t) is known, a simple least square
estimation (LSE) or an ordinary recursive least square (RLS) algorithm can be used for estimation. In general,
however, the noise is not measurable, and the sequence e(t) must be estimated iteratively. The recursive extended
least square (RELS) algorithm can be used to resolve this problem: the noise or residual sequence e(t) is estimated
iteratively by calculating the prediction error between the actual and predicted outputs. The RELS algorithm, which
estimates the parameter vector of the model, is described by [16]

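$$
\begin{aligned}
\theta(t) &= \theta(t-1) + P(t)\,\phi(t)\,e(t) \\
P(t) &= P(t-1) - \frac{P(t-1)\,\phi(t)\,\phi^{T}(t)\,P(t-1)}{1 + \phi^{T}(t)\,P(t-1)\,\phi(t)} \\
e(t) &= y(t) - \theta^{T}(t-1)\,\phi(t)
\end{aligned}
\tag{3}
$$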

where the initial error is set to zero, e0(t) = 0, and P(t) is initialized to an identity matrix scaled by a very large
number, since no information is available at the beginning. The regressor φ(t + 1) is constructed from the new
input u(t + 1), output y(t + 1) and prediction error e(t + 1).
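
The recursion in Eq. (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the first-order toy system in `simulate`, its coefficients, the function names and the initial scale `p0 = 1e6` are all assumptions made for the example. The regressor holds the previous output, the previous input and the previous residual, so the residual sequence is estimated while the parameters are updated.

```python
import random

def rels_step(theta, P, phi, y):
    """One update of Eq. (3); returns (theta_new, P_new, e)."""
    n = len(phi)
    # Prediction error e(t) = y(t) - theta(t-1) . phi(t)
    e = y - sum(th * ph for th, ph in zip(theta, phi))
    # P(t-1) phi(t); P is symmetric, so the same vector gives phi^T(t) P(t-1)
    Pphi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * Pphi[i] for i in range(n))
    P_new = [[P[i][j] - Pphi[i] * Pphi[j] / denom for j in range(n)]
             for i in range(n)]
    # theta(t) = theta(t-1) + P(t) phi(t) e(t)
    gain = [sum(P_new[i][j] * phi[j] for j in range(n)) for i in range(n)]
    theta_new = [theta[i] + gain[i] * e for i in range(n)]
    return theta_new, P_new, e

def rels_identify(u, y, p0=1e6):
    """Fit y(t) = a*y(t-1) + b*u(t-1) + c*e(t-1) + e(t) (assumed toy structure)."""
    theta = [0.0, 0.0, 0.0]
    # P starts as a large multiple of the identity (no prior information)
    P = [[p0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    e_prev = 0.0                      # initial error set to zero
    residuals = [0.0]
    for t in range(1, len(y)):
        phi = [y[t - 1], u[t - 1], e_prev]
        theta, P, e_prev = rels_step(theta, P, phi, y[t])
        residuals.append(e_prev)      # estimated noise sequence
    return theta, residuals

def simulate(n, a=0.5, b=1.0, c=0.3, noise_std=0.05, seed=0):
    """Hypothetical toy system used only to exercise the recursion."""
    rng = random.Random(seed)
    u = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    e = [rng.gauss(0.0, noise_std) for _ in range(n)]
    y = [0.0] * n
    for t in range(1, n):
        y[t] = a * y[t - 1] + b * u[t - 1] + c * e[t - 1] + e[t]
    return u, y

u, y = simulate(600)
theta, residuals = rels_identify(u, y)
print("estimated [a, b, c]:", theta)
```

The list `residuals` plays the role of the estimated noise terms that are later used to build the noise regressors.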

MODELING PROCEDURES

Overview of MOODE Algorithm Integrating with NARMAX Model


The MOODE algorithm is a model structure selection algorithm based on the integration of differential
evolution (DE) into the non-dominated sorting genetic algorithm (NSGA-II) procedure, but without the DE selection
process. The algorithm focuses on selecting possible model structures using a multi-objective optimization (MOO)
technique. The two objective functions involved are model predictive accuracy and model complexity. A
Pareto-optimal front is generated to represent the set of possible solutions. The implementation of MOODE for model
structure selection in modelling dynamic systems is as follows [14]:

Step 1: Model parameter setting. Define the model parameters: the maximum input, output and noise lags, nu, ny and
ne, and the degree of nonlinearity, l. Load the available input-output data set. The recursive extended
least square (RELS) algorithm, which estimates the parameters and the noise iteratively, is applied in this
step; in this study it is used only to collect the estimated noise terms. Then, using the input-output and
noise data, create the regressors for the model.
Step 2: DE parameter setting. Specify the DE parameters: population size (NP), crossover rate (CR), mutation rate
(MR), lower boundary (L), upper boundary (H) and maximum generation (Gen). The vector size (D) is not
set by the user; it equals the number of regressors generated in Step 1. The lower and upper boundaries
define the range of the genes of a vector.
Step 3: Model representation for initial population. Generate random vectors, the so-called chromosomes of the
population, within the lower boundary (L) and upper boundary (H).
Step 4: Define two objective functions. The two objective functions are the model predictive error, measured as the
mean square error (MSE) between the proposed model and the measured output of the process, and the
model complexity. To calculate the MSE, the selected terms of the identified model go through parameter
estimation. The parameter estimation algorithm used in this study is the least square estimation (LSE)
algorithm.
Step 5: Produce parent population (Pt) with size NP. For each vector of the population, the values of the objective
functions are evaluated as in Step 4. Based on these objective functions, each vector is ranked and its
crowding distance is calculated. The vectors of the parent population are then sorted by rank and crowding
distance: rank 1 comes first, followed by rank 2 and so on, and vectors within the same rank are sorted in
descending order of crowding distance.
Step 6: Create new vectors from the parent population. This step is based on the DE technique for producing a new
generation. Three vectors are chosen randomly from the parent population and evolved to create a new
vector through two genetic operators: mutation and crossover. This step is repeated until the population
size is reached.
Step 7: Create offspring population (Qt) with size NP. After the new vectors are produced, Step 4 is repeated to
evaluate the fitness of the vectors based on the objective functions. The offspring population is produced
in the same way as the parent population in Step 5: each vector is sorted based on its rank and crowding
distance.
Step 8: Create new generation of population (Pt+1) with size NP. This step is inspired by Deb et al. [17]. The parent
and offspring populations are combined into a population of size 2NP. The combined population undergoes
non-dominated sorting and crowding distance calculation and is ranked accordingly. Then, Pt+1 is obtained
by crowded tournament selection. Return to Step 6 until the maximum number of generations (Gen) is
reached.
Step 9: Results illustration. The solutions, i.e. the set of possible model structures that minimize the two
objective functions, are plotted in a single graph. The possible model structures are represented by the
points of the Pareto-optimal front.
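
Two building blocks of the steps above can be sketched in Python: the creation of a new vector from three randomly chosen parents (Step 6) and the dominance check behind the Pareto-optimal front (Steps 5, 8 and 9). The sketch rests on assumptions that the text does not confirm: it uses the classic DE/rand/1/bin scheme, treats the mutation rate MR as the DE scale factor, and clips genes back into [L, H]. How a real-valued vector is decoded into a set of selected model terms is not described here, so it is left out. The fourth (MSE, complexity) pair in the usage lines is a made-up dominated point added for illustration; the first three are taken from Table 2.

```python
import random

def de_trial_vector(parents, i, mr, cr, low, high, rng):
    """Build one trial vector for parent i (assumed DE/rand/1/bin variant)."""
    candidates = [k for k in range(len(parents)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)      # three distinct random parents
    d = len(parents[i])
    j_rand = rng.randrange(d)                   # at least one gene is mutated
    trial = []
    for j in range(d):
        if rng.random() < cr or j == j_rand:
            # Mutation: base vector plus scaled difference of two others
            gene = parents[r1][j] + mr * (parents[r2][j] - parents[r3][j])
            gene = min(max(gene, low), high)    # keep the gene inside [L, H]
        else:
            gene = parents[i][j]                # crossover keeps the parent gene
        trial.append(gene)
    return trial

def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (both objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (MSE, complexity) pairs: D1, D2, D3 from Table 2 plus one hypothetical point
models = [(6.15e-7, 10), (6.35e-7, 9), (7.11e-7, 5), (7.50e-7, 9)]
print(pareto_front(models))

rng = random.Random(1)
population = [[rng.uniform(0.1, 0.4) for _ in range(78)] for _ in range(50)]
trial = de_trial_vector(population, 0, mr=0.3, cr=0.7, low=0.1, high=0.4, rng=rng)
```

A full NSGA-II style loop would add rank assignment for the remaining fronts and the crowding distance; those parts are omitted to keep the sketch short.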

Model Validation
The final procedure in system identification is model validation. This is important in order to check whether the
model fits the data adequately without any bias. In this study, two model validity tests are considered: the model
predicted output (MPO) test and correlation tests. The first model validity test used in this study is the MPO test [18]

$$
\hat{y}_{MPO} = \hat{F}\big(\hat{y}(t-1), \ldots, \hat{y}(t-n_y), u(t-1), \ldots, u(t-n_u)\big) \tag{4}
$$

where the MPO output is based on the previous predicted outputs and the input data, and F̂ is the nonlinear
function f(·). The coefficient of determination (R²) is calculated for the MPO output to assess the adequacy of the
prediction models, where:

$$
R^{2} = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^{2}}{\sum_{i} (y_i - \bar{y})^{2}} \tag{5}
$$
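
As a small worked illustration of Eq. (5), the following Python sketch computes the MSE and R² for a pair of sequences. The four sample values are invented for the example and are not engine data; the function names are likewise only illustrative.

```python
def mean_squared_error(y, y_hat):
    """Average squared difference between measured and predicted outputs."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def r_squared(y, y_hat):
    """Coefficient of determination as in Eq. (5)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))   # residual sum of squares
    ss_tot = sum((a - y_bar) ** 2 for a in y)              # total sum of squares
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]          # made-up measured values
y_hat = [1.1, 1.9, 3.2, 3.8]      # made-up predicted values
print(mean_squared_error(y, y_hat), r_squared(y, y_hat))
```

For these numbers the residual sum of squares is 0.10 and the total sum of squares is 5.0, so the MSE is 0.025 and R² is 0.98, up to floating-point rounding.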

The second model validity test used in this study is a set of correlation tests. The correlation functions [19] are
based on the following equations:

$$
\begin{aligned}
\phi_{\varepsilon\varepsilon}(\tau) &= E[\varepsilon(t)\,\varepsilon(t-\tau)] = \delta(\tau), && \forall\tau \\
\phi_{u\varepsilon}(\tau) &= E[u(t)\,\varepsilon(t-\tau)] = 0, && \forall\tau \\
\phi_{u^{2}\varepsilon}(\tau) &= E[u^{2}(t)\,\varepsilon(t-\tau)] = 0, && \forall\tau \\
\phi_{u^{2}\varepsilon^{2}}(\tau) &= E[u^{2}(t)\,\varepsilon^{2}(t-\tau)] = 0, && \forall\tau \\
\phi_{(u\varepsilon)\varepsilon}(\tau) &= E[\varepsilon(t)\,u(t)\,\varepsilon(t-\tau-1)] = 0, && \tau \ge 0
\end{aligned}
\tag{6}
$$

where φ, ε(t), u(t) and δ(τ) represent the standard correlation function, the residual sequence, the input of the system
and the delta function, respectively, and E[·] is the expectation operator. The tests indicate the adequacy of the
model through plots of the correlation functions against confidence bands. The confidence bands are the 95%
confidence limits, approximately ±1.96/√N, where N is the data length.
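
A sample version of these tests can be sketched in Python as follows. The estimator is an assumption of this sketch rather than the authors' code: both sequences have their means removed and the result is normalized so that values lie between -1 and 1, which is the common way of plotting such tests against the 95% band. The squared sequences needed for the higher-order tests are formed by the caller.

```python
import math

def normalized_xcorr(a, b, tau):
    """Sample estimate of E[a(t) b(t - tau)], mean-removed and normalized."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    # Only use time indices for which both t and t - tau are inside the data
    num = sum(da[t] * db[t - tau] for t in range(max(0, tau), min(n, n + tau)))
    den = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    return num / den

def confidence_band(n):
    """95% confidence limit, approximately 1.96 / sqrt(N)."""
    return 1.96 / math.sqrt(n)

def lags_outside_band(a, b, max_lag):
    """Lags at which the correlation estimate leaves the confidence band."""
    band = confidence_band(len(a))
    return [tau for tau in range(-max_lag, max_lag + 1)
            if abs(normalized_xcorr(a, b, tau)) > band]

# Usage with hypothetical residuals eps and input u (lists of equal length):
#   lags_outside_band(eps, eps, 20)                    # autocorrelation of residuals
#   lags_outside_band([x * x for x in u], eps, 20)     # u^2 against residuals
```

For the residual autocorrelation, lag 0 always equals 1 and therefore always lies outside the band; the test is concerned with the remaining lags.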

MODELLING PALM OIL BIODIESEL ENGINE (POB)
POB system is based on the input-output data obtained from Azuwir et al. [20] concerning the identification of a
biodiesel engine system. The input of the system is voltage with values between 1 V and 1.2 V. The output is the
speed of engine in RPM (revolutions per minute) that was operated with speed range of around 2100 RPM. There
are 720 data of input-output collected from the experiment as shown in figure 1. The first data set consists of 600
data samples are used for estimation and the remaining 120 data samples are used for validation. Table 1 shows the
MOODE parameters setting used in this study.

FIGURE 1. POB system data: (a) input data (voltage); (b) output data (RPM)

TABLE 1. MOODE parameter settings
Parameter                 Value
Population size (NP)      50
Generations (Gen)         100
Crossover rate (CR)       0.7
Mutation rate (MR)        0.3
Lower boundary (L)        0.1
Upper boundary (H)        0.4

The NARMAX model parameter settings are three input and output lags (nu = ny = 3), five noise lags (ne = 5) and
a degree of nonlinearity of two (l = 2). The maximum number of model terms is 78, so the number of possible models
to choose from is 2^78 − 1, approximately 3.02 × 10^23. Figure 2 illustrates the Pareto-optimal front for the POB
system. There are 16 possible model structures that represent the dynamic behavior of the POB system. The highest
model complexity is 24 terms while the lowest is 4 terms. Models with about 10 significant terms are usually
sufficient to capture the dynamics of highly nonlinear processes [21]. Hence, models with a complexity of 10 terms or
fewer are considered, and three of them are selected from the Pareto-optimal front and marked as
D1, D2 and D3. Details of the marked models are shown in Table 2.
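
The figure of 78 candidate terms follows from the polynomial expansion: with 3 + 3 + 5 = 11 lagged variables and degree 2 there is 1 constant, 11 linear terms and 66 second-order products, which equals the binomial coefficient C(13, 2). The Python sketch below enumerates the terms and counts the non-empty subsets; the function name and the string labels are illustrative choices, not part of the original method description.

```python
from itertools import combinations_with_replacement
from math import comb

def candidate_terms(ny, nu, ne, degree):
    """List every polynomial term up to the given degree as a text label."""
    lagged = ([f"y(t-{k})" for k in range(1, ny + 1)]
              + [f"u(t-{k})" for k in range(1, nu + 1)]
              + [f"e(t-{k})" for k in range(1, ne + 1)])
    terms = ["constant"]
    for d in range(1, degree + 1):
        # Products of d lagged variables, repetition allowed (e.g. squares)
        for combo in combinations_with_replacement(lagged, d):
            terms.append("".join(combo))
    return terms

terms = candidate_terms(ny=3, nu=3, ne=5, degree=2)
n_terms = len(terms)
n_models = 2 ** n_terms - 1       # every non-empty subset of terms is a model
print(n_terms, comb(13, 2), f"{n_models:.3e}")
```

The size of this search space is what motivates a population-based search rather than an exhaustive comparison of model structures.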

FIGURE 2. Pareto-optimal front for POB system using NARMAX model


TABLE 2. Details of marked models
Terms                          D1             D2             D3
Constant                       -0.3315        --             --
y(t-1)                         1.5343         1.3891         1.1547
u(t-3)                         -0.3742        -0.6965        -0.2375
y(t-1)u(t-1)                   -0.3305        -0.2039        --
y(t-1)u(t-2)                   -0.0145        -0.0117        --
y(t-3)y(t-3)                   -0.1315        -0.0984        -0.0468
y(t-3)u(t-1)                   0.3297         0.2031         --
y(t-3)e(t-5)                   0.4823         0.4818         0.4810
u(t-2)u(t-3)                   0.0285         0.0229         --
u(t-3)u(t-3)                   0.1668         0.3162         0.1183
Number of repeated solutions   4              4              6
MSE                            6.15 × 10^-7   6.35 × 10^-7   7.11 × 10^-7
Complexity                     10             9              5

Figure 3 illustrates the MPO test for each selected model. Each plot shows the output of the POB system, i.e. the
engine speed, superimposed on the predicted model output, together with the residuals. The MSE and R² values
calculated from the MPO output for models D1, D2 and D3 are shown in Table 3.

TABLE 3. MSE and R² of MPO tests from the selected models
Model    MSE             R²
D1       2.30 × 10^-7    0.9994
D2       2.26 × 10^-7    0.9994
D3       2.51 × 10^-7    0.9994

FIGURE 3. The engine speed of the POB system superimposed on the MPO output, and the residuals plot, for: (a) model D1;
(b) model D2; (c) model D3

FIGURE 4. Correlation tests for: (a) model D1; (b) model D2; (c) model D3

Figure 4 shows the correlation tests for each model. The MSE values from the MPO test (Table 3) differ little
among the models. Each selected model also has the same R² of 0.9994, which is close to 1 and indicates a good
fit. For the correlation tests, one of the five correlation functions, φεε(τ), is not satisfied for any of the
models, as shown in Figure 4. This happens because the noise terms of all the models are not sufficient to represent
the POB system. In terms of model complexity, model D3 has only 5 terms, fewer than models D1 and D2, as shown in
Table 2. Thus, considering all the model validity tests, model D3 was chosen as the mathematical model of the POB
system. Model D3 can be expressed as:

$$
y(t) = 1.1547\,y(t-1) - 0.2375\,u(t-3) - 0.0468\,y^{2}(t-3) + 0.4810\,y(t-3)\,e(t-5) + 0.1183\,u^{2}(t-3) \tag{7}
$$
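
To show how a model of this form is evaluated, the Python sketch below codes the five D3 terms with the coefficients of Table 2 and runs a free-run simulation in the spirit of the MPO test, where past outputs are the model's own predictions and the noise term is set to zero. The function names, the zero-noise treatment in the free run and the constant test signals are assumptions of this sketch; the numbers fed to it are placeholders, not engine data, and the scaling of the signals used in the identification is not stated in the text.

```python
def d3_one_step(y, u, t, e_lag5=0.0):
    """Output of model D3 at time index t (needs t >= 3).

    e_lag5 is the residual e(t-5); it defaults to zero when unavailable.
    """
    return (1.1547 * y[t - 1]
            - 0.2375 * u[t - 3]
            - 0.0468 * y[t - 3] ** 2
            + 0.4810 * y[t - 3] * e_lag5
            + 0.1183 * u[t - 3] ** 2)

def d3_model_predicted_output(y_init, u):
    """Free-run simulation: predictions are fed back as past outputs."""
    if len(y_init) < 3:
        raise ValueError("at least three initial output values are required")
    y_hat = list(y_init)
    for t in range(len(y_hat), len(u)):
        y_hat.append(d3_one_step(y_hat, u, t))   # noise term left at zero
    return y_hat

# Placeholder signals: constant input and three initial outputs of 1.0
u = [1.0] * 6
y_hat = d3_model_predicted_output([1.0, 1.0, 1.0], u)
print(y_hat)
```

With all lagged values equal to 1 and zero noise, the first predicted value is 1.1547 - 0.2375 - 0.0468 + 0.1183 = 0.9887, which gives a quick check of the signs in the model.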

CONCLUSION

In this study, the development of a mathematical model of a palm oil biodiesel (POB) engine has been
presented. The MOODE algorithm produces a set of possible models along a single front, known as the Pareto-optimal
front, from which three models were selected. To select a final model that is adequate, model validity tests, i.e. the
MPO and correlation tests, need to be applied to each selected model. The results presented in this study show that
the MOODE algorithm integrated with the NARMAX model has significant potential for further development and
application in other process systems.

ACKNOWLEDGMENTS

The authors would like to thank Universiti Malaysia Perlis (UniMAP) and the Ministry of Higher Education
Malaysia for the financial support provided through the Fundamental Research Grant Scheme (FRGS) 9003-00457
throughout the course of this research.

REFERENCES

1. L. Ljung, System Identification: Theory for the User (Prentice Hall, New Jersey, 1999).
2. I. J. Leontaritis and S. A. Billings, Int. J. Control 41, 303–328 (1985).
3. O. M. Boaghe, S. A. Billings, L. M. Li, P. J. Fleming and J. Liu, Control Eng. Pract. 10, 1347–1356 (2002).
4. C. Evans, P. J. Fleming, D. C. Hill, J. P. Norton, I. Pratt, D. Rees and K. Rodríguez-Vázquez, Control Eng.
Pract. 9, 135–148 (2001).
5. O. Akanyeti, I. Rano and S. A. Billings, Rob. Auton. Syst. 58, 229–238 (2010).
6. B. Gardiner, S. A. Coleman, T. M. McGinnity and H. He, Rob. Auton. Syst. 60, 1508–1519 (2012).
7. M. Bucolo, L. Fortuna, M. Nelke, A. Rizzo and T. Sciacca, Control Eng. Pract. 10, 227–237 (2002).
8. L. Aggoune, Y. Chetouani and T. Raissi, ISA Trans. 63, 394–400 (2016).
9. I. Fernandez, F. G. Acien, M. Berenguel, J. L. Guzman, G. A. Andrade and D. J. Pagano, Chem. Eng. Sci. 112,
116–129 (2014).
10. S. A. Billings, Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal
Domains (John Wiley & Sons, Ltd, Chichester, UK, 2013).
11. T. Baldacchino, S. R. Anderson and V. Kadirkamanathan, Automatica 49, 2641–2651 (2013).
12. J. Yan and J. R. Deller, Signal Process. 123, 30–41 (2016).
13. F. Lu, K. K. Lin and A. J. Chorin, Phys. D Nonlinear Phenom. 340, 46–57 (2017).
14. M. Z. Zakaria, A. Mohd Nor, H. Jamaluddin and R. Ahmad, WSEAS Trans. Syst. Control 9, 500–513 (2014).
15. M. Z. Zakaria, S. Saad, H. Jamaluddin and R. Ahmad, “Dynamic System Modeling of Flexible Beam System
using Multi-Objective Optimization Differential Evolution Algorithm,” World Virtual Conference on
Advanced Research in Mechanical and Materials Engineering (2014).
16. O. M. Mohamed Vall and R. M’hiri, Int. J. Autom. Comput. 5, 313–318 (2008).
17. K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, IEEE Trans. Evol. Comput. 6, 182–197 (2002).
18. S. A. Billings and K. Z. Mao, Research Report no. 714 (1998).
19. S. A. Billings and W. S. F. Voon, Int. J. Control 44, 235–244 (1986).
20. M. N. Azuwir, M. Z. Abdulmuin and A. H. Adom, Int. J. Eng. Technol. 3, 582–586 (2011).
21. S. Chen, S. A. Billings and W. Luo, Int. J. Control 50, 1873–1896 (1989).
