
OPTIMIZATION OF NEURAL NETWORK HYPERPARAMETERS FOR GAS TURBINE MODELLING USING BAYESIAN OPTIMIZATION

M H M Tarik, M Omar, M F Abdullah, R Ibrahim

Department of Electrical and Electronic Engineering, Universiti Teknologi PETRONAS, 32610, Bandar Seri Iskandar, Perak, Malaysia.
mohammad_16001091@utp.edu.my, madiah.omar_g03266@utp.edu.my, mfaris_abdullah@utp.edu.my, rosdiazli@utp.edu.my

Keywords: Optimization; Hyperparameter; Neural Network; Gas Turbine; Bayesian

Abstract

A gas turbine model can be used for many applications that improve the efficiency and reliability of gas turbine operation. The Artificial Neural Network (ANN) has been identified as a good technique for gas turbine modelling. However, an ANN is sensitive to its hyperparameters, and current approaches for optimizing the hyperparameters of ANNs used in gas turbine modelling have a high computational cost. Hence, this paper proposes optimizing the ANN hyperparameters using Bayesian optimization. Bayesian optimization was used to determine the near-optimum number of layers and number of neurons for the developed gas turbine model. The results show that hyperparameter optimization using Bayesian optimization produces a neural network with better prediction accuracy than hyperparameter optimization using random search.

1 Introduction

The Dry Low Emission (DLE) gas turbine has become vital equipment in the energy industry due to stringent environmental regulations imposed by governments. A DLE gas turbine emits less nitrogen oxide (NOx) because it operates at a lower flame temperature [1]. However, this causes the combustion to become unstable, which leads to the need for a good gas turbine model that can be used for optimizing control performance [2], fault diagnosis [3], and fault prediction [4], all of which can improve the efficiency and reliability of DLE gas turbine operation. Based on the literature, the artificial neural network (ANN) is recognised as one of the successful approaches for capturing the nonlinear behaviour of complicated systems such as gas turbines, because an ANN is not limited by assumptions of linearity, normality and variable independence. Besides that, an ANN can work well even when trained with incomplete or noisy data, which are common in datasets taken from industrial factories and plants [5]. Hence, ANNs have been widely used for system modelling of gas turbines [6-8], including the DLE gas turbine [9]. However, like other machine learning models, an ANN is sensitive to its hyperparameters. Small changes to these hyperparameters may have a significant effect on ANN performance and development time [10]. Hence, it is essential to choose the right hyperparameter values to obtain an accurate gas turbine model.

Generally, there are two ways of choosing these hyperparameters, namely manual tuning and automatic tuning. Manual tuning is based on trial and error. It requires a good understanding of the machine learning model's hyperparameters and is very tedious. Hence, automatic tuning has become the preferred way of tuning the hyperparameters of machine learning models. However, automatic tuning is computationally expensive, and this has led to considerable research on automatic tuning methods with lower computational cost. Grid search and random search are popular techniques for automatic tuning. The grid search algorithm performs an exhaustive search through a manually specified set of hyperparameters. Grid search has been used to optimize the hyperparameters of neural networks, helping to improve prediction accuracy [11] and lower validation error [12]. Random search was developed as an alternative to grid search and offers higher efficiency and lower computational cost: instead of going through all the specified hyperparameter sets as in grid search, the random search algorithm selects random trials from the specified sets [13]. Nevertheless, both grid search and random search still have a very high computational cost.

Another approach to automatic hyperparameter tuning is to treat the search for hyperparameters as an optimization problem and then solve it with an optimization algorithm. A simple method in this approach is to compute the gradient of the error with respect to the hyperparameters and follow this gradient to find the optimum hyperparameters [14]. However, gradient-based optimization is not suitable for some hyperparameter optimization problems because the gradient is unavailable. Hence, black-box techniques such as Bayesian optimization have become popular for hyperparameter optimization among machine learning practitioners. Bayesian optimization has been used for hyperparameter optimization in big-data applications, helping to reduce computational time [15]. In addition, drug–target interaction prediction was improved and the computational time was lowered when Bayesian optimization was used for hyperparameter optimization [16]. However, there is currently no research utilizing Bayesian optimization to choose the hyperparameters of an ANN when developing a gas turbine model.
This paper presents a study on the use of Bayesian optimization for choosing the number of layers and the number of neurons in each layer of a neural network used for gas turbine modelling. The rest of this paper is organised as follows: Section 2 details the Bayesian optimization technique used in this study, Section 3 details the methodology used to model the gas turbine with a neural network and to optimize its hyperparameters using Bayesian optimization, Section 4 presents and discusses the results obtained from this study, and Section 5 concludes the paper.

2 Bayesian Optimization

The Bayesian optimization algorithm minimizes a scalar objective function f(x) for x in a constrained domain, as shown in Equation (1).

    x^* = \arg\min_{x \in X} f(x)                                                (1)

where X is the chosen constrained domain, which can be real, integer or categorical [17]. This makes Bayesian optimization suitable for hyperparameter optimization of neural networks.

The key elements of Bayesian optimization are using a Gaussian process model to prescribe a prior belief over the chosen objective function, improving the Gaussian process model through Bayesian posterior updating, and finding the next value of x to evaluate by maximizing an acquisition function D(x, Q). Refer to Algorithm 1 for the pseudocode of Bayesian optimization.

Algorithm 1 Bayesian Optimization
1: Evaluate y_i = f(x_i)
2: for i = 1, 2, ... do
3:     update the Gaussian process model of f(x) to obtain the posterior distribution over functions Q(f | x_i, y_i)
4:     find the next point x_{i+1} by maximizing the acquisition function D(x, Q)
5: end for
There are several choices of acquisition function [18]. In this study, expected improvement was selected as the acquisition function. Expected improvement evaluates the expected amount of improvement in the objective function and ignores values that would increase it. Expected improvement can be defined as in Equation (2).

    D(x, Q) = E_Q[\max(0, \mu_Q(x_{best}) - f(x))]                               (2)

where x_{best} is the location of the lowest posterior mean and \mu_Q(x_{best}) is the lowest value of the posterior mean.
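To make the posterior update and acquisition steps of Algorithm 1 concrete, the following is a minimal MATLAB sketch of a single iteration for the two hyperparameters considered later (number of layers and number of neurons). It assumes the Statistics and Machine Learning Toolbox (fitrgp, normcdf, normpdf); the variable names and example values are illustrative, not taken from the study.

    % One Bayesian optimization step with the expected improvement of Equation (2).
    xObs  = [1 5; 2 10; 3 20];            % hyperparameter points already evaluated [layers, neurons]
    yObs  = [0.020; 0.012; 0.015];        % corresponding objective values (illustrative)
    xCand = [ones(30,1) (1:30)'; 2*ones(30,1) (1:30)'; 3*ones(30,1) (1:30)'];  % candidate grid

    gp = fitrgp(xObs, yObs);              % Gaussian process posterior Q over f(x)
    [mu, sd] = predict(gp, xCand);        % posterior mean and standard deviation at candidates
    fBest = min(predict(gp, xObs));       % lowest posterior mean, mu_Q(x_best)

    z  = (fBest - mu) ./ max(sd, eps);    % standardised improvement
    EI = (fBest - mu) .* normcdf(z) + sd .* normpdf(z);   % closed form of Equation (2)

    [~, idx] = max(EI);                   % acquisition maximization
    xNext = xCand(idx, :);                % next point x_{i+1} to evaluate

In practice, MATLAB's bayesopt function (used in Section 3) performs this loop internally; the sketch only illustrates one update and acquisition step.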

3 Methodology

The ANN models for the gas turbine were developed in MATLAB using the Neural Network Toolbox. Figure 1 shows the flowchart for developing the ANN models with Bayesian optimization for hyperparameter selection.

Figure 1: ANN development flowchart

First, a few operational variables were selected as the input and output variables of the gas turbine model. The selected input and output parameters are tabulated in Table 1 and Table 2 respectively.

Parameters                          Unit
Air Inlet Temperature               °C
Air Inlet Differential Pressure     kPad
Pilot Valve Command                 %
Metering Valve Command              %
Guide Vane Command                  %
Thermocouple Average Temperature    °C
Gas Fuel Pressure                   kPag

Table 1: Input parameters

Parameters                          Unit
Pressure Compressor Discharge       kPag
Gas Fuel Flow                       kg/hr
Speed                               %

Table 2: Output parameters
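As a minimal sketch of how these parameters could be arranged for training with the Neural Network Toolbox, the snippet below collects the seven inputs and three outputs into matrices with one column per sample, the layout expected by train. The table name "data" and the column names are illustrative assumptions, not taken from the study.

    % Arrange the Table 1 inputs and Table 2 outputs for the Neural Network Toolbox.
    inputNames  = {'AirInletTemp', 'AirInletDiffPress', 'PilotValveCmd', ...
                   'MeteringValveCmd', 'GuideVaneCmd', 'TcAvgTemp', 'GasFuelPress'};
    outputNames = {'PressCompDischarge', 'GasFuelFlow', 'Speed'};

    X = table2array(data(:, inputNames))';     % 7-by-N input matrix, one column per sample
    T = table2array(data(:, outputNames))';    % 3-by-N target matrix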
Next, the Bayesian optimization algorithm from MATLAB's Statistics and Machine Learning Toolbox was used to optimize the hyperparameters of the developed neural network. The search spaces are tabulated in Table 3.

Hyperparameters         Bound
Number of layers        1-3
Number of neurons       1-30

Table 3: Hyperparameter search spaces
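A minimal sketch of declaring this search space for bayesopt is shown below, assuming the Statistics and Machine Learning Toolbox; the variable names are illustrative.

    % Table 3 search space expressed as optimizable variables for bayesopt.
    vars = [optimizableVariable('numLayers',  [1, 3],  'Type', 'integer'), ...
            optimizableVariable('numNeurons', [1, 30], 'Type', 'integer')];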

The objective function selected for the optimization was the mean square error (MSE) between the actual output parameter values and the predicted output parameter values, as shown in Equation (3).

    MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_i - Y_i')^2                              (3)

where n is the number of training samples, Y_i' is the i-th actual output and Y_i is the i-th predicted output.
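The following is a minimal sketch of wiring this objective into bayesopt together with the Table 3 search space, assuming the toolboxes named above. The helper function, option values, and the reuse of X, T and vars from the earlier sketches are illustrative assumptions rather than the authors' exact implementation; the network is trained with the Levenberg-Marquardt algorithm described in the next paragraph.

    % Run Bayesian optimization over the Table 3 search space with the Equation (3) MSE.
    objFcn  = @(p) annObjective(p, X, T);
    results = bayesopt(objFcn, vars, ...
        'AcquisitionFunctionName', 'expected-improvement', ...
        'IsObjectiveDeterministic', false);      % random weight initialization makes the MSE noisy
    bestPoint = results.XAtMinObjective;         % near-optimum number of layers and neurons

    function mse = annObjective(p, X, T)
        % Build a network with p.numLayers hidden layers of p.numNeurons neurons each,
        % train it with Levenberg-Marquardt, and return the MSE of Equation (3).
        hiddenSizes = repmat(p.numNeurons, 1, p.numLayers);
        net = fitnet(hiddenSizes, 'trainlm');
        net.trainParam.showWindow = false;       % suppress the training GUI
        net = train(net, X, T);
        Y   = net(X);                            % predicted outputs
        mse = mean((T(:) - Y(:)).^2);
    end

In a script, annObjective would sit as a local function at the end of the file or in its own file.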

Then, the developed ANN model was trained using the Levenberg–Marquardt algorithm until it reached the lowest MSE. The performance of the developed ANN model was compared with that of an ANN model developed using random search for hyperparameter optimization. The comparison was based on the prediction error, which can be calculated using Equation (4).

    Error = \frac{|actual - predicted|}{actual} \times 100\%                     (4)

The prediction error was calculated for 3000 data points that were not part of the training set. The better algorithm was determined based on the average prediction error over 10 runs.
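A minimal sketch of this evaluation step is shown below; Xtest and Ttest denote the 3000 held-out samples and bestNet stands for an ANN retrained with the hyperparameters returned by bayesopt (all names are illustrative assumptions).

    % Equation (4) prediction error on the held-out test points, per output parameter.
    Ypred  = bestNet(Xtest);                                   % simulate the trained ANN
    errPct = mean(abs(Ttest - Ypred) ./ abs(Ttest), 2) * 100;  % mean percentage error per output
    % Repeating this for 10 independently trained models and averaging errPct
    % gives figures comparable to those reported in Table 4.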

4 Results

The percentage differences between the actual and predicted values, i.e. the errors of each output parameter for both developed ANN models, are tabulated in Table 4.

                                    Percentage Error (%)
Output parameters                   Bayesian Optimization    Random Search
Pressure compressor discharge       1.6511                   2.0188
Gas fuel flow                       1.5287                   1.6831
Speed                               0.5570                   1.1150

Table 4: Prediction error for both developed ANN models


Table 4 shows that the neural network has lower error when Bayesian optimization is used to optimize its hyperparameters instead of random search. The actual and simulated output parameters are plotted in Figure 2 to Figure 4. Based on these figures, the neural network whose hyperparameters were optimized using Bayesian optimization outputs predictions closer to the actual values. This shows that the neural network model whose hyperparameters were optimized with Bayesian optimization performs slightly better than the neural network model whose hyperparameters were optimized with random search.

Figure 2: Pressure compressor discharge variations for 3000 test values

Figure 3: Gas fuel flow variations for 3000 test values

Figure 4: Speed variations for 3000 test values

Hence, based on the results obtained, Bayesian optimization is suitable for optimizing the hyperparameters of a neural network. This agrees with the previous research in the literature mentioned in Section 1, where Bayesian optimization improved neural network performance by selecting better hyperparameters. However, previous research in gas turbine modelling did not use Bayesian optimization for optimizing hyperparameters.

5 Conclusion

Bayesian optimization was used to optimize the hyperparameters of a neural network used in gas turbine modelling and was compared with random search. The results show that Bayesian optimization outperforms random search, and hence it is recommended to use Bayesian optimization for hyperparameter optimization when developing a neural network for gas turbine modelling. Future work may include comparison of Bayesian optimization with other gradient-free optimization techniques.
Acknowledgements

The authors acknowledge the support of the Ministry of Higher Education (MOHE) and Universiti Teknologi PETRONAS in carrying out this research through the FRGS 0153AB-L31 grant.

References

[1] T. Hazel, G. Peck, and H. Mattsson, "Industrial Power Systems Using Dry Low Emission Turbines," IEEE Transactions on Industry Applications, vol. 50, pp. 4369-4378, 2014.
[2] A. P. Wiese, M. J. Blom, C. Manzie, M. J. Brear, and A. Kitchener, "Model reduction and MIMO model predictive control of gas turbine systems," Control Engineering Practice, vol. 45, pp. 194-206, 2015.
[3] H. Abbasi Nozari, M. Aliyari Shoorehdeli, S. Simani, and H. Dehghan Banadaki, "Model-based robust fault detection and isolation of an industrial gas turbine prototype using soft computing techniques," Neurocomputing, vol. 91, pp. 29-47, 2012.
[4] J. H. Lee, T. S. Kim, and E.-h. Kim, "Prediction of power generation capacity of a gas turbine combined cycle cogeneration plant," Energy, vol. 124, pp. 187-197, 2017.
[5] H. Asgari, X. Chen, and R. Sainudiin, "Analysis of ANN-Based Modelling Approach for Industrial Systems," International Journal of Innovation, Management and Technology, vol. 4, p. 165, 2013.
[6] C. M. Bartolini, F. Caresana, G. Comodi, L. Pelagalli, M. Renzi, and S. Vagni, "Application of artificial neural networks to micro gas turbines," Energy Conversion and Management, vol. 52, pp. 781-788, 2011.
[7] M. Fast, M. Assadi, and S. De, "Development and multi-utility of an ANN model for an industrial gas turbine," Applied Energy, vol. 86, pp. 9-17, 2009.
[8] I. Yousefi, M. Yari, and M. A. Shoorehdeli, "Modeling, identification and control of a heavy duty industrial gas turbine," in 2013 IEEE International Conference on Mechatronics and Automation, 2013, pp. 611-615.
[9] M. H. M. Tarik, M. Omar, M. F. Abdullah, and R. Ibrahim, "Modelling of dry low emission gas turbine using black-box approach," in TENCON 2017 - 2017 IEEE Region 10 Conference, 2017, pp. 1902-1906.
[10] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[11] P. Nagy and G. Németh, "DNN-Based Duration Modeling for Synthesizing Short Sentences," in Speech and Computer, Cham, 2016, pp. 254-261.
[12] B. P. Tóth and T. G. Csapó, "Continuous fundamental frequency prediction with deep neural networks," in 2016 24th European Signal Processing Conference (EUSIPCO), 2016, pp. 1348-1352.
[13] J. Bergstra and Y. Bengio, "Random Search for Hyper-Parameter Optimization," The Journal of Machine Learning Research, vol. 13, pp. 281-305, 2012.
[14] D. Maclaurin, D. Duvenaud, and R. P. Adams, "Gradient-based hyperparameter optimization through reversible learning," in Proceedings of the 32nd International Conference on Machine Learning - Volume 37, Lille, France, 2015.
[15] T. T. Joy, S. Rana, S. Gupta, and S. Venkatesh, "Hyperparameter tuning for big data using Bayesian optimisation," in 2016 23rd International Conference on Pattern Recognition (ICPR), 2016, pp. 2574-2579.
[16] F. Dernoncourt and J. Y. Lee, "Optimizing neural network hyperparameters with Gaussian processes for dialog act classification," in 2016 IEEE Spoken Language Technology Workshop (SLT), 2016, pp. 406-413.
[17] B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas, "Taking the Human Out of the Loop: A Review of Bayesian Optimization," Proceedings of the IEEE, vol. 104, pp. 148-175, 2016.
[18] J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian optimization of machine learning algorithms," in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 2, Lake Tahoe, Nevada, 2012.
