Comparison of Photovoltaic Plant Power Production Prediction
Renewable Energy
journal homepage: www.elsevier.com/locate/renene
Article info

Article history:
Received 24 August 2015
Received in revised form 14 November 2015
Accepted 5 January 2016
Available online 18 January 2016

Keywords:
Artificial neural network
Estimation
Genetic algorithms
MLP application
Photovoltaic power production
Regression analysis

Abstract

Nowadays the estimation of the power production yield of stand-alone and grid-connected Photovoltaic (PV) plants is crucial for technical and economic feasibility and design analyses. The main goal is to overcome the unpredictability of renewables by properly estimating power production and by suitably balancing generation and consumption. In this context, many methods can be applied to forecast renewable energy production. The scope of this paper is a comparative analysis of three different methods to estimate the power production of a preexisting PV plant. It is installed at the ENEA Research Centre located in Portici (South Italy) and is integrated in a Micro Grid (MG) configuration. In detail, a phenomenological model proposed by Sandia National Laboratories and two statistical learning models, a Multi-Layer Perceptron (MLP) Neural Network and a Regression approach, are compared. These models also differ deeply in terms of required input data and parameters. The phenomenological model requires the availability of design parameters and technical device specifications. Statistical machine learning models, instead, need previously acquired datasets of the input variables. The a-Si/μc-Si PV plant installed at Portici represents an adequate case study for comparing the three models, as both design and acquired data are available. In fact, the plant was designed at the ENEA Research Centre, so its design parameters are known and, being part of the MG, its data are continuously acquired and transmitted to other network devices. The obtained results demonstrate that more accurate power predictions can be reached by statistical machine learning approaches. The main novelty of the paper consists in the optimization of the considered models by the appropriate identification of the minimum and most representative training dataset. The authors show that thousands of samples are unnecessary when the dataset size and samples are suitably selected by means of a Genetic Algorithm. The effectiveness of the optimization strategy is verified by comparing the prediction performance obtained with the optimal dataset against that obtained with a randomly chosen dataset. In this scenario, the Genetic Algorithm strategy represents a successful approach to the suitable identification of datasets for statistical models.
© 2016 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.renene.2016.01.027
0960-1481/© 2016 Elsevier Ltd. All rights reserved.
514 G. Graditi et al. / Renewable Energy 90 (2016) 513–519
Wspeed, and the PV module temperature Tm. In addition, produced DC and AC power are taken into account.

3. Methodology

3.1. Input variables selection

The correct application of the above-mentioned models requires a preliminary input variable selection step. The variables that strongly affect the system power production can be identified by a correlation analysis. The correlation between the PV plant power production, Pac, and a set of available measured variables is computed. The theoretical clear-sky radiation GIcsk is also taken into account. Pearson correlation values (R), obtained considering the year 2007, are summarized in Table 1. The correlation analysis shows that GIpoa, the module temperature Tm and GIcsk are highly correlated with Pac.
Similar results are obtained by repeating the correlation analysis for other years, but the years 2006 and 2007 show a slightly higher correlation, mainly thanks to a preliminary calibration of the devices and solar sensors used for the long-period monitoring of the PV plant under test. Since only a few months of data were acquired during 2006, the year 2007 is selected as the training year for the ANN and Regression models.

plant losses by means of a “derating” factor calculated with Loss Calculator software provided by NREL [18] and included in PVWatts® calculator. This model can predict PV plant power production even before the PV system is physically realized, considering only system information and configuration.
The Shockley-Sandia model is available as a set of Matlab functions, collectively labeled as PV_LIB Toolbox. A schematic data flow diagram of the proposed model is depicted in Fig. 2. In this scheme we have adopted the same nomenclature of Matlab
Table 1
Correlation analysis results for year 2007.

Variable   R (correlation with Pac)
GIpoa      0.99
GIcsk      0.78
Wspeed     0.19
Tamb       0.45
Tm         0.97
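
As an illustration of how a correlation screening like the one in Table 1 can be carried out, the following Python sketch computes the Pearson coefficient R between one candidate input series and Pac. The function name `pearson_r` and the sample values are hypothetical placeholders chosen for this example; they are not measurements from the plant.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical samples (illustrative only, units assumed for the example).
gi_poa = [120.0, 340.0, 560.0, 780.0, 910.0]  # irradiance, W/m^2
p_ac = [0.10, 0.31, 0.52, 0.70, 0.83]         # AC power, kW

r = pearson_r(gi_poa, p_ac)
print(f"R(GIpoa, Pac) = {r:.3f}")
```

Repeating the call for each measured variable and ranking the results by |R| gives a table with the same layout as Table 1.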
Fig. 2. Schematic data flow diagram of the model developed by Sandia.
functions utilized in PV_LIB Toolbox. In detail, GIpoa, Wspeed and Tamb represent the input variables for the considered model.

3.3. Regression model

Another proposed model is a multiple linear regression. In a regression model, an equation describing the statistical relationship between one or more predictors and the response variable is derived with the aim of predicting new observations. A multiple linear regression model can be expressed as follows:

$$y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \varepsilon_i, \qquad i = 1, \dots, n; \; j = 1, \dots, p \tag{1}$$

where $y_i$ is the i-th response, $\beta_k$ is the k-th coefficient, $\beta_0$ is the constant term, $X_{ij}$ is the i-th observation on the j-th predictor variable, and $\varepsilon_i$ is the i-th random error term. In general, a linear regression model can be expressed as in (2):

$$y_i = \beta_0 + \sum_{k=1}^{K} \beta_k f_k(X_{i1}, X_{i2}, \dots, X_{ip}) + \varepsilon_i, \qquad i = 1, \dots, n \tag{2}$$

where $f_k(\cdot)$ is a scalar-valued function of the independent variables $X_{ij}$. It is worth noting that in “linear” regression the response variable, $y$, is a linear function of the coefficients, $\beta_k$. In our case, the aim is to obtain a relation between Pac and the predictor variables GIpoa, GIcsk and Tm. The form of the adopted regression equation is reported in (3):

$$P_{ac} = \beta_0 + \beta_1 \, GI_{poa} + \beta_2 \, GI_{csk} + \beta_3 \, T_m \tag{3}$$

where $\beta_3$ accounts for the influence of Tm on Pac. Regression coefficients are estimated to minimize the mean squared difference between the prediction vector and the true (measured) response vector. This method is known as the Least Squares (LS) method. In Fig. 3, the Regression model flowchart is shown. In this study, the robust fitting version of the method is implemented with the aim of mitigating the high sensitivity to outliers that characterizes LS estimation. In robust regression, a weight is assigned to each data point; this is done automatically and iteratively by means of iteratively reweighted LS. At the first iteration, each point is assigned equal weight and the model coefficients are estimated by ordinary LS. In subsequent iterations, weights are recalculated so that lower weights are assigned to points farther from the model predictions. This procedure goes on until the coefficient values converge within a specified tolerance.
All common tests have been performed (normal probability plot of residuals, variance inflation factors, etc.) to check whether LS regression produces unbiased coefficient estimates with minimum variance.

3.4. Artificial neural network model

ANNs were originally proposed as simplified models of biological neural networks. Like their biological counterparts, ANNs can learn from examples. For this reason, they are becoming useful as an alternative to conventional techniques for solving a wide variety of complex, non-linear, and even non-stationary problems. ANNs are widely used in the field of renewable energies to forecast or estimate energy production or other variables related to renewables (e.g. solar radiation, wind power, energy demand, etc.) [19,20].
In this paper, a Multi-Layer Perceptron (MLP) ANN is selected in order to estimate the PV power production Pac considering GIpoa, GIcsk and Tm as input variables. The MLP is a particular ANN architecture, where base units are arranged in layers with only forward connections to units in subsequent layers [7]. This type of ANN can approximate a wide class of non-linear functions and is usually employed in regression problems (estimating an output variable from predictor variables). In Fig. 4 a diagram of the employed MLP is shown.
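
The iteratively reweighted LS procedure described in Section 3.3 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the bisquare weight function, its tuning constant 4.685, and the residual scale based on the median absolute residual are assumptions, since the text does not specify them. The function names and the synthetic data are hypothetical; the example uses two predictors for brevity, while Eq. (3) uses three.

```python
import statistics

def solve_linear(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def weighted_ls(rows, y, w):
    """Weighted LS via the normal equations (X'WX) beta = X'Wy."""
    p = len(rows[0])
    xtwx = [[sum(wi * r[j] * r[k] for r, wi in zip(rows, w)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for r, yi, wi in zip(rows, y, w)) for j in range(p)]
    return solve_linear(xtwx, xtwy)

def robust_fit(predictors, y, tol=1e-8, max_iter=50, c=4.685):
    """Iteratively reweighted LS with bisquare weights (assumed weight function)."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1.0 is the constant term
    beta = weighted_ls(rows, y, [1.0] * len(y))   # first pass: equal weights (ordinary LS)
    for _ in range(max_iter):
        resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
        # Robust residual scale from the median absolute residual (assumption).
        scale = max(statistics.median([abs(e) for e in resid]) / 0.6745, 1e-12)
        # Points far from the current fit get lower (possibly zero) weight.
        w = [(1 - (e / (c * scale)) ** 2) ** 2 if abs(e) < c * scale else 0.0
             for e in resid]
        new_beta = weighted_ls(rows, y, w)
        shift = max(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if shift < tol:  # coefficients converged within the tolerance
            break
    return beta

# Synthetic check: y = 1 + 2*x1 + 3*x2 with one gross outlier at (1, 1).
grid = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in grid]
y[4] += 45.0  # corrupt the sample at (1, 1)

ols = weighted_ls([[1.0, x1, x2] for x1, x2 in grid], y, [1.0] * len(y))
robust = robust_fit(grid, y)
print("ordinary LS:", [round(b, 3) for b in ols])
print("robust fit: ", [round(b, 3) for b in robust])
```

In this synthetic case the single outlier shifts the ordinary LS constant term, while the reweighted fit recovers the coefficients used to generate the data.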
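
The layered, forward-only structure of the MLP described in Section 3.4 can be made concrete with a short forward-pass sketch. It is only an illustration: the single hidden layer, its size, the tanh activation, the random untrained weights and the input values are all assumptions, since the actual architecture is given in Fig. 4 and is not reproduced here.

```python
import math
import random

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> tanh hidden layer -> single linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
              for weights, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

rng = random.Random(0)
n_inputs, n_hidden = 3, 4  # three inputs (GIpoa, GIcsk, Tm); hidden size is an assumption

# Untrained, randomly initialized weights (placeholders for trained values).
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

x = [0.8, 0.9, 0.5]  # hypothetical inputs, assumed already scaled to [0, 1]
p_ac_estimate = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print("untrained network output:", p_ac_estimate)
```

Training would adjust `w_hidden`, `b_hidden`, `w_out` and `b_out` so that the output matches the measured Pac on the training samples.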
discontinuous, non-differentiable or highly nonlinear. The first step in the optimization process is to create an initial population by a random selection of values. The fitness function for the GA has been chosen as the normalized Root Mean Square Error (nRMSE) of the PV plant production. Individuals with lower nRMSE (that is, the best fitness) are chosen as elite; subsequent populations are then created by applying crossover and mutation operators. The authors decided to apply elitism, two-point crossover and mutation. In detail, the chosen crossover and mutation rates are 90% and 10%, respectively. A flowchart of the applied GA is shown in Fig. 5. GA performance evolves as the number of generations increases, as depicted in Fig. 6a, where the best penalty value reaches its minimum after a few generations (4), remaining almost the same for the subsequent 50 generations. This process goes on until a stopping condition is reached: Stall Time (T), Stall Generation (G), Time and Generation. In the case of the ANN, the GA stops when the Stall Generation criterion is reached, as reported in Fig. 6b. A total of 54 generations are used to identify the minimum and meaningful dataset characterizing the developed ANN. In a similar manner, in the case of the Regression model, the applied GA reaches the Stall Generation stopping criterion after 57 generations.

Fig. 5. GA flowchart.

Fig. 6. (a) ANN optimization: GA performance vs. generation number graph, (b) GA stopping criteria graph.
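
The GA loop outlined above (random initial population, elitism, two-point crossover at a 90% rate, mutation at a 10% rate, and a stall-generation stopping criterion) can be sketched as follows. Several details are assumptions made for this example because the text does not state them: candidate training days are encoded as a bit mask, parents are chosen by tournament selection, the mutation rate is applied per individual and flips one bit, and the sketch treats the individuals with the lowest error as elite because the GA minimizes the nRMSE. The fitness used here, `toy_error`, is a hypothetical stand-in for "train the model on the selected days and return the nRMSE".

```python
import random

def two_point_crossover(a, b, rng):
    """Exchange the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def tournament(pop, scores, rng, k=2):
    """Pick k random individuals and return the one with the lowest error."""
    picks = rng.sample(range(len(pop)), k)
    return pop[min(picks, key=lambda idx: scores[idx])]

def run_ga(fitness, n_genes, pop_size=20, n_elite=2, p_cross=0.9, p_mut=0.1,
           stall_generations=50, max_generations=500, seed=0):
    """Minimize fitness(mask) over bit masks; returns (best_mask, best-score history)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history, stall, best_mask = [], 0, None
    for _ in range(max_generations):
        scores = [fitness(ind) for ind in pop]
        order = sorted(range(pop_size), key=lambda idx: scores[idx])
        best_mask, best = pop[order[0]][:], scores[order[0]]
        # Count consecutive generations without improvement of the best score.
        stall = stall + 1 if history and best >= history[-1] else 0
        history.append(best)
        if stall >= stall_generations:
            break  # stall-generation stopping criterion
        new_pop = [pop[idx][:] for idx in order[:n_elite]]  # elitism: copied unchanged
        while len(new_pop) < pop_size:
            a, b = tournament(pop, scores, rng), tournament(pop, scores, rng)
            if rng.random() < p_cross:
                a, b = two_point_crossover(a, b, rng)
            for child in (a[:], b[:]):
                if rng.random() < p_mut:
                    g = rng.randrange(n_genes)
                    child[g] = 1 - child[g]  # flip one bit
                new_pop.append(child)
        pop = new_pop[:pop_size]
    return best_mask, history

# Hypothetical "ideal" selection of two days out of ten candidates.
TARGET = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

def toy_error(mask):
    """Toy error: fraction of days whose selection differs from TARGET."""
    return sum(m != t for m, t in zip(mask, TARGET)) / len(TARGET)

best_mask, history = run_ga(toy_error, n_genes=len(TARGET))
print("generations evaluated:", len(history))
print("best mask:", best_mask, "error:", history[-1])
```

Because the elite individuals are copied unchanged into the next generation, the best error in `history` never increases, which is the behaviour summarized by a curve such as the one in Fig. 6a.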
Table 3
Comparative analysis nRMSE results.
Model
Year Sandia Regression Optimized regression ANN Optimized ANN Random ANN
6. Conclusion

The accurate estimation of power production is very important for the operations planning of electric power systems. In this paper, two statistical learning models and a physical/phenomenological one are compared. Performance results show that more accurate predictions can be obtained by the Regression and MLP models as compared to the classical reference method. A further analysis identifies the minimum dataset necessary to train the ANN and regression models by means of a GA. The minimum and representative dataset consists of only two days of samples for both the ANN and the Regression analysis. The effectiveness of the GA solutions is verified by comparing ANN performance for optimized and randomly chosen training datasets.
Future studies are going to analyze the optimal datasets to identify suitable merit factors. The idea is to classify days with respect to the optimal selected ones on the basis of specific indices. In this way, by acquiring two days of data in every season or month of the year, it is possible to correlate these samples with the reference ones. This analysis could permit a reliable PV power prediction while acquiring only small samples.

References

[1] C. Paravalos, E. Koutroulis, V. Samoladas, T. Kerekes, D. Sera, R. Teodorescu, Optimal design of photovoltaic systems using high time-resolution meteorological data, IEEE Trans. Ind. Inf. 10 (4) (Nov. 2014) 2270–2279.
[2] G. Graditi, M.L. Di Silvestre, R. Gallea, E. Riva Sanseverino, Heuristic-based shiftable loads optimal management in smart micro-grids, IEEE Trans. Ind. Inf. 11 (1) (Feb. 2015) 271–280.
[3] D.L. King, W.E. Boyson, J.A. Kratochvill, Photovoltaic Array Performance Model, Sandia Report, 2004-3535, pp. 1–43.
[4] G. Adinolfi, N. Femia, G. Petrone, G. Spagnuolo, M. Vitelli, Design of dc/dc converters for DMPPT PV applications based on the concept of energetic efficiency, J. Sol. Energy Eng. 132 (2) (May 2010) 021005.
[5] G. Graditi, G. Adinolfi, Energy performances and reliability evaluation of an optimized DMPPT boost converter, in: 2011 International Conference on Clean Electrical Power (ICCEP), 2011, pp. 69–72.
[6] G. Graditi, S. Ferlito, G. Adinolfi, G.M. Tina, C. Ventura, Performance estimation of a thin-film photovoltaic plant based on an artificial neural network model, in: 2014 5th International Renewable Energy Congress (IREC), 2014, pp. 1–6.
[7] O.E. Dragomir, F. Dragomir, I. Brezeanu, E. Minca, MLP neural network as load forecasting tool on short-term horizon, in: 2011 19th Mediterranean Conference on Control & Automation (MED), 2011, pp. 1265–1270.
[8] D.L. King, W.E. Boyson, J.A. Kratochvill, Photovoltaic Array Performance Model, December, 2004, pp. 1–43.
[9] T. Hiyama, E. Karatepe, Investigation of ANN performance for tracking the optimum points of PV module under partially shaded conditions, in: 2010 Conference Proceedings IPEC, 2010, pp. 1186–1191.
[10] D.L. King, W.E. Boyson, J.A. Kratochvill, Photovoltaic Array Performance Model, December, 2004, pp. 1–43.
[11] A. Mellit, Artificial intelligence based-modeling for sizing of a stand-alone photovoltaic power system: proposition for a new model using neuro-fuzzy system (ANFIS), in: 2006 3rd International IEEE Conference Intelligent Systems, 2006, pp. 606–611.
[12] M. Bocco, E. Willington, M. Arias, Comparison of regression and neural networks models to estimate solar radiation, Chil. J. Agric. Res. 70 (3) (2010) 428–435.
[13] P.G. Nikhil, D. Subhakar, Approaches for developing a regression model for sizing a stand-alone photovoltaic system, IEEE J. Photovolt. 5 (1) (Jan. 2015) 250–257.
[14] Y.K. Wu, C.R. Chen, H.A. Rahman, A novel hybrid model for short-term forecasting in PV power generation, Int. J. Photoenergy 2014 (2014).
[15] J. Yokoyama, Short term load forecasting improved by ensemble and its variations, in: 2012 IEEE Power and Energy Society General Meeting, 2012, pp. 1–6.
[16] F. Meillaud, A. Billet, C. Battaglia, et al., Latest developments of high-efficiency micromorph tandem silicon solar cells implementing innovative substrate materials and improved cell design, IEEE J. Photovolt. 2 (3) (Jul. 2012) 236–240.
[17] M. Boccard, P. Cuony, C. Battaglia, et al., Nanometer- and micrometer-scale texturing for high-efficiency micromorph thin-film silicon solar cells, IEEE J. Photovolt. 2 (2) (Apr. 2012) 83–87.
[18] W. Andrews, J. S. Stein, C. Hansen, D. Riley, C. Consulting, and S. N. Laboratories, Introduction to the Open Source PV LIB for Python Photovoltaic System Modelling Package, pp. 1–5.
[19] D.M. Riley, G.K. Venayagamoorthy, Comparison of a recurrent neural network PV system model with a traditional component-based PV system model, in: Conf. Rec. IEEE Photovolt. Spec. Conf., No. June, 2011, pp. 002426–002431.
[20] N. Kassim, S.I. Sulaiman, Z. Othman, I. Musirin, Harmony search-based optimization of artificial neural network for predicting AC power from a photovoltaic system, in: 2014 IEEE 8th International Power Engineering and Optimization Conference (PEOCO2014), 2014, pp. 504–507.
[21] M. Braun, et al., Is the distribution grid ready to accept large scale photovoltaic deployment? – State of the art, progress and future prospects, Prog. Photovolt. Res. Appl. 20 (6) (2011) 681–697.
[22] R.G. Kamp, H.H.G. Savenije, Optimising training data for ANNs with genetic algorithms, Hydrol. Earth Syst. Sci. 10 (2006) 603–608.
[23] M.V. Shcherbakov, A. Brebels, A survey of forecast error measures, World Appl. Sci. J. Inf. Technol. Mod. Ind. Educ. Soc. 24 (4) (2013) 171–176.