Forecasting daily global solar irradiance generation using machine learning

Amandeep Sharma⁎, Ajay Kakkar

Electronics and Communication Engineering Department, Thapar University, Patiala, India
Keywords: Solar irradiance; Energy harvesting; Solar forecasting; Machine learning

Abstract: Rechargeable wireless sensor networks mitigate the lifespan and cost constraints posed by conventional battery operated networks. Reliable knowledge of solar radiation is essential for informed design, deployment planning and optimal management of self-powered nodes. The problem of solar irradiance forecasting can be well addressed by machine learning methodologies applied over a historical dataset. In the proposed work, forecasts have been made using the FoBa, leapForward, spikeslab, Cubist and bagEarthGCV models. To validate the effectiveness of these methodologies, a series of experimental evaluations has been presented in terms of forecast accuracy, correlation coefficient and root mean square error (RMSE). The R interface has been used as the simulation platform for these evaluations. The dataset from the National Renewable Energy Laboratory (NREL) has been used for the experiments. The experimental results show that solar irradiance from a few hours to two days ahead is precisely estimated by machine learning based models, irrespective of seasonal variation in weather conditions.
1. Introduction

Recent developments in wireless sensor technology incorporate self-powered sensors that operate autonomously for real time parameter updates. Various energy harvesting technologies provide different kinds of widely distributed, effectively endless supply, including solar light, piezoelectricity, RF, physical motion and electromagnetic fields. Solar energy with photovoltaic cell modules has been considered the best ambient source because of its high power density (15 mW/cm³), adequate conversion efficiency (17%) and compatibility with integrated circuit technology. Table A1, given in Appendix A, summarizes the power density and conversion efficiency of the different sources [1,2].

Solar power based systems are restrained by different meteorological conditions, seasonal variability, geographical constraints and intra-hour solar intensity. Fig. 1 exhibits monthly statistics of global solar radiation on a horizontal surface from January to December 2016. The dataset has been adapted from the Solar Radiation Research Laboratory (SRRL) under the National Renewable Energy Laboratory (NREL) [3], with a CMP-22 pyranometer as the solar radiation sensor [4]. Fig. 1(a) shows the seasonal variation of solar irradiance and Fig. 1(b) exhibits the maximum and minimum solar intensity for the different months of the year. Solar forecasting diminishes the effect of resource variability and uncertainty by targeting different forecast time horizons. With practical use in mind, Fig. 2 shows the different forecasting horizons and the related activities in solar based power systems.

Very short term forecasting is essential for real time monitoring of battery status. Short term forecasting is critical for decision making activities such as unit commitment. Medium term forecasting is effective for maintenance scheduling and the spinning of power units. Long term forecasting is useful in planning network operations. Precise solar forecasting ensures reliable and stable rechargeable sensor operation with improved control algorithms for battery backup. The different forecasting methodologies that have been developed for the solar irradiance forecasting task are summarized in Section 1.2.

Dependency on meteorological conditions causes renewable energy resources to be inconsistent. Under this constraint, reliable solar irradiance forecasts on different time horizons are essential for developing and utilizing solar energy based systems. As a sequel, research on solar irradiance forecasting has germinated along with the areas of forecasting theory [5,6], solar physics [7], stochastic processes [8] and machine learning [9]. Although these methods do not all have the same accuracy with respect to the target forecasting horizon, the ability of machine learning models to trace the relation between input and output parameters allows this methodology to be successful in various domains including classification, data mining and solar fore-
⁎ Corresponding author.
E-mail addresses: amandeep.sharma@thapar.edu (A. Sharma), ajay.kakkar@thapar.edu (A. Kakkar).
http://dx.doi.org/10.1016/j.rser.2017.08.066
Received 4 May 2017; Received in revised form 17 July 2017; Accepted 18 August 2017
Available online 24 August 2017
1364-0321/ © 2017 Elsevier Ltd. All rights reserved.
A. Sharma, A. Kakkar Renewable and Sustainable Energy Reviews 82 (2018) 2254–2269
casting. Classification and data mining have been considered the initial step for machine learning based models, as pre-processing of the data is required with big datasets.

Neural networks (NN) [10,11], genetic algorithms (GA) [12], support vector machines (SVM) [13] and fuzzy based models [14] are the most extensively used machine learning methodologies in solar forecasting. A multilayer perceptron (MLP) model with daily solar irradiance and average air temperature as input parameters has been proposed by Mellit et al. [15] for 24 h ahead forecasting. Kemmoku et al. [16] proposed a multistage neural network that considers various meteorological parameters of past days together with the mean atmospheric pressure, itself predicted by another neural network, for the prediction of the next day. Hocaoglu et al. [17] integrated the multistage neural network concept with time delay neural network models for hourly solar irradiance forecasting. A comparison of neural network based models and clearness index based time series models has been given by Sfetsos et al. [18]. They consider daily ambient temperature, atmospheric pressure and wind speed as inputs to a neural network based model for hourly prediction. The main constraint with neural network based models is the design of a flawless network structure with optimal values of the different parameters. Quaiyum et al. [19] introduced endogenous and exogenous models that work on past solar irradiance and different weather parameters respectively. They also integrated a genetic algorithm with neural networks for parameter optimization. But, similar to neural networks, it is hard to trace the dynamic behaviour of the atmosphere mathematically. Belaid et al. [20] proposed an SVM based approach for one step ahead solar forecasting with extraterrestrial solar irradiance, sunshine duration and ambient temperature as input parameters. Jiang et al. [21] presented an SVM approach with a hard penalty function to select an optimized number of radial basis functions. They also implemented a glowworm swarm optimization algorithm to choose optimal parameters for forecasting. Boata et al. [22] introduced an autoregressive fuzzy algorithm based model for solar prediction by estimating the daily clearness index.

1.3. Contribution

In the proposed work, multiple machine learning based models have been applied to track effective solar forecasting models and analyse the prediction accuracy of each model. Machine learning has been applied on historical solar intensity observations as the training dataset to calculate future solar irradiance for different forecasting horizons, irrespective of seasonal variation and input parameter availability.

In Section 2, the modelling of machine learning based models for solar irradiance prediction has been discussed. A description of the database has been presented in Section 3. Equations for the performance indicators have
Fig. 1. (a) Solar irradiance (hourly); (b) minimum, maximum and average solar irradiance (monthly).
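The monthly statistics plotted in Fig. 1(b) reduce to a simple aggregation of the hourly series. The paper's simulations use the R interface; a language-agnostic sketch in Python follows, with a synthetic generator standing in for the hourly SRRL data (the generator is purely illustrative, not the real irradiance profile):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def monthly_stats(hourly):
    """Aggregate (timestamp, irradiance) pairs into per-month (min, max, mean)."""
    buckets = defaultdict(list)
    for ts, ghi in hourly:
        buckets[ts.month].append(ghi)
    return {m: (min(v), max(v), sum(v) / len(v)) for m, v in buckets.items()}

# Synthetic hourly stand-in for one year of SRRL data (illustrative only).
start = datetime(2016, 1, 1)
series = [(start + timedelta(hours=h),
           max(0.0, 500 + 300 * ((h % 24) - 12) / 12))
          for h in range(8760)]
stats = monthly_stats(series)  # e.g. stats[1] -> (min, max, mean) for January
```

The same reduction, applied to the measured NREL series, yields the monthly envelopes shown in Fig. 1(b).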
been summarized in Section 4. Section 5 includes the simulation results and discussions. The proposed work has been concluded in Section 6.

2. Solar irradiance forecasting platform

2.1. Machine learning methodology

All machine learning based algorithms work to trace a predictive model that estimates a particular type of data with high accuracy. A large dataset is essential for the learning algorithm to understand the behaviour of the system. Fig. 3 exhibits the machine learning methodology. The first step for a machine learning based system is data procurement. The collected data has been examined from different perspectives and summarized into useful information; the steps included in this process are data cleansing and data segregation. The data has been segregated into three disjoint sets: a training, a testing and a blind set. The training dataset has been applied for model training and the testing dataset has been used for model optimization and evaluation. The blind dataset has been used for cross validation.

[Fig. 3. Machine learning methodology: data cleansing and data segregation; training set → model training; testing set → model optimization and evaluation; blind set → cross validation.]

2.2. Machine learning models

In the proposed work, machine learning based time series models that take historically observed solar irradiance as the input parameter have been used for solar forecasting, called endogenous forecasting. Fig. 4 summarizes the five forecasting models used in the proposed work with their methodologies; they are explained in the following sections.

[Fig. 4. Five forecasting models: FoBa (adaptive forward–backward greedy algorithm), leapForward (regression subset selection), spikeslab (spikes and slab multivariate modelling), Cubist (rule based linear models) and bagEarthGCV (multivariate adaptive regression splines).]

2.2.1. FoBa (adaptive forward–backward greedy algorithm)

FoBa is based on a forward greedy algorithm with adaptive backward steps [23,24]. The objective is to remove any error caused by earlier forward steps and to avoid a large number of basis functions. The adaptive backward steps ensure that a backward greedy step will not erase the gain made in the forward steps. Consider n input vectors xᵢ ∈ Rᵈ (i = 1, …, n) and d feature vectors fⱼ ∈ Rⁿ (j = 1, …, d) with output variables y ∈ Rⁿ. Each fⱼ is the jth feature component of the xᵢ, that is, fⱼ,ᵢ = xᵢ,ⱼ. With sparsity parameter k, the non convex L0 regularization can be written as:

ŵ = argmin_{w ∈ Rᵈ} R(w)   (1)

where ‖w‖₀ ≤ k, w = [w₁, w₂, …, w_d] ∈ Rᵈ and, for least squares regression, R(w) is a real valued cost function calculated as:

R(w) = n⁻¹ ‖Σⱼ wⱼ fⱼ − y‖²   (2)

A backward step is taken when the increase of the cost function is no more than half of the decrease of the cost function in the earlier forward steps, i.e. if l forward steps have been taken, the cost function should have decreased by at least an amount of lε/2. This means that, if R(w) ≥ 0 for all w ∈ Rᵈ, the algorithm terminates in no more than 2R(0)/ε steps. The procedure for FoBa [23] has been listed in Appendix B.

2.2.2. leapForward

It performs an exhaustive search using a match-select-act cycle for tracing the best subset of predicting variables [25]. Selecting a subset refers to finding a small set of independent variables that offers low prediction error in predicting the dependent variables. Rules that can identify a finite set of ordered variables satisfying their predicates are selected. When such an n-set is selected, that particular rule is fired. This procedure continues until no more rules can be fired. The key point of the leaps algorithm is lazy subset selection, i.e. subsets emerge only when they are required. This perspective increases rule execution efficiency and reduces the space complexity of the leaps algorithm.

When a variable has been selected or deleted, a timestamp for that element has been placed on a stack that upholds the timestamp ordering of the variables. The most recently added variable is placed on the top of the stack and selected first during the rule execution cycle. This variable is called the dominant object and originates the selection predicates of all rules in an ordered way. When all the rules have been examined, that dominant object is popped from the stack. When a dominant object originated n-subset has been found, the corresponding rule is fired. A new dominant object is selected when a rule gets fired. These execution steps repeat until the stack is empty; the procedure is given in Appendix C.

2.2.3. Spikeslab

The spikeslab model [26,27] implements rescaled spike and slab algorithms using a continuous bimodal prior. The model has been implemented in the three stages shown in Fig. 5 and listed below.

In step 1, a filtering process carries the top nF variables, where n is the sample size and F > 0 is a user defined fraction. The rest of the variables are filtered out to reduce the dimension. The posterior mean coefficients have been calculated using Gibbs sampling for an appropriate ordering of the variables. Step 2 refits the model using only those variables that were not filtered out in step 1. The Gibbs sampler has been used for model fitting and returns posterior mean values referred to as the Bayesian model averaged (BMA) estimate. A generalized elastic net (gnet) in step 3 has been used for variable selection. Variables obtained from the restricted BMA of step 2 have been classified into groups. Grouping forces variables to share a common regularization parameter. There is no limit on the number of groups. A variable that does not appear in the list will be assigned to a default group that has its own group specific regularization parameter.

2.2.4. Cubist model

Cubist was developed by Quinlan [28] for inducing trees of regression models. Cubist is a rule based predictive model where each rule carries a multivariate linear model. These models work on the predictions of previous splits [29,30]. When a case satisfies all the conditions of a rule, the associated model is used for prediction. Fig. 6 shows the flow diagram of the Cubist model. In the first stage, recursive partitioning (divide and conquer) of the training cases is exercised to generate a piecewise linear model in the form of a regression based model tree. Each training case has a set of attributes and an associated target value.

The basic approach is to generate a model that relates the target values of the training cases to the values of their other attributes. A splitting criterion is used that minimizes the intra subset variation in the class values instead of maximizing the information gain at each interior node. The splitting criterion is based on computing the standard deviation of the target values of the cases in T; the attribute that minimizes the standard deviation is chosen. If sd(Tᵢ) is the standard deviation of the targets of the cases in Tᵢ, then the reduction in standard deviation is calculated as:

Error = sd(T) − Σᵢ (|Tᵢ|/|T|) × sd(Tᵢ)   (3)

where T is the set of cases that reach the node and T₁, T₂, … are the subsets of cases obtained after splitting the node according to the chosen attribute. At the second stage, pruning is carried out by estimating the expected error that will be experienced at each node for test data. Each linear model is simplified by eliminating parameters in order to reduce its estimated error. Parameters are eliminated one by one so long as the error estimate decreases. Each internal node of the tree has both a simplified model and a model subtree; the one with the lowest estimated error is chosen.

Finally, a smoothing process is carried out to compensate for abrupt discontinuities between adjacent linear models of the pruned tree. A value predicted by a model tree is adjusted to take account of the models at the nodes along the path from the root to that leaf. The calculation is as follows:

Predicted″ = (n × Predicted′ + k × Predicted)/(n + k)   (4)

where Predicted″ is the prediction passed on to the next higher node, Predicted′ is the prediction passed to this node from below, Predicted is the predicted value at this node, n is the number of training cases that reach the node below and k is a constant.

2.2.5. BagEarthGCV

It is a non-parametric regression technique based on multivariate adaptive regression splines (MARS) [31,32]. Open source implementations of MARS are termed Earth, as the MARS term is licensed to Salford Systems. The MARSplines model is implemented by Eq. (5):

f(x) = ω₀ + Σᵤ₌₁ᵁ ωᵤ hᵤ(x)   (5)

f(x) is predicted as a function of the predictor variables x, an intercept parameter ω₀ and a weighted sum of one or more basis functions hᵤ(x). Each ωᵤ is a constant coefficient.

Bagging: It is a model averaging approach that computes multiple versions of a predictor and uses them to derive an aggregate predictor. The multiple versions are generated by making bootstrap replicates of the training set and using each as a new training set. Bagging improves the stability and prediction accuracy of different machine learning approaches and avoids overfitting.

Pruning: After the forward stepwise selection of basis functions, a backward procedure called pruning is applied to remove those basis functions that contribute least to the goodness-of-fit. The generalized cross validation error is a measure of goodness-of-fit that considers the residual error together with the model complexity and is formulated as:

GCV = [Σᵢ₌₁ᴺ (yᵢ − f(xᵢ))²] / (1 − C/N)²,  where C = 1 + c·d   (6)

In Eq. (6), N signifies the number of cases in the dataset and d is the degrees of freedom, corresponding to the number of independent basis functions.
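The pruning criterion of Eq. (6) can be computed directly. A minimal sketch follows; the per-basis-function penalty c is a free parameter here, and the default value 3.0 is only an illustrative assumption, not a value taken from the paper:

```python
def gcv(y, y_hat, n_basis, c=3.0):
    """Generalized cross validation error, Eq. (6):
    GCV = RSS / (1 - C/N)^2 with C = 1 + c*d, d = number of basis functions."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    cost = 1.0 + c * n_basis          # effective number of parameters C
    return rss / (1.0 - cost / n) ** 2

# A larger model must fit markedly better to win under GCV:
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
small = gcv(y, [yi + 0.5 for yi in y], n_basis=1)    # higher RSS, 1 basis fn
large = gcv(y, [yi + 0.45 for yi in y], n_basis=3)   # lower RSS, 3 basis fns
```

Here the three-basis-function model has the lower residual error yet the higher GCV, which is exactly why backward pruning can discard such terms.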
[Figure: measured solar irradiance (W/m²) at hourly resolution over 61 daytime hours (6:00–18:00).]
Table 1
Comparison results of five forecasting models for selection of historical days for forecasting (Experiment 1).
[The table reports r, r², RMSE and accuracy for FoBa, leapForward, spikeslab, Cubist and bagEarthGCV on four test days (11 March, 25 June (summer), 30 August (monsoon) and 31 December (winter)), for ±5 to ±25 past days (training) and 5 to 25 past days (testing); individual cell values omitted.]

5.1. Experiments

Three experiments have been carried out to assess prediction effectiveness based on input parameter selection. In the first, the impact of the number of past days on prediction accuracy has been evaluated. The second experiment evaluates the performance obtained by varying the number of past time slots. In the third experiment, different forecasting horizons have been considered to evaluate the forecasting models.

5.1.1. Prediction accuracy with respect to number of past days

In experiment 1, simulation has been performed with 10–50 past days for training and 5–25 past days for testing for all five forecasting models. As shown in Table 1, for 11th March, leapForward shows a maximum accuracy of 96.08% with ±15 training days and 15 past days for testing. Cubist achieves a maximum accuracy of 78.95% for 25th June with ±5 training days and 5 historical days for testing. For 30th August, an accuracy of 89.47% has been achieved by bagEarthGCV with ±5 training days and 5 historical days for testing. An accuracy of 98.21% has been gained by FoBa with ±20 training days and 20 days for testing for 31st December. Ranking the above mentioned models is difficult because the relationship between past days' weather metrics and present day solar intensity is complicated. The choice of the most accurate model with an adequate number of past days depends upon the present day weather conditions and is iteration specific.

5.1.2. Prediction accuracy with respect to initial past time slots

In experiment 2, initial time slots from 6:00 a.m. to 10:00 a.m. have been […] accuracy. For four different days (11th March, 25th June, 30th August and 31st December), FoBa offers maximum accuracy (70.27%, 64.86%, […]) […] Table 2. For 11th March, 25th June, 30th August and 31st December, the maximum accuracies are 96.08%, 64.86%, 83.93% and 84.21% respectively for almost all simulation cases. Experimental results show that, similar to the leapForward model, spikeslab offers less variation with respect to the initial time slot considered and achieves high prediction accuracy with fewer past samples. For 11th March, 25th […] 85.45% respectively) has been gained with past time slots from 10:00 a.m. onwards. For 30th August, maximum accuracy (78.95%) has been gained with past samples from 6:00 a.m. onwards. Cubist […] consideration (6:00 a.m. onwards). For 11th March, 25th June, 30th […] (11th March, 25th June, 30th August and 31st December respectively) […]

To investigate the prediction accuracy of the five forecasting models […] seasonal validation […] possesses smooth behaviour with lower maximum and minimum solar intensity thresholds because of the winter season.
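The performance indicators reported throughout Tables 1 and 2 are the correlation coefficient r (and r²), RMSE and forecast accuracy. The paper's exact equations are given in Section 4; as a stand-in, the sketch below computes r², RMSE and a tolerance-based hit rate (the ±25 W/m² tolerance is an assumed, illustrative definition of accuracy, not the paper's):

```python
import math

def rmse(y, y_hat):
    """Root mean square error between measured and predicted series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def hit_rate(y, y_hat, tol=25.0):
    """Percentage of predictions within ±tol W/m² of the measurement
    (an assumed proxy for the accuracy metric defined in Section 4)."""
    hits = sum(abs(a - b) <= tol for a, b in zip(y, y_hat))
    return 100.0 * hits / len(y)

measured  = [120.0, 340.0, 560.0, 610.0, 430.0]   # illustrative values
predicted = [130.0, 320.0, 580.0, 600.0, 470.0]
```

With these toy values, four of the five predictions fall within the tolerance, so the hit rate is 80%.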
Table 2
Comparison results of five forecasting models for selection of initial past time slots for forecasting (Experiment 2).
Columns per group: initial past time slots 6:00 am, 7:00 am, 8:00 am, 9:00 am, 10:00 am (two test-day groups per panel, separated by "|").

First panel (partial):
FoBa        r        .92   .98   .97   .98   .98   | .96   .98   .95   .95   .96

Second panel:
FoBa        r        .97   .97   .95   .96   .96   | .99   1     .99   .99   .99
            r²       .94   .94   .9    .92   .92   | .98   1     .98   .98   .98
            RMSE     23.33 23.33 33.23 28.8  29.08 | 14.77 2.92  14.49 11.18 12.46
            accuracy 66.07 66.07 56.76 57.14 57.14 | 63.16 98.21 63.16 78.95 78.95
leapForward r        .96   .96   .96   .96   .96   | .99   .99   .99   .99   .99
            r²       .92   .92   .92   .92   .92   | .98   .98   .98   .98   .98
            RMSE     21.67 21.67 21.67 21.67 21.67 | 12.9  12.9  12.9  12.9  12.9
            accuracy 83.93 83.93 83.93 83.93 83.93 | 84.21 84.21 84.21 84.21 84.21
spikeslab   r        1     .94   .95   .99   1     | .98   .97   .97   .97   .97
            r²       1     .88   .9    .98   1     | .96   .94   .94   .94   .94
            RMSE     13.9  30.64 28.8  20.84 15.63 | 14.18 17.06 13.48 13.62 13.18
            accuracy 78.95 72.97 63.64 57.89 57.89 | 83.64 72.97 85.45 81.82 85.45
Cubist      r        .99   .92   .99   .99   .99   | .98   .97   .98   .98   .98
            r²       .98   .85   .98   .98   .98   | .96   .92   .96   .96   .96
            RMSE     10.98 37.1  12.78 12.78 14.87 | 11.22 17.55 10.84 12.47 12.41
            accuracy 84.21 64.86 84.21 84.21 84.21 | 85.71 78.38 83.93 83.93 82.14
bagEarthGCV r        .99   .99   1     .99   1     | .98   .96   .97   .98   .98
            r²       .98   .98   1     .98   1     | .96   .92   .94   .96   .96
            RMSE     13.54 11.27 9.84  16.02 11.64 | 11.22 19.82 13.32 11.16 10.59
            accuracy 84.21 89.47 94.74 78.95 78.95 | 85.71 75.68 83.93 87.5  89.29
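The experiments repeatedly use a "±N past days for training, drawn from the years 2010–2015" protocol around each target date. That calendar-window construction can be sketched as follows (function name and structure are illustrative, not from the paper):

```python
from datetime import date, timedelta

def training_days(target, n_days, years):
    """Enumerate the ±n_days calendar window around the target's
    month/day, repeated for each historical year."""
    days = []
    for year in years:
        anchor = date(year, target.month, target.day)
        days.extend(anchor + timedelta(d) for d in range(-n_days, n_days + 1))
    return days

# ±15 days around 11 March, drawn from the years 2010-2015.
window = training_days(date(2016, 3, 11), 15, range(2010, 2016))
```

Each historical year contributes a 31-day window (15 before, the anchor day, 15 after), so six years yield 186 training days, none of them from the target year.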
[Fig. 8. Predicted versus measured solar irradiance (W/m²) scatter plots for the five forecasting models (FoBa, leapForward, spikeslab, Cubist and bagEarthGCV), panels (a)–(l).]
• 11th March 2016, 1 h ahead forecasting
For 11th March 2016, in one hour ahead prediction the highest prediction accuracy (90.2%) has been achieved by bagEarthGCV with an r² value of 1 and 8.52 RMSE. The historical days from 25th February to 27th March (±15 days) from the years 2010–2015 have been used for training, and the days from 25th February 2016 to 11th March 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is one hour ahead of the prediction time.
• 11th March 2016, 24 h ahead forecasting
In 24 h ahead solar forecasting for 11th March 2016, the prediction accuracy has been reduced from 90.2% to 79.17% and is offered by the Cubist model. The correlation coefficient and RMSE have been observed as .94 and 38.22 respectively. The historical days from 25th February to 27th March (±15 days) from the years 2010–2015 have been used for training, and the days from 25th February 2016 to 10th March 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 24 h ahead of the prediction time.
• 11th March 2016, 48 h ahead forecasting
In 48 h ahead solar forecasting for 11th March 2016, the maximum prediction accuracy (69.09%) has been offered by the spikeslab model with .88 correlation coefficient and 45.15 RMSE. The historical days from 25th February to 27th March (±15 days) from the years 2010–2015 have been used for training, and the days from 25th February 2016 to 9th March 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 48 h ahead of the prediction time.
It has been observed that, for 11th March 2016, in all three forecasting horizons the performance metrics are satisfactory and the effectiveness of a particular model is weather specific.
• 25th June 2016, 1 h ahead forecasting
For 25th June 2016 in 1 h ahead prediction, spikeslab gains the highest prediction accuracy (69.09%) with .96 correlation coefficient and 45.15 RMSE. The historical days from 10th June to 10th July (±15 days) from the years 2010–2015 have been used for training, and the days from 10th June 2016 to 25th June 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is one hour ahead of the prediction time.
• 25th June 2016, 24 h ahead forecasting
In 24 h ahead prediction for 25th June 2016, the Cubist model offers the highest accuracy (65.38%) with .94 correlation coefficient and 46.34 RMSE. The historical days from 10th June to 10th July (±15 days) from the years 2010–2015 have been used for training, and the days from 10th June 2016 to 24th June 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 24 h ahead of the prediction time.
• 30th August 2016, 1 h ahead forecasting
[…] prediction accuracy (83.78%) has been achieved by spikeslab with .98 correlation coefficient and 22.18 RMSE. The historical days from 15th August to 14th September (±15 days) from the years 2010–2015 have been used for training, and the days from 15th August 2016 to 30th August 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is one hour ahead of the prediction time.
• 30th August 2016, 24 h ahead forecasting
In 24 h ahead solar forecasting for 30th August 2016, spikeslab offers the maximum prediction accuracy (88.46%) with .92 correlation coefficient and 23.37 RMSE. The historical days from 15th August to 14th September (±15 days) from the years 2010–2015 have been used for training, and the days from 15th August 2016 to 29th August 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 24 h ahead of the prediction time.
• 30th August 2016, 48 h ahead forecasting
In 48 h ahead prediction for 30th August 2016, the maximum accuracy (91.67%) has been gained by the spikeslab model with .92 correlation coefficient and 22.73 RMSE. The historical days from 15th August to 14th September (±15 days) from the years 2010–2015 have been used for training, and the days from 15th August 2016 to 28th August 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 48 h ahead of the prediction time.
• 31st December 2016, 1 h ahead forecasting
In 1 h ahead solar forecasting for 31st December 2016, the maximum accuracy (93.86%) has been given by the Cubist model with .92 correlation coefficient and 10.3 RMSE. The historical days from 16th December to 15th January (±15 days) from the years 2010–2015 have been used for training, and the days from 16th December 2016 to 31st December 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is one hour ahead of the prediction time.
• 31st December 2016, 24 h ahead forecasting
The spikeslab model offers the highest prediction accuracy (92.7%) for 31st December 2016 in the 24 h ahead prediction horizon with .92 correlation coefficient and 10.73 RMSE. The historical days from 16th December to 15th January (±15 days) from the years 2010–2015 have been used for training, and the days from 16th December 2016 to 30th December 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and the last available time slot is 24 h ahead of the prediction time.
• 31st December 2016, 48 h ahead forecasting
For 48 h ahead forecasting, spikeslab offers the highest prediction accuracy (91.67%) with .94 correlation coefficient and 22.73 RMSE. The historical days from 16th December to 15th January (±15 days) from the years 2010–2015 have been used for training, and the days from 16th December 2016 to 29th December 2016 have been used for testing. Initial past time slots have been taken from 7:00 a.m. and last
• 25th June 2016, 48 h ahead forecasting time slot available is 48 h ahead of prediction time.
In 48 h ahead prediction of 25th March 2016, maximum It has observed from the results obtained in the above section that
prediction accuracy (62.5%) has been achieved by spikeslab model Spikeslab and Cubist model achieves high prediction accuracy than
with .94 correlation coefficient and 53.34 RMSE. The historical days FoBa, leapForward and bagEarthGCV with respect to different fore-
from 10th June to 10th July ( ± 15 days) from the years 2010–2015 casting horizons for all seasons of a year.
have been used for training and days from 10th June 2016 to 23rd
June 2016 have been used for testing. Initial past time slots has been 6. Conclusion
taken from 7:00 a.m. and last time slot available is 48 h ahead of
prediction time. The applicability of five machine learning models, FoBa,
It has been observed that unstable weather conditions (shown in leapForward, spikeslab, Cubist and bagEarthGCV in modelling solar
Fig. 7) and low correlation with past days is the reason of low irradiance prediction has been investigated and evaluated under
prediction accuracy and high RMSE in all forecasting horizon for seasonal effects using the same test platform and datasets. Main
25th June 2016. contribution is performance comparison of models in different fore-
• 30th August 2016, 1 h ahead forecasting casting horizons ranging from 1 h ahead to 48 h ahead. The perfor-
In one hour ahead prediction for 30th August 2016, maximum mance has been evaluated by statistical indices correlation coefficient,
2267
A. Sharma, A. Kakkar Renewable and Sustainable Energy Reviews 82 (2018) 2254–2269
RMSE and prediction accuracy (%) for each model. spikeslab and Cubist model are very promising and stable with respect
Regarding the results obtained in experiment 1, accuracy of a model to different forecasting horizons. The prediction accuracy with different
depends upon quality of selected data for model training. For different forecasting horizons (1 h ahead, 24 h ahead, 48 h ahead) gained by
days of a year (11th March, 25th June, 30th August and 31st spikeslab for 11th March (86.27%, 66.67%, 69.09% respectively), for
December), the performance matrix (r2, RMSE and accuracy) for 25th June (69.09%, 61.54% and 62.5% respectively), 30th August
leapForward (.99, 3.07, 96.08%), Cubist (.98, 35.18, 78.95), (83.78%, 88.46%, 91.67% respectively) and 31st December (92.86%,
bagEarthGCV (.98, 11.27, 89.47%) and FoBa (1, 2.92, 98.21%) 92.8% and 91.67% respectively) are satisfactory and stable. Similarly,
respectively have been obtained. Cubist achieves (84.31%, 79.17%, 58.18% respectively) for 11th March,
In experiment 2, it has been observed that FoBa, leapForward, (58.18%, 65.38%, 54.17% respectively) for 25th June, (78.38%, 69.23%
Cubist performs well with large set of past time slots according to solar and 79.17% respectively) for 30th August and (93.86%, 92.6%, 79.17%
irradiance availability (7:00 a.m. onwards) whereas spikeslab works respectively) for 31st December.
with less past samples. The performance of bagEarthGCV is unpre- The results are evident that solar irradiance forecasting with such
dictable with respect to number of past samples and is iteration machine learning models is recent and productive study in this field
specific. leads to accurate solar forecasting than conventional methods.
The results obtained in experiment 3 shows that results obtained by
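The statistical indices reported throughout this section can be made concrete with a short sketch. This is an illustrative Python reimplementation, not the authors' R code; in particular, the exact formula behind the reported prediction accuracy (%) is not stated in the text, so the relative-error definition in `accuracy_pct` below is an assumption.

```python
import math

def correlation(y_true, y_pred):
    """Pearson correlation coefficient between observed and forecast series."""
    n = len(y_true)
    mt = sum(y_true) / n
    mp = sum(y_pred) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    st = math.sqrt(sum((a - mt) ** 2 for a in y_true))
    sp = math.sqrt(sum((b - mp) ** 2 for b in y_pred))
    return cov / (st * sp)

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the series (here W/m^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def accuracy_pct(y_true, y_pred):
    """Assumed definition: 100 * (1 - mean relative absolute error)
    over slots with non-zero observed irradiance."""
    rel = [abs(a - b) / a for a, b in zip(y_true, y_pred) if a > 0]
    return 100.0 * (1.0 - sum(rel) / len(rel))
```

A perfect forecast yields a correlation of 1, an RMSE of 0 and an accuracy of 100%, matching the direction of the comparisons made above (higher correlation and accuracy, lower RMSE are better).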
Appendix A
Table A1
Comparison of different ambient energy sources.

Energy source          Power density                          Efficiency
Solar                                                         17%
  – Outdoor            15,000 µW/cm³; 150 µW/cm³
  – Indoor             6 µW/cm³
Vibration
  – Piezoelectric      335 µW/cm³                             5%
  – Electrostatic      44 µW/cm³                              9%
  – Electromagnetic    400 µW/cm³                             1%
Acoustic noise         0.003 µW/cm³ at 75 dB;
                       0.96 µW/cm³ at 100 dB
Temperature gradient   15 µW/cm³ at 10 °C                     7% at 100 °C; 15% at 200 °C
Human power            330 µW/cm³                             5–30%
Air flow               7600 µW/cm³ at 5 m/s
Pressure variation     17 µW/cm³
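The power densities in Table A1 translate directly into node energy budgets. A back-of-the-envelope sketch, assuming a hypothetical 1 cm³ harvester and 8 h of daily availability (both illustrative figures, not taken from the table):

```python
# Energy budget estimate from a Table A1 power density.
# Assumes a hypothetical harvester volume and daily active hours.
SECONDS_PER_HOUR = 3600

def daily_energy_mj(power_uw_per_cm3, volume_cm3, hours_active):
    """Energy harvested per day in millijoules.

    power density (uW/cm^3) * volume (cm^3) * active time (s) gives uJ;
    dividing by 1000 converts to mJ.
    """
    power_uw = power_uw_per_cm3 * volume_cm3
    return power_uw * hours_active * SECONDS_PER_HOUR / 1000.0

# Illustrative comparison: outdoor (overcast) vs indoor solar for 1 cm^3, 8 h/day
outdoor_mj = daily_energy_mj(150, 1, 8)   # 4320 mJ/day
indoor_mj = daily_energy_mj(6, 1, 8)      # 172.8 mJ/day
```

The two-orders-of-magnitude spread between outdoor and indoor solar is exactly why reliable irradiance forecasts matter for duty-cycle planning of self-powered nodes.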
break
Let k = k − 1
Let F_k = F_{k+1} − {j_{k+1}}
Let w_k = w(F_k)
end
end
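The fragment above corresponds to the backward-elimination step of a forward-backward greedy procedure (as in FoBa): the feature j_{k+1} is removed from the active set F_{k+1} and the weights are refit on the reduced set. A minimal sketch of that step, assuming a simple per-feature least-squares refit for illustration (the paper's actual fitting routine is not shown here):

```python
def backward_step(active_set, j_drop, refit):
    """One backward step: remove feature j_drop from the active set,
    then refit the weights on the remaining features."""
    reduced = [j for j in active_set if j != j_drop]
    return reduced, refit(reduced)

def make_lsq_fitter(X, y):
    """Hypothetical fitter: per-feature least squares w_j = <x_j, y> / <x_j, x_j>,
    fitting each active feature independently (illustration only)."""
    def refit(active):
        return {j: sum(row[j] * yi for row, yi in zip(X, y)) /
                   sum(row[j] ** 2 for row in X)
                for j in active}
    return refit
```

For example, with X = [[1, 2], [2, 4], [3, 6]] and y = [2, 4, 6], dropping feature 1 from the active set {0, 1} leaves {0} with the refit weight w_0 = 2.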