
A Project Report

On

Artificial Neural Network model for
Solar Electricity Generation Forecasting

BY

DEVANSH DHRAFANI

2017B5A41569H

Under the supervision of

Dr. Harihara Venkataraman

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF

PHY F266: STUDY PROJECT

BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE PILANI (RAJASTHAN)

HYDERABAD CAMPUS

(NOVEMBER 2020)

ACKNOWLEDGMENTS

I take this opportunity to express my gratitude and thanks to everyone associated with this
project. I am grateful to them for their views and constructive criticism. I thank my
instructor, Dr. Harihara Venkataraman for providing constant motivation, guidance, and
freedom to explore the topics that interest me. I would also like to thank my colleague
Gautam Sibansh Mishra, who worked closely with me on the project. Finally, I thank
everyone associated with the institution for giving me this wonderful opportunity of working
on this project.

Birla Institute of Technology and Science-Pilani,

Hyderabad Campus

Certificate

This is to certify that the project report entitled “Artificial Neural Network model for
Solar Electricity Generation Forecasting” submitted by Mr. DEVANSH DHRAFANI (ID No.
2017B5A41569H) in partial fulfillment of the requirements of the course PHY F266, Study Project,
embodies the work done by him under my supervision and guidance.

Date: 20/10/2020 (Harihara Venkataraman)

BITS- Pilani, Hyderabad Campus

ABSTRACT

The recent shift to variable energy sources like wind and solar has become a source of
uncertainty in the power grid. Accurate prediction of the solar energy generated can
mitigate some of the challenges of using variable energy sources and lead to better adoption of
renewable sources. While this problem is receiving attention from the research community,
current solutions rely on time-series forecasting and take no account of the weather conditions
in which solar panels operate. This project aims to create an ANN solution for Solar
Electricity Generation forecasting which relies solely on weather parameters as inputs.

CONTENTS

Title page
Acknowledgments
Certificate
Abstract
Objectives
Literature Review
Overview of ANNs
Data Preprocessing
Proposed ML Models
Scatter Matrices
Performance Metrics
Model Evaluation and Results
Conclusion and Future Work
References

Objectives

The total irradiation received is often a function of weather conditions. To
investigate this, we plan to build an Artificial Neural Network model for
Solar Electricity Generation forecasting based on changing weather patterns.

● Acquire data on the solar electricity generated and the weather conditions.
● Implement data pre-processing techniques to prepare the data for feeding
into the Neural Network model.
● Implement different Neural Network models and see which one gives the best
accuracy on the training and test datasets.
● Use the model which provides the best accuracy for the Solar
Electricity Generation forecasting system.

Literature Review

As Neural Networks are a vast field with many existing models and new models
constantly being developed, it was necessary to carry out a thorough Literature
Review to get a better understanding of the current methods being employed for
Solar Electricity Generation Forecasting problems.

First, we looked at a review article [1] which summarized the various ML
techniques being used in solar radiation forecasting. From this, it was evident that
the current research focus is on time-series forecasting of solar energy
generation. We then looked for research that takes weather parameters as
inputs for solar generation forecasting, and found three papers that do so.
We have summarized the key points from these papers below:

Paper: Solar Power Forecasting Using Artificial Neural Networks. [2]
Weather Parameters: Humidity, Pressure, Cloud Cover, Wind Speed, Temperature, Precipitation
Machine Learning Techniques: Linear Regression, Multilayer Perceptron (MLP)
Remarks:
● Target parameter is Electricity generated (W/m^2).
● MLP gave better result in terms of RMSE.
● Removing data from night improved model accuracy.

Paper: Predicting solar generation from weather forecasts using machine learning. [3]
Weather Parameters: Humidity, Cloud Cover, Wind Speed, Dew Point, Temperature, Precipitation
Machine Learning Techniques: Linear Regression, Support Vector Machine (SVM) with RBF kernel
Remarks:
● Target parameter is Electricity generated (W/m^2).
● SVM gave 27% better accuracy as compared to Regression.

Paper: Machine learning for solar irradiance forecasting of photovoltaic system. [4]
Weather Parameters: Humidity, Temperature, Wind Speed
Machine Learning Techniques: Hidden Markov Model, Support Vector Machine (SVM)
Remarks:
● The target parameter is solar irradiance.
● It's a time-series forecasting problem (3 hour time horizon).

Key Takeaways from Literature Review

● There is a lot of focus on time-series forecasting. While it makes good
predictions, a time-series model is limited to the locality/grid from
which its data was obtained. It can’t be generalised.
● Data pre-processing techniques like Feature ranking and Data
Normalisation can significantly improve model accuracy.
● Common input parameters:
○ Humidity
○ Temperature
○ Wind Speed
○ Precipitation
○ Cloud Cover
● Multilayer Perceptrons (MLP) and Support Vector Machines (SVM)
generally give the best results.

Overview of ANNs

Artificial Neural Networks (ANNs) are computer programs loosely based on
biological nervous systems. An ANN consists of an input layer (number of nodes =
number of inputs), one or more hidden layers, and an output layer (number of
nodes = number of outputs). Each node is connected to the nodes of the next and
previous layers, and each connection has an associated weight. Each node also
applies an activation function to its data before passing the result on to the
next layer.

If (x1,x2,...,xn) are the input variables and y is the associated output variable,
an ANN models a function f(x1,x2,...,xn) = y_pred such that the error y-y_pred is
minimized. This is done by repeatedly feeding a training dataset to the model. At
the end of each pass of inputs, a loss/error is calculated at the output node. Then,
using a method called backpropagation, the program differentiates the loss function
with respect to each weight. The resulting gradient measures how much each weight
contributed to the error. A small multiple of this gradient, set by the learning
rate, is subtracted from the original weights. The above process is repeated until
the loss on the training set stops decreasing.
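
The training loop described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the model used in this project: the synthetic data, the single hidden layer of 8 tanh nodes, the learning rate and the number of passes are all chosen here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 samples, 3 inputs, 1 noisy linear target.
X = rng.normal(size=(200, 3))
y = X @ np.array([[0.5], [-1.0], [2.0]]) + 0.1 * rng.normal(size=(200, 1))

# One hidden layer with 8 tanh nodes; the sizes are illustrative only.
W1 = rng.normal(scale=0.5, size=(3, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros((1, 1))
learning_rate = 0.05

losses = []
for epoch in range(500):
    # Forward pass: input -> hidden (tanh) -> output (linear).
    h = np.tanh(X @ W1 + b1)
    y_pred = h @ W2 + b2
    error = y_pred - y
    losses.append(float(np.mean(error ** 2)))  # MSE loss for this pass

    # Backward pass: gradient of the MSE loss with respect to each weight.
    grad_out = 2.0 * error / len(X)
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0, keepdims=True)
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0, keepdims=True)

    # Gradient-descent update: subtract the scaled gradients from the weights.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

print(f"initial MSE: {losses[0]:.4f}, final MSE: {losses[-1]:.4f}")
```

The list `losses` records the training loss after each pass, so plotting it gives the usual learning curve.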

Data Preprocessing

The dataset [5],[6] consists of six years of daily solar electricity generation data
and the corresponding weather records. Here are the variables that the dataset records:

1. Date
2. Temperature (degC)
3. Cloud Cover
4. Wind Speed (km/hr)
5. Humidity (percent)
6. Pressure (mBar)
7. Daily Solar Power (kW)

If we plot Power Generated on a time-scale, we can see that the peak power is
almost consistently generated mid-year, in May-July. This is probably
because in Antwerp, Belgium, the days are longest in May-July, during summer,
and cloud cover is low.

As we are not considering the time data for our study, the weather conditions will
have to compensate for this. The weather data came as hourly forecasts, but our
solar generation parameter was a daily cumulative reading, so we took
the daily average of the weather data.
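
The hourly-to-daily averaging step could look roughly like the pandas sketch below. The column names (`temperature_c`, `humidity_pct`, `daily_power`) and all values are hypothetical placeholders, not the actual columns of the Kaggle datasets.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly weather readings covering two days.
hours = pd.date_range("2017-06-01", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame(
    {
        "temperature_c": np.arange(48, dtype=float),
        "humidity_pct": np.full(48, 60.0),
    },
    index=hours,
)

# Average every calendar day of hourly readings into one row.
daily = hourly.resample("D").mean()

# Hypothetical daily cumulative power readings, one per day.
power = pd.DataFrame(
    {"daily_power": [20.0, 25.0]},
    index=pd.to_datetime(["2017-06-01", "2017-06-02"]),
)

# Align weather and power on the date index.
dataset = daily.join(power, how="inner")
print(dataset)
```

An inner join keeps only the days present in both sources, which avoids rows with weather but no power reading (or the reverse).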

Looking at the “weather status” parameter, we found that Cloud Cover is given in
text form with many different variations:

To sort this, we took common phrases and simplified them to 6 main text outputs:

Finally, looking at the processed data:

This histogram shows the data distributions of various parameters:

The scikit-learn library was used for data preprocessing. For the cloud cover
data, a one-hot encoder was used, which converts the text-based values to
machine-readable numeric values. Missing values were filled using median
imputation, and a MinMax scaler was used to normalize the data to the range
0-1.
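
A minimal sketch of these three preprocessing steps with scikit-learn is given below. The column names, the cloud-cover labels and the sample values are assumptions made for illustration; the report does not list the exact configuration that was used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Hypothetical rows; column names and cloud-cover labels are assumptions.
weather = pd.DataFrame(
    {
        "temperature_c": [12.0, np.nan, 25.0, 18.0],
        "humidity_pct": [80.0, 65.0, np.nan, 70.0],
        "cloud_cover": ["clear", "cloudy", "clear", "partly cloudy"],
    }
)

numeric_columns = ["temperature_c", "humidity_pct"]
categorical_columns = ["cloud_cover"]

# Numeric columns: fill gaps with the column median, then scale to [0, 1].
numeric_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="median")),
        ("scale", MinMaxScaler()),
    ]
)

# Text column: one binary column per distinct cloud-cover label.
preprocessor = ColumnTransformer(
    [
        ("numeric", numeric_steps, numeric_columns),
        ("cloud", OneHotEncoder(handle_unknown="ignore"), categorical_columns),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

features = preprocessor.fit_transform(weather)
print(features.shape)
```

Wrapping the steps in a `ColumnTransformer` means the medians, scaling ranges and label set learned from the training data can be reused unchanged on the test data through `preprocessor.transform`.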

Proposed Machine Learning Models

There are several Machine Learning models that could be used for forecasting
purposes. From the literature review, we came to the conclusion that Multilayer
Perceptron(MLP) and Support Vector Regression(SVR) work best for Solar
Electricity Generation forecasting with weather parameters.

1. Multilayer Perceptron (MLP):

MLP is a feed-forward neural network architecture, which means that the
input data flows only in the forward direction (input → hidden layer → output).
Other than that, it has the same characteristics as all Artificial Neural Networks.
Each layer has a set number of nodes with different input and output weights.
The weights are determined by backpropagation so as to minimize the loss
function, which is Mean Square Error (MSE) in our case.

2. Support Vector Regression (SVR):

Support Vector Machine (SVM) is a Machine Learning technique that determines the
relationship between multiple input variables and their effects on the output.
SVMs are typically used for classification problems: for N features, an
N-dimensional space is constructed, and hyperplanes are calculated which mark
the boundaries between classes in this high-dimensional space. These
boundaries decide the classification criteria. For a regression problem such as
the one we are tackling, a modified version of SVM called Support Vector
Regression (SVR) is used. In this, the hyperplane acts as the regressor: it is
fitted so that as many data points as possible lie within a margin around it.

Figures: Left: visualization of a simple MLP model. Right: visualization of a linear SVR.
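
For orientation, the two proposed models can be instantiated in scikit-learn as in the sketch below. The hyperparameters (one hidden layer of 16 nodes, RBF kernel, `C`, `epsilon`) and the synthetic data are assumptions for illustration; the report does not state the settings that were actually used.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for scaled weather features (assumption): 5 inputs in [0, 1].
X = rng.uniform(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=300)

# MLPRegressor minimizes squared error using backpropagation.
mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)

# SVR fits a regressor with an epsilon-wide margin around it.
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1)

mlp.fit(X, y)
svr.fit(X, y)

mlp_pred = mlp.predict(X[:5])
svr_pred = svr.predict(X[:5])
print(mlp_pred, svr_pred)
```

Both estimators share the same `fit`/`predict` interface, which makes it straightforward to swap one model for the other in the same experiment.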


Scatter Matrices

Before implementing the models, I made Scatter Matrices to help identify any
trends that exist between the different parameters. From
the plots, it's clear that linear relationships exist between temperature and daily
power and between humidity and daily power. Other features also show slightly
linear trends with respect to daily power production. Given these results, I decided to
investigate the usage of a linear regression model in addition to the other 2
models that were already proposed.
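
A numeric companion to the scatter matrix is the Pearson correlation of each feature with daily power. The sketch below uses synthetic data with made-up relationships (power rising with temperature and falling with humidity) and hypothetical column names; it is not the project's dataset. The plots themselves can be drawn with `pandas.plotting.scatter_matrix`, which additionally requires matplotlib.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 365

# Synthetic relationships (assumption), chosen only to illustrate the method.
temperature = rng.uniform(0.0, 30.0, size=n)
humidity = 90.0 - temperature + rng.normal(scale=5.0, size=n)
daily_power = temperature + rng.normal(scale=3.0, size=n)

frame = pd.DataFrame(
    {
        "temperature_c": temperature,
        "humidity_pct": humidity,
        "daily_power": daily_power,
    }
)

# Pearson correlation of every feature with the target column.
correlations = frame.corr()["daily_power"].drop("daily_power")
print(correlations.sort_values())
```

A coefficient close to +1 or -1 indicates a strong linear trend, which is the pattern the scatter plots are inspected for.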

Performance Metrics

To test the performance of our ML models, I used 2 different evaluation metrics:
(1) Root Mean Square Error and (2) Coefficient of Determination. The MLP
minimises MSE with each training iteration, which is equivalent to minimising RMSE.

1. Root Mean Square Error (RMSE): As the name suggests, this is the square root of
the mean squared error over all n predicted values. A lower value is better. The
formula is:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_{\mathrm{pred},i}\right)^2}$$

2. Coefficient of Determination (R^2): This is the standard evaluation metric
for regression problems. It gives the fraction of the variance in the actual
values that the model explains. A value of 1 is the best possible result and 0
corresponds to always predicting the mean; on test data the value can be
negative if the model performs worse than that.
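
Both metrics are simple enough to compute by hand. The sketch below uses only the Python standard library; the four sample values are made up for illustration and are not results from this project.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean squared difference between the two lists."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"R^2  = {r_squared(actual, predicted):.4f}")
```

For these values the squared errors are 4, 4, 9 and 1, so the RMSE is sqrt(18/4) and R^2 is 1 - 18/500.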

Model Evaluation and Results
With the data preprocessed and the performance metrics identified, all that was
left was to train and test the models. Again, the scikit-learn library was used for
model training/testing. The entire dataset was split into training and testing data.
All 3 models could access the training input and output parameters, but during
testing they were supplied only with the testing input data. Their predicted
outputs were then compared with the actual testing outputs, and the performance
metrics were noted.
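
The procedure described above can be sketched as follows. The 80/20 split ratio, the model hyperparameters and the synthetic data are assumptions, since the report does not state them; the numbers this prints are therefore not the project's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(42)

# Synthetic stand-in for the preprocessed dataset (assumption).
X = rng.uniform(size=(400, 5))
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + 0.2 * rng.normal(size=400)

# Hold out 20% of the rows for testing (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "MLP": MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=42),
    "SVR": SVR(kernel="linear"),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)          # training inputs and outputs
    predictions = model.predict(X_test)  # testing inputs only
    rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
    r2 = float(r2_score(y_test, predictions))
    results[name] = {"rmse": rmse, "r2": r2}
    print(f"{name}: RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```

Fixing `random_state` makes the split and the MLP initialisation reproducible, so repeated runs compare the models on exactly the same held-out rows.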

From the above metrics, it's clear that MLP gives the best performance for our
problem. However, the margin by which MLP wins is not large. The other models
approximate a linear solution and gave similar results, so MLP's similar
performance suggests that it also learned a near-linear mapping. This in turn
suggests that the relationship between weather parameters and daily solar power
production may be linear in nature.

This conclusion may not hold for all ranges/parameters. A bigger dataset might
give better insights. Lastly, it should be noted that while MLP's
performance was better, the MLP took significantly longer to train than
Linear Regression and SVR. So for a small dataset, it may be wise to stick to a linear
approximation for solar-power production.

Conclusion and Future Work

Accurate prediction of the solar energy generated can mitigate some of the
challenges of using variable energy sources and lead to better adoption of
renewable sources. The literature review revealed a lack of research focusing
on weather conditions as input parameters for solar generation forecasting
problems. This project tries to explore and address this gap in research focus.
After acquiring and preprocessing the data, it was fed to 3 models: Linear
Regression, Multilayer Perceptron (MLP) and Support Vector Regression (SVR).
MLP gave the best performance by a slight margin, suggesting that a linear
relationship exists between weather input parameters and solar power produced.
Further work may include using a Deep Neural Network model to improve
prediction accuracy. Acquiring our own dataset might also help reduce some of
the noise that was observed.

References

[1] Voyant, C., Notton, G., Kalogirou, S., Nivet, M. L., Paoli, C., Motte, F., &
Fouilloy, A. (2017). Machine learning methods for solar radiation forecasting: A
review. Renewable Energy, 105, 569-582.

[2] Abuella, M., & Chowdhury, B. (2015, October). Solar power forecasting
using artificial neural networks. In 2015 North American Power Symposium
(NAPS) (pp. 1-5). IEEE.

[3] Sharma, N., Sharma, P., Irwin, D., & Shenoy, P. (2011, October). Predicting
solar generation from weather forecasts using machine learning. In 2011 IEEE
international conference on smart grid communications (SmartGridComm) (pp.
528-533). IEEE.

[4] Li, J., Ward, J. K., Tong, J., Collins, L., & Platt, G. (2016). Machine learning
for solar irradiance forecasting of photovoltaic system. Renewable energy, 90,
542-553.

[5] Frank. (2018, 09 02). Daily Power Production of Solar Panels. Kaggle.
https://www.kaggle.com/fvcoppen/solarpanelspower

[6] Ma, R. (2020, 08 04). Weather dataset in Antwerp, Belgium. Kaggle.
https://www.kaggle.com/ramima/weather-dataset-in-antwerp-belgium
