Prediction Model Development of Weather-Related Energy Consumption: A Case Study

ABSTRACT

User behavior is one of the main factors determining daily electricity consumption, and weather
conditions in turn affect consumer behavior. In this project, the relationship between weather
conditions and electricity consumption will be analyzed, and the potential to produce forecasts from
this relationship will be examined. Market fluctuations caused by meteorological parameters such as
temperature can be predicted. This study will investigate how successfully a machine learning
method can predict the market fluctuation caused by weather conditions. A machine learning model
will be developed in Python to analyze market prices and to build a forecasting model that can be
coupled with long-term weather forecasts to predict the effect of weather conditions on the market.
After the model is trained with long-term consumption and temperature data, its performance will be
evaluated on a selected case date, and it will be tested whether the observed consumption values
can be reproduced.

1. LITERATURE SURVEY

In the future, with changing climate conditions and an increasing population, there may be
situations where the energy produced cannot meet demand. Hou et al. (2021) used climate scenarios
such as RCP2.6, RCP4.5 and RCP8.5 to predict electricity demand with optimized ANN models, and
their results show that electricity demand will increase in the near and far future. To analyze the
relationship between changes in demand and climate conditions, demand forecasting is required.
Ajayi and Heymann (2021) predicted day-ahead energy demand using an ANN and monitored performance
with the RMSE and MAE key performance indicators. The results were quite successful and revealed
that temperature had the highest influence on energy demand, with a 37.63% effect. Salam and
El Hibaoui (2021) found that LSTM gives better results for energy demand than the methods used in
other studies. Yorucu (2018) states that annual electricity consumption in Turkey rose to 278.3
billion kWh over 2015 and 2016, an increase of 3.3%. Yumurtacı and Asmaz (2004) estimated the
electric energy demand of Turkey for the year 2050 using a statistical method. El Kafazi, Bannari
and Abouabdellah (2016) modeled and predicted energy demand and found that the ARIMA (Auto
Regressive Integrated Moving Average) model gives the best results for predicting Moroccan
electricity demand. On the other hand, Akdi, Gölveren and Okkaoğlu (2019) claimed that the harmonic
regression method gives better results than the ARIMA model. Kwak et al. (2013) conducted energy
demand prediction for four days and found that all of the weather data used as input affect the
prediction results. Eshraghi et al. (2021) state that energy demand depends on the season in 42
U.S. states and that summer has the highest demand, at 70%. Since seasonal differences are driven
by temperature, differences in demand can be expected to be indirectly related to temperature.
Giunta et al. (2019) found an alternative, consistent way to obtain the data needed for demand
forecasting by making temperature forecasts. De Felice et al. (2013) found in their study of Italy
that electricity usage is highly correlated with temperature. Mirasgedis et al. (2006) reported
that using relative humidity and cooling and heating degree-day parameters yields R² values of up
to 98%. In this study, it is planned to make an energy demand forecast for the Turkish electricity
market with a machine learning model based on selected weather parameters.

| SOURCE | METHOD | CRITERION | CONTRIBUTION |
|---|---|---|---|
| Assessing of impact climate parameters on the gap between hydropower supply and electricity demand by RCPs scenarios and optimized ANN by the improved Pathfinder (IPF) algorithm | Using different IPCC scenarios to create an Artificial Neural Network Model | To understand the energy requirements depending on climate change | It has been determined that there will be an increase in energy demand in the future |
| Data centre day-ahead energy demand prediction and energy dispatch with solar PV integration | Artificial Neural Network Model | Forecasting energy demand | Importance of temperature in energy demand forecast was found |
| Quantification of climate-induced interannual variability in residential U.S. electricity demand | Using Linear regression model | To find relationship between demand and climate | It was determined that the highest demand was 70% in summer and 50% in winter |
| Energy consumption prediction model with deep inception residual network inspiration and LSTM | Long Short-Term Memory Model | To predict energy demand with high accuracy | A model with low error rate could be created with LSTM |
| Effects of Model Horizontal Grid Resolution on Short- and Medium-Term Daily Temperature Forecasts for Energy Consumption Application in European Cities | Using weather model e-kmf and WRF model to predict the temperature for 11 days | Providing a solution to the temperature prediction required in the energy sector | An alternative way is shown for temperature forecasting that can be created for use in energy demand forecasting |
| Turkish Energy Market: Transformation, Privatization and Diversification | Informative Source | Providing base knowledge related with Turkish market | Accessible brief introduction to Turkish energy market |
| Electric energy demand of Turkey for the year 2050 | Statistical approach | Predicting energy demand of Turkey in 2050 | Electric energy usage will be increased in 2050 |
| Modeling and forecasting energy demand | An autoregressive integrated moving average model | Using ARIMA model to predict demand | Gives best results for predicting the Moroccan electricity demand |
| Feasibility Study on a novel methodology for short-term real-time energy demand prediction using weather forecasting data | Energy demand prediction | Energy demand prediction to find out effect of weather in forecasting | The weather data have an effect on the prediction results |
| Daily Electrical Energy Consumption: Periodicity, Harmonic Regression Method and Forecasting | Harmonic Regression Method | To compare ARIMA and Harmonic Regression for prediction of consumption | Model fits data better and gives better results compared to ARIMA model |

2. METHOD SURVEY

New methods for studying energy demand are added to the literature continuously. Banik et al.
(2020) emphasized that energy consumption estimation is gaining importance and estimated energy
consumption with machine learning. In their study, taking the ensemble of the random forest and
XGBoost models improved prediction performance by 20%. Ioaneş and Tirnovan (2019) claimed that the
Long Short-Term Memory model can be used efficiently for energy price prediction; in their study of
Romanian electricity price estimation, model accuracy increased by 20% with LSTM. Since demand and
price are related to each other, forecasting models are expected to be similarly successful for
both. There are also studies that add conditions such as weather to the model when forecasting
prices. For example, Castelli et al. (2020) found that 24-hour forecast performance improved when
the relationship of weather, oil price and carbon dioxide coupons to prices was added to the model.
Lucas et al. (2020) found that XGBoost had the lowest error rate when they estimated energy market
prices using gradient boosting, random forest and XGBoost models. Dash and Patel (2015) made
short-term electrical load estimates with an extreme learning machine model, bringing a perspective
to the literature that differs from other studies. When these studies are examined, it is thought
that predictions made with a hybrid model will have the best accuracy. Bouderraoui et al. (2021)
used LightGBM, Artificial Neural Network and Linear Regression models to estimate energy
consumption on a building basis rather than a market basis, and determined that the best
performance was achieved with the LightGBM model. Therefore, Linear Regression, Random Forest,
XGBoost and LightGBM gradient boosting models will be used to find the relationship between
meteorological conditions and energy demand for the chosen period.

| METHOD | REFERENCE |
|---|---|
| XGBoost | Banik et al. (2020), Lucas et al. (2020) |
| Random Forest | Banik et al. (2020) |
| LightGBM | Ioaneş and Tirnovan (2019), Bouderraoui et al. (2021) |

3. CASE STUDY DATE SELECTION

On August 2, 2021, electricity cuts were experienced in many cities in Turkey because production
could not meet demand. In its statement, the energy ministry said that the cuts were caused by
temperatures above seasonal normals [web 1]. This date has been selected for the forecast
performance check. Hourly consumption data to be used in the project will be obtained from the
EPİAŞ Transparency Platform. Weather data will be obtained from meteorological stations located all
over Turkey. To analyze the reasons for the power cuts, the annual change in consumption values
should be examined. Figure 1 shows the total electricity consumption of Turkey during 2021. When
the consumption values are examined, it is seen that the maximum values across the country were
measured in August. Since consumption rose suddenly on August 2, outside the expected pattern, this
study investigates whether this increase could have been predicted.

Figure 1: Total energy consumption of Turkey during 2021

Figure 2 shows the consumption values in August 2021. The values of the first five days of August
are the highest of that month and of the year. The sudden rise starting on August 2 fell outside
the expected pattern, which is why this study investigates whether it could have been predicted. It
is thought that estimating such an increase in consumption in advance can help prevent power cuts.

Figure 2: Total energy consumption of Turkey during August 2021


4. RESEARCH FRAME
The research frame followed throughout the study is shown below. The research flow consists of
data acquisition, period determination, model determination and model performance analysis.

1. Determination of the period for consumption forecasting
2. Analysis of consumption data drawn from EPİAŞ and weather data from ECMWF ERA5
3. Data assimilation and writing the code sequence of the catboost, lightgbm, linear regression
and random forest models
4. Giving the data preceding the case date as input to the model
5. Running the model and estimating the temperature and consumption values in the case period
6. Evaluation of the model results with KPIs and determining which model makes the most accurate
prediction for the selected period
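
The evaluation step can be illustrated with a short sketch. The study does not state which KPIs
were used, so the RMSE and MAE indicators named in the literature survey are assumed here. The
function names, model names and consumption values are hypothetical and serve only to show how
competing forecasts could be ranked.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error of two equal-length, non-empty sequences."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large misses more than MAE does."""
    _check(actual, predicted)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def best_model(actual, forecasts):
    """Name of the forecast with the lowest RMSE against the actual values."""
    return min(forecasts, key=lambda name: rmse(actual, forecasts[name]))


# Hypothetical hourly consumption values (MWh), for illustration only.
actual = [40000.0, 42000.0, 45000.0, 47000.0]
forecasts = {
    "model_a": [39000.0, 41000.0, 44000.0, 46000.0],  # every hour off by 1000
    "model_b": [40000.0, 42000.0, 45000.0, 43000.0],  # one hour off by 4000
}

for name, predicted in forecasts.items():
    print(name, mae(actual, predicted), rmse(actual, predicted))
print("lowest RMSE:", best_model(actual, forecasts))
```

In this example both forecasts have the same MAE, but the forecast with a single large miss has
the higher RMSE, which matters when the aim is to capture a sudden consumption peak.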

5. MODEL FEATURES

In this study, more than one estimation model was used so that the best model could be selected.
These models are linear regression, CatBoost, random forest and LightGBM. The linear regression
model fits a linear relationship between the inputs and the target and makes its estimate from that
fit. Random forest is a machine learning model that combines multiple decision trees. According to
Liu et al. (2012), random forest regression can be briefly summarized as follows: given sample
space X and labels Y, random forests for regression are formed by growing trees that depend on a
random variable Θ, so that each tree predictor h(x, Θ) gives a numerical result as opposed to a
class label. Gradient boosting models are very powerful and fast to compute. CatBoost is a gradient
boosting model with support for categorical features, while LightGBM is a faster and more flexible
gradient boosting variant.
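
The random forest summary above can be made concrete with a minimal sketch. It is not the
implementation used in the study: for brevity each tree is reduced to a single split on one input,
the random variable Θ is taken to be the bootstrap sample drawn for each tree, and the forest
prediction is the average of the tree predictors h(x, Θ). All function names and the temperature
and consumption values are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs by minimising squared error.

    Returns (threshold, left_mean, right_mean).
    """
    overall_mean = sum(ys) / len(ys)
    # Fallback when no split is possible: predict the mean on both sides.
    best = (float("inf"), xs[0], overall_mean, overall_mean)
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:
            continue  # splitting at the maximum leaves nothing on the right
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    """Tree predictor h(x, Theta) for a single fitted stump."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (the random element Theta)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Forest prediction: the average of the individual tree predictions."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)


# Hypothetical daily mean temperature (deg C) and consumption (GWh).
temperature = [18.0, 20.0, 24.0, 28.0, 30.0, 32.0, 34.0, 36.0]
consumption = [30.0, 31.0, 33.0, 36.0, 38.0, 40.0, 43.0, 45.0]

forest = fit_forest(temperature, consumption)
print(predict_forest(forest, 19.0), predict_forest(forest, 35.0))
```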

6. DATA FEATURES

Two types of data from different sources were used in the model: consumption data and
meteorological data. To make predictions in the selected case period, the consumption values from
1 January 2020 until the case date were taken from the EPİAŞ database and added to the model. The
data source provides hourly total consumption values for the whole of Turkey.
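
The separation between the data preceding the case date and the case period itself can be
sketched as a chronological split. This is an illustration under assumptions, not the code used
in the study: the function name, the record layout of (timestamp, value) pairs and the placeholder
consumption value are hypothetical, and only a few days around the case date are generated instead
of the full series from 1 January 2020.

```python
from datetime import datetime, timedelta


def split_by_case_date(records, case_start, case_end):
    """Split (timestamp, value) records chronologically.

    Training rows lie strictly before case_start; test rows lie in
    the half-open interval [case_start, case_end).
    """
    train = [r for r in records if r[0] < case_start]
    test = [r for r in records if case_start <= r[0] < case_end]
    return train, test


# Hypothetical hourly series from 31 July to 6 August 2021 with a placeholder value.
start = datetime(2021, 7, 31)
records = [(start + timedelta(hours=h), 40000.0) for h in range(24 * 7)]

case_start = datetime(2021, 8, 2)
case_end = datetime(2021, 8, 6)  # exclusive, so the test set covers 2-5 August

train, test = split_by_case_date(records, case_start, case_end)
print(len(train), len(test))
```

Splitting by time rather than at random keeps every training observation earlier than the period
being forecast, which mirrors how the model would be used before an expected peak.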

The meteorological data come from the ERA5 reanalysis produced by ECMWF. Reanalysis data combine
model estimates with observations: observation data are used at the measurement points, while the
remaining gaps are filled with model estimates. Multiple meteorological parameters can affect the
forecast performance of the model. In this research, temperature, CAPE index, total cloud cover,
mean sea level pressure and the u and v components of wind were added to the model as weather
features to find out whether the maximum consumption could be detected before the electricity cut.
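
In ERA5, the u and v wind components are the eastward and northward components of the 10 m wind.
A scalar wind speed can be derived from them as the length of the wind vector, as in the short
sketch below. Whether the study combined the components in this way or used them as separate
features is not stated, so the function is only an illustration and its name is hypothetical.

```python
import math


def wind_speed(u10, v10):
    """Horizontal wind speed (m/s) from eastward (u10) and northward (v10) components."""
    return math.hypot(u10, v10)


print(wind_speed(3.0, 4.0))  # 5.0
```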

7. MODEL CONFIGURATIONS

To determine which weather data improve the model, model results were compared after adding
features in different combinations. The temperature data selected for the model are the average
2-meter temperature values for Turkey. A correlation between temperature and consumption is known
to exist, especially in summer; in residential areas in particular, rising temperature tends to
increase consumption. Another value used, the CAPE index, is related to atmospheric instability.
Since high CAPE values give any precipitation a strongly stormy character and consequently lower
temperatures, it was of interest to observe what contribution CAPE would make in a summer case.
Another data series thought to have an effect in combination with other parameters is the cloud
cover fraction. Electricity consumption is expected to increase if the temperature is low on cloudy
days or high on clear days. Moreover, date features were added to the model to capture the
relationship between time and consumption.
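
The date features mentioned above can be derived from the timestamp of each hourly observation.
The study does not list which date features were used, so the set below (hour, day of week, month
and a weekend flag) is an assumption chosen for illustration, and the function name is
hypothetical.

```python
from datetime import datetime


def date_features(ts):
    """Calendar features for one hourly timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0, Sunday = 6
        "month": ts.month,
        "is_weekend": ts.weekday() >= 5,
    }


# 2 August 2021, the case date, fell on a Monday.
print(date_features(datetime(2021, 8, 2, 14)))
```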

Wind speed and pressure values were subsequently added to the model, as they were thought to be
related to consumption. Table 1 shows the correlation values between the data. The parameter with
the highest correlation with consumption is temperature, at 0.44, and CAPE values have a greater
correlation with consumption than cloud cover, v10, u10 and mean sea level pressure.

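
A correlation table of this kind can be computed as in the sketch below. The study does not name
the correlation measure, so the Pearson coefficient is assumed; the function names, the column
names other than those mentioned in the text and all data values are hypothetical and do not
reproduce the values of Table 1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_with_target(columns, target):
    """Correlation of every non-target column with the target column."""
    return {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }


# Hypothetical values, for illustration only.
columns = {
    "consumption": [30.0, 33.0, 38.0, 41.0, 37.0],
    "temperature": [20.0, 25.0, 30.0, 35.0, 31.0],
    "cloud_cover": [0.8, 0.1, 0.5, 0.2, 0.9],
}
for name, r in correlation_with_target(columns, "consumption").items():
    print(name, round(r, 2))
```
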
Table 1: Correlation values between parameters

Table 2: Selected model configurations

Table 2 lists the different combinations of horizontal wind speed, temperature and CAPE parameters,
which are the values with the highest correlation with consumption, that were added to the model.

8. RESULT
In this study, consumption was estimated in Python using linear regression, LightGBM, CatBoost
and random forest models. The aim was to increase prediction performance by adding different
parameters to the models. After 8 different parameter combinations were supplied to the models, the
goal was to find, by visual comparison, the model that best represents the maximum and minimum
consumption values. Table 2 describes the parameters used in the combinations.

8.1. Forecast of the Day of the Sudden Consumption Peak

The main reason for unexpected power outages is an unexpected increase in consumption. For this
reason, the first part of the study examined whether the models could capture the sudden rise.
Figure 3 shows the model results for the different combinations. Model 1 to Model 7 represent the
different combinations, and Model 8 contains the results of the models with all parameters. The
linear regression results for all combinations show that linear regression is not a suitable model
for detecting this sudden increase. Models using gradient boosting were more successful in
capturing the rise. Among the remaining models, the rise at the peak is captured best by the
combination of Model 3, that is, the wind parameter and the time parameter. On a per-model basis,
the best-performing combination is Model 3 for CatBoost, Model 8 for random forest, Model 3 for
LightGBM and Model 8 for linear regression. As a result, Model 8, which covers all parameters, and
Model 3, which includes only the wind parameter, capture the sudden increase better than the
others. Contrary to expectations, the temperature parameter, which has the highest correlation with
consumption, did not improve model performance.
Figure 3: The model results for different meteorological parameter combinations

8.2. Forecast of the Maximum Consumption Period

To determine the performance of the forecast models over a longer period, the entire period in
which consumption was above average, that is, 2-5 August 2021, was modeled. Forecasted consumption
values for the case period with different models and parameters are shown in Figure 4. Figure 4
shows that the maximum values are estimated best in Models 3, 5 and 8. Although it is difficult to
choose the parameter combination in which the models perform best over the longer period, the best
performances for linear regression, LightGBM, random forest and CatBoost are obtained with Model 8.
When the forecast models are compared with each other, CatBoost produces estimates closer to the
actual consumption than the others. To improve performance, the parameter optimizations within the
model were repeated with all the meteorological parameters used.
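
The selection of the above-average period described at the start of this subsection can be
sketched as follows. How the average was defined in the study is not stated, so the mean of the
daily totals is assumed here. The function name and the consumption values are hypothetical and
were chosen so that the example returns the 2-5 August window.

```python
from collections import defaultdict
from datetime import datetime


def days_above_average(hourly):
    """Return the dates whose total consumption exceeds the mean daily total.

    `hourly` is an iterable of (timestamp, value) pairs.
    """
    daily = defaultdict(float)
    for ts, value in hourly:
        daily[ts.date()] += value
    mean_daily = sum(daily.values()) / len(daily)
    return sorted(day for day, total in daily.items() if total > mean_daily)


# Hypothetical readings (two per day) for 1-6 August 2021.
readings = {1: 100.0, 2: 150.0, 3: 150.0, 4: 150.0, 5: 150.0, 6: 100.0}
hourly = []
for day, value in readings.items():
    hourly.append((datetime(2021, 8, day, 0), value))
    hourly.append((datetime(2021, 8, day, 12), value))

peak_days = days_above_average(hourly)
print(peak_days[0], "to", peak_days[-1])
```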

Figure 4: Forecasted consumption values with different models and parameters for case period.

9. CONCLUSION

In this study, the use of different machine learning models with different meteorological
parameters for energy consumption estimation has been investigated. Using linear regression, random
forest, LightGBM and CatBoost models, two models covering different temporal lengths were developed
in Python. The selected period is 2-5 August 2021, when electricity cuts were experienced in Turkey
due to high energy consumption. First, the sudden jump in consumption was considered, and the
values predicted by the models at this peak were examined. Among single-parameter combinations, the
best results came from the CatBoost model with the horizontal wind speed component. For the longer
period, the best model was CatBoost, and the parameters that best improved model performance were
pressure, temperature, cloud cover, CAPE index and the horizontal and vertical components of the
wind. As a result, considering that the estimates are close to the observed values, CatBoost is the
most consistent of the models examined for energy consumption estimation, and the meteorological
parameters improve the estimation.

REFERENCES

1. Ajayi, O., & Heymann, R. (2021). Data Centre Day-ahead energy demand
prediction and energy dispatch with solar PV integration. Energy Reports, 7,
3760–3774. https://doi.org/10.1016/j.egyr.2021.06.062
2. Eshraghi, H., Rodrigo de Queiroz, A., Sankarasubramanian, A., & DeCarolis, J. F.
(2021). Quantification of climate-induced interannual variability in residential U.S.
electricity demand. Energy, 236, 121273.
https://doi.org/10.1016/j.energy.2021.121273
3. Giunta, G., Salerno, R., Ceppi, A., Ercolani, G., & Mancini, M. (2019). Effects of
model horizontal grid resolution on short- and medium-term daily temperature
forecasts for energy consumption application in European cities. Advances in
Meteorology, 2019, 1–12. https://doi.org/10.1155/2019/1561697
4. Hou, R., Li, S., Wu, M., Ren, G., Gao, W., Khayatnezhad, M., & Gholinia, F.
(2021). Assessing of impact climate parameters on the gap between hydropower
supply and electricity demand by RCPs scenarios and optimized ANN by the
improved Pathfinder (IPF) algorithm. Energy, 237, 121621.
https://doi.org/10.1016/j.energy.2021.121621
5. Mirasgedis, S., Sarafidis, Y., Georgopoulou, E., Lalas, D.,
Moschovits, M., Karagiannis, F., & Papakonstantinou, D.
(2006). Models for mid-term electricity demand forecasting incorporating weather
influences. Energy, 31(2-3), 208–227.
https://doi.org/10.1016/j.energy.2005.02.016
6. De Felice, M., Alessandri, A., & Ruti, P. M. (2013). Electricity demand
forecasting over Italy: Potential benefits using numerical weather prediction
models. Electric Power Systems Research, 104, 71–79.
https://doi.org/10.1016/j.epsr.2013.06.004

7. Salam, A., & El Hibaoui, A. (2021). Energy consumption prediction model with
deep inception Residual Network Inspiration and LSTM. Mathematics and
Computers in Simulation, 190, 97–109.
https://doi.org/10.1016/j.matcom.2021.05.006
8. El Kafazi, I., Bannari, R., & Abouabdellah, A. (2016). Modeling and forecasting
energy demand. 2016 International Renewable and Sustainable Energy
Conference (IRSEC). https://doi.org/10.1109/irsec.2016.7983974
9. Kwak, Y., Seo, D., Jang, C., & Huh, J.-H. (2013). Feasibility Study on a novel
methodology for short-term real-time energy demand prediction using weather
forecasting data. Energy and Buildings, 57, 250–260.
https://doi.org/10.1016/j.enbuild.2012.10.041
10. Yumurtacı, Z., & Asmaz, E. (2004). Electric energy demand
of Turkey for the year 2050. Energy Sources, 26(12), 1157–1164.
https://doi.org/10.1080/00908310490441520
11. Yorucu, V., & Mehmet, Ö. The Southern Energy Corridor: Turkey’s Role in
European Energy Security. Lecture Notes in Energy, 60.
https://doi.org/10.1007/978-3-319-63636-8_5
12. Akdi, Y., Gölveren, E., & Okkaoğlu, Y. (2019, November 15). Daily Electrical
Energy Consumption: Periodicity, harmonic regression method and forecasting.
Energy. Retrieved November 20, 2021, from
https://www.sciencedirect.com/science/article/abs/pii/S0360544219322194.

13. Banik, R., Das, P., Ray, S., & Biswas, A. (2020). Prediction of electrical energy
consumption based on machine learning technique. Electrical Engineering, 103(2),
909–920. https://doi.org/10.1007/s00202-020-01126-z
14. Castelli, M., Groznik, A., & Popovič, A. (2020). Forecasting electricity prices: A
machine learning approach. Algorithms, 13(5), 119.
https://doi.org/10.3390/a13050119
15. Dash, S. K., & Patel, D. (2015). Short-term electric load forecasting using extreme
learning machine - A case study of Indian Power Market. 2015 IEEE Power,
Communication and Information Technology Conference (PCITC).
https://doi.org/10.1109/pcitc.2015.7438135
16. Ioanes, A., & Tirnovan, R. (2019). Energy price prediction on the Romanian
market using long short-term memory networks. 2019 54th International
Universities Power Engineering Conference (UPEC).
https://doi.org/10.1109/upec.2019.8893550
17. Lucas, A., Pegios, K., Kotsakis, E., & Clarke, D. (2020). Price forecasting for the
balancing energy market using machine-learning regression. Energies, 13(20),
5420. https://doi.org/10.3390/en13205420
18. Bouderraoui, H., Chami, S., & Ranganathan, P. (2021). Predicting hourly energy
consumption in buildings. 2021 IEEE International Conference on Electro
Information Technology (EIT), 1–7.
https://doi.org/10.1109/EIT51626.2021.9491876
19. Liu, Y., Wang, Y., & Zhang, J. (2012). New machine learning algorithm: Random forest.
In B. Liu, M. Ma, & J. Chang (Eds.), Information Computing and Applications. ICICA 2012.
Lecture Notes in Computer Science, vol. 7473. Springer, Berlin, Heidelberg.
https://doi.org/10.1007/978-3-642-34062-8_32
20. CatBoost: Gradient boosting with categorical features support. (n.d.). Retrieved
January 15, 2022, from http://learningsys.org/nips17/assets/papers/paper_11.pdf
