
Research result: Wind energy prediction in Germany

Alexander Aprelkin

Technische Universität München

August 21, 2013

1 Introduction
Wind power is set to become a key source of energy supply in Germany. It is
shown in [10] that wind energy production has risen steadily in recent years
and that the technology is an emerging market with a great future, having
priority among other power sources. Unfortunately, wind power prediction is
considered to be a hard problem.
One of the key issues is to analyze the numerical weather predictions and
calculate the power output of a wind turbine or a wind farm based on weather
conditions. Normally, the power forecast needs to be available up to 3 days
before the actual feed-in values arrive.
An important problem is the availability and quality of weather data at the
location of a wind turbine. Often, the wind speed values have to be interpo-
lated given several weather stations' values around the turbine. Typically, wind
direction and velocity, air pressure/temperature and humidity are used as the
main quantities for the wind power prediction.
As mentioned in [5], precise forecasting of wind power output is important
because forecast errors require the use of balancing power and may cause
additional costs on the order of millions of euros within several hours.
Moreover, fluctuations of power quality can occur due to an erroneous power
forecast. Precise forecasts are therefore highly desirable.
The state-of-the-art day-ahead forecast [5] has a root mean square error of
about 5.8 percent, which decreases to about 3.8 percent for shorter time
periods.
Two main approaches exist: a statistical one and a physical one [6]. The
physical approach uses local dynamic weather information and the power curves
of a given turbine, whereas the statistical approach uses historical data and
interpolated weather values and is based on a statistical model, e.g. an
artificial neural network.
We will consider the statistical approach and analyze whether it is possible to
combine different state-of-the-art methods, or to improve one of them, in order
to obtain a forecast with a smaller error.

2 Related Work
As there are many different methods, both standard and more complicated, each
with its advantages and drawbacks, we will first consider them separately and
then try to find an optimal solution, either as a combination of them or as an
improvement of one of them. Our work will concentrate on learning a forecast
function for each of the TSOs separately, based on NWP data and the installed
capacity of the wind turbines. In this way, we will try to learn local power
curves and combine them into a single aggregated model of a TSO.
The authors of [1] propose minimizing the error between forecast and sample
data using weighting coefficients for the interpolation at the wind farm
location. In [32] the authors consider the power curve of a whole wind farm,
given weather data and power output. Different data mining algorithms,
depending on the forecast period, are considered in [3].
In [4] the authors argue that the forecast error of a region is in general
smaller than the forecast error for a single wind farm, because errors caused
by one site can be canceled out by another one.
It is mentioned in [5] and [12] that the prediction system currently used by
some German TSOs is statistical and works with artificial neural networks
trained on historical data. [7] provides a broad overview of several wind power
forecasting approaches.
One of the closely related papers is [9], where the power curve of a whole wind
farm is learned using a neural network. A different and interesting approach is
considered in [11], where clustering of weather data is used.
In [13], the authors also follow a statistical approach; during the learning
process, their system is able to assess the quality of a given weather
measurement.
The authors of [15] consider two different data mining methods: one predicts
the power directly from the NWP, and the other predicts the wind speed from the
NWP and then calculates the power output from the wind speed forecast.
The authors of [18], [19] and [21] use other, non-standard methods to solve the
problem of power prediction.

3 Research Question
Germany is divided into 4 transmission system operator (TSO) zones, as shown in
Figure 1. All of the TSOs publish the energy produced by wind turbines together
with their own predictions. The predicted energy is not always equal to the
produced energy, but the two should match as closely as possible in order to
save money and keep the power network balanced. The research question is how
accurately wind power output can be predicted using the developed methods,
compared to the prediction values of the TSOs. Another interesting question
concerns the so-called power curve, i.e. the dependence of the produced wind
energy on the wind speed: can energy values be predicted from wind speed
information alone, using the learned power curve?

Figure 1: Division of Germany by 4 TSOs.

The goal is to predict energy production for the next day in hourly time steps,
given the weather conditions and energy production values of the previous 4
days, i.e. the time range from 11.07.2013 15:00 to 15.07.2013 23:59. The target
day is 16.07.2013. Since the target day already lies in the past, the actual
values are known and can be used to estimate the error of the models.

4 Software
The following software was used to achieve the goals of the project. R was used
for weather interpolation and kriging at the wind turbines' locations. R also
offers the packages 'stats' and 'forecast' with state-of-the-art methods for
time series analysis and forecasting, such as ARIMA. Additionally, the R
package 'fields' was used for kriging. R was also used to join two data tables
on a common field, in our case to obtain the coordinates of a wind turbine from
its postal code. Furthermore, MATLAB was used for creating and simulating
neural networks.

5 Data set and data preprocessing


The data set of numerical weather predictions (NWP) was obtained from the
WebWerdis service of the Deutscher Wetterdienst. The user account was provided
by the Chair of Financial Mathematics of the TUM.
The considered data set consists of 4 parts:

• Climatological time series: hourly means of wind speed (in m/s) - product
de.dwd.nkdz.FFHM

• Climatological time series: hourly values of station pressure (in hPa) -
product de.dwd.nkdz.PSHV

• Climatological time series: hourly values of air temperature (in degrees C)
- product de.dwd.nkdz.TAHV

• Climatological time series: hourly values of relative humidity (in %) -
product de.dwd.nkdz.UUHV

We concentrated on these time series because the power generation of a single
wind turbine depends on exactly these parameters, regardless of the technical
parameters of the machine.
The kinetic energy of the air flowing through the rotor area of a wind turbine
can be expressed with the formula

\[ P = \frac{1}{2} A t \rho w^{3}, \tag{1} \]

where w is the wind speed, ρ the density of the air, t the time, and A the
rotor area of the turbine. Air density depends on temperature and humidity.
A wind turbine can convert at most about 59 percent of this kinetic energy
(the Betz limit); real turbines stay below this bound.
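As a rough numerical illustration of equation (1), the following R sketch computes the kinetic energy of the wind through a rotor over one hour and the upper bound given by the theoretical Betz limit (16/27, about 59 percent). The rotor diameter, air density and wind speeds are assumed example values and are not taken from the data set.

```r
# Kinetic energy of the wind passing through the rotor area, equation (1).
# All numeric inputs below are illustrative assumptions.
wind_energy <- function(w, A, rho = 1.225, t = 3600) {
  # w: wind speed [m/s], A: rotor area [m^2],
  # rho: air density [kg/m^3], t: time span [s]; result in joules
  0.5 * A * t * rho * w^3
}

rotor_diameter <- 80                  # assumed rotor diameter [m]
A <- pi * (rotor_diameter / 2)^2      # rotor area [m^2]
w <- c(4, 8, 12)                      # example wind speeds [m/s]

e_wind <- wind_energy(w, A)           # energy in the wind over one hour [J]
e_betz <- (16 / 27) * e_wind          # theoretical upper bound (Betz limit)
mwh    <- e_betz / 3.6e9              # joules to MWh

print(data.frame(speed = w, max_MWh = round(mwh, 3)))
```

Because of the cubic dependence, doubling the wind speed multiplies the available energy by eight, which is why small errors in the interpolated wind speed have a large effect on the predicted energy.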
The time series are provided as XML-formatted documents. The total number of
lines that can be downloaded at once is limited. At the beginning of the
project, the whole available time series (approx. 80 years of hourly measured
values for each station) was downloaded and considered for use, but working
with several gigabytes of data turned out to be infeasible in practice. The
task was therefore restricted to the minimal size of the data: 5 days of hourly
measurements at all stations, which is exactly what can be downloaded at once
from the data source.
WebWerdis provides data sets with significantly higher spatial resolution than
the European Centre for Medium-Range Weather Forecasts (ECMWF). WebWerdis gives
weather information at approx. 400 to 600 weather stations in Germany, whereas
ECMWF provides archive information at geographic grid points with a resolution
of 80 kilometers. ECMWF provides historical weather data at three time points
during the day, whereas WebWerdis gives hourly information.
In the next step, the data was converted to the CSV-format to be importable
in R and MATLAB. For this purpose a small Java-based converter program was
written.
Furthermore, data sets of all wind turbines in Germany as of 2013, including
their capacity, were collected from the websites of the TSOs [27],[25],[26],[28].
The data preprocessing step also included the creation of additional files with
the locations of the wind turbines in a format compatible with the NWP format,
based on the available postal code of each wind turbine. For offshore wind
turbines no postal code was provided, but the coordinates instead, which we
transformed into the appropriate format and used as the real coordinates of the
turbine instead of mapping the postal code to a location. This mapping was
done in R using a file [33] containing a table of coordinates and postal codes.
The downloaded data sets included weather information at several hundred
stations in Germany between 11.07.2013 and 16.07.2013. The day 16.07.2013 was
selected as the target (test) day, and the period from 11.07.2013 at 15:00 to
15.07.2013 at 23:59 was used for training.
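The size of the training window follows directly from these dates. A minimal R sketch, assuming hourly time stamps on the full hour and UTC as the time zone (the source does not state the time zone):

```r
# Hourly time steps of the training window and of the target day.
# The time zone "UTC" is an assumption made for this illustration.
train <- seq(ISOdatetime(2013, 7, 11, 15, 0, 0, tz = "UTC"),
             ISOdatetime(2013, 7, 15, 23, 0, 0, tz = "UTC"),
             by = "hour")
target <- seq(ISOdatetime(2013, 7, 16, 0, 0, 0, tz = "UTC"),
              by = "hour", length.out = 24)

length(train)    # number of training time steps
length(target)   # number of forecast time steps
```

The training window contains 9 hours of 11.07.2013 plus 4 full days, i.e. 105 hourly time steps, followed by 24 steps for the target day.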
The following R source code maps the wind turbines to geographic coordinates
via their postal codes.

    # postal code table [33]: V2 holds the postal code,
    # V10 and V11 are the numeric coordinate columns
    r <- read.csv("C:\\Users\\Alexander\\Downloads\\DE\\DE.txt",
                  colClasses = c(rep("character", 9), "numeric", "numeric"),
                  header = FALSE, sep = "\t",
                  fileEncoding = "UTF-8")

    # Tennet onshore: master data of the turbines (Anlagendaten)
    s <- read.csv("C:\\Praktikum\\Praktikum Dataset\\Anlagen_Stammdaten bis 2013\\Tennet\\on_final.csv",
                  colClasses = c(rep("character", 11), "numeric",
                                 rep("character", 9)),
                  header = TRUE, sep = ";")
    # join turbines and coordinates on the postal code (PLZ)
    tt <- merge(r, s, by.x = "V2", by.y = "PLZ")
    xt <- tt[, c("V2", "V10", "V11", "Leistung", "Inbetriebnahmedatum")]
    # write the result to a text file
    sink("c:\\tennetLocations.txt")
    print(xt)
    sink()

    # Tennet offshore: coordinates are given directly, no join needed
    s <- read.csv("C:\\Praktikum\\Praktikum Dataset\\Anlagen_Stammdaten bis 2013\\Tennet\\off_final.csv",
                  colClasses = c(rep("character", 11), "numeric",
                                 rep("character", 13)),
                  header = TRUE, sep = ";")

    xt <- s[, c("PLZ", "Latitude", "Longitude", "Leistung", "Inbetriebnahmedatum")]
    sink("c:\\tennetOffshoreLocations.txt")
    print(xt)
    sink()

After this step had been run for all TSOs, files with the locations of the
turbines and files with the weather measurements at the weather stations were
available.
The weather values at the locations of the wind turbines, which lie between the
weather stations, had to be interpolated in R using kriging (function Krig of
the package fields). For this purpose, an R program was run for each
combination of weather file and turbine location file. This process turned out
to be quite time-consuming and not always accurate (for example, negative
values occurred due to the form of the interpolation function).
The following R source code shows how the kriging is performed (interpolation
of the irregularly located weather measurements to the irregular positions of
the wind turbines).


    library(fields)  # Krig
    library(sp)      # char2dms

    # read the weather measurements
    d0 <- read.csv("C:\\wind5days22.csv", header = TRUE, sep = "\t")
    # add a time component
    d0$time <- ISOdate(d0$year, d0$month, d0$day, d0$hours, d0$minutes)
    # order by time
    d0 <- d0[order(d0$time), ]

    # read the locations of the weather stations
    d1 <- read.csv("C:\\wind5days22_loc.csv", header = TRUE, sep = "\t")
    # convert latitude and longitude to decimal degrees
    d1$y <- as.numeric(char2dms(as.character(d1[["latitude"]])))
    d1$x <- as.numeric(char2dms(as.character(d1[["longitude"]])))

    # read the locations of the wind turbines
    ttt1_loc <- read.csv("C:\\50hertzOffLocations_1.csv",
                         colClasses = c("character", "numeric", "numeric",
                                        "numeric", "character"),
                         sep = "", header = FALSE)

    # empty result matrix: one row per time step, one column per turbine
    r <- matrix(NA_real_, nrow = length(d0$time), ncol = length(ttt1_loc$V2))

    # fill the matrix with kriged values, one time step at a time
    for (i in 1:length(d0$time)) {
      fit <- Krig(d1[4:5], t(d0[i, 6:(ncol(d0) - 1)]),
                  Covariance = "Matern", theta = 25, smoothness = 0.5)
      r[i, ] <- predict(fit, cbind(ttt1_loc$V2, ttt1_loc$V3))
      cat(i, "\n")        # progress output
      flush.console()
    }

    # write out the result
    write.csv(r, "C:\\1\\result\\wind_50hertzOff.csv")

The monthly data sets of predicted and actual values were obtained from the
websites of the TSOs [28],[31],[29],[30]. TransnetBW GmbH provides the data
since January 2010, TenneT TSO GmbH since July 2005, Amprion GmbH since April
2008 and 50Hertz Transmission GmbH since January 2005. The power values are
given in 15-minute steps, and the prediction values are generated 24 hours
before the actual power feed-in value is measured. Old data sets of 50Hertz
(before 2009) are provided in XLS format, which differs from the newer ones
(CSV).

6 Methods
Several models were considered for the 6 prediction problems to be solved (wind
energy prediction for Tennet Offshore, Tennet Onshore, 50Hertz Offshore,
50Hertz Onshore, Amprion and TransnetBW). The resulting root mean squared error
(RMSE) and mean absolute error (MAE) of every method were compared to the RMSE
and MAE of the official predictions of the TSOs.
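The two error measures can be sketched in a few lines of R. The definitions are the standard ones; the example values are made up and not taken from the TSO data. Because an MAE cannot be negative by definition, a signed mean error (bias) is shown alongside for comparison.

```r
# Error metrics for a forecast compared with the measured values.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))
# Signed mean error (bias); unlike the MAE it can be negative.
mean_error <- function(actual, predicted) mean(predicted - actual)

# Made-up example values in MW, not taken from the TSO data.
actual    <- c(100, 120, 90, 110)
predicted <- c(110, 115, 95, 100)

rmse(actual, predicted)
mae(actual, predicted)
mean_error(actual, predicted)
```

The RMSE penalizes large individual deviations more strongly than the MAE, so the RMSE is never smaller than the MAE for the same forecast.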
The models are:
• "All stations (formula)": Each data set with weather conditions at every
wind turbine is used as the input of the neural net, which results in 4 neural
nets: for humidity, wind, pressure and temperature. After the training the
neural nets are applied on the weather conditions (resp. humidity, wind,
pressure or temperature) of the target day. The result is calculated as
wind ∗ 0.7 + temperature ∗ 0.1 + pressure ∗ 0.1 + humidity ∗ 0.1, because
wind speed has a more important influence on the energy output.
• "All stations (mean)": Each data set with weather conditions at every
wind turbine is used as the input of the neural net, which results in 4 neural
nets: for humidity, wind, pressure and temperature. After the training the
neural nets are applied on the weather conditions (resp. humidity, wind,
pressure or temperature) of the target day. The result is calculated as
wind ∗ 0.25 + temperature ∗ 0.25 + pressure ∗ 0.25 + humidity ∗ 0.25.
• "All stations (wind)": Only wind data set at all stations is used as input
to the neural net and the resulting net is applied on the wind of the target
day. The result is calculated using only the wind component.
• "Average (formula)": Each data set with weather conditions at every wind
turbine is used as the input of the neural net, which results in 4 neural
nets: for humidity, wind, pressure and temperature. However, the weather
values are averaged among the stations, so that there is exactly one average
value for every hour. After the training the neural nets are applied on the
appropriate average weather conditions (resp. humidity, wind, pressure or
temperature) of the target day. The result is calculated as wind ∗ 0.7 +
temperature ∗ 0.1 + pressure ∗ 0.1 + humidity ∗ 0.1, because wind speed has a
more important influence on the energy output.
• "Average (mean)": Each data set with weather conditions at every wind
turbine is used as the input of the neural net, which results in 4 neural
nets: for humidity, wind, pressure and temperature. However, the weather
values are averaged among the stations, so that there is exactly one average
value for every hour. After the training the neural nets are applied on the
appropriate average weather conditions (resp. humidity, wind, pressure or
temperature) of the target day. The result is calculated as wind ∗ 0.25 +
temperature ∗ 0.25 + pressure ∗ 0.25 + humidity ∗ 0.25.
• "Average (wind)": Only the wind data set, averaged over all stations, is
used as input to the neural net, and the resulting net is applied to the
average wind of the target day. The result is calculated using only the
wind component.
• "Single file": All four weather parameters are averaged and put into one
file for every operator and used as input for the neural net simultaneously.
• "Arima": Autoregressive integrated moving average model is used to fore-
cast the following energy production values given only the previous ones
in form of time series without consideration of weather parameters.
• "HoltWinters": analysis of time series of energy production in the past
without consideration of weather parameters, forecasts based on HoltWin-
ters model.
• "Power curve": a regression between wind speed and power generation is
fitted in order to find a dependence, i.e. a power curve for predicting
energy generation.
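The "formula", "mean" and "wind" variants in the list above differ only in how the four per-variable network outputs are combined. A minimal R sketch of this combination step, with made-up placeholder outputs for three time steps (the helper name combine is hypothetical):

```r
# Combine four per-variable network outputs into one energy forecast.
# The values below are made-up placeholders for 3 hourly time steps [MW].
pred <- cbind(wind        = c(500, 520, 480),
              temperature = c(450, 470, 430),
              pressure    = c(400, 410, 390),
              humidity    = c(550, 560, 540))

w_formula <- c(wind = 0.7,  temperature = 0.1,  pressure = 0.1,  humidity = 0.1)
w_mean    <- c(wind = 0.25, temperature = 0.25, pressure = 0.25, humidity = 0.25)

combine <- function(pred, weights) {
  stopifnot(abs(sum(weights) - 1) < 1e-9)        # weights must sum to one
  as.vector(pred[, names(weights)] %*% weights)  # weighted sum per time step
}

combine(pred, w_formula)   # "formula" variant
combine(pred, w_mean)      # "mean" variant
pred[, "wind"]             # "wind" variant uses the wind output only
```

Writing the weights as a named vector keeps the three variants comparable and makes it easy to try other weightings.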
Since no prediction values were available for 50Hertz Offshore and Tennet
Offshore, but only predictions for whole zones (including on- and offshore), we
used the prediction values for the whole operator region and took the offshore
share of the power capacity as the share of the prediction value.
Specifically, the sum of the capacities of the onshore turbines of Tennet is
42551641 MW, whereas 300000 MW can be produced offshore, which is less than one
percent of the onshore value. As a heuristic for the offshore prediction value,
we used 1 percent of the total prediction. Analogously, for 50Hertz the
capacity is 134831206 MW onshore and 48300 MW offshore. Here, too, we took 1
percent of every onshore predicted value as the offshore prediction at the same
time step.
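The capacity ratios behind this heuristic can be checked with a short R sketch. The capacity figures are the ones quoted above; the zone-wide prediction values are made up for illustration.

```r
# Offshore capacity relative to onshore capacity, using the figures quoted above.
capacity <- data.frame(
  tso      = c("Tennet", "50Hertz"),
  onshore  = c(42551641, 134831206),
  offshore = c(300000, 48300)
)
capacity$share_of_onshore <- capacity$offshore / capacity$onshore
print(capacity)

# Heuristic from the text: 1 percent of the zone-wide prediction.
zone_prediction     <- c(1200, 1350, 1100)   # made-up example values
offshore_prediction <- 0.01 * zone_prediction
offshore_prediction
```

Both ratios are below one percent, so the 1 percent heuristic is an upper approximation of the capacity share rather than an exact value.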
As the neural net implementation, the MATLAB application Neural Net Time Series
of the Neural Network Toolbox was used. Specifically, MATLAB calls this type of
problem nonlinear autoregressive with external (exogenous) input (NARX).
This type of neural net allows the following parameters to be adjusted: input
and target time series, percentage of validation and test data, number of
hidden neurons and number of delays.
As input, the appropriate weather data had to be provided (105 time steps from
11.07.2013 15:00 to 15.07.2013 23:59). The target time series consists of the
hourly values of generated energy at these time steps. For the shares of
training, validation and test data, the percentages 80/20/20 were chosen. For
the number of hidden neurons, 5 was the best choice, and the number of delays
was left at the default value of 2.
In order to supply the neural net with data, we prepared several files, divided
into the categories Input and Output as well as subcategories according to the
model used, for example: Input\Avg\hum_avg_50HertzOff.csv,
Input\Avg\hum_avg_50HertzOn.csv, Input\Avg\pressure_avg_Ampirion.csv,
Input\Full\transnetBW\temperature_TransnetBW.csv etc. for the input of the
appropriate neural net.
The output for training the neural net, i.e. 105 energy values, one for every
training hour, was prepared in the folder Output, e.g.
Output\TRAIN_ampirion_Werte.csv, Output\TRAIN_transnetBW_Werte.csv.
After the neural net has been trained, it is applied to the weather information
of the target day, i.e. to files in the folder target:
target\targetDayKriggedWeather\temperature_targetDay_50hertzOff.csv,
target\avg\SINGLEFILE_50hertzoff.csv etc.
The result is compared to the values stored in another file of the folder
target: target\50HertzWerte16_07\TARGET_OnShore.csv. The RMSE of the prediction
is calculated from the values in the file PREDICTED_50Hertz.csv of the same
folder. The file format is self-explanatory in every case.
The structure of the folders used for the different models is as follows:
For the "SingleFile" model: input files are in result\Input\SingleFile\. Output
data is in result\Output\. Target weather is in
target\targetDayKriggedWeather\avg\SINGLEFILExxx.csv. The target result is in
target\XXXWerte16_07.
For the "Avg" model: input files are in result\Input\Avg\. Output data is in
result\Output\. Target weather is in
target\targetDayKriggedWeather\avg\TARGETxxx.csv. The target result is in
target\XXXWerte16_07.
For the "All Stations" model: input files are in result\Input\Full\. Output
data is in result\Output\. Target weather is in
target\targetDayKriggedWeather\*.csv. The target result is in
target\XXXWerte16_07.
To find a dependence (regression) between the wind speed and the produced
energy, we used the average wind speed within one operator zone at one time
step and the corresponding produced energy at this time step. For this we used
the Curve Fitting Tool of the Curve Fitting Toolbox of MATLAB. In this tool,
in our 2D case, the X and Y values can be chosen as well as the method of curve
fitting. The average hourly wind speeds of one operator were selected as X
values, and the corresponding produced amounts of energy as Y values. In order
to find a dependence between these variables, several models were tried:
polynomial, Gaussian, etc.
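The same kind of curve fit can be sketched in R instead of the MATLAB tool. The example below fits a cubic polynomial with lm(); the data are synthetic, and the cubic shape and noise level are assumptions made only for this illustration.

```r
# Fit a polynomial "power curve" to (wind speed, energy) pairs with lm().
# The data are synthetic; the cubic shape and noise level are assumptions.
set.seed(1)
wind   <- runif(105, min = 2, max = 12)         # average wind speed [m/s]
energy <- 0.5 * wind^3 + rnorm(105, sd = 20)    # produced energy [arbitrary units]

fit <- lm(energy ~ poly(wind, 3, raw = TRUE))   # cubic polynomial regression
summary(fit)$r.squared                          # goodness of fit

# Predict the energy for new wind speeds using the fitted curve.
new_wind  <- data.frame(wind = c(4, 8, 10))
predicted <- predict(fit, newdata = new_wind)
predicted
```

With real zone-wide averages the scatter is much larger than in this synthetic case, because one average wind speed stands in for many turbines at different locations.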
Since predictions and actual energy generation values are given every 15
minutes, but the weather data is provided hourly, the energy generation values
were aggregated in groups of 4. For this purpose, and for averaging the weather
values, small Java parser programs were written.
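The aggregation step can be illustrated in R as well (the original used Java programs, which are not shown in the source). Whether the four values are averaged or summed is not stated; the sketch assumes the mean as the default and allows the sum as an alternative. The helper name to_hourly and the example values are hypothetical.

```r
# Aggregate 15-minute values to hourly values in blocks of four.
# Assumption: the mean is used for power values; pass sum for energy amounts.
quarter_hourly <- c(100, 104, 96, 100,   # hour 1
                    120, 124, 116, 120)  # hour 2

to_hourly <- function(x, fun = mean) {
  stopifnot(length(x) %% 4 == 0)                   # need complete hours
  hour <- rep(seq_len(length(x) / 4), each = 4)    # hour index per value
  as.vector(tapply(x, hour, fun))
}

to_hourly(quarter_hourly)        # hourly means
to_hourly(quarter_hourly, sum)   # hourly sums
```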

Figure 2: MATLAB view of Nonlinear autoregressive neural network with ex-
ternal exogenous input

7 Results
The goal of the project was to create a model that compares well with the
forecast function of the TSOs and is able to predict the wind power for the
next 24 hours (in our case on 16.07.2013). The error metrics we considered were
the mean absolute error and the root mean square error. We also wanted to
analyze the energy generation function as a standalone time series and try to
predict the next values using an ARIMA model. Additionally, our goal was to try
to learn a power curve given wind and produced energy information.
Differently than other state-of-the-art methods we compared

7.1 Neural Networks Predictions

The MATLAB Neural Network Toolbox provides the Neural Net Time Series tool for
modeling neural nets, which we used with the architecture shown in Figure 2.
At the beginning we planned to use much bigger data sets than 5 days; since
this resulted in more than 150 MB of data in the case of 50Hertz, we
concentrated on the 5-day data set. Even this data set required a lot of time
and computing power to krige the data and to train the neural nets.
We used three types of neural nets: one for all stations information as input
- 105 time steps with weather parameters at every wind turbine, one for 105
time steps of averaged weather data as input, and one aggregated data set with
105 time steps of all averaged weather parameters.
Training of the neural nets was done on the GPU, which significantly decreased
the training time.
After the neural network had been trained with the input values, using the
attached file NeuralNet.m or the MATLAB Neural Net Time Series Wizard, the
following MATLAB source code simulates the neural net for a given target day
weather time series. The result is afterwards printed on the console.

    % convert the target day weather matrix to neural network cell data
    inp = tonndata(TARGETDAYWEATHER, false, false)
    % simulate the trained network netc on the target day
    netc(inp, 'useParallel', 'yes', 'useGPU', 'yes')
    % print the result as a numeric column vector
    str2num(mat2str(cell2mat(ans)))'

RMSE p     MAE p     RMSE f    MAE f      RMSE m    MAE m      RMSE w     MAE w
10.35364   -8.9625   24.4809   23.75699   37.7110   36.98304   22.47725   21.15987

Table 1: Errors on solving the prediction problem using the All-stations model
for 50Hertz Onshore. p means prediction, f means calculated using neural network
and formula as above, m means calculated mean value, w means calculated using
only wind.

We calculated the results of the neural nets in three ways: energy based on the
wind speed output only, and energy based on all 4 outputs, either as their
average or as a weighted sum with a larger weight on the wind component.
Almost always, the calculated prediction curve for 24 hours followed the shape
of the official 24-hour prediction curve, although it had higher values than
the official one. The results of the first 5-6 time steps were especially good.
We calculated the RMSE and MAE for each of the 6 problems (6 data sets: Tennet
on- and offshore, 50Hertz on- and offshore, Amprion and TransnetBW) for every
model.
Although more data is available to the neural network in the "All Stations"
model, so that a more complex and accurate model could theoretically be
achieved, it did not perform better than the averaged models and shows a high
RMSE and MAE. Tables 1, 2 and 3 show how the RMSE and MAE of the different
calculation scenarios within one model (All stations) compare. For the other
models (average weather, and average weather in a single file) RMSE and MAE
were compared in the same way.
In Figure 3 one can see that the result calculated using only the wind
component in the "All stations" model performs slightly worse than the official
prediction, but the shape of the curve is almost the same.
Figure 4 shows that even the official prediction does not always deliver good
results: for the energy production of TransnetBW, our calculated prediction is
better than the official one.
As we can see from Tables 1 - 3, in almost all cases the best results are
achieved using the wind component alone or the mean value of all four
calculated energy results. Similar tables were considered for all other
calculation models and formulas. They are not shown in this document to save
space. At the end of the document (subsection Common Results) some summary
graphs are provided.
Interestingly, the worst results are achieved when using single files of input
data that contain all four weather components, but as average values over the
TSO region. In no case is the official prediction outperformed.

Figure 3: Comparison of different predictions: actual energy values for
16.07.2013 of 50Hertz Offshore, official prediction and neural network result
calculated using the formula described above. X-axis: hours from 0 to 24.
Y-axis: produced energy in MW.

RMSE p     MAE p     RMSE f    MAE f     RMSE m     MAE m      RMSE w   MAE w
-46.1666   70.7813   56.0681   -18.521   102.8731   70.10925   49.057   -15.8708

Table 2: Errors on solving the prediction problem using the All-stations model
for TransnetBW. p means prediction, f means calculated using neural network and
formula as above, m means calculated mean value, w means calculated using
only wind.

RMSE p     MAE p    RMSE f     MAE f       RMSE m      MAE m      RMSE w     MAE w
467.0217   -0.125   689.3051   -163.7458   620.93928   -81.1657   657.9483   -285.6970

Table 3: Errors on solving the prediction problem using the All-stations model
for Amprion. p means prediction, f means calculated using neural network and
formula as above, m means calculated mean value, w means calculated using
only wind.

Figure 4: Comparison of different predictions: actual energy values for
16.07.2013 of TransnetBW, official prediction and neural network result
calculated using wind speed. X-axis: hours from 0 to 24. Y-axis: produced
energy in MW.

RMSE p     MAE p    RMSE f    MAE f       RMSE m     MAE m     RMSE w    MAE w
33.86902   11.503   95.9621   88.766585   121.5708   118.431   82.6461   68.9897

Table 4: Errors on solving the prediction problem using the All-stations model
for Tennet Offshore. p means prediction, f means calculated using neural network
and formula as above, m means calculated mean value, w means calculated using
only wind.

7.2 Time Series Prediction

After the neural net results had been collected, some additional experiments
were carried out. It was interesting to know whether it is possible to predict
the energy values without knowledge of the weather, using only the values of
previous energy production. For this purpose R was used with the packages
'stats' and 'forecast', which provide the time series models ARIMA
(Autoregressive Integrated Moving Average) and Holt-Winters exponential
smoothing. The first one requires the time series to be stationary (possibly
after differencing), whereas the second one makes no assumptions about
stationarity and can be used if the time series has an increasing or decreasing
trend or seasonality. Each of the models received a time series of 105 time
steps (hourly values between 11.07.2013 at 15:00 and 15.07.2013 at 23:59) and
had to produce a prediction for the next 24 time steps (the 24 hours of
16.07.2013). The RMSE and MAE of both methods were then compared to the
official predictions of the TSOs.
The following R source code shows the use of the Holt-Winters model in R for
time series forecasting.

    library("forecast")
    library("stats")
    # training series: one energy value per line
    x <- read.csv("C:\\1\\result\\Output\\TRAIN_transnetBW_Werte.csv",
                  header = FALSE, sep = "\n")
    energy <- ts(x[[1]])
    # exponential smoothing with trend, without a seasonal component
    hw <- HoltWinters(energy, gamma = FALSE)
    # forecast the next 24 time steps (dispatches to forecast.HoltWinters)
    hw2 <- forecast(hw, h = 24)
    plot(hw2)
    hw2

The HoltWinters forecast was almost always a nearly constant line of values,
whereas ARIMA delivered more interesting results. The time series for ARIMA has
to be stationary (mean, variance and autocorrelation have to be constant over
time). Therefore, in the first step, if the time series appeared to be
non-stationary on the plot, it was differenced one or more times using the
function diff. The ARIMA function requires 3 parameters: p, d and q, where d is
the number of differences needed until a stationary time series remains. p and
q are read off the partial autocorrelation and autocorrelation functions
respectively, which can be obtained with the functions pacf and acf: the lags
after which the values vanish and go to zero are used as parameters of the
function arima. The functions forecast.Arima and forecast.HoltWinters were used
to obtain the predicted values. How the parameters for ARIMA can be chosen is
described in the tutorial [34].
The following R source code shows how an ARIMA model can be applied for
forecasting.

    library("forecast")
    library("stats")
    # training series: one energy value per line
    x <- read.csv("C:\\1\\result\\Output\\TRAIN_transnetBW_Werte.csv",
                  header = FALSE, sep = "\n")
    energy <- ts(x[[1]])
    plot(energy)
    plot(diff(energy))
    # second differences look stationary, so d = 2
    s <- diff(energy, differences = 2)
    # read q from the ACF and p from the PACF of the differenced series
    acf(s, lag.max = 20)
    pacf(s, lag.max = 20)
    # fit ARIMA(p = 4, d = 2, q = 2) and forecast the next 24 time steps
    tsar <- arima(energy, order = c(4, 2, 2))
    fa <- forecast(tsar, h = 24)   # dispatches to forecast.Arima
    plot(fa)
    fa

Forecast for 50Hertz Onshore was calculated using ARIMA(14,0,2).
Forecast for 50Hertz Offshore was calculated using ARIMA(12,0,2).
Forecast for Amprion was calculated using ARIMA(4,3,2).
Forecast for TransnetBW was calculated using ARIMA(4,2,2).
Forecast for Tennet Onshore was calculated using ARIMA(7,0,2).
Forecast for Tennet Offshore was calculated using ARIMA(7,1,2).
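How such a forecast can be scored against held-out values is sketched below using base R only (predict instead of the forecast package). The series is synthetic, simulated as an AR(2) process around an assumed level of 500, and the order c(2, 0, 0) is chosen to match the simulation; neither is taken from the TSO data.

```r
# Fit an ARIMA model on a training window and score a 24-step forecast.
# The series is synthetic; level, coefficients and order are assumptions.
set.seed(42)
y <- 500 + as.numeric(arima.sim(model = list(ar = c(0.6, 0.2)),
                                n = 129, sd = 10))

train  <- ts(y[1:105])     # 105 training steps, as in the text
actual <- y[106:129]       # 24 held-out steps playing the role of the target day

fit <- arima(train, order = c(2, 0, 0), method = "ML")
fc  <- predict(fit, n.ahead = 24)$pred   # base R forecast, no extra package

rmse <- sqrt(mean((actual - as.numeric(fc))^2))
mae  <- mean(abs(actual - as.numeric(fc)))
c(RMSE = rmse, MAE = mae)
```

For a stationary model the multi-step forecast converges to the estimated mean of the series, which is one reason why purely autoregressive forecasts flatten out over a 24-hour horizon.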
An example of a Holt-Winters forecast for Amprion and of an ARIMA forecast for
Tennet Onshore is shown in figures 5 and 6. On average, the results of both
functions were better than the results of the neural networks and competitive
with the official predictions, but they could not outperform the official
predictions.

Figure 5: Example forecast for 16.07.2013 for Amprion using Holt-Winters.
X-axis: number of time step. Values for time steps greater than 105 are
predicted.

Figure 6: Example forecast for 16.07.2013 for Tennet Onshore using ARIMA.
X-axis: number of time step. Values for time steps greater than 105 are
predicted.

Figure 7: Impossible Power Curve given the wind and energy data

7.3 Power Curve

Although the kriging process took a lot of time and the kriging function was
carefully tested and evaluated, the results of kriging are still not
satisfactory, since negative values occurred. These negative values cause side
effects in the calculation of the energy predictions. As only aggregated
generated energy values are provided in the data set, it was not possible to
create a power curve for every station alone. Instead, we had to average the
weather conditions, in this case wind, for every time step and use e.g. the
average wind of the whole region at this time step, which of course is not
accurate enough to create an expressive power curve. An example of a curve
fitting for the dependence between wind speed and produced energy values is
shown in figure 7 for 50Hertz offshore turbines.
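How a regional power curve can be estimated from averaged wind and aggregated
energy values is sketched below. This is not the curve fitting behind figure 7:
it swaps in the simpler method of bins, all data are simulated, and the names
`wind`, `energy` and `power_curve` as well as the cubic wind-power relation
used to generate the data are assumptions made for the example.

```r
# Minimal sketch: empirical power curve by the method of bins.
# Assumptions: `wind` stands for the regional average of kriged wind speeds
# (m/s), `energy` for the aggregated feed-in (MW); both are simulated.
set.seed(42)
wind   <- rnorm(105, mean = 6, sd = 3)  # may contain negative values
energy <- pmax(0, 2 * pmax(wind, 0)^3 + rnorm(105, sd = 50))

# A negative wind speed is not physical, so clamp kriging artefacts at zero.
wind_clean <- pmax(wind, 0)

# Average the energy within wind-speed bins that are 1 m/s wide.
bins <- cut(wind_clean,
            breaks = seq(0, ceiling(max(wind_clean)), by = 1),
            include.lowest = TRUE)
power_curve <- tapply(energy, bins, mean)  # NA marks an empty bin
power_curve
```

With only 105 averaged points, several bins hold very few values, which
illustrates why a curve built this way is coarse.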

7.4 Common Results

Figures 8 - 13 show the distribution of the RMSE among the different
calculation models compared to the official prediction for the different
prediction problems (one figure per prediction problem). For Tennet Onshore and
50Hertz Onshore no calculation using the "All-stations" model was performed,
because MATLAB ran out of memory.
Figure 14 shows a summary of the rankings of the calculation models based on
RMSE in all 6 prediction problems (the lowest RMSE gets rank 1, the highest
rank 10). As we can see, most of the time the official prediction holds ranks
1 to 3.
Figure 15 shows the average ranking of the different calculation models over
the 6 prediction problems. One can see that the official prediction was clearly
better on average, followed by the simple time series prediction models, which
did not take any weather information into account.
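The ranking scheme behind figures 14 and 15 can be sketched as follows. All
numbers are made up purely for illustration, and the model and problem names
are placeholders rather than the ten predictions that were compared.

```r
# Minimal sketch: rank models by RMSE per prediction problem, then average.
# Assumption: all values below are invented for illustration only.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

actual <- c(100, 120, 90, 110)
predictions <- list(
  official = c(102, 118, 93, 108),
  arima    = c(105, 113, 95, 104),
  network  = c(90, 135, 80, 125)
)
errors <- sapply(predictions, rmse, actual = actual)

# One row per prediction problem, one column per model.
rmse_table <- rbind(
  problem_a = errors,
  problem_b = c(official = 3.0, arima = 4.5, network = 9.0),
  problem_c = c(official = 5.0, arima = 4.0, network = 7.5)
)

ranks <- t(apply(rmse_table, 1, rank))  # rank 1 = lowest RMSE in that problem
average_rank <- colMeans(ranks)
sort(average_rank)
```

Averaging ranks instead of raw RMSE values keeps a problem with large absolute
feed-in from dominating the comparison.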
The quality of the ARIMA model can be explained by the periodicity of the
energy production, which a time series model captures well.
Interestingly, more data did not mean more quality: models involving the data
of all wind turbines of a TSO at a time step performed worse than the models
with averaged weather values. Also, using more parameters at once (Single File)
did not lead to success.
One of the best performances was achieved using only the average wind speed in
the TSO region, followed by the formulas in which other weather components
were more involved.
The results, however, cannot be extrapolated to the general case, because only
105 time steps were used for training. To make statements about the general
case, significantly more than 5 days of training data are required.

Figure 8: RMSE in solving prediction problem for 16.07.2013 for 50Hertz
Onshore by different models.

Figure 9: RMSE in solving prediction problem for 16.07.2013 for 50Hertz
Offshore by different models.

Figure 10: RMSE in solving prediction problem for 16.07.2013 for Amprion by
different models.

Figure 11: RMSE in solving prediction problem for 16.07.2013 for Tennet
Offshore by different models.

Figure 12: RMSE in solving prediction problem for 16.07.2013 for Tennet
Onshore by different models.

Figure 13: RMSE in solving prediction problem for 16.07.2013 for transnetBW
by different models.

Figure 14: Ranking from 1 to 10 (1 is the best) in solving 6 problems by 10
different predictions based on RMSE.

Figure 15: Average ranking (from 1 to 10, where 1 is the best) for each used
prediction in solving 6 problems.

8 Outlook and Conclusion
The results of the neural networks and the time series prediction functions
were satisfying and competitive, delivering a relatively small error, although
they did not outperform the official predictions.
The models could certainly perform better if more data were available. In our
case there were only 105 time steps and we had to predict the next 24, which is
a hard task even for reliable methods. Another problem was the availability of
the weather data at the wind turbine stations. The weather parameters had to be
kriged, which introduced an additional error source. Negative values could be
avoided using more expensive kriging methods. Time costs could be decreased by
computing the kriging on the GPU, which was not possible in R in our case.
Another important data availability issue is that the individual generated
energy values for every wind turbine were not provided, so we had to deal with
aggregated values, which is not accurate enough for a method to be competitive
with the official predictions. It would also be better if offshore prediction
values were available, so that we would not have to estimate them using the
capacity information.
Nevertheless, the process included many interesting issues and we arrived at
results that slightly underperform the official forecasts. We also found that
more parameters for the neural network do not help to achieve better
performance, and that using the average wind component per time step alone
gives better results than involving other weather components, which apparently
only disturb the performance. Time series analysis methods brought the best
results because of the periodicity of the data.
In the future, the amount of wind energy will certainly rise and it will become
more and more important to use strong and accurate methods to predict the wind
energy of the next day. The data providers will hopefully offer more detailed
data to the public and better forecasting methods will be developed.

References
[1] Á. Székely, T. Barbarics, Budapest University of Technology and Economics,
Short-Term Prediction of the Power Generation of Wind Turbines.
[2] European Centre for Medium-Range Weather Forecasts,
http://data-portal.ecmwf.int/data/d/interim_full_daily/.
[3] A. Kusiak, H. Zheng, Z. Song, Short-Term Prediction of Wind Farm Power: A
Data Mining Approach.
[4] G. Giebel, P. Sørensen, H. Holttinen, Risø National Laboratory, Forecast
error of aggregated wind power.
[5] M. Lange, U. Focken, State-of-the-Art in Wind Power Prediction in Germany
and International Developments.
[6] G. Kariniotakis, P. Pinson, N. Siebert, G. Giebel, R. Barthelmie, The State
of the Art in Short-Term Prediction of Wind Power.
[7] Argonne National Laboratory, Wind Power Forecasting: State-of-the-Art
2009.
[8] S. Dutta, T.J. Overbye, Prediction of Short Term Power Output of Wind
Farms based on Least Squares Method.
[9] A. Marvuglia, A. Messineo, Learning a wind farm Power Curve with a
Data-Driven Approach.
[10] Fraunhofer IWES, Windenergie Report Deutschland 2011.
[11] René Jursa, Bernhard Lange, Kurt Rohrig, Institut für Solare
Energieversorgungstechnik e. V., Wind Power Prediction with Optimization and
Clustering Techniques.
[12] Bernhard Lange, Kurt Rohrig, Bernhard Ernst, Florian Schlögl, Ümit Cali,
Rene Jursa, Javad Moradi, Wind Power Prediction in Germany - Recent Advances
and Future Challenges.
[13] G. Sideratos, N.D. Hatziargyriou, An Advanced Statistical Method for
Wind Power Forecasting.
[14] W. David Lubitz, Bruce R. White, Measuring Error in Wind Power
Forecasting Using a New Forecasting System.
[15] A. Kusiak, H. Zheng, Z. Song, Wind Farm Power Prediction: A Data-Mining
Approach.
[16] P. Gopi, S. Ganesh Vadiyanathan, G. Umma Habi, Comparative Analysis of
Wind Power Forecasting Using Artificial Neural Network.
[17] G. Giebel et al., Deliverable Report. The State of the Art in Short-Term
Prediction of Wind Power. A Literature Overview, 2nd Edition.
[18] Yang Lin, Wind power production forecasting: Nonlinear approach.
[19] Oliver Kramer, Fabian Gieseke, Short-Term Wind Energy Forecasting Using
Support Vector Regression.
[20] S. Mathew, J. Hazra, S. A. Husain, C. Basu, L. C. DeSilva, D. Seetharam,
N. Y. Voo, S. Kalyanaraman, Z. Sulaiman, An Advanced Model for the Short-Term
Forecast of Wind Energy.
[21] N. Amjady, F. Keynia, H. Zareipour, Wind Power Prediction by a New
Forecast Engine Composed of Modified Hybrid Neural Network and Enhanced
Particle Swarm Optimization.
[22] Audun Botterud, Jianhui Wang, Wind Power Forecasting and Electricity
Market Operations.
[23] J. W. Taylor, P. E. McSharry, R. Buizza, Wind Power Density Forecasting
Using Ensemble Predictions and Time Series Models.
[24] J. Dobschinski, A. Wessel, B. Lange, ISET e.V., Wind Power Prediction
Errors of a Shortest-Term Forecast of the Total German Wind Power Generation.
[25] http://www.50hertz.com/de/165.htm, Master data for EEG generators of
50Hertz GmbH.
[26] http://www.amprion.net/eeg-anlagenstammdaten-aktuell, Master data for
EEG generators of Amprion GmbH.
[27] http://www.tennet.eu/de/kunden/eegkwk-g/erneuerbare-energien-gesetz/eeg-daten-nach-52/einspeisung-und-anlagenregister.html,
Master data for EEG generators of TennetTSO GmbH.
[28] http://www.transnetbw.de/eeg-and-kwk-g/eeg-anlagendaten/, Master data
for EEG generators of TransnetBW GmbH.
[28] http://www.50hertz.com/de/151.htm, Forecast wind power feed-in of
50Hertz GmbH.
[29] http://www.amprion.net/windenergieeinspeisung, Forecast wind power
feed-in of Amprion GmbH.
[30] http://transnetbw.com/key-figures/renewable-energies-res/wind-infeed/?app=wind&activeTab=csv&auswahl=month&selectMonat=40,
Forecast wind power feed-in of TransnetBW GmbH.
[31] http://www.tennettso.de/site/en/Transparency/publications/network-figures/actual-and-forecast-wind-energy-feed-in,
Forecast wind power feed-in of TennetTSO GmbH.
[32] Andrew Kusiak, Haiyang Zheng, Zhe Song, Models for monitoring wind farm
power.
[33] http://www.50hertz.com/de/151.htm, Mapping of post index to geographic
coordinates.
[34] http://a-little-book-of-r-for-time-series.readthedocs.org/en/latest/src/timeseries.html#holt-winters-exponential-smoothing,
Time series analysis with R.
