One-hour-ahead Wind Speed Prediction Using a Bayesian Methodology

Marcos S. Miranda and Rod W. Dunn, Member, IEEE

This work was supported by the UK EPSRC, through the Supergen initiative.
M. S. Miranda (e-mail: m.miranda@bath.ac.uk) and R. W. Dunn (e-mail: r.w.dunn@bath.ac.uk) are with the Department of Electrical Engineering, University of Bath, Bath, BA2 7AY, UK.
1-4244-0493-2/06/$20.00 ©2006 IEEE.

Abstract--The contribution of wind power in market-driven power systems, together with the uncertain nature of the wind resource, has led to many research efforts on methodologies to predict future wind speed/power production. Applications such as the operational balancing market in the UK would benefit from accurate one-hour-ahead forecasts of the available power from all generators, wind being no exception. This paper focuses on one-hour-ahead wind speed prediction using a Bayesian approach to characterise the wind resource. To test the approach, two years of wind speed data from a weather station were modelled as an autoregressive process. The methodology and the model employed are described, and prediction results are presented and compared to the persistence method. The results obtained indicate that Bayesian inferencing can be a useful tool in wind speed/power prediction, particularly due to the flexibility inherent to the methodology.

Index Terms--Bayesian inferencing, Markov chain Monte Carlo simulation, prediction methods, short-term wind forecast, statistical methods, wind energy.

I. INTRODUCTION

The growing participation of wind power in modern market-driven power systems is becoming a significant problem due to the uncertain and variable nature of the wind resource. This has motivated R&D into techniques for better characterisation and prediction of the wind resource and the power produced by wind farms.

Such techniques play an increasingly important role as they make it possible to assess the impact of wind power generation on different application areas, dependent on the prediction horizon. These can range from seconds (e.g. wind turbine control systems) to days (e.g. dispatch of utility generation, security assessment).

This paper focuses on short time scales, i.e. one-hour-ahead predictions. The main area of application for such a prediction horizon is the operational security of the power system, as it may support generation dispatch decisions and allow adequate performance of electricity markets. For instance, in the UK, an operational balancing market exists to help ensure system security. However, exercising this balancing mechanism is potentially expensive and could be avoided in some cases if the output of wind generators was accurately predicted one hour ahead of real-time.

The methodology developed in this paper uses a Bayesian framework to model the wind speed time-series as an autoregressive process. Similarly, the wind power generated from a given wind farm (or group of wind farms) could be used, since generated power is the main variable of interest. This was not possible though, due to unavailability of suitable data. Nonetheless, power predictions could still be inferred through proper up-scaling of the predicted wind speeds (power law) to hub-height for a given site and subsequent application of a typical wind turbine power curve, which describes the power output given an average incident wind speed.

The characterisation of the known resource (weather station hourly average wind speed data) is performed in two distinct steps. First, the wind speed data, $U_w$, for the weather station is defined as a normally distributed variable with an unknown mean, $\mu_w$, and variance, $\sigma_w^2$. Then, the time-series mean, $\mu_w$, is described as an autoregressive (AR) process to define the structure of the data in the previously defined normal distribution. This type of model structure is also known as a hierarchical model in Bayesian inferencing.

Markov Chain Monte Carlo (MCMC) simulation is used to estimate the model parameters, which can then be used to predict wind speeds at future time-steps. The model was implemented using the OpenBUGS (Bayesian inference Using Gibbs Sampling) software package [1].

This paper presents the development and results of an AR model using a Bayesian approach. Some of the fundamental aspects behind this approach are outlined and preliminary results are shown for a 6th-order model, using only wind speed data as the input.

II. SHORT-TERM PREDICTION OF WIND SPEEDS

The area of short-term prediction of wind speed/power can be generally subdivided into two main groups, depending on the underlying prediction model used. These can be either based on numerical weather prediction (NWP) models, similar to those used by national meteorological agencies, or other alternative approaches [2]. Models such as RISØ’s Prediktor [3] or the University of Oldenburg’s Previento [4] can be found in the first category. The second category encompasses, amongst others, artificial intelligence (fuzzy logic [5], artificial neural networks [6]) and autoregressive models [7].

The suitability of a given model will be dependent on the
time scale of interest for the predictions, NWP-based models being usually the best option for longer term predictions (over around 6 hours ahead).

A. Time-series models

The time-series approaches can be particularly useful when shorter prediction horizons are required. Such methods are usually less complex and computationally intensive than NWP-based methods, making them a useful tool for generation dispatch applications. In particular, autoregressive processes are a class of standard statistical models which are well known and easy to implement.

A traditional way of testing the performance of a short-term prediction model is to compare its output with that from the persistence method [7]. The persistence method consists of using the wind speed over the past hour as the prediction for the next one. As simplistic as it may sound, the persistence method performs remarkably well over short prediction horizons due to the typical time constants associated with weather systems.

III. THE BAYESIAN METHODOLOGY

The Bayesian approach to statistical analysis is not new, its roots dating back to the mid 18th century, but it gained considerable attention over the last two decades, particularly due to the increase in computer processing power. During this time it has found many areas of application in the modelling and forecasting of time series data, such as disease mapping and biological processes, as well as different modelling frameworks, such as autoregressive moving average [8] and state space techniques [9].

One of the main characteristics of Bayesian statistics lies in the fact that, contrary to the frequentist approach, the probabilities associated with a variable in a given process are not taken as how many times, or how often (their frequency), they are observed. Instead, the probabilities (whether extracted from actual observations, taken from the expected range of possible values based on previous knowledge, or simply estimated) are seen as a degree of belief attached to the variable, which in turn can take the form of a probability distribution.

Bayes’ theorem, shown in (1), can be stated as follows: the probability of a variable A given the occurrence of another variable B is equal to the normalised likelihood of B given A, P(B|A) / P(B), times the probability of A. In other words, the conditional probability of A on B can be calculated using the conditional probability of B on A and the degree of belief in A, the prior P(A).

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{1}$$

The concepts of prior and posterior distributions are inherent to Bayesian statistics and are further developed below.

A prior probability is a marginal probability, i.e. a description of a variable based on some knowledge of what value it may assume, but not necessarily based on observations of the variable, which is not always possible.

Priors can be either informative or non-informative. One could define that the average daily temperature in a given location on the Equator during the summer belonged to a uniform distribution over [-100 °C, 100 °C], which is not particularly informative. A more reasonable estimate could be made by looking at weather information taken at similar locations and saying that it could vary uniformly over [20 °C, 50 °C], or even that it was normally distributed, with a mean of 35 °C and a 99% confidence interval of ±15 °C about the mean.

The posterior probability, on the other hand, is a conditional probability, which takes into account the information contained in the prior and the normalised likelihood, as shown above.

In many cases, it is not possible to derive the posterior distributions analytically, but samples may be generated using Markov Chain Monte Carlo (MCMC) methods. The resulting posterior is then obtained not as a single number, but as a probability distribution. Empirical summary statistics can be calculated from the samples in this distribution and used to draw inferences about the true parameter values, such as the expected (mean) value and confidence intervals of interest.

When using MCMC, a ‘burn-in’ stage must be taken into account, during which the different parameters should converge to their true values. This is followed by a stage during which the samples from the posterior distribution are kept for analysis purposes. With more complicated models, the convergence stage may take a long time.

A. Hierarchical Models and Expert Knowledge

One important feature of the Bayesian approach is the ability to implement nested structures, also referred to as hierarchical models.

This enables further characterisation of the variables in the model and the introduction of expert knowledge at different levels. This can be exemplified by considering a fictitious model of hourly electricity demand over the winter. The overall demand distribution may be modelled as normally distributed. Further, the time-series data available may be modelled by two additive functions, the domestic and the industrial consumption. The individual time-series models themselves may have any desired structure, e.g. an autoregressive arrangement. The domestic consumption can further be made dependent on the average daily temperature, which again can be characterised using any desired formulation, and made dependent on any other variable.

In the specific case of wind speed prediction, such a facility allows the inclusion of physical phenomena into the statistical model (as opposed to the statistical analysis stage used in some of the NWP-based methods), adding flexibility and strengthening the resulting model.

IV. MODEL DEVELOPMENT

A Bayesian hierarchical model was developed to model the autocorrelation structure between consecutive hourly mean
wind speed samples, thus allowing the prediction of wind speeds at future time-steps.

The input data for the model were two years of hourly mean wind speeds from a weather station at the extreme north of Great Britain (Lerwick, Shetland Islands), taken for the years of 1998 and 1999 (Fig. 1). The average wind speed for the site during this period was 7.9 m/s.

[Figure: wind speed (m/s) against time (h), 0 to 17,520 h]
Fig. 1. Input wind speed data (two years of hourly averages).

The hierarchical model implemented in this work has two levels:

• Level 1: The wind speed data at the site is initially defined as coming from a Normal distribution with an underlying mean and variance, as in (2), where $U_w$ is the wind speed data, $\mu_w$ is the underlying mean at the site and $\sigma_w^2$ the associated variance.

$$U_w \sim N(\mu_w, \sigma_w^2) \tag{2}$$

• Level 2: The temporal structure of the distribution represented in (2) is then modelled as a sixth-order autoregressive (AR) model, which can be written in its general form as:

$$U_{w,t} = \beta_0 + \beta_1 U_{w,t-1} + \beta_2 U_{w,t-2} + \dots + \beta_n U_{w,t-n} + u_t \tag{3}$$

where $U_{w,t}$ is the value of the series at time $t$, $U_{w,t-1}$ to $U_{w,t-n}$ are the previous values of the series up to $n$ previous time steps, $n$ is the order of the model, $\beta_0$ is the time-series overall level, $\beta_1$ to $\beta_n$ are the series autocorrelation coefficients, and $u_t$ is a normally distributed random term, with zero mean and variance $\sigma_w^2$.

A. Non-normality correction

It is a well-known characteristic of wind speed series that their variation at a given site can be modelled using the Weibull distribution [10]. As described above, this is a potential problem for the hierarchical model adopted, since it assumes at its first level that the data is normally distributed. Therefore a transformation of the original wind speed data was required, and the Box-Cox transformation was used as it presents a simple and straightforward procedure for non-normality correction [11], as detailed below.

The Box-Cox transformation $y^{(\lambda)}$ of a dataset $y$ is defined as:

$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \text{for } \lambda \neq 0 \\ \ln(y), & \text{for } \lambda = 0 \end{cases} \tag{4}$$

The choice of the value for the parameter $\lambda$ can be made through an analysis of the log-likelihood function:

$$f(y, \lambda) = -\frac{n}{2} \ln\left[ \sum_{i=1}^{n} \frac{\left( y_i^{(\lambda)} - \bar{y}^{(\lambda)} \right)^2}{n} \right] + (\lambda - 1) \sum_{i=1}^{n} \ln(y_i) \tag{5}$$

The value of $\lambda$ which maximises (5) is used in the Box-Cox transformation. A plot of the log-likelihood function for values of $\lambda$ between [-2, 2] in steps of 0.1 can be seen in Fig. 2.

[Figure: log-likelihood (×10^4) against λ, for λ from -2 to 2]
Fig. 2. Log-likelihood function of wind speed data for different parameters λ of the Box-Cox transformation.

From Fig. 2, it can be seen that the value of $\lambda$ which maximises the function is $\lambda = 0.5$. In order to validate the procedure, Matlab’s boxcox function was also used. This function performs a continuous assessment of the data over the variable $\lambda$, searching for the maximum of the resulting function through an optimisation procedure. This yielded a value of $\lambda_{MAT} = 0.4913$. The difference between the results was not significant enough to justify the use of the more complex calculation procedure, which may not be readily available to many; therefore $\lambda = 0.5$ was used in the data conversion.

V. SIMULATION RESULTS

The model was built using wind speeds defined within a window that moved through the wind dataset, in order to
generate sequential predictions. Each window contained two years’ worth of data (17520 points) and for each of these snapshot windows, one additional hour was predicted. This process was carried out for 48 data windows. A schematic representation of the procedure is shown in Fig. 3.

Fig. 3. Schematic representation of the prediction procedure.

As previously mentioned, during the analysis of MCMC simulation results the initial outputs are usually discarded to allow for the convergence of the model variables. The following results were taken for a run of 10,000 samples, with the first 1,000 discarded and only the remaining 9,000 considered in the analysis.

A. AR Model Validation

In order to assess the estimation of the AR model parameters, the two-year dataset was also characterised using Matlab’s system identification tools. A 6th-order autoregressive model was defined and its coefficients calculated. The results for both the Matlab and Bayesian models are summarised in Table I, below.

The outputs from the Bayesian model are given in the form of probability distributions, so in addition to the expected (mean) value of the six AR model parameters, the 95% confidence interval of the parameters is also included in the table, with the lower (2.5%) and upper (97.5%) bounds in separate columns. The results suggest a strong correlation in the data between the predicted wind speed ($U_{w,t}$) and its previous value ($U_{w,t-1}$), indicating some resemblance to the persistence model. Good agreement was found between the parameters estimated using both models.

TABLE I
COMPARISON BETWEEN COEFFICIENTS OF AR(6) MODEL, AS CALCULATED BY MATLAB AND BAYESIAN INFERENCING

AR model                      Bayesian inferencing
coefficient    Matlab         Mean       2.5%       97.5%
β1 (t-1)        1.02           1.0545     1.0401     1.0694
β2 (t-2)       -0.04111       -0.0824    -0.1038    -0.0610
β3 (t-3)       -0.0104         0.0125    -0.0092     0.0333
β4 (t-4)       -0.003534      -0.0272    -0.0485    -0.0058
β5 (t-5)       -0.02822       -0.0142    -0.0358     0.0071
β6 (t-6)        0.0077         0.0035    -0.0113     0.0184

For purposes of illustration, the shapes of the probability distributions obtained for the six AR model coefficients (9,000 samples) are also shown in Fig. 4.

[Figure: six density plots, panels (a) to (f), of the sampled coefficients (panel titles of the form "beta[1,1] sample: 9000")]
Fig. 4. Probability distributions of AR model coefficients. a) β1, b) β2, c) β3, d) β4, e) β5, and f) β6.

B. Wind Speed Prediction Validation

The 48 predictions were compared with the wind speed data and the results are shown in Fig. 5, together with the 95% confidence interval of the predictions.

As indicated in the model coefficients analysis, the predicted wind speeds have a strong relationship with the series’ previous value.

[Figure: wind speed (m/s) against time (h), 0 to 50 h; series: time-series, predictions, confidence interval]
Fig. 5. Wind speed predictions (with 95% confidence interval) and wind speed data (1-hour-ahead predictions).
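The relationship between the AR(6) point predictions and the persistence benchmark can be illustrated with a short sketch. It applies the structure of (3) with the Bayesian mean coefficients of Table I to a Box-Cox-transformed series (λ = 0.5, as in (4)) and then inverts the transformation. This is not the OpenBUGS model used in this work: the intercept β0 is not reported in Table I and is set to zero purely as an assumption, the relative rms error is assumed to be the rms error divided by the mean observed speed, the input series is synthetic, and all function names are hypothetical. The numbers it prints should therefore not be compared with the results reported here.

```python
import math

# Posterior means of beta_1..beta_6 as listed in Table I (Bayesian column).
BETA = [1.0545, -0.0824, 0.0125, -0.0272, -0.0142, 0.0035]
BETA0 = 0.0   # ASSUMPTION: the intercept beta_0 is not reported in Table I
LAMBDA = 0.5  # Box-Cox parameter adopted in the paper


def boxcox(y, lam=LAMBDA):
    """Box-Cox transform (4) of a strictly positive value."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_boxcox(z, lam=LAMBDA):
    """Inverse of (4); assumes lam * z + 1 > 0 when lam != 0."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)


def ar_predict(history, beta=BETA, beta0=BETA0, lam=LAMBDA):
    """One-hour-ahead point prediction (m/s) from the latest len(beta) speeds."""
    n = len(beta)
    if len(history) < n:
        raise ValueError("need at least %d past values" % n)
    # Most recent value first, so lags[k] pairs with beta[k] (lag k + 1).
    lags = [boxcox(u, lam) for u in reversed(history[-n:])]
    z = beta0 + sum(b * x for b, x in zip(beta, lags))
    return inv_boxcox(z, lam)


def persistence_predict(history):
    """Persistence benchmark: the next hour equals the last observed hour."""
    return history[-1]


def relative_rms_error(predicted, observed):
    """RMS error divided by the mean observation (ASSUMED definition)."""
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    return math.sqrt(mse) / (sum(observed) / len(observed))


def walk_forward(series, n_lags=len(BETA)):
    """Predict every hour after the first n_lags hours with both methods."""
    ar, pers, obs = [], [], []
    for t in range(n_lags, len(series)):
        past = series[:t]
        ar.append(ar_predict(past))
        pers.append(persistence_predict(past))
        obs.append(series[t])
    return ar, pers, obs


if __name__ == "__main__":
    # Synthetic hourly wind speeds in m/s, for illustration only.
    speeds = [7.2, 7.8, 8.1, 8.4, 8.0, 7.6, 7.9, 8.3, 8.8, 9.1, 8.7, 8.2]
    ar, pers, obs = walk_forward(speeds)
    print("AR(6) relative rms error:      ", relative_rms_error(ar, obs))
    print("Persistence relative rms error:", relative_rms_error(pers, obs))
```

Because the coefficients are held fixed, the sketch yields point predictions only. The confidence interval in Fig. 5 comes from the posterior samples, which this example does not reproduce.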
Fig. 6 shows a comparison between the absolute errors obtained by the persistence method and the developed model. The similarity in the performance of both models is evident. The relative rms error found for the Bayesian model was 16.4%, marginally outperforming the persistence model, which displayed an rms error of 17.3%.

[Figure: error (m/s) against time (h), 0 to 50 h; series: error persistence, error AR(6) model]
Fig. 6. Absolute prediction errors obtained for the developed AR model and the persistence method.

C. Discussion

Despite the marginal improvement in performance over the persistence method, the results obtained are encouraging with respect to the application of Bayesian inferencing to wind speed prediction, particularly considering the simplicity of the statistical model (AR) employed and the fact that only wind speed data was used in the modelling.

The use of more complex models, such as autoregressive moving average (ARMA) models, should improve the predictions considerably, especially considering the ability of such models to characterise seasonality and other level variation effects in the data.

Also, further exploitation of the hierarchical structure capability of the Bayesian methodology should lead to improved model performance, especially if other physical variables known to have influence over the wind speed are included, such as atmospheric pressure.

Finally, the use of simultaneous data from multiple sites (described in the model as a multivariate normal distribution), together with the inclusion of other physical variables, should yield a more robust model, despite its higher complexity. In such an approach, the gradients between the physical variables could also be used, in addition to single point readings, in an attempt to improve the performance of the predictions.

VI. CONCLUSIONS

The impact of wind power on power systems has received great attention over recent years, and reliable prediction models have become essential tools to assist in the system security assessment.

This paper presents the development and simulation results of an AR model using a Bayesian approach. Some aspects of the Bayesian methodology are described and results are shown for a 6th-order AR model. The performance of the developed model was assessed against the predictions obtained using the persistence model, indicating satisfactory results, despite the simplicity of the model used.

The Bayesian framework presents great potential in this application area, as it provides great flexibility in the model development, allowing the inclusion of expert knowledge derived from the physical phenomena (use of additional variables) that may have influence over the variable of interest.

Lower-order autoregressive (AR) models can be rather limited for this particular application as they may fail to capture the periodic variations of the wind speed. As further work, the authors are now pursuing the use of a more complex structure, such as an autoregressive moving average (ARMA) model, in conjunction with the incorporation of additional variables, such as the atmospheric pressure, which could greatly improve the performance of the model.

VII. ACKNOWLEDGMENT

The authors gratefully acknowledge the contribution of Gavin Shaddick for his initial help with the OpenBUGS software and useful discussions on some of the model implementation aspects.

The authors also thank the UK Met Office and the British Atmospheric Data Centre for supplying the wind speed data used in this research.

VIII. REFERENCES

[1] OpenBUGS project website, http://mathstat.helsinki.fi/openbugs, November, 2005.
[2] G. Kariniotakis, P. Pinson, N. Siebert, G. Giebel, and R. Barthelmie, "The State of the Art in Short-term Prediction of Wind Power - From an Offshore Perspective," in Proc. 2004 Symposium ADEME – IFREMER (Renewable energies at sea). [Online]. Available: http://anemos.cma.fr/download/publications/pub_2004_paper_SeaTechWeek04_SOTA.pdf, November, 2005.
[3] L. Landberg, "Short-term prediction of the power production from wind farms," Journal of Wind Engineering and Industrial Aerodynamics, vol. 80, pp. 207-220, 1999.
[4] U. Focken, M. Lange, D. Heinemann, and H. P. Waldl, "Previento - Regional Wind Power Prediction with Risk Control," in Proc. 2002 Global Windpower Conference.
[5] I. G. Damousis, M. C. Alexiadis, J. B. Theocharis, and P. S. Dokopoulos, "A Fuzzy Model for Wind Speed Prediction and Power Generation in Wind Parks Using Spatial Correlation," IEEE Trans. Energy Conversion, vol. 19, pp. 352-361, June 2004.
[6] T. G. Barbounis, J. B. Theocharis, M. C. Alexiadis, and P. S. Dokopoulos, "Long-Term Wind Speed and Power Forecasting Using Local Recurrent Neural Network Models," IEEE Trans. Energy Conversion, to be published.
[7] M. Milligan, M. Schwartz, and Y.-H. Wan, "Statistical Wind Power Forecasting Models: Results for U.S. Wind Farms," in Proc. 2003 Windpower Conference.
[8] P. Congdon, Bayesian Statistical Modelling. Chichester: Wiley, 2001, p. 556.
[9] M. West and P. J. Harrison, Bayesian Forecasting and Dynamic Models. New York: Springer-Verlag, 1997 (2nd ed.), p. 680.
[10] Danish Wind Industry Association (DWIA) website, Guided Tour on Wind Energy. [Online]. Available: http://www.windpower.org/en/tour/wres/weibull.htm, November, 2005.
[11] NIST/SEMATECH, e-Handbook of Statistical Methods. [Online]. Available: http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc52.htm, November, 2005.

IX. BIOGRAPHIES

Marcos S Miranda was born in Belo Horizonte, Brazil in 1973. He graduated from the Federal University of Minas Gerais (UFMG, Brazil) in Electrical Engineering in 1994 and obtained a Master's degree by Research in 1997, from the same university.
He then worked for one year as a Research Assistant at the Brazilian Wind Energy Centre at the Federal University of Pernambuco (UFPE, Brazil). In 1998 he started his PhD at the Centre for Renewable Energy Systems Technology (CREST) at Loughborough University (UK). His research project addressed wind-powered seawater desalination. Since the end of 2003 he has been with the Department of Electrical Engineering, University of Bath (UK) as a Research Associate, where he is currently involved in the EPSRC-funded SUPERGEN project. His broad areas of interest include wind and other renewable energy sources, power systems, dynamic systems modelling and control, power quality and energy efficiency.

Rod W. Dunn (M’1996) was born in Glasgow, Scotland in the UK on 29th July 1959. He graduated from the University of Bath, UK, with a BSc in Electrical and Electronic Engineering in 1981. He carried out research towards his PhD on novel control algorithms and embedded computer to implement the control for magnetically levitated urban transport vehicles.
In 1985 he became a Research Assistant within the Power and Control Group of the School of Electrical and Electronic Engineering at the University of Bath. He joined the academic staff in early 1986 teaching modern control and advanced computer architectures. Currently Dr Dunn is a Senior Lecturer and a member of the Power and Energy Systems Group within the Department of Electronic and Electrical Engineering at the University of Bath. Rod Dunn’s current funded research includes electrical energy storage systems, the reliability of future power systems and power system operation and planning techniques.