Professional Documents
Culture Documents
.
Concepts, modelling and verification
Pierre Pinson
DTU Summer School 2017 - EES-UETP course, Kgs Lyngby, Denmark – 12 June 2017
(with ackn. to all funding sources, data providers, collaborators and students, for material and ideas)
1 / 67
Outline
2 / 67
1 Uncertainty and forecasting
3 / 67
Why uncertainty?
deterministic forecasts
probabilistic forecasts as quantiles, intervals, and predictive distributions
probabilistic forecasts in the form of trajectories (/scenarios)
risk indices (broad audience applications)
For nearly all of these problems, optimal decisions can only be obtained if fully
considering forecast uncertainty...
4 / 67
Contribution to forecast uncertainty/error
5 / 67
Numerical Weather Prediction
Temporal/spatial resolution,
domain, forecast update and
forecast length vary depending
upon the NWP system
6 / 67
The manufacturer power curve
actual
meteorological
conditions seen
by turbines,
aggregation of
individual curves,
non-ideal power
curves,
etc.
8 / 67
Shaping forecast uncertainty
9 / 67
2 From deterministic to probabilistic forecasts
10 / 67
Point forecasts: what they really are...
Forecasts are commonly provided in the form of point forecasts (i.e., the conditional
expectation for each look-ahead time)
Figure: The Western Denmark dataset: original locations for which measurements are available, 15 control zones defined by
Energinet, as well as the 5 aggregated zones, for a nominal capacity of around 2.5 GW.
12 / 67
Point forecast: definition
Mathematically:
given
the information
set Ω
a model M
its estimated
parameters θ̂
at time t
13 / 67
Point forecasting
Figure: Example episode with point forecasts for the 5 aggregated zones of Western Denmark, as issued on
16 March 2007 at 06 UTC, along with corresponding power measurements, obtained a posteriori.
14 / 67
Quantile forecast: definition
Mathematically:
(α) −1
q̂t+k|t = F̂t+k|t (α)
with
α:the nominal
level (ex: 0.5 for
50%)
F̂ : (predicted)
cumulative
distribution
function for Yt+k
Yt+k : the
random variable
“power
generation at
time t + k”
15 / 67
Prediction interval: definition
A prediction interval is an interval within which power generation may lie, with a
certain probability
Mathematically:
h i
(β) (α) (α)
Iˆt+k|t = q̂t+k|t , q̂t+k|t
with
β: nominal
coverage rate
(ex: 0.9 for 90%)
(α) (α)
q̂t+k|t , q̂t+k|t :
interval bounds
α, α: nominal
levels of quantile
forecasts
16 / 67
Predictive densities: definition
Mathematically:
Yt+k ∼ F̂t+k|t
with
F̂ : (predicted)
cumulative
distribution
function for Yt+k
Yt+k : the
random variable
“power
generation at
time t + k” 17 / 67
Predictive densities
Figure: Example episode with probabilistic forecasts for the 5 aggregated zones of Western Denmark, as issued on 16 March
2007 at 06UTC. They take the form of so-called river-of-blood fan charts, represented by a set of central prediction intervals with
increasing nominal coverage rates (from 10% to 90%).
18 / 67
The conditional importance of correlation
almost no temporal
correlation
appropriate temporal
correlation
19 / 67
Trajectories (/scenarios): definition
Mathematically:
(j)
zt ∼ F̂t
with
F̂ : multivariate
predictive cdf for
Yt
(j)
zt : the j th
trajectory
20 / 67
Space-time trajectories (/scenarios)
21 / 67
3 Verification of probabilistic forecasts
22 / 67
Properties of probabilistic forecasts
23 / 67
4 Parametric apporaches to probabilistic forecasting
24 / 67
The wave energy example
25 / 67
Data characteristics
26 / 67
Welcome to Kauai
This is also where they filmed part of the Jurassic Park movies...!
27 / 67
Predicting the wave energy flux
g 2ρ 2
Yt = H Tt , Yt ≥ 0
64π t
where
g the gravitational acceleration constant
ρ the density of seawater
Ht and Tt denote the significant wave height and mean wave period
2
Yt+k ∼ ln N (µ̂t+k|t , σ̂t+k|t )
where µ̂t+k|t and σ̂t+k|t are forecasts of the location and scale parameters.
2
Note that this is equivalent to ln(Yt+k ) ∼ N (µ̂t+k|t , σ̂t+k|t )
28 / 67
Dynamic models for location and scale
Define x = ln(y ) the log of the wave energy flux
A simple model for the location parameter µ̂t+k|t is
µ̂t+k|t = θ >
t zt = θ0 + θ1 x̃t+k|t + θ2 xt + . . . + θl+2 xt−l
with
>
θt = [θ0 , θ1 , θ2 , . . . , θl+2 ]
>
zt = 1, x̃t+k|t , xt , . . . , xt−l
and
xt−i : most recent measurements
x̃t+k|t : Raw forecasts from ECMWF WAM
p
σ̂t+k|t := βt
that is, a constant.
29 / 67
Illustration of probabilistic forecasts
30 / 67
Reliability of probabilistic wave energy forecasts
The log-Normal assumption is fine for some buoys, but not for others...
31 / 67
Skill of probabilistic wave energy forecasts
Note the two different benchmarks that are climatology and persistence
32 / 67
Skill of probabilistic wave energy forecasts
Very high skill is attained, though looking different depending upon the chosen
score...
33 / 67
Bye bye Kauai...!
Lets move on to wind energy and some other exotic case studies...
34 / 67
Horns Rev, Denmark
35 / 67
The wind energy example
36 / 67
Beta or not Beta?
(from Wikipedia)
Write {xt } the original time-series, and {yt } the transformed one
xtν
yt = γ(xt ; ν) = ln , xt ∈ (0, 1)
1 − xtν
−1/ν
xt = γ −1 (yt ; ν) = {1 + exp(yt )} , yt ∈ R
ν is a shape parameter
38 / 67
The GL transformation (2)
For wind power predictive densities, the model linking mean and standard
deviation (µX and σX ) could be
σX = αµνX (1 − µνX )
39 / 67
Why the GL transform?
40 / 67
The GL-Normal variable
The interest of this proposal is that one can first transform wind power data
and then work in a Gaussian framework!
41 / 67
The form of predictive densities
It is fair to write the predictive density for the wind power generation Xt+k at
time t + k as
0 0 1 2 1
Xt+k ∼ ωt+k δ0 + (1 − ωt+k − ωt+k )Lν (µt+k , σt+k ) + ωt+k δ1
This means that we ‘simply’ use censored Normal distributions, and that our
complex predictive densities are summarized by 2 parameters only (location
and scale)
42 / 67
Autoregressive dynamics
43 / 67
Conditional Parametric AR dynamics
If considering CP-AR dynamics instead, the model for the location parameter
µt+k is changed to
µt+k = θ(ωt )> ỹt
where
θ(ωt ) = [θ0 (ωt ) θ1 (ωt ) . . . θl+1 (ωt )]>
In parallel the model for the scale parameter then becomes
2
σt+k = α(ωt )
Local kernel regression is used for obtaining local linear models (at a number
of fitting points)
A similar estimation framework is employed for adaptive estimation of local
linear models
An obvious candidate for ω is measured wind direction
44 / 67
Application to offshore wind power data
Setup:
first 3 months for model identification and batch estimation
optimal models: 3 lags and an intercept
Evaluation is carried out on the remaining of the dataset
45 / 67
Point forecast evaluation
NMAE
Month Jun. Jul. Aug. Sep. Oct. Nov. Dec. Jan. All
Persistence 2.40 2.43 2.50 2.50 2.40 2.59 2.61 2.47 2.48
AR 2.34 2.42 2.48 2.43 2.33 2.64 2.63 2.45 2.45
GL-Normal AR 2.32 2.35 2.43 2.38 2.28 2.47 2.54 2.41 2.38
GL-Normal CP-AR 2.31 2.33 2.42 2.37 2.25 2.45 2.50 2.38 2.36
NRMSE
Month Jun. Jul. Aug. Sep. Oct. Nov. Dec. Jan. All
Persistence 3.98 4.68 4.47 4.66 4.17 6.23 4.76 4.28 4.70
AR 3.80 4.56 4.36 4.46 3.97 6.00 4.62 4.20 4.54
GL-Normal AR 3.81 4.57 4.32 4.43 3.96 5.87 4.55 4.12 4.50
GL-Normal CP-AR 3.79 4.54 4.30 4.43 3.93 5.83 4.52 4.09 4.47
46 / 67
Probabilistic forecast evaluation
The reliability of
GL-Normal
distributions is more
acceptable than that of
Normal and truncated
Normal distributions...
(and of Beta!)
CRPS
Month Jun. Jul. Aug. Sep. Oct. Nov. Dec. Jan. All
Normal AR 1.85 2.01 1.99 1.98 1.87 2.31 2.23 1.97 2.01
GL-Normal AR 1.76 1.80 1.86 1.86 1.81 1.93 1.92 2.03 1.86
GL-Normal CP-AR 1.73 1.78 1.82 1.81 1.71 1.94 1.88 1.79 1.80
47 / 67
Episode
48 / 67
A look at model coefficients
49 / 67
5 Nonparametric approaches to probabilistic forecasting
50 / 67
Ensemble forecasting
Ensemble predictions of
wind power can be
obtained:
from meteorological
ensembles as input...
... consequently
converted to power with
appropriate statistical
methods
51 / 67
Probabilistic forecasts from ensembles
It is known that ensemble forecasts of wind power are not correct in a
probabilistic sense, i.e. they are not reliable
They can be converted to reliable probabilistic forecasts with appropriate
recalibration methods (e.g. nonparametric conditional models, smoothed
bootstrap, bayesian model averaging)
Example from a real application: probabilistic forecast from ensembles for the area
of Jutland (western Denmark)
52 / 67
Application: the case-study
Power measurements:
available raw data are from 16th February 2005 to 25th January 2006
Over 8200 hours, 73.4% of hourly averages are considered as valid
all power values are normalized by the wind farm nominal capacity
Meteorological ensembles:
MSPEPS from Weprog, available over the same period
originally issued every 6 hours
framed here in order to simulate real world application for which forecasts may be
delivered every hour...
... and also to increase the size of the dataset !!
53 / 67
Input meteorological ensembles
Downscaled at the
location of interest
Wind speed,
direction, and
other
meteorological
variables available
at various vertical
levels
In the present exercise, only wind speed at turbine nacelle level (app. 105m a.g.l.) is
used as input...
54 / 67
Setup for power curve modeling
A unique power curve for the wind farm, between the best available met forecasts
and power measurements
The power curve model for each horizon is based on local linear regression, and
adaptively estimated with an orthogonal fitting methods
55 / 67
Conversion of ensembles
At a given time t and for horizon k, the wind power forecast given by the j th
member is
(j) (j)
ŷt+k|t = ĝt,k (x̂t+k|t )
while the single point forecast that can extracted from the set of ensemble members
is chosen to be its mean, i.e.
m
1 X (j)
ŷt+k|t = ŷ
m j=1 t+k|t
56 / 67
Forecast performance of ensembles
NMAE
NRMSE
57 / 67
The wind power ensembles
Ensemble
forecasts
of wind
power for
Horns Rev
58 / 67
Kernel dressing of ensemble forecasts
m
(j)
X
fˆt+k|t (y ) = wj fˆt+k|t (y )
j=1
with
m
X
wj = 1, wj = 1/m
j=1
59 / 67
Specifics of the wind power application
61 / 67
Now close your eyes(!)
t−k
1 X t−k−i
St,k (ν) = − λ ln (ui (ν))
nλ i=1
where
λ, λ ∈ (0, 1] is the forgetting factor,
ui (ν) = P[yi |ν] = fˆi|i−k (yi ) is the likelihood of the measurements given the
probabilistic forecasts
62 / 67
Application results
63 / 67
Reliability evaluation
k = 12 k = 24
Some comments:
Reliability is significantly improved after kernel dressing
There seems to be a remaining (slight) probabilistic bias...
64 / 67
Overall skill evaluation
Climatology-based
predictive distributions
are only slightly better
than not uniform ones !!
Ensemble based ones
have significantly higher
skill
Some comments:
High overall skill comes from acceptable reliability + maximized resolution
Maximum skill is for horizons between 5- and 10-hour ahead
65 / 67
Final remarks
66 / 67
Thanks for your attention!
67 / 67