stant bias correction term over the whole domain. Model biases are likely to differ by location, and there are newer methods that account
for this (Gel, 2007; Mass et al., 2008; Kleiber et al., in press).
Figure 1: BMA predictive distributions for temperature (in degrees Kelvin) valid January 31, 2004 (left) and for precip-
itation (in hundredths of an inch) valid January 15, 2003 (right), at Port Angeles, Washington at 4PM local time, based
on the eight-member University of Washington Mesoscale Ensemble (Grimit and Mass, 2002; Eckel and Mass, 2005). The
thick black curve is the BMA PDF, while the colored curves are the weighted PDFs of the constituent ensemble members.
The thin vertical black line is the median of the BMA PDF (occurs at or near the mode in the temperature plot), and the
dashed vertical lines represent the 10th and 90th percentiles. The orange vertical line is at the verifying observation. In the
precipitation plot (right), the thick vertical black line at zero shows the point mass probability of no precipitation (47%).
The densities for positive precipitation amounts have been rescaled, so that the maximum of the thick black BMA PDF
agrees with the probability of precipitation (53%).
Figure 2: Image plots of the BMA median forecast for surface temperature and BMA probability of freezing over the
Pacific Northwest, valid January 31, 2004.
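As the Figure 1 caption describes, a BMA predictive distribution for precipitation combines a point mass at zero with a density for positive amounts. The following Python sketch evaluates such a mixed predictive CDF; an exponential density stands in for the fitted gamma components, and all parameter values are hypothetical, not from the package:

```python
import math

def mixed_cdf(y, p_zero, scale):
    """CDF of a predictive distribution with a point mass at zero and an
    exponential density (a stand-in for the fitted gamma) for positive amounts."""
    if y < 0:
        return 0.0
    # point mass at zero plus the continuous part up to y
    return p_zero + (1.0 - p_zero) * (1.0 - math.exp(-y / scale))

# hypothetical mixture: probability of no precipitation = 0.47 (as in Figure 1)
p0, scale = 0.47, 1.5
prob_precip = 1.0 - mixed_cdf(0.0, p0, scale)   # P(precip > 0), i.e. 0.53
prob_above = 1.0 - mixed_cdf(0.25, p0, scale)   # P(precip > 0.25)
```

Exceedance probabilities such as the probability of precipitation are then just one minus the mixed CDF at the threshold, mirroring the `1 - cdf(...)` idiom used in ensembleBMA.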
mass at zero that apply to the cube root transformation of the ensemble forecasts and observed data. A rolling training period of 30 days is used. The dataset used to obtain prcpFit is not included in the package on account of its size. However, the corresponding "ensembleData" object can be constructed in the same way as illustrated for the surface temperature data, and the modeling process is also analogous, except that the "gamma0" model for quantitative precipitation is used in lieu of the "normal" model.

The prcpGrid dataset contains gridded ensemble forecasts of daily precipitation accumulation in the same region as that of prcpFit, initialized January 13, 2003 and valid January 15, 2003. The BMA median and upper bound (90th percentile) forecasts can be obtained and plotted as follows:

  data(prcpFit)

  prcpGridForc <- quantileForecast(
      prcpFit, prcpGridData, date = "20030115",
      q = c(0.5, 0.9))

Here prcpGridData is an "ensembleData" object created from the prcpGrid dataset. The 90th percentile plot is shown in Figure 3. The probability of precipitation and the probability of precipitation above .25 inches can be obtained as follows:

  probPrecip <- 1 - cdf(prcpFit, prcpGridData,
      date = "20030115", values = c(0, 25))

The plot for the BMA forecast probability of precipitation accumulation exceeding .25 inches is also shown in Figure 3.

Verification

The ensembleBMA functions for verification can be used whenever observed weather conditions are available. Included are functions to compute verification rank and probability integral transform histograms, the mean absolute error, continuous ranked probability score, and Brier score.

Mean absolute error, continuous ranked probability score, and Brier score. In the previous section, we obtained a gridded BMA forecast of surface temperature valid January 31, 2004 from the srft dataset. To obtain forecasts at station locations, we apply the function quantileForecast to the model fit srftFit. The BMA quantile forecasts can be plotted together with the observed weather conditions using the function plotProbcast, as shown in Figure 4. Here the R core function loess was used to interpolate from the station locations to a grid for surface plotting. It is also possible to request image, perspective, or contour plots.

The mean absolute error (MAE) and mean continuous ranked probability score (CRPS; e.g., Gneiting and Raftery, 2007) can be computed with the functions CRPS and MAE:

  CRPS(srftFit, srftData)
  # ensemble      BMA
  # 1.945544 1.490496

  MAE(srftFit, srftData)
  # ensemble      BMA
  # 2.164045 2.042603

The function MAE computes the mean absolute difference between the ensemble or BMA median forecast[2] and the observation. The BMA CRPS is obtained via Monte Carlo simulation and thus may vary slightly in replications. Here we compute these measures from forecasts valid on a single date; more typically, the CRPS and MAE would be computed from a range of dates and the corresponding predictive models.

Brier scores (see e.g., Jolliffe and Stephenson, 2003; Gneiting and Raftery, 2007) for probability forecasts of the binary event of exceeding an arbitrary precipitation threshold can be computed with the function brierScore.

Assessing calibration. Calibration refers to the statistical consistency between the predictive distributions and the observations (Gneiting et al., 2007). The verification rank histogram is used to assess calibration for an ensemble forecast, while the probability integral transform (PIT) histogram assesses calibration for predictive PDFs, such as the BMA forecast distributions.

The verification rank histogram plots the rank of each observation relative to the combined set of the ensemble members and the observation. Thus, it equals one plus the number of ensemble members that are smaller than the observation. The histogram allows for the visual assessment of the calibration of the ensemble forecast (Hamill, 2001). If the observation and the ensemble members are exchangeable, all ranks are equally likely, and so deviations from uniformity suggest departures from calibration. We illustrate this with the srft dataset, starting at January 30, 2004:

  srftVerifRank <- verifRank(
      ensembleForecasts(srftData[use,]),
      ensembleVerifObs(srftData[use,]))

[2] Raftery et al. (2005) employ the BMA predictive mean rather than the predictive median.
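Both scores are straightforward to state outside the package as well. The sketch below estimates the CRPS of a predictive distribution by Monte Carlo, CRPS(F, y) = E|X − y| − ½ E|X − X′| with X, X′ independent draws from F (the same estimator ensembleBMA uses, hence the slight variation across replications), together with a plain MAE. The normal predictive PDF and all numbers are hypothetical:

```python
import random
import statistics

def crps_mc(sample_fn, obs, n=20000, seed=1):
    """Monte Carlo CRPS estimate: E|X - obs| - 0.5 * E|X - X'|,
    with X, X' independent draws from the predictive distribution."""
    rng = random.Random(seed)
    x1 = [sample_fn(rng) for _ in range(n)]
    x2 = [sample_fn(rng) for _ in range(n)]
    term1 = statistics.fmean(abs(xi - obs) for xi in x1)
    term2 = statistics.fmean(abs(a - b) for a, b in zip(x1, x2))
    return term1 - 0.5 * term2

def mae(forecasts, observations):
    """Mean absolute difference between (median) forecasts and observations."""
    return statistics.fmean(abs(f - y) for f, y in zip(forecasts, observations))

# hypothetical N(280, 2) predictive PDF, verifying observation 281 K
score = crps_mc(lambda rng: rng.gauss(280.0, 2.0), 281.0)
```

For a normal predictive distribution the CRPS also has a closed form, which the Monte Carlo estimate approaches as n grows; the sampling approach generalizes to mixtures like the BMA PDFs.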
Figure 3: Image plots of the BMA upper bound (90th percentile) forecast of precipitation accumulation (in hundredths of
an inch), and the BMA probability of precipitation exceeding .25 inches over the Pacific Northwest, valid January 15, 2003.
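For exceedance probabilities like those shown in Figure 3, the Brier score is the mean squared difference between the forecast probability and the 0/1 outcome. A minimal sketch, with hypothetical probabilities and observed exceedances rather than package output:

```python
def brier_score(probs, outcomes):
    """Brier score for probability forecasts of a binary event:
    mean squared difference between forecast probability and the
    0/1 outcome (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# hypothetical forecast probabilities of precipitation exceeding .25 inches,
# paired with whether the threshold was actually exceeded (1) or not (0)
probs = [0.9, 0.1, 0.53, 0.7]
outcomes = [1, 0, 0, 1]
bs = brier_score(probs, outcomes)
```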
Figure 4: Contour plots of the BMA median forecast of surface temperature and verifying observations at station locations
in the Pacific Northwest, valid January 31, 2004 (srft dataset). The plots use loess fits to the forecasts and observations at
the station locations, which are interpolated to a plotting grid. The dots represent the 715 observation sites.
Figure 5: Verification rank histogram for ensemble forecasts, and PIT histogram for BMA forecast PDFs for surface tem-
perature over the Pacific Northwest in the srft dataset valid from January 30, 2004 to February 28, 2004. More uniform
histograms correspond to better calibration.
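The two quantities behind these histograms are one-liners: the verification rank is one plus the number of ensemble members smaller than the observation, and the PIT is the predictive CDF evaluated at the observation. A Python sketch with a hypothetical 8-member ensemble and a hypothetical predictive CDF:

```python
def verif_rank(ensemble, obs):
    """Verification rank: one plus the number of ensemble members
    smaller than the observation."""
    return 1 + sum(1 for member in ensemble if member < obs)

def pit(predictive_cdf, obs):
    """Probability integral transform: the predictive CDF
    evaluated at the verifying observation."""
    return predictive_cdf(obs)

# hypothetical 8-member ensemble of temperatures (K)
ens = [278.1, 279.4, 280.0, 280.7, 281.2, 281.9, 282.5, 283.0]
rank = verif_rank(ens, 281.0)   # four members fall below 281.0, so rank 5

# hypothetical uniform predictive CDF on [278, 284]
p = pit(lambda y: min(max((y - 278.0) / 6.0, 0.0), 1.0), 281.0)
```

Collecting these values over many forecast cases and plotting their histograms gives exactly the displays of Figure 5: flat histograms indicate calibration, a U shape indicates underdispersion.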
The resulting rank histogram composites ranks spatially and is shown in Figure 5. The U shape indicates a lack of calibration, in that the ensemble forecast is underdispersed.

The PIT is the value that the predictive cumulative distribution function attains at the observation, and is a continuous analog of the verification rank. The function pit computes it. The PIT histogram allows for the visual assessment of calibration and is interpreted in the same way as the verification rank histogram. We illustrate this on BMA forecasts of surface temperature obtained for the entire srft dataset using a 25 day training period (forecasts begin on January 30, 2004 and end on February 28, 2004):

  srftFitALL <- ensembleBMA(srftData,
      trainingDays = 25)

  srftPIT <- pit(srftFitALL, srftData)

  hist(srftPIT, breaks = (0:(k+1))/(k+1),
      xlab = "", xaxt = "n", prob = TRUE,
      main = "Probability Integral Transform")
  axis(1, at = seq(0, to = 1, by = .2),
      labels = (0:5)/5)
  abline(h = 1, lty = 2)

The ProbForecastGOP package

The ProbForecastGOP package (Berrocal et al., 2010) generates probabilistic forecasts of entire weather fields using the geostatistical output perturbation (GOP) method of Gel et al. (2004). The package contains functions for the GOP method, a wrapper function named ProbForecastGOP, and a plotting utility named plotfields. More detailed information can be found in the PDF document installed in the package directory at ‘ProbForecastGOP/docs/vignette.pdf’ or in the help files.

The GOP method uses geostatistical methods to produce probabilistic forecasts of entire weather fields for temperature and pressure, based on a single numerical weather forecast on a spatial grid. The method involves three steps:

• an initial step in which linear regression is used for bias correction, and the empirical variogram of the forecast errors is computed;

• an estimation step in which the weighted least squares method is used to fit a parametric model to the empirical variogram; and

• a forecasting step in which a statistical ensemble of weather field forecasts is generated by simulating sample paths of the fitted error field and adding them to the bias-corrected numerical forecast.
Empirical variogram

In the first step of the GOP method, the empirical variogram of the forecast errors is found. Variograms are used in spatial statistics to characterize variability in spatial data. The empirical variogram plots one-half the mean squared difference between paired observations against the distance separating the corresponding sites. Distances are usually binned into intervals, whose midpoints are used to represent classes.

In ProbForecastGOP, four functions compute empirical variograms. Two of them, avg.variog and avg.variog.dir, composite forecast errors over temporal replications, while the other two, Emp.variog and EmpDir.variog, average forecast errors temporally, and only then compute variograms. Alternatively, one can use the wrapper function ProbForecastGOP with argument out = "VARIOG".

Parameter estimation

The second step in the GOP method consists of fitting a parametric model to the empirical variogram of the forecast errors. This is done by the Variog.fit function using the weighted least squares approach. Alternatively, the wrapper function ProbForecastGOP with entry out set equal to "FIT" can be employed. Figure 6 illustrates the result for forecasts of surface temperature over the Pacific Northwest in January through June 2000.

Generating ensemble members

The final step in the GOP method involves generating multiple realizations of the forecast weather field. Each realization provides a member of a statistical ensemble of weather field forecasts, and is obtained by simulating a sample path of the fitted error field and adding it to the bias-adjusted numerical forecast field.

This is achieved by the function Field.sim, or by the wrapper function ProbForecastGOP with the entry out set equal to "SIM". Both options depend on the GaussRF function in the RandomFields package, which simulates Gaussian random fields (Schlather, 2011).

The output of the functions is both numerical and graphical, in that they return members of the forecast field ensemble, quantile forecasts at individual locations, and plots of the ensemble members and quantile forecasts. The plots are created using the image.plot, US and world functions in the fields package (Furrer et al., 2010). As an illustration, Figure 7 shows a member of a forecast field ensemble of surface temperature over the Pacific Northwest valid January 12, 2002.
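The empirical variogram computation described above — half the mean squared difference between paired forecast errors, grouped into distance bins whose midpoints represent the classes — can be sketched in a few lines of Python. The station coordinates and errors here are hypothetical, and this omits the temporal compositing that the package's variogram functions perform:

```python
import math
from collections import defaultdict

def empirical_variogram(coords, errors, bin_width):
    """Empirical variogram: half the mean squared difference between
    paired forecast errors, binned by inter-site distance; keys are
    the bin midpoints representing the distance classes."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = int(d // bin_width)
            sums[b] += 0.5 * (errors[i] - errors[j]) ** 2
            counts[b] += 1
    return {(b + 0.5) * bin_width: sums[b] / counts[b] for b in sorted(sums)}

# hypothetical station coordinates and forecast errors
coords = [(0, 0), (1, 0), (0, 1), (3, 4)]
errors = [0.2, -0.1, 0.4, 1.0]
vg = empirical_variogram(coords, errors, bin_width=2.0)
```

Fitting a parametric model (such as the exponential model of Figure 6) then amounts to weighted least squares over these binned semivariances, which is what Variog.fit does.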
Figure 6: Empirical variogram of temperature forecast errors and fitted exponential variogram model.

Figure 7: A member of a 48-hour ahead forecast field ensemble of surface temperature (in degrees Celsius) over the Pacific Northwest valid January 12, 2002 at 4PM local time.
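The simulation step can also be sketched directly: draw a Gaussian error field with the covariance implied by a fitted exponential variogram and add it to the bias-corrected forecast. This toy version uses a hand-rolled Cholesky factor in place of GaussRF, and all parameter values and grid points are hypothetical:

```python
import math
import random

def exp_cov(d, sill=1.0, range_par=2.0, nugget=0.1):
    """Covariance implied by an exponential variogram model;
    the nugget contributes only at distance zero."""
    c = sill * math.exp(-d / range_par)
    return c + nugget if d == 0 else c

def cholesky(a):
    """Plain Cholesky factorization of a symmetric positive definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / L[j][j]
    return L

def simulate_member(coords, bias_corrected, seed=0):
    """One GOP-style ensemble member: a sample path of the error field
    (correlated Gaussian noise) added to the bias-corrected forecast."""
    rng = random.Random(seed)
    n = len(coords)
    C = [[exp_cov(math.dist(p, q)) for q in coords] for p in coords]
    L = cholesky(C)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    err = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [f + e for f, e in zip(bias_corrected, err)]

coords = [(0, 0), (1, 0), (0, 1)]            # hypothetical grid points
member = simulate_member(coords, [280.0, 280.5, 279.8])
```

Repeating the simulation with independent noise draws yields the statistical ensemble of weather field forecasts from which quantile forecasts at individual locations are read off.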
D. James and K. Hornik. chron: Chronological objects which can handle dates and times, 2010. URL http://CRAN.R-project.org/package=chron. R package version 2.3-39. S original by David James, R port by Kurt Hornik.

I. T. Jolliffe and D. B. Stephenson, editors. Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley, 2003.

W. Kleiber, A. E. Raftery, J. Baars, T. Gneiting, C. Mass, and E. P. Grimit. Locally calibrated probabilistic temperature forecasting using geostatistical model averaging and local Bayesian model averaging. Monthly Weather Review, 2011 (in press).

C. F. Mass, J. Baars, G. Wedam, E. P. Grimit, and R. Steed. Removal of systematic model bias on a model grid. Weather and Forecasting, 23:438–459, 2008.

A. E. Raftery, T. Gneiting, F. Balabdaoui, and M. Polakowski. Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review, 133:1155–1174, 2005.

M. Schlather. RandomFields: Simulation and analysis tools for random fields, 2011. URL http://CRAN.R-project.org/package=RandomFields. R package version 2.0.45.

J. M. Sloughter, A. E. Raftery, T. Gneiting, and C. Fraley. Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review, 135:3209–3220, 2007.

J. M. Sloughter, T. Gneiting, and A. E. Raftery. Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. Journal of the American Statistical Association, 105:25–35, 2010.

L. J. Wilson, S. Beauregard, A. E. Raftery, and R. Verret. Calibrated surface temperature forecasts from the Canadian ensemble prediction system using Bayesian model averaging. Monthly Weather Review, 135:1364–1385, 2007.

Chris Fraley
Department of Statistics
University of Washington
Seattle, WA 98195-4322 USA
fraley@stat.washington.edu

Adrian Raftery
Department of Statistics
University of Washington
Seattle, WA 98195-4322 USA
raftery@u.washington.edu

Tilmann Gneiting
Institute of Applied Mathematics
Heidelberg University
69120 Heidelberg, Germany
t.gneiting@uni-heidelberg.de

McLean Sloughter
Department of Mathematics
Seattle University
Seattle, WA 98122 USA
sloughtj@seattleu.edu

Veronica Berrocal
School of Public Health
University of Michigan
1415 Washington Heights
Ann Arbor, MI 48109-2029 USA
berrocal@umich.edu