
First and second order Markov chain models for synthetic generation of
wind speed time series

Article · April 2005


DOI: 10.1016/j.energy.2004.05.026

Energy 30 (2005) 693–708
www.elsevier.com/locate/energy

A. Shamshad*, M.A. Bawadi, W.M.A. Wan Hussin, T.A. Majid, S.A.M. Sanusi
School of Civil Engineering, University of Science Malaysia, Engineering Campus,
14300 Nibong Tebal, Pulau Pinang, Malaysia
Received 22 January 2003

Abstract
Hourly wind speed time series data of two meteorological stations in Malaysia have been used for stochastic
generation of wind speed data using the transition matrix approach of the Markov chain process. The transition
probability matrices have been formed using two different approaches: the first approach involves the use of the
first order transition probability matrix of a Markov chain, and the second involves the use of a second order
transition probability matrix that uses the current and preceding values to describe the next wind speed value. The
algorithm to generate the wind speed time series from the transition probability matrices is described. Uniform
random number generators have been used for transition between successive time states and within state wind
speed values. The ability of each approach to retain the statistical properties of the generated speed is compared
with the observed ones. The main statistical properties used for this purpose are mean, standard deviation, median,
percentiles, Weibull distribution parameters, autocorrelations and spectral density of wind speed values. The
comparison of the observed wind speed and the synthetically generated ones shows that the statistical
characteristics are satisfactorily preserved.
© 2004 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +604-717-6064; fax: +604-594-1009.
E-mail address: ceshams@eng.usm.my (A. Shamshad).
0360-5442/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.energy.2004.05.026

1. Introduction

The increasing demand for energy, growing environmental concern and rapidly depleting reserves
of fossil fuels have led planners and policy makers to search for ways to supplement the energy
base with renewable energy sources. Wind is one of the potential renewable energy sources. In Malaysia,
a large amount of hourly wind speed data is being collected at several locations by Malaysian
Meteorological Stations. Designing a proper wind energy system requires the prediction of wind speed
statistical parameters [1]. In addition, wind energy parameters are important for designing wind
sensitive structures and for studying air pollution.
In a Markov process, the probability of a given state at a given moment may be deduced
from information about the preceding states. A Markov chain represents a system of elements
moving from one state to another over time. The order of the chain gives the number of past time steps
that influence the probability distribution of the present state, and it can be greater than one. Many
natural processes are considered to be Markov processes [2]. The transition probability matrix is the
tool for describing the behaviour of a Markov chain: each element of the matrix represents the probability
of passing from one specific state to another. The Markov chain modelling approach has frequently
been used for the synthetic generation of rainfall data. Thomas and Fiering [3] were the first to use a first
order Markov chain model to generate stream flow data. Srikanthan and McMahon [4] and Thyer and Kuczera
[5] used and recommended a first order Markov chain model to generate annual rainfall data. Shamshad
et al. [6] compared the performance of stochastic approaches for forecasting river water quality. However,
very little work has been done on the synthetic generation of wind speed data using Markov chain
models. Kaminsky et al. [7] compared alternative approaches, including first and second order Markov
chain models and an embedded Markov chain model, for the synthetic generation of wind speed time series,
using wind speed data for a short period of 8 h sampled at a rate of 3.5 Hz. They concluded that the
embedded Markov chain approach performed better than the ordinary Markov chain approaches. In recent
studies, Sahin et al. [8] and Torre et al. [9] used first order Markov chain models for synthetic
generation of hourly wind speed time series.
A first order Markov chain model is generally used for modelling and simulation of wind speed data. It
is expected that a second or higher order Markov chain model can improve the synthetically
generated wind speed data. In this paper, synthetic time series are generated by first and second order
Markov chain models using about seven years of hourly wind speed data, from 1995 to 2001. The data belong
to two meteorological stations located at Mersing and Kuantan in Malaysia and were measured at a height of
about 14 m above the ground. In order to validate and compare the performance of the models, several
statistical tests have been carried out.

2. Markov chains

Markov chains are stochastic processes that can be parameterized by empirically estimating
transition probabilities between discrete states in the observed systems [2]. The Markov chain of the
first order is one for which each subsequent state depends only on the immediately preceding one.
Markov chains of second or higher orders are the processes in which the next state depends on two or
more preceding ones.
Let $X(t)$ be a stochastic process possessing the discrete state space $S = \{1, 2, \ldots, k\}$. In general,
for a given sequence of time points $t_1 < t_2 < \cdots < t_{n-1} < t_n$, the conditional probabilities should be [10]:

$$\Pr\{X(t_n) = i_n \mid X(t_1) = i_1, \ldots, X(t_{n-1}) = i_{n-1}\} = \Pr\{X(t_n) = i_n \mid X(t_{n-1}) = i_{n-1}\} \tag{1}$$

The conditional probabilities $\Pr\{X(t) = j \mid X(s) = i\} = P_{ij}(s, t)$ are called transition probabilities of order
$r = t - s$ from state $i$ to state $j$ for all indices $0 \le s < t$, with $1 \le i, j \le k$ [2]. They are denoted as
the transition matrix $P$. For $k$ states, the first order transition matrix $P$ has a size of $k \times k$ and takes the form:

$$P = \begin{bmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,k} \\
p_{2,1} & p_{2,2} & \cdots & p_{2,k} \\
\vdots & \vdots & \ddots & \vdots \\
p_{k,1} & p_{k,2} & \cdots & p_{k,k}
\end{bmatrix}$$
The state probabilities at time $t$ can be estimated from the relative frequencies of the $k$ states. If $n_{ij}$ is the
number of transitions from state $i$ to state $j$ in the sequence of speed data, the maximum likelihood
estimate of the transition probabilities is:

$$p_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}} \tag{2}$$

Each transition probability lies between 0 and 1, and the transition probabilities in a row
sum to one. Mathematically, this can be expressed as:

$$\sum_{j=1}^{k} p_{ij} = 1 \tag{3}$$
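As a minimal sketch of Eqs. (2) and (3), the first order matrix can be estimated by counting transitions in a sequence of states and normalizing each row. The function name `first_order_matrix` and the toy sequence are illustrative assumptions and are not taken from the study.

```python
def first_order_matrix(states, k):
    """Estimate p_ij = n_ij / sum_j n_ij from a sequence of states numbered 1..k."""
    counts = [[0] * k for _ in range(k)]
    for i, j in zip(states, states[1:]):
        counts[i - 1][j - 1] += 1          # tally the transition i -> j
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left in the data gets an all-zero row.
        matrix.append([n / total if total else 0.0 for n in row])
    return counts, matrix

# Hypothetical toy sequence of states, not data from the paper.
toy_states = [1, 2, 2, 3, 2, 1, 1, 2, 3, 3, 2, 2]
counts, P = first_order_matrix(toy_states, k=3)
```

Every row of `P` that corresponds to an observed state sums to one, which is the property stated in Eq. (3).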

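A second order transition probability matrix can be shown symbolically as below:

$$P = \begin{bmatrix}
p_{1.1,1} & p_{1.1,2} & \cdots & p_{1.1,k} \\
p_{1.2,1} & p_{1.2,2} & \cdots & p_{1.2,k} \\
\vdots & \vdots & \ddots & \vdots \\
p_{1.k,1} & p_{1.k,2} & \cdots & p_{1.k,k} \\
p_{2.1,1} & p_{2.1,2} & \cdots & p_{2.1,k} \\
p_{2.2,1} & p_{2.2,2} & \cdots & p_{2.2,k} \\
\vdots & \vdots & \ddots & \vdots \\
p_{k.k,1} & p_{k.k,2} & \cdots & p_{k.k,k}
\end{bmatrix}$$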
In this matrix the probability $p_{ijk}$, the entry in row $i.j$ and column $k$, is the probability of the next
wind speed state $k$ if the current wind speed state is $j$ and the previous wind speed state was $i$. In this way
the probability of making a transition depends on both the current state and the preceding state [7]. These
matrices become the basis for generating likely future wind speeds. If the transition probability in row $i.j$
to the next state $l$ is $p_{i.j,l}$, then the cumulative probability up to state $k$ is given by

$$P_{ijk} = \sum_{l=1}^{k} p_{ijl} \tag{4}$$

This cumulative probability helps in determining the future wind speed states by using a random number
generator.
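The second order matrix and the cumulative sum of Eq. (4) can be sketched in the same way. This is only an illustration: the row ordering (1.1, 1.2, ..., k.k) follows the symbolic matrix, while the function names and the toy sequence are assumptions.

```python
from itertools import accumulate

def second_order_matrix(states, k):
    """Rows are indexed by (previous, current) state; columns by the next state."""
    counts = [[0] * k for _ in range(k * k)]
    for prev, cur, nxt in zip(states, states[1:], states[2:]):
        row = (prev - 1) * k + (cur - 1)   # position of row "prev.cur"
        counts[row][nxt - 1] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Pairs that never occur in the data keep an all-zero row.
        probs.append([n / total if total else 0.0 for n in row])
    return probs

def cumulative_rows(probs):
    """Running sum along each row, as in Eq. (4)."""
    return [list(accumulate(row)) for row in probs]

# Hypothetical toy sequence of states, not data from the paper.
toy_states = [1, 2, 2, 3, 2, 1, 1, 2, 3, 3, 2, 2]
P2 = second_order_matrix(toy_states, k=3)
C2 = cumulative_rows(P2)
```

With k states the matrix has k² rows, so a short record leaves many (previous, current) pairs unobserved; those rows stay at zero and cannot be used for generation.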

3. Formation of transition matrices

The analysis of the hourly wind speed data has been carried out in two ways, using (i) a first order
Markov chain model and (ii) a second order Markov chain model. Initially, the wind speed time series
were converted to wind speed states, each of which contains the wind speeds between certain values. Based on
visual examination of the histogram of the wind speed data, the wind speed states have been adopted
with a difference of 1 m/s between the upper and lower limits. The first state starts with zero as
the lower limit. For the wind speed time series at Mersing, the transition probability matrix
(12 × 12) for the first order Markov chain model is shown in Table 1. The second order transition probability
matrix is of size 144 × 12 and is partly shown in Table 2.

Table 1
Probability transition matrix of first order for wind speed time series at Mersing

P =
0.371  0.407  0.174  0.036  0.009  0.002  0.001  0.001  0.000  0.000  0.000  0.000
0.166  0.446  0.313  0.059  0.012  0.004  0.000  0.001  0.000  0.000  0.000  0.000
0.051  0.243  0.504  0.163  0.028  0.008  0.002  0.001  0.000  0.000  0.000  0.000
0.017  0.083  0.304  0.390  0.160  0.035  0.008  0.002  0.000  0.000  0.000  0.000
0.010  0.034  0.099  0.277  0.381  0.158  0.031  0.007  0.001  0.001  0.000  0.000
0.006  0.021  0.043  0.108  0.294  0.343  0.146  0.030  0.005  0.002  0.000  0.000
0.005  0.016  0.027  0.047  0.110  0.302  0.325  0.142  0.021  0.004  0.002  0.000
0.006  0.016  0.030  0.033  0.055  0.127  0.365  0.239  0.105  0.022  0.002  0.000
0.009  0.019  0.014  0.019  0.042  0.065  0.140  0.326  0.270  0.079  0.014  0.005
0.014  0.055  0.055  0.014  0.027  0.027  0.041  0.205  0.288  0.164  0.082  0.027
0.000  0.000  0.000  0.040  0.000  0.000  0.080  0.120  0.160  0.240  0.280  0.080
0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.200  0.000  0.200  0.600  0.000

Table 2
Probability transition matrix of second order for wind speed time series at Mersing (few rows of 144 × 12 matrix)

P =
0.417  0.403  0.144  0.027  0.006  0.000  0.001  0.000  0.000  0.000  0.000  0.000
0.184  0.438  0.302  0.066  0.008  0.003  0.000  0.000  0.000  0.000  0.000  0.000
0.078  0.250  0.442  0.186  0.037  0.003  0.003  0.000  0.000  0.000  0.000  0.000
0.033  0.114  0.333  0.352  0.129  0.033  0.000  0.005  0.000  0.000  0.000  0.000
0.040  0.160  0.340  0.260  0.140  0.040  0.020  0.000  0.000  0.000  0.000  0.000
0.000  0.455  0.182  0.273  0.091  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.400  0.600  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.333  0.000  0.333  0.000  0.333  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.000  1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
0.354  0.424  0.179  0.032  0.008  0.002  0.000  0.000  0.000  0.000  0.000  0.000
0.168  0.480  0.295  0.044  0.010  0.003  0.000  0.001  0.000  0.000  0.000  0.000
...

In the first order matrix (Table 1), each element shows the probability of the next wind speed state
based on the current wind speed state. It reveals that the highest probabilities occur on or close to the
diagonal of the matrix. Thus, if the current wind speed is known, it is most likely that the next wind speed
will be in the same or an adjacent category. Furthermore, the appreciable transition probabilities are
clustered around the diagonal, which means that transitions from one state to a far distant state are rare.
By examining Table 2 in blocks of 12 × 12, it can be seen that the highest probabilities again lie on or close
to the diagonal. Therefore, if the current and the preceding wind speeds are known,
it is most probable that the next wind speed will be in the same or an adjacent category.
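The conversion of wind speeds to states described in this section can be sketched as follows. The text specifies 1 m/s wide states starting at zero; the treatment of values that fall exactly on a boundary and of speeds beyond the last state is not stated, so the choices below (lower limit inclusive, highest state open-ended) are assumptions, as is the function name.

```python
def speed_to_state(speed, width=1.0, n_states=12):
    """Map a wind speed in m/s to a state index 1..n_states using equal-width bins."""
    state = int(speed // width) + 1        # [0, 1) -> 1, [1, 2) -> 2, ...
    # Assumption: speeds beyond the last bin are assigned to the last state.
    return min(state, n_states)

# Hypothetical wind speeds in m/s, not data from the paper.
toy_speeds = [0.0, 0.4, 1.0, 2.5, 3.9, 11.6, 15.2]
toy_states = [speed_to_state(v) for v in toy_speeds]
```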
For evaluating the validity of the Markov chain for wind speed hourly data, the following properties
of the Markov chains have been tested [9,11].

3.1. Dependency test

The Markov chain properties can be tested statistically by checking whether the successive events are
independent of or dependent on each other. They form a Markov chain if they are dependent [9,10]. If
successive events are independent, the statistic $\alpha$, defined by

$$\alpha = 2 \sum_{i,j}^{k} n_{ij} \ln\frac{p_{ij}}{p_j} \tag{5}$$

is distributed asymptotically as $\chi^2$ with $(k-1)^2$ degrees of freedom (DF), where $k$ is the total number
of states. The marginal probabilities $p_j$ for the $j$th column of the transition probability matrix are given as

$$p_j = \frac{\sum_{i}^{m} n_{ij}}{\sum_{i,j}^{m} n_{ij}} \tag{6}$$

where $n_{ij}$ is the frequency with which state $i$ is followed by state $j$. The tests have been carried out on the
whole series as well as on seven intervals (one for each year from 1995 to 2001) into which the wind speed data
at both stations, Mersing and Kuantan, were divided. The total number of states ($k$) taken for the wind speed
data at Mersing and Kuantan was 12. However, in some cases, the value of $k$ was reduced to 11 or 10 to avoid
zero values of $p_{ij}$. The values of the test statistic $\alpha$ for both stations for the first order Markov chain
are presented in Table 3. The values of $\alpha$ are higher in all cases than the $\chi^2$ value of 147.7 at the 5% level
with 121 degrees of freedom. Therefore, the null hypothesis that the successive transitions are independent is
rejected, and it is concluded that the transition of hourly wind speed has the first order Markov chain property.
Similar trends have been observed for the second order Markov chain for almost all the data series at both stations.

Table 3
The values of α for the dependency test for each interval of the data series
Data series   Value of α
              Mersing    Kuantan
1995–2001     49,069.0   41,404.3
1995          7918.7     5716.6
1996          7898.0     5917.0
1997          7130.0     6092.0
1998          6359.6     6196.0
1999          7034.0     5727.0
2000          7587.1     5546.5
2001          5873.6     5738.0
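The dependency statistic of Eqs. (5) and (6) can be sketched from a tally matrix of transition counts. The sketch skips cells with zero counts, whereas the study reduced the number of states to avoid them; that choice, the function names and the two small tally matrices are assumptions made for illustration.

```python
import math

def dependency_statistic(counts):
    """alpha = 2 * sum_ij n_ij * ln(p_ij / p_j), computed from a tally matrix n_ij."""
    grand_total = sum(sum(row) for row in counts)
    col_totals = [sum(col) for col in zip(*counts)]
    alpha = 0.0
    for row in counts:
        row_total = sum(row)
        for n_ij, col_total in zip(row, col_totals):
            if n_ij == 0:
                continue                   # a zero cell contributes nothing to the sum
            p_ij = n_ij / row_total        # Eq. (2)
            p_j = col_total / grand_total  # Eq. (6)
            alpha += n_ij * math.log(p_ij / p_j)
    return 2.0 * alpha

def degrees_of_freedom(k):
    return (k - 1) ** 2

# Hypothetical tally matrices, not taken from the paper.
independent = [[10, 20], [20, 40]]   # every row has the same proportions
persistent = [[40, 10], [10, 40]]    # strong tendency to stay in the same state

alpha_independent = dependency_statistic(independent)
alpha_persistent = dependency_statistic(persistent)
```

For the two-state examples the test has one degree of freedom, for which the 5% critical value of χ² is 3.841; the persistent matrix exceeds it and the independent one does not.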

3.2. Temporal stationary test

A Markov process is stationary if the transition probabilities do not depend on time. A popular way to
check stationarity is to divide the whole series into a few intervals and then compute the transition
probability matrix for each interval. For the stationary condition, all the matrices should be approximately
equal to each other. Mathematically, the test statistic $\beta$ is defined as

$$\beta = 2 \sum_{t=1}^{T} \sum_{i,j}^{k} n_{ij}(t) \ln\frac{p_{ij}(t)}{p_{ij}} \tag{7}$$

where $T$ is the number of time intervals, and $n_{ij}(t)$ and $p_{ij}(t)$ are the frequencies and transition
probabilities of the $(i,j)$th element of the tally matrix and transition matrix, respectively, for the $t$th time
interval of the data series. For the Markov chain to be stationary, the statistic $\beta$ should have a $\chi^2$
distribution with $k(k-1)(T-1)$ degrees of freedom. The process is stationary at the 5% significance level if
$\beta < \chi^2$ (5%, DF).
For Mersing and Kuantan the wind speed data were divided into seven intervals as described above
(Section 3.1). To avoid zero values in the transition matrix, the number of states considered for the test was
only 10, thereby yielding a DF of 540 and a corresponding $\chi^2$ value of 595.2. The $\beta$
values for Mersing (149.9) and Kuantan (204.0) are less than the $\chi^2$ limit and, therefore, the Markov chains
are considered to be stationary.
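A sketch of the statistic in Eq. (7) is given below. It assumes that the reference matrix p_ij is estimated from the pooled tallies of all intervals, which the text does not spell out, and it skips zero cells rather than merging states. The same function applies to the spatial test of Section 3.3 if the tallies are taken per station instead of per interval.

```python
import math

def homogeneity_statistic(tallies):
    """2 * sum_t sum_ij n_ij(t) * ln(p_ij(t) / p_ij) for a list of k x k tally matrices."""
    k = len(tallies[0])
    # Assumption: the reference probabilities come from the pooled tallies.
    pooled = [[sum(t[i][j] for t in tallies) for j in range(k)] for i in range(k)]
    stat = 0.0
    for tally in tallies:
        for i in range(k):
            row_total = sum(tally[i])
            pooled_total = sum(pooled[i])
            for j in range(k):
                n = tally[i][j]
                if n == 0:
                    continue               # zero cells add nothing to the sum
                p_t = n / row_total
                p_ref = pooled[i][j] / pooled_total
                stat += n * math.log(p_t / p_ref)
    return 2.0 * stat

# Hypothetical tallies for two intervals, not data from the paper.
interval_a = [[30, 10], [5, 15]]
interval_b = [[10, 30], [15, 5]]

same = homogeneity_statistic([interval_a, interval_a])
different = homogeneity_statistic([interval_a, interval_b])
```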

3.3. Spatial stationary test

Markov chain properties are checked for spatial homogeneity if data at more than one
location are analysed. If the Markov chain properties for successive events at different locations are
homogeneous, the $\gamma$ statistic defined by

$$\gamma = 2 \sum_{s}^{S} \sum_{i,j}^{k} n_{ij}(s) \ln\frac{p_{ij}(s)}{p_{ij}} \tag{8}$$

is $\chi^2$ distributed with $(S-1) \times k \times (k-1)$ degrees of freedom, where $S$ is the number of stations.
If $\gamma < \chi^2$ (5%, DF), the process is homogeneous at the 5% significance level; otherwise it is heterogeneous.
The value of $\gamma$, 9486.8, is greater than the limiting $\chi^2$ value (113.1) for the two stations and
10 states. It is concluded that the Markov chain properties are not spatially homogeneous, so the wind
speed behaviour depends on the site.
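As a quick check of the degrees of freedom used in Sections 3.1 to 3.3, the three expressions can be evaluated for the state counts quoted in each test (12 states for the dependency test; 10 states with 7 intervals or 2 stations for the stationarity tests). The value of 90 for the spatial test is not stated in the text; it follows from the formula and corresponds to the quoted limit of 113.1.

$$
(k-1)^2 = (12-1)^2 = 121, \qquad k(k-1)(T-1) = 10 \times 9 \times 6 = 540, \qquad (S-1)\,k\,(k-1) = 1 \times 10 \times 9 = 90.
$$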

4. Synthetic generation of wind speed

The generation of synthetic values becomes easy if the elements of each row of the transition matrix are
accumulated so that they span the range from 0 to 1. The cumulative probability transition matrices, for both
the first order and the second order Markov processes, have been formed from the transition probabilities of
Eq. (2) by accumulating along each row as in Eq. (4). The cumulative probability transition matrix of the first
order Markov process for Mersing is presented in Table 4, in which each row ends with 1. Due to the very large
size of the second order cumulative probability transition matrix, it is not shown here.

Table 4
Cumulative probability transition matrix of first order for wind speed time series at Mersing

P =
0.371  0.778  0.952  0.988  0.997  0.998  0.999  1.000  1.000  1.000  1.000  1.000
0.166  0.612  0.924  0.983  0.995  0.999  0.999  1.000  1.000  1.000  1.000  1.000
0.051  0.294  0.798  0.961  0.989  0.997  0.999  1.000  1.000  1.000  1.000  1.000
0.017  0.100  0.403  0.794  0.954  0.989  0.997  0.999  1.000  1.000  1.000  1.000
0.010  0.045  0.144  0.421  0.803  0.960  0.991  0.998  0.999  1.000  1.000  1.000
0.006  0.027  0.070  0.178  0.473  0.816  0.962  0.993  0.997  1.000  1.000  1.000
0.005  0.021  0.048  0.095  0.205  0.507  0.831  0.973  0.994  0.998  1.000  1.000
0.006  0.022  0.052  0.085  0.140  0.267  0.632  0.871  0.976  0.998  1.000  1.000
0.009  0.028  0.042  0.060  0.102  0.167  0.307  0.633  0.902  0.981  0.995  1.000
0.014  0.068  0.123  0.137  0.164  0.192  0.233  0.438  0.726  0.890  0.973  1.000
0.000  0.000  0.000  0.040  0.040  0.040  0.120  0.240  0.400  0.640  0.920  1.000
0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.200  0.200  0.400  1.000  1.000
For generating the sequences of wind speed states, the initial state, say $i$, was selected randomly. Then
random values between 0 and 1 were produced by using a uniform random number generator. To obtain the next
wind speed state in the first order Markov process, the value of the random number was compared with the
elements of the $i$th row of the cumulative probability transition matrix [8]. If the random number
was greater than the cumulative probability of the previous state but less than or equal to the cumulative
probability of the following state, the following state was adopted. In the case of the second order Markov
process, the first wind speed state was also adopted randomly. However, the next wind speed state was not
searched for in the $i$th row; instead, the row was chosen on the basis of both the current and the preceding
states, where the current state is the state selected in the previous step.
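The state generation described above can be sketched as a lookup in the cumulative rows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are invented, the two-state matrices are hypothetical, and the second order generator is given an explicit starting pair because the text does not say how the second state is obtained.

```python
import bisect
import random

def next_state(cumulative_row, u):
    """Return the first state (1-based) whose cumulative probability is >= u."""
    index = bisect.bisect_left(cumulative_row, u)
    # Guard against a row whose last entry is slightly below 1 through rounding.
    return min(index, len(cumulative_row) - 1) + 1

def generate_first_order(cumulative, length, rng):
    """Generate states 1..k from a k x k cumulative matrix."""
    k = len(cumulative)
    state = rng.randint(1, k)              # initial state selected randomly
    states = [state]
    while len(states) < length:
        state = next_state(cumulative[state - 1], rng.random())
        states.append(state)
    return states

def generate_second_order(cumulative2, k, start_pair, length, rng):
    """Generate states from a (k*k) x k cumulative matrix, rows ordered 1.1, 1.2, ..., k.k."""
    prev, cur = start_pair                 # assumption: a pair that occurs in the data
    states = [prev, cur]
    while len(states) < length:
        row = cumulative2[(prev - 1) * k + (cur - 1)]
        prev, cur = cur, next_state(row, rng.random())
        states.append(cur)
    return states

# First row of Table 4 (cumulative probabilities when the current state is 1).
table4_row1 = [0.371, 0.778, 0.952, 0.988, 0.997, 0.998, 0.999, 1.0, 1.0, 1.0, 1.0, 1.0]
state_after_1 = next_state(table4_row1, 0.5)   # 0.371 < 0.5 <= 0.778

# Hypothetical two-state matrices for a quick run.
toy_first = [[0.9, 1.0], [0.2, 1.0]]
toy_second = [[0.9, 1.0], [0.3, 1.0], [0.8, 1.0], [0.1, 1.0]]
rng = random.Random(0)
seq1 = generate_first_order(toy_first, 50, rng)
seq2 = generate_second_order(toy_second, 2, (1, 1), 50, rng)
```

A binary search is used here only for convenience; a linear scan along the row gives the same state. Rows that are entirely zero, such as unobserved pairs in a second order matrix, need separate handling that this sketch does not attempt.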
To study the distribution of the wind speed data within each state, the skewness coefficients of the wind
speeds in each state were computed, as presented in Table 5. The majority of the coefficients are close to the
permissible limits for the different sample sizes [12], and so the wind speeds within the states are taken to
represent normal distributions. The wind speed states have been converted to actual wind speeds using the
following relationship

$$V = V_l + Z_i (V_r - V_l) \tag{9}$$

where $V_l$ and $V_r$ are the lower and upper wind speed boundaries of the state and $Z_i$ is a uniform random
number in (0, 1). In this manner a wind speed time series of any length can be generated. The initial 1000
values of the observed time series have been plotted in Fig. 1. A synthetic time series with the same number of
values as the observed wind speed data (61,368) was generated. The first 1000 or so synthetically generated wind
speed values from the first and second order Markov chain models are shown in Figs. 2 and 3, respectively. The
frequency of each element of the transition probability matrix of the generated data is presented for the first
order model in Table 6, together with the frequency of the corresponding element of the transition probability
matrix of the observed data. For the second order transition matrix, the sum of the frequencies of the elements
of each row for the observed and the generated wind speed data is presented in Table 7. The Markov models appear
to be quite accurate in maintaining the frequencies of the generated data.

Table 5
Coefficient of skewness for first order Markov chain data in each state

-0.028  -0.402  -0.463  -0.490  -0.598  -1.479  -0.331  0.707  -  -  -
0.082  -0.197  -0.460  -0.419  -0.499  -0.417  -0.054  0.155  -  -  -
0.420  0.514  0.507  0.092  0.595  0.278  -
    -0.390  -0.356  -0.256  -0.554
0.217  0.138  0.346  0.163  -0.832  0.631  -
    -0.113  -0.458  -0.744  -0.371
0.245  0.590  0.314  0.643  0.411  0.016  0.552  -1.165  -
    -0.248  -0.130
0.368  0.315  0.293  0.723  0.683  0.501  0.183  -0.340  0.000  0.382
    -0.063
-1.027  0.331  0.067  0.273  0.364  0.573  0.328  0.039  -0.022  -0.289  -
0.000  0.000  -0.173  -0.274  0.338  0.228  0.206  0.332  0.336  0.255  -0.707
-  -0.026  0.275  -  0.000  0.000  0.239  0.950  0.864  0.675  -0.511
-  -  -  -  -  -  0.000  -0.295  -0.833  -0.523  -0.042
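The conversion from a state back to a wind speed in Eq. (9) can be sketched as a uniform draw inside the state's interval. The sketch takes V_l as the lower and V_r as the upper boundary so that the drawn speed lies within the state, and it assumes the 1 m/s state width used in this study; the function name and the toy states are illustrative.

```python
import random

def state_to_speed(state, rng, width=1.0):
    """Draw a speed uniformly inside the state's interval: V = V_l + Z * (V_r - V_l)."""
    lower = (state - 1) * width            # V_l, lower boundary of the state
    upper = state * width                  # V_r, upper boundary of the state
    z = rng.random()                       # uniform random number in [0, 1)
    return lower + z * (upper - lower)

# Hypothetical sequence of generated states.
rng = random.Random(1)
toy_states = [1, 3, 3, 4, 12]
toy_speeds = [state_to_speed(s, rng) for s in toy_states]
```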

5. Validation of the model

In addition to the acceptance procedures described above, the synthetic wind speed time series were
thoroughly examined to determine their ability to preserve the statistical properties and to assess the
applicability of Markov chain models for wind speed generation. In this context the important statistical
properties are the general parameters (mean, standard deviation, etc.), the probability distribution and
the autocorrelation functions of the time series.

Fig. 1. Observed wind speed at Mersing (initial 1000 values only).



Fig. 2. Synthetically generated wind speed by first order Markov model at Mersing (initial 1000 values only).

5.1. General statistical parameters

In order to test the accuracy of first and second order Markov modelling approaches, the general
statistical parameters such as mean, standard deviation, minimum and maximum values and the
percentiles of the synthesized values [8,9] are presented together with the observed ones in Table 8. It is
clear from the comparison of the corresponding observed and generated parameters that the first and
second order Markov chain models are sufficient to preserve most of the parameter values. However, no
significant improvement is observed in the statistical parameters of the second order Markov chain
model as compared to the first order model.

5.2. Probability distribution of wind speed

The synthetically generated data, from the first and second order Markov chain models, have been
compared qualitatively and quantitatively with the observed values in terms of probability distribution.
For qualitative assessment, the frequency distributions of the observed and the generated time
series from the two modelling approaches have been examined. The probability distributions of wind
speed data at Mersing are shown in Fig. 4. An examination of this figure reveals that the probabilities at
different wind speeds have almost the same values for the three time series. The probability distribution of
the observed and generated wind speed is characterized by the Weibull distribution. Similar behaviour was
observed for the wind speed data collected at Kuantan.

Fig. 3. Synthetically generated wind speed by second order Markov model at Mersing (initial 1000 values only).

Table 6
Frequencies of the elements of transition matrix for observed and generated wind speed data of first order Markov model

States
         1         2         3         4         5         6         7         8         9        10        11        12
  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.
 2180 2167 2393 2391 1022 1033  210  221   50   41   11   14    5    3    3    1    1    1    0    0    0    0    0    0
 2452 2419 6602 6436 4627 4567  867  892  180  183   61   58    5    5    8   10    1    2    1    0    0    0    0    0
  980 1017 4658 4618 9657 9650 3122 3094  539  574  154  139   40   36   13   13    6    6    2    2    0    0    0    0
  171  177  845  832 3100 3090 3985 4064 1636 1674  360  376   80   93   22   17    3    3    4    7    0    0    0    0
   60   59  197  192  565  609 1586 1619 2181 2224  902  968  177  179   39   37    7    4    3    1    1    1    0    0
   18   17   66   71  134  127  337  345  917  987 1069 1068  456  451   94   91   15   15    7    3    1    1    0    0
    7    9   25   20   41   44   72   72  168  172  460  456  495  463  216  225   32   19    6    6    3    5    0    0
    4    4   10    6   19   20   21   21   35   23   81   85  232  230  152  134   67   64   14   10    1    2    0    0
    2    3    4    3    3    5    4    4    9   10   14   11   30   26   70   56   58   51   17   14    3    2    1    0
    1    0    4    3    4    4    1    1    2    5    2    1    3    4   15   13   21   12   12    6    6    8    2    0
    0    0    0    0    0    0    1    0    0    0    0    0    2    1    3    2    4    9    6    9    7   15    2    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    1    0    3    0    0    0

Ob.: observed; Gn.: synthetically generated.

Table 7
Sum of frequencies of the elements of the row for observed and generated wind speed data of second order Markov model for different sets of current and preceding states

Preceding  Current state
state              1         2         3         4         5         6         7         8         9        10        11        12
           Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.  Ob.  Gn.
 1        2108 2103 2393 2388 1022 1023  210  226   50   40   11   19    5    9    3    3    1    2    0    2    0    0    0    0
 2        2452 2443 6602 6501 4627 4629  867  842  180  162   61   28    5   41    8    4    1    6    1    1    0    1    0    0
 3         980 1010 4658 4610 9656 9529 3122 3081  539  569  154  119   40   46   13   27    6    9    2    6    0    1    0    0
 4         171  166  845  877 3100 3029 3985 4114 1636 1671  360  380   80   37   22   33    3   14    4    7    0    7    0    0
 5          60   61  197  173  565  595 1586 1606 2181 2294  902  921  177  194   39   20    7   20    3    1    1    1    0    1
 6          18   16   66   59  134  113  337  349  917  938 1069 1071  456  439   94   76   15   11    7    6    1    3    0    0
 7           7    8   25   24   41   49   72   81  168  158  460  449  495  540  216  225   32   26    6    8    3    6    0    6
 8           4    4   10   11   19   31   21   27   35   32   81   72  232  242  152  151   67   55   14    7    1    2    0    0
 9           2    3    4    6    3    6    4    4    9   15   14   18   30   27   70   69   58   63   17   19    3    2    1    2
10           1    1    4    5    4    2    1    3    2    6    2    4    3    2   15   16   21   19   12   18    6    7    2    0
11           0    1    0    3    0    1    1    2    0    2    0    0    2    1    3    6    4    6    6    6    7    9    2    2
12           0    0    0    0    0    0    0    0    0    0    0    0    0    2    1    3    0    4    1    2    3    0    0    0

Ob.: observed; Gn.: synthetically generated.

Table 8
General statistical parameters of observed and synthetically generated wind speed time series
Station  Type of wind data         Mean  Std. Dev.  Max.   Min.  Percentiles
                                                                 1     10    25    50    75    90    99
Mersing  Observed                  2.69  1.55       11.60  0.00  0.00  1.00  1.60  2.50  3.50  4.80  7.40
         Generated (first order)   2.76  1.57       11.20  0.00  0.11  1.02  1.65  2.53  3.61  4.89  7.43
         Generated (second order)  2.77  1.59       10.99  0.00  0.10  1.02  1.65  2.54  3.64  4.92  7.51
Kuantan  Observed                  1.95  1.64       15.20  0.00  0.00  0.00  0.40  1.80  3.00  4.20  6.20
         Generated (first order)   2.13  1.56       15.46  0.00  0.00  0.32  0.79  1.92  3.15  4.34  6.32
         Generated (second order)  2.13  1.59       15.55  0.00  0.00  0.31  0.78  1.90  3.14  4.33  6.37
For quantitative assessment, the Weibull distribution parameters have been computed for the
observed and the generated data. The Weibull distribution is well accepted and widely adopted in wind energy
analysis [13,14]. The Weibull distribution function is given by

$$p(V) = \frac{k}{c} \left(\frac{V}{c}\right)^{k-1} \exp\left[-\left(\frac{V}{c}\right)^{k}\right] \tag{10}$$

where $p(V)$ is the frequency or probability of occurrence of wind speed $V$, $k$ is the shape parameter that
specifies how sharp the peak of the curve is, and $c$ is the scale parameter, a weighted average speed that is
more useful in power calculations than the actual wind speed. The Weibull parameters of both stations for the
observed and the generated wind speed time series are presented in Table 9 for comparison. The results
show that both wind speed data generation methods have, in general, preserved the Weibull parameters, more
closely at Mersing than at Kuantan.

Fig. 4. Probability distribution of observed and synthetically generated wind speed at Mersing.

Table 9
Weibull parameters of observed and synthetically generated wind speed time series
Station   Observed        Generated (first order)   Generated (second order)
          k      c        k      c                  k      c
Mersing   1.916  3.111    1.777  3.090              1.778  3.101
Kuantan   1.820  2.824    1.279  2.295              1.269  2.283
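The Weibull density of Eq. (10) can be evaluated directly, as sketched below with the parameters reported for the observed Mersing series in Table 9. The function names are assumptions, and the mean formula c·Γ(1 + 1/k) is the standard result for this parametrization rather than something stated in the paper. How the authors fitted k and c is not described, so no fitting step is shown.

```python
import math

def weibull_pdf(v, k, c):
    """p(V) = (k/c) * (V/c)**(k-1) * exp(-(V/c)**k) for V >= 0, otherwise 0."""
    if v < 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_mean(k, c):
    """Mean of the Weibull distribution, c * Gamma(1 + 1/k)."""
    return c * math.gamma(1.0 + 1.0 / k)

# Shape and scale reported for the observed Mersing series in Table 9.
k_obs, c_obs = 1.916, 3.111

# Density on a 0.5 m/s grid from 0 to 30 m/s, integrated with the trapezoid rule.
step = 0.5
density = [weibull_pdf(step * i, k_obs, c_obs) for i in range(61)]
area = sum(0.5 * (a + b) * step for a, b in zip(density, density[1:]))
mean_speed = weibull_mean(k_obs, c_obs)
```

The integral over the grid should be close to one, which is a convenient sanity check on the reconstructed form of the density.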

5.3. Autocorrelation and spectral power density

To determine the persistence structure in the observed and the generated wind speed data, the
autocorrelation function was used. The autocorrelations at time lag $k$ were determined [15] using the
following equation

$$r_k = \frac{\dfrac{1}{N-k}\sum_{i=1}^{N-k}(x_i - \bar{x})(x_{i+k} - \bar{x})}{\dfrac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(x_i - \bar{x})} \tag{11}$$

where $\bar{x}$ is the mean of the wind speed time series ($x_i$, $i = 1, 2, \ldots, N$). The autocorrelations for
both stations for the observed and generated wind speed data were computed and compared. Fig. 5 shows the
variation of autocorrelation with time lag for the observed and generated wind speed data at Mersing. It can be
seen that the observed wind speed is correlated over a longer period of time than the wind speed generated by
either Markov chain model. The measured autocorrelations show periodicities, which are mainly due to
the diurnal behaviour of wind speed at Mersing (Fig. 6). There is a clear bell shaped trend in the
observed hourly mean wind speed for all the years. This is due to the effect of the solar heating balance: the
wind speeds are usually lower during the night and higher during the day [16]. Fig. 5 also reveals
that the observed wind carries longer period information than the first order and second order
synthetic Markov chains. The exponential fall of the autocorrelation shows that both methods failed to
retain the periodic behaviour of the wind speed. It is also observed that the synthetic wind speed time
series have lower autocorrelation values at the same time lag than the observed time series. The general
behaviour of the autocorrelation function of the synthetic data is almost the same for both methods.
For the initial lags, the autocorrelations of the synthetic series from the second order Markov model are
closer to the observed ones than those from the first order Markov model. The first six autocorrelation
coefficients for both measured and generated sequences are presented in Table 10 along with the root mean square
error (RMSE). The comparison shows that the modelled autocorrelation coefficients for the second order
Markov chain are quite close to the measured autocorrelations. Thus, the data generated by the second order
method perform better, because the second order model remembers more about its history than the first order
model. In a previous study by Kaminsky et al. [7], the wind speed was correlated over a longer period of time,
and it was also pointed out that the second order Markov chain slightly improved the results. However, their
study cannot be compared directly with the present study, as it was based on only 8 h of wind speed data, which
did not have any periodic/seasonal non-stationarity.

Fig. 5. Autocorrelation functions of observed and synthetically generated wind speed at Mersing.

Fig. 6. Diurnal wind speed variation for each individual year at Mersing.
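The autocorrelation of Eq. (11) and the RMSE comparison reported in Table 10 can be sketched as follows. The lagged term is taken as x_{i+k} so that the sum from 1 to N − k stays within the series; the function names and the alternating test series are assumptions, while the six Mersing autocorrelations are copied from Table 10.

```python
import math

def autocorrelation(x, lag):
    """Lag-k autocorrelation: lagged covariance over N - k terms divided by the variance."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n - lag)
    den = sum((v - mean) ** 2 for v in x) / n
    return num / den

def rmse(observed, generated):
    """Root mean square error between two equally long lists of coefficients."""
    return math.sqrt(sum((o - g) ** 2 for o, g in zip(observed, generated)) / len(observed))

# Hypothetical series with a strict two-step period.
alternating = [1.0, -1.0] * 50
r1 = autocorrelation(alternating, 1)
r2 = autocorrelation(alternating, 2)

# Mersing autocorrelations for lags 1 to 6, copied from Table 10.
observed = [0.77, 0.656, 0.570, 0.501, 0.448, 0.416]
first_order = [0.757, 0.589, 0.461, 0.363, 0.290, 0.237]
second_order = [0.758, 0.647, 0.551, 0.473, 0.412, 0.362]
rmse_first = rmse(observed, first_order)
rmse_second = rmse(observed, second_order)
```

With the Table 10 values the two errors round to 0.12 and 0.03, in agreement with the RMSE row of the table.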
Spectral analysis [17] of the wind speed time series was carried out to obtain the information about the
time scale of changes in wind speed. The spectrum of the observed wind speed is compared with the
generated wind speed by first order as well as second order Markov chain in Fig. 7. The original data has
very peculiar peaks at periods of 12 and 24 h while the generated wind speed time series failed to retain
these peaks. However, the general shape of the three curves is similar. The spectral density curve of the
second order Markov chain is closer to the observed spectral density than that of the first order Markov
chain. This supports the earlier observation that the second order Markov chain performs better than the
first order Markov chain.

Table 10
First few autocorrelations for observed and generated wind speed time series

       Mersing                                        Kuantan
Lag    Observed   Generated       Generated           Observed   Generated       Generated
                  (first order)   (second order)                 (first order)   (second order)
1      0.77       0.757           0.758               0.722      0.709           0.709
2      0.656      0.589           0.647               0.578      0.514           0.566
3      0.570      0.461           0.551               0.457      0.370           0.441
4      0.501      0.363           0.473               0.349      0.266           0.341
5      0.448      0.290           0.412               0.247      0.196           0.270
6      0.416      0.237           0.362               0.155      0.146           0.215
RMSE              0.12            0.03                           0.06            0.03

Fig. 7. Spectral density of observed and synthetically generated wind speed at Mersing.
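To illustrate how a diurnal peak shows up in a spectrum, the sketch below computes a simple periodogram with a direct discrete Fourier transform. This is an assumption-laden illustration, not the spectral estimator used in the paper (which follows [17]): the function name, the ten-day hypothetical series, and the noise level are all invented for the example.

```python
import cmath
import math
import random

def periodogram(series):
    """Return (period in samples, power) pairs from a direct DFT of the mean-removed series."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centred))
        result.append((n / k, abs(coeff) ** 2 / n))
    return result

# Hypothetical hourly series: ten days with a 24 h cycle plus Gaussian noise.
rng = random.Random(0)
series = [5.0 + 2.0 * math.sin(2 * math.pi * t / 24) + rng.gauss(0.0, 0.5) for t in range(240)]

dominant_period, dominant_power = max(periodogram(series), key=lambda pair: pair[1])
print(dominant_period)  # 24.0, the period (in hours) of the imposed cycle
```

Applying the same function to a series generated by a homogeneous Markov chain would be expected to show no comparable peak, since the transition probabilities do not depend on the hour of the day.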

6. Conclusion

A Markov chain represents a system of elements making transitions from one state to another over
time. The order of the chain gives the number of past time steps that influence the probability
distribution of the present state. The method used here involves first order and second order
transition probability matrices of a Markov chain and an algorithm to produce the time series of wind
speed values. For the wind speed time series considered, it was judged that at least 12 states of width
1 m/s are needed to capture the shape of the probability density function. The manner in which Markov
models can be used to generate wind speed time series has been described. The models have been used
to generate hourly synthetic wind speed time series, which have been examined to determine
their ability to preserve the properties of the observed wind speed time series. Satisfactory agreement
has been noted between the observed and the generated wind speed time series data in almost all
respects.
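The generation procedure summarised above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the hypothetical "observed" series, and the fallback for histories never seen in the data are all invented here. `random.choices` with count weights is used in place of an explicit cumulative probability matrix, which amounts to the same sampling; the uniform draw within each state follows the description in the paper.

```python
import random
from collections import defaultdict

N_STATES = 12      # number of wind speed states, following the text
STATE_WIDTH = 1.0  # state width in m/s, following the text

def to_state(speed):
    """Map a wind speed in m/s to a state index, clipping at the top state."""
    return min(int(speed // STATE_WIDTH), N_STATES - 1)

def fit_counts(states, order):
    """Count transitions from each run of `order` preceding states to the next state."""
    counts = defaultdict(lambda: [0] * N_STATES)
    for i in range(order, len(states)):
        counts[tuple(states[i - order:i])][states[i]] += 1
    return counts

def generate(counts, order, seed_states, length, rng):
    """Sample a state sequence, then draw a speed uniformly within each state."""
    states = list(seed_states)
    while len(states) < length:
        row = counts.get(tuple(states[-order:]))
        if row is None:
            # History never observed: fall back to a uniform draw (assumption of this sketch).
            row = [1] * N_STATES
        states.append(rng.choices(range(N_STATES), weights=row)[0])
    return [(s + rng.random()) * STATE_WIDTH for s in states]

# Hypothetical "observed" hourly series: a clipped random walk, for illustration only.
rng = random.Random(1)
observed = [4.0]
for _ in range(1999):
    step = observed[-1] + rng.gauss(0.0, 0.8)
    observed.append(min(max(step, 0.0), 11.9))
observed_states = [to_state(v) for v in observed]

synthetic = {}
for order in (1, 2):
    counts = fit_counts(observed_states, order)
    synthetic[order] = generate(counts, order, observed_states[:order], 500, rng)
```

The only difference between the two orders is the length of the history tuple used as the row key: one preceding state for the first order chain, two for the second order chain.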
Comparison of the two generated data sets shows that the second order Markov model reproduces the wind
speed behaviour slightly better. The lack of similarity in autocorrelation [9] and spectral power density
between the measured wind speed and that generated by both the first and second order Markov chain
approaches is due to the intrinsic nature of the Markov process. For time series without
periodic/seasonal non-stationarity, the Markov chain approaches produced perfectly adequate
results [7,18,19], while for time series with such behaviour, not all the characteristics could be
reproduced. Therefore, data generation should be carried out after removing the periodic/seasonal
component. To improve the results further, higher order transitions should be considered in
future research. The synthetic hourly wind speed time series may be used as the input for any wind
energy system.
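One common way to remove a diurnal component before fitting is to standardise each value by the mean and standard deviation of its hour of the day, and to restore the cycle after generation. The sketch below shows this under that assumption; the paper does not specify a procedure, and the function names and the test series are hypothetical.

```python
import math

def remove_diurnal_cycle(hourly, period=24):
    """Standardise each value by the mean and standard deviation of its hour of the day."""
    means, stds = [], []
    for h in range(period):
        values = hourly[h::period]
        m = sum(values) / len(values)
        s = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
        means.append(m)
        stds.append(s if s > 0 else 1.0)  # guard against a constant hour
    residual = [(v - means[t % period]) / stds[t % period] for t, v in enumerate(hourly)]
    return residual, means, stds

def restore_diurnal_cycle(residual, means, stds, period=24):
    """Invert remove_diurnal_cycle for a series that starts at the same hour of the day."""
    return [r * stds[t % period] + means[t % period] for t, r in enumerate(residual)]

# Hypothetical hourly series with a 24 h cycle and a small deterministic perturbation.
hourly = [5.0 + 2.0 * math.sin(2 * math.pi * t / 24) + 0.1 * ((t * 7) % 5) for t in range(240)]
residual, means, stds = remove_diurnal_cycle(hourly)
restored = restore_diurnal_cycle(residual, means, stds)
```

The Markov chain would then be fitted to `residual`, with states defined on the standardised scale, and the generated residuals passed through `restore_diurnal_cycle`.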

Acknowledgements

This work was financially supported by the University of Science Malaysia under a short term grant
for the project ‘Efficient Development of Wind Farms in East Coast Peninsular Malaysia’. The authors
gratefully acknowledge this assistance.

References

[1] Castino F, Festa R, Ratto CF. Stochastic modeling of wind velocities time series. J Wind Engng Ind Aerodyn 1998;74–76:141–51.
[2] Heiko B. Markov chain model for vegetation dynamics. Ecol Modell 2000;126:139–54.
[3] Thomas HA, Fiering MP. Mathematical synthesis of stream flow sequences for the analysis of river basins by simulation.
In: Maass A, Marglin S, Fair G, editors. Design of water resources systems. Massachusetts: Harvard University Press;
1962, p. 459–93.
[4] Srikanthan R, McMahon TA. Stochastic generation of rainfall and evaporation data. AWRC Technical Paper No. 84
1985;301.
[5] Thyer MA, Kuczera G. Modelling long-term persistence in hydro-climate time series using hidden state Markov model.
Water Resour Res 1999;36:3301–10.
[6] Shamshad A, Parida BP, Khan IH. Performance of stochastic approaches for forecasting river water quality. Water Res
2001;35(18):4261–6.
[7] Keminsky FC, Kirchoff RH, Syu CY, Manwell JF. A comparison of alternative approaches for the synthetic generation of
a wind speed time series. J Solar Energy Engng 1991;113:280–9.
[8] Sahin AD, Sen Z. First-order Markov chain approach to wind speed modeling. J Wind Engng Ind Aerodyn 2001;89:263–9.
[9] Torre MC, Poggi P, Louche A. Markovian model for studying wind speed time series in Corsica. Int J Renew Energy
Engng 2001;3(2):311–9.
[10] Logofet DO, Lensnaya EV. The mathematics of Markov models: what Markov chains can really predict in forest
successions. Ecol Modell 2000;2/3:285–98.
[11] Poggi P, Notton G, Muselli M, Louche A. Stochastic study of hourly total solar radiation in Corsica using a Markov
Model. Int J Climatol 2000;20:1843–60.
[12] Snedecor GW, Cochran WG. Statistical methods. Ames, Iowa: State University Press; 1967.
[13] Seguro JV, Lambert TW. Modern estimation of the parameters of the Weibull wind speed distribution for wind energy
analysis. J Wind Engng Ind Aerodyn 2000;85:75–84.
[14] Sopian K, Othman MYH, Wirsat A. Data bank: the wind energy potential of Malaysia. Renew Energy 1995;6(8):1005–16.
[15] Hipel KW, McLeod AI. Time series modelling of water resources and environmental systems. Developments in Water
Science, 45. Amsterdam: Elsevier; 1994.
[16] Zubair L. Diurnal and seasonal variations in surface wind at Sita Eliya, Sri Lanka. Theor Appl Climatol 2002;71:119–27.
[17] Worrall F, Burt TP. A univariate model of river water nitrate time series. J Hydrol 1999;214:74–90.
[18] Muselli M, Poggi P, Notton G, Louche A. First order Markov chain model for generating synthetic “typical days” series of
global irradiation in order to design photovoltaic stand alone systems. Energy Convers Mgmt 2001;42:675–87.
[19] Youcef Ettoumi F, Sauvageot H, Adane AEH. Statistical bivariate modelling of wind using first-order Markov chain and
Weibull distribution. Renew Energy 2003;28:1787–802.
