
Approximating Distributions by Extended Generalized Lambda Distribution (XGLD)

Article in Communication in Statistics- Simulation and Computation · January 2012


DOI: 10.1080/03610911003681503 · Source: DBLP

3 authors: Majid Nili Ahmadabadi; Yaghoub Farjami (University of Qom); M. Bameni Moghadam (Allameh Tabataba'i University)


LSSP #468672 VOL 39, ISS 5

Communications in Statistics—Simulation and Computation® , 39: 1–23, 2010
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0918 print/1532-4141 online
DOI: 10.1080/03610911003681503

M. NILI AHMADABADI (1), Y. FARJAMI (2), AND M. B. MOGHADAM (3)

(1) Department of Management, Science and Research Branch of Islamic Azad University, Tehran, Iran
(2) Department of Technology, Qom University
(3) Department of Statistics, Allameh Tabataba'i University, Tehran, Iran
The family of four-parameter generalized lambda distributions (GLD) is known for its high flexibility. It provides an approximation of most of the usual statistical distributions (e.g., normal, uniform, lognormal, Weibull, etc.). Although the GLD is used in many fields where precise data modeling is required, there are some statistical distributions that it cannot approximate with high precision. The main objective of this article is to present an extended generalized lambda distribution (XGLD) model for approximating statistical distributions. The new model is flexible enough to fit more probability distribution functions with higher accuracy. Building on the existing methods for calculating GLD parameters, the article gives an algorithm for calculating the XGLD parameters. XGLD approximations are computed for some well-known distributions, and their precision is compared with that of the GLD.

Keywords Estimation; Fitting distributions; GLD; Statistical distribution; XGLD.

Mathematics Subject Classification .

Received September 11, 2009; Accepted February 5, 2010
Address correspondence to M. B. Moghadam, Department of Statistics, Allameh Tabataba'i University, Beheshty Ave., Ghasir St., Tehran, Iran; E-mail: bamenimoghadam@atu.ac.ir

1. Introduction

Figure 1. $\chi^2$ pdf with $\nu = 5$ and its GLD estimation (GLD pdf has the higher maximum).

The family of three-parameter lambda distributions, first introduced by Tukey (1962) and described by others (Filliben, 1969, 1975), has been shown to fit most of the usual statistical distributions, such as the Gaussian, lognormal, Weibull, and uniform distributions, accurately. The three-parameter lambda distribution, which is defined by its quantile function (i.e., the inverse of the cumulative distribution function), is

$$F^{-1}(y) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_3}}{\lambda_2} \qquad (1)$$

This distribution, which was later generalized to a four-parameter family by Ramberg and Schmeiser (1972), is called the generalized lambda distribution (GLD) and is defined as

$$F^{-1}(y) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_4}}{\lambda_2} \qquad (2)$$

where $0 \le y \le 1$, and $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are the location, inverse scale, skewness, and kurtosis parameters of $\mathrm{GLD}(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$, respectively.
There exists another four-parameter family of distributions, known as the FMKL GLD (Freimer et al., 1988), which is defined as

$$F^{-1}(y) = \lambda_1 + \frac{1}{\lambda_2}\left[\frac{y^{\lambda_3} - 1}{\lambda_3} - \frac{(1 - y)^{\lambda_4} - 1}{\lambda_4}\right] \qquad (3)$$

The GLD method has been used in most scientific fields, for example in parameter estimation, in fitting distributions to data, and in simulation research that primarily includes univariate data generation, because of its ability to estimate distribution functions (Karian et al., 1996; Ramberg et al., 1979). For example, the GLD has been used in studies that include such topics or techniques as independent component analysis (Karvanen, 2003), operations research (Ganeshan, 2001), psychometrics (Delaney and Vargha, 2000), engineering (Upadhyay and Ezekoye, 2008), corrosion (Najjar et al., 2003), meteorology (Öztürk and Dale, 1982), fatigue of materials (Bigerelle et al., 2005), statistical process control (Fournier et al., 2006), and simulation of queue systems (Dengiz, 1988). Karian extended the GLD method to the EGLD (Karian et al., 1996), which is a combination of the GLD and the GBD (generalized beta distribution). The EGLD is useful because this class of distributions covers all possible combinations of skewness $\alpha_3$ and kurtosis $\alpha_4$ for which a continuous probability density function (pdf) exists, that is, $\alpha_4 \ge \alpha_3^2 + 1$ (Devroye, 2006, p. 688). However, the accuracy of the GLD estimator is not sufficient for some distributions, as demonstrated in Figs. 1–3: there is a large gap between these density functions and their GLD estimators.

Figure 2. Gamma pdf with $\alpha = 5$, $\beta = 3$ and its GLD estimation (GLD pdf has the higher maximum).

Figure 3. Lognormal pdf with $\mu = 0$, $\sigma = 1/3$ and its GLD estimation (GLD pdf has the higher maximum).

In this article, we introduce an extension of the generalized lambda distribution (XGLD) to improve the fit of the GLD. The article is organized as follows. The GLD method is presented in Sec. 2. In Sec. 3, the XGLD method is introduced. Methods of estimating the parameters are described in Sec. 4, and the algorithm for computing the XGLD parameters is given in Sec. 5. In Sec. 6, the XGLD estimations of some well-known distributions are demonstrated and compared with the corresponding GLD estimations. The advantages of the new method are summarized in Sec. 7.

2. Generalized Lambda Distribution (GLD)

The family of lambda distributions, which was finally generalized to the GLD family of four-parameter statistical distributions (Filliben, 1975), is defined in terms of the quantile function $Q$ (the inverse of the cumulative distribution function) as

$$x = Q(y; \lambda_1, \lambda_2, \lambda_3, \lambda_4) = \mathrm{GLD}(y; \lambda_1, \lambda_2, \lambda_3, \lambda_4) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_4}}{\lambda_2} \qquad (4)$$

where $y \in [0, 1]$, $\lambda_1$ and $\lambda_2$ are the location and scale parameters, and $\lambda_3$ and $\lambda_4$ are related to the skewness and the kurtosis of $\mathrm{GLD}(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$, respectively.
Therefore, the probability density function of the GLD is defined as

$$f(x) = f(Q(y)) = \frac{\lambda_2}{\lambda_3 y^{\lambda_3 - 1} + \lambda_4 (1 - y)^{\lambda_4 - 1}} \qquad (5)$$

Karian et al. (1996) noted that the GLD is defined if and only if the following condition is met:

$$\frac{\lambda_2}{\lambda_3 y^{\lambda_3 - 1} + \lambda_4 (1 - y)^{\lambda_4 - 1}} \ge 0 \quad \text{for } 0 \le y \le 1 \qquad (6)$$

Karian and Dudewicz (2000) provided a comprehensive discussion of the features of the GLD and its history.

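Equations (4) and (5) can be evaluated directly, as in the following minimal sketch. The function names are illustrative, and the parameter set (0, 0.1975, 0.1349, 0.1349) is assumed here to be the moment-matched Karian and Dudewicz fit to N(0, 1) that this article uses in its examples.

```python
import math


def gld_quantile(y, lam1, lam2, lam3, lam4):
    """GLD quantile function Q(y) of Eq. (4)."""
    return lam1 + (y ** lam3 - (1.0 - y) ** lam4) / lam2


def gld_density_at_quantile(y, lam2, lam3, lam4):
    """Density f(Q(y)) of Eq. (5), evaluated for 0 < y < 1."""
    return lam2 / (lam3 * y ** (lam3 - 1.0) + lam4 * (1.0 - y) ** (lam4 - 1.0))


def standard_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Assumed GLD fit to N(0, 1): (lambda1, lambda2, lambda3, lambda4).
NORMAL_FIT = (0.0, 0.1975, 0.1349, 0.1349)

median = gld_quantile(0.5, *NORMAL_FIT)
peak = gld_density_at_quantile(0.5, *NORMAL_FIT[1:])
# Condition (6) checked on a grid of interior points.
is_valid = all(
    gld_density_at_quantile(i / 250, *NORMAL_FIT[1:]) >= 0 for i in range(1, 250)
)
print(median, peak, standard_normal_pdf(median), is_valid)
```

The GLD density at the median comes out slightly above the normal density there, which is the kind of gap at the mode that the figures in this article illustrate.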
3. Extended Generalized Lambda Distribution (XGLD)

To introduce the XGLD model, we begin with the definition of the GLD model given by Eq. (4). This equation can be rewritten in the following form:

$$x = \mathrm{GLD}(y; \lambda_1, \lambda_2, \lambda_3, \lambda_4) = \lambda_1 + \frac{y^{\lambda_3} - (1 - y)^{\lambda_4}}{\lambda_2} = \alpha_0 + \alpha_1 y^{\beta_1} - \alpha_2 (1 - y)^{\beta_2} \qquad (7)$$

where $\alpha_0 = \lambda_1$, $\alpha_1 = \alpha_2 = 1/\lambda_2$, $\beta_1 = \lambda_3$, and $\beta_2 = \lambda_4$.

It is interesting to note that both the $y - 1$ and $y$ (i.e., $y - 0$) terms in Eq. (7) are monotone and vanish at one of the end points of the unit interval $[0, 1]$; hence they have fixed signs on this interval. Our idea is to add more terms of the form $y - c$, where $0 \le c \le 1$. The difficulty is how to raise such a term to a power, $(y - c)^{\beta}$, because this expression is not well defined when $y - c$ is negative and $\beta$ is an arbitrary real value. Furthermore, the term $(y - c)^{\beta}$ needs to be monotone for $0 \le y \le 1$. Thus, the following definition of monotone powering is given:
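Definition 1 (Monotone Powering). Let $0 \le y, c \le 1$ and $\beta \in \mathbb{R}$. Then the monotone powering of $y - c$ by $\beta$ is denoted by $(y - c)^{*\beta}$ and is defined by

$$(y - c)^{*\beta} = \begin{cases} (y - c)^{\beta} & \text{if } y - c > 0 \\ -\lvert y - c \rvert^{\beta} & \text{if } y - c < 0 \\ 0 & \text{if } y - c = 0 \end{cases} \qquad (8)$$
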
Using this definition, $(y - c)^{*\beta}$ is monotone and continuous. Note that with this definition, $y^{*\beta} = y^{\beta}$ and $(y - 1)^{*\beta} = -(1 - y)^{\beta}$ for $0 \le y \le 1$. We should mention here that the monotone powering function $(y - c)^{*\beta}$ is a differentiable function of $y$ when $\beta \ge 1$, with the following differentiation formula:

$$\frac{d}{dy}(y - c)^{*\beta} = \beta \lvert y - c \rvert^{\beta - 1} \qquad (9)$$

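As an illustration of Definition 1 and Eq. (9), the following sketch implements the monotone power and compares the derivative formula with a central finite difference. The function names and the test point are hypothetical choices made for this example.

```python
import math


def monotone_power(y, c, beta):
    """Monotone power (y - c)^{*beta} of Eq. (8)."""
    d = y - c
    if d > 0:
        return d ** beta
    if d < 0:
        return -((-d) ** beta)
    return 0.0


def monotone_power_derivative(y, c, beta):
    """Derivative from Eq. (9): beta * |y - c|^(beta - 1), for y != c."""
    return beta * abs(y - c) ** (beta - 1.0)


# The two special cases noted in the text: c = 0 and c = 1.
left_term = monotone_power(0.3, 0.0, 2.5)   # equals 0.3 ** 2.5
right_term = monotone_power(0.3, 1.0, 2.5)  # equals -(0.7 ** 2.5)

# Central finite-difference check of Eq. (9) at a point left of c.
y, c, beta, h = 0.2, 0.6, 1.7, 1e-6
numeric = (monotone_power(y + h, c, beta) - monotone_power(y - h, c, beta)) / (2 * h)
analytic = monotone_power_derivative(y, c, beta)
print(left_term, right_term, numeric, analytic)
```

Because the sign is carried outside the power, the function increases through $y = c$ for positive $\beta$ instead of failing on a negative base.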
Now we are ready to introduce the extended GLD model as a linear combination of monotone powers of $y - c$ terms, as follows.

Definition 2 (XGLD Model). Let $k \in \mathbb{N}$, $\alpha_i \in \mathbb{R}$, $\beta_i \in \mathbb{R}$, and $0 \le c_1, \ldots, c_k \le 1$. The XGLD model, as a quantile function with $k + 1$ terms, is defined by

$$x = \mathrm{XGLD}(y; \alpha_0, \alpha_1, \ldots, \alpha_k, \beta_1, \ldots, \beta_k, c_1, \ldots, c_k) = \alpha_0 + \sum_{i=1}^{k} \alpha_i (y - c_i)^{*\beta_i} \quad \text{for } 0 \le y \le 1 \qquad (10)$$

With these definitions and using Eq. (8), the GLD model $\alpha_0 + \alpha_1 y^{\beta_1} - \alpha_2 (1 - y)^{\beta_2}$ can be rewritten as $\alpha_0 + \alpha_1 (y - 0)^{*\beta_1} + \alpha_2 (y - 1)^{*\beta_2}$, which is an XGLD with $k = 2$, $c_1 = 0$, and $c_2 = 1$.

Increasing the value of $k$ increases both the precision of the estimation and the volume of calculations. In this article we consider only a simple case of the XGLD with $k = 3$ and $c_1 = 0$, $0 \le c_2 \le 1$, $c_3 = 1$, as follows:

$$\begin{aligned} \mathrm{XGLD}(y; \alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c_1 = 0, c_2, c_3 = 1) &= \alpha_0 + \alpha_1 y^{*\beta_1} + \alpha_2 (y - c_2)^{*\beta_2} + \alpha_3 (y - 1)^{*\beta_3} \\ &= \alpha_0 + \alpha_1 y^{\beta_1} + \alpha_2 (y - c_2)^{*\beta_2} - \alpha_3 (1 - y)^{\beta_3} \end{aligned} \qquad (11)$$

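Thus, to simplify the notation, we adopt the following convention and use it through the rest of the article.

3.1. Convention

$$\begin{aligned} x &= \mathrm{XGLD}(y; \alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) \\ &= \mathrm{XGLD}(y; \alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c_1 = 0, c_2, c_3 = 1) \\ &= \alpha_0 + \alpha_1 y^{*\beta_1} + \alpha_2 (y - c)^{*\beta_2} + \alpha_3 (y - 1)^{*\beta_3} \\ &= \alpha_0 + \alpha_1 y^{\beta_1} + \alpha_2 (y - c)^{*\beta_2} - \alpha_3 (1 - y)^{\beta_3} \end{aligned} \qquad (12)$$

where $c_2$ is denoted by $c$. Here Eqs. (10) and (11) are used. It is useful to give an explicit relation between this simplified form of the XGLD and the GLD model. Using Eqs. (7) and (12), the following relations emerge, which will be used in Sec. 5.

$$\mathrm{GLD}(y; \lambda_1, \lambda_2, \lambda_3, \lambda_4) = \lambda_1 + \frac{1}{\lambda_2} y^{\lambda_3} - \frac{1}{\lambda_2}(1 - y)^{\lambda_4} = \mathrm{XGLD}\left(y; \lambda_1, \frac{1}{\lambda_2}, 0, \frac{1}{\lambda_2}, \lambda_3, 1, \lambda_4, \frac{1}{2}\right) \qquad (13)$$

Conversely, to simplify the XGLD further, suppose $\alpha_1 = \alpha_3$; then

$$\begin{aligned} \mathrm{XGLD}(y; \alpha_0, \alpha_1, \alpha_2, \alpha_1, \beta_1, \beta_2, \beta_3, c) &= \mathrm{GLD}\left(y; \alpha_0, \frac{1}{\alpha_1}, \beta_1, \beta_3\right) + \alpha_2 (y - c)^{*\beta_2} \\ &= \alpha_0 + \alpha_1 y^{\beta_1} - \alpha_1 (1 - y)^{\beta_3} + \alpha_2 (y - c)^{*\beta_2} \end{aligned} \qquad (14)$$
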
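The embedding of the GLD in the XGLD stated in Eq. (13) can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, and the GLD parameters are the chi-square fit quoted later in this article.

```python
import math


def monotone_power(y, c, beta):
    """Monotone power (y - c)^{*beta} of Eq. (8)."""
    d = y - c
    if d > 0:
        return d ** beta
    if d < 0:
        return -((-d) ** beta)
    return 0.0


def gld_quantile(y, lam1, lam2, lam3, lam4):
    """GLD quantile function of Eq. (4)."""
    return lam1 + (y ** lam3 - (1.0 - y) ** lam4) / lam2


def xgld_quantile(y, a0, a1, a2, a3, b1, b2, b3, c):
    """Simplified XGLD quantile function of Eq. (12), with k = 3."""
    return a0 + a1 * y ** b1 + a2 * monotone_power(y, c, b2) - a3 * (1.0 - y) ** b3


def gld_to_xgld(lam1, lam2, lam3, lam4):
    """Parameter mapping of Eq. (13); the middle term is switched off."""
    return (lam1, 1.0 / lam2, 0.0, 1.0 / lam2, lam3, 1.0, lam4, 0.5)


gld_params = (2.604, 0.0176, 0.0095, 0.0542)
xgld_params = gld_to_xgld(*gld_params)
largest_gap = max(
    abs(gld_quantile(i / 250, *gld_params) - xgld_quantile(i / 250, *xgld_params))
    for i in range(1, 250)
)
print(xgld_params, largest_gap)
```

Since $\alpha_2 = 0$ removes the middle term, the values chosen for $\beta_2$ and $c$ in this mapping do not affect the quantile function; they only fix a starting point for the later search.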
The relation in Eq. (13) can be used by an algorithm that calculates the XGLD parameters. The algorithm starts from a preliminary GLD solution obtained with the existing methods, converts it into a preliminary point for the XGLD parameters, and begins the search there. Then, by gradually varying the parameters around the initial values, we reach the optimal values of the XGLD parameters.

In the XGLD model, as in the GLD model, $x$ is a function of $y$. So we use the definition of the XGLD to find the corresponding pdf:

$$f(x) = \frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{\dfrac{d}{dy}\left[\alpha_0 + \sum_{i=1}^{k} \alpha_i (y - c_i)^{*\beta_i}\right]} = \frac{1}{\sum_{i=1}^{k} \alpha_i \beta_i \lvert y - c_i \rvert^{\beta_i - 1}} \qquad (15)$$

The XGLD is a valid quantile function if and only if

$$\frac{1}{\sum_{i=1}^{k} \alpha_i \beta_i \lvert y - c_i \rvert^{\beta_i - 1}} \ge 0 \quad \text{for } 0 \le y \le 1 \qquad (16)$$

The general normalization condition for a probability density function is satisfied:

$$f(x) = \frac{dy}{dx} \;\Rightarrow\; f(x)\,dx = dy \;\Rightarrow\; \int_{-\infty}^{+\infty} f(x)\,dx = \int_{0}^{1} dy = 1$$

The general equation for the $m$th moment of the probability density function $f(x)$ is

$$E(x^m) = \int_{-\infty}^{+\infty} x^m f(x)\,dx \qquad (17)$$

According to Eqs. (10), (15), and (17), the $m$th moment can be rewritten as

$$E(x^m) = \int_{0}^{1} \left[\alpha_0 + \sum_{i=1}^{k} \alpha_i (y - c_i)^{*\beta_i}\right]^m dy \qquad (18)$$

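Equations (16) and (18) lend themselves to a quick numerical check. The following sketch approximates the integral in Eq. (18) with a midpoint rule, which is a choice made for this example and not the authors' procedure, and evaluates the slope sum of Eq. (15) on a grid. The parameters are the XGLD form of the GLD fit to N(0, 1) used in this article; all function names are hypothetical.

```python
def monotone_power(y, c, beta):
    """Monotone power (y - c)^{*beta} of Eq. (8)."""
    d = y - c
    if d > 0:
        return d ** beta
    if d < 0:
        return -((-d) ** beta)
    return 0.0


def xgld_quantile(y, alpha0, alphas, betas, cs):
    """General XGLD quantile function of Eq. (10)."""
    return alpha0 + sum(a * monotone_power(y, c, b) for a, b, c in zip(alphas, betas, cs))


def xgld_slope(y, alphas, betas, cs):
    """dx/dy, the denominator of Eq. (15); requires y != c_i when beta_i < 1."""
    return sum(a * b * abs(y - c) ** (b - 1.0) for a, b, c in zip(alphas, betas, cs))


def xgld_raw_moment(m, alpha0, alphas, betas, cs, n=20000):
    """Midpoint-rule approximation of the m-th raw moment in Eq. (18)."""
    h = 1.0 / n
    return h * sum(xgld_quantile((j + 0.5) * h, alpha0, alphas, betas, cs) ** m for j in range(n))


# XGLD form of the GLD fit to N(0, 1), with c_1 = 0, c_2 = 0.5, c_3 = 1.
ALPHA0 = 0.0
ALPHAS = (5.0633, 0.0, 5.0633)
BETAS = (0.1349, 1.0, 0.1349)
CS = (0.0, 0.5, 1.0)

is_valid = all(xgld_slope(i / 250, ALPHAS, BETAS, CS) > 0 for i in range(1, 250))
mean = xgld_raw_moment(1, ALPHA0, ALPHAS, BETAS, CS)
second = xgld_raw_moment(2, ALPHA0, ALPHAS, BETAS, CS)
fourth = xgld_raw_moment(4, ALPHA0, ALPHAS, BETAS, CS)
variance = second - mean ** 2
kurtosis = fourth / second ** 2  # raw and central moments agree because the mean is ~0
print(is_valid, mean, variance, kurtosis)
```

Because the fit was obtained by matching the first four moments of N(0, 1), the computed mean, variance, and kurtosis should come out close to 0, 1, and 3.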
4. XGLD Parameter Estimation

This section is devoted to the computation of the XGLD parameters. Before presenting our approach (at the end of this section), we survey the usual approaches for computing the parameters of a family of distributions, concentrating on the four-parameter GLD. Because the GLD family is so versatile, obtaining appropriate parameters can be a challenging problem, and because the same methods are applied in our algorithm, the GLD case is informative. There are three main approaches for computing the four parameters of a GLD: the method of moments, the least squares method, and the starship method. All of them have been applied to the GLD by many authors (for example, see Fournier et al., 2007; Lakhany and Mausser, 2000). A brief description of each method follows.

4.1. Method of Moments

The earliest approach for estimating the GLD parameters is the method of moments, proposed by Ramberg and Schmeiser (1974). The principal objective is to find a GLD with parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ whose first four moments closely match those of the empirical data or the target distribution. The method can be described briefly as follows.

Given the GLD with quantile function $Q(y)$, find parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ so that the mean, variance, skewness, and kurtosis of the GLD match the corresponding mean, variance, skewness, and kurtosis of the original distribution. This leads to a system of four nonlinear equations that can be solved by well-known and dependable numerical methods. Often more than one numerically acceptable solution is available. Thus, a goodness-of-fit test should be performed to establish the validity of the results. If this test fails, or if the levels of skewness and kurtosis are outside of the tabulated values, it is necessary to use numerical procedures to find suitable parameters. Several studies have examined the method of moments (Asquith, 2007; Headrick and Mugdadi, 2006; Karian and Dudewicz, 2003; Karvanen and Nuutinen, 2008; Su, 2007).

4.2. Least Squares Method

Instead of matching moments, Öztürk and Dale (1985) minimized the total squared differences between the original distribution and the expected values of the order statistics implied by the GLD. The least squares method finds the values of $\lambda$ for which the differences between the observed and predicted order statistics are as small as possible. The Nelder–Mead downhill simplex algorithm (Nelder and Mead, 1965) is used to find the optimal parameters. As with moment matching, the resulting distribution should be assessed using a goodness-of-fit test.

4.3. Starship Method

King and MacGillivray (1999) assessed the quality of the GLD directly by performing goodness-of-fit tests for specified combinations of parameter values. The method can be described briefly as follows.

Identify a region in four-dimensional space that covers the range of the four parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ appropriately (for example, by using the quantiles). Then overlay a four-dimensional rectangular grid on it. Finally, evaluate the grid points by performing a goodness-of-fit test on the corresponding distributions. If the test is satisfied, then stop; otherwise, continue with the next point in the grid (it is also possible to examine all grid points and select the one with the best goodness-of-fit measure).

Lakhany and Mausser (2000) suggested a variation that combines the method of moments with a goodness-of-fit test, using the FMKL GLD. First, they generated initial values for the moment matching with a quasi-random number generator (for example, the Sobol sequence generator; Bratley and Fox, 1988) and then found the set of values $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ that optimally matched (through the Nelder–Mead simplex algorithm) the first four moments of the data or the original distribution. The resulting set of values is then evaluated with a goodness-of-fit statistic such as the adjusted Kolmogorov–Smirnov (KS) statistic. Under this method, any solution that results in a p-value > 0.05 is accepted.

Other researchers have developed the above three methods further, or combined them, to find GLD parameters more accurately or more quickly (for example, see Lakhany and Mausser, 2000; Öztürk and Dale, 1985).

Briefly speaking, the strategy is to find the set of parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ that gives the lowest value of, e.g., the Kolmogorov–Smirnov estimator $E_{KS(\mathrm{p.d.f.})}$ defined by

$$E_{KS(\mathrm{p.d.f.})} = \max_{x \in D} \lvert f(x) - \hat{f}(x) \rvert \qquad (19)$$

where $\hat{f}(x)$ and $f(x)$ are the estimated and empirical distribution functions (Karian and Dudewicz, 2000); for examples of the use of $E_{KS}$, see D'Agostino and Stephens (1986).

In this article, we use the method of moments and the starship method to find the parameters of an XGLD model for estimating a distribution. The basic idea behind the proposed method is to start from the GLD parameters obtained by matching the first four moments and to use them as the initial estimate of the XGLD parameters. We then search for improvements of the parameters by considering a goodness-of-fit test based on the Kolmogorov–Smirnov (KS) statistic.

5. Algorithm: Computation of XGLD Parameters

To compute the parameters of $\mathrm{XGLD}(y; \alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c)$ and estimate the distribution $f(x)$, proceed as follows:

1. Using Karian and Dudewicz (2000), specify a primary answer for the GLD parameters, say $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$.
2. Using the relation between the GLD and XGLD parameters given in Eq. (13), set $\alpha_0 = \lambda_1$, $\alpha_1 = 1/\lambda_2$, $\alpha_2 = 0$, $\alpha_3 = 1/\lambda_2$, $\beta_1 = \lambda_3$, $\beta_2 = 1$, $\beta_3 = \lambda_4$, and $c = 1/2$ as the initial values for the XGLD parameters.
3. Set an interval for each parameter as follows:
   If $\lvert \alpha_i \rvert < 0.01$ then
      set $I_{\alpha_i} = [-1, 1]$
   else
      set $I_{\alpha_i} = [0.5\alpha_i, 1.5\alpha_i]$
   end if,
   If $\lvert \beta_i \rvert < 0.01$ then
      set $I_{\beta_i} = [-0.5, 0.5]$
   else
      set $I_{\beta_i} = [0.5\beta_i, 1.5\beta_i]$
   end if,
   Set $I_c = [0, 1]$.

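4. Set a search step for each parameter as follows:

   $d_{\alpha_i} = 0.1$, $d_{\beta_i} = 0.01$, $d_c = 0.1$

5. Set a discretized space for each parameter as follows:

   $D_{\alpha_0} = \{\alpha_0, \alpha_0 \pm d_{\alpha_0}, \alpha_0 \pm 2d_{\alpha_0}, \alpha_0 \pm 3d_{\alpha_0}, \ldots\} \subset I_{\alpha_0}$
   $D_{\beta_1} = \{\beta_1, \beta_1 \pm d_{\beta_1}, \beta_1 \pm 2d_{\beta_1}, \beta_1 \pm 3d_{\beta_1}, \ldots\} \subset I_{\beta_1}$
   $D_c = \{c, c \pm d_c, c \pm 2d_c, c \pm 3d_c, \ldots\} \subset I_c$

6. Set the discretized cube $D$ around the initial parameters as follows:

   $D = D_{\alpha_0} \times D_{\alpha_1} \times D_{\alpha_2} \times D_{\alpha_3} \times D_{\beta_1} \times D_{\beta_2} \times D_{\beta_3} \times D_c$

7. For the initial point do steps 8–11; then, for each $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) \in D$, do steps 8 and 12–15.
8. Select 249 points $y_i$, that is, $y_i = i/250$, $1 \le i \le 249$.
9. Using (19), compute $E_{KS(\mathrm{pdf})}$ by

   $E_{KS(\mathrm{pdf})} = \max_{1 \le i \le 249} \lvert f(x_i) - \hat{f}(x_i) \rvert$

   where $f(x_i)$ is the original pdf and $\hat{f}(x_i)$ is the pdf of the GLD estimator.
10. Using (19), compute $E_{KS(\mathrm{cdf})}$ by

   $E_{KS(\mathrm{cdf})} = \max_{1 \le i \le 249} \lvert F(x_i) - \hat{F}(x_i) \rvert$

   where $F(x_i)$ is the original cdf and $\hat{F}(x_i)$ is the cdf of the GLD estimator.
11. Set $g_{best} = \max\{E_{KS(\mathrm{pdf})}, E_{KS(\mathrm{cdf})}\}$.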
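12. Using Eqs. (12), (15), and (19), compute $g_1$ to $g_{498}$ with parameters $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c)$ by

   $x_i = \alpha_0 + \alpha_1 y_i^{\beta_1} + \alpha_2 (y_i - c)^{*\beta_2} - \alpha_3 (1 - y_i)^{\beta_3}$

   $g_i = \lvert \hat{f}(x_i) - f(x_i) \rvert, \quad 1 \le i \le 249$

   $\hat{f}(x_i) = \dfrac{1}{\alpha_1 \beta_1 y_i^{\beta_1 - 1} + \alpha_2 \beta_2 \lvert y_i - c \rvert^{\beta_2 - 1} + \alpha_3 \beta_3 (1 - y_i)^{\beta_3 - 1}}$

   $g_{i + 249} = \lvert y_i - F(x_i) \rvert, \quad 1 \le i \le 249$

13. Using the function fminimax in Matlab, minimize the $g_i$ simultaneously. This function requires a start point and returns the nearest local minimum as the answer; we denote this answer by $(\alpha_0^*, \alpha_1^*, \alpha_2^*, \alpha_3^*, \beta_1^*, \beta_2^*, \beta_3^*, c^*)$. Use $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c)$ as the start point and obtain $g_p^*$; that is,

   $g_p^* = \min_{p \in D} \max_{1 \le i \le 498} g_i \qquad (20)$

14. Compare $g_p^*$ with $g_{best}$:
   If $g_p^* < g_{best}$ then
      set $g_{best} = g_p^*$ and set $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c)_{best} = (\alpha_0^*, \alpha_1^*, \alpha_2^*, \alpha_3^*, \beta_1^*, \beta_2^*, \beta_3^*, c^*)$
   end if.
15. Return to step 13 and do steps 13–14 for the next point $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) \in D$. If no $g^*$ lower than the initial $g_{best}$ (calculated in step 11) is found, return to step 3 and extend the intervals $I_{\alpha_i}$ and $I_{\beta_i}$.

After all these steps, every cube of $D$ (each of which contains a start point) has been searched, and the point with the lowest $E_{KS(\mathrm{cdf})}$ and $E_{KS(\mathrm{pdf})}$ is obtained.
End Algorithm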
Some remarks about the algorithm are in order:

a) 250 points were used as the lowest number of points required for fitting distributions (Karian and Dudewicz, 2000).
b) The $g_i$ in step 12 are the components of $E_{KS(\mathrm{cdf})}$ and $E_{KS(\mathrm{pdf})}$, which indicate the precision of the estimation. The two are minimized simultaneously because at some points $E_{KS(\mathrm{cdf})}$ is lower than its initial value while $E_{KS(\mathrm{pdf})}$ is higher, and at other points $E_{KS(\mathrm{pdf})}$ is lower while $E_{KS(\mathrm{cdf})}$ is higher than its initial value.

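Steps 3 to 6 of the algorithm can be sketched as follows. The helper names are hypothetical, and two details are assumptions of this sketch: the interval end points are sorted so that a negative parameter still gives a proper interval, and a small tolerance absorbs floating-point round-off at the interval boundaries.

```python
def search_interval(value, zero_halfwidth):
    """Step 3: interval around a parameter (end points sorted, an assumption here)."""
    if abs(value) < 0.01:
        return (-zero_halfwidth, zero_halfwidth)
    low, high = sorted((0.5 * value, 1.5 * value))
    return (low, high)


def discretize(center, step, interval, tol=1e-12):
    """Step 5: points center, center +/- step, center +/- 2*step, ... inside the interval."""
    low, high = interval
    points = [center]
    k = 1
    while center - k * step >= low - tol or center + k * step <= high + tol:
        if center - k * step >= low - tol:
            points.append(center - k * step)
        if center + k * step <= high + tol:
            points.append(center + k * step)
        k += 1
    return sorted(points)


# Step 2 start point for N(0, 1): (alpha0..alpha3, beta1..beta3, c).
start = (0.0, 5.0633, 0.0, 5.0633, 0.1349, 1.0, 0.1349, 0.5)

grids = []
for value in start[:4]:                                   # alpha parameters, step 0.1
    grids.append(discretize(value, 0.1, search_interval(value, 1.0)))
for value in start[4:7]:                                  # beta parameters, step 0.01
    grids.append(discretize(value, 0.01, search_interval(value, 0.5)))
grids.append(discretize(start[7], 0.1, (0.0, 1.0)))       # c, step 0.1

grid_sizes = [len(g) for g in grids]
total_start_points = 1
for size in grid_sizes:
    total_start_points *= size
print(grid_sizes, total_start_points)
```

The product of the grid sizes gives the number of start points in the cube $D$ of step 6, which shows how quickly the search volume grows with the number of parameters.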
To show the above algorithm in action, the estimation of the standard normal distribution $N(0, 1)$ by the XGLD is explained as follows:

1. Using Karian and Dudewicz (2000), the primary answer for the $N(0, 1)$ estimation by the GLD is

   $N(0, 1) \approx \mathrm{GLD}(0, 0.1975, 0.1349, 0.1349)$

2. Based on the relations between the GLD and the XGLD explained in Sec. 3, the primary answer for the XGLD parameters is

   $\mathrm{GLD}(0, 0.1975, 0.1349, 0.1349) = \mathrm{XGLD}(0, 5.0633, 0, 5.0633, 0.1349, 1, 0.1349, 0.5)$

3. The following intervals are considered for the XGLD parameters:

   $I_{\alpha_0} = [-1, 1]$, $I_{\alpha_1} = [2.5, 7.6]$, $I_{\alpha_2} = [-0.5, 0.5]$, $I_{\alpha_3} = [2.5, 7.6]$,
   $I_{\beta_1} = [0.07, 0.21]$, $I_{\beta_2} = [0.5, 1.5]$, $I_{\beta_3} = [0.07, 0.21]$, $I_c = [0, 1]$

4 to 11. A program was written in Matlab and run, and the following results were obtained:

   $E_{KS(\mathrm{cdf})} = 0.0011$, $E_{KS(\mathrm{pdf})} = 0.0028$, $g_{best} = 0.0028$

12 to 15. A program was written in Matlab. By running the program and searching among all of the points in the specified scope $D$, the best answer was identified as

   $(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (-0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328)$, $g_{best} = 0.0009$

   with a precision of $E_{KS(\mathrm{p.d.f.})} = 0.0009$ and $E_{KS(\mathrm{c.d.f.})} = 0.0003$. Thus,

   $N(0, 1) \approx \mathrm{XGLD}(-0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328)$

At the end of this section, we should mention that this method, like other methods used for finding GLD parameters, may return local minima. Therefore, rerunning the algorithm may yield answers with a different accuracy from those presented here.

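The error measures of steps 8 to 12 can be reproduced for the normal example with a short script. This is a sketch in Python, not the authors' Matlab program: it only evaluates the objective at the start point and at the reported optimum and does not perform the fminimax search. The function names are hypothetical, and the density uses the slope sum implied by Eq. (9), in which every term enters with a plus sign.

```python
import math


def monotone_power(y, c, beta):
    """Monotone power (y - c)^{*beta} of Eq. (8)."""
    d = y - c
    if d > 0:
        return d ** beta
    if d < 0:
        return -((-d) ** beta)
    return 0.0


def xgld_quantile(y, p):
    """Simplified XGLD quantile function of Eq. (12)."""
    a0, a1, a2, a3, b1, b2, b3, c = p
    return a0 + a1 * y ** b1 + a2 * monotone_power(y, c, b2) - a3 * (1.0 - y) ** b3


def xgld_density(y, p):
    """XGLD density of Eq. (15) at x = Q(y)."""
    _, a1, a2, a3, b1, b2, b3, c = p
    slope = (a1 * b1 * y ** (b1 - 1.0)
             + a2 * b2 * abs(y - c) ** (b2 - 1.0)
             + a3 * b3 * (1.0 - y) ** (b3 - 1.0))
    return 1.0 / slope


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_errors(p, n=250):
    """Return (E_KS(pdf), E_KS(cdf)) over y_i = i/n, i = 1..n-1, for target N(0, 1)."""
    e_pdf = e_cdf = 0.0
    for i in range(1, n):
        y = i / n
        x = xgld_quantile(y, p)
        e_pdf = max(e_pdf, abs(xgld_density(y, p) - normal_pdf(x)))
        e_cdf = max(e_cdf, abs(y - normal_cdf(x)))
    return e_pdf, e_cdf


START = (0.0, 5.0633, 0.0, 5.0633, 0.1349, 1.0, 0.1349, 0.5)
REPORTED = (-0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328)

start_pdf, start_cdf = ks_errors(START)
best_pdf, best_cdf = ks_errors(REPORTED)
print(start_pdf, start_cdf, best_pdf, best_cdf)
```

Because the published parameters are rounded to four decimals, the printed values may differ slightly from the reported 0.0028 and 0.0011 for the start point and 0.0009 and 0.0003 for the optimum.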
6. XGLD Approximation of Some Well-Known Distributions

In this section, we show the ability of the XGLD to approximate statistical distributions and demonstrate the precision of the model. For this purpose, we apply the algorithm described in Sec. 5 to some selected well-known distributions to find the XGLD approximation of each.

6.1. Normal Distribution

The normal distribution $N(\mu, \sigma^2)$ has the following pdf:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < +\infty$$

where $\mu$ is the mean and $\sigma^2$ is the variance of the normal distribution. According to Sec. 5 of this article and using Karian and Dudewicz (2000) for $\mu = 0$, $\sigma = 1$, the GLD estimation for this distribution is

$(\lambda_1, \lambda_2, \lambda_3, \lambda_4) = (0, 0.1975, 0.1349, 0.1349)$

with the following $E_{KS}$ values:

$E_{KS(\mathrm{p.d.f.})} = 0.0028$, $E_{KS(\mathrm{c.d.f.})} = 0.0011$

Figure 4. $N(0, 1)$ pdf and cdf with their GLD estimations.

Figure 5. $N(0, 1)$ pdf and cdf with their XGLD estimations.

Then the initial XGLD estimation for this distribution is

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (0, 5.0633, 0, 5.0633, 0.1349, 1, 0.1349, 0.5)$

Searching the XGLD parameters based on the initial XGLD estimation, we found:

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (-0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328)$

$E_{KS(\mathrm{p.d.f.})} = 0.0009$, $E_{KS(\mathrm{c.d.f.})} = 0.0003$

which shows considerable reductions in both $E_{KS}$ values. Figures 4 and 5 show $N(0, 1)$ with its GLD and XGLD estimators. Figure 5 shows a closer fit than Fig. 4. As the calculations show, the maximum deviation is 0.0028 for the GLD and 0.0009 for the XGLD, a difference of 0.0019.

6.2. Student's t Distribution

The Student's t distribution with $\nu$ degrees of freedom, $t(\nu)$, has the following pdf:

$$f(x) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)\left(1 + \frac{x^2}{\nu}\right)^{\frac{\nu + 1}{2}}}, \quad -\infty < x < +\infty$$

Using Karian and Dudewicz (2000) for $\nu = 5$, the GLD estimation for this distribution is

$(\lambda_1, \lambda_2, \lambda_3, \lambda_4) = (0, -0.2481, -0.1359, -0.1359)$ with the errors $E_{\mathrm{p.d.f.}} = 0.0358$, $E_{\mathrm{c.d.f.}} = 0.0148$

The initial XGLD estimation for this distribution is

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (0, -4.0306, 0, -4.0306, -0.1359, 1, -0.1359, 0.5)$

Figure 6. $t(5)$ pdf and cdf with their GLD estimations.

Searching the XGLD parameters based on the initial XGLD estimation, we found:

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (-2.6499, -3.2207, 0.3786, -5.9003, -0.1471, 1.3453, -0.0975, 0.8637)$

$E_{\mathrm{p.d.f.}} = 0.007$, $E_{\mathrm{c.d.f.}} = 0.0004$

Figures 6 and 7 show the original $t(5)$ with its GLD and XGLD estimators. In Fig. 6, the deviation is visible and equals 0.0358, whereas the deviation resulting from the XGLD is 0.007 (Fig. 7). The XGLD therefore reduces the deviation substantially.

617
618 2 distribution with  degrees of freedom, 2  has the following pdf
619
e− 2
−2 x
620 x 2

621 fx =   x≥0


 2
22
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637 Figure 7. t5 pdf and cdf with their XGLD estimations.
14 Ahmadabadi et al.

638 Using Karian and Dudewicz (2000) for  = 5, the GLD estimation for this
639 distribution is
640
641 1 2 3 4  = 2604 00176 00095 00542 with the errors Ep.d.f. = 00211
642 Ec.d.f. = 00136
643
644 The initial XGLD estimation for this distribution is
645
646 0 1 2 1 1 2 3 c = 2604 568181 0 568181 00095 1 00542 05
647
648 Searching the XGLD parameters based on initial XGLD estimation, we found:
649
650 0 1 2 1 1 2 3 c = 21249 501245 10672 498059 0005 30885
651
00635 0938
652
653 Ep.d.f. = 00021 Ec.d.f = 00044
654
655 Figures 8 and 9 show the original 2 5 with its GLD and XGLD estimators. F8,F9
656 In spite of having maximum deviation in the center of last pdfs, in Fig. 8 the
657 deviation is visible in more points. This is because of weakness of GLD about fitting
658 to uneven distributions. However, it is not true about XGLD in Fig. 9.
659
6.4. Gamma Distribution

The gamma distribution with $a > 0$ and $\beta > 0$, $\Gamma(a, \beta)$, has the following pdf:

$$f(x) = \frac{x^{a - 1} e^{-\frac{x}{\beta}}}{\Gamma(a)\,\beta^{a}}, \quad x \ge 0$$

Using Karian and Dudewicz (2000) for $a = 5$, $\beta = 3$, the GLD estimation for this distribution is

$(\lambda_1, \lambda_2, \lambda_3, \lambda_4) = (10.762, 0.0144, 0.0252, 0.0939)$ with the errors $E_{\mathrm{p.d.f.}} = 0.005$, $E_{\mathrm{c.d.f.}} = 0.012$

Figure 8. $\chi^2(5)$ pdf and cdf with their GLD estimations.

Figure 9. $\chi^2(5)$ pdf and cdf with their XGLD estimations.

The initial XGLD estimation for this distribution is

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (10.762, 69.4444, 0, 69.4444, 0.0252, 1, 0.0939, 0.5)$

Searching the XGLD parameters based on the initial XGLD estimation, we found:

$(\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c) = (8.8655, 67.7578, 2.2643, 65.9773, 0.0157, 3.0214, 0.0993, 1)$

$E_{\mathrm{p.d.f.}} = 0.0011$, $E_{\mathrm{c.d.f.}} = 0.0014$

Figures 10 and 11 show the original gamma (5, 3) with its GLD and XGLD estimators.

Figure 10. $\Gamma(5, 3)$ pdf and cdf with their GLD estimations.

Because the gamma pdf is asymmetric, the GLD cannot fit it closely, and the maximum deviation between them is 0.012. For the XGLD this deviation is 0.0014, about 0.1 times that of the GLD.

741
742 Weibull distribution with  > 0 and  > 0, wbl , has the following pdf:
743
fx = x−1 e−x

744 x≥0
745
746 Using Karian and Dudewicz (2000) for  = 5,  = 2, the GLD estimation for this
747 distribution is
748
749 1 2 3 4  = 09935 10488 02121 01016 with the errors
750 Ep.d.f. = 00354 Ec.d.f. = 00027
751
752 The initial XGLD estimation for this distribution is
753
754 0 1 2 1 1 2 3 c = 09935 09535 0 09535 02121 1 01016 05
755
756 Searching the XGLD parameters based on initial XGLD estimation, we found:
757
758 0 1 2 1 1 2 3 c
759
760 = 06865 10209 00126 07347 01806 29661 01589 02966
761 Ep.d.f. = 00004 Ec.d.f. = 00002
762
763 Figures 12 and 13 show the original wbl(5, 2) with its GLD and XGLD estimators. F12,F13
764 Because the weibull’s pdf is more even than former distributions, it is visible
765 that the fitness of GLD is more, and maximum deviation is 0.0354 (Fig. 12).
766 Nevertheless, XGLD fits perfectly and calculations show the deviation of 0.0004
767 (Fig. 13).
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784 Figure 11. 5 3 pdf and cdf with their XGLD estimations.
Approximating Distributions by XGLD 17

785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
Figure 12. wbl(5, 2) pdf and cdf with their GLD estimations.
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816 Figure 13. wbl(5, 2) pdf and cdf with their XGLD estimations.
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833 Figure 14. logn0 1/3 pdf and cdf with their GLD estimations.
18 Ahmadabadi et al.

834 6.6. Lognormal Distribution


835
Lognormal distribution with parameters and  > 0, logn , has the following
836
pdf:
837
838  
839 1 lnx − 2
fx = √ exp − x≥0
840 x 2 22
841
842 Using Karian and Dudewicz (2000) for = 0,  = 1/3, the GLD estimation for this
843 distribution is
844
845 1 2 3 4  = 08451 01085 00102 00342 with the errors
846
847 Ep.d.f. = 00953 Ec.d.f. = 00123
848
849 The initial XGLD estimation for this distribution is
850
851
852 0 1 2 1 1 2 3 c = 08451 92166 0 92166 00102 1 00342 05

Searching for the XGLD parameters from this initial estimation, we found, in the same parameter order,

(−1.2346, 6.8239, 0.0978, 4.7666, 0.0088, 3.3395, 0.071, 1),

with the errors $E_{\mathrm{p.d.f.}} = 0.0168$ and $E_{\mathrm{c.d.f.}} = 0.0011$.

Figures 14 and 15 show the original logn(0, 1/3) with its GLD and XGLD estimators. As one can see in Fig. 14, the deviation of the GLD is clearly visible and equals 0.0953. In Fig. 15 the deviation is 0.0168, an improvement of about 82%.

Figure 15. logn(0, 1/3) pdf and cdf with their XGLD estimations.

7. Conclusion

In this article, building on the GLD, we proposed an extended model, the XGLD, a parameterized family of quantile (percentile) functions that is derived from the GLD but has more terms and parameters. The additional terms make the family more flexible and, as a direct result, increase the accuracy of the approximations it produces, as the examples demonstrate. By combining and modifying earlier methods, we developed an algorithm for computing the parameters of the XGLD model. To illustrate it, several well-known distributions were approximated by both the GLD and the XGLD, and the quality of the fits was compared. A summary of the results is presented in Table 1. For the sake of brevity, detailed or similar computations for some distributions are not given in the text. Table 1 shows that, for every distribution, the XGLD is at least as accurate as the GLD.
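
To make the comparison in Table 1 concrete, the relative reduction in the maximum pdf deviation can be computed for each distribution. The snippet below is a small illustration using the pdf error columns of Table 1. The uniform case is left out because both errors are zero, and the variable and function names are hypothetical.

```python
# Maximum pdf deviations copied from Table 1 as (GLD, XGLD) pairs.
pdf_errors = {
    "Normal": (0.0028, 0.0009),
    "Student's t": (0.0358, 0.007),
    "Chi-square": (0.0211, 0.0021),
    "Gamma": (0.005, 0.0011),
    "Weibull": (0.0354, 0.0004),
    "Lognormal": (0.0953, 0.0168),
    "Beta": (0.009, 0.0014),
}

def relative_reduction(gld_err, xgld_err):
    """Fraction by which the XGLD lowers the GLD error."""
    return (gld_err - xgld_err) / gld_err

reductions = {
    name: relative_reduction(gld, xgld)
    for name, (gld, xgld) in pdf_errors.items()
}

for name, r in reductions.items():
    print(f"{name:12s} {100 * r:5.1f}%")
```

For the lognormal case this gives about 82%, the figure quoted in Section 6.6.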

As noted earlier, many methods have been introduced for calculating the GLD parameters. One of them is the method of moments, which was used for the calculations above. Another is the percentile matching method, which leads to higher precision than the method of moments. This method is a subset of the starship method. In it, we select some percentiles of the two distributions and determine the GLD parameters so that all of the percentiles match, or so that their deviations are minimized. To show the precision of the proposed method, we used the percentile matching method as well. The overall procedure follows the proposed algorithm, but the initial point is obtained by the percentile method, using the related tables in Karian and Dudewicz (2000). To avoid repeating the calculation details in this article, only the results are presented in Table 2. As can be seen, the pdf error of the XGLD is lower than that of the GLD for all distributions. Note that the XGLD parameters in Table 2 differ from those in Table 1. For example, the parameters of N(0, 1) estimated by the XGLD in Table 1 are

(−0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328),

and in Table 2, where the parameters are listed in a different order, they are

(0.0018, 4.8214, 0.1412, 0.0350, 0.5, 0.9851, 4.8231, 0.1411).

The reason is that these parameters were calculated from different initial points, and each solution stays near its starting point.
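
The percentile matching idea described above can be sketched as an objective function: pick some probability levels, compare the GLD percentiles with those of the target distribution, and sum the squared deviations. This is a minimal illustration, not the authors' procedure. It assumes the Ramberg-Schmeiser form of the GLD quantile function, the chosen levels are arbitrary, and all names are hypothetical. In practice the objective would be handed to a numerical minimizer.

```python
def gld_quantile(y, lam):
    """Assumed GLD quantile: Q(y) = l1 + (y**l3 - (1 - y)**l4) / l2."""
    l1, l2, l3, l4 = lam
    return l1 + (y ** l3 - (1.0 - y) ** l4) / l2

def percentile_mismatch(lam, target_quantile, levels):
    """Sum of squared differences between GLD and target percentiles."""
    return sum(
        (gld_quantile(p, lam) - target_quantile(p)) ** 2 for p in levels
    )

def uniform_quantile(p):
    """Quantile function of the uniform distribution on (0, 1)."""
    return p

levels = [0.1, 0.25, 0.5, 0.75, 0.9]

# The GLD row for the uniform distribution in Table 1 matches every percentile.
exact = percentile_mismatch((0.5, 2.0, 1.0, 1.0), uniform_quantile, levels)

# Shifting the location parameter by 0.1 moves every percentile by 0.1.
shifted = percentile_mismatch((0.6, 2.0, 1.0, 1.0), uniform_quantile, levels)

print(exact, shifted)
```

Because such an objective generally has several local minima, a local search started from different initial points can settle on different parameter vectors, which is consistent with the differences between Tables 1 and 2.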

8. Further Research

Finding appropriate values for the four parameters of the GLD is a difficult task. As already mentioned, various studies have addressed methods for finding these parameters, yet none has presented an automatic method for finding them. Because the XGLD has more parameters, finding them is harder.

Table 1
Some distributions and their GLD and XGLD estimations (using the moment method for the initial point)

| Distribution | GLD parameters $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ | GLD $\max \lvert f(x) - \hat{f}(x) \rvert$ | GLD $\max \lvert F(x) - \hat{F}(x) \rvert$ | XGLD parameters | XGLD $\max \lvert f(x) - \hat{f}(x) \rvert$ | XGLD $\max \lvert F(x) - \hat{F}(x) \rvert$ |
|---|---|---|---|---|---|---|
| Normal(0, 1) | (0, 0.1975, 0.1349, 0.1349) | 0.0028 | 0.0011 | (−0.6216, 5.4032, 0.0903, 4.8026, 0.1187, 1.01, 0.1419, 0.5328) | 0.0009 | 0.0003 |
| Uniform, a = 0, b = 1 | (0.5, 2, 1, 1) | 0 | 0 | (0.5, 0.5, 0, 0.5, 1, 1, 1, 0.5) | 0 | 0 |
| Student's t(5) | (0, −0.2481, −0.1359, −0.1359) | 0.0358 | 0.0148 | (−2.6499, −3.2207, 0.3786, −5.9003, −0.1471, 1.3453, −0.0975, 0.8637) | 0.007 | 0.0004 |
| Chi-square(5) | (2.604, 0.0176, 0.0095, 0.0542) | 0.0211 | 0.0136 | (2.1249, 50.1245, 1.0672, 49.8059, 0.005, 3.0885, 0.0635, 0.938) | 0.0021 | 0.0044 |
| Gamma(5, 3) | (10.762, 0.0144, 0.0252, 0.0939) | 0.005 | 0.012 | (8.8655, 67.7578, 2.2643, 65.9773, 0.0157, 3.0214, 0.0993, 1) | 0.0011 | 0.0014 |
| Weibull(1, 5) | (0.9935, 1.0488, 0.2121, 0.1016) | 0.0354 | 0.0027 | (0.6865, 1.0209, 0.0126, 0.7347, 0.1806, 2.9661, 0.1589, 0.2966) | 0.0004 | 0.0002 |
| Lognormal(0, 1/3) | (0.8451, 0.1085, 0.0102, 0.0342) | 0.0953 | 0.0123 | (−1.2346, 6.8239, 0.0978, 4.7666, 0.0088, 3.3395, 0.071, 1) | 0.0168 | 0.0011 |
| Beta, $\beta_3 = \beta_4 = 1$ | (0.5, 1.9693, 0.4495, 0.4495) | 0.009 | 0.0004 | (0.4997, 0.5022, 0.0569, 0.5017, 0.456, 3.7684, 0.4556, 0.4981) | 0.0014 | 0.0003 |

Table 2
Some distributions and their GLD and XGLD estimations (using the percentile method for the initial point)

| Distribution | GLD parameters $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ | XGLD parameters | GLD $E_{\mathrm{pdf}}$ | GLD $E_{\mathrm{cdf}}$ | XGLD $E_{\mathrm{pdf}}$ | XGLD $E_{\mathrm{cdf}}$ |
|---|---|---|---|---|---|---|
| N(0, 1) | (0, 0.2142, 0.1488, 0.1488) | (0.0018, 4.8214, 0.1412, 0.0350, 0.5000, 0.9851, 4.8231, 0.1411) | 0.0006 | 0.0005 | 0.0003 | 0.0003 |
| t(1) | (0, −2.0676, −0.8727, −0.8727) | (0.0108, −0.5000, −0.8611, 0.0001, 0.5000, 0.0309, −0.4921, −0.8651) | 0.0052 | 0.0024 | 0.0027 | 0.0024 |
| exp(1) | (5.0180, 0.1967, 5.6153, 0.7407) | (7.4368, 0.5792, 9.4995, −0.1176, 0.5143, 1.5424, 7.4817, 0.1494) | 0.7959 | 0.0328 | 0.0056 | 0.0056 |
| chi2(5) | (2.4772, 0.0345, 0.0187, 0.1163) | (2.3975, 28.9063, 0.0157, −0.0047, 0.4973, −0.0151, 29.0579, 0.122) | 0.0152 | 0.0120 | 0.0093 | 0.0093 |
| Gamma(5, 3) | (10.7717, 0.0223, 0.0426, 0.1541) | (10.6541, 44.7162, 0.0364, 0.3645, 0.5012, 0.8657, 44.9589, 0.1581) | 0.0032 | 0.0080 | 0.0030 | 0.0040 |
| Weibull(1, 5) | (0.9823, 1.0492, 0.2031, 0.1136) | (0.6275, 1.0563, 0.1698, −0.0001, 0.4981, 0.3396, 0.7177, 0.1711) | 0.0191 | 0.0019 | 0.0016 | 0.0005 |
| Lognormal(0, 1/3) | (0.8393, 0.2934, 0.02937, 0.1005) | (6.8630, 0.6879, 0.2191, −0.0006, 0.5250, 0.0032, 6.6576, 0.0446) | 0.0538 | 0.0070 | 0.0027 | 0.0022 |
| F(6, 25) | (0.5290, 0.02885, 0.003070, 0.01930) | (0.4285, 34.5249, 0.0022, 0.0039, 0.4979, −0.0633, 34.4989, 0.0219) | 0.0656 | 0.0115 | 0.0367 | 0.0210 |

This difficulty becomes more apparent when k, the number of terms in the XGLD model, increases to 4, 5, .... Here we have considered only the case k = 3. It is interesting to note that the basic functions $(y - c)^{*}$ of the XGLD model are continuous and monotone for $y \in (0, 1)$. One can consider other types of such functions as the basic functions of the XGLD.
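
The role of the basic functions can be illustrated with a sign-preserving power, sign(y − c)·|y − c|^β, which is continuous and increasing in y when β > 0. Treating this as the basic function is an assumption made only for this sketch, since the precise definition is given earlier in the paper and is not restated here. The function names and the (coefficient, centre, exponent) layout of the terms are likewise hypothetical.

```python
import math

def signed_power(y, c, beta):
    """sign(y - c) * |y - c|**beta; an assumed form of the basic function."""
    d = y - c
    if d == 0.0:
        if beta <= 0:
            raise ValueError("undefined at y == c for beta <= 0")
        return 0.0
    return math.copysign(abs(d) ** beta, d)

def xgld_like_quantile(y, offset, terms):
    """offset + sum of a * signed_power(y, c, beta) over (a, c, beta) terms."""
    return offset + sum(a * signed_power(y, c, beta) for a, c, beta in terms)

def is_nondecreasing_on_grid(c, beta, n=1000):
    """Check monotonicity of the basic function on y = 1/n, ..., (n-1)/n."""
    values = [signed_power(i / n, c, beta) for i in range(1, n)]
    return all(a <= b for a, b in zip(values, values[1:]))

# Three terms (k = 3) centred at 0, 0.5 and 1, using the uniform row of
# Table 1 rearranged into the assumed (coefficient, centre, exponent) layout.
uniform_terms = [(0.5, 0.0, 1.0), (0.0, 0.5, 1.0), (0.5, 1.0, 1.0)]

print(xgld_like_quantile(0.25, 0.5, uniform_terms))
print(is_nondecreasing_on_grid(c=0.3, beta=2.5))
```

Under this reading, adding a fourth or fifth term only means appending another (coefficient, centre, exponent) triple, which shows how quickly the number of parameters to be searched grows with k.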

References

Asquith, W. H. (2007). L-moments and TL-moments of the generalized lambda distribution. Computational Statistics & Data Analysis 51:4484–4496.
Bigerelle, M., Najjar, D., Fournier, B., Rupin, N., Iost, A. (2005). Application of lambda distributions and bootstrap analysis to the prediction of fatigue lifetime and confidence intervals. International Journal of Fatigue 28:223–236.
Bratley, P., Fox, B. (1988). Algorithm 659: implementing Sobol's quasirandom sequence generator. ACM Transactions on Mathematical Software 14(1):88–100.
D'Agostino, R., Stephens, M. (1986). Goodness-of-Fit Techniques, Statistics: Textbooks and Monographs. New York: Marcel Dekker.
Delaney, H. D., Vargha, A. (2000). The effect of non-normality on Student's two-sample t-test. Annual Meeting of the American Educational Research Association, New Orleans.
Dengiz, B. (1988). The generalized lambda distribution in simulation of M/M/1 queue systems. J. Fac. Engng. Arch. Gazi Univ. 3:161–171.
Devroye, L. (2006). Non-Uniform Random Variate Generation. New York: Springer.
Filliben, J. J. (1969). Simple and Robust Linear Estimation of the Location Parameter of a Symmetric Distribution. Ph.D. Dissertation, Princeton University, Princeton, NJ.
Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics 17(111).
Fournier, B. et al. (2007). Estimating the parameters of a generalized lambda distribution. Computational Statistics & Data Analysis 51:2813–2835.
Fournier, B., Rupin, N., Bigerelle, M., Najjar, D., Iost, A. (2006). Application of the generalized lambda distributions in a statistical process control methodology. Journal of Process Control 16:1087–1098.
Freimer, M., Mudholkar, G., Kollia, G., Lin, C. (1988). A study of the generalized Tukey lambda family. Communications in Statistics, Theory and Methods 17(10):3547–3567.
Ganeshan, R. (2001). Are more suppliers better?: generalizing the Gau and Ganeshan procedure. Journal of the Operational Research Society 52:122–123.
Headrick, T. C., Mugdadi, A. (2006). On simulating multivariate non-normal distributions from the generalized lambda distribution. Computational Statistics & Data Analysis 50:3343–3353.
Karian, Z. A., Dudewicz, E. J. (2000). Fitting Statistical Distributions: The Generalized Lambda Distribution and Generalized Bootstrap Method. CRC Press.
Karian, Z. A., Dudewicz, E. J. (2003). Comparison of GLD fitting methods: superiority of percentile fits to moments in L2 norm. Journal of the Iranian Statistical Society 2:171–187.
Karian, Z. A., Dudewicz, E. J., McDonald, P. (1996). The extended generalized lambda distribution system for fitting distributions to data: history, completion of theory, tables, applications, the "Final word" on moments fits. Communications in Statistics, Simulation and Computation 25:611–642.
Karvanen, J. (2003). Generation of Correlated Non-Gaussian Random Variables from Independent Components. Fourth International Symposium on Independent Component Analysis and Blind Signal Separation, Nara, Japan.
Karvanen, J., Nuutinen, A. (2008). Characterizing the generalized lambda distribution by L-moments. Computational Statistics & Data Analysis 52:1971–1983.
King, R., MacGillivray, H. (1999). A starship estimation method for the generalized lambda distributions. Australian & New Zealand Journal of Statistics 41:353–374.
Lakhany, A., Mausser, H. (2000). Estimating the parameters of the generalized lambda distribution. Algo Research Quarterly 3(3).
Najjar, D., Bigerelle, M., Lefebvre, C., Iost, A. (2003). A new approach to predict the pit depth extreme value of a localized corrosion process. ISIJ 43:720–725.
Nelder, J., Mead, R. (1965). A simplex method for function minimization. Computer Journal 7:308–313.
Öztürk, A., Dale, R. F. (1982). A study of fitting the generalized lambda distribution to solar radiation data. Journal of Applied Meteorology 21:995–1004.
Öztürk, A., Dale, R. F. (1985). Least squares estimation of the parameters of the generalized lambda distribution. Technometrics 27(1):81–84.
Ramberg, J., Schmeiser, B. (1974). An approximate method for generating asymmetric random variables. Communications of the ACM 17(2):78–82.
Ramberg, J. S., Schmeiser, B. W. (1972). An approximate method for generating symmetric random variables. Communications of the ACM 15:987–990.
Ramberg, J. S., Dudewicz, E., Tadikamalla, P., Mykytka, E. (1979). A probability distribution and its use in fitting data. Technometrics 21:201–214.
Su, S. (2007). Numerical maximum log likelihood estimation for generalized lambda distributions. Computational Statistics & Data Analysis 51:3983–3998.
Tukey, J. W. (1962). The future of data analysis. Annals of Mathematical Statistics 33(1):1–67.
Upadhyay, R. R., Ezekoye, O. A. (2008). Treatment of design fire uncertainty using quadrature method of moments. Fire Safety Journal 43:127–139.