www.elsevier.com/locate/ijfatigue
Received 16 October 2000; received in revised form 8 January 2001; accepted 8 January 2001
Abstract
For service life prediction and stochastic reconstruction of load histories, rainflow matrices have recently been the predominant means of describing the scatter of loading. Typically, only limited data are available due to the cost of measurements. Consequently, discrete rainflow matrices have to be modelled and extrapolated. So far, non-parametric methods have most frequently been used to transform discrete matrices into smooth functions. In this paper, two appropriate parametric models (a mixture of joint Weibull–normal distributions and a mixture of multi-variate normals) and two algorithms for parameter estimation (the EM algorithm and the algorithm developed by Nagode and Fajdiga) are thoroughly discussed and compared. Finally, a method to describe the scatter of rainflow matrices is presented. © 2001 Elsevier Science Ltd. All rights reserved.
Keywords: Load spectra characterisation; Load spectra extrapolation; Rainflow cycle counting method; Weibull–normal mixture; Multivariate normal mixture
0142-1123/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved.
PII: S0142-1123(01)00007-X
526 M. Nagode et al. / International Journal of Fatigue 23 (2001) 525–532
parametric methods of distribution estimation are used instead of non-parametric ones. A survey of the parametric methods appropriate for a mixture of multi-variate normals can be found in Ref. [7]. The advantages of these are as follows:

• Unknown parameters of the mixture of multi-variate normals can be estimated reliably by using the well-known EM algorithm [8].
• Possible correlation between ranges and means can be taken into account.
• The method can easily be extended to more than two independent variables.

As a multi-variate normal probability density function (PDF) does not equal zero for negative ranges, it has to be cut off. Therefore, its cumulative distribution function (CDF) $F(+\infty)$ is less than one, unless it is compensated [7]. It has to be stated that this can be done only if $F(+\infty)$ is very close to one. Besides, multi-variate normals are not appropriate for all load time histories, such as those where ranges obey the exponential law.

If rainflow matrices are modelled by a mixture of joint Weibull–normal distributions, the above shortcomings can be avoided.

The joint Weibull–normal component PDF is

$$f_l(\mathbf{s}) = f_l(s_r)\,f_l(s_m) = \frac{\beta_l}{\sqrt{2\pi}\,\sigma_l s_r}\left(\frac{s_r}{\theta_l}\right)^{\beta_l}\exp\left\{-\left(\frac{s_r}{\theta_l}\right)^{\beta_l}-\frac{(s_m-\mu_l)^2}{2\sigma_l^2}\right\} \tag{2}$$

Its component CDF is therefore

$$F_l(\mathbf{s}) = F_l(s_r)\,F_l(s_m) = \left(1-\exp\left\{-\left(\frac{s_r}{\theta_l}\right)^{\beta_l}\right\}\right)\Phi\left(\frac{s_m-\mu_l}{\sigma_l}\right) \tag{3}$$

where $\beta_l$ and $\theta_l$ stand for the Weibull shape and scale parameters, and $\mu_l$ and $\sigma_l$ denote mean value and standard deviation, respectively.

In Eqs. (2) and (3) it is presumed that $s_r$ and $s_m$ corresponding to the $l$th component distribution are independent variables. However, a correlation may exist between the components forming the mixture, which can be shown by inserting Eq. (2) in Eq. (1). Consequently, by choosing an appropriate number of components $m$, an arbitrary rainflow matrix can be modelled even if random variables $s_r$ and $s_m$ are correlated. This can be assumed since many similar examples exist, such as in Ref. [8], where mixtures of joint normals are used to model empirical distributions of any shape, although the contours of constant probability density of the joint normal distribution Eq. (4) are ellipsoids.

If random variables $s_r$ and $s_m$ in Eqs. (2) and (3) are replaced with $s_r = |s_f - s_t|$ and $s_m = 0.5(s_f + s_t)$, the distribution function of the from–to matrix is obtained, where $s_f$ and $s_t$ stand for from- and to-reversal points.

Although a mixture of joint Weibull–normals seems to be the genuine distribution suitable for modelling any rainflow matrix, a number of cases exist where the mixture of multi-variate normals can assure satisfactory results, too.

Multi-variate normal component PDF is defined as

$$f_l(\mathbf{s}) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}_l|^{1/2}}\exp\left\{-\frac{1}{2}(\mathbf{s}-\boldsymbol{\mu}_l)^T\boldsymbol{\Sigma}_l^{-1}(\mathbf{s}-\boldsymbol{\mu}_l)\right\} \tag{4}$$

... multi-dimensional space.

3. Component probability density estimation

Suppose that the component PDF $f_l(\mathbf{s})$ represents a multi-variate distribution, where $\mathbf{s}$ denotes the $d$-dimensional random vector $[s_1, s_2, \ldots, s_d]^T$. Let $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_N$ denote an empirical sample of size $N$ of random vectors. From the random sample it is always possible to obtain the
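Eqs. (2) and (3) can be evaluated directly. The following is a minimal Python sketch, not the authors' implementation: the function and argument names (`weibull_normal_pdf`, `beta`, `theta`, `mu`, `sigma`) are hypothetical, and the mixture is assumed to be a weighted sum of component densities with weights $w_l$ summing to one.

```python
import math


def weibull_normal_pdf(s_r, s_m, beta, theta, mu, sigma):
    """Component PDF of Eq. (2): Weibull in the range s_r, normal in the mean s_m."""
    if s_r <= 0.0:
        return 0.0  # sketch convention: no density for non-positive ranges
    z = (s_r / theta) ** beta
    weibull = beta / s_r * z * math.exp(-z)
    normal = math.exp(-0.5 * ((s_m - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    return weibull * normal


def weibull_normal_cdf(s_r, s_m, beta, theta, mu, sigma):
    """Component CDF of Eq. (3): product of the two marginal CDFs."""
    if s_r <= 0.0:
        return 0.0
    weibull = 1.0 - math.exp(-((s_r / theta) ** beta))
    normal = 0.5 * (1.0 + math.erf((s_m - mu) / (sigma * math.sqrt(2.0))))
    return weibull * normal


def mixture_pdf(s_r, s_m, components):
    """Assumed mixture form: sum of w_l * f_l(s).

    components is a list of tuples (w, beta, theta, mu, sigma).
    """
    return sum(w * weibull_normal_pdf(s_r, s_m, *params) for w, *params in components)
```

With `beta = 1` the range marginal reduces to the exponential law, which is the case mentioned above as poorly served by multi-variate normals.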
A method based on maximising the likelihood of the parameters for the given data set can be used to determine the $f_l(\mathbf{s})$. The negative log-likelihood for the data set is given by

$$\ln L = -\sum_{i=1}^{N}\ln\left(f_l(\mathbf{s}_i)\right) - \lambda\ln\left(\frac{f'(\mathbf{s}_{\max})}{f_l(\mathbf{s}_{\max})}\right) \tag{6}$$

taking into account the constraint (5) by introducing Lagrange multiplier $\lambda$.

Let us first consider the case of the multi-variate normal distribution (4). Hence it follows that the mean value can roughly be estimated by taking $\boldsymbol{\mu}_l = \mathbf{s}_{\max}$. Constraint (5) can also be rewritten as

$$0 = \ln\left((2\pi)^{d/2} f'(\mathbf{s}_{\max})\right) + \frac{1}{2}\ln|\boldsymbol{\Sigma}_l| \tag{7}$$

If the first derivative of Eq. (8) with respect to $\boldsymbol{\Sigma}_l$ is set to zero, a rough estimate of the covariance matrix is obtained

$$\boldsymbol{\Sigma}_l = \frac{1}{N-\lambda}\sum_{i=1}^{N}(\mathbf{s}_i-\boldsymbol{\mu}_l)(\mathbf{s}_i-\boldsymbol{\mu}_l)^T \tag{9}$$

where Lagrange multiplier $\lambda$ can be easily found by

$$\lambda = N - 2\pi\left\{f'(\mathbf{s}_{\max})^2\left|\sum_{i=1}^{N}(\mathbf{s}_i-\boldsymbol{\mu}_l)(\mathbf{s}_i-\boldsymbol{\mu}_l)^T\right|\right\}^{1/d} \tag{10}$$

It is not easy to determine the parameters of $f_l(\mathbf{s})$ for any multi-variate distribution by minimizing Eq. (6). If random vector components (see Eq. (2)) are independent, parameters can be roughly estimated as shown below.

The joint component distribution of $\mathbf{s}$ is now

$$f_l(\mathbf{s}) = \prod_{i=1}^{d} f_l(s_i) \tag{11}$$

The constraint that the global mode should be preserved during rough parameter estimation yields

If presumed that the derivatives of Eq. (11) with respect to $\mathbf{s}$ at the global mode position equal zero, the second set of constraints is obtained

$$\frac{\partial f_l(s_{i\max})}{\partial s_i} = 0; \quad (i = 1, \ldots, d) \tag{14}$$

Considering both sets, unknown parameters of the joint Weibull–normal distribution are

$$\mu_l = s_{m\max}, \quad \sigma_l = \frac{1}{\epsilon\sqrt{2\pi}\,f'(s_{m\max})}, \quad \theta_l = s_{r\max}\left(\frac{\beta_l}{\beta_l-1}\right)^{1/\beta_l} \tag{15}$$

As the global mode position $\mathbf{s}_{\max}$ is affected by the neighbouring component densities, the exact mode location of the observed distribution does not necessarily coincide with $\mathbf{s}_{\max}$. Therefore, during the optimal parameter estimation phase Lagrange multiplier $\lambda$ in Eq. (6) has to be set to zero

$$\ln L = -\sum_{i=1}^{N}\ln\left(f_l(\mathbf{s}_i)\right) \tag{17}$$

If the above equation is now minimised with respect to unknown parameters, optimal estimate of the component PDF $f_l(\mathbf{s})$ is reached. Choosing $f_l(\mathbf{s})$ as a multi-variate normal distribution results in

$$\boldsymbol{\mu}_l = \frac{1}{N}\sum_{i=1}^{N}\mathbf{s}_i \quad \text{and} \quad \boldsymbol{\Sigma}_l = \frac{1}{N}\sum_{i=1}^{N}(\mathbf{s}_i-\boldsymbol{\mu}_l)(\mathbf{s}_i-\boldsymbol{\mu}_l)^T \tag{18}$$

On the other hand, if component PDF is a joint Weibull–normal distribution, we get

$$\mu_l = \frac{1}{N}\sum_{i=1}^{N}s_{mi}, \quad \sigma_l^2 = \frac{1}{N}\sum_{i=1}^{N}(s_{mi}-\mu_l)^2, \quad \theta_l = \left(\frac{1}{N}\sum_{i=1}^{N}s_{ri}^{\beta_l}\right)^{1/\beta_l} \tag{19}$$

where $\beta_l$ is the solution of the equation

$$0 = \frac{1}{\beta_l} + \frac{1}{N}\sum_{i=1}^{N}\ln(s_{ri}) - \frac{\sum_{i=1}^{N}s_{ri}^{\beta_l}\ln(s_{ri})}{\sum_{i=1}^{N}s_{ri}^{\beta_l}} \tag{20}$$

A numerical method, such as the Newton–Raphson one, can be used to solve Eqs. (16) and (20).

... procedure may converge after a certain number of iterations. Nevertheless, it was noticed that the EM algorithm becomes slow and unreliable, especially if the number of component distributions is high.

A new algorithm (NA) suitable for mixture distributions [11] works equally well with any distribution function. Its reliability does not depend on the number of component distributions. Besides, it is also much quicker than the EM algorithm. Initially, the algorithm in question was developed solely for the mixtures of one random variable, which is the reason why the expansion of the algorithm to multi-dimensional space is discussed next.

It was proved that when the size of random sample $N$ and/or the number of random variables $d$ are high, the performance of the algorithm can be improved substantially if random vectors $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_N$ are rounded to pre-determined grid points $\mathbf{s}_1^*, \mathbf{s}_2^*, \ldots, \mathbf{s}_K^*$, while $K \ll N$. Let $V_i$ denote the volume of a hypersquare with sides of length $h_1, h_2, \ldots, h_d$ centred on grid point $\mathbf{s}_i^*$, whilst the relative number of points falling within $V_i$ is defined as

$$f'_i = \frac{n_i}{N}; \quad (i = 1, \ldots, K) \tag{21}$$

where the adjoining volumes should not overlap.

The proposed procedure of working out unknown constants is iterative and carried out in the following stages:

(i) Global mode position $\mathbf{s}_{\max}^* = [s_{1\max}^*, s_{2\max}^*, \ldots, s_{d\max}^*]^T$ and its probability density $f'(\mathbf{s}_{\max}^*)$ are obtained from the random sample. Residuum $r'_i = 0$ for $(i = 1, \ldots, K)$. Further, the initial weighting factor stems from

$$w_l = 1 - \sum_{i=1}^{l-1} w_i \tag{22}$$

(ii) A rough estimate of the $l$th component distribution is made according to the findings discussed in the preceding section.

(iii) Firstly, the mean value of absolute relative deviations $D_a(I)$

$$D_a(I) = \sum_{i=1}^{K}|\Delta f'_i| = \sum_{i=1}^{K}\left|f'_i - \int_{V_i} f_l(\mathbf{s})\,d\mathbf{s}\right| \approx \sum_{i=1}^{K}\left|f'_i - V_i f_l(\mathbf{s}_i^*)\right| \tag{23}$$

and relative differences $\epsilon_i$ are computed for those indices $i$ for which empirical relative frequencies $f'_i > 0$ and $f_l(\mathbf{s}_i^*)$ is greater than a pre-determined small number in order to avoid division by zero. Secondly, maximal relative difference is extracted

$$\epsilon_{\max} = \max\{\epsilon_i;\ i = 1, \ldots, K\} \tag{25}$$

The discrepancy between two successive iterations $I$ is then

$$D_a(I-1) - D_a(I) \le e_r \tag{26}$$

If it is less or equal to error rate $e_r$ and if parameters are estimated roughly, go to point (ii) and replace rough estimate by the optimal one. Reset $D_a$ to its initial value. When inequality is fulfilled the next time, optimal parameter estimate is reached. Increase $l = l + 1$, set $f'_i = r'_i$ for $(i = 1, \ldots, K)$ and return to (i). If none of the above is true, proceed with point (iv).

(iv) For the indices $i$ fulfilling the condition $(f'_i > 0 \lor r'_i > 0) \land (\epsilon_i < 0 \lor \epsilon_i \ge \epsilon_{\max})$ the following transformation is carried out:

$$f'_i = f'_i - \Delta f'_i, \quad r'_i = r'_i + \Delta f'_i, \quad w_l = w_l - \Delta f'_i \tag{27}$$

The negative value of $\Delta f'_i$ should never be greater than the residuum value. However, if this is not true, the error has to be corrected $\Delta f'_i = -r'_i$ before the above transformation is made.

The whole procedure should be repeated $(m-1)$ times, once for each $f_l(\mathbf{s})$. The remaining residuum
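The optimal Weibull–normal estimates of Eqs. (19) and (20) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the names `shape_equation` and `fit_weibull_normal`, the starting value `beta0`, the tolerance, and the halving safeguard against non-positive shape values are all hypothetical choices. All ranges are assumed to be strictly positive and not all equal.

```python
import math


def shape_equation(beta, ranges):
    """Right-hand side of Eq. (20); its root is the Weibull shape estimate."""
    logs = [math.log(s) for s in ranges]
    powers = [s ** beta for s in ranges]
    weighted = sum(p * lg for p, lg in zip(powers, logs))
    return 1.0 / beta + sum(logs) / len(ranges) - weighted / sum(powers)


def fit_weibull_normal(ranges, means, beta0=1.0, tol=1e-10, max_iter=100):
    """Return (beta, theta, mu, sigma) following Eqs. (19) and (20)."""
    n = len(ranges)
    logs = [math.log(s) for s in ranges]
    beta = beta0
    for _ in range(max_iter):
        powers = [s ** beta for s in ranges]
        a = sum(powers)
        b = sum(p * lg for p, lg in zip(powers, logs))
        c = sum(p * lg * lg for p, lg in zip(powers, logs))
        g = 1.0 / beta + sum(logs) / n - b / a
        dg = -1.0 / beta ** 2 - (c * a - b * b) / (a * a)  # derivative of Eq. (20)
        new_beta = beta - g / dg                           # Newton-Raphson step
        if new_beta <= 0.0:
            new_beta = 0.5 * beta                          # assumed safeguard
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    theta = (sum(s ** beta for s in ranges) / n) ** (1.0 / beta)
    mu = sum(means) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in means) / n)
    return beta, theta, mu, sigma
```

A call such as `fit_weibull_normal(ranges, means)` takes the ranges $s_{ri}$ and means $s_{mi}$ of the cycles assigned to one component; the residual `shape_equation(beta, ranges)` gives a quick check that Eq. (20) is satisfied.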
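The preprocessing of Eq. (21) and the deviation measure of Eq. (23) can be illustrated with a short Python sketch. The grid layout is an assumption made here for illustration (cells of side lengths `h` aligned to the origin, with the grid point at each cell centre), the sum runs over occupied cells only, and the function names are hypothetical.

```python
import math
from collections import Counter


def relative_frequencies(sample, h):
    """Round points to cell centres and return {centre: n_i / N}, as in Eq. (21)."""
    counts = Counter(
        tuple((math.floor(x / hj) + 0.5) * hj for x, hj in zip(point, h))
        for point in sample
    )
    n = len(sample)
    return {centre: count / n for centre, count in counts.items()}


def absolute_deviation(freqs, pdf, h):
    """Approximate D_a of Eq. (23): sum of |f'_i - V_i * f_l(s_i*)| over occupied cells."""
    volume = math.prod(h)  # V_i is the same for every cell on a regular grid
    return sum(abs(f - volume * pdf(centre)) for centre, f in freqs.items())
```

Here `pdf` is any callable that returns the component density at a grid point given as a tuple, so either component model discussed above can be plugged in.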
Fig. 2. Joint Weibull–normal mixture CDF with its marginal CDFs (m=5).
Fig. 3. The conditional PDFs of the number of load cycles f(n|s) and of vector s f(s|n).
... particular case $f(s_r, s_m|n)$ defines the scatter of ranges and means, given that $n$ is fixed.

For reliable fatigue damage prediction, the randomness of fatigue resistance of material as well as the randomness of loading have to be considered [14]. Since the resistance of material is usually described with a conditional PDF of the number of cycles to failure, the conditional PDF of the number of load cycles $f(n|\mathbf{s})$ is of great significance, too.

When $f(\mathbf{s})$, $f(\mathbf{s}|n)$ and $f(n)$ are known, the conditional PDF of the number of load cycles $f(n|\mathbf{s})$ can be determined by the Bayesian theorem

$$f(n|\mathbf{s}) = \frac{f(\mathbf{s}|n)\,f(n)}{f(\mathbf{s})} \tag{30}$$

The unknown PDF of the number of load cycles $f(n)$ can be obtained by introducing the one-to-one transformation

$$g_1(\mathbf{s}) = n = N(1-F(\mathbf{s})), \quad g_2(\mathbf{s}) = s_2, \ldots, \quad g_d(\mathbf{s}) = s_d \tag{31}$$

Let distribution $f(n, s_2, \ldots, s_d)$ first be expressed as

$$f(n, s_2, \ldots, s_d) = \frac{f(\mathbf{s})}{|J(\mathbf{s})|} = \frac{f(\mathbf{s})}{N\,\partial F(\mathbf{s})/\partial s_1} \tag{32}$$

where

$$J(\mathbf{s}) = \begin{vmatrix} \dfrac{\partial g_1}{\partial s_1} & \cdots & \dfrac{\partial g_1}{\partial s_d} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial g_d}{\partial s_1} & \cdots & \dfrac{\partial g_d}{\partial s_d} \end{vmatrix} = -N\frac{\partial F(\mathbf{s})}{\partial s_1} \tag{33}$$

is the Jacobian of the transformation (31). As the conditional distribution of the number of load cycles

$$f(n|s_2, \ldots, s_d) = \frac{f(n, s_2, \ldots, s_d)}{\int_0^N f(n, s_2, \ldots, s_d)\,dn} = \frac{1}{N} \tag{34}$$

By inserting Eqs. (28) and (35) into Eq. (30) the conditional PDF of the number of load cycles is finally obtained

$$f(n|\mathbf{s}) = \binom{N-1}{n-1}F(\mathbf{s})^{N-n}\left(1-F(\mathbf{s})\right)^{n-1} \tag{36}$$

The distribution (36) can be approximated well by normal distribution [15] (see Fig. 3). This yields

$$f(n|\mathbf{s}) = \frac{1}{\sqrt{2\pi}\,\sigma(\mathbf{s})}\exp\left\{-\frac{1}{2}\left(\frac{n-\mu(\mathbf{s})}{\sigma(\mathbf{s})}\right)^2\right\} \tag{37}$$

Mean value $\mu(\mathbf{s})$ and standard deviation $\sigma(\mathbf{s})$ have values

$$\mu(\mathbf{s}) = 1 + (N-1)(1-F(\mathbf{s})), \quad \sigma(\mathbf{s}) = \sqrt{(N-1)(1-F(\mathbf{s}))F(\mathbf{s})} \tag{38}$$

Conditional distribution functions as defined in Eqs. (28, 36) and (37) can thus be applied for any mixture irrespective of the number of random variables and their correlation.

6. Discussion

The characteristics of the two methods of parameter estimation and the distinction between the multi-variate normal and Weibull–normal mixtures are discussed next.

An empirical complex shaped rainflow matrix (see Fig. 10) was modelled in three ways: firstly, by the Weibull–normal mixture, whilst the unknown parameters were estimated by the NA. Secondly, the EM algorithm was employed instead of the new one. Thirdly, a multi-variate normal mixture with the EM algorithm was used to model the empirical matrix. For the sake of comparability, the number of components $m$ was set to 5 in all three cases. The empirical and hypothetical PDFs as well as the corresponding CDFs are depicted in Figs. 4–9. To distinguish one from another, empirical probabilities were marked with black circles.

All three approaches result in a fairly good agreement between empirical and hypothetical values in the region of high probability. However, the best agreement in this region is obtained if a Weibull–normal mixture combined with the EM algorithm is used. In the region of low probability, the probability of occurrence of high
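The relation between Eq. (36) and its normal approximation, Eqs. (37) and (38), can be checked numerically. The Python sketch below is illustrative only; the function names are hypothetical, `N` is the total number of cycles, `F` stands for the CDF value $F(\mathbf{s})$ at the point of interest, and $0 < F < 1$ is assumed.

```python
import math


def cycles_binomial(n, N, F):
    """Eq. (36): probability of n load cycles (n = 1..N) given F = F(s)."""
    return math.comb(N - 1, n - 1) * F ** (N - n) * (1.0 - F) ** (n - 1)


def cycles_normal(n, N, F):
    """Eq. (37) with mean and standard deviation taken from Eq. (38)."""
    mu = 1.0 + (N - 1) * (1.0 - F)
    sigma = math.sqrt((N - 1) * (1.0 - F) * F)
    return math.exp(-0.5 * ((n - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
```

Summing `n * cycles_binomial(n, N, F)` over `n = 1..N` reproduces the mean of Eq. (38), which is one way to see why the normal form uses those two moments.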
er. The higher the error rate, the lower the number of components m, and vice versa. As components are estimated one by one, the number of cycles left in the residuum falls as component index l rises, which influences the number of components, too. In other words, when the residuum is (nearly) empty, no further component can be assigned to the mixture.
Besides, it was noted that the treatment of the last component is significant if components are far apart. Both phenomena are left to further research.