
Aven

CH042.tex

17/5/2007

9: 24

Page 339

Risk, Reliability and Societal Safety Aven & Vinnem (eds)


2007 Taylor & Francis Group, London, ISBN 978-0-415-44786-7

Calculation of PFD-values for a safety related system


J. Börcsök & P. Holub
HIMA Paul Hildebrandt GmbH + Co KG, Brühl, Germany
Computer architecture & System programming, University of Kassel, Germany

M.H. Schwarz & N.T. Dang Pham


Computer architecture & System programming, University of Kassel, Germany

ABSTRACT: The standard IEC/EN 61508 provides developers with guidelines for developing and implementing
safety related systems. It supplies qualitative and quantitative criteria for evaluating such systems,
so that they can be used in safety critical applications. This paper details one of these criteria,
the Probability of Failure on Demand (PFD). The authors derive the necessary equations and calculate
the PFD-values for different system architectures.

1 INTRODUCTION

Safety related systems have to be developed, tested, used and maintained according to national and international standards. The standard IEC/EN 61508 [Börcsök2004], [IEC61508_2000] covers all aspects of

Electrical,
Electronic and
Programmable electronic systems

for safety related functions and usability. This standard provides the basis for all safety related electrical, electronic and programmable electronic systems. It enables a systematic and risk based methodology for safety related problems.

1.1 IEC 61508

The standard IEC/EN 61508 [IEC61508_2000] is divided into seven chapters. However, only the first four present normative requirements for the development. Each chapter and its content are listed in Table 1.

Table 1. IEC/EN 61508.

Chapter           Content
IEC/EN 61508-1    General requirements
IEC/EN 61508-2    Hardware requirements
IEC/EN 61508-3    Software requirements
IEC/EN 61508-4    Notation and abbreviations
IEC/EN 61508-5    Example to calculate the different safety integrity levels (SIL)
IEC/EN 61508-6    Application guidelines for IEC/EN 61508-2 and IEC/EN 61508-3
IEC/EN 61508-7    Overview of techniques and actions

1.2 Safety Integrity Level requirements

In the standard IEC/EN 61508, the requirements on safety related systems and equipment are divided into four safety integrity levels (SIL 1 to 4). Equipment, sensors, units or systems which are part of a safety function must have a safety classification [Börcsök2004, IEC61508_2000, Storey1996]. Additionally, a new quality and recognition is required and added to the perception of safety. To evaluate the safety functions of a system, three values are of great importance:

Probability of Failure on Demand (PFD)
Hardware Fault Tolerance (HFT)
Safe Failure Fraction (SFF).

The safety integrity level is one of four discrete levels. Each level corresponds to a range of the probability of failure. SIL 4 is the highest safety level and SIL 1 is the lowest [IEC61508_2000], [Storey1996].
Table 2 shows the requirements of the different safety integrity levels (SIL) as a function of the probability of failure. The probability values are defined as a PFD-value (probability of failure on demand) if the system is in a low demand mode and has to execute a


Table 2. Definition of safety integrity level according to IEC 61508.

SIL   Low demand mode of operation              High demand or continuous mode of operation
      (average probability of failure to        (probability of a dangerous failure per hour)
      perform its design function on demand)
4     ≥ 10^-5 to < 10^-4                        ≥ 10^-9 to < 10^-8
3     ≥ 10^-4 to < 10^-3                        ≥ 10^-8 to < 10^-7
2     ≥ 10^-3 to < 10^-2                        ≥ 10^-7 to < 10^-6
1     ≥ 10^-2 to < 10^-1                        ≥ 10^-6 to < 10^-5

safety function once per year or less often. However, if a system operates in a high demand or continuous mode and the safety function has to be executed more than once per year, then the probability of failure is specified as the PFH (probability of failure per hour). Its unit is 1/h [Börcsök2004], [IEC61508_2000], [Storey1996].

1.3 Steps of calculations of the probability of failure

Part 6 of IEC/EN 61508 details the specifications for quantitative estimations of a system. The calculations are split into the following steps:

Identify the block diagrams according to the selected structure
Estimate the failure rate
Determine the β-factor applying the tables stated in IEC 61508
Estimate the diagnostic coverage (DC)
Determine the safe failure fraction (SFF)
Calculate the PFD values of each subsystem and sum up the values of all subsystems
Determine the SIL-values using the PFD, the SFF and the hardware fault tolerance.

2 RELIABILITY AND SAFETY

2.1 Quantitative calculation of the probability of failure

Different systematic methods exist to analyse the safety integrity level (SIL) of safety related systems. The most common are:

Reliability block diagram
Markov Model
Monte Carlo Simulation.

These methods provide similar results if correctly applied. However, the first method is less accurate than the last two [Börcsök2004], [Robert&Casella1999].

2.2 Reliability, MTTF and probability of failure

The reliability function R(t) is the probability that an item is operational throughout the period under consideration (0, t). Generally, the reliability function has the form

    R(t) = \exp\left( -\int_0^t \lambda(\tau)\, d\tau \right)    (1)

If an exponential distribution is valid, then the failure rate is constant:

    \lambda(t) = \lambda = \mathrm{const}    (2)

Then Equation (1) can be rewritten as:

    R(t) = e^{-\lambda t}    (3)

Figure 1 shows R(t) as function of time [Börcsök2004], [IEC61508_2000].

Figure 1. Reliability function R(t) as function of time.

An important reliability parameter is the expectation value, also known as MTTF (Mean Time To Failure):

    MTTF = \int_0^\infty e^{-\lambda t}\, dt = \frac{1}{\lambda}    (4)

Note that Equation (4) is only valid if an exponential distribution is suitable. The probability of failure P(t) is calculated from the reliability function R(t). P(t) states the probability that a failure or breakdown occurs within the interval (0, t]:

    P(t) = 1 - R(t) = 1 - e^{-\lambda t}    (5)

This equation is the basis for the derivation of the PFD equation (Probability of Failure on Demand). Calculations for three different architectures follow in the main part of this paper.
If Equation (4) is put in Equation (5), then the probability of failure P(t) at the time t = MTTF results in:

    P(MTTF) = 1 - e^{-1} \approx 0.632    (6)

That means that the probability of failure P(t) at this time is about 63 %.
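The relationship between the constant failure rate, the MTTF and the 63 % figure can be checked numerically. The following is a minimal sketch, assuming the exponential model of this section; the failure rate value and the function names are hypothetical and serve only as an illustration.

```python
import math


def reliability(t, lam):
    """R(t) = exp(-lam * t) for a constant failure rate lam [1/h]."""
    return math.exp(-lam * t)


def failure_probability(t, lam):
    """P(t) = 1 - R(t): probability of a failure within (0, t]."""
    return 1.0 - reliability(t, lam)


def mttf(lam):
    """Mean time to failure for a constant failure rate: 1 / lam."""
    return 1.0 / lam


# Hypothetical failure rate, chosen only for illustration.
lam = 1.0e-5  # [1/h]
p_at_mttf = failure_probability(mttf(lam), lam)

print(f"MTTF    = {mttf(lam):.0f} h")
print(f"P(MTTF) = {p_at_mttf:.3f}")
```

The result does not depend on the chosen failure rate, since the product of the failure rate and the MTTF is always 1.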


3 PARAMETERS ESTIMATION

This section deals with the different parameters necessary for the estimation of the PFD-values and SIL-values.
The total number of failures is defined as:

Safe failures are defined (with the safety related parameter S):

And dangerous failures are characterized as:

Identified dangerous failures (with the diagnosis parameter DC) are called dangerous detected failures:

Dangerous failures that are not identified are called dangerous undetected failures:

The S-factor is the ratio of all dangerous failures to the total number of possible failures in the system.
The applied DC-parameter is estimated according to the selected diagnosis procedure. The Diagnostic Coverage Factor (DC):

The DC parameter is the ratio of all dangerous failures detected by the diagnosis to the total number of dangerous failures [IEC61508_2000]. Safe failure fraction (SFF) is defined as:

The relationship of SFF and DC:

3.1 Proof Test Interval

Another important parameter is the proof-test-interval, which is generally identified as T1. At time T1 a periodical test or the maintenance of a safety system takes place [Velten-Philipp & Houtermans2006]. The tests are carried out to locate undetected dangerous failures. After a proof test, the system is regarded as new.
The PFD-value depends on the proof-test-interval T1, as stated in the standard IEC 61508 [IEC61508_2000]. The standard defines a proof test as follows [IEC61508_2000]:
The proof test is a repetitive assessment to detect failures in a safety related system. Afterwards, the system is regarded as quasi-new.

3.2 Correlation of T1 and MTTF

The result of Equation (5) is only true for the stated condition. If the proof test condition is

and with using (9), it results in:

This result is important to analyze the PFD-value in practice.

4 PFD CALCULATIONS FOR A 1oo1-SYSTEM

The exact equation (5) with λ = λ_D is used to calculate the PFD-value (PFD: Probability of Failure on Demand) of a 1oo1 system. This results in the following equation for the 1oo1 system:

The PFDavg at time T results in

For the 1oo1 system the MacLaurin series can be developed and is stated below. The first three terms plus the remaining term R3 are sufficient for the calculation of the PFDavg values.
The description of the remaining term R3 is chosen as follows [Börcsök2004]:
R3 is the remaining term to the third order, which belongs to the exponential function with failure rate λ_D.


For the 1oo1 system the first three terms need to be developed. The remaining term R3 converges to 0 for T = 0 and can be neglected compared to the third term when the limit is taken at T = 0 [Börcsök2004]:

With

If Equation (20) is applied in Equation (19), the PFDavg equation for a 1oo1 system results in:
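The effect of truncating the MacLaurin series can be illustrated numerically. The following is a minimal sketch, assuming only what the text states: the PFD of a 1oo1 system is Equation (5) with λ = λ_D, and PFDavg is its average over the interval (0, T). Averaging the series of 1 - exp(-λ_D t) term by term gives λ_D T/2 as the leading term; this closed form is a standard result and is not necessarily the exact form of the paper's equations. The function names, the proof-test interval of 8760 h and the use of the summed Table 3 rates as λ_D are assumptions made for illustration.

```python
import math


def pfd_avg_exact(T, lambda_d):
    """Average of 1 - exp(-lambda_d * t) over the interval (0, T)."""
    x = lambda_d * T
    # expm1 keeps precision for the very small x typical of safety systems.
    return 1.0 + math.expm1(-x) / x


def pfd_avg_series(T, lambda_d, terms=1):
    """Truncated MacLaurin series of the average PFD.

    Term k is (-1)**(k + 1) * x**k / (k + 1)!, so terms=1 gives x / 2.
    """
    x = lambda_d * T
    return sum(
        (-1) ** (k + 1) * x ** k / math.factorial(k + 1)
        for k in range(1, terms + 1)
    )


# Assumed values: lambda_d is the sum of the two Table 3 rates,
# T is one year in hours (not taken from the paper).
lambda_d = 8.415e-8 + 8.5e-10  # [1/h]
T = 8760.0                     # [h]

exact = pfd_avg_exact(T, lambda_d)
first_order = pfd_avg_series(T, lambda_d, terms=1)
second_order = pfd_avg_series(T, lambda_d, terms=2)

print(f"exact        = {exact:.6e}")
print(f"lambda_D*T/2 = {first_order:.6e}")
print(f"two terms    = {second_order:.6e}")
```

For failure rates and test intervals of this order of magnitude the product λ_D T is far below 1, which is why the higher-order terms and the remaining term contribute very little.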

With:

And

The PFDavg value for a 1oo1 system results in:

Respectively

With the assumption

i.e., the repair time MTTR is significantly shorter than half the proof test interval T1, Equation (25) can be simplified as follows:

With the mean repair time of a channel:

Equations (27) and (28) for the PFDavg value of a 1oo1 system are equal to the corresponding formulas from IEC 61508 [Börcsök2004], [IEC61508_2000].

5 CALCULATIONS FOR DIFFERENT ARCHITECTURES

The last section derived the equations necessary to calculate the PFD-value for a 1oo1-architecture. Now, two further hardware architectures, the 1oo2- and the 2oo3-architecture, are examined and the PFD-equation for each structure is presented. Common cause failures occur in both architectures. Therefore the failure probability is calculated for dangerous undetectable and dangerous detectable common cause failures, P_DUC and P_DDC.

5.1 PFD calculation of common cause failures

Common cause failures are those failures that occur in all system channels at the same time and which have a common cause. When determining the PFDavg, this kind of failure is rated for a multi channel system through the β-factor [Börcsök2001, Goble1995, Börcsök2004]. One differentiates between the β-factor for dangerous undetectable failures, with the weight β, and the β-factor for dangerous detectable failures, with the weight β_D.
These failure probabilities can be derived for a 1oo1 system with:

A random common cause failure represents a 1oo1 function block. Therefore, the PFDavg equation derived for the 1oo1 system, i.e., Equation (19) through Equation (28), can be applied to calculate the probability of a common cause failure. The PFD equation results in:

5.2 1oo2 Architecture

A 1oo2 architecture possesses two channels in parallel. Each channel is able to execute the safety function. Therefore, a 1oo2 system will fail dangerously if


both channels have failed dangerously. The equation for the probability of failure is then [Börcsök2004]:

With

for the normal (single) failures and PFDavg,β for the common cause failures. Two channels exist in a 1oo2 system, and a dangerous failure has to occur in each of them for the system to fail dangerously. Therefore, the probability of failure is calculated from the product of the single channel failure probabilities PFD1(t) and PFD2(t). The failures to be considered here are classified as single failures. The failure probability P1(t) for the first channel is described by Equation (34), where

is the failure rate for a dangerous single failure, while the failure probability PFD2(t) for the second channel is presented by Equation (35) and failure rate

Using Equation (34) and Equation (35) the PFD-value for normal failures results in

The result from Equation (38) is used in order to determine the PFDavg value of a 1oo2 system for normal failures and to derive the following equation

The functions

And

can be developed in a MacLaurin series. For the calculation of the PFDavg value of a 1oo2 system it is sufficient to develop the first four terms of the MacLaurin series plus the corresponding remaining terms R4A and R4B because, as the calculation shows, only the fourth term contributes to the result. The remaining terms R4A and R4B converge to 0 at T = 0 and are negligibly small compared to the fourth term when the limit is taken at T = 0 [Börcsök2004].
With this, the following equation results for the probability of failure of a 1oo2 system under the condition that no common cause failures apply:

If in Equation (43) the times tCE, representing the channel equivalent mean down time, and tGE, representing the group channel equivalent mean down time, are used according to IEC 61508 [IEC61508_2000] with:

and if one writes

Then, considering only normal failures this results in

The PFDavg equation for a 1oo2 system is calculated by adding the common cause failure part, Equation (31), to the probability of a normal failure, Equation (48):

5.3 2oo3 architecture

The last hardware system presented is a 2oo3 architecture. The safety system consists of three parallel channels with a facility to do a majority decision.


Table 3. Parameters for a fictive module for the PFD-calculation.

λ_DD [1/h]    λ_DU [1/h]    MTTR [h]
8.415E-08     8.5E-10

0.02    0.01

The output state of the system is the result of at least two corresponding channels.
For the derivation of the PFDavg equation of a 2oo3 system the derivation of the 1oo2 system can be used. While in a 1oo2 system both channels have to fail for a dangerous system failure to occur, in a 2oo3 system two out of three channels have to fail. Normal dangerous failures can therefore occur in three channel pair combinations. Therefore, it is necessary to multiply the term PFD1(t) · PFD2(t) of the 1oo2 PFD equation (33) by the factor 3. The equation for the probability of failure of a 2oo3 system is then:

Therefore, the calculation of the PFDavg value of a 2oo3 system for normal failures results in

with tCE and tGE see Equation (45) and Equation (46).
Therefore, the PFDavg equation, taking into account common cause failures and single failures, results in
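The three architectures can be compared with a short calculation. The following is a minimal sketch that assumes the simplified low demand equations of IEC 61508-6 (Annex B), which the text states agree with the derived 1oo1 formulas; it does not reproduce the paper's intermediate equations. The class and function names are hypothetical. The failure rates are those of Table 3, whereas MTTR = 8 h and T1 = 8760 h are placeholder assumptions, and reading the unlabelled Table 3 values 0.02 and 0.01 as β and β_D is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    lambda_dd: float  # dangerous detected failure rate [1/h]
    lambda_du: float  # dangerous undetected failure rate [1/h]
    mttr: float       # mean time to repair [h]
    t1: float         # proof-test interval [h]
    beta: float       # common cause factor, undetected failures
    beta_d: float     # common cause factor, detected failures

    @property
    def lambda_d(self):
        """Total dangerous failure rate of one channel."""
        return self.lambda_dd + self.lambda_du

    def t_ce(self):
        """Channel equivalent mean down time (IEC 61508-6 form)."""
        return (self.lambda_du / self.lambda_d) * (self.t1 / 2 + self.mttr) \
            + (self.lambda_dd / self.lambda_d) * self.mttr

    def t_ge(self):
        """Group equivalent mean down time (IEC 61508-6 form)."""
        return (self.lambda_du / self.lambda_d) * (self.t1 / 3 + self.mttr) \
            + (self.lambda_dd / self.lambda_d) * self.mttr


def pfd_common_cause(ch):
    """Common cause contribution, treated like a 1oo1 block."""
    return ch.beta_d * ch.lambda_dd * ch.mttr \
        + ch.beta * ch.lambda_du * (ch.t1 / 2 + ch.mttr)


def independent_rate(ch):
    """Dangerous failure rate of one channel without common cause part."""
    return (1 - ch.beta_d) * ch.lambda_dd + (1 - ch.beta) * ch.lambda_du


def pfd_1oo1(ch):
    return ch.lambda_d * ch.t_ce()


def pfd_1oo2(ch):
    normal = 2 * independent_rate(ch) ** 2 * ch.t_ce() * ch.t_ge()
    return normal + pfd_common_cause(ch)


def pfd_2oo3(ch):
    # Three channel pairs instead of one: factor 3 on the 1oo2 normal part.
    normal = 6 * independent_rate(ch) ** 2 * ch.t_ce() * ch.t_ge()
    return normal + pfd_common_cause(ch)


# Rates from Table 3; mttr, t1, beta and beta_d are assumptions (see above).
module = Channel(lambda_dd=8.415e-8, lambda_du=8.5e-10,
                 mttr=8.0, t1=8760.0, beta=0.02, beta_d=0.01)

for name, fn in (("1oo1", pfd_1oo1), ("1oo2", pfd_1oo2), ("2oo3", pfd_2oo3)):
    print(f"PFDavg {name} = {fn(module):.3e}")
```

Under these assumptions the common cause part dominates both redundant architectures, so their PFDavg values lie close together and well below the 1oo1 value, with the 1oo2 value slightly lower than the 2oo3 value.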

6 NUMERICAL EXAMPLE

The values listed in Table 3 are applied for a fictive module.
The PFD-value of a 1oo1-architecture is clearly worse than the common-cause-failure contribution of a 1oo2- or a 2oo3-architecture. If one compares the normal failures of a 1oo2- with a 2oo3-architecture, it is obvious that the 1oo2-architecture performs better than the 2oo3-architecture, as shown in Figure 2.

Figure 2. PFD-values for different architectures.

7 CONCLUSIONS

The standard IEC/EN 61508 specifies boundaries for the probability of failure in order to fulfill the SIL requirements. Those requirements are requested from a safety related system. The user or developer can calculate the probability of failure of a safety related system and can prove mathematically that the system stays within the required limits of the probability of failure. The paper detailed the relationship of failure rates, reliability function, MTTF and PFD. Calculations of the probability of failure in low demand mode (PFD) consider only failure rates of dangerous failures. Not only the failure rates of each subsystem are important; the selected proof-test interval also influences the PFD-value significantly. If the required rates are low, then the standard IEC/EN 61508 specifies proof-test intervals between 6 months and 10 years. The shorter the proof-test interval, the shorter the operation time between two tests, and therefore the lower the probability of failure. The PFDavg is calculated by taking the average value of the PFD-function over all proof-test intervals. The paper presents a method that uses MacLaurin series to determine the PFD equation. A 1oo1 architecture is used to demonstrate how the PFD equation is determined by the MacLaurin series. Finally, different architectures are presented with their PFD equations.

REFERENCES

Börcsök J. 2004. Functional safety systems. Heidelberg: Hüthig Verlag
Börcsök J. 2001. Functional Safety Computer Architecture Part 1 and Part 2, lecture notes. Kassel: University of Kassel
Goble W. M. 1995. Safety of programmable electronic systems: Critical Issues, Diagnostic and Common Cause Strength. Proceedings of the IChemE Symposium, Rugby, U.K.: Institution of Chemical Engineers
IEC 61508. 1999-2000. International Standard 61508: Functional safety of electrical/electronic/programmable electronic safety related systems, Part 1 to Part 7. Geneva: International Electrotechnical Commission
Robert C. P. & Casella G. 1999. Monte Carlo statistical methods. Berlin: Springer Verlag
Storey N. 1996. Safety critical computer systems. Addison Wesley
Velten-Philipp W. & Houtermans M. J. M. 2006. The effect of diagnostic and periodic testing on the reliability of safety systems. Köln: TÜV
