
Feature: Energise or De-energise to trip?

Energise or De-energise to trip?

Dr. A G Foord, 4-sight Consulting
C R Howard, Istech Consulting Ltd.

Abstract

“De-energise to trip” is a long established principle because of the danger of common cause failures. Although there is little published on this topic, it is covered in the section “Protection systems (trips and interlocks)” in the HSE Technical Measure Document for COMAH sites, but the quality of UPS, diagnostics etc. is now very different from the last century.
As well as the obvious effects of architecture, failure modes and frequency on the number of spurious trips and failures to danger, we have also studied the relationships between design policies (for example, overrides and diagnostic coverage), testing policies, repair policies, operating policies and their effects on common cause failures. The effects of different policies on spurious trips and failures to danger are illustrated with practical examples from the energy industry: oil and gas production and power stations.

Biography of Dr. A G Foord
Dr Tony Foord’s involvement with safety and reliability began when he was a Maintenance Manager for Phillips Imperial Petroleum, continued at ICI as Production Manager and then on a joint venture with BP Chemicals and later in Air Traffic Control. As one of the founding Principal Engineers of 4-sight Consulting he has worked in the energy, process and transport industries conducting hazard and risk assessments and audits (including QRA studies) and developed and delivered a range of training courses for reliability and safety.

Biography of Mr. C R Howard
Colin Howard worked for ICI for nearly 35 years in a wide range of engineering and management roles based on instrument, control and electrical technology, quality assurance, and safety management.
He latterly specialised in hazard studies, risk assessments, auditing and the safety of instrumentation and control systems; providing consultancy also into the steel and foundry, food and pharmaceutical and utilities sectors.
Since 2001, Colin has established his own consultancy, Istech Consulting Ltd. This has centred on the safety of instrumentation and control systems, in particular, the application of IEC 61508, IEC 61511 and EEMUA 191 and the associated hardware, software and human factors issues.
Colin Howard is a Visiting Fellow at the University of Teesside. In 1998 he was President of the Institute of Measurement and Control and is currently a member of the ECUK Quality Assurance Committee.

Introduction/background

The concept of identifying a “safe state” for a system is well established in the process and energy industries. A sub-system that will transition a system into a safe state, in the event of a failure, is often called a “fail safe” or “fail to safety” sub-system. This is by contrast with sub-systems whose failures lead to a hazard from the system, which are called “fail to danger” sub-systems. While we accept that not all hazards result from failures, all hazard analyses should consider the results of power failures - whatever the source of power (electrical, hydraulic, pneumatic, etc.).
Common examples of sub-systems that transition a system into a safe state are shutdown, emergency shutdown (ESD) or trip systems. Conventionally, these sub-systems have been designed so that, on loss of communication or loss of power, the sub-system will operate or trip and the system will transition into a safe state. However, this is not always so; what then determines the basic safety principles? Hence the title of this paper “Energise or De-energise to trip?” Two simple examples, shown in Figures 1 and 2, are an upper quadrant Semaphore Railway Signal that uses gravity to De-energise to Trip - (DT) - and Electrical Switchgear, which is usually designed as Energise to Trip - (ET).
Both these examples assume a safe state and in many industries this is well defined. For some industries this is more difficult, for example, “fly-by-wire avionics.” Apart from when the aircraft is on the ground, there is not a “safe” airborne state for the avionics.

Figure 1: Electrical Switchgear (ET)
Figure 2: Railway signal (DT)

System owners and system operators are concerned with issues of finance and reputation as well as safety and environmental issues. Thus unintended operation of trip systems or “spurious” trips is also an issue of concern.
Modern technology has resulted in a re-consideration of traditional designs, for example
• comprehensive diagnostics allow early detection of many failures including communication failures;
• uninterruptible and local power supplies are now much cheaper and more reliable; and
• communication protocols provide both error detection and correction
Thus many more systems can now include trip sub-systems that are:

276 • Measurement + Control Vol 41/9 November 2008 www.instmc.org.uk



• more complex;
• energise to trip; and thus
• no longer fail-safe

However the implications of these changes in design are not always fully appreciated. As well as the obvious effects of architecture, failure modes and frequency on the number of spurious trips and failures to danger, we have also studied the relationships between design policies (for example, overrides and diagnostic coverage), testing policies, repair policies, operating policies and their effects on common cause failures. The effects of different policies on spurious trips and failures to danger are illustrated here with practical examples from the energy industry: oil and gas production and power stations.

Design issues

The traditional view has been that there is a trade off between protecting against hazards to people or the environment and availability (of supply or output) for operation. De-energise to trip has been applied principally when safety has been dominant and energise to trip when the key issue has been availability, particularly availability of primary services such as electric power. Hence electrical switchgear has usually been designed as energise to trip in order to maintain supplies.

Figure 3: Traditional choices
(Diagram labels: Safety, Availability, De-energise to Trip (DT), Energise to Trip (ET), Operation)

However, the reality is much more complex than is suggested by the simple diagram in Figure 3: Traditional choices, which takes no account of the improvements in reliability of sensors, communications, final elements (actuators), power supplies or diagnostics. These are considered below.
As assessors of safety sub-systems, we are sometimes asked by clients, “As we are concerned about spurious trips, would ‘energise to trip’ be acceptable and under what conditions?” They accept that:
1. inherent safety is better, and that,
2. identifying safe state(s) that can be achieved by a simple trip system using reliable well-proven components is also best, but
3. are concerned that this may lead to an unacceptable number of spurious trips.
They also know that even using de-energise to trip, systematic or common cause failures can also prevent the “safe state” from being achieved, for example a seized cable on the railway semaphore signal, wet instrument air or welded relay contacts.

Available Guidance

The guidance in the references is very relevant but is mainly about fail-safe, safe states and complexity – there is little or no mention of ET and/or DT. The HSE website[1] does mention ET and DT as does “Safety Shutdown Systems Design, Analysis and Justification” by Paul Gruhn and Harry Cheddie[2] but these two references are not as widely known as they might be.
Apart from these two references, very little specific guidance has been published and a simple example of a high-pressure trip illustrates why the traditional choices above are far from obvious and more guidance is needed.

Overpressure protection for a turbine driven compressor

Consider a simple over-pressure protection sub-system for a compressor as illustrated in Figure 4: High-pressure trip. Pressure transmitter(s) are connected to a logic solver (incorporating a high-pressure trip setting) which is connected to a bypass valve that can open to allow the output of the compressor to be connected back to the input. At the same time, the logic solver can also shut the power supply to the turbine driving the compressor.

Figure 4: High-pressure trip

So what does DT imply for this simple high-pressure trip sub-system? If the pressure sensor(s) fail they may give a low output; so does DT imply reverse acting transmitters for high-pressure (and similarly for other high-level, high-temperature etc. trips)? If the logic solver is implemented in relays then coil and open circuit (o/c) contact failures are typically 90% of the failures; only 10% of the failures are short-circuit (s/c). Hence, DT might give a high-integrity trip system but DT will also result in higher spurious trips if 1oo2 sensors are used as well as the 1oo2 final elements shown. Using 2oo3 sensors will reduce the number of spurious trips from sensor faults, but we have seen DT designs with 3 sensors where the 2oo3 relay voting is badly designed and has a single point of failure (SPOF), so that the high integrity of the 2oo3 design is lost.
In ISA TR84.00.02 Part 1 Page 57[3] for a solenoid valve in DT mode, the MTTF (danger) is quoted as 100 years and the MTTF (spurious) is quoted as 10 years; for a solenoid valve in ET mode the MTTF (danger) is quoted as 30 years and the MTTF (spurious) is quoted as 100 years. Thus it is important to remember that when a manufacturer quotes a SIL rating for a solenoid valve, it will be for the DT mode of operation, not the ET mode of operation.
A valve final element may have 60% failures with gas passing internally – no tight shut-off – and another 20% of failures where the valve sticks, so even with close on air failure we may have 80% of the failure modes when the valve fails to completely close so, for example, fuel is still supplied to the turbine (this applies equally to ET and to DT). In this example, the fuel valve has to close while the bypass valve has to open.
There are also many more similar important reasons why trip systems may fail.

Why do trip systems fail?

We may have given the impression that the fundamental choice for safety was DT or ET, but there are many other issues.
1. First, and most important, we should be aiming to make systems inherently safe, not starting with an inherently unsafe design and then adding on protection.
2. Secondly DT v. ET is a Specification and Design & Implementation issue, which (based on a sample of 32 incidents) was the primary cause of error in only 27% of incidents (starred (*) items in table) whereas 73% were caused by other issues, as illustrated in Table 1[4]


Phase %    Primary cause by phase
12 (*)     Inadequate functional requirements specification
32         Inadequate safety integrity requirements specification
44         Total inadequate specification
15 (*)     Total inadequate design and implementation
6          Total inadequate installation and commissioning
3          Inadequate operation
12         Inadequate maintenance
15         Total inadequate operation and maintenance
20         Inadequate modification
Table 1

Figure 6: Effect of diagnostics on system failure modes
(Diagram: a failure is classed as fail to danger or fail safe, each either undetected or detected; the outcomes are labelled dangerous, non-critical and spurious trip, with fail safe undetected and fail safe detected shown as spurious trip undetected and spurious trip detected)
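The 27% and 73% figures quoted from Table 1 can be checked with a few lines of arithmetic. The sketch below simply transcribes the non-total rows of Table 1 and sums the starred and unstarred entries; the names TABLE_1 and share are illustrative and not from the original paper.

```python
# Percentages transcribed from the non-total rows of Table 1.
# The boolean marks the starred (*) rows, i.e. the phases where the
# DT v. ET choice is made according to the text.
TABLE_1 = [
    ("Inadequate functional requirements specification", 12, True),
    ("Inadequate safety integrity requirements specification", 32, False),
    ("Inadequate design and implementation", 15, True),
    ("Inadequate installation and commissioning", 6, False),
    ("Inadequate operation", 3, False),
    ("Inadequate maintenance", 12, False),
    ("Inadequate modification", 20, False),
]


def share(rows, starred):
    """Sum the percentages of the rows whose star flag equals `starred`."""
    return sum(pct for _, pct, star in rows if star == starred)


starred_share = share(TABLE_1, True)   # specification and design rows marked (*)
other_share = share(TABLE_1, False)    # every other primary cause

print(f"starred: {starred_share}%  other: {other_share}%")
```

The starred rows add up to the 27% quoted in the text and the remaining rows to the 73%.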

Figure 5: Source of error

Trip system issues

Both DT & ET need requirements specifications. The design and implementation differ for passive and active systems – apart from mechanical energy (spring returns) or potential energy (gravity), DT is normally passive whereas ET is active. Thus ET is very dependent on utilities for safe operation whereas DT is not.

Both DT & ET need design policy covering issues of:
• classification of safety-critical and safety-related systems
• diversity
• separation of signal routes
• architecture (MooN) as in 1oo2, 2oo3, 2oo4 etc
• overrides (defeats) with or without timers or just not permitted

Both DT & ET need competent people as:
• designers
• installers
• maintainers

Both DT & ET need policies for:
• operation
• test (for example, all transmitters or one at a time? - might give common cause problems)
• repair (test before and after repair, collection of failure and success data, do operations resist testing of final elements)

Both DT & ET need reliable components but diagnostics are much more important for ET as illustrated in Figure 6: Effect of diagnostics on system failure modes from “Reliability Prediction Method for Safety Instrumented Systems”[5].

Obviously “fail to danger detected” is only “non-critical” if action is taken immediately.

Emergency feed example

Figure 7: Options for emergency feed

Could we design this emergency supply system as de-energise to trip? If the cause of the loss of flow is an electrical failure then the pumps will be lost and it will not be possible to provide the emergency feed without the pumps. This is a common cause failure, and one that should arguably be designed with energise to trip principles in mind as it is always necessary to look wider than simply the design of the identified SIF. Alternative motive power for one pump by diesel or turbine driven pumps is an option.

Another example of options for addition of Reactor Inhibitor

Figure 8: Options for reactor inhibition

Inhibition is required quickly when the reaction is “running away” with temperature and pressure in the reactor rising rapidly. Which

approach is actually more “safe”: relying on a pump cutting in to inject inhibitor at high-pressure, or using high-pressure nitrogen as the motive power via 2 open-air failure valves?

Architecture and Spurious Trip Frequency

Overcoming availability considerations with de-energise to trip has traditionally been done with redundancy. Typically 2oo3 has been necessary to achieve the required level of both safety integrity and availability, but full 2oo3 is expensive for final elements. Improvement with 2oo3 will not be this good unless diversity is provided, because of common cause issues. See the example of 2oo3 for PTs on Figure 4: High-pressure trip.
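The trade-off between safety integrity and spurious trips for the architectures mentioned here (1oo1, 1oo2, 2oo3) can be sketched numerically. The following is a minimal illustration, not the authors' method: it assumes identical channels, per-channel probabilities that are purely hypothetical, and a simple hypothetical common cause fraction beta in which one shared cause takes out every channel at once. The function names at_least and moon are invented for this sketch.

```python
from math import comb


def at_least(k, n, p):
    """Probability that at least k of n independent channels are in a state
    that each channel has with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def moon(m, n, p_spurious, p_danger, beta=0.0):
    """Return (P spurious trip, P fail to danger) for an MooN voted group.

    beta is a hypothetical common cause fraction: with probability beta * p
    a single shared cause puts every channel into the failed state together;
    otherwise channels fail independently with probability (1 - beta) * p.
    """
    def group(k, p):
        common = beta * p
        independent = at_least(k, n, (1 - beta) * p)
        return common + (1 - common) * independent

    # The group trips as soon as at least m channels demand a trip.
    spurious = group(m, p_spurious)
    # The group can no longer trip once n - m + 1 channels have failed to danger.
    danger = group(n - m + 1, p_danger)
    return spurious, danger


if __name__ == "__main__":
    # Hypothetical per-channel probabilities, chosen only for illustration.
    for m, n in [(1, 1), (1, 2), (2, 3)]:
        s, d = moon(m, n, p_spurious=0.1, p_danger=0.01)
        print(f"{m}oo{n}: spurious={s:.4f} danger={d:.6f}")
    s, d = moon(2, 3, p_spurious=0.1, p_danger=0.01, beta=0.1)
    print(f"2oo3 with beta=0.1: spurious={s:.4f} danger={d:.6f}")
```

With independent channels, 1oo2 lowers the fail to danger probability but raises the spurious trip probability, while 2oo3 improves both relative to a single channel. Setting beta above zero shows how a common cause contribution can dominate the 2oo3 result, which is the point made above about diversity.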

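Fault trees for a high-pressure trip

One way of considering the differences between ET and DT is to analyse the two different designs using Fault Tree Analysis.

Figure 9: Colour coding of fault trees
(Key to fault trees: sensors, logic solver, final elements)

Figure 10: DT high-pressure trip fails to danger
(Fault tree: the top event “De-energise to trip system fails to danger” occurs if the sensor subsystem fails, the logic solver subsystem hardware fails or the final element subsystem fails. The sensor subsystem fails if 2oo3 sensors fail (sensors 1, 2, 3) or comms from sensor are s/c to power. The final element subsystem fails if both final element devices fail (final elements 1 and 2) or comms to final elements are s/c to power.)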
All the subsequent fault trees use the colour coding shown in Figure 9 - Colour coding of fault trees. All the fault trees ignore systematic faults and consider only random hardware failures.

Fail to danger fault trees

The DT fault tree is not dependent on diagnostics, as shown in Figure 10: DT high-pressure trip fails to danger, whereas the ET design does depend on diagnostics so these are included in the fault tree shown in Figure 11: ET high pressure trip fails to danger.
For the DT fault tree the top event is that the overpressure system fails to danger and will not operate if the pressure goes high.
Sensor failures - black with 2oo3 voting;
Logic solver failure event – red;
Final element failures - blue 1oo2 hence AND gate as for the overpressure system to fail to danger, both need to fail.
The other 2 events are short circuit (s/c) failures of communications to and from the logic solver (note that s/c failures are usually only 10% of relay failures).

For the ET fault tree the top event shown in Figure 11 is that the overpressure system fails to danger and will not operate if the pressure goes high:
Sensor failures - black with 2oo3 voting;
Logic solver failure event – red;
Final element failures - blue 1oo2 hence AND gate as for the overpressure system to fail to danger, both need to fail.
The other events are:
• Open circuit (o/c) failures of comms to and from the logic solver (note that o/c failures are 90% of failures);
• And s/c to ground comms failures (these are rare);
• Diagnostic hardware failures;
• Power supply failures for sensors, logic solver and final elements;
• Power supply failures for diagnostics
Even without examining the details of the two different fault trees above for DT and ET, it is immediately apparent that the ET design involves many more issues. This does not mean that an ET design is less reliable, just that there is much more scope for getting it wrong, for example:
Figure 11: ET high pressure trip fails to danger
(Fault tree: the top event “Energise trip fails to danger” occurs if the sensor subsystem, the logic solver or the final elements subsystem fails undetected. Each branch combines a failure of the subsystem with a failure of the diagnostics to detect it. Sensor branch: 2oo3 sensors fail (sensors 1, 2, 3), comms from sensors fail (open circuit or short circuit to ground) or power supplies to sensors fail. Logic solver branch: logic solver fails or power supplies to logic solver fail. Final element branch: both final element devices fail (final elements 1 and 2), comms from logic solver to final elements fail (open circuit or short circuit to ground) or power supplies to final elements fail. In each branch the diagnostics fail if the diagnostic hardware fails or the power supply to the diagnostics fails.)
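As a rough illustration of how a fault tree of this kind is quantified, the sketch below evaluates the DT fail to danger tree of Figure 10 with simple OR, AND and voting gates. It assumes independent basic events and ignores common cause and systematic failures, as the fault trees in this paper do. All probabilities are hypothetical placeholders, and the function names are invented for this sketch rather than taken from any fault tree tool.

```python
from itertools import product
from math import prod


def or_gate(*probs):
    """Output event occurs if any independent input event occurs."""
    return 1 - prod(1 - p for p in probs)


def and_gate(*probs):
    """Output event occurs only if every independent input event occurs."""
    return prod(probs)


def vote_gate(k, probs):
    """Output event occurs if at least k of the independent input events occur."""
    total = 0.0
    for outcome in product((True, False), repeat=len(probs)):
        if sum(outcome) >= k:
            total += prod(p if failed else 1 - p for p, failed in zip(probs, outcome))
    return total


def dt_fails_to_danger(sensor, logic_solver, final_element, comms_sc_power):
    """DT fail to danger tree as described for Figure 10."""
    # 2oo3 voting is defeated when any two of the three sensors have failed.
    sensors = or_gate(vote_gate(2, [sensor] * 3), comms_sc_power)
    # 1oo2 final elements: both must fail, hence the AND gate.
    final_elements = or_gate(and_gate(final_element, final_element), comms_sc_power)
    return or_gate(sensors, logic_solver, final_elements)


# Hypothetical basic event probabilities, for illustration only.
top = dt_fails_to_danger(
    sensor=0.01, logic_solver=0.0005, final_element=0.02, comms_sc_power=0.001
)
print(f"P(DT trip fails to danger) = {top:.6f}")
```

The ET tree of Figure 11 could be built from the same three gates, but it needs many more basic events (open circuit comms, power supplies and diagnostics), which is the added complexity the text describes.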


 
Figure 12: DT spurious high-pressure trips
(Fault tree: the top event “De-energise spurious trip” occurs if the sensor subsystem fails, the logic solver subsystem fails or the final element subsystem fails. Sensor subsystem: 2oo3 sensors fail (sensors 1, 2, 3), comms from sensor s/c to ground, comms from sensor open circuit or power supply to sensors fails. Logic solver subsystem: logic solver subsystem hardware fails or power supply to logic solver fails. Final element subsystem: either final element device fails (final elements 1 and 2), comms to final elements s/c to ground, comms to final element open circuit or power supply to final elements fails.)

1. Only two comms failures will disable DT (s/c to power between either the sensors and LS or between the LS and FE)
2. Both o/c (90% of failures) and s/c to ground communications failures will disable ET
3. Utility failures do not disable DT but do disable ET (not quite as simple as this for the logic solver as it may have complex modes of failure on loss of power)
4. Diagnostics are not included in DT but are needed for ET, and failure of diagnostics is significant for ET
5. Diagnostic coverage is not 100% so there are known systematic failures that are not shown on these diagrams

What about the difference in reliability? That all depends on the power supplies, Design policy / Architecture / Overrides (defeats), Operate / Test / Repair policies, component reliability for particular failure modes and Diagnostic Coverage.

So why is there a great difference in the size and complexity of the diagrams? All the fault trees shown in Figs 10 and 11 consider only random hardware failures. In addition there may be failures caused by systematic faults that are not shown, for example, design errors, lack of configuration control, software bugs. The fault trees expose the added complexity of ET compared to DT. The ET diagrams for failure to trip are larger and more complex than the equivalent diagrams for DT. Thus, as well as more opportunities for random hardware faults to cause ET to fail to trip, there are also more opportunities for systematic errors because of the additional complexity. Above all the extra complexity gives much more scope for human error.

Spurious trip fault trees

Figure 12: DT spurious high-pressure trips again ignores systematic faults and considers only random hardware failures; the complexity does not look too bad but s/c to ground and o/c (the most common communications failures), power failures and either final element failing will all give spurious trips.

Figure 13: ET spurious high-pressure trips is less complex than Figure 12 for the DT application. As for DT either final element failing will give a spurious trip, but for ET only s/c to power (the least common communications failure) will give spurious trips as normally failures of diagnostics and power will not cause spurious trips for an ET trip system.

Figure 13: ET spurious high-pressure trips
(Fault tree: the top event “Energise spurious trip” occurs if the sensor subsystem fails, the logic solver subsystem hardware fails or the final element subsystem fails. Sensor subsystem: 2oo3 sensors fail (sensors 1, 2, 3) or comms from sensor s/c to power. Final element subsystem: either final element device fails (final elements 1 and 2) or comms to final elements s/c to power.)

Diagnostics and Reverse Acting Transmitters

When a system is required to trip on a low signal (e.g. low pressure), failure of the signal itself, say through a blown fuse or by cable open circuit damage, will also lead to a low signal and fail safe trip operation on a de-energise to trip system. If, however, the system is required to trip on a high signal (e.g. high-pressure) obtained from
 


a transducer rather than a switch, failure of the signal itself will not automatically lead to a fail safe condition; the protection system will continue to think that, as the signal is below the trip setting, the system is healthy. In time past the recognised solution for this, to maintain the integrity of the de-energise to trip principle, was to fit a reverse acting transmitter to the trip system, giving a low signal for a high process condition and a high signal for a low process condition.
With today’s range of equipment that includes diagnostics this is no longer necessary. Line monitoring of the input signals can be used to determine whether or not they are healthy and appropriate action or alarm initiated when an abnormal condition is detected. However the requirements for this to be implemented have to be included in the Safety Requirements Specification. Where diagnostics are provided then there must be a response to the action that is prompted or the diagnostics will be ineffective.
The facilities and “features” that are provided by automatic diagnostics add complexity to the overall system and should only be included where there is a definite need and benefit that can be fully defined. The “KISS” (Keep It Simple, Stupid) principle should still be applied to protective systems despite (or perhaps because of) the apparent flexibility and power of modern programmable systems. If a system can provide all the required protection using simple combinations of solid state logic (or indeed using relays), why make it more complex?
Modern “Smart” transmitters include a range of features to aid diagnostics and to facilitate easy maintenance, interrogation, and in particular range changing. When used in a protective system it is essential that any features that permit on-line changes to the transmitter are barred. Whether the overall system is de-energise or energise to trip, allowing on-line changes may result in unexpected consequences. It is vital to avoid situations where diagnostics and communication links can be used to take unauthorised action e.g. through poorly engineered HART protocol or Fieldbus systems.

Conclusions

The choice of de-energise or energise to trip is less clear cut than at first sight. Coming from a background where de-energise to trip was the norm, we have recognised that there is clearly a need for both modes of trip operation. In deciding which one to use it is essential to take a holistic look at the protection functions that are being requested, not necessarily just the particular safety instrumented function under immediate consideration. Are there circumstances in the overall operation of the plant or equipment where one approach is preferable to the other?
A holistic view includes ensuring that the failure modes are identified and that failure to safety is properly evaluated. Supply and design of utilities to service the trip system are as vital as the core Safety Instrumented Function (SIF) itself. Energise to trip is potentially more complex to design, requiring more attention to the reliability of the power supplies in particular, and the possibilities of getting the design wrong, leading to a higher failure to danger rate, are greater. Even nominally de-energise to trip systems can include what are effectively energise to trip elements that may go unrecognised where trip initiation on high measurement signals is required.
We also think it is appropriate to point out the complexity issues and the care needed not just in design, but in requirements, implementation, operation and maintenance. Common cause failures often arise through complexity and can easily go unrecognised. It is not unknown for redundant power supply systems to be compromised in voting systems where the power distribution to similar components in all the voted loops is fed via a single MCB or fuse.
During the design phase it is important that everyone understands the potential complexities of the system and that these are properly communicated to those who will have to operate and maintain the system. Is the design brief fully documented so that it can be readily understood in 2 or 3 years’ time when all the design team have moved on to pastures new?
And finally we must recognise that the system may well be modified during its lifetime and complexity is a major issue for modifications! Ambiguity and inconsistency in information and poor traceability are also serious issues that emerge when the design is revisited some time later to implement a modification.
So an independent assessor at a preliminary FSA, when asked by the client, “As we are concerned about spurious trips, would ‘energise to trip’ be acceptable and under what conditions?”, should show all the usual reluctance to answer leading questions and get involved in the design! But the assessor should point out that it is no good looking at the trip/protection system alone. A wider (holistic) view is needed. “It depends” is the initial valid response.

References
1. http://www.hse.gov.uk/comah/sragtech/index.htm which includes links to Case Studies illustrating the importance of Control and Protection Systems, for example
Texaco Refinery - Milford Haven - Explosion and Fires (24/7/1994)
International Biosynthetics Ltd (7/12/1991)
BP Oil (Grangemouth) Refinery Ltd (22/3/1987)
Seveso - Icmesa Chemical Company (9/7/1976)
2. Safety Shutdown Systems Design, Analysis and Justification (1998) p126, Paul Gruhn and Harry Cheddie, ISBN 1-55617-665-1
3. ISA-TR84.00.02 (2002) - Safety Instrumented Function (SIF) - Safety Integrity Level (SIL) Evaluation Techniques Part 1: Introduction – page 57
4. Out of Control (2003), Second edition, HSE Books, ISBN 0-7176-2192-8
5. IEC 61508 (1998 & 2000), Functional safety of electrical/electronic/programmable electronic safety-related systems Parts 1-7

General Bibliography
• Reliability Prediction Method for Safety Instrumented Systems, PDS Method Handbook (2006) SINTEF
• Reliability Maintainability and Risk (2001), David J Smith. ISBN 0-7506-5168-7
• Safety-Critical Computer Systems (1996), Neil Storey. ISBN 0-201-42787-7
• Safeware: System Safety and Computers (1995), Nancy Leveson. ISBN 0-201-11972-2
