
Impact of Human Factors
in Process Safety Management

Risk Based Process Safety Management

Process Safety Centre, Indian Chemical Council – 5th Dec. 2009
S. K. Hazra, Chairman – SHE Expert Committee
Catastrophic Incidents – Human Errors

 In safety-critical industries, simple human mistakes or oversights can cost hundreds of lives and billions of dollars
 Operators believed they had to deviate from the written standard procedures in order to start the unit efficiently
 BP Texas City Refinery disaster (15 deaths, cost $3.2 billion)
 An instrument technician erred, leaving a dismantled PSV line open-ended
 Piper Alpha oil rig disaster in the UK’s North Sea (165 deaths, loss $3.4 billion)

Catastrophic Incidents – Human Errors

 In safety-critical industries, simple human mistakes or oversights can cost hundreds of lives and billions of dollars
 Poor technical decision-making by top NASA and contractor personnel over a period of several years was the fundamental reason
 Challenger disaster (7 deaths, loss $? billion)
 A designer error in the cabin air pressurisation valve, compounded by a maintenance engineer, allowed it to remain partially open during flight
 Helios Airways accident in Greece (121 deaths)
Types of human error

Unsafe acts:
 Unintended actions – errors
  Slips (attention failures): plan of action satisfactory, but action deviated from intention in some unintentional way
  Lapses (memory failures)
  Mistakes
   - Rule-based: misapplication of a good rule, or application of a bad rule
   - Knowledge-based: no ready-made solution; situation tackled by thinking out the answer from scratch
 Intended actions
  Violations
   - Routine: habitual deviation from regular practices
   - Exceptional: non-routine infringement dictated by extreme local circumstances
   - Situational: non-routine infringement dictated by local circumstances
  Acts of sabotage
Human Factors
Prevention needs an understanding of the reasons for human failures

 It is imperative that preventive measures focus on understanding:
 why well-intentioned and correctly trained professionals sometimes make serious mistakes which circumvent the considerable defences of a safety system
 This question opens into a broad field labelled Human Factors (or HF for short)
 Considerable organisational effort needs to be put into multi-layered preventive measures aimed at reducing or eliminating all known risks arising from HF
Human Factors are about people in their living and working situations; about their relationship with machines, with procedures and with the environment about them; and also about their relationships with other people.
Human Factors

 HF encompasses aspects of:
 design (latent errors);
 ergonomics (human–machine interfaces);
 cognitive research (stimulus, memory, information retrieval and processing);
 bio-medical research (drugs, alcohol and the circadian effects of shift working); and
 systems engineering (processes and process compliance, in socio-technical systems in particular)
HF approach
 A method of accelerating the acquisition and application of operational lessons learned across an organisation, to avoid their recurrence
 Seek out information about hazards from the errors of the people who work inside the system
 Design a process for them to share their learning with others before any unwanted events happen
HF-based approach to error reduction

 HF accepts that error is normal and will occur in all human systems
 HF uses a high level of employee engagement to discover unreported events and potential hazards, i.e. reading the weak warning signals early
 HF methods demand a fast and effective feedback and communication loop
 HF acknowledges that an individual’s awareness of error potential is the single best defence against its occurrence
HF error reduction methods
 Customer and product safety are declared as strategic goals
 Probabilistic risk assessments are used to focus efforts on key hazards
 Risk management and risk mitigation techniques are applied
 Elimination of the opportunity for error, by foolproof design
 Application of decision support systems and/or clear safety policies
 Use of checklists, models and other visible memory aids
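To make the probabilistic risk assessment bullet concrete, here is a minimal sketch of how per-step human error probabilities (HEPs) can be combined for a task whose steps must all succeed. The step names and HEP values below are hypothetical, invented purely for illustration, not taken from any real assessment.

```python
# Illustrative only: combining per-step human error probabilities (HEPs)
# for a sequential task. Step names and values are hypothetical.

def task_failure_probability(step_heps):
    """P(at least one step fails) = 1 - product of per-step success probabilities."""
    p_success = 1.0
    for hep in step_heps.values():
        p_success *= (1.0 - hep)
    return 1.0 - p_success

steps = {
    "isolate_line": 0.003,    # slip: wrong valve selected
    "fit_blank": 0.001,       # lapse: blank omitted
    "verify_isolation": 0.01, # mistake: check skipped under time pressure
}

p_fail = task_failure_probability(steps)
print(f"Task failure probability: {p_fail:.4f}")  # -> 0.0140
```

The point such a calculation makes is the one on the slide: even small per-step error rates compound, so effort is best focused on the steps that dominate the total.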
HF error reduction methods
 Leaders model a culture of trust, reporting, and openness
 Multi-format communication channels are used for error reporting and feedback
 Prompt action is taken by leaders to address all reported hazards and errors
 O.D. and learning interventions are used at different organisational levels
 Education for all on the underpinning HF theory, principles and concepts
 Training for competence in the actual tasks being performed
HF error reduction methods
 Training in hazard awareness and the risks of specific errors
 Simulation of scenarios that could be faced in high-risk industries
 Behavioural training, including surveys of group cultural norms
 Leadership training to reinforce personal responsibility for safety
 Executive education on safety ethics and decision-making models
 Most of BP’s five U.S. refineries have had high turnover of refinery plant managers, and process safety leadership appears to have suffered as a result
 BP has not adequately ensured that its U.S. refinery personnel and contractors have sufficient process safety knowledge and competence
 ISOM operators were likely fatigued from working 12-hour shifts for 29 or more consecutive days
 A poorly designed computerized control system hindered the ability of operations personnel to determine if the tower was overfilling
 BP management allowed operators and supervisors to alter, edit, add, and remove procedural steps
 Supervisors and operators poorly communicated critical information regarding the startup during the shift turnover; BP did not have a shift turnover communication requirement for its operations staff
 BP’s safety management system does not ensure adequate identification and rigorous analysis of process hazards at its five U.S. refineries
 An extra operator was not assigned to assist, despite a staffing assessment that recommended an additional operator for all ISOM startups
 These safety system deficiencies created a workplace ripe for human error to occur
 BP management has not ensured the implementation of an integrated, comprehensive, and effective process safety management system for BP’s five U.S. refineries
 These safety system deficiencies created a workplace ripe for human error to occur
Culture & working environment
National, local & workplace cultures, social & community values …

 Job: task, workload, environment, displays & controls, procedures …
 Individual: competence, skills, personality, attitudes, risk perception …
 Organisation: culture, leadership, resources, work patterns, communications …
Human factors refer to environmental,
organisational and job factors, and
human and individual characteristics,
which influence behaviour at work in a
way which can affect health and safety

Interrelated aspects of Human Factors
 The Job
 nature of the task
 workload
 the working environment
 the design of displays and controls
 the role of procedures
 The Task
 should match physical limitations, in accordance with ergonomic principles
 should match mental capability, as per perceptual, attentional and decision-making needs
Interrelated aspects of Human Factors
 The Individual
 Competence
 can be enhanced
 Skills
 can be enhanced
 Personality
 fixed
 Attitude
 can be changed
 Risk perception
 can be improved
Interrelated aspects of Human Factors
 The Organisation
 Work pattern

 Culture of workplace

 Resources

 Communication

 Leadership
Managing human failures –
Common Pitfalls
 Treating operators as if they are superhuman, able to intervene heroically in emergencies,
 Providing precise probabilities of human failure (usually indicating a very low chance of failure) without documenting assumptions/data sources,
 Assuming that an operator will always be present, detect a problem and immediately take appropriate action,
 Assuming that people will always follow procedures,
 Stating that operators are well-trained, when it is not clear how the training provided relates to major accident hazard prevention or control, and without understanding that training will not prevent slips/lapses or violations, only mistakes,
Managing human failures –
Common Pitfalls
 Stating that operators are highly motivated and thus not
prone to unintentional failures or deliberate violations
 Ignoring the human component completely, failing to
discuss human performance at all in risk assessments,
leading to the impression that the site is unmanned
 Inappropriate application of techniques, such as detailing
every task on site and therefore losing sight of targeting
resources where they will be most effective
 Producing grand motherhood statements that human error
is completely managed (without stating exactly how).

Managing human failures –
Three Serious Concerns
 Concern 1: An imbalance between hardware and human issues, and a focus only on engineering ones
 Concern 2: A focus on the human contribution to personal safety rather than on the initiation and control of major accident hazards
 Concern 3: A focus on ‘operator error’ at the expense of ‘system and management failures’
Concern 1:
Hardware vs human issues
and the focus on engineering

Concern 1:
Hardware vs human issues and the focus on
engineering
 Despite the growing awareness of
the significance of human factors in
safety, particularly major accident
safety, the focus of many sites is
almost exclusively on engineering
and hardware aspects, at the
expense of ‘people’ issues.

MAH Site
 Due to the ‘ironies of automation’, it is not possible to engineer out human performance issues
 All automated systems are still designed, built and maintained by human beings
 An increased reliance on automation may reduce day-to-day human involvement
 Maintenance is critical, as maintenance performance problems have been shown to be a significant contributor to major accidents
MAH Site
 A site may have determined that an alarm system is safety-critical
 It may have examined the assurance of its electro-mechanical reliability
 But it may fail to address the reliability of the operator in the control room who must respond to the alarm
 If the operator does not respond in a timely and effective manner, this safety-critical system will fail
 Therefore it is essential that the site addresses and manages operator performance
MAH Site
 In a complex process control system, the operator moves from direct involvement to a monitoring and supervisory role
 The operator will therefore be less prepared to take timely and correct action in the event of a process abnormality
 In these infrequent events the operator, often under stress, may not have ‘situational awareness’ or an accurate mental model of the system state and the actions required
Concern 2:
Focus on personal safety

Concern 2:
Focus on personal safety

There needs to be a
distinct focus in the
management system on
major hazard issues

Major accident vs personnel safety

 The majority of major hazard sites still tend to focus on occupational safety rather than on process safety
 Those sites that do consider human factors issues rarely focus on the aspects relevant to the control of major hazards
 Sites consider the personal safety of those carrying out maintenance
 But what is important is how human errors in maintenance operations could be an initiator of major accidents
 This imbalance runs throughout the safety management system, as displayed in priorities, goals, the allocation of resources and safety indicators
“Reliance on lost-time injury data in major hazard industries is itself a major hazard.”

“An airline would not make the mistake of measuring air safety by looking at the number of routine injuries occurring to its staff.”
Major accident vs personnel safety
 ‘Safety’ is measured by lost-time injuries, or LTIs
 The causes of personal injuries and ill-health are not the same as the precursors to major accidents
 LTIs are not an accurate predictor of major accident hazards, which may result in sites being unduly complacent
 Notably, several sites that have suffered major accidents demonstrated good management of personal safety, based on measures such as LTIs
 Therefore, the management of human factors issues in major accidents is different to traditional safety management
Major accident vs personnel safety

 A safety management system needs to manage the right aspects to be effective in controlling major accidents
 Performance indicators closely related to major accidents may include the movement of a critical operating parameter out of the normal operating envelope
 The definition of a parameter could be quite wide and include process parameters, staffing levels or the availability of control/mitigation systems
Performance Indicators

 Effectiveness of the training program;
 Number of accidental leakages of hazardous substances;
 Environmental releases;
 Process disturbances;
 Activations of protective devices;
 Time taken to detect and respond to releases;
 Response times for process alarms;
 Process component malfunctions;

(For example, if there is frequent operation of a pressure relief valve, then the cause of the pressure rise needs to be established and action taken.)
Performance Indicators

 Number of inspections/audits;
 Number of outstanding maintenance activities;
 Maintenance delays (hours);
 Frequency of checks of critical components;
(Is the maintenance of safety-critical equipment being undertaken as planned, and if not, what is done about it?)
 Emergency drills;
(Are the right drills being carried out in the right places, do they cover suitable scenarios, are all shifts involved, etc.?)
 Procedures reviews;
 Compliance with safety-critical procedures;
 Staffing levels falling below minimum targets;
 Non-compliance with company policy on working hours.
Performance Indicators
 It is critical that the performance indicators
should relate to the control measures outlined
by the site risk assessment.
 Furthermore, they should measure not only
the performance of the control measures,
but also how well the management system is
monitoring and managing them

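One leading indicator named earlier, the movement of a critical operating parameter out of its normal operating envelope, can be sketched as a simple excursion counter. This is a hedged illustration only: the tag, envelope limits and readings below are hypothetical.

```python
# Illustrative sketch: counting excursions of a logged process parameter
# outside its normal operating envelope, as a leading performance indicator.
# Tag name, limits and readings are hypothetical.

def count_excursions(readings, low, high):
    """Count entries into the out-of-envelope state (not individual samples)."""
    excursions = 0
    outside = False
    for value in readings:
        now_outside = value < low or value > high
        if now_outside and not outside:
            excursions += 1  # a new excursion begins
        outside = now_outside
    return excursions

# Hypothetical hourly tower-level readings (% of span), envelope 20-80%
tower_level = [50, 62, 75, 83, 88, 79, 70, 15, 18, 40]
print(count_excursions(tower_level, low=20, high=80))  # -> 2
```

Counting entries rather than samples matters: one long excursion is one event to investigate, and the trend of such events over time is the indicator.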
Concern 3:
Focus on the front-line operator

Concern 3:
Focus on the front-line operator

 In general, most safety activities in complex systems are focussed on the actions and behaviours of individual operators – those at the sharp end
 However, operators are often ‘set up’ to fail by management and organisational failures
 Rather than being the main instigators of an accident, operators tend to be the inheritors of system defects created by poor design, incorrect installation, faulty maintenance and bad management decisions
 “Their part is usually that of adding the final garnish to a lethal brew whose ingredients have already been long in the cooking”
Management and Safety Culture
 Following the investigation of major incidents, it has
become increasingly clear that the role of
management and organisational factors must be
considered, rather than placing responsibility solely
with the operator.
 However, audits rarely consider issues such as the
quality of management decision making or the
allocation of resources.
 Furthermore, “safety culture” is seen as being
something that operators have and it has been found,
following the investigation of major accidents, that
management have not acknowledged that the
development and maintenance of a safe culture lie
within the bounds of their responsibility.
Management and Safety Culture
 Feedback from audits carried out by the Human Factors Team on major hazard sites often reveals areas requiring attention in the management system which have not been identified (or reported) in previous audits.
 Audits of management systems frequently fail to report bad news.
 Following the Piper Alpha offshore platform fire, it is reported that numerous defects in the safety management system were not picked up by company auditing.
 There had been plenty of auditing, but the inquiry reported that:
 “It was not the right quality, as otherwise it would have picked up beforehand many of the deficiencies which emerged in the inquiry” …. (B. Appleton, Piper Alpha, 1994)
Management and Safety Culture
 “If culture, understood here as mindset, is to be the key to preventing major accidents, it is management culture rather than the culture of the workforce in general which is most relevant. What is required is a management mindset that every major hazard will be identified and controlled, and a management commitment to make available whatever resources are necessary to ensure that the workplace is safe.” …… (Hopkins, Lessons from Longford, reference 2)
Human Factors – Alarm Mishandling

 HF problems in alarm systems


 How alarms are actually used is important (not
necessarily how designers think they are used!)
 Competency in Understanding
 By designers,
 By installers and
 By operators
 Lessons from accidents, incidents and near-
misses
The incident involved three interconnected process vessels.
A loss of feed to vessel 1 caused valve A to close, to prevent the vessel being emptied.
 As vessel 2 emptied, valve B closed, trapping in the remaining liquid.
 As heat was still being applied, this liquid vaporised, and the vessel vented into the flare system through the flare stack knock-out drum, which catches liquid to prevent it going to flare.
 Meanwhile, the feed to vessel 1 had been
restored, and valve A was opened.
 This should have caused valve B to open,
but this did not occur.
 The operators were aware that vessel 2 was
still overfilling, so they opened valve C to
provide another route out of that vessel.
 This resulted in a high liquid level in the flare
stack knock out drum.
 Due to a previous modification, there was no
facility to pump out the knock-out drum
quickly

By this time, the operators were
concentrating on the screens
that showed the problems in
vessels 1 and 2, and were not
being helped by the flood of
alarms being generated.

 The combination of a high liquid level in the knock-out drum, and vessel 2 venting into the flare system again, caused a slug of liquid to be carried through the knock-out drum and into the flare line
 The pipeline failed at a weak point
Texaco Refinery, UK
 Twenty tonnes of hydrocarbon were released
and exploded when a slug of liquid was sent
through the flare system pipeline, which failed.
 The site suffered severe damage, and UK
refinery capacity was significantly affected.
Only luck prevented multiple deaths. It was a
Sunday, and some people had left the area
just before the explosion.

Key Findings on the Alarm System
 The control displays and alarms did not aid operators to act in time
 The alarms appeared faster than they could be responded to
 87% of the 2,040 alarms were displayed as "high" priority, despite many being informative only
 Key alarms were missed in the flood
 Safety-critical alarms were not distinguishable from the rest
 A Human Factors review would have helped diagnosis
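The alarm flood described in these findings can be made measurable. The sketch below counts alarms per 10-minute window and flags flooded windows, in the spirit of common alarm-management guidance such as EEMUA 191; the timestamps and threshold here are invented for illustration, and real targets come from the guidance itself.

```python
# Illustrative sketch: detecting alarm floods by counting alarms per
# 10-minute window. Timestamps (in minutes) and the threshold are
# hypothetical; guidance such as EEMUA 191 defines its own targets.

def flood_windows(alarm_times_min, window_min=10, threshold=10):
    """Return the start times of windows whose alarm count exceeds the threshold."""
    flooded = []
    if not alarm_times_min:
        return flooded
    end = max(alarm_times_min)
    start = 0
    while start <= end:
        count = sum(start <= t < start + window_min for t in alarm_times_min)
        if count > threshold:
            flooded.append(start)
        start += window_min
    return flooded

# Hypothetical burst of alarms around a process upset at t = 20-30 min
alarms = [1, 5] + [20 + i * 0.5 for i in range(24)] + [45]
print(flood_windows(alarms))  # -> [20]
```

A metric like this, run against the alarm log, would have exposed the flood condition the operators faced long before the incident review did.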
Human Factor – Control Room Operators
Action recommended by HSE
 Removal of ‘alarms’ which were in fact status indicators only, or which were not intended for action by the control room operators (i.e. alarms that do not require a defined operator response)
 Elimination of alarm-list flooding with repeating alarms – introduction of single-line annunciation
 Removal of the previous requirement to both accept all alarms and accept their later clearance (except in some carefully defined special cases), so that clearance no longer routinely required an operator response
Human Factor-Control Room operators

 The designers set out with the best intentions, seeking to alarm virtually any parameter that moved in the process
 But they may not consider the operators’ needs in the control room, which are best met by providing an effective control system, with alarms only for critical parameters, in the simplest possible form
 The project and commissioning engineers may not realise this problem because of their familiarity with the system from first design onwards
 If operators’ HF is not considered in the design, their specific needs will not be adequately taken into account
 Then operators, being human (hence inventive), will effect shortcuts by routinely ‘shelving’ or ‘fixing’ alarms so that they can focus better on the ones they think are key
Managing Human Factors – Control Room Operators

Collect data (basic data collection – control room):
 Piping & Instrumentation Diagrams
 Alarm and interlock schedules
 Logic diagrams
 Procedures

Analyse data (data-driven, continuous):
 Prioritise alarms
 Operator response to alarms
 Human failures/risk factors
 Operational risk factors

Determine mitigation strategies:
 Management of safety-critical staff: recruitment & selection, training & competence, planning for upset conditions, health management, fatigue management, communications improvement, briefings & education, simulators, table-top exercises/mock drills
 Alarm audit and reorganisation: proof testing of alarms, critical alarm reports, assess the existing scheme, interact and freeze the optimum scheme

Develop and implement the revised alarm scheme

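As a minimal sketch of the ‘Prioritise alarms’ step above, assuming a hypothetical tag list (the names and priorities are invented for illustration), the code below computes what share of the configured alarms sits at each priority — the kind of check that would have flagged the 87% "high" figure from the Texaco incident.

```python
# Illustrative sketch: checking an alarm scheme's priority distribution.
# Alarm tags and priorities are hypothetical; real schemes set their
# own target profiles.

from collections import Counter

def priority_distribution(alarm_priorities):
    """Return each priority's share of the configured alarms."""
    counts = Counter(alarm_priorities.values())
    total = len(alarm_priorities)
    return {p: counts[p] / total for p in counts}

alarms = {
    "TI-101 high temp": "high",
    "LI-202 high level": "high",
    "PI-303 low pressure": "medium",
    "FI-404 low flow": "medium",
    "pump A running": "low",   # status info: arguably should not alarm at all
    "valve B open": "low",
}

dist = priority_distribution(alarms)
print(dist)
```

A scheme where most alarms land in the top priority, as at Milford Haven, would show up immediately in such a distribution.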
Ignorance Iceberg
 4% of senior managers are aware of errors (above the waterline)
 6% of managers are aware of errors (above the waterline)
 75% of first-line supervisors are aware of errors (below the waterline)
 100% of employees are aware of errors (below the waterline)
The further one moves from the plant floor, the less one knows of the organisation’s errors.
Importance of HF in MAH Industries

• Major accident causes:


– Insufficient staffing levels
– Increased workload
– Reduction in supervision
– Team-working deficiencies
– Loss of competence /experience
– Unclear roles & responsibilities
– Conflicting priorities
– Poor communications
– Reduced morale /motivation

Accept the facts of life:
• 1: Accept that humans can and will fail
• 2: Get better at explaining failures and predicting failures
• 3: Apply the hierarchy of controls