Human Reliability: A Disruptive Innovation


Small Hammer Incorporated

Human Reliability
A Disruptive Innovation

By: Barry Snider, MBA, PE
President
Small Hammer Incorporated
12410 Texas Army Trail
Cypress, TX 77429
Ph: (832) 428-7834
Email: bsnider@smallhmr.com

Equipment failures cost major refineries and petrochemical complexes more than $100 million in annual losses to the bottom line. What are you going to do? As a vice president, plant manager, operations manager, technical manager, maintenance manager, or reliability professional, you are charged by corporate executives with improving the reliability of your facility. You direct your team to scan the internet, trade publications, conference proceedings, white papers, reports, website videos, and marketing literature from large, prestigious corporations and small, obscure consultants. You listen to all of the arguments and list the possible solutions that could drive reliability to the highest levels in the shortest period of time and within a limited budget. The list consists of the following:

• Industrial Internet of Things [IIoT]
• Comprehensive Data Gathering and Analysis [Predictive Analytics]
• Reliability Centered Maintenance
• Operational Excellence
• Overall Equipment Effectiveness
• Lean/Six Sigma
• Enterprise Asset Management
• Updated Planning and Scheduling of PM Tasks [new CMMS]
• Asset Performance Management
• Expanded Condition Monitoring using advanced sensor technologies
• Mobile, cloud-connected devices [handhelds, tablets, drones, robots]
• ISO 55000+
• Risk Based Inspection
• Defect Elimination
• Failure Reporting and Corrective Action System [FRACAS]
• Root Cause Failure Analysis [RCFA]
• Targeted programs involving lubrication, operator rounds, precision maintenance, etc.
You are left with a decision dilemma: What are you going to do?
While all of the programs under consideration may have positive effects on the reliability of your facility, there is one glaring omission that can deliver the most dramatic improvement in the shortest time and at the lowest cost of implementation: Human Reliability.
When practically everyone thinks reliability, they think of the expected performance of things. Things are equipment, components, sub-components, gadgets, and other physical objects. Some expand their thinking to include systems, networks, sources of information (e.g., internal vs. external databases), production facilities, and entire corporations that reliably deliver a product or service. It is rare to hear anyone use the term reliability in relation to individual human performance.
In oil, gas, and petrochemical (OG&P) facilities, where I have spent my 40-year career, billions of dollars and thousands of hours go every year toward understanding, measuring, monitoring, and developing strategies to improve the reliability of things. Yet very little is spent understanding, measuring, monitoring, and improving the reliability of humans. This is truly astonishing because, without question, the most unreliable assets in every oil and gas facility are human beings. Improving human reliability is by far the most effective approach and offers the greatest opportunity for achieving safety, environmental, production, and financial objectives at OG&P facilities.
Why is the reliability of human assets given so little attention? Theories abound in answer to this question, but the most plausible is that the reliability brain-trusts at industrial facilities neither understand nor know how to affect human reliability. From the board room to the pump room, the persons making decisions about reliability, maintenance, inspection, and asset management have education, experience, and a knowledge base that come from an engineering, design, process chemistry, or mechanical perspective.
Reliability professionals and practitioners are taught to understand the failure modes, failure effects, and failure mechanisms of all kinds of equipment and then devise strategies to prevent, or at least limit, the effects of the failures. These individuals must have a clear understanding of rotor dynamics, thermodynamics, fluid dynamics, energy conversion, heat transfer, materials science, corrosion, bearings, lubrication, seals, chemistry, physics, electricity, process control, or other so-called hard sciences. These hard sciences make up the curriculums of all colleges of engineering and of operator, maintenance, and inspection training programs. Nowhere is there a requirement, or even a desire, to understand the failure modes and effects of the human decisions and actions that lead to over 80% of all equipment failures. To understand human reliability, there must be knowledge and experience in the so-called soft sciences of organizational and behavioral psychology.
Figure 1 shows a simple, yet powerful, Failure Progression Model of how the decisions and actions of humans link to equipment failures.

Figure 1. Failure Progression Model
[Contributing Factors (CF) influence human behavior and produce Errors and Violations. These propagate, each link with its own probability, to an Event, then Stressed Equipment, then Equipment Failure, and finally a Consequence. Violations consist of: (1) violations of the Integrity Operating Window [IOW]; (2) human interactions with the process or equipment performed incorrectly. The left portion of the chain represents Human Reliability; the right portion, the Reliability of Things.]


The model is an expansion of a failure model described in the Boeing Corporation's Maintenance Error Decision Aid [MEDA] Users' Guide 1, which linked the majority of aircraft failures to human errors. By combining the MEDA model with extensive research on human behavior, drawing on the following studies and reports from prestigious organizations [plus an advanced degree in Organizational Psychology], the Failure Progression Model was established to show how to dramatically reduce equipment failures by improving human reliability.
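The chained probabilities of the Failure Progression Model can be read as a simple multiplicative risk estimate. The sketch below is purely illustrative, with hypothetical numbers (the article supplies none): it multiplies the conditional probabilities along the chain by a consequence cost, showing why blocking contributing factors at the front of the chain has so much leverage.

```python
# Illustrative sketch of the Failure Progression Model as a chain of
# conditional probabilities. All numbers are hypothetical assumptions,
# not values from the article.

def expected_loss(p_error, p_event, p_stress, p_failure, consequence):
    """Expected loss = P(error) * P(event | error) * P(stress | event)
    * P(failure | stress) * consequence cost."""
    return p_error * p_event * p_stress * p_failure * consequence

baseline = expected_loss(0.10, 0.50, 0.40, 0.25, 2_000_000)
# Blocking contributing factors halves the initial human-error
# probability; every downstream probability is unchanged.
improved = expected_loss(0.05, 0.50, 0.40, 0.25, 2_000_000)

print(f"baseline expected loss: ${baseline:,.0f}")  # $10,000
print(f"improved expected loss: ${improved:,.0f}")  # $5,000
```

Because the terms multiply, halving the earliest probability halves the expected loss, which is exactly the leverage the model attributes to improving human reliability.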
Sampling of Resources Used to Develop the Failure Progression Model
• Systematic Human Error Reduction and Prediction Approach (SHERPA) 2
• Generic Error-Modeling System (GEMS) 3
• Human Error Assessment and Reduction Technique (HEART) 4
• Technique for Human Error Rate Prediction (THERP) 5
• Engineering Psychology and Human Performance 6
• The Center for Chemical Process Safety
• Federal Aviation Administration
• Department of Defense
• US Nuclear Regulatory Commission
• British National Health Service
• US National Institutes of Health
• Journal of Nursing Management
• Many other academic texts, books, reports, and studies
The Failure Progression Model is divided into two distinct regions: Human Reliability and Reliability of
Things. As stated before, almost all efforts, resources, and capital expenditures to lower the
probabilities and limit the consequences of equipment failures are focused on the Reliability of Things.
By definition, reliability is the probability of an entity successfully performing a function. Those charged with improving reliability utilize the inverse, the probability of failure, as a basis to determine the failure modes, failure causes, and failure mechanisms of the entity. If the entity is a piece of equipment, such as a pump, motor, valve, heat exchanger, pressure vessel, control loop, or piping segment, the failure mechanisms typically arise from four types of stress that lead to equipment failures.
Table 1. Physical Stresses Linked to Equipment Failure Mechanisms

Type of Stress*: Damage or Failure Mechanisms

Chemical: corrosion, erosion, cracking, pitting, dissolving, melting, freezing, congealing, condensing, vaporizing, molecular change, density, viscosity, lubricity, expansion, contraction, heating, cooling, fire

Mechanical: friction, wear, impact, tension, compression, fracture, shear, torsion, bending, fatigue, creep, inertia, dislocation, vibration, heating, cooling, plugging, fouling, misalignment, material defect

Electrical: charging, discharging, arcing, pitting, magnetizing, induced currents, over/under voltage, melting, welding, short circuit, open circuit, molecular change, heating, fire

Thermal: expansion, contraction, weakening, vaporization, condensing, melting, freezing, embrittlement, density, viscosity, heating, cooling, fire

*It should be noted that the four types of stress are the core of the chemical engineering, mechanical engineering, electrical engineering, and thermodynamics curriculums at colleges and universities.
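The probability framing above (reliability as the probability of success, with the probability of failure as its inverse) can be made concrete with a minimal numerical sketch. It assumes a constant-failure-rate (exponential) model, which the article does not specify, and the failure rate used is a hypothetical number chosen for illustration.

```python
import math

def reliability(failure_rate, hours):
    """R(t) = exp(-lambda * t): probability the entity performs its
    function through time t, assuming a constant failure rate."""
    return math.exp(-failure_rate * hours)

def probability_of_failure(failure_rate, hours):
    """F(t) = 1 - R(t): the inverse view used to study failure modes."""
    return 1.0 - reliability(failure_rate, hours)

# Hypothetical pump: one expected failure per 20,000 operating hours.
lam = 1 / 20_000
print(f"R(8760 h) = {reliability(lam, 8_760):.3f}")
print(f"F(8760 h) = {probability_of_failure(lam, 8_760):.3f}")
```

The same complementary relationship holds whatever failure model is used; the exponential form is simply the most common starting point in reliability engineering.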
Strategies to improve the reliability of things are focused on preventing equipment failures by detecting
and blocking the failure mechanisms.
Strategies to improve the reliability of humans are focused on preventing errors in decisions and actions
by detecting and blocking the human behavior failure mechanisms. Just as chemical, mechanical,
electrical, and thermal stresses produce failure mechanisms in equipment; psychological, emotional,
physical, and social stresses produce failure mechanisms in human behavior. Human behavior failure
mechanisms are commonly known as contributing factors.
Contributing factors are intrinsic and extrinsic forces that influence human behavior. They are everywhere and affect every thought and action we perform every day. The Boeing Corporation's Maintenance Error Decision Aid (MEDA) 1 lists 10 primary contributing factors. The Human Error Assessment and Reduction Technique (HEART, Williams, 1992) identifies 38 Error Producing Conditions (EPC). There are actually hundreds of contributing factors that represent the failure mechanisms of human decisions and actions. The twelve most widely cited contributing factors come from commercial aviation and are known as "The Dirty Dozen," shown in Figure 2.

Figure 2. The Top Twelve Contributing Factors that Influence Human Behavior

The Dirty Dozen
1. Miscommunication
2. Complacency
3. Distraction
4. Pressure
5. Resource allocation
6. Lack of knowledge
7. Lack of awareness
8. Stress
9. Fatigue
10. Lack of assertiveness
11. Lack of teamwork
12. Norms [normalization of deviance]
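HEART, mentioned above, is the most quantitative of these frameworks: a generic task error probability is multiplied up by each applicable Error Producing Condition according to its maximum effect and the assessed proportion of that effect. The sketch below implements that standard HEART calculation; the generic error probability and EPC multipliers are hypothetical values chosen for illustration, not figures from Williams' published tables.

```python
# Sketch of the standard HEART calculation. The numbers below are
# hypothetical assumptions for illustration only; real assessments
# take them from Williams' published EPC tables.

def heart_hep(generic_ep, epcs):
    """Assessed HEP = GEP * product over applicable EPCs of
    ((max_effect - 1) * assessed_proportion + 1)."""
    hep = generic_ep
    for max_effect, proportion in epcs:
        hep *= (max_effect - 1.0) * proportion + 1.0
    return min(hep, 1.0)  # a probability cannot exceed 1

# Hypothetical task: generic error probability 0.003, with two EPCs:
# time pressure (max effect x11, judged 40% present) and
# distraction (max effect x3, judged 50% present).
hep = heart_hep(0.003, [(11, 0.4), (3, 0.5)])
print(f"assessed HEP = {hep:.4f}")  # assessed HEP = 0.0300
```

Note how two moderately present contributing factors raise a 0.3% baseline error rate tenfold, which is the arithmetic behind the claim that blocking contributing factors is the highest-leverage intervention.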

To begin to understand how contributing factors influence human behavior, we need a glimpse into how humans process information, make decisions, and take actions.
Several models attempt to explain how the brain collects data, processes information, and formulates a response: a flowchart model developed by C. D. Wickens (1992) 6, a stepladder model by Jens Rasmussen (1986), and a cascade model by James McClelland (1979). You could spend months reading and deciphering the writings of noted psychologists theorizing about the workings of the brain. A simple model that draws upon the basic concepts of many of the complex models is described in the Technique for Human Error Rate Prediction (THERP) (Swain, A.D. & Guttmann, H.E., 1983) 5. The THERP model uses a simple decision-action cycle to describe how humans constantly process data and information. The cycle is as follows:
Figure 3. Decision-Action Cycle Determining Human Behavior
1. Perception (senses)
2. Discrimination (awareness)
3. Interpretation (understanding)
4. Diagnosis (deduction)
5. DECISION (recall, reasoning)
6. ACTION (recall, training, experience, skill)
7. Perception (feedback)
The ability to accurately and consistently perform these steps is the basis for improving the performance
of human beings in any activity.

Strengthening the DECISION-ACTION cycle is the most effective method to eliminate errors, prevent failures, and improve Human Reliability.
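THERP treats a task as a sequence of such steps and combines per-step human error probabilities in an event tree. The sketch below shows the simplest form of that combination, assuming hypothetical and independent per-step error probabilities (a simplifying assumption; full THERP assessments model dependence between steps).

```python
# Sketch of the THERP idea: a task succeeds only if every step of the
# decision-action cycle succeeds. Per-step error probabilities are
# hypothetical and assumed independent for illustration.

def task_success_probability(step_error_probs):
    """P(task success) = product over steps of (1 - HEP_i)."""
    p = 1.0
    for hep in step_error_probs:
        p *= (1.0 - hep)
    return p

# One hypothetical error probability per step of the cycle:
# perception, discrimination, interpretation, diagnosis,
# decision, action.
steps = [0.001, 0.002, 0.005, 0.01, 0.01, 0.005]
p_success = task_success_probability(steps)
print(f"P(no error across the cycle) = {p_success:.4f}")
print(f"P(at least one error)        = {1 - p_success:.4f}")
```

Even with small per-step error rates, the chance of at least one error across the whole cycle accumulates, which is why contributing factors that degrade several steps at once are so damaging.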
Figure 4 is a graphic of how contributing factors disrupt and weaken the decision-action cycle. Envision operators, maintenance technicians, inspectors, engineers, or managers going through the decision-action cycle while constantly being bombarded with distractions, pressure, fatigue, stress, miscommunication, lack of awareness, etc. How could anyone make reliable and accurate decisions, followed by the correct actions, in such a chaotic environment? There must be a concerted effort to allow the decision-action cycle to proceed unimpeded by actively blocking the influences of the contributing factors.
Figure 4. Blocking the Dirty Dozen
[The figure pairs the decision-action cycle with the contributing factors that must be blocked from disrupting it.]

Decision-Action Cycle:
• Perception (senses)
• Discrimination (awareness)
• Interpretation (understanding)
• Diagnosis (deduction)
• Decision (recall, reasoning)
• Action (recall, training, experience)
• Perception (feedback)

The Dirty Dozen:
1. Miscommunication
2. Complacency
3. Distraction
4. Pressure
5. Resource allocation
6. Lack of knowledge
7. Lack of awareness
8. Stress
9. Fatigue
10. Lack of assertiveness
11. Lack of teamwork
12. Norms (normalization of deviance)

Most reliability professionals and practitioners will agree with the concepts and claims that human reliability is important, yet they choose not to explore the proven scientific approaches for improving human behavior. Most also think that additional training, digital technologies, and doubling down on existing maintenance and inspection programs are all that is needed to address human errors. These are false and naïve senses of security that demonstrate a lack of knowledge and understanding of how contributing factors, specifically the Dirty Dozen, affect the decision-action cycle. To not engage in human reliability, when there is clear evidence of the extraordinary value in doing so, shows how miscommunication, complacency, distraction, lack of knowledge, lack of awareness, lack of assertiveness, and possibly the normalization of deviance influence the decisions and actions inside the reliability community.
In OG&P facilities, we have become quite good at monitoring, detecting, analyzing, and developing strategies to prevent or limit the failure mechanisms that impact the reliability of things. As explained earlier, reliability brain-trusts come from an engineering or technical background that provides us with the knowledge and skills to design, operate, maintain, inspect, and facilitate changes to physical assets and chemical processes. There is likely no one on your technical staff who has the organizational and behavioral psychology knowledge and skills to design, operate (manage), maintain, inspect, and facilitate changes to prevent or limit the failure mechanisms, a.k.a. contributing factors, that impact human reliability.
Going back to the opening question: What are you going to do? How are you going to detect and block the contributing factors that are causing over 80% of equipment failures and costing your facility over $100 million? The way forward to adopting human reliability as a cornerstone of your reliability improvement begins with a belief in the message this article is putting forth. If you do not believe that detecting and blocking the contributing factors can prevent equipment failures and significantly impact your bottom line, then you will never embrace human reliability. If you believe there is no structured, scientific way to influence human behavior, then discussing methods to change the mindset, focus, and motivation of your workers is likely a waste of time. Therefore, step one in your journey to improve the overall performance of your facility is to believe that human reliability is a high-value process worth adopting. Step two is to gain the knowledge, skills, and techniques to make it work. There are four ways to do so.
1. Direct hire persons with competence in organizational and behavioral psychology.
2. Contract a consultant to execute a project to change the behavior and culture of your workers.
3. Engage a consultant to train, coach, and develop your existing workforce in human reliability.
4. A combination of 1, 2, and 3.

Success in human reliability will likely require a combination of direct hires, project consultants, and behavior-change coaches.
A word about culture. Buttermilk has culture. You can buy buttermilk at the local market and then claim you have changed the culture in your diet. Huh uh. It is not that simple. Your organization also has a culture. We read a lot about adopting a safety culture, a reliability culture, or a culture of operational excellence. You can attend seminars, purchase training videos, put up signs, send out newsletters, adopt new language, apply new performance metrics, instill a sense of urgency, and reward the early adopters to try to change the culture. These techniques will not change an embedded culture. The same contributing factors that are affecting human reliability are also limiting your ability to change the culture of your organization. Until there is a concerted effort to understand and address the contributing factors [The Dirty Dozen], the chances of changing the culture are very low. Changing the culture of an organization is not a gradual endeavor. The techniques that promote continuous improvement are not powerful enough to effect significant changes in human behavior. To instill a new and lasting culture requires a disruptive innovation (see Figure 5).
Figure 5. Disruptive Innovation vs. Continuous Improvement
[A performance-versus-time chart contrasting the shallow upward slope of continuous improvement with the step change of disruptive innovation. A significant increase in performance CANNOT be realized with continuous improvement alone.]

Your decision on what to do is now quite clear. To drastically improve the reliability of your facility and recover some of the huge losses due to equipment failures, you should investigate, adopt, and apply the disruptive innovation of human reliability.

REFERENCES
1. Maintenance Error Decision Aid (MEDA) Users' Guide, Boeing Corporation, 1986.
2. Systematic Human Error Reduction and Prediction Approach (SHERPA), Murgatroyd & Embrey, 1986.
3. Generic Error-Modeling System (GEMS), Reason & Embrey, 1986.
4. Human Error Assessment and Reduction Technique (HEART), Williams, 1986; 1988; 1992.
5. Technique for Human Error Rate Prediction (THERP), Swain, A.D., & Guttmann, H.E., 1983.
6. Engineering Psychology and Human Performance, Wickens, Hollands, Banbury, & Parasuraman, 4th Edition, 2012.
Barry Snider is President and Chief Consultant of Small Hammer Incorporated, a consulting company specializing in refinery and facility management. Barry has 40 years' experience in maintenance, operations, management, and consulting at refineries, chemical manufacturing complexes, pipeline networks, and gas processing sites. Under the Small Hammer brand, Barry has developed powerful techniques to prevent equipment failures and promote Human Reliability, Organization Design, and Workforce Performance Management. Barry holds degrees in Mechanical Engineering from West Virginia University and an MBA in Organizational Psychology from American Intercontinental University.
