Random versus Systematic Faults: What’s the difference?

Source: https://www.exida.com/Blog/random-versus-systematic-faults-whats-the-difference

Steve Gandy, CFSP


Thursday, October 19, 2017

I saw and responded to a LinkedIn discussion on this very issue, where someone had
asked, “If I have a misaligned limit switch that fails dangerously, then is it random or
systematic?” This is an intriguing question because many view human error as being
systematic and, although this is sometimes true, it’s not always the case. When teaching
our FSE100 course, we discuss the differences and why it’s important to categorise
failures this way.

We tend to think of random failures as failures that occur at random time intervals
(usually hardware related) and are therefore unpredictable. In probabilistic analysis,
where we try to predict the likelihood of a failure on demand in low demand process
applications, we use average failure rates in our PFDavg calculations, based upon a
constant failure rate during Useful Life. Over 200 billion unit operating hours of failure
rate data have now been collected, which gives us a pretty accurate value for certain
types of equipment to use in PFDavg calculations (such as those in exSILentia).
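
As a rough illustration of how a constant failure rate feeds a PFDavg calculation, the sketch below uses the widely quoted simplified equation for a single (1oo1) device, PFDavg ≈ λDU × TI / 2, alongside the exact average of 1 − exp(−λDU × t) over one proof test interval. It assumes a perfect proof test and ignores repair time, common cause and diagnostics. The function names and the example numbers are hypothetical and are not taken from exSILentia or from any failure rate database.

```python
import math

HOURS_PER_YEAR = 8760


def pfd_avg_simplified(lambda_du: float, test_interval_h: float) -> float:
    """Simplified 1oo1 approximation: PFDavg ~= lambda_DU * TI / 2."""
    return lambda_du * test_interval_h / 2.0


def pfd_avg_exact(lambda_du: float, test_interval_h: float) -> float:
    """Average of PFD(t) = 1 - exp(-lambda_DU * t) over one test interval."""
    x = lambda_du * test_interval_h
    if x <= 0:
        raise ValueError("failure rate and test interval must be positive")
    # expm1(-x) = exp(-x) - 1, which stays accurate for small x.
    return 1.0 + math.expm1(-x) / x


# Hypothetical numbers, for illustration only.
lambda_du = 1.0e-6                # dangerous undetected failures per hour
test_interval_h = HOURS_PER_YEAR  # proof test once a year

print(f"simplified: {pfd_avg_simplified(lambda_du, test_interval_h):.3e}")
print(f"exact:      {pfd_avg_exact(lambda_du, test_interval_h):.3e}")
```

With these inputs the two results differ only slightly, which is why the simplified form is commonly used when λDU × TI is small.
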

Systematic failures, on the other hand, are insidious and can only be eliminated by a
change in design, manufacturing, procedures or training. I like to frame this as
the 3 Ps:

The 3 Ps
• People – are they competent and trained;

• Procedures – are there well-defined procedures, and are they followed;

• Paperwork – do we have an audit trail to demonstrate that the first two are
being adhered to.

This means that systematic failures are not considered in probabilistic calculations.
Therefore, if a site is categorising failures as systematic, it could end up with
unrealistically low failure rates when measuring SIF performance. For this reason,
it’s good policy to categorise all field failures as random until proven otherwise. That
way, we won’t throw away any failures unnecessarily.
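
The effect on the numbers can be sketched with the usual point estimate for a constant failure rate: failures divided by accumulated unit operating hours. The counts below are invented for illustration, and the function name is hypothetical.

```python
def estimate_failure_rate(failure_count: int, unit_operating_hours: float) -> float:
    """Point estimate of a constant failure rate, in failures per hour."""
    if unit_operating_hours <= 0:
        raise ValueError("unit operating hours must be positive")
    return failure_count / unit_operating_hours


# Hypothetical site data, for illustration only.
unit_operating_hours = 5_000_000
all_dangerous_failures = 10
written_off_as_systematic = 4  # failures excluded without proof

rate_all = estimate_failure_rate(all_dangerous_failures, unit_operating_hours)
rate_after_exclusion = estimate_failure_rate(
    all_dangerous_failures - written_off_as_systematic, unit_operating_hours
)

print(f"all failures kept:       {rate_all:.2e} per hour")
print(f"after excluding 4 of 10: {rate_after_exclusion:.2e} per hour")
```

Here the estimate drops by 40% purely because of how the failures were labelled, not because the equipment performed any better.
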

For example, let’s say a well-trained instrument technician, who had performed this
task many times without error, mis-calibrated a sensor so that it could not detect a
high level (a dangerous condition), even though the calibration procedure and
paperwork were correct. Would this be categorised as a systematic error or a random
error?

Many would argue that, because it is human error, it would be a systematic issue.

So, let’s see how this measures up to the 3 Ps:


• People – the technician is well trained and so is competent

• Procedures – the procedure is correct

• Paperwork – the paperwork is correct

This would therefore be categorised as a random error and not a systematic one. Perhaps
the technician was distracted, tired, having a bad day, etc. The technician just made a
mistake. It’s that simple.

However, it could be argued that, for safety-related equipment, the procedure should be
changed to require a four-eyes policy (a second person checks the work), which would
help prevent the error and would therefore be a systematic improvement.

It’s easy to see how confusing it can be to determine whether a fault is random or
systematic, which is why we recommend capturing the failure as random until proven
otherwise.

So, coming back to the case of the misaligned limit switch, we would need to initially
categorise the failure as random so that it’s captured, and then analyse whether it is
actually a systematic fault by looking into the 3 Ps.
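
The "random until proven otherwise" rule can be written down as a small decision sketch. This is only one possible encoding of the 3 Ps check described above, not an exida tool or a standard algorithm. The class, field and function names are hypothetical, and a real investigation needs engineering judgement rather than three flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThreePsReview:
    """Findings of a 3 Ps investigation; None means not yet investigated."""
    people_competent: Optional[bool] = None
    procedure_adequate: Optional[bool] = None
    paperwork_in_order: Optional[bool] = None


def categorise_failure(review: ThreePsReview) -> str:
    """Return 'systematic' only when a 3 Ps deficiency has been shown."""
    findings = (
        review.people_competent,
        review.procedure_adequate,
        review.paperwork_in_order,
    )
    if any(finding is False for finding in findings):
        return "systematic"
    # No proven deficiency (or no investigation yet): keep the failure
    # in the random category so that it is still counted.
    return "random"


# The mis-calibrated sensor example: all three Ps check out.
technician_case = ThreePsReview(True, True, True)
# The misaligned limit switch before any investigation has been done.
limit_switch_case = ThreePsReview()

print(categorise_failure(technician_case))
print(categorise_failure(limit_switch_case))
```

Defaulting every unreviewed field to None keeps a newly reported failure in the random category until an investigation actually demonstrates a deficiency.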