
POWER-GEN International 2011, Las Vegas, U.S.A.

Failure analysis of rotating equipment using root cause analysis methods

Graeme Keith, Lloyd's Register ODS (1)
Philippe Loustau, Lloyd's Register Energy Americas (2)
Magnus Melin, Lloyd's Register (3)

Increasing demand on equipment up-time in the power sector

With the rapid development of technology and the ever-rising demand for energy, more and bigger power plant projects are being designed, built and operated around the world. The growing share of renewable energy in the energy system puts increasing pressure on conventional thermal power plants to reduce emissions by adopting new technologies, for example co-firing with biomass, and also to operate more flexibly, with short windows of operation to cope with peak loads. All of this brings additional complexity to power plants and equipment (especially critical machinery). The added complexity can deliver higher output and efficiency, but it also brings greater technical and financial risk in case of failure or problems during the asset lifecycle. Consequences of equipment failure range from short unexpected downtime to a total stop of production for an extended period.

Despite the best intentions and precautions, failures do occur. Whenever equipment fails to meet expectations or fails altogether, we must understand what went wrong so that we can safeguard against it ever happening again. A good explanation not only helps you prevent a failure from recurring; it can also help identify systematic weaknesses that might result in other failures. To give a good explanation is to give a full account of the relevant causes of a failure.

There is a wide variety of root cause analysis (RCA) methods and procedures for analyzing the causes of failure, including the widely used fishbone or Ishikawa diagram, the appealingly simple Five Whys, the versatile Fault Tree Analysis and its close cousin the causal map. Each method has its particular advantages and drawbacks.
Many were developed for a particular sector or application, and while they work very well on their home territory, they are not all as universally applicable as their advocates sometimes hope. All these methods are essentially about mapping causes: identifying the immediate causes of a failure, the causes of those causes, and so on. In the following, we illustrate the different methods with an example from the philosopher David Lewis: a car accident involving a drunk driver, driving too fast in a car with bald tyres.

(1) Email: graeme.keith@lr-ods.com, website: www.lr-ods.com
(2) Email: philippe.loustau@lr-ods.com, website: www.lrenergy.org
(3) Email: magnus.melin@lr.org, website: www.lr.org

The Ishikawa (fishbone) diagram

The different methods emphasize different aspects of causal mapping. The fishbone diagram provides a useful categorization, allowing investigators to focus on one category of possible causes at a time. The categories vary according to application, but a typical list is Equipment, Process, People, Materials, Environment and Management. Figure 1 shows the beginning of a fishbone diagram for the car accident example. The categories are drawn as branches joining a thick horizontal line that leads to the problem we are trying to explain. Drawn into these branches are the various causes identified in each category, and into those lines may be drawn secondary causes, i.e. causes of causes.
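As a minimal sketch of the structure just described, a fishbone diagram can be held as a mapping from category to causes, each cause carrying its own list of secondary causes. The category names and causes come from the car accident example; the assignment of each cause to a category is our assumption for illustration, and the names `fishbone`, `causes_in` and `render` are hypothetical.

```python
# Fishbone (Ishikawa) diagram as nested mappings:
# category -> {cause: [secondary causes]}.
# ASSUMPTION: the category each cause is filed under is illustrative only.
fishbone = {
    "problem": "CRASH!",
    "categories": {
        "Equipment": {"Bald tyres": []},
        "Process": {"Driving too fast": []},
        "People": {"Drunk driver": []},
        "Materials": {},
        "Environment": {"Icy road": []},
        "Management": {"Driver depressed": []},
    },
}


def causes_in(category):
    """Return the causes recorded under one category."""
    return list(fishbone["categories"][category])


def render(diagram):
    """Render the diagram as indented text, one category at a time."""
    lines = [f"Problem: {diagram['problem']}"]
    for category, causes in diagram["categories"].items():
        lines.append(f"  {category}")
        for cause, secondary in causes.items():
            lines.append(f"    - {cause}")
            for item in secondary:
                lines.append(f"      - {item} (cause of cause)")
    return "\n".join(lines)


print(render(fishbone))
```

Each cause lives under exactly one category, which is what makes the diagram easy to fill in one category at a time.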
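[Figure 1 (image not reproduced). Category branches: Equipment, Process, People, Materials, Environment, Management. Causes shown: bald tyres, driving too fast, drunk driver, icy road, driver depressed. Problem at the head of the diagram: CRASH!]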

Figure 1: A fishbone (Ishikawa) diagram for the car accident.

The great attraction of the fishbone diagram is also its great weakness. Whilst the categorization provides clarity for the discovery of causes, it imposes unnatural restrictions on mapping the relationships between these causes. For example, we have identified the driver's drunkenness as a cause in the category People. Clearly the poor man's depression may have contributed to his drunkenness, but there is no natural way to cross categories while passing down a chain of causes in the fishbone diagram. Some more recent variants of the fishbone diagram facilitate longer chains, but the diagram remains severely limited for all but the simplest problems.

Causal mapping

Causal mapping dispenses with the categorization and frees the connections, so that the relationships between causes can be made clearer and more instructive. The car accident example is sketched in Figure 2, where the causal chain has also been extended beyond the crash to include the true cost of the failure in terms of safety, asset or business performance and environment.

Starting with these ultimate consequences, we work backwards through a series of why-questions, most of which will have several answers. Why did he crash? He skidded off the road. Why did he skid off the road? The road was icy. The tyre was worn. There was a car coming in the opposite direction, which he swerved to avoid. Why did he swerve? He saw the car too late and was in the middle of the road. Why did he see the car too late? Why was he in the middle of the road? The corner was blind. He was driving too fast. He was drunk and his reflexes were slow.

Causal mapping is an attempt to formalize the interaction between deterministic causes; Bayesian networks do the same statistically, introducing a probabilistic, quantitative element; the fault tree makes the causal relationships explicit using Boolean logical operators. The Five Whys method is a simpler approach that focuses attention on single causal chains: the cause of the cause of the cause, and so on (five times).
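The backward why-questioning can be sketched as a walk over a mapping from each effect to its direct causes. This is an illustration under stated assumptions, not a definitive tool: the text does not say which of the last three answers belongs to which question, so attaching all three to both questions is our reading, and the names `CAUSES`, `ask_why`, `all_causes` and `five_whys` are hypothetical.

```python
from collections import deque

# effect -> direct causes, i.e. the answers to "why?".
# ASSUMPTION: the last three answers are attached to both open questions.
CAUSES = {
    "Crash": ["Skidded off road"],
    "Skidded off road": ["Icy road", "Bald tyre", "Swerved to avoid car"],
    "Swerved to avoid car": ["Saw car too late", "In middle of road"],
    "Saw car too late": ["Blind corner", "Driving too fast", "Drunk"],
    "In middle of road": ["Blind corner", "Driving too fast", "Drunk"],
}


def ask_why(effect):
    """Direct causes of an effect (empty if we have no answer yet)."""
    return CAUSES.get(effect, [])


def all_causes(effect):
    """Causal mapping: keep asking why, breadth-first, listing each cause once."""
    found = []
    queue = deque(ask_why(effect))
    while queue:
        cause = queue.popleft()
        if cause not in found:
            found.append(cause)
            queue.extend(ask_why(cause))
    return found


def five_whys(effect, depth=5):
    """Five Whys: follow a single chain, always taking the first answer."""
    chain = [effect]
    for _ in range(depth):
        answers = ask_why(chain[-1])
        if not answers:
            break
        chain.append(answers[0])
    return chain


print(all_causes("Crash"))
print(five_whys("Crash"))
```

The single-chain walk stops at the icy road after two questions, while the full map also reaches the blind corner, the speeding and the drinking. This is the trade-off between the simplicity of Five Whys and the coverage of a causal map.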

[Figure 2 (image not reproduced). Nodes shown: Depressed, Drunk, Driving too fast, Blind corner, Saw car too late, Swerve to avoid car, Icy road, Bald tyre, Skidded off road, CRASH!, No seatbelt, Driver died, Car written off, Coll. damage.]

Figure 2: A causal map for the car accident.

Causal relations, root causes and relevance

These methods are useful for establishing causal relations. They help to identify root causes, i.e. causes that lie at the root of several chains leading to the final failure. Our driver's depression is a root cause: it causes his drinking, the neglect that led to the bald tyre, and the recklessness that led to speeding without a seatbelt. These methods help us to manage and quantify complicated interdependencies, especially when those interdependencies are statistical rather than deterministic.

It is not always clear how far back in a causal chain it is useful to go (though the Five Whys method has a pretty big clue in its title). Taking the causal map seriously and conscientiously following the why-methodology quickly leads to a bewildering multiplicity of causes and to information overload. In practice, investigators use their experience and judgement to decide how far back to regress along a causal chain and how much to drill into it, but this can make the results too subjective and dependent on the prejudices and preoccupations of the investigator.
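The notion of a root cause as one that lies at the root of several chains can be made concrete by counting the chains. The sketch below is illustrative only: the edges are our reading of the text and of the node labels in Figure 2, the graph is assumed to be acyclic, and `EFFECTS`, `count_chains` and `root_causes` are hypothetical names.

```python
# cause -> direct effects.
# ASSUMPTION: edges follow our reading of the text; the map is acyclic.
EFFECTS = {
    "Depressed": ["Drunk", "Bald tyre", "Driving too fast", "No seatbelt"],
    "Drunk": ["Saw car too late"],
    "Driving too fast": ["Saw car too late"],
    "Blind corner": ["Saw car too late"],
    "Saw car too late": ["Swerve to avoid car"],
    "Swerve to avoid car": ["Skidded off road"],
    "Bald tyre": ["Skidded off road"],
    "Icy road": ["Skidded off road"],
    "Skidded off road": ["CRASH!"],
    "CRASH!": ["Driver died"],
    "No seatbelt": ["Driver died"],
}


def count_chains(cause, outcome):
    """Number of distinct causal chains leading from cause to outcome."""
    if cause == outcome:
        return 1
    return sum(count_chains(effect, outcome) for effect in EFFECTS.get(cause, []))


def root_causes(outcome, min_chains=2):
    """Causes with no recorded cause of their own that start several chains."""
    has_a_cause = {effect for effects in EFFECTS.values() for effect in effects}
    result = {}
    for cause in EFFECTS:
        if cause in has_a_cause:
            continue
        chains = count_chains(cause, outcome)
        if chains >= min_chains:
            result[cause] = chains
    return result


print(root_causes("Driver died"))
```

On this map the icy road and the blind corner each start a single chain, whereas the depression starts several. The count depends entirely on where the investigator stopped asking why, which is the subjectivity discussed above.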

The general problem here is relevance. Whether a cause is relevant or not depends on the context. The road safety expert, who wants to know why the corner is blind, has different interests from the forensic psychologist, who is interested in the depression and who, in turn, has different interests from the insurance lawyer, who is only really interested in the drinking. When investigating equipment failure, the relevant causes are the ones that give you a solution.

Discovering causes

But there is a much bigger problem with these methods than relevance. They make quite extreme demands on the omniscience of failure investigators, as they all assume that the causes of a failure are known in detail and with certainty. In reality, we often have little idea what the causes of a failure are or could be, and very often, once we start looking, we come up with a large number of contradictory candidates, not all of which can be true. Before we can analyse the causes of a failure, we need to find out what they are.

The great British philosopher John Stuart Mill, in his 1843 book A System of Logic, gave five methods for discovering causes. Of these, the method of difference has proved the most fruitful for practical applications. Faced with a failure, say a damaged steam turbine, rather than asking "Why did this turbine fail?" you ask "Why did this particular turbine fail and not this nearly identical turbine next to it?" or "Why did this turbine fail today and not yesterday?". The idea is to look at the difference between the failure case and a case as similar to it as possible but in which the failure did not occur. The cause of the failure must be found in the difference between the two cases. By restricting attention to the differences between the two cases, you essentially ignore everything they have in common, and you dramatically reduce the amount of material and the number of possible causes you need to consider.
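In its simplest form, the method of difference amounts to comparing two records and keeping only the attributes on which they disagree. The following is a minimal sketch under that assumption; the function name `method_of_difference` and every attribute and value in the two turbine records are hypothetical, invented purely for illustration.

```python
def method_of_difference(failed_case, reference_case):
    """Attributes on which the failed case differs from a similar case in
    which the failure did not occur: {attribute: (failed, reference)}."""
    attributes = sorted(failed_case.keys() | reference_case.keys())
    return {
        name: (failed_case.get(name), reference_case.get(name))
        for name in attributes
        if failed_case.get(name) != reference_case.get(name)
    }


# HYPOTHETICAL records for two nearly identical turbines.
failed_turbine = {
    "design": "type X",
    "steam supply": "common header",
    "commissioned": "same year",
    "bearing batch": "B",
}
healthy_turbine = {
    "design": "type X",
    "steam supply": "common header",
    "commissioned": "same year",
    "bearing batch": "A",
}

candidates = method_of_difference(failed_turbine, healthy_turbine)
print(candidates)
```

Everything the two cases share drops out of the comparison, leaving a single attribute as a candidate cause to investigate. A difference found this way is a hypothesis, not yet a confirmed cause.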
By switching through a variety of similar cases, we can generate a large number of hypothetical causes and causal scenarios. Not all of these will be true, but there are well-defined criteria for evaluating causal theories and choosing between them. And it is far better to have to choose between too many candidates than to miss the right one or to have none at all.

Mill's difference method as a practical tool for failure analysis

The difference method forms the basis of a powerful tool for discovering relevant causes in cases of machinery and equipment failure in the power sector. By comparing the case in which the failure has occurred with similar cases in which it hasn't, we dramatically reduce the field over which we must search for possible causes. Moreover, if a cause can be found in the difference between two real cases, then there is a much better chance that the problem can be corrected, by bringing the problem case closer to the case where the problem hasn't occurred. Our contrasts need not be real cases; contrasting with hypothetical cases can often be very revealing.

Now let's look at a case study from the power sector to illustrate how failure analysis theory can be applied in real life.

Case study: shaft failure of diesel engine generator set

In this case, a critical failure had occurred in the main shaft of a diesel engine driving a generator at a power plant. The engine is a modern 18 cylinder 4-stroke gas engine connected to a generator via a coupling. An extensive material analysis of the failed shaft suggested that the crack initiated in a weak spot and progressed through fatigue, a very common finding in material analysis related to a failure. The way the crack had propagated is consistent with torsional vibration.

Figure 3: Failure of shaft.

Without further investigation, the client concluded that the most likely cause was material failure of that particular shaft, and a new shaft was ordered and installed. The same failure occurred again shortly after the engine was taken into operation. At that point it was decided to carry out a more extensive, structured failure analysis using Lloyd's Register ODS.

Looking back, it is easy to conclude that the decision to simply replace the shaft was wrong but, to be fair, it is also easy to understand the rationale behind the decision: the engine was a standard design, and a large number of identical engines are in operation around the world without problems. There had been no design change to this particular engine, to the coupling or to the generator. Consequently, the problem must be with the shaft itself, right?

An initial listing of potential causes included:

- Material defect
- Misfiring
- Alignment
- Structural resonances in the base skid
- Torsional damper malfunction

A material defect was given a low likelihood, since it is unlikely to occur twice in the same location on two different shafts. Misfiring could be ruled out after discussions with the operational staff. Alignment was a potential cause, since it is individual to each engine and could therefore explain why this particular engine and not others had failed (Mill's difference method). Again, discussions with the staff ruled out alignment as the cause. Structural resonance in the base skid was also seen as a potential candidate, but it was difficult to explain why this particular skid would have problems when other skids of identical design did not. Lastly, malfunction of the torsional damper was listed as a potential cause, primarily for two reasons: it could explain why this particular engine and not others failed (Mill's difference method again), and it was in line with the findings from the material analysis of the failed shaft, where the propagation of the crack indicated high torsional vibrations.

It was decided to continue the failure analysis by focussing, at least initially, on a potential malfunction of the torsional damper as the (most) likely cause. Two questions immediately arose:

- Was the torsional damper not functioning as intended?
- Could a malfunctioning torsional damper cause torsional vibrations high enough to initiate and propagate a crack as fast as observed in the failure?

In order to gain further insight, two parallel activities were launched. The first was to remove the front cover of the torsional damper: this was an easy and quick operation and could give a first indication of clearly visible damage, if any was present. The second was to measure the torsional natural frequencies of the engine, to better understand whether there was a problem related to torsional dynamics.
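The screening of candidates described above can be written down as a small table and a filter. This is only a sketch of the reasoning, not the procedure actually used in the investigation: the field names, the status labels and the `shortlist` function are hypothetical, and a value of `None` means the text does not state the point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    # Could it explain why this engine failed and identical ones did not?
    specific_to_this_engine: Optional[bool]
    # Is it in line with the torsional crack propagation in the shaft?
    fits_torsional_evidence: Optional[bool]
    status: str  # "open", "ruled out" or "low likelihood"
    reason: str


# ASSUMPTION: this encoding is our reading of the discussion in the text.
CANDIDATES = [
    Candidate("Material defect", None, None, "low likelihood",
              "unlikely twice in the same location on two shafts"),
    Candidate("Misfiring", None, None, "ruled out",
              "discussions with operational staff"),
    Candidate("Alignment", True, None, "ruled out",
              "discussions with operational staff"),
    Candidate("Structural resonance in base skid", False, None, "open",
              "identical skids elsewhere have no problems"),
    Candidate("Torsional damper malfunction", True, True, "open",
              "explains the difference and matches the material analysis"),
]


def shortlist(candidates):
    """Keep open candidates that pass the difference test."""
    return [
        c for c in candidates
        if c.status == "open" and c.specific_to_this_engine is True
    ]


for candidate in shortlist(CANDIDATES):
    print(candidate.name, "-", candidate.reason)
```

Only the torsional damper survives both filters, which matches the decision to investigate it first.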
At first, the inspection of the torsional damper did not reveal anything extraordinary: it looked brand new, without clearly visible damage. However, a closer look at the surfaces between the damper mass and the shaft showed no signs whatsoever of wear. For those not familiar with torsional dampers this may sound perfectly normal but, in fact, it means that the torsional damper was not functioning at all, since its fundamental working principle is based on relative motion between the mass and the shaft. A further full disassembly confirmed that the moveable parts of the torsional damper were indeed locked in position, corresponding to little or no damping effect. The locking had most likely occurred during assembly, through a combination of the assembly procedure and (too) tight tolerances resulting from inadequate quality control.

The measurements of the torsional natural frequencies showed that they deviated from the natural frequencies calculated by the OEM. A new full torsional model was built by Lloyd's Register ODS and used to simulate the torsional dynamics of the coupled system. The simulations confirmed that the torsional damper was critical for attenuating the resonant response between the 4th order excitation and the 1st torsional natural frequency of the shaft. Further, the simulations showed that a malfunctioning torsional damper would indeed shift the natural frequencies to those measured, and would cause stresses high enough to produce a failure like the one observed in the shaft.

Design changes were made to the torsional damper to mitigate future problems, including increased clearance of the back bearing and increased lube channel diameter. The engine was taken into operation again after the modifications, and no new failures have occurred.

Conclusions

Structured root cause analysis methods such as fault tree analysis, causal mapping, fishbone diagrams (Ishikawa), Five Whys (Toyota) and Mill's method of difference can be useful in failure analysis of equipment in the power industry. There are many tools, and it is important to understand their strengths as well as their limitations and the ways in which they can lead you astray. But with broad technical insight, a clear conception of the notions of cause and effect, and a sound understanding of the criteria for a good explanation, these methods can quickly bring investigators close to the relevant factors.
Based on more than 30 years' involvement in failure analysis of rotating equipment, we have found that a multi-disciplinary approach is very useful for understanding complex failures and arriving at effective mitigation measures. Such an approach combines professional use of RCA methods, simulations of the system (including coupled subsystems when necessary) and measurements.
