

Troubleshooting is a method of finding the cause of a problem and correcting it. The ultimate goal of troubleshooting is to get
the equipment back into operation. This is a very important job because the entire production operation may depend on the
troubleshooter's ability to solve the problem quickly and economically, thus returning the equipment to service. Although the
actual steps the troubleshooter uses to achieve the ultimate goal may vary, there are a few general guidelines that should be
followed. There are often cases where a familiar piece of equipment or system breaks down. In those cases, an abbreviated
five-step troubleshooting process can be used to find the fault and get the system up and running. It is important to note that,
although it is a five-step approach, the same basic guidelines of the seven-step troubleshooting method are followed. The
steps are simply combined to be specific to the problem at hand. This article will briefly cover the five-step troubleshooting
process, followed by a more in-depth look at the seven-step troubleshooting process.

General Troubleshooting Guidelines

The general guidelines for a good troubleshooter to follow are to use a clear and logical approach and to:

• Work quickly
• Work efficiently
• Work economically
• Work safely and exercise safety precautions

1 General Troubleshooting Guidelines
1.1 Troubleshooting Steps
1.2 Action Items
1.3 Troubleshooting Documentation
1.4 Seven-Step Troubleshooting Philosophy
1.4.1 Step 1: Symptom Recognition
1.4.2 Step 2: Symptom Elaboration
1.4.3 Step 3: Listing of Probable Faulty Functions
1.4.4 Step 4: Localizing the Faulty Function
1.4.5 Step 5: Localizing the Fault to a Component
1.4.6 Step 6: Failure Analysis
1.4.7 Step 7: Retest Requirements
2 Troubleshooting With Flowcharts
2.1 Typical Troubleshooting Process
2.1.1 Step 1: Talk with the Operator
2.1.2 Step 2: Verify Symptoms
2.1.3 Step 3: Attempt Quick Fixes
2.1.4 Step 4: Review Troubleshooting Aid
2.1.5 Step 5: Step-by-Step Search
2.1.6 Step 6: Clear the Trouble
2.1.7 Step 7: Perform Preventive Maintenance
2.1.8 Step 8: Make Final Checks
2.1.9 Step 9: Complete Paperwork
2.1.10 Step 10: Inform Area Supervision/Instruct Operators
2.2 The Flowchart Model
3 Five Action Steps for Systematic Troubleshooting
3.1 Five-Step Troubleshooting Process
3.2 Step 1: Verify That a Problem Actually Exists

3.2.1 Panel Graphic
3.2.2 Loop Diagram
3.2.3 Piping and Instrumentation Diagram
3.2.4 Block Diagram
3.2.5 Schematic Diagram
3.2.6 Wiring Diagram
3.3 Step 2: Isolate the Cause of the Problem
3.4 Step 3: Correct the Cause of the Problem
3.5 Step 4: Verify That the Problem Has Been Corrected
3.6 Step 5: Follow Up to Prevent Future Problems
4 Deriving Logical Troubleshooting Flowcharts and Strategies
4.1 Deriving Your Own Troubleshooting Strategy
4.1.1 Here is What To Do
5 Types of Failures
5.1 Steps for Troubleshooting Intermittent Failures
5.1.1 Attempt to Recreate the Problem: Thermally Induced Failure, Mechanically Induced Failure, Erratic Failure
5.2 Alternatives to Recreating Failures
5.3 Identifying All Possible Causes of Trouble
6 Cause and Effect Diagrams
6.1 Constructing a Cause and Effect Diagram
6.1.1 Step 1: Identify the Trouble or Problem
6.1.2 Step 2: Draw a Main Line Pointing to the Problem
6.1.3 Step 3: Identify the Possible Major Causes of the Problem
6.1.4 Step 4: Identify Each Possible Minor Cause Associated With the Major Causes
6.1.5 Step 5: Identify Each Contributing Factor to the Minor Causes
6.1.6 Step 6: Review the Cause and Effect Diagram

Troubleshooting Steps

The five-step troubleshooting process consists of the following:

1. Verify that a problem actually exists.
2. Isolate the cause of the problem.
3. Correct the cause of the problem.
4. Verify that the problem has been corrected.
5. Follow up to prevent future problems.

Action Items

Within the four general guidelines previously mentioned, there are several action items that are important to the successful
achievement of the goal of troubleshooting:
1. Verify that something is actually wrong.
A problem usually is indicated by a change in equipment performance or product quality. Verifying the problem will either
provide indications of the cause, if a problem actually exists, or prevent the troubleshooter from wasting time and effort on
"ghost" problems caused by the operator's lack of equipment understanding. Do not simply accept a report that something is
wrong without personally verifying the failure. A few minutes invested up front can save a lot of time down the road.

2. Identify and locate the cause of the trouble.
Trouble is often caused by a change in the system. The more thorough the understanding of the system, its modes of operation, and how those modes are supposed to work, the easier it will be to find the cause of the trouble. This knowledge allows the troubleshooter to compare normal conditions to actual conditions.
3. Correct the problem.
This often involves replacing or repairing a part or making adjustments. Never adjust a process or piece of equipment to compensate for a problem and consider the job finished; correct the problem! It is very important to correct the cause of the problem, not just the effect or the symptom.
4. Verify that the problem has been corrected.
Repeating the same check that originally indicated the problem can often do this. If the fault has been corrected, the system should operate properly.
5. Follow up to prevent further trouble.
Determine the underlying cause of the trouble. Suggest a plan to a supervisor that will prevent a future recurrence of this problem.

This basic troubleshooting philosophy is the basis for the seven-step troubleshooting method discussed later. An important point to remember as we discuss the seven-step methodology is that we are discussing a philosophy, not a procedure. It reflects the basic strategy for troubleshooting, though each individual facility may require a different application of the strategy specific for the equipment and policies at that facility. Using the seven-step philosophy, a procedure could be developed that would provide the most cost-effective and efficient means for troubleshooting a particular piece of equipment in a given facility. However, this procedure would not necessarily be effective when used with different equipment or even the same equipment installed in a different facility. This is because the tools used for troubleshooting are only as good as their application.

Troubleshooting Documentation

"There is no substitute for experience" is a catchy and, more often than not, true phrase. If only there were a way to capture even a small part of that experience to be used in the future, either by those who have not been fortunate (or unfortunate, as the case may be) enough to see something for themselves or by those who have seen too many years pass between experiences. This is the point of an equipment history, or troubleshooting log.

Many companies require their maintenance personnel or engineering staff to maintain historical data on equipment used within their facilities. These requirements are not intended to be a burden on the maintenance or engineering departments, nor are they meant to destroy every tree on the planet with unnecessary paperwork. Problems, symptoms, corrective actions, modifications, and preventive maintenance actions all should have entries that can be referenced at a later date. This can tell quite a tale over the life of a piece of equipment. Completing the required information on a troubleshooting log may seem tedious, but the information on the log can be very beneficial to a technician looking for the solution to a problem several months or even years later. The troubleshooting log provides a valuable source of information from which the troubleshooter can draw on the experience of past troubleshooting efforts to quickly restore the equipment to service.

The equipment history can help prevent the troubleshooter from "recreating the wheel." It can lead the troubleshooter to the solution to a problem that has not occurred in years. Additionally, documentation of recurring problems can provide the horsepower needed to get the right part or the engineering solution necessary to not only fix the problem, but also correct it. Without this historical data and documentation of a recurring problem and its associated costs, the arguments will often be met with the statement "if it is not written down, it did not happen."

The equipment history/troubleshooting log is an ideal place to keep the records necessary to establish and maintain a common problems list. The purpose of the common problems list is to provide the troubleshooter with a ready reference of past problems and their corrective actions. It is from this list that quick fixes can be taken. The list can be referred to at the beginning of a troubleshooting problem so the quick fixes can be tried. This can save the troubleshooter valuable time when troubleshooting.

Troubleshooters or technicians need to be careful of what is placed on the common problems list. If the common problems list is too long and cumbersome, it cannot be used effectively; it causes troubleshooting efforts to move slowly as the troubleshooter checks every possibility. If something occurs once, it is not necessarily a common problem. The problem should be listed in the history section and should not be put on the common problems list until it occurs again. If a problem occurs on a regular or routine basis, it should be put on the common problems list. Figure 1 shows an example of a troubleshooting log that could be used as a common problems list.
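The rule for the common problems list (record everything in the equipment history, but list a problem as common only once it recurs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names LogEntry, EquipmentHistory, and min_occurrences, the field layout, and the threshold of two occurrences are hypothetical and are not taken from Figure 1.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    """One row of an equipment history / troubleshooting log (hypothetical fields)."""
    equipment: str
    problem: str
    symptoms: str
    corrective_action: str


@dataclass
class EquipmentHistory:
    """Keeps every entry; derives the common problems list on demand."""
    entries: list = field(default_factory=list)

    def record(self, entry: LogEntry) -> None:
        # Every problem goes into the history, even if it happens only once.
        self.entries.append(entry)

    def common_problems(self, min_occurrences: int = 2) -> dict:
        """Map (equipment, problem) to the most recent corrective action,
        but only for problems seen at least min_occurrences times."""
        counts = Counter((e.equipment, e.problem) for e in self.entries)
        common = {}
        for e in self.entries:
            key = (e.equipment, e.problem)
            if counts[key] >= min_occurrences:
                # Later entries overwrite earlier ones, so the latest fix wins.
                common[key] = e.corrective_action
        return common
```

A problem recorded once stays in the history only; recording the same equipment and problem a second time makes it appear in the result of common_problems(), which keeps the quick-fix list short.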

Figure 1: Troubleshooting Log/Common Problems List

Seven-Step Troubleshooting Philosophy

At this point in our discussion, we are ready to examine a method for effective, logical troubleshooting: the seven-step troubleshooting method. The seven-step troubleshooting method consists of the following seven steps:

1. Symptom recognition
2. Symptom elaboration
3. Listing of probable faulty functions
4. Localizing the faulty function
5. Localizing the trouble to a faulty component
6. Failure analysis
7. Retest requirements

When necessary, each of these steps should be used in the proper order. Approaching the problem in this fashion will ensure that valuable time is not wasted back-tracking to an action or thought process that was skipped initially. Deciding when each is necessary is a very important part of troubleshooting. This is where a strategy is developed into a procedure.

Many of the more modern designs of equipment in use today offer extensive diagnostics programs and tools as an integral part of the equipment. Some have internal troubleshooting programs that allow the equipment to "troubleshoot" itself to a large degree. These programs and tools usually check inputs and outputs against pre-programmed normal parameters. If a discrepancy is noted, that function is flagged as a potential problem. The strategy that the program uses is a simple logical input-output comparison. Some programs are more sophisticated and will actually check functions to a component level, but they usually are only found on very expensive and high-tech equipment.

Systems or equipment that are designed for some form of self-troubleshooting obviously do not require implementation of every one of the seven steps. The equipment itself may perform any one or all of the steps, with the exception of failure analysis and retest requirements. All that is required of the troubleshooter is an understanding of what the equipment diagnostics is indicating and what the quickest and most effective way of clearing the fault is.

When any troubleshooting effort is necessary, writing down or referring to the seven steps will ensure that a conscious decision is made as to what steps apply and what steps do not apply. Next, we will take a look at each of the seven steps individually to see what should be accomplished for each step.
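The input-output comparison that built-in diagnostics perform can be pictured with a short Python sketch. It is a minimal illustration, not the logic of any particular product: the signal names, the ranges in NORMAL_RANGES, and the function flag_discrepancies are all assumptions made up for the example.

```python
# Hypothetical pre-programmed normal parameters: signal name -> (low, high).
NORMAL_RANGES = {
    "supply_voltage_v": (22.0, 26.0),
    "discharge_pressure_psi": (80.0, 120.0),
    "motor_current_a": (4.0, 9.0),
}


def flag_discrepancies(readings, normal_ranges=NORMAL_RANGES):
    """Return the names of functions whose reading is missing or out of range.

    Each flagged name is only a potential problem; the troubleshooter still
    has to interpret what the diagnostic is indicating.
    """
    flagged = []
    for name, (low, high) in normal_ranges.items():
        value = readings.get(name)
        if value is None or not (low <= value <= high):
            flagged.append(name)
    return flagged
```

Calling flag_discrepancies with a set of measured values returns the signals to investigate first. Failure analysis and retest still fall to the troubleshooter, as the text notes.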

Step 1: Symptom Recognition

This is the most fundamental step in troubleshooting. Each and every person that has ever fixed anything has accomplished it. This step asks the question "Does a failure exist?" The first step in identifying a failure is recognizing that a failure exists. This sounds ridiculously simple, and usually it is, but it is also very important.

The symptom recognition step is very straightforward. It requires an entry in the troubleshooting log that states what the indications of a problem are. For example, the indication might be that pump #3 does not start. The following list provides some guidelines for entries made during the symptom recognition step:

• Try to be as specific and defining as possible in stating the problem that is occurring. For example, perhaps the cylinder extension stroke is too slow but the retraction stroke timing is satisfactory.
• Know the equipment; realize when it is showing the symptoms of impending failure.
• Analyze the performance of the equipment to make sure it actually has a failure and is not simply reacting to an external condition.
• Try to determine if the failure is total or if the equipment is operating with degraded performance.
• Determine if the trouble has slowly developed (i.e., drift) or if it is a sudden failure.

Step 2: Symptom Elaboration

The symptom elaboration step is the beginning of "actual" troubleshooting. As its name implies, this step elaborates on the symptom written in step one. Symptom elaboration is where the question "What is the problem?" is asked. The objective of this step is to obtain as much information about the problem as possible. This step provides all of the information necessary to narrow the problem down in a logical fashion. The following points would be considered in the symptom elaboration step.

• Always check to ensure equipment is lined up for normal operation, i.e., On/Off switch, mode selection switch, test switch, etc. For example, a common failure can be as simple as the power not being connected to a power supply. Electric motors and electrical circuits will not operate without electricity! This is very simple troubleshooting, but it can save a lot of time and potential embarrassment.
• Be sure to observe all gauges, meters, and other indicators as to how they are responding due to the problem.
• Note how readings are affected by all modes of operation and switch lineups.
• Perform control manipulation with care since detrimental effects can occur to associated equipment or components within the failed equipment. There may be a possibility of improper pressures, flows, or voltages exceeding maximum design specifications.
• Start the troubleshooting log with as much background information as possible and document each adjustment and its results. Do not leave anything to memory. Remember, troubleshooting can last two hours or two weeks.
• Always note if an adjustment has no effect on the symptom; this will help eliminate possible causes later on.
• Be aware that a large number of equipment faults can produce similar symptoms. During this step, try to differentiate as much as possible between the characteristics of the symptoms.
• Do not go for the answer in one step. Troubleshooting should be a series of small logical steps, each one chosen to show a result leading to discovery of the problem or problems. Be sure to record all troubleshooting actions taken in the log accordingly.

Step 3: Listing of Probable Faulty Functions

This step is intended to narrow down the possible faulty functions based on the information obtained in steps one and two. A functional block diagram of the equipment and the troubleshooting log (steps one and two) are needed for this step. The question asked by this step is "Would failure of this function cause the symptoms I am seeing?" Again, the purpose of this step is to narrow the possibilities down to a list of probable faulty functions. Key points for this step include:

• Always use the functional block diagram to ensure all the possible functions are checked.
• Unusual symptoms of common troubles occur more often than common symptoms of unusual troubles. Always check for additional symptoms of common problems.

• Write down all probable faulty functions, even if it is apparently obvious that some of them are working correctly. Then, write down why each is thought to be functioning correctly.
• Be sure to include functions such as detectors, switches, connectors, cables, wiring, piping, filters, and regulators. Wiring is always a probable cause!
• Always ask: "Would a failure of this function cause these symptoms?"

Step 4: Localizing the Faulty Function

This step requires careful evaluation of each of the probable faulty functions listed in the previous step. The goal is to determine exactly which area of the system is causing or generating the problem. The purpose of this step is not to find the faulty component; it is just to isolate the problem to a circuit or function.

This is the first step that requires taking a measurement. The measurement taken may be a system pressure, temperature, time delay, operating speed, sequence, or any variable parameter that is related to the equipment operation. The first function you choose to check out often will not be the faulty one. Proving a function is operating properly is important to the troubleshooting effort because it narrows down the possibilities of where the problem is located. Past troubleshooting experience and hunches certainly play a part in figuring out which is the faulty function. However, do not ignore hard evidence just because one assumes trouble is known prior to proper troubleshooting steps. This step is not complete until each and every listed possibility is properly checked. More than one of the previously listed probable faulty functions may be contributing to the overall problem. The following key points should be noted:

• Check all pressures, flows, inputs, and outputs associated with the areas of probable faulty functions.
• If an abnormal reading is obtained, the equipment setup used to obtain the reading and the reading itself should be rechecked.
• Do not be discouraged if several hours of troubleshooting reveal that a function is good.
• Do not get locked in on what a technician "knows" the trouble has to be.
• Check the troubleshooting log periodically to ensure that troubleshooting efforts are still working in the right direction and have not lost sight of the original troubleshooting goal.

Step 5: Localizing the Fault to a Component

This step continues isolating the fault once the faulty function or functions have been determined. Schematic diagrams should be used at this point to ensure that no details go unnoticed. A thorough knowledge of the equipment operation, as well as individual component characteristics, is required for successful completion of this step. When localizing the trouble to a faulty component, keep in mind the following points:

• Evaluate each component within the faulty function to determine which components are probable sources of the symptoms noted.
• Careful consideration must be given to how each component could affect overall function of the system under both normal and failed conditions.
• Knowledge of component failure modes and rates is very important.
• A considerable amount of information can be rapidly gained through a careful visual inspection.
• Removal of components from the system and use of a test stand may be helpful or even necessary to ascertain the function of more complex components.

Step 6: Failure Analysis

This step requires the failed component(s) to be repaired or replaced and, most importantly, the cause of the failure corrected. Keep in mind though, the main purpose of troubleshooting is to get the equipment operational. Additional troubleshooting failure analysis can be done after the equipment is running. The following key points should be noted.

• Avoid replacing a component until the exact cause of the problem is found and repaired. Always make a complete check of the associated components of the failed unit.

• Documentation is imperative at this point, both to aid in troubleshooting the problem should it return and to point out recurring design deficiencies.

Step 7: Retest Requirements

Now that the equipment is operational, check all the functions that have been affected by the failure. Although the equipment has been repaired and is now functioning, all operations must be checked and verified. The information obtained in this step can also aid in troubleshooting next time by providing some baseline information. One key point to remember is:

• Fail Safe: do all checks that will ensure the equipment is operating correctly.

The seven-step method and its associated important points are provided as a general guide to assist the maintenance person. Circumstances vary from task to task and may require a slightly different troubleshooting approach. Experience and the basic path outlined here will allow an appropriate approach and solve problems in a more efficient manner.

Troubleshooting With Flowcharts

The experienced troubleshooter usually is an "old hand" at reading a variety of block and schematic diagrams or wiring diagrams, as well as troubleshooting trees. This section will draw a model flowchart depicting an ideal troubleshooting strategy. The model strategy to be depicted represents a composite drawn from research findings and from procedures used by highly competent and experienced troubleshooters of a wide variety of equipment. Assuming familiarity with the equipment in a facility, its purpose, and the equipment used to accomplish that purpose, the troubleshooter can make changes in this model and turn it into a flowchart showing the ideal procedure for troubleshooting equipment. First, we will describe it step-by-step, and then we will review the flowchart depicting each of those steps.

Typical Troubleshooting Process

Because of the variety of items they are expected to maintain, troubleshooters do and use different things. Some use screwdrivers, while others use oscilloscopes, voltmeters, or stethoscopes. Some require access to large amounts of documentation, while others need only a page. Some spend a great deal of time disassembling in order to gain access to test points or adjustments, and others spend none. Whatever the nature of the equipment to be fixed, competent troubleshooters do not differ so much by WHAT they do, but rather by HOW they go about doing it. The strategy (the approach) is much the same for all equipment. Focus is not shifted; only the tactics (the steps for implementing the strategy) differ. Here is what competent troubleshooters do, in the approximate order in which they do it.

Step 1: Talk with the Operator

Operators are the richest potential source of information about what is wrong and where the trouble is. Operators are with the equipment when the trouble occurs, and they generally know what the operators and the equipment were doing when it happened. The operator can provide indications of the problem by describing what happened that was different from normal operation. This information may tell the competent troubleshooter a great deal about what is wrong and where. Competent troubleshooters always talk to the operators when available.

Many times, the operator helps point the troubleshooter in the general direction:

• There was a little puff of smoke right over there.
• I pressed this button and it went clankety-clank.
• I tried to run the program, but it only printed gibberish.
• The picture is all crinkly.

Sometimes though, the operator is not that helpful:

• It won't start no matter what I do.

• The thing's busted again.
• I think I hurt its feelings.
• I said something to it and now it won't work right.

When that happens, the troubleshooter's patience and ability to ask the right questions may result in more helpful information. Attempting to make the operator feel inferior by using highly technical terms or sarcasm in questioning will not increase the level of communication or cooperation and only serves to waste valuable time.

In many cases, the operator can tell the troubleshooter exactly what and where the trouble is.

• This rod is bent.
• This cam has worn down again.
• The connector is loose on this cable.
• This belt broke.

Under these circumstances, the troubleshooter merely verifies the symptoms and clears the trouble; other troubleshooting steps can be avoided. "I do a lot of troubleshooting by telephone," a highly competent troubleshooter of video equipment explained. "When a customer tells me what's wrong, I'll have them operate the system for me and tell me what happens. Lots of times I'm able to tell them what the problem is right on the phone. I don't even have to see the equipment." (Let alone rig test equipment.) No question about it. While sometimes wrong or not too helpful, operators are still the most potent source of information available, and competent troubleshooters head for the operator as a first step.

Step 2: Verify Symptoms

Immediately after their interview with the operator, competent troubleshooters verify symptoms. That is, they determine whether the trouble is real or not to ensure they do not spend time troubleshooting when they should be instructing an operator on how to avoid the trouble in the future. Competent troubleshooters verify symptoms before proceeding with more involved efforts. They know that hearing or seeing a symptom is not automatic proof of a malfunction.

Just because equipment does not work properly does not mean something is wrong with it. Suppose an operator forgot to turn it on or plug it in? Suppose a switch or a valve was left in the wrong position? Human error is notorious for being a source of troubles, and competent troubleshooters know it is inefficient and potentially embarrassing to break out test equipment and sophisticated analytical procedures before verifying the symptoms.

Before television, there was only radio. When a radio malfunctioned, its owner took it to the radio repair shop, plopped it on the counter, and described its symptoms. Often, a customer would complain about a "hissing radio." "It doesn't work anymore," the owner would complain. "It just hisses and makes crackling sounds, but it doesn't get any stations." The minute the customer said "hisses," the troubleshooter would casually turn the radio around, look at the back, and verify the trouble. Sure enough, there was nothing wrong with the radio. The operator had accidentally snapped the AM short-wave switch to the short-wave position, making reception of AM stations impossible. From the troubleshooter's point of view, the problem was then "How do I switch the switch without making the customer feel foolish?" Generally, the solution was to take the radio to the back room and make the "fix" there or to ask the customer to return the following day.

This is but one classic example of an operator-induced problem. There are many, many others, and experienced troubleshooters can describe the amount of time spent troubleshooting failures that were not there. Competent troubleshooters verify symptoms before digging into the equipment itself.

Symptom verification will often provide benefits in addition to actual confirmation of a problem's existence. When the trouble is real, it can show up in more ways than one. By operating the equipment, the troubleshooter will often collect more clues about the trouble's location than were provided by the initially reported symptom. Troubleshooters who can tell the difference between normal and abnormal operation will spot these additional clues.

Step 3: Attempt Quick Fixes

Even before they have located a trouble, competent troubleshooters attempt quick fixes. When something goes wrong, they attempt solutions that are fast to try, even though they may be illogical in terms of the symptoms presented. They check fuses, clean contacts, adjust controls, push circuit boards firmly into their sockets, and bang or kick interlocked doors or cabinets to make sure they are properly seated. They tighten this or reset that, adjust here or align there, dust, vacuum, clean filters, and replace gaskets. Troubleshooters know that these actions will clear the trouble some of the time, and since they are rapidly accomplished,
they are worth doing. If quick fixes work, time and effort have been saved. If they do not work, only a moment has been lost and the information has been gained that certain parts of the equipment, at least, are not the cause of the trouble. Often, troubleshooters engage in these rapid clearing actions while verifying symptoms and looking for other visible signs of malfunction. Auto mechanics, for example, are likely to twist or jiggle spark plug wires while looking around the engine compartment. They know it is worth doing this since rough-running engines are sometimes caused by loose, oily, or wet contacts.

In a way, quick fixes are a form of preventive maintenance. In common usage, preventive maintenance means periodic general servicing of equipment, whether it needs it or not. These actions are carried out because they will either lengthen the life of the equipment or increase the amount of time the equipment is operational; they will minimize down time. Competent troubleshooters know that many troubles are caused by inadequate preventive maintenance or a total absence of such routine care. Thus, some quick-fix actions can be thought of as belated preventive maintenance.

There is another reason for attempting a quick fix or, if you prefer, for attempting solutions without first doing detailed troubleshooting. Equipment troubles do not occur with equal probability; some are much more likely to occur than others. True, everybody knows that some troubles are more common than others. When the car does not turn over, the battery is commonly checked. When the table lamp does not light, the bulb is commonly changed first. It is not always the battery, and it is not always the light bulb, but the probability is high that these are the sources of trouble. Competent troubleshooters know this. Troubleshooters also know which troubles are likely to occur most often and the symptoms associated with those troubles. Regardless of what their information is called, it would be inefficient to pretend that these probabilities do not exist, especially when clearing actions are quick and easy to take. Therefore, competent troubleshooters attempt solutions that are rapid and efficient because they pay off generously either in a trouble cleared or in information gained, regardless of the nature of the trouble.

Efficient troubleshooting, then, requires that troubleshooters be armed with all available trouble-probability information. Unarmed, they are deprived of a potent tool for rapid trouble isolation. It would be very costly to pretend that trouble probabilities do not exist and demand that troubleshooters always follow the same procedure for the sake of uniformity or because the prescribed procedure will eventually lead to the trouble.

Step 4: Review Troubleshooting Aid

When troubleshooters have talked with the operator, verified symptoms, and tried quick fixes, but still have not located the fault, additional information must be collected. The troubleshooting aid is the next most efficient source of information to check out. Why? Because such aids offer some prepackaged information that troubleshooters would have to seek elsewhere if the aid were absent. Of the several types of troubleshooting aids, some are brief and not too helpful, while others are highly sophisticated or even automated.

A sometimes-overlooked troubleshooting aid is the "Caution" information attached to the equipment itself. "Caution: Remove all red tags before operating" is one example. "Caution: High voltage in this cabinet" is another. True, these aids do not help in locating troubles, but they do help to save the equipment and the troubleshooter from early death.

Still another type of aid is the "If/Then" page, which typically describes symptoms on the left and suggested actions on the right. Troubleshooters may find aids like this in the owner's manual that came with the automobile and the instructions accompanying appliances. They indicate what some common troubles are and what to do about them. Similar troubleshooting aids may be provided with more sophisticated equipment, and some maintenance people construct their own.

More and more modern equipment is being designed to provide direct information about troubles. Sensors detect troubles that are then reported by lights, sounds, and other forms of information display. For example, consider the easy-to-interpret "idiot lights" in automobiles that indicate when the oil pressure is too low or when the alternator ceases to provide a suitable charge for the battery. The cockpit of a modern aircraft is loaded with bells, buzzers, and sirens to indicate various malfunctions and even impending malfunctions. In one fighter aircraft, for example, there is a repeating sound that changes frequency and tempo as gravity forces are built up during a turn. The higher the G-forces, the higher and more rapid the sound. The sound monitors a stress condition of the aircraft, telling the pilot of an approaching malfunction, and from listening the pilot knows whether or not a correction is needed. When troubles are reported this way, typical troubleshooting does not begin. In this case, there is no need to talk with an operator, collect additional symptoms, or try quick fixes other than the one suggested by this troubleshooting aid. These aids are a response to the growing complexity of some types of equipment, but reflect what is still a growing technology. Though immensely useful, it is still possible for the telltales themselves to fail, making the troubleshooting task even harder than before. Moreover, the importance of providing troubleshooters with other well-designed aids is still as strong as ever.

Related to these is the troubleshooting tree, a type of flowchart that walks the troubleshooter through a series of actions and decision points, and hopefully, to the trouble. Often called fully procedural troubleshooting aids, these aids are a form of thinking prompt, or a form of prepackaged analysis intended to relieve the troubleshooter of the need to memorize all the steps to follow. Well-constructed aids of this type do indeed improve the speed and accuracy with which faults are located, even by the inexperienced troubleshooter. At least one study comparing the usefulness of procedural troubleshooting aids with more traditionally constructed maintenance manuals showed these aids to be better than the

manuals. More troubles were located, and inexperienced troubleshooters made as few errors as experienced people. This should be expected, as the fully procedural troubleshooting aid is a carefully constructed and tested way of guiding the troubleshooter to the source of the problem. Troubleshooters have to know more about the system in order to make good use of traditional maintenance manuals; they need more specific troubleshooting knowledge to make up for the incomplete or inaccurate information in the manual.

Then there are the sophisticated diagnostic aids used in locating malfunctions in computers and similar equipment. Diagnostics are programs designed to exercise a system, note discrepancies between normal and abnormal operation, and report the nature, and often the exact source, of the trouble either on a video display or printer (for example, "Bad RAM at CY"). They do not require the troubleshooter's assistance at all, except to initiate the diagnostic operation.

When troubleshooting aids exist, experienced troubleshooters use the ones that remind them of the efficient paths to follow for information collection or those that report specific troubles. They do not use aids containing information they have already memorized through practice and experience, and they do not use aids that are poorly designed.

Step 5: Step-by-Step Search

When other sources of information fail to reveal the trouble's source, troubleshooters turn to a step-by-step search through the equipment itself. This is the last resort of competent troubleshooters, as it is the least time-efficient system of information gathering when compared to other information sources.

Several step-by-step search procedures might be used. A random search could be a way to test and replace components, and troubles would eventually be cleared. Unfortunately, this approach is used only by the uninformed, since as many troubles would be located later as would be sooner. A sequential search involves systematic testing, starting from one end of the equipment and working item by item to the other end. Although this procedure will also lead eventually to the trouble, it too is inefficient because troubles at the far end of the equipment take a long time to get to.

The preferred search procedure is one that yields the most information for the least effort, that is, the most information per action, such as per test check or per trial replacement. Ideally, this search procedure is one that successively eliminates half the system as a possible trouble source. Called the split-half or half-split search, the procedure involves successively testing the system at or near its midpoint. When a test shows normal operation, then the portion of the system preceding that point is considered OK and is eliminated from suspicion. By successively eliminating approximately half of the remaining system with each test, the trouble is located more efficiently than with a random or sequential search.

Four points must be made:

1. It is seldom possible to test a system exactly at the midpoint of the next section to be checked. No matter. The object is to test at a point each time that will eliminate a large chunk of the system from suspicion that the trouble may be lurking there.

2. Some systems lend themselves to rapid replacement of large segments containing a large number of components, such as chassis or circuit boards. Such board swapping can quickly isolate the trouble to the replaced unit or eliminate it from suspicion. Even though the swapping might have been done at some distance from a midpoint, the speed with which it is done makes the procedure useful.

3. The split-half search is used only when a troubleshooter must adopt the equal-probability hypothesis, "As far as I know right now, the trouble could be anywhere," and all other information sources have proven inadequate. Competent troubleshooters stop using this search procedure as soon as (a) they develop an idea worth testing, or (b) the trouble is located. Once they know or strongly suspect the trouble's location, they are likely to test or replace the suspected component or assembly, such as a gearbox, transmission, circuit board, or card. If they find the trouble, they fix it. If they do not find it, they resume the split-half search until they can attempt a fix.

4. Finally, those who expect to be skilled in the step-by-step search (called signal tracing for electrical and electronic equipment, flow tracing for hydraulic or pneumatic equipment, and linkage tracing for mechanical equipment) need to have considerable knowledge. In addition to knowing the geography of the system, they need to be able to read diagrams, interpret waveforms, use test equipment, and locate components and test points. It is very expensive to insist that every troubleshooter be as knowledgeable as those who can clear most or all of the troubles ever encountered. This explains why often it is considerably more economical to have two types of troubleshooters: those who can isolate a trouble to a unit, and those who can trace the trouble to the defective component within the unit. The former generally can clear more than 80% of the troubles they encounter after very little training.

This is not to say, however, that the step-by-step search is unimportant; it is only to say that this procedure (oddly referred to as "systematic," "analytical," or "logical" troubleshooting) is used by competent people only after all other information sources fail.

Step 6: Clear the Trouble

Once a trouble is located, someone is expected to eliminate it. Trouble clearing is often done by the troubleshooter, but sometimes it is assigned to someone else. The master auto mechanic, for example, does the diagnosis, but then may assign the actual repair work (trouble clearing) to someone else. The chief engineer at a radio or TV station may be called in to troubleshoot, and then turn the trouble-clearing activity over to the on-duty engineer. Manufacturers' hotshot troubleshooters who travel to clients' locations to solve difficult problems often leave the actual trouble clearing to the local staff. Trouble clearing is different from trouble locating, and locating requires a different set of skills than clearing. This article concentrates only on locating the source of the trouble.

Step 7: Perform Preventive Maintenance

Preventive maintenance (PM) is the process of clearing troubles before they happen, a process that good troubleshooters perform as regularly and carefully as time and policy permit. Performing PM is more than just a ritual or just another company policy. Preventive maintenance saves a great deal of time and money and reduces equipment downtime.

It is appropriate to do PM on some machines even before starting to hunt for the trouble, often because the same trouble occurs regularly in that equipment. PM usually is fast and may clear the trouble. However, for most machines, PM is carried out after the trouble has been cleared. One troubleshooter explained it this way: "Look, when the customer's machine is down and the plant has come to a grinding halt, they don't want to see my troubleshooters oiling and greasing. They want that equipment up and running! The oiling and greasing is done after the equipment is operational."

Step 8: Make Final Checks

Competent troubleshooters always check to make sure the trouble is actually cleared and the system is functioning normally. They know too well how easy it is to cause a new trouble while clearing an old one. They also know how easy it is to leave something like a setscrew loose, or something unplugged or out of adjustment. Therefore, a final check of normal operation is a necessary part of the troubleshooting sequence.

Step 9: Complete Paperwork

Troubleshooters are not immune to the bureaucratic plea to "fill out those forms!" Even though paperwork is not troubleshooting, it is part of the troubleshooter's job. Often, the history of a machine is recorded in an equipment log. Dates of PMs, information about retrofit, and parts that have been changed are recorded at the time of service or repair. Sometimes troubles can be quickly located by simply reading the history in the log. For this reason, the equipment log is a useful source of information, and good troubleshooters take the time to update these logs as well as to refer to them. Referring to and keeping up a log are two paperwork activities that are part of the maintenance job.

Step 10: Inform Area Supervision/Instruct Operators

Once the equipment is returned to service, the user is informed of this fact. Often, operators are instructed in the proper use or care of the equipment or cautioned about peculiarities of the system. Although this activity is not strictly part of the troubleshooting procedure, it is important to the continued proper functioning of the equipment.

The Flowchart Model

The next step is to review a flowchart depiction of the action and decision steps in the strategy just described. A flowchart is a graphical tool used to represent the steps of a process. The flowchart uses standard symbols to represent process steps, decisions, and other events. Figure 2 shows typical standard flowchart symbols.

Figure 2: Typical Flowchart Symbols

A flowchart depicting the typical troubleshooting process just described is shown in Figure 3. This flowchart represents the troubleshooting procedures followed by an individual at the location where the equipment trouble is noted.
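
The split-half search that this process falls back on as a last resort (Step 5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from the source: the names `split_half_search`, `output_ok`, and `count_tests` are hypothetical, the equipment is modeled as a simple chain of stages with the signal flowing from first to last, and exactly one stage is assumed to be faulty so that every test point before it reads normal and every test point from it onward reads abnormal.

```python
from typing import Callable, Sequence, Tuple


def split_half_search(stages: Sequence[str],
                      output_ok: Callable[[int], bool]) -> int:
    """Return the index of the faulty stage in a chain of stages.

    Assumptions (hypothetical, for illustration only):
    - the signal flows from stages[0] to stages[-1];
    - exactly one stage is faulty, so outputs before it test normal
      and outputs from it onward test abnormal;
    - output_ok(i) performs one test at the output of stages[i].
    """
    if not stages:
        raise ValueError("no stages to search")
    low, high = 0, len(stages) - 1  # fault lies somewhere in stages[low..high]
    while low < high:
        mid = (low + high) // 2     # test at or near the midpoint
        if output_ok(mid):
            low = mid + 1           # everything up to mid is cleared
        else:
            high = mid              # the fault is at mid or earlier
    return low


def count_tests(stages: Sequence[str], faulty_index: int) -> Tuple[int, int]:
    """Run the search against a simulated fault and count the tests made."""
    tests = 0

    def output_ok(i: int) -> bool:
        nonlocal tests
        tests += 1
        return i < faulty_index     # simulated reading at test point i

    found = split_half_search(stages, output_ok)
    return found, tests


if __name__ == "__main__":
    chain = ["power supply", "oscillator", "mixer", "IF amplifier",
             "detector", "audio amplifier", "speaker"]
    found, tests = count_tests(chain, faulty_index=4)
    print(f"fault isolated to {chain[found]!r} after {tests} tests")
```

With the seven-stage chain above, the sketch never needs more than three tests, whereas a sequential search from one end can need up to six. In practice, as the text notes, test points rarely sit exactly at the midpoint, and a competent troubleshooter abandons the search as soon as a better idea is worth testing.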

Figure 3: Flowchart Model

Troubleshooters usually receive a report of trouble in the form of a symptom:

• It's jammed again.
• It won't start.
• I can't get it to complete the cycle.

After locating the correct machine (and good troubleshooters always make sure they have the right machine), they try to interview the operator. Unless the machine is jammed or otherwise inoperable, they operate the machine and verify the symptoms collected from the operator. If the problem is operator-induced, they clear it and then instruct the operator in ways to prevent the problem from occurring again. If the problem is real, they try quick fixes (check interlocks, plugs, and cables, replace units). If any of these work, preventive maintenance may be called for and carried out. Then, final checks are made, the area is cleaned and checked, documentation (paperwork) is completed, and the area supervisor is informed.

If quick fixes do not solve the problem, troubleshooters follow troubleshooting aids if they are available. If aids are not available, a half-split search procedure is used as a last resort. When troubleshooters develop a good idea about where or what the trouble is, they test their hypothesis by attempting a solution. If a solution does not work, the search is continued. If the solution does work,

troubleshooters complete any preventive maintenance that is indicated and then follow the end steps already described (final check, area check, documentation, and communication).

Five Action Steps for Systematic Troubleshooting

The seven-step troubleshooting method described previously assumes that little may be known of the process or system with a problem. Many times that is the case. The technician, electrician, or mechanic must systematically try to resolve the problem by using his or her skills and intuition. There are often cases, however, where a familiar piece of equipment or system breaks down. In those cases, an abbreviated five-step process can be used to find the fault and get the system up and running. It is important to note that, although this is a five-step approach, the same basic guidelines of the seven-step troubleshooting method are followed. The steps are simply combined to be specific to the problem at hand.

Five-Step Troubleshooting Process

The five-step troubleshooting process consists of the following:

1. Verify that a problem actually exists.
2. Isolate the cause of the problem.
3. Correct the cause of the problem.
4. Verify that the problem has been corrected.
5. Follow up to prevent future problems.

Each of these steps is described next using the flowchart approach.

Step 1: Verify That a Problem Actually Exists

The troubleshooting process begins with symptom recognition. To troubleshoot, there must first be a problem. In this alternate approach to troubleshooting, the troubleshooter must first verify that there actually is a problem. This is actually a combination of the first two steps of the seven-step method, symptom recognition and symptom elaboration.

To verify that there actually is a problem, the troubleshooter must use all available means of information. This includes the equipment operator, equipment indications and controls, and technical documentation about the equipment or system.

Contacting the equipment operator should be the first action taken. The operator usually can supply many of the details concerning the failure incident. To get the most information, the troubleshooter should ask probing questions. Some examples are:

• What are the operator's indications of the trouble?
• How did the operator discover the trouble?
• What were the conditions at the time the trouble first occurred?
• Is the trouble constant or intermittent?

Next, the troubleshooter should observe the equipment or system to get a first-hand impression of the trouble. During this observation, the troubleshooter should note all abnormal symptoms.

To evaluate the equipment thoroughly and elaborate on the symptoms observed, the troubleshooter will probably need to examine the equipment documentation. This includes prints, operating characteristics, and procedures. Since the equipment operator is probably most knowledgeable about the equipment, it is important to discuss the documentation with the operator. This will help to determine if any changes exist. Some examples of useful graphic documentation are:

• Panel graphics
• Loop diagrams
• Piping and instrumentation diagrams
• Block diagrams
• Wiring diagrams
• Schematic diagrams

Each of these examples is described briefly next.

Panel Graphic

A panel graphic is a graphic representation of the system that is mounted on an equipment or system control panel. Although the panel is intended to provide the operator with a big picture of the operations, it can be useful to the troubleshooter during this step. Figure 4 is an example of a panel graphic.

Figure 4: Panel Graphic

The above example does not provide extensive information to the troubleshooter. However, it can be used to identify sources of power, valve line-ups, or instrumentation connections.

Loop Diagram

A loop diagram is used to provide detailed mechanical information about a process. This diagram does not give significant electrical or instrumentation information. Figure 5 is an example of a loop diagram.

Figure 5: Loop Diagram

A more useful diagram for electricians and technicians is the piping and instrumentation diagram, described next.

Piping and Instrumentation Diagram

A piping and instrumentation diagram (P&ID) shows the functional layout of a fluid system and its piping and instrumentation as clearly and accurately as possible. Rather than try to pictorially include all the valves, piping, instruments, and equipment in a fluid system, a P&ID uses standardized symbols to represent these items. A P&ID shows the relationship between mechanical, electrical, and control components of the system. A piping and instrumentation diagram depicts all components of a particular system, including pipe sizes, flange sizes, valve sizes, flow direction, and references to other related diagrams. Another name commonly used for P&IDs is bubble diagram, due to the use of a circle for locators and symbols.

A P&ID, however, does not attempt to represent the actual physical layout of equipment; i.e., a valve that may appear to be right at the discharge of a pump can physically be located quite some distance from the pump and on another elevation (floor). It is accurate to the extent that all components are connected to each other as shown in relation to flow path orientation. Many times, a P&ID will use a broken line encircling a group of equipment to indicate that it is all in one building. A section of a simple P&ID is shown in Figure 6.

Figure 6: Simple P&ID Section

The P&ID is useful when troubleshooting entire systems or processes to find a faulty component. It does not give any details on the electrical or control circuitry. For circuit troubleshooting, the block diagram, wiring diagram, and schematic diagram may be more useful.

Block Diagram

Block diagrams are the simplest of all electrical diagrams. A block diagram illustrates the major components and electric or mechanical interrelations in block (square, rectangular, or other geometric figure) form. The lines between the blocks represent the connections between the systems or components. Each line may represent one wire or several wires. The purpose of a block diagram is to introduce the system as a whole, convey the general operation and arrangements of the major components, and show the normal order of progression of a signal or current flow. Figure 7 is an example of a block diagram.

Figure 7: Block Diagram

Block diagrams are used to show the parts included in the system and the electrical order the parts are in. Knowing this, the system can be analyzed to determine where a fault might lie. Block diagrams are useful but have some disadvantages. They do not show the accurate physical location of the components in the system. Also, a single line represents all electrical connections. There usually is no indication whether the single line represents a cable or several cables.

Schematic Diagram

Schematic diagrams (often just called schematics) are drawings that show all the components in their proper electrical positions, but not necessarily in their proper physical locations. There typically are standard electrical diagram symbols and device function numbers on these diagrams. Schematic diagrams usually are designed to be read from left to right and from top to bottom. The positions of the contacts and switches usually are shown as they would be in the relaxed or de-energized state. Schematic diagrams are very useful to the technician troubleshooting an electrical or electronic circuit. A schematic diagram is shown in Figure 8.

Figure 8: Schematic Diagram

Once a possible faulty component has been identified using a schematic diagram, in-circuit tests must be performed to verify the suspected failure. To perform the tests, the technician must know where to connect test equipment in the circuit. A wiring diagram is used to supply this information.

Wiring Diagram

Wiring diagrams are mostly used when troubleshooting systems. They may be used in conjunction with schematic diagrams for component and wiring locations. Wiring diagrams show the relative position of various components of the equipment and how each conductor is connected in the circuit. These diagrams are classified in two ways:

• Internal diagrams, which show the wiring inside a device.
• External diagrams, which show the wiring from the components to the rest of the system.

A wiring diagram is structured such that it represents all the wires that were presented in the schematic diagram in their actual locations. It shows all electrical connections in an enclosure. Each wire is labeled to indicate where each end of the wire is terminated, such as a terminal board location.

Using the documents described so far, a technician can accomplish a great deal toward finding the cause of the problem. During this step, the technician identifies possible faults that could result in the problem. These faults should be listed so that they can be checked and eliminated if possible. In the next step, we will discuss isolating the real cause of the problem. The flowchart in Figure 9 shows a block-by-block representation of this step.

Figure 9: Step One: Verify That a Problem Actually Exists

Step 2: Isolate the Cause of the Problem

The second step of the five-step troubleshooting process relies heavily on the troubleshooter's technical skills and intuition. During this step, the troubleshooter is actively involved in isolating the cause of the problem. This involves physical activity, such as reading instrumentation, adjusting parameters, connecting test equipment, and possible disassembly. It also involves mental activity, such as logic, evaluation, and reasoning. The specialized knowledge of the troubleshooter plays a key part in this step.

Using the appropriate documents and test equipment, the troubleshooter continues to eliminate possible causes. As each check is completed, the trouble becomes more isolated. Using techniques previously discussed, such as half-splitting and signal tracing, helps to narrow the problem down quickly.

To safely and effectively isolate the cause of the problem, keep the following in mind:

• Begin investigating the easiest items to check. Eliminate all convenient possibilities first to save time.
• Recognize the obvious, but do not ignore what may be concealed.
• Use appropriate safety precautions and equipment when troubleshooting in the field.
• When taking instruments in a piping system off-line and when placing them on-line, follow appropriate operating procedures.
• Be aware of special operating modes (self-tests, built-in diagnostics, etc.) that may aid in the troubleshooting process.

A flowchart illustrating the process used to isolate the cause of the problem is shown in Figure 10. Once the problem has been isolated to a specific component, it can be repaired. Correcting the problem is discussed next.

Figure 10: Step Two: Isolate the Cause of the Problem

Step 3: Correct the Cause of the Problem

The third step of the five-step troubleshooting process is correcting the cause of the problem. This step involves performance of the repair or other activity that eliminates the problem. This can be as simple as turning a switch or adjusting a valve, or it could be as complex as re-winding a motor or overhauling a pump. To correct the cause of the problem, the troubleshooter performs both failure analysis and a retest of the equipment. This is shown in the flowchart pictured in Figure 11.

Figure 11: Step Three: Correct the Cause of the Problem

Step 4: Verify That the Problem Has Been Corrected

Once the corrective action is taken, the troubleshooter should verify that the trouble has been corrected. This usually involves rechecking the same indications that proved there was a problem. This time though, the checks should prove that a problem does not exist. During this verification, the following should be observed:

• Check all indications that relate to the repaired area.
• Check for abnormal operation of all inputs and outputs to the repaired equipment.
• Perform a valve/switch line-up check to validate the integrity of the system.
• Using approved procedures, establish normal operating conditions and check equipment performance.

This step should be thorough. If there are both an abbreviated procedure and an expanded procedure for checking the equipment, use the expanded procedure. This helps ensure that the problem no longer exists and did not mask another problem. By thoroughly verifying the proper operation of the repaired equipment, the troubleshooter can be relatively sure the problem has been resolved correctly. To help ensure the problem does not reoccur, the next step in the process is performed.

Step 5: Follow Up to Prevent Future Problems

The final step in the five-step troubleshooting process is to follow up to prevent future problems. This step involves taking preventive measures and recommending actions that could help keep the equipment from failing. This may include the following:

• Change the preventive maintenance schedule to help prevent failures.
• Recommend a different supplier if a replacement component is unsatisfactory.
• Recommend procedure modifications that may prevent future failures.
• Conduct operator/maintenance training to raise awareness of the potential for problems.
• Complete proper documentation and troubleshooting log entries to aid in future troubleshooting of similar problems.

Although the system retest and preventive measures taken may not seem as vital as fixing the problem and getting the equipment back on-line, these steps are vital to long-term productive performance. The flowchart shown in Figure 12 depicts the actions taken in these steps.

Figure 12: Step Four: Verify That the Problem Has Been Corrected; Step Five: Follow Up to Prevent Future Problems

Deriving Logical Troubleshooting Flowcharts and Strategies

In observing competent troubleshooters in action, an observer will not always see the use of the most efficient, ideal procedure. Sometimes less-than-ideal tactics are used as a means of dealing with various constraints. For example, telephone maintenance people are often faced with a trouble referred to as CCIO ("Can't Call In or Out"). Once they have ruled out trouble in the central office as the cause, they are supposed to check the telephone instrument itself to verify

that the trouble exists as reported. Then, they are supposed to check their way from the instrument toward the telephone exchange until they pinpoint the trouble. However, they do not always troubleshoot in this manner. At one company, they always examine the first checkpoint they come to as they are driving toward the customer's telephone, regardless of where that point is in the logical chain of test points. Why? Because the cost of operating repair trucks is so high that company policy has been set to follow a more efficient vehicle-use procedure. Policy says that it is more important to minimize "windshield time" than it is to maximize troubleshooting efficiency. Gas is saved, but the procedure takes longer. Although the deviation results in a troubleshooting strategy that is somewhat less than ideal, it fits a larger plan.

In observing troubleshooters at another plant, a troubleshooter would note that other troubleshooters generally fail to verify their diagnosis with test equipment. Instead, the troubleshooter might keep trying different solutions until success in clearing the trouble is seemingly achieved, appearing as if changing parts at random is the solution. Why? Not because the troubleshooter is not aware of more efficient troubleshooting procedures, but because the test equipment is awkwardly located some distance away. It is easier to test the guesses by changing parts than to take the time and effort needed to verify a diagnosis. Why is the test equipment kept in the tool crib instead of at a location closer to those who need it? It has always been done that way!

At a third plant, troubleshooters follow a similar procedure, not because the test equipment is relatively inaccessible, but because the schematic diagrams are classified and are kept locked up. It is easier, though less efficient, to try a string of solutions than to bother signing the schematics in and out.

The variations just described illustrate two types of reasons for deviating from the ideal troubleshooting strategy:

1. There is sound reason for deviation. There is a good reason for the deviation, and it is not inefficient in the context of the larger plan.
2. There is no sound reason for deviation. The less-than-ideal strategy is thought to be easier because it has always been done that way. Little or no thought has been given to how it should be done, and how certain constraints could be better dealt with.

If a troubleshooting strategy deviates from the models shown on the previous pages, the deviation should be for good reasons rather than because it has always been done that way.

Deriving Your Own Troubleshooting Strategy

Now that two types of ideal troubleshooting strategies have been identified, it is time to develop a specific troubleshooting procedure that will fit specific equipment and its related situation. It is not very helpful to do this in the abstract. The troubleshooter will translate an ideal strategy into specific tactics appropriate to troubleshooting the equipment and create a troubleshooting tool.

Here is What To Do

1. On a piece of paper, write the name of the equipment.
2. Under the name of the equipment, write down the level expected to isolate troubles to:
• block (chassis, unit, card)
• faulty component
3. Indicate whether or not other experienced people are available to call on for help. Write down the appropriate phone numbers to call, people to talk to, names of references to use, names of tools or test equipment to use, and names and/or numbers of required documentation forms.
4. Select the model strategy that best matches the procedure the troubleshooter is comfortable with using on the floor and flowcharting.
5. Follow the model to build a flowchart. Make the flowchart specific to the equipment and circumstances.
6. When a draft is complete, test it by answering these questions:
• Is the equipment properly identified?
• Does the flowchart follow the model in each of the key steps?
• Is the flowchart consistent with the information recorded at the top of the page?
• Are the specific items named in the flowchart?

• If the strategy deviates from the model, can the troubleshooter justify that deviation with a sound reason, such as company policy or other legitimate constraints?

When the flowchart meets the test criteria, the troubleshooter has derived the ideal troubleshooting strategy for the equipment.

Types of Failures

Most problems a troubleshooter faces are relatively simple to analyze and repair. The equipment, or an associated component, fails and the failure is obvious. There are times, however, when the problem is not so apparent. In fact, some problems only occur sometimes. When a failure is sporadic, or it is not always present, it is called an intermittent failure. It is also the most difficult type of problem to troubleshoot.

Contrary to common belief, most equipment does not have a mind of its own; an intermittent problem usually occurs only under certain circumstances. Three basic types of intermittent problems will be described. Most intermittent problems fall into one of the following categories:

1. Thermally induced failure
2. Mechanically induced failure
3. Erratic failure

Although other classifications could be used, two of the most likely things to change in a system during operation are temperature and mechanical functions. For this reason, the first two categories exist. The third category, erratic failure, is a catchall for other intermittent problems.

Steps for Troubleshooting Intermittent Failures

An intermittent failure can create much aggravation and frustration for the troubleshooter. It also can create havoc within a process or system operation. Diagnosing the fault, as difficult as it is, can be accomplished using these general guidelines:

1. Attempt to recreate the problem.
2. Monitor the operation if the problem does not re-occur.
3. Isolate the fault once the problem re-occurs.

A brief description of each of these guidelines follows.

Attempt to Recreate the Problem

If the problem is no longer apparent and operator error has been ruled out, the system or equipment must be examined to find the fault. One of the first things a troubleshooter should try to do is recreate the problem. Using information obtained from the operator and from any equipment history or logs, make an attempt to establish operating conditions that are similar to those that existed at the time of failure. This may require placing the equipment in a state that is contrary to other equipment operation. For this reason, troubleshooting of an intermittent failure is performed off-line, usually in a maintenance shop.

Thermally Induced Failure

The thermally induced failure is a problem that only becomes apparent when equipment is warmed up. This problem may only occur on very hot days or when air conditioning is not operating. It may also occur each time the equipment has operated for an extended period of time at normal operating temperatures.

To isolate the thermally induced failure, the equipment must first be cooled down. After the equipment is cool, it can be re-energized and allowed to warm up to normal operating temperature. Once the equipment has operated for some time, the thermally induced failure should re-appear. Once the problem re-appears, it can be verified by cycling the equipment through a cool-to-warm state several times.
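The cool-to-warm verification can be pictured as a simple repeat-and-check loop. The sketch below is only an illustration: the function name, the callable `run_cool_to_warm_cycle`, and the default of three cycles are assumptions made for this example (the text says only "several times").

```python
from typing import Callable


def confirm_thermal_fault(run_cool_to_warm_cycle: Callable[[], bool],
                          cycles: int = 3) -> bool:
    """Return True if the fault reappears on every cool-to-warm cycle.

    run_cool_to_warm_cycle is a hypothetical callable supplied by the
    troubleshooter: it stands for cooling the equipment, re-energizing it,
    letting it reach normal operating temperature, and reporting whether
    the fault appeared (True) or not (False).
    """
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    # Run every cycle, then require the fault to have shown up each time.
    results = [run_cool_to_warm_cycle() for _ in range(cycles)]
    return all(results)


# Recorded observations stand in for real test runs in this example.
observations = iter([True, True, True])
print(confirm_thermal_fault(lambda: next(observations)))  # True
```

If the fault shows up on only some of the cycles, the function returns False, which suggests the problem may belong to one of the other categories.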

To help isolate the thermally induced failure, the equipment may need to be cooled down in sections as it operates. This can be done using a directional forced air source or a special product developed for this specific purpose.

Mechanically Induced Failure

A mechanically induced failure is relatively easy to recognize. This type of failure occurs when the equipment or circuit experiences a vibration, mechanical shock, or motion. By repeatedly tapping on the troubled area, the fault condition should appear and reappear. Tapping or applying pressure to different areas of the equipment can isolate the faulty component. Once this has been accomplished, normal troubleshooting techniques can be used to find the cause and repair it.

Erratic Failure

The most difficult trouble to diagnose is the erratic failure. An erratic failure is a failure that is virtually impossible to predict. It occurs randomly and under different operating conditions. There is no apparent trend to the failure. Many times, these types of failures are related to voltage transients or irregularities. Static discharge voltages and damage associated with static discharge can lead to erratic failures. Digital equipment, such as computers and peripherals, are good examples of devices that are subject to these problems. Most of the time, this type of problem cannot be recreated. Other means can be used to help diagnose the erratic failure.

Finding a solution to an erratic failure is not easy. It usually requires substitution of components on a sub-system basis. For example, assume that a computer system has erratic failures resulting in the system "locking up" at various times. The system locks up in various modes, programs, and operating conditions. The system may even pass a computer-based diagnostic software program. In a case like this, each system component can be replaced individually with a known "good" component. The system can then be run for an extended period of time after each substitution is made. If the fault does not reappear, the component that was replaced can be assumed to be bad. Although this is not very practical, it may be the only way to isolate the fault.

Alternatives to Recreating Failures

In the case of each type of intermittent failure, it is important to recreate the fault condition so that the fault can reoccur. The very name intermittent failure guarantees that the problem is not always going to occur. In the case of erratic failures especially, it may be virtually impossible to see the fault reoccur. If this is the case, alternate monitoring methods can be used to track the equipment's operation over an extended period of time. Some of these methods include:

• Memory oscilloscope
• Chart recorder
• Noise monitor

This is just a partial list of devices available for long-range monitoring of the suspect system.

Once the monitoring has been performed, the results must be analyzed. While using a monitoring device provides useful information concerning the symptoms of the problem, it does not identify the cause of the problem. Each aspect of the factors that may contribute to the failure must be assessed to determine the real cause of the problem. Finding the solution to an intermittent failure can be as rewarding as solving a "whodunit" mystery.

Identifying All Possible Causes of Trouble

Simply fixing a trouble does not necessarily solve a problem. Many times, a repair results in only temporary restoration of system performance. This is because the emphasis is often on getting the equipment up and running, not on fixing the problem. Consider the following example:

Maintenance mechanics were called upon to repair a circulation pump that had failed during normal operation. Upon investigation of the failure, the mechanic determined that fuses had blown in the pump controller. An electrician replaced the fuses and re-energized the pump and controller. Together, the electrician and the mechanic observed the restart of the pump. Upon hearing abnormal grinding noise and observing a noticeable deficiency in pump discharge flow rate, they de-energized the pump. The mechanic inspected the pump and found severely worn and damaged bearings. The bearings were replaced, and the pump was placed into operation. The pump discharge flow rate was normal, and no abnormal sounds were heard during the trial run of the pump, so it was returned to service. Although the normal life expectancy of shaft bearings on the pump was in excess of 3,500 hours of continuous use, the bearings once again failed after only 560 hours of continuous use.
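To put the shortfall in the pump example into perspective, the two figures quoted (560 hours achieved against a life expectancy in excess of 3,500 hours) can be compared directly. The short sketch below does only that arithmetic; the function name is an illustrative choice, and 3,500 hours is treated as a lower bound on expected life, so the true fraction is at most the value shown.

```python
EXPECTED_LIFE_HOURS = 3500  # "in excess of 3,500 hours" in the example
ACTUAL_LIFE_HOURS = 560     # hours before the replacement bearings failed


def fraction_of_expected_life(actual_hours: float,
                              expected_hours: float) -> float:
    """Return the achieved service life as a fraction of the expected life."""
    if expected_hours <= 0:
        raise ValueError("expected_hours must be positive")
    return actual_hours / expected_hours


fraction = fraction_of_expected_life(ACTUAL_LIFE_HOURS, EXPECTED_LIFE_HOURS)
print(f"{fraction:.0%} of expected bearing life")  # 16% of expected bearing life
```

A replacement part that lasts roughly one sixth of its expected life is a strong hint that the repair addressed a symptom and not the cause.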

The bearing replacement was performed again, with similar results. After the third bearing failure, a thorough examination of the pump revealed a severely misaligned impeller shaft. By the time this problem was detected, the shaft was badly scored and had suffered heat damage.

The above example may seem extreme, but there have probably been worse instances. If the time had been taken to diagnose the root cause of the initial problem, costly downtime and replacement bearing costs could have been avoided. At the very least, the shaft probably could have been saved.

To resolve a problem in the quickest manner possible and help prevent its re-occurrence, the causes of the problem must be identified. By taking the time and using a problem-solving tool or technique, the root cause of a trouble can be determined.

Cause and Effect Diagrams

Determining the root cause of a problem involves considering the possible causes of the effect. In this case, the trouble's symptom is the effect and the system components and operating conditions are potential causes. This involves gathering data, considering all factors (causes) that contribute to the trouble (effect), and using a process of elimination to determine the root cause.

A cause and effect diagram is used to consider the possible causes associated with a particular problem. Although the cause and effect diagram is considered a performance-improvement tool, it is worthwhile to consider it as a troubleshooting aid. Using the cause and effect diagram technique can help prevent "hunt-and-peck" troubleshooting and reduce the aggravation associated with undisciplined problem solving. The cause and effect diagram is developed as necessary to help isolate the primary or root cause of the problem.

The cause and effect diagram is used to graphically show the relationship of each of these causes to the problem. The graphical method used to display this relationship is a fishbone diagram. The term "fishbone" refers to the appearance of the diagram once it has been drawn. Its shape resembles that of a fishbone. An example of a fishbone diagram is shown in Figure 13.

Figure 13: Fishbone Diagram

There are many techniques that can be used to determine the possible causes of a problem. One of the best techniques is brainstorming.

Brainstorming is a group-oriented problem-solving technique. It involves systematically listing all possible causes for a problem. To effectively brainstorm, a group of people who are willing to work together is required. The more diverse the equipment, system, or process, the more useful the brainstorming technique becomes. This is especially useful when troubleshooting a process with many components. It can also be useful when dealing with a piece of equipment that has mechanical, electrical, and control devices.

The listing is done by giving each member of the group a turn to suggest a possible cause. Each member suggests one cause at a time until there are no more suggestions. Each suggestion is considered as plausible and is written down for consideration. All group members then, prior to any discussion, review the compiled list. Once the group has reviewed the list of suggested causes, they are discussed. During the discussion, suggested causes that are similar should be grouped together. For instance, "faulty circuit breaker" and "loose wiring" may be grouped under the heading "electrical" or, even more generically, "materials." Some general group headings that are useful in any troubleshooting analysis are:

• Materials
• Operator
• Methods
• Equipment
• Environment
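The brainstorming procedure (one suggestion per member per turn, then grouping similar causes under a heading) can be sketched in a few lines. This is a minimal illustration only: the function names, the member names, the "worn bearings" suggestion, and the "Mechanical" heading are assumptions made for the example; the "electrical" grouping follows the text.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List


def round_robin(suggestions_by_member: Dict[str, List[str]]) -> List[str]:
    """Collect suggestions one per member per turn until none remain."""
    compiled = []
    for turn in zip_longest(*suggestions_by_member.values()):
        # Members who have run out of ideas contribute None for that turn.
        compiled.extend(s for s in turn if s is not None)
    return compiled


def group_causes(causes: List[str],
                 heading_for: Dict[str, str]) -> Dict[str, List[str]]:
    """Group similar causes under the heading agreed on in discussion."""
    grouped = defaultdict(list)
    for cause in causes:
        grouped[heading_for[cause]].append(cause)
    return dict(grouped)


# Hypothetical session with two group members.
suggestions = {
    "member_a": ["faulty circuit breaker", "worn bearings"],
    "member_b": ["loose wiring"],
}
headings = {
    "faulty circuit breaker": "Electrical",
    "loose wiring": "Electrical",
    "worn bearings": "Mechanical",
}
compiled_list = round_robin(suggestions)
print(group_causes(compiled_list, headings))
```

The same grouping function works unchanged if the group prefers to file causes under the five general headings listed above.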

Each of these areas can be broken down into smaller subgroups as required. Once the major headings are determined, the initial fishbone diagram should look like the one shown in Figure 14.

Figure 14: Major Group Headings on a Fishbone Diagram

Next, the subgroups can be determined, if necessary, and individual causes added. An example of subgroups could be "mechanical equipment" and "electrical equipment" under the "equipment" group. As the subgroups and individual tasks are added, the cause and effect diagram takes on its fishbone appearance. This is shown in Figure 15.

Figure 15: Expanded Cause and Effect Diagram

Ideally, the brainstorming approach to constructing the cause and effect diagram is best. However, as a technician or mechanic, it may be difficult to get a group together to come up with the possible causes. In any event, the concept of cause analysis using the cause and effect diagram is useful, even on an individual basis. The process of constructing a cause and effect diagram in itself promotes a broader examination to find the actual root cause of a trouble.

Constructing a Cause and Effect Diagram

The basic steps to constructing the cause and effect diagram, whether in a group or individually, are described next.

Step 1: Identify the Trouble or Problem

This step should be easy. The problem started with a broken or damaged piece of equipment, a malfunctioning circuit, a disturbed process, or some similar specific occurrence. The problem is given a name, which is written on the right-most side of a piece of paper. Placement of the problem on a piece of paper is illustrated in Figure 16.

Figure 16: Placing the Problem on the Cause and Effect Diagram

Step 2: Draw a Main Line Pointing to the Problem

To illustrate a direct path to the problem, a single straight line is drawn from the left side of the piece of paper to the designated problem. The line ends in an arrow pointing to the problem. The problem and the line pointing to it are shown in Figure 17.

Figure 17: Line Pointing to the Problem

Step 3: Identify the Possible Major Causes of the Problem

Identify possible major causes of the problem and designate them on the drawing. This is accomplished by listing each of the potential major causes inside a box around the centerline (the line that points to the problem). Once the possible major causes have been listed, draw connecting lines between the boxes and the centerline, as shown in Figure 18. These lines are drawn with an arrow on the end pointing to the centerline.

Figure 18: Designating Major Causes

Step 4: Identify Each Possible Minor Cause Associated With the Major Causes

Minor causes are things that potentially contribute to the major cause. Evaluate each major cause designated in step three and identify everything that can contribute to it. These items are listed individually on lines that point into the major cause line. The minor causes on a cause and effect diagram are shown in Figure 19.

Figure 19: Designating Minor Causes on a Cause and Effect Diagram

Step 5: Identify Each Contributing Factor to the Minor Causes

Each minor cause on the cause and effect diagram has factors that contribute to it. Once the minor causes have been designated, the factors that contribute to these causes must be identified. These factors normally are very specific. Depending on the type of trouble initially identified in step one, the factors could be major process components, such as a tank level detector; a specific control device, such as a valve actuator solenoid; or a discrete component, such as a 220-ohm, -watt resistor. These contributing factors are designated on the cause and effect diagram by writing them individually on a line that points into the minor cause it is associated with. The resulting diagram is shown in Figure 20.

Figure 20: Designating Factors That Contribute to Minor Causes

Step 6: Review the Cause and Effect Diagram

Once the cause and effect diagram is complete, it should be reviewed carefully to ensure that all possible causes and contributing factors have been identified. Once the diagram has been constructed and reviewed, the troubleshooter can begin to systematically check the potential causes to isolate the root cause of the problem. By using the cause and effect diagram to sort out and relate all of the potential causes of a problem, the troubleshooter can get to the root of the problem and avoid mistaking a symptom for a problem.
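The six construction steps map naturally onto a small nested data structure: a problem, its major causes, their minor causes, and the contributing factors beneath them. The sketch below is one possible way to record such a diagram and turn it into a checklist for the systematic check described in step six. The class names, the `checklist` function, and the sample diagram contents are assumptions made for illustration, loosely borrowing from the pump example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Cause:
    """A major cause, minor cause, or contributing factor on the diagram."""
    name: str
    contributors: List["Cause"] = field(default_factory=list)


@dataclass
class Fishbone:
    """Steps 1-5: a named problem plus the causes that point toward it."""
    problem: str
    major_causes: List[Cause] = field(default_factory=list)


def checklist(diagram: Fishbone) -> Iterator[str]:
    """Step 6 aid: yield each path from a major cause to its most specific item."""
    def walk(cause: Cause, path: List[str]) -> Iterator[str]:
        path = path + [cause.name]
        if not cause.contributors:
            yield " > ".join(path)
        for child in cause.contributors:
            yield from walk(child, path)

    for major in diagram.major_causes:
        yield from walk(major, [])


# Hypothetical diagram for illustration only.
diagram = Fishbone(
    problem="Pump bearing failure",
    major_causes=[
        Cause("Equipment", [
            Cause("Mechanical equipment", [Cause("Misaligned impeller shaft")]),
            Cause("Electrical equipment", [Cause("Blown controller fuses")]),
        ]),
        Cause("Methods"),
    ],
)
for item in checklist(diagram):
    print(item)
```

A major cause printed with nothing after it, such as "Methods" here, flags a branch that has not yet been broken down, which is worth catching during the step six review.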