
[Flow chart: Define Problem -> Does Problem -> Get Information -> Analyse Faults -> Problem Isolated? -> Make Repairs -> Final Test OK? (with a NO branch) -> Done!]

Common Troubleshooting Myths

; “You need to be an expert on the machine or system you’re troubleshooting”
This is a very destructive myth because it’s expensive to wait for an “expert” to be
available, or to spend money hiring “experts”. If you know enough about
the machine to know what tests to conduct, you can use a troubleshooting process
to narrow the problem down to root cause. Often just having the system
documentation or service manual gives you enough expertise.

; “Troubleshooting is machine dependent”
Systems and machines can vary, but the troubleshooting process is common to all!

; “Great troubleshooters are born, not taught”
BS! Troubleshooting is a set of procedures, priorities, mental tools and attitudes
that anyone can learn.

; “Either you can troubleshoot or you can’t”
Wrong! Just like any ability, there is a range, but anyone can improve their skills.
Substitute the words “drive a car” for “troubleshoot” and see how silly it sounds.

; “I can troubleshoot - I do it every day”
Yes, but how well? See previous myth.

; “Troubleshooting is for technical people”
Nothing is farther from the truth. Substitute the words “problem solving” for
“troubleshooting” and see how silly that sounds.

; “Troubleshooting isn’t as important as other skills”
Tell that to your boss when a critical machine is down and all work stops.

(What can experts do that I can’t?)

Generalizations About the Nature of Expertise

1. Expertise is acquired in stages.
• the level of competence gained is only as great as is necessary to
carry out desired activities or to solve desired problems
• growth of competence lessens as experts settle into their working
2. Expertise is very subject-specific.
• a good mechanic is not necessarily a good electrician
• expertise does not necessarily transfer to other domains
3. Experts develop the ability to perceive large, meaningful patterns.
• takes on the character of “intuition”
• does not reflect superior perceptual abilities; rather reflects a better
organization of their knowledge
4. Experts are fast and accurate. There are two ways to explain their speed:
• as a result of many hours of practice, they can perform automatically
• by recognizing patterns they arrive at solutions without an exhaustive
5. Experts have superior memory organization and strategies.
• experts do not have larger memories; they have automatized their
skills which frees up their mental resources for greater storage
6. Experts take a great deal of time analyzing problems before taking action.
• experts try to understand a problem before acting while novices jump
right in and attempt a solution
• experts build a mental picture of the problem
7. Expert knowledge is procedural and causal in nature.
• experts are good at relating events in cause and effect sequences that
lead to problem solutions
8. Experts have strong self-monitoring skills.
• experts are aware of their own mistakes; they know when they don’t
understand; they know when they need to check their solutions

(What types of problems do troubleshooters have?)

Knowledge Deficiencies

1. Ineffective troubleshooters have incomplete system knowledge. They don’t
understand how components work or how they interact with other
components within the system.
2. Ineffective troubleshooters don’t know what actions or tests can be used to
collect information or to manipulate equipment.
3. Ineffective troubleshooters are greatly affected by working memory
limitations, which prevent them from remembering the symptoms and the
results of the tests they have already performed.
4. Ineffective troubleshooters have limited understanding of troubleshooting

Skill Deficiencies
1. Ineffective troubleshooters have limited skills to choose from.
2. Ineffective troubleshooters have difficulty performing skills correctly.
3. Ineffective troubleshooters have difficulty tracing schematics and wiring

Performance Problems

1. Ineffective troubleshooters act without thinking and have preconceived
notions. They often feel that they must be active so that they look like they
know what they are doing.
2. Ineffective troubleshooters focus on only part of the problem. They act like
they are wearing blinders.
3. Ineffective troubleshooters fail to use all the available information - even
about what IS working.
4. Ineffective troubleshooters do what they know how to do instead of what the
problem requires. They rely on favorite strategies and repeat ineffective
strategies rather than attempt a new one.
5. Ineffective troubleshooters do the wrong thing even when the symptoms
suggest another approach.
6. Ineffective troubleshooters work on the right problem with the wrong tools.

TYPICAL TROUBLESHOOTING PERFORMANCE
(How did you fix that thing?)

What is technical troubleshooting?
Technical troubleshooting is a task that involves the detection, diagnosis, and repair of faulty equipment.

How do troubleshooters typically perform?
The process of technical troubleshooting is divided into two main components:
1. Generating “best guesses” as to what the problem might be.
2. Testing out each “guess” until the fault is found.
The troubleshooting process is shown graphically as a flow chart.

First, the troubleshooter collects and interprets information from many sources. This information helps the troubleshooter better understand the problem and results in a “mental model” of the problem. It is the quality of this “mental model” that is one of the keys to becoming an expert troubleshooter. Following the representation of the problem, the troubleshooter develops a “problem space.” The problem space is all of the areas within the system that could potentially contain the fault(s). During this phase, the troubleshooter collects information that is used to identify one or more “hypotheses” (best guesses as to what the problem might be). This phase of troubleshooting is called Problem Definition.

Following the definition of the problem and the creation of the problem space, the troubleshooter begins the second phase of the troubleshooting process, called Problem Space Evaluation. During this phase, the troubleshooter checks out the “best guesses” that have been developed to determine which one identifies the fault. If the fault is identified, the troubleshooter can then repair the problem in the equipment. However, if all of the hypotheses are evaluated and the fault is still not located, the troubleshooter then returns to the Problem Definition Phase to collect more information and to generate additional plausible hypotheses.
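The two-phase loop described above (Problem Definition, then Problem Space Evaluation, then back to Problem Definition if nothing is found) can be sketched in a few lines of Python. This is only an illustration: the function names, the max_rounds limit, and the idea of passing the two phases in as callables are assumptions made for the example, not part of the original material.

```python
def troubleshoot(gather_hypotheses, test_hypothesis, max_rounds=3):
    """Alternate between Problem Definition and Problem Space Evaluation.

    gather_hypotheses(round_number) -> list of best guesses (hypothetical callable)
    test_hypothesis(hypothesis)     -> True if the test confirms the fault
    Returns the confirmed hypothesis, or None if every round comes up empty.
    """
    tested = set()
    for round_number in range(1, max_rounds + 1):
        # Problem Definition: collect information and generate best guesses,
        # skipping guesses that were already checked in an earlier round.
        hypotheses = [h for h in gather_hypotheses(round_number) if h not in tested]
        # Problem Space Evaluation: check each best guess in turn.
        for hypothesis in hypotheses:
            tested.add(hypothesis)
            if test_hypothesis(hypothesis):
                return hypothesis
        # Nothing confirmed: go back and collect more information.
    return None


# Illustrative data only: the second round of information gathering
# suggests a new hypothesis that turns out to be the fault.
guesses_by_round = {
    1: ["blown fuse", "loose connector"],
    2: ["loose connector", "failed relay"],
}
found = troubleshoot(
    lambda r: guesses_by_round.get(r, []),
    lambda h: h == "failed relay",
)
print(found)
```

The loop stops after a fixed number of rounds so that a fault that cannot be located does not keep the search running forever; in practice that is the point at which outside help is usually called in.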


TROUBLESHOOTING PROCESS
Any troubleshooting process includes five important areas. These include the various forms of knowledge that troubleshooters need, the priorities that guide troubleshooting, the factors that influence decision making, common troubleshooting strategies, and a systematic procedure.

Knowledge of the Equipment
1. General Knowledge
• basic reading and mathematics skills
• environmental constraints (time, pressure, weather, etc.)
2. Technical Knowledge
• friction, Ohm's law, Pascal's law
• common test equipment
3. System-Specific Information
• physical (What is it? What does it look like? Where is it located?)
• functional (What does it do?)
• behavioral (How does it work and relate to other components?)
4. Unit-Specific Information
• symptoms / complaints
• maintenance records

Priorities for Troubleshooting
• Are you only trying to isolate the fault?
• Do you want a long-term, permanent solution and repair?
• Is it critical that the repair be made quickly?
• Is cost effectiveness a priority?

Factors that Influence Troubleshooting Decisions
• Anticipated length of down time.
• Anticipated operation losses (product and labor).
• Availability of spare parts.
• Availability of technical and engineering support.
• Ease of testability. Is it easily accessible? How complex is the test? Is the test dangerous?
• Timing of failure. When time is limited, technicians rely on brute force methods. We call this “putting Band-Aids” on the problem. For example, putting a coin in a fuse box.

Common Troubleshooting Strategies
• Trial and Error - very common
• Exhaustive Search - test all possibilities - requires little expertise but is only feasible if the set of possible faults is small (TV tubes)
• Topographic Search - mental model or schematics are used to guide search, like following a map
• Split-half Technique - This strategy eliminates the greatest number of possibilities. If testing is time consuming or expensive, try to eliminate as many possible causes as possible with each test.
• Functional Search - observing the function of a system and developing hypotheses (i.e., What would happen if?) This strategy relies on a “Mental Model” of the system and requires the technician to create a “Problem Space.” The mental model can be simulated mentally and then compared to a normal functioning system. This strategy requires more system knowledge and is more mentally difficult than the general search methods, but it is much more accurate.
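The split-half technique is easiest to see on a simple serial chain, where a signal passes through one stage after another. The sketch below is a minimal illustration under stated assumptions: there is exactly one fault, the signal is good at every test point before the faulty stage and bad at every test point from that stage onward, and the stage names and the signal_ok_after callable are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def split_half_search(
    stages: Sequence[str],
    signal_ok_after: Callable[[int], bool],
) -> Tuple[str, int]:
    """Find the single faulty stage in a serial chain by halving the search area.

    signal_ok_after(i) stands for a measurement taken at the output of stage i.
    Returns the name of the faulty stage and the number of tests performed.
    """
    low, high = 0, len(stages) - 1  # the fault lies somewhere in stages[low..high]
    tests = 0
    while low < high:
        mid = (low + high) // 2
        tests += 1
        if signal_ok_after(mid):
            low = mid + 1   # everything up to and including mid is good
        else:
            high = mid      # the fault is at mid or earlier
    return stages[low], tests


# Illustrative chain of eight stages with the fault placed at the connector.
stages = [
    "power supply", "fuse", "switch", "relay",
    "wiring harness", "connector", "motor", "pump",
]
faulty_index = stages.index("connector")
culprit, tests_used = split_half_search(stages, lambda i: i < faulty_index)
print(culprit, tests_used)
```

With eight stages the fault is isolated in three measurements, where checking the stages one after another from the start would take six measurements to reach the connector. That difference is the reason the strategy is recommended when each test is slow or expensive.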

A General Troubleshooting Procedure
While troubleshooting seldom occurs in a straight-forward fashion, a systematic approach will lead to better results. The following troubleshooting procedure is used by many experts.

GENERAL TROUBLESHOOTING PROCEDURE

Isolate Problem (Collect as much information as possible)
1. Customer Complaint
• What happened?
• What was it doing when the problem occurred?
• Was everything else working all right?
2. Operating Conditions
• What is the geography? (rocky, sandy, high altitude, etc.)
• What were the weather conditions? (extreme cold, extreme heat, high humidity, etc.)
• Was an experienced operator using the machine at the time the problem occurred?
3. Machine History
• What preventive maintenance has been completed?
• What repairs have been made in the past?
4. Duplicate the Problem
• Operate the device yourself to check the accuracy of the information you have been given.
• Have operator duplicate the situation that caused the problem.
5. Collect Additional Information as Needed
• Sensory Checks such as looking, listening, smelling, touching.
• Technical Tests such as operational adjustments, and technical procedures.
• Job Aids such as manuals, bulletins, standard operating procedures, and schematic diagrams.
• Technical Support such as suppliers, manufacturers, and experts.

Identify Possible Faults
1. Identify as many possible causes of the problem as you can.
2. If the problem does not have a set of clear possible causes, narrow the problem to a sub-system and then try to identify causes.
3. Collect more information if possible faults are difficult to identify.

Check-Out Possible Faults
1. Collect additional information to check-out the possible faults.
2. Reduce the number of possible causes with a systematic approach. (use split-half strategy)
3. If testing is time consuming or expensive, try to eliminate as many possible causes as possible with each test.
4. Vary one thing at a time to test the possible causes when you are not familiar with the system or device.
5. Always return the system to its original configuration after replacing a part or making tests.
6. Do not assume that new parts always work.
7. Take notes about test results, parts replaced, and adjustments made.
8. Always observe safety precautions/rules.

Repair Fault
1. Perform necessary procedures to remove the fault from the system.
2. Take notes about test results, parts replaced, and adjustments made.
3. Always observe safety precautions/rules.

Re-Check Solution
1. Check every solution you reach with some kind of test.
2. If the fault remains after checking the solution (or a new one appears), go through the troubleshooting procedure again.
3. Complete required Fault Analysis paperwork.

7 TROUBLESHOOTING STEPS
Step 1 - Understand Complaint
Step 2 - Confirm Problem Exists
Step 3 - Gather Information
Step 4 - Develop Failure Theory
Step 5 - Test Theories
Step 6 - Make Indicated Repairs
Step 7 - Retest / Confirm Corrections

TROUBLESHOOTING
FOUR STAGES IN TROUBLESHOOTING PROCESS
1. Discover what others know about the problem.
a. When did the problem occur?
b. How was the machine used?
c. Was everything else working alright?
d. What repairs have been made in the past?
2. Discover what the machine can tell you about the problem.
3. Think logically about the problem.
a. How should the system work?
b. How does the system work?
Identify as many possible causes of the problem as you can.
4. Make measurements. Be cautious.
a. Let each measurement be an additional piece of information with which to reason with the problem again until you know the root cause of the problem.
5. ARE THERE ANY SECOND TESTS?
If the information gained by the measurement is helpful, but not conclusive, ask yourself if there is a second test you can perform to prove that you have discovered the cause of the problem.
6. LOOK FOR THE ROOT CAUSE OF THE PROBLEM
Ask yourself if the tests you have performed point to the root cause of the problem.

Introduction to Troubleshooting

Troubleshooting
Troubleshooting is an organized, logical method of identifying problems and solving them. This is a critical skill because your effectiveness and efficiency in repairing Caterpillar hydraulic systems depend upon your ability to quickly and correctly determine the cause of problems.

Determine a Problem Exists
The first step in the troubleshooting process is to make sure a problem really does exist. Inexperience with a machine's characteristics and improper operation are sometimes mistaken for problems. Ask questions of and listen to operators, mechanics, and others familiar with the machine you are troubleshooting.
Find out when the problem started.
• Was it sudden or gradual?
• Is the problem continual or sporadic?
• Does it occur at all speeds, loads, gears, and temperatures?
Find out when the machine was last serviced.
• By whom?
• What was done?
Ask more than one person and see if you receive the same information. If it is available, check the service record of the machine. And remember that a friendly, tactful attitude will usually gain you more and better information.

State Problem in Writing
The second troubleshooting step is to state the problem by writing it down in simple terms. This will help you more clearly understand the exact nature of the problem. Be sure that you do not put a “solution” in the statement; it is too early to make a diagnosis, and guessing at a solution at this stage will only waste time.

Inspect Machine
The third troubleshooting step is to visually inspect the machine. Check it out yourself. If possible, watch the machine in actual operation. Try to observe the problem as it happens. Check all fluid levels. Check for obvious damage, such as leaks, loose bolts, and cracks. Do not guess at anything that can be visually inspected.

List All Possible Causes
Step four is to list all possible causes of the problem. Be sure to include all of them, from the simplest to the most unlikely and difficult. Use the service manual and system schematics to make sure you have considered all possibilities.

Run Tests & Record Data
Step five is to run tests and record data. The information you gathered in steps one through four, plus the machine's service manual, will help you determine which tests need to be performed. You may need to check cycle times, pressures, temperatures, and other machine characteristics. Be sure to note the test specifications and procedures listed in the service manual. Remember to record all appropriate data from the tests.

Eliminate and Isolate
Now, use all the data you have collected so far to do step six: eliminate and isolate. Use the things listed below to eliminate everything that cannot create the problem.
• The list of possible causes you made in step four
• The input from the operator and others
• Your inspection and test results

Fix the Problem
Step seven is to fix the problem. If there is more than one possible cause remaining on your list, start with the simplest and easiest to fix. Once you have repaired the problem, rerun the appropriate tests. If possible, try to observe the machine in operation. Wait for the machine to reach operating temperature, and then watch it work under normal conditions. Make sure you have really fixed the problem. If the repair fails, let it happen while you are there.

Analyze the Failure
Even after the machine is repaired, there is one final troubleshooting step: analyze the failure. Why did the problem occur in the first place? This procedure may be simple, if the problem is straightforward, or it may require more sophisticated failure analysis methods. Include this analysis in your service report, along with all pertinent information about the machine.

The Scientific Method
1. Observe and describe a situation (What)
2. Form a theory to explain the symptoms (Why)
3. Use the theory (Hypothesis) to predict results (How)
4. Verify theory through testing

The Scientific Method Expanded
1. Curious Observation
2. Is There a Problem?
3. Goals and Planning
4. Search, Explore, and Gather Evidence
5. Generate Creative and Logical Alternative Solutions
6. Evaluate the Evidence
7. Make the Educated “Guess” (Hypothesis)
8. Test the Solution (Hypothesis)
9. Challenge the Solution (Hypothesis)
10. Reach a Conclusion

The 8 Steps of Applied Failure Analysis
Step 1 State the Problem
Step 2 Get Organized to gather facts
Step 3 Observe and record facts
What do I see? - Facts: surface textures, background facts
Step 4 Think logically with the facts
What does it mean? - Events: something happened at a certain point in time (change facts into events)
Where do I go next? (More facts)
[Diagram: time line of a new sequence of events, running from cause to result (“Broke”)]
Step 5 Determine the most probable ROOT CAUSE
a. What happened first
b. How it happened - brainstorm list, investigate & eliminate
c. Who is responsible - brainstorm list, investigate & eliminate
Root cause statement = what happened first + how it happened + who is responsible
Ask the double check question
Prepare a report using the first 5 steps as an outline
Step 6 Communicate with the responsible party
Step 7 Make the repairs
Step 8 Follow up to assure the problem is solved

[Diagram: Identify - Interpret - Follow. Identify the facts (failed parts; application, operation, maintenance), interpret the facts as events, and follow the events along the time line between root cause and result.]

The 10 Step Universal Troubleshooting Process
1. Get the Attitude
2. Get a complete and accurate symptom description
3. Make a damage control plan
4. Reproduce the symptom
5. Do the appropriate general maintenance
6. Narrow it down to root cause
7. Repair or replace the defective component
8. Test
9. Take pride in the solution
10. Prevent future occurrences of the problem

Step 1 Get The Attitude
In troubleshooting, as in any other human endeavor, you must have the right attitude to succeed. The best way to get and maintain the attitude is to remember that it is a mathematical certainty that you will solve any reproducible problem in a system for which you have knowledge or system documentation. You CAN solve it. It's not magic -- there's always an explanation. Don't panic. Don't get mad. Be patient and don't skip steps. Practice teamwork. When you get in a bind, just ask yourself "how can I narrow it down one more time?". Don't try to fix it, just try to narrow it down. Above all, remember that your troubleshooting power comes from your troubleshooting process.

Step 2 Get the Symptom Description
The symptom description must be as complete and accurate as possible. The more detailed the description, the less work you will need to do. A good symptom description minimizes the risk of “fixing the wrong problem”. Here is some common information to describe the symptoms:

Equipment Questions
• Age?
• Maintenance history?
• History of prior problems?

General Symptom Questions
• Why do you think there is a problem?
• Any error messages, fault codes, gauge readings?
• Is the symptom intermittent or reproducible?

Reproducibility Questions
• Is there a procedure to CONSISTENTLY reproduce the symptom? If the answer is YES, the problem is reproducible. If the answer is NO, the problem is intermittent.

If Reproducible
• Describe the procedure to produce the symptom.
• Is there any way to make the symptom go away?

If Intermittent
• How often does it seem to happen?
• What seems to make it more frequent?
• What seems to make it less frequent?
• Is there anything that seems to make it go away?

Other Related Symptoms
• Any other oddities evident?
• Any other symptoms present even in other systems?

Occurrence Questions
• When did the problem start?
• Did anything else happen about that time?
• Were any changes made in the equipment or its operation?

Step 3 Make a Damage Control Plan
Before you begin, think if there is anything you might do to actually make the problem worse. The equipment may not be working properly and some hidden dangers may exist. Take safety precautions.

Precautions to prevent injury to people
• Wear proper clothing
• Protect Yourself!
• Make sure you are not creating a fire or other hazard

Precautions to prevent damage to equipment
• Will any test potentially cause damage?
• Will reproducing the problem potentially cause additional damage?
• Make sure proper procedures are available for disassembly and assembly.

Step 4 Reproduce the Symptom
You can’t fix what you can’t see! If you can’t reproduce the problem, you can’t narrow down possibilities and find a solution. If you can’t reproduce the problem, you can’t be sure if any fix actually is a correction or is creating another situation.

Intermittent problems complicate the solution process as it may not be possible to eliminate potential problem areas unless the problem is present during testing. There are some techniques to improve the possibility of finding a solution that is not always reproducible.

Step 5 Do the Appropriate Maintenance
We have all felt stupid after spending hours narrowing down a problem to something that would have been corrected with simple general maintenance. The trick is to determine what is appropriate maintenance. An action is appropriate general maintenance if:
• It’s likely to cause the problem, is easy to do, and is a maintenance item.
• It’s a possible cause of an intermittent problem, and is not difficult.

Step 6 Narrow It Down
This is the most complex step. There are many different analysis techniques that can be used to narrow the problem back to root cause. Regardless of the method used, the goal is to eliminate possibilities. The method that achieves this most quickly and accurately is usually the best.

Triple Tradeoff: Ease vs. Likelihood vs. Systematic Approach
A significant tradeoff is the fact that troubleshooting tests may be time consuming, inconclusive, and sometimes risky. A test that eliminates the greatest number of possibilities is often the hardest, both to design and conduct. Sometimes it is ok to test easy and likely problems first to quickly eliminate possibilities. Knowledge of the system may allow for educated “guesses”. While this is little more than “trial and error”, the more knowledgeable the troubleshooter is about the system, the more likely they will find a cause.

Letting the Problem Out of the Box
Most analysis techniques continually try to force the problem into smaller areas until the only thing left is the root cause. The worst thing that can happen is that the problem escapes the box. A test that should have eliminated a possibility may have been incorrect. When that happens, tests become inconclusive. The troubleshooter may think they have eliminated a possibility when that area actually contains the solution. The true cause may escape detection. Failure to find a solution leads to confusion and wasted time. The troubleshooter begins to doubt themselves and the process. Do Not Skip Steps.
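One way to make the ease versus likelihood tradeoff concrete is to estimate, for each candidate check, how long it takes and how likely it is to find the fault, and then do the checks with the best likelihood per minute first. The sketch below is only an illustration of that heuristic: the Check class, the rank_checks function, and all of the figures are invented for the example and do not come from the original material.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Check:
    name: str
    minutes: float      # estimated effort to perform the check (assumed figure)
    likelihood: float   # estimated chance that this check finds the fault (assumed figure)


def rank_checks(checks: List[Check]) -> List[Check]:
    """Order checks so that easy, likely ones come first (likelihood per minute)."""
    return sorted(checks, key=lambda c: c.likelihood / c.minutes, reverse=True)


# Illustrative estimates for a machine that will not start.
checks = [
    Check("replace control board", minutes=120, likelihood=0.30),
    Check("inspect fuse", minutes=2, likelihood=0.10),
    Check("clean connectors", minutes=15, likelihood=0.25),
    Check("test pump motor", minutes=45, likelihood=0.35),
]
ranked = rank_checks(checks)
for check in ranked:
    print(f"{check.name}: {check.likelihood / check.minutes:.4f} per minute")
```

In this example the two-minute fuse inspection comes first even though it is the least likely cause, and the two-hour board replacement comes last even though it is one of the more likely ones. The ranking is a shortcut for the easy and likely cases; when it runs out, the systematic approach (for example, split-half) still applies.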

Step 7 Repair or Replace Component
The easy part. Be sure the repair is done correctly and no new problems are created due to workmanship.

Step 8 Test
Final testing is the best way to know if a problem is fixed. If the symptom you obtained in Step 2 and reproduced in Step 4 is now gone, and no new problems have occurred, the end user will probably be happy. Most horror stories occur when final testing was inadequate or non-existent. When testing ask four quality questions:
• Did the symptom go away?
• Did the right symptom go away?
• Did I fix the right cause?
• Did I create any new problems?
Some testing may be done on an incomplete system, but it is important to do a final test when everything is completely reassembled. Testing may not confirm the correction of an intermittent problem. If you had difficulty reproducing it in the first place, checking to see if it is now gone is equally difficult. If final testing is inconclusive with an intermittent problem, the final testing may be done by the end user, if the user understands they are doing the testing, and they consent to using the product without confirming that the problem has been corrected.

Step 9 Take Pride
The other steps fixed the problem. Now is the time to take pride in finding the solution and share your success with others. Troubleshooting can be an intense process and must be done unemotionally. Review what you did, pat yourself on the back for any brilliant ideas, and evaluate what you could have done better or differently.

Step 10 Prevent Future Occurrence

Document the symptom and solution
You may have the same problem again in the future.

Tell others
Let other people benefit from your experience. They may give you some insights into solutions they have found. In some cases, manufacturers should be notified. There may be a problem developing that they should be aware of.

Give the customer user instructions
Often the customer or user may have done something to create or aggravate a problem. Educating them (tactfully) about how to properly use and care for the equipment will increase their satisfaction and reduce the likelihood of future problems.

Common Elements of All Troubleshooting Processes
• Define problem
• Duplicate problem
• Compile available information
  - System information
  - Limitations
• Analysis
  - Systematic approach
  - Goal is to isolate problem or fault by ruling out possibilities
• Repair
• Confirm solution

Troubleshooting Questions
1. What are the symptoms?
a. What is working normally?
b. What is NOT working normally?
2. What are possible causes for each symptom?
a. For each cause, what symptoms should be present?
b. For each cause, what symptoms should NOT be present?
3. What tests / observations can you make to verify the presence of a cause?
a. What should the results of the test / observation be if
i. The system is working normally?
ii. The expected problem is present?
4. What should be fixed to correct the cause?
a. Why?
b. Will this affect anything else?
5. When repairs are made, is the problem corrected?
6. Do any problems still exist?
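Questions 1 and 2 above amount to a simple elimination rule: a cause can be ruled out if a symptom it should produce has been confirmed absent, or if a symptom it should NOT produce has been observed. The sketch below shows that rule on a made-up example. The function name, the dictionary layout, and the vehicle symptoms are all assumptions chosen for illustration.

```python
def consistent_causes(causes, observed, confirmed_absent):
    """Return the causes that still fit what is, and is not, working normally.

    causes: dict mapping a cause name to a dict with two sets of symptoms,
            "present" (should be seen) and "not_present" (should not be seen).
    observed: set of symptoms that are NOT working normally.
    confirmed_absent: set of symptoms checked and found to be working normally.
    """
    remaining = []
    for name, expected in causes.items():
        missing_symptom = expected["present"] & confirmed_absent
        contradicting_symptom = expected["not_present"] & observed
        if not missing_symptom and not contradicting_symptom:
            remaining.append(name)
    return remaining


# Illustrative data only.
causes = {
    "blown fuse": {
        "present": {"lamp dark", "radio dead"},
        "not_present": set(),
    },
    "burned-out bulb": {
        "present": {"lamp dark"},
        "not_present": {"radio dead"},
    },
    "dead battery": {
        "present": {"lamp dark", "radio dead", "starter dead"},
        "not_present": set(),
    },
}
observed = {"lamp dark", "radio dead"}
confirmed_absent = {"starter dead"}   # the starter was checked and works
remaining = consistent_causes(causes, observed, confirmed_absent)
print(remaining)
```

Note that the working starter is what rules out the dead battery. Information about what IS working normally eliminates possibilities just as well as information about what is not.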

The Importance of Quality Control
The quality of the solution depends on the quality put in the process. Getting an accurate and complete symptom description, and reproducing the symptom assures that you fix the problem the customer wanted fixed. A good plan assures you won’t make anything worse. A good analysis process will reduce costs, and ensure that the real root cause is found. The quality of tests done during the analysis process is critical to ensure that the results are accurate. A good test can be duplicated and double checked. A poor test can cause the real problem to remain hidden, and the tests to be inconclusive. An unnecessary test is just as bad as a poor quality test, since the information does not help eliminate possibilities. Proper repair prevents further problems. Final testing is like inspection in a factory. Small defects that escape detection earlier in the process are caught here, by showing that the original symptoms that were reproduced are now eliminated. Preventing a later occurrence of the problem is a key to good service.

Don’t Skip Steps
Since all troubleshooting techniques depend on the idea that testing and analysis will eliminate possibilities and lead us to the root cause, skipping steps can lead us to disaster. The troubleshooter may think they have eliminated a possibility when that area actually contains the solution. The true cause may escape detection. Failure to find a solution leads to confusion and wasted time. The troubleshooter begins to doubt themselves and the process. Do Not Skip Steps.

Common Troubleshooting Pitfalls
• No Plan
If you don’t have a start and a finish in mind, how do you find the path?
• No understanding of system operation
You can’t fix what isn’t broke! How can you tell what is wrong if you don’t know what is correct operation?
• Skipping steps
What have you missed? Is it important?
• Jumping from one analysis tool to another
If you keep changing direction, will you find your goal?
• Too many tests, too much information
Sometimes additional information may be confusing. Do you really need that information?
• No “Quality Control”
If a test, a tool, or procedure is inaccurate, how can you trust the results?
• PANIC
What ARE you doing?


Intermittents & Reproducibles

Definitions:
These are two kinds of symptoms, and they're opposite and mutually exclusive. Here are the definitions:
A Reproducible is: A symptom which can be consistently reproduced using a known procedure.
An Intermittent is: A symptom for which there is no known procedure to consistently reproduce it.

Notice the following about these definitions:
• An intermittent can be reproduced. However, the troubleshooter cannot cause its reproduction because there is no known procedure to consistently reproduce it. The best the troubleshooter can do is create an environment to increase the odds of the symptom occurring, and wait. When the symptom occurs, it reproduces itself.
• An intermittent can become a reproducible. This happens when the troubleshooter finds a procedure to consistently reproduce the symptom.
• These terms are from the frame of reference of the troubleshooter, not the physical world. In the physical world everything is reproducible if viewed in enough (molecular, atomic, etc.) detail.
• “The picture goes blank within an hour of turning it on” is not a reproducible symptom. The word “within” means sometimes it's an hour, sometimes 45 minutes, etc. In other words, the exact time is governed by chance. It’s probable that some time it will take more than an hour to occur, maybe much more.

Why reproducibles can always be solved:
As per the definition, the troubleshooter can reproduce the symptom at will. If the troubleshooter performs a test that stops the known procedure from reproducing the symptom, he or she has clearly ruled out part of the search area. After a number of such tests, the technician will have narrowed the cause to a single component, which can then be repaired or replaced. There are two requirements for mathematical certainty of solution in reproducibles: 1) the troubleshooter has sufficient knowledge of the system to devise tests that will narrow the search area and sufficient knowledge to interpret those tests correctly (often possession of technical documentation is enough), and 2) the troubleshooter uses a procedure for these tests to guarantee that he or she doesn't "go around in circles". Given these requirements, a reproducible symptom will always be traced to its root cause.

Why intermittents are so hard to troubleshoot:
With intermittents, there's no mathematical certainty of solution. Here's why. Since reproduction of the symptom isn't in the troubleshooter's hands, there's no way of knowing whether a symptom went away because of a test the troubleshooter performed, or because of random chance. Since no conclusive test can rule out part of the search area, the underlying cause can't be traced. Instead, the troubleshooter uses a combination of general maintenance, statistical analysis, intuition, and trial and error. These four tools often lead to a solution, but sometimes don't. Indeed, many intermittents are never solved. Note that the hardest problems to fix are those that happen least frequently. If the troubleshooter can't reproduce the symptom at all, all he or she is left with is general maintenance and guesswork. In this case, the probability of solution is low.

Intermittent busting strategies:

Ignore it: If the problem causes no danger to people or property, and if the symptom is so rare it isn't an inconvenience, this is the best policy. Be sure the customer or user is informed of what you did, and what he or she can expect.

Convert the intermittent into a reproducible: Always try to find a procedure to consistently reproduce the symptom. If you can isolate a portion of a system and then thoroughly check and exercise just a small part of the system, it may be possible to force the system to reproduce the symptoms. When it does, bingo, you've got your reproduction procedure. If that doesn't work, fall back on the other strategies.

General maintenance: Since intermittents are so tough to troubleshoot, general maintenance starts looking a lot easier. Cleaning every connector in an electronic system might seem too much work for a reproducible, but compared to the hassle of troubleshooting an intermittent it's downright easy. It's a useful policy to "General Maintenance" an intermittent, then either test it or give it back to the user/customer to test.

Change the environment: Turn the intermittent against itself. Sometimes you can change the environment that the system operates in, wiggle things looking for bad connections, or move things around and see what happens. Using a heat gun to heat a component that only gives trouble intermittently may create the symptoms and narrow it down physically. By turning the intermittent against itself you may actually have an easier time with an intermittent than with a reproducible.

Statistical analysis: We all use a human, subjective style of statistical analysis when dealing with intermittents. "It seems to happen more when...", "It seems to happen less when I...", "It seems to happen about once every..." are examples. When something is three or more standard deviations outside the norm, it is unlikely to be random chance. The real breakthrough will come when a diagnostic machine is able to exercise the system in several ways, record the instances of the symptom, correlate them to the exercises, and statistically evaluate the correlation.
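The statistical analysis strategy (exercise the system, record the instances of the symptom, correlate them to the exercises) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `correlate` and `suspects`, the exercise names, and the data are hypothetical, and the pooled rate across all trials is used as a crude "norm".

```python
from collections import defaultdict
from math import sqrt


def correlate(observations):
    """observations: iterable of (exercise_name, symptom_occurred) pairs.

    Returns {exercise: (trials, hits, z)}, where z is how many standard
    deviations the exercise's symptom rate sits from the pooled rate.
    Assumption: each trial is independent (simple binomial model).
    """
    obs = list(observations)
    if not obs:
        return {}
    base_rate = sum(1 for _, occurred in obs if occurred) / len(obs)

    counts = defaultdict(lambda: [0, 0])  # exercise -> [trials, hits]
    for name, occurred in obs:
        counts[name][0] += 1
        counts[name][1] += int(occurred)

    report = {}
    for name, (n, k) in counts.items():
        sd = sqrt(base_rate * (1 - base_rate) / n)
        z = (k / n - base_rate) / sd if sd > 0 else 0.0
        report[name] = (n, k, z)
    return report


def suspects(report, threshold=3.0):
    """Exercises whose symptom rate is `threshold` or more deviations above the norm."""
    return sorted(name for name, (_, _, z) in report.items() if z >= threshold)


# Hypothetical log: 50 trials per exercise, symptom seen 2, 3 and 25 times.
log = (
    [("idle", i < 2) for i in range(50)]
    + [("heat", i < 3) for i in range(50)]
    + [("vibration", i < 25) for i in range(50)]
)
print(suspects(correlate(log)))
```

An exercise that stands out this way is a candidate reproduction procedure, not proof of root cause; it still has to be confirmed by testing.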

Summary:
Intermittent and reproducible are opposites. Reproducibles can be consistently reproduced by a known procedure. This is not true of intermittents. It is a mathematical certainty that reproducibles will be traced to their root cause by a person using a systematic approach, and having sufficient knowledge of the system to devise and interpret conclusive troubleshooting tests. This is not true of intermittents.

When confronted with an intermittent, use one or more of these approaches:
• Ignore it.
• Convert the intermittent into a reproducible.
• General maintenance.
• Statistical analysis.
• Change the environment.

Operating As Designed "Phantom Faults"

This condition refers to instances where a system operating as designed is perceived to be unsatisfactory or undesirable. In general this is due to:
1. A lack of understanding by the customer.
2. A conflict between customer expectations and system design intent.
3. A system performance unacceptable to the customer.

WHAT TO DO:
The technician can verify that a system is operating as designed by:
• Reviewing Published Service Information functional / diagnostic checks.
• Examining bulletins and other service information for supplementary information.
• Comparing system performance to a like system.

A. If the condition is due to a customer misunderstanding or there is a conflict of customer expectations, the technician should explain the system operation to the customer.
B. If the concern is due to a case of unsatisfactory system performance, the technician should call Technical Assistance or the Dealer for the latest information.


Analysis Tools

Ask others !!
Find out what is already known about the problem. Ask an expert. Before you do, ask yourself an important question: Do I trust the expert and what are their qualifications? Check service manuals, bulletins, magazine articles, or any other printed reference. If the problem has been experienced before, someone, somewhere may already have a solution. Why reinvent the wheel!

"Shotgun" Approach
Unfortunately, this is a common method. This is sometimes better referred to as "trial and error", usually error! Some technicians don't have a plan, so they just try anything. A sure sign that this is the process being used is evidence of "throwing parts" at the problem. Some technicians, and others, seem to think the only solution is to keep changing parts until the problem goes away. The problem may go away eventually - after everything is changed! This can be very expensive and is no guarantee of success. This is a very poor technique.

Linear Analysis
This is a slightly more systematic process than trial and error. It consists of testing every component in a system in sequence. Start at the beginning, and check, test, adjust everything along the path. This is sometimes referred to as "Exhaustive search" and it can be exhausting! It is like taking the first road from your house and then trying every possible route until the desired destination is reached. This is a time consuming process and should eventually get results, but only if the correct system is being checked. This process is sometimes used as a last resort.

Topographical
Best described as following a map, flow chart, schematic, or troubleshooting tree. This is a common method used to troubleshoot computer hardware. This is sometimes a very good method to use first. If the process has been well developed and tested, this can be a quick way to check out a system if you have limited knowledge, but there are no guarantees. The problem is that this may be someone else's logical process. If you don't know the logic behind the process, and the process fails, you won't know why, or where to go next.

Split in Half - Interval Halving - Divide and Conquer
This is one of the best methods to quickly isolate a problem. The concept is to devise tests that eliminate 1/2 of the possibilities in each step. The first step is to use a test to eliminate 1/2 of the total system. The next step takes the part that does not test as ok, and uses a test to test 1/2 of that part. Each successive step keeps splitting the remaining section in half. This technique can be used on almost any system. It does require some expertise to devise realistic and effective tests, but the reward is great. Many expert troubleshooters use this process whenever possible because of the speed and accuracy. The main limitation of this technique is the accuracy of the test, but if the tests are valid, the problem will be isolated quickly.

Probabilities
Some diagnostic procedures are based on probabilities. After a system has been in use for some time and a history of symptoms and solutions has been established, it is relatively easy to generate a list of probabilities. Some failure will be the most probable, followed by the next most likely, then the next, and so on. This technique will test the most likely problem first to eliminate it, and continue down through the list. The concept is that if 50% of the problems are caused by one failure, one test can either confirm or eliminate that as the cause. This works well only if the list is based on a good, accurate history. The more experience a person has on a subject, the more likely they can identify the correct probabilities. When an experienced technician says "I have seen this before and it is usually caused by ..." they are using probabilities. Be aware, if only a few failures are known to cause most of the known problems, the common problems can be eliminated quickly. However, if this process fails, you will have to try a different approach.

Eliminate the obvious and easy first
There are many times when the best analysis tool may be difficult to use or time consuming. By testing the obvious and easy first, it is possible to eliminate some possibilities very quickly. You may get lucky! This is usually a good analysis tool as it is so easy to do. You may still need to go back to a technique like halving if the problem remains elusive.

Pyramid
This is similar to halving. This is sometimes a common type of flow chart. Start by checking the major system. Then, check each subsystem. Eliminate the ones that check ok. Check all component groups in the remaining subsystem to eliminate the ones that work. Continue the process until one component or area is isolated.

Spatial
This is somewhat similar to linear, except that in this process, the troubleshooter checks everything that could affect that one system / component. Then move on to the next system / component and do the same. This technique is used many times if there are very few components or after another method has been used to reduce the possibilities.

Pattern Recognition
This is a process that is sometimes used by the best troubleshooters. It requires considerable system knowledge and the ability to "visualize" the system in use. After the symptoms are confirmed, the troubleshooter asks a series of "what ifs". If this fails, what will happen? What problems can co-exist? Is there a pattern to the problem? Sometimes this is referred to as "free association" or "thinking out of the box". The pattern may not be very clear, so it must be tested carefully to verify that the explanation is possible and realistic. The major pitfall of this method is that it is easy to stray from the reality of the problem. The user must be very disciplined mentally or the process may deteriorate into trial and error, or guessing.
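The split-in-half technique described earlier can be sketched in Python. This is a minimal illustration, not a procedure from the handout. It assumes a single fault in a chain of components listed in signal-flow order, and a hypothetical test `output_ok(component)` that reports whether the signal is still good at that component's output. The component names are made up for the example.

```python
def isolate_by_halving(components, output_ok):
    """Return (faulty_component, tests_run).

    Assumptions: exactly one fault exists, components are in signal-flow
    order, and output_ok(c) is True for every component upstream of the
    fault and False for the faulty component and everything downstream.
    """
    lo, hi = 0, len(components) - 1  # the fault lies in components[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if output_ok(components[mid]):
            lo = mid + 1  # good through mid: the fault is downstream
        else:
            hi = mid      # already bad at mid: the fault is mid or upstream
    return components[lo], tests


# Hypothetical starter circuit with a bad relay.
chain = ["battery", "cable", "switch", "relay", "solenoid", "starter"]
fault_index = chain.index("relay")
result = isolate_by_halving(chain, lambda c: chain.index(c) < fault_index)
print(result)
```

Testing the same six components one after another from the battery would need four tests to reach the relay; halving needs at most three, and the gap widens quickly as the chain gets longer. As the text notes, the result is only as good as the tests: a single misleading reading sends the search into the wrong half.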

Only checking what the problem solver already understands
There are times when a troubleshooter is unsure of which way to go or what to do next. They usually fall back to only making tests that they are comfortable with, on systems they understand. Unfortunately, this may not be anywhere near the real root cause. This is unlikely to get any good result. Some people want to avoid learning new concepts and just assume a new system is the same as an older system.

Accidental solution
This happens! There are times when a solution is found by accident. The problem solver is looking in one area or at a particular problem, when evidence is found pointing in a completely different direction. Most experts accept that this will happen once in a while and keep an eye out for the unexpected. They are also aware that a good analysis tool would have taken them there eventually.



Classifying Tests

Direct Tests
A direct test is one that verifies the actual performance of a system or component. The result of a direct test will accurately show the performance or condition of the system or component. Examples include:
1. Load testing a battery.
2. Check cylinder to piston clearance with a micrometer.
3. Visual check of bearings.
4. Flow testing a pump.
These tests check just the component and usually do not show how the component affects other areas or systems.

Indirect Tests
An indirect test is usually conducted when a test of a single component or system is very difficult or time consuming. An indirect test may also be done when it is desired to eliminate a large number of possibilities with one test. In some cases, an indirect test will be done to determine how a system affects something else. The idea is that if the system being tested is ok, then anything that affects the system is also ok. Since most systems contain a number of components, if the main system is working ok, then the components of the system should be ok. Examples include:
1. Starter cranks ok and the lights are bright - is battery ok?
2. Blowby test ok - are cylinders ok?
3. Oil pressure ok - are bearings ok?
4. Cycle times ok - pump flow ok?
An indirect test may require the user to have significant system knowledge to select the correct test and evaluate results. An indirect test may not be conclusive, but will usually give a very good idea about the general condition of a component or system.

Confirmation tests
A confirmation test is a second test of a system which is done to verify that the first test result is correct. A confirmation test is often done if the first test was an indirect test and the results were inconclusive. A confirmation test is sometimes necessary to eliminate possibilities and provide a conclusive, measurable result. A confirmation test may also be the same test done a second time to verify that a test is valid and repeatable. Examples include:
1. Load testing a battery if starter cranks ok, but lights dim.
2. Disassemble an engine and measure clearances if blowby is very high but power output and oil consumption is within normal range.
3. Visual inspection of bearings and clearances if oil pressure is within normal range but at the low end.
4. Flow testing pump if most but not all cycle times ok.

Redundant
A redundant test is a second test of the same system that is done even if the first test is conclusive. It can also be the first test conducted a number of times with the same results. Testing a component of a system when the system has been shown to be working correctly can be considered a redundant test. A redundant test is usually done if a technician is unsure of themselves or of the test. Examples include:
1. Load testing a battery that won't take a charge.
2. Checking blowby on engine with high oil consumption, very low power, and high hours.
3. Checking pump flow if case drain is excessively high.

There is a fine line between a confirmation test and a redundant test. The difference is that the redundant test doesn't give any new information. When judging whether to conduct additional tests, the question should always be asked "will this test tell me something I don't already know?"

Irrelevant Test
This is the worst type of test. Conducting a test which has no purpose or no measurable results is an irrelevant test. This is usually the result of the person making the test having little or no knowledge of the system or what the test is designed to check. Sometimes a test is made of a component that is not even in the system with the problem. The results of an irrelevant test will tell nothing and may actually confuse the search for an answer. Examples include:
1. Load testing a battery when the complaint is low power from engine.
2. Checking voltage drop in a light relay when the problem is in the starting circuit.
3. Pulling the head of engine when complaint is poor hydraulic performance.
4. Pulling cylinder head for inspection even if output on dyno is good, blowby is good.
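The question "will this test tell me something I don't already know?" can be written down as a rough decision rule. The Python sketch below is a simplification for illustration only. The function `classify_test` and its parameters are hypothetical, and it ignores the case, noted above, where repeating a test is justified to check that it is valid and repeatable.

```python
from typing import Optional


def classify_test(target_system: str, problem_system: str,
                  prior_conclusive: Optional[bool]) -> str:
    """Classify a proposed test before running it.

    prior_conclusive: None if nothing has been tested on the target yet,
    otherwise whether the earlier result was conclusive.
    """
    if target_system != problem_system:
        return "irrelevant"       # not even in the system with the problem
    if prior_conclusive is None:
        return "new information"  # first look at this part of the system
    if prior_conclusive:
        return "redundant"        # the answer is already known
    return "confirmation"         # earlier result was inconclusive


# Load testing a battery when the complaint is low engine power.
print(classify_test("starting circuit", "engine power", None))
# Load testing a battery after "cranks ok, but lights dim".
print(classify_test("starting circuit", "starting circuit", False))
```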

[Diagram: engine starting system. Labels: switch, battery, relay, coil, fuel pump, solenoid, points, starter, fuel tank, air cleaner, plugs, ignition, air/fuel, carb, unit starts]

The user complains that the engine won't start. List the first five things you would check.
1. __________________________________
2. __________________________________
3. __________________________________
4. __________________________________
5. __________________________________


Starter Circuit Test Points

Select Tests to run and test order
1. Test point _____ to test point ______
2. Test point _____ to test point ______
3. Test point _____ to test point ______
4. Test point _____ to test point ______
5. Test point _____ to test point ______

Technique used ? __________________________________


Troubleshooter "War Story" Exercise

Equipment : __________________________________________________________
System With Problem : _________________________________________________
Customer Complaint : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Observed Symptoms : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Operating Conditions : _________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Machine History : ______________________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Potential Faults:
1.
2.
3.
4.
5.
6.
7.
8.
Methods Used To Isolate : ______________________________________________
_____________________________________________________________________
_____________________________________________________________________
_____________________________________________________________________
Actual Fault : _________________________________________________________

Procedures
Component _________
Component _________

Sensory Checks
• Look
• Listen
• Smell
• Touch
• Taste

Technical Checks
Test1 _____
Test2 _____
Test3 _____

Job Aids
• Service Manual
• Schematics
• Other Service Lit

Technical Support
• Consult Expert


Notes


Prepared by Cleveland Brothers Equip. Co., Inc.
Training Department