By: S.K. Sethiya, Dy. CME, West Central Railway, Jabalpur
Failure Mode and Effects Analysis (FMEA) is a systematic process for examining how a design or process could fail and what the results of a failure might be. The purpose of FMEA is to examine possible failure modes and determine the impact of these failures on the product (Design FMEA, DFMEA), process (Process FMEA, PFMEA) or service (Service FMEA, SFMEA). From there, FMEA team members identify the steps necessary to eliminate or minimize the potential causes of each potential failure before it occurs. A Design or Product FMEA focuses on specific systems or components such as raw materials, sub-assemblies and finished goods, while a Process FMEA focuses on a process and its individual steps. The early and consistent use of FMEAs in the design process allows the engineer to design out failures and produce reliable, safe, and customer-pleasing products. FMEAs also capture historical information for use in future product improvement. The aim of this paper is to introduce the FMEA technique, the FMEA procedure, the Risk Priority Number (RPN), and the benefits, limitations and abuses of FMEA. From the point of view of risk, the most important items, characterized by a high RPN, should be separated from those characterized by a significantly lower RPN value, and efforts should be undertaken to reduce the calculated risk through corrective actions.
Introduction
In the present scenario of growing global markets, customers are demanding high-quality, reliable products. The increasing capabilities and functionality of many products are making it more difficult for manufacturers to maintain quality and reliability in a competitive environment. There is virtually no margin for error, so when a company is introducing a new product, reengineering a process, or undertaking any project, it wants to ensure success and minimize the risk of failure. But success does not happen by chance. The challenge for manufacturers is to design quality and reliability into a product early in the development stage. The way to ensure this is through the use of Failure Mode and Effects Analysis (FMEA), a proven method for minimizing failure in products and processes.

FMEA is one of the well-known risk-assessment methodologies for analyzing potential reliability problems early in the development cycle, where it is easier to take action to overcome these issues, thereby enhancing reliability through design. FMEA was first used in the 1960s in the aerospace industry, helping to make safe, effective commercial air travel possible. It is now recognized as a fundamental tool in the reliability engineering field. In the 1970s FMEA helped the auto industry improve its track record by minimizing risks and improving safety and performance. Today, the specialists at Competitive Edge Resources bring the same benefits of FMEA to companies in a wide range of industries, enabling them to bring new products to market faster and at a lower cost.

FMEA goes beyond the preconceived notions or limitations of a single department and gets cross-functional teams working on effective solutions. FMEA sessions typically bring together staff from quality assurance, manufacturing, manufacturing engineering, design engineering, marketing/sales, and the shop floor, at a minimum. FMEA is actionable, providing all the information needed to begin reducing the risks in a particular design or process.
There are several types of FMEA; some are used much more often than others. An FMEA should always be done whenever a failure would mean potential harm or injury to the user of the end item being designed. The types of FMEA are:
• System FMEA (Sy. FMEA): focuses on the design of global system functions and is used to analyze system designs before they are released to function. System FMEA focuses on potential failure modes associated with the functions of systems and caused by design deficiencies.
• Design FMEA (DFMEA): focuses on components and subsystems and is used to analyze product designs before they are released to production. DFMEA focuses on potential failure modes associated with the functions of a product and caused by design deficiencies.
• Process FMEA (PFMEA): focuses on manufacturing and assembly processes and is used to analyze already developed or existing processes. PFMEA focuses on potential failure modes associated with both the safety, effectiveness and efficiency of the process and the functions of a product affected by process problems.
• Service FMEA (SFMEA): focuses on service functions and is used to analyze product serviceability, i.e. it is concerned with the potential problems associated with both maintenance issues and field failures of the manufactured products.
• Software FMEA (Soft. FMEA): focuses on the design of software functions and is used to analyze software designs before they are released to function. Software FMEA focuses on potential failure modes associated with the functions of the developed software and caused by design deficiencies in the program.
1. Initially, the FMEA should be performed during the design stage, but it may also be used throughout the life cycle of a product to identify possible failures as the system ages.
2. Failure mode and effect analyses may vary in the level of detail reported, depending upon the detail needed and the availability of information. As a development matures, an assessment of criticality is added in what becomes a Failure Mode, Effects, and Criticality Analysis (FMECA).

A failure mode, effects, and criticality analysis can be a starting point for many other types of analyses, including:
1. System Safety Analysis
2. Production Planning
3. Test Planning and Validation
4. Repair Level Analysis
5. Logistics Support Analysis
6. Maintenance Planning Analysis

These additional analyses may also be used to update and improve the FMECA as new information evolves.
There are two basic approaches for conducting an FMECA: a bottom-up (Hardware) approach and a top-down (Functional) approach. Either method can be successfully used in an RCM analysis, but each has its strengths and weaknesses. The key attribute of both approaches is that they are inductive analysis techniques that guide the RCM analysis team in establishing the cause-and-effect relationship needed to define maintenance requirements and discover other improvements. The following table describes each approach.

Bottom-up (Hardware) FMECA Approach
• This type of approach investigates smaller portions of the system, such as subassemblies and individual components.
• Performed by explicitly analyzing each equipment item of interest.
• Focuses on determining what effects different equipment failure modes have on the operation of the system.
• Determines whether the equipment failure mode results in a local effect that causes a functional failure that causes an end effect of interest.
• This type of approach ensures that all equipment items are analyzed and all plausible equipment failure modes are considered. In addition, a standard list of failure modes can be developed for common equipment items, thus making the analysis somewhat easier to perform and helping to ensure consistency between RCM teams. The following guide phrases may be used to help develop a list of failure modes to be considered:
i) Premature operation
ii) Failure to operate at a prescribed time
iii) Intermittent operation
iv) Failure to cease operation at a prescribed time
v) Loss of output or failure during operation
vi) Degraded output or operational capability
vii) Other unique failure conditions

Top-down (Functional) FMECA Approach
• This type of approach assumes a failure, and then identifies how that failure could occur.
• Performed by analyzing each function and its associated functional failures.
• Focuses on determining what effects different functional failures have on the operation of the system and then what equipment failures (e.g., failure mode) can result in the functional failure.
• Determines whether the functional failure results in an end effect of interest and then determines which equipment failures can cause the functional failure.
• This type of approach is typically used when individual items cannot be identified or a complex system exists. The functional approach generally involves a top-down analysis in which a specific failure mode for the entire system is traced back to the initiating subsystem failure mode(s).

Following are the steps for performing a bottom-up FMECA:
i) Select an equipment item for analysis
ii) Identify the potential failure modes for the equipment item
iii) Select a failure mode for evaluation
iv) Determine the local, next higher level and end effects for the postulated failure mode
v) If the end effect results in a consequence of interest, determine the causes of the failure mode
vi) Determine the failure characteristic (e.g., wear-in, random, wear-out) for the failure mode
vii) Determine the criticality of the failure mode using the risk decision tool
viii) Repeat steps as necessary until all equipment items and associated failure modes have been evaluated
Following are the steps for performing a top-down FMECA:
i) Select a function for analysis
ii) Select a functional failure for evaluation
iii) Determine the local and end effects for the postulated functional failure
iv) If the end effect results in a consequence of interest, determine the equipment failures that can result in the functional failure
v) Determine the failure characteristic (e.g., wear-in, random, wear-out) for the failure mode
vi) Determine the criticality of the failure mode using the risk decision tool
vii) Repeat steps until all functions and functional failures are evaluated
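The bottom-up procedure given earlier can be sketched in code as a loop over equipment items and their failure modes; the top-down procedure has the same shape, with functions and functional failures in place of items and failure modes. This is only an illustrative sketch: the `WorksheetRow` structure, the dictionary inputs, the pressure-cooker data and the `rate_criticality` callback (standing in for the unspecified "risk decision tool") are all assumptions, not part of any standard.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Key = Tuple[str, str]  # (equipment item, failure mode)


@dataclass
class WorksheetRow:
    """Hypothetical worksheet row; field names are illustrative."""
    item: str
    failure_mode: str
    local_effect: str
    next_level_effect: str
    end_effect: str
    causes: List[str] = field(default_factory=list)
    characteristic: str = ""           # e.g. wear-in, random, wear-out
    criticality: str = "not assessed"


def bottom_up_fmeca(
    modes_by_item: Dict[str, List[str]],
    effects: Dict[Key, Tuple[str, str, str]],
    causes: Dict[Key, List[str]],
    characteristics: Dict[Key, str],
    consequences_of_interest: Set[str],
    rate_criticality: Callable[[WorksheetRow], str],
) -> List[WorksheetRow]:
    rows = []
    for item, modes in modes_by_item.items():           # steps i and ii
        for mode in modes:                               # step iii
            key = (item, mode)
            local, next_level, end = effects[key]        # step iv
            row = WorksheetRow(item, mode, local, next_level, end)
            if end in consequences_of_interest:          # step v
                row.causes = causes.get(key, [])
            row.characteristic = characteristics.get(key, "")  # step vi
            row.criticality = rate_criticality(row)      # step vii
            rows.append(row)
    return rows                                          # step viii: all items covered


# Made-up example data for a pressure cooker safety valve.
modes = {"safety valve": ["fails closed", "leaks"]}
effects = {
    ("safety valve", "fails closed"): ("no pressure relief", "overpressure", "explosion"),
    ("safety valve", "leaks"): ("steam escapes", "low pressure", "food undercooked"),
}
causes = {("safety valve", "fails closed"): ["corrosion", "spring fracture"]}
characteristics = {
    ("safety valve", "fails closed"): "wear-out",
    ("safety valve", "leaks"): "random",
}
rows = bottom_up_fmeca(
    modes, effects, causes, characteristics,
    consequences_of_interest={"explosion"},
    rate_criticality=lambda r: "high" if r.end_effect == "explosion" else "low",
)
for r in rows:
    print(r.item, "|", r.failure_mode, "|", r.end_effect, "|", r.criticality)
```

Passing the criticality rule in as a function keeps the loop independent of whichever risk decision tool a team actually uses.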
FMEA Procedure
The process for conducting an FMEA is straightforward. The basic steps are as follows:

1. Describe the product/process/system and its function. It is important to have a clearly articulated understanding of the product, process or system under consideration. This understanding simplifies the analysis by helping the engineer identify which uses of the product/process fall within the intended function and which fall outside it. It is important to consider both intentional and unintentional uses, since product failure often ends in litigation, which can be costly and time consuming.

2. Create a block diagram of the product/process/system. A block diagram shows how the different parts of the system interact with one another and is used to verify the critical path. The recommended way to analyze the system is to break it down into different levels (i.e., system, subsystem, subassemblies, field replaceable units). Review schematics and other engineering drawings of the system being analyzed to show how different subsystems, assemblies or parts interface with one another through their critical support systems, such as power, plumbing, actuation signals and data flow, and to understand the normal functional flow requirements. A list of all functions of the equipment is prepared before examining the potential failure modes of each of those functions. Operating conditions (such as temperature, loads, and pressure) and environmental conditions may be included in the components list. The diagram shows the logical relationships of components and establishes a structure around which the FMEA can be developed. Establish a coding system to identify system elements. The block diagram should always be included with the FMEA form.

SYSTEM BREAKDOWN CONCEPT
SYSTEM: a composite of subsystems whose functions are integrated to achieve a mission/function (includes materials, tools, personnel, facilities, software, equipment)
SUBSYSTEM: a composite of assemblies whose functions are integrated to achieve a specific activity necessary for achieving a mission
ASSEMBLY: a composite of subassemblies
SUBASSEMBLY: a composite of components
COMPONENT: a composite of piece parts
PIECE PART: least fabricated item, not further reducible
INTERFACE: the interaction point(s) necessary to produce the desired/essential effects between system elements (interfaces transfer energy/information, maintain mechanical integrity, etc.)

The pressure cooker example below illustrates a system description and block diagram:
• Electric coil heats cooker.
• Thermostat controls temperature; switch opens above 250 °F.
• Spring-loaded Safety Valve opens on overpressure.
• Pressure Gage red zone indicates overpressure.
• High temperature/pressure cooks/sterilizes food; it tenderizes and protects against botulinum toxin.
3. Complete the header on the FMEA Form worksheet: Product/System, Subsys./Assy., Component, Design Lead, Prepared By, Date, Revision (letter or number), and Revision Date. Modify these headings as needed.
4. Use the diagram prepared above to begin listing items or functions. If items are components, list them in a logical manner under their subsystem/assembly based on the block diagram.

5. Identify Failure Modes. A failure mode is defined as the manner in which a component, subsystem, system, process, etc. could potentially fail to meet the design intent, or the manner in which a fault occurs, i.e. the way in which the element faults. Some examples of failure modes are shown in the table below:

Element | Failure Mode Examples
Switch | open, partially open, closed, partially closed, chatter
Relay | contacts closed, contacts open, coil burnout, coil short
Cable | stretch, break, kink, fray
Valve | open, partially open, closed, partially closed, wobble
Spring | stretch, compress/collapse, fracture
Operator | wrong operation to proper item, wrong operation to wrong item, proper operation to wrong item, perform too early
Failure causes, such as normal wear and tear, corrosion, abrasion, erosion, fatigue, etc., should be recorded in sufficient detail to enable an appropriate failure management strategy to be identified. Failures caused by human error should be included if firm evidence exists to support such failures, or if operator error can induce significant consequences. It is important to ensure that the causes are sufficiently identified so that the subsequent maintenance recommendations address the cause of failure rather than its symptoms.

6. A failure mode in one component can serve as the cause of a failure mode in another component. Each failure should be listed in technical terms. Failure modes should be listed for each function of each component or process step. At this point the failure mode should be identified whether or not the failure is likely to occur. Looking at similar products or processes and the failures that have been documented for them is an excellent starting point.

7. Describe the effects of those failure modes. For each failure mode identified, the engineer should determine what the ultimate effect will be. A failure effect is defined as the result of a failure mode on the function of the product/process as perceived by the customer. Effects should be described in terms of what the customer might see or experience should the identified failure mode occur. Keep in mind the internal as well as the external customer. Examples of failure effects include:
a. Injury to the user
b. Inoperability of the product or process
c. Improper appearance of the product or process
d. Odors
e. Degraded performance
f. Noise

Establish a numerical ranking for the severity of the effect. A common industry standard scale uses 1 to represent no effect and 10 to indicate a very severe effect, with the failure affecting system operation and safety without warning. The intent of the ranking is to help the analyst determine whether a failure would be a minor nuisance or a catastrophic occurrence to the customer. This enables the engineer to prioritize the failures and address the most serious issues first. Severity is classified
8. Identify the causes for each failure mode. A failure cause is defined as a design weakness that may result in a failure. The potential causes for each failure mode should be identified and documented. The causes should be listed in technical terms and not in terms of symptoms. Examples of potential causes include:
a. Improper torque applied
b. Improper operating conditions
c. Contamination
d. Erroneous algorithms
e. Improper alignment
f. Excessive loading
g. Excessive voltage

9. Enter the Probability factor. A numerical weight should be assigned to each cause that indicates how likely that cause is (the probability of the failure occurrence). A common industry standard scale uses 1 to represent not likely and 10 to indicate inevitable.

Probability of Failure Occurrence: Failure modes identified in the failure mode and effect analyses are assessed in terms of probability of occurrence when specific parts configuration or failure rates are not available. Individual failure mode probabilities of occurrence should be grouped into distinct, logically defined levels. They are:
10. Identify Current Controls (design or process). Current Controls (design or process) are the mechanisms that prevent the cause of the failure mode from occurring or that detect the failure before it reaches the customer. The engineer should now identify testing, analysis, monitoring, and other techniques that can be or have been used on the same or similar products/processes to detect failures. Each of these controls should be assessed to determine how well it is expected to identify or detect failure modes. After a new product or process has been in use, previously undetected or unidentified failure modes may appear. The FMEA should then be updated and plans made to address those failures and eliminate them from the product/process.

11. Determine the likelihood of Detection. Detection is an assessment of the likelihood that the Current Controls (design and process) will detect the cause of the failure mode or the failure mode itself, thus preventing it from reaching the customer. Based on the Current Controls, consider the likelihood of Detection* using the following table for guidance.
Rating | Description | Definition
10 | Absolute uncertainty | The product is not inspected or the defect caused by failure is not detectable.
9 | Very remote | Product is sampled, inspected and released based on Acceptable Quality Level (AQL) sampling plans.
8 | Remote | Product is accepted based on no defectives in a sample.
7 | Very low | Product is 100% manually inspected in the process.
6 | Low | Product is 100% manually inspected using go/no-go or other mistake-proofing gauges.
5 | Moderate | Some Statistical Process Control (SPC) is used in process and product is final inspected off-line.
4 | Moderately high | SPC is used and there is immediate reaction to out-of-control conditions.
3 | High | An effective SPC program is in place with process capabilities (Cpk) greater than 1.33.
2 | Very high | All product is 100% automatically inspected.
1 | Almost certain | The defect is obvious or there is 100% automatic inspection with regular calibration and preventive maintenance of the inspection equipment.
*Should be modified to fit the specific product or process.

12. Review Risk Priority Numbers (RPN). The Risk Priority Number is the mathematical product of the numerical Severity, Probability, and Detection ratings:

RPN = (Severity) x (Probability) x (Detection)

The RPN is used to prioritize items that require additional quality planning or action.
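As a small illustration of the formula above, the sketch below computes the RPN for a few failure modes and lists them from highest to lowest. The failure-mode names and ratings are made-up values for the pressure cooker example, and the function name `rpn` is only illustrative; the 1 to 10 range check reflects the rating scales described in steps 7, 9 and 11.

```python
def rpn(severity: int, probability: int, detection: int) -> int:
    """Risk Priority Number: product of three ratings, each on a 1-10 scale."""
    ratings = (("severity", severity), ("probability", probability), ("detection", detection))
    for name, value in ratings:
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * probability * detection


# Hypothetical ratings: (severity, probability, detection)
items = {
    "thermostat switch fails closed": (9, 3, 6),
    "safety valve sticks shut": (10, 2, 8),
    "pressure gauge reads low": (6, 4, 5),
}

# Highest RPN first, so the riskiest items head the list.
ranked = sorted(items, key=lambda name: rpn(*items[name]), reverse=True)
for name in ranked:
    print(f"{rpn(*items[name]):4d}  {name}")
```

Because each rating lies between 1 and 10, the RPN always falls between 1 and 1000.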
RPN prioritization
From the point of view of risk, the most important items, characterized by a high RPN, should be separated from those characterized by a significantly lower RPN value. The selected ‘High Priority’ items represent issues for corrective action plan development.

The question is: how can such a separation be performed? The common recommendations of conventional FMEA concerning calculated RPN values are usually very general. For example:
• ‘For higher RPNs the team must undertake efforts to reduce the calculated risk through corrective action(s)’
• ‘Focus attention on the high RPNs’
• ‘Expend team effort on top 20 to 30% of problems as defined by RPN values’

The common practice of an FMEA team analyzing RPN values in Pareto fashion is to limit the list of recommended corrective actions to the ‘Top X Issues’. The chosen value of X could be 3 or 5 or 10, etc. In any case, X is an arbitrary choice, and this kind of decision-making is problematic. How can it be decided which RPN values characterize critical issues that should be treated immediately? There are some rather sophisticated statistical techniques supporting distribution analysis, but a very simple and quite effective graphical tool, the Scree Plot, can be used for RPN value analysis. This tool, which is also used in principal component analysis, is a graph of the ordered RPN values. A Scree Plot requires the RPN values to be ordered by size beforehand, from smallest to largest. These values are then plotted, in order, across the graph. Normally, when observed from the right, a Scree Plot looks like a cliff descending to ground level.

The calculated RPN values usually form a right-skewed distribution, with a short tail on the left (negligible risk values) and a long tail on the right (due to critical risk values that are ‘outliers’ from the point of view of distribution analysis). The points therefore form a nonsymmetrical upward curve on a Scree Plot. The long lower part of the plot is characterized by a gradual increase of the RPN values that can usually be fitted by a straight line with a slight slope (shown by the 1st dotted line on the plot). The RPN values scattered around this line should be considered a kind of ‘Information Noise’. The issues characterized by these RPN values do not require immediate attention. The short uppermost part of the Scree Plot is characterized by a very steep increase of the RPN values (RPN jumps). A straight line with a very steep slope (shown by the 2nd dotted line on the plot) could fit it. The RPN values scattered around this line are related to the most critical issues of the FMEA, which need to be dealt with promptly.
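One possible way to automate the visual reading of a Scree Plot is sketched below: the RPN values are ordered from smallest to largest, and the break between the flat part and the steep part is taken as the split for which two separate least-squares lines fit the ordered values best. The source describes only fitting the two lines by eye, so the breakpoint rule, the function names `line_sse` and `split_scree`, and the sample RPN values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def line_sse(ys: Sequence[float]) -> float:
    """Sum of squared residuals of the least-squares line through
    the points (0, ys[0]), (1, ys[1]), ... (needs at least two points)."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    return syy - sxy * sxy / sxx


def split_scree(rpns: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Order the RPNs and split them into (noise, critical) at the breakpoint
    where two straight lines give the smallest total squared error."""
    ordered = sorted(rpns)
    if len(ordered) < 4:
        raise ValueError("need at least four RPN values to fit two lines")
    # Each segment must keep at least two points.
    best_k = min(
        range(2, len(ordered) - 1),
        key=lambda k: line_sse(ordered[:k]) + line_sse(ordered[k:]),
    )
    return ordered[:best_k], ordered[best_k:]


# Hypothetical RPN values from a worksheet.
sample = [16, 480, 8, 24, 12, 120, 10, 18, 280, 15, 20]
noise, critical = split_scree(sample)
print("noise:   ", noise)
print("critical:", critical)
```

With very few points the two-line fit is fragile, so the result is best treated as a prompt for team discussion and checked against the plotted values.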
12. Determine Recommended Action(s) to address potential failures that have a high RPN. These actions could include specific inspection, testing or quality procedures; selection of different components or materials; de-rating; limiting environmental stresses or operating range; redesign of the item to avoid the failure mode; monitoring mechanisms; performing preventive maintenance; and inclusion of back-up systems or redundancy.

Choice of preferable corrective action
There are usually several competing corrective actions that are theoretically capable of reducing the RPN for any given failure mode. Although some actions aim at reducing failure mode severity (usually by redesign), the bulk of the actions deemed appropriate aim at reducing either the occurrence ranking or the detectability ranking. Actions aimed at occurrence ranking reduction seek to prevent the cause of the failure mode from occurring, or to reduce the rate at which the cause and/or the failure mode occur. Actions aimed at detectability ranking reduction focus on improving the detection of the cause and/or the failure mode before it occurs, so that a warning can be issued.

Since conventional FMEA does not provide any guidelines for the optimal choice between competing corrective actions, the FMEA team faces a difficult task. The priority of the alternatives under comparison is usually established subjectively, based on the intuition, experience and/or feelings of the FMEA team members. The final solution recommended by the FMEA team is often far from optimal: for example, the action preferred by the department manager, or the one suggested by the loudest member of the team.

The Expanded FMEA (EFMEA) procedure provides a basis for choosing the optimal corrective action. This procedure involves evaluating both the feasibility of implementing a corrective action and the expected RPN value after implementing it. Since feasibility estimation is a multidimensional problem, it should be evaluated by posing the question: ‘How feasible is it to implement a given corrective action under the existing constraints of safety, cost, resources, time, quality & reliability requirements, organizational structure, personnel resistance, etc.?’ Moreover, EFMEA takes into consideration both the chance of success (i.e. the RPN reduction) and the probability of an undesirable impact (on people, system, product, process or environment) as a result of implementing a corrective action. As in the conventional FMEA procedure, the feasibility rank (F) is estimated on a scale from ‘1’ (Best Case) to ‘10’ (Worst Case), using the criteria proposed by the authors of EFMEA and presented in Table 1.

The EFMEA procedure results in a prioritization of the analyzed alternatives. The final decision, i.e. the choice of the optimal corrective action, is based on a comparison of the differences between the RPN values before and after the implementation of the given corrective actions, divided by the corresponding feasibility ranks:

∆RPNi / Fi = (RPNi Before − RPNi After) / Fi

where RPNi Before and RPNi After are the RPN values for a given item before and after implementation of the i-th corrective action, ∆RPNi is the difference between these values, and Fi is the feasibility rank of the i-th corrective action. The calculated ratio belongs to the LTB (‘The Larger the Better’) family, i.e. the most preferable corrective action is the one with the largest ratio. The table below shows an example of corrective action prioritization.
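The ratio described above can be illustrated with a short calculation. The sketch below is not the source's table: the corrective actions, RPN values and feasibility ranks are invented for illustration, and `priority_ratio` is a hypothetical helper name.

```python
def priority_ratio(rpn_before: int, rpn_after: int, feasibility: int) -> float:
    """RPN reduction divided by the feasibility rank (1 = best case, 10 = worst case)."""
    if not 1 <= feasibility <= 10:
        raise ValueError("feasibility rank must be between 1 and 10")
    return (rpn_before - rpn_after) / feasibility


# Hypothetical alternatives for one failure mode:
# (RPN before, expected RPN after, feasibility rank F)
actions = {
    "redesign valve seat": (240, 80, 8),
    "add end-of-line leak test": (240, 120, 3),
    "operator training": (240, 200, 2),
}

ratios = {name: priority_ratio(*values) for name, values in actions.items()}

# Larger-the-better: the preferred action has the largest ratio.
best = max(ratios, key=ratios.get)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ratio:6.1f}  {name}")
print("preferred:", best)
```

In this made-up case the redesign gives the largest RPN reduction, yet the leak test ranks first because it is far easier to implement.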
There is an alternative approach to feasibility evaluation, based on the known procedure for calculating a Pareto Priority Index. The feasibility estimate can be calculated as the geometric mean of the values of all feasibility-related dimensions (such as cost, time consumption, chance of success, etc.). Obviously, the same dimensions, measured on the same scales, should characterize all competing corrective actions.

13. Assign Responsibility and a Target Completion Date for these actions. This makes responsibility clear-cut and facilitates tracking.

14. Indicate Actions Taken. After these actions have been taken, re-assess the severity, probability and detection and review the revised RPNs. Are any further actions required?

15. Update the FMEA as the design or process changes, the assessment changes or new information becomes known.
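As a brief illustration of the geometric-mean feasibility estimate mentioned before step 13, the sketch below combines several feasibility dimensions into a single value. The dimension names, the 1 to 10 scoring of each dimension and the function name `feasibility_estimate` are assumptions made for this example; the source does not fix them.

```python
from statistics import geometric_mean
from typing import Dict


def feasibility_estimate(scores: Dict[str, float]) -> float:
    """Geometric mean of the feasibility-related scores of one corrective action.
    Assumes every dimension is scored on the same positive scale (here 1-10)."""
    if not scores:
        raise ValueError("at least one feasibility dimension is required")
    if any(value <= 0 for value in scores.values()):
        raise ValueError("scores must be positive")
    return geometric_mean(scores.values())


# Hypothetical scores for two competing corrective actions,
# using the same dimensions and scale for both.
redesign = {"cost": 8, "time": 8, "chance_of_success": 1}
leak_test = {"cost": 2, "time": 4, "chance_of_success": 1}

print("redesign: ", round(feasibility_estimate(redesign), 2))
print("leak test:", round(feasibility_estimate(leak_test), 2))
```

The resulting value could then take the place of the feasibility rank F when comparing corrective actions.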
Benefits of FMEA
The use of FMEA can contribute significantly to the bottom line by helping the organization to:
• Increase customer satisfaction
• Identify and eliminate potential product/process failure modes early
• Prioritize product/process deficiencies
• Capture engineering/organization knowledge
• Emphasize problem prevention
• Document risk and the actions taken to reduce risk
• Provide focus for improved testing and development
• Minimize late changes and associated cost
• Act as a catalyst for teamwork and idea exchange between functions
• Improve product performance
• Focus design and manufacturing resources where they are needed most
• Reduce warranty and product failure costs
• Reduce product development costs
• Condense the product development and launch cycle
• Document the history of potential failures
• Analyze products systematically, not haphazardly
Uses of FMEA
• Develop product or process requirements that minimize the likelihood of those failures.
• Evaluate the requirements obtained from the customer or other participants in the design process to ensure that those requirements do not introduce potential failures.
• Identify design characteristics that contribute to failures and design them out of the system or at least minimize the resulting effects.
• Develop methods and procedures to develop and test the product/process to ensure that the failures have been successfully eliminated.
• Track and manage potential risks in the design. Tracking the risks contributes to the development of corporate memory and the success of future products as well.
• Ensure that any failures that could occur will not injure or seriously impact the customer of the product/process.
Other possible uses of FMEA
a. FMEA can be used in the preparation of diagnostic procedures.
b. FMEA can be used to set appropriate maintenance procedures and intervals.
c. In legal proceedings, FMEA may be used as documentation of the safety considerations that were involved in the design.
d. As per MIL-STD-1629A, additional applications for FMEA include “maintainability, safety analysis, survivability and vulnerability, logistics support analysis, maintenance plan analysis, and for failure detection and isolation subsystem design.”

Failure mode and effect analyses can be used for many applications in which reliability and safety are a concern.

PRINCIPAL LIMITATIONS & ABUSES OF FMEA
• Frequently, human errors and hostile environments are overlooked.
• Because the technique examines individual faults of system elements taken singly, the combined effects of coexisting failures are not considered.
• If the system is at all complex and if the analysis extends to the assembly level or lower, the process can be extraordinarily tedious and time consuming.
• Failure probabilities can be hard to obtain; obtaining, interpreting, and applying those data to unique or high-stress systems introduces uncertainty which itself may be hard to evaluate.
• Sometimes FMEA is done only to satisfy the altruistic urge or need to “do safety.” Remember that the FMEA will find and summarize system vulnerability to single-point failures (SPFs), and it will require lots of time, money, and effort. How does the recipient intend to use the results? Why does he need the analysis?
References:
• "FMEA process for quality problem solving (Failure Mode & Effect Analysis)", SEQ80.Doc, article available on the Internet.
• Alan Pride, "Reliability centered maintenance", available on the Internet.
• MIL-STD-1629A, Procedures for Performing a Failure Mode, Effects and Criticality Analysis, available on the Internet.
• ABS, Guidance notes on RCM, 2004, Houston, USA, available on the Internet.
• R.R. Mohr, "Failure Mode and Effects Analysis", available on the Internet.
• Zigmund Bluvband (ALD Ltd., Beit-Dagan), Pavel Grabov (ALD Ltd., Beit-Dagan) and Oren Nakar (MOTOROLA Israel Ltd., Tel-Aviv), "Expanded FMEA (EFMEA)", article available on the Internet.
Feedback may be sent to email@example.com